Intelligent route planning for fast autonomous vehicles operating in a large natural terrain


Robotics and Autonomous Systems 40 (2002) 1–24

S. Al-Hasan a, G. Vachtsevanos b,∗

a Department of Computer Technology, The College of Technology, Al-Ahsa 31982-804, Saudi Arabia
b Georgia Institute of Technology, School of Electrical and Computer Engineering, Atlanta, GA 30332-0250, USA
∗ Corresponding author. E-mail address: [email protected] (G. Vachtsevanos).

Received 31 May 2000; received in revised form 10 December 2001. Communicated by T.C. Henderson.

Abstract

This paper presents a 2D route planner to support operations of an autonomous vehicle conducting missions at high speeds in a large natural terrain. A configurable search engine is used to search a pre-prepared digitized map of the natural terrain and to generate a suboptimal route in terms of distance, safety, and maneuvering. The suboptimal route is further optimized and smoothed by a route filter that produces the optimal route under certain conditions. A high-level intelligent supervisory module configures the search engine and assigns weights to the three route cost components to satisfy the objectives and the time constraints of the mission. To take advantage of previous route planning experience, an intelligent learning-support engine collects route cost data and builds a knowledge base that is used to provide better heuristics to the search engine. The presented approaches and results contribute towards a truly autonomous vehicle capable of conducting complex missions efficiently and in real time. © 2002 Elsevier Science B.V. All rights reserved.

Keywords: Autonomous vehicle; A∗ algorithm; Learning; Route optimization; Route planning; Route smoothing; Search engine; Supervisory module

1. Introduction

In recent years, interest has intensified in autonomous vehicles (AVs) capable of performing missions in large natural areas at high speeds [1–3]. While exploring hazardous environments and conducting rescue tasks are some applications in the civilian arena, typical military-related missions include advanced reconnaissance, battle damage/contamination assessment, and geographic and geologic exploration. This paper presents the design of a real-time route planner that generates 2D static routes for an AV


that conducts missions in a large natural terrain at high speeds. The illustrative examples feature an autonomous helicopter conducting missions at speeds in the range of 3–5 m/s over a natural terrain of about 4 km × 4 km. In artificial intelligence (AI), route planning approaches can be classified into three general categories: roadmap, cell decomposition, and potential field [4]. Roadmap-based approaches attempt to construct visibility graphs, which are non-directed graphs whose nodes are the start and goal points as well as the vertices of the obstacles. Cell decomposition approaches divide the terrain map into uniform or non-uniform cells, where some cells are occupied by obstacles while the remaining cells are obstacle-free; a search algorithm (e.g., Dijkstra [5] or A∗ [6]) is then


utilized to search the map’s free cells and to construct a cell-based route from the start point to the goal point. Numerical potential field approaches recursively generate a field based on some function (e.g., a Laplace function) over the terrain map. After convergence, the field takes maximum values at obstacles and a minimum value at the goal point [7,8]; following the minimum gradient from the start point, the globally optimal (i.e., minimum-cost) path is constructed [9]. The route planner introduced in this paper belongs to the cell decomposition class, which is more flexible than the other two in terms of defining costs other than distance, as will be shown in the sequel. 2D routes are appropriate for on-ground AVs. For AVs such as underwater and aerial vehicles that operate in 3D spaces, 2D routes are still applicable, assuming that the AV operates within a limited range of altitude that does not follow the terrain’s geographic features (see Fig. 1). Static route planning, which does not account for the AV’s dynamics, is appropriate for applications with large maps where the dimensions of the AV are much smaller than the dimensions of the terrain. The real-time feature of the presented route planner is vital to support on-line redirection of the AV and to satisfy time constraints for time-critical missions. To fulfill the requirements of complex missions, the presented route planner employs a cost function composed of distance, safety, and maneuvering metrics rather than the simple distance cost referred to in most of the published literature [10–14]. Based on the objectives and the circumstances of the mission, an intelligent supervisory module configures the search engine (i.e., selects a search method and sets time limits on the

search) and assigns weights to the three cost components. Assuming that the map of the terrain is represented by a mesh of equal square cells, the search engine employs an A∗ algorithm to search the map cells and plan a route from the start cell to the destination cell. The search engine can be configured by the supervisory module to use another search method to satisfy time constraints. The search engine generates unsmoothed, suboptimal routes; to optimize and smooth the generated route, a fast route filtering algorithm was developed. If the unfiltered route is monotonic (a notion explained in Section 5), the route filtering algorithm finds the optimal route embedded in the unfiltered route. To learn from previous route planning experience, a learning algorithm was developed. The learning algorithm collects and stores route costs at some map landmarks and reuses those costs to provide better heuristics to the search engine. This reduces search effort and, consequently, route planning time. This paper is organized as follows. The next section presents the configuration of the route planner and briefly explains the functionality of its modules. Section 3 explains the off-line preparation of the digitized map database. Section 4 focuses on the search engine and its interface with the other modules. Section 5 addresses the issues of route optimality and smoothness and introduces a route filtering algorithm and its condition of optimality. Section 6 describes the intelligent supervisory module and explains its high-level tasks. Section 7 presents a learning algorithm for the route planner to reduce the search effort and route planning time. Concluding remarks are highlighted in Section 8. The paper provides illustrative route

Fig. 1. An altitude envelope of operation that does not follow the geographic features.


planning examples for an autonomous helicopter conducting missions over a large natural terrain with tree lines as obstacles. The presented approaches and results contribute towards a truly autonomous vehicle capable of conducting complex missions efficiently and in real time.

2. The route planner configuration

Fig. 2 depicts the configuration of the route planner. At the heart of the algorithmic modules is the A∗ search engine, which utilizes a pre-prepared 2D digitized map of the navigation terrain. The digitized map


is in the form of a mesh of equal square cells, where each cell is either free or occupied by an obstacle. Having two free cells (i.e., a start and a destination) assigned by the supervisory module, the search engine searches the map mesh and plans a cell-based route that extends from the start cell to the destination cell and avoids stationary obstacles. The generated cell route is suboptimal in terms of distance, safety, and maneuvering (i.e., turning angles). These three route cost components are assigned weights by the supervisory module based on the objectives and the circumstances of the mission. The generated cell route is further optimized and smoothed by a filtering

Fig. 2. The route planner configuration.


algorithm that will be discussed in Section 5. The filtered route is a series of consecutive 2D waypoints that the AV can navigate through. The supervisory module is a high-level intelligent planner: it reads the objectives and the status of the mission and, based on these, configures the search engine and assigns weights to the route’s three cost components. Furthermore, the supervisory module chooses the start and the destination cells for the search engine depending on the current status of the AV, i.e., whether it is stationary or already navigating towards a destination and needs to be redirected to another destination. The learning-support module acquires route cost data from the search engine at certain map landmarks and updates a cost database that is later used to provide better heuristics to guide the search engine. The following sections focus on the functionality of each module of the route planner individually.

3. The digitized map database

A digitized map of the terrain where the AV will operate is assumed to be pre-prepared and available in

the form of a mesh of equal square cells. Fig. 3 shows a partially digitized map of a natural terrain. Geographic maps and aerial photographs are usually used to prepare digitized maps. The size of the cell should be larger than the dimensions of the AV to allow for deviation when the AV changes its navigation direction at some cell. The map cells are first checked for the inclusion of stationary obstacles. A stationary obstacle is defined as any object or area that the AV cannot or should not navigate through. To avoid getting very close to stationary obstacles, a safety area is also set around each stationary obstacle and is considered to be part of it. All map cells that contain a stationary obstacle or part of one (including its safety area) are marked unnavigable; the other cells are marked free. This process generates a 2D reachability matrix where each entry represents a map cell and holds 1 or 0: 1 means that the corresponding map cell is obstacle-free, and 0 means that the cell is unnavigable. The reachability matrix is then scanned, and each obstacle-free cell is assigned a hazard cost based on its proximity to stationary obstacles. This process

Fig. 3. A partially digitized map of a natural terrain.


generates a hazard cost matrix. The reachability and hazard cost matrices, which are prepared off-line, constitute the digitized map database used by the search engine to plan routes.
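To make this off-line preparation concrete, the following minimal Python sketch derives the two matrices from a boolean obstacle grid. The one-cell safety dilation and the linear hazard decay with distance are illustrative assumptions, not parameters taken from the paper.

```python
import numpy as np
from scipy.ndimage import binary_dilation, distance_transform_edt

def build_map_database(obstacles, safety_cells=1, hazard_range=5.0):
    """obstacles: 2D boolean array, True where a cell contains an obstacle.
    Returns the (reachability, hazard) matrices described in Section 3.
    safety_cells and the hazard-decay rule are illustrative assumptions."""
    # Grow each obstacle by its safety area and mark those cells unnavigable (0).
    unnavigable = binary_dilation(obstacles, iterations=safety_cells)
    reachability = np.where(unnavigable, 0, 1)
    # Hazard of a free cell decays with its distance to the nearest obstacle.
    dist = distance_transform_edt(~unnavigable)             # in cell units
    hazard = np.clip(1.0 - dist / hazard_range, 0.0, 1.0)   # 1 near obstacles, 0 far
    hazard[unnavigable] = np.inf                            # never route through obstacles
    return reachability, hazard
```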

4. The search engine

The route planning problem is viewed here as a search problem where the digitized map represents the search graph and the assigned start and destination cells are the start and the goal nodes, respectively. The objective, then, is to find the minimum-cost route from the start cell to the destination cell. The route’s cost is defined as the weighted sum of three cost metrics, distance, hazard, and maneuvering:

route cost = (Wd ∗ distance) + (Wh ∗ hazard) + (Wm ∗ maneuvering),

where Wd, Wh, and Wm are weights for the three cost components. The hazard metric replaces the previously mentioned safety metric so that the route’s cost can be minimized (notice that safety is maximized when hazard is minimized). If the prescribed route is represented as a sequence of cells,

route = (C_1, C_2, C_3, ..., C_N),

such that C_1 is the start cell and C_N is the destination cell, then the three cost components can be formulated as follows:

distance = \sum_{i=1}^{N-1} d(C_i, C_{i+1}),

where d(C_n, C_m) is the distance from the center of cell C_n to the center of cell C_m;

hazard = \sum_{i=1}^{N} d(C_i) ∗ z(C_i),

where d(C_n) is the length of the section of the route that lies in cell C_n, and z(C_n) is the hazard level of cell C_n as assigned in the hazard matrix;

maneuvering = \sum_{i=1}^{N} m(α(C_i)),

where α(C_n) is the route turn angle at cell C_n, and m(·) is a maneuvering cost function that assigns a cost value to every possible turn angle.

Regarding cell adjacency, two map cells are considered to be adjacent when the straight line that connects their centers does not intersect an obstacle cell. In a fairly large map, a cell may be adjacent to hundreds or thousands of other cells. This level of adjacency requires prohibitive search times, as will be explained in the following section. In order to achieve practical route planning runtimes, cell adjacency is limited to a small number of cells in the neighborhood of each cell. Although this approach makes the planned route suboptimal rather than optimal, the CPU runtimes are more reasonable for real-time route planning purposes.

An A∗ search algorithm is employed in the search engine to provide the minimum-cost route. Details of the A∗ algorithm can be found in the published literature; the interested reader can refer to [15] for a comprehensive review of the A∗ algorithm and to [16] for advanced modified versions of A∗. In A∗, the cost of every searched cell, n, is composed of two components:

cost(n) = kg ∗ g(n) + kh ∗ h(n),

where g(n) is the cost of the least-cost route (found in the search so far) from the start cell to cell n, h(n) is the heuristic (i.e., estimated) cost of the minimum-cost route from cell n to the destination cell, and kg, kh ∈ [0, 1] are weighting factors for g(n) and h(n), respectively. For the A∗ algorithm, kg = kh = 1. Both cost components, g(n) and h(n), consist of the weighted sum of the three terms associated with distance, hazard, and maneuvering. The factors kg and kh, which are equal for the standard A∗ algorithm, are used to configure the search engine to techniques other than A∗. This configuration, which is conducted by the supervisory module, is used to satisfy real-time mission requirements, as will be explained in Section 6.

To guarantee route optimality, the admissibility condition should be satisfied [15]. That is, for each searched cell n,

h(n) ≤ h∗(n),

where h∗(n) is the actual (i.e., not estimated) cost of the minimum-cost route from cell n to the destination cell.


Considering the three cost components, h(n) is given by

h(n) = hd(n) + hh(n) + hm(n),

where hd(n), hh(n), and hm(n) are the heuristic costs of distance, hazard, and maneuvering, respectively. Assigning the Euclidean distance from cell n to the destination cell to hd(n) and setting hh(n) and hm(n) to zero satisfies the admissibility condition and, consequently, guarantees that the optimal route will be found. This setting of h(n) also satisfies the monotone restriction, i.e.,

h(m) ≤ h(n) + c(m, n),

where m is the parent cell of n, and c(m, n) is the cost of the route from the parent cell m to the child cell n.

The monotone restriction guarantees that when cell n is expanded (i.e., moved from the OPEN list to the CLOSED list of the A∗ algorithm), the optimal route from the start cell up to cell n has been found and, consequently, cell n will not be expanded again; in other words, a cell is searched only once [15]. These heuristics, however, are very conservative. The main purpose of the heuristic cost, h(n), is to guide the search engine to the destination cell and, hence, to accelerate the search. The closer the heuristics are to their actual values, the fewer cells are expanded and, hence, the less CPU runtime is required to plan a route [15]. The Euclidean distance heuristic may lead the search engine into map traps (i.e., dead-end areas), while the zero heuristics for hazard and maneuvering imply a lack of any knowledge and guidance for the search engine. Section 7 introduces a learning algorithm through which the route planner learns from previous route planning experience and infers improved heuristics to further reduce the search time.
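For concreteness, the cost function and the conservative admissible heuristic described above can be expressed as in the following sketch. The maneuvering cost function m(·) is left as a caller-supplied argument, since the paper does not fix its form, and the in-cell length d(C_i) of the hazard term is approximated by half a step on each side of a cell.

```python
import math

def route_cost(route, hazard, weights, m):
    """route: list of (row, col) cell centers; hazard: 2D hazard matrix;
    weights: (Wd, Wh, Wm); m: maneuvering cost as a function of turn angle.
    Implements route cost = Wd*distance + Wh*hazard + Wm*maneuvering."""
    Wd, Wh, Wm = weights
    dist = haz = man = 0.0
    for i in range(len(route) - 1):
        (r1, c1), (r2, c2) = route[i], route[i + 1]
        step = math.hypot(r2 - r1, c2 - c1)
        dist += step
        # Approximate the in-cell section length by half a step on each side.
        haz += 0.5 * step * (hazard[r1][c1] + hazard[r2][c2])
        if i > 0:  # turn angle at the interior cell route[i]
            r0, c0 = route[i - 1]
            a1 = math.atan2(r1 - r0, c1 - c0)
            a2 = math.atan2(r2 - r1, c2 - c1)
            man += m(abs((a2 - a1 + math.pi) % (2 * math.pi) - math.pi))
    return Wd * dist + Wh * haz + Wm * man

def admissible_h(cell, goal, Wd):
    """Conservative heuristic of Section 4: Euclidean distance for the
    distance term, zero for hazard and maneuvering, so h(n) <= h*(n)."""
    return Wd * math.hypot(goal[0] - cell[0], goal[1] - cell[1])
```

A simple choice such as m = lambda angle: angle, which charges each turn in proportion to its angle, would be one admissible instantiation of the maneuvering cost function.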

Fig. 4. Four planned unfiltered routes: (a) minimum distance; (b) minimum hazard (or maximum safety); (c) minimum maneuvering; (d) minimum (distance + hazard).


Table 1. Route planning data for the four cases in Fig. 4

Case | Start cell | Goal cell | Objective | Number of expansions | Number of expanded cells | Route length (cells) | CPU time (s)
---- | ---------- | --------- | --------- | -------------------- | ------------------------ | -------------------- | ------------
a | (30, 30) | (160, 160) | Minimum distance | 11,144 | 11,144 | 200 | 1.07
b | (30, 30) | (160, 160) | Minimum hazard | 16,102 | 16,102 | 275 | 1.99
c | (30, 30) | (160, 160) | Minimum maneuvering | 13,577 | 13,577 | 218 | 1.18
d | (30, 30) | (160, 160) | Minimum (distance + hazard) | 13,298 | 13,298 | 224 | 1.38

Example 1. Route planning experiments for an autonomous helicopter were conducted on a 4.02 km × 3.42 km natural terrain with tree lines (see Fig. 4). The terrain is almost flat, with the trees constituting the main natural obstacles. The autonomous helicopter conducts scouting missions at speeds in the range of 10–20 ft/s (i.e., about 3–6 m/s) at altitudes below 50 ft. A commercially provided map of the terrain was divided into a grid of 201 × 171 cells (a total of 34,371 cells) with a cell size of 20 m × 20 m. This cell size allows redirecting the helicopter, while it is navigating at a speed of 20 ft/s, at the next route cell within a maximum of 4 s. It also provides a 20 m-wide envelope around the planned route to allow for inevitable deviations from the planned route. Map cells that contain trees (plus a 20 m-wide safety area around the trees) were marked as obstacle cells. The route planning experiments were conducted on a 200 MHz Pentium Pro computer. Fig. 4 shows four unfiltered routes, with computational data presented in Table 1. For these experiments, cell adjacency was restricted to eight; i.e., a cell is assumed to be adjacent only to the eight cells that surround it (see Fig. 5). In case a, the minimum-distance route is planned. It is noted that the number of expanded cells (11,144) is huge compared to the route length (200 cells); this reflects the low guidance quality of the heuristics employed. Case b shows the minimum-hazard route, which entails maximum clearance from obstacles. In case c, the minimum-maneuvering route is planned; this route has minimum maneuvering in terms of turning angles. In case d, a route that optimizes distance and hazard equally is planned.

Fig. 5. Cell A and its adjacent cells: (a) 8-cell adjacency; (b) 24-cell adjacency.

It is noted in all four cases that the number of expanded cells equals the number of cell expansions, which means that a cell is expanded (i.e., searched by A∗) no more than once; this is due to the satisfaction of the monotone restriction. For testing purposes, the key metric is actually the number of cell expansions, which depends only on the algorithm; the CPU runtime can be reduced by a faster or multi-CPU computer. Nevertheless, the CPU runtime is included in the presented results to reflect the practicality of the algorithm and to help compare the different cases. For the autonomous helicopter application, the CPU runtime is adequate in all four cases (e.g., about 1 s for the 4.5 km long route in case a). At this runtime, the helicopter, flying at a speed of 20 ft/s, can be redirected to another destination point at the next cell on its navigation route.
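A minimal, self-contained version of the search loop used in these experiments might look as follows. For brevity, it carries only the distance and hazard cost terms (the maneuvering term requires the turn angle, and hence the parent direction, as part of the search state) and uses a heap-based OPEN list; all of this is an illustrative sketch, not the authors' implementation.

```python
import heapq, math

def search(reach, hazard, start, goal, Wd=1.0, Wh=0.0, kg=1.0, kh=1.0):
    """Plan a cell route over the reachability/hazard matrices of Section 3
    with 8-cell adjacency. kg = kh = 1 gives the standard A* search."""
    nbrs = [(-1,-1),(-1,0),(-1,1),(0,-1),(0,1),(1,-1),(1,0),(1,1)]
    h = lambda c: kh * Wd * math.hypot(goal[0]-c[0], goal[1]-c[1])
    g, parent = {start: 0.0}, {start: None}
    open_list = [(h(start), start)]            # OPEN list as a binary heap
    closed = set()
    while open_list:
        _, cur = heapq.heappop(open_list)
        if cur == goal:                        # reconstruct the cell route
            route = []
            while cur is not None:
                route.append(cur)
                cur = parent[cur]
            return route[::-1]
        if cur in closed:
            continue
        closed.add(cur)                        # one cell expansion
        for dr, dc in nbrs:
            r, c = cur[0] + dr, cur[1] + dc
            if not (0 <= r < len(reach) and 0 <= c < len(reach[0])) or not reach[r][c]:
                continue
            step = math.hypot(dr, dc)
            cost = g[cur] + Wd * step + Wh * step * hazard[r][c]
            if (r, c) not in g or cost < g[(r, c)]:
                g[(r, c)] = cost
                parent[(r, c)] = cur
                heapq.heappush(open_list, (kg * cost + h((r, c)), (r, c)))
    return None                                # no feasible route
```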

5. Route filtering

The optimal route is achieved if the 8-cell adjacency restriction, introduced in the previous section, is relaxed. Relaxing this restriction implies full adjacency, i.e., a map cell is adjacent to every other map


cell as long as the straight line that connects their centers does not pass through an obstacle cell. However, a cell may then be expanded into a large number of candidate children cells even if the map dimensions are restricted. The OPEN list might contain as many elements as the map itself, requiring an enormous amount of CPU processing time. One way to enhance route optimality is to increase the restricted cell adjacency to a reasonable range; for example, a 24-cell or a 48-cell adjacency can be used (see Fig. 5 for an illustration). This method generates smoother and more optimal routes than those generated with the 8-cell adjacency restriction, while the OPEN list is smaller than that required by full adjacency. However, for an AV with real-time route planning, the CPU runtime requirement of this method is still prohibitive, as will be shown in the sequel. A faster method adopted to optimize and smooth the planned routes involves a two-step procedure: it generates a route with a low adjacency range (e.g., 4 or 8), followed by a filtering operation. The advantage of this method over the previous one is that it requires less CPU runtime, because of the low adjacency range and the fast filtering process, while the filtered route’s cost is close to optimal. The key factor here is the fast filtering algorithm introduced next. The route filtering algorithm accepts as input the route cells generated by the search engine. The route cells are viewed as the nodes of a totally connected graph; i.e., each cell in the route’s cell list is connected to every other cell as long as the straight line that connects their centers does not pass through an obstacle cell. The filtering algorithm then searches for the optimal route in this graph. The search is fast, since the graph is very small compared to the whole map. The filtering algorithm begins at the first cell in the route’s cell list, i.e., the route’s start cell. It iteratively generates straight line segments from the start cell to every other cell below it in the list (see Fig. 6) and then checks whether each line segment reduces the route’s cost. If it does, the start cell becomes the parent of that cell; otherwise, no action is taken. The process is repeated starting at the second cell in the route’s cell list, and continues until the last cell (i.e., the destination cell) in the list is reached.

Fig. 6. Mechanism of the route filtering algorithm.

A pseudo-code of the filtering algorithm clarifies its mechanism. If the unfiltered route’s cell list is represented by the following sequence of cells:

route = (C_1, C_2, C_3, ..., C_N),

where C_1 is the start cell and C_N is the destination cell, then the pseudo-code of the filtering algorithm takes the form:

Step 0: for i = 1 to N − 1 (outer loop)
Step 1:   for j = i + 1 to N (inner loop)
Step 2:     connect C_i and C_j with a straight line segment L_ij
Step 3:     if the cost of L_ij < the cost of the route’s part that extends from C_i to C_j
Step 4:       then set C_i to be the parent of C_j
Step 5:       otherwise do nothing
Step 6:     continue the jth loop
Step 7:   continue the ith loop
Step 8: construct the filtered route by following parent pointers from the destination cell backwards to the start cell

If the cells of the optimal route that is embedded in the search engine’s planned route are ordered monotonically in the list (i.e., each parent cell is above its child cell in the list), then this filtering algorithm is guaranteed to find the optimal route in the generated graph. This is proved in the next theorem.
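A direct, runnable transcription of this pseudo-code is given below; line_of_sight and line_cost are stand-ins for the obstacle-intersection test and the straight-segment cost, which depend on the map representation and cost terms defined earlier.

```python
def filter_route(route, line_of_sight, line_cost):
    """route: list of cells C_1..C_N from the search engine.
    line_of_sight(a, b): True if segment a-b crosses no obstacle cell.
    line_cost(a, b): cost of the straight segment from a to b.
    Returns the filtered route (optimal within the route's own graph
    when the monotonic-ordering condition of Section 5 holds)."""
    n = len(route)
    parent = list(range(-1, n - 1))          # parent[j] = j - 1 initially
    cost = [0.0] * n                         # cost of the best route found to C_j
    for j in range(1, n):
        cost[j] = cost[j - 1] + line_cost(route[j - 1], route[j])
    for i in range(n - 1):                   # outer loop (Step 0)
        for j in range(i + 1, n):            # inner loop (Step 1)
            if line_of_sight(route[i], route[j]):              # Step 2
                c = cost[i] + line_cost(route[i], route[j])
                if c < cost[j]:              # Step 3: the shortcut is cheaper
                    cost[j] = c              # Step 4: re-parent C_j to C_i
                    parent[j] = i
    # Step 8: follow parent pointers back from the destination cell.
    out, j = [], n - 1
    while j >= 0:
        out.append(route[j])
        j = parent[j]
    return out[::-1]
```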


Theorem. If the search engine generates the list of route cells

R = {C_1, C_2, ..., C_N}

(with C_1 at the top of the list and C_N at the bottom) such that C_1 is the start cell and C_N is the destination cell, and a graph is constructed by connecting each cell to all the other cells in the list (as long as the line that connects their centers does not intersect an obstacle cell), then, if the optimal route’s cells are ordered monotonically in R (i.e., each parent cell is above its child cell in the list), the filtering algorithm is guaranteed to find the optimal route in the constructed graph.

Before presenting the proof of the theorem, let us prove the following postulate, which will help prove the theorem.

Postulate. If the optimal route from the start cell to the destination cell in the constructed graph is

R_o = {C_o1, C_o2, ..., C_oM}, with C_o1 = C_1, C_oM = C_N, M ≤ N,

then the optimal route from the start cell (C_1) to C_oi is

R_oi = {C_o1, C_o2, ..., C_oi}.

Proof (by contradiction). Suppose, to the contrary, that the optimal route from the start cell to C_oi is some route R′_oi ≠ {C_o1, C_o2, ..., C_oi}. Then the cost of the route {R′_oi, C_o(i+1), C_o(i+2), ..., C_oM} would be less than the cost of R_o. This contradicts the assumption that R_o is the optimal route from the start cell to the destination cell. □

The postulate implies that the optimal route from the start cell to any cell C_oi that is part of the optimal route R_o is nothing but the part of R_o that extends from the start cell to C_oi. Now, let us prove the theorem.

Proof of the theorem (by induction). Consider the outer loop of the filtering algorithm and the optimal route R_o = {C_o1, C_o2, ..., C_oM}.


When cell C_o1 is expanded in the outer loop of the filtering algorithm, the inner loop checks the route {C_o1, C_o2}, which, according to the postulate, is the optimal route to C_o2; hence the optimal route to C_o2 is found. Now, assume that the optimal route up to cell C_oi has been found; from the postulate, this route is R_oi = {C_o1, C_o2, ..., C_oi}. When the outer loop expands C_oi, the route {C_o1, C_o2, ..., C_oi, C_o(i+1)}, which is the optimal route to C_o(i+1), is checked, because C_o(i+1) is below C_oi in the list; hence the optimal route to C_o(i+1) is found. By induction, when the outer loop expands C_o(M−1), the optimal route to C_oM, which is R_o, is found. □

It should be noted that the optimal route in the theorem is not the global optimal route (i.e., over the whole map); it is the optimal route in the graph that is constructed from the cells of the unfiltered route.

Example 2. In this example, the two route smoothing and optimization methods explained above are applied to plan a more optimized route for the autonomous helicopter introduced in Example 1. In the following experiment, the search engine plans an A∗-based route with a low adjacency range (i.e., 8-cell adjacency). The planned route is then optimized by the introduced route filtering algorithm. The same experiment is then repeated using a high adjacency range (i.e., 48-cell adjacency). Results from the two route optimization methods are analyzed and compared. A mission for the autonomous helicopter required flying from the ground station at map cell (10, 40) to map cell (90, 30) with minimum route turnings (i.e., minimum maneuvering). The supervisory module generated the following route cost weights:

Wd = 0.1, Wh = 0.1, Wm = 0.8.

With the 8-cell adjacency range, the A∗ search engine planned the route shown in Fig. 7 as a dotted line. This route was optimized with the route filtering algorithm and then plotted as a solid line. With the same route cost weights, another optimized route was planned with


Table 2. Route optimization results for Example 2

Optimization method | CPU runtime (s) | OPEN list size (cells) | Route’s cost
------------------- | --------------- | ---------------------- | ------------
8-cell adjacency A∗ + filtering | 3.58 | 661 | 37.2
48-cell adjacency A∗ | 9.50 | 1843 | 36.2

Fig. 7. Optimized route with 8-cell adjacency A∗ and filtering for Example 2 (dotted before filtering, solid after filtering).

Fig. 8. Optimized route with 48-cell adjacency A∗ for Example 2.

the A∗ search engine, but with 48-cell adjacency. This second route is shown in Fig. 8. Table 2 compares computation data for both optimized routes. The first thing to notice is that both routes have comparable costs (37.2 versus 36.2). However, the 8-cell A∗ with the filtering algorithm planned the route in 3.58 s, while the 48-cell A∗ took 9.5 s; that is, the route filtering algorithm produced a comparably optimized route in only 38% of the time required by the second optimization method. This time saving can be explained by the size of the A∗ OPEN list in the two methods: the OPEN list of the first method holds 661 entries, while that of the second holds 1843 entries. Entering and removing cells from the OPEN list is the most time-consuming part of the A∗ algorithm, since it requires re-sorting all the cells in the list according to their cost.

6. The supervisory module

The supervisory module is a high-level intelligent planner that configures the search engine to meet the objectives and the time constraints of the mission. It performs three major tasks: selecting the start and the destination cells, configuring the search engine, and assigning weights to the route’s three cost components. To perform these high-level, mission-related tasks, the supervisory module reads as input the objectives of the mission and the status of the AV. The objectives of the mission are assumed to be provided by an automatic mission planner or by a human operator, who can enter them via a graphical user interface (GUI) at the AV’s ground station or load them to the AV if the route planner runs onboard. The status of the AV, which includes its map position, its speed, the health of its components and actuators, etc., is provided by sensors installed on board the AV. The three tasks of the supervisory module are discussed next.

6.1. Start and destination cell selection

The supervisory module assigns the start and the destination cells for the search engine. If the AV is not moving, then the start cell is simply selected to be the map cell the AV is currently in. This assumes that the supervisory module has some sensory data (e.g., GPS data) that allows locating the AV on the digitized map. If, on the other hand, the AV is already navigating through a route and needs to be redirected to another destination cell, then the supervisory module reads the


speed of the AV and, based on the worst-case route planning time, computes a redirection cell on the current route of the AV. The following simple formula is used for that purpose:

distance to the redirection point = (1 + SF) ∗ (V_AV ∗ T_wc),

where V_AV is the current speed of the AV, T_wc is the worst-case route planning time, and SF is a positive safety factor (a typical value is 1 or 2). For example, with V_AV = 5 m/s, T_wc = 2 s, and SF = 1, the redirection cell is placed 20 m ahead of the AV along its current route. The worst-case route planning time can be approximated by the time the route planner takes to plan a route between the two furthest cells on the map. The destination cell is assumed to be entered by the human operator, e.g., by clicking on a map displayed on a computer screen. In that case, the supervisory module uses the reachability matrix to check the reachability of that destination cell before passing it to the search engine. If the assigned destination cell lies in a prohibited or obstacle area, the human operator is asked to change his choice; this avoids searching all the map cells before discovering that there is no feasible route. The supervisory module should also have a database of important destination areas, such as the ground station, the search area, and the fuel station. Using that destination database, the supervisory module allows the human operator to issue fast and easy commands like go home, get fuel, or come back, and transforms those commands into appropriate destination points.

6.2. Route weights assignment

The supervisory module also assigns weights to the three cost components (i.e., distance, hazard, and maneuvering) before the search engine searches the map. As explained in Section 4, Wd, Wh, and Wm are the route’s cost weights for distance, hazard, and maneuvering, respectively. These weights are assigned to fulfill the mission’s requirements and to account for the circumstances of the mission to be conducted by the AV. For example, an escape mission for an autonomous helicopter might require flying at maximum speed; in that case, the straightest route (i.e., minimum maneuvering) should be planned.


On the other hand, if the helicopter is delivering important materials on a windy day, the route with maximum clearance from obstacles (i.e., minimum hazard and, hence, maximum safety) should be followed. For an expert human operator, such decisions are based on accumulated experience. For a truly autonomous system, that experience needs to be extracted from the skilled human operator and formulated in a way that the AV system can utilize.

6.2.1. Knowledge base for route’s cost weights assignment

Fuzzy if–then rules provide a good way to extract and formulate a human operator’s experience. The simple structure of if–then rules allows the operator to formulate his experience using linguistic terms, and linguistic knowledge is easier to extract from a human operator than crisp mathematical terms. For example, a human operator would say

IF the day is windy AND the mission is critical THEN I will follow the safest route with preference to shorter routes,

rather than

IF the wind speed is >15 mph AND the mission criticality is 90% THEN I will follow the 70% safe–30% short route.

Thus, fuzzy linguistic variables are more practical than crisp variables. Furthermore, fuzzy representation supports linguistic modifiers like very, somehow, and kind of. However, for deterministic situations, crisp rules can be used. Based on the above discussion, fuzzy if–then rules are used in the supervisory module to assign the route’s cost weights. Since each rule or group of rules represents a situation, each rule is assigned a degree of importance, and the output of each rule is multiplied by its degree of importance before the outputs of the rules are composed. The three cost weights Wd, Wh, and Wm are represented by 3-term or 5-term fuzzy variables. A rule’s preconditions are also represented by fuzzy variables; the number of terms of each variable is based on the granularity of the variable in the operator’s description, and the shape of the linguistic variables is based on the operator’s judgment. Now, the rules are


represented as follows:

IF (x_11 is A_11) AND (x_12 is A_12) THEN Wd is B_d1; weight = R_1,
IF (x_21 is A_21) AND (x_22 is A_22) THEN Wd is B_d2 AND Ws is B_s2; weight = R_2,
...
IF (x_n1 is A_n1) AND (x_n2 is A_n2) THEN Ws is B_sn AND Wm is B_mn; weight = R_n,

where the x_ij are fuzzy variables of the AV, the mission, or the environment of the mission, and the A_ij are fuzzy linguistic terms of those variables, represented by membership functions. The number of precondition terms is not necessarily equal for all of the rules. The consequence part of each rule gives linguistic values to some or all of the three route weights, and the rule’s weight R_i, in the range [0, 1], reflects the rule’s importance with respect to the other rules. Neither the number of precondition terms nor the number of rules is limited.

6.2.2. Knowledge base inferencing

Each rule is fired separately to generate fuzzy values of the route weights that appear in its consequence part. The fuzzy values for each route weight are composed and then defuzzified to produce a crisp value. Finally, the three crisp route weights are scaled so that they sum to 1.0 before they are used by the search engine. The inferencing process is explained in the following steps.

1. The degree of fulfillment of each fuzzy term in the precondition part of a fuzzy rule is computed by projecting the crisp value of the variable onto its fuzzy membership function; the degree of fulfillment lies in the range [0, 1].
2. A rule’s precondition degree of fulfillment is computed by ANDing the degrees of fulfillment of the terms in its precondition part. AND is

represented by Mamdani’s minimum operator; for example, (0.2) AND (0.5) AND (0.3) = min(0.2, 0.5, 0.3) = 0.2.
3. A rule’s degree of fulfillment (DOF) is computed by multiplying the rule’s precondition degree of fulfillment by the rule’s weight; the DOF of each rule is then in the range [0, 1].
4. The membership function of each term in the consequence part of a rule is clipped according to the rule’s DOF found in step 3. Clipping follows the Mamdani Max–Min fuzzy implication. For example, if a rule’s DOF = 0.7 and its consequence part assigns fuzzy values to Wd and Ws, then they are clipped as shown in Fig. 9.
5. The clipped fuzzy terms of each cost weight are superimposed and then defuzzified by finding the center of gravity of the formed shape. This gives a crisp value of the corresponding route weight. The three crisp cost weights that come out of this step are Wdc, Whc, and Wmc.
6. Next, the area under the clipped membership functions in the consequence parts of the rules is taken into account. Associated with the distance, hazard, and maneuvering weights are the three areas Ad, Ah, and Am, respectively, formed in step 5. Each crisp cost weight is multiplied by its corresponding area:

Wds = Wdc ∗ Ad, Whs = Whc ∗ Ah, Wms = Wmc ∗ Am.

7. Last, the three cost weights are scaled to sum up to 1:

Fig. 9. Clipping fuzzy values in a rule’s consequence part.


Wd = Wds/(Wds + Whs + Wms),
Wh = Whs/(Wds + Whs + Wms),
Wm = Wms/(Wds + Whs + Wms).

The choice of Mamdani’s Max–Min implication and center-of-gravity defuzzification is not based on any particular preference: this is a decision-making problem and is not sensitive to the implication and defuzzification methods. Other implication methods, like Larsen’s Max-Product, and other defuzzification methods, like Center-of-Area or Center-of-Maximum, will give comparable results. The following example clarifies the method used to assign the route’s weights.

Example 3. For an autonomous scouting helicopter, an expert helicopter pilot has formulated rules to assign the three route weights based on his expertise. Assume only the following three rules, each with an assigned importance weight:

1. IF (wind is high) AND (time is critical) THEN Wd is high, Wh is med., and Wm is med.; weight = 0.6.
2. IF (fuel is low) THEN Wd is med. and Wm is high; weight = 0.4.
3. IF (exposure is dangerous) THEN Wh is low and Wm is low; weight = 0.8.

Each rule reflects a specific situation and the pilot’s judgment about the quality of the route that suits that situation. Rule 3, for example, tries to minimize the helicopter’s exposure to detection devices by keeping it as close as possible to natural obstacles like tree lines and mountains. On the other hand, rule 1 tries to keep maximum clearance from obstacles when the wind speed is high.

1. To simplify the example, assume that the degrees of fulfillment (DOF) of the IF terms are as follows:

DOF(wind is high) = 0.7, DOF(time is critical) = 0.8, DOF(fuel is low) = 0.5, DOF(exposure is dangerous) = 0.9.

2. Using Mamdani’s min operator, the DOF of the precondition part of each rule is computed as follows:

DOF(rule 1 precondition) = min(0.7, 0.8) = 0.7,
DOF(rule 2 precondition) = min(0.5) = 0.5,
DOF(rule 3 precondition) = min(0.9) = 0.9.

3. Each rule’s DOF is found by multiplying its precondition’s DOF by its weight:

DOF(rule 1) = 0.7 ∗ 0.6 = 0.42,
DOF(rule 2) = 0.5 ∗ 0.4 = 0.20,
DOF(rule 3) = 0.9 ∗ 0.8 = 0.72.

4. The membership functions shown in Fig. 10 are used for the three route weights. Here, the range of the weights is set to [0.1, 1] rather than [0, 1] as presented before. Assigning a small value, like 0.1, to a route weight reflects its unimportance; however, when there is more than one candidate route, that small value will still result in choosing the route that optimizes that specific weight. Notice also that the domain of the membership functions is set to [−0.35, 1.45]. This is because the center-of-gravity defuzzification method does not cover the extreme ends of the domain of triangular membership functions; the domain [−0.35, 1.45] allows the output to fully cover the range [0.1, 1]. Now, the membership function of each term in the consequence part of the three rules is clipped by the rule’s DOF, as shown in Fig. 11.

5. The clipped areas for each route weight are superimposed and the center of gravity of the resulting shape is found (see Fig. 12).

Fig. 10. Membership functions for the route’s cost weights.


Fig. 11. Clipping membership functions in the consequence part of the three rules.

First, the total area (including overlapping regions) for each cost weight is found:

Ad = 0.5 ∗ (0.42) ∗ (0.9 + 0.522) + 0.5 ∗ (0.2) ∗ (0.9 + 0.72) = 0.461,
Ah = 0.5 ∗ (0.42) ∗ (0.9 + 0.522) + 0.5 ∗ (0.72) ∗ (0.9 + 0.252) = 0.713,
Am = 0.5 ∗ (0.42) ∗ (0.9 + 0.522) + 0.5 ∗ (0.2) ∗ (0.9 + 0.72) + 0.5 ∗ (0.72) ∗ (0.9 + 0.252) = 0.875.

Next, the center of gravity for each cost weight is computed:

Wdc = (0.30 ∗ 1.0 + 0.162 ∗ 0.55)/(0.30 + 0.162) = 0.842,
Whc = (0.30 ∗ 0.55 + 0.41 ∗ 0.1)/(0.30 + 0.41) = 0.290,
Wmc = (0.3 ∗ 0.55 + 0.162 ∗ 1.0 + 0.41 ∗ 0.1)/(0.3 + 0.162 + 0.41) = 0.422.

Fig. 12. The resulting area for each route’s cost weight.


6. Next, each cost weight’s center of gravity is weighted by its corresponding area:

Wds = Wdc ∗ Ad = 0.842 ∗ 0.461 = 0.388,
Whs = Whc ∗ Ah = 0.290 ∗ 0.713 = 0.207,
Wms = Wmc ∗ Am = 0.422 ∗ 0.875 = 0.369.

7. Last, the weights found in step 6 are scaled to sum up to 1:

Wd = Wds/(Wds + Whs + Wms) = 0.388/(0.388 + 0.207 + 0.369) = 0.4,
Wh = Whs/(Wds + Whs + Wms) = 0.207/(0.388 + 0.207 + 0.369) = 0.2,
Wm = Wms/(Wds + Whs + Wms) = 0.369/(0.388 + 0.207 + 0.369) = 0.4.
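The whole inferencing pipeline of steps 1–7 can be sketched numerically as follows. The triangular membership shapes and the rule encoding are illustrative assumptions standing in for Fig. 10, and the area bookkeeping uses the composed (superimposed) shape rather than the per-term trapezoids of step 6, so the sketch reproduces the pipeline rather than the exact numbers of Example 3.

```python
import numpy as np

x = np.linspace(-0.35, 1.45, 361)   # membership domain used in Example 3
dx = x[1] - x[0]

def tri(a, b, c):
    """Triangular membership function sampled on the domain x."""
    return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

# Assumed 3-term partition of [0.1, 1]; Fig. 10 defines the paper's shapes.
low, med, high = tri(-0.35, 0.1, 0.55), tri(0.1, 0.55, 1.0), tri(0.55, 1.0, 1.45)

# Example 3 rules: (precondition DOFs, rule weight, consequent terms).
rules = [
    ([0.7, 0.8], 0.6, {"Wd": high, "Wh": med, "Wm": med}),
    ([0.5],      0.4, {"Wd": med,  "Wm": high}),
    ([0.9],      0.8, {"Wh": low,  "Wm": low}),
]

composed = {"Wd": np.zeros_like(x), "Wh": np.zeros_like(x), "Wm": np.zeros_like(x)}
for dofs, rule_weight, consequents in rules:
    dof = min(dofs) * rule_weight                  # steps 2 and 3
    for name, mf in consequents.items():           # step 4: clip by the DOF,
        composed[name] = np.maximum(composed[name], np.minimum(mf, dof))  # step 5: superimpose

scaled = {}
for name, shape in composed.items():               # every weight fires here, so area > 0
    area = shape.sum() * dx                        # area of the composed shape
    cog = (shape * x).sum() * dx / area            # step 5: center of gravity
    scaled[name] = cog * area                      # step 6: weight the CoG by the area

total = sum(scaled.values())                       # step 7: scale to sum to 1
weights = {name: value / total for name, value in scaled.items()}
print(weights)
```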

6.3. Search engine configuration

The A∗ algorithm of the search engine generates the optimal route, as explained in Section 4. However, there is no way to control the runtime of the A∗ algorithm; thus, for situations with very limited time availability, the A∗ algorithm alone is not appropriate. Examples of severely time-limited situations and missions include rescuing victims and escaping from a threat. For such situations, the real-time feature of route planning is more important than route optimality. In Section 4, it was explained that each cell n searched (i.e., entered in the OPEN list) by the search engine is assigned the cost

cost(n) = kg ∗ g(n) + kh ∗ h(n)

(refer to Section 4 for the meaning of the variables). Setting kg = kh = 1 configures the search engine to


the standard A∗ search. Setting kg ≠ 0 and kh = 0 configures the search engine to optimal search, which is some form of breadth-first search. Optimal search, like A∗, finds the optimal route, but it is usually slower than A∗ since it does not direct the search towards the destination cell. On the other hand, setting kg = 0 and kh ≠ 0 configures the search engine to best-first search, which is some form of depth-first search. Best-first search directs the search towards the destination cell regardless of the route’s optimality. This feature of best-first search allows interrupting the search engine to collect an incomplete route; A∗ and optimal search, on the other hand, propagate in several directions seeking the optimal route.

Before explaining how fast route planning works, we introduce the notion of on-line partial route planning: only part of a route (i.e., a route that goes toward the destination cell but stops short of it) is planned on-line and validated to the AV for navigation. While the AV starts tracking that incomplete route, the route planner, in parallel, plans the next part of the route, and so on, until a complete route (i.e., a route that reaches the destination cell) is finished. By combining the fast, directed best-first search with this partial planning method, non-optimal but very fast routes, in terms of planning time, can be planned.

Now, to achieve fast route planning, the supervisory module configures the search engine to best-first and, based on the criticality of the situation, sets a time limit for the search engine. The search engine resets a clock at the start of the search process and observes the elapsed search time at the end of every cell expansion loop. Whenever the search engine passes that time limit, it stops the search and provides the route segment that has been planned so far. If the destination point has not been reached yet, the search engine starts from the point it stopped at and resumes planning towards the destination point. This process continues until the search engine reaches the destination point. The supervisory module configures the search engine to A∗ in normal situations to obtain optimal routes. To switch the search engine to the fast mode and set a runtime limit, the supervisory module uses a simple rule base that is constructed by human experts and operators. An example of such a rule base is


Fig. 13. Partially planned best-first route for Example 4.

the following:

If command is “escape” Then kg = 0, kh = 1, T = 0.1 s.
If command is “rescue” Then kg = 0, kh = 1, T = 0.5 s.
...
Else (command is “navigate”) Then kg = 1, kh = 1,

where the inputs for the rules come from an automatic mission planner or a human operator, and T is the runtime limit. Other structures for the rule base, e.g., a fuzzy rule base similar to that used to compute the route’s weights, can be used to configure the search engine. The key factor for a fuzzy rule base is the inferencing time, which should be less than the smallest T in the rule base.
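How the time limit and the resume behaviour might wrap around the search engine is sketched below; the search_until interface is a hypothetical decomposition, not the paper's code.

```python
import time

def plan_partially(search_until, start, goal, T):
    """Sketch of the time-limited, partial best-first planning of Section 6.3.
    search_until(start, goal, deadline) is a hypothetical interface that runs
    a best-first search and returns the route found so far when the deadline
    passes (or the complete route if the goal is reached first)."""
    route, cur = [start], start
    while cur != goal:
        segment = search_until(cur, goal, deadline=time.monotonic() + T)
        if segment is None or len(segment) < 2:
            return None                    # no progress: no feasible route
        route += segment[1:]               # drop the duplicated joint cell
        cur = route[-1]
        # The AV can already start tracking the latest segment while the
        # next call plans the following part of the route; this sketch
        # plans sequentially rather than in parallel.
    return route
```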

Example 4. In a hostile radar detection situation, the supervisory module configured the search engine as best-first and set an emergency time threshold of 0.1 s for the search engine to provide a route for the helicopter to escape and return to its ground station. The best-first engine planned the emergency route partially, in two steps. The two segments of the emergency route are shown in Fig. 13, with a star (∗) at the start cell and the end cell of each segment. Table 3 summarizes computation data for each route segment. Notice that the A∗ optimal route took 2.05 s (see Fig. 14 and Table 4). The partial best-first route took 0.10 s for the first segment and 0.07 s for the second segment. Although the total planning time is 0.17 s, a route for escape (i.e., the first segment of the route) was ready for the autonomous helicopter after 0.10 s. Notice also that the total cost of the best-first route is 46.82, which is 68% larger than the cost of the A∗ route.


Table 3. Best-first route computation data for Example 4

Route segment | Start cell | Stop cell | Wd | Wh | Wm | Number of expansions | Route length (cells) | CPU time (s) | Cost
------------- | ---------- | --------- | -- | -- | -- | -------------------- | -------------------- | ------------ | ----
Segment 1 | (80, 180) | (126, 70) | 0.1 | 0.1 | 0.8 | 1871 | 141 | 0.10 | 34.10
Segment 2 | (126, 70) | (120, 47) | 0.1 | 0.1 | 0.8 | 2184 | 64 | 0.07 | 12.72
Total | | | | | | 4055 | 205 | 0.17 | 46.82


Fig. 14. A∗ route for Example 4.

Table 4. A∗ route computation data for Example 4

Search algorithm | Start cell | Goal cell | Wd | Wh | Wm | Number of expansions | Route length (cells) | CPU time (s) | Cost
---------------- | ---------- | --------- | -- | -- | -- | -------------------- | -------------------- | ------------ | ----
A∗ route | (80, 180) | (120, 47) | 0.1 | 0.1 | 0.8 | 20,495 | 181 | 2.05 | 27.81

7. The learning-support module

The route planning effort may be improved further if previous planning knowledge is taken advantage of in a learning setting. The learning approach adopted here is based on determining and storing route costs at some landmark positions and using them later to guide the search engine; better estimates of the heuristic cost, h(n), are thus provided to the search algorithm. The learning algorithm entails knowledge acquisition, a knowledge base, and an inference engine. The remainder of this section describes the three main components of the learning-support module and how it supports the function of the route planner. Experimental results are used to highlight the effectiveness of the learning system as it reduces the search time and improves the overall planning effort. The first step towards building the learning-support system is to allocate appropriate landmarks on the map. Those landmarks are fictitious marks set at some cells of the digitized map. For simplicity,

uniformly distributed landmarks are used. However, a targeted distribution that assigns landmarks only to interesting areas in order to minimize the number of landmarks can be used. Fig. 15 depicts the landmark assignment scheme that has been employed for the learning-support system. It will be seen that each landmark is associated with a knowledge base containing route costs from that landmark to all other reachable landmarks on the map. Notice that some landmarks coincide with obstacle cells. Knowledge acquisition aims to provide costs of routes from each landmark to all other reachable landmarks on the map and store them in a knowledge base. A cost matrix accompanies each landmark in the knowledge base and contains route costs from that specific landmark to all other reachable landmarks. The uniform landmark distribution, used here, makes it possible to formulate a landmark cost matrix as a 2D matrix where each entry is the route cost to the corresponding landmark. The advantage of this approach is that the memory burden is shifted from the computer’s


Fig. 15. A uniform landmark assignment (the small white dots are selected landmarks).

RAM to the hard drive; thus, instead of searching a large area that requires large RAM storage of lengthy OPEN and CLOSED lists, the search is reduced considerably by loading relatively small cost matrices into the RAM. This feature will be highlighted further under knowledge inferencing. Knowledge acquisition can be carried out either on-line or off-line. On-line knowledge acquisition provides a partial knowledge base that grows as more routes are planned and hence more knowledge is gained. Kuipers and Byun [17] used sensors mounted on a navigating robot to collect data of the terrain and build a topological representation of it. Off-line knowledge acquisition, on the other hand, results in a complete knowledge base. In this paper, only off-line knowledge acquisition will be discussed. The on-line method is similar to the off-line acquisition with some modifications. To generate the cost matrix for a specific landmark, the A∗ search engine is run with that landmark as a starting cell, until the whole map is searched. This generates a route net centered at that landmark. That route net covers the whole map. The heuristic cost,

h(·), is set to zero in the search to make the search monotonic, so when A∗ terminates, an optimal route mesh has been constructed from that specific landmark to all other landmarks. Recall that when the A∗ search is monotonic, the optimal route to a cell is reached when that cell is expanded. Now, to construct the cost matrix for the landmark, the g(·) costs of the cells that correspond to landmarks are removed from the CLOSED list and entered into the cost matrix. g(n) is the optimal route’s cost from the landmark for which the matrix is constructed to landmark n. The same process is repeated to construct cost matrices for the other landmarks or at least those landmarks that fall in regions that are most likely to be visited as destination areas. With a uniform landmark distribution, some landmarks may coincide with obstacles. Those landmarks are ignored. The cost matrices for the map reachable landmarks constitute the knowledge base. On-line knowledge acquisition is conducted while the route planner is planning routes for real missions. After each route planning activity, the knowledge base is updated and expanded to include cost matrices for landmarks in newly visited (i.e., searched) areas


of the map. Since the start cell of the planned route may not coincide with a landmark cell, only some cost knowledge can be extracted for landmarks that are close to the start cell. With this in mind, on-line knowledge acquisition is conducted as follows. Assume that the A∗ search engine planned a route from a start cell s to a destination cell d. Assume also that cell n is a landmark cell that is close to the start cell s and that the search engine searched cell n (i.e., cell n was entered in the OPEN list). After termination of the A∗ search, the cost matrix of the landmark cell n can be constructed (or updated if it already exists) to include route costs to landmarks that were searched. Assume, e.g., that landmark cell m was searched by the search engine. The route cost from landmark cell n to landmark cell m can then be approximated as

c(n, m) ≈ g(m) − g(n),

where c(n, m) is the route cost from landmark cell n to landmark cell m, and g(m) and g(n) are the g costs of landmark cells m and n in the OPEN list. If the A∗ search engine is not monotonic, which is very possible when some heuristic cost h(·) is used in the on-line search, g(m) and g(n) may not be the costs of the optimal routes from the start cell s to cells m and n, respectively; c(n, m) is then only the best available estimate of the route cost from landmark cell n to landmark cell m. The same procedure is repeated to cover all landmark cells searched by the search engine. This constructs a partial cost matrix for landmark cell n. With more route planning, the cost matrices for landmark n and other landmarks are updated. A cost matrix is updated by replacing some of its entries with smaller cost values when the search finds lower-cost routes between some previously visited landmarks.

The knowledge base is used, next, to guide the search engine by providing heuristic costs, h(·), closer to the actual values. It is well known that the better the heuristic cost estimate, the fewer cells will be searched [15]. Actually, if the heuristic cost, h(·), is equal to the actual cost, h∗(·), then only the cells of the optimal route will be searched by the A∗ or the best-first algorithm. The inference engine will not provide exact heuristics, but it will give much better heuristics than those used in Section 4. The search engine calls the inference engine every time a new cell is entered into the OPEN list to provide an estimate of the cost (i.e., h(·)) of the optimal route from that specific cell to the destination cell. Recall that the main purpose of all of this is to reduce the number of cell expansions, and hence the search time; so a critical factor here is the inference engine’s CPU time, and the inferencing process should be fast enough to justify the reduction in cell expansions. To estimate the heuristic cost, h(n), of cell n, the four landmarks M1, M2, M3, and M4 that surround cell n are identified, assuming they do not coincide with obstacles. The four landmarks M5, M6, M7, and M8 that surround the destination cell are also identified, and their cost matrices are loaded from the knowledge base (see Fig. 15 for an illustration). Now, assume that R(a, b) is the route from cell a to cell b, and C(R(a, b)) is the cost of R(a, b). The inference engine then considers the 16 routes

L_ij = R(n, M_i) + R(M_i, M_j) + R(M_j, Destination), i = 1, 2, 3, 4, j = 5, 6, 7, 8.

Fig. 16 shows L_16, where R(n, M1) and R(M6, Destination) are depicted as dotted lines, while R(M1, M6) is shown as a solid line. The costs of these 16 routes are computed as

C(L_ij) = C(R(n, M_i)) + C(R(M_i, M_j)) + C(R(M_j, Destination)).

C(R(n, M_i)) and C(R(M_j, Destination)) are computed on the fly, while C(R(M_i, M_j)) is retrieved from the cost matrix of landmark M_j by locating the matrix entry that corresponds to landmark M_i. Computing C(R(n, M_i)) and C(R(M_j, Destination)) does not require excessive runtime, because these sections are approximated by straight line segments and their sizes are very small compared to the route’s size. Finally, h(n) is computed by averaging the costs of the two least-cost routes among the 16 routes considered. Designating the two least costs by L_a and L_b, h(n) is computed as

h(n) = (L_a + L_b)/2.

The two least costs are chosen to increase the possibility of satisfying the admissibility and monotonicity conditions. However, there is no guarantee that these two conditions will not be violated.
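In code, this inference step might look like the following sketch; the dictionary-based landmark cost matrices and the straight_cost helper are illustrative assumptions.

```python
def infer_h(cell, dest, near_marks, dest_marks, cost_matrix, straight_cost):
    """Estimate h(cell) from landmark knowledge (Section 7).
    near_marks: up to four landmarks M1..M4 surrounding `cell`;
    dest_marks: up to four landmarks M5..M8 surrounding `dest`;
    cost_matrix[mj][mi]: stored route cost between landmarks Mi and Mj;
    straight_cost(a, b): cost of the straight segment from a to b."""
    costs = []
    for mi in near_marks:                      # i = 1..4
        head = straight_cost(cell, mi)         # C(R(n, Mi))
        for mj in dest_marks:                  # j = 5..8
            via = cost_matrix[mj].get(mi)      # C(R(Mi, Mj)) from Mj's matrix
            if via is None:
                continue                       # unreachable or matrix incomplete
            costs.append(head + via + straight_cost(mj, dest))  # C(L_ij)
    if len(costs) < 2:
        return 0.0                             # fall back to a conservative heuristic
    costs.sort()
    return 0.5 * (costs[0] + costs[1])         # average of the two least costs
```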


Fig. 16. Estimating the heuristic cost, h, of cell n.

Example 5. Route planning experiments, with and without learning support, were conducted on the map shown in Fig. 17 to plan shortest routes. That particular maze-like map was chosen to make the route planning task somewhat challenging. In this example, the route planner used 8-cell adjacency with no route filtering. The map was divided into 34,000 (170 × 200) cells, and routes were planned from 67 different start points, evenly distributed over the map, to a fixed destination point. Fig. 17 compares route planning with and without learning support for three particular cases. The start and destination points are denoted by S and D, respectively. The lightly shaded area in each map is the area searched by the route planner. Results for the three cases are summarized in Tables 5 and 6; note the reduction in the number of searched cells when learning support is used. Fig. 18 shows statistical results of the 67 route planning experiments.
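For concreteness, the 8-cell adjacency used in this example can be sketched as below; the blocked obstacle predicate and the unit/√2 step costs are illustrative assumptions, while the grid dimensions match the example.

```python
import math

ROWS, COLS = 170, 200  # grid size used in Example 5

def neighbors(cell, blocked):
    """Yield the 8-adjacent cells of `cell` with their step costs:
    1 for straight moves, sqrt(2) for diagonal moves (assumed costs)."""
    r, c = cell
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            nr, nc = r + dr, c + dc
            if 0 <= nr < ROWS and 0 <= nc < COLS and not blocked((nr, nc)):
                yield (nr, nc), (math.sqrt(2.0) if dr and dc else 1.0)
```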

Table 5
Route planning without learning

Case   Start cell   Goal cell   Number of expansions   Number of expanded cells   Route length (cells)   Route cost
a      (120, 10)    (35, 155)   27,049                 27,049                     288                    320.93
b      (140, 120)   (35, 155)   27,361                 27,361                     278                    297.00
c      (40, 20)     (35, 155)   4,088                  4,088                      136                    138.90

Table 6
Route planning with learning

Case   Start cell   Goal cell   Number of expansions   Number of expanded cells   Route length (cells)   Route cost
a      (120, 10)    (35, 155)   14,002                 9,111                      288                    320.93
b      (140, 120)   (35, 155)   12,627                 7,867                      278                    297.00
c      (40, 20)     (35, 155)   3,314                  3,314                      136                    138.90


Fig. 17. Routes for the three cases of Example 5, with learning (right) and without learning (left); the shaded area is the searched area.


Fig. 18. Histograms of the number of cell expansions in Example 5: (a) without learning (x̄NL = 20,330, sNL = 10,874); (b) with learning (x̄L = 12,556, sL = 8127).

Fig. 18(a) shows a histogram of the number of cell expansions without learning, with mean x̄NL = 20,330 and standard deviation sNL = 10,874. The spread of the data, with clusters at the extremes, is due to the misleading dead-ends of the map and the low-quality guidance of route planning without learning. Fig. 18(b) shows a histogram of the number of cell expansions when learning support was implemented; the sample mean is x̄L = 12,556 and the standard deviation sL = 8127.


Notice how knowledge of route costs at some landmarks resulted in a reduction of 7,774 (38%) in the mean number of cell expansions. Besides this reduction in the number of cell expansions, the learning scheme did not affect the quality of the planned routes; in all the experiments, route planning with learning support produced the same routes, with the same route costs, as planning with the conservative heuristic (the straight-line distance).
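Indeed, 20,330 − 12,556 = 7,774 expansions, and 7,774/20,330 ≈ 0.38, which matches the quoted 38% reduction.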

8. Conclusions

This paper presented the design of an intelligent real-time route planner for AV systems in which the AV conducts missions at high speeds in a natural terrain. An intelligent supervisory module configures the search engine to meet the objectives and the time constraints of the mission. The search engine uses a pre-prepared digitized map of the mission terrain to plan suboptimal routes in terms of distance, safety, and maneuvering. The planned routes are further optimized and smoothed by a fast route filter. The planning effort, in terms of CPU runtime and RAM requirements, is reduced by an intelligent learning-support module that learns from previous planning experience to provide better heuristics to the search engine. The developments and results presented in this paper contribute towards a truly autonomous vehicle capable of executing complex tasks at high speeds in natural environments.

References

[1] P.G. Fahlstrom, T.J. Gleason, Introduction to UAV Systems, UAV Systems, Columbia, MD, 1994.
[2] A. Meystel, Autonomous Mobile Robots, World Scientific Publishing, London, 1991.
[3] S.S. Iyengar, A. Elfes, Autonomous Mobile Robots: Perception, Mapping, and Navigation, IEEE Computer Society Press, Los Alamitos, CA, 1991.
[4] J. Latombe, Robot Motion Planning, Kluwer Academic Publishers, Boston, MA, 1990.
[5] E.W. Dijkstra, A note on two problems in connexion with graphs, Numerische Mathematik 1 (1959) 269–271.
[6] P.E. Hart, N.J. Nilsson, B. Raphael, A formal basis for the heuristic determination of minimum cost paths, IEEE Transactions on Systems Science and Cybernetics SSC-4 (2) (1968) 100–107.


[7] O. Khatib, Real time obstacle avoidance for manipulators and mobile robots, The International Journal of Robotics Research 5 (1) (1986) 90–98.
[8] Y. Hwang, N. Ahuja, A potential field approach to path planning, IEEE Transactions on Robotics and Automation 8 (1) (1992) 23–32.
[9] E. Plumer, Cascading a Systolic Array and a Feedforward Neural Network for Navigation and Obstacle Avoidance Using Potential Fields, NASA Contractor Report 177575, 1991.
[10] J. Allison, D. Watson, Intelligent waypoint transiting in complex AUV environs, in: Proceedings of the Sixth International Symposium on Unmanned Untethered Submersible Technology, 1989, pp. 246–257.
[11] K.P. Carroll, S.R. McClaran, E.L. Nelson, D.M. Barnett, D.K. Friesen, G.N. Williams, AUV path planning: An A∗ approach to path planning with consideration of variable vehicle speeds and multiple overlapping, time-dependent exclusion zones, in: Proceedings of the 1992 IEEE Symposium on Autonomous Underwater Vehicle Technology, 1992, pp. 79–84.
[12] N.A. Vlassis, N.M. Sgouros, G. Efthivoulidis, G. Papakonstantinou, Global path planning for autonomous qualitative navigation, in: Proceedings of the Eighth IEEE International Conference on Tools with Artificial Intelligence, 1996, pp. 354–359.
[13] R.C. Arkin, Navigational path planning for a vision-based mobile robot, Robotica 7 (1989) 49–63.
[14] J.C. Hyland, Optimal obstacle avoidance path planning for autonomous underwater vehicles, in: Proceedings of the Sixth International Symposium on Unmanned Untethered Submersible Technology, 1989, pp. 266–278.
[15] Y. Shirai, J. Tsujii, Artificial Intelligence: Concepts, Techniques, and Applications, Wiley, England, 1984.
[16] L. Kanal, V. Kumar (Eds.), Search in Artificial Intelligence, Springer, New York, 1988.
[17] B. Kuipers, Y. Byun, A robot exploration and mapping strategy based on a semantic hierarchy of spatial representation, Robotics and Autonomous Systems 8 (1991) 47–63.

S. Al-Hasan was born in 1967 in Saudi Arabia. He received the B.S. degree in Systems Engineering from King Fahd University of Petroleum and Minerals, Saudi Arabia, in 1990, and the M.S. degree in Electrical Engineering from Georgia Institute of Technology in 1994. He received the Ph.D. degree in intelligent control from Georgia Institute of Technology in 1999. Currently, he is an Assistant Professor of Electrical Engineering and Dean of the College of Technology at Al-Ahsa, Saudi Arabia. His areas of interest include intelligent control, fuzzy systems, object-based planning, and autonomous systems.


G. Vachtsevanos is Professor of Electrical and Computer Engineering at the Georgia Institute of Technology. He received a BEE degree from the City College of New York, an MEE degree from New York University, and a Ph.D. degree from the City University of New York. He directs the Intelligent Control Systems Laboratory at Georgia Tech, where faculty and students are conducting research in intelligent

control, diagnostics and prognostics for condition-based maintenance, vision-based inspection, and control of industrial, bioengineering, and manufacturing systems. He has published over 250 papers in his areas of expertise and is a senior member of the IEEE. His research contributions have been recognized through civic and professional awards. He serves as Associate Editor of the International Journal of Intelligent and Robotic Systems and is a consultant to government agencies and industrial organizations.