Robotics and Autonomous Systems 16 (1995) 197-211

Deliberation and reactivity in autonomous mobile robots

Raja Chatila 1
LAAS-CNRS, 7 Ave. du Colonel Roche, 31077 Toulouse Cedex, France
Abstract
This paper discusses issues related to the design of control architectures for an autonomous mobile robot capable of performing its tasks efficiently and intelligently, i.e. in a manner adapted to its environment, to its own state and to the execution status of its task. We present our developments and experiments on mobile robot navigation and show that it is necessary to produce representations at several levels of abstraction, which are used by adequate processes for obstacle detection, target recognition, robot localization, and motion planning and control. We also show that deliberation is necessary for the robot to anticipate events, take efficient decisions, and react adequately to asynchronous events. Finally, we discuss the organization of the system, i.e. the design of the control architecture.

Keywords: Mobile robots; Control architectures; Deliberation; Navigation; Behavioral control
1. Introduction
We discuss in this paper the functions to be implemented in a mobile robot in order to endow it with an autonomous and intelligent behavior in achieving its goals. We use navigation, an essential activity for a mobile robot, as a framework for discussing the issues of representation and deliberation. On the basis of our developments and experiments on mobile robot navigation, we show that it is necessary to produce representations at several levels of abstraction, used by adequate processes in order to (a) evaluate current robot and environment states (here: obstacle and free space models, robot position, etc.); (b) anticipate future robot state and environment evolution in order to decide forthcoming actions (planning of motion and perception actions);
1 E-mail: [email protected]
(c) react to events (obstacle avoidance, motion control). The way these representation, reasoning and reaction capacities interact and are interleaved, i.e. the architecture of the system, is a central issue that will be addressed. The organization of the system embedding the perception and motion functions, and the control of these processes, actually determines the behavior of the system. The role of the control system is to take into account the variations due to the execution context in a consistent manner and to adapt to the actual external constraints and to the available time and resources. In Section 2 we address the navigation problem, illustrated by results from real experiments with the systems we have implemented. Architectural concepts and the architecture of the system are then presented in Section 3. In Section 4 we discuss some issues on robot intelligence and the claims made about it by the "behavior-based" approaches [4-6], mainly because of their current important influence and their strong statements. Some concluding remarks close the paper.
2. Autonomous mobile robot navigation

2.1. Introduction
Navigation is the basic task that has to be solved by an autonomous mobile robot. By navigation, we understand the task of reaching an arbitrarily distant goal, the path being neither predefined nor given to the robot. Navigation is in general an incremental process that can be summarized in four main steps:
(1) Environment perception and modelling: any motion requires a representation of at least the local environment, and often more global knowledge. The navigation process has to decide where, when and what to perceive;
(2) Localization: the robot needs to know where it is with respect to its environment and goal;
(3) Motion decision and planning: the robot has to decide where or which way to go, locally or at longer term, and possibly compute a trajectory;
(4) Motion execution: the commands corresponding to the motion decisions are executed by control processes, possibly sensor-based and using environment features.
The complexity of the navigation processes depends on the general context in which this task is to be executed (nature of the environment, efficiency constraints, ...). For example, if the robot moves in a very dynamic environment (with respect to its own velocity and computing power), then there is no point in planning a trajectory, as it would be obsolete before it is even started. The algorithms for navigation in this case should be very fast and rely on local data. The motion planning and perception steps can then be made very simple, for example an attraction by the goal with local avoidance, and a collision test, respectively. On the other hand, if much is known in advance about the environment and if perturbations are transient, then the solution is to plan ahead, especially if there is an efficiency constraint on the path (and there always is, at least because of energy or time limitations), and adapt execution to local perturbations.
An environment model is needed for navigation in order to plan a collision-free path to reach the goal with respect to a criterion (distance, time, security, etc.). It is also needed for the robot to locate itself with respect to it. Otherwise, reaching a given goal (that is not in sight) will in general be impossible, considering the uncertainties of odometry or inertial sensors. In addition, sensors are always imperfect: their data are incomplete (lack of information concerning existing features) and inaccurate. They generate artefacts (information on non-existing features) and errors (wrong information concerning existing features). The same area, when perceived again, can therefore be differently represented. Hence environment representations must take uncertainties into account and tolerate variations. Models are also different in indoor and outdoor environments: natural environments are not intrinsically structured, as compared to indoor environments where simple geometric primitives (e.g., planar surfaces) come close to matching reality. Finally, the representations must be at several levels of abstraction, i.e. adapted to the level of the processes using them. Path finding, which is based on geometrical and topological representations, cannot in general be solved (or at least only very inefficiently) by local methods driving the robot directly, which is equivalent to performing the search physically, without mapping. A variety of motion planning algorithms exist. According to their complexity and the representations they manipulate, they are adapted to different situations that the robot as a system must recognize: constrained space, uneven terrain, etc. Several motion planners should therefore be embedded in the system; selecting the most appropriate one according to the situation is devoted to the control system. Finally, to cope with perception and motion uncertainties, it is more robust to plan the motion directly in terms of closed-loop sensor-based processes, rather than executing a geometrical trajectory relying on odometry or inertial data. Another approach is to compute a trajectory that takes into account possible error reduction by sensing. We develop these various issues in the following sections, in the context of outdoor navigation. The problem of long range navigation in unknown outdoor environments is not very frequently addressed. Important achievements are Robbie [24], Ambler [14] and the navigation of the UGV [12].
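The incremental process just outlined can be pictured as a simple loop. The sketch below is purely illustrative: the function and method names (perceive, localize, plan_motion, execute, ...) are hypothetical placeholders introduced here, not part of any system described in this paper.

```python
# Minimal sketch of the incremental navigation loop (perception and modelling,
# localization, motion decision and planning, motion execution).
# All names are hypothetical placeholders.

def navigate(robot, goal, max_steps=100):
    for _ in range(max_steps):
        # (1) Environment perception and modelling: update the environment model
        local_model = robot.perceive()
        robot.world_model.fuse(local_model)

        # (2) Localization: estimate the robot pose with respect to the model
        pose = robot.localize(robot.world_model)
        if robot.world_model.reached(pose, goal):
            return True

        # (3) Motion decision and planning: choose a sub-goal and a trajectory
        subgoal = robot.world_model.select_subgoal(pose, goal)
        trajectory = robot.plan_motion(pose, subgoal)

        # (4) Motion execution: closed-loop execution of the planned motion
        robot.execute(trajectory)
    return False
```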
2.2. An adaptive approach

According to a general "economy of means" principle, due to the limitations of on-board processing capacities, memory and energy, and in order to achieve an efficient behaviour, the robot should adapt to the nature of the terrain [8,16]. We consider three motion modes that are to be selected according to the situation (a minimal selection sketch is given at the end of this subsection):
• A reflex mode: on large flat and lightly cluttered zones, the robot locomotion commands are determined on the basis of a goal (heading or position) and information provided by "obstacle detector" sensors. The terrain representation required by this mode is just the description of the borders of the region within which it can be applied;
• A 2D planned mode: when the environment is mainly flat, but cluttered with obstacles, it is more efficient for the robot to plan its motion. The trajectory planner reasons on a binary description of the environment, in terms of crossable/non-crossable areas;
• A 3D planned mode: when the environment is highly constrained (uneven terrain), collision and stability constraints have to be checked to determine the robot locomotion commands. This is done by a 3D trajectory planner [22] that reasons on a fine 3D description of the terrain, a numerical terrain model (NTM) [19].
Hence, different terrain representations and planning processes are needed for an improved efficiency of the robot coping with its environment. The selection of the adequate mode is performed by a specific planning level, the navigation planner. Navigation planning reasons on a global qualitative representation of the terrain, built from the data acquired by the robot's sensors. The navigation planner selects the next perception task to perform, the sub-goal to reach and the motion mode to apply: it selects and controls the trajectory planning which determines the trajectory to execute (in one of the above-mentioned three motion modes) to reach the goal defined by the navigation planning level. Splitting the decisional processes into different levels of planning has the advantage of structuring the problem. It also greatly helps to analyse and solve failure cases: when a planner fails to reach its goal, this means that the environment representation of the upper level is erroneous and therefore that it has to be revised. The navigation planner is systematically activated at each step of the incremental execution of the task: each time 3D data are acquired, they are analysed to provide a description of the perceived zone in terms of navigation classes. This description is fused to maintain a global qualitative representation of the environment (the region map), on which the navigation planner reasons to select a sub-goal, a motion mode to apply and the next perception task to execute. The introduction of the navigation planning layer defines a particular instance of the usual "perception-decision-action" loop, in which the "decision" part is split into two distinct processes: navigation planning and trajectory planning.
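The mode selection rule described above can be summarized in a few lines of code. The following fragment is only a hedged illustration; the enumeration of terrain classes and the function name are assumptions, not the actual implementation.

```python
# Hypothetical sketch of the motion-mode selection performed by the navigation
# planner: the mode follows from the terrain class of the region to be crossed.

from enum import Enum

class TerrainClass(Enum):
    FLAT_CLEAR = "flat, lightly cluttered"
    FLAT_CLUTTERED = "flat, cluttered with obstacles"
    UNEVEN = "uneven"

def select_motion_mode(terrain: TerrainClass) -> str:
    if terrain is TerrainClass.FLAT_CLEAR:
        return "reflex"        # goal heading + "obstacle detector" sensors
    if terrain is TerrainClass.FLAT_CLUTTERED:
        return "2D planned"    # planning on a crossable/non-crossable description
    return "3D planned"        # 3D planner reasoning on the NTM
```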
2.3. Several terrain representations

Each of the three motion modes requires a particular terrain representation. The navigation planner also requires a specific terrain representation and, during navigation, an exteroceptive localization process has to be activated frequently to update the robot position with respect to environment features, which requires yet another terrain representation. Aiming at building a "universal" terrain model that contains all the necessary information for these various processes is extremely difficult, inefficient, and moreover not really useful. It is more direct and easier to build different representations adapted to their use: the environment model is then multi-layered and heterogeneous. Several perception processes coexist in the system, each dedicated to the extraction of specific representations: perception is multi-purpose. Fig. 1 presents the various terrain representations required during navigation: one can distinguish the numerical terrain model [19] necessary for the 3D trajectory planner, the region map dedicated to the navigation planner, and three different ways to build a localization model: (i) by modelling objects (rocks) with ellipsoids or superquadrics [3], (ii) by detecting interesting zones in the NTM (e.g., maxima of elevation) [10], or (iii) by detecting poles in the 3D raw data. Coherence relationships between these various representations are to be maintained when necessary.
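As a purely illustrative view of this multi-layered, heterogeneous model, the small container below groups the representations of Fig. 1; the field names are assumptions introduced here, not the paper's data structures.

```python
# Illustrative container for the multi-layered terrain model of Fig. 1.
# Field names are assumptions, not the actual data structures of the system.

from dataclasses import dataclass, field
from typing import Any, List

@dataclass
class TerrainModel:
    ntm: Any = None                  # numerical terrain model, for the 3D trajectory planner
    region_map: Any = None           # qualitative region map, for the navigation planner
    landmarks: List[Any] = field(default_factory=list)  # localization models: ellipsoids or
                                                        # superquadrics, elevation maxima, poles
```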
Fig. 1. The various representations used in the system. Arrows represent the constructive dependencies between them.
2.3.1. Building the region map

For the purpose of navigation planning, a global representation that describes the terrain in terms of navigation classes is required. We focus in this section on the algorithms developed to build such a model from 3D data (produced either by a laser range finder or a correlation stereo-vision algorithm).
3D data classification. Applied each time 3D data are acquired, the classification process produces a description of the perceived areas in terms of terrain classes. It relies on a specific discretization of the perceived area that respects the sensor resolution (Fig. 2) and defines "cells" on which different characteristics (attributes) are determined: density (number of points contained in a cell compared to a nominal density defined by the discretization rates), mean altitude, variance of the altitude, mean normal vector and corresponding variances, etc. A non-parametric Bayesian classification procedure
is used to label each cell: a learning phase based on prototypes classified by a human leads to the determination of probability density functions, and the classical Bayesian approach is applied, which provides an estimate of the probability of each possible label. A decision function that privileges false alarms (i.e. labelling a flat area as obstacle or uneven) over non-detections (i.e. the opposite: labelling an obstacle as a flat area) is used (Fig. 3). This technique proved its efficiency and robustness on several hundreds of 3D images. Its main interest is that it provides an estimate of the confidence of its results: this information is given by the entropy of a cell. Moreover, a statistical analysis of the cell labelling confidence as a function of its label and distance to the sensor defines a predictive model of the classification process.
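The cell labelling step can be sketched as follows. This is a hedged illustration only: the likelihood functions would come from the learning phase on hand-classified prototypes, and the bias factor used to favour false alarms over missed obstacles is an assumption introduced here, not the paper's exact decision function.

```python
# Sketch of Bayesian cell labelling with a decision rule biased towards false
# alarms (preferring to label flat terrain as obstacle/uneven rather than the
# opposite). Likelihoods, priors and the bias factor are assumptions.

import math

LABELS = ("flat", "uneven", "obstacle")

def label_cell(attributes, likelihoods, priors, bias=2.0):
    """attributes: dict of cell features; likelihoods: {label: callable(attributes)};
    priors: {label: prior}. Returns (label, confidence)."""
    post = {lbl: likelihoods[lbl](attributes) * priors[lbl] for lbl in LABELS}
    z = sum(post.values()) or 1e-12
    post = {lbl: p / z for lbl, p in post.items()}

    # Decision: inflate the scores of non-flat labels to privilege false alarms
    scores = {lbl: (p * bias if lbl != "flat" else p) for lbl, p in post.items()}
    label = max(scores, key=scores.get)

    # Confidence derived from the entropy of the posterior (low entropy = confident)
    entropy = -sum(p * math.log(p) for p in post.values() if p > 0)
    confidence = 1.0 - entropy / math.log(len(LABELS))
    return label, confidence
```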
Fig. 2. Discretization used for the classification procedure: a regular discretization in the sensor frame (left: a 3D image is represented as a video image, where the gray levels correspond to the point depths) defines a discretization of the perceived zone that respects the sensor resolution (right).

Fig. 3. Classification of a correlated stereo image: correlated pixels (left) and reprojection of the result in the camera frame (right, from clear to dark: unknown, flat, uneven and obstacle).

Incremental fusion. The partial probabilities of a cell belonging to each terrain class and the variance on its elevation allow us to fuse several classified images. The fusion procedure is performed on a bitmap, in the pixels of which are encoded some cell attributes determined by the classification procedure (label, label confidence, elevation and variance on the elevation).
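The fusion of a newly classified cell with the value already stored in the global bitmap can be sketched as below. The exact fusion formulas are not given in the paper; this fragment assumes a standard combination: label probabilities are multiplied and renormalized, and elevations are merged with variance weighting.

```python
# Hedged sketch of the incremental fusion of one cell into the global bitmap:
# label probabilities are combined as independent evidence, and elevation is
# updated with a variance-weighted (Kalman-like) rule. These formulas are
# assumptions, not the paper's exact procedure.

def fuse_cell(stored, new):
    """stored / new: dicts with 'probs' {label: p}, 'z' (elevation), 'var' (variance)."""
    probs = {lbl: stored["probs"][lbl] * new["probs"][lbl] for lbl in stored["probs"]}
    z = sum(probs.values()) or 1e-12
    probs = {lbl: p / z for lbl, p in probs.items()}

    k = stored["var"] / (stored["var"] + new["var"])
    elevation = stored["z"] + k * (new["z"] - stored["z"])
    variance = stored["var"] * new["var"] / (stored["var"] + new["var"])

    return {"probs": probs, "z": elevation, "var": variance}
```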
Model structure and management. For the purpose of navigation planning, the global bitmap model is structured into a region map that defines a connection graph. Planning a path (as opposed to planning a trajectory) does not require a precise evaluation of the static and kinematic constraints on the robot: we simply consider a point model of the robot, and therefore perform an obstacle growing in the bitmap before segmenting it into regions. The regions define a connection graph, whose nodes lie on their borders, and whose arcs correspond to a region crossing (Fig. 5). In order to satisfy memory constraints, the global model is made explicit as a bitmap only in the surroundings of the robot's current position, and the region
model (much more compact) is kept in memory (Fig. 6).
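A minimal sketch of this structuring step follows: obstacles are grown in the bitmap so that a point model of the robot can be used, and crossable regions are connected into a graph. The grid convention, growing radius and region description are illustrative assumptions.

```python
# Sketch of obstacle growing on the bitmap and of the region connection graph.
# Conventions (1 = obstacle, 0 = crossable) and the region description are
# assumptions made for illustration.

def grow_obstacles(grid, radius=1):
    """Dilate obstacle cells by 'radius' cells so a point robot model can be used."""
    h, w = len(grid), len(grid[0])
    grown = [row[:] for row in grid]
    for i in range(h):
        for j in range(w):
            if grid[i][j] == 1:
                for di in range(-radius, radius + 1):
                    for dj in range(-radius, radius + 1):
                        ni, nj = i + di, j + dj
                        if 0 <= ni < h and 0 <= nj < w:
                            grown[ni][nj] = 1
    return grown

def connection_graph(region_labels, adjacent_pairs):
    """region_labels: {region_id: terrain label}; adjacent_pairs: iterable of (r1, r2)
    regions sharing a border. Arcs correspond to crossing a (non-obstacle) region."""
    return [(a, b) for (a, b) in adjacent_pairs
            if region_labels[a] != "obstacle" and region_labels[b] != "obstacle"]
```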
2.3.2. Object-based representations

In order to be able to localize itself with respect to environment features, the robot needs to build representations of objects that it can recognize. When the terrain has been classified as flat but cluttered with obstacles, we extract by a segmentation process the objects that are lying on the ground. Those objects that are salient and easily recognisable (e.g., peak-like) can be considered as landmarks for localization. A topological model describes the organization of the objects in the scene [2]. This is illustrated in Fig. 7.

Fig. 4. Fusion of 8 different classified laser images: terrain classes (left) and elevation (right). Note that this is not a numerical representation.

Fig. 5. The model of Fig. 4 after obstacle growing (left) and the nodes defined by region segmentation.

2.4. Navigation planning

Each time 3D data are acquired, classified and fused into the global model, the robot has to answer autonomously the following questions:
• Where to go? (sub-goal selection);
• How to go there? (motion mode selection);
• Where to perceive? (data acquisition control);
• What to do with the acquired data? (perception task selection).
For that purpose, the navigation planner reasons on the robot's capabilities (action models for perception and motion tasks) and on the global terrain representation.
2.4.1. Planning motion versus planning perception
Fig. 6. Only the area surrounding the robot is made explicit as a bitmap.
A straightforward fact is that motion and perception tasks are strongly interdependent: executing a motion requires the environment to have been modelled beforehand, and acquiring some specific data often requires a motion to reach an adequate observation position.
Fig. 7. A 3D range image, extracted objects and scene model with topological relations. (Panels: contour image; range image as a gray-level image; result of object segmentation; labeled objects; partial topological scene model, e.g. OBJECT 1: contact with ground: yes; objects in front: nil; objects behind: 2, 4, 5; objects right: 2, 3, 5, 6; objects left: 4.)
Planning motion tasks in an environment modelled as a connection graph is quite straightforward: finding paths in the graph that minimise some criteria (time and energy, which are respectively related to the terrain classes and to the elevation variations) is easily solved by classical search techniques, using cost functions that express these criteria. Planning perception tasks is a much more difficult issue: one must be able to predict the results of such tasks (which requires a model of the perception processes) and the utility of these results for the mission to achieve:
• Localization processes can be modelled by a simple function that expresses the gain in the precision of the robot position as a function of the number and distance of perceivable landmarks (Fig. 8);
• With the confidence model of the 3D data classification process, one can predict the amount of information a classification task can bring. But it is much more difficult to express the utility of a classification task for reaching the goal: the model of the classification task cannot predict what will actually be perceived. It is therefore difficult to estimate the interest of such tasks.

Fig. 8. A simple localization model: the gain in the precision of the robot position increases as the distance of the perceived landmarks decreases, and is better as the number of perceivable landmarks increases. This figure presents this gain in an environment with three landmarks and no occlusions.

Fig. 9. Surface to minimise to control localization tasks.
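A minimal version of the localization model of Fig. 8 can be written as follows. The particular 1/(1+d) weighting and the sensing range are assumptions made for illustration, not the function actually used; the model only needs to grow with the number of perceivable landmarks and to decrease with their distance.

```python
# Minimal sketch of the localization gain model of Fig. 8: the gain increases
# with the number of perceivable landmarks and decreases with their distance.
# The weighting and the range threshold are illustrative assumptions.

import math

def localization_gain(robot_xy, landmarks, max_range=30.0):
    gain = 0.0
    for lx, ly in landmarks:
        d = math.hypot(lx - robot_xy[0], ly - robot_xy[1])
        if d <= max_range:              # landmark perceivable (no occlusions assumed)
            gain += 1.0 / (1.0 + d)     # closer landmarks contribute more
    return gain

# Example with three landmarks, as in Fig. 8
print(localization_gain((0.0, 0.0), [(5.0, 0.0), (10.0, 10.0), (40.0, 0.0)]))
```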
2.4.2. Approach

A direct, brute-force approach to answering the preceding questions would be to perform a search in the connection graph, in which all the possible perception tasks would be predicted and evaluated at each node encountered during the search. Besides its drastic algorithmic complexity, this approach appeared unrealistic because of the difficulty of expressing the utility of a predicted classification task for reaching the goal. We therefore chose a different approach to tackle the problem: the perception task selection is subordinated to the motion task. A search algorithm provides an optimal path, which is analysed afterwards to deduce the perception tasks to perform. The "optimality" criterion takes on a crucial importance here: it is a linear combination of the time and energy consumed, weighted by the terrain class to cross and the confidence of the terrain labelling. Introducing the labelling confidence in the crossing cost of an arc amounts to considering implicitly the modelling capabilities of the robot: tolerating the crossing of obstacle areas labelled with a low confidence means that the robot is able to easily acquire information on such areas. Of course, the returned path is not executed directly; it is analysed according to the following procedure:
(1) The sub-goal to reach is the last node of the path that lies in a crossable area;
(2) The labels of the regions crossed to reach this sub-goal determine the motion mode to apply;
(3) Finally, the rest of the path, which reaches the global goal, determines the aiming angle of the sensor.
Controlling localization: the introduction of the robot position uncertainty in the cost function makes it possible to plan localization tasks along the path. The cost to minimise is the integral of the robot position precision as a function of the cost expressed in terms of time and energy (Fig. 9).
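The whole step, a search over the connection graph with confidence-weighted arc costs followed by the analysis of the returned path, can be sketched as below. The weight values, the particular form of the cost and the helper mappings are assumptions introduced for illustration; only the overall structure (weighted search, then sub-goal, motion mode and aiming extraction) follows the procedure described above.

```python
# Sketch of the navigation planning step: Dijkstra search with arc costs
# weighted by terrain class and labelling confidence, then analysis of the
# optimal path. Weights and helper mappings are illustrative assumptions.

import heapq

CLASS_WEIGHT = {"flat": 1.0, "uneven": 3.0, "obstacle": 10.0}

def arc_cost(length, terrain_class, confidence):
    # Obstacle regions labelled with low confidence remain crossable at a
    # reduced penalty: the robot can easily acquire information about them.
    return length * CLASS_WEIGHT[terrain_class] * (0.5 + 0.5 * confidence)

def shortest_path(graph, start, goal):
    """graph: {node: [(neighbour, cost), ...]}. Returns the node list of an optimal path."""
    queue, visited = [(0.0, start, [start])], set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for nxt, c in graph.get(node, []):
            if nxt not in visited:
                heapq.heappush(queue, (cost + c, nxt, path + [nxt]))
    return None

def analyse_path(path, region_of, label_of):
    """(1) sub-goal = last node in a crossable area; (2) motion mode from the labels of
    the regions crossed to reach it; (3) the remainder of the path gives the sensor aiming."""
    crossable = [n for n in path if label_of[region_of[n]] != "obstacle"]
    subgoal = crossable[-1]
    crossed = {label_of[region_of[n]] for n in path[:path.index(subgoal) + 1]}
    mode = "3D planned" if "uneven" in crossed else "2D planned"
    aim_towards = path[-1]
    return subgoal, mode, aim_towards
```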
2.5. Trajectory planning

According to the label of the region to be crossed (flat or uneven terrain), different trajectory planners are used to adapt to the terrain.
Flat terrain. The trajectory is computed simply by first building a binary bitmap (free/obstacle) and applying a wavefront propagation to produce a distance map from the obstacles. This determines a skeleton of the free space. The trajectory is obtained by propagating a potential along the skeleton.

Uneven terrain. In this case a more sophisticated planner is used, which computes a trajectory taking into account stability and kinematic constraints [22]. The planner incrementally builds a graph of discrete configurations that can be reached from the initial position by applying sequences of discrete controls over a short time interval. Each arc of the graph corresponds to a trajectory portion computed for a given control. Only the arcs satisfying these placement constraints are considered during the search.
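The first step of the flat-terrain planner, the wavefront propagation that produces a distance map from the obstacle cells, can be sketched as a breadth-first expansion on the bitmap. The grid conventions below are assumptions; the skeleton extraction and potential propagation steps are not shown.

```python
# Sketch of the wavefront propagation producing a distance-to-obstacle map on
# the binary bitmap (1 = obstacle, 0 = free); its ridges form the free-space
# skeleton along which a potential is then propagated. Conventions are assumed.

from collections import deque

def distance_map(grid):
    h, w = len(grid), len(grid[0])
    dist = [[None] * w for _ in range(h)]
    queue = deque()
    for i in range(h):
        for j in range(w):
            if grid[i][j] == 1:            # obstacle cells seed the wavefront
                dist[i][j] = 0
                queue.append((i, j))
    while queue:
        i, j = queue.popleft()
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < h and 0 <= nj < w and dist[ni][nj] is None:
                dist[ni][nj] = dist[i][j] + 1
                queue.append((ni, nj))
    return dist
```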
Fig. 10. ADAM in the Geroms test site.

Fig. 11. The navigation planner explores a dead end: it first tries to go through the bottom of the dead end, which is modelled as an obstacle region, but with a low confidence level (left); after having perceived this region and confirmed that it must be labelled as an obstacle, the planner decides to go back (right).

Fig. 12. A trajectory that avoids a dead end (80 meters, 10 perceptions).

2.6. Results and discussion

The terrain modelling procedures and the navigation planning algorithm have been intensively tested with the mobile robot Adam 2. We performed experiments on the Geroms test site of the French Space Agency CNES 3, where Adam achieved several "Go To [goal]" missions, travelling over 80 meters, avoiding obstacles and getting out of dead ends. Fig. 11 presents two typical behaviours of the navigation algorithm in a dead end, and Fig. 12 shows the trajectory followed by the robot to avoid this dead end, on the elevation model built after 10 data acquisitions. If the robot did not use representations, its behaviour would clearly have been far less efficient.

2 Adam (Advanced Demonstrator for Autonomy and Mobility) is the property of Framatome and Matra Marconi Space and is currently lent to LAAS.
3 Centre National d'Études Spatiales.

3. Architecture

The organization of the robot's capacities is clearly a central subject. We will not review here all possible architectures, but rather focus, given the discussion above, on the objective of designing an autonomous robot endowed with the capacities of planning its own actions in order to accomplish specified tasks, while having a reactive behavior with respect to its environment. Such a robot would be both goal- and data-driven. The Subsumption architecture [4-6], for example, clearly achieves the data-driven part (although not necessarily in an efficient manner), but overlooks the capacity of action planning.

3.1. Integration of deliberation and action
The design of the robot system must make possible the implementation of task-oriented servo loops while providing a suitable framework for the interaction between deliberation and action. Deliberation here is both a goal-oriented planning process wherein the robot anticipates its actions and the evolution of the world, and a time-bounded, context-dependent decision for a timely response to events. While there are high-emergency situations where a first and immediate (reflex) reaction should be performed, such situations often require last-resort safety actions. They can often be avoided if the agent is able to detect events that allow it to predict such situations in advance. Note that this is first a requirement on sensors and sensor data interpretation. Being informed in advance, and consequently having more time to deliberate, the robot should be able to produce a better decision. In order to achieve both anticipation capacities and real-time behavior, we decompose the system into two levels:
• a functional level: it includes processing functions and control loops (image processing, obstacle avoidance, motion control, etc.);
• a decisional level, which could be organized in layers and includes the capacities of planning and action control.
Acting is permanent, and planning should be done in parallel with it. Since planning requires an amount of time usually longer than the dynamics imposed by the occurrence of an event, the rationale of this architecture consists in controlling the functional level by a deliberative level that has a bounded reaction time for a first response [1]. This level (Fig. 13) is composed of a planner, which produces the sequence of actions necessary to achieve a given task or to reach a given goal, and a supervisor, which actually interacts with the functional level, controls the execution of the plan and reacts to incoming events.
Fig. 13. Integration of planning and execution.
The planner is given a description of the state of the world and a goal; it produces a plan. The "quality" of the produced plan is related to the cost of achieving a given task or objective (time, energy, ...), and to its robustness, i.e. its ability to cope with non-nominal situations. This last aspect is one of the motivations of our approach: besides providing a plan, the planner also provides a set of execution "modalities" expressed in terms of:
• constraints or directions to be used by a lower planning level, if any;
• descriptions of situations to monitor and the appropriate reactions to their occurrence; such reactions are immediate reflexes, "local" correcting actions (without questioning the plan), or requests for re-planning.
These "modalities" provide a convenient (and compact) representation for a class of conditional plans. The generation of modalities, however, still remains to be investigated: we have no generic method yet for the automatic generation of modalities within a planning algorithm. Nevertheless, it is possible to produce useful modalities in a second step, by analysing the plan produced by a first "classical" planning step. Such an analysis can be based on knowledge of the limitations of the planner itself and of the world description it uses, as well as on domain- or application-specific knowledge. The supervisor interacts with the other layers and with the planner. The other layers are viewed as a set of processes which exchange signals with the supervisor. These processes correspond to the actions of the robot as well as to events associated with environment changes independent of the robot's actions. The processes are under the control of the supervisor, which has to comply with their specific features. For example, a process
representing robot motion cannot be cancelled instantaneously by the supervisor: indeed, such a process has an "inertia". The supervisor may request a stop at any moment during execution; however, the process will go through a series of steps before actually finishing. The simplest way to represent such processes is as finite state automata (FSA). The activity of the supervisor consists in monitoring the execution of the plan, by performing situation detection and assessment and by taking appropriate decisions in real time, i.e. within time bounds compatible with the rate of the signals produced by the processes and with their possible consequences. The responsibility of "closing the loop" at the level of plan execution control rests entirely with the supervisor. In order to achieve this, the supervisor only uses deliberation algorithms which are guaranteed to be time-bounded and compatible with the dynamics of the controlled system. Indeed, all deliberation algorithms which do not satisfy this property are performed by the planner, upon request of the supervisor. Besides the plan and its execution modalities, execution control makes use of a set of situation-driven procedures embedded in the supervisor and independent of the plan. These procedures are predefined (and can be reprogrammed), but can take into account the current goal and plan when they are executed, by recognizing specific goal or plan patterns. The decisional level may have more than one layer, according to the detail of the representations handled. Decomposition into layers may be heuristic, in order to improve performance and to better take into account the interactions with the environment. Furthermore, there may be several ways to execute a given action: a navigation task produced by the planner can sometimes (or at some stage) be executed by planning a trajectory, by tracking a feature, by following a wall, etc., depending on the actual execution situation. Hence, there may be several decisional layers, the lower ones manipulating representations of the environment and actions which are more procedural and closer to the execution conditions. The global system architecture we propose is organized into three levels, representing two decisional layers and a functional level (Fig. 14). The two upper levels are built with the supervisor-planner paradigm. The higher level uses a temporal planner.
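Returning to the representation of functional processes mentioned above, the fragment below models a motion process with "inertia" as a small finite state automaton under supervisor control; the states, signals and transitions are invented for the example and do not correspond to an actual module of the system.

```python
# Illustrative finite state automaton for a motion process with "inertia":
# a stop request from the supervisor does not end the process instantaneously,
# it drives it through a braking state first. States and signals are invented.

MOTION_FSA = {
    ("idle",    "start"):        "moving",
    ("moving",  "stop"):         "braking",   # the supervisor requests a stop...
    ("braking", "stopped"):      "idle",      # ...the process finishes through its own steps
    ("moving",  "goal_reached"): "idle",
}

def step(state, signal):
    """Advance the FSA; unknown (state, signal) pairs leave the state unchanged."""
    return MOTION_FSA.get((state, signal), state)

# Example supervisor interaction
state = "idle"
for signal in ("start", "stop", "stopped"):
    state = step(state, signal)
    print(signal, "->", state)
```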
Fig. 14. Global architecture (mission planning and supervision, task planning and supervision, executive, and a functional level of processing modules; indicative average and guaranteed response times range from about 10 s down to 0.01 s).
The second level receives tasks that it transforms, or refines, into scripts (procedures) composed of elementary robot actions, and supervises script execution while remaining reactive to asynchronous events and environment conditions. Planning at this level is a "refinement" using domain- or task-specific knowledge: it does not use a general planner. The functional "execution" level embeds a set of elementary robot actions implementing task-oriented servo loops and processing functions (motion planner, perception, etc.), and programmable monitoring functions (condition/reaction pairs), embedded into modules. A robot module is a software entity capable of performing a number of specific functions by processing inputs from, or acting on, physical robot devices and/or other modules. A module may read data exported by other modules, and output its own processing results in exported data structures [7]. The organization of the modules is not fixed. Their interactions depend on the task being executed and on the perceived environment. This is an important property that enables a flexible rather than a systematic robot behavior. The functional level can be seen as a library of "intelligent" functions with real-time abilities. During the execution of a task, these functions are dynamically combined according to the goal and to the outputs of other functions, under the control of a central "Executive" responsible for sending the execution requests to the modules and for maintaining a status of the execution.
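A hedged sketch of this organization follows: modules export functions and data, and an Executive dispatches requests and maintains an execution status. The interfaces shown are assumptions made for illustration; they are not the actual module system of Ref. [7].

```python
# Illustrative sketch of functional-level modules under a central Executive.
# Interfaces are assumptions, not the actual module system of Ref. [7].

class Module:
    def __init__(self, name, functions):
        self.name = name
        self.functions = functions        # {request_name: callable}
        self.exported = {}                # data structures readable by other modules

    def handle(self, request, **args):
        result = self.functions[request](**args)
        self.exported[request] = result
        return result

class Executive:
    def __init__(self, modules):
        self.modules = {m.name: m for m in modules}
        self.status = {}                  # per-request execution status

    def request(self, module, req, **args):
        self.status[(module, req)] = "running"
        try:
            result = self.modules[module].handle(req, **args)
            self.status[(module, req)] = "done"
            return result
        except Exception:
            self.status[(module, req)] = "failed"
            raise
```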
Fig. 15. Global architecture for autonomous navigation.
3.2. Example

The generic architecture discussed above can be instantiated in the case of navigation planning (Fig. 15). At the task refinement level, a procedural reasoning system is used (C-PRS [13]), which provides a suitable framework for goal-driven as well as situation-driven deliberation processes. Indeed, PRS implements script selection according to a situation represented in a database updated by sensory inputs and by the achievement of robot actions. Planning can be performed through context-dependent goal decomposition; situation-driven reaction can be performed by triggering procedures according to the content of the world model. The executive level is purely reactive, with no planning capacities. It controls the execution of robot functions using predefined context-dependent rules. It is currently embedded in PRS, but will be implemented using a separate automata-based system. At the functional level we find the various processes for perception, modelling, trajectory planning, etc., as well as servo loops, monitoring and reaction functions.
4. Robot intelligence

In the light of the developments discussed in the previous sections, we discuss here some ideas on representations, reasoning, reactivity and architectures of autonomous robots defended by an important stream of research in robotics and AI: what has come to be called "behavior-based" systems [4-6]. Pioneered by Rodney Brooks at MIT, this approach claims that representations are not necessary and that central control is not needed. It proposes to build intelligent agents as a hierarchy of behavior-producing layers, each directly linked to sensors and effectors. In the subsumption architecture [4] more precisely, the layers are composed of a fixed network of modules that are finite state machines. Interaction between layers is always top down: a higher level module may suppress the input or inhibit the output of a lower level one. Several mobile robots were built with this approach and demonstrated some capacities. Reported examples are heading to distant places and avoiding obstacles (Allen), obstacle avoidance, wall following, soda-can-like object recognition and grasping (Herbert), environment mapping (Toto), walking (Genghis), and visual tracking (Seymour). On the other hand, several mobile robots were built with other rationales. Without citing them all, let us point out for example that road following was achieved by the ALV project and the Navlab [23], walking by the Ambler [15], and obstacle avoidance, wall following, trajectory planning and indoor navigation by many others, such as Hilare [11] (see also for example Refs. [9,17,21]). These projects used representations adapted to the achieved task, and central control to some extent. Among the main ideas upon which the "behaviorist" approach is based, and which were formulated over the past years, are the following [5]:
• situatedness: the robot is in an environment with which it interacts;
• embodiment: the robot is not a software program, but a physical machine (with sensors and effectors);
• "Intelligence is determined by the dynamics of the interaction with the world";
• "Intelligence is in the eye of the observer".
Although these assertions were criticizing mainstream Artificial Intelligence and the development
of AI programs as having no connection with the real world (and hence sometimes addressing false problems), they also had a direct impact on robotics research. Every roboticist certainly agrees completely with the first two points (robotics is necessarily studied with real robots). The latter assertions are more debatable: it is a specific kind of intelligence that is determined by the dynamics of the interaction with the world, for example the kind necessary for survival. The capacity to solve a math problem, which is considered to be an intelligent activity, is certainly not in this category. The difficulty of this interaction is not only related to its dynamics but also to other aspects, such as its variability, which challenges the adaptation capacities of the system. These arguments actually gave birth to some dogmas on autonomous robot design, such as "the world is its own best model", and hence representations are not necessary (first meaning symbolic AI-like representations, but then generalizing to any kind), or "there is no central locus of control". We will now discuss these two issues.

4.1. Representations

The claim is that representations are not necessary, or that central representations and abstractions are not necessary, only local and partial data (although this distinction between central and not central is not always clearly stated, see Ref. [6], p. 148). This argument does not give a correct account of the existing work on mobile robots. Actually, most mobile robots (including the first one, Shakey [20]) make use of several different representations of the environment. Almost no one claims that a central and unique symbolic representation is used for all the functions. But we have shown that several representations are indeed necessary, and that they have to be adequate, i.e. that they must enable the robot to act efficiently in its environment. The robot has sensors whose raw data do not necessarily provide directly the representation most suited to the task. Some processing and abstraction are often necessary. Abstract symbolic representations can be extracted from sensory data, as we have shown (e.g. Section 2.3.2), and are therefore physically grounded.
The issue of representations is therefore not centralization vs. distribution. Behavior-based systems, like "traditional" systems, use representations that are not unique. There is another point put forward in Ref. [5]: it is said that some representations are indeed used, such as a map of landmarks in the robot "Toto" [18], but that these are "non-manipulable"; they are rather computational elements themselves. However, the adequacy and efficiency of such representations should be compared to those of graphs and graph search algorithms.
4.2. Interaction with the world

Another argument deals with the quality of the interaction with the environment: "Intelligence is determined by the dynamics of the interaction with the world" [6]. There is a confusion here between "intelligence" and the capacity to react quickly. What is needed is to (re)act adequately and efficiently. Reaction to environment changes should be context- and goal-dependent, and the measure of performance in adaptation is the efficiency in accomplishing the task. On-line decisional capacities are necessary for analyzing the context, anticipating situations, deciding on the relevant events to be expected (and focusing some attention on them), and preparing the adequate reactions to them. Defining and "wiring" event-reaction pairs once and for all at robot design stage is not sufficient and leads to systematic behavior. We want the robot to accomplish a variety of tasks without the need for changing its hardware or software. The tasks could be specified as objectives to reach (all details need not be specified) or as specific actions to execute, having well defined semantics. Communicating a task to the robot at an abstract level clearly implies that it possesses the reasoning abilities to rationally select the actions to be executed that would eventually lead to accomplishing the objectives. To do this, the robot must anticipate the evolution of the environment. It must predict the outcome of its own actions and be able to compare it to the desired state, over a more or less long term, not just on the basis of immediate stimuli. Therefore, there is a need in general to implement a planning capacity. In our opinion, "purely" reactive approaches and behavior-based approaches actually rely on the fact that all the needed planning is performed at the design
phase by the designer him/herself. In the demonstrated tasks, the robot never encounters situations where it is necessary for it to anticipate its actions.
5. Conclusion

We have discussed autonomous navigation in this paper, and shown which representations and decisional processes are useful for accomplishing an efficient behavior. The robot is composed of several modules that interact in a coherent manner because they are integrated in an architecture that enables both event-driven and goal-driven behavior. In the case against so-called "traditional" AI and robotics, Nature and Evolution are sometimes cited [6]. In trying to design a robot control system, we have to act both as scientists and as engineers. We have to try to study and understand the concepts and principles that underlie autonomy and natural intelligence. However, since we are designing a machine, we have to carefully define its requirements: what are its functions? What is its expected performance? We also have to take into account the features and constraints of the material we use (sensors, computers, etc.). Nature did not have to face the same constraints nor use the same material: solutions may be different. In any case, the machine has to exhibit an efficient behavior, adapted to its environment and to its tasks. This is the key to its autonomy, and this is what we have tried to convey in this paper.
Acknowledgements

Many persons participated in the development of the concepts, algorithms, systems, robots, and experiments presented in this paper: R. Alami, G. Bauzil, S. Betge-Brezetz, B. Dacre-Wright, B. Degaliaix, M. Devy, P. Fillatreau, S. Fleury, G. Giralt, M. Herrb, F. Ingrand, M. Khatib, S. Lacroix, C. Lemaire, P. Moutarlier, F. Nashashibi, C. Proust, N. Siméon, G. Vialaret.
References

[1] R. Alami, R. Chatila and B. Espiau, Designing an intelligent control architecture for autonomous robots, in: Proceedings of ICAR'93, Tokyo, Japan, Nov. 1993.
[2] S. Betge-Brezetz, R. Chatila and M. Devy, Natural scene understanding for mobile robot navigation, in: IEEE Conference on Robotics and Automation, pp. 730-737, May 1994.
[3] S. Betge-Brezetz, R. Chatila and M. Devy, Natural scene understanding for mobile robot navigation, in: IEEE International Conference on Robotics and Automation, San Diego, California, 1994.
[4] Rodney A. Brooks, A robust layered control system for a mobile robot, IEEE Journal of Robotics and Automation RA-2 (1) (1986) 14-23.
[5] Rodney A. Brooks, Intelligence without reason, in: Proceedings of IJCAI-91, pp. 569-595, August 1991.
[6] Rodney A. Brooks, Intelligence without representation, Artificial Intelligence 47 (1-3) (1991) 139-160.
[7] R. Chatila and R. Ferraz De Camargo, Open architecture design and inter-task/inter-module communication for an autonomous mobile robot, in: IEEE International Workshop on Intelligent Robots and Systems, Tsuchiura, Japan, July 1990.
[8] R. Chatila, S. Fleury, M. Herrb, S. Lacroix and C. Proust, Autonomous navigation in natural environment, in: 3rd International Symposium on Experimental Robotics (ISER), Kyoto, Japan, October 1993.
[9] J.L. Crowley, Navigation for an intelligent mobile robot, IEEE Journal of Robotics and Automation 1 (1985).
[10] P. Fillatreau and M. Devy, Localization of an autonomous mobile robot from 3D depth images using heterogeneous features, in: IEEE International Workshop on Intelligent Robots and Systems (IROS '93), Yokohama, Japan, July 1993.
[11] G. Giralt, R. Chatila and M. Vaisset, An integrated navigation and motion control system for autonomous multisensory mobile robots, in: 1st International Symposium on Robotics Research, MIT Press, Oct. 1983.
[12] M. Hebert, Pixel-based range processing for autonomous driving, in: IEEE International Conference on Robotics and Automation, San Diego, California, 1994.
[13] F. Ingrand, M.P. Georgeff and A.S. Rao, An architecture for real-time reasoning and system control, IEEE Expert: Intelligent Systems and Their Applications 7 (1992) 34-44.
[14] E. Krotkov, M. Hebert, M. Buffa, F. Cozman and L. Robert, Stereo driving and position estimation for autonomous planetary rovers, in: IARP 2nd Workshop on Robotics in Space, Montreal, Canada, 1994.
[15] E. Krotkov and R. Hoffman, Results in terrain mapping for a walking planetary rover, in: Proceedings of ICAR'93, pp. 103-108, Tokyo, Japan, Nov. 1993.
[16] S. Lacroix, R. Chatila, S. Fleury, M. Herrb and T. Simeon, Autonomous navigation in outdoor environment: adaptive approach and experiment, in: IEEE International Conference on Robotics and Automation, San Diego, California, 1994.
[17] L.J. Lin, R. Simmons and C. Fedor, Experience with a Task Control Architecture for Mobile Robots, Technical Report CMU-RI-TR-89-29, CMU, 1989.
[18] M.J. Mataric, A distributed model for mobile robot environment learning and navigation, Technical Report AI-TR-1228, MIT, May 1990.
[19] F. Nashashibi, P. Fillatreau, B. Dacre-Wright and T. Simeon, 3D autonomous navigation in a natural environment, in: IEEE International Conference on Robotics and Automation, San Diego, California, 1994.
[20] N.J. Nilsson, A mobile automaton: an application of artificial intelligence techniques, in: Proc. of the 1st IJCAI, pp. 509-520, 1969.
[21] F. Noreils and R. Chatila, Control of mobile robot actions, in: IEEE Conference on Robotics and Automation, pp. 701-712, May 1989.
[22] T. Simeon and B. Dacre Wright, A practical motion planner for all-terrain mobile robots, in: IEEE International Workshop on Intelligent Robots and Systems (IROS '93), Japan, July 1993.
[23] Charles Thorpe, Martial H. Hebert, Takeo Kanade and Steven A. Shafer, Vision and navigation for the Carnegie-Mellon Navlab, IEEE Transactions on Pattern Analysis and Machine Intelligence 10 (3) (1988).
[24] C.R. Weisbin, M. Montemerlo and W. Whittaker, Evolving directions in NASA's planetary rover requirements and technology, in: Missions, Technologies and Design of Planetary Mobile Vehicles, Centre National d'Etudes Spatiales, France, Sept. 1992.
Raja Chatila, Directeur de Recherche at CNRS, received his Ph.D. in control science from the University of Toulouse in 1981. He spent one year at the Stanford Artificial Intelligence Laboratory in 1982 as a post-doctoral scholar. Since 1983, he has been a research scientist at the Robotics and Artificial Intelligence Group of LAAS-CNRS (Laboratoire d'Analyse et d'Architecture des Systemes du Centre National de la Recherche Scientifique), Toulouse, France, where he performs research on mobile robotics, architectures for planning and control, and perception and environment representation. He leads projects on intervention robots and planetary rovers.