Stigmergic cooperation of autonomous robots

Mechanism and Machine Theory 44 (2009) 656–670


Cezary Zieliński *, Piotr Trojanek
Institute of Control and Computation Engineering, Warsaw University of Technology, ul. Nowowiejska 15/19, 00-665 Warsaw, Poland

Article info

Article history: Received 28 September 2007 Received in revised form 28 July 2008 Accepted 29 August 2008 Available online 16 October 2008

Keywords: Robot programming Formal specification Behavioural control

Abstract

This paper presents a general and formal approach to designing robot controllers. It is based on structuring the controller into subsystems dealing with effectors, receptors, inter-agent transmission and internal memory. The behaviour of the controller is governed by transition functions, which are the basis for the definitions of robot behaviours. A systematic method of partitioning the task of a robot into separate bio-inspired elementary behaviours, and a method of composing those behaviours into actions performed by the robot in the real environment, are presented. Those general considerations are illustrated by an example of the design of behavioural controllers for robots executing a joint transportation task. The robots only observe the effects of the other robots' activities in the environment (stigmergy). The presented solution to the joint box-pushing example has its own merits; however, the paper focuses on the general aspects of a formal approach to controller design of multi-robot systems.

© 2008 Elsevier Ltd. All rights reserved.

1. Introduction

Taking into account the utilised principle of communication, two extreme classes of multi-robot (multi-agent) systems exist. The first category uses explicit communication, where the robots communicate directly between themselves (e.g., using wireless networks). The second category uses implicit communication, where each robot observes the actions of the others or the results of their activity in the environment (i.e., stigmergy [1]). In the former case the problem at hand is subdivided into tasks. In the latter case, rather than decomposing the problem into tasks, investigations focus on the definition of elementary behaviours of individuals [2–4].

The approach to the design of those two categories of systems differs considerably. The first relies on task decomposition (a top-down approach), while the second relies on the definition of elementary behaviours and lets them interact freely (a bottom-up approach) – in this case the resultant behaviour of the system emerges as a consequence of those interactions. Between those extremes spreads the realm of hybrid systems utilising both implicit and explicit communication mechanisms.

Explicit communication usually requires problem decomposition into tasks and subsequently allocation of those tasks to robots. The task allocation problem depends on the requirements of the tasks and the capabilities of the robots, hence both tasks and robots are subdivided into categories [5]. Tasks can either be executed by single robots (SR tasks) or need multiple robots (MR tasks). The robots can either execute a single task at a time (ST robots) or several tasks simultaneously (MT – multiple-task robots). Moreover, all tasks to be executed can be known in advance – this produces an instantaneous assignment (IA) problem – or the tasks can appear randomly as time passes – this results in time-extended assignment (TA).
In the former case no planning for the future is necessary, while in the latter case future requirements have to be taken into account. If there are more tasks than robots, or vice versa, optimal allocation of tasks requires

* Corresponding author. E-mail address: [email protected] (C. Zieliński). URL: http://www.ia.pw.edu.pl/~zielinsk (C. Zieliński).
0094-114X/$ - see front matter © 2008 Elsevier Ltd. All rights reserved. doi:10.1016/j.mechmachtheory.2008.08.012


some criterion of judging the result. Usually utility is defined as the difference between the quality of the obtained result and the cost of producing it. In the simplest case utility depends only on the robot and the task it has to execute. However, utility might also depend on the sequence of task execution, the number of robots executing the task, the other tasks that the robot has already executed, the tasks that the other robots have executed, etc.

Implicit communication is used to solve a wide variety of problems, e.g., path planning [6], object sorting [7], building structures, self-assembly [8], cooperative transport, surveillance, and prey hunting [1]. Many such systems are biologically inspired – frequently they rely on mimicking insect social behaviour. Social insects (i.e., insects living in colonies, e.g., ants, bees, wasps, termites) draw our attention because each insect exhibits simple behaviour, yet the colony as a whole produces useful and complex output [1]. Thus limited perceptual and cognitive capabilities, through mutual interactions, both between the individuals and between an individual and the environment, result in attaining a complex goal. This is termed emergent behaviour.

The attractiveness of emergent behaviour, where individual elementary behaviours are simple and communication is limited to an individual's perception, is further underscored by the fact that explicit communication faces a scaling problem which is absent in the implicit case: as the number of robots increases, the organisation of communication between them becomes ever more difficult. On the other hand, although implicit communication systems tend to be robust (immune to the failure of some individuals or to changes in the environment), the emergent result is difficult to predict, and hence to program.
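As an aside, the utility-maximising instantaneous assignment (IA) of single-robot tasks to single-task robots discussed above can be illustrated in code. The following is a minimal, hypothetical sketch – the robot and task names, utility values and the greedy strategy are illustrative assumptions, not taken from this paper:

```python
# Greedy instantaneous assignment (IA) of SR tasks to ST robots,
# maximising utility = quality - cost. Purely illustrative.

def greedy_assignment(utility):
    """utility: {(robot, task): float}; returns {robot: task}."""
    pairs = sorted(utility, key=utility.get, reverse=True)
    assigned_robots, used_tasks, result = set(), set(), {}
    for robot, task in pairs:
        # each robot gets one task, each task one robot
        if robot not in assigned_robots and task not in used_tasks:
            result[robot] = task
            assigned_robots.add(robot)
            used_tasks.add(task)
    return result

utility = {
    ("r1", "push_box"): 0.9, ("r1", "scout"): 0.4,
    ("r2", "push_box"): 0.7, ("r2", "scout"): 0.8,
}
print(greedy_assignment(utility))  # r1 -> push_box, r2 -> scout
```

A greedy pass is of course only a heuristic; optimal allocation in general requires solving an assignment problem against the chosen criterion.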
The difficulty of predicting the emergent result arises from the fact that the result is based not only on each individual's behaviour, but also on the interactions between the individuals and the environment, which are hard to foresee. In some cases, as the individuals do not have knowledge of the global goal, this can lead to stagnation (deadlock). However, the advantages of those systems prevail over their disadvantages, provided the above drawbacks are dealt with appropriately.

Although in reality it will probably be necessary to design hybrid systems, it is important to find out to what extent they can rely on the autonomous behaviour of individual agents. The greater the extent of autonomy of an agent, the more robust the system as a whole will be; in particular, the immunity to failure of individuals will be greatly enhanced. Thus this paper mainly focuses on the method of defining and implementing behaviours of individual agents, neglecting to a large extent the benefits of explicit communication. However, the presented method of defining the autonomous activity of agents is elaborated in such a way that behaviours can also utilise information communicated to them directly by the other agents.

This paper proposes an approach to the problem of defining both a set of elementary behaviours of an individual robot and the method of their composition, so that a multi-robot system executes a general task. The example presenting the general method of specifying an agent's behaviour is biologically inspired: both the elementary behaviours governing an individual agent's resultant behaviour and the method of their composition into this resultant behaviour stem from the world of ants. As joint transport of objects by robots is one of the most demanding problems for systems executing tasks cooperatively (attaining global goals), this is the focus of this paper.
The paper formally defines a set of elementary behaviours that each robot has to exhibit and the method of composition of those behaviours so as to execute the task at hand. Here two robots that are unaware of each other's actions will push a box along a trajectory specified by its variable direction angle. During execution of this task the robots can be informed about changes of this angle, so diverse piecewise straight-line trajectories can result. Box-pushing, as a form of object transportation, is somewhat less demanding than manipulation of rigid objects [9,10], but it has still attracted the attention of many authors, e.g., [11–13], thus it will be used here for the purpose of presenting the general method of defining behaviours.

The control methods proposed in the literature rely either on centralised architectures [14] or on distributed architectures mimicking swarms [11]. Decentralised architectures have been dominant due to the attractiveness of immunity to failure of individual robots forming the team [11,12]. The majority of them are behaviour-based [11,12,15]. The work [15] proposed a method of action selection without explicit communication for multi-robot box-pushing in dynamic environments. In [12] two robots executed the task of pushing an object to a predefined location. It was shown that a distributed system using explicit communication for coordination of the two robots performed more effectively than a single robot or two implicitly communicating robots. This is not surprising; however, the question of to what extent the individuals should be autonomous, and to what extent they should rely on information obtained from the others, still remains open.

The postulated method of designing behaviour-based multi-robot systems produces a three-tiered architecture. The lowest tier is composed of elementary behaviours, which interact with each other through a composition function producing the resultant behaviour of the agent.
The next tier is composed of the resultant behaviours of agents interacting with each other to execute the task that the system is to accomplish. The task forms the uppermost tier. Such decomposition facilitates the design of the system, especially as behaviours within each of the two bottom tiers may be freely added or removed without disturbing the other elements of the tier; thus modification and fine-tuning of the performance of the system is possible.

The paper is organised as follows. Section 2 introduces the agent and discloses its internal structure. This is the basis for the classification of agents described in Section 3. Section 4 defines the structure of the control subsystem of an agent. The functioning of an agent is defined formally by a transition function, which has to be decomposed to simplify the formulation of elementary behaviours. The method of defining behaviours of an agent is disclosed in Section 5. Section 6 presents an example of the utilisation of the introduced method; it pertains to multi-robot box-pushing. Section 7 briefly describes the implementation of the controller and then presents both the simulation results and the implementation on real robots. The paper is concluded in Section 8.


2. An agent

A system consisting of many agents is considered. An agent j, i.e., a_j, in its most general form consists of four components, thus its state is described by

s_j = ⟨c_j, e_j, V_j, T_j⟩    (1)

where c_j is the state of the control subsystem, e_j the state of the effector, V_j the state of the virtual sensors, and T_j the information transmitted to or received from the other agents. Effectors are the devices influencing the environment, while sensors are devices gathering information about the environment. Transmitters are responsible for the direct exchange of information between the agents. For brevity, the paper does not distinguish between denotations describing an entity (a component or its name) and its state, because the meaning is obvious from the context.

Using the biological hint, two types of receptors (sensors) are distinguished in an agent: proprioceptors and exteroceptors. Biologists also distinguish interoceptors, but they will not be necessary for this presentation. Proprioceptors are receptors that detect stimuli from inside the limbs. In creatures those are the receptors located in muscles, tendons or joints; they enable perception of the position of the limbs and body. In the case of agents these are the devices measuring the internal state of the effectors (e.g., encoders, resolvers), and thus they are associated with effectors. Exteroceptors (encompassing also teleceptors) are the receptors that detect stimuli external to the body (e.g., vision, smell, touch). In technical systems they include the measuring devices gathering information from the environment (e.g., cameras, laser scanners, ultrasonic sensors); they are the source of perception of the environment.

The data obtained from the exteroceptors usually cannot be used directly in motion control, e.g., control of a mobile platform requires the goal location and not the bit-map delivered by a camera. In other cases a single sensor will not suffice to control the motion (e.g., a single proximity sensor), but several such sensors deliver meaningful data about the surrounding obstacles.
The process of extracting meaningful information for the purpose of motion control is named data aggregation and is performed by virtual sensors. Thus the kth virtual sensor reading obtained by the agent a_j is formed as:

v_jk = f_vjk(c_j, R_jk)    (2)

As the exteroceptors may have to be prompted or configured, c_j is one of the arguments of the aggregating function (2). A bundle of exteroceptors R_jk, used for the creation of the kth virtual sensor reading, consists of n_r individual exteroceptor readings:

R_jk = ⟨r_jk_1, …, r_jk_nr⟩    (3)

where r_jk_l, l = 1, …, n_r, are the individual exteroceptor readings taken into account in the process of forming the reading of the kth virtual sensor of the agent a_j. The virtual sensor bundle contains n_vj individual virtual sensor readings:

V_j = ⟨v_j1, …, v_jn_vj⟩    (4)

Each virtual sensor v_jk, k = 1, …, n_vj, produces an aggregate reading from one or more exteroceptors, as described by (2) and (3). Each agent a_j forms and uses its own bundle V_j of virtual sensors.

3. Types of agents

Agents can be classified on the basis of the components they possess, i.e., taking into account (1). The internal memory, associated with internal computational capability, is an obligatory component without which an agent cannot exist, because it would not be able to process any information. Thus only the three remaining components in (1) determine the eight possible types of agents (either a component is present in an agent or it is not). The state of each class of agents is defined by:

C    – zombie:              s_j = ⟨c_j, ∗, ∗, ∗⟩
CE   – blind agent:         s_j = ⟨c_j, e_j, ∗, ∗⟩
CV   – monitoring agent:    s_j = ⟨c_j, ∗, V_j, ∗⟩
CT   – computational agent: s_j = ⟨c_j, ∗, ∗, T_j⟩
CEV  – autonomous agent:    s_j = ⟨c_j, e_j, V_j, ∗⟩
CVT  – remote sensor:       s_j = ⟨c_j, ∗, V_j, T_j⟩
CET  – teleoperated agent:  s_j = ⟨c_j, e_j, ∗, T_j⟩
CEVT – embodied agent:      s_j = ⟨c_j, e_j, V_j, T_j⟩


where ∗ represents a missing component. An agent of class C is a purely computational unit without any access to the environment and, above all, without contact with other agents; it can neither influence its surroundings nor do any useful work for other agents. Hence it is present but useless, thus its name – zombie. Agents of class CE can influence the environment but do not get any feedback from it; e.g., a simple feeder mechanism works like that – it blindly repeats preprogrammed motions. An agent of class CV only records the changes in the state of the environment. However, it cannot deliver the measurement data to other agents – it behaves similarly to a flight recorder (black box) in an airplane: the collected data is extracted by extraordinary means and only under extraordinary circumstances. All of the above three classes are of minor importance here.

An agent of class CT is much more useful, as it can perform computations on behalf of other agents; e.g., it can coordinate other agents or formulate plans for them. The agents of class CEV are completely autonomous, as they do not contact other agents. However, they can use stigmergy for the purpose of cooperating with other agents. Those agents collect information about the state of the environment and influence their surroundings taking into account the collected data and their inner imperative – the goal-pursuing program. Agents of class CVT can behave in the same fashion as those of class CT, but in addition they can also be used as remote sensors by other agents. The CET class of agents can influence the environment, but they do not get any sensor feedback, thus they must be teleoperated through their communication channel. Finally, CEVT agents are fully capable embodied agents that can influence the surroundings, gather data about the environment and communicate directly with the other agents.
As all the other classes are subclasses of embodied agents, the focus of this paper is on the CEVT class. The above classification can be used to discuss the type of agent required for the execution of a certain type of task. For example, a semi-autonomous agent, whose operation consists in physical modification of the environment but which requires human assistance, must have at least the E and V components, thus it would be classified as the CEV or CEVT type. The E component is required for physical interaction with the environment, and the V component is necessary for the agent to be autonomous. Autonomy implies the ability to react to changes in the environment without human assistance, and that requires the ability to detect those changes, thus the necessity of having exteroceptors. Humans can instruct the agent either through a communication channel (explicit communication), in which case a CEVT-type agent results, or by demonstrating their wishes to the agent, which perceives them through its exteroceptors (implicit communication) – in this case a CEV-type agent suffices.
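The eight-class taxonomy above can be captured compactly in code. The sketch below is illustrative only – the field names and the use of `None` to mark a missing component are assumptions, not part of the paper's formalism:

```python
# Sketch of the C/CE/.../CEVT taxonomy: an agent state
# s_j = <c_j, e_j, V_j, T_j> with three optional components.

from dataclasses import dataclass
from typing import Optional

CLASS_NAMES = {
    "C": "zombie", "CE": "blind agent", "CV": "monitoring agent",
    "CT": "computational agent", "CEV": "autonomous agent",
    "CVT": "remote sensor", "CET": "teleoperated agent",
    "CEVT": "embodied agent",
}

@dataclass
class AgentState:
    c: object                   # control subsystem (obligatory)
    e: Optional[object] = None  # effector
    V: Optional[object] = None  # virtual sensors
    T: Optional[object] = None  # transmitters

    def agent_class(self) -> str:
        # build the class code from the components that are present
        code = "C" + "".join(
            tag for tag, comp in (("E", self.e), ("V", self.V), ("T", self.T))
            if comp is not None)
        return CLASS_NAMES[code]

print(AgentState(c={}, e="wheels", V="laser").agent_class())  # autonomous agent
```

The mapping makes the point of the classification explicit: the control subsystem is always present, and the remaining three binary choices yield exactly eight classes.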

4. Control subsystem of an embodied agent

An agent's control subsystem c_j:

- acquires information about the surroundings of the agent through the associated virtual sensors V_j,
- receives communication from the other agents a_j′ (j′ ≠ j),
- reads the state of its own effector e_j using the proprioceptors,
- takes into account the information currently stored in its internal memory,

and processes all of this data to:

- produce the next state of the effector e_j,
- influence its virtual sensors V_j (e.g., by configuring the exteroceptors),
- send data to the other agents a_j′, and
- change the information stored in its own memory.

Three types of entities are distinguished within the control subsystem: input (distinguished by a leading subscript x), output (distinguished by a leading subscript y) and internal (without a leading subscript). Hence the following elements of the control subsystem are distinguished (Fig. 1):

- x_c_ej – proprioceptive input from the effector (data conforming to the assumed input model of the effector in the control subsystem, e.g., wheel motor shaft positions, position of the mobile platform),
- x_c_Vj – exteroceptive input from the virtual sensors,
- x_c_Tj – information obtained from the other agents,
- y_c_ej – the desired position of the effector (data conforming to the assumed output model of the effector in the control subsystem, e.g., PWM ratios supplied to the motor drivers),
- y_c_Vj – the current configuration and commands controlling the virtual sensors,
- y_c_Tj – information transmitted to the other agents,
- c_cj – internal memory.

From the point of view of the designer of the agent, the state of the control subsystem changes at the effector servo sampling rate or a low multiple of it. If i denotes the current instant, the next considered instant is denoted by i + 1. This interval is called a motion step. Thus the control subsystem uses:


Fig. 1. General structure of an embodied agent.

x_c_j^i = ⟨c_cj^i, x_c_ej^i, x_c_Vj^i, x_c_Tj^i⟩    (5)

to produce:

y_c_j^{i+1} = ⟨c_cj^{i+1}, y_c_ej^{i+1}, y_c_Vj^{i+1}, y_c_Tj^{i+1}⟩    (6)

For that purpose it uses transition functions:

c_cj^{i+1}  = f_ccj(c_cj^i, x_c_ej^i, x_c_Vj^i, x_c_Tj^i)
y_c_ej^{i+1} = f_cej(c_cj^i, x_c_ej^i, x_c_Vj^i, x_c_Tj^i)
y_c_Vj^{i+1} = f_cVj(c_cj^i, x_c_ej^i, x_c_Vj^i, x_c_Tj^i)    (7)
y_c_Tj^{i+1} = f_cTj(c_cj^i, x_c_ej^i, x_c_Vj^i, x_c_Tj^i)

The same can be represented in compact form:

y_c_j^{i+1} = f_cj(x_c_j^i)    (8)

Formula (8) is a prescription for evolving the state of the system, thus it is treated as the program of the agent's behaviour. For any agent exhibiting useful behaviours this function is very complex, because it describes the actions of the system throughout its existence. The complexity of this function renders impractical the representation of the program of the agent's actions as a single function. The function (8) has to be decomposed to make the specification of the agent's program of actions comprehensible and uncomplicated. Thus instead of a single function f_cj, n_f partial functions are defined:

y_c_j^{i+1} = m_f_cj(x_c_j^i),   m = 1, …, n_f    (9)

The variability of descriptions of agent behaviour is due to the diversity of those functions.

5. Behaviour of agents

The behaviour of an agent is governed by the transition function (8). First an agent must read in the arguments x_c_j of this function, then compute its value, and finally dispatch the effects of those computations, i.e., y_c_j, to the subcomponents of the agent. The global transition function (8) is decomposed into many partial transition functions (9), each governing only some aspects of the general behaviour of the agent. At a single instant possibly one or more, but not all, of those functions need to be active; thus their selection is necessary. This selection is based on predicates, i.e., Boolean-valued functions: a test is performed, and if it is positive, the respective transition function is chosen for evaluation. The results of all evaluated functions must subsequently be composed together to produce the desired output y_c_j. Thus once function (8) is decomposed, the designer of the agent must have means, first, for selecting, and then, for composing the results of evaluation of the partial functions (9) into the required complex behaviour. Selection is based on testing predicates q_p_cj, q = 1, …, n_p, where n_p is the number of considered predicates. Those predicates take x_c_j^i as arguments and return a Boolean value stating whether the test is satisfied or not. If yes, an appropriate partial function (9) is chosen for


evaluation. If several predicates can be true simultaneously, then several functions will be evaluated, and their values must be composed to produce the resultant value. Both competitive and cooperative composition can be considered. The many possibilities of selection and composition were discussed in [16,4]; those papers also discuss possible ways of coding the behaviours of agents. For the purposes of this paper the following very simple form of coding the behaviour of an agent has been chosen:

loop
  // Determine the current state of the agent
  e_j → x_c_ej^i;  V_j → x_c_Vj^i;  T_j′ → x_c_Tj^i;
  for q = 1, …, n_p: clear(q_c_cj^i);
  // Compute the next control subsystem state
  if 1_p_cj(x_c_j^i) then 1_c_cj^{i+1} := 1_f_cj(x_c_j^i) endif
  if 2_p_cj(x_c_j^i) then 2_c_cj^{i+1} := 2_f_cj(x_c_j^i) endif
  ...
  if np_p_cj(x_c_j^i) then np_c_cj^{i+1} := np_f_cj(x_c_j^i) endif        (10)
  // Compute the aggregate control
  y_c_j^{i+1} := composition(q_c_cj^{i+1});  // for q = 1, …, n_p
  // Transmit the results of computations
  y_c_ej^{i+1} → e_j;  y_c_Vj^{i+1} → V_j;  y_c_Tj^{i+1} → T_j′;

  i := i + 1;
endloop

In the pseudo-code (10) a double slash precedes comments and the symbol '→' denotes transmission of data between the subcomponents of the agent; those transmissions realise data input and data output. As each predicate singles out a specific transition function, which in turn specifies an elementary behaviour, the number of constitutive elementary behaviours equals n_p. Seamless coordination of the different elementary behaviours is attained through the composition function; it is up to the system designer to provide an appropriate form of this function. The function named 'clear' resets all of the variables q_c_cj, q = 1, …, n_p, to their neutral values (e.g., 0), so that in each iteration the previously computed transition functions that have not been selected in the current iteration do not affect the composition function. It is assumed that all partial functions are defined in such a way that they produce results of the same type (from within the same mathematical domain). The function named 'composition' uses some form of competitive or cooperative assembly of the values of the partial results stored in q_c_cj. For the purpose of the example presented further on, a cooperative method of behaviour composition has been selected – superposition (e.g., a linear combination of the computed values):

y_c_cj^{i+1} = Σ_{q=1}^{n_p} w_q · q_c_cj^{i+1} = Σ_{q=1}^{n_p} w_q · q_f_cj(x_c_j^i)    (11)

where w_q are certain weights. It should be noted that the values q_c_cj^{i+1} thus computed influence, through the composition function, the whole of y_c_j^{i+1}, so the output of data produces the execution of motion by the effectors, the configuration of the virtual sensors and the transmission of messages to the other agents (each of those if necessary). The instruction i := i + 1 synchronises the loop iterations with the agent's heartbeat. Although this code is internally synchronous, due to the shortness of the period Δi = (i + 1) − i, the behaviour observed externally will be asynchronous. The transition from synchronous to asynchronous actions is discussed in detail in [4].

Pseudo-code (10) represents the template for the execution of an agent's resultant behaviour. The execution of a task by a multi-agent system requires the agents to interact through those behaviours. If the system, and thus each agent, is to execute more tasks, more instances of such code must be produced for each agent. Each of those instances differs by having its own set of transition functions q_f_cj and predicates q_p_cj; however, the structure of the code (10) is always retained. Moreover, for an agent to become capable of executing several tasks, each loop (10) must include an extra line of code enabling the detection of a termination condition, i.e.,

if f_sj(x_c_j^i) then break endif    (12)

must be inserted right under the instruction determining the current state of the agent. The fulfilment of the terminal condition f_sj causes the control to be transferred outside of the loop (10). An occurrence of an error produces the same effect (this can be implemented by, e.g., throwing an exception). Outside of the loop the decision must be made as to which task should be executed next, i.e., which loop (10) should be invoked now. Again a predicate would be used for this purpose, and again it would take x_c_j^i as an argument. It should be noted that this procedure only switches between the roles (resultant behaviours) of the agent a_j. The realisation of the global goal (task) is ascertained by the interactions of agents realising their roles


(resultant behaviours). By using x_c_j^i as the cause of a switch between roles (and thus tasks), we assume that either the exteroceptors or the transmitters will trigger the change. It is up to the designer of the agent to provide the specific:

- partial transition functions q_f_cj, q = 1, …, n_p, defining elementary behaviours,
- predicates q_p_cj,
- the composition function,
- the terminal condition f_sj,

defining the role of an agent in the execution of each task. By providing the above listed entities the designer of the agent encodes each task. Switches between tasks require extra predicates designating the next role of an agent. The control program for a multi-task, and thus multi-role, agent would have the structure of a finite state automaton.

6. An example

This example is partially based on the behaviour of ants cooperatively transporting objects. The full repertoire of elementary actions of an ant is the following [1]:

1. try to lift the object alone,
2. if that fails, try to drag it alone,
3. if that fails, reposition yourself and try 1 and 2 again,
4. if that fails, recruit nest mates – first try short-range recruitment (i.e., spread poison-gland secretion in the air) and if that fails try long-range recruitment (i.e., poison-gland secretion is deposited on the ground, creating a path from the object to the nest),
5. if stagnation results, realign yourself,
6. if stagnation persists, reposition yourself,
7. if the load is still too heavy, cut it into pieces (recruit workers capable of cutting).

To keep this example as simple as possible, so as to be able to illustrate the general methodology of designing behavioural robot controllers, the paper concentrates only on the pushing aspect of joint transport. The robots observe each other's actions only by detecting the changes brought about in the environment by the others – no explicit communication between the box-pushing robots is used (i.e., they rely only on stigmergy). The purpose of this example is to show how the above introduced formalism can be used to produce a behavioural multi-robot system in a systematic way.

The exemplary task consists in pushing a box by mobile robots along a partially specified trajectory. The partial specification defines only the required direction of translocation of the box (a heading represented by the angle φ measured around the z_g axis of the world coordinate frame) and the current velocity along that direction (Fig. 2). This specification can be stored in the memory of the agent's controller, generated based on the exteroceptor readings, or transmitted to the robot segment by segment during execution. The heading vector is aligned with the axis y′ of the coordinate frame rotated by φ around z_g. The change in the heading may be due to the detection of obstacles in the path of

Fig. 2. Box pushing executed by two mobile robots.


the box being pushed to the goal position. The pushing robots can be informed about an obstacle, and thus change the heading, by a remote sensing robot that senses the environment on their behalf. To keep the presentation simple, only two pushing robots are used; however, they know nothing of each other's activities. The robots have two independently powered sets of three wheels, one set on each side, so they can be crudely treated as two-wheeled vehicles [17]. The robot's control signals pertain to the rotational velocity of each of the wheel motors. They are calculated using the result of control law computations that represent the desired translational and rotational velocity of the vehicle (ν_j, ω_j), where j ∈ {L, R} for the left and the right robot, respectively.

In this type of task a robot is faced with a set of sometimes conflicting requirements resulting from the partial goals that need to be attained to satisfactorily execute the task. The control designer must identify those goals and produce a separate elementary behaviour for each goal. Once this is done, the designer must specify how the results of execution of the individual elementary behaviours will be composed, producing the resultant behaviour of the agent as perceived from the outside. Usually problems occur simultaneously, thus they must be dealt with in parallel. Moreover, the elementary behaviours have to be expressed formally, so that their implementation will be accurate and simple. Usually the task is decomposed by taking into account all of the possible problems that can occur while trying to attain the global goal. In the case of the presented example one can single out the following problems with the execution of the task (Fig. 2):

(1) The box can stray away from the prescribed heading, e.g., by an angle α.
(2) Each of the robots may be at an angle to the box (in an ideal situation the front of the robot should press against the front of the box) – e.g., β_L and β_R for the left and right robot respectively. (3) The velocity of the box may differ from the prescribed one. (4) The robots might be improperly located with respect to the front face of the box. Each of those can be dealt with by a separate elementary behaviour producing its own command driving the robot wheels. As the behaviours produce different commands they need to be composed into the resultant command that will be executed by the wheel motors. The behaviours corresponding to the listed problems are defined as follows (Fig. 2): (1) Maintaining the box pushing direction (α → 0). (2) Maintaining the pusher directions (β_L − α → 0, β_R − α → 0). (3) Maintaining the box velocity along the pushing direction y_0. (4) Locating the pushers symmetrically with respect to the box edges.

Each basic task can be seen as a behaviour unrelated to the others, but as a result of their contributions the final motion control is generated. Each task is realised by its own control law (expressed as a negative feedback reducing the cause of the problem), which outputs two terms of the pusher robot velocity – translational and rotational, [ν_j, ω_j], j ∈ {L, R}. These velocities are directly related to the left and right wheel velocities of each platform, [e_{jL}, e_{jR}]:

\[
\nu_j = \frac{\rho\,(e_{jL} + e_{jR})}{2}, \qquad
\omega_j = \frac{\rho\,(e_{jL} - e_{jR})}{d}
\tag{13}
\]

where d is the distance between the wheels and ρ is the radius of each wheel. The above mentioned elementary behaviours produce a resultant velocity [ν_j, ω_j]. The effector (i.e., the mobile platform) recomputes those to obtain [e_{jL}, e_{jR}]:

\[
e_{jL} = \frac{\nu_j}{\rho} + \frac{d}{2\rho}\,\omega_j, \qquad
e_{jR} = \frac{\nu_j}{\rho} - \frac{d}{2\rho}\,\omega_j
\tag{14}
\]
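As an illustration, the pair of conversions (13) and (14) can be sketched in code; the function and variable names below are our own, not taken from the paper's implementation:

```python
def wheels_to_body(e_L, e_R, rho, d):
    """Forward kinematics, Eq. (13): wheel angular velocities [rad/s]
    -> platform translational velocity nu [m/s] and rotational velocity omega [rad/s]."""
    nu = rho * (e_L + e_R) / 2.0
    omega = rho * (e_L - e_R) / d
    return nu, omega

def body_to_wheels(nu, omega, rho, d):
    """Inverse kinematics, Eq. (14): platform velocities -> wheel angular velocities."""
    e_L = nu / rho + d * omega / (2.0 * rho)
    e_R = nu / rho - d * omega / (2.0 * rho)
    return e_L, e_R
```

The two functions are exact inverses of each other, which is easy to check by a round trip: converting wheel speeds to (ν, ω) and back reproduces the original wheel speeds.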

The presented approach can be specified with the formal notation presented in [16]. A system consisting of n_a = 3 embodied agents a_j is considered, where two of the agents are box pushing robots, j ∈ {L, R}, and the third agent a_0 (j = 0) represents the knowledge about the desired box heading. For brevity further discussion will concentrate on the two robots pushing the box (i.e., a_L and a_R). In this case the effectors are mobile platforms and the effector state e_j of the agent represents the current velocity of the robot. The virtual sensor reading v_j represents information about the box position relative to the pusher, which is obtained by the agent a_j in the process of data aggregation from real hardware sensors. In this case laser scanners are used and the aggregation is done by extracting straight line segments from the laser range readings. In the following the four informally described elementary behaviours will be defined formally. Each behaviour is defined assuming that the problem it remedies is the only one occurring. This assumption significantly simplifies the definition of elementary behaviours and is an important feature of the presented method. It also influences implementation. Initially each behaviour is debugged separately, without considering other problems. Only once each behaviour has been proved to act correctly is the system tested as a whole. The four elementary behaviours might not be enough to deal with any situation that the robots can encounter. If new situations require the robots to modify their actions, then new behaviours can be appended using the introduced formalism. Each of the robots is controlled by the code (10). The variables defined in that code assume the following task specific names: {}^q c_{cj} = [{}^q ν_j, {}^q ω_j], j ∈ {L, R}, q = 1, …, 4. Moreover, {}^x c_{Vj} = [α, b, β_j, δ_j, ε_j] and {}^0 c_{cj} = [τ, W], where W = [w_q]. The definitions of δ_j, ε_j, τ and W will be provided later in the text. The internal memory assumes the structure: c_{cj} = [{}^0 c_{cj}, {}^1 c_{cj}, {}^2 c_{cj}, {}^3 c_{cj}, {}^4 c_{cj}]. Now the four elementary behaviours can be defined formally.

6.1. Maintaining box direction

In the case of this elementary behaviour we assume that both robots are heading approximately in the desired direction, have contact with the box and are equidistant from the centre of the box (Fig. 2). If the difference α between the desired and current direction is greater than zero, then one of the pushers has to stop. The box will rotate around the stopped pusher, while the other robot has to move by b sin α, which means that the velocity terms are defined as below:

\[
\left\{
\begin{aligned}
&{}^{1}\nu_L = \min\Big({+}\frac{b\sin\alpha}{\tau},\ \nu_{\max}\Big), \quad {}^{1}\nu_R = 0 && \text{when } \alpha \ge 0,\\
&{}^{1}\nu_R = \min\Big({-}\frac{b\sin\alpha}{\tau},\ \nu_{\max}\Big), \quad {}^{1}\nu_L = 0 && \text{when } \alpha < 0,\\
&{}^{1}\omega_L = {}^{1}\omega_R = 0
\end{aligned}
\right.
\tag{15}
\]

where the value of τ is expressed in [s]. It determines the duration of the corrective action and should be set to a value that produces acceptable velocities. b = |LR| is the length of the box edge (Fig. 2).

6.2. Maintaining pusher directions

To push the box in the desired direction both pushers have to be oriented towards the goal (Fig. 3). This elementary behaviour produces only a rotational velocity term for the robots. This velocity is proportional to the heading error of the left and the right pusher:

\[
\left\{
\begin{aligned}
&{}^{2}\nu_L = {}^{2}\nu_R = 0\\
&{}^{2}\omega_L = \max\Big(\min\Big(\frac{\beta_L - \alpha}{\tau},\ \omega_{\max}\Big),\ -\omega_{\max}\Big)\\
&{}^{2}\omega_R = \max\Big(\min\Big(\frac{\beta_R - \alpha}{\tau},\ \omega_{\max}\Big),\ -\omega_{\max}\Big)
\end{aligned}
\right.
\tag{16}
\]
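Control laws (15) and (16) translate almost line by line into code. The following sketch uses function names and a symmetric-clamp helper of our own invention; it is an illustration of the two control laws, not the paper's actual implementation:

```python
import math

def clamp(x, limit):
    """Symmetric saturation to [-limit, +limit], as used in Eq. (16)."""
    return max(min(x, limit), -limit)

def maintain_box_direction(alpha, b, tau, nu_max):
    """Behaviour 1, Eq. (15): stop one pusher and drive the other, so that
    the box rotates about the stopped pusher until the heading error alpha vanishes."""
    if alpha >= 0.0:
        nu_L = min(b * math.sin(alpha) / tau, nu_max)
        nu_R = 0.0
    else:
        nu_L = 0.0
        nu_R = min(-b * math.sin(alpha) / tau, nu_max)
    return (nu_L, 0.0), (nu_R, 0.0)  # (nu, omega) for the left and right pusher

def maintain_pusher_directions(alpha, beta_L, beta_R, tau, omega_max):
    """Behaviour 2, Eq. (16): rotate each pusher towards the desired heading,
    proportionally to its heading error, with saturation."""
    omega_L = clamp((beta_L - alpha) / tau, omega_max)
    omega_R = clamp((beta_R - alpha) / tau, omega_max)
    return (0.0, omega_L), (0.0, omega_R)
```

Note that each function returns a full (ν, ω) suggestion per pusher, with the terms it does not influence set to zero, so the suggestions of all behaviours have a uniform shape and can later be superposed.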

6.3. Maintaining box velocity along the pushing direction

In the case when both robots are correctly aligned with the box (α ≈ 0, β_L − α ≈ 0, β_R − α ≈ 0), pushing can be achieved by applying a constant forward velocity to both robots:

\[
\left\{
\begin{aligned}
&{}^{3}\nu_L = {}^{3}\nu_R = \nu_{\max}\\
&{}^{3}\omega_L = {}^{3}\omega_R = 0
\end{aligned}
\right.
\tag{17}
\]

6.4. Aligning pushers symmetrically with respect to the box edges To assure contact between pushers and the box an additional elementary behaviour has to be specified. The mobile platform has to rotate towards the nearest corner of the box (which can be easily located using the onboard laser range finder) and translate towards that corner (Fig. 4). The control law (formulated for the left pusher) is given by:

Fig. 3. Maintaining pushers directions behaviour.


\[
\left\{
\begin{aligned}
&{}^{4}\nu_j = \min\Big(\frac{\delta_j}{\tau},\ \nu_{\max}\Big)\\
&{}^{4}\omega_j = \max\Big(\min\Big(\frac{\epsilon_j}{\tau},\ \omega_{\max}\Big),\ -\omega_{\max}\Big)
\end{aligned}
\right.
\tag{18}
\]

where δ_j and ε_j are defined in Fig. 4 for the left (L) platform. Using this elementary behaviour the mobile robot locates the box corner even after losing contact with the object.

6.5. Behaviour composition

Each of the described elementary behaviours generates its own control law, which is given by the velocity terms for the left ({}^q ν_L, {}^q ω_L) and the right pusher ({}^q ν_R, {}^q ω_R). One of the possible compositions of the resultant motion is the weighted superposition of the component velocity terms, here (for the left robot):

\[
\nu_L = \sum_{q=1}^{4} {}^{q}\nu_L\, w_q, \qquad
\omega_L = \sum_{q=1}^{4} {}^{q}\omega_L\, w_q
\tag{19}
\]

where w_q is a weighting factor for each elementary behaviour. In the presented implementation the weights are assumed to be equal, i.e., w_q = 1/4, since each of the four behaviours is limited to produce velocities no greater than the allowed maximum.

7. Implementation

The proposed task decomposition has been implemented using the Player/Stage robotics middleware [18]. The elementary behaviours are implemented as functions in the client applications and communicate with device drivers D_1, …, D_n (i.e., motors, proximity sensors) through the Player server (Fig. 5). With this approach the robot control part is independent of the hardware and thus can be used with both simulated and real sensors and effectors. The control method has been tested using both the Gazebo simulator, which is a part of the Player/Stage suite [19], and the real robots. The simulation uses the Open Dynamics Engine as its core. The comparison of the results obtained from the simulation and from the physical system, described in Section 7.2, proved the simulation to be very precise – in both cases the curves resulting from pushing a box along S-shaped trajectories are very similar. As experiments on real robots are much more time and resource consuming, once we had established that the simulations reflect reality satisfactorily, we resorted to simulations. Models of the ELECTRON mobile robots (Fig. 6) with onboard laser range finders were used in the simulation (Fig. 7). The relative position of the box was computed based on segment extraction from laser data using a recursive line fitting method [20]. The laser range finders can measure distances up to 80 m.

7.1. Simulation results

The implemented control algorithm has been verified using a world model consisting of two mobile robots and a solid box (190 × 40 × 50 cm).
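Control laws (17) and (18) and the weighted superposition (19) can also be sketched in code. As before, all function and parameter names are illustrative assumptions of ours, not identifiers from the paper's Player/Stage client:

```python
def maintain_box_velocity(nu_max):
    """Behaviour 3, Eq. (17): constant forward push, no rotation."""
    return nu_max, 0.0

def align_with_box_corner(delta, epsilon, tau, nu_max, omega_max):
    """Behaviour 4, Eq. (18): translate towards the nearest box corner
    (distance delta) while rotating towards it (bearing epsilon, cf. Fig. 4)."""
    nu = min(delta / tau, nu_max)
    omega = max(min(epsilon / tau, omega_max), -omega_max)
    return nu, omega

def compose(suggestions, weights=None):
    """Eq. (19): weighted superposition of the per-behaviour (nu, omega)
    suggestions into the resultant platform command."""
    if weights is None:
        # equal weights, w_q = 1/len(suggestions), as in the presented implementation
        weights = [1.0 / len(suggestions)] * len(suggestions)
    nu = sum(w * s[0] for w, s in zip(weights, suggestions))
    omega = sum(w * s[1] for w, s in zip(weights, suggestions))
    return nu, omega
```

In a client structured as in Fig. 5, the four suggestions would be recomputed in every control cycle and the composed (ν, ω) converted to wheel speeds via (14) before being sent to the motor driver.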
The information about the global box heading was available directly from the simulation engine and not from the third agent, as was the case for the experiments with real robots (Fig. 7). At the beginning of the simulation the robots were located about 10 cm behind the box. Two separate tasks (defined by the desired box trajectories) were executed in simulation – traversal of a half-circle, just to validate the controller, and then an S-shaped trajectory, which mimicked joint transportation of the box with obstacle avoidance. The first reference job was to rotate the box by 180° along a smooth circular trajectory. The heading was transmitted to the robots incrementally. The desired box heading was changed during 100 s at a rate of 1.8°/s (Fig. 8). The box direction followed the reference trajectory with an almost constant delay of about 4 s. The pushers' velocities were differentiated by the

Fig. 4. Aligning pushers symmetrically according to box edge behaviour.


Fig. 5. Implementation of the control structure: client applications Client1 and Client2 communicate over TCP/IP with the Player servers, which access the simulated or real hardware H1, …, Hn through device drivers D1, …, Dn.

Fig. 6. The ELECTRON mobile robot used in experiments [17].

Fig. 7. Box pushing simulation and experiment involving two cooperating mobile robots.


Fig. 8. The circular box pushing trajectory (simulation): reference heading and box heading (rad) vs. τ (s).

fact that the right pusher had to travel a longer distance, moving on the outer rim of the ring (Fig. 9). The left pusher's translational velocity tracked the velocity of the right pusher, which resulted in smooth changes of the box direction. The angular velocities differed at the beginning of the simulation, when both robots were approaching the corners of the box from their home positions. After that the angular velocities were approximately equal, since both pushers had to turn by 180° during the time set for the execution of the task. The second reference job was to push the box along the S-shaped trajectory, resembling a trajectory avoiding an obstacle, e.g., such as the trajectory ants would traverse transporting prey while avoiding a stone on their path. The trajectory was transmitted to the robots incrementally. The reference trajectory was a sine function with a period of 120 s (Fig. 10). As in the previous case, the box direction followed the reference trajectory with an almost constant delay of about 4 s. When rotating the box clockwise the left pusher moved faster, and while turning in the opposite direction (counterclockwise, the second half of the trajectory) the right pusher moved faster. The rotational velocity terms follow the reference box heading (the sine function), while small local fluctuations are caused by the robots aligning with the box edges. After finishing

Fig. 9. Pushers' translational (m/s) and angular (deg/s) velocities producing the circular trajectory of the box (simulation).

Fig. 10. The S-shaped box pushing trajectory (simulation): reference heading and box heading (rad) vs. τ (s).


Fig. 11. Pushers' translational (m/s) and angular (rad/s) velocities producing the S-shaped trajectory of the box (simulation).

Fig. 12. Box pushing trajectory in the experiment with real robots: reference heading and box heading (rad) vs. τ (s), and the box location in the XY plane (m).

the trajectory the box remains translated in the direction parallel to its longer edge – this is a side effect of slipping and proves the simulation to be very close to real physics.

7.2. Experimental results

Only the second of the described trajectories (the S-shaped one) was tested by experimenting with real robots. Knowledge about the global heading of the box was provided by the third robot with an onboard laser range finder. In this case the robot was not moving, but in general it can act as a steering agent and lead the transporting team to the goal location by supplying the changing heading. The steering robot a_0 was 9 m away from the initial box position. The executed trajectory was 8 m long, thus the steering robot did not have to relocate itself.

Fig. 13. Pushers' translational (m/s) and angular (rad/s) velocities in the experiment with real robots.


Fig. 14. Pushers' translational (m/s) and angular (rad/s) velocities produced by each of the four elementary behaviours in the experiment with real robots (grey line for left, black line for right robot respectively).

The resulting trajectory is very close to the one obtained using the simulation environment (Fig. 12). It is important to note that there is much noise in the produced control velocities. In practice this does not disturb the control strategy, since each elementary behaviour acts as a remedy to some singled-out discrepancy and controls it in a feedback loop manner. The produced control velocities (Fig. 13) are very similar to those obtained during simulation (Fig. 11). They result from the superposition of the separate controls produced by each of the four elementary behaviours (Fig. 14).

8. Conclusions

A method of dividing a task into independent elementary behaviours has been presented. The division is based on the singled-out possible discrepancies between the desired value of some quantity and its measured value. Usually this value is obtained from sensors directly or results from the aggregation of sensor data from several sources (i.e., fusion). Each of those discrepancies hinders the execution of the task (attaining the global goal) and is treated separately. This feature facilitates finding the remedy for each of those discrepancies. The elementary behaviour reducing the discrepancy is arranged as a negative feedback. Each such behaviour produces its suggestion as to what the action (motion) of the robot should be to remedy a particular problem. As the discrepancies occur simultaneously, the elementary behaviours have to act in parallel. Hence many, perhaps conflicting, suggestions as to the remedy are produced. They need to be composed into a final decision. For that purpose either cooperative or competitive methods can be used [16,4]. For the problem at hand a cooperative method has been proposed, namely a weighted superposition of the results produced by the elementary behaviours. However, nonlinear composition can also be considered.
The simulation results, preceding the experiments on real robots, showed that the presented method of designing a behaviour-based controller produces a satisfactory result. However, it should be noted that in very complex situations the proposed simple strategy can fail, because the conflicting requirements can result in mutual cancellation of responses, leading to inaction or an inappropriate reaction. Nevertheless, the proposed formal formulation of both the elementary behaviours and their composition can readily be extended by adding new behaviours (partial remedies) and changing the composition rule. As this formal description of behaviours becomes the specification for the implementation of the control software, and the relationship between the specification and the implementation is very strong, the proposed method significantly facilitates the implementation of the controller. The proposed formal approach also facilitates the discussion of diverse definitions of behaviours by imposing mathematical rigour on this discussion. Finally, the paper presents the use of stigmergy to solve a relatively difficult problem of tight cooperation between autonomous robots that are unaware of each other's actions. Future work will concentrate on methods of specifying the global goal for a multi-agent system in terms of roles (resultant behaviours) of specific agents. Those methods must, in particular, assure that stagnation (deadlock) will not occur. It is


doubtful that complex global goals can always be decomposed effectively into behaviours of completely independently acting agents. Hence some form of coordination between the agents will be necessary. Currently research is directed at finding a least constraining coordinating agent for such systems. In this case a hybrid system will result. It should be remembered that the benefit of systems relying only on implicit communication is their immunity to the failure of some of the individual agents. The introduction of a coordinator, and thus explicit communication, should not wreck this property. The presented formalism should facilitate this research by enabling the formal formulation and discussion of the solutions.

Acknowledgement

The authors gratefully acknowledge the support of this work by the Warsaw University of Technology Research Program.

References

[1] E. Bonabeau, M. Dorigo, G. Theraulaz, Swarm Intelligence: From Natural to Artificial Systems, Oxford University Press, New York, Oxford, 1999.
[2] R.A. Brooks, A robust layered control system for a mobile robot, IEEE Journal of Robotics and Automation RA-2 (1) (1986) 14–23.
[3] R.C. Arkin, Behavior-Based Robotics, MIT Press, Cambridge, Mass., 1998.
[4] C. Zieliński, Transition-function based approach to structuring robot control software, in: K. Kozłowski (Ed.), Robot Motion and Control: Recent Developments, Lecture Notes in Control and Information Sciences, vol. 335, Springer-Verlag, 2006, pp. 265–286.
[5] B. Gerkey, M.J. Matarić, A formal analysis and taxonomy of task allocation in multi-robot systems, International Journal of Robotics Research 23 (9) (2004) 939–954.
[6] T. Makiishi, H. Noborio, Sensor-based path-planning of multiple mobile robots to overcome large transmission delays in teleoperation, in: Proceedings of the IEEE SMC'99 International Conference on Systems, Man, and Cybernetics, vol. 4, 1999, pp. 656–661.
[7] J.L. Deneubourg, S. Goss, N. Franks, A. Sendova-Franks, C. Detrain, L. Chrétien, The dynamics of collective sorting: robot-like ants and ant-like robots, in: Proceedings of the First International Conference on Simulation of Adaptive Behavior: From Animals to Animats, MIT Press, Cambridge, MA, USA, 1990, pp. 356–363.
[8] E. Tuci, R. Groß, V. Trianni, F. Mondada, M. Bonani, M. Dorigo, Cooperation through self-assembly in multi-robot systems, ACM Transactions on Autonomous and Adaptive Systems 1 (2) (2006) 115–150.
[9] W. Szynkiewicz, C. Zieliński, W. Czajewski, T. Winiarski, Control architecture for sensor-based two-handed manipulation, in: T. Zielińska, C. Zieliński (Eds.), CISM Courses and Lectures – 16th CISM-IFToMM Symposium on Robot Design, Dynamics and Control, RoManSy'06, vol. 487, Springer, Wien, New York, 2006, pp. 237–244.
[10] C. Zieliński, W. Szynkiewicz, T. Winiarski, M. Staniak, Rubik's cube puzzle as a benchmark for service robots, in: 12th IEEE International Conference on Methods and Models in Automation and Robotics, MMAR'2006, 2006, pp. 579–584.
[11] C. Kube, E. Bonabeau, Cooperative transport by ants and robots, Robotics and Autonomous Systems 30 (2000) 85–101.
[12] M. Matarić, M. Nilsson, K. Simsarian, Cooperative multi-robot box-pushing, in: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 1995, pp. 556–561.
[13] D. Rus, B. Donald, J. Jennings, Moving furniture with teams of autonomous robots, in: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 1995, pp. 235–242.
[14] B.R. Donald, J. Jennings, D. Rus, Information invariants for distributed manipulation, International Journal of Robotics Research 16 (5) (1997) 673–702.
[15] S. Yamada, J. Saito, Adaptive action selection without explicit communication for multirobot box-pushing, IEEE Transactions on Systems, Man, and Cybernetics, Part C 31 (3) (2001) 398–404.
[16] C. Zieliński, Formal approach to the design of robot programming frameworks: the behavioural control case, Bulletin of the Polish Academy of Sciences – Technical Sciences 53 (1) (2005) 57–67.
[17] M. Olszewski, B. Siemiątkowska, R. Chojecki, P. Marcinkiewicz, P. Trojanek, M. Majchrowski, Mobile robot localization using laser range scanner and omni-camera, in: T. Zielińska, C. Zieliński (Eds.), CISM Courses and Lectures – 16th CISM-IFToMM Symposium on Robot Design, Dynamics and Control, RoManSy'06, vol. 487, Springer, Wien, New York, 2006, pp. 229–236.
[18] R.T. Vaughan, B. Gerkey, A. Howard, On device abstractions for portable, reusable robot code, in: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2003), 2003, pp. 2421–2427.
[19] N. Koenig, A. Howard, Design and use paradigms for Gazebo, an open-source multi-robot simulator, in: IEEE/RSJ International Conference on Intelligent Robots and Systems, Sendai, Japan, 2004, pp. 2149–2154.
[20] J. Xavier, M. Pacheco, D. Castro, A. Ruano, U. Nunes, Fast line, arc/circle and leg detection from laser scan data in a Player driver, in: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2005, pp. 3930–3935.