An agent specific planning algorithm


Expert Systems with Applications 39 (2012) 4860–4873


Luis Berdun, Analía Amandi, Marcelo Campo
ISISTAN, Facultad de Ciencias Exactas, Universidad Nacional del Centro de la Pcia. Bs. As., Campus Universitario, Paraje Arroyo Seco, Tandil, Argentina

Abstract

Planning algorithms are often applied by intelligent agents for achieving their goals. For plan creation, this kind of algorithm uses only an initial state definition, a set of actions, and a goal, while agents also have preferences and desires that should be taken into account. Thus, agents need to spend time analyzing each plan returned by these algorithms to find one that satisfies their preferences. In this context, we have studied an alternative in which a classical planner is modified to accept a new conceptual parameter for plan creation: an agent mental state composed of preferences and constraints. In this work, we present a planning algorithm that extends a partial order algorithm to deal with the agent's preferences. In this way, our algorithm builds an adequate plan in terms of the agent's mental state. In this article, we introduce this algorithm and expose experimental results showing the advantages of this adaptation.

Keywords: Intelligent agents; Planning; Agent's preferences

1. Introduction

Intelligent agents are autonomous entities that interact with their world to achieve their goals. The actions carried out by intelligent agents are deliberate, i.e., actions made with the purpose of achieving goals. Planning algorithms are often applied for this purpose. This kind of algorithm builds an action plan from the initial state of the world, a desired final state, and a set of actions that can be performed by the agent (Blum & Furst, 1997; Fikes & Nilsson, 1971; Weld, 1996).

The needs of intelligent agents are not totally met by the usual way of applying these algorithms. Agents invoke these algorithms by sending an initial state, a final state, and a set of actions that they can perform. The planner, from these data, returns a plan that achieves the final state, as long as a feasible plan exists. The problem with this way of interaction is that the agent's mental state is not considered in any generated plan. The mental state of an agent comprises mental attitudes such as preferences and desires. These mental attitudes distinguish one agent from another, as well as the same agent at different times (Rao & Georgeff, 1995; Shoham, 1993). Thus, this changing parameter is ignored by classical planners, which becomes a problem for agents.

To illustrate this problem, we present the following situation. An agent invokes a planning algorithm with the following goal to be achieved: goal(box(182, 'London')), specifying that the agent wants box 182 to be in London. For achieving this goal, there are several feasible plans that the agent can follow. However, if we consider that the agent has specified preference(visit('Birmingham'), 9), showing that the agent has a high preference (9/10) to visit Birmingham, a plan that goes past Birmingham will be more acceptable for the agent than one that does not consider this stop. Moreover, if visiting Birmingham is a pending objective, the importance of a plan that uses Birmingham as an intermediate stop is still higher, since another objective would be accomplished, although this objective is not part of the initial proposal.

Nowadays, an agent should analyze the first plan generated by a planning algorithm to evaluate its degree of compatibility with its mental state. If the plan is not acceptable in relation to the agent's preferences, the agent asks over and over again for another solution until it finds one that is good enough. The process finishes when the agent accepts one solution or decides to reject all of them. Fig. 1 illustrates these cases.

To solve the above-mentioned problems, we first analyzed whether some existing mechanism could help agents in their planning processes. An obvious alternative is to include the agent's desires as preconditions of the actions used by the planner and in its initial state definition. Although this alternative is a solution, it is a static one: a change in the agent's mental state implies changes in the action definitions. Basically, the problem is that the problem definition would contain elements that represent the agent's preferences with regard to the solution. These elements are not relevant for the problem description, but are essential at the time of searching for a solution. When the agent's interests change, developers need to re-code basic parameters in the problem definition. Therefore, agents need specific algorithms that deal with their desires. This is the approach we decided to follow.



In this context, some proposals from the fifth International Planning Competition, IPC-5 (Baier, Bacchus, & McIlraith, 2007; Baier, Hussell, Bacchus, & McIlraith, 2006; Edelkamp, 2006; Edelkamp, Jabbar, & Naizih, 2006), were useful to address our work. These algorithms consider constraints during the planning process based on an extension of the language PDDL (Gerevini & Long, 2006). PDDL3 preferences are highly expressive; however, they are solely state centric, identifying preferred states along the plan trajectory (Sohrabi, Baier, & McIlraith, 2009).

At this point, we had two big categories of planning algorithms to consider: those centered on the plan space and those centered on the state space. We decided to start attacking the problem by working on plan space centered algorithms because they build partially ordered and partially instantiated plans that are more explicit and flexible for execution (Ghallab, Nau, & Traverso, 2004), which is particularly important in the environment of autonomous agents. Thus, we started with the UCPOP algorithm (Weld, 1994), building our Ag-UCPOP. Ag-UCPOP considers preferences and constraints in plans that try to achieve the agents' goals. We decided to use a simple specification of these mental attitudes; a more complex mental state definition can be incorporated with little impact. We focused our work on analyzing the viability of our approach of agent-specific planning algorithms, at least in plan space centered algorithms. Consequently, we have obtained a planning algorithm that generates solution plans while trying to satisfy the preferences and constraints of agents. An acceptable plan for an agent is one that complies with the agent's set of mental attitudes. Any change of these preferences and constraints during a plan execution could be considered a trigger for changing the plan.

The article is organized in the following way. Section 2 presents an overview of the Ag-UCPOP algorithm. Section 3 exposes details of the Ag-UCPOP algorithm. Section 4 shows experimental results. Section 5 analyzes related work. Finally, in Section 6 the conclusions are discussed.

Fig. 1. Feasible cases present in the planner–agent interaction.

2. Overview of Ag-UCPOP algorithm

In this section, we introduce our approach with a practical example, which shows how the agent's mental state influences both the generation process of a solution and the final result. We use a simple version of the mental state that only considers preferences and constraints. We did this in order to focus the presentation on the algorithm: we only show how the attitudes can improve the plan quality, without confusing the explanation with the complexity of mental state formalisms. In the following subsections, we present the key parts of Ag-UCPOP, which are explained in detail in the rest of the paper.

2.1. Practical scenarios

As mentioned above, the proposed solution consists of taking the agent's mental state into account in the plan conception. Fig. 2 shows an example in which the solution plan changes when the agent's mental state is considered in the plan generation. The formulated problem is the transportation of a man (called 'John') from one city to another (in this case the starting point is a city called a and the destination is a city called c). In this example, it is possible to use two different cars (called c1 and c2) for the transportation from one city to another.
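To make the example concrete, the following is a minimal sketch, in Python, of how a mental state like the one in Fig. 2 could be represented and scored. The encoding (a Preference record holding a condition literal and a weight) is ours and merely illustrative; it is not the paper's actual data structure.

from dataclasses import dataclass

@dataclass(frozen=True)
class Preference:
    condition: str  # e.g. "on(d, 'John')", a literal the plan may contain
    weight: int     # the agent's degree of preference

# The mental state shown in the middle of Fig. 2, transcribed as data:
mental_state = [
    Preference("on(d, 'John')", 10),
    Preference("car(c1)", 10),
    Preference("car(c2)", 50),
]

def score(plan_literals: set) -> int:
    """Sum the weights of the preferences satisfied by a plan."""
    return sum(p.weight for p in mental_state if p.condition in plan_literals)

# The plan on the right of Fig. 2 goes through city d using car c2:
print(score({"on(d, 'John')", "car(c2)"}))  # 60, versus 10 for the b/c1 route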

Fig. 2. The importance of considering mental states in planning. On the left, the plan without the agent's mental state: John travels through city b using car c1. In the middle, the agent's mental state: preference(on(d, 'John'), 10), preference(car(c1), 10), preference(car(c2), 50). On the right, the plan with the agent's mental state: John travels through city d using car c2.


Fig. 2 shows, on the left, a plan obtained with the UCPOP algorithm. This plan achieves the goal, but does not take into account the mental state of the agent, which is presented in the middle of the figure. On the right, we can see the plan obtained with our algorithm; this plan also achieves the goal, but is additionally more suitable in terms of the agent's mental state. In this case, the achieved solution instantiates some actions differently owing to the mental state of the agent. For example, in the moveCar actions, car c2 is used instead of car c1. Besides, city d is selected as the intermediate stop instead of city b, which had been selected in the first case. These changes are shown on the right of Fig. 2, and they always take place according to the mental state.

Even though the changes provoked by the mental state in this example only involve variable instantiations (the car used and the intermediate city), the kind of action used in the plan may also vary, depending on the mental state. Fig. 3 presents an example in which the kind of action is also changed. In this case, the mental state of the agent specifies that 'John' prefers to travel by plane (represented by preference(MovePlane(_,_,'John',_), 60)). This example shows how a few attitudes can determine changes at the time of achieving a solution to the problem.

Fig. 3. A new solution, according to the new agent's mental state: with preference(MovePlane(_,_,'John',_), 60), preference(Stops < 2, 50), preference(car(c2), 40), and preference(car(c1), 10), the plan uses the action MovePlane(a, c, 'John', p1).

Also, in Fig. 3, we can see that some mental attitudes are not satisfied by the achieved solution. This is due to the fact that not all mental attitudes can necessarily be satisfied by the achieved plan. In this case, the solution accomplishes the preferences preference(MovePlane(_,_,'John',_), 60) and preference(Stops < 2, 50) (the latter specifies that the agent wants to stop no more than twice). Our proposed approach always tries to achieve a solution that satisfies the agent's mental state. This does not mean that all mental attitudes will be satisfied, since the attitudes can be contradictory. For example, in Fig. 3, if the preference is to travel by plane but 'John' uses car c2, this action contradicts the preference in an indirect way (because in order to use car c2, it is necessary to travel by car and not by plane).

2.2. Key parts of our algorithm

Ag-UCPOP has three main parts in which mental states can influence the plan generation process. These parts are named cut condition, action selection, and consistency control. Below, we present a brief summary of each one; a schematic sketch of how they fit into the planning loop follows this subsection.

2.2.1. Cut condition

Every planning algorithm has a cut condition establishing when a plan has been found (provided that a solution exists). In the UCPOP algorithm, the plan conception finishes when the goal agenda is empty and all the variables have an associated value. In our algorithm we additionally need to check a new condition: whether the currently generated plan is acceptable from the agent's point of view or not. In this part of the algorithm, we have the possibility to check the plan's validity against the conditions the plan should fulfill to be considered a solution. To achieve this personal validation, the agent may specify a condition of acceptability, which is checked by the planner before deciding that it has a solution. An example of a cut condition could be to fulfill all preferences that have a weight higher than 6, or to fulfill all the preferences that use boat a, or any combination of requirements on mental attitudes that can be specified. It is important to mention that a solution to the general problem might not be considered a solution by the agent (always in terms of the mental state). We consider the agent's conditions as another requirement on the solution; therefore, following the previous example, if the plan does not use boat a, it will be considered an invalid solution.

2.2.2. Action selection

During a plan generation process, there is a point where the planner should choose an action to try to achieve the current goal. In UCPOP, this decision is made in a non-deterministic way. From a practical point of view, this decision can be made by either a random algorithm or a domain-specific algorithm (Weld, 1994). In Ag-UCPOP, the preferences are taken into account at the moment of making this decision (at this moment we only work with preferences, but it is also feasible to include other mental attitudes). If one choice satisfies some preferences and an alternative choice does not satisfy any, for instance, the planner chooses the option that satisfies the preferences.

2.2.3. Consistency control

In each loop, we need to confirm that all the actions remain in a consistent state, at least the ones affected by the last changes. Here, UCPOP looks for newly threatened links. If a threatened link is detected, the planner tries to eliminate the threat using retraction, promotion, or confrontation. In Ag-UCPOP, we extend the definition of threatened link: we also consider it a threat when the current plan includes steps that break some constraint specified by the agent. By constraints we mean situations that the agent does not wish to find in the solution. The breach of these conditions is verified as soon as it appears, and not at the end of the plan building. This extension allows us to filter incompatible solutions at the moment they are built. This is useful, for instance, to limit the use of one element, or to limit the number of times an action is applied, and so on. The planner, besides searching for threatened links, makes sure that the current plan is consistent with the agent's mental state. For example, if the achieved plan uses a given car and there is a constraint specifying that the agent does not want to use that car, the planner discards the last decision, comes back to the previous decision point, and looks for an alternative solution.
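The following Python skeleton is a schematic sketch, under our reading of this overview, of where the three extension points sit in a plan space search loop. All helper callables stand in for the UCPOP machinery described in Section 3; they are illustrative placeholders, not the published algorithm.

from typing import Any, Callable, Iterable, List, Optional

Plan = Any  # stands for the <A, O, L, B> structure of Section 3.1

def ag_ucpop(plan: Plan,
             agenda: List[Any],
             candidates: Callable[[Any, Plan], Iterable[Any]],
             refine: Callable[[Plan, Any, Any], Plan],
             rank_by_preferences: Callable[[Iterable[Any]], List[Any]],
             violates_constraints: Callable[[Plan], bool],
             acceptable: Callable[[Plan], bool]) -> Optional[Plan]:
    if not agenda:
        # (1) Cut condition: classic termination plus the agent's check (E).
        return plan if acceptable(plan) else None
    goal = agenda.pop()
    # (2) Action selection: candidates ordered by relevance w.r.t. M,
    # replacing UCPOP's nondeterministic choice.
    for action in rank_by_preferences(candidates(goal, plan)):
        child = refine(plan, goal, action)
        # (3) Consistency control: threatened links and broken constraints.
        if violates_constraints(child):
            continue  # filter incompatible partial plans as soon as they appear
        result = ag_ucpop(child, list(agenda), candidates, refine,
                          rank_by_preferences, violates_constraints, acceptable)
        if result is not None:
            return result
    return None  # exhausted alternatives: backtrack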

3. The Ag-UCPOP algorithm

As its name suggests, Ag-UCPOP is based on the UCPOP algorithm. UCPOP is a partial order planning algorithm whose action descriptions include conditional effects and universal quantification. The UCPOP algorithm starts with an initial plan that consists solely of a "start" action (whose effects encode the initial conditions) and a "goal" action (whose preconditions encode the goals).


UCPOP attempts to complete this initial plan by adding new actions and constraints until all preconditions are satisfied. In the main loop the planner makes two types of choices:

1. If the current plan has not yet satisfied a precondition, UCPOP chooses one effect nondeterministically from all action effects that could possibly unify with the desired proposition. Then, it adds a causal link (McAllester & Rosenblitt, 1991) to the plan in order to register this choice.
2. If an action has interfered with a causal link, then UCPOP nondeterministically chooses a method to solve that threat, either by adding new ordering constraints or by posting additional subgoals.

When all goals have been supported by a causal link, and all causal links have been protected, UCPOP finishes and returns a solution. We selected UCPOP for several reasons; one of them was that the least commitment inherent in partial order planning makes it one of the more open planning frameworks (Nguyen & Kambhampati, 2001). Also, this selection allows us to work with a plan space centered algorithm without limitations, i.e., an algorithm that does not restrict the search space to optimize the time to reach the first solution.

Fig. 4 shows the Ag-UCPOP algorithm and the main points of interaction between the planner and the agent: cut condition (1), action selection (2), and consistency control (3). In this section we explain the modifications introduced by Ag-UCPOP and the new inputs and outputs; for more information on UCPOP, see Penberthy and Weld (1992) and Weld (1994). In the following subsections, we present a detailed explanation of each point mentioned above, as well as the new inputs and outputs used by the Ag-UCPOP algorithm.

Fig. 4. Ag-UCPOP algorithm.

3.1. Inputs and outputs

As Fig. 4 shows, two extra parameters are added to the parameters of the UCPOP algorithm: the agent's mental state (represented by the letter M) and the cut specifications (represented by the letter E). The input of the algorithm is specified in the following way:

(⟨A, O, L, B⟩, agenda, Λ, M, E)

The new data, provided by the agent, are divided into two parts in order to keep the agent's interests separate. On the one hand, variable M contains the agent's mental attitudes, i.e., attitudes that specify, for example, that the agent prefers to visit Birmingham, or prefers to stop in London. Next, we detail the specification of variable M in BNF.

⟨M⟩ ::= ⟨Clauses⟩
⟨Clauses⟩ ::= ⟨Attitude⟩ ⟨Clauses⟩ | ⟨Attitude⟩


⟨Attitude⟩ ::= ⟨Head⟩ | ⟨Head⟩ :- ⟨Body⟩.
⟨Head⟩ ::= ⟨Type⟩(⟨Args⟩) | ⟨Type⟩(⟨CompArgs⟩) | constraint(⟨ConstArgs⟩)
⟨Args⟩ ::= ⟨NameAttitude⟩, Weight, Confidence | Weight, Confidence | Weight | ⟨Predicate⟩ | ⟨Predicate⟩, Weight, Confidence | ⟨Predicate⟩, Weight
⟨CompArgs⟩ ::= ⟨Predicate⟩, ⟨ActionName⟩, ⟨ActionName⟩, Weight, Confidence
⟨ConstArgs⟩ ::= ⟨NameAttitude⟩, Weight, Confidence, ⟨AppLevel⟩
⟨AppLevel⟩ ::= ⟨Plan⟩ | ⟨AppAction⟩
⟨Body⟩ ::= ⟨Predicate⟩, ⟨Body⟩ | ⟨Predicate⟩
⟨Plan⟩ ::= plan(⟨ActionList⟩, ⟨LinkList⟩, ⟨OrderList⟩)
⟨ActionList⟩ ::= [⟨ActionItems⟩] | [ ]
⟨ActionItems⟩ ::= ⟨ActionItem⟩, ⟨ActionItems⟩ | ⟨ActionItem⟩
⟨ActionItem⟩ ::= action(⟨ActionName⟩, ⟨Id⟩)
⟨LinkList⟩ ::= [⟨LinkItems⟩] | [ ]
⟨LinkItems⟩ ::= ⟨Link⟩, ⟨LinkItems⟩ | ⟨Link⟩
⟨Link⟩ ::= link(⟨Id⟩, ⟨Id⟩, ⟨Predicate⟩)
⟨OrderList⟩ ::= [⟨OrderLevels⟩] | [ ]
⟨OrderLevels⟩ ::= ⟨Level⟩, ⟨OrderLevels⟩ | ⟨Level⟩
⟨Level⟩ ::= [⟨Ids⟩]
⟨Ids⟩ ::= ⟨Id⟩, ⟨Ids⟩ | ⟨Id⟩
⟨AppAction⟩ ::= action(⟨ActionName⟩, ⟨PreList⟩, ⟨PostList⟩, ⟨Id⟩)
⟨PreList⟩ ::= [⟨PredicatesList⟩] | [ ]
⟨PredicatesList⟩ ::= ⟨Predicate⟩, ⟨PredicatesList⟩ | ⟨Predicate⟩
⟨PostList⟩ ::= [⟨PredicatesList⟩] | [ ]

Here ⟨NameAttitude⟩ is the name assigned to the attitude; ⟨ActionName⟩ is the name assigned to the action, including its variables, which allow distinguishing among several instances of actions; ⟨Id⟩ is the identifier assigned by the planner to an action used in the plan (for example, A1, A2), which is used to recognize the actions in the causal links; ⟨Predicate⟩ represents a Prolog predicate, e.g. name(john); and ⟨Type⟩ represents the kind of attitude (initially we only work with preference).

A feasible example of clauses contained in M could be preference(newCar(C3), makeCar(C3), moveCar(C3, 'London', 'Birmingham'), 5, 60) or preference(visit('London'), 9, 80).

On the other hand, variable E contains information about the interaction between the agent and the planner. The information recorded in this variable is used to explicitly collaborate with the planner in the process of building a plan. Next, we detail the specification of variable E in BNF.

⟨E⟩ ::= ⟨Clauses⟩
⟨Clauses⟩ ::= ⟨Attitude⟩ ⟨Clauses⟩ | ⟨Attitude⟩
⟨Attitude⟩ ::= ⟨Head⟩ :- ⟨Body⟩. | ⟨Head⟩.
⟨Head⟩ ::= desire_level(⟨StrAtt⟩) | minimumConfidence(⟨Value⟩) | minimumWeight(⟨Value⟩) | importance(⟨Type⟩, ⟨Value⟩)
⟨StrAtt⟩ ::= attitude(⟨AcomplishAtt⟩, ⟨NotAccomplishAtt⟩)
⟨AcomplishAtt⟩ ::= ⟨ListPred⟩
⟨NotAccomplishAtt⟩ ::= ⟨ListPred⟩
⟨ListPred⟩ ::= [⟨Predicates⟩] | [ ]
⟨Predicates⟩ ::= ⟨Predicate⟩, ⟨Predicates⟩ | ⟨Predicate⟩
⟨Predicate⟩ ::= predic(⟨NamePred⟩, [⟨Body⟩], Weight, Confidence)
⟨Body⟩ ::= ⟨PrologPredicate⟩, ⟨Body⟩ | ⟨PrologPredicate⟩
⟨NamePred⟩ ::= ⟨Type⟩(⟨Args⟩)
⟨Args⟩ ::= ⟨NameAttitude⟩, Weight, Confidence | Weight, Confidence | Weight | ⟨PrologPredicate⟩, Weight, Confidence

Here ⟨Value⟩ is a numeric value; ⟨PrologPredicate⟩ represents a Prolog predicate, e.g. name(john); ⟨NameAttitude⟩ is the name assigned to the attitude; and ⟨Type⟩ represents the kind of attitude.

With these attitudes it is possible to specify, for example, that a plan is valid for an agent at a given time only if all the preferences with value higher than 5 are accomplished, or that the rules whose confidence is lower than 50% are discarded. These cases could be specified in E in the following way:

desire_level(attitudes(Lacom, LnotAcom)) :- max(LnotAcom, Value), Value < 6.
minimumConfidence(50).

The above specification is only an example of a condition for a particular agent; it is possible to specify any condition, including an empty one. The output of the algorithm is a partial order plan that, beginning in the initial state, allows achieving the final state. Additionally, the output constitutes an acceptable plan for the agent, i.e., it contemplates the mental attitudes included in M and satisfies the cut conditions established in E. The output is specified as:

(⟨A, O, L, B⟩, Classif_Attitudes)

Classif_Attitudes contains the classification achieved by the plan, according to the current mental attitudes of the agent.
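As an illustration, the two new inputs could be transcribed into executable form as follows. This is a hedged sketch in Python rather than the paper's JavaLog encoding, and the field names are ours.

from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class Attitude:              # one clause of M
    kind: str                # <Type>, e.g. "preference" or "constraint"
    name: str                # <NameAttitude> or the preferred predicate
    weight: int              # Weight
    confidence: int          # Confidence (0-100)
    body: Optional[Callable[[], bool]] = None  # optional :- <Body> test

# M: e.g. preference(visit('London'), 9, 80).
M: List[Attitude] = [
    Attitude("preference", "visit('London')", 9, 80),
    Attitude("preference", "visit('Birmingham')", 5, 60),
]

# E: a cut specification mirroring the example in the text, i.e. every
# unaccomplished preference must weigh less than 6, and rules below 50%
# confidence are discarded.
def desire_level(accomplished: List[Attitude],
                 not_accomplished: List[Attitude]) -> bool:
    return max((a.weight for a in not_accomplished), default=0) < 6

MINIMUM_CONFIDENCE = 50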

3.2. Cut condition

The UCPOP algorithm establishes that a solution is achieved when the goal agenda is empty. At this stage, the planner is in a condition to state that one solution to the initial problem exists; it only remains to verify whether the solution fulfills the acceptability criteria established by the agent. These criteria are specified in variable E of the Ag-UCPOP algorithm. This new satisfiability condition, which is added to the cut condition of the algorithm, allows checking that the agent agrees with the achieved plan: now a plan is valid only if the agent accepts the solution. To explain this part, we use the problem presented in Fig. 5 as an example. Fig. 6 shows a schema of the way in which the planner works together with the agent's mental state.


Box Transportation Problem (BTP). The agent has the responsibility of coordinating the transportation of a box from one city to another. For this, the agent can use the following action:

transport(Box, From, To)
  Pre: on(From, Box), connected(From, To)
  Post: not(on(From, Box)), on(To, Box)

The box boxC1, which is in city a, has to be transported to city g.

Fig. 5. Box transportation problem definition: the initial state (boxC1 in city a) and the final state (boxC1 in city g), over a map of connected cities a through g.
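For readers who prefer code to the Pre/Post notation above, here is a sketch of the transport action in a STRIPS-like encoding. The Action record and apply function are hypothetical conveniences of ours, not part of the paper.

from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    name: str
    pre: frozenset      # preconditions that must hold in the state
    add: frozenset      # literals added by the action
    delete: frozenset   # literals removed by the action

def transport(box: str, src: str, dst: str) -> Action:
    return Action(
        name=f"transport({box}, {src}, {dst})",
        pre=frozenset({f"on({src}, {box})", f"connected({src}, {dst})"}),
        add=frozenset({f"on({dst}, {box})"}),
        delete=frozenset({f"on({src}, {box})"}),
    )

def apply(state: frozenset, a: Action) -> frozenset:
    assert a.pre <= state, "preconditions not met"
    return (state - a.delete) | a.add

s0 = frozenset({"on(a, boxC1)", "connected(a, b)"})
print(apply(s0, transport("boxC1", "a", "b")))  # boxC1 is now on b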

Termination: If the agenda is empty and all variables in B are instantiated, verify:
  for every kind of mental attitude Attitude, let Lc and Lr be empty lists of attitudes;
  for every identifier N of a mental attitude of the kind Attitude:
    if Attitude(N, W, T) is true in M, where W is the weight of the attitude and T is the confidence of the rule,
      then add Attitude(N, W, T) to Lc;
      else add Attitude(N, W, T) to Lr;
  add the classification obtained for this Attitude to Classif_Attitudes.
Given the classification of all the mental attitudes (Classif_Attitudes), check whether they comply with the plan acceptability condition specified in E. If they do, return the plan; otherwise, fail.

Fig. 6. Cut condition of the Ag-UCPOP algorithm.
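The classification step of Fig. 6 can be paraphrased in Python as follows. This is our transcription for clarity, with dictionaries standing in for the attitude terms; it is not the authors' implementation.

from typing import Callable, Dict, List, Tuple

def classify(attitudes: List[dict],
             holds_in_plan: Callable[[dict], bool]) -> Dict[str, Tuple[list, list]]:
    """Split each kind of attitude into accomplished (Lc) / not accomplished (Lr)."""
    classif: Dict[str, Tuple[list, list]] = {}
    for att in attitudes:
        lc, lr = classif.setdefault(att["kind"], ([], []))
        (lc if holds_in_plan(att) else lr).append(att)
    return classif

def cut_condition(agenda_empty: bool, classif, acceptability) -> bool:
    """Fig. 6: the plan is a solution only if the classic UCPOP test holds
    *and* the classification passes the condition specified in E."""
    return agenda_empty and acceptability(classif)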

It can be seen that, besides complying with the UCPOP requirements, special care is taken for the plan to fulfill the conditions established in E. To do this, the planner examines the achieved plan to determine whether it is acceptable according to the agent's mental state. In order to achieve this, the planner classifies the mental attitudes into accomplished and not accomplished; after that, it verifies whether the obtained classification of the mental attitudes satisfies the expectations established in E. Fig. 6 details the cut condition, generalized for any type of attitude. We can see how, after completing the classification for the current plan, two different sets exist for each type of attitude: one with the accomplished attitudes (Lc) and the other with the attitudes that have not been achieved (Lr). The content of the sets depends on the achieved plan. Likewise, the acceptance of the plan on the part of the agent depends on the classification achieved and the conditions established in E.

For example, Fig. 7 lists a possible agent's mental state (variable M) for the BTP, and Fig. 8 shows a feasible solution to the problem. The classification achieved for the solution presented in Fig. 8 places the preferences preference(pref001, 2, 90) and preference(pref002, 5, 85) in the set Lc. The set Lr contains the preferences preference(pref003, 7, 100), preference(pref005, 2, 50) and preference(6, 80). These sets constitute the classification of the mental state presented in Fig. 7 with respect to the solution shown in Fig. 8.

1. preference(pref001, 2, 90) :- plan(PLAN), actionList(PLAN, Actions), length(Actions, Leng), Leng < 4.
2. preference(pref002, 5, 85) :- visit(b).
3. preference(pref003, 7, 100) :- plan(PLAN), actionList(PLAN, Actions), member(Actions, transport(Box, d, f)).
4. preference(pref005, 2, 50) :- visit(e).
5. preference(6, 80) :- plan(PLAN), path(PLAN, [d, f, g]).

Fig. 7. Agent’s mental state for the BTP.

transport(boxC1, a, b)
transport(boxC1, b, d)
transport(boxC1, d, g)

Fig. 8. Feasible solution to the BTP.


Acceptability condition: Given the classification of the attitudes, check desire_level(Attitudes) (this predicate is defined in E). If it holds, return the levels associated with the classification; otherwise, fail.

Fig. 9. Specification of the plan acceptability condition.

desire_level(attitudes(Lc, Lr)) :- level(preference, Lc, Lr, middle).
level(preference, ListC, ListI, middle) :- filteringPref(ListI, ListPref), max(ListPref, Value), Value < 6.
level(preference, ListC, ListI, low).

Fig. 10. Example of acceptability condition.
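The following is a sketch of how the acceptability condition of Fig. 10 evaluates our example, ported to Python; the paper expresses it as Prolog clauses evaluated by JavaLog, so the function names below are ours.

def level(lr_weights: list) -> str:
    """'middle' if every unaccomplished preference weighs less than 6."""
    return "middle" if max(lr_weights, default=0) < 6 else "low"

def desire_level(lc_weights: list, lr_weights: list) -> bool:
    # The example agent only accepts plans classified at the middle level.
    return level(lr_weights) == "middle"

# The Fig. 8 solution leaves pref003 (weight 7) unaccomplished:
print(desire_level([2, 5], [7, 2, 6]))  # False: the planner keeps searching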

Once the classification is achieved, the planner evaluates the satisfiability condition using this classification. From the acceptability condition (Fig. 9) it is possible to decide whether the achieved plan fulfills the agent's expectations and can be considered an acceptable solution. Attitudes is composed of two lists of attitudes: one contains the accomplished attitudes (Lc) and the other the not accomplished ones (Lr). For our example, the attitudes contained in Lc and Lr are known; Fig. 10 lists the acceptability condition used as an example. In this example, the agent only specifies two levels of preferences: middle and low. With the classification achieved and the acceptability condition of the agent, the planner rejects the solution, because the level of the classification is low. This is because there is an attitude with weight 7 (preference(pref003, 7, 100)) that is not achieved by the solution. Since the solution is not accepted, a new solution is searched for. In general, there are different levels for the different types of attitudes we work with. However, it is the agent that decides how the attitudes are evaluated; the planner only verifies whether the conditions imposed by the acceptability condition are satisfied. If this is the case, the plan is accepted; otherwise, a revaluation is asked for.

3.3. Action selection

In this part of the planning algorithm, the action with which we will try to satisfy the current goal of the agenda ⟨Q, Ac⟩ is selected (Q represents the current goal of the agenda, and Ac the action that needs the objective). The UCPOP algorithm sets up a non-deterministic selection of the action; when putting this mechanism into practice, we need to establish either a random selection algorithm or a domain-specific algorithm. The latter option is the one suggested by Weld (1994) to improve the selection according to the work domain. In this way, when the domain changes, it is necessary to redefine the selection mechanism; moreover, if we wish to change the conditions required on the result, we must modify the selection mechanism even if the domain does not change. If, instead, we implement a more generic selection mechanism that can adapt itself to different domains, we lose the domain knowledge, which can be useful to find a quality solution.

Fig. 11 shows a feasible instance of the elements that compose the selection point in the Ag-UCPOP algorithm. In this case, there are three new actions (requestBox(From, b1), buyBox(From, b1, Mon) and produceBox(Material, b1)) and two existing actions (produceBox(brownPaper, b1) and produceBox(carton, b1)); the current goal is box(b1) and the consumer action is wrapGift(redPaper, b1, Gift).

Fig. 11. Example of a feasible instance of the selection point of the Ag-UCPOP algorithm.

In the mental state of the agent, we recognize attitudes that can affect the decision directly or indirectly. We say that an attitude affects the planner's decision in a direct way if it was specifically defined with respect to actions or some combination of plan elements, i.e., designed to interact with elements of the plan. For example, the agent may have a stronger preference for the action travel than for the action send; in a dilemma, the planner will choose the action travel first. Also, it is possible to specify a preference for a part of an action, for example, a preference for reading book ZZ over book YY. The latter allows us to specify action centric preferences and not only state centric preferences (Sohrabi et al., 2009; Gerevini & Long, 2006). On the other hand, the mental attitudes that indirectly affect the decision are those that the agent has in its mental state and that have not necessarily been defined to interact with the planner. An example of this could be a case in which we have to choose a means of transport to go to a particular city, and among the alternatives there is one that travels by plane.


1. preference(_, buyBox(From, B, M), _, 1, 100).
2. preference(madePurchase(juan, A), 12, 50).
3. preference(madePurchase(carlos, A), 2, 80).
4. preference(madePurchase(Alguien, C), 2, 100).
5. preference(_, requestBox(From, B), _, 5, 70).
6. preference(box(C), requestBox(pepe, C), _, 4, 80) :- sizeBox(C, big).
7. preference(_, requestBox(From, C), _, 6, 100) :- debtorList(Lis), member(From, Lis).
8. preference(_, requestBox(juan, C), _, 3, 80).
9. preference(_, produceBox(brownPaper, C), _, 4, 70).
10. preference(box(C), produceBox(carton, C), wrapGift(P, C, R), 10, 90).
11. preference(material(C, carton), 16, 50).
12. preference(material(C, M), 9, 60) :- box(C), hardMaterial(M).
13. preference(material(C, M), 9, 60) :- paper(C), not(hardMaterial(M)).

Fig. 12. A feasible mental state for the enunciated problem.

Fig. 13. Action selection of the Ag-UCPOP algorithm.

Pair effect-action                          | Relevance
⟨box(b1), produceBox(carton, b1)⟩           | 22.4
⟨box(b1), requestBox(From, b1)⟩             | 3.5
⟨box(b1), buyBox(From, b1)⟩                 | 3
⟨box(b1), produceBox(brownPaper, b1)⟩       | 2.8
⟨box(b1), produceBox(Material, b1)⟩         | 0

Fig. 14. Table with the results of the classification of the options in Fig. 11 according to the mental attitudes shown in Fig. 12.

If the agent has a preference for traveling by plane, selecting that action satisfies the preference. Fig. 12 shows a possible agent's mental state for the example listed in Fig. 11. In this example, the indirect mental attitudes are attitudes 2, 3, 4, 11, 12 and 13; the others are direct attitudes. The decision about which action will be used to satisfy the current goal of the agenda must be made taking into account all the mental attitudes, both those that affect the decision directly and those that affect it indirectly. In Fig. 13, we can see how the action selection is extended to work jointly with the agent's mental state.

Fig. 13 shows how, starting from the objective ⟨Q, Ac⟩, all the effect-action pairs that could be used to achieve the objective are searched. The planner must select one from all candidate pairs. To make this selection, the relative degree of relevance of each candidate pair is calculated with respect to the accomplishment of the relations existing in M. Fig. 13 shows that the calculation of the relevance is divided into two parts: first, we calculate the attitudes that are activated in an indirect way by the inclusion of the candidate action; then, the attitudes that are activated in a direct way. After this calculation, the planner establishes, in descending order according to the relevance degree Ih, the sequence with which it will try to achieve a solution to the present problem. In this way, the pair ⟨Rj, Ai⟩ that has the highest relevance degree will be the first option the planner takes to try to solve the problem. Fig. 14 lists the classification of the candidate pairs of Fig. 11 according to the attitudes detailed in Fig. 12. The planner always selects the first option; if this does not lead to a solution, it selects the next one. This example shows how the search tree is defined according to the agent's mental attitudes. Similar options have different degrees of relevance, according to the values of their variables.
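The selection step can be summarized with the following sketch, which ranks candidate effect-action pairs by an additive relevance degree. The aggregation shown is illustrative only, since the paper does not spell out the exact computation behind the values of Fig. 14.

from typing import Callable, Iterable, List, Tuple

Pair = Tuple[str, str]  # (effect, action), e.g. ("box(b1)", "produceBox(carton, b1)")
Rule = Tuple[Callable[[Pair], bool], float]  # (activation test, weighted contribution)

def relevance(pair: Pair, indirect: List[Rule], direct: List[Rule]) -> float:
    # First the attitudes activated indirectly, then the direct ones.
    return (sum(w for test, w in indirect if test(pair)) +
            sum(w for test, w in direct if test(pair)))

def selection_order(pairs: Iterable[Pair], indirect: List[Rule],
                    direct: List[Rule]) -> List[Pair]:
    """Descending relevance: the planner tries the first pair and, if it
    does not lead to a solution, backtracks to the next one."""
    return sorted(pairs, key=lambda p: relevance(p, indirect, direct), reverse=True)

pairs = [("box(b1)", "produceBox(carton, b1)"), ("box(b1)", "requestBox(From, b1)")]
direct = [(lambda p: "produceBox(carton" in p[1], 10.0)]  # cf. attitude 10 of Fig. 12
indirect = [(lambda p: "carton" in p[1], 16.0 * 0.5)]     # cf. attitude 11, weight x conf.
print(selection_order(pairs, indirect, direct))           # carton option ranked first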


Box Transportation Problem (complex version)

load(Box, Truck, Place)
  Pre: on(Place, Box), on(Place, Truck)
  Post: not(on(Place, Box)), in(Truck, Box)

move(Truck, From, To)
  Pre: on(From, Truck), connected(From, To)
  Post: not(on(From, Truck)), on(To, Truck)

unload(Box, Truck, Place)
  Pre: in(Truck, Box), on(Place, Truck)
  Post: not(in(Truck, Box)), on(Place, Box)

The box boxC1, which is in city a, has to be transported to city g.

Fig. 15. Complex version of the BTP. In this case there are three kinds of action to solve the problem.

Fig. 16. Consistency control in the Ag-UCPOP algorithm.

Fig. 17. A feasible set of constraints for the BTP.

3.4. Consistency control

The modifications provoked by the instantiation of a new causal link make it necessary to control, besides the threatened causal links, that all the actions are kept in a state consistent with the agent's mental state. At this point, the planner has the possibility of checking the validity, from the agent's point of view, of a partially achieved solution.


Fig. 18. Schema of a plan in execution.

Fig. 19. Plan obtained from the execution of the algorithm without the specification of a mental state. This plan is equal to the one obtained by executing the UCPOP algorithm without the extension.

If the partial solution turns out to be invalid for the agent, then it is not feasible to use it for this agent. An important aspect to be considered at this stage is that the current plan is not necessarily a solution to the problem; it is likely to be incomplete. This clearly distinguishes this part from the first extension explained above, in which it is known that the plan is a solution to the problem. Because the plan we have is not necessarily a solution to the problem, the same controls that were applied in the first part are not possible, as these are based on the idea that the plan solves the initial problem. In this part, it is not possible to discard a plan because it does not fulfill an obligation, since in a later iteration this may be fulfilled. This means that only attitudes that can be controlled in a partial solution are dealt with. The constraints, in particular, specify elements that are not welcome in the plan; therefore, if a constraint is triggered, then there is a plan portion that is not wanted in the solution, and this constraint will remain active until a solution is achieved, if there is a solution at all.

Fig. 15 shows a complex version of the box transportation problem presented in Section 3.2. In this case, the action transport has been replaced by the actions load, move and unload. We use this problem to explain the consistency control. In Fig. 16, we can see how the causal link protection is extended to work jointly with the agent's mental state.


Fig. 20. Plan obtained after the incorporation of the agent's mental state. On the right, a schema with the final structure of the plan and its principal links is shown; on the left, the involved actions are listed.

This figure shows how the planner verifies that the plan does not break any constraint when a partial plan is achieved. (Note that if a plan breaks a constraint, then a part of the plan is not valid for the agent, and it is not feasible to use it for execution, because it breaks a constraint imposed by the agent in its mental state. We may thus be discarding solutions to the general problem, but they are solutions that are invalid for this agent at this time.) Likewise, we can see how the constraints are divided into two groups: on the one hand, those that are controlled over the complete plan, for example, that the number of actions of the plan is lower than 6; on the other hand, those that are applied to a specific action, for example, that the action move never has city c as a destination.

For our example, we use the constraints shown in Fig. 17, which lists four plausible constraints for the problem shown in Fig. 15. The first constraint specifies that a plan that has an action with the effect on(c, truck1) is not valid, i.e., truck1 can never stop in city c. Similarly, the second constraint specifies that truck2 can never stop in city e. These two are action constraints, i.e., constraints that are verified on the actions. The third constraint specifies that a plan that travels from b to d and next to f is an invalid plan. The fourth constraint specifies that it is not feasible to use the same truck more than 3 times. These two are plan constraints, i.e., constraints that are verified on the whole plan. At this point, the planner verifies which constraints are broken and which are not, and it decides, taking into account the broken constraints and the information in E, whether the plan is valid or not. If it is valid, the planning process goes on normally; otherwise, a plan revaluation is requested, causing the search for a new alternative.

Fig. 18 shows an intermediate step in the plan generation for the BTP problem. The last selection was connected(d, e) from the initial state; this selection provoked the instantiation of a variable in actions A2 and A3 with the value e (highlighted in the figure with a circle). In this example, the plan turns out to be invalid, since constraint 2 (Fig. 17) is broken by the last selection, because truck2 stops in city e. In this case, the planner redoes the last selection, searching for a new alternative. Because there may be constraints that we do not want to consider, it is possible to specify in E the minimum weight, as well as the minimum confidence, that a constraint must have to be taken into account.
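The following is a sketch, under our reading of this subsection, of the extended consistency check: action-level constraints are tested against each step, plan-level constraints against the whole partial plan, and only constraints above the threshold given in E are considered. The names and types here are illustrative, not the paper's API.

from typing import Callable, List, Tuple

# A constraint pairs a weight (its importance in M) with a predicate that
# returns True when the constraint is broken.
ActionConstraint = Tuple[int, Callable[[str], bool]]      # tested on each action
PlanConstraint = Tuple[int, Callable[[List[str]], bool]]  # tested on the plan

def breaks_constraints(partial_plan: List[str],
                       action_cs: List[ActionConstraint],
                       plan_cs: List[PlanConstraint],
                       min_weight: int) -> bool:
    """True if any sufficiently important constraint is broken; the planner
    then undoes the last decision and backtracks."""
    for w, broken in action_cs:
        if w >= min_weight and any(broken(a) for a in partial_plan):
            return True
    return any(w >= min_weight and broken(partial_plan) for w, broken in plan_cs)

# Constraint 2 of Fig. 17: truck2 may never stop in city e.
truck2_in_e: ActionConstraint = (8, lambda a: a == "move(truck2, d, e)")
# Fourth constraint of Fig. 17: the same truck is used at most 3 times.
truck1_overused: PlanConstraint = (5, lambda p: sum("truck1" in a for a in p) > 3)

plan = ["load(boxC1, truck2, d)", "move(truck2, d, e)"]
print(breaks_constraints(plan, [truck2_in_e], [truck1_overused], 0))  # True -> redo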

4. Case study

With the purpose of evaluating the present algorithm, we implemented a debugging tool which allows for the execution of the planning algorithm as well as the edition and manipulation of the agent's mental state. For the implementation of the mental state of the agent, we used the concept of logic modules (O'Keefe, 1985) by means of the programming language JavaLog (Amandi, Zunino, & Iturregui, 1999).

The experimentation with the algorithm was made using the problem of travel planning in a group of islands as the work domain. In the proposed problem, there are three kinds of transport: car, boat, and plane. All the islands are accessible by boat; nevertheless, only some of them are accessible by plane and by car. From the domain, the actions, an initial state, and a final state were defined. In the initial state, five people were included who wanted to be transported to one island. First, we searched for a plan without the specification of a mental state; the plan is shown in Fig. 19.


A specification of the agent's mental state was then added to the initial definition. In this mental state, the preferences of each passenger, as well as their constraints, were included. Conditions about the use of the means of transport were also added. The results obtained are presented in Fig. 20, which shows the plan obtained with the Ag-UCPOP algorithm. In this case we can appreciate how the plan structure has changed with respect to the plan illustrated in Fig. 19: now, three passengers travel by boat, one by helicopter and the last one by car.

In the previous example, we can see the differences, at a structural level, between a plan that contemplates the agent's mental state and one that does not. When making a quantitative comparison between Ag-UCPOP and other planning algorithms that do not consider the agent's mental state, we find that the main factor of comparison is the quality of the achieved plan. In the work presented by Nguyen and Kambhampati (2001), a quantitative comparison of planning algorithms according to the quality of the achieved plans is made. In order to do this, they introduce the following three metrics: (i) the cumulative cost of the actions involved in the plan (if the actions do not have a cost, cost 1 is assumed); (ii) the minimum time needed for the plan execution; (iii) the flexibility of the plan execution (the capacity of supplying alternative orders). However, these metrics take into account only plan attributes. In our case, the quality of a plan varies according to the mental state specification of each agent; therefore, a plan that is optimal for one agent is not necessarily so for another. If we seek a quantitative measurement that makes it possible to compare the solutions obtained with the Ag-UCPOP algorithm against the ones obtained by other planning algorithms, we must use the mental state used for the plan conception, since the plan was built according to it.

A feasible alternative consists in establishing measurements over the number of attitudes accomplished by the obtained solutions. However, this attribute can be adjusted in the satisfiability conditions that are established by the agent in its mental state (specified in E). For instance, we can specify in the agent's mental state that a plan is acceptable if it fulfills 90% of its preferences, which would make a comparison against the complete algorithm biased. With the objective of making an impartial comparison between the proposed algorithm and the other planning algorithms, we omitted the satisfiability condition of the agent. Based on this, we used a precision formula that measures the number of attitudes that are accomplished by the achieved plan, in relation to the total number of attitudes, weighted according to the weight of each attitude in question (Formula (1)).

PlanPrecision = \frac{\sum_{j \in \text{Accomplished Attitudes}} Weight(Ac_j) \cdot Conf(Ac_j)}{\sum_{i \in \text{Attitudes}} Weight(A_i) \cdot Conf(A_i)} \qquad (1)

Fig. 21. Graphic with the average values of the precision Formula (1) for the evaluated algorithms (Ag-UCPOP, UCPOP, and GPlan).
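Formula (1) translates directly into code; the following is our transcription, where each attitude is reduced to its (weight, confidence) pair.

from typing import List, Tuple

def plan_precision(accomplished: List[Tuple[int, int]],
                   all_attitudes: List[Tuple[int, int]]) -> float:
    """Weight*confidence mass of the accomplished attitudes over the total."""
    total = sum(w * c for w, c in all_attitudes)
    return sum(w * c for w, c in accomplished) / total if total else 0.0

# E.g. the solution of Fig. 8 against the mental state of Fig. 7:
lc = [(2, 90), (5, 85)]                       # pref001, pref002
lr = [(7, 100), (2, 50), (6, 80)]             # pref003, pref005, clause 5
print(round(plan_precision(lc, lc + lr), 2))  # 0.32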

The Ag-UCPOP was compared with implementations of the UCPOP and GraphPlan (Blum & Furst, 1997) algorithms. For the comparison we made different tests, varying the agent's mental state as well as the initial and final states. The results are shown in Fig. 21, which graphs the average precision values obtained in the comparison of the three algorithms. The illustrated results show that the Ag-UCPOP algorithm achieves better precision than the other algorithms. The comparison was made over the travel domain. It is necessary to remember that in the agent's mental state there are two different kinds of attitudes involved in the comparison: preferences and constraints. If we give more importance to the constraints, because a solution that breaks a constraint is an invalid solution for the agent, the differences among the planners become more significant, especially with respect to the GraphPlan algorithm, since it breaks several constraints in the majority of the cases.

The previous comparison was made by modifying the content of the mental state. Next, we present a comparison made with only one mental state, but asking for revaluations. In the graphic shown in Fig. 22, the y axis represents the plan precision and the x axis represents the number of revaluations. The comparison was made among the Ag-UCPOP algorithm, the UCPOP algorithm, and an adaptation of UCPOP to the travel domain. Between UCPOP and its adaptation, we can see that the adaptation performs better. However, Ag-UCPOP performs, in general, better than the adaptation. Also, we can see that the first plan achieved by Ag-UCPOP is one of the best solutions achieved among all the iterations; in fact, it is superior to the best plan achieved by the adaptation of UCPOP (iteration number 12).

Fig. 22. Plan precision versus the number of iterations made (Ag-UCPOP, UCPOP (m), and UCPOP).

Fig. 23. Number of attitudes accomplished by the solution versus the time used to achieve the solution (Ag-UCPOP, UCPOP P1, and UCPOP P2).


As mentioned above, the previous evaluations were made without considering the satisfiability condition. To evaluate the satisfiability conditions, we carried out a practical experiment in which we measured the response time according to the specified satisfiability condition. For this experiment, the condition was specified as a number of attitudes to accomplish. The graphic in Fig. 23 shows, on the x axis, the number of attitudes requested in the cut condition and, on the y axis, the time consumed to achieve the solution. Since, in UCPOP, the first solution achieved depends on the established order of the actions in the set of available actions, we performed two different experiments varying this order.

As we can see, the first solution achieved by the Ag-UCPOP algorithm accomplished 6 attitudes; this is due to the fact that the mental state of the agent is taken into account in the plan conception. When we asked for a plan that could accomplish 7 attitudes, the response was obtained in 12 s. In the tests we did with the UCPOP algorithm, four different attitudes were accomplished by the first solution. When we asked for 5 attitudes in the first experiment, the response was obtained in 57 s; when we asked for 6 attitudes, the response was obtained in more than 1 h. In the second experiment, the response time shot up when we asked for a plan that could accomplish 5 attitudes. These critical points can be explained if we consider that, in order to fulfill more attitudes, it is necessary to combine the different stages of the solution. In our domain, we have to vary the mechanisms of transportation according to the travellers' preferences; this, in UCPOP, implies a high execution time, due to the linear way in which a given solution is reached.

5. Related work

Most of the work on planning has been addressed at getting results in the shortest time (Hoffmann, 2001; Nguyen & Kambhampati, 2001; Weld, 1996). Concerning planning and agents, much research into the development of planning in multi-agent systems has been carried out; however, the development of planning algorithms that search for quality plans from the point of view of the agent is a little explored field. The work presented in Muller and Pischel (1993) showed a first approximation of collaborative work between the agent and the planner; however, the interaction between the agent and the planner is dealt with in a simple way, and the emphasis is put on the problem of plan coordination among multiple agents.

In the last years, the focus was put on the importance of plan quality. In the fifth International Planning Competition (IPC-5), it was proposed to extend PDDL with new constructions that increase its expressive power regarding plan quality specification, by allowing the user to express strong and soft constraints on the different states of the plan. In this competition, several planners were presented; most of them transformed the problem into previous versions of the language and dealt with the extensions in a separate way (Baier et al., 2006; Edelkamp, 2006; Edelkamp et al., 2006). Some planners simplified the problem to a "simple preferences" problem (Benton, Kambhampati, & Do, 2006) and only a few planners worked with the whole problem specification (Chih-Wi, Wah, Ruoyun, & Yixin, 2006). Baier et al. (2007) proposed a set of techniques for planning with temporally extended preferences as an extension of the work presented in IPC-5.

As mentioned above, several of the algorithms presented in the competition dealt with the new extensions in a separate way: they transformed the extended PDDL into a previous version, and the constraints were represented and used by another mechanism. In this context, we would need to transform the agent's mental state into PDDL 3.0 notation, finding the correct location for each attitude; after that, the planning algorithm would transform this notation into a previous version, separating the attitudes again. This way of working is not the best one to deal with the agent's mental attitudes. Also, as mentioned in Sohrabi et al. (2009), PDDL 3.0 preferences are expressive, but they are solely state centric, identifying preferred states along the plan trajectory.

Another important issue with PDDL is the metric function used to measure plan quality. The metric function defines the quality of a plan depending on the preferences that have been achieved by the plan; this metric uses the function "is-violated name", which returns the number of individual preferences in the name family of preferences that have been violated by the plan. This definition measures the number of broken preferences, but does not take into account the number of times that a preference is accomplished in a plan. For example, a plan that accomplishes a given preference once is as valid as a plan that accomplishes the preference 10 times. In our case, the latter is particularly relevant because the preferences define the agent's behavior; thus, we want the plan to accomplish a preference several times, because this better reflects the behaviour of the agent.

On the other hand, the works presented by Ambite and Knoblock (2001) or Bäckström (1998) try to improve the quality of the achieved plan, but by specifying metrics with respect to domain attributes. In our case, we have as an objective the pursuit of a quality plan, but from the point of view of the agent's mental state. This makes a plan that is considered a high quality plan by one agent of poor quality for another, all without changing the work domain.

6. Conclusions

In this work, we introduced a planning algorithm specifically designed for agents: an algorithm that takes into account the agent's preferences and constraints to obtain higher quality plans than those obtained by classical planning algorithms. This planner allows the agent both to rule out the solutions that are not acceptable to it and to specify the characteristics of those that it prefers. In this way, we optimized the plan generation process by filtering solutions that the agent considers incompatible as soon as they appear, and by building up the solution taking into account the agent's preferences.

Experimental results show how the agent's mental state affects the obtained solution, even when using a simple version of it. In the experiments, we took special care with the satisfiability condition, because the use of this condition offers the advantage of adjusting the solution much more accurately to the agent's desires. Thus, we first measured the output quality of our algorithm without considering cut conditions, with the purpose of keeping the comparisons impartial. After that, we evaluated all the characteristics of our algorithm, also considering this advantageous factor. In both cases, the experimental results show that the plans obtained with our algorithm denote better precision compared to the other algorithms, especially when using the cut condition.

References

Amandi, A., Zunino, A., & Iturregui, R. (1999). Multi-paradigm languages supporting multi-agent development. In F. J. Garijo & M. Boman (Eds.), Multi-agent system engineering. Lecture notes in artificial intelligence (Vol. 1647). Berlin: Springer-Verlag.
Ambite, J. L., & Knoblock, C. (2001). Planning by rewriting. Journal of Artificial Intelligence Research, 15, 207–261.
Bäckström, C. (1998). Computational aspects of reordering plans. Journal of Artificial Intelligence Research, 9, 99–137.
Baier, J., Hussell, J., Bacchus, F., & McIlraith, S. (2006). Planning with temporally extended preferences by heuristic search. In 5th international planning competition booklet (IPC-2006), Lake District, England (pp. 20–22).
Baier, J., Bacchus, F., & McIlraith, S. (2007). A heuristic search approach to planning with temporally extended preferences. In Proceedings of the 20th international joint conference on artificial intelligence (IJCAI-2007), Hyderabad, India (pp. 1808–1815).
Benton, J., Kambhampati, S., & Do, M. B. (2006). YochanPS: PDDL3 simple preferences and partial satisfaction planning. In 5th international planning competition booklet (IPC-2006), Lake District, England (pp. 54–57).
Blum, A., & Furst, M. (1997). Fast planning through planning graph analysis. Artificial Intelligence, 90, 281–300.
Chih-Wi, H., Wah, B., Ruoyun, H., & Yixin, C. (2006). New features in SGPlan for handling preferences and constraints in PDDL3.0. In 5th international planning competition booklet (IPC-2006), Lake District, England (pp. 39–41).
Edelkamp, S. (2006). Optimal symbolic PDDL3 planning with MIPS-BDD. In 5th international planning competition booklet (IPC-2006), Lake District, England (pp. 31–33).
Edelkamp, S., Jabbar, S., & Naizih, M. (2006). Large-scale optimal PDDL3 planning with MIPS-XXL. In 5th international planning competition booklet (IPC-2006), Lake District, England (pp. 28–30).
Fikes, R. E., & Nilsson, N. (1971). STRIPS: A new approach to the application of theorem proving to problem solving. Artificial Intelligence, 5(2), 189–208.
Gerevini, A., & Long, D. (2006). Plan constraints and preferences in PDDL3: The language of the fifth international planning competition. In 5th international planning competition booklet (IPC-2006), Lake District, England (pp. 7–13).
Ghallab, M., Nau, D., & Traverso, P. (2004). Automated planning: Theory and practice. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc.
Hoffmann, J. (2001). FF: The fast-forward planning system. AI Magazine, 22(3), 57–62.
McAllester, D., & Rosenblitt, D. (1991). Systematic nonlinear planning. In Proceedings of AAAI-91, Anaheim, CA (pp. 634–639).
Muller, J., & Pischel, M. (1993). The agent architecture InteRRaP: Concept and application. Technical Report RR-93-26, DFKI Saarbrücken.
Nguyen, X., & Kambhampati, S. (2001). Reviving partial order planning. In Proceedings of the seventeenth international joint conference on artificial intelligence (Vol. 1, pp. 459–465).
O'Keefe, R. (1985). Toward an algebra for constructing logic programs. In Proceedings of the IEEE symposium on logic programming (pp. 152–160). New York: IEEE Computer Society Press.
Penberthy, J., & Weld, D. S. (1992). UCPOP: A sound, complete, partial order planner for ADL. In Proceedings of the international conference on knowledge representation and reasoning (KR) (pp. 103–114).
Rao, A. S., & Georgeff, M. P. (1995). BDI-agents: From theory to practice. In Proceedings of the first international conference on multiagent systems, San Francisco, California (pp. 312–319).
Shoham, Y. (1993). Agent-oriented programming. Artificial Intelligence, 60, 51–92.
Sohrabi, S., Baier, J. A., & McIlraith, S. A. (2009). HTN planning with preferences. In Proceedings of the 21st international joint conference on artificial intelligence (IJCAI-09), Pasadena, CA, USA (pp. 1790–1797).
Weld, D. S. (1994). An introduction to least commitment planning. AI Magazine, 15, 27–61.
Weld, D. S. (1996). Planning-based control of software agents. In Proceedings of the third international conference on artificial intelligence planning systems, Edinburgh, Scotland (pp. 268–274).