Real-time supply chain control via multi-agent adjustable autonomy

Real-time supply chain control via multi-agent adjustable autonomy

Computers & Operations Research 35 (2008) 3452 – 3464 www.elsevier.com/locate/cor Real-time supply chain control via multi-agent adjustable autonomy ...

541KB Sizes 2 Downloads 49 Views

Computers & Operations Research 35 (2008) 3452 – 3464 www.elsevier.com/locate/cor

Real-time supply chain control via multi-agent adjustable autonomy Hoong Chuin Laua,∗ , Lucas Agussurjab , Ramesh Thangarajoob a School of Information Systems, Singapore Management University, 80 Stamford Road, Singapore 178902, Singapore b The Logistics Institute Asia Pacific, National University of Singapore, 11 Law Link, Singapore 119260, Singapore

Available online 12 February 2007

Abstract Real-time supply chain management in a rapidly changing environment requires reactive and dynamic collaboration among participating entities. In this work, we model supply chain as a multi-agent system where agents are subject to an adjustable autonomy. The autonomy of an agent refers to its capability to make and influence decisions within a multi-agent system. Adjustable autonomy means changing the autonomy of the agents during runtime as a response to changes in the environment. In the context of a supply chain, different entities will have different autonomy levels and objective functions as the environment changes, and the goal is to design a real-time control technique to maintain global consistency and optimality. We propose a centralized fuzzy framework for sensing and translating environmental changes to the changes in autonomy levels and objectives of the agents. In response to the changes, a coalition-formation algorithm will be executed to allow agents to negotiate and re-establish global consistency and optimality. We apply our proposed framework to two supply chain control problems with drastic changes in the environment: one in controlling a military hazardous material storage facility under peace-to-war transition, and the other in supply management during a crisis (such as bird-flu or terrorist attacks). Experimental results show that by adjusting autonomy in response to environmental changes, the behavior of the supply chain system can be controlled accordingly. 䉷 2007 Elsevier Ltd. All rights reserved. Keywords: Crisis management; Fuzzy controller; Military warehouse; Multi-agent system; Real-time control

1. Introduction A supply chain system typically comprises a geographically distributed and networked system of participating entities (such as suppliers, retailers and transporters). Real-time management of supply chains in a rapidly changing environment entails effective, efficient and reactive collaboration among participating entities. This is a challenging task because the parties involved in the supply chain have their own resources and objectives. A centralized authority controlling participating entities is likely to be costly and ineffective. In this paper, we model a supply chain system as a multi-agent system. A multi-agent approach allows real-time, decentralized, reactive and collaborative decision making. Decisions can be made asynchronously, thereby allowing the overall supply chain system behavior to react dynamically according to the changing environment. This is increasingly important in the present day’s context of heightened security concerns. For instance, the various departments of defense, ∗ Corresponding author. Tel.: +65 68280229; fax: +65 68280919.

E-mail addresses: [email protected] (H.C. Lau), [email protected] (L. Agussurja), [email protected] (R. Thangarajoo). 0305-0548/$ - see front matter 䉷 2007 Elsevier Ltd. All rights reserved. doi:10.1016/j.cor.2007.01.027

H.C. Lau et al. / Computers & Operations Research 35 (2008) 3452 – 3464

3453

more notably the US Department of Defense, has advocated and endorsed wide-range application of agents technology to control and optimize their logistics and supply chain processes under the general theme of “sense and respond logistics” [1]. Having a centralized system adds burden to supply chain resilience, network reliability, and may even compromise security in some sense. A distributed system on the other hand is scalable, promotes concurrency and is resilient to failures. An agent’s autonomy refers to its capability to make and influence autonomous decisions within a multi-agent system. Adjustable autonomy refers to varying the level of autonomy of the agents dynamically during execution, in effect transferring control from one agent to another. This phenomenon is usually triggered by: (1) the changing environment (such as an accident occurs, and the agent loses its resources, thus reducing its autonomy level), or (2) human interference (e.g. in a critical system like a missile control system, at some point of time, the autonomy of the agent needs to be reduced to zero, and decision making is taken over completely by humans). Determining whether and when such transfer of control must occur is arguably a fundamental research question in real-time multi-agent systems. In this work, we are interested in adjustable autonomy caused by a changing environment. Furthermore, each agent focuses on optimizing its objective function, but this function also changes over time in response to environmental changes. We assume that both the agent objective functions and autonomy levels can be decided by a central authority, the agents are fully cooperative, and information flow is reliable. We propose an efficient approach based on a fuzzy selection process to sense and translate environmental changes to the new autonomy levels and objectives of the agents. In response to these changes, we propose a decentralized optimization algorithm based on coalition formation that allows the agents to negotiate to maintain global consistency while optimizing agent objectives. We apply this approach to two supply chain control problems: one in controlling a military hazardous material storage facility under peace-to-war transition, and the other in supply management through a sudden crisis (such as natural catastrophe or terrorist attack). Experimental results show that under our approach, the supply chain system can be controlled accordingly in response to environmental changes. 2. Related works One typical approach for controlling real-time supply chains is to model it as Markov decision processes and derive suboptimal policy from such model (e.g. [2,3]). These approaches, though theoretically interesting, assume a known (learnable) underlying demand process, which make them unsuitable for practical problems at hand (which generally assume no prior knowledge of the demand process). In this paper, we adopt a multi-agent modeling approach with changing objectives and autonomy levels. The idea of modeling dynamic supply chain coordination as a multi-agent system is not new. For example, in [4], each agent performs one or more supply chain functions and coordinates its decisions with other relevant agents operationally. These agents include order acquisition agent, logistics agent, scheduling agent, resource agent, dispatching agent and transportation agent. They plan and/or control activities in the supply chain collaboratively. In [5], the author proposed modeling and deriving dynamic supply chain management policies in a multi-agent system by means of formulating it as a discrete resource allocation problem and applying a virtual market concept. Zeng and Sycara [6] modeled different business entities in supply chain as autonomous agents and studied their dynamic coordination against cost-effective measures. Chen et al. [7] applied agent-based negotiation to construct virtual supply chains dynamically, and shows, as an example, how it can be used to solve distributed constraint satisfaction problem (DCSP). Swaminathan et al. [8] proposed a general multi-agent simulation framework to model real-time supply chain dynamics. In the domain of agent autonomy, various factors entail changes of agent autonomy over time. For instance, Scerri et al. [9] studied the possibility of transferring autonomy for higher quality decision making with minimizing coordination problems. In this work, the authors limited the transfer to a complete transfer of autonomy and did not consider situations where some autonomy may be retained. Barber et al. [10] proposed three discrete levels of autonomy: “command-driven” where the agent does not make decisions and must obey orders, “consensus” where the agent shares decision making equally with other agents and “locally autonomous” where the agent makes decisions alone. In this paper, we consider a generalized real-life situation where there is a need to vary the autonomy gradually and continuously to suit changing circumstances. Fuzzy control theory has been well studied, and numerous applications to real-life control problems can be found in the literature. Fuzzy control is favored in many real-time applications because of its capability to deal with time constraint, incomplete and corrupted input data. Examples of such applications in real-time inventory control are: [11]

3454

H.C. Lau et al. / Computers & Operations Research 35 (2008) 3452 – 3464

which applied fuzzy control as a man–machine interface to solve an inventory model with imprecise demand, lead time and inventory cost, [12] which applied fuzzy control in the decision process to derive a policy for multiobjective, multi-item production system, and [13] which proposed a decision support system for demand forecasting with fuzzy control. Wang et al. [14] provided an example of solving real-time scheduling problems with a fuzzy approach. Another area of research closely related to our work is the dynamic or DCSP as a formalism for modeling multi-agent coordination and negotiation. For dynamic CSP, constraints can change over time. Bessiere [15] developed methods for adapting consistency computations to such changes, and Ram et al. [16] solved this problem by seeking minimal change on the solution. Modi et al. [17] extended this problem to DCSP as well and proposed the ADOPT algorithm which guaranteed a user-defined optimality. Mailler and Lesser [18] also provided an algorithm (OptAPO) to solve DCSP, which experimentally performs better than ADOPT. The coalition-formation algorithm presented in this paper is inspired by the above existing algorithms for solving DCSP. 3. Problem definition Essentially, the problem we study can be divided into two parts, i.e. sense and response. The sense problem deals with correctly translating all sensed data centrally to new local objectives and autonomy levels for each agent. Given the new objectives and the autonomy levels of the agents, the response problem deals with how the agents should interact with each other to re-establish global consistency and optimality. Let N be the index set of n agents, and each agent i ∈ N holds a set of mutually exclusive variables Ai = {ai1 , ai2 , . . . , ail i }. The domains for the variables held by agent i are Di = {Di1 , Di2 , . . . , Dil i }. The values that can be taken by the variables a11 , a12 , . . . , a21 , a22 , . . . , anl n are restricted by a set of global constraints C={C  1 , C2 , . . . , Cq }. p Each agent i has a set of objective functions Fi ={fi1 , fi2 , . . . , fi i }. Each function is defined as fik : 1  j  li Dij → R. These agents exist in a dynamic environment that are captured by the dynamic environment variables. Let E be the set of m environment variables E = {e1 , e2 , . . . , em } that can be sensed by the agents. The domain for variable ei is the set Dei of real numbers. Each agent senses some of the environment variables. Let the function sensed : N → 2E returns the set of variables that an agent can sense, equivalently sensed(i) ⊆ E. This is to capture the idea that each agent has its own sensor, and able to sense its surrounding. Formally, the sense problem is defined as, given the existing values of the environment variables in E, find  (1) the objective that the agent must focus on, i.e. a mapping for each agent i, Mi : 1  j  m Dej → Fi ; (2) the relative autonomy levels of the agents, i.e. a mapping for each agent i, Li : 1  j  m Dej → [0, 1]. The response problem is defined as, given both the objective fi and autonomy level al i set for each agent i, find an  , a  , . . . , a  , a  , . . . , a  ) which is consistent with C and minimizes the assignment for the agent variables (a11 nl n 12 21 22 agent local objectives. Furthermore, in order preserve social welfare in a multi-agent system, we also define a global objective function as the sum of the relative “errors” of all agents, i.e. minimizing    , a , . . . , a )  fi∗ − fi (ai1 il i i2 al i ∗ , (1) fi∗ i∈N

where fi∗ is the objective value of an optimal assignment to the variables of agent i. This value may be obtained solving a classical constraint optimization problem locally for each agent. Cast in the classical constraint optimization literature, our problem is hence a distributed constraint optimization problem [17,18] augmented with agent local objective functions. 4. Solution approach The overall solution approach is given as follows. For the sense problem, we propose a fuzzy framework to capture the relationships between environment variables and the agent objectives and autonomy levels. The role of the fuzzy framework is to continually capture the sensor data from all the agents and in turn translate these data into new objectives and autonomy level of the agents. Then, based on the objectives and the autonomy levels set, we apply a distributed branch-and-bound algorithm to obtain the solution for the response problem.

H.C. Lau et al. / Computers & Operations Research 35 (2008) 3452 – 3464

3455

4.1. Fuzzy framework The system behavior of a sense-and-response system is often difficult to design. Due to the adaptive nature required of the system, the interpretation of the inputs to the system such that a proper reaction can be designed is a complex task. In this paper, we propose a fuzzy selection process to determine the autonomy and objective of agents. Fuzzy systems have been studied extensively in the literature, and have been applied to a wide range of real-life applications. The advantages of using a fuzzy system are that it generally captures relationships between non-discrete data and handles incomplete data effectively. The 5-stage fuzzy process listed below has been extensively reported in the literature. In the following, we define  ) as consisting of: the function Li (e1 , e2 , . . . , em • Normalization performs a scale transformation of the input metric into a normalized variable. The input variables obtained from the environment will have a wide range of values. These variables must first be normalized to values in the range from 0 to a for some constant a. This may seem to be a simple task but a poor normalization will affect the performance of the fuzzy controller. The objective in normalization would be to obtain values well distributed in the range from 0 to a. Extreme values can be truncated to 0 or a. For each input variable ej ∈ E, its domain Dej = [0, rj ] is normalized into the interval [0, a], and the normalization function is nj (ej ) =



a rj



.ej .

• Fuzzification determines the degree to which the normalized variables belong to each of the appropriate fuzzy sets via membership functions. Fuzzification is the process of assigning a degree of truth to the statement about the input variables. This will show for example the extent that the if statements are true. The number of fuzzy sets can be two or more for each normalized input variable. The first step would be to determine the number of fuzzy sets described in the statements. Since each fuzzy set can be seen as a direct translation from the description of the variables in the relevant application the number of fuzzy sets can similarly be determined in this way. For example, a general low/high description will result in two fuzzy sets while a low low/medium/high description will result in three fuzzy sets. Increasing the number of fuzzy sets beyond a certain level will not result in significant improvement and will unnecessarily complicate the subsequent computational step. Thus the number of fuzzy sets used in the design of the fuzzy solution is completely dependent on the minimum requirements for the problem. The triangular membership function is used as it is the simplest of the commonly used membership functions both in terms of understanding and computation. Modifying the membership functions used may show a difference in the results obtained. The domain [0, a] of each of the normalized variable ej can be divided into several states, called fuzzy sets: s s Ij1 , Ij2 , . . . , Ij j , let Ij = {Ij1 , Ij2 , . . . , Ij j }. This applies to the output variable as well. The partial functions I k for j

1 k sj measure the degree of membership of the normalized value to the set Ijk . In the example of Fig. 1, we have I 2 (0.5a) = 0.25, I 3 (0.5a) = 0.75, while I 1 (0.5a) and I 4 (0.5a) are undefined. j j j j • Fuzzy inferencing, a rule base, also known as the inference engine, links the fuzzy inputs to the fuzzy outputs via linked statements called rules. If there is more than one input then the input statements are linked by the “or” or

1

Tj1

Tj2

Tj3

Tj4

0.75 0.25 0

0.5a Fig. 1. Fuzzification.

a

3456

H.C. Lau et al. / Computers & Operations Research 35 (2008) 3452 – 3464

Oi2

Oi1

1

Oi3

Oi4

from rule1: 0.75 in Oi2 from rule 2: 0.25 in Oi3 0

a 0.45a Fig. 2. Defuzzification.

“and” operator. The implication operator “then” indicates the output statements. The “and” operator is again used to link the output statements. Each of these output statements leads to a fuzzy set. The truncation level of the initial fuzzy sets determines the truncation of the output fuzzy sets. When there are more than one initial fuzzy set then the lowest level of the initial sets indicates the truncation of the output fuzzy sets. A rule is an if–then statement of the form: q

if ej is in Ijk and ep is in Ip and . . . then al i is in Oit . for some p, q, t where al i is the autonomy level of agent i, Oit ∈ Oi , and Oi is the set of fuzzy sets for the output variable al i . In other words, the if-part of a rule contains a set of fuzzy sets of the input variable, and the then-part of the rule contains a fuzzy set of the output variable. A rule will fire if I k (nj (ej )) is defined for all Ijk contained j in the if-part of the rule. The result of firing a rule is the state (fuzzy set) that the output variable in and the degree of the membership which is min{I k (nj (ej ))} for all Ijk in the if-part of the rule. j • Defuzzification the fuzzy sets for the output from different rules that fire are merged and the center-of-area of the combined fuzzy sets is found to get a defuzzified value called the centroid ci . A popular method would be to find the center-of-area of the combined fuzzy set. An example can be seen in Fig. 2, let U be the function defining the membership function of the normalized output variable (e.g. the function that defines the area in Fig. 2), we have a x.U (x) dx ci = 0 a . 0 U (x) dx • Denormalization is the reverse of the normalization process where the scaling biases are removed. The coefficients which are obtained for each agent, can be viewed as weights that have to be applied to determine the autonomy of the agent, i.e. al i = (1/a).ci . The value of a, the membership functions and the rules are defined based on empirical methods. Mi is defined in a similar way. 4.2. Coalition formation-based algorithm Here, we present a coalition formation-based algorithm for the agents to find the assignment to all the variables  , a  , . . . , a  , a  , . . . , a  ) that is consistent with C while optimizing the measure of objective described above, (a11 nl n 12 21 22 or to output no solution if such solution is not found. This algorithm is based on distributed branch-and-bound. The idea is to have the agents form coalitions, and to find local solutions. If these local solutions do not conflict each other with respect to C, then the solution is found. If conflicts between these local solutions occur, then the agents will try to form a bigger coalitions, and then find the consistent solutions within this bigger coalitions. In the worst case, all the agents will form one big coalition, and if consistent solution still cannot be found, the algorithm will output no solution.

H.C. Lau et al. / Computers & Operations Research 35 (2008) 3452 – 3464

3457

The algorithm is divided into three parts, namely initialization, coalition formation, and negotiation. During initialization, each agent i first constructs an optimal assignment to its variables using a branch-and-bound search that maximizes its given objective function fi . This optimal value is then stored in the variable fi∗ . Gi is the set that stores all the agents that are currently in coalition with agent i. The set is initialized to i itself. The variable Li indicates the agent that has the role as the leader in the current coalition of agent i, this agent is the one with the highest autonomy level in the coalition. The set Ki stores all the agents having conflict with agent i, initially set to empty. Variable statei indicates the state of the agent i. After initialization, the agents will enter into coalition formation stage, as shown in Algorithm 1. Algorithm 1 (Coalition formation). FORM-COALITION() 1 ∀j ∈ NEIGHBORS(i), if (CONFLICT(i, j )) 2 then Ki ← Ki ∪ {j } 3 n ← #Ki 4 if (n = 0) 5 then statei ← no_conflict 6 else SEND(invite,al Li ) to all j ∈ Ki 7 statei ← conflict 8 9 when received(invite,al Lj ) do 10 if(al Li > al Lj or statei = forming_coalition) 11 then SEND(reject,al Li ) to j 12 else if(al Li < al Lj ) 13 then SEND(accept,al Lj ) to j 14 SEND(leave,al i ) to all agent k ∈ Gi 15 statei ← forming_coalition 16 when received(reject,al Lj ) do 17 n←n−1 18 when received(accept,al Lj ) do 19 Gi ← Gi ∪ {j } 20 Li ← j where al j  max k∈Gi {al k } 21 SEND(confirm,al i ,Gi ) to all j ∈ Gi 22 n←n−1 23 when received(confirm,al j ,Gj ) do 24 if (Gi = Gj ) 25 then Gi ← Gj 26 SEND(confirm,al i , Gi ) to all j ∈ Gi 27 Li ← j where al j  max k∈Gi {al k } 28 statei ← negotiating 29 when received(leave,al j ) do 30 Gi ← Gi − {j } 31 Li ← j where al j  max k∈Gi {al k } 32 33 if(n = 0) then NEGOTIATE() In coalition formation stage, each agent first checks for any external conflicts. The function NEIGHBORS(i) returns all the agents that share some constraints with agent i, and the function CONFLICT(i, j ) check whether any two agents are in conflict, which is done through exchange of variables. After detecting conflicting neighbors, each agent will send invitations to all its conflicting neighbors to form a coalition (lines 6–7). Along with the invitation, the agent also sends the variable al Li which is the autonomy level of the leader of the current coalition, or the highest individual autonomy level in the set Gi . In lines 9–15, we see that an agent will tend to join a coalition which has a member with the higher

3458

H.C. Lau et al. / Computers & Operations Research 35 (2008) 3452 – 3464

autonomy level. This is important for the convergence of the algorithm. Lines 16–31 of the code deals with concurrent invite and accept, and to make sure that any agent is in at most one coalition at any point of time. The algorithm uses concurrent invite and accept to ensure each agent is in at most one coalition at any point of time. When the agents have formed the coalitions, they will enter the negotiating stage of the algorithm given as shown in Algorithm 2. Algorithm 2 (Negotiation). NEGOTIATE() 1 if (all agents are in no_conflict state) 2 then return (ai1 , ai2 , . . . , ail i ) 3 4 Li ← j where al j  max k∈Gi {al k } 5 if(i = Li ) 6 then SEND((Di1 , Di2 , . . . , Dil i ), fi , fi∗ ) to Li 7 else 8 n ← #Gi 9 when received((Dj 1 , Dj 2 , . . . , Dj l j )fj , fj∗ ) do 10 n←n−1 11 if(n = 0) 12 then do branch and bound search to find an assignment dpq to all the variables apq ∈ Ap for all p ∈ G i f ∗ −fp that minimizes p∈Gi al p ∗ pf ∗ p

13 14 15 16 17 18 19 20

if (no consistent solution found) then return “NO SOLUTION” SEND(dp1 , dp2 , . . . , dpl p ) to all p ∈ Gi when received(di1 , di2 , . . . , dil i ) do assign (di1 , di2 , . . . , dil i ) to ai1 , ai2 , . . . , ail i FORM-COALITION()

If all the agents are in no_conflict state, then a consistent solution has been reached, and each agent will return their current assignment to the variables as the solution (lines 1 and 2). If the agents are in negotiating state, then the agent with the highest autonomy level in a coalition will gather the data from all the other agents in the coalition. A branch-and-bound search will then be performed to find a consistent solution, minimizing the sum of the relative error of each agent in the coalition. If a consistent solution cannot be reached, the algorithm returns no solution. When a solution is reached, it is broadcasted to every member of the coalition, and the agents return to the coalition formation stage. 4.3. Analysis of communication complexity In this section, we provide an analysis of the communication complexity of the above procedure. For clarity, we will concentrate on the number of messages sent, assuming uniform size on the messages. Each agent will go back and forth between Algorithms 1 and 2 until the system converges to a solution or no solution is found. The worst case communication complexity would occur when the problem the agents try to solve is inconsistent (contains no consistent solution), in which case the agents would converge into a single coalition before detecting the inconsistency. Thus, the bound on the number of messages sent can be derived from the number of messages needed by the agents to converge into a single coalition. We will start by analyzing the number of messages needed to form a coalition of k agents, with a maximum of k(k − 1)/2 binary constraints. The number of messages needed to detect conflict between two agents is constant c,

H.C. Lau et al. / Computers & Operations Research 35 (2008) 3452 – 3464

3459

thus in the case of k agents, is ck(k − 1)/2. From Algorithm 1, each agent will then send an invite to each of the conflicting agents, and these invitations will either be rejected or accepted, creating a total messages of 2k(k − 1). In a coalition of k agents only one agent has the highest autonomy level, thus the confirmation messages in Algorithm 1 constitute a factor of kd with constant d. In Algorithm 2, data is centralized to a central mediator agent, and reallocated from the mediator agent to the rest of the agents in the coalition, creating a total number of messages of 2(k − 1). Summing up the factors, we have a total number of messages of (c/2 + 2)k 2 + (c/2 + d)k − 2. The order on which the coalitions are formed also determined the total number of messages sent. One ordering in which coalitions are formed with number of members starting from 2 up to n would give worst case complexity on communication, where the number of messages sent is nk=2 (c/2 + 2)k 2 + (c/2 + d)k − 2 = O(n3 ) where n is the number of agents in the system.

5. Application 1: military storage facility In this section, we apply our proposed framework to solve a real-time control problem in a military storage facility. Fig. 3 shows the outline of the facility. Materials are stored in storage 1, 2, and 3. The storages are connected to the main door through a road network. There are seven intermediate doors between the storages and the main door, and each of them is controlled by a door agent. A request for materials specifies from which storage they are to be retrieved and a deadline (specifying the time they should reach the main door). The materials are to be carried by a vehicle from the storage to the main door through the road network. We assume unlimited number of available vehicles at each storage. Once a vehicle arrives at the main door and delivers the materials, it returns to the storage using the same route. There is a constraint on the maximum number of vehicles on a road segment. Doors are opened to allow vehicles to pass. There are three measures that need to be considered when controlling the intermediate doors: (1) Door temperature: Due to its size, opening and closing of intermediate door increases the temperature of the door and the overall temperature of the facility. Each opening/closing increases the door temperature by a constant, and when the door is idle (whether opened or closed) its temperature drops linearly. Fig. 4 shows an example of changes in door temperature. Threshold temperature is the maximum allowable temperature for an intermediate door. (2) Facility exposure level: When doors are opened, we say the facility becomes “exposed”. The exposure level is measured by the largest connected road segments created by the opening of the doors. Fig. 5 shows the exposure level with the unshaded doors opened. In this example, five partitions are created, with the largest connecting three road segments. Thus, the exposure level of the facility in this situation is 3. (3) Tardiness: Vehicles that arrive at an intermediate door have to wait for the door to be opened before it can pass. Tardiness measures the average waiting time of vehicles arriving at an intermediate door.

Storage 1

DA 1

DA 4

Storage 2

DA 2

DA 5

Storage 3

DA 3

DA 6

TA 1

TA 2 Flow of Supply

DA = Door Agent TA = Traffic Agent Fig. 3. Storage facility.

DA 7

TA 3

3460

H.C. Lau et al. / Computers & Operations Research 35 (2008) 3452 – 3464

Temperture

Threshold

Normal open

close

open close

open

close

Time

periods where vehicles are allowed to pass Fig. 4. Door temperature.

Storage 1

Storage 2

Storage 3

Largest Road Segments Exposed Fig. 5. Facility exposure.

According to the measures above, each door agent has the following two objectives: (1) To maximize the safety level: fsafety = k1 ∗ 1/average_temperature + k2 ∗ 1/exposure_level, where k1 and k2 are normalizing factors. (2) To minimize the tardiness: ftardiness = average waiting time of vehicles arriving at the door. The choice of the objective function to optimize of each agent in each planning period is determined by the fuzzy controller. When one of the functions is chosen as objective, the other will be set as constraint. For example, when minimizing tardiness is chosen as the objective, threshold temperature and maximum allowable exposure become the constraint for door temperature and exposure level, respectively. The input to the fuzzy controller are: (1) critical level (war level) which is a real value between 0 and 1, determined by an external entity, and (2) average temperature of the facility as a feedback from the system. The output of the controller are: autonomy level and the choice of objective to optimize for each agent. These output will, in turn, affect the interaction of the agents when trying to arrive at a consistent schedule for opening and closing of intermediate doors. In their interaction, as described in Section 4.2, the agents will try to minimize the sum of relative error subject to the autonomy levels, i.e. Eq. (1) of Section 3. In our experiment, we measure the average local objectives: fsafety and ftardiness over all the agents to observe the effect of such interaction. We run a simulation of the facility to measure the effectiveness of our approach. Time is divided into periods of day, and in each day, requests arrive at random inter-arrival time. We divide our experiment into three cases based on the input and configuration of the fuzzy controller: Case (1): The fuzzy controller is configured to maintain a constant average temperature. This is done by monitoring the feedback from the system and adjusting the output from the controller. The critical/war level is increased gradually

H.C. Lau et al. / Computers & Operations Research 35 (2008) 3452 – 3464

3461

180 case 1 case 2 case 3

160 140

tardiness

120 100 80 60 40 20 0 1

2

3

4

5

6

7 t

8

9

10

11

12

13

9

10

11

12

13

Fig. 6. Comparison of tardiness.

12 case 1 case 2 case 3

10

safety level

8 6 4 2 0 1

2

3

4

5

6

7 t

8

Fig. 7. Comparison of safety.

over the simulation period. The simulation is run multiple times and the averaged result, measuring the tardiness (of requests) and the safety level of the system, can be seen in Figs. 6 and 7, respectively. We can observe that, as the critical/war level increases, the average tardiness (measured in minutes) of requests drops (Fig. 6), at the expense of safety level (Fig. 7). To be precise, since the average temperature is maintained at a fixed constant, it is at the expense of exposure level. This is the expected behavior of the system as the critical/war level approaches 1. This first case is to measure the reaction of the system against changes in critical/war level. Case (2): This is the opposite of case (1), the critical/war level is set to low fixed constant, and the fuzzy controller is configured (forced) to maintain an increasing average temperature over the period. Again the simulations are performed multiple times and the average result is obtained. The result shows that, although the average temperature increases, the system managed to maintain a steady safety level (Fig. 7), at the expense of tardiness of requests (Fig. 6). This is also the expected behavior of the system, to maintain safety level under peaceful condition. This second case is to measure the reaction of the system against changes in average temperature. Case (3): In the last case, we do not configure the fuzzy controller to maintain any average temperature and left that decision to the controller (there is still a constraint on the maximum temperature). The critical/war level however is increased in every period of the simulation. As the critical/war level increases, the system is able to reduce tardiness as in case (1) (Fig. 6), but with a higher safety level (Fig. 7).

3462

H.C. Lau et al. / Computers & Operations Research 35 (2008) 3452 – 3464

The result in Figs. 6 and 7 is obtained after 30 rounds of simulation and tuning. In each round, 3000 random requests are generated over the period of 13 days. This is performed until we get the balance between the objectives (tardiness, and safety: average temperature and exposure level), and obtain k1 = 75 and k2 = 12. After which, we are able to obtain consistent results in subsequent runnings as presented above, and we see how the system adapts to changes in input and responds accordingly. 6. Application 2: crisis management The recent series of world disasters and other emergencies have lead the international community to react in terms of providing aid supplies (medical, food and other basic necessities). Due to the unpredictable nature of such events and the inability to react quickly, there is often a very costly delay in aid supplies reaching the destinations. One way to reduce the delay cost is to re-direct existing resources towards the destinations. The purpose of re-direction is to fill the gap between an event and the arrival of aid supplies. In this section, we apply our proposed framework to such re-direction control problem in crisis management. Fig. 8 shows a 3-level supply chain with 12 agents, where demands occur at the lowest level (DA4-DA7). Under normal/steadystate condition, the demands comes from known normal distributions. Using these distributions, inventory policy (based on single item, periodic back order, and zero lead time) is derived for each agent. Resource agents will only produce amounts of items at a rate enough for steady-state condition. The fuzzy controller as a central controller will monitor the realization of demands in each period. This will be the only input to the controller. In periods where inconsistencies are detected in at least one of the distribution agent,

RA 1

RA 2

DA 1

RA 3

DA 2

DA 3

Flow of Supply

DA 4

DA 5

DA 6

DA 7

Varying Crisis Supply Requirements

RA = Resource Agent DA = Distribution Agent Fig. 8. Supply chain.

DA 8

DA 9

service level

H.C. Lau et al. / Computers & Operations Research 35 (2008) 3452 – 3464

3463

1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 1

3

5

7

9

11

13

15

17

19

21

23

25

27

29

non-critical agent

Fig. 9. Demand satisfaction (service level) over time.

3500 3000

cost

2500 2000 1500 threshold

1000

critical agent

non-critical agent

500 0 1

3

5

7

9

11 13

15 17 time

19

21

23

25

27

29

Fig. 10. Cost over time.

it will assign the autonomy level and objective function to each agent, and the coalition-formation algorithm kicks in. Thus our proposed framework is invoked only when there is disturbance in demand. Each distribution agent may have one of the following objectives at the critical period: (1) to maximize service level, which is the percentage of the demand satisfied, or (2) to minimize cost, which includes: storage cost, transportation cost from the higher level, and processing cost for mid-level distribution agent. Again, similar to Application 1, the coalition-formation algorithm will try to minimize the sum of the relative error of the agents with respect to the autonomy levels (i.e. Eq. (1) in Section 3), and we measure the effect by looking at the average local objectives of the distribution agents. We run initial experiments to tune the parameters and rules set inside the controller. After which, we performed a test over a 30-day period, and the result can be seen in Figs. 9 and 10. The periods of the experiment can be divided into three phases: (1) initialization phase (periods 1–7), (2) stable phase (periods 8–15) and (3) critical phase (periods 16–30). In phases 1 and 2, the system operates under steady-state. In the third phase, one of the distribution agent has a sudden surge in demand from period 16 onwards. We call this agent the critical agent. Fig. 9 shows the service level of the critical agent and the average service level of the non-critical agents. The increase in service level of the critical agent causes the service level of the non-critical agents to drop gradually. Fig. 10 shows similar results on the cost of such re-direction on the agents. These results show how the system manages to react in response to changes in the input which is similar to the behavior observed in Application 1. 7. Conclusion We studied the problem of adjustable autonomy in the context of sense and response to dynamic environmental changes. Our solution approach has shown promising results for real-world problems involving the design of a control

3464

H.C. Lau et al. / Computers & Operations Research 35 (2008) 3452 – 3464

system for a military storage facility under a peace-to-war transition, as well as the management of a supply chain under sudden crisis. Our approach, however, relies on the availability of experts in providing information for correct tuning of the fuzzy sets and rules set. We believe our proposed approach can be applied to solve other real-time supply chain management problems where drastic changes in environmental variables (such as peace-to-war) is an issue. Acknowledgments This research was supported partially by a research grant from the Singapore Management University. References [1] Office of Force Transformation, Sense and Respond Logistics. Available online at: www.oft.osd.mil/initiatives/srl/srl.cfm . [2] Treharne JT, Sox CR. Adaptive inventory control for non-stationary demand and partial information. Management Science 2002;48(5): 607–24. [3] Lovejoy WS. Stopped myopic policies in some inventory models with generalized demand processes. Management Science 1992;38(5): 688–707. [4] Fox MS, Barbuceanu M, Teigen R. Agent-oriented supply-chain management. International Journal of Flexible Manufacturing Systems 2000;12(2/3):165–88. [5] Kaihara T. Multi-agent based supply chain modelling with dynamic environment. International Journal of Production Economics 2003;85: 263–9. [6] Zeng DD, Sycara KP. Agent-facilitated real-time flexible supply chain structuring. International conference on autonomous agents, workshop on agent based decision-support for managing the internet-enabled supply-chain; 1999. [7] Chen Y, Peng Y, Finin T, Labrou Y, Cost S, Chu B, et al. A negotiation-based multi-agent system for supply chain management. International conference on autonomous agents, workshop on agent based decision-support for managing the internet-enabled supply-chain; 1999. [8] Swaminathan JM, Smith SF, Sadeh NM. Modeling supply chain dynamics: a multi-agent approach. Decision Science 1998;29(3):607–32. [9] Scerri P, Pynadath DV, Tambe M. Towards adjustable autonomy for the real world. Journal of AI Research/JAIR 2002;17:171–228. [10] Barber KS, Goel A, Martin CE. Dynamic adaptive autonomy in multi-agent systems. Journal of Experimental and Theoretical Artificial Intelligence, Special Issue on Autonomy Control Software 2000;12(2):129–47. [11] Dey JK, Kar S, Maiti M. An interactive method for inventory control with fuzzy lead-time and dynamic demand. European Journal of Operational Research 2005;167:381–97. [12] Mahapatra NK, Maiti M. Decision process for multiobjective, multi-item production-inventory system via interactive fuzzy satisficing technique. Journal of Computers and Mathematics with Applications 2005;49:805–21. [13] Petrovic D, Xie Y, Burnham K. Fuzzy decision support system for demand forecasting with a learning mechanism. Fuzzy Sets and Systems 2006;157:1713–25. [14] Wang HG, Rooda JE, Haan JF. Solve scheduling problems with a fuzzy approach. In: Proceedings of the fifth IEEE international conference on fuzzy systems, vol. 1(3); 1996. p. 194–8. [15] Bessiere C. Arc-consistency in dynamic constraint satisfaction problems. In: Proceedings of the ninth national conference on artificial intelligence/AAAI; 1991. p. 221–6. [16] Ran YP, Roos N, van den Herik HJ. Approaches to find a near-minimal change solution for dynamic CSPs. In: Proceedings of the fourth international workshop on integration of AI or OR techniques in constraint programming for combinatorial optimisation problems; 2002. [17] Modi P, Shen W, Tambe M, Yokoo M. ADOPT: asynchronous distributed constraint optimization with quality guarantees. Artificial Intelligence 2005;161:149–80. [18] Mailler R, Lesser V. Solving distributed constraint optimization problems using cooperative mediation. In: Proceedings of third international joint conference on autonomous agents and multiagent systems/AAMAS; 2004. p. 438–45.