European Journal of Operational Research 202 (2010) 707–716
Production, Manufacturing and Logistics
Robust placement of sensors in dynamic water distribution systems
Jianhua Xu a,*, Michael P. Johnson b, Paul S. Fischbeck c, Mitchell J. Small d, Jeanne M. VanBriesen e
a Department of Environmental Management, College of Environmental Sciences and Engineering, Peking University, Beijing 100871, PR China
b Department of Public Policy and Public Affairs, John W. McCormack Graduate School of Policy Studies, University of Massachusetts Boston, 100 Morrissey Blvd, Boston, MA 02125, USA
c Department of Social and Decision Sciences and Department of Engineering and Public Policy, Carnegie Mellon University, Pittsburgh, PA 15213, USA
d Department of Civil and Environmental Engineering and Department of Engineering and Public Policy, Carnegie Mellon University, Pittsburgh, PA 15213, USA
e Department of Civil and Environmental Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA
Article info
Article history: Received 8 August 2007; Accepted 13 June 2009; Available online 18 June 2009
Keywords: Facilities planning and design; Robust optimization; Scenarios; Water distribution systems
Abstract
Designing a robust sensor network to detect accidental contaminants in water distribution systems is a challenge given the uncertain nature of the contamination events (what, how much, when, where and for how long) and the dynamic nature of water distribution systems (driven by the random consumption of consumers). We formulate a set of scenario-based minimax and minimax regret models in order to provide robust sensor-placement schemes that perform well under all realizable contamination scenarios, and thus protect water consumers. Single- and multi-objective versions of these models are then applied to a real water distribution system. A heuristic solution method is applied to solve the robust models. The concept of "sensitivity region" is used to visualize trade-offs between multiple objectives. © 2009 Elsevier B.V. All rights reserved.
* Corresponding author. Tel.: +86 10 62756593. E-mail address: [email protected] (J. Xu).
0377-2217/$ - see front matter © 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.ejor.2009.06.010

1. Introduction

Safe drinking water is vital for human health and well-being. Accidental water contamination can pose a great risk to consumers. An outbreak of acute watery diarrhea in Milwaukee, US, in 1993 affected more than 400,000 people when a microorganism was transported through the distribution system (Mac Kenzie et al., 1994). A waterborne disease outbreak in Walkerton, Ontario, Canada, in 2000 affected 2,300 people who were exposed to contaminated drinking water (Hrudey et al., 2003). Routine periodic monitoring of water quality, as required by the US Environmental Protection Agency (EPA), does not meet the need for early warning of these types of contamination events.

The design of a monitoring system that can detect contaminants in real time is challenging for technical and operational reasons. Advances in chemical and biological sensor technology (Diamond, 1998; Hashemian, 2005) have made it feasible to build such a system. Ideally, we would place sensors at each node of the distribution network to give the earliest possible warning to the full population. However, sensors are generally expensive (e.g., a Hach chlorine sensor costs between USD 3,000 and 5,000), so the "optimal" placement of an affordable number of sensors needs to be considered.

The goal of sensor-placement in water distribution systems is to develop a proactive approach to detect contamination events and
thus to mitigate the impact of these events. A key premise is that a contamination event might occur. However, there is uncertainty as to what the contaminant could be (States et al., 2003), as well as the location, timing, and duration of the event. Furthermore, even if all these details were known, the behavior of the contaminant in the system would remain uncertain because of the dynamic nature of the water flows, as represented by the temporal variation of flow rates and directions. This dynamic nature is driven by the pumping rates and pressures applied by the water utility at different entry points to the system, as well as by the consumption patterns of consumers, which exhibit daily, weekly and seasonal variation. The combination of the uncertainties inherent in the details of a contamination event and the dynamic nature of water distribution systems makes it challenging to provide sensor-placement schemes that work well for all realizable contamination scenarios.

Placing sensors in water distribution systems is a facility location problem, in which sensors act as facilities and water distribution systems are the environments. Significant research has been devoted to optimization models for making facility location decisions subject to financial, physical and policy constraints. For a thorough review of this research field, readers are referred to Chung (1986), Brandeau and Chiu (1989), Owen and Daskin (1998), Current et al. (1990), ReVelle and Eiselt (2005), and Snyder (2006). Here, we summarize the facility location models that have been deployed to address the problem of locating sensors in water distribution systems.

Three types of decision models are commonly used to design strategies for placing sensors in water distribution systems. The
first is deterministic optimization, which is based on either a single hypothetical instance assumed to be representative of the behavior of a water distribution system (Kessler et al., 1998; Kumar et al., 1999), or the expected value of the input data across all contamination scenarios (Krause et al., 2006). The second is stochastic optimization, which optimizes the expected value of an objective (e.g., minimizing the expected volume of contaminated water consumed prior to detection) (Berry et al., 2006). The third is robust optimization, which minimizes the impact of either the worst-case scenario or a set of high-impact events, depending on how robustness is defined (Carr et al., 2006; Watson et al., 2006). Table 1 summarizes the recent research literature that proposes these decision models for placing sensors in water distribution systems (for each category, a few sample papers are listed).

In previous research, neither deterministic models nor stochastic models meet the goal of minimizing the losses in health and life of the population across all realizable contamination scenarios. The optimal solution to a deterministic model performs well for a specific instance, but it could be significantly sub-optimal if other scenarios are realized. A stochastic optimization model accounts for the many contamination scenarios, but requires the specification of the probability of each scenario. It is questionable whether meaningful probabilities can be assigned to random variables about which we have inadequate knowledge. Even if the probability of each scenario can be estimated, the solution is optimal for the expected value of the objective function and often cannot be used to hedge against the worst scenarios. The robust models formulated by Watson et al. (2006) and Carr et al. (2006) focus on the uncertainty related to the contamination location (location uncertainty) and do not explicitly incorporate the uncertainty related to the contamination occurrence time (temporal uncertainty). Because water flow is driven by the population's consumption, and the consumption pattern varies over time, different contamination occurrence times imply different impacts. For example, the impact of a contamination event that occurs at midnight is expected to differ from the impact of one that occurs at noon.

In this paper, we formulate two sets of robust models for designing sensor-placement schemes in water distribution systems to address temporal uncertainty. The first set of models is intended to protect as large a population as possible. The
second set of models identifies as many contamination events as possible (maximizes detection likelihood). We then compare the performance of these models with deterministic models that use the expected value of the input data across all the contamination scenarios. Since robust models tend to be computationally costly, we need to show a clear performance advantage to justify their use. Moreover, none of the above-mentioned robust optimization models addresses the multi-objective nature of this sensor-placement problem. We therefore also propose a multi-objective model that maximizes detection likelihood and minimizes the population at risk, since Krause et al. (2006) show that these two objectives conflict.

Robust optimization addresses uncertainties in problem-input parameters through scenario planning and minimizes losses associated with the worst scenario. It has been applied in diverse domains, including logistics management (Yu and Li, 2000), scheduling (Lebedev and Averbakh, 2006), and facility location (Serra and Marianov, 1998; Burkard and Dollani, 2001). The prominent feature of the robust optimization approach is that a chosen solution performs well even in the worst case. Common robustness criteria include absolute robustness, robust deviation, and relative robustness (Kouvelis and Yu, 1997). The approach based on absolute robustness is also called the minimax approach, while the methods based on robust deviation and relative robustness are called minimax regret approaches. The difference between the latter two lies in the way regret is defined. With a robust deviation criterion, the regret for a scenario s is defined as the difference between (1) the performance of the robust minimax regret solution if scenario s is realized and (2) the performance of the optimal solution to the deterministic model that uses scenario s as its input instance.
With a relative robustness criterion, the regret as defined above is divided by the performance of the optimal solution for scenario s (Kouvelis and Yu, 1997). In the minimax (absolute robustness) approach, feasible solutions are evaluated across all scenarios, and the solution that performs best in the worst scenario is the robust one. In the minimax regret approach, feasible solutions are evaluated across all scenarios, and the solution whose regret is smallest in the worst scenario is the robust one. The choice of criterion should be driven by the specific need (Kouvelis and Yu, 1997). In general, the minimax approach tends to be very conservative and tries to hedge against the worst scenario, while the minimax regret approach is less conservative
Table 1. Summary of the decision models commonly used in placing sensors in water distribution networks.

| Model type | Paper | "Distance"^a | Remarks |
|---|---|---|---|
| Deterministic optimization models | Kessler et al. (1998) | v_ij: the volume of contaminated water consumed system-wide from the moment when node i is injected with contaminant to the moment when the contaminant reaches node j (potential sensor-location node) | Incorporate uncertainties of contamination events, but do not consider the dynamic nature of the systems. |
| | Krause et al. (2006) | x_ai: the total system-wide impact associated with a contamination scenario a until a perfect sensor placed at node i raises an alarm | Consider both the uncertainty of contamination events and the dynamic nature of water distribution systems. "Distances" are averaged across all scenarios. |
| | | t_ij: contaminant travel time from node i to node j | |
| | Lee and Deininger (1992) | w_ij: fraction of water from node i contributed to node j (potential "sensor"-location node) | Focus on interior deterioration of water quality, and do not address external contamination events. Use a set of snapshots to represent the dynamic nature of the system. |
| Stochastic optimization models | Berry et al. (2006) | x_ai: the total impact associated with a contamination scenario a until a sensor placed at node i raises an alarm | Consider both the uncertainty of contamination events and the dynamic nature of water distribution systems. Minimize the expected impact across all contamination scenarios. |
| Robust optimization models | Watson et al. (2006) | d_si: the impact that might incur if a contamination event occurs at node i and a sensor is placed at node s | Address the uncertainties of the contamination locations and provide robust solution with respect to the uncertain contamination locations. |
| | Carr et al. (2006) | -^b | Address the uncertainties of the contamination locations and provide robust solution with respect to the uncertain contamination locations and contamination occurrence probability. |

a The distance here does not mean the physical distance. Its definition is equivalent to the distance in classical location models.
b This model is formulated as a mixed-integer model which is not a variant of the typical facility location model.
and tries to seek opportunities for improvement in decision making. In this paper, we use both criteria.

Solving robust optimization models is more difficult than solving the corresponding deterministic models. For a detailed review of solution methods for robust optimization models, see Snyder (2006) and Kouvelis and Yu (1997). Generally, optimal results for robust facility location models are obtained for problems with special structure, such as the 1-median problem (Chen and Lin, 1998) and the 1-center problem on trees (Burkard and Dollani, 2002); for more general problems, researchers tend to resort to heuristic algorithms. In our research, we solve the robust models with a heuristic method that is computationally affordable for most (if not all) water utilities. The concept of "sensitivity region" (Gunawan and Azarm, 2005) is used to generate the trade-off relationship in multiple-objective optimization.

In the next section, we provide a detailed description of the problem formulation. Section 3 then gives a brief introduction to the solution methods for this type of robust model. Section 4 applies the robust models to place sensors in a real-world water distribution system. Finally, we conclude with a discussion of future research directions.
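As a small numerical illustration of the absolute robustness (minimax) and robust deviation (minimax regret) criteria discussed in this section, the sketch below compares the two on a toy loss table. The candidate solutions, the three scenarios, and all loss values are hypothetical, and the scenario-wise optimum is taken over the listed candidates only.

```python
# Toy comparison of the minimax and minimax regret (robust deviation) criteria.
# All names and numbers are hypothetical and serve only as an illustration.

# loss[x][s]: loss of candidate solution x if scenario s is realized.
loss = {
    "A": [10, 10, 10],
    "B": [2, 3, 12],
    "C": [1, 9, 11],
}
n_scenarios = 3

# Best achievable loss in each scenario (the scenario-wise deterministic optimum,
# restricted here to the listed candidates).
best = [min(loss[x][s] for x in loss) for s in range(n_scenarios)]


def worst_loss(x):
    """Absolute robustness: the largest loss of x over all scenarios."""
    return max(loss[x])


def worst_regret(x):
    """Robust deviation: the largest gap between x and the scenario optimum."""
    return max(loss[x][s] - best[s] for s in range(n_scenarios))


minimax_choice = min(loss, key=worst_loss)
regret_choice = min(loss, key=worst_regret)
print(minimax_choice, regret_choice)
```

In this toy table the minimax criterion selects the uniformly mediocre solution A, while the regret criterion selects B, which is close to the best achievable outcome in every scenario. This mirrors the remark that the minimax approach is the more conservative of the two.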
2. Problem formulation

A sensor network in a water distribution system is expected to (1) protect as large a population as possible, (2) identify as many contamination events as possible, and (3) detect contamination events as quickly as possible to allow for a timely response. These objectives have been used by many authors, either singly or in combination (Berry et al., 2005; Kumar et al., 1999; Ostfeld and Salomons, 2005). In addition, the volume of contaminated water consumed prior to detection has been used as an objective to be minimized (Krause et al., 2006) or as an exogenously specified parameter in a covering model (Kessler et al., 1998).

Minimizing the population at risk is the ultimate goal of water protection, but the variability of population water use over time makes it difficult to use in an objective function. Minimizing the volume of contaminated water consumed prior to detection could be used as a surrogate for minimizing the population at risk, but there are problems with this as well. Domestic consumption tends to be high for non-potable uses (cleaning, landscaping, etc.) at certain times of the day and high for hygiene use, food preparation and consumption at other times. The correlation between water demand and consumptive water use in the context of increasing risk from an introduced contaminant is difficult to compute. However, the volume of contaminated water used captures the dynamic nature of water distribution systems, which changes with the demand pattern: when the demand is high, the volume of contaminated water used is high. Therefore, despite these limitations, we use the volume of contaminated water consumed as a measure of the population at risk.

Water flows from sources to consumers, and contaminants are transported by water flows. Sensors can therefore only detect contamination events that occur upstream from their location. Given a budget-constrained number of sensors, it is generally impossible to detect every possible contamination event in a water distribution network. Instead, we wish to identify as many of the contamination events as possible. Without knowing the distribution of contamination events across a water distribution network, we generally assume that every node has an equal chance of being the contamination origin. With this assumption, maximizing the detection likelihood is equivalent to maximizing the number of covered nodes. This uniformity assumption is made with respect to the locations of the accidental contamination events; in our paper, we wish to identify a sensor-placement strategy that
is robust with respect to the occurrence time of the contamination. A minimax p-center formulation would be able to address both location uncertainty and temporal uncertainty; we leave this to future research.

Quick contaminant detection allows for a timely response. Thus, we would like to make the detection time as short as possible. The objectives of maximizing coverage and minimizing detection time are in conflict when a limited number of sensors are to be placed. Maximizing coverage tends to place sensors at far downstream locations in a water distribution network, where they are able to detect contamination events from a larger fraction of the network. However, for a given contamination event, the further downstream the sensor is, the longer the contaminant takes to reach it. Instead of minimizing detection time directly as an objective, we use the detection time as a measure of the "distance". As a result, detection time is used as an exogenous parameter.

2.1. Assumptions and scenario planning

The physical structure of a water distribution system is a network in which nodes represent water sources, tanks, and junctions (the connections between pipes and points of water withdrawal) and edges represent pipes, valves, and pumps. Contamination events are assumed to result only from contaminants introduced at system nodes. We do not consider contamination events that originate in the middle of a pipe: in this case, a pipe break would be required and the resulting pressure drop would be relatively easy to detect. We also assume that the hydraulic behavior of a water distribution system is not substantially changed by the intentional injection of a contaminant. We assume that sensors are perfect: they can detect contaminants of any concentration with no false negatives or false positives.
Many chemical, biological, and microbial agents are possible contaminants (States et al., 2003). Some of them are reactive and react with water, pipe walls, and/or other agents (such as disinfectant), while others are conservative: they are diluted by water in the system but do not react with water or other agents. Different sensors are required to detect different contaminants. A current subject of research uses surrogates (e.g., turbidity, pH, and organic carbon) as indicators of water quality (USEPA, 2005). However, it is difficult to determine the relationship between surrogates and any specific contaminant in the dynamic and complicated environment of a water distribution system. Therefore, without detailed information on the performance and availability of sensors for a specific contaminant, "sensor" and "contaminant" should be interpreted generally. Thus, the models in this paper are used to optimally locate a known number of sensors to detect a particular contaminant under the assumption that these sensors are specially designed to detect that contaminant.

Scenario planning is often used to represent the uncertainty of the future in business and governmental policy analysis (Becker, 1983). In our research, we use scenario planning to structure the uncertainty inherent in a contamination event. A contamination event will occur according to one scenario from a large set of scenarios. Thus, the scenarios developed should be representative of the possible realizations of the unknown problem parameters. A set of scenarios is represented as $\{(t, d, N) \mid t \in T_p;\ d \in D;\ N = \mathcal{N}\}$, with each scenario starting at time $t$, lasting for duration $d$, and occurring at any of the $N$ nodes, where $T_p$ is the set of discretized event times, $D$ is the set of possible contamination durations, and $\mathcal{N}$ is the node set.

Water consumption follows a periodic pattern (e.g., residential water consumption follows a diurnal pattern).
Different water consumption types vary in their consumption patterns (e.g., an industrial water consumption pattern could be different from a residential water consumption pattern). Thus, the starting
time could be any point within a chosen period (the least common multiple of all consumption patterns), since the process and the resulting flows in the water distribution system are recurrent. The duration of the contamination event could range from a few seconds to several days or even longer if not detected. The upper limit of the event duration is set to the required detection time. For instance, if a contamination event is required to be detected within two hours of its onset, then the longest contamination duration considered is two hours. For accidental contamination events, it is generally assumed that an event may occur at any node with equal probability.

2.2. Minimizing the volume of contaminated water consumed

Efficiency is an important factor to be considered in public investment projects. In our problem, "efficiency" is interpreted as minimizing the volume of contaminated water consumed before the contamination event is detected by sensors. We formulate a p-median-based model to realize this goal. The classical p-median model, as proposed by Hakimi (1964), tries to find the locations of P facilities on a network to minimize the total cost.

Indexes, sets and parameters

N = the total number of nodes in the network
i, I = the index and set of potential contamination source nodes
j, J = the index and set of potential sensor location nodes
s, S = the index and set of scenarios
P = the total number of sensors available
$v_{ijs}$ = the volume of contaminated water consumed before an event originating at node i is detected by a sensor placed at node j under scenario s. In reality, some contamination events cannot be detected. In that case, we treat the event as detected and set $v_{ijs}$ equal to the volume of contaminated water consumed when the event goes undetected.
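The scenario set S listed above is built from the event start times and durations described in Section 2.1. The following is a minimal sketch of that construction, assuming hypothetical pattern periods (daily and weekly), a hypothetical 6-hour discretization of start times, and a two-hour required detection time. None of these values come from the paper; the contamination source node is not part of the scenario tuple here because it is handled by the index i.

```python
from functools import reduce
from itertools import product
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


# Hypothetical consumption-pattern periods, in hours (daily and weekly).
pattern_periods_h = [24, 168]

# The chosen period is the least common multiple of all pattern periods.
period_h = reduce(lcm, pattern_periods_h)

# Assumed discretization of event start times within the chosen period.
start_step_h = 6
start_times_h = list(range(0, period_h, start_step_h))

# Candidate durations are capped by the required detection time.
required_detection_h = 2
candidate_durations_h = [1, 2, 4]
durations_h = [d for d in candidate_durations_h if d <= required_detection_h]

# Each scenario is a (start time, duration) pair.
scenarios = list(product(start_times_h, durations_h))
print(period_h, len(scenarios))
```

With these assumed values the chosen period is one week, and the 4-hour duration is dropped because it exceeds the required detection time.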
Decision variables

$$ z_j = \begin{cases} 1 & \text{if a sensor is located at node } j, \\ 0 & \text{otherwise;} \end{cases} $$

$$ x_{ijs} = \begin{cases} 1 & \text{if node } i \text{ is assigned to potential sensor location node } j \text{ under scenario } s, \\ 0 & \text{otherwise;} \end{cases} $$

$V$ = the maximum volume of contaminated water consumed across all the scenarios.

Minimax p-median-based sensor-location model

$$ \text{Minimize} \quad V \tag{1} $$

Subject to:

$$ \sum_{i \in I} \sum_{j \in J} v_{ijs} x_{ijs} \le V, \quad \forall s \in S, \tag{2} $$

$$ \sum_{j \in J} z_j = P, \tag{3} $$

$$ \sum_{j \in J} x_{ijs} = 1, \quad \forall i \in I,\ s \in S, \tag{4} $$

$$ z_j - x_{ijs} \ge 0, \quad \forall i \in I,\ j \in J,\ s \in S, \tag{5} $$

$$ z_j \in \{0, 1\}, \quad \forall j \in J, \tag{6} $$

$$ x_{ijs} \in \{0, 1\}, \quad \forall i \in I,\ j \in J,\ s \in S. \tag{7} $$

Objective function (1) minimizes the maximum volume of contaminated water consumed $V$, which is stipulated by constraint (2). Constraint (2) defines the maximum volume of water consumed prior to detection across all the scenarios for any feasible solution. Constraint (3) specifies that P sensors are to be located throughout the network. Constraint (4) requires that every potential contamination source node be assigned to a sensor under each scenario s. Constraint (5) enforces that a potential contamination source node can be assigned to a sensor-location node if and only if there is a sensor located in that node. Constraints (6) and (7) are integrality constraints.

Minimax regret p-median-based sensor-location model

In this model, regret is defined as the difference between the actual volume of contaminated water consumed under scenario s if we use the minimax regret approach and the volume of contaminated water consumed if we formulate a p-median model for scenario s. The mathematical formulation of the minimax regret model differs from the minimax model in the objective function and constraint (2). In the minimax regret model, the objective is to

$$ \text{Minimize} \quad V' \tag{8} $$

and constraint (2) becomes

$$ \sum_{i \in I} \sum_{j \in J} v_{ijs} x_{ijs} - Z_{vs} \le V', \quad \forall s \in S, \tag{9} $$

where $Z_{vs}$ is the minimum volume of contaminated water consumed if we formulate a deterministic p-median model for scenario s. Before formulating and solving the minimax regret model, $Z_{vs}$ must be obtained for each scenario s. The regret for scenario s is expressed as $\sum_{i \in I} \sum_{j \in J} v_{ijs} x_{ijs} - Z_{vs}$. Constraint (9) stipulates the maximum regret $V'$ across all the scenarios.

For both variants of the p-median-based sensor-location model, the number of decision variables and the number of constraints are $O(N^2 S)$ and $O(2N^2 S)$, respectively. The number of nodes N varies from several nodes for a small village to tens of thousands of nodes for a city serving hundreds of thousands of people. The number of scenarios S varies from several scenarios to several thousand scenarios, depending on the decision context.

2.3. Maximizing coverage

For public investment decisions such as water sensor-placement, a budget constraint limits the amount of service that can be provided. Therefore, another appropriate objective is to maximize the area that receives service. In this problem, this means maximizing the number of nodes covered by sensors. Node i is "covered" if a contamination event at node i can be detected by any of the sensors within time T after the event has occurred. We refer to time T as the "coverage time". The following is the formulation of a robust coverage model. Since the models above use a minimizing objective, we transform the model for maximizing coverage to a model for minimizing the number of uncovered nodes for consistency. A deterministic model for minimizing the number of uncovered nodes can be found in Daskin (1995, p. 200). Additional inputs and decision variable are as follows:

$T$ = the "coverage time," i.e., the maximum allowable time from the start of a contamination event to the event being detected
$t_{ijs}$ = contaminant travel time from contamination source node i to potential sensor-location node j under scenario s
$N_{is}$ = $\{j : t_{ijs} \le T\}$, the set of candidate sensor-location nodes that can detect a contamination event occurring at node i within time T under scenario s
$y_{is}$ = 1 if node i is uncovered under scenario s; 0 otherwise.
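To make the coverage definitions above concrete, the sketch below computes the sets N_is and the indicators y_is for a fixed sensor placement on a toy instance. The three-node network, the two scenarios, the travel times, and the helper names `coverage_sets` and `uncovered` are all hypothetical; an infinite travel time stands for a contaminant that never reaches the candidate node.

```python
# Toy illustration of N_is = {j : t_ijs <= T} and of the indicator y_is.
# All data and helper names are hypothetical.
INF = float("inf")

# t[s][i][j]: travel time (hours) from source node i to candidate sensor node j
# under scenario s; INF means the contaminant never reaches node j.
t = [
    [[0.0, 1.0, 3.0], [INF, 0.0, 1.5], [INF, INF, 0.0]],  # scenario 0
    [[0.0, 2.5, 4.0], [INF, 0.0, 0.5], [INF, INF, 0.0]],  # scenario 1
]
T = 2.0  # coverage time in hours


def coverage_sets(t, T):
    """Return N[s][i], the set of candidate nodes j with t_ijs <= T."""
    return [[{j for j, tij in enumerate(row) if tij <= T} for row in ts] for ts in t]


def uncovered(N, sensors):
    """Return y[s][i] = 1 if no placed sensor lies in N_is, and 0 otherwise."""
    return [[0 if Ni & sensors else 1 for Ni in Ns] for Ns in N]


N = coverage_sets(t, T)
y = uncovered(N, sensors={2})
uncovered_per_scenario = [sum(ys) for ys in y]
worst_case_uncovered = max(uncovered_per_scenario)
print(uncovered_per_scenario, worst_case_uncovered)
```

The last two lines count the uncovered nodes in each scenario and take the largest count, which is the quantity a robust coverage model seeks to keep small.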
Minimax covering sensor-location model

For the minimax approach, we wish to locate sensors to minimize the maximum number of uncovered nodes across all scenarios. Equivalently, we wish to maximize the minimum number of covered nodes across all scenarios. The optimization model is as follows:
$$ \text{Minimize} \quad U_c \tag{10} $$

Subject to:

$$ \sum_{i \in I} y_{is} \le U_c, \quad \forall s \in S, \tag{11} $$

$$ \sum_{j \in J} z_j = P, \tag{12} $$

$$ \sum_{j \in N_{is}} z_j + y_{is} \ge 1, \quad \forall i \in I,\ s \in S, \tag{13} $$

$$ z_j \in \{0, 1\}, \quad \forall j \in J, \tag{14} $$

$$ y_{is} \in \{0, 1\}, \quad \forall i \in I,\ s \in S. \tag{15} $$

Objective function (10) minimizes the maximum number of nodes uncovered $U_c$, which is stipulated by constraint (11). Constraint (11) defines the maximum number of nodes uncovered across all the scenarios for any feasible solution. Constraint (12) specifies the number of available sensors. Constraint (13) stipulates that node i is either covered by the sensors ($\sum_{j \in N_{is}} z_j \ge 1$) or uncovered ($y_{is} = 1$) under scenario s. Since the objective is to minimize the number of nodes uncovered, $y_{is}$ is forced to be 0 whenever $\sum_{j \in N_{is}} z_j \ge 1$. Constraints (14) and (15) are integrality constraints.

Minimax regret covering sensor-location model

In this model, regret is defined as the difference between the actual number of nodes uncovered and the nodes that would have been uncovered if we knew with certainty which scenario would occur. The mathematical formulation of the minimax regret model differs from the minimax model in the objective function (10) and constraint (11). In the minimax regret model, the objective is to

$$ \text{Minimize} \quad U_c' \tag{16} $$

and constraint (11) becomes

$$ \sum_{i \in I} y_{is} - Z_{cs} \le U_c', \quad \forall s \in S, \tag{17} $$

where $Z_{cs}$ is the minimum number of uncovered nodes corresponding to the deterministic model for scenario s. Before formulating and solving the scenario-based covering model with the minimax regret approach, $Z_{cs}$ must be obtained first for each scenario s. The regret for scenario s is expressed as $\sum_{i \in I} y_{is} - Z_{cs}$. Constraint (17) defines the maximum regret $U_c'$ across all scenarios.

For both variants of the covering location models with the objective of minimizing the number of uncovered nodes, the number of variables and the number of constraints are $O(NS)$ and $O(2NS)$, respectively.

2.4. Multi-objective robust optimization model

Minimizing the number of uncovered nodes (maximizing coverage) and minimizing the volume of contaminated water consumed are conflicting objectives (Krause et al. unpublished manuscript, 2006). We formulate a multi-objective robust optimization model to minimize the number of uncovered nodes and minimize the volume of contaminated water consumed. P-median type constraints are used to minimize the volume of contaminated water consumed, which motivates the transformation of the constraints for minimizing the number of uncovered nodes into a p-median type, too. The transformation of a covering model to a p-median model can be found in Daskin (1995, p. 312). To formulate the multi-objective robust model, the following transformation is needed as an additional input to the robust coverage and the p-median-based models.

$$ \hat{t}_{ijs} = \begin{cases} 1, & t_{ijs} \ge T, \\ 0, & t_{ijs} < T. \end{cases} $$

The multi-objective robust optimization model is as follows:

$$ \text{Minimize} \quad U_c \tag{18} $$

$$ \text{Minimize} \quad V \tag{19} $$

Subject to:

$$ \sum_{i \in I} \sum_{j \in J} v_{ijs} x_{ijs} \le V, \quad \forall s \in S, \tag{20} $$

$$ \sum_{i \in I} \sum_{j \in J} \hat{t}_{ijs} x_{ijs} \le U_c, \quad \forall s \in S, \tag{21} $$

$$ \sum_{j \in J} z_j = P, \tag{22} $$

$$ \sum_{j \in J} x_{ijs} = 1, \quad \forall i \in I,\ s \in S, \tag{23} $$

$$ z_j - x_{ijs} \ge 0, \quad \forall i \in I,\ j \in J,\ s \in S, \tag{24} $$

$$ z_j \in \{0, 1\}, \quad \forall j \in J, \tag{25} $$

$$ x_{ijs} \in \{0, 1\}, \quad \forall i \in I,\ j \in J,\ s \in S. \tag{26} $$

Objective function (18) minimizes the maximum number of uncovered nodes, and objective function (19) minimizes the maximum volume of contaminated water consumed prior to detection. Constraint (20) defines the maximum volume of water consumed prior to detection across all the scenarios for any feasible solution. Constraint (21) defines the maximum number of uncovered nodes across all the scenarios for any feasible solution. Constraint (22) specifies the total number of sensors to be located. Constraint (23) forces every potential contamination source node to be assigned to a potential sensor-location node. Constraint (24) enforces that a potential contamination source node i can be assigned to sensor-location node j if and only if there is actually a sensor placed at node j. Constraints (25) and (26) are integrality constraints. The number of decision variables and the number of constraints for this model are $O(N^2 S)$ and $O(2N^2 S)$, respectively.

We solve model (18)–(26) using the weighting method (Cohen, 1978). The weighted objective becomes:

$$ \text{Minimize} \quad W \tag{27} $$

and constraints (20) and (21) are then combined to form one new constraint (28)

$$ \sum_{i \in I} \sum_{j \in J} v_{ijs} x_{ijs} + w \sum_{i \in I} \sum_{j \in J} \hat{t}_{ijs} x_{ijs} \le W, \quad \forall s \in S, \tag{28} $$
where $W = V + wU_c$, and $w$ is the relative weight between the two objectives.

3. Solution method

We start by solving the single-objective models, and then solve the multi-objective model. Since the deterministic maximum covering problem and p-median problem are known to be NP-hard on a general network (Drezner and Hamacher, 2002), the corresponding robust optimization models are NP-hard as well (Snyder, 2006). Serra and Marianov (1998) formulated a robust p-median-based model for locating fire stations in Barcelona to minimize the maximum population-weighted travel time across a set of scenarios when both population and distance are uncertain, and solved the problem with a heuristic method and with the branch-and-bound method. The heuristic solution method adapts
a procedure developed by Teitz and Bart (1968). Serra and Marianov's algorithm consists of: (1) solving the deterministic p-median problem for each scenario, (2) evaluating each solution across all the scenarios, (3) choosing an initial solution, and (4) applying an exchange heuristic to improve the solution. A comparison between the optimal solution obtained with the branch-and-bound method and the solution obtained with the heuristic method shows that the heuristic performs well. Using the branch-and-bound method, Serra and Marianov (1998) also explored the computation time required to optimally solve ten random 20-node networks, each with four scenarios, when three facilities are to be located. The average CPU time required to solve the problem to optimality was 16.5 minutes for the minimax model and 22.0 minutes for the minimax regret model. Since the instance considered in this paper is an order of magnitude larger than those Serra and Marianov solved to optimality, and given the NP-hard nature of the problem, we focus in this paper on a heuristic solution method.

In practice, populations served by water distribution systems may vary in size from several dozen households to millions of households. The number of nodes in the corresponding systems could range from fewer than ten to many hundreds of thousands. Solving instances of the deterministic p-median model or maximum covering model to optimality can be challenging for very large networks. Similarly sized instances of the robust optimization models formulated above could easily become intractable. However, Nemhauser et al. (1978) prove that a greedy algorithm for maximizing submodular set functions is guaranteed to achieve at least 63% (a factor of 1 − 1/e) of the optimal value (the p-median and maximum covering models are examples of submodular set functions). Moreover, for water distribution network structures, the greedy algorithm has been found to reach the optimal solution in 99% of the cases (Krause et al., 2006).
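The greedy idea referred to above can be sketched as follows. This is only an illustration under stated assumptions, not the implementation used in this paper or by Krause et al. (2006): the function name `greedy_p_median`, the impact matrix `cost[i][j]` and the `penalty` charged to a source before any sensor is placed are all hypothetical. With such a penalty baseline, the reduction in total impact is the monotone submodular quantity to which the guarantee of Nemhauser et al. (1978) refers.

```python
def greedy_p_median(cost, p, penalty):
    """Greedily choose p sensor sites that reduce the total impact the most.

    cost[i][j] -- hypothetical impact of a contamination starting at node i
                  when the first detecting sensor is at node j
    penalty    -- impact assumed for a source not yet served by any sensor;
                  should be at least as large as every entry of cost
    Assumes p does not exceed the number of candidate sites.
    """
    n_sources, n_sites = len(cost), len(cost[0])
    best = [penalty] * n_sources   # current impact for each source node
    chosen = []
    for _ in range(p):
        pick, pick_total = None, None
        for j in range(n_sites):
            if j in chosen:
                continue
            # Total impact if a sensor were added at candidate site j.
            total = sum(min(best[i], cost[i][j]) for i in range(n_sources))
            if pick_total is None or total < pick_total:
                pick, pick_total = j, total
        chosen.append(pick)
        best = [min(best[i], cost[i][pick]) for i in range(n_sources)]
    return chosen, sum(best)


# Toy 3-node instance (made-up numbers, not taken from the paper).
toy_cost = [[0, 4, 9],
            [4, 0, 5],
            [9, 5, 0]]
sites, total_impact = greedy_p_median(toy_cost, p=2, penalty=10)
print(sites, total_impact)  # [1, 2] 4
```

Each round scans every remaining candidate site once, so the sketch costs on the order of p × (number of sites) × (number of sources) operations.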
Solving instances of the deterministic models with more than 12,000 nodes with such algorithms requires just a few seconds on a standard PC. Thus, we adopt the simplest approach (i.e., the first three steps of the aforementioned heuristic algorithm of Serra and Marianov (1998)). This approach makes the problem manageable, but does not guarantee that the global optimum is found.

The weighted multi-objective model can be solved in a manner similar to the robust p-median model once the relative weight w is specified. However, identifying non-dominated solutions and constructing an approximation to the Pareto frontier is more complicated for a robust multi-objective model than for a deterministic model. Conventional methods such as the weighting method and the constraint method (Cohen, 1978) cannot be applied directly. For example, if we use the weighting method to find a solution that is robust for the weighted objective function, we have to decide which scenario to use to calculate each individual objective. Thus, traditional trade-off curves cannot be used. Instead, we define a “sensitivity region” for a solution to the weighted objective function when the weight w is specified. A “sensitivity region” (Gunawan and Azarm, 2005) for a solution to the weighted multi-objective model is defined as the rectangle bounded by the minimum and maximum values of each of the objectives across all the scenarios. We then use the sensitivity region to construct the trade-off relationship between the multiple objectives in a robust model (the detailed procedure is shown in Section 4.2.3 with an example).
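The selection step of the three-step heuristic adopted above (solve each scenario, evaluate every solution in every scenario, choose the robust one) can be sketched as follows. The function name `robust_choice` and the matrix `perf` are hypothetical. The sketch also assumes that the best value attainable in a scenario is the smallest value among the candidates, which holds when each candidate is optimal for its own scenario.

```python
def robust_choice(perf):
    """Pick the minimax and the minimax-regret candidates.

    perf[k][s] -- objective value (lower is better) of candidate solution k
                  when it is evaluated in scenario s
    Returns (index of minimax candidate, index of minimax-regret candidate).
    """
    n_scen = len(perf[0])
    # Best value reached in each scenario by any candidate.
    best_in_scenario = [min(row[s] for row in perf) for s in range(n_scen)]
    # Worst absolute performance and worst regret of each candidate.
    worst_case = [max(row) for row in perf]
    worst_regret = [max(row[s] - best_in_scenario[s] for s in range(n_scen))
                    for row in perf]
    minimax = min(range(len(perf)), key=worst_case.__getitem__)
    minimax_regret = min(range(len(perf)), key=worst_regret.__getitem__)
    return minimax, minimax_regret


# Made-up values for three candidates and three scenarios.
toy_perf = [[10, 30, 20],
            [18, 22, 21],
            [25, 24, 12]]
print(robust_choice(toy_perf))  # (1, 0)
```

In this toy instance the two criteria select different candidates, which illustrates that the minimax and minimax regret solutions need not coincide.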
4. Application to a water distribution system

Applying our robust optimization models to place sensors in a water distribution system requires knowledge of how water travels and how a contaminant behaves in the system. Since it is not possible to conduct contamination experiments in a real distribution system, researchers often model the water distribution system and the associated water behavior and then work with simulation results rather than experimental data. With such models (hydraulic and water quality models), we are able to track how a contaminant travels with water in the distribution system when injected at a node. Contaminant concentration levels at any node over time can then be tracked, and thus contaminant travel times from the contamination source node to all the other nodes can be derived. A similar analysis can be done for the volume of contaminated water consumed prior to detection. For one scenario, the process has to be repeated across all the nodes in the network. Large-scale simulation on even a medium-size water distribution network is time-consuming since the hydraulic and water quality models require solving many linear and nonlinear equations simultaneously.

The 129-node network (Fig. 1) provided by the organizers of the Battle of the Water Sensor Networks (BWSN) (Ostfeld et al., 2006) was chosen for our study because the time-consuming scenario simulation had already been conducted within our research group. In Fig. 1, some of the nodes are highlighted and labeled with their IDs for ease of visualizing model results. We use this 129-node network for illustration purposes. The formulation and solution method presented here can be applied to much bigger networks.

Fig. 1. The structure of the studied network with the highlighted nodes and their IDs.

4.1. Scenario design

The scenarios for our network are derived from the simulation results prepared by Krause et al. (2006) for the BWSN. An exhaustive simulation of random contamination events was conducted, with distinct contamination events beginning every 5 minutes for the first 24 hours and each event lasting for 2 hours. A suite of 24 × 60/5 = 288 contamination events was simulated for each of the 129 nodes in the network, yielding a total of 129 × 288 = 37,152 events.

In our analysis, we generate robust solutions with respect to the contamination occurrence time instead of the contamination location, since we focus on accidental contamination events rather than informed intentional contamination events. In this paper, we assume that the contamination event might occur at any node. Thus, a “scenario” as used in this paper is different from that used in the BWSN. In the BWSN, one scenario is one injection at one node. For our purpose, one scenario is one injection at any of the nodes with the same amount of contaminant, at the same starting time and for the same duration. Thus, we have 24 × 60/5 = 288 scenarios, with a new scenario starting every five minutes during the first 24 hours of the simulation.

4.2. Results and discussion

In this section, we show how five sensors should be placed in the 129-node network with 288 contamination scenarios, using different robust models for different objectives. AMPL–CPLEX 10.0 (ILOG-CPLEX, 2003a,b) was used to formulate and solve the covering model, the p-median model, and the weighted multi-objective model for each scenario. The evaluation of each solution across all the scenarios was programmed in Python 2.4. All the calculations were done on a Dell Optiplex GX620 computer with an Intel Pentium(R) 3.4 GHz CPU (3.39 GHz) and 2.00 GB of RAM.

4.2.1. Results for the robust p-median-based models

Each deterministic p-median model consists of 1 objective, 16,770 variables, and 16,771 constraints. The average CPU time required to solve each problem instance to optimality is about 2 seconds. Evaluating the 288 solutions across all the scenarios and finding the robust solution takes about 15 minutes for both the minimax model and the minimax regret model. Table 2 shows the sensor-placement schemes based on different decision approaches. The exact position of each node (with marked ID) is shown in Fig. 1. The solution for the deterministic model, which uses the expected value of the uncertain input data, is different from the solutions based on the robust decision models. The solutions to the robust models with different criteria (minimax vs. minimax regret) are the same in this case; however, these are likely to differ in general. The solutions for robust models are determined by their worst performance across all the scenarios, regardless of the distribution of the performances across the scenarios.
It is possible that the selected solution achieves the best-possible objective function value for the worst scenario, and achieves the worst-possible objective function value for all other scenarios. We therefore investigated the distribution of performances across all the scenarios for each of the solutions. The results are shown in Fig. 2. In Fig. 2, each point on the x-axis corresponds to a solution. There are 288 solutions, one for each of the 288 scenarios. Each of the solutions is then evaluated across the 288 scenarios. As a result, each data point corresponds to a solution's performance in a given scenario. The solutions are sorted by their worst performance (the maximum volume of contaminated water consumed when evaluated across the 288 scenarios). The maximum and minimum values of the performances across the 288 scenarios for each of the solutions are highlighted in Fig. 2. The robust solution corresponds to the second column from the right. The columns further to the left correspond to solutions that are less robust; for these, only the best and worst performance across the realizable scenarios are shown. It can be seen that although our robust solution performs better in hedging against extreme scenarios, it performs worse in some non-extreme scenarios than some of the other feasible solutions.
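The per-solution summary described above (best and worst performance of every candidate, ordered by worst case) can be sketched in a few lines. The function name `performance_envelope` and the matrix `perf` are hypothetical, and the numbers are made up rather than taken from Fig. 2.

```python
def performance_envelope(perf):
    """Summarize each candidate by its best and worst scenario value.

    perf[k][s] -- objective value (lower is better) of candidate k in scenario s
    Returns a list of (k, best, worst), sorted from the largest to the
    smallest worst case, so the most robust candidate comes last.
    """
    envelope = [(k, min(row), max(row)) for k, row in enumerate(perf)]
    return sorted(envelope, key=lambda item: item[2], reverse=True)


toy_perf = [[10, 30, 20],
            [18, 22, 21],
            [25, 24, 12]]
for k, best, worst in performance_envelope(toy_perf):
    print(k, best, worst)
```

In this toy instance the last candidate listed has the smallest worst case (22) but also the largest best case (18), mirroring the observation that a robust solution can perform worse than other solutions in non-extreme scenarios.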
Table 2
Sensor-placement schemes based on different decision models for minimizing the volume of contaminated water consumed prior to detection.

Approach                                     Solution set (sensor-location node ID)
Minimax approach                             [22, 68, 79, 116, 128]
Minimax regret approach                      [22, 68, 79, 116, 128]
Deterministic model (Krause et al., 2006)    [17, 49, 68, 79, 102]
Fig. 2. The distribution of the performances of sensor-placement schemes across all scenarios for minimizing the volume of contaminated water consumed.
We compare the performance of the solution based on the deterministic model (the rightmost column in Fig. 2), which uses the expected value of the uncertain input data (Krause et al., 2006), with the performance of the solution based on the robust approach (the second column from the right in Fig. 2). It can be seen that the robust solution performs better in hedging against extreme scenarios, even when obtained with the heuristic solution method.

4.2.2. Results for the robust covering models

Each deterministic covering problem consists of 1 objective, 258 variables, and 130 constraints. Less than 1 second of CPU time is required to solve each problem instance to optimality. Before solving a covering model, the “coverage distance” (the coverage time here) has to be specified exogenously. The solution is contingent on the coverage time chosen. We investigated the relationship between the performance (the number of uncovered nodes) of the robust solution and the coverage time chosen. Fig. 3 shows that the robust covering model displays diminishing returns with respect to coverage time, as is the case for the deterministic covering model. Table 3 shows the sensor-placement schemes based on the different approaches (deterministic vs. robust) and different robustness criteria (minimax vs. minimax regret) when the “coverage time” is set to 24 hours. The exact position of each node (with marked ID) is shown in Fig. 1. Two different sensor-placement schemes (a and b) perform equally well in hedging against the worst case when the minimax approach is applied. From the table, it can be seen that although the sensor-placement schemes based on the different approaches are different, three nodes (74, 83 and 100) are common to the robust model and the deterministic model. As for the p-median-based robust model, we also investigated the distribution of performances across all the scenarios for each of the 288 solutions for the robust covering model. Results are shown in Fig. 4.
It can be seen that the solutions to a set of scenarios are robust in this case. Combining this with the information in Table 3, we can infer that the two solution sets (a and b) are shared by many scenarios. For each solution, we again compute the maximum and minimum values of its performance across all the scenarios. The maximum and minimum values for the robust solutions are all as good as or superior to those for the solutions to other scenarios.
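How the number of uncovered nodes depends on the coverage time can be sketched as follows. This is an illustration under assumptions, not the model solved in this paper: the names `uncovered_counts` and `worst_case_uncovered`, the array `travel_time[s][i][j]` and the rule that a source is covered when some sensor is reached within the coverage time are all introduced here, and the data are made up.

```python
INF = float("inf")


def uncovered_counts(travel_time, sensors, coverage_time):
    """Count uncovered source nodes in each scenario.

    travel_time[s][i][j] -- hypothetical time for a contaminant injected at
                            node i to reach node j in scenario s (INF if never)
    sensors              -- list of node indices holding a sensor
    A source counts as covered if some sensor is reached within coverage_time.
    """
    counts = []
    for scenario in travel_time:
        uncovered = 0
        for times_from_source in scenario:
            first_detection = min(times_from_source[j] for j in sensors)
            if first_detection > coverage_time:
                uncovered += 1
        counts.append(uncovered)
    return counts


def worst_case_uncovered(travel_time, sensors, coverage_time):
    """Largest number of uncovered nodes over all scenarios."""
    return max(uncovered_counts(travel_time, sensors, coverage_time))


# Two made-up scenarios on a 3-node line network with one sensor at node 2.
toy_times = [
    [[0, 2, 6], [INF, 0, 3], [INF, INF, 0]],
    [[0, 5, 9], [INF, 0, 8], [INF, INF, 0]],
]
for coverage_time in (4, 8, 10):
    print(coverage_time, worst_case_uncovered(toy_times, [2], coverage_time))
```

For this toy data the worst-case count falls from 2 to 1 to 0 as the coverage time grows from 4 to 8 to 10, which is the kind of relationship plotted in Fig. 3.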
Fig. 3. The trade-off between the “coverage time” and the performance of the robust sensor-placement schemes.
Table 3
Sensor-placement schemes based on different decision models for maximizing the coverage.

Approach                                     Solution set (sensor-location node ID)
Minimax approach                             a. [72, 83, 100, 116, 121]
                                             b. [74, 83, 100, 116, 121]
Minimax regret approach                      [34, 74, 83, 100, 116]
Deterministic model (Krause et al., 2006)    [74, 83, 100, 120, 124]
We compare the performance of the solution based on the deterministic model (the rightmost column in Fig. 4), which uses the expected value of the uncertain input data, with the performance of the solution based on the robust approach (the second column from the right in Fig. 4). It can be seen that the robust solution performs better in hedging against extreme scenarios, even when obtained with the heuristic solution method.

Fig. 4. The distribution of the performances of sensor-placement schemes across all the scenarios for minimizing the number of uncovered nodes.

4.2.3. Results for the multi-objective robust decision model

In solving the multi-objective robust decision model, we identify the trade-off between the multiple objectives. Given the uncertain nature of this multi-objective program, the single-line trade-off curve that we are trying to find is actually a trade-off “band”. The procedure for constructing the trade-off relationship between multiple objectives is described as follows.

A weight w is assigned following the grid-search weighting method described by Cohen (1978). The number of uncovered nodes is less than 129, the total number of nodes in the system, and the volume of contaminated water is on the order of about $10^4$–$10^7$. By setting w equal to 0, we can identify the solution that minimizes the volume of contaminated water only. By setting w equal to a very large number (using trial and error, we found that w equals 200,000), we can identify the solution that minimizes the number of uncovered nodes. Other weights are chosen in between.

In the example shown here, we normalized the number of uncovered nodes and the volume of contaminated water consumed. The normalized scores range from 0 to 1. One unit score on the number of uncovered nodes represents 52 nodes and one unit score on the volume of contaminated water consumed represents $10^7$ gallons of water. The normalized weight ranges from 0 to 50, with 50 ($10^7$/200,000) corresponding to the absolute value of 200,000. When the normalized weight is greater than 50, the solution remains the same as that for the normalized weight of 50. Hereafter, when performing all calculations, we used the absolute weight.

In the following, we illustrate step by step how the trade-off between the two objectives is constructed:

1. After a weight w is assigned, the robust solution to the weighted multi-objective model is found in a manner similar to solving the robust p-median-based model;
2. The robust solution is then used to calculate the number of uncovered nodes and the volume of contaminated water consumed across all the scenarios, respectively;
3. The minimum and maximum values of the number of uncovered nodes across all the scenarios are recorded, as well as the minimum and maximum values of the volume of contaminated water consumed across all the scenarios;
4. The rectangle defined by the four values (two minimum and two maximum values) is the “sensitivity region” for the solution corresponding to the weight w. As shown in Fig. 5a, the upper left rectangle is the “sensitivity region” when w is equal to or greater than 50 and the lower right rectangle is the “sensitivity region” when w is equal to 0. Taking the lower right rectangle as an example, the sensitivity region means that the normalized value of the number of uncovered nodes ranges from about 0 to 0.1 and the normalized value of the volume of contaminated water ranges from about 0.8 to 1.0. These ranges reflect the uncertainty of the problem, that is, the possibility that any of a variety of different scenarios is realized.
5. Repeat steps 1–4 for different weight values.

Fig. 5b shows the “sensitivity regions” for different weights with diagonal lines. Diagonal lines are used to represent rectangles to make the figure readable. In each “sensitivity region”, the lower leftmost point and the upper rightmost point correspond to the best and the worst performances on the two objectives, respectively. The line connecting the upper rightmost points of the sensitivity regions (trade-off curve 1 in Fig. 5c) is a trade-off curve for the two objectives based on their worst performance, and the line connecting the lower leftmost points of the sensitivity regions (trade-off curve 2 in Fig. 5c) is a trade-off curve for the two objectives based on their best performance. For decision makers, the trade-off between the two objectives can be interpreted as a “band” bounded by the best and worst performance across all the scenarios, which plays the same role as the single-line trade-off curve.
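Steps 2–5 above can be sketched as follows, assuming that the robust solution for each weight has already been found and evaluated in every scenario. The function names `sensitivity_region` and `trade_off_band`, the input layout and all numbers are hypothetical and are not read from Fig. 5.

```python
def sensitivity_region(uncovered, volume):
    """Rectangle spanned by one solution's objective values over all scenarios.

    uncovered[s], volume[s] -- the two (normalized) objective values of a
                               fixed solution in scenario s
    Returns (lower_left, upper_right) corners as (uncovered, volume) pairs.
    """
    lower_left = (min(uncovered), min(volume))    # best on both objectives
    upper_right = (max(uncovered), max(volume))   # worst on both objectives
    return lower_left, upper_right


def trade_off_band(values_by_weight):
    """Build the two trade-off curves that bound the band.

    values_by_weight -- {w: (uncovered_list, volume_list)} for the robust
                        solution obtained with weight w
    Returns (worst_curve, best_curve), each ordered by increasing w.
    """
    worst_curve, best_curve = [], []
    for w in sorted(values_by_weight):
        lower_left, upper_right = sensitivity_region(*values_by_weight[w])
        best_curve.append(lower_left)
        worst_curve.append(upper_right)
    return worst_curve, best_curve


# Made-up normalized values for two weights and three scenarios.
toy_values = {
    0:  ([0.9, 1.0, 0.8], [0.05, 0.10, 0.00]),
    50: ([0.0, 0.1, 0.05], [0.8, 1.0, 0.9]),
}
worst_curve, best_curve = trade_off_band(toy_values)
print(worst_curve)  # [(1.0, 0.1), (0.1, 1.0)]
print(best_curve)   # [(0.8, 0.0), (0.0, 0.8)]
```

Keeping only the two corner points per weight matches the use of diagonal lines in Fig. 5b, since a rectangle with sides parallel to the axes is fully determined by one diagonal.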
Fig. 5. (a) An illustration of the “sensitivity region”; (b) An illustration of the “sensitivity regions” for different weights with diagonal lines; (c) The trade-off curves based on the best performance and worst performance in the “sensitivity region”.

5. Conclusion and next steps

When placing sensors to detect contaminants in water distribution systems, commonly used decision models (deterministic optimization models and stochastic optimization models) fail to meet the goal of minimizing the risk to the life and health of the population across all realizable contamination scenarios. Robust optimization, as an approach to hedge against the worst consequences, provides a promising basis for locating sensors in water distribution systems to detect contamination events. Current robust optimization models (Carr et al., 2006; Watson et al., 2006) address the uncertainties in contamination locations and are therefore more suitable for designing robust sensor schemes to hedge against well-informed intentional attackers. In designing sensor schemes to detect accidental contamination events, we wish to design a robust sensor-placement scheme that performs well over all contamination events that start at different times. Thus, we develop several robust models to design robust sensor schemes with respect to the uncertainty of the contamination occurrence time. The procedure for formulating and solving the problem could shed light on designing monitoring stations in water distribution systems.

When the variable and constraint sets are large, solving this type of facility location model is challenging. However, for the specific structure of water distribution systems, the performance of the solution to deterministic models obtained with the greedy heuristic algorithm is comparable with the performance of an optimal solution (Krause et al., 2006). Solving deterministic models for each of the realizable scenarios is the key part of solving the robust decision models. Thus, the computational effort required for solving real-world problems is affordable for most water utilities.

For any given value of the relative weight, instances of the multi-objective robust optimization model may be solved easily. However, constructing the trade-off relationship between the multiple objectives for many values of the relative weight is difficult. In our paper, we use the concept of the “sensitivity region” to build the trade-off relationship between the number of uncovered nodes and the volume of contaminated water consumed prior to detection. The “sensitivity region” captures how the two objectives vary with the uncertain input parameters.

The planning of scenarios to structure and represent the uncertainty of a problem is itself a major area of methods development. In our problem, this is especially important given the computational cost of generating scenarios. We focused in our paper on developing robust models to place sensors and put less effort into scenario planning. The study of structuring scenarios for intentional contamination in water distribution systems is thus of particular importance and should be a next step in the research.

Acknowledgement

This work was funded by the National Science Foundation under grant BES-0329549. We wish to thank Andreas Krause for providing the simulation results on which our analysis was based, as well as James Uber and Avi Ostfeld for providing the network.

References
Becker, H.S., 1983. Scenarios-A tool of growing importance to policy analysts in government and industry. Technological Forecasting and Social Change 23 (2), 95–120.
Berry, J.W., Fleischer, L., Hart, W.E., Phillips, C.A., Watson, J., 2005. Sensor placement in municipal water networks. Journal of Water Resources Planning and Management 131 (3), 237–243.
Berry, J., Hart, W.E., Phillips, C.A., Uber, J.G., Watson, J., 2006. Sensor placement in municipal water networks with temporal integer programming models. Journal of Water Resources Planning and Management 132 (4), 218–224.
Brandeau, M.L., Chiu, S.S., 1989. An overview of representative problems in location research. Management Science 35 (6), 645–674.
Burkard, R.E., Dollani, H., 2001. Robust location problems with Pos/Neg weights on a tree. Networks 38 (2), 102–113.
Burkard, R.E., Dollani, H., 2002. A note on the robust 1-center problem on trees. Annals of Operations Research 110 (1), 69–82.
Carr, R.D., Greenberg, H.J., Hart, W.E., Konjevod, G., Lauer, E., Lin, H., Morrison, T., Phillips, G.A., 2006. Robust optimization of contaminant sensor placement for community water systems. Mathematical Programming Series B 107, 337–356.
Chen, B., Lin, C., 1998. Minmax-regret robust 1-median location on a tree. Networks 31 (2), 93–103.
Chung, C., 1986. Recent applications of the maximal covering location planning (M.C.L.P.) model. The Journal of the Operational Research Society 37 (8), 735–746.
Cohen, J.L., 1978. Multiobjective Programming and Planning. Academic Press, New York, NY.
Current, J., Min, H., Schilling, D., 1990. Multiobjective analysis of facility location decisions. European Journal of Operational Research 49 (3), 295–307.
Daskin, M.S., 1995. Network and Discrete Location: Models Algorithms and Applications. John Wiley & Sons Inc., New York, NY.
Diamond, D., 1998. Principles of Chemical and Biological Sensors. John Wiley & Sons Inc., New York, NY.
Drezner, Z., Hamacher, H., 2002. Facility Location: Applications and Theory. Springer, Berlin.
Gunawan, S., Azarm, S., 2005. Multi-objective robust optimization using a sensitivity region concept. Structural and Multidisciplinary Optimization 29, 50–60.
Hakimi, S.L., 1964. Optimum locations of switching centers and the absolute centers and medians of a graph. Operations Research 12 (3), 450–459.
Hashemian, H.M., 2005. Sensor Performance and Reliability. Instrumentation, Systems, and Automation Society, Triangle Park, NC.
Hrudey, S.E., Payment, P., Huck, P.M., Gillham, R.W., Hrudey, E.J., 2003. A fatal waterborne disease epidemic in Walkerton, Ontario: Comparison with other waterborne outbreaks in the developed world. Water Science and Technology 47 (3), 7–14.
ILOG-CPLEX Division, 2003a. AMPL Version 8.0. Mountain View, CA.
ILOG-CPLEX Division, 2003b. CPLEX 8.0 for AMPL. Mountain View, CA.
Kessler, A., Ostfeld, A., Sinai, G., 1998. Detecting accidental contaminations in municipal water networks. Journal of Water Resources Planning and Management 124 (4), 192–198.
Kouvelis, P., Yu, G., 1997. Robust Discrete Optimization and its Applications. Kluwer Academic Publishers, Boston.
Krause, A., Leskovec, J., Isovitsch, S., Xu, J., Guestrin, C., VanBriesen, J., Small, M., Fischbeck, P., 2006. Optimizing sensor placements in water distribution systems using submodular function maximization. In: 8th Annual Water Distribution System Analysis Symposium, Department of Civil and Environmental Engineering, University of Cincinnati, Cincinnati, OH.
Kumar, A., Kansal, M.L., Arora, G., Ostfeld, A., Kessler, A., 1999. Detecting accidental contaminations in municipal water networks. Journal of Water Resources Planning and Management 125 (5), 308–310.
Lebedev, V., Averbakh, I., 2006. Complexity of minimizing the total flow time with interval data and minmax regret criterion. Discrete Applied Mathematics 154, 2167–2177.
Lee, B.H., Deininger, R.A., 1992. Optimal locations of monitoring stations in water distribution system. Journal of Environmental Engineering 118 (1), 4–16.
Mac Kenzie, W.R., Hoxie, N.J., Proctor, M.E., Gradus, M.S., Blair, K.A., Peterson, D.E., Kazmierczak, J.J., Addiss, D.G., Fox, K.R., Rose, J.B., Davis, J.P., 1994. A massive outbreak in Milwaukee of cryptosporidium infection transmitted through the public water supply. The New England Journal of Medicine 331 (3), 161–167.
Nemhauser, G.L., Wolsey, L., Fisher, M., 1978. An analysis of the approximations for maximizing submodular set functions. Mathematical Programming 14, 268–294.
Ostfeld, A., Salomons, E., 2005. Securing water distribution systems using online contamination monitoring. Journal of Water Resources Planning and Management 131 (5), 402–405.
Ostfeld, A., Uber, J., Salomons, E., 2006. Battle of the water sensor networks (BWSN): A design challenge for engineers and algorithms. In: 8th Annual Water Distribution System Analysis Symposium, Department of Civil and Environmental Engineering, University of Cincinnati, Cincinnati, OH.
Owen, S.H., Daskin, M.S., 1998. Strategic facility location: A review. European Journal of Operational Research 111 (3), 423–447.
ReVelle, C.S., Eiselt, H.A., 2005. Location analysis: A synthesis and survey. European Journal of Operational Research 165 (1), 1–19.
Serra, D., Marianov, V., 1998. The p-median problem in a changing network: The case of Barcelona. Location Science 6, 383–394.
Snyder, L.V., 2006. Facility location under uncertainty: A review. IIE Transactions 38 (7), 537–554.
States, S., Scheuring, M., Kuchta, J., Newberry, J., Casson, L., 2003. Utility-based analytical methods to ensure public water supply security. Journal of the American Water Works Association 95 (4), 103–115.
Teitz, M.B., Bart, P., 1968. Heuristic methods for location problems on stochastic networks. Transportation Science 17, 168–180.
United States Environmental Protection Agency, 2005. WaterSentinel online water quality monitoring as an indicator of drinking water contamination. Rep. No. EPA 817-D-05-002.
Watson, J., Hart, W.E., Murray, R., 2006. Formulation and optimization of robust sensor placement problems for contaminant warning systems. In: 8th Annual Water Distribution Systems Analysis Symposium, p. 93.
Yu, C., Li, H., 2000. A robust optimization model for stochastic logistic problems. International Journal of Production Economics 64, 385–397.