Available online at www.sciencedirect.com
SCIENCE ~DIFIHCT ELSEVIER
e
MATHEMATICAL
AND
COMPUTER MODELLING
Mathematical and Computer Modelling 42 (2005) 1385-1396 www.elsevier.com/locate/mcm
Discrete Sensor Placement Problems in D i s t r i b u t i o n N e t w o r k s T. Y. BERGER-WOLF C e n t e r for Discrete M a t h e m a t i c s a n d T h e o r e t i c a l C o m p u t e r Science ( D I M A C S ) C o R E Bldg., R u t g e r s U n i v e r s i t y 96 F r e l i n g h u y s e n Rd., Piscataway, N J 08854, U.S.A. t anyabw©dimacs, rutgers, edu
W. E. HART Computer Science Research Institute Sandia National Laboratories Albuquerque, NM 87185, U.S.A. wehart©sandia, gov
J. SAIA D e p a r t m e n t of C o m p u t e r Science University of New Mexico A l b u q u e r q u e , NM 87131, U.S.A. saia~cs, unm. edu
(Received December 2003; revised and accepted March 2005)
A b s t r a c t - - W e consider the problem of placing sensors in a network to detect and identify the source of any contamination. We consider two variants of this problem: (1) sensor-constrained: we are allowed a fixed number of sensors and want to minimize contamination detection time; and (2) time-constrained: we must detect contamination within a given time limit and want to minimize the number of sensors required. Our main results are as follows. First, we give a necessary and sufficient condition for source identification. Second, we show that the sensor and time constrained versions of the problem are polynomially equivalent. Finally, we show that the sensor-constrained version of the problem is polynomially equivalent to the asymmetric k-center problem and that the time-constrained version of the problem is polynomially equivalent to the dominating set problem. @ 2005 Elsevier Ltd. All rights reserved.
1. I N T R O D U C T I O N W e c o n s i d e r t h e p r o b l e m o f p l a c i n g s e n s o r s in a b u i l d i n g or a u t i l i t y n e t w o r k in o r d e r t o d e t e c t c o n t a m i n a t i o n o f t h e air o r w a t e r s u p p l y . P r a c t i c a l m o t i v a t i o n for t h i s p r o b l e m d e r i v e s f r o m r e c e n t We are grateful to C. Phillips and the anonymous reviewers for their helpful comments. This work was performed in part at Sandia National Laboratories. Sandia is a multiprogram laboratory operated by Sandia corporation, a Lockheed Martin Company, for the United States Department of Energy under Contract DE-AC04-94AL85000. This work is supported by the National Science Foundation postdoctoral fellowship Grants EIA 02-03584, EIA 02-05116 (T. Berger-Wolf), by the National Science Foundation Grant CCR-0313160 (J. Saia), and by the Sandia University Research Program Grant No. 191445 (J. Saia). 0895-7177/05/$ - see front matter @ 2005 Elsevier Ltd. All rights reserved. doi: 10.1016/j .mcm. 2005.03.005
Typeset by ~4AdS-TEX
1386
T.Y. BERGER-WOLFet al.
world events including Tokyo's subway incident, London's poison gas b o m b plot, and various U.S. government warnings. While more effective sensors (e.g., the SnifferSTAR air quality sensor [1]) are currently being developed to address increasing threats of contamination, these new sensors are likely to be expensive. Thus, algorithmic techniques are needed to place sensors in a network in such a way t h a t cost is minimized and contamination can still be quickly detected. Two possible goals for sensor placement are: (1) contamination detection, i.e., ensuring quick detection of a contamination event, and (2) source identification, i.e., ensuring t h a t the source of contamination can always be identified. Two natural constraints for sensor placement are: (1) sensor-constrained, i.e., allowing only a fixed number of sensors and (2) time-constrained, i.e., requiring contamination detection or source identification within a given time limit. These two goals and two constraints define four sensor placement problems which we address in this paper. Our results in this paper, provide the first analysis of the computational complexity of these sensor placement problems. The problems are described formMly in Section 2.1. For contamination detection, we show t h a t the sensor-constrained and time-constrained problems are polynomially equivalent and NP-hard (Section 2.2). We show t h a t the sensor-constrained contamination detection problem is polynomially equivalent to the asymmetric k-center problem (Section 4) and the time-constrained contamination detection problem is polynomially equivalent to the dominating set problem (Section 5). In addition, we describe necessary and sufficient conditions for source identification (Section 3); and give exact solutions to two specific cases of the source identification problem: the uniform clique and rooted trees (Section 6). A variety of numerical techniques have been proposed for sensor placement. There are several integer p r o g r a m m i n g formulations for contamination detection in water networks [2-5]. Berry et al. [2] show t h a t these integer programs can be robust to noise in data. Unfortunately, integer p r o g r a m m i n g can be exponential slow so these formulations of the problem m a y have difficulty scaling to real-world problems. Numerical methods have also been developed for the problem of source identification [6-10]. The approaches suggested in these papers are either to place sensors randomly or apply a greedy heuristic to compute a minimal (but not necessarily optimal) set of sensors capable of contamination source identification. Unfortunately, these numerical methods can also be exponentially slow. There have also been nonnumerical approaches to sensor placement. Gonz£lez-Banos and L a t o m b e show how to place visual sensors in a building for 3D mapping using a combinatorial optimization approach [11]. Additionally, heuristic optimizers have been used to perform sensor placement in water networks using water quality and protection measures based on detailed hydraulic models like E P A N E T (e.g., see [12-15]).
2. D I S C R E T E OPTIMIZATION
MODEL
2.1. O v e r v i e w In this section, we give a formal description of our sensor placement problems. We model the network as a directed weighted graph G = (V, E). V is a set of vertices representing possible locations far the sensors. These m a y be rooms in a building or pipe junctions in a water network. E is a set of edges representing flow between the vertices. These may be airways or hallways in a building or pipes in a utility network. Each edge ( i , j ) has weight rij E ~R+ which is the time it takes for contaminant to pass from vertex i to vertex j. We use the shortest p a t h metric to define the actual translocation rates in the graph. T h a t is, for any two vertices i and j, the
Discrete Sensor Placement Problems
1387
translocation rate rij is defined as the length of the shortest path between i and j; rij is infinite if no such path exists. Since the shortest path metric defines the actual translocation rates in this graph, the triangle inequality is valid. We can construct an adjacency matrix representation of the effective translocation rates in the graph by using any all-pairs shortest paths algorithm for directed weighted graphs with no negative weights 1. Thus, the input graph is a weighted complete graph (possibly with some infinite weight edges), where some edges represent the actual flow conduits and the rest are inferred to represent the effective translocation rates. For simplicity of presentation, however, in some cases we will explicitly describe only the edges representing flow conduits and the remaining edges are inferred. Given this input, the goal is to place sensors on the vertices so it is always possible to detect the contamination and, additionally, identify the vertex that is the source of contamination. We say that a sensor s can detect contamination at vertex v if there exists a finite length directed path from v to s. We say that s detects contamination at v within time t if the length of the v-s path is at most t. We make several assumptions about the sensor placement problem. First, we assume that a sensor can detect the presence of contaminant before it reaches dangerous levels and that once a sensor is activated, it remains activated. Second, we assume rapid mixing in the air flow. That is, (I) there is no delay between the material entering and leaving a compartment, and (2) the spread of the material inside the compartment is even. Third, we assume that there is a single contamination source. Otherwise, a sensor is needed at every vertex for source identification to be possible. For any particular contamination event, we assume that all the vertices down the flow from the contamination source eventually become contaminated. Any vertex in the network may be a potential contamination source. In the sensor-constrained variant of our problem, we are given a maximum number of sensors, Smax, and we want to minimize the time from contamination to detection or source identification. In the time-constrained variant of the problem, we are given a time limit T, and want to minimize the number of sensors required to detect contamination or identify the source within that time limit.
2.2. S e n s o r - C o n s t r a i n e d and T i m e - C o n s t r a i n e d E q u i v a l e n c e We now show a polynomial time equivalence between the sensor-constrained and time-constrained variants of the sensor placement problem.
OBSERVATION 1. The sensor-constrained and time-constrained versions of the sensor placement problem are polynomially equivalent. Moreover, an algorithm for the sensor-constrained version can be converted to an algorithm for the time-constrained version with a lg IV[ factor increase in the running time, and an algorithm for the time-constrained version can be converted to an algorithm for the sensor-constrained version with a lg m factor increase in the running time (where m <_ n 2 is the n u m b e r of finite weight edges). PROOF. Suppose we have an exact algorithm that solves the sensor-constrained version of the problem. We now describe an algorithm for the time-constrained version, with time limit T, that minimizes the number of sensors. Set the sensor limit at [V[ and use the given algorithm to find the minimum time to detection or identification. If the solution for the contamination source identification exceeds T, then there is no feasible solution to the identification problem. (Note that when sensor limit is IV[ the time to detection is 0, and therefore, there is a feasible solution.) Otherwise, use binary search on the sensor limit to find the smallest number of sensors for which the solution to the sensor-constrained version is within the specified time limit T. There are at most lg IVI iterations of the binary search, hence, the logarithmic factor increase in the running time. 1The best algorithm to date is, we believe, Pettie's algorithm that runs in O(IE[[V] + IV[2 log log [VI) time [16].
1388
T.Y. BERGER-WOLFet al.
The other direction is very similar. Suppose we have an exact algorithm for the time-constrained version. Sort the edges by their length. (Note that this adds an O ( m l g r n ) additive factor to the running time, which is assumed to be less than A t - c × lgm, where A t - c is the running time of the algorithm for the time-constrained problem.) Recall that we assume that all the edges exist and their weights are defined by the shortest path metric. Therefore, the time to detection or identification for any algorithm must be the length of some edge. Start with the time limit set at rma×, the length of the longest finite edge. Solve this problem for the smallest number of sensors. This gives the smallest feasible number of sensors. Thus, if the solution requires more sensors than the sensor limit Smax, then there is no feasible solution. Otherwise, use binary search on edge lengths to find the smallest time limit for which the time-constrained solution gives at most the number of alloted sensors. Note that the binary search approach assumes monotonicity. However, it is clear that the number of sensors needed monotonically increases as the detection time limit decreases and vice versa. | 3.
SOURCE
IDENTIFICATION In both the sensor and time-constrained variants of sensor placement, the question of unique contamination source identification must be addressed. The problem of source identification is clearly at least as hard as contamination detection, since detection is necessary for identification. The following lemma gives a necessary and sufficient condition for source identification. LEMMA 1. Unique contamination source identification is possible within the s y s t e m if and only if every v e r t e x has a unique configuration o f sensors and reaction time delays b e t w e e n successive sensors. T h a t is, let S = { s o , . . . , sk-1} be the set o f all sensors in the graph. For a vertex v~, let (Sio,..., sik_l) be the sensors ordered by the length o f the p a t h from v e r t e x v~ to each sensor (breaking the ties by sensor number). Further, let (d~l,... ,dik_l) be the reaction delay t i m e sequence, where dij is the difference between the p a t h length from vi to s~j_ 1 and the p a t h length from vi to sij. I f there is no p a t h from vi to sij, then d~j -- oo. Then, each vertex m u s t have a unique tuple ((S/o, 0), ( si~ , di~ ), . . . , (six_,, d~k_ a) ). The reaction time delay sequence in addition to the sensor permutation is generally necessary for unique identification. Figure 1 shows an example of the same sensor permutation with different time delays t h a t allows unique identification. (The edges not drawn in the figure are inferred.) Clearly, there must be a path from each vertex to at least one sensor for source identification. In the worst case, the entire sequence of sensors may be necessary for the unique source identification. Figure 2 shows an example of this phenomena, with six vertices and three sensors that can be easily generalized to s sensors with 2s vertices for an arbitrary s. Given a graph with edge weights and a sensor placement, using Lemma 1, it is possible to verify whether the placement indeed allows unique source identification. It also leads to a natural contamination source identification algorithm. First, we construct the sensor-delay sequence for each vertex by sorting the set of weights from a vertex to all the sensors. The sorting and computing of the delay in the reaction time between two successive sensors can be done in O([VIIS[ log[S[) time, where IS[ is the number of sensors. Once the sensor-delay sequences are constructed, we need to verify that each vertex has at least one finite sensor reaction time and that there are no two identical sequences. This can be done in a straight-forward way in O([V[2[S[) time (in the worst case, the sequences differ in the last delay time). Now we can build a delay sequence multigraph, with sensors as vertices and edges (i, j) of weight wij if there exists a sensor reaction sequence in which sensor j reacts after sensor i with a delay of w~j. For each edge we also keep a label that identifies the set of possible vertices that are consistent with the delay sequence as being the contamination source. Traversing two consecutive edges implies taking the intersection of their labels. This graph can be built in O([V[[S[) time. Thus, the total
Discrete Sensor Placement Problems
1389
3l
1LL3 S1
S2
Figure 1. Sensors sl and s2 will react in the same order to vertices vl or v2 being contaminated. However, if vl is the contamination source, then the reaction delay between sl and s2 is 1, while if v2 is the source, the delay is 2.
S1
S2
S3
Figure 2. An example of a graph where an entire sensor sequence may be necessary for a unique contamination source identification. For each sensor si the incoming edges have weight i. The delay sequence for the vertices is as follows: Vl : (Sl, 0)~ (82, 1), (83, 1), v2 : (s2,0), (sa, i), v3 : (sl, 0), (s2, 1). For each sensor vertex si the delay sequence is (si, 0). The minimum number of sensors for this graph is the three sensors shown in the figure. preprocessing t i m e is O(]V]2fS]). W h e n c o n t a m i n a t i o n occurs, t h e n a n y a c t u a l sensor reaction sequence corresponds to a u n i q u e p a t h in the delay sequence graph. T h u s , t h e c o n t a m i n a t i o n source can be identified in O(IS]) time. We have shown t h e following s t a t e m e n t . PROPOSITION 1. Given a placement of[S] sensors in the system, it takes at most O([V[2[S[) time
to verify t h a t unique source contamination identification is possible and, in case of contamination, it takes at most O(ISI) time to identify the source. Note t h a t this a l g o r i t h m relies on t h e relative difference of t h e edge weights, a n d n o t on their p a r t i c u l a r values. If t h e error in m e a s u r e m e n t of t h e t r a n s l o c a t i o n rates is less t h a n t h e smallest difference b e t w e e n a n y two paths, t h e a l g o r i t h m r o b u s t l y identifies sources. To our knowledge, this sort of r o b u s t n e s s has n o t b e e n quantified in o t h e r m e t h o d s for source identification for these problems.
4. S E N S O R - C O N S T R A I N E D CONTAMINATION DETECTION I n the s e n s o r - c o n s t r a i n e d c o n t a m i n a t i o n d e t e c t i o n problem, we are given a weighted directed g r a p h G = (V, E ) with positive weights r~j a n d a positive integer Sma×, which is t h e m a x i m u m n u m b e r of sensors to be used. O u r goal is to place t h e sensors onto t h e vertices of t h e g r a p h in a way t h a t m i n i m i z e s the m a x i m u m c o n t a m i n a t i o n d e t e c t i o n time. This p r o b l e m is e q u i v a l e n t to the a s y m m e t r i c k-center problem, which is well k n o w n to be N P - h a r d [17]. ASYMMETRIC k-CENTER. We are given a complete directed g r a p h G = ( V , E ) of shortest (weighted) p a t h distances b e t w e e n the vertices t h a t satisfies t r i a n g l e inequality, a n d a positive integer k. We m u s t find a s u b s e t of vertices S, ISI = k, which m i n i m i z e s the longest d i s t a n c e from a vertex in S a n d a n y vertex in t h e graph. T h a t is we w a n t to find a p p r o p r i a t e S m i n i m i z i n g cost(S), defined as follows: cost(S) = m a x rain dist(s, v).
vEV sES
T h e a s y m m e t r i c k-center p r o b l e m c a n n o t be a p p r o x i m a t e d w i t h i n a factor of (1 - o(1)) log* n unless N P C_ D T I M E ( n l ° g l ° g n ) . It is also i n a p p r o x i m a b l e w i t h i n a n y c o n s t a n t factor unless
T.Y. BERGER-WOLF¢t al.
1390
P = NP [18]. There exist both an O(log* n)-approximation algorithm [191 and a O(log* k)-approximation algorithm [20], which are theoretically the best possible (unless NP C_ DTIME(nI°g log n)). THEOREM 1. The sensor-constrained contamination detection problem is equivalent to the asym-
metric k-center problem. PROOF. To show the reduction in either direction, we equate k = Sm~× and retain the underlying directed graph G with the edge weights reversed dist(i, j) = rj~. Since the edge weights use the shortest p a t h metric, the triangle inequality is satisfied. A solution to the asymmetric k-center problem is a subset of at most k vertices, S, t h a t minimizes the distance from a vertex in S to a vertex in the graph. T h a t is, this is a subset of at most Smax vertices such t h a t the m a x i m u m time from any vertex to a vertex in S is minimizes among all such subsets. T h a t is, this is a subset of vertices such t h a t a placement of sensors at these vertices minimizes the contamination detection time. | 5.
TIME-CONSTRAINED
CONTAMINATION
DETECTION
In the time-constrained contamination detection problem, we are given a weighted directed graph G = (V, E) with positive weights rij and a positive integer T, which is the detection time limit. Our goal is to place the minimum number of sensors onto the vertices of the graph that allows contamination detection and source identification within the time limit. From Observation 1, we know t h a t the time-constrained and sensor-constrained variants are polynomially equivalent. We have shown in Section 4 t h a t the sensor-constrained variant is NP-hard, therefore, the time-constrained variant is NP-hard as well. The question remains, however, what is the best approximation algorithm for the time-constrained variant. Notice, t h a t we cannot directly apply any approximation algorithm for the sensor-constrained minimization using the binary search to find the smallest number of sensors within a given time limit. The solution t h a t we find might exceed the specified limit. Thus, the a p p r o x i m a t e solution must be within the time limit. However, in t h a t case, we might restrict the problem too much, increasing the number of sensors needed. In fact, for an arbitrary graph, there is no bound on the increase in the number of sensors with the decrease in the time detection limit. Consider a uniform clique with all the edge weights (in both directions) being T, the time limit. We need only one sensor to guarantee contamination detection within time T. However, detection within any time less t h a n T requires all n sensors. The clique graph is not unique. We can construct numerous nontrivial examples with arbitrary increase in the number of sensors when the detection time limit decreases. Thus, we need an approximation algorithm designed specifically for the time-constrained version of the problem. For that, we reduce time-constrained sensor placement directly to the m i n i m u m dominating set problem. MINIMUM DOMINATING SET. Given a graph G = (V, E) find the smallest subset S C V such t h a t for all u E V - S, there is a v E S such t h a t (u,v) E E. Minimum dominating set is not approximable within cln IV I for any constant 0 < c < 1 [21]. The best approximation algorithm is the 1974 Johnson's 1 + In IVI approximation algorithm [22] which is the direct application of the greedy set cover algorithm. The running time of the greedy set cover algorithm is asymptotically proportional to the sum of the set sizes. In case of the minimum dominating set problem, there are IVI sets, each of size at most A ( m a x i m u m degree of a vertex in the graph). Thus, the running time is O(IVIA). THEOREM 2. Time-constrained sensor placement problem is polynomially equivalent to minimum
dominating set. PROOF. Given a weighted graph G = (V, E) for the sensor placement problem and a time limit T, we create the input graph G ~ = (V, E I) for the minimum dominating set problem as follows. An
Discrete Sensor Placement Problems
1391
edge (u, v) E E p if and only if ru,, <_ T. T h a t is, we create the "reachability in T graph". A solution to the minimum dominating set on G ~ provides a minimum subset of vertices S C_ V such that for any u E V, it is either in S or there exists v E S such that (u, v) C E I. T h a t is, S is a minimum subset of vertices V such that for any vertex in the original graph G the distance from it to S is at most T. This is the solution to the time-constrained sensor placement problem for contamination detection. For the other direction, given graph G = (V, E) for the minimum dominating set problem, we create an instance of the time-constrained sensor placement problem by assigning all the edges in E weights 1 and setting the time limit T = 1. The weights for the edges not present in E are the path length between the corresponding vertices. A solution to the time-constrained sensor placement problem provides a minimum subset of vertices S such that for any vertex u E V - S the distance from u to a vertex in v c S is at most 1. That is, (u, v) c E. | 6.
SOURCE IDENTIFICATION IN SPECIAL GRAPHS
In this section, we give exact solutions to the source identification problem for two special graphs: the uniform clique and rooted trees. 6.1. U n i f o r m C l i q u e Consider the uniform clique graph on n vertices, K n , where each pair of vertices i, j is connected by a bidirectional edge with weight r. The complete solution to the uniform clique graph sensor placement problem is outlined in the following two observations. OBSERVATION 2. T h e uniform clique graph g n needs one sensor for contamination detection (in time greater than O) and a sensor placed at any vertex guarantees detection within the time limit of r.
OBSERVATION 3. T h e uniform clique graph K n needs at least n - 1 sensors for unique source identification (in time greater than O) and any n - 1 sensors guarantee unique source identification within the t i m e limit o f r. rlb reduce the detection or identification time to 0, it is necessary to place a sensor at every vertex. This is true for any graph with all nonzero edge weights. Clearly, any sensor placement graph is a weighted clique (with possibly some infinite edge weights). 6.2. R o o t e d T r e e s A rooted tree is a tree graph with a special vertex designated as a root. All the edges are oriented away from the root. That is for any edge ( i , j ) , it is oriented from i to j if i is on the path from the root to j. Water and other distribution networks can in some cases be modeled as rooted trees. In such a network there is a single supply source and it is delivered along a unique path to each destination. This model does not take into account possible back flow, that is, it assumes the flow moves from the source to the destinations only. (The rest of the edges in the graph are inferred from the tree edges.) 6.2.1. D e t e c t i o n There must be a sensor in each leaf of the tree (a vertex with no outgoing edge), otherwise there is no way to detect contamination at that vertex. For a detection time limit T, we use algorithm RTREE presented in Figure 3 to find the minimum set of sensors that ensures contamination detection within time T.
1392
T . Y . BERGER-WOLF et al. RTREE (1)
Place a sensor at each leaf.
(2)
Follow the tree edges towards the root from the current sensors and mark all the vertices that have distance at most T to a sensor as "covered".
(3)
Put a sensor at each uncovered vertex that is first on the path from a sensor to the root.
(4)
Repeat Steps 2 and 3 until all vertices are covered.
Figure 3. Algorithm RTREEfor contamination detection in rooted trees. THEOREM 3. A l g o r i t h m RTREE produces a m i n i m u m set o f sensors t h a t guarantees contamination detection w i t h i n the specified time limit T . PROOF. S u p p o s e to the c o n t r a r y t h a t a n o p t i m a l set of sensors is smaller t h a n t h a t p r o d u c e d by the a l g o r i t h m . T h e r e m u s t exist a sensor in t h a t set t h a t was placed at a vertex a l r e a d y covered at some i t e r a t i o n of t h e algorithm. Let s be the first such sensor. T h e l e n g t h of t h e p a t h from s to the closest sensor o t h e r t h a n s is less t h a n t h e t i m e limit T. Notice t h a t in the set of sensors p r o d u c e d by RTREE each two sensors are at least d i s t a n c e T apart, since we never place a sensor on a covered vertex. I n a tree graph, t h e r e is a u n i q u e p a t h from t h e root to every vertex. If we replace t h e sensor s with the n e x t sensor sR from RTREE on the p a t h from t h e root to s, t h e n we do not increase t h e n u m b e r of sensors. Moreover, all t h e vertices are still covered since s was the vertex f u r t h e s t from t h e root t h a t was n o t in t h e set p r o d u c e d by RTREE. W h e n we replace s by SR, all t h e d e s c e n d a n t s of sR are still covered by the set of sensors t h a t is t h e s a m e in the o p t i m a l a n d RTREE. T h e ancestors of s are covered because SR is t h e first v e r t e x uncovered by t h a t set a n d it is closer to t h e next o p t i m a l sensor o n the p a t h from t h e root to s. T h u s , we can replace t h e o p t i m a l set of sensors w i t h t h a t p r o d u c e d by t h e a l g o r i t h m w i t h o u t increasing the n u m b e r of sensors. This is a c o n t r a d i c t i o n to the a s s u m p t i o n t h a t the o p t i m a l set is smaller t h a n the one p r o d u c e d by t h e algorithm. Hence, t h e a l g o r i t h m RTREE p r o d u c e s a n o p t i m a l set of sensors. | 6.2.2. Identification Notice t h a t t h e above a l g o r i t h m g u a r a n t e e s detection w i t h i n the specified t i m e limit, it does not g u a r a n t e e identification. We need to m o d i f y t h e a l g o r i t h m slightly to g u a r a n t e e u n i q u e c o n t a m i n a t i o n source identification. LEMMA 2. In a rooted tree graph, for a n y v e r t e x at m o s t two sensors are s u m c i e n t to uniquely identify c o n t a m i n a t i o n at t h a t vertex (the two sensors are n o t necessarily the s a m e for all the vertices). PROOF. I n a tree w i t h no vertices of degree 2, every i n t e r n a l vertex has a set of d e s c e n d e n t s different from t h e set of d e s c e n d e n t s of its child (besides the child itself), a n d each i n t e r n a l vertex has at least one pair of d e s c e n d a n t s such t h a t it is their least c o m m o n a n c e s t o r 2. T h u s , for a n y i n t e r n a l v e r t e x it is sufficient to have sensors at some two vertices whose least c o m m o n ancestor it is. T h e s e two sensors identify their least c o m m o n ancestor as t h e c o n t a m i n a t i o n source in such a tree. P l a c i n g a sensor at a n y vertex of degree 2 u n i q u e l y identifies c o n t a m i n a t i o n at t h a t vertex a n d effectively converts t h e tree to a tree with no vertices of degree 2.
|
T h u s , w h e n no t i m e limit is imposed, it is sufficient to place t h e sensors at t h e leaves to ensure u n i q u e c o n t a m i n a t i o n source identification in a tree w i t h no degree-2 vertices. However, we m u s t 2 The least common ancestor of two nodes in a rooted tree is the node that is an ancestor of both and is the furthest of all such from the root.
Discrete Sensor Placement Problems
1393
place a sensor at a n y vertex t h a t has o n l y one o u t g o i n g edge, since otherwise it c a n n o t be a least c o m m o n a n c e s t o r of some two vertices. W i t h these o b s e r v a t i o n s in m i n d , in F i g u r e 4, we present the modified Mgorithm RTREE-ID t h a t produces t h e smallest set of sensors t h a t g u a r a n t e e s u n i q u e source i d e n t i f i c a t i o n w i t h i n a given t i m e limit T. RTREE-ID (1)
Place a sensor at each vertex with out-degree less than 2.
(2)
Follow the edges in the reverse direction from the current sensors and mark all the vertices that have distance at most T to a sensor. For each vertex that is marked at least twice, mark it as "covered".
(3)
Put a sensor at each not "covered" vertex that is first on the path from a sensor to the root.
(4)
Repeat Steps 2 and 3 until all vertices are covered.
Figure 4. Algorithm RTREE-IDfor contamination source identification in rooted trees. THEOREM 4. Algorithm RTREE-ID produces a m i n i m u m set of sensors that guarantee contami-
nation source identification within the specified time limit. T h e proof is similar to t h a t of T h e o r e m 3. B o t h a l g o r i t h m s RTREE a n d RTREE-ID m i n i m i z e t h e n u m b e r of sensors given a t i m e limit, t h a t is, t h e y are a l g o r i t h m s for the t i m e - c o n s t r a i n e d m o d e l of t h e problem.
However, since these
a l g o r i t h m s are exact, we have shown in Section 2.2 t h a t using those algorithms, we can c o n s t r u c t a s o l u t i o n to t h e s e n s o r - c o n s t r a i n e d p r o b l e m w i t h a O(log ]Vt) factor increase in r u n n i n g time. Thus, we have solved t h e sensor p l a c e m e n t p r o b l e m for rooted trees. 6.3. D i r e c t e d
Acyclic Graphs
A directed acyclic graph (DAG) is a directed g r a p h w i t h no cycles. T h e vertices with no maximal elements, a n d the vertices w i t h no o u t g o i n g edges are called
i n c o m i n g edges are called minimal e l e m e n t s of the DAGs. I n such a n e t w o r k is o n l y from t h e source to back flows. 6.3.1.
graph. M a n y d i s t r i b u t i o n networks c a n be realistically m o d e l e d as t h e r e m a y be m a n y s u p p l y sources ( m a x i m a l elements), b u t the flow the d e s t i n a t i o n s . T h i s m o d e l still does n o t take into a c c o u n t possible
Detection
THEOREM 5. Time-constrained sensor placement for contamination detection in a D A G is NP-
complete. PROOF. T h e general sensor p l a c e m e n t p r o b l e m for c o n t a m i n a t i o n d e t e c t i o n is in NP, hence, this special case is in N P as well. We show N P - h a r d n e s s by a r e d u c t i o n from m i n i m u m set cover. MINIMUM SET COVER. G i v e n a universe U -= {1, 2 , . . . , n} a n d a collection of sets $ = {$1, $2, • . . , Sk) such t h a t S i c U, find t h e smallest n u m b e r of sets in S such t h a t U j Sj = U. G i v e n a n i n s t a n c e of t h e m i n i m u m set cover p r o b l e m w i t h the e l e m e n t s U = {1, 2 , . . . , n } a n d a collection of sets S = {$1, $ 2 , . . . , Sk}, we create a n i n s t a n c e of t h e t i m e - c o n s t r a i n e d sensor p l a c e m e n t on a D A G as follows. T h e r e are t h r e e t y p e s of vertices. All the m a x i m a vertices c o r r e s p o n d to U, one v e r t e x for each element. T h e r e is also a v e r t e x for each set, w i t h a n edge from an e l e m e n t u to a set S if u E S. Finally, t h e r e is one m i n i m u m v e r t e x w i t h a n edge from each of the sets to it. Clearly, the r e s u l t i n g g r a p h has no directed cycles a n d c a n be c o n s t r u c t e d in p o l y n o m i a l time. F i g u r e 5 illustrates t h e c o n s t r u c t i o n .
1394
T . Y . BERGER-WOLF et al.
1
2
3
0
0
0
0
"'"
0
"'"
n-1
n
0
0
0
© Figure 5. A graph obtained after the reduction from an instance of set cover. There is an edge from i to Sj if i E Sj. All vertices Sj have an edge to the minimum vertex. To finish t h e r e d u c t i o n , all t h e edges have t h e s a m e weight 1 a n d t h e t i m e l i m i t for d e t e c t i o n i s T = 1. LEMMA 3. T h e r e e x i s t s a feasible s e t cover o f t h e e l e m e n t s o f U i f a n d o n l y i f t h e r e e x i s t s a feasible solution to r e s u l t i n g sensor p l a c e m e n t p r o b l e m w i t h no sensors placed a t t h e m a x i m a l vertices. PROOF. G i v e n a feasible set cover of U, we place t h e sensors at t h e m i n i m a l v e r t e x a n d t h e vertices c o r r e s p o n d i n g t o t h e sets in t h e cover. C o n t a m i n a t i o n in t h e m i n i m u m v e r t e x a n d e v e r y v e r t e x St is d e t e c t e d b y t h e sensor at t h e m i n i m u m v e r t e x ( w i t h i n t i m e 1). C o n t a m i n a t i o n at a n y m a x i m a l v e r t e x i is d e t e c t e d b y t h e sensor a t t h e v e r t e x c o r r e s p o n d i n g t o t h e set S j from t h e set cover t h a t c o n t a i n s i. Conversely, given a p l a c e m e n t of sensors in t h e g r a p h w i t h no sensors a t t h e m a x i m a l vertices, we c r e a t e a set cover b y t a k i n g t h e sets c o r r e s p o n d i n g to t h e vertices S j t h a t have sensors p l a c e d at t h e m . C o n t a m i n a t i o n at a n y of t h e m a x i m a l vertices m u s t b e d e t e c t e d b y one of t h e s e sensors, since t h e d i s t a n c e to t h e m i n i m u m v e r t e x is 2. T h u s , t h e sets c o r r e s p o n d i n g to t h e s e vertices cover all t h e e l e m e n t s of U. | W e now c o n t i n u e w i t h t h e p r o o f of T h e o r e m 5. If t h e r e exists a feasible s o l u t i o n to t h e sensor p l a c e m e n t p r o b l e m w i t h no sensors a t t h e m a x i m a l vertices, t h e n t h e r e exists an o p t i m a l s o l u t i o n w i t h no sensor at t h e m a x i m a l vertices. T h i s is d u e t o t h e fact t h a t we can r e p l a c e a n y sensor at a m a x i m a l v e r t e x i w i t h a sensor at a v e r t e x S j t h a t has an edge from i w i t h no i n c r e a s e in t h e t o t a l cost. G i v e n an o p t i m a l s o l u t i o n to t h e sensor p l a c e m e n t p r o b l e m w i t h no sensor a t t h e m a x i m a l vertices, we c o n s t r u c t a set cover b y t a k i n g t h e sets c o r r e s p o n d i n g t o t h e v e r t i c e s S j w i t h sensors. F r o m L e m m a 3, t h i s is a feasible s o l u t i o n t o t h e set cover p r o b l e m . T h i s is also an o p t i m a l solution, since t h e n u m b e r of sensors in a n y sensor p l a c e m e n t w i t h no sensor at t h e m a x i m a l vertices is one m o r e t h a n t h e size of t h e set cover. Thus, t h e sensor p l a c e m e n t p r o b l e m on a D A G is N P - c o m p l e t e . | 7. AND
CONCLUSIONS DISCUSSION
We have defined d i s c r e t e m o d e l s for t h e p r o b l e m of p l a c i n g sensors in d i s t r i b u t i o n n e t w o r k s a n d p r e s e n t e d r o b u s t a p p r o x i m a t i o n m e t h o d s . O u r results show t h a t o p t i m a l s o l u t i o n s c a n n o t be efficiently f o u n d in all cases (unless P = N P ) , b u t t h a t n e a r - o p t i m a l s o l u t i o n s c a n b e efficiently g e n e r a t e d . A l t h o u g h we have n o t a p p l i e d t h e a p p r o x i m a t i o n m e t h o d s we discuss, it is clear t h a t fast a p p r o x i m a t i o n m e t h o d s a r e i m p o r t a n t for sensor p l a c e m e n t b e c a u s e r e a l - w o r l d d a t a sets can be q u i t e large. F o r e x a m p l e , a w a t e r n e t w o r k for a large m u n i c i p a l i t y c a n e a s i l y c o n t a i n over 10,000 m a j o r d i s t r i b u t i o n pipes. W e have also a d d r e s s e d t h e i m p o r t a n t goal of t h e i d e n t i f i c a t i o n of t h e source of c o n t a m i n a t i o n for t h e first t i m e in a d i s c r e t e s e t t i n g . T h e m o d e l s c o n s i d e r e d in t h i s p a p e r m a k e m a n y s i m p l i f i c a t i o n s a n d a s s u m p t i o n s t h a t m i g h t need to be r e v i s e d for p r a c t i c a l a p p l i c a t i o n s . For e x a m p l e , we have a s s u m e d a single set of
Discrete Sensor Placement Problems
1395
network flows. Although this might be reasonable for building ventilation systems, this will often be insufficient for large-scale municipal water networks. Network flows in water networks vary based on consumer demands, which naturally vary throughout the day and by day of the week. Consequently, sensor placement methods for such domains need to account for the best placement for a variety of flows (e.g., see the model proposed by Berry et al. [2]). Even if the gross flow remains stable, there will naturally be modest variations in flows. A more careful consideration of these effects needs to be addressed to ensure quick robust contamination detection and source identification. Another major simplification in our model is that we have assumed robust detection regardless of the contaminant concentration level. A more realistic model would require a minimum concentration for robust detection. Additionally, it may be necessary to explicitly model sensor failures. Sensor failures might simply reflect the ability of sensors to operate under normal operating conditions. However, modeling this aspect could account for malicious destruction of sensors (e.g., by assessing how many sensors could be destroyed without seriously compromising their detection capacity). Finally, we note that accurate prediction of sensor placements might require explicit models of the decay of contaminants within the network. For example, it is well known that chemicals like chlorine bind to water pipes, which is modeled by water models such as EPANET [23]. Contaminants that decay rapidly will naturally be more difficult to detect, though this is not captured in our models.
REFERENCES 1. N. Singer, Flying SnifferSTAR may aid civilians and US military, Sandia LabNews 55 (2). 2. J. Berry, L. Fleischer, W.E. Hart and C. Phillips, Sensor placement in municipal water networks, In Proc. World Water and Environmental Resources Conference, 2003. 3. J. Berry, W.E. Hart, C.A. Phillips and J. Uber, A general integer-programming-based framework for sensor placement in municipal water networks, In Proc. Sixth Annual Symposium on Water Distribution Systems Analysis, 2004. 4. R.D. Cart, H.3. Greenberg, W.E. Hart and C.A. Phillips, Addressing modelling uncertainties in sensor placement for community water systems, In Proc. Sixth Annual Symposium on Water Distribution Systems Analysis, 2004. 5. J.P. Watson, W.E. Hart and H.J. Greenberg, A multiple-objective analysis of sensor placement optimization in water networks, In Proe. Sixth Annual Symposium on Water Distribution Systems Analysis, 2004. 6. A. Birchall, A microcomputer algorithm for solving compartmental models involving radionuclide transformations, Health Physics 50 (3), 389-397, (1986). 7. A. Birchall and A.C. James, A microcomputer algorithm for solving first-order compartmental models involving recycling, Health Physics 56 (6), 857-868, (1989). 8. F. Gelbard, J.E. Brockmann, K.K. M u r a t a and W.E. Hart, An algorithm for locating sensors in a large multiroom building, Tech. Rep. SAND2000-0851, Sandia National Laboratories, (2000). 9. C.D. Laird, L.T. Biegler, B.G. vanBloemen Waanders and R.A. Bartlett, Time dependent contamination source determination for municipal water networks using large scale optimization, Journal of Water Resources Planning and Management (to appear). 10. C.D. Laird, L.T. Biegler, B.G. vanBloemen Waanders and R.A. Bartlett, Time dependent contamination source determination: A network subdomain approach for very large networks, In Proc. World Water and Environmental Resources Congress, (2004). 11. H. Gonz£1ez-Banos and J.-C. Latombe, A randomized art-gallery algorithm for sensor placement, In Proceedings of the 17 th Annual Symposium on Computational Geometry (SoCC), pp. 232-240, ACM Press, New York, NY, (2002). 12. A. Kessler, A. Ostfeld and G. Sinai, Detecting accidental contaminations in municipal water networks, J. Water Resources Planning and Management, 192-198, (1998). 13. A. Kumar, M. Kansat and G. Arora, Discussion of "Detecting accidental contaminations in municipal water networks", Journal of Water Resources Planning and Management, 308-310, (1999). 14. A. Ostfeld and E. Salomons, Optimal layout of early warning detection stations for water distribution systems security, Journal of Water Resources Planning and Management 130 (5), 377-385, (2004). 15. M. Tryby, D. Boccelli, J. Uber and L.A. Rossman, Facility location model for booster disinfection of water supply networks, Journal of Water Resources Planning and Management, 322-332, (2002).
1396
T . Y . BERGER-WOLF et al.
16. S. Pettie, A faster all-pairs shortest path algorithm for real-weighted sparse graphs, In Automata, Languages and Programming. P9 ~h International Colloquium. Proceedings, Volume 2380 of Lecture Notes in CompuLer Science, (Edited by P. Widmayer, F. Triguero, R. Morales, M. Hennessy, S. Eidenbenz and R. Conejo), pp. 85-97, ICALP 2002, Malaga, Spain, July 8-13, 2002, Springer, Berlin, (2002). 17. M.R. Carey and D.S. Johnson, Computers and Intractability: A Guide to the TheorT of NP-Completeness, W. H. Freeman and Company, (1979). 18. J. Chuzhoy, S. Guha, E. Halperin, S. Khanna, G. Kortsarz and J. Naor, Asymmetric k-center is log* n hard to approximate, In Proceedings of the 36~h A C M Symposium on Theory of Computing ( S T O C 200~), ACM, Chicago, IL, (2004). 19. R. Panigrahy and S. Vishwanathan, An o(log* n) approximation algorithm for the asymmetric p-center problem, Journal of Algorithms 27 (2), 259-268, (1998). 20. A.F. Archer, Two o(log* k)-approximation algorithms for the asymmetric k-center problem, In Proceedings of the Eighth Conference on Integer Programming and Combinatorial Optimization, pp. 1-14, (2001). 21. R. Raz and S. Safra, A sub-constant error-probability low-degree test, and sub-constant error-probability PCP characterization of NP, In Proceedings of the P9 th Annual A C M Symposium on Theory of Computing, pp. 475 484, ACM, (1997). 22. D.S. Johnson, Approximation algorithms for combinatorial problems, J. Comput. System Sci. 9, 256-278, (1974). 23. L.A. Rossman, The EPANET programmer's toolkit for analysis of water distribution systems, In Proceedings of the Annual Water Resources Planning and Management Conference, (1999).