SHORTEST DISTANCE AND RELIABILITY OF PROBABILISTIC NETWORKS

PITU B. MIRCHANDANI*
Electrical and Systems Engineering Department, Rensselaer Polytechnic Institute, Troy, New York 12181, U.S.A.

Abstract-When the "length" of a link is not deterministic and is governed by a stochastic process, the "shortest" path between two points in the network is not necessarily always composed of the same links and depends on the state of the network. For example, in communication and transportation networks, the travel time on a link is not deterministic and the fastest path between two points is not fixed. This paper presents an algorithm to compute the expected shortest travel time between two nodes in the network when the travel time on each link has a given independent discrete probability distribution. The algorithm assumes the knowledge of all the paths between two nodes, and methods to determine the paths are referenced. In reliability computations (reliability being the probability that two given points are connected by a path), associated with each link is a probability of "failure" and a probability of "success". Since "failure" implies infinite travel time, the algorithm simultaneously computes reliability. The paper also discusses the algorithm's capability to simultaneously compute some other performance measures which are useful in the analysis of emergency services operating on a network.

Scope and purpose-Many real-life operational systems may be modeled as networks. In transportation models the links may represent roadways, in communication models they may correspond to communication channels, and in operational models (for example, job shop scheduling) the links denote order sequences of tasks or activities. In transportation networks the nodes are demand points, supply points or junction points. Nodes are communication centers or communicating individuals in communication models; they denote the sequenced operations in operational models.
Many of these network models are probabilistic. In the analysis of probabilistic networks two computational tools are important: (1) a method to compute the network reliability and (2) a method to compute the average travel time (or cost), expected shortest distance, or the time response, depending on the network model and/or problem of the system under study. This paper presents a three-step algorithm to compute the above quantities. Much of the research reported here is new. Frank and Frisch[3] and Hänsler et al.[7] provide a good background in network analysis and reliability of probabilistic networks, respectively.
INTRODUCTION
In the design and analysis of communication networks two computational tools are often important: (1) a method to compute network reliability and (2) a method to compute the time response of the system. By reliability in a communication network we refer to the probability of connectivity between a given source node $v_s$ and a given sink node $v_t$, where each link $l_i$ has a given probability $p_i$ of being operative. By time response in a communication network we mean the expected time delay that a message which originates at the source node sustains before it reaches the sink node, the message being transmitted through the shortest available path. The use of these tools is not limited solely to the design and analysis of communication networks but also extends to the investigation of other complex systems which can be appropriately modeled as networks. The computation of expected travel time for emergency vehicles operating on a transportation network is one such example and, in fact, it is the application which motivated this research[14]. If one assumes that the vehicle dispatcher in an emergency service system has complete information on the state of the network (perhaps from traffic reports), then an emergency vehicle will take the shortest available route. The system thus behaves in a fashion similar to a communication system.

In this paper we shall first review existing approaches and results pertaining to these two computational tools. We shall show the connection between the problem of computing the reliability and the problem of computing the time response in probabilistic networks and then present a new method which computes these two quantities simultaneously. That these two computational problems have not been treated in conjunction before is, perhaps, due to the apparently different contexts in which they have been viewed in the past. Throughout this paper we shall assume that the reader is familiar with elementary graph theory and its notation.

The reliability and the time response of a probabilistic network depend on the reliabilities and the time delay characteristics of the individual components of the network. The individual components will be assumed to be statistically independent, although the algorithms presented may be easily modified when this is not so. The reader is referred to Frank and Frisch[3] and Wilkov[19] for a good introduction to probabilistic networks and some of the problems associated with them.

*Pitu B. Mirchandani is an Assistant Professor of Electrical and Systems Engineering at Rensselaer Polytechnic Institute. He holds the B.Sc. (Highest Honors) and M.S. degrees in Engineering from the University of California at Los Angeles, the S.M. degree in Aeronautics and Astronautics and D.Sc. degree in Operations Research from the Massachusetts Institute of Technology. Professor Mirchandani has presented and published papers on network theory and applied estimation and control. His current research interests include probabilistic graphs and application of quantitative methods in the analysis of large scale systems.
Since the well-known paper of Moskowitz[15] there have been many algorithms presented which compute the reliability of a complex system. A recent article by Hänsler et al.[7] is recommended as a good review on the subject. Hänsler[6] himself has presented an interesting algorithm for computing the reliability of networks based on generating mutually exclusive cut-sets and calculating probabilities of related events. However, as will be apparent later, computing reliability from cut-sets does not particularly facilitate the computation of the expected shortest response time. A second method to compute reliability is to enumerate paths in the network and then to compute the probabilities of related events. Since, in order to compute the expected shortest response time, we also need information on the various paths from the source to the sink, the path enumeration method lends itself well to expected response time computations. It is this avenue that we shall follow in this paper.

In the literature dealing with reliability of networks, and also in the literature related to switching circuits, communication networks, traffic networks and graph theory, many methods have been presented which enumerate paths in a network. Okada[17], followed by Seshu[18], Gould[5], Mine[11] and Wing and Kim[20], introduced algebraic methods to determine paths in nonoriented networks. The basis of their methods is the orthogonality between the incidence matrix and the circuit matrix of the network. Meanwhile, Hohn and Schissler[8], Hohn et al.[9] and Murchland[16] presented algorithms to obtain paths in oriented networks. Recently, Mirchandani[13] has presented efficient labelling algorithms to determine paths in acyclic or nearly acyclic oriented networks. In this paper we shall assume that algorithms to identify paths are readily available.

Once the information on the paths is available, how does one compute the reliability of the network?
While several researchers have introduced various algorithms and subsequent modifications[4, 10, 11, 12], the most significant papers dealing with this problem, in the opinion of the author, are those of Mine[11] and Kim et al.[10]. Mine used algebraic methods to determine paths in a nonoriented network and then the factoring theorem[11] to compute reliability. By his definition, reliability $R$ is the probability that there exists at least one path between the source and the sink. If there are $l$ links in the network, then the state of the system can be represented by a vector $a = \{a_1, a_2, \dots, a_l\}$ where $a_i = 1$ if the $i$th link is operative, $a_i = 0$ otherwise. Mine then defines a "propositional function" $F(a)$ which is equal to unity when $a$ contains at least one path and is equal to zero otherwise. This function has also been referred to as the "transmission function" by Brown[1], undoubtedly with reference to communication networks. If $p_i$ is the probability of "success" for the $i$th link, $i = 1, 2, \dots, l$, then the "reliability function"[1, 11] $f_r(p_i)$ is the expression of the system reliability $R$ as a function of $\{p_i\}$. That is,

$$R = f_r(p_1, p_2, \dots, p_l).$$
The factoring theorem, originally introduced by Moskowitz[15], basically expands the reliability function as

$$f_r(p_1, p_2, \dots, p_l) = p_k \, f_r(p_1, p_2, \dots, p_l)\big|_{p_k = 1} + (1 - p_k) \, f_r(p_1, p_2, \dots, p_l)\big|_{p_k = 0}$$

where $f_r(p_1, p_2, \dots, p_l)|_{p_k = 1}$ and $f_r(p_1, p_2, \dots, p_l)|_{p_k = 0}$ are the reliability functions when the $k$th link is replaced by a failure-proof* link and a failed link respectively. By repeating this expansion with respect to all links, Mine obtained

$$R = \sum_{a \in \Omega} \prod_{i=1}^{l} p_i^{a_i} (1 - p_i)^{1 - a_i} F(a) \tag{2}$$
where $\Omega$ is the set of all possible values of $a$. From the determination of the paths in the network, one can easily obtain all $a \in \Omega$ for which $F(a) = 1$ and hence compute reliability using equation (2).

Kim et al. obtained the reliability of oriented networks by first determining directed paths from the connection matrix and then using basic theory on probability of events to compute reliability. If $R_{stj}$ denotes the $j$th path from source $v_s$ to sink $v_t$, then the probability of success $P_j$ of the path $R_{stj}$ is given by the product of the reliabilities of the links in series which identify path $R_{stj}$. That is,

$$P_j = \prod_{l_i \in R_{stj}} p_i \tag{3}$$

where $\{l_i \in R_{stj}\}$ is the set of links in path $R_{stj}$. If the paths do not contain any common links (that is, the paths are parallel), then the reliability of the system is obtained directly from

$$R = 1 - \prod_{j=1}^{K} (1 - P_j) \tag{4}$$
where $K$ is the number of paths from $v_s$ to $v_t$. Since, in general, the paths are not in parallel, Kim et al. introduced the operator $[\ ]^*$ defined by

$$[p_k^{\,i}]^* = p_k \tag{5}$$

where $i$ is the exponent of the variable $p_k$. In other words, $[\ ]^*$ removes all exponents in a product term. This operator basically considers the reliability of each link only once, in a set of paths, to account for the statistical dependence of the paths due to the existence of common links. The reliability of the system is then computed from

$$R = 1 - \Big[ \prod_{j=1}^{K} (1 - P_j) \Big]^* \tag{6}$$

or from

$$R = \sum_{i} P_i - \sum_{i<j} [P_i P_j]^* + \sum_{i<j<k} [P_i P_j P_k]^* - \dots + (-1)^{K-1} [P_1 P_2 \cdots P_K]^* \tag{7}$$

where all indices range over the $K$ paths.
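The two routes to reliability reviewed above can be compared numerically. The following is a minimal sketch, not code from the paper: it evaluates equation (2) by enumerating all link states and equation (7) by inclusion-exclusion over paths, where the $[\ ]^*$ operator is realised by taking the union of link sets so that each link reliability enters a product only once. The function names, the small network and the link reliabilities are hypothetical and chosen only for illustration.

```python
from itertools import combinations, product
from math import prod


def reliability_by_states(paths, p):
    """Equation (2): add up the probability of every link state a with F(a) = 1."""
    links = sorted(p)
    total = 0.0
    for state in product((0, 1), repeat=len(links)):
        up = {link for link, a in zip(links, state) if a}
        # F(a) = 1 when at least one path has all of its links operative.
        if any(path <= up for path in paths):
            total += prod(p[link] if a else 1.0 - p[link]
                          for link, a in zip(links, state))
    return total


def reliability_by_paths(paths, p):
    """Equation (7): inclusion-exclusion over the paths.

    The [ ]* operator is implemented by multiplying over the union of the
    link sets, so a link shared by several paths is counted once.
    """
    total = 0.0
    for n in range(1, len(paths) + 1):
        sign = 1.0 if n % 2 else -1.0
        for combo in combinations(paths, n):
            total += sign * prod(p[link] for link in frozenset().union(*combo))
    return total


# Hypothetical data: four source-sink paths given as sets of link indices,
# with made-up link reliabilities.
paths = [frozenset({1, 5}), frozenset({2, 6}),
         frozenset({2, 3, 5}), frozenset({2, 4, 5})]
p = {1: 0.9, 2: 0.8, 3: 0.5, 4: 0.6, 5: 0.7, 6: 0.4}

r_states = reliability_by_states(paths, p)   # visits 2**6 link states
r_paths = reliability_by_paths(paths, p)     # visits 2**4 - 1 path subsets
```

Both functions should agree up to rounding; the second one touches $2^K - 1$ subsets of paths rather than $2^l$ link states, which is the efficiency comparison made in the text.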
Kim et al. presented equations (6) and (7) with an "exemplified argument" to indicate their validity; they are proved formally in [14].

Through the use of the factoring theorem Mine considers the reliability of each link separately and hence evaluates $2^l$ independent states of the network to compute reliability. He still has to determine paths in order to evaluate the propositional function $F(a)$. The method of Kim et al. first considers the reliability of each path and then evaluates $2^K$ (where $K$ is the number of paths from $v_s$ to $v_t$) possible states of the paths to compute reliability. When $K < l$, and this is often true in networks with redundancies, the method of Kim et al. is more efficient than Mine's.

*By a failure-proof link we mean a link that fails with probability zero; that is, it never fails.

CAOR Vol. 3, No. GF

The other problem that we are dealing with in this paper is the development of an approach for computing the expected shortest response time in a probabilistic network. In such a network, a link may not only be "operative" or "inoperative" but may also have a probability distribution for the time delay. (In general, when the link is inoperative the time delay may be considered as infinite.) Note that if the reliability of the network is less than one, then the expected response time is naturally infinite since there is a non-zero probability of an infinite response time. However, if the reliability is one then computing the expected response time is a meaningful problem. If it is required to determine the expected response time when it is known that the source and the sink are connected, we are again faced with the same problem. We shall refer to this problem as the "expected shortest* path travel time through a probabilistic network" problem.

Frank[2] has considered this problem when each link has an independent continuous probability distribution for travel times. He derives expressions for the exact probability distribution, in terms of characteristic functions, for the travel times of the shortest paths. When the link travel times can be approximated as having discrete probability distributions, with a finite number of values, his method becomes unnecessarily tedious. This paper assumes discrete probability distributions, and therein lies the connection between reliability and the expected shortest path travel time.

We shall assume that a message is always sent through the operative path with the shortest delay in a communication network, or, in the case of a transportation network, that a vehicle (for example, an emergency vehicle) travels on the fastest operating path. In a probabilistic network, this shortest path is not always the same path. Since the network is stochastic, each possible path could have several possible travel times and each possible path with a particular travel time could have, in general, a non-zero probability of being the shortest and hence being used by the message or the vehicle.
The method presented here to compute the expected shortest response time involves three sequential operations. The first is to transform the given probabilistic network to its "emergency equivalent network" where each link has an independent Bernoulli probability of operating successfully and a deterministic travel time when the link is operative. Such networks are similar to the models used in the reliability and communication literature. The second operation is to determine all the possible paths from the source to the sink. Here is where we assume that path enumeration algorithms are available. Finally, in the third operation, we use a recursive algorithm to compute both the expected response time and the reliability using the information on the paths. To illustrate our method, we shall consider throughout this paper the network of Fig. 1.

Fig. 1. Example of a probabilistic network. Notation $p_i, t_i$ denotes that link $l_i$ has reliability $p_i$ and travel time $t_i$; $p_{21}(t)$ is a discrete travel time distribution for link (2,1).
We assume that the distribution $p_{21}(t)$ in this example is

$$p_{21}(t) = \begin{cases} p_3 & \text{for } t = t_3 \\ p_4 & \text{for } t = t_4 > t_3 \\ 1 - (p_3 + p_4) & \text{for } t = \infty \end{cases}$$

while, as indicated on the figure, the other four links have Bernoulli probabilities of success.

EMERGENCY EQUIVALENT NETWORKS
Consider the link (2,1) in the network of Fig. 1. The travel time from $v_2$ to $v_1$ is $t_3$ with probability $p_3$, it is $t_4$ with probability $p_4$, and it is infinite with probability $1 - (p_3 + p_4)$. Now consider an equivalent representation of the link (2,1) which consists of two possible links between $v_2$ and $v_1$, as shown in Fig. 2, with independent probabilities of success $p_3'$ and $p_4'$. In this case the shortest path travel time from $v_2$ to $v_1$ is $t_3$ with probability $p_3'$, is $t_4$ with probability $p_4'(1 - p_3')$ and is infinite with probability $(1 - p_3')(1 - p_4')$. Hence by making $p_3' = p_3$ and $p_4' = p_4/(1 - p_3)$ we have transformed the network into an "emergency equivalent network" where each link has an independent Bernoulli probability of success.

*The word "shortest" is used, instead of "fastest", to be consistent with the literature.

Definition 1. In a probabilistic network $G$ whose link delay times have discrete probability distributions, the emergency equivalent network $G_E$ is an abstraction of the given network such that (i) each link has a Bernoulli probability of success, (ii) the link delay times are deterministic and (iii) the expected shortest path response time between two points in the emergency equivalent network is equal to the expected shortest path response time in the original network.

In general, if any link $(j, k)$ has a probability distribution $p_{jk}(t)$ for travel times

$$p_{jk}(t) = \begin{cases} p_m & \text{for } t = t_m,\ m = 1, 2, \dots, r \\ p_\infty = 1 - \sum_{i=1}^{r} p_i & \text{for } t = \infty \end{cases} \tag{8}$$
where $t_1 < t_2 < \dots < t_r$, then in the emergency equivalent network this link is represented by $r$ parallel links $(j, m, k)$, $m = 1, 2, \dots, r$, where each link $(j, m, k)$ has an independent probability of success $p_m'$ and associated travel time $t_m$, with

$$p_1' = p_1, \qquad p_m' = p_m \Big(1 - \sum_{i=1}^{m-1} p_i\Big)^{-1}, \quad m = 2, 3, \dots, r. \tag{9}$$
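The transformation of equation (9) can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the author's code: the function names and the numerical distribution are hypothetical, and every listed probability $p_m$ is assumed to be strictly positive so that the denominator never vanishes. The second function recomputes the shortest travel time distribution over the resulting parallel links, which should reproduce the original distribution.

```python
def emergency_equivalent(dist):
    """Apply equation (9) to one link.

    dist: list of (t_m, p_m) pairs with finite travel times; any probability
    mass not listed is taken to sit at t = infinity (link failure).
    Returns a list of (t_m, p'_m) pairs, one per parallel Bernoulli link,
    sorted by travel time.
    """
    links = []
    used = 0.0                      # sum of p_i over i < m
    for t, p_m in sorted(dist):
        links.append((t, p_m / (1.0 - used)))
        used += p_m
    return links


def shortest_time_distribution(links):
    """Shortest travel time over independent parallel links sorted by time.

    Returns (list of (t_m, probability), probability that all links fail).
    """
    out = []
    all_faster_fail = 1.0
    for t, q in links:
        out.append((t, all_faster_fail * q))   # faster links down, this one up
        all_faster_fail *= 1.0 - q
    return out, all_faster_fail


# Hypothetical link: t = 2 w.p. 0.5, t = 5 w.p. 0.3, t = infinity w.p. 0.2.
original = [(2.0, 0.5), (5.0, 0.3)]
equivalent = emergency_equivalent(original)
recovered, p_infinite = shortest_time_distribution(equivalent)
```

With these numbers the two parallel links get success probabilities 0.5 and 0.3/0.5 = 0.6, and the recovered distribution matches the original one, which is the content of Theorem 1 for a single link.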
Theorem 1. The probability of connectivity and the expected shortest path response time for the probabilistic network $G$ are equal to those of the constructed emergency equivalent network $G_E$.

Proof. Since link delay times are discretely distributed with a finite number of values, there are only a finite number of different "network states". The expected shortest path response time between two points may be determined by first obtaining the shortest times for each possible network state and then computing the weighted sum of these shortest times, each weighted by the probability of occurrence of the network state for which the shortest time was obtained. Consider the network states for which the particular link $(j, k)$ of $G$ was used in the shortest path. Thus to prove the theorem, it suffices to show that the probability of connectivity and the probability distribution of the delay times for the particular link $(j, k)$ in $G$ are equal to the probability of connectivity and the probability distribution of the shortest path response time from $v_j$ to $v_k$ in $G_E$, using only the links $(j, m, k)$, $m = 1, 2, \dots, r$, which represent link $(j, k)$ of $G$.

The probability of connectivity between nodes $v_j$ and $v_k$ in $G_E$ using only links $(j, m, k)$, $m = 1, 2, \dots, r$,

$$= 1 - \mathrm{Prob}[\text{all links } (j, m, k) \text{ fail},\ m = 1, 2, \dots, r]$$

$$= 1 - (1 - p_1)\Big(1 - \frac{p_2}{1 - p_1}\Big)\Big(1 - \frac{p_3}{1 - p_1 - p_2}\Big) \cdots \Big(1 - \frac{p_r}{1 - \sum_{i=1}^{r-1} p_i}\Big) = \sum_{i=1}^{r} p_i$$

= probability of connectivity of $(j, k)$ in $G$.

The probability that the shortest path travel time is $t_{m_0} < \infty$ between $v_j$ and $v_k$ in $G_E$ using only links $(j, m, k)$, $m = 1, 2, \dots, r$,

$$= \mathrm{Prob}[\text{links } (j, m, k) \text{ fail for all } m < m_0 \text{ and link } (j, m_0, k) \text{ is operative}]$$

$$= (1 - p_1')(1 - p_2') \cdots (1 - p_{m_0 - 1}')\, p_{m_0}' = p_{m_0}$$

= probability that the travel time is $t_{m_0}$ for $(j, k)$ in $G$. Q.E.D.

Fig. 2. Emergency Equivalent Network for the Network of Fig. 1.

COMPUTING RELIABILITY AND EXPECTED SHORTEST TIME
Let us now assume that we have transformed the network to its emergency equivalent network. That is, we have a network where each link $l_i$ has a probability $p_i$ of being operative and a probability $1 - p_i$ of being inoperative. Suppose there are $K$ paths between the source node $v_s$ and the sink node $v_t$, $R_{stj}$, $j = 1, 2, \dots, K$. Let $E_j$ denote the event that path $R_{stj}$ is connected. Let $Q_m$ denote the expression for the probability that one or more of the paths $R_{st1}, R_{st2}, \dots, R_{stm}$ are connected. That is,

$$Q_m = \mathrm{Prob}[E_1 \cup E_2 \cup \dots \cup E_m]. \tag{10}$$

Let the conditional probability that paths $R_{st1}$ or $R_{st2}$ or ... or $R_{st,m-1}$ are connected given that path $R_{stm}$ is connected be denoted by $P_{U|m}$. That is,

$$P_{U|m} = \mathrm{Prob}[E_1 \cup E_2 \cup \dots \cup E_{m-1} \mid E_m]. \tag{11}$$
Let us define $P_{U|0} = P_{U|1} = Q_0 = 0$. We then obtain $P_{U|m}$ from $Q_{m-1}$ using

$$P_{U|m} = [Q_{m-1}]_{P_m = 1} \tag{12}$$

where the subscript $P_m = 1$ indicates that all the link reliabilities associated with $R_{stm}$ are set equal to unity in the subscripted expression. Using basic probability theory we can then show that

$$Q_m = Q_{m-1} + P_m (1 - P_{U|m}) \tag{13}$$

where $P_m$ is the probability that path $R_{stm}$ is connected. Equations (12) and (13) can be used recursively to determine expressions for $P_{U|1}, P_{U|2}, \dots, P_{U|K}$. Let the index $j$ of paths $R_{stj}$ be ordered such that

$$\tau_1 < \dots < \tau_j < \tau_{j+1} < \dots < \tau_K$$
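where $\tau_j$ is the travel time of path $R_{stj}$. The probability of connectivity is then

$$R = Q_K = P_1 + P_2 (1 - P_{U|2}) + \dots + P_K (1 - P_{U|K}) \tag{14}$$

and the expected response time, given connectivity, is

$$\bar\tau = \{\tau_1 P_1 + \tau_2 P_2 (1 - P_{U|2}) + \dots + \tau_K P_K (1 - P_{U|K})\} / R. \tag{15}$$

When the probability of connectivity $R$ is unity, there exists a path $R_{sth}$ whose reliability $P_h$ is unity, $1 \le h \le K$, while $P_i < 1$ for $i < h$. (In transportation networks $R$ is usually unity.) Then the expected response time is

$$\bar\tau = \tau_1 P_1 + \tau_2 P_2 (1 - P_{U|2}) + \dots + \tau_h P_h (1 - P_{U|h}). \tag{16}$$

Let $T_t$ be a given threshold on the response time, $R_t$ the probability that the response time is less than $T_t$, and $\bar\tau'$ the expected response time given that it is less than $T_t$. If $T_t > \tau_h$, where $\tau_h$ is the travel time for the shortest path whose reliability is one (i.e. for path $R_{sth}$), then $R_t = 1$ and $\bar\tau' = \bar\tau$. But if $T_t \le \tau_h$, then we have

$$R_t = P_1 + P_2 (1 - P_{U|2}) + \dots + P_n (1 - P_{U|n}) \tag{17}$$

where $\tau_n < T_t$ and $\tau_{n+1} \ge T_t$. Also when $T_t \le \tau_h$, the expected shortest path response time among admissible paths is

$$\bar\tau' = \{\tau_1 P_1 + \tau_2 P_2 (1 - P_{U|2}) + \dots + \tau_n P_n (1 - P_{U|n})\} / R_t. \tag{18}$$

The algorithm presented below uses the equations developed in this section to compute recursively $R$, the probability of connectivity; $R_t$, the probability that the response time is less than the given threshold $T_t$; $\bar\tau$, the expected response time given connectivity; and $\bar\tau'$, the expected response time given that the response time is less than the threshold $T_t$.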
Algorithm 1

Step 1: (Initialization)
(i) Determine paths $R_{stj}$ and their respective reliabilities $P_j$ and travel times $\tau_j$. Arrange index $j$ such that $\tau_1 < \dots < \tau_j < \tau_{j+1} < \dots < \tau_K$.
(ii) Let $P_{U|0} = P_{U|1} = Q_0 = 0$ and $m = 0$, $P = 0$, $T = 0$.

Step 2: (computing $R_t$ and $\bar\tau'$)
(i) $m \leftarrow m + 1$.
(ii) If $\tau_m > T_t$ then $R_t = P$ and $\bar\tau' = T/R_t$ and go to Step 3. Otherwise continue.
(iii) Derive $Q_m$ from (13) and $P_{U|m}$ from (12).
(iv) $T \leftarrow T + \tau_m P_m (1 - P_{U|m})$.
(v) $P \leftarrow P + P_m (1 - P_{U|m})$.
(vi) If $P_m = 1$, then $R = 1$, $R_t = 1$, $\bar\tau = \bar\tau' = T$ and terminate. Otherwise go to Step 2(i).

Step 3: (computing $R$ and $\bar\tau$)
(i) Derive $Q_m$ from (13) and $P_{U|m}$ from (12).
(ii) $T \leftarrow T + \tau_m P_m (1 - P_{U|m})$.
(iii) $P \leftarrow P + P_m (1 - P_{U|m})$.
(iv) If $P_m = 1$, then $R = 1$, $\bar\tau = T$ and terminate. Otherwise continue.
(v) If $m = K$, then $R = P$ and $\bar\tau = T/R$ and terminate. Otherwise $m \leftarrow m + 1$ and go to Step 3(i).
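The steps of Algorithm 1 can be sketched numerically as follows. This is a hedged illustration rather than the author's implementation: the paper manipulates $Q_m$ and $P_{U|m}$ as symbolic expressions, whereas this sketch evaluates equations (12) and (13) directly on numbers by recursion, which is simple but revisits earlier paths many times. The names `union_prob` and `algorithm_1`, the data, and the handling of the case where no path exceeds the threshold (not covered in the text; here $R_t$ is set to $R$) are assumptions.

```python
from math import prod


def union_prob(paths, p):
    """Q_m of equation (13) for the given paths, evaluated numerically.

    P_{U|m} of equation (12) is Q_{m-1} with every link reliability on
    path m set to one.
    """
    if not paths:
        return 0.0                                   # Q_0 = 0
    *earlier, last = paths
    p_last = prod(p[link] for link in last)
    forced = {**p, **{link: 1.0 for link in last}}
    return union_prob(earlier, p) + p_last * (1.0 - union_prob(earlier, forced))


def algorithm_1(paths, times, p, threshold):
    """Return (R, R_t, tau_bar, tau_bar_prime) for paths with travel times."""
    order = sorted(range(len(paths)), key=lambda j: times[j])
    paths = [paths[j] for j in order]
    times = [times[j] for j in order]
    P = T = 0.0
    R_t = tau_prime = None
    for m, (path, tau) in enumerate(zip(paths, times)):
        if R_t is None and tau > threshold:          # Step 2(ii)
            R_t, tau_prime = P, (T / P if P > 0 else None)
        P_m = prod(p[link] for link in path)
        forced = {**p, **{link: 1.0 for link in path}}
        weight = P_m * (1.0 - union_prob(paths[:m], forced))  # P_m (1 - P_U|m)
        T += tau * weight                            # Steps 2(iv) and 3(ii)
        P += weight                                  # Steps 2(v) and 3(iii)
        if P_m == 1.0:                               # Steps 2(vi) and 3(iv)
            break
    if R_t is None:                                  # assumption: no path beyond threshold
        R_t, tau_prime = P, (T / P if P > 0 else None)
    return P, R_t, (T / P if P > 0 else None), tau_prime


# Hypothetical data: four paths as tuples of link indices, made-up values.
p = {1: 0.9, 2: 0.8, 3: 0.5, 4: 0.6, 5: 0.7, 6: 0.4}
paths = [(1, 5), (2, 6), (2, 3, 5), (2, 4, 5)]
times = [2.0, 3.0, 4.0, 6.0]
R, R_t, tau_bar, tau_bar_prime = algorithm_1(paths, times, p, threshold=5.0)
```

For these made-up numbers the weights $P_m(1 - P_{U|m})$ are 0.63, 0.1184, 0.0168 and 0.01008, so $R = 0.77528$ and $R_t = 0.7652$.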
When the algorithm terminates at Step 2(vi) then there exists a path $R_{stm}$ such that $P_m = 1$ and $\tau_m < T_t$, and hence it is clear that $R = 1$, $R_t = 1$ and $\bar\tau = \bar\tau' = T$ as obtained by the algorithm satisfy (16) with $h = m$. If $T_t \le \tau_h$ and $P_i < 1$ for $i < h$ then $R_t$ and $\bar\tau'$ are obtained from Step 2(ii) and they satisfy equations (17) and (18) respectively. When $P_m = 1$ for $\tau_m > T_t$ the algorithm terminates at Step 3(iv). It is clear then that $R = 1$ and that the $\bar\tau$ obtained satisfies equation (16). When the probability of connectivity $R$ is not unity, then on termination at Step 3(v) the $R$ and $\bar\tau$ computed satisfy equations (14) and (15) respectively. Q.E.D.

One should note that, due to the recursive nature of Algorithm 1, the effort to compute $R_t$, $\bar\tau$ and $\bar\tau'$ besides the reliability $R$ is not much greater than that for computing only the reliability $R$. In fact, if we were only interested in computing $R_t$ and $\bar\tau'$ we could eliminate Step 3 altogether. Note also that the distribution of the shortest path response times is also obtained, since the probability that the travel time is $\tau_k$ is equal to $P_k (1 - P_{U|k})$.

Let us illustrate the method on the probabilistic network of Fig. 2. From an algorithm to determine paths (or by inspection) we find that there are four possible paths, $l_1 - l_5$, $l_2 - l_6$, $l_2 - l_3 - l_5$ and $l_2 - l_4 - l_5$. After computing the travel times for each path and arranging the travel times in ascending order, assume we have

$$\tau_1 = t_1 + t_5 < \tau_2 = t_2 + t_6 < \tau_3 = t_2 + t_3 + t_5 < \tau_4 = t_2 + t_4 + t_5.$$
From equation (3) we have

$$P_1 = p_1 p_5, \quad P_2 = p_2 p_6, \quad P_3 = p_2 p_3 p_5 \quad \text{and} \quad P_4 = p_2 p_4 p_5.$$

Assume $P_i < 1$, $i = 1, 2, 3, 4$. Also suppose that the given threshold $T_t$ lies between $\tau_3$ and $\tau_4$. That is,

$$\tau_3 < T_t < \tau_4.$$
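In the process of the algorithm, Step 2 generates

$$Q_1 = p_1 p_5, \qquad P_{U|2} = p_1 p_5,$$

$$Q_2 = p_1 p_5 + p_2 p_6 (1 - p_1 p_5), \qquad P_{U|3} = p_1 + p_6 - p_1 p_6,$$

$$R_t = P_1 + P_2 (1 - P_{U|2}) + P_3 (1 - P_{U|3}),$$

$$\bar\tau' = \frac{\tau_1 P_1 + \tau_2 P_2 (1 - P_{U|2}) + \tau_3 P_3 (1 - P_{U|3})}{R_t}.$$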
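On termination from Step 3, the algorithm gives

$$Q_3 = p_1 p_5 + p_2 p_6 (1 - p_1 p_5) + p_2 p_3 p_5 (1 - p_1 - p_6 + p_1 p_6),$$

$$P_{U|4} = (p_1 + p_6 - p_1 p_6) + p_3 (1 - p_1 - p_6 + p_1 p_6),$$

$$R = P_1 + P_2 (1 - P_{U|2}) + P_3 (1 - P_{U|3}) + P_4 (1 - P_{U|4}),$$

$$\bar\tau = \frac{\tau_1 P_1 + \tau_2 P_2 (1 - P_{U|2}) + \tau_3 P_3 (1 - P_{U|3}) + \tau_4 P_4 (1 - P_{U|4})}{R}.$$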
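As a sanity check on the expressions of this example, one can plug in numbers and compare against exhaustive enumeration of link states. The sketch below does this under hypothetical link reliabilities and travel times (none of these values come from the paper); it compares the weights $P_k(1 - P_{U|k})$ with the brute-force probability that path $k$ is the shortest operative path.

```python
from itertools import product
from math import prod

# Hypothetical link reliabilities p_i and travel times t_i, i = 1..6.
p = {1: 0.9, 2: 0.8, 3: 0.5, 4: 0.6, 5: 0.7, 6: 0.4}
t = {1: 1.0, 2: 1.0, 3: 2.0, 4: 4.0, 5: 1.0, 6: 2.0}
paths = [(1, 5), (2, 6), (2, 3, 5), (2, 4, 5)]
tau = [sum(t[link] for link in path) for path in paths]    # 2, 3, 4, 6
P = [prod(p[link] for link in path) for path in paths]

# Closed-form conditional probabilities from the worked example.
PU2 = p[1] * p[5]
PU3 = p[1] + p[6] - p[1] * p[6]
PU4 = PU3 + p[3] * (1 - p[1] - p[6] + p[1] * p[6])
weights = [P[0], P[1] * (1 - PU2), P[2] * (1 - PU3), P[3] * (1 - PU4)]
R = sum(weights)
tau_bar = sum(tk * w for tk, w in zip(tau, weights)) / R


def brute_force_distribution():
    """Distribution of the shortest operative path time over all 2**6 states."""
    dist = {}
    links = range(1, 7)
    for state in product((0, 1), repeat=6):
        up = {link for link, a in zip(links, state) if a}
        prob = prod(p[link] if link in up else 1 - p[link] for link in links)
        usable = [tk for path, tk in zip(paths, tau) if up.issuperset(path)]
        key = min(usable) if usable else float("inf")
        dist[key] = dist.get(key, 0.0) + prob
    return dist


dist = brute_force_distribution()
```

Here `dist[tau[k]]` should equal `weights[k]` for each path, and `dist[float("inf")]` should equal `1 - R`, which illustrates the remark that the algorithm also yields the distribution of the shortest path response time.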
The usefulness of the algorithm presented here can best be appreciated by expanding the expressions for $R$ and $\bar\tau$ in terms of $p_i$ and $t_i$, $i = 1, 2, \dots, 6$, and observing their complexity. When the network is not large this algorithm is also an important tool for computing reliability and time response by hand calculations, since, once the paths have been identified, the algorithm permits the derivation of all quantities of interest essentially by inspection. Note that $Q_2$ is obtained from $Q_1$ by mechanically adding a term to the expression of $Q_1$, $Q_3$ is similarly obtained from $Q_2$, and so on. Hence, to compute the reliability one needs only to keep adding terms to $Q_1$ until all the paths have been considered. Once a path has been considered in the expressions for $Q_m$ and $P_{U|m}$, the information on the path may be discarded. In this respect, this algorithm requires less computer memory compared to other path enumeration methods to compute reliability. Of course, the expressions for $Q_m$ will become longer and longer, but no more so than
the expressions for the larger terms of $R$ in (7). With regard to computational time, this algorithm is superior since it does not have to compute various combinations of $n$ out of $K$ paths, $n = 1, 2, \dots, K$, as required by (7). Comparing the algorithm with a cut-set enumeration method, such as that of Hänsler[6], would be an interesting project for future research.

Acknowledgements-This work was supported in part by the National Science Foundation (Division of Social Systems and Human Resources) under a grant to the M.I.T. Operations Research Center and in part by Signatron Inc., Lexington, Massachusetts. The author thanks Nicholas Johnson of Signatron, Inc., Amedeo Odoni of M.I.T. and James Jarvis of Clemson University for comments and suggestions with regard to this research.

REFERENCES
1. D. B. Brown, A computerized algorithm for determining the reliability of redundant configurations, IEEE Trans. Reliability 20, 121-124 (1971).
2. H. Frank, Shortest paths in probabilistic graphs, Ops Res. 17, 583-599 (1969).
3. H. Frank and I. Frisch, Communication, Transmission, and Transportation Networks. Addison-Wesley, Reading, MA (1971).
4. Y. Fu and S. S. Yau, A note on the reliability of communication networks, IEEE Trans. Communication Technology 13, 301-307 (1965).
5. R. Gould, The application of graph theory to the synthesis of contact networks, Proc. Int. Symp. Switching Circuits, Harvard University, April (1957).
6. E. Hänsler, A fast recursive algorithm to calculate the reliability of a communication network, IEEE Trans. Communications 20, 637-640 (1972).
7. E. Hänsler, G. K. McAuliffe and R. S. Wilkov, Exact calculation of computer network reliability, Networks 4, 95-112 (1974).
8. F. E. Hohn and L. R. Schissler, Boolean matrices and the design of combinational relay circuits, Bell Syst. Tech. J. 34, 177-202 (1955).
9. F. E. Hohn, S. Seshu and D. D. Aufenkamp, The theory of nets, IRE Trans. Electronic Computers 6, 154-161 (1957).
10. Y. H. Kim, K. E. Case and P. M. Ghare, A method for computing complex system reliability, IEEE Trans. Reliability 21, 215-219 (1972).
11. H. Mine, Reliability of physical systems, IRE Trans. Circuit Theory 6, 138-151 (1959).
12. K. B. Misra, An algorithm for the reliability evaluation of redundant networks, IEEE Trans. Reliability 19, 146-151 (1970).
13. P. B. Mirchandani, Simple paths in a directed network, WP-02-74, IRP Project, Operations Research Center, Massachusetts Institute of Technology, February (1974).
14. P. B. Mirchandani, Analysis of stochastic networks in emergency service systems, TR-15-75, IRP Project, Operations Research Center, Massachusetts Institute of Technology, March (1975).
15. F. Moskowitz, The analysis of redundancy networks, AIEE Trans. Communication Electronics 39, 627-632 (1958).
16. J. D. Murchland, A new method for finding all elementary paths in a complete directed graph, London School of Economics, LSE-TNT-22 (1965).
17. S. Okada, Topology applied to switching circuits, Proc. Symp. Information Networks 3, 267-290, April (1954).
18. S. Seshu, On electronic circuits and switching circuits, Trans. IRE 3, 172-178 (1956).
19. R. S. Wilkov, Analysis and design of reliable computer networks, IEEE Trans. Communications 20, 660-678 (1972).
20. O. Wing and W. H. Kim, The path matrix and switching functions, J. Franklin Inst. 268, 251-269 (1959).