Probabilistic analysis of solving the assignment problem for the traveling salesman problem
J-C. PANAYIOTOPOULOS *
Piraeus Graduate School of Industrial Studies, Piraeus, Greece
Received February 1980 Revised November 1980
This paper deals with probabilistic analysis of optimal solutions of the asymmetric traveling salesman problem. The exact distribution for the number of required next-best solutions of the assignment problem with random data in order to find an optimal tour is given. For every n-city asymmetric problem, there exists an algorithm such that (i) with probability $1 - s$, $s \in (0,1)$, the algorithm produces an optimal tour, (ii) it runs in time $O(n^4/3)$, and (iii) it requires less than $w((w + n - 1)\log(w + n - 1) + w + 1) + \frac{1}{6}w(n^3 + 3n^2 + 2n - 6)$ computational steps, where $w = \log(s)/\log(1 - E_n)$; $E_n \in (0,1)$ is given by a simple mathematical formula. Additionally, the polynomial of (iii) gives the exact (deterministic) execution time to find $w = 1, 2, \ldots$ next-best solutions of the assignment problem.
1. Introduction
One of the best known NP-complete problems of operations research, mathematical programming and management science is undoubtedly the asymmetric traveling salesman problem (TSP). A statement of this historical problem is: A traveling salesman wishes to visit each of n cities once and only once and return to the start. In what order should he visit the cities to minimize the total distance traveled? [3,6,9,10,12,14]. Hitherto, a polynomial-time algorithm capable of solving the problem exactly has not been found. In fact, there does not exist in the literature any method which guarantees optimality within a reasonable amount of computer time and core memory.

* Also: Department of Industrial Engineering and OR, S.W. Mudd Bldg., Columbia University, New York, NY 10027, U.S.A. This research was partially supported by the Trading Corporation and partially by the Hermes Transport., (212)2746530/TITT 421379, Broadway, New York, U.S.A.

Three general approaches to the problem have been proposed: (i) dynamic programming, (ii) powerful branch-and-bound methods, and (iii) relaxation and dual techniques [2,5,7,8,11,4]. In case (i) it is not possible to find an optimal tour even when $n \ge 25$, because the corresponding dynamic program runs in time $O(2^n n^2)$ [8]. But in cases (ii) and (iii) it may be possible to compute an optimal solution when we use the idea of next-best solutions [13] of the assignment problem (AP) or relaxation methods [4]. The next-best solutions can easily be determined if all the assignments can be arranged as a sequence in increasing order of cost. In this paper, we will deal only with case (ii).

In [12] some numerical examples of this technique are given as well as a practical distribution pertaining to some average computer times. For instance, the probability that a 40-city problem would require 64 minutes or more was estimated to be 0.023. The present paper also takes a probabilistic approach. Making the usual assumption that all feasible solutions of the AP are equally likely, we give the exact distribution, which is quite different from that found in [12]. This assumption becomes unnecessary when the data are random, drawn from a common distribution uniform in (0,1), and it is not a strong one in any other non-pathological case. Consequently, a 40-city problem requires 2 sec. with probability 0.023 and only 4 sec. with probability 0.009.

In this paper we prove that for every n, there is an algorithm such that (i) with probability $1 - s$, $s \in (0,1)$, the algorithm produces an optimal tour, (ii) it runs in time less than $O(n^4/3)$, and (iii) it requires
$$w((w + n - 1)\log(w + n - 1) + w + 1) + \tfrac{1}{6}w(n^3 + 3n^2 + 2n - 6)$$
North-Holland Publishing Company European Journal of Operational Research 9 (1982) 77-82
computational steps, where
0377-2217/82/000a-0000/$02.75 © 1982 North-Holland
J.-C. Panayiotopoulos / Traveling salesman problem
$$w = \log(s)/\log(1 - E_n), \qquad E_n \in (0,1).$$
The probability $E_n$ is given by the following simple mathematical formula:

$$E_n = 1/((1/E_{n-1}) + (1/(E_{n-2}(n-2))))$$

where $E_n$ is the ratio $(n-1)!/R_n$, and $R_n$ is the number of the feasible solutions of the corresponding AP:

$$R_n = (n-1)(R_{n-1} + R_{n-2}).$$

Thus, it is possible for us to find exact results on asymmetric non-pathological problems with up to 350 cities; for instance, if n = 300 and $10^{-8}$ min. is the computer-basis (per step) for a CDC or IBM System, then the execution time is not more than 27 minutes.

2. Assignment problem and traveling salesman problem

Consider the following AP:

$$\text{minimize } \sum_{i}\sum_{j} a_{ij}x_{ij}, \qquad i,j \in \{1,2,\ldots,n\}$$

subject to

$$\sum_{i} x_{ij} = 1 \ (\text{all } j), \qquad \sum_{j} x_{ij} = 1 \ (\text{all } i), \qquad x_{ij} \in \{0,1\}, \qquad a_{ii} = \infty \ (\text{all } i) \tag{1}$$

where $a_{ij} \ge 0$ is the cost of going from city i to city j, and $x_{ij}$ is the corresponding decision variable. Obviously, the first tour which appears among the $w = 1, 2, \ldots$ next-best solutions of problem (1) is an optimal tour. Consequently, the following question arises: "What is the exact value of the upper bound of w that gives an optimal tour with probability $1 - s$, $s \in (0,1)$?"

Lemma 2.1. If $R_n$ is the number of the feasible solutions of problem (1) having finite cost, then the following formula holds:

$$R_n = (n-1)(R_{n-1} + R_{n-2}).$$

Proof. In constructing an optimum assignment for $R_n$ either (a) city n is involved in a two-city subassignment, or (b) city n is involved in a 3 or more city subassignment. In case (a) there are $(n-1)$ cities with which it can form the subassignment, and the rest of the cities form an $(n-2)$-city subassignment. In case (b), the n-city assignment can be constructed from an $(n-1)$-city assignment by choosing any of the $(n-1)$ cities, cutting the outgoing arc leaving it and inserting city n in the assignment. Obviously, $R_0 = 1$ (by definition), $R_1 = 0$ and $R_2 = 1$. Consequently the following equalities hold:

$$R_3 = (3-1)R_1 + (3-1)(3-2)R_0 = 2,$$
$$R_4 = (4-1)R_2 + (4-1)(4-2)R_1 + (4-1)(4-2)(4-3)R_0 = 9,$$
$$R_n = (n-1)R_{n-2} + (n-1)(n-2)R_{n-3} + \cdots + (n-1)!\,R_0.$$

Therefore,

$$\frac{R_n}{(n-1)!} = \sum_{h=0}^{n-2} \frac{R_h}{h!}.$$

Owing to this the following relation is true:

$$R_n/(n-1)! = (R_{n-1}/(n-2)!) + (R_{n-2}/(n-2)!).$$

Hence the formula $R_n = (n-1)(R_{n-1} + R_{n-2})$.

The number of all tours is equal to $(n-1)!$. We define the ratio $E_n = (n-1)!/R_n$. At this point, it must be noted that the only known result in the literature which is concerned with the ratio $E_n$ is the empirical value $e/n$ [10].

Lemma 2.2. Any solution of problem (1) has probability $E_n$ to be a tour, where

$$E_n = 1, \qquad n \le 3,$$
$$E_n = 1/((1/E_{n-1}) + (1/(E_{n-2}(n-2)))), \qquad n \ge 4.$$

Proof. Based on Lemma 2.1, we get

$$E_n = (n-1)!/R_n$$
$$= (n-1)!/((n-1)(R_{n-1} + R_{n-2}))$$
$$= 1/((R_{n-1}/(n-2)!) + (R_{n-2}/(n-2)!))$$
$$= 1/((1/E_{n-1}) + (1/(E_{n-2}(n-2)))), \qquad n \ge 4.$$

Finally, it is obvious that the relation $E_n = 1$, $n \le 3$ holds.
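The recurrences of Lemmas 2.1 and 2.2 are straightforward to evaluate numerically. The following Python sketch is illustrative only: the function names `feasible_counts` and `tour_probability` are hypothetical and not part of the paper, and exact rational arithmetic is used merely to avoid rounding questions.

```python
from fractions import Fraction
from math import factorial


def feasible_counts(n_max):
    """Return [R_0, ..., R_n_max] using R_n = (n-1)(R_{n-1} + R_{n-2}),
    with R_0 = 1 and R_1 = 0 as stated in the proof of Lemma 2.1."""
    counts = [1, 0]
    for n in range(2, n_max + 1):
        counts.append((n - 1) * (counts[n - 1] + counts[n - 2]))
    return counts[: n_max + 1]


def tour_probability(n):
    """Return E_n as an exact fraction via the recurrence of Lemma 2.2:
    E_n = 1 for n <= 3, and
    E_n = 1 / (1/E_{n-1} + 1/(E_{n-2} (n-2))) for n >= 4."""
    if n <= 3:
        return Fraction(1)
    e_prev2, e_prev1 = Fraction(1), Fraction(1)  # E_2 and E_3
    for k in range(4, n + 1):
        e_cur = 1 / (1 / e_prev1 + 1 / (e_prev2 * (k - 2)))
        e_prev2, e_prev1 = e_prev1, e_cur
    return e_prev1


if __name__ == "__main__":
    print(feasible_counts(4))            # R_0 .. R_4
    print(float(tour_probability(50)))   # compare with the ratio (n-1)!/R_n
```

Both routes to $E_n$ (the recurrence and the ratio $(n-1)!/R_n$) can be compared directly, which is a convenient sanity check on the derivation.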
Table 1
The probability of a feasible solution of the AP being a tour (n: number of cities; E_n: the corresponding probability)

  n    E_n       n    E_n       n    E_n       n    E_n
 10   0.271    110   0.024    210   0.012    310   0.008
 20   0.135    120   0.022    220   0.012    320   0.008
 30   0.090    130   0.020    230   0.011    330   0.008
 40   0.067    140   0.019    240   0.011    340   0.008
 50   0.054    150   0.018    250   0.010    350   0.007
 60   0.045    160   0.017    260   0.010    360   0.007
 70   0.038    170   0.016    270   0.010    370   0.007
 80   0.033    180   0.015    280   0.009    400   0.006
 90   0.030    190   0.014    290   0.009    450   0.006
100   0.027    200   0.013    300   0.009    500   0.005

By making use of Lemma 2.2, we obtain the probabilities listed in Table 1. For instance, when n = 50, then $E_{50} = 0.054$, i.e., a feasible solution of problem (1) may be a tour with probability 5.4%. Making the assumption that all feasible solutions of (1) are equally likely, we get the following result:
Theorem 2.3. If $[a]$ denotes the integer part of any real number $a$, $s \in (0,1)$ is a given number, and

$$w \ge [\log(s)/\log(1 - E_n)],$$

then the following relation holds:

$$P(\text{there does not exist a tour among the } w \text{ next-best solutions}) \le s.$$

Proof. The number w is a non-negative integer. Therefore,

$$w \ge \log(s)/\log(1 - E_n),$$
$$w \log(1 - E_n) \le \log(s),$$
$$(1 - E_n)^w \le s,$$
$$P \le s,$$

because $(1 - E_n)^w < 1$ and $s < 1$.

Therefore, the ratio $w = [\log(s)/\log(1 - E_n)]$ gives the exact value of w such that at least one tour exists among the w next-best solutions of problem (1), with certainty $1 - s$. Table 2 shows some representative values for certainties 99.1%, 90% and 50%. However, only some pathological examples require values of s as small as 0.009. The usual phenomenon is to find a tour with s = 0.5. In addition, if we use a corresponding branch-and-bound method, then it is not always necessary to find all the required next-best solutions of problem (1). For instance, if the distances $a_{ij}$ are drawn independently from a common distribution uniform in (0,1), then we can use the bound

$$\mathrm{Cost(TSP)} \ge (2 \cdot \mathrm{Cost(AP)} - 3).$$

This is an immediate implication of the inequalities

$$\mathrm{Cost(TSP)} \ge \mathrm{Cost(AP)}, \qquad \mathrm{Cost(AP)} \le 3.$$

However, the distribution of Table 2 remains as a criterion regarding the upper bounds w to solve the AP for the general case of TSP.
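The bound of Theorem 2.3 can be evaluated with a few lines of Python. This is a minimal sketch under stated assumptions: the names `tour_probability`, `required_solutions` and `miss_probability` are hypothetical, floating-point arithmetic is used for $E_n$, and the integer part is taken with `math.floor`, which appears to match the entries of Table 2 that were checked. As an observation of ours rather than a statement from the text, rounding up instead of down is what strictly guarantees $(1 - E_n)^w \le s$.

```python
import math


def tour_probability(n):
    """E_n from the recurrence of Lemma 2.2 (floating point)."""
    if n <= 3:
        return 1.0
    e_prev2, e_prev1 = 1.0, 1.0  # E_2 and E_3
    for k in range(4, n + 1):
        e_cur = 1.0 / (1.0 / e_prev1 + 1.0 / (e_prev2 * (k - 2)))
        e_prev2, e_prev1 = e_prev1, e_cur
    return e_prev1


def required_solutions(n, s):
    """Integer part of log(s) / log(1 - E_n), for n >= 4 and 0 < s < 1."""
    if n < 4 or not 0.0 < s < 1.0:
        raise ValueError("need n >= 4 and 0 < s < 1")
    return math.floor(math.log(s) / math.log(1.0 - tour_probability(n)))


def miss_probability(n, w):
    """(1 - E_n)^w: probability that none of w solutions is a tour,
    under the assumption that all feasible solutions are equally likely."""
    return (1.0 - tour_probability(n)) ** w


if __name__ == "__main__":
    for n in (20, 50, 200):
        print(n, [required_solutions(n, s) for s in (0.009, 0.1, 0.5)])
```

For example, `required_solutions(50, 0.1)` can be compared with the entry for n = 50 and s = 0.1 in Table 2, and `miss_probability` shows how close the resulting w comes to the target s.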
Table 2

n                               20    50    80   100   150   200   250   300   350   400
s = 0.009 (certainty 99.1%)  w  32    84   136   171   257   344   431   517   604   691
s = 0.1   (certainty 90%)    w  15    41    66    83   126   168   210   253   295   337
s = 0.5   (certainty 50%)    w   4    12    20    25    37    50    63    76    88   100
3. Analysis of the execution time of finding an optimal tour
Let $TA(n)$ denote the number of steps required to find an optimal solution of problem (1).

Lemma 3.1. For every n there is an algorithm such that (i) with probability $1 - s$, $s \in (0,1)$, the algorithm produces an optimal tour, and (ii) it requires $T(s,n)$ steps, where

$$T(s,n) \le f(n,w) + g(n,w) = \sum_{h=2}^{n} TA(h)\,w + \sum_{p=1}^{w} \binom{pn - 2p + 1}{2}, \qquad w = \log(s)/\log(1 - E_n). \tag{2}$$

Proof. Suppose $AS^1$ is the set of the arcs of the optimal solution of problem (1): $AS^1 = \{(1,j_1), (2,j_2), \ldots, (n,j_n)\}$. Based on [13], we can easily find a next-best solution by solving the following sub-problems:

$$AS_k = \{(1,j_1), (2,j_2), \ldots, (k-1,j_{k-1}); \overline{(k,j_k)}\}, \qquad k = 1, 2, \ldots, n-1,$$

where the arc $(i,j_i)$ means that we strike off the row i and the column $j_i$, and the arc $\overline{(i,j_i)}$ means that we set $a_{ij_i} = \infty$ (or the largest number of our computer, e.g. $a_{ij_i} = 10^{30}$). Also, it has been proved in [13] that the $AS_k$ are mutually disjoint, that none of them contains $AS^1$, and that the minimal cost sub-assignment $AS_k$ gives the next-best solution. In order to find the set $AS^3$, we have to partition $AS_k$ by $AS^2$ and to continue the method as before. But now, we have to find the minimal cost among the new sub-assignments and the old ones, except the $AS_k$ which we have already used. Therefore, it is easy to see that to find the first next-best solution we have to make $\binom{n-1}{2}$ comparisons, for the second solution $\binom{2n-3}{2}$ comparisons, and for the w-th solution $\binom{wn - 2w + 1}{2}$ comparisons. On the other hand, the maximum computational effort required to generate a next-best solution is that of solving at most $(n-1)$ sub-assignment problems. Therefore,

$$T(s,n) \le \sum_{h=2}^{n} TA(h)\,w + \sum_{p=1}^{w} \binom{pn - 2p + 1}{2}.$$

The remaining work is dominated by the estimation of $g(n,w)$:

$$g(n,w) = \tfrac{1}{2}n^2 \sum_{p=1}^{w} p^2 - \tfrac{1}{2}n \sum_{p=1}^{w} p(4p-1) + \tfrac{1}{2}\sum_{p=1}^{w} 2p(2p-1)$$
$$= \tfrac{1}{4}n^2 w(w+1) - 2(1-n)\sum_{p=1}^{w} p^2 + \tfrac{1}{2}(n-2)\sum_{p=1}^{w} p$$
$$= \tfrac{1}{12}\left(8w^3(n-1) + 3w^2(n^2 + 5n - 6) + w(3n^2 + 7n - 10)\right).$$

Finally, based on Theorem 2.3, we have that $w = \log(s)/\log(1 - E_n)$.

Lemma 3.2. The function $f(n,w)$ of (2) is given by the following polynomial:

$$f(n,w) = \tfrac{1}{6}w(n^3 + 3n^2 + 2n - 6).$$

Proof. The assignment problem requires no more than $\tfrac{1}{2}n(n+1)$ computational steps [1]. Therefore,

$$f(n,w) \le w\sum_{h=2}^{n} \tfrac{1}{2}h(h+1)$$
$$= \tfrac{1}{2}\left(\left(\tfrac{1}{2}n(n+1) - 1\right) + \left(\tfrac{1}{6}n(n+1)(2n+1) - 1\right)\right)w$$
$$= \tfrac{1}{6}w(n^3 + 3n^2 + 2n - 6).$$

Theorem 3.3. For every n there is an algorithm such that (i) with probability $1 - s$, $s \in (0,1)$, the algorithm produces an optimal tour, and (ii) it requires less than $H(n,w)$ steps:

$$H(n,w) = w((w + n - 1)\log(w + n - 1) + w + 1) + f(n,w)$$

where $w = \log(s)/\log(1 - E_n)$ and $f(n,w)$ is given by Lemma 3.2.

Proof. There are several sorting methods in the literature which require $(n \log n)$ computational steps. Suppose that for each next-best solution we store only the w sub-assignments which give the minimal costs. This is possible if at each solution we sort the w old costs and the $n - 1$ new ones. Therefore $(w + n - 1)\log(w + n - 1)$ steps are required for each solution. After that, it is necessary to store the first w ordered costs among them and to remove from the list the first of them, which is the minimum cost ($AS_k$). Therefore, there are

$$(w + n - 1)\log(w + n - 1) + w + 1$$

computational steps. On the other hand, we must repeat the whole activity for every next-best solution, i.e., $(w - 1)$ times. Therefore, we get instead of the function $g(n,w)$ of (2) the new bound

$$w((w + n - 1)\log(w + n - 1) + w + 1).$$

Based on Theorem 2.3 and Lemmas 3.1, 3.2 we have the formula of $H(n,w)$.
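The step counts of this section can be tabulated with a short Python sketch. It is illustrative only: the function names are hypothetical, the base of the logarithm is not specified in the text and base 2 is assumed here, and `comparisons_direct` simply evaluates the binomial sum of (2) term by term rather than using a closed form.

```python
import math


def ap_steps(n, w):
    """f(n, w) = w (n^3 + 3n^2 + 2n - 6) / 6, the polynomial of Lemma 3.2.
    n^3 + 3n^2 + 2n = n(n+1)(n+2) is divisible by 6, so the division is exact."""
    return w * (n**3 + 3 * n**2 + 2 * n - 6) // 6


def ap_steps_direct(n, w):
    """The same quantity as the sum w * sum_{h=2}^{n} h(h+1)/2."""
    return w * sum(h * (h + 1) // 2 for h in range(2, n + 1))


def comparisons_direct(n, w):
    """Binomial sum of (2): sum_{p=1}^{w} C(pn - 2p + 1, 2)."""
    return sum(math.comb(p * n - 2 * p + 1, 2) for p in range(1, w + 1))


def sort_bound(n, w, log=math.log2):
    """w ((w+n-1) log(w+n-1) + w + 1); log base 2 is an assumption."""
    m = w + n - 1
    return w * (m * log(m) + w + 1)


def total_bound(n, w, log=math.log2):
    """H(n, w) of Theorem 3.3: sorting bound plus f(n, w)."""
    return sort_bound(n, w, log) + ap_steps(n, w)


if __name__ == "__main__":
    n, w = 300, 517
    print(ap_steps(n, w), total_bound(n, w))
```

For the sizes discussed in the paper the polynomial term $f(n,w)$ dominates $H(n,w)$; the sorting term contributes only a small fraction of the total.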
4. The algorithm

Having said a good deal about the existence of an algorithm and a little about computational matters, we now have to sketch the corresponding algorithm in broad terms. The flow in the following is such that one proceeds to the next line unless told otherwise.

(A) Initialization.
(1) Set M = 0, F = 1.
(2) The n, $a_{ij}$, i, j = 1, 2, ..., n are given.
(3) The probability s is given.
(4) The list L is empty.

(B) Processing.
(5) Compute the probability $E_n$.
(6) Compute the number w.
(7) Find the set $AS^F$.
(8) If the set $AS^F$ is a tour then set M = 1.
(9) Go to (16).
(10) Solve the sub-assignments $AS_k$.

(C) Forward step.
(11) Store in L the costs of $AS_k$.
(12) Sort the costs of L and keep only the w minimal among them.
(13) Remove from the list L the minimum cost.
(14) Set F = F + 1.
(15) Go to (7).

(D) Backward step.
(16) If M = 1, terminate.
(17) If F > w, stop.
(18) Go to (10).

Theorem 4.1. The proposed algorithm runs in time less than $O(\tfrac{1}{3}n^4)$.

Proof. By using Theorem 3.3, we know that the proposed algorithm requires $H(n,w)$ computational steps. On the other hand, observing the results of Table 2, we get the strict inequality $w < 2n$. Therefore,

$$H(n,w) < (3n-1)\log(3n-1) + 2n + 1 + \tfrac{1}{3}(n^4 + 3n^3 + 2n^2 - 6n)$$
$$< 9n^2 + 2n + \tfrac{1}{3}(n^4 + 3n^3 + 2n^2 - 6n)$$
$$= \tfrac{1}{3}(n^2 + 3n + 29)n^2.$$

Hence the time $O(\tfrac{1}{3}n^4)$.

On the basis of the algorithm described, we solved 10 asymmetric problems with s = 0.1 and n = 200. The maximum execution time with an IBM/370 system and Fortran H was not more than 11 minutes. It must be noted that the execution times of six among the ten problems were not more than 2 minutes. Finally, we solved 5 problems with distances $a_{ij}$ uniformly distributed in (0,100), and s = 0.5, n = 350. The mean of the execution times was equal to 14 min., but one of them required s = 0.3.
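The flow of steps (1)-(18) can be illustrated on a toy instance. The sketch below is not the paper's procedure: instead of generating next-best assignments with the ranking method of [13], it enumerates all assignments without self-loops by brute force and sorts them by cost, which is feasible only for very small n. All names (`is_tour`, `ranked_assignments`, `first_tour`) and the cost matrix are hypothetical.

```python
from itertools import permutations


def is_tour(perm):
    """True if the assignment i -> perm[i] forms one cycle through all cities."""
    city, arcs = perm[0], 1
    while city != 0:
        city = perm[city]
        arcs += 1
    return arcs == len(perm)


def assignment_cost(cost, perm):
    """Total cost of the assignment i -> perm[i]."""
    return sum(cost[i][perm[i]] for i in range(len(perm)))


def ranked_assignments(cost):
    """Brute-force stand-in for the ranking of [13]: every assignment with
    perm[i] != i (i.e. a_ii = infinity), in order of increasing cost."""
    n = len(cost)
    feasible = [p for p in permutations(range(n))
                if all(p[i] != i for i in range(n))]
    return sorted(feasible, key=lambda p: assignment_cost(cost, p))


def first_tour(cost, w):
    """Scan at most w best assignments (counter F = 1, 2, ..., w) and return
    (F, assignment, cost) for the first tour found, or None if there is none."""
    for f, perm in enumerate(ranked_assignments(cost)[:w], start=1):
        if is_tour(perm):
            return f, perm, assignment_cost(cost, perm)
    return None


if __name__ == "__main__":
    # Toy 4-city instance: the cheapest assignment is two 2-city loops.
    cost = [
        [0, 1, 9, 9],
        [1, 0, 9, 9],
        [9, 9, 0, 1],
        [9, 9, 1, 0],
    ]
    print(first_tour(cost, w=1))  # no tour among the first assignment only
    print(first_tour(cost, w=4))  # the first tour appears at F = 2
```

On this instance the best assignment is not a tour, so the scan stops at the second-best solution; with w = 1 the search gives up, which mirrors the stop in step (17).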
5. Conclusion

In this paper we have followed the idea of using a 'solution generator' for the AP as a means for obtaining an optimal solution to the corresponding asymmetric TSP. We think that the contribution here is a better and more favorable description of the technique within a conventional probability model. Although it may seem that Theorem 4.1 gives the required polynomial algorithm for the TSP, we remind the reader that the proposed analysis and method are probabilistic and not a deterministic technique. On the other hand, if the given distances $a_{ij}$ are random numbers, it seems that the bound w is an absolute upper one with probability $1 - s$. In this case we can reduce the execution time to a third, because we can store only $w - h$ costs for each $h = 1, 2, \ldots, w - 1$ next-best solution. But this strategy is a risky one when we have to find an optimal tour of a real-world asymmetric problem. We do not suggest this policy, because a few pathological problems require $w + \varphi$, $\varphi = 1, 2, \ldots, v$ next-best solutions of problem (1). The estimation of the number v is an interesting open question.

Some of the technical results in analysing or proving the propositions may be of independent interest. Among these results is the execution time required to find w next-best solutions of any problem which is based on the AP. Table 3 shows some representative values of execution times required to find w = 30 or 100 next-best solutions of the assignment problem (1). However, this result is a deterministic one and extends the historical works of Balinski, Gomory and Murty [1,13]. Another result is the exact value of $R_n$; this number gives the exact number of permutations which have no loops. Our research is going on for the m-TSP, m > 1.

Table 3
Execution times (in minutes) to find w = 30 or 100 next-best solutions of the assignment problem with no loops; the computer-basis is 10^-8 minute per step

n           50     100     200     300     400     500    1000
w = 30    0.001   0.015   0.132   0.450   1.05    2.07    16.5
w = 100   0.006   0.050   0.440   1.50    3.50    6.90    55.0
References

[1] M.L. Balinski and R.E. Gomory, A primal method for the assignment and transportation problems, Management Sci. 10 (3) (1964) 578-593.
[2] R.E. Bellman, Dynamic programming treatment of the traveling salesman problem, J. ACM 9 (1962) 61-63.
[3] M. Bellmore and G.L. Nemhauser, The traveling salesman problem: a review, Operations Res. 16 (1968) 538-558.
[4] E. Balas and N. Christofides, Relaxation and TSP, Management Science Research Report No. 439, Graduate School of IA, Carnegie-Mellon University (1979).
[5] M.R. Garey, R.L. Graham and D.S. Johnson, Some NP-complete geometric problems, Proc. 8th ACM Symposium on Theory of Computing (1976) 10-29.
[6] B.L. Golden, A statistical approach to the TSP, Networks 7 (1977) 209-226.
[7] K.H. Hansen and J. Krarup, Improvements of the Held-Karp algorithm for the symmetric traveling salesman problem, Math. Programming 7 (1974) 87-96.
[8] M. Held and R.M. Karp, A dynamic programming approach to sequencing problems, SIAM J. 10 (1962) 196-210.
[9] R.M. Karp, Probabilistic analysis of partitioning algorithms for the traveling salesman problem in the plane, Math. Operations Res. 2 (3) (1977) 209-224.
[10] R.M. Karp, Reducibility among combinatorial problems, in: R.E. Miller and J.W. Thatcher, Eds., Complexity of Computer Computations (Plenum, New York, 1972).
[11] E.L. Lawler and D.E. Wood, Branch-and-bound methods: a survey, Operations Res. 14 (1966) 699-719.
[12] J.D.C. Little, K.G. Murty, D.W. Sweeney and C. Karel, An algorithm for the traveling salesman problem, Operations Res. 11 (1963) 972-989.
[13] K.G. Murty, An algorithm for ranking all the assignments in order of increasing cost, Operations Res. 16 (1968) 682-687.
[14] G.L. Thompson, Algorithmic and computational methods for solving symmetric and asymmetric traveling salesman problems, presented at the Workshop on IP, Bonn (1975).