Transpn. Res.-B Vol. 18B, No. 2, pp. 135-145, 1984. Printed in Great Britain.
0191-2615/84 $3.00 + .00 Pergamon Press Ltd.

THE LENGTH OF TOURS IN ZONES OF DIFFERENT SHAPES†

CARLOS F. DAGANZO
Department of Civil Engineering, University of California, Berkeley, CA 94720, U.S.A.

(Received 7 November 1982; in revised form 3 August 1983)

Abstract-The object of this paper is to explain how the expected length of traveling salesman tours changes with zone shape. To do this, a simple strategy that yields good traveling salesman tours is presented. The resulting tours are suboptimal but appear to be close to those that can be obtained by hand. Thus, the formulas that are provided may also be indicative of the length of tours built with better strategies. The results of this paper are useful for the design of distribution systems.
1. INTRODUCTION

Most of the research on traveling salesman tours can be classified into two rather well-defined categories. Prescriptive research efforts attempt to derive algorithms for the construction of optimal or near-optimal tours, while descriptive efforts attempt to give length formulas under different conditions. Eilon et al. (1971) provide a good summary of research in both areas. Despite all this (and more recent) research, little seems to have been done to explore the impact that zone shape has both on good tour-building strategies and on tour length. This paper attempts to fill this gap because understanding the subtle interplay of point density, zone shape, and tour length can help in several practical problems.

We shall be concerned with N points located in a connected region of a plane, $\mathcal{A}$, in which distances are given by either a Euclidean (straight line) or $L_1$ (grid) metric. It is well known that if the area of $\mathcal{A}$ is A, and the N points are uniformly and independently scattered, the expected tour length, D, is:

$$D \cong \phi\sqrt{NA}, \tag{1}$$

where $\phi$ is an unknown constant. This is true for both metrics, and for the Euclidean, $\phi$ is believed to be 0.75. It is striking to see how well eqn (1) holds for circular and square areas with small N. Even for N = 2, when the approximate value is $1.06\sqrt{A}$, the exact value is $1.04\sqrt{A}$ for squares and $1.03\sqrt{A}$ for circles.‡ For zones of elongated shape, however, the formula underpredicts the tour distance when N is small.

The next sections develop an approximate formula for expected tour length in zones of irregular shape that will apply at all point density levels. The approximation overpredicts somewhat the expected length of the shortest tour because it is based on a suboptimal tour-building strategy. Nevertheless, tours constructed by humans appeared to be only marginally shorter than those derived from the strategy.
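The N = 2 comparison for a square can be checked with a short simulation. The sketch below is illustrative only: the function names, sample size, and seed are assumptions made here and are not part of the paper. It evaluates eqn (1) with $\phi = 0.75$ and estimates the out-and-back distance between two random points in a unit square by sampling.

```python
import math
import random

def eqn1_tour_length(n_points, area, phi=0.75):
    """Eqn (1): D ~ phi * sqrt(N * A); phi = 0.75 is the conjectured Euclidean value."""
    return phi * math.sqrt(n_points * area)

def two_point_tour_in_unit_square(samples=200_000, seed=1):
    """Sampling estimate of the N = 2 tour (out and back) in a unit square."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        dx = rng.random() - rng.random()
        dy = rng.random() - rng.random()
        total += 2.0 * math.hypot(dx, dy)   # a two-point tour covers the distance twice
    return total / samples

approx = eqn1_tour_length(2, 1.0)
simulated = two_point_tour_in_unit_square()
print(f"eqn (1): {approx:.2f} sqrt(A)   simulated: {simulated:.2f} sqrt(A)")
```

With A = 1 the two printed values can be read directly as multiples of $\sqrt{A}$.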
2. ONE-DIRECTIONAL TRIPS IN A STRIP
Consider an infinitely long strip of width w, containing uniformly, randomly scattered points with a density of $\delta$ points per unit area. Consider also a path which visits all the points by moving along the strip without backtracking (top to bottom on Fig. 1).

†Research supported by NSF Grant CEE81-11681 to the University of California, Berkeley, U.S.A.
‡The exact values are obtained by doubling the expected distance between two random points in a circle and a square. See Fairthorne (1965) for the derivations.
Fig. 1. Unidirectional trips in a strip of width w: (a) Euclidean distances; (b) $L_1$ distances.
The expected total length of a section of the path containing N points, $D_w$, is given by:

$$D_w = N d_w, \tag{2}$$

where $d_w$ is the expected distance between two consecutive points. If we let X denote the random distance between two consecutive points along the width of the strip, we can write:

$$\Pr\{X > x\} = \left(1 - \frac{x}{w}\right)^2, \quad 0 \le x \le w, \tag{3a}$$

and

$$E(X) = w/3. \tag{3b}$$

This is because X has the same distribution as the distance between two random points on a segment of length w. Letting Y denote the random distance between two consecutive points along the side of the strip, we have:

$$\Pr\{Y > y\} = e^{-\delta w y}, \quad y \ge 0. \tag{4}$$

This occurs because with uniformly, independently, and randomly scattered points on the strip, the positions along the side of the strip at which points lie form (locally) a Poisson process with rate $\delta w$.† Thus,

$$E(Y) = (\delta w)^{-1}. \tag{5}$$
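Equations (3b) and (5) can be checked by scattering points in a long strip and measuring the gaps between points that are consecutive in the along-strip order. This is a minimal sketch under stated assumptions: a fixed number of uniform points in a finite strip stands in for the Poisson scatter, and the function name, parameter values, and seed are hypothetical.

```python
import random

def simulate_strip(delta, w, n_points=100_000, seed=7):
    """Scatter points uniformly in a strip and return (mean X, mean Y).

    The strip length is chosen so that the density is delta points per unit
    area. X and Y are the across-strip and along-strip gaps between points
    that are consecutive in the along-strip order.
    """
    rng = random.Random(seed)
    length = n_points / (delta * w)
    # Each point is (along-strip position, across-strip position); sorting the
    # tuples orders the points by their position along the strip.
    pts = sorted((rng.uniform(0.0, length), rng.uniform(0.0, w))
                 for _ in range(n_points))
    gaps = n_points - 1
    mean_x = sum(abs(b[1] - a[1]) for a, b in zip(pts, pts[1:])) / gaps
    mean_y = sum(b[0] - a[0] for a, b in zip(pts, pts[1:])) / gaps
    return mean_x, mean_y

delta, w = 2.0, 1.5
mean_x, mean_y = simulate_strip(delta, w)
print(f"E(X): simulated {mean_x:.3f}, eqn (3b) {w / 3:.3f}")
print(f"E(Y): simulated {mean_y:.3f}, eqn (5)  {1 / (delta * w):.3f}")
```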
Equations (3-5) are now used to calculate $d_w$. The expected distance between two consecutive points is given by:

$$d_w = E_{X,Y}(X + Y) \tag{6a}$$

for the $L_1$ metric, and

$$d_w = E_{X,Y}\left((X^2 + Y^2)^{1/2}\right) \tag{6b}$$

for the Euclidean metric.

†This is one of the definitions of a Poisson process.
Fig. 2. Comparison of Euclidean and $L_1$ distances.
Because eqn (6a) is linear, it facilitates calculations:

$$d_w = \frac{w}{3} + \frac{1}{\delta w}, \quad \text{for the } L_1 \text{ metric.} \tag{7a}$$

Equation (6b) can be evaluated numerically using both the distributions of X and Y. However, for our ultimate goal, we need a simpler expression, which can be obtained using some approximations (see appendix):

$$d_w \cong \frac{w}{3} + \frac{1}{\delta w}\,\psi(\delta w^2), \quad \text{for the Euclidean metric,} \tag{7b}$$

with $\psi(x) = (2/x^2)[(1 + x)\log(1 + x) - x]$. The function $\psi(x)$, which influences the difference between the Euclidean and $L_1$ average distances, is plotted on Fig. 2. As should happen, the ratio, $\gamma$, between the approximate Euclidean and $L_1$ distances, $\gamma = (3\psi(\delta w^2) + \delta w^2)/(3 + \delta w^2)$, which is also plotted on Fig. 2, approaches 1 for both $w \to 0$ and $w \to \infty$. It is interesting to note that for a wide range of $\delta w^2$ values, the ratio is close to 0.79, which is the same as for randomly selected pairs of points (Fairthorne, 1965).
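The closed forms (7a) and (7b), the function $\psi$, and the ratio $\gamma$ are easy to tabulate. The sketch below also estimates eqn (6b) by sampling X and Y from the distributions of eqns (3a) and (4), so that the approximation (7b) can be compared with it. All function names, the sample size, and the seed are assumptions introduced for illustration.

```python
import math
import random

def psi(x):
    """psi(x) = (2/x^2) [(1 + x) log(1 + x) - x]; tends to 1 as x -> 0."""
    return (2.0 / x**2) * ((1.0 + x) * math.log1p(x) - x)

def d_l1(delta, w):
    """Eqn (7a): expected L1 distance between consecutive points."""
    return w / 3.0 + 1.0 / (delta * w)

def d_euclid_approx(delta, w):
    """Eqn (7b): approximate Euclidean distance between consecutive points."""
    return w / 3.0 + psi(delta * w**2) / (delta * w)

def gamma(x):
    """Ratio of eqn (7b) to eqn (7a) as a function of x = delta * w^2."""
    return (3.0 * psi(x) + x) / (3.0 + x)

def d_euclid_monte_carlo(delta, w, samples=200_000, seed=3):
    """Sampling estimate of eqn (6b), using the distributions of X and Y."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        x = w * abs(rng.random() - rng.random())   # Pr{X > x} = (1 - x/w)^2
        y = rng.expovariate(delta * w)             # exponential, rate delta*w
        total += math.hypot(x, y)
    return total / samples

for x in (0.001, 1.0, 3.0, 10.0, 1e5):
    print(f"delta*w^2 = {x:>8}: psi = {psi(x):.3f}, gamma = {gamma(x):.3f}")

delta, w = 1.0, math.sqrt(3.0)
approx = d_euclid_approx(delta, w)
sampled = d_euclid_monte_carlo(delta, w)
print(f"L1 {d_l1(delta, w):.3f}, eqn (7b) {approx:.3f}, sampled eqn (6b) {sampled:.3f}")
```

The sampled value should sit slightly above eqn (7b), since the appendix approximation replaces the square root by a quantity that is smaller by at most a few percent.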
3. TOUR LENGTH IN ZONES OF DIFFERENT SHAPES
3.1 The optimal width of the strip

Consider a zone, $\mathcal{A}$, containing a large number of points, N, and imagine we cut a swath of approximate width, w, covering the whole zone. Figure 3 illustrates two possible patterns for covering a square with a swath of width $\sqrt{A}/6$. If the pattern of the swath is selected independently of the specific location of the points, it is possible to determine the expected length of a tour covering all the points when the strategy of Section 2 (see Fig. 1) is used while moving forward (or backwards) along the swath. If we ignore the turns of the swath, the distance between two consecutive points is $d_w$, and the expected length of the tour must be: $D = N d_w$.
Fig. 3. Two square-covering swath patterns.
It should be noted (see eqn (7)) that if w is selected too small, $d_w$ will be large because the density of points, $\delta w$, along the swath is small. On the other hand, if w is selected too large, the zigzag deviations along the swath (distances comparable with w) will be unnecessarily large. The right balance can be struck by finding the value, $w^*$, that minimizes eqn (7). It is:

$$w^* = \sqrt{3/\delta}. \tag{8}$$

This is an exact value for the $L_1$ metric but only approximate for the Euclidean metric (the exact value is $w^* = \sqrt{2.95/\delta}$). The approximation is sufficiently accurate because $d_w$ is rather flat around its minimum. Note that because $\delta w^{*2} = 3$, a square with side equal to $w^*$ should on the average contain 3 points. If a swath of width $w^*$ can be cut, the resulting tours have expected interpoint distances:

$$d^* = 1.15\,\delta^{-1/2} \quad \text{for the } L_1 \text{ metric,} \tag{9a}$$

and

$$d^* = 0.90\,\delta^{-1/2} \quad \text{for the Euclidean metric.} \tag{9b}$$
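The optimal widths and the coefficients of eqn (9) can be recovered numerically by minimizing eqns (7a) and (7b) over w. The following is a sketch that assumes $\delta = 1$ (so that lengths are in units of $\delta^{-1/2}$) and uses a plain golden-section search; the search routine, its bracket, and the function names are choices made here, not the paper's.

```python
import math

def psi(x):
    return (2.0 / x**2) * ((1.0 + x) * math.log1p(x) - x)

def d_l1(w, delta=1.0):
    return w / 3.0 + 1.0 / (delta * w)                 # eqn (7a)

def d_euclid(w, delta=1.0):
    return w / 3.0 + psi(delta * w**2) / (delta * w)   # eqn (7b)

def golden_section_min(f, lo, hi, tol=1e-7):
    """Minimize a function assumed unimodal on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b, d = d, c
            c = b - inv_phi * (b - a)
        else:
            a, c = c, d
            d = a + inv_phi * (b - a)
    return (a + b) / 2.0

w_l1 = golden_section_min(d_l1, 0.2, 10.0)
w_eu = golden_section_min(d_euclid, 0.2, 10.0)
print(f"L1:        delta*w*^2 = {w_l1**2:.2f}, d* = {d_l1(w_l1):.3f}")
print(f"Euclidean: delta*w*^2 = {w_eu**2:.2f}, d* = {d_euclid(w_eu):.3f}")
# Flatness near the optimum: cost of using sqrt(3/delta) in the Euclidean case.
print(f"penalty for w = sqrt(3): {d_euclid(math.sqrt(3.0)) - d_euclid(w_eu):.5f}")
```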
Equation (9a) applies only if the swath is made to run with the coordinates of the grid almost all the way. Equation (9b) yields values which overpredict somewhat the conjectured expected length of optimal tours, eqn (1). Nevertheless, it may be more representative of tour lengths that can be achieved by manual construction, and it will be useful to determine lengths in oddly shaped zones.

3.2 Tour length formulas

As long as the zone $\mathcal{A}$ is large enough so that a swath of width $w^*$ can be built to cover it, eqn (9) will be representative of achievable tour lengths. However, if the zone is so narrow that $w^*$ is larger than the zone width, a narrower swath will have to be employed, and the tours will become longer. Figure 4 illustrates this phenomenon.

Fig. 4. Optimal swath width in two differently shaped zones.

For any given zone, one can always cut a swath whose width will depart from $w^*$ as little as possible and use it as the basis for building a good tour. Because in the research that will follow we shall be interested mostly in approximately rectangular-shaped zones, the rest of the discussion will center around a rectangle of sides L and l, $l \le L$. As long as $w^*$ is less than $l/2$, eqn (9) applies†; otherwise we use eqn (7) with $w = l/2$. Since $w^*$ is the side of a square containing on the average three points, the best $d^*$ will be achievable as long as a square of side l contains on the average 12 points. For the Euclidean metric, the results are:

$$d^* = 0.9\,\delta^{-1/2} \quad \text{if } \delta l^2 \ge 12, \tag{10a}$$

and

$$d^* = \frac{l}{6} + \frac{2}{\delta l}\,\psi\!\left(\frac{\delta l^2}{4}\right) \quad \text{if } \delta l^2 < 12. \tag{10b}$$

†If $l > 2w^*$, a swath width that will differ from $w^*$ by less than 20% can be chosen. In that case eqns (9) differ from eqns (7) by less than 2%.
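Note that the length of the rectangle, L, plays no role here. Furthermore, as l approaches zero, $d^*$ approaches $2/(\delta l)$, which is also true of the $L_1$ metric when the rectangle is oriented with its sides following the coordinate directions. Equation (10b) can be written as follows:

$$d^* = \left[\frac{(\delta l^2)^{1/2}}{6} + \frac{2}{(\delta l^2)^{1/2}}\,\psi\!\left(\frac{\delta l^2}{4}\right)\right]\delta^{-1/2}, \tag{10c}$$

where the dimensionless quantity in brackets, $\phi(\delta l^2)$, reaches a minimum of 0.9 when $\delta l^2 = 12$. Figure 5 plots $\phi(\delta l^2)$, which is the multiplicative coefficient of $\sqrt{NA}$ in the total tour length formula:

$$D \cong d^* \times N = \phi(\delta l^2)\,\delta^{-1/2} N = \phi(\delta l^2)\sqrt{NA}. \tag{10d}$$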
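Equations (10a)-(10d) translate directly into a small calculator. The sketch below is one possible arrangement: the names `phi` and `tour_length` and the two example zones are hypothetical, and the switch at $\delta l^2 = 12$ uses the rounded coefficient 0.9 of eqn (10a), so the computed factor has a small step there.

```python
import math

def psi(x):
    return (2.0 / x**2) * ((1.0 + x) * math.log1p(x) - x)

def phi(shape_density):
    """Tour length factor of eqns (10a)-(10c); the argument is delta * l^2."""
    s = shape_density
    if s >= 12.0:
        return 0.9                                   # eqn (10a)
    root = math.sqrt(s)
    return root / 6.0 + (2.0 / root) * psi(s / 4.0)  # bracket of eqn (10c)

def tour_length(n_points, long_side, short_side):
    """Eqn (10d): D ~ phi(delta l^2) sqrt(N A) for an L-by-l rectangle."""
    area = long_side * short_side
    delta = n_points / area
    return phi(delta * short_side**2) * math.sqrt(n_points * area)

for s in (0.5, 2.0, 6.0, 11.99, 12.0, 20.0):
    print(f"delta*l^2 = {s:6.2f}: phi = {phi(s):.3f}, "
          f"2/sqrt(delta*l^2) = {2.0 / math.sqrt(s):.3f}")

# Hypothetical zones with the same area and point count but different shapes.
print(f"square 10 x 10:  D = {tour_length(50, 10.0, 10.0):.1f}")
print(f"strip 100 x 1:   D = {tour_length(50, 100.0, 1.0):.1f}")
```

For the elongated zone the result is close to twice the long side, which is the behavior the thin-zone limit $2(\delta l^2)^{-1/2}$ implies.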
Fig. 5. The tour length factor, $\phi$, as a function of the shape/density constant, $\delta l^2$.
It can be seen that even for zones with $\delta l^2$ substantially less than 12, $\phi(\delta l^2) \approx 0.9$. However, once $\delta l^2$ becomes smaller than 2, $\phi(\delta l^2) \approx 2(\delta l^2)^{-1/2}$.
4. SOME COMPARISONS
To assess the reasonableness of eqn (10) in practical contexts, we would like to test its validity. First, consider a rectangle of infinitesimal width, $l \to 0$. If N points are located at random in this rectangle, the optimal tour will have an expected length equal to twice the distance between the two extreme points. It is:

$$D = 2\,\frac{N-1}{N+1}\,L. \tag{11}$$

Equation (10d), with $\phi = 2(\delta l^2)^{-1/2}$, yields

$$D \approx 2L, \tag{12}$$

which is reasonable as long as N is not very small. We already know that for circles and squares, eqn (10d) tends to overpredict slightly for all values of N because $\phi$ is 0.9 instead of 0.75.

To verify the performance of the formulas for larger values of N in the case where tours are manually constructed, three experiments were conducted by randomly scattering 10, 22, and 111 points over a 7.25 x 10 in. rectangle and asking half a dozen individuals to construct the best possible tour. For the problem with 10 points, everyone obtained the same tour: its length was 25.2 in. The 22-point problem resulted in tours of varying lengths from a low of 38.2 in. to a high of 39.7 in. Only two people out of the six obtained tours shorter than the worst by more than 0.5 in. and they spent a considerable amount of time trying. For the problem with 111 points, the pattern developed by each individual was entirely different (Fig. 6 depicts two tours). Yet, the lengths obtained did not change very much.
Fig. 6(a). Human-built tour (A): length = 77.4 inches.
The shortest tour was 77.4 in. and the longest 79.6 in. To verify this phenomenon further, we constructed two more tours based on the patterns of Fig. 3. The strip width used, w = 1.2 in., is close to the optimal, which should result in a length close to the one with $w^*$. Figure 7 depicts these two tours and their lengths, which both were 81.3 in. It is striking to notice that the very simple strategy discussed in this paper produced tours which are not even 5% longer than the best obtained by six separate individuals.

The formulas given in this paper for an area 7.25 in. by 10 in. yield: 25 in. for 10 points, 36 in. for 22 points, and 80.7 in. for 111 points. The small deviations from the observed values are partly due to statistical error. These discrepancies become less noticeable for large N because of the laws of large numbers; they were smaller than 1% for the case with 111 points.
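The comparisons of this section can be reproduced with a few lines of arithmetic. The sketch below restates the tour length factor so that it is self-contained; the function names are assumptions, and the observed values are the best human-built tour lengths quoted in the text. It first sets the expected value of twice the range of N random points on a segment of length L = 1 against the limit 2L of eqn (12), then evaluates eqn (10d) for the 7.25 in. by 10 in. rectangle.

```python
import math

def psi(x):
    return (2.0 / x**2) * ((1.0 + x) * math.log1p(x) - x)

def phi(s):
    """Tour length factor as a function of s = delta * l^2 (eqns (10a)-(10c))."""
    if s >= 12.0:
        return 0.9
    return math.sqrt(s) / 6.0 + 2.0 / math.sqrt(s) * psi(s / 4.0)

def predicted_length(n_points, long_side, short_side):
    area = long_side * short_side
    s = n_points * short_side**2 / area          # delta * l^2
    return phi(s) * math.sqrt(n_points * area)   # eqn (10d)

# Thin rectangle with L = 1: twice the expected range of N uniform points,
# 2 (N - 1) / (N + 1), against the limiting value 2L of eqn (12).
for n in (2, 5, 20, 100):
    print(f"N = {n:3d}: exact = {2.0 * (n - 1) / (n + 1):.3f}, eqn (12) = 2.000")

# The 7.25 in. x 10 in. experiments, with the best human tour lengths reported.
best_human = {10: 25.2, 22: 38.2, 111: 77.4}
for n, observed in best_human.items():
    d = predicted_length(n, 10.0, 7.25)
    print(f"N = {n:3d}: predicted {d:5.1f} in., best human tour {observed} in.")
```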
Fig. 6(b). Human-built tour (B): length = 77.9 inches.
Fig. 6. Two human-built tours.

5. CONCLUSIONS
This paper has presented a simple strategy that can be used to build good traveling salesman tours in zones of irregular shapes without the help of a computer. While better tour-building algorithms can be and have been developed, the strategy presented in this paper seems to be within 5% of what careful humans can accomplish for complex problems with many points. The main advantage of its simplicity is that its tour lengths can be estimated accurately in zones of different shapes.

The paper concentrated on rectangles with uniform density of points, but other shapes and patterns could be explored as well. The formulas that were developed reproduced rather well the expected length of a tour in the extreme case of an infinitely thin rectangle, but in general they tend to overpredict. Nevertheless, because the length of the proposed tours in zones of widths significantly smaller than $2w^*$ is close to the optimal length in an infinitely thin rectangle of the same length, the overprediction should be small if $\delta l^2 < 12$.
Fig. 7(a). Tour (A): length = 81.3 inches.
Irregularities in the shape of zones, or in the density of points, do not prevent the strategy from being used, since the distance between tour points is only affected by the width of the swath in its immediate neighborhood. If a swath of approximately constant width is not feasible (or reasonable), a swath of varying (locally optimal) width can be used. The length of the total tour cannot be expressed by a simple formula, but if the departures from rectangles and homogeneity are small, the formulas of this paper apply. In a companion publication studying a multiple vehicle dispatching problem (Daganzo, 1982) the formulas were applied to nonrectangular regions with acceptable results.

For zones with $\delta l^2 \ge 12$, the exact form of the underlying networks should not influence strategies much, since the optimal width of the strip is the same for $L_1$ and Euclidean metrics. Moreover, since it is true for the $L_1$ metric, it is safe to conjecture that for other metrics the coefficient in eqn (9) should be 0.9 (the coefficient for the Euclidean metric) times the route circuity factor.
Fig. 7. Tours resulting from strategy.
Among the possible applications of the strategy and formulas discussed in this paper, we have:
(i) Route design for a fleet of distribution vehicles serving a zone with a depot (see, for example, Daganzo (1982); this reference develops some theory and contains a case study).
(ii) Design of checkpoint dial-a-ride systems (Daganzo, 1983).
(iii) Obtaining good initial tours for more complex tour construction algorithms.
Acknowledgement-Gordon F. Newell shared his ideas with me, and some of the errors in this paper are probably his.
REFERENCES

Daganzo C. F. (1982) The distance traveled to visit N points with a maximum of C stops per vehicle: A manual tour-building strategy and case study. Research Report UCB-ITS-RR-82-10, Institute of Transportation Studies, Univ. of California, Berkeley.
Daganzo C. F. (1983) Checkpoint dial-a-ride systems. Transpn Res., in press.
Eilon S., Watson-Gandy C. D. T. and Christofides N. (1971) Distribution Management: Mathematical Modelling and Practical Analysis. Hafner, New York.
Fairthorne D. (1965) Distances between pairs of points in towns of simple geometrical shapes. Proc. 2nd Int. Symp. on the Theory of Traffic Flow, pp. 391-406, OECD, Paris.
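APPENDIX

Derivation of $d_w$ in Euclidean space

We seek $E((X^2 + Y^2)^{1/2})$ when X and Y are independent random variables with densities $f_X(\cdot)$ and $f_Y(\cdot)$ given below:

$$f_X(x) = \frac{2}{w}\left(1 - \frac{x}{w}\right), \quad 0 \le x \le w; \qquad f_X(x) = 0, \quad \text{otherwise,}$$

and

$$f_Y(y) = (\delta w)e^{-\delta w y}, \quad y > 0; \qquad f_Y(y) = 0, \quad \text{otherwise.}$$

We first calculate the expectation conditional on X:

$$E_Y\left((X^2 + Y^2)^{1/2} \mid X\right) = \int_0^\infty (X^2 + y^2)^{1/2}(\delta w)e^{-\delta w y}\,dy = X\int_0^\infty \left[1 + \left(\frac{r}{\delta w X}\right)^2\right]^{1/2} e^{-r}\,dr.$$

The last expression is the result of the change of variable $r = \delta w y$. For ease of integration, we replace the square root in the integrand by $(r/\delta w X) + \exp(-r/\delta w X)$, which differs from it by less than 5%. So we have,

$$E_Y\left((X^2 + Y^2)^{1/2} \mid X\right) \cong X\int_0^\infty \left[\frac{r}{\delta w X} + e^{-r/\delta w X}\right] e^{-r}\,dr = X + \frac{1}{\delta w}\cdot\frac{1}{1 + \delta w X}. \tag{A1}$$

The result sought is obtained by taking the expectation of (A1) with respect to X:

$$E\left((X^2 + Y^2)^{1/2}\right) \cong E(X) + \frac{1}{\delta w}\,E\left\{\frac{1}{1 + \delta w X}\right\}.$$

After a few algebraic manipulations, the expression reduces to:

$$E\left((X^2 + Y^2)^{1/2}\right) \cong \frac{w}{3} + \frac{1}{\delta w}\,\psi(\delta w^2), \tag{A2}$$

where

$$\psi(x) = \frac{2}{x^2}\left[(1 + x)\log(1 + x) - x\right]. \tag{A3}$$

These are the expressions used in the text.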
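The accuracy claimed for the substitution used in the appendix can be checked on a grid. In the sketch below, t stands for the ratio $r/(\delta w X)$, so the square root in the integrand is $\sqrt{1 + t^2}$ and its replacement is $t + e^{-t}$; the function name, the grid spacing, and the upper limit of 50 are arbitrary choices made here.

```python
import math

def relative_error(t):
    """Relative gap between sqrt(1 + t^2) and its surrogate t + exp(-t)."""
    exact = math.sqrt(1.0 + t * t)
    surrogate = t + math.exp(-t)
    return (exact - surrogate) / exact

grid = [i / 1000.0 for i in range(0, 50_001)]      # t from 0 to 50
worst_t = max(grid, key=relative_error)
never_above = all(relative_error(t) >= -1e-12 for t in grid)
print(f"largest relative error {relative_error(worst_t):.4f} at t = {worst_t:.2f}")
print("surrogate never exceeds the square root on this grid:", never_above)
```

On this grid the surrogate stays below the square root everywhere, which suggests that eqn (A2) slightly underestimates the expectation it approximates.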