European Journal of Operational Research 36 (1988) 251-253 North-Holland
Theory and Methodology
On estimating road distances by mathematical functions

Robert F. LOVE
Faculty of Business, McMaster University, Hamilton, Ontario L8S 4M4, Canada

James G. MORRIS
School of Business, University of Wisconsin-Madison, Madison, Wisconsin, WI 53706, USA

Abstract: Empirical results are presented that are in contrast to those reported earlier in this journal by Berens and Körling. Based on these results, a modelling philosophy is stressed and the practical impact of using empirically based distance-predicting functions is discussed.

Keywords: Road transportation, location
1. Introduction
Berens and Körling [1] applied the methodology described in [5,6] for fitting road travel distance estimating functions to inter-city road distances in the Federal Republic of Germany. Specifically, consider the function

$$d(q, r; k, p, s) = k \left[ |x_q - x_r|^p + |y_q - y_r|^p \right]^{1/s},$$

where $(x_q, y_q)$ and $(x_r, y_r)$ are coordinates of the points $q$ and $r$ on the plane and $k$, $p$ and $s$ are parameters, and the special cases considered in [5,6]:

$$d_1(q, r) = d(q, r; k, 2, 2), \quad k > 0,$$
$$d_2(q, r) = d(q, r; k, p, p), \quad k, p > 0,$$
$$d_3(q, r) = d(q, r; k, p, s), \quad k, p, s > 0.$$

For $k = 1$, $d_1$ is the familiar Euclidean distance function. The parameters $k$, $p$ and $s$ may be determined using an appropriate goodness-of-fit criterion. Although others may be used, the two fitting measures used in [5,6] were

$$AD_f = \sum_{q=1}^{n-1} \sum_{r=q+1}^{n} \left| d_f(q, r) - A(q, r) \right|$$

and

$$SD_f = \sum_{q=1}^{n-1} \sum_{r=q+1}^{n} \frac{\left( d_f(q, r) - A(q, r) \right)^2}{A(q, r)},$$

where $f$ indexes the estimating functions, $n$ is the number of points, and $A(q, r)$ is the actual road travel distance between points $q$ and $r$.

Berens and Körling concluded that: "In contrast to the results obtained for the USA, there were only slight increases in the degree of accuracy of the estimates when single-parameter distance functions were replaced by multiparametric ones. Since, moreover, the determination of a second or even a third parameter entails a great deal of work, it is doubtful whether the use of a two-parameter function can continue to be regarded as an economically justifiable method in Europe."

Received April 1987
0377-2217/88/$3.50 © 1988, Elsevier Science Publishers B.V. (North-Holland)
2. Remarks

The discovery by Berens and Körling that inter-city road distances in the Federal Republic of Germany can be accurately represented by inflating Euclidean distances by a factor of $k = 1.338$ using the measure $AD_1$ (or 1.340 using $SD_1$) is intriguing. Their results were calculated on the basis of 6786 pairwise distances between 117 cities. In [5] this factor was estimated to be $k = 1.16$ using $AD_1$ (or 1.18 using $SD_1$) for a sample of inter-city road distances in the USA. In their study Berens and Körling found that $k = 1.344$ and $p = 2.058$ using $AD_2$ (or $k = 1.344$ and $p = 2.031$ using $SD_2$) for $d_2$, while the improvement in the criterion of accuracy was 0.12% for AD and 0.03% for SD. The associated improvements were 10.55% and 34.63%, respectively, for the USA data, when using the two-parameter function ($k = 1.15$ and $p = 1.78$ in each case) versus the inflated Euclidean function.

In order to verify their results we considered the 25 largest cities in the Federal Republic of Germany and from those randomly chose 15 cities. We then estimated $k$ for $d_1$ using a sample of 100 inter-city distances from among the 105 possible pairings of the cities. The result was $k = 1.13$ using $AD_1$ and 1.135 using $SD_1$. The same data produced minimizers $k = 1.115$, $p = 1.877$ and $k = 1.104$, $p = 1.761$ for $AD_2$ and $SD_2$, respectively, with associated improvements of 2.4% and 8.3%.

Several thoughts come to mind. The estimated parameter values for the Federal Republic of Germany given here are clearly dissimilar to those found in Berens and Körling. Users of such distance-predicting functions for this geographical region will surely want to carry out their own study to do the required estimation. Although the accuracy improvements in using $d_2$ versus $d_1$ were much greater in our study of FRG distances than those found by Berens and Körling, the magnitudes are smaller than those found in the USA study.
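To make the estimation step concrete, the following is a minimal Python sketch of fitting $d_1$ and $d_2$ under the AD and SD criteria defined in the Introduction, using a plain grid search over the parameters. It is an illustration under stated assumptions, not the procedure or program used for the results above: the point coordinates and "actual" distances are invented, the helper names (`distance`, `ad`, `sd`, `grid`, `fit_d1`, `fit_d2`, `improvement`) are hypothetical, and the improvement is assumed to mean the percentage reduction in the criterion value when moving from $d_1$ to $d_2$.

```python
import itertools

def distance(q, r, k=1.0, p=2.0, s=2.0):
    """d(q, r; k, p, s) = k * (|xq - xr|**p + |yq - yr|**p) ** (1/s)."""
    return k * (abs(q[0] - r[0]) ** p + abs(q[1] - r[1]) ** p) ** (1.0 / s)

def ad(pairs, k, p, s):
    """AD criterion: sum of absolute deviations from the actual distances."""
    return sum(abs(distance(q, r, k, p, s) - a) for q, r, a in pairs)

def sd(pairs, k, p, s):
    """SD criterion: squared deviations, each divided by the actual distance."""
    return sum((distance(q, r, k, p, s) - a) ** 2 / a for q, r, a in pairs)

def grid(lo, hi, step):
    """Evenly spaced values from lo to hi, rounded to suppress float drift."""
    count = int(round((hi - lo) / step))
    return [round(lo + i * step, 10) for i in range(count + 1)]

def fit_d1(pairs, criterion, k_grid):
    """Grid search for d1 = d(.; k, 2, 2). Returns (criterion value, k)."""
    return min((criterion(pairs, k, 2.0, 2.0), k) for k in k_grid)

def fit_d2(pairs, criterion, k_grid, p_grid):
    """Grid search for d2 = d(.; k, p, p). Returns (criterion value, k, p)."""
    return min((criterion(pairs, k, p, p), k, p) for k in k_grid for p in p_grid)

def improvement(value_d1, value_d2):
    """Percentage reduction in the criterion from d1 to d2 (assumed definition)."""
    return 100.0 * (value_d1 - value_d2) / value_d1 if value_d1 > 0 else 0.0

# HYPOTHETICAL data: made-up planar coordinates, with synthetic "actual"
# distances generated from k = 1.2, p = 1.8 and perturbed by +/- 3 percent.
# These are not real road distances.
points = [(0, 0), (10, 2), (3, 9), (7, 7), (12, 11), (1, 5), (8, 1), (5, 14)]
pairs = []
for i, (q, r) in enumerate(itertools.combinations(points, 2)):
    wobble = 1.03 if i % 2 == 0 else 0.97
    pairs.append((q, r, wobble * distance(q, r, 1.2, 1.8, 1.8)))

k_grid = grid(1.0, 1.5, 0.01)   # assumed search range and grid width for k
p_grid = grid(1.5, 2.5, 0.01)   # assumed search range and grid width for p

results = {}
for name, criterion in (("AD", ad), ("SD", sd)):
    v1, k1 = fit_d1(pairs, criterion, k_grid)
    v2, k2, p2 = fit_d2(pairs, criterion, k_grid, p_grid)
    results[name] = (v1, k1, v2, k2, p2)
    print(f"{name}: d1 k={k1:.2f} value={v1:.3f}; "
          f"d2 k={k2:.2f} p={p2:.2f} value={v2:.3f}; "
          f"improvement={improvement(v1, v2):.1f}%")
```

Because the $p$ grid contains $p = 2$, the two-parameter fit in this sketch can never score worse than the one-parameter fit on the same data; how much better it scores depends entirely on the data set.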
Viewing these results the analyst may wish to use $d_1$ due to its computational simplicity (e.g., such straight-line distances may be 'read' from a map using a ruler). The $d_1$ function models road distances as having no directional bias and would thus be particularly suited to geographical regions with a highly developed road system. Indeed, with but one parameter to estimate, $d_1$ is quite utilitarian and may be sufficiently accurate for some situations.

The impact of using $d_1$ versus $d_2$ (or $d_3$) may vary considerably depending on the application (several applications are described by Love and Morris [6]). In situations such as testing road network distance data for errors (see Ginsburgh and Hansen [3]), precise distance-predicting properties are important. In these cases it is desirable to use the most accurate function that is available. However, in applications such as the use of a distance function in a facility location model, there is a further practical impact of choosing $d_1$ versus $d_2$, say. Consider the following facility location problem:

$$\text{Minimize } f(x) = \sum_{j=1}^{g} w_j \, d(x, a_j), \tag{F}$$

where
$x$ is the location of the facility given by $x = (x_1, x_2)$,
$a_j$ is the location of the $j$th fixed point given by $a_j = (a_{j1}, a_{j2})$,
$d(x, a_j)$ is one of the estimating functions $d_1$ or $d_2$ for the distance between the facility and the $j$th fixed point,
$w_j$ is positive and converts distance into cost,
$g$ is the number of fixed points.

In this model, if $x^*$ minimizes $f(x)$, $x^*$ also minimizes $f(x)/k$ since $k > 0$. Hence, $k$ may be set equal to 1 in problem (F) to find $x^*$. The import of this result is that a suitable value of $k$ need not be known in order to find an optimal facility location. The practical impact of this result is quite different for the two functions $d_1$ and $d_2$. If it has been decided a priori to use $d_1$, then no empirical study need be done at all. However, when $d_2$ is used, an empirical study must be carried out to determine the optimal parameters $k^*$ and $p^*$ (according to some fitting criterion) even though, when this has been done, $k$ may then be set equal to 1 when computing $x^*$. Similar remarks hold for $d_3$.

Of course, computational convenience must be balanced against accuracy. It is on this point that we wish to take particular issue with two of the conclusions reached by Berens and Körling. The first is the statement that "a great deal of extra work has to be done" to replace $d_1$ by $d_2$ and
thus estimate two parameters rather than one. The methods suggested in [5,6] were based on a simple numerical search over a grid where parameters were varied over chosen grid-widths. If a computer program similar to LPDIST [4] is available, then other than some additional computer time, there is no extra cost required to fit $d_2$ since the data requirements are the same in either case.

The second conclusion that we would like to address is: "there is only a slender chance of obtaining more accurate results in this way" (by using a two-parameter function). As Berens and Körling pointed out themselves, and we had stressed in our earlier papers, every geographical area is different. The probability of obtaining substantially more accurate results (from a two-parameter model) is thus dependent on the region being modelled. For this reason, generalizing from the FRG data (or any other restricted data set) is potentially misleading. Users should carry out their own studies using goodness-of-fit criteria appropriate to their application.

Collecting these thoughts, we would suggest the following. If a grid-search computer program is developed, one should always do at least a two-parameter fit to compare with the one-parameter fit. In this way the analyst will obtain a superior description of the road distances in the region under study at minimal cost, and hence be able to decide between the more accurate function $d_2$ and the more convenient function $d_1$. In short, we agree with Box and Tiao [2, p. 7]: "Because we can never be sure that a postulated model is entirely appropriate we must proceed in such a manner that inadequacies can be taken account of and their implications considered as we go along."
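Returning to problem (F), the observation that $k$ may be set to 1 when locating the facility can be checked numerically. The sketch below is only an illustration: the fixed points, weights, candidate grid and the function names (`lp_distance`, `total_cost`, `best_location`) are hypothetical, the values $k = 1.15$ and $p = 1.78$ are borrowed from the USA figures quoted earlier purely as sample inputs, and the brute-force scan over candidate sites is a stand-in for a proper location algorithm, which this note does not specify.

```python
def lp_distance(x, a, k, p):
    """d2-type distance k * (|x1 - a1|**p + |x2 - a2|**p) ** (1/p)."""
    return k * (abs(x[0] - a[0]) ** p + abs(x[1] - a[1]) ** p) ** (1.0 / p)

def total_cost(x, fixed_points, weights, k, p):
    """Objective of problem (F): sum over j of w_j * d(x, a_j)."""
    return sum(w * lp_distance(x, a, k, p) for a, w in zip(fixed_points, weights))

def best_location(fixed_points, weights, k, p, step=0.1, size=10.0):
    """Brute-force minimizer of (F) over a square grid of candidate sites.

    This is an assumed, deliberately simple stand-in for a real solver.
    """
    steps = int(round(size / step))
    candidates = [(i * step, j * step)
                  for i in range(steps + 1) for j in range(steps + 1)]
    return min(candidates,
               key=lambda x: total_cost(x, fixed_points, weights, k, p))

# HYPOTHETICAL fixed points a_j and positive weights w_j.
fixed_points = [(1.0, 2.0), (8.0, 1.5), (6.0, 9.0), (2.5, 7.0)]
weights = [3.0, 1.0, 2.0, 1.5]

p = 1.78  # sample value of p; it must still come from an empirical fit
x_unit = best_location(fixed_points, weights, k=1.0, p=p)
x_fitted = best_location(fixed_points, weights, k=1.15, p=p)

# Scaling every distance by k > 0 scales f(x) by k, so the minimizer is unchanged.
print("location with k = 1.00:", x_unit)
print("location with k = 1.15:", x_fitted)
```

The value of $k$ changes the reported cost but not the chosen site, whereas $p$ enters the shape of the distance function and therefore cannot be dropped in the same way.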
References
[1] Berens, W., and Körling, F.J., "Estimating road distances by mathematical functions", European Journal of Operational Research 21 (1985) 54-56.
[2] Box, G.E.P., and Tiao, G.C., Bayesian Inference in Statistical Analysis, Addison-Wesley, Reading, MA, 1973.
[3] Ginsburgh, V., and Hansen, P., "Procedures for the reduction of errors in road network data", Operational Research Quarterly 25 (1974) 321-322.
[4] Love, R.F., "Computer programs for distance modelling, location, and location-allocation studies", paper presented at the 7th European Congress on Operational Research, Bologna, Italy, June 16-20, 1985.
[5] Love, R.F., and Morris, J.G., "Modelling inter-city road distances by mathematical functions", Operational Research Quarterly 23 (1972) 61-71.
[6] Love, R.F., and Morris, J.G., "Mathematical models of road travel distances", Management Science 25 (1979) 130-139.