European Journal of Operational Research 95 (1996) 211-230

Theory and Methodology

Approximating probability density functions and their convolutions using orthogonal polynomials

Ralph D. Badinelli
Department of Management Science, Virginia Polytechnic Institute & State University, Blacksburg, VA 24061, USA

Received February 1993; revised July 1995
Abstract

This paper describes and tests methods for piecewise polynomial approximation of probability density functions using orthogonal polynomials. Empirical tests indicate that the procedure described in this paper can provide very accurate estimates of probabilities and means when the probability density function cannot be integrated in closed form. Furthermore, the procedure lends itself to approximating convolutions of probability densities. Such approximations are useful in project management, inventory modeling, and reliability calculations, to name a few applications. In these applications, decision makers desire an approximation method that is robust rather than customized. Also, for these applications the most appropriate criterion for accuracy is the average percent error over the support of the density function as opposed to the conventional average absolute error or average squared error. In this paper, we develop methods for using five well-known orthogonal polynomials for approximating density functions and recommend one of them as giving the best performance overall.
Keywords: Approximation; Project management; Inventory
1. Introduction

In many fields of analysis including queuing theory, project network analysis, inventory models, and reliability calculations, it is necessary to estimate the cumulative probability distribution and the mean of a random variable based on its density function. In many cases, the density function involved can be quite complex and approximations are necessary (see Tadikamella, 1979, 1981, and Barlow and Proschan, 1981). Typically, in the problem settings in which these requirements arise, it is desirable to estimate probabilities and means to within 5% or better of their true values. Moreover, the need for repeated computations in order to compute probabilities of different events makes a closed-form approximation formula very useful. In this paper, we present techniques which can achieve such accuracy with a piecewise polynomial function.

Approximating individual probability distributions has a rich history of development in operations research. A good example is provided by Shore (1986), who uses two-piece linear transformations of a common random variable to approximate the distribution of an uncommon random variable. However, many models in the fields
mentioned above require the cumulative distribution function and the mean of an n-fold convolution, h(x), of a set of other, known density functions, f_n, f_{n-1}, ..., f_2, f_1:

h(x) = \int_0^x \int_0^{t_{n-1}} \cdots \int_0^{t_2} f_n(x - t_{n-1}) f_{n-1}(t_{n-1} - t_{n-2}) \cdots f_2(t_2 - t_1) f_1(t_1) \, dt_1 \, dt_2 \cdots dt_{n-1}. \quad (1)
For such convolutions, the analyst struggles to find adequate approximation schemes. The integral in (1) evaluates the probability density function of a random variable x that is the sum of n other random variables, the most common condition motivating such integrals. A variation on (1) would evaluate the density function for a random variable that is a linear combination of n other random variables. The method presented in this paper approximates such integrals for application in reliability studies, inventory control, project management, and any other fields in which they arise. Reliability calculations are similar to those of project management in that they require the computation of the probability distribution of the time to reach the final node of a network of component lifetimes. This computation requires the calculation of convolutions such as (1) (see Barlow and Proschan, 1981).

Much has been written about the computational problem of calculating the probabilities of project completion times; see Ragsdale (1989) for a review of this literature. Schonberger (1986), using a simple example in which different network paths are critical under different scenarios of activity times, provides a rudimentary explanation of the shortcomings of computing project completion times deterministically. Although he recommends subjective post-analysis of mean path lengths as a cure, the position of the current paper, and that of most PERT research, is that the calculation or estimation of the likelihoods of different scenarios can greatly assist project managers in determining which activities deserve their attention. In a seminal work, Robillard and Trahan (1977) demonstrate that fundamental to answering any questions about project completion time or path criticality is the necessity of approximating the convolutions of the activity times along paths of a PERT network.

A major contribution to this area of research is the method of Dodin and Elmaghraby (1985). Their method draws on much previous work by Dodin (1980, 1984, 1985). Space does not allow, and our interest does not require, a review of the complex background to their method. In their paper, Dodin and Elmaghraby recognize that a chief cause of error in their approximation of these convolutions is the requirement of Dodin's (1980) estimation method to discretize the probability densities of the activity times. In the current paper, we offer an alternative to Dodin's convolution method that is efficient, accurate, and does not require this discretization. The method is not dependent on the form of the probability densities involved and does not require the densities involved in the convolution to be of the same type. This property is desirable in the field of project management, as the beta distribution with its traditional settings for its mean and variance is no longer accorded the sacred status that it enjoyed for almost thirty years. The current paper provides a robust method for estimating path-length probabilities regardless of the form of the activity-time densities. Scully (1983) provides a concise summary of the arguments for using more general probability densities in PERT analysis and concludes: "Where sufficient data is available, the exact distribution should be established, whatever its shape or nature may happen to be". The invocation of the Central Limit Theorem to justify the use of the normal density for the random variable of path length is likewise a long-standing, but often unjustified, assumption.
Anklesaria and Drezner (1986) take this approach in representing the densities of sets of activities as multivariate normal densities; however, they break with tradition by including correlations between activity times on the assumption that such correlations can be estimated. The result is a method for computing upper bounds on the probability of a given time for project completion. Correlations between activity times occur when the occurrence of a longer-than-expected time for one activity has the effect of lengthening or shortening some future activity. Although a project manager may have an idea of such connections between activities, it is doubtful that he or she can express such complicated interactions explicitly in the form of a correlation coefficient as required by Anklesaria and Drezner's model. Research to
date has only scratched the surface of this topic. A recommendation for representing such correlations within the framework of the current paper's path analysis is the following: if two activities on the same path are correlated, then estimate the probability density of the sum of these two variables, thereby embedding the correlation within this density.

Very often a convolution such as (1) will be composed of well-known probability functions f_n, f_{n-1}, ..., f_2, f_1, but the result of the convolution is not known. A standard approach in such cases is to take the Laplace transforms of the functions involved, which allows one to express the transform of the convolution integral as the product of the transforms of the probability functions involved in the convolution (see Cox, 1967). This method often runs aground when the inverse transform must be found to express the final result as a probability function. Often, this inverse cannot be found. In such instances, an approximation scheme is called for. A second deficiency of the method of using Laplace transforms is uncovered when one attempts to derive the function which results from a 'truncated convolution'. Such a convolution is given by (2):

h(x) = \int_{a_{n-1}}^{b_{n-1}} \int_{a_{n-2}}^{b_{n-2}} \cdots \int_{a_1}^{b_1} f_n(x - t_{n-1}) f_{n-1}(t_{n-1} - t_{n-2}) \cdots f_2(t_2 - t_1) f_1(t_1) \, dt_1 \, dt_2 \cdots dt_{n-1}. \quad (2)
In fact, the initial motive for this investigation of orthogonal polynomials for closely approximating convolutions such as (2) grew out of the occurrence of such convolutions in an inventory model developed by Badinelli (1992). In project management, such a truncated convolution would arise if one were interested in the probability of project completion time conditioned on one or more of the activities being completed in a given time 'window'. Because the limits of integration do not take each probability function to the limits of its support, the Laplace transform theory does not hold.

Approximating a convolution can be done by enumerating the actual convolution probabilities in the case of discrete distributions (i.e., the integrals in (1) and (2) are replaced by summations). Dodin and Elmaghraby (1985) show such a 'discretized' method in computing completion time probabilities in PERT networks. For the case of continuous density functions, Cleroux and McConalogue (1976) apply cubic spline interpolation to convolutions which arise in reliability problems. In their application, maintaining continuity of the derivative of the probability functions across the boundaries of the segments that make up the piecewise approximation is important. In this paper, we make use of piecewise polynomial approximation to density functions without concern about the continuity of derivatives. In many applications, such as the inventory model of Badinelli (1992) and in project scheduling, the density function h(x) in (1) or (2) is integrated once or twice in order to make statements about probabilities of events and to compute the first moment of the density function, respectively. Hence, continuity of the derivative is not important and need not be incorporated in the approximation procedure. This allows for the use of many polynomial forms. Moreover, when using orthogonal polynomials, this integration can cause positive and negative errors in the fitted density function to cancel, making the fit of the integral more accurate than the fit of the integrand.

In this paper, we make use of the special properties of orthogonal polynomials. It will be shown that excellent approximations can be obtained using these polynomials with fewer segments in the piecewise approximation than a cubic-spline approximation would entail. This reduction in the number of segments results in computational savings and suppression of round-off error. The thesis of this research is that piecewise polynomial approximation of probability density functions using orthogonal polynomials can produce very accurate approximations to the cumulative distribution function and the first moment, as well as to the convolutions (truncated and conventional) of the density functions. Furthermore, this approximation method is robust with respect to the form of the probability densities involved. In consequence of these properties, such approximations are particularly useful for the computations embedded in Dodin and Elmaghraby's PERT analysis and Badinelli's inventory model. In this paper, we compare the performance of five different kinds of orthogonal polynomials and suggest the best all-round choice for
approximating probability functions. The criterion for performance is the accuracy of the approximations to the complementary cumulative distribution function and to the integral of the complementary cumulative distribution function. Accuracy of fit will be measured in terms of the average absolute percent difference between the given function and the polynomial function used to approximate it. We formalize this definition of accuracy below.

In Sections 2 and 3 we define notation and present the basic characteristics of orthogonal polynomials applied to curve fitting. In Section 4 we detail the application of orthogonal polynomials to piecewise approximation of probability density functions. In Section 5 we present the results of empirical tests on the five different kinds of orthogonal polynomials studied. The final section is a summary of this research.
2. Orthogonal polynomials

We begin with the simple case of estimating a single density function with orthogonal polynomials. In this section, the basic characteristics of orthogonal polynomials are presented. First, we define some notation.

x = continuous random variable, x ∈ X ⊆ R.
f = density function on x that is to be approximated.
F = complementary cumulative distribution of f.
G = complementary cumulative distribution of F.

F(x) = \int_x^\infty f(t) \, dt, \qquad G(x) = \int_x^\infty F(t) \, dt = \int_x^\infty \int_t^\infty f(s) \, ds \, dt.

[x_L, x_U] = subset of X outside of which f(x) has negligible magnitude.
m = defined 'origin'.
s = defined scaling factor.
τ = (x - m)/s = normalized random variable.
π_p(τ) = p-th orthogonal polynomial of a certain type.
φ(τ) = the function which approximates f.

Basic probability theory implies

P(a \le x < b) = F(a) - F(b). \quad (3)
A double integral of the density function can be used to compute the expectation of the random variable. For example, for a random variable, x (e.g., the time for an activity within a project), that can only take on nonnegative values,

E(x) = G(0). \quad (4)
Moreover, a double integral of the density can be used to compute conditional expectations, or loss functions, that typically arise in inventory models:

\int_a^\infty (x - a) f(x) \, dx = G(a). \quad (5)
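As a quick numerical check of (4) and (5), consider the following sketch (ours, not part of the original method; it assumes Python with NumPy and SciPy) for an exponential density with rate λ = 0.5, whose mean 1/λ = 2 should be recovered as G(0):

import numpy as np
from scipy import integrate

lam = 0.5
f = lambda x: lam * np.exp(-lam * x)              # density on [0, inf)
F = lambda x: integrate.quad(f, x, np.inf)[0]     # complementary cumulative of f
G = lambda x: integrate.quad(F, x, np.inf)[0]     # complementary cumulative of F

print(G(0.0))                                     # (4): E(x) = G(0), ~2.0

a = 1.0
loss = integrate.quad(lambda x: (x - a) * f(x), a, np.inf)[0]
print(loss, G(a))                                 # (5): both ~ exp(-lam*a)/lam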
See the Appendix for a derivation of formulas (4) and (5). As an extension, the expected length of a path through a PERT network or reliability network, given that this path takes longer than a certain time, a, is

E(x \mid x \ge a) = (G(a) + aF(a)) / F(a).

Hence, the quantities of interest are all determined directly from the functions F and G, so our examination of the accuracy of polynomial fit will be based on F and G. For a given probability density, the ideal type of orthogonal polynomial can be found using the theory of Tchebyshev systems - a time-consuming and highly theoretical task, and one that would represent a major distraction for the user interested in estimating completion times for a project or setting safety stocks for an
inventory system. We envision a method that could, if necessary, accept as input data a probability density function 'sketched' by a decision maker. By this we encompass a variety of techniques for estimating probability distributions, including the specification of a commonly used probability density function, the listing of cumulative probability values for different values of the random variable, and the construction of a histogram that approximates the density. Each of these techniques could be applied through either subjective estimation or historical data. Our interest is in developing a method that is both robust with respect to the form of the probability density and standard, so that it can be coded into a computer program and applied repeatedly in an application such as PERT analysis in a way that is transparent to the user. With these requirements in mind, the method must be based on standard orthogonal polynomials, not customized ones. In this paper, we will illustrate the use of five rather well-known orthogonal polynomials: Tchebyshev polynomials of the first type, Tchebyshev polynomials of the second type, Legendre polynomials, Laguerre polynomials, and Hermite polynomials. See Arfken (1970) for a detailed description of these polynomials.

In this section we will begin with common properties of all orthogonal polynomials. A set of orthogonal polynomials, H = {π_0, π_1, π_2, ...}, is defined over a domain D ⊆ R:

\pi_p(\tau) = \sum_{l=0}^{p} \psi_{pl} \tau^l. \quad (6)

The orthogonality property states that the inner product of any two different polynomials in H is zero. Specifically,

\nu_p \delta_{pq} = \langle \pi_p, \pi_q \rangle = \int_D \pi_p(\tau) \pi_q(\tau) \omega(\tau) \, d\tau, \quad (7)
where δ_{pq} = 1 if p = q, and 0 otherwise. The normalization constant ν_p and the weighting function ω(τ) are specific to the type of polynomial being used. The set of orthogonal polynomials forms a basis of a linear subspace of continuous functions on D. This means that the polynomials in H can be put into a series expansion that will converge to a given continuous function, f, in this subspace. Since D is typically a bounded interval of the real line, the random variable x is normalized to τ in such a way that the domain [x_L, x_U] maps onto D; see the definition of τ above. With this transformation of variables, f can be expressed as

f(\tau s + m) = \sum_{p=0}^{\infty} \zeta_p \pi_p(\tau) \quad \text{for } \tau \in D, \quad (8)
where ζ_0, ζ_1, ... are constant coefficients of the polynomials in the series expansion. The orthogonality property, (7), permits the computation of the coefficients in (8) by the following integration:

\zeta_p = \frac{1}{\nu_p} \int_D f(\tau s + m) \pi_p(\tau) \omega(\tau) \, d\tau. \quad (9)
The approximation procedure consists of computing the values of f over a sequence of points in D. These data are used to evaluate numerically the integral in (9). Also, by truncating the infinite series in (8), an approximation to the function f is obtained. This method is exactly the same as that used with other orthogonal functions such as Bessel functions or Fourier series. We define φ(τ) as the series in (8) truncated at the polynomial of order P, hereinafter referred to as the 'order of the polynomial approximation':

\phi(\tau) = \sum_{p=0}^{P} \zeta_p \pi_p(\tau). \quad (10)
With some algebraic manipulation, φ(τ) can be written as a polynomial in τ instead of a linear combination of polynomials. This will make any computations involving φ(τ) more efficient:

\phi(\tau) = \sum_{l=0}^{P} \gamma_l \tau^l, \quad (11)
where

\gamma_l = \sum_{p=l}^{P} \zeta_p \psi_{pl}. \quad (12)
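To make the procedure in (8)-(12) concrete, here is a minimal sketch (ours; it assumes SciPy, whose roots_chebyu and eval_chebyu supply the Gauss nodes and polynomial values for the second-type Tchebyshev weight) that computes the coefficients ζ_p of (9) by numerical integration and evaluates the truncated series (10) for a standard normal density:

import numpy as np
from scipy.special import roots_chebyu, eval_chebyu
from scipy.stats import norm

def fit_cheb2(f, x_a, x_b, P=5, n_pts=50):
    """Coefficients zeta_p of (10) for f on [x_a, x_b], Tchebyshev Type II."""
    m, s = 0.5 * (x_a + x_b), 0.5 * (x_b - x_a)   # maps D = [-1, 1] onto [x_a, x_b]
    tau, w = roots_chebyu(n_pts)                  # nodes/weights for weight sqrt(1 - tau^2)
    fx = f(tau * s + m)
    nu = np.pi / 2.0                              # normalization constant, Type II
    zeta = np.array([w @ (fx * eval_chebyu(p, tau)) / nu for p in range(P + 1)])  # eq. (9)
    return zeta, m, s

zeta, m, s = fit_cheb2(norm.pdf, -1.25, 1.25)
phi = lambda x: sum(z * eval_chebyu(p, (x - m) / s) for p, z in enumerate(zeta))  # eq. (10)
print(phi(0.0), norm.pdf(0.0))                    # close agreement near the center

Collapsing the truncated series to the monomial form (11) is then a matter of summing ζ_p ψ_{pl} over p, as in (12).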
Values of the ν_p, ψ_{pl}, l = 0, ..., p; p = 0, 1, 2, ..., and D for each of the five types of orthogonal polynomials listed above are given in the Appendix. Define

\Phi(a) = \int_a^{\tau_b} \phi(\tau) \, d\tau, \quad (13)

\Gamma(a) = \int_a^{\tau_b} \Phi(\tau) \, d\tau. \quad (14)
As described above, the average of the absolute value of the difference between the true complementary cumulative function, F, and the fitted complementary cumulative function, Φ, expressed as a fraction of F, is one of the desired accuracy measures. A similar measure is used for comparing G to Γ. In the applications of the polynomial fits to project management and other problems, the fitted probability function is likely to be evaluated at numerous points. In fact, the user of the fitted functions may desire to plot the probability distribution function, a feat that would require repeated calculation of the polynomial functions. This use of the approximation method raises two important performance criteria: accuracy over a large portion or all of D, and fast computation time. The same user could also require the estimation of the probability of a particular event or a conditional expectation over an event. If one is focused on an unlikely event, then one requires a much higher degree of accuracy in terms of the error in the probability estimated than one would require if the event were more likely. Therefore, in order to properly test the performance of the fit of the approximation, we measure the percent error over the domain, D. Specifically, we would like the values of the following two measures to fall within an acceptable upper limit:

|\bar{e}_F| = \frac{1}{x_U - x_L} \int_{x_L}^{x_U} \frac{|F(t) - \Phi(t)|}{F(t)} \, dt, \quad (15)

|\bar{e}_G| = \frac{1}{x_U - x_L} \int_{x_L}^{x_U} \frac{|G(t) - \Gamma(t)|}{G(t)} \, dt. \quad (16)
However, the literature does not prescribe an optimal choice of polynomial (or of any other kind of function) that will minimize either one of these measures when the polynomial is fitted to the density function, f (see Powell, 1981). Therefore, in this paper, we will use empirical results for (15) and (16) when the five types of orthogonal polynomials mentioned above are applied to a given density function. These empirical results will indicate the best choice of polynomial type and the best order of the approximation.
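For reference, the empirical measures (15) and (16) are straightforward to evaluate on a grid; the sketch below (ours, assuming SciPy, with a deliberately perturbed stand-in for the fitted Φ) approximates the integral average by a uniform-grid mean:

import numpy as np
from scipy.stats import norm

F = lambda x: norm.sf(x)            # true complementary cumulative
Phi = lambda x: norm.sf(x - 0.01)   # hypothetical fitted function, for illustration only

x_L, x_U = -1.0, 1.0
t = np.linspace(x_L, x_U, 2001)
e_F = np.abs(F(t) - Phi(t)) / F(t)  # pointwise relative error
print(100.0 * e_F.mean(), "%")      # uniform-grid estimate of (15)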
3. Approximation concepts

In this section we identify some basic principles of estimating functions with polynomials. These principles guide the construction of the empirical study which follows.

While there are no theoretical results on the ability of various polynomial types to minimize (15) or (16), there is a useful result regarding the accuracy of the fit of φ to f. Inasmuch as the errors in the fit of φ to f eventually manifest themselves in the accuracy of the fit of Φ to F and of Γ to G, it is worth studying the approximation of f by φ. The main result on the accuracy of such an approximation is that the use of a class of
[Fig. 1. Errors and weight functions. Approximation of a standard normal density by a Tchebyshev polynomial of the second type (power of approximation = 5, number of points = 50) over the interval of interest mean ± one standard deviation, mapped onto [-1, 1]; the plot shows the approximation errors of the density together with the weight function.]
orthogonal polynomials up to order P over a domain D results in minimizing the weighted squared error as defined by (17), even if f is not in the linear subspace spanned by H; see Powell (1981, Chapter 11):

\text{error} = \int_D [f(\tau s + m) - \phi(\tau)]^2 \omega(\tau) \, d\tau. \quad (17)
Since we are interested in the performance measures given by (15) and (16), these classic results from approximation theory are not very useful. Because the weighting functions for some of the five polynomials used in this study decrease as one approaches the extreme values in D, the accuracy of the approximation deteriorates at these values. For example, Fig. 1 shows the actual errors in the approximation of a standard normal density function with Tchebyshev polynomials of the second type over the interval [-1, 1]. Superimposed on this graph is the weighting function for this polynomial approximation. It can be seen that the magnitude of the errors increases as the weighting function decreases. This simply reflects the fact that the emphasis on the errors at these points in the minimization of (17) by the approximation process is less than that on the errors where the weighting function is high.

The large errors at the 'fringes' of D can be effectively truncated by fitting the polynomial to a wider interval than the interval over which the approximation is to be used. Implementing this technique requires us to define D = [τ_a, τ_b] as the 'interval of approximation' and a subset of D, [τ_c, τ_d], which we define as the 'interval of interest'. We map the interval of approximation in τ onto the interval of approximation in x, [x_a, x_b], and the interval of interest in τ onto the interval of interest in x, [x_L, x_U].

In many cases, the error terms from an orthogonal polynomial approximation show some symmetry. That is, over the interval of approximation [x_a, x_b], the errors oscillate around zero somewhat symmetrically. This has a salutary effect on the errors of the integrated functions, F and G. The errors in approximating f over the interval of interest tend to cancel one another on integration, so that the accuracy of the approximation for the integral is actually better than that of the integrand.

Theoretically, as the maximum order of the polynomials used in the approximation, P, is increased, the accuracy of the approximation improves. However, this improvement shows a diminishing marginal return on
added CPU time. Moreover, higher order terms introduce a volatility to the polynomial which can cause the approximation to deteriorate in some parts of the domain. It was also found in this study that all of the polynomial approximations have limited flexibility. That is, if the function, f, is decreasing or increasing over a wide range and/or changing shape, higher order polynomials are needed to obtain a good approximation. In such cases, a more efficient method is to divide the interval [x_L, x_U] into segments and perform a separate polynomial approximation over each segment. Such a piecewise approximation method can achieve a very close approximation with three or four polynomial approximations of order 3 or 4 instead of a single polynomial approximation of order 7 or 8 (a small sketch of this comparison appears at the end of this section).

There are two commonly used alternatives to the use of orthogonal polynomials - cubic spline interpolation and Lagrange interpolation. In the application of splines, a sequence of points over D is selected. The function, f, and its derivatives are evaluated at each of these points. Each interval formed by a pair of consecutive points becomes a segment with its own (cubic) polynomial function. Therefore, if 100 points are used in the spline approximation, the result will be a piecewise polynomial approximation consisting of 99 different polynomials; see Powell (1981, Chapters 18-21). Orthogonal polynomials provide more flexibility in constructing the approximation. Using orthogonal polynomials, we still can make use of a large set of points because the numerical integration of (9) can be done with as many points as we like. The more accurately this integration is performed, the more accurately the polynomial coefficients are estimated. Therefore, we still can incorporate into the approximation as much or as little data as we like, independently of choosing the number of segments in the piecewise approximation, the order of the polynomial approximation on each segment, and the boundaries of each segment.

The method of Lagrange interpolation sets the coefficients of a polynomial so that it passes through a given set of points; see Powell (1981, Chapter 4). This means that fitting a polynomial through three points will require a quadratic polynomial, four points a cubic polynomial, and so on. In order to achieve the desired level of accuracy, many points will be needed. Using this method, the large number of points used in the approximation will result in either a very high order of the polynomial approximation or a large number of segments, as in the case of splines. Here again, the use of orthogonal polynomials allows for fewer segments and lower orders of the polynomial approximations without sacrificing data.
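The following sketch (ours) illustrates the segment-count argument, with NumPy's least-squares Chebyshev fit standing in for the projection (9): four order-3 segment fits versus one order-8 global fit of a normal density.

import numpy as np
from numpy.polynomial import Chebyshev
from scipy.stats import norm

f = norm.pdf
x = np.linspace(-2.0, 2.0, 201)
err_global = np.abs(Chebyshev.fit(x, f(x), 8)(x) - f(x)).max()   # one order-8 fit

edges = np.linspace(-2.0, 2.0, 5)                                # four segments
err_piece = 0.0
for lo, hi in zip(edges[:-1], edges[1:]):
    xs = np.linspace(lo, hi, 51)
    err_piece = max(err_piece, np.abs(Chebyshev.fit(xs, f(xs), 3)(xs) - f(xs)).max())

print(err_global, err_piece)    # low-order piecewise fits are competitive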
4. Application to probability functions

4.1. Piecewise approximation

We now extend these concepts to a piecewise approximation of a probability function. In a piecewise approximation, we set upper and lower bounds, x_U and x_L, respectively, on the values of the random variable, x, over which the approximation is to take place. In the case of a probability function with unbounded support, we would set these bounds at the points beyond which the probability function has negligible magnitude. In most applications, such points exist and can be conservatively estimated using the well-known theorem by Tchebyshev for the upper bounds on tail probabilities (a sketch follows this paragraph). From these starting points, the method can tighten the bounds as function values are computed and found to be too small to be included in the approximation. In some cases, as in the convolutions described in Badinelli (1992), the behavior of the probability function is well approximated by a known probability function (such as the normal) at the upper (lower) tail. In such a case, the polynomial approximation must extend over only that portion of the random variable's range in which a known probability function does not apply. This allows for setting the upper (lower) bound, x_U (x_L), at the point where the function's behavior must be approximated by the polynomial. This can result in a large reduction in the number of data points that must be collected in order to carry out the approximation.
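For instance, Tchebyshev's inequality, P(|x - μ| ≥ kσ) ≤ 1/k², gives conservative starting bounds; a minimal sketch (ours; eps is the tail mass treated as negligible):

def initial_bounds(mu, sigma, eps=0.01):
    # Tchebyshev's inequality: P(|x - mu| >= k*sigma) <= 1/k^2 = eps
    k = (1.0 / eps) ** 0.5
    return mu - k * sigma, mu + k * sigma

print(initial_bounds(10.0, 2.0))   # (-10.0, 30.0); tightened later as values are computed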
Once the upper and lower bounds have been established, the interval between them is divided into N contiguous intervals which represent intervals of interest. We will call this collection of N intervals the partition J_f. That is,

lower bound = x_L = x_{1c} < x_{1d} = x_{2c} < x_{2d} = x_{3c} < \cdots < x_{N-1,d} = x_{Nc} < x_{Nd} = x_U = upper bound.

For each interval of interest, [x_{jc}, x_{jd}], we define an interval of approximation [x_{ja}, x_{jb}] such that x_{ja} < x_{jc} < x_{jd} < x_{jb}.
In order to identify the correspondence between the j-th interval of approximation in x and the j-th interval of approximation in τ, we define x_j as the random variable taking part in the j-th approximation. We define τ_j as a normalized version of the variable x_j. That is,

\tau_j = (x_j - m_j)/s_j, \qquad x_j \in [x_{ja}, x_{jb}]. \quad (18)

In the case of the Tchebyshev polynomials and the Legendre polynomials, τ_{ja} = -1 and τ_{jb} = +1; see the Appendix. In these cases, m_j = ½(x_{jb} + x_{ja}) and s_j = ½(x_{jb} - x_{ja}). In the case of the Laguerre and Hermite polynomials, D has an infinite extent. For these polynomials, the parameters m_j and s_j are set so that [x_{ja}, x_{jb}] is mapped into, but not onto, D. We cannot linearly re-scale a finite interval to correspond to D in these cases, but by choosing s to be small enough, a very good approximation can be made. In doing this, the weighting functions of these polynomials support this truncated mapping, because the weight decreases exponentially as one moves away from the origin. Hence, the impact on the approximation of the portion of D that is outside of the image of [x_{ja}, x_{jb}], as shown by (17), is negligible.
4.2. Integrals of the density

In utilizing the piecewise polynomial approximation in a problem setting, it will be necessary to integrate the density function in order to make probability statements as shown by (3), (4), and (5). Polynomials are well suited to integration. The integration formulas (19) and (20) given below are derived for the case of a piecewise polynomial approximation. Note that because the integration can cover more than one segment of the partition J_f, the formulas have many more terms than just the antiderivative of a polynomial. A close inspection of these formulas, however, will reveal that they are not time-consuming to compute. Define:

φ_i = the polynomial approximation to the density function over interval of interest i, [x_{ic}, x_{id}].
μ_i = the antiderivative of φ_i.
v_i = the antiderivative of μ_i.
F(a) = \int_a^\infty f(x) \, dx \approx \Phi(a)

= \int_a^{x_{1c}} f(x) \, dx + \sum_{k=1}^{N} s_k \left[ \mu_k\!\left(\frac{x_{kd} - m_k}{s_k}\right) - \mu_k\!\left(\frac{x_{kc} - m_k}{s_k}\right) \right] + \int_{x_{Nd}}^{\infty} f(x) \, dx \quad \text{if } a < x_{1c}, \quad (19a)

= s_j \left[ \mu_j\!\left(\frac{x_{jd} - m_j}{s_j}\right) - \mu_j\!\left(\frac{a - m_j}{s_j}\right) \right] + \sum_{k=j+1}^{N} s_k \left[ \mu_k\!\left(\frac{x_{kd} - m_k}{s_k}\right) - \mu_k\!\left(\frac{x_{kc} - m_k}{s_k}\right) \right] + \int_{x_{Nd}}^{\infty} f(x) \, dx \quad \text{if } a \in [x_{jc}, x_{jd}],\ j < N, \quad (19b)

= s_N \left[ \mu_N\!\left(\frac{x_{Nd} - m_N}{s_N}\right) - \mu_N\!\left(\frac{a - m_N}{s_N}\right) \right] + \int_{x_{Nd}}^{\infty} f(x) \, dx \quad \text{if } a \in [x_{Nc}, x_{Nd}], \quad (19c)

= \int_a^{\infty} f(x) \, dx \quad \text{if } a > x_{Nd}. \quad (19d)

G(a) = \int_a^{\infty} \int_y^{\infty} f(x) \, dx \, dy \approx \Gamma(a)

= \int_a^{x_{1c}} \int_y^{\infty} f(x) \, dx \, dy + \sum_{k=1}^{N} s_k^2 \left[ v_k\!\left(\frac{x_{kc} - m_k}{s_k}\right) - v_k\!\left(\frac{x_{kd} - m_k}{s_k}\right) \right] + \sum_{k=1}^{N} s_k \left[ \mu_k\!\left(\frac{x_{kd} - m_k}{s_k}\right)(x_{kd} - a) - \mu_k\!\left(\frac{x_{kc} - m_k}{s_k}\right)(x_{kc} - a) \right] + (x_{Nd} - a) \int_{x_{Nd}}^{\infty} f(x) \, dx + \int_{x_{Nd}}^{\infty} \int_y^{\infty} f(x) \, dx \, dy \quad \text{if } a < x_{1c}, \quad (20a)

= s_j^2 \left[ v_j\!\left(\frac{a - m_j}{s_j}\right) - v_j\!\left(\frac{x_{jd} - m_j}{s_j}\right) \right] + s_j \mu_j\!\left(\frac{x_{jd} - m_j}{s_j}\right)(x_{jd} - a) + \sum_{k=j+1}^{N} s_k^2 \left[ v_k\!\left(\frac{x_{kc} - m_k}{s_k}\right) - v_k\!\left(\frac{x_{kd} - m_k}{s_k}\right) \right] + \sum_{k=j+1}^{N} s_k \left[ \mu_k\!\left(\frac{x_{kd} - m_k}{s_k}\right)(x_{kd} - a) - \mu_k\!\left(\frac{x_{kc} - m_k}{s_k}\right)(x_{kc} - a) \right] + (x_{Nd} - a) \int_{x_{Nd}}^{\infty} f(x) \, dx + \int_{x_{Nd}}^{\infty} \int_y^{\infty} f(x) \, dx \, dy \quad \text{if } a \in [x_{jc}, x_{jd}],\ j < N, \quad (20b)

= s_N^2 \left[ v_N\!\left(\frac{a - m_N}{s_N}\right) - v_N\!\left(\frac{x_{Nd} - m_N}{s_N}\right) \right] + s_N \mu_N\!\left(\frac{x_{Nd} - m_N}{s_N}\right)(x_{Nd} - a) + (x_{Nd} - a) \int_{x_{Nd}}^{\infty} f(x) \, dx + \int_{x_{Nd}}^{\infty} \int_y^{\infty} f(x) \, dx \, dy \quad \text{if } a \in [x_{Nc}, x_{Nd}], \quad (20c)

= \int_a^{\infty} \int_y^{\infty} f(x) \, dx \, dy \quad \text{if } a > x_{Nd}. \quad (20d)
We identify two cases. In the first case, we set x_U high enough and x_L low enough so that the integrals of f(x) over [x_{Nd}, ∞) and over [a, x_{1c}] in (19) and (20) are approximately zero. In the second case, an example of which we give below, the function f(x) for large values of x is well approximated by a known density function (e.g., normal), while for smaller values of x the approximation by the known density function is not valid. In this second case, we can make use of the known values of the integrals of the density function over [x_{Nd}, ∞) in order to carry out the approximation with a smaller value of x_{Nd} than would be required in the first case. Using a smaller value of x_{Nd} saves computing time since it reduces the number of intervals and the number of points required by the approximation procedure.

Consider the case of the truncated convolutions, (2), where the probability density function (normal) is known and calculable. Suppose the convolution is truncated from below. That is, in (2), b_j = t_{j+1}, j = 1, ..., n-1, and b_n = x, but a_j > 0 for j = 1, ..., n. There is no known expression for the truncated convolution of normal densities. However, for large values of the random variable, x, the truncated convolution, (2), is closely approximated by the untruncated convolution, since the sample paths which pass through small values in t_1, t_2, ..., t_{n-1} and then achieve a high value of t_n are unlikely. Therefore, in this model, the upper tail of the density function generated by the convolution can be approximated accurately by its untruncated values. Over
the rest of the domain of the random variable, x, the piecewise polynomial approximation applies. Hence, in computing (19) and (20), we can substitute the known values of integrals of f(x) for x > x_{Nd}. Expressions (19) and (20) will also support the case in which the density function for small values of x is well approximated by a known density function.
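The role of the per-segment antiderivatives in (19) can be seen in a small sketch (ours, assuming NumPy; segs holds, for each segment, the interval [x_c, x_d], origin m, scale s, and the segment polynomial φ in τ; the tail integrals beyond x_{Nd} are taken as zero, as in the first case above):

from numpy.polynomial import Polynomial

def Phi(a, segs):
    """Fitted complementary cumulative, eq. (19), tail integrals set to zero."""
    total = 0.0
    for (x_c, x_d, m, s, phi) in segs:
        if a >= x_d:
            continue                       # segment lies entirely below a
        mu = phi.integ()                   # antiderivative of phi in tau
        lo = max(a, x_c)
        total += s * (mu((x_d - m) / s) - mu((lo - m) / s))
    return total

# toy check: uniform density 0.5 on [0, 2], split into two segments
segs = [(0.0, 1.0, 0.5, 0.5, Polynomial([0.5])),
        (1.0, 2.0, 1.5, 0.5, Polynomial([0.5]))]
print(Phi(0.5, segs))                      # 0.75 = P(x >= 0.5) for U(0, 2)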
4.3. Convolution of the density function

In order to approximate the resulting density from a convolution such as h in (2), we will begin by integrating the polynomial approximations for the first two functions in the integrand, f_2 and g_1. We redefine f_1 as g_1 in (1) in order to conform to the notation for a recursion that we develop below. We assume that f_2 and g_1 have been approximated by piecewise polynomials over the partitions J_{f2} and J_{g1}, respectively. We would like to approximate the function h, which is the convolution (perhaps truncated) of f_2 and g_1:
g_2(t_2) = \int_{a_1}^{b_1} f_2(t_2 - t_1) g_1(t_1) \, dt_1, \qquad b_1 \le t_2. \quad (21)
For a given t_2, the partition J_{f2}, in terms of t_2 - t_1, can be re-expressed as a partition in terms of intervals in t_1, which we shall denote J_{f2}(t_2). By overlaying J_{f2}(t_2) with J_{g1} we obtain a partition in t_1 with no more than N_{f2} + N_{g1} intervals. We call this partition J_{12}(t_2). The i-th interval in J_{12}(t_2), [t_{1ic}, t_{1id}], represents the intersection of some interval j'(i) in J_{f2}(t_2) and some interval j(i) in J_{g1}. We convolve the corresponding polynomial approximations for f_2(t_2 - t_1) and g_1(t_1) over the intersection of these intervals to obtain values of g_2(t_2) as follows:
f_2(t_2 - t_1) \approx \phi_{f_2 j'}\!\left(\frac{t_2 - t_1 - m_{j'}}{s_{j'}}\right), \qquad t_1 \in [t_{1j'c}, t_{1j'd}], \quad (22)

g_1(t_1) \approx \phi_{g_1 j}\!\left(\frac{t_1 - m_j}{s_j}\right), \qquad t_1 \in [t_{1jc}, t_{1jd}], \quad (23)

g_2(t_2) \approx \sum_{i \in J_{12}(t_2)} \int_{t_{1ic}}^{t_{1id}} \phi_{f_2 j'(i)}\!\left(\frac{t_2 - t_1 - m_{j'(i)}}{s_{j'(i)}}\right) \phi_{g_1 j(i)}\!\left(\frac{t_1 - m_{j(i)}}{s_{j(i)}}\right) dt_1. \quad (24)
The general expression for the integral over any given interval, i ∈ J_{12}(t_2), is

\int_{t_{1ic}}^{t_{1id}} \phi_{f_2 j'}\!\left(\frac{t_2 - t_1 - m_{j'}}{s_{j'}}\right) \phi_{g_1 j}\!\left(\frac{t_1 - M_j}{S_j}\right) dt_1 = \sum_{k=0}^{P_f} \sum_{l=0}^{P_g} \sum_{r=0}^{k} \frac{\gamma_k^f}{s_{j'}^k} \frac{\gamma_l^g}{S_j^l} \binom{k}{r} (-1)^r \frac{(t_2 - m_{j'} - M_j)^{k-r}}{r + l + 1} \left[ (t_{1id} - M_j)^{r+l+1} - (t_{1ic} - M_j)^{r+l+1} \right], \quad (25)

where M_j and S_j denote the origin and scaling factor of interval j(i) in J_{g1}, and the γ^f, γ^g are the monomial coefficients, (11), of the two segment polynomials.
From (25) it is clear that truncated convolutions pose no special problems when using polynomial approximations. By computing (25) over each interval in the joint partition, J_{12}(t_2), and adding the results, we obtain the approximate value for g_2(t_2) shown in (24). We use (25) in (24) to compute individual values of g_2(t_2) at a collection of points within [t_{2L}, t_{2U}]. These data points are then used in carrying out the numerical integration in (9) in order to obtain a piecewise polynomial approximation for g_2(t_2) over a chosen partition of [t_{2L}, t_{2U}]. In this manner, an n-fold convolution such as (1) or (2) can be evaluated iteratively. At the k-th iteration, we convolve the polynomial approximation of g_k(t_k) with the polynomial approximation to f_{k+1}(t_{k+1} - t_k) via (25) and (24) and then obtain a piecewise approximation to g_{k+1}(t_{k+1}).
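A minimal sketch of one convolution step (ours, assuming NumPy): since each segment approximation is a polynomial, f_2(t_2 - t_1) g_1(t_1) is itself a polynomial in t_1, and each term of (24) can be integrated exactly, which is what the closed form (25) expresses. Here a single overlap interval [lo, hi] plays the role of one i in J_{12}(t_2):

from numpy.polynomial import Polynomial

f2 = Polynomial([0.5])              # segment polynomial for f2 (uniform 0.5 on [0, 2])
g1 = Polynomial([0.5])              # segment polynomial for g1 (uniform 0.5 on [0, 2])

def g2(t2, lo, hi):
    """Integral of f2(t2 - t1) * g1(t1) over t1 in [lo, hi]."""
    shifted = f2(Polynomial([t2, -1.0]))   # compose f2 with (t2 - t1)
    anti = (shifted * g1).integ()          # exact antiderivative of the product
    return anti(hi) - anti(lo)

# density of the sum of two U(0, 2) variables at t2 = 1; overlap interval is [0, 1]
print(g2(1.0, 0.0, 1.0))                   # 0.25, the triangular density at 1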
5. Empirical tests

As far as computing time is concerned, in all test problems the time required to carry out the approximation for each interval of approximation was not significant (less than 0.2 seconds on an IBM 3090). However, in some applications where the probability density functions are not known, the data points used in the polynomial approximation may be generated through simulations or estimated judgementally by a decision maker, which can be time-consuming. Also, the potential for repeated evaluation of the polynomials in the problem context implies that the order of the polynomial approximation will affect computing time. Therefore, we take the number of data points and the order of the polynomial approximation as proxy measures of CPU time and keep them at the smallest values possible while achieving the desired accuracy.

In addition to numerically evaluating the integrals (15) and (16), tight bounds were computed for them. This was done by setting up a sampling partition over [x_L, x_U]. The reader should note that the sampling intervals thus defined are not the same as the intervals of interest defined above. The sampling partition is defined only in order to measure the accuracy of the approximation for the sake of this empirical test, not to carry out the approximation. We define the sampling partition to be finer than the partition J_f. The sampling partition is defined in such a way that each sample interval is contained within one interval of interest (that is, it does not overlap two intervals of interest). For the j-th interval of interest, the functions F and G and their fitted polynomials are computed at each sample point along with the errors e_F and e_G:
e_F(x) = (F(x) - \Phi_j(x)) / F(x), \quad (26)

e_G(x) = (G(x) - \Gamma_j(x)) / G(x). \quad (27)
In order to be confident in the statements about the accuracy of the approximations that we shall make, a bounding procedure was derived for putting upper and lower bounds on the error function within a sampling interval, based on the values of the error function at the endpoints of the sampling interval. This procedure is derived in the Appendix. Using this procedure to obtain bounds on e_F and e_G over each sampling interval in the integration, upper bounds on (15) and (16) can be obtained.

In most of the empirical tests below, a normal density was used for f. This choice was made because routines for computing integrated values of the normal, accurate to the fifth decimal place, are available. Hence, accurate values of F and G can be computed for (26) and (27). Moreover, it is reasonable to suggest that a procedure yielding an accurate approximation to various segments of the normal density will perform well on other density functions. This suggestion is borne out by the results quoted below on the accuracy of the approximation of the gamma and beta densities. Different segments of the normal density were used as intervals of interest, [x_{jc}, x_{jd}]. For testing the approximation procedure for a single density function (i.e., not a convolution), the end points of these intervals were set to μ - 2σ, μ - σ, μ + σ, μ + 2σ. This simulated different conditions (i.e., convex, concave, increasing, decreasing) for the approximation procedure. It was found that the performance of the approximation procedure was insensitive to changes in μ and σ.

The scaling of τ for the Laguerre and Hermite polynomials as described above was investigated first. Fig. 2 shows the dependence of the accuracy of the approximation on the scaling factor, s, for these two polynomials. Segments of the normal density function were used as test cases. It was found that setting s = 5 for the Laguerre polynomials worked best overall. For the Hermite polynomials, the issue is more complex. In this case, Fig. 2 shows the sensitivity of the approximation to s for two different intervals of interest over which the shape of the density function is quite different. In the case of the Hermite polynomial, this sensitivity is more unpredictable, as the shape of the function segment being approximated has a large effect. Since one would use this polynomial approximation procedure in cases where the nature of the function is not well known, this unpredictability greatly limits the usefulness of the Hermite polynomials.
[Fig. 2. Sensitivity to scaling factor. |e_F| (percent) versus s, for a normal density with power of approximation = 5 and number of points = 50: Laguerre with interval of interest (μ - σ, μ + σ), and Hermite with intervals of interest (μ, μ + σ) and (μ + σ, μ + 2σ).]
Again using a normal distribution as a test case, the influence of the power of the polynomial approximation on the accuracy of the approximation was tested. For each polynomial type, the approximation procedure was executed over the segments of a normal density identified above. For the segment of the normal density from one standard deviation below the mean to one standard deviation above the mean, the results are shown in Table 1. The results on the other segments are very similar. The best performance was obtained from the Tchebyshev polynomial of the second type.

For every polynomial type there is an optimal power of the approximation. When the order of the polynomial is too small, the polynomial function simply cannot take on all of the complexity in the shape of the density function. When the order of the polynomial is too high, the polynomial function is volatile. In those parts of the approximation interval where the weight function in (17) is high, the approximation is extremely accurate. However, with high-order terms, the price that is paid for this accuracy is a polynomial function that swings wildly in those parts of the approximation interval where the weight is small. While the weighted squared error, (17), will show improvement as the order of the approximation increases, the average errors of the integrals, (15) and (16), may not. From Table 1, it is seen that a good order for the polynomial approximation is typically 5.

In Table 2, the same kind of comparison is shown for the case where the tail of the normal distribution is approximated. In this case, the segment being approximated runs from one standard deviation above the mean to two standard deviations above the mean. Once again, the Tchebyshev polynomial of the second type shows the best performance, with the Laguerre polynomial being second best. The Legendre polynomial and the Tchebyshev polynomial of the first type are not impressive at all. The problem with the Tchebyshev polynomial of the first type is that the weight function becomes infinite at the ends of the approximation interval. This causes the approximation procedure to attempt to approximate the density function at the fringes of the approximation interval very accurately, at the expense of the accuracy of the approximation in the central part of the approximation interval. This property gives the Tchebyshev polynomials of the first type their celebrated performance advantage over least squares approximation, because they avoid the extreme errors that can be encountered with least squares approximation at the fringes of the approximation interval. However, by extending the approximation interval beyond the interval of interest as described above, this problem can be diminished satisfactorily with any of the orthogonal polynomials
Table 1
Accuracy vs. power. Normal density; 50 points; interval of interest = [μ - σ, μ + σ]. Upper entries: |ē_F|% (upper bound on |ē_F|%); lower entries: |ē_G|% (upper bound on |ē_G|%)

Polynomial type      Power of approximation
                     3                4                 5                 6
Tchebyshev Type 1    13.7 (14.4)      9.63 (10.1)       9.63 (10.1)       14.2 (14.9)
                     8.67 (9.13)      5.58 (5.88)       5.58 (5.88)       8.53 (9.04)
Tchebyshev Type 2    0.239 (0.285)    0.0776 (0.123)    0.0779 (0.123)    0.114 (0.163)
                     0.200 (0.212)    0.0496 (0.0529)   0.0504 (0.0538)   0.109 (0.116)
Legendre             10.5 (11.0)      10.7 (11.2)       10.7 (11.2)       10.7 (11.2)
                     6.30 (6.63)      6.35 (6.67)       6.35 (6.67)       6.42 (6.77)
Laguerre (s = 5)     1.38 (1.47)      0.801 (0.879)     0.315 (0.365)     1.33 (1.41)
                     0.600 (0.629)    0.421 (0.456)     0.194 (0.206)     1.05 (1.11)
Hermite (s = 15)     1.00 (1.09)      2.12 (2.24)       1.25 (1.34)       3.39 (3.69)
                     0.892 (0.944)    1.62 (1.70)       0.902 (0.954)     3.19 (3.40)
considered in this study. Since the weight function of the Legendre polynomial is uniform, this same phenomenon occurs, but to a lesser degree. Generally, orthogonal polynomials with weighting functions that diminish toward the ends of the interval of approximation performed better with respect to performance measures (15) and (16).

Returning to Fig. 1, we see the error terms for the approximation of the normal density from one standard deviation below the mean to one standard deviation above the mean. The approximation was made with Tchebyshev polynomials of the second type. The increase in the error magnitudes toward the ends of the interval of approximation occurs rather sharply. As long as the interval of interest is contained within the interval of approximation in such a way as to avoid these extreme errors, the overall accuracy of the approximation will be high. It was found that making the interval of approximation 25% wider than the interval of interest was sufficient. A nice property of Tchebyshev polynomials is that the error functions are always symmetric about the center of the interval, as shown in Fig. 1 by the errors for approximating the density function. This property implies that centering the interval of interest in the interval of approximation will achieve the best results, and indeed centering gave the best results for a wide variety of density-function segments.
Table 2
Accuracy vs. power. Normal density; 50 points; interval of interest = [μ + σ, μ + 2σ]. Upper entries: |ē_F|% (upper bound on |ē_F|%); lower entries: |ē_G|% (upper bound on |ē_G|%)

Polynomial type      Power of approximation
                     3                4                 5                 6
Tchebyshev Type 1    11.3 (11.9)      5.07 (5.50)       5.45 (5.86)       7.85 (8.32)
                     8.11 (8.58)      3.40 (3.63)       3.50 (3.73)       4.77 (5.11)
Tchebyshev Type 2    0.324 (0.376)    0.285 (0.344)     0.198 (0.256)     0.200 (0.263)
                     0.276 (0.374)    0.171 (0.278)     0.0835 (0.191)    0.0627 (0.185)
Legendre             5.06 (5.31)      5.03 (5.29)       5.18 (5.44)       4.82 (5.07)
                     3.02 (3.21)      3.01 (3.21)       3.12 (3.33)       2.84 (3.04)
Laguerre (s = 5)     0.878 (0.960)    0.402 (0.473)     0.720 (0.807)     1.53 (1.66)
                     0.641 (0.733)    0.345 (0.442)     0.628 (0.723)     1.27 (1.37)
Hermite (s = 10)     0.894 (0.960)    0.286 (0.339)     0.400 (0.455)     14.8 (15.5)
                     0.599 (0.690)    0.150 (0.246)     0.274 (0.371)     9.89 (10.5)
[Fig. 3. Accuracy (percent) vs. number of points. |e_F| and |e_G| for a normal density (power of approximation = 5, interval of interest = mean ± one standard deviation, Tchebyshev polynomials of the second type), for 15 to 95 points.]
The error function from an approximation using orthogonal polynomials also shows some symmetry with respect to the sign of the error. This property implies that when the polynomial is integrated, the positive and negative errors will have a tendency to cancel. Because of this, the integrated forms of the polynomials are usually more accurate approximations to the integrals of the density functions than the polynomials are to the densities. Fig. 1 shows the error functions for the first and second integrals of the normal density, e_F and e_G, respectively.

Increasing the number of points used in the approximation procedure will enhance the accuracy of the approximation. Fig. 3 shows the improvement in accuracy as the number of points used increases when using Tchebyshev polynomials of the second type. The diminishing marginal improvement from added data and the level of accuracy that is typically required suggest that a set of points on the order of 50 is adequate for a given interval of approximation.

Finally, the application of the approximation procedure was extended to other density functions. Results with the gamma density function and the beta density function were similar to those obtained with the normal density. Table 3 shows the performance of the five kinds of polynomials in approximating a segment of each of these two density functions from one standard deviation below the mean to one standard deviation above the
Table 3
Beta and Gamma approximations. Number of points = 50; interval of interest = [μ - σ, μ + σ]; power of approximation = 5. Entries: average percent absolute error of density (upper bound on average percent absolute error of density)

Polynomial type      Beta density     Gamma density
Tchebyshev-1         7.96 (8.54)      14.5 (14.6)
Tchebyshev-2         0.513 (0.593)    0.346 (0.388)
Legendre             10.0 (10.1)      15.4 (15.5)
Laguerre (s = 5)     2.16 (2.27)      1.49 (1.65)
Hermite (s = 10)     2.32 (2.45)      6.09 (6.43)
Table 4
Approximation of convolutions. Normal densities; 250 points; 4 segments; [x_L, x_U] = [μ - 3.4σ, μ + 3.4σ]; power of approximation = 5

Order of convolution   |ē_F|% (u.b. on |ē_F|%)   |ē_G|% (u.b. on |ē_G|%)
2                      0.469 (2.33)              0.408 (2.66)
3                      1.04 (2.54)               0.923 (2.78)
4                      1.65 (2.97)               1.64 (3.17)
5                      2.15 (3.40)               2.30 (3.66)
6                      2.33 (3.45)               2.65 (3.80)
7                      2.74 (3.79)               3.27 (4.27)
8                      3.80 (4.96)               4.46 (5.50)
9                      4.30 (5.41)               4.92 (5.91)
10                     5.00 (6.07)               5.56 (6.53)
mean. Since the integrals of these densities are not easily computable, the errors given in Table 3 are for the approximation of the density function. As indicated above, one can expect that the accuracy of the fit to the functions F and G would be substantially better than that shown in Table 3. In fact, the errors for the approximation of the density functions shown in Table 3 are comparable to those of the approximation of the normal density which gave the results in Tables 1 and 2. Several variations on the parameters of the beta and gamma distributions did not change the relative performance of the polynomials and did not change the errors given in Table 3 by more than the second decimal place.

The application of the method to convolutions gave good results. Accuracy does not significantly deteriorate as the number of convolutions taken increases. Table 4 gives the error results for convolutions of the normal distribution from a two-fold convolution up to a 10-fold convolution. While the method allows for approximating truncated convolutions as described above, for the purpose of testing accuracy, full convolutions were used, since it is only for these that theoretical values are available for comparison to approximated values. At each convolution, the procedure using Tchebyshev polynomials of the second type was used to approximate each of the functions involved in the convolution. The random variables involved in each probability function were allowed to range from the mean minus 3.4 standard deviations to the mean plus 3.4 standard deviations. This domain was partitioned into four segments. The intervals of fit were approximately 25% wider than the intervals of interest for each of these segments. The results of 10 tests with varied settings of the means and standard deviations yielded accuracy very similar to that shown in Table 4, again showing an insensitivity to these parameters.
6. Summary This paper has presented a method for polynomial approximation of probability functions. While the technique of using orthogonal polynomials for approximating functions is not new, the application to probability functions has not been catalogued. This research does not promote the use of orthogonal polynomial approximations as a panacea. To be sure, orthogonal polynomials have their limitations. It is only suggested that they are a desirable choice in approximating probability functions and their convolutions. Specifically, in cases where integrals of functions are required and where functional forms are required for repeated computation, the use of orthogonal polynomials is recommended. Finally, for approximating convolutions and truncated convolutions, as required by many applications, the use of orthogonal polynomials is also recommended.
Appendix
Derivation of Eqs. (4) and (5)

Let x be a random variable defined on [0, ∞) with density f.

\int_a^\infty x f(x) \, dx = \int_a^\infty x (-dF(x)/dx) \, dx.

Integrating by parts, and noting that \lim_{x \to \infty} x F(x) = 0, we obtain

-x F(x) \Big|_a^\infty + \int_a^\infty F(x) \, dx = a F(a) + G(a).

Hence,

E(x) = \int_0^\infty x f(x) \, dx = 0 \cdot F(0) + G(0) = G(0).

Formula (5) follows directly:

\int_a^\infty (x - a) f(x) \, dx = \int_a^\infty x f(x) \, dx - \int_a^\infty a f(x) \, dx = a F(a) + G(a) - a F(a) = G(a).

There are similar results for discrete random variables.
Error bounds

First, we consider the numerator of e_F(x). That is, we look at the function

\delta(x) = F(x) - \Phi_j((x - m_j)/s_j), \qquad x \in [x_{jc}, x_{jd}].
By obtaining bounds on this numerator, we will be able to obtain bounds on e_F(x) for values of x within one of the sampling intervals. We denote the given sampling interval [x̲, x̄]. The reader should not confuse the particular sampling interval with the interval of interest that contains it. That is, [x̲, x̄] ⊆ [x_{jc}, x_{jd}].

Table A.1
Tchebyshev polynomials

                      Type I                              Type II
Domain:               D = [-1, +1]                        D = [-1, +1]
Polynomials:          π_0 = 1                             π_0 = 1
                      π_1 = τ                             π_1 = 2τ
                      π_2 = 2τ² - 1                       π_2 = 4τ² - 1
                      π_3 = 4τ³ - 3τ                      π_3 = 8τ³ - 4τ
                      π_4 = 8τ⁴ - 8τ² + 1                 π_4 = 16τ⁴ - 12τ² + 1
                      π_5 = 16τ⁵ - 20τ³ + 5τ              π_5 = 32τ⁵ - 32τ³ + 6τ
                      π_6 = 32τ⁶ - 48τ⁴ + 18τ² - 1        π_6 = 64τ⁶ - 80τ⁴ + 24τ² - 1
Recursion relation:   π_{i+1} = 2τπ_i - π_{i-1}           π_{i+1} = 2τπ_i - π_{i-1}
Weighting function:   ω(τ) = 1/√(1 - τ²)                  ω(τ) = √(1 - τ²)
Normalization:        ∫_{-1}^{+1} π_i π_j ω dτ =          ∫_{-1}^{+1} π_i π_j ω dτ =
                      0 (i ≠ j), π/2 (i = j ≠ 0),         0 (i ≠ j), π/2 (i = j)
                      π (i = j = 0)

Table A.2
Legendre, Laguerre and Hermite polynomials

Legendre:   D = [-1, +1]; ω(τ) = 1
            π_0 = 1, π_1 = τ, π_2 = (3τ² - 1)/2, π_3 = (5τ³ - 3τ)/2,
            π_4 = (35τ⁴ - 30τ² + 3)/8, π_5 = (63τ⁵ - 70τ³ + 15τ)/8,
            π_6 = (231τ⁶ - 315τ⁴ + 105τ² - 5)/16
            Recursion: π_{i+1} = ((2i + 1)τπ_i - iπ_{i-1})/(i + 1)
            Normalization: ∫_{-1}^{+1} π_i π_j ω dτ = 0 (i ≠ j), 2/(2i + 1) (i = j)

Laguerre:   D = [0, ∞); ω(τ) = e^{-τ}
            π_0 = 1, π_1 = -τ + 1, π_2 = (τ² - 4τ + 2)/2, π_3 = (-τ³ + 9τ² - 18τ + 6)/6,
            π_4 = (τ⁴ - 16τ³ + 72τ² - 96τ + 24)/24,
            π_5 = (-τ⁵ + 25τ⁴ - 200τ³ + 600τ² - 600τ + 120)/120,
            π_6 = (τ⁶ - 36τ⁵ + 450τ⁴ - 2400τ³ + 5400τ² - 4320τ + 720)/720
            Recursion: π_{i+1} = ((2i + 1 - τ)π_i - iπ_{i-1})/(i + 1)
            Normalization: ∫_0^∞ π_i π_j ω dτ = 0 (i ≠ j), 1 (i = j)

Hermite:    D = (-∞, +∞); ω(τ) = e^{-τ²}
            π_0 = 1, π_1 = 2τ, π_2 = 4τ² - 2, π_3 = 8τ³ - 12τ,
            π_4 = 16τ⁴ - 48τ² + 12, π_5 = 32τ⁵ - 160τ³ + 120τ,
            π_6 = 64τ⁶ - 480τ⁴ + 720τ² - 120
            Recursion: π_{i+1} = 2τπ_i - 2iπ_{i-1}
            Normalization: ∫_{-∞}^{+∞} π_i π_j ω dτ = 0 (i ≠ j), 2^i √π i! (i = j)
Define the slopes of the secant lines of the functions F and Φ from x̲ to x̄ as

T_\Phi = [\Phi_j((x̄ - m_j)/s_j) - \Phi_j((x̲ - m_j)/s_j)] / (x̄ - x̲),

T_F = (F(x̄) - F(x̲)) / (x̄ - x̲).

Using these slopes and the slopes obtained from the derivatives at the endpoints, x̲ and x̄, we have three linear approximations for each of these two functions over [x̲, x̄]:

a_F(x) = F(x̲) + T_F (x - x̲),   b_F(x) = F(x̲) + F'(x̲)(x - x̲),   c_F(x) = F(x̄) + F'(x̄)(x - x̄),
a_\Phi(x) = \Phi(x̲) + T_\Phi (x - x̲),   b_\Phi(x) = \Phi(x̲) + \Phi'(x̲)(x - x̲),   c_\Phi(x) = \Phi(x̄) + \Phi'(x̄)(x - x̄).
For a known probability function used for empirical testing, we can locate the inflection points. Likewise, we can locate the inflection points and the local maxima and local minima of the piecewise polynomial. Doing this allows us to divide [x_L, x_U] into sampling intervals over each of which the function F and its polynomial approximation have known convexity/concavity and monotonicity properties. We can also set up the sampling partition so that these inflection points are the end points of certain sampling intervals. This will guarantee that over any sampling interval, the convexity or concavity property and monotonicity property of F and Φ_j will not change.

If F(x) is convex over [x̲, x̄], then for x ∈ [x̲, x̄],

a_F(x) ≥ F(x),   b_F(x) ≤ F(x),   c_F(x) ≤ F(x).

If F(x) is concave over [x̲, x̄], then for x ∈ [x̲, x̄],

a_F(x) ≤ F(x),   b_F(x) ≥ F(x),   c_F(x) ≥ F(x).
Similar statements apply to the functions a_Φ, b_Φ, c_Φ. Define:

δ_1(x) = a_F(x) - a_Φ(x),   δ_2(x) = a_F(x) - b_Φ(x),   δ_3(x) = a_F(x) - c_Φ(x),
δ_4(x) = b_F(x) - a_Φ(x),   δ_5(x) = b_F(x) - b_Φ(x),   δ_6(x) = b_F(x) - c_Φ(x),
δ_7(x) = c_F(x) - a_Φ(x),   δ_8(x) = c_F(x) - b_Φ(x),   δ_9(x) = c_F(x) - c_Φ(x).
We identify four cases for the sampling interval [x̲, x̄].

Case 1: F is convex, Φ is convex. Then for all x ∈ [x̲, x̄],
δ(x) ≤ δ_2(x); δ(x) ≤ δ_3(x); δ(x) ≥ δ_4(x); δ(x) ≥ δ_7(x).
A = max(min_x δ_4(x), min_x δ_7(x)) ≤ δ(x) ≤ min(max_x δ_2(x), max_x δ_3(x)) = B.
Note: since δ_i(x) for all i is linear over [x̲, x̄], the max (min) of δ_i(x) takes place at either x̲ or x̄. Hence, A and B are easily computed.

Case 2: F is convex, Φ is concave.
δ(x) ≤ δ_1(x); δ(x) ≥ δ_5(x); δ(x) ≥ δ_6(x); δ(x) ≥ δ_8(x); δ(x) ≥ δ_9(x).
A = max(min_x δ_5(x), min_x δ_6(x), min_x δ_8(x), min_x δ_9(x)) ≤ δ(x) ≤ max_x δ_1(x) = B.

Case 3: F is concave, Φ is convex.
δ(x) ≤ δ_5(x); δ(x) ≤ δ_6(x); δ(x) ≤ δ_8(x); δ(x) ≤ δ_9(x); δ(x) ≥ δ_1(x).
A = min_x δ_1(x) ≤ δ(x) ≤ min(max_x δ_5(x), max_x δ_6(x), max_x δ_8(x), max_x δ_9(x)) = B.

Case 4: F is concave, Φ is concave.
δ(x) ≤ δ_4(x); δ(x) ≤ δ_7(x); δ(x) ≥ δ_2(x); δ(x) ≥ δ_3(x).
A = max(min_x δ_2(x), min_x δ_3(x)) ≤ δ(x) ≤ min(max_x δ_4(x), max_x δ_7(x)) = B.

From all of these results we can derive the following bounds on |δ(x)| over [x̲, x̄]:

|δ(x)| ≤ max(-A, B),   |δ(x)| ≥ max(0, A, -B).

Therefore, for x ∈ [x̲, x̄],

|e_F(x)| ≤ max(-A, B) / min(F(x̲), F(x̄)),   |e_F(x)| ≥ max(0, A, -B) / max(F(x̲), F(x̄)).

The same derivation results in a similar bounding procedure for |e_G(x)|.
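As an illustration of the bounding procedure, the following sketch (ours; F, dF, Phi and dPhi are the two functions and their derivatives, supplied by the caller) computes A and B for Case 1, exploiting the linearity of each δ_i so that only endpoint evaluations are needed:

def case1_bounds(x_lo, x_hi, F, dF, Phi, dPhi):
    """A <= delta(x) <= B on [x_lo, x_hi] when F and Phi are both convex there."""
    T_F = (F(x_hi) - F(x_lo)) / (x_hi - x_lo)           # secant slopes
    T_P = (Phi(x_hi) - Phi(x_lo)) / (x_hi - x_lo)
    a_F = lambda x: F(x_lo) + T_F * (x - x_lo)          # secant over-estimates F
    b_F = lambda x: F(x_lo) + dF(x_lo) * (x - x_lo)     # tangents under-estimate F
    c_F = lambda x: F(x_hi) + dF(x_hi) * (x - x_hi)
    a_P = lambda x: Phi(x_lo) + T_P * (x - x_lo)
    b_P = lambda x: Phi(x_lo) + dPhi(x_lo) * (x - x_lo)
    c_P = lambda x: Phi(x_hi) + dPhi(x_hi) * (x - x_hi)
    ends = (x_lo, x_hi)                                 # linear deltas: extremes at endpoints
    d2 = [a_F(x) - b_P(x) for x in ends]                # delta <= delta_2, delta_3
    d3 = [a_F(x) - c_P(x) for x in ends]
    d4 = [b_F(x) - a_P(x) for x in ends]                # delta >= delta_4, delta_7
    d7 = [c_F(x) - a_P(x) for x in ends]
    A = max(min(d4), min(d7))
    B = min(max(d2), max(d3))
    return A, B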
References

Anklesaria, K.P., and Drezner, Z. (1986), "A multivariate approach to estimating the completion time for PERT networks", Journal of the Operational Research Society 37/8, 811-815.
Arfken, G. (1970), Mathematical Methods for Physicists, 2nd ed., Academic Press, New York.
Badinelli, R. (1992), "A model for continuous-review pull policies in serial inventory systems", Operations Research 40/1, 142-156.
Barlow, R., and Proschan, F. (1981), Statistical Theory of Reliability and Life Testing, McArdle Press, Silver Spring, MD.
Cinlar, E. (1975), Introduction to Stochastic Processes, Prentice-Hall, Englewood Cliffs, NJ.
Cleroux, R., and McConalogue, D.J. (1976), "A numerical algorithm for recursively-defined convolution integrals involving distribution functions", Management Science 22/10, 1138-1146.
Cox, D. (1967), Renewal Theory, 2nd ed., Barnes & Noble, New York.
Dodin, B.M. (1980), "Approximating the probability distribution function of the project completion time in PERT networks", OR Report No. 153 (revised), OR Program, North Carolina State University, Raleigh, NC.
Dodin, B.M. (1984), "Determining the k most critical paths in PERT networks", Operations Research 32/4, 859-877.
Dodin, B.M. (1985), "Bounding the project completion time distribution in PERT networks", Operations Research 33/4, 862-881.
Dodin, B.M., and Elmaghraby, S.E. (1985), "Approximating the criticality indices of the activities in PERT networks", Management Science 31/2, 207-223.
Powell, M.J.D. (1981), Approximation Theory and Methods, Cambridge University Press, Cambridge.
Ragsdale, C. (1989), "Current state of network simulation in project management theory and practice", OMEGA 17/1, 21-25.
Robillard, P., and Trahan, M. (1977), "The completion time of PERT networks", Operations Research 25/1, 15-29.
Schonberger, R.J., "Why projects are always late: A rationale based on manual simulation of a PERT/CPM network", Interfaces 11/5, 66-70.
Scully, D. (1983), "The completion time of PERT networks", Journal of the Operational Research Society 34/2, 155-158.
Shore, H. (1986), "Simple general approximations for a random variable and its inverse distribution function based on linear transformations of a nonskewed variate", SIAM Journal on Scientific and Statistical Computing 7/1, 1-23.
Tadikamella, P.R. (1979), "The lognormal approximation to the lead time demand in inventory control", OMEGA 7/6, 553-556.
Tadikamella, P.R. (1981), "The inverse Gaussian approximation to lead time demand", International Journal of Production Research 19/2, 213-219.