Sensitivity analysis of static simulation models of discrete event systems by the local approximation methods


V.Ya. Katkovnik
Department of Mechanical Engineering, Leningrad State Technical University, Leningrad, USSR

Abstract. We present methods for deriving sensitivities of performance measures for computer simulation models. We show that both the sensitivities and the performance measure can be estimated simultaneously from the same simulation run. The local approximation makes it possible to reconstruct performance measures, considered as regressions given by simulation, and their derivatives in the whole domain of argument values. We study the case when the performance is an unspecified regression. We give precise statements on strong uniform convergence and on the convergence rate of the suggested estimates. These statements may be applied to solve some optimization and sensitivity analysis problems.

Key words. Sensitivity analysis; discrete event systems; simulation; local approximation; nonparametric estimation.

INTRODUCTION

Many modern technical systems can be considered as discrete event ones. Examples of such systems are computer networks, flexible manufacturing systems, automatic production lines, etc. The conventional approach for analyzing such complex systems is simulation. By simulation we mean the standard computer-based discrete event simulation, which involves writing a computer program to mimic the system and then performing Monte Carlo experiments.

Consider a stochastic discrete event system (DES) characterized by some parameter vector x ∈ Rⁿ. Let w represent a specific sample realization of this system, i.e. w contains all affecting stochastic perturbations. Given w, the performance of the DES is described by some sample function y(x, w). Let

y(x) = E{y(x, w)} = ∫ y(x, w) dF(w),  x ∈ X ⊂ Rⁿ,

be the expected performance of the DES, where the expectation is taken over all possible sample realizations w with joint c.d.f. (cumulative distribution function) F(w).

The modern approach to design and optimization problems of complex systems is based on the understanding that it is not enough to find an optimal decision, even when it is possible. It is necessary to study the robustness properties of this decision. It means, for example, that we have to research the sensitivity of the optimal performance with respect to different parameter variations. By sensitivities we mean derivatives: gradients, Hessians, etc. The natural estimator for y(x) is the sample mean

ŷ_N(x) = (1/N) Σ_{s=1}^{N} y(x, w^(s)),   (1)

where w^(s) is the s-th sample realization of w.
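The sample-mean estimator (1) is a plain Monte Carlo average over independent realizations. A minimal sketch, assuming a toy sample function y(x, w) = x·w with w ~ Uniform(0, 1) (an illustration, not the paper's model):

```python
import random

def sample_performance(x, w):
    # Illustrative sample function y(x, w): a stand-in for one
    # discrete event simulation run with parameter x and noise w.
    return x * w

def mc_estimate(x, n_runs, seed=0):
    # Sample-mean estimator (1): average y(x, w_s) over N
    # independent realizations w^(s).
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_runs):
        w = rng.random()          # one sample realization w^(s)
        total += sample_performance(x, w)
    return total / n_runs

y_hat = mc_estimate(2.0, 100_000)
# E{y(2, w)} = 2 * E{w} = 1.0 for w ~ Uniform(0, 1)
```

With N = 100000 runs the estimate is within a few standard errors of the true expected performance.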

We will separate two essentially different cases: y(x, w) is a specified function of x (available analytically), and y(x, w) is an unspecified one. If y(x, w) is differentiable in x in the specified case, we can easily get estimates of all necessary derivatives of y(x). These estimates are like (1), with y(x, w) replaced by the corresponding derivatives with respect to x. Note that then we can have estimates of the performance and all sensitivities simultaneously. It means that all these estimates may be obtained on a single realization, on a single sample set of the random values w.
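For the specified case, one stream of realizations w^(s) can feed both the performance estimate and the derivative estimate at once. A sketch under an assumed sample function y(x, w) = x² + w·x (so that y(x) = x², y′(x) = 2x):

```python
import random

def y_sample(x, w):      # specified sample function y(x, w)
    return x * x + w * x

def dy_dx_sample(x, w):  # its analytical derivative in x
    return 2.0 * x + w

def estimate_value_and_gradient(x, n_runs, seed=1):
    # One simulation run (one stream of w's) yields BOTH the
    # performance estimate and its sensitivity estimate.
    rng = random.Random(seed)
    y_acc = g_acc = 0.0
    for _ in range(n_runs):
        w = rng.gauss(0.0, 1.0)   # shared realization w^(s)
        y_acc += y_sample(x, w)
        g_acc += dy_dx_sample(x, w)
    return y_acc / n_runs, g_acc / n_runs

y_hat, g_hat = estimate_value_and_gradient(1.5, 50_000)
# true values: y(1.5) = 2.25, y'(1.5) = 3.0
```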
ŷ_N^(α)(x) = (1/N) Σ_{s=1}^{N} g^(α)(u^(s)) y(x − β u^(s), w^(s)),   (2)

where the u^(s) ∈ Rⁿ are independent random vectors with some c.d.f., and g^(α) is a weight function determined by this c.d.f. and by the derivative order α.
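A randomized-smoothing gradient estimator in the spirit of (2) can be sketched as follows; the scalar case, Gaussian directions u^(s), and the toy sample function are assumptions made for illustration:

```python
import random

def y_sample(x, w):
    # Toy sample performance, chosen so that the true gradient
    # y'(x) = 2x is known for checking the estimate.
    return x * x + w

def smoothed_gradient(x, beta, n_runs, seed=2):
    # Randomized smoothing in the spirit of (2): average
    # u_s * (y(x + beta*u_s, w_s) - y(x, w_s)) / beta
    # over random directions u_s ~ N(0, 1); both evaluations
    # share the same realization w_s.
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n_runs):
        w = rng.gauss(0.0, 1.0)
        u = rng.gauss(0.0, 1.0)
        acc += u * (y_sample(x + beta * u, w) - y_sample(x, w)) / beta
    return acc / n_runs

g_hat = smoothed_gradient(2.0, beta=0.1, n_runs=40_000)
# true gradient y'(2) = 4
```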

To obtain it for all x ∈ X we must use repeated simulation for every desirable x ∈ X; this way obviously incurs too high a computational cost. The following way is an alternative. Let the simulation experiments be scattered, completely covering in some sense the whole domain X, and let x^(s), s = 1, 2, ..., N, be the experimental points. In this paper we suggest applying, for the solution of sensitivity problems "in the large", a special nonparametric identification method, the so-called Local Approximation Method (LAM). For the corresponding estimate we will use the notation "the LA estimate".

We give some relevant references on sensitivity analysis and performance estimation for computer simulation models of DES. We assume that y(x) is unspecified, but y(x, w) may be specified or unspecified with respect to x. To estimate an unspecified performance measure we may always apply Monte Carlo methods and deal with estimates similar to (1). There exist many good references on these methods in the simulation literature (e.g. Fishman, 1978; Law and Kelton, 1982; Rubinstein, 1986). To estimate sensitivities in the specified y(x, w) case one may use score function (or likelihood ratio) methods (Reiman and Weiss, 1986; Glynn and Sanders, 1986; Cao, 1987; Rubinstein, 1986, 1989) and infinitesimal perturbation analysis (Gong and Ho, 1987; Ho, 1987; Ho, Cao and Cassandras, 1983). These methods give simultaneous estimates and in practice are especially effective when x are parameters of the c.d.f. F. Simultaneous estimates similar to (2) have been studied in detail in connection with smoothed functionals and stochastic approximation methods by Katkovnik (1976), Kreimer and Rubinstein (1988). LAM, also known as locally weighted regression or the moving least-squares method, was suggested and developed as a nonparametric approach in (Katkovnik, 1976, 1979, 1985; Stone, 1977; Cleveland, 1979). Well-known "kernel" estimators are a particular case of LAM, the so-called zero-order estimators.
The primary contribution of this paper is to apply and to justify LAM for sensitivity analysis problems. In contrast to the ordinary least-squares method (LSM), LAM uses parametric models with given coordinate functions only as local ones, which are authentic to the true y(x) only in some small neighborhood of every x. We indicate some important properties of LAM:
1. The LA estimates are nonparametric in the domain X.
2. To calculate the LA estimates for performances and sensitivities at any x ∈ X it is necessary to keep in memory the total set of simulated y_s, s = 1, ..., N.
3. The special choice of some parameters of LAM gives the possibility to extract the maximum information about y(x) from the data y_s.
4. LAM is much more complex than LSM. This complexity is reasonable only under the condition of a high cost of the simulation experiments.

The justification of LAM here is theoretical. We give asymptotic results which mean that for a large enough number of simulation experiments N the accuracy of the LA estimates is high enough. The rest of this paper is organized as follows. Section 2 deals with the linear LAM. Here there are the main formulas for the LA estimates, convergence with probability 1 results, and also convergence rate (optimal rate) results. Section 3 contains the new, nonlinear LAM; robust estimation problems are discussed there, together with convergence and convergence rate results.

2. THE LINEAR LAM FOR A PERFORMANCE AND SENSITIVITIES

Suppose we have drawn N independent simulation observations (y_s, x^(s)), s = 1, ..., N, where the vectors x^(s) ∈ X ⊂ Rⁿ are, generally speaking, different values of the variable parameters of the DES, and y_s = y(x^(s), w^(s)) are stochastic observations of the DES performance. Let the performance be a scalar real-valued regression y on x ∈ Rⁿ, y(x) = E_w{y(x, w)}. Then y(x, w) = y(x) + n(x, w), where E_w{n(x, w)} = 0, E_w{n(x, w)²} = σ_y²(x), and E_w{n(x, w) n(x, w′)} = 0 if w ≠ w′.

LAM is a procedure for fitting a regression surface y(x) to the scatterplot data (y_s, x^(s)), s = 1, ..., N, by a special multivariate smoothing algorithm. It provides an estimate ŷ_N(x) of the regression surface y(x) at any x in the n-dimensional space of the independent variables. The observations y_s forming the LA estimate ŷ_N(x) have different weights according to the distance of x^(s) from x: observations with x^(s) close to x have larger weights, and those with x^(s) further from x have smaller weights. To carry out the locally-weighted estimation we introduce a function ρ(x), x ∈ Rⁿ. We will call this function a locally-weight function or simply a local function.

Let α = (α₁, ..., α_n) denote an n-tuple of nonnegative integers and set |α| = α₁ + ... + α_n and α! = α₁! ... α_n!. For x ∈ Rⁿ set x^α = x₁^{α₁} ... x_n^{α_n}. Let D^α denote the differential operator

D^α = ∂^{|α|} / (∂x₁^{α₁} ... ∂x_n^{α_n}).

Define the LA estimates for the performance y(x) and the sensitivities D^α y(x) as follows (Katkovnik, 1985):

J_N(x, c) = (1/N) Σ_{s=1}^{N} ρ((x − x^(s))/β) [ y_s − Σ_{|α| ≤ m} c_α (x − x^(s))^α / α! ]²,   (3)

ĉ(x) = arg min_c J_N(x, c),   (4)

ŷ_N^(α)(x) = ĉ_α(x),  |α| ≤ m,   (5)

where ρ(x) is the local function, β > 0 is some local parameter, and ŷ_N^(α)(x) is the estimate of y^(α)(x) = D^α y(x). The sum over |α| ≤ m in (3) is the local polynomial multivariable approximation of the regression y(x). The parameters c_α and the estimates ŷ_N^(α)(x) are defined by the minimization of the square functional J_N. At α = 0, ŷ_N^(0)(x) = ŷ_N(x) = ĉ_0(x) is the estimate of the performance y(x). At |α| = 1 the ŷ_N^(α)(x) are estimates of the gradient of y(x); at |α| = 2, estimates of the Hessian; etc.

The local function ρ(x) is a bounded one satisfying some properties standard for "kernel" estimators: ρ(x) > 0, ρ(0) = max ρ(x), ρ(x) → 0 as ‖x‖ → ∞. Numerous examples of local functions are given in the literature on nonparametric regression estimation.

The local parameter β is very important for the accuracy and convergence properties of LAM. Sometimes it is called a scale factor or a bandwidth. The value of β assigns the "size" of the domain of correct polynomial approximation for the regression y(x) in a neighbourhood of x. At β → ∞ in (3), ρ(·) → const and LAM transforms into the ordinary LSM. In such a way LSM may be considered as a particular case of LAM. It means that a reasonable choice of the local parameter may give more precise estimates than LSM. To estimate the derivatives D^α y(x) we use the order of the local polynomial model m ≥ |α|. The value m defines the order of LAM. At m = 0 we obtain the zero-order estimator in the analytical form
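For n = 1 and order m = 1 the minimization in (3)-(4) has a closed form, and (5) returns estimates of y(x) and y′(x) together. A sketch, assuming a Gaussian local function ρ and toy data (both choices are illustrative, not prescribed by the paper):

```python
import math
import random

def local_linear_fit(x, xs, ys, beta):
    # Linear LAM (m = 1) at the point x: weighted least squares with
    # local weights rho((x - x_s)/beta), here a Gaussian local function.
    # The local model is y_s ~ c0 + c1*(x_s - x), so c0 estimates y(x)
    # and c1 estimates y'(x).
    S0 = S1 = S2 = T0 = T1 = 0.0
    for xs_i, ys_i in zip(xs, ys):
        t = xs_i - x                       # local coordinate
        w = math.exp(-0.5 * (t / beta) ** 2)
        S0 += w; S1 += w * t; S2 += w * t * t
        T0 += w * ys_i; T1 += w * t * ys_i
    det = S0 * S2 - S1 * S1
    return (S2 * T0 - S1 * T1) / det, (S0 * T1 - S1 * T0) / det

# Toy data: y(x) = x^2 on a scattered design, small noise.
rng = random.Random(4)
xs = [rng.random() for _ in range(400)]
ys = [x * x + rng.gauss(0.0, 0.05) for x in xs]
y_hat, dy_hat = local_linear_fit(0.5, xs, ys, beta=0.15)
# true values: y(0.5) = 0.25, y'(0.5) = 1.0
```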

ŷ_N(x) = Σ_{s=1}^{N} h_s(x, x^(s)) y_s,   (6)

h_s(x, x^(s)) = ρ((x − x^(s))/β) / Σ_{r=1}^{N} ρ((x − x^(r))/β).   (7)

This estimator is known as the Nadaraya-Watson kernel one.

The following properties are the LAM basis. The estimates (3)-(5) are linear with respect to the observations y_s and may always be written in the form (6). An increase of β enlarges the local approximation domain and the number of observation points x^(s) entering the estimation with high enough weights h_s(x, x^(s)). It follows that the systematic estimation error of ŷ_N(x) then increases, while the stochastic error decreases. It means that there is an optimal value of β. A similar trade-off occurs with respect to the order m: if the regression y(x) is smooth enough, i.e. it has some number of smooth enough derivatives, an increase of the order m decreases the systematic errors and increases the level of the stochastic errors. LAM has powerful potential capabilities for intensive averaging of the stochastic components of the observations and, simultaneously, for interpolation and extrapolation of their regular components. These capabilities define the potential effectiveness of LAM.

The following conditions are enough to ensure consistency and a simple evaluation of the convergence rate with probability 1.
(A1) The x^(s) are independent n-dimensional random vectors, identically distributed with a continuous probability density π(x) > 0 for all x ∈ X.
(A2) The domain X is bounded.
(A3) ρ has a bounded support, i.e. ρ(x) > 0 only for x ∈ B, where B is bounded, and ∫_B ρ(u) du = 1.
(A4) y(x) and the derivatives D^α y(x), |α| = m, are continuous, with

|D^α y(x) − D^α y(x − u)| ≤ L ‖u‖,  |α| = m.   (8)

Assumption (8) is necessary for convergence of the estimates in mean square and in probability. Assumption (9) is used to show pointwise convergence with probability 1. Assumption (10) is necessary for convergence with probability 1 uniformly in X. To estimate the convergence rate we will assume that the local function is multiplicative,

ρ(x) = Π_{i=1}^{n} ρ_i(x_i),   (11)

where the ρ_i(x_i) are scalar functions subject to condition (A3).
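The zero-order estimator (6)-(7) and the β trade-off can be illustrated numerically; the Gaussian local function, the toy regression y(x) = sin(2πx), and the evaluation grid are all assumptions of this sketch:

```python
import math
import random

def nadaraya_watson(x, xs, ys, beta):
    # Zero-order LAM (6)-(7): weights h_s proportional to
    # rho((x - x_s)/beta), normalized to sum to one.
    num = den = 0.0
    for xs_i, ys_i in zip(xs, ys):
        w = math.exp(-0.5 * ((x - xs_i) / beta) ** 2)
        num += w * ys_i
        den += w
    return num / den

# A tiny beta tracks the noise (large stochastic error), a huge beta
# over-smooths (large systematic error); an intermediate beta balances.
rng = random.Random(5)
xs = [rng.random() for _ in range(500)]
ys = [math.sin(2 * math.pi * x) + rng.gauss(0.0, 0.2) for x in xs]

def sup_error(beta):
    grid = [i / 50 for i in range(5, 46)]   # interior of X = [0, 1]
    return max(abs(nadaraya_watson(g, xs, ys, beta)
                   - math.sin(2 * math.pi * g)) for g in grid)

errors = {beta: sup_error(beta) for beta in (0.005, 0.05, 0.5)}
```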

Theorem 1. Let (A1)-(A6) and (10), (11) be satisfied. Then for the linear estimates (3)-(5) we obtain:

(i) convergence with probability 1, uniformly in X:

sup_{x ∈ X} |ŷ_N^(α)(x) − y^(α)(x)| → 0,  |α| ≤ m;   (12)

(ii) the order of the uniform convergence rate is given by

sup_{x ∈ X} |ŷ_N^(α)(x) − y^(α)(x)| = O( β^{m+1−|α|} + β^{−|α|} (log N / (N βⁿ))^{1/2} );   (13)

(iii) the optimal uniform convergence rate

sup_{x ∈ X} |ŷ_N^(α)(x) − y^(α)(x)| = O( (log N / N)^{(m+1−|α|)/(2m+2+n)} )   (14)

occurs at

β = β_N = O( (log N / N)^{1/(2m+2+n)} ).   (15)

The validity of Theorem 1 follows from Theorem 2 in (Ioffe, Katkovnik, 1990).

Assumption (8) is enough only for convergence in mean square. Then the optimal convergence rate is defined as

E{ (ŷ_N^(α)(x) − y^(α)(x))² } = O( N^{−2(m+1−|α|)/(2m+2+n)} ) → 0.   (17)

Assumption (9) is enough only for pointwise convergence with probability 1, and then we obtain the optimal convergence rate (see Theorem 1 in Ioffe, Katkovnik, 1990)

|ŷ_N^(α)(x) − y^(α)(x)| = O( (log log N / N)^{(m+1−|α|)/(2m+2+n)} ).   (18)

The double-logarithm law (18) for pointwise estimates was proved by Ioffe and Katkovnik, initially for kernel estimates of a probability density and its derivatives (1986) and then for the LA estimates of a regression (1990). The one-logarithm law (14) is a specific feature of the uniform convergence in a whole domain. The optimal convergence rate in mean square (17) coincides with the optimal convergence rate for the best nonparametric estimators obtained in (Stone, 1982).

3. THE NONLINEAR LAM FOR A PERFORMANCE AND SENSITIVITIES

In many cases there is essential a priori information about the performance and the sensitivities as functions of their arguments, and about the simulated random values y_s and their probability properties. This additional information gives possibilities to improve the precision of LAM, but as a rule it may be exploited only by a transformation of the linear LAM into a nonlinear one. We present here some nonlinear modifications of LAM and discuss their origins and possibilities briefly.

A. Bounded LAM. Let admissible values of the regression and its derivatives be known:

sup_{x ∈ X} |D^α y(x)| ≤ b_α,  |α| = 0, 1, ..., m.

For us it is important that then with probability 1 for large N

sup_{x ∈ X} |ŷ_N^(α)(x) − y^(α)(x)| ≤ ε_N,  |α| = 0, 1, ..., m,   (16)

where ε_N is a small value tending to zero as N → ∞. Define the bounded LAM by (3), (5), where the estimates are given by the constrained minimization

ĉ(x) = arg min { J_N(x, c) : |c_α| ≤ b_α, |α| = 0, 1, ..., m }.   (19)

The estimates ĉ_α are the decision of a constrained square optimization problem with a convex admissible domain of variables. In the general case the ĉ_α(x) are nonlinear with respect to the observed values y_s. It may be proved that Theorem 1 is valid in this case.

B. LAM with simultaneous scale estimation. The parameter σ_y(x) defines the probability scale of the observed random values y_s. Let the scale σ_y(x) be essentially variable in X. Then the precision of the LA estimates may be improved by a change of the residual weights in the minimized functional J_N(x, c) as follows:

J̄_N(x, c) = (1/N) Σ_{s=1}^{N} ρ̄((x − x^(s))/β) σ_y^{−2}(x^(s)) [ y_s − Σ_{|α| ≤ m} c_α (x − x^(s))^α / α! ]²,   (20)

where ρ̄(x) ≥ 0, ρ̄(x) → 0 at ‖x‖ → ∞ and ∫ ρ̄(u) du = 1. The LAM functional (20) may be justified by a stochastic local model of a regression (see Katkovnik, 1985), where ρ̄(x) → ∞ at ‖x‖ → 0; but in the general case this demand is not necessary. In general the functions ρ and ρ̄ are different. At σ_y(x) = σ = const the functionals (20) and (3) are equivalent if ρ(x) = ρ̄(x)/(1 + ρ̄(x) σ²).

As a rule the scale factor σ_y(x), as a function of x, is unknown; then we replace it by an appropriate statistical estimate and define the LA estimate by (3), (4) or (3), (19), where

σ̂_y²(x) = s_N²(x),   (21)

s_N²(x) = Σ_{s=1}^{N} h_s(x, x^(s)) [ y_s − ŷ_N(x^(s)) ]².   (22)

Here h_s is defined by (7). The estimate s_N² depends on the estimate ŷ_N. It means that the LA estimate in this case is the decision of the nonlinear algebraic equation systems (3), (4), (21), (22) or (3), (19), (21), (22). The natural recursive method for the decision of these systems may be grounded on a recursive procedure for s_N²(x): first we calculate the estimate ŷ_N(x) at a given s_N²(x), then correct s_N²(x) using this ŷ_N, and so on.

C. Robust LAM. Consider LAM differing from (3)-(5) by a nonquadratic residual function in J_N. Let

J_N(x, c) = (1/N) Σ_{s=1}^{N} ρ((x − x^(s))/β) F( y_s − Σ_{|α| ≤ m} c_α (x − x^(s))^α / α! ),   (23)

where F is a residual function. The robust nonlinear LA estimates are defined formally as (3), (4), (23). The zero-order LAM (m = 0) is known as the nonparametric M-estimate.

The consistency and the convergence rate of the nonlinear robust estimates (3), (4), (23) are given by a theorem completely similar to Theorem 1. Let the following additional conditions occur.
(B1) The function F: R¹ → R¹ is continuous and twice differentiable.
(B2) The equation μ_F(r, x) = 0, where

μ_F(r, x) = ∫_{R¹} F′(v + r) f(v|x) dv,   (24)

has the unique decision r = 0.
(B3) ∂μ_F(0, x)/∂r ≠ 0, x ∈ X.
(B4) ∫_{R¹} [F′(v)]² f(v|x) dv < ∞.
Here f(v|x) is the conditional probability density of n(x, w) at fixed x.

Theorem 2. Let all the conditions of Theorem 1 and the conditions (B1)-(B4) be satisfied. Then the nonlinear robust estimates (3), (4), (23) converge with probability 1 and the convergence rates are defined by the formulas (13), (14). The validity of the theorem follows from Theorem 3 in (Ioffe, Katkovnik, 1990).

The special choice of the residual function F has as its main aim to decrease the weight in the estimate of "bad" simulation data, or even to eliminate them completely. Bad data are "unexplained" ones, differing from "normal" ones. Robust estimates reduce the estimate sensitivity, for example, to rare high-amplitude errors or breakdowns of simulation experiments. The parametric regression robust M-estimates have been suggested and justified by Huber (1981), Poljak and Tsypkin (1980). The nonparametric regression zero-order M-estimates have been suggested by Brillinger (see the discussion to Stone, 1977) and investigated in (Tsybakov, 1983; Härdle, 1984). The nonparametric robust LAM similar to (23) have been suggested by Cleveland (1979), Katkovnik (1979). It has been proved (Katkovnik, 1985) that the asymptotically optimal residual function F coincides with that for the parametric M-estimates. It means that the choice of F in (23) may be carried out on the basis of well-known results of robust estimation theory. For example, we indicate some robust residual functions:
(i) F(u) = |u|;
(ii) F(u) = u²/2 at |u| ≤ δ and F(u) = δ|u| − δ²/2 at |u| > δ;
(iii) F(u) = 0 at |u| > δ,
where δ > 0 is some parameter.
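A sketch of the zero-order robust LAM (m = 0) with Huber's residual function (ii), solved by iterative reweighting; the reweighting scheme, the value of δ, and the contamination model are illustrative assumptions, not the paper's prescriptions:

```python
import math
import random

def robust_local_estimate(x, xs, ys, beta, delta=0.5, iters=20):
    # Zero-order robust LAM: minimize the functional (23) with Huber's
    # residual function F by iteratively reweighted local averaging.
    # Each residual r gets the extra weight F'(r)/r = min(1, delta/|r|).
    w = [math.exp(-0.5 * ((x - xi) / beta) ** 2) for xi in xs]
    c = sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)  # start: estimate (6)
    for _ in range(iters):
        v = [wi * min(1.0, delta / max(abs(yi - c), 1e-12))
             for wi, yi in zip(w, ys)]
        c = sum(vi * yi for vi, yi in zip(v, ys)) / sum(v)
    return c

# y(x) = x with 10% gross outliers imitating "bad" simulation runs.
rng = random.Random(6)
xs = [rng.random() for _ in range(300)]
ys = [x + rng.gauss(0.0, 0.05) for x in xs]
for i in range(0, 300, 10):
    ys[i] += 10.0                      # breakdown-type errors

robust = robust_local_estimate(0.5, xs, ys, beta=0.1)
plain = (sum(math.exp(-0.5 * ((0.5 - xi) / 0.1) ** 2) * yi
             for xi, yi in zip(xs, ys))
         / sum(math.exp(-0.5 * ((0.5 - xi) / 0.1) ** 2) for xi in xs))
# robust stays near y(0.5) = 0.5; the quadratic estimate is pulled upward
```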
4. CONCLUSIONS

Employing local polynomial approximation of a performance measure, we have suggested new linear and nonlinear methods (in particular, robust methods) to reconstruct a performance measure and its sensitivities "in the large", in a whole domain. The theorems give sufficient conditions for the uniform convergence and the appropriate convergence rates with probability 1. The uniform convergence is a very important property for different complex optimization and analysis problems when a performance measure is considered as an unspecified regression. The paper contains a theoretical basis for these applications.

REFERENCES

Cao, X.R. (1987). Sensitivity estimates based on one realization of a stochastic system. J. Stat. Comput., 27, 211-232.
Cleveland, W.S. (1979). Robust locally weighted regression and smoothing scatterplots. J. of Am. Stat. Ass., 74, 829-836.
Fishman, G.S. (1978). Principles of discrete event digital simulation. John Wiley & Sons, N.Y.
Gong, W.B., and Y.C. Ho (1987). Smoothed perturbation analysis of discrete event dynamic systems. IEEE Trans. Automat. Contr., 32, 858-866.
Härdle, W. (1984). Robust regression function estimation. J. Multivariate Anal., 14, 169-180.
Ho, Y.C., Cao, X., and G.M. Cassandras (1983). Infinitesimal and finite perturbation analysis for queueing networks. Automatica, 19, 439-445.
Ho, Y.C. (1987). Performance evaluation and perturbation analysis of discrete event dynamic systems. IEEE Trans. Automat. Contr., 32, 563-572.
Huber, P. (1981). Robust statistics. Wiley, N.Y.
Ioffe, M.O., and V.Y. Katkovnik (1986). Necessary and sufficient conditions of almost sure convergence of kernel estimates of probability density and its derivatives. Autom. Remote Contr., 47, 1632-1641.
Ioffe, M.O., and V.Y. Katkovnik (1990). The rate of pointwise and uniform convergence almost surely of the regression and its derivatives. Autom. Remote Contr.
Katkovnik, V.Y. (1976). Linear estimation and stochastic optimization problems. Nauka, Moscow (in Russian).
Katkovnik, V.Y. (1979). Linear and nonlinear methods of nonparametric regression analysis. Soviet J. of Autom. & Inform. Sci. (transl. of "Avtomatika"), No. 5.
Katkovnik, V.Y. (1985). Nonparametric identification and smoothing of data (Local approximation methods). Nauka, Moscow (in Russian).
Kreimer, J., and R.Y. Rubinstein (1988). Smoothed functionals and constrained stochastic approximation. SIAM J. Numer. Anal., 25, 470-487.
Law, A.M., and W.D. Kelton (1982). Simulation modeling and analysis. McGraw-Hill, N.Y.
Poljak, B.T., and Y.Z. Tsypkin (1980). Robust identification. Automatica, 16, 53-63.
Reiman, M.I., and A. Weiss (1986). Sensitivity analysis for simulation via likelihood ratios. In Proc. Winter Simulation Conf., 285-289.
Rubinstein, R.Y. (1986). Monte-Carlo optimization, simulation and sensitivity of queueing networks. John Wiley & Sons, N.Y.
Rubinstein, R.Y. (1989). Sensitivity analysis and performance extrapolation for computer simulation models. Operations Research, 37, 72-81.
Stone, C.J. (1982). Optimal global rates of convergence for nonparametric regression. Annals of Statistics, 10, 1040-1053.
Tsybakov, A.B. (1983). Convergence of nonparametric robust algorithms of reconstruction of functions. Autom. Remote Control, 44, 1582-1591.
