Comput. & Ops. Res., Vol. 1, pp. 283-311. Pergamon Press, 1974. Printed in Great Britain

APPLICATION OF VARIANCE REDUCTION TO LARGE SCALE SIMULATION PROBLEMS

E. J. MCGRATH* and D. C. IRVING†

Science Applications Inc., La Jolla, Calif., U.S.A.
Scope and purpose-Monte Carlo simulations are commonly used in a wide variety of stochastic problems which are too complex to admit of solution in closed form. The essence of the method is that the probabilities of different possible outcomes are estimated from the result of a finite number of replications or histories. Each history is developed by following a series of events and, at each point where the outcome is probabilistic, selecting one of the possible outcomes in accordance with a pre-specified probability for that particular branching of events. Only this selected branch is followed until the next probabilistic point is reached, when the process is repeated. The principal difficulty with Monte Carlo techniques generally is that in order to achieve accurate results in terms of probabilities of ultimate outcomes, a large number of replications must be made and computer running times are therefore long. Variance reduction techniques, which greatly improve the efficiency and reduce the running time of Monte Carlo programs by reducing the number of replications necessary to achieve a specified statistical accuracy in the result, have been extensively developed and heavily utilized in the radiation transport field. This paper reviews some of these techniques and discusses their implementation in other areas of Monte Carlo simulation. A military application is used as a demonstration of the effectiveness of variance reduction techniques.
Abstract-Variance reduction techniques are classified and described generally. A comprehensive table of variance reduction techniques and their characteristics is presented. Considerations for their use, particularly for taking advantage of available a priori knowledge, are described. The final results of a demonstration study, using the antisubmarine warfare air engagement model APAIR, are presented. The more significant results are described in detail. The conclusion is reached that variance reduction (and, hence, computing time reduction) can be profitably used in APAIR, with reductions of a factor of 10 not too difficult to reach.
INTRODUCTION
Monte Carlo simulation is one of the most powerful and commonly used techniques for analyzing complex physical problems. Applications can be found in many diverse areas such as radiation transport, river basin modeling, antisubmarine warfare exercises and operations. The initial development of the Monte Carlo method occurred during the Manhattan Project which developed the atomic bomb during World War II. Since that time, the major application of the technique has continued to be in radiation transport.
* Dr. McGrath is Assistant to the President of Science Applications, Inc. in La Jolla, California. He has published two graduate textbooks in Operations Analysis and Optimization Techniques and several papers in the professional literature. He has worked in the areas of environmental protection and traffic, among others. He holds a B.S. in Engineering Physics from the University of Alberta, and the M.S. and Ph.D. from the University of Washington in Engineering and Mathematics.
† David Irving is a Computer Scientist for the Systems Division of Science Applications, Inc., La Jolla, California. He has extensive experience in Monte Carlo programs for radiation transport and treatment of complex geometries. He holds the A.B. from the University of Pennsylvania and the M.S. from the University of Tennessee in Physics.
The requirements of computer size and computational time, as well as greater familiarity with simpler deterministic methods, have served as barriers to wider usage of Monte Carlo simulations. In recent years, however, the range of applications has been broadening and the size, complexity, and computational effort required have been increasing. This has been occasioned by the development of faster and larger computers and a desire for the increased realism that is concomitant with more complex and extensive problem descriptions. As a result of such trends, the requirements for improved simulation techniques are becoming more pressing. Monte Carlo solutions frequently involve extensive amounts of computer time, although techniques exist for reducing the effort required. Unfortunately, these methods for achieving greater efficiency are frequently overlooked in developing simulations. This can generally be attributed to one or more of the following reasons:

Analysts usually seek advanced computer systems to exploit the increased speed and/or storage capabilities for performing more realistic and detailed simulation studies. Reduction of the considerably increased expense which is incurred is a secondary aim that is overshadowed by the primary goal of greater problem realism.

Many efficient simulation methods have been evolved for specialized applications. Some of the most impressive Monte Carlo techniques have been developed in radiation transport, a discipline that does not overlap into areas where even a small number of simulation analysts are working.

Many known techniques are not developed to the point where they can be easily understood or applied by even a small fraction of the analysts who are performing simulation studies or developing simulation models.

Comprehensive references describing efficient methodologies for improving Monte Carlo simulation have not been available.
A useful feature of Monte Carlo simulation is that the analyst has the flexibility to dictate his simulation conditions and sampling plans to a much greater extent than does an experimenter in a real world environment. This extra latitude provides an excellent opportunity for optimal design of simulations to obtain estimates with minimal sampling size. This will effectively reduce the time and effort involved in computation as the number of trials necessary to achieve a given accuracy is thereby reduced. The procedures which are available in the design of a Monte Carlo simulation for minimizing the required sample size are generally called variance reduction techniques. In view of the large number of situations where simulation results can be substantially improved, it is fair to say that no simulation problem has been justly treated until the possibility of applying variance reduction techniques has been seriously considered. It is difficult to provide a complete perspective on variance reduction techniques. This is primarily due to the fact that there are an infinite number of ways Monte Carlo simulation can be accomplished for a given problem and each could conceivably be used to calculate the simulation objective although with greatly different efficiencies. However, it appears fair to say that the approach to improving simulation efficiency was not seriously considered until the work on the atomic bomb during the Second World War. This work initially involved the use of “straightforward” Monte Carlo simulation for nuclear particle transport, but early in these investigations Von Neumann and Ulam applied certain variance reduction
techniques. The subsequent application and development of variance reduction techniques has been almost exclusively carried out within the radiation transport community. Several general references to Monte Carlo calculation exist [1-3], but most of the literature, particularly where variance reduction is involved, pertains to the nuclear applications [4-6]. This has resulted in limited application in other areas where Monte Carlo simulation is used. The purpose of this paper is to stimulate greater usage of variance reduction techniques in Monte Carlo simulations by presenting simplified descriptions of a few techniques and by demonstrating that these techniques can be applied successfully to a field other than radiation transport. This paper is an outgrowth of a study performed for the Office of Naval Research in which much more comprehensive and detailed reports were prepared on several aspects of improved simulation [7-9]. In addition, a fourth report [10] presents in greater detail the results of the demonstration cases.

VARIANCE REDUCTION
As the name implies, variance reduction is concerned with increasing the accuracy of Monte Carlo estimates of parameters. A simulation using one or more reduction techniques can be contrasted with what may be considered the crude (sometimes called direct or straightforward) Monte Carlo approach, where an attempt is made to create true-to-life or actual models of the process. In crude sampling, flows through the model and sampling probability distributions are chosen to reflect the real situation as exactly as possible. On the other hand, variance reduction techniques attempt to increase the effectiveness of the Monte Carlo method by:

Modifying the simulation or sampling procedures

Utilizing approximate or analytical information

Studying the system within a different context or abstract representation

Based on these approaches a general classification of several known variance reduction schemes is presented in Table 1. Many of the techniques presented in Table 1 are related and it is difficult to arrive at a completely distinct classification. However, the manner in which they are presented here is useful for subsequent discussions. Modifying the sampling process is usually achieved by using more effective sampling techniques or by altering the sampling distributions. As an example, consider the problem of estimating the probability of an early failure in a piece of electronic equipment, and suppose that the failure distribution for this equipment is exponential with a very long mean time between failures (MTBF). In a crude Monte Carlo evaluation the ratio of the number of early failures to the total number of simulated failures is very small. In order to generate confidence in an estimate for the probability of early failure, one must simulate a very large number of failures. The number of simulated events required can be substantially reduced, however, if the failure distribution in the simulation is suitably modified. In particular, if an exponential distribution with a short MTBF is substituted for the actual failure distribution, more early failures will be observed, and thus a more accurate answer can be derived with less simulation effort. This procedure is referred to as importance sampling. Of course, the modifications introduced in the sampling distribution must be accounted for when determining the desired estimate since the failure processes (actual and modified) are not the same.
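As a minimal sketch of this idea (the MTBF, the shortened MTBF and the early-failure threshold below are hypothetical values chosen only for illustration), each observation is drawn from the modified short-MTBF distribution and then weighted by the ratio of the true density to the modified density, so that the altered sampling is accounted for and the estimate remains unbiased:

```python
import math
import random

def p_early_failure(n_histories, mtbf=1000.0, biased_mtbf=50.0, threshold=10.0):
    """Importance sampling sketch: estimate P(T < threshold) for an exponential
    lifetime with a long MTBF by sampling from a short-MTBF exponential and
    reweighting each observation with the likelihood ratio."""
    total = 0.0
    for _ in range(n_histories):
        t = random.expovariate(1.0 / biased_mtbf)      # sample from the modified distribution
        # likelihood ratio f_true(t) / f_biased(t) corrects for the altered sampling
        weight = (biased_mtbf / mtbf) * math.exp(t / biased_mtbf - t / mtbf)
        if t < threshold:
            total += weight
    return total / n_histories
```

With the numbers above, almost one biased sample in five falls below the threshold, whereas only about one crude sample in a hundred would; this is where the reduction in the required number of histories comes from.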
Table 1. Classification of variance reduction techniques

Modification of the sampling process
    Importance Sampling
    Russian Roulette and Splitting
    Systematic Sampling
    Stratified Sampling

Use of analytical equivalence
    Expected Values
    Statistical Estimation
    Correlated Sampling
    History Reanalysis
    Control Variates
    Antithetic Variates
    Regression

Specialized techniques
    Sequential Sampling
    Adjoint Formulation
    Transformations
    Orthonormal Functions
    Conditional Monte Carlo
The above example, simulating events of very low probability, illustrates one area where variance reduction techniques are always beneficial, if not an absolute necessity. If the occurrence of an event in a process is on the order of one in a thousand, then one would expect an event to occur only once in every thousand direct simulations of the process. Since the accuracy in measuring an event is related to the number of times it occurs, the crude simulation has to be run many thousands of times before much accuracy is achieved. The common variance reduction procedure in these cases involves altering the simulation in a known way so that the rare events can be observed more frequently. Other forms of variance reduction are based on the fact that analytic procedures are usually preferable to simulation. Thus, reverting to simulation implies the problem does not have a readily available analytic solution. However, in many cases segments of the process may be amenable to determining a closed form solution. In other cases, the overall process or segments of the process may be closely correlated to a simpler, approximate process with known analytic solutions. In both situations substantial improvement can be realized by taking advantage of this knowledge. This class of techniques is described by the term “use of analytical equivalence.” As a simple example of the use of analytical equivalence, consider again a piece of electronic equipment. Suppose this time, however, that the failure distribution of the equipment is not exponential, but assume that the exponential distribution may serve as a first approximation to it. The control variate approach to variance reduction would involve investigating the failure properties of this equipment by taking advantage of this knowledge and simulating the difference between the actual and the approximate exponential failure rate instead of simulating the actual process. The properties of the actual process can then be inferred using the analytic properties of the exponential distribution and the results from the simulation on the difference between the actual and exponential distributions. In addition to sampling modification and analytical equivalence, there are certain specialized techniques that can be used to achieve variance reduction. These procedures
may include the application of one or more of the above techniques in their implementation. One powerful procedure is called sequential Monte Carlo. In order to effectively employ variance reduction in a simulation, some knowledge about the process and the answers to be generated must exist. One way to gain this information is through a direct simulation of the process. Results from this simulation can then be used to define variance reduction techniques which will refine and improve the efficiency of a second simulation. In complex problems, several iterations may be called for. Another procedure which often proves valuable in developing variance reduction procedures is to consider the process from various viewpoints. In many flow processes, for example, hints for effective importance functions can be gained by considering the process in reverse or looking at the mathematical adjoint of the problem under study. However, as with many of the specialized techniques described in Table 1, the adjoint approach is not adequately developed for general application.
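Returning to the control variate example of the electronic equipment above, the following sketch (with an assumed Weibull failure law standing in for the unspecified non-exponential distribution; none of the numbers come from the study) estimates a mean lifetime by simulating only the difference between the actual process and an exponential approximation whose mean is known analytically:

```python
import math
import random

def mean_lifetime_control_variate(n_histories, shape=1.5, scale=800.0):
    """Control variate sketch: estimate the mean of a Weibull lifetime (the
    'actual' process) using an exponential approximation whose mean is known."""
    exp_mean = scale                      # analytic mean of the exponential control variate
    diff_sum = 0.0
    for _ in range(n_histories):
        u = random.random()
        t_actual = scale * (-math.log(1.0 - u)) ** (1.0 / shape)  # Weibull lifetime
        t_approx = scale * (-math.log(1.0 - u))                   # exponential lifetime from the same u
        diff_sum += t_actual - t_approx   # simulate only the difference between the two processes
    return exp_mean + diff_sum / n_histories
```

Because both lifetimes are generated from the same random number, they are strongly positively correlated and their difference typically has a much smaller variance than the actual lifetime itself, which is the source of the improvement.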
SELECTION OF VARIANCE REDUCTION TECHNIQUES
As the discussion above suggests, variance reduction can be viewed as a means to use known, usually qualitative, information about the process in an explicit and quantitative manner. In fact, if nothing is known about the process to be simulated, variance reduction cannot be directly achieved. (However, sequential sampling may be used to generate the required knowledge.) The other extreme from no knowledge is complete knowledge, and in this case a zero variance simulation can be devised. Put very simply, variance reduction techniques cannot give the user something for nothing; they merely provide a way of not wasting information. Therefore, the more that is known about the problem, the more effective variance reduction can be and the more powerful are the techniques that can be employed. Hence, it is always important to define, as completely as possible, what is known about a problem. Knowledge of a process to be simulated can be qualitative and/or quantitative. Both are useful. It is important to use all the information available, and in fact it may be useful to do limited crude simulations of the process to gain some knowledge, especially if a little data might lead to extensive insight. Selection of a variance reduction technique (or techniques) for a particular simulation is thus peculiar to that simulation, and general procedures are difficult to establish. However, the mental exercise and the initial groundwork that must be established in order to select or evaluate the usefulness of applying these techniques is almost always worth the effort. Searching for a technique forces the simulation designer into asking the basic questions of: (1) "What answers are to be generated from the simulation?" and (2) "What is known about the behavior of the process?" Problem definition is thus of paramount importance. Before considering variance reduction techniques it is important to characterize aspects of the problem which might indicate which technique might be fruitfully applied. To evaluate the usefulness of these methods for a particular problem it is necessary to:

List all of the parameters to be estimated from the simulation.

Determine all the available knowledge on the internal workings of the process to be simulated.
Table 2. Recommended problem information to be defined prior to selecting and implementing variance reduction techniques

1. Define the nature of the problem relative to
   expected values (means, variances, probabilities, etc.) to be estimated
   sensitivities, perturbations or variations of parameters of interest
   possible mathematical formulations (e.g., integral equations, expected values, etc.)
   any sequential characteristics such as independent paths, outcomes dependent on intermediate steps, etc.
   input conditions which are random variables to be sampled.

2. Identify portions of the problem or parameters to be estimated that can be
   expressed in an analytical form such as single or multidimensional integrals, differential and/or integral equations
   solved analytically, such as expected values, variances, probabilities, etc.
   represented by approximate, simplified positively correlated analytical expressions
   represented by approximate, simplified negatively correlated analytical representations
   established as relatively not important to final outcomes compared to other aspects of the problem.

3. Identify variables in the problem which
   are very important to the expected outcome
   are not expected to significantly impact the results over their range of variation
   have relatively little effect on the problem
   are strongly correlated with other variables.

4. Locate final events or outcomes of the problem which
   have very small probabilities
   have very large probabilities
   have outcomes relatively insensitive to problem parameters
   have known probabilities of occurrence from intermediate stages in the problem
   are linear combinations of other events or random variables
   have known correlation with other events or outcomes.
Unless one is very familiar with the concepts of variance reduction, the selection of a promising approach for a particular problem can cause considerable difficulty due to the large number of possibilities available. The usefulness of variance reduction techniques is ultimately determined by how effectively known information about the problem is utilized. As a guide to organizing the known material about a problem, Table 2 presents a list of problem information to be determined prior to selecting a variance reduction technique. Such information is not strictly required as certain approaches (such as sequential sampling) can generate useful information, but this is not generally accomplished without cost. Thus, prior information is highly desirable. The more that is known, the better the ultimate results will be. Consider item 1 in Table 2. Here it is required to clearly define which parameters are to be estimated. This could include mean values, variances or probabilities. Furthermore, if it is known that the problem is such that sensitivity or perturbation studies are to be performed, it is important that this be recognized at the outset. Additional information of significance includes the sequential nature of a problem, as well as identifying any input conditions that are random variables. Under the second item in Table 2, the significance of integral formulations for parameters to be estimated is pointed out. The importance of integral formulations cannot be overemphasized since it is this basic integral structure which is used to understand almost every variance reduction method. In addition to the analytical forms, the knowledge of various expected values in the problem and the availability of simplified analytical expressions which are positively or negatively correlated with the parameters whose expected values
are being estimated can provide key information as to the variance reduction approach to be finally implemented. Next, identifying intermediate events or parameters which assume significance relative to their importance, unimportance or insensitivity to the problem outcomes can provide valuable information. A key ingredient for improving the efficiency of multistage simulations is identification of variables or outcomes in the problem that will probably lead to either important or unimportant outcomes of the final events. Finally, identifying those final outcomes which occur with large or small probabilities, are insensitive to problem parameters, have correlation with other events, or the final events which have known probabilities of occurrence from intermediate stages will also prove to be very useful in effectively reducing the variance. It should also be recognized that, in general, variance reduction techniques are aimed at reducing the variance of only one parameter or aspect of the process being simulated. Using variance reduction techniques designed for one parameter will usually reduce the effectiveness of the simulation to estimate other parameters. It is very important, therefore, to determine all of the results which will be desired from the simulation before searching for a technique to apply to a given situation. If several quantities are to be estimated by the simulation, the selection of a variance reduction technique has to be considered from the standpoint of all of these parameters. In many circumstances it may be beneficial (or even necessary) to implement a different variance reduction technique for each parameter. This might be accomplished in the extreme case by developing a different simulator for each parameter of interest. A comprehensive summary of variance reduction techniques is shown in Table 3. Here, each alternative is described briefly along with the suggested criteria for application. In addition, advantages, disadvantages and typical applications are noted. As will be seen, many of these techniques are interrelated, although their method of application may differ substantially. The second part of this paper will present a description of a few of these techniques as well as the results of a demonstration application of them to a Navy antisubmarine simulator. A complete description of all of the variance reduction techniques would, however, be too lengthy for this paper, and there are quite a number of references which can be consulted for more details. The report [9], produced by the study which also led to this paper, is an attempt at providing a comprehensive reference with a detailed description of variance reduction techniques and a methodology for their application and includes an extended bibliography of references.
THE EFFICIENCY OF VARIANCE REDUCTION
Before embarking on a description of a demonstration application of variance reduction, we must introduce the concept of efficiency to compare two different Monte Carlo estimators for the same parameter. In any Monte Carlo program the goal is to estimate some parameter $I$. Let the variance associated with a single observation be $\sigma^2$ and let $\hat{I}$ be the estimate for $I$ produced by $N$ observations. Then it is known that with high probability $\alpha$ the estimate $\hat{I}$ will fall between $I - k\sigma/\sqrt{N}$ and $I + k\sigma/\sqrt{N}$, where $k$ is some constant depending on $\alpha$. Thus for a fixed $\alpha$, the convergence of the estimate is related to the number of histories, $N$, and to the variance, $\sigma^2$.
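As a small numerical illustration of this convergence behaviour (a generic sketch, not part of the original study), the half-width $k\sigma/\sqrt{N}$ can be computed directly from the observed histories; note that halving it requires four times as many histories:

```python
import math
import random
import statistics

def crude_estimate_with_interval(samples, k=1.96):
    """Crude Monte Carlo estimate of a mean together with the k*sigma/sqrt(N)
    half-width described above (k = 1.96 corresponds to roughly 95% confidence)."""
    n = len(samples)
    mean = statistics.fmean(samples)
    sigma = statistics.stdev(samples)      # per-observation standard deviation
    half_width = k * sigma / math.sqrt(n)  # shrinks only as 1/sqrt(N)
    return mean, half_width

# Example: estimating the mean of an exponential lifetime with a hypothetical MTBF.
observations = [random.expovariate(1.0 / 1000.0) for _ in range(10_000)]
print(crude_estimate_with_interval(observations))
```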
Table 3. Summary of variance reduction techniques and characteristics

Importance sampling
  Description of technique: Structuring the sampling such that the more important regions are sampled more frequently.
  Suggested criteria for application: Certain events are known to be important. Parameters to be estimated are based on events with very low probability of occurrence. Certain events are not of interest. Regions of importance can be defined at various stages in the simulation.
  Advantages(1): Can give great improvement with limited effort. Well developed procedures. Can often be easily implemented. May be the only way to get a reasonable answer.
  Disadvantages: Can give worse results if not properly applied. May require considerable a priori knowledge. May require additional computation time to implement and to correct for biased sampling.
  Comments on typical applications: Most commonly used method. Applications in PERT, reliability, fault tree analysis, queueing and radiation transport. Applicable to almost any problem having the criteria indicated.

Russian Roulette and splitting
  Description of technique: Use of probabilities to limit or kill samples in an uninteresting portion of the simulation (Russian Roulette) and to increase the number of samples in interesting regions (splitting).
  Suggested criteria for application: Process is sequential in nature. Low probability events are involved. Importance of all outcomes is qualitatively known. Often useful where importance sampling and/or expected values are used.
  Advantages(1): Very little computation. Convenient way to accomplish crude importance sampling.
  Disadvantages: Requires a priori knowledge of the importance of intermediate steps. There may be counteracting pitfalls if used with importance sampling. May not be effective if used alone. Splitting may be difficult to implement.
  Comments on typical applications: Classical application has been in particle transport. Can be used in search problems, network analysis, traffic flow, random walk, etc. when low probability events or unimportant events are involved.

Systematic sampling
  Description of technique: Sampling is accomplished by selecting random numbers and systematically distributing them over the sample space.
  Suggested criteria for application: Random variables are to be generated at the beginning of the problem (source or initial conditions). Relative importance of ranges of random variables being sampled is unknown.
  Advantages(1): Easy to implement. Little additional computation required. No risk in application. Always get variance reduction.
  Disadvantages: Often gives marginal improvement. Effective only in a limited number of situations.
  Comments on typical applications: Application to many problems beyond direct integral evaluation has been limited. Could be useful in any problem where random inputs are of interest. Examples are in queueing, reliability, etc.

Stratified or quota sampling
  Description of technique: Sampling is distributed over the sample space by dividing the distribution into sections and sampling from each section. More important areas of sampling are emphasized.
  Suggested criteria for application: Random variables are to be generated at the beginning of the problem (source or initial conditions). Certain ranges of the variables are more important than others.
  Advantages(1): Easy to implement. Can give great improvement. Gives better results than systematic sampling if properly applied. Can be optimally applied.
  Disadvantages: Requires a priori knowledge of importance. Can lead to worse results if not properly applied. Sometimes difficult to define sampling ranges.
  Comments on typical applications: Known applications are primarily in queueing. Any problem with initial random conditions should find this technique useful. Includes reliability, network analysis, etc.

Expected value
  Description of technique: Include known expected values in the process to replace stochastic portions of a simulation.
  Suggested criteria for application: Analytical results, probabilities, or expected values of portions of the problem are known. Stochastic nature (i.e., higher moments) of the portion whose expected value is known is not essential to the overall simulation.
  Advantages(1): Almost always gives improvements. Trivial to implement in most useful cases.
  Disadvantages: May require prohibitive analysis/computation.
  Comments on typical applications: Primary application has been in radiation transport. Potential exists for use on almost all simulations.

Statistical estimation
  Description of technique: Include known expected value or probability in the estimator but not in the simulation model itself.
  Suggested criteria for application: Analytic results, probabilities or expected values of portions of the problem are known. Stochastic nature is essential to the overall simulation. Process is sequential in nature. Probability of final outcomes is known, but small, at all intermediate stages.
  Advantages(1): Always gives improvement. Can get great improvement. Often the only way to get an answer.
  Disadvantages: May require prohibitive analysis or computation.
  Comments on typical applications: Has been applied to PERT and radiation transport. Its use in other areas has been limited. However, could be useful in almost any area where known results are available for parts of the problem.

Correlated sampling
  Description of technique: Some or part of the random numbers are used in successive trials to provide correlation among the outcomes in successive trials.
  Suggested criteria for application: Sensitivity studies are of interest. Effect of parameter variations is to be determined. Small perturbations are of interest.
  Advantages(1): Easy to comprehend. Applicable to many practical situations. Little risk involved in application. Often the only way to detect small effects.
  Disadvantages: May be difficult to implement. Sensitive to specific applications.
  Comments on typical applications: Can be used in almost any problem where sensitivities are of interest. Examples are network design, tactics evaluation, reliability, queueing system design, etc.

History reanalysis
  Description of technique: Takes the results of one simulation and reinterprets them as the results of a second simulation, weighting the answers to remove bias.
  Suggested criteria for application: Results are needed for two or more similar problems. Difference in problems may be written as a difference in probability distributions.
  Advantages(1): Results are highly correlated, thus reducing the variance in the difference. Saves computation time for the second simulation.
  Disadvantages: Difference between problems may be more than a difference in probability distributions. Variance of the second problem may be prohibitively high in history reanalysis.
  Comments on typical applications: Potentially many applications but, so far, mainly used in radiation transport.

Control variates
  Description of technique: Using simplified or approximate representations for undetermined parts in the simulation. These representations are positively correlated in the sense that they follow the trend of the true expressions.
  Suggested criteria for application: Portions of the problem can be approximated with an analytical representation. A simple low variance approximate simulation is known.
  Advantages(1): Simple calculations usually required. Provides considerable flexibility. Can give great improvement.
  Disadvantages: Requires some understanding of the process. A simple analytic approximation may not exist.
  Comments on typical applications: Usefulness for queueing problems has been demonstrated. Its potential is great for use in almost all areas where some approximate representation of the random variable whose expected value is to be determined is known.

Antithetic variates
  Description of technique: Using simplified or approximate expressions for undetermined parts of the problem except that an inverse or negative correlation is maintained, in the sense that they follow the opposite trend of the true expressions.
  Suggested criteria for application: Analytical or approximate representations are known and correlate in a negative or inverse fashion with the problem variables.
  Advantages(1): Simple calculations usually required. Provides considerable flexibility. Can give great improvement.
  Disadvantages: Requires some understanding of the process. Usually easier to use than control variates.
  Comments on typical applications: Has been applied to queueing problems to improve efficiency. Could be used in a variety of operations research problems in reliability, queueing, sensitivity analysis, etc.

Regression
  Description of technique: Used with linear combinations of parameters being estimated in a simulation model. An approximate minimum variance estimate is included which exploits any inherent correlation among the parameters to reduce the variance.
  Suggested criteria for application: Estimated parameters are linear combinations of random variables. Correlation among the variables comprising the estimator is known to exist.
  Advantages(1): Little, if any, bias results. Exploits correlation. Little to lose if there is no correlation. Can be applied with little or no information about the problem.
  Disadvantages: Increased computation required. Limited to linear combinations of random variables in its usual form.
  Comments on typical applications: Should be applicable in almost any problem (reliability, queueing, etc.) where the outcome is a linear combination of statistical estimators.

Sequential sampling
  Description of technique: Development of estimates for parameters to be used in other variance reduction techniques by using information generated from previous sampling.
  Suggested criteria for application: Little or no information is available about the process. Other variance reduction techniques (e.g., importance sampling, control variates, etc.) are to be implemented.
  Advantages(1): Can almost always be applied. Requires little or no prior knowledge. Can be used in conjunction with other techniques.
  Disadvantages: May require considerable calculations initially. Efficiency may be very low. May not be useful in certain problems (such as low probabilities). Not well developed.
  Comments on typical applications: Evaluation of multiple integrals, estimation of distribution parameters, and has found use in lifetime studies.

Orthonormal functions
  Description of technique: The orthogonal characteristics of certain orthonormal functions are exploited in formulating a sampling scheme with reduced variance.
  Suggested criteria for application: Many-dimensional variables in the problem. Multiple integral formulations can be determined.
  Advantages(1): Offers great possibilities for problems with multidimensional integrals.
  Disadvantages: Not well developed for general applications. Generally difficult to implement efficiently. May require considerable calculations. Requires construction of orthonormal functions. Involves sampling from joint random variables.
  Comments on typical applications: Useful when considering multiple integrals. Has been applied in nuclear physics. However, it may be useful in complex systems problems where a multiple integral formulation is possible.

Conditional Monte Carlo
  Description of technique: Embeds the problem in a larger process using conditional probabilities in the larger process.
  Suggested criteria for application: Process is extremely complicated but can be embedded in a simpler simulation.
  Advantages(1): When applicable, can give substantial improvements. Some algorithms do exist for network applications.
  Disadvantages: Not well developed. Sometimes difficult to recognize applicability.
  Comments on typical applications: Improving the efficiency of estimating the distribution of the function of times to pass through a network. Network applications of interest are stochastic PERT, CPM, etc.

Adjoint method
  Description of technique: The problem is simulated in the backward or reverse sense, which is often described by a mathematical adjoint formulation.
  Suggested criteria for application: Reverse process can be formulated. Starting positions for final outcomes can be defined. Mathematical adjoint formulation is known.
  Advantages(1): Can give an indication of what importance sampling is required. Very powerful method when it can be developed. Useful for certain problems involving system input sensitivity studies.
  Disadvantages: Not well developed for most applications. Difficult to implement. Adjoint concept difficult to formulate for most problems.
  Comments on typical applications: Has been used almost exclusively in radiation transport. Could be useful in queueing, inventory, etc. if the problem can be formulated as integrals or differential equations.

Transformations
  Description of technique: Use a transform or modification of the process being simulated to give a problem that can be simulated with a reduced variance.
  Suggested criteria for application: Approximations are known for parts of the problem. An analytical formulation for the problem is known. Transformed equations can be easily simulated.
  Advantages(1): Permits parametric representation of knowledge about the system. Can lead to substantial improvements.
  Disadvantages: Not well developed for most applications. May require a complete analytical statement of the problem. Appears to be limited in scope of application.
  Comments on typical applications: Has been used almost exclusively in radiation transport problems. Has potential for queueing, inventory, etc.

(1) Advantages are noted relative to crude Monte Carlo except where indicated.
Two approaches can be taken to increase the accuracy of the estimator, $\hat{I}$. One is to increase the number of histories. The other is to reduce the variance ($\sigma^2$) associated with each observation. The disadvantage of increasing the number of iterations (i.e., the size of $N$) is obvious. For example, to reduce the interval of uncertainty by a factor of two, thus doubling the accuracy, four times as many histories would be required (for a fourfold increase in computing time). Eventually it becomes prohibitively expensive to gain further accuracy by increasing the number of histories. Therefore, achieving variance reduction by reducing the variance associated with each history is highly desirable for improvements in the answers. To evaluate the efficiency gained in the use of variance reduction techniques, it is clearly desirable to have a quantitative measure. This can readily be established based on the ideas introduced above. Suppose two simulation methods exist for estimating the same parameter $I$. Let the variance per history associated with the first simulation method be $\sigma_1^2$ and that associated with the second be $\sigma_2^2$. It is desired that the result be known within an uncertainty of $\varepsilon$ (i.e., the estimate $\hat{I}$ falls in the interval $I - \varepsilon$ to $I + \varepsilon$). For this to happen with high probability will require $N_1 = k^2\sigma_1^2/\varepsilon^2$ histories for the first method. For the second method, it will require $N_2 = k^2\sigma_2^2/\varepsilon^2$ histories. In general the two methods will require different amounts of computational effort to generate each history. Let the computer time taken per history by the first method be $t_1$ sec and by the second $t_2$ sec. Then the total time required for the first method to achieve the desired accuracy would be $k^2\sigma_1^2 t_1/\varepsilon^2$. The total time for the second method would be $k^2\sigma_2^2 t_2/\varepsilon^2$. The relative efficiency of the two simulation methods is given by the ratio of the computing times required. Thus,

$$\text{efficiency} = \frac{\sigma_1^2 t_1}{\sigma_2^2 t_2}$$
which is the relative time advantage gained by using the second method. This can be viewed either as the reduction in computer time when both simulations are run to generate the same accuracy or as the gain in accuracy when both simulations are run for the same length of computing time.
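A direct way to apply this measure (with purely illustrative numbers) is to compare the per-history variances and per-history running times of the two methods:

```python
def relative_efficiency(var1, t1, var2, t2):
    """Relative efficiency of method 2 over method 1, as defined above: the ratio
    of total computing times (var1 * t1) / (var2 * t2) needed to reach the same
    statistical uncertainty."""
    return (var1 * t1) / (var2 * t2)

# Hypothetical example: method 2 halves the per-history variance but takes
# 20% longer per history, for a net efficiency of about 1.67.
print(relative_efficiency(var1=4.0, t1=1.0, var2=2.0, t2=1.2))
```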
A DEMONSTRATION CASE OF VARIANCE REDUCTION APPLICATIONS
In order to demonstrate that the variance reduction techniques, which had been largely developed for specialized applications, would improve the efficiency of a wider range of Monte Carlo simulations, we tried out several techniques on a simulation program of interest to our Navy sponsors. The program chosen was the Antisubmarine Warfare Air Engagement Model, APAIR. This model is a Monte Carlo simulation of the full engagement between one aircraft and one submarine. The general version (APAIR 2.6) simulates the detection, localization and attack phases of airborne ASW missions. The model is particularly suited for this study since it is a relatively long running program, and because it is in rather wide use, improvements in the running time would be most welcome. It is emphasized, however, that the objective was not to critique the APAIR model but rather to evaluate the effectiveness of efficient techniques which are not commonly used in Monte Carlo simulations. It should also be mentioned that not only are the techniques studied here applicable to the APAIR program but also to other ASW simulation models and to many other types of Monte Carlo simulation.
APAIR can be applied to a variety of ASW problems involving one aircraft in pursuit of a submarine. The problem used in this study is considered to be a typical application of APAIR in that it is designed to estimate the effectiveness of the aircraft and sensors in finding and killing a submarine. A total simulation of APAIR is comprised of a series of independent histories. A single history in the simulation proceeds as follows: The initial aircraft location, direction and speed are chosen. The submarine location, bearing and speed are also chosen according to specified random characteristics. The aircraft proceeds to execute certain tactics with the objective of detecting the submarine. Once the submarine is detected, the aircraft enters a localization phase where certain maneuvers are performed and sonobuoys are dropped in a specified pattern. In addition, Magnetic Anomaly Detection (MAD) gear is used in the localization process. Two typical problems, differing only in the range of detection of the MAD gear, were utilized in this study. If localization of the submarine has been achieved, the aircraft proceeds to drop a torpedo which may or may not result in a kill. If no kill is realized, the aircraft then proceeds through further localization to attempt another torpedo drop, again following specified maneuvers. The history can be terminated in several ways which include exceeding a time limit, achieving a submarine kill, or exhausting aircraft stores (torpedoes or sonobuoys). Once the history is terminated, by any event whatever, a new history is initiated. When the specified number of histories are completed, final results are tabulated and printed out. Among the parameters estimated by APAIR the following were used in this study to illustrate the efficiency of variance reduction:

Probability of a submarine kill

Average time used to achieve a submarine kill

Number of stores used per submarine kill for three types of stores. One type was torpedoes and the other two were two kinds of sonobuoys.

Improvement in the estimate of these basic parameters was the objective in using the variance reduction techniques considered here. Of course, such an improvement could also be achieved by increasing the number of histories, although the running time could become prohibitively long. For example, the above problem required approximately 15 minutes on a Univac 1108 to generate 100 histories. The desirability of achieving variance reduction is therefore obvious. The type of problems solved by APAIR provides an excellent opportunity to apply a wide variety of variance reduction techniques which can substantially improve APAIR efficiency. Several of these were tried on APAIR with generally excellent results. In some applications the gain in efficiency was estimated to be almost a factor of 20. It appears that a factor of 10 improvement can readily be accomplished in many APAIR applications if the time and effort is expended to understand and implement the appropriate techniques. It is unquestionably to the benefit of the analyst to follow such an approach when long running times are of concern. Resources available to perform the study did not permit application of all the possible variance reduction schemes available. Therefore, a selected number of techniques representing a broad range of characteristics were applied to various types of ASW problems. These included:
Statistical estimation: using an expected value of kill at each torpedo drop rather than scoring actual kills based on Monte Carlo simulation. Two cases were studied corresponding to different types of Magnetic Anomaly Detectors (MAD).

Expected value: replacing an actual submarine kill by a reduction in the "survival value" of the submarine for computing percent kill as a function of which torpedo caused the kill (first, second, etc.). This was also performed for two types of MAD gear.

Systematic sampling: selecting initial starting coordinates for the submarine location using two types of MAD gear. Two types of systematic sampling were considered.

Antithetic variates: used to select initial starting locations of the submarine.

Correlated sampling: using random number control to generate identical histories up to the point where differences in two types of MAD gear lead to a divergence of histories. This was used to estimate differences in effectiveness between two types of MAD gear.

History reanalysis: where one run was made unbiased and the second generated using weighting factors on the histories from the first to correct for differences in MAD gear. This was also used for estimating differences in effectiveness between two types of MAD gear.

Importance sampling: performed with an importance function weighted to generate correlated samples for the two types of MAD gear. Again, this was used for estimating differences in effectiveness between two types of MAD gear.

As a basis for quantitatively characterizing the calculational efficiency of variance reduction over straightforward or crude sampling (i.e., with no variance reduction), the following definition was used throughout:

$$\varepsilon = \left(\frac{\text{variance with crude sampling}}{\text{variance with the variance reduction technique}}\right) \times \left(\frac{\text{APAIR running time with crude sampling}}{\text{APAIR running time with variance reduction}}\right)$$
Crude sampling represents the procedure currently used in APAIR. Using the above definition for efficiency, a value of $\varepsilon = 2$ for application of a variance reduction technique implies a reduction in computer time by one half to achieve the same variance as would be obtained with crude sampling. It should be recognized that the efficiency, $\varepsilon$, as defined above is a random variable since the variances with and without variance reduction are estimates generated from the statistics of the simulation and not theoretical analyses. A batching method was used in estimating most of these variances. The procedure is presented in [9] and will not be detailed here. It is important to recognize, however, that the efficiencies presented are estimates and, therefore, are subject to a certain variability. In fact, the variance estimates are second order quantities and thus this variability is generally much greater than that encountered for the primary quantity whose variance is being estimated. From theoretical arguments the error in the efficiency figure could easily be as large as ±45%. In some of the cases presented in this report, there are large correlations between the crude simulation and the variance reduced technique and thus the variability of the efficiency figure is much less than this theoretical maximum. Since efficiency is so difficult to calculate with any
Table 4. A summary of efficiency improvements in APAIR application using variance reduction techniques*

ASW parameters considered for variance reduction (rows): probability of submarine kill; time to submarine kill; number of torpedoes used; number of sonobuoys used, Type A; number of sonobuoys used, Type B; "average" improvement in efficiency†.

Variance reduction techniques applied (columns): statistical estimation (1st and 2nd applications); expected value for 2nd torpedo (1st and 2nd applications); expected value for 3rd torpedo (1st and 2nd applications); systematic sampling (1st and 2nd applications); antithetic variates (1st and 2nd applications); correlated sampling; history reanalysis; importance sampling.

[The individual efficiency entries of Table 4 are not legible in this copy; representative values for the statistical estimation and expected value columns appear in Tables 5-7.]

* The efficiency can be interpreted as the factor by which the running time can be reduced by using the corresponding variance reduction technique to achieve the same variance as when no variance reduction techniques were used.
† This is simply a numerical average of the numbers in the column above.
accuracy, calculations were made for five similar parameters: kill probability, mean time to kill, and number of stores used per kill (for three types of stores: torpedoes and two kinds of sonobuoys). In general the variance reduction efficiencies for these parameters should not be greatly different; therefore, the spread in efficiency values for these five parameters should give a rough idea of the variability of the efficiency figure for each variance reduction technique. As an improved estimate of the efficiency of each technique, a simple average of the five efficiency figures is presented in the tables. A summary of the efficiency improvements made in APAIR through the application of several different types of variance reduction is presented in Table 4. Each of the variance reduction techniques used here has been described in detail in [9] and will not be presented to that extent in this paper. A fuller explanation of the more significant cases is given below.

APPLICATION OF STATISTICAL ESTIMATION
In the course of an individual history, the aircraft will drop a torpedo when it thinks it has determined the submarine's location and heading. The actual submarine speed, aspect and range are used to determine, from input tables, the probability, $p_K$, that the torpedo will destroy the submarine. A random number, $R$, from a uniform distribution is generated and compared to $p_K$. If $R < p_K$, a kill is scored and the history is terminated. If $R \geq p_K$, no kill occurs and no scoring is done; the game continues with more localization and, possibly, another attack. If other torpedoes are dropped, a similar procedure is used to determine if a kill is scored later in the game. At the completion of the simulation, the probability of submarine kill is estimated as

$$\hat{p}_K = n/N$$

where $n$ = number of kills scored and $N$ = number of histories run. For estimating the time to kill, APAIR currently uses

$$\hat{T}_K = \frac{1}{n} \sum_{i=1}^{n} T_{Ki}$$

where $n$ is the number of kills scored and $T_{Ki}$ is the time taken to effect a kill in the $i$th history which ended in a kill. Similar procedures are used for the other parameters (number of torpedoes and sonobuoys used per kill) estimated.

In the statistical estimation technique for variance reduction, the scoring was changed so that actual kills are not scored, but rather the expected value of kill, $p_K$, was scored for all torpedo drops regardless of whether or not a kill was achieved. However, the simulation game itself was not modified in any way. That is, a random number, $R$, was still used at each torpedo drop to determine whether the history was terminated by a kill or more maneuvers took place. The outcome of this random choice did not affect the scoring of $p_K$. Specifically, define

$p_{Kij}$ = probability of kill for the $j$th torpedo dropped during history $i$,

$n_i$ = number of times a torpedo was dropped during history $i$.
Then, the total score generated for history $i$ is

$$\sum_{j=1}^{n_i} p_{Kij}, \qquad i = 1, \ldots, N,$$

and the estimate for the probability of kill for the entire simulation ($N$ histories) is

$$\hat{p}_K = \frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{n_i} p_{Kij}.$$

In the case of the remaining parameters, a similar approach was taken. For example, if $T_{Kij}$ is the time to torpedo detonation for the $j$th torpedo dropped in history $i$, then the estimate for the time to kill is given by an analogous expression. Similar expressions were used for the remainder of the parameters being estimated. The computational time per history is very slightly increased using this technique since the same game is still being played but there is a little more bookkeeping in the scoring. However, this is offset by the resulting variance reduction.

In demonstrating the statistical estimation technique, both the crude Monte Carlo estimate (counting actual kills) and the statistical estimate (summing over $p_K$ values) were calculated in the same run and, therefore, used the same histories. This produces a high degree of correlation between the crude and the statistical estimation results which reduces the variance of the efficiency figure. Two problems were run using MAD gear having long and short range detection capabilities respectively. The variances obtained with statistical estimation and with crude Monte Carlo are shown in Tables 5 and 6 along with the resulting efficiency factor for the use of this variance reduction technique. The sample variances were estimated using the statistical techniques described in [9] to obtain the variance
Table 5. Variance reduction using the statistical estimation technique for the short range MAD gear

Estimated expected values               Variance with     Variance with            Efficiency of
                                        crude sampling    statistical estimation   variance reduction
Probability of submarine kill           2.5 × 10⁻³        2.4 × 10⁻³               1.03
Time to submarine kill                  10.2              9.15                     1.10
Number of torpedoes used per kill       2.39 × 10⁻³       1.9 × 10⁻³               1.25
Number of buoys used per kill, Type A   0.084             0.067                    1.23
Number of buoys used per kill, Type B   0.450             0.396                    1.13

Average efficiency improvement ≈ 1.15. Ratio of running times (increase with variance reduction) ≈ 1.01.
Statistical estimation: rather than score actual kills, an expected value of kill is scored at each torpedo drop regardless of whether or not the torpedo scored a kill. The game is not modified.
Table 6. Variance reduction using the statistical estimation technique for the long range MAD gear

Estimated expected values               Variance with     Variance with            Efficiency of
                                        crude sampling    statistical estimation   variance reduction
Probability of submarine kill           2.5 × 10⁻³        2.5 × 10⁻³               1.00
Time to submarine kill                  9.20              7.00                     1.30
Number of torpedoes used per kill       6.1 × 10⁻³        4.96 × 10⁻³              1.22
Number of buoys used per kill, Type A   0.076             0.066                    1.14
Number of buoys used per kill, Type B   0.50              0.33                     1.50

Average efficiency improvement ≈ 1.23. Ratio of running times (increase with variance reduction) ≈ 1.01.
Statistical estimation: rather than score actual kills, an expected value of kill is scored at each torpedo drop regardless of whether or not the torpedo scored a kill. The game is not modified.
results indicated. The actual estimated values of the parameters are not presented since they are not considered to be germane to comparison of the efficiencies. Therefore, only the variances in the two cases are shown along with the efficiencies obtained in estimating the parameters with variance reduction. The efficiencies obtained varied from 1.00 (implying no improvement) to as high as 1.50 (implying a factor of 1.5 reduction in running time). However, these extreme values probably represent statistical fluctuations in the variance estimates rather than real efficiency improvement. Average efficiencies of 1.15 for the short MAD problem and 1.23 for the long were achieved. Thus, it can be seen that use of statistical estimation could have made roughly a 19 per cent improvement in APAIR run times.
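The contrast between the two scoring rules can be sketched as follows. The toy history generator is only a stand-in for the APAIR game (hypothetical kill probabilities, and none of the localization or store-limit logic), but the two estimators mirror the crude count of kills and the sum over $p_K$ values described above:

```python
import random

def run_history(p_kill_table=(0.6, 0.4, 0.3)):
    """Stand-in for a single history: return the kill probabilities of the torpedoes
    actually dropped and whether the crude random game ended in a kill."""
    p_drops, killed = [], False
    for p_k in p_kill_table:
        p_drops.append(p_k)
        if random.random() < p_k:      # crude game: a random kill/miss ends the history
            killed = True
            break
    return p_drops, killed

def estimate_p_kill(n_histories):
    """Crude estimate (kills / histories) versus the statistical estimation score,
    which sums the expected value p_K over every torpedo drop; the game itself
    is played identically in both cases."""
    crude_score, stat_score = 0.0, 0.0
    for _ in range(n_histories):
        p_drops, killed = run_history()
        crude_score += 1.0 if killed else 0.0
        stat_score += sum(p_drops)     # scoring changed, simulation unchanged
    return crude_score / n_histories, stat_score / n_histories
```

Both estimators are unbiased for the overall kill probability; the statistical estimation score simply removes the extra randomness of the final kill/miss decision from the tally, which is where its modest variance reduction comes from.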
APPLICATION OF EXPECTED VALUE TECHNIQUE TO ESTIMATING EFFECTIVENESS OF MULTIPLE TORPEDO DROPS
The expected value technique differs from statistical estimation in that the actual game or simulation being played, as opposed to just the scoring technique, is changed to replace a random choice with an expected value for the outcome of that choice. For example, instead of generating $R$ to test against $p_K$ with the choice of either killing or missing the submarine, the submarine is given an initial "weight" of 1.0. If the first torpedo dropped has a $p_K$ of 0.80, then 80 per cent of the submarine is deemed killed but 20 per cent survives and the weight is reduced to 0.2. If a second torpedo is dropped which has a $p_K$ of 0.50, then half of the remaining weight, or 0.1, is killed, while a weight of 0.1 continues to survive. The history is never ended due to a submarine kill but continues until some other limit, such as using up the mission time or using up the stores of torpedoes or sonobuoys, stops the history. Such a technique could be useful in studies such as determining the effectiveness of the number of torpedoes carried on an aircraft or the worthiness of the tactics for relocalization and reattack following a miss by the first torpedo. In these cases one is not so much interested in the overall kill probability as in the kills scored by the second, third, etc. torpedoes. In a crude Monte Carlo most of the kills are made by the first torpedo and the
history ends there. These histories add nothing to the knowledge of tactics involving reattack or to the kill value of the second torpedo, but simply constitute wasted time towards calculating the items of importance. In fact, what is worse, these histories add variance to the overall kill probability. It would be prohibitively expensive to determine the value of reattacks from a crude Monte Carlo due to the large proportion of the running time being spent on histories of no value to this parameter. Using the expected value technique outlined above, most histories will contribute to knowledge concerning the second and third torpedoes as the history will always continue after the first torpedo drop. For example, one of the crude Monte Carlo problems run had (out of 100 histories) 78 cases in which the first torpedo was dropped, 10 in which the second was dropped, and only one history where the third torpedo was dropped. Using the expected value technique, the same problem had 79 histories with one torpedo drop, 65 histories in which the second torpedo was dropped, and 43 histories of a third torpedo drop. (Some histories are terminated between torpedo drops when the aircraft runs out of fuel or sonobuoys, when the submarine escapes from the search area entirely, or for other similar reasons; thus these numbers are not all 79.) Estimation in the expected value case was done as follows. If $p_{Kij}$ is the kill probability for the $j$th torpedo in the $i$th history and $w_{ij}$ is the submarine weight at the time that torpedo was dropped, then

$$\hat{p}_{Kj} = \frac{1}{N} \sum_{i=1}^{N} p_{Kij} w_{ij}$$

is the kill probability of the $j$th torpedo. Likewise, the time to kill by the $j$th torpedo is estimated from an analogous expression involving the detonation times $T_{Kij}$.
Table 7. Efficiency calculated for expected value technique

                                    Problem 1: long range MAD gear      Problem 2: short range MAD gear
Estimated parameter                 2nd Torpedo      3rd Torpedo        2nd Torpedo      3rd Torpedo
Probability of kill                 2.00             13.7               2.36             ~10*
Time to kill                        1.74             17.1               1.00             --
Sonobuoys used per kill-Type A      1.63             18.4               1.21             --
Sonobuoys used per kill-Type B      1.37             1.02               1.46             --
Average efficiency                  1.69             14.1               1.51             --

Efficiency factors calculated from a comparison of expected value to statistical estimation have been increased by 20 per cent to permit comparison of the expected value technique with crude Monte Carlo.
* Theoretical estimate. The actual estimate was infinite, as the crude Monte Carlo did not contain any realizations of a third torpedo drop for this case.
Expected value: At each torpedo drop, the "weight" of the submarine is decreased by the torpedo kill probability. No random game to determine kill/miss is played and the simulation continues.
where $T_{Kij}$ is the time of detonation of the $j$th torpedo on the $i$th history. A similar formula is used to estimate the number of sonobuoys dropped per kill by the $j$th torpedo. (Obviously the number of torpedoes used is constant in this case, so this parameter was omitted.) For this technique, two sample problems were run; one with the long-range MAD gear and one with the short-range MAD gear. In each case APAIR was run twice, with and without the expected value technique. To reduce the variance of the efficiency factors, the histories in the two runs were correlated using the technique described in the next section. This kept the histories in each run identical through the first torpedo drop. The resulting efficiencies for the second and third torpedo drops for both cases are shown in Table 7. In the short MAD case, there were no examples of a third torpedo drop in the crude Monte Carlo run, so the efficiency of the variance reduction technique is theoretically infinite in this case. However, using the p̂K estimate from the expected value run, it was possible to estimate the number of crude Monte Carlo histories that would be necessary to obtain similar statistics; this led to the efficiency factor of 10 for p̂K shown for the third torpedo in the short MAD case. The running times for the crude and expected value calculations are shown in Table 8. As anticipated, the expected value histories took much longer to run because they did not stop at the first torpedo drop (as most of the crude Monte Carlo histories did) but went on to simulate a second and a third torpedo drop. This extra running time is used to generate information concerning the parameters of interest, i.e., the kills made by second and third torpedo drops, and, therefore, the overall efficiency is much improved as Table 7 shows. It is instructive to consider the efficiency for the first torpedo drop. As the histories in the two runs were identical through the first torpedo drop, the variances are identical for the first torpedo parameters. Because of the increased running time, the first torpedo efficiencies are reduced to 0.49 (for the short MAD) and 0.36 (for the long MAD). The extra running time used in calculating second and third torpedo drops is wasted as far as the first torpedo drop is concerned, and this lowers the efficiency. This illustrates a common point of variance reduction: any technique which reduces variance for one parameter will increase variance for some other parameter. Variance reduction techniques must be carefully tailored to the parameters of importance, in this case the kills involving two and three torpedo drops. Because programming changes were needed to calculate the parameters separately by torpedo drop, base runs with the current APAIR, representing crude Monte Carlo, were not made. Runs using the expected value technique then provided efficiency factors for an expected value/statistical estimation comparison.
Table 8. Running times for crude Monte Carlo and expected value calculations

Problem                   Running time for "crude"    Running time for expected    Ratio of times
                          Monte Carlo (sec)           value technique (sec)
Problem 1: long MAD       664                         1866                         2.81
Problem 2: short MAD      822                         1666                         2.03
The average efficiency of the statistical estimation/crude Monte Carlo comparison described above was determined to be a factor of 1.2. Multiplying the efficiencies from the expected value/statistical estimation comparisons by 1.2 generated the figures presented in Table 7 as the efficiency of expected value versus crude Monte Carlo. This combination of efficiency factors is justified because the expected value technique necessarily incorporates statistical estimation scoring; once the kill/miss decision has been removed from the game, the scoring must be by the pK's for each torpedo drop. Thus there is no possible "expected value without statistical estimation" comparison to crude Monte Carlo, but only the combined efficiency.
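The weight-based scoring described above can be sketched as follows. This is a minimal, self-contained Python illustration of the expected value bookkeeping, contrasted with crude kill/miss sampling; the kill probabilities, history count, and function names are assumptions made for the example and are held constant across histories only to keep the sketch short (in APAIR they vary with range, geometry, and tactics).

```python
import random

# Hypothetical per-drop kill probabilities for up to three torpedo drops.
PK = [0.80, 0.50, 0.50]
N = 10_000                     # number of histories (illustrative)

def crude_history(rng):
    """Crude Monte Carlo: play the kill/miss game and stop at the first kill."""
    scores = [0.0] * len(PK)
    for j, pk in enumerate(PK):
        if rng.random() < pk:          # random kill/miss decision
            scores[j] = 1.0
            break                      # the history ends at the first kill
    return scores

def expected_value_history():
    """Expected value technique: carry a surviving 'weight' instead of playing
    the kill/miss game, so every history reaches the later torpedo drops."""
    scores = [0.0] * len(PK)
    weight = 1.0                       # the submarine starts with weight 1.0
    for j, pk in enumerate(PK):
        scores[j] = pk * weight        # expected kill credited to drop j
        weight *= 1.0 - pk             # surviving fraction carries on
    return scores

rng = random.Random(1)
crude_kills = [0.0] * len(PK)
ev_kills = [0.0] * len(PK)
for _ in range(N):
    for j, s in enumerate(crude_history(rng)):
        crude_kills[j] += s / N
    for j, s in enumerate(expected_value_history()):
        ev_kills[j] += s / N

print("crude          kill probability by drop:", [round(x, 3) for x in crude_kills])
print("expected value kill probability by drop:", [round(x, 3) for x in ev_kills])
```

Every expected-value history contributes information about the second and third drops, whereas most crude histories terminate at the first drop, which is the effect exploited in Table 7.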
AN APPLICATION OF CORRELATED SAMPLING
One of the most powerful and generally useful variance reduction techniques is correlated sampling. It is a procedure that can be used to reduce variance in Monte Carlo simulation in the following general situations:

The effect of a perturbation to a known problem is to be determined.
The difference between estimated parameters in two problems having similar characteristics is to be calculated.
A parametric study of several similar problems is to be performed.
Such situations occur very frequently. In fact, in most studies the most important result to be investigated is the change in response of the system as a problem characteristic is varied. As will be seen, the payoff from various types of correlation in sampling can be very high. The problem selected for demonstration of correlated sampling involved the short range and long range MAD detector cases discussed previously. The main quantity of interest was the difference between these two cases in the parameters used in prior sections. The only difference in problem characteristics in the two cases was a difference in MAD detection capabilities. This was expressed in a function relating probability of detection to range of target from the aircraft, as shown in Fig. 1.
Fig. 1. Detection probabilities for the long and short range MAD gear used in the APAIR studies (probability of detection versus range of submarine from aircraft).
To demonstrate the concept of correlated sampling for this application, we define

$p_S$ = probability of submarine kill using the short range MAD detectors
$p_L$ = probability of submarine kill using the long range MAD detectors.

The problem of interest is to perform Monte Carlo simulations to estimate the difference

$$\Delta = p_L - p_S.$$

The crude Monte Carlo approach would first estimate $p_S$ (denoted by $\hat{p}_S'$) using one set of histories and then $p_L$ (denoted by $\hat{p}_L'$) using another set of independent histories. Then the difference $\Delta$ is estimated using $\hat{\Delta}' = \hat{p}_L' - \hat{p}_S'$. The variance in $\hat{\Delta}'$ is

$$\sigma^2(\hat{\Delta}') = \sigma^2(\hat{p}_L') + \sigma^2(\hat{p}_S')$$

for the case of independent histories in the $p_S$ and $p_L$ estimations. Suppose, however, that positively correlated estimates for $p_S$ (say $\hat{p}_S$) and for $p_L$ (say $\hat{p}_L$) were used to estimate $\Delta$. Then for

$$\hat{\Delta} = \hat{p}_L - \hat{p}_S$$

the variance in $\hat{\Delta}$ will be given by

$$\sigma^2(\hat{\Delta}) = \sigma^2(\hat{p}_L) + \sigma^2(\hat{p}_S) - 2\,\mathrm{cov}(\hat{p}_S, \hat{p}_L)$$

where $\mathrm{cov}(\hat{p}_S, \hat{p}_L)$ is the covariance between $\hat{p}_S$ and $\hat{p}_L$. Since $\hat{p}_S$ and $\hat{p}_L$ are positively correlated,

$$\mathrm{cov}(\hat{p}_S, \hat{p}_L) \geq 0,$$

and hence

$$\sigma^2(\hat{\Delta}) \leq \sigma^2(\hat{\Delta}').$$
Thus, the objective of correlated sampling is to develop a sampling strategy that will correlate the estimators $\hat{p}_S$ and $\hat{p}_L$. This technique can be extremely powerful when small perturbations of problems are to be studied, since the correlations induced will tend to emphasize differences in the problems due to the perturbation rather than differences due to statistical fluctuations, which are usually the controlling differences in cases with independent histories. The above arguments for improvement in the variance using correlated sampling for the probability of kill would also apply directly to estimating the differences in other ASW parameters. Correlation can be accomplished in several ways. For example, the short and long range MAD problems could be simulated independently except that the same random number could be used in determining the outcome once the probability of detection had been obtained from the curves in Fig. 1. Thus, correlation between the two results would exist. Another way, and the one that was used here, is to control the random numbers in the two simulations by using the same random numbers in the two problems until a difference in detection occurs due to the difference in the MAD gear.
Two separate runs were made, but corresponding histories in the two simulations were made to start at the same point in the random number sequence. Thus, the histories would be identical up to the point where the difference in MAD gear resulted in different decisions. At the time the detectors came into play, the detection outcome was selected in each case from the same random number. Subsequently, the histories continued independently until the end of the game. More correlation could have been introduced by subsequently using identical random numbers wherever the problem logic allowed. The program changes made to induce this much correlation were fairly simple. Two separate random number generators were used in each simulation. The first was used once each history to give a random starting point in the sequence of the second generator; it produced the same sequence of starting points in both simulations. The second generator was used within the history to obtain random numbers for the simulation process. By starting at the same point in both cases, identical histories will be generated until there is a difference in decisions made concerning a MAD detection. Even though the first history in one problem might use more random numbers than in the other problem, the random number sequences would be returned to the same point at the start of the second history in each problem. The results of the calculations are shown in Table 9. The greatest increase in efficiency (a factor of 3) was found for the probability of kill. The average improvement over all parameters was almost a factor of 2. Thus, the running time for the same variance could be reduced by about a factor of 2 by using correlated sampling.
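A minimal sketch of the two-generator, common-random-number scheme just described is given below, assuming a drastically simplified game in which the only difference between the two problems is the detection curve. All function names, probabilities, and the single-test history structure are invented for illustration and are not the APAIR logic.

```python
import random

def detection_prob(target_range, long_gear):
    """Hypothetical detection curves: the long range gear detects out to a
    greater range than the short range gear."""
    cutoff = 1.5 if long_gear else 1.0
    return 0.9 if target_range < cutoff else 0.1

def run_history(start_seed, long_gear):
    """One simplified engagement history.  Both problems restart the second
    generator at the same per-history seed, so they draw identical random
    numbers through the detection test and diverge only when the two
    detection curves lead to different decisions."""
    rng = random.Random(start_seed)           # second generator, restarted per history
    target_range = rng.uniform(0.0, 2.0)      # identical in both problems
    detected = rng.random() < detection_prob(target_range, long_gear)
    if not detected:
        return 0.0
    return 1.0 if rng.random() < 0.8 else 0.0  # kill/miss game after detection

def estimate_difference(n_histories, master_seed=12345):
    master = random.Random(master_seed)       # first generator: one starting seed per history
    seeds = [master.randrange(2**31) for _ in range(n_histories)]
    kills_long = [run_history(s, long_gear=True) for s in seeds]
    kills_short = [run_history(s, long_gear=False) for s in seeds]
    return sum(l - s for l, s in zip(kills_long, kills_short)) / n_histories

print("estimated p_L - p_S:", round(estimate_difference(20_000), 4))
```

Because both runs of a given history draw the same target range and the same detection random number, the differences that survive in the estimate come mainly from histories in which the two detection curves actually disagree, rather than from statistical fluctuations.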
Analysis of correlated histories

One of the great benefits of correlation is the insight it can provide into simulation problems. For example, if two highly correlated histories, one with and one without a problem perturbation, were available, then, most of the time, differences observed in the histories will be due to the problem perturbation rather than to statistical variations. The problem of the two MAD detectors was ideally suited for demonstrating the analysis of correlated histories, since the MAD detectors were sufficiently similar that variations in the estimates of the kill probability could be lost in the statistics of the problem.
Table 9. Variance reduction using correlated sampling for estimating differences between short and long range MAD gear parameters

Estimated difference in expected value        Variance with      Variance with           Efficiency
(long range MAD - short range MAD)            crude sampling     correlated sampling
Probability of submarine kill                 2.96 x 10^-3       9.6 x 10^-4             3.0
Time to submarine kill                        15.0               9.2                     1.58
Number of torpedoes per kill                  2.2 x 10^-3        1.6 x 10^-3             1.33
Sonobuoys used per kill-Type A                0.166              0.068                   2.40
Sonobuoys used per kill-Type B                0.174              0.144                   1.17

Average efficiency improvement ~ 1.9. Ratio of time increase with variance reduction ~ 1.03.
Correlated sampling: Two separate runs were made. Histories were identical up to the point where the difference in MAD gear led to a divergence in histories.
Table 10. Analysis of correlated histories for long and short range MAD simulations

History No.    P_K (Short range MAD)    P_K (Long range MAD)
1              0.96                     0.80 (3)
2              0.80                     0.80 (1)
3              0.00                     0.00 (1)
4              0.80                     0.80 (1)
5              0.80                     0.80 (1)
6              0.00                     0.00 (1)
7              0.80                     0.80 (1)
8              0.00                     0.96 (2)
9              0.80                     0.68 (3)
10             0.80                     0.80 (1)

(1) In these histories the difference in MAD gear had no effect. A large number of such occurrences is reasonable due to correlation.
(2) In this history, the long range MAD gear picked up the submarine, resulting in a kill, while the short range gear missed detection. This type of result is expected.
(3) In these histories, detection was accomplished with both types of gear. However, the kill probability is lower with the long range gear. These histories should be studied in detail to gain further insight.
In fact, the initial results obtained without using correlated sampling showed no statistically significant difference between the two types of detectors. Correlated sampling was used in an attempt to improve the accuracy sufficiently that the difference in the two cases would be significant. Unexpectedly, the correlated results were nearly indistinguishable, and a sample of histories was examined in detail to gain an understanding of why the long range detection gear did not show superiority. The sample series of single histories is shown in Table 10. These histories were obtained from runs correlated in the manner described above. There are three types of situations that can be identified in Table 10. First, there are a large number of histories where the short and long range gear give the same result. This situation should be expected since the gear is similar and the histories are highly correlated. That is, when a short range detection occurs with the short MAD, it will also occur with the long MAD. Also, when no detection occurs for the long MAD at long ranges, none will occur for the short MAD. At intermediate ranges where the two MAD curves differ, this, of course, is not true. This is manifested by the second situation (history 8), where the long range MAD detected the submarine and effected a kill and the short range MAD did not, with no kill as a result. This type of history was also expected. Of most interest is the third situation where, in histories 1 and 9, the long range MAD gear produces a lower kill probability than the short MAD gear although both types detected the submarine. This result was quite unexpected and would tend to indicate there are considerations other than variations in MAD gear which make the two problems different. For example, the tactics used might be appropriate to the short MAD detector but might be "trigger happy" when used with the long MAD gear, resulting in premature torpedo drops with a lower kill probability. Had the histories not been correlated, it would
have been difficult, due to the statistical variation from history to history, to notice that such events were occurring. In this manner, therefore, correlation can identify which histories should be examined in detail to provide more insight into the problem. Another application of correlated sampling is presented in the following section.
AN APPLICATION OF HISTORY REANALYSIS
In the second application of correlated sampling to APAIR, reanalysis of histories using weight factors to correct for the difference in problem characteristics was used rather than controlling the random numbers. Effectively, the following was performed for each history:

A base history was run using crude sampling for the long MAD gear up to the point where the possibility of detection was to be tested.

Given the range, a probability of detection for the long range gear ($p_L$) and the short range gear ($p_S$) were determined from the curves shown in Fig. 2.

A random number was compared to $p_L$ to determine if a detection occurred, and the history continued using the result of that random decision.

Weighting factors were assigned to the history according to the following rules. If the first test for a detection with the long gear resulted in a detection, a weight correction factor of

$$W_1 = \frac{p_S}{p_L}$$

was assigned to account for the short range gear. If the first attempt resulted in no detection, then the assigned weighting factor for the short range gear was

$$W_1 = \frac{1 - p_S}{1 - p_L}.$$

Fig. 2. Probability of detection versus range of submarine from aircraft for the long and short range MAD detectors used in the demonstration of history reanalysis and importance sampling.

The history continues for subsequent tests of detection with the long MAD gear by assigning weights according to

$$W_{i+1} = W_i \frac{p_S}{p_L}$$

if a detection occurred on the $(i+1)$st test, and

$$W_{i+1} = W_i \frac{1 - p_S}{1 - p_L}$$

when a nondetection occurred on the $(i+1)$st use of the MAD gear. This is continued until the game stops due to a kill or some other reason (e.g., sonobuoy stores exhausted). The final weighting factor for the history is denoted $W_j$. If a kill occurred in this history then $n_j = 1$, otherwise $n_j = 0$, where the subscript $j$ denotes the $j$th history. (Note that the total number of kills in $N$ histories is simply $N_k = \sum_{j=1}^{N} n_j$.)

The above series of steps was performed for the $N$ histories and the following formulas were used to estimate the differences in the parameters of interest. Consider first the probability of kill. The estimated probability of kill for the long gear is given by

$$\hat{p}_L = \frac{1}{N} \sum_{j=1}^{N} n_j.$$

The estimated probability of kill for the short gear is given by

$$\hat{p}_S = \frac{1}{N} \sum_{j=1}^{N} n_j W_j.$$

The difference in probability of kill (long MAD $-$ short MAD) is

$$\hat{\Delta} = \frac{1}{N} \sum_{j=1}^{N} n_j (1 - W_j).$$

Similar considerations were used for the time to kill. If history $j$ resulted in a kill and the time of kill was $T_j$, the average time to kill for the long MAD is

$$\bar{T}_L = \frac{1}{N_k} \sum_{j=1}^{N} n_j T_j,$$

while for the short MAD it is

$$\bar{T}_S = \frac{1}{\hat{p}_S N} \sum_{j=1}^{N} n_j T_j W_j.$$
($\hat{p}_S N$ is the "number" of kills in the short MAD problem and is the correct normalizing factor for the time to kill and other parameters.) In a similar manner the remaining parameters (number of torpedoes or sonobuoys used per kill) were calculated. It is clear that the results obtained for the long and short range detectors are highly correlated. The histories for the two cases are not only identical up to the point of possible detection, but they continue to be correlated throughout. When a kill is registered due to the use of the long range gear, it is also registered for the short range gear but carries a different weight. As the random choices made are appropriate to the long gear, they will not be optimum for the short range gear. This will increase the variance of the short range gear estimates, but the high degree of correlation should still reduce the total variance of the difference. A second significant advantage of the above approach is that the results of two problems have been obtained by actually performing only one simulation (i.e., for the long gear), although some additional computation is required to perform the weighting. However, the total computing time required was only 53 per cent of that needed to perform two independent calculations. Thus, to generate the required correlated cases, approximately one-half of the computational effort was required. In addition to this time saving, there were substantial improvements in the variances of the different parameters estimated. These results are summarized in Table 11, where it is seen that substantial improvements in the efficiency were realized. For example, in estimating the stores used (torpedoes and Type B buoys) about a factor of 10 improvement in efficiency was obtained. The overall average efficiency was found to be almost 7. This can, of course, be interpreted as a reduction in computational time that can be realized using the correlated sampling scheme described here.
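The reweighting can be sketched as follows. The example reuses the toy detection curves from the earlier correlated sampling sketch and has a single detection test per history, whereas APAIR repeats the weight update at every detection test; all names and probabilities are illustrative rather than APAIR quantities.

```python
import random

def detection_prob(target_range, long_gear):
    """Same hypothetical detection curves as in the previous sketch."""
    return 0.9 if target_range < (1.5 if long_gear else 1.0) else 0.1

def reanalysed_history(rng):
    """Play one history with the LONG gear only, while accumulating the weight
    W that re-expresses the same history under the SHORT gear."""
    weight = 1.0
    kill = 0.0
    target_range = rng.uniform(0.0, 2.0)
    p_long = detection_prob(target_range, True)
    p_short = detection_prob(target_range, False)
    if rng.random() < p_long:                       # detection game played for the long gear
        weight *= p_short / p_long                  # detection: ratio of detection probabilities
        if rng.random() < 0.8:                      # kill/miss game after detection
            kill = 1.0
    else:
        weight *= (1.0 - p_short) / (1.0 - p_long)  # nondetection: complementary ratio
    return kill, weight

def estimate(n_histories, seed=7):
    rng = random.Random(seed)
    results = [reanalysed_history(rng) for _ in range(n_histories)]
    p_long = sum(n for n, _ in results) / n_histories        # (1/N) sum n_j
    p_short = sum(n * w for n, w in results) / n_histories   # (1/N) sum n_j W_j
    return p_long, p_short, p_long - p_short

pl, ps, diff = estimate(50_000)
print("p_L = %.4f, p_S = %.4f, estimated difference = %.4f" % (pl, ps, diff))
```

The estimators in this sketch correspond directly to the formulas above: the long-gear probability is the mean of the kill indicators, the short-gear probability is the mean of the weighted indicators, and both come from a single simulated run.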
CONCLUSION
It is the firm opinion of the authors that the applicability of variance reduction techniques to large scale Monte Carlo simulations has been successfully demonstrated. It is strongly urged that all simulation analysts acquaint themselves with variance reduction techniques and their implementation.
Table 11. Variance reduction using history reanalysis for estimating differences between short and long range MAD gear

Estimated difference in expected value        Variance with      Variance with            Efficiency
(long range - short range)                    crude sampling     history reanalysis
Probability of submarine kill                 4.2 x 10^-3        6.6 x 10^-3              1.21
Time to submarine kill                        7.8                2.05                     7.2
Number of torpedoes per kill                  1.25 x 10^-2       2.56 x 10^-3             9.3
Sonobuoys used per kill-Type A                0.097              0.0289                   6.4
Sonobuoys used per kill-Type B                0.204              0.040                    9.7

Average efficiency improvement ~ 6.8. Ratio of time decrease with variance reduction ~ 0.53.
History reanalysis: One run was made using the long range gear probability distribution for detection. Simultaneously, calculations were made for the short range gear using weighting factors to correct for the difference in probability between the two distributions.
It is felt that substantial improvements in variance reduction can readily be achieved in most simulations. In addition, it is hoped that the report [9], produced as a part of this study, can go a long way toward providing a long-needed comprehensive reference on the description and implementation of variance reduction techniques.
REFERENCES
1. J. M. Hammersley and D. C. Handscomb, Monte Carlo Methods, Methuen, London (1964).
2. H. Kahn, Applications of Monte Carlo, AECU-3259, The Rand Corporation, Santa Monica, Calif. (1954).
3. Y. A. Shreider, The Monte Carlo Method, Pergamon Press, Oxford (1966).
4. G. Goertzel and M. H. Kalos, Monte Carlo methods in transport problems, Progress in Nuclear Science, Series I, Volume II, pp. 315-369, Pergamon Press, Oxford.
5. J. Spanier and E. M. Gelbard, Monte Carlo Principles and Neutron Transport Problems, Addison-Wesley, Reading, Mass. (1969).
6. R. R. Coveyou, V. R. Cain and K. J. Yost, Adjoint and importance in Monte Carlo application, Nucl. Sci. Engng 27, 219-234 (1967).
7. E. J. McGrath, et al., Techniques for Efficient Monte Carlo Simulation, Vol. I, Selecting Probability Distributions, SAI-72-590-LJ, March 1973.
8. E. J. McGrath and D. C. Irving, Techniques for Efficient Monte Carlo Simulation, Vol. II, Random Number Generation for Selected Probability Distributions, SAI-72-590-LJ, March 1973.
9. E. J. McGrath and D. C. Irving, Techniques for Efficient Monte Carlo Simulation, Vol. III, Variance Reduction, SAI-72-590-LJ, March 1973.
10. E. J. McGrath and D. C. Irving, Demonstration of Improved Monte Carlo Simulation Techniques for the APAIR Antisubmarine Warfare Program, SAI-73-533-LJ, March 1973.

(Presented at the Computers & Operations Research Symposium, 20-21 August 1973.)