Mathematics and Computers in Simulation XXIV (1982) 60-64 North-Holland Publishing Company
60
GENERATING DISTORTING
AN AUTOCORRELATED
SEQUENCE OF RANDOM VARIATES
WITHOUT
THEIR DISTRIBUTION
John R. METZNER Computer Science Department,
University ofMissouri, Rolla, MO 65401,
L!S.A.
Using moving averages to introduceautocorreLationinto a sequence of pseudorandomvariables will most often distort the shape of the distribution. When the correlationcoefficientsare to be zero from some lag onward, it is possible to compensatefor most of the distortionby startingwith variates from another distribution. It is shown how to calculate the moving average coeff-lcients and the parametersof the basic sequence so as to achieve the desired autocorrelationwhile retainingfidelity with the desired distributionto four moments. The moving average approach yields an internalgenerationmethod having quite modest storage requirements.
INTRODUCTION
E[Y]
It is often desired to generate a (wide sense) stationary sequence of continuous random variates of a given distribution with the sequence having While specified serial correlation coefficients. it may be possible to achieve the deisred correlation coefficients using moving averages of variates having the desired distribution, the procedure will usually distort the higher moments of the distribution. However, that distortion can be compensated for by starting with variates of a related distribution and using the moving average technique to produce a sequence of variates having the desired distribution to four moments and the specified serial correlation coefficients. Specifically, suppose that a sequence Xi of identically distributed continuous random variates is sought having the first four moments =!J
E[Xl E[(X-
u,)~]
1 = L+
E[(X-
F,)‘]
= u3
u:,j41
= u4
E[(Xand the
serial
correlation
coefficients
E[(Xi-~1)(Xi+k-~~)/~21
= pk
k=1,2;.*,K
E[(Xi-~1)(Xi+j-u1)/~2]
= 0
j=K+l,K+2,.
It is assumed that there is available a sequence of identically distributed uncorrelated random variates, Y i, having the moments
03784754/82/0000-0000/$02.75
E[Y'] = 1 E[Y3] = $ E[Y4] =
B2.
Schmeiser [1] gives a survey of methods for generating these variates covering all possible combinations of 81 and B,. (16 1 >B -lis There is al& a regtriction not possible.) on the p's; the spectral density function K fK(A) = ~~(1 + 2 X pi cos iX)/Zn i=l must be nonnegative for the sequence to be stationary. (Sea, e.g., Yaglom [2], p. 56.) There have been attacks upon various relaxations Polge, Holliday, and Bhagaven of this problem. [3] describe an external method (the variable sequence must be sorted external to the program as opposed to internal methods which generate the next variate on demand) for approximately achieving the desired correlation coefficients. Fraker and Rippy [4] suggest an internal mixing procedure which requires K=l and ignores distortionsof p and u4' 3 This paper explores the use of moving averages of the form
..
specified.
= 0
Xi = CIYi + C2Yi_1 + *** + CK+lYi_K + LJl where the c's and the two parameters of the distribution of the Y's, 81 and 62, can be chosen to produce the desired moments and correlation This will yield an internal genercoefficients. ation method. And, the vanishing of the correlation coefficients beyond lag K places a Attention will bound upon memory requirements. be limited to real random variables, so real coefficients will be sought and even moments will The general methods be assumed positive.
0 1982 IMACS/North-Holland
J.R. Metzner / Autocdrrelated
mentioned in [1] assume a positive third moment; USE negations of the Y's in the moving average variate generator if u3 < 0.
sequence
61
of random variates
The function f2 has a zero derivative at cosX = -p1/4p2, which exists when 1~21 2 IP11/4. Evaluation f, at that value for X yields L
K = 1 CASE The restriction
f2(Arc cos(-pl/4p2))
=
!J2
2 + u-p1 __-
which is positive when p2 is negative, so this constraint is
gives IP
1
"Pi),2T
4p2
0 5 fl(X) = 112(1 + 2p1 cosh)/2n
j 2 l/2.
P2 < lP,1/4
Setting Xi = aYi + bYi_l + ~1
or
p;+8p;-4p2 < 0.
(7)
The intersection of the regions (5), (6), (7) is the "ice cream cone" area shown in Figure 1.
and taking moments gives E[X] = 1 =
E[(X-VI)21 = a2 + b2 E[(X-LI$~] = (a3 + b3) J81 E[(X-IQ)~] = (a4 + b4) 82 + 6a2b.2 E[(Xi-ul)(Xi+l -I+I,]
p2
(1)
1-13
(2)
= p4
(3)
= ab/(a2 + b') = pl
(4)
because Yi and Yi_l are independent and have zero expection. The independence of the Y's also assures that pj = 0 for j = 2,3,4,**. . Equations
(1) and (4) yield the solution
‘?I=%
(J1+2pl
+JixQ
/2
b = $
(J1+2p1
- GF)
/2
which is real when lpll < l/2. Then, the shape of the Y distribution may be determined from the remaining equations;
Jg, = B2
=
$a3 + b3)
Feasibi
lity
Region
for
Figure
1
K = 2 Case
Using
(u,-6a'b2)/(a4 + b4)
Xi = aYi+bYi_1+cYi_2+u1
unless pl = -l/2. There will be no solution in the p1 = -l/2 case unless ~3 = 0 and then the choice of qis arbitrary.
and taking moments as before gives
a2+b2+c2 K = 2 CASE -__
2
(8)
= u3
(9)
= u4
(10)
ab+bc
= PlU2
(11)
ac
= P2U2
(12)
(a3+b3+c3)q
Here, the restriction on p and p is 2 1 0 2 f2(X) = ~2(1+2Plcosx+2plcos2x)/2.
=u
(a4+b4+c4)82+6(a2b2+a2c2+b2c2)
Setting X = 0 gives 2 = 1+2p1+2p2 L 0, Yl
(5)
and h=~ gives 2 = l-2p1+2p2 2 0. y2
(6)
and E[(Xi-~l)(Xi+j-pl)] = 0 for j = 3,4,5,*.. because the product involved only terms containing YiYi+k which have zero expectation for k>O.
62
J.R. Metzner / Autocorrelated
Equations give
(8))
sequence of random variates
(11)) and (12) may be solved to c
a = ~((~1+~2+~2-12~~+2y~y~)/4
:a944 70:1
b = J1J.2(yl-y2)/2
,948, :9949 9789
c = ~(~~+~2-~2-12~~+2~1y2)/4 which are real throughout the feasibility region for (~1, ~2). With these values, the parameters of the Y distribution may be determined from (9) and (10). As before, there is a case (~1 = 0, P2 = -12) in which no solution exists unless u3 = 0 and that case permits an arbitrary choice of 6 . Table I presents a selection of coefficients for use in determining moving average generators for the K=2 case. The values to use may be determined from the following equations. For Pl 2 0:
For pl < 0; use lpll and:
a = G2A
a = G2A
b = G2B
b =-G2B
c = G2c
c = G2c 3
Jt31= Kllu3\~~-~'~
= K;~+J~-~'~
It may happen that particular values of P1, p2, and ~4 lead to a violation of the constraint and an autocorrelated y;;ju&2-1 sequence of variates cannot be produced without distortion of the higher moments. K = 3 CASE --
.9949 .9789 .9487 .5944
"
:81"" 7071 :&x97 9326
,X162 :1310 1637
.9825 :9751 9907
.ll*: ,084: ."916
,945" .89"3 ,8152 ,937" .967X o-74 .9632 9334 m 7993 9235 9532 042" '5539 9131
."?92 ."747 .447? .2764 ,231s .1852 IT"8 :15'14 .lS"" .sTT ,568" .I835 .2599 .??69 .2416
7634 4139 9039 8815 8161 8472 8576 8337 7519 6109 7701 7561 5742
.6X25 .i!l"9 .354x .x274 ,3"62 ,518" .4584 .4189 .3894 .7746 ,5826 .52"4 .6934
-.7”71 -.4472 ~.3162 -.204x -.I""5 .I""5 .?I)43 .31f': .447* * -.x217 -.I"63 ~.1"12 .I""? .?"51 .31-s * -.21x4 -.I"34 .I"23 2076 :3214 * -.108X .I"49 .212x .X286 * .I"94 .?Z"l 3403 ‘4902 .118D .2332 3599 * 2597 '3908 -%zT A
Coefficients for K = 2 Generators Table I The intersection of the regions (13), (14) and (15) is the feasibility region for this case. Its contours are shown in Figure 2. The region is symmetric about the line pl = 0, P3 = 0, so the contours for p3 < 0 are reflections about the p2-axis of those given.
The restriction for this case, 0
(13)
and for h = TI, 2 = l-2p1+2p2-2p3 5 0. &2
Now, using + ul Xi = aYi + bYi_l + cYi_2 + dY. 1-3
(14) and taking moments gives
The overall constraint, that the minimum of f3 on the interval [-~,IT]be nonnegative, produces the rather complex region
22+d
a2 +b2+c
= u2
(16)
= PlV2
(17)
ac + db
=
P2!J2
(18)
ad
=
P3?J2
(19)
ab + bc + cd 2 a = P2-3P1P3+9P3
2
< 0
011
\(J;;-P~)/~P~~
’
(15)
1
3 3 +c 3 + d3)$ (a +b p2(1-p 3
2
):T(p3-c%3/2)/27-P 2
p p /3 123
> 0 -
I
= "3
(20)
(a4+b4+c4+d4)R2+6(a2b2+a2c2+a2d2+b2c2+b2d2+czd2) = (21) u4 Scaling the first four of these equations by a=
$A
b=
%B
c=
VQ
d=
%D
(22)
J.R. Metzner / Autocorrelated
sequence
63
of random variates
(23)
A solution in closed form has not been found for HOWeVer, the subcase pl=p2=0 has been this case. solved. There,
B = %(61-6l)-~~/A
(24)
a = 4+,+6*)/2
c = Jg(61+Ci2)-A
(25)
d = &&-~,)/2
(26)
and
and reducing yields
2 A4 - !,(&l+fi2)A3+p2A2- %(61-62)~3A+~3 = 0
D =
p3/A
Jq=
c=o
b=O
~u31u2-3'2/(l-P;)mq
with the feasibility constraint /p3I 2 k. The numerical solution of (23) is not difficult, it is well-behaved except possibly at a boundary of the feasible region and the solution sought is the largest root in [1/2, I]. A selection of coefficients for K = 3 generators is given in Table II. Those A-values can be interpolated to give starting values for a Newton's Method solution of (23), which converTo use the tabulated or calculated ges nicely. coefficients, apply the following scheme. For p3 2 0:
For p3 < 0; use -pl and 1~31 with:
a=
GA
a=+
b=
$B
b=-J;;;B
SC
c
c
=
82
SC
d=-GD
d=$D
$- =
=
Jg, = +31u2-3’2 K11u31u2-3’2
= K2(~&-K3)
f32= K2(u4/+K3)
As before, there are combinations of the serial correlafion coefficients for which the value of Kl or Kl is undefined and can be used only when u3 = 0. checked.
Contours
of
Feasibility K = 3 Case Figure
2
Region
for
Also, the (@l( < B2-1 constraint must be
J.R. Metzner / Autocorrelated
64
sequence
of random variates
EXAMPLE Suppose an autocorrelated sequence of exponentially distributed variates having unit mean and variance is desired with pl = 0 and p2 = k. Table I gives the values A = C = 1 6, Kl = fi, K2 = 2, and Kg = 1.5. Then 41 = 2fi and 82 = 15. These moments can be achieved by the gamma distribution with parameter k. So, if Zi is a sequence of variates from that distribution, Yi = fi(Zi-4) in the moving average formula gives
xi = zi + _.l I.:
.Gvs
from which the moments and correlation coefficients may be verified.
.3 .4 .l .: .I 1
.9406 .8814 .933* .9627 .!I216 .9?% .5871 .1 .8472 1 ,979s .954? .3 .8631 1 .98X
,l .l
This may be contrasted with the technique which uses moving averages of variates having the distribution sought. If Wi is a sequence of variates having the exponential distribution with parameter 1, Vi = (Wi + Wi+2)/~ + 1-G is a sequence having the desired mean, variance, and correlation coefficients. However, the shape of the distribution differs significantly from the exponential since it has third and fourth moments l/c and 6 respectively, as opposed to the moments 2 and 9 respectively for the exponential distribution. In fact, the distribution of the Xi is exponential. This exact matching is achieved because the weights in the moving average formula are all equal and the sought distribution is infinitely divisible.
:2
.j -.3
:2
-.l
#
zi+2
REFERENCES Coefficients
for K = 3 Generators
[II
Table II K > 3 CASE -The feasibility of a set of coefficients will have to be checked by an examination of K fK(X) = u*(1+2 x PiCOS ih) i=l throughout the interval [-n,~]. equations,
[21 [31
Then, the moment [41
! a2 i i=l
112 151
K-l ' aiai+l i=l
= P1U2
K-2
’
aiai+2 i=l
alaK
= P2U2
= PKU2
can be solved numerically by iterative techniques such as Brown's [5] except possibly at the boundary of the feasible region.
Schmeiser, B., Methods for Modelling and Generating Probabilistic Components in Digital Computer Simulation When the Standard Distributions Are Not Adequate: A Survey, Proc. 1977 Winter Simulation Conf., pp. 51-57, December, 1977. Yaglom, A. M., An Introduction to the Theory of Stationary Random Functions, (PrenticeHall, Englewood Cliffs, N.J., 1962). Polge, R. J., Holliday, E. M., and Bhagavan, B.K., Generation of a Pseudo-Random Set with Desired Correlation and Probability Distributions, Simulation, 20:153-158, (May, 1973). Fraker, J. R., and Rippy, D. V., A Composite Approach to Generating Autocorrelated Sequences, Simulation, 23:171-175, (December, 1974). Brown, K. M., Computer Oriented Algorithms for Solving Systems of Simultaneous Nonlinear Algebraic Systems, pp. 281-348 in Numerical Solution of Systems of Nonlinear Algebraic ~----Equations, G. D. Byrne and C. A. Hall, Eds., (Academic Press, New York, 1973).