Evaluation of probability point estimate methods Che-Hao Chang, Yeou-Koung
Tung,* and Jinn-Chuang
Energy & Resources R. 0. C.
Industrial
Laboratory,
Technology
Yang’
Research
Institute,
Hsinchu,
Taiwan,
In modern engineering designs and analyses, computer models are frequently used. Due to the presence of uncertainties associated with the model inputs andparameters, which are treated as random variables, analysis is feasible tf the methods employed do not require excessive computations yet produce reasonably accurate results. Point estimate methods are such schemes that are potentially capable of achieving the goals. Assuming normal distributions to the random variables, three point estimate methods (Rosenblueth ‘s, Harr ‘s, and a modtfied Harr ‘s method) were evaluated in this paper for d$erent numbers of random variables and dtxerent model types. Results of this evaluation indicated that the proposed modified Harr ‘s method yielded comparable, if not better, performance than the other two methods. Also, performance evaluation indicated that additional statistical information, other than the commonly usedfirst two moments, should be incorporated, tfavailable, to enhance the accuracy of uncertainty analysis. Keywords: computer model, uncertainty
analysis, point estimates, equal probability,
1. Introduction Modern engineering designs and analyses frequently require using computer models, many of which are complex. Unavoidably, inputs and parameters in the model involve uncertainties that result in model output uncertainty. In uncertainty analysis, one ideally derives exact uncertainty measures about the model output. However due to the complex and/or implicit nature of a functional relation, uncertainty analysis of a computer model can hardly be performed analytically. Alternative approaches such as the first-order second-moment (FOSM) method, the Monte Carlo simulation, or point estimate (PE) methods may be more practical. Selection of appropriate methods for uncertainty analysis of a computer model depends upon the model itself, the computational requirement, and available information related to the stochastic inputs and parameters. The FOSM method is frequently used in engineering applications. Its advantages and disadvantages have been described by Yen et al.’ It has been shown by Karmeshu and Lara-Rosano’ that the FOSM method is a special case of Rosenblueth’s PE method when the uncertainties in random variables are small. Since the amount of Address reprint requests to Dr. Tung at the Wyoming Water Resource Center and Statistics Department, University’ of Wyoming, Laramie, WY 82071, U.S.A. Received 24 June 1993; revised 23 August 1994
1994; accepted
19 September
* Presently Professor at Wyoming Water Resource Center and Statistics Department, University of Wyoming, Laramie, WY 82071, U.S.A. ’ Presently Professor at Department of Civil Engineering, National Chiao Tung University, Hsinchu, Taiwan, 30050, R.O.C. Appl. Math. Modelling 1995, Vol. 19, February 0 1995 by Elsevier Science Inc., 655 Avenue of the Americas, New York, NY 10010
probability
computation involved is about the same for both methods3 PE methods are more generally applicable. As for the Monte Carlo simulation, model output uncertainty with better precision is achievable by repeatedly running the model a large number of times. Should model computation be time-consuming, the Monte Carlo simulation would not be practical for uncertainty analysis. Furthermore, the Monte Carlo simulation requires the specification of marginal distributions of random variables. Under such circumstances, a PE method often constitutes a more practical alternative because of the smaller amount of computation required along with statistical moments for inputs only. Three PE methods, Rosenblueth’s, Harr’s, and a modified Harr’s algorithm, were compared in this paper. The concept of PE methods is originated by Rosenblueth4 for uncertainty analysis of model output involving symmetric random variables. It is later extended to handle asymmetric random variables by the same author.’ However, the practical applicability of Rosenblueth’s PE method is limited because the number of model computations increases rapidly with the number of random variables. For this concern. Lind6 proposes an alternative point estimate scheme using face-center point distribution. A similar idea is adopted by Harr’ but based on a correlation matrix. It is noted that Harr’s method considers only the first two moments, including convariance, of the random variables involved. Implicitly, the method is appropriate for problems involving symmetric random variables. Weighting factors are considered in Rosenblueth’s and Harr’s PE methods to compute the moments of the outputs. To simplify the computation, a procedure modifying Harr’s algorithm with equal weights was proposed here. This method 0307-904x/95/$10.00 SSDI 0307-904X(94)00018-2
Evaluation
of probability
point
estimate
methods:
C.-H.
selected points of evaluation with an equal probability density for normally distributed random variables. To place the three PE methods on common ground, the stochastic parameters were assumed to be correlated normal random variables. The evaluation focuses on the relative performance of the methods under different numbers of random variables and function types.
Chang
et al
the value of model output at (x1,,,, weighing factor, ~(a,, ___, a,,), is simply probability masses obtained from random variables are uncorrelated. exists among the random variables, added and the result is n-1 1 di P(c5,..._,a,,) = fi Pi,& +
PE methods evaluate model output uncertainty based on stochastic input parameters, which are treated as random variables, at specified points in the parameters space. The points are selected to preserve the probabilistic information of the input parameters. Different PE techniques select different points in the parameter space for model evaluation. For the sake of completeness, algorithms of the three PE methods are briefly described in the following sections. Rosenblueth’s
Algorithm
Consider a function W = g(X) involving a single stochastic parameter X, which is treated as a random variable. The locations of two points, x, , x-, and the corresponding probability mass, p+ , p_ , are determined to preserve the first three moments of the stochastic parameter, X. The four quantities are computed as X+ =p+z+g x_ =p-z-a Z+
in which p and a are the mean and standard deviation of the stochastic parameter X, respectively, and
z_
2
Y
1+
z+ =
z+
\i -
;
0
(2)
y
and y is the skew coefficient of the stochastic parameter X. For multivariate problems involving n correlated random variables, W = g(X) = g(X,, X,, . . , X,), the univariate algorithm, is applied to define the two point values for each random variable. Then points in the parameter space for model evaluation are determined by permuting the two points in each dimension resulting in 2” points at which model output values are computed. Hence, the mth order moment about the origin of model output, An 3 can be computed from p:, = E(W”) Z
,A-.” I ,
&a,, .... 6”+%. .... 6.) = an=+,(3)
in which subscript, hi, i = 1 = n, is a sign indicator and can only be + or - representing the input parameter Xi having the value of xi+ or xi_, respectively, as the two points obtained in the univariate case, w~~,,,..,~,,)is
96
Appl.
dj aij
1
in which P~,~, represents the probability location x~,~, and aij is determined as
(4) 1
mass at point
where pij is the correlation coefficient between two random variables. From the moment about the origin, the central moments of model output, p,,,, can be obtained from the moments about the origin as
i=O
where Cy is the binomial coefficient which can be calculated by m!/[i! (m - i)!]. As can been in equation (3), Rosenblueth’s algorithm for uncertainty analysis requires 2” model evaluations. As the number of random variables is moderate or large, the algorithm could become very computationally intensive. Harr’s
Z_
p_ = l--p+
Z +=
j=i+
(1)
P +=z+
(
i=l
2. Point estimate methods
x2,62, . . . , x,J. The the product of the equation (1) if the When correlation a correction term is
Math.
Modelling,
1995,
Vol. 19, February
Algorithm
Harr’ proposes an alternative PE method to circumvent the computational disadvantage of Rosenblueth’s algorithm. By Harr’s algorithm, for a problem involving n correlated random variables, a hypersphere with radius ,,& centred at the origin of the standardized parameters space is defined to represent the correlation matrix. The points of model evaluation are selected at the intersections of the hypersphere and the eigenvectors of the correlation matrix. Note that two intersections are obtained for each eigenvector. The number of total candidate points selected for model evaluations is 2n. Thus, the number of computations using Harr’s algorithm is significantly less than Rosenblueth’s algorithms as n increases. To calculate the moments of model output, the point estimates are weighed by the eigenvalues associated with the correlation matrix. Harr’s PE algorithm is summarized as the following: The correlation decomposed as R, = VLV’
matrix
of random
variables,
R,,
is (7)
VJ is the eigenvector matrix where V= [vlrv2,..., with vl, v~,..., v, being the column vectors of the eigenvectors; L = diag@,, /1,, . . . , A,) is the corresponding diagonal eigenvalue matrix with 2,) A,, . . . , A, being the eigenvalues.
Evaluation of probability point estimate methods: parameter 2. Along each eigenvector in the standardized space, the coordinates of two intersection points on the hypersphere with a radius origin, are determined as
i = 1, 2,...,n
xi+ = +Gvi, where xi+ and xi_ are the coordinates of two eigenvector, vi, in the Then, the coordinates the original parameter =
xi+
p
+
of &,
centered
at the
i=
1,2,...,n
probability in equation (3) and of eigenvalues in equation (10). To simplify the computation, the proposed method modifies Harr’s algorithm by selecting points at the intersections of the eigenvectors and the hypersphere in
the standardized eigenspace. (8)
the column vectors containing intersection points along the ith standardized parameter space. of the 2n intersection points in space are obtained as
c”2x;*,
C.-H. Chang et al.
(9)
are the column vectors where xi+ and xicontaining the coordinates of the two intersection points in the original parameter space, p is the vector of the expected values of the random variables, and Z1/2 is the diagonal matrix containing the standard deviations of the random variables. selected along each 3. Based on the two points eigenvector, compute the corresponding model output values wi+ = g(xi+) for i = 1 to n. about the origin for the 4. The mth order moment model output can be computed as
The random variables can be obtained as
in the standardized
eigen space
U = T(X - p)
(17)
where U is the vector of random variables in the standardized eigenspace having zero means and unit variances; and T = V’L- ‘/2Z-1/2 denotes the transformation from the original space to the standarized eigenspace. The transformation by equation (17) is linear. Therefore, if all the original random variables are normally distributed, the transformed parameters, Us, are standard normal random variables. In the standardized eigenspace, the proposed modified Harr’s algorithm selects, along each axis, two points located a distance & on either side of the origin. Under the normality condition, all selected points are on the surface of equal probability density. Therefore, the corresponding 2n values of model output can be assigned with equal weights. By equation (17), the proposed points in the original parameter space can be obtained as
(10)
xi+
=
p
+
&
Cl"
L1’2vi,
i=l
,...,n
(18)
from
which, the corresponding model output values = g(x,&), for i = 1 to n, can be computed. Equations can be adopted to obtain the statistical moments &14) by replacing the eigenvalues 2’s in equations (lOH14) by
i=l
where -
+$I? =
WY++ WY_ (11)
2
Then, the central moments can be obtained by equation (6). More specifically, the mean and variance of the model value W can be calculated as
Ai6
i E(W)
i=l
=
(12)
n I
2 JqW2)
=
Aiiq
i=l
(13)
n Var(W) = E(W’) - E2(W)
(14)
in which Wi+
+
Wi_
i=
KC
2 qL
’
w;+ + wi’2
’
1, 2,...,n
(15)
i= 1,2,...,n
(16)
l/n. Figure 1 schematically shows a difference in point selection by the three PE methods for a bivariate case in the standarized parameter space. Rosenblueth’s points are located at the corners of the square centered at the origin with side length of 2. However, the eigenvectors of a bivariate correlation matrix would have an angle of 45” with each standardized parameter axis. Accordingly, Harr’s algorithm selects the intersections of the eigenvectors and the circle with a radius of $ centered at the origin, which is identical to those of Rosenblueth’s in the bivariate case. Note that the points selected by the modified Harr’s method are on an elliptic contour of equal density, which is a circle in the standardized eigenspace. When the random variables are uncorrelated, the proposed modified Harr’s method is identical to Harr’s algorithm.
3. Numerical experiment and results
ModiJied Harr’s Method From the above descriptions for the two existing PE methods, the computations of statistical moments of model output use weighing factors in the form of
To examine the relative performance of the three PE methods, a numerical experiment was carried out considering different model types involving different numbers of stochastic parameters which are treated as random variables. The three models considered were: (a) x = CX,, (b) & = XX:, and (c) Y, = HXi. In each model, the number of stochastic parameters were 2-6.
Appl.
Math.
Modelling,
1995,
Vol. 19, February
97
Evaluation
of probability
point estimate methods:
C. -H. Chang et al.
x;
0
points
selected
by Rosenbluth’s
A
points
selected
by Harr’s
0
points
selected
by the
method
method
modified
Harr’s
method
Figure 1. Schematic diagram of the three PE methods a standardized parameter space for a bivariate case.
in
The mean values and standard deviations of all stochastic parameters were set to 1.0 and 0.3, respectively. For simplicity, the correlation between the two stochastic parameters Xi and Xj was set as pij = 0.91i-jj
(19)
Except for Rosenblueth’s algorithm, the other two PE methods do not account for the skew coefficient of random variables. To place all three PE methods on common ground in the comparative study, all the stochastic parameters were assumed to be multivariate normal. In the Monte Carlo simulation, 30,000 samples were generated for each of a different combination of model types and parameter numbers. The first three moments of the model values computed by the Monte Carlo simulation were treated as the true uncertainty feature based on which performance of the three PE methods were compared. For models (aHc), the moments of model values involving 2-6 stochastic parameters were listed in Tables 1-3, respectively. The percentages of the differences with respect to the true values were also attached. For the linear model (a) as shown in Table I, all three PE methods yielded very close estimations for the first two moments. Since all the stochastic parameters were assumed to be normal random variables in the investigation, the output, Y,, should have a normal distribution. The presence of the skew coefficients in the
98
Appl.
Math.
Modelling,
1995,
Vol. 19, February
simulated results, which were rather insignificant, was due to sampling variation. For model (b) as shown in Table 2, the means estimated by the three PE methods were all close to the true value from simulation. However, discrepancies existed in estimating standard deviation and skew coefficient. For standard deviation, both Rosenblueth’s and Harr’s methods yielded identical estimates and underestimated the true value. However, the modified Harr’s method overestimated the standard deviations. Note that model (b) was a sum of squared normal random variables and the combined effect should result in a positively skewed 6. Table 2 shows that Rosenblueth’s and Harr’s algorithms both yielded a zero skew coefficient. This indicates the intrinsic deficiency in the two methods. The modified Harr’s algorithm was capable of revealing the existence of a skew coefficient in &. However, the estimated skew coefficient for model (b) increased as the number of the stochastic parameters increased. For this model, the modified Harr’s method clearly had better performance when 24 stochastic parameters were involved. For model (c), Table 3 shows that the estimated statistical moments by the three PE methods disagreed more and more with the true values as the number of involved stochastic parameters increased. In particular, Rosenblueth’s and Harr’s algorithms consistently underestimated all the first three moments of Y, while the modified Harr’s algorithm overestimated the mean and standard deviation and, like the other two PE methods, underestimated the skew coefficient. Interestingly, one observes that Harr’s algorithm in this case yielded somewhat accurate estimates of the moments for Y,. Judging from the computation aspects of the two algorithms, Harr’s algorithm undoubtedly is more advantageous than Rosenblueth’s. From Table 3, the modified Harr’s method yielded closer estimations for standard deviations and skew coefficients than the other two PE methods, especially when the number of stochastic parameters was not very large. Compared with the other two models, model (c) clearly was critical of the performance of the PE methods due to its increasing nonlinearity with the number of stochastic parameters. As can be seen, the uncertainty feature of a model response became more and more difficult to estimate accurately by the PE methods as the degree of model nonlinearity increased. From Tables 2-3, the standard deviations and skew coefficients estimated by the modified Harr’s method for models (b) and (c) were consistently higher than its two competitors. This tendency became more pronounced as the number of stochastic parameters in the model increased. This can possibly be explained by referring to Figure 1. When strong correlation exists among involved stochastic parameters, the eccentricity of the correlation matrix, measured by the ratio of the largest to the smallest eigenvalues, is large. Under this condition, the two points selected by the modified method for model evaluation are less evenly distributed over the parameter space than the other two PE methods. Since all selected points carry equal weight by the modified Harr’s method, the
Skewness
$
3
2
6i
R = Rosenblueth’s
Kurtosis
deviation Standard
;; .g
.G
Mean
8 = 6
g
Moment
5
method;
S
M H S
R
H R M S
H R M S
Method
2.9658
0.0000 0.0000 0.0000 0.0202
0.8798 0.8798 0.8798 9.8746
3.0000 3.0000 3.0000 3.0053
3.9792
0.0000 0.4038 0.8539
0.0000
1.1696 1.1808 1.1941
2.1800 2.1800 2.1726
estimated
H = Harr’s
moments
method;
R = Rosenblueth’s
Statistical
S
Kurtosis
2.
R H M S
Skewness
Table
R H M S
Standard deviation
z
R H M S
estimated
H = Harr’s
moments
Method
Statistical
Mean
Moment
Table 1.
method;
2
- 100.00% - 100.00%
- 1.67% 0.98%
-0.36% -0.36%
-100.00% 42.55%
- 100.00%
Carlo simulation.
3.7773
0.0000 1.1320 0.7941
0.0000 2.2912
2.2530 2.3137
4.3600 4.3600 4.3759
4
parameters
parameters
of stochastic
distributed
0.86% 0.86% 0.86%
-0.19% -0.19% -0.19%
- 100.00%
Carlo simulation.
2.8979
0.0000 0.0000 0.0000 0.0207
2.8407 2.8407 2.8407 2.8165
Number
normally
S = the Monte
- 100.00% 100.00% - 5.46%
- 1.74% 0.09%
0.17% 0.17%
with
S = the Monte
- 100.00%
Harr’s method;
3.8862
0.0000 0.7805 0.8256
0.0000 1.7506
1.7202 1.7521
3
for model-(b)
3.2700 3.2700 3.2645
M = the modified
- 100.00% -52.71% 100.00%
-2.05% -1.11%
0.34% 0.34%
PE methods
0.12% 0.12% 0.12%
10.0000 10.0000 10.0194
10.0000
4
parameters
parameters
of stochastic
distributed
Number
normally
0.03% 0.03% 0.03%
with
- 100.00% - 100.00%
Harr’s method;
2.9460
0.0000 0.0000 0.0000 0.0127
M = the modified
- 100.00% - 100.00%
- 100.00%
1.7301 1.7301 1.7301 1.7281
3
for model-(a)
6.0000 6.0000 6.0000 5.9981
PE methods
0.59% 0.59% 0.59%
-0.18% -0.18% -0.18%
by the three
method;
2
by the three
3.8206
0.0000 1.4611 0.8095
0.0000 2.8166
2.7691 2.8656
5.4500 5.4500 5.4517
2.9466
0.0000 0.0000 0.0000 0.0392
4.2017 4.2017 4.2017 4.1635
15.0000 15.0000 15.0000 14.9844
5
5
- 100.00% 100.00% 80.49%
- 1.69% 1.74%
- 0.03% -0.03% -0.03%
- 100.00% - 100.00%
~ 100.00%
0.92% 0.92% 0.92%
0.10% 0.10% 0.10%
3.8426
0.0000 1.7698 0.7969
0.0000 3.3240
3.2688 3.4070
6.5400 6.5400 6.5264
2.8779
0.0000 0.0000 0.0000 0.0094
5.8036 5.8036 5.8036 5.7829
21 .oooo 21 .oooo 21 .oooo 21.0288
6
6
- 100.00% 122.09% 100.00%
- 1.66% 2.50%
0.21% 0.21%
- 100.00% - 100.00%
- 100.00%
0.36% 0.36% 0.36%
-0.14% -0.14% -0.14%
moments
R H M S
R H M S
R H M S
S
method;
Standard deviation
Skewness
Kurtosis
R = Rosenblueth’s
Method
Statistical
Mean
Moment
Table 3.
2
6.8529
0.1007 0.1096 1.3567 1.6050
0.8912 0.8932 0.9969 0.9908
with
-96.13% - 92.73% - 15.84%
- 23.80% - 21.08% 8.94%
- 1.70% -1.18% 0.01%
Carlo simulation
0.0971 0.1825 2.1136 2.5115
1.2066 1.2497 1.7250 1.5835
1.4545 1.4622 1.4798 1.4797
4
parameters
parameters
of stochastic
distributed
Number
normally
S = the Monte
- 93.73% -93.17% - 15.47%
- 10.05% - 9.85% 0.62%
-0.04% - 0.04% -0.04%
Harr’s method;
3
for model-(c)
1.2349 1.2349 1.2349 1.2354
PE methods
M = the modified
- 94.59% -94.59% -45.88%
- 1.84% - 1.84% - 0.90%
-0.27% -0.27% -0.27%
by the three
H = Harr’s method;
3.8642
0.0446 0.0446 0.4458 0.8238
0.5861 0.5861 0.5917 0.5971
1.0810 1.0810 1.0810 1.0839
estimated
- 3.90% -1.91% 4.26% - 38.56% - 32.20% 24.47% - 97.73% - 93.24% -30.71%
1.7335 1.7694 1.8807 1.8038 1.5275 1.6858 3.0947 2.4863 0.0860 0.2559 2.6222 3.7845
5
0.0967 0.3253 3.0031 4.9107
1.8512 2.2339 5.6995 3.9289
2.0656 2.1693 2.5761 2.3199
6
- 98.03% - 93.38% - 38.85%
- 52.88% -43.14% 45.07%
- 10.96% - 6.49% 11.04%
Evaluation of probability point esrimare methods: C.-H. Chang et al. estimated second and higher moments are inclined to be larger than the other two competitors under the condition of strong correlation. This effect could be magnified as the number of stochastic parameters and/or the nonlinearity of the model increase.
4. Performance
Evaluation
Although Tables l-3 provide some indications about the relative performance of the three PE methods based on the statistical moments, there are still many cases to which the picture is not entirely clear. For example, from Table 3, the modified Harr’s method had better estimations for standard deviation and skew coefficients but slightly poorer estimation for the mean in the case involving five stochastic parameters. Because the overall effect of statistical moments can be demonstrated through a probability distribution curve, comparisons were therefore made based on the goodness-of-fit to the true distribution curve for evaluating the performance of the PE methods. Normal, lognormal, and two somewhat more complicated but general distributions were adopted as probability distributions for the three model outputs in this study. Because identical moments were obtained for Y, by the three PE methods, the comparisons were made only for Yb and Y,.
Performance
Criteria
Three criteria were used to compare performance of the three PE methods:
the
relative
(1) Biasness (BIAS):
Distributions
for Model Outputs
In view of preserving the marginal information, the three PE methods have different capabilities. Rosenblueth’s algorithm is capable of accounting for the first three moments of individual variable whereas the other two consider only the first two moments. Intuitively, is is sensible to compute the first N moments of model output if the first N moments of stochastic parameters can be preserved in the PE methods. For this reason, the fourth and higher order moments of the model output obtained from the PE methods were not calculated. In the performance evaluation of the three PE methods, two situations were considered: adopting only the first two moments for the model outputs, and incorporating all available and plausible information. The first situation assumed normal and lognormal as the distribution functions for the model output, whereas the second one was based on the first three or four moments along with an analytical distribution capable of preserving the specified moments. More specifically, the Johnson probability distribution’ along with the first four moments from the Monte Carlo simulation was used as the true distribution based on which Pearson distribution’ with the first three moments from the PE methods was compared. The computer algorithm employed to compute the quantiles of Johnson probability curve was developed by Hill, Hill, and Holder.” For the Pearson probability curve, the program developed by Davis and Stephens’ ’ was used to compute the quantiles based on the first three moments along with a left bound, defined as five standard deviations lower than the mean.
Results
1 BIAS
=
(WeJl
-
Wt.,)
&
(20)
s0
(2) Mean absolute
error (MAE):
1 MAE
=
Iw,,~
-
WJ
dp
(21)
s0 (3) Root mean squared error (RMSE): r f-1 -II/2 RMSE = (w,,~ - w,.,J2 & LJ 0
(22)
where w,,~ is the value of the pth order quantile for the assumed true probability model and w,,~ is the estimated value. Note that, in this study, the true quantile values, W f,p, were computed using the moments from the Monte Carlo simulation along with the assumed distribution, whereas the estimated quantiles, w,,_ were based on the moments from the PE methods. The integration in the three performance criteria, equation (20H22), were assessed numerically at several discrete probability values: 0.01, 0.025, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95, 0.975, 0.99. The quantiles at these probability values were obtained to calculate the value of performance criteria.
Table 4 lists the values of the three performance criteria for the cases of assuming normal, lognormal, and Johnson/Pearson distributions for outputs of model (b). A comparison based on the first two moments, as for normal and lognormal cases, indicated an identical performance of Rosenblueth’s and Harr’s methods. Morever, the modified Harr’s algorithm revealed better results when the model involved 2-4 stochastic parameters. Neglecting the skew coefficient of the model output, the performance of modified Harr’s method deteriorated with the increasing number of stochastic parameters in this study. Incorporating the skew coefficient, results from using Johnson/Pearson distributions showed a consistently superior performance using the modified Harr’s method. Results listed in Table 5 are for model (c). In the case of using normal and lognormal distributions, the modified Harr’s method yielded closer estimations to the true value than the other methods. There is a restriction on the value of skew coefficient not exceeding 2.0 in the program used for computing Pearson probability quantiles. As can be observed in Table 3, the skew coefficient estimated by the modified Harr’s method was beyond this limit for most case in model (c). Therefore,
Appl.
Math.
Modelling,
1995,
Vol. 19, February
101
Evaluation
of probability
Table 4. parameters
Values
point esrimare merhods:
of performance
criteria
associated
C-H. with
the
Chang et al. three
PE methods
Number Assumed distribution
Performance criteria
Normal
Log-normal
Johnson/Pearson
R = Rosenblueth’s
method;
Method
2
Math.
4
normally
distributed
parameters 5
6
0.0074 0.0074 0.0074
0.0055 0.0055 0.0055
-0.0159 - 0.0159 - 0.0159
-0.0017 -0.0017 - 0.0017
0.0136 0.0136 0.0136
MAE
R H M
0.0207 0.0207 0.0123
0.0250 0.0250 0.0055
0.0336 0.0336 0.0226
0.0384 0.0384 0.0396
0.0458 0.0458 0.0681
RMSE
R H M
0.0260 0.0260 0.0154
0.0315 0.0315 0.0057
0.0420 0.0420 0.0279
0.0484 0.0484 0.0499
0.0579 0.0579 0.0856
BIAS
R H M
0.0069 0.0069 0.0071
0.0050 0.0050 0.0055
- 0.0025 -0.0025 - 0.0009
0.0126 0.0126 0.0149
MAE
R H M
0.0200 0.0200 0.0132
0.0231 0.0231 0.0055
0.0245 0.0245 0.0248
0.0321 0.0321 0.0339
0.0442 0.0442 0.0532
RMSE
R H M
0.0283 0.0283 0.0168
0.0341 0.0341 0.0057
0.0441 0.0441 0.0300
0.0519 0.0519 0.0538
0.0623 0.0623 0.0912
BIAS
R H M
0.0148 0.0148 0.0265
0.0169 0.0169 0.0488
-0.0003 -0.0003 0.0446
0.0188 0.0188 0.0807
0.0390 0.0390 0.1187
MAE
R H M
0.2077 0.2077 0.1062
0.2950 0.2950 0.0648
0.3694 0.3694 0.0978
0.4549 0.4549 0.1688
0.5256 0.5256 0.2614
RMSE
R H M
0.2810 0.2810 0.1396
0.3973 0.0962
0.4999 0.4999 0.1375
0.6141 0.6141 0.2445
0.7064 0.7064 0.3850
H = Harr’s method;
M = the modified
Consideration
Appl.
of stochastic
with
R H M
The above evaluations were distributed stochastic parameters.
102
model-(b)
BIAS
Table 5 is not capable of providing a complete comparison for all the cases. However, for normally distributed stochastic parameters and Y,, Harr’s algorithm had slightly better performance than Rosenblueth’s according to the values of performance criteria shown in Table 5. Note that the values of MAE and RMSE in Tables 4-5 increased with the number of stochastic parameters for the three PE methods. This indicated that the accuracy and reliability of the PE methods deteriorated with an increase in number of stochastic parameters and the degree of non-linearity of the model under consideration. Further
3
for
Modelling,
limited to normally No skew coefficient
1995,
Vol. 19, February
Harr’s
method;
-0.0165 -0.0165 -0.0154
S = the Monte
Carlo simulation
was considered in the evaluation. To have some idea about the effect of non-normal stochastic parameters, the simple linear model (a) was used in which the stochastic parameters were assumed to be independent having lognormal distributions. Table 6 lists the statistical moments and the percentage difference obtained by the three PE methods under the independent, lognormal condition. The two Harr’s algorithms yielded identical results because all the stochastic parameters have the same means and standard deviations. In this case, it is observed that the standard deviation estimated by Rosenblueth’s method was consistently lower and less accurate than the other two competitors. However, the strength of Rosenblueth’s method over the other two algorithms is its ability to reflect the presence of skew coefficient of linear model, Y,. Table 7 lists the performance evaluation by the three criteria using Johnson/Pearson distributions. Judging
Evaluation of probability point estimate methods: C.-H. Chang et al. Table
5.
Values
of performance
criteria
associated
with
the three
PE methods
for model-(c) Number
Assumed distribution
Performance criteria
Normal
Log-normal
Johnson/Pearson
R = Rosenblueth’s
method;
Method
2
3
with
of stochastic
normally
distributed
parameters
parameters
4
5
BIAS
R H M
- 0.0029 - 0.0029 - 0.0029
- 0.0005 - 0.0005 - 0.0005
- 0.0252 -0.0175 0.0001
MAE
R H M
0.0092 0.0092 0.0050
0.0802 0.0786 0.0050
0.3057 0.2703 0.1138
0.7784 0.6474 0.4971
1.6969 1.3787 1.4501
RMSE
I? H M
0.0116 0.0116 0.0062
0.1015 0.0994 0.0062
0.3848 0.3405 0.1442
0.9793 0.8163 0.6246
2.1319 1.7334 1.8219
BIAS
R H M
- 0.0031 -0.0031 - 0.0030
-0.0025 - 0.0025 - 0.0004
- .0.0330 _ 0.0244
- -0.0899 _ -0.0510
0.0031
0.0844
-0.2949 -0.1840 0.2858
MAE
R H M
0.0068 0.0068 0.0036
0.0585 0.0573 0.0037
0.1845 0.1636 0.0650
0.4081 0.3383 0.1886
0.7576 0.6007 0.4205
RMSE
R H M
0.0123 0.0123 0.0065
0.1128 0.1105 0.0070
0.4276 0.3794 0.1615
1.0770 0.9017 0.6554
2.2803 1.8577 1.7766
BIAS
R H M
0.0094 0.0094 0.0113
0.0235 0.0236 N/C
0.0182 0.0275 N/C
0.0065 0.0471 N/C
MAE
R H M
0.0842 0.0842 0.0517
0.2294 0.2282 N/C
0.4350 0.4230 N/C
0.7222 0.6969 N/C
1.1310 1 .1030 N/C
RMSE
R H M
0.1042 0.1042 0.0608
0.2940 0.2923 N/C
0.6134 0.5850 N/C
1.1217 1.0273 N/C
2.0103 1.7846 N/C
H = Harr’s
method;
M = the modified
from the values of MAE and RMSE, it is clear Rosenblueth’s method was superior in describing overall probability curve of Y,.
that the
5. Summary and Conclusions Uncertainty analysis provides information about the degree of model output uncertainty which is essential for engineering designs and analysis in stochastic environments. However, implementation of such analysis would become less attractive and impractical when the amount of computations involved is tremendous. Point estimate methods are practical tools for uncertainty analysis. This paper examined the relative performance of the three PE methods under different model types with a varying number of stochastic parameters. Rosenblueth’s method is capable of incorporating more statistical information of the stochastic parameters in the analysis; Harr’s method renders a valuable improvement for dealing with a model containing a large number of stochastic parameters; a modified Harr’s method was proposed to further simplify the computation. In the performance evaluation, the stochastic parameters were assumed to be multivariate normal variables and positively corre-
Harr’s
method;
S = the Monte
- 0.0702 - 0.0344 0.0768
6 - 0.2540 -0.1504 0.2559
-0.1243 -0.0113 N/C
Carlo simulation.
lated. The first four statistical moments under different combinations of model type and parameter number were computed using the Monte Carlo simulation based on which the performance of the three PE methods were compared. In the case that model was linear, all the three PE methods yielded an identical estimations of the first three moments for model value. Rosenblueth’s method became less computationally attractive as the number of stochastic parameters in the model increased. For the model involving the sum of squared stochastic parameters that were normally distributed random variables, Rosenblueth’s and Harr’s algorithms yielded the same estimations for the first three moments. The modified Harr’s algorithm estimated the same mean values as the other two methods. It should be pointed out that, in this case, the model output was no longer a symmetric random variable. This characteristic was captured only by the modified Harr’s algorithm. For standard deviation and skew coefficient, the modified Harr’s algorithm tended to overestimate the true values whereas the other two PE methods consistently underestimated them. The tendency to overestimate the standard deviation and skew coefficient by the modified
Appl.
Math.
Modelling,
1995,
Vol. 19, February
103
-c
e
2
-l-l
_i .. iti
R H M S
Skewness
method;
R H M S
Standard deviation
R = Rosenblueth’s
R H M S
Method 0.03% 0.03% 0.03%
1 .I 022 0.0000 0.0000 0.8934
2.6683 2.8407 218407 2.8417
10.0000 10.0000 10.0000 10.0096
4
23.37% - 100.00% - 100.00%
-6.10% - 0.04% - 0.04%
-0.10% -0.10% -0.10%
R = Rosenblueth’s
Johnson/Pearson
Assumed Distribution
method;
R H M R H M
MAE
RMSE
H = Harr’s method;
R H M
0.0446 0.1638 0.1638
0.0260 0.1288 0.1288
0.0247 0.0166 0.0166
2
M = the modified
Method
BIAS
Performance criteria
1 .I848 0.0000 0.0000 0.8559
3.7998 4.2017 4.2017 4.2051
5
0.0688 0.0439 0.0439 0.1674 0.4152 0.4152 0.2022 0.5339 0.5339
0.0502 0.0341 0.0341 0.0750 0.2578 0.2578 0.0917 0.3292 0.3292
N/C 0.7682 0.7682
N/C 0.5991 0.5991
N/C 0.0631 0.0631
5
N/C 1.0464 1.0464
N/C 0.8229 0.8229
N/C 0.1229 0.1229
6
distributed
38.43% ~ 100.00% - 100.00%
- 9.64% - 0.08% - 0.08%
-0.10% -0.10% -0.10%
S = the Monte Carlo simulation.
4
3
parameters
with logonormally
Number of stoachstic
Harr’s method;
parameters
15.000 15.0000 15.0000 15.0155
distributed
parameters
lognormally
S = the Monte Carlo simulation
14.15% - 100.00% - 100.00%
- 2.87% -0.16% -0.16%
Harr’s method;
1 .0146 0.0000 0.0000 0.8888
1.6831 1.7301 1.7301 1.7329
6.0000 6.0000 6.0000 5.9980
M = the modified
8.40% 100.00% 100.00%
0.73% 0.73% 0.73%
0.02% 0.02% 0.02%
3
distributed
Number of stochastic
with normally
Table 7. Values of performance criteria associated with the three PE methods for model-(a) parameters
H = Harr’s method;
0.9263 0.000 0.000 0.0545
0.8798 0.8798 0.8798 0.8734
3.000 3.0000 3.0000 2.9993
2
Statistical moments estimated by the three PE methods for model-(a)
Mean
Moment
Table 6.
1.2636 0.0000 0.0000 0.8342
5.0461 5.8036 5.8036 5.7919
21 .oooo 21 .oooo 21 .oooo 20.9849
6
51.47% - 100.00% - 100.00%
- 12.88% 0.20% 0.20%
0.07% 0.07% 0.07%
Evaluation
method increased as the number of stochastic parameters increased. For the model involving multiplication of normally distributed stochastic parameters, accuracy of the three PE methods deteriorated rapidly as the number of stochastic parameters and the order of moments increased. The modified Harr’s method yielded more accurate estimations of the moments for model output than its competitors, especially when the number of stochastic parameters was not too large. Interestingly, Harr’s algorithm in this model resulted in better estimation on the first three moments than Rosenblueth’s method. Comparison based on statistical moments alone does not provide a clear picture of the relative performance of the three PE methods in many of the cases examined. Therefore, the statistical moments estimated by the PE methods were incorporated into a specified distribution that was used to compare how close to the distribution curve were the data based on the moments obtained from the Monte Carlo simulation. Three criteria were used to evaluate the goodness-of-fit of two distribution curves. Considering only the first two moments and assuming normal and lognormal distributions for the two nonlinear model outputs, the accuracy of modified Harr’s method was less desirable than the other two PE methods when the number of stochastic parameters was large. However, further incorporating skew coefficients significantly improved the accuracy of the modified Harr’s algorithm. Ignoring the skew coefficient of stochastic parameters, if it exists, may cause biased or inaccurate estimations of the uncertainty feature of model output. A simple test was performed using the linear model by assuming the stochastic parameters being independent and lognormally distributed. In this case, only Rosenblueth’s algorithm is capable of account for the asymmetric feature of the stochastic parameters. As expected, Rosenblueth’s method, in spite of its weakness in computation, outperforms its two competitors. The
of probability
point estimate
methods:
C.-H.
Chang et al.
results indicate that significant improvement in uncertainty analysis can be achieved if known skew coefficients in stochastic parameters can be incorporated. For this reason, if Harr’s algorithm or its modification is to be enhanced, their ability to account for skew coefficient of the stochastic parameters should be developed.
Acknowledgment The authors wish to express their gratitude to Water Resources Planning Commission, MOEA, Republic of China for the financial support of this study.
References 1
9
10 11
Appl.
Yen, B. C., Cheng, S. T. and Melching, C. S. First-order reliability analysis. Stochastic and Risk Analysis in Hydraulic Engineering, ed. B. C. Yen, Water Research Publications, Littleton, CO, 1986, l-36 Karmeshu, and Lara-Rosano, F. Modelling data uncertainty in growth forecasts. Appl. Math. Modelling 1987, 11, 62-68 Yeh, K. C. and Tung, Y. K. Uncertainty and sensitivity of a pit migration model. J. Hydraul. Engin. A.S.C.E., 1993,11!&2), 262-281 Rosenblueth, E. Point estimates for probability moments. Proc. Nail. Acad. Sci. U.S.A. 1975, 72(10) Rosenblueth, E. Two-point estimates in probabilities. Appl. Math. Modelling 1981, 5, 329-335 Lind, N. C., Modelling of uncertainty in discrete dynamical sytems. Appl. Math. Modelling 1983, 7, 146153 Harr, M. E. Probabilistic estimates for multivariate analyses. Appl. Math. Modelling 1989, 13, 313-318 Johnson, N. L. and Kotz, S. Distributions in Statistics: Continuous Uniuariate Distribution-l. John Wiley & Sons, Inc., New York, 1987, 22-27 Johnson, N. L. and Kotz, S. Distributions in Statistics: Continuous Univariate Distribution-l. John Wiley & Sons, Inc., New York, 1987, 9915 Hill, 1. D., Hill, R. and Holder, R. L. Algorithm AS 99: Fitting Johnson curves by moments. Appl. Stat. 1976, 25, 180-189 Davis, C. S. and Stephens, M. A. Algorithm AS 192: Approximate percentage points using Pearson Curves, Appl. Stat. 1983, 32, 3222327
Math.
Modelling,
1995,
Vol. 19, February
105