Two methods for the structured assessment of model uncertainty by experts in performance assessments of radioactive waste repositories

Reliability Engineering and System Safety 54 (1996) 225-241
Elsevier
PII: S0951-8320(96)00078-6
© 1996 Elsevier Science Limited. Printed in Northern Ireland. All rights reserved. 0951-8320/96/$15.00

E. Zio & G. E. Apostolakis*
Department of Nuclear Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139-4307, USA

The assessment of the performance of high-level radioactive waste repositories is based on the use of models for predicting system behaviour. The complexity of the system, together with the large spatial and temporal scales imposed by the regulations, introduces large uncertainties in the analysis. The difficulty of validating the relevant models creates the need to assess their validity by means of expert judgments. This paper addresses the problem of model uncertainty both from a theoretical and a practical point of view and presents two mathematical approaches to the treatment of model uncertainty that can assist the experts in the formulation of their judgments. The formal elicitation of expert judgments is investigated within the Technical Facilitator/Integrator (TFI) framework that has been proposed by the Senior Seismic Hazard Analysis Committee. Within this framework, the mathematical formulations for the treatment of model uncertainty are regarded as tools for sensitivity analyses that give insights into the model characteristics and are helpful in structuring the expert opinion elicitation process itself. The first approach, referred to as the alternate-hypotheses formulation, amounts to constructing a suitable set of plausible hypotheses and evaluating their validity. The second approach to model uncertainty is referred to as the adjustment-factor formulation and it requires that a reference model be identified and its predictions be directly modified through an adjustment factor that accounts for the uncertainty in the models. Furthermore, both approaches require a clear understanding of the distinction between aleatory and epistemic uncertainties. The implications that these two formulations have on, and the issues that they raise in, the elicitation of expert opinions are explored. A case study of model uncertainty regarding alternative models for the description of groundwater flow and contaminant transport in unsaturated, fractured tuff is presented. © 1996 Elsevier Science Limited.

* To whom correspondence should be addressed.

1 INTRODUCTION

1.1 Classification of uncertainty

The purpose of a long-term nuclear waste repository is to isolate the radioactive matter from the environment accessible to humans so that the risks to human health and safety will be within acceptable limits. Regulations concerning the siting and design of these engineered facilities require that their performance be assessed quantitatively for a regulatory period of tens of thousands to a million years, depending on the country. 1 It follows that the assessments, commonly called performance assessments (PA), will have to rely on predictive models for the description of the system evolution. These models are typically of a distributed-parameter form since their response is governed by partial differential equations in which the parameters normally vary in space. Due to the long time period over which the safety performance is sought and the large spatial scale over which the disposal system extends (several kilometres), one can expect that uncertainties will have a strong impact on the results of the assessment. For this reason, it is well recognized 1-3 that any procedure used to assess the performance of a long-term nuclear waste repository must explicitly account for the uncertainty inherent in the problem.

Gallegos & Bonano, 1 as well as Fehringer & Coplan, 4 present a classification of the uncertainties which typically affect the results of a performance assessment: 1) uncertainty in the future state of the disposal system; 2) uncertainty in the structure of the models used to predict the evolution of the system; 3) uncertainty in the values of the parameters embedded in these models. The same classification is used by Eisenberg et al. 5 in their presentation of the modelling process of the performance of a geologic repository, and by Thompson & Sagar 3 in their exposition of the Probabilistic Systems Assessment (PSA) approach to performance assessments. In the present paper, we will be primarily concerned with the uncertainties in the models' structure.

The PSA approach aims at defining a comprehensive, integrated model of the repository system in which the predictions over the long-term post-closure period are performed within a Monte Carlo simulation scheme so as to account for the effects of the uncertainties. It is claimed that within this framework a fundamental distinction between model and parameter uncertainty is not strictly necessary. Rather, some of the 'model uncertainty' could be treated as 'parameter uncertainty' through the introduction of particular 'flag-parameters' which are set to zero to eliminate a particular process from the model for a particular trial in the Monte Carlo simulation. A similar conclusion is drawn by Buslik 6 in his work on a Bayesian approach to model uncertainty based on Savage's partition problem. 7 The treatment is equivalent to Kaplan & Garrick's probability of frequency approach 8 and, in the case of a finite number of alternatives, model uncertainty is seen to be equivalent to parameter uncertainty. Eisenberg et al. 5 also argue that, for geohydrologic models, it is not always possible to clearly distinguish between model and parameter uncertainty. Their point of view, however, is different: it stems from the fact that the choice of the parameter values, on the basis of the available data, is conditional on the model selected, so that a separation of model and parameter uncertainty is not always possible and they cannot be treated independently. Apostolakis & Wu 9 recognize that model uncertainty can, in fact, be represented as a parameter uncertainty by introducing a parameter, $\phi$, whose values (1, 2, ...) correspond to different models ($M_1$, $M_2$, ...). Note, however, that the parameters that appear within each model $M_i$, say $\theta_j$, are (and should be kept) separate from $\phi$. Extending the values of $\theta_j$ to account for model variability is inappropriate.

Methods for characterizing and propagating the uncertainties on the parameters have been developed and applied in many studies. An extended review of these techniques is presented by Helton. 10

In practice, it is useful to classify uncertainties as being aleatory or epistemic. This is done merely for convenience and is not meant to imply that these are fundamentally different kinds of uncertainty. To explain the distinction between aleatory and epistemic uncertainties we utilize the concept of the model of the world, i.e., the model of the physical situation under analysis. 11 Since many physical phenomena exist which cannot be represented by means of deterministic models of the world, scientists and engineers construct models of the world which directly incorporate the uncertainty in these phenomena. A typical example of this situation is the Poisson model used to represent the probability of earthquakes over a given period of time. We then call aleatory the type of uncertainty which is due to the 'random' or 'stochastic' variability of the phenomena and is contained in the formulation of the model of the world itself. Other terms used to indicate this uncertainty are 'randomness' or 'stochastic uncertainty'. We note that the uncertainties that Refs 1, 3, 4 and 5 (which we discussed above) classify are epistemic, except those referring to the future state of the disposal system. The latter may include aleatory uncertainties, as described, for example, by probabilities of occurrence of major earthquakes and tectonic movements.

Each model of the world is conditional on the validity of its assumptions and on the numerical values of its parameters. In the above example, the Poisson model is conditional on the assumption of constant rate of earthquake occurrence; furthermore, the numerical value of this rate, which happens to be the parameter of the model, may be uncertain. To capture these uncertainties, we introduce an epistemic probability model which represents our knowledge regarding the numerical values of the parameters and the validity of the model assumptions (other terms that have been used are 'state-of-knowledge' uncertainty or, simply, 'uncertainty'). Thus, in this formulation, we see that what are commonly called model and parameter uncertainties are of the epistemic type. Furthermore, parameter uncertainties are represented by epistemic probability distribution functions that are conditional on the model being true. This separation of model and parameter (epistemic) uncertainties is very valuable in practice.
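
The Poisson example above can be made concrete with a minimal sketch. It assumes a constant-rate Poisson model of the world for earthquake occurrence (the aleatory part) and a hypothetical discrete epistemic distribution over the rate, conditional on the constant-rate assumption being valid. The numerical rates, weights, time horizon and function names are illustrative assumptions, not values taken from the studies cited.

```python
import math

def prob_at_least_one(rate_per_year, years):
    """Aleatory model of the world: Poisson occurrences at a constant rate.

    Returns the probability of at least one event within the time window.
    """
    return 1.0 - math.exp(-rate_per_year * years)

# Hypothetical epistemic distribution over the Poisson rate, conditional on
# the constant-rate assumption being valid (illustrative numbers only).
RATE_VALUES = [1e-5, 1e-4, 1e-3]   # events per year
RATE_WEIGHTS = [0.3, 0.5, 0.2]     # epistemic probabilities, sum to 1

def epistemic_family(years):
    """One aleatory probability per candidate rate, paired with its weight."""
    return [(w, prob_at_least_one(r, years))
            for r, w in zip(RATE_VALUES, RATE_WEIGHTS)]

def epistemic_mean(years):
    """Epistemic average of the aleatory probabilities."""
    return sum(w * p for w, p in epistemic_family(years))

if __name__ == "__main__":
    horizon = 10_000.0  # years, an illustrative regulatory period
    for weight, prob in epistemic_family(horizon):
        print(f"weight={weight:.1f}  P(at least one event)={prob:.4f}")
    print(f"epistemic mean = {epistemic_mean(horizon):.4f}")
```

Keeping the list of conditional probabilities together with their weights, rather than only the mean, preserves the breakdown between the aleatory and the epistemic contributions.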

1.2 Model uncertainty

Model uncertainty is considered a primary source of uncertainty for the study of the performance of mined geologic repositories proposed for the disposal of nuclear wastes. By definition, a model is a representation of a real system and uncertainties inevitably arise wherever there are possible alternative interpretations of the system and its phenomena, which are all plausible in light of the current knowledge of the system. Although in this paper we refer mainly to a model as an abstract representation of a real system which develops into a mathematical description to be computationally solved, possibly by means of a computer code, we acknowledge, at this point, the existence of physical models as well, which suffer from many of the same problems that will be discussed here and to which many of the conclusions that will be drawn here apply. A natural analogue, for example, could be an adequate physical model of release, flow and transport. A metal coupon could be an adequate physical model for waste package corrosion.

Often the uncertainties in models can be considered as being of three types: 1) conceptual model uncertainty; 2) mathematical model uncertainty; 3) computer code uncertainty.

A conceptual model is a qualitative description of the system with regard to the processes assumed to be occurring, the quantities and parameters that represent these processes, and the spatial and temporal scales of these phenomena. The uncertainties in a conceptual model are typically of the epistemic type in the sense that they stem from imprecise knowledge which can take the form of poor understanding of phenomena that are known to occur in the system, as well as complete ignorance of other phenomena actually occurring. In the taxonomy of uncertainty presented by Beck, 12 this type of uncertainty belongs to a suite of uncertainties which affect the internal description of the system and can lead to erroneous prior assumptions regarding the model structure. Typical examples of this kind of uncertainty occur in a variety of scientific fields such as seismic hazard analysis, groundwater flow and transport, modelling of the biosphere for contaminant transfer, pharmacokinetic modelling, severe nuclear accident phenomenology and econometrics.

Model uncertainty can also arise due to our inability to fully characterize the system itself. This is typical of groundwater flow and contaminant transport problems in which the geology and hydrology of the site itself often cannot be adequately characterized, in the sense that the available information is not sufficient to uniquely determine the exact structure and properties of the soil (site characterization) and, consequently, the actual mechanisms of flow and transport occurring. Within Beck's taxonomy of uncertainty, 12 this source of uncertainty belongs to the group which characterizes the external description of the system and stems from the inherent natural variability of the system itself.

The uncertainty in mathematical models derives from the additional approximations and simplifications introduced in order to translate the qualitative models into tractable mathematical expressions, in an attempt to give as complete a conceivable description of the system as possible. Beck 12 places this type of uncertainty within the internal description of the system and refers to it as errors of aggregation. Some work on the errors of aggregation resulting from the approximation of a three-dimensional spatial continuum by a two-dimensional model representation has been reported by McLaughlin 13 for groundwater systems. Yeh & Yoon 14 report an example of the impact of different parameterization schemes, within the inverse problem of aquifer parameter identification for a two-dimensional unsteady groundwater flow.

The approach typically used in the past for dealing with model uncertainty evaluation is a sort of 'data analytic method' which amounts to specifying a plausible single 'best' choice $S^*$ for the model structure and then proceeding to address the uncertainty in the parameters as if $S^*$ were known to be correct. The field of data analysis, 15 for instance, is devoted to the development of graphical and numerical methods, often based on the examination of residuals from the fit of a single reference model, that facilitate a data-driven search for $S^*$. The main difficulties in this search typically stem from the fact that the selection and evaluation of the model structure require a test of the adequacy of each constituent model hypothesis whereas, unfortunately, this test cannot merely be conducted on a part of the model isolated from the whole.

In a performance assessment, the quantitative evaluation of model uncertainty is strongly affected by the complexity of the system and by the large spatial and temporal scales required by the analysis. The predictive capabilities of the models used cannot be observed over the time frames and spatial scales to which they are required to apply. As a consequence, the analyst cannot obtain empirical confirmation of the validity of a model from observations. 16,17 The evaluation of the models must then rely on the subjective interpretation of the information available at the time of the analysis. This leads to the conclusion that any attempt to address the issue of model uncertainty in a quantitative manner will rely on the use of expert judgment.

1.3 Expert judgment

The use of expert judgment in the evaluation of risks to individuals and society from large and complex technological systems is not a new subject to the scientific community. There have been many cases in risk assessment in which expert judgment was used to properly characterize uncertainties where statistical or experimental evidence was insufficient and the remaining uncertainties were, therefore, very broad. The quality of an assessment involving expert judgment can be controlled through the proper application of formal procedures for the elicitation of these judgments and thorough documentation. Formal expert judgment processes have been proposed 18,19 which employ explicit protocols to define clearly the issues at stake and the nature of the required judgments, to make transparent the assumptions and reasoning process that lead to the judgment, to identify the inherent uncertainties and to eliminate or at least mitigate possible biases. In addition, the extensive documentation of the process allows others to review and understand it.

Relevant contributions to the formal use of expert judgment in the performance assessment of high-level radioactive waste disposals can be found in the work of Bonano et al. 20 and Bonano & Apostolakis, 21 with reference to both theoretical and practical issues. In the area of model development, Thorne 22 describes a study in which formal expert judgment procedures were used to construct an overall model of the phenomena and factors to be considered in the disposal of low and intermediate level solid radioactive wastes in a well-defined facility at a specified site.

There are many studies in the literature that document cases in which it turns out that the experts underestimated the relevant uncertainties or missed the actual evolution of events. A good example of such a situation arose in the Energy Modelling Forum which assembled a group of 43 economists and energy experts with the goal of forecasting world oil prices from 1981 to 2020 to aid in policy planning. 23 The group considered 10 leading econometric models and 12 future scenarios based on assumptions about supply, demand and growth rates of relevant quantities. A 'reference' scenario was chosen as 'representative of the general trends that might be expected'. A retrospective analysis revealed that the uncertainty bands based on this scenario were too narrow to include the true evolution of the oil price that actually occurred. This was due mainly to the fact that scenario uncertainty was not fully assessed and propagated, as shown in the analysis done by Draper. 24

1.4 Objective of this work

The overall objective of this paper is to contribute to the structured assessment of model uncertainty by experts. In Section 2 we present a framework for the formal elicitation of expert opinion based on the concept of Technical Facilitator/Integrator (TFI) recently proposed by the Senior Seismic Hazard Analysis Committee. 25 In the following two sections we approach the problem of assessing model uncertainty from two distinct directions, corresponding to two different theoretical formulations, both approaches involving the idea of expansion: model set expansion in one case, to create a set of models based on plausible alternate hypotheses; and prediction expansion, in the second case, so as to allow for adjustments directly on the predictions of a 'best' model that account for the uncertainty in the model structure. The classification of uncertainty into aleatory and epistemic will be shown to play a central role in the development and application of both formulations.

In Section 5 we discuss how the process of formal expert judgment elicitation is affected by the proposed methods of model uncertainty evaluation. Within the TFI framework, the theoretical approaches proposed for the treatment of model uncertainty are seen as tools providing insights useful in the formulation of the experts' opinions, as well as the elicitation process itself. In this context, the key role of the distinction between aleatory and epistemic uncertainty will be stressed. In Section 6, the mathematical framework provided by the two theoretical approaches is applied to a case study involving alternative models for the description of groundwater flow and contaminant transport in unsaturated, fractured tuff. 26 Finally, Section 7 offers several conclusions.

2 THE TECHNICAL FACILITATOR/INTEGRATOR (TFI) APPROACH

In our work, the process of eliciting the opinions of experts is considered within a framework based on the concept of the Technical Facilitator/Integrator (TFI). 25 The motivation behind this approach to expert judgment is contained in the definition of the TFI as 'a single entity who has the responsibility and is empowered to represent the composite state of information regarding a technical issue of the scientific community'. This composite state of information is achieved through a protocol which adopts a hybrid scheme in an attempt to exploit many of the best attributes of existing mathematical and 'behavioral', i.e., judgmental approaches to multi-expert elicitation processes.

In accord with the point of view taken by Kaplan, 27 the experts are regarded not as proponents of a specific viewpoint but rather as informed evaluators of the available, plausible, models. They also provide advice to the TFI on the appropriate representation of the composite position of the community as a whole. In this phase of the process, the TFI functions as the leader of the team of experts who work together towards the development of a combined representation of the knowledge of the group.

The TFI process is centred on the precept of a thorough and well-documented expert interaction that takes place over a series of carefully structured meetings. In contrast with the classical role of experts as individuals providing distinct judgments, the experts are viewed as a team working together, under the guidance of the TFI, to attain a composite representation of the knowledge of the group and of the community at large. The process is completely transparent to the experts at all stages. The TFI conducts individual elicitations and group interactions, and together with the experts, interprets and integrates data and models to arrive at a full probabilistic characterization of the scientific issue under investigation. Together with the experts, the TFI 'owns' the study and defends it, as appropriate and necessary.

The TFI must have the stature and expertise to deal authoritatively with the multiplicity of disciplines and individuals involved. It is therefore reasonable to anticipate that the TFI will consist of a small group of individuals, including at least one who should be a specialist on 'substantive' knowledge of the subject, and one who should be a normative expert with extensive experience in individual and multi-expert elicitation processes, as well as in decision analysis and probability theory.

In general, the TFI approaches the aggregation of the various judgments behaviourally, without employing any pre-specified combination formulas, but, rather, using the available mathematical schemes of aggregation to perform sensitivity analyses. Most behavioral schemes are centred around some type of consensus building process in which the group, through either a structured or unstructured interaction, discusses the issue, exchanges information and attempts to reach a consensus result. Within these activities, the proper role of mathematical models is envisioned to be a supporting one: the TFI utilizes these models to investigate the implications of various assumptions, discusses the results obtained and accordingly modifies the consensus representation so that the ultimate aggregation will be sound and defensible. It is to these investigations that this paper contributes by developing formulations of model uncertainty that the TFI can utilize in eliciting and interpreting expert judgments.
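
As one illustration of a mathematical aggregation scheme used purely for sensitivity analysis, the sketch below applies a weighted linear opinion pool to hypothetical expert probabilities over three candidate models and reports how the pooled result moves as the weights change. The linear pool is a commonly used scheme chosen here only for illustration; the text above does not prescribe it, and the expert values, trial weightings and function names are assumptions.

```python
def linear_pool(expert_probs, weights):
    """Weighted average of the experts' probability vectors (linear opinion pool)."""
    total = float(sum(weights))
    n_models = len(expert_probs[0])
    return [sum(w * probs[k] for w, probs in zip(weights, expert_probs)) / total
            for k in range(n_models)]

def pooled_range(expert_probs, weight_sets):
    """Min and max pooled probability of each model over the trial weightings."""
    pooled = [linear_pool(expert_probs, w) for w in weight_sets]
    n_models = len(expert_probs[0])
    return [(min(p[k] for p in pooled), max(p[k] for p in pooled))
            for k in range(n_models)]

# Hypothetical elicited probabilities p(S_1), p(S_2), p(S_3) from three experts.
EXPERT_PROBS = [
    [0.6, 0.3, 0.1],
    [0.4, 0.4, 0.2],
    [0.2, 0.5, 0.3],
]
# Trial weightings: equal weights, then each expert emphasized in turn.
WEIGHT_SETS = [[1, 1, 1], [2, 1, 1], [1, 2, 1], [1, 1, 2]]

if __name__ == "__main__":
    ranges = pooled_range(EXPERT_PROBS, WEIGHT_SETS)
    for k, (low, high) in enumerate(ranges, start=1):
        print(f"p(S_{k}) ranges from {low:.3f} to {high:.3f}")
```

In a sketch of this kind, a wide range for some model would suggest that the composite representation is sensitive to how the experts are weighted, which is the sort of insight the facilitator could bring back to the group for discussion.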

3 EXPANDING THE MODEL SET: THE ALTERNATE HYPOTHESES APPROACH

A predictive model provides a tool for expressing the uncertainty associated with an unknown quantity $y$ as a function of known quantities $x$. For simplicity of notation, we shall omit the vector sign on these quantities. The model $M$ formalizing the assumptions about how $x$ and $y$ are related typically consists of a set of hypotheses which define its structure $S$ (e.g., hypotheses about the physical relationship between $x$ and $y$, hypotheses about the form of the appropriate mathematical expression describing this relationship and the numerical scheme to be employed for its practical implementation) and a vector of parameters $\underline{\theta}$ which are embedded in the chosen structure.

Theoretically, a straightforward approach to a complete quantitative assessment and propagation of model uncertainty consists of considering a set of $N_S$ models $\{M_i\}$, where each model $M_i = (S_i, \underline{\theta}_i)$ consists of an alternate structure $S_i$ based on alternate hypotheses which are plausible in light of the existing information, and the associated parameter vector $\underline{\theta}_i$ [in the $S^*$-approach the set $\{M_i\}$ consists of just one single best model $M^* = (S^*, \underline{\theta}^*)$]. Each model in the set provides a different description of the unknown quantity $y$, for example the hydraulic head distribution in an aquifer. Let us assume that the predictive description of $y$ provided by the models is in the form of aleatory distributions of the form:

$$F_i(y) \equiv F(y \mid M_i) = F(y \mid S_i, \underline{\theta}_i), \quad i = 1, 2, \ldots, N_S \tag{1}$$

which are conditional on the structure of the model as well as on the values of the parameters (for simplicity of notation the dependence on the known information $x$ is not explicitly indicated). The family of distributions $\{F_i(y)\}$, which properly represents the uncertainty in the unknown $y$ due to uncertainty in the models' structure and parameters' values, can be probabilistically combined in a summary measure by means of a standard Bayesian approach. The joint density function $\pi(S_i, \underline{\theta}_i)$ which expresses the analyst's beliefs regarding the numerical values of the parameters and the physical validity of the model hypotheses can be expressed as

$$\pi(S_i, \underline{\theta}_i) = \pi_i(\underline{\theta}_i \mid S_i)\, p(S_i) \tag{2}$$

where $\pi_i(\underline{\theta}_i \mid S_i)$ is the epistemic probability density function (pdf) of the vector of model parameters $\underline{\theta}_i$, conditional upon the choice of model structure $S_i$, and $p(S_i)$ is the epistemic probability which expresses the analyst's confidence in the set of assumptions underpinning the model. We can now assess the unconditional aleatory distribution $F(y)$ in the form of the standard Bayesian estimator

$$F(y) = \sum_{i=1}^{N_S} \int F_i(y)\, \pi_i(\underline{\theta}_i \mid S_i)\, d\underline{\theta}_i \; p(S_i) \tag{3}$$

where $\sum_{i=1}^{N_S} p(S_i) = 1$ for normalization. Note that, while the sample space for the parameters' values can be considered continuous, the model set is usually regarded as discrete.

This average value has often encountered objections when employed in a decision-making environment. The argument is that the decision maker should be aware of the full epistemic uncertainties that $\pi_i$ and $p$ represent. The average value can lead to erroneous decisions, particularly when the epistemic uncertainty is very large, since the average can be greatly affected by high values of the variable even though they may be very unlikely. In general, we agree that the entire distribution of the uncertainties should be presented to the decision-maker who may then choose his own summary measure upon which to base the decisions. Furthermore, we point out that an important contribution to the decision making process should come from presenting the decision maker with a sufficiently defined breakdown of how much of the uncertainty is of the aleatory type and how much is of the epistemic type. The averaging process affects only the epistemic uncertainties and may somewhat obscure their role and significance, whereas the aleatory uncertainties are left untouched.

The alternate hypotheses approach, just presented, is based on two fundamental assumptions: mutual exclusiveness and collective exhaustiveness of the set of models. While the first assumption can often be accepted in practice, the second one is harder to accept for it requires that a perfect model not only exist but that it also be one of the $N_S$ models considered. In general, the complexity of the phenomena is such that the list of plausible models considered is necessarily incomplete. Moreover, progress in understanding the physical laws underpinning the process under analysis and the increasing computational capabilities are such that models evolve in time. As pointed out by Veneziano, 28 'an important distinction in decision theory is between cases when the state of uncertainty remains fixed and a decision is made now and cases when the state of uncertainty varies in time and what needs to be defined is a strategy (a rule to make decisions in time, depending on the evolving state of information and on previously made decisions)'. In the latter time-dependent context, failure to realize the temporal dimension can lead to a misguided endorsement of the mean rule for treating epistemic uncertainty.
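
A discretized version of eqn (3) can be sketched as follows. The example assumes two hypothetical model structures with simple closed-form aleatory distributions, an epistemic distribution over each parameter represented by a few weighted grid points (so that the integral becomes a sum), and illustrative values of p(S_i); none of these choices come from the paper. The function `conditional_family` keeps the whole family of conditional distributions with their epistemic weights, so that the average can be reported alongside the epistemic spread rather than in place of it.

```python
import math

def cdf_exponential(y, theta):
    """Hypothetical structure S_1: exponential aleatory CDF with mean theta."""
    return 1.0 - math.exp(-y / theta) if y > 0 else 0.0

def cdf_uniform(y, theta):
    """Hypothetical structure S_2: uniform aleatory CDF on [0, theta]."""
    return min(max(y / theta, 0.0), 1.0)

# Each entry: (aleatory CDF of the structure, p(S_i),
#              discretized pi_i(theta_i | S_i) as (theta, weight) pairs).
# All numbers are illustrative assumptions.
MODELS = [
    (cdf_exponential, 0.7, [(1.0, 0.5), (2.0, 0.5)]),
    (cdf_uniform,     0.3, [(2.0, 0.4), (4.0, 0.6)]),
]

def conditional_family(y):
    """All F(y | S_i, theta_i) values with their joint epistemic weights."""
    return [(p * w, cdf(y, theta))
            for cdf, p, grid in MODELS
            for theta, w in grid]

def averaged_cdf(y):
    """Discretized eqn (3): sum over structures and parameter grid points."""
    return sum(weight * value for weight, value in conditional_family(y))

if __name__ == "__main__":
    y = 2.0
    for weight, value in conditional_family(y):
        print(f"epistemic weight={weight:.2f}  F(y | S_i, theta_i)={value:.4f}")
    print(f"averaged F({y}) = {averaged_cdf(y):.4f}")
```

Printing the individual conditional values next to the average is one simple way of showing the decision maker how much of the spread is epistemic, in the spirit of the remarks above.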
One possible modification of this approach, which addresses the problem of exhaustiveness, demands that $p(S_i)$ be considered as the probability of $S_i$ providing the best approximate description of the process within the objective of the analysis, and then allows for a correction factor in eqn (3) which takes into account the possible existence of other descriptions not considered within the $N_S$ models. In other words, we write

$$F(y) = \sum_{i=1}^{N_S} \int F_i(y)\, \pi_i(\underline{\theta}_i \mid S_i)\, d\underline{\theta}_i \; p(S_i) + \int F^{o}(y)\, \pi^{o}(\underline{\theta}^{o} \mid S^{o})\, d\underline{\theta}^{o} \; p(S^{o}) \tag{4}$$

where the superscript $o$ stands for 'other'. Although this formulation seems to resolve, in principle, the problem of exhaustiveness, it still leaves open many issues from the practical point of view regarding the definition of the alternate hypotheses underpinning $S^{o}$ and the assessment of $F^{o}(y)$, together with the associated probability $p(S^{o})$.

Within the alternate-hypotheses approach, the key issue in improving on the $S^*$-approach is how to specify $p(S_i)$, which in the $S^*$-approach is set equal to a Kronecker-delta function on $S^*$. This choice in most cases is too focused on a single set of structural hypotheses to result in well-calibrated predictions. On the other hand, one might consider a very large set of plausible alternate hypotheses, thus specifying $p(S_i)$ much more diffusely, and then search for some relevant information to update the distribution, hoping that this will reduce the size of the set of models. However, the set of all plausible models is in most cases too large to guarantee the success of this updating, even if the necessary, relevant information becomes available.

To satisfy the constraint of exhaustiveness as well as possible we need to find a procedure which allows the building of a manageable set of plausible models. Two approaches can be taken. The first one, known as discrete model expansion, 29,30 consists of starting with a single 'best' structural choice $S^*$ and expanding it in directions suggested by context, by the data analytic search that led to $S^*$, and by other considerations regarding the weakness of the various hypotheses. In this context, an attractive method of model building with little prior knowledge of the system and few experimental observations of its behaviour is due to Young et al. 31 This approach aims at generating alternate model hypotheses by testing for failure of inadequate hypotheses and speculating about the form of improved ones. The problem of generating some preliminary hypotheses about the possible mechanisms governing qualitatively observed behaviour is addressed by sifting through a set of prior hypotheses and rejecting from further consideration those to which observed behaviour appears to be insensitive.
The approach is appealing and powerful for its simplicity and flexibility: its applicability is essentially independent of the complexity of the model structure. The second approach to the generation of alternate structures {Si} consists of establishing a systematic framework for the construction of the alternate models themselves that considers all possible alternatives at each point, during the construction, when an assumption is to be made. In this approach, the best single model S* is constructed simultaneously with its alternatives and the data analytic search for it occurs only at the end of the process development stage, whereas in the first approach the best model was developed first and then alternatives were generated from it by variations in the direction of its weakest hypotheses. Kerl et al., 32 for example, propose a structured approach to the development of conceptual models for the groundwater flow and contaminant transport based on a systematic decomposition of the

models into alternate, elementary hypotheses organized in a hierarchical tree. The choice of the alternate structures S_i to be actually included in the expanded set is, obviously, highly context specific, but some general guidelines can be given. First, let us emphasize, once again, that the objective of this operation is the achievement of a practical and manageable degree of exhaustiveness of the set of models. The main stimulus driving the inclusion of alternate hypotheses is avoiding the possibility that, after the fact, a set of modelling assumptions different from those originally considered turns out to be correct, as in the case of oil price forecasting. This argues for the choice of a set sufficiently large to encompass the unknown truth. The decision as to when the set is sufficiently large should be based on a preposterior analysis of model assumptions so as to identify all plausible situations which might be found to occur in retrospect and which are seen to significantly impact the results of the analysis. With reference to eqn (3), we should make sure to include in the set all those alternative structures S_i which have large probabilities p(S_i) and whose predictive outcomes F_i(y) differ substantially from those provided by the reference model S*. These ideas may be employed to define the directions of departure from S* that are most relevant for the development of an appropriately expanded set of models in which exhaustiveness is satisfied as well as we can. Notwithstanding these difficulties, this approach, in the form of eqn (3), has been used in the past in various fields of application, supported by a somewhat implicit assumption of exhaustiveness. In the medical field, for example, Evans 33 presents an analysis of model uncertainty associated with an investigation of the carcinogenic potency of chloroform.
Once again, the uncertainty arises largely as a result of fundamental limitations in our understanding of the mechanisms through which chemicals induce cancer. This ignorance prevents us from defining appropriate measures of biologically effective dose and limits our ability to clearly specify the functional relationship between dose and response. In this study, model uncertainty was addressed by showing alternate hypotheses in a tree structure and formally eliciting expert judgments.
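The weighting of alternate hypotheses can be sketched numerically. The following is a minimal illustration, assuming that eqn (3) takes the usual mixture form in which the predictive distribution is the sum of the per-model distributions F_i(y) weighted by p(S_i). The three normal predictive distributions and their probabilities are hypothetical and serve only to show how the S*-approach (all weight on one structure) differs from an expanded model set.

```python
from statistics import NormalDist

def mixture_cdf(y, weights, cdfs):
    """Probability-weighted average of the per-model CDFs evaluated at y."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("model probabilities must sum to 1 (exhaustive set)")
    return sum(w * cdf(y) for w, cdf in zip(weights, cdfs))

# Hypothetical predictive distributions for y under three alternate structures.
models = [NormalDist(10.0, 1.0), NormalDist(12.0, 2.0), NormalDist(20.0, 3.0)]
p = [0.6, 0.3, 0.1]  # hypothetical p(S_i); the first model plays the role of S*

cdfs = [m.cdf for m in models]
single_best = models[0].cdf(14.0)       # S*-approach: P(y <= 14) from S* alone
averaged = mixture_cdf(14.0, p, cdfs)   # expanded set of alternate hypotheses

print(f"S* only: {single_best:.4f}   weighted set: {averaged:.4f}")
```

In this hypothetical case the single best model is nearly certain that y does not exceed 14, while the weighted set leaves a noticeably larger probability of exceedance, which is the kind of hedging the expanded set is meant to provide.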

4 THE ADJUSTMENT-FACTOR APPROACH

As mentioned in the introduction, it is possible to take a second direction in improving the uncertainty assessment provided by the single best model (S*) approach, namely to introduce an 'adjustment' directly on the predictions y* of the model to account for the uncertainty associated with it. To formalize this situation, we introduce a factor E*, which may be additive (E*_a) or multiplicative (E*_m), so that our assessed value for the unknown quantity y can be written as

$$y = y^{*} + E^{*}_{a} \quad \text{(additive)} \tag{5a}$$

$$y = y^{*} \, E^{*}_{m} \quad \text{(multiplicative).} \tag{5b}$$

The factor E* is, in general, unknown and the uncertainty associated with it can be represented in the form of a distribution g(E*). It is important to note that the uncertainty in E* could be of the epistemic type only or both epistemic and aleatory. In the first case, E* simply represents the systematic bias of the model prediction and the uncertainty in its numerical value is strictly related to our lack of knowledge which could, in principle, be eliminated with a single observation of y. In the second case, the model bias E* itself exhibits aleatory variability, due to some random effects which have been neglected in the model. In this case, a single observation cannot eliminate the uncertainty; a sequence of observations, repeated under apparently identical conditions, would lead to different values. E* is then described by an aleatory distribution whose parameters are uncertain due to the epistemic uncertainty in their values. The general formulation of model uncertainty provided by eqn (5) has also been employed in various contexts. In the fire risk assessment area, the actual time to damage T_D of an object can be considered to be the product of its deterministic reference model (drm) prediction, T_D,drm, and an adjustment factor E* which accounts for the inadequacy of the drm, viz. T_D = T_D,drm E*. In earlier applications, this factor was assumed to have an epistemic probability distribution function reflecting the analysts' uncertainty regarding the amount of systematic over- or underestimation of the damage time by the deterministic reference model. 34 In a subsequent analysis, it was recognized that E* itself may be an aleatory variable and its aleatory distribution was introduced in the form of a lognormal distribution with parameters μ and σ. 35 The epistemic uncertainty is, in this case, described by a pdf over the parameter vector (μ, σ), and it can be updated using Bayes' theorem.
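The two-level structure described for the fire risk case (an aleatory lognormal E* whose parameters carry epistemic uncertainty) can be sketched as follows. This is only an illustration under stated assumptions: the discrete grid over (μ, σ), the flat prior, the observed ratios and the drm prediction are all hypothetical, and a grid update is used here simply because it makes Bayes' theorem explicit.

```python
import math

def lognormal_pdf(x, mu, sigma):
    """Density at x of a lognormal variable whose logarithm is N(mu, sigma^2)."""
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))

# Hypothetical discrete grid standing in for the epistemic pdf over (mu, sigma).
grid = [(mu, sigma) for mu in (-0.5, 0.0, 0.5) for sigma in (0.3, 0.6)]
prior = {point: 1.0 / len(grid) for point in grid}  # flat prior: an assumption

# Hypothetical observed ratios E* = (actual damage time) / (drm prediction).
observed = [1.4, 1.8, 1.2, 1.6]

# Bayes' theorem on the grid: posterior is proportional to prior x likelihood.
unnormalised = {
    point: prior[point] * math.prod(lognormal_pdf(e, *point) for e in observed)
    for point in grid
}
total = sum(unnormalised.values())
posterior = {point: value / total for point, value in unnormalised.items()}

most_probable = max(posterior, key=posterior.get)

# Posterior-weighted mean of E*, using E[E* | mu, sigma] = exp(mu + sigma^2 / 2).
mean_factor = sum(
    w * math.exp(mu + 0.5 * sigma**2) for (mu, sigma), w in posterior.items()
)

t_drm = 10.0                      # hypothetical drm damage-time prediction
t_adjusted = t_drm * mean_factor  # multiplicative adjustment of the prediction
```

With these hypothetical observations the posterior concentrates on the grid point with the largest μ and the smallest σ, so the adjusted damage time is longer than the drm prediction.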
In seismic risk analysis, a similar formulation is applied to the estimation of the model uncertainty of predicted ground motion for future earthquakes. 36 In this case, the data coming from recorded earthquakes can be exploited to quantify the goodness-of-fit of simulations of the recorded earthquakes by considering the differences in the response spectra of the observed and simulated ground motions. The natural logarithm of the average horizontal spectral acceleration, ln SA_ij, for the j-th station and the i-th earthquake, is given by

$$\ln SA^{o}_{ij}(f) = \ln SA^{c}_{ij}(f) + \mu(f) + \varepsilon_{ij}(f) \tag{6}$$

where f is the frequency, superscripts o and c refer to observed and calculated quantities, μ(f) is the model bias and ε_ij(f) is the error term, assumed to be a normally distributed random variable with zero mean and variance equal to σ²(f). In this case, SA^c_ij(f) represents the output of the model and it is corrected by a multiplicative factor E*_m,ij(f) = exp[μ(f) + ε_ij(f)], to account for the inaccuracy of the model description. The adjustment factor E*_m(f) contains both aleatory and epistemic uncertainty, since it is intended to account for 'model' uncertainties (differences between the actual physical process and the numerical simulation), as well as 'state-of-knowledge' uncertainties (detailed aspects of the earthquake source and wave propagation that cannot be modeled deterministically on the basis of the current state of knowledge). In this particular application, the availability of the data allows one to resort to classical statistical techniques for the assessment of model, as well as parametric, uncertainty. In a more general case, the assessment of these uncertainties will result from all available data, including information provided as experts' opinions, especially with regard to model uncertainty.
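The classical estimation mentioned here can be illustrated with a short sketch for a single frequency. The observed and calculated spectral accelerations below are hypothetical, and the sample mean and sample variance of the log residuals are used as simple estimators of the bias and of the variance of the error term; the cited study may have used a different estimation procedure.

```python
import math
from statistics import mean, variance

# Hypothetical (observed, calculated) spectral accelerations at one frequency,
# one pair per station/earthquake record.
pairs = [(0.30, 0.25), (0.12, 0.15), (0.45, 0.40), (0.20, 0.16), (0.08, 0.09)]

# Residuals of eqn (6): ln SA_observed - ln SA_calculated = mu + eps_ij.
residuals = [math.log(obs) - math.log(calc) for obs, calc in pairs]

mu_hat = mean(residuals)       # estimate of the model bias at this frequency
var_hat = variance(residuals)  # sample variance of the error term

# Multiplicative adjustment factor implied by each record, exp[mu + eps_ij].
factors = [math.exp(r) for r in residuals]

print(f"bias = {mu_hat:.3f}, variance = {var_hat:.4f}")
```

By construction, multiplying each calculated value by its factor recovers the observed value; the spread of the factors across records reflects the aleatory part, while the finite sample leaves epistemic uncertainty in the estimated bias and variance.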

5 EXPERT JUDGMENT REGARDING MODEL UNCERTAINTY

5.1 A structured approach

As mentioned previously, the process of eliciting the opinions of the experts is considered here within the framework that is based on the concept of the Technical Facilitator/Integrator (TFI). 25 This framework adopts the seven major steps suggested by Keeney & von Winterfeldt for the formal use of expert judgment: 1) Identification and selection of the issues; 2) Identification and selection of the experts; 3) Discussion and refinement of the issues; 4) Training for elicitation; 5) Elicitation; 6) Analysis, aggregation and resolution of disagreements; 7) Documentation and communication. In the present section, we illustrate how the preceding theoretical formulations for the treatment of model uncertainty can be of help in the process of expert judgment elicitation and how they can be exploited within the hybrid scheme of mathematical and behavioral tools to be used by the TFI. The discussion will be limited to those steps of the process which are directly affected by these formulations.

5.2 Discussion and refinement of the issues

The main goal of the discussion and refinement of the issues (step 3) is to ensure that there exists a common understanding of the issue being addressed and that

the experts would be responding to the same elicitation questions. A thorough explanation of the distinction between parameter and model uncertainty as well as between aleatory and epistemic uncertainty is necessary at this point of the process. Too many times in practice, failure to define these concepts clearly has led to confusion throughout the entire elicitation process with consequent negative effects on the results of the analysis. In the NUREG-1150 study of the Peach Bottom 2 reactor, for example, the issue of containment failure pressure is addressed by asking a group of experts to provide their judgments. 37 Five values of pressure are considered, with the implication that one of these will actually turn out to be the true one but, at this time, we do not know which value it will be. The phenomenon is, therefore, considered in a deterministic way with no aleatory uncertainty. Actually, it is stated in Appendix A of the report that there might be some randomness about each value and that 'there was a great deal of discussion concerning this issue due to the difficulties in defining the meaning of the failure pressure distributions derived for this issue. Each reviewer had a somewhat different interpretation of the input that was being required, as well as of the use of the input in the Limited Latin Hypercube sensitivity analysis'. This means that the experts debated the validity of the assumption that the model of the process was deterministic (that only one value was the true failure pressure). It was finally decided that the aleatory variability was 'generally small' and it was dropped from the subsequent analysis, which was, thus, concerned solely with epistemic uncertainty. We note that this is an important assumption that affects the entire elicitation process.
If the group had decided to include aleatory uncertainty in the model, a different question would have been asked of the experts: 'What is the fraction of times that failure occurs at each of these pressures and what is your uncertainty about this fraction?' Instead, the results obtained in the analysis are the responses to a completely different question, i.e., 'What are your probabilities that the true failure pressure will be one of the five values considered?'. A similar situation is discussed by Parry in his paper on the interpretation of containment-event-tree probabilities. These probabilities are associated with branch points of containment event trees that refer to the occurrence or not of various physical phenomena and their interpretation is not necessarily consistent with those of the core melt event tree branch points. Parry's argument is that the latter probabilities are all interpretable as relative frequencies within an aleatory model, while the former are to be interpreted as measures of belief in the different values of an unknown quantity, thus expressing epistemic uncertainty.
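The difference between the two questions can be made concrete with a small sketch. Everything in it is hypothetical: the five pressures are left as placeholders, the degrees of belief are invented, and a Dirichlet distribution is assumed here merely as one convenient way of expressing epistemic uncertainty about aleatory fractions.

```python
import random

labels = ["p1", "p2", "p3", "p4", "p5"]  # placeholders for the five pressures

# Question A (epistemic only): exactly one pressure is the true value, and the
# expert states a degree of belief for each candidate. Hypothetical numbers.
belief_true_value = [0.05, 0.20, 0.40, 0.25, 0.10]

# Question B (aleatory + epistemic): the failure pressure varies from case to
# case, so the unknowns are the fractions f_k of failures at each pressure.
def sample_fractions(alpha, rng):
    """Draw one fraction vector from a Dirichlet(alpha) via normalised gammas."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

rng = random.Random(0)
alpha = [1.0, 4.0, 8.0, 5.0, 2.0]  # hypothetical Dirichlet parameters
samples = [sample_fractions(alpha, rng) for _ in range(2000)]

# Epistemic spread of the fraction of failures occurring at the third pressure.
third = sorted(s[2] for s in samples)
interval_90 = (third[int(0.05 * len(third))], third[int(0.95 * len(third))])
```

Under question A the answer is a single probability vector; under question B the answer is a distribution over such vectors, summarised here by a 90% interval for one of the fractions.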

The distinction between model and parameter uncertainty is also believed to be a key point of discussion, in order to avoid misinterpretations. In particular, the TFI should guide the experts to acknowledge the fact that the distributions used to depict the uncertainty in the parameters cannot be used to express model uncertainty as well. As a typical example, consider the exponential distribution for the time to failure of a component. Often, engineers tend to express the uncertainty in the failure rate with a distribution, typically lognormal, over a range of possible values that is deliberately stretched on the high tail so as to account for the higher values of the failure rate under abnormal (accident) conditions. In this case, the experts are trying to capture different operating conditions with the distribution of the uncertainty in the parameter. However, this is conceptually incorrect, since, under perfect knowledge, the distribution for the parameter would reduce to a delta function around the unique true value of the failure rate, which applies either to the normal or to the abnormal conditions, but not both. The proper treatment of the two different operating conditions would require consideration of two models for the failure rate, one for normal and one for abnormal conditions, each having its own set of parameters. Besides these fundamental points, during the session on the discussion and refinement of issues the TFI should facilitate and solicit interactions on various concerns specific to the approach chosen for the treatment of model uncertainty. Depending on the approach employed for the quantification and representation of model uncertainty, the meaning and interpretation of the various quantities involved should be thoroughly discussed and understood.
In the alternate-hypotheses approach, a key point is expected to be the interpretation of the quantity p(M_i) as the probability that model M_i is correct, or that its underpinning assumptions are valid, which has raised much controversy from both a philosophical and a practical point of view. In many practical cases that occur within a performance assessment of high-level radioactive waste repositories, the impossibility of performing a complete model validation, in its strict scientific sense, demands a different approach to the interpretation of the quantity p(M_i). Although other interpretations have been proposed, two fundamental issues must be pointed out. First, the interpretation of probability as a degree of belief should be made clear to the experts. This basically calls for accepting the primitive notion of likelihood, thus using probabilities simply as numerical measures of our beliefs regarding the likelihood of a certain event. Second, as mentioned earlier, it is important to address the fact that the conditions under which we can declare that a model provides a good approximation to the real behaviour


of the quantity of interest depend necessarily on the objectives of the particular analysis and the corresponding level of precision required, so that one should re-write the model probabilities as p(M_i | O), where the objectives O are expressed explicitly or, as is often the case, implicitly ('everyone knows about them'). The two issues just pointed out should be exploited by the TFI and by the experts to provide insights into the meaning to be given to p(M_i) and its assessment in practice. Within the alternate-hypotheses approach, the set of models to be considered in the analysis is expected to be a major object of debate. Questions regarding the mutual exclusiveness and collective exhaustiveness of such a set will be addressed, as these are two basic assumptions in the approach. The systematic methods of model set expansion, presented in Section 3, will be of great value in addressing these issues and should provide the structural basis for an analysis that will be as complete as possible. Furthermore, the alternate-hypotheses approach suggests a natural way of investigation in the form of a decomposition of the problem into more fundamental assumptions on the process to be described. The question of which decomposition to use would obviously arise. The TFI could suggest a common scheme of decomposition to be adopted by all experts, or the experts could be left free to provide their own, or they could be involved by the TFI in an open forum discussion to decide on a common, consensus decomposition. In general, a common scheme would be appealing because it would provide uniformity to the elicitation step and it would be a valuable aid in the following phase of analysis and aggregation of expert judgments. On the other hand, it could introduce an early bias in the analysis. This type of decision will depend mainly on the specific case at hand.
Problem decomposition improves the quality of the assessment by structuring the analysis so that the expert is required to make a series of simpler assessments rather than a single complex one. The improvements are mainly due to the fact that the experts respond to less difficult questions. Under the guidance of the TFI, the experts must also make their reasoning explicit, which forces deeper introspection into the assumptions of the analysis and requires formal consideration of alternatives that might otherwise be ignored. Decomposition also provides a sort of self-documentation since the expert's thought process is made explicit in the decomposition. Concerning the adjustment-factor approach, the first issue which requires discussion with the experts is the appropriateness of the model chosen for reference. If only one model is available, then this point is of no concern although a brief discussion might still be worthwhile to ensure that that is the actual case. On the other hand, if a suite of models is


available, then a detailed presentation and thorough discussion of the assumptions and simplifications underpinning each model is necessary. Obviously, consensus on which model to adopt would greatly ease the subsequent steps of the elicitation process, but this should not be a compelling objective and cases may very well arise in which experts decide to rely on different models as references. The second key issue to be addressed concerns the nature of the distribution g(E*). As mentioned earlier, the uncertainty in E* could be of the epistemic type only or both epistemic and aleatory. For this reason, the discussion on aleatory and epistemic uncertainty, previously emphasized, gains particular importance within this formulation. A clear understanding of which type of uncertainty this distribution should be addressing is fundamental to guarantee the proper formulation of the questions to be asked of the experts and the interpretation of their responses. We expect that the completion of the discussion and refinement of the issues will require more than one session with the experts. During the period of time between meetings the experts will have an opportunity to examine and interpret the available information regarding the models as well as data from the actual repository site and from analogue sites exhibiting similar behaviour.

5.3 Training for elicitation

The next step in the formal elicitation process is the training of experts, which is carried out by the normative experts. The basic premise is that the substantive experts, i.e., experts on the relevant physical sciences, are not necessarily good assessors of probability distributions that reflect their true state of knowledge and belief. During this phase, the normative expert should make sure that the basic concepts of model uncertainty and the distinction with parameter uncertainty are clear to the substantive experts. Similarly, the ideas of aleatory and epistemic uncertainty should be reviewed. These concepts will have already been discussed with the experts, since they are needed during the phase of refinement of the issues. Therefore, training will probably have to take place in two stages. In the adjustment-factor approach, the training phase seems to be more standard in that the experts are required to express their beliefs in terms of uncertainty distributions. Besides the subjectivist interpretation of probability, the TFI should also address, by means of examples, the logical connection between a probability curve and the evidence that supports it, i.e., the meaning and use of Bayes' theorem.

5.4 Elicitation

The elicitation phase of the process is obviously affected in a strong way by the formulation adopted to represent model uncertainty. Within the alternate-hypotheses approach, the elicitation sessions would serve the purpose of obtaining the decomposition and quantitative assessment of the experts who provide their judgments regarding the validity of the basic alternate hypotheses in the model. The questions asked would be of the type 'Taking into consideration the objectives of the analysis, what is your probability that the valid hypothesis regarding the process under analysis will be one of those considered in the set of alternatives?', 'What is the probability value that best represents the confidence you would place on the considered hypothesis as giving an adequate description of the process, given the objectives of the analysis?' The decomposition itself facilitates the explanation of each expert's thinking and the motivation behind the elicited values. If these values are elicited with respect to individual hypotheses constituting the alternate models, they should be recomposed to give the aggregate probability values representing the confidence levels that the expert places on the alternate models. The probability network scheme proposed by Kerl et al. 32 could serve this purpose. The TFI then presents the results of this aggregation to the experts (perhaps in the form of simple graphical representations), who are then free to revise the individual entries appropriately if they believe it necessary. The effects of the changes should be readily displayed by the TFI, discussed with the experts and justified. In the adjustment-factor approach, the experts are required to provide their assessed distribution g(E*). The TFI must make sure that the elicitation is performed in such a way that the expert's thinking and motivation are made clear and explicitly reported.
This is a detail to which particular attention must be paid because, while in the alternate-hypotheses approach the expert's reasoning can be more easily followed, as it is somewhat framed within the structure offered by the decomposition in alternate hypotheses, this structure is generally lacking here. Furthermore, although one would expect the expert to have some sort of structure in his or her mind, it might not be made explicit and oftentimes he or she might be using unconscious, unstructured motivations which affect his or her assessment. Notice that, even though this lack of structure may be seen as a drawback from this point of view, it could very well be thought of as an advantage of the approach from another point of view, because it gives the method a high degree of flexibility and the expert is by no means constrained in the expression of his or her beliefs.


5.5 Analysis, aggregation, and resolution of disagreements

The last step of the formal elicitation process is the aggregation of the subjective assessments. This is often a controversial subject. Excellent reviews of the various aggregation methods used in practice can be found in Refs 40-42 and therefore the methods themselves will not be discussed in detail here. As discussed in Section 2, the TFI aggregates the expert judgments after extensive interactions with the experts. If this process does not lead to consensus regarding the composite distribution(s), the TFI proceeds to develop this distribution behaviourally. The mathematical aggregation models that have been proposed in the literature are used as tools for sensitivity studies that provide insights to the experts and the TFI concerning the sensitivity of the aggregation process to alternate plausible assumptions, e.g., the degree to which expert judgments are correlated. The aggregation step is also influenced by the choice of the mathematical formulation used to represent the issue of model uncertainty. In the alternate-hypotheses approach, if a common decomposition scheme is used by all experts and the probability values are assessed with regard to individual, constitutive hypotheses, the TFI might choose to aggregate the various expert judgments at the hypotheses level and then an aggregated probability value for each alternate model could be obtained, for instance by means of a probability network as proposed by Kerl et al. 32 If, on the other hand, different decomposition schemes are used by the experts, then the assessments of each of them must be treated separately by the TFI to obtain the discrete probability distribution expressing his or her confidence in the alternate models.
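Aggregation at the hypothesis level followed by recomposition into model probabilities can be sketched as below. This is not the probability network of Kerl et al.; it is a deliberately simple stand-in that assumes an equal-weight linear opinion pool and independence between the elementary hypotheses. The expert names, hypothesis names and elicited values are hypothetical.

```python
from itertools import product

# Hypothetical elicited probabilities that each elementary hypothesis holds.
elicited = {
    "expert_A": {"fracture_flow": 0.7, "matrix_diffusion": 0.9},
    "expert_B": {"fracture_flow": 0.5, "matrix_diffusion": 0.8},
    "expert_C": {"fracture_flow": 0.6, "matrix_diffusion": 0.7},
}

def pool(elicited):
    """Equal-weight linear opinion pool, hypothesis by hypothesis."""
    names = next(iter(elicited.values())).keys()
    n = len(elicited)
    return {h: sum(e[h] for e in elicited.values()) / n for h in names}

def model_probabilities(pooled):
    """Probability of every true/false combination, assuming independence."""
    names = list(pooled)
    out = {}
    for truth in product([True, False], repeat=len(names)):
        p = 1.0
        for name, holds in zip(names, truth):
            p *= pooled[name] if holds else 1.0 - pooled[name]
        out[truth] = p
    return out

pooled = pool(elicited)               # aggregated hypothesis probabilities
models = model_probabilities(pooled)  # one entry per alternate model

for truth, p in models.items():
    print(dict(zip(pooled, truth)), round(p, 3))
```

Because every combination of hypotheses is enumerated, the recomposed model probabilities sum to one. Used as a sensitivity tool, the pool weights or the independence assumption can be varied to see how strongly the model probabilities respond.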

6 CASE STUDY: ALTERNATE MODELS OF GROUNDWATER FLOW AND CONTAMINANT TRANSPORT IN UNSATURATED, FRACTURED TUFF

Previous work 26,43 has presented the evaluation of several conceptual models for groundwater flow and solute transport at a high-level waste repository site in unsaturated, fractured tuff formations. These analyses have been mainly concerned with the fundamental assumptions made in the construction of the various models and their possible implications on the overall performance assessment modelling. In particular, Gallegos et al. 26 address the problem of conceptual model uncertainty by taking six alternate models and graphically comparing the various flow and transport results obtained. In general, it has been recognized


that, given the available information, it is not possible to develop a single conceptual model for such a system, but rather several plausible alternatives exist. In this section, we re-examine the six conceptual models presented by the authors, with the objective of assessing quantitatively the uncertainty associated with the predictions, within the frameworks presented in Sections 3 and 4. A brief description of the six models is in order and is provided in this section. For further details on the specific models, the reader should consult Ref. 26. The six models considered are based on a set of common assumptions and simplifications but differ by some fundamental hypotheses on the flow and transport mechanisms in the system. In particular, four different models for the groundwater flow have been considered, and when these are combined with the assumptions for the solute transport, they give rise to the six different models. All of the models are based on the same system geometry. A spatially uniformly distributed flux at the repository horizon equal to 0.1 mm/yr is assumed as the boundary condition for all models. Steady-state groundwater flow through a one-dimensional system, occurring from the base of the repository to the water table, was also assumed. Fifteen hydrogeologic layers in the vertical direction were used to describe the hydraulic and geologic properties in five of the six models. The sixth model was based on an equivalent four-layer stratigraphy based on average properties. Four radionuclide species, Tc-99, I-129, Cs-135 and Np-237, are assumed to be released from the repository and transported as solute. The transport model is based on the assumption of a single, dominant, non-branching transport path. Radionuclide retardation factors are based on the distribution coefficient K_d for each radionuclide. No other chemical reactions are included. Gaseous phase transport is not included either.
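The role of the distribution coefficient can be illustrated with a short sketch. It assumes the standard linear-sorption expression R = 1 + ρ_b K_d / θ for the retardation factor, which may differ in detail from the expression used in the cited models. The 0.1 mm/yr flux is taken from the text and the half-lives are commonly tabulated values; the moisture content, bulk density, path length and K_d values are hypothetical.

```python
import math

# Approximate half-lives in years (commonly tabulated values).
HALF_LIFE_YR = {"Tc-99": 2.11e5, "I-129": 1.57e7, "Cs-135": 2.3e6, "Np-237": 2.14e6}

def retardation(kd_ml_per_g, bulk_density_g_per_cm3, moisture_content):
    """Linear-sorption retardation factor R = 1 + rho_b * K_d / theta."""
    return 1.0 + bulk_density_g_per_cm3 * kd_ml_per_g / moisture_content

def surviving_fraction(nuclide, travel_time_yr):
    """Fraction of the inventory not lost to decay during the travel time."""
    decay_constant = math.log(2.0) / HALF_LIFE_YR[nuclide]
    return math.exp(-decay_constant * travel_time_yr)

flux_m_per_yr = 1.0e-4   # 0.1 mm/yr, the boundary condition quoted in the text
theta = 0.1              # hypothetical volumetric moisture content
rho_b = 2.0              # hypothetical bulk density, g/cm^3
path_length_m = 200.0    # hypothetical repository-to-water-table distance
kd = {"Tc-99": 0.0, "I-129": 0.0, "Cs-135": 50.0, "Np-237": 5.0}  # hypothetical, mL/g

water_velocity = flux_m_per_yr / theta  # pore-water velocity, m/yr
results = {}
for nuclide, kd_value in kd.items():
    travel_time = path_length_m * retardation(kd_value, rho_b, theta) / water_velocity
    results[nuclide] = surviving_fraction(nuclide, travel_time)
```

With these hypothetical numbers the non-sorbing species reach the water table largely undecayed, whereas the strongly sorbing ones are delayed so long that essentially nothing survives, which mirrors qualitatively the zero releases reported for Cs-135 and Np-237 in the matrix-transport models.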
Table 1 reports the principal characteristics that distinguish the models. Table 2 reports the uncertainty (across models) stemming from the six models, in terms of the cumulative release to the water table in curies, for a given set of values of the parameters. The cumulative release to the water table for the all-fracture model (model 2) is, for all practical purposes, equal to the cumulative release from the source, and this is true for all radionuclides. This is due to the relatively short travel time for this case, which results from the relatively high fracture velocities and lack of radionuclide retardation in the fracture continuum. Also, because of the very short travel time, a negligible amount of the inventory is lost to radioactive decay. Adding any matrix transport and/or matrix retardation significantly decreases the cumulative release. In fact, for Cs-135 and Np-237,


Table 1. Characteristics of the six alternative models

Model   Distinguishing assumptions
1       Single-continuum: groundwater flow and solute transport occur only in the porous matrix
2       Single-continuum: groundwater flow and solute transport occur only in connected fractures; no retardation effects in the fracture continuum
3       Single-continuum: groundwater flow as in 2; solute transport through the fractures and also through the immobile matrix water by diffusion; retardation effects only in the matrix by sorption
4       Dual-continuum: simultaneous matrix and fracture flow; solute transport occurs in the medium with highest pore velocity; retardation effects only in the matrix by sorption
5       Dual-continuum: groundwater flow as in 4; transport described using an equivalent porous medium approximation; retardation effects only in the matrix by sorption
6       Dual-continuum: groundwater flow and solute transport as in 4; only four equivalent hydrogeologic layers

predictions of any of the models that contain matrix transport result in zero release to the water table, primarily because of the high matrix retardation factors for these two isotopes. We begin our analysis by first looking at the models of groundwater flow available. A close examination of the models from this point of view shows that there are actually only four different structures, as given in Table 3 where the superscript f stands for flow. Figure 1 reports the deterministic predictions of the hydraulic head distribution in the system provided by the four models for a given set of parameter values. In the analysis that follows, we ignore the effects due to uncertainty in the parameters and we focus on the uncertainty due to the model structure. Following the alternate-hypotheses approach introduced in Section 3, suppose that the expert considers S^f_3 as the most plausible model, S*, and assigns a value of 0.6 to its probability, meaning that, given the available information, he or she feels 60% confident that this model would provide an appropriate description of the groundwater flow phenomenon for the objective of the analysis. The other structures S^f_1, S^f_2, S^f_4 are seen to differ from S^f_3 by only one

Table 3. Groundwater flow models

Flow structure   Model (Table 1)   Characteristics
S^f_1            1                 Groundwater flow in porous matrix only
S^f_2            2, 3              Groundwater flow in fractures only
S^f_3            4, 5              Groundwater flow in matrix and fracture simultaneously
S^f_4            6                 Groundwater flow in matrix and fracture simultaneously, but four stratigraphic layers only

S^f_i = i-th alternative model structure for groundwater flow.

assumption on the fundamental mechanism of flow. On the basis of the available evidence, the expert assigns values to the probabilities of the alternate hypotheses. An example is reported in Table 4 where, by assigning a value of 0.01 to the probability of model 1, the expert expresses his or her belief regarding the hypothesis of an all-matrix flow. Also, for the sake of this example, we assume that all four models deserve to be included in the expanded model set and no other category is necessary, so that they form an exhaustive set, in the sense that they are expected to provide a satisfactory representation of the retrospective truth and, therefore, enough hedging against uncertainty. Figure 2 shows a comparison between the prediction provided by the best model S^f_3 and the Bayesian estimator obtained by introducing the probability values of Table 4 into eqn (3), which, we argue, provides a more appropriate representation of the uncertainty in the model structure. To complete the example, we now consider all six models for groundwater flow and contaminant transport, which we denote as S^ft_i, i = 1, 2, ..., 6, in the order given in Table 1, and assume that the expert chooses S^ft_4 = S* with p(S*) = 0.45. Model structures S^ft_1, S^ft_2, S^ft_3 differ by two hypotheses while S^ft_5 and S^ft_6 differ only by one. The corresponding probabilities are given in Table 5. The application of the probabilities in Table 5 as weights in the Bayesian estimator of eqn

Table 2. Cumulative release to water table (Ci)

Model   Tc-99     I-129     Cs-135    Np-237
1       197       7.15      0.0       0.0
2       6.26E5    1.510     1.68E4    0.043
3       5.8E-4    8.5E-7    4.5E-12   4.3E-15
4       5.29E4    515       0.0       0.0
5       5.52E4    416       0.0       0.0
6       3.41E4    535       0.0       0.0

Fig. 1. Uncertainty in the hydraulic head distribution due to the four different models of groundwater flow. [Plot not recoverable; horizontal axis x (m); surviving curve labels H2, H3, H4.]

(3) leads to a measure of cumulative release as indicated in Table 6, where it is compared to the values obtained by $S^* = S_4^{ft}$. Again, we argue that the estimates obtained by means of eqn (3) give a more accurate representation of the uncertainty inherent in the model structure. Notice how the estimates are quite close in all cases, as the mean is driven by models $S_5^{ft}$ and $S_4^{ft}$, which happen to have similar probabilities and predictions.

Table 4. Assessment of model uncertainty for the groundwater flow models of Table 3: $S^* = S_3^f$

$S_i^f$    $p(S_i^f)$
$S_1^f$    0.01
$S_2^f$    0.04
S*         0.60
$S_4^f$    0.35

S* = reference model structure. $S_i^f$ = i-th alternative model structure for groundwater flow. $p(S_i^f)$ = probability of the i-th alternative model structure for groundwater flow.

Table 5. Assessment of model uncertainty for the groundwater flow and contaminant transport models of Table 1: $S^* = S_4^{ft}$

$S_i^{ft}$    $p(S_i^{ft})$
$S_1^{ft}$    0.01
$S_2^{ft}$    0.04
$S_3^{ft}$    0.1
S*            0.45
$S_5^{ft}$    0.3
$S_6^{ft}$    0.1

S* = reference model structure. $S_i^{ft}$ = i-th alternative model structure for groundwater flow and contaminant transport. $p(S_i^{ft})$ = probability of the i-th alternative model structure for groundwater flow and contaminant transport.

Table 6. Comparison of cumulative releases to water table (Ci)

Model                Tc-99    I-129    Cs-135   Np-237
$S^* = S_4^{ft}$     5.29E4   515      0.0      0.0
Bayesian             6.88E4   470.52   672      1.72E-3

S* = reference model structure. $S_4^{ft}$ = 4-th alternative model structure for groundwater flow and contaminant transport. Bayesian = Bayesian estimator of eqn (3).

Fig. 2. Comparison of the hydraulic head distributions given by the best model $S^* = S_3^f$ (solid) and by the Bayesian estimator of eqn (3) with the probabilities of Table 4 (dashed).

Suppose now that we wish to represent the uncertainty in the output predictions regarding the cumulative release of I-129 given by the six models, by directly expanding the prediction of a selected best model S* through a properly assessed adjustment factor E*. To be consistent with the earlier analysis, we assume that the expert chooses $S^* = S_4^{ft}$. Indeed, one could very well think of a two-step process which combines the two approaches proposed in Sections 2 and 3: first, the expert expands the model set and selects the best model S*; then he or she modifies the predictions of S* to account for its uncertainty as suggested by the available information, including that provided by the identified alternate models. This process has the potential of leading to a very efficient and explicit method for evaluating and propagating model uncertainty. In our example, we simply assume that the expert uses the information given by the alternate models to build a distribution for the modification factor E*, which then represents model-to-model uncertainty. In most practical cases, other information combined with expert judgments will also be used to arrive at the final distribution for E*. Notwithstanding the paucity of the data, for the purpose of this example, the residuals $(y_i - y^*)$ of the five alternate models can be equally weighted to produce a mean value of E* = -21.37 and a standard deviation $\sigma_{E^*}$ = 616.66, which indeed reflects the large uncertainty present in the predictions given by the models. The solid line in Fig. 3 represents the distribution of E*, under a normality assumption. Taking into account the expert's judgments on the credibility of the various alternatives, expressed in terms of probabilities $p(S_i)$, we can modify the distribution just obtained by using the probabilities of Table 5, to estimate a Bayesian predictive distribution according to eqn (3) (dotted line in Fig. 3). In this case, this leads to a slight shift of the curve towards more negative values and a reduction in the uncertainty, the new mean and standard deviation values being -44.48 and 262.32, respectively. The results of the above analyses can then be presented to the expert for discussion and refinement.

Fig. 3. Comparison of the distributions of E* obtained by simple (solid) and weighted (dashed) averaging.

During this phase the TFI operates as a normative expert and is expected to point out to the substantive expert the intrinsic meaning of the assessed distributions, in terms of their characteristic values, such as median, mean and percentiles. The expert is then left free to reconsider the previously assessed values and adjust them, if necessary, on the basis of the analysis of the available evidence and of other considerations reflecting his or her personal expertise. The TFI acts first as an integrator by computing the new composite distribution which results from the adjusted values and then as a facilitator by presenting it again to the expert for refinement. The process continues in an iterative manner until the substantive expert is satisfied with the assessed distribution as a full, and appropriate, characterization of the uncertainty. The two methods of aggregation presented are, therefore, used by the TFI and by the substantive expert as tools for sensitivity analysis to explore the effects of different input judgments, with the goal of producing a final distribution which properly reflects the expert's beliefs.
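The arithmetic behind the example can be retraced with a short script. The sketch below is illustrative only: it assumes that the Bayesian estimator of eqn (3) reduces, for point predictions, to the probability-weighted sum of the six model outputs, and that the equal-weight standard deviation of E* is the sample (n - 1) standard deviation of the five residuals. The variable names are ours. Under these assumptions the script reproduces the Bayesian row of Table 6 and the quoted means of E*; the weighted standard deviation is left out because the text does not state the convention used to obtain it.

```python
import statistics

# Cumulative releases (Ci) predicted by models 1-6, as read from Table 2.
releases = {
    "Tc-99": [197.0, 6.26e5, 5.8e-4, 5.29e4, 5.52e4, 3.41e4],
    "I-129": [7.15, 1510.0, 8.5e-7, 515.0, 416.0, 535.0],
    "Cs-135": [0.0, 1.68e4, 4.5e-12, 0.0, 0.0, 0.0],
    "Np-237": [0.0, 0.043, 4.3e-15, 0.0, 0.0, 0.0],
}

# Expert probabilities p(S_i) from Table 5; model 4 is the reference S*.
probs = [0.01, 0.04, 0.10, 0.45, 0.30, 0.10]
REF = 3  # zero-based index of the reference model (model 4)


def weighted_sum(values, weights):
    """Probability-weighted sum of the model predictions (assumed form of eqn (3))."""
    return sum(w * v for w, v in zip(weights, values))


# Alternate-hypotheses formulation: one weighted estimate per isotope.
bayes = {isotope: weighted_sum(y, probs) for isotope, y in releases.items()}

# Adjustment-factor formulation for I-129: residuals y_i - y* of the
# five alternate models with respect to the reference prediction y*.
y = releases["I-129"]
y_ref = y[REF]
residuals = [v - y_ref for i, v in enumerate(y) if i != REF]

mean_e = statistics.mean(residuals)   # equal weights on the five residuals
std_e = statistics.stdev(residuals)   # sample standard deviation (n - 1)

# Probability-weighted mean of E*; the reference model contributes a
# zero residual, so this equals the Bayesian estimate minus y*.
weighted_mean_e = bayes["I-129"] - y_ref

for isotope, value in bayes.items():
    print(f"{isotope}: {value:.4g}")
print(f"E* equal weights: mean = {mean_e:.2f}, std = {std_e:.2f}")
print(f"E* weighted mean = {weighted_mean_e:.2f}")
```

Changing an entry of `probs` (while keeping the sum equal to one) and rerunning the script is a simple way to carry out the kind of sensitivity analysis that the TFI and the substantive expert are expected to perform together.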

7 CONCLUSIONS

Performance assessments of high-level radioactive waste repositories are typically based on the extensive use of simulation models that predict the system behaviour. The complexity of the phenomena and of the system itself introduces several uncertainties in the assessment, which must be accounted for. This paper addresses the uncertainty which derives from the model structure itself. In effect, almost all problems of environmental system identification can be regarded as problems of model structure identification. The inherent, natural variability of the environmental system considered for a performance assessment gives rise to a problem of model identifiability which strongly affects the results of the analysis and is a principal source of uncertainty. It implies an uncertain and ambiguous interpretation of past observed behaviour and, most of all, the possibility of ambiguous, and even contradictory, predictions of the system evolution.

The problem of model uncertainty typically originates from both inappropriate prior hypotheses of model structure and the set of input/output variables observed in a planned experiment. For this reason, problems of model identifiability and issues of experimental design are closely related for the specific purpose of reducing critical uncertainties associated with a model. For the models employed in performance assessments one should typically envision an iterative process of model identification and experimental design due to the fact that a good experimental design requires good prior knowledge of the system's behaviour, i.e., a good model. In this regard, sensitivity analysis will play an important role in the identification of those parameters which are individually identifiable and those which are not.

In practice, the situation is typically one in which, due to the large spatial and temporal scales required by the analysis and the complexity of the system, the predictive capabilities of the models used cannot be observed. As a consequence, the analyst cannot obtain empirical confirmation of the validity of a model from observed data and the evaluation of a model will then have to rely on the subjective interpretation of the information at the time of the analysis. The assessment of model validity through the use of expert judgment represents a difficult and controversial task. Nevertheless, it is the basis of any serious attempt at quantification of model uncertainty. The thesis of this paper is that appropriate mathematical formulations for the treatment of model uncertainty can provide valuable insights and structured guidance to the process of elicitation of these judgments.

The elicitation process is framed within the scheme of the Technical Facilitator/Integrator approach recently proposed by the Senior Seismic Hazard Analysis Committee, which attempts to exploit many of the best attributes of mathematical and behavioral approaches to multi-expert elicitation processes. The theoretical formulations for the treatment of model uncertainty are seen as additional tools to be exploited in the elicitation process. Two mathematical frameworks are formulated which, in principle, account for the uncertainty associated with models. In the first approach, the idea is to systematically construct a set of models based on alternate hypotheses which, in light of the available information, provide plausible descriptions of the system under analysis. We have referred to this method as the alternate-hypotheses approach, for obvious reasons. Since a model consists of many constitutive assumptions, the alternate hypotheses typically pertain to several of these individual assumptions. The credibility values assigned to the individual hypotheses must then be recomposed to give rise to a probability value which expresses the analyst's belief on the overall model considered.

For decision-making purposes, it is often necessary to combine the results of the uncertainty assessment into an aggregated measure. The alternate-hypotheses approach provides a sound and defensible theoretical framework for the evaluation of a predictive distribution to be used in formal decision analysis. However, many difficulties have been encountered in practical applications, where the experts have shown a sense of uneasiness with representing their beliefs concerning alternative models in the form of probability values, since it is often not clear what these probabilities mean and what they are intended to quantify. The other mathematical formulation that can be used to account for model uncertainty, the adjustment-factor approach, amounts to identifying a 'best' model and then appropriately modifying its predictions by means of an adjustment factor. This factor is, in general, unknown and the uncertainty associated with it is represented in the form of a distribution which is intended to reflect the uncertainty in the models. This approach clearly has a more empirical foundation. Although less defensible from a theoretical point of view, it has proved to be useful in practical situations because of its flexibility. Both approaches result in predictive distributions which are meant to contain all the uncertainties and can be used in formal decision analysis to evaluate the expected utilities. As discussed in previous sections, new information can be used to update the epistemic model by means of Bayes' theorem. However, the models used for the representation of the uncertainty cannot be changed. Unfortunately, this is not the way engineering models, especially the ones employed in risk assessments, evolve in time. New evidence and advances in science very often lead to new models. In these cases, the old formulation does not apply anymore and one must start the analysis with the new models.
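The limitation noted above, that Bayes' theorem re-weights a fixed set of models but cannot add a new one, can be made concrete with a minimal sketch. The prior probabilities are those of Table 5; the likelihood values and the function name `update_model_probabilities` are hypothetical and serve only to illustrate the mechanics.

```python
def update_model_probabilities(prior, likelihood):
    """Bayes' theorem over a fixed, exhaustive set of model structures.

    prior[i] is p(S_i); likelihood[i] is the likelihood of the new
    evidence under S_i. Returns the posterior p(S_i | evidence).
    """
    if len(prior) != len(likelihood):
        raise ValueError("prior and likelihood must have the same length")
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    if total <= 0.0:
        raise ValueError("the evidence has zero likelihood under every model")
    return [u / total for u in unnormalized]


# Prior probabilities of the six model structures (Table 5).
prior = [0.01, 0.04, 0.10, 0.45, 0.30, 0.10]

# Hypothetical likelihoods of some new evidence under each model;
# these numbers are invented for illustration only.
likelihood = [0.1, 0.1, 0.2, 1.0, 2.0, 1.0]

posterior = update_model_probabilities(prior, likelihood)
for i, (p0, p1) in enumerate(zip(prior, posterior), start=1):
    print(f"model {i}: prior {p0:.2f} -> posterior {p1:.3f}")
```

The posterior has exactly as many entries as the prior: the evidence shifts belief among the six structures already in the set, but a seventh structure suggested by new science has no prior entry to update, which is why the analysis would have to be restarted with the new model set.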
An investigation of the contribution that the two mathematical formulations could introduce in a formal elicitation exercise has demonstrated how the two approaches could pose different questions and raise different concerns in several steps of the elicitation protocol. Both formulations are seen to provide insights in the elicitation process and valuable guidance to its various steps. Overall, it does not seem possible, at the present time, to select one approach as being definitely better than the other but, rather, it seems that the decision on which approach to use will be very context-specific, without discarding the possibility of using a combination of the two. The problem of treating model uncertainty is far from being resolved. However, the theoretical frameworks presented here allow additional forms of structural uncertainty to enter the probabilistic calculations quantitatively. This greater acknowledgment of model uncertainty often has the consequence of widening the uncertainty bands in pursuit of better calibration. But in view of the examples presented, systematically widening the bands leads to decisions based on more complete uncertainty assessments and reduces the chance of missing the truth.

ACKNOWLEDGEMENT

This work is based on research supported by the U.S. Nuclear Regulatory Commission under grant NRC-0493-093. This grant was awarded to UCLA, where the authors did part of the work. Although this paper is based on research funded in part by the US Nuclear Regulatory Commission, it presents the opinions of the authors, and does not necessarily reflect the regulatory requirements or policies of the USNRC. The authors wish to thank Professor D. Okrent of UCLA for his useful comments in the early stages of this work and Dr J. Randall of the NRC for his careful review and overall support.
