Journal of Statistical Planning and Inference 138 (2008) 1271 – 1286 www.elsevier.com/locate/jspi
Prior viability assessment for Bayesian analysis Michael Goldstein, Allan Seheult∗ Department of Mathematical Sciences, Durham University, Durham DH1 3LE, UK Received 20 January 2006; received in revised form 16 February 2007; accepted 24 April 2007 Available online 22 May 2007
Abstract We address the problem of determining whether the cost of a proposed Bayesian analysis is likely to be justified by the potential benefit. A method is described for identifying the likely order of magnitude benefits from the analysis, and this approach is applied to an example concerning trading on the sugar market. © 2007 Elsevier B.V. All rights reserved. Keywords: Bayes linear analysis; Commodity trading; Preposterior analysis; Price forecasting; Random walk; Temporal sure preference; Value of analysis
1. Introduction We develop a method for assessing the practical viability of carrying out a Bayesian analysis. Such an analysis is concerned with the combination of expert prior judgements with statistical models based on informative data in order to reduce certain key uncertainties in the application of interest. This type of analysis may be very expensive, and before embarking on such a program an organisation will need to estimate an order of magnitude budget for (i) data collection and analysis; (ii) formulation and maintenance of the Bayesian model; (iii) expert elicitation for prior uncertainties; (iv) developing good solutions to the decision problems to which the uncertainties relate; (v) software requirements to allow the approach to be used routinely within the organisation. Such budget setting is a process which any organisation must go through any time that they consider taking on a large Bayesian project, and is strongly project dependent. Our objective is to offer a method by which the organisation can get some idea as to whether the budget is likely to be well spent. In principle, to decide whether the budget is justified, we should weigh the costs of the analysis against the resulting benefits. In many cases, the analysis will repay the investment only if it can sufficiently reduce posterior uncertainty on certain key quantities to result in substantial improvements in some key operating procedures. If the project is large and complex, then there will be considerable uncertainty as to the degree of success that is likely to be attained. It is therefore useful, before such a potentially expensive Bayesian analysis is carried out, to make simple rule-of-thumb calculations to help us judge the likely benefits from the analysis. One approach is to construct a quick approximate version of the modelling and elicitation, using easily available parts of the relevant data, to show concept viability.
∗ Corresponding author. Tel.: +44 191 3343113; fax: +44 191 3343051. 0378-3758/$ - see front matter © 2007 Elsevier B.V. All rights reserved. doi:10.1016/j.jspi.2007.04.023
However, the expense involved in constructing the simplified model may still be considerable, and there is no
theoretical basis which would allow us to assess the extent to which final success may be predicted by the rough initial model, so that such simplifications may not address the decision support issues that we have raised. In contrast to approaches which approximate the form of the Bayesian analysis, we suggest a viability assessment based on direct modelling of the objectives of the analysis. Our aim is to produce a simple order of magnitude assessment of the relationship between the accuracy which is likely to be achieved by the analysis and the actual benefits which would derive from such an increase in accuracy. This assessment is based on a simple model for the potential gain in information whose effect on the operating procedures may be assessed both by simulation and by using sets of training data. Our procedure is illustrated by an application based on our work on sugar trading. The general methodology for constructing the prior viability analysis is described in Section 2. This methodology works by exploiting the temporal sure preference principle in order to construct the necessary stochastic relationships between the data outcomes and our inferences about these outcomes. In Section 3, we describe a sugar trading example which we use to illustrate preliminary viability analysis. Specific application to a simple random walk model for sugar prices is analysed in detail and numerical evaluation of the viability analysis is provided. There is concluding discussion in Section 4.
2. Prior viability analysis We now describe a general approach to Bayesian preliminary viability analysis. Our problem is as follows. We must decide whether to carry out a full Bayesian analysis intended to reduce uncertainty about a vector of random quantities X. The full analysis is likely to be costly in terms of data collection, model formulation, analysis, software creation and maintenance, expert time in elicitation and so forth. We are uncertain as to how successful our analysis will be. One of the intended benefits of the reduction in uncertainty is improved performance for various tasks. Our aim is to consider whether the potential benefits of the analysis outweigh the costs. A full Bayesian analysis of the benefits of the analysis is likely to be extremely complex. We calculate a pragmatic lower bound on the likely benefit for the specified tasks from the Bayesian analysis, from which to gauge the viability of such an analysis. We will not know how successful we will be in reducing uncertainty until we have carried out the full analysis, but we may be able to make quantitative assessments, based on direct expert judgements or on simple modelling, as to what levels of accuracy seem realistically attainable. By deriving a pragmatic lower bound on the value of any given reduction in uncertainty, we identify the minimum reduction in uncertainty which offers a sufficient expected gain to offset the cost of the analysis. If it seems plausible that we should be able to achieve such a reduction in uncertainty, then this offers a sensible rationale for carrying out the full Bayes procedure of modelling, elicitation, data collection and so forth. There may be less tangible benefits from a full Bayesian analysis, and these should be separately assessed.
2.1. Prior judgements for future inferences Our problem is as follows. If we carry out a Bayesian analysis, then the benefits that follow will arise because of the improvement in our decision making. The decisions that we shall make will be functions of our posterior judgements about X given the Bayes analysis, while the gains from our decisions will be functions of the actual value of X. However, we have not yet decided which data we shall collect, how we shall model and analyse this data, or the level of detail to which we shall elicit beliefs from experts. Therefore, we cannot construct a full Bayesian model for the analysis. Instead, we treat our posterior judgements as primitive quantities. We express directly our beliefs describing the relationship between X and our posterior judgements for X. Such assessments are similar to those that are made when we must assess the potential value of expert judgements; see Goldstein and O’Hagan (1996). Thus, suppose that we envisage carrying out the full Bayesian analysis. If we carry out the analysis, we replace the prior mean E[X] by a posterior mean $\tilde{E}[X]$. The value of $\tilde{E}[X]$ is currently an unknown random vector. We will also replace the prior variance matrix Var[X] with the random posterior variance matrix $\widetilde{\mathrm{Var}}[X] \equiv \tilde{E}[(X - \tilde{E}[X])(X - \tilde{E}[X])^T]$.
There is a simple relationship between current and future uncertainty judgements which derives from the temporal sure preference principle: If you are sure that, at some future time, you will have a preference for W over U, where W and U are small, random money penalties, then you should not, at the present time, prefer U to W. This principle was described, motivated and discussed in Goldstein (1997), where it is argued that, whatever method you choose to form your future beliefs, temporal sure preference implies that your current beliefs about your future beliefs must satisfy the stochastic relationship
$$X = \tilde{E}[X] + R, \tag{1}$$
where the difference $R = X - \tilde{E}[X]$ between X and its posterior mean satisfies the conditions
$$E[R] = 0 \quad\text{and}\quad \mathrm{Cov}[\tilde{E}[X], R] = 0. \tag{2}$$
These conditions imply that
$$E[\tilde{E}[X]] = E[X]. \tag{3}$$
Relations (1) and (3) give
$$\mathrm{Var}[R] = E[(X - \tilde{E}[X])(X - \tilde{E}[X])^T] = E[\tilde{E}[(X - \tilde{E}[X])(X - \tilde{E}[X])^T]] = E[\widetilde{\mathrm{Var}}[X]] = \mathrm{EPV}[X], \tag{4}$$
where EPV[X] is our prior expectation for the posterior variance matrix. Thus,
$$\mathrm{Cov}[X, \tilde{E}[X]] = \mathrm{Var}[\tilde{E}[X]] = \mathrm{Var}[X] - \mathrm{EPV}[X]. \tag{5}$$
Therefore, by specifying EPV[X], in addition to E[X] and Var[X], we may construct the full mean, variance and covariance specification over X, $\tilde{E}[X]$. In our viability analysis, we use these relations to analyse the impact of different choices of EPV[X] on the potential gain from carrying out the full Bayes analysis.
2.2. Evaluating decisions based on posterior judgements We now explain how to produce a viability function which translates each value of EPV[X] into a lower bound for the corresponding expected gain for the tasks of interest. We shall use this function to make our judgement as to the value of EPV[X] which would be required in order to justify the cost of the analysis. To assess how much we expect to gain by the full Bayes analysis, we require for comparison a baseline decision procedure which describes how we would proceed if we did not carry out such a full Bayes analysis. We also need to assess our prior probability distribution P[X] for X. For each value of X, we determine the gain g[X] (ideally in units of utility) that we would make by following our baseline decision procedure. Therefore, by simulation or by exact calculation, we may evaluate our expected gain from making an immediate decision, g = E[g[X]], where the expectation is taken with respect to the distribution P[X]. We compare g with the expected gain if we carry out the full Bayes analysis before making our decision. The full Bayes calculation of this gain would be enormously complicated. However, we may derive a lower bound to this gain if we suppose that, rather than obtaining the full posterior probability distribution over X, we were, instead, only able to discover the value of the posterior mean $\tilde{E}[X]$. Any price that we would consider it reasonable to pay for $\tilde{E}[X]$ would be less than the price that we would pay for the full posterior distribution, and thus serves as a lower bound for the value of the Bayes analysis.
The value to us of learning $\tilde{E}[X]$ will depend largely on how much the analysis may reduce our uncertainty about X. This uncertainty reduction is reflected in the value that we may specify for EPV[X]. Our aim is to assess the expected benefit of learning the value of $\tilde{E}[X]$ for different values of EPV[X]. To assess this benefit, we must specify the decision choice that we will follow when we learn the value of $\tilde{E}[X]$. Any decision rule that we use will give a lower bound for the expected gain from the optimal Bayes rule given $\tilde{E}[X]$, and thus a lower bound for the value of the full Bayes analysis.
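Before any decision rule is chosen, the second-order specification implied by relations (1)–(5) can be assembled numerically. The following Python sketch is only an illustration under stated assumptions: the helper name `joint_specification` and the numerical values are hypothetical and do not come from the paper. It stacks X with its posterior mean and fills in the mean vector from relation (3) and the variance blocks from relation (5).

```python
import numpy as np


def joint_specification(prior_mean, prior_var, epv):
    """Mean and variance of the stacked vector (X, posterior mean of X).

    Hypothetical helper: the name and interface are illustrative assumptions.
    """
    prior_mean = np.atleast_1d(np.asarray(prior_mean, dtype=float))
    prior_var = np.atleast_2d(np.asarray(prior_var, dtype=float))
    epv = np.atleast_2d(np.asarray(epv, dtype=float))

    # Relation (5): Var[posterior mean] = Cov[X, posterior mean] = Var[X] - EPV[X].
    resolved = prior_var - epv
    if np.linalg.eigvalsh(resolved).min() < -1e-10:
        raise ValueError("EPV[X] must not exceed Var[X] in any direction")

    # Relation (3): the posterior mean has the same prior expectation as X.
    mean = np.concatenate([prior_mean, prior_mean])
    var = np.block([[prior_var, resolved], [resolved, resolved]])
    return mean, var


# Illustrative numbers only: Var[X] = 4 and a trial value EPV[X] = 1.
mean, var = joint_specification([10.0], [[4.0]], [[1.0]])
print(mean)
print(var)
```

Repeating the call for several trial values of EPV[X] shows how the variance attributed to the posterior mean grows as the expected posterior variance shrinks.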
Suppose that we have chosen a decision procedure which depends only on the observed value of the posterior mean $\tilde{E}[X]$ and the specified value for EPV[X]. We suppose for now that there is a single decision, after which the value of X will be revealed and then we shall receive the payoff. Therefore, our decision d will depend on the value that we learn for $\tilde{E}[X]$ and the value we assign for EPV[X]; call this choice of decision $d = d[\tilde{E}[X], \mathrm{EPV}[X]]$. Our gain will depend on the true value of X. Call this value G[X, d]. We define
$$G[\mathrm{EPV}[X]] = E[G[X, d]] \tag{6}$$
to be the viability function over values of EPV[X]. Therefore, we must assess our expectation for the gain function G[X, d] over values of X and $\tilde{E}[X]$. Suppose that we specify a particular choice for EPV[X]. We do not have a full joint distribution for the pair X and $\tilde{E}[X]$. However, we have the marginal distribution for X, and from relations (1) and (5) we have a full mean, variance and covariance specification for X and $\tilde{E}[X]$. We may exploit this specification by constructing an appropriate Bayes linear analysis.
2.3. Bayes linear methods In general, the Bayes linear approach may either be motivated from a belief that the target quantities are roughly normally distributed, or from the viewpoint that the Bayes linear approach is the appropriate form when dealing with partial belief specifications based on means and variances, or as a pragmatic and tractable lower bound for expected mean square error for an estimator using the full Bayes analysis, based on linear fitting. The Bayes linear analysis follows directly from the formulation relating current and future beliefs through temporal sure preference; for an overview of the Bayes linear approach, see Goldstein (1999). In particular, if B and D are random vectors, then the adjusted expectation and variance for B given D are
$$E_D[B] = E[B] + \mathrm{Cov}[B, D]\,\mathrm{Var}[D]^{-1}(D - E[D]), \tag{7}$$
$$\mathrm{Var}_D[B] = \mathrm{Var}[B] - \mathrm{Cov}[B, D]\,\mathrm{Var}[D]^{-1}\,\mathrm{Cov}[D, B]. \tag{8}$$
It follows, on setting $B = \tilde{E}[X]$ and $D = X$ in (7) and (8) and using relations (3) and (5), that the adjusted expectation and variance for $\tilde{E}[X]$ given X can be written
$$E_X[\tilde{E}[X]] = (I - R[X])X + R[X]E[X], \tag{9}$$
$$\mathrm{Var}_X[\tilde{E}[X]] = (I - R[X])R[X]\,\mathrm{Var}[X], \tag{10}$$
where
$$R[X] = \mathrm{EPV}[X]\,\mathrm{Var}[X]^{-1} \tag{11}$$
is a dimensionless matrix with all its eigenvalues in (0, 1). These specifications provide the adjusted mean and variance for $\tilde{E}[X]$, for each X. However, to assess the expected gain function G[EPV[X]], we need a full probability distribution for $\tilde{E}[X]$ given X. We therefore choose a plausible distribution, with the given adjusted mean and adjusted variance for $\tilde{E}[X]$, and carry out the analysis with this distribution. We may then explore the effects of using a range of distributions for $\tilde{E}[X]$, to assess how sensitive the expected gains are to further aspects of our beliefs about posterior expectations. We now may compute, analytically or by simulation, the value of G[EPV[X]], for fixed EPV[X]. For example, we may simulate values of X from P[X], and, for each such X, we may simulate values of $\tilde{E}[X]$ from our chosen distribution with mean and variance given by (9) and (10). For each pair of values of X and $\tilde{E}[X]$, we evaluate d and therefore the value of G[X, d]. Averaging over simulations gives the value of G[EPV[X]]. The viability function G[EPV[X]] reveals how much we must expect to reduce variation over X in order to justify any given cost of analysis. The analysis is viable if sufficiently large gains correspond to reductions in uncertainty which we may be reasonably confident of achieving without too much cost. Alternatively, we may place prior probabilities on the different values EPV[X], and thus of G[EPV[X]], that we may achieve, and therefore assess whether the expected benefit is justified by the expected cost. If the viability analysis suggests that it is unclear whether to continue, then we may choose to conduct a more careful version of the analysis, based on more detailed aspects of the uncertainty specification and more effective decision rules.
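The simulation route described above can be sketched in Python for a scalar X. Everything specific in this sketch is an assumption made for illustration and is not taken from the paper: the normal prior for X, the normal distribution chosen for the posterior mean given X, the toy decision rule (act, and receive X minus a fixed cost, only when the relevant expectation exceeds the cost), the function name `viability_gain` and all numerical values. The adjusted mean and variance are the scalar forms of (9) and (10).

```python
import numpy as np


def viability_gain(epv, prior_mean=0.0, prior_var=4.0, cost=0.5,
                   n_sims=200_000, seed=0):
    """Monte Carlo estimate of the gain over the baseline for one value of EPV[X].

    Hypothetical helper: the prior, the toy decision rule and the defaults are
    illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    r = epv / prior_var                                      # scalar form of (11)
    x = rng.normal(prior_mean, np.sqrt(prior_var), n_sims)   # X drawn from P[X]

    # Adjusted mean (9) and variance (10) of the posterior mean given X.
    cond_mean = (1.0 - r) * x + r * prior_mean
    cond_sd = np.sqrt(max((1.0 - r) * r * prior_var, 0.0))
    post_mean = cond_mean + cond_sd * rng.standard_normal(n_sims)

    # Toy rule: act (gain X - cost) only if the current expectation exceeds cost.
    gain_informed = np.where(post_mean > cost, x - cost, 0.0)
    gain_baseline = (x - cost) if prior_mean > cost else np.zeros(n_sims)
    return float(np.mean(gain_informed - gain_baseline))


# Trial values of EPV[X]; the first equals Var[X], so nothing is learned.
for epv in (4.0, 3.0, 1.0):
    print(epv, viability_gain(epv))
```

Plotting the returned values against EPV[X] gives a rough picture of the viability function for this toy rule; a different plausible distribution for the posterior mean could be substituted to check sensitivity.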
2.4. Extending the viability analysis (i) Historical data: If we have a reasonably large number m of historical data sets $X_1, \ldots, X_m$ judged to be exchangeable with X, then there is a useful supporting calculation that we may make. For each i, we simulate the value of $G[X_i, d]$ as above. Then we may estimate G[EPV[X]] by
$$\hat{G}[\mathrm{EPV}[X]] = \frac{1}{m}\sum_{i=1}^{m} G[X_i, \mathrm{EPV}[X]]. \tag{12}$$
A comparison of the analysis of simulated and historical data is a useful diagnostic for those aspects of the viability assessment which rely on the assessment of the prior distribution for X. With sufficient historical data, we might even use (12) as a fast and cheap alternative to the full simulation based assessment of G[EPV[X]], as this method only requires the specification of the prior mean and variance for X, which are used to evaluate (9) and (10), so that we do not need to specify a full prior distribution for X.
(ii) Several decisions: We have supposed above that there is a single decision involved in the viability analysis. Suppose instead that there are several decisions to be taken, in sequence, and that before each decision a subset of the values of the elements of X is revealed. Each element of X that we observe modifies our expectation for each remaining element of X. We proceed as follows. Initially, we specify $\tilde{E}[X]$ and EPV[X] as the adjusted mean and variance for X. For each element of X that we observe, we may further adjust the mean and variance for the remaining elements of X using (7) and (8), as we have made a full covariance specification between all the elements of X and $\tilde{E}[X]$. Therefore, provided that at each stage we restrict attention to decisions which are functions of the current adjusted mean and variance for the unobserved elements in X, then we may evaluate the expected gain exactly as for a single decision, and again this will provide a lower bound for the expected gain over all possible choices of decision.
(iii) Several problems: So far, we have supposed that there is a single problem for which the Bayesian analysis will be relevant. It may be that there is a sequence of problems to which the analysis will apply. We may judge that variability in each problem differs simply by location and scale factors. In such cases, we may construct a vector of location and scale parameters, and specify a prior probability distribution over this parameter vector. Then, for each simulation in deriving our viability function, we begin by drawing a value for the parameter vector. We construct our viability analysis with the prior expectation E[X] determined by the location parameters and prior variance Var[X] determined by the scale parameters. We consider EPV[X] to be the expected posterior variance matrix for X for some standard value of the prior variance, and scale the value of EPV[X] for each simulation by the corresponding multiplier for the scale parameters. The average of the simulations will therefore provide our viability function as before.
2.5. Enlarging the set of uncertain quantities In many problems, the number of random quantities whose values are relevant to the decisions that we must choose is very large, as is the case in the example we discuss in Section 3. A viability analysis carried out in the way that we have suggested above may therefore become unwieldy, as it requires informed prior judgements as to the likely magnitude of the elements of a large posterior variance matrix. Therefore, we may decide to simplify the viability analysis, by supposing that, instead of receiving the full posterior expectation vector for all the quantities, we only receive the posterior expectation for some aspects of the vector. In particular, suppose that our decision depends on a large vector Y for which we have a prior mean and variance E[Y] and Var[Y]. However, suppose that we are primarily concerned with learning about some smaller set X = LY of linear combinations of the components of Y. We therefore consider our expected gain if we are informed of the value of the posterior mean $\tilde{E}[X]$, but given no other information about the value of $\tilde{E}[Y]$. Again, this analysis offers a lower bound for the full posterior analysis, which will be informative for all elements of Y.
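The prior quantities needed for the reduction to X = LY follow from Var[Y] alone through the adjustment formulas (7) and (8). The Python sketch below illustrates this under assumptions that are not in the paper: Y is taken to be a three-step random walk with unit increment variance started from a known value, X is its final element, and the helper name `adjust_for_linear_combination` is hypothetical.

```python
import numpy as np


def adjust_for_linear_combination(var_y, L):
    """Prior variance of X = L Y and the Bayes linear adjustment of Y by X.

    Hypothetical helper: the name and interface are illustrative assumptions.
    """
    var_y = np.asarray(var_y, dtype=float)
    L = np.atleast_2d(np.asarray(L, dtype=float))

    var_x = L @ var_y @ L.T                      # Var[X] for X = L Y
    cov_yx = var_y @ L.T                         # Cov[Y, X]
    # Cov[Y, X] Var[X]^{-1}: the coefficient multiplying (X - E[X]) in (7).
    coef = np.linalg.solve(var_x, cov_yx.T).T
    # Adjusted variance of Y given X, as in (8).
    resid_var = var_y - coef @ cov_yx.T
    return var_x, coef, resid_var


# Illustrative assumption: a 3-step random walk, so Cov[Y_i, Y_j] = min(i, j).
days = np.arange(1, 4)
var_y = np.minimum.outer(days, days).astype(float)
L = np.array([[0.0, 0.0, 1.0]])                  # X is the final element of Y

var_x, coef, resid_var = adjust_for_linear_combination(var_y, L)
print(var_x)
print(coef.ravel())
print(np.diag(resid_var))
```

The residual variance returned here is the variance that learning about X alone cannot remove, which is why treating it as unchanged by the analysis gives a conservative assessment.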
To identify the information about Y that we receive by learning the value of the posterior mean $\tilde{E}[X]$, we write Y in the following form:
$$Y = E_X[Y] + R_Y, \tag{13}$$
where
$$E[R_Y] = 0, \quad \mathrm{Var}[R_Y] = \mathrm{Var}_X[Y] \quad\text{and}\quad \mathrm{Cov}[X, R_Y] = 0.$$
The first term on the right in Eq. (13), namely $E_X[Y]$, is a linear function of X. Therefore, any reduction in variance for X automatically translates into a corresponding reduction in variance for $E_X[Y]$, so that the values of $\tilde{E}[E_X[Y]]$, $\mathrm{EPV}[E_X[Y]]$ are determined by the corresponding quantities for X. The random quantity $R_Y$ is uncorrelated with X. The least informative updating of beliefs about $R_Y$ which follows from learning the value of the posterior mean $\tilde{E}[X]$ would be to suppose that $\tilde{E}[X]$ is uninformative for $R_Y$, and thus does not affect the mean and variance for $R_Y$, so that $\tilde{E}[R_Y] = 0$, and $\mathrm{EPV}[R_Y] = \mathrm{Var}_X[Y]$. Again, this is a lower bound for the expected information that would be provided by the full Bayesian analysis. We complete the specification by setting the expected posterior covariance to be equal to the prior covariance, namely $\mathrm{EPC}[X, R_Y] = 0$. We may now construct the combined vector $U = (X, R_Y)$, so that U is a linear transform of Y. The values $\tilde{E}[U]$, EPV[U] are determined as above. Thus, as EPV[U] is a function only of EPV[X] and our prior assessments, we may carry out the viability analysis exactly as in Section 2.2, by evaluating the lower bound to expected gain for different values of EPV[X].
2.6. Summary We now summarise the main steps in carrying out a viability analysis.
V1 Problem specification: We must specify
(i) the vector Y of random quantities whose values determine the gain for each decision;
(ii) the vector X = LY of primary interest, in the sense that the viability analysis will give a lower bound for the value of the Bayesian analysis for different choices of EPV[X];
(iii) the default decision procedure $d_0$ if no Bayesian analysis is to be carried out;
(iv) the decision procedure d we are using to give a lower bound for the value of a full Bayesian analysis. The procedure only depends on the posterior distribution of Y through its posterior expectation $\tilde{E}[Y]$ and EPV[Y], the expected value of the posterior variance.
V2 Belief specification and analysis:
(i) We specify the prior distribution for Y.
(ii) We choose a specification for EPV[X].
(iii) From the prior variance matrix for Y, we construct the vector $R_Y = Y - E_X[Y]$. Set $\tilde{E}[R_Y] = E[R_Y] = 0$. We convert Y to the linearly equivalent vector $U = (X, R_Y)$, extend the specification to EPV[U] by $\mathrm{EPV}[R_Y] = \mathrm{Var}[R_Y]$ and $\mathrm{EPC}[X, R_Y] = 0$, and deduce the corresponding value of EPV[Y]. We deduce the mean and variance specification for the posterior mean $\tilde{E}[Y]$ and Y as
$$E[\tilde{E}[Y]] = E[Y] \quad\text{and}\quad \mathrm{Var}[\tilde{E}[Y]] = \mathrm{Var}[Y] - \mathrm{EPV}[Y] = \mathrm{Cov}[Y, \tilde{E}[Y]].$$
(iv) We construct the adjusted expectation and variance for $\tilde{E}[Y]$ given Y using Eqs. (9) and (10) with X replaced by Y.
(v) We evaluate the form of the decision procedure given $\tilde{E}[Y]$.
V3 Viability analysis:
(i) Simulate a value of Y from its prior distribution.
(ii) Evaluate the gain from the default decision procedure.
(iii) Evaluate the Bayes linear adjusted mean and variance for $\tilde{E}[Y]$ given Y using V2 (iv). Simulate a value for $\tilde{E}[Y]$ from an appropriate distribution with this mean and variance. Choose the decision corresponding to these values of $\tilde{E}[Y]$ and EPV[Y]. Evaluate the gain. In some circumstances, we may be able to write out explicitly the decision function and associated gain, in which case we may evaluate the expected gain directly. [If there is a sequence of decisions, repeat this procedure sequentially, updating the mean and variance of $\tilde{E}[Y]$ at each stage, and accumulate gains.]
(iv) Repeat steps (i)–(iii) N times (N large). The difference between the average for step (ii) and the average for step (iii) is a lower bound for the expected gain of the Bayes analysis, for the given choice of EPV[X].
(v) By repeating the analysis for different choices of EPV[X], construct a picture of the minimal gains that we can expect, depending on how much we expect to be able to reduce the uncertainty over X by carrying out a full Bayesian analysis. This allows an informed judgement as to whether these gains are likely to justify the cost of the analysis.
(vi) If we have extensive historical data on equivalent systems, then we may supplement the simulation analysis above by an exactly equivalent analysis based on the observed, rather than the simulated, Y values.
3. Example: sugar trading We motivate and illustrate our approach with reference to a problem related to our work with C. Czarnikow Sugar Ltd, who are concerned with a wide range of financial activities within the sugar industry. Czarnikow were interested in the relevance of Bayesian methodology to aspects of their services. The particular trading rule that we were asked to consider is confidential, and so we will present an example that is similar in spirit. All of the features of the actual case study carried out for the client have been preserved, namely the data, the beliefs, the modelling, the evaluations, and the displays. The single change that we have made is to replace the actual trading rule with an alternative for which the nature of the analysis is essentially the same. The problem that we shall consider is as follows. Sugar traders consult Czarnikow as to the best time to place their sugar on the market. A trader will have a certain amount of sugar to sell. Czarnikow are experts in the analysis of the sugar trade, based, in part, on their knowledge of such features as likely production over future sugar harvests and the political and economic factors that influence the rate at which sugar is likely to be released onto the international market.
The optimal solution of this problem is a complex exercise in Bayesian modelling, based on careful forecasting of future production and detailed assessment of the relationship between future sugar production and sugar price in order to get good posterior forecasts of future sugar prices, followed by, for example, backward induction based on maximising overall expected utility for the sugar trader to assess the optimal sale policy. While we cannot fully evaluate the benefits of the Bayesian approach without carrying out such an analysis, it is prudent to try to obtain some order of magnitude assessments of the potential benefits of the Bayesian approach before embarking on such a complicated and potentially expensive form of analysis. We shall therefore derive preliminary quantification of the potential benefits for sugar traders of using Bayesian methods to attempt to improve medium term forecasts for sugar prices. To help address this question, we have historical data on the closing price for sugar for each of 1560 days on the New York exchange. Table 1 shows the data, tabulated as 26 consecutive quarters (60 working days) of prices against which we can assess our procedures. We now describe how to carry out the viability analysis summarised in Section 2.6, using the numbering there.
3.1. Problem specification
V1 (i) Czarnikow were concerned to develop decision procedures which exploited their expert knowledge about the likely medium term future behaviour of sugar prices. For our example, we consider the price series Y for sugar over the next 60 days. Specifically, we denote by $Y_1, \ldots, Y_n$ the daily closing prices of a pound of white sugar on the New York exchange over a period of “one quarter” of n = 60 working days.
V1 (ii) We will suppose that the Bayesian analysis will focus strictly on reducing uncertainty for the price $X \equiv Y_{60}$ on the final day. We will therefore need to quantify the potential increase in trading profits which would be achieved by various levels of reduction of uncertainty for $Y_{60}$.
V1 (iii) We suppose that the trader intends to sell all of the sugar within the n = 60 days. If we do not carry out the Bayes analysis, then our default decision procedure $d_0$ is that the trader will sell out on the first day. Item (B) in Section 3.5 discusses other non-Bayesian default rules.
V1 (iv) Our alternative trading rule d for the viability analysis is as follows. Initially, Czarnikow make a forecast of $Y_{60}$, the sugar price which will be obtainable in three months time. If the prior expectation for $Y_{60}$ is lower than the initial sugar price $Y_0$, then all the sugar is sold immediately; otherwise, no sugar is sold. Each day, the current sugar price $Y_t$ is monitored and beliefs about $Y_{60}$ are updated by Bayes linear adjustment on $Y_1, \ldots, Y_t$ and our future posterior expectation $\tilde{E}[Y_{60}]$. If on day t the adjusted expectation for $Y_{60}$ is smaller than the observed value of $Y_t$, then all the sugar is sold; otherwise, all of the sugar is sold on day 60 at price $Y_{60}$. We have chosen this rule for simplicity of exposition, as it depends explicitly on only a single prior elicitation, and is straightforward to implement, so that we may derive a closed form expression for the viability analysis. Also, it would
Table 1 Daily sugar prices (in US cents per lb) on the number 11 exchange for 25 quarters Q1, Q2, . . . , Q25 (each of 60 working days); and estimates of the simple random walk mean and increment variance $\sigma_z^2$ for each quarter derived from the prices for the immediately preceding quarter. The quarter preceding the first quarter is not shown
Quarter: Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10 Q11 Q12 Q13 Q14 Q15 Q16 Q17 Q18 Q19 Q20 Q21 Q22 Q23 Q24 Q25
Mean: 12.17 12.08 12.66 14.60 14.56 11.58 10.97 10.92 12.53 11.26 11.82 10.37 10.64 11.04 11.67 12.30 11.23 8.43 8.41 7.62 8.42 5.10 5.52 6.84 6.12
$\sigma_z^2$: 0.02 0.05 0.03 0.04 0.05 0.09 0.08 0.03 0.04 0.02 0.04 0.02 0.01 0.02 0.01 0.02 0.01 0.03 0.03 0.04 0.02 0.03 0.05 0.02 0.02
Daily prices, one row per quarter starting with Q1:
11.84 11.95 11.96 11.85 11.05 11.28 11.21 11.20 10.56 10.57 10.88 10.91 10.97 11.01 10.87 10.77 11.21 11.66 11.68 11.77 11.65 11.29 11.53 11.50 11.65 11.82 11.72 11.94 11.83 12.26 12.17 11.98 11.86 11.97 12.23 12.21 11.91 12.17 11.93
12.14 11.57 11.61 11.54 11.49 11.86 11.67 11.67 11.26 11.32 11.47 11.52 11.54 11.70 12.05 12.09 12.02 11.88 11.90 11.86 11.78 11.72 11.72 11.82 11.68 11.70 11.48 11.54 11.93 11.88 11.80 11.82 11.91 11.82 11.90 11.81 11.89 11.85 12.11
12.73 12.57 12.58 12.53 12.68 12.54 12.51 12.69 12.61 12.52 12.42 12.45 12.41 12.47 12.42 12.15 12.37 12.29 12.57 12.82 12.72 12.79 12.78 12.70 12.71 12.70 12.86 12.89 12.83 12.80 12.90 13.21 13.18 13.19 13.27 13.15 13.15 13.26 13.64
15.01 15.13 14.85 14.94 14.79 14.96 15.02 15.11 14.91 14.89 15.17 15.39 15.61 15.61 15.74 15.45 15.42 15.18 15.21 14.26 14.02 14.42 14.67 14.78 15.13 14.93 14.80 14.53 14.46 14.37 14.37 14.12 14.25 14.44 14.35 13.74 13.86 13.92 14.07
14.30 13.87 13.88 13.89 14.11 13.95 14.10 14.18 14.18 14.20 14.48 14.23 14.32 14.28 14.39 14.27 13.88 14.04 14.01 13.70 13.37 13.28 12.65 12.60 12.75 12.82 13.09 13.24 12.99 12.82 12.89 13.02 13.35 11.76 11.58 11.61 11.81 11.71 11.78
11.88 11.93 11.79 11.46 11.57 11.60 11.66 11.65 11.85 12.02 12.10 12.62 12.81 12.83 12.35 12.31 12.03 10.79 10.77 10.25 10.34 10.38 9.72 9.87 9.69 9.72 9.78 10.05 10.03 10.25 10.73 10.61 10.54 10.55 10.65 10.50 10.58 10.56 10.75
10.81 10.75 10.76 10.86 10.79 10.77 10.81 10.81 10.75 10.83 10.96 11.18 11.22 11.23 11.04 11.25 11.52 11.70 11.29 10.25 10.39 10.41 10.55 10.47 10.53 10.43 10.69 10.66 10.62 10.61 10.52 10.56 10.68 10.67 10.68 10.94 10.81 10.60 10.53
10.97 11.30 11.38 11.29 11.32 11.26 11.44 11.38 11.46 11.37 11.40 11.43 11.29 11.38 11.50 11.54 11.58 11.54 11.56 11.60 11.83 11.84 11.72 11.89 11.94 11.73 11.79 11.67 10.87 11.09 11.30 11.49 11.56 11.76 11.67 11.72 12.19 12.48 12.36
11.65 11.70 11.72 11.73 11.74 11.70 12.16 12.09 12.16 12.23 12.17 12.41 12.46 12.48 12.24 12.25 12.07 11.90 11.86 11.68 11.79 11.55 11.54 11.77 11.89 11.91 11.58 11.63 11.74 11.62 11.67 11.51 11.54 11.15 11.09 11.11 11.06 11.09 10.88
11.01 10.95 11.02 11.21 11.47 11.42 11.35 11.60 11.58 11.44 11.80 11.74 11.68 11.67 11.67 11.54 11.59 11.75 12.03 11.84 11.91 12.17 12.61 12.43 11.32 11.38 11.39 11.69 11.67 11.64 11.68 11.77 11.85 11.65 11.79 11.66 11.65 11.51 11.45
11.85 11.85 11.65 11.74 11.73 11.73 11.78 12.18 12.12 12.01 12.00 12.03 12.05 11.96 11.89 11.64 11.57 11.43 11.40 11.42 11.19 11.30 11.17 11.21 11.11 11.04 11.47 10.78 10.89 10.85 10.93 10.88 10.82 10.78 10.88 10.81 10.86 10.73 10.70
10.35 10.46 10.69 10.68 10.59 10.65 10.69 10.75 10.66 10.66 10.27 10.30 10.31 10.40 10.30 10.35 10.42 10.46 10.73 10.56 10.77 10.86 10.71 10.69 10.75 10.73 10.89 10.99 11.00 10.94 11.08 11.07 10.96 10.87 10.72 10.64 10.49 10.51 10.62
10.78 10.92 11.25 11.33 11.18 10.92 10.80 10.90 11.05 11.32 10.84 10.88 10.86 10.93 10.97 11.03 10.92 10.77 10.76 10.90 10.92 10.88 10.75 10.90 10.83 10.92 10.86 10.81 10.81 10.79 11.00 10.96 11.00 11.00 11.00 11.04 11.31 11.37 11.33
11.11 11.05 11.01 10.97 10.98 11.05 11.15 11.06 11.11 11.16 11.18 11.21 11.17 11.34 11.48 11.51 11.41 11.45 11.45 11.37 11.26 11.49 11.42 11.36 11.34 11.36 11.37 11.06 11.13 11.08 11.15 11.11 10.92 11.00 11.24 11.05 11.18 11.06 11.06
11.58 11.38 11.53 11.51 11.49 11.55 11.86 11.82 11.84 11.69 11.60 11.79 11.66 11.79 11.72 11.68 11.60 11.70 11.60 11.61 11.54 11.51 11.50 11.40 11.56 11.49 11.32 11.29 11.31 11.22 11.17 11.05 10.98 10.76 10.61 10.89 11.04 11.17 11.69
12.39 12.43 12.29 12.32 12.25 12.25 12.18 12.35 12.27 12.34 12.34 12.31 12.26 12.31 12.00 12.03 12.15 12.22 12.24 12.49 12.45 12.38 12.38 12.44 12.35 12.33 12.25 12.10 12.14 12.17 11.93 12.16 12.31 12.29 12.24 12.21 12.29 12.36 12.30
11.20 11.12 10.89 10.71 10.78 10.88 10.97 11.01 11.00 10.75 10.58 10.59 10.37 9.82 9.73 9.94 9.62 9.64 9.98 9.85 9.81 9.73 9.73 9.50 9.39 9.34 9.25 9.40 9.44 9.63 9.78 9.85 9.88 9.83 9.93 9.83 9.80 9.77 9.86
8.50 8.47 8.49 8.53 8.96 9.05 9.14 9.24 9.35 9.18 9.24 9.09 8.95 8.85 8.96 8.99 8.89 8.56 8.34 8.26 8.28 8.08 8.26 8.25 8.13 8.07 7.86 7.55 7.36 7.71 7.45 7.59 7.60 7.98 7.99 7.97 7.89 7.84 7.89
8.44 8.53 8.52 8.84 8.76 8.73 8.71 8.85 8.83 8.94 9.06 9.05 8.66 8.71 8.54 8.59 8.58 8.53 8.49 8.19 7.93 7.93 7.53 7.50 7.56 7.55 7.38 7.70 7.54 7.59 7.35 7.21 7.00 7.10 7.03 7.15 7.31 6.95 6.67
7.75 7.66 7.74 7.75 7.63 7.55 7.76 7.70 7.71 7.60 7.87 8.05 8.02 7.95 7.88 8.05 8.38 8.38 8.37 8.35 8.49 8.46 8.33 8.48 8.24 8.03 8.13 8.23 8.31 8.30 8.29 8.24 8.10 8.20 8.24 8.25 8.19 8.12 8.12
7.69 7.52 7.47 7.10 6.97 6.73 6.65 6.92 7.11 6.76 6.92 6.81 6.69 6.71 6.84 6.83 6.89 6.67 6.75 6.63 6.65 6.68 6.75 6.81 6.79 6.73 6.87 6.34 6.02 5.97 5.85 5.50 5.72 5.72 5.81 6.04 5.91 5.90 5.86
5.07 5.13 5.10 5.14 5.05 4.78 4.82 4.80 4.72 4.58 4.90 5.34 4.50 4.61 4.82 4.77 4.70 4.88 4.82 4.82 4.76 4.90 4.63 4.62 4.72 4.85 4.89 4.82 4.78 4.79 4.77 5.00 5.24 5.49 5.30 5.34 5.45 5.41 5.26
5.40 5.31 5.42 5.52 5.47 5.54 5.77 5.62 5.84 5.79 5.73 5.76 6.04 6.14 5.98 6.03 6.04 6.11 6.20 5.99 6.06 6.13 6.08 5.88 6.01 6.05 6.00 6.08 6.10 6.09 6.01 6.09 6.10 6.07 6.16 6.70 6.82 6.65 6.78
6.68 6.72 6.72 6.89 6.88 6.81 6.81 6.69 6.56 6.46 6.69 6.94 6.92 6.88 7.25 6.89 6.91 7.01 6.88 7.09 6.94 6.96 6.97 7.00 6.88 6.90 6.98 6.81 6.78 6.71 6.56 6.23 6.04 5.91 5.91 6.03 6.07 6.06 5.78
6.10 5.77 5.81 5.77 5.84 5.83 5.67 5.45 5.49 5.45 5.29 5.49 5.56 5.43 5.29 5.32 5.40 5.41 5.47 5.42 5.39 5.50 5.61 5.79 5.66 5.53 5.49 5.52 5.49 5.21 5.13 4.95 5.07 5.03 5.14 5.15 5.12 4.94 4.70
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
11.85 12.01 11.79 12.04 12.19 12.02 12.09 12.00 12.14 12.46 12.42 12.44 12.36 12.37 12.33 12.40 12.36 12.06 12.10 11.83 12.08
12.18 12.11 12.03 12.00 12.11 11.96 12.06 12.12 12.14 12.05 12.04 11.90 12.38 12.28 12.26 12.30 12.54 12.75 12.82 12.65 12.66
13.66 13.64 13.58 13.74 13.80 13.98 14.59 14.49 15.16 14.70 14.71 14.44 14.29 14.40 14.71 14.61 14.46 14.33 14.41 14.71 14.60
14.25 14.28 14.52 14.40 14.58 14.80 14.79 14.65 14.98 15.05 14.75 14.70 14.73 14.66 14.59 14.55 14.35 14.15 14.34 14.39 14.56
11.54 11.56 11.18 11.47 11.56 11.88 11.86 11.85 11.61 11.60 11.57 11.77 11.67 11.40 11.22 11.72 11.61 11.43 11.52 11.78 11.58
10.70 10.94 10.79 11.01 11.04 11.14 11.15 11.00 10.99 11.08 11.45 11.46 11.34 11.58 11.39 11.30 10.71 10.84 10.63 10.86 10.97
10.55 10.56 10.64 10.68 10.77 10.85 10.87 10.75 10.58 10.64 10.74 10.74 10.83 10.93 10.84 10.89 10.79 10.86 10.97 10.81 10.92
12.15 11.93 12.04 12.19 12.00 12.02 12.12 12.01 12.29 12.20 12.31 12.49 12.66 12.62 12.59 12.63 12.61 12.80 13.03 13.16 12.53
10.54 10.50 10.51 10.33 10.40 10.51 10.59 10.62 10.87 10.78 10.76 10.80 10.91 10.93 11.05 11.42 11.50 11.39 11.38 11.22 11.26
11.69 11.73 11.90 11.88 11.87 11.70 11.78 11.71 11.73 11.64 11.48 11.37 11.48 11.59 11.88 11.81 11.66 11.64 11.72 11.81 11.82
10.77 10.64 10.57 10.63 10.65 10.66 10.70 10.54 10.49 10.47 10.30 10.36 10.35 10.33 10.40 10.48 10.53 10.55 10.38 10.48 10.37
10.57 10.52 10.35 10.17 10.17 10.15 10.24 10.44 10.41 10.33 10.39 10.45 10.42 10.44 10.58 10.66 10.60 10.51 10.63 10.66 10.64
11.24 11.13 10.95 10.96 11.29 11.24 11.27 11.53 11.46 11.55 11.49 11.32 11.14 10.70 10.80 10.79 10.88 10.76 10.84 10.88 11.04
11.11 11.06 11.09 11.12 10.97 11.30 11.16 11.28 11.26 11.59 11.49 11.56 11.63 11.70 11.66 11.58 11.68 11.65 11.58 11.61 11.67
11.87 11.90 11.81 11.79 11.90 11.97 11.94 11.85 11.80 11.85 11.78 11.68 11.85 11.74 11.73 11.73 11.81 11.89 11.78 12.00 12.30
12.22 12.02 11.96 11.82 11.86 11.79 11.46 11.51 11.54 11.17 11.20 11.12 11.28 11.27 11.23 11.18 11.03 11.23 11.23 11.23 11.23
10.19 9.95 9.94 9.91 9.87 9.78 9.51 9.53 9.30 9.43 9.35 9.11 9.02 9.05 9.15 9.19 9.15 9.15 9.04 8.86 8.43
8.22 8.63 8.74 8.46 8.40 8.58 8.54 8.40 8.34 8.39 8.64 8.76 8.67 8.66 8.71 8.95 9.00 8.51 8.45 8.63 8.41
6.91 7.19 7.10 6.98 7.04 6.93 7.04 7.13 7.62 7.35 7.36 7.27 7.48 7.27 7.60 7.66 7.66 7.78 7.72 7.90 7.62
8.09 7.78 7.58 7.52 7.58 7.53 7.62 7.58 7.83 7.88 7.86 8.18 8.74 8.79 8.80 8.71 8.75 8.67 8.70 8.55 8.42
5.82 5.79 5.72 5.54 5.59 5.60 5.47 5.56 5.70 5.74 5.82 5.91 5.59 5.62 5.63 5.50 5.44 5.43 5.25 5.14 5.10
5.26 5.41 5.34 5.29 5.29 5.52 5.85 5.82 5.98 5.92 5.95 6.09 5.95 5.72 5.60 6.29 6.13 5.74 5.37 5.64 5.52
6.90 7.09 7.06 6.97 6.70 6.79 6.70 6.81 6.77 6.91 6.93 6.79 6.95 6.98 6.94 6.61 6.68 6.99 6.99 6.91 6.84
5.76 5.75 5.98 6.23 6.07 6.03 6.10 6.03 6.02 5.88 5.74 5.75 5.88 5.96 5.92 5.89 5.88 5.92 5.92 6.14 6.12
4.65 5.08 4.96 5.05 5.04 5.07 5.12 5.16 5.22 5.42 5.28 5.25 5.34 5.27 5.13 5.20 5.24 5.45 5.38 5.47 5.41
40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
be simple to explain and motivate to a trader, as it is based on a very simple heuristic, namely to hold if the price is expected to rise and sell if the price is expected to fall. Further, this rule is not dependent on deep modelling of the future form of the series and so, if viability can be demonstrated for this procedure, then we may be fairly confident that the benefits from whichever trading rule emerges from a full Bayesian analysis should prove at least as substantial. Item (C) of Section 3.5 provides a brief discussion of more elaborate rules. This completes the problem specification V1.

3.2. Belief specification

V2 (i) Data analysis for many 60-day historical sequences suggested a simple random walk for daily sugar price. Whether the random walk model is appropriate for modelling sugar prices is a controversial question, but the model does give a simple basis for simulation, sufficient to demonstrate the general approach. Further discussion and comment on the appropriateness of this simple model can be found in item (A) in Section 3.5. We could similarly construct a viability analysis for any alternative stochastic model for $Y_t$. We model the $Y_t$ as a random walk

$$Y_t = \mu + \sum_{i=1}^{t} Z_i, \qquad t = 1, \ldots, n, \tag{14}$$

where the $Z_i$ are uncorrelated, zero-mean random quantities, each with variance $\sigma_z^2$. Our prior beliefs about $\mu$ (equivalently $Y_0$), $\sigma_z^2$ (reflecting market volatility) and the random walk model for $Y_1, \ldots, Y_n$ may be assessed from historical data.

V2 (ii) Our prior beliefs for $X \equiv Y_{60}$ follow from the random walk model in (14); in particular, $E[Y_n] = \mu$ and $\mathrm{Var}[Y_n] = n\sigma_z^2$. Following the discussion in Section 2, $\widehat{E}[Y_n]$ is our (random) posterior expectation for $Y_n$. To simplify subsequent expressions, we will write $\widehat{Y}_n$ for $\widehat{E}[Y_n]$. The stochastic relationship resulting from the temporal sure preference principle in Section 2 becomes

$$Y_n = \widehat{Y}_n + R_n, \tag{15}$$

where $E[R_n] = 0$ and $\mathrm{Cov}[\widehat{Y}_n, R_n] = 0$. Hence,

$$E[\widehat{Y}_n] = E[Y_n] = \mu \quad \text{and} \quad \mathrm{Var}[\widehat{Y}_n] = \mathrm{Var}[Y_n] - \mathrm{Var}[R_n] = n\sigma_z^2 - \mathrm{EPV}[Y_n].$$

Denoting the variance ratio $R[Y_n] = \mathrm{EPV}[Y_n]/\mathrm{Var}[Y_n]$ in (11) by $R$, we can write $\mathrm{EPV}[Y_n] = R\,n\sigma_z^2$ and hence

$$\mathrm{Var}[\widehat{Y}_n] = (1 - R)\,n\sigma_z^2.$$

Thus, for each combination of $\mu$ and $\sigma_z^2$, the viability function will be determined by $R \in (0, 1)$.

V2 (iii) We now construct the residual vector $\mathbf{R}_Y = \mathbf{Y} - E_{Y_n}[\mathbf{Y}]$, since the mean and variance assessments of the $Y_t$ are required for the trading rule $d$. It follows from the random walk model in (14) that

$$E_{Y_n}[Y_t] = \mu + \mathrm{Cov}[Y_t, Y_n]\,\mathrm{Var}[Y_n]^{-1}(Y_n - \mu) = q_t\mu + p_t Y_n, \tag{16}$$

where $p_t = t/n$ and $q_t = 1 - p_t$. Hence,

$$E_{Y_n}[\mathbf{Y}] = \mathbf{q}_n\mu + \mathbf{p}_n Y_n, \tag{17}$$

where $\mathbf{p}_n = (p_1, \ldots, p_n)^T$ and $\mathbf{q}_n = (q_1, \ldots, q_n)^T$. Therefore, as $\mathbf{Y} = E_{Y_n}[\mathbf{Y}] + \mathbf{R}_Y = \mathbf{q}_n\mu + \mathbf{p}_n Y_n + \mathbf{R}_Y$ and $\mathrm{EPV}[\mathbf{R}_Y] = \mathrm{Var}[\mathbf{R}_Y]$, it follows that

$$\mathrm{EPV}[\mathbf{Y}] = \mathrm{Var}[\mathbf{R}_Y] + \mathbf{p}_n\mathbf{p}_n^T\,\mathrm{EPV}[Y_n] = \mathrm{Var}[\mathbf{R}_Y] + \mathbf{p}_n\mathbf{p}_n^T\,nR\sigma_z^2 \tag{18}$$
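and

$$\mathrm{Var}[\mathbf{R}_Y] = \begin{pmatrix} K C C^T K^T & \mathbf{0} \\ \mathbf{0}^T & 0 \end{pmatrix} \sigma_z^2,$$

where $K = [I_{n-1} \mid -\mathbf{p}_{n-1}/n]$, $C$ is the $n \times n$ cumulative sum matrix, $I_{n-1}$ is the $(n-1) \times (n-1)$ identity matrix and $\mathbf{0}$ is the $(n-1)$-vector of zeros. As $E[\mathbf{R}_Y] = \mathbf{0}$, we see that $E[\widehat{E}[\mathbf{Y}]] = E[\mathbf{Y}]$ and, since $\mathrm{Var}[\mathbf{Y}] = \mathrm{Var}[\mathbf{R}_Y] + \mathbf{p}_n\mathbf{p}_n^T\,n\sigma_z^2$,

$$\mathrm{Var}[\widehat{E}[\mathbf{Y}]] = \mathrm{Var}[\mathbf{Y}] - \mathrm{EPV}[\mathbf{Y}] = \mathbf{p}_n\mathbf{p}_n^T\,n(1 - R)\sigma_z^2 = \mathrm{Cov}[\mathbf{Y}, \widehat{E}[\mathbf{Y}]]. \tag{19}$$

V2 (iv) It is straightforward to show from (18) and (19) that

$$E_{\mathbf{Y}}[\widehat{Y}_n] = E_{Y_n}[\widehat{Y}_n] = (1 - R)Y_n + R\mu, \tag{20}$$

$$\mathrm{Var}_{\mathbf{Y}}[\widehat{Y}_n] = \mathrm{Var}_{Y_n}[\widehat{Y}_n] = nR(1 - R)\sigma_z^2. \tag{21}$$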
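As a quick numerical illustration of (20) and (21), the sketch below carries out the Bayes linear adjustment of the posterior expectation of $Y_n$ by the whole price vector. It assumes the second-order specification $\mathrm{Cov}[Y_s, Y_t] = \min(s,t)\sigma_z^2$, $\mathrm{Var}[\widehat{Y}_n] = (1-R)n\sigma_z^2$ and $\mathrm{Cov}[\widehat{Y}_n, Y_t] = t(1-R)\sigma_z^2$, the last being the value implied by (24). The function name and the numerical values of `n`, `sigma2_z` and `R` are illustrative choices, not taken from the paper.

```python
import numpy as np

def adjust_yhat_by_prices(n, sigma2_z, R):
    """Bayes linear adjustment of Yhat_n by the price vector Y = (Y_1..Y_n).

    Returns the weights on (Y - mu*1) and the adjusted variance of Yhat_n.
    """
    t = np.arange(1, n + 1)
    var_Y = sigma2_z * np.minimum.outer(t, t)   # Var[Y] = C C^T sigma_z^2
    cov_Yhat_Y = (1.0 - R) * sigma2_z * t       # assumed Cov[Yhat_n, Y_t]
    var_Yhat = (1.0 - R) * n * sigma2_z         # Var[Yhat_n]
    weights = np.linalg.solve(var_Y, cov_Yhat_Y)
    adjusted_var = var_Yhat - cov_Yhat_Y @ weights
    return weights, adjusted_var

n, sigma2_z, R = 60, 0.04, 0.6                  # illustrative values only
weights, adjusted_var = adjust_yhat_by_prices(n, sigma2_z, R)
# Only the final price should carry weight, with coefficient 1 - R,
# and the adjusted variance should equal n R (1 - R) sigma_z^2.
print(weights[-1], adjusted_var, n * R * (1 - R) * sigma2_z)
```

Under these assumptions all weight falls on $Y_n$, which is why adjusting by $\mathbf{Y}$ and adjusting by $Y_n$ alone give the same result.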
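We can extend these results to the adjusted expectation and variance for $\widehat{E}[\mathbf{Y}]$ given $\mathbf{Y}$ using (9) and (10) with $E[\mathbf{Y}] = \mu\mathbf{1}$, $R(\mathbf{Y})$ in (11) determined from (18), $\mathrm{Var}[\mathbf{Y}] = CC^T\sigma_z^2$ and $\mathrm{Cov}[\mathbf{Y}, \widehat{E}[\mathbf{Y}]]$ given in (19).

V2 (v) The Bayesian trading rule $d = d(\widehat{Y}_n, \mathbf{Y}_{[t]})$, to sell or hold the sugar stock on day $t$, will be based on the Bayes linear forecast

$$E_{(\widehat{Y}_n, \mathbf{Y}_{[t]})}[Y_n] \tag{22}$$

of the final price $Y_n$ adjusted by $\widehat{Y}_n$ and the prices $\mathbf{Y}_{[t]} = (Y_1, \ldots, Y_t)^T$ observed up to and including day $t$. In our example, $d$ is such that we sell on day $t$ if

$$Y_1 < E_{(\widehat{Y}_n, \mathbf{Y}_{[1]})}[Y_n], \quad Y_2 < E_{(\widehat{Y}_n, \mathbf{Y}_{[2]})}[Y_n], \quad \ldots, \quad Y_{t-1} < E_{(\widehat{Y}_n, \mathbf{Y}_{[t-1]})}[Y_n] \quad \text{and} \quad Y_t \ge E_{(\widehat{Y}_n, \mathbf{Y}_{[t]})}[Y_n];$$

otherwise, we hold the current stock over until the next day. Note that if we have not sold by day $n-1$ we must sell at $Y_n$ on day $n$. We now write

$$Z_i = E_{Y_n}[Z_i] + R_{Z_i}, \tag{23}$$

where $R_{Z_i}$ and $Y_n$ are uncorrelated and $R_{Z_i}$ and $\widehat{Y}_n$ are uncorrelated. It then follows that

$$\mathrm{Cov}[Z_i, \widehat{Y}_n] = \mathrm{Cov}[E_{Y_n}[Z_i], \widehat{Y}_n] = \mathrm{Cov}[Z_i, Y_n]\,\mathrm{Var}[Y_n]^{-1}\,\mathrm{Var}[\widehat{Y}_n] = (1 - R)\sigma_z^2. \tag{24}$$

We now compute (22). The following adjusted expectations, variances and covariances of $Z_1, \ldots, Z_n$ can be straightforwardly derived using (24):

$$E_{\widehat{Y}_n}[Z_i] = (\widehat{Y}_n - \mu)/n, \qquad \mathrm{Cov}_{\widehat{Y}_n}[Z_i, Z_j] = (\delta_{ij} - \lambda_n)\sigma_z^2,$$

where $\lambda_n = (1 - R)/n$ and $\delta_{ij} = 1$ if $i = j$ and 0 otherwise. As the $Y_t$ are partial sums of the $Z_i$, we readily deduce that

$$E_{\widehat{Y}_n}[Y_t] = q_t\mu + p_t\widehat{Y}_n, \qquad \mathrm{Cov}_{\widehat{Y}_n}[Y_s, Y_t] = \min(s, t)\,[1 - \lambda_n\max(s, t)]\,\sigma_z^2.$$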
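It is then straightforward to show, taking the larger index in the adjusted covariance to be $n$ and noting that $1 - n(1 - R)/n = R$, that

$$\mathrm{Cov}_{\widehat{Y}_n}[Y_n, \mathbf{Y}_{[t]}] = R\,\sigma_z^2\,\mathbf{a}_t^T,$$

$$\sigma_z^2\,[\mathrm{Var}_{\widehat{Y}_n}[\mathbf{Y}_{[t]}]]^{-1} = A_t^{-1} - A_t^{-1}\mathbf{a}_t\,(\mathbf{a}_t^T A_t^{-1}\mathbf{a}_t - n(1 - R)^{-1})^{-1}\,\mathbf{a}_t^T A_t^{-1},$$

where $\mathbf{a}_t = (1, 2, \ldots, t)^T$ and $A_t^{-1} = \Delta_t\Delta_t^T - \mathbf{e}_t\mathbf{e}_t^T$, where the $t \times (t + 1)$ matrix $\Delta_t$ takes successive first-differences of the elements of any vector of length $t + 1$, and $\mathbf{e}_t = (0, \ldots, 0, 1)^T$. These are all of the results needed to calculate the expressions for the adjusted expectation of $Y_n$ given in (25) and the adjusted variance of $Y_n$ given in (26) required for the Bayes trading rule $d$ given in (27) below. The adjusted expectation $E_{(\widehat{Y}_n, \mathbf{Y}_{[t]})}[Y_n]$ and corresponding variance $\mathrm{Var}_{(\widehat{Y}_n, \mathbf{Y}_{[t]})}[Y_n]$ are given by

$$E_{(\widehat{Y}_n, \mathbf{Y}_{[t]})}[Y_n] = \mu + \frac{q_t(\widehat{Y}_n - \mu) + R(Y_t - \mu)}{q_t + Rp_t}, \tag{25}$$

$$\mathrm{Var}_{(\widehat{Y}_n, \mathbf{Y}_{[t]})}[Y_n] = \frac{Rq_t}{q_t + Rp_t}\,n\sigma_z^2. \tag{26}$$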
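The closed forms (25) and (26) can be checked numerically against a direct Bayes linear adjustment of $Y_n$ by the posterior expectation of $Y_n$ together with the observed prices. The sketch below is one way to do this under the same second-order assumptions as the text: $\mathrm{Cov}[Y_s, Y_t] = \min(s,t)\sigma_z^2$, $\mathrm{Var}[\widehat{Y}_n] = \mathrm{Cov}[Y_n, \widehat{Y}_n] = (1-R)n\sigma_z^2$ and $\mathrm{Cov}[\widehat{Y}_n, Y_t] = t(1-R)\sigma_z^2$. The function names, the example prices and the parameter values are hypothetical.

```python
import numpy as np

def forecast_closed_form(yhat_n, y_t, t, n, mu, sigma2_z, R):
    """Adjusted mean and variance of Y_n from (25) and (26)."""
    p_t = t / n
    q_t = 1.0 - p_t
    denom = q_t + R * p_t
    mean = mu + (q_t * (yhat_n - mu) + R * (y_t - mu)) / denom
    var = R * q_t * n * sigma2_z / denom
    return mean, var

def forecast_direct(yhat_n, y_obs, n, mu, sigma2_z, R):
    """Adjust Y_n by D = (Yhat_n, Y_1..Y_t) via the usual linear-fit formulae."""
    t = len(y_obs)
    a = np.arange(1, t + 1)
    var_D = np.empty((t + 1, t + 1))
    var_D[0, 0] = (1 - R) * n * sigma2_z                     # Var[Yhat_n]
    var_D[0, 1:] = var_D[1:, 0] = (1 - R) * sigma2_z * a     # Cov[Yhat_n, Y_s]
    var_D[1:, 1:] = sigma2_z * np.minimum.outer(a, a)        # Cov[Y_s, Y_u]
    cov = np.concatenate(([(1 - R) * n * sigma2_z], sigma2_z * a))  # Cov[Y_n, D]
    w = np.linalg.solve(var_D, cov)
    d = np.concatenate(([yhat_n], y_obs)) - mu
    return mu + w @ d, n * sigma2_z - cov @ w

# Hypothetical inputs: five observed prices out of a 60-day quarter, 0 < R < 1.
n, mu, sigma2_z, R = 60, 10.0, 0.04, 0.6
y_obs = np.array([10.1, 9.9, 10.2, 10.4, 10.3])
yhat_n = 10.8
closed = forecast_closed_form(yhat_n, y_obs[-1], len(y_obs), n, mu, sigma2_z, R)
direct = forecast_direct(yhat_n, y_obs, n, mu, sigma2_z, R)
print(closed, direct)
```

The two routes should agree, and the direct route also shows that only the most recent price $Y_t$ receives non-zero weight among the observed prices.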
Re-arranging (25), we sell at $Y_t$ on day $t$ if we have not sold previously and

$$\widehat{Y}_n \le R\mu + (1 - R)Y_t; \tag{27}$$

otherwise, we hold. We note that $d$ only depends on $\mathbf{Y}_{[t]}$ through $Y_t$. Item (D) in Section 3.5 discusses this result and the role of sufficiency as applied to the random walk model. We have now completed the belief specification V2.

3.3. Viability analysis

Our simulation procedure is as follows.

V3 (i) We fix a value of $R$. We simulate a series $y = (y_1, \ldots, y_n)$ of daily prices at times $t = 1, \ldots, n$ from the random walk model with the values of $\mu$ and $\sigma_z^2$ simulated from a prior distribution assessed either subjectively or using historical data.

V3 (ii) The gain from the default decision rule $d_0$ is $y_1$.

V3 (iii) It is readily seen from (27) that the probability distribution for the actual selling price $SP$ is given by

$$\Pr[SP = y_t] = \begin{cases} \Pr[W \le y_1], & t = 1, \\ \Pr[\max\{y_1, \ldots, y_{t-1}\} < W \le y_t], & 2 \le t \le n - 1, \\ \Pr[W > \max\{y_1, \ldots, y_{n-1}\}], & t = n, \end{cases} \tag{28}$$

where $W = (\widehat{Y}_n - R\mu)/(1 - R)$. Notice that some of the probabilities in (28) can be zero. It is straightforward to show from (20) that the expectation and variance of $W$ adjusted by the observed value $Y_n = y_n$ are $E_{Y_n}[W] = y_n$ and $\mathrm{Var}_{Y_n}[W] = nR\sigma_z^2/(1 - R)$; and if we assume a particular distribution for $\widehat{Y}_n$, such as Gaussian, we may evaluate the probability distribution in (28).

The gain using $d$ compared with $d_0$ is computed as

$$G_y(R) = \sum_{t=1}^{n} y_t \Pr[SP = y_t] - y_1, \qquad R \in [0, 1], \tag{29}$$

which is the viability of the simple Bayes selling rule for a particular value of $R$ for this $y$, using whatever values of $\mu$ and $\sigma_z^2$ were chosen for the simulation. When $R$ is "small", we expect the rule to produce better gains over the strategy of selling on the first day; for example, $R = 0$ corresponds to knowing $Y_n$, while $R = 1$ corresponds to receiving no additional information about $Y_n$ beyond that implied by the prior mean and variance structure of the assumed random walk model for daily prices.
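The selling price distribution (28) and the gain (29) can be sketched for a single price path as follows. The sketch assumes, as one of the options mentioned in the text, that $W$ is Gaussian given $Y_n = y_n$, with mean $y_n$ and variance $nR\sigma_z^2/(1-R)$, and it is restricted to $0 < R < 1$ so that this variance is finite and positive. The function names and the four-day example path are hypothetical.

```python
import math
import numpy as np

def normal_cdf(x, mean, sd):
    """Gaussian distribution function, using only the standard library."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def selling_price_distribution(y, sigma2_z, R):
    """Pr[SP = y_t] for t = 1..n, following (28) with Gaussian W (0 < R < 1)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    sd_w = math.sqrt(n * R * sigma2_z / (1.0 - R))
    # Running maximum max{y_1..y_t} for t = 1..n-1; a sale on day t needs
    # W to lie between the previous maximum and y_t.
    running_max = np.maximum.accumulate(y[:-1])
    cdf_at_max = np.array([normal_cdf(m, y[-1], sd_w) for m in running_max])
    probs = np.empty(n)
    probs[0] = cdf_at_max[0]                 # Pr[W <= y_1]
    probs[1:n - 1] = np.diff(cdf_at_max)     # zero when y_t sets no new maximum
    probs[n - 1] = 1.0 - cdf_at_max[-1]      # forced sale on the last day
    return probs

def gain(y, sigma2_z, R):
    """G_y(R) in (29): expected selling price minus the first-day price."""
    y = np.asarray(y, dtype=float)
    return float(selling_price_distribution(y, sigma2_z, R) @ y - y[0])

# Hypothetical four-day path, for illustration only.
y_example = [10.0, 9.0, 11.0, 12.0]
probs = selling_price_distribution(y_example, sigma2_z=0.04, R=0.5)
g = gain(y_example, sigma2_z=0.04, R=0.5)
print(probs, g)
```

Writing the middle probabilities as differences of the distribution function at the running maximum handles automatically the days on which a sale is impossible because the price does not exceed an earlier price.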
[Figure 1 (plot not reproduced). Title: Gain distribution estimate from 1000 simulated quarters. Vertical axis: expected value and quantiles of gain distribution in cents per lb. Horizontal axis: relative expected posterior variance R.]
Fig. 1. Viability function: expected value (solid) and selected quantiles—5% (dot-dash), 25% (dot), 50% (dash), 75% (dot), 95% (dot-dash)—of the selling price distribution of d less the first day price Y1 of a quarter (in cents per lb) as a function of the relative expected posterior variance R of the last day price Y60 , estimated by pooling the corresponding distributions for 1000 simulated quarters. Note that the 5%, 25% and 50% quantiles coincide for R ∈ [0, 0.3] as do the 25% and 50% quantiles for R ∈ [0.3, 1.00].
V3 (iv) We now repeat items (i), (ii) and (iii) a large number of times to obtain the average $G(R)$ of $G_y(R)$ over the simulated values of $y$, $\mu$ and $\sigma_z^2$.

V3 (v) We now repeat (i)–(iv) to obtain the viability function $G(R)$ for a grid of values of $R$.

V3 (vi) We may supplement the simulation analysis by repeating the above calculations using a collection of historical time series.

We have now described all of the calculations required to carry out the viability analysis V3.

3.4. Viability analysis for the given data

We now apply the foregoing theoretical development using historical data on sugar prices to assess the viability of the simple Bayes trading rule determined by (27). The historical data, which are shown in Table 1, are daily closing prices for sugar (in US cents per lb) on the New York exchange for $m = 25$ quarters, each of 60 working days. We use these data to illustrate both simulation and historical data viability analysis, as described in Sections 2.2 and 2.4, respectively.

Viability based on simulated prices: We first generated $m = 1000$ simulated random walk price series of length $n = 60$ as follows:

(a) The mean $\mu$ was drawn from a Gaussian distribution with expectation 10 and variance 9 and then the precision $1/\sigma_z^2$ was drawn independently from a Gamma distribution with shape 1.7 and rate 0.025. This joint prior distribution was assessed using the data in Table 1. Item (E) in Section 3.5 gives more detail on this assessment.

(b) The series was then generated from (14) by simulating $Z_1, \ldots, Z_{60}$ independently from a Gaussian distribution with expectation zero and the variance $\sigma_z^2$ generated in (a), and then adding the $\mu$ generated in (a) to each of the 60 partial sums.

(c) Next, the $SP$ distribution in (28) for $d$ was computed for each of the $m = 1000$ simulated quarters. Fig. 1 shows the expectation (solid line) and selected quantiles of the overall $SP$ distribution obtained by pooling the distributions in (28) generated from the 1000 simulated quarters.
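Steps (a) to (c) can be sketched as a small Monte Carlo routine that estimates the expectation curve $G(R)$ on a grid of $R$ values. The sketch below follows the stated prior (Gaussian mean with expectation 10 and variance 9, Gamma precision with shape 1.7 and rate 0.025) and assumes a Gaussian $W$ when evaluating (28); it computes only the average gain, not the quantile curves, and is not the authors' code. The function names, the grid, the seed and the reduced number of simulated quarters are illustrative choices. Note that NumPy parameterises the Gamma distribution by scale, so rate 0.025 becomes scale 1/0.025.

```python
import math
import numpy as np

def expected_gain(y, sigma2_z, R):
    """G_y(R) in (29) for one path, with Gaussian W given Y_n = y_n (0 < R < 1)."""
    n = len(y)
    sd_w = math.sqrt(n * R * sigma2_z / (1.0 - R))
    running_max = np.maximum.accumulate(y[:-1])
    cdf = np.array([0.5 * (1.0 + math.erf((m - y[-1]) / (sd_w * math.sqrt(2.0))))
                    for m in running_max])
    probs = np.concatenate(([cdf[0]], np.diff(cdf), [1.0 - cdf[-1]]))
    return float(probs @ y - y[0])

def viability_function(R_grid, m=1000, n=60, seed=0):
    """Average gain G(R) over m simulated quarters, for each R in R_grid."""
    rng = np.random.default_rng(seed)
    total = np.zeros(len(R_grid))
    for _ in range(m):
        mu = rng.normal(10.0, 3.0)                          # (a) variance 9
        precision = rng.gamma(shape=1.7, scale=1.0 / 0.025) # (a) rate 0.025
        sigma2_z = 1.0 / precision
        z = rng.normal(0.0, math.sqrt(sigma2_z), size=n)    # (b) increments
        y = mu + np.cumsum(z)                               # (b) prices from (14)
        # (c) the same simulated quarter is reused for every R on the grid
        total += np.array([expected_gain(y, sigma2_z, R) for R in R_grid])
    return total / m

R_grid = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
G = viability_function(R_grid, m=200)   # fewer quarters than the paper, for speed
print(dict(zip(R_grid.tolist(), G.round(3).tolist())))
```

Reusing each simulated quarter across the whole grid of $R$ values is a design choice that reduces the Monte Carlo noise in comparisons between different values of $R$.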
Note the following points.
[Figure 2 (plot not reproduced). Title: Estimate of viability distribution from 25 quarters. Vertical axis: expected value and quantiles of gain distribution in cents per lb. Horizontal axis: relative expected posterior variance R.]
Fig. 2. Viability function: expected value (solid) and selected quantiles, 5% (dot-dash), 25% (dot), 50% (dash), 75% (dot), 95% (dot-dash), of the selling price distribution less the first day price $Y_1$ of a quarter (in cents per lb) as a function of the relative expected posterior variance $R$ of the last day price $Y_{60}$, pooled over the corresponding distributions for the 25 quarters. Quantiles coincide similarly to those in Fig. 1.
(i) The expectation curve, the average of the 1000 gain curves, is our prior viability assessment of $d$; and, as we would expect, it decreases as $R$ increases. This curve indicates how much we expect to gain over the first day price $Y_1$ as a function of our ability to assess our uncertainty about $Y_{60}$; for example, reducing our uncertainty by 40% ($R = 0.6$) leads to a substantial average gain of about one-third of a cent per lb of sugar.

(ii) The distribution is skewed towards large values for all values of $R$, but more so for small values of $R$, as we would expect.

(iii) The 5%, 25% and 50% quantile curves coincide for some intervals of $R$: 5%, 25% and 50% in [0, 0.3] and 25% and 50% in [0.3, 1.00].

(iv) The 50% quantile is zero for all $R$, as it should be (at least approximately for the simulation) because for a random walk the probability that $\widehat{Y}_{60}$ is less than the starting price $Y_1$ (so that we sell immediately without gain) must be 0.5. Also, the probability of non-negative gain will decrease from 1 at $R = 0$ (no uncertainty about $Y_{60}$) to 0.5 at $R = 1$, while the probability of negative gain increases from 0 at $R = 0$ to 0.25 at $R = 1$.

(d) We may view the distribution summaries in Fig. 1 as an approximation to those for an infinite number of such quarters.

Viability based on historical prices: We now give a brief account of an analysis described in Section 2.4 (i). Standard analyses suggest that a random walk model is a plausible description for the sugar price series shown in Table 1. Also shown in Table 1 are the estimates $\hat{\mu} = y_{60}$ and $\hat{\sigma}_z^2 = \sum_{t=2}^{60}(y_t - y_{t-1})^2/59$ of the simple random walk mean $\mu$ and volatility variance $\sigma_z^2$ for each quarter, based on the prices $y_1, \ldots, y_{60}$ for the immediately preceding quarter. Prices for the quarter preceding the first quarter (Q1), not shown in Table 1, are used solely to estimate $\mu$ and volatility variance $\sigma_z^2$ for Q1. These estimates are taken to be our prior assessments for $\mu$ and $\sigma_z^2$ for each of the $m = 25$ quarters. Fig. 2 shows selected quantiles and the expectation (solid line) of the overall $SP$ distribution obtained by averaging the distribution in (28) for each quarter over the 25 quarters. Although less smooth, the plot tells essentially the "same story" as the corresponding plot in Fig. 1 based on simulation, providing further support for the appropriateness of the viability analysis.
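The per-quarter prior assessments used in the historical analysis, namely the last price of the preceding quarter for the mean and the average squared successive difference for the volatility variance, can be computed as in the short sketch below. The function name is hypothetical and the four prices are made-up values for illustration, not entries from Table 1.

```python
import numpy as np

def prior_estimates(prev_quarter_prices):
    """Estimates (mu_hat, sigma2_hat) from the preceding quarter's prices.

    mu_hat is the final price; sigma2_hat is the sum of squared successive
    differences divided by the number of differences (59 for a 60-day quarter).
    """
    y = np.asarray(prev_quarter_prices, dtype=float)
    mu_hat = y[-1]
    sigma2_hat = np.sum(np.diff(y) ** 2) / (len(y) - 1)
    return mu_hat, sigma2_hat

# Made-up short series, for illustration only.
mu_hat, sigma2_hat = prior_estimates([10.0, 10.2, 10.1, 10.4])
print(mu_hat, sigma2_hat)
```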
3.5. Further notes on the example

(A) We have chosen a random walk model with constant variance (as an approximation to slowly varying volatility) for price rather than for the (often preferred) logarithm of price, primarily because our data analysis suggested the small benefits for the latter were offset by the ease of interpretation and prior elicitation for the former. However, the viability calculations carry through for log price as a random walk, either via simulation or using historical data. While we could allow for a more intricate variance model of the $Z_t$ to be assessed, we believe that such an extra layer of modelling will not be a major factor in the current viability analysis. A careful Bayes analysis might well lead us to develop a much more detailed probability model for prices. However, for the prior viability assessment there is little value, even were it possible, in trying to average over all the models that we might develop, and we suspect, in any case, that the changes in the expected posterior variance structure will be small.

(B) We might use a more sophisticated default rule, such as waiting until the price is a certain percentage above some recent historical average price before selling. However, any such rule would be based on the supposition that there is a simple, automatic way of beating the market. We would expect any such rule to be apparent to all traders in the market, so that many traders would take this position (as you do not need to possess sugar to sell sugar on the commodity market), so that the market should already hold the alternative price. The question as to whether the market prices commodities fairly is, of course, somewhat controversial, but the view is widely held, and seems a reasonable point of comparison for our analysis.
(C) The method that we describe would be straightforward in principle to apply to more elaborate rules based on splitting the amount sold on different days, according to some utility based criterion, based on more detailed aspects of our prior specification, subject only to the increased complexity of the simulation procedures.

(D) Observe that the adjusted expectation of $Y_n$ in (25) depends on $\mathbf{Y}_{[t]} = (Y_1, \ldots, Y_t)$ only through $Y_t$. This sufficiency property is a consequence of certain basic properties of Bayes linear separation. We say that random vector $B$ separates random vectors $A$ and $C$, written $\lfloor A \perp\!\!\!\perp C \rfloor / B$, if the adjusted covariance of $A$ and $C$ given $B$ is zero, or equivalently if $\mathrm{Cov}[A - E_B[A], C - E_B[C]] = 0$. Goldstein (1990) shows that belief separation may be viewed as a generalised conditional independence property; in particular, it is shown that $\lfloor B \perp\!\!\!\perp (C, D) \rfloor / E$ implies $\lfloor B \perp\!\!\!\perp C \rfloor / (D, E)$, for any four vectors $B$, $C$, $D$ and $E$. As $Y_1, \ldots, Y_n$ is a random walk, $\lfloor Y_n \perp\!\!\!\perp \mathbf{Y}_{[t-1]} \rfloor / Y_t$ so that $\mathrm{Cov}[Y_n, \mathbf{Y}_{[t-1]} - E_{Y_t}[\mathbf{Y}_{[t-1]}]] = 0$. Hence, by our information assumption in Section 2.5, $\mathrm{Cov}[\widehat{Y}_n, \mathbf{Y}_{[t-1]} - E_{Y_t}[\mathbf{Y}_{[t-1]}]] = 0$, so that $\lfloor \widehat{Y}_n \perp\!\!\!\perp \mathbf{Y}_{[t-1]} \rfloor / Y_t$ and therefore $\lfloor (Y_n, \widehat{Y}_n) \perp\!\!\!\perp \mathbf{Y}_{[t-1]} \rfloor / Y_t$. Hence, by the generalised conditional independence property, $\lfloor Y_n \perp\!\!\!\perp \mathbf{Y}_{[t-1]} \rfloor / (Y_t, \widehat{Y}_n)$, and the result follows.

(E) A priori, we chose $\mu$ and the precision $1/\sigma_z^2$ to be independent with Gaussian and Gamma distributions, respectively. The data in Table 1 were used to assess the values of the parameters in these two distributions. The mean and standard deviation of the values of $\hat{\mu}$ for the 25 quarters, shown at the head of the columns in Table 1, were used as a basis to assess the mean and the standard deviation of the prior distribution for $\mu$. The parameters in the prior distribution for the precision $1/\sigma_z^2$ were assessed by applying a parametric empirical Bayes calculation to the independent successive squared differences $(Y_i - Y_{i-1})^2 \sim \sigma_z^2\chi_1^2$; that is, the parameters in the Gamma distribution were estimated by maximising the likelihood generated by the product of the marginal density functions of the successive squared differences. The fit of the fitted marginal distribution to the empirical distribution of the squared successive differences (not shown) is very good.

4. Conclusions

We have described a general approach to assessing the potential viability for a Bayesian analysis, and shown how this approach may be applied in an example on commodity trading. While the approach must always be carefully matched to the problem at hand, we feel that the ideas involved in such a viability analysis are widely applicable. We therefore suggest that, for any problem where a proper Bayesian analysis would incur substantial expense, an attempt is made to develop order of magnitude assessments of the potential benefits of the analysis, along the lines that we have suggested, both to avoid carrying out analyses where an unrealistic level of success would appear to be required in order to justify the cost, and also to identify and support those analyses where good returns are likely to follow from more modest levels of success. While our example was carried out in terms of monetary gain, there is no difficulty in making a similar
viability assessment based on general expected utility gains. In a similar way, each aspect of the viability analysis may be more carefully modelled and evaluated, subject only to the constraint that the chosen procedures should not incur so large a cost as to cast doubt on the financial viability of the preliminary viability analysis itself.

Acknowledgements

We would like to thank Peter Thompson, Chris Pack and John Kovacs of C. Czarnikow Sugar Limited for providing information and insights on trading on the sugar market.

References

Goldstein, M., 1990. Influence and belief adjustment. In: Smith, J.Q., Oliver, R.M. (Eds.), Influence Diagrams, Belief Nets and Decision Analysis. Wiley, New York, pp. 143–174.
Goldstein, M., 1997. Prior inferences for posterior judgements. In: Dalla Chiara, M.L., Doets, K., Mundici, D. (Eds.), Structures and Norms in Science. Kluwer Academic, Dordrecht, pp. 55–71.
Goldstein, M., 1999. Bayes linear analysis. In: Kotz, S., Read, C., Banks, D.L. (Eds.), Encyclopedia of Statistical Sciences: Update Volume 3. Wiley, New York, pp. 29–34.
Goldstein, M., O'Hagan, A., 1996. Bayes linear sufficiency and systems of expert posterior assessments. J. Roy. Statist. Soc. B 58, 301–316.