European Journal of Operational Research 147 (2003) 217–228 www.elsevier.com/locate/dsw
O.R. Applications
Probabilistic programming for nitrate pollution control: Comparing different probabilistic constraint approximations
Athanasios Kampas a, Ben White b,*
a Macaulay Land Use Research Institute, Aberdeen AB15, UK
b School of Agricultural and Resource Economics, University of Western Australia, 35 Stirling Highway, Crawley, Perth, Western Australia 6009, Australia
Received 4 September 2000; accepted 8 February 2002
Abstract
Agricultural nitrate emissions within a river catchment are, due to rainfall and other sources of natural variation, uncertain. A regulator aiming to reduce nitrate emissions into surface and groundwater faces a trade-off between reliability in achieving emission standards and the cost of compliance to agriculture. This paper explores this trade-off by comparing different assumptions about the probability distribution of nitrate emissions and thus the probabilistic constraint included in the catchment model. Three categories of probabilistic constraints are considered: (1) non-parametric, (2) normal and (3) lognormal. The results indicate that the restrictiveness of the non-parametric assumption could lead to a significant reduction in profit relative to the normal and lognormal. The lognormal assumption, although it is theoretically correct, cannot be generalised to the case of correlated emissions. However, ignoring the dependence between different sources of nitrate emissions introduces more bias than mis-specifying their distribution. Therefore a probabilistic constraint based on a correlated normal distribution of emissions gives the best approximation for nitrate emissions in this study.
© 2002 Elsevier Science B.V. All rights reserved.
Keywords: Stochastic programming; Probabilistic programming; Probabilistic constraints; Chance constraints; Nitrate pollution
* Corresponding author. Tel.: +61-8-9380-4309; fax: +61-89380-1098. E-mail addresses: [email protected] (A. Kampas), bwhite@agric.uwa.edu.au (B. White).
0377-2217/03/$ - see front matter © 2002 Elsevier Science B.V. All rights reserved. PII: S0377-2217(02)00254-0

1. Introduction

The quality of water resources in a river catchment is significantly affected by the nature and intensity of agricultural production in the catchment. Agricultural intensity is determined by the levels of nutrients such as nitrogen, phosphorus and potassium applied to crops and grassland. Of these three nutrients, emissions of nitrogen in the form of leached nitrate have proved to be one of the more difficult to control. Nitrogen emissions from agriculture are a significant pollutant in two respects: first, there are health concerns relating to nitrates in drinking water, the concentration of which is restricted to 11.3 mg/l by European law (Water Framework Directive); second, there are concerns relating to the damage done to aquatic ecosystems by raised nitrogen concentrations in rivers, lakes and wetlands (Addiscott et al., 1991).

Nitrate pollution can be controlled, at the river catchment level, either by input restrictions or land use restrictions. Typically the direct control of emissions is impossible due to the high cost of observing farm emissions. A further problem is that nitrate emissions, which are largely determined by rainfall events, are stochastic and this leads to wide variations in the nitrate concentrations of surface and groundwater. Thus in practice the regulator is unable to apply deterministic water quality standards, but instead must set a reliability level which states that the minimum standard must be met for some proportion of the time. In setting this reliability level the regulator trades off reliability against compliance costs.

This paper presents a case study of the application of probabilistic programming to the regulation of nitrate emissions. The use of probabilistic constraints was initially introduced by Charnes et al. (1958), while an early use of probabilistic programming in environmental economics is by Maler (1974). Since then, a number of environmental management case studies have used probabilistic constraints (Bouzaher and Offutt, 1992; Ellis, 1987; Hanley et al., 1998; Pinter, 1991; Wagner and Gorelick, 1987). This paper examines nine alternative probabilistic constraint formulations and compares them in terms of the realism of their assumed probability distributions, the treatment of correlation between emission sources and the form of the trade-off between reliability and compliance costs.

The paper is organised as follows. The following section reviews the application of stochastic programming to environmental management problems. Section 3 describes the case study and the form of the probabilistic constraints. Section 4 discusses the results, and Section 5 concludes.
2. Stochastic programming

Stochastic programming (SP) problems are usually classified into probabilistic programming (or chance-constrained) problems, which involve constraints expressed as probabilities, and SP with recourse, which incorporates penalties for violating constraints in the objective function. The difference between probabilistic programming problems and SP with recourse is that they use different measures of risk. Probabilistic constraints represent risk in a qualitative way, whereas models with penalties measure risk quantitatively. That is, in probabilistic programming only the possibility of infeasibility is at stake, regardless of the amount by which the constraints are violated. Conversely, in SP with recourse the levels of violations are important (Haveveld and van der Vlerk, 2000; Prekopa, 1995).

In our case study of nitrate pollution control, we limit our attention to probabilistic programming, since in most environmental management problems quantitative data on the cost of exceeding emission standards are impossible to estimate (Ellis et al., 1985). This is the case with nitrate emissions, where the costs of exceeding the standard would include health costs and a range of costs due to ecological damage. The general form of a probabilistic constraint is given by

$$\Pr\{A(\omega)x < b(\omega)\} \ge \beta, \tag{1}$$

where $A(\omega)$ is an $m \times n$ matrix usually referred to as a technology matrix, $x$ is an $n$-component vector of decision variables, $b(\omega)$ is an $m$-component vector and the scalar $\beta$ stands for the probability with which constraint (1) must be satisfied. The scalar $\beta$ represents the reliability of decision $x$ and $1-\beta$ gives the risk of infeasibility associated with decision $x$. The choice of $\beta$ is at the discretion of the decision-maker; in other words it is a policy choice which can be interpreted as an expression of the regulator's aversion to uncertainty (Lichtenberg and Zilberman, 1988). The technology matrix $A(\omega)$ and the vector $b(\omega)$ are functions of $\omega$ ($\omega \in \mathbb{R}^m$), a random vector defined on the probability space $(\Omega, \mathcal{A}, P)$. $\Omega$ is the support of $P$, a closed subset of $\mathbb{R}^m$, and $\mathcal{A}$ is a Borel $\sigma$-field relative to $\Omega$.

Probabilistic constraints can either be joint or separate. A typical separate probabilistic constraint has the form
$$\Pr\{A_i(\omega)x < b_i(\omega)\} \ge \beta_i, \tag{2}$$

where $A_i(\omega)$ is the $i$th row of the technology matrix, and $b_i(\omega)$ the $i$th component of $b(\omega)$. Such probabilistic constraints provide the option of choosing different probability levels, $\beta_i$, for different rows according to their importance from a reliability point of view (Mayer, 1992). A joint probabilistic constraint specifies the same reliability level, $\beta$, for all the probabilistic constraints. Further, if the random vectors corresponding to different rows of the technology matrix are stochastically independent, then the joint chance constraint is given by

$$\prod_{i=1}^{m} \Pr\{A_i(\omega)x < b_i(\omega)\} \ge \beta, \tag{3}$$

where $m$ is the number of rows of the technology matrix. Problems with joint probabilistic constraints were introduced by Miller and Wagner (1965) for independent variables, while the general case was examined by Prekopa (1970).

The choice between constraints (2) and (3) depends on the form of the problem. The use of (2) can be justified when the individual activities, $A_i(\omega)$, do not affect each other and especially when they have different reliability requirements. Sometimes, however, the inequalities $A_i(\omega)x < b_i(\omega)$ are not relevant individually, but the joint probability is important. Tin-Loi et al. (1996) further classify probabilistic constraints into three cases: (a) only the vector $b(\omega)$ is random; (b) only the technology matrix $A(\omega)$ is random; and (c) both $b(\omega)$ and $A(\omega)$ are random.

The key issue in converting the probabilistic constraint to its deterministic equivalent is the convexity of the feasible domain (Mayer, 1992). This ensures the global maximum within concave or quasi-concave programming (Beavis and Dobbs, 1990). Extensive discussion of the convexity properties of stochastic problems can be found in Kall (1976), Prekopa (1995) and Wets (1997). These convexity results can be briefly summarised as follows. First, in the case of separate probabilistic constraints the feasible set is convex if the randomness appears only on one side. This follows directly from the monotonicity of the distribution functions (Kall, 1976). Second, in the case of joint probabilistic constraints where the technology matrix and the vector $b(\omega)$ are random, the feasible domain is generally non-convex and the only known convexity results come from the theory of logarithmic concave measures (Prekopa, 1971). Examples of multivariate distributions that result in log-concave probability measures have been identified in the literature (An, 1998). Finally, the mathematical and computational problems of joint probabilistic constraints are discussed by Mayer (1992), Prekopa (1993) and Growe (1997).

3. A case study of nitrate pollution control in the Kennet catchment

This case study addresses the problem of assessing the cost of complying with environmental standards, as required by the Water Framework Directive (WFD), for the Kennet, a small (142 km$^2$) arable farming catchment in South East England. The Kennet was chosen as one of the few catchments in the UK with long records of nitrate concentration measurements (Scholefield et al., 1996), and these data were used in model validation (Kampas, 1999). The water quality standard was imposed on water percolating below the root zone. This is a standard assumption of applied research (Pan and Hodge, 1994; Giraldez and Fox, 1995); in the Kennet it indicates the quality of water percolating into groundwater and, in some areas, re-emerging as spring water.
3.1. Modelling framework

The approach adopted was to divide the catchment into land classes characterised by different agricultural productivity and nitrate emission characteristics and to use an aggregate non-linear programming model to simulate producers' responses to a water quality standard. Similar integrated modelling analysis for diffuse pollution control is followed by Bouzaher and Shogren (1995), Vatn et al. (1997) and Wu et al. (1995). The approach is of policy relevance as the model is
based on publicly available data and does not rely on one-off experimental studies. Therefore the model can be transferred readily to other catchments in the UK. Fig. 1 gives the modelling framework and shows the links between data sets, models and model estimates.

Fig. 1. Modelling framework for nitrate pollution control.

The development of the empirical model proceeded in three stages. First, a Geographical Information System (GIS) was used to classify land classes based on their soil properties and to identify the total area of the catchment. Second, two biophysical models were used to simulate the nitrogen cycle in agricultural systems (arable and grassland). Finally, a land allocation model was constructed, in which the agricultural system, as it exists in the Kennet catchment, is assumed to be a single profit maximising farm (see Hazell and Norton (1986) for a discussion of this approach).

The spatial distribution of agricultural activities involves the use of various data sets within a
GIS, such as data on agricultural output, land cover data and soil type data. The most comprehensive source of data on agricultural production is the annual Agricultural Census Data. Satellite land cover data for the Kennet were obtained from the ITE Land Cover Map of Great Britain (LCMGB). A digitised soil association map of the area, which identifies 14 soil series for the Kennet catchment, was obtained from the Soil Survey and Land Research Centre (SSLRC). A land classification based on soil type was derived by translating the soil series into soil types according to the information provided by Findlay (1984). By overlaying maps of land cover and soil type using ARC/INFO, it was possible to obtain the distribution of land cover by soil type.

Nitrogen losses were estimated by using two process-based simulation models which describe nitrate emissions by crop/soil type. Nitrate loss functions for all the major soil/crop combinations were estimated econometrically from nitrate emission estimates derived from the simulation models: SUNDIAL for the arable land, and N-CYCLE for the grassland. SUNDIAL (SimUlation of Nitrogen Dynamics In Arable Land) was used to estimate nitrate losses given a set of agronomic parameters and weather conditions (Smith et al., 1996). Similarly, N-CYCLE is an empirical mass-balance model, developed at the Institute of Grassland and Environmental Research (IGER), which simulates the nitrogen cycle in grassland systems (Lockyer et al., 1995). The pattern of nitrate loss estimates for the Kennet catchment was consistent with the range of values reported in the literature (Bradbury et al., 1993; Whitehead, 1995; Kampas, 1999).

The output from the SUNDIAL and N-CYCLE simulations provides the potential nitrate load but gives no information on the concentration of nitrate in the leachate. To convert nitrate load to leaching estimates for the given drainage pattern we use TOPCAT, a simple hydrological model (Quinn and Antony, 1996). TOPCAT is a one-dimensional model that predicts daily concentrations of nitrates in leachate using data for daily effective rainfall. It is suited to the needs of this study since it provides estimates of the daily average nitrate leaching as well as the daily variance.
The third stage assembled the production and nutrient loss information within a non-linear optimisation framework. A non-linear programming model was used since both the production functions and the N loss functions are non-linear. Crop production functions were taken from the literature (England, 1986). The assumption that the whole catchment is a single profit maximising firm overcomes the lack of information on farm type, farm size and their exact spatial location. Output and input prices refer to the 1995–1996 production period.

The regulation of nitrate pollution is examined within a cost-effectiveness framework, in which the regulator's problem is to maximise the sum of the benefits from agricultural production while complying with environmental standards. The problem can be expressed as

$$\max \sum_{i=1}^{n} \pi_i(q_i, e_i) = \max \sum_{i=1}^{n} a_i \left( p\,q_i - c_i(q_i, e_i) \right) \tag{4}$$

subject to:

$$\Pr\left\{ \sum_{i=1}^{n} a_i e_i \le e^{*} \right\} \ge \beta, \tag{5}$$
where $\pi_i$ denotes the profit derived from the $i$th parcel of land (defined as a land class/crop combination), $p$ and $q_i$ are vectors of product prices and output produced, and $c_i$ is a vector of cost functions. The objective assumes undistorted markets, so that private profits equal social benefits from agricultural production (Xepapadeas, 1997). The emission level, a side-effect of production, is denoted by $e_i$ and is assumed random. The scalar $a_i$ is the amount of land used by the $i$th land class/crop combination, $e^{*}$ is the maximum permissible level of nitrate emission that is inferred from the WFD and $\beta$ stands for the reliability level. This form of the model assumes that the regulator is concerned with the sum of nitrate emissions and does not require each point in the catchment to comply with nitrate standards. This is appropriate where water quality is determined by the overall nitrate loading and the fact that parts of the catchment may exceed the nitrate concentration is not relevant.
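To make the meaning of constraint (5) concrete, the following minimal sketch estimates the probability on its left-hand side by simulation for a fixed land allocation. It is not the catchment model: the three parcels, their means and standard deviations, the limit and the assumption of independent, normally distributed emissions are all hypothetical choices made only for illustration.

```python
import random


def simulated_reliability(areas, means, sds, limit, draws=20000, seed=0):
    """Estimate Pr{sum_i a_i * e_i <= limit} by Monte Carlo simulation.

    Assumption for illustration only: parcel emissions e_i are independent
    and normally distributed with the given means and standard deviations.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(draws):
        total = sum(a * rng.gauss(m, s) for a, m, s in zip(areas, means, sds))
        if total <= limit:
            hits += 1
    return hits / draws


# Hypothetical three-parcel catchment; every number here is made up.
areas = [4.0, 2.0, 1.0]      # a_i: land in each land class/crop combination
means = [30.0, 20.0, 10.0]   # E[e_i]: expected emission per unit of land
sds = [6.0, 5.0, 2.0]        # standard deviation of e_i
beta = 0.95                  # reliability level chosen by the regulator
limit = 220.0                # e*: maximum permissible total emission

reliability = simulated_reliability(areas, means, sds, limit)
print(f"estimated reliability: {reliability:.3f}")
print("constraint (5) satisfied:", reliability >= beta)
```

A simulation of this kind only checks a given allocation; inside an optimisation model the constraint is normally replaced by a deterministic equivalent, which is why the distributional assumption matters.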
The above optimisation problem uses three real-valued decision variables: agricultural output, nitrogen input and the allocation of land classes between crops and grassland. The cost of complying with the nitrate standard is given by the difference between the unrestricted and restricted profit. The probabilistic constraint (5) represents the nitrate emissions as random variables. It is one of the simplest cases of a probabilistic constraint since, by construction, it concerns a univariate random variable, $\sum_{i=1}^{n} a_i e_i$, and therefore results in a convex feasible domain. Applying meta-modelling techniques to the output of the simulation models, it is possible to express the expected values, the variances and the correlation coefficients of the emission levels as functions of the control variables of the problem, and to insert them into a mathematical programming model (Bouzaher et al., 1993). The incorporation of the relevant modelling assumptions resulted in a medium-size non-linear optimisation problem that comprises 769 equations and 683 variables. The full version of the model is presented in Kampas and White (1999), while the values of the model's parameters can be found in Kampas (1999).

3.2. Approximations of the probabilistic constraint
In order to express (5) as its deterministic equivalent we need to assess the distribution of the sum of random variables. In terms of nitrate emissions there are a number of problems and issues. First, emissions are non-negative and therefore a normal distribution may not be an accurate approximation. Second, nitrate emissions for one tract of land in the catchment tend to be correlated with emissions from another due to the effects of rainfall. The most widely used approximation is to use the Central Limit Theorem (CLT) and argue that the sum of random variables has a normal distribution as the number of observations tends to infinity. That is

$$\sum_{i=1}^{n} a_i e_i \sim N\left( E\left[\sum_{i=1}^{n} a_i e_i\right],\ \operatorname{var}\left(\sum_{i=1}^{n} a_i e_i\right) \right). \tag{6}$$

Then the deterministic equivalent is given by (see Taha (1997) for a proof)

$$\sum_{i=1}^{n} a_i E[e_i] + K_\beta \sqrt{a^{T} V a} \le e^{*}, \tag{7}$$

where $\sqrt{a^{T} V a}$ is the standard deviation of $\sum_{i=1}^{n} a_i e_i$, $V$ is the variance–covariance matrix of $e_i$, $K_\beta = \Phi^{-1}(\beta)$, and $\Phi^{-1}$ is the inverse of the standard normal distribution function. To illustrate, (7) can be written as

$$\sum_{i=1}^{n} a_i E[e_i] + K_\beta \sqrt{\sum_{i=1}^{n} a_i^2 \sigma_i^2 + 2 \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \rho_{ij}\, a_i a_j \sigma_i \sigma_j} \le e^{*}, \tag{8}$$

where $\rho_{ij}$ is the correlation coefficient between $e_i$ and $e_j$, and $\sigma_i$ is the standard deviation of $e_i$. The probabilistic constraint (8) results in a non-linear programming problem. A linear approximation of (8) on the basis that emissions are independent, $\rho_{ij} = 0$, is

$$\sum_{i=1}^{n} a_i E[e_i] + K_\beta \sum_{i=1}^{n} a_i \sigma_i \le e^{*}. \tag{9}$$
This approximation was used by Ellis et al. (1986), Olson and Swenseth (1987) and Zare and Daneshmand (1995). Nevertheless, (9) is an inaccurate approximation of (8). This can be seen from the fact that

$$\sqrt{\sum_{i=1}^{n} a_i^2 \sigma_i^2} < \sum_{i=1}^{n} a_i \sigma_i,$$

which is a special case of Minkowski's inequality (Sydsaeter et al., 1999).

If the use of the CLT is inappropriate due to limited observations then a non-parametric approximation of the probabilistic constraint may be used. Based on the well-known Chebyshev's inequality, a distribution-free deterministic equivalent of (5) is given by (see Wets (1983) for a proof)

$$\sum_{i=1}^{n} a_i E[e_i] + (1-\beta)^{-1/2} \sqrt{\sum_{i=1}^{n} a_i^2 \sigma_i^2 + 2 \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \rho_{ij}\, a_i a_j \sigma_i \sigma_j} \le e^{*}. \tag{10}$$
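The left-hand sides of (8), (9) and (10) can be compared numerically. The sketch below is an illustration under stated assumptions, not the authors' implementation: the function names, the three-parcel data and the common correlation of 0.5 are hypothetical, and `statistics.NormalDist().inv_cdf` from the Python standard library is used for $K_\beta = \Phi^{-1}(\beta)$.

```python
import math
from statistics import NormalDist


def total_mean(areas, means):
    """sum_i a_i E[e_i]."""
    return sum(a * m for a, m in zip(areas, means))


def total_sd(areas, sds, corr):
    """sqrt(a' V a), with V_ij = rho_ij * sigma_i * sigma_j."""
    n = len(areas)
    var = sum(areas[i] * areas[j] * sds[i] * sds[j] * corr[i][j]
              for i in range(n) for j in range(n))
    return math.sqrt(var)


def lhs_normal(areas, means, sds, corr, beta):
    """Left-hand side of (8): normal assumption, general correlation."""
    k_beta = NormalDist().inv_cdf(beta)
    return total_mean(areas, means) + k_beta * total_sd(areas, sds, corr)


def lhs_linearised(areas, means, sds, beta):
    """Left-hand side of (9): standard deviation replaced by sum a_i sigma_i."""
    k_beta = NormalDist().inv_cdf(beta)
    return total_mean(areas, means) + k_beta * sum(a * s for a, s in zip(areas, sds))


def lhs_chebyshev(areas, means, sds, corr, beta):
    """Left-hand side of (10): distribution-free multiplier (1 - beta)^(-1/2)."""
    return total_mean(areas, means) + (1.0 - beta) ** -0.5 * total_sd(areas, sds, corr)


# Hypothetical data (made up for illustration).
areas, means, sds = [4.0, 2.0, 1.0], [30.0, 20.0, 10.0], [6.0, 5.0, 2.0]
independent = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
correlated = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
beta = 0.95

normal_ind = lhs_normal(areas, means, sds, independent, beta)
normal_corr = lhs_normal(areas, means, sds, correlated, beta)
linearised = lhs_linearised(areas, means, sds, beta)
chebyshev_ind = lhs_chebyshev(areas, means, sds, independent, beta)
for name, value in [("normal, independent", normal_ind),
                    ("normal, correlated", normal_corr),
                    ("normal, linearised", linearised),
                    ("Chebyshev, independent", chebyshev_ind)]:
    print(f"{name:24s} {value:8.2f}")
```

A larger left-hand side means that a given allocation needs a larger permissible limit $e^{*}$, so the corresponding constraint is more restrictive for the same standard.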
It should be pointed out that in general such distribution-free approximations are by their nature inferior to constraints based on parametric assumptions (Sengupta, 1972; Wets, 1983).

Alternative parametric assumptions, to that of a normal distribution, may also be appropriate. The fact that many environmental variables are non-negative means that they are generated by a skewed distribution. Several skewed probability models have been used to describe environmental data, including the Poisson, negative binomial, Weibull, gamma, exponential and the lognormal. Among these distributions, the lognormal has been the most widely applied (Parkin and Robinson, 1992). The assumption of a lognormal distribution of nitrate emissions has some scientific support. Ott (1990), for example, has proposed the Theory of Successive Random Dilution (TSRD) to explain why concentrations of a range of pollutants are approximately lognormally distributed. More recently, Cooper et al. (1996) stress the importance of using skewed distributions, and especially the lognormal, to represent environmental variables within mathematical programming.

However, under the assumption of lognormally distributed random variables, only the case of independent variables can be examined since there is no known distribution for the sum of correlated lognormal variables (Curran, 1994; Milevsky and Posner, 1998). The sum of independent lognormals can be approximated by another lognormal distribution, although there is no exact closed form solution to the problem of computing the distribution of a sum of independent lognormal random variables (Schwartz and Yeh, 1982; Shimizu and Crow, 1988). So if $x_i$ denotes a lognormally distributed random variable ($y_i = \ln x_i$) then $L = x_1 + x_2 + \cdots + x_n = e^{y_1} + e^{y_2} + \cdots + e^{y_n} \cong e^{z}$, where $e^{z}$ is another lognormal variable that approximates the sum of the $x_i$'s.
Among different solutions to the problem of computing the moments of such a distribution, we used the method of moment matching (MoMM) (Shore, 1995; Thompson, 1999). Under the MoMM the moments estimated empirically from the sample are equated to the theoretical moments of the underlying distribution. Given that the standard errors associated with the third and fourth moments are generally quite high, most MoMM approaches limit their attention to the first two moments (Shore, 1995). We examine two such methods: Fenton's method (Fenton, 1960), which equates the sample mean and variance with the lognormal mean and variance; and Wilkinson's method, which equates the first two sample moments with the analogous moments from the lognormal distribution (Beaulieu et al., 1995). Applying Fenton's method, we have

$$E[e^{z}] = \exp\left( \mu_z + \frac{\sigma_z^2}{2} \right) \cong \sum_{i=1}^{n} a_i E[e_i], \tag{11}$$

$$\operatorname{Var}[e^{z}] = E^{2}[e^{z}] \left( \exp(\sigma_z^2) - 1 \right) \cong \sum_{i=1}^{n} a_i^2 \sigma_i^2. \tag{12}$$
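Solving (11) and (12) for $\mu_z$ and $\sigma_z$ yields

$$\mu_z = \ln\left( \sum_{i=1}^{n} a_i E[e_i] \right) - 0.5 \ln\left\{ \frac{\sum_{i=1}^{n} a_i^2 \sigma_i^2}{\left( \sum_{i=1}^{n} a_i E[e_i] \right)^2} + 1 \right\} \tag{13}$$

and

$$\sigma_z^2 = \ln\left\{ \frac{\sum_{i=1}^{n} a_i^2 \sigma_i^2}{\left( \sum_{i=1}^{n} a_i E[e_i] \right)^2} + 1 \right\}. \tag{14}$$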
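As an illustration of (13) and (14), the sketch below computes the matched parameters for a hypothetical set of parcels and compares the implied 95% quantile with a simulated one. The data, the function names and the use of `random.lognormvariate` to draw independent lognormal emissions are assumptions made for this example only; it is a numerical check, not the procedure used in the case study.

```python
import math
import random
from statistics import NormalDist


def fenton_parameters(areas, means, sds):
    """Return (mu_z, sigma_z^2) from (13) and (14)."""
    m = sum(a * mu for a, mu in zip(areas, means))      # sum a_i E[e_i]
    v = sum((a * s) ** 2 for a, s in zip(areas, sds))   # sum a_i^2 sigma_i^2
    sigma2_z = math.log(v / m ** 2 + 1.0)
    mu_z = math.log(m) - 0.5 * sigma2_z
    return mu_z, sigma2_z


def simulated_quantile(areas, means, sds, beta, draws=20000, seed=1):
    """Empirical beta-quantile of sum a_i e_i for independent lognormal e_i."""
    rng = random.Random(seed)
    # Convert each (mean, sd) of e_i into the mean and sd of ln(e_i).
    params = []
    for m, s in zip(means, sds):
        s2 = math.log(1.0 + (s / m) ** 2)
        params.append((math.log(m) - 0.5 * s2, math.sqrt(s2)))
    totals = sorted(
        sum(a * rng.lognormvariate(mu, sd) for a, (mu, sd) in zip(areas, params))
        for _ in range(draws)
    )
    return totals[int(beta * draws)]


# Hypothetical data (made up for illustration).
areas, means, sds = [4.0, 2.0, 1.0], [30.0, 20.0, 10.0], [6.0, 5.0, 2.0]
mu_z, sigma2_z = fenton_parameters(areas, means, sds)
approx_q = math.exp(mu_z + NormalDist().inv_cdf(0.95) * math.sqrt(sigma2_z))
sim_q = simulated_quantile(areas, means, sds, 0.95)
print(f"mu_z = {mu_z:.4f}, sigma_z^2 = {sigma2_z:.5f}")
print(f"95% quantile: matched lognormal {approx_q:.1f}, simulated {sim_q:.1f}")
```

For these made-up parcels, whose coefficients of variation are small, the two quantiles are close; the quality of the approximation for other data is not established by this check.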
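Since the first moment is the expected value, Wilkinson's method uses (11) as well as the following:

$$E\left[ (e^{z})^{2} \right] = \exp\left( 2\mu_z + 2\sigma_z^2 \right) \cong E\left[ \left( \sum_{i=1}^{n} a_i e_i \right)^{2} \right] = \sum_{i=1}^{n} a_i^2 \sigma_i^2 + \left( \sum_{i=1}^{n} a_i E[e_i] \right)^{2}. \tag{15}$$

Note that $\sigma_x^2 = E[x^2] - \mu_x^2$.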
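Solving (11) and (15) for $\mu_z$ and $\sigma_z$ yields

$$\mu_z = 2 \ln\left( \sum_{i=1}^{n} a_i E[e_i] \right) - 0.5 \ln\left\{ \sum_{i=1}^{n} a_i^2 \sigma_i^2 + \left( \sum_{i=1}^{n} a_i E[e_i] \right)^{2} \right\}, \tag{16}$$

$$\sigma_z^2 = \ln\left\{ \sum_{i=1}^{n} a_i^2 \sigma_i^2 + \left( \sum_{i=1}^{n} a_i E[e_i] \right)^{2} \right\} - 2 \ln\left( \sum_{i=1}^{n} a_i E[e_i] \right). \tag{17}$$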
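A short sketch of (16) and (17) follows, again with hypothetical data and a hypothetical function name. For comparison it also evaluates the Fenton expressions (13) and (14) inline. Evaluated in exact arithmetic on the same inputs, the two pairs of formulas give the same $\mu_z$ and $\sigma_z^2$, since $\ln\{V + M^2\} - 2\ln M = \ln\{V/M^2 + 1\}$ with $M = \sum a_i E[e_i]$ and $V = \sum a_i^2\sigma_i^2$; the code only checks this numerically and makes no claim about solver-level results.

```python
import math


def wilkinson_parameters(areas, means, sds):
    """Return (mu_z, sigma_z^2) from (16) and (17)."""
    m = sum(a * mu for a, mu in zip(areas, means))                 # sum a_i E[e_i]
    second = sum((a * s) ** 2 for a, s in zip(areas, sds)) + m ** 2  # right side of (15)
    mu_z = 2.0 * math.log(m) - 0.5 * math.log(second)
    sigma2_z = math.log(second) - 2.0 * math.log(m)
    return mu_z, sigma2_z


# Hypothetical data (made up for illustration).
areas, means, sds = [4.0, 2.0, 1.0], [30.0, 20.0, 10.0], [6.0, 5.0, 2.0]
w_mu, w_sigma2 = wilkinson_parameters(areas, means, sds)

# Fenton's expressions (13) and (14) on the same inputs, for comparison.
m = sum(a * mu for a, mu in zip(areas, means))
v = sum((a * s) ** 2 for a, s in zip(areas, sds))
f_sigma2 = math.log(v / m ** 2 + 1.0)
f_mu = math.log(m) - 0.5 * f_sigma2

print(f"Wilkinson: mu_z = {w_mu:.6f}, sigma_z^2 = {w_sigma2:.6f}")
print(f"Fenton:    mu_z = {f_mu:.6f}, sigma_z^2 = {f_sigma2:.6f}")
```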
The probability function is monotonically non-decreasing; therefore we can write

$$\Pr\left\{ \sum_{i=1}^{n} a_i e_i \le e^{*} \right\} \cong \Pr\{ e^{z} \le e^{*} \} = \Pr\{ z \le \ln(e^{*}) \} = \Phi\left( \frac{\ln(e^{*}) - \mu_z}{\sigma_z} \right) \ge \Phi(K_\beta), \tag{18}$$

so that

$$\frac{\ln(e^{*}) - \mu_z}{\sigma_z} \ge K_\beta,$$

which can be rearranged as

$$\mu_z + K_\beta \sqrt{\sigma_z^2} \le \ln(e^{*}), \tag{19}$$

and substituting $\mu_z$ and $\sigma_z$ from Fenton's method results in

$$\ln\left( \sum_{i=1}^{n} a_i E[e_i] \right) - 0.5 \ln\left\{ \frac{\sum_{i=1}^{n} a_i^2 \sigma_i^2}{\left( \sum_{i=1}^{n} a_i E[e_i] \right)^2} + 1 \right\} + K_\beta \sqrt{ \ln\left\{ \frac{\sum_{i=1}^{n} a_i^2 \sigma_i^2}{\left( \sum_{i=1}^{n} a_i E[e_i] \right)^2} + 1 \right\} } \le \ln(e^{*}), \tag{20}$$

while substituting $\mu_z$ and $\sigma_z$ from Wilkinson's method, (19) can be written as

$$2 \ln\left( \sum_{i=1}^{n} a_i E[e_i] \right) - 0.5 \ln\left\{ \sum_{i=1}^{n} a_i^2 \sigma_i^2 + \left( \sum_{i=1}^{n} a_i E[e_i] \right)^{2} \right\} + K_\beta \left[ \ln\left\{ \sum_{i=1}^{n} a_i^2 \sigma_i^2 + \left( \sum_{i=1}^{n} a_i E[e_i] \right)^{2} \right\} - 2 \ln\left( \sum_{i=1}^{n} a_i E[e_i] \right) \right]^{1/2} \le \ln(e^{*}). \tag{21}$$
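Constraint (19) can be read as a requirement on the smallest limit $e^{*}$ that a given allocation can meet at reliability $\beta$. The sketch below computes that requirement under the lognormal approximation (20) and, for comparison, under the independent normal constraint (8) with $\rho_{ij} = 0$. The data and function names are hypothetical and serve only to illustrate the formulas.

```python
import math
from statistics import NormalDist


def lognormal_requirement(areas, means, sds, beta):
    """Smallest e* satisfying (19), with (mu_z, sigma_z^2) from (13)-(14)."""
    m = sum(a * mu for a, mu in zip(areas, means))
    v = sum((a * s) ** 2 for a, s in zip(areas, sds))
    sigma2_z = math.log(v / m ** 2 + 1.0)
    mu_z = math.log(m) - 0.5 * sigma2_z
    k_beta = NormalDist().inv_cdf(beta)
    return math.exp(mu_z + k_beta * math.sqrt(sigma2_z))


def normal_requirement(areas, means, sds, beta):
    """Smallest e* satisfying (8) when all rho_ij are zero."""
    m = sum(a * mu for a, mu in zip(areas, means))
    v = sum((a * s) ** 2 for a, s in zip(areas, sds))
    return m + NormalDist().inv_cdf(beta) * math.sqrt(v)


# Hypothetical data (made up for illustration).
areas, means, sds = [4.0, 2.0, 1.0], [30.0, 20.0, 10.0], [6.0, 5.0, 2.0]
for beta in (0.70, 0.80, 0.95):
    ln_req = lognormal_requirement(areas, means, sds, beta)
    n_req = normal_requirement(areas, means, sds, beta)
    print(f"beta = {beta:.2f}: lognormal {ln_req:7.2f}, normal {n_req:7.2f}")
```

For these made-up numbers the lognormal requirement is looser than the independent normal one at $\beta = 0.70$ and tighter at $\beta = 0.95$. This is a property of the toy data and the two-moment match, not a statement about the catchment model.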
The forms of the probabilistic constraint considered in this section are compared in the context of the catchment nitrate model in the following section.
4. Results

The model was solved using the GAMS/CONOPT solver, which is suitable for models with non-linear constraints and few degrees of freedom (Stolbjerg-Drud, 1993). This is the case for the model in this study since the probabilistic constraints are irreducibly non-linear and non-separable, and the number of constraints is roughly the same as the number of variables. In addition, the other non-linear solver, GAMS/MINOS, produced almost identical results. Table 1 summarises the results of our case study, which compares all the different forms of the probabilistic constraint discussed in Section 3. The results in Table 1 are expressed as percentage losses of profitability against the unconstrained optimisation. Three reliability levels were examined (70%, 80%, and 95%), which represent different degrees of 'aversion to uncertainty' from a regulator's point of view.

Table 1
Profitability losses for different reliability levels and different approximations of the probabilistic constraint

Probabilistic constraint                       Reliability level
                                               70%       80%       95%
1. Non-parametric
   (1a) Independent                            0.620%    0.867%    2.492%
   (1b) Correlated                             1.198%    1.615%    4.390%
   (1c) Perfect collinear                      1.603%    2.142%    6.533%
2. Normal
   (2a) Independent                            0.084%    0.173%    0.521%
   (2b) Correlated                             0.132%    0.278%    0.728%
   (2c) Perfect collinear                      0.173%    0.381%    0.951%
   (2d) Independent-linearised                 0.362%    0.806%    2.449%
3. Lognormal
   (3a) Independent (Fenton's method)          0.072%    0.164%    0.584%
   (3b) Independent (Wilkinson's method)       0.070%    0.163%    0.583%

With reference to Table 1 the main results can be summarised as follows. First, where the assumption of independent random variables can be relaxed (e.g. non-parametric and normal distribution) it is evident that the correlation between the random variables results in a stricter constraint. Further, the case where emissions are perfectly collinear gives the highest profitability losses. Second, the non-parametric approximation is restrictive, and ignorance of the underlying distribution potentially leads to a policy with relatively high losses in profitability. A similar result is observed for the linearised constraint (9). In this application, the implications for profitability are similar between the non-parametric and linearised constraints. Both MoMM approaches give similar results, with Fenton's method being slightly more restrictive than Wilkinson's.

As Al-Khalidi (1994) argues, the mis-specification of lognormal data by a normal distribution results in overestimated probabilities, and hence it was expected that the assumption of a lognormal distribution would result in lower costs than the normal one. However, this is not always the case in Table 1. For reliability levels of 70% and 80% the results are consistent with Al-Khalidi's result, but for the higher reliability level the opposite is observed. A possible explanation may be that the MoMM utilises only the first two moments of the distribution and these may not be sufficient for characterising a non-symmetrical distribution such as the lognormal. Another possible explanation may be the use of the biased moment estimators, as opposed to the Uniformly Minimum Variance estimators (UMUs). The UMUs are described by Shimizu (1988) and should always be preferred according to Parkin and Robinson (1992), especially with small sample sizes and a high coefficient of variation. Although the biased estimators and the UMUs are asymptotically equivalent, for sample sizes less than 100 the biased estimators are found to be highly inefficient (Ott, 1995; Parkin and Robinson, 1992). It is noteworthy that the limited number of empirical applications examining the lognormal assumption contradict each other. For example, Zhu et al. (1994) have found that a binding probabilistic constraint based on a lognormal distribution is always more restrictive than one based on a normal distribution, while the opposite is argued by Xu et al. (1996).

In terms of policy implications these results are significant. The non-parametric constraints represent ignorance concerning the underlying distribution of emissions. The fact that these constraints impose additional compliance costs on producers would suggest that this ignorance is costly and regulators should analyse which of the parametric alternatives gives the best model of nitrate emissions in the catchment. Turning to the parametric alternatives, there is little difference between the costs imposed by independent normal and independent lognormal constraints. However, there are significant cost differences between independent and collinear normal constraints. If, as is the case here, nitrate emissions are correlated then this would favour the correlated normal model over the others.

Finally, we should emphasise that the "optimal" values of the decision variables depend upon the specific approximation examined.$^{1}$ Table 2 presents the "optimal" values of the main control variables of the optimisation problem for the 95% reliability level. This table shows that the form of control policy would vary significantly depending upon the probabilistic constraint. For instance, if the regulator assumes a non-parametric constraint and perfect collinearity (1c) and pursues a nitrogen input restriction, then the nitrogen use would be reduced significantly.
However, this is not the case with the normal constraint, where the nitrogen input is reduced to a lesser extent on the basis of a collinear constraint (2c) instead of independence (2a) and there is a slight shift in land use from arable to grassland. Overall, land use and nitrogen input use are broadly similar between the normal and lognormal constraints.

1 We would like to thank the anonymous referee for stressing the importance of such an observation.

Table 2
Solutions for the control variables for different approximations of the probabilistic constraint (95% reliability level)

Probabilistic constraint                  Land use (000's ha)                          Livestock   Fertiliser (000's t)
                                          Arable   Cutting grass   Temporary grass     units       Arable     Grassland
1. Non-parametric
   (1a) Independent                       6.696    0.738           0.660               4.538       866.766    328.929
   (1b) Correlated                        6.529    0.872           0.827               4.605       740.893    295.180
   (1c) Perfect collinear                 5.528    1.372           1.228               5.006       459.167    219.096
2. Normal
   (2a) Independent                       6.703    0.732           0.653               4.536       869.573    331.585
   (2b) Correlated                        6.663    0.765           0.693               4.552       834.561    320.161
   (2c) Perfect collinear                 6.612    0.807           0.747               4.573       797.566    309.066
   (2d) Independent-linearised            6.654    0.771           0.702               4.555       812.475    319.177
3. Lognormal
   (3a) Independent (Fenton's method)     6.699    0.735           0.657               4.537       860.455    329.925
   (3b) Independent (Wilkinson's method)  6.687    0.745           0.669               4.542       840.617    325.991
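One way to see why the non-parametric rows of Table 1 show larger losses than the normal rows is to compare the multipliers applied to the standard deviation in (8) and (10) at the three reliability levels examined. The sketch below does only that; the function name is hypothetical and nothing in it depends on the catchment data.

```python
from statistics import NormalDist


def safety_multipliers(beta):
    """Return (K_beta, Chebyshev multiplier) for reliability level beta.

    K_beta = Phi^{-1}(beta) is the multiplier in (8);
    (1 - beta)^(-1/2) is the multiplier in (10).
    """
    normal_k = NormalDist().inv_cdf(beta)
    chebyshev_k = (1.0 - beta) ** -0.5
    return normal_k, chebyshev_k


for beta in (0.70, 0.80, 0.95):
    k_normal, k_cheb = safety_multipliers(beta)
    print(f"beta = {beta:.2f}: normal K = {k_normal:.3f}, "
          f"Chebyshev multiplier = {k_cheb:.3f}, ratio = {k_cheb / k_normal:.2f}")
```

At each of the three levels the distribution-free multiplier is several times the normal one, so for the same mean and variance the Chebyshev-based constraint leaves less room for emissions. How this translates into profit losses depends on the full model and is not reproduced here.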
5. Conclusions

Before concluding it should be stressed that the exact solution of stochastic optimisation problems is not generally known, and frequently no pronounced optimum exists. The choice of approach depends upon data availability and the concerns of the decision-makers (Somlyody and Wets, 1988; Wagner et al., 1992; Birge and Louveaux, 1997). This case study adopts probabilistic programming to overcome the lack of information on the costs of exceeding the nitrate standard, which precluded the application of stochastic programming with recourse. At a catchment level, water resource managers will be concerned if nitrate levels consistently exceed standards, but will not be overly concerned if there are occasional violations of the standard or if parts of the catchment exceed the standard. This is reflected in our choice of a probabilistic programming approach with a single probabilistic constraint.
On this basis, the model shows that if nitrate emissions are approximately normally or lognormally distributed and emissions from different land 'parcels' are correlated (as is to be expected given the common effects of rainfall; Kampas, 1999), then a probabilistic constraint based on a correlated normal distribution gives the most realistic approximation of the underlying stochastic process. The lognormal distribution is more realistic for an independent emission, but the results indicate that mis-specifying the distribution of emissions as normal results in less bias than ignoring the fact that emissions are correlated.
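To make the normal versus lognormal comparison concrete, the sketch below approximates the sum of independent lognormal emissions by a single lognormal with the same mean and variance, and compares its 95% quantile with that of a normal distribution with the same two moments. This two-moment matching is in the spirit of the approximations associated with Fenton and Wilkinson, but it is not claimed to reproduce the exact variants behind rows 3a and 3b of Table 2. The function names and parameter values are hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def lognormal_moments(mu, sigma):
    """Mean and variance of a lognormal variable with log-scale parameters mu, sigma."""
    mean = exp(mu + 0.5 * sigma ** 2)
    var = (exp(sigma ** 2) - 1.0) * exp(2.0 * mu + sigma ** 2)
    return mean, var


def match_lognormal(mean, var):
    """Log-scale parameters of the lognormal with the given mean and variance."""
    sigma2 = log(1.0 + var / mean ** 2)
    return log(mean) - 0.5 * sigma2, sqrt(sigma2)


def sum_quantiles(params, reliability=0.95):
    """Quantile of a sum of independent lognormals under two approximations
    sharing the same mean and variance: (lognormal, normal)."""
    total_mean = sum(lognormal_moments(m, s)[0] for m, s in params)
    total_var = sum(lognormal_moments(m, s)[1] for m, s in params)  # independence assumed
    z = NormalDist().inv_cdf(reliability)
    mu_y, sigma_y = match_lognormal(total_mean, total_var)
    return exp(mu_y + z * sigma_y), total_mean + z * sqrt(total_var)


# Hypothetical log-scale parameters for three emission sources (not from the paper).
params = [(3.0, 0.4), (2.5, 0.5), (2.0, 0.6)]
q_lognormal, q_normal = sum_quantiles(params)
print(f"lognormal approx: {q_lognormal:.2f}, normal approx: {q_normal:.2f}")
```

With these made-up parameters the lognormal quantile comes out somewhat above the normal one, so the lognormal constraint is slightly tighter. The sketch relies on independence when summing variances, which is the limitation noted above for the lognormal case.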
Acknowledgements
We would like to thank J. Mayer, T. Szantai, B. McCarl and an anonymous referee for helpful comments. An earlier version of the paper was presented at the conference APMOD 2000 (Applied Mathematical Programming and Modelling), Brunel University, London, 17–19 April 2000.
References
Addiscott, T., Whitmore, A., Powlson, D., 1991. Farming, Fertilizers and the Nitrate Problem. CAB International, Wallingford.
Al-Khalidi, H., 1994. On the misspecification of a lognormal distribution. Communication in Statistics: Theory and Methods 23 (8), 2243–2350.
An, M., 1998. Logconcavity versus logconvexity: A complete characterisation. Journal of Economic Theory 80 (2), 350–369.
Beaulieu, N., Abu-Dayya, A., McLane, P., 1995. Estimating the distribution of a sum of independent lognormal random variables. IEEE Transactions on Communications 43 (12), 2869–2873.
Beavis, B., Dobbs, I., 1990. Optimization and Stability Theory for Economic Analysis. Cambridge University Press, Cambridge.
Birge, J., Louveaux, F., 1997. Introduction to Stochastic Programming. Springer, New York.
Bouzaher, A., Lakshminarayan, P., Cabe, R., Carriquiry, A., Gassman, P., Shogren, J., 1993. Metamodels and nonpoint pollution policy in agriculture. Water Resources Research 29 (6), 1579–1587.
Bouzaher, A., Offutt, S., 1992. A stochastic linear programming model for corn residue production. Journal of the Operational Research Society 43 (9), 843–857.
Bouzaher, A., Shogren, J., 1995. Modeling nonpoint source pollution in an integrated system. In: Martin, W. (Ed.), Environmental Policy Modeling. Kluwer Academic Publishers, Boston.
Bradbury, N., Whitmore, A., Hart, P., Jenkinson, D., 1993. Modelling the fate of nitrogen in crop and soil in the years following application of 15N-labelled fertilizer to winter wheat. Journal of Agricultural Science 121, 363–379.
Charnes, A., Cooper, W., Symonds, G., 1958. Cost horizons and certainty equivalents: An approach to stochastic programming of heating oil. Management Science 4, 235–263.
Cooper, W., Hemphill, H., Huang, Z., Li, S., Lelas, V., Sullivan, D., 1996. Survey of mathematical programming models in air pollution. European Journal of Operational Research 96 (1), 1–35.
Curran, M., 1994. Valuing Asian and portfolio options by conditioning on the geometric mean price. Management Science 40 (12), 1705–1711.
Ellis, J., 1987. Stochastic water quality optimization using imbedded chance constraints. Water Resources Research 23 (12), 2227–2238.
Ellis, J., McBean, E., Farquhar, G., 1985. Chance-constrained/stochastic linear programming model for acid rain abatement-I complete collinearity and noncollinearity. Atmospheric Environment 19, 925–937.
Ellis, J., McBean, E., Farquhar, G., 1986. Chance-constrained/stochastic linear programming model for acid rain abatement-II limited collinearity. Atmospheric Environment 20, 501–511.
England, R., 1986. Reducing the nitrogen input on arable farms. Journal of Agricultural Economics 37, 13–24.
Fenton, L., 1960. The sum of lognormal probability distributions in scatter transmission systems. IRE Transactions on Communications Systems CS-8, 57–67.
Findlay, D., 1984. Soils and their use in South West England. Rothamsted Experimental Station, Harpenden.
Giraldez, C., Fox, G., 1995. An economic analysis of groundwater contamination from agriculture nitrate emissions in Southern Ontario. Canadian Journal of Agricultural Economics 43, 387–402.
Growe, N., 1997. Estimated stochastic programs with chance constraints. European Journal of Operational Research 101 (2), 285–305.
Hanley, N., Faichney, R., Munro, A., Shortle, J., 1998. Economic and environmental modelling for pollution control in an estuary. Journal of Environmental Management 52 (3), 211–225.
Haneveld, W., van der Vlerk, M., 2000. Lecture Notes on Stochastic Programming, Department of Econometrics & OR, University of Groningen.
Hazell, P., Norton, R., 1986. Mathematical Programming for Economic Analysis in Agriculture. MacMillan, New York.
Kall, P., 1976. Stochastic Linear Programming. Springer, Berlin.
Kampas, A., 1999. Policies to control agricultural externalities: The case of nitrate pollution. Unpublished Ph.D. Thesis, Department of Agricultural Economics and Food Marketing, University of Newcastle upon Tyne.
Kampas, A., White, B., 1999. Efficient policies for controlling nitrate pollution. Paper presented at the Annual Conference of Operational Research Society, Edinburgh.
Lichtenberg, E., Zilberman, D., 1988. Efficient regulation of environmental risks. Quarterly Journal of Economics 103, 167–178.
Lockyer, D., Scholefield, D., Dawson, B., 1995. N-CYCLE, MERTaL Courseware, Aberdeen.
Maler, K., 1974. Environmental Economics: A Theoretical Inquiry. The Johns Hopkins University Press, Baltimore, MD.
Mayer, J., 1992. Computational techniques for probabilistic constrained optimization problems. In: Marti, K. (Ed.), Stochastic Optimization: Numerical Methods and Technical Applications. Springer, Berlin, pp. 141–163.
Milevsky, M., Posner, S., 1998. Asian options, the sum of lognormals, the reciprocal gamma distribution. Journal of Financial and Quantitative Analysis 33 (3), 409–422.
Miller, B., Wagner, H., 1965. Chance-constrained programming with joint constraints. Operations Research 13, 930–945.
Olson, D., Swenseth, S., 1987. A linear approximation for chance-constrained programming. Journal of Operations Research 38 (3), 261–267.
Ott, W., 1990. A physical explanation of the lognormality of pollutant concentrations. Journal of Air and Waste Management Association 40 (10), 1378–1383.
Ott, W., 1995. Environmental Statistics and Data Analysis. Lewis, Boca Raton, FL.
Pan, J., Hodge, I., 1994. Land use permits as an alternative to fertiliser and leaching taxes for the control of nitrate pollution. Journal of Agricultural Economics 45, 102–112.
Parkin, T., Robinson, J., 1992. Analysis of lognormal data. In: Stewart, B. (Ed.), Advances in Soil Science. Springer, New York, pp. 193–235.
Pinter, J., 1991. Stochastic modelling and optimization for environmental management. Annals of Operations Research 31, 527–544.
Prekopa, A., 1970. On probabilistic constrained programming. Mathematical Programming Study 28, 113–138.
Prekopa, A., 1971. Logarithmic concave measures with applications to stochastic programming. Acta Scientiarum Mathematicarum 32, 301–315.
Prekopa, A., 1993. Programming under probabilistic constraint and maximizing a probability under constraints. Report no. 35-93, Center for Operational Research, Rutgers University.
Prekopa, A., 1995. Stochastic Programming: Mathematics and its Applications. Kluwer Academic Publishers, Dordrecht.
Quinn, P., Antony, S., 1996. TOPCAT: MS-Excel based freeware version of TOPMODEL, ADAS R&D, Wolverhampton.
Scholefield, D., Lord, E., Rodda, H., Webb, B., 1996. Estimating peak nitrate concentrations from annual nitrate loads. Journal of Hydrology 86, 355–373.
Schwartz, S., Yeh, Y., 1982. On the distribution function and moments of power sums with lognormal components. The Bell System Technical Journal 61 (7), 1441–1462.
Sengupta, J., 1972. Stochastic Programming: Methods and Applications. North-Holland, Amsterdam.
Shimizu, K., 1988. Point estimation. In: Crow, E., Shimizu, K. (Eds.), Lognormal Distributions: Theory and Applications. Marcel Dekker, New York, pp. 27–85.
Shimizu, K., Crow, E., 1988. History, genesis, properties. In: Crow, E., Shimizu, K. (Eds.), Lognormal Distributions: Theory and Applications. Marcel Dekker, New York, pp. 7–26.
Shore, H., 1995. Fitting a distribution by the first two moments (partial and complete). Computational Statistics & Data Analysis 19 (5), 563–577.
Smith, J., Bradbury, N., Addiscott, T., 1996. SUNDIAL: A PC-based system for simulating nitrogen dynamics in arable land. Journal of Agronomy 88, 38–43.
Somlyody, L., Wets, R., 1988. Stochastic optimization models for lake eutrophication management. Operations Research 36, 660–681.
Stolbjerg-Drud, A., 1993. GAMS/CONOPT: GAMS-The Solver Manuals, GAMS Development Corporation, Washington, DC.
Sydsaeter, K., Strom, A., Berck, P., 1999. Economists' Mathematical Manual. Springer, Heidelberg.
Taha, H., 1997. Operational Research: An Introduction. Prentice-Hall, Englewood Cliffs, NJ.
Thompson, K., 1999. Developing univariate distributions from data for risk analysis. Human and Ecological Risk Assessment 5 (4), 755–783.
Tin-Loi, F., Qi, L., Wei, Z., Womersley, R., 1996. Stochastic ultimate load analysis: Models and solution methods. Numerical Functional Analysis and Optimization 17 (9&10), 1029–1043.
Vatn, A., Bakken, L., Lundeby, H., Romstad, E., Rorstad, P., Vold, A., 1997. Regulating nonpoint source pollution from agriculture: An integrated modelling analysis. European Review of Agricultural Economics 24, 207–229.
Wagner, B., Gorelick, S., 1987. Optimal groundwater management under parameter uncertainty. Water Resources Research 23 (7), 1162–1174.
Wagner, J., Shamir, U., Nemati, H., 1992. Groundwater quality management under uncertainty: Stochastic programming approaches and the value of information. Water Resources Research 28, 1233–1246.
Wets, R., 1983. Stochastic programming: Solution techniques and approximation schemes. In: Bachem, A., Grotschel, M., Korte, B. (Eds.), Mathematical Programming: The state of the art. Springer, Berlin, pp. 566–603.
Wets, R., 1997. Stochastic programs with chance constraints: Generalised convexity and approximation issues. In: Crouzeix, J., Martinez-Legaz, J., Volle, M. (Eds.), Generalised Convexity, Generalised Monotonicity: Recent Results. Kluwer Academic Publishers, Dordrecht, pp. 61–74.
Whitehead, D., 1995. Grassland Nitrogen. CAB International, Wallingford.
Wu, J., Teague, M., Mapp, H., Bernardo, D., 1995. An empirical analysis of the relative efficiency of policy instruments to reduce nitrate water pollution in the US southern high plains. Canadian Journal of Agricultural Economics 43, 403–420.
Xepapadeas, A., 1997. Advanced Principles in Environmental Policy. Edward Elgar, Cheltenham.
Xu, F., Prato, T., Zhu, M., 1996. Effects of distributions for sediment yields on farm returns in a chance-constrained programming model. Review of Agricultural Economics 18, 53–64.
Zare, Y., Daneshmand, A., 1995. A linear approximation method for solving a special class of the chance constrained programming problem. European Journal of Operational Research 80 (1), 213–225.
Zhu, M., Taylor, D., Sarin, S., Kramer, R., 1994. Chance constrained programming models for risk-based economic and policy analysis of soil conservation. Agricultural and Resource Economics Review 23 (1), 58–65.