Contemporary Stochastic Approach to Water Resources Systems: The ARMP and Feature Prediction Models


Copyright © IFAC Identification and System Parameter Estimation 1982, Washington D.C., USA, 1982

O. Ibidapo-Obe
Faculty of Engineering, University of Lagos, Akoka, Lagos, Nigeria

Abstract. An autoregressive model with Markovian parameters (ARMP/a.r.m.p.) and a feature prediction scheme (FPM) are developed in this paper. The ARMP is physically based and adaptive in its implementation, thus taking into consideration the inherent periodicities in hydrological time series. The FPM is motivated by the current inability to provide suitable and sufficiently comprehensive yet simplified mathematical hydrological models. It is based on pattern analysis and is such that a system dynamic feature is identified using a priori data, which can subsequently be used to simulate missing data (synthetic data generation) and to forecast future hydrological parameters. The ARMP and FPM provide efficient alternatives to some other existing models which are not, in general, applicable to all classes of hydrological problems (perennial droughts and storm surges). They also afford an added advantage as a result of the ability of the schemes to forecast in real time. A comparative analysis of the two techniques is undertaken using the discharge record data from the River Nile at the Aswan Dam from 1870 to 1945. It is further proposed that, in order to enhance the overall performance of the prediction scheme, the FPM may be used as an input (training data) to the ARMP.

Keywords. Adaptive systems, data generation, filtering, Kalman filters, Markov processes, modelling, on-line operation, prediction, stochastic systems, water resources.

INTRODUCTION

It is now generally accepted that, for the optimum operation of dams and several other hydraulic structures, an efficient method for forecasting the various inputs to those structures in the short, medium and long-term ranges is essential. These time-series inputs, which are usually discharges, water levels, precipitations, etc., invariably have large components of uncertainty, thus necessitating the use of stochastic methods for the analysis of outputs (runoff, etc.).

A physically based autoregressive scheme (a.r.m.p.) is presented in this paper; the parameters of the scheme are assumed to be Markovian. The algorithm can be implemented on-line, thus allowing real-time forecasting; the simplicity of the scheme and its adaptive nature are clear advantages over some of the other existing models (Box-Jenkins (1970), Ivakhnenko's group method of data handling (1968), etc.).

The a.r.m.p. requires training data, and the inability to effectively obtain the length required motivates the use, as an alternative, of a feature prediction model based on the theory of pattern recognition. The model, which has been used in several studies related to control and communications (Tou and Gonzalez, 1974), has recently been applied to hydrological models (Panu, Prasad and Unny, 1977). It provides a useful tool for data generation. The synopsis of the model

given herein suggests some improvements that may incorporate the concept of entropy in hydrologic data generation as a means of improving forecasts. This paper compares the two schemes using the discharge data from the River Nile at Aswan.

THE AUTOREGRESSIVE MODEL WITH MARKOVIAN PARAMETERS

The autoregressive model with Markovian parameters (a.r.m.p.) is of the form

$$y_t = \sum_{i=1}^{p} a_t^{(i)} y_{t-i} + u_t \qquad (1)$$

where $\{y_t,\ t = 0, 1, 2, \ldots\}$ are the inputs (precipitations, stream flows, etc.); the parameter $p$ represents the length of the training data, which is to be selected a priori, usually through the use of certain information criteria (AIC, spectral window, etc.); $a_t^{(i)}$ are the parameters of the autoregressive (AR) model; and the noise $u_t$ is assumed to be Gaussian with zero mean and variance $q_t^2$. Equation (1) is the phase-variable model (Bohlin, 1970) without the control and can be said to represent a Markov chain of order $p$.
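As an aside on the choice of $p$, the following minimal NumPy sketch illustrates one of the information criteria mentioned above: least-squares AR($p$) fits are scored with a simple AIC and the order with the smallest score is retained. The function name and the particular AIC form used here are illustrative assumptions, not part of the paper's scheme.

```python
import numpy as np

def select_ar_order(y, max_p=10):
    """Score least-squares AR(p) fits with AIC = N*ln(RSS/N) + 2p, p = 1..max_p.

    Returns the order with the smallest AIC together with all scores; this is
    only a rough stand-in for the 'information criteria' mentioned in the text.
    """
    y = np.asarray(y, dtype=float)
    scores = {}
    for p in range(1, max_p + 1):
        # Regression y_t = sum_i a_i * y_{t-i} + u_t over the usable rows.
        Y = np.column_stack([y[p - i:len(y) - i] for i in range(1, p + 1)])
        target = y[p:]
        a, *_ = np.linalg.lstsq(Y, target, rcond=None)
        rss = np.sum((target - Y @ a) ** 2)
        n = len(target)
        scores[p] = n * np.log(rss / n) + 2 * p
    return min(scores, key=scores.get), scores
```

For the Nile application reported later, the paper simply fixes the training length at $p = 5$.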


It is further assumed that the AR parameters $a_t^{(i)}$ satisfy a linear Markov model

$$a_{t+1}^{(i)} = b_t^{(i)} a_t^{(i)} + v_t^{(i)}, \qquad i = 1, 2, \ldots, p \qquad (2)$$

such that the $v_t^{(i)}$ are also stochastic, zero-mean, uncorrelated Gaussian processes and $b_t^{(i)}$ is a deterministic function of $t$. It should be remarked that a second-order hierarchy of parameter processes is obtained if $b_t^{(i)}$ is also Markovian. Equations (1) and (2) are now put in the state-space form

$$A_{t+1} = b_t A_t + V_t \qquad (3)$$

and

$$y_t = Y_{t-1}' A_t + u_t \qquad (4)$$

such that

$$A_t = (a_t^{(1)}, a_t^{(2)}, \ldots, a_t^{(p)})', \qquad V_t = (v_t^{(1)}, v_t^{(2)}, \ldots, v_t^{(p)})',$$
$$Y_{t-1} = (y_{t-1}, y_{t-2}, \ldots, y_{t-p})', \qquad b_t = \operatorname{diag}(b_t^{(1)}, b_t^{(2)}, \ldots, b_t^{(p)}),$$

where $(\,\cdot\,)'$ indicates the transpose of $(\,\cdot\,)$. The vector of zero-mean white Gaussian noises $V_t$ has covariance matrix $R_t$ and is such that no component of $V_t$ is correlated with $u_t$.

Equation (3) now facilitates the use of the Kalman-Bucy (Gelb, 1974) estimation equations. The key problem is to obtain the steady-state solutions for the set of parameters $a_t^{(i)}$, $i = 1, 2, \ldots, p$, since hydrological time series exhibit some well-defined periodicity. The steady-state outputs for the parameter set yield the input for the prediction of the output $y_{t+1}$. The mathematical problem can now be stated as follows: minimize the function

$$J = E\left\{ (A_{t+1} - \hat{A}_{t+1})'(A_{t+1} - \hat{A}_{t+1}) \right\} \qquad (5)$$

where $\hat{A}_{t+1}$ is the estimate of $A_{t+1}$, subject to the constraint given in equation (3). The smoothing and prediction algorithms are given by the following recursive formulas:

$$\hat{A}_{t+1}^{-} = b_t \hat{A}_t \qquad (6)$$

with the error covariance

$$P_{t+1}^{-} = b_t P_t b_t' + R_t \qquad (7)$$

for the smoothing (state estimate extrapolation), and

$$K_t = P_t^{-} Y_{t-1} \left[ Y_{t-1}' P_t^{-} Y_{t-1} + q_t^2 \right]^{-1} \qquad (8)$$

$$\hat{A}_t = \hat{A}_t^{-} + K_t \left[ y_t - Y_{t-1}' \hat{A}_t^{-} \right] \qquad (9)$$

$$P_t = P_t^{-} - K_t Y_{t-1}' P_t^{-} \qquad (10)$$

for the prediction (state estimate update algorithm), where $\hat{A}_t^{-}$ and $P_t^{-}$ denote the extrapolated estimate and its error covariance and $K_t$ is the gain. The one-step-ahead forecast of $y_{t+1}$ is therefore

$$\hat{y}_{t+1} = Y_t' \hat{A}_{t+1}^{-} \qquad (11)$$

The initial conditions associated with equations (6) to (11) above include an estimate of the vector of coefficients $A_t$ at time $t_0$, the covariance matrix $P_0$ of the initial state, the state noise covariance matrix $R_t$ at each time $t$, and the measurement noise variance $q_t^2$.

The a.r.m.p. algorithm is adaptive in nature and will find very useful purposes not only in synthetic data regeneration but also in the forecasting of flows/discharges.
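The recursions (6)-(11) are standard Kalman-filter steps for the state-space pair (3)-(4). The following is a minimal NumPy sketch of running them for one-step-ahead forecasting, under the simplifying choices adopted for the Nile application later in the paper ($b_t = I$, $R_t = I$, $q_t^2 = 1$, $P_0 = I$ and $\hat{A}_0$ a unit vector); the function and variable names are illustrative, not from the paper.

```python
import numpy as np

def armp_forecast(y, p=5):
    """One-step-ahead ARMP forecasts via the Kalman recursions (6)-(11).

    Simplifications mirroring the paper's application: b_t = I, R_t = I,
    q_t^2 = 1, P_0 = I and A_0 a unit vector.  Forecasts start once p past
    values are available.
    """
    y = np.asarray(y, dtype=float)
    A = np.ones(p)              # initial coefficient estimate (unit vector)
    P = np.eye(p)               # initial error covariance
    R, q2 = np.eye(p), 1.0      # state noise covariance, measurement noise variance
    forecasts = {}
    for t in range(p, len(y)):
        Y = y[t - p:t][::-1]                 # (y_{t-1}, ..., y_{t-p})'
        P = P + R                            # covariance extrapolation (b_t = I)
        K = P @ Y / (Y @ P @ Y + q2)         # gain, eq. (8)
        A = A + K * (y[t] - Y @ A)           # state estimate update, eq. (9)
        P = P - np.outer(K, Y @ P)           # covariance update, eq. (10)
        # One-step-ahead forecast of y_{t+1} from the updated coefficients, eq. (11).
        forecasts[t + 1] = float(np.dot(y[t - p + 1:t + 1][::-1], A))
    return forecasts
```

Feeding the series in chronological order yields one-step-ahead forecasts from period $p + 1$ onwards, which mirrors the paper's use of forecasts from period 6 with $p = 5$.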

THE FEATURE PREDICTION MODEL

It is well known that hydrological time-series data exhibit random behaviour within a global periodic wave-form; this structural format motivates the application of a model that is capable of predicting significant features of the data and thus identifying interventions. This is what is referred to as the feature prediction model, which is based on the principles of pattern recognition and involves the analysis of information (data) in groups.

The data set, which is assumed periodic with observations at regular discrete time intervals, is divided into suitable periodic time durations (one year or six months, usually) called segments. The segments are further divided into objects, which may correspond to the different seasons in the segment; the observed measurements on the objects are characterised by an n-dimensional vector

$$\mathbf{y} = (y_1, y_2, \ldots, y_j, \ldots, y_n)' \qquad (12)$$

known as the pattern vector. The pattern vector is further reduced to contain only those features that are akin to the object, thus obtaining a feature vector

$$\mathbf{f} = (f_1, f_2, \ldots, f_j, \ldots, f_m)', \qquad m \le n \qquad (13)$$
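For concreteness, a minimal NumPy sketch of the partition into segments, objects and pattern vectors described above is given below; the sizes correspond to the Nile application later in the paper (five records per pattern vector, three objects per segment) and, like the function name, are illustrative assumptions.

```python
import numpy as np

def to_pattern_vectors(series, records_per_object=5, objects_per_segment=3):
    """Split a discrete hydrological series into segments and objects,
    returning one pattern vector (eq. 12) per object.

    The sizes mirror the Nile application (5 records per pattern vector,
    3 objects per segment); they are parameters, not fixed by the model.
    """
    x = np.asarray(series, dtype=float)
    seg_len = records_per_object * objects_per_segment
    n_segments = len(x) // seg_len
    x = x[:n_segments * seg_len]              # drop any incomplete trailing segment
    # shape: (segments, objects per segment, records per object)
    return x.reshape(n_segments, objects_per_segment, records_per_object)
```

Each slice along the last axis is one pattern vector of equation (12).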

Similar pattern vectors are grouped into a class, so as to obtain K pattern classes; the representative points of the classes in the classification space are known as reference vectors. Confidence bounds are usually obtained to delimit the reference vectors. The feature prediction model for hydrological data is in three steps, viz. classification of patterns (pattern recognition system), analysis of intra- and inter-pattern structures, and development of a generation scheme for time-series realizations.

The pattern vector is transformed into a feature vector by the equation

$$\mathbf{f} = A\mathbf{y} \qquad (14)$$

where the columns of the matrix $A$ are the normalized eigenvectors of the covariance matrix corresponding to the components of the pattern vectors. A feature vector in the kth pattern class is of the form

$$\mathbf{f}_i^{(k)} = (f_{i1}^{(k)}, f_{i2}^{(k)}, \ldots, f_{ij}^{(k)}, \ldots, f_{im}^{(k)}), \qquad k = 1, 2, \ldots, K; \quad i = 1, 2, \ldots, n_k \qquad (15)$$

For each class k, the jth components of all the feature vectors are divided into positive and negative parts, whose means $(+S_j^{(k)}, -S_j^{(k)})$ as well as standard deviations $(\sigma_j^{(k)})$ are obtained. Either of the means may constitute the jth component $\gamma_{\ell j}^{(k)}$ of the $\ell$th reference vector

$$\boldsymbol{\gamma}_\ell^{(k)} = (\gamma_{\ell 1}^{(k)}, \gamma_{\ell 2}^{(k)}, \ldots, \gamma_{\ell j}^{(k)}, \ldots, \gamma_{\ell m}^{(k)}) \qquad (16)$$

since there are m components per reference vector and each can take either the positive or the negative mean value.

The feature vectors, and hence the stochastic time series, are assumed to be multivariate normally distributed around each of the reference vectors (intra), as follows:

$$p(\mathbf{f}) = \frac{1}{(2\pi)^{m/2} |C_f|^{1/2}} \exp\left\{ -\tfrac{1}{2} (\mathbf{f} - \boldsymbol{\mu}_f)' C_f^{-1} (\mathbf{f} - \boldsymbol{\mu}_f) \right\} \qquad (17)$$

where $\boldsymbol{\mu}_f$ is the mean vector of the components of the reference vector and $C_f$ the covariance matrix of the feature vectors of each class. The occurrence of reference vectors in successive periodicities is assumed to be Markovian, so that the probabilities of transitions may be obtained.

The association of feature vectors with reference vectors is realized by the use of the minimum-distance concept (Sebestyen, 1962), such that a distance function between $\boldsymbol{\gamma}_\ell^{(k)}$ and an arbitrary feature vector $\mathbf{f}$ is

$$D_\ell^{(k)} = \| \mathbf{f} - \boldsymbol{\gamma}_\ell^{(k)} \|, \qquad \ell = 1, 2, \ldots, L_k \qquad (18)$$

the feature vector being associated with the reference vector for which $D_\ell^{(k)}$ is a minimum. A characteristic distance may be obtained, for example, as

$$D^{(k)} = \frac{1}{L_k} \sum_{\ell=1}^{L_k} D_\ell^{(k)} \qquad (19)$$

A data set may now be generated using the classified reference vectors; the reference vectors may be generated sequentially using the transition probability matrix, so that, associated with each period, related feature vectors are synthesized using equation (17), subject to constraints of the form

$$f_{ij} \le {}_{i}S_j \pm z_{\alpha/2}\,{}_{i}\sigma_j, \qquad i \in \{+, -\}$$

$$\sum_{j=1}^{m} f_{ij} \le S_f \pm z_{\alpha/2}\,\sigma_f$$

where $z_{\alpha/2}$ is the standard normal deviate at the chosen significance level, usually 1.96 (95% confidence level), and $S_f$ and $\sigma_f$ are the mean and standard deviation associated with the distribution of the sum of all components of the feature vectors within the class of the associated reference vector. The feature vectors are transformed back into pattern vectors using equation (14), thus producing a synthetic realization of the time series. This scheme thus has its value in data regeneration, and it may be used as a base for forecasting using the a.r.m.p.

APPLICATIONS

The a.r.m.p. and the feature prediction model were applied to the simulation of forecasts for the average annual discharge of the River Nile at Aswan. A shorter analysis of the a.r.m.p. using the discharge data from October 1, 1870 to September 30, 1945 has been presented earlier (Ibidapo-Obe, 1978). The entire data period has been partitioned into two components, viz. data records prior to 1902, when the Aswan Dam was closed, and recorded discharges from 1902 to 1945. These sub-periods are referred to as the pre- and post-intervention epochs.

The following assumptions were made in the use of the a.r.m.p. scheme: $A_0$ is a unit vector; $q_t^2$ is unity; $R_t$ and $P_0$ are unit matrices; and the length of the training data $p$ is 5 (five). The one-step-ahead forecasts are from period 6 (six) onwards. Table 1 provides the actual annual and predicted discharges as well as a regression of the predicted on the actual. The predicted flows follow the actual very closely, with a cross-correlation coefficient of 0.79.

The feature prediction model is applied to the same discharge data from the River Nile at Aswan. The data set was divided into 5 (five) segments, with each segment containing


3 (three) objects. The individual pattern vectors that characterise the objects contain 5 (five) data records. The feature vectors are of dimension 3; this provides three pattern classes. A synthesized data set from the reference vectors, based on the use of a random number generator with a multivariate normal distribution, is obtained. The results are included in Table 1 and Figure 1. The a.r.m.p. adapts more efficiently to the intervention than the FPM; hence the fairly large errors in the FPM forecast; the FPM would require a larger data set for more accurate forecasts. The upper limits of the confidence intervals have been taken in generating the feature vectors; these intervals have been used to delimit the range of acceptable random numbers. The pre- and post-intervention means are 3129.80 m³/s and 3042.50 m³/s, compared with the actuals of 3370.10 m³/s and 2620.40 m³/s respectively. The results are compared with the Box and Jenkins AR(4) model (Sinha and Prasad, 1979).
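The following is a minimal NumPy sketch of the FPM building blocks described above: the eigenvector (feature) transform of equation (14), the minimum-distance association of equation (18), and the constrained synthesis of feature vectors around a reference vector via equation (17). The function names are illustrative, and the acceptance test implements only a simplified version of the component-wise confidence bounds (the bound on the component sum is omitted).

```python
import numpy as np

def feature_transform(patterns):
    """Eq. (14): project pattern vectors onto the normalized eigenvectors of
    their covariance matrix to obtain feature vectors (one per row)."""
    X = patterns.reshape(-1, patterns.shape[-1])
    _, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
    return X @ eigvecs

def nearest_reference(f, references):
    """Minimum-distance association (eq. 18) of a feature vector f with one
    of the rows of `references`; returns the index and the distance."""
    d = np.linalg.norm(references - f, axis=1)
    return int(np.argmin(d)), float(d.min())

def synthesize_features(mu, cov, sd, z=1.96, n=1, rng=None, max_tries=1000):
    """Draw feature vectors from the intra-class normal model (eq. 17),
    accepting only draws within +/- z*sd of the reference components
    (a simplified form of the paper's confidence-bound constraints)."""
    rng = np.random.default_rng() if rng is None else rng
    out = []
    for _ in range(max_tries):
        f = rng.multivariate_normal(mu, cov)
        if np.all(np.abs(f - mu) <= z * sd):
            out.append(f)
            if len(out) == n:
                break
    return np.array(out)
```

Since the eigenvector matrix is orthonormal, synthesized feature vectors can be mapped back to pattern vectors by applying its transpose, which is how a synthetic realization of the series would be assembled in this sketch.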

CONCLUSION

Two techniques have been proposed for forecasting and synthesizing hydrological data respectively. The a.r.m.p. is adaptive and hence may be implemented on-line; the feature prediction technique is novel and provides a method by which hydrological data may be generated satisfactorily. The two techniques may be implemented sequentially, with the feature prediction model using a priori data for synthesis while the a.r.m.p. takes the system feature as the training data needed to start off the prediction. It is expected that the application of these techniques to the analysis and synthesis of stream flows would yield very useful results in the design and operation of several hydraulic structures. The a.r.m.p. and the feature prediction techniques may be applied to other random processes in control, communication, physiological and other systems.

REFERENCES

1. Bohlin, T. (1970). Information pattern for linear discrete-time models with stochastic coefficients. IEEE Trans. Autom. Control (USA).

2. Box, G.E.P., and Jenkins, G.M. (1970). Time Series Analysis: Forecasting and Control. Holden-Day Inc., San Francisco.

3. Gelb, A. (1974). Applied Optimal Estimation. The M.I.T. Press, Cambridge, Mass.

4. Ibidapo-Obe, O. (1978). A new approach to stochastic data analysis: The Nile River at Aswan. Electronics Letters, 14, 765-767.

5. Ivakhnenko, A.G. (1968). The group method of data handling, a rival to the method of stochastic approximation. Soviet Automatic Control, 13, 43-55.

6. Panu, U.S., Prasad, T., and Unny, T.E. (1977). A feature prediction model for stochastic time series data. Proc. National Systems Conference, PSG College of Technology, India, E5-1 to E5-6.

7. Sebestyen, G.S. (1962). Decision-Making Processes in Pattern Recognition. Macmillan Co., New York.

8. Sinha, N.K., and Prasad, T. (1979). Some stochastic modelling techniques and their applications. Applied Mathematical Modelling, 2-6.

9. Tou, J.T., and Gonzalez, R.C. (1974). Pattern Recognition Principles. Addison-Wesley, Reading, Mass.

Fig. 1. Discharge data for the River Nile at Aswan (1870-1945); discharge plotted against year.

TABLE 1

Year (t)   ACTUAL (x1)   ARMP (x2)    FPM (x3)     (x4)
1870       3988.041      (3988.041)   3229.566
1875       3817.606      (3817.606)   3039.475
1880       3076.770      2216.962     3920.143
1885       2983.424      3415.963     2948.697
1890       3556.614      3617.457     3731.498
1895       3657.500      4329.433     3141.024
1900       2843.649      2589.490     3291.746
1905       2628.271      1921.269     3325.741
1910       2889.850      2470.057     2132.476     2680.220
1915       3035.199      2544.917     3446.854     2823.468
1920       2499.997      2952.204     3858.720     2269.967
1925       2494.980      2283.447     4102.326     2477.708
1930       2205.146      2268.225     2246.458     2666.855
1935       2902.773      2631.558     3420.197     2675.871
1940       1848.086      2453.082     4018.326     2407.764
1945       2211.126      2232.681     2411.496     2790.056

Sample Statistics (x̄ is the mean and σ is the unbiased standard deviation):
x̄1 = 2913.377, σ1 = 604.756;  x̄2 = 2859.337, σ2 = 731.267;
x̄3 = 3266.546, σ3 = 608.031;  x̄4 = 2598.989, σ4 = 194.034.

Trend Lines:
x1: -107.579t + 3827.798
x2: -96.648t + 3680.842
x3: -8.690t + 3340.411
x4: 1.157t + 2593.782
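The sample statistics and trend lines quoted under Table 1 can be reproduced directly from the tabulated columns; a small NumPy check using the FPM column (x3) is sketched below, assuming the trend variable t is simply the row index 1, 2, ..., 16 (for x4 the index runs over its eight post-1905 entries only).

```python
import numpy as np

# FPM column (x3) of Table 1, 1870-1945 at five-year steps.
x3 = np.array([3229.566, 3039.475, 3920.143, 2948.697, 3731.498, 3141.024,
               3291.746, 3325.741, 2132.476, 3446.854, 3858.720, 4102.326,
               2246.458, 3420.197, 4018.326, 2411.496])

t = np.arange(1, len(x3) + 1)              # row index used as the trend variable
mean = x3.mean()                            # about 3266.546
std = x3.std(ddof=1)                        # about 608.03 (unbiased)
slope, intercept = np.polyfit(t, x3, 1)     # about -8.690 t + 3340.41

print(mean, std, slope, intercept)
```

The computed values agree with the tabulated Sample Statistics and Trend Lines for x3.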