Reduced-Rank ARX and Subspace System Identification for Process Control

Reduced-Rank ARX and Subspace System Identification for Process Control

Copyright ~' IFAC Dynamics and Control of Process Systems, Corfu, Greece, 1998 . REDUCED-RANK ARX AND SUBSPACE SYSTEM IDENTIFICATION FOR PROCESS CONT...

2MB Sizes 0 Downloads 73 Views

Copyright ~' IFAC Dynamics and Control of Process Systems, Corfu, Greece, 1998 .

REDUCED-RANK ARX AND SUBSPACE SYSTEM IDENTIFICATION FOR PROCESS CONTROL Ben C. Juricek' Wallace E. Larimore" Dale E. Seborg*

• Dept. of Chemical Engineering, University of California, Santa Barbara, CA 93106 •• Adaptics Inc . 1717 Briar Ridge Rd., McLean, VA, 22101

Abstract: A number of methods have been proposed to deal with the modeling of multivariable dynamical systems from observational data that are ill-conditioned or collinear. Canonical correlation regression (CCR), principal components regression (PCR), and partial least squares (PLS) have been used to model dynamical systems as finite impulse response (FIR) models and autoregressive models with exogenous inputs (ARX). Subspace identification methods have recently been developed to directly determine the state of a dynamical system and a state-space model of the process. In this paper, the major characteristics of FIR/ ARX and subspace methods are described in the context of statistically based system identification. The methods are compared with simulated data from a continuous stirred tank reactor (CSTR) with two inputs and two outputs. The effects of feedback and correlated measurement noise on the identified models are examined. Copyright © 1998IFAC Keywords: Identification algorithms, Subspace methods, i\Iultivariate Regression

l. INTRODUCTION

parison. The measures of model fitting accuracy used in this paper are the traditional sum squared error (SSE) and confidence bounds on the identified transfer functions. The latter is much more relevant to model fitting for subsequent analysis and design of control systems.

Several methods have been proposed recently to model multi variable dynamical systems from observed data that may be ill-conditioned or collinear. Such methods include the extension of chemometric methods such as reduced rank regression (RRR) , principal component regression (PCR), canonical correlation regression (CCR) , and partial least squares (PLS) to model dynamical systems as finite impulse response (FIR) models and autoregressive with exogenous inputs (ARX) models. Another class of identification methods, subspace methods, have been developed to directly determine the state of a dynamical system and a state-space model of the process. In this paper, these methods are briefly described along with some of their significant characteristics in the context of statistically based system identification.

In the next sections the two different model parameterizations, ARX and subspace models. are discussed. The previously noted measures of model accuracy for model selection are then described. A simulation of a continuous stirred tank reactor (CSTR) proyides an example of ARX modeling via PLS and CCR. and subspace models from CYA and \"4SID .

2. REDCCED RAKK FIR AND ARX :\IODELS Several researchers recently have studied the usage of multivariate statistical methods for identifying ARX and FIR models of dynamical systems (Dayal and :\1acGregor, 1997; Lakshminarayanan

A critical issue in the evaluation of model identification accuracy is the criteria used for com-

247

Thus the maximum likelihood (ML) estimates of the plant and feedback models are independent, and the effect of the initial condition Po becomes small as the sample size increases. Therefore, maximum likelihood methods maximize the function given by the first product term in (4) .

et al., 1997) . The general ARX model is described since the FIR model is a special case. Consider the problem of identifying an autoregressive model with n-dimensional output y and m-dimensional exogenous input u , where the output vector Yt at time t is related to the past outputs and past inputs Ut-i for lags i = 0, ... , r by the equation, r

Yt

=L

The problem is then reduced to finding maximum likelihood estimates for the ARX model (1). Of the various reduced rank methods for identifying the ARX model, the maximum likelihood solution is given by CCR (Tso, 1981; Larimore, 1997) . The asymptotic properties of ML in obtaining minimum variance unbiased estimates of the parameters suggest that this will be the optimal method if the appropriate rank can be determined. Larimore (Larimore, 1997) gives an optimal procedure for determining the rank using a small sample version of the Akaike information criterion (AIC) (Akaike, 1973) .

r

O:iYt-i

+L

{3i Ut-i

+ {3oUt + et

(1)

i=1

i=1

where the coefficient matrices to be estimated are O:i (n x n) and {3i (n x m) as well as the covariance matrix l:ee of the additive white noise et. The contemporaneous term {30 may be constrained to be zero. In the case of an FIR model, the O:i terms are set to zero. To simplify the notation, the past, Pt, with respect to the time t is defined as the vector Pt

, " = [Yt-1,

" Yt"- r' u t - 1, .. . , u 't -]r '

(2)

Another regression technique that has received much attention recently is partial least squares (PLS) . PLS can be thought of as rotating the principal components in the regression matrix of (3), , to describe the variation in the outputs Y. PLS has applied with considerable success to chemometric problems, and recently has been studied for identifying dynamic systems (Dayal and MacGregor, 1997) .

where the prime (') denotes the matrix transpose. The above estimation problem is a standard multivariate regression problem. Equations (1) and (2) can be written for all t in the form

Y

= 0 + E

(3)

where row t of Y is Yt , the observation matrix has row t expressed in terms of the past up to lag r as
In this paper, the reduced rank multivariate procedures of PLS and CCR are used. Each of these are applied to the identification of FIR and ARX model coefficients for a continuous stirred tank reactor (CSTR) example.

3. SUBSPACE IDENTIFICATION METHODS A class of models that have attracted much attention from researchers recently are the so-called subspace methods. Subspace models are somewhat more general than the ARX models discussed above, and when used in system identification can produce more accurate models because fewer parameters need to be estimated. Statistical accuracy is directly related to the number of parameters that need to be estimated. For example, if the noise et is the result of passing white noise nt through a first order moving average of the form

There has been considerable discussion and comparison of these methods in the literature in terms of computational requirements and theory. Here some observations that will give insight in the discussion of the simulation results are presented. For any parametrized time series model in ARX form, the joint likelihood function of the observed inputs and outputs is of the form (Larimore, 1997) N

(5)

p(YN , UN;fJ) = IIp(Ytlpt,Ut;O) x t=1

then (5) is equivalent to an infinite order autoregression. As a result , the equivalent ARX model theoretically is infinite order. In practice a high order ARX model may be required to accurately represent the error structure, even though there is only one parameter involved. One solution is to include moving average (MA) terms for the noise in the ARX model (e.g., use an ARMAX model).

N

IIp(Utlpt;O)p(Po;0) t=1

(4)

where Po specifies the initial condition of the ARX model. The second term above is the feedback part of the system which can be identified separately if the process indeed has a plant/feedback structure. 248

considerable simulation experience indicate that CVA may be maximum likelihood whereas other subspace methods can haye much higher errors .

The estimation of the MA terms in an ARMAX model is an ill-conditioned problem relying on nonlinear optimization (Gevers and Wertz, 1982) , and current algorithms are not computationally reliable. Subspace methods use an equivalent representation of the ARMAX process in the state space form

4. MEASURES OF MODEL ACCURACY In this section, the following measures of model accuracy of identified models are discussed: onestep ahead prediction error and confidence bounds on the estimated transfer functions. In all simulation cases , the model is fit on an 'identification set' of simulated data, and the measure of model accuracy is calculated based on the 'validation set' of data that is generated independently of the identification set data.

(6) Yt

= HXt

+ AUt + BWt +Vt

(7)

where Xt is a k-order Markov state and Wt and are independent, white noise processes with covariance matrices, Q and R. This state space model is more general than typically used since the noise term, BWt + Vt, in (7) is correlated with the noise Wt in the state equation. This is a consequence of requiring that the state of the state space equations be a k-order Markov state. Requiring the noises in (6) and (7) to be uncorrelated may result in a nonminimal realization.

Vt

4.1 One-Step Prediction Error

A traditional measure of model accuracy is the one-step prediction error variance computed on the validation set. This is normalized by the total sum squared (SS) variation and expressed as a percentage

Unlike the ARMAX model, the subspace methods utilize a well conditioned and efficient computational procedure for estimating the coefficient matrices of the state equations. The fundamental steps for all subspace methods involve: • defining the past and future vectors • projecting future outputs on future inputs • a generalized singular value decomposition (GSVD) between past and projected future to determine the system states • using the computed system state, the model matrices are computed using multivariate regression .

Although the SSE is fairly intuitive and easily interpretable, comparing prediction errors with total sum of squares explained can give very little or misleading information about the actual model accuracy, particularly for correlated noise processes and in the presence of feedback.

A variety of subspace algorithms have been proposed for the determination of the system state. Van Overschee and De Moor (1994) show that all of the subspace algorithms can be expressed in terms of a GSVD where the different subspace methods effectively use different weightings in the GSVD. Two popular subspace algorithms, the N4SID (Van Overschee and De Moor, 1994) and the canonical variate analysis (CVA) algorithms (Larimore, 1983; Larimore, 1990) , are discussed in this paper.

4.2 Confidence Regions on the Frequency Response Function

A more complete measure of model accuracy is the estimated confidence bounds about the transfer function of an identified model. The identified transfer function and confidence bounds can be used to describe the model errors across all frequencies. Such confidence bounds can also be used for robust control design . A method for computing confidence bounds for an identified model using maximum likelihood methods is reviewed in (Larimore, 1993) . In this paper, transfer functions are computed for each identified model for all of the previously mentioned techniques. For each identification technique, the mean and variance of the magnitude of the transfer function are computed across all simulation trials.

For the N4SID method , the model order is chosen based on the singular values of a projection matrix constructed from the past data. The CVA method chooses the model order via the AIC and defines the projection matrix via canonical variates. While the statistical theory and analysis of the ARX model fitting is relatively simple and direct since it is essentially a regression, the fitting of state space models using subspace algorithms is much more complex and the theory is less complete. Nevertheless, the available theory and 249

Table 2. Model dimensions

Table 1. CSTR Parameters and Nominal Operating Conditions . ko \

pCp Tf T TJ

100 7.2e10 239 350 383.7 309.9

L min - 1 J / m3 . K K K K

E

R

~H

UA CAf

CA q

8750 -5e4 5e4 1 0.1 100

Identification Technique PLS FIR PLS ARX CCR FIR CCR ARX N4SID CVA

K J / m3 . K J / min· K mol / L mol / L L/ min

5. CSTR SIMULATION

PLS FIR PLS ARX CCR FIR CCR ARX N4SID CVA

A well-known physical model of a non-isothermal CSTR (Seborg et al., 1989) with first order reaction, A --+ B , is given by: dC

TtA

q E = V(CAf - CA) - koexp(- RT)CA(9)

q -dT = -(Tf dt V

+

T)

-tl.H pCp

vpCp

Case 2 Rank Lags 80 5 7 24 2 80 2 24 15 N/ A 2 N/ A Case.! 5 80 7 24 2 80 24 2 15 N/ A 2 N/ A

noise ratio (SNR) ~ 3 for each of the variables in Case 1. This measurement noise was used in Cases 1 - 3. For Case 4, the variance of the actual noise , nt in (5), was smaller so that the SNR was approximately the same as in Case 3. For both open and closed-loop conditions, 50 trials with different noise seeds and PRBS inputs generated 2000 data points. One thousand points were used for model identification, and the remainder used for model validation.

E RT

+ --koexp(--)CA

UA --(TJ - T)

Case 1 Rank Lags 80 3 17 6 2 80 17 2 15 N/ A 2 N/ A Case 3 3 80 17 6 2 80 2 17 15 N/ A 2 N/ A

(10)

The exit concentration, CA, and temperature, T, are both measured, and the feed concentration, C Af, and feed temperature, Tf, are both unmeasured and act as disturbances. Both disturbance variables were held constant during this simulation. The jacket temperature, T J , and the feed flow rate, q, are used as manipulated variables to control CA and T. The model was linearized about the nominal operating conditions and parameters given in Table 1. The model was discretized with a sampling period of 0.05 min.

The CSTR simulation results were used to demonstrate the different model identification techniques. The techniques examined were FIR and ARX models generated via PLS and CCR, and subspace models generated by the N4SID and CVA methods. The number of lags for the FIR model was chosen to be four times the CSTR residence time (V/q) to yield 80 lags. The number of lags for the ARX models was chosen by minimizing the AIC criterion on the first sample set. The number of latent variables used in the PLS models was chosen via cross validation on the first sample set; the number of eigenvectors in CCR was chosen by Bartlett's Test on the first sample set (Tso, 1981). The state space model order for the N4SID and CVA models was chosen for each trial by the respective identification procedures. A summary of model lags and model rank (the number of latent variables for PLS, eigenvectors for CCR, and the state dimension for I\4SID and CVA) is shown in Table 2.

Separate identifications of the discrete , linear CSTR were performed for four different conditions: Case 1 Open-loop operation with uncorrelated white noise added to the outputs. Case 2 Open-loop operation with a moving average noise process added to outputs. Case 3 Closed-loop operation with uncorrelated white noise added to the outputs. Case 4 Closed-loop operation with a moving average noise process added to outputs. In the open-loop cases, independent PRBS signals on TJ and q with basic periods of two samples (0.1 min) were used to excite the process. Amplitudes of 1 K and 0.6 L/min were chosen so that each had approximately the same effect on output variances. In closed-loop simulations, independent PRBS setpoints with basic period of 2 samples were used for excitation . Again, the amplitudes were chosen to have approximately the same effect on process output variances. For the correlated noise conditions, two independent moving average processes were added to the measurement equation , each with 11 = 1 in (5). Measurement noise was added with variances to yield a signal to

5.1 Accuracy of Identified Models

The accuracy of the identified models is indicated by two criteria. In Table 3, values for the SSE (for both outputs) are given for each identification method for the open-loop simulations. ~otice that in all of the cases there is little difference between the methods as measured by the SSE . The mean values and 95% confidence bounds for the computed frequency responses for CA (s) / q( s) are shown in Figures 1 and 2. 250

of feedback . PLS models were less accurate. particularly in closed-loop simulations. By contrast , the maximum likelihood methods produced accurate models for all conditions . The traditional SSE metric was relatively uninformative regarding the true model accuracy, compared with the identified frequency responses and confidence limits.

For a Si\R :::::: 3, the limiting value of SSE should be 25 %, in accordance with the values in Table 3. The moving average increases the effective noise variance, hence increasing the SSE values for Case 2. Although there is little difference in SSE values between the models for Cases 1 and 2, CVA gives a much more accurate prediction of the frequency response compared with the other models. The CCR FIR models are also quite accurate at low frequencies. The effect of the correlated and uncorrelated noise increases the confidence bounds, but does not affect the mean values greatly.

7. REFERENCES Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In: 2nd International Symposium on Information Theory. Akademiai Kiado . Budapest . pp. 267-281. Dayal , B.S. and J.F. MacGregor (1997) . ~lulti­ output process identification. Journal of Process Control 7 , 269 - 282. Gevers, M. and V. Wertz (1982) . On the problem of structure selection for the identification of stationary stochastic processes. In: Sixth IFAC Symposium on Identification and System Parameter Estimation. McGregor and Werner. Washington, D.C .. pp. 387 - 392. Lakshminarayanan, S., S.L. Shah and K. Nandakumar (1997). Modeling and control of multi variable processes: Dynamic PLS approach. AIChE Journal 43, 2307 - 2322. Larimore, W. E. (1983). System identificaiton, reduced-order filtering and modeling via canonical variate analysis. In: American Control Conference. IEEE. New York. pp. 445 451. Larimore, W . E. (1990) . Canonical variate analysis for system identification , filtering, and adaptive control. In: 29th IEEE Conference on Decision and Control. Vol. 1. Honolulu , HI. pp. 635-639. Larimore, W. E. (1993) . Accuracy confidence bands including the bias of model underfitting. In: American Control Conference. San Francisco, CA . Larimore, W. E. (1997). Optimal reduced rank modeling, prediction , monitoring, and control using canonical variate analysis. In : IFAC 1997 Int. Symp. on Advanced Control of Chemical Processes. Banff, Canada. pp . 61 66. Seborg, D.E. , T.F. Edgar and D.A. Mellichamp (1989). Process Dynamics and Control. John Wiley. New York. Tso, M.K.S. (1981). Reduced-rank regression and canonical analysis . J. Royal Statist. Soc. B 43, 183 - 189. Van Overschee, P. and B. De ~10or (1994). A unifying theorem for three subspace system identification algorithms. In: American Control Conference. Vol. 2. Baltimor~. pp . 16451649.

The PLS models appear to be biased from the true frequency response. This may be due to sensitivity of scaling. In PLS the scaling of inputs and outputs is critical for consistent results. In this study, each model was derived by scaling the input and outputs to zero mean and unit variance. Varying the scaling slightly could produce better results, but required some skill and considerable effort to do so. The ARX models are less accurate than the FIR models as judged by the frequency responses. This was quite surprising to the authors and possibly could be due to numerical problems. The SSE values for the closed-loop simulations are shown in Table 4. The addition of feedback makes the SSE's difficult to interpret since the controller is actually responding to the added measurement noise. The maximum likelihood methods should continue to produce accurate results, since they are feedback invariant per the discussion following (4). In fact, the C CR and CVA techniques were more accurate for the closed loop cases. The addition of correlated noise did not affect the PLS models much (since the SNR in Cases 3 and 4 was about the same) , but the SSE's for CVA and CCR became quite small since the actual noise variance decreased. The identified frequency responses , shown in Figures 3 and 4, reflect these features. Additionally, notice that the SSE does not delineate between the CCR and CVA models greatly, while in fact , the CVA model is considerably more accurate comparing the identified frequency responses . This is due to the fact that SSE is not a sufficient statistic for measuring model accuracy as mentioned in Section 4. The N4SID model also produces good models , although less precise than the CVA models.

6. CONCLUSION Although the simulation used to compare the CCR and PLS regression techniques with the :"4SID and CVA subspace techniques was hardly an exhaustive analysis of the methods , the subspace methods , in particular CVA , performed quite well for correlated noise and in the presence 251

Table 3. SSE of Identified Models: Open Loop Identification Technique PLS FIR PLS ARX CCR FIR CCRARX N4SID CVA

Case 1

Case 2

% SSE CA

% SSE T

% SSE CA

% SSE T

28.4 28 .3 30 .2 28.2 33 .5 27.6

28.9 28 .8 30.1 27.9 32.7 27.6

43.4 43.4 48.5 44.4 45.4 41.3

42.9 43.5 47 .5 43.6 45.0 40 .1

Table 4. SSE of Identified Models: Closed Loop Identification Technique PLS FIR PLS ARX CCR FIR CCR ARX N4SID CVA

Case 3

% SSE CA

% SSE T

62.4 30.1 16.9 15.9 26.6 14 .6

81.0 52.6 9.6 9.1 11.6 8.3

Case 4 % SSE T CA 56 .7 77.3 44.5 30.2 4.6 5.8 2.4 3.1 5.2 14.7 2.3 3.1

% SSE

Fig. 1. Magnitude of identified transfer function (CA (s) / q( s)) for open loop, uncorrelated noise.

Fig. 3. Magnitude of identified transfer function (CA (s) / q( s)) for closed loop, uncorrelated noise.

Fig. 2. Magnitude of identified transfer function (CA (s) / q( s)) for open loop, correlated noise.

Fig. 4. Magnitude of identified transfer function (CA (s) / q( s)) for closed loop, correlated noise.

252