Identification of k-step-ahead prediction error model and MPC control

Jun Zhao (a), Yucai Zhu (a,*), Rohit Patwardhan (b)

(a) State Key Laboratory of Industrial Control Technology, Department of Control Science and Engineering, Zhejiang University, Zheda Road 38, Hangzhou 310027, China
(b) Advanced Process Control Unit, P&CSD, ES, Saudi Aramco, Dhahran, Saudi Arabia

Article history: Received 20 February 2013; Received in revised form 11 October 2013; Accepted 14 October 2013; Available online 25 December 2013

Abstract

This work studies k-step-ahead prediction error model identification and its relationship to MPC control. The choice of error criterion in parameter estimation is discussed for the case where the identified model is used in model predictive control (MPC). Assuming that the model error is dominated by the variance part, it can be shown that a k-step-ahead prediction error model is not optimal for k-step-ahead prediction; the normal one-step-ahead prediction error criterion is optimal for k-step-ahead prediction. It is then argued that the result can still hold when some bias exists. Therefore, for MPC identification of linear processes, one-step-ahead prediction error models are preferable to k-step-ahead prediction error models. Simulations and industrial test data are used to illustrate the idea.

Keywords: Identification; MPC; Error criteria; Industrial application

1. Introduction

Most industrial model predictive control (MPC) systems use identified process models. Naturally, a question arises about the best method for MPC identification. The choice of error criterion in parameter estimation is a key issue in system identification, and a number of researchers have studied it; see [7,4,6,1,2]. In MPC, a model is used to make k-step-ahead predictions. A common belief is that the optimal model for k-step-ahead prediction must be obtained by minimizing the k-step-ahead prediction errors in model identification. Shook et al. [7] and Huang and Wang [4] discuss the use of data prefiltering for improving the k-step-ahead prediction; Gopaluni et al. [2] studied the use of the noise model. These methods aim to minimize a loss function that is the sum of the norms of the k-step-ahead prediction errors, where k varies from 1 to the time to steady state. One problem with these methods is that they only achieve a compromise of prediction quality over the various k's. To overcome this problem, Rossiter and Kouvaritakis [6] suggested the use of multiple models, each of which is optimal for a given k. The difficulty of this approach is the high cost in identification and in MPC computation.

Given a linear time-invariant system in discrete time, denote G^o(e^{iω}) as the frequency response of the true model and Ĝ(e^{iω}) as that of the identified model. Assume that the unmeasured disturbance is a stationary stochastic process; then the identified model is a random variable. Define the bias error of the model as

B(e^{i\omega}) = G^o(e^{i\omega}) - E[\hat{G}(e^{i\omega})] \qquad (1)

and the variance error as 

V^2(e^{i\omega}) = E\left\{\left|\hat{G}(e^{i\omega}) - E[\hat{G}(e^{i\omega})]\right|^2\right\} \qquad (2)

where E denotes mathematical expectation. Then the total mean square error is

M^2(e^{i\omega}) = E\left[\left|G^o(e^{i\omega}) - \hat{G}(e^{i\omega})\right|^2\right] = B^2(e^{i\omega}) + V^2(e^{i\omega}) \qquad (3)
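For completeness, the decomposition in (3) follows by adding and subtracting E[Ĝ(e^{iω})] inside the square; the cross term vanishes because E{Ĝ(e^{iω}) − E[Ĝ(e^{iω})]} = 0. A short derivation:

```latex
M^2(e^{i\omega})
  = E\left[\left|G^o(e^{i\omega}) - \hat{G}(e^{i\omega})\right|^2\right]
  = \underbrace{\left|G^o(e^{i\omega}) - E[\hat{G}(e^{i\omega})]\right|^2}_{B^2(e^{i\omega})}
  + \underbrace{E\left[\left|\hat{G}(e^{i\omega}) - E[\hat{G}(e^{i\omega})]\right|^2\right]}_{V^2(e^{i\omega})}
```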

The bias is caused by undermodeling; the variance is caused by unmeasured disturbances. In MPC applications, the best model should minimize the total error. The works so far aim to minimize only the bias error. When the identification error is dominated by the bias, it is indeed true that an optimal model for k-step-ahead prediction must be obtained by



minimizing the k-step-ahead prediction errors in identification. However, the statement is not true when the model error is dominated by the variance error (caused by disturbances), as will be shown later. In this work, we will first show that when the identification error is dominated by the variance (noise), the traditional one-step-ahead prediction error model gives optimal k-step-ahead predictions for all k. Then we will discuss how to balance the bias error and the variance error in order to minimize the total error. Finally, we will show the applicability of our approach using industrial data. In Section 2 the optimality of the one-step-ahead prediction error model for k-step-ahead prediction is studied; in Section 3 the relation between bias error and variance error is discussed, and an industrial data set is used to show that the variance error can dominate the model error. Section 4 concludes.

2. MPC optimality of the prediction error model

Given a linear time-invariant discrete-time process

y(t) = G^o(q)u(t) + v(t) \qquad (4)
v(t) = H^o(q)e(t)

where u(t) is the process input at time t, y(t) is the process output, v(t) is the disturbance at the output, {e(t)} is zero-mean white Gaussian noise, and G^o(q) and H^o(q) are stable transfer functions of order n. Assume that H^o(q) ≠ 1, which implies that the disturbance v(t) is colored noise. Assume that an identification test has been carried out, and denote the collected input-output data as

\{u(1), y(1), u(2), y(2), \ldots, u(N), y(N)\} \qquad (5)

where N is the number of samples. The model to be identified is

y(t) = G(q)u(t) + H(q)e(t) \qquad (6)

where G(q) is a stable transfer function, H(q) is stable and minimum-phase, and {e(t)} is zero-mean white Gaussian noise. The prediction error model can be parameterized in different ways, such as the Box–Jenkins model and the ARMAX model. The Box–Jenkins model is given as

y(t) = G(q,\theta)u(t) + H(q,\theta)e(t) = \frac{b_0 + b_1 q^{-1} + \cdots + b_n q^{-n}}{1 + a_1 q^{-1} + \cdots + a_n q^{-n}}\,u(t) + \frac{1 + c_1 q^{-1} + \cdots + c_n q^{-n}}{1 + d_1 q^{-1} + \cdots + d_n q^{-n}}\,e(t) \qquad (7)

where θ = [b_0, b_1, ..., b_n, a_1, ..., a_n, c_1, ..., c_n, d_1, ..., d_n]^T is the parameter vector of the process and disturbance model. Introduce the one-step-ahead prediction error

\varepsilon(t+1|t) = y(t+1) - \hat{y}(t+1|t) = \frac{D(q)}{C(q)}\left[y(t+1) - \frac{B(q)}{A(q)}u(t+1)\right] \qquad (8)

and the loss function

V_N^{PE}(\theta) = \frac{1}{N}\sum_{t=1}^{N}\varepsilon^2(t+1|t) \qquad (9)

where N is the number of data samples. Then the parameters are determined by minimizing the loss function of the one-step-ahead prediction error:

\hat{\theta}^{PE} = \arg\min_{\theta} V_N^{PE}(\theta) \qquad (10)
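As a concrete illustration of (8)–(10), the following is a minimal numerical sketch (not the authors' code) of evaluating the one-step-ahead prediction error loss for given Box–Jenkins polynomials, assuming NumPy/SciPy and coefficient arrays in powers of q^{-1} as in (7); the function name is illustrative:

```python
import numpy as np
from scipy.signal import lfilter

def bj_one_step_loss(y, u, B, A, C, D):
    """V_N^PE of (9) for a Box-Jenkins model: per (8), the one-step-ahead
    residual is eps = (D/C)[y - (B/A)u]. Coefficient arrays are in powers
    of q^{-1}, e.g. A = [1, a1, ..., an], B = [b0, b1, ..., bn]."""
    y_proc = lfilter(B, A, u)        # process part (B(q)/A(q)) u(t)
    eps = lfilter(D, C, y - y_proc)  # inverse noise filter (D/C), eq. (8)
    return np.mean(eps ** 2)         # loss (9)
```

Minimizing this loss over the stacked coefficient vector with a numerical optimizer realizes (10); in practice a dedicated prediction error method routine would be used.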

Assuming that the data are persistently exciting, the model order is correct, and the optimization routine reaches the global minimum, it is well known that the Box–Jenkins model is consistent and efficient (minimum variance) for both open-loop and closed-loop tests; see, e.g., Ljung [5]. Similarly, introduce the k-step-ahead prediction error [5]

\varepsilon(t+k|t) = y(t+k) - \hat{y}(t+k|t) = W_k(q)\left[y(t+k) - G(q)u(t+k)\right] = \bar{H}_k(q)\,\varepsilon(t+k|t+k-1) \qquad (11)

where

W_k(q) = \bar{H}_k(q)H^{-1}(q), \qquad \bar{H}_k(q) = \sum_{l=0}^{k-1} h(l)q^{-l} \qquad (12)

and h(l), l = 0, 1, 2, ... is the impulse response of the disturbance model H(q). Denote the loss function of the k-step-ahead prediction error as

V_N^{kstep}(\theta) = \frac{1}{N}\sum_{t=1}^{N}\varepsilon^2(t+k|t) \qquad (13)


Then the parameters are determined by minimizing the loss function of the k-step-ahead prediction error:

\hat{\theta}^{kstep} = \arg\min_{\theta} V_N^{kstep}(\theta) \qquad (14)

From (11), we can see that the two loss functions (9) and (13) have the following relation:

V_N^{kstep}(\theta) = \bar{H}_k^2(q)\,V_N^{PE}(\theta) \qquad (15)

to be understood symbolically: the k-step-ahead prediction error is the one-step-ahead prediction error filtered by H̄_k(q). This relation implies that a k-step-ahead prediction error model can be obtained approximately as a one-step-ahead prediction error model estimated on input-output data filtered by H̄_k(q).
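A minimal sketch of this prefiltering step, under the same assumptions as above (SciPy available, polynomials in powers of q^{-1}; function names illustrative). H̄_k(q) of (12) is obtained from the first k impulse response coefficients of H(q) = C(q)/D(q):

```python
import numpy as np
from scipy.signal import lfilter

def hbar_k(C, D, k):
    """First k impulse-response coefficients h(0), ..., h(k-1) of
    H(q) = C(q)/D(q), i.e. the FIR filter Hbar_k(q) of eq. (12)."""
    impulse = np.zeros(k)
    impulse[0] = 1.0
    return lfilter(C, D, impulse)

def prefilter(y, u, C, D, k):
    """Filter the data with Hbar_k(q). Per relation (15), a one-step-ahead
    prediction error model estimated on (yf, uf) approximates the
    k-step-ahead prediction error model."""
    hk = hbar_k(C, D, k)
    return lfilter(hk, [1.0], y), lfilter(hk, [1.0], u)
```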

In MPC control, the model is used to predict the future output. With some approximation, the best MPC identification should minimize the following loss function (see, e.g., Gopaluni et al. [2]):

V_{MPC} = \frac{1}{P}\sum_{k=1}^{P} E\,\varepsilon^2(t+k|t) = \frac{1}{P}\sum_{k=1}^{P} E\left(y(t+k) - \hat{y}(t+k|t)\right)^2 \qquad (16)

where E denotes expectation and P is the prediction horizon.

Remark. The underlying input signal in (16) is the control signal during MPC control, NOT the input signal during the identification test.

In general, minimizing the loss function (16) in identification is not an easy task. Three situations can be distinguished:

(1) There exists a single identified model that minimizes each k-step-ahead prediction error Eε²(t+k|t) for k = 1, 2, ..., P. In this case, using this single model minimizes the loss function (16). We will show in the sequel that this is the situation when the model error is caused by unmeasured disturbances and the bias error is negligible.

(2) There is no single identified model that minimizes each k-step-ahead prediction error Eε²(t+k|t) for k = 1, 2, ..., P. This is the situation when the model error is dominated by the bias, which has been studied by most researchers. In order to minimize the loss function (16), one can estimate multiple models, each of which is optimal for a given k, and then use the multiple models in MPC control, where each model predicts the output for the corresponding k. This is the approach proposed in [6].

(3) There is no single identified model that minimizes each k-step-ahead prediction error Eε²(t+k|t) for k = 1, 2, ..., P, and a single model is estimated by minimizing the loss function (16). This approach is taken by [7,4,1] and Gopaluni et al. [2].

It is easy to see that the multiple-model approach in situation (2) is preferable to the single-model approach in (3), because multiple models have a much higher degree of freedom in optimizing the loss function (16).

Now the identified model will be used in MPC to predict the process output using an input u*(t). Note that u*(t) has NOT been used in identification. Denote [Ĝ(q, θ̂^{PE}), Ĥ(q, θ̂^{PE})] as the one-step-ahead prediction error model estimated using the test data and [Ĝ(q, θ̂^{kstep}), Ĥ(q, θ̂^{kstep})] as the k-step-ahead prediction error model. We will now check which of the two models performs better in minimizing the MPC relevant loss function (16). Denote



V_{MPC}^{PE} = \frac{1}{P}\sum_{k=1}^{P} E\,\varepsilon^2(t+k|t, \hat{\theta}^{PE}) = \frac{1}{P}\sum_{k=1}^{P} E\left(y(t+k) - \hat{y}(t+k|t, \hat{\theta}^{PE})\right)^2 \qquad (17)

as the MPC relevant loss function of the one-step-ahead prediction error model, and denote

V_{MPC}^{kstep} = \frac{1}{P}\sum_{k=1}^{P} E\,\varepsilon^2(t+k|t, \hat{\theta}^{kstep}) = \frac{1}{P}\sum_{k=1}^{P} E\left(y(t+k) - \hat{y}(t+k|t, \hat{\theta}^{kstep})\right)^2 \qquad (18)

as the MPC relevant loss function of the P k-step-ahead prediction error models, k = 1, 2, ..., P. We will compare each term of the loss functions (17) and (18). It is clear that the first terms of (17) and (18) are the same, because for k = 1 both models are one-step-ahead prediction error models. Now assume k > 1; we will compare the sizes of Eε²(t+k|t, θ̂^{PE}) and Eε²(t+k|t, θ̂^{kstep}), and consequently those of V_{MPC}^{PE} and V_{MPC}^{kstep}.

Theorem 1. Given the linear process in (4), assume that:

(1) the identification test has been performed using a test signal with a continuous power spectrum;
(2) the true process model order is n, which is used for estimating both the one-step-ahead prediction error model Ĝ(q, θ̂^{PE}) and the k-step-ahead prediction error model Ĝ(q, θ̂^{kstep});
(3) the numerical search routines have converged to their global minima;
(4) the MPC control input u*(t) is a stationary stochastic process with spectrum Φ_{u*}(ω), and u*(t) and the identification test input u(t) are not correlated.

Then, for k > 1,

E\varepsilon^2(t+k|t, \hat{\theta}^{PE}) < E\varepsilon^2(t+k|t, \hat{\theta}^{kstep}) \quad \text{as } N \to \infty \qquad (19)

Remark. The inequality (19) has an immediate use in MPC control, namely, (19) implies that


V_{MPC}^{PE} < V_{MPC}^{kstep} \quad \text{as } N \to \infty \qquad (20)

where V_{MPC}^{PE} is given in (17) and V_{MPC}^{kstep} in (18).

Proof. The proof is based on the fact that the one-step-ahead prediction error model is an efficient (minimum variance) estimator and the k-step-ahead prediction error model is not. Denote θ^o as the vector of true process model parameters. Under assumptions (1)–(3), according to [5,11], the one-step-ahead prediction error model is consistent, that is,

\theta^o - \hat{\theta}^{PE} \to 0 \quad \text{with probability one as } N \to \infty \qquad (21)

and efficient (minimum variance), that is,

\mathrm{cov}(\hat{\theta}^{PE}) < \mathrm{cov}(\hat{\theta}^{*}) \qquad (22)

where θ̂* is any other model parameter vector. Note that the inequality holds strictly because of condition (1), namely, the input is not a sum of sinusoids; see [11]. Therefore, comparing the one-step-ahead prediction error model with the k-step-ahead prediction error model, we have

\mathrm{cov}(\hat{\theta}^{PE}) < \mathrm{cov}(\hat{\theta}^{kstep}) \qquad (23)

By Taylor expansion,

\hat{W}_k(e^{i\omega}, \hat{\theta}^{PE})\,\hat{G}(e^{i\omega}, \hat{\theta}^{PE}) - W_k^o(e^{i\omega}, \theta^o)\,G^o(e^{i\omega}, \theta^o) = \left[\left(W_k(e^{i\omega}, \theta^o)G(e^{i\omega}, \theta^o)\right)'\right]^T(\hat{\theta}^{PE} - \theta^o) + O(|\hat{\theta}^{PE} - \theta^o|) \qquad (24)

where

\left(W_k(e^{i\omega}, \theta^o)G(e^{i\omega}, \theta^o)\right)' = \partial\left[W_k(e^{i\omega}, \theta)G(e^{i\omega}, \theta)\right]/\partial\theta\,\big|_{\theta=\theta^o}

From here on, we will drop the argument e^{iω} of frequency responses when it does not cause confusion. Because the k-step-ahead prediction error is a filtered one-step-ahead prediction error (see (11) or (15)), the k-step-ahead prediction error model can be obtained by a one-step-ahead prediction error method after pre-filtering the input-output data. It is known that data pre-filtering preserves the consistency property (see, e.g., [5]); hence the k-step-ahead prediction error model is also consistent. Then, similarly, we have the Taylor expansion



\hat{W}_k(\hat{\theta}^{kstep})\,\hat{G}(\hat{\theta}^{kstep}) - W_k^o(\theta^o)\,G^o(\theta^o) = \left[\left(W_k(\theta^o)G(\theta^o)\right)'\right]^T(\hat{\theta}^{kstep} - \theta^o) + O(|\hat{\theta}^{kstep} - \theta^o|) \qquad (25)

Then, according to (24), we obtain the asymptotic variance expression in the frequency domain, namely, as N → ∞,

\mathrm{var}\left[\hat{W}_k(\hat{\theta}^{PE})\hat{G}(\hat{\theta}^{PE})\right] = \left[\left(W_k(\theta^o)G(\theta^o)\right)'\right]^T \mathrm{cov}(\hat{\theta}^{PE}) \left[\left(W_k(\theta^o)G(\theta^o)\right)'\right] \qquad (26)

Similarly, from (25) we have for the k-step-ahead prediction error model, as N → ∞,

\mathrm{var}\left[\hat{W}_k(\hat{\theta}^{kstep})\hat{G}(\hat{\theta}^{kstep})\right] = \left[\left(W_k(\theta^o)G(\theta^o)\right)'\right]^T \mathrm{cov}(\hat{\theta}^{kstep}) \left[\left(W_k(\theta^o)G(\theta^o)\right)'\right] \qquad (27)

Because the one-step-ahead prediction error model is consistent, we have, as N → ∞,

E\varepsilon^2(t+k|t, \hat{\theta}^{PE}) = \mathrm{var}\,\varepsilon(t+k|t, \hat{\theta}^{PE}) = \frac{1}{2\pi}\int_{-\pi}^{\pi}\left[\left(W_k(\theta^o)G(\theta^o)\right)'\right]^T \mathrm{cov}(\hat{\theta}^{PE}) \left[\left(W_k(\theta^o)G(\theta^o)\right)'\right]\,\Phi_{u^*}(\omega)\,d\omega \qquad (28)

For the k-step-ahead prediction error model we have, as N → ∞,

E\varepsilon^2(t+k|t, \hat{\theta}^{kstep}) = \mathrm{var}\,\varepsilon(t+k|t, \hat{\theta}^{kstep}) = \frac{1}{2\pi}\int_{-\pi}^{\pi} \mathrm{var}\left[\hat{W}_k(\hat{\theta}^{kstep})\hat{G}(\hat{\theta}^{kstep})\right]\,\Phi_{u^*}(\omega)\,d\omega
= \frac{1}{2\pi}\int_{-\pi}^{\pi}\left[\left(W_k(\theta^o)G(\theta^o)\right)'\right]^T \mathrm{cov}(\hat{\theta}^{kstep}) \left[\left(W_k(\theta^o)G(\theta^o)\right)'\right]\,\Phi_{u^*}(\omega)\,d\omega \qquad (29)

When comparing the two error terms we have, as N → ∞,

E\varepsilon^2(t+k|t, \hat{\theta}^{PE}) - E\varepsilon^2(t+k|t, \hat{\theta}^{kstep}) = \mathrm{var}\,\varepsilon(t+k|t, \hat{\theta}^{PE}) - \mathrm{var}\,\varepsilon(t+k|t, \hat{\theta}^{kstep})
= \frac{1}{2\pi}\int_{-\pi}^{\pi}\left[\left(W_k(\theta^o)G(\theta^o)\right)'\right]^T \left[\mathrm{cov}(\hat{\theta}^{PE}) - \mathrm{cov}(\hat{\theta}^{kstep})\right] \left[\left(W_k(\theta^o)G(\theta^o)\right)'\right]\,\Phi_{u^*}(\omega)\,d\omega < 0 \qquad (30)

Note that the last inequality is due to the minimum variance property (23) of the one-step-ahead prediction error model. This proves (19). End of proof.

This result is somewhat counterintuitive. Some readers may have believed that a k-step-ahead prediction error model must always minimize the k-step-ahead prediction error loss function. This is true when the data are noise-free and bias is the only error source; it is also true when the identification test data are used to evaluate the loss function. However, when the model order is correct and the data are noisy, the one-step-ahead prediction error model will have the minimal k-step-ahead prediction error loss function when evaluated on a fresh data set. The source of the deviation is the difference in parameter variances, not a comparison of 1/N to 1/(N − k).

[Fig. 1. The k-step-ahead prediction error loss function of the Box–Jenkins model (solid) and that of the k-step-ahead prediction error model (dashed), plotted against the number of simulations.]

A Simulation Example. Given the process

y(t) = \frac{q^{-1} + 0.5q^{-2}}{1 - 1.5q^{-1} + 0.7q^{-2}}\,u(t) + v(t) \qquad (31)

with

v(t) = \frac{\alpha(1 - 0.7q^{-1})}{1 - 0.97q^{-1}}\,e(t) \qquad (32)

The input signal is a GBN (generalized binary noise, [8]) signal with an average switch time of 10 samples. The test length is 1000 samples. The disturbance power is 15% of that of the noise-free output, which is a realistic signal-to-noise ratio in MPC applications. For each simulation run, two data sets are generated: the first for model identification, the second for loss function evaluation. In total, 100 open-loop simulations have been carried out. The data of each run are used to estimate a 2nd-order Box–Jenkins model and a 2nd-order k-step-ahead prediction error Box–Jenkins model; the correct disturbance model order, which is 1, is used.

Estimating a k-step-ahead prediction error Box–Jenkins model is not an easy task, because the coefficients of the polynomial H̄_k(q) are not given explicitly in the model parameters. Here some approximation is used based on relation (15): the k-step-ahead prediction error Box–Jenkins model is obtained by first pre-filtering the input-output data with the estimated H̄_k(q) of the identified Box–Jenkins model, and then estimating a Box–Jenkins model using the filtered data. The following k-step-ahead prediction error loss functions are evaluated on the second data set to compare the quality of the two models:

V_k^{PE} = \frac{1}{1000}\sum_{t=1}^{1000}\left[\hat{W}_k^{PE}(q)\left(y(t) - \hat{G}^{PE}(q)u(t)\right)\right]^2
V_k^{kstep} = \frac{1}{1000}\sum_{t=1}^{1000}\left[\hat{W}_k^{kstep}(q)\left(y(t) - \hat{G}^{kstep}(q)u(t)\right)\right]^2 \qquad (33)
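One way to reproduce a single run of this setup is sketched below; this is an illustrative reconstruction, not the authors' script, and the Box–Jenkins estimation itself would be done with an identification toolbox:

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(0)

def gbn(n, t_sw, rng):
    """Generalized binary noise [8]: a +/-1 signal that switches sign with
    probability 1/t_sw per sample (average switch time t_sw samples)."""
    u = np.empty(n)
    level = 1.0
    for t in range(n):
        if rng.random() < 1.0 / t_sw:
            level = -level
        u[t] = level
    return u

N = 1000
u = gbn(N, 10, rng)                                  # test input
y0 = lfilter([0.0, 1.0, 0.5], [1.0, -1.5, 0.7], u)   # noise-free output, (31)
v0 = lfilter([1.0, -0.7], [1.0, -0.97],
             rng.standard_normal(N))                 # disturbance shape, (32)
alpha = np.sqrt(0.15 * np.var(y0) / np.var(v0))      # 15% disturbance power
y = y0 + alpha * v0                                  # measured output
```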

The results for k = 30 are shown in Fig. 1. One can see that the Box–Jenkins model indeed has the smaller loss function most of the time; on some occasions, the k-step-ahead prediction error model has a slightly smaller loss function (see Fig. 1).

We have now shown that when the bias is negligible and noise is the cause of the model errors, the one-step-ahead prediction error model minimizes the k-step-ahead prediction error loss function. This is an idealized situation, and we are not yet sure whether this result is useful for real industrial applications. For later discussion we will call this the variance approach. The other situation is that the noise is negligible and the model errors are caused by undermodeling and nonlinearity. The existing works on MPC relevant identification study this situation; see [7,4,6,1] and Gopaluni et al. [2]. For later discussion we will call this the bias approach.

3. Bias error versus variance error

In industrial MPC applications, identified process models have both bias, due to the high order of the true process and to nonlinearity, and variance, due to unmeasured disturbances. Unfortunately, no theory is available that can analyze the total mean square error; what we have so far are the bias approach, which ignores the variance error, and the variance approach, which ignores the bias error. The question is which approach should be used for selecting an MPC relevant estimation criterion. We will answer the question for two situations: (1) the process can be represented by a linear process; (2) the process can be approximated very well in small operating ranges and is nonlinear over large operating ranges. The second situation represents real industrial applications.


[Fig. 2. Inputs of one test of the main distillation column (input signals u1–u7 of the crude unit; x-axis: samples).]

In situation (1), we want to identify the most accurate model by minimizing the total mean square error. How shall we balance the bias error and the variance error? An answer to the question is based on the following result [3].

Theorem 2. Assume that the process is linear and time-invariant with arbitrary order. Denote the bias error of the model as

B(e^{i\omega}) = G^o(e^{i\omega}) - E[\hat{G}(e^{i\omega})] \qquad (34)

and the variance error as

V^2(e^{i\omega}) = E\left\{\left|\hat{G}(e^{i\omega}) - E[\hat{G}(e^{i\omega})]\right|^2\right\} \qquad (35)

Then the total mean square error is

M^2(e^{i\omega}) = E\left[\left|G^o(e^{i\omega}) - \hat{G}(e^{i\omega})\right|^2\right] = B^2(e^{i\omega}) + V^2(e^{i\omega}) \qquad (36)

Assume also that, as the model order increases, the decrease of the bias error B²(e^{iω}) is faster than the increase of the variance error V²(e^{iω}). Then, for the optimal model (order) that minimizes the total error M²(e^{iω}), the following relation holds:

B^2(e^{i\omega}) \le V^2(e^{i\omega}) \qquad (37)

that is, the total error is dominated by the variance error.

[Fig. 3. Four outputs of the main distillation column (output signals y1–y4 of the crude unit; x-axis: samples).]


[Fig. 4. FOEs of the ARMAX models for outputs y1–y4, plotted against model order.]

Therefore, in model identification of linear processes, the most accurate model will have the larger variance error. It is then logical to consider the variance error in identification, and, based on the result of Section 2, a one-step-ahead prediction error model is preferable to P k-step-ahead prediction error models.

In industrial MPC identification, where linear models are used, although the process is nonlinear over a wide operating range, we avoid exciting the nonlinearity of the underlying process for two reasons:

(1) In identification tests, it is in general not allowed to disturb normal process operation or to cause products to go off specification. Therefore, the amplitudes of the test signals are small and the signal-to-noise ratio of the data set is not very high.
(2) If in some periods of the test some process outputs show too large variations due to improper tests or unmeasured disturbances, these data portions are removed (so-called data slicing) and not used for model identification.

Therefore, in industrial applications, although bias is inevitable, the unmeasured disturbance is most often the main cause of model errors, provided that the model orders are properly selected and not too low. In the following we will verify our reasoning using industrial test data.

An Industrial Case Study. The process is a crude unit at a European refinery, which has been studied in [9]. The data set is obtained from an identification test of the main distillation column of the crude unit. The identified model was used in MPC control. Two open-loop tests were carried out. The data set contains the inputs (MVs) and four outputs (CVs) of one test. Output 1 is the temperature difference of two trays; outputs 2–4 are product qualities measured by online analyzers. There are 7 inputs: inputs 1–3 are temperature setpoints, input 4 is a flow setpoint, inputs 5 and 6 are flow ratio setpoints, and input 7 is a flow measurement, which is the measured disturbance in the MPC controller. The sampling time is 1 min. We make the data available for researchers in system identification; interested readers can contact the corresponding author to obtain the data.

The data were pre-processed using the following operations (a code sketch follows below):

(1) The means of all input-output signals are removed.
(2) The standard deviations of all input-output signals are scaled to 1.
(3) There are very large delays in the second output. This output signal is shifted forward by 80 samples, which means that 80 delays are removed.

Samples 1:2000 are used for model identification and will be called the identification data set; samples 2001:3300 will be called the validation data set. The validation data set is used to evaluate model quality, namely, to calculate the loss functions of the k-step-ahead prediction errors of the various models.

Determine the process delays. According to process knowledge, there are considerable delays between the process inputs and outputs. Therefore, different delays are used in model estimation and the best delays are determined using the so-called final output error (FOE) criterion; see Zhu [10] for the motivation of FOE and its comparison to the FPE (final prediction error) criterion. To keep it simple, the same delay is used for each output. The determined delays for the four outputs are: 4, 20, 20 and 20 samples.
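A sketch of the pre-processing and data split described above, assuming the raw test data are held in arrays U (N x 7 inputs) and Y (N x 4 outputs); the array names and the helper function are illustrative:

```python
import numpy as np

def preprocess(U, Y, shift=80, n_id=2000, n_end=3300):
    """(1) remove means, (2) scale to unit standard deviation, (3) shift
    output 2 forward by `shift` samples to remove its large delay, then
    split into identification and validation sets."""
    U = (U - U.mean(axis=0)) / U.std(axis=0)
    Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    Y[:-shift, 1] = Y[shift:, 1].copy()     # forward-shift the 2nd output
    U, Y = U[:-shift], Y[:-shift]           # drop the undefined tail
    ident = (U[:n_id], Y[:n_id])            # samples 1:2000
    valid = (U[n_id:n_end], Y[n_id:n_end])  # samples 2001:3300
    return ident, valid
```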


[Fig. 5. FPEs of the 30-step-ahead ARMAX models for outputs y1–y4, plotted against model order.]

[Fig. 6. The loss functions of the ARMAX model (x) and of the k-step-ahead ARMAX models (o) for outputs y1–y4, plotted against prediction horizon k.]

Note that 80 delays have already been removed for output 2, which means that the total delay for output 2 is 100 samples.

Model order selection and parameter estimation. The Box–Jenkins model was tried first but failed to converge, due to instability of the estimated models. The reason can be that, in Matlab, the Box–Jenkins model structure is too complex for multivariable systems, as each transfer function is parameterized individually. Therefore, ARMAX models are estimated using the identification data set. In Matlab, the ARMAX model has a diagonal form, meaning that for each output a common denominator polynomial is used for all inputs, i.e., the model is in multi-input single-output (MISO) form. For each output, ARMAX models with orders between 1 and 20 are estimated and their FOEs (final output error criterion) are calculated for order selection; a simplified sketch of such a loop is given below. Fig. 4 shows the FOEs of the ARMAX models. The orders [5, 3, 4, 4] are selected for the ARMAX model.

The k-step-ahead ARMAX models are obtained in the same way as before, namely, by first pre-filtering the input-output data with the H̄_k(q) of the corresponding ARMAX model and then estimating a normal ARMAX model using the filtered data. Five k-step-ahead ARMAX models are estimated: k = 10, 20, 30, 40, 50. For each output, k-step-ahead ARMAX models with orders between 1 and 20 are estimated and their FPEs (final prediction error criterion, [5]) are calculated for order selection; here the k-step-ahead prediction errors are used in calculating the FPEs. Fig. 5 shows the FPEs of the 30-step-ahead ARMAX models. The order selection results for the k-step-ahead ARMAX models are:

Orders for k = 10: [1, 1, 1, 1];
Orders for k = 20: [1, 1, 1, 1];
Orders for k = 30: [1, 1, 1, 1];
Orders for k = 40: [15, 1, 1, 1];
Orders for k = 50: [10, 1, 20, 1].
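The order-selection loop referred to above can be sketched as follows. As a stand-in for the Matlab ARMAX estimation used in the paper, this illustrative helper fits a SISO ARX model by least squares and scores each candidate order by its output (simulation) error on the validation data, which is the cross-validation flavor of the FOE idea rather than the exact FOE formula of [10]; all function names are hypothetical:

```python
import numpy as np
from scipy.signal import lfilter

def fit_arx(y, u, na, nb, delay):
    """Least-squares ARX fit A(q)y = B(q)u + e with input delay; an
    illustrative SISO stand-in for the MISO ARMAX estimation in the text."""
    n0 = max(na, nb + delay)
    rows = [np.r_[-y[t - na:t][::-1], u[t - delay - nb + 1:t - delay + 1][::-1]]
            for t in range(n0, len(y))]
    theta = np.linalg.lstsq(np.asarray(rows), y[n0:], rcond=None)[0]
    A = np.r_[1.0, theta[:na]]              # [1, a1, ..., a_na]
    B = np.r_[np.zeros(delay), theta[na:]]  # delay leading zeros, then b's
    return A, B

def output_error(y, u, A, B):
    """Mean squared simulation (output) error on fresh data: y - (B/A)u."""
    return np.mean((y - lfilter(B, A, u)) ** 2)

def select_order(y_id, u_id, y_val, u_val, delay, max_order=20):
    """Pick the order in 1..max_order with the smallest output error on
    the validation set."""
    scores = [output_error(y_val, u_val, *fit_arx(y_id, u_id, n, n, delay))
              for n in range(1, max_order + 1)]
    return int(np.argmin(scores)) + 1
```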


One can see that most of the selected model orders are very low. The reason can be that, for large k, the model focuses on the low-frequency part of the process, where the process shows low-order behavior.

Compare the k-step-ahead prediction errors. The loss functions of the k-step-ahead prediction errors of the best ARMAX models, as in (17), and of the k-step-ahead ARMAX models, as in (18), are calculated using the validation data set, which has not been used in model identification; see Fig. 6. One can see that the one-step-ahead ARMAX model has the smaller loss functions for all outputs and for all prediction horizons. This study confirms that in industrial identification of linear models, the variance error can dominate the bias error.

4. Conclusion

We have studied k-step-ahead prediction error model identification. Under the condition that the model order is correct and the data are noisy, we have shown that a k-step-ahead prediction error model does not minimize the k-step-ahead prediction error loss function when evaluated on a fresh data set. When both bias error and variance error are present in industrial process identification, one should choose a model order such that the variance is larger than the bias; therefore, our result also applies when some bias exists. Industrial data are used to show the optimality of normal one-step-ahead prediction error models in k-step-ahead prediction. Hence, for industrial applications of linear MPC, the authors' recommendation is to use normal prediction error models with orders that are not too low.

Acknowledgments

The work of Zhu is supported by the 973 Program of China (No. 2012CB720500) and by the National Science Foundation of China (No. 61273191). The work of Zhao is supported by the National Science Foundation of China (No. 61273146).

References

[1] R.B. Gopaluni, R.S. Patwardhan, S.L. Shah, The nature of data pre-filters in MPC relevant identification—open and closed-loop issues, Automatica 39 (2003) 1617–1626.
[2] R.B. Gopaluni, R.S. Patwardhan, S.L. Shah, MPC relevant identification—tuning the noise model, Journal of Process Control 14 (2004) 699–714.
[3] L. Guo, L. Ljung, The role of model validation for assessing the size of the unmodelled dynamics, in: Proceedings of the 33rd Conference on Decision and Control, Lake Buena Vista, FL, 1994, pp. 3894–3899.
[4] B. Huang, Z. Wang, The role of data prefiltering for integrated identification and model predictive control, in: Proceedings of the 14th IFAC World Congress, Beijing, China, 1999.
[5] L. Ljung, System Identification: Theory for the User, second ed., Prentice-Hall, Englewood Cliffs, NJ, 1999.
[6] J.A. Rossiter, B. Kouvaritakis, Modelling and implicit modelling for predictive control, International Journal of Control 74 (2001) 1085–1095.
[7] D.S. Shook, C. Mohtadi, S.L. Shah, A control-relevant identification strategy for GPC, IEEE Transactions on Automatic Control 37 (1992) 975–980.
[8] H.J.A.F. Tulleken, Generalized binary noise test-signal concept for improved identification-experiment design, Automatica 26 (1) (1990) 37–49.
[9] Y.C. Zhu, Multivariable process identification for MPC: the asymptotic method and its applications, Journal of Process Control 8 (2) (1998) 101–115.
[10] Y.C. Zhu, Multivariable System Identification for Process Control, Elsevier Science Ltd, Oxford, 2001.
[11] T. Söderström, P. Stoica, System Identification, Prentice-Hall, New York, 1989.