Journal of Process Control 39 (2016) 100–110
Contents lists available at ScienceDirect
Journal of Process Control journal homepage: www.elsevier.com/locate/jprocont
Two-time dimensional recursive system identification incorporating priori pole and zero knowledge Zhixing Cao a , Ridong Zhang a , Jingyi Lu a , Furong Gao a,b,∗ a b
Department of Chemical & Biomolecular Engineering, Hong Kong University of Science & Technology, Clear Water Bay, Kowloon, Hong Kong Fok Ying Tung Research Institute, Hong Kong University of Science & Technology, Hong Kong
a r t i c l e
i n f o
Article history: Received 16 March 2015 Received in revised form 2 December 2015 Accepted 17 December 2015 Available online 25 January 2016 Keywords: Batch processes Constrained recursive least squares Minimum phase Priori knowledge Two-time dimensional
a b s t r a c t This paper studies an online identification algorithm for batch processes incorporating priori process knowledge of pole and zero positions. The knowledge is available to control engineers and can be exploited to improve the accuracy of the identified process model. To reduce the computation burden of directly invoking Lyapunov inequality, a bound on the identified parameters is imposed to enforce the match between the priori knowledge and identified model. The bound is recursively calculated according to the newly obtained model. The proposed identification method uses the information not only from the time direction but also along the batch direction to improve the identification performance from batch to batch. A filter is introduced to suppress the variation on the identified parameters. Finally, numerical simulations verify the performance and robustness of the proposed method. © 2016 Elsevier Ltd. All rights reserved.
1. Introduction In the era of modern process control, model has an important role in lots of advanced control techniques, such as model predictive control (MPC) [1–5]and adaptive control [6–10]. Generally speaking, there are two schools of methods to establish a mathematical model – principle based and data based. In the first method, various physical conservation laws are used to link the input to the output with a bunch of partial differential equations (PDEs) and ordinary differential equations (ODEs). These methods require tedious procedures and sophisticated techniques to solve the differential equations for the control purpose. Moreover, these differential equations usually contain some unknown parameters that need to be further determined by experiments or other approaches. Unlike the methods above, system identification is an empirical method to extract model information from the process input and output data; thus there are not the aforementioned obstacles in this method. For the past few decades, this topic has been extensively studied [10–14]. Most of these studies focused on the conditions and methods to capture the dynamics of continuous processes asymptotically. Although these conventional identification methods can apply to batch processes, it often fails to yield satisfactory results, since the dynamics between continuous
∗ Corresponding author at: Department of Chemical & Biomolecular Engineering, Hong Kong University of Science & Technology, Clear Water Bay, Kowloon, Hong Kong. Tel.: +852 23587139; fax: +852 23580054. http://dx.doi.org/10.1016/j.jprocont.2015.12.011 0959-1524/© 2016 Elsevier Ltd. All rights reserved.
and batch processes is quite different [15]. For example, the ultiˆ mate goal for the online conventional identification is lim (t) = 0 , t→∞
ˆ where (t) stands for the estimated parameters of the process and 0 is the true parameters of the process, provided that the process can be delineated by the specific model structure and the parameters. But it is not that plausible for batch processes, owing to the finite duration and non-steady state of batch processes. Therefore, it is necessary to develop new identification methods for batch processes. There are few literatures attributing to this topic. Ma and Braatz [16] developed a stopping criterion for off-line batch process identification. The identification terminates when the worst-case performance index satisfies certain specifications. Tayebi [17] proposed a continuous-time iterative learning adaptive control for robot manipulator with the Lagrangian dynamics assumption. Chi and his co-workers [18] extended the idea to discrete-time systems. The identification method included in both papers estimated the parameters in a two-time dimensional framework without its performance in the transient period discussed. Golshan and MacGregor [19] focused on the closed-loop identifiability condition for batch processes. They argued that adding a dither signal or shifting the control trajectory in a few batches is sufficient for the processes’ identifiability. They also proposed to unfold the input and output data in a two-time dimensional fashion. Cao and his colleagues [15] studied the sufficient conditions for almost sure convergence of a two-time dimensional recursive least squares (2DRLS). It was also pointed out that the severe variation of the parameters identified
Z. Cao et al. / Journal of Process Control 39 (2016) 100–110
along the time direction was a defect of 2DRLS, which necessitates constraints on the estimated parameters into identification. Constraints, a typical form of representing the priori knowledge of process, are ubiquitous in states estimation for refining the posterior estimates. Generally speaking, there are three types of approaches for constrained states estimation. The first method named clipping is to project the posterior estimates into the constrained regions [20,21]. The second method named acceptance/rejection is commonly adopted in constrained particle filtering [22]. Its underlying idea is to refine the posterior distribution by discarding the particles non-compliant with the constraints. The other method is to handle the constraints directly within the optimization. A typical example is moving horizon estimator [23,24]. Nevertheless, there are few papers addressing the constrained identification topic. Ikonen and Najim [25] studied using constraints to incorporate the priori process knowledge. They also pointed out that it was a difficult problem to consider the constraints on poles or zeros. Bruwer and MacGregor [26] studied the experimental design for process identification. They only addressed the input and output constraints via the D-optimal technique. It is known that most processes in nature are stable, thanks to various conservation laws. Hence, this paper intends to merge the priori knowledge on zeros and poles into identification to further enhance the corresponding performance. The identification uses the current time input and output data to refine the estimated parameters of the previous batch. To circumvent the computation issue of directly applying Lyapunov inequality, a bound is introduced to allow the estimated parameters to achieve a suboptimal solution. The bound is also recursively calculated by simple linear matrix inequalities (LMIs). A filter is introduced to overcome the shortcoming of 2DRLS by smoothing the estimated parameters. The paper is organized as follows. Section 2 revisits the basics of recursive least squares (RLS) and 2DRLS and provides the motivation of this paper. Section 3 develops the algorithm. A detailed analysis associated with the algorithm is given in Section 4. Numerical simulations demonstrate the performance and robustness in Section 5. Conclusions are drawn in Section 6.
E[ek (t)] = 0
It has been well known that most batch processes possess certain nonlinearity. In most situations, the process variables are controlled to track a certain batch-wisely identical trajectory, which enables the control engineers to adopt a collection of linear models to approximate the dynamics. It has been reported that this idea has been successfully applied on the injection velocity control in injection molding, a classical example of batch processes [27–29]. To avoid obscuring the focus, only the autoregressive exogenous (ARX) model will be discussed in this paper. Consider the following ARX model:
Here ıi,j , ım,n are Kronecker delta, and ıi,j = 1 if and only if i = j. Eq. (1) can be rewritten as yk (t) = kT (t)0 (t) + ek (t)
[−yk (t − 1) −yk (t − 2)
kT (t) =
where yk (t) and uk (t) are, respectively, the process output and control input of the tth time instant and kth batch. d stands for the delay of the process. na, nb are the order of process output and input dynamics. a1,0 , a2,0 , . . ., ana,0 and b1,0 , b2,0 , . . ., bnb,0 are the output and input parameters. {ai,0 } and {bi,0 } are a function of time. It is also noted that the parameter sequence is batch invariant due to the invariance of input profile. Apparently, if the input profile is constant, the parameters {ai,0 } and {bi,0 } reduce to constants as well. {ek (t)} is subject to identical independent distribution and
−yk (t − na)
...
uk (t − d) uk (t − d − 1)
uk (t − d − nb + 1)]
...
(4) and 0T (t) =
[a1,0 (t)
a2,0 (t)
...
ana,0 (t)
b1,0 (t)
b2,0 (t)
...
(5) bnb,0 (t)]
The length of both vectors is n = na + nb. To resolve the parameters tracking problem, two identifying algorithms – RLS with forgetting factor and 2DRLS will be revisited. RLS with forgetting factor [10]: ˆ k (t) = ˆ k (t − 1) + Kk (t)[yk (t) − kT (t)ˆ k (t − 1)] Kk (t) =
(6a)
Pk (t − 1)k (t)
(6b)
+ kT (t)Pk (t − 1)k (t)
Pk (t − 1)k (t)kT (t)Pk (t − 1) 1 Pk (t) = Pk (t − 1) − + kT (t)Pk (t − 1)k (t)
(6c)
2DRLS [15]: ˆ k (t) = ˆ k−1 (t) + Kk (t)[yk (t) − kT (t)ˆ k−1 (t)] Kk (t) =
(7a)
Pk−1 (t)k (t)
(7b)
1 + kT (t)Pk−1 (t)k (t) Pk−1 (t)k (t)kT (t)Pk−1 (t)
(7c)
1 + kT (t)Pk−1 (t)k (t)
Here stands for the forgetting factor; usually selected between 0.98 (with memory size 50) and 0.995 (with memory size 200) [10]. 2DRLS distinguishes RLS with forgetting factor on two aspects. First, it is the way to update the three equations. Unlike RLS, 2DRLS updates the three equations from the batch direction (k − 1 → k) instead of the time direction (t − 1 → t). Second, there is a forgetting factor involved in RLS, while 2DRLS does not have such a parameter. As stated before, the reason is that the process is time-varying from the time direction, but batch invariant from the batch direction. To further compare these two methods, the following example will be examined [15].
= b1,0 uk (t − d) + b2,0 uk (t − d − 1) + · · · (1)
(3)
And k (t) and 0 (t) are denoted as follows:
yk (t) + a1,0 yk (t − 1) + a2,0 yk (t − 2) + · · · + ana,0 yk (t − na)
+ bnb,0 uk (t − d − nb + 1) + ek (t)
(2)
E[ei (m)ej (n)] = 2 ıi,j ım,n
Pk (t) = Pk−1 (t) −
2. Revisit of RLS and 2DRLS
101
G(z) =
⎧ 1.69z −1 + 1.419z −2 ⎪ ⎪ ⎪ − 1.582z −1 + 0.5916z −2 1 ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ t − 150 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩
)z −1 + 1.419z −2 150 1 − 1.591z −1 + 0.5916z −2
(1.69 − 0.2 ∗
1.49z −1 + 1.419z −2 1 − 1.591z −1 + 0.5916z −2
t ∈ [0, 150)
t ∈ [150, 300) t ∈ [300, 400] (8)
Fig. 1 shows the performance comparison between RLS with a forgetting factor 0.98 and the 50th batch identification of 2DRLS. It apparently shows that 2DRLS stays closer to the true parameter value than RLS with forgetting factor, except with the severe
102
Z. Cao et al. / Journal of Process Control 39 (2016) 100–110
3. Algorithm derivation
Parameter a(1)
-1.56
Without loss of generality, it is assumed that the characteristic polynomial PA (z, t) associated with {a1,0 (t), a2,0 (t), . . ., ana,0 (t)} is stable for any t. According to this priori knowledge, it requires to impose a constraint on the estimates, viz., |ri (PAˆ (z, t, k))| < 1 or simply |ri (t, k)| < 1, where ri (·) stands for the ith root of the characteristic polynomial equation associated with the estimate on the tth time and the kth batch PAˆ (z, t, k) = 0. Or equivalently, according to Lyapunov equation theorem, one can have that
-1.58
-1.60
RLS 50th batch of 2DRLS True value
-1.62 0
50 100 150 200 250 300 350 400 ˆ T (t) − M = −Q ˆ k (t)M A A k
Time(10ms) Fig. 1. Comparison between RLS and 2DRLS.
ˆ k (t) is the conwhere M, Q are two positive definite matrices and A troller canonical realization form denoted as
Spectral radius
1.10 RLS 2nd batch of 2DRLS
1.05
ˆ k (t) = A
T −ˆ A,k,1×na (t)
with
0.95
T (t) = aˆ 1,k (t) ˆ A,k,1×na
(10)
0.90
Ina = I(na−1)×(na−1)
100
200
300
min estimate variation. More specifically, if we use PTE and SI 1 to describe the two algorithms, PTE of RLS and 2DRLS are 20.414 and 0.2413 respectively, while SI of RLS and 2DRLS are 0.0164 and 0.3759 respectively. The underlying reason is that 2DRLS utilizes the repetitiveness of batch processes, thus being able to capture the dynamics change very fast. It should also be noted that RLS has inherent capabilities to smooth its estimates, viz., lim ˆ k (t) − t→∞
| 0(na−1)×1
aˆ na,k (t)
T
(11) (12)
Then, it is easy to rewrite the identification problem as an optimization problem.
Fig. 2. The maximum magnitude of the characteristic polynomials associated with the identifications.
ˆ k (t − 1)2 = 0 or more generally, lim ˆ k (t) − ˆ k (t
aˆ 2,k (t) . . .
400
Time(10ms)
t→∞ − i)2 =
0 for
any positive integer i. Please find the detailed proof of these statements in pp. 288–291 [10]. However, 2DRLS does not own such abilities that yield smooth estimates preferred by a control engineer. Further note that G(z) is stable but its spectral radius is close to the unit circle. Fig. 2 plots the special radius of estimates comparison between RLS with a forgetting factor and the second batch identification of 2DRLS. The figure illustrates that many estimates associated with 2DRLS violate the process stability. Thereafter, it is the motivation of this paper to develop a system identification method by incorporating priori process knowledge for identification performance improvement.
ˆ k (t),M,Q
Parameter tracking error (PTE) and smoothing index (SI) are defined as follows.
k 2 yi (t) − iT (t)ˆ k (t) i=1
ˆ k (t)M A ˆ T (t) − M = −Q A k
s.t.
(13)
M>0 Q>0 Here we define the objective function following the fashion in RLS (cf. p. 183 [10]). Basically, it means that we need to find an optimal vector in the space generated by the past inputs, i.e., {i (t), i = 1, 2, . . . }, to describe the past observations, i.e., {yi (t), i = 1, 2, . . . }. By observing the optimization problem above, although its cost function is in a quadratic form, the first term in Eq. (9) is a product of two optimization variable; thus, it is non-convex and the problem cannot be solved in polynomial time. Then, the task becomes to find a plausible suboptimal solution. Instead of searching the optimal ˆ k (t) in the whole space, only the vicinity of the previous estimate ˆ k−1 (t) is searched. The reason is that it is believed that the previous estimate ˆ k−1 (t) is accurate enough that the current estimate is not far away from it. Mathematically, it can be written as
min
k
ˆ k (t)
s.t.
Ina
1.00
0
1
(9)
yi (t) − iT (t)ˆ k (t)2
(14)
i=1
ˆ k (t) − ˆ k−1 (t)M ≤ k (t)
L
PTE(k) =
ˆ k (t) − 0 (t)2
t=1
SI(k) =
L
x2M = xT Mx, M is a positive definite weight matrix. The following lemma discusses the solution to the optimization above. Lemma 1. The explicit solution for Eq. (14) is as follows. If the constraint is inactive, then
ˆ k (t) − ˆ k (t − 1)2
t=10
PTE quantifies an identification algorithm’s capability to track the real system dynamics, while SI describes the smoothness of the parameter estimate.
ˆ k (t) =
k i=1
−1 i (t)iT (t)
k i=1
i (t)yi (t)
(15)
Z. Cao et al. / Journal of Process Control 39 (2016) 100–110
103
Otherwise, ˆ k (t)
= ˆ k−1 (t) −
×
k
−1 i (t)iT (t) + ω M
i=1 k
i (t)iT (t)
k
ˆ k−1 (t) +
i=1
i (t)yi (t)
(16)
i=1
where ω is solved from T [ˆ k−1 (t)Pk−1 (t) + NkT (t)](Pk−1 (t) + ω M)
= 2k (t)
−2
[Pk−1 (t)ˆ k−1 (t) + Nk (t)] (17)
and Pk−1 (t)
k i=1
i (t)i (t) and Nk (t)
k i=1
Fig. 3. Illustration of the projection method.
i (t)yi (t).
Proof. The Lagrangian L : Rn × R → R associated with the optimization problem defined in Eq. (14) is L(ˆ k (t), ω) =
k
yi (t) − iT (t)ˆ k (t)2 + ω[ˆ k (t) − ˆ k−1 (t)2M − 2k (t)]
(18)
i=1
where ω ≥ 0 is referred as the Lagrange multiplier associated with the inequality constraint. Since both the cost function and constraint are quadratic, the Karush–Kuhn–Tucker (KKT) condition is the sufficient and necessary condition for the solution of the optimization problem [30]. Calling the KKT condition, we can have
the following derivations give a simple solution based on the projection method. Suppose that ˆ k2D (t) stands for the solution without considering the constraint, which is exactly the solution of 2DRLS. The solution lies out of the bound as shown in Fig. 3. The projection method suggests to project the 2DRLS solution ˆ k2D (t) onto the bound around ˆ k−1 (t) based on the Euclidian distance. Followed this, it is easy to reach that ˆ k (t) = ˆ k−1 (t) + ˛k (t)[ˆ k2D (t) − ˆ k−1 (t)] = [1 − ˛k (t)]ˆ k−1 (t) + ˛k (t)ˆ k2D (t)
(21)
ˆ k (t) − ˆ k−1 (t)2M − 2k (t) ≤ 0
(19a)
Here ˛k (t) is a positive real number between zero and one defined as
ω ≥ 0
(19b)
˛k (t)
ω [ˆ k (t) − ˆ k−1 (t)2M − 2k (t)] = 0
(19c)
∂ˆ k (t)
=0⇒
[i (t)iT (t)]ˆ k (t)
i=1
−
k
i (t)yi (t) + ω M[ˆ k (t) − ˆ k−1 (t)] = 0
(19d)
i=1
From Eq. (19c), it shows that if the constraint is inactive, ω = 0 and the optimal solution is as given in Eq. (15). Otherwise, ω > 0 and ˆ k (t) − ˆ k−1 (t) = k (t). From Eq. (19d), one can get that ˆ k (t)
=
k
−1 i (t)iT (t) + ω M
i=1
= ˆ k−1 (t) −
×
k i=1
k i=1
ˆ
ω k−1 (t) +
−1
k
i (t)yi (t)
i=1
i (t)iT (t) + ω M
i (t)iT (t)
(22)
Note that the definition of ˛k (t) above may be conservative. Because the bound k (t) yielded is actually imposed on the first na components of ˆ k (t); the denominator above also penalize the parameters on inputs, i.e., bi,k (t), and may lead to an ˛k (t) too small, or a slower convergence rate. Therefore, ˛k (t) above can be modified to
k
∂L
k (t) ˆ k2D (t) − ˆ k−1 (t)M
ˆ k−1 (t) +
k
i (t)yi (t)
(20)
i=1
Apparently, the optimal ω can be solved from the equality of the active constraint as given in Eq. (17). 䊐 Remark 1. The optimal solution for the inactive constraint is equivalent to the recursive solution of 2DRLS given in Eq. (7), refer to Cao et al. for the similar proof [15]. Also note that there is only one variable that needs to be solved in Eq. (17). The equation can be solved by numerical methods, such as Newton’s method. In order to further improve computation efficiency without solving Eq. (17),
˛k (t) =
k (t)
(23)
2D (t) − ˆ A,k−1 (t)M ˆ A,k
2D (t), ˆ A,k−1 (t) are respectively the first na components of where ˆ A,k
ˆ k2D (t), ˆ k−1 (t) according to Eq. (11). It should be mentioned that Eq. (21) actually represents a first-order filter to actively make a tradeoff between staying close to the previous batch’s identification to avoid violating the constraints (conservatism) and tending to the true value driven by the 2DRLS solution (aggressiveness). To this end, how to get a set of identified parameters under the norm bounded constraints has been solved. The only problem left is how to get a bound ensuring the fulfilment of the stability constraint. The basic idea is borrowed from the robust control [31]. ˆ k (t) is the state matrix associated with the idenSuppose that A tified parameters ˆ k (t) and compliant with the stability constraint so that ˆ k (t)Mk (t)A ˆ T (t) − Mk (t) < 0 A k
(24)
ˆ k (t) = A ˆ k−1 (t) + ıA ˆ k (t), the above equation can be According to A rewritten as T
ˆ k−1 (t) + ıA ˆ k (t)]Mk (t)[A ˆ k−1 (t) + ıA ˆ k (t)] − Mk (t) < 0 [A
(25)
where Mk (t) is the positive definite matrix, which is the solution of the following Lyapunov equation ˆ T (t) − Mk (t) = −Q ˆ k−1 (t)Mk (t)A A k−1
(26)
Z. Cao et al. / Journal of Process Control 39 (2016) 100–110
provided that the positive definite matrix Q is given. In our case, Q is fixed as the identity matrix. Thereafter, Mk (t) can be explicˆ k−1 (t) is known and is a stable itly and uniquely determined since A state matrix according to the recursive argument. It is vec[Mk (t)] = −1 ˆ k−1 (t) ⊗ A ˆ k−1 (t) − I] vec(I) and vec, ⊗ are vectorization opera−[A tor and Kronecker product respectively. On the other hand, we have
−1 Aˆ k−1 (t)
×
Mk (t) −
−1 Aˆ k−1 (t)
√
ıAˆ k (t)
Mk (t) −
√
Mk (t)
ıAˆ k (t)
T Mk (t)
≥0
k-th batch unconstrained k-th batch constrained Identified parameter
104
k-1-th batch
(27)
Here is a positive real number. It is equivalent to
t-1
t
Time
Fig. 4. Variation comparison between constrained and unconstrained identification.
−1 Aˆ k−1 (t)Mk (t)Aˆ Tk−1 + ıAˆ k (t)Mk (t)ıAˆ Tk (t) ˆ k (t)Mk (t)A ˆ T (t) + A ˆ k (t)Mk (t)ıA ˆ T (t) ≥ ıA k k
(28)
From the equation above, we obtain the sufficient condition for Eq. (25) T
ˆ k−1 (t) + ıA ˆ k (t)]Mk (t)[A ˆ k−1 (t) + ıA ˆ k (t)] − Mk (t) [A ˆ k−1 (t)Mk (t)A ˆ T (t) ≤ (1 + −1 )A k−1 ˆ k (t)Mk (t)ıA ˆ T (t) − Mk (t) + (1 + )ıA k ˆ k (t)Mk (t)ıA ˆ T (t) < 0 = −1 Mk (t) − (1 + −1 )I + (1 + )ıA k
(29)
last batch is d1 . In the constrained case, the distance will never be larger than d = ˆ k (t) − ˆ k (t − 1) = ˆ k−1 (t) − ˆ k−1 (t − 1) + ıˆ k (t) − ıˆ k (t − 1) ≤ d1 + k (t) + k (t − 1). This means that the variation is bounded in the constrained identification, while there is no such a bound that can be found in the unconstrained case. In order to further suppress the variation, an adaptive filter is introduced as follows.
f ˆ k (t) =
Furthermore, note that
ˆ k (t) = ıA
T (t) ıˆ A,k
means T (t)2 {ıˆ A,k M
k
that
(30)
the
term
ˆ k (t)Mk (t)ıA ˆ T (t) ıA k
is
diag
, 0(na−1)×(na−1) }. Thus, Eq. (29) holds if (t)
−1
max [Mk (t)] − (1 + −1 ) + (1 + )ıˆ A,k (t)2M
k (t)
<0
(31)
There is a solution to Eq. (31) if and only if
> max [Mk (t)] − 1
(32)
From Eq. (26), it can be deduced that Mk (t) > I, or max [Mk (t)] > 1. Note that ıˆ A,k (t)Mk (t) is actually k (t) by definition. Thus, to this end, the bound has been solved. But it is reasonable to get the maximum bound max{k (t)} since a larger feasible domain yields a smaller optimal index. Denote that = −1 + max [Mk (t)] + ˇ, ˇ > 0. So we should maximize max{2k (t)} = max{f (ˇ)}
= max
ˇ (max [Mk (t)] + ˇ)(−1 + max [Mk (t)] + ˇ)
(33)
Thus, max 2k (t) =
⎪ ⎪ ⎩
> 1 + 2 3k
f 2 ˆ k (t) + (1 − 2 )ˆ k (t
− 1)
(35)
Otherwise
f
0(na−1)×na
It
⎧ f f ⎪ ˆ (t) + (1 − 1 )ˆ k (t − 1) ˆ k (t) − ˆ k (t − 1) ⎪ ⎨ 1 k
2max [Mk (t)] − 1 + 2
1 (max [Mk (t)] − 1)max [Mk (t)]
(34)
To this end, the fulfillment of stability constraints can be guaranteed. The problem left is how to suppress the variations on the identified parameters. However, the aforementioned method for the stability constraint can somehow relieve this “symptom” as explained in Fig. 4. Either constrained or unconstrained identification refines the identification result of the previous batch. Suppose that the distance of the identified parameters of two consecutive time instants in the
Here ˆ k (t) stands for the filtered identification. 1 in the first filter should be small so that the excessively large variations can be smoothed. In this sense, 1 ≥ 0 should be the maximum variation of the true system dynamics between two consecutive time instants, which is included in the priori system knowledge. 2 > 0, 3 > 0 are two tuning knobs for improving the identification performance in the transient period. 3 < 1 is the decaying factor based on the reason that the variations in the first several batches are larger than the following. The large variation should be allowed in the first several batches, since the ratio tending to the true parameter of the first several batches are much larger than the following batches. The second filter’s main function is to suppress the variations with small magnitudes while capturing the system dynamics; thus, a large 2 is preferred. There are two general rules for selecting 1 , 2 . First, as explained above, 1 should be smaller than 2 to yield a strong smoothing effect. Second, 2 should not be too small; otherwise, it may lead to a very slow convergence rate. According to the authors’ experiences, 1 , 2 are suggested selected within the ranges 0.2–0.4 and 0.6–0.8 respectively. 3.1. Summary The whole algorithm has been presented to this end. The proposed 2DRLS incorporating the priori poles and zeros knowledge is basically a 2DRLS driven identification algorithm with two adaptive sequential first-order filters (the first is from the batch direction and the second is from the time direction). Fig. 5 demonstrates the workflow of the proposed algorithm. Needless to say, the computation effort of our method is greater than the traditional 2DRLS. However, Fig. 5 shows that the parameter identification part and the bound calculation can run in parallel. In other words, the computation load does not increase significantly in our approach. It demonstrates the main contribution of the paper – turning a non-convex problem into two parallel convex problems that can be solved recursively with relatively small computational loads.
Z. Cao et al. / Journal of Process Control 39 (2016) 100–110
105
Fig. 5. Workflow of the proposed algorithm.
0.5
In this section, some properties of the proposed algorithm are to be examined, such as the recursive feasibility and consistency. The recursive feasibility is important because the identification performance would deteriorate rapidly if it cannot be guaranteed. So is the consistency because control engineers always require the knowledge about the ultimate identification performance when the data is sufficient.
0.4
Theorem 1 (Recursive feasibility). If the first batch’s identification satisfies the stability constraint, then there always exist a pair f (ˆ k (t), Mk (t)) satisfying the constraints in Eq. (13). Proof. We are going to show it by inductive arguments. Assume that at the tth time instant and the k − 1th batch and t − 1th
Prediction error
4. Property analysis
Proposed method RLS 2DRLS
Missing data 4.42
0.3 0.2 0.1 0.0 0
10
20
30
40
50
Batch Fig. 6. The prediction error comparison among RLS, 2DRLS and the proposed method. (The data missing is due to the small range of y-axis.)
106
Z. Cao et al. / Journal of Process Control 39 (2016) 100–110
Fig. 7. The 50th batch identification (RLS). f f time instant and kth batch, the pairs (ˆ k−1 (t), Mk−1 (t)) and (ˆ k (t − 1), Mk (t − 1)) satisfy the constraints. Subsequently, a new pair f (ˆ k−1 (t), Mk (t)) can be established by solving the corresponding Lyapunov equation. With the bound k (t) imposed, the pair (ˆ k (t), Mk (t)) complies with the stability constraints. In sequel, f there always exists a pair (ˆ k (t), Mk+1 (t)) satisfying the constraint, f because ˆ (t) is just a convex combination of two sets of stable idenk
tified parameters and all the roots of the characteristic polynomials after the convex combination lie within the unit disc according to Kharitonov theorem. Then, by invoking Lyapunov equation theorem, it can conclude on the pair’s existence. Therefore, if the first batch’s identification fulfils the constraint, the following identifications can absolutely fulfil the constraint. This completes the proof. 䊐 The following theorem gives the consistency result for the deterministic scenario. Theorem 2 (Consistency). The identification ˆ k (t) goes to 0 (t) as k goes to infinity, provided that lim min [Pk−1 (t)] = ∞. k→∞
Fig. 8. Parameter a1,0 (t) tracking of the proposed method.
Proof. First, we need to prove that ˆ k2D (t) will converge to 0 (t). This kind of proof can be found in many system identification textbooks like [10]. Denote ˜ k2D (t) = ˆ k2D (t) − 0 (t) and k (t) = yk (t) − T (t)ˆ 2D (t) = −T (t)˜ 2D (t). Then, it is easy to get that k
k
k
k
−1 ˜ 2D ˜ k2D (t) = Pk (t)Pk−1 k−1 (t)
(36)
T
Select Vk (t) = [˜ k2D (t)] Pk−1 (t)˜ k2D (t) as a Lyapunov function. Consequently, we have Vk (t) − Vk−1 (t)
T
2D = ˜ k2D (t) − ˜ k−1 (t)
−1 2D Pk−1 (t)˜ k−1 (t)
T
=−
[˜ k−1 (t)] k (t)kT (t)˜ k−1 (t) 1 + kT (t)Pk−1 (t)k (t)
=−
2k (t) 1 + kT (t)Pk−1 (t)k (t)
(37)
Thus, Vk (t) is non-increasing along the batch direction and will converge to a bounded limit V∞ (t) as k→ ∞. Since lim min [Pk−1 (t)] = ∞, ˜ k2D (t) has to go to 0.
k→∞
Z. Cao et al. / Journal of Process Control 39 (2016) 100–110
107
Fig. 9. Parameter b1,0 (t) tracking of the proposed method.
From the aforementioned algorithm derivation, it can be known that ˜ k (t) = [1 − ˛k (t)]˜ k−1 (t) + ˛k (t)˜ k2D (t)
(38)
Since ˛k (t) is always greater than 0, assume that ˛ = inf ˛k (t). Given k
a positive real number . Also, due to the convergence of ˜ k2D (t), there always exists a N1 such that when k > N1 , ˜ 2D (t) ≤ ≤ (1 − k
˛) and N
˜ k2D (t) = F(N1 )
(39)
k=1
Therefore, by triangular inequality, we have for k > N1 ˜ k (t) ≤ (1 − ˛)˜ k−1 (t) + ˜ k2D (t) ≤ (1 − ˛)k−N1 [F(N1 ) + ˜ 1 (t)] +
1−˛
(40)
Hence, it is easy to conclude that there always exists a N2 such that ˜ k (t) < for any k > N2 . This completes the proof. 䊐
Fig. 10. The spectral radius evolution of the proposed method.
5. Simulation In this section, we are going to demonstrate the effectiveness of the proposed algorithm by numerically checking the example given in Section 2. The details of the simulation are as follows. The control input is chosen as pseudo-random binary sequence (PRBS). The process noise is selected as white noise with the variance = 0.1. The first batch is identified by RLS with a forgetting factor 0.99. There is a mechanism to ensure the stable identification. After the whole batch’s identification finished, all the unstable identifications are discarded and used ˆ AT = [−0.2, 0.1]T instead. The initial value of the RLS identification is ˆ A , with P(0) = 106 . As for the 2DRLS, the initial value of P1 (t) is 0.6I. Additionally, the parameters of the second adaptive filter are 1 = 0.7, 2 = 0.3, 1 = 2 = 0.05, 3 = 0.8. Fig. 6 shows the prediction error comparison among RLS, the conventional 2DRLS and the proposed method. It shows that the prediction errors associated with 2DRLS and the approach proposed decay from batch to batch starting from the second batch, and converges almost exponentially to the process noise variance. It implies that 2DRLS performs better the approach proposed with
108
Z. Cao et al. / Journal of Process Control 39 (2016) 100–110
Fig. 11. Parameter a1,0 (t) tracking of the conventional 2DRLS.
Fig. 12. Parameter b1,0 (t) tracking of the conventional 2DRLS.
respect to prediction error. Because the conventional 2DRLS only pursues to minimize the prediction error, while the approach proposed intends to track the real dynamics. The two objectives are totally different; the seeming abnormality of our method stems from two aspects – insufficient I/O data and the imposed stability constraint, which restrict the deviation of the parameter estimate from the previous one and make it generate a bad output prediction. Figs. 7–13 illustrate the parameter tracking and spectral radius evolution of RLS, the conventional 2DRLS, and the proposed approach. They indicate that the approach developed in this paper possesses the ability to yield stable identification while the other two cannot. Both the 2DRLS and our methods are able to capture the true dynamics as suggested in the previous proofs. It also implies that our method is better at suppressing the identification variations than the conventional 2DRLS. Figs. 14 and 15 show the PTE comparison among three algorithms – the conventional 2DRLS, 2DRLS with the stability constraint and the 2DRLS with both the stability constraint and 1 , 2 adaptive filter under different process noise variance. Fig. 14 shows that in the first few batches, the conventional 2DRLS slightly outperforms the other two. The performance
index PTE basically measures two effects – the tracking performance and the variation neutralization. In the first few batches, fast dynamics tracking matters a lot; 2DRLS is a little better at that, because it totally neglects other identification requirements, i.e., smoothness. There is a PTE peak of 2DRLS in Fig. 15. It may be caused by overfitting as mentioned in the paper [15]. Therefore, it can be concluded that our method is much more robust against process noise than the other two methods. Additionally, both results of Figs. 14 and 15 suggest that the PTE almost reaches steady states after 20 batches, which gives us a rough estimate for algorithm “turn-off” time. Finally, we would like to put some remarks on the implementations of our algorithm in practice. First, in reality, lots of batch processes involve dynamics drift across batches. For the rapid drift or the drift that changes the basic properties of processes, i.e., stability or minimum phase, our algorithm is incapable of handling that. However, it is easy to extend our method to deal with slow dynamics drift by introducing the forgetting factor like (6b) and (6c). Physically, it can be interpreted as imposing a smaller weight on the past data and a larger weight on the current data.
Z. Cao et al. / Journal of Process Control 39 (2016) 100–110
109
Parameter tracking error
25 Conventional 2DRLS Constrained Filtered + constrained
20 15 10 5 0 0
10
20
30
40
50
Batch Fig. 15. PTE comparison with = 0.5.
compatible with unstable zeros, or shifted poles and zeros (assumed to be known). For the first case, by replacing −Q, with Q, − , and reversing the inequalities in Eqs. (26) to (31), all the rest derivations immediately follow. For unstable poles, the outputs of the plant are possibly unbounded, which disables the identification process. For the second case, the stability constraint Eq. (24) becomes
ˆ k (t) ˆ k (t) A A Mk (t) s s
− Mk (t) < 0
ˆ k (t)] < s. It suggests Here s stands for the known shift, i.e., [A ˆ k (t)/s as a new A ˆ k (t). Then, all the derivathat we can group A tions followed are still valid, because the derivations focus on ˆ k (t). Fourth, within this paper, we only study the incremental ıA discrete-time models. As for continuous-time model and proper selection of sampling time, they are the topics out of the scope of this paper. For interests, we refer readers to the book [32]. 6. Conclusions
Fig. 13. The spectral radius evolution of the conventional 2DRLS.
Parameter tracking error
25 Conventional 2DRLS Constrained Filtered + constrained
20 15
This paper studies the two-time dimensional identification problem with the consideration of priori poles and zeros knowledge. The main contribution of the paper is to turn a non-convex optimization with the stability constraint into two parallel convex optimization problems that can be implemented recursively. One is to search an optimal solution within a bound; the other is to generate a bound adaptively based on the identification of the previous batch. The other contribution is that an adaptively first-order filter is introduced to suppress the severe variations on the identification of the conventional 2DRLS. Both the recursive feasibility and consistency have been proved. The numerical simulation shows the superior performance of the proposed approach. Acknowledgements
10 5 0 0
10
20
30
40
50
Batch Fig. 14. PTE comparison with = 0.1.
Second, the uneven batch duration is another important issue for batch processes. It is suggested to use the same idea in our previous paper [15] to solve it. Third, it should be noted that there is no assumption on the order of the system. Thus, our method is not restricted on that. Moreover, the identification approach is
This project is supported in part by Hong Kong Research Grant Council Project No. 612512, Guangzhou Scientific and Technology Project No. 1219007, the National Natural Science Foundation of China Project No. 61227005, and Guangdong Academician Workstation Project No. 2012B090500010. Appendix A. Derivation of Eqs. (9)–(12) From Eq. (1), ignoring the input part and replacing ai,0 (t) with the estimate aˆ i,k (t), we have yk (t) = −ˆa1,k (t)yk (t − 1) − aˆ 2,k (t)yk (t − 2) − · · · − aˆ na,k (t)yk (t − na)
(41)
110
Z. Cao et al. / Journal of Process Control 39 (2016) 100–110
It is equivalent to ˆ k (t)Yk (t − 1) Yk (t) = A where
Yk (t) = yk (t)
⎡ ⎢ ⎢ ⎢ ˆ k (t) = ⎢ A ⎢ ⎢ ⎣
yk (t − 1)
···
y(t − na + 1)
−ˆa1,k (t)
···
−ˆana−2,k (t)
−ˆana−1,k (t)
−ˆana,k (t)
1
0
···
0
0
0
1
···
0
0
..
.
.. .
.. .
···
1
0
.. .
0
0
0
⎤ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦
Eqs. (10)–(12) can be obtained by dividing the matrix above into three parts. Moreover, it is obvious that studying the stability of Eq. (41) is equivalent to Eq. (9). References [1] D.Q. Mayne, J.B. Rawlings, C.V. Rao, P.O. Scokaert, Constrained model predictive control: stability and optimality, Automatica 36 (2000) 789–814. [2] S.J. Qin, T.A. Badgwell, A survey of industrial model predictive control technology, Control Eng. Pract. 11 (2003) 733–764. [3] E.F. Camacho, C.B. Alba, Model Predictive Control, Springer Science & Business Media, 2013. [4] M. Morari, J.H. Lee, Model predictive control: past, present and future, Comp. Chem. Eng. 23 (1999) 667–682. [5] H. Chen, F. Allgöwer, A quasi-infinite horizon nonlinear model predictive control scheme with guaranteed stability, Automatica 34 (1998) 1205–1217. ˚ [6] K.J. Aström, B. Wittenmark, Adaptive Control, Courier Corporation, 2013. [7] P.A. Ioannou, J. Sun, Robust Adaptive Control, Courier Corporation, 2012. [8] J.-J.E. Slotine, W. Li, On the adaptive control of robot manipulators, Int. J. Robot. Res. 6 (1987) 49–59. [9] J.J. Craig, P. Hsu, S.S. Sastry, Adaptive control of mechanical manipulators, Int. J. Robot. Res. 6 (1987) 16–28. [10] E. Mosca, Optimal, Predictive, and Adaptive Control, 1995. [11] L. Ljung, System Identification, Springer, 1998.
[12] L. Ljung, T. Söderström, Theory and Practice of Recursive Identification, 1985. [13] O. Nelles, Nonlinear System Identification: From Classical Approaches to Neural Networks and Fuzzy Models, Springer Science & Business Media, 2001. [14] Y. Zhu, Multivariable System Identification for Process Control, Elsevier, 2001. [15] Z. Cao, Y. Yang, J. Lu, F. Gao, Constrained two dimensional recursive least squares model identification for batch processes, Journal of Process Control 24 (2014) 871–879. [16] D.L. Ma, R.D. Braatz, Robust identification and control of batch processes, Comp. Chem. Eng. 27 (2003) 1175–1184. [17] A. Tayebi, Adaptive iterative learning control for robot manipulators, Automatica 40 (2004) 1195–1203. [18] R. Chi, Z. Hou, J. Xu, Adaptive ILC for a class of discrete-time systems with iteration-varying trajectory and random initial condition, Automatica 44 (2008) 2207–2213. [19] M. Golshan, J.F. MacGregor, Identification for the control of variable trajectories in batch processes, Ind. Eng. Chem. Res. 52 (2013) 2352–2367. [20] D. Simon, Optimal State Estimation: Kalman, H Infinity, and Nonlinear Approaches, John Wiley & Sons, 2006. [21] E.L. Haseltine, J.B. Rawlings, Critical evaluation of extended Kalman filtering and moving-horizon estimation, Ind. Eng. Chem. Res. 44 (2005) 2451–2460. [22] X. Shao, B. Huang, J.M. Lee, Constrained Bayesian state estimation – a comparative study and a new particle filter based approach, J. Process Control 20 (2010) 143–157. [23] F. Allgöwer, T.A. Badgwell, J.S. Qin, J.B. Rawlings, S.J. Wright, Advances in Control, Springer, 1999, pp. 391–449. [24] C.V. Rao, J.B. Rawlings, J.H. Lee, Constrained linear state estimation – a moving horizon approach, Automatica 37 (2001) 1619–1628. [25] E. Ikonen, K. Najim, Advanced Process Identification and Control;, Vol. 9, CRC Press, 2001. [26] M.-J. Bruwer, J.F. MacGregor, Robust multi-variable identification: optimal experimental design with constraints, J. Process Control 16 (2006) 581–600. [27] F. Gao, Y. Yang, C. Shao, Robust iterative learning control with applications to injection molding process, Chem. Eng. Sci. 56 (2001) 7025–7034. [28] Y. Yang, F. Gao, Adaptive control of the filling velocity of thermoplastics injection molding, Control Eng. Pract. 8 (2000) 1285–1296. [29] Y. Yang, F. Gao, Cycle-to-cycle and within-cycle adaptive control of nozzle pressure during packing-holding for thermoplastic injection molding, Polym. Eng. Sci. 39 (1999) 2042–2063. [30] S. Boyd, L. Vandenberghe, Convex Optimization, Cambridge University Press, 2004. [31] F.L. Lewis, L. Xie, D. Popa, Optimal and Robust Estimation: with an Introduction to Stochastic Control Theory, Vol. 29, CRC Press, 2007. [32] H. Garnier, L. Wang, P.C. Young, Direct Identification of Continuous-time Models from Sampled Data: Issues, Basic Solutions and Relevance, Springer, 2008.