Optimal errors-in-variables "ltering Roberto Guidorzi∗ , Roberto Diversi, Umberto Soverini Dipartimento di Elettronica, Informatica e Sistemistica, Universita di Bologna, Viale del Risorgimento 2, 40136 Bologna, Italy Received 9 February 2001; received in revised form 1 July 2002; accepted 10 September 2002

Abstract

This paper deals with optimal (minimal variance) filtering in an errors-in-variables framework. Differently from many other contexts, errors-in-variables models treat all variables in a symmetric way (no partition of the variables into inputs and outputs is required) and assume additive noise on all the variables. The filtering technique described in this paper can be easily implemented in a recursive way and does not require the use of a Riccati equation at every update. The results of Monte Carlo simulations have shown the effectiveness and consistency of the approach. © 2002 Elsevier Science Ltd. All rights reserved.

Keywords: Optimal filtering; Errors-in-variables models; Optimal interpolation; Recursive filtering; Kalman filtering

1. Introduction

A problem of paramount importance in many engineering fields consists in designing filters for reducing the effect of additive noise on measured signals. Typical applications include, for example, reception and discrimination of radio signals, digital data transmission, beam-forming arrays, detection of radar signals, analysis of seismic data, image processing, analysis of electrocardiographic signals, etc. In any case, the filter acts as a preprocessor whose output allows a more effective detection of the useful signal components or of their parameters. If the signal and noise spectra do not overlap, it is possible to design a filter that does not affect the useful signal but reduces the effects of the noise component. Hence, the resulting system would be essentially a low-pass, high-pass or band-pass filter, depending on the relative frequencies of the signal and of the noise.

When the signal and noise spectra overlap, a more complex problem of signal and noise separation arises and the optimal solution can be defined only in a statistical framework. As is well known, this problem was first studied by Wiener and Kolmogorov. In 1942 Wiener considered the problem of estimating a continuous-time signal, described as a stochastic process corrupted by additive noise. The explicit expression of the optimal estimate is given by the solution of an integral equation, known as the Wiener–Hopf equation, that requires, with the exception of a few cases, complex computations. In 1941 Kolmogorov had also considered the optimisation of a linear filter for discrete-time stochastic processes. Originally, these works were developed only for stationary stochastic processes in the frequency domain. It was thus possible to relate the statistical properties of the useful signal to its frequency-domain properties. The modern approach to the design of filters encompasses more advanced theories that exploit models describing the laws that have generated the data. Nowadays, much is known about filter design procedures in simple cases, for example when the signal model is linear. One of these theories is Kalman filtering, named after the pioneering work done by Kalman (1960) to overcome the basic limitations of the Wiener filter. In fact, the Kalman filter is more general since it can handle intrinsically nonstationary problems. The wide generality of the Kalman filter derives from the adoption of a time-varying structure that has allowed its successful application to the solution of a wide variety of engineering problems.

In Kalman filtering the classical stochastic discrete-time state-space model for stationary signals is of the type

x(t+1) = A\,x(t) + B\,u(t) + w(t), \qquad x(0) = x_0,    (1)

y(t) = C\,x(t) + v(t), \qquad t \ge 0,    (2)

where w(·) and v(·) denote white noise sequences with known statistical characteristics. Usually, w(t) is called process noise and v(t) observation noise. Model (1)–(2) allows a very general description of the stochastic and deterministic components of noisy processes. The main limit affecting the design of Kalman filters concerns the assumption of an exact knowledge of the system model and of the noise covariances. In some applications of practical relevance the process model is, in fact, exactly known while the covariance matrices of the process and observation noise are completely unknown. The use of incorrect noise statistics can, however, lead to poor state/output estimates and even to divergence problems (Anderson & Moore, 1979). The relevance of this problem has stimulated the search for procedures that allow the estimation of the covariance matrices of the noise or, in a more general context, of the optimal gain of the Kalman filter. The most widely applied methods are based on the correlation of the innovations of nonoptimal filters and derive from the pioneering work of Mehra (1970). The stochastic context of Kalman filtering is intrinsically asymmetric: while the output, y(t), is considered as affected by additive noise, the input, u(t), is assumed to be exactly known. This assumption can be true in many control applications but is, in general, unrealistic in other fields where errors are present on all signals and even a partition between inputs and outputs could be arbitrary. This limit is not present in errors-in-variables (EIV) models, which treat all variables in a symmetric way, so that no a priori classification of the observations is performed. State-space EIV models are given by relations of the following type:

\hat{x}(t+1) = A\,\hat{x}(t) + B\,\hat{u}(t),    (3)

\hat{y}(t) = C\,\hat{x}(t) + D\,\hat{u}(t),    (4)

y(t) = \hat{y}(t) + \tilde{y}(t),    (5)

u(t) = \hat{u}(t) + \tilde{u}(t),    (6)

where x̂(t) ∈ R^n is the state of the process and ŷ(t) ∈ R^m, û(t) ∈ R^r denote the unknown noiseless components of the observations; ỹ(t) and ũ(t) are the additive noise on ŷ(t) and û(t), constituted by mutually correlated white processes, uncorrelated with û(t).

While the statistical properties of linear processes of type (1)–(2) have been deeply investigated in the literature, many problems are still open in the EIV framework. Some results have, in fact, been obtained in the area of their identification (Söderström, 1981; Anderson & Deistler, 1984; Deistler, 1986; Beghelli, Guidorzi, & Soverini, 1990), while other topics, like filtering, have received very limited attention (Guidorzi & Guidorzi, 1998). This work investigates, in EIV contexts, the design of filters that assure the optimal estimate of the noiseless inputs and outputs. It is important to note that Kalman filtering cannot be directly applied, in the general case, to EIV processes, as discussed in the following. The content is organised as follows. Section 2 states the interpolation and filtering problems considered in the paper. Section 3 concerns least squares, maximum likelihood and minimal variance estimates for the inputs and outputs of EIV processes, while Section 4 treats EIV filtering in a behavioural context. A state-space approach, derived from Kalman filtering, is then developed in Section 5. The results of a Monte Carlo simulation are reported in Section 6 and some concluding remarks are finally proposed in Section 7.

2. Context and statement of the problem

The EIV models considered in this paper, given by the state-space relations (3)–(6), can also be described, under the assumption of complete observability (Guidorzi, 1981), by the canonical polynomial model

Q(z)\,\hat{y}(t) = P(z)\,\hat{u}(t),    (7)

where z denotes the unitary advance operator. For the sake of simplicity, in the following we will limit our considerations to SISO systems, so that the polynomial matrices Q(z) and P(z) reduce to the polynomials q(z) and p(z) given by

q(z) = z^n - \alpha_n z^{n-1} - \cdots - \alpha_2 z - \alpha_1,    (8)

p(z) = \beta_{n+1} z^n + \beta_n z^{n-1} + \cdots + \beta_2 z + \beta_1,    (9)

and model (7) can be written as

\hat{y}(t+n) = \sum_{k=1}^{n} \alpha_k\,\hat{y}(t+k-1) + \sum_{k=1}^{n+1} \beta_k\,\hat{u}(t+k-1).    (10)

In the following it will also be assumed, without loss of generality, that E[ũ(t)] = 0 and E[ỹ(t)] = 0. When a sequence of input–output data of length L is available, Eq. (10) can be evaluated at times t = n+1, ..., L. It is thus possible to write the set of these N = L − n relations in the compact form

G\,\hat{v} = 0,    (11)

where G is the N × 2(N + n) matrix

G = \begin{bmatrix}
\alpha_1 & \beta_1 & \alpha_2 & \beta_2 & \cdots & \alpha_n & \beta_n & -1 & \beta_{n+1} & 0 & \cdots & 0 \\
0 & 0 & \alpha_1 & \beta_1 & \alpha_2 & \beta_2 & \cdots & \alpha_n & \beta_n & -1 & \beta_{n+1} & \cdots \\
\vdots & & & & \ddots & & & & & & \ddots & \vdots \\
0 & \cdots & 0 & 0 & \alpha_1 & \beta_1 & \alpha_2 & \beta_2 & \cdots & \alpha_n & \beta_n & \begin{matrix} -1 & \beta_{n+1} \end{matrix}
\end{bmatrix}    (12)

(each row is shifted two columns to the right with respect to the previous one) and v̂ is the 2(N + n)-dimensional vector

\hat{v} = [\,\hat{y}(1)\ \hat{u}(1) \mid \hat{y}(2)\ \hat{u}(2) \mid \cdots \mid \hat{y}(N+n)\ \hat{u}(N+n)\,]^T.    (13)

Relations (5) and (6) lead immediately to the condition

G\,v = G(\hat{v} + \tilde{v}) = G\,\tilde{v} = \eta,    (14)

where v and ṽ are vectors containing the observations and the noise samples, with the same structure as v̂.

Remark 1. The system considered in (10) is not purely dynamic (D ≠ 0 in (4)). Purely dynamic systems can rely on the same structure of model (11) by assuming β_{n+1} = 0 in matrix G.

Remark 2. The extension of model (11) to MISO systems can easily be obtained by substituting, in matrix G, the entries α_i β_i with α_i β_{i1} ⋯ β_{ir}, where r denotes the number of inputs (Guidorzi & Guidorzi, 1998).

The model of the considered EIV system is constituted by the description of its deterministic part, i.e. by the quadruple (A, B, C, D), or equivalently by the pair Q(z), P(z) or by matrix G, by the variances σ̃²_u and σ̃²_y of the additive noises ũ(t) and ỹ(t), and by their cross-covariance σ̃_yu. In this context, according to the definition of Willems (1986a, b, 1987), the vector of input and output observations v defines a noisy behaviour, while v̂ is the regular behaviour of the process. The problems treated in this paper can be defined as follows.

Problem 1 (Interpolation). Given the model of the process and the noisy behaviour v, determine an unbiased estimate of its regular part v̂.

Problem 2 (Filtering). Given the model of the process and an increasing sequence of observations, determine, as soon as a new observation u(t), y(t) becomes available, an unbiased estimate of its regular part û(t), ŷ(t).

Remark 3. It is important to note that in this work the model of the process is considered as exactly known, as happens in Kalman filtering, and the regular behaviour v̂ will be extracted from the data on the basis of this information.

EIV filtering could also be connected with identification, i.e. with the problem of extracting the regular part (and, consequently, a process model) from the noisy behaviour on the basis of an assumed cost function. This problem has been studied, in a deterministic context, by Roorda and Heij (1995) and Roorda (1994), who use as cost function the norm of v − v̂. The problem of estimating the model of the process from the noisy observations v is an errors-in-variables identification problem that has been treated in several papers (e.g. Söderström, 1981; Anderson & Deistler, 1984; Beghelli et al., 1990; Zheng & Feng, 1989; Soverini & Söderström, 2000).
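To make the structure of (11)–(13) concrete, the following sketch (our illustration in Python/NumPy, not code from the paper) assembles G for a second-order SISO system, generates a noiseless trajectory through the difference equation (10), and verifies that the stacked vector v̂ satisfies G v̂ = 0. The coefficients are those of the simulation example used later in Section 6.

```python
import numpy as np

def build_G(alpha, beta, N):
    """Assemble the N x 2(N+n) matrix (12): each row holds
    [a1 b1 ... an bn -1 b_{n+1}], shifted two columns per row."""
    n = len(alpha)
    row = np.empty(2 * (n + 1))
    row[0:2 * n:2] = alpha                 # alpha_1, ..., alpha_n
    row[1:2 * n:2] = beta[:n]              # beta_1, ..., beta_n
    row[2 * n], row[2 * n + 1] = -1.0, beta[n]
    G = np.zeros((N, 2 * (N + n)))
    for j in range(N):
        G[j, 2 * j:2 * j + 2 * (n + 1)] = row
    return G

alpha = np.array([-0.45, 0.6])              # q(z) = z^2 - 0.6 z + 0.45
beta = 0.5381 * np.array([-0.4, 0.3, 1.0])  # p(z) = 0.5381 (z^2 + 0.3 z - 0.4)
n, L = 2, 50
N = L - n

rng = np.random.default_rng(0)
u_hat = rng.standard_normal(L)
y_hat = np.zeros(L)
for t in range(N):                          # difference equation (10)
    y_hat[t + n] = alpha @ y_hat[t:t + n] + beta @ u_hat[t:t + n + 1]

# v_hat = [y(1) u(1) | y(2) u(2) | ... ]^T as in (13)
v_hat = np.column_stack([y_hat, u_hat]).ravel()
G = build_G(alpha, beta, N)
print(np.max(np.abs(G @ v_hat)))            # ~1e-15, i.e. G v_hat = 0 as in (11)
```

Each row of G simply restates (10) for one value of t, which is why a trajectory generated by the difference equation annihilates it exactly.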

3. EIV interpolation

In this section we focus our attention on the problem of obtaining a decomposition of the data v = v̂* + ṽ* such that v̂* satisfies the set of linear equations (11) and ṽ* is minimised in a least-squares sense. Observing that rank G = N, the family of all solutions of (14) is described by the linear variety

\tilde{v}^* = G^T (G G^T)^{-1} G v + \hat{v}^*,    (15)

where v̂* ∈ ker G and G^T(GG^T)^{-1} is the pseudoinverse of G. Since G^T(GG^T)^{-1}Gv is the orthogonal projection of v on im G^T, the LS solution, i.e. the solution with minimal Euclidean norm, is

\tilde{v}^*_{LS} = G^T (G G^T)^{-1} G v.    (16)

In this case the interpolation vector of the noisy observations,

\hat{v}^*_{LS} = [I - G^T (G G^T)^{-1} G]\,v,    (17)

is associated with the maximisation of the Euclidean norm of v̂*. This solution has been defined as Frisch interpolation in Guidorzi and Guidorzi (1998). The LS solution (17) does not require any assumption on the noise. If ũ(t) and ỹ(t) are gaussian processes, the problem can, however, be solved by considering the maximum likelihood estimation of v̂ under constraint (11):

\max_{\{\hat{v}^* \mid G\hat{v}^* = 0\}} p(v \mid \hat{v}^*) = \max_{\{\hat{v}^* \mid G\hat{v}^* = 0\}} \mathrm{const}\cdot\exp\{-\tfrac{1}{2}(v - \hat{v}^*)^T \tilde{\Sigma}^{-1} (v - \hat{v}^*)\},    (18)

where

\tilde{\Sigma} = E[\tilde{v}\tilde{v}^T] = \begin{bmatrix} \tilde{\Sigma}_{yu} & 0 & \cdots & 0 \\ 0 & \tilde{\Sigma}_{yu} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \tilde{\Sigma}_{yu} \end{bmatrix}, \qquad \tilde{\Sigma}_{yu} = \begin{bmatrix} \tilde{\sigma}^2_y & \tilde{\sigma}_{yu} \\ \tilde{\sigma}_{yu} & \tilde{\sigma}^2_u \end{bmatrix},    (19)

is the 2(N+n) × 2(N+n) covariance matrix of ṽ. By introducing the vector of Lagrange multipliers λ, condition (18) is equivalent to the minimisation of the function

J(\hat{v}^*, \lambda) = (v - \hat{v}^*)^T \tilde{\Sigma}^{-1} (v - \hat{v}^*) + \lambda^T G \hat{v}^*    (20)

and by equating to zero the gradient vectors of (20) with respect to v̂* and λ we obtain

\lambda = 2\,(G \tilde{\Sigma} G^T)^{-1} G v,    (21)

\hat{v}^* = v - \frac{\tilde{\Sigma} G^T \lambda}{2}.    (22)

The ML solution is thus given by

\hat{v}^*_{ML} = v - \tilde{\Sigma} G^T (G \tilde{\Sigma} G^T)^{-1} G v = [I - \tilde{\Sigma} G^T (G \tilde{\Sigma} G^T)^{-1} G]\,v.    (23)

The same result can also be obtained by considering the mean square error (minimal variance) estimate of ṽ conditioned by η (Anderson & Moore, 1979; Rhodes, 1971)

\tilde{v}^*_{MV} = E[\tilde{v} \mid \eta] = E[\tilde{v}\eta^T]\,E[\eta\eta^T]^{-1}\,\eta.    (24)

In fact, recalling that v = v̂ + ṽ and E[ṽ v̂^T] = 0, it is easy to obtain, from (14),

\tilde{v}^*_{MV} = \tilde{\Sigma} G^T (G \tilde{\Sigma} G^T)^{-1} \eta = \tilde{\Sigma} G^T (G \tilde{\Sigma} G^T)^{-1} G v.    (25)

The mean square error estimate of the regular behaviour v̂ is thus given by

\hat{v}^*_{MV} = v - \tilde{v}^*_{MV} = [I - \tilde{\Sigma} G^T (G \tilde{\Sigma} G^T)^{-1} G]\,v,    (26)

which coincides with (23). Note that when ũ(t) and ỹ(t) are not gaussian, relation (26) constitutes, in any case, the best linear estimate (in the MSE sense) that can be obtained from the knowledge of η, i.e. from the data v under condition (14). It is immediate to verify that (26) coincides with (17) when σ̃²_y = σ̃²_u and σ̃_yu = 0. The covariance matrix of the estimation error e_MV = v̂ − v̂*_MV is given by

E[e_{MV} e_{MV}^T] = E[(\tilde{\Sigma} G^T (G \tilde{\Sigma} G^T)^{-1} G v - \tilde{v})(\tilde{\Sigma} G^T (G \tilde{\Sigma} G^T)^{-1} G v - \tilde{v})^T] = \tilde{\Sigma} - \tilde{\Sigma} G^T (G \tilde{\Sigma} G^T)^{-1} G \tilde{\Sigma} = \tilde{\Sigma}\,(I - G^T (G \tilde{\Sigma} G^T)^{-1} G \tilde{\Sigma}).    (27)

Remark 4. It can be immediately verified that G ṽ*_MV = G v, i.e. that estimate (25) belongs to the family of admissible solutions (15).

Remark 5. It can be noted that estimates (16) and (25) of ṽ both have a structure of the type

\tilde{v}^* = H G v = R v,    (28)

so that

E[\tilde{v}^*] = E[H G \tilde{v} + H G \hat{v}] = H G\,E[\tilde{v}] = 0,    (29)

i.e. they are both unbiased. The same property holds also for v̂*_LS and v̂*_MV.
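The two interpolators can be compared numerically. The sketch below (ours, not the authors' code; build_G as in the earlier sketch, noise covariances taken from the Section 6 example) computes the LS interpolation (17) and the minimal variance interpolation (26) on one noisy realisation and prints the distance of each estimate from the noiseless behaviour; with unbalanced, mutually correlated noise the MV estimate is typically the more accurate, as (24)–(27) predict.

```python
import numpy as np

def build_G(alpha, beta, N):
    n = len(alpha)
    row = np.empty(2 * (n + 1))
    row[0:2 * n:2], row[1:2 * n:2] = alpha, beta[:n]
    row[2 * n], row[2 * n + 1] = -1.0, beta[n]
    G = np.zeros((N, 2 * (N + n)))
    for j in range(N):
        G[j, 2 * j:2 * j + 2 * (n + 1)] = row
    return G

alpha = np.array([-0.45, 0.6])
beta = 0.5381 * np.array([-0.4, 0.3, 1.0])
n, L = 2, 200
N = L - n

rng = np.random.default_rng(1)
u_hat = rng.standard_normal(L)
y_hat = np.zeros(L)
for t in range(N):
    y_hat[t + n] = alpha @ y_hat[t:t + n] + beta @ u_hat[t:t + n + 1]
v_hat = np.column_stack([y_hat, u_hat]).ravel()

Sig_yu = np.array([[0.64, 0.36], [0.36, 0.25]])   # 2x2 noise block in (19)
v = v_hat + rng.multivariate_normal(np.zeros(2), Sig_yu, size=L).ravel()
Sigma = np.kron(np.eye(L), Sig_yu)                # block-diagonal Sigma~ in (19)

G = build_G(alpha, beta, N)
v_ls = v - G.T @ np.linalg.solve(G @ G.T, G @ v)                  # (17)
v_mv = v - Sigma @ G.T @ np.linalg.solve(G @ Sigma @ G.T, G @ v)  # (26)
print("LS error:", np.linalg.norm(v_ls - v_hat))
print("MV error:", np.linalg.norm(v_mv - v_hat))  # typically the smaller one
```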

4. EIV filtering

In Section 3 time has not been introduced because interpolation is a batch procedure based on all available samples. Filtering, on the contrary, is an on-line procedure that requires the association of all quantities with increasing values of time. At time t we have thus L = t, N = t − n, G(t) = G, Σ̃(t) = Σ̃, v(t) = v and similar expressions for v̂(t), ṽ*(t) and v̂*(t). The samples û*(t), ŷ*(t) obtained from (17) and (26) will depend, in general, on both previous and subsequent noisy samples u(·) and y(·); the only exceptions concern the first sample, which depends only on subsequent samples, and the last one, which depends only on previous samples. Relations (17) and (26) could thus be used in filtering applications by performing at every step an interpolation and selecting in the whole sequences û*(·), ŷ*(·) only the last samples. A procedure of this kind would, however, exhibit poor efficiency because it does not rely on previous computations; moreover, the amount of computation increases at every step with the dimension of G(t). It is however possible to take advantage of the specific structure of G(t) to develop an on-line and finite-memory procedure. Passing from t to t + 1, the update of G(t) is given by

G(t+1) = \begin{bmatrix} G(t) & 0 \\ m_1(t) & m_2 \end{bmatrix},    (30)

where the new last row is [0 ⋯ 0 α₁ β₁ ⋯ α_n β_n −1 β_{n+1}], i.e. m_1(t) = [0 ⋯ 0 α₁ β₁ ⋯ α_n β_n] collects its first 2(N+n) entries and m_2 = [−1 β_{n+1}] its last two.

We will make reference, in the following, to the minimal variance estimate (26), and will omit the suffix MV since no ambiguity can arise. Denote now, as in (28), by R(t) the 2(N+n) × 2(N+n) matrix

R(t) = \tilde{\Sigma}(t)\,G^T(t)\,(G(t)\,\tilde{\Sigma}(t)\,G^T(t))^{-1}\,G(t).    (31)

Then

\hat{v}^*(t) = [I - R(t)]\,v(t)    (32)

and the use of a well-known matrix inversion lemma leads, after some editing (see the appendix), to the final expression

R(t+1) = \begin{bmatrix} R(t) & 0 \\ 0 & 0_{(2\times 2)} \end{bmatrix} + k(t) \begin{bmatrix} \tilde{\Sigma}(t)(R(t)-I)^T m_1^T(t)\,m_1(t)(R(t)-I) & -\tilde{\Sigma}(t)(R(t)-I)^T m_1^T(t)\,m_2 \\ -\tilde{\Sigma}_{yu}\,m_2^T\,m_1(t)(R(t)-I) & \tilde{\Sigma}_{yu}\,m_2^T\,m_2 \end{bmatrix},    (33)

where the scalar k(t) is given by

k(t) = [\,m_2\,\tilde{\Sigma}_{yu}\,m_2^T - m_1(t)(R(t)-I)\,\tilde{\Sigma}(t)\,m_1^T(t)\,]^{-1}.    (34)

Denoting by m(t+1) = [m_1(t)\ m_2] the last row of G(t+1), it is possible to factorise R(t+1) and k(t) as

R(t+1) = \begin{bmatrix} R(t) & 0 \\ 0 & 0_{(2\times 2)} \end{bmatrix} + k(t)\,\tilde{\Sigma}(t+1) \begin{bmatrix} I-R(t) & 0 \\ 0 & I_2 \end{bmatrix}^T m^T(t+1)\,m(t+1) \begin{bmatrix} I-R(t) & 0 \\ 0 & I_2 \end{bmatrix},    (35)

k(t) = \left( m(t+1) \begin{bmatrix} I-R(t) & 0 \\ 0 & I_2 \end{bmatrix} \tilde{\Sigma}(t+1)\,m^T(t+1) \right)^{-1}.    (36)

It is now possible to define ε(t+1) as

\varepsilon(t+1) = -\,m(t+1) \begin{bmatrix} I-R(t) & 0 \\ 0 & I_2 \end{bmatrix} v(t+1)    (37)

and to update v̂*(t) by means of the relation

\hat{v}^*(t+1) = [I - R(t+1)]\,v(t+1) = \begin{bmatrix} \hat{v}^*(t) \\ y(t+1) \\ u(t+1) \end{bmatrix} + k(t)\,\tilde{\Sigma}(t+1) \begin{bmatrix} I-R(t) & 0 \\ 0 & I_2 \end{bmatrix}^T m^T(t+1)\,\varepsilon(t+1).    (38)

In filtering, however, only the last two entries of v̂*(t+1) are of interest. It is now important to observe that, since only the last 2n+2 entries of m(t+1) are nonzero, only the lower right 2n × 2n submatrix of R(t), denoted R(t)_{2n}, and the last 2n interpolated entries of v̂*(t) are required to compute the last 2n+2 entries of v̂*(t+1), denoted v̂*(t+1)_{2n+2}. This means that the memory of the filter (38) is equal to n, i.e. to the order of the system. It is thus possible to consider only the update of the last 2n+2 components of v̂*(t); this can be obtained by rewriting relation (38) in the form

\hat{v}^*(t+1)_{2n+2} = \begin{bmatrix} \hat{v}^*(t)_{2n} \\ y(t+1) \\ u(t+1) \end{bmatrix} + k(t)\,\tilde{\Sigma}(n+1)\,s^T(t)\,\varepsilon(t+1)    (39)

with

s(t) = m_S \begin{bmatrix} (I-R(t))_{2n} & 0 \\ 0 & I_2 \end{bmatrix}    (40)

and

m_S = [\alpha_1\ \beta_1\ \cdots\ \alpha_n\ \beta_n\ -1\ \beta_{n+1}].    (41)

Using this notation, k(t) can also be rewritten in the form

k(t) = 1 / (s(t)\,\tilde{\Sigma}(n+1)\,m_S^T),    (42)

where Σ̃(n+1) denotes the (2n+2) × (2n+2) block-diagonal matrix built with n+1 blocks Σ̃_yu. Finally, R(t)_{2n} can be updated by means of the recursion

R(t+1)_{2n} = \left( \begin{bmatrix} R(t)_{2n} & 0 \\ 0 & 0_{(2\times 2)} \end{bmatrix} + k(t)\,\tilde{\Sigma}(n+1)\,s^T(t)\,s(t) \right)_{2n},    (43)

which is performed on matrices with maximal dimension (2n+2) × (2n+2). The steps of the whole procedure can thus be summarised as follows.

Algorithm 1.
(1) At time t = n+1 compute, by means of (32), v̂*(t) and, consequently, ŷ*(t) and û*(t).
(2) Compute s(t) (40) and k(t) (42).
(3) Set t ← t+1 and compute the quantities ε(t) (37), v̂*(t)_{2n+2} (39) and, consequently, ŷ*(t) and û*(t).
(4) Update R(t)_{2n} by means of (43).
(5) Return to step 2.
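As a reference point for Algorithm 1, the following brute-force filter (our illustration, not the authors' implementation) solves Problem 2 exactly as described at the beginning of this section: at every time step it recomputes the full interpolation (26) on all data collected so far and retains only the last interpolated pair. Its cost per step grows with t, which is precisely what the finite-memory recursion (39)–(43) avoids by updating only (2n+2)-dimensional quantities.

```python
import numpy as np

def build_G(alpha, beta, N):
    n = len(alpha)
    row = np.empty(2 * (n + 1))
    row[0:2 * n:2], row[1:2 * n:2] = alpha, beta[:n]
    row[2 * n], row[2 * n + 1] = -1.0, beta[n]
    G = np.zeros((N, 2 * (N + n)))
    for j in range(N):
        G[j, 2 * j:2 * j + 2 * (n + 1)] = row
    return G

def eiv_filter_naive(y, u, alpha, beta, Sig_yu):
    """Problem 2 by brute force: redo the interpolation (26) on the whole
    record at every t and keep its last two entries (y*(t), u*(t))."""
    n, L = len(alpha), len(y)
    v = np.column_stack([y, u]).ravel()
    y_f, u_f = y.astype(float).copy(), u.astype(float).copy()
    for t in range(n + 1, L + 1):          # growing window of t sample pairs
        G = build_G(alpha, beta, t - n)
        Sigma = np.kron(np.eye(t), Sig_yu)
        vt = v[:2 * t]
        v_star = vt - Sigma @ G.T @ np.linalg.solve(G @ Sigma @ G.T, G @ vt)
        y_f[t - 1], u_f[t - 1] = v_star[-2], v_star[-1]
    return y_f, u_f                        # first n samples left as measured
```

Algorithm 1 produces the same last pair at every step while only storing and updating R(t)_{2n} and the last 2n interpolated entries.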

It can be observed that the interpolation and filtering procedures described by Roorda (1994) can be extended to the EIV case. The interpolation considered in Roorda (1994) is, in fact, based on an orthogonal projection corresponding to relation (16), while the minimal variance EIV interpolation requires the oblique projection given by (25). To solve the EIV interpolation by means of an orthogonal projection it is necessary to modify both the data and the model and to perform a post-filtering of the result; rewrite, for this purpose, relation (25) in the form

\tilde{v}^*_{MV} = \tilde{\Sigma}^{1/2} \tilde{\Sigma}^{1/2} G^T (G \tilde{\Sigma}^{1/2} \tilde{\Sigma}^{1/2} G^T)^{-1} G \tilde{\Sigma}^{1/2} \tilde{\Sigma}^{-1/2} v = \tilde{\Sigma}^{1/2} \tilde{G}^T (\tilde{G} \tilde{G}^T)^{-1} \tilde{G}\,z,    (44)

where

\tilde{G} = G\,\tilde{\Sigma}^{1/2}, \qquad z = \tilde{\Sigma}^{-1/2}\,v.    (45)

Relation (44) has the same structure as (16) except for the premultiplication by Σ̃^{1/2}, which implies the necessity of a post-processing (rescaling) of the obtained output; the orthogonal projection has, however, been performed on modified data (z instead of v) and with reference to a different model (G̃ instead of G). It can be noted that the scaling of the data on the basis of the variance of the respective additive noise is a step necessary to reduce the more general unbalanced EIV environment to a standard balanced one. These considerations have been made with reference only to the interpolation problem; in a similar way the EIV filtering problem can be solved by extending the filtering procedure proposed in Roorda (1994) according to the following steps:

(1) Compute the new input–output model

\begin{bmatrix} \alpha_1 & \beta_1 \\ \vdots & \vdots \\ \alpha_n & \beta_n \\ -1 & \beta_{n+1} \end{bmatrix} \tilde{\Sigma}_{yu}^{1/2};    (46)

(2) Compute an isometric state-space model equivalent to the input–output model obtained at step 1;

(3) Perform the data scaling

\begin{bmatrix} y(1) & u(1) \\ y(2) & u(2) \\ \vdots & \vdots \\ y(t) & u(t) \end{bmatrix} \tilde{\Sigma}_{yu}^{-1/2};    (47)

(4) Apply the filtering algorithm described in Roorda (1994, p. 62) and obtain the intermediate filtered sequences ŷ⁺(·), û⁺(·);

(5) Perform the post-filtering

\begin{bmatrix} \hat{y}^+(1) & \hat{u}^+(1) \\ \hat{y}^+(2) & \hat{u}^+(2) \\ \vdots & \vdots \\ \hat{y}^+(t) & \hat{u}^+(t) \end{bmatrix} \tilde{\Sigma}_{yu}^{1/2}    (48)

in order to obtain the filtered sequences ŷ*(·), û*(·).

It can be observed that the algorithms proposed in this paper are simpler and more efficient than the above procedure. Moreover, from a conceptual point of view, EIV filtering and the filtering problem treated in Roorda (1994) differ in the following two aspects:

(1) the stochastic versus deterministic context;
(2) the unbalanced versus balanced error environments.

Unbalanced noise environments, which constitute the very basis of EIV models, are not considered at all in Roorda (1994). Thus the steps formally necessary to adapt Roorda's algorithm to the EIV context require a proper definition of the meaning of some scaling matrices that, in the stochastic context, are noise covariances.

5. Comparison with Kalman filtering

The purpose of this section is to solve the EIV filtering problem by means of an alternative procedure relying on state-space instead of behavioural models, based on Kalman filtering techniques. On the basis of (24) and (26) it can be observed that the optimal estimate (32) is given by

\hat{v}^*(t) = E[\hat{v}(t) \mid \eta(t)],    (49)

where, according to (14),

\eta(t) = G(t)\,v(t) = [\theta(n+1),\ \theta(n+2),\ \ldots,\ \theta(t)]^T    (50)

and the entries θ(i) (i = n+1, ..., t) constitute the samples of the process

\theta(t) = p(z)\,u(t) - q(z)\,y(t) = p(z)\,\tilde{u}(t) - q(z)\,\tilde{y}(t).    (51)

Since the process θ(t) is constituted by the sum of two MA processes, it admits a state-space realisation of the type

\xi(t+1) = A\,\xi(t) + B\,w(t),    (52)

\theta(t) = C\,\xi(t) + D\,w(t),    (53)

where

A = \begin{bmatrix} 0 & 0 & \cdots & 0 \\ 1 & 0 & \cdots & 0 \\ \vdots & \ddots & \ddots & \vdots \\ 0 & \cdots & 1 & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} \alpha_1 & \beta_1 \\ \alpha_2 & \beta_2 \\ \vdots & \vdots \\ \alpha_n & \beta_n \end{bmatrix}, \qquad C = [0\ \cdots\ 0\ 1], \qquad D = [-1\ \ \beta_{n+1}]    (54)

and w(t) = [ỹ(t)  ũ(t)]^T, with covariance matrix Σ̃_yu defined in (19).
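The realisation (52)–(54) can be checked directly: the sketch below (our illustration, not code from the paper) simulates (52)–(53) on correlated white noise and verifies that its outputs reproduce the entries of η = G ṽ, i.e. the samples of θ(t) in (50)–(51) (recall that G v̂ = 0, so G v = G ṽ).

```python
import numpy as np

alpha = np.array([-0.45, 0.6])
beta = 0.5381 * np.array([-0.4, 0.3, 1.0])
n, L = 2, 30
A = np.diag(np.ones(n - 1), -1)            # shift matrix in (54)
B = np.column_stack([alpha, beta[:n]])     # rows [alpha_k  beta_k]
C = np.zeros(n); C[-1] = 1.0
D = np.array([-1.0, beta[n]])

rng = np.random.default_rng(3)
Sig_yu = np.array([[0.64, 0.36], [0.36, 0.25]])
w = rng.multivariate_normal(np.zeros(2), Sig_yu, size=L)   # w(t) = [y~(t) u~(t)]

theta = np.zeros(L)
x = np.zeros(n)
for t in range(L):
    theta[t] = C @ x + D @ w[t]            # (53)
    x = A @ x + B @ w[t]                   # (52)

# the same samples straight from eta = G v~, with G as in (12)
N = L - n
row = np.empty(2 * (n + 1))
row[0:2 * n:2], row[1:2 * n:2] = alpha, beta[:n]
row[2 * n], row[2 * n + 1] = -1.0, beta[n]
G = np.zeros((N, 2 * (N + n)))
for j in range(N):
    G[j, 2 * j:2 * j + 2 * (n + 1)] = row
print(np.allclose(theta[n:], G @ w.ravel()))   # True
```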

The associated Kalman filter leading to the optimal estimate of ξ(t) is described by the relations

\xi(t+1 \mid t) = \left( A - \frac{B S C}{r} \right) \xi(t \mid t) + \frac{B S}{r}\,\theta(t),    (55)

\xi(t \mid t) = \xi(t \mid t-1) + K(t)\,(\theta(t) - C\,\xi(t \mid t-1)),    (56)

K(t) = \frac{P(t \mid t-1)\,C^T}{C\,P(t \mid t-1)\,C^T + r},    (57)

P(t \mid t) = (I - K(t)\,C)\,P(t \mid t-1),    (58)

P(t+1 \mid t) = \left( A - \frac{B S C}{r} \right) P(t \mid t) \left( A - \frac{B S C}{r} \right)^T + B \left( Q - \frac{S S^T}{r} \right) B^T,    (59)

where

Q = \tilde{\Sigma}_{yu},    (60)

r = D\,\tilde{\Sigma}_{yu}\,D^T = \beta_{n+1}^2 \tilde{\sigma}^2_u + \tilde{\sigma}^2_y - 2\,\beta_{n+1}\,\tilde{\sigma}_{yu},    (61)

S = \tilde{\Sigma}_{yu}\,D^T = \begin{bmatrix} \beta_{n+1}\,\tilde{\sigma}_{yu} - \tilde{\sigma}^2_y \\ \beta_{n+1}\,\tilde{\sigma}^2_u - \tilde{\sigma}_{yu} \end{bmatrix}.    (62)

Relations (52) and (55) make it possible to determine an optimal estimate w(t|t) = w*(t); in fact

\xi(t+1 \mid t) = A\,\xi(t \mid t) + B\,w^*(t).    (63)

By substituting (55) in (63) we obtain

w^*(t) = \begin{bmatrix} \tilde{y}^*(t) \\ \tilde{u}^*(t) \end{bmatrix} = \frac{S}{r}\,[\theta(t) - C\,\xi(t \mid t)].    (64)

The minimal variance estimates of ŷ(t), û(t) can thus be obtained as follows:

\hat{y}^*(t) = y(t) - \tilde{y}^*(t) = y(t) - (\beta_{n+1}\,\tilde{\sigma}_{yu} - \tilde{\sigma}^2_y)\,\frac{\theta(t) - C\,\xi(t \mid t)}{r},    (65)

\hat{u}^*(t) = u(t) - \tilde{u}^*(t) = u(t) - (\beta_{n+1}\,\tilde{\sigma}^2_u - \tilde{\sigma}_{yu})\,\frac{\theta(t) - C\,\xi(t \mid t)}{r}.    (66)

Remark 6. The estimate ŷ*(t) given by (65) coincides with the noisy measurement y(t) when β_{n+1}σ̃_yu − σ̃²_y = 0. Similarly, the estimate û*(t) given by (66) coincides with the noisy measurement u(t) when β_{n+1}σ̃²_u − σ̃_yu = 0.

It is important to note that the filtering performed by means of the state-space approach described in this section leads to the same results as the filtering performed by means of behavioural models, since it is based on the same process θ(t). A particularly efficient and numerically stable implementation based on the Cholesky decomposition can be found in Diversi, Guidorzi, and Soverini (2002).
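A compact rendering of the resulting filter is sketched below (ours, not the authors' code): it builds A, B, C, D, Q, r and S from the model coefficients as in (54) and (60)–(62), runs the measurement and time updates (55)–(59) on the samples of θ(t), and recovers the signal estimates through (64)–(66). The initialisation of ξ and P is arbitrary here, and the first n samples are left as measured.

```python
import numpy as np

def eiv_kalman_filter(y, u, alpha, beta, Sig_yu):
    """State-space EIV filter: relations (55)-(59) driven by theta(t),
    with signal estimates via (64)-(66). Initialisation is arbitrary."""
    n, L = len(alpha), len(y)
    A = np.diag(np.ones(n - 1), -1)                    # (54)
    B = np.column_stack([alpha, beta[:n]])
    C = np.zeros(n); C[-1] = 1.0
    D = np.array([-1.0, beta[n]])

    Q = Sig_yu                                         # (60)
    r = D @ Sig_yu @ D                                 # (61)
    S = Sig_yu @ D                                     # (62), a 2-vector
    Abar = A - np.outer(B @ S, C) / r                  # A - B S C / r
    Qbar = B @ (Q - np.outer(S, S) / r) @ B.T

    x = np.zeros(n)                                    # xi(t | t-1)
    P = np.eye(n)                                      # its error covariance
    y_f, u_f = y.astype(float).copy(), u.astype(float).copy()
    for t in range(n, L):
        # sample of theta: p(z)u - q(z)y evaluated so that it ends at time t
        theta = alpha @ y[t - n:t] + beta @ u[t - n:t + 1] - y[t]
        K = P @ C / (C @ P @ C + r)                    # (57)
        x = x + K * (theta - C @ x)                    # (56): xi(t | t)
        P = P - np.outer(K, C @ P)                     # (58)
        w_star = S / r * (theta - C @ x)               # (64): [y~*(t), u~*(t)]
        y_f[t] = y[t] - w_star[0]                      # (65)
        u_f[t] = u[t] - w_star[1]                      # (66)
        x = Abar @ x + (B @ S) / r * theta             # (55): xi(t+1 | t)
        P = Abar @ P @ Abar.T + Qbar                   # (59)
    return y_f, u_f
```

The function can be run directly on data generated as in the Section 6 setup below and, apart from the transient due to the arbitrary initialisation, should track the behavioural-model filter, as the equivalence argument above indicates.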

6. A Monte Carlo simulation

The properties of the proposed filtering procedure have been tested by means of a Monte Carlo simulation performed on a second-order system given by

(z^2 - 0.6\,z + 0.45)\,\hat{y}(t) = 0.5381\,(z^2 + 0.3\,z - 0.4)\,\hat{u}(t),

where the gain assures an output with the same variance as the input, in order to simplify the description of the results. A Monte Carlo simulation constituted by 100 runs has been performed to test both the interpolation and the filtering procedures. Every run has been performed using as input 200 samples of a coloured sequence with unitary variance (σ̂²_u = 1) generated by an ARMA process. The noiseless input and output sequences have been corrupted by additive white noises ũ(·), ỹ(·) with variances σ̃²_u = 0.25 and σ̃²_y = 0.64 and cross-covariance σ̃_yu = 0.36. The percent amounts of noise, defined as one hundred times the ratio between the standard deviations of the noises and those of the signals, are thus equal to 50% on the input and 80% on the output. Figs. 1 and 2 show the noiseless input and output (continuous line) compared with the noise-corrupted ones (dashed line) in a typical run.

[Fig. 1. Comparison between the noiseless input and its observation.]

[Fig. 2. Comparison between the noiseless output and its observation.]

The mean values, over all runs, of the standard deviations of the differences between the interpolated input and output sequences and the noiseless ones are

ε_u = 0.3042 ± 0.0113,    ε_y = 0.3061 ± 0.0221,

which correspond to percent errors of 30.4% and 30.6%. The mean values, over all runs, of the standard deviations of the differences between the filtered input and output sequences and the noiseless ones are

ε_u = 0.3410 ± 0.0112,    ε_y = 0.3447 ± 0.0249,

which correspond to percent errors of 34.1% and 34.5%. Figs. 3 and 4 show the noiseless input and output (continuous line) compared with the filtered ones (dashed line) in a typical run.

[Fig. 3. Comparison between the noiseless input and the filtered one.]

[Fig. 4. Comparison between the noiseless output and the filtered one.]
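The simulation setup can be reconstructed along the following lines (our sketch; the paper does not specify the ARMA process generating the input, so an AR(1) model is used here as a stand-in). The stated noise percentages follow from the given variances and the unit-variance normalisation of the signals.

```python
import numpy as np

rng = np.random.default_rng(4)
L = 200
alpha = np.array([-0.45, 0.6])                  # q(z) = z^2 - 0.6 z + 0.45
beta = 0.5381 * np.array([-0.4, 0.3, 1.0])      # p(z) = 0.5381 (z^2 + 0.3 z - 0.4)

# coloured unit-variance input; the AR(1) colouring is our assumption
e = rng.standard_normal(L + 100)
u_hat = np.zeros(L + 100)
for t in range(1, L + 100):
    u_hat[t] = 0.7 * u_hat[t - 1] + e[t]
u_hat = u_hat[100:] / u_hat[100:].std()

y_hat = np.zeros(L)
for t in range(L - 2):
    y_hat[t + 2] = alpha @ y_hat[t:t + 2] + beta @ u_hat[t:t + 3]

Sig_yu = np.array([[0.64, 0.36], [0.36, 0.25]]) # var y~ = 0.64, var u~ = 0.25
noise = rng.multivariate_normal(np.zeros(2), Sig_yu, size=L)
y = y_hat + noise[:, 0]
u = u_hat + noise[:, 1]
print("noise on u: %.0f%%" % (100 * noise[:, 1].std() / u_hat.std()))  # ~50%
print("noise on y: %.0f%%" % (100 * noise[:, 0].std() / y_hat.std()))
# close to 80%; the exact value depends on the assumed input colouring
```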

7. Concluding remarks

This paper has described a new optimal filter for errors-in-variables processes, where both inputs and outputs are affected by measurement noise and no orientation of the process is necessarily required. This filter makes it possible to obtain minimum variance estimates of both inputs and outputs and can be implemented by means of a simple recursive algorithm that does not require the solution of a Riccati equation at every step. The results obtained by means of Monte Carlo simulations show the consistency and good performance of the approach. Future investigations will concern the application of this algorithm to real processes, where an EIV framework is more realistic than traditional error-free input contexts.

Appendix

To obtain relation (33) we will start from the update of the inverse of the matrix M(t) = G(t)Σ̃(t)G^T(t). Observing that

M(t+1) = \begin{bmatrix} M(t) & G(t)\,\tilde{\Sigma}(t)\,m_1^T(t) \\ m_1(t)\,\tilde{\Sigma}(t)\,G^T(t) & m_1(t)\,\tilde{\Sigma}(t)\,m_1^T(t) + m_2\,\tilde{\Sigma}_{yu}\,m_2^T \end{bmatrix},    (A.1)

a well-known lemma on the inversion of partitioned matrices leads, after some editing of the results, to

M(t+1)^{-1} = \begin{bmatrix} M(t)^{-1} & 0 \\ 0 & 0 \end{bmatrix} + k(t) \begin{bmatrix} S^T(t)\,m_1^T(t)\,m_1(t)\,S(t) & -S^T(t)\,m_1^T(t) \\ -m_1(t)\,S(t) & 1 \end{bmatrix},    (A.2)

where S(t) and k(t) denote the matrix

S(t) = \tilde{\Sigma}(t)\,G^T(t)\,M(t)^{-1}    (A.3)

and the scalar

k(t) = [\,m_1(t)\tilde{\Sigma}(t)m_1^T(t) + m_2\tilde{\Sigma}_{yu}m_2^T - m_1(t)R(t)\tilde{\Sigma}(t)m_1^T(t)\,]^{-1} = [\,m_2\,\tilde{\Sigma}_{yu}\,m_2^T - m_1(t)(R(t)-I)\,\tilde{\Sigma}(t)\,m_1^T(t)\,]^{-1}.    (A.4)

Using (A.2) it is easy to obtain the following expression for the update of R(t+1):

R(t+1) = \begin{bmatrix} R(t) & 0 \\ 0 & 0 \end{bmatrix} + k(t) \begin{bmatrix} \tilde{\Sigma}(t)R^T(t)m_1^T(t)m_1(t)R(t) - \tilde{\Sigma}(t)m_1^T(t)m_1(t)R(t) - \tilde{\Sigma}(t)R^T(t)m_1^T(t)m_1(t) + \tilde{\Sigma}(t)m_1^T(t)m_1(t) & -\tilde{\Sigma}(t)R^T(t)m_1^T(t)m_2 + \tilde{\Sigma}(t)m_1^T(t)m_2 \\ -\tilde{\Sigma}_{yu}m_2^T m_1(t)R(t) + \tilde{\Sigma}_{yu}m_2^T m_1(t) & \tilde{\Sigma}_{yu}\,m_2^T\,m_2 \end{bmatrix};    (A.5)

simple editing finally leads to relation (33).

References

Anderson, B. D. O., & Deistler, M. (1984). Identifiability of dynamic errors-in-variables models. Journal of Time Series Analysis, 5, 1-13.

Anderson, B. D. O., & Moore, J. B. (1979). Optimal filtering. Englewood Cliffs, NJ: Prentice-Hall.

Beghelli, S., Guidorzi, R. P., & Soverini, U. (1990). The Frisch scheme in dynamic system identification. Automatica, 26, 171-176.

Deistler, M. (1986). Linear errors-in-variables models. In S. Bittanti (Ed.), Time series and linear systems (pp. 37-68). Berlin: Springer.

Diversi, R., Guidorzi, R., & Soverini, U. (2002). Algorithms for optimal errors-in-variables filtering. Systems & Control Letters, to appear.

Guidorzi, R. P. (1981). Invariants and canonical forms for systems structural and parametric identification. Automatica, 17, 117-133.

Guidorzi, P., & Guidorzi, R. (1998). Frisch filtering of noisy signals. Proceedings of EUSIPCO 98, Rhodes, Greece, Vol. IV (pp. 2117-2120).

Kalman, R. E. (1960). A new approach to linear filtering and prediction problems. Transactions of the ASME, 82, 35-45.

Mehra, R. K. (1970). On the identification of variances and adaptive Kalman filtering. IEEE Transactions on Automatic Control, 15, 175-184.

Rhodes, I. B. (1971). A tutorial introduction to estimation and filtering. IEEE Transactions on Automatic Control, 16, 688-706.

Roorda, B. (1994). Global total least squares. Ph.D. thesis, Book no. 88, Tinbergen Institute Research Series, Amsterdam, The Netherlands.

Roorda, B., & Heij, C. (1995). Global total least squares modeling of multivariable time series. IEEE Transactions on Automatic Control, 40, 50-63.

Söderström, T. (1981). Identification of stochastic linear systems in presence of input noise. Automatica, 17, 713-725.

Soverini, U., & Söderström, T. (2000). Identification methods of dynamic systems in presence of input noise. Proceedings of the 12th IFAC Symposium on System Identification, Santa Barbara, USA.

Willems, J. C. (1986a). From time series to linear system, Part I: Finite dimensional linear time invariant systems. Automatica, 22, 561-580.

Willems, J. C. (1986b). From time series to linear system, Part II: Exact modelling. Automatica, 22, 675-694.

Willems, J. C. (1987). From time series to linear system, Part III: Approximate modelling. Automatica, 23, 87-115.

Zheng, W. X., & Feng, C. B. (1989). Unbiased parameter estimation of linear systems in the presence of input and output noise. International Journal of Adaptive Control and Signal Processing, 3, 231-251.

Roberto Guidorzi has held the chair of System Theory at the University of Bologna since 1980. He was Director of the Computer Centre of the Engineering School of Bologna University from 1987 to 1992 and has been, since January 1997, Director of CITAM (Interfaculty Center for Advanced and Multimedia Technologies). He has been a visiting professor and invited speaker in European and American universities and has collaborated with several industries in the development of advanced projects, among them management systems for gas pipeline networks, satellite navigation systems, computerised injection systems, software for the identification of natural gas reservoirs, tracking and data fusion, and early diagnosis in railway systems. He is the author of some 160 publications dealing with subjects of a methodological nature as well as applications of system theory methodologies. His present research interests concern errors-in-variables identification and filtering, blind channel equalisation and the development of e-learning environments.

Roberto Diversi was born in Faenza, Italy, in 1970. He received the "Laurea" degree in Electronic Engineering in 1996 and the Ph.D. degree in System Engineering in 2000, both from the University of Bologna. Currently, he is a post-doc researcher at the Department of Electronics, Computer Science and Systems, University of Bologna. His main research interests are in the field of signal processing, system identification and related applications to natural and industrial processes.

Umberto Soverini was born in Bologna, Italy, in 1959. He received the "Laurea" degree in Electronic Engineering from the University of Bologna in 1985 and the Ph.D. in System Engineering in 1990. In 1992 he was appointed Research Associate at the Department of Electronics, Computer Science and Systems, University of Bologna, where he is currently Associate Professor. In 1999 he was a visiting researcher at the Department of Systems and Control, Uppsala University. His research interests include stochastic realization theory, signal processing, system identification and related applications to natural and industrial processes.