Proceedings of the 15th IFAC Symposium on System Identification Saint-Malo, France, July 6-8, 2009
Identifying Errors-in-Variables Systems by Using a Covariance Matching Approach ⋆

Torsten Söderström ∗, Magnus Mossberg ∗∗, Mei Hong ∗

∗ Division of Systems and Control, Department of Information Technology, Uppsala University, P O Box 337, SE-75105 Uppsala, Sweden (e-mail: [email protected], [email protected])
∗∗ Department of Physics and Electrical Engineering, Karlstad University, SE-651 88 Karlstad, Sweden (e-mail: [email protected])

⋆ This research was partially supported by The Swedish Research Council, contract 621-2007-6364.

978-3-902661-47-0/09/$20.00 © 2009 IFAC

Abstract: The errors-in-variables identification problem concerns dynamic systems whose input and output variables are affected by additive noise. Several estimation methods have been proposed for identifying dynamic errors-in-variables models. In this paper a covariance matching approach is proposed to solve the identification problem. It applies to general types of input signals. The method utilizes a small set of covariances of the measured input-output data, a property shared by some other methods, such as the Frisch scheme and the bias-eliminating least squares method. Algorithmic details for the proposed method are provided. The method is evaluated by using numerical examples, and is shown to have competitive properties compared to alternative methods.

Keywords: System identification, errors-in-variables models, linear systems, covariance functions

1. INTRODUCTION

The errors-in-variables (EIV) framework concerns static or dynamic systems whose input and output variables are affected by additive noise. These models play an important role in several engineering applications such as time series modeling, direction-of-arrival estimation, blind channel equalization, and many other signal and image processing problems; see Van Huffel (1997), Van Huffel and Lemmerling (2002).

Many different solutions have been proposed for the identification of EIV models. An overview can be found in Söderström (2007). Among the different methods there are several that are based on a number of covariances of the noisy input-output data. These methods are typically based on a compensated least squares principle, which leads to a set of nonlinear equations with a specific structure. In many cases the equations are bilinear in the unknown parameters, and specific numerical algorithms may be applied to solve them. One method of this class is bias-eliminating least squares (BELS), proposed in Zheng (1998), Zheng (2002), with different variants depending on whether the output noise is white or correlated. Another approach is the Frisch scheme, for which there are several ways to arrange the additional equations leading to the bias compensation. The method was developed to identify dynamic EIV systems in Beghelli et al. (1990) and was further elaborated for the white output noise case in Beghelli et al. (1993), Diversi et al. (2003), while an extension to the correlated output noise case appeared in Söderström (2008). A third alternative is the so-called extended compensated least squares (ECLS) method, proposed in Ekman (2005) and analyzed in Ekman et al. (2006). It is a generalization of the instrumental variable method, where a compensation term is added to remove the bias contribution caused by the measurement noise. All these methods have been shown to be similar, and in several cases equivalent under weak assumptions, Hong and Söderström (2009). The methods are constructed to work well for general types of the noise-free input.

In this paper we propose and analyze an alternative method. The starting point is not the normal equations of least squares, but a small set of covariance elements of the measured data. These covariance elements are compared to what can be expected from the underlying model, and this leads to equations defining the parameter estimates. Such an approach has been described previously, for example in Mossberg (2008), Söderström et al. (2006). In those cases, though, it was assumed that the noise-free input can be described by an ARMA model, and the ARMA parameters were estimated together with the unknown system parameters. Here we develop a method where the noise-free input is a stationary process, but we do not specify any particular type of model for its statistical properties. The proposed method is developed in detail and compared with alternative methods that are also based on a small set of covariance elements of the measured data.

The paper is organized as follows. The next section gives background to the problem, and the proposed method is presented in Section 3. Section 4 is devoted to implementation aspects, with some more details on user choices appearing in Section 5. Section 6 contains comparisons to other, similar approaches. Numerical illustrations appear in Section 7, and concluding remarks are given in Section 8.

2. BACKGROUND

Consider a linear single input single output (SISO) system given by

    A(q^-1) y_0(t) = B(q^-1) u_0(t),    (1)

where u_0(t) and y_0(t) are the noise-free input and output, respectively. Further, A(q^-1) and B(q^-1) are polynomials
10.3182/20090706-3-FR-2004.0027
described as

    A(q^-1) = 1 + a_1 q^-1 + ... + a_n q^-n,
    B(q^-1) = b_1 q^-1 + ... + b_m q^-m.    (2)

We assume that the observations are corrupted by additive measurement noises ũ(t) and ỹ(t). The available signals are in discrete time and of the form

    u(t) = u_0(t) + ũ(t),
    y(t) = y_0(t) + ỹ(t).    (3)

The following assumptions are introduced.

A1. The dynamic system (1) is asymptotically stable, observable and controllable.
A2. The polynomial degrees n and m are a priori known.
A3. The true input u_0(t) is a zero-mean stationary ergodic random signal that is persistently exciting of sufficiently high order.
A4. The input noise ũ(t) and the output noise ỹ(t) are both independent of u_0(t) and mutually independent white Gaussian noise sequences of zero mean, with variances λ_u and λ_y, respectively. (The extension of the method to cope with arbitrarily correlated output noise is possible.)

The problem of identifying this EIV system is concerned with consistently estimating the parameter vector

    θ_0 = (a_1 ... a_n  b_1 ... b_m)^T    (4)

and possibly the noise variances λ_u and λ_y from the measured noisy data {u(t), y(t)}, t = 1, ..., N.

For a general random process x(t), we define its covariance function r_x(τ) as

    r_x(τ) = E{x(t) x(t-τ)},  τ = 0, ±1, ±2, ...,    (5)

where E is the expectation operator.
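As a concrete illustration of how a covariance function of the form (5) can be estimated from a finite zero-mean record, a minimal sketch follows. This is our own illustration, not code from the paper; the function name `sample_cov` and the biased (divide-by-N) estimator convention are assumptions.

```python
import numpy as np

def sample_cov(x, z, tau):
    """Sample estimate of r_xz(tau) = E{x(t+tau) z(t)} from zero-mean records."""
    N = len(x)
    if tau >= 0:
        return np.dot(x[tau:], z[:N - tau]) / N
    return np.dot(x[:N + tau], z[-tau:]) / N

# For a white-noise record with unit variance, the estimate at lag 0 should
# be close to 1 and the estimate at lag 1 close to 0.
rng = np.random.default_rng(0)
u = rng.standard_normal(10_000)
assert abs(sample_cov(u, u, 0) - 1.0) < 0.1
assert abs(sample_cov(u, u, 1)) < 0.1
```

With x = z = u this gives the entries of r̂_u, with x = z = y those of r̂_y, and with x = y, z = u the cross-covariances r̂_yu used later in the paper.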
3. PROPOSED METHOD

Define

    z_0(t) = (1 / A(q^-1)) u_0(t).    (6)

This means that both u_0(t) and y_0(t) can be expressed in terms of z_0(t), and that

    u(t) = A(q^-1) z_0(t) + ũ(t),    (7)
    y(t) = B(q^-1) z_0(t) + ỹ(t).    (8)

It is a key issue that both u_0(t) and y_0(t) are obtained by FIR filtering (and not IIR filtering) of z_0(t), as in (7), (8).

The covariance function of the measured input u(t) can be written as

    r_u(τ) = E{u(t+τ) u(t)} = Σ_{i=0}^{n} Σ_{j=0}^{n} a_i a_j r_0(τ - i + j)    (9)

for τ > 0, where a_0 = 1 and

    r_0(τ) = E{z_0(t+τ) z_0(t)}.    (10)

Moreover,

    r_y(τ) = E{y(t+τ) y(t)} = Σ_{i=1}^{m} Σ_{j=1}^{m} b_i b_j r_0(τ - i + j)    (11)

for τ > 0, and

    r_yu(τ) = E{y(t+τ) u(t)} = Σ_{i=1}^{m} Σ_{j=0}^{n} b_i a_j r_0(τ - i + j).    (12)

In the covariance matching approach, the idea is to regard θ_0 and the covariance elements {r_0(τ)} as unknowns that are fitted to some specified input-output covariance elements, as in (9), (11) and (12). The approach thus means that first the covariance vectors

    r_y = (r_y(1) ... r_y(p_y))^T,
    r_u = (r_u(1) ... r_u(p_u))^T,
    r_yu = (r_yu(p_1) ... r_yu(p_2))^T    (13)

are estimated. Here, the integers p_y, p_u, p_1 and p_2 are treated as user parameters. When the output is assumed to be corrupted by white output noise, one can choose p_y ≥ 1. In the extended case of correlated output noise, no vector r_y at all should be used. As the input is assumed to be subject to white measurement noise, the condition p_u ≥ 1 should apply. The integers p_1 and p_2 can be chosen more freely, but must by construction fulfill p_1 ≤ p_2. It is a common experience that covariance elements tend to be more informative when the lag is small. Therefore it is practical to assume that the covariance element r_yu(0) is included. Note that (12) holds also for τ = 0. We will for convenience assume that

    p_1 ≤ 0 ≤ p_2.    (14)

We now introduce a number of notational conventions. The covariance vector of the noise-free signal z_0(t) is introduced as

    r_z = (r_0(0) ... r_0(k))^T,    (15)

where the maximal lag k is to be specified later. Further, introduce the conventions

    a_i = 1 for i = 0,  a_i = 0 for i < 0 or i > n,    (16)
    b_i = 0 for i < 1 or i > m.    (17)

The vector of known covariance elements to be utilized is taken as

    r = (r_y^T  r_u^T  r_yu^T)^T.    (18)

From (9), (11) and (12) it is clear that it is possible to write

    r = F(θ) r_z,  with F(θ) = (F_y(θ)^T  F_u(θ)^T  F_yu(θ)^T)^T.    (19)

Note that (9) and (11) do not hold for τ = 0. For such a case the noise variances λ_y and λ_u have to be added to the right hand sides. Including such equations in (19) would give two more equations, but also two more unknowns to be determined. One cannot expect that such a case would lead to other estimates of θ, as there is hardly any new information about θ to infer. Still, such a consideration can have its value when comparing the covariance matching approach with alternative methods such as BELS and the Frisch scheme, as in these methods the covariances r_y(0) and r_u(0) certainly are exploited for determining the parameter vector θ. We will soon describe how to express F(θ) for arbitrary values of θ; see Section 4 for details.

The identification approach is hence to first estimate r from data, which is a straightforward task. Then θ and r_z are determined by considering the system of equations given by (19). Here, of course, θ is the interesting quantity to determine, while r_z has the role of auxiliary variables. Based on the system of equations (19), the estimator

    {θ̂, r̂_z} = arg min_{θ, r_z} J(θ, r_z),    (20)
where

    J(θ, r_z) = ||r̂ - F(θ) r_z||²_Q,    (21)

is suggested. Here, r̂ is defined as in (18) but with the covariance and cross-covariance function elements replaced by estimates, and Q is a symmetric and positive definite weighting matrix. From the separable least squares problem (20),

    r̂_z = (F^T(θ) Q F(θ))^{-1} F^T(θ) Q r̂    (22)

and

    θ̂ = arg min_θ [ r̂^T Q r̂ - r̂^T Q F(θ) (F^T(θ) Q F(θ))^{-1} F^T(θ) Q r̂ ].    (23)

A number of remarks on the implementation issues are now in order.

(1) The optimization problem in (23) is a variable projection problem. It can be solved with standard optimization algorithms. It is, though, possible in this case to provide explicit expressions for the gradient, as the dependence of F(θ) on θ has a relatively straightforward structure. Many general properties of variable projection algorithms are collected in Golub and Pereyra (1973), Golub and Pereyra (2003).
(2) Note that for the algorithm (23) to work, it is necessary that the matrix F(θ) has full rank. This condition is analysed in some detail in Söderström et al. (2008).
(3) In a practical implementation one should use a QR factorization of F(θ) to ensure that the corresponding projection in (23) is guaranteed to stay nonnegative definite.
(4) It seems to be an open issue whether or not the loss function defined in (23) can have multiple local minima. In our numerical studies we have, though, not encountered problems with local minima.

4. EXPRESSIONS FOR THE COVARIANCE ELEMENTS

To show how the matrix F(θ) in (19) depends on θ, we start with F_y(θ). For this purpose, it is convenient to introduce

    β_j = Σ_{i=1}^{m} b_i b_{i+j}.    (24)

It follows from (17) that β_j = 0 if either j ≤ -m or j ≥ m. We can then write (11) as

    r_y(τ) = Σ_{j=-m+1}^{m-1} β_j r_0(τ + j)    (25)

and therefore

    r_y = (r_y(1) ... r_y(p_y))^T
        = T_y(θ) (r_0(2-m) ... r_0(p_y+m-1))^T    (26)
        = T_y(θ) P_y r_z.    (27)

Here, T_y(θ) is a banded matrix holding the elements β_{-m+1}, ..., β_{m-1} in each row, shifted one position per row, and P_y is a selection matrix picking out the lags r_0(2-m), ..., r_0(p_y+m-1) from r_z by using the symmetry r_0(-τ) = r_0(τ). The matrix T_y(θ) has dimension p_y × (p_y + 2m - 2), and the dimension of P_y is (p_y + 2m - 2) × (p_y + m). In the matrix P_y all elements are zero, except one element equal to one in each row; in the first row, the element in column m - 1 is equal to one. It now holds that

    F_y(θ) = T_y(θ) P_y    (28)

The minimal dimension of r_z in (19) may be different for r_y, r_u and r_yu. To handle this, we set

    F_y(θ) = ( T_y(θ) P_y   0 )    (29)

to account for the possibly different lengths of r_z compatible with the vectors r_y, r_u and r_yu. The null matrix in (29) has k - p_y - m + 1 columns.

Next consider the matrix F_u(θ). The procedure is very analogous to the one above. First introduce

    α_j = Σ_{i=0}^{n} a_i a_{i+j}.    (30)
We can then write (9) as

    r_u(τ) = Σ_{j=-n}^{n} α_j r_0(τ + j),    (31)

and therefore

    r_u = (r_u(1) ... r_u(p_u))^T
        = T_u(θ) (r_0(1-n) ... r_0(p_u+n))^T
        = T_u(θ) P_u r_z.    (32)

Here, T_u(θ) is a banded matrix holding the elements α_{-n}, ..., α_n in each row, shifted one position per row, and P_u is a selection matrix picking out the lags r_0(1-n), ..., r_0(p_u+n) from r_z by using the symmetry r_0(-τ) = r_0(τ). The matrix T_u(θ) has dimension p_u × (p_u + 2n), and the dimension of P_u is (p_u + 2n) × (p_u + n + 1). In the matrix P_u all elements are zero, except one element equal to one in each row; in the first row, the element in column n is equal to one. It now holds that

    F_u(θ) = T_u(θ) P_u    (33)
or possibly

    F_u(θ) = ( T_u(θ) P_u   0 )    (34)

to account for the possibly different length of r_z compatible with the vectors r_y, r_u and r_yu. The null matrix in (34) has k - p_u - n columns.

Finally, we need to consider the matrix F_yu(θ). The procedure follows the same lines as before. Introduce

    γ_j = Σ_{i=1}^{m} b_i a_{i+j}.    (35)

We can then write (12) as

    r_yu(τ) = Σ_{j=-m}^{n-1} γ_j r_0(τ + j),    (36)

and therefore

    r_yu = (r_yu(p_1) ... r_yu(p_2))^T
         = T_yu(θ) (r_0(p_1 - m) ... r_0(p_2 + n - 1))^T
         = T_yu(θ) P_yu r_z.    (37)

Here, T_yu(θ) is a banded matrix holding the elements γ_{-m}, ..., γ_{n-1} in each row, shifted one position per row, and P_yu is a selection matrix picking out the lags r_0(p_1 - m), ..., r_0(p_2 + n - 1) from the vector (r_0(0) ... r_0(k_yu))^T by using the symmetry r_0(-τ) = r_0(τ). The lag k_yu is given by

    k_yu = max(-p_1 + m, p_2 + n - 1).    (38)

In (37), the matrix T_yu is (-p_1 + p_2 + 1) × (p_2 - p_1 + m + n), and the dimension of P_yu is (p_2 - p_1 + m + n) × (k_yu + 1). In the matrix P_yu all elements are zero, except one element equal to one in each row. In the first row, the element in column m - p_1 + 1 is equal to one. Depending on the values of p_1 and p_2, it may happen that this element appears in the rightmost column. It now holds that

    F_yu(θ) = T_yu(θ) P_yu    (39)

or

    F_yu(θ) = ( T_yu(θ) P_yu   0 )    (40)

to account for the possibly different length of r_z compatible with the vectors r_y, r_u and r_yu. The null matrix in (40) has k - k_yu columns.

5. USER CHOICES IN THE ALGORITHM

It may be worthwhile to emphasize the 'structure' of the problem, and to define it as the quadruple of integers

    p = ( p_y  p_u  p_1  p_2 )^T.    (41)

Note that the choice of structure is certainly in the hands of the user. How should p be chosen? What aspects should be considered in that choice? Below we briefly discuss some ideas.

(1) The choice of p must of course be such that a sufficient number of equations is obtained. This leads to some specific necessary conditions, see (46) below.
(2) It is by experience often an advantage to use covariance elements with small τ arguments. The main reason is that the estimated covariances then deviate only a little from their true values. For 'large' values of τ, the accuracy of an estimated covariance can be quite low.
(3) One should avoid making p too large, in the sense that an r̂ with many elements increases the dimension of the problem. However, the essential optimization problem will be of order n + m, and the dimension of r_z is really of no particular concern. The quantities in the loss function (23) do, though, have higher dimension when p has large elements.
(4) If more than the minimal number of equations is formed, one may consider active weighting in the loss function (21) by choosing Q ≠ I. An optimal weighting is

    Q = R^{-1},    (42)

where R is the covariance matrix of the vector r̂.
(5) If one expects the measurement noise in y, or in both u and y, to be correlated, the appropriate choices in p include taking p_y = 0, and p_y = p_u = 0, respectively.

In order to be able to solve the system of equations given in (19), it is necessary that there are at least as many equations as unknowns. This consideration leads to the compatibility condition

    dim(r_y) + dim(r_u) + dim(r_yu) ≥ dim(θ) + dim(r_z),    (43)

which is easily rewritten as

    p_y + p_u + (p_2 - p_1 + 1) ≥ n + m + k + 1.    (44)

The value of the integer k is to be chosen as the maximum lag of r_0(τ) that appears in the expressions (27), (32) and (37). This turns out to give

    k = max(p_y + m - 1, p_u + n, max(-p_1 + m, p_2 + n - 1)).    (45)

Therefore, the compatibility condition (44) becomes

    p_y + p_u + p_2 - p_1 - n - m ≥ max(p_y + m - 1, p_u + n, -p_1 + m, p_2 + n - 1).    (46)

One may ask if there is a natural choice of the user parameters p_y, p_u, p_1 and p_2 that satisfies condition (46). One can see that if p_y, p_u and p_2 are all chosen large enough, the condition will be fulfilled. On the other hand, large values of these integers imply that the dimension of r̂ is large, and that the computational load is increased.

Example 5.1. The choices

    p_1 = 0,  p_u = p_y = p_2 = n + m    (47)

lead to

    LHS = p_y + p_u + p_2 - p_1 - n - m = 2n + 2m,
    RHS = max(n + 2m - 1, 2n + m, m, 2n + m - 1) = max(n + 2m - 1, 2n + m) < LHS,

so the necessary compatibility condition (46) is satisfied.

More aspects on how to choose the structural parameters are given in Söderström et al. (2008).

6. RELATION TO OTHER APPROACHES

We have seen some different methods which are all based on a finite-dimensional r̂ as a starting point. A few general comments can be made on the relations between the approaches.
(1) The proposed covariance matching approach. An arbitrary form of the noise-free input is allowed. The underlying equations lead to an optimization of order n + m. The proposed method is hence somewhat more computationally demanding than BELS and the Frisch scheme. There is a specific parameterization of the spectrum of the noise-free input that allows arbitrary spectra, see (7), (8), and still leads to a low-dimensional optimization problem. The information in r̂ is used more efficiently here than in BELS and the Frisch scheme.

(2) Earlier covariance matching approaches, Mossberg (2008), Söderström et al. (2006). In these cases the noise-free input is modelled as a specific process, typically an ARMA process of some given order. The computational complexity is similar to that of the newly proposed method.

(3) Bias-eliminating least squares (BELS), Zheng (1998), Zheng (2002), and the Frisch scheme, Beghelli et al. (1990), Diversi et al. (2003). Here the underlying equations are relatively simple and lead to a low-dimensional (say dimension 2) optimization problem. There is no modelling at all of the noise-free input; an arbitrary form of the noise-free input is allowed.

7. NUMERICAL ILLUSTRATION

In this section we provide some numerical examples illustrating various properties of the proposed method. Effects of user parameters as well as comparisons with alternative approaches are given. In all cases the optimization (23) was initialized using the Frisch estimate of θ.

Example 7.1. Consider a second-order system given by

    y_0(t) = [(1.0 q^-1 + 0.5 q^-2) / (1 - 1.5 q^-1 + 0.7 q^-2)] u_0(t).    (48)

The noise-free input u_0(t) is an ARMA(1,1) process

    u_0(t) = [(1 + 0.7 q^-1) / (1 - 0.5 q^-1)] e(t),    (49)

where e(t) is a zero-mean white noise with unit variance. The white measurement noises at the input and output sides both have unit variance. The signal to noise ratios at the input and output sides equal 5.8 dB and 20.4 dB, respectively. For each simulation result, N = 1000 data points are used and M = 200 realizations are made.

For the covariance matching approach (CM for short in the following), the user parameters were first chosen as

    p_y = 3,  p_u = 2,  p_1 = -2,  p_2 = 3.    (50)

Then, one additional covariance element is added. Four cases are considered, by changing one of the user parameters at a time:

    Case   Parameters
    1      (50)
    2      p_y = 4
    3      p_u = 3
    4      p_1 = -3
    5      p_2 = 4

No weighting matrix is used, i.e. Q = I. The results are shown in Figure 1, where the normalized mean square errors

    NMSE(θ_k) = N E{(θ̂_k - θ_k)²}    (51)

are displayed.

Fig. 1. CM with different parameters p_y, p_u, p_1, p_2. (The figure shows NMSE(a_1), NMSE(a_2), NMSE(b_1) and NMSE(b_2) for the five cases.)

Optimal weighting is also examined, i.e. Q = Q_opt = R^{-1}. The matrix R was found by using a large number of realizations and computing the statistics of the vector r̂. (This is not a way to proceed from one given data set, but it should give a result close to a lower bound of what can be achieved with an almost optimal weighting.) The results are plotted in Figure 2.

Fig. 2. CM with optimal weighting and different parameters p_y, p_u, p_1, p_2.

A few observations are in order:
(1) The different cases give relatively similar performances.
(2) It does not always pay, in terms of lower MSE, to increase the number of covariance elements, unless proper weighting is applied.
(3) The covariance elements of r̂_y seem to be the ones richest in information, in the sense that the lowest NMSE values are obtained when the parameter p_y is increased.

Example 7.2. The system and the input signal are the same as in Example 7.1. Here, the performance of CM is compared to BELS, Zheng (1998), and the Frisch scheme with comparison of residual statistics, Frisch-CR, Diversi et al. (2003). To make the comparisons fair and neutral, we tried to find conditions such that the same covariance elements are used in all three methods. For CM we selected the user parameters as

    p_y = n + ℓ,  p_u = m + ℓ - 1,
    p_1 = 1 - n - ℓ,  p_2 = m + ℓ,    (52)

and let ℓ vary. It follows that the total number of covariance elements will be

    p = p_y + p_u + p_2 - p_1 + 1 = 7 + 4ℓ.    (53)
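As a quick numerical cross-check of these choices (our own sketch, not code from the paper), one can verify that (52) gives the element count (53) and satisfies the compatibility condition (46) for the orders n = m = 2 of Example 7.1:

```python
# n = m = 2 are the orders of the system in Example 7.1
n, m = 2, 2

def structure(ell):
    """The quadruple p = (p_y, p_u, p_1, p_2) of eq. (52) for a given ell."""
    return n + ell, m + ell - 1, 1 - n - ell, m + ell

for ell in range(1, 6):
    py, pu, p1, p2 = structure(ell)
    # total number of covariance elements, eq. (53)
    assert py + pu + (p2 - p1) + 1 == 7 + 4 * ell
    # necessary compatibility condition, eq. (46)
    lhs = py + pu + p2 - p1 - n - m
    rhs = max(py + m - 1, pu + n, -p1 + m, p2 + n - 1)
    assert lhs >= rhs

# For ell = 1 the choices reduce to those of eq. (50)
assert structure(1) == (3, 2, -2, 3)
```

Note that for ℓ = 1 the structure coincides with the baseline choice (50) of Example 7.1.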
In the simulations, the values ℓ = 1, 2, ..., 5 were tried. For CM, both no weighting and optimal weighting, as implemented in Example 7.1, were tried. For BELS, the appended vector was chosen as

    ϕ(t) = (-y(t-n-1), ..., -y(t-n-ℓ), u(t-m-1), ..., u(t-m-ℓ))^T.    (54)

In Frisch-CR, the number of covariance lags of the residuals is chosen as ℓ, and no weighting is applied. For each estimator, the normalized mean square errors of the parameters are shown in Figure 3. For BELS a few realizations were omitted, since this method occasionally showed bad convergence, especially when ℓ is large.

Fig. 3. NMSE of the estimates for CM, BELS, Frisch-CR and CM with optimal weighting, as functions of the number of covariance elements p.

The following observations can be made:
(1) The covariance matching approach gives significantly better performance than BELS and Frisch-CR. This means that the information in the covariance data r̂ is used in a more efficient way by the CM methods.
(2) For CM, optimal weighting gives a marginal improvement over no weighting for the A-coefficients. For the B-coefficients the improvement is significant when ℓ is 'large'.

8. CONCLUDING REMARKS

A new method, based on covariance matching, has been proposed for identifying errors-in-variables systems. No particular structural assumption is made about the noise-free input signal. As a first step, a few sample covariance elements of the noisy input-output data are computed. The parameter estimates are then found by solving a variable projection optimization problem originating from a nonlinear least squares formulation. The method thus has much in common with bias-eliminating least squares (BELS) and the Frisch scheme, but shows significantly better performance of the estimates. There are ways to include an optimal weighting in the problem formulation, so as to make maximally efficient use of the information in the computed sample covariances, which may give a significant improvement of the estimates.

REFERENCES
S. Beghelli, R. P. Guidorzi, and U. Soverini. The Frisch scheme in dynamic system identification. Automatica, 26:171–176, 1990.
S. Beghelli, P. Castaldi, R. Guidorzi, and U. Soverini. A robust criterion for model selection in identification from noisy data. In Proc. 9th International Conference on Systems Engineering, pages 480–484, Las Vegas, Nevada, USA, 1993.
R. Diversi, R. Guidorzi, and U. Soverini. A new criterion in EIV identification and filtering applications. In Proc. 13th IFAC Symposium on System Identification, pages 1993–1998, Rotterdam, The Netherlands, August 27-29 2003.
M. Ekman. Identification of linear systems with errors in variables using separable nonlinear least squares. In 16th IFAC World Congress, Prague, Czech Republic, July 04-08 2005.
M. Ekman, M. Hong, and T. Söderström. A separable nonlinear least squares approach for identification of linear systems with errors in variables. In Proc. 14th IFAC Symposium on System Identification, Newcastle, Australia, March 29-31 2006.
G. H. Golub and V. Pereyra. The differentiation of pseudoinverses and nonlinear least squares problems whose variables separate. SIAM J. Numerical Analysis, 10(2):413–432, 1973.
G. H. Golub and V. Pereyra. Separable nonlinear least squares: The variable projection method and its applications. Inverse Problems, 19(2):R1–R26, April 2003.
M. Hong and T. Söderström. Relations between bias-eliminating least squares, the Frisch scheme and extended compensated least squares methods for identifying errors-in-variables systems. Automatica, 45(1):277–282, January 2009.
M. Mossberg. Errors-in-variables identification through covariance matching: Analysis of a colored measurement noise case. In Proc. American Control Conference, pages 1310–1315, Seattle, WA, June 11-13 2008.
T. Söderström. Errors-in-variables methods in system identification. Automatica, 43(6):939–958, June 2007. Survey paper.
T. Söderström. Extending the Frisch scheme for errors-in-variables identification to correlated noise. International Journal of Adaptive Control and Signal Processing, 22(1):55–73, February 2008.
T. Söderström, E. K. Larsson, K. Mahata, and M. Mossberg. Using continuous-time modeling for errors-in-variables identification. In Proc. 14th IFAC Symposium on System Identification, Newcastle, Australia, March 29-31 2006.
T. Söderström, M. Mossberg, and M. Hong. A covariance matching approach for identifying errors-in-variables systems. Automatica, 2008. Submitted.
S. Van Huffel, editor. Recent Advances in Total Least Squares Techniques and Errors-in-Variables Modelling. SIAM, Philadelphia, USA, 1997.
S. Van Huffel and Ph. Lemmerling, editors. Total Least Squares and Errors-in-Variables Modelling. Analysis, Algorithms and Applications. Kluwer, Dordrecht, The Netherlands, 2002.
W. X. Zheng. Transfer function estimation from noisy input and output data. International Journal of Adaptive Control and Signal Processing, 12:365–380, 1998.
W. X. Zheng. A bias correction method for identification of linear dynamic errors-in-variables models. IEEE Transactions on Automatic Control, 47(7):1142–1147, July 2002.