Automatica — Contents lists available at ScienceDirect. Journal homepage: www.elsevier.com/locate/automatica
Optimally conditioned instrumental variable approach for frequency-domain system identification✩

Robbert van Herpen¹, Tom Oomen, Maarten Steinbuch

Eindhoven University of Technology, Department of Mechanical Engineering, Control Systems Technology Group, Building GEM-Z, PO Box 513, 5600 MB Eindhoven, The Netherlands

Article history:
Received 19 July 2013
Received in revised form 20 December 2013
Accepted 7 May 2014

Keywords: System identification; Identification algorithms; Numerical methods; Matrix inversion; Condition numbers

Abstract: Accurate frequency-domain system identification demands reliable computational algorithms. The aim of this paper is to develop a new algorithm for parametric system identification with favorable convergence properties and optimal numerical conditioning. Recent results in frequency-domain instrumental variable identification are exploited, which lead to enhanced convergence properties compared to classical identification algorithms. In addition, bi-orthonormal polynomials with respect to a data-dependent bi-linear form are introduced for system identification. Hereby, optimal numerical conditioning of the relevant system of equations is achieved. This is shown to be particularly important for the class of instrumental variable algorithms, whose condition numbers are typically the square of those encountered in alternative frequency-domain identification algorithms. The superiority of the proposed algorithm is demonstrated by means of both simulation and experimental results. © 2014 Elsevier Ltd. All rights reserved.

1. Introduction

Frequency-domain system identification (McKelvey, 2002; Pintelon & Schoukens, 2001) is of significant importance for a broad class of applications, since it enables (i) straightforward data reduction, (ii) straightforward combination of multiple data sets, (iii) direct estimation and use of nonparametric noise models, and (iv) a direct connection to control-relevant identification criteria. Many parametric identification techniques based on frequency-domain data involve a nonlinear least-squares problem. Here, the nonlinearity arises from the parametrization of the poles in the denominator polynomial. In Levy (1959), the nonlinear problem is approximated using a single linear least-squares problem. However, this introduces an a priori unknown weighting function. The

✩ This work is supported by ASML Research, Veldhoven, The Netherlands, and the Innovational Research Incentives Scheme under VENI grant ‘‘Precision Motion: Beyond the Nanometer’’ (no. 13073) awarded by NWO (The Netherlands Organisation for Scientific Research) and STW (Dutch Science Foundation). The material in this paper was partially presented at the 51st IEEE Conference on Decision and Control (CDC), December 10–13, 2012, Maui, Hawaii, USA. This paper was recommended for publication in revised form by Associate Editor Wei Xing Zheng under the direction of Editor Torsten Söderström. E-mail addresses: [email protected] (R. van Herpen), [email protected] (T. Oomen), [email protected] (M. Steinbuch). 1 Tel.: +31 40 247 5444; fax: +31 40 246 1418.

http://dx.doi.org/10.1016/j.automatica.2014.07.002 0005-1098/© 2014 Elsevier Ltd. All rights reserved.

SK-algorithm (Sanathanan & Koerner, 1963) mitigates the effect of such weighting through iterations. In Bayard (1994) and de Callafon, de Roover, and Van den Hof (1996), the SK-algorithm is generalized to multivariable systems. Nevertheless, two aspects require further attention.

On the one hand, frequency-domain identification problems are typically numerically ill-conditioned. Several partial solutions exist, including (i) frequency scaling (Pintelon & Kollár, 2005), (ii) amplitude scaling (Hakvoort & Van den Hof, 1994), and (iii) the use of orthonormal polynomials and orthonormal rational functions with respect to a continuous inner product, see, e.g., Heuberger, Van den Hof, and Wahlberg (2005) and Ninness and Hjalmarsson (2001) for a connection with numerical properties. These approaches confirm that ill-conditioning is an important aspect in system identification applications, and they typically improve numerical conditioning. However, these partial solutions may be insufficient to reliably solve complex frequency-domain identification problems. Therefore, in Oomen and Steinbuch (to appear) and van Herpen, Oomen, and Bosgra (2012b), an approach is presented that leads to optimal numerical conditioning of the SK-algorithm by using polynomials that are orthonormal with respect to a data-based discrete inner product, see Reichel, Ammar, and Gragg (1991) and Van Barel and Bultheel (1995) for a definition and earlier results.

On the other hand, the fixed point of the SK-algorithm generally does not correspond to a (local) minimum of the nonlinear least-squares criterion, as shown in Whitfield (1987). Consequently, the

SK-algorithm is typically used as an initialization for subsequent Gauss–Newton iterations, see, e.g., Bayard (1994) and Pintelon and Schoukens (2001, Section 7.9.1), which guarantees convergence to a (local) minimum. Recently, in Douma (2006, Sections 3.5.3 and 3.5.8), an alternative frequency-domain identification algorithm has been formulated, in which a fixed point of the iterations corresponds to an optimum of the objective function. This renders a Gauss–Newton iteration superfluous, potentially enabling an increase of algorithm efficiency. The new algorithm, which has been extended towards multivariable systems in Blom and Van den Hof (2010), takes the form of an iterative instrumental variable method, see also Stoica and Söderström (1981) and Young (1976) for earlier results in this direction.

Although the result in Blom and Van den Hof (2010) and Douma (2006) potentially reduces the number of iterations in frequency-domain identification, a direct implementation of the algorithm exhibits poor numerical properties. This is further supported in this paper, both by means of a theoretical analysis and a numerical example. In fact, the condition numbers associated with the algorithm are quadratically larger than for the standard SK-iterations. This obstructs a reliable and accurate computation of the optimal model. In addition, the approach in Reichel et al. (1991), Van Barel and Bultheel (1995) and van Herpen et al. (2012b) for optimal conditioning of the SK-iterations does not apply to the algorithm in Blom and Van den Hof (2010) due to the lack of an appropriate inner product.

The main contribution of this paper is a new framework for frequency-domain system identification based on a nonlinear least-squares criterion, which (i) provides advantageous convergence properties, and (ii) ensures optimal numerical conditioning (κ = 1).
Essentially, the proposed solution exploits the results in Blom and Van den Hof (2010), while providing optimal numerical conditioning in the spirit of Reichel et al. (1991), Van Barel and Bultheel (1995) and van Herpen et al. (2012b), albeit through a fundamentally different mechanism. In particular, the new algorithm relies on the introduction of bi-orthonormal polynomial bases in system identification. Recently, in Gilson, Welsh, and Garnier (2013) and Welsh and Goodwin (2003), the need for enhancement of numerical conditioning in frequency-domain instrumental variable identification has been confirmed and some enhancements have been obtained by using an alternative polynomial basis. The approach in this paper reformulates the instrumental-variable algorithm using bi-orthonormal polynomials with respect to a data-dependent bi-linear form, which leads to optimal numerical conditioning, i.e., κ = 1. The following specific contributions are presented in this paper. (C1) The numerical conditioning that is associated with the linear system of equations of the algorithm in Blom and Van den Hof (2010) is quadratically larger than the condition numbers encountered in the standard SK-iterations (Bayard, 1994; de Callafon et al., 1996; Sanathanan & Koerner, 1963), as is shown both theoretically and by means of a numerical example. (C2) The algorithm in Blom and Van den Hof (2010) has the interpretation of an instrumental variable method. Such type of method admits a transformation of instruments (Söderström & Stoica, 1983). This freedom is exploited to formulate the algorithm in two distinct polynomial bases: one for the model (at the present iteration) and one for the instrument. (C3) Optimal numerical conditioning is achieved by selecting polynomial bases that are bi-orthonormal with respect to a data-dependent bi-linear form. This bi-linear form accounts for the asymmetric and indefinite character of instrumental variable problems. As a special case, the bi-orthonormal

polynomial bases include the orthonormal polynomials with respect to a data-based discrete inner product in Reichel et al. (1991), Van Barel and Bultheel (1995) and van Herpen et al. (2012b).

(C4) Identification of a SISO rational transfer function requires modeling of a numerator and a denominator polynomial. Thus, this paper considers a 2 × 1 vector-polynomial, which is developed in terms of a 2 × 2 block-polynomial basis. The construction of bi-orthonormal block-polynomials from given frequency response data is presented for continuous-time systems. It is shown that an efficient construction using three-term-recurrence relations is possible, where the recursion coefficients are obtained from a matrix block-tridiagonalization problem with 2 × 2 blocks.

(C5) Superiority of the proposed algorithm is shown by means of a simulation example and is experimentally validated on an industrial motion system.

This paper extends the results in van Herpen, Oomen, and Bosgra (2012a), in which optimal conditioning of asymmetric polynomial equalities using bi-orthonormal polynomials is introduced, by (i) explicitly connecting bi-orthonormal polynomials with instrumental variable identification (C2)–(C3), and (ii) extending the construction of scalar bi-orthonormal polynomials towards 2 × 2 block-polynomials (C4). The latter result facilitates the estimation of a numerator–denominator vector-polynomial, which enables a confrontation of the proposed method with frequency-domain identification problems (C5). In van Herpen (2014, Chap. 2), a complementary study of relevant aspects in the theory of bi-orthonormal polynomials is provided.

This paper is organized as follows. In Section 2, the frequency-domain identification problem is posed and two iterative algorithms are compared with respect to their convergence properties. In Section 3, the numerical properties of both algorithms are evaluated, motivating the need for enhancement of numerical conditioning (C1).
Then, in Section 4, bi-orthonormal polynomials are introduced in frequency-domain system identification, which provides optimal numerical conditioning (C2)–(C3). Subsequently, in Section 5, the construction of bi-orthonormal polynomials using three-term-recurrence relations is presented (C4). In Section 6, an experimental validation of the benefits of the new algorithm for frequency-domain system identification is provided (C5). Conclusions are drawn in Section 7.

Notation: throughout this paper, ξ represents either s = ȷω or z = e^ȷω, ȷ = √−1, where ω ∈ R denotes a frequency. Moreover, R^{p×q}[ξ] denotes a p × q matrix of real polynomials in ξ. Finally, A∗ denotes the conjugate of A, whereas A^H denotes the conjugate transpose of A.

Scope: for clarity of the exposition, attention is restricted to identification of SISO systems. The results in this paper can be generalized to the multivariable situation along conceptually similar lines. In this case, matrix fraction descriptions (MFDs), see, e.g., Kailath (1980, Chap. 6), provide a suitable framework as they directly connect to state-space models. Note that besides model order selection, i.e., the McMillan degree of the model, such multivariable models also require the selection of Kronecker indices. The reader is referred to Gevers (1986) and Moore (1981) for further information. In the second part of the paper, the construction of bi-orthonormal block-polynomial bases is presented. Here, attention is restricted to polynomials in the s-domain. The construction of bi-orthonormal block-polynomial bases in the z-domain is conceptually similar.


2. Iterative frequency-domain identification algorithms

2.1. Identification criterion

In this paper, the frequency-domain identification problem min_θ V(θ) is considered, where

\[
V(\theta) := \sum_{k=1}^{m} \left| W(\xi_k)\big(P_o(\xi_k) - \hat{P}(\xi_k,\theta)\big) \right|^2. \tag{1}
\]

In (1), P_o(ξ_k), k = 1, …, m, are frequency response function (FRF) measurements of the true system, see Pintelon and Schoukens (2001, Chap. 2) for further details. The weighting function W(ξ_k) enables a variety of system identification criteria, cf. Pintelon, Guillaume, Rolain, Schoukens, and Van hamme (1994) for an overview. Common choices are maximum-likelihood identification (McKelvey, 2002; Pintelon & Schoukens, 1990) and control-relevant identification (Gevers, 1993), see also Oomen and Bosgra (2008, Sect. VI).

2.2. Model parametrization

The model parametrization that is considered in this paper is

\[
\hat{P}(\xi,\theta) = \frac{n(\xi,\theta)}{d(\xi,\theta)}, \tag{2}
\]

where n(ξ, θ), d(ξ, θ) ∈ R[ξ]. The numerator and denominator polynomial are formulated as a degree n vector-polynomial

\[
\begin{bmatrix} d(\xi,\theta) \\ n(\xi,\theta) \end{bmatrix} = \sum_{j=0}^{n} \varphi_j(\xi)\,\theta_j,
\qquad
\theta = \begin{bmatrix} \theta_0^T & \theta_1^T & \cdots & \theta_n^T \end{bmatrix}^T. \tag{3}
\]

Here, θ_j ∈ R^{2×1} and φ_j(ξ) ∈ R^{2×2}[ξ] is of strict degree j, i.e.,

\[
\varphi_j(\xi) = \xi^j \cdot s_{jj} + \cdots + \xi \cdot s_{j1} + s_{j0}, \tag{4}
\]

where s_{j0}, s_{j1}, …, s_{jj} ∈ R^{2×2} and s_{jj} is assumed upper triangular and invertible. In system identification, it is common to let θ_n in (3) be determined by a degree constraint, see also De Moor, Gevers, and Goodwin (1994).

Assumption 1. In this paper, it is assumed that (i) P̂(ξ, θ) is strictly proper and (ii) d(ξ, θ) is monic. Hence, by virtue of (4),

\[
\theta_n = s_{nn}^{-1} \begin{bmatrix} 1 & 0 \end{bmatrix}^T.
\]

Example 2. A common polynomial basis is the monomial basis φ_j^mon(ξ) = ξ^j · I_2. For this basis, (3) corresponds to

\[
\begin{bmatrix} d(\xi,\theta) \\ n(\xi,\theta) \end{bmatrix}
= \xi^n \cdot \begin{bmatrix} \theta_{n1} \\ \theta_{n2} \end{bmatrix} + \cdots + \xi \cdot \begin{bmatrix} \theta_{11} \\ \theta_{12} \end{bmatrix} + \begin{bmatrix} \theta_{01} \\ \theta_{02} \end{bmatrix},
\]

where, as a consequence of Assumption 1, θ_n = [1 0]^T.

Due to the model parametrization (2)–(3), criterion (1) is typically non-convex. In the next sections, two iterative algorithms for the minimization of (1) are investigated.

2.3. Traditional iterative algorithm

In this section, the SK-algorithm (Sanathanan & Koerner, 1963) is formulated. Observe that (1) is equivalent to

\[
V(\theta) = \sum_{k=1}^{m} \left| \frac{W(\xi_k)}{d(\xi_k,\theta)} \begin{bmatrix} P_o(\xi_k) & -1 \end{bmatrix} \begin{bmatrix} d(\xi_k,\theta) \\ n(\xi_k,\theta) \end{bmatrix} \right|^2. \tag{5}
\]

The rationale that motivates the SK-algorithm is that V(θ) is convex in θ if the weighting W(ξ_k)/d(ξ_k, θ^⟨i−1⟩) is given.

Algorithm I (SK-Iteration). Let θ^⟨0⟩ be given. In iteration i = 1, 2, …, determine the least-squares solution to

\[
\min_{\theta^{\langle i\rangle}} \sum_{k=1}^{m} \left| \frac{W(\xi_k)}{d(\xi_k,\theta^{\langle i-1\rangle})} \begin{bmatrix} P_o(\xi_k) & -1 \end{bmatrix} \begin{bmatrix} d(\xi_k,\theta^{\langle i\rangle}) \\ n(\xi_k,\theta^{\langle i\rangle}) \end{bmatrix} \right|^2. \tag{6}
\]

By inserting (3), it follows that (6) can be recast as the linear least-squares problem

\[
\min_{\vartheta^{\langle i\rangle}} \left\| W_1 \Phi\, \vartheta^{\langle i\rangle} - b_n \right\|_2^2, \tag{7}
\]

where

\[
W_1 = \begin{bmatrix} w_{11} & & \\ & \ddots & \\ & & w_{1m} \end{bmatrix},
\qquad
w_{1k} = \frac{W(\xi_k)}{d(\xi_k,\theta^{\langle i-1\rangle})} \begin{bmatrix} P_o(\xi_k) & -1 \end{bmatrix}, \tag{8}
\]

\[
\Phi = \begin{bmatrix}
\varphi_0(\xi_1) & \varphi_1(\xi_1) & \cdots & \varphi_{n-1}(\xi_1) \\
\varphi_0(\xi_2) & \varphi_1(\xi_2) & \cdots & \varphi_{n-1}(\xi_2) \\
\vdots & \vdots & & \vdots \\
\varphi_0(\xi_m) & \varphi_1(\xi_m) & \cdots & \varphi_{n-1}(\xi_m)
\end{bmatrix},
\quad
\Phi_n = \begin{bmatrix} \varphi_n(\xi_1) \\ \varphi_n(\xi_2) \\ \vdots \\ \varphi_n(\xi_m) \end{bmatrix},
\quad
\vartheta^{\langle i\rangle} = \begin{bmatrix} \theta_0^{\langle i\rangle T} & \theta_1^{\langle i\rangle T} & \cdots & \theta_{n-1}^{\langle i\rangle T} \end{bmatrix}^T, \tag{9}
\]

and

\[
b_n = -W_1 \Phi_n \theta_n^{\langle i\rangle}. \tag{10}
\]

Note that in (7), the vector b_n is determined by the imposed degree constraint, cf. Assumption 1.

Remark 3. Associated with (7) is an orthogonal projection that is defined through the normal equations

\[
\big( \Phi^H W_1^H W_1 \Phi \big)\, \vartheta^{\langle i\rangle} = \Phi^H W_1^H b_n. \tag{11}
\]

Algorithm I is widely applied for frequency-domain system identification (Pintelon et al., 1994). Indeed, the SK-algorithm typically yields a small value of the cost function V(θ) in (1). However, the fixed point of the iterations is generally not a (local) optimum of V(θ), as is proven in Whitfield (1987).
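Per iteration, (7) is an ordinary weighted linear least-squares problem. The following minimal sketch illustrates Algorithm I for a scalar example in the monomial basis, assuming W(ξ_k) = 1; the function name `sk_iteration` and the second-order test system are illustrative choices, not taken from the paper, and the 2 × 2 block-polynomial machinery is specialized to scalar monomials.

```python
# Minimal SK-iteration (Algorithm I) sketch for a SISO model in the
# monomial basis, assuming W(xi_k) = 1. Illustrative only.
import numpy as np

def sk_iteration(xi, Po, n, iters=10):
    """Fit a strictly proper model n(xi)/d(xi) with monic degree-n
    denominator to FRF samples Po(xi_k) by iteratively reweighted
    linear least squares, cf. (6)-(7)."""
    d_prev = np.ones(len(xi), dtype=complex)   # d(xi_k, theta^<0>) = 1
    for _ in range(iters):
        w = 1.0 / d_prev                       # weight 1/d(., theta^<i-1>)
        # Columns: w*Po*xi^j (denominator parameters, j = 0..n-1) and
        # -w*xi^j (numerator parameters, j = 0..n-1), cf. (8)-(9).
        A = np.column_stack([w * Po * xi**j for j in range(n)]
                            + [-w * xi**j for j in range(n)])
        b = -w * Po * xi**n                    # from the monic constraint, cf. (10)
        theta, *_ = np.linalg.lstsq(A, b, rcond=None)
        dc, nc = theta[:n], theta[n:]          # coefficients, low to high degree
        d_prev = xi**n + sum(dc[j] * xi**j for j in range(n))
    Pfit = np.polynomial.polynomial.polyval(xi, nc) / d_prev
    return dc, nc, Pfit

# Usage: noise-free data of P(s) = 1 / (s^2 + 0.2 s + 1).
s = 1j * np.linspace(0.1, 3.0, 60)
Po = 1.0 / (s**2 + 0.2 * s + 1.0)
dc, nc, Pfit = sk_iteration(s, Po, n=2)
print(np.max(np.abs(Pfit - Po)))               # essentially zero: exact model class
```

On noise-free data from a system inside the model class, the first iteration (the Levy solution) is already exact; with noise or undermodeling, the reweighting by the previous denominator is what distinguishes the SK-iterations.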

2.4. Refined iterative algorithm

In this section, a refined iterative algorithm is considered, in which stationary points are a (local) minimum of V(θ) in (1). Consider the first order necessary condition for optimality

\[
\frac{\partial V(\theta)}{\partial \theta^T} = 0. \tag{12}
\]

Since (1) equals V(θ) = Σ_{k=1}^m ε^H(ξ_k, θ) ε(ξ_k, θ), where ε(ξ_k, θ) := W(ξ_k)(P_o(ξ_k) − P̂(ξ_k, θ)), (12) equals

\[
\sum_{k=1}^{m} \zeta^H(\xi_k,\theta)\,\varepsilon(\xi_k,\theta) = 0, \tag{13}
\]

where

\[
\zeta(\xi_k,\theta) := \frac{\partial \varepsilon(\xi_k,\theta)}{\partial \theta^T} = -W(\xi_k)\,\frac{\partial \hat{P}(\xi_k,\theta)}{\partial \theta^T}. \tag{14}
\]

Thus, a (local) optimum of V(θ) is attained when

\[
\sum_{k=1}^{m} \left( \frac{-\partial \hat{P}(\xi_k,\theta)}{\partial \theta^T} \right)^H W^H(\xi_k)\, W(\xi_k)\,\big(P_o(\xi_k) - \hat{P}(\xi_k,\theta)\big) = 0. \tag{15}
\]

From the model parametrization (2)–(3) it follows that (15) is a nonlinear equality in θ. In order to solve this equality, the following iterative algorithm is proposed in Blom and Van den Hof (2010) and Douma (2006).

Algorithm II (IV-Iteration). Let θ^⟨0⟩ be given. In iteration i = 1, 2, …, solve the linear system of equations

\[
\sum_{k=1}^{m} \left( \frac{-\partial \hat{P}(\xi_k,\theta)}{\partial \theta^T}\bigg|_{\theta=\theta^{\langle i-1\rangle}} \right)^H W^H(\xi_k) \cdot \frac{W(\xi_k)}{d(\xi_k,\theta^{\langle i-1\rangle})} \begin{bmatrix} P_o(\xi_k) & -1 \end{bmatrix} \begin{bmatrix} d(\xi_k,\theta^{\langle i\rangle}) \\ n(\xi_k,\theta^{\langle i\rangle}) \end{bmatrix} = 0. \tag{16}
\]

In the following lemma, (16) is cast in the form of a standard linear system of equations.

Lemma 4. By inserting (3) into (16), the linear system of equations

\[
\big( \Phi^H W_2^H W_1 \Phi \big)\, \vartheta^{\langle i\rangle} = \Phi^H W_2^H b_n \tag{17}
\]

is obtained, where W_1, Φ, ϑ^⟨i⟩, and b_n are defined in (8)–(10), and

\[
W_2 = \begin{bmatrix} w_{21} & & \\ & \ddots & \\ & & w_{2m} \end{bmatrix},
\qquad
w_{2k} = \frac{W(\xi_k)}{d(\xi_k,\theta^{\langle i-1\rangle})} \begin{bmatrix} \hat{P}(\xi_k,\theta^{\langle i-1\rangle}) & -1 \end{bmatrix}. \tag{18}
\]

Proof. By virtue of

\[
\frac{\partial \hat{P}(\xi_k,\theta)}{\partial \theta^T}
= \frac{1}{d(\xi_k,\theta)} \begin{bmatrix} -\hat{P}(\xi_k,\theta) & 1 \end{bmatrix} \frac{\partial}{\partial \theta^T} \begin{bmatrix} d(\xi_k,\theta) \\ n(\xi_k,\theta) \end{bmatrix}, \tag{19}
\]

(14) equals

\[
\zeta(\xi_k,\theta) = \frac{W(\xi_k)}{d(\xi_k,\theta)} \begin{bmatrix} \hat{P}(\xi_k,\theta) & -1 \end{bmatrix} \cdot \begin{bmatrix} \varphi_0(\xi_k) & \varphi_1(\xi_k) & \cdots & \varphi_n(\xi_k) \end{bmatrix}. \tag{20}
\]

Then, (17) follows after writing (16) in matrix form, where b_n follows from the degree constraint in Assumption 1. □

Remark 5. Note that the system of Eqs. (17) represents an oblique, i.e., non-orthogonal, projection.

Compared to Algorithm I, the essential advantage of Algorithm II is that upon convergence, it is guaranteed that a (local) optimum of V(θ) in (1) is attained. In the forthcoming section, the numerical properties of both algorithms are investigated.

3. Numerical analysis of the iterative algorithms

In this section, the numerical conditioning associated with Algorithms I and II is further investigated. The iterative frequency-domain identification algorithms that are presented in Section 2 both rely on solving a linear system of equations of the form Ax = b. The accuracy of the solution in each iteration is determined by the condition number κ(A) (Golub & Van Loan, 1989, Sect. 5.3.7), which is defined as the ratio of the largest to the smallest singular value,

\[
\kappa(A) = \bar{\sigma}(A)/\underline{\sigma}(A).
\]

3.1. Numerical ill-conditioning: increased importance for the IV-algorithm

In this section, it is shown that the use of a standard polynomial basis in Algorithm II can lead to extremely large condition numbers, which obstructs a reliable implementation of the algorithm. This forms the theoretical aspect of Contribution C1.

First, the SK-iteration in Algorithm I is considered, in which (7) is the essential computational step. In each iteration, the least-squares solution to

\[
W_1 \Phi\, \vartheta^{\langle i\rangle} = b_n \tag{21}
\]

is computed, with the associated condition number κ(W_1 Φ). This condition number crucially depends on the choice of polynomial basis in (3). A common choice is the monomial basis, which is applied in, e.g., Bayard et al. (1991) and de Callafon et al. (1996). However, this typically leads to κ(W_1 Φ) ≫ 1, as is illustrated next.

Example 6. Consider φ_j^mon(ξ) = ξ^j · I_2, j = 0, 1, …, n, see Example 2. Then, (9) yields the block-Vandermonde matrix

\[
\Phi^{\mathrm{mon}} = \begin{bmatrix}
I_2 & \xi_1 \cdot I_2 & \cdots & \xi_1^{n-1} \cdot I_2 \\
I_2 & \xi_2 \cdot I_2 & \cdots & \xi_2^{n-1} \cdot I_2 \\
\vdots & \vdots & & \vdots \\
I_2 & \xi_m \cdot I_2 & \cdots & \xi_m^{n-1} \cdot I_2
\end{bmatrix}.
\]

Now, let ξ_k = ȷ·2π·{1, 10, 100, 1000} and n = 3. Then, κ(Φ^mon) = 3.0 · 10^7 ≫ 1.

Example 6 shows that, for frequency points on the imaginary axis or for non-equidistantly spaced frequency points on the unit circle, the use of the monomial basis typically leads to κ(W_1 Φ) ≫ 1 (assuming W_1 = I). In other words, (21) is typically numerically ill-conditioned, leading to an inaccurate numerical solution of Algorithm I (in a worst-case sense).

Next, consider Algorithm II, with the system of Eqs. (17) as key computational step. The solution accuracy depends on κ(Φ^H W_2^H W_1 Φ). The following result reveals that this condition number is quadratically worse than κ(W_1 Φ), associated with the SK-algorithm.

Result 7. Consider the desired situation where, upon convergence of Algorithm II, P̂(ξ_k, θ⋆) ≈ P_o(ξ_k), k = 1, …, m, where θ⋆ denotes a fixed point of the iterations (16). In that case, W_2 ≈ W_1, cf. (8) and (18). As a result,

\[
\kappa(\Phi^H W_2^H W_1 \Phi) \approx \kappa(\Phi^H W_1^H W_1 \Phi) = \kappa(W_1 \Phi)^2. \tag{22}
\]

Result 7 reveals that the use of monomial basis polynomials in Algorithm II, as considered in Blom and Van den Hof (2010), potentially leads to a severely ill-conditioned system of equations (17).
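The figures in Example 6 and Result 7 can be reproduced numerically. The sketch below assumes W_1 = I; since a Kronecker product with I_2 leaves the singular values unchanged, the scalar Vandermonde matrix already determines κ(Φ^mon).

```python
# Condition numbers for Example 6 (block-Vandermonde, monomial basis)
# and Result 7 (quadratic worsening for the Gram/IV-type product),
# assuming W1 = I.
import numpy as np

xi = 1j * 2 * np.pi * np.array([1.0, 10.0, 100.0, 1000.0])  # xi_k, Example 6
n = 3
V = np.column_stack([xi**j for j in range(n)])   # scalar Vandermonde, m x n
Phi = np.kron(V, np.eye(2))                      # block-Vandermonde Phi^mon

kappa = np.linalg.cond(Phi)
print(f"kappa(Phi^mon)   = {kappa:.1e}")         # 3.0e7, cf. Example 6

# Result 7: forming the product squares the condition number.
kappa_sq = np.linalg.cond(Phi.conj().T @ Phi)
print(f"kappa(Phi^H Phi) = {kappa_sq:.1e}")      # roughly kappa^2
```

The second condition number also illustrates Remark 8 below: solving the normal equations (11) instead of a QR-based least-squares solve already incurs this squaring for Algorithm I.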

Remark 8. In view of (22), it is favorable to solve the least-squares problem (7) through a QR-factorization of W_1 Φ in (21), instead of solving the normal Eqs. (11), see also Golub and Van Loan (1989, Sect. 5.3.8). However, this approach does not apply to the oblique projection (17).

3.2. Classical solutions for improvement of conditioning: limitations on applicability to the IV-algorithm

In Section 3.1, it was shown that the use of the monomial polynomial basis leads to very poor numerical conditioning of Algorithm I and especially of Algorithm II. In the literature, several approaches have been considered to mitigate the conditioning problems in frequency-domain system identification, including (i) discarding the least-relevant part of the system of equations, e.g., by means of a singular-value decomposition (Wills & Ninness, 2008), (ii) amplitude scaling (Hakvoort & Van den Hof, 1994), (iii) frequency scaling (Pintelon & Kollár, 2005), (iv) conversion from the continuous-time domain to the discrete-time domain and vice versa using a Möbius transformation (Oomen, van de Wal, & Bosgra, 2007, Sect. 5.1), and (v) the use of an alternative classical polynomial basis such as Chebyshev polynomials (Adcock, 1987; Dailey & Lukich, 1987), the use of basis functions that are orthonormal with respect to the standard inner product in the Hilbert space RH_2 (de Vries & Van den Hof, 1998; Heuberger et al., 2005), or the use of frequency localizing basis functions (FLBFs) that have been proposed recently in Gilson et al. (2013) and Welsh and Goodwin (2003). Although the mentioned approaches often help in mitigating the numerical ill-conditioning, they typically require heuristic tuning and may not provide sufficient improvement for complex frequency-domain identification problems.

Furthermore, for the classical Algorithm I, a fundamental solution to prevent numerical ill-conditioning of (21) exists. It relies on the use of a data-dependent polynomial basis, in which the following inner product plays an important role.

Definition 9. Let distinct frequency points ξ_k ∈ C, k = 1, …, m, be given. Let w_1k ∈ C^{1×2} be weights as specified in (8). Then, for vector-polynomials φ_i(ξ), φ_j(ξ) ∈ R^{2×1}[ξ], the following data-dependent inner product is defined:

\[
\langle \varphi_i(\xi), \varphi_j(\xi) \rangle := \sum_{k=1}^{m} \varphi_j^H(\xi_k)\, w_{1k}^H w_{1k}\, \varphi_i(\xi_k). \tag{23}
\]

As shown in Faßbender (1997), Reichel et al. (1991) and Van Barel and Bultheel (1995), if a polynomial basis is orthonormal with respect to (23), then in (11) it holds that

\[
\Phi^H W_1^H W_1 \Phi = I_{2n}.
\]

Equivalently, κ(W_1 Φ) = 1 in (21). Hence, optimal conditioning is achieved, facilitating a numerically reliable implementation of Algorithm I.

Although a polynomial basis that is orthonormal with respect to (23) provides optimal conditioning of the SK-iteration in Algorithm I, this does not apply to Algorithm II. Importantly, whereas κ(W_1 Φ) = 1, in contrast, κ(Φ^H W_2^H W_1 Φ) can be arbitrarily large for general data. The underlying reason is that the oblique projection (17) is not associated with an inner product of the form in Definition 9. In fact, it is shown in van Herpen (2014, Chap. 2, Thm 2.19) that if

\[
\exists\, k \in \{1, \ldots, m\} \quad \text{s.t.} \quad w_{2k}^H w_{1k} \neq w_{1k}^H w_{2k},
\]

i.e., W_2^H W_1 is asymmetric, then (disregarding some degenerate cases) it is not possible to select one single polynomial basis φ_j(ξ), j = 0, 1, …, n − 1, that yields κ(Φ^H W_2^H W_1 Φ) = 1.

In conclusion, whereas Algorithm II has beneficial convergence properties compared to Algorithm I, the existing approaches for implementation of this algorithm do not provide optimal numerical conditioning. In the forthcoming section, a new solution is presented that does provide optimal conditioning in Algorithm II.

4. Optimally conditioned frequency-domain IV-identification using bi-orthonormal polynomial bases

In this section, a new solution is presented that combines the algorithmic advantages of Algorithm II with optimal numerical conditioning. As observed in Blom and Van den Hof (2010), ζ(ξ, θ) in (14) can be interpreted as an instrumental variable (IV), see Söderström and Stoica (1983) for an overview of instrumental variable methods in system identification. In this section, it is shown that through a coordinate transformation of the instrumental variable problem, and by the introduction of bi-orthonormal polynomials in system identification, it is possible to achieve optimal numerical conditioning in Algorithm II, such that κ = 1. This forms Contributions C2 and C3 of this paper.

4.1. Additional freedom by a static transformation of instruments

The essential difference between the identification algorithms considered in this paper is that Algorithm I relies on the orthogonal projection (11), in which Φ^H W_1^H W_1 Φ is symmetric, whereas Algorithm II is based on the oblique projection (17), in which Φ^H W_2^H W_1 Φ is an asymmetric, indefinite matrix, see also Gohberg, Lancaster, and Rodman (2005) for a treatment of indefinite linear algebra. Consequently, a polynomial basis that is orthonormal with respect to an inner product of the form (23) does not lead to optimal results in Algorithm II. To obtain optimal numerical conditioning, the asymmetric character of Algorithm II has to be accounted for explicitly. To this end, the algorithm is reformulated in a more general form in which two distinct polynomial bases are used.

Definition 10. In analogy to (4), a basis ψ_j(ξ) ∈ R^{2×2}[ξ], j = 0, 1, …, n, is defined, where ψ_j(ξ) is of strict degree j, viz.

\[
\psi_j(\xi) = \xi^j \cdot t_{jj} + \cdots + \xi \cdot t_{j1} + t_{j0}, \tag{24}
\]

with t_{jk} ∈ R^{2×2}, k = 0, …, j. In addition, t_{jj} is assumed to be upper triangular and invertible.

Definition 11. Consider the two polynomial bases φ_j(ξ), ψ_j(ξ), j = 0, 1, …, n, in (4) and (24). Then, the coefficient matrices S, T ∈ R^{2(n+1)×2(n+1)} are defined as

\[
S = \begin{bmatrix}
s_{00} & s_{10} & \cdots & s_{n0} \\
 & s_{11} & \cdots & s_{n1} \\
 & & \ddots & \vdots \\
 & & & s_{nn}
\end{bmatrix},
\qquad
T = \begin{bmatrix}
t_{00} & t_{10} & \cdots & t_{n0} \\
 & t_{11} & \cdots & t_{n1} \\
 & & \ddots & \vdots \\
 & & & t_{nn}
\end{bmatrix}.
\]

Note that S and T are upper triangular and invertible. The following auxiliary result now follows immediately.

Result 12. The unique static transformation from the polynomial basis φ_j(ξ) in (4) to the basis ψ_j(ξ) in Definition 10 is given by

\[
\begin{bmatrix} \psi_0(\xi) & \psi_1(\xi) & \cdots & \psi_n(\xi) \end{bmatrix}
= \begin{bmatrix} \varphi_0(\xi) & \varphi_1(\xi) & \cdots & \varphi_n(\xi) \end{bmatrix} U,
\]

with U = S^{-1} T ∈ R^{2(n+1)×2(n+1)} upper triangular and invertible.

6

R. van Herpen et al. / Automatica (

In the remainder of this section, a more general formulation of the instrumental variable identification problem in Algorithm II is derived. Herein, the two distinct polynomial bases ϕ(ξ ) and ψ(ξ ) are exploited. In particular, a specific transformation of the instruments is performed, which has the interpretation of a change of polynomial basis. The forthcoming developments lead to the main result in Lemma 13. n(ϕ(ξ ),θ ⟨i−1⟩ ) Consider Algorithm II, where Pˆ (ϕ(ξ ), θ ⟨i−1⟩ ) = d(ϕ(ξ ),θ ⟨i−1⟩ ) is

the model obtained in iteration (i − 1), with



 d(ϕ(ξ ), θ ) = ϕ0 (ξ ) ⟨i−1⟩ n(ϕ(ξ ), θ ) ⟨i−1⟩



ϕ1 (ξ )

···

 ϕn (ξ ) θ ⟨i−1⟩ .

Here, the parametrization in the basis ϕ(ξ ), see (3), is indicated explicitly. Using Result 12, the model is re-parameterized in an alternative basis ψj (ξ ), j = 0, 1, . . . , n, which leads to Pˆ (ψ(ξ ), η⟨i−1⟩ ). Here, η⟨i−1⟩ = U −1 θ ⟨i−1⟩ is the parameter vector that provides the same model in the new basis ψ(ξ ), i.e.,



d(ϕ(ξ ), θ ⟨i−1⟩ ) n(ϕ(ξ ), θ ⟨i−1⟩ )



 =

d(ψ(ξ ), η⟨i−1⟩ ) n(ψ(ξ ), η⟨i−1⟩ )

 = ψ0 (ξ )



ψ1 (ξ )

d(ψ(ξ ), η⟨i−1⟩ )

· =

∂ ∂ηT

Pˆ (ψ(ξ ), η⟨i−1⟩ )

d(ϕ(ξ ), θ ⟨i−1⟩ )

 · ψ0 (ξ )

Pˆ (ϕ(ξ ), θ ⟨i−1⟩ )



ψ1 (ξ )

···

(25)

T





∂ Pˆ (ξ ,θ)   ⟨i−1⟩ , ∂θ T θ =θ

⟨i−1⟩

result in (25),

Then, the solution θ m  k=1

ζ˜ H (ξk , θ ⟨i−1⟩ ) ·

⟨i ⟩

d(ξk , θ ⟨i−1⟩ )

  . 

ψn−1 (ξm )

Definition 15. Let distinct frequency points ξk ∈ C, k = 1, . . . , m, be given. Let w1k , w2k ∈ C1×2 be weights as defined in (8) and (18). For vector-polynomials ϕ(ξ ), ψ(ξ ) ∈ R2×1 [ξ ], the datadependent bi-linear form [·, ·] is defined as m 

H w1k ϕ(ξk ). ψ H (ξk ) w2k

(29)

Pˆ (ξk , θ ⟨i−1⟩ )



In (16), the instrumental variable ζ(ξ, θ⟨i−1⟩) = −W(ξ) · ∂P̂(ξ, θ)/∂θ^T |θ=θ⟨i−1⟩, see (14), is evaluated for a different basis ψj(ξ), i.e., using the definition

  ζ̃(ξ, θ⟨i−1⟩) := − ∂P̂(ψ(ξ), η)/∂η^T |η=η⟨i−1⟩
                = W(ξ) · 1/d(ψ(ξ), η⟨i−1⟩) · [P̂(ψ(ξ), η⟨i−1⟩)  −1] ⊗ [ψ0(ξ) · · · ψn(ξ)],   (25)

which represents the gradient of the model P̂(ξ, θ⟨i−1⟩) when evaluated in terms of the polynomial basis ψj(ξ), j = 0, 1, . . . , n. Note that this new basis ψj(ξ) only appears explicitly as the rightmost matrix in the product (25). The remaining two terms can still be evaluated in terms of the basis ϕj(ξ), as is done (implicitly) in (19). In the following lemma, two distinct polynomial bases are used to parameterize the model and the instrumental variable. This constitutes Contribution C2 of the paper.

Lemma 13. Consider (16), where
(i) the vector-polynomial [d(ξ, θ⟨i⟩)  n(ξ, θ⟨i⟩)]^T is formulated in the polynomial basis ϕj(ξ), cf. (3), and
(ii) the instrumental variable ζ(ξ, θ⟨i−1⟩) is replaced by ζ̃(ξ, θ⟨i−1⟩) in (25), which in accordance with (19) evaluates at the frequency points to

  ζ̃(ξk, θ⟨i−1⟩) = W(ξk) · 1/d(ξk, θ⟨i−1⟩) · [P̂(ξk, θ⟨i−1⟩)  −1] ⊗ [ψ0(ξk) · · · ψn(ξk)].   (26)

Then, the solution θ⟨i⟩ to

  Σ_{k=1}^{m} ζ̃^H(ξk, θ⟨i−1⟩) W(ξk) · 1/d(ξk, θ⟨i−1⟩) · [Po(ξk)  −1] [d(ξk, θ⟨i⟩) ; n(ξk, θ⟨i⟩)] = 0   (27)

is equivalent to the solution θ⟨i⟩ to (16).

Proof. By virtue of Result 12, ζ̃(ξk, θ) = ζ(ξk, θ) U, see (20) and (26). Since U is invertible, the solution θ to

  Σ_{k=1}^{m} ζ̃^H(ξk, θ) ε(ξk, θ) = 0

is equivalent to the solution θ to (13). In analogy to this, the solution θ⟨i⟩ to (27) and (16) is equivalent. □

The rationale behind Lemma 13 is that by evaluating the instrumental variable ζ(ξ, θ) in terms of a different polynomial basis, additional freedom is provided to formulate the system of equations in Algorithm II. In particular, it follows along the lines of Lemma 4 that it is possible to recast (27) as

  (Ψ^H W2^H W1 Φ) ϑ⟨i⟩ = Ψ^H W2^H bn,   (28)

where W1, W2, Φ, ϑ, and bn are defined in (8)–(10) and (18), and

  Ψ = [ ψ0(ξ1)  ψ1(ξ1)  · · ·  ψn−1(ξ1)
        ψ0(ξ2)  ψ1(ξ2)  · · ·  ψn−1(ξ2)
          ⋮        ⋮                ⋮
        ψ0(ξm)  ψ1(ξm)  · · ·  ψn−1(ξm) ].

Remark 14. A transformation of the polynomial basis in which ζ(ξ, θ) is evaluated constitutes a special form of a transformation of instruments. Indeed, the parameter estimate in instrumental variable system identification problems is invariant under a linear transformation of instruments, as is also shown in Söderström and Stoica (1983, Sect. 3.1, pp. 24–25).

In the next section, the freedom in (28) when compared to (17) is exploited to achieve optimal numerical conditioning in frequency-domain instrumental variable identification, see Algorithm II.

4.2. Bi-orthonormal polynomial bases for optimally conditioned frequency-domain system identification

In this section, optimal numerical conditioning of Algorithm II is achieved through the introduction of bi-orthonormal polynomials in system identification. To this end, a data-dependent bi-linear form is introduced.

Definition 15. For vector-polynomials ϕ(ξ), ψ(ξ) ∈ R^{2×1}[ξ], a data-dependent bi-linear form is given by

  [ϕ(ξ), ψ(ξ)] := Σ_{k=1}^{m} ψ^H(ξk) w2k^H w1k ϕ(ξk).   (29)

Remark 16. Note that (29) is not an inner product, since in general this bi-linear form is asymmetric and indefinite, see van Herpen (2014, Section 2.3). In the special situation where w1k = w2k ∀ k = 1, . . . , m, (29) reduces to the data-dependent inner product (23).

Definition 17. For block-polynomials ϕ(ξ), ψ(ξ) ∈ R^{2×2}[ξ], a generalization of (29) is given by

  [[ϕ(ξ), ψ(ξ)]] := Σ_{k=1}^{m} ψ^H(ξk) w2k^H w1k ϕ(ξk).   (30)

Indeed, let ϕ(ξ) be decomposed into vector-polynomials

  ϕ(ξ) = [ϕ¹(ξ)  ϕ²(ξ)],   ϕ¹, ϕ² ∈ R^{2×1}[ξ],

and similarly for ψ(ξ). Then, element (e2, e1) in (30), where e1, e2 ∈ {1, 2}, is given by the bi-linear form [ϕ^{e1}(ξ), ψ^{e2}(ξ)].
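For reference, the scalar counterpart of this bi-linear form can be evaluated directly from data. The following sketch uses illustrative values only (the frequency points, weights, and polynomials below are arbitrarily chosen, not taken from the identification setup) and also exhibits the asymmetry noted in Remark 16:

```python
import numpy as np

def bilinear_form(phi, psi, xi, w1, w2):
    """Evaluate [phi, psi] = sum_k conj(psi(xi_k)) * conj(w2_k) * w1_k * phi(xi_k)."""
    return sum(np.conj(psi(x)) * np.conj(v2) * v1 * phi(x)
               for x, v1, v2 in zip(xi, w1, w2))

# Illustrative data: m = 4 points on the imaginary axis with unequal weights.
xi = 1j * np.array([1.0, 2.0, 3.0, 4.0])
w1 = np.array([1.0, 0.5, 2.0, 1.5])
w2 = np.array([0.3, 1.2, 0.8, 2.0])   # w2 != w1, so the form is not an inner product

phi = lambda x: x         # phi(xi) = xi
psi = lambda x: x + 1.0   # psi(xi) = xi + 1

a = bilinear_form(phi, psi, xi, w1, w2)
b = bilinear_form(psi, phi, xi, w1, w2)
print(a, b)               # asymmetric: [phi, psi] != [psi, phi] in general
```

With w1k = w2k the familiar conjugate-symmetry of an inner product is recovered, which is the special case mentioned in Remark 16.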

R. van Herpen et al. / Automatica ( ) –

The following theorem is the main result of this section and constitutes Contribution C3 of the paper. Here, the freedom to choose two different polynomial bases in frequency-domain instrumental variable identification, cf. Lemma 13, is exploited.

Theorem 18. Consider (28). Let ϕi(ξ), ψj(ξ), i, j = 0, 1, . . . , n − 1, be bi-orthonormal with respect to (30), i.e.,

  [[ϕi(ξ), ψj(ξ)]] = δij · I2,   i, j = 0, 1, . . . , n − 1.   (31)

Then, κ(Ψ^H W2^H W1 Φ) = 1.

Proof. The result follows by observing that (31) in matrix form reads

  Ψ^H W2^H W1 Φ = I2n,

hence, κ(Ψ^H W2^H W1 Φ) = 1. □

Theorem 18 reveals that when bi-orthonormal polynomials (BPs) with respect to (30) are used, optimal numerical conditioning of (28) is achieved. Hence, the key computational step in Algorithm II is optimally conditioned, which facilitates a reliable implementation of the algorithm. In the next section, the construction of BPs with respect to (30) is considered.

5. Construction of bi-orthonormal block-polynomial bases

In this section, the construction of a set of polynomial bases that are bi-orthonormal with respect to the bi-linear form (30) is considered. This constitutes Contribution C4 of the paper. To facilitate the exposition, the polynomial bases ϕj(ξ), ψj(ξ), j = 0, 1, . . . , n − 1, as well as the weights w1k, w2k, k = 1, . . . , m, are initially assumed to be scalar. It is then explained how to construct ϕj(ξ), ψj(ξ) that are bi-orthonormal with respect to the scalar bi-linear form

  [ϕ(ξ), ψ(ξ)] = Σ_{k=1}^{m} ψ^*(ξk) w2k^* w1k ϕ(ξk).   (32)
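Before turning to the recurrence-based construction, the effect of bi-orthonormality on conditioning can be verified numerically. The sketch below builds bases that are bi-orthonormal with respect to a scalar bi-linear form via an LU factorization of the monomial moment matrix; this is only a small-scale illustration assuming generic data (nonsingular leading principal minors), not the numerically reliable construction developed in this section:

```python
import numpy as np

def lu_nopivot(M):
    """Doolittle factorization M = L @ U without pivoting (valid for generic
    data, where all leading principal minors of M are nonzero)."""
    n = M.shape[0]
    L = np.eye(n, dtype=complex)
    U = M.astype(complex).copy()
    for k in range(n - 1):
        for i in range(k + 1, n):
            L[i, k] = U[i, k] / U[k, k]
            U[i, k:] -= L[i, k] * U[k, k:]
    return L, U

# Illustrative data: m = 5 distinct frequency points and two weight sequences.
xi = np.array([1.0+1.0j, 2.0-1.0j, -1.0+2.0j, 0.5-0.5j, -2.0+0.3j])
W1 = np.diag(np.array([1.0+0.2j, 0.5-0.1j, 2.0+0.4j, 1.5+0.1j, 0.8-0.3j]))
W2 = np.diag(np.array([0.3-0.2j, 1.2+0.5j, 0.8+0.1j, 2.0-0.4j, 1.1+0.2j]))
m = xi.size

V = np.vander(xi, m, increasing=True)     # monomial basis: V[k, j] = xi_k**j
M = V.conj().T @ W2.conj().T @ W1 @ V     # moment matrix of the bi-linear form

L, U = lu_nopivot(M)
Phi = V @ np.linalg.inv(U)                # phi_j: upper-triangular change of basis
Psi = V @ np.linalg.inv(L.conj().T)       # psi_j: chosen so Psi^H W2^H W1 Phi = I

G = Psi.conj().T @ W2.conj().T @ W1 @ Phi
print(np.linalg.cond(G))                  # optimally conditioned: kappa = 1
```

The triangularity of the LU factors preserves the degree structure of the bases, and the resulting system matrix is the identity, i.e., κ = 1, in accordance with Theorem 18.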

In Section 5.8, the presented concepts are generalized towards the construction of block-polynomials that are bi-orthonormal with respect to (30).

5.1. Polynomial recurrence relations

Throughout this section, polynomial bases that satisfy a degree structure are considered. In analogy with (4) and (24), the scalar polynomials ϕj(ξ), ψj(ξ) take the form

  ϕj(ξ) = ξ^j · sjj + · · · + ξ · sj1 + sj0,
  ψj(ξ) = ξ^j · tjj + · · · + ξ · tj1 + tj0,

where sjk, tjk, k = 0, 1, . . . , j, are scalars. In addition, sjj and tjj are assumed to be nonzero, such that the jth polynomial is of strict degree j. As a consequence of the imposed degree structure, scalars σk,j−1, τk,j−1, k = 0, 1, . . . , j, exist such that

  ξ ϕj−1(ξ) = ϕj(ξ) σj,j−1 + · · · + ϕ1(ξ) σ1,j−1 + ϕ0(ξ) σ0,j−1,   (33)
  ξ ψj−1(ξ) = ψj(ξ) τj,j−1 + · · · + ψ1(ξ) τ1,j−1 + ψ0(ξ) τ0,j−1.   (34)

Now, let n = m, i.e., consider as many basis polynomials as the number of data points that define the bi-linear form (32). Then, combining (33)–(34) for j = 0, 1, . . . , n − 1 yields

  ξ [ϕ0(ξ)  ϕ1(ξ)  · · ·  ϕm−1(ξ)] = [ϕ0(ξ)  ϕ1(ξ)  · · ·  ϕm−1(ξ)] H1 + ϕm(ξ) [01×(m−1)  σm,m−1],   (35)
  ξ [ψ0(ξ)  ψ1(ξ)  · · ·  ψm−1(ξ)] = [ψ0(ξ)  ψ1(ξ)  · · ·  ψm−1(ξ)] H2 + ψm(ξ) [01×(m−1)  τm,m−1],   (36)

where

  H1 = [ σ0,0   σ0,1   · · ·  σ0,m−2     σ0,m−1
         σ1,0   σ1,1   · · ·  σ1,m−2     σ1,m−1
                 ⋱              ⋮           ⋮
                              σm−1,m−2   σm−1,m−1 ],   (37)

  H2 = [ τ0,0   τ0,1   · · ·  τ0,m−2     τ0,m−1
         τ1,0   τ1,1   · · ·  τ1,m−2     τ1,m−1
                 ⋱              ⋮           ⋮
                              τm−1,m−2   τm−1,m−1 ]   (38)

are upper Hessenberg matrices. To let the recurrence relations (35)–(36) generate bi-orthonormal polynomials with respect to (32), the specific problem data has to be introduced into the recurrence coefficients in the Hessenberg matrices. Before presenting the key results, several auxiliary matrix definitions are introduced.

5.2. Auxiliary matrix definitions

The bi-linear form (32) is based on m frequency points ξk, k = 1, . . . , m, with corresponding weights w1k, w2k. Denote matrices

  X = diag(ξ1, ξ2, . . . , ξm) ∈ C^{m×m},   (39)
  W1 = diag(w11, w12, . . . , w1m) ∈ C^{m×m},
  W2 = diag(w21, w22, . . . , w2m) ∈ C^{m×m}.

Also, denote vectors

  w1 = [w11  w12  · · ·  w1m]^T ∈ C^{m×1},   (40)
  w2 = [w21  w22  · · ·  w2m]^T ∈ C^{m×1}.   (41)

Finally, denote

  Φ = [Φ0  Φ1  · · ·  Φm−1] = [ ϕ0(ξ1)  ϕ1(ξ1)  · · ·  ϕm−1(ξ1)
                                ϕ0(ξ2)  ϕ1(ξ2)  · · ·  ϕm−1(ξ2)
                                   ⋮        ⋮                ⋮
                                ϕ0(ξm)  ϕ1(ξm)  · · ·  ϕm−1(ξm) ] ∈ C^{m×m},

  Ψ = [Ψ0  Ψ1  · · ·  Ψm−1] = [ ψ0(ξ1)  ψ1(ξ1)  · · ·  ψm−1(ξ1)
                                ψ0(ξ2)  ψ1(ξ2)  · · ·  ψm−1(ξ2)
                                   ⋮        ⋮                ⋮
                                ψ0(ξm)  ψ1(ξm)  · · ·  ψm−1(ξm) ] ∈ C^{m×m}.

For scalar polynomials, bi-orthonormality condition (31) reduces to [ϕi(ξ), ψj(ξ)] = δij, i, j = 0, 1, . . . , n − 1, where (32) is used. This bi-orthonormality condition is equivalent to the matrix representation

  Ψ^H W2^H W1 Φ = Im.   (42)

In Sections 5.3–5.7, the matrices X, W1, W2, and w1, w2, and Φ, Ψ are assumed to be dimensioned accordingly.

5.3. Connecting the polynomial recurrences with the data: a requirement on the eigenvalues of the Hessenberg matrices

The general j-term recurrence relations (35)–(36) hold for any polynomial bases ϕj(ξ), ψj(ξ) with the assumed degree structure. In this section, an additional requirement on the Hessenberg matrices H1 and H2 is formulated that needs to be fulfilled to let ϕj(ξ), ψj(ξ) be bi-orthonormal with respect to (32). For this purpose, observe that if ϕj(ξ), ψj(ξ) are bi-orthonormal polynomials, then (35)–(36) have a specific form.
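To make the recurrence structure concrete: for the monomial basis, the coefficients in (33) reduce to σj,j−1 = 1 with all other coefficients zero, so H1 becomes a shift matrix with ones on its subdiagonal. The following short numerical check of (35), evaluated at arbitrarily chosen data points (an illustration only, with σm,m−1 = 1), confirms this:

```python
import numpy as np

m = 6
xi = np.array([1.0+1.0j, 2.0-1.0j, -1.0+2.0j, 0.5-0.5j, -2.0+0.3j, 3.0+2.0j])
X = np.diag(xi)
V = np.vander(xi, m, increasing=True)      # columns: monomials xi**0 ... xi**(m-1)

# For monomials, xi * xi**(j-1) = xi**j, so H1 is upper Hessenberg with ones
# on its subdiagonal, and sigma_{m,m-1} = 1 in the residual term of (35).
H1 = np.diag(np.ones(m - 1), k=-1)
residual = np.outer(xi**m, np.eye(m)[-1])  # phi_m(xi_k) * [0 ... 0 1]

print(np.allclose(X @ V, V @ H1 + residual))   # True
```

For general degree-structured bases the same relation holds with a full upper Hessenberg H1, which is exactly what (37) expresses.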

Lemma 19. Let ϕj(ξ), ψj(ξ), j = 0, 1, . . . , n − 1, be bi-orthonormal polynomials with respect to (32), where n = m. Then, (35)–(36) are equivalent to

  X Φ = Φ H1,   (43)
  X Ψ = Ψ H2.   (44)

Proof. The bi-linear form (32) is based on the frequency points ξk, k = 1, . . . , m. Evaluating (35)–(36) for these frequencies yields

  X Φ = Φ H1 + Φm [01×(m−1)  σm,m−1],
  X Ψ = Ψ H2 + Ψm [01×(m−1)  τm,m−1],

where

  Φm = [ϕm(ξ1)  ϕm(ξ2)  · · ·  ϕm(ξm)]^T,
  Ψm = [ψm(ξ1)  ψm(ξ2)  · · ·  ψm(ξm)]^T.

Since it is assumed that the m frequency points are distinct, it follows as a consequence of the imposed polynomial degree structure that Φ and Ψ are invertible matrices. Hence, in order for ϕm(ξ), ψm(ξ) to be bi-orthogonal with all lower degree basis polynomials, it is required that ϕm(ξk) = ψm(ξk) = 0 ∀ ξk. Thus, Φm = Ψm = 0m×1, proving (43)–(44). □

Using a bi-linear form based on m distinct frequency points, bi-orthonormal bases ϕj(ξ), ψj(ξ) of m non-trivial polynomials can be obtained. The relations (43)–(44) describe the relations between these polynomials for the frequencies on which the bi-linear form is defined. It turns out that the Hessenberg matrices H1 and H2 have precisely these frequencies as their eigenvalues.

Theorem 20. Let ϕj(ξ), ψj(ξ), j = 0, 1, . . . , n − 1, be bi-orthonormal polynomials with respect to (32), where n = m. Then, (43)–(44) are equivalent to

  (Ψ^H W2^H) X (W1 Φ) = H1,   (45)
  (Φ^H W1^H) X (W2 Ψ) = H2,   (46)

where

  Ψ^H W2^H = (W1 Φ)^{−1}.   (47)

Proof. Consider the bi-orthonormality condition (42), which immediately yields (47). Next, (45)–(46) follow after pre-multiplication of (43) and (44) by (Ψ^H W2^H) W1 and (Φ^H W1^H) W2, respectively. □

Theorem 20 shows that the Hessenberg matrices H1 and H2 are similar to X. Consequently, the eigenvalues of both H1 and H2 are ξk, k = 1, . . . , m. Indeed, bi-orthonormal polynomials can be constructed by solving two inverse eigenvalue problems. Before showing this, the lowest degree polynomial is appropriately selected to initialize the recurrence relations.

5.4. Requirement for initialization of bi-orthonormal bases

Recall from Section 5.1 that H1 and H2 contain the recurrence coefficients that relate basis polynomials of increasing degree. The associated recurrence relations are given by (43)–(44). The missing aspect for the actual construction of bi-orthonormal polynomials is the initialization of the polynomial bases. The following result enables the initialization of the polynomial bases from the bi-linear form (32).

Theorem 21. Let ϕj(ξ), ψj(ξ), j = 0, 1, . . . , n − 1, be bi-orthonormal polynomials with respect to (32), where n = m. Furthermore, let the 0th degree basis polynomials be defined as

  ϕ0(ξ) := 1/γ0,   ψ0(ξ) := 1/β0,   (48)

where β0 and γ0 are scalars. Then, since ϕ0(ξ) and ψ0(ξ) are part of bi-orthonormal polynomial bases, the following equalities hold:

  (Ψ^H W2^H) w1 = [γ0  01,m−1]^T,   (49)
  (Φ^H W1^H) w2 = [β0  01,m−1]^T.   (50)

Proof. Let ϕ0(ξ) and ψ0(ξ) be as defined in (48). From the auxiliary matrix definitions in Section 5.2, it follows that

  W1 Φ0 = w1 · 1/γ0,   (51)
  W2 Ψ0 = w2 · 1/β0.   (52)

In order for ϕ0(ξ) and ψ0(ξ) to be the 0th degree elements of bi-orthonormal polynomial bases, it is required that

  Ψ^H W2^H W1 Φ0 = [1  01,m−1]^T,
  Φ^H W1^H W2 Ψ0 = [1  01,m−1]^T,

cf. (42). By inserting (51)–(52), these requirements can equivalently be written as (49)–(50), which completes the proof. □

Note that together, (49)–(50) enable an initialization of the polynomials ϕ0(ξ) and ψ0(ξ) in terms of the weights w1k, w2k, k = 1, . . . , m, that define the bi-linear form (32). In particular, to ensure that the polynomial bases are correctly initialized, i.e., the 0th degree basis polynomials are bi-orthonormal with respect to (32), the following equality needs to hold:

  ψ0^* ϕ0 = 1/(β0^* γ0) = 1/(w2^H w1).

To support this claim, observe that, first using (47) and second using (49)–(50),

  [(Φ^H W1^H) w2]^H [(Ψ^H W2^H) w1] = w2^H (W1 Φ) (W1 Φ)^{−1} w1 = w2^H w1,
  [(Φ^H W1^H) w2]^H [(Ψ^H W2^H) w1] = [β0  01,m−1]^* [γ0  01,m−1]^T = β0^* γ0.

In the next section, an algorithm for the derivation of the polynomial recursion coefficients for bi-orthonormal polynomial bases ϕj(ξ) and ψj(ξ) is presented.

5.5. An algorithm for the derivation of recurrence coefficients

In this section, it is shown how to generate the recurrence coefficients for the bi-orthonormal polynomial bases from the data that defines the bi-linear form (32). The following algorithm combines the initialization of the polynomial bases in (49)–(50) with the eigenvalue decompositions in (45)–(46).

Algorithm 22. Let X be given in (39), and w1, w2 be given in (40)–(41). Then, form initial node-weight matrices [w1 | X] and [w2 | X]. Next, compute matrices

  K̃, L̃ ∈ C^{m×m},   L̃ = K̃^{−H},   (53)

such that

  L̃^H [w1 | X K̃] = [[γ0  01,m−1]^T | H1],   (54)
  K̃^H [w2 | X L̃] = [[β0  01,m−1]^T | H2],   (55)

where H1 and H2 are upper Hessenberg matrices as in (37)–(38).

A numerically reliable and efficient implementation of Algorithm 22 can be obtained by exploiting elementary zeroing operations such as Givens reflections or rotations, while following the 'chasing down the diagonal' strategy in Rutishauser (1963). This approach indeed yields good results on actual data, as will be further demonstrated in Section 6, and is also applied in, e.g., Faßbender (1997), Reichel et al. (1991) and Van Barel and Bultheel (1995).

5.6. Proving bi-orthonormality of the resulting polynomials

In this section, it is shown that the Hessenberg matrices obtained in Algorithm 22 indeed contain the recurrence coefficients that generate bi-orthonormal polynomials with respect to (32). First, it is proven that the individual columns of K̃, L̃ in (53), viz.

  K̃ = [k̃1  k̃2  · · ·  k̃m],   L̃ = [ℓ̃1  ℓ̃2  · · ·  ℓ̃m],

where k̃j, ℓ̃j ∈ C^{m×1}, j = 1, 2, . . . , m, satisfy recurrence relations.

Lemma 23. Let K̃, L̃ ∈ C^{m×m} be such that L̃ = K̃^{−H} and (54) and (55) hold. In addition, let H1 and H2 be defined in (37)–(38). Then, the columns of K̃, L̃ satisfy the recurrence relations

  X k̃j = k̃1 σ0,j−1 + k̃2 σ1,j−1 + · · · + k̃j+1 σj,j−1,   (56)
  X ℓ̃j = ℓ̃1 τ0,j−1 + ℓ̃2 τ1,j−1 + · · · + ℓ̃j+1 τj,j−1,   (57)

which are initialized by (see also Theorem 21)

  k̃1 = w1 · 1/γ0,   ℓ̃1 = w2 · 1/β0.   (58)

Proof. From the first column of (54) and (55), it follows that

  L̃^H w1 = [γ0  01,m−1]^T  ⟹  w1 = K̃ [γ0  01,m−1]^T,
  K̃^H w2 = [β0  01,m−1]^T  ⟹  w2 = L̃ [β0  01,m−1]^T,

where (53) is used. Indeed, (58) follows. Next, observe that

  X K̃ = K̃ H1,   X L̃ = L̃ H2,

by virtue of (53). Evaluating the jth column of these equations, j = 1, 2, . . . , m − 1, yields (56) and (57). □

Next, the recurrence coefficients in H1 and H2 are used to generate polynomial bases. The following result proves bi-orthonormality with respect to (32) by connecting the individual polynomials to the individual columns of K̃ and L̃.

Theorem 24. Consider the Hessenberg decompositions (54) and (55). Let ϕ0(ξ) and ψ0(ξ) be selected according to (48). Then, the recurrence relations

  ϕj(ξ) = 1/σj,j−1 · [(ξ − σj−1,j−1) ϕj−1(ξ) − σj−2,j−1 ϕj−2(ξ) − · · · − σ1,j−1 ϕ1(ξ) − σ0,j−1 ϕ0(ξ)],   (59)
  ψj(ξ) = 1/τj,j−1 · [(ξ − τj−1,j−1) ψj−1(ξ) − τj−2,j−1 ψj−2(ξ) − · · · − τ1,j−1 ψ1(ξ) − τ0,j−1 ψ0(ξ)]   (60)

generate bi-orthonormal bases with respect to (32).

Proof. Combining (51), (52), and (58) yields

  k̃1 = W1 Φ0,   ℓ̃1 = W2 Ψ0.

Then, by rewriting (59)–(60) in the equivalent form (33)–(34), it follows from (56)–(57) that

  k̃j = W1 Φj−1,   ℓ̃j = W2 Ψj−1,

j = 1, 2, . . . , m. Hence, since

  L̃^H K̃ = Ψ^H W2^H W1 Φ = Im,

bi-orthonormality of ϕj(ξ) and ψj(ξ) follows immediately from bi-orthonormality of K̃ and L̃. □

To summarize, the construction of bi-orthonormal polynomial bases requires the solution of the Hessenberg decomposition of two initial node-weight matrices, see Algorithm 22. Hereafter, the polynomial bases are generated using the recurrence relations in Theorem 24.

5.7. Frequency points on the imaginary axis

In the special situation where all frequency points ξk, k = 1, . . . , m, are taken on the imaginary axis, a useful relation exists between the Hessenberg matrices H1 and H2 in (37)–(38).

Theorem 25. Let ξk = ȷωk, ωk ∈ R, k = 1, . . . , m. Then, H1 and H2 are tri-diagonal matrices. Furthermore,

  H1^H = −H2.

Proof. Since all frequency points are on the imaginary axis,

  X^H = −X,

see (39). Using (45) and (46), it then holds that

  H1^H = [(Ψ^H W2^H) X (W1 Φ)]^H = (Φ^H W1^H) X^H (W2 Ψ) = −(Φ^H W1^H) X (W2 Ψ) = −H2.

This implies that H1 and H2 are both upper and lower Hessenberg, hence, tri-diagonal matrices. □
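The eigenvalue statement of Theorem 20 and the relation H1^H = −H2 of Theorem 25 can be checked numerically. The sketch below constructs bi-orthonormal bases directly from the moment matrix (an illustrative shortcut with arbitrarily chosen weights; Algorithm 22 instead uses numerically reliable similarity transformations of the node-weight matrices):

```python
import numpy as np

m = 5
omega = np.array([1.0, 2.0, 3.5, 5.0, 7.0])
xi = 1j * omega                               # frequency points on the imaginary axis
W1 = np.diag(np.array([1.0+0.2j, 0.5-0.1j, 2.0+0.4j, 1.5+0.1j, 0.8-0.3j]))
W2 = np.diag(np.array([0.3-0.2j, 1.2+0.5j, 0.8+0.1j, 2.0-0.4j, 1.1+0.2j]))
X = np.diag(xi)

# Bi-orthonormal bases via an LU factorization of the moment matrix
# (small-scale illustration; assumes nonsingular leading principal minors).
V = np.vander(xi, m, increasing=True)
M = V.conj().T @ W2.conj().T @ W1 @ V
L, U = np.eye(m, dtype=complex), M.copy()
for k in range(m - 1):
    for i in range(k + 1, m):
        L[i, k] = U[i, k] / U[k, k]
        U[i, k:] -= L[i, k] * U[k, k:]
Phi = V @ np.linalg.inv(U)
Psi = V @ np.linalg.inv(L.conj().T)

H1 = Psi.conj().T @ W2.conj().T @ X @ W1 @ Phi   # cf. (45)
H2 = Phi.conj().T @ W1.conj().T @ X @ W2 @ Psi   # cf. (46)

# Theorem 20: H1 and H2 are similar to X, so their eigenvalues are the xi_k.
print(np.allclose(np.sort(np.linalg.eigvals(H1).imag), omega, atol=1e-6))
# Theorem 25: for imaginary-axis frequency points, H1^H = -H2 (tri-diagonal).
print(np.allclose(H1.conj().T, -H2, atol=1e-8))
```

The upper Hessenberg structure of H1 and H2 follows from the degree structure of the bases, and combining it with H1^H = −H2 yields the tri-diagonal form exploited in Section 5.7.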

By virtue of Theorem 25, three-term-recurrence relations can be derived in line with Theorem 24 that generate bi-orthonormal polynomial bases ϕj(ξ) and ψj(ξ).

5.8. Constructing bi-orthonormal 2 × 2 real block-polynomials

In Sections 5.1–5.7, the principal steps in the construction of bi-orthonormal polynomials have been presented. In this section, these concepts are generalized towards bi-orthonormal 2 × 2 real block-polynomials, which is needed to achieve optimal conditioning of Algorithm II in Section 2. For the identification of a real-rational transfer function P̂(ξ, θ), cf. (2), the derivation of real polynomial recurrence coefficients is required. To this end, the property

  P̂(ξ^*, θ) = P̂^*(ξ, θ),

which is characteristic for real-rational transfer functions, is exploited. Specifically, complex-conjugate node and weight pairs are taken in Algorithm 22. This leads to the initial node-weight matrix

  Inw := [         w21^H  w21^T  w22^H  w22^T  · · ·  w2m^H  w2m^T
           w11     ξ1
           w11^*          ξ1^*
           w12                   ξ2
           w12^*                        ξ2^*
             ⋮                                  ⋱
           w1m                                         ξm
           w1m^*                                              ξm^* ],   (61)

where w1k, w2k ∈ C^{1×2}, k = 1, . . . , m, are given in (8) and (18). This matrix is then transformed under similarity into the matrix

  Ĩnw = T0^H Inw T0,

where

  T0 = [ I2
              Im ⊗ [ 1    ȷ
                     1   −ȷ ] ],   ȷ = √−1.

Importantly, Ĩnw is a real-valued matrix. Note that considering conjugate node and weight pairs does not lead to an increase of complexity, since all remaining algorithmic computations can now be performed on real matrices.

Fig. 1. Noisy m = 181 point FRF Po (dotted) and n = 4th order parametric models P̂(s) obtained with Algorithm I (SK) (dashed) and Algorithm II (IV) (solid).

Next, frequency points ξk = ȷωk, ωk ∈ R, are considered. In line with the result in Theorem 25, the real initial node-weight matrix Ĩnw is transformed into a 2 × 2 real block-tridiagonal matrix

  T := [        B0^T
         Γ0    A1     B1^T
               Γ1     A2     B2^T
                      ⋱      ⋱       ⋱
                             Γm−1    Am ]   (62)

under similarity. Here, the blocks Ai ∈ R^{2×2}, i = 1, . . . , m, are full matrices, whereas the blocks Bj, Γj ∈ R^{2×2}, j = 0, 1, . . . , m − 1, are upper triangular matrices. The block-tridiagonal matrix provides the recurrence coefficients that generate block-polynomials that are bi-orthonormal with respect to (30).

Theorem 26. Consider the initial node-weight matrix Inw in (61), where ξk, w1k, w2k, k = 1, . . . , m, define the bi-linear form (30). Let Inw be transformed under similarity into the real block-tridiagonal matrix T in (62). Then, bi-orthonormal polynomials ϕj(ξ), ψj(ξ) ∈ R^{2×2}[ξ], j = 0, 1, . . . , n − 1, with respect to the bi-linear form (30) are obtained from the three-term-recurrence relations

  ϕj(s) = ( ϕj−1(s) (s · I2 − Aj) − ϕj−2(s) Bj−1^T ) Γj^{−1},   (63)
  ψj(s) = ( ψj−1(s) (−s · I2 − Aj^T) − ψj−2(s) Γj−1^T ) Bj^{−1},   (64)

where Aj, Bj, Γj ∈ R^{2×2} follow from T. The recursions are initialized with ϕ0(s) = Γ0^{−1} and ψ0(s) = B0^{−1}.

Remark 27. Consider the special situation that W2 = W1. In that case, the initial node-weight matrix can be transformed into a symmetric block-tridiagonal matrix under unitary similarity, see also Gragg and Harrod (1984) and Van Barel and Bultheel (1992). This symmetric block-tridiagonal matrix, known as a block-Jacobi matrix, has Aj^T = Aj and Γj = Bj. Consequently, it follows from the three-term-recurrence relations (63)–(64) that the polynomial bases ϕj(s), ψj(s) coincide into one single basis. This polynomial basis is orthonormal with respect to

  ⟨⟨ϕi(ξ), ϕj(ξ)⟩⟩ := Σ_{k=1}^{m} ϕj^H(ξk) w1k^H w1k ϕi(ξk),   (65)

which constitutes a matrix representation of the data-dependent inner product in Definition 9.
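The W2 = W1 special case of Remark 27 admits a compact numerical illustration: for the scalar analogue of the inner product (65), the moment matrix is Hermitian positive definite, and a Cholesky factorization yields a single orthonormal basis (again a small-scale illustrative construction, not the Jacobi-matrix approach referenced above):

```python
import numpy as np

m = 5
xi = 1j * np.array([1.0, 2.0, 3.5, 5.0, 7.0])
W1 = np.diag(np.array([1.0, 0.5, 2.0, 1.5, 0.8]))   # W2 = W1: inner-product case

V = np.vander(xi, m, increasing=True)
M = V.conj().T @ W1.conj().T @ W1 @ V      # Hermitian positive-definite moment matrix
R = np.linalg.cholesky(M).conj().T         # M = R^H R with R upper triangular
Phi = V @ np.linalg.inv(R)                 # single orthonormal basis: Phi^H W1^H W1 Phi = I

print(np.linalg.cond(W1 @ Phi))            # optimal conditioning of Algorithm I
```

Since the two triangular Cholesky factors are Hermitian transposes of each other, the two bi-orthonormal bases collapse into one orthonormal basis, mirroring the collapse of the bi-linear form into an inner product.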

This implies that the presented results cover the data-dependent inner product that enables optimal conditioning of Algorithm I (SK) as a special case.

Summarizing, to construct bi-orthonormal 2 × 2 real block-polynomials, the initial node-weight matrix (61) needs to be transformed under similarity into the 2 × 2 real block-tridiagonal matrix (62), after which the bi-orthonormal bases are obtained using the three-term-recurrences in Theorem 26. In the next section, the benefits of these polynomials in frequency-domain system identification algorithms are demonstrated by means of experimental examples.

6. Comparison of algorithms for identification examples

In this section, the convergence properties and numerical properties of Algorithms I and II are compared for two identification examples. This constitutes Contribution C5 of the paper.

6.1. Simulation example

First, a comparison between Algorithms I and II is made for a relatively simple simulation example. This enables reliable computations in both algorithms irrespective of the polynomial basis that is used, because numerical ill-conditioning is not excessive for this example and sufficiently accurate model estimates can be computed. In particular, measurements Po(sk), k = 1, . . . , m, are generated by contaminating the frequency response of a 4th order system, representing a typical collocated motion system, with circularly complex distributed random noise having a uniform amplitude distribution over all frequencies. Fig. 1 shows the resulting measurement data. It is aimed to minimize V(θ) = Σ_{k=1}^{m} |Po(sk) − P̂(sk)|².

In Table 1, the converged results of Algorithms I and II are shown. Again, it is emphasized that due to the simplicity of the considered example, the polynomial basis that is selected does not significantly influence the outcome of the algorithms.

Table 1
Convergence behavior and numerical conditioning for the SK-algorithm and the IV-algorithm.

  Alg.      Basis        κ               V(θ⋆)     ‖∂V(θ)/∂θ^T |θ=θ⋆‖2
  I (SK)    ϕmon         1.010 · 10^6    27.915    0.229
  I (SK)    ϕOP          1.000           27.915    0.229
  II (IV)   ϕmon         5.669 · 10^15   6.003     5.588 · 10^−13
  II (IV)   ϕBP, ψBP     1.000           6.003     5.588 · 10^−13

The results in Table 1 confirm the theoretical analysis in Whitfield (1987), i.e., the fixed point of Algorithm I is not a (local) minimum of V(θ), since ‖∂V(θ)/∂θ^T |θ=θ⋆SK‖2 > 0. In contrast, the fixed point of Algorithm II corresponds to an optimum of V(θ), since the first order optimality condition holds, i.e., ‖∂V(θ)/∂θ^T |θ=θ⋆IV‖2 ≈ 0. In accordance with this result, Algorithm II yields a smaller criterion value than Algorithm I, i.e., V(θ⋆IV) < V(θ⋆SK), see also Fig. 2. In fact, whereas the SK-algorithm yields a biased estimate due to the noise effects that are present in the data, the IV-algorithm successfully captures all relevant dynamical behavior of the underlying 4th order system, as is shown in Fig. 1.

Fig. 2. Convergence behavior of the SK-algorithm, the IV-algorithm, and the GN-algorithm started after 5 SK-iterations.

Remark 28. To ensure convergence to a (local) optimum, it is common to apply a gradient-based search after convergence of the SK-algorithm. In particular, the SK-algorithm is commonly used to initialize (damped) Gauss–Newton (GN) iterations, see, e.g., Bayard (1994). Indeed, in the considered example, the same minimum of the cost function as obtained using the IV-algorithm can also be obtained by means of GN-iterations, initialized with the SK-algorithm. However, this goes at the expense of a significant increase of the number of required iterations, hence, a larger computation time, see Fig. 2.

Next, the numerical properties associated with both frequency-domain identification algorithms are investigated. When the monomial basis, see Example 2, is used in Algorithm I, then the maximum condition number during the iterations is κ(W1 Φ) = 1.010 · 10^6. For Algorithm II, the maximum condition numbers are approximately quadratically larger, confirming Result 7. For this example, it even holds that κ(Ψ^H W2^H W1 Φ) = 5.669 · 10^15 > κ(W1 Φ)². The results in this paper enable a numerically reliable implementation of both frequency-domain identification algorithms. Importantly, by using two sets of bi-orthonormal polynomials with respect to the bi-linear form (30), optimal numerical conditioning of Algorithm II is achieved, i.e., κ(Ψ^H W2^H W1 Φ) = 1. In the special case where W2 = W1, the bi-orthonormal polynomials introduced in this paper coincide into a single set of orthonormal polynomials with respect to the data-dependent inner product (65). Indeed, such a polynomial basis leads to optimal conditioning in Algorithm I, i.e., κ(W1 Φ) = 1, see also Bultheel, Van Barel, Rolain, and Pintelon (2005). It is emphasized that the key new aspect in this paper is the combination of the enhanced convergence properties of Algorithm II with optimal numerical conditioning in frequency-domain identification.

6.2. Experimental example

In this section, the frequency-domain identification algorithm that is proposed in this paper is applied to an industrial motion system. In particular, the prototype wafer stage in Fig. 3 is considered. A wafer stage is part of lithographic IC manufacturing devices, used to position ICs with respect to a light source. Market viability of IC manufacturing devices requires nanometer positioning accuracy as well as a high throughput. Since a high throughput demands large accelerations, next-generation wafer stages are designed to be lightweight. However, as a consequence, these stages tend to show structural deformations, which obstruct the desired positioning accuracy. Typically, such deformations manifest in all degrees of freedom of the wafer stage. Therefore, multivariable, model-based control design is indispensable to achieve high positioning accuracy, see also Oomen et al. (2014). A corner stone for successful multivariable control design is a reliable algorithm for parametric identification, which captures all relevant flexible dynamical behavior of the system with high accuracy.

Fig. 3. Lightweight positioning platform used in IC manufacturing machines.

The wafer stage in Fig. 3 is operated contactless on the basis of magnetic levitation. As a consequence, there are six motion degrees of freedom (DOFs) of the system, viz. three translations and three rotations. In this paper, the behavior of the wafer stage for vertical translations is investigated. In particular, the relation between forces Fz [N] that are applied in the vertical direction and resulting displacements z [m] in the vertical direction is considered. Fig. 4 shows a frequency response function (FRF) identification Po(sk), sk = ȷ · 2π · [1, 3, . . . , 2449] rad/s. Next, a 16th order parametric transfer function model P̂(s) of the system's behavior is estimated. A maximum likelihood criterion is considered, see McKelvey (2002, Sect. 3), Pintelon and Schoukens (1990) and Pintelon and Schoukens (2001, Chap. 7). To this end, (1) is minimized with W(sk) = 1/σPo(sk), where σ²Po(sk) is the variance on the non-parametric FRF data as shown in Fig. 4. Hence,

  VML(θ) := Σ_{k=1}^{m} | (Po(sk) − P̂(sk, θ)) / σPo(sk) |².   (66)

To optimize the maximum likelihood criterion (66), Algorithm II is employed. During the iterations, κ(Ψ^H W2^H W1 Φ) < 1.008, i.e., the use of bi-orthonormal polynomials yields optimal numerical conditioning. This facilitates reliable, accurate computations. In contrast, when the monomial basis is used, the worst-case condition number that is observed equals κ(Φmon^H W2^H W1 Φmon) = 1.181 · 10^98.
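The criterion (66) is a variance-weighted least-squares cost on the FRF data and can be evaluated directly. The sketch below uses synthetic stand-in data (the model, frequency grid, and noise levels are placeholders, not the wafer-stage measurements):

```python
import numpy as np

def V_ML(Po, Phat, sigma):
    """Maximum likelihood criterion (66): variance-weighted squared FRF misfit."""
    return np.sum(np.abs((Po - Phat) / sigma) ** 2)

# Synthetic stand-in data: a lightly damped second-order resonance evaluated on
# an odd-line frequency grid (placeholder values only).
s = 1j * 2 * np.pi * np.arange(1, 100, 2)
P = lambda x: 1.0 / (x**2 + 0.4 * x + 100.0)
sigma = 1e-4 * np.ones(s.size)                 # FRF standard deviation per bin

rng = np.random.default_rng(1)
noise = sigma * (rng.standard_normal(s.size) + 1j * rng.standard_normal(s.size))
Po = P(s) + noise                              # noisy FRF "measurement"

print(V_ML(Po, P(s), sigma))   # on the order of 2m when the model is exact
```

When the residuals are consistent with the assumed variance, this cost is chi-squared distributed with roughly two degrees of freedom per frequency bin, which provides a quick plausibility check of the noise model.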

12

R. van Herpen et al. / Automatica (

)



Thus, the iterations are performed with high reliability, leading to an accurate parametric model of the system under study. Although the material that is presented in this paper is concentrated on identification of single-input single-output systems, all obtained results can be generalized towards multivariable systems. For this purpose, a system representation in the form of a matrix fraction description can be employed, as is also considered in Blom and Van den Hof (2010). First results are presented in van Herpen (2014, Chapter 4 and Appendix A). In practice, both Algorithms I and II show convergence to a stationary point for most data sets, especially when the signal to noise ratio is sufficiently high. Indeed, both frequency-domain algorithms are widely applied for system identification (Pintelon et al., 1994). Important convergence aspects have been further studied in detail in Regalia, Mboup, and Ashari (1997) and Söderström and Stoica (1988). Acknowledgments This paper is dedicated to the late Prof. Okko Bosgra, whose encouraging, critical view has been invaluable for this research. The authors thank Joris Termaat, Wouter Aangenent, and Marc van de Wal (ASML Research) for their contribution to the experimental work. Fig. 4. Measured m = 1250 point FRF Po (dotted) with corresponding variance (dashed), and n = 16th order parametric model Pˆ (s) (solid).

possible to compute an accurate model after appropriate scaling, such high condition numbers lead to a numerical breakdown for multivariable systems such as the one presented in Oomen and Steinbuch (to appear) and Oomen et al. (2014). 3 ⋆ Upon convergence  of the algorithm, V (θIV ) = 1.727 · 10 ,

 ∂ V (θ) 

whereas  ∂θ T θ=θ ⋆  = 8.5831 · 10−5 ≈ 0, i.e., a (local) minIV 2 imum of the cost function is attained. Indeed, in Fig. 4 it is observed that the obtained model describes the system’s behavior with high accuracy, demonstrating the effectiveness of Algorithm II for frequency-domain system identification.



7. Conclusions In this paper, a new algorithm for frequency-domain system identification is presented that combines advantageous convergence properties in frequency-domain instrumental variable (IV) identification with computational algorithms that yield optimal numerical conditioning. The key novel technical result of this paper is the introduction of two bi-orthonormal polynomial bases with respect to a data-dependent bi-linear form in system identification. By using these bi-orthonormal bases in the IV-algorithm, viz. one polynomial basis to parameterize the to-be-estimated model and another polynomial basis to parameterize the instrument, optimal numerical conditioning is achieved. In addition, optimal conditioning for the classical SK-algorithm is retrieved as a special case, in which the two polynomial bases coincide into a single basis that is orthonormal with respect to a data-dependent inner product. A simulation example confirms that the convergence behavior of the IV-algorithm is beneficial in frequency-domain identification. In particular, for the considered example, the SK-algorithm suffers from a bias that is induced by the noise on the measurement data. The IV-algorithm outperforms the SK-algorithm and yields an unbiased estimate. Furthermore, the IV-algorithm is successfully applied for identification of the dynamical behavior a highprecision motion system. Indeed, the use of bi-orthonormal polynomials provides optimal numerical conditioning in all iterations.

Robbert van Herpen received the M.Sc. degree in Electrical Engineering (cum laude, 2009) and the Ph.D. degree (2014) from Eindhoven University of Technology, The Netherlands. His Ph.D. research was conducted with the Control Systems Technology group of the faculty of Mechanical Engineering. The title of his thesis is ''Identification for control of complex motion systems: Optimal numerical conditioning using data-dependent polynomial bases''. The main contribution of this work is the development of numerically reliable algorithms for the identification of systems of large dynamical complexity. This forms a cornerstone towards control of lightweight motion systems using a very large number of actuators and sensors.

Tom Oomen received the M.Sc. degree (cum laude) and Ph.D. degree from the Eindhoven University of Technology, Eindhoven, The Netherlands, in 2005 and 2010, respectively. He was awarded the Corus Young Talent Graduation Award in 2005. He held visiting positions at KTH, Stockholm, Sweden, and at The University of Newcastle, Australia. Presently, he is an assistant professor with the Department of Mechanical Engineering at the Eindhoven University of Technology. His research interests are in the field of identification and robust control, with main applications in high-tech motion systems.

Maarten Steinbuch is a full professor and the head of the Control Systems Technology group at the Mechanical Engineering Department of Eindhoven University of Technology. He received the Ph.D. degree from Delft University of Technology, The Netherlands, in 1989. His research interests are in modeling, design and control of motion systems and robotics, automotive power trains and control of plasma fusion. He is now the Editor-in-Chief of IFAC Mechatronics, the Scientific Director of the Centre of Competence High Tech Systems of the Federation of Dutch Technical Universities, and the Director of the TU/e Graduate Program Automotive Systems. In 2013 he was appointed Distinguished University Professor at Eindhoven University of Technology.