Proceedings of the 15th IFAC Symposium on System Identification, Saint-Malo, France, July 6-8, 2009

ANALYSIS OF THE NONLINEAR INDUCED VARIANCE IN LINEAR SYSTEM IDENTIFICATION

J. Schoukens, K. Barbé, L. Vanbeylen, R. Pintelon

Vrije Universiteit Brussel, Department ELEC, Pleinlaan 2, B1050 Brussels, Belgium

Abstract - This paper analyses the variance of linear models that are identified in the presence of nonlinear distortions. The best linear approximation G_BLA to a nonlinear system, driven by a Gaussian distributed random input, is identified in least squares sense. Even in the absence of disturbing noise, the estimated model Ĝ_BLA will vary from one realization of the excitation signal to the other. In this paper it will be shown that the variance of the nonparametric frequency response function (FRF) Ĝ_BLA(k) can still be obtained from the experimental data using the classical variance formulas. However, for parametric estimates Ĝ_BLA(q, θ), an increased variance is observed. The terms contributing to this additional variance are studied in this paper. Copyright © 2008 IFAC

Keywords: Variance, modelling errors, nonlinear distortions, linear approximation, identification

1. INTRODUCTION

The classical linear system identification methods identify a parametric plant and (non-)parametric noise model from the experimental data u_0(t), y(t):

y(t) = G_0(q)u_0(t) + H_0(q)e(t) = y_0(t) + v(t),   (1)

with q^{-1} the backward shift operator (Ljung, 1999a; Söderström and Stoica, 1989). The noise term v(t) = H_0(q)e(t) models the power spectrum of all unmodelled output contributions, and it is assumed to be independent of the input u_0 (Ljung, 1999a, p. 250). Starting from the identified noise models, the variance on the model parameters or other model characteristics is calculated, making explicit use of the independence assumption on u_0, v. However, in most real-life problems the system to be modelled is not perfectly linear: nonlinear distortions are also present. This leads to the following generalized problem setting. Consider a (dynamic) nonlinear, time-invariant system, driven by a Gaussian stationary input u_0(t):

y_0(t) = g(u_0(t)),   (2)

alternatively represented by its best linear approximation G_BLA(q)u_0(t) plus an error term y_S(t):

y_0(t) = G_BLA(q)u_0(t) + y_S(t).   (3)

G_BLA(q) is the best linear approximation of the nonlinear system, in mean squares sense (Pintelon and Schoukens, 2001; Enqvist, 2005):

G_BLA(q) = arg min_G E{ |y_0(t) - G(q)u_0(t)|^2 },   (4)

where the expected value E{·} is taken with respect to the random input u_0(t). The error term y_S, which accounts for the difference between the output of the best linear approximation and the actual nonlinear output, looks like disturbing noise, and it is uncorrelated with u_0 because it is the residual of the solution of a least squares problem. But the nonlinear relation y_S(t) = g(u_0(t)) - G_BLA(q)u_0(t) creates a dependency on u_0(t). The properties of y_S are studied intensively in Schoukens et al. (1998, 2005) and Pintelon and Schoukens (2001) for the class of Gaussian-like excitations, specified in more detail later in this paper. In many cases it is difficult for an (untrained) user to distinguish the nonlinear residual y_S(t) from the 'classical' independent disturbing noise v(t) in the linear identification framework.

In this paper we analyse how the variance of the linear models is affected by the presence of the input-dependent nonlinear errors. Is the variance that is calculated under the independence assumption still valid in the nonparametric and parametric linear modelling framework? The surprising result will be that this is still the case for the nonparametric FRF estimate Ĝ_BLA(k) of a nonlinear dynamic system. However, for parametric estimates Ĝ_BLA(q, θ), a larger variance is observed than that calculated in the simplified framework that is based on the independence of u_0 and v(t).

In order to illustrate the unexpected behaviour very clearly, we take in this paper the extreme point of view that no process and measurement noise is present: the exact input and output signals u_0(t), y_0(t) are available to the user. The variance of the estimation results in that case will be completely due to the fact that the model errors depend upon the actual realization of the random input. This paper studies the factors that contribute to the variance of the estimated best linear approximation, and compares it to the situation where the noise is independent of the input.

The problem of identifying linear models for nonlinear systems in the absence of disturbing noise is also intensively studied by Mäkilä (2004, 2006) and Mäkilä and Partington (2004), but in these papers the validity of the uncertainty bounds is not addressed. Another attempt to generate more reliable error bounds in the presence of model errors is discussed in the 'model-error-modeling' (MEM) framework (Ljung, 1999b and 2000; Reinelt et al., 2002; Hildebrand and Gevers, 2004). Most of the MEM papers fit in the linear identification framework: the problem of unmodelled dynamics due to the use of a low order linear model to estimate a higher order linear system is studied. In this paper we assume that the order of the linear model of G_BLA is complex enough to capture all linear dynamics, so that this problem is no point of discussion here. The MEM papers that also include nonlinear model errors can be split in two groups. The first group (the minority) considers explicitly a nonlinear model-error-model, while in the second group the model-error-models are still confined to be linear. The first group is out of the scope of this paper; we consider explicitly linear models and their behaviour in the presence of nonlinear distortions. The problem of the second group of papers fits in the scope of this paper: improved uncertainty bounds are generated, still using the expressions of the linear framework, which will be shown to underestimate the variability.

From earlier studies of the impact of model errors on the variance (Pintelon et al., 2003; Hildebrand and Gevers, 2004), it is known that the model errors do not contribute to the variance for deterministic inputs, which do not change from one experiment to the other. However, for random inputs, which do vary from one experiment to the other, the model errors contribute directly to the observed variance. In the case of unmodelled linear dynamics, correction terms are provided, depending upon the 4th order moments of the random input signal. For nonlinear model errors, also the higher order moments of the input contribute. These contributions are completely missed by the 'linear' correction terms. The study of this problem is the topic of this paper.

We start in Section 2 with a very simple but motivating example. Next, the formal setup is given in Section 3, followed by a theoretical analysis of the problem in Sections 4 and 5. A simulation example is given in Section 6, and finally the conclusions are drawn.

2. A MOTIVATING EXAMPLE

Consider the static nonlinear system y_0(t) = u_0(t)^3, that is excited with zero mean white Gaussian noise with unit variance: u_0(t) ~ N(0, 1). For Gaussian excitations, the best linear approximation to the nonlinear static system is also a static system (Bussgang, 1952): y_BLA(t) = a_BLA u_0(t).

For the specified excitation, it is easy to show that

a_BLA = E{ y_0(t)u_0(t) } / E{ u_0(t)^2 } = µ_4 / µ_2 = 3,   (5)

with µ_k the k-th moment of u_0, so that y_S(t) = y_0(t) - 3u_0(t), with σ²_{yS} = 6.

The reader should notice that although y_S(t) is not correlated with u_0(t), both random variables are dependent. First we analyse the parametric estimate, next we discuss the nonparametric FRF estimate.

2.1 Parametric estimate y_0 = â u_0

The least squares estimate â is:

â = ( (1/N) Σ_{k=1}^{N} y_0(k)u_0(k) ) / ( (1/N) Σ_{k=1}^{N} u_0(k)^2 ).   (6)

i) Simple variance calculation: The variance of â, for N data points, assuming that u_0 and y_S are independent, as a practising engineer in the classic linear identification framework often does (Ljung, 1999a and 1999b), is:

lim_{N→∞} N σ²_â = σ²_{yS} / µ_2 = 6.   (7)

ii) Exact variance calculation: The exact limit variance of â, accounting for the dependency, is:

σ²_â ≈ (1/N) (µ_8 + 9µ_4 - 6µ_6) / µ_2^2 = 42/N.   (8)

Conclusion: This simple example shows that the presence of the higher order correlations between y_S and u_0 results in an increased variance.
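As a quick numerical illustration (the record length and the number of Monte-Carlo runs below are arbitrary choices of this sketch), the following short simulation estimates â of (6) over many independent realizations of u_0 and compares N·var(â) with the two predictions (7) and (8).

```python
import numpy as np

# Minimal Monte-Carlo check of the motivating example: y0 = u0^3, u0 ~ N(0,1).
# The record length N and the number of runs are arbitrary illustration choices.
rng = np.random.default_rng(0)
N, runs = 1000, 20000

a_hat = np.empty(runs)
for r in range(runs):
    u0 = rng.standard_normal(N)
    y0 = u0**3
    # Least squares estimate (6): a_hat = sum(y0*u0) / sum(u0^2)
    a_hat[r] = np.dot(y0, u0) / np.dot(u0, u0)

print("mean of a_hat            :", a_hat.mean())     # close to a_BLA = 3
print("N * var(a_hat) (observed):", N * a_hat.var())  # close to 42, see (8)
print("prediction ignoring the dependency, eq. (7):", 6.0)
```

With these settings the observed value settles close to 42 rather than 6, in agreement with (8).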

2.2 Nonparametric FRF estimate

A similar analysis can be made for the nonparametric frequency response function (Schoukens, 2008), and here the surprising conclusion is that the classical variance expressions for nonparametric FRF estimates still give a correct estimate. The intuitive explanation for this result is that the dependency is spread out over all N frequencies. Consider the DFT spectra of the signals, for example:

U_0(l) = (1/√N) Σ_{t=0}^{N-1} u_0(t) e^{-j2πlt/N}.   (9)

Relating the measurements frequency by frequency results in a dependency per frequency between U_0(l), Y_S(l) that decays as an O(N^{-1}). The reader should not conclude from this result that a frequency domain method that builds a parametric model from the estimated FRF Ĝ_BLA(jω_k) would not suffer from the under-estimation of the variance. Once the N individual nonparametric estimates Ĝ_BLA are recombined to estimate a parametric model, exactly the same problems appear as reported for the time domain analysis.
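The O(N^{-1}) decay of the per-frequency dependence can be probed numerically. One simple, non-exhaustive indicator is the correlation between |U_0(l)|² and |Y_S(l)|² at a fixed DFT bin, estimated over many realizations of the cubic example of Section 2; the bin index, the record lengths and the number of runs in the sketch below are arbitrary illustration choices.

```python
import numpy as np

# Scaled DFT as in (9): U0(l) = N**-0.5 * sum_t u0(t) * exp(-1j*2*pi*l*t/N).
def dft_scaled(x):
    return np.fft.fft(x) / np.sqrt(len(x))

# One simple (by no means exhaustive) indicator of the per-frequency dependence
# between U0(l) and YS(l): the correlation between |U0(l)|^2 and |YS(l)|^2 at a
# fixed bin l, estimated over many realizations of the cubic example. It should
# shrink with growing N, consistent with the O(1/N) decay stated in the text.
rng = np.random.default_rng(1)
runs, l = 20000, 5
for N in (32, 128):
    p_u = np.empty(runs)
    p_s = np.empty(runs)
    for r in range(runs):
        u0 = rng.standard_normal(N)
        ys = u0**3 - 3.0 * u0                  # nonlinear residual y_S(t)
        p_u[r] = np.abs(dft_scaled(u0)[l])**2
        p_s[r] = np.abs(dft_scaled(ys)[l])**2
    print("N =", N, " corr(|U0|^2, |YS|^2) =", np.corrcoef(p_u, p_s)[0, 1])
```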

3. FORMAL SETUP

Due to the restricted length of this conference contribution, it is impossible to give all theoretical and technical details. For that reason we state only the main definitions and results, without any proof. More details can be found in (Schoukens, 2008).

3.1. Class of excitation signals

In this paper, we consider zero mean Gaussian excitations, extended with random multisines.

Definition 1: Class of random multisine excitations with period N: a random multisine excitation is

u_0(t) = (N-2)^{-1/2} Σ_{k=-N/2+1, k≠0}^{N/2-1} U_k e^{j(2πkt/N + ϕ_k)},   (10)

with ϕ_{-k} = -ϕ_k. The phases ϕ_k are independently distributed on the interval [0, 2π[, such that E{ e^{jϕ_k} } = 0. The amplitude U_k is a random variable with

E{ |U_k|^2 } = S_{U0U0}(k f_max / N),   (11)

and U_k is independent of U_l for k ≠ l. U_l is also independent of the phases ϕ_k ∀k. The amplitudes are set by the power spectrum S_{U0U0}(f) of the equivalent Gaussian random noise excitation. S_{U0U0}(f) is a piecewise continuous function of f with a countable number of discontinuities. The higher order moments of U_k should remain bounded for any finite order. The maximum excited frequency is f_max. The reader should note that: i) the choice of the power spectrum S_{U0U0} is a free user choice; ii) a special choice for the amplitudes U_k is to select them to be deterministic: U_k = (S_{U0U0}(k f_max / N))^{1/2}. We call this class 'random phase multisines', and it is a subset of the class of random multisines.

Definition 2: Class of excitation signals S_E: A signal u_0(t) belongs to S_E if it is either a random multisine excitation (see Definition 1) or a zero mean Gaussian noise excitation with power spectrum S_{U0U0}.

3.2. Class of systems

In this paper we consider Wiener systems: these are nonlinear systems that can be approximated in mean square sense by a Volterra system for Gaussian excitations with a user defined power spectrum (Schetzen, 1980). The major property of these systems for our purpose is that the steady state response to a periodic input is periodic with the same period as the input. This allows all calculations to be made on the DFT spectra of the signals.

Definition 3: The set of systems S_G is the set of systems y_0(t) = g(u_0(t)) that can be approximated in mean square sense by a convergent Volterra system for u_0 ∈ S_E.

3.3. Using linear models in the presence of nonlinear distortions

In this section, we will extend and refine relation (3) by giving more insight in the behaviour of y_S(t) for variations from one realization of the random input to another:

y_0(t) = G_BLA(q)u_0(t) + G_{S,A}(q)u_0(t) + y_{S,φ}(t) = G_BLA(q)u_0(t) + y_{S,A}(t) + y_{S,φ}(t),   (12)

with y_S(t) = y_{S,A}(t) + y_{S,φ}(t). The first contribution y_{S,A}(t) is only due to variations of the input amplitude spectrum from one realization of the input to the other; the random phases ϕ_k of the input do not

appear in these terms. The other contribution y_{S,φ}(t) is due to the variations of the random phases of the input. In the frequency domain the relations are:

Y_0(k) = G_BLA(jω_k)U_0(k) + G_{S,A}(jω_k)U_0(k) + Y_{S,Φ}(k) + T_G(jω_k) = G_BLA(jω_k)U_0(k) + Y_{S,A}(k) + Y_{S,Φ}(k) + T_G(jω_k).   (13)

The transient term T_G(jω_k) = O(N^{-1/2}) accounts for the leakage effect (Pintelon and Schoukens, 2001). It can be shown that these effects can be neglected in this study, because T_G(jω_k) is only weakly correlated with the input.

4. STUDY OF THE UNCERTAINTY ON THE FRF Ĝ_BLA(jω_k) FOR A VOLTERRA SYSTEM

Consider the nonparametric FRF

Ĝ_BLA(jω_l) = Ŝ_{Y0U0}(l) / Ŝ_{U0U0}(l),   (14)

calculated from M sub-blocks with length N each, at those frequencies for which S_{U0U0}(k) ≥ ε > 0, with ε a user selected value. Its variance is

lim_{M→∞} M σ²_{Ĝ_BLA}(k) = E{ |Ŝ_{YSU0}(k)|^2 } / E{ Ŝ_{U0U0}(k) }^2 = E{ |Ŝ_{YSU0}(k)|^2 } / S_{U0U0}(k)^2.   (15)

In the linear system identification theory, E{ |Ŝ_{YSU0}(k)|^2 } = S_{YSYS}(k) S_{U0U0}(k), but because of the dependency of y_S(t) on u_0(t), this step is no longer allowed without additional analysis. It can be shown (Schoukens, 2008) that for the nonparametric case nothing changes, which leads eventually to the (surprising) result of Theorem 1.

Theorem 1: The variance on the FRF. The variance σ²_{Ĝ_BLA}(k) of the FRF of the best linear approximation Ĝ_BLA(jω_k) of a nonlinear system in S_G, excited with u_0(t) ∈ S_E, is:

lim_{M→∞} M σ²_{Ĝ_BLA}(k) = S_{YSYS}(k) / S_{U0U0}(k) + O(N^{-1}) = S_{YS,Φ YS,Φ}(k) / S_{U0U0}(k) + O(N^{-1}),   (16)

with M the number of averaged realizations, and N the length of each subrecord. From this theorem, it can be seen that the random phase contributions dominate the impact of the amplitude variations on the variance of the FRF.

5. STUDY OF THE VARIANCE OF THE PARAMETRIC ESTIMATE Ĝ_BLA(θ)

In this section, the variance of the parametric estimate Ĝ_BLA(θ) is calculated. Ĝ_BLA(θ) is obtained by solving the following least squares problem:

θ̂ = arg min_θ Σ_{t=0}^{N-1} ( y_0(t) - G(q, θ)u_0(t) )^2,   (17)

or in the frequency domain

θ̂ = arg min_θ Σ_{k=-N/2}^{N/2} | (G_BLA(jω_k) - G(k, θ))U_0(k) + Y_{S,A}(k) + Y_{S,Φ}(k) |^2 = arg min_θ V_N(θ, U_0),   (18)

with G(k, θ) a shorthand notation for the parametric model G(q, θ), evaluated at the normalized frequency 2πk/N. A noise weighting is omitted for simplicity in (17) and (18). The parametric estimate G(q, θ̂) is a consistent estimate of G_BLA (Pintelon and Schoukens, 2001; Enqvist, 2005). Its variance is discussed below.

In Fig. 1 the contributions Y_S are split in two parts, due to the nonlinear disturbances Y_{S,A} and Y_{S,Φ} respectively. Although Y_{S,A} = O(N^{-1/2}) and Y_{S,Φ} = O(N^0), it is shown that both disturbances give variance contributions of O(N^{-1/2}).

Only a fraction of these disturbances, completely belonging to Y_{S,Φ}, behaves equivalently to independent noise, and it is only this contribution that is captured by the variance calculated from the linear system identification theory. The remaining part is due to the term Y_{S,A}, which increases the variance of G_BLA, and to the dependency of y_{S,Φ}(t) on u_0(t), which effectively creates higher order moments during the variance calculations. These contributions are missed by the linear system identification methods.

Using a multisine with a deterministic amplitude spectrum eliminates completely the (a) contributions (see Theorem 2 below). Moreover, for dynamic nonlinear systems, the (b) contributions dominate over the (c) terms.
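For reference, the sketch below generates a random phase multisine in the spirit of Definition 1 and eq. (10), with a flat deterministic amplitude spectrum. The excited band and the scaling (fixing the rms value to one instead of using the (N-2)^{-1/2} convention of (10)) are choices of this sketch only.

```python
import numpy as np

# A random phase multisine in the spirit of Definition 1 / eq. (10):
# deterministic (here flat) amplitudes U_k and independent uniform phases.
# N, the excited band and the rms scaling below are illustration choices.
def random_phase_multisine(N, k_max, rng):
    phases = rng.uniform(0.0, 2.0 * np.pi, size=k_max)  # phi_k, k = 1..k_max
    t = np.arange(N)
    u = np.zeros(N)
    for k in range(1, k_max + 1):
        # U_k = 1 (flat amplitude spectrum); the pair +/-k gives a real cosine
        u += 2.0 * np.cos(2.0 * np.pi * k * t / N + phases[k - 1])
    # normalise so that every realization has exactly the same rms value
    return u / np.sqrt(2.0 * k_max)

rng = np.random.default_rng(2)
u0 = random_phase_multisine(N=1024, k_max=300, rng=rng)
print("rms value, identical for every realization:", np.std(u0))
```

Because the amplitude spectrum does not change from one realization to the other, such a signal removes the Y_{S,A} type (a) contributions discussed above.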

Fig. 1. Structure of the nonlinear contributions to the variability σ²_{Ĝ_BLA}(k) of the parametric estimate of the best linear approximation Ĝ_BLA(θ): Y_{S,A}, the variations due to the random amplitude spectrum |U_0|, gives the (a) contributions (variances 'missed' in the linear S.I. framework); Y_{S,Φ}, the variations due to the random phases ∠U_0, splits into (b) independent behaviour (variances obtained from the linear S.I. framework) and (c) dependent behaviour (variances 'missed' in the linear S.I. framework).

Theorem 2: Variance of the parametric estimate Ĝ_BLA(jω, θ). The covariance matrix C_θ of the least squares estimate (17) of the parametric model G_BLA(jω, θ) of a nonlinear system in S_G, excited with u_0(t) ∈ S_E, is:

N C_θ ≈ [V_N'']^{-1} Q [V_N'']^{-1}, with Q = Q_{YS,A} + Q_{YS,Φ},   (19)

where V_N'' is the second derivative of V_N with respect to the parameters, and Q_{YS,A} is the variability due to the variations of the auto-spectrum S_{U0U0} of u_0 from one realization to the other. Only the odd nonlinearities contribute to this term if S_{U0U0}(0) = 0.

Proof: see Schoukens (2008).

6. SIMULATION EXAMPLE

The theory is illustrated in the discrete time domain on a Wiener-Hammerstein system:

y(t) = G_2(q)s(t), with s(t) = atan(2r(t)) and r(t) = G_1(q)u_0(t),   (20)

where G_1, G_2 are linear dynamic systems. The excitation signal u_0(t) is a zero mean Gaussian noise sequence filtered by a 5th order Butterworth filter G_gen with a cut-off frequency of 0.3. The mean and variance of the input were µ_{u0} = 0 and σ²_{u0} = 1, but as explained before, the sample mean and variance of a single realization vary around these values. The linear systems G_1, G_2 are Chebyshev filters of order 2, with a ripple of 5 dB and a cut-off frequency of 0.09 and 0.15 respectively. One thousand simulations were made with a length of 1024 samples each. For each simulation, the initial states of the systems were zero. Three different excitation signals were used: i) x_1 is a realization of u_0; ii) x_2 = x_1 / σ̂_{x1}; iii) x_3 is a multisine with a flat deterministic amplitude spectrum up to the cut-off frequency of the noise generating filter G_gen. The DC value is set to zero.

Using a Box-Jenkins method, a linear dynamic system of order 4 (4 poles, 4 zeros) is identified, together with a noise model of order 5. This is repeated 1000 times. The mean value of the estimated models coincided with the theoretical expected value, which is proportional to G_1(q)G_2(q). The standard deviation on the amplitude of the estimated models is shown in Fig. 2 and compared to that derived from the Box-Jenkins model. It can be seen that the standard deviation depends strongly on the nature of the excitation. As predicted by the theory, the standard deviation for the multisine is close to the theoretical value of the Box-Jenkins model, while for a Gaussian excitation an additional loss of up to 6 dB can be observed. The normalized Gaussian excitation has a standard deviation that is in between both extremes, as expected.

Fig. 2. Comparison of the standard deviation on the amplitude of the estimated G_BLA(θ̂) with the standard deviation calculated from the Box-Jenkins model (amplitude in dB versus frequency; the bottom plot is a zoom of the top plot). ...: Gaussian excitation; --: normalized Gaussian excitation; ___: multisine excitation; gray line: std. dev. of the Box-Jenkins model obtained via the classical theory, ignoring the dependency between u_0 and y_S.
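A stripped-down version of this simulation can be set up along the following lines. The sketch uses scipy's standard filter-design routines for G_gen, G_1 and G_2 (the normalization of the cut-off frequencies with respect to the sampling frequency is an assumption of this sketch), and it estimates the BLA nonparametrically by averaging cross- and auto-spectra over realizations instead of fitting the full Box-Jenkins model.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(3)
N, runs = 1024, 200            # record length as in the paper; fewer runs here

# Filters; cut-offs are taken relative to the sampling frequency (scipy expects
# them relative to Nyquist, hence the factor 2) -- an assumption of this sketch.
b_gen, a_gen = signal.butter(5, 2 * 0.3)      # G_gen: 5th order Butterworth
b1, a1 = signal.cheby1(2, 5, 2 * 0.09)        # G1: Chebyshev, 5 dB ripple
b2, a2 = signal.cheby1(2, 5, 2 * 0.15)        # G2: Chebyshev, 5 dB ripple

def wiener_hammerstein(u0):
    r = signal.lfilter(b1, a1, u0)            # r(t) = G1(q) u0(t)
    s = np.arctan(2.0 * r)                    # static nonlinearity
    return signal.lfilter(b2, a2, s)          # y(t) = G2(q) s(t)

# Nonparametric BLA over realizations: G_BLA(k) ~ E{Y0 U0*} / E{|U0|^2}
Syu = np.zeros(N // 2 + 1, dtype=complex)
Suu = np.zeros(N // 2 + 1)
for _ in range(runs):
    u0 = signal.lfilter(b_gen, a_gen, rng.standard_normal(N))
    u0 = u0 / np.std(u0)                      # the 'normalized Gaussian' case x2
    y0 = wiener_hammerstein(u0)
    U0, Y0 = np.fft.rfft(u0), np.fft.rfft(y0)
    Syu += Y0 * np.conj(U0)
    Suu += np.abs(U0) ** 2
G_bla = Syu / Suu                             # averaged FRF estimate
print("BLA gain at a low excited frequency (dB):", 20 * np.log10(np.abs(G_bla[5])))
```

Repeating this for the three excitation types and comparing the spread of the estimates over independent batches gives the kind of comparison reported in Fig. 2.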

7. CONCLUSION

The contributions to the variance of linear models for nonlinear systems, excited by random excitations, are studied. Even in the absence of disturbing noise, the identified linear model varies due to variations of the random input from one realization to the other. Additional contributions to the variance are present, because of the dependency of the output errors of the linear model on the input. The variance of the nonparametric frequency response function (FRF) Ĝ_BLA(jω_k) is correctly calculated with the classical formulas, which validates the common engineering practice for this generalized setting. However, for parametric estimates Ĝ_BLA(jω, θ), the variance of the identified models is under-estimated by the bounds derived in the linear identification framework under the independence assumption, which creates a potential danger when these bounds are used in practice. In order to be sure that the calculated variances are reliable, it is crucial to check for the presence of nonlinear distortions. Without such a test, there is a risk that the obtained uncertainty bounds are far too optimistic. Only when the disturbing noise dominates the nonlinear distortions are the variance calculations of the linear framework reliable estimates for the true variance.

REFERENCES

Bendat, J.S. and Piersol, A.G. (1980). Engineering Applications of Correlations and Spectral Analysis. Wiley, New York.
Bussgang, J.J. (1952). Crosscorrelation functions of amplitude-distorted Gaussian signals. Technical Report 216, MIT Laboratory of Electronics.
Enqvist, M. (2005). Linear Models of Nonlinear Systems. PhD Thesis No. 985, Institute of Technology, Linköping University, Sweden.
Enqvist, M. and Ljung, L. (2005). Linear approximations of nonlinear FIR systems for separable input processes. Automatica, 41(3), 459-473.
Hildebrand, R. and Gevers, M. (2004). Quantification of the variance of estimated transfer functions in the presence of undermodeling. IEEE Transactions on Automatic Control, 49, 1345-1349.
Kendall, M. and Stuart, A. (1987). Distribution Theory, vol. 1 of The Advanced Theory of Statistics (5th ed.). Charles Griffin, London.
Ljung, L. (1999a). System Identification: Theory for the User (2nd ed.). Prentice Hall, Upper Saddle River, New Jersey.
Ljung, L. (1999b). Model validation and model error modeling. In B. Wittenmark and A. Rantzer (Eds.), Proceedings of the Åström Symposium on Control, Studentlitteratur, Lund, Sweden, pp. 15-42.
Ljung, L. (2000). Model error modeling and control design. In Proceedings of the IFAC Symposium SYSID, Santa Barbara, CA, USA (pp. WeAM13).
Mäkilä, P.M. (2004). On optimal LTI approximation of nonlinear systems. IEEE Transactions on Automatic Control, 49(7), 1178-1182.
Mäkilä, P.M. (2006). LTI approximation of nonlinear systems via signal distribution theory. Automatica, 42, 917-928.
Mäkilä, P.M. and Partington, J.R. (2004). Least-squares LTI approximation of nonlinear systems and quasistationarity analysis. Automatica, 40(7), 1157-1169.
Oppenheim, A.V., Willsky, A.S. and Young, I.T. (1983). Signals and Systems. Prentice-Hall International Editions.
Pintelon, R. and Schoukens, J. (2001). System Identification. A Frequency Domain Approach. IEEE Press, Piscataway.
Pintelon, R., Schoukens, J. and Rolain, Y. (2003). Uncertainty of transfer function modelling using prior estimated noise models. Automatica, 39, 1721-1733.
Reinelt, W., Garulli, A. and Ljung, L. (2002). Comparing different approaches to model error modeling in robust identification. Automatica, 38, 787-803.
Schetzen, M. (1980). The Volterra and Wiener Theories of Nonlinear Systems. Wiley and Sons, New York.
Schoukens, J., Dobrowiecki, T. and Pintelon, R. (1998). Parametric and nonparametric identification of linear systems in the presence of nonlinear distortions. A frequency domain approach. IEEE Transactions on Automatic Control, 43(2), 176-190.
Schoukens, J., Pintelon, R., Dobrowiecki, T. and Rolain, Y. (2005). Identification of linear systems with nonlinear distortions. Automatica, 41(3), 491-504.
Schoukens, J., Lataire, J., Pintelon, R. and Vandersteen, G. (2008). Robustness issues of the equivalent linear representation of a nonlinear system. Proceedings of the IEEE Instrumentation and Measurement Technology Conference, Victoria, Vancouver Island, Canada, May 15-18, 2008, pp. 332-335.
Schoukens, J. (2008). Nonlinear induced variability in linear system identification. Technical Note ([email protected]) or at http://wwwir.vub.ac.be/elec/Papers%20on%20web/Papers/JohanSchoukens/Schoukens2008TechnicalNotev8.pdf
Söderström, T. and Stoica, P. (1989). System Identification. Prentice-Hall, Englewood Cliffs.