Functionally Pooled models for the global identification of stochastic systems under different pseudo-static operating conditions

Functionally Pooled models for the global identification of stochastic systems under different pseudo-static operating conditions

Mechanical Systems and Signal Processing 72-73 (2016) 785–807 Contents lists available at ScienceDirect Mechanical Systems and Signal Processing jou...

2MB Sizes 0 Downloads 39 Views

Mechanical Systems and Signal Processing 72-73 (2016) 785–807

Contents lists available at ScienceDirect

Mechanical Systems and Signal Processing journal homepage: www.elsevier.com/locate/ymssp

Functionally Pooled models for the global identification of stochastic systems under different pseudo-static operating conditions J.S. Sakellariou a,n, S.D. Fassois a,b a Stochastic Mechanical Systems & Automation (SMSA) Laboratory, Department of Mechanical Engineering & Aeronautics, University of Patras, GR 265 04 Patras, Greece b Department of Mechanical Engineering, Khalifa University of Science, Technology & Research, (KUSTAR), PO Box 127788, Abu Dhabi, UAE

a r t i c l e i n f o

abstract

Article history: Received 28 July 2015 Accepted 11 October 2015 Available online 28 November 2015

The problem of identifying a single global model for stochastic dynamical systems operating under different conditions is considered within a novel Functionally Pooled (FP) identification framework. Within it a specific value of a measurable scheduling variable characterizes each operating condition that has pseudo–static effects on the dynamics. The FP framework incorporates parsimonious FP models capable of fully accounting for cross correlations among the operating conditions, functional pooling for the simultaneous treatment of all data records, and statistically optimal estimation. Unlike seemingly related Linear Parameter Varying (LPV) model identification leading to suboptimal accuracy in this context, the postulated FP model estimators are shown to achieve optimal statistical accuracy. An application case study based on a simulated railway vehicle under various mass loading conditions serves to illustrate the high achievable accuracy of FP modelling and the improvements over local models employed within LPV–type identification. & 2015 Elsevier Ltd. All rights reserved.

Keywords: Stochastic identification Global model identification Functional models LPV models Asymptotic analysis Railway suspensions

1. Introduction Many dynamical systems operate under different conditions that significantly affect their dynamics. Oftentimes, the operating conditions are characterized by one or more measurable variables and remain constant or vary slowly over time, thus having a pseudo-static effect on the dynamics. Typical examples include structural systems vibrating under different loading conditions, such as bridges, sea vessels and trains [1,2], structures vibrating under different environmental (for instance temperature) or boundary conditions [3,4], rotating machinery dynamics under different speeds [5], aircraft dynamics at various altitudes or flight conditions [6,7], and many more.

Abbreviation: AR, X, AutoRegressive, eXogenous; ARX, AutoRegressive with eXogenous excitation; FP–ARX, Functionally Pooled ARX; SPP, Samples per Parameter; LS, Least Squares; ML, Maximum Likelihood; RSS, Residual Sum of Squares; SSS, Signal Sum of Squares; BIC, Bayesian Information Criterion; AIC, Akaike Information Criterion; OLS, Ordinary Least Squares; WLS, Weighted Least Squares n Corresponding author. Tel./fax (direct): ( þ 30) 2610 969 494, 2610 969495; Tel./Fax (central): ( þ 30) 2610 969 492. E-mail addresses: [email protected] (J.S. Sakellariou), [email protected] (S.D. Fassois). URL: http://www.smsa.upatras.gr (J.S. Sakellariou). http://dx.doi.org/10.1016/j.ymssp.2015.10.018 0888-3270/& 2015 Elsevier Ltd. All rights reserved.

786

J.S. Sakellariou, S.D. Fassois / Mechanical Systems and Signal Processing 72-73 (2016) 785–807

Nomenclature1 Important Symbols k xk ½t yk ½t wk ½t ek ½t N na, nb pa, pb Efg γ w ðk; lÞ

σ 2w ðkÞ

scheduling variable excitation signal for the k operating condition response signal for the k operating condition model innovations for the k operating condition one-step-ahead prediction error (residuals) for the k operating condition normal distribution AutoRegressive (AR) and eXogenous (X) orders dimensionality of AR and X functional subspaces (equal to p if pa ¼pb) statistical expectation innovations cross correlation between operating conditions k and l innovations variance as a function of the scheduling variable

ai;j , bi;j Gj(k)

θ θ

M N B  o

plim

AR, X coefficients of projection j basis function coefficients of projection vector augmented parameter vector including innovations variance number of excitation-response signal pairs used for FP-ARX identification signal length in samples for each individual operating condition backshift operator (Bj  u½t ¼ u½t j) Kronecker product subscript designating actual (true) system probability limit operator

N-1 d

p Covf; g o(x)

convergence in distribution convergence in probability covariance between two random quantities function that tends to zero faster than x

In such cases the problem of identifying a single global model of the system, that is a model capable of representing the dynamics under any operating condition based on available excitation-response signal pairs, each one corresponding to a sample operating condition, is of particular interest and the subject of the present study. This problem is typically tackled via Linear Parameter Varying (LPV) models [5,8,9]. These are dynamical models with parameters expressed as functions of the measurable variable(s) – referred to as the scheduling variable(s) – designating the operating condition. In this context model identification is based on the so–called local approach [10–12], the rationale of which is simple and is based on a two-step approach that effectively splits the problem into two distinct subproblems: First a number of local (or else frozen) models – each corresponding to a single operating condition for which excitation-response signal pairs are available – are identified using conventional identification techniques [13, ch. 7] (step 1). Second, the identified models are interpolated, typically using orthogonal interpolation functions, in order to provide a single global model [12, pp. 250–251], [14,15] (step 2). This approach seems reasonable, as a straightforward extension of classical identification. Yet, when viewed within a stochastic framework in which the excitation-response signals are stochastic, it leads to suboptimal accuracy. The intuitive explanation for this is simple, and may be readily understood from the fact that the signal pairs are not treated as a single entity, but rather in complete isolation of each other in the process of obtaining each local (conventional) model (step 1). This not only neglects potential cross-correlations among the signal pairs, thus resulting into information loss, but additionally leads to an unnecessarily high number of estimated parameters,2 thus violating the principle of statistical parsimony [13, p. 492] and further leading to increased estimation variance and thus reduced accuracy (lack of efficiency in statistical terminology) [13, pp. 560–562]. To these one should also add the errors involved in the subsequent (step 2) interpolation of the obtained local models when constructing the LPV (global) model. The end result is a final, global, LPV model characterized by reduced – that is suboptimal – accuracy. Recognizing the aforementioned problems that arise within a stochastic context, the present authors and their coworkers have postulated a novel class of stochastic global models, referred to as Functionally Pooled (FP) models, for the proper global representation of systems and the remedy of the aforementioned weaknesses [16,17]. The class of FP models resembles the form of LPV models, with some of the important differences being that the signal pairs are treated as a single entity, the number of estimated parameters is minimal, potential cross-correlations among the signal pairs are accounted for, and the estimation is accomplished in a single step (instead of two subsequent steps) as necessary for achieving optimal accuracy. The optimal achievable accuracy is analytically established as well. The FP identification framework is based on three entities (also see Fig. 1): 1 Important Conventions Bold-face upper/lower case symbols designate matrix/column-vector quantities, respectively. Matrix transposition is indicated by the superscript T. A functional argument in brackets designates function of an integer variable; for instance x½t is a function of normalized discrete time (t ¼ 1; 2; …). The conversion from discrete normalized time to analog time is based on ðt  1ÞT s , with Ts standing for the sampling period. A hat designates estimator/estimate of the indicated quantity; for instance θ^ is an estimator/estimate of θ. Tilde designates sample quantity; for instance σ~ 2 designates sample variance. For simplicity of notation, no distinction is made between a random variable and its value(s). 2 Equal to the number of local models times the number of model parameters.

J.S. Sakellariou, S.D. Fassois / Mechanical Systems and Signal Processing 72-73 (2016) 785–807

787

Fig. 1. (a) Schematic representation of data collection from a system operating under different conditions characterized by a measurable (scheduling) variable; (b) estimation of a global model within the Functional Pooling framework based on data from different operating conditions.

(a) The concept of data pooling, in its functional pooling form: This allows for the simultaneous treatment of all excitationresponse signal pairs and is important for accounting for cross-correlations and for achieving optimal estimation accuracy. (b) The class of Functionally Pooled (FP) stochastic models: These have a structure resembling that of their LPV counterparts, but additionally include proper cross correlation terms [see Eq. (1.b)]. This leads to more complete model forms, but also paves the way for accurate estimation. (c) Statistically optimal estimation. FP models have been thus far used in their AutoRegressive with eXogenous excitation (ARX) form and mainly from an application point of view using simple estimators. For instance a global model for composite beam dynamics under different temperatures has been obtained in [3] and FP models have been used for representing aircraft dynamics under different flight conditions and configurations in [6,7]. The main use of FP–ARX models has, thus far, been in the context of damage localization and magnitude estimation within the broader Structural Health Monitoring (SHM) framework [16–21]. In all these studies the simplest possible (Least Squares, LS) type estimators have been employed, without any claims on estimator optimality or any analysis on related issues – a preliminary discussion on estimator properties was briefly presented in an early conference paper [22]. The aims of the present study thus are: (i) the formal introduction of FP models in their complete form, (ii) the postulation of estimation methods based on the Least Squares (LS) and Maximum Likelihood (ML) principles, (iii) the theoretical investigation of estimator properties, such as consistency and efficiency which refer to estimated model optimality. In addition, (iv) the comparison, via numerical Monte Carlo experiments, of the postulated estimation methods, and (v) the presentation of an application case study where FP–ARX modelling is used for estimating the dynamics and global modal characteristics (natural frequencies and damping ratios) of a simulated railway vehicle under various mass loading conditions (different numbers of passengers). Comparisons demonstrating the estimation accuracy improvement over that of local models used in the context of LPV-type identification are also made. The rest of the paper is organized as follows: The FP–ARX model structure and the identification problem are presented in Section 2. The LS and ML based estimation methods are presented in Section 3, while estimator asymptotic properties (consistency and efficiency) are analysed in Section 4. The methods' performance is assessed in Section 5 via numerical Monte Carlo experiments. The application case study pertaining to the estimation of the dynamics and the global modal characteristics (natural frequencies and damping ratios) of a simulated railway vehicle under various mass loading conditions is presented in Section 6, along with a demonstration of the improvement achieved over local models used in LPV model identification. The conclusions of the study are finally summarized in Section 7.

2. The FP-ARX model structure and the identification problem The Functionally Pooled AutoRegressive with eXogenous (FP-ARX) excitation model structure is defined as: yk ½t þ

na X i¼1

ai ðkÞ  yk ½t  i ¼

nb X i¼0

bi ðkÞ  xk ½t i þwk ½t;

ð1:aÞ

788

J.S. Sakellariou, S.D. Fassois / Mechanical Systems and Signal Processing 72-73 (2016) 785–807

Efwk ½twl ½t  τg ¼ γ w ðk; lÞ  δ½τ;

ai ðkÞ 9

pa X

ai;j  Gj ðkÞ;

bi ðkÞ 9

j¼1

γ w ðk; kÞ ¼ σ 2w ðkÞ k A R pb X

bi;j  Gj ðkÞ

ð1:bÞ

ð1:cÞ

j¼1

with t designating normalized discrete time, k the measurable scheduling variable characterizing each operating condition, xk ½t, yk ½t the corresponding excitation and noise corrupted response signals for the k operating condition, na, nb the AutoRegressive (AR) and eXogenous (X) orders. wk ½t is the model innovations Gaussian zero-mean white process with variance σ 2w ðkÞ, that is N ð0; σ 2w ðkÞÞ, which is uncorrelated with xk ½t 8 k and potentially cross-correlated with its counterparts corresponding to different operating conditions. Efg designates statistical expectation and δ½τ Kronecker delta (equal to unity for τ ¼ 0 and equal to zero for τ a 0). As Eq. (1.c) indicates, the AR and X parameters ai(k), bi(k), are modelled as explicit functions of the scheduling variable k belonging to corresponding functional subspaces spanned by the mutually independent functions Gj(k) (functional basis): F 〈ai ðkÞ〉 9fG1 ðkÞ; G2 ðkÞ; …; Gpa ðkÞg;

F 〈bi ðkÞ〉9 fG1 ðkÞ; G2 ðkÞ; …; Gpb ðkÞg

with F 〈  〉 designating functional subspace and pa, pb the dimensionality of the AR and X functional subspaces, respectively. The constants ai;j , bi;j designate the corresponding AR and X, coefficients of projection. The representation of (1) is referred to as an FP-ARX model of orders (na, nb) and functional subspaces of dimensionalities pa, pb, or in short an FP-ARX ðna; nbÞ½pa;pb (or FP  ARXðna; nbÞp for p ¼ pa ¼ pb) and is further designated as Mðθ Þ parameterized in terms of the parameter vector θ 9 ½ai;j ⋮bi;j ⋮ci;j ⋮σ 2w ðkÞT 8 i; j.3 Three cases are distinguished: Case (i): γ w ðk; lÞ ¼ σ 2w  δðk  lÞ (groupwise homoscedastic innovations) Case (ii): γ w ðk; lÞ ¼ σ 2w ðkÞ  δðk lÞ (groupwise heteroscedastic innovations) Case (iii): γ w ðk; lÞ of general form (contemporaneously correlated innovations) with δðk  lÞ equal to unity if k¼l and equal to zero if k al. Thus the FP-ARX estimation problem may be stated as follows: Given M pairs of excitation-response signals, each signal being of length N samples, for the sample values of the scheduling variable k1 , k2 ; …; kM : Z NM 9 fxk ½t; yk ½t j k ¼ k1 ; k2 ; …; kM ;

t ¼ 1; …; Ng;

ð2Þ

determine an estimate of the parameter vector θ . Before proceeding with model parameter estimation it is worth noting that: (a) The FP-ARX model structure allows for the representation of contemporaneous cross correlations among different excitation-response pairs via the γ w ðk; lÞ term. Moreover, such information is fully accounted form in the parameter estimation phase and is vital for obtaining statistically optimal models (see Section 3). (b) The projection of the parameters ai(k), bi(k) on the functional subspaces F 〈ai ðkÞ〉, F 〈bi ðkÞ〉 allows for models capable of   representing the dynamics everywhere within ½kmin ; kmax  A R and not only at the distinct k1 ; k2 ; …; kM values. (c) The form of functional dependence is important. Physical insight may be used, while experience [3,16,17,19–21] indicates that orthogonal polynomials, such as shifted Type II Chebyshev polynomials or trigonometric functions, are often sufficient. Using the backshift operator B (Bj  xk ½t 9 xk ½t  j) the main expression of the FP-ARX model may be compactly written as: A½B; k  yk ½t ¼ B½B; k  xk ½t þwk ½t

with A½B; k 91 þ

na X i¼1

ai ðkÞBi ;

B½B; k 9

ð3:aÞ nb X

bi ðkÞBi

ð3:bÞ

i¼0

In analogy to conventional models this representation is assumed to satisfy the following conditions: A1. Stability condition: The poles of the AR polynomial lie inside the unit circle 8 k A ½kmin ; kmax . A2. Irreducibility condition: The polynomials A½B; k, B½B; k are coprime (have no common factors). A3. Each excitation signal xk ½t is stationary, ergodic, and persistently exciting of sufficiently high order [13, pp. 412–414] with Efxk ½t  wl ½tg ¼ 0 8 k; l. 3

Bold-face upper/lower case symbols designate matrix/column-vector quantities, respectively.

J.S. Sakellariou, S.D. Fassois / Mechanical Systems and Signal Processing 72-73 (2016) 785–807

789

3. FP-ARX model estimation A model, corresponding to an actual underlying system of the form (1), is now to be estimated based on the available data of Eq. (2): na X

yk ½t þ

ai ðkÞ  yk ½t  i ¼

i¼1

nb X

Efek ½tel ½t  τg ¼ γ e ðk; lÞ  δ½τ; ai ðkÞ 9

pa X

bi ðkÞ  xk ½t i þek ½t

ð4:aÞ

i¼0

ai;j  Gj ðkÞ; bi ðkÞ 9

j¼1

γ e ðk; kÞ ¼ σ 2e ðkÞ k A R pb X

ð4:bÞ

bi;j  Gj ðkÞ

ð4:cÞ

j¼1

with ek ½t designating the model's one-step-ahead prediction error (residual), that like the wk ½t’s is N ð0, σ 2e ðkÞÞ. Without any loss of generality it is assumed that p ¼ pa ¼ pb, thus the main model expression above is written in a linear regression form as follows (  designates Kronecker product [23, pp. 27–28]):   T yk ½t ¼ φTk ½t  g T ðkÞ  θ þek ½t ¼ ϕk ½t  θ þek ½t ð5Þ with: 

T

φk ½t 9 yk ½t  1; …;  yk ½t  na⋮xk ½t; …; xk ½t  nb  T gðkÞ 9 G1 ðkÞ; …; Gp ðkÞ ½p1 ;

½ðna þ nb þ 1Þ1



θ 9 a1;1 ; …; ana;p ⋮b0;1 ; …; bnb;p

T ½ðna þ nb þ 1Þp1

The substitution of the values (for t ¼ 1; 2; …; N) for a single signal pair characterized by k to the above leads to: 3 2 3 2 T 2 3 yk ½1 ek ½1 ϕk ½1 6 7 6 ⋮ 7 6 6 7 4 5¼4 ⋮ 7 5  θ þ 4 ⋮ 5⟹yk ¼ Φk  θ þek T yk ½N ek ½N ϕk ½N

ð6Þ ð7Þ

ð8Þ

Pooling together (see [24, ch. 13] for the concept of pooling in a linear regression framework in econometrics) the signal pairs corresponding to the sample values k1 ; k2 ; …; kM leads to: y ¼ Φ  θþe with:

2

yk1 6y 6 k2 y96 6 ⋮ 4 ykM

ð9:aÞ

3

2

7 7 7 7 5

6 6 Φk2 Φ96 6 ⋮ 4 ½NM1

Φk1

ΦkM

3

2

7 7 7 7 5 ½NMpðna þ nb þ 1Þ

ek1 6e 6 k2 e96 6 ⋮ 4 ekM

3 7 7 7 7 5

ð9:bÞ ½NM1

Based on the above functional pooling (the first term emphasizing the functional dependence of each equation on the scheduling variable k) the estimation of the parameter vector θ may be achieved through Least Squares (LS) or Maximum Likelihood (ML) principles. 3.1. Least Squares (LS) type estimation The estimation of the parameter vector θ may be achieved by minimizing the following Least Squares criterion:   1 T 1 e Rw e J θ; Z NM 9 NM

ð10Þ

where e is given by Eq. (9.a) and Rw designates the covariance of w (of the actual system) defined similarly to e. In case (i) of groupwise homoscedasticity that corresponds to innovations with equal variances, Efw2k ½tg ¼ σ 2w ; for ¼ k1 ; …; kM , Rw ¼ σ 2w I NM (I NM : identity matrix of dimension NM  NM), the criterion of Eq. (10) leads to the Ordinary Least Squares (OLS) estimator: 2 31 2 3 kM X kM X N N  1 X X 1 1 T T T θ^ ¼ Φ Φ Φ y¼4 ϕ ½t ϕk ½t 5  4 ϕ ½t yk ½t 5 ð11Þ NM k ¼ k t ¼ 1 k NM k ¼ k t ¼ 1 k 1

1

while the residual variance may be estimated as:

σ^ 2w ðkÞ ¼

N h i 1X e2 t; θ^ Nt¼1 k

for k ¼ k1 ; k2 ; …; kM

ð12Þ

790

J.S. Sakellariou, S.D. Fassois / Mechanical Systems and Signal Processing 72-73 (2016) 785–807

In case (ii) of groupwise heteroscedasticity, with different variances among innovations of the considered operating conditions, Efw2k ½tg ¼ σ 2w ðkÞ; for k ¼ k1 ; …; kM , and covariance matrix given by: 2 2 3 σ w ðk1 ÞI N 0 ⋯ 0 6 7 σ 2w ðk2 ÞI N ⋯ 0 0 6 7 7 Rw ¼ 6 ð13Þ 6 7 ⋮ ⋮ ⋱ ⋮ 4 5 0 0 ⋯ σ 2w ðkM ÞI N NMNM

the criterion of Eq. (10) leads to the Weighted Least Squares (WLS) estimator: 2 31 2 3 kM kM N N h i  1h i 1 X 1 X 1 X 1 X T T 1 T 1 ^ 4 5 4 θ ¼ Φ Rw Φ Φ Rw y ¼ ϕ ½t ϕk ½t   ϕ ½t yk ½t 5 NM k ¼ k σ 2w ðkÞ t ¼ 1 k NM k ¼ k σ 2w ðkÞ t ¼ 1 k 1

ð14Þ

1

As the innovations covariance matrix Rw is practically unavailable, the unknown σ 2w ðkÞ may be estimated by Eq. (12) with θ^ obtained through the previous OLS estimator. The estimation procedure may be iterated until convergence in the coefficients of projection vector θ is achieved. In case (iii) with Efw2k ½tg ¼ σ 2w ðkÞ and contemporaneously correlated innovations, Efwk ½twl ½tg ¼ γ w ðk; lÞ, the criterion to be minimized becomes:   1 J θ; Z NM 9 eT  Rw 1  e N

ð15Þ

with Rw ¼ Rw½t  I N and: 2

σ 2w ðk1 Þ 6 6 γ w ðk2 ; k1 Þ Rw½t ¼ Efw½twT ½tg ¼ 6 6 ⋮ 4

3

γ w ðk1 ; k2 Þ ⋯ γ w ðk1 ; kM Þ 7 σ 2w ðk2 Þ ⋯ ⋮ 7

γ w ðkM ; k1 Þ











σ 2w ðkM Þ

7 7 5

ð16Þ MM

 T where w½t wk1 ½t; …; wkM ½t M1 . Then minimization of the LS criterion leads to the estimator: h

i  1h

θ^ ¼ ΦT Rw 1 Φ

ΦT Rw 1 y

i

ð17Þ

b w is obtained through the OLS or WLS estimator and the final residual covariance As Rw is not available, an initial estimate R matrix is estimated as: N h i h i X b w½t ¼ 1 e t; θ^ eT t; θ^ : R Nt¼1

ð18Þ

3.2. Maximum Likelihood (ML) estimation The estimation of the augmented parameter vector θ (see Section 2) for the more general cases of groupwise heteroscedastic and contemporaneously correlated residuals based on the ML principle [25, pp. 198–199] is considered. Thus in case (ii) the log-likelihood function corresponding to an FP-ARX model is given by:     kM L θ =e ¼ ln p e=θ ¼ ln ∏

kM kM N   N   NM N X 1 X 1 X ln 2π  ∏ p ek ½t =θ ¼  ln σ 2w ðkÞ  e2 t; θ 2 2 2k¼k 2 k ¼ k σ w ðkÞ t ¼ 1 k k ¼ k1 t ¼ 1 1

ð19Þ

1

   P 2 with pðÞ designating probability density. By setting λk ¼ σ 2w ðkÞ, σ~ 2e k; θ ¼ N1 N t ¼ 1 ek t; θ , Eq. (19) may be rewritten as follows (tilde designates sample quantity): kM kM   NM N X N X σ~ 2e ðk; θÞ ln 2π  L θ =e ¼  ln λk  2 2k¼k 2k¼k λk 1

ð20Þ

1

The first two derivatives of L with respect to λk are: ϑLðθ; λk =eÞ N N 2 ¼ þ σ~ k; θ ; 2λk 2λ2 e ϑλk k

ϑ2 Lðθ; λk =eÞ N N ~ 2  ¼ 2  3 σ e k; θ 2λk λk ϑλ2k

ð21Þ

Maximization of L with respect to λk leads to λk ¼ σ~ 2e ðk; θÞ, while the right expression is negative. This provides the estimate

J.S. Sakellariou, S.D. Fassois / Mechanical Systems and Signal Processing 72-73 (2016) 785–807

791

of λk for given θ. In order to obtain the estimate of θ, λk is inserted into Eq. (20) giving: 2 3 σ~ 2e ðk1 ; θÞ ⋯ 0 kM 6 7   NM N NM N 7 0 ⋱ 0 ðln 2π þ1Þ  ln ∏ σ~ 2e k; θ ¼  ðln 2π þ 1Þ  ln det6 L θ=e ¼  4 5 2 2 2 2 k ¼ k1 2 ~ 0 ⋯ σ e ðkM ; θÞ |fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl}

ð22Þ

~ θÞ:MM Rð

The ML estimator of θ then is: ~ θÞ θ^ ML 9 arg min det Rð

ð23Þ

θ

with: 



σ^ 2w ðkÞ ¼ σ~ 2e k; θ^ ML ¼

N h i 1X e2k t; θ^ ML Nt¼1

ð24Þ

~ θÞ ¼ ½σ~ 2 ðk; θÞM for any k ¼ k1 ; …; kM . (b) Based on Eq. (19) Remarks. (a) Evidently the above specializes to case (i) for det Rð e the estimation of θ based on the ML principle coincides with the WLS estimator of Eq. (14) provided that consistent estimates of the innovations variances σ 2w ðkÞ for k ¼ k1 ; …; kM are available. In case (iii) the log-likelihood function of (jointly) normally distributed residuals is for all k's given by: N N   1X NM N ln 2π  ln det R L θ; R=e½t 1 ; …; e½t N  ¼ ln ∏ p e½t =θ; R ¼  eT ½t R  1 e½t   2t ¼1 2 2 t¼1

ð25Þ

with R replacing Rw½t for simplicity. By setting: 

Λ~ θ ¼

N     1X e t; θ eT t; θ Nt¼1

ð26Þ

Eq. (25) becomes: o N  N n~ NM ln 2π L θ; R=e ¼  Tr Λ ðθÞR  1  ln det R  2 2 2

ð27Þ

The ML estimator then is obtained as:

θ^ ML 9 arg min det Λ~ ðθÞ

ð28Þ

N   1X h i h i ~ θ^ b w½t ¼ Λ e t; θ^ ML eT t; θ^ ML R ML ¼ Nt¼1

ð29Þ

θ

3.3. Model structure selection and validation Model structure selection (model identification) includes order determination for the AR and X polynomials [26] as well as the determination of the functional subspace dimensionality for a given basis function family such as Chebyshev, Laguerre, Jacobi and so on [27, pp. 771–802]. Thus models of various structures, forms and subspace dimensionalities are estimated, and subsequently each one is validated. A model is rejected if validation fails. Among those successfully validated, the “best” one is selected according to minimization of the Akaike Information Criterion (AIC) or the Bayesian Information Criterion (BIC) [13, pp. 505–507]. These may be shown to be adapted to Functionally Pooled models as follows: kM

AIC ¼ N ∑

k ¼ k1

ln σ^ 2w ðkÞ þ 2 dim θ;

kM

BIC ¼ ∑

k ¼ k1

ln σ^ 2w ðkÞ þ dim θ

ln N  homoscedastic=heteroscedastic innovations N ð30:aÞ

ln N ðcontemporaneously correlated innovationsÞ ð30:bÞ N The validation procedure is based on the obtained residual sequences e^ k ½t ¼ ek ½t; θ^  8 k, which should, for an accurate model, be serially (over time) uncorrelated, uncorrelated with the excitation 8 k, and cross uncorrelated over the different operating conditions. Testing these hypotheses may be based on typical statistical tests which employ the sample auto and cross correlation functions [13, pp. 513–514], [25, pp. 426–429], [28, p. 149]. ^ w½t  þ2 dim θ; AIC ¼ N ln det R

^ w½t þ dim θ BIC ¼ ln det R

792

J.S. Sakellariou, S.D. Fassois / Mechanical Systems and Signal Processing 72-73 (2016) 785–807

4. Identification accuracy: asymptotic properties of the estimators The achievable identification accuracy is now examined by considering the asymptotic (as the signal length N⟶1) properties of the presented estimators. The considered properties are consistency (examining the limit value of an estimate as N⟶1) and asymptotic distribution of the estimates. Estimator optimality then refers to a consistent estimate with minimal variance in the asymptotic distribution (statistical efficiency). For the sake of brevity the special case (ii) of heteroscedastic residuals is considered. Theorem 1 (LS Estimator Consistency). Let θo be the true (actual system) projection coefficient vector, wk ½t a zero mean white T process with Efw2k ½tg ¼ σ 2w ðkÞ, and Efϕk ½tϕk ½tg a nonsingular matrix. Then:

θ^ ⟶θo p

ðN-1Þ

ð31Þ

p

with ⟶ designating convergence in probability [29, p. 94]. Proof. See Appendix A.1. Remark I. Note that, using the Kolmogorov theorem [30, p. 32], it is easily verified that: w:p:1

σ^ 2w ðkÞ ⟶ σ 2w ðkÞ

ðN-1Þ

ð32Þ

w:p:1

with ⟶ designating convergence with probability one [29, p. 93]. Theorem 2 (ML Estimator Consistency). Let θ o ¼ ½θo ⋮σ 2w ðkÞT be the true (actual system) parameter vector, wk ½t a zero mean T white process with Efw2k ½tg ¼ σ 2w ðkÞ, Efw3k ½tg ¼ 0, Efw4k ½tg ¼ 3σ 4w ðkÞ, and Efϕk ½tϕk ½tg a nonsingular matrix. Then: b

p

θ ML ⟶θ o

ðN-1Þ

ð33Þ

Proof. See Appendix A.2. Theorem 3 (LS Estimator Asymptotic Distribution). Let θo be the true projection coefficient vector, wk ½t a zero mean white T process with Efw2k ½tg ¼ σ 2w ðkÞ, and Efϕk ½tϕk ½tg a nonsingular matrix. Then: pffiffiffiffiffiffiffiffi d ðN-1Þ ð34Þ NM ðθ^  θo Þ-N ð0; PÞ with: 2

31 kM X   T  1 1 Efφk t; θo φk t; θo g  Gk 5 P¼4 M k ¼ k σ 2w ðkÞ

ð35Þ

1

d

with ⟶ designating convergence in distribution [29, p. 94] and Gk ¼ gðkÞg T ðkÞ. Proof. See Appendix A.3. Remark II. Based on Eq. (35), the WLS estimator of Eq. (14) is efficient, that is its covariance achieves the Cramer–Rao lower bound of Eq. (37). This is true provided that consistent estimates of the innovations variances σ^ w ðkÞ for k ¼ k1 ; …; kM are available (also see remarks in Section 3.2). In addition, the OLS estimator of Eq. (11) is efficient only in the case of an homoscedastic (case (i)) actual system with covariance matrix given by Eq. (35) by replacing σ 2w ðkÞ by the common (for all operating conditions) variance σ 2w – a consistent estimate of which is obtained via Eq. (12) for any k. Theorem 4 (ML Estimator Asymptotic Distribution & Efficiency.). Let θ o ¼ ½θo ⋮σ 2w ðkÞT be the true parameter vector, wk ½t a zero T mean white process with Efw2k ½tg ¼ σ 2w ðkÞ, Efw3k ½tg ¼ 0, Efw4k ½tg ¼ 3σ 4w ðkÞ 8 k, and Efϕk ½tϕk ½tg a nonsingular matrix. Then the b estimate θ ML is efficient, as it follows asymptotically Gaussian distribution with mean θ o and covariance matrix equal to the Cramer–Rao lower bound [25, pp. 560–562]: b

θ ML  N ðθ o ; F  1 Þ

ðN-1Þ

ð36Þ

with F designating the Fisher information matrix which is equal to: 8 2 3T 9 <ϑLðθ =eÞ = ϑLðθ =eÞ 4 5 F ¼E : ϑθ ; ϑθ θ ¼ θo

ð37Þ

θ ¼ θo

Proof. See Appendix A.4. In summary, the first two theorems establish the consistency of all estimators, while the last two provide the asymptotic distribution and efficiency of the estimates. Based on these, consistent estimates are obtained by all estimators. On the other hand, the ML estimator is efficient, with covariance equal to the Cramer–Rao lower bound. The WLS estimator is also

J.S. Sakellariou, S.D. Fassois / Mechanical Systems and Signal Processing 72-73 (2016) 785–807

793

efficient provided that consistent estimates of the innovations variances for the different operating conditions are available, while the OLS estimator is efficient only in the case of an homoscedastic (case (i)) actual system – otherwise it is inefficient.

5. Performance assessment via Monte Carlo numerical experiments The objectives of this section are: (i) assessment of the effectiveness of the OLS, WLS, and ML estimation methods and confirmation of the asymptotic accuracy results of Section 4, (ii) demonstration of functional subspace selection and validation procedures, and (iii) demonstration of the estimation accuracy improvement over that of local models used in the context of LPV-type identification. These objectives are achieved via Monte Carlo numerical experiments with a simulated system. 5.1. The simulated system and the experiments The actual simulated system is of the FP  ARXð2; 1Þ3 form with zero delay (b0 a0 in the exogenous polynomial) and AR, X functional subspaces consisting of the first p ¼3 shifted Chebyshev continuous polynomials of Type II [27, pp. 773–782]: 2 3 2 3 2 X 3 1 X 3 X X 41 þ ai;j Gj ðkÞBi 5yk ½t ¼ 4 bi;j Gj ðkÞBi 5xk ½t þ wk ½t ð38Þ i¼1j¼1

i¼0j¼1

with: Efwk ½twl ½t  τg ¼ σ 2w ðkÞ  δ½k  l  δ½τ

ðcase ðiiÞÞ

ð39Þ

n o 2 F 〈ai ðkÞ〉 ¼ F 〈bi ðkÞ〉 ¼ G1 ðkÞ ¼ 1; G2 ðkÞ ¼  2 þ 4k; G3 ðkÞ ¼ 16k  16k þ 3 ;

k A ½0; 15

ð40Þ

The scheduling variable k is normalized with respect to its maximum value, kmax ¼ 15, in order to lie within the range ½0; 1 (see [27, pp. 773–782]). The system's true coefficients of projection are indicated in Table 1, while the variance of the heteroscedastic innovations for various values of k is shown in Table 2. 500 Monte Carlo experiments per estimation method (OLS, WLS, ML), with each covering the range of k A ½0; 15 (before normalization) with an increment of δk ¼ 1, are included. Each experiment is based on M¼16 pairs of N ¼1001 sample long excitation-response signals. In each experiment the system response is generated by using a number of mutually independent Gaussian zero-mean random sequences acting as excitation and innovations. Further details are provided in Table 3. Model structure selection includes functional subspace dimensionality p selection based on the Residual Sum of Squares normalized by the Series Sum of Squares (RSS/SSS): kM PN 2 1 X t ¼ 1 ek ½t  100% ð41Þ RSS=SSS ¼ PN 2 Mk¼k t ¼ 1 yk ½t 1 as well as the BIC and AIC criteria of Section 3.3. Model validation is based on the whiteness examination of the residuals for Table 1 Monte Carlo estimation results for the FP-ARX ð2; 1Þ3 (500 experiments per method). Coeff.

True

OLS estimatesa

WLS estimatesa

ML estimatesa

a1;1 a1;2 a1;3 a2;1 a2;2 a2;3 b0;1 b0;2 b0;3 b1;1 b1;2 b1;3

0.1725 0.1111  0.1711 0.2512  0.1010 0.1222  0.4711  0.4761 0.8474  0.2081  0.2731 0.4670

0.17228 70.00489 0.11107 7 0.00098  0.17105 7 0.00015 0.25118 70.00139  0.10095 7 0.00086 0.12219 7 0.00093  0.47110 7 0.00111  0.47596 7 0.00210 0.84729 7 0.00201  0.20795 7 0.00358  0.27295 7 0.00282 0.46685 7 0.00457

0.17251 70.00197 0.11110 70.00066  0.17110 7 0.00067 0.25113 7 0.00092  0.10095 7 0.00066 0.12225 7 0.00059  0.47108 70.00045  0.47606 7 0.00053 0.84738 70.00083  0.20809 70.00101  0.27309 70.00108 0.46699 7 0.00182

0.17254 7 0.00196 0.11111 70.00066  0.17110 7 0.00067 0.25113 7 0.00092  0.10095 7 0.00066 0.12225 7 0.00059  0.47108 70.00045  0.47606 7 0.00053 0.84738 70.00083  0.20811 7 0.00101  0.27311 70.00108 0.46702 7 0.00182

EAR

3:9784  10 4

2:0483  10  4

2:4864  10  4

EX RSS

2:5235  10 4

2:4874  10  5 0.986597 0.01069

3:8299  10  5 0.98659 70.01069

SSS a

%

0.992177 0.01145

Sample Estimate 7 Sample Standard Deviation,

σ w ðkÞ σ y~ % k

¼ 10%, σ y~ k : noise free response standard deviation.

794

J.S. Sakellariou, S.D. Fassois / Mechanical Systems and Signal Processing 72-73 (2016) 785–807

Table 2 Monte Carlo estimation results for the residual variances of the FP-ARX ð2; 1Þ3 (500 experiments per method). Variance

True

OLS estimatesa

WLS estimatesa

ML estimatesa

σ 2w ðk1 Þ

0.51755

0.51588 7 0.05744

0.51715 7 0.05759

0.51715 70.05759

σ 2w ðk2 Þ

0.12889

0.12889 7 0.01068

0.12881 7 0.01068

0.12881 70.01067

σ 2w ðk3 Þ σ 2w ðk4 Þ

0.03136

0.03140 7 0.00237

0.03134 7 0.00236

0.03134 70.00236

0.00411

0.00413 7 0.00028

0.00411 7 0.00028

0.00411 70.00028

σ 2w ðk5 Þ

0.00020

0.00022 7 0.00001

0.00020 7 0.00001

0.00020 70.00001

σ 2w ðk6 Þ

0.00456

0.00457 7 0.00030

0.00456 7 0.00030

0.00456 70.00030

σ 2w ðk7 Þ σ 2w ðk8 Þ

0.01097 0.01657

0.01097 7 0.00070 0.01657 7 0.00103

0.01096 7 0.00070 0.01656 7 0.00103

0.01096 70.00070 0.01656 70.00103 0.01965 70.00124

σ 2w ðk9 Þ

0.01966

0.01966 7 0.00124

0.01965 7 0.00124

σ 2w ðk10 Þ

0.01979

0.01979 7 0.00125

0.01979 7 0.00125

0.01979 70.00125

σ 2w ðk11 Þ

0.01683

0.01683 7 0.00104

0.01682 7 0.00104

0.01682 70.00104

σ 2w ðk12 Þ

0.01137

0.01138 7 0.00072

0.01137 7 0.00072

0.01137 70.00072

σ 2w ðk13 Þ σ 2w ðk14 Þ

0.00514

0.00515 7 0.00033

0.00514 7 0.00033

0.00514 70.00033

0.00042

0.00042 7 0.00002

0.00042 7 0.00002

0.00042 70.00002

σ 2w ðk15 Þ

0.00248

0.00248 7 0.00017

0.00247 7 0.00017

0.00247 70.00017

σ 2w ðk16 Þ

0.02179

0.02175 7 0.00156

0.02177 7 0.00156

0.02177 70.00156

a

Mean Estimate 7 Standard Deviation.

Table 3 Details on the Monte Carlo experiments for FP-ARX ð2; 1Þ3 model identification. No. of Monte Carlo exps: 500

Operat. conds.: M ¼ 16 ðk A ½0; 15; δk ¼ 1Þ Innovs. depend.: groupwise heteroscedastic

Func. subspace: first 3 Chebyshev II polyn. Data length: N ¼1001 samples

FP-ARX estimation methods

OLS method: QR decompostion WLS method: iterative; QR decomp. init. vars. via OLS mean estim.

ML method: Quasi-Newton nonlin. optim. init. vars. via WLS mean estims. max. number of iterations ¼ 400 estim. pars termination tol.¼ 10  8

converg. index: δr ¼ 10  8 No. of iterations ¼ 2 Model structure selection

objective function term. tol.¼ 10  6

Model orders: true values

Funct. subspace: criteria RSS/SSS, AIC, BIC

each k, as well as on the normalized cross correlation function between the excitation signals and the residuals for all k, and among the residuals of the different operating conditions (also see Section 3.3). Interval parameter estimates are recorded for all estimators, along with normalized (percentage) aggregate AR and X ^

^

errors corresponding to the sample mean estimates EAR 9 jjajja○ajj○1jj1 %, EX 9 jjbjjb○bjj○ jj1 %, with the subscript “○ ” designating the 1 P true value of the indicated quantity, a, b the projection coefficient vectors for the AR and X polynomials, and jjajj1 9 i jai j. A r

^ ^ convergence monitoring index is also used for the WLS and ML estimators δr ¼ jjθ^ r 1θ jjθ

r1

jj1

jj1

, with r designating iteration

number and selected threshold of 10  8 (also see Table 3).

5.2. Model structure selection and parameter estimation results Model structure selection results are presented in Figs. 2 and 3. The first depicts the RSS/SSS for OLS-estimated models as a function of functional subspace dimensionality p for 500 experiments per p. The results indicate (with very small variance) an initial drop in the RSS/SSS and a subsequent “levelling off” for p Z 3, leading to a potential selection of p ¼3 which is the correct value. This result is confirmed by the BIC and AIC criteria (Fig. 3). The results are very similar for the WLS and ML based estimates. The obtained, by all three (OLS, WLS, ML) Monte Carlo estimation results are presented in Tables 1 and 2 and in Fig. 4. They indicate excellent agreement between the estimates and their true counterparts for all estimators, with the attained standard deviations being particularly low. This is also supported by the insignificant AR and X aggregate errors and the very low values of the sample mean RSS/SSS (Table 1). Yet, as expected according to the asymptotic analysis of Section 4, the WLS and ML estimators achieve accuracy that is considerably better (lower standard deviations) than that achieved by the OLS. This is due to the inefficiency of the OLS estimator in the present (heteroscedastic) case. The true FP–ARX parameter trajectories versus k and their estimates, as obtained by the OLS, WLS and ML estimators (500 experiments), are shown in Fig. 5 and further confirm the high accuracy achieved.

RSS/SSS (%)

J.S. Sakellariou, S.D. Fassois / Mechanical Systems and Signal Processing 72-73 (2016) 785–807

795

101

100 2

3

4

5

6

7

8

functional basis dimensionality p Fig. 2. Functional subspace dimensionality p selection via the RSS/SSS (Monte Carlo experiments – each cross corresponds to a single experiment; 500 experiments per p).

0.015

20 0

AIC

BIC

0.01 −20

0.005 −40 0

2

3

4

5

6

7

8

−60

functional basis dimensionality p

2

3

4

5

6

7

8

functional basis dimensionality p

Fig. 3. Functional subspace dimensionality p selection via the BIC (a) and AIC (b) criteria (Monte Carlo experiments – each cross corresponds to a single experiment; 500 experiments per p).

0.113

0.18

0.17

−0.17

13

0.111

α

α12

α

11

−0.168 0.112

0.175

−0.172

0.11 −0.174 0.109

0.165

−0.099 0.254 0.124

0.25

α23

22

α

α21

−0.1 0.252

−0.101

0.122

−0.102 0.248

−0.472

−0.468

0.85 −0.476

0.848

b

b

02

b01 −0.472

03

−0.474

−0.47

0.846 −0.478 0.844 −0.48 −0.268

−0.2

−0.27

−0.21

0.47 13

−0.272

b

b

b

12

11

−0.205

0.465

−0.274 −0.276

−0.215

OLS

WLS

ML

OLS

WLS

ML

0.46

OLS

WLS

ML

Fig. 4. True (red dashed lines) projection coefficients and Monte Carlo estimates (boxes indicating estimate sample mean 7 sample standard deviation) by the OLS, WLS and ML estimators (Monte Carlo experiments – 500 experiments per method). (For interpretation of the references to color in this figure caption, the reader is referred to the web version of this paper.)

J.S. Sakellariou, S.D. Fassois / Mechanical Systems and Signal Processing 72-73 (2016) 785–807

OLS

0

−0.5

1

1

1

α (k) 2

−1

α (k) 2

−1

0.5

0.5

0

4

4

4

2

b (k) 0

0

b (k) 0

0

2

2

0

0

−2

−2

−2

2

2

2

1

1

1

0 −1

0

5

10

15

1

1

b (k)

0

b (k)

2 0

b (k)

−0.5

0

−1

0.5

ML

0.5

1

α (k)

−0.5

1

b (k)

WLS

0.5

0

α (k)

α1(k)

0.5

α1(k)

796

0 −1

0

k series

5

10

k series

Fig. 5. True (- - -) AR and X parameter trajectories versus k and Monte Carlo estimates ( (i)–(l) ML based estimates (Monte Carlo experiments – 500 experiments per method).

15

0 −1

0

5

10

15

k series

—): (a)–(d) OLS based estimates; (e)–(h) WLS based estimates;

5.3. Estimation accuracy improvement over local modelling The estimation accuracy improvement achieved by FP-ARX modelling over that of local models used in the context of LPV-type identification is now examined. In the local model approach 16 separate conventional ARX(2,1) local models (again with b0 a 0 in the eXogenous polynomial) are estimated, one for each sample value of the scheduling variable k. This is in sharp contrast to FP-ARX model estimation which is estimated in a single step, simultaneously using all 16 data records. Purely based on this fact, one may immediately observe a very significant difference in the number of model estimated parameters between the two approaches, and, as the data records are identical, a sharp difference in the signal Samples Per estimated Parameter (SPP) – the latter being a quantity that critically affects the achievable estimation accuracy. Indeed, in the FP-ARX approach the number of estimated parameters (the coefficients of projection) is 12, resulting into an SPP of  2670. Yet, in the local approach the corresponding numbers are 4  16 ¼ 64 and SPP of  500. The detrimental effects of the lower SPP in the local (LPV-type) approach are evident in the Monte Carlo comparative estimation results of Fig. 6. In that figure, parts of the true, FP-ARX based and local model based trajectory estimates (sample means and standard deviations) are presented (for local models only integer values of k are used). It is evident, that the FPARX based approach achieves high estimation accuracy that cannot be matched by that of the local model approach. The conclusion is that the FP-ARX framework achieves the identification of a global model that is both compact (parsimonious) and of considerable higher accuracy than that achievable via the local model approach in the context of LPV-type identification.

6. Application case study: modelling of railway suspension dynamics under different operating conditions In this application case study the identification of simulated railway suspension dynamics and the estimation of the global modal parameters (natural frequencies and damping ratios) [26] under different operating conditions (mass loadings) is considered based on available data records. The operating conditions correspond to vehicle body mass changes (which may be due to factors such as varying number of passengers, varying freight, fuel consumption, and so on), and are in the

J.S. Sakellariou, S.D. Fassois / Mechanical Systems and Signal Processing 72-73 (2016) 785–807

797

0.37 0.12

α (k)

0.35

0.115

2

1

α (k)

0.36

0.34

0.11

0.33 7.5

8

8.5

9

9.5

10

0.105 7.5

10.5

8

8.5

9

9.5

10

10.5

−0.68

−1.36

−0.7

1

b0(k)

b (k)

−0.69

−1.38

−0.71 −0.72

7.8

8

8.2

8.4

8.6

8.8

9

−0.73

9.2

7.8

8

8.2

8.4

8.6

k

8.8

9

9.2

k

Fig. 6. Comparison of FP–ARX & local ARX modelling for selected values of k. True model parameters versus k (- - -) contrasted to FP–ARX estimates (WLS estimator; sample mean estimate: ; sample standard deviation: ) and local ARX estimates (sample mean estimate ( ) 7 sample standard deviation). 500 Monte Carlo experiments per method.

---



x

u Lc

Car body

MB b(t)

A

Mtt

yb(t)

Kst

Csl

Ksl

Mtl

tt

Trailing bogie

tl

tt(t)

tl(t)

Kp3

Cp3

4th wheelset yw4(t)

3rd wheelset Lbw

Kp2

Cp2

Kp1

2nd wheelset

st

yw3(t) L

Cp1

Primary Suspension

Cp4

Leading bogie

ytl(t)

ytt(t) Kp4

Secondary Suspension

Output Acceleration Cst

B

wheelset

x

yw1(t) yw2(t) Velocity Input

Fig. 7. Schematic representation of the model of one-half railway vehicle suspension [17].

range of ½29523:6 32804 kg. Like in the previous section, 500 Monte Carlo experiments are employed, and comparisons with the local ARX modelling approach are made. 6.1. The suspension dynamics model The six degree-of-freedom model of one (left or right) half of a railway vehicle is depicted in Fig. 7. Its parameters correspond to those of a typical passenger vehicle of the Hellenic Railways southern network (MAN OSE/KAT.4) and are provided in Appendix B. The vehicle is assumed to run on an horizontal track with constant speed u, symmetrical loading of

798

J.S. Sakellariou, S.D. Fassois / Mechanical Systems and Signal Processing 72-73 (2016) 785–807

25

No mass reduction 1 % mass reduction 2 % mass reduction 3 % mass reduction 4 % mass reduction 5 % mass reduction 6 % mass reduction 7 % mass reduction 8 % mass reduction 9 % mass reduction 10 % mass reduction

20 FRF Magnitude (dB)

15 10 5 0 −5 −10 −15 −20 0

10

20

30 Frequency (Hz)

40

50

60

Fig. 8. Application case study: FRF magnitude under different operating conditions.

the two rails (small roll angle) and no wheel lift. The car is modelled as a rigid body with two degrees of freedom (vertical displacement yb and pitch angle θb ), and is connected to the bogies located at its front and rear ends (leading and trailing bogies, respectively) via the secondary suspension. The secondary suspension elements (physically realized via air chambers and hydraulic dampers) are indicated as K sl , Csl (leading part), and K st , Cst (trailing part). Each bogie is modelled as a rigid body with two degrees of freedom (vertical displacement ytl or ydd and pitch angle θtl or θtt ) and is connected to two wheelsets (modelled as massless point followers) via the primary suspension. The primary suspension consists of coil springs and shock absorbers modelled as linear spring-dashpot elements, which are indicated as Kpi, Cpi, with i ¼ 1; 2; 3; 4 designating the corresponding wheelset (Fig. 7). The track is assumed to be fixed and rigid, with the track vertical velocity excitation being approximated by Gaussian white noise with spectrum equal to: Svv ðωÞ ¼ ð2π Þ2 Ar u

ð42Þ

with u representing the (horizontal) vehicle velocity and Ar a roughness factor indicative of track quality [17]. The vertical dynamics of the vehicle model are, for small displacements, described by the linear differential equation: € þC  yðtÞ _ þ K  yðtÞ ¼ B  xw ðtÞ ð43Þ M  yðtÞ  T with yðtÞ ¼ yb ðtÞ θb ðtÞ ytl ðtÞ θtl ðtÞ ytt ðtÞ θtt ðtÞ designating the displacement vector (see Fig. 7), xw ðtÞ the input vector, B an input shaping matrix, and M, C, K the mass, damping, and stiffness matrices, respectively (see full details in [17]). The transfer function between the vehicle body vertical acceleration at point A (right above the trailing airspring, Fig. 7) and the track vertical velocity profile – which due to the negligible wheel mass coincides with the vertical velocity of any wheelset – is presently considered. Thus, the track vertical velocity and the acceleration at point A, based on a sampling period of T s ¼ 0:004 s (sampling frequency fs ¼250 Hz), are the excitation-response signal pairs are employed for each operating condition. The variability of the true Frequency Response Function (FRF) magnitude versus body mass in a range that corresponds to mass reduction in the range of 0–10% is presented in Fig. 8. 500 Monte Carlo experiments are performed with each experiment covering the range of k∈½0; 10% reduction on the car body mass with an increment of δk ¼ 1% that corresponds to 328.04 kg. Thus each experiment includes M¼11 pairs of N ¼4096 sample long excitation-response signals, with the system response being generated by using a number of mutually independent Gaussian zero-mean random sequences acting as the velocity excitation. Each response signal is corrupted by random noise with noise to signal standard deviation ratio equal to 10%. 6.2. FP–ARX modelling & comparisons Like in the previous section, a levelling-off is noted in the RSS/SSS for p Z 3, while both the AIC and BIC attain minimum values for the same value of p ¼3 [LS estimator of Eq. (11)]. Thus an FP  ARXð12; 4Þ3 model (b0 ¼ 0 in the eXogenous polynomial) with groupwise homoscedastic residuals (case (i)) and AR, X subspaces consisting of the following functions: 2

F 〈ai ðkÞ〉 ¼ F 〈bi ðkÞ〉 ¼ fG1 ðkÞ ¼ 1; G2 ðkÞ ¼ k; G3 ðkÞ ¼ k g;

k A ½29523:6; 32804 kg

is selected. Indicative global modal parameter (natural frequency and damping ratio) interval estimation results are presented in Figs. 9 and 10 (solid lines indicate sample mean estimates and dashed lines sample standard deviations) as

1

ω (Hz)

J.S. Sakellariou, S.D. Fassois / Mechanical Systems and Signal Processing 72-73 (2016) 785–807

799

4 2 0

2

ω (Hz)

4 3 2

3

ω (Hz)

12.115 12.11 12.105 12.1 0

1

2

3

4

5

6

7

8

9

10

Mass reduction k (%)

Fig. 9. Application case study: Estimates of the first three natural frequencies (x: true values) within the considered range of vehicle body mass based on: (i) local ARX models for certain masses (mean estimate: ; sample standard deviation ; 500 experiments), (ii) the FP–ARX model (mean estimate: ; sample standard deviation

o |—| − − −; 500 experiments). Zooms of certain results are also shown.



ζ1

0.3 0.2 0.1

ζ

2

0

0.2 0.15 0.1 0.05 0 0.23

ζ3

0.228 0.226 0.224 0.222

0

1

2

3

4

5

6

7

8

9

10

Mass reduction k (%)

Fig. 10. Application case study: Estimates of the first three damping ratios (x: true values) within the considered range of vehicle body mass based on: (i) local ARX models for certain masses (mean estimate: ; sample standard deviation ; 500 experiments), (ii) the FP-ARX model (mean estimate: ; sample standard deviation

o |—| − − −; 500 experiments). Zooms of certain results are also shown.



Table 4 No. of estimated parameters and SPP for the FP–ARX and local models. Model

na

nb

p

No. of estim. param.

SPP

FP-ARX model local ARX models

12 12

4 4

3 –

48 11  16 ¼ 176

1877.3 512

obtained based on the FP  ARXð12; 4Þ3 model (500 Monte Carlo experiments). The estimates are evidently very accurate, characterized by low standard deviations. In the same figures comparisons with the local model approach (used in the context of LPV-type identification) are also made (11 local ARXð12; 4Þ models are obtained based on standard identification procedures [13, pp. 505–507]). Like in the previous section, the number of estimated parameters is considerably lower in the FP-ARX case, leading to a high SPP (see Table 4). As a consequence, the interval estimates corresponding to the FP-ARX model are much narrower (more accurate) than those of the local modelling approach.

800

J.S. Sakellariou, S.D. Fassois / Mechanical Systems and Signal Processing 72-73 (2016) 785–807

7. Concluding remarks The problem of identifying a global model for a stochastic dynamical system operating under different pseudo-static conditions from excitation-response signal pairs corresponding to those conditions was considered and a Functionally Pooled (FP) framework was postulated. The FP framework incorporates parsimonious FP models which are also capable of fully accounting for cross correlations among the operating conditions, functional pooling for the simultaneous treatment of all data records, and statistically optimal estimation. Within it Least Squares and Maximum Likelihood based estimators were postulated, and were analytically shown to be consistent and (under proper conditions) optimal in the sense of achieving minimal asymptotic variance (statistical efficiency). Their optimal accuracy was also demonstrated via numerical Monte Carlo experiments, while their superiority over local modelling used in an LPV-type context was also confirmed. An application case study pertaining to global identification of the suspension dynamics, under different mass loadings, for a simulated railway vehicle was also considered and further served in confirming the high achievable accuracy and superiority of the FP-ARX approach.

Acknowledgments The authors wish to thank Mr. K. Petsounis of Mentor Hellas for fruitful discussions and support in the simulation of the railway vehicle suspension.

Appendix A. Estimator consistency analysis & asymptotic distribution A.1. Proof of LS estimator consistency The true system of Eq. (1.a) is written in analogy to the conventional case in a linear regression form [also see Eq. (5)] as: yk ½t ¼ ϕk ½tθo þwk ½t T

ðA:1Þ

Substituting this into the estimator of Eq. (14) leads to 2

31 2 3 kM kM N N   X X X X 1 1 1 1 T T θ^ ¼ 4 ϕ ½t ϕk ½t 5  4 ϕ ½t  ϕk ½t θo þ wk ½t  5 NM k ¼ k σ^ 2 ðkÞ t ¼ 1 k NM k ¼ k σ^ 2 ðkÞ t ¼ 1 k w w 1 1 2

1 3 2 3 kM kM N N X X 1 1 1 X 1 X T 4 5 4 ϕ ½t ϕk ½t   ϕ ½t wk ½t 5 ¼ θo þ NM k ¼ k σ^ 2 ðkÞ t ¼ 1 k NM k ¼ k σ^ 2 ðkÞ t ¼ 1 k w w 1 1 |fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl} |fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl} A

ðA:2Þ

b

Thus, by using the probability limit operator [29, p. 96] yields: plim θ^ ¼ θo þ plim A  1 plim b N-1

N-1

ðA:3Þ

N-1

2 Also, σ^ w ðkÞ of Eq. (12) is a consistent estimator of the true variance (by virtue of the Kolmogorov theorem in [30, p. 32]),

which means that plim σ^ 2w ðkÞ ¼ σ 2w ðkÞ. By using the ergodicity property of the data [25, pp. 547–548], the internal summation N→∞

over time in A becomes for N-1: N N  N  1X w:p:1 1X 1X ϕk ½t ϕTk ½t  ¼ φk ½t   g ðkÞ φTk ½t   g T ðkÞ ¼ φ ½t φTk ½t   gðkÞg T ðkÞ ⟶ Efφk ½t φTk ½t g  Gk |fflfflfflfflfflffl{zfflfflfflfflfflffl} Nt¼1 Nt¼1 Nt¼1 k

ðN-1Þ

ðA:4Þ

Gk

and applying the plim property of division to A, leads to: N→∞ kM 1 X 1 Efφk ½t φTk ½t g  Gk ¼ Q plim A ¼ 2 ðkÞ M σ N-1 w k¼k

ðA:5Þ

1

For the evaluation of plim b, the vector b is rewritten as N→∞ kM 1 X 1 p b¼ M k ¼ k σ 2w ðkÞ k 1

with

pk 9

N N 1X 1X ϕk ½t wk ½t  ¼ φ ½t   g ðkÞ  wk ½t  Nt¼1 Nt¼1 k

ðA:6Þ

J.S. Sakellariou, S.D. Fassois / Mechanical Systems and Signal Processing 72-73 (2016) 785–807

801

Evidently the mean of pk is given by: Efpk g ¼

N   1X E φk ½twk ½t  g ðkÞ ¼ 0 Nt¼1

ðA:7Þ

as wk ½t is a white sequence independent of the excitation for all k and the vector φk ½t includes only data of the excitation and past data of the response [Eq. (6)]. Furthermore, the covariance matrix of vector pk is (as wk ½t is independent of wk ½s 8 t as and φk ½s): Covfpk g ¼ Efpk pTk g ¼ þ

N X N 1 X

N

2

N

T

Efwk ½t φk ½t φTk ½swk ½sg  Gk þ

t ¼ 1 s ¼ tþ1

⟹Covfpk g ¼

N2 t ¼ 1 s ¼ 1

t ¼1s¼1

N N X 1 X 2

N X t 1 1 X

Efϕk ½t wk ½t wk ½sϕk ½sg ¼

N 1 X

N2 t ¼ 1

N X t 1 X

N2 t ¼ 1 s ¼ t N σ 2w ðkÞ X

Efw2k ½t φk ½t φTk ½t g  Gk ¼

N2

Efwk ½t φk ½t wk ½sφTk ½sg  Gk

Efwk ½t wk ½sφk ½t φTk ½sg  Gk Efφk ½t φTk ½t g  Gk

ðA:8Þ

t¼1

which tends to zero as N-1 due to the term 1=N2 . This implies that {pk} converges in mean square (m.s.) to its zero mean, which also implies convergence in probability. Hence plimb ¼ 0. Consequently, provided that Q is nonsingular, Eq. (A.3) N→∞

gives plim θ^ ¼ θo and θ^ is a consistent estimator of the true vector θo . N→∞

A.2. Proof of ML estimator consistency 

Let a value θ (θ designates the augmented parameter vector that includes the model projection coefficients and the  b residual variance; also see Section 2) close to θ o such that θ ¼ λθ o þ ð1  λÞθ ML with 0 r λ r 1. Thus, the first derivative of b the log-likelihood function Lðθ =eÞ at θ ML may be expanded into a Taylor series around θ o as: " # " # " 2 #

 ϑLðθ =eÞ ϑLðθ =eÞ ϑ Lðθ =eÞ ¼ þ θb ML  θ o ðA:9Þ T ϑθ θ ¼ bθ ϑθ θ ¼ θ ϑθϑθ θ ¼ θ  o

ML

b As θ ML maximizes Lðθ =eÞ, the left term is equal to zero and the above equation may be written as: " # " 2 #

 1 ϑLðθ =eÞ 1 ϑ Lðθ =eÞ b ¼ θ  θ o ML NM NM ϑθϑθ T  ϑθ θ ¼ θo

By virtue of Eq. (19) the left term of the above equation may be rewritten as: " # " # kM X N 1 ϑLðθ =eÞ 1 X ϑ ln pðek ½t=θ Þ ¼ NM NM k ¼ k t ¼ 1 ϑθ ϑθ θ ¼ θo

ðA:10Þ

θ ¼θ

ðA:11Þ

θ ¼ θo

1

Differentiating ln pðek ½t=θ Þ [see Eq. (19)] with respect to σ 2w ðkÞ and θ, for θ ¼ θ o , gives: ( ) w2 ½t ϑ ln pðek ½t=θ Þ ϑ 1 1 2 1 e2k ½t; θ 1 ln 2  þ k4 ¼ π  σ ð k Þ  ¼ 2 w 2 2 2 2 2 2 σ w ðkÞ ϑσ w ðkÞ θ ¼ θ ϑσ w ðkÞ 2σ w ðkÞ 2σ w ðkÞ θ ¼ θo o |fflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl}

ðA12:aÞ

d1 ðk;kÞ

( ) ϑ ln pðek ½t=θÞ ϑ 1 1 2 1 e2k ½t; θ  ln 2π  σ ¼ ð k Þ  ϑθ ϑθ 2 2 w 2 σ 2w ðkÞ θ ¼ θo

θ ¼ θo

1 ¼  2 wk ½t ψ k ½t; θo  σ w ðkÞ |fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl}

ðA12:bÞ

d2 ðkÞ ðna þ nb þ 1Þ1

with ψ k ½t; θo  ¼ ½ϑek ½t=ϑθθ ¼ θo . It is noted that ½ϑ ln pðeki ½t=θ Þ=ϑσ 2w ðkj Þθ ¼ θ for ia j is equal to zero [also see Eq. (19)], o

which means that d1 ðki ; kj Þ ¼ 0 for i aj. Based on the ergodicity property of the innovations and as ψ k ½t; θo  ¼  ϕk ½t; θo  [see Eq. (5)] includes only data of the excitation and past data of the response independent of the innovations wk ½t, Eq. (A.11) tends to zero:   kM X N w:p:1 1 ϑLðθ=eÞ 1 X T ¼ ½d2 ðkÞ d1 ðk1 ; kÞ…d1 ðkM ; kÞT ⟶ 0 NM NM k ¼ k t ¼ 1 |fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl ffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl} ϑθ θ ¼ θo 1 dkt

ðN-1Þ

ðA:13Þ

802

J.S. Sakellariou, S.D. Fassois / Mechanical Systems and Signal Processing 72-73 (2016) 785–807

as: kM X N N 1 X 1 X d1 ðki ; kÞ ¼ NM k ¼ k t ¼ 1 NM t ¼ 1

(

1

) ( ) w2ki ½t w:p:1 1 Efw2ki ½tg 1 1  2 þ þ  2 ⟶ ¼0 M 2σ w ðki Þ 2σ 4w ðki Þ 2σ w ðki Þ 2σ 4w ðki Þ

ðA14:aÞ

 kM X kM N        w:p:1 1 X 1 X 1 1 E wk ½t E ψ k ½t; θo  ¼ 0  2 wk ½t ψ k t; θo ⟶  NM k ¼ k t ¼ 1 M k ¼ k σ 2w ðkÞ σ w ðkÞ 1

ðA14:bÞ

1

In a manner analogous to the conventional case [25, pp. 560–562] it may be shown that: 8 8 9 2 3T 9 <ϑ ln pðe ½t=θ Þ = <ϑ2 ln pðe ½t=θ Þ = ϑ ln pðek ½t=θ Þ k k 4 5 ¼ E E T : ; : ; ϑθ ϑθ θ ¼ θo

ϑθϑθ

θ ¼ θo

ðA:15Þ

θ ¼ θo

The second derivative of Lðθ =eÞ with respect to θ may, via Eq. (A.15), be written as: 1 NM

"

#

ϑ2 Lðθ =eÞ ϑθϑθ

T

¼ θ ¼ θo

kM X N 1 X NM k ¼ k t ¼ 1

"

#

ϑ2 ln pðek ½t=θ Þ ϑθϑθ

1

T

w:p:1

⟶

θ ¼ θo

8 kM <ϑ ln pðe ½t=θ Þ 1 X k E Mk¼k : ϑθ 1

2 θ ¼ θo

4ϑ ln pðek ½t=θ Þ ϑθ

θ ¼ θo

3T 9 = 5 ;

ðN-1Þ

ðA:16Þ By using dkt ¼ ½ϑ ln pðek ½t=θ Þ=ϑθ θ ¼ θ , the above equation becomes: o

8 2 3T 9 kM <ϑ ln pðe ½t=θ Þ = n o 1 ϑ ln pðek ½t=θ Þ 1 X T k 4 5 ¼  E E dkt dkt ; Mk¼k : Mk¼k ϑθ ϑθ θ ¼ θo θ ¼ θo 1 1 82 39 T > > d ðkÞd ðkÞ d ðkÞd ðk ; kÞ d ðkÞd ðk ; kÞ … d2 ðkÞd1 ðkM ; kÞ > > 2 2 1 1 2 1 2 2 >6 > 7> > > > > 6 d1 ðk1 ; kÞd2 ðkÞ d1 ðk1 ; kÞd1 ðk1 ; kÞ d1 ðk1 ; kÞd1 ðk2 ; kÞ … d1 ðk1 ; kÞd1 ðkM ; kÞ 7> > kM = <6 7> 1 X 6 7 ¼ E 6 d1 ðk2 ; kÞd2 ðkÞ d1 ðk2 ; kÞd1 ðk1 ; kÞ d1 ðk2 ; kÞd1 ðk2 ; kÞ … d1 ðk2 ; kÞd1 ðkM ; kÞ 7 6 7> Mk¼k > > > >6 7> 1 ⋮ ⋮ ⋮ ⋱ ⋮ > > > 4 5> > > > ; : d ðkM ; kÞd ðkÞ d ðkM ; kÞd ðk ; kÞ d ðkM ; kÞd ðk ; kÞ … d ðkM ; kÞd ðkM ; kÞ > kM X

1

2

1

1

1

1

1

2

1

ðA:17Þ

1

Based on the fact that innovations from different operating conditions are uncorrelated, d1 ðki ; kj Þ ¼ 0 for i a j, as well as that ϕk ½t; θo  includes only data of the excitation and past data of the response that is independent of the innovations wk ½t is:   n o n o   T  1 1 T T E d2 ðkÞd2 ðkÞ ¼ E 4 w2k ½t ϕk t; θo ϕk t; θo ¼ 2 E ϕk ½t; θo ϕk ½t; θo  σ w ðkÞ σ w ðkÞ |fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl} Dk ðna þ nb þ 1Þðna þ nb þ 1Þ

(

!)   w2k ½t 1 1 w ½t ϕk t; θo  2 þ 2σ w ðkÞ 2σ 4w ðkÞ σ 2w ðkÞ k ( )       w3k ½t   1 1 ½  w t ϕk t; θo  6 ϕk t; θo ¼ 4 E wk ½t E ϕk ½t; θo  ¼E 2σ 4w ðkÞ k 2σ w ðkÞ 2σ w ðkÞ

  E d2 ðkÞd1 ðk; kÞ ¼ E 



    1 E w3k ½t E ϕk ½t; θo  ¼ 0 2σ 6w ðkÞ

  E d1 ðk; kÞd1 ðk; kÞ ¼ E

( 

w2 ½t 1 þ k σ 2w ðkÞ 2σ 4w ðkÞ

! 



1

þ

!) w2k ½t 2σ 4w ðkÞ

2σ 2w ðkÞ ) 2w2k ½t w4 ½t 1 1 σ 2 ðkÞ 3σ 4w ðkÞ 1  þ k  w þ ¼ ¼ d3 ðkÞ ¼E ¼ 4 4σ 4w ðkÞ 4σ 6w ðkÞ 4σ 8w ðkÞ 4σ w ðkÞ 2σ 6w ðkÞ 4σ 8w ðkÞ 2σ 4w ðkÞ (

    as for Gaussian distributed random variables is E w3k ½t ¼ 0 and E w4k ½t ¼ 3σ 4w ðkÞ, [31, p. 128]. Thus, Eq. (A.16) becomes ðN-1Þ: 2 3 kM X 1 Dk 0 0 … 0 6 M 7 6 7 k ¼ k1 6 7 " 2 # 6 7 1 7 w:p:16 1 ϑ Lðθ =eÞ 0 M d3 ðk1 Þ 0 … 0 6 7 ¼F ⟶6 ðA:18Þ 7 T θo 1 NM ϑθϑθ 6 7 0 0 0  Md3 ðk2 Þ … θ ¼ θo 6 7 6 7 ⋮ ⋮ ⋮ ⋱ ⋮ 4 5 1 0 0 0 …  Md3 ðkM Þ |fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl} ½ðna þ nb þ 1 þ MÞðna þ nb þ 1 þ MÞ

J.S. Sakellariou, S.D. Fassois / Mechanical Systems and Signal Processing 72-73 (2016) 785–807

803

As F θ of the true system is nonsingular and as F θ  F θ the plim operator to Eq. (A.10) based on Eq. (A.13) leads to: o o h i1 b b plim θ ML ¼ θ o þ  F θ   0⟹ plim θ ML ¼ θ o ðA:19Þ N-1

N-1

A.3. Proof of LS asymptotic distribution For the sake of simplicity let Jðθ; Z NM Þ ¼ JðθÞ and Rw ¼ R. The second order Taylor expansion of the first derivative of J at θ^ around θo is: J 0 ðθ^ Þ ¼ J 0 ðθo Þ þ J″ðθo Þðθ^  θo Þ þ oðjjθ^  θo jjÞ

ðA:20Þ

with o(x) designating a function tending to zero faster than x and jj  jj Euclidean norm. As the first derivative is zero at θ ¼ θ^ and θ^ is a consistent estimate of θo (θ^  θo 0), then the equation above may be rewritten in analogy to the conventional case as: pffiffiffiffiffiffiffiffi    1 pffiffiffiffiffiffiffiffi 0 NM ðθ^  θo Þ  J″ðθo Þ ðA:21Þ NM J ðθo Þ The first two derivatives of J with respect to θ are easily obtained as:  2 ΦT R  1 e; J0 θ ¼  NM

 2 J″ θ ¼ ΦT R  1 Φ NM

ðA:22Þ

and by virtue of Eqs. (9.b) and (13) they become: kM N      2 X 1 X ϕ t; θ ek t; θ ; J0 θ ¼  2 NM k ¼ k σ e ðkÞ t ¼ 1 k

  pðna þnb þ1Þ  1

ðA23:aÞ

1

kM N      2 X 1 X ϕ t; θ ϕTk t; θ ; J″ θ ¼ NM k ¼ k σ 2e ðkÞ t ¼ 1 k



pðna þnb þ1Þ  pðna þ nb þ 1Þ



ðA23:bÞ

1

The second derivative for θ ¼ θo and N-1 tends with probability one to its expected value J ″1 ðθo Þ [also see Eq. (A.4)]: kM  2 X     1 Efφk t; θo φTk t; θo g  Gk J ″1 θo ¼ M k ¼ k σ 2w ðkÞ

ðN-1Þ

ðA23:bÞ

1

Δ

The matrix J″ðθo Þ tends (N-1) to J ″∞ ðθo Þ, while J 0 ðθo Þ ¼ J 0 ðθÞjθ ¼ θo is a random vector [see Eq. (A23.a)] which for N-1 follows Gaussian distribution with zero mean and covariance matrix P o as shown in the following. pffiffiffiffiffiffiffiffi 0 The vector NM J ðθo Þ of Eq. (A.21) is rewritten via Eq. (A23.a) as: ! kM kM N N pffiffiffiffiffiffiffiffi  2 X 1 X 2 X 1 1 X p ffiffiffiffi ffi p ffiffiffiffi ½ w ½ w ½  ½  ϕ t; θ t ¼ φ t; θ t  g ðk Þ ðA:25Þ NM o o k k k NM k ¼ k σ 2w ðkÞ t ¼ 1 k M k ¼ k1 σ 2w ðkÞ Nt¼1 1 |fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl} νkt Based on a variant of the central limit theorem (see Lemma B.3 in [25, p. 550]), as the entries of φk ½t; θo ; wk ½t may be d considered as zero mean ARMA processes [also see Eq. (5)], it follows that νkt - N ð0; P k Þ ðN-1Þ with: ( ) N N     1 X 1 X Efνkt g ¼ E pffiffiffiffi φk t; θo wk ½t  ¼ pffiffiffiffi Efφk t; θo gEfwk ½t g ¼ 0 ðA26:aÞ Nt¼1 Nt¼1   P k ¼ Covfνkt g ¼ lim E νkt νTkt ¼ σ 2w ðkÞ Efφk ½t; θo φTk ½t; θo g N-1

ðA26:bÞ

Thus from Eq. (A.25): kM X pffiffiffiffiffiffiffiffi  2 pffiffiffiffiffi J 0 9 NM J 0 θo ¼ νkt  g ðkÞ M σ 2w ðkÞ k ¼ k1

is a sum of Gaussian distributed random vectors which leads to a new Gaussian distributed random vector with mean and covariance given by virtue of Eqs. (A26.a) and (A26.b) as: EfJ 0 g ¼

kM X

2 pffiffiffiffiffi Efνkt g  g ðkÞ ¼ 0 σ 2w ðkÞ M k ¼ k1

ðA27:aÞ

804

J.S. Sakellariou, S.D. Fassois / Mechanical Systems and Signal Processing 72-73 (2016) 785–807

kM X

0

CovfJ g ¼

k ¼ k1

"

2 pffiffiffiffiffi M σ 2w ðkÞ

#2 P k  g ðkÞg T ðkÞ ¼

kM     4 X 1 Efφk t; θo φTk t; θo g  Gk M k ¼ k σ 2w ðkÞ

ðA27:bÞ

1

d

ðN-1Þ By setting P o ¼ CovfJ 0 g may be written: J 0 -N ð0; P o Þ pffiffiffiffiffiffiffiffi In analogy to the conventional case the transformation of vector J 0 leads to NM ðθ^  θo Þ [see Eq. (A.21)] which thus follows Gaussian distribution with zero mean and covariance matrix P, given via Eqs. (A24) and (A27.b) as (also see Lemma B.4 in [25, p. 551]): 2 31 kM   T  1 X 1 5 Ef P ¼ ½ J ″1 ðθo Þ  1 P o ½ J ″1 ðθo ÞT   1 ¼ 4 φ t; θ φ t; θ ðA:28Þ g  G o o k k k M k ¼ k σ 2w ðkÞ 1

Thus: pffiffiffiffiffiffiffiffi d NM ðθ^  θo Þ - N ð0; PÞ

ðN-1Þ

ðA:29Þ

For M pairs of N sample long excitation-response signals, the following estimator may be used for P: 2 31 kM N h i h i X X 1 1 T ^ ^ b φ t; θ φk t; θ  Gk 5 P ¼4 NM k ¼ k σ^ 2 ðkÞ t ¼ 1 k w 1

ðA:30Þ

with

σ^ 2w ðkÞ ¼

N h i 1X e2 t; θ^ ; Nt¼1 k

k ¼ k1 ; k2 ; …; kM

ðA:31Þ

A.4. Proof of ML asymptotic distribution b The first derivative of the log-likelihood function Lðθ =eÞ at θ ML may be expanded into a Taylor series around θ o as: "  "  " # #  #     ϑL θ=e ϑL θ=e ϑ2 L θ=e ¼ þ θ^ ML  θo þo ∥θ^ ML  θo ∥ ðA:32Þ T ϑθ ϑθ ^ ϑθϑθ θ¼θ θ¼θ θ¼θ o

ML

o

b b b As the first derivative is zero at θ ¼ θ ML and θ ML is a consistent estimate of θ o (θ ML  θ o 0), the equation above may be rewritten as: " # " 2 #

 1 ϑLðθ =eÞ 1 ϑ Lðθ =eÞ b  θ ML  θ o ðA:33Þ T NM NM ϑθϑθ ϑθ θ ¼ θ θ¼θ |fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl}o |fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl}o f ðθ o Þ

Fðθ o Þ

or else:  pffiffiffiffiffiffiffiffi    −1 pffiffiffiffiffiffiffiffi  NM f θo NM θ^ ML −θo − F θo

ðA:34Þ

b This expression is similar to that of Eq. (A.21). Thus, the asymptotic distribution of the θ ML estimator may be obtained via the procedure used in the previous paragraph for the WLS. Fðθ o Þ tends (N-1) to F θ as shown in Eq. (A.18). Furthermore, p ffiffiffiffiffiffiffiffi o NMf ðθ o Þ may, via Eq. (A.13), be written as follows: kM N pffiffiffiffiffiffiffiffi 1 kM N ∑ ∑ d ¼ 1 pffiffiffiffiffi ∑ 1 pffiffiffiffi ∑ dkt r kt NM NM k ¼ k1 t ¼ 1 kt M k ¼ k1 Nt¼1

ðA:35Þ



Based on the central limit theorem (also see Lemma B.3 in [25, p. 550]) and the ergodicity property of the data, the vector r kt may be shown that follows Gaussian distribution with zero mean and covariance F k [also see Eqs. (A14.a) and (A14.b)]: d

r kt - N ð0; F k Þ

ðN-1Þ

ðA:36Þ

with: N   1 X E dkt ¼ 0 Efr kt g ¼ pffiffiffiffi Nt¼1

ðA:37Þ

J.S. Sakellariou, S.D. Fassois / Mechanical Systems and Signal Processing 72-73 (2016) 785–807

805

  o n o N     1 1 T 1 N n T T ðA:38Þ F k ¼ Cov r kt ¼ lim E r kt r Tkt ¼ lim ∑ E pffiffiffiffi dkt pffiffiffiffi dkt ¼ lim ∑ E dkt dkt ¼ E dkt dkt N→∞ N→∞ t ¼ 1 N→∞ N t ¼ 1 N N pffiffiffiffiffiffiffiffiffiffi  ffi r is a sum of Gaussian distributed random vectors which leads to a new Thus from Eq. (A.35), f ≜ NM f θo ¼ ∑kkM¼ k1 p1ffiffiffi M kt Gaussian distributed random vector with mean and covariance given by virtue of Eqs. (A37) and (A38) as follows: kM   1   E f ¼ ∑ pffiffiffiffiffi E r kt ¼ 0 M k ¼ k1

F o 9Covff g ¼

ðA39:aÞ

 kM  kM kM X 1 2 1 X 1 X T pffiffiffiffiffi Covfr kt g ¼ Fk ¼ Efdkt dkt g ¼ F θ o M M M k ¼ k1 k ¼ k1 k ¼ k1

ðA39:bÞ

pffiffiffiffiffiffiffiffi b d Thus for N-1 is: f -N ð0; F o Þ. The transformation of f leads to vector NMðθ ML  θ o Þ of Eq. (A.34) which also follows Gaussian distribution with zero mean and covariance matrix P ML , given by virtue of Eqs. (A17), (A18) , (A39.b) as: 2 31 kM 1 X T P ML ¼ ½  F θ   1 F o ½ F T   1 ¼ 4 Efdkt dkt g5 θo o Mk¼k 1

31

2

7 6 7 6 7 6 Pθ 7 6 zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl}|fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{ 7 6 k M X 7 6 1   T  1 0 0 … 0 7 6 Ef ϕ t; θ ϕ t; θ g o o k k 7 6M 2 7 6 k ¼ k1 σ w ðkÞ 7 6 7 6 ¼6 1 1 7 0 … 0 0 7 6 M 2σ 4w ðk1 Þ 7 6 7 6 1 1 7 6 … 0 7 0 0 6 7 6 M 2σ 4w ðk2 Þ 7 6 ⋮ ⋮ ⋮ ⋱ ⋮ 7 6 5 4 1 1 0 0 0 … M 2σ 4w ðkM Þ |fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl}

ðA:40Þ

½ðna þ nb þ 1 þ MÞðna þ nb þ 1 þ MÞ

Thus: pffiffiffiffiffiffiffiffi b d NM ðθ ML  θ o Þ - N ð0; P ML Þ

ðN-1Þ

ðA:41Þ

The covariance matrix P ML as given in Eq. (A.40) may be rewritten as follows: 2

31 kM X N 1 X T 5 4 Efdkt dkt g ¼ NM  F  1 P ML ¼ NM k ¼ k t ¼ 1

ðA:42Þ

1

where F is the Fisher information matrix (also see the corresponding matrix of the conventional case in [25, p. 562]), that is: 8 <ϑLðθ =eÞ F ¼E : ϑθ

θ ¼ θo

2

4ϑLðθ =eÞ ϑθ

θ ¼ θo

3T 9 kM X = N X T 5 ¼ Efdkt dkt g ; t¼1

ðA:43Þ

k ¼ k1

b Thus, asymptotically θ ML follows [see Eq. (A.41)] Gaussian distribution with mean θ o and covariance matrix equal to the Cramer–Rao lower bound F  1 [25, pp. 560–562]. In Eq. (A.40) P θ that corresponds to the covariance matrix of the model projection coefficients is the same with P of WLS as shown from Eq. (A.28): 31 31 2 kM kM   T    T  1 X 1 1 X 1 4 4 5 Efϕk t; θo ϕk t; θo g Efφk t; θo φk t; θo g  Gk 5 ¼ Pθ ¼ M k ¼ k σ 2w ðkÞ M k ¼ k σ 2w ðkÞ 2

1

ðA:44Þ

1

This implies that efficient estimates of the projection coefficients vector θ may be obtained via both WLS and ML estimators when consistent estimates of the true innovation variances σ 2w ðkÞ are used as weights in the former.

806

J.S. Sakellariou, S.D. Fassois / Mechanical Systems and Signal Processing 72-73 (2016) 785–807

Appendix B. Railway vehicle parameter values The parameter values for the railway vehicle suspension are summarized in the following Table B1: Table B1 Model parameters for the railway vehicle suspension [17]. Property

Symbol

Value

Units

Vehicle body mass Vehicle body inertia

Mb Ib

32,804.00

kg kg  m2

Leading bogie mass Leading bogie inertia Trailing bogie mass Trailing bogie inertia Leading/trailing secondary suspension elasticity

Mtl Itl Mtt Itt Ksl/Kst

2:20  106

kg kg kg kg N/m

Leading/trailing secondary suspension damping

Csl/Cst

4:00  104

Ns/m

Primary suspension stiffness (i-th element)

Kpi (i ¼ 1; 2; 3; 4)

5:56  106

N/m

Primary suspension damping (i-th element)

Cpi (i ¼ 1; 2; 3; 4)

1:80  104

Ns/m

Track roughness factor

Ar

Distance between center of vehicle body and bogies Distance between center of vehicle body and secondary dampers Distance between center of bogie and wheelsets Vehicle velocity

L

0:52  10  7 6.40

m

Lc Lbw u

6.85 1.05 25.00

m m m/s

0:67  106 2363.50 1026.36 2261.50 972.54

m

References [1] S. Bruni, R. Goodall, T.X. Mei, H. Tsunashima, Control and monitoring for railway vehicle dynamics, Veh. Syst. Dyn. Int. J. Veh. Mech. Mobil. 45 (7–8) (2007) 743–779. http://dx.doi.org/10.1080/00423110701426690. [2] R.E. Kim, F. Moreu, B.F. Spencer Jr., System identification of an in–service railroad bridge using wireless smart sensors, Smart Struct. Syst. 15 (3) (2015) 683–698. http://dx.doi.org/10.12989/sss.2015.15.3.683. [3] J.D. Hios, S.D. Fassois, Stochastic identification of temperature effects on the dynamics of a smart composite beam: assessment of multi-model and global approaches, Smart Mater. Struct. 18 (035011) . http://dx.doi.org/10.1088/09641726/18/3/035011. [4] K. Worden, H. Sohn, C.R. Farrar, Novelty detection in a changing environment: regression and interpolation approaches, J. Sound Vib. 258 (4) (2002) 741–761. http://dx.doi.org/10.1006/jsvi.2002.5148. [5] V. Verdult, M. Lovera, M. Verhaegen, Identification of linear parameter-varying state-space models with application to helicopter rotor dynamics, Int. J. Control 77 (13) (2004) 1149–1159. http://dx.doi.org/10.1080/0020717042000274527. [6] P.A. Samara, J.S. Sakellariou, G.N. Fouskitakis, J.D. Hios, S.D. Fassois, Aircraft virtual sensor design via a time-dependent functional pooling NARX methodology, Aerosp. Sci. Technol. 29 (1) (2013) 114–124. http://dx.doi.org/10.1016/j.ast.2013.02.001. [7] D.G. Dimogianopoulos, J.D. Hios, S.D. Fassois, FDI for aircraft systems using stochastic pooled NARMAX representations: design and assessment, IEEE Trans. Control Syst. Technol. 17 (6) (2009) 1385–1397. [8] J. De Caigny, J.F. Camino, J. Swevers, Interpolating model identification for SISO linear parameter–varying systems, Mech. Syst. Signal Process. 23 (8) (2009) 2395–2417. [9] A. Almeida Gonçalves Siqueira, R. Nicoletti, N. Norrick, K. Lucchesi Cavalca, H. Fiori De Castro, J. Bauer, F. Dohnal, Linear parameter varying control design for rotating systems supported by journal bearings, J. Sound Vib. 331 (10) (2012) 2220–2232. http://dx.doi.org/10.1016/j.jsv.2012.01.009. [10] F. Casella, M. Lovera, LPV/LFT modelling and identification: overview, synergies and a case study, in: International Symposium on Computer-Aided Control System Design, San Antonio, Texas, USA, September 3–5, 2008. [11] B. Paijmans, W. Symens, H.V. Brussel, J. Swevers, Experimental identification of affine LPV models for mechatronic systems with one varying parameter, Eur. J. Control 14 (2008) 16–29. [12] R. Tóth, Modeling and Identification of Linear Parameter-Varying Systems, Springer, Springer-Verlag, Berlin, Heidelberg, 2010. [13] L. Ljung, System Identification: Theory for the User, second ed. Prentice Hall PTR, New Jersey, 1999. [14] B. Bamieh, L. Giarre, Identification of linear parameter varying models, Int. J. Robust Nonlinear Control 12 (2002) 841–853. [15] R. Tóth, H.S. Abbas, H. Werner, On the state-space realization of LPV input-output models: practical approaches, IEEE Trans. Control Syst. Technol. 20 (1) (2012) 139–153. [16] J.S. Sakellariou, S.D. Fassois, Fault detection and identification in an aircraft skeleton structure via a stochastic functionally pooled model based method, Mech. Syst. Signal Process. 22 (2008) 557–573. [17] J.S. Sakellariou, K.A. Petsounis, S.D. Fassois, Vibration based fault diagnosis for railway vehicle suspensions via a functional model based method: a feasibility study, J. Mech. Sci. Technol. 29 (2) (2014) 471–484. http://dx.doi.org/10.1007/s12206-015-0107-0. [18] S.D. Fassois, J.S. Sakellariou, Statistical time series methods for structural health monitoring, in: C. Boller, et al., (Eds.), Encyclopedia of Structural Health Monitoring, Wiley, Chichester, 2009, pp. 443–472. [19] F.P. Kopsaftopoulos, S.D. Fassois, A Stochastic functional model based method for vibration based damage detection, localization, and magnitude estimation, Mech. Syst. Signal Process. 39 (2013) 143–161. http://dx.doi.org/10.1016/j.ymssp.2012.08.023. [20] C.S. Sakaris, J.S. Sakellariou, S.D. Fassois, Vibration-based damage precise localization in 3D structures: single versus multiple response measurements, Struct. Health Monit. 14 (3) (2015) 300–314. http://dx.doi.org/10.1177/1475921714568407. [21] C.S. Sakaris, J.S. Sakellariou, S.D. Fassois, A time series generalized functional model based method for vibration-based damage precise localization in structures consisting of 1D, 2D and 3D Elements, Mech. Syst. Signal Process. (2015), in press. [22] J.S. Sakellariou, S.D. Fassois, A functional pooling framework for the identification of systems under multiple operating conditions, in: Proceedings of the Mediterranean Control Conference, Athens, Greece, 2007. [23] J.R. Magnus, H. Neudecker, Matrix Differential Calculus, John Wiley and Sons, New York, 1988. [24] W. Greene, Econometric Analysis, Prentice-Hall International, Inc, New Jersey, 1997. [25] T. Söderström, T. Stoica, System Identification, Prentice Hall, 1989.

J.S. Sakellariou, S.D. Fassois / Mechanical Systems and Signal Processing 72-73 (2016) 785–807

807

[26] S.D. Fassois, Parametric identification of vibrating structures, in the encyclopaedia of vibration, in: S.G. Braun, D.J. Ewins, S.S. Rao (Eds.), Academic Press, 2001, pp. 673-685. http://dx.doi.org/10.1006/rwvb.2001.0121. [27] M. Abramowitz, I.A. Stegun, Handbook of Mathematical Functions, Dover, New York, 1970. [28] W.W.S. Wei, Time Series Analysis: Univariate and Multivariate Methods, Addison-Wesley Publishing Company, 1990. [29] J.M. Mendel, Lessons in Estimation Theory for Signal Processing, Communications, and Control, Prentice Hall, 1995. [30] H. White, Asymptotic Theory for Econometricians, Revised Ed. Academic Press, San Diego, 2001. [31] E.R. Dougherty, Probability and Statistics for the Engineering, Computing and Physical Sciences, Prentice Hall, Englewood Cliffs, New Jersey, 1990.