Convex Optimization for Blind Identification of Monotone Wiener Systems*

Convex Optimization for Blind Identification of Monotone Wiener Systems*

Preprints of the 17th IFAC Symposium on System Identification Preprints of the 17th IFAC Symposium on System Identification Beijing International Cent...

1MB Sizes 5 Downloads 84 Views

Preprints of the 17th IFAC Symposium on System Identification Preprints of the 17th IFAC Symposium on System Identification Beijing International Centeron Preprints of the the 17th 17th Convention IFAC Preprints of IFAC Symposium Symposium on System System Identification Identification Beijing International Convention Center Available online at www.sciencedirect.com October 19-21, 2015. Convention Beijing, China Beijing Center Beijing International International Center October 19-21, 2015. Convention Beijing, China October October 19-21, 19-21, 2015. 2015. Beijing, Beijing, China China

ScienceDirect

IFAC-PapersOnLine 48-28 (2015) 626–631

Convex Optimization for Convex Optimization for Convex Optimization for Convex Optimization for Identification of Monotone Identification of Monotone Identification of Monotone  IdentificationSystems of Monotone  Systems Systems Systems 

Blind Blind Blind Blind Wiener Wiener Wiener Wiener

∗ Kristiaan Pelckmans ∗ Kristiaan Pelckmans ∗ Kristiaan Kristiaan Pelckmans Pelckmans ∗ ∗ ∗ Uppsala University, Department of Information Technology, Division Uppsala University, Department of Information Technology, Division ∗ ∗ Uppsala University, Department of Information Technology, of Systems and Control kp@ it.uu.se). Uppsala University, Department of(e-mail: Information Technology, Division Division of Systems and Control (e-mail: kp@ it.uu.se). of of Systems Systems and and Control Control (e-mail: (e-mail: kp@ kp@ it.uu.se). it.uu.se).

Abstract: This paper studies Convex Optimization (CO) method to learn monotone Wiener Abstract: This This paper paper studies studies aa a Convex Convex Optimization Optimization (CO) (CO) method method to to learn learn aaa monotone monotone Wiener Wiener Abstract: system, consisting of (i) a linear state-space model of finite order corresponding to the LTI block Abstract: This paper studies a Convex Optimization (CO) method to learn a monotone Wiener system, consisting of (i) a linear state-space model of finite order corresponding to the LTI block system, consisting of (i) a linear state-space model of finite order corresponding to the LTI block of the system, and of (ii) a monotone output nonlinearity. This work advances the previously system, consisting of (i) a linear state-space model of finite order corresponding to the LTI block of the system, and of (ii) a monotone output nonlinearity. This work advances the previously of the system, and of (ii) a monotone output nonlinearity. This work advances the previously published work of the author on the MINLIP method towards state-space descriptions. The of the system, and of (ii) a monotone output nonlinearity. This work advances the previously published work of the author on the MINLIP method towards state-space descriptions. The published work of the the author on the the MINLIP method towards state-space descriptions. The motivation to study Wiener systems is that they are arguably the simplest, non-trivial examples published work of author on MINLIP method towards state-space descriptions. The motivation to study Wiener systems is that they are arguably the simplest, non-trivial examples motivation tosystems, study systems is are arguably the simplest, non-trivial examples of nonlinear and that they provide aa forum develop techniques can be used for motivation study Wiener Wiener systems is that that they they are to arguably simplest,which non-trivial of nonlinear nonlineartosystems, systems, and that that they provide provide forum to developthe techniques which can be beexamples used for for of and they a forum to develop techniques which can used identifying Volterra, LFTs or other more advanced nonlinear dynamical models. The key idea of nonlinear systems, and that they provide a forum to develop techniques which can be used for identifying Volterra, LFTs or other more advanced nonlinear dynamical models. The key idea identifying Volterra, LFTs orthe other more advanced nonlinear models. The key is to use the nuclear norm Hankel corresponding to the model used for identifying LFTsof other morematrix advanced nonlinear dynamical dynamical models. Thedescribing key idea idea is to to use use the theVolterra, nuclear norm norm oforthe the Hankel matrix corresponding to the the model model used for for describing is nuclear of Hankel matrix corresponding to used describing the LTI block. The approach is extended towards the use of CO for the blind identification of is to use the nuclear norm of the Hankel matrix corresponding to the model used for describing the LTI block. The approach is extended towards the use of CO for the blind identification of the LTI block. The approach is extended towards the use of CO for the blind identification of such systems. the LTI block. The approach is extended towards the use of CO for the blind identification of such systems. such systems. such systems. © 2015, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved. Keywords: Block-oriented Identification Keywords: Block-oriented nonlinear nonlinear Systems, Systems, Wiener Wiener Systems, Systems, Blind Blind Identification Keywords: Keywords: Block-oriented Block-oriented nonlinear nonlinear Systems, Systems, Wiener Wiener Systems, Systems, Blind Blind Identification Identification 1. cific application. A noteworthy effort are the Al1. INTRODUCTION INTRODUCTION cific application. A noteworthy effort are the Al1. cific application. A noteworthy effort are the Alternating Direction Method of Mulipliers (ADMM) 1. INTRODUCTION INTRODUCTION cific application. A noteworthy effort are the Alternating Direction Method of Mulipliers (ADMM) ternating Direction Method of Mulipliers (ADMM) techniques, which have found their way into system ternating Direction Method of Mulipliers (ADMM) A Wiener system consists of a linear LTI followed by techniques, which have found their way into system A Wiener Wiener system system consists of of a linear LTI LTI followed followed by by techniques, which have found their way into system ID. See e.g. Wahlberg al. (2012); al. (2013). A techniques, which haveet found their Liu wayet into system a static nonlinearity. Identification (ID) of such systems A Wiener system consists consists of aa linear linear LTI followed by ID. See e.g. Wahlberg et al. (2012); Liu et al. (2013). a static nonlinearity. Identification (ID) of such systems ID. See e.g. Wahlberg et al. (2012); Liu et al. (2013). • (Analysis) There are good tools of analysis available, aaaims static nonlinearity. Identification (ID) of such systems ID. See e.g.There Wahlberg et al.tools (2012); Liu et al. (2013). at identifying both blocks from observation of inputstatic nonlinearity. Identification (ID) of such systems • (Analysis) are good of analysis available, aims at at identifying identifying both both blocks blocks from from observation observation of of inputinput•• (Analysis) There are tools available, e.g. KKT conditions. Specifically, in this way the aims (Analysis) There are good good tools of of analysis analysis available, output that this is uniquely possible aims at behaviour. identifying The bothfact blocks from observation ofpossible inpute.g. KKT conditions. Specifically, in this way the output behaviour. The fact that this is uniquely e.g. KKT conditions. Specifically, in this way the research direction connects directly to methods of output behaviour. possible The fact fact is that this is isinuniquely uniquely e.g. KKT conditions. Specifically, in this way the under ideal circumstances, treated the research on output behaviour. The that this possible research direction connects directly to methods of under ideal circumstances, is treated in the research on research direction connects directly to methods of Compressive Sampling, see e.g. Recht et al. (2010) under ideal circumstances, is treated in the research on research direction connects directly to methods of identifiability as studied in Dai et al. (2015), going back under ideal circumstances, is treated in the research on Compressive Sampling, see e.g. Recht et al. (2010) identifiability as studied in Dai et al. (2015), going back Compressive Sampling, see e.g. Recht et al. (2010) for citations. identifiability as studied going back in Dai et al. (2015), Compressive Sampling, see e.g. Recht et al. (2010) to work in the statistical community on additive index identifiability as studied in Dai et al. (2015), going back for citations. to work work in in the statistical statistical community community on on additive index index citations. •• for (Flexible) The estimators are often easily extended to for citations. models and Kulasekera (2007). Much research work to workLin in the the statistical community on additive additive index (Flexible) The estimators are often easily extended models Lin and Kulasekera (2007). Much research work • (Flexible) The estimators are often easily extended to handle additional structural properties like noise, models Lin and Kulasekera (2007). Much research work • (Flexible) The estimators are often easily extended has been devoted to designing and testing algorithms to models Lin and Kulasekera (2007). Much research work to handle additional structural properties like noise, has been devoted to designing and testing algorithms to to handle additional structural properties like noise, missing samples, etc. see e.g. Grossmann et al. (2009); has been to designing and testing algorithms to devoted to handle additional structural properties like(2009); noise, perform identification in a efficient, has been such devoted to designing and testing noise-tolerant, algorithms to missing samples, etc. see e.g. Grossmann et al. perform such identification in a efficient, noise-tolerant, missing samples, etc. see e.g. Grossmann et al. (2009); Smith (2012); Liu et al. (2013). perform such identification in a efficient, noise-tolerant, missing samples, etc. see e.g. Grossmann et al. (2009); analytical tractable way, see e.g. Giri and Bai (2010) for perform such identification in a efficient, noise-tolerant, Smith (2012); Liu et al. (2013). analytical tractable tractable way, see see e.g. Giri Giri and Bai Bai (2010) for for Smith Liu (2013). analytical Smith (2012); (2012); Liu et et al. al. of (2013). an overview. analytical tractable way, way, see e.g. e.g. Giri and and Bai (2010) (2010) for We studied the identification monotone Wiener systems an overview. We studied the identification of monotone Wiener systems an overview. an overview. We studied the identification of monotone Wiener systems in a series of publications before. In Pelckmans (2011), We studied the identification of monotone Wiener systems In the present line of work we make the additional asin a series of publications before. In Pelckmans (2011), In the the present present line line of of work work we we make make the the additional additional asas- in a series of publications before. In Pelckmans (2011), inspired on earlier work by Zhang et al. (2006), introduced In in a series of publications before. In Pelckmans (2011), sumption of the static nonlinearity being monotone. We In the present line of work we make the monotone. additional We as- inspired on earlier work by Zhang et al. (2006), introduced sumption of the static nonlinearity being inspired on earlier work by Zhang et al. (2006), introduced the basic MINLIP estimator. A simple recursive version sumption static nonlinearity of the being monotone. We inspired on earlier work by Zhang et al. (2006), introduced argued before Pelckmans (2011) that this assumption is sumption of the static nonlinearity being monotone. We basic MINLIP estimator. A simple recursive version argued before before Pelckmans Pelckmans (2011) (2011) that this this assumption is is the the basic MINLIP estimator. A simple recursive version studied in Pelckmans and (2011). Theoretical argued the basic MINLIP estimator. A Dai simple recursive version often satisfied, when studying saturation effects, quanargued before e.g. Pelckmans (2011) that that this assumption assumption is was was studied in Pelckmans and Dai (2011). Theoretical often satisfied, e.g. when studying saturation effects, quanwas studied in Pelckmans and Dai (2011). Theoretical issues as identifiability and persistency of excitation were often satisfied, e.g. when studying saturation effects, quanwas studied in Pelckmans and Dai (2011). Theoretical tisation effects, transformations and others. Note that we often satisfied, e.g. when studying saturation effects, quanissues as identifiability and persistency of excitation were tisation effects, effects, transformations transformations and and others. others. Note Note that that we we issues as identifiability and persistency of excitation were studied in Dai et al. (2015). Application of such systems tisation issues as identifiability and persistency of excitation were do not require the static nonlinearity to be invertible (in tisation effects, the transformations and others. Note that (in we studied in Dai et al. (2015). Application of such systems do not require static nonlinearity to be invertible studied in Dai et al. (2015). Application of such systems towards control was studied in Dai and Pelckmans (2012). do not require the static nonlinearity to be invertible (in studied in Dai et al. (2015). Application of such systems such case one can simply invert the nonlinear system into do not require the static nonlinearity to be invertible (in was studied in Dai and Pelckmans (2012). such case case one one can can simply simply invert invert the the nonlinear nonlinear system system into into towards control control was studied in and (2012). such towards control was the studied in Dai Dai andisPelckmans Pelckmans (2012). an easier solve If the nonlinearity such caseto one canHammerstein simply invertsystem). the nonlinear system into towards Identification using nuclear norm not new -- see e.g. an easier to solve Hammerstein system). If the nonlinearity Identification using the nuclear norm is not new see e.g. an easier to solve Hammerstein system). If the nonlinearity is noninvertible (e.g. if it represents a saturation), then one an easier to solve Hammerstein system). If the nonlinearity Identification using the nuclear norm is not new see e.g. the work of Fazel et al. (2001), and its application to block is noninvertible (e.g. if it represents a saturation), then one Identification using the nuclear norm is not new see e.g. the work of Fazel et al. (2001), and its application to block is noninvertible (e.g. if it represents a saturation), then one needs to resort to forward approaches as the one below. is noninvertible (e.g. if it represents a saturation), then one the work of Fazel et al. (2001), and its application to block structured systems was initiated in Falck et al. (2010). needs to resort to forward approaches as the one below. the work of Fazel et al. (2001), and its application to block structured systems was initiated in Falck et al. (2010). needs to forward approaches as the one below. needsargument to resort resort to to approaches systems was in Falck et (2010). Dai studies the theoretical question of nuclear structured systems was initiated initiated inwether Falck use et al. al. (2010). The to forward revisit the the problemasof oftheID IDone of below. Wiener structured Dai studies the theoretical question wether use of nuclear The argument to revisit problem of Wiener Dai studies the theoretical question wether use of nuclear norms for a highly abstracted task of ID can work. The The argument to revisit the problem of ID of Wiener Dai studies the theoretical question wether use of nuclear systems, and to rephrase this task as a problem of Convex The argument to revisit the problem of ID of Wiener for aa highly abstracted task of ID can work. The systems, and and to to rephrase rephrase this this task task as as aa problem problem of of Convex Convex norms norms for highly abstracted task of ID can work. The key difference with the study of generic low-rank recovery systems, norms for a highly abstracted task of ID can work. The Optimization (CO) is threefold: systems, and to rephrase this task as a problem of Convex key difference with the study of generic low-rank recovery Optimization (CO) is threefold: key difference with the study of generic low-rank recovery as in Cand` e s and Recht (2009); Recht et al. (2010); Gross Optimization (CO) is threefold: key difference withRecht the study of Recht genericetlow-rank recovery Optimization (CO) is threefold: as in Cand` e s and (2009); al. (2010); Gross • (Numerical) There are efficient (general purpose) in eess and Recht (2009); Recht et Gross (2011) is that the studied stochastic do not fit as in Cand` Cand` and Recht (2009); Recht sampling et al. al. (2010); (2010); Gross (Numerical) There There are are efficient efficient (general (general purpose) purpose) as is that the studied stochastic sampling do not fit ••• (Numerical) solvers available, moreover research on optimisation (Numerical) There are efficient (general purpose) (2011) (2011) is that the studied stochastic sampling do not fit within the structures typically employed in ID. See e.g. (2011) is that the studied stochastic sampling do not fit solvers available, moreover research on optimisation within the structures typically employed in ID. See e.g. solvers available, moreover research on optimisation algorithms for CO has resulted in very effective buildsolvers available, moreover research on optimisation within the structures typically employed in ID. See e.g. Markovsky (2011) for a discussion of low rank recovery in within the structures typically employed in ID. See e.g. algorithms for CO has resulted in very effective buildMarkovsky (2011) for aa discussion of low rank recovery in algorithms for has resulted in effective building blocks which tailored directly to the spealgorithms for CO COcan hasbe resulted in very very effective buildMarkovsky (2011) for discussion of low rank recovery in aa context of system ID. Markovsky (2011) for a discussion of low rank recovery in ing blocks which can be tailored directly to the speof system ID. ing ing blocks blocks which which can can be be tailored tailored directly directly to to the the spespe- aa context context of system ID. context of system ID. The present paper extends the MINLIP estimator to the  This work was supported by the Swedish Research Council under The present paper extends the MINLIP estimator to the  This work was supported by the Swedish Research Council under The present paper extends the MINLIP estimator to the  class of systems where the LTI block can be represented This work was supported by the Swedish Research Council under contract 621-2007-6364. The present paper extends the MINLIP estimator to the  This work was supported by the Swedish Research Council under class of systems where the LTI block can be represented contract 621-2007-6364. class of systems where the LTI block can be represented contract 621-2007-6364. class of systems where the LTI block can be represented contract 621-2007-6364. Copyright IFAC 2015 626 Hosting by Elsevier Ltd. All rights reserved. 2405-8963 © 2015, IFAC (International Federation of Automatic Control) Copyright © IFAC 2015 626 Copyright IFAC 2015 626 Peer review© of International Federation of Automatic Copyright ©under IFAC responsibility 2015 626Control. 10.1016/j.ifacol.2015.12.199

2015 IFAC SYSID October 19-21, 2015. Beijing, China

Kristiaan Pelckmans et al. / IFAC-PapersOnLine 48-28 (2015) 626–631

by a state-space system of small order. The challenge is to keep the estimator (1) one step, and (2) convex, so that the result is both practical and amendable for analysis. The second goal is to make an estimator which is capable of handling messy signals with very low Signal-to-Noise Ratios (SNR), and where observations or missing or where signals are sampled unevenly. Then we proceed at looking into the task of blind identification of such systems. That is: given the output signal, recover both the input signal as well as the system itself. Blind identification in the context of Wiener systems has been studied extensively, see e.g. Bai (2002); Vanbeylen et al. (2009); Giri and Bai (2010); Wills et al. (2013); Van Vaerenbergh et al. (2013). Contrary to the approaches taken there, we investigate how far approaches based on CO can bring us. This research track is inspired by the recent Ahmed et al. (2013) were blind identification of linear systems using CO and nuclear norms was analysed. We revisit here their approach, and phrase it in the context of monotone Wiener systems. This paper is organised as follows. The next section formalises the problem, and states the resulting estimators. This section is broken up in three parts: first, the linear ID using nuclear norm estimators is stated. Extended investigations of this result can be found in the companion paper Pelckmans and Cubo (2015). Secondly, results are extended to the monotone Wiener case, and thirdly the method based on CO of blind ID in monotone Wiener systems is stated. Section 4 provides a series of numerical examples, and Section 5 gives further work.

2. FORMULATION 2.1 Nuclear Norm Minimisation for ID of LTIs This subsection considers the case where the system can be described by an LTI. This approach is discussed in some further detail in . The presented techniques are extended in the next subsection to Monotone Wiener problems. At this point, only the SISO case is considered. (The extension to the MIMO case is straightforward). Fix signals {ut }t ⊂ R and {yt }t ⊂ R be univariate signals. These time-series are related as yt = yˆt + t = ut ∗ h + t =

d 

ut−τ hτ + t = tr (Hd (h)Ud (t)) + t ,

(1)

τ =1

where {t }t can be arbitrary, d can be arbitrary large, h ∈ Rd is referred to as the Impulse Response Function (IRF) and ∗ denotes the convolution operator, and where   h1 h2 h3 . . . hd   h2 h3 h4  h h h , 3 4 5 Hd (h) =  (2)  . ..   .. . hd . . . h2d−1

and

627

ut−d  ut−2 ut−3 ut−1 ... 2 3 d  u  t−2 ut−3 ut−4 0     u2 u3 u4  t−3 t−4 t−5 0  Ud (t) =  . 4 5   3   .. ..   . .  u t−d 0 0 ... 0 d Note that d  τ u2t−τ ≤ R2 ln(d + 1). Ud (t)2F = 2 τ τ =1

627



(3)

(4)

in case |ut | ≤ R, indicating that the dependence on d in Ud will only be logarithmically. The matrix Hd (h) has a Hankel structure.

In the case that there is a state-space system explaining the data, or there exist an order m, a sequence of states {xt }t ⊂ Rm , an initial state x0 = 0, and matrices A ∈ Rm×m , B ∈ Rm , C ∈ R1×m such that  xt = Axt−1 + But (5) yt = Cxt + t . Then it is wellknown that the corresponding Hd (h) is of rank m, called the McMillan degree of the system. Hence (6) yt ≈ tr (Hd (A, B, C)Ud (t)) + t with equality for large enough d. In the rest of the paper, we assume that one has selected a large enough d such that the remainders become negligible. Here we defined   CB CAB CA2 B CAd−1 B   ..   CA1 B CA3 B .   2   Hd (A, B, C) =  CA B    .. ..   . . d−1 d 2d−1 CA B CA B CA B . . . = OC, (7) which corresponds to Hd (h) where h is the impulse response corresponding to the system (6). The matrix Ud (t) is as defined in eq. (3). Here the matrices O ∈ Rd×m , C ∈ Rm×d are known as the (extended) observability, controllability matrix respectively. This result is the subject of realisation theory, see e.g. Kailath (1980). Observe that a finite d seems to imply the implicit use of an FIR filter. In practice, this is not an issue since one may chose d arbitrarily large, say d = O(102 ) when the true order of the system is O(10) and the signals are long enough. In case of short signals (such that n < d), one has to cope with explicitly with initial value estimation. In theory, this is not disrupting results since the resulting approximation error results in negligible terms: Piecing those insights together leads to the following estimation (identification) approach for LTIs. Given signals {ut } and {yt }t of length n. The representation of eq. (1) leads directly to an identification scheme.  yt = tr (HUd (t)) , ∀t = d + 1, . . . , T min H∗ s.t. H H ∈ H, (8) saying find the minimal order system that explains the data. Here  · ∗ denotes the nuclear norm of its argument, which is a convex proxy to the rank. This idea was already

2015 IFAC SYSID 628 October 19-21, 2015. Beijing, China

Kristiaan Pelckmans et al. / IFAC-PapersOnLine 48-28 (2015) 626–631

put forward in the conclusions of Kailath (1980) in the conclusions. 2.2 Monotone Wiener Systems In Pelckmans (2011), an estimator for a subclass of Wiener systems is studied. The estimator is called MINLIP, and focusses on the case where the static nonlinearity is a monotone function. Monotone functions as quantisation, saturation and transformation functions occur in many practical situations. Note that those functions are often not invertible, so that the ’inverse approach’ of rephrasing the Wiener ID task as (simpler) Hammerstein ID problem is not feasable. Practical issues Ahmed et al. (2013) as well as theoretical properties of MINLIP are investigated and published on in some detail, see Dai et al. (2015). Formally, the model becomes yt = yˆt + t = f (ut ∗ h) + t = f



d 

τ =1

ut−τ hτ



+ t , (9)

where {t }t can be arbitrary, d can be arbitrary large, h ∈ Rd is referred to as the Impulse Response Function (IRF) and ∗ denotes the convolution operator. The static nonlinear function is monotonically increasing, i.e. (f (x) − f (x )) (x − x ) ≥ 0, ∀x, x ∈ R.

(a)

(10)

The aforementioned estimator can be plugged in easily in MINLIP as follows:  yt − yt ≤ tr (H(Ud (t) − Ud (t ))) , ∀yt > yt min H∗ s.t. H H ∈ H, (11) We refer to this estimator as to MINLIP∗ for its use of the nuclear norm  · ∗ . We note that the convergence analysis as presented in Dai et al. (2015) is still valid, since this one only uses the geometry of the constraints. These are exactly the same in MINLIP∗ as in MINLIP. We exemplify this estimator as follows. A random system of order n = 5 with poles on the 0.75 spectral-radius circle is generated. The result of the state-space system when excited with white noise of length T = 100 is mapped through the static tanh function. This input and output signal is then given to the estimator (11), and the resulting estimate is given in figure (1) a and b. Note that the order of the system is only correctly estimated from the singular value spectrum when setting the threshold large enough. If the threshold here is too low, one would end up with a system of order 24! Unfortunately, this is often the issue and is hence a potential shortcoming of the method. For now, we consider that the true order is given. Figure (2) gives results when using the 2.3 Blind Identification of a Monotone Wiener System Based on the convex formulation of blind identification as in Ahmed et al. (2013), the above formulation can be extended to Blind identification of monotone Wiener systems. Here, we only consider the noise-free case. The key is to do lifting, that is re-parametrize problems in terms of an outer-product matrix Z ∈ RT ×d between the unknown input signal of length T and the unknown 628

(b)

Fig. 1. The results of the MINLIP∗ estimator applied on signals arising from a stable order 5 LTI system followed by tanh. (a) shows the output signal, input signal, estimated IRF and singular spectrum of H(h), (b) shows the poles and zeros of the estimated (red) and true (blue) system. impulse response function of length d. Then one has that Zt,τ = ut hτ for all t = 1, . . . , T and τ = 0, . . . , d − 1, or

Z = uhT , (12) where the vectors u and h are defined as u = (u1 , . . . , uT )T ∈ RT and h = (h0 , . . . , hd−1 )T ∈ Rd . Then, this matrix Z has to rank one, and the nuclear norm Z∗ can be used to enforce this property as well as possible. This approach was proposed in Ahmed et al. (2013) for LTIs. A straightforward extension to MINLIP is 

min Z∗ s.t. yt − yt ≤ tr(Z t − Z t ), ∀yt > yt . Z

(13)

t where Z t is a (reverse) sub matrix of Z such that Zij = Zt−i+1,j for all i, j = 1, . . . , d. Hence

2015 IFAC SYSID October 19-21, 2015. Beijing, China

Kristiaan Pelckmans et al. / IFAC-PapersOnLine 48-28 (2015) 626–631

629

(2) the term Z∗ does not reflect the order of the system as in Subsection 2.2, and (3) the Hankel-structure specific to the LTI system does not come in. These insights motivate the following formulation instead. Instead of considering the rank-1 lifted matrix Z, lets consider a matrix M ∈ RT −d+1×d of rank equal to the order n of the system. This one represents M = T (u)H(h)  u u d

d−1

...

u1 wd u2 wd .. .



w   w0   ud+1 ud1   ...  h0 h1 . . . hd−1   w0 w 1   .   h1 h2  . hd    . , =  ut ut−1   .. u . t−d+1  ..   .  w0 w1 wd   hd−1 hd  h2d−2 ..   ..   . .  uT uT −1 uT −d+1  ... w0 w1 wd (15)

Fig. 2. The results of the original MINLIP estimator applied to the same system. Compare to figure (1.a). tr(Z t ) =

d−1 

Zt−τ,τ =

τ =0

d−1 

ut−τ hτ .

where T (u) denotes a structured matrix as defined above. The matrix H(h) is the Hankel matrix corresponding to the IRF h. Hence

(14)

τ =0

Mt,s =

d−1  ut−τ

τ =0



hτ +s−1 , ∀t = d, . . . , T, s = 1, . . . , d. (16)

The weights wτ are defined as wτ = e1+τ .

(17)

vs = e1+s .

(18)

Then define vs as

such that for any τ = 0, . . . , 2d − 1, one has min(d,τ +1)



vs−1

s=max(1,τ +2−d)

Fig. 3. Blind identification as in eq. (13) applies to the same system as above (n = 5, ρ = 0.75), and with d = 20 and T = 50. Note that the rank of H(h) is not enforced to be small. This approach is suboptimal as (1) it doesn’t reflect the fact that the IRF represents an LTI block of small order, 629

≈ emin(d,τ +1) − emax(1,τ +2−d)     = min ed , eτ +1 − max e1 , eτ +2−d

≈ eτ +1 = wτ ,

(19)

  where the first ’≈’ arises by approximating by . The second ’≈’ follows since we focus on cases where d  τ . That is, if d is chosen sufficiently large then terms hs with s > d will be enforced to be zero anyways. Then

2015 IFAC SYSID 630 October 19-21, 2015. Beijing, China

yˆt = f

2d−1 

ut−τ hτ

Kristiaan Pelckmans et al. / IFAC-PapersOnLine 48-28 (2015) 626–631



  τ =0  min(d,τ +1) 2d−1    1 vs−1  ut−τ hτ  ≈f wτ τ =0 s=max(1,τ +2−d)  d  d+s−1   ut−τ =f vs−1 hτ wτ s=1 τ =s−1  d  d−1   ut−τ −s+1 =f vs−1 hτ +s−1 wτ +s−1 s=1 τ =0   d  d−1  vs−1   ut−τ −s+1 hτ +s−1 =f ws−1 τ =0 wτ s=1   d d−1   ut−τ −s+1 hτ +s−1 =f wτ s=1 τ =0   = f tr(M t ) , (20) for all t = 2d − 2, . . . , T . The third equality follows from swapping the order of the double sum, and the fifth one followsby definition of wτ as in eq. (17). We also use that  vs−1 = 1 by definition of the weights in eq. (17) and ws−1

eq. (18). Here M t is a (reverse) sub matrix of M such that t Mij = Mt−i+1,j for all t = 2d−1, . . . , T and i, j = 1, . . . , d. Then, the final estimator can be written as    min M ∗ s.t. yt − yt ≤ tr M t − M t , ∀yt > yt . M

(21) Then the estimate of M and m can be used to extract the input signal and the system as follows

specific subclass of Wiener models - the monotone ones and comparison with more general applicable approaches is not fair. (2) They phrase the problem as a Convex Optimisation (CO) problem, working under different constraints than other approaches. (3) The handle a massive over-parametrization of the problem, and are in a sense more useful for estimating systems of a higher order, and finally (4) we focus on the noise-free case, making approaches as based on maximum likelihood less suitable. The CO framework makes it straightforwardly possible to handle additional structure, constraints, noise terms, or (structured) missing entries, see e.g. the companion paper Pelckmans and Cubo (2015) which deals with the LTI case. This example consists of an order n = 5 and n = 25 (!) systems, with spectral radius bounded by 0.75 (ensuring that the IRF ’dies off’ sufficiently fast). The number of samples T = 100 (!), the system is fed with white noise, and d = 100. There is no measurement-, or process-noise in this example, and the nonlinearity is given as (22) fα (z) = tanh(αz), for α > 0. If α → 0, then this function behaves as a (scaled) linear function. If α → +∞, then this function behaves as a hard sign function. The experiment now investigates how the estimate deteriorates with an increasing α. The results are plotted in Fig. (4) for n = 5 and in (5) for n = 25. For n = 5, 25, the MINLIP∗ performs on average a factor two better than MINLIP, unless α is small, meaning the system is almost LTI. The log-ratios are in both cases significantly above zero, with MINLIP∗ performing on average almost twice (!) as good as MINLIP. These results indicate that the MINLIP∗ technique can even for highorder systems be advised over the MINLIP technique as studied in Pelckmans (2011).

• The estimated input signal is given (up to a scaling) by m. • The order of the estimated system is equal to the rank of the estimate M . This is so since the Hankel matrix H(h) is of rank n, and the matrix T (u) is presumed to be of full rank. The latter is true when the (unknown) input is persistently exciting of sufficient order. • The estimated IRF h is part of the right range space of M . The latter two quantities are provided by a Singular Value Decomposition (SVD) of M . The thus obtained estimator (21) is the same as the estimator (13), except that (1) the rank of M is not one, and (2) the constraints only apply when t ≥ 2d − 1 instead of t ≥ d − 1. In practical applications, we note that the obtained matrix M is more often than not rank one, and the above derivation gives a better explanation than eq. (13) did. 3. NUMERICAL EXAMPLES 3.1 Monotone Wiener State-Space Systems The first example studies the difference between the MINLIP and the MINLIP∗ estimator. Comparisons with other non-convex approaches of the MINLIP estimator are reported in Pelckmans (2011) and later work of these authors. Further arguments of restricting this subsection to the class of MINLIP estimators is that (1) they target a 630

Fig. 4. Comparison of MINLIP vs. MINLIP∗ for systems of order n = 5. (a) performances of either in terms of α in eq. (22), and (b) the log-ratio of either. A log-ratio of 0 means they perform equal. A positive log-ratio means that MINLIP∗ outperforms MINLIP. 3.2 Blind Identification Blind identification is more difficult, and both formulations as presented here are not advised yet for use in practice. Nevertheless, we compare the two methods as presented in Fig. (6) on the system of n = 5 as above. Note that the performance increases a bit using the second formulation, substantiating the theoretical arguments. 4. FURTHER WORK This paper studied the task of ID of the monotone Wiener problem where the LTI block takes the form of a state-

2015 IFAC SYSID October 19-21, 2015. Beijing, China

Kristiaan Pelckmans et al. / IFAC-PapersOnLine 48-28 (2015) 626–631

Fig. 5. Comparison of MINLIP vs. MINLIP∗ for systems of order n = 25. (a) performances of either in terms of α in eq. (22), and (b) the log-ratio of either. A log-ratio of 0 means they perform equal. A positive log-ratio means that MINLIP∗ outperforms MINLIP.

Fig. 6. Comparison of the methods proposed for blind identification of Wiener systems, using methods of CO. (a) comparisons of either in terms of α as in eq. (22). (b) the log-ratio of the performance, indicating that the latter technique performs slightly better than the rank-one approach. space model. The main argument is to rephrase the ID problem as a task of CO, so that (a) efficient solvers can be employed, (b) the estimator is amendable for analysis, and (c) the estimator are easily adjusted to new situations and applications. The approach is extended towards blind identification of such systems. There are ample issues left, including (1) how to deal with noise in this setting?, (2) how can we make the inherent order estimate more practical?, (3) When does the method succeed in recovery (analysis)? REFERENCES Ahmed, A., Romberg, J., and Recht, B. (2013). Blind deconvolution using convex programming. Information Theory, IEEE Transactions on, 60(3), 1711 – 1732. Bai, E.W. (2002). A blind approach to the hammerstein–wiener model identification. Automatica, 38(6), 967–979. Cand` es, E.J. and Recht, B. (2009). Exact matrix completion via convex optimization. Foundations of Computational mathematics, 9(6), 717–772. Dai, L. and Pelckmans, K. (2012). An online algorithm for controlling a monotone wiener system. In Control and Decision Conference (CCDC), 2012 24th Chinese, 1585–1590. IEEE. Dai, L., Pelckmans, K., and Bai, E.W. (2015). Identifiability and convergence analysis of the minlip estimator. Automatica, 51, 104–110. Falck, T., Suykens, J.A., Schoukens, J., and De Moor, B. (2010). Nuclear norm regularization for overparametrized hammerstein

631

631

systems. In Decision and Control (CDC), 2010 49th IEEE Conference on, 7202–7207. IEEE. Fazel, M., Hindi, H., and Boyd, S.P. (2001). A rank minimization heuristic with application to minimum order system approximation. In American Control Conference, 2001. Proceedings of the 2001, volume 6, 4734–4739. IEEE. Giri, F. and Bai, E. (2010). Block-oriented nonlinear system identification. Springer. Gross, D. (2011). Recovering low-rank matrices from few coefficients in any basis. Information Theory, IEEE Transactions on, 57(3), 1548–1566. Grossmann, C., Jones, C.N., and Morari, M. (2009). System identification via nuclear norm regularization for simulated moving bed processes from incomplete data sets. In Decision and Control, 2009 held jointly with the 2009 28th Chinese Control Conference. CDC/CCC 2009. Proceedings of the 48th IEEE Conference on, 4692–4697. IEEE. Kailath, T. (1980). Linear systems, volume 1. Prentice-Hall Englewood Cliffs, NJ. Lin, W. and Kulasekera, K. (2007). Identifiability of single-index models and additive-index models. Biometrika, 94(2), 496–501. Liu, Z., Hansson, A., and Vandenberghe, L. (2013). Nuclear norm system identification with missing inputs and outputs. Systems & Control Letters, 62(8), 605–612. Markovsky, I. (2011). Low rank approximation: algorithms, implementation, applications. Springer Science & Business Media. Pelckmans, K. (2011). MINLIP for the Identification of Monotone Wiener Systems. Automatica, In Press. Pelckmans, K. and Cubo, R. (2015). Nuclear norms for system identification - a direct input-output approach. In Submitted. Pelckmans, K. and Dai, L. (2011). A simple recursive algorithm for learning a monotone wiener system. In Proceedins of the 50th IEEE Conference on Decision and Control and European Control Conference (CDC-ECC 2011). IEEE. Recht, B., Fazel, M., and Parrilo, P. (2010). Guaranteed minimumrank solutions of linear matrix equations via nuclear norm minimization. SIAM Review, 52(3), 471–501. Smith, R.S. (2012). Nuclear norm minimization methods for frequency domain subspace identification. In American Control Conference (ACC), 2012, 2689–2694. IEEE. Van Vaerenbergh, S., Via, J., and Santamaria, I. (2013). Blind identification of simo wiener systems based on kernel canonical correlation analysis. Signal Processing, IEEE Transactions on, 61(9), 2219–2230. Vanbeylen, L., Pintelon, R., and Schoukens, J. (2009). Blind maximum-likelihood identification of wiener systems. Signal Processing, IEEE Transactions on, 57(8), 3017–3029. Wahlberg, B., Boyd, S., Annergren, M., and Wang, Y. (2012). An admm algorithm for a class of total variation regularized estimation problems. In Proceedings of the 16th IFAC Symposium on System Identification, 83–88. Wills, A., Sch¨ on, T.B., Ljung, L., and Ninness, B. (2013). Identification of hammerstein–wiener models. Automatica, 49(1), 70–81. Zhang, Q., Iouditski, A., and Ljung, L. (2006). Identification of Wiener system with monotonous nonlinearity. In Proceedings of IFAC Symposium on System Identification, 166171. Newcastle, Australia.