Undermodelling Detection with Sign-Perturbed Sums*


Proceedings of the 20th World Congress
The International Federation of Automatic Control
Toulouse, France, July 9-14, 2017

Available online at www.sciencedirect.com

ScienceDirect

IFAC PapersOnLine 50-1 (2017) 2744–2749

Algo Carè ∗,∗∗   Marco C. Campi †   Balázs Cs. Csáji ∗∗   Erik Weyer ‡

∗ Centrum Wiskunde & Informatica (CWI), Science Park 123, Amsterdam, The Netherlands; (email: [email protected])
† Department of Information Engineering, University of Brescia, Via Branze 38, 25123 Brescia, Italy; (email: [email protected])
∗∗ Institute for Computer Science and Control (SZTAKI), Hungarian Academy of Sciences (MTA), Kende utca 13-17, H-1111, Budapest, Hungary; (email: [email protected])
‡ Department of Electrical and Electronic Engineering, Melbourne School of Engineering, The University of Melbourne, Parkville, Melbourne, Victoria, 3010, Australia; (email: [email protected])

Abstract: Sign-Perturbed Sums (SPS) is a finite sample system identification method that can build exact confidence regions for the unknown parameters of linear systems under mild statistical assumptions. Theoretical studies of the SPS method have assumed so far that the order of the system model is known to the user. In this paper we discuss the implications of this assumption for the applicability of the SPS method, and we propose an extension that, under mild assumptions, i) still delivers guaranteed confidence regions when the model order is correct, and ii) is guaranteed to detect, in the long run, if the model order is wrong.

© 2017, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved.

Keywords: system identification, confidence regions, finite sample results, least squares, parameter estimation, distribution-free results

1. INTRODUCTION

Estimating parameters of partially unknown systems based on observations corrupted by noise is a fundamental problem in system identification, signal processing, machine learning and statistics (Lehmann and Casella, 1998; Ljung, 1999; Hastie et al., 2009). Standard solutions such as the Least Squares (LS) method or, more generally, prediction error methods provide point estimates. In many situations, for example, when the safety, stability or quality of a process has to be guaranteed, a point estimate should be accompanied by a confidence region that certifies the accuracy of the estimate. The Sign-Perturbed Sums (SPS) method (Csáji et al., 2012b, 2014, 2015; Kolumbán et al., 2015) constructs confidence regions which have an exact coverage probability of the true system parameter based only on a finite sample of observations and under very mild statistical assumptions on the noise terms.

Consider an ARX system:

$$Y_t + a_1^* Y_{t-1} + \cdots + a_{n_a}^* Y_{t-n_a} \triangleq b_1^* U_{t-1} + \cdots + b_{n_b}^* U_{t-n_b} + N_t, \qquad (1)$$

where $Y_t$ is the output, $N_t$ is the noise, and $U_t$ is a measured input, at time $t$. This system can be written in linear regression form as follows: $Y_t = \varphi_t^{\mathrm{T}} \theta^* + N_t$, where the regressor vector $\varphi_t$ is defined as $\varphi_t = [\,-Y_{t-1}, \dots, -Y_{t-n_a}, U_{t-1}, \dots, U_{t-n_b}\,]^{\mathrm{T}}$, and $\theta^* = [\,a_1^*, \dots, a_{n_a}^*, b_1^*, \dots, b_{n_b}^*\,]^{\mathrm{T}}$ is referred to as the true parameter.

Following (Csáji et al., 2015), we assume that the measured inputs $\{U_t\}$ are deterministic, but all the results presented here can be immediately generalised to random inputs when they are independent of the noise. The SPS algorithm builds exact confidence regions for the unknown parameters under the assumption that the noise sequence $\{N_t\}$ is independent and symmetric (not necessarily identically distributed).¹ Given $n$ observations $Y_1, \dots, Y_n$ and $\varphi_1, \dots, \varphi_n$, SPS constructs confidence regions that contain the LS estimate $\hat{\theta}_n$, which is defined as the minimiser of the sum of the squared prediction errors, that is

$$\hat{\theta}_n \triangleq \arg\min_{\theta \in \mathbb{R}^{n_a + n_b}} \sum_{t=1}^{n} (Y_t - \varphi_t^{\mathrm{T}} \theta)^2. \qquad (2)$$

¹ For a discussion of the robustness of SPS with respect to violations of the symmetry assumptions, see (Carè et al., 2016).

In the construction of the SPS regions, a crucial role is played by the fact that the system can be inverted, and the symmetric noise sequence, $\{N_t\}$, can be correctly recovered when the true parameter $\theta^*$ is correctly guessed. As a consequence of this fact, in order for the standard SPS method to be rigorously guaranteed by the theory, the knowledge of the “true” model order (or at least an upper bound of it) must be available.

* The work of A. Carè was supported by the European Research Consortium for Informatics and Mathematics (ERCIM) and the Australian Research Council (ARC) under Discovery Grant DP130104028. The work of M. C. Campi was partly supported by MIUR - Ministero dell’Istruzione, dell’Università e della Ricerca and by the H&W program of the University of Brescia under the project CLAFITE. The work of B. Cs. Csáji was supported by the GINOP-2.3.2-15-2016-00002 grant and by the János Bolyai Research Fellowship, BO / 00217 / 16 / 6. The work of E. Weyer was supported by the ARC under Discovery Grant DP130104028.

2405-8963 © 2017, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved. Peer review under responsibility of International Federation of Automatic Control. 10.1016/j.ifacol.2017.08.581


Model order selection is a standard topic in system identification and various tools are available to the user, see, e.g., (Ljung, 1999; Stoica and Selen, 2004; Pillonetto et al., 2013). However, it is still a challenging problem, and it is realistic to assume that the selection procedure might end up with a model order $(\hat{n}_a, \hat{n}_b)$ smaller than the “true” one, i.e., with $\hat{n}_a < n_a$ and/or $\hat{n}_b < n_b$. So far, theoretical studies of SPS have not considered this possibility, although a simulation experiment on the effects of undermodelling was carried out in (Csáji et al., 2015). Finite sample methods for the case when the input signal can be designed are available and can be used to obtain guaranteed confidence regions when the user is interested in estimating a subset of the parameters, see (Campi et al., 2009). In principle, these techniques can also be applied when the true model order is unknown but, arguably, higher than the selected one. However, they are not aimed at obtaining regions around the LS estimate, and they can be applied only if the input satisfies certain statistical properties, such as symmetry.

Aim of the paper

In this paper, we move a step towards SPS methods that can be used in the presence of imperfect knowledge of the true model order, or even of the true system structure. In particular, we propose an approach to modify the existing SPS method in such a way that:

• If the system is not undermodelled, the algorithm builds exact confidence regions for the true model parameter $\theta^*$. This property holds true under the same assumptions as for the original SPS.
• On the other hand, if the system is undermodelled, the algorithm detects undermodelling as soon as a sufficient amount of data is available. This property holds true under some mild, additional assumptions.

Structure of the paper

In the following Section 2 the standard SPS method and its main properties are briefly reviewed.
The issues of standard SPS in the presence of undermodelling are discussed in Section 3 and provide a motivation for a new algorithm, which we call UD-SPS (SPS with Undermodelling Detection). UD-SPS is presented in Section 4. The results of this paper deal with an FIR-oriented SPS method, which is an archetypical case that allows us to point out the main ideas without technical complications. However, in Section 5 we argue that our ideas are applicable to more general models. The properties of the new UD-SPS algorithm are illustrated on some simulation examples in Section 6, while conclusions are drawn in Section 7.

2. REVIEW OF THE STANDARD SPS ALGORITHM

The SPS algorithm in its standard form (Csáji et al., 2015, 2014) is summarised in this section. We will assume that $n_a = 0$, that is, we restrict ourselves to the FIR case where the regressors $\{\varphi_t\}$ do not depend on the noise $\{N_t\}$.²

² Extensions to ARX and more general systems are available (Csáji et al., 2012b,a; Kolumbán et al., 2015).

Recall that the LS estimate, see formula (2), can be obtained by solving the normal equation,

$$\sum_{t=1}^{n} \varphi_t (Y_t - \varphi_t^{\mathrm{T}} \theta) = 0, \qquad (3)$$

which, when $\sum_{t=1}^{n} \varphi_t \varphi_t^{\mathrm{T}}$ is invertible (this will always be assumed throughout this paper), has the analytic solution

$$\hat{\theta}_n = \Big( \sum_{t=1}^{n} \varphi_t \varphi_t^{\mathrm{T}} \Big)^{-1} \Big( \sum_{t=1}^{n} \varphi_t Y_t \Big).$$
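The normal equation (3) and its analytic solution can be illustrated with a short numerical sketch. The helper names `fir_regressors` and `ls_estimate`, the parameter values, the noise law and the random input are all assumptions made for illustration only; they are not part of the paper.

```python
import numpy as np

def fir_regressors(u, n_b):
    """Stack the FIR regressors phi_t = [u[t-1], ..., u[t-n_b]] as rows,
    for t = n_b, ..., len(u) - 1 (0-based indexing of the input array)."""
    u = np.asarray(u, dtype=float)
    return np.column_stack([u[n_b - k : len(u) - k] for k in range(1, n_b + 1)])

def ls_estimate(Phi, y):
    """Solve the normal equations (sum phi phi^T) theta = sum phi y."""
    return np.linalg.solve(Phi.T @ Phi, Phi.T @ y)

# Illustrative data: a second-order FIR system with symmetric (Laplace) noise.
rng = np.random.default_rng(0)
theta_true = np.array([0.7, -0.3])       # assumed "true" parameter
u = rng.standard_normal(500)             # input sequence, treated as known
Phi = fir_regressors(u, n_b=2)
y = Phi @ theta_true + 0.1 * rng.laplace(size=Phi.shape[0])

theta_hat = ls_estimate(Phi, y)
# The residuals are orthogonal to the regressors, which is exactly (3).
normal_eq_residual = Phi.T @ (y - Phi @ theta_hat)
```

Solving the linear system with `np.linalg.solve` rather than forming the inverse explicitly is a numerical convenience; the result is the same estimate as in the closed-form expression.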

The fundamental step in SPS consists of generating $m - 1$ sign-perturbed sums by randomly perturbing the sign of the prediction error in the normal equations (3), that is, for $i = 1, \dots, m - 1$, we define

$$H_i(\theta) \triangleq \sum_{t=1}^{n} \varphi_t \, \alpha_{i,t} \, (Y_t - \varphi_t^{\mathrm{T}} \theta),$$

where $\{\alpha_{i,t}\}$ are random signs, i.e., i.i.d. random variables that take on the values $\pm 1$ with probability 1/2 each. For a given $\theta$, the reference sum is instead defined as

$$H_0(\theta) \triangleq \sum_{t=1}^{n} \varphi_t (Y_t - \varphi_t^{\mathrm{T}} \theta).$$

It is shown in (Csáji et al., 2015, 2014) that a suitable linear transformation of these sums ensures better properties, and therefore we apply the following functions,

$$S_i(\theta) \triangleq R_n^{-\frac{1}{2}} \, \frac{1}{n} H_i(\theta), \qquad i = 0, \dots, m - 1,$$

where $R_n = \frac{1}{n} \sum_{t=1}^{n} \varphi_t \varphi_t^{\mathrm{T}}$ and “$-\frac{1}{2}$” denotes the inverse of its square root.

Denote by $\mathcal{R}(\theta)$ the rank of $\|S_0(\theta)\|^2$ in the ordering of $\|S_0(\theta)\|^2$, $\|S_i(\theta)\|^2$, $i = 1, \dots, m - 1$, e.g., $\mathcal{R}(\theta) = 1$ means that $\|S_0(\theta)\|^2$ is the smallest one, and so on. Ties are broken by randomisation, see (Csáji et al., 2015) for details. The SPS region is formally defined as

$$\widehat{\Theta}_n \triangleq \Big\{ \theta : \mathcal{R}(\theta) \le m \Big( 1 - \frac{q}{m} \Big) \Big\},$$

where the integers $m > q > 0$ are chosen by the user, and the following theorem holds true (Csáji et al., 2015).

Theorem 1. (Exact confidence of SPS). If $N_1, \dots, N_n$ is a sequence of independent random variables distributed symmetrically about zero, then it holds that

$$\Pr\{\, \theta^* \in \widehat{\Theta}_n \,\} = 1 - \frac{q}{m}.$$
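The construction above amounts to a membership test that can be evaluated for any candidate $\theta$. The following is a minimal sketch of such a test, not the authors' implementation: the function name `sps_indicator`, the default values of `m` and `q`, and the synthetic data are assumptions. A Cholesky factor of $R_n$ is used in place of the symmetric square root; this leaves $\|S_i(\theta)\|^2$ unchanged, since both give $\frac{1}{n^2} H_i^{\mathrm{T}} R_n^{-1} H_i$.

```python
import numpy as np

def sps_indicator(theta, Phi, y, m=100, q=5, rng=None):
    """Return True if theta lies in the SPS region with confidence 1 - q/m."""
    rng = np.random.default_rng() if rng is None else rng
    n = Phi.shape[0]
    R_n = Phi.T @ Phi / n
    L = np.linalg.cholesky(R_n)                 # R_n = L L^T
    eps = y - Phi @ theta                       # prediction errors at theta
    signs = rng.choice([-1.0, 1.0], size=(m - 1, n))
    signs = np.vstack([np.ones(n), signs])      # row 0 is the unperturbed reference
    H = (signs * eps) @ Phi / n                 # row i equals H_i(theta)^T / n
    S = np.linalg.solve(L, H.T)                 # column i equals L^{-1} H_i(theta) / n
    norms = np.sum(S**2, axis=0)                # equals ||S_i(theta)||^2
    # Rank of the reference value; ties are broken with a random permutation.
    pi = rng.permutation(m)
    smaller = (norms[1:] < norms[0]) | ((norms[1:] == norms[0]) & (pi[1:] < pi[0]))
    rank = 1 + np.count_nonzero(smaller)
    return rank <= m - q

# Usage on synthetic data (all numbers are illustrative assumptions).
rng = np.random.default_rng(0)
n, theta_true = 500, np.array([0.7, -0.3])
Phi = rng.standard_normal((n, 2))               # stand-in regressors, independent of the noise
y = Phi @ theta_true + rng.laplace(size=n)      # symmetric noise
theta_ls = np.linalg.solve(Phi.T @ Phi, Phi.T @ y)
in_region_ls = sps_indicator(theta_ls, Phi, y, rng=rng)
in_region_far = sps_indicator(theta_ls + 10.0, Phi, y, rng=rng)
```

At the LS estimate the reference sum $H_0$ vanishes (up to rounding), so its rank is 1 and the estimate belongs to the region, whereas a parameter far from the data-consistent values makes the reference sum dominate the perturbed ones.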

Moreover, under some mild additional assumptions, SPS is strongly consistent, that is, for every $\varepsilon > 0$, $\widehat{\Theta}_n$ is almost surely contained in an $\varepsilon$-ball around the true parameter $\theta^*$ for sufficiently large $n$ (Csáji et al., 2014).

3. SPS IN THE PRESENCE OF UNDERMODELLING

In this section, we discuss the behaviour of the SPS algorithm when the model chosen by the user does not correspond to the true data-generation mechanism.

3.1 The user-chosen model

We assume that the user has decided to use the FIR model

$$\hat{Y}_t(\theta) = \varphi_t^{\mathrm{T}} \theta \qquad (4)$$

for predicting $Y_t$, where $\varphi_t = [\, U_{t-1}, \dots, U_{t-\hat{n}_b} \,]^{\mathrm{T}}$ and $\theta = [\, b_1, \dots, b_{\hat{n}_b} \,]^{\mathrm{T}}$ is a generic parameter. This assumption is maintained throughout the paper, where all the applications of the SPS algorithm use (4) as model structure. However, it can be relaxed as discussed in Section 5.

3.2 The true data-generation mechanism

If data $\{Y_t\}$ are really generated according to a FIR system of order $n_b = \hat{n}_b$, that is

$$Y_t = b_1^* U_{t-1} + b_2^* U_{t-2} + \cdots + b_{\hat{n}_b}^* U_{t-\hat{n}_b} + N_t, \qquad (5)$$

where $\{N_t\}$ is independent and symmetric, then the standard SPS is guaranteed to deliver a confidence region for $\theta$ that contains the true parameter, $\theta^* = [\, b_1^*, \dots, b_{\hat{n}_b}^* \,]^{\mathrm{T}}$, with user-chosen probability. Moreover, under mild assumptions, $\hat{\theta}_n \to \theta^*$ as $n \to \infty$, and the SPS region shrinks around $\theta^*$, see Section 2 and references therein.

It is interesting to consider, instead, the situation where data $\{Y_t\}$ are generated by a higher order FIR model or even a more general linear system. Assume therefore that $Y_t$ can be written as follows

$$Y_t = \varphi_t^{\mathrm{T}} \theta^* + E_t + N_t, \qquad (6)$$

where $\{N_t\}$ is the usual, independent symmetric noise; the (linear) effect of $U_{t-1}, \dots, U_{t-\hat{n}_b}$ is correctly described by the term $\varphi_t^{\mathrm{T}} \theta^*$; $E_t$ is an extra, non-zero component that can depend linearly on all the past inputs $U_{t-\hat{n}_b-1}, U_{t-\hat{n}_b-2}, \dots$ and on all the past noise samples $N_{t-1}, N_{t-2}, \dots$. For example, this situation includes the case where the true system is ARX as in (1). In the special case where the true system is an FIR of order $n_b > \hat{n}_b$, $E_t$ depends linearly on $U_{t-\hat{n}_b-1}, \dots, U_{t-n_b}$ only.

3.3 The effect of undermodelling on standard SPS

Under (6), the conditions of Theorem 1 are not met in general by the “noise” $\{N_t + E_t\}$, so the confidence region built by SPS is not guaranteed. The next theorem characterises the asymptotic behaviour of the SPS region in the case of undermodelling. Defining $\hat{\theta}_\infty$ as the limit of the LS estimate $\hat{\theta}_n$, the theorem states that, under some mild assumptions, for every $\varepsilon > 0$, the SPS region $\widehat{\Theta}_n$ will remain inside an $\varepsilon$-ball centred around $\hat{\theta}_\infty$ for all $n$ large enough. Since $\hat{\theta}_\infty \neq \theta^*$ in general, the theorem implies that the SPS region will not include $\theta^*$ in the long run.

Theorem 2. (Asymptotic behaviour with undermodelling). Assume that (6) is the true data-generating mechanism. Define $R_n = \frac{1}{n} \sum_{t=1}^{n} \varphi_t \varphi_t^{\mathrm{T}}$, and assume that $\det(R_n) \neq 0$ and there exists a finite limit matrix $\bar{R}$ such that

$$\lim_{n \to \infty} R_n = \bar{R} \succ 0. \qquad (7)$$

Assume also that there is a finite real vector $\bar{E}$ such that

$$\lim_{n \to \infty} \frac{1}{n} \sum_{t=1}^{n} \varphi_t \, \mathbb{E}[E_t] = \bar{E}, \qquad (8)$$

and, moreover,

$$\sum_{t=1}^{\infty} \frac{\|\varphi_t\|^4}{t^2} < \infty, \qquad \sum_{t=1}^{\infty} \frac{\mathbb{E}[N_t^2]^2}{t^2} < \infty, \qquad \sum_{t=1}^{\infty} \frac{\mathbb{E}[E_t^2]^2}{t^2} < \infty. \qquad (9)$$

Then,

$$\hat{\theta}_\infty \triangleq \lim_{n \to \infty} \hat{\theta}_n = \theta^* + \bar{R}^{-1} \bar{E} \quad \text{(w.p.1)}, \qquad (10)$$

and, for all $\varepsilon > 0$,

$$\Pr\Big[ \bigcup_{\bar{n}=1}^{\infty} \bigcap_{n=\bar{n}}^{\infty} \big\{ \widehat{\Theta}_n \subseteq B_\varepsilon(\hat{\theta}_\infty) \big\} \Big] = 1,$$

where $B_\varepsilon(\theta)$ denotes an $\varepsilon$-ball centred around $\theta$.

Technically, the theorem states that the event that $\widehat{\Theta}_n \subseteq B_\varepsilon(\hat{\theta}_\infty)$ for all $n$ larger than some (realisation-dependent) value $\bar{n}$ is a probability 1 tail-event. Thus, by (10), if $\bar{E}$ is nonzero, that is, if the sequence of the residuals $\{E_t\}$ is correlated with the sequence of the regressors $\{\varphi_t\}$ (in the sense of (8)), the region built by SPS by using the model (4) shrinks around the “wrong” value $\hat{\theta}_\infty \neq \theta^*$.

By relying only on the standard SPS algorithm, there is no way for the user to recognise that the assumptions are not satisfied and the SPS region is going to exclude the true parameter systematically. This is the motivation for the work of this paper and for the SPS algorithm with undermodelling detection, presented in the next section.

4. UD-SPS: A MODIFIED SPS METHOD

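We now define the UD-SPS algorithm and discuss the main ideas behind it. Explaining the connection between UD-SPS and the standard SPS makes it easy to prove that UD-SPS inherits the most important properties of standard SPS when the system is correctly specified. Finally, we study the undermodelling-detection property of UD-SPS.

4.1 Definition of UD-SPS

The UD-SPS algorithm is obtained from the standard SPS algorithm by substituting the functions $S_0(\theta), \dots, S_{m-1}(\theta)$ with the following ones

$$Q_0(\theta) \triangleq \begin{bmatrix} R_n & B_n \\ B_n^\top & D_n \end{bmatrix}^{-\frac{1}{2}} \frac{1}{n}\sum_{t=1}^{n} \begin{bmatrix} \varphi_t \\ \psi_t \end{bmatrix} (Y_t - \varphi_t^\top \theta),$$

$$Q_i(\theta) \triangleq \begin{bmatrix} R_n & B_n \\ B_n^\top & D_n \end{bmatrix}^{-\frac{1}{2}} \frac{1}{n}\sum_{t=1}^{n} \alpha_{i,t} \begin{bmatrix} \varphi_t \\ \psi_t \end{bmatrix} (Y_t - \varphi_t^\top \theta), \qquad \text{for } i = 1, \dots, m-1, \tag{11}$$

where $\psi_t$ is a vector that includes $s$ extra input values preceding the $\hat n_b$ that are included in $\varphi_t$, i.e.,

$$\psi_t \triangleq [\,U_{t-\hat n_b-1}, \cdots, U_{t-\hat n_b-s}\,]^\top,$$

and

$$B_n \triangleq \frac{1}{n}\sum_{t=1}^{n} \varphi_t \psi_t^\top, \qquad D_n \triangleq \frac{1}{n}\sum_{t=1}^{n} \psi_t \psi_t^\top.$$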
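The functions in (11) can be sketched numerically as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the helper names `augmented_regressors`, `ud_sps_norms` and `in_ud_sps_region` are hypothetical, the acceptance rule (accept $\theta$ when the rank of $\|Q_0(\theta)\|^2$ among the $m$ values is at most $m - q$) is assumed to be the standard SPS rule of Section 2, and ties are resolved in favour of acceptance instead of by a random permutation.

```python
import numpy as np

def augmented_regressors(u, nb_hat, s):
    """Rows are [phi_t; psi_t] = [U_{t-1}, ..., U_{t-nb_hat-s}] for t = lags, ..., len(u)-1."""
    lags = nb_hat + s
    return np.column_stack([u[lags - k: len(u) - k] for k in range(1, lags + 1)])

def ud_sps_norms(theta, y, u, nb_hat, s, signs):
    """Return ||Q_0(theta)||^2 and the array of ||Q_i(theta)||^2, i = 1..m-1, as in (11).

    signs: array of shape (m-1, n) with entries +1/-1, where n = len(u) - nb_hat - s.
    """
    lags = nb_hat + s
    phi_aug = augmented_regressors(u, nb_hat, s)       # n x (nb_hat + s)
    n = phi_aug.shape[0]
    resid = y[lags:] - phi_aug[:, :nb_hat] @ theta     # errors of the user-chosen model
    r_aug = phi_aug.T @ phi_aug / n                    # [[R_n, B_n], [B_n^T, D_n]]

    def sq_norm(weights):
        # ||R^{-1/2} v||^2 = v^T R^{-1} v, so no matrix square root is needed.
        v = phi_aug.T @ (weights * resid) / n
        return float(v @ np.linalg.solve(r_aug, v))

    q0 = sq_norm(np.ones(n))
    qi = np.array([sq_norm(alpha) for alpha in signs])
    return q0, qi

def in_ud_sps_region(theta, y, u, nb_hat, s, signs, q):
    """Assumed SPS rank test: accept theta if ||Q_0||^2 is not among the q largest values."""
    q0, qi = ud_sps_norms(theta, y, u, nb_hat, s, signs)
    m = len(signs) + 1
    rank = 1 + int(np.sum(qi < q0))                    # ascending rank of ||Q_0||^2
    return rank <= m - q

# Illustrative data (assumed): a correctly specified FIR(1) system with b = 1.
rng = np.random.default_rng(0)
u = rng.standard_normal(402)
y = np.zeros(402)
y[1:] = 1.0 * u[:-1] + 0.1 * rng.standard_normal(401)
signs = rng.choice([-1.0, 1.0], size=(99, 400))        # m = 100, n = 402 - 2
print(in_ud_sps_region(np.array([1.0]), y, u, nb_hat=1, s=1, signs=signs, q=5))
```

Only the prediction error uses the first `nb_hat` columns of the regressor matrix; the shaping matrix and the sums use all `nb_hat + s` columns, which is the difference from the standard algorithm.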

So, while the prediction error $(Y_t - \varphi_t^\top \theta)$ in (11) is the usual prediction error for the user-chosen model class, the regressor vector and the shaping matrix are larger than in the standard SPS algorithm. The parameter $s$ can be chosen by the user and, clearly, it need not be equal to the difference between the true order of the system, which is unknown, and $\hat n_b$. The region built by UD-SPS will be denoted by $\widehat{\Theta}^{o}_n$.

4.2 The idea of UD-SPS

The key idea is stated in the following Fact 1.

Fact 1. The UD-SPS region $\widehat{\Theta}^{o}_n$ for estimating $\theta^* \in \mathbb{R}^{\hat n_b}$ can be interpreted as the restriction to an $\hat n_b$-dimensional space of a full-fledged standard SPS region, say $\widehat{\Theta}'_n$, that lives in the domain $\{\theta' \in \mathbb{R}^{\hat n_b + s}\}$, which is the $\hat n_b$-dimensional identification space augmented with $s$ extra components. More precisely, $\widehat{\Theta}^{o}_n$ can be identified with the first $\hat n_b$ components of the set $\widehat{\Theta}'_n \cap (\mathbb{R}^{\hat n_b} \times \{0\}^s)$.

In order to see that Fact 1 is true, consider the functions $S'_0(\theta'), S'_1(\theta'), \dots, S'_{m-1}(\theta')$ of $\theta' \in \mathbb{R}^{\hat n_b + s}$ defined as

$$S'_0(\theta') = R_n'^{-\frac{1}{2}}\, \frac{1}{n}\sum_{t=1}^{n} \varphi'_t\, (Y_t - \varphi_t'^\top \theta'),$$

$$S'_i(\theta') = R_n'^{-\frac{1}{2}}\, \frac{1}{n}\sum_{t=1}^{n} \alpha_{i,t}\, \varphi'_t\, (Y_t - \varphi_t'^\top \theta'), \qquad \text{for } i = 1, \dots, m-1, \tag{12}$$

where

$$R'_n = \begin{bmatrix} R_n & B_n \\ B_n^\top & D_n \end{bmatrix} \tag{13}$$

and

$$\varphi'_t = \begin{bmatrix} \varphi_t \\ \psi_t \end{bmatrix} = [\,U_{t-1}, \dots, U_{t-\hat n_b}, U_{t-\hat n_b-1}, \dots, U_{t-\hat n_b-s}\,]^\top. \tag{14}$$

These are the standard $S_i$-functions based on which the standard SPS region, say $\widehat{\Theta}'_n$, can be built in the augmented space $\mathbb{R}^{\hat n_b + s}$ for the user-chosen model $\varphi_t'^\top \theta'$. Comparing (12) with (11), it can be immediately observed that the functions (12) take the same values as the functions (11) whenever $\theta'$ is restricted to $\mathbb{R}^{\hat n_b} \times \{0\}^s$, i.e.,

$$S'_i(\theta')\big|_{\theta' = [\,\theta^\top,\, 0^\top]^\top} = Q_i(\theta). \tag{15}$$

Computational feasibility of the UD-SPS algorithm. There is no significant difference in the computational complexity between UD-SPS and the standard SPS: on the one hand, it is easy to check whether a certain value of $\theta$ is inside or outside an SPS region, while on the other hand computing and handling a complete and explicit representation of the region becomes impractical in a high dimensional parameter space. For SPS, computationally feasible approximation techniques have been studied that rely on interval analysis, e.g. (Kieffer and Walter, 2013), or on semidefinite programming (SDP), (Csáji et al., 2015). We focus here on the latter option, which allows us to compute outer ellipsoidal approximations of SPS regions.

Denote by $\widehat{\widehat{\Theta}}{}'_n$ the outer-approximating ellipsoid of the SPS region $\widehat{\Theta}'_n$ in the augmented space $\mathbb{R}^{\hat n_b + s}$ (Fact 1). In (Csáji et al., 2015), $\widehat{\widehat{\Theta}}{}'_n$ is defined as the set $\{\theta' \in \mathbb{R}^{\hat n_b + s} : \|S'_0(\theta')\|^2 \le \gamma^*\}$, where $\gamma^*$ can be computed from the solutions of some suitable (convex) SDP problems. In virtue of Fact 1, the restriction of this ellipsoid to the domain $\mathbb{R}^{\hat n_b} \times \{0\}^s$, denoted by $\widehat{\widehat{\Theta}}{}^{o}_n$, can be written as

$$\widehat{\widehat{\Theta}}{}^{o}_n = \{\, \theta \in \mathbb{R}^{\hat n_b} : \|Q_0(\theta)\|^2 \le \gamma^* \,\}, \tag{16}$$

and contains $\widehat{\Theta}^{o}_n$.

Remark 1. With small modifications, it is possible to find a smaller outer ellipsoidal approximation for $\widehat{\Theta}^{o}_n$ than $\widehat{\widehat{\Theta}}{}^{o}_n$ as defined above. In fact, the optimisation space of the SDP problems can be restricted to the domain $\mathbb{R}^{\hat n_b} \times \{0\}^s$ with no harm. In general, from this restriction a smaller, but still valid, $\gamma^*$ to be used in (16) can be obtained. Moreover, in this way, only $\hat n_b$ decision variables are involved in the optimisation problem instead of $\hat n_b + s$.

UD-SPS and the LS estimate. It can be shown that the LS estimate, $\hat\theta_n$, is always included in the outer approximation ellipsoid, $\widehat{\widehat{\Theta}}{}^{o}_n$, whenever $\widehat{\widehat{\Theta}}{}^{o}_n$ is not empty:

Theorem 3. If $\widehat{\widehat{\Theta}}{}^{o}_n \neq \emptyset$, then $\hat\theta_n \in \widehat{\widehat{\Theta}}{}^{o}_n$.
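One way to see why Theorem 3 is plausible is to write $\|Q_0(\theta)\|^2$ as a quadratic form. The following is a sketch based only on definitions (11)-(16), not the authors' proof; the symbol $\hat\theta'_n$ (the LS estimate in the augmented space) is introduced here for illustration and $R'_n$ is assumed to be invertible.

```latex
\begin{aligned}
\frac{1}{n}\sum_{t=1}^{n} \varphi'_t\,(Y_t - \varphi_t^\top \theta)
  &= R'_n\left(\hat\theta'_n - \begin{bmatrix}\theta\\ 0\end{bmatrix}\right),
  \qquad
  \hat\theta'_n \triangleq R_n'^{-1}\,\frac{1}{n}\sum_{t=1}^{n}\varphi'_t\,Y_t,\\[4pt]
\|Q_0(\theta)\|^2
  &= \left(\begin{bmatrix}\theta\\ 0\end{bmatrix}-\hat\theta'_n\right)^{\!\top} R'_n
     \left(\begin{bmatrix}\theta\\ 0\end{bmatrix}-\hat\theta'_n\right),\\[4pt]
\nabla_\theta\,\|Q_0(\theta)\|^2
  &= -\frac{2}{n}\sum_{t=1}^{n}\varphi_t\,(Y_t-\varphi_t^\top\theta).
\end{aligned}
```

The gradient vanishes exactly when the normal equations of the $\hat n_b$-dimensional LS problem hold, so the convex function $\|Q_0(\theta)\|^2$ attains its minimum at $\theta = \hat\theta_n$. Hence, if some $\theta$ satisfies $\|Q_0(\theta)\|^2 \le \gamma^*$, so does $\hat\theta_n$.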

4.3 UD-SPS with correct system specification

Now, we study the case when the system and its order are correctly specified, i.e., the true system is (5). In this case, the most important properties of standard SPS carry over to UD-SPS, by applying the standard SPS results (see Section 2) in the $\theta'$ space and then restricting the result to the first $\hat n_b$ coordinates (Fact 1). In particular, if (5) is the true system, $\widehat{\Theta}'_n$ is guaranteed to contain the “true” parameter $\theta'^* \triangleq [\,\theta^{*\top}\ 0\ \cdots\ 0\,]^\top$ with probability $1 - \frac{q}{m}$, so the following theorem is obtained.

Theorem 4. (Exact confidence of UD-SPS). If the FIR system is correctly specified, i.e., (5) holds true, then

$$\Pr\{\, \theta^* \in \widehat{\Theta}^{o}_n \,\} = 1 - \frac{q}{m}.$$

Moreover, UD-SPS is strongly consistent.

Theorem 5. (Strong consistency of UD-SPS). Assume that (7)-(9) and the following statements hold true

$$\bar B \triangleq \lim_{n\to\infty} B_n < \infty, \qquad \bar D \triangleq \lim_{n\to\infty} D_n < \infty, \tag{17}$$

with

$$\begin{bmatrix} \bar R & \bar B \\ \bar B^\top & \bar D \end{bmatrix} \succ 0; \tag{18}$$

and

$$\lim_{n\to\infty} \frac{1}{n}\sum_{t=1}^{n} \psi_t\, \mathbb{E}[E_t] < \infty, \qquad \sum_{t=1}^{\infty} \frac{\|\psi_t\|^4}{t^2} < \infty. \tag{19}$$

If the system is correctly specified, i.e., (5) holds true, then, for all $\varepsilon > 0$, we have that

$$\Pr\left[\,\bigcup_{\bar n=1}^{\infty}\ \bigcap_{n=\bar n}^{\infty} \left\{ \widehat{\Theta}^{o}_n \subseteq B_\varepsilon(\theta^*) \right\}\right] = 1,$$

where $B_\varepsilon(\theta)$ denotes an $\varepsilon$-ball centred around $\theta$.

The strong consistency of the outer approximation $\widehat{\widehat{\Theta}}{}^{o}_n$ follows by the strong consistency of $\widehat{\widehat{\Theta}}{}'_n$ in the augmented space, see (Csáji et al., 2015, 2014), and Fact 1.

4.4 UD-SPS in the presence of undermodelling

Consider now the case where the true data-generating mechanism is system (6). In this case, the region $\widehat{\Theta}^{o}_n$ is not guaranteed. However, the following theorem guarantees that, for $n$ large enough, the region is empty.

Theorem 6. (Undermodelling detection). Assume that (6) is the true system, and relations (7)-(9) and (17)-(19) hold true. With the notation

$$\bar R' \triangleq \lim_{n\to\infty} \begin{bmatrix} R_n & B_n \\ B_n^\top & D_n \end{bmatrix}, \qquad \bar E' \triangleq \lim_{n\to\infty} \frac{1}{n}\sum_{t=1}^{n} \begin{bmatrix} \varphi_t \\ \psi_t \end{bmatrix} \mathbb{E}[E_t],$$

if

$$\bar R'^{-1} \bar E' \notin \mathbb{R}^{\hat n_b} \times \{0\}^s, \tag{20}$$

then

$$\Pr\left[\,\bigcup_{\bar n=1}^{\infty}\ \bigcap_{n=\bar n}^{\infty} \left\{ \widehat{\Theta}^{o}_n = \emptyset \right\}\right] = 1. \tag{21}$$

The main statement of the theorem is (21), which is formulated using the tail-event notation as in Theorems 2 and 5. Equation (21) means that, with probability 1, there is a (realisation-dependent) value of $\bar n$ such that the region $\widehat{\Theta}^{o}_n$ is empty for every $n \ge \bar n$. Condition (20) is a technical detectability condition, which, in practice, is expected to be met, unless $E_t$ does not depend on the input or there are contrived correlation patterns in the input sequence.³

In conclusion, after a certain amount of data $\bar n$ has been observed, UD-SPS warns the user when the method is working beyond its domain of applicability by building empty confidence regions. The number $\bar n$ depends on the system and the degree of misspecification, as we will see in the example in Section 6.


5. UD-SPS FOR MORE GENERAL SYSTEMS

The SPS algorithm has been generalised to ARX systems (Csáji et al., 2012b) and general linear systems (Csáji et al., 2012a; Kolumbán et al., 2015). It is possible to extend the asymptotic results that hold true for the FIR case to the ARX case. Relying on these extensions, the main arguments in Section 4, which rely only on the strong consistency of the SPS method, carry over to the ARX setting.

³ A notable situation where the detectability condition fails is when the true system is a FIR system with $n_b > \hat n_b$, the input is an uncorrelated sequence and none of the inputs $U_{t-\hat n_b-1}, \dots, U_{t-n_b}$ corresponding to a nonzero coefficient among $b_k^*$, $k = \hat n_b+1, \dots, n_b$, is included among the $s$ extra components in the augmented regressor $\varphi'_t$. However, in this case, the LS estimate $\hat\theta_n$ will not be biased, i.e., $\hat\theta_n \to \theta^*$. Moreover, if the input can be thought of as the realisation of an independent and symmetric process, the region for $\theta^* \in \mathbb{R}^{\hat n_b}$ will be still guaranteed for finite samples (Theorem 4).

6. NUMERICAL EXPERIMENTS

Consider the following ARX(1,1) generating system

$$Y_t = a^* Y_{t-1} + b^* U_{t-1} + N_t,$$

with zero initial conditions, where $a^* = 0.7$ and $b^* = 1$ are the true system parameters and $\{N_t\}$ is a sequence of i.i.d. Laplacian random variables with zero mean and variance 0.1. The input signal is generated as $U_t = 0.75\,U_{t-1} + V_t$, where $\{V_t\}$ is a sequence of i.i.d. Gaussian random variables with zero mean and variance 1. The user-chosen predictor is

$$\hat Y_t(\theta) = \varphi_t^\top \theta = b\,U_{t-1},$$

that is, the autoregressive part is missing, $\theta = [\,b\,]$ is the model parameter, and $\varphi_t = [\,U_{t-1}\,]$ is the input-dependent regressor at time $t$.

We choose $s = 1$ in the UD-SPS algorithm, that is, $\psi_t = [\,U_{t-2}\,]$ in (11). We construct the outer approximation ellipsoid $\widehat{\widehat{\Theta}}{}^{o}_n$ for the 95 % confidence UD-SPS region $\widehat{\Theta}^{o}_n$ by using the algorithm of Section 4.3, see definition (16).
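The data-generating mechanism of this experiment can be sketched as below. This is an illustrative simulation under stated assumptions, not the authors' code: the function names `simulate_arx11` and `ls_estimates` are hypothetical, `a_star` is left as an argument because the experiments vary it, and the Laplacian scale is derived from the stated variance using the fact that a Laplace distribution with scale $\beta$ has variance $2\beta^2$.

```python
import numpy as np

def simulate_arx11(n, a_star, b_star=1.0, rho=0.75, noise_var=0.1, seed=None):
    """Simulate Y_t = a* Y_{t-1} + b* U_{t-1} + N_t with U_t = rho U_{t-1} + V_t.

    Zero initial conditions are used (u[0] = y[0] = 0); returns arrays of length n + 1.
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(n + 1)                       # V_t ~ N(0, 1)
    noise = rng.laplace(0.0, np.sqrt(noise_var / 2.0), size=n + 1)
    u = np.zeros(n + 1)
    y = np.zeros(n + 1)
    for t in range(1, n + 1):
        u[t] = rho * u[t - 1] + v[t]
        y[t] = a_star * y[t - 1] + b_star * u[t - 1] + noise[t]
    return u, y

def ls_estimates(u, y):
    """LS fit of the user-chosen model b*U_{t-1} and of the augmented model with U_{t-2}."""
    phi = u[1:-1]                                        # U_{t-1} for t = 2, ..., n
    psi = u[:-2]                                         # U_{t-2}
    target = y[2:]
    b_hat = float(phi @ target / (phi @ phi))
    theta_aug, *_ = np.linalg.lstsq(np.column_stack([phi, psi]), target, rcond=None)
    return b_hat, theta_aug

u, y = simulate_arx11(100, a_star=0.5, seed=0)
b_hat, theta_aug = ls_estimates(u, y)
print("LS estimate of b:", b_hat)
print("augmented LS estimate [b, b_tilde]:", theta_aug)
```

Comparing the two estimates for different values of `a_star` gives a first impression of how the missing autoregressive part shows up in the coefficient of the extra regressor $U_{t-2}$.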

Fig. 1. Panels: (a) $a^* = 0$, (b) $a^* = 0.15$, (c) $a^* = 0.5$; horizontal axis $b$, vertical axis $\tilde b$. In each of these pictures, the SPS ellipsoids $\widehat{\widehat{\Theta}}{}'_n$ in the 2-dimensional (augmented) space for $n = 20$ (light gray), $n = 50$ (gray), $n = 100$ (dark gray) are shown. The corresponding outer-approximation of the UD-SPS region, $\widehat{\widehat{\Theta}}{}^{o}_n$, is obtained by intersecting the ellipsoid $\widehat{\widehat{\Theta}}{}'_n$ with the $b$-axis (horizontal line). The “true” parameter $b^*$ is also represented in the 2-dimensional space as $(b^*, 0)$, together with the convergence point $(b^*, \tilde b^*)$ of the SPS region in the 2-dimensional space. The circles on the $b$-axis denote the LS estimates of $b^*$, namely $\hat\theta_{20}$ (light gray), $\hat\theta_{50}$ (gray), and $\hat\theta_{100}$ (dark gray).

Although this approximation algorithm can be refined in line with Remark 1, it is here used for illustrative purposes as it reflects in an intuitive manner the relation between UD-SPS and standard SPS (i.e., the fact that the UD-SPS region can be obtained by restricting a higher-dimensional SPS region). Note that, in this case, $\widehat{\widehat{\Theta}}{}^{o}_n$ is an interval.
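For the scalar case of this example ($\hat n_b = 1$, $s = 1$), the interval obtained by restricting the ellipsoid in (16) to the $b$-axis can be written in closed form. The sketch below is an illustration under stated assumptions: the function name `ud_sps_interval` is hypothetical, and `gamma` is a placeholder for $\gamma^*$, which in the paper comes from SDP problems that are not reproduced here.

```python
import numpy as np

def ud_sps_interval(u, y, gamma):
    """Restrict {theta' : ||S'_0(theta')||^2 <= gamma} to the b-axis (n_b_hat = 1, s = 1).

    Returns (lo, hi), or None when the restricted set is empty.
    """
    aug = np.column_stack([u[1:-1], u[:-2]])             # rows [U_{t-1}, U_{t-2}]
    target = y[2:]
    n = len(target)
    r_aug = aug.T @ aug / n                              # R'_n as in (13)
    h = np.linalg.solve(r_aug, aug.T @ target / n)       # augmented LS estimate
    r11, r12, r22 = r_aug[0, 0], r_aug[0, 1], r_aug[1, 1]
    # With x = b - h[0]:  ||S'_0([b, 0])||^2 = r11*x^2 - 2*r12*h[1]*x + r22*h[1]^2.
    disc = (r12 * h[1]) ** 2 - r11 * (r22 * h[1] ** 2 - gamma)
    if disc < 0:
        return None                                      # the ellipse misses the b-axis
    centre = h[0] + r12 * h[1] / r11                     # coincides with the 1-D LS estimate
    half_width = np.sqrt(disc) / r11
    return centre - half_width, centre + half_width

# Illustrative data (assumed): the second lag has a clear effect, so a small gamma gives None.
rng = np.random.default_rng(1)
u = rng.standard_normal(300)
y = np.zeros(300)
y[2:] = 1.0 * u[1:-1] + 0.5 * u[:-2] + 0.1 * rng.standard_normal(298)
print(ud_sps_interval(u, y, gamma=1e-6))
print(ud_sps_interval(u, y, gamma=10.0))
```

The midpoint of a non-empty interval is the one-dimensional LS estimate, in line with Theorem 3, and an empty result corresponds to the situation in which undermodelling is flagged.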

Although the normal use of the method is in the 1-dimensional space where the model parameter $\theta = [\,b\,]$ takes value, in Fig. 1 the augmented 2-dimensional space is represented for explanatory purposes. In Fig. 1, the $b$-axis corresponds to the unknown parameter that we want to estimate and the $\tilde b$-axis is the extra coordinate accounting for the extra input in $\psi_t$. In this space, the SPS 2-dimensional ellipse $\widehat{\widehat{\Theta}}{}'_n$ can be built according to the standard algorithm in (Csáji et al., 2015). According to Section 4.3, the interval $\widehat{\widehat{\Theta}}{}^{o}_n$ can then be interpreted as the intersection of $\widehat{\widehat{\Theta}}{}'_n$ with the $b$-axis.

Note that, as expected, whenever the 1-dimensional intersection of the SPS ellipsoid with the $b$-axis is non-empty, the LS estimate $\hat\theta_n$ is included in $\widehat{\widehat{\Theta}}{}^{o}_n$. When $a^* = 0$ (Fig. 1a), the augmented SPS region shrinks around the true parameter $(b^*, 0)$ as $n$ increases, and the corresponding UD-SPS interval becomes smaller and smaller around $b^*$. However, when $a^* \neq 0$, the system is misspecified and, as $n$ increases, the augmented SPS region shrinks around a parameter value that does not lie on the $b$-axis, so that $\widehat{\widehat{\Theta}}{}^{o}_n$ becomes empty and undermodelling is detected. The limit point of SPS in the augmented space, denoted by $(b^*, \tilde b^*)$, can be computed according to formula (10). Undermodelling is detected when $\widehat{\widehat{\Theta}}{}^{o}_n$ is empty. This happens when $n$ is 100 in Fig. 1b ($a^* = 0.15$) and is 50 in Fig. 1c ($a^* = 0.5$).
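The limit point in the augmented space can be evaluated numerically by applying formula (10) to the regressor $[\,U_{t-1}, U_{t-2}\,]^\top$. The sketch below is a hedged illustration: the function name `augmented_limit_point` is hypothetical, the input is treated as a stationary AR(1) process (so the zero initial conditions are ignored), the noise is assumed to be uncorrelated with the input, and the infinite impulse response of the ARX system is truncated after `n_terms` terms.

```python
import numpy as np

def augmented_limit_point(a_star, b_star=1.0, rho=0.75, input_noise_var=1.0, n_terms=2000):
    """Limit of the LS estimate for the regressor [U_{t-1}, U_{t-2}] under stationarity."""
    r0 = input_noise_var / (1.0 - rho ** 2)              # stationary variance of U_t

    def autocov(k):
        return r0 * rho ** np.abs(k)                     # autocovariance of the AR(1) input

    r_bar = np.array([[autocov(0), autocov(1)],
                      [autocov(1), autocov(0)]])         # limit of R'_n
    # Y_t = sum_k a^k b U_{t-1-k} + filtered noise, hence
    # E[U_{t-j} Y_t] = b * sum_k a^k * autocov(k + 1 - j) for j = 1, 2.
    k = np.arange(n_terms)
    cross = np.array([b_star * np.sum(a_star ** k * autocov(k + 1 - j)) for j in (1, 2)])
    return np.linalg.solve(r_bar, cross)

for a in (0.0, 0.15, 0.5):
    print(a, augmented_limit_point(a))
```

Under these assumptions the second coordinate is zero for $a^* = 0$ and nonzero for the other two values, which is the situation covered by the detectability condition (20).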

Table 1.

$a^*$    $n$    UD-SPS ellipsoid coverage                            Detection with UD-SPS ellipsoid                      Standard SPS ellipsoid coverage
                ($\theta^* \in \widehat{\widehat{\Theta}}{}^{o}_n$)   ($\widehat{\widehat{\Theta}}{}^{o}_n = \emptyset$)   ($\theta^* \in \widehat{\widehat{\Theta}}_n$)
0        20     99.8%                                                0.2%                                                 98.7%
0        50     99.0%                                                0%                                                   97.5%
0        100    98.6%                                                0.4%                                                 97.7%
0.15     20     84.5%                                                2.4%                                                 77.0%
0.15     50     29.5%                                                31.4%                                                41.5%
0.15     100    3.6%                                                 72.2%                                                11.1%
0.5      20     13.3%                                                63.5%                                                37.1%
0.5      50     0%                                                   99.9%                                                0.4%
0.5      100    0%                                                   100%                                                 0.1%

In Table 1, the results of 1000 Monte Carlo simulations are shown for the same three values of $a^*$ and $n$ that are used in Fig. 1. The empirical coverage of $\widehat{\widehat{\Theta}}{}^{o}_n$ is compared with the empirical coverage of the standard SPS outer interval $\widehat{\widehat{\Theta}}_n \subseteq \mathbb{R}$, computed as in (Csáji et al., 2015), and the frequency with which $\widehat{\widehat{\Theta}}{}^{o}_n$ is empty is also shown. In the cases of misspecification ($a^* = 0.15$ and $a^* = 0.5$), the frequency with which $\widehat{\widehat{\Theta}}{}^{o}_n$ is empty estimates the probability that undermodelling is detected; in the case of correct system model ($a^* = 0$), the same frequency can be interpreted as an estimate of the probability of false detection, which turned out to be very small.

7. CONCLUSIONS

In this paper we have studied the behaviour of SPS, a guaranteed finite-sample system identification method, in the presence of undermodelled dynamics. In this case, the confidence regions generated by the standard SPS algorithm are not rigorously guaranteed, nor do they provide any warning that the algorithm is working outside of its applicability domain. We have proposed an extension of SPS, the UD-SPS algorithm, which is able to detect that the algorithm is working outside of its applicability domain. We have shown that UD-SPS provides guaranteed confidence regions if the model order is correctly specified, otherwise it almost surely detects undermodelling in the long run. Finally, we demonstrated UD-SPS through numerical experiments.

REFERENCES

Campi, M.C., Ko, S., and Weyer, E. (2009). Non-asymptotic confidence regions for model parameters in the presence of unmodelled dynamics. Automatica, 45, 2175–2186.
Carè, A., Csáji, B.Cs., and Campi, M.C. (2016). Sign-Perturbed Sums (SPS) with asymmetric noise: Robustness analysis and robustification techniques. In Procs. of the 55th IEEE Conf. on Decision and Control, 262–267.
Csáji, B.Cs., Campi, M.C., and Weyer, E. (2012a). A method for constructing exact finite-sample confidence regions for general linear systems. In Procs. of the 51st IEEE Conference on Decision and Control, 7321–7326.
Csáji, B.Cs., Campi, M.C., and Weyer, E. (2012b). Non-asymptotic confidence regions for the least-squares estimate. In Procs. of the 16th IFAC Symposium on System Identification, 227–232.
Csáji, B.Cs., Campi, M.C., and Weyer, E. (2014). Strong consistency of the Sign-Perturbed Sums method. In Procs. of the 53rd IEEE Conference on Decision and Control, 3352–3357.
Csáji, B.Cs., Campi, M.C., and Weyer, E. (2015). Sign-Perturbed Sums: A new system identification approach for constructing exact non-asymptotic confidence regions in linear regression models. IEEE Transactions on Signal Processing, 63(1), 169–181.
Hastie, T., Tibshirani, R., and Friedman, J. (2009). The elements of statistical learning. Springer, 2nd edition.
Kieffer, M. and Walter, E. (2013). Guaranteed characterization of exact non-asymptotic confidence regions as defined by LSCR and SPS. Automatica, 49, 507–512.
Kolumbán, S., Vajk, I., and Schoukens, J. (2015). Perturbed datasets methods for hypothesis testing and structure of corresponding confidence sets. Automatica, 51, 326–331.
Lehmann, E.L. and Casella, G. (1998). Theory of Point Estimation. Springer, 2nd edition.
Ljung, L. (1999). System Identification: Theory for the User. Prentice-Hall, Upper Saddle River, 2nd edition.
Pillonetto, G., Chen, T., and Ljung, L. (2013). Kernel-based model order selection for identification and prediction of linear dynamic systems. In Procs. of the 52nd IEEE Conference on Decision and Control, 5174–5179.
Stoica, P. and Selen, Y. (2004). Model-order selection: a review of information criterion rules. IEEE Signal Processing Magazine, 21(4), 36–47.