Proceedings of the 15th IFAC Symposium on System Identification, Saint-Malo, France, July 6-8, 2009

Polynomial NARX Model Identification: a Wiener–Hammerstein Benchmark

Luigi Piroddi*, Marcello Farina*, Marco Lovera*

*Dipartimento di Elettronica e Informazione, Politecnico di Milano, Milano, Italy (Tel: +39-02-2399-3556; e-mail: {piroddi,farina,lovera}@elet.polimi.it)

Abstract: This work discusses the performance of several NARX model identification techniques on a benchmark problem for black–box identification methods proposed in (Schoukens et al., 2008), concerning a SISO electronic nonlinear system with a Wiener–Hammerstein structure, originally documented in (Vandersteen, 1997). Since the objective is to obtain an accurate simulation model, capable of replicating the dynamic behavior of the system without using past measured output data, both prediction error and simulation error minimization methods have been tested and compared.

1. INTRODUCTION

NARX (nonlinear autoregressive with exogenous variables) models (Leontaritis and Billings, 1985a,b) are a popular class of models in black–box nonlinear model identification, for which several identification methods have been proposed and used in many applications (see, e.g., (Piroddi, 2008) and the references therein). The general NARX model is an input–output recursive model where the current output is given by a nonlinear functional expansion of lagged input and output terms, plus additive noise. Several types of nonlinear expansions have been used in NARX identification. One of the most studied is the polynomial model class, since it yields a linear–in–the–parameters structure that allows easy model construction and the use of standard linear regression techniques, i.e., simple identification algorithms of the Least Squares (LS) family for prediction error minimization (PEM) type identification. Notice also that the polynomial class often allows a direct interpretation of the various nonlinear contributions to the model.

A main bottleneck in the identification of polynomial NARX models is the curse of dimensionality of polynomial expansions. In fact, unless an accurate model structure selection is performed, one may run into over–parameterization problems or a lack of model robustness. This is particularly important if the main purpose of the identification is to obtain a model capable of capturing the real dynamics of the underlying system (i.e., the goal is long range prediction or simulation accuracy), rather than simply achieving a good fit of the estimation data (i.e., the goal is short term prediction accuracy).

The identification of polynomial NARX models therefore amounts to solving two problems, i.e., model structure selection and parameter estimation. The two problems are coupled, since structure selection is usually based on the performance comparison of parameterized models with varying structure. Model construction is typically approached in an incremental fashion, adding one term at a time (Haber and Unbehauen, 1990), and possibly occasionally pruning the model of redundant terms (Piroddi and Spinelli, 2003b).

Parameter estimation is usually performed with LS–type techniques for computational reasons, thereby minimizing a PEM performance index, although costlier simulation error minimization (SEM) techniques may be worth using for better simulation accuracy. Actually, very different approaches result depending on whether a PEM or a SEM criterion is used in model selection and/or parameter estimation. In this work, algorithms representative of all these cases are compared on the proposed benchmark, ranging from a full PEM approach, such as the classical Forward Regression Orthogonal Estimator (FROE) (Korenberg et al., 1988), to the SEM approach with Pruning (SEMP) (Piroddi and Spinelli, 2003a), which is actually a mixed SEM/PEM method, since LS are used for parameter estimation, to full SEM methods. The comparison study focuses on the accuracy and robustness of the identified models, according to the main purpose of the benchmark study. However, the reader should be aware that the computational complexity increases significantly if either model selection or – even worse – parameter estimation is performed using a SEM criterion as opposed to a PEM one. Current research efforts are devoted to devising computationally viable algorithms for SEM identification (Farina and Piroddi, 2008).

2. NARX MODEL IDENTIFICATION ALGORITHMS

2.1 NARX models

A NARX model (Leontaritis and Billings, 1985a,b) is an input–output recursive model described by the following general equation:

y(k) = f(y(k−1), ..., y(k−ny), u(k−1), ..., u(k−nu)) + ξ(k),    (1)

where y(·) and u(·) are the model output and input, with maximum lags ny and nu, respectively, ξ(·) is a white noise sequence, and f(·) is a suitable nonlinear function. If f(·) is assumed to be a polynomial function, equation (1) can be expressed as a linear regression

y(k) = ψᵀ(k−1)ϑ + ξ(k),    (2)

where ψᵀ(k−1) is the regressor vector, which contains monomials of the arguments {y(k−1), ..., y(k−ny), u(k−1), ..., u(k−nu)} up to a given nonlinearity degree l ∈ Z+, and ϑ is the parameter vector containing the coefficients of the polynomial terms. It is clearly impractical to employ a full polynomial expansion of the given arguments, since the number of regressors increases rapidly with ny, nu and l, not to mention the resulting over–parameterization issues. Satisfactory models can often be obtained using just a fraction of all the terms of the full polynomial expansion, which motivates the emphasis on the model structure selection problem.

2.2 Identification methods with PEM–oriented model structure selection

Identification of a polynomial NARX model typically consists in iteratively building up the model, adding or eliminating regressors according to a measure of their contribution to its accuracy. A classical method is the FROE (Korenberg et al., 1988), which combines a forward regression technique for adding terms to the model with an orthogonalization technique to decouple the estimation of the parameters associated with different regressors (Haber and Unbehauen, 1990). The identification criterion is the minimization of the prediction error variance. More precisely, the model is incremented by a single term at each iteration, choosing the one that provides the highest increase in the explanation of the output variance. To establish this, the candidate regressors are rated based on the Error Reduction Ratio (ERR) criterion, which basically measures the reduction of the Mean Square Prediction Error (MSPE) with respect to the current model, normalized with respect to the output variance. Orthogonal Least Squares (OLS) are used to decouple the estimation of the parameters. Some variations on the FROE algorithm have been developed recently. For example, in Mendes and Billings (2001), a forward search is complemented with an exhaustive subset model selection. In Peng et al. (2006), a forward and backward regression method is proposed. A recursive version of the FROE is introduced in Li et al. (2005).

Several drawbacks of the FROE algorithm have been pointed out in the literature regarding long range prediction and simulation applications (Piroddi and Spinelli, 2003a). These are mainly due to the fact that the ERR cannot provide an absolute measure of the importance of a regressor, since this importance is not constant but varies as the model structure is modified. This may significantly distort the model selection process, causing incorrect or redundant terms to be picked out (especially in the first stages of the procedure). This can be partially counteracted using a model reduction scheme as in (Piroddi and Spinelli, 2003b) or (Peng et al., 2006). In the former work, a version of the FROE is developed that prunes the model structure at each iteration. More precisely, after a term has been added to the model in the forward regression step, each regressor in the current model is tested for possible exclusion, by tentatively eliminating it and measuring the relative performance loss. If a regressor can be found whose elimination degrades the performance by less than the

increment provided by the last added term (by a given threshold), that regressor is pruned. In this way, each complete forward regression / pruning iteration is guaranteed to improve the model PEM performance index and, at the same time, the model size is kept small, for enhanced robustness. This variant of the FROE is here denoted FRP, for Forward Regression with Pruning. In the recent work (Piroddi and Lovera, 2008), the use of error filtering techniques (as opposed to standard data prefiltering) to improve model accuracy in a given frequency band has been explored in the context of polynomial NARX identification, motivating the introduction of a version of the FRP endowed with error filtering (EF–FRP). The EF–FRP algorithm has been shown to have the potential to improve the simulation capabilities of the FRP method by focusing on the dominant dynamics of the system. This motivates its testing on the benchmark example.

2.3 Identification methods with SEM–oriented model structure selection

A significant improvement in long range prediction or simulation accuracy is achieved if model structure selection is reformulated in the SEM framework (Piroddi and Spinelli, 2003a). In fact, while some model inaccuracy or redundancy may be tolerable for prediction purposes, since in the computation of the one–step–ahead prediction there is no error accumulation (the measured past output values are employed), even a small discrepancy in the model structure may cause significant errors when the model is simulated. It is not infrequent that a model which is optimal according to the PEM criterion turns out to be unstable in simulation (Piroddi and Spinelli, 2003a). Other unexpected and unwanted nonlinear behaviours deriving from the presence of redundant terms in the model are discussed in (Aguirre and Billings, 1995). It is argued in (Piroddi and Spinelli, 2003a) that PEM methods may fail in the presence of particular noise structures or poorly exciting input signals (insufficient amplitude range, low frequency or over–sampled data), whereas SEM methods may still provide accurate and robust identification results in the same conditions.

The NARX identification approach proposed in (Piroddi and Spinelli, 2003a), denoted SEMP (for SEM with Pruning), operates much like the FRP, by iteratively constructing the model, adding one term per iteration according to the achievable performance increment, and possibly pruning the redundant terms (if any), as long as the performance index is overall improved in the iteration. The difference lies in the criterion used for comparing model structures: instead of the ERR, a criterion denoted Simulation error Reduction Ratio (SRR) is employed, which measures the variation in the Mean Square Simulation Error (MSSE) resulting from the inclusion or elimination of one term in the model. On the other hand, parameter estimation is still carried out with LS, for obvious computational reasons. Notice, however, that a simulation loop must be computed for each candidate regressor in the addition phase and for each model regressor in the pruning phase, so that the overall computational load is significant. Therefore, this approach is viable only with a limited number of candidate regressors.
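To make the distinction between the two settings concrete, the following sketch (in Python, purely illustrative and not taken from the cited works; the monomial encoding is an assumption made here) computes the one–step–ahead prediction error (MSPE), where past measured outputs are used, and the simulation error (MSSE), where the model output is fed back into the regressors:

```python
import numpy as np

# A polynomial NARX model is represented here as a list of monomials, each monomial
# being a tuple of (signal, lag) factors, e.g. (('y', 1),) stands for y(k-1) and
# (('u', 3), ('u', 9)) for u(k-3)*u(k-9).

def regressor(term, y, u, k):
    """Evaluate one monomial at time k from the output and input sequences."""
    val = 1.0
    for sig, lag in term:
        val *= y[k - lag] if sig == 'y' else u[k - lag]
    return val

def predict(terms, theta, y_meas, u):
    """One-step-ahead prediction (PEM setting): past measured outputs are used."""
    n0 = max(lag for t in terms for _, lag in t)
    y_hat = np.array(y_meas, dtype=float)
    for k in range(n0, len(u)):
        y_hat[k] = sum(th * regressor(t, y_meas, u, k) for t, th in zip(terms, theta))
    mspe = np.mean((np.asarray(y_meas)[n0:] - y_hat[n0:]) ** 2)
    return y_hat, mspe

def simulate(terms, theta, y_meas, u):
    """Free-run simulation (SEM setting): past simulated outputs are fed back."""
    n0 = max(lag for t in terms for _, lag in t)
    y_sim = np.array(y_meas, dtype=float)   # the first n0 samples act as initial conditions
    for k in range(n0, len(u)):
        y_sim[k] = sum(th * regressor(t, y_sim, u, k) for t, th in zip(terms, theta))
    msse = np.mean((np.asarray(y_meas)[n0:] - y_sim[n0:]) ** 2)
    return y_sim, msse
```

In an SRR–based selection loop, simulate would be called once per candidate regressor at every addition or pruning step, which is precisely the source of the computational load mentioned above.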


Although some heuristic ideas can be exploited to reduce the computational load (Piroddi and Spinelli, 2003a; Piroddi, 2008), the basic SEMP algorithm has been used in this work, the focus being on model quality rather than on computational effort. To distinguish it from the approaches described in the sequel, it is denoted LS–SEMP.

Concerning parameter estimation, the PEM approach is known to provide unbiased solutions only under very specific identification conditions, namely perfect model matching and persistent excitation. The first condition, especially, is rarely true in nonlinear model identification, except in very idealized cases. Thus, we cannot expect the final model obtained by the SEMP to be optimized for simulation purposes, and it might be beneficial to finely tune its parameters by directly minimizing the MSSE with a Quasi–Newton (QN) iterative approach. A detailed description of the parameter estimation method is given in the companion paper (Farina and Piroddi, 2009). This corrective approach is listed below as LS/QN–SEMP and provides a not too costly trade–off between the LS–SEMP and a full SEM approach.

However, it should be observed that the risk of obtaining biased models is even higher at intermediate stages of the SEMP algorithm, when the model structure is still incomplete and the accuracy scarce. Since in the SEMP approach the compared models are estimated with a PEM criterion, the model selection can be severely distorted, especially at the early stages of the model construction. This motivates the application of the QN approach for direct SEM optimization in the parameter estimation stage from the very beginning of the SEMP algorithm (QN–SEMP). While this dramatically increases the computational effort required to complete the identification process, the achievable improvements in terms of model quality clearly demonstrate the potential benefits of a direct optimization of the simulation performance criterion and suggest that the search for more efficient SEM algorithms is worth pursuing.
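As an illustration of this parameter tuning step (a minimal sketch only; the actual estimation method is detailed in Farina and Piroddi, 2009), the MSSE of a model with a fixed structure can be minimized directly with a generic quasi–Newton routine, reusing the simulate function sketched in Section 2.3:

```python
import numpy as np
from scipy.optimize import minimize

def sem_refine(terms, theta_ls, y_meas, u):
    """Refine LS parameter estimates by directly minimizing the simulation error (MSSE)."""
    def msse(theta):
        _, err = simulate(terms, theta, y_meas, u)   # free-run simulation of the candidate model
        return err
    res = minimize(msse, np.asarray(theta_ls, dtype=float), method='BFGS')  # quasi-Newton search
    return res.x, res.fun
```

Starting the search from the LS estimates, as in the LS/QN–SEMP, is what makes this correction relatively inexpensive compared to applying the QN step throughout the whole model selection process.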

Table 1 summarizes the criteria used for model selection and parameter estimation by the mentioned methods, which are listed in order of increasing computational complexity.

Table 1. Criteria used in the studied NARX model identification methods

algorithm     model selection   parameter estimation   pruning
FROE          PEM               PEM                    N
FRP           PEM               PEM                    Y
EF–FRP        PEM (*)           PEM (*)                Y
LS–SEMP       SEM               PEM                    Y
LS/QN–SEMP    SEM               PEM/SEM (**)           Y
QN–SEMP       SEM               SEM                    Y
(*) A filtered error cost function is used.
(**) The last obtained model is re-estimated with a SEM criterion.

3. ANALYSIS OF THE BENCHMARK CASE STUDY

3.1 Brief description of the benchmark case study

The system used in the benchmark case study is an electronic nonlinear system with a Wiener–Hammerstein structure, i.e., consisting of the cascade of a linear dynamic block with transfer function G1(s), a static nonlinear block f[·], and a linear dynamic block G2(s), as depicted in Fig. 1 (Vandersteen, 1997).

Fig. 1. The Wiener–Hammerstein system used in the benchmark case study.

G1(s) is designed as a third order Chebyshev filter (pass–band ripple of 0.5 dB and cut–off frequency of 4.4 kHz). G2(s) is designed as a third order inverse Chebyshev filter (stop–band attenuation of 40 dB starting at 5 kHz). Note that this system has a transmission zero in the frequency band of interest. Finally, the static nonlinearity is built using a diode circuit. The system was excited with a band–limited filtered Gaussian excitation signal and a dataset of 188000 data was generated, corresponding to 3.6719 s at the sampling frequency of 51200 Hz (the external excitation starting from sample 5001 and ending at sample 184000, approximately). The data are characterized by an extremely low measurement noise level (~70 dB below the signal levels). Further details are available in (Schoukens et al., 2008).

3.2 Preliminary analysis

A preliminary step in the analysis of the data is aimed at the definition of an appropriate family of candidate regressors. For this purpose, since the data appear to be quite stationary, it suffices to take into account a small portion of the data set, e.g., 2000 data after the initialization section, i.e., in the sample interval [5001, 7000]. Several identification experiments have been carried out using the LS–SEMP algorithm to establish the appropriate span of the input and output signals. All runs have been carried out for a maximum of 15 iterations. The results are reported in Table 2.

Table 2. Results of the preliminary analysis with the LS–SEMP algorithm

#   output lags   input lags   nonlin. degree   max. iterations   MSPE          MSSE
1   1, ..., 4     1, ..., 4    2                15                1.3802·10⁻⁶   3.4427·10⁻³
2   1, ..., 4     1, ..., 6    2                15                1.1040·10⁻⁶   1.3237·10⁻³
3   1, ..., 4     3, ..., 6    2                15                1.0077·10⁻⁶   0.9702·10⁻³
4   1, ..., 4     1, ..., 8    2                15                0.7500·10⁻⁶   0.4873·10⁻³
5   1, ..., 4     3, ..., 8    2                15                0.7501·10⁻⁶   0.4873·10⁻³
6   1, ..., 4     3, ..., 10   2                15                0.7265·10⁻⁶   0.4272·10⁻³
7   1, ..., 4     3, ..., 10   3 (*)            15                0.6274·10⁻⁶   0.2645·10⁻³
8   1, ..., 4     3, ..., 10   3 (**)           15                0.8439·10⁻⁶   0.4886·10⁻³
9   1, ..., 6     3, ..., 10   2                15                0.6136·10⁻⁶   0.4402·10⁻³
(*) Only pure cubic terms are employed to avoid regressor explosion.
(**) Only pure quadratic and cubic terms are employed.

Apparently, u(t−1), u(t−2) are unnecessary regressors, which is consistent with the input delay revealed by direct data inspection. Also, past input terms up to lag t−10 all appear to significantly affect the model performance, both in prediction and in simulation. Run 7 shows that cubic terms are also essential for model performance, although only pure cubic terms were added to the candidate regressor set to avoid


regressor explosion. On the contrary, Run 8 shows that the candidate regressor set cannot be further reduced by excluding non pure quadratic terms without paying a significant cost in terms of model accuracy. Concerning the autoregressive part of the model, by comparing Run 9 with Run 6, it appears that the model simulation performance does not benefit from the inclusion of further output terms with higher lags (although the prediction accuracy obviously does).

Since both the prediction and simulation accuracy achieved are remarkable (see, in particular, Fig. 2), the regressor set of Run 7 has been used for the subsequent analysis. This includes all linear, quadratic and (pure) cubic terms obtainable from the elementary regressor set {y(t−1), ..., y(t−4), u(t−3), ..., u(t−10)}.
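For concreteness, this candidate set can be enumerated as follows (an illustrative sketch, not the authors' code; whether a constant term is also included is not specified here):

```python
from itertools import combinations_with_replacement

# Elementary regressors of Run 7: y(t-1), ..., y(t-4) and u(t-3), ..., u(t-10).
elementary = [('y', k) for k in range(1, 5)] + [('u', k) for k in range(3, 11)]

linear    = [(r,) for r in elementary]                          # 12 terms
quadratic = list(combinations_with_replacement(elementary, 2))  # 78 terms (all squares and cross products)
cubic     = [(r, r, r) for r in elementary]                     # 12 pure cubes only

candidates = linear + quadratic + cubic
print(len(candidates))   # 102 candidate regressors
```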

Fig. 2. Measured (blue) and simulated (green) output with the model estimated in Run 7.

Another feature that is apparent from this preliminary analysis is the following. Figure 3 shows the simulation performance (in terms of the MSSE indicator) of the model estimated in Run 7, over subsequent data portions of 2000 samples, throughout the entire dataset (dropping only the initial 5000 and the final 3000 data, where only low level noise is present). Only slight performance variations are experienced, the average MSSE being ~0.35·10⁻³, which is still better than what is obtained with all the other model structures on the estimation data set. This demonstrates the robustness of the approach on the given data set, and justifies considering only a restricted portion of the data for model structure tuning, in order to avoid an excessive computational load. Comparable results are obtained if the identification is repeated over different batches of 2000 data in the data set.
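The windowed evaluation underlying Fig. 3 amounts to the computation sketched below (illustrative only; simulate is the free-run routine sketched in Section 2.3, and the sample ranges follow the description above):

```python
import numpy as np

def windowed_msse(terms, theta, y_meas, u, start=5000, stop=185000, width=2000):
    """MSSE of a fixed model over successive data windows of the given width."""
    msse = []
    for k0 in range(start, stop - width + 1, width):   # 90 windows covering [5001, 185000]
        _, err = simulate(terms, theta, y_meas[k0:k0 + width], u[k0:k0 + width])
        msse.append(err)
    return np.array(msse)
```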

Fig. 3. Simulation performance (MSSE) of the model estimated in Run 7 over subsequent data portions of 2000 samples, throughout the entire data set, in the interval [5001, 185000].

Another interesting feature of the preliminary analysis is that not only the prediction performance is improved at every iteration, but also the simulation one, due to the extremely low level of the noise present in the system. Allowing more complex models (of the order of 40–50 regressors), the simulation performance can be further halved to a figure below 0.15·10⁻³. This is probably due to the form of the static nonlinearity, which is only partially accounted for by a truncated polynomial expansion, so that several compensating terms are selected.

3.3 Comparison analysis of different NARX model identification techniques

All the NARX identification algorithms described in the previous section, i.e., the FROE, the FRP, the EF–FRP (with a 2nd order Butterworth filter with a normalized cut–off frequency of 0.7), the LS–SEMP, the LS/QN–SEMP, and the QN–SEMP, have been tested next using the same 2000 data in the sample interval [5001, 7000] and the same candidate regressor set used before in Run 7. In view of the observations made in the previous sub–section, a constraint on the maximum number of model regressors has been considered in order to meaningfully compare the different techniques. Specifically, the best 15–term model achievable with each technique has been selected for comparison. The obtained models have also been evaluated on the complete training set (time interval [1, 100000]) and the complete validation set ([100001, 188000]), using the indicators suggested in (Schoukens et al., 2008), i.e., the mean value (µ), the standard deviation (s), and the root mean square (eRMS) of the error, as well as the MSPE and the MSSE. The results are reported in Table 3.

Comparing the obtained MSSE values on the actual training set of 2000 samples, only small improvements are achieved by identification algorithms more complex than the FROE, as long as parameter estimation remains PEM–oriented. Specifically, the introduction of pruning yields a 2.70% improvement over the plain FROE, which rises to 4.96% with the filtered error approach. If model selection is performed using the SEM approach, a 7.01% improvement over the FROE is observed. However, it is only when a SEM criterion is used for parameter estimation that significant improvements over the FROE are achieved. For example, even a simple QN correction of the final model obtained with the LS–SEMP doubles the improvement figure (14.83%), and if the same parameter estimation method is used during the whole model selection process a striking 23.72% improvement is achieved. These improvements should be regarded as all the more significant, since the accuracy achievable with the FROE is already remarkable, probably due to the extremely low noise setting of the experiment. Notice also that, if a full 3rd order expansion is used in the candidate regressor set, the MSSE figure can be almost halved with the QN–SEMP, at the cost of an extremely high computational load. The obtained models have also been evaluated on the validation set, where, excluding the filtered error approach, the relative improvements range from 9.27% to 16.84%, the latter figure being provided by the QN–SEMP. Apparently, the filtered error approach, with a mere 5.23% improvement

over the FROE on the validation set, is the least robust approach among the ones listed.

Table 3. Results of the NARX identification algorithms on the first 2000 data

method        data interval        MSPE          MSSE          error mean (µ)   error st. dev. (s)   error RMS (eRMS)
FROE          [  5001,   7000]     5.0749·10⁻⁷   2.8441·10⁻⁴   2.6357·10⁻³      1.6661·10⁻²          1.6865·10⁻²
              [      1, 100000]    5.2754·10⁻⁷   3.5426·10⁻⁴   3.3758·10⁻³      1.8517·10⁻²          1.8822·10⁻²
              [100001, 188000]     5.2046·10⁻⁷   3.5469·10⁻⁴   3.4804·10⁻³      1.8509·10⁻²          1.8833·10⁻²
FRP           [  5001,   7000]     4.9506·10⁻⁷   2.7673·10⁻⁴   0.4178·10⁻³      1.6634·10⁻²          1.6635·10⁻²
              [      1, 100000]    5.0731·10⁻⁷   3.1582·10⁻⁴   1.1531·10⁻³      1.7734·10⁻²          1.7771·10⁻²
              [100001, 188000]     5.0273·10⁻⁷   3.1998·10⁻⁴   1.3005·10⁻³      1.7841·10⁻²          1.7888·10⁻²
EF–FRP        [  5001,   7000]     7.0498·10⁻⁷   2.7031·10⁻⁴   3.3297·10⁻³      1.6104·10⁻²          1.6441·10⁻²
              [      1, 100000]    6.9906·10⁻⁷   3.3171·10⁻⁴   4.0071·10⁻³      1.7767·10⁻²          1.8213·10⁻²
              [100001, 188000]     6.9283·10⁻⁷   3.3615·10⁻⁴   4.0735·10⁻³      1.7876·10⁻²          1.8334·10⁻²
LS–SEMP       [  5001,   7000]     6.2736·10⁻⁷   2.6446·10⁻⁴   0.9412·10⁻³      1.6239·10⁻²          1.6262·10⁻²
              [      1, 100000]    6.5673·10⁻⁷   3.1591·10⁻⁴   1.8657·10⁻³      1.7676·10⁻²          1.7774·10⁻²
              [100001, 188000]     6.4980·10⁻⁷   3.2181·10⁻⁴   1.9321·10⁻³      1.7835·10⁻²          1.7939·10⁻²
LS/QN–SEMP    [  5001,   7000]     7.1533·10⁻⁷   2.4223·10⁻⁴   2.5906·10⁻³      1.5351·10⁻²          1.5564·10⁻²
              [      1, 100000]    7.3613·10⁻⁷   3.0629·10⁻⁴   3.5699·10⁻³      1.7133·10⁻²          1.7501·10⁻²
              [100001, 188000]     7.2662·10⁻⁷   3.1352·10⁻⁴   3.5984·10⁻³      1.7337·10⁻²          1.7707·10⁻²
QN–SEMP       [  5001,   7000]     2.8759·10⁻⁶   2.1694·10⁻⁴   1.5295·10⁻³      1.4653·10⁻²          1.4729·10⁻²
              [      1, 100000]    2.8640·10⁻⁶   2.8506·10⁻⁴   2.2116·10⁻³      1.6738·10⁻²          1.6884·10⁻²
              [100001, 188000]     2.8046·10⁻⁶   2.9495·10⁻⁴   2.2442·10⁻³      1.6884·10⁻²          1.7174·10⁻²
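The improvement figures quoted above follow directly from the MSSE column of Table 3 on the [5001, 7000] interval; the small check below (values copied from the table) reproduces them:

```python
# MSSE on the 2000-sample estimation interval [5001, 7000], from Table 3.
msse = {'FROE': 2.8441e-4, 'FRP': 2.7673e-4, 'EF-FRP': 2.7031e-4,
        'LS-SEMP': 2.6446e-4, 'LS/QN-SEMP': 2.4223e-4, 'QN-SEMP': 2.1694e-4}

for method, value in msse.items():
    gain = 100 * (msse['FROE'] - value) / msse['FROE']
    print(f"{method:11s} {gain:6.2f}%")
# FROE 0.00, FRP 2.70, EF-FRP 4.96, LS-SEMP 7.01, LS/QN-SEMP 14.83, QN-SEMP 23.72
```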

For illustration purposes, the model selection process operated by the QN–SEMP is reported in Table 4 (the slight mismatch in the final value of the MSSE with respect to Table 3 is due to the fact that the criterion is here evaluated from sample 5011 rather than from sample 5001, the maximum input lag being equal to 10).

Table 4. Model selection iterations with the QN–SEMP (on the 2000 samples training data set)

iter.   regressor       a/p   MSPE          MSSE
1       u(t−10)         +     5.1834·10⁻²   5.1834·10⁻²
2       y(t−1)          +     2.0159·10⁻³   1.2614·10⁻²
3       y(t−4)          +     5.3898·10⁻⁴   7.3825·10⁻³
4       u(t−5)          +     8.3500·10⁻⁵   2.9729·10⁻³
        u(t−10)         −     2.0427·10⁻⁴   6.0765·10⁻³
5       y(t−3)          +     1.4057·10⁻⁵   2.8765·10⁻³
        y(t−4)          −     1.2130·10⁻⁴   5.5480·10⁻³
6       y(t−4)          +     1.4057·10⁻⁵   2.8765·10⁻³
7       u(t−4)u(t−6)    +     1.4073·10⁻⁵   6.1638·10⁻⁴
8       u(t−5)³         +     1.3802·10⁻⁵   3.8323·10⁻⁴
9       u(t−6)u(t−7)    +     1.3757·10⁻⁵   3.6107·10⁻⁴
10      u(t−10)         +     1.1927·10⁻⁵   3.3653·10⁻⁴
11      u(t−3)          +     7.7859·10⁻⁶   3.0605·10⁻⁴
12      u(t−9)          +     2.5816·10⁻⁶   2.6306·10⁻⁴
        u(t−10)         −     3.2682·10⁻⁶   2.8028·10⁻⁴
13      u(t−10)         +     2.5816·10⁻⁶   2.6306·10⁻⁴
14      y(t−4)²         +     2.9552·10⁻⁶   2.5355·10⁻⁴
15      y(t−4)³         +     3.1566·10⁻⁶   2.4412·10⁻⁴
16      y(t−3)u(t−4)    +     3.2041·10⁻⁶   2.3351·10⁻⁴
17      u(t−3)u(t−10)   +     3.3550·10⁻⁶   2.2609·10⁻⁴
18      u(t−3)u(t−9)    +     2.8885·10⁻⁶   2.2001·10⁻⁴
(a/p: regressor added (+) to or pruned (−) from the model in the corresponding iteration)

Finally, the models obtained with the different described techniques have been retuned on the complete training set, for the benchmark purposes. The results, reported in Table 5, are qualitatively similar to those obtained previously, the QN–SEMP still providing the best model (with a 16.87% improvement over the FROE).

Table 5. Results with the previously identified models re-estimated on the whole training set

method        data interval        MSPE          MSSE          error mean (µ)   error st. dev. (s)   error RMS (eRMS)
FROE          [      1, 100000]    5.2133·10⁻⁷   3.3344·10⁻⁴   2.9934·10⁻³      1.8014·10⁻²          1.8260·10⁻²
              [100001, 188000]     5.1377·10⁻⁷   3.3287·10⁻⁴   3.1169·10⁻³      1.7977·10⁻²          1.8245·10⁻²
FRP           [      1, 100000]    5.0331·10⁻⁷   3.0909·10⁻⁴   1.1911·10⁻³      1.7541·10⁻²          1.7581·10⁻²
              [100001, 188000]     4.9850·10⁻⁷   3.1197·10⁻⁴   1.3459·10⁻³      1.7612·10⁻²          1.7663·10⁻²
EF–FRP        [      1, 100000]    6.8326·10⁻⁷   3.2834·10⁻⁴   2.2641·10⁻³      1.7978·10⁻²          1.8120·10⁻²
              [100001, 188000]     6.7722·10⁻⁷   3.3282·10⁻⁴   2.3399·10⁻³      1.8093·10⁻²          1.8243·10⁻²
LS–SEMP       [      1, 100000]    6.4838·10⁻⁷   3.1028·10⁻⁴   1.4396·10⁻³      1.7556·10⁻²          1.7615·10⁻²
              [100001, 188000]     6.4210·10⁻⁷   3.1739·10⁻⁴   1.5203·10⁻³      1.7751·10⁻²          1.7815·10⁻²
LS/QN–SEMP    [      1, 100000]    9.6813·10⁻⁷   2.7983·10⁻⁴   2.4542·10⁻³      1.6547·10⁻²          1.6728·10⁻²
              [100001, 188000]     9.5851·10⁻⁷   2.9207·10⁻⁴   2.5223·10⁻³      1.6903·10⁻²          1.7090·10⁻²
QN–SEMP       [      1, 100000]    3.9746·10⁻⁶   2.6416·10⁻⁴   2.1896·10⁻³      1.6105·10⁻²          1.6253·10⁻²
              [100001, 188000]     3.9059·10⁻⁶   2.7671·10⁻⁴   2.2349·10⁻³      1.6484·10⁻²          1.6635·10⁻²

3.4 Description of the identified model

The parameterization of the final model identified with the QN–SEMP on the 2000 samples data set and retuned on the whole training set is reported in Table 6. Besides the performance indexes of the model on the validation data set, reported in the last row of Table 5, time domain and frequency domain plots of the modelled output and the simulation error are shown in Figs. 4–5.

Table 6. Model parameters

                  regressor        parameter
linear terms      y(t−1)            1.7478·10⁰
                  y(t−3)           −1.3838·10⁰
                  y(t−4)            6.2476·10⁻¹
                  u(t−3)            3.3114·10⁻³
                  u(t−5)            2.8480·10⁻³
                  u(t−9)            9.2866·10⁻³
                  u(t−10)          −5.9857·10⁻³
quadratic terms   y(t−3)u(t−4)     −4.9005·10⁻⁴
                  y(t−4)²           6.9105·10⁻⁴
                  u(t−3)u(t−9)     −5.8034·10⁻⁴
                  u(t−3)u(t−10)     8.0708·10⁻⁴
                  u(t−4)u(t−6)     −1.0738·10⁻³
                  u(t−6)u(t−7)     −7.4397·10⁻⁴
cubic terms       y(t−4)³           5.5333·10⁻⁴
                  u(t−5)³          −5.5720·10⁻⁴
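Written out explicitly, the model of Table 6 corresponds to the following simulation update (a direct transcription of the table, not code released with the paper):

```python
def qn_semp_model(y, u, t):
    """One step of the 15-term QN-SEMP model of Table 6 (y holds past simulated outputs)."""
    return ( 1.7478     * y[t-1]  - 1.3838     * y[t-3]  + 6.2476e-1 * y[t-4]
           + 3.3114e-3  * u[t-3]  + 2.8480e-3  * u[t-5]  + 9.2866e-3 * u[t-9]
           - 5.9857e-3  * u[t-10]
           - 4.9005e-4  * y[t-3] * u[t-4]   + 6.9105e-4 * y[t-4]**2
           - 5.8034e-4  * u[t-3] * u[t-9]   + 8.0708e-4 * u[t-3] * u[t-10]
           - 1.0738e-3  * u[t-4] * u[t-6]   - 7.4397e-4 * u[t-6] * u[t-7]
           + 5.5333e-4  * y[t-4]**3         - 5.5720e-4 * u[t-5]**3 )
```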

Fig. 4. Output (top) and error (bottom) of the model estimated with the QN-SEMP on the validation data set.

Fig. 5. DFT of the output (top) and error (bottom) of the model estimated with the QN-SEMP on the validation data set (the transform is computed with a plain FFT, without windowing).

4. CONCLUSIONS

Several NARX model identification algorithms have been tested on a Wiener–Hammerstein benchmark problem. Although the data have such a high SNR that even a PEM approach like the FROE apparently obtains quite satisfactory results in simulation, significant improvements can be obtained by operating a direct optimization of the SEM criterion in the parameter estimation phase. This suggests that SEM identification is definitely worth investigating, especially for applications like the one studied here, where simulation or long range prediction accuracy is required. Further research is needed, however, to reduce the computational effort of SEM approaches to more acceptable levels. A promising step in this direction is given by an approximate SEM optimization method that operates by iteratively constructing multi–step predictors with increasing prediction horizons, as discussed in (Farina and Piroddi, 2008) for the linear case. The generalization of this approach to the nonlinear case is currently under development.

REFERENCES

Aguirre, L.A. and S.A. Billings (1995). Dynamical effects of overparametrization in nonlinear models. Physica D, 80, 26–40.
Farina, M. and L. Piroddi (2008). Some convergence properties of multi–step prediction error identification criteria. 47th IEEE Conference on Decision and Control, Cancun (Mexico), 756–761.
Farina, M. and L. Piroddi (2009). Simulation error minimization–based identification of polynomial input–output recursive models. 15th IFAC Symposium on System Identification, Saint-Malo (France).
Haber, R. and H. Unbehauen (1990). Structure identification of nonlinear dynamic systems – a survey on input/output approaches. Automatica, 26 (4), 651–677.
Korenberg, M.J., S.A. Billings, Y.P. Liu and P.J. McIlroy (1988). Orthogonal parameter estimation algorithm for non–linear stochastic systems. International Journal of Control, 48 (1), 193–210.
Leontaritis, I.J. and S.A. Billings (1985a). Input–output parametric models for non–linear systems – Part I: deterministic non–linear systems. International Journal of Control, 41 (2), 303–328.
Leontaritis, I.J. and S.A. Billings (1985b). Input–output parametric models for non–linear systems – Part II: stochastic non–linear systems. International Journal of Control, 41 (2), 329–344.
Li, K., J. Peng and G. Irwin (2005). A fast nonlinear model identification method. IEEE Transactions on Automatic Control, 50 (8), 1211–1216.
Mendes, E.M.A.M. and S.A. Billings (2001). An alternative solution to the model structure selection problem. IEEE Transactions on Systems, Man, and Cybernetics – Part A: Systems and Humans, 31 (6), 597–608.
Peng, J.X., K. Li and E.W. Bai (2006). A two–stage algorithm for identification of nonlinear dynamic systems. 14th IFAC Symposium on System Identification, Newcastle (Australia), 570–575.
Piroddi, L. (2008). Simulation error minimization methods for NARX model identification. International Journal of Modelling, Identification and Control, 3 (4), 392–403.
Piroddi, L. and M. Lovera (2008). NARX model identification with error filtering. 17th IFAC World Congress, Seoul (South Korea), 2726–2731.
Piroddi, L. and W. Spinelli (2003a). An identification algorithm for polynomial NARX models based on simulation error minimization. International Journal of Control, 76 (17), 1767–1781.
Piroddi, L. and W. Spinelli (2003b). A pruning method for the identification of polynomial NARMAX models. 13th IFAC Symposium on System Identification, 1108–1113.
Schoukens, J., J. Suykens and L. Ljung (2008). Wiener–Hammerstein benchmark. Invited session proposal for the 15th IFAC Symposium on System Identification.
Vandersteen, G. (1997). Identification of linear and nonlinear systems in an errors–in–variables least squares and total least squares framework. Ph.D. thesis, Vrije Universiteit Brussel.
