14th IFAC Symposium on System Identification, Newcastle, Australia, 2006
IDENTIFICATION OF NONLINEAR AND LINEAR SYSTEMS: SIMILARITIES, DIFFERENCES, CHALLENGES
J. Schoukens, R. Pintelon, Y. Rolain
Vrije Universiteit Brussel, dept. ELEC, Pleinlaan 2, B-1050 Brussels, Belgium
email: [email protected]
Abstract: The basic goal of system identification is to fit models to data disturbed by noise, by minimizing a well-chosen cost function. In contrast to linear system identification, where the disturbing noise is the dominant error, model errors are the major problem in nonlinear system identification. This has an impact on the whole identification process: the design of the experiment, the selection of the cost function, and the validation and use of the identified model. The search for good, flexible models that are easy to identify is the driving force behind current nonlinear system identification research. Copyright © 2006 IFAC.

Keywords: system identification, linear, nonlinear, disturbing noise, model errors.
1. INTRODUCTION

The aim of system identification is to match a model M(θ) to (measurement) data Z. Measurement data are disturbed by measurement errors and process noise, described as disturbing noise n_z on the data:

Z = Z0 + n_z.    (1)

Since the selected model class M does in general not include the true system S0, model errors appear: S0 ∈ M0, with

M0 = M + M_ε,    (2)

where M_ε denotes the model errors. The goal of the identification process is to select M and to tune the model parameters θ such that the 'distance' between the model and the data becomes as small as possible. This distance is measured by the cost function that is minimized. The selection of these three items (data, model, cost function) sets the whole picture; everything else is a technicality that does not affect the quality of the estimates. Of course, this is an oversimplification. The numerical methods used to minimize the cost function, numerical conditioning problems, model parameterizations, etc. are all examples of very important choices that should be properly addressed in order to get reliable parameter estimates. Failing to make a proper selection can even drive the whole identification process to useless results. A good understanding of each of these steps is necessary to find out where a specific identification run fails: is it due to numerical problems, convergence problems, identifiability problems, or a poor design of the experiment?

The major difference between linear and nonlinear identification is the dominating error source. For linear identification we have good model structures (transfer function models, state space models, etc.), so that the disturbing noise n_z is the most important error. For most nonlinear identification problems we face severe model errors: the behaviour of nature is much richer than what is included in our models. This difference in the driving error mechanism affects many of our research activities, as elaborated in more detail in the next two sections.
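The setup of equations (1)-(2) can be sketched in a small simulation. This is an illustrative sketch, not taken from the paper: a hypothetical FIR system plays the role of S0, and the parameters θ are tuned by minimizing a quadratic cost on the noisy data Z.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical true system S0: a simple FIR filter y0[k] = 0.8 u[k] + 0.3 u[k-1]
theta_true = np.array([0.8, 0.3])
u = rng.standard_normal(1000)                       # excitation signal
U = np.column_stack([u, np.concatenate(([0.0], u[:-1]))])
z0 = U @ theta_true                                 # exact data Z0
z = z0 + 0.05 * rng.standard_normal(len(u))         # Z = Z0 + n_z

# Tune theta by minimizing the cost V(theta) = sum_k (z[k] - (U theta)[k])^2
theta_hat, *_ = np.linalg.lstsq(U, z, rcond=None)

print(theta_hat)   # close to [0.8, 0.3]; the residual reflects the disturbing noise
```

Here the model class contains the true system, so, as the paper argues for the linear case, the disturbing noise n_z is the only remaining error source.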
2. LINEAR SYSTEM IDENTIFICATION

If we assume that the true system S0 is linear, we do not really face model errors in linear system identification. It is enough to increase the model order to capture all dynamics of the system. Hence the disturbing noise n_z is the dominant error for validated models, and the selection of a cost function that guarantees consistent and efficient estimates can be embedded in a statistical framework. At the end of the seventies and the beginning of the eighties of the last century, the identification pioneers structured the field by making a clear distinction between the three basic identification steps: collection of the data; selection of a plant and noise model; selection of a cost function. These choices should not be confused with the choice of the numerical optimizer: another minimization scheme applied to the same cost function does not result in a new and/or more efficient estimate.

Another typical aspect of linear system identification is that there is a clear split between the model and its input and output. Within the linearity assumption, a validated model is valid for all signals within the frequency band it was designed for.

3. NONLINEAR SYSTEM IDENTIFICATION: SOME GENERAL THOUGHTS

In nonlinear system identification, the situation is completely different. As mentioned before, the major issue is the selection of a good model structure. While it was possible to propose a 'universal' model for linear system identification, it will remain a daydream to do so for nonlinear systems, because the nonlinear world covers everything that is not linear. Model errors will remain a major issue in nonlinear system identification for a long time. Checking the nonlinear system identification literature shows immediately that many of the current activities are directed towards a search for good models. These can be either dedicated models like Wiener, Hammerstein, bilinear, or block-structured models, or 'universal approximators' like neural networks and support vector machines. Physical models are outside the scope of this discussion because it is not possible to come up with generic methods: the model building has to be restarted for every new problem. Mixing up this wide variety of possibilities with the choice of the cost function and the optimization methods would bring us back to the 'bag of tricks' of the sixties and seventies of linear identification.

Identification in the presence of model errors affects all choices to be made: the design of the experiment, the selection of the cost function, and the validity of the model. Also the independence between model and data is lost.

Dependency of the model on the experiment: In contrast to linear system identification, an 'approximating' model is closely linked to the data set it is identified on. Generalization (another word for extrapolation) should be avoided; it is NOT a property of a method or model class. If, for example, we excite a system with a larger amplitude, it is impossible to predict whether the output will explode or saturate without taking a look at that amplitude range. A direct consequence is that it is no longer possible to decouple the approximating model from its excitation: there is an intimate link between both. Only if there is physical insight giving evidence that the selected model is in agreement with the physics of the system can we weaken this link.

Experiment design: Since the model is linked to the experiment, it is important to design the experiment to cover the intended use of the model. The power spectrum (e.g. white or coloured noise) and the amplitude distribution (e.g. uniform, Gaussian, or binary) should be properly set. Recently it also became clear that generating the signals with minimum or nonminimum phase filters makes a big difference for non-Gaussian excitations if causal approximations are needed. Of course the classical linear identification rules remain valid (persistency of excitation, maximum Fisher information), but they should be balanced against the other requirements, motivated above, for identifying an approximating model.

Selection of the cost function: In linear system identification, the selection of the cost function is embedded in a statistical framework, because random disturbances are the dominating error. For nonlinear system identification this is no longer the only criterion. Clearly, we should still avoid an extreme sensitivity to disturbing noise caused by a poor weighting, but it is even more important to put the model errors where they hurt least. For example: it is not because the signal-to-noise ratio (SNR) is very high in a given frequency band that we should weight this band more, if we want to distribute the dominating model errors equally over the full frequency band. This will strongly affect the optimal frequency weighting. Similar considerations apply if we want to make sure that the model error behaves well as a function of the excitation amplitude. As mentioned before, nonlinear system identification theory is not ready for a global approach that copies the success of linear system identification. A more modest attempt is needed, restricting the field to a more manageable problem. In the next section we illustrate the ideas explained above by generalizing the identification of linear systems to linear identification in the presence of nonlinear distortions, and next by identifying block-structured models.

4. EXAMPLE: LINEAR SYSTEM IDENTIFICATION IN THE PRESENCE OF NONLINEAR DISTORTIONS

The linear system identification framework is a huge success with many applications. However, a more detailed analysis shows that in many applications the linearity assumption is not met, while this is not detected
at all by the validation tests. The reason is that a nonlinear system excited by a (Gaussian) random excitation can be replaced by a linear system plus a nonlinear noise source. It is very hard to separate the nonlinear noise source from the disturbing noise without special tools. That is also why the nonlinear distortions are often not recognized: they are hidden in the disturbing noise model. So what is the problem? The major danger is that the effect of nonlinear distortions is completely different from disturbing noise effects, because the nonlinear noise source reveals only one aspect of the nonlinear distortions. They also create a systematic shift of the linear estimate, which is reduced to a 'best linear approximation' with all its restrictions, as explained in Section 3. The classical linear framework tells nothing about this shift.

Under these conditions, three different approaches can be taken:

Approach 1: Determine the level of the nonlinear distortions using a dedicated experiment strategy. This sets an intrinsic limit on the reliability of the linear approximation. It makes no sense to tune a design based on this model below this level, because the linear model is not valid below this level.

Approach 2: If the nonlinear errors are too large, a nonlinear structure should be identified. A natural extension of the linear model are the Wiener, Hammerstein, or Wiener-Hammerstein models. These are all block-oriented models that consist of a cascade of linear dynamic and static nonlinear blocks. They are all open-loop models: the output is not fed back to the input. These nonlinear models will lead to a significant improvement compared to the linear approximation if the actual system fits within this class (e.g. there is a static input nonlinearity). Otherwise a better approximation might still result and, in the worst case, no improvement is obtained. Methods to identify this class of models are available.

Approach 3: If the previous step fails, the class of block-oriented nonlinear models can be further extended to include feedback as well. This covers a much wider class of systems, but it makes the identification problem more involved. Identifying these models is still a hot research topic.

If we analyse the identification problem in the three approaches, we can make the following observations:

- To identify the best linear approximation, we should use a weighting that consists of the variance of the disturbing noise plus the nonlinear noise source. The classical linear identification methods that also estimate a noise model, like the Box-Jenkins method, do this automatically. However, they give no warning about the presence of the nonlinear distortions. Additional tools are needed to quantify the reliability and validity of the model. Since we have (large) model errors, the experiment design is crucial: the excitation signals should be similar to those that will be applied later on. Nothing can be said about the model quality under different conditions.

- By extending the model class to include nonlinear systems, it might turn out that the nonlinearities are captured by the model. In that case the optimal weighting is again dictated by the disturbing noise properties, which should somehow be extracted from the data. This is a simple task if periodic excitations are used, but in general it is still an open problem.

- The link between the experiment and the model becomes looser as the model comes closer to the physical reality and the model errors decrease. This cannot be concluded from a single experiment. The results should be repeatable for different excitations: changing the power spectrum, the amplitude distribution, and the amplitude range should be possible. Only when all these tests are successful is it allowed to consider this a validated model. One restriction that will remain is the extrapolation of the amplitude range. By increasing or decreasing the amplitude, it is always possible that new nonlinear effects pop up that were not present, or not well visible, in the earlier experiments. Examples are saturation effects for growing amplitudes, and stick-slip effects in mechanical systems for decreasing amplitudes.

5. CONCLUSIONS

In this paper we discussed differences and similarities between linear and nonlinear system identification. In both cases it is important to make a clear distinction between the three basic choices in system identification. If this is not done, we lose the clear structure that allowed us to order the linear field. A major difference is that nonlinear system identification is dominantly driven by a control of the model errors, while for linear system identification reducing the impact of the disturbing noise is the major drive. We analysed the consequences of this difference, which pop up in the experiment design, the selection of the cost function, and the validation of the identified model.

6. ACKNOWLEDGEMENT

This work was supported by the Flemish government (GOA-ILiNos), the FWO (onderzoeksgemeenschap ICCoS), and the Belgian government as a part of the Belgian programme on Interuniversity Poles of Attraction (IUAP V/22).

7. REFERENCES

Within the restricted length of this paper, we preferred to give no references instead of being very incomplete. We advise the reader to revisit the current literature on (non)linear system identification, keeping the messages of this paper in mind.
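As a closing numerical illustration of the best linear approximation discussed in Section 4 (a sketch with hypothetical system and signal choices, not taken from the paper): a linear FIR block followed by a static cubic nonlinearity is excited with Gaussian noise, and an ordinary linear least-squares fit is applied as a routine linear identification would do.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 20000

# Hypothetical Wiener-type system: FIR block x = u[k] + 0.5 u[k-1],
# followed by the static nonlinearity y = x + 0.1 x^3, plus small noise.
u = rng.standard_normal(N)
x = u + 0.5 * np.concatenate(([0.0], u[:-1]))
y = x + 0.1 * x**3 + 0.02 * rng.standard_normal(N)

# "Linear identification": least-squares fit of an FIR model to (u, y).
U = np.column_stack([u, np.concatenate(([0.0], u[:-1]))])
bla, *_ = np.linalg.lstsq(U, y, rcond=None)

# For Gaussian excitation, Bussgang's theorem predicts a scaled copy of the
# true FIR: scale = E[f'(x)] = 1 + 0.3 var(x) = 1.375 for these choices.
print(bla)             # roughly [1.375, 0.6875]

# The unmodelled distortion acts as an extra "nonlinear noise" source:
residual = y - U @ bla
print(residual.std())  # far above the 0.02 measurement noise level
```

The fit converges to a systematically shifted linear model (the best linear approximation), and the remaining distortion is absorbed into the residual, where a classical validation test would mistake it for disturbing noise.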