Book reviews / Automatica 39 (2003) 563 – 568
tion during the 1990s. This important topic is discussed in Chapter 6, where various techniques of voltage stability assessment are outlined. The second part of the book, Chapters 7–9, covers the basics of intelligent systems and their application to angle and voltage stability. A brief description of the fundamentals of artificial neural networks, expert systems and fuzzy logic is given in Chapter 7. Two examples of the application of AI to power systems are given in Chapter 8. One describes an off-line trained artificial neural network for transient stability assessment based on the transient energy function, and its use in an expert system for direct stability analysis. The second is a fuzzy logic controller used as a power system stabilizer. Although it is stated that simulation studies were performed with the fuzzy logic power system stabilizer, no results are given to illustrate its capabilities. These two examples hint at the possible applications of these techniques. Two examples of the use of a neural network and an expert system for the voltage stability problem are described in Chapter 9: an outline and simulation results of (i) the application of a feedforward multilayer neural network to voltage stability assessment and enhancement, and (ii) an expert-system-based voltage collapse detection and prediction scheme. The material presented in Chapters 8 and 9 gives the reader a glimpse of the possible applications of AI in power systems; by now a large body of literature exists in this area. An extensive Bibliography of related literature, covering both power system analysis (the first part of the book) and AI applications (the second part), is included and can be very useful to the reader. Most listings in the Bibliography are very relevant. The description in the book is perforce brief, and the literature listed in the Bibliography provides further details of the material covered.
Unfortunately, no references are provided in the main body of the book, which makes it very difficult for a reader who may wish to pursue a specific topic further. A Glossary of terms and a few exercise problems on power systems for Chapters 2–6 are a welcome addition. The subject matter of the book, particularly in regard to AI applications, is very timely. However, there are a number of problems with it. Its value could be significantly improved if proper attention were paid to editing for language, grammar, terminology, etc. The book contains errors of a technical nature; considering the level of expertise of the authors, one may attribute these to a lack of attention to editing. Whatever the reason, the reader is left confused. Regarding the technical content, the book would certainly have been enhanced if the material had been kept at about the same level and the very basic material had not been included. This would have provided a better flow and a more coherent coverage of the subject matter. Inclusion of references in the body of the book is essential for a book of this nature. One more item that would make such a book very useful is some information on the actual state of application of AI methods: are these methods still at the investigation stage, or have any applications been implemented in practical power systems? Such information would go a long way in furthering this type of work.

O.P. Malik
Department of Electrical and Computer Engineering, University of Calgary, 2500 University Drive NW, Calgary, Canada AB T2N 1N4
E-mail address: [email protected]

About the reviewer
Dr. O. P. Malik is currently a professor emeritus in the Department of Electrical and Computer Engineering at the University of Calgary, Canada. He has been at the University of Calgary since 1968, teaching and doing research. Before joining the University of Calgary, he taught at the University of Windsor, Canada, for 2 years, and worked with English Electric Co., UK, and in electric utilities in India for 9 years. Dr. Malik has done extensive research in the application of adaptive control and AI techniques to the control and protection of power systems, having published over 500 papers in these areas. In his research, particular emphasis is laid on the implementation and experimental evaluation of all control and protection schemes rather than just simulation studies. Dr. Malik graduated in electrical engineering from Delhi Polytechnic, India, in 1952, obtained a Master of Engineering from Roorkee University, India, in 1962, and a Ph.D. from the University of London, England, together with the Diploma of the Imperial College, London, in 1965. He is a Life Fellow of the IEEE, and a Fellow of the Canadian Academy of Engineering, of the Engineering Institute of Canada and of the IEE.
doi:10.1016/S0005-1098(02)00230-3
Nonlinear System Identification, Oliver Nelles; Springer, Berlin, 2001, ISBN 3-540-67369-5

In the preface, Oliver Nelles states his goal as providing engineers and scientists in academia and industry with a thorough understanding of the underlying principles of nonlinear identification. This is a tall order; no wonder the book is 785 pages long. As its subtitle, From classical approaches to neural networks and fuzzy models, suggests, the book is more specifically focused on black-box modeling from experimental data (supervised learning), with almost nothing on the identification of knowledge-based models. It does build a bridge between the seemingly distant worlds of classical parameter identification and neuro-fuzzy modeling. The main topic is the prediction of the output of a system based on input–output data, with little interest in the values taken by the parameters of the model and no
interest in the reconstruction of physically meaningful state variables that are not accessible to measurement. The notions of identifiability and observability are thus absent, and state-space models and Kalman filtering play very small roles. The uncertainty in the estimated parameters is considered only insofar as it impacts the uncertainty of the prediction of the system output. Because of the black-box nature of the models, considerable attention is devoted to the problem of selecting appropriate input variables and to the struggle against the curse of dimensionality. Parameter optimization is considered within structure determination, which is a much more complex problem than in the case of linear models. The book is written from an engineering perspective, with the mathematics kept to a minimum. If you are keen on the theorem–proof format, then it is not for you. Instead, the formulas and properties are illustrated and commented upon at length. This sometimes leads to oversimplification, but most comments are to the point and interesting. An extensive bibliography (414 references) provides pointers to more theoretical presentations. There is a strong emphasis on an intuitive understanding of the topics treated. The advantages and limitations of the various model structures and optimization strategies are always described in detail from the standpoint of a potential user. Part 1 is devoted to optimization techniques, which are thus treated before the detailed consideration of any specific model architecture. This is appropriate because it avoids conveying the impression that algorithms and architectures are inextricably mixed. It paves the way for an understanding of the algorithms available for a given architecture and of the consequences of choosing any one of them. The price to be paid is that one has to wait until the following parts to see identification in action.
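Since (weighted) least squares is the workhorse of the linear machinery treated in Part 1, a minimal sketch may help fix ideas. This is my own illustration, not code from the book: the function name and the two-parameter model y = a*x + b are assumptions chosen for brevity, and the remark on conditioning reflects standard numerical practice rather than anything specific to the text.

```python
# A minimal weighted least-squares fit of the 2-parameter model
# y = a*x + b (hypothetical example, not taken from the book).
# It uses the explicit normal equations (X'WX) theta = X'W y; for
# larger or ill-conditioned problems an orthogonal factorization
# (QR/SVD) is numerically preferable, since forming X'WX squares
# the condition number of the problem.

def weighted_least_squares(xs, ys, ws):
    """Return (a, b) minimizing sum_i w_i * (y_i - a*x_i - b)**2."""
    sxx = sum(w * x * x for x, w in zip(xs, ws))
    sx = sum(w * x for x, w in zip(xs, ws))
    s1 = sum(ws)
    sxy = sum(w * x * y for x, y, w in zip(xs, ys, ws))
    sy = sum(w * y for y, w in zip(ys, ws))
    det = sxx * s1 - sx * sx        # 2x2 system solved by Cramer's rule
    a = (sxy * s1 - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return a, b

# Data lying exactly on y = 2x + 1 are recovered whatever the weights.
a, b = weighted_least_squares([0.0, 1.0, 2.0, 3.0],
                              [1.0, 3.0, 5.0, 7.0],
                              [1.0, 2.0, 1.0, 0.5])
```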
This difficulty cannot be avoided when presenting system identification, because all topics are related, whereas a book is discursive by nature. Many cross-references are provided, which help the reader explore these relations. Basic notions of probability and statistics are introduced to explain what types of loss function could be considered. Although the presentation of statistical estimation is limited in scope, it gives much more information than strictly necessary for the remainder of the book. Similarly, many more algorithms are described than will actually be used in the sequel (essentially weighted, recursive and orthogonal least squares in the linear case, and gradient-based algorithms for local optimization of nonlinearly parameterized models). These extensions are useful, as they allow a better understanding, in a more general setting, of what will actually be used for building models. I have some reservations about the terminology, though: calling those problems that can be solved by least squares "linear optimization problems" entails a risk of confusion with the problems that can be solved by linear programming, but the context should rule this out. Little attention is paid to numerical robustness, with, for example, a repeated use of the explicit formula for least-squares estimation without the reader being
told why this formula should be avoided in actual computation. Given the importance of the computation of gradients, it would have been interesting to have a section on automatic differentiation, which allows an exact yet remarkably efficient computation of the gradient of any quantity computed by a program, provided of course that the code itself is differentiable (Griewank & Corliss, 1991). The basic idea is the same as for the backpropagation algorithm presented in the book, namely the chain rule for differentiation, but it is applied to code instead of mathematical formulas, which makes it applicable in a systematic way to a much wider range of models. Global optimization algorithms based on random search are presented at great length, without stressing clearly that absolutely no guarantee can be provided as to the results obtained in finite time. Regarding simulated annealing, I do not find fuzzy statements such as "it can be shown that the algorithm is statistically guaranteed to find the global optimum" particularly helpful, but the discussion about the policy for the annealing schedule is interesting. For a biological testimony to the fact that evolutionary algorithms may get trapped in local optima, see the story of compound eyes as reported in Dawkins (1996). Compound eyes, favored by insects and crustaceans, cannot for fundamental reasons provide as precise and detailed images of the world as the simple camera eyes favored by many other species, including ours. In the words of the Swedish biologist Dan Nilsson: "It is only a small exaggeration to say that evolution seems to be fighting a desperate battle to improve a basically disastrous design." Given these shortcomings, it is a pity that nothing is said about deterministic methods for global search (Horst & Tuy, 1990), which can provide guaranteed results in finite time. Contrary to what is said on p. 133, branch-and-bound search does not require parameter space to be discretized when combined with the tools of interval analysis (Hansen, 1992). These tools have indeed been used to get guaranteed results in the field of nonlinear parameter and state estimation; see, e.g., Jaulin, Kieffer, Didrit, and Walter (2001). Of course, deterministic global search cannot claim to solve all problems, so random search still has a promising future. The last chapter of Part 1 is devoted to the choice of model structure, including the selection of pertinent input variables and the issue of model complexity, which are all key points in nonlinear black-box modeling. The fundamental nature of the curse of dimensionality is explained, as well as why the situation is often much less severe in real-world problems than could have been feared. Clear explanations and useful hints are given on cross-validation and on various methods for automatic structure selection and regularization of model output. I am afraid, though, that the part on statistical tests is useless for those who know and cannot be understood by those who do not. Part 2 deals with static models. This is again appropriate, as the situation is much simpler than with dynamic models. Moreover, static models are the basic blocks for the major route to be followed in Part 3 to build dynamic models, namely the external dynamics approach. For the sake of simplicity, MIMO models are decomposed into a series of MISO models, at the possible cost of a deterioration of the evaluation speed. To illustrate the properties of the nonlinear model architectures to be presented, the same very simple nonlinear SISO system is considered as a test case, and each of these architectures is assessed according to 15 pertinent evaluation criteria, including interpolation and extrapolation behaviors, sensitivity to noise, parameter and structure optimization, training and evaluation speeds, and interpretability. This is of course a crude characterization of the advantages and disadvantages of each solution, but it should nevertheless be particularly useful to readers who lack practical experience in the matter. The structures considered include linear and polynomial models, look-up tables (very much in use in industry), splines, neural networks (with special emphasis on multilayer perceptrons (MLP) and radial basis function (RBF) networks), and fuzzy models. A description of the main construction mechanisms for neural networks facilitates the understanding of the differences in behavior between MLP and RBF networks, and of the reasons why projection-based mechanisms as applied in MLP networks are particularly efficient at overcoming the curse of dimensionality. Radial basis functions were used in mathematics for interpolation well before the emergence of neural networks, and Oliver Nelles provides useful pointers in this respect. Another connection that would have been worth mentioning is that between RBF networks and the prediction method developed under the name of kriging in the context of geostatistics, after the seminal work of Krige more than 50 years ago (Chilès & Delfiner, 1999). It would also have been interesting to mention support vector machines, a hot topic in the neural networks community (Schölkopf & Smola, 2002).
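To make the RBF connection concrete, here is a small sketch of classical radial basis function interpolation with the centers placed at the data points, the setting that predates neural networks. All names and the Gaussian width are my own assumptions, not the book's; a real RBF network would typically use fewer centers than data points and a regularized fit.

```python
# Hedged sketch (not from the book): exact interpolation with
# Gaussian radial basis functions centered at the data points.
# With n data points, the n weights solve one n x n linear system.
import math

def gauss(r, width=1.0):
    """Gaussian radial basis function of the distance r."""
    return math.exp(-(r / width) ** 2)

def solve(A, b):
    """Tiny dense Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def rbf_interpolant(xs, ys, width=1.0):
    # Interpolation matrix A[i][j] = phi(|x_i - x_j|); positive definite
    # for the Gaussian kernel and distinct centers.
    A = [[gauss(abs(xi - xj), width) for xj in xs] for xi in xs]
    w = solve(A, ys)
    return lambda x: sum(wi * gauss(abs(x - xi), width) for wi, xi in zip(w, xs))

f = rbf_interpolant([0.0, 1.0, 2.0], [1.0, 3.0, 2.0])
# By construction the interpolant reproduces the training data exactly.
```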
Fuzzy logic is introduced clearly and pedagogically, and linguistic, singleton and Takagi–Sugeno models are compared. Fuzzy logic has met fierce opposition from statisticians; see the excellent special issue of the IEEE Transactions on Fuzzy Systems entirely devoted to the debate (Bezdek, 1994). The main criticism by statisticians of fuzzy logic is that the same advantages in terms of incorporation of prior knowledge and interpretability of the results could be obtained using the theory of subjective probability and a Bayesian approach, with a stronger theoretical foundation. The normalization of the membership functions to realize a partition of unity seems, for example, to be nothing other than the familiar requirement of having probabilities sum to one. This being said, the discussion of the unexpected and undesired effects of this normalization is very interesting. From Chapter 13 onward, the emphasis is on Takagi–Sugeno models and on a specific algorithm proposed by Oliver Nelles to build such models, known under the acronym LOLIMOT (for local linear model tree). If the membership functions were fully specified, the estimation of the parameters of the local linear models would boil down to a linear problem, which could be solved, at least in principle, by a single application of the weighted least-squares algorithm. Oliver Nelles convincingly advocates a different route, along which the parameters of each local linear model are estimated individually. The bias increases, but the variance of the error decreases and the complexity of computation is much reduced. As regards the (soft) partitioning of the input space, which decides the degree of contribution of each local model to the output at any given point, the advantages and limitations of the available techniques are presented before concentrating on the very simple LOLIMOT algorithm, which iteratively builds a tree consisting of boxes in input space. At each iteration, the box that corresponds to the local linear model with the worst performance is split into two boxes, each of which is associated with a new local linear model. Pruning can be incorporated into LOLIMOT to merge local linear models, thus preventing their number from growing indefinitely. A very attractive feature of LOLIMOT is that the computational demand grows only linearly with the number of local linear models to be considered. Since these models are linearly parameterized, the loss function used for estimating their parameters must be quadratic in the model output to take advantage of the weighted least-squares algorithm. No such constraint exists for the loss function associated with the partitioning of input space, and it is not even necessary that the input variables be the same in both cases. This possibility of using different input spaces for rule premises and consequents is an additional advantage of the LOLIMOT strategy. Advanced aspects include a methodology to deal with direction-dependent behavior (hysteresis), the use of orthogonal least squares instead of least squares to select regressors in the local linear models, and online learning.
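The splitting idea is simple enough to sketch in a few lines. The following is a deliberately stripped-down, one-dimensional caricature of my own devising, not the book's algorithm: it uses crisp interval boundaries instead of LOLIMOT's smooth normalized validity functions, and ordinary instead of weighted least squares, but it shows the greedy loop that splits the worst-fitting box at each iteration.

```python
# Hedged 1-D skeleton of the LOLIMOT idea: keep a set of intervals
# ("boxes"), fit one local linear model per box by least squares,
# and repeatedly split the box whose local model fits worst.

def fit_line(pts):
    """Ordinary least squares for y ~ a*x + b on a list of (x, y)."""
    n = len(pts)
    sx = sum(x for x, _ in pts); sy = sum(y for _, y in pts)
    sxx = sum(x * x for x, _ in pts); sxy = sum(x * y for x, y in pts)
    det = n * sxx - sx * sx
    if det == 0:                      # degenerate box: constant model
        return 0.0, sy / n
    return (n * sxy - sx * sy) / det, (sxx * sy - sx * sxy) / det

def sse(pts, a, b):
    """Sum of squared residuals of the local line on its box."""
    return sum((y - (a * x + b)) ** 2 for x, y in pts)

def lolimot_1d(data, n_models):
    boxes = [(min(x for x, _ in data), max(x for x, _ in data))]
    while len(boxes) < n_models:
        # score every box with the residual of its local linear model
        scores = []
        for lo, hi in boxes:
            pts = [(x, y) for x, y in data if lo <= x <= hi]
            a, b = fit_line(pts)
            scores.append(sse(pts, a, b))
        lo, hi = boxes.pop(scores.index(max(scores)))  # worst box
        mid = 0.5 * (lo + hi)
        boxes += [(lo, mid), (mid, hi)]                # split in two
    return sorted(boxes)

# y = |x| is piecewise linear, so two local models should suffice:
# the first split lands at the kink.
data = [(x / 10.0, abs(x / 10.0)) for x in range(-10, 11)]
boxes = lolimot_1d(data, 2)
```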
A methodology for assessing the accuracy of the predicted output for any given input, while taking into account the uncertainty of the local linear models, is proposed, with honest explanations of the limitations of the exercise. Despite these limitations, the resulting error bars are potentially very useful for detecting extrapolation regions where the output prediction is particularly doubtful, and for designing suitable input signals for the training of the model. Part 3 extends the previous considerations to dynamic models. It starts with an overview of well-known parametric methods for the identification of discrete-time dynamic linear systems in the time domain. The now-standard terminology is recalled, as well as the basic properties of model structures that include ARX, ARMAX, OE, FIR and OBF models. Also recalled are the main approaches available for the estimation of their parameters, including prediction-error and instrumental-variable methods and their recursive variants. This is the largest chapter, at 89 pages, and it could probably have been made more concise. Again, much more material is presented than actually needed to build the local linear models to be used later. The chapter contains a number of remarks that will be useful in the context of nonlinear modeling, for instance on the important distinction between simulation and prediction or on the
stability properties of the resulting models. Two main strategies are considered for the transition from linear to nonlinear dynamic models. One of them, the internal dynamics approach based on a state-space representation, receives little attention. The other, considered in much more detail, is the external dynamics approach, in which the nonlinear dynamic model consists of a nonlinear static approximator, the inputs of which are suitably delayed versions of the inputs and outputs of the system. Oliver Nelles explains clearly why, as a result of this structure, vast regions of the input space of the nonlinear approximator are not explored. Also interesting is the remark that even a slightly nonlinear static approximator may be capable of describing the behavior of a highly nonlinear system, which is a strong argument in favor of local linear modeling schemes. Depending on what type of input is fed to the nonlinear static approximator, various nonlinear extensions of the main linear models are obtained. One thus gets the NARX, NARMAX and NOE structures if delayed versions of the system or model outputs are used, and the NFIR and NOBF structures if they are not. Many properties of these models are similar to those of their linear counterparts. Some properties differ, however, and a well-chosen example demonstrates the shortcomings of a PRBS signal, classically used in the linear case, for the identification of a nonlinear dynamic system. The main idea pursued from here on is the use of static neural network and fuzzy model architectures in the framework of the external dynamics approach. Important features of the various solutions that can be considered are described, including how these solutions face the curse of dimensionality, their interpolation and extrapolation behaviors, and whether the dynamic local models are linear in their parameters. The intended use of the model is assumed to be simulation (i.e. the model must operate in parallel with the system, without being fed the system output), even if training may be based on the minimization of some measure of the prediction error. Special attention is devoted to the case where the local models are linear and training is performed using LOLIMOT. It is then possible to benefit from all the knowledge available about linear models, as well as from the simplicity and flexibility of this algorithm. The possibly catastrophic consequences of an interpolation between the parameters of the local linear models according to the values taken by their membership functions are honestly pointed out, together with suggestions for improving the situation. Methods for studying the stability of the resulting global model are provided, as well as interesting explanations of why some local linear models of a stable nonlinear system may turn out to be unstable. Four simulated examples, representative of a variety of nonlinear systems, are treated using the simplest possible structure for a dynamic local linear model, i.e. an ARX model trained by least squares. The first three examples include a separable static nonlinearity and consist of a Hammerstein system, a Wiener system, and a second-order non-minimum-phase linear system, the output of which is fed back through a parabolic nonlinearity.
The fourth example features a dynamic nonlinearity. In all cases, performance can be considered satisfactory, with the Wiener system turning out to be the most difficult to model. Various extensions of the scheme based on LOLIMOT are briefly considered, such as the optimization of the structure of the local models by orthogonal least squares, or the replacement of ARX local models by more sophisticated models, which may even be nonlinear. The increase in performance resulting from the use of ARMAX or nonlinearly parameterized models seems seldom sufficient to warrant the increased complexity of their training. Part 4 describes applications to real-world examples of local linear models trained with LOLIMOT. Many, but not all, of these applications come from the automotive industry. Oliver Nelles honestly points out the lack of a comparison with the results that could be obtained with alternative approaches. These applications are not trivial, and they provide many opportunities to illustrate important features of LOLIMOT, including its remarkable flexibility. I am nevertheless afraid that Oliver Nelles gets slightly carried away when he claims on p. 707 that local linear neuro-fuzzy models trained with the LOLIMOT algorithm are "a universal tool for modeling and identification of nonlinear dynamic real-world processes". The last chapter offers a glimpse of applications to predictive or adaptive control and to fault detection, diagnosis and reconfiguration strategies. The book concludes with two appendices summarizing important results about vector and matrix derivatives and statistics. To a large extent, Oliver Nelles has managed to give a unified presentation of techniques that have been developed in communities so diverse that even their basic vocabularies differ. He chose to stick as much as possible to the standard terminology of the system identification and optimization literature, but provides welcome translations into the languages of neural networks and evolutionary algorithms.
Also welcome are the historical insights, for instance the tracing on p. 92 of the origin of the concept of backpropagation, which is nothing other than the computation of gradients via the chain rule for differentiation, although the term is sometimes used in the neural networks community to describe a steepest-descent algorithm, or even the network to which the algorithm is applied. There is no table of notation, which is unfortunate. The notation itself is not always consistent, for instance with n standing both for the noise and for the dimension of parameter space, or the notation for the norm changing from ‖·‖ on p. 29 (without a proper definition) to |·| on p. 50 and elsewhere. I found rather few mathematical typos, and these are easily detected. In my copy, p. 206 was missing, and I hope this is not general. The book is well suited to teaching at the final-year undergraduate or graduate level. It has been student-tested and contains a large number of particularly well-chosen, simple illustrative examples. The pace of presentation is slow, with repetitions that help the reader memorize or recover the most important points. Most chapters, as well as Parts 1 and 2, conclude with welcome summaries. There is not a single exercise, but interesting exercises that can be treated without a computer are hard to come by in this context. A toolbox implementing LOLIMOT has been built by Oliver Nelles and his colleagues, and it would have been useful to include some information about its availability, as well as a series of problems to be studied with this toolbox in order to gain hands-on experience. The examples treated throughout the book deserve special praise and may serve as a partial substitute, especially if one has access to the software needed to make similar numerical experiments. They have been carefully designed to demonstrate specific and interesting points, and are supplemented with pertinent figures. Oliver Nelles has extensive practical experience of nonlinear modeling, and does not hesitate to share it, which makes the book particularly attractive for research and development engineers. One will often find sentences starting with "in the experience of the author" and providing very useful bits of information indeed. The argumentation in favor of local linear models and the associated LOLIMOT algorithm is convincing, and the book is a welcome addition to any library covering system identification.

References

Bezdek, J. C. (Ed.) (1994). Fuzziness vs. probability—the Nth round. IEEE Transactions on Fuzzy Systems (Special Issue), 2(1), 1–45.
Chilès, J. P., & Delfiner, P. (1999). Geostatistics. New York: Wiley.
Dawkins, R. (1996). Climbing mount improbable. London: Penguin Books.
Griewank, A., & Corliss, G. (Eds.) (1991). Automatic differentiation of algorithms: Theory, implementation and applications. Philadelphia: SIAM.
Hansen, E. (1992). Global optimization using interval analysis. New York: Marcel Dekker.
Horst, R., & Tuy, H. (1990). Global optimization, deterministic approaches. Berlin: Springer.
Jaulin, L., Kieffer, M., Didrit, O., & Walter, E. (2001). Applied interval analysis, with examples in parameter and state estimation, robust control and robotics. London: Springer.
Schölkopf, B., & Smola, A. (2002). Learning with kernels. Cambridge, MA: MIT Press.

doi:10.1016/S0005-1098(02)00239-X

About the reviewer
Éric Walter was awarded a Doctorat d'État in control theory in 1980. He is Directeur de Recherche at CNRS (the French national center for scientific research). His research interests revolve around parameter estimation and its application to chemical engineering, chemistry, control, image processing, medicine, pharmacokinetics, and robotics. A translation of his doctoral dissertation has been published under the title Identifiability of State-Space Models (Springer, Berlin, 1982). He has written Identification of Parametric Models from Experimental Data with Luc Pronzato (Springer, London, 1997), and Applied Interval Analysis with Luc Jaulin, Michel Kieffer and Olivier Didrit (Springer, London, 2001). He is now the Director of the Laboratoire des Signaux et Systèmes. More information and a list of publications are available at http://www.lss.supelec.fr/perso/walter/index.html.
Éric Walter
Laboratoire des Signaux et Systèmes,
CNRS–Supélec–Université Paris-Sud,
91192 Gif-sur-Yvette, France
E-mail address: [email protected] (E. Walter)