Entropy optimization principles with applications

Structural Safety, 12 (1993) 243-244

Elsevier

Book review

Entropy Optimization Principles with Applications, by J.N. Kapur and H.K. Kesavan, Academic Press, San Diego, CA, 1992, xiv + 408 pp., ISBN 0-12-397670-7

This book is about information-theoretic entropy. This is not the entropy that is used in thermodynamics, nor "entropy" as it is sometimes invoked in speculations about the future of our world sliding into chaos. Information-theoretic entropy is a concept of importance in the analysis of systems whenever uncertainty is a significant feature. This entropy has important applications in reliability and risk analysis. The book may therefore interest researchers concerned with structural safety.

Three revolutionary ideas about information were introduced around 1950: Shannon's measure of information (1948) and its expected value, called entropy; Jaynes' principle of maximum Shannon entropy (1957); and Kullback's principle of minimum cross-entropy (1951). Their applications are numerous, and "MaxEnt" is a large and rapidly growing field of research. Yet few books (none, to the writer's knowledge) give a coherent presentation of the use of entropy to analyze systems subject to uncertainty. This book by Kapur and Kesavan is the first that can conveniently serve as a self-contained textbook, complete with numerous problems and worked examples. It can also be used as a reference book, although it is worth mentioning that a reference book by J.N. Kapur (Entropy Models in Science and Engineering, Wiley, 1989) is also available.

The book requires no advanced mathematical knowledge. It makes use only of undergraduate calculus and Lagrange's method to maximize a function or a functional subject to constraints. The first half of the book can serve as a text for a one-semester course at the senior undergraduate or beginning graduate level. But the book goes far beyond an introductory text; it introduces new theory, new solutions, a broad body of generalizations and more than a score of new principles: principles of maximum, minimum and mixed entropy and cross-entropy. It also shows how to solve a wealth of problems and inverse problems in probability analysis.

There are two major kinds of uncertainty: vagueness about the objects of discourse and uncertainty about outcomes. Fuzzy methods have been proposed and used to deal with the first kind of uncertainty. The alternative, favored by many in science, engineering and law, is to first define what you are talking about, with whatever precision is necessary to permit a rational analysis. Some think that fuzzy methods and similar exotic tools are unnecessary, and that all uncertainty can be modelled by probability. Others disagree. Kapur and Kesavan avoid taking sides in this debate by stating simply that the book is concerned only with those aspects of uncertainty that lend themselves to probabilistic modelling.

The book introduces the concept of entropy gently, suggesting that you consider it "equivalent to uncertainty ... the ordinary dictionary meaning [of the term]." If something has n possible outcomes, with probabilities p1, p2, ..., pn, then entropy is just a function of these n probabilities that has some useful properties. Prominent, of course, is the original Shannon entropy (minus the expected value of a logarithm of these probabilities), but many generalizations of this particular brand are introduced in the book.
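
To make the two central quantities concrete: the Shannon entropy of a finite distribution is minus the expected logarithm of the probabilities, and the Kullback-Leibler cross-entropy is a directed divergence between two distributions. The short Python sketch below is an illustration added here, not taken from the book; the function names and the example distribution are arbitrary.

```python
import math

def shannon_entropy(p):
    """Shannon entropy H(p) = -sum_i p_i * log(p_i), in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)

def kl_divergence(p, q):
    """Kullback-Leibler directed divergence D(p || q) = sum_i p_i * log(p_i / q_i)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

if __name__ == "__main__":
    p = [0.5, 0.25, 0.125, 0.125]   # an arbitrary illustrative distribution
    u = [0.25] * 4                  # the uniform distribution on four outcomes

    print(shannon_entropy(p))       # about 1.213 nats
    print(shannon_entropy(u))       # log(4), about 1.386 nats, the maximum for n = 4
    print(kl_divergence(p, u))      # about 0.173 nats, equal to log(4) - H(p)
```

The last line also illustrates the identity D(p || u) = log n - H(p) for the uniform distribution u on n outcomes: Shannon entropy is itself a decreasing function of the directed divergence from the uniform, which is the spirit of the definition quoted below.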

Entropy, the book suggests, is equivalent to uncertainty; but what is uncertainty? The book largely avoids the issue, although a measure of the uncertainty of a distribution is defined (on p. 311, as any monotonic decreasing function of a directed divergence of the distribution from the uniform). This definition is not adequate from the perspective of applications, but it reflects the scope of the book. The book is essentially mathematical, and remains neutral about real-world meanings of the concepts.

Shannon's entropy was used by Jaynes in 1957 to give an elegant derivation of the classical distributions in statistical mechanics. Jaynes' principle can be viewed as a quantitative expression of some principles that should be fundamental to rational analysis: speak the truth and nothing but the truth, use all relevant information given, and make as little use as possible of information that is not available. The book deals with the two optimization principles of Jaynes and Kullback in a connected way. Indeed, Jaynes' principle can be considered a special case of Kullback's, but its foundation is Shannon's measure of information, while the latter deals with a directed distance between probability distributions in a metric space. Numerous applications are demonstrated in statistical mechanics, economics, transportation, regional and urban planning, statistics, pattern recognition, spectral analysis, queuing theory, and more.

Estimation of distributions from sparse data is a critical problem in probability-based structural safety. Six conceptually different solutions to the parameter estimation problem in statistics, arising out of entropy formulations, are shown in the book. Interestingly, it is shown that the principle of maximum likelihood can be interpreted in terms of cross-entropy minimization: among all distributions of a given type, the one that maximizes the likelihood minimizes the directed (Kullback-Leibler) distance to the empirical distribution of the sample. The principles of maximum product of spacings and of least information have similar interpretations.

The interpretation of cross-entropy as a directed distance is very powerful. The book exploits this concept, exploring the use of many metrics other than the Kullback-Leibler measure. Casting problems in terms of optimization of distance in a linear space, subject to constraints, also leads the authors to formulate several inverse problems. They do not exhaustively explore the wide new world opened by these generalizations, but they give convincing demonstrations of their potential usefulness for solving a great variety of problems.

The power of Shannon entropy and Kullback cross-entropy has led many to consider them cure-alls for problems of uncertainty. In contrast, this book presents a bewildering array of other possibilities, tools and principles from which one must choose. This embarrassment of riches makes it necessary to think about the rationale of any proposed new application of entropy optimization. Some have found this annoying; others may welcome the opportunity and the promise of deeper insight. In any case, the book has many potential applications to structural reliability and risk analysis, and it can be recommended both as a stimulant for thought and as a text.

N. Lind
Victoria, BC, Canada