Book reviews
Data Mining in Finance: Advances in Relational and Hybrid Methods, Boris Kovalerchuk and Evgenii Vityaev, Kluwer Academic Publishers, Norwell, Massachusetts, 2000, ISBN 0-7923-7804-0, US $120 (Hardback)

As the title suggests, this book acquaints the reader with data mining methods useful for financial forecasting. It is an important text given the incredible challenge represented by the forecasting of financial time series and is particularly insightful to those interested in using data mining to identify market trends. Rather than attempt to deal with efficient market theory, the book draws upon the empirical evidence that short-term, local conditional regularities may exist. Thus, a primary objective of the text, which the authors successfully achieve, is to illustrate that financial forecasting benefits from relational data mining based on symbolic methods.

Kovalerchuk and Vityaev take the innovative approach of selecting a single financial forecasting problem. They focus on forecasting the stock market, and within this context they analyze the strengths and weaknesses of various forecasting techniques. In such a context, it is clear that the forecaster is more concerned with trading performance based on identified trading “rules” than with the accuracy of the forecast. Thus, forecasting itself need not be the final product of the data mining exercise. The weakness of such an approach is that different forecasting methods may be superior depending upon the limitations of the underlying financial problem. Fortunately, the authors do present a smattering of practical examples that might appeal to forecasters concerned with alternative problems in finance, such as exchange rates and stock ratings.

Following an introductory chapter, the book is loosely structured into two complementary sections. Chapters 2–4 focus on describing specific data mining techniques. The text progresses from numerical data mining to rule-based methods and finally to relational data mining. Chapters 2 and 3 do not treat each topic in as much depth as Chapter 4, given that relational data mining techniques are the focus of the text. Chapters 5–6 analyze specific financial applications and compare methods using the same underlying forecasting problem. The book concludes in Chapter 7 with an introduction to hybrid models that incorporate fuzzy logic and applications of such models in finance.

Chapter 2 presents numerical data mining models, such as statistical models and neural networks. The authors discuss the value of “expert mining” as a source of regularities when dealing with absent or insufficient data. A typical problem with expert-based learning systems in finance is the slowness of response of the systems to changing markets. Kovalerchuk and Vityaev provide the example that few trading rules are successful across different markets. Therefore, much of the chapter is devoted to methods for mining regularities from an expert’s perspective in an expeditious, efficient manner.

Chapter 3 focuses on rule-based and hybrid data mining, such as decision trees. Although learned decision trees provide a set of human-readable, consistent rules, they suffer from the difficulty of discovering small trees for complex problems. In addition, they fail to compare two attribute values as is possible with relational methods. Such solutions are more amenable to human comprehension than the neural network approach. Kovalerchuk and Vityaev also discuss hybrid methods that allow for the extraction of symbolic representations (rule-based approaches) from a trained neural network. This hybrid methodology allows for the combination of both discoveries in an understandable manner.

Chapter 4 provides an in-depth review of relational data mining, the primary focus of the
book. Although there are many relational data mining algorithms, the field is migrating toward probabilistic first-order rules to avoid the limitations of deterministic systems. As well as introducing us to alternative algorithms, Kovalerchuk and Vityaev detail the Machine Method for Discovering Regularities (MMDR). MMDR is well suited to financial applications given its ability to handle numerical data with high levels of noise.

Chapter 5 is of particular interest to those concerned with the success of relational data mining when applied to financial problems. The authors reveal how regularities in time series can be discovered using mathematical logic and probability theory.

Chapter 6 provides an important comparison of the performance of relational data mining methods with other forecasting methods. It is interesting to note that trading strategies developed based on MMDR consistently outperform trading strategies developed based on other data mining methods. Even though buy-and-hold can outperform MMDR-based trading strategies in a linear growth market, the MMDR-based strategies appear to perform almost as well. MMDR-based strategies are clearly superior in volatile markets with changing trends. A potential weakness of the chapter is the lack of discussion of the impact of trading fees on performance. Nevertheless, given the growing number of portfolio managers using such forecasting tools, it appears that relational data mining algorithms hold a good deal of promise for the future.

Chapter 7, the final chapter, focuses on fuzzy logic tools. Such knowledge discovery allows for reductions in the search space given the mining of knowledge from experts and is thus quite useful in situations in which the forecaster has limited training data. Kovalerchuk and Vityaev discuss the benefits of combining fuzzy logic with other methods to mitigate weaknesses in those methods. For example, in a hybrid model, fuzzy logic might be used to adjust the inputs and parameters of the neural networks based on expert information. The authors illustrate the development of a hybrid model that combines fuzzy logic with neural networks, but they are quick to note that fuzzy logic can also be combined with other data mining tools in a hybrid system. They also present a number of successful applications of fuzzy logic in finance, including investment fund management and bond rating programs.

“Data Mining in Finance” is a timely book that provides an introduction to forecasting financial time series using data mining. The weaknesses of the book are primarily structural. It suffers to some degree from a lack of editing and at times inundates the reader with terminology. In addition, comprehension of the material would be facilitated by organizing it in a more integrated manner, for example by including a summary section at the end of each chapter. In contrast, a wonderful strength of the book is that it provides insightful comparisons of many data mining techniques. Overall, it provides excellent examples and arguments for the application of relational data mining to financial problems and leaves the reader with expectations of great advancements in this field in the near future.

Adrian M. Cowan
Senior Financial Economist
U.S. Department of the Treasury
OTS Office of Risk and Economic Analysis
Washington, DC, USA

PII: S0169-2070(01)00128-5