International Journal of Forecasting 29 (2013) 628–641
Takeover prediction using forecast combinations
Bruno Dore Rodrigues ∗, Maxwell J. Stevenson
Discipline of Finance, Faculty of Economics and Business, Level 4, Building H69, The University of Sydney, Sydney NSW 2006, Australia
Keywords: Combining forecasts; Forecast accuracy; Economic value; Abnormal returns; Panel data; Neural networks
Abstract
The ability to identify likely takeover targets at an early stage should provide investors with valuable information, enabling them to profit by investing in potential target firms. In this paper we contribute to the takeover forecasting literature by suggesting the combination of probability forecasts as an alternative method of improving the forecast accuracy in takeover prediction and realizing improved economic returns from portfolios made up of predicted targets. Forecasts from several non-linear forecasting models, such as logistic and neural network models and a combination of them, are used to determine the methodology that best reduces the out-of-sample misclassification error. We draw two general conclusions from our results. First, the forecast combination method outperforms the single models, and should therefore be used to improve the accuracy of takeover target predictions. Second, we demonstrate that an investment in a portfolio of the combined predicted targets results in significant abnormal returns being made by an investor, in the order of up to double the market benchmark return when using a portfolio of manageable size. © 2013 Published by Elsevier B.V. on behalf of International Institute of Forecasters.
∗ Corresponding author. Tel.: +61 4 5078 6535, 61 2 9351 3915; fax: +61 2 9351 6461.
0169-2070/$ – see front matter © 2013 Published by Elsevier B.V. on behalf of International Institute of Forecasters. http://dx.doi.org/10.1016/j.ijforecast.2013.01.008
1. Introduction
Mergers and acquisitions have long been a major area of research in finance. Several studies have demonstrated that the target’s share price increases substantially over the period leading up to the bid announcement date. It has also been observed that most of the gains in mergers and acquisition deals accrue to the shareholders of the target firm. Consequently, the ability to identify likely takeover targets at an early stage could provide investors with valuable information from which they can profit by investing in potential target firms. Assuming that abnormal returns can be achieved by trading in advance of acquisition announcements, the development of takeover prediction models based on publicly available information provides important tools for guiding investment strategies.
Even after considering the methodological improvements from several recent studies in the area of takeover prediction, the answer to the question of whether takeover targets can be predicted remains unclear. If the conclusions from a study are based on a single forecast, little information on the robustness of these predictions is available. Further, Powell (2004) advised that modelling takeovers using a binomial framework exclusively may be misleading, since takeovers may occur for many reasons which will not be present in the selected hypotheses and the corresponding predictor variables. From an investment perspective, it is crucial to be aware of the risk and stability of a takeover model. It hardly seems optimal for an investor to invest capital in a portfolio of potential target companies unless the selection process was based on robustly evaluated predictions.
Forecast combination has long been viewed as a simple and effective way to improve the robustness of forecasting performances over those offered by forecasts from just one model. The perception that model instability is an important determinant of forecasting performances, and a potential reason for combining forecasts from different models, started with Bates and Granger (1969), and was further supported by Diebold and Pauly (1987) and Pesaran and Timmermann (2007). Later, the combination of
probability forecasts of a binary variable defined on the [0, 1] interval appeared, when Kamstra and Kennedy (1998) introduced a method of combining log-odds ratios using logit regressions. Further development in this area was carried out by Riedel and Gabrys (2004), who generated multilevel forecasts, and Clements and Harvey (2010), who compared several different methods for combining probability forecasts.
The motivation for this paper is to explore the possible economic gains accruing to a portfolio of predicted takeover target companies. The forecasts are estimated using a combination of probability forecasts generated by established takeover prediction models. It is anticipated that, by combining the forecasts from various individual models, a portfolio of targets will be created that will consistently achieve abnormal returns and lower misclassification rates. This research contributes by showing that a good and consistent forecast accuracy can be achieved when predicting potential takeover targets using forecast combinations both from a number of panel data logistic regression models and from neural network models. Its assessment of the financial gains from the proposed modelling approach is innovative, as is its observation of model consistency over time. Further, this study extends previous research by analysing a wide range of companies over a decade and including new explanatory variables for takeover prediction.
The background of takeover prediction research is summarized in the next section. In Section 3, takeover hypotheses and their corresponding explanatory variables are discussed. Section 4 outlines the data used in the study, while the design of the forecasting strategy that includes the combination of forecasts is detailed in Section 5. Section 6 contains the results, with conclusions following in Section 7.
2. Background
The theoretical background of the takeover prediction literature relies on hypotheses arising from the Market for Corporate Control. This theory assumes that takeovers can be predicted using published financial data, and includes factors which are hypothesised to increase the probability of a takeover announcement, such as inefficient management and a growth resources mismatch. Barnes (2000) explains that, although there may be many reasons for mergers, the targets are not selected arbitrarily. Instead they arise from a bidding company’s desire to gather benefits from a takeover or merger. The proposed and evidenced theories explaining the reasons behind takeovers include profitability (Hogarty, 1970), economies of scale (Silbertson, 1972), market power (Sullivan, 1977; Thomadakis, 1976), information signaling (Bradley, Desai, & Kim, 1983), and management efficiency (Jensen & Ruback, 1983). In particular, researchers have found financial synergy to be a strong motive for mergers (Gahlon & Stover, 1979). However, each individual takeover has a specific rationale, and, due to its complexity, the finance literature has been unable to come up with a catch-all
model to anticipate these events. It follows that an important challenge for researchers who are attempting to forecast targets is the issue of identifying the most appropriate model or models. From a theoretical perspective, knowing the reasons behind a takeover bid should prove useful and provide a key to understanding merger and acquisition dynamics and motivations. As a consequence, the economic benefit derived from the management of a portfolio of forecasted targets depends critically on the accuracy of the predictions from the forecasting model utilized. An assortment of models has been applied in the past in an attempt to identify common characteristics of different takeover targets. They include univariate analysis by Harris, Stewart, Guilkey, and Carleton (1982), multiple discriminant analysis by Stevens (1973) and Barnes (1998), logit analysis by Meador, Church, and Rayburn (1996), and neural networks by Cheh, Randy, and Ken (1999) and Denčić-Mihajlov and Radović (2006). Stevens (1973) defended multiple discriminant analysis as a model that was well suited to many financial problems where the dependent variable is dichotomous. However, most of the studies conducted in the 1980s and 1990s switched to logistic regression models for predicting takeover targets. Dietrich and Sorensen (1984) were the first to apply logistic regression to bankruptcy prediction following the article by Ohlson (1980). Palepu (1986) was able to formally improve both the validity and the consistency of the prediction procedure by analysing the influence of the cut-off probability on the predictability rate. Since then, the research has concentrated on the development of alternative methods, in order to determine optimal cut-off probabilities, and thus reduce the misclassification error. 
The end of the 1990s saw the emergence of additional methodological improvements, such as the profit maximization criterion proposed by Barnes (1999), and the use of a standard feed-forward backpropagation neural network model by Cheh et al. (1999). The classification models reported in the literature have demonstrated varying degrees of success, with predictive accuracies of up to 90% better-than-chance in-sample, and accuracies ranging from below 50% to around 120% better-than-chance out-of-sample. For example, Powell (1995)’s best results were achieved by the use of multinomial models, with an overall reported prediction accuracy of 4.76%. The work of Powell (2004) showed an even better rate of success, with up to 12% accuracy using multinomial models. Barnes (1999) claimed an accuracy of 2.5% (or 97.33% better than chance) when using a logit regression, while Stevenson and Peat (2009) used a combined logistic model to achieve results which were up to 118% better-than-chance. However, such abilities to generate abnormal returns have since been questioned by many authors, who have been unable to replicate the results of previous studies when applying the proposed methodologies in different markets or periods. In contrast to the classification abilities claimed by many studies, empirical applications of the models have generally failed to confirm the out-of-sample predictive expectations formed from in-sample results.
3. Takeover hypotheses and explanatory variables
The literature on the Market for Corporate Control presumes that targets can be forecasted using mainly publicly available data. The crucial question, however, is whether future economic events, including takeovers, can in fact be predicted without the market presence of inside information. Barnes (1998) expressed the view that, while these events cannot normally be predicted, some of them may at least be anticipated. Earlier studies centred on motivations for corporate mergers and acquisitions. The variables explained below and used in takeover target prediction models point to these motivations. As a consequence, the use of operational and financial characteristics of target firms, along with accounting and market data, has become commonplace in recent studies of the identification and prediction of takeover events.
From the several theories which are purported to explain firm acquisition, eight main hypotheses have been formulated. The explanatory variables identified based on these hypotheses are suggested as covariates for inclusion in the predictive models. These variables are collected at the firm level, as well as from within industry and market categories. The resultant number of variables is thirty-five, and the full list of hypotheses with their respective proxy variables is described below.
H1: Inefficient management
This hypothesis is based on the Market for Corporate Control theory, which states that inefficiently managed firms will be acquired by more efficient firms in order to increase capital gains. Therefore, companies which are managed inefficiently are more susceptible to poor performance and acquisition.
Accordingly, the explanatory variables suggested as proxies for this hypothesis include:
V1 ROA (EBIT/total assets − outside equity interests)
V2 ROE (net profit after tax/shareholders equity − outside equity interests)
V3 EBIT/operating revenue
V4 Dividend/shareholders equity
V5 Asset turnover (net sales/total assets)
V6 Growth in EBIT over past year
V7 Growth in EBIT over past three years
V8 Growth of 1-year total assets
V9 Growth of 3-year total assets
V10 Inventory/working capital
V11 Inventory/total assets
V12 Net profit/market value.
H2: Undervaluation
There is agreement across most studies that the greater the level of undervaluation, the greater the likelihood that a firm will be acquired. Undervalued stocks are seen as a bargain in the market, especially from overvalued entities. The explanatory variables suggested by this hypothesis are:
V13 Market to book ratio (market value of securities/net assets)
V14 Market capitalisation/total assets.
H3: Price to earnings ratio
The price-to-earnings (P/E) ratio is closely linked to the undervaluation and inefficient management of a company. The earnings of a firm with a low P/E ratio will be valued at the multiple of the acquirer, allowing an immediate gain to be realised. Consequently, a high P/E ratio will decrease the likelihood of acquisition. Thus, the P/E ratio is a likely candidate for inclusion in models.
V15 Price/earnings ratio.
H4: Growth resource mismatch
Acquisition will create opportunities for a better allocation of the target firm’s resources to generate profitable investments. Firms which possess low growth/high resource combinations, or, alternatively, high growth/low resource combinations, will have an increased likelihood of acquisition. However, the explanatory variables used to examine this hypothesis capture growth and resource availability separately. The explanatory variables suggested by this hypothesis are:
V16 Growth in sales (operating revenue) over the past year
V17 Growth in total sales over three years
V18 Capital expenditure/operating revenue
V19 Quick assets (current assets − inventory)/current liabilities
V20 Invested capital turnover
V21 Long term asset turnover
V22 Working capital turnover.
H5: Dividend payout
The behaviours of some firms in paying out less of their earnings in order to maintain enough financial slack (retained earnings) leads to a higher growth potential and, consequently, a greater market value. It is assumed that low payout ratios will lead to an increased likelihood of acquisition. The explanatory variables suggested by this hypothesis are:
V23 Dividend payout ratio
V24 Dividend yield
V25 Dividend per share/earnings per share.
H6: Inefficient financial structure
The rectification of capital structure problems is also a motivation for takeovers, given that increases in debt demand a greater return on equity. A high leverage will lead to an increased likelihood of acquisition. The explanatory variables for this hypothesis are:
V26 Net interest cover (EBIT/interest expense)
V27 Net debt/cash flow
V28 Growth in net debt over past year
V29 Growth in net debt over past three years
V30 Current assets/current liabilities.
H7: Merger and acquisition activity
The more important industry sectors in the economy and the most traded companies will attract more investments, and, as a result, will create more opportunities for mergers and acquisitions. The use of a dummy variable for the mining industry was a natural choice, given its significant representation in the sample of takeover announcements. The explanatory variables for this hypothesis are:
V31 Industry dummy variable for companies from the mining industry
V32 Dummy variable indicating company listing on the ASX300¹ in that year.
H8: Size
There are two rationales underlying this hypothesis. The first states that smaller firms will have greater likelihoods of acquisition, because larger firms are generally exposed to fewer bidding companies with sufficient resources to acquire them. In that case, it follows that size has a negative effect on the probability of acquisition. The second proposes a positive relationship between size and takeover likelihood, based on the assumption that managers would prefer larger, rather than smaller, acquisitions, in order to increase the size of the company. Both lines of argument are tested using the variables below. The second rationale prevails in the model estimations, with the variables’ coefficients being positive in all samples.
V33 Log (total assets)
V34 Market capitalisation
V35 Sales and revenues.
4. Data
The sample collected includes financial data from all listed companies on the Australian Stock Exchange (ASX) for 13 years, spanning the financial years from 1999 to 2011. It includes their respective accounting, market and historical takeover data. The dataset is divided into 12 panels, each corresponding to one financial year.² The financial year 2009 is used as an out-of-sample data period for the first in-sample panel estimation period from 1999 to 2008. It follows that 2010 and 2011 are the out-of-sample data periods corresponding to the second (1999–2009) and third (1999–2010) in-sample estimation periods, respectively. The out-of-sample data are used to evaluate the takeover forecast accuracy based on the estimation and cut-off probabilities from their respective in-sample periods. The estimation and one year prediction for three distinct periods allows the stability of the models to be verified under changing economic conditions.
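The three estimation and hold-out periods described above follow an expanding-window pattern, which can be sketched in a few lines of Python. This is only an illustration of the sample design; the names `Window` and `expanding_windows` are hypothetical and do not come from the paper.

```python
from typing import List, NamedTuple


class Window(NamedTuple):
    in_sample: List[int]   # financial years used for estimation
    out_of_sample: int     # financial year held out for forecasting


def expanding_windows(first_year: int, first_holdout: int, last_holdout: int) -> List[Window]:
    """Build expanding estimation windows, each followed by a one-year hold-out."""
    windows = []
    for holdout in range(first_holdout, last_holdout + 1):
        # Every year from first_year up to (but excluding) the hold-out year is in-sample.
        windows.append(Window(list(range(first_year, holdout)), holdout))
    return windows


# Years taken from the text: estimation starts in 1999, hold-outs are 2009, 2010 and 2011.
windows = expanding_windows(first_year=1999, first_holdout=2009, last_holdout=2011)
for w in windows:
    print(f"estimate on {w.in_sample[0]}-{w.in_sample[-1]}, forecast {w.out_of_sample}")
```

Each window reuses all earlier years, so the estimation sample grows by one financial year per step while the forecast horizon stays at one year.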
The main sources used to collect the financial and corporate information are the AspectHuntley and Connect4 databases. The first of these databases contains the available published financial information of all listed companies in Australia, including industry classifications and a complete list of financial variables and ratios.³ Connect4 has historical records of Australian companies’ takeover bids, with their respective dates and the details of the transactions.
1 The ASX300 index is a market-capitalisation weighted and float-adjusted stock market index of Australian stocks listed on the Australian Securities Exchange from Standard & Poors. The ASX300 index incorporates all of the companies in the top 200, together with an additional 100 smaller companies, making a total of about 300 companies.
2 In Australia, the financial year extends from July 1 of the previous year until June 30 of the year under consideration. For example, the financial year 2009 includes the period from 01/07/2008 to 30/06/2009. In the case of the financial year 2011, the data set is truncated after 31/03/2011.
3 Descriptive statistics of all of the variables used in this study can be provided on request.
5. Forecasting strategy
It has not yet been demonstrated in the literature that such a complex problem as takeover prediction can be solved efficiently using only one forecasting model. Rather, it requires a more robust approach. The discrete choice modelling framework proposed in this paper is divided into three segments. Firstly, a logistic regression and two other specifications of panel data logistic models are estimated, with each assuming a different time relationship between the variables. Secondly, three architectures of feed-forward neural networks are trained to forecast takeover likelihoods using the same database. Last of all, the KK combination method is used to combine the forecasts from the previous models. In theory, the neural network models should be more efficient in generating predictions, given their associated high complexity and computational intensity. However, the transparency of the logistic models in relation to variable selection and time structure adds flexibility for the researcher to adapt the model. The takeover literature provides compelling arguments and results in favour of both types of models, but two points that are still untested are the use of forecast combinations for improving the prediction accuracy of takeover announcements, along with the behaviour of each model over different time periods.
5.1. Logistic models
M1: Logistic regression
The first modelling procedure used is the logistic regression model, which is commonly utilised for dichotomous state variable problems. The model is given by Eqs. (1) and (2) below.
$$P_i = E(Y = 1 \mid X_i) = \frac{1}{1 + e^{-Z_i}}, \tag{1}$$
$$L_i = \ln\left(\frac{P_i}{1 - P_i}\right) = Z_i = \beta_0 + \beta_1 X_{1i} + \cdots + \beta_k X_{ki}, \tag{2}$$
where $P_i$ is the probability of company $i$ being taken over, $\beta_0$ is the intercept, and each $\beta_k$ ($k = 1, \ldots, N$) is the coefficient corresponding to the financial variable $X_k$. The logistic regression model was developed to overcome the rigidities of the linear probability model in the presence of a binary dependent variable. Eqs. (1) and (2) show the existence of a linear relationship between the log-odds ratio and the explanatory variables. However, the relationship between the probability of the event and the acquisition likelihood is non-linear. This non-linear relationship has a major advantage, in that it measures the change in the probability of the event as a result of a small increment in the explanatory variables. However, the incremental impact of a change in an explanatory variable on the likelihood of the event is compressed, requiring a large change in the explanatory variables to change the classification of an observation.
M2: Panel data mixed effects logistic regression
Panel data models make the most of the data on hand, having the ability to analyse the relationship between variables within a time dependent structure simultaneously. Although these models have structures similar to that of the logistic regression model, the panel structure allows the historical records for each variable to be considered in the estimation procedure. The mixed-effects logistic regression adds other components to the panel structure by estimating both fixed effects and random effects. The presence of fixed effects captures the effect of all of the unobserved time-invariant factors that influence the dependent variable.⁴ In contrast, the random effects capture the intra-panel correlation. That is, observations in the same panel (year) are correlated, because they share common panel-level random effects. The fixed effects are estimated directly as an additional regressor, and the random effects take the form of either random intercepts or random coefficients.
An important characteristic of such models is the grouping structure of the data. It consists of multiple levels of nested groups that allow for one or more levels. In this study, the two-level model assumes that industries are the first level and companies the second level. Companies are nested within industries, and random effects are unique to companies within an industry. Assuming that company effects would be nested within industries is natural, as companies are generally unique to industries.
$$L_{ij} = \ln\left(\frac{P_{ij}}{1 - P_{ij}}\right) = Z_i = \beta_0 + \beta_1 X_{ijk} + Z_{ijk} u_i + \varepsilon_{ij}. \tag{3}$$
In the model given by Eq. (3) above, $i = 1, \ldots, M$ panels (years), with each panel $i$ consisting of $j = 1, \ldots, N$ observations. In a two-level panel, $k = 1, \ldots, L$ corresponds to the industry sectors, while the $X_{ijk}$ are the covariates for the fixed effects that quantify a general mean process for the company $j$ in the panel $i$. The covariates corresponding to the random effects are given by $Z_{ijk}$, and can be used to represent random intercepts and random coefficients.⁵ The random effects, $u_i$, are not estimated directly as model parameters, but instead are summarized according to the unique elements of the covariance matrix. The errors $\varepsilon_{ij}$ are distributed as logistic, with mean zero and variance $\pi^2/3$, and are independent of the $u_i$.
M3: Panel data crossed effects logistic regression
This model inherits the same structure as the previous panel data model, but with a different approach to the random structure. While it is safe to assume that all mixed-effects models will contain nested random effects, it makes sense in this analysis to assume that the random effects are crossed rather than nested. This means that the random effects are the same regardless of the industries. The panel data crossed effects logistic model with the $j$th company within the $i$th panel in the $k$th industry is given by:
$$L_{ij} = \ln\left(\frac{P_{ij}}{1 - P_{ij}}\right) = Z_i = \beta_0 + \beta_1 X_{ijk} + Z_{ij} u_i + \varepsilon_{ij}, \tag{4}$$
where $X_{ijk}$ are the covariates for the fixed effects and $Z_{ij}$ those for the random effects.
All three of the logistic models in this subsection (M1–M3) are estimated by maximum likelihood. The selected logit models are the result of elaborate model search/specification procedures. In each case, we choose the best variables from the list of 35 candidates. This is one reason to suppose that the in-sample fits will differ from their out-of-sample counterparts. Although all of the proposed variables are tested, not all of them are used in the final specification for each model. The selection of variables is done by following a backward stepwise procedure during the model estimation procedure in-sample. This involves starting with all candidate variables and testing the deletion of each variable based on its significance level in the model. Only variables with p-values lower than 0.2 stay in the model and are used for generating the predictions out-of-sample. Consequently, each model specification, and year, will be based on a different set of variables. The models are estimated over the entire sample, with the last year being used as a holdout period for creating the forecast out-of-sample. For example, the last sample uses data from the financial years 1999–2010 to estimate the model parameters, and forecasts takeover targets for 2011 (out-of-sample).
4 For this reason, it is referred to as the unobserved heterogeneity, or company effect, and represents all factors that affect the takeover announcements which do not change over time.
5 See Rabe-Hesketh, Skrondal, and Pickles (2005) for further explanations.
5.2. Neural network models
Logistic regression is the most commonly used technique in the takeover prediction literature. However, such parametric models require a pre-specified functional relationship between the dependent and independent variables. This is difficult to validate in many empirical studies, due to the complexity of the problem and the relationships between variables. The advantages of neural networks over conventional methods of analysis lie in their ability to analyse complex patterns quickly with a high degree of accuracy, along with the fact that no assumptions are made about the nature of the underlying distribution of the data. However, as was explained by Denčić-Mihajlov and Radović (2006), the limitations of this model lie in its inability to explain the relative importances of the inputs separately, as well as the requirement to have a sufficiently large dataset to be able to train, validate and generalize the network.
Neural networks consist of a large number of processing elements, known as neurons. At the input level, they are represented by a weighted sum that is squashed by a non-linear function. The squashing function maps a set of input–output values by finding the best possible approximation to the function. This approximation is coded in the neurons of the network using weights that are associated with each neuron. The weights are calculated using a training procedure, during which examples of input–output associations are successively exposed to the network. After each iteration, the weights are updated so that the network starts to mimic the desirable input–output behaviour. Due to its structure, the feed-forward neural network uses parallel processing to capture complicated non-linear relationships between the
dependent and independent variables. The neural network output is specified as:
$$y = v_0 + \sum_{j=1}^{N_H} v_j \, g(w_j^{T} X), \tag{5}$$
where $X$ represents the inputs (explanatory variables), $w_j$ is the weight vector for the $j$th hidden node, $v_0, v_1, \ldots, v_{N_H}$ are the weights for the output node and $y$ is the output (dependent variable). The function $g$ represents the hidden node output and is given, in this study, in terms of the logistic and hyperbolic tangent squashing functions.
Specifying the architecture of the net determines the network complexity and is a critical task in the process of fitting a neural network. If the size of the network architecture is not controlled adequately, the network can easily overfit the data in-sample, resulting in poor out-of-sample forecasts. Unfortunately, no clear rule has yet been developed for determining the optimal number of hidden nodes. Usually, the number of nodes is determined empirically by trial-and-error, by selecting the number that produces the best in-sample result. In theory, a single hidden layer feedforward neural network can approximate any nonlinear function to an arbitrary degree of accuracy with a suitable number of hidden neurons (White, 1992). For this study, we selected a feed-forward backpropagation neural network with one hidden layer and the choice of logistic-sigmoid and tangent-sigmoid activation functions. The models were trained using between one and thirty-five neurons. In general, the models with the higher numbers of neurons resulted in over-specification in-sample and a low ability to forecast out-of-sample. The following architectures achieved the best results in-sample, and were therefore selected as the models.
M4: 1 hidden layer, 10 neurons, logistic-sigmoid squashing (activation) function.
M5: 1 hidden layer, 3 neurons, tangent-sigmoid squashing function.
M6: 1 hidden layer, 4 neurons, tangent-sigmoid squashing function.
Even though a neural network can yield a set of coefficients, it cannot provide logical descriptions, or cause–effect relationships.
As a consequence, all of the variables suggested in hypotheses H1–H8 are used to train the neural networks and generate predictions one year ahead. The use of neural network models requires the sample of companies to be divided into three parts: a training set, a validation set, and a prediction set. These sub-samples are selected by grouping a large part of the sample in the training set, validating the model during one year, and predicting takeover targets one year ahead (out-of-sample). This experimental design aims to facilitate a comparison of the results with those of the logistic models, which also have a one year forecast horizon, by allowing for the production of forecast combinations of all models at a later stage. For example, the first sample allocated the financial years from 1999 to 2007 to the training set, used the year 2008 for validation, and generated predictions for 2009.
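To make the single-hidden-layer output of Eq. (5) concrete, the following is a minimal Python sketch of the forward pass only. It assumes the weights are already known; the backpropagation training step is not shown, and the function names and all numeric values are made up for illustration rather than taken from the paper.

```python
import math
from typing import Callable, Sequence


def logistic(a: float) -> float:
    """Logistic-sigmoid squashing function."""
    return 1.0 / (1.0 + math.exp(-a))


def network_output(
    x: Sequence[float],
    hidden_weights: Sequence[Sequence[float]],  # one weight vector w_j per hidden node
    output_weights: Sequence[float],            # v_1, ..., v_NH
    v0: float,
    g: Callable[[float], float] = math.tanh,    # tangent-sigmoid by default
) -> float:
    """Evaluate y = v0 + sum_j v_j * g(w_j^T x), as in Eq. (5)."""
    y = v0
    for w_j, v_j in zip(hidden_weights, output_weights):
        activation = sum(w * xi for w, xi in zip(w_j, x))  # w_j^T x
        y += v_j * g(activation)
    return y


# Hypothetical network with 3 hidden nodes (the size used by M5) and 2 inputs.
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = [0.2, -0.1, 0.4]
y_tanh = network_output([0.5, -1.0], hidden, out, v0=0.1)
y_logistic = network_output([0.5, -1.0], hidden, out, v0=0.1, g=logistic)
print(y_tanh, y_logistic)
```

Swapping `g` between `math.tanh` and `logistic` mirrors the choice between the tangent-sigmoid and logistic-sigmoid architectures; everything else in the forward pass is unchanged.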
5.3. Forecast combinations
High levels of misclassification are of great concern when using probabilistic predictive models for takeover predictions. This is especially relevant when costly Type II errors occur; that is, when non-targets are predicted to be targets. Practical experience has shown that the best model in-sample might not be the most accurate when forecasting future values. This gives rise to one of the main objectives of this study, which is to improve the accuracy of the prediction of takeover announcements by introducing the methodology of probability forecast combinations. Although forecast combination has been proven to be an effective methodology in many other forecast applications, to the best of our knowledge it has not previously been used in the takeover prediction literature.
The methodology consists of combining the predictions obtained from different forecasting models using an aggregation function. The forecast combination methodology accounts for the diversity in the underlying forecasting models, instead of focusing on the narrow specification from one model. Timmermann (2006) documented that forecast combinations are often superior to their constituent forecasts. In our study, the combined forecast is the output of a function that gathers the results from a number of takeover prediction models, using the neural network and logistic modelling approaches as inputs. The use of the unique non-linear relationships between takeover targets and explanatory variables captured by each single model and used as inputs in the construction of a forecast combination represents the key difference between this methodology and an individual model forecast.
The forecast combination method chosen for this study is the established KK combination, given the flexibility in its structure for dealing with logistic functions. It basically uses the output from other models as the input in a function. The output of this model is a vector of combined forecasts.
The method attributes weights (coefficients) to each of the inputs, and, as was pointed out by Kamstra, Kennedy, and Suan (2001), the weights show the contribution of each corresponding forecast input to the final forecast. The key point in the determination of the weights is the choice of the combination function. In this study, a logistic regression is used to determine the optimal weights for combining each of the forecasts, and, based on the model estimations, predict takeover targets one year ahead. This methodology was first presented by Kamstra and Kennedy (1998), and is known as ‘KK combination’ in the forecast combination literature. This is a simple methodology for combining forecasts in order to lessen the bias. The main advantage of the methodology is that it confines the resulting forecasts to the unit interval while permitting unrestricted coefficient and intercept values. The KK methodology is specified in Box I. The combination method advocates the use of log-odds ratios as the input to a logistic regression. Ci is the probability of company i being taken over, cons is the intercept and W1 –W6 are the weights for each input, which are estimated from the logistic regression by maximum likelihood. The vectors M1–M6 contain the probability forecasts from each specific model. M1–M3
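$$
C_i = \frac{\exp\left(cons + W_1 \ln\frac{M1_i}{1-M1_i} + W_2 \ln\frac{M2_i}{1-M2_i} + \cdots + W_6 \ln\frac{M6_i}{1-M6_i}\right)}{1 + \exp\left(cons + W_1 \ln\frac{M1_i}{1-M1_i} + W_2 \ln\frac{M2_i}{1-M2_i} + \cdots + W_6 \ln\frac{M6_i}{1-M6_i}\right)} \tag{6}
$$
Box I.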
represent the vectors of predicted probability forecasts from the logistic models, while M4–M6 refer to the probability forecasts from the respective neural network models. The result is found in the vector of combined forecasts, C. Overall, the aim behind the use of such a variety of models is to capture different non-linear relationships among the variables in order to improve the robustness of the forecast. The forecast combination literature typically assesses the out-of-sample accuracy of combinations whose weights have been determined in-sample. Maintaining that consistency, the logistic model is estimated by maximum likelihood, and a hold-out period of one year is used to generate predictions out-of-sample.

5.4. Forecast benchmark

As a means of comparison, two benchmark methodologies are estimated. The first is commonly referred to as Linear combination. This form of regression is estimated by applying OLS to Eq. (7):

$$
LC_i = cons + \beta_1 M1_i + \beta_2 M2_i + \cdots + \beta_6 M6_i. \tag{7}
$$
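The practical difference between Eqs. (6) and (7) can be illustrated with a minimal sketch. It assumes the weights are already known; the forecasts and weights below are hypothetical values chosen for illustration, whereas in the study the weights are estimated (by maximum likelihood for the KK combination and by OLS for the linear combination). The function names are likewise illustrative.

```python
import math

def log_odds(p):
    """Log-odds ln(p / (1 - p)) of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def kk_combination(cons, weights, forecasts):
    """Eq. (6): logistic function of a weighted sum of log-odds."""
    z = cons + sum(w * log_odds(m) for w, m in zip(weights, forecasts))
    # 1 / (1 + exp(-z)) is algebraically equal to exp(z) / (1 + exp(z)).
    return 1.0 / (1.0 + math.exp(-z))

def linear_combination(cons, betas, forecasts):
    """Eq. (7): weighted sum of the raw probability forecasts."""
    return cons + sum(b * m for b, m in zip(betas, forecasts))

# Hypothetical probability forecasts for one company from three models.
forecasts = [0.90, 0.85, 0.95]
# Hypothetical intercept and weights (assumptions, not estimates from the paper).
cons, weights = 0.5, [0.8, 0.6, 0.7]

c_kk = kk_combination(cons, weights, forecasts)       # always inside (0, 1)
c_lin = linear_combination(cons, weights, forecasts)  # may leave the unit interval

print(f"KK combination:     {c_kk:.4f}")
print(f"Linear combination: {c_lin:.4f}")
```

With the same intercept and weights, the KK output stays inside the unit interval while the linear output does not, which is the property the text attributes to each method. With a zero intercept and a single weight of one, the KK function simply returns the input probability.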
LCi represents the probability of company i being taken over, cons is the intercept, β1 –β6 are the coefficients for each input, and, as before, the vectors M1–M6 are the probability forecasts from each individual model. This method was suggested as a general form for the combination of point forecasts by Clements and Harvey (2011). However, it does not ensure that the predicted output from the model lies in the unit interval. The second methodology for comparison is the Chance criterion. It basically calculates the probability of picking a takeover target by selecting a stock listed on the ASX blindly without any prior information about the company. Under this method, all traded stocks are classified as targets, and consequently, all companies that were takeover targets during the period are considered as correctly predicted targets. This is a very naive method that is used as a bottom line for model useability in many takeover prediction studies, such as those of Barnes (1999) and Stevenson and Peat (2009). If a model is unable to outperform the Chance criterion, the investor has a higher probability of selecting takeover targets by picking stocks to include in the portfolio randomly. 5.5. Scoring rule Typically, binary models generate a probability as output. This, in turn, requires the specification of a threshold probability (cut-off) for assessing the classification accuracy of the models. We chose to classify the prediction from each model based on the cut-off probability that provides the highest proportion of correctly predicted targets in the estimation sample. This method is known as the
Maximum Chance Criterion (MCC), and was first used by Barnes (1999). As Barnes explains, minimizing the total error probabilities in takeover predictions is not the same as minimizing the total error costs. This is because the loss functions of Type I and Type II errors are not symmetrical. He proposes that the appropriate cut-off point for the identification of takeover targets is the probability cut-off that maximizes returns. This is determined by maximizing the estimated returns obtained from investing in takeover targets compared to investing in non-targets. The MCC recognizes that the penalty of misclassifying a non-target firm as a target (Type II error) is significantly larger than misclassifying a target as a non-target (Type I error). The cut-off probability refers to the probability, p, that maximizes the ratio presented in Eq. (8) below. The maximization of this function, which measures the accuracy rate, is based on the assumption that the proportion of correctly predicted targets is directly related to the returns of the portfolio.

$$
\text{Cut-off}(p) = \max_{p}\left(\frac{\text{Correctly Predicted Targets}}{\text{Predicted Targets}}\right). \tag{8}
$$
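The search implied by Eq. (8) can be sketched as a scan over candidate cut-offs in the estimation sample. This is only an illustration on made-up data: the function names and the `min_predicted` floor are assumptions introduced here (the floor guards against the degenerate case in which a single predicted target gives a ratio of one), and the paper does not state how ties or very small portfolios are handled.

```python
def accuracy_rate(probs, is_target, cutoff):
    """Correctly predicted targets divided by predicted targets at a cut-off."""
    flagged = [t for p, t in zip(probs, is_target) if p >= cutoff]
    return sum(flagged) / len(flagged) if flagged else 0.0

def best_cutoff(probs, is_target, min_predicted=1):
    """Scan the observed probabilities and keep the cut-off with the highest ratio.

    Ties are resolved in favour of the lower cut-off (an assumption),
    which keeps more companies in the portfolio.
    """
    best = None
    for cutoff in sorted(set(probs)):
        n_predicted = sum(p >= cutoff for p in probs)
        if n_predicted < min_predicted:
            continue
        rate = accuracy_rate(probs, is_target, cutoff)
        if best is None or rate > best[1]:
            best = (cutoff, rate)
    return best

# Hypothetical in-sample probabilities and outcomes (1 = actual target).
probs = [0.05, 0.10, 0.20, 0.30, 0.40, 0.60]
is_target = [0, 0, 1, 0, 1, 1]

print(best_cutoff(probs, is_target))                   # unconstrained search
print(best_cutoff(probs, is_target, min_predicted=3))  # at least 3 predicted targets
```

The cut-off found in-sample would then be applied unchanged to the out-of-sample probabilities, as described in the text that follows.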
Any company with an assigned probability equal to or higher than the cut-off probability is classified as a takeover target. Deriving the cut-off probability using the Maximum Chance Criterion sets the threshold within the decision context of selecting a parsimonious number of predicted targets in the portfolio. Other common score methods were also tested in this research, such as the Brier score and the Logarithm Probability score. Although they generated similar accuracy results for the same set of probabilities, the number of predicted targets was considerably larger, leading to increased costs in managing the portfolio of stocks. This research uses the best cut-off probability estimated in-sample to classify the out-of-sample forecast. As the forecast horizon is moved forward in time, we generate out-of-sample forecasts by updating the model parameter estimates in-sample. The forecasts are then evaluated out-of-sample based on the accuracy rate.

6. Results

The results from this study are reported in two interrelated sections. The first section analyses the performances of the individual and combined forecasts for predicting takeover announcements. The second section is concerned with assessing the economic usefulness of portfolios made up of the predicted targets from the single models and combined forecasts.

6.1. Performance analysis

The accuracy rate is the only score rule which is used to measure the performances of the individual classification
Table 1
The models' accuracies in-sample and out-of-sample.
(M1–M3: logistic models; M4–M6: neural network models; C1: KK combination; Linear: Linear combination benchmark; Chance: Chance criterion benchmark.)

                                 M1       M2       M3       M4       M5       M6       C1   Linear   Chance

Sample 1999–2009
In-sample: 1999–2008
Classified targets              190      286      653      138      147       51      457      238    14132
Correctly classified             78       98      220       26       32       16      174       12      566
Incorrectly classified          112      188      433      112      115       35      283      226    13566
Accuracy in-sample           41.05%   34.27%   33.69%   18.84%   21.77%   31.37%   38.07%    5.04%    4.01%
Out-of-sample: 2009
Classified targets               23       14       25       26       27       15       19       87     1948
Correctly classified              3        2        3        2        2        2        3        3       57
Incorrectly classified           20       12       22       24       25       13       16       84     1891
Accuracy out-of-sample       13.04%   14.29%   12.00%    7.69%    7.41%   13.33%   15.79%    3.45%    2.93%

Sample 1999–2010
In-sample: 1999–2009
Classified targets              315      840     1177      192      378      290      374      448    16080
Correctly classified            117      230      342       20       51       21      146       18      623
Incorrectly classified          198      610      835      172      327      269      228      430    15457
Accuracy in-sample           37.14%   27.38%   29.06%   10.42%   13.49%    7.24%   39.04%    4.02%    3.87%
Out-of-sample: 2010
Classified targets               42       47       40       30       40       36       40      137     1924
Correctly classified              5        3        3        3        5        4        9       11       75
Incorrectly classified           37       44       37       27       35       32       31      126     1849
Accuracy out-of-sample       11.90%    6.38%    7.50%   10.00%   12.50%   11.11%   22.50%    8.03%    3.90%

Sample 1999–2011
In-sample: 1999–2010
Classified targets              253     1411     2587      411      211      327      110      563    18004
Correctly classified             53      321      559       34       62       37       42       30      698
Incorrectly classified          200     1090     2028      377      149      290       68      533    17306
Accuracy in-sample           20.95%   22.75%   21.61%    8.27%   29.38%   11.31%   38.18%    5.33%    3.88%
Out-of-sample: 2011
Classified targets               34       87      166       42       33       40       18      168     1949
Correctly classified              4        7       12        7        6        5        6        6       94
Incorrectly classified           30       80      154       35       27       35       12      162     1855
Accuracy out-of-sample       11.76%    8.05%    7.23%   16.67%   18.18%   12.50%   33.33%    3.57%    4.82%
models and the forecast combination method in predicting takeover targets. It is calculated by taking the ratio of the number of correct predictions to the number of predicted takeover targets in the data set. The better the predictive power of a model, the higher the ratio, since it estimates the percentage of predicted targets that turn out to be actual targets. As we are interested in forecasting, we concentrate on the out-of-sample results.6 Table 1 presents the accuracies for the logistic models (M1–M3), neural network models (M4–M6), the KK combination model and the benchmark methods (Linear combination and Chance criterion), both in-sample and out-of-sample. The lines indicating classified targets contain the number of predicted target companies from each model for both the in-sample and out-of-sample periods. Similarly, the ‘‘correctly classified’’ lines refer to the numbers of successfully predicted takeover offers, while the ‘‘incorrectly classified’’ lines contain the numbers of companies misclassified by each model. For the group of logistic models (M1–M3), we observe that an increase in model complexity does not necessarily result in better forecasts. In the first two in-sample estimation periods, the standard logistic specification (M1) has a higher level of accuracy than the more complex mixed and crossed effects models (M2 and M3, respectively). However, this characteristic is reversed somewhat in the third in-sample estimation period. The simplest model, M1, was the most consistent out-of-sample and the most accurate among the logistic models for the financial years 2010 and 2011. For 2009, however, the mixed model, M2, was preferred, with an accuracy rate of 14.29%. As would be expected, the accuracy levels are reduced markedly for the logistic models in the out-of-sample periods.

6 Model estimations for the logistic-based models M1, M2 and M3, along with the KK combination model, C1, are reported in Appendix A. Specific details of the neural network models are available from the authors on request.
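As a small worked example of the accuracy rate, the sketch below recomputes the out-of-sample 2011 rates from the counts in Table 1 and ranks the models. The dictionary layout and names are illustrative choices; only the counts come from the table.

```python
# Out-of-sample 2011 counts from Table 1: (classified targets, correctly classified).
counts_2011 = {
    "M1": (34, 4), "M2": (87, 7), "M3": (166, 12),
    "M4": (42, 7), "M5": (33, 6), "M6": (40, 5),
    "C1 (KK)": (18, 6),
    "Linear combination": (168, 6),
    "Chance criterion": (1949, 94),
}

def accuracy_rate(classified, correct):
    """Share of predicted targets that actually received a takeover offer."""
    return correct / classified

rates = {name: accuracy_rate(n, k) for name, (n, k) in counts_2011.items()}
chance = rates["Chance criterion"]

# Rank from most to least accurate and express each rate as a multiple of chance.
for name, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<20}{100 * rate:6.2f}%  ({rate / chance:.1f}x chance)")
```

Expressing each rate as a multiple of the Chance criterion makes the comparison with random stock picking explicit.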
Table 2
Out-of-sample returns for the portfolios of predicted targets using the KK combination model and the market benchmark (with a buy-and-hold strategy).

                Forecast combination      Market benchmark
Date                  KK       CAR        ALL ORDS    ASX200    ASX300

2009 (19 companies)
31-Jul-08         −2.50%     2.75%         −5.26%    −4.56%    −4.70%
31-Aug-08         −7.27%    −5.07%         −2.20%    −1.53%    −1.70%
30-Sep-08        −16.85%    −3.69%        −13.16%   −11.79%   −12.04%
31-Oct-08        −22.00%     3.32%        −25.32%   −22.96%   −23.42%
30-Nov-08        −29.46%     1.68%        −31.13%   −28.24%   −28.73%
31-Dec-08        −33.19%    −1.81%        −31.38%   −28.63%   −29.03%
31-Jan-09        −29.74%     5.04%        −34.78%   −32.11%   −32.46%
28-Feb-09        −32.52%     5.65%        −38.18%   −35.87%   −36.19%
31-Mar-09        −30.18%     3.59%        −33.76%   −31.32%   −31.60%
30-Apr-09        −29.75%     0.03%        −29.78%   −27.51%   −27.72%
31-May-09        −25.88%     2.62%        −28.49%   −26.79%   −26.93%
30-Jun-09        −25.23%     0.74%        −25.97%   −24.17%   −24.34%

2010 (40 companies)
31-Jul-09          7.61%    −0.03%          7.64%     7.31%     7.33%
31-Aug-09         18.06%     4.47%         13.58%    13.25%    13.37%
30-Sep-09         25.97%     5.92%         20.05%    19.94%    20.09%
30-Oct-09         31.92%    14.22%         17.71%    17.40%    17.56%
30-Nov-09         27.66%     8.22%         19.45%    18.87%    19.07%
31-Dec-09         29.20%     5.51%         23.68%    23.15%    23.29%
29-Jan-10         28.72%    12.28%         16.44%    15.54%    15.68%
26-Feb-10         24.11%     6.30%         17.82%    17.27%    17.28%
31-Mar-10         32.52%     8.57%         23.94%    23.28%    23.29%
30-Apr-10         36.12%    13.68%         22.44%    21.55%    21.61%
31-May-10         20.01%     7.20%         12.81%    12.00%    12.02%
30-Jun-10         15.00%     5.45%          9.55%     8.76%     8.72%

2011 (18 companies)
31-Jul-10          3.99%    −0.24%          4.22%     4.46%     4.47%
31-Aug-10          4.72%     2.08%          2.64%     2.39%     2.48%
30-Sep-10          9.25%     2.03%          7.22%     6.54%     6.81%
30-Oct-10         17.31%     7.86%          9.45%     8.37%     8.70%
30-Nov-10         17.64%     9.51%          8.13%     6.58%     7.03%
31-Dec-10         18.18%     6.10%         12.07%    10.32%    10.90%
29-Jan-11         15.25%     3.10%         12.14%    10.52%    11.01%
26-Feb-11         23.83%    10.01%         13.82%    12.32%    12.80%
31-Mar-11         19.90%     5.94%         13.96%    12.47%    12.97%
30-Apr-11         16.60%     2.09%         14.51%    13.29%    13.77%
31-May-11         14.94%     4.20%         10.73%     9.46%     9.87%
30-Jun-11         14.53%     6.79%          7.75%     7.12%     7.34%
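The CAR column of Table 2 can be related to the other columns with a short calculation. The sketch below assumes that the cumulative abnormal return is simply the portfolio's cumulative return minus the cumulative return of the ALL ORDS index; this assumption is not stated explicitly in the table, but it reproduces the year-end abnormal returns quoted in the text. Variable and function names are illustrative.

```python
# Year-end cumulative returns (%) from Table 2: (KK portfolio, ALL ORDS).
year_end = {
    2009: (-25.23, -25.97),
    2010: (15.00, 9.55),
    2011: (14.53, 7.75),
}

def abnormal_return(portfolio, benchmark):
    """Cumulative abnormal return, taken here as portfolio minus benchmark."""
    return portfolio - benchmark

car = {year: round(abnormal_return(p, b), 2) for year, (p, b) in year_end.items()}

for year, value in car.items():
    print(f"{year}: CAR = {value:+.2f}%")
```

The same difference could be taken against the ASX200 or ASX300 columns to check how sensitive the abnormal return is to the choice of market proxy.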
In the neural network cases (M4–M6), the three specifications that produced the best results in-sample were then used to predict one year ahead. The model that achieved the best results for the second and third in-sample and out-of-sample periods was M5. Recall that it is a neural network model estimated with one layer containing three neurons and a tangent-sigmoid activation function. It had the highest level of accuracy of any of the single models, with rates of 18.18% out-of-sample and 29.38% in-sample for the financial year 2011. In the first period, the M6 model with one hidden layer and four neurons performed best both in-sample and out-of-sample. Overall, all of the models produced forecasts which were considerably better than those of the Chance criterion, with the neural network models outperforming the logistic models out-of-sample in most cases, especially following the financial crisis that hit during the 2009 financial year. In line with the existing empirical literature,
the results confirm that the neural network models appear to have an advantage over the logistic models, but at the cost of a greater complexity. While the theory offers assistance in the choice of explanatory variables, no one forecasting method consistently dominates the takeover prediction literature. For a given data set, each model has different underlying assumptions, and therefore assigns different probability estimates to each company. This study investigates whether combining these different predictions can result in better forecasts than those obtained by the individual models. The KK combination method takes the output vector from each of the models into consideration when estimating the combined forecast C1. Again, we use the best in-sample cut-off probability to derive the best out-of-sample forecast, and the results are reported in Table 1 in the column headed ‘‘KK combination’’. The combined forecasts outperform both the single models and the benchmark methods,
Table 3
Distribution of portfolio returns by the sub-groups of correctly predicted target companies and misclassified companies (non-targets).

                                         Number of companies   Average return
Out-of-sample: 2009
Correctly classified (actual targets)             3                −22.08%
Incorrectly classified (non-targets)             16                −25.82%
Portfolio (predicted targets)                    19                −25.23%
Out-of-sample: 2010
Correctly classified (actual targets)             9                 61.56%
Incorrectly classified (non-targets)             31                  1.48%
Portfolio (predicted targets)                    40                 15.00%
Out-of-sample: 2011
Correctly classified (actual targets)             6                 43.05%
Incorrectly classified (non-targets)             12                  0.27%
Portfolio (predicted targets)                    18                 14.53%
in all three years, and are more parsimonious in the portfolio selection. Except for the logistic model M1 in the first estimation period, the in-sample estimation of C1 was more accurate than that of any of the logistic or neural network models. It was also more stable over the years, with an accuracy rate of around 38% for the three in-sample periods. However, it was in the out-of-sample forecasts that the combined model particularly distinguished itself from the single models. Its forecast accuracy was higher than those of the other models in all three periods, achieving accuracy rates of 15.79% (2009), 22.50% (2010) and 33.33% (2011). Furthermore, the forecast combination resulted in a better predictive accuracy out-of-sample than the individual models in the first forecast period (2009) when the financial crisis had taken hold of stock markets world-wide. The forecast accuracy from the KK combination was also higher than the benchmark methods, including the linear combination of forecasts. In Appendix B we present the statistical test for equality of proportions among the accuracy rates presented in Table 1. The results confirm that the accuracy rates achieved by the KK combination model out-of-sample are statistically significant for 2010 and 2011. These results suggest that the use of forecast combinations for the prediction of takeover targets in the Australian context is appropriate. The KK combination model significantly outperformed the other models for predictive purposes, as well as being parsimonious with the number of predicted targets. The use of this methodology also reduced the misclassification error. These results contest the claims of Barnes (1999) and Palepu (1986) that models which achieved predictive accuracies greater than chance cannot be implemented. On the other hand, they are consistent with the forecasting literature which shows that forecast combination using the KK method is generally more accurate than either the individual models or the linear combination. The results from the KK combination are stable across the different estimation periods both in-sample and out-of-sample. This further confirms the results of studies such as those of Kamstra and Kennedy (1998) and Kamstra et al. (2001), which propose forecast combination using weights to enhance the performances of single models.
6.2. Economic analysis Although the above methodology provides us with a statistical assessment of model performance, it had nothing to say about the economic usefulness of the model. In order to assess the financial gains from our modelling approach, we use the predicted targets from the combined prediction models to create an equally weighted portfolio. Using this approach, we are able to measure whether the KK combination model for predicting takeover targets was able to earn abnormal returns. The investment strategy consists of adopting a one-year buy-and-hold approach for the portfolio made up of the out-of-sample predictions from the KK combination model. The results are presented in Table 2. The numbers of companies in the portfolio are 19 for 2009, 40 for 2010, and 18 for 2011. The returns from the portfolio are calculated for the three out-of-sample years, that is, the financial years 2009–2011. During the financial year 2009, a period heavily impacted by the financial crisis, the returns of the predicted Australian portfolios were similar to what the market experienced. At the end of that year there was virtually no abnormal return when compared to the ALL ORDS index. Despite the higher predictive accuracy of the combination model, C1, losses in downturn periods are not necessarily reduced compared to the benchmark indexes. However, the results for regular years, such as the financial years 2010 and 2011, indicate that combining the predictions by KK combination not only improves the forecast accuracy, but almost doubles the average market return. The returns from the combination method are significantly higher than the market performance over the last two out-of-sample periods on a month-by-month basis. The final portfolio returns at the end of the financial years 2010 and 2011 were 15% and 14.53%, respectively, representing abnormal returns of 5.45% for 2010 and 6.78% for 2011 relative to the ALL ORDS index. 
The columns in Table 2 include three benchmark indexes representing proxies for the market, plus the portfolio returns and the CAR (cumulative abnormal return) since the first day of each financial
year at monthly intervals. The three indexes are the ALL ORDS,7 ASX200,8 and ASX300. Importantly, this positive economic result was achieved through the combination method, resulting in reasonably sized portfolios. This has the added advantage of reducing the risk of investing in incorrectly predicted targets. However, while these results are impressive in themselves, it should be recognised that they could potentially have been driven by actual non-target firms within the portfolio of predicted targets. In that case, the abnormal returns in 2010 and 2011 could be partly the result of selecting over-performing non-target firms, rather than accurately selecting the target firms. The answer to this particular issue is in Table 3. The table contains the average returns, split by the sub-groups of correctly predicted targets and misclassified targets (non-targets) for each out-of-sample period. The results in Table 3 allow the quantification of the economic benefit of improving the model accuracy. The average return for the sub-group of correctly predicted target companies in the first prediction period (financial year 2009) is marginally higher than the non-targets return, −22.08% against −25.82%. However, this difference increases significantly in the following two prediction periods. In the financial year 2010, the nine actual targets show an average return of 61.56%, which is considerably higher than the 1.48% average return achieved by the 31 non-target companies in the portfolio. Although the predicted sample is double the size of the previous period, reflecting the uncertainty from the GFC period, the high predictive accuracy certainly contributes to the abnormal portfolio return. The results from the financial year 2011 only confirm that high average returns are related directly to model accuracy. 
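Because the portfolio is equally weighted, its return is the size-weighted mean of the two sub-group averages in Table 3, which makes the contribution of the correctly predicted targets easy to quantify. The sketch below is an illustrative check using only the Table 3 figures; the helper names are assumptions introduced here.

```python
# Sub-group sizes and average returns (%) from Table 3.
table3 = {
    2009: {"targets": (3, -22.08), "non_targets": (16, -25.82)},
    2010: {"targets": (9, 61.56), "non_targets": (31, 1.48)},
    2011: {"targets": (6, 43.05), "non_targets": (12, 0.27)},
}

def portfolio_return(groups):
    """Equally weighted portfolio return as the size-weighted mean of sub-group averages."""
    total = sum(n for n, _ in groups.values())
    return sum(n * r for n, r in groups.values()) / total

def contribution(groups, name):
    """Percentage points of the portfolio return contributed by one sub-group."""
    total = sum(n for n, _ in groups.values())
    n, r = groups[name]
    return n * r / total

for year, groups in table3.items():
    print(
        f"{year}: portfolio {portfolio_return(groups):7.2f}%, "
        f"of which actual targets {contribution(groups, 'targets'):7.2f} pp"
    )
```

For 2010 and 2011, almost all of the portfolio return is traced to the correctly predicted targets, which is the point the text makes about accuracy driving returns.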
From the portfolio of 18 predicted target companies, the six correctly predicted targets achieved a 43.05% average return, while the other two thirds of the companies from the portfolio achieved an average return of only 0.27%. The returns of the predicted takeover targets over the forecast horizon are given in Appendix C. Overall, the combination of forecasts appears to be an efficient technique for both improving the accuracy of takeover predictions and achieving abnormal returns. The KK combination method appears to be very stable across years, as well as being parsimonious for portfolio selection. The mix of panel data logistic and neural network models has proved to be a good choice for capturing and combining information from a range of different models in order to achieve abnormal returns. 7. Conclusions Forecasts of events based on economic and financial variables that take the form of probabilities are becoming
7 The All Ordinaries (All Ords) index contains nearly all ordinary shares listed on the Australian Securities Exchange. The market capitalization of the companies included in the All Ords index amounts to over 95% of the value of all shares listed on the ASX.
8 The ASX200 index is a market-capitalization weighted and float-adjusted stock market index of the top 200 Australian stocks listed on the ASX, from Standard & Poor's.
increasingly common. There is an extensive body of literature suggesting that a forecast combination approach can improve on the individual forecasts. This study has evaluated whether combining probability forecast methods for the prediction of takeover targets forms a consensus forecast that improves the prediction accuracy and generates abnormal returns from portfolios comprising the predicted companies. We believe that our results provide evidence in favour of a good and consistent forecast accuracy. This is achieved when predicting potential takeover targets using forecast combinations from a number of panel data logistic and neural network models. Furthermore, the combination model’s results are consistent over time, confirming the robustness of such a methodology to a reduced misclassification error, which is an important consideration in takeover predictions. Finally, two general conclusions are drawn from the results. Firstly, the KK combination method outperforms the single models and should be used to improve the prediction of takeover targets. In particular, the combination approach is both a stable and an efficient method for combining probability forecasts in order to improve the model accuracy and achieve abnormal returns. Secondly, it has been demonstrated that an investment in the combined predicted targets in a regular year resulted in significant abnormal returns being made by an investor, in the order of up to double the market benchmark return in a manageable-sized portfolio. In fact, the use of models which are designed to predict companies with a minimal misclassification rate had a significant economic impact on the portfolio returns. Appendix A. Model estimations See Table A.1. Appendix B. Table with statistical test comparing results from Table 1 See Table B.2. Appendix C. Returns by company out-of-sample See Table C.3. References Barnes, P. (1998). Can takeover targets be identified by statistical techniques? Some UK evidence. 
The Statistician, 47(4), 573–591. Barnes, P. (1999). Predicting UK takeover targets: some methodological issues and an empirical study. Review of Quantitative Finance and Accounting, 12, 283–301. Barnes, P. (2000). The identification of UK takeover targets using published historical cost accounting data. Some empirical evidence comparing logit with linear discriminant analysis and raw financial ratios with industry-relative ratios. International Review of Financial Analysis, 9(2), 147–162. Bates, J. M., & Granger, C. W. J. (1969). The combination of forecasts. Operational Research Quarterly, 20, 451–468. Reprinted in T.C. Mills (Ed.), Economic forecasting. Edward Elgar. 1999. Bradley, M., Desai, A., & Kim, E. (1983). The rationale behind inter firm tender offers. Journal of Financial Economics, 11(1), 183–206. Cheh, J., Randy, W., & Ken, Y. (1999). An application of an artificial neural network investment system to predict takeover targets. The Journal of Applied Business Research, 15(4), 33–45.
Table A.1
Models’ estimation results from the in-sample period 1999–2010.

M1: Logistic regression
Number of obs = 16081; LR chi2(13) = 247.41; Prob > chi2 = 0.0000; Log likelihood = −2518.7249

tkvr       Coef.    Std. Err.        z
V1         0.035      0.021       1.69
V3         0.000      0.000       1.28
V9        −0.213      0.099      −2.14
V13        0.006      0.004       1.73
V14       −0.050      0.025      −2
V18        0.000      0.000       1.23
V19        0.000      0.000       2.6
V25       −0.010      0.006      −1.68
V29        0.076      0.040       1.89
V30       −0.006      0.003      −1.84
V31        0.149      0.096       1.56
V32        0.495      0.114       4.35
V33        0.195      0.026       7.53
V34        0.000      0.000      −3.17
V35        0.000      0.000      −1.35
_cons     −6.685      0.471     −14.21

M2: Mixed-effects logistic regression
Number of obs = 16081; Number of groups = 2612; Prob > chi2 = 0.0000; Log likelihood = −2512.7412

tkvr       Coef.    Std. Err.        z
V3         0.000      0.000       1.38
V9        −0.248      0.107      −2.32
V18        0.000      0.000       1.31
V19        0.000      0.000       2.47
V25       −0.011      0.007      −1.7
V29        0.086      0.044       1.94
V31        0.132      0.112       1.18
V32        0.534      0.127       4.21
V33        0.266      0.029       9.07
V34        0.000      0.000      −2.84
V35        0.000      0.000      −1.46
_cons     −8.334      0.547     −15.25

Random-effects parameters     Estimate    Std. Err.
id: Identity, var(_cons)      9E-01       0.216
LR test chibar2(01) = 24.90; Prob > chi = 0.000

M3: Crossed-effects logistic regression
Number of obs = 16081; Group id _all 2612; Groups 11; Log likelihood = −2490.7673

tkvr       Coef.    Std. Err.        z
V3         0.000      0.000       1.37
V9        −0.271      0.113      −2.4
V18        0.000      0.000       1.32
V19        0.000      0.000       2.39
V25       −0.013      0.007      −1.81
V29        0.096      0.047       2.06
V31        0.140      0.127       1.1
V32        0.584      0.138       4.23
V33        0.294      0.032       9.12
V34        0.000      0.000      −2.16
V35        0.000      0.000      −1.47
_cons     −9.155      0.603     −15.19

Random-effects parameters         Estimate    Std. Err.
_all: Identity, var(R.sector)     4.60E-15    2E-09
id: Identity, var(_cons)          1.90287     0.305
LR test chi2(2) = 68.84; Prob > chi2 = 0.0000

C1: Logistic regression (KK comb.)
Number of obs = 16081; LR chi2(6) = 1910.28; Prob > chi2 = 0.0000; Log likelihood = −1680.1525

tkvr       Coef.    Std. Err.        z
M1       −39.025      5.677      −6.87
M2        32.070      5.316       6.03
M3        19.091      2.984       6.4
M4       −15.645      5.949      −2.63
M5         6.888      0.847       8.13
M6         7.231      3.180       2.27
_cons     −3.945      0.098     −40.21
Clements, M. P., & Harvey, D. I. (2010). Forecast encompassing tests and probability forecasts. Journal of Applied Econometrics, 25(6), 1028–1062. Clements, M. P., & Harvey, D. I. (2011). Combining probability forecasts. International Journal of Forecasting, 27(2), 208–223. Denčić-Mihajlov, K., & Radović, O. (2006). Problems in predicting target firms in the underdeveloped capital markets. Facta Universitatis Series: Economics and Organization, 3(1), 59–68. Diebold, F. X., & Pauly, P. (1987). Structural change and the combination of forecasts. Journal of Forecasting, 6, 21–40. Dietrich, J. K., & Sorensen, E. (1984). An application of logit analysis to prediction of merger targets. Journal of Business Research, 12, 393–402. Gahlon, J., & Stover, R. (1979). Diversification, financial leverage, and conglomerate systematic risk. Journal of Financial and Quantitative Analysis, 14(5), 999–1013. Harris, R., Stewart, J., Guilkey, D., & Carleton, W. (1982). Characteristics of acquired firms: fixed and random coefficients probit analyses. Southern Economic Journal, 49(1), 164–184. Hogarty, T. (1970). The profitability of corporate mergers. Journal of Business, 43(3), 317–327.
Jensen, M., & Ruback, R. (1983). The market for corporate control: the scientific evidence. Journal of Financial Economics, 11, 5–50. Kamstra, M., & Kennedy, P. (1998). Combining qualitative forecasts using logit. International Journal of Forecasting, 14, 83–93. Kamstra, M., Kennedy, P., & Suan, T.-K. (2001). Combining bond rating forecasts using logit. The Financial Review, 36(2), 75–96. Meador, A., Church, P., & Rayburn, L. (1996). Development of prediction models for horizontal and vertical mergers. Journal of Financial and Strategic Decisions, 9(1), 11–23. Ohlson, J. (1980). Financial ratios and the probabilistic prediction of bankruptcy. Journal of Accounting Research, 18, 109–131. Palepu, K. (1986). Predicting takeover targets: a methodological and empirical analysis. Journal of Accounting and Economics, 8, 3–35. Pesaran, M., & Timmermann, A. (2007). Selection of estimation window in the presence of breaks. Journal of Econometrics, 137, 134–161. Powell, R. (1995). Takeover prediction in the UK: a multilogit approach. Working Papers in Accounting and Finance, Accounting and Finance Division, The Queen’s University of Belfast, 95 (3). Powell, R. (2004). Takeover prediction models and portfolio strategies: a multinomial approach. Multinational Finance Journal, 8(1–2), 35–74.
Table B.2
Test for equality of proportions (unequal variances) based on the accuracy results from Table 1. (H0: accuracy of the KK combination = accuracy of each model and benchmark, one by one.)
Column groups: Logistic models; Neural network models; Benchmark.

Sample: 1999–2009. Out-of-sample: 2009
             KK = M1  KK = M2  KK = M3  KK = M4  KK = M5  KK = M6  KK = linear comb.  KK = chance criterion
Z-statistic  0.251    0.120    0.358    0.821    0.858    0.203    1.437              1.536
p-value      0.401    0.452    0.360    0.206    0.195    0.420    0.075              0.062

Sample: 1999–2010. Out-of-sample: 2010
             KK = M1  KK = M2  KK = M3  KK = M4  KK = M5  KK = M6  KK = linear comb.  KK = chance criterion
Z-statistic  1.133    2.033    1.797    1.321    1.039    1.209    1.948              2.739
p-value      0.129    0.021    0.036    0.093    0.149    0.113    0.026              0.003

Sample: 1999–2011. Out-of-sample: 2011
             KK = M1  KK = M2  KK = M3  KK = M4  KK = M5  KK = M6  KK = linear comb.  KK = chance criterion
Z-statistic  1.738    2.201    2.312    1.332    1.167    1.697    2.657              2.563
p-value      0.041    0.014    0.010    0.091    0.122    0.045    0.004              0.005
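The relationship between the Z-statistics and p-values in Table B.2 can be illustrated with a minimal sketch. The reported p-values appear consistent with one-sided (upper-tail) standard normal probabilities, which is an inference from the numbers rather than something stated in the table. The function names `upper_tail_p` and `z_unequal_var` are hypothetical, and the unpooled standard error is one common reading of "unequal variances"; the accuracy proportions and sample sizes from Table 1 are not reproduced here, so any inputs to `z_unequal_var` are placeholders.

```python
import math


def upper_tail_p(z: float) -> float:
    """One-sided p-value P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def z_unequal_var(p1: float, n1: int, p2: float, n2: int) -> float:
    """Z-statistic for H0: p1 = p2 with an unpooled standard error.

    Assumption: each proportion contributes its own variance term,
    which is one common form of the unequal-variance test.
    """
    se = math.sqrt(p1 * (1.0 - p1) / n1 + p2 * (1.0 - p2) / n2)
    return (p1 - p2) / se


# Z-statistics taken from the 1999-2011 panel of Table B.2
# (KK = M1, KK = M2, KK = linear comb.).
for z in (1.738, 2.201, 2.657):
    print(f"Z = {z:.3f} -> one-sided p = {upper_tail_p(z):.3f}")
```

Under this reading, a Z-statistic above roughly 1.645 corresponds to rejecting equality of accuracy at the 5% level in favour of the KK combination.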
Table C.3
Distribution of portfolio returns by company in the sub-groups of correctly predicted target companies and misclassified companies (non-targets). Three panels, each labelled "Out-of-sample: 2011", with columns Company and Buy and hold return.
LST QGC TPX
−27.32%
AOE CKT ERC FLX LGL LLP PLI SSI TKA
36.62% 154.88% −52.40% 19.08% 46.10% 231.52% 75.00% −58.04% 101.28%
AKR ASX CRG DKN IIF JML
−6.33%
BEN CBH CHQ CIF CNP FLT GPT IPN MMX NXS QAN REA SBM SGB SST VBA
−36.41% −42.11% −44.79% −45.45% −62.04% −48.11% −21.17%
AAY AEM ANZ AQF AZO CBZ CDU CFE CSL CWK CXC EQX HDI KMD MDL MOO MQA PTN RMR ROB RUL RVE SHU SNE SOI TBI VGM VIP WBC WCR WIG
−61.54%
API CER CNP DUE DXS EXT MDL OMH RIO SPN TAP TPM
−28.21%
19 companies
−25.23%
7.08%
−46.00%
Correctly classified (actual targets)
Incorrectly classified (non-targets)
Portfolio
1.92%
−43.05% −33.23% −33.88% 35.84%
−36.99% 2.18% 27.12% −32.98%
40 companies
0.00% 31.05% 21.60% −4.24% −26.39% 82.17% 1.56% 1.34% 51.06% 18.26% −22.22% 0.00% −1.76% 51.61% −8.33% 3.26% −48.24% 40.00% −46.15% −23.08% 127.27% −10.00% −20.00% −44.44% −32.69% −25.00% 0.00% 4.84% −21.95% 8.02% 15.00%
18 companies
4.42% 28.94% 41.07% 42.67% 147.54%
109.38%
−72.59% 5.26% 14.29% 19.08% −39.57% −37.20% 24.50% 23.53% −2.92% −12.24%
14.53%
Rabe-Hesketh, S., Skrondal, A., & Pickles, A. (2005). Maximum likelihood estimation of limited and discrete dependent variable models with nested random effects. Journal of Econometrics, 128, 301–323.
Riedel, S., & Gabrys, B. (2004). Hierarchical multilevel approaches of forecast combination. Proceedings of the GOR 2004 Conference, 1–3 September 2004, Tilburg, Netherlands.
Silbertson, A. (1972). Economies of scale in theory and practice. Economic Journal, 82, 369–391.
Stevens, D. (1973). Financial characteristics of merged firms: a multivariate analysis. Journal of Financial and Quantitative Analysis, 8, 149–158.
Stevenson, M., & Peat, M. (2009). Predicting Australian takeover targets: a logit analysis. Paper presented at the 3rd MEAFA Workshop, University of Sydney, January, 2009.
Sullivan, T. (1977). A note on market power and returns to shareholders. Review of Economics and Statistics (February), 108–113.
Thomadakis, S. (1976). A model of market power, valuation, and the firm’s return. Bell Journal of Economics (Spring), 150–162.
Timmermann, A. (2006). Forecast combinations. In G. Elliott, C. Granger, & A. Timmermann (Eds.), Handbook of economic forecasting (pp. 135–196). Elsevier.
White, H. (1992). Artificial neural networks: approximation and learning theory. Cambridge, MA: Blackwell Publishers.
Bruno Rodrigues is a Ph.D. student in the Discipline of Finance, Business School, The University of Sydney. Bruno holds a Bachelor of Economics, Master of Statistics, and a Ph.D. in Statistics from the Pontifical University of Rio de Janeiro. His research in the areas of statistics, time series analysis, financial econometrics, market microstructure and financial analysis has been presented at a variety of conferences, including the International Symposium on Forecasting, the Society of Nonlinear Dynamics and Econometrics Symposium, the MEAFA Meeting at The University of Sydney, and the Brazilian Finance Symposium.
Max Stevenson is an Honorary Senior Lecturer in the Discipline of Finance, Business School, The University of Sydney. Max holds a Master of Statistics, Master of Commerce and a Ph.D. from the University of NSW and is an accredited statistician (AStat) with the Australian Statistical Society. His research in the areas of mathematical economics, mathematical statistics, financial econometrics, corporate finance and financial analysis has been published in a variety of journals, including the Journal of Economic Behaviour and Organisation, International Journal of Forecasting, Studies in Nonlinear Dynamics and Econometrics, Review of Futures Markets, Abacus and the Journal of Applied Financial Economics.