Interpreting and evaluating CESIfo's World Economic Survey directional forecasts


Economic Modelling 38 (2014) 6–11

Contents lists available at ScienceDirect

Economic Modelling journal homepage: www.elsevier.com/locate/ecmod

Mark Hutson, Fred Joutz ⁎, Herman Stekler
Department of Economics, George Washington University, Washington DC 20052, United States

Article info

Article history: Accepted 25 November 2013. Available online 28 December 2013.

Keywords: Evaluating surveys; Directional forecasts

Abstract

Using the Carlson and Parkin (1975) framework and employing the Pesaran–Timmermann (1992) Predictive Failure statistic, we evaluate several consensus forecast series from CESIfo's World Economic Survey. Several issues related to interpreting qualitative survey responses are examined. We consider what an “about the same” response implies across different economic variables, the value of agreement across the forecast panel, and how to maximize the signal value provided by the survey. We find that survey respondents provide statistically significant directional forecasts or signals. © 2013 Elsevier B.V. All rights reserved.

1. Introduction

Surveys of consumer sentiment and expectations are frequently used by economists either to estimate economic relationships or as possible inputs to economic forecasts. Such surveys include the University of Michigan and the Conference Board's measures of consumer confidence. These particular surveys provide quantitative measures of consumers' beliefs (see Ludvigson, 2004). However, Croushore (2005) found that they did not have significant value in real-time forecasting.

Other surveys do not provide quantitative measures because respondents are asked to give qualitative answers, for example: up, no change, down. A number of procedures have been developed to quantify these qualitative data (see Anderson, 1951; Anderson Jr, 1952; Carlson and Parkin, 1975; Theil, 1952).¹ The resulting numbers have then been used to estimate consumer expectations about particular variables, such as the expected inflation rate, and to test the rational expectations hypothesis. These studies have not explicitly focused on the directional accuracy of the surveys. We address that question in our paper.

For this analysis we use a survey that asks qualitative questions but reports the responses in a different format. Thus, we need to develop a different procedure for determining whether the survey provides information about the direction that an economy will take in the next several quarters. In this context, we examine and evaluate the information about the US economy that is provided in the CESIfo World Economic Survey (WES). The WES is one of the world's broadest surveys of forecasts. It is published quarterly under the auspices of both the University of Munich's Center for Economic Studies and the Ifo Institute for Economic Research in Munich. In each survey there are 10 multi-part questions that cover everything from the world economic climate to macroeconomic and trade variables in the respondent's home country.² Participants in the survey are asked to check one of three boxes reflecting their expectations about macroeconomic variables. The categories, while sometimes worded differently, all ask the respondent whether a variable is, was, or will be positive, negative, or about the same.

This paper focuses only on the responses to the surveys about the US economy. The remaining sections are laid out as follows. Section 2 provides a brief discussion of the relevant literature. Section 3 describes the survey and data used in the empirical analysis, and Section 4 describes the methodology and scenarios used to evaluate the forecasts. Section 5 discusses the forecast results and Section 6 summarizes the conclusions.

⁎ Corresponding author. E-mail addresses: [email protected] (F. Joutz), [email protected] (H. Stekler).
1 Nardo (2003) presents a survey of these methods. Qualitative responses are more difficult to evaluate than quantitative statements (for example, see Dorfman, 1998; Dasgupta and Lahiri, 1992).
2 As of January 2010, more than 1000 economists in 94 countries responded to the survey. The respondents only answer questions about economic conditions in their home country.

0264-9993/$ – see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.econmod.2013.11.032

2. Literature review

2.1. The balance statistic and contingency tables

Some of the earliest work on evaluating directional and qualitative forecasts used the balance statistic, which has been shown to provide informative results (Anderson, 1951; Anderson Jr, 1952; Theil, 1952). The balance statistic reports the difference between the number of positive respondents and the number of negative respondents. However, since the balance statistic's accuracy depends on the underlying distribution of the actual data, valid statistical inference from the balance statistic was problematic.


Theil (1961) evaluated directional forecasts by using contingency tables. He was able to determine whether or not directional forecasts provided statistically significant information compared to random predictions and realizations without relying on the underlying distribution. To do this, forecasts were categorized according to the evaluation metric (e.g. “up”, “down”, “about the same”), and the corresponding realizations were then categorized in the same way. The data were then used to construct a contingency table. The counts of each combination were the individual cell elements (e.g. the number of cases with actual “up” and predicted “up” would go in row 1, column 1). Theil then tested whether the forecasts and realizations were related or independent. The test statistic is distributed as χ² under the null hypothesis of independence (i.e. the forecasts are unrelated to the realizations). However, it should be noted that this method only tests whether the forecasts and realizations are independent, not whether the forecasts are correct. In a symmetric contingency table, only values along the main diagonal are correct, so the outcome of the independence test alone is not sufficient for determining predictive failure (or success). Pesaran and Timmermann (1992) developed a predictive failure test statistic that is also based on the contingency table.
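The contingency-table test of independence described above can be illustrated with a short sketch. The counts are invented for illustration, the function name is hypothetical, and rows are taken to be actual outcomes and columns predicted outcomes; the last line also computes the share of observations on the main diagonal, which the independence test by itself does not evaluate.

```python
# Pearson chi-square test of independence for a forecast/realization table.
# Rows = actual category, columns = predicted category,
# ordered ("up", "about the same", "down"). Counts are hypothetical.

def chi_square_independence(table):
    """Return (statistic, degrees_of_freedom) for an r x c table of counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if forecasts and realizations were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

table = [
    [20, 8, 2],
    [6, 18, 6],
    [3, 7, 14],
]
stat, dof = chi_square_independence(table)

CRITICAL_5PCT_DF4 = 9.488  # 5% critical value of chi-square with 4 d.o.f.
reject_independence = stat > CRITICAL_5PCT_DF4

# Share of observations on the main diagonal (correct calls).
hit_rate = sum(table[i][i] for i in range(3)) / sum(map(sum, table))
```

A table can reject independence while having few observations on the diagonal, which is why a separate predictive failure statistic is needed.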

2.2. Categorizing realizations and forecasts

The contingency table method is appropriate when forecasts and realizations are both qualitative. If forecasts are qualitative and realizations are quantitative, a different approach is required to evaluate the forecasts. Implicit in the use of a contingency table is that both the realizations and the forecasts must be categorized into cells. Carlson and Parkin (C–P) (1975) developed a framework that converted quantitative realizations into qualitative categories. C–P discuss in depth the method they used to determine the boundaries between their categories. Essentially, C–P defined the size of the “actual” cells based on their “imperceptibility” threshold in order to define a correct prediction using a 3-cell framework.³ However, as we indicate below, these methods must be modified because the WES publishes the results of its survey in a form that is not consistent with these approaches.

3 Nardo (2003) and Breitung and Schmeling (2013) discuss the criticisms of this method, which assumes symmetric and time-invariant responses.

2.3. Are qualitative surveys useful for directional analysis?

Qualitative forecast evaluations have yielded a number of important findings. Bessler and Brandt (1981) found that expert opinions can increase the robustness of directional forecasts. Expert predictions, when coupled with statistical models, improve the directional forecasts; in fact, the experts often outperform both the statistical models' directional forecasts and quantitative point forecasts in agricultural spot markets (Dorfman, 1998). Expert or survey results are also widely used in the literature as an instrument for low-quality or unavailable data (Rowe and Wright, 2001). In evaluating consensus forecasts, Lahiri and Sheng (2010) found that disagreement among quantitative forecasters can be a good proxy for uncertainty; thus, when it can be measured, the degree of disagreement (or lack thereof) might be useful in evaluating forecasts.

Less formal categorizations can also be used by creating an artificial “score” for respondents. For example, Balke and Petersen (2002) scored the Federal Reserve Beige Book. They converted qualitative descriptions of the economy into a quantitative scale (scored from −2 to 2) and used OLS to estimate GDP as a function of their index. They found that the Beige Book provided additional predictive value beyond other indicators and forecasts.

3. The survey and data

3.1. The survey

Each of the questions in the CESIfo World Economic Survey (WES) asks forecasters to check one of three boxes, with responses being optimistic, pessimistic, or about the same. Thus, individuals are asked to convert their own forecasts, which are often quantitative values, into a qualitative measure. These qualitative values are then aggregated using a numerical scale, with optimistic, about the same, and pessimistic responses assigned the numbers 9, 5, and 1, respectively. The scores are then averaged, yielding a quantitative consensus directional forecast. These quantitative numbers, which are derived from scoring qualitative directional responses, are the “forecast” data used in this analysis.

However, when CESIfo reports the survey to the public, it often takes this quantitative value and presents a qualitative prediction graphically. Values in the lower third of the scale are considered “down” predictions, those in the upper third are “up” predictions, and middle values are “about the same” predictions.⁴ To summarize, the final consensus forecasts that are reported are derived from private qualitative responses, which are then converted into a quantitative scale and averaged. In the WES, these numbers are converted back into a qualitative prediction, which is shown graphically as a three-cell system. However, these numbers are also presented as a two-cell forecast in the introduction. An evaluation of these survey results thus poses some important technical issues that are analyzed in this paper.

3.2. The data

This paper examines the responses in the US surveys to questions about GDP and some of its components. The components are consumption, investment, exports, imports, and the trade balance. The analysis covers the quarterly surveys from 1989 through 2009.
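The aggregation described in Section 3.1 can be sketched as follows. The 9/5/1 scores and the two readings of the consensus value come from the text; the function names, the equal-thirds cutoffs on the 1 to 9 scale, and the treatment of a score of exactly 5 are assumptions made for illustration.

```python
# Scores assigned to individual responses, as described in Section 3.1.
SCORES = {"optimistic": 9, "same": 5, "pessimistic": 1}

def consensus_score(responses):
    """Average the 9/5/1 scores of the individual qualitative responses."""
    return sum(SCORES[r] for r in responses) / len(responses)

def three_cell(score, low=1 + 8 / 3, high=9 - 8 / 3):
    """Map a consensus score to a three-cell call.

    ASSUMPTION: the 1-9 scale is cut into equal thirds; the exact
    cutoffs used in the published graphs are not given in the text.
    """
    if score < low:
        return "down"
    if score > high:
        return "up"
    return "about the same"

def two_cell(score, midpoint=5.0):
    """Two-cell reading: above 5 is "up", below 5 is "down"."""
    if score > midpoint:
        return "up"
    if score < midpoint:
        return "down"
    return None  # ASSUMPTION: a score of exactly 5 is left unclassified

# Hypothetical panel of ten respondents.
panel = ["optimistic"] * 5 + ["same"] * 3 + ["pessimistic"] * 2
score = consensus_score(panel)
three_cell_call = three_cell(score)
two_cell_call = two_cell(score)
```

With this invented panel the two readings differ: the consensus value falls in the middle third of the scale but above 5, which illustrates why the choice between the three-cell and two-cell formats matters for the evaluation.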
All forecasts are lined up with their corresponding prediction period. For example, the survey from the first quarter of 2001 asks about the current situation for GDP and how it compares to the value expected six months ahead; the forecasts made in 2001.1 are thus compared to the actual real growth rates for the period between 2001.1 and 2001.3. The actual values are the real-time figures for the same time periods, obtained from the Philadelphia Fed's Real-Time Database.⁵

4 Further, in their introduction, they define values greater than 5 as optimistic and below 5 as pessimistic, essentially reclassifying their system for a two-cell interpretation similar to a balance statistic.
5 http://www.philadelphiafed.org/research-and-data/real-time-center/real-time-data/. It should be noted that 1996.1 had a lagged reporting vintage and thus has been removed from the analysis. The documentation states that the BEA switched from fixed-weight aggregation to chain-weighting, and this is the most likely explanation for the gap.

4. Methodology

This section presents the procedures for evaluating the forecasts in the WES. The form of a variable that is used depends on the wording of the question in the WES. It also depends on whether the question refers to the present or future condition of the economy. What follows is (1) a discussion of the contingency table technique that is used, (2) the statistic that is used in the evaluation, (3) the choice of the form of the actual series against which the WES was compared, and (4) how the actual series and predictions are categorized in the contingency tables.

4.1. Contingency tables

We utilize (n × n) contingency tables, starting with the 3 × 3 case, to evaluate the WES forecasts. The cells of each contingency table are based on (1) the category that the forecasters predicted and (2) the actual realization that occurred. The table size is based on the dimensionality of the forecast. As an example, if the survey only requires an “up” or “down” response, the analysis will use a 2 × 2 table with four cells. A three-category system, which also incorporates an “about the same” response, will have nine cells.

The contingency table can be evaluated by testing whether the predicted and actual distributions are independent or whether the forecasts display predictive failure. The former relies on the $\chi^2_k$ distribution; the latter uses the Pesaran and Timmermann (1992) statistic. Obviously, if the forecasts and the realizations are independent (the null hypothesis is not rejected), the forecasts have no value. However, rejecting independence is not equivalent to accuracy, and it is necessary to determine if the forecasts reject the null of predictive failure. The Pesaran and Timmermann (henceforth P–T) (1992) predictive failure test statistic takes the form:

$$ s_n = \sqrt{n}\, V_n^{-1/2} S_n \sim N(0,1) $$

where

$$ S_n = \hat{P} - \hat{P}^{*} = \sum_{i=1}^{m} \left( \hat{P}_{ii} - \hat{P}_{io}\hat{P}_{oi} \right), $$

$$ \hat{P}_{ii} = \frac{n_{ii}}{n}, \qquad \hat{P}_{io} = \frac{n_{io}}{n}, \qquad \hat{P}_{oi} = \frac{n_{oi}}{n}, $$

and

$$ V_n = \left( \frac{\partial f(P)}{\partial P} \right)'_{P=\hat{P}} \left( \hat{\Psi} - \hat{P}\hat{P}' \right) \left( \frac{\partial f(P)}{\partial P} \right)_{P=\hat{P}}, $$

where $P$ is the probability that forecasts will accurately predict the actual, $\hat{P}$ is the ML estimator of $P$, $\hat{P}^{*}$ is the proportion of correct forecasts that would be expected given the distributions of the actuals and the forecasts, $P_0$ is the true value of $\hat{P}$, $n_{ij}$ is the number of observations in the cell in the $i$th row and $j$th column, $n_{io}$ is the number of observations in the $i$th row, $n_{oi}$ is the number of observations in the $i$th column, $n$ is the total number of observations, $\hat{\Psi}$ is an $m \times m$ matrix with $P_0$ as its diagonal elements, and the null ($H_0$) is predictive failure of the forecasters.

The methodology is similar to the $\chi^2_k$ test of the contingency tables; the P–T test looks at deviations from the predicted number of observations in the cells based on the distribution of the forecasts and the actuals. However, the numerator includes only the diagonal elements (i.e. the correct predictions) that exceed the expected number of correct responses from an independent forecast. Thus, the null hypothesis is predictive failure. Statistically significant positive values of $s_n$, which reject the null, suggest that the forecasts are correct. We will focus on predictive failure and this statistic rather than on the independence test, but we will also report the number of accurate responses.

4.2. Wording of questions and choice of variable form

In order to compare the forecasts with the actual variables, it is necessary to determine the form of the actual variable that will be used. The form depends on what the respondents are asked to predict. For example, the three “present judgment” questions ask whether the situation is good/satisfactory/bad. In this case, the analysis compares these responses to the growth rates of the series because growth rates are the best gauge of whether the economy is growing satisfactorily. Similarly, the signs of the growth rates are used whenever respondents are asked to compare the current situation with that expected to prevail six months in the future. In that case respondents are asked whether the situation will be “better” (plus), “about the same” (0), or “worse” (minus).⁶

4.3. Classification of the data

Given the nature of the data that are available, it is necessary to define the boundaries of both the columns and rows of the contingency tables that we use in this analysis. We assume that the forecasters can determine, in observing a time series, what meaning they associate with the three categories (up, the same, and down). We then adopt an approach similar to the framework of Carlson and Parkin (1975). They assume that there is some threshold, X, above which the forecaster perceives a positive change in the series, while considering any value below it not to be significantly positive. The Carlson–Parkin approach assumes that both the responses and the perceptions of the actual data are symmetric and normally distributed.

Although this procedure is commonly used to evaluate qualitative survey data, it has been criticized (see Nardo (2003) and Breitung and Schmeling (2013)). Of particular concern are the possibilities that asymmetries exist and that the responses are not normally distributed. It is for these reasons that we examine several alternative approaches for determining the bounds/thresholds that distinguish between up (down) and no change. One approach assumes that both the responses and the perceptions of the actual data are symmetric; the other allows for asymmetries. Table 1 presents statistics about the distributions of each of the variables. In some cases, the normality null is rejected at the 5% level; in other instances the null is not rejected. We, therefore, used both procedures in the subsequent analyses. In each case we also varied the bounds to determine whether the results were robust.

4.3.1. Symmetry

Similar to the framework of Carlson and Parkin, we assume that there is some threshold, X, above which the forecaster perceives a positive change in the series, while considering any value below it not to be significantly positive. Suppose that both the actual and forecast data are distributed normally around the means of the observations. In both cases, the bound between no change and up (down) would then be defined as plus (minus) X standard deviations from the mean. In our analysis we allow X to vary between ¼ and 2. Consequently, we will evaluate the WES forecasts using various values of X for the distributions of both the actual data and the surveys' reported results. A value of any observation that is more than X standard deviations above the series' mean represents a change that is considered to be positive. Similarly, values more than X standard deviations below the series' mean will be considered to be negative. The larger the value of X, the greater will be the number of observations that are classified as “no change.”

4.3.2. Asymmetry

The previous method for deriving the bounds might not be appropriate when the data are not normally distributed. A second methodology, which is based on the range of the observed data, was developed to classify the boundaries of the data. This approach allows for asymmetries based on the distance between the value of the maximum (minimum) observation and the median. The boundary between “no change” and “up” (“down”) is then defined as Y% of the distance between the median and the maximum (minimum) value of the series.
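The two ways of setting the bounds in Sections 4.3.1 and 4.3.2 can be sketched as follows. The function names and the data are hypothetical, and the use of the sample standard deviation is an assumption; the text does not say which estimator was used.

```python
from statistics import mean, median, stdev

def classify(value, lower, upper):
    """Assign a value to "up", "down" or "no change" given two bounds."""
    if value > upper:
        return "up"
    if value < lower:
        return "down"
    return "no change"

def symmetric_bounds(series, x):
    """Bounds at the mean minus/plus x standard deviations (Section 4.3.1)."""
    m = mean(series)
    s = stdev(series)  # ASSUMPTION: sample standard deviation
    return m - x * s, m + x * s

def asymmetric_bounds(series, y):
    """Bounds at a fraction y of the distance from the median to the
    minimum and to the maximum of the series (Section 4.3.2)."""
    med = median(series)
    return med - y * (med - min(series)), med + y * (max(series) - med)

# Hypothetical, right-skewed series of growth rates.
growth = [-2.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 9.0]

def count_no_change(series, bounds):
    return sum(classify(v, *bounds) == "no change" for v in series)

narrow = count_no_change(growth, symmetric_bounds(growth, 0.25))
wide = count_no_change(growth, symmetric_bounds(growth, 2.0))
skewed = asymmetric_bounds(growth, 0.5)
```

In this example the wider symmetric bounds place more observations in the “no change” cell than the narrow ones, and the asymmetric bounds sit further from the median on the upper side because the maximum lies further from the median than the minimum does.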

6 Other questions have slightly different wordings. The import and export questions ask whether they will be “higher/about the same/lower” six months from now. This involves a comparison of the actual levels rather than of rates of change. The same applies to the trade balance question, which asks whether it will “improve/no change/deteriorate”.


Table 1
Summary statistics for CESIfo World Economic Survey directional forecasts of the United States.

                Present   Future    Present       Future        Present      Future
                GDP       GDP       consumption   consumption   investment   investment
Mean            5.52      5.15      5.54          4.84          5.12         5.13
Median          6.00      4.80      5.85          4.75          5.60         4.80
Maximum         9.00      8.80      8.80          8.80          8.30         8.50
Minimum         1.20      1.30      1.00          1.30          1.10         1.90
Std. Dev.       2.335     1.715     2.151         1.575         2.246        1.592
Skewness        −0.316    0.298     −0.535        0.628         −0.330       0.400
Kurtosis        1.898     2.347     2.399         3.132         1.740        2.219
Jarque–Bera     5.642     2.672     5.268         5.453         7.081        4.269
Probability     0.060     0.263     0.072         0.065         0.029        0.118
Observations    84        82        84            82            84           82

Sample 1989q1–2009q4. Present refers to current quarter. Future refers to 2 quarters ahead.
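The Jarque–Bera and Probability rows of Table 1 can be checked from the reported skewness, kurtosis, and number of observations. The sketch below uses the standard Jarque–Bera formula and assumes that the Kurtosis row reports raw (not excess) kurtosis; the function name is hypothetical. Small differences from the table are expected because the inputs are rounded.

```python
import math

def jarque_bera(n, skewness, kurtosis):
    """Jarque-Bera statistic and p-value from summary moments.

    Under normality the statistic is chi-square with 2 degrees of
    freedom, whose upper-tail probability is exp(-x / 2).
    """
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)
    return jb, p_value

# Present GDP column of Table 1: 84 observations.
jb_gdp, p_gdp = jarque_bera(84, -0.316, 1.898)

# Present investment column of Table 1: 84 observations.
jb_inv, p_inv = jarque_bera(84, -0.330, 1.740)

normality_rejected_gdp = p_gdp < 0.05
normality_rejected_inv = p_inv < 0.05
```

On these inputs the normality null is rejected at the 5% level for present investment but not for present GDP, in line with the statement that the test rejects in some cases and not in others.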

Table 2A
Characteristics of present situation forecasts.

                 1 Std Deviation        0.5 Std Deviation
Variable         % Correct   Sn         % Correct   Sn
GDP              76.2        4.64***    66.7        6.34***
Consumption      61.9        5.20***    61.9        6.97***
Investment       50.0        5.32***    65.5        7.22***

*** 1% significance level.
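The Sn column reports the predictive failure statistic of Section 4.1. A minimal sketch of such a statistic for an m × m table of counts is given below. It is not the authors' code: the function name is hypothetical, f(P) is taken to be the sum over i of (P_ii − P_io P_oi), and the variance is a delta-method approximation with cell probabilities evaluated under independence, which is an assumption of this sketch.

```python
import math

def pt_statistic(table):
    """Predictive failure statistic s_n for an m x m table of counts.

    Rows index one classification and columns the other; the value is
    the same if the two are swapped.
    """
    m = len(table)
    n = sum(sum(row) for row in table)
    p_row = [sum(row) / n for row in table]              # P_io
    p_col = [sum(col) / n for col in zip(*table)]        # P_oi
    p_hat = sum(table[i][i] for i in range(m)) / n       # observed hit rate
    p_star = sum(p_row[i] * p_col[i] for i in range(m))  # expected hit rate
    s_big = p_hat - p_star                               # S_n

    # Delta-method variance of sqrt(n) * S_n.
    # Gradient of f with respect to cell (k, l): 1{k == l} - P_ok - P_lo.
    first = 0.0
    second = 0.0
    for k in range(m):
        for l in range(m):
            g = (1.0 if k == l else 0.0) - p_col[k] - p_row[l]
            p_kl = p_row[k] * p_col[l]  # ASSUMPTION: independence null
            first += p_kl * g
            second += p_kl * g * g
    v_n = second - first ** 2
    return math.sqrt(n) * s_big / math.sqrt(v_n)

# Hypothetical 3 x 3 table of counts.
s_n = pt_statistic([[20, 8, 2], [6, 18, 6], [3, 7, 14]])

# One-sided standard normal critical values: 1.645 (5%) and 2.326 (1%).
rejects_predictive_failure = s_n > 1.645
```

For a 2 × 2 table, the square of this statistic coincides with the Pearson χ² statistic, which gives a simple check on the implementation.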

4.4. An alternative contingency table

Initially, the analysis of the categories of the WES data is based on a 3 × 3 contingency table using all of the responses. However, the CESIfo reports suggest that the WES forecasts should be analyzed in a two-cell format, with values over 5 considered to be optimistic and values under 5 pessimistic. This converts the responses into a binary forecast, i.e. up or down.⁷ In this format, the boundary of the rows of the contingency table is 5. Thus, any reported survey average above 5.0 is considered an “up” prediction and any below 5.0 a “down” prediction.⁸ The boundary of the cells of the actual data is either the mean of the series (for the present situation surveys) or zero (for the surveys about the change over the next six months). Otherwise the methodology for analyzing the forecasts of the surveys is the same.

7 Naik and Leuthold (1986) had suggested that such a classification might yield more robust results. However, in using the 3 × 3 contingency tables, we are in essence calculating interval forecasts and outcomes. Using the binary forecasts, up or down, is identical to using the mean without any deviations. Our results show that with a smaller deviation, the percentage of correct predictions declines.
8 This approach is akin to using the median optimistic/pessimistic forecast and follows a similar pattern to the Carlson–Parkin results whereby “about the same” values are not weighted.

5. Results

In this section, we first present the results based upon the 3 × 3 contingency tables for both the present judgment and six month-ahead responses to the surveys. We then examine the forecasting record based on the alternative 2 × 2 categorizations.

5.1. Present situation

Table 2A presents the current situation results for three variables: GDP, consumption, and investment, the only variables for which these data were available. These tabulations are based on the use of the normal distribution and two different standard deviations to determine the bounds of the cells. The bounds of the cells are either one s.d. or 0.5 s.d. away from the mean. We compare the results obtained from the two sets of bounds because more observations are placed within the “no change” cells when up (down) changes must exceed one standard deviation away from the mean.⁹

Our results indicate that the forecasters were able to accurately describe the current situation for both GDP and consumption and to a somewhat lesser extent for investment. The direction of change of GDP was predicted accurately at least 2/3 of the time. The directional consumption forecasts were accurate more than 60% of the time, but the accuracy of the investment predictions exceeded 50% only when 0.5 s.d. bounds were used. All the forecasts rejected the null of predictive failure regardless of which set of bounds was used. There were only two turning point errors, i.e. describing the situation as bad when it actually was good.¹⁰

9 The results based on deviations from the median are presented in the Appendix. There is no substantial difference between the findings obtained from the two approaches.
10 The consumption and investment variables each had one such error using the 0.5 s.d. bounds.

5.2. Six months-ahead forecasts

In addition to the three variables analyzed in the previous section, six months-ahead forecasts were also available for exports, imports and the trade balance. We found that the results for the six months-ahead forecasts were mixed and depended on the way the boundaries of the cells of the actual data were defined (Table 2B). If the bounds were set at one standard deviation, forecasts of each of the six variables were accurate at least 60% of the time. However, most of the accurate forecasts were contained in the “about the same” cells. With the smaller standard deviation bounds, more observations were incorrectly classified in the positive or negative categories and the accuracy was substantially lower, never exceeding 50% in the case of consumption, investment and GDP. Also, in this case, the GDP forecasts showed predictive failure.

Table 2B
Characteristics of six months ahead forecasts.

                 1 Std Deviation        0.5 Std Deviation
Variable         % Correct   Sn         % Correct   Sn
GDP              67.1        2.36***    36.6        0.64
Consumption      63.4        1.36       47.6        2.43***
Investment       68.3        2.97***    50.0        3.23***
Exports          61.0        0.93       47.6        2.22**
Imports          67.1        2.03**     56.1        3.35***
Trade Balance    65.8        1.39       52.4        3.13***

*, **, ***: significant at 10%, 5%, and 1% levels respectively.

5.3. Eliminating “about the same” cells

Entirely different results occurred when the analysis eliminated the “about the same” cell and used the 2 × 2 contingency framework that was described above. This transforms the evaluation into a binary problem of determining whether the predicted changes were related to the observed actual changes. In that case the Pesaran–Timmermann statistic is identical to the χ² statistic. The results in Tables 3 and 4 show that all of the responses to either the present situation or the outlook in six months were significant and thus valuable. In this analysis, the responses about current conditions coincided with the actual conditions, as defined above, 83–87% of the time. While the accuracy was lower for the six months-ahead forecasts, the direction of change of GDP, consumption and investment was still accurately forecast more than 60% of the time. The accuracy ratio of the trade variables was similar.

Table 3
Values of the χ² statistic and accuracy ratio for “present judgment” response 2 × 2 contingency table.

Variable         χ²         Accuracy ratio (%)
Real GDP         30.7***    83
Consumption      30.3***    83
Investment       38.6***    87

*, **, ***: significant at 10%, 5%, and 1% levels respectively.

These results indicate that the responses to the WES questions can yield some information that is useful in forecasting the direction of change of the economy. The respondents are not completely accurate about the state of the economy at the time they respond to the “current state” questions. Consequently, these misperceptions are, unfortunately, introduced into their forecasts about the future, but they still forecast the direction of change correctly 60–70% of the time.

Fig. 1 presents the time series of the published consensus estimates of the current situation. We have superimposed the dates of the three recessions that occurred in the time period that we have examined. Some results clearly stand out. Suppose that one uses any reported survey average above 5.0 as an “up” prediction and any that is below 5.0 as a “down” prediction. Then we observe that the nowcasts of GDP (the current situation) are all positive immediately prior to the three recessions and are negative during the recessions. However, during the recovery phases of the first two recessions, these consensus numbers are consistently less than 5.0. This suggests that the respondents to the survey might have a different interpretation of the state of the economy than we have assumed in our analysis. For instance, they might be interpreting a “good” state not as one in which the economy is growing, but as one in which the economy has attained the previous peak level of GDP.

Fig. 2 presents a similar graph of the six months-ahead estimates. Some of these forecasts were less accurate than the current quarter estimates portrayed in Fig. 1, but the results are mixed. For example, the consensus forecast was consistently and incorrectly predicting a recession long before it began in 2001. On the other hand, prior to and during the Great Recession of 2008–9 the forecasts correctly predicted the direction but not the magnitude of the decline in the US economy.
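The two-cell evaluation behind Tables 3 and 4 can be sketched as follows. The data are invented, the function names are hypothetical, and two details are assumptions of this sketch: survey averages of exactly 5.0 are dropped, and an actual value equal to the boundary is counted as “down”.

```python
def two_cell_table(survey_scores, actuals, actual_boundary=0.0, score_boundary=5.0):
    """Build a 2 x 2 table of counts: rows = actual (up, down),
    columns = predicted (up, down)."""
    table = [[0, 0], [0, 0]]
    for score, actual in zip(survey_scores, actuals):
        if score == score_boundary:
            continue  # ASSUMPTION: ties at the survey boundary are dropped
        row = 0 if actual > actual_boundary else 1
        col = 0 if score > score_boundary else 1
        table[row][col] += 1
    return table

def accuracy_ratio(table):
    """Percentage of observations on the main diagonal."""
    return 100.0 * (table[0][0] + table[1][1]) / sum(map(sum, table))

def chi_square_2x2(table):
    """Pearson chi-square statistic for a 2 x 2 table (1 d.o.f.)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical survey averages and subsequent six-month changes.
scores = [6.2, 5.5, 4.1, 3.0, 7.0, 4.8, 5.9, 2.5]
changes = [0.4, -0.1, -0.6, -1.2, 0.9, 0.2, 0.3, -0.8]

table = two_cell_table(scores, changes, actual_boundary=0.0)
ratio = accuracy_ratio(table)
chi2 = chi_square_2x2(table)
```

For the present situation questions, `actual_boundary` would be set to the mean of the actual series rather than zero, following Section 4.4.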

Table 4
Values of the χ² statistic and accuracy ratio for “six months ahead” response 2 × 2 contingency table.

Variable         χ²         Accuracy ratio (%)
Real GDP         7.0***     65
Consumption      5.4**      62
Investment       4.9**      62
Exports          2.6        62
Imports          10.8***    70
Trade Balance    15.5***    71

*, **, ***: significant at 10%, 5%, and 1% levels respectively.

Fig. 1. Current situation graph: Ifo Present Situation Index (left axis) and Real GDP Growth Rate (right axis), 1990–2008. Series shown: Ifo Present RGDP; RGDP Growth Rate. NBER recession dates in shaded areas; an Ifo Index greater than 5 indicates optimism, while a value less than 5 suggests forecasters are pessimistic.

These results reinforce our previous conclusion that the CESIfo survey yields some information about the future movements of the US economy.

6. Interpretation and conclusions

We examined the CESIfo World Economic Surveys for the United States for the period 1989–2009 to determine whether they were useful in forecasting the direction of change of the economy. We developed several methods for analyzing these surveys in the three-cell format of the survey. We found that the results of those surveys were more meaningful if the “about the same” cell was eliminated and the responses were analyzed in the two-cell format that CESIfo used in the introduction of its reports. In that case, the surveys provided some valuable information about the future direction of the economy. However, the directional forecasts of some of the major variables were only accurate 60–70% of the time. Thus the responses to these surveys cannot be reliably used as a leading indicator. Future research could determine whether the responses, when combined in a quantitative analysis, provide information over and above that contained in other variables.

Fig. 2. Future situation graph: Ifo Future Situation Index (left axis) and 6 Month Real GDP Growth Rate Change (right axis), 1990–2008. Series shown: Ifo Future RGDP Growth Change; Change in 6 Month RGDP Growth Rate. NBER recession dates in shaded areas; an Ifo Index greater than 5 indicates optimism, while a value less than 5 suggests forecasters are pessimistic.


Appendix Table 1. Characteristics of six months ahead forecasts, based on medians

Intervals based on 50% of the distance between the median and the maximum (minimum) value of the series

Variable         % Correct   Sn
GDP              58.3        2.30***
Consumption      82.1        6.47***
Investment       60.7        3.81***

*** 1% significance level.

Appendix Table 2. Characteristics of six months ahead forecasts, based on medians

Intervals based on 50% of the distance between the median and the maximum (minimum) value of the series

Variable         % Correct   Sn
GDP              76.2        2.79***
Consumption      73.2        1.20
Investment       63.4        −0.07
Exports          57.3        0.32
Imports          71.2        2.16**
Trade Balance    67.1        0.7

*, **, ***: significant at 10%, 5%, and 1% levels respectively.


References

Anderson, T.W., 1951. A note on a maximum-likelihood estimate. Econometrica 19 (1), 85.
Anderson Jr, Oskar, 1952. The business test of the Ifo-Institute for Economic Research, Munich, and its theoretical model. Revue de l'Institut International de Statistique/Rev. Int. Stat. Inst. 20 (1), 1–17.
Balke, Nathan S., Petersen, D'Ann, 2002. How well does the Beige Book reflect economic activity? Evaluating qualitative information quantitatively. J. Money Credit Bank. 34, 114–136 (Feb.).
Bessler, David A., Brandt, Jon A., 1981. Forecasting livestock prices with individual and composite methods. Appl. Econ. 13, 513–522 (Dec.).
Breitung, Jorg, Schmeling, Mark, 2013. Quantifying survey expectations: what's wrong with the probability approach. Int. J. Forecast. 29 (1), 142–154.
Carlson, John A., Parkin, J. Michael, 1975. Inflation expectations. Economica 42, 123–138 (May).
Croushore, Dean, 2005. Do consumer confidence indexes help forecast consumer spending in real time? N. Am. J. Econ. Finan. 16, 438–450.
Dasgupta, Susmita, Lahiri, Kajal, 1992. A comparative study of alternative methods of quantifying qualitative survey responses using NAPM data. J. Bus. Econ. Stat. 10, 391–400 (Oct.).
Dorfman, Jeffrey H., 1998. Bayesian composite qualitative forecasting: hog prices again. Am. J. Agric. Econ. 80, 543–551 (Aug.).
Lahiri, Kajal, Sheng, Xuguang, 2010. Measuring forecast uncertainty by disagreement: the missing link. J. Appl. Econ. 25, 514–538.
Ludvigson, Sidney C., 2004. Consumer confidence and consumer spending. J. Econ. Perspect. 18 (2), 29–50.
Naik, Gopal, Leuthold, Raymond M., 1986. A note on qualitative forecast evaluation. Am. J. Agric. Econ. 68, 721–726 (Aug.).
Nardo, Michela, 2003. The quantification of qualitative data: a critical assessment. J. Econ. Surv. 17 (5), 645–668.
Pesaran, M. Hashem, Timmermann, Allan, 1992. A simple nonparametric test of predictive performance. J. Bus. Econ. Stat. Am. Stat. Assoc. 10, 561–565 (Oct.).
Rowe, Gene, Wright, George, 2001. Expert opinions in forecasting: the role of the Delphi Technique. In: Armstrong, Jon Scott (Ed.), Principles of Forecasting: a Handbook for Researchers and Practitioners. Springer Science & Business Media, Inc., New York.
Theil, Henri, 1952. On the time shape of economic microvariables and the Munich business test. Revue de l'Institut International de Statistique/Rev. Int. Stat. Inst. 20 (2), 105–120.
Theil, Henri, 1961. Economic Forecasts and Policy. North Holland, Amsterdam.