Economic Modelling xxx (xxxx) xxx–xxx
Contents lists available at ScienceDirect
Economic Modelling journal homepage: www.elsevier.com/locate/econmod
Baidu news information flow and return volatility: Evidence for the Sequential Information Arrival Hypothesis
Dehua Shen a,b, Xiao Li c,⁎, Wei Zhang a,d
a College of Management and Economics, Tianjin University, Tianjin 300072, PR China
b Key Laboratory of Computation and Analytics of Complex Management Systems, Tianjin 300072, PR China
c School of Finance, Nankai University, Tianjin 300350, PR China
d China Center for Social Computing and Analytics, Tianjin University, Tianjin 300072, PR China
ARTICLE INFO
JEL Classification: G12, G14
Keywords: Return volatility; Sequential Information Arrival Hypothesis; Mixture of Distribution Hypothesis; Information flow; Baidu News
ABSTRACT
This paper employs Baidu News as the proxy for information flow and investigates competing hypotheses on the relationships between information flow and return volatility in the Chinese stock market. The empirical results show that: (1) trading volume and return volatility are not driven by the same variable, i.e., the information flow, which contradicts the prediction of the Mixture of Distribution Hypothesis (MDH); (2) there exist significant lead-lag relationships between information flow and return volatility, which is in accordance with the Sequential Information Arrival Hypothesis (SIAH); (3) these findings are robust to an alternative measurement of return volatility and to subsample analysis. Generally speaking, these findings contradict the prediction of the MDH and support the SIAH.
1. Introduction
Return volatility can be the result of the rate of information flow, the reflection of private information, as well as the irrational behavior of noise traders (Grossman and Stiglitz, 1980; Kyle, 1985; Glosten and Milgrom, 1985; French and Roll, 1986; Ross, 1989). Among the various explanations proposed, the rate of information flow has been widely regarded as the major factor behind changes in asset prices (Andersen, 1996; Bergemann et al., 2015). Investigating the relationships between information flow and return volatility is therefore important. For example, discovering significant linear and nonlinear relationships between information flow and return volatility may be of interest to policy makers as they decide on the transparency and quality of information diffusion. Two competing hypotheses based on the rate of information flow have been proposed: the Mixture of Distribution Hypothesis (MDH) and the Sequential Information Arrival Hypothesis (SIAH). However, the interactions and dynamics between information flow and return volatility remain debatable because the intangible nature of information makes it hard to construct a suitable proxy for the information flow. The existing literature mainly relies on trading volume (Lamoureux and Lastrapes, 1990; Bohl and Henke, 2003; Le and Zurbruegg, 2010), its adjusted forms (Wagner and Marsh, 2005; Fleming et al., 2006; Park, 2010), and the number of firm-specific announcements (Kalev et al., 2004) as proxies for the information flow. These studies usually incorporate the constructed proxy into the conditional variance equation of the GARCH model (Bollerslev, 1986) and observe a significant reduction in volatility persistence. In that sense, they provide evidence for the MDH, which claims that return volatility and trading volume are generated by a mixture of distributions in which the rate of information flow is the stochastic mixing variable (Clark, 1973; Epps and Epps, 1976; Harris, 1987; Andersen, 1996). The alternative hypothesis is the SIAH, which claims that the formation of a new market equilibrium is not instantaneous: investors need some time to react to new information, which produces lead-lag relationships between information flow and return volatility (Copeland, 1976; Jennings et al., 1981; Smirlock and Starks, 1988; Darrat et al., 2003, 2007). Although both the MDH and the SIAH have been extensively investigated by examining the relationship between trading volume (treated as the proxy for information flow) and return volatility, the overall empirical results are mixed (Darrat et al., 2003; Wagner and Marsh, 2005; Fleming et al., 2006; Park, 2010). This paper revisits the relationships between information flow and return volatility by employing a more appropriate proxy for information flow in the Chinese stock market. This proxy is constructed
⁎ Correspondence to: School of Finance, Nankai University, 38 Tongyan Road, Jinnan District, Tianjin 300350, PR China. E-mail address: [email protected] (X. Li).
http://dx.doi.org/10.1016/j.econmod.2017.09.012 Received 18 May 2017; Received in revised form 27 July 2017; Accepted 15 September 2017 0264-9993/ © 2017 Elsevier B.V. All rights reserved.
Please cite this article as: Shen, D., Economic Modelling (2017), http://dx.doi.org/10.1016/j.econmod.2017.09.012
by acquiring information from the Internet. In particular, we use the search frequency of the stock names in Baidu News as the direct proxy for the information flow. Compared with the prevailing proxies for the information flow, e.g., trading volume, our proxy is more direct, which makes the rate of information flow tangible. Among the few exceptions, Zhang et al. (2014) and Shen et al. (2016) also employ such a proxy for information flow, but their studies focus exclusively on the explanatory power of the information flow for volatility clustering, whereas we aim to address the underlying hypotheses, i.e., the MDH and the SIAH, on the relationships between information flow and return volatility. Our paper contributes to the existing literature in three aspects. Firstly, the information flow proxy is acquired from Baidu News, which naturally rules out other non-informational factors. Specifically, the prevailing proxy for information flow is trading volume. However, trading volume is not solely driven by information; it can also be driven by private information, irrational trading and liquidity shocks (French and Roll, 1986; Andersen, 1996; Shen et al., 2017). In that sense, trading volume is not a satisfactory proxy for information flow. Besides, given the high-frequency property of Baidu News, we can further divide the information flow into trading-period and non-trading-period information flow, and only the trading-period information flow is used as the daily information flow in the subsequent analysis. This further rules out the impact of second-hand information on the changes of asset prices (Davies and Canes, 1978).
Secondly, this new proxy for information flow provides us with an alternative way to test the MDH by comparing the contemporaneous correlation coefficient for the information flow-return volatility (IF-RV) relationship with the contemporaneous correlation coefficient for the information flow-trading volume (IF-TV) relationship. The existing literature mainly relies on the contemporaneous correlation coefficients between trading volume (treated as the proxy for information flow) and return volatility (Kalev et al., 2004; Wagner and Marsh, 2005; Fleming et al., 2006; Park, 2010). Employing trading volume as the proxy for information flow seems to “test a theory that is about from inputs to outputs with an output measure” (Qiu and Welch, 2004). In fact, we find conflicting results when employing Baidu News and trading volume, respectively, as the proxy for information flow. Thirdly, empirical work on the contemporaneous and lead-lag relationships between information flow and return volatility in the Chinese stock market is scarce; the only study is Lee and Rui (2000). Since 2000, both the investor structure and the trading rules have changed dramatically. For example, Qualified Foreign Institutional Investors (QFII) were allowed to trade stocks in July 2003 and share price index futures were launched in April 2010. Moreover, we provide alternative evidence on the relationships between information flow and return volatility in an emerging stock market, rather than focusing on mature markets, e.g., the NYSE (Darrat et al., 2003, 2007), DAX, FTSE, CAC and TPX (Wagner and Marsh, 2005). The remainder of this paper is organized as follows. Section 2 describes the Chinese stock market, the proxy for information flow and the capital data. Section 3 gives the empirical methodology and Section 4 presents the main results based on correlation coefficient analysis and the Granger causality test, as well as the robustness checks. Section 5 concludes.
2. Data description
There are two main sources of data in this paper. The first is the capital data retrieved from the RESSET Financial Research Database. The second is the data on the proxy for information flow, i.e., Baidu News. A brief introduction to the Chinese stock market is given first to describe the trading periods.
2.1. Trading periods in Chinese stock market
The Chinese stock market consists of the Shanghai Stock Exchange (SSE) and the Shenzhen Stock Exchange (SZSE), which were established on 19 December 1990 and 3 July 1991, respectively. Except for some national holidays (e.g., National Day, the Mid-Autumn Festival, the Dragon Boat Festival and the Lunar New Year), both exchanges are open five days a week (Monday to Friday) from 9:30 to 15:00, with a trading break from 11:30 to 13:00, in the GMT+8 time zone. Therefore, the trading periods run from 9:30 to 11:30 and from 13:00 to 15:00, and there are 4 h of trading on each trading day. There is some evidence suggesting that the Chinese stock market is significantly different from developed stock markets, e.g., the New York Stock Exchange and the London Stock Exchange. Firstly, according to a report released by the Shanghai Stock Exchange, individual investors account for more than 90% of all the accounts in the Chinese stock market (Zhang et al., 2016). Secondly, there exist price limits for individual stocks, whereby prices cannot change by more than 10% from the opening prices on each trading day. Thirdly, for the majority of stocks, short selling is still constrained. On 31 March 2010, the China Securities Regulatory Commission (CSRC) approved the margin trading and securities lending program to remove the restrictions on short selling for selected stocks. Up to the end of 2015, only 892 stocks (less than one third) were allowed to be sold short, and naked short selling is still prohibited for all stocks on both the Shanghai and Shenzhen stock exchanges.
2.2. Proxy for information flow
The proxy for information flow is obtained from Baidu News, which is a service provided by Baidu. According to a survey on the search behavior of Chinese netizens released by the China Internet Network Information Center, Baidu dominates the Chinese search market with a market share of more than 80%. In that sense, Baidu News is an ideal source for measuring the rate of information flow. Baidu News retrieves news from more than 500 authoritative websites and provides a 24/7 update service to its users. Besides, with the “Advanced Settings” provided by Baidu News, we can restrict the news to a certain interval. As a consequence, we can obtain the intraday information flow from Baidu News. For an individual stock on a given trading day, we can confine the search results to the trading periods (from 9:30 to 11:30 and from 13:00 to 15:00). Table 1 reports the statistical properties of news in different intervals.
Table 1 Statistical properties of news in T1, T2, T3 and T4. This table reports the statistical properties of the news that appeared in T1, T2, T3 and T4. “Std.”, “Max” and “Min” denote the standard deviation, maximum value and minimum value, respectively. According to the trading rules in the Chinese stock market, each trading day can be divided into four sub-periods, i.e., T1 (from 15:00 on the previous trading day to 9:30), T2 (from 9:30 to 11:30), T3 (from 11:30 to 13:00) and T4 (from 13:00 to 15:00).
Variables   T1      T2      T3      T4
Mean        48.09   8.169   1.512   2.008
Median      8       2       0       0
Std.        122.9   22.28   2.704   3.649
Kurtosis    25.83   31.91   11.42   10.01
Max         819     162     15      19
Min         0       0       0       0
In this paper, we search the stock names with Baidu News and employ the aggregated number of news items that appeared in trading periods as the daily information flow for each individual stock. The rationale for employing this aggregated number is that if the stock market is efficient, the news in the non-trading periods should already have been reflected in prices. Admittedly, there may exist some overlapping news, i.e., the same news reported by different information sources; nevertheless, the proposed proxy has
the advantage of ruling out some non-informational factors. Since Baidu News does not support direct downloads, a Java crawler script is written to download the data automatically.
2.3. Capital data
Our sample consists of daily data for 40 stocks from the Shanghai Shenzhen CSI 300 Index (CSI 300 Index),1 including the GARCH volatility, the opening, closing, highest and lowest stock prices, and the trading volume. All the data come from the RESSET Financial Research Database. The CSI 300 Index is the first index launched jointly by the Shanghai and Shenzhen stock exchanges with the aim of representing the overall performance of the Chinese stock market, and it is revised every six months by adding stocks to and deleting stocks from the index. The selected 40 stocks are the long-lived stocks that exist from the beginning date of the CSI 300 Index to the end of December 2015. Therefore, these actively traded stocks are likely to have sufficient information flow per day to satisfy the conditions for the Central Limit Theorem argued by the MDH (Lamoureux and Lastrapes, 1990). Since trading volume has both linear and nonlinear trends in the time series (Gallant et al., 1992; Chen et al., 2001; Mougoué and Aggarwal, 2011), we employ the following regression model to investigate the linear and nonlinear time trends:
$$Volume_t = b_0 + b_1 t + b_2 t^2 + \varepsilon_t \tag{1}$$
where $Volume_t$ is the raw trading volume, $b_0$, $b_1$ and $b_2$ are the coefficients, and $t$ and $t^2$ are the linear and quadratic time trends, respectively. The results reveal that the coefficients of both the linear and quadratic time trends are significant at the 5% level, with an average R-square of 0.1556. Besides, we also employ the Dickey-Fuller test to confirm the stationarity of the detrended trading volume, i.e., the residuals from model (1). We use this detrended trading volume as the trading volume in the subsequent analysis.
3. Empirical methodology
3.1. Correlation coefficient analysis
To investigate the contemporaneous relationships between the information flow and return volatility, we employ the Pearson Correlation Coefficient:
$$\rho(Info, Vol) = \frac{Cov(Info, Vol)}{\sqrt{Var(Info)\,Var(Vol)}} \tag{2}$$
where $Info$ and $Vol$ denote the information flow and return volatility, respectively, $Cov(Info, Vol)$ denotes the covariance between $Info$ and $Vol$, and $Var$ is the variance. $\rho(Info, Vol)$ quantifies the linear contemporaneous dependence. In particular, $Info$ is the aggregated number of news items in the trading periods of a trading day and $Vol$ is the corresponding daily return volatility. Similarly, we also investigate the contemporaneous relationships between the information flow and trading volume:
$$\rho(Info, Tv) = \frac{Cov(Info, Tv)}{\sqrt{Var(Info)\,Var(Tv)}} \tag{3}$$
where $Info$ and $Tv$ denote the information flow and trading volume, respectively, $Cov(Info, Tv)$ denotes the covariance between $Info$ and $Tv$, and $Var$ is the variance. $\rho(Info, Tv)$ quantifies the linear contemporaneous dependence. In particular, $Info$ is the aggregated number of news items in the trading periods of a trading day and $Tv$ is the corresponding daily trading volume.
Besides, in order to investigate the nonlinear contemporaneous relationships, we also employ the Spearman Correlation Coefficient, which is defined as the Pearson correlation coefficient between the ranked variables:
$$\rho_s = \frac{Cov(rg_x, rg_y)}{\sqrt{Var(rg_x)\,Var(rg_y)}} \tag{4}$$
where $\rho_s$ is the Spearman correlation coefficient, and $rg_x$ and $rg_y$ are the rank variables of the pair being compared, i.e., return volatility and information flow, or trading volume and information flow.
3.2. Granger causality test
To test the lead-lag relationships between information flow and return volatility, we construct the following Granger Causality models (Granger, 1988):
$$Vol_{i,t} = u_{Vol} + \sum_{m=1}^{p} a_m Info_{i,t-m} + \sum_{n=1}^{q} \beta_n Vol_{i,t-n} + \varepsilon_{i,t} \tag{5}$$
$$Info_{i,t} = u_{Info} + \sum_{m=1}^{p} a_m Vol_{i,t-m} + \sum_{n=1}^{q} \beta_n Info_{i,t-n} + \varepsilon_{i,t} \tag{6}$$
where $p$ and $q$ denote the lag lengths, $Vol_{i,t}$ and $Info_{i,t}$ denote the daily return volatility and information flow of stock $i$ on day $t$, respectively, $a_m$ and $\beta_n$ denote the coefficients, $u_{Vol}$ and $u_{Info}$ are the intercept terms and $\varepsilon_{i,t}$ is the regression error. The lag lengths $p$ and $q$ are chosen using the Bayesian Information Criterion and are determined separately for each individual stock. We use the KPSS test to confirm the stationarity of the time series.2
4. Empirical results
To begin with, it is necessary to set out the empirical criterion for discriminating between the MDH and the SIAH. According to the definition of the MDH (Clark, 1973; Epps and Epps, 1976; Harris, 1987), trading volume and return volatility are both driven by the same unobservable mixing variable, i.e., information flow. In that sense, the impact of information flow on trading volume should be consistent with the impact of information flow on return volatility. Besides, all investors receive the same signal and the market shifts to the ultimate equilibrium without any intermediate partial equilibrium in the process (Darrat et al., 2003; Kalev et al., 2004). Therefore, both return volatility and trading volume are proportional to the information flow in a given interval, and a positive relationship is derived. Alternatively, the SIAH posits that investors’ reactions to new information differ widely and thus there exist intermediate partial equilibria before the ultimate equilibrium. Therefore, lead-lag relationships between information flow and return volatility are derived.
4.1. Contemporaneous correlation
We begin our analysis by investigating the contemporaneous relationships between return volatility and trading volume. Table 2 reports the Pearson and Spearman correlation coefficients between return volatility and trading volume. As is clear from the table, all 40 stocks show a positive and significant contemporaneous correlation between return volatility and trading volume. In particular, the mean of the Pearson correlation coefficients is 0.4761 and the mean of the Spearman correlation coefficients is 0.4681. If we consider trading volume as the proxy for the information flow, the empirical results clearly support the prediction of the MDH.
1 Other studies also employ similar sample sizes, for example: Lamoureux and Lastrapes (1990) employ 20 stocks, Darrat et al. (2003) employ 30 stocks and Kalev et al. (2004) employ 5 stocks.
2 For brevity, we do not report the lag lengths and the KPSS test results here. The results are available upon request.
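The quadratic detrending of model (1) in Section 2.3 can be illustrated with a small sketch. It is not the authors' implementation: the function names, the hand-written least-squares solver and the toy volume series are all hypothetical, and the significance tests and the Dickey-Fuller check mentioned in the text are not reproduced.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def detrend_quadratic(volume):
    """OLS fit of Volume_t = b0 + b1*t + b2*t^2 + e_t; returns (coefficients, residuals)."""
    design = [[1.0, float(t), float(t) ** 2] for t in range(1, len(volume) + 1)]
    # Normal equations (X'X) b = X'y for the three regressors.
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * v for r, v in zip(design, volume)) for i in range(3)]
    b = solve_linear(xtx, xty)
    residuals = [v - (b[0] + b[1] * r[1] + b[2] * r[2]) for r, v in zip(design, volume)]
    return b, residuals


# Hypothetical volume series: a quadratic trend plus a small alternating wiggle.
raw_volume = [100 + 3 * t + 0.5 * t * t + (1 if t % 2 else -1) for t in range(1, 31)]
coefficients, detrended_volume = detrend_quadratic(raw_volume)
```

The residuals returned by `detrend_quadratic` play the role of the detrended trading volume used in the subsequent analysis.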
Table 2 Contemporaneous correlations between return volatility and trading volume. This table reports the contemporaneous correlation coefficients between return volatility and trading volume. Return volatility is the GARCH(1,1) volatility of Bollerslev (1986), which is directly downloaded from the RESSET Financial Research Database. Trading volume is the detrended trading volume with model (1). “Pearson” denotes the Pearson correlation coefficients and “Spearman” denotes the Spearman correlation coefficients.
Code    Pearson    Spearman     Code    Pearson    Spearman
000001  0.6563***  0.7026***    000709  0.5075***  0.5037***
000002  0.5191***  0.4543***    000729  0.1419***  0.1156***
000009  0.5723***  0.5779***    000758  0.6109***  0.5747***
000012  0.6307***  0.6242***    000778  0.4005***  0.4451***
000024  0.3946***  0.4373***    000792  0.4405***  0.3537***
000039  0.4950***  0.5212***    000800  0.6231***  0.5374***
000060  0.5997***  0.5586***    000807  0.4883***  0.4829***
000061  0.4465***  0.5120***    000825  0.4252***  0.5218***
000063  0.4739***  0.4524***    000839  0.4765***  0.5989***
000069  0.0870**   0.0870**     000858  0.3959***  0.2607***
000157  0.3900***  0.2894***    000878  0.6025***  0.5323***
000401  0.3709***  0.3291***    000898  0.5000***  0.5476***
000402  0.1755***  0.2852***    000933  0.4533***  0.4155***
000425  0.1957***  0.1909***    000937  0.2072***  0.1725***
000527  0.6328***  0.6690***    000960  0.7011***  0.5958***
000528  0.4835***  0.4254***    000983  0.5321***  0.5162***
000568  0.3896***  0.3843***    600000  0.5331***  0.4582***
000625  0.5741***  0.6230***    600005  0.4810***  0.5430***
000630  0.6635***  0.5820***    600009  0.5690***  0.6526***
000651  0.5317***  0.5576***    600010  0.6730***  0.6314***
*** and ** indicate correlation coefficient significant at 1% and 5% level, respectively.
Table 4 Contemporaneous correlations between trading volume and information flow. This table reports the contemporaneous correlations between trading volume and information flow. Trading volume is the detrended trading volume with model (1). For each stock, information flow is the sum of news items that appeared in trading periods. “Pearson” denotes the Pearson correlation coefficients and “Spearman” denotes the Spearman correlation coefficients.
Code    Pearson    Spearman     Code    Pearson    Spearman
000001  0.3299***  −0.0309      000709  0.2067***  0.2985***
000002  0.3071***  0.2343***    000729  0.0248     0.0313
000009  0.2935***  0.2130***    000758  0.5063***  0.3767***
000012  0.1462***  0.2368***    000778  0.3011***  0.2272***
000024  0.1310***  0.2055***    000792  0.0736*    0.0539
000039  0.1439***  0.1152***    000800  0.3491***  −0.0243
000060  0.5439***  0.3055***    000807  0.2773***  0.0731*
000061  −0.0503    −0.0429      000825  0.1183***  0.1246***
000063  0.1573***  0.0034       000839  0.3365***  0.1288***
000069  0.2864***  0.1321***    000858  0.2753***  0.1863***
000157  0.0395     0.1915***    000878  0.3345***  0.2078***
000401  0.4102***  0.2979***    000898  0.0143     0.1001***
000402  −0.0034    0.0183       000933  0.3413***  0.1586***
000425  0.0159     0.0025       000937  0.2593***  0.1812***
000527  0.5110***  0.1820***    000960  0.5260***  0.4170***
000528  0.2484***  0.1854***    000983  0.3782***  0.2382***
000568  0.1954***  0.1313***    600000  0.1561***  −0.0476
000625  0.0140     0.0532       600005  0.2289***  0.1186***
000630  0.4669***  0.3812***    600009  0.0167     0.0768**
000651  0.0980**   0.1425***    600010  0.3255***  0.3740***
***, ** and * indicate correlation coefficient significant at 1%, 5% and 10% level, respectively.
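As an illustration of how coefficients such as those in Tables 2 and 4 can be obtained, the sketch below computes the Pearson coefficient of Eq. (2) and the Spearman coefficient of Eq. (4) as the Pearson coefficient of ranks. The function names and the two short series are hypothetical and are not taken from the paper's data; significance levels are not computed here.

```python
import math


def pearson(x, y):
    """Pearson correlation: Cov(x, y) / sqrt(Var(x) * Var(y))."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied observations share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the ranked variables."""
    return pearson(ranks(x), ranks(y))


# Hypothetical daily news counts and volatilities for one stock.
news_counts = [3, 0, 12, 5, 7, 1, 9]
volatility = [0.011, 0.009, 0.031, 0.012, 0.020, 0.010, 0.018]
rho_pearson = pearson(news_counts, volatility)
rho_spearman = spearman(news_counts, volatility)
```

Because the Spearman coefficient depends only on ranks, it picks up any monotonic relationship, which is why the paper uses it alongside the linear Pearson measure.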
Table 3 Contemporaneous correlations between return volatility and information flow. This table reports the contemporaneous correlations between information flow and return volatility. Return volatility is the GARCH(1,1) volatility of Bollerslev (1986), which is directly downloaded from the RESSET Financial Research Database. For each stock, information flow is the sum of news items that appeared in trading periods. “Pearson” denotes the Pearson correlation coefficients and “Spearman” denotes the Spearman correlation coefficients.
Code    Pearson     Spearman     Code    Pearson     Spearman
000001  0.4587***   0.2898***    000709  −0.1220***  −0.2331***
000002  0.1710***   0.1708***    000729  0.0057      −0.0070
000009  −0.0343     0.0803**     000758  0.1767***   0.1720***
000012  0.0801      0.1871***    000778  −0.0328     −0.0561
000024  0.1948***   0.2429***    000792  −0.0589     −0.1916***
000039  0.1555***   0.2134***    000800  0.3349***   0.3117***
000060  0.2344***   0.2429***    000807  0.0686*     0.0241
000061  0.0039      −0.0310      000825  0.0345      0.0325
000063  0.1742***   0.0789**     000839  0.1904***   0.2646***
000069  0.0205      −0.0654*     000858  0.2145***   0.1655***
000157  0.0146      −0.1945***   000878  0.1291***   0.0515
000401  0.0631*     0.0405       000898  −0.1458***  −0.1539***
000402  −0.1761***  −0.1867***   000933  0.0045      −0.0278
000425  −0.1309***  −0.1946***   000937  −0.0209     −0.0536
000527  0.0840**    0.0466       000960  0.3052***   0.1903***
000528  0.0763**    0.0142       000983  0.2181***   0.2059***
000568  0.0812**    0.0306       600000  0.2637***   0.0710***
000625  −0.0455     −0.0926**    600005  0.0488      −0.1131***
000630  0.2352***   0.2010***    600009  −0.1734***  −0.1693***
000651  −0.0758**   −0.1780***   600010  0.2251***   0.2237***
Table 5 Results on MDH. This table reports the results on examining the MDH. Panels A and B give the Pearson correlation coefficients and Spearman correlation coefficients for the IF-RV and IF-TV relationships, respectively. Besides, we also construct the synthesized correlation coefficients (Panel C) by choosing the more significant correlation coefficient (the one with the smaller corresponding p-value) from the Pearson and Spearman correlation coefficients. All three correlation coefficients show that the relationship between trading volume and information flow is significantly stronger than the relationship between return volatility and information flow.
                                      Mean                  Significant  Negative  Positive
Panel A: Pearson correlation coefficients
Return volatility-Information Flow    0.0828                28           6         22
Trading volume-Information Flow       0.2316                32           0         32
Difference in mean                    −0.1488 (−6.9224)***
Panel B: Spearman correlation coefficients
Return volatility-Information Flow    0.0385                28           11        17
Trading volume-Information Flow       0.1543                30           0         30
Difference in mean                    −0.1159 (−4.3758)***
Panel C: Synthesized correlation coefficients
Return volatility-Information Flow    0.0683                34           11        23
Trading volume-Information Flow       0.2497                35           0         35
Difference in mean                    −0.1813 (−7.4712)***
*** Indicates significance at 1% level.
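The "Difference in mean" rows of Table 5 come from a t-test across the 40 stocks. The text does not state which variant of the t-test is used, so the sketch below assumes a paired test on per-stock differences; the function name is hypothetical. The five input pairs are the Pearson coefficients of the first five stocks in Tables 3 and 4, used only to show the call, so the result will not match the full-sample figures.

```python
import math


def paired_t_test(a, b):
    """Mean of (a - b) and its paired t-statistic (assumed test variant)."""
    differences = [x - y for x, y in zip(a, b)]
    n = len(differences)
    mean = sum(differences) / n
    variance = sum((d - mean) ** 2 for d in differences) / (n - 1)
    return mean, mean / math.sqrt(variance / n)


# Pearson coefficients for stocks 000001 to 000024 (Tables 3 and 4).
if_rv = [0.4587, 0.1710, -0.0343, 0.0801, 0.1948]
if_tv = [0.3299, 0.3071, 0.2935, 0.1462, 0.1310]
mean_difference, t_statistic = paired_t_test(if_rv, if_tv)
```

A negative mean difference with a large absolute t-statistic corresponds to the IF-TV correlations being larger than the IF-RV correlations.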
Note to Table 3: ***, ** and * indicate correlation coefficient significant at 1%, 5% and 10% level, respectively.
Since the MDH claims that trading volume and return volatility are both driven by the same unobservable mixing variable, i.e., information flow, we next employ Baidu News as the proxy for information flow and investigate its contemporaneous relationships with return volatility and trading volume, respectively. Table 3 reports the contemporaneous correlation coefficients for the IF-RV relationships. As is clear from the table, 6 (11) stocks show negative and significant Pearson (Spearman) contemporaneous correlation coefficients. The average Pearson (Spearman) correlation coefficient for the IF-RV relationships is 0.0828 (0.0385). Table 4 reports the contemporaneous correlation coefficients for the IF-TV relationships. As is clear from the table, no stocks show negative and significant Pearson (Spearman) contemporaneous correlation coefficients. The average Pearson (Spearman) correlation coefficient for the IF-TV relationships is 0.2316 (0.1543). The t-test shows that the correlation coefficients for the IF-TV relationships are significantly larger than the correlation coefficients for the IF-RV relationships at the 1% level (see Table 5 for details). Besides, we also construct the synthesized correlation coefficients by choosing the more significant correlation coefficient (the one with the smaller corresponding p-value) from the Pearson correlation coefficients and Spearman correlation
coefficients. Table 5 reports that the synthesized correlation coefficients for the IF-TV relationships are significantly larger than the synthesized correlation coefficients for the IF-RV relationships. To facilitate illustration, we also present the results graphically (Fig. 1). The red solid line with squares denotes the correlation coefficients for the IF-TV relationships across the 40 stocks and the blue solid line with pentagrams denotes the correlation coefficients for the IF-RV relationships across the 40 stocks. As is clear from the figure, the correlation coefficients for the IF-TV relationships are significantly larger than those for the IF-RV relationships. Generally speaking, the significant difference between the IF-RV relationships and the IF-TV relationships contradicts the prediction of the MDH in the Chinese stock market.
Fig. 1. Three alternative correlation coefficients. This figure illustrates the three alternative correlation coefficients. The red solid line with squares denotes the correlation coefficients between trading volume and information flow across the 40 stocks and the blue solid line with pentagrams denotes the correlation coefficients between return volatility and information flow across the 40 stocks. Panels A, B and C illustrate the Pearson, Spearman and Synthesized correlation coefficients, respectively. They all show that the relationship between trading volume and information flow is significantly stronger than the relationship between return volatility and information flow. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
4.2. Granger causality
In this section, we further investigate Granger causality for the IF-RV relationship. For each individual stock, we use the KPSS test to confirm stationarity and the Bayesian Information Criterion to select the lag lengths p and q. As is clear from Table 6, there exists significant causality from information flow to return volatility in at least 25 stocks. The reverse causality from return volatility to information flow is generally weaker, but achieves statistical significance in 7 stocks. Therefore, more than half of the 40 stocks show evidence of significant causality between information flow and return volatility in one direction or the other, as predicted by the SIAH.
4.3. Robustness
To ensure that the previous findings are not driven by the measurement of return volatility or the selection of the sample period, we use an alternative measurement of return volatility and subsample analysis as robustness checks. In particular, we use the opening, closing, highest and lowest stock prices to calculate the range-based volatility for each individual stock (Garman and Klass, 1980):
$$V_{i,t}^2 = 0.5\,hl_{i,t}^2 - (2\ln 2 - 1)\,oc_{i,t}^2 \tag{7}$$
where $hl_{i,t}$ is the difference in natural logarithms of the highest and lowest prices for individual stock $i$ on day $t$, $oc_{i,t}$ is the difference in
Table 6 Granger-causality between return volatility and information flow. This table reports the Granger-causality tests between return volatility (V2) and information flow (IF). Each entry is the F-statistic, with the critical value from the F-distribution in parentheses. If the F-statistic is greater than the critical value, the null hypothesis of no Granger causality is rejected at the given significance level. The lag lengths p and q are chosen using the Bayesian Information Criterion and are determined separately for each individual stock. We use the KPSS test to confirm the stationarity of the time series.

| Code | Null: IF does not Granger-cause V2 | Null: V2 does not Granger-cause IF | Code | Null: IF does not Granger-cause V2 | Null: V2 does not Granger-cause IF |
|---|---|---|---|---|---|
| 000001 | 23.54 (6.67) *** | 1.37 (6.67) | 000709 | 18.37 (3.35) *** | 3.47 (6.67) |
| 000002 | 42.23 (6.67) *** | 4.51 (6.67) | 000729 | 0.83 (6.67) | 4.12 (6.67) |
| 000009 | 70.94 (6.67) *** | 15.77 (4.64) *** | 000758 | 16.48 (6.67) *** | 2.30 (6.67) |
| 000012 | 11.34 (6.67) *** | 0.21 (6.67) | 000778 | 14.21 (6.67) *** | 3.27 (6.67) |
| 000024 | 66.92 (6.67) *** | 2.57 (6.67) | 000792 | 18.74 (3.04) *** | 5.30 (4.64) *** |
| 000039 | 1.09 (6.67) | 0.72 (6.67) | 000800 | 10.92 (3.04) *** | 9.63 (4.64) *** |
| 000060 | 44.64 (4.64) *** | 46.48 (4.64) *** | 000807 | 3.69 (6.67) | 2.87 (6.67) |
| 000061 | 0.07 (4.64) | 0.10 (6.67) | 000825 | 0.13 (6.67) | 0.45 (6.67) |
| 000063 | 13.53 (6.67) *** | 2.39 (6.67) | 000839 | 54.73 (3.04) *** | 5.84 (6.67) |
| 000069 | 4.65 (6.67) | 0.20 (6.67) | 000858 | 12.83 (6.67) *** | 1.41 (6.67) |
| 000157 | 1.24 (6.67) | 0.09 (6.67) | 000878 | 17.32 (3.81) *** | 29.52 (4.64) *** |
| 000401 | 27.65 (6.67) *** | 2.37 (6.67) | 000898 | 1.78 (6.67) | 4.68 (6.67) |
| 000402 | 0.32 (6.67) | 1.66 (6.67) | 000933 | 4.47 (6.67) | 2.70 (6.67) |
| 000425 | 0.17 (6.67) | 2.22 (6.67) | 000937 | 6.81 (6.67) *** | 1.36 (6.67) |
| 000527 | 9.18 (6.67) *** | 1.33 (6.67) | 000960 | 25.27 (6.67) *** | 6.66 (4.64) *** |
| 000528 | 0.70 (6.67) | 1.45 (6.67) | 000983 | 20.94 (6.67) *** | 2.14 (6.67) |
| 000568 | 7.21 (4.64) *** | 4.79 (6.67) | 600000 | 13.28 (6.67) *** | 1.74 (6.67) |
| 000625 | 0.58 (6.67) | 0.68 (6.67) | 600005 | 9.33 (6.67) *** | 5.10 (6.67) |
| 000630 | 17.22 (6.67) *** | 5.40 (6.67) | 600009 | 0.65 (6.67) | 15.12 (6.67) *** |
| 000651 | 3.58 (6.67) | 1.02 (6.67) | 600010 | 17.77 (6.67) *** | 1.79 (6.67) |

*** Indicates Granger-causality test significant at 1% level.
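The decision rule in Table 6 (compare an F-statistic with a critical value) can be illustrated with a minimal sketch of a standard bivariate Granger F-test. This is not necessarily the authors' exact specification: the function names `ols_rss` and `granger_f`, the simulated series, and the fixed lag lengths p = q = 1 are all assumptions made for illustration, and the BIC-based lag selection described in the table note is omitted.

```python
import random

def ols_rss(X, y):
    """Residual sum of squares from an OLS fit of y on the columns of X."""
    k = len(X[0])
    # Augmented normal equations [X'X | X'y], solved by Gauss-Jordan elimination.
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(X, y))

def granger_f(y, x, p, q):
    """F-statistic for H0: lags 1..q of x add nothing to an AR(p) model of y."""
    m = max(p, q)
    times = range(m, len(y))
    target = [y[t] for t in times]
    restricted = [[1.0] + [y[t - i] for i in range(1, p + 1)] for t in times]
    unrestricted = [row + [x[t - j] for j in range(1, q + 1)]
                    for row, t in zip(restricted, times)]
    rss_r = ols_rss(restricted, target)
    rss_u = ols_rss(unrestricted, target)
    dof = len(target) - (1 + p + q)  # observations minus estimated parameters
    return ((rss_r - rss_u) / q) / (rss_u / dof)

# Simulated (hypothetical) data in which x leads y by one period.
random.seed(0)
x = [random.gauss(0, 1) for _ in range(500)]
y = [0.0] + [0.8 * x[t - 1] + random.gauss(0, 1) for t in range(1, 500)]

CRITICAL_1PCT = 6.67  # critical value quoted in Table 6 for most stocks
f_xy = granger_f(y, x, p=1, q=1)  # null: x does not Granger-cause y
f_yx = granger_f(x, y, p=1, q=1)  # null: y does not Granger-cause x
print(f_xy > CRITICAL_1PCT, f_yx > CRITICAL_1PCT)
```

The null is rejected whenever the statistic exceeds the critical value, which is how each starred entry in Table 6 is read.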
Table 7 Results on MDH with range-based volatility and subsample analysis. This table reports the robustness test on MDH with range-based volatility and subsample analysis. Panel A reports the full sample analysis with the range-based volatility. Panels B and C report the subsample analyses with the range-based volatility.

Panel A: 2010/6/1–2013/7/1

| | Mean | Significant | Negative | Positive |
|---|---|---|---|---|
| Panel A1: Pearson correlation coefficients | | | | |
| Return volatility and Information Flow | 0.1462 | 33 | 4 | 29 |
| Trading volume and Information Flow | 0.2316 | 32 | 0 | 32 |
| Difference in mean | −0.0854 (−3.8502)*** | | | |
| Panel A2: Spearman correlation coefficients | | | | |
| Return volatility and Information Flow | 0.0885 | 33 | 4 | 29 |
| Trading volume and Information Flow | 0.1543 | 30 | 0 | 30 |
| Difference in mean | −0.0658 (−3.2994)*** | | | |
| Panel A3: Synthesized correlation coefficients | | | | |
| Return volatility and Information Flow | 0.1474 | 35 | 6 | 29 |
| Trading volume and Information Flow | 0.2497 | 35 | 0 | 35 |
| Difference in mean | −0.1023 (−4.7614)*** | | | |

Panel B: 2010/6/1–2012/1/31

| | Mean | Significant | Negative | Positive |
|---|---|---|---|---|
| Panel B1: Pearson correlation coefficients | | | | |
| Return volatility and Information Flow | 0.0522 | 20 | 4 | 16 |
| Trading volume and Information Flow | 0.2458 | 32 | 0 | 32 |
| Difference in mean | −0.1936 (−8.5705)*** | | | |
| Panel B2: Spearman correlation coefficients | | | | |
| Return volatility and Information Flow | 0.0527 | 22 | 6 | 16 |
| Trading volume and Information Flow | 0.1625 | 28 | 0 | 28 |
| Difference in mean | −0.1098 (−4.4674)*** | | | |
| Panel B3: Synthesized correlation coefficients | | | | |
| Return volatility and Information Flow | 0.0678 | 26 | 6 | 20 |
| Trading volume and Information Flow | 0.2501 | 32 | 0 | 32 |
| Difference in mean | −0.1823 (−6.8991)*** | | | |

Panel C: 2012/2/1–2013/7/1

| | Mean | Significant | Negative | Positive |
|---|---|---|---|---|
| Panel C1: Pearson correlation coefficients | | | | |
| Return volatility and Information Flow | 0.1178 | 24 | 3 | 21 |
| Trading volume and Information Flow | 0.2455 | 32 | 0 | 32 |
| Difference in mean | −0.1277 (−5.2108)*** | | | |
| Panel C2: Spearman correlation coefficients | | | | |
| Return volatility and Information Flow | 0.0951 | 24 | 5 | 19 |
| Trading volume and Information Flow | 0.1501 | 29 | 0 | 29 |
| Difference in mean | −0.0550 (−2.2281)** | | | |
| Panel C3: Synthesized correlation coefficients | | | | |
| Return volatility and Information Flow | 0.1082 | 27 | 6 | 21 |
| Trading volume and Information Flow | 0.2491 | 32 | 0 | 32 |
| Difference in mean | −0.1408 (−5.7333)*** | | | |

*** Indicates Granger-causality test significant at 1% level.
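The comparison in Table 7 rests on per-stock correlation coefficients and a test on the difference between their means. The following is a rough sketch of how such quantities could be computed. The helper names, the toy per-stock coefficients, and in particular the use of a paired t-statistic for the "Difference in mean" row are assumptions for illustration; the text here does not state which test the authors used, and the "synthesized" coefficient is not covered.

```python
import math
from statistics import mean, stdev

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def spearman(a, b):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))

def paired_t(d):
    """t-statistic for H0: the mean of the per-stock differences d is zero."""
    return mean(d) / (stdev(d) / math.sqrt(len(d)))

# Hypothetical per-stock correlation coefficients (not taken from the paper).
rv_if = [0.10, 0.05, 0.20, -0.02, 0.15]   # return volatility vs. information flow
tv_if = [0.25, 0.20, 0.30, 0.18, 0.22]    # trading volume vs. information flow
diff = [a - b for a, b in zip(rv_if, tv_if)]
print(mean(diff), paired_t(diff))
```

A negative mean difference with a large negative t-statistic corresponds to the pattern reported in the "Difference in mean" rows, where the IF-TV correlation exceeds the IF-RV correlation.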
natural logarithms of the opening and closing prices for individual stock i on day t, and ln2 denotes the square of natural logarithms. We re-estimate models (1)–(6) with this range-based volatility. Panel A of Table 7 reports the full-sample results on the MDH and shows that the correlation coefficients for the IF-TV relationship are significantly larger than those for the IF-RV relationship. Besides, there exist significant and negative correlation coefficients for the IF-RV relationship, which contradicts the prediction of the MDH. Panel A of Table 8 reports that more than half of the 40 stocks show evidence of significant causality between information flow and return volatility in one way or another. Therefore, the empirical results support the prediction of the SIAH.
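As a rough illustration of a range-based volatility measure built from daily high, low, open and close prices, the sketch below uses the widely cited form of the Garman and Klass (1980) estimator, 0.5·hl² − (2 ln 2 − 1)·oc², with ln 2 read as the natural logarithm of 2. That reading, the function name, and the example prices are assumptions; the paper's own equation is not reproduced here and may differ.

```python
import math

def range_based_variance(open_, high, low, close):
    """Daily variance estimate from OHLC prices (assumed Garman-Klass form)."""
    hl = math.log(high) - math.log(low)     # log range of the day
    oc = math.log(close) - math.log(open_)  # open-to-close log return
    return 0.5 * hl ** 2 - (2 * math.log(2) - 1) * oc ** 2

# Hypothetical prices for one stock on one day.
v = range_based_variance(open_=10.00, high=10.50, low=9.80, close=10.20)
print(v)
```

Because the open-to-close term enters squared, the order in which the two prices are differenced does not affect the result.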
Bergemann, D., Heumann, T., Morris, S., 2015. Information and volatility. J. Econ. Theory 158 (Part B), 427–465.
Bohl, M.T., Henke, H., 2003. Trading volume and stock market volatility: the Polish case. Int. Rev. Financ. Anal. 5, 513–525.
Bollerslev, T., 1986. Generalized autoregressive conditional heteroskedasticity. J. Econ. 31, 307–327.
Chen, G.M., Firth, M., Rui, O.M., 2001. The dynamic relation between stock returns, trading volume, and volatility. Financ. Rev. 36 (3), 153–174.
Clark, P.K., 1973. A subordinated stochastic process model with finite variance for speculative prices. Econometrica 41, 135–155.
Copeland, T.E., 1976. A model of asset trading under the assumption of sequential information arrival. J. Financ. 31, 1149–1168.
Da, Z., Engelberg, J., Gao, P., 2011. In search of attention. J. Financ. 5, 1461–1499.
Darrat, A.F., Rahman, S., Zhong, M., 2003. Intraday trading volume and return volatility of the DJIA stocks: a note. J. Bank. Financ. 27, 2035–2043.
Darrat, A.F., Zhong, M., Cheng, L.T.W., 2007. Intraday volume and volatility relations with and without public news. J. Bank. Financ. 31, 2711–2729.
Davies, P.L., Canes, M., 1978. Stock prices and the publication of second-hand information. J. Bus. 51 (1), 43–56.
Edelman, B., 2012. Using internet data for economic research. J. Econ. Perspect. 2, 189–206.
Epps, T.W., Epps, M.L., 1976. The stochastic dependence of security price changes and transaction volumes: implications for the mixture-of-distributions hypothesis. Econometrica 44, 305–321.
Fleming, J., Kirby, C., Ostdiek, B., 2006. Stochastic volatility, trading volume, and the daily flow of information. J. Bus. 79, 1551–1590.
French, K.R., Roll, R., 1986. Stock return variances: the arrival of information and the reaction of traders. J. Financ. Econ. 17, 5–26.
Gallant, A., Rossi, P., Tauchen, G., 1992. Stock prices and volume. Rev. Financ. Stud. 5 (2), 199–242.
Garman, M.B., Klass, M.J., 1980. On the estimation of security price volatilities from historical data. J. Bus. 53, 67–78.
Glosten, L.R., Milgrom, P.R., 1985. Bid, ask and transaction prices in a specialist market with heterogeneously informed traders. J. Financ. Econ. 14, 71–100.
Granger, C.W.J., 1988. Some recent development in a concept of causality. J. Econom. 39 (1–2), 199–211.
Grossman, S.J., Stiglitz, J.E., 1980. On the impossibility of informationally efficient markets. Am. Econ. Rev. 70, 393–408.
Harris, L., 1987. Transaction data tests of the mixture of distributions hypothesis. J. Financ. Quant. Anal. 22, 127–141.
Jennings, R.H., Starks, L.T., Fellingham, J.C., 1981. An equilibrium model of asset trading with sequential information arrival. J. Financ. 36, 143–161.
Kalev, P.S., Liu, W.-M., Pham, P.K., Jarnecic, E., 2004. Public Information arrival and volatility of intraday stock returns. J. Bank. Financ. 28, 1441–1467.
Kyle, A.S., 1985. Continuous auctions and insider trading. Econometrica 53, 1315–1335.
Lamoureux, C.G., Lastrapes, W.D., 1990. Heteroskedasticity in stock return data: volume versus GARCH effects. J. Financ. 45, 221–229.
Le, V., Zurbruegg, R., 2010. The role of trading volume in volatility forecasting. J. Int. Financ. Mark. Inst. Money 20, 533–555.
Lee, C.F., Rui, O.M., 2000. Does trading volume contain information to predict stock returns? Evidence from China's stock markets. Rev. Quant. Financ. Account. 14, 341–360.
Li, X., Shen, D., Xue, M., Zhang, W., 2017. Daily happiness and stock returns: the case of Chinese company listed in the United States. Econ. Model. 64, 496–501.
Liu, L.X., Shu, H., Wei, K.C.J., 2017. The impacts of political uncertainty on asset prices: evidence from the Bo scandal in China. J. Financ. Econ. 125 (2), 286–310.
Mougoué, M., Aggarwal, R., 2011. Trading volume and exchange rate volatility: evidence for the sequential arrival of information hypothesis. J. Bank. Financ. 35 (10), 2690–2703.
Park, B.-J., 2010. Surprising information, the MDH, and the relationship between volatility and trading volume. J. Financ. Mark. 13, 344–366.
Qiu, L., Welch, I., 2004. Investor Sentiment Measures. National Bureau of Economic Research.
Ross, S.A., 1989. Information and volatility: the no-arbitrage martingale approach to timing and resolution irrelevancy. J. Financ. 44, 1–17.
Shen, D., Li, X., Zhang, W., 2017. Baidu news coverage and its impacts on order imbalance and large-size trade of Chinese stocks. Financ. Res. Lett. http://dx.doi.org/10.1016/j.frl.2017.06.008.
Shen, D., Zhang, W., Xiong, X., Li, X., Zhang, Y., 2016. Trading and non-trading period Internet information flow and intraday return volatility. Phys. A: Stat. Mech. Appl. 451, 519–524.
Smirlock, M., Starks, L., 1988. An empirical analysis of the stock price-volume relationship. J. Bank. Financ. 12, 31–41.
Wagner, N., Marsh, T.A., 2005. Surprise volume and heteroskedasticity in equity market returns. Quant. Financ. 5, 153–168.
Zhang, W., Li, X., Shen, D., Teglio, A., 2016. R2 and idiosyncratic volatility: which captures the firm-specific return variation? Econ. Model. 55, 298–304.
Zhang, W., Shen, D., Zhang, Y., Xiong, X., 2013a. Open source information, investor attention, and asset pricing. Econ. Model. 33, 613–619.
Zhang, Y., Feng, L., Jin, X., Shen, D., Xiong, X., Zhang, W., 2014. Internet information arrival and volatility of SME PRICE INDEX. Phys. A: Stat. Mech. Appl. 399, 70–74.
Table 8 Results on SIAH with range-based volatility and subsample analysis. This table reports the robustness test on SIAH with range-based volatility and subsample analysis. Panel A reports the full sample analysis with the range-based volatility. Panels B and C report the subsample analyses with the range-based volatility. “IF→V2” denotes that information flow can Granger-cause the changes of return volatility. “V2→IF” denotes that return volatility can Granger-cause the changes of information flow. “IF∞V2” denotes the bi-directional causality and “IF—V2” denotes the causality in either direction.

| Relationship | Panel A: 2010/6/1–2013/7/1 | Panel B: 2010/6/1–2012/1/31 | Panel C: 2012/2/1–2013/7/1 |
|---|---|---|---|
| IF→V2 | 13 | 11 | 14 |
| V2→IF | 6 | 8 | 7 |
| IF∞V2 | 5 | 6 | 6 |
| IF—V2 | 24 | 25 | 27 |
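The categories in Table 8 can be read as a simple tally over per-stock test outcomes. The sketch below assumes that the three directional categories are mutually exclusive and that “IF—V2” is their sum, which is consistent with the column totals (for example 13 + 6 + 5 = 24). The function name `classify` is hypothetical, and the four input rows are taken from Table 6 purely as an illustration of the bookkeeping, not to reproduce the Table 8 counts.

```python
from collections import Counter

def classify(f_if_to_v, crit_if_to_v, f_v_to_if, crit_v_to_if):
    """Label one stock by which no-causality nulls are rejected."""
    if_causes_v = f_if_to_v > crit_if_to_v
    v_causes_if = f_v_to_if > crit_v_to_if
    if if_causes_v and v_causes_if:
        return "IF∞V2"   # bi-directional causality
    if if_causes_v:
        return "IF→V2"   # information flow leads volatility only
    if v_causes_if:
        return "V2→IF"   # volatility leads information flow only
    return "none"

# (F, critical value) pairs for both directions, copied from four rows of Table 6.
rows = {
    "000001": (23.54, 6.67, 1.37, 6.67),
    "000039": (1.09, 6.67, 0.72, 6.67),
    "000060": (44.64, 4.64, 46.48, 4.64),
    "600009": (0.65, 6.67, 15.12, 6.67),
}
labels = {code: classify(*stats) for code, stats in rows.items()}
counts = Counter(labels.values())
either = counts["IF→V2"] + counts["V2→IF"] + counts["IF∞V2"]  # the "IF—V2" row
print(labels, either)
```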
Following Mougoué and Aggarwal (2011), we also examine the sensitivity of our findings to major events that might have caused a structural break, such as the political uncertainty event in February 2012 (Liu et al., 2017). In particular, we construct the subsamples by splitting the full sample around February 2012. Panels B and C of Table 7 and Panels B and C of Table 8 report the results of the subsample analysis. They all support our previous findings.

5. Conclusions

In this paper, we employ the search frequency of stock names in Baidu News as the proxy for information flow and investigate competing hypotheses on the relationships between information flow and return volatility in the Chinese stock market. We mainly find that: (1) trading volume and return volatility are not driven by the same variable, i.e., the information flow, which contradicts the prediction of the MDH; (2) there exist significant lead-lag relationships between information flow and return volatility, which is in accordance with the SIAH; (3) these findings are robust to an alternative measurement of return volatility and to subsample analysis. Generally speaking, these findings contradict the prediction of the MDH and support the SIAH.

This paper is in line with the literature on the utilization of internet data for financial and economic analysis (Da et al., 2011; Edelman, 2012; Zhang et al., 2013a; Li et al., 2017). It is notable that the quantified information flow could provide new perspectives on investor psychological biases as well as market over- and under-reactions. Besides, much work needs to be focused on distinguishing the information from the noise. We leave this for future research.
Acknowledgments

An earlier version of this paper was circulated as “An Empirical Analysis of the Information-Volatility Relationship with Internet Information” and was presented at the 2016 International Conference on Applied Financial Economics and the 23rd International Conference on Computing in Economics and Finance. Helpful comments from participants are gratefully acknowledged. This work is supported by the National Natural Science Foundation of China (71701150, 71532009 and 71320107003). We thank Prof. Xiaoneng Zhu (Associate Editor) and four anonymous referees for useful suggestions that helped us improve the paper substantially. Any errors are ours.

References

Andersen, T.G., 1996. Return volatility and trading volume: an information flow interpretation of stochastic volatility. J. Financ. 51, 169–204.