China Economic Review 57 (2019) 101344
Contents lists available at ScienceDirect
China Economic Review journal homepage: www.elsevier.com/locate/chieco
Does renaming promote economic development? New evidence from a city-renaming reform experiment in China
Jing Guo, Zhengyu Zhang ⁎
School of Economics, Shanghai University of Finance and Economics, No 777, Guoding Road, Yangpu District, Shanghai 200433, China
ARTICLE INFO
ABSTRACT
Keywords: City-renaming reform; Synthetic control method; Panel data approach; Machine learning method; Treatment effects; Statistical inference
To explore the impact of city-renaming reform on economic growth, we compare the empirical performance of the synthetic control method, the panel data approach, and machine learning methods (LASSO and elastic net) using the case of Xiangyang, which was officially renamed in 2010. We find that for the data on real GDP growth, the panel data approach performs best under the criteria for evaluating the quality of a model. The estimation results show that Xiangyang's real GDP growth rate rose by about 1.43% annually after the renaming reform. However, further discussions show that the annual growth rate of the tertiary industry decreased by 1.59%, which contradicts the mechanism of the brand effect of the reform. The statistical inference demonstrates that even if a city did not implement the city-renaming reform in 2010, the probability of obtaining an effect as large as Xiangyang's would be 25.9%. Therefore, the effect of the city-renaming reform is insignificant, and other policy interventions, rather than the city-renaming reform, promoted economic growth in Xiangyang. In summary, policymakers cannot win a “Promotion Tournament” by renaming cities.
JEL classification: C23 C51 C53 R11
1. Introduction

In post-reform China, officials who encourage local economic growth are likely to be promoted (Li & Zhou, 2005). Therefore, local governments across China have launched a GDP-stimulating competition known as the “Promotion Tournament” (Zhou, 2007). Officials use various methods to promote local economic development, including city-renaming reforms.

Why would renaming promote economic growth? The name of a city is like an advertisement and can have a certain brand effect. For a company, the brand effect improves the marketing performance of products (Keller, 1993) and increases firm value (Aaker & Jacobson, 2001; Michell, King, & Reast, 2001). Similarly, the city brand represents the image of a city, and increasing the brand effect of the city may promote its economic growth. Thus, officials in a city have incentives to rename it (or restore its previous name) to promote local economic development.

Xiangyang is located in the northwest of Hubei Province on the Han River. With a history of more than 2800 years, it is a famous military and cultural center as well as an important member of the urban agglomeration in the middle reaches of the Yangtze River and a sub-central city of Hubei Province. However, after 1949, because of the merger of Xiangyang and Fancheng, the first characters of the two cities' names were combined and the newly merged city was named Xiangfan. For various reasons, such as retaining the historical and cultural connotations of the city name and promoting economic growth, Xiangfan was officially renamed Xiangyang on December 9, 2010. Hence, Xiangyang is a good case study for examining the treatment effect of a city-renaming reform. Lu, Wu, and Xie (2018) show that this city-renaming reform has increased annual nighttime light images in Xiangyang by 10.1%.
Corresponding author. E-mail addresses:
[email protected] (J. Guo),
[email protected] (Z. Zhang).
https://doi.org/10.1016/j.chieco.2019.101344 Received 9 October 2018; Received in revised form 2 July 2019; Accepted 25 August 2019 Available online 27 August 2019 1043-951X/ © 2019 Elsevier Inc. All rights reserved.
However, this effect may have arisen from other factors. For example, on April 25, 2010, the Xiangyang Economic and Technological Development Zone was approved by the State Council. The establishment of such zones can significantly promote economic growth (Yang, Zhou, & Zou, 2017). On May 4, 2012, the Hubei Provincial Government also approved the “Overall Plan for the Construction of the Xiangyang Dongjin New District,” and the Dongjin New District became the first urban new district approved by the Hubei provincial government. Subsequently, the large-scale construction of infrastructure such as roads and bridges began. In this study, we find that annual real GDP growth in Xiangyang rose by 1.43% after the city-renaming reform. But by using statistical inferences, we demonstrate that the effect of the city-renaming reform was insignificant. Therefore, this effect was caused by other interventions rather than the reform itself. To examine the impact of renaming on Xiangyang's real GDP growth rate after 2010, we need to estimate the counterfactuals of this rate in the absence of the policy intervention. Here, the treatment effects of the policy intervention are the difference between the actual observed real GDP growth rates and counterfactuals. However, the fundamental problem in treatment effect analysis is the difficulty of estimating the counterfactuals. The parametric method is the simplest and most commonly used method to estimate counterfactuals. For example, to study the impact of smoking on lifespan, we use the following parametric model:
$$ Y_i = \gamma_0 + X_i'\beta + \rho D_i + \epsilon_i \qquad (1) $$
where Yi is lifespan, γ0 is a constant, Xi is a vector of the covariates that affect lifespan, β is a vector of the coefficients, Di is the endogenous dummy variable (Di = 1 if i smokes and Di = 0 otherwise), the coefficient ρ is the treatment effect, and ϵi is the idiosyncratic error term with E(ϵi) = 0. If individual i smokes, we can only observe lifespan under smoking. Therefore, we need to estimate the counterfactual of lifespan in the absence of smoking. The difference between them is the effect of smoking on lifespan, which is the coefficient ρ in this case. Ordinary least squares (OLS) applied to (1), however, may give biased estimates of ρ because of omitted variables. The most common way to address this problem is the instrumental variable technique, which assumes that the instrumental variable has the same impact on each individual in the sample (Imbens & Angrist, 1994). Such an assumption is unlikely to hold because lifestyle and genetic differences lead to different impacts of the instrumental variable on different individuals.

The synthetic control method (Abadie, Diamond, & Hainmueller, 2010; Abadie & Gardeazabal, 2003), panel data approach (Hsiao, Ching, & Wan, 2012), and machine learning method¹ (LASSO, Tibshirani (1996); elastic net, Zou and Hastie (2005)) can be considered as alternatives to the parametric method to estimate the counterfactuals for program evaluation. In the Appendix, we compare the performance of these three methods using a simulation experiment and provide guidance for empirical studies. The advantage of the three methods is that they use data-driven procedures to select suitable control groups rather than relying on the researcher's discretion, forcing users to show the affinities between the treated and untreated units using the observed variables. Therefore, they have recently been widely used in program evaluation.
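The omitted variable problem noted for Eq. (1) can be illustrated with a small simulation. This is only a sketch under invented assumptions: the data-generating process, the true effect `TRUE_RHO`, and the unobserved trait `u` are hypothetical, the covariates Xi are left out for brevity, and nothing here uses the paper's data.

```python
import random


def ols_slope(x, y):
    """Slope from a simple regression of y on x with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov_xy / var_x


rng = random.Random(0)   # fixed seed so the sketch is reproducible
TRUE_RHO = -5.0          # assumed true effect of smoking on lifespan (years)

smokes, lifespan = [], []
for _ in range(5000):
    u = rng.gauss(0.0, 1.0)                        # unobserved trait, omitted below
    d = 1 if u + rng.gauss(0.0, 1.0) > 0 else 0    # trait raises the chance of smoking
    y = 75.0 + TRUE_RHO * d - 3.0 * u + rng.gauss(0.0, 1.0)  # trait also shortens life
    smokes.append(d)
    lifespan.append(y)

# Regressing lifespan on the smoking dummy alone attributes part of u's effect to smoking.
rho_hat = ols_slope(smokes, lifespan)
print(f"true rho = {TRUE_RHO}, OLS estimate = {rho_hat:.2f}")
```

Because the omitted trait is positively related to smoking and negatively related to lifespan in this toy setup, the OLS estimate overstates the harm of smoking, which is the kind of bias that motivates the alternative methods discussed in the text.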
The idea of the synthetic control method is to assign an appropriate weight to each control unit based on the similarity of the preintervention characteristics between the treated unit and control units. To avoid extrapolation, the synthetic control method requires each weight to be positive and sum to one. Abadie and Gardeazabal (2003) were the first to propose the synthetic control method in a study of terrorism in the Basque Country. Abadie et al. (2010) later perfected the theory of the synthetic control method to study the effects of California's tobacco control program. Gardeazabal and Vega-Bayo (2016) conclude that “the synthetic control method results in a post-treatment mean squared error, mean absolute percentage error, and mean error with a smaller interquartile range, whenever there is a good enough match.” However, we show in the Appendix that the conclusion in Gardeazabal and Vega-Bayo (2016) is not convincing. In addition, Section 3 shows that the fit before the city-renaming reform made by the synthetic control method is very poor. Abadie et al. (2010) argue that the counterfactuals constructed by the synthetic control method are unsuitable if the fit is poor. Hence, we consider the panel data approach as an alternative to evaluate the impact of renaming on economic growth in Xiangyang. The panel data approach has two advantages. First, only outcome variables are used to assign weights to control cities. The covariates that explain the outcome variables are excluded. Second, the panel data approach allows for traditional inference by using the asymptotic distribution of the average treatment effect derived by Li and Bell (2017). Consequently, it is a suitable alternative to the synthetic control method. In recent studies, the panel data approach has been widely used for policy evaluation (Bai, Li, & Ouyang, 2014; Du & Zhang, 2015; Ouyang & Peng, 2015). 
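The weight restriction behind the synthetic control method (non-negative weights that sum to one) can be sketched as a constrained least-squares problem. The code below is a simplified illustration, not the estimator of Abadie et al. (2010): it matches pre-intervention outcomes only (no predictors and no predictor-weighting matrix), uses projected gradient descent as a stand-in solver, and runs on made-up series `c1`, `c2`, `c3`.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_j >= 0, sum_j w_j = 1}."""
    u = sorted(v, reverse=True)
    running_sum, theta = 0.0, 0.0
    for j, u_j in enumerate(u, start=1):
        running_sum += u_j
        candidate = (running_sum - 1.0) / j
        if u_j - candidate > 0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]


def synthetic_control_weights(treated, controls, n_iter=20000):
    """Projected gradient descent for min_w sum_t (treated[t] - sum_j w[j] * controls[j][t])^2."""
    n_controls = len(controls)
    n_periods = len(treated)
    # Conservative step size: 1 / (2 * sum of squared control outcomes).
    step = 1.0 / (2.0 * sum(x * x for series in controls for x in series))
    w = [1.0 / n_controls] * n_controls
    for _ in range(n_iter):
        residuals = [
            treated[t] - sum(w[j] * controls[j][t] for j in range(n_controls))
            for t in range(n_periods)
        ]
        gradient = [
            -2.0 * sum(residuals[t] * controls[j][t] for t in range(n_periods))
            for j in range(n_controls)
        ]
        w = project_to_simplex([w[j] - step * gradient[j] for j in range(n_controls)])
    return w


# Hypothetical pre-intervention growth rates for three control units.
c1 = [10.0, 12.0, 9.0, 11.0, 13.0, 8.0]
c2 = [9.0, 9.0, 12.0, 10.0, 8.0, 13.0]
c3 = [11.0, 8.0, 10.0, 13.0, 9.0, 12.0]
# The toy treated unit is built as 0.6 * c1 + 0.4 * c3, so it lies inside the convex hull.
treated = [0.6 * a + 0.4 * c for a, c in zip(c1, c3)]

weights = synthetic_control_weights(treated, [c1, c2, c3])
print([round(w, 3) for w in weights])
```

Since the toy treated series lies inside the convex hull of the controls, the recovered weights should be close to 0.6, 0, and 0.4. When the treated unit lies far outside the convex hull, no weights on the simplex can reproduce it, which is the poor-fit situation the text describes.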
Social scientists often focus on the relationship between the response and covariates, so a parsimonious model is preferred (Tibshirani, 1996; Zou & Hastie, 2005). In the Appendix, we find that the machine learning method selects too many control units. Therefore, it is mainly used for checking the robustness of results.

By using the panel data approach, we show that Xiangyang's real GDP growth rate rose by 1.43% annually after the city-renaming reform, which is significant at the 1% level. Then, by adding predictors, weighing the estimation bias and prediction error, and measuring economic growth using per capita real GDP, we demonstrate that this result is robust. However, further discussions show that the impact of the city-renaming reform on the tertiary industry is negative, which contradicts the proposed mechanism of the reform, namely that the historical and cultural connotations contained in the city name can improve the image of the city. In turn, the image of the city is expected to directly influence the decision-making of consumers, especially in the tourism industry (Chen & Tsai, 2007; Gallarza, Saura, & Garcia, 2002; Zhang, Fu, Cai, & Lu, 2014).

Are the results we obtain driven by other policy interventions? To answer this question, we use the inferential technique proposed by Abadie et al. (2010), which is similar to a permutation test, to construct a distribution of the average treatment effects and determine whether the effect of the city-renaming reform is significant. The statistical inference demonstrates that if we were to assign the city-renaming reform to a city in the donor pool randomly, the probability of obtaining a treatment effect as large as Xiangyang's would be 25.9%. Therefore, the estimated effect for Xiangyang is small relative to the distribution of the effects estimated for other cities. These findings suggest that the effect of the city renaming is insignificant. Therefore, other policy interventions rather than the city-renaming reform have promoted economic growth in Xiangyang after 2010. We also study the effect of the city-renaming reform on economic growth in Pu'er and obtain the same results.

Our study makes three potential contributions. First, when researchers use different methods developed in the literature for policy evaluation, they can follow the procedures in this paper. Second, we show that renaming a city is not an effective way for policymakers to win a “Promotion Tournament.” Third, our analysis warns that statistical inference must be carried out when using the synthetic control or panel data methods to evaluate policy interventions; otherwise, the opposite conclusion may be drawn.

The remainder of this paper is organized as follows. Section 2 describes the data used in the analysis and discusses the shortcomings of the intervention time series analysis approach. In Section 3, we empirically compare the three approaches to select the appropriate prediction method. Section 4 presents the estimation results. Section 5 discusses the statistical inferences. Concluding remarks are in Section 6. The introduction of the methods employed in this paper and Monte Carlo results are shown in the Appendix.

¹ Athey (2018) and Huang and Yu (2018) review the application of the machine learning method in economics.

2. Data description

The Decision of the Central Committee of the Communist Party of China on the Reform of the Economic Structure in October 1984 rejected the traditional concept that the planned economy was opposed to the commodity economy and then gradually broke the dominance of the planned economy. In addition, the State Council promulgated the Regulations on the Administration of Geographical Names on January 23, 1986, which formed a relatively complete regulatory system for the administration of geographical names. We therefore use a sample period beginning in 1986 to compare three program evaluation methods: the synthetic control method, panel data approach, and machine learning method (LASSO and elastic net).

The Southern Tour Speech of Deng Xiaoping pointed out that socialist countries can also implement a market economy. The reform goal established by the 14th National Congress of the Communist Party of China was to establish a socialist market economic system. In 1993, the Third Plenary Session of the 14th Central Committee adopted the “Decision of the Central Committee of the Communist Party of China on Several Issues Concerning the Establishment of a Socialist Market Economic System,” and established the basic theoretical framework of the socialist market economic system. The economic reform had now entered a new historical stage of establishing a socialist market economic system. Thus, the economic system in China before and after 1992 may differ. Therefore, we use a sample period that begins in 1993 to carefully explore the effects of Xiangyang's renaming reform in Section 4.

The data on real GDP growth are from the Statistical Yearbook and Statistical Bulletin of each city, the Statistical Yearbook and Statistical Bulletin of various provinces, and Road of Rise–60 Years of Hubei Splendor, Hunan Economic and Social Development 60 Years, Sixty Years in Henan and Jiangxi in Sixty Years in New China.
Data on per capita GDP are from the China Statistical Yearbook for the Regional Economy, and the missing data are supplemented from the Statistical Yearbook of various provinces. We adjust nominal per capita GDP by the provincial consumer price index (CPI).

We choose prefecture-level cities (including capital cities) in Hubei, Hunan, Henan, and Jiangxi Provinces as candidate control units. Some cities such as Yichang are not included because of the unavailability of real GDP growth rates. Cities renamed in the past such as Zhangjiajie are also excluded. We choose candidate control units in this way for two main reasons: (i) the four provinces are adjacent and all belong to the central region, thereby enjoying the same preferential policies in many aspects, and (ii) we obtain the annual real GDP growth rate of every city after 1985 mainly by referring to the Statistical Yearbooks of the various cities. However, the Statistical Yearbooks of prefecture-level cities are difficult to obtain, so we do not spend the considerable time and effort that would be required to collect data for all cities in China.

Fig. 1 plots the growth rates of GDP for Xiangyang as well as of its secondary and tertiary sectors from 1986 to 2016. It shows that all these growth rates were declining when Xiangfan was renamed in December 2010. Does this mean that the city-renaming reform has had negative effects? Because many factors can affect economic growth, we can only draw conclusions by comparing actual real economic growth after the city-renaming reform with the counterfactuals of no renaming.

2.1. Structural intervention time series analysis

Intervention time series analysis provides an effective method for modelling policy interventions (Vujić, Commandeur, & Koopman, 2016) and was popularized by Harvey and Durbin (1986), Harvey (1989) and Commandeur and Koopman (2007).
We first use the linear intervention model presented in Enders (2014) to study the impact of renaming on the real GDP growth rates of Xiangyang²:
$$ y_t = a_0 + A(L)\,y_{t-1} + \beta D_t + B(L)\,\epsilon_t \qquad (2) $$
where A(L) and B(L) are polynomials in the lag operator L, Dt is a dummy variable that equals zero before 2011 and unity beginning in 2011, and ϵt is a white noise disturbance. Thus, the effect of renaming is given by the magnitude of β.

² We also conduct the analysis on the secondary and tertiary sectors and obtain similar results.

Fig. 1. GDP, secondary, and tertiary growth rates of Xiangyang.

We use the procedures described in Enders (2014) to estimate the most appropriate ARMA model for the pre-intervention periods. The ACF and PACF of the pre-intervention periods in Fig. 2 show that the first autocorrelation and partial autocorrelation are statistically significant at 5% (falling outside the 95% confidence interval), whereas the other autocorrelations and partial autocorrelations are considered to be zero. Therefore, the AR(1) and MA(1) models are considered. The estimated AR(1) model (with t-statistics in parentheses) is
$$ y_t = \underset{(5.41)}{10.821} + \underset{(2.56)}{0.494}\,y_{t-1} + \epsilon_t, \qquad \mathrm{AIC} = 151.31, \quad \mathrm{BIC} = 154.97 \qquad (3) $$
The residuals of Eq. (3) appear to be serially uncorrelated with Q(5) = 3.34(0.6479) (the significance level is in parentheses). The estimated MA(1) model is
$$ y_t = \underset{(7.33)}{10.621} + \underset{(1.76)}{0.401}\,\epsilon_{t-1} + \epsilon_t, \qquad \mathrm{AIC} = 152.89, \quad \mathrm{BIC} = 156.55 \qquad (4) $$
where the errors, ϵt, appear to be serially uncorrelated (Q(5) = 5.90(0.3159)). All the estimated coefficients in AR(1) are significant, whereas the estimated coefficient for ϵt−1 in MA(1) is not. Moreover, both the AIC and the BIC select AR(1) over MA(1). Hence, we adopt the following model to study the impact of renaming on the GDP growth of Xiangyang:
$$ y_t = a_0 + a_1 y_{t-1} + \beta D_t + \epsilon_t \qquad (5) $$
Fig. 2. ACF and PACF for the real GDP growth rates of Xiangyang from 1986 to 2010.
We obtain the following result:
$$ y_t = \underset{(2.78)}{5.025} + \underset{(2.57)}{0.538}\,y_{t-1} - \underset{(-0.40)}{0.550}\,D_t + \epsilon_t \qquad (6) $$
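The intervention regression in Eq. (5) can be estimated by OLS on a constant, the lagged growth rate, and the dummy. The sketch below shows the mechanics only. It uses a hypothetical noise-free series with made-up parameters (`A0`, `A1`, `BETA`, and the switch date `FIRST_TREATED` are assumptions), so it does not reproduce the estimates in Eq. (6).

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y_t for r, y_t in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical noise-free series: y_t = 5 + 0.5 * y_{t-1} - 1.0 * D_t
A0, A1, BETA = 5.0, 0.5, -1.0
FIRST_TREATED = 8            # D_t = 1 from this period onward (assumed switch date)
y = [14.0]
dummy = [0]
for t in range(1, 14):
    d_t = 1 if t >= FIRST_TREATED else 0
    y.append(A0 + A1 * y[-1] + BETA * d_t)
    dummy.append(d_t)

# Regressors for period t: constant, lagged outcome, intervention dummy.
rows = [[1.0, y[t - 1], float(dummy[t])] for t in range(1, len(y))]
a0_hat, a1_hat, beta_hat = ols(rows, y[1:])
print(f"a0 = {a0_hat:.3f}, a1 = {a1_hat:.3f}, beta = {beta_hat:.3f}")
```

With real data the disturbances make the estimates noisy, and the coefficient on the dummy would be judged against its standard error, as the t-statistics reported in Eq. (6) do.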
The Q-tests find no autocorrelation in the fitted residuals ϵt (Q(5) = 3.50(0.6238)). In addition, the estimated regression coefficient for Dt is not statistically significant. From these analyses, we conclude that the renaming had no significant effect on GDP growth.

Economic growth is driven by common latent factors that affect all cities. If the policy lasts in the long run or the treatment effect takes a long time to manifest itself, then some factors may have changed during that time. In this case, the intervention time series analysis method cannot separate the effects of renaming from the effects of these changes, and therefore we do not know whether the above conclusion is correct. Given that information on other cities that have not been renamed can help construct the counterfactuals of Xiangyang, we can use the three methods mentioned above to separate the effects of renaming from the effects of the changes in those factors.

3. Empirical comparison of the three methodologies

In the Appendix, we use Monte Carlo simulations to compare the performance of the three methods. As real macroeconomic data may differ slightly from simulated data, we focus on comparing them by using the real GDP growth rates of Xiangyang during 1986–2016 in this section.

Typically, two criteria can be used to evaluate the quality of a model. The first is prediction accuracy. If we failed to fit the real GDP growth in Xiangyang before the city-renaming reform, we would have to suspect that much of the post-2010 gap between the actual and counterfactual GDP growth rates was created by the lack of fit rather than by the impact of the city-renaming reform (Abadie et al., 2010). Since no post-renaming mean squared error (MSE) is observed here, the pre-renaming MSE (pre-MSE) is used to measure prediction accuracy instead. The second criterion is the interpretability of the model. Social scientists prefer a parsimonious model because it focuses on the relationship between the response and covariates and the results estimated from it can be easily explained (Tibshirani, 1996; Zou & Hastie, 2005). The goal here is to choose an appropriate model for the data on real GDP growth.

In the Appendix, we slightly modify the panel data approach proposed by Hsiao et al. (2012). Here, k denotes the number of control cities in each group when the control cities are divided into groups. In empirical work, k can be chosen to suit the application. Tables 1 and 2 report the estimated treatment effects from 2011 to 2016, the average treatment effect, the pre-renaming MSE, and the number of selected control cities. Compared with the synthetic control method and elastic net, the panel data approach not only has a smaller pre-MSE, but also selects fewer cities. Although the number of cities selected by LASSO is similar to that of the panel data approach, the pre-MSE is much larger. Fig. 3 shows that the panel data approach provides a better fit of the economy of Xiangyang before the city-renaming reform than the other methods. To balance prediction accuracy with interpretability, the panel data approach is the best choice here. In fact, Hsiao et al. (2012) proposed the panel data approach precisely to handle data on real GDP growth. Incidentally, for the synthetic control method, the results obtained with and without the predictors differ.

Tables 3 and 4 show the estimated coefficients using the panel data approach and machine learning method as well as the weight of each city in the donor pool for the synthetic control method. The cities selected by these different methods include Ezhou, Shaoyang, Yiyang, Huaihua, and Nanyang, among others. Most of the methods select Huaihua and Nanyang. As shown in Table 5, Huaihua and Nanyang have the highest correlation coefficients as well as the highest regression coefficients or weights (see Tables 3 and 4).
This shows that the three methods select the cities most closely related to Xiangyang to construct the counterfactuals. Owing to the different control groups selected, the magnitudes of the counterfactuals estimated by these methods are different. However, the average treatment effects are about the same, ranging from 1.2 to 1.6, except for two special cases: the synthetic control method including the predictors and the panel data approach with k = 12. Although the three methods select different control groups, the counterfactual paths during 2011–2016 depicted in Fig. 3 are strikingly similar. In the following empirical studies, except for the robustness checks, all the empirical analyses are carried out using the panel data approach.

Table 1. Treatment effects of the synthetic control method, the panel data approach, and LASSO.

| | Synth. (Yes) | Synth. (No) | Panel (k = 8) | Panel (k = 12) | Panel (k = 16) | Panel (k = 20) | LASSO (5-fold) | LASSO (10-fold) | LASSO (25-fold) |
|---|---|---|---|---|---|---|---|---|---|
| 2011 | 0.930 | 2.538 | 1.941 | 2.109 | 2.312 | 1.856 | 3.866 | 3.457 | 3.457 |
| 2012 | −0.102 | 0.920 | 0.813 | 1.315 | 0.814 | 0.678 | 1.491 | 1.299 | 1.299 |
| 2013 | 0.855 | 2.061 | 2.840 | 2.335 | 2.016 | 1.133 | 1.695 | 1.759 | 1.759 |
| 2014 | 1.666 | 2.506 | 2.309 | 2.644 | 2.482 | 1.542 | 1.798 | 1.932 | 1.932 |
| 2015 | 0.321 | 0.170 | 0.336 | 1.489 | 0.932 | 0.485 | −0.199 | 0.057 | 0.057 |
| 2016 | 0.431 | 0.391 | 1.194 | 2.004 | 1.199 | 1.466 | −0.295 | 0.076 | 0.076 |
| Average | 0.683 | 1.431 | 1.572 | 1.982 | 1.626 | 1.193 | 1.393 | 1.430 | 1.430 |
| pre-MSE | 5.549 | 3.925 | 3.066 | 2.896 | 3.477 | 1.501 | 7.649 | 5.814 | 5.814 |
| # of selected cities | 5 | 9 | 4 | 4 | 3 | 6 | 3 | 4 | 4 |

Note: For the synthetic control method, “yes” means that there are predictors and “no” means that there are no predictors. If predictors are used, Changsha and Xiangtan are excluded because of data unavailability. The 25-fold and 10-fold results for LASSO are the same.
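As a quick arithmetic check on Table 1, the "Average" row is simply the mean of the six yearly effects for 2011 to 2016. The short script below recomputes it from the yearly values in the table. The dictionary keys are labels chosen here for readability, and a small tolerance is allowed because the yearly effects are themselves rounded to three decimals.

```python
# Yearly treatment effects for 2011-2016, copied from Table 1.
yearly_effects = {
    "Synth. (predictors)":    [0.930, -0.102, 0.855, 1.666, 0.321, 0.431],
    "Synth. (no predictors)": [2.538, 0.920, 2.061, 2.506, 0.170, 0.391],
    "Panel (k = 8)":          [1.941, 0.813, 2.840, 2.309, 0.336, 1.194],
    "Panel (k = 12)":         [2.109, 1.315, 2.335, 2.644, 1.489, 2.004],
    "Panel (k = 16)":         [2.312, 0.814, 2.016, 2.482, 0.932, 1.199],
    "Panel (k = 20)":         [1.856, 0.678, 1.133, 1.542, 0.485, 1.466],
    "LASSO (5-fold)":         [3.866, 1.491, 1.695, 1.798, -0.199, -0.295],
    "LASSO (10-fold)":        [3.457, 1.299, 1.759, 1.932, 0.057, 0.076],
}

# "Average" row as reported in Table 1.
reported_average = {
    "Synth. (predictors)": 0.683,
    "Synth. (no predictors)": 1.431,
    "Panel (k = 8)": 1.572,
    "Panel (k = 12)": 1.982,
    "Panel (k = 16)": 1.626,
    "Panel (k = 20)": 1.193,
    "LASSO (5-fold)": 1.393,
    "LASSO (10-fold)": 1.430,
}

computed_average = {name: sum(v) / len(v) for name, v in yearly_effects.items()}

for name, value in computed_average.items():
    gap = abs(value - reported_average[name])
    print(f"{name:24s} computed = {value:.4f}  reported = {reported_average[name]:.3f}  gap = {gap:.4f}")
```

Every recomputed mean agrees with the reported average to within rounding, which also confirms the column layout of the table.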
Table 2. Treatment effects of elastic net.

| | α = 0.8 (5-fold) | α = 0.8 (10-fold) | α = 0.8 (25-fold) | α = 0.9 (5-fold) | α = 0.9 (10-fold) | α = 0.9 (25-fold) |
|---|---|---|---|---|---|---|
| 2011 | 3.027 | 2.870 | 2.390 | 3.264 | 3.106 | 2.569 |
| 2012 | 1.111 | 1.059 | 0.923 | 1.234 | 1.189 | 1.039 |
| 2013 | 1.473 | 1.482 | 1.546 | 1.607 | 1.634 | 1.693 |
| 2014 | 1.211 | 1.272 | 1.511 | 1.495 | 1.554 | 1.735 |
| 2015 | 0.105 | 0.194 | 0.511 | 0.049 | 0.158 | 0.516 |
| 2016 | 0.220 | 0.346 | 0.796 | 0.115 | 0.264 | 0.762 |
| Average | 1.191 | 1.204 | 1.279 | 1.294 | 1.318 | 1.386 |
| pre-MSE | 5.994 | 5.456 | 4.016 | 6.109 | 5.553 | 4.042 |
| # of selected cities | 9 | 9 | 9 | 6 | 6 | 8 |
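For orientation, the α in Table 2 is presumably the elastic net mixing parameter. Under the widely used glmnet-style parameterization (an assumption here, since the paper's Appendix may scale the penalty differently), the coefficients solve the problem below. The notation is introduced only for this illustration: y_{1t} is Xiangyang's growth rate in pre-renaming period t, y_{jt} is the growth rate of control city j, T_0 is the number of pre-renaming periods, and λ is the penalty level chosen by cross-validation (the 5-, 10-, and 25-fold columns).

```latex
\min_{\beta_0,\,\beta}\;
\frac{1}{2T_0}\sum_{t=1}^{T_0}\Bigl(y_{1t}-\beta_0-\sum_{j}\beta_j\,y_{jt}\Bigr)^{2}
+\lambda\Bigl[\alpha\,\lVert\beta\rVert_1+\frac{1-\alpha}{2}\,\lVert\beta\rVert_2^{2}\Bigr]
```

In this parameterization α = 1 gives LASSO and α = 0 gives ridge regression, so the values 0.8 and 0.9 in Table 2 sit close to LASSO while keeping a small ridge component.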
4. Empirical analysis As mentioned in Section 2, the economic systems in China before and after 1992 may differ. In addition, comparing Tables 5 and 6 shows that the correlation coefficients between the other cities and Xiangyang obtained during the sample period beginning in 1993 are greater. In this section and the next, we use the sample period that begins in 1993 to carefully explore the economic effects of the city-renaming reform. The advantages of the synthetic control method are as follows. First, it assigns weights to each control unit by using predictors before the policy intervention. Thus, the researcher can decide if the treated unit is similar to the synthetic control. Second, the weights are restricted to non-negative and must sum to one to safeguard against extrapolation (Abadie et al., 2010). However, if Xiangyang falls far from the convex hull of cities in the donor pool or the predictors do not provide a sufficiently good match, the synthetic control method may yield a poor fit. As the panel data approach uses only data from the outcome variable and relies on the latent correlations among the cross-sectional GDP growth rates to select control groups, the drawbacks of the synthetic control method do not occur under the panel data approach.3 Another advantage of the panel data approach is that we can use the traditional inference based on the asymptotic distribution derived by Li and Bell (2017). 4.1. Estimation of the effects of the city-renaming reform As mentioned earlier, the basic idea behind the panel data approach is to rely on the strong correlations among cities. According to the assumptions in Hsiao et al. (2012), two criteria must be satisfied when selecting cities into control groups for Xiangyang: (i) the policy intervention is exogenous (i.e., Xiangfan being renamed Xiangyang did not affect the GDP growth rates of other cities; see Assumption 5 in Hsiao et al. 
(2012)) and (ii) there is a strong correlation between Xiangyang and other cities in the absence of the policy intervention. Before applying the econometric model, we test both these criteria. Violating the exogeneity criterion would prevent us from identifying the impact of the latent factors on the treated unit. Ideally, renaming should be unrelated to other cities, but it is difficult to justify. However, it is hard to imagine that renaming would have influenced the economic development of the other cities to any large degree. Hence, we are safe to assume that the renaming of Xiangyang had a negligible impact on the control cities. We validate criterion (ii) by using detailed macro data. Because the correlations across cities are caused by common factors, we use the size of the correlation coefficients to illustrate the degree of the correlations among the GDP growth rates (Fujiki & Hsiao, 2015). Table 6 shows the correlation coefficients between the real GDP growth rate of Xiangyang and those of other cities from 1993 to 2010. Most of these correlation coefficients are above 0.5, with only a handful below 0.3. This finding suggests a strong correlation between the GDP growth rates of other cities and those of Xiangyang. Hence, criterion (ii) is satisfied. We demonstrate that it is reasonable to use the GDP growth rates of other cities to construct the counterfactuals of Xiangyang. By using the panel data approach with k = 14, we select Shaoyang, Ji'an, Huaihua, and Zhoukou to construct the hypothetical GDP growth path of Xiangyang had there been no city-renaming reform. Table 7 reports the OLS estimation weights of the control groups based on real GDP growth from 1993 to 2010. Tables 6 and 7 show that Shaoyang, Huaihua, and Ji'an not only have the highest correlation coefficients, but also the highest regression coefficients. This demonstrates that the economic growth of the three cities is similar to that of Xiangyang. Fig. 
4 shows that the real GDP growth rates of Xiangyang, predicted by the control groups, trace closely its actual growth rates before the city-renaming reform with an R2 of 0.9342 and a pre-MSE of 1.191. This indicates that the panel data approach provides a good fit and further illustrates the rationality of using this approach here. Table 8 reports the actual real GDP growth rates of Xiangyang and predicted counterfactuals constructed by the control groups. The estimated city-renaming reform effects are simply the difference between the two. As shown in Table 8, the counterfactual growth rates are lower than the actual growth rates. Hence, the estimated treatment effects are strictly positive for Xiangyang from 2011 to 2016. The average treatment effect in this period is 1.43%, and this result is significant at the 1% level on the basis of the asymptotic distribution derived by Li and Bell (2017). We also find that the counterfactual path during 2011–2016 in Fig. 4 is similar 3
See Gardeazabal and Vega-Bayo (2016) and Wan et al. (2018) for discussions of the advantages and disadvantages of the two methods. 6
China Economic Review 57 (2019) 101344
J. Guo and Z. Zhang
Fig. 3. Actual versus predicted paths for Xiangyang's real GDP growth during 1986–2016 using the three methods. 7
China Economic Review 57 (2019) 101344
J. Guo and Z. Zhang
Fig. 3. (continued) 8
China Economic Review 57 (2019) 101344
J. Guo and Z. Zhang
Fig. 3. (continued)
to that in Fig. 3. In Subsection 4.2, the synthetic control method and machine learning method do not show this feature, which further illustrates that the estimate of the panel data approach is more robust for the data on GDP growth. Our analysis yields estimates of the effect of the city-renaming reform that are similar to those obtained by Lu et al. (2018) using data on cities' nighttime light images. 4.2. Robustness check Before further discussing the mechanism of the city-renaming reform, we conduct three robustness tests to check the robustness of the above results from three aspects: adding predictors, weighing the estimation bias and prediction error, and measuring economic growth using per capita real GDP. 4.2.1. Results of the synthetic control method As explained by Abadie et al. (2010), the synthetic control of Xiangyang is constructed as a convex combination of cities in the donor pool.4 As shown by Abadie and Gardeazabal (2003), Wang and Nie (2010), and Yu and Wang (2011), we select the proportion of the employed population in the total population, the proportion of fixed assets investment to GDP, the secondary sector share, the tertiary sector share, and the proportion of retail sales of consumer goods to GDP as predictors. Because China announced an Economic Stimulus Program of 4 trillion RMB in November 2008, we use the average of the above explanatory variables between 4 Compared with the donor pool used in the panel data approach and machine learning method, the synthetic control method excludes Changsha and Xiangtan owing to data unavailability.
9
China Economic Review 57 (2019) 101344
J. Guo and Z. Zhang
Table 3 Weights/coefficients of the synthetic control method, the panel data approach, and LASSO. Synth.
Wuhan Huangshi Jingmen Xianning Enshi Ezhou Changsha Xiangtan Hengyang Shaoyang Changde Yiyang Chenzhou Huaihua Xiangxi Kaifeng Luoyang Anyang Hebi Xinxiang Jiaozuo Puyang Xuchang Luohe Sanmenxia Nanyang Shangqiu Xinyang Zhoukou Zhumadian Nanchang Jingdezhen Pingxiang Jiujiang Xinyu Ganzhou Ji'an Yichun Fuzhou
Panel
LASSO
Yes
No
k=8
k = 12
k = 16
k = 20
5-fold
10-fold
25-fold
0.035 0 0.065 0 0 0.101
0 0 0 0.009 0.013 0.015 0 0 0 0.056 0 0.009 0 0.442 0.077 0 0 0 0 0 0 0 0 0 0 0.325 0 0 0.053 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0.352 0 0 0 0.406 0.212 0 0 0 0 0 0 0 0 0 0 0.392 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0.296 0 0 0 0.323 0 0 0 0.284 0 0 0 0 0 0 0 0 0 0 0 0.474 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0.363 0 0.400 0 0 0 0 0 0 0 0 0 0 0 0.483 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0.288 0 0 0 0.420 0 0 0 0.322 0 0.191 0 0 0 0 0 0 0 0 0 0.550 0 0 0 0 0 0 −0.391 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0.024 0 0.318 0 0 0 0 0 0 0 0 0 0 0 0.271 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0.021 0 0.099 0 0.337 0 0 0 0 0 0 0 0 0 0 0 0.319 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0.021 0 0.099 0 0.337 0 0 0 0 0 0 0 0 0 0 0 0.319 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0.365 0 0 0.434 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Note: The 10-fold and 25-fold results for LASSO are the same.
2008 and 2009 as separate predictors. In addition, we use the real GDP growth rates in 1997, 1998, 2002, 2008, 2009, and 2010 as predictors. Table 9 compares the pre-treatment characteristics of actual Xiangyang with those of synthetic Xiangyang and with the average of the 37 control cities. Except for the proportion of retail sales of consumer goods to GDP, the same proportion averaged for 2009–2010, and the real GDP growth rate in 1998, the predictors of synthetic Xiangyang are closer to the actual values than the averages of the 37 control cities are.

Table 10 displays the weights of each city in the donor pool, showing that the GDP growth trend in Xiangyang before the city-renaming reform is best reproduced by Jingmen, Xianning, Huaihua, Xinxiang, Sanmenxia, and Xinyu. The weights of all other cities in the donor pool are zero. Fig. 5 plots the real GDP growth rates of Xiangyang and its synthetic counterpart for 1993–2016. Table 11 reports the GDP growth rates of Xiangyang and the synthetic control during 2011–2016. As shown in Table 11, the estimated effects are strictly positive and the average treatment effect is 1.00%. The effects obtained by the synthetic control method have the same sign as those obtained by the panel data approach.5

4.2.2. Results of the machine learning method
Based on the recommendations in the Appendix, we also use machine learning methods to test the robustness. These methods weigh the estimation bias against the prediction error: they introduce some bias but reduce the variance.

5 The pre-MSE of the synthetic control method is 3.476, while the pre-MSE of the panel data approach is 1.191. The former is almost three times the latter. Although the lack of fit leads to inaccurate estimates of the synthetic control method, we can use it to test whether the direction of the effects obtained is consistent with the panel data approach. In addition, the synthetic control method selects six control cities, while the panel data approach only selects four.
Table 4 Coefficients of elastic net.

                 α = 0.8                        α = 0.9
City          5-fold   10-fold   25-fold     5-fold   10-fold   25-fold
Ezhou         0.092    0.104     0.149       0.058    0.074     0.132
Shaoyang      0.106    0.119     0.165       0.073    0.090     0.149
Changde       0.086    0.091     0.106       0.050    0.059     0.086
Yiyang        0.075    0.075     0.069       0.073    0.072     0.062
Huaihua       0.230    0.236     0.252       0.271    0.274     0.278
Kaifeng       0        0.005     0.025       0        0         0.012
Xinxiang      0.003    0.003     0           0        0         0
Puyang        0.018    0.021     0.028       0        0         0.008
Nanyang       0.234    0.243     0.271       0.270    0.280     0.303
Jingdezhen    0.004    0         0           0        0         0

Note: The coefficients of the other 29 cities (Wuhan, Huangshi, Jingmen, Xianning, Enshi, Changsha, Xiangtan, Hengyang, Chenzhou, Xiangxi, Luoyang, Anyang, Hebi, Jiaozuo, Xuchang, Luohe, Sanmenxia, Shangqiu, Xinyang, Zhoukou, Zhumadian, Nanchang, Pingxiang, Jiujiang, Xinyu, Ganzhou, Ji'an, Yichun, and Fuzhou) are 0 in all six columns.
Table 5 Correlation with Xiangyang based on real GDP growth 1986–2010. City
Corr.
City
Corr.
City
Corr.
City
Corr.
Huaihua Yiyang Anyang Huangshi Ganzhou Hengyang Pingxiang Ji'an Chenzhou Jiujiang
0.8226 0.6393 0.6059 0.5821 0.5619 0.5376 0.4926 0.4638 0.3757 0.2763
Nanyang Jingmen Puyang Changsha Xinyu Zhumadian Zhoukou Enshi Jiaozuo Xinyang
0.7839 0.6386 0.6016 0.5766 0.5569 0.5145 0.4924 0.4598 0.3500 0.2123
Jingdezhen Ezhou Wuhan Sanmenxia Kaifeng Nanchang Xuchang Luoyang Xianning Yichun
0.7104 0.6122 0.5991 0.5736 0.5539 0.5063 0.4846 0.4412 0.3310 0.1735
Changde Xinxiang Shaoyang Xiangtan Hebi Xiangxi Luohe Shangqiu Fuzhou
0.6660 0.6101 0.5924 0.5670 0.5452 0.4934 0.4642 0.4292 0.3071
When we re-evaluate the treatment effects of renaming, LASSO and elastic net select the same control cities. Because the machine learning methods select more control cities, they provide a better fit than the panel data approach, as Fig. 6 shows. The estimated weights reported in Table 12 differ slightly across the machine learning methods. Shaoyang, which has the highest correlation coefficient, has the largest weight in all the methods. The same finding emerges in the panel data approach. Table 13 shows the treatment effects estimated by using the different machine learning methods. The average treatment effect is about 2%. This is higher than the estimate of the panel data approach (1.43%), which lies between the estimates of the synthetic control method (1.00%) and the machine learning methods.
Table 6 Correlation with Xiangyang based on real GDP growth 1993–2010. City
Corr.
City
Corr.
City
Corr.
City
Corr.
Shaoyang Wuhan Luoyang Xinyang Luohe Pingxiang Xuchang Kaifeng Huangshi Yichun
0.8218 0.7797 0.7184 0.6789 0.6350 0.5824 0.5047 0.4404 0.3748 0.2281
Ji'an Jingmen Changsha Nanyang Yiyang Nanchang Xiangtan Xianning Chenzhou Xiangxi
0.8158 0.7589 0.7145 0.6733 0.6337 0.5724 0.5012 0.4346 0.3721 0.2227
Huaihua Ganzhou Zhumadian Hebi Hengyang Xinyu Changde Puyang Fuzhou Enshi
0.8119 0.7371 0.6921 0.6729 0.6018 0.5588 0.4791 0.4323 0.3523 0.0841
Sanmenxia Ezhou Xinxiang Jingdezhen Anyang Jiaozuo Zhoukou Shangqiu Jiujiang
0.8071 0.7354 0.6870 0.6680 0.5835 0.5490 0.4687 0.3767 0.3201
Table 7 Weights of the control groups for Xiangyang's real GDP growth 1993–2010 (panel data approach).

            Coefficient   St. Error   T-stat
Shaoyang    0.443         0.066       6.74
Ji'an       0.335         0.140       2.39
Huaihua     0.355         0.072       4.90
Zhoukou     0.190         0.063       3.01
Constant    −3.917        1.539       −2.55

R2 = 0.9342; pre-MSE = 1.191
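The fitted coefficients in Table 7 turn the growth rates of the four selected control cities into a predicted no-renaming growth rate for Xiangyang. The following minimal Python sketch illustrates that linear prediction. The function name `predict_counterfactual` is ours, and the control-city growth rates are hypothetical placeholders, not data from the paper.

```python
# Coefficients and constant from Table 7 (panel data approach, fitted on 1993-2010).
COEFFICIENTS = {"Shaoyang": 0.443, "Ji'an": 0.335, "Huaihua": 0.355, "Zhoukou": 0.190}
CONSTANT = -3.917


def predict_counterfactual(control_growth):
    """Predicted growth rate of Xiangyang (percent) in the absence of renaming.

    control_growth maps each selected control city to its real GDP growth
    rate (percent) in the year being predicted.
    """
    return CONSTANT + sum(
        coef * control_growth[city] for city, coef in COEFFICIENTS.items()
    )


# Hypothetical growth rates for a single year; NOT taken from the paper.
hypothetical_growth = {"Shaoyang": 10.0, "Ji'an": 10.0, "Huaihua": 10.0, "Zhoukou": 10.0}
predicted = predict_counterfactual(hypothetical_growth)
print(round(predicted, 3))
```

Subtracting such a prediction from the actual post-2010 growth rate gives the yearly treatment effect.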
Fig. 4. Actual and predicted paths for Xiangyang's real GDP growth 1993–2016 (panel data approach).

Table 8 Treatment effects of the renaming on real GDP 2011–2016 (panel data approach).

           Actual   Control   Treatment
2011       16       13.73     2.27
2012       12.5     11.79     0.71
2013       11.4     9.75      1.65
2014       9.8      7.90      1.90
2015       8.9      8.15      0.75
2016       8.5      7.20      1.30
Average    11.18    9.75      1.43
T                             5.35
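The arithmetic behind Table 8 can be checked directly: each yearly treatment effect is the actual growth rate minus the predicted control, and the average treatment effect is the mean of the six yearly effects. The short Python sketch below uses the figures reported in Table 8; the variable names are ours.

```python
# Actual and predicted (control) real GDP growth rates of Xiangyang, 2011-2016,
# as reported in Table 8 (percent).
years = [2011, 2012, 2013, 2014, 2015, 2016]
actual = [16.0, 12.5, 11.4, 9.8, 8.9, 8.5]
control = [13.73, 11.79, 9.75, 7.90, 8.15, 7.20]

# Yearly treatment effect = actual minus predicted counterfactual.
effects = [round(a - c, 2) for a, c in zip(actual, control)]

# Average treatment effect over the post-treatment period.
average_effect = sum(effects) / len(effects)

for year, effect in zip(years, effects):
    print(year, effect)
print("average", round(average_effect, 2))
```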
Table 9 Real GDP growth predictor means.

Variables                                   Actual   Synthetic   Average of 37 control cities
Employed population                         0.509    0.518       0.558
Secondary sector share                      0.386    0.380       0.425
Tertiary sector share                       0.343    0.360       0.319
Fixed assets investment                     0.241    0.311       0.320
Retail sales of consumer goods              0.371    0.335       0.344
Secondary sector share 2009–2010            0.499    0.499       0.520
Tertiary sector share 2009–2010             0.341    0.346       0.315
Fixed assets investment 2009–2010           0.511    0.677       0.754
Retail sales of consumer goods 2009–2010    0.394    0.326       0.336
Real GDP growth rate in 1997                10.2     10.210      10.895
Real GDP growth rate in 1998                9.2      11.145      8.573
Real GDP growth rate in 2002                8.3      8.300       10.351
Real GDP growth rate in 2008                14.6     14.424      13.178
Real GDP growth rate in 2009                15       16.476      12.907
Real GDP growth rate in 2010                16.2     15.789      13.827

Note: Employed population is averaged over 2002–2008; fixed assets investment, secondary sector share, tertiary sector share, and retail sales of consumer goods are averaged over 1993–2008.

Table 10 City weights in synthetic Xiangyang.

City        Weight   City         Weight   City        Weight   City        Weight
Wuhan       0        Huangshi     0        Jingmen     0.305    Xianning    0.091
Enshi       0        Ezhou        0        Hengyang    0        Shaoyang    0
Changde     0        Yiyang       0        Chenzhou    0        Huaihua     0.332
Xiangxi     0        Kaifeng      0        Luoyang     0        Anyang      0
Hebi        0        Xinxiang     0.014    Jiaozuo     0        Puyang      0
Xuchang     0        Luohe        0        Sanmenxia   0.224    Nanyang     0
Shangqiu    0        Xinyang      0        Zhoukou     0        Zhumadian   0
Nanchang    0        Jingdezhen   0        Pingxiang   0        Jiujiang    0
Xinyu       0.034    Ganzhou      0        Ji'an       0        Yichun      0
Fuzhou      0

pre-MSE = 3.476
Fig. 5. Actual and predicted paths for Xiangyang's real GDP growth 1993–2016 (synthetic control method).
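As a small illustration of what "convex combination" means for the weights in Table 10, the Python sketch below checks that the six nonzero weights are nonnegative and sum to one (up to rounding), and then forms a synthetic outcome as the weighted average of donor outcomes. The helper names and the donor growth rates are hypothetical and serve only to show the calculation.

```python
# Nonzero weights from Table 10; all other donor cities have weight 0.
WEIGHTS = {
    "Jingmen": 0.305,
    "Xianning": 0.091,
    "Huaihua": 0.332,
    "Xinxiang": 0.014,
    "Sanmenxia": 0.224,
    "Xinyu": 0.034,
}


def is_convex_combination(weights, tol=1e-3):
    """True if all weights are nonnegative and sum to one within tol."""
    nonnegative = all(w >= 0 for w in weights.values())
    return nonnegative and abs(sum(weights.values()) - 1.0) <= tol


def synthetic_outcome(weights, donor_outcomes):
    """Weighted average of the donor outcomes for one period."""
    return sum(w * donor_outcomes[city] for city, w in weights.items())


# Hypothetical donor growth rates (percent) for one year; NOT from the paper.
hypothetical = {city: 10.0 for city in WEIGHTS}
hypothetical["Huaihua"] = 12.0

print(is_convex_combination(WEIGHTS))
print(round(synthetic_outcome(WEIGHTS, hypothetical), 3))
```

Because the weights are nonnegative and sum to one, the synthetic outcome always lies between the smallest and largest donor outcomes, so the method cannot extrapolate outside the donor pool.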
Table 11 Treatment effects of the renaming on real GDP 2011–2016 (synthetic control method).

           Actual   Control   Treatment
2011       16       15.03     0.97
2012       12.5     12.44     0.06
2013       11.4     9.98      1.42
2014       9.8      8.10      1.70
2015       8.9      7.47      1.43
2016       8.5      8.06      0.44
Average    11.18    10.18     1.00

Fig. 6. Actual versus predicted paths for Xiangyang's real GDP growth during 1993–2016 (machine learning method).

4.2.3. Per capita real GDP
We further use per capita real GDP adjusted by the provincial CPI to measure economic growth and adopt this measurement to re-estimate the effect of the city-renaming reform experiment. The predictors used here are the same as in Subsection 4.2.1. The sample period for the prefectural-level per capita real GDP is 1999–2016. As shown in Tables 14 and 15, the control groups selected by the synthetic control method and panel data approach differ. However, Fig. 7 shows that the estimation results are similar. Both methods demonstrate that the actual per capita real GDP of Xiangyang after 2010 is significantly higher than its synthetic per capita real GDP. Interestingly, the panel data approach selects only Ezhou as the control city and still provides a good fit for Xiangyang. As comparative researchers favor sparse control units (Abadie, Diamond, & Hainmueller, 2015), we can implement a comparative case study of Xiangyang.

In summary, the above tests show that the finding that annual real GDP growth in Xiangyang rose by about 1.43% after the city-renaming reform is robust.
Table 12 Weights of the control groups for Xiangyang's real GDP growth 1993–2010 (machine learning method).

             LASSO: 5-fold   Elastic net: α = 0.8, 10-fold   Elastic net: α = 0.9, 5-fold
             Coefficient     Coefficient                     Coefficient
Wuhan        0.079           0.083                           0.073
Jingmen      0.070           0.080                           0.078
Xianning     0.114           0.101                           0.117
Ezhou        0.179           0.194                           0.178
Shaoyang     0.433           0.408                           0.423
Huaihua      0.188           0.171                           0.182
Xinxiang     0.016           0.017                           0.014
Sanmenxia    0.091           0.114                           0.103
Xinyang      0.136           0.123                           0.130
Zhumadian    0.108           0.102                           0.111
Xinyu        0.025           0.034                           0.034
Constant     −5.522          −5.454                          −5.594
pre-MSE      0.467           0.500                           0.463
Table 13 Treatment effects of the renaming on real GDP 2011–2016 (machine learning method).

           LASSO: 5-fold                  Elastic net: α = 0.8, 10-fold   Elastic net: α = 0.9, 5-fold
           Actual   Control   Treatment   Actual   Control   Treatment    Actual   Control   Treatment
2011       16       14.19     1.81        16       14.11     1.89         16       14.18     1.82
2012       12.5     11.44     1.06        12.5     11.34     1.16         12.5     11.40     1.10
2013       11.4     9.10      2.30        11.4     8.99      2.41         11.4     9.00      2.40
2014       9.8      7.76      2.04        9.8      7.74      2.06         9.8      7.73      2.08
2015       8.9      6.70      2.20        8.9      6.50      2.40         8.9      6.59      2.31
2016       8.5      6.06      2.44        8.5      6.01      2.49         8.5      6.01      2.49
Average    11.18    9.21      1.97        11.18    9.12      2.07         11.18    9.15      2.03
Table 14 Weights of the control groups in synthetic Xiangyang for per capita real GDP.

City      Huangshi   Jingmen   Changde   Kaifeng
Weight    0.235      0.448     0.041     0.276

Table 15 Weights of the control groups in the panel data approach for per capita real GDP.

            Coefficient   St. Error   T-stat
Ezhou       0.934         0.034       27.29
Constant    0.331         0.330       1.00
Fig. 7. Actual versus predicted paths for Xiangyang's per capita real GDP.

Fig. 8. Actual versus predicted paths for Xiangyang's secondary and tertiary sectors during 1993–2016.

4.3. Further discussion on the mechanism of the city-renaming reform
By using the synthetic control method, Lu et al. (2018) find that the effect of the city-renaming reform is achieved through the development of the tourism industry, construction of the urban transportation network, and sales expansion of light industrial products domestically, which are private components of the tertiary and secondary sectors. As a further step, we thus apply the panel data approach to estimate the treatment effects of the city-renaming reform on the secondary and tertiary sectors (Fig. 8).

Table 16 shows the correlation coefficients of Xiangyang and the other cities for the secondary and tertiary sectors from 1993 to 2010. Like the real GDP growth rate correlation coefficients, those of the secondary sector are mostly above 0.5. Although the correlation coefficients of the growth rate of the tertiary sector are relatively small, the panel data approach provides a good fit for this sector with an R2 of 0.9484. Some correlation coefficients are negative, which is possible: the correlation coefficient captures not only whether two variables move in the same direction but also whether they move in opposite directions. The correlation coefficient is thus negative when a factor affects the tertiary sector in Xiangyang in the opposite direction to its effect on another city.

By using the panel data approach with k = 10, we select Shaoyang and Luohe to construct the hypothetical growth path of the secondary sector in Xiangyang had there been no renaming. With k = 14, the approach selects four cities to construct the counterfactuals for the tertiary sector, namely Shaoyang, Kaifeng, Hebi, and Jingmen. Table 17 reports the OLS estimates of the weights of the control groups based on the growth rates of the secondary and tertiary sectors for 1993–2010, together with the t-statistics of the coefficients. Table 18 presents the estimated effects of the renaming from 2011 to 2016. The estimated average treatment effects on the secondary and tertiary sectors are 5.19% and −1.59%, respectively. According to the asymptotic distribution derived in Li and Bell (2017), the t-statistics are 4.98 and −2.41, respectively. Therefore, the effect of renaming on the growth rate of the secondary sector is highly significant. The impact on the tertiary sector is negative and also statistically significant, which contradicts the mechanism of the impact of the city-renaming reform mentioned in the Introduction. Therefore, the conclusion obtained in Subsection 4.1 that the city-renaming reform experiment increased annual real GDP growth in Xiangyang by about 1.43% is doubtful. Are the results we obtain driven by other policy interventions? To answer this question, we conduct statistical inference in the next section.

5. Inference about the effect of the city-renaming reform
Our analysis and robustness tests can only demonstrate that GDP growth in Xiangyang after 2010 was significantly higher than it would have been in the normal situation; this increase may not be caused by the city-renaming reform alone. Similarly, the analysis of Lu et al. (2018) can only demonstrate that nighttime light images increased after the city-renaming reform. The effect may not be exclusively caused by the city-renaming reform. A fundamental assumption of the synthetic control method and panel data approach is that after controlling for the impact of the pre-treatment covariates, the observed data for a unit under the policy intervention is
Table 16 Correlation with Xiangyang based on the growth rates of the added value of the secondary and tertiary sectors 1993–2010. Secondary
Tertiary
City
Corr.
City
Corr.
City
Corr.
City
Corr.
Wuhan Nanchang Luohe Ganzhou Jiujiang Shangqiu Pingxiang Changde Xinyang Xuchang Ezhou Jiaozuo Zhumadian Hengyang Xinyu Shaoyang Xiangxi Huangshi Enshi
0.8455 0.7132 0.6902 0.6627 0.6421 0.5914 0.5786 0.5689 0.5546 0.5534 0.5294 0.5089 0.4975 0.4896 0.4713 0.3999 0.3208 0.2783 0.2125
Jingmen Zhoukou Nanyang Hebi Yiyang Xinxiang Ji'an Puyang Changsha Sanmenxia Huaihua Xianning Luoyang Jingdezhen Anyang Xiangtan Kaifeng Chenzhou
0.8237 0.7039 0.6657 0.6557 0.6223 0.5898 0.5753 0.5667 0.5536 0.5345 0.5232 0.5027 0.4972 0.4740 0.4482 0.3913 0.3046 0.2506
Shaoyang Changsha Yiyang Ezhou Xinyu Hebi Nanchang Xinxiang Xianning Jingmen Chenzhou Ganzhou Huaihua Jingdezhen Pingxiang Jiujiang Xinyang Enshi Ji'an
0.9066 0.6582 0.6359 0.5811 0.5022 0.4738 0.3716 0.3409 0.3241 0.2705 0.2514 0.2162 0.2099 0.1431 0.1014 0.0143 −0.1183 −0.2431 −0.2713
Sanmenxia Nanyang Wuhan Xiangtan Anyang Shangqiu Luohe Zhoukou Huangshi Luoyang Xuchang Hengyang Puyang Zhumadian Jiaozuo Changde Xiangxi Kaifeng
0.7124 0.6441 0.6118 0.5334 0.4936 0.4146 0.3589 0.3327 0.2972 0.2590 0.2246 0.2122 0.1483 0.1190 0.0580 −0.0567 −0.1411 −0.2525
Table 17 Weights of the control groups for Xiangyang's secondary and tertiary sectors 1993–2010.

Secondary
            Coefficient   St. Error   T-stat
Shaoyang    0.960         0.142       6.77
Luohe       0.835         0.104       8.05
Constant    −12.924       2.938       −4.40
R2 = 0.8358; pre-MSE = 7.805

Tertiary
            Coefficient   St. Error   T-stat
Shaoyang    0.918         0.051       18.03
Kaifeng     0.861         0.162       5.30
Hebi        −0.325        0.086       −3.80
Jingmen     0.131         0.048       2.72
Constant    −6.520        2.778       −2.35
R2 = 0.9484; pre-MSE = 1.888

Table 18 Treatment effects of the renaming on the secondary and tertiary sectors 2011–2016.

           Secondary                      Tertiary
           Actual   Control   Treatment   Actual   Control   Treatment
2011       20.2     17.52     2.68        15       13.92     1.08
2012       15.9     11.61     4.29        10.5     12.50     −2.00
2013       13.3     7.15      6.15        10.7     11.45     −0.75
2014       10.2     6.41      3.79        11       13.28     −2.28
2015       9        3.31      5.69        10.3     13.19     −2.89
2016       9        0.45      8.55        9.4      12.07     −2.67
Average    12.93    7.74      5.19        11.15    12.74     −1.59
T                             4.98                           −2.41
only the outcome of this specific intervention (Fujiki & Hsiao, 2015). If the observed outcomes are simultaneously affected by two or more policies and other cities are not affected by these policies, the above two methods fail.6
6 If there are only two policies and the impact of one of them is transitory, Fujiki and Hsiao (2015) propose a panel data approach to disentangle the impact of one policy from that of the other policy.

We want to know whether the effect of the city-renaming reform is significant. However, the traditional large-sample inferential techniques no longer work here. Following the placebo study by Abadie et al. (2010), we construct a distribution of the average treatment effects to determine whether the effect of the city-renaming reform is significant. Although the cities in the donor pool were not renamed, we can carry out a series of placebo studies by applying the panel data approach to every city in the donor pool as if it had been renamed in 2010. We then evaluate the "forged" treatment effects of renaming for each city. Iterating the placebo study provides a distribution of the estimated
Table 19 Goodness of fit of Xiangyang and the 39 control cities. City
R2
City
R2
City
R2
City
R2
Xiangyang Zhoukou Sanmenxia Xianning Ganzhou Hengyang Yichun Enshi Jiaozuo Shangqiu
0.9342 0.9837 0.9755 0.9705 0.9589 0.9393 0.9154 0.8755 0.8450 0.7362
Zhumadian Changsha Ji'an Jingdezhen Anyang Fuzhou Pingxiang Xinxiang Xinyu Huangshi
0.9970 0.9773 0.9754 0.9694 0.9581 0.9363 0.9082 0.8744 0.8293 0.7361
Nanyang Wuhan Nanchang Ezhou Hebi Kaifeng Jiujiang Puyang Xiangxi Xinyang
0.9873 0.9772 0.9749 0.9607 0.9489 0.9285 0.9012 0.8722 0.7600 0.6439
Yiyang Jingmen Xiangtan Luohe Xuchang Luoyang Huaihua Changde Shaoyang Chenzhou
0.9856 0.9770 0.9724 0.9595 0.9463 0.9217 0.8944 0.8695 0.7568 0.3121
Fig. 9. Placebos for the city-renaming reform of Xiangyang: (a) placebo average treatment effects of all 39 control cities and (b) placebo average treatment effects of 26 of the control cities (discarding cities with an R2 below 0.90).
treatment effects for cities under no city-renaming reform. Then, we examine whether the estimated effect of Xiangyang is large relative to the distribution of the effects estimated for the cities not renamed. If the magnitudes of the treatment effects of the cities in the donor pool are similar to that estimated for Xiangyang, then our conclusion is that the treatment effect estimated for Xiangyang was not created by the city-renaming reform. Otherwise, we conclude that the effect of the city-renaming reform is significant.

As shown in Table 19, the median goodness-of-fit among the 39 control cities is 0.94, indicating that the panel data approach can provide a good fit for most cities in the donor pool before the city-renaming reform. Fig. 9(a) displays the distribution of the average treatment effects of the 39 control cities. The zero line is the boundary between the positive and negative effects. Because 23 points are above the zero line, the probability of obtaining a positive treatment effect in our data is 57.5%. In other words, if we were to assign the renaming at random to a city in the donor pool, the probability of obtaining a positive treatment effect would be more than half. The probability of obtaining an average treatment effect as large as Xiangyang's is 32.5%. Since the average treatment effect estimated for Xiangyang is not unusually large relative to the distribution of the average treatment effects estimated for cities in which no renaming took place, the rise in the real GDP growth rates is more likely due to other policy interventions than to the city-renaming reform.

Similar to the interpretation by Abadie et al. (2010), if some placebo runs had failed to fit the real GDP growth rates for the control cities in the years before the city-renaming reform, the distribution of the average treatment effects would not provide accurate information to measure the relative rarity of the large average treatment effect estimated for Xiangyang, which is well fitted before the city-renaming reform. For this reason, we discard the 13 cities with an R2 less than 0.90. Fig. 9(b) shows the distribution of the average treatment effects for the 26 remaining control cities. Even if we focus on control cities that are well fitted, the probability of obtaining a positive treatment effect is 51.9% and the probability of obtaining an effect as large as Xiangyang's is 7/27 = 0.259. Thus, we demonstrate the insignificance of the effect of the city-renaming reform.

Finally, Fig. 10 displays the estimated annual effects for the control cities. If the city-renaming reform really promoted economic growth, the black line representing the annual effect of Xiangyang should be not only above the zero line but also above the gray lines representing the annual effects of the control cities. However, Fig. 10 does not show such a feature. Although the black line is above the zero line, it is below many of the gray lines. This further shows that the effect of city renaming is insignificant.
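The placebo probability reported above can be sketched in a few lines of Python. We assume, based on the reported ratio 7/27 with 26 retained control cities, that the treated city is counted in both the numerator and the denominator; this convention is our reading of the text. The function name `placebo_p_value` is ours, and the placebo effects in the example are hypothetical values chosen only so that six of them are at least as large as Xiangyang's 1.43.

```python
def placebo_p_value(treated_effect, placebo_effects):
    """Share of units with an effect at least as large as the treated unit's.

    Assumption: the treated unit itself is counted, so the result is
    (number of placebo effects >= treated_effect + 1) / (number of placebos + 1).
    """
    at_least_as_large = sum(1 for e in placebo_effects if e >= treated_effect)
    return (at_least_as_large + 1) / (len(placebo_effects) + 1)


# Hypothetical placebo average treatment effects for 26 control cities
# (NOT the paper's estimates): six are larger than 1.43, twenty are smaller.
hypothetical_placebos = [2.0] * 6 + [0.5] * 20

p_value = placebo_p_value(1.43, hypothetical_placebos)
print(round(p_value, 3))
```

A value this far above conventional thresholds such as 0.05 or 0.10 means that an effect of Xiangyang's size is not rare among cities that were never renamed.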
Fig. 10. Placebos for the city-renaming reform of Xiangyang: (a) placebo-estimated annual effects of all 39 control cities and (b) placebo-estimated annual effects of 26 of the control cities (discarding cities with an R2 below 0.90).
Section 4 shows that real GDP growth in Xiangyang rose by about 1.43% annually after the renaming and that this result is robust. This section shows that the effect of the city-renaming reform is insignificant. How can we explain these seemingly contradictory results? The only interpretation is that other policy interventions have promoted economic growth in Xiangyang. For example, Xiangyang proposed the construction of the Dongjin New District in 2010, and large-scale construction of infrastructure such as roads and bridges began subsequently. This is consistent with the finding in Subsection 4.3 that the annual growth rate of Xiangyang's secondary industry increased by 5.19% after 2010.

On January 21, 2007, Simao, Yunnan Province, was renamed Pu'er. By using the same procedures, we study the effect of the city-renaming reform on economic growth in Pu'er. The estimation results show that the annual real GDP growth in Pu'er rose by about 0.70% after 2006. However, the statistical inference demonstrates that if one were to assign the city-renaming reform randomly to a city in the donor pool, the probability of obtaining an effect as large as Pu'er's would be 41.7%. If we only consider placebo runs with a good fit prior to the city-renaming reform, the probability is 52.9%. This further shows that the effect of the city-renaming reform is insignificant.

6. Conclusion
In the Appendix, we slightly modify the method proposed by Hsiao et al. (2012). This modification not only makes the panel data approach suitable for a variety of situations regardless of the number of control units or pre-treatment time periods, but also allows us to select the best result from a wide range of results. We also compare the performance of the modified panel data approach, synthetic control method, and machine learning method by using simulated data.

In addition, to select the best method to analyze the data on real GDP growth, we use the real GDP growth rates of Xiangyang in 1986–2016 to re-compare the empirical performance of the three methods. Then, we use the sample period that begins in 1993 to carefully explore the economic effect of the city-renaming reform. We find that annual real GDP growth in Xiangyang rose by about 1.43% after the city-renaming reform experiment, and this result is robust. However, further discussions show that the impact of the city-renaming reform on the tertiary industry is negative, which is inconsistent with the mechanism of the impact of the city-renaming reform on economic growth mentioned in the Introduction. Hence, a question remains about whether our results could be driven by other policy interventions. To answer this question, we construct a distribution of the average treatment effects and then conduct statistical inference. The inference shows that if we were to assign the city-renaming reform randomly to a city in the donor pool, the probability of obtaining an effect as large as Xiangyang's would be 25.9%. This indicates that the estimated effect for Xiangyang is not unusually large relative to the distribution of the effects estimated for the other cities. Therefore, the treatment effect of the city-renaming reform is insignificant, and other policy interventions rather than the city-renaming reform promoted economic growth in Xiangyang after 2010. We also study the effect of the city-renaming reform on economic growth in Pu'er and obtain the same results.

From the discussion, we know that statistical inferences must be made when using the synthetic control method or panel data approach; otherwise, the opposite conclusion may be drawn. In addition, our findings are useful for policymakers. The economic effect of the city-renaming reform is insignificant. Moreover, the city-renaming reform will lead to a series of hidden costs such as inconvenience to the lives of local residents. Therefore, renaming a city is not a good way for policymakers to win the “Promotion Tournament.”
Declaration of Competing Interest
This work was supported by the National Science Foundation of China (Grant No. 71873080); the Key Project of the National Science Foundation of China (Grant No. 71833004); the 2018 Program for Innovative Research Team of Shanghai University of Finance and Economics; and the Fundamental Research Funds for the Central Universities.

Acknowledgements
We would like to thank Youze Lang and two anonymous referees for their helpful comments. This work was supported by the National Science Foundation of China (Grant No. 71873080); the Key Project of the National Science Foundation of China (Grant No. 71833004); the 2018 Program for Innovative Research Team of Shanghai University of Finance and Economics; and the Fundamental Research Funds for the Central Universities. All remaining errors and omissions are ours.

Appendix A. Appendix

A.1. Introduction to the three methods
Let $y_{it}^0$ denote the potential outcome of the $i$th unit at time $t$ without treatment and $y_{it}^1$ denote the potential outcome of the $i$th unit at time $t$ under treatment. Usually, however, we do not observe $y_{it}^0$ and $y_{it}^1$ simultaneously. The observed data $y_{it}$ can be written as

$$y_{it} = d_{it}\, y_{it}^1 + (1 - d_{it})\, y_{it}^0$$

where $d_{it} = 1$ if unit $i$ is under treatment at time $t$, and $d_{it} = 0$ otherwise. Suppose we have $N$ units over $t = 1, \ldots, T_0, T_0 + 1, \ldots, T$ periods. Without loss of generality, we let $i = 1$ correspond to the only unit that receives the treatment from period $T_0 + 1$ until period $T$. So we have that

$$y_{1t} = y_{1t}^0 \quad \text{for } t = 1, \ldots, T_0 \qquad \text{and} \qquad y_{1t} = y_{1t}^1 \quad \text{for } t = T_0 + 1, \ldots, T$$

The remaining units are not affected by the intervention, so

$$y_{it} = y_{it}^0 \quad \text{for } i = 2, \ldots, N, \quad \text{for } t = 1, \ldots, T$$

Following the statistical literature, we refer to the remaining untreated units as the "donor pool." The treatment effect for the first unit is

$$\Delta_{1t} = y_{1t}^1 - y_{1t}^0 \quad \text{for } t = T_0 + 1, \ldots, T$$

In order to evaluate the effect of the policy intervention, we must obtain a predictor $\hat{y}_{1t}^0$ for $y_{1t}^0$, which cannot be observed for $t = T_0 + 1, \ldots, T$. By using the synthetic control method, panel data approach, or machine learning method, we can replicate the counterfactual path $y_{1t}^0$ of the treated unit during the post-intervention periods $T_0 + 1, \ldots, T$. Accordingly, treatment effects are estimated by

$$\hat{\Delta}_{1t} = y_{1t}^1 - \hat{y}_{1t}^0 \quad \text{for } t = T_0 + 1, \ldots, T$$

and the average treatment effect is estimated by

$$\frac{1}{T - T_0} \sum_{t = T_0 + 1}^{T} \hat{\Delta}_{1t}$$

To evaluate the different methods, we report the post-treatment predicted mean squared error (post-MSE) and the number of control units selected. The post-MSE is expressed as

$$\text{post-MSE} = \frac{1}{T - T_0} \sum_{t = T_0 + 1}^{T} \left( y_{1t}^0 - \hat{y}_{1t}^0 \right)^2$$
A.1.1. Synthetic control method The idea behind the synthetic control method is that the pre-intervention characteristics of the treated unit can be approximated by a convex combination of the untreated units in the donor pool, allowing us to use a weighted average of the untreated units to construct a “synthetic” control unit that is most similar to the treated unit. Following Abadie et al. (2010), yit0 is generated by a factor model
$$y_{it}^0 = \delta_t + \theta_t z_i + \lambda_t \mu_i + \epsilon_{it} \tag{A.1}$$

where $\delta_t$ is an unknown common factor that varies over time, $z_i$ is a vector of observed covariates, $\mu_i$ is a vector of unknown individual-specific factor loadings, $\theta_t$ is a vector of unknown parameters constant across units, $\lambda_t$ denotes a vector of unobserved common factors, and $\epsilon_{it}$ is a random idiosyncratic error term with zero mean. $W = (w_2, \ldots, w_N)$ is a vector of nonnegative weights that satisfy $w_i \geq 0$ for $i = 2, \ldots, N$ and $w_2 + \cdots + w_N = 1$. Suppose that there exist $(w_2^*, \ldots, w_N^*)$ such that

$$\sum_{i=2}^{N} w_i^* y_{i1}^0 = y_{11}^0, \quad \sum_{i=2}^{N} w_i^* y_{i2}^0 = y_{12}^0, \quad \ldots, \quad \sum_{i=2}^{N} w_i^* y_{iT_0}^0 = y_{1T_0}^0, \quad \text{and} \quad \sum_{i=2}^{N} w_i^* z_i = z_1 \tag{A.2}$$

If $\sum_{t=1}^{T_0} \lambda_t' \lambda_t$ is nonsingular, Abadie et al. (2010) suggest using

$$\hat{\Delta}_{1t} = y_{1t}^1 - \sum_{i=2}^{N} w_i^* y_{it}^0 \quad \text{for } t = T_0 + 1, \ldots, T$$

as the estimated treatment effect of the intervention. If $(y_{11}^0, \ldots, y_{1T_0}^0, z_1')$ falls far outside the convex hull of $\{(y_{21}^0, \ldots, y_{2T_0}^0, z_2'), \ldots, (y_{N1}^0, \ldots, y_{NT_0}^0, z_N')\}$, Eq. (A.2) cannot hold exactly. In practice, we find a vector $W^*$ that minimizes

$$(X_1 - X_0 W)' V (X_1 - X_0 W) \tag{A.3}$$
where X1 is a K × 1 vector of pre-intervention characteristics for the intervened unit, such as X1 = (y110, ⋯, y1T00, z1′), X0 is a K × (N − 1) matrix that contains the same predictors for the unaffected units. V is a diagonal matrix with nonnegative elements that reflect the relative importance of the different predictors. If we have a prior knowledge about the predictors, the choice of V can be subjective. Otherwise, the optimal choice for V should be data driven. In the Monte Carlo section, we choose V such that the outcome variable of the treated unit is best reproduced by the weighted combination of untreated units for the pre-intervention periods. A.1.2. Panel data approach Following Hsiao et al. (2012), we assume that yit0 is generated by a factor model of the form7
$$y_{it}^0 = \alpha_i + b_i' f_t + \epsilon_{it}, \quad i = 1, 2, \dots, N, \quad t = 1, 2, \dots, T \tag{A.4}$$
where $\alpha_i$ is an individual-specific intercept, $b_i$ is a $K \times 1$ vector of factor loadings, $f_t$ is a $K \times 1$ vector of (unobserved) common factors, which are the main force driving all $y_{it}^0$ over time, and $\epsilon_{it}$ is a zero-mean error term. Note that $b_i$ can differ across $i$. Stacking the $N$ potential outcomes under no treatment into a vector yields
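$$y_t^0 = \alpha + B f_t + \epsilon_t, \quad t = 1, 2, \dots, T \tag{A.5}$$

where $y_t^0 = (y_{1t}^0, y_{2t}^0, \dots, y_{Nt}^0)'$, $\alpha = (\alpha_1, \alpha_2, \dots, \alpha_N)'$, $\epsilon_t = (\epsilon_{1t}, \epsilon_{2t}, \dots, \epsilon_{Nt})'$, and $B = (b_1, b_2, \dots, b_N)'$ is an $N \times K$ factor loading matrix. Hsiao et al. (2012) suggest using $\tilde{y}_t = (y_{2t}^0, y_{3t}^0, \dots, y_{Nt}^0)'$ in lieu of $f_t$ to predict $y_{1t}^0$ for the post-treatment period. If $\mathrm{Rank}(B) < N$ (which holds when $N > K$), then by linear algebra there exists a vector $a = (1, -\gamma')'$ such that $a'B = 0$. Following Eq. (A.5), we get

$$a' y_t^0 = y_{1t}^0 - \gamma' \tilde{y}_t = a'\alpha + a'\epsilon_t \tag{A.6}$$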
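Furthermore, under the assumptions in Hsiao et al. (2012) or Li and Bell (2017), rearranging terms leads to

$$y_{1t}^0 = \beta_0 + \beta_1' \tilde{y}_t + u_{1t}, \quad t = 1, 2, \dots, T \tag{A.7}$$

where $\beta_0 = a'\alpha$, $\beta_1' = \gamma'\left(I_{N-1} - \mathrm{Cov}(\tilde{\epsilon}_t, \tilde{y}_t)\,\mathrm{Var}(\tilde{y}_t)^{-1}\right)$, $\tilde{\epsilon}_t = (\epsilon_{2t}, \epsilon_{3t}, \dots, \epsilon_{Nt})'$, and $u_{1t} = a'\epsilon_t + \gamma'\,\mathrm{Cov}(\tilde{\epsilon}_t, \tilde{y}_t)\,\mathrm{Var}(\tilde{y}_t)^{-1}\tilde{y}_t$ is the error term uncorrelated with $\tilde{y}_t$. We can use Eq. (A.7) to construct the counterfactuals.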
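As a minimal sketch of how Eq. (A.7) can be used in practice, the block below fits the regression by OLS on the pre-treatment periods and takes observed minus predicted outcomes afterward. The function name `counterfactual_effects`, the toy data, and the noiseless setup are illustrative assumptions, not part of the original study.

```python
import numpy as np

def counterfactual_effects(y1, y_controls, T0):
    """Fit Eq. (A.7) by OLS on t = 1..T0 and return post-treatment effects.

    y1         : (T,) observed outcomes of the treated unit
    y_controls : (T, m) observed outcomes of the selected control units
    T0         : number of pre-treatment periods
    """
    y1 = np.asarray(y1, dtype=float)
    Yc = np.asarray(y_controls, dtype=float)
    X = np.column_stack([np.ones(len(y1)), Yc])   # x_t = (1, y~_t')'
    beta, *_ = np.linalg.lstsq(X[:T0], y1[:T0], rcond=None)
    y1_hat0 = X @ beta                            # predicted untreated outcome for all t
    effects = y1[T0:] - y1_hat0[T0:]              # observed minus counterfactual
    return beta, effects

# Toy data (hypothetical): noiseless outcome plus a constant effect of 2 after T0.
rng = np.random.default_rng(0)
T, T0 = 30, 20
Yc = rng.normal(size=(T, 3))
y1 = 1.0 + Yc @ np.array([0.5, -0.2, 0.3])
y1[T0:] += 2.0
beta, effects = counterfactual_effects(y1, Yc, T0)
```

Because the toy outcome is noiseless, the fitted coefficients reproduce the assumed ones and every post-treatment effect equals the injected shift; with real data the pre-treatment fit is only approximate.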
Remark 1. In fact, Eq. (A.7) can also be obtained as follows. According to Eq. (A.4), for treated unit one,

$$y_{1t}^0 = \alpha_1 + b_1' f_t + \epsilon_{1t}, \quad t = 1, 2, \dots, T \tag{A.8}$$

Then we use the assumptions in Hsiao et al. (2012) to get the covariance between the outcomes of units 1 and $i$,

$$\mathrm{Cov}(y_{it}^0, y_{1t}^0) = b_i' E(f_t f_t') b_1 \tag{A.9}$$

This shows that the treated unit is correlated with the other units, so $E(y_{1t}^0 \mid \tilde{y}_t)$ is a function of $\tilde{y}_t$, and this function can be approximated by (A.7).
Footnote 7: One might develop a general framework that nests both the synthetic control method and the panel data approach; see Gardeazabal and Vega-Bayo (2016) for details.
To balance the within-sample fit with the out-of-sample prediction, Hsiao et al. (2012) suggest using the following two-step method to select control units to construct the counterfactuals:
Step 1. Select $j$ cross-sectional units out of the $(N-1)$ units such that they maximize $R^2$, denoted by $M(j)^*$, for $j = 1, \dots, N-1$.
Step 2. Choose $M(m)^*$ from $M(1)^*, \dots, M(N-1)^*$ according to the corrected Akaike Information Criterion (AICC; Hurvich & Tsai, 1989).

A.1.3. Machine learning method
In recent years, machine learning methods such as regression trees, random forests, neural networks, and regularized regression (ridge, LASSO, and elastic net) have been widely used to make economic predictions. Here, we focus on using regularized regression to select the control groups. Consider Eq. (A.7),
$$y_{1t}^0 = x_t'\beta + u_{1t}, \quad t = 1, 2, \dots, T$$
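where $x_t = (1, \tilde{y}_t')'$ and $\beta = (\beta_0, \beta_1')'$. The MSE of an estimator $\hat{\beta}$ for $\beta$ is expressed as

$$\begin{aligned}
\mathrm{MSE}(\hat{\beta}) &= E[(\hat{\beta} - \beta)(\hat{\beta} - \beta)'] \\
&= E[(\hat{\beta} - E(\hat{\beta}) + E(\hat{\beta}) - \beta)(\hat{\beta} - E(\hat{\beta}) + E(\hat{\beta}) - \beta)'] \\
&= E[(\hat{\beta} - E(\hat{\beta}))(\hat{\beta} - E(\hat{\beta}))'] + E[(E(\hat{\beta}) - \beta)(E(\hat{\beta}) - \beta)'] \\
&= \mathrm{Var}(\hat{\beta}) + [\mathrm{Bias}(\hat{\beta})][\mathrm{Bias}(\hat{\beta})]'
\end{aligned}$$

Different prediction methods trade off variance against bias. For example, the OLS estimator is unbiased, but its variance can be large. Regularized regression introduces some bias but can reduce the variance substantially, which leads to a better out-of-sample prediction. The elastic net method selects $\beta$ to solve the penalized residual sum of squares of the form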
$$\min_{\beta} \sum_{t=1}^{T_0} (y_{1t}^0 - x_t'\beta)^2 + \lambda\left[\frac{1-\alpha}{2}|\beta|^2 + \alpha|\beta|\right] \tag{A.10}$$
where $\lambda \ge 0$, $0 \le \alpha \le 1$, and $|\beta|^2$ and $|\beta|$ are the squared L2 norm and the L1 norm of $\beta$, respectively. As $\lambda$ increases, the coefficients of some unimportant covariates shrink toward zero. The term $\frac{1-\alpha}{2}|\beta|^2 + \alpha|\beta|$ is a compromise penalty between ridge and LASSO. The elastic net becomes LASSO when $\alpha = 1$ and is the same as the ridge regression when $\alpha = 0$. Since the ridge method imposes an L2-penalty on the regression coefficients, it always keeps all the predictors in the model, so it cannot yield a simpler model; hence, we do not consider it here. LASSO and elastic net can both make predictions and select predictors at the same time.

In addition, we use the $l$-fold cross-validation method to select a value of the non-negative regularization parameter $\lambda$. Suppose $\lambda \in \Lambda$, where $\Lambda$ is a vector of real, non-negative values, and $\{1, 2, \dots, T_0\}$ is randomly divided into $l$ subsets $\Gamma_1, \dots, \Gamma_l$. For each $\lambda \in \Lambda$, a single subset $\Gamma$ is used as the validation dataset to test the model, and the other $l - 1$ subsets are used as the training set to select $\beta$. That is to say, the cross-validation estimator $\hat{\beta}_{\lambda,\Gamma}$ is obtained by solving
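$$\min_{\beta} \sum_{t \notin \Gamma} (y_{1t}^0 - x_t'\beta)^2 + \lambda\left[\frac{1-\alpha}{2}|\beta|^2 + \alpha|\beta|\right] \tag{A.11}$$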
Then, we calculate the sum of the squared residuals over the testing set Γ as
$$SE_{\lambda,\Gamma} = \sum_{t \in \Gamma} (y_{1t}^0 - x_t'\hat{\beta}_{\lambda,\Gamma})^2 \tag{A.12}$$
For each λ, the sum of the squared residuals for the cross-validation is
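$$CV(\lambda) = \sum_{j=1}^{l} SE_{\lambda,\Gamma_j} = \sum_{j=1}^{l} \sum_{t \in \Gamma_j} (y_{1t}^0 - x_t'\hat{\beta}_{\lambda,\Gamma_j})^2 \tag{A.13}$$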
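Eqs. (A.10) to (A.13) can be sketched in a few lines. The block below is an illustrative coordinate-descent solver for the penalized objective together with an $l$-fold search over a grid of $\lambda$ values; it is not the authors' code. The function names, the toy data, the fixed number of sweeps, and the choice to leave the intercept unpenalized are all assumptions made for this sketch.

```python
import numpy as np

def soft(z, g):
    """Soft-thresholding operator."""
    return np.sign(z) * max(abs(z) - g, 0.0)

def elastic_net(X, y, lam, alpha, n_iter=200):
    """Coordinate descent for
    sum_t (y_t - b0 - x_t'b)^2 + lam * ((1 - alpha)/2 * |b|_2^2 + alpha * |b|_1).
    Assumption: the intercept b0 is not penalized."""
    n, p = X.shape
    b0, b = y.mean(), np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            r = y - b0 - X @ b + X[:, j] * b[j]          # partial residual without unit j
            b[j] = soft(X[:, j] @ r, lam * alpha / 2) / (
                X[:, j] @ X[:, j] + lam * (1 - alpha) / 2)
        b0 = np.mean(y - X @ b)
    return b0, b

def cv_select(X, y, lambdas, alpha, l, seed=0):
    """l-fold cross-validation: return the lambda minimizing CV(lambda)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), l)   # random subsets Gamma_1..Gamma_l
    cv = []
    for lam in lambdas:
        total = 0.0
        for val in folds:
            train = np.setdiff1d(np.arange(len(y)), val)
            b0, b = elastic_net(X[train], y[train], lam, alpha)
            total += np.sum((y[val] - b0 - X[val] @ b) ** 2)   # SE over the held-out fold
        cv.append(total)
    return lambdas[int(np.argmin(cv))], cv

# Toy data (hypothetical): 25 pre-treatment periods, 4 candidate control units.
rng = np.random.default_rng(1)
X = rng.normal(size=(25, 4))
y = 1.0 + X @ np.array([2.0, 0.0, -1.0, 0.0]) + 0.1 * rng.normal(size=25)
lambdas = [0.01, 0.1, 1.0, 10.0, 100.0]
best_lam, cv = cv_select(X, y, lambdas, alpha=1.0, l=5)
```

Setting `alpha=1.0` gives LASSO and `l=len(y)` gives leave-one-out cross-validation. Packaged solvers often scale the objective differently, so their `lambda` values are not directly comparable with the ones used here.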
The $\lambda$ that minimizes $CV(\lambda)$ is the one we select. Ten-fold and 5-fold cross-validations are commonly used in machine learning. When $l = T_0$, the $l$-fold cross-validation is the leave-one-out cross-validation, which is used by Li and Bell (2017) for LASSO.

A.1.4. A minor modification to the panel data approach
The model selection strategy proposed by Hsiao et al. (2012) has two shortcomings. First, the number of control units must be smaller than the number of pre-treatment periods ($N < T_0$). Second, the number of control units cannot be large; otherwise, the calculation is not feasible. To address the first shortcoming, Gardeazabal and Vega-Bayo (2016) modify the method (referred to here as GV-AICC) such that $j = 1, \dots, T_0 - g$, where $g$ is a positive integer less than $T_0$. The GV-AICC method is only applicable to small $N$ and $T_0$. For example, with $N = 71$ and $T_0 = 25$ and allowing at least 10 degrees of freedom (i.e., $g = 10$), the GV-AICC method requires the AICC criterion to be evaluated approximately $9.75 \times 10^{14}$ times to select the best control units for $y_{1t}^0$, which is computationally infeasible. Further, to overcome the first drawback, Li and Bell (2017) divide the control units into $m$ groups such that $(m-1)T_0 \le N < mT_0$, and then use the AICC to select the best units in each group. For the same reason, the modified method proposed by Li and Bell (2017) is only applicable to small $T_0$.
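The order of magnitude quoted for the GV-AICC can be checked by counting candidate subsets. The sketch below assumes that one AICC evaluation is needed per non-empty subset of the 70 control units with at most $T_0 - g = 15$ members; the function name is hypothetical.

```python
from math import comb

def gv_aicc_candidate_models(n_controls, max_size):
    """Number of non-empty subsets of the control units with at most max_size members."""
    return sum(comb(n_controls, j) for j in range(1, max_size + 1))

# N = 71 means 70 control units; T0 = 25 and g = 10 give subset sizes j = 1..15.
total = gv_aicc_candidate_models(n_controls=70, max_size=25 - 10)
print(f"{total:.2e}")
```

Under this counting assumption the total is on the order of $10^{14}$ to $10^{15}$, in line with the figure stated in the text.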
We modify the method by dividing the $N - 1$ control units into $n$ groups, each containing at most $k$ control units (referred to here as the k-AICC), where $k$ is an appropriate positive integer less than $T_0 - 1$ (see footnote 8). The specific implementation procedures of this modification are as follows:
Step a. Divide the $N - 1$ control units into $n$ groups, where $n$ is the smallest positive integer greater than or equal to $(N - 1)/k$.
Step b. Use Step 1 and Step 2 in Hsiao et al. (2012) to select the best units in each group.
Step c. If the total number of units selected in Step b exceeds $k$, repeat Steps a, b, and c until the total number of selected units does not exceed $k$.
Step d. Use Step 1 and Step 2 proposed in Hsiao et al. (2012) to select the final control units.
The advantages of this modification are two-fold: (i) it makes the panel data approach work for more cases regardless of the number of control units or pre-treatment time periods, and the Monte Carlo performance is good; and (ii) we can select the best result from a wide range of results by changing $k$. The modification by Li and Bell (2017) is a particular case of the k-AICC.
Table A.1
k-AICC method with N = 41, $T_0$ = 25, and T = 35.

| k | 8 | 10 | 12 | 14 | 16 | 18 | 20 | 22 |
|---|---|---|---|---|---|---|---|---|
| Avg. post-MSE | 2.165 | 2.214 | 2.278 | 2.495 | 2.553 | 2.557 | 2.861 | 2.861 |
| Avg. # | 3.283 | 3.610 | 3.795 | 4.408 | 4.528 | 4.535 | 5.393 | 5.393 |

Note: The simulation results for k = 20 are the same as those with k = 22.
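Steps a to d of the k-AICC procedure in Section A.1.4 can be sketched as follows. This is an illustrative reading of the steps, not the authors' implementation: the AICc formula used is one common form and is assumed here, groups are formed sequentially (the text does not say how units are assigned), and the extra stopping guard and all function names are hypothetical.

```python
import numpy as np
from itertools import combinations

def rss(y, X):
    """Residual sum of squares from OLS of y on a constant and X."""
    Z = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    e = y - Z @ beta
    return float(e @ e)

def aicc(rss_value, T0, p):
    # Assumed AICc form: T0*ln(RSS/T0) + 2(p+2) + 2(p+2)(p+3)/(T0-p-3)
    return T0 * np.log(rss_value / T0) + 2 * (p + 2) + 2 * (p + 2) * (p + 3) / (T0 - p - 3)

def two_step(y1, Y, units):
    """Step 1: best subset of each size by RSS (equivalently R^2). Step 2: pick the size by AICc."""
    T0 = len(y1)
    best = None
    for j in range(1, min(len(units), T0 - 4) + 1):
        r, subset = min((rss(y1, Y[:, list(s)]), s) for s in combinations(units, j))
        score = aicc(r, T0, j)
        if best is None or score < best[0]:
            best = (score, subset)
    return list(best[1])

def k_aicc(y1, Y, k):
    """y1: (T0,) pre-treatment outcomes of the treated unit; Y: (T0, N-1) control outcomes."""
    units = list(range(Y.shape[1]))
    while True:
        groups = [units[i:i + k] for i in range(0, len(units), k)]   # Step a (sequential split)
        pooled = [u for g in groups for u in two_step(y1, Y, g)]     # Step b
        if len(pooled) <= k or len(pooled) == len(units):            # Step c (guard: no shrinkage)
            break
        units = pooled
    return two_step(y1, Y, pooled)                                   # Step d

# Toy data (hypothetical): only control units 9 and 11 drive the treated unit.
rng = np.random.default_rng(0)
Y = rng.normal(size=(25, 12))
y1 = 1.0 + 2.0 * Y[:, 9] + 2.0 * Y[:, 11] + 0.1 * rng.normal(size=25)
selected = k_aicc(y1, Y, k=4)
```

Each call to `two_step` enumerates every subset of at most $k$ units, so the cost grows with $2^k$ per group rather than with the number of subsets of all $N - 1$ controls.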
Table A.2
k-AICC method with N = 51, $T_0$ = 25, and T = 35.

| k | 8 | 10 | 12 | 14 | 16 | 18 | 20 | 22 |
|---|---|---|---|---|---|---|---|---|
| Avg. post-MSE | 1.998 | 2.120 | 2.169 | 2.265 | 2.376 | 2.685 | 2.710 | 2.710 |
| Avg. # | 3.270 | 3.640 | 3.735 | 4.303 | 4.615 | 5.388 | 5.423 | 5.423 |

Note: The simulation results for k = 20 are the same as those with k = 22.
Table A.3
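k-AICC method with N = 61, $T_0$ = 25, and T = 35.

| k | 8 | 10 | 12 | 14 | 16 | 18 | 20 | 22 |
|---|---|---|---|---|---|---|---|---|
| Avg. post-MSE | 2.082 | 2.284 | 2.251 | 2.290 | 2.441 | 2.556 | 3.061 | 3.123 |
| Avg. # | 3.338 | 3.835 | 4.045 | 4.168 | 4.790 | 5.135 | 6.143 | 6.343 |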
Table A.4
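k-AICC method with N = 71, $T_0$ = 25, and T = 35.

| k | 8 | 10 | 12 | 14 | 16 | 18 | 20 | 22 |
|---|---|---|---|---|---|---|---|---|
| Avg. post-MSE | 2.043 | 2.195 | 2.245 | 2.349 | 2.411 | 2.576 | 2.666 | 2.845 |
| Avg. # | 3.368 | 3.858 | 4.230 | 4.508 | 4.693 | 5.488 | 5.923 | 6.305 |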
Table A.5
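LASSO method with N = 41, 51, $T_0$ = 25, and T = 35.

| l-fold | N = 41, 5-fold | N = 41, 10-fold | N = 41, 25-fold | N = 51, 5-fold | N = 51, 10-fold | N = 51, 25-fold |
|---|---|---|---|---|---|---|
| Avg. post-MSE | 1.910 | 1.938 | 1.977 | 1.855 | 1.914 | 1.944 |
| Avg. # | 9.228 | 9.903 | 9.960 | 10.338 | 10.563 | 11.035 |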
Footnote 8: In a contemporaneous paper, Hsiao and Zhou (2019) propose to randomly split the control units into G subsets. By contrast, we allow each group to contain at most k control units, in order to save computational time and explore the sensitivity of the estimation results to different choices of k.
Table A.6
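LASSO method with N = 61, 71, $T_0$ = 25, and T = 35.

| l-fold | N = 61, 5-fold | N = 61, 10-fold | N = 61, 25-fold | N = 71, 5-fold | N = 71, 10-fold | N = 71, 25-fold |
|---|---|---|---|---|---|---|
| Avg. post-MSE | 1.838 | 1.871 | 1.903 | 1.752 | 1.774 | 1.808 |
| Avg. # | 10.773 | 10.690 | 11.193 | 10.453 | 10.828 | 11.163 |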
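The comparison quoted in Section A.2.1 for N = 61 (k-AICC with k = 16 against LASSO with 5-fold cross-validation) can be reproduced from the entries of Tables A.3 and A.6. The helper `pct_change` and the variable names are illustrative.

```python
def pct_change(new, old):
    """Relative change from old to new, in percent."""
    return 100.0 * (new - old) / old

k_aicc_n61 = {"post_mse": 2.441, "n_units": 4.790}    # Table A.3, k = 16
lasso_n61 = {"post_mse": 1.838, "n_units": 10.773}    # Table A.6, N = 61, 5-fold

mse_drop = -pct_change(lasso_n61["post_mse"], k_aicc_n61["post_mse"])
units_rise = pct_change(lasso_n61["n_units"], k_aicc_n61["n_units"])
print(round(mse_drop, 1), round(units_rise, 1))   # prints: 24.7 124.9
```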
Table A.7
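Elastic net method with N = 41, $T_0$ = 25, and T = 35.

| l-fold | α = 0.8, 5-fold | α = 0.8, 10-fold | α = 0.8, 25-fold | α = 0.9, 5-fold | α = 0.9, 10-fold | α = 0.9, 25-fold |
|---|---|---|---|---|---|---|
| Avg. post-MSE | 1.853 | 1.841 | 1.883 | 1.869 | 1.864 | 1.930 |
| Avg. # | 12.013 | 12.143 | 12.368 | 10.658 | 10.990 | 11.355 |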
Table A.8
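Elastic net method with N = 51, $T_0$ = 25, and T = 35.

| l-fold | α = 0.8, 5-fold | α = 0.8, 10-fold | α = 0.8, 25-fold | α = 0.9, 5-fold | α = 0.9, 10-fold | α = 0.9, 25-fold |
|---|---|---|---|---|---|---|
| Avg. post-MSE | 1.730 | 1.798 | 1.819 | 1.800 | 1.863 | 1.874 |
| Avg. # | 13.048 | 13.405 | 13.485 | 11.623 | 12.050 | 12.205 |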
Table A.9
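Elastic net method with N = 61, $T_0$ = 25, and T = 35.

| l-fold | α = 0.8, 5-fold | α = 0.8, 10-fold | α = 0.8, 25-fold | α = 0.9, 5-fold | α = 0.9, 10-fold | α = 0.9, 25-fold |
|---|---|---|---|---|---|---|
| Avg. post-MSE | 1.783 | 1.792 | 1.800 | 1.806 | 1.832 | 1.832 |
| Avg. # | 13.643 | 13.683 | 13.853 | 12.205 | 12.243 | 12.415 |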
Table A.10
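Elastic net method with N = 71, $T_0$ = 25, and T = 35.

| l-fold | α = 0.8, 5-fold | α = 0.8, 10-fold | α = 0.8, 25-fold | α = 0.9, 5-fold | α = 0.9, 10-fold | α = 0.9, 25-fold |
|---|---|---|---|---|---|---|
| Avg. post-MSE | 1.680 | 1.718 | 1.729 | 1.677 | 1.715 | 1.746 |
| Avg. # | 13.630 | 14.223 | 14.373 | 11.820 | 12.470 | 12.870 |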
Table A.11
The synthetic control method with N = 41, 51, 61, 71, $T_0$ = 25, and T = 35.

| N | 41 | 51 | 61 | 71 |
|---|---|---|---|---|
| Avg. post-MSE | 1.930 | 1.821 | 1.705 | 1.814 |
| Avg. # | 6.330 | 7.078 | 6.983 | 7.523 |
Table A.12
The synthetic control method and GV-AICC method with N = 21, 26, $T_0$ = 20, and T = 30.

| N | Synthetic control, N = 21 | Synthetic control, N = 26 | GV-AICC, N = 21 | GV-AICC, N = 26 |
|---|---|---|---|---|
| Avg. post-MSE | 2.416 | 2.263 | 3.951 | 5.093 |
| Avg. # | 5.050 | 5.410 | 4.800 | 7.040 |
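As a worked check of the GV-AICC growth figures quoted in Section A.2.2 when N rises from 21 to 26, the relative changes in post-MSE and in the number of selected predictors follow directly from the GV-AICC entries of Table A.12 (rounded to one decimal place):

```latex
\[
\frac{5.093 - 3.951}{3.951} \approx 28.9\%,
\qquad
\frac{7.040 - 4.800}{4.800} \approx 46.7\%
\]
```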
Table A.13
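k-AICC method with N = 21, 26, $T_0$ = 20, and T = 30.

| k | N = 21, k = 6 | N = 21, k = 8 | N = 21, k = 10 | N = 26, k = 6 | N = 26, k = 8 | N = 26, k = 10 | N = 26, k = 12 | N = 26, k = 14 |
|---|---|---|---|---|---|---|---|---|
| Avg. post-MSE | 2.347 | 2.373 | 2.573 | 2.107 | 2.287 | 2.250 | 2.338 | 2.788 |
| Avg. # | 2.490 | 2.840 | 3.200 | 2.510 | 2.810 | 3.220 | 3.260 | 3.820 |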
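The tables above come from the data-generating process specified in Section A.2 (the three factor equations and Eq. (A.14)). A minimal simulation sketch of that process follows. The coefficients are taken as printed in the text; the zero intercepts $\alpha$, the burn-in length, the seed, and the function name `simulate_dgp` are assumptions of this sketch, since the text does not state them.

```python
import numpy as np

def simulate_dgp(N, T, burn_in=50, seed=0):
    """Simulate y_t^0 = alpha + B f_t + eps_t with three common factors."""
    rng = np.random.default_rng(seed)
    n = T + burn_in
    e = rng.normal(size=(n + 2, 3))            # two extra draws cover the MA lags
    f = np.zeros((n + 2, 3))
    for t in range(2, n + 2):
        f[t, 0] = 0.8 * f[t - 1, 0] + e[t, 0]                        # AR(1)
        f[t, 1] = 0.6 * f[t - 1, 1] + e[t, 1] + 0.8 * e[t - 1, 1]    # ARMA(1,1), sign as printed
        f[t, 2] = e[t, 2] + 0.9 * e[t - 1, 2] + 0.4 * e[t - 2, 2]    # MA(2)
    f = f[-T:]                                 # drop the burn-in periods (assumption)
    B = rng.normal(loc=1.0, scale=1.0, size=(N, 3))   # loadings iid N(1, 1)
    eps = rng.normal(size=(T, N))                     # idiosyncratic errors N(0, 1)
    alpha = np.zeros(N)                               # ASSUMPTION: intercepts set to zero
    y0 = alpha + f @ B.T + eps                        # rows are periods, columns are units
    return y0, f, B

# One replication with N = 41 units and T = T0 + 10 = 35 periods.
y0, f, B = simulate_dgp(N=41, T=35)
```

Column 0 of `y0` can play the role of the treated unit; since no treatment is applied, the simulated post-$T_0$ values are the true untreated outcomes against which predictions are scored.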
A.2. Monte Carlo simulation results
By using Monte Carlo simulations, we are able to observe the potential outcomes under no treatment after $T_0$. In this section, we report simulation results that explore the pros and cons of the different methods. We consider the same data generating process (DGP) as in Hsiao et al. (2012) and Du and Zhang (2015),

$$\begin{aligned}
f_{1t} &= 0.8 f_{1,t-1} + e_{1t} \\
f_{2t} &= 0.6 f_{2,t-1} + e_{2t} + 0.8 e_{2,t-1} \\
f_{3t} &= e_{3t} + 0.9 e_{3,t-1} + 0.4 e_{3,t-2}
\end{aligned}$$

where $e_{jt}$ is generated by $N(0, 1)$. Let $y_t^0 = (y_{1t}^0, y_{2t}^0, \dots, y_{Nt}^0)'$; it is generated via

$$\text{DGP:} \quad y_t^0 = \alpha + B f_t + \epsilon_t, \quad t = 1, 2, \dots, T \tag{A.14}$$
where $f_t = (f_{1t}, f_{2t}, f_{3t})'$, $B$ is an $N \times 3$ factor loading matrix whose elements are iid $N(1, 1)$, and the component $\epsilon_{it}$ of $\epsilon_t$ is generated by $N(0, 1)$.

A.2.1. Simulation results for the k-AICC method, synthetic control method, and machine learning method
We generate model (A.14) with N = 41, 51, 61, 71, $T_0$ = 25, and T = $T_0$ + 10. To explore the sensitivity of the k-AICC method to the number of control units included in each group, we consider k = 8, 10, 12, 14, 16, 18, 20, 22. We repeat each experiment 400 times. From Tables A.1–A.4, we observe the following: (i) for all N, as k increases, the k-AICC tends to select more predictors, which is consistent with our intuition; (ii) when k remains constant and N increases, the post-MSE and the number of control units selected change little and are relatively stable; and (iii) the smaller k is, the fewer control units are selected and the smaller the post-MSE is. Under the simulated data, according to the model evaluation criteria, the model becomes better as k decreases. However, in empirical studies, to balance the within-sample fit with the out-of-sample prediction, k cannot be too small. This can be seen in the main text.

For the LASSO method (Tables A.5–A.6), the 5-fold and 10-fold cross-validations are better than the leave-one-out cross-validation (reported as 25-fold, since $T_0$ = 25) according to the model evaluation criteria. The computation time of the leave-one-out cross-validation is about 2.2 times that of the 10-fold cross-validation and 4.6 times that of the 5-fold cross-validation. Therefore, if we want to use the LASSO method in empirical research, we should use the 5-fold or 10-fold cross-validation instead of the leave-one-out cross-validation suggested by Li and Bell (2017). Compared with the k-AICC method, the LASSO method selects a larger number of predictors, whereas the post-MSE is smaller. For example, when N = 61, by comparing the k-AICC with k = 16 with the LASSO method (5-fold cross-validation), we find that the post-MSE decreases by 24.7%, whereas the number of predictors increases by 124.9%. With a large number of predictors, parsimony is an important issue (Zou & Hastie, 2005).

For the elastic net method (Tables A.7–A.10), we choose α = 0.8, 0.9 mainly to balance prediction accuracy and the interpretation of the model. We find that compared with α = 0.9, the prediction accuracy of α = 0.8 is not significantly improved, whereas the number of predictors selected increases by about 1.5. Compared with the LASSO method, the prediction accuracy of the elastic net method is slightly improved, whereas the number of predictors selected increases more. For example, when N = 71, in the case of α = 0.8 with 5-fold cross-validation, the post-MSE of the elastic net falls by 4.1% ((1.752 − 1.680)/1.752) compared with LASSO, whereas the number of predictors increases by 30.4% ((13.630 − 10.453)/10.453).

The data-generating process we use only contains the common factors and does not include the vector $z_i$ of the observed covariates. The synthetic control method can take advantage of the information on $z_i$ to obtain a more accurate prediction. Gardeazabal and Vega-Bayo (2016) handle this problem by keeping the experiments that have a pre-intervention mean absolute error smaller than 20% of the pre-intervention mean of $y_{1t}^0$. However, Wan, Xie, and Hsiao (2018) believe that a fair comparison should be based on an
equal number of experimental outcomes. As shown in Tables A.1–A.4 and A.11, we find that the performance of the synthetic control method and the panel data approach is comparable based on the same number of experimental outcomes. Although the prediction error of the synthetic control method is smaller than that of the panel data approach, it selects more control units. The prediction accuracy of the synthetic control method is similar to that of the machine learning methods; however, the latter select too many predictors. Therefore, in the study of economics, we should choose between the synthetic control method and the panel data approach according to the actual situation, and machine learning methods are generally used to implement robustness checks.

A.2.2. Comparison between the synthetic control method and panel data approach for small N and $T_0$
Gardeazabal and Vega-Bayo (2016) compare the performance of the synthetic control method and the panel data approach with N = 21 and $T_0$ = 20. However, their comparison is not convincing because of the poor performance of the GV-AICC. Using the k-AICC, we re-compare them with N = 21, 26, $T_0$ = 20, and T = 30. We let g = 8 for both N = 21 and N = 26 to reduce the computation time and allow for at least eight degrees of freedom. When N increases slightly from 21 to 26, the computation time of the GV-AICC increases by about 1564%, the post-MSE by 28.9%, and the number of predictors by 46.7%. All of these indicate that the GV-AICC is unsuitable when the number of control units is large. Tables A.12 and A.13 reveal that the post-MSE and the number of predictors of the GV-AICC are roughly 1.5 to 2.8 times those of the k-AICC. Hence, the GV-AICC is a worse modification than the k-AICC. The comparison between the k-AICC method and the synthetic control method is more convincing. Tables A.12 and A.13 show that the post-MSE of the k-AICC is as good as that of the synthetic control method. In addition, the k-AICC selects fewer predictors than the synthetic control method. For small N, if we focus on the interpretation or succinctness of the model, we should choose the k-AICC method. Conversely, if we have information on the observed covariates or want to prevent extrapolation, then we should choose the synthetic control method.

A.3. Conclusion
This Appendix does three things: (i) we slightly modify the panel data approach so that it works for more cases regardless of the number of control units or pre-treatment time periods and allows us to select the best result from a wide range of results; (ii) we present a convincing comparison between the panel data approach and the synthetic control method; and (iii) under the factor model, we find that the performance of the synthetic control method and the panel data approach is comparable. In the study of economics, we should choose between them based on the actual situation. The machine learning method is often used to test the robustness of the empirical results.
References
Aaker, D. A., & Jacobson, R. (2001). The value relevance of brand attitude in high-technology market. Journal of Marketing Research, 38(4), 485–493.
Abadie, A., Diamond, A., & Hainmueller, J. (2010). Synthetic control methods for comparative case studies: Estimating the effect of California’s tobacco control program. Journal of the American Statistical Association, 105(490), 493–505.
Abadie, A., Diamond, A., & Hainmueller, J. (2015). Comparative politics and the synthetic control method. American Journal of Political Science, 59(2), 495–510.
Abadie, A., & Gardeazabal, J. (2003). The economic costs of conflict: A case study of the Basque Country. American Economic Review, 93(1), 112–132.
Athey, S. (2018). The impact of machine learning on economics. In the economics of artificial intelligence: An agenda. Chicago: University of Chicago Press.
Bai, C. E., Li, Q., & Ouyang, M. (2014). Property taxes and home prices: A tale of two cities. Journal of Econometrics, 180(1), 1–15.
Chen, C. F., & Tsai, D. C. (2007). How destination image and evaluative factors affect behavioral intentions? Tourism Management, 28(4), 1115–1122.
Commandeur, J. J. F., & Koopman, S. J. (2007). An introduction to state space time series analysis. Oxford: Oxford University Press.
Du, Z. C., & Zhang, L. (2015). Home-purchase restriction, property tax and housing price in China: A counterfactual analysis. Journal of Econometrics, 188(2), 558–568.
Enders, W. (2014). Applied econometric time series (4th ed.). New York, NY: Wiley.
Fujiki, H., & Hsiao, C. (2015). Disentangling the effects of multiple treatments-measuring the net economic impact of the 1995 great Hanshin-Awaji earthquake. Journal of Econometrics, 186(1), 66–73.
Gallarza, M. G., Saura, I. G., & Garcia, H. C. (2002). Destination image: Towards a conceptual framework. Annals of Tourism Research, 29(1), 56–78.
Gardeazabal, J., & Vega-Bayo, A. (2016). An empirical comparison between the synthetic control method and Hsiao et al.’s panel data approach to program evaluation. Journal of Applied Econometrics, 32(5), 983–1002.
Harvey, A. C. (1989). Forecasting, structural time series models and the Kalman filter. Cambridge: Cambridge University Press.
Harvey, A. C., & Durbin, J. (1986). The effects of seat belt legislation on British road casualties: A case study in structural time series modelling. Journal of the Royal Statistical Society. Series A (General), 149(3), 187–227.
Hsiao, C., Ching, H. S., & Wan, S. K. (2012). A panel data approach for program evaluation: Measuring the benefit of political and economic integration of Hong Kong with mainland China. Journal of Applied Econometrics, 27(5), 705–740.
Hsiao, C., & Zhou, Q. (2019). Panel parametric, semiparametric, and nonparametric construction of counterfactuals. Journal of Applied Econometrics, 34(4), 463–481 (forthcoming).
Huang, N. J., & Yu, M. Z. (2018). Research progress on the impact of machine learning on economics. Economic Perspectives, 7, 115–129.
Hurvich, C. M., & Tsai, C. L. (1989). Regression and time series model selection in small samples. Biometrika, 76(2), 297–307.
Imbens, G. W., & Angrist, J. (1994). Identification and estimation of local average treatment effects. Econometrica, 62(2), 467–475.
Keller, K. L. (1993). Conceptualizing, measuring and managing customer-based brand equity. Journal of Marketing, 57(1), 1–22.
Li, H., & Zhou, L. (2005). Political turnover and economic performance: The incentive role of personnel control in China. Journal of Public Economics, 89(9–10), 1743–1762.
Li, K. T., & Bell, D. R. (2017). Estimation of average treatment effects with panel data: Asymptotic theory and implementation. Journal of Econometrics, 197(1), 65–75.
Lu, S. F., Wu, Y. P., & Xie, X. (2018). The benefit of historical reputation: Evidence from the city-renaming reforms in China. China Economic Quarterly, 17(3), 1055–1078.
Michell, P., King, J., & Reast, J. (2001). Brand values related to industrial products. Industrial Marketing Management, 30(5), 415–425.
Ouyang, M., & Peng, Y. (2015). The treatment-effect estimation: A case study of the 2008 economic stimulus package of China. Journal of Econometrics, 188(2), 545–557.
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B: Methodological, 58(1), 267–288.
Vujić, S., Commandeur, J. J., & Koopman, S. J. (2016). Intervention time series analysis of crime rates: the case of sentence reform in Virginia. Economic Modelling, 57, 311–323.
Wan, S. K., Xie, Y., & Hsiao, C. (2018). Panel data approach vs synthetic control method. Economics Letters, 164, 121–123.
Wang, X. B., & Nie, H. F. (2010). Administrative district adjustment and economic growth. Management World, 4, 42–53.
Yang, J. G., Zhou, L. L., & Zou, H. F. (2017). Evaluation of the economic growth effect of the establishment of special economic zones in China–analysis based on the synthetic control method. Economic Perspectives, 1, 41–51.
Yu, J. W., & Wang, C. C. (2011). Political environment and economic development–analysis based on the evolution of cross-strait relations. South China Journal of Economics, 4, 30–39.
Zhang, H., Fu, X., Cai, L. A., & Lu, L. (2014). Destination image and tourist loyalty: A meta-analysis. Tourism Management, 40(1), 213–223.
Zhou, L. (2007). Governing China’s local officials: An analysis of promotion tournament model. Economic Research Journal, 7, 36–50.
Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B: Methodological, 67(2), 301–320.