China Economic Review 31 (2014) 379–391
Decomposing the rich dad effect on income inequality using instrumental variable quantile regression

Zaichao DU a,⁎, Renyu LI a, Qinying HE b,c,⁎⁎, Lin ZHANG a

a Research Institute of Economics and Management, Southwestern University of Finance and Economics, Chengdu, China
b School of Economics and Management, South China Normal University, Guangzhou, China
c Scientific Laboratory of Economic Behaviors, South China Normal University, Guangzhou, China
Article info

Article history:
Received 16 September 2013
Received in revised form 11 June 2014
Accepted 12 June 2014
Available online 19 June 2014

JEL classification: D31, E24, J62

Keywords: Intergenerational inequality; Counterfactual decomposition; Composition effect; Income structure effect; Instrumental variable quantile regression

Abstract

In this paper we evaluate the relative importance of the two main channels, namely the composition effect and the income structure effect, through which paternal income affects children's income inequality. Using data on 2677 father–child pairs from the China Health and Nutrition Survey (CHNS), we construct the counterfactual income of children from poor families if they had the same characteristics as children from rich families. We propose an instrumental variable quantile regression-based method to solve the endogeneity problem and decompose the rich dad effect on income inequality into the composition effect and the income structure effect. We find that the composition effect explains at least 80% of the income difference at any quantile, and it explains all the income difference at the top four deciles. The income structure effect has a significant impact only at quantiles between 20% and 40%, where it explains about 20% of the income difference.

© 2014 Elsevier Inc. All rights reserved.
⁎ Correspondence to: Z. Du, RIEM, Southwestern University of Finance and Economics, 55 Guanghuacun Street, Chengdu 610074, China. Tel.: +86 28 8735 3797.
⁎⁎ Correspondence to: Q. He, School of Economics and Management, South China Normal University, Guangzhou Higher Education Mega Center, Guangzhou 510006, China. Tel.: +86 20 39310177.
E-mail addresses: [email protected] (Z. Du), [email protected] (Q. He).

http://dx.doi.org/10.1016/j.chieco.2014.06.007
1043-951X/© 2014 Elsevier Inc. All rights reserved.

1. Introduction

Income inequality has become an increasingly important problem in China. The income share held by the top 10% of households in China has exceeded 57%.¹ Meanwhile, about 50% of Chinese households had no savings in 2010.² Reducing income inequality has become one of the biggest challenges for stimulating domestic consumption and economic growth in China.

¹ Research Report of China Household Finance Survey 2012. The corresponding figure for the US in 2009 is only 40.6%.
² Research Report of China Household Finance Survey 2012.

One important source of income inequality is intergenerational transmission, which has drawn the attention of many researchers, e.g. Becker and Tomes (1986), Zimmerman (1992), Solon (1999), and Erikson and Goldthorpe (2002), to name just a few. Meanwhile, as evidenced by the popularity of, and controversy over, "competition of the father" and "the rich second generation", intergenerational inequality is also of great concern to the general public.

This paper studies the relative importance of the two main channels, namely the composition effect and the income structure effect (cf. Firpo, Fortin, & Lemieux, 2007; Mata & Machado, 2005), through which paternal income affects children's income inequality.

Most of the literature on intergenerational inequality has focused on the intergenerational income elasticity (IIE); see Solon (1999) and Björklund and Jäntti (2009) for excellent surveys. However, these studies fail to answer how the income of the parents
affects that of the children. One exception is Lefgren, Lindquist, and Sims (2012), who study the effects of the father's human capital and financial resources on children's income.

Our study combines two strands of research on the transmission mechanism of intergenerational inequality. On the one hand, Bowles and Gintis (2002), Shea (2000), Mayer (2002), Dahl and Lochner (2012) and many others have found that parents with higher income can invest more in their children's human capital, which brings higher income for their children. On the other hand, Conlisk (1974), Ruhm (1988), Björklund and Jäntti (2009), Zhang and Eriksson (2010) and Li, Meng, Shi, and Wu (2012) have found that children from different socioeconomic family backgrounds, mainly characterized by the income of the parents, may face different work opportunities (or different returns to the covariates), which may also cause income inequality among the children.

We disentangle the composition effect and the income structure effect by counterfactual decomposition. We consider two groups of children: children from rich families and children from poor families. We construct the counterfactual income of children from poor families if they had the same characteristics (or covariates) as children from rich families. The difference between the counterfactual income and the actual income of children from poor families is then purely due to the covariate differences between the two groups of children. Following the literature (Firpo et al., 2007; Mata & Machado, 2005), we call this part the composition effect. The difference between the actual income of children from rich families and the counterfactual income is due to the difference in the returns to the covariates, which is called the income structure effect.

Our methodological contribution is that we extend the counterfactual decomposition method of Mata and Machado (2005) (MM hereinafter) to cases with endogenous variables.
The MM method has been widely used in wage distribution decomposition, and it generalizes the traditional Oaxaca (1973) decomposition of effects on mean wages to the entire wage distribution. However, MM does not consider the endogeneity problem, which will bias the decomposition in cases like ours. Our method solves the omitted variable problem by incorporating the IV quantile regression method proposed by Chernozhukov and Hansen (2008). Another attractive approach would be the unconditional quantile regression and the decomposition method proposed by Firpo et al. (2007) and Firpo, Fortin, and Lemieux (2009), but it is unclear to us how to extend it to the endogenous variable case.

Our empirical analysis using CHNS data finds that the composition effect explains at least 80% of the income disparity between children from rich families and from poor families, and it explains all the income difference in the top 40% of the distribution. The income structure effect explains about 20% of the income difference in the lower 40% of the income distribution. Further analyses show that our results are robust to alternative specifications.

Our study sheds some light on the transmission mechanism of intergenerational inequality. It gives a better picture of how the composition effect and the income structure effect contribute to the income difference between children from rich and poor families. Our results suggest that improving labor market efficiency can reduce income inequality to a certain extent and, more importantly, that helping children from poor families to get more education can reduce income inequality substantially.

The rest of this paper is organized as follows. Section 2 describes our econometric model and decomposition method. Section 3 describes our data. Section 4 presents the empirical results. In Section 5 we do some robustness checks, and we conclude in Section 6.

2. The econometric model and the decomposition method

In this section, we propose an instrumental variable (IV) quantile regression (QR)-based counterfactual decomposition method. Our method extends MM's method to cases with endogenous variables. One issue with MM's procedure is the omitted variable problem (here, ability), which may bias the decomposition when there is a systematic difference between the ability of children from rich and poor families; see Fortin, Lemieux, and Firpo (2011). Our method solves the omitted variable problem by incorporating the instrumental variable quantile regression method proposed by Chernozhukov and Hansen (2008), which is introduced next.

2.1. An instrumental variable quantile regression (IVQR) model

To allow for endogenous control variables and to better describe the different behaviors of people at different income levels, we consider the following IVQR model:

$$Y = D'\alpha(U) + X'\beta(U), \qquad D = f(X, Z, V), \qquad U \mid X, Z \sim \mathrm{Uniform}(0, 1), \tag{1}$$
where Y is the log of the CPI-adjusted disposable yearly income of an individual. U is a scalar random variable that aggregates all of the unobserved factors affecting Y. U follows a uniform distribution on the interval (0,1), as in any QR model. X is a vector of exogenous control variables including experience,³ experience squared, gender, urban, SOE (state-owned enterprise), migrant worker, small business, provincial dummies, time dummies and a constant term. Urban is a binary variable that takes value 1 if one's Hukou is in the urban area and 0 otherwise. The dummy variable SOE equals 1 if an individual works for a state-owned enterprise and 0 otherwise. The variable migrant worker equals 1 if an individual is a migrant worker and 0 otherwise. D is years of education. Because of the well-known omitted variables problem, we allow D to be endogenous here. We further assume that D is a function of X, Z and V, where Z is an IV and V is an error term affecting D. Here we use the community education index,⁴ a measure of the average education level of a community, as the IV, which is
³ We do not control for age here, as experience is a linear function of age. Specifically, experience = age − years of education − 7.
⁴ The community education index is one component of the urbanization index created by Jones-Smith and Popkin (2010) and released by CHNS. The community education index allots a maximum total of 10 points for the average educational attainment of adults more than 21 years old in the community, and a higher score indicates a higher community education level. The scoring algorithms are developed based on distributions in the data, with the goal of having the median score be close to half of the total possible points and with sufficient spread in the scores between the minimum and maximum points.
not a weak IV as shown in the next section. The endogeneity of D implies that U and V are correlated with each other. Model (1) implies the following expression for the conditional τ-quantile of log-income:

$$Y_\tau = D'\alpha(\tau) + X'\beta(\tau).$$

We estimate model (1) using the IVQR method proposed by Chernozhukov and Hansen (2008). To be specific, for any given value of α, we minimize the objective function

$$Q_n(\tau, \alpha, \beta, \gamma) = \sum_{i=1}^{n} \rho_\tau\left(Y_i - D_i'\alpha - X_i'\beta - Z_i'\gamma\right)$$

and get the estimates for β and γ as follows:

$$\left(\hat\beta(\alpha, \tau), \hat\gamma(\alpha, \tau)\right) = \arg\min_{\beta, \gamma} Q_n(\tau, \alpha, \beta, \gamma), \tag{2}$$

where $\rho_\tau$ is the check function at the τ quantile,

$$\rho_\tau(u) = \left(\tau - 1(u \le 0)\right)u,$$

with 1(⋅) being the indicator function. Next, by making $\hat\gamma(\alpha, \tau)$ as close to 0 as possible, we get an estimator for α, i.e.

$$\hat\alpha(\tau) = \arg\inf_{\alpha \in \Psi} \mathrm{Wald}_n(\alpha), \qquad \mathrm{Wald}_n(\alpha) = n\,[\hat\gamma(\alpha, \tau)]'\,\hat A(\alpha)\,[\hat\gamma(\alpha, \tau)],$$

where Ψ is the parameter space of α, and $\hat A(\alpha)$ is an estimate for the inverse of the asymptotic covariance matrix of $\sqrt{n}\,[\hat\gamma(\alpha, \tau) - \gamma(\alpha, \tau)]$, which is obtained in the first step estimation. Therefore, $\mathrm{Wald}_n(\alpha)$ is the Wald test statistic for γ(α, τ) = 0. Our final estimators for α and β are given by $\hat\alpha(\tau)$ and $\hat\beta(\hat\alpha(\tau), \tau)$. Chernozhukov and Hansen (2008) prove the consistency and asymptotic normality of the above estimators under some mild assumptions.

2.2. An IVQR-based decomposition method

Before we introduce our decomposition method, we introduce some notation. We use superscripts h and l to indicate variables for children from rich families and children from poor families, respectively. For example, $Y^h$ and $Y^l$ denote the log-income of children from rich and poor families, respectively; $W^h = (D^{h\prime}, X^{h\prime})'$ and $W^l = (D^{l\prime}, X^{l\prime})'$ are the corresponding control variables; $\delta^h = (\alpha^{h\prime}, \beta^{h\prime})'$ and $\delta^l = (\alpha^{l\prime}, \beta^{l\prime})'$ are the corresponding coefficients; $n^h$ and $n^l$ are the numbers of observations of children from rich families and children from poor families, respectively. We let $F(\delta^h, W^h)$ and $F(\delta^l, W^l)$ denote the cumulative distribution functions (CDFs) of $Y^h$ and $Y^l$ implied by model (1), respectively. Finally, we let $F^*(\delta^l, W^h)$ denote the CDF of the counterfactual log-income, i.e. the log-income of children from poor families if they had the same characteristics as children from rich families. We approximate the CDFs as follows:

1. Run IV quantile regressions at $\tau_j = 0.01 \times j$, j = 1, 2, …, 99, using the data for children from rich families only, i.e. $\{Y_i^h, W_i^h, Z_i^h\}_{i=1}^{n^h}$, and get the coefficient estimates $\hat\delta^h(\tau_j)$. Approximate $F(\delta^h, W^h)$ by the empirical CDF of $\{W_i^{h\prime}\hat\delta^h(\tau_j)\}_{i=1,\,j=1}^{n^h,\,99}$.
2. Repeat step 1 using the data for children from poor families only, i.e. $\{Y_i^l, W_i^l, Z_i^l\}_{i=1}^{n^l}$, and get the coefficient estimates $\hat\delta^l(\tau_j)$. Approximate $F(\delta^l, W^l)$ by the empirical CDF of $\{W_i^{l\prime}\hat\delta^l(\tau_j)\}_{i=1,\,j=1}^{n^l,\,99}$.
3. Approximate $F^*(\delta^l, W^h)$ by the empirical CDF of $\{W_i^{h\prime}\hat\delta^l(\tau_j)\}_{i=1,\,j=1}^{n^h,\,99}$.
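The three approximation steps can be illustrated with a minimal Python sketch. It assumes the IVQR coefficient estimates are already available; the coefficient paths `delta_rich` and `delta_poor`, the covariate rows `W_rich` and `W_poor`, and all function names are hypothetical placeholders chosen for illustration, not estimates or code from this paper.

```python
import math
from bisect import bisect_right

# tau_j = 0.01 * j for j = 1, ..., 99, as in steps 1-3.
TAUS = [j / 100 for j in range(1, 100)]


def pooled_fitted_values(W, delta_by_tau):
    """Return the sorted values W_i' delta(tau_j) over all i and all j."""
    pooled = [
        sum(w_k * d_k for w_k, d_k in zip(w_i, delta_j))
        for w_i in W
        for delta_j in delta_by_tau
    ]
    return sorted(pooled)


def ecdf(sorted_values, y):
    """Empirical CDF of the pooled values, evaluated at y."""
    return bisect_right(sorted_values, y) / len(sorted_values)


def quantile(sorted_values, q):
    """Smallest pooled value whose empirical CDF is at least q."""
    n = len(sorted_values)
    k = max(0, math.ceil(q * n - 1e-9) - 1)  # small tolerance guards float error
    return sorted_values[k]


# Hypothetical inputs: each row of W is (years of education, constant).
W_rich = [(9.0, 1.0), (12.0, 1.0), (15.0, 1.0)]
W_poor = [(6.0, 1.0), (9.0, 1.0), (12.0, 1.0)]

# Hypothetical coefficient paths standing in for the IVQR estimates.
delta_rich = [(0.30, 5.0 + tau) for tau in TAUS]
delta_poor = [(0.25, 5.0 + tau) for tau in TAUS]

F_rich = pooled_fitted_values(W_rich, delta_rich)  # step 1
F_poor = pooled_fitted_values(W_poor, delta_poor)  # step 2
F_cf = pooled_fitted_values(W_rich, delta_poor)    # step 3 (counterfactual)

for q in (0.25, 0.50, 0.75):
    print(q, quantile(F_poor, q), quantile(F_cf, q), quantile(F_rich, q))
```

Pooling every observation with every quantile index is what turns the conditional quantile coefficients into an approximation of the unconditional distribution. In this toy setup the counterfactual quantiles lie between the poor and rich quantiles, because the hypothetical rich group has both more education and a higher return to it.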
To save notation, we still use $F(\delta^h, W^h)$, $F(\delta^l, W^l)$ and $F^*(\delta^l, W^h)$ to denote the approximated CDFs; this should cause no confusion. Notice that we get the conditional quantile regression coefficient estimates after IVQR. We then pair each $W_i$ with all the $\hat\delta(\tau_j)$'s, j = 1, 2, …, 99. As the empirical distribution of $\{W_i\}$ approximates the distribution of W, the above procedure gives us an approximation of the unconditional distribution of Y implied by model (1). A similar idea has been used in Albrecht, Björklund, and Vroman (2003), Melly (2005) and Dustmann, Ludsteck, and Schoenberg (2009).

With the above approximated CDFs, we are ready to state our decomposition method. Suppose we would like to decompose the difference between the q-quantile of the log-income for children from rich families, $q(Y^h)$, and from poor families, $q(Y^l)$. We have

$$q(Y^h) - q(Y^l) = \underbrace{q\big(F(\delta^h, W^h)\big) - q\big(F^*(\delta^l, W^h)\big)}_{\text{income structure effect}} + \underbrace{q\big(F^*(\delta^l, W^h)\big) - q\big(F(\delta^l, W^l)\big)}_{\text{composition effect}} + \text{residual},$$

where the first term on the right-hand side is due to the difference in the returns to the covariates, and we call this term the income structure effect; the second term is purely due to the covariate differences between the two groups of children, and we call this term the composition effect. The contribution of the first term to the income difference is

$$\text{contribution of income structure effect} = \frac{q\big(F(\delta^h, W^h)\big) - q\big(F^*(\delta^l, W^h)\big)}{q(Y^h) - q(Y^l)},$$

and the contribution of the second term is

$$\text{contribution of composition effect} = \frac{q\big(F^*(\delta^l, W^h)\big) - q\big(F(\delta^l, W^l)\big)}{q(Y^h) - q(Y^l)}.$$
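As a worked illustration of these formulas, the following Python sketch applies them at a single quantile. The three inputs are the estimated log-incomes at the 0.3 quantile reported in Table 5 (rich 8.17, poor 7.47, counterfactual 8.03). The function name `decompose_gap` is a hypothetical helper, and the shares are taken relative to the estimated gap, following the convention of Table 5 (e/d and f/d).

```python
def decompose_gap(q_rich, q_poor, q_counterfactual):
    """Split the rich-poor gap at one quantile into its two components."""
    gap = q_rich - q_poor                    # d = a - b
    structure = q_rich - q_counterfactual    # e = a - c
    composition = q_counterfactual - q_poor  # f = c - b
    return {
        "gap": gap,
        "structure": structure,
        "composition": composition,
        "structure_share": structure / gap,
        "composition_share": composition / gap,
    }


# Estimated log-incomes at the 0.3 quantile, as reported in Table 5.
result = decompose_gap(q_rich=8.17, q_poor=7.47, q_counterfactual=8.03)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

At this quantile the gap of 0.70 log points splits into 0.14 for the income structure effect and 0.56 for the composition effect, i.e. shares of about 20% and 80%. The sketch does not handle a zero gap, where the shares are undefined.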
3. Data

The data used in this paper are from the China Health and Nutrition Survey (CHNS), an ongoing international collaborative project between the Carolina Population Center at the University of North Carolina at Chapel Hill and the National Institute of Nutrition and Food Safety at the Chinese Center for Disease Control and Prevention. CHNS data provide detailed economic, demographic and health information for about 4400 households with a total of 26,000 individuals in nine provinces⁵ of China in eight survey years: 1989, 1991, 1993, 1997, 2000, 2004, 2006 and 2009. The CHNS data fit our study well because they allow us to trace the fathers' income back to their children's growing-up stage, when investment in human capital matters the most for the children. This is important because we want to study how parental income affects children's income, and one way that parents can affect children's income is through investing in their children's human capital.

We consider two groups of children: children from rich families, whose father's annual disposable income is in the top 20% of the sample wave,⁶ and children from poor families, whose father's annual disposable income is in the bottom 20% of the sample wave. We use the income of the fathers when their children were between 15 and 18 years old.⁷ We have used other classifications for rich and poor families as well as other age periods, and they all give similar results, as shown in Section 5.

Our sample consists of adult children aged more than 20 years with positive labor earnings. We trace their fathers' income status⁸ when the children were between 15 and 18 years old.⁹ After deleting observations with key variables missing, we obtain a pooled cross-sectional data set with 2677 observations.

The age of the children in our sample ranges from 20 to 38. Fig. 1 plots the histogram of children's ages. Almost 90% of the children in our sample are between 20 and 30 years old. The average age of the children is 24, and the average age of their fathers when the children were aged 15–18 is about 46.

Table 1 reports the descriptive statistics for the father's annual disposable income by quintile. All the income data are inflation-adjusted and expressed in 2009 Yuan.¹⁰ Under our classification, 393 families are poor families and 767 families are rich families. The mean of the father's annual disposable income is 11,683.56 Yuan for the highest quintile, while it is only 727.83 Yuan for the lowest quintile. As the income quintile increases, the standard deviation, as a measure of the income inequality within the quintile, also increases.

Table 2 reports the sample mean or proportion of the key variables for the full sample, rich families and poor families, respectively. Children from rich families have more years of education than children from poor families. The percentage attending secondary technical school and college is also higher for children from rich families. Meanwhile, the proportion working in SOEs (state-owned enterprises) or government, which are considered better jobs in China, is also higher for children from rich families. Another important feature is the large income difference between the two groups of children. The average annual disposable income of children from rich families is 11,856.34 Yuan, which is almost twice as large as that of children from poor families, 6732.19 Yuan.

Fig. 2 shows the ratio of the log-income of children from rich families to that of children from poor families at different quantiles. We also plot the 95% bootstrap confidence bands for the ratio. The ratio is significantly bigger than 1, which indicates a significant income difference between the two groups of children.

⁵ CHNS data cover 9 provinces in almost all the survey years, namely Heilongjiang, Jiangsu, Shandong, Guizhou, Guangxi, Hubei, Henan, Hunan and Liaoning. The only exception is the 1997 survey, where Liaoning is not included.
⁶ We have tried defining families with income above the median as rich families, which gives similar results, as shown in Section 5.
⁷ We use this period for our main results, as 15–18 is a period between attending school and going to work for our sample.
⁸ Here we only consider the biological father, not a stepfather or adoptive father.
⁹ If we observe a father's income when his children are aged 15–18 in two survey years, which is the case for 36% of our sample, we use the father's income at the children's earlier age.
¹⁰ The adjusted income is the nominal income divided by the consumer price index (CPI). The CPI constructed by CHNS already takes into account the differences between rural and urban areas as well as across provinces.

4. Empirical results

In this section, we first show the results of the IV quantile regression and provide evidence of the difference in the returns to the covariates between children from rich families and from poor families. We also analyze the differences in the covariates between the
Fig. 1. Histogram of adult children's age (x-axis: age; y-axis: density). Data source: CHNS 1989, 1991, 1993, 1997, 2000, 2004, 2006 and 2009.
Table 1
Descriptive statistics of father's annual disposable income.

Quantile       N      Mean       Median     SE         Min        Max
0%–20%         393    727.83     415.73     748.97     14.85      2369.44
20%–40%        437    1952.61    623.96     1845.92    1187.27    4565.74
40%–60%        474    3397.74    1236.93    2989.05    2167.37    10400.42
60%–80%        606    5052.36    1984.48    4311.86    3104.02    15124.78
80%–100%       767    11683.56   8548.38    9051.52    4764.94    71162.37
Full sample    2677   5518.44    3757.54    6273.27    14.85      71162.37

Notes: Data are from CHNS 1989, 1991, 1993, 1997, 2000, 2004, 2006 and 2009; N is the number of observations; SE stands for standard deviation; the income data are inflation-adjusted and expressed in 2009 Yuan.
two groups of children. We then decompose the rich dad effect on income inequality into the composition effect and the income structure effect.
4.1. Evidence on income structure difference and characteristic difference

Table 3 reports the estimation results of model (1) using the IVQR method with τ = 0.5. As we can see, children from rich families have a higher return to schooling than children from poor families. Fig. 3 further shows the IVQR estimates for the return to schooling for the two groups of children for 0.05 ≤ τ ≤ 0.95. In the lower quantiles of the income distribution, the return to schooling is notably higher for children from rich families than for children from poor families. We call this difference the income structure effect, as it says that, among children with the same characteristics, those from rich families have a higher return to schooling than those from poor families. Notice that we already control for the ability problem by using the IV.

Fig. 3 also shows a higher return to schooling for children from poor families in the top 20% of the income distribution. One possible explanation is that the high income of children from rich families is often due not to education but to owning a small business or having other opportunities, while the high income of children from poor families is usually due to better education.

To check the strength of the IV used in the IVQR, we calculate the weak IV measure implied by Assumption R5 of Chernozhukov and Hansen (2008). In our case, it takes the form of a density-weighted correlation between the endogenous variable, education, and the IV, the community education index.¹¹ With the aid of the 'np' package in R, we are able to estimate the density-weighted correlation. We get a value of 0.51 for our whole sample at quantiles around the median, 0.48 for the subsample of children from rich families, and 0.42 for the subsample of children from poor families. These results indicate that our IV is not weak.

For completeness, we also report the 2SLS estimates in Table 3, which tell a similar story. In addition, we report the Kleibergen-Paap rk Wald F statistic (Kleibergen & Paap, 2006) for weak IV. The F statistics for the full sample, children from rich families and children from poor families are 248.78, 78.32 and 49.01, respectively, which again indicate that our IV, the community education index, is not a weak IV.

¹¹ See the display below Assumptions R5 and R6 in Chernozhukov and Hansen (2008) as well as the discussion therein.
Table 2
Sample mean or proportion of key variables.

                                  Rich family   Poor family   Full sample
Income
  Income (2009 Yuan)              11,856.34     6,732.19      8,890.91
Education
  Years of education              10.18         8.13          9.33
  Primary school and below        5.08%         6.36%         4.30%
  Middle school                   14.70%        22.39%        14.38%
  High school                     50.63%        58.78%        49.91%
  Secondary technical schools     14.30%        9.41%         16.06%
  College and above               15.29%        3.05%         15.36%
Occupation
  SOE                             9.00%         2.80%         7.54%
  Government                      16.29%        2.04%         12.71%
Experience
  Years of experience             7.58          7.81          7.79
  1–10 years                      81.23%        78.88%        78.82%
  11–22 years                     18.77%        21.12%        21.18%
Other characteristics
  Urban Hukou                     21.12%        1.78%         19.13%
  Small business                  14.21%        13.23%        13.22%
  Male                            63.23%        67.94%        64.18%
  Age                             24.31         23.95         24.30
Family variables
  Father's age                    45.48         47.65         46.04
  Father's years of education     6.93          4.97          6.42
  Mother's years of education     4.32          2.59          3.92
  Community Education Index       3.04          2.22          2.77

Notes: Data are from CHNS 1989, 1991, 1993, 1997, 2000, 2004, 2006 and 2009. Income: children's inflation-adjusted annual disposable income in 2009 Yuan; SOE: whether one works in a state-owned enterprise; Government: whether one works for the government; Urban Hukou: whether one's Hukou is in the urban area; Small Business: whether one owns a small business; Community Education Index: a measure of the average education level of a community on a scale from 0 to 10.

Fig. 2. Ratios of the log-income of children from rich families to that of children from poor families at different quantiles (x-axis: quantile; y-axis: log income ratio; series: log income ratio with 95% confidence interval). Data source: CHNS 1989, 1991, 1993, 1997, 2000, 2004, 2006 and 2009.
Table 3
IVQR and 2SLS results.

                          (1)          (2)          (3)          (4)          (5)          (6)
                          IVQR         IVQR         IVQR         2SLS         2SLS         2SLS
                          Full sample  Rich dad's   Poor dad's   Full sample  Rich dad's   Poor dad's
                                       children     children                  children     children
Years of education        0.298***     0.295***     0.272*       0.259***     0.297***     0.242***
                          (0.0327)     (0.0547)     (0.155)      (0.0241)     (0.0491)     (0.0675)
Experience                0.132***     0.134***     0.203**      0.148***     0.139***     0.135**
                          (0.0294)     (0.0500)     (0.0928)     (0.0223)     (0.0396)     (0.0663)
Experience²/100           −0.222       −0.242       −0.488       −0.335***    −0.223       −0.209
                          (0.140)      (0.246)      (0.428)      (0.105)      (0.188)      (0.297)
Constant                  4.989***     5.190***     4.617***     5.352***     5.036***     5.170***
                          (0.222)      (0.350)      (0.812)      (0.318)      (0.642)      (0.784)
Other control variables   –            –            –            –            –            –
N                         2677         767          393          2677         767          393
R²                                                               0.284        0.245        0.281
Weak IV measure           0.51         0.48         0.42         248.78       78.32        49.01

Notes: Data are from CHNS 1989, 1991, 1993, 1997, 2000, 2004, 2006 and 2009; robust standard errors are reported in parentheses in the first three columns; cluster robust standard errors are reported in parentheses in the fourth to sixth columns; the weak IV measure for IVQR is the density-weighted correlation between education and the community education index as explained in Chernozhukov and Hansen (2008); the weak IV measure for 2SLS is the Kleibergen-Paap rk Wald F statistic; other control variables include: small business dummy, migrant worker dummy, SOE dummy, gender dummy, urban dummy, provincial dummies and time dummies.
* p-value < 0.1. ** p-value < 0.05. *** p-value < 0.01.

Fig. 3. Returns to education at different quantiles (x-axis: quantile; y-axis: return to education; series: rich dad's children, poor dad's children). Data source: CHNS 1989, 1991, 1993, 1997, 2000, 2004, 2006 and 2009.
Table 4 reports results on the characteristic differences among different groups of children, where we divide the children into 5 groups by the quintiles of their father's income. The base group is children from poor families (father's income in the lowest quintile). The first column shows the OLS results of regressing years of education on the 4 group dummies and other control variables. One can see that children from richer families have significantly more years of education. Columns 2 and 3 report the estimates of the linear probability model¹² for working at an SOE and working for the government, respectively. Again, children from richer families have significantly higher probabilities of getting these jobs, which are considered better ones in China.

¹² Probit or Logit models give similar average partial effects to the linear probability model used here.

4.2. The IVQR-based decomposition results

With the IVQR estimates from the previous subsection, we can approximate the CDFs of $Y^h$ and $Y^l$, $F(\delta^h, W^h)$ and $F(\delta^l, W^l)$, using the procedure described in Section 2.2, and then decompose the rich dad effect on income inequality into the composition effect and the income structure effect.
Table 4
Results on characteristic differences.

                           (1)                  (2)            (3)
                           Years of education   Work at SOE    Work for government
Father's income quantile
  20%–40%                  0.0667               0.0138         0.0279
                           (0.278)              (0.0144)       (0.0234)
  40%–60%                  0.379                0.0149         0.0851***
                           (0.247)              (0.0151)       (0.0250)
  60%–80%                  0.672***             0.0446***      0.126***
                           (0.235)              (0.0157)       (0.0231)
  80%–100%                 1.170***             0.0377***      0.113***
                           (0.222)              (0.0143)       (0.0218)
Years of education                              0.00677***     0.0338***
                                                (0.00203)      (0.00327)
Urban Hukou                2.265***             0.0752***      0.223***
                           (0.174)              (0.0211)       (0.0285)
Female                     0.114                0.00980        0.00560
                           (0.147)              (0.0118)       (0.0178)
Constant                   9.026***             −0.0707        −0.184***
                           (0.326)              (0.0437)       (0.0598)
N                          2677                 2677           2677
R²                         0.236                0.074          0.281

Notes: Data are from CHNS 1989, 1991, 1993, 1997, 2000, 2004, 2006 and 2009; Urban Hukou: whether one's Hukou is in the urban area; cluster robust standard errors are reported in parentheses.
* p-value < 0.1. ** p-value < 0.05. *** p-value < 0.01.
We first illustrate the performance of our method of approximating the CDFs, which is the stepping stone for our decomposition method. To that end, we plot the actual and approximated empirical CDFs of children's income in Fig. 4. The approximated income path traces the actual path reasonably well for both groups of children. Therefore, our method of approximating $F(\delta^h, W^h)$ and $F(\delta^l, W^l)$ works well in practice, and we proceed with the decomposition.

Following the procedure in Subsection 2.2, we then construct the counterfactual income of children from poor families if they had the same covariates as children from rich families. Fig. 5 plots the approximated income of children from rich families and poor families and the counterfactual income at different quantiles. The income of children from rich families is above that of children from poor families at all quantiles. The counterfactual income is above the income of children from poor families, and below the income of
Fig. 4. Estimated and actual empirical CDFs of children's income (panels: rich dad's children, poor dad's children; x-axis: quantile; y-axis: ln(income); series: estimated income, observed income). Data source: CHNS 1989, 1991, 1993, 1997, 2000, 2004, 2006 and 2009.

Fig. 5. Approximated income of children from rich families, poor families and the counterfactual income at different quantiles (x-axis: quantile; y-axis: ln(income); series: rich dad's children, poor dad's children, counterfactual). Data source: CHNS 1989, 1991, 1993, 1997, 2000, 2004, 2006 and 2009.
children from rich families in the lower 60% of the income distribution, i.e. if the children from poor families had the same characteristics as children from rich families, their income would increase but would in general still be less than the income of children from rich families. The counterfactual income is above the income of children from rich families in the upper 40% of the income distribution, which is consistent with our findings in Fig. 3.

Fig. 6 displays our decomposition results. The solid line denotes the difference in income between children from rich families and from poor families at different quantiles. The difference is larger in the lower quantiles. The dotted line is the difference between the counterfactual income and the income of children from poor families, which represents the composition effect: if the children from poor families had the same characteristics as children from rich families, their income would increase by this much. The dashed line is the difference between the income of children from rich families and the counterfactual income: the income of children from poor families would still be less than that of children from rich families by this much, even if they had the same covariates. It therefore represents the income structure effect and can be explained by the difference in returns to covariates between children from rich and poor families. The income structure effect is negative in the upper quantiles but insignificantly different from 0, as shown by the 90% bootstrap confidence intervals.

Fig. 6 also shows that the composition effect accounts for most of the income difference between children from rich and poor families. The income structure effect has a significant impact only at quantiles between 20% and 40% of the income distribution. The composition effect explains almost all the difference at the two ends of the income distribution.

Table 5 reports the detailed numerical results. Columns 2 and 3 give the observed and estimated log-income at the 9 deciles for rich dad's children, and columns 4 and 5 report the corresponding log-income for poor dad's children. Again, the estimated log-income using our method is very close to the actual observed one, which supports the validity of our method. Column 6 is the counterfactual income, and column 7 is the difference between the estimated income of rich dad's children and that of poor dad's children. The
Fig. 6. IVQR-based decomposition results (x-axis: quantile; y-axis: ln(income) gap; series: income gap, composition effect, income structure effect, 90% confidence interval). Data source: CHNS 1989, 1991, 1993, 1997, 2000, 2004, 2006 and 2009.
Table 5. IVQR-based decomposition results. Columns 2 to 6 report ln(income) of children.

| Quantile | Rich dad's children: observed | Rich dad's children: estimated (a) | Poor dad's children: observed | Poor dad's children: estimated (b) | Counterfactual income (c) | Estimated income gap (d = a − b) | Income structure effect: gap (e = a − c) | Income structure effect: contribution (e/d) | Composition effect: gap (f = c − b) | Composition effect: contribution (f/d) |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 7.26 | 6.99 | 6.32 | 6.37 | 6.95 | 0.61 | 0.04 | 6% | 0.58 | 94% |
| 0.2 | 7.97 | 7.70 | 6.80 | 7.01 | 7.58 | 0.69 | 0.12 | 17% | 0.57 | 83% |
| 0.3 | 8.41 | 8.17 | 7.22 | 7.47 | 8.03 | 0.70 | 0.14 | 20% | 0.56 | 80% |
| 0.4 | 8.72 | 8.55 | 7.70 | 7.87 | 8.44 | 0.68 | 0.11 | 16% | 0.57 | 84% |
| 0.5 | 8.92 | 8.87 | 8.17 | 8.25 | 8.81 | 0.62 | 0.05 | 9% | 0.57 | 91% |
| 0.6 | 9.19 | 9.17 | 8.66 | 8.62 | 9.18 | 0.55 | 0.00 | 0% | 0.55 | 100% |
| 0.7 | 9.45 | 9.49 | 8.96 | 9.03 | 9.57 | 0.46 | −0.08 | 0% | 0.54 | 100% |
| 0.8 | 9.68 | 9.85 | 9.24 | 9.51 | 10.03 | 0.34 | −0.19 | 0% | 0.53 | 100% |
| 0.9 | 10.06 | 10.34 | 9.64 | 10.16 | 10.74 | 0.18 | −0.40 | 0% | 0.58 | 100% |
Notes: Data are from CHNS 1989, 1991, 1993, 1997, 2000, 2004, 2006 and 2009; Income — inflation-adjusted annual disposable income in 2009 Yuan.
last four columns report the estimated income structure effect and composition effect, as well as their contributions to the income difference between rich dad's children and poor dad's children. As we can see, the composition effect explains at least 80% of the income difference, and it explains all the income difference at the top four deciles. The income structure effect explains about 20% of the income difference at the second, third and fourth deciles.

For comparison purposes, in Fig. 7 we also report the results of MM's method, i.e. without considering the endogeneity problem caused by unobservable ability. Compared with Fig. 6, the income structure effect accounts for a bigger proportion of the income difference under MM's method than under our method, because MM's method misattributes the contribution of ability to the income structure effect.

We can also construct the counterfactual income of children from poor families if they had the same income structure as children from rich families, which involves approximating the counterfactual CDF $F^*(\delta^h, W^l)$ by the empirical CDF of $\{W_i^{l\prime}\hat{\delta}^h(\tau_j)\}_{i=1,\,j=1}^{n_l,\,99}$. This gives decomposition results similar to those in Table 5, so we do not report them here.
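The pooling step behind this counterfactual CDF can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the rich-family coefficient vectors are taken as given (the IVQR estimation itself is not shown), the function names `fitted_pool`, `empirical_quantile` and `counterfactual_quantiles` are hypothetical, and the inputs are toy numbers rather than CHNS data.

```python
import math
from typing import List, Sequence


def fitted_pool(W: Sequence[Sequence[float]],
                delta_by_tau: Sequence[Sequence[float]]) -> List[float]:
    """Pool the fitted values w_i' delta(tau_j) over all i and all j."""
    return [
        sum(w_k * d_k for w_k, d_k in zip(w, delta))
        for w in W
        for delta in delta_by_tau
    ]


def empirical_quantile(values: Sequence[float], tau: float) -> float:
    """Smallest pooled value whose empirical CDF is at least tau."""
    ordered = sorted(values)
    index = max(0, math.ceil(tau * len(ordered)) - 1)
    return ordered[min(index, len(ordered) - 1)]


def counterfactual_quantiles(W: Sequence[Sequence[float]],
                             delta_by_tau: Sequence[Sequence[float]],
                             taus: Sequence[float]) -> List[float]:
    """Quantiles of the empirical CDF of the pooled fitted values."""
    pool = fitted_pool(W, delta_by_tau)
    return [empirical_quantile(pool, tau) for tau in taus]


# Toy inputs (assumptions): each row of W_l is (constant, years of
# schooling) for one child from a poor family.
W_l = [[1.0, 8.0], [1.0, 12.0]]
# Hypothetical rich-family coefficients at tau = 0.25, 0.50, 0.75;
# the text indexes the quantile grid up to j = 99.
delta_h = [[6.0, 0.125], [6.5, 0.125], [7.0, 0.125]]

median_counterfactual = counterfactual_quantiles(W_l, delta_h, [0.5])[0]
```

Each poor-family covariate vector is combined with every rich-family coefficient vector, so the pool has as many elements as there are observation and quantile pairs. Its quantiles then serve as the counterfactual quantiles.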
5. Robustness checks and discussions
We perform some robustness checks and discuss the selection bias in this section. First, we try another way of defining rich and poor families. We consider a family rich if the father's income is above the median income of the survey wave, and poor otherwise. In this way, we can use all 2677 observations in our sample. Second, we use the fathers' income status when their children were aged 10–14 instead of 15–18 to define rich and poor families. As shown below, the two modifications lead to results similar to those before. Last, we discuss qualitatively the possible selection bias caused by the unavailability of data on parents whose children live separately.
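The median-based definition of rich and poor families can be sketched as follows. This is only an illustration: the function name `label_rich_by_wave` and the income values are made up, and the median is taken over the records supplied for each wave, which is an assumption about how the wave median is computed.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, List, Tuple


def label_rich_by_wave(records: List[Tuple[int, float]]) -> List[bool]:
    """Return True where the father's income exceeds the median of its wave."""
    incomes_by_wave: Dict[int, List[float]] = defaultdict(list)
    for wave, income in records:
        incomes_by_wave[wave].append(income)
    wave_median = {w: median(v) for w, v in incomes_by_wave.items()}
    # Strictly above the wave median counts as rich; ties count as poor.
    return [income > wave_median[wave] for wave, income in records]


# Toy (survey wave, father's income) pairs; the values are made up.
records = [(1989, 100.0), (1989, 300.0), (1989, 200.0),
           (1991, 500.0), (1991, 900.0)]
is_rich = label_rich_by_wave(records)
```

Because every family falls on one side of the wave median or the other, no observation is dropped, which is why this definition uses the full sample.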
Fig. 7. Decomposition results using MM's method (x-axis: quantile; y-axis: ln(income) gap; series: income gap, composition effect, income structure effect, 90% confidence interval). Data source: CHNS 1989, 1991, 1993, 1997, 2000, 2004, 2006 and 2009.
Fig. 8. IVQR-based decomposition results, viewing income above median as rich families (x-axis: quantile; y-axis: ln(income) gap; series: income gap, composition effect, income structure effect, 90% confidence interval). Data source: CHNS 1989, 1991, 1993, 1997, 2000, 2004, 2006 and 2009.
5.1. Viewing income above median as rich families

Now we consider a family rich if the father's income is above the median income of the survey wave, and poor otherwise. Fig. 8 plots the new IVQR-based decomposition results, which are similar to those in Fig. 6. The composition effect still explains most of the income difference. The income structure effect plays a significant role only at quantiles lower than 60%.

Table 6 reports the detailed numerical decomposition results. Similar to the results in Table 5, the composition effect explains about 80% of the income difference at the lower deciles, and it explains all the income difference at the top four deciles; the income structure effect explains about 20% of the income difference at the lowest three deciles. The major difference between the two sets of results is at the first decile, where the contribution of the income structure effect rises from 6% to 24%.

5.2. Using fathers' income status when their children aged 10–14

If we use the fathers' income status when their children were between 10 and 14 years old, we end up with 1286 observations. Because of the limited sample size, we again consider a family rich if the father's income is above the median income of the survey wave, and poor otherwise. Following the procedure in Section 2, we get the IVQR-based decomposition results shown in Fig. 9, which are similar to those in Section 4.

5.3. Discussions on the selection bias

The sample design of CHNS does not allow it to survey parents whose children are living separately, which may introduce some sample selection bias. Because data on parents whose children are living separately are unavailable, we offer only a qualitative discussion in this subsection.

Table 6. IVQR-based decomposition results, robustness check. Columns 2 to 6 report ln(income) of children.

| Quantile | Rich dad's children: observed | Rich dad's children: estimated (a) | Poor dad's children: observed | Poor dad's children: estimated (b) | Counterfactual income (c) | Estimated income gap (d = a − b) | Income structure effect: gap (e = a − c) | Income structure effect: contribution (e/d) | Composition effect: gap (f = c − b) | Composition effect: contribution (f/d) |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 7.08 | 7.22 | 6.40 | 6.40 | 7.02 | 0.82 | 0.20 | 24% | 0.62 | 76% |
| 0.2 | 7.76 | 7.78 | 6.99 | 7.06 | 7.64 | 0.72 | 0.14 | 20% | 0.58 | 80% |
| 0.3 | 8.20 | 8.19 | 7.40 | 7.53 | 8.07 | 0.65 | 0.11 | 17% | 0.54 | 83% |
| 0.4 | 8.55 | 8.51 | 7.81 | 7.94 | 8.44 | 0.57 | 0.07 | 13% | 0.50 | 87% |
| 0.5 | 8.78 | 8.80 | 8.32 | 8.29 | 8.76 | 0.50 | 0.03 | 7% | 0.47 | 93% |
| 0.6 | 9.07 | 9.08 | 8.62 | 8.65 | 9.09 | 0.43 | −0.01 | 0% | 0.44 | 100% |
| 0.7 | 9.33 | 9.37 | 8.97 | 9.01 | 9.44 | 0.36 | −0.06 | 0% | 0.42 | 100% |
| 0.8 | 9.57 | 9.71 | 9.24 | 9.44 | 9.82 | 0.27 | −0.11 | 0% | 0.38 | 100% |
| 0.9 | 9.91 | 10.16 | 9.68 | 9.99 | 10.36 | 0.17 | −0.19 | 0% | 0.36 | 100% |
Notes: Data are from CHNS 1989, 1991, 1993, 1997, 2000, 2004, 2006 and 2009; Income — inflation-adjusted annual disposable income in 2009 Yuan.
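The columns of Table 6 are linked by the identities d = a − b, e = a − c and f = c − b, so that d = e + f. The short Python sketch below recomputes them from the rounded table entries. The function name `decompose_gap` is hypothetical. Truncating the contribution shares to the range 0% to 100% is an assumption inferred from the reported values, where a negative income structure effect is listed as 0%; it is not a rule stated in the text.

```python
from typing import Dict


def decompose_gap(a: float, b: float, c: float) -> Dict[str, float]:
    """Split the estimated gap d = a - b into e = a - c and f = c - b."""
    d = a - b  # estimated income gap
    e = a - c  # income structure effect
    f = c - b  # composition effect
    # Assumption inferred from the table: shares are truncated to [0, 1].
    share_e = min(1.0, max(0.0, e / d))
    share_f = min(1.0, max(0.0, f / d))
    return {"d": d, "e": e, "f": f, "share_e": share_e, "share_f": share_f}


# Rounded Table 6 entries at the first and ninth deciles.
first = decompose_gap(a=7.22, b=6.40, c=7.02)
ninth = decompose_gap(a=10.16, b=9.99, c=10.36)
```

At the first decile this gives shares of about 24% and 76%, in line with the table. Because the table entries are rounded, recomputed values at other deciles can differ in the last digit: at the second decile, 0.14/0.72 is about 19%, against the reported 20%.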
Fig. 9. IVQR-based decomposition results, using fathers' income status when their children aged 10–14 (x-axis: quantile; y-axis: ln(income) gap; series: income gap, income structure effect, composition effect). Data source: CHNS 1989, 1991, 1993, 1997, 2000, 2004, 2006 and 2009.
Generally speaking, children with higher human capital are more likely to live separately from their parents. A simple Probit model confirms this: more years of education significantly increase the probability of living separately from parents. As there are more children with higher human capital from rich families than from poor families, see e.g. Björklund and Salvanes (2011), our analysis will underestimate the human capital difference between the two groups, and hence the composition effect.

Besides, the sample selection may lead us to overestimate the income structure effect. We illustrate our point using education as an example, and we expect that similar arguments hold for experience. The literature has documented diminishing marginal returns to schooling at high levels of education (Heckman, Lochner, & Todd, 2008; Mincer, 1974; Trostel, 2005). Therefore, omitting children with higher human capital may lead to an overestimated coefficient of education. As there are more children with higher human capital from rich families than from poor families, the coefficient for children from rich families is overestimated by more. As a result, the difference between the coefficients, and hence the income structure effect, is overestimated.

In summary, the composition effect will still play a more important role in explaining the income inequality than the income structure effect does, even after considering the possible sample selection bias.

6. Conclusions

The literature has documented two channels through which parents' income can affect their children's income. On the one hand, parents with higher income can invest more in their children's human capital, which brings higher income for their children. On the other hand, children from rich and poor families may face different returns to the covariates, which may also cause income inequality among the children. We disentangle the effects of these two channels in this paper by counterfactual decomposition.
To deal with the omitted variable problem, we propose an instrumental variable quantile regression-based decomposition method. Our empirical analysis using CHNS data finds that the composition effect explains at least 80% of the income difference at any quantile, and it explains all the income difference at the top four deciles. The income structure effect has a significant impact only at quantiles between 20% and 40%, where it explains about 20% of the income difference.

These findings have several policy implications. First, policies designed to redistribute income and to subsidize the acquisition of human capital would help children from poor families overcome the “poverty trap”. Second, enhancing fairness in the labor market can generate some improvement, but would have limited effects on reducing intergenerational income inequality. Third, because poor families face not only tight budget constraints in investing in their children's human capital, but also unequal educational opportunities, policies designed to relax budget constraints for poor families and to provide equal opportunities for the acquisition of human capital would weaken the inheritance of inequality.

References

Albrecht, J., Björklund, A., & Vroman, S. (2003). Is there a glass ceiling in Sweden? Journal of Labor Economics, 21(1), 145–178.
Becker, G. S., & Tomes, N. (1986). Human capital and the rise and fall of families. Journal of Labor Economics, 4, 1–39.
Björklund, A., & Jäntti, M. (2009). Intergenerational income mobility and the role of family background. In W. Salverda, B. Nolan, & T. Smeeding (Eds.), The Oxford Handbook of Economic Inequality (pp. 491–521). Oxford: Oxford University Press.
Björklund, A., & Salvanes, K. G. (2011). Education and family background: Mechanisms and policies. In E. A. Hanushek, S. Machin, & L. Woessmann (Eds.), Handbook of the Economics of Education. Elsevier.
Bowles, S., & Gintis, H. (2002). The inheritance of inequality. Journal of Economic Perspectives, 16(3), 3–30.
Chernozhukov, V., & Hansen, C. (2008). Instrumental variable quantile regression: A robust inference approach. Journal of Econometrics, 142(1), 379–398.
Conlisk, J. (1974). Can equalization of opportunity reduce social mobility? The American Economic Review, 64(1), 80–90.
Dahl, G. B., & Lochner, L. (2012). The impact of family income on child achievement: Evidence from the earned income tax credit. American Economic Review, 102(5), 1927–1956.
Dustmann, C., Ludsteck, J., & Schoenberg, U. (2009). Revisiting the German wage structure. Quarterly Journal of Economics, 124(2), 843–881.
Erikson, R., & Goldthorpe, J. H. (2002). Intergenerational inequality: A sociological perspective. The Journal of Economic Perspectives, 16(3), 31–44.
Firpo, S., Fortin, N. M., & Lemieux, T. (2007). Decomposing wage distributions using recentered influence functions regressions. Mimeo, University of British Columbia.
Firpo, S., Fortin, N. M., & Lemieux, T. (2009). Unconditional quantile regressions. Econometrica, 77, 953–973.
Fortin, N., Lemieux, T., & Firpo, S. (2011). Decomposition methods in economics. Handbook of Labor Economics. Elsevier.
Heckman, J. J., Lochner, L. J., & Todd, P. E. (2008). Earnings functions and rates of return. Journal of Human Capital, 2(1), 1–31.
Jones, S. J., & Popkin, M. B. (2010). Understanding community context and adult health changes in China: Development of an urbanicity scale. Social Science & Medicine, 71(8), 1436–1446.
Lefgren, L., Lindquist, M. J., & Sims, D. (2012). Rich dad, smart dad: Decomposing the intergenerational transmission of income. Journal of Political Economy, 120(2), 268–303.
Li, H., Meng, L., Shi, X., & Wu, B. (2012). Parental political capital and children labor market performance: Evidence from the first job offers of Chinese college graduates (in Chinese). China Economic Quarterly, 03, 1011–1026.
Mata, J., & Machado, J. A. F. (2005). Counterfactual decomposition of changes in wage distributions using quantile regression. Journal of Applied Econometrics, 20(4), 445–465.
Mayer, S. E. (2002). The influence of parental income on children's outcomes. Report to the New Zealand Ministry of Social Development (http://www.msd.govt.nz/aboutmsd-and-our-work/publications-resources/research/influence-parental-income/index.html).
Melly, B. (2005). Decomposition of differences in distribution using quantile regression. Labour Economics, 12(4), 577–590.
Mincer, J. (1974). Schooling, experience, and earnings. New York: Columbia University Press.
Oaxaca, R. (1973). Male and female wage differentials in urban labor market. International Economic Review, 14(3), 693–709.
Ruhm, C. J. (1988). When ‘equal opportunity’ is not enough: Training costs and intergenerational inequality. The Journal of Human Resources, 23(2), 155–172.
Shea, J. (2000). Does parents money matter? Journal of Public Economics, 77(2), 155–184.
Solon, G. (1999). Intergenerational mobility in the labor market. In C. O. Ashenfelter, & D. Card (Eds.), Handbook of Labor Economics, Vol. 3A (pp. 1761–1800). Amsterdam: North-Holland.
Trostel, P. A. (2005). Nonlinearity in the return to education. Journal of Applied Economics, 8(1), 191–202.
Zhang, Y., & Eriksson, T. (2010). Inequality of opportunity and income inequality in nine Chinese provinces, 1989–2006. China Economic Review, 21(4), 607–616.
Zimmerman, D. J. (1992). Regression toward mediocrity in economic stature. American Economic Review, 82(3), 409–429.