Social Science & Medicine 134 (2015) 12–22
Doubling up: A gift or a shame? Intergenerational households and parental depression of older Europeans

Luis Aranda

University of Venice, Cannaregio 873, 30121 Venice, Italy
Article info

Article history: Available online 31 March 2015

Keywords: Europe; Doubling up; Mental health; Aging; DID; Matching; Wellbeing

Abstract

The Great Recession has brought along a rearrangement of living patterns both in the U.S. and in Europe. This study seeks to identify the consequences of a change in intergenerational coresidence on the depression level of the elderly. Using data from the Survey of Health, Ageing and Retirement in Europe (SHARE) and a difference-in-difference propensity score matching approach, this study finds robust evidence of a positive effect of coresidence on the mental health of the older generation in those European countries historically marked by a Catholic tradition. In contrast with previous literature, the present program evaluation setup accounts for non-random selection bias and heterogeneous treatment effects. Though heterogeneous across Europe, the results highlight that, in a time marked by increasing demographic aging, intergenerational living arrangements can lead to significant improvements in the quality of life of older individuals. © 2015 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.socscimed.2015.03.056
0277-9536/© 2015 Elsevier Ltd. All rights reserved.

1. Introduction

Multigenerational living arrangements have been on the rise for the last few decades, both in the U.S. (Kochhar and Cohn, 2011; Taylor et al., 2012) and in Europe (Corselli-Nordblad, 2010; Choroszewicz and Wolff, 2010). Likewise, in recent years a growing number of young adults have been moving back into their parents' home, arguably a strategic and protective response to the economic hardships and high unemployment rates brought about by the Great Recession (Mykyta, 2012; Kaplan, 2012). This sharp increase in the number of households where more than one generation of adults decides to “double up” has gained social and economic importance because of its far-reaching implications for the public support systems of today's aging societies.

The causes of multigenerational household arrangements are without a doubt of great interest and importance. However, this study shifts attention to the analysis and measurement of their effects, which have been largely disregarded in the literature, especially for individuals belonging to the 50+ age group. So far, solid empirical evidence of the effects of coresidence on the health of the elderly is scarce. Much has been said and done about the effects of leaving or returning to the parents' household on the younger generation (Taylor et al., 2012; Parker, 2012; Mykyta, 2012; Wiemers, 2014; Kaplan, 2012), but how do older adults fare when such changes take place?

This paper aims to advance the literature by separating the causal effect of doubling up on the quality of life of older Europeans from potential confounding factors. Using longitudinal data from the Survey of Health, Ageing and Retirement in Europe (SHARE), it seeks in particular to untangle the impact of a coresidence change on the psychological health of the elderly, as proxied by a self-reported depression index.

Establishing a causal link is a challenging empirical task. I adopt econometric techniques from the program evaluation literature to do so. A non-parametric difference-in-difference (DID) propensity score matching approach (Heckman et al., 1997; Heckman et al., 1998, 2005) is used to assess the causal effect of a child moving into or out of the household (“treatment”) on the depression level of older Europeans. The treatment group consists of elderly parents who experience such a transition from one wave to the next, while the coresidence status of the control group remains unchanged.

The use of a simple linear regression framework to unveil the link between coresidence and depression is inhibited by the likely existence of confounding factors. For instance, reverse causality may exist as an increase in parental depression and mental
deterioration may induce the coresidence choice of adult children. Moreover, selection bias may arise as respondents with certain characteristics, both observable and not, are more likely to join an intergenerational living arrangement. I exploit the longitudinal nature of the SHARE data to account for the former bias and rid the estimates of unobservable time-invariant traits, while matching techniques are employed to minimize non-random selection into coresidence based on observed individual attributes. Subsequently, the robustness of our results to the presence of omitted variables is thoroughly investigated through a series of statistical tests. Though common, these confounding factors have rarely been addressed by previous studies on this topic.

The results are summarized in three key findings. First, a change in coresidence has no sizable effects on parental depression when all countries are pooled together. This is arguably due to heterogeneous treatment effects across European regions canceling each other out. Second, a structural divide emerges when the sample is split into two macroregions with historically divergent values. In particular, the effect of a change in coresidence on depression moves in opposite directions depending on whether a country is marked by a Protestant or a Catholic tradition. Parental depression never increases significantly after a double up; however, it does decrease significantly, by a magnitude of 19%, when the transition into coresidence takes place in Catholic Europe. To my knowledge, this constitutes pioneering evidence of the substantial heterogeneity in the coresidence effect. Third, the dissolution of shared households has no impact on the outcome of interest in either macroregion, reflecting the common belief that an adult child's move into independent living is a signal of success.

1.1. Literature review

Previous research has identified a wide range of indicators linking social support to both physical and mental wellbeing. In particular, a positive association between loneliness and depression in old age is consistent in the literature (e.g., Green et al., 1992; Singh and Misra, 2009; Luo et al., 2012; Riumallo-Herl et al., 2014). Social support from adult children has been shown to buffer the adverse effects of negative life events, such as widowhood, on mental wellbeing (Silverstein and Bengtson, 1994; Li et al., 2005; Mazzella et al., 2010). In contrast, the association between the elderly's living arrangements and their wellbeing is still a topic of ongoing debate: while some studies have shown a positive effect of coresidence with adult children (Zunzunegui et al., 2001; Hughes and Waite, 2002; Okabayashi et al., 2004; Chen and Short, 2008), others have found a negative or null effect (Chyi and Mao, 2012; Lowenstein and Katz, 2005). The present study seeks to complement these strands of literature by examining the effect of coresidence choices on the mental wellbeing of the older European generation.

In what follows, Section 2 describes the data and the main variables and models the coresidence problem. In Section 3 the results are presented and supported through a series of balancing tests. Finally, the robustness of our results is assessed in Section 4, followed by a brief discussion and concluding remarks in Section 5.
2. Empirical methodology

2.1. Description of the data

This study uses data from the first (2004), second (2006) and fourth (2010) waves of the Survey of Health, Ageing and Retirement in Europe (SHARE), which surveys people aged 50 and over in 19 European countries (and Israel). Only respondents present in all three waves are considered. The analysis is thus limited to the ten countries consistently present in those waves: Austria, Belgium, Denmark, France, Germany, Italy, the Netherlands, Spain, Sweden, and Switzerland. SHARE is a multidisciplinary and cross-national database which provides detailed information on physical and mental health, socio-economic status, and social and family networks of respondents and their households. International comparisons are permitted by the inter-country standardization of all questions. The sample is made up of 10,107 individuals participating in waves 1, 2 and 4 (57% females), making for 30,321 observations, out of which 12,463 contain complete information on depression levels, geographical proximity with children, and the usual socioeconomic and health indicators used as controls.

2.1.1. Independent variable: coresidence with children

Coresidence in SHARE is measured by asking respondents about their children's living arrangements: “Where does child [child name] live?” Although respondents are asked to choose among nine alternatives denoting different degrees of geographical proximity, for our purposes the only answer considered is “1) in the same household.” Since 97.2% of respondents in our dataset mention having at most five children, the analysis is limited to a respondent's coresidence with any of her first five children. Given the longitudinal nature of the dataset, it is possible to identify changes in coresidence between the respondent and her children from one wave to the next, as well as to observe the respondent's depression level before and after the move.

The analysis is split up into two time periods, depending on when the move took place: short term (between wave 1 in 2004 and wave 2 in 2006) and long term (between wave 2 in 2006 and wave 4 in 2010). Inter-wave refresher samples are avoided in order to preserve sample comparability in the short and long terms (I thank an anonymous referee for highlighting the need to point this out explicitly). These short and long term schemes seem adequate in capturing the real and lasting effects of a shock in family proximity, since depression is known to be a recurrent condition with effects that can take some time to develop and even longer to dissipate, continuing for months and even years (Aging and Depression, 2014). In total, 115 individuals doubled up in the short term and 301 did so in the long term. Splitting up seems to be more common in our sample, since 413 and 335 respondents split up in the short and long term, respectively.

One of the drawbacks of the data is that following children over time is difficult at best, which, among other things, hinders our ability to draw conclusions regarding the gender, age, and socioeconomic conditions of the specific children involved in the treatment. Moreover, although after a coresidence change it is not possible to identify who the actual mover is, for our purposes the factor of interest remains coresidence per se, irrespective of which family member is the protagonist of such a move (be it child A, child B, etc., or the respondent herself). Given such data limitations, coresidence in our analysis is determined by the respondent and her closest child. Coresidence is hence the independent or explanatory variable of this study, constructed as a binary indicator equal to one if the respondent and at least one of her children live in the same household and zero otherwise.
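As an illustration only (this is not the author's code), the construction of the coresidence indicator and of the two transitions studied here can be sketched in Python. The label `same_household` and the function names are hypothetical stand-ins for the SHARE answer “1) in the same household”; the actual variable coding in SHARE is not reproduced here.

```python
from typing import Sequence

# Hypothetical label standing in for SHARE answer "1) in the same household".
SAME_HOUSEHOLD = "same_household"


def coresides(child_locations: Sequence[str]) -> int:
    """Return 1 if any of the first five children lives in the respondent's household."""
    return int(any(loc == SAME_HOUSEHOLD for loc in child_locations[:5]))


def classify_transition(before: int, after: int) -> str:
    """Label the change in coresidence status between two consecutive waves."""
    if before == 0 and after == 1:
        return "double_up"
    if before == 1 and after == 0:
        return "split_up"
    return "no_change"


# Made-up respondent: no coresident child in the first wave, one in the next.
wave1 = ["same_town", "abroad"]
wave2 = ["same_household", "abroad"]
status = classify_transition(coresides(wave1), coresides(wave2))
print(status)
```

Respondents labelled `double_up` or `split_up` would form the treated groups of the respective analyses, while `no_change` respondents would supply the pool of potential controls.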
2.1.2. Dependent variable: depression score

The dependent variable is depression, defined on the basis of a symptom-oriented measure known as the EURO-D scale. The EURO-D draws on five depression measures (GMS-AGECAT, SHORT-CARE, CES-D, ZSDS, and CPRS), which are harmonized to produce a 12-item scale comprising the following symptoms:
pessimism, depressed mood, suicidal thoughts, guilt, trouble sleeping, loss of interest, irritability, fatigue, inability to concentrate, lack of appetite, inability to enjoy, and regular tearfulness (Prince et al., 1999). The index was developed in an effort to compare depression symptoms in fourteen European centers and is constructed as a sum of binary responses, ranging from a score of 0 (no symptoms) to 12 (all symptoms). According to Prince et al. (1999), the EURO-D is a valid and internally consistent scale which correlates well with other well-known mental health measures, providing for a valid comparison of risk factor associations in mental health between units.

2.1.3. Matching covariates

The probability of receiving treatment, or “propensity score” (Rosenbaum and Rubin, 1983), is built upon a number of observables which may prove relevant to the outcome of interest. Such observables include: a) sociodemographic characteristics, such as age, gender, marital status, years of education, area of residence (either urban or rural), and health (the number of chronic conditions as well as the difficulties in performing activities of daily living, or “ADLs”); and b) economic situation, measured by household income (split into quintiles by country), ability to “make ends meet” financially (i.e., financial distress), and employment status (employed, unemployed, or retired). Respondents are also matched on the number of living children and their average age. Additional variables, such as the number of grandchildren, were included to assess the robustness of the model at hand, but were then dropped for lack of explanatory power. As per Bryson et al. (2002) and Caliendo and Kopeinig (2008), including extraneous variables in the model may aggravate the common support problem, and hence researchers should refrain from over-parameterizing their models. Summary statistics on the main variables in our sample are presented in Table 1.

Table 1
Summary statistics of the main variables.

| Variable | Mean | Median | S.D. | Min | Max | N |
|---|---|---|---|---|---|---|
| EURO-D | 2.27 | 2 | 2.14 | 0 | 11 | 12,463 |
| Age | 63.82 | 63 | 9.16 | 50 | 94 | 12,463 |
| Female | 0.57 | 1 | 0.49 | 0 | 1 | 12,463 |
| Married (a) | 0.58 | 1 | 0.49 | 0 | 1 | 12,463 |
| Single (b) | 0.42 | 0 | 0.49 | 0 | 1 | 12,463 |
| Education (years) | 10.75 | 11 | 4.42 | 0 | 25 | 12,463 |
| Financial distress | 0.31 | 0 | 0.46 | 0 | 1 | 12,463 |
| Income quintile | 2.90 | 3 | 1.40 | 1 | 5 | 12,463 |
| Employed | 0.09 | 0 | 0.29 | 0 | 1 | 12,463 |
| Unemployed | 0.03 | 0 | 0.16 | 0 | 1 | 12,463 |
| Retired | 0.57 | 1 | 0.49 | 0 | 1 | 12,463 |
| Chronic conditions (>1) | 0.45 | 0 | 0.50 | 0 | 1 | 12,463 |
| ADLs (1) | 0.09 | 0 | 0.29 | 0 | 1 | 12,463 |
| Urban area | 0.45 | 0 | 0.50 | 0 | 1 | 12,463 |
| Rural area | 0.55 | 1 | 0.50 | 0 | 1 | 12,463 |
| Number of children | 2.46 | 2 | 1.32 | 0 | 12 | 12,463 |
| Age of children | 34.93 | 34 | 10.06 | 2 | 70 | 12,463 |

Notes: (a) Married and living together or in a registered partnership. (b) Separated, never married, divorced, or widowed.

2.2. Modeling the change-in-coresidence problem

Modeling the problem empirically, individual respondents are classified depending on whether a change in coresidence with an adult child was registered between two waves, independent of the direction of the change (e.g., coresidence or dissolution). Let $D_{it} \in \{0, 1\}$ indicate such a change in coresidence status from one
wave to the next. If a change is registered, respondents are classified as treated and are assigned a value of $D_{it} = 1$. Otherwise, respondents are left untreated ($D_{it} = 0$). Moreover, let $Y^{1}_{i(t+s)}$ be the respondent's depression score at time $t + s$, $s \geq 0$, following a change in proximity, while $Y^{0}_{i(t+s)}$ represents the depression score had there not been such a change. The causal effect of a change in family proximity for respondent $i$ at time $t + s$ is then given by $Y^{1}_{i(t+s)} - Y^{0}_{i(t+s)}$, which leads to the estimation of the average treatment effect on the treated:

$$E\left[ Y^{1}_{i(t+s)} - Y^{0}_{i(t+s)} \mid D_{it} = 1 \right] = E\left[ Y^{1}_{i(t+s)} \mid D_{it} = 1 \right] - \underbrace{E\left[ Y^{0}_{i(t+s)} \mid D_{it} = 1 \right]}_{\theta} \qquad (1)$$

The last term in (1), labeled $\theta$ for short, represents the expected outcome that treated respondents would have experienced had there not been a change in coresidence, i.e., the counterfactual. Given that the counterfactual is unobserved for respondents undergoing a coresidence transition, causal inference relies on its proper approximation through a suitable empirical construction of $Y^{0}_{i(t+s)}$. In the absence of confounding factors which might influence treatment assignment (and thus bias our results), a valid approximation to $E[Y^{0}_{i(t+s)} \mid D_{it} = 1]$ would be given by the average depression level of respondents not experiencing a change in family proximity, namely $E[Y^{0}_{i(t+s)} \mid D_{it} = 0]$. Since this is seldom the case, matching techniques are needed to minimize estimation bias by constructing an appropriate control group.

By building up the propensity score for each respondent on the basis of observable characteristics, it is possible to match each individual undergoing an inter-wave change in coresidence with children with a comparable respondent for whom no such move occurred. Individual differences in observable attributes are thus minimized and depression score dynamics for the untreated group can be used to construct the counterfactual for the treated individuals. A probit model is used to calculate the propensity score $P(D_{it} = 1) = f(X_{i(t-1)})$, where $X_{i(t-1)}$ constitutes a vector of observed pre-treatment individual characteristics.

In our specific case, the limited sample size plays a crucial role in the selection of the matching algorithm. The reduced number of control units at our disposal limits our choice to only those algorithms where matching with replacement is allowed.
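To make the propensity score step concrete, the following minimal sketch evaluates a probit score for one respondent. It assumes the coefficients have already been estimated; the covariates, coefficient values, and function names are invented for illustration and do not come from the paper, whose estimation was carried out in Stata.

```python
import math
from typing import Sequence


def standard_normal_cdf(z: float) -> float:
    """Standard normal CDF, i.e. the link function of a probit model."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_propensity(x: Sequence[float], beta: Sequence[float], intercept: float) -> float:
    """Probability of treatment implied by a probit model with given coefficients."""
    index = intercept + sum(b * v for b, v in zip(beta, x))
    return standard_normal_cdf(index)


# Hypothetical pre-treatment covariates:
# [decades of age above 50, married (0/1), financial distress (0/1)]
beta = [-0.10, 0.20, 0.30]   # invented coefficients, not estimates from the paper
intercept = -1.50            # invented intercept
respondent = [1.4, 1.0, 0.0]

score = probit_propensity(respondent, beta, intercept)
print(round(score, 3))
```

In practice the coefficients would be obtained by maximum likelihood on the treated and untreated respondents; only the resulting score enters the matching step.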
Due to its advantages in smaller samples and the lower variance achieved by using more information from a broader control base, the kernel matching estimator with replacement is the better option for the present study (Caliendo and Kopeinig, 2008). In particular, matching results throughout this study are based on the Epanechnikov kernel, although the choice of kernel function appears to be relatively unimportant in the majority of cases (DiNardo and Tobias, 2001; Caliendo and Kopeinig, 2008). Kernel estimators use a kernel-weighted average over multiple individuals in the comparison group to construct a match for each treated unit. Following Heckman et al. (1997, 1998), the kernel matching estimator is used to construct weighted averages of the untreated individuals' outcomes aimed at matching the treated as closely as possible. The weights $W(i,j)$ depend on the distance between the propensity score of each untreated unit in the comparison group and the treated individual for whom the counterfactual is being built up (Smith and Todd, 2005). The average treatment effect on the treated is then estimated by inserting the propensity score into the following DID-matching estimator:
$$\hat{\delta}_{DIDM} = \frac{1}{n_{1P}} \sum_{i \in I_1 \cap S_P} \left[ \left( Y^{1}_{i(t+s)} - Y^{0}_{i(t)} \right) - \sum_{j \in I_0 \cap S_P} W(i,j) \left( Y^{0}_{j(t+s)} - Y^{0}_{j(t)} \right) \right] \qquad (2)$$
where $I_1$ and $I_0$ are the sets of the treated and untreated individuals, respectively; $S_P$ is the common support region; and $n_{1P}$ is the number of treated individuals whose propensity score lies within the region of common support, i.e., those belonging to the set $I_1 \cap S_P$. Although this estimator still requires the selection on observables assumption inherent to matching, performing DID effectively eliminates all unobserved time-invariant characteristics of respondents which would otherwise bias the results (Girma and Goerg, 2007).

Finally, the assumption of independence conditional on observables common to all matching methods implies that treatment and control groups should be balanced (made similar) in terms of pre-treatment characteristics. By guaranteeing comparability, a set of properly balanced covariates will make a case for the validity of our results. The findings presented in the following section are accompanied by several balancing tests displayed in both numerical and graphical format. As commonly done in the literature (e.g., Lechner, 2002; Heckman et al., 1997; Black and Smith, 2004), bootstrapping (200 repetitions) is used to approximate standard errors. I follow the common convention in the literature of not considering sampling weights in the context of matching (Leuven and Sianesi, 2003). Another way to diagnose the quality of the matching estimates is to perform sensitivity and robustness checks by exposing the model to minor changes and testing its main assumptions, which is done in Section 4. The diff and psmatch2 Stata commands and their subcommands were employed for estimation (Villa (2011) and Leuven and Sianesi (2003), respectively).

3. Results

The analysis is split into two subsections. First, the DID matching estimates are computed on a pooled sample of all countries in the dataset.
Second, the sample is split up into two European macro blocks (Protestant and Catholic Europe) in an attempt to identify any potential large-scale environmental factors leading to heterogeneity in the treatment effect.

3.1. Pooled sample

Table 2 displays the DID matching estimates of the doubling up effect when the ten European countries in our sample are pooled together. The resulting estimates indicate that neither doubling up nor dissolving a child-parent household between waves has a significant effect on parental depression levels. This is true both in the short and long terms.

Table 2
“Double up” and “split up” DID matching estimates (pooled sample).

| | Double up, S.T. | Double up, L.T. | Split up, S.T. | Split up, L.T. |
|---|---|---|---|---|
| Δ Depression | 0.191 | 0.300 | 0.051 | 0.225 |
| p-value | 0.615 | 0.161 | 0.825 | 0.315 |
| Std. Error | 0.380 | 0.214 | 0.232 | 0.224 |
| Number treated | 69 | 215 | 282 | 281 |
| Number control | 3017 | 2946 | 791 | 573 |

Notes: S.T. = Short term (2 years); L.T. = Long term (4 years). *p < 0.10, **p < 0.05, ***p < 0.01.

The validity of our estimates is supported by the successful balancing of the treatment and control groups. Table 3 shows that matching is indeed effective in removing differences in observable characteristics between individuals undergoing treatment and their control counterparts. It presents the standardized difference (or bias) for all the covariates used in the propensity score estimation (Rosenbaum and Rubin, 1983). Intuitively, the standardized difference can be described as the difference in means of a covariate divided by its pooled standard deviation. The lower the standardized bias, the higher the balance (or comparability) between treatment and control groups in terms of observable characteristics. Given the lack of formal criteria for evaluating the size of the bias, I follow Rosenbaum and Rubin (1983) in assuming that a value of 20 is “large.” Additionally, I abide by the common practice in the literature of estimating the standardized bias for all the covariates included in the matching, since bias reduction for individual covariates might be misleading.

Propensity score matching reduces the median absolute bias of the models substantially, by between 65% and 87%. The median standardized differences between the matched treatment and control samples are all less than 3%. Matching also effectively removes any explanatory power of the covariates in the model, as indicated by a pseudo R-squared close to zero.

Histograms of the propensity scores of the treatment and control groups are plotted in Fig. 1 to assess whether there exists enough overlap to make reasonable comparisons between them. The charts are satisfactory in that they show similar propensity score distributions for both groups. Put differently, there are enough control cases to serve as analogue counterparts for the treated.
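The standardized difference described above can be sketched in a few lines of Python. This is a minimal illustration that assumes the usual convention of expressing the difference in means as a percentage of the square root of the average of the two sample variances; the function name and the age values are made up and are not taken from the SHARE sample.

```python
import math
from statistics import mean, variance
from typing import Sequence


def standardized_bias(treated: Sequence[float], control: Sequence[float]) -> float:
    """Difference in covariate means as a percentage of the pooled standard deviation."""
    # Assumed convention: pooled SD = sqrt of the average of the two sample variances.
    pooled_sd = math.sqrt((variance(treated) + variance(control)) / 2.0)
    return 100.0 * (mean(treated) - mean(control)) / pooled_sd


# Invented ages for four treated and four control respondents.
treated_age = [62.0, 65.0, 70.0, 59.0]
control_age = [60.0, 64.0, 68.0, 60.0]

bias = standardized_bias(treated_age, control_age)
print(round(bias, 1))
```

Under the rule of thumb quoted in the text, an absolute value of 20 or more for a covariate would be regarded as a large imbalance.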
The balancing tests indicate that comparability conditions are satisfied and suggest that the chosen matching specification effectively accounts for factors that may non-randomly influence selection into treatment.

Heterogeneity in treatment effects is receiving increasing attention in the program evaluation literature. Statistical neutrality of results, such as the one observed, would be expected if parents in different regions experience heterogeneous effects of equal magnitude in opposite directions after treatment. The resulting outcome would average zero and the econometrician would naively posit an insignificant effect of coresidence on parental depression. Being aware of the enormous heterogeneity present across Europe, the natural question given the lack of significance in our estimates is whether there exists a latent structural or environmental factor which may induce heterogeneous treatment effects across European regions. In such a case, any sizable effects would be eclipsed by the pooling of all ten European countries in our data. For instance, it is well known that in Mediterranean countries such as Italy or Spain adult children leave the household at a much older age than they would, say, in Sweden or the Netherlands. The existence of such an environmental difference is explored in more detail in the following subsection.

Table 3
Balancing statistics (pooled sample).

| | Pseudo R² (Raw) | Pseudo R² (Matched) | LR χ² (Raw) | LR χ² (Matched) | p > χ² (Raw) | p > χ² (Matched) | Median bias (Raw) | Median bias (Matched) | % Reduction in bias |
|---|---|---|---|---|---|---|---|---|---|
| Double up, short term | 0.067 | 0.010 | 89.57 | 4.09 | 0.000 | 0.990 | 23.2 | 2.9 | 87.5 |
| Double up, long term | 0.037 | 0.001 | 115.08 | 1.67 | 0.000 | 0.999 | 6.4 | 1.4 | 78.1 |
| Split up, short term | 0.045 | 0.003 | 112.12 | 3.86 | 0.000 | 0.993 | 7.6 | 2.6 | 65.8 |
| Split up, long term | 0.028 | 0.003 | 60.92 | 4.65 | 0.000 | 0.982 | 5.9 | 2.0 | 66.1 |

3.2. Protestant and Catholic Europe

To account for possible heterogeneity in treatment effects, the analysis is split into two geographical macroregions: Protestant Europe (Sweden, Denmark, Germany, the Netherlands, and Switzerland) and Catholic Europe (France, Belgium, Austria, Italy, and Spain). This arrangement follows Inglehart and Welzel's (2005) “cultural map of the world,” a chart that uses data from the World Values Survey to position countries according to their people's values as opposed to their geographical location. Western Europe is hence split into two environmentally and historically diverse groups, generalized under the labels “Protestant” and “Catholic” Europe. I first calculate and then compare the estimates for each of these regions in order to assess the extent to which regional heterogeneity plays a determinant role in the coresidence effect. Though it is clear that the analysis would be best at a country or even city level, small sample sizes make this option statistically unviable.

Table 4 shows the DID matching estimates after the sample has been split up into macroregions (42.6% of respondents live in
Protestant Europe, 57.4% in its Catholic counterpart). A remarkable effect emerges: in Catholic Europe, doubling up significantly reduces parental depression in the long term, making it approximately 19% lower than it would have been had the double up not taken place. On the other hand, no significant effect of doubling up on parental depression levels is observed in Protestant Europe.

Fig. 1. Propensity score histogram for the treatment and control groups (pooled sample).

Table 4
“Double up” and “split up” DID matching estimates by macroregion.

| | Protestant: Double up, S.T. | Protestant: Double up, L.T. | Protestant: Split up, S.T. | Protestant: Split up, L.T. | Catholic: Double up, S.T. | Catholic: Double up, L.T. | Catholic: Split up, S.T. | Catholic: Split up, L.T. |
|---|---|---|---|---|---|---|---|---|
| Δ Depression | 0.543 | 0.108 | 0.078 | 0.226 | 0.559 | 0.540** | 0.077 | 0.220 |
| p-value | 0.434 | 0.731 | 0.802 | 0.478 | 0.243 | 0.049 | 0.804 | 0.406 |
| Std. Error | 0.694 | 0.313 | 0.311 | 0.318 | 0.478 | 0.275 | 0.311 | 0.264 |
| Number treated | 24 | 82 | 114 | 100 | 43 | 133 | 168 | 182 |
| Number control | 1273 | 1456 | 187 | 133 | 1481 | 1492 | 577 | 432 |

Notes: Protestant Europe = Sweden, Denmark, the Netherlands, Germany, and Switzerland. Catholic Europe = Belgium, France, Austria, Italy, and Spain. S.T. = Short term (2 years); L.T. = Long term (4 years). *p < 0.10, **p < 0.05, ***p < 0.01.

Several reasons make this a noteworthy result. First, neither doubling up nor splitting up increases depression significantly, regardless of the macroregion and time period under study. Second, although more intergenerational coresidence in the long term has no effect on parental depression in Protestant Europe, it proves significantly beneficial for its Catholic counterpart. In line with Isengard and Szydlik (2012), this finding could be attributed to a more family-oriented environment in the latter, where the so-called welfare state may be lacking in comparison to the former and where intergenerational households are much more prevalent. A weaker welfare state combined with a stronger family tradition in Catholic Europe may result in a scenario where elderly parents rely mainly on the help, assistance, and companionship of their children. In contrast, while family unity and intergenerational support might be the rule in that region, Protestant values might instead encourage professional success, personal independence, and mobility. Last, the dissolution of intergenerational households throughout Europe might be regarded as “nature's course” or the “appropriate” life-cycle path, as it exerts no effect on parental depression in either macroregion.

The balancing tests that follow (Table 5 and Figs. 2 and 3) support the overall validity of our findings. However (and although the estimates are not statistically significant), caution is advised when interpreting the effect of a split up in Protestant Europe in the long term, given the relatively small (18.8%) post-matching bias reduction. Granted that this is due to the relatively smaller control group, comparability between the treatment and control groups may be challenged in such a case.
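The “% reduction in bias” figures reported in the balancing tables can be reproduced from the raw and matched median biases. The short sketch below (with a hypothetical function name) does so for the long-term split up in Protestant Europe, using the raw (9.6) and matched (7.8) median biases listed in Table 5.

```python
def bias_reduction(raw: float, matched: float) -> float:
    """Percentage reduction in median bias achieved by matching."""
    return 100.0 * (raw - matched) / raw


# Protestant Europe, split up, long term: median bias 9.6 (raw) and 7.8 (matched).
protestant_split_long = bias_reduction(9.6, 7.8)
print(round(protestant_split_long, 2))  # 18.75, reported as 18.8 in the table
```

The same calculation applied to the pooled short-term double up (23.2 raw, 2.9 matched) yields the 87.5% shown in Table 3, which illustrates how much weaker the balancing gain is in the Protestant long-term split-up case.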
4. Robustness checks 4.1. Sensitivity analysis In this section, the model is subjected to minor changes in order to assess the reliability of our results. In particular, further regional re-categorizations are proposed to test the sensitivity of our estimates to the geographical specification at hand. Eurostat data were used to rank the countries according to the average percentage of young adults (aged 24e35) living in cohabitation with their parents from 2005 to 2011. This information was used to split our sample into three groupings according to the average coresidence rate: low (less than 10%), medium (more than 10% but less than 20%), and high (more than 20%). A geographical pattern emerges: low coresidence in northern Europe (Denmark, Sweden, and the Netherlands), medium in central (France, Switzerland, Belgium, and Germany), and high in southern Europe (Austria, Spain, Italy). DID matching estimates for this re-categorization exercise confirm the robustness of our results (Table 6). Although doubling up seems to be more beneficial in countries with medium rates of coresidence (France, Switzerland, Belgium, Germany) than in their southern counterparts, when both groups are combined the effect of doubling up is magnified to a statistical significance level of 1.4%. This is due to the fact that sample sizes suffer greatly with the
Table 5 Balancing statistics by macroregion. Pseudo R2 Raw a) Protestant Europe Double up Short term 0.095 Long term 0.039 Split up Short term 0.107 Long term 0.029 b) Catholic Europe Double up Short term 0.071 Long term 0.037 Split up Short term 0.034 Long term 0.029
LR c2
p>c2
Median bias
% Reduction in bias
Matched
Raw
Matched
Raw
Matched
Raw
Matched
0.024 0.007
49.47 49.64
3.00 3.02
0.000 0.000
0.998 0.998
20.1 12.8
7.8 2.7
61.2 78.9
0.031 0.025
90.00 18.64
16.66 12.28
0.000 0.135
0.215 0.505
16.5 9.6
7.6 7.8
53.9 18.8
0.013 0.002
56.80 68.25
3.00 1.56
0.000 0.000
0.998 0.999
22.9 7.7
6.1 1.9
73.4 75.3
0.004 0.007
55.85 44.75
3.81 6.40
0.000 0.000
0.993 0.930
8.0 5.5
2.7 3.1
66.3 43.6
Notes: Protestant Europe ¼ Sweden, Denmark, the Netherlands, Germany, and Switzerland. Catholic Europe ¼ Belgium, France, Austria, Italy, and Spain.
18
L. Aranda / Social Science & Medicine 134 (2015) 12e22
Fig. 2. Propensity score histogram for the treatment and control groups (Protestant Europe).
choice to split the sample into more than two macroregions. Our findings seem robust in pointing to a strong environmental difference across Europe. It is worth noting that the effect of a double up in northern Europe runs in the opposite direction to that in southern Europe in both the short and the long term.

A second test for the sensitivity of the results relies on the inclusion of country dummies as controls in all models. Despite admittedly producing slight additional losses in the efficiency of the matching procedure, the estimates for Catholic Europe preserve their statistical significance when country dummies are included as matching covariates.

Last, the robustness of our matching mechanism is evaluated against other estimation methods. In particular, the model is re-estimated using fixed effects (FE), DID on the unmatched sample, and regression-adjusted radius matching (I follow Lechner et al. (2011) in defining the radius size as the largest distance calculated from pair-matching). The resulting estimates (Table 7) confirm the robustness of our identification strategy, as conventional levels of statistical significance are reached irrespective of the method used. Moreover, the negative effect is consistent across methods, although its magnitude varies depending on the estimating procedure. Given its ability to eliminate fixed differences unrelated to treatment while accounting for time-varying covariates, the FE estimate is closer in magnitude to our DID matching (DIDM) estimates than is, for instance, the radius matching estimate (which only accounts for observed variables).

4.2. Verification of underlying assumptions

Three assumptions are required for our econometric methods to hold: the overlap condition, the unconfoundedness assumption (both indispensable to the matching estimator), and the parallel trends assumption (crucial to the DID estimator). This section aims at assessing the plausibility of these assumptions under our current framework and data limitations.

First, the robustness of the results with respect to deviations from the common support condition is verified. The common support condition was imposed in all the aforementioned results: individuals who fall outside the region of propensity score overlap were excluded from the analysis. However, if the number of individuals dropped from the sample is large, the estimates obtained for the remaining units may be neither representative nor consistent (Bryson et al., 2002; Caliendo and Kopeinig, 2008). Estimates do not change significantly when the common support condition is not imposed, even though a larger sample is retained (Table 8).

Second, the unconfoundedness assumption critical to our
Fig. 3. Propensity score histogram for the treatment and control groups (Catholic Europe).
identification strategy is discussed and examined. Although this assumption is not directly testable, a number of indirect ways of assessing its validity exist and are often used in practice (e.g., Lee, 2008; Battistin et al., 2009). For the most part, they consist in estimating a causal effect that is known to be zero. An estimated effect different from zero is an indication that the treated and control units are different in terms of this particular covariate conditional on the others, rendering the unconfoundedness assumption less plausible. Moreover, the power of this proxy test is higher when the variables used are closely related to the outcome of interest and, in consequence, to the unobservable factors likely affecting it. Table 9 presents the estimates of the effect of a double up in Catholic Europe in the long term on several outcomes likely to be determined prior to the treatment itself. As shown, all cases considered are consistent with our identification restriction of unconfoundedness.

Finally, two remarks are in order when it comes to the parallel trends assumption. The first is the finding that the rate of older adults with depressive symptoms tends to increase with advancing age (U.S. Department of Health and Human Services, 1999). This is attributed not only to negative late-life events (such as loss of friends, widowhood, and declining health) but also to a natural process of physiological change that comes with age. If such an increasing trend in depressive symptoms with age is accurate and holds true in developed countries, and given that our dataset is representative of the population, there would be no reason to expect the treated and control units in our sample to follow different trends in the absence of treatment.

Second, a placebo test is run by exploiting the panel nature of the data. The test consists in artificially moving the treatment back one period (before it actually happened) to assess whether a difference in trends between the treated and control individuals had been historically persistent and thus unrelated to the treatment. The placebo test delivers no significant difference in pre-treatment depression trends between the treated and control individuals (p = 0.50). Although a longer chronological analysis of trends is not possible with the current data, the results of the placebo test based on the three waves of available data give no reason to reject the parallel trends assumption.

4.3. The role of economic factors

Do latent economic factors play a relevant role in the improvement of mental health after a double up? Six steps were taken to investigate possible economic confounding factors in our results: a) a bivariate probit model was devised to address the claim that economic conditions (proxied by financial distress) and coresidence decisions are dependent on each other and should thus be modeled
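Table 6 "Double up" and "split up" DID matching estimates by region (northern, central, and southern Europe).

| Treatment | Term | Statistic | N | C | S | N+C | S+C |
|---|---|---|---|---|---|---|---|
| Double up | S.T. | Δ Depression | 0.221 | 0.153 | 0.776 | 0.171 | 0.313 |
| Double up | S.T. | p-value | 0.810 | 0.733 | 0.269 | 0.707 | 0.435 |
| Double up | S.T. | Std. Error | 0.920 | 0.449 | 0.701 | 0.454 | 0.401 |
| Double up | S.T. | Treated | 13 | 32 | 22 | 46 | 55 |
| Double up | S.T. | Control | 397 | 1369 | 587 | 2184 | 1845 |
| Double up | L.T. | Δ Depression | 0.428 | 0.700** | 0.302 | 0.257 | 0.574** |
| Double up | L.T. | p-value | 0.342 | 0.036 | 0.239 | 0.364 | 0.014 |
| Double up | L.T. | Std. Error | 0.451 | 0.334 | 0.256 | 0.283 | 0.235 |
| Double up | L.T. | Treated | 55 | 82 | 77 | 137 | 159 |
| Double up | L.T. | Control | 1002 | 1400 | 543 | 2380 | 1916 |
| Split up | S.T. | Δ Depression | 0.352 | 0.072 | 0.008 | 0.156 | 0.016 |
| Split up | S.T. | p-value | 0.345 | 0.817 | 0.985 | 0.519 | 0.950 |
| Split up | S.T. | Std. Error | 0.373 | 0.313 | 0.426 | 0.242 | 0.249 |
| Split up | S.T. | Treated | 69 | 113 | 94 | 186 | 210 |
| Split up | S.T. | Control | 111 | 291 | 359 | 404 | 663 |
| Split up | L.T. | Δ Depression | 0.185 | 0.472 | 0.205 | 0.384 | 0.215 |
| Split up | L.T. | p-value | 0.656 | 0.114 | 0.652 | 0.169 | 0.411 |
| Split up | L.T. | Std. Error | 0.414 | 0.298 | 0.454 | 0.279 | 0.261 |
| Split up | L.T. | Treated | 65 | 138 | 74 | 201 | 214 |
| Split up | L.T. | Control | 65 | 199 | 289 | 276 | 496 |

Notes: North (N) = Denmark, Sweden, and the Netherlands. Center (C) = France, Switzerland, Belgium, and Germany. South (S) = Austria, Spain, and Italy. S.T. = Short term (2 years); L.T. = Long term. *p < 0.10, **p < 0.05, ***p < 0.01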
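To make the logic of a DID matching estimate concrete, the following is a minimal sketch under stated assumptions. It pairs each treated individual with the control whose propensity score is closest (one-to-one nearest-neighbour matching with replacement, chosen here only for simplicity and not necessarily the algorithm behind Table 6) and averages the difference between the treated and matched-control changes in the depression score. The function name, the tuple layout, and the toy numbers are hypothetical.

```python
def did_matching_att(treated, controls):
    """Average treatment effect on the treated via DID matching.

    Each unit is a tuple (propensity_score, y_before, y_after), where y is
    the depression score. Matching rule (an illustrative assumption):
    nearest neighbour on the propensity score, with replacement.
    """
    diffs = []
    for p_t, y0_t, y1_t in treated:
        # Closest control on the propensity score
        _, y0_c, y1_c = min(controls, key=lambda c: abs(c[0] - p_t))
        # Change for the treated unit minus change for its matched control
        diffs.append((y1_t - y0_t) - (y1_c - y0_c))
    return sum(diffs) / len(diffs)


# Toy data, invented for illustration only
treated = [(0.30, 3.0, 2.0), (0.60, 4.0, 3.5)]
controls = [(0.28, 3.0, 3.0), (0.65, 4.0, 4.5), (0.90, 1.0, 1.0)]
print(did_matching_att(treated, controls))  # -1.0
```

Because the outcome is a change in a depression score, a negative value in this sketch corresponds to fewer depressive symptoms among the treated relative to their matched controls.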
simultaneously; b) an analysis of the provision of public benefits in the ten countries in our sample was performed and the differences appraised; c) changes in reported individual public benefits between waves were investigated and included as a covariate; d) information on home ownership and household net worth was examined under the logic that both may decrease substantially in old age, given the eventual need for additional liquidity to cover increasing health expenditures; and e) a ratio of household food expenditure to total household income was constructed and its inter-wave change assessed. Nevertheless, all attempts at finding an economic rationale behind the observed positive effects of a double up on the mental wellbeing of older individuals in Catholic Europe proved systematically insufficient (detailed results available from the author upon request).
5. Conclusion

In the last decade or so, intergenerational households in the developed world have been on the rise, reversing a post-WWII trend of independent living. In particular, an increasing number of parents and their adult children are doubling up into the same household. This paper examines whether depression levels of older Europeans are affected by changes in coresidence with their adult children. Although treatment effects prove heterogeneous across European regions, a double up seems to be accompanied by a greater
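Table 7 Estimation of long-term effects of a double up using Fixed Effects, unmatched DID, and Radius matching.

| Method | Statistic | Pooled | Protestant | Catholic |
|---|---|---|---|---|
| F.E. | Coeff. | 0.339 | 0.004 | 0.534 |
| F.E. | p-value | 0.053* | 0.989 | 0.019** |
| F.E. | Std. Error | 0.175 | 0.269 | 0.228 |
| F.E. | Treated | 301 | 113 | 188 |
| DID | Coeff. | 0.197 | 0.141 | 0.439 |
| DID | p-value | 0.278 | 0.616 | 0.066* |
| DID | Std. Error | 0.181 | 0.281 | 0.239 |
| DID | Treated | 213 | 81 | 132 |
| DID | Control | 3199 | 1469 | 1517 |
| Radius | Coeff. | 0.044 | 0.077 | 0.288 |
| Radius | p-value | 0.636 | 0.578 | 0.014** |
| Radius | Std. Error | 0.093 | 0.139 | 0.117 |
| Radius | Treated | 481 | 186 | 295 |
| Radius | Control | 6595 | 3199 | 3364 |

Notes: Protestant = Sweden, Denmark, the Netherlands, Germany, and Switzerland. Catholic = France, Belgium, Austria, Italy, and Spain. F.E. = Fixed Effects; DID = DID on unmatched sample; Radius = Radius Matching. *p < 0.10, **p < 0.05, ***p < 0.01

Table 8 "Double up" and "split up" DID matching estimates without imposing the common support condition.

| Treatment | Term | Statistic | Pooled | Protestant | Catholic |
|---|---|---|---|---|---|
| Double up | S.T. | Δ Depression | 0.235 | 0.191 | 0.549 |
| Double up | S.T. | p-value | 0.469 | 0.676 | 0.165 |
| Double up | S.T. | Std. Error | 0.324 | 0.456 | 0.395 |
| Double up | S.T. | Treated | 115 | 52 | 73 |
| Double up | S.T. | Control | 4089 | 2505 | 2098 |
| Double up | L.T. | Δ Depression | 0.309 | 0.095 | 0.495** |
| Double up | L.T. | p-value | 0.101 | 0.751 | 0.035 |
| Double up | L.T. | Std. Error | 0.188 | 0.297 | 0.235 |
| Double up | L.T. | Treated | 301 | 134 | 188 |
| Double up | L.T. | Control | 4178 | 2591 | 2148 |
| Split up | S.T. | Δ Depression | 0.034 | 0.077 | 0.050 |
| Split up | S.T. | p-value | 0.859 | 0.758 | 0.831 |
| Split up | S.T. | Std. Error | 0.193 | 0.252 | 0.236 |
| Split up | S.T. | Treated | 413 | 208 | 246 |
| Split up | S.T. | Control | 1115 | 377 | 840 |
| Split up | L.T. | Δ Depression | 0.144 | 0.317 | 0.085 |
| Split up | L.T. | p-value | 0.451 | 0.256 | 0.731 |
| Split up | L.T. | Std. Error | 0.191 | 0.279 | 0.246 |
| Split up | L.T. | Treated | 385 | 183 | 252 |
| Split up | L.T. | Control | 869 | 275 | 665 |

Notes: Protestant = Sweden, Denmark, the Netherlands, Germany, and Switzerland. Catholic = France, Belgium, Austria, Italy, and Spain. S.T. = Short term (2 years); L.T. = Long term (4 years). *p < 0.10, **p < 0.05, ***p < 0.01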
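Section 4.2 compares estimates obtained with and without the common support restriction. As an illustration of what imposing that restriction can look like, the sketch below applies the min-max rule: units are kept only if their propensity score lies inside the range covered by both groups. The specific rule used for the paper's estimates is not stated in this section, so the min-max choice, the function names, and the toy scores are all assumptions.

```python
def common_support_bounds(treated_ps, control_ps):
    """Region of propensity score overlap under the min-max rule."""
    lower = max(min(treated_ps), min(control_ps))
    upper = min(max(treated_ps), max(control_ps))
    return lower, upper


def impose_common_support(treated_ps, control_ps):
    """Drop units whose propensity score falls outside the overlap region."""
    lower, upper = common_support_bounds(treated_ps, control_ps)
    kept_treated = [p for p in treated_ps if lower <= p <= upper]
    kept_controls = [p for p in control_ps if lower <= p <= upper]
    return kept_treated, kept_controls


# Toy propensity scores, invented for illustration only
treated_ps = [0.2, 0.5, 0.9]
control_ps = [0.1, 0.4, 0.7]
print(common_support_bounds(treated_ps, control_ps))   # (0.2, 0.7)
print(impose_common_support(treated_ps, control_ps))   # ([0.2, 0.5], [0.4, 0.7])
```

Skipping this trimming step retains every unit, which is why the treated and control counts in Table 8 are larger than those obtained under common support.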
Table 9 Overidentification tests.

| Variable | Estimate | Std. Error | z | p-value |
|---|---|---|---|---|
| Primary school | 0.007 | 0.056 | 0.120 | 0.905 |
| High school diploma | 0.005 | 0.062 | 0.080 | 0.939 |
| College degree | 0.002 | 0.048 | 0.040 | 0.968 |
| Graduate degree | 0.001 | 0.031 | 0.020 | 0.983 |
| Number of chronic illnesses | 0.015 | 0.063 | 0.240 | 0.812 |
| Number of ADLs | 0.049 | 0.094 | 0.520 | 0.601 |
| Married | 0.008 | 0.066 | 0.120 | 0.906 |
| Single | 0.008 | 0.065 | 0.120 | 0.905 |
| Unemployed | 0.011 | 0.011 | 0.930 | 0.355 |
| Financially distressed | 0.028 | 0.066 | 0.043 | 0.669 |
| Number of children | 0.072 | 0.156 | 0.460 | 0.645 |

peace of mind of older generations living in traditionally Catholic European countries. Among the main exogenous causes behind this social phenomenon arguably stands the harsh economic situation brought about in the last decade by the Great Recession. By producing shocks to coresidence trends, negative economic situations may result in increasingly supportive household conditions for the older generation. The logic goes as follows: recessions and economic difficulties increase the chances that the elderly share a household with their children, who in turn provide support, social interaction, and companionship to their parents. By decreasing loneliness and solo-living, this fulfillment of family roles results in an increase in emotional wellbeing of the older generation. Although no such link is observed in traditionally Protestant countries, the results for Catholic Europe support the consistently found positive association between loneliness and depression in old age. This finding is all the more remarkable considering that recessions have long been associated with increased suicide rates and diminished mental health (Ruhm, 2003; Charles and DeCicca, 2008).

Several data limitations leave room for improvement. In particular, a) although three waves are the minimum required to test it, more time points would allow for a more in-depth assessment of the parallel trends assumption; b) it is not possible to know with certainty whether the child moved back into the parents' home or vice versa; c) the number of observed children characteristics is limited, which may inhibit matching accuracy; and d) the motives behind the coresidence decision remain largely unknown. Data improvements in all these areas would permit a better identification of important dynamics in intergenerational living arrangements.

Although small sample sizes result in reduced estimation efficiency and in some cases may hinder generalizability of results, our findings point to the need to account for latent environmental factors when conducting research on mental health. In an aging European society where the mental health of older individuals is of growing concern, these findings are not to be taken lightly, especially in the current climate of slow economic recovery where long-term shocks to living arrangements are expected to continue. Most long-term care (LTC) policy developments over the past decade have pursued shifts (i) away from institutional care and towards home-based solutions; (ii) away from public provisions and towards alternative private or mixed services supported by cash transfers; and (iii) towards complementing rather than replacing informal care. By highlighting the gains in psychological wellbeing for the elderly, our findings support the development and implementation of home-based solutions to LTC, where the emotional support provided by children in certain cultural backgrounds is complementary to other forms of professional care. In line with Giordano and Lindström (2011), welfare solutions must be carefully thought out to avoid creating perverse incentives that lead to reductions in family capital investments and a worsening in mental health.
References

Aging and Depression, 2014. American Psychiatric Association. Retrieved September 20, 2014, from http://www.apa.org/helpcenter/aging-depression.aspx.
Battistin, E., Brugiavini, A., Rettore, E., Weber, G., 2009. The retirement consumption puzzle: evidence from a regression discontinuity approach. Am. Econ. Rev. 99 (5), 2209-2226.
Black, D., Smith, J., 2004. How robust is the evidence on the effects of the college quality? evidence from matching. J. Econ. 121 (1), 99-124.
Bryson, A., Dorsett, R., Purdon, S., 2002. The Use of Propensity Score Matching in the Evaluation of Active Labour Market Policies. LSE Research Online Documents on Economics 4993, LSE Library.
Caliendo, M., Kopeinig, S., 2008. Some practical guidance for the implementation of propensity score matching. J. Econ. Surv. 22 (1), 31-72.
Charles, K.K., DeCicca, P., 2008. Local labor market fluctuations and health: is there a connection and for whom? J. Health Econ. 27 (6), 1532-1550.
Chen, F., Short, S.E., 2008. Household context and subjective well-being among the oldest old in China. J. Fam. Issues 29, 1379-1403.
Choroszewicz, M., Wolff, P., 2010. 51 million Young Adults Lived with Their Parent(s) in 2008. Eurostat. Statistics in Focus 50/2010.
Chyi, H., Mao, S., 2012. The determinants of happiness of China's elderly population. J. Happiness Stud. 13 (1), 167-185.
Corselli-Nordblad, L., 2010. Young Adults in the EU27 in 2008. Eurostat. Newsrelease 149/2010.
DiNardo, J., Tobias, J., 2001. Nonparametric density and regression estimation. J. Econ. Perspect. 15 (4), 11-28.
Giordano, G.N., Lindström, M., 2011. Social capital and change in psychological health over time. Soc. Sci. Med. 72 (8), 1219-1227.
Girma, S., Goerg, H., 2007. Evaluating the foreign ownership wage premium using a difference-in-differences matching approach. J. Int. Econ. 72, 97-112.
Green, B., Copeland, J., Dewey, M., Shamra, V., Saunders, P., Davidson, I., Sullivan, C., McWilliam, C., 1992. Risk factors for depression in elderly people: a prospective study. Acta Psychiatr. Scand. 86 (3), 213-217.
Heckman, J., Ichimura, H., Smith, J., Todd, P., 1998a. Characterising selection bias using experimental data. Econometrica 66, 5.
Heckman, J., Ichimura, H., Todd, P., 1997. Matching as an econometric evaluation estimator: evidence from evaluating a job training programme. Rev. Econ. Stud. 64, 605-654.
Heckman, J., Ichimura, H., Todd, P., 1998b. Matching as an econometric evaluation estimator. Rev. Econ. Stud. 65, 261-294.
Hughes, M., Waite, L., 2002. Health in household context: living arrangements and health in late middle age, 43 (1), 1-21.
Inglehart, R., Welzel, C., 2005. Modernization, Cultural Change and Democracy. Cambridge University Press, New York.
Isengard, B., Szydlik, M., 2012. Living apart (or) together? coresidence of elderly parents and their adult children in Europe. Res. Aging 34 (4), 449-474.
Kaplan, G., 2012. Moving back home: insurance against labor market risk. J. Polit. Econ. 120 (3), 446-512.
Kochhar, R., Cohn, D., 2011. Fighting Poverty in a Tough Economy, Americans Move in with Relatives. Pew Social & Demographic Trends, Pew Research Center.
Lechner, M., 2002. Some practical issues in the evaluation of heterogeneous labour market programmes by matching methods. J. R. Stat. Soc. A165, 59-82.
Lechner, M., Miquel, R., Wunsch, C., 2011. Long-run effects of public sector sponsored training in West Germany. J. Eur. Econ. Assoc. 9 (4), 742-784.
Lee, D., 2008. Randomized experiments from non-random selection in the U.S. House Elections. J. Econ. 142 (2), 675-697.
Leuven, E., Sianesi, B., 2003. Psmatch2: Stata Module to Perform Full Mahalanobis and Propensity Score Matching, Common Support Graphing, and Covariate Imbalance Testing. This version 4.0.6 17may2012.
Li, L., Liang, J., Toler, A., Gu, S., 2005. Widowhood and depressive symptoms among older Chinese: do gender and source of support make a difference? Soc. Sci. Med. 60 (3), 637-647.
Lowenstein, A., Katz, R., 2005. Living arrangements, family solidarity and life satisfaction of two generations of immigrants in Israel. Ageing & Soc. 25, 749-767.
Luo, Y., Hawkley, L.C., Waite, L.J., Cacioppo, J.T., 2012. Loneliness, health, and mortality in old age: a national longitudinal study. Soc. Sci. Med. 74 (6), 907-914.
Mazzella, F., Cacciatore, F., Galizia, G., Della-Morte, D., Rossetti, M., Abbruzzese, R., Langellotto, A., Avolio, D., Gargiulo, G., Ferrara, N., Rengo, F., Abete, P., 2010. Social support and long-term mortality in the elderly: role of comorbidity. Arch. Gerontol. Geriatr. 51 (3), 323-328.
Mykyta, L., 2012. Economic Downturns and the Failure to Launch: The Living Arrangements of Young Adults in the U.S. 1995-2011. U.S. Census Bureau SEHSD Working Paper (2012-24).
Okabayashi, H., Liang, J., Krause, N., Akiyama, H., Sugisawa, H., 2004. Mental health among older adults in Japan: do sources of social support and negative interaction make a difference? Soc. Sci. Med. 59 (11), 2259-2270.
Parker, K., 2012. The Boomerang Generation: Feeling Ok about Living with Mom and Dad. Pew Research & Demographic Trends, Pew Research Center.
Prince, M., Reischies, F., Beekman, A., Fuhrer, R., Jonker, C., Kivela, S., 1999. Development of the EURO-D scale - a European Union initiative to compare symptoms of depression in 14 European centres. Br. J. Psychiatry 174, 330-338.
Riumallo-Herl, C.J., Kawachi, I., Avendano, M., 2014. Social capital, mental health and biomarkers in Chile: assessing the effects of social capital in a middle-income country. Soc. Sci. Med. 105 (0), 47-58.
Rosenbaum, P., Rubin, D., 1983. The central role of the propensity score in observational studies for causal effects. Biometrika 70, 41-55.
Ruhm, C., 2003. Good times make you sick. J. Health Econ. 22, 637-658.
Silverstein, M., Bengtson, V.L., 1994. Does intergenerational social support influence the psychological well-being of older parents? the contingencies of declining health and widowhood. Soc. Sci. Med. 38 (7), 943-957.
Singh, A., Misra, N., 2009. Loneliness, depression and sociability in old age. Ind. Psychiatry J. 18 (1), 51-55.
Smith, J., Todd, P., 2005. Does matching overcome LaLonde's critique of nonexperimental estimators. J. Econ. 125, 305-353.
Taylor, P., Parker, K., Kochhar, R., Fry, R., Funk, C., 2012. Young, Underemployed and Optimistic: Coming of Age, Slowly, in a Tough Economy. Pew Research & Demographic Trends, Pew Research Center.
U.S. Department of Health and Human Services, 1999. Older adults and mental health. In: Mental Health: A Report of the Surgeon General. U.S. National Library of Medicine.
Villa, J., 2011. Diff: Stata Module to Perform Differences-in-differences Estimation, Statistical Software Components. Boston College Department of Economics.
Wiemers, E., 2014. The effect of unemployment on household composition and doubling up. Demography 51 (6), 2155-2178.
Zunzunegui, M.V., Beland, F., Otero, A., 2001. Support from children, living arrangements, self-rated health and depressive symptoms of older people in Spain. Int. J. Epidemiol. 30 (5), 1090-1099.