On intergenerational risk sharing within social security schemes

European Journal of Political Economy Vol. 20 (2004) 181 – 206 www.elsevier.com/locate/econbase

Andreas Wagener *

Department of Economics, University of Vienna, Hohenstaufengasse 9, 1010 Vienna, Austria

Received 8 November 2001; received in revised form 14 September 2002; accepted 19 February 2003

Abstract

Pay-as-you-go (PAYG) schemes entail beneficial risk sharing and diversification features in multi-pillar pension systems. Depending on the pension formula these features vary, however, significantly for different types of PAYG schemes. We derive individually most-preferred PAYG rules, represented by a risk-sharing parameter, for young and old members of a society. Preferences depend on the correlation between the risks of the PAYG scheme and the return risk of a funded scheme and on expectations about the durability of the pension rule. We find that the generations’ interests with respect to the optimal PAYG rule typically do not fully clash, in particular if future economic conditions are expected to be similar to today’s. We discuss the implications of these findings for the political economy of pension systems, offering an explanation why one typically observes “mixed” PAYG rules in reality. © 2003 Elsevier B.V. All rights reserved.

JEL classification: H55; D78

Keywords: Social security; Intergenerational risk sharing; Pay-as-you-go pensions; Majority voting

* Tel./fax: +43-1-4277-37423/9374. E-mail address: [email protected] (A. Wagener). 0176-2680/$ - see front matter © 2003 Elsevier B.V. All rights reserved. doi:10.1016/j.ejpoleco.2003.02.002

1. Introduction

It is now widely acknowledged that pay-as-you-go (PAYG) social security schemes may exhibit beneficial effects in reallocating risks inter- and intragenerationally. Such effects include risk sharing (see Merton, 1983; Enders and Lapan, 1993; Richter, 1993) and diversification features (cf. Hauenschild, 1999; Dutta et al., 2000). Both types of effects may provide a rationale for including PAYG schemes in the pension mix although such schemes yield lower expected rates of return than funded forms of old-age provision (Gale, 1991).

It is also well known that various forms of PAYG schemes differ considerably in their risk allocation features; the “pension formula” is of crucial importance here (Bohn, 1999; Lindbeck, 2002; Thøgersen, 1998). This is most easily seen from the defining property of a PAYG scheme that, in every period, current pension payments are financed out of current contributions. Assuming that the economy and its pension scheme are subject to stochastic shocks, there are, in principle, two ways to keep the budget of the PAYG scheme balanced: by adjusting benefits or by adjusting contributions.

If it is the policy of the PAYG scheme to keep the contribution rate constant (fixed-contribution [FC] scheme), then pensions will depend on the future development of the economy. Future economic risks thus have to be borne, to a considerable extent, by pensioners. Investing in the “PAYG asset” is a risky activity as the relationship between contributions (during the working period) and pension benefits (during retirement) is stochastic. An FC scheme may contribute to diversifying the risks of old-age consumption as a whole (when other stochastic sources of old-age income are available).

At the other extreme, a fixed-replacement [FR] scheme follows the policy of keeping benefits constant, measured by the ratio between pensions and pre-retirement income (both calculated in real terms). Contributions to an FR scheme create a risk-free entitlement to a pension of a certain size, implying a deterministic nexus between contributions and pensions. The non-stochastic FR pension might provide insurance against other old-age consumption risks but means, on the other hand, that the risks of adverse changes in the economic environment stay with the contributors (i.e., the younger generation).

Pure FC and FR schemes have been discussed by Thøgersen (1998) and Wagener (2002) and, in a more policy-oriented way, by Lindbeck (2002). In this paper, we extend the analysis to allow for convex combinations of FR and FC schemes. We will represent such mixtures by a policy parameter $a \in [0,1]$, which measures the degree of intergenerational risk sharing inherent in the PAYG pension policy. As one could expect from previous analyses of “pure” schemes ($a \in \{0,1\}$), “intermediate” policies $0 < a < 1$ assemble a rather complex mix of risk sharing and diversification features. Our motivation to investigate a continuum of mixed PAYG pension policies is threefold:

• First, mixed schemes are the empirically dominant form of pension schemes.
Pure FC or FR PAYG schemes do not exist in reality, where pension formulae and policies typically combine elements of both schemes, often in a way that is difficult to disentangle. A piece of evidence is given in Fig. 1, which depicts contribution and replacement rates¹ for the German statutory PAYG scheme, the Gesetzliche Rentenversicherung (GRV), over the last decades. With an FR scheme, the upper curves should be constant in Fig. 1; whereas, for an FC scheme, the lower curve should be flat. In Germany, obviously both pensioners and contributors face variations in the pension parameters that affect them.²

¹ Strictly speaking, the GRV statistics do not include an indicator called replacement ratio. The three upper curves in Fig. 1 depict annual data for the pension level of a retiree who previously had average earnings, calculated as a percentage of average earnings of those currently working. Cum grano salis, this may be interpreted as the replacement rate for an average earner at the entry into retirement. Since GRV pensions are adjusted annually, the replacement ratio of retirees (i.e., their pension relative to their own previous earnings) varies annually too. The stylized OLG framework of this paper abstracts from this complexity.

² A superficial look at Fig. 1 might suggest that the contribution rate to the German PAYG scheme (bottom curve) was rather constant over time. However, as already pointed out by Werding (1998) in a different context, the graph of Fig. 1 is highly misleading here. In fact, the bottom line exhibits much greater volatility than the three upper plots; the coefficient of variation for the contribution rate is 0.12, while it ranges between 0.064 and 0.07 for the upper lines.

Fig. 1. Contribution and replacement rates of the GRV. Source: VDR (2002).

• Furthermore, many aspects of the recent policy debate on averting the old-age crisis in PAYG pension schemes can also be thought of in terms of FC/FR mixtures. For example, one of the guidelines for the recent German pension reform is to keep contribution rates stable. This signals to the general public that pensions might become more correlated with the future ups and downs of the economy and thus become more risky.


At least partly, the distinction whether a PAYG scheme is more of the FC or the FR type is determined by the provisions for adjusting pensions to changes in the environment. In a number of countries, PAYG pensions adjust according to the price index, while in others they are linked to changes in current (net) wages. Quite a few countries adopt mixtures of these procedures. From an ex ante perspective and in real terms, price-index-related adjustment rules come close to an FR scheme: with the entry into retirement the pension is fixed in real terms, and retirees will not share fluctuations in the business cycle. This is different when pensions are indexed to current wages (as de facto happens in an FC scheme) and, thus, are stochastic in real terms. The widely applied blending of price and wage indexation can (in our stylized framework) be depicted by intermediate values of a.

• A second motivation for considering mixtures of FC and FR schemes comes from the interpretation of a PAYG scheme as an asset: At the price of contributions during working life, returns in the form of pensions can be earned. Pure FR and FC schemes then constitute two specific payment streams with different risk/return patterns. As an implication of the basic diversification theorems from finance theory, individuals will often prefer a mixed portfolio of (PAYG) assets to a strategy of “keeping all their eggs in one basket”, i.e., they prefer intermediate values of a to the polar cases. One might call this intra-PAYG diversification, as contrasted with the usual diversification of multi-pillared pension schemes into funded and non-funded pillars. One of our aims is to elucidate the circumstances under which such intra-PAYG diversification is warranted.

• Third, one might wish to explain the fluctuations observed in Fig. 1 and similar graphs for other countries. Why is none of the curves in such graphs flat?
From a normative, Paretian perspective, pure FR and FC PAYG schemes are ex ante noncomparable; neither is Pareto superior to the other (Wagener, 2002), and neither generally supports a Pareto optimal allocation.³ By contrast, a positive, political economy perspective would view graphs such as Fig. 1 and its kindred as emerging from political decision processes that are not generally guided by Paretian welfare considerations, but which use preferences of the members of society as one (major) ingredient of a public choice mechanism. (For example, majority voting implements the policy option which gets the largest number of votes, where a vote is a specific transformation of an individual’s preferences.) Knowledge of the preferences of individuals of different types for intergenerational risk sharing is necessary, or at least helpful, for both designing and explaining PAYG politics. As it turns out, in a great variety of settings in our model individuals prefer an intermediate scheme to any of the pure (FC or FR) schemes, which might provide an explanation why, in the real world, one indeed observes intermediate schemes more often than pure ones.

³ Pareto optimality is not a straightforward concept in stochastic OLG economies. The main candidates are interim and ex-ante optimality, the latter being the stronger concept. For interim optimality (proposed by Peled, 1982), optimal intergenerational risk sharing generally requires the possibility of state-contingent forward, old-to-young transfers and, thus, neither interim nor ex-ante optimality can be achieved with old-age pensions only (Rangel and Zeckhauser, 2001).

The primary interest of this paper is, thus, in the preferences of individuals over the continuum of PAYG pension mixes (formally: the shape of expected indirect utility as a function of a). We demonstrate how preferences vary with the position in the life cycle and with the economic situation at the point in time when preferences are solicited. A main finding is that, for a rather wide range of economic environments, the most preferred value of a for both older and younger individuals lies in the interior, and not at the extremes, of the unit interval: Everybody prefers some mixed PAYG scheme to a pure FC or FR scheme.

It is particularly interesting to observe how individual preferences towards risk sharing within the PAYG scheme vary with the stochastic properties of the returns on savings. The correlation between PAYG payments and the returns on private saving can also be interpreted as that between a PAYG scheme and a second, funded pillar of a diversified pension system, given that funded schemes are a one-for-one substitute for saving. Roughly, if the returns to the two pillars are positively [negatively] correlated, individuals wish to have a PAYG mix with an intermediate value of a whenever their expectations on the future wage rate are optimistic [pessimistic]. Preferences for intergenerational risk sharing of the younger generation are also affected by their expectations on the “durability” of the scheme, i.e., on whether today’s pension formula is expected to be also applied when today’s young are old.

Given that individuals would quite often opt for intermediate PAYG schemes, volatile curves as in Fig. 1 can indeed be explained as the outcome of politically aggregating these preferences. Moreover, our results indicate that the class of political mechanisms that potentially give rise to intermediate PAYG schemes with $a \notin \{0,1\}$ is very rich, including majority voting as well as representative democracies. In this respect, our findings likely explain, or are at least consistent with, the empirical observation that real-world pension schemes are mostly of a mixed, and not of a pure, type.
This paper is organized as follows: Section 2 presents the framework of our analysis. Sections 3 and 4 derive the individual preferences over different PAYG mixes. Section 5 discusses the implications of our findings for the political economy of intergenerational risk sharing in PAYG schemes. Section 6 concludes.

2. The model

We employ a model of a simple 2-OLG economy under uncertainty. We assume that all prices (respectively, their probability distributions) are exogenous, e.g., because the economy is a small, open one. Generation t (t = 0, 1, ...) consists of $N \in \mathbb{N}$ identical individuals each of whom lives for the two periods t (working age) and t + 1 (old age). Population size N is, thus, held fixed over time. None of the results to come would be affected by incorporating variable deterministic population change rates, while stochastic demographics might indeed affect the analysis.⁴

⁴ Demographic risks can be due to uncertain fertility, uncertain life expectancies, or uncertain migration. The impact of stochastic population dynamics on optimal pension politics has recently been studied by Demange and Laroque (1999, 2000). The second of these papers is particularly relevant for our analysis since it shows that, contrary to conventional wisdom, optimal intergenerational sharing of demographic risks cannot be achieved by a pure FR strategy but rather requires a hybrid pension policy (closer to FC than to FR schemes).


During working age, each individual inelastically supplies one unit of labor, while during old age all individuals are retired and do not work. Members of generation t derive utility from consumption $\tilde{c}^1_t$ during youth and consumption $\tilde{c}^2_t$ in old age, which are both random from the viewpoint of generation t’s date of birth. We indicate random variables by a tilde; the respective variable without a tilde denotes a realization. By $s_t$, $B_t$, and $P_t$ we denote, respectively, generation t’s per-capita savings, their social security contribution and their pensions (payable at date t + 1). Variables $w_t$ and $R_t$ denote, respectively, the wage rate and the interest factor (= interest rate plus one) prevailing in t. The budget constraints for the two periods in the life of a member of generation t are:

$$\tilde{c}^1_t = \tilde{w}_t - \tilde{B}_t - \tilde{s}_t \quad \text{and} \quad \tilde{c}^2_t = \tilde{R}_{t+1} \cdot \tilde{s}_t + \tilde{P}_t. \tag{1}$$

Individual preferences with respect to the distribution of consumption over time and states are assumed to be representable by an additively separable von Neumann/Morgenstern index

$$U_t = E[u(\tilde{c}^1_t) + \hat{u}(\tilde{c}^2_t)], \tag{2}$$

where E denotes the expectations operator. We will make the stochastics explicit presently. All individuals are assumed to be identical, risk averse and to possess decreasing absolute risk aversion (DARA). Their utility functions $v = u, \hat{u}$ thus are smooth, strictly increasing, strictly concave ($v'(c) > 0 > v''(c)$ for all c) and satisfy $(-v''(c)/v'(c))' < 0$ for all c > 0. In the sequel we will use the Arrow-Pratt measure of relative risk aversion; for a utility function v, it is defined by $R_v(c) := -c\,v''(c)/v'(c)$.

Stochastics take their origin in price (i.e., wage and interest rate) uncertainties. We assume that the wage rates at different points in time are independently distributed (no autocorrelation). Furthermore, the interest factor in t and the wage rate in any other period are assumed to be stochastically independent. Hence, $\mathrm{Cov}(\tilde{w}_t, \tilde{w}_s) = \mathrm{Cov}(\tilde{w}_t, \tilde{R}_s) = 0$ for all $t \neq s$. The interest and the wage rate at the same point in time may, however, be dependent random variables. We will use the notation $E_t$ to indicate the expectation for random variables $x_t$ with time index t. As serial stochastic dependence is excluded, there is no need to specify information sets upon which the expectations are formed. We will denote the support of interest rates by $\mathcal{R}$ and that of wage rates by $\mathcal{W}$; both $\mathcal{R}$ and $\mathcal{W}$ are assumed to be closed intervals of the positive real line. There is no loss in the generality of the insights of this paper in assuming that $\mathcal{R}$ and $\mathcal{W}$ are time-independent.

We only consider pay-as-you-go (PAYG) pension schemes. Hence, for all t, the budget of the pension scheme must always be balanced:

$$\tilde{B}_{t+1} = \tilde{P}_t \quad \text{a.e. } (\tilde{w}_t, \tilde{R}_t). \tag{3}$$

Pensions and contributions to the PAYG scheme are assumed to be wage-related. Precisely, the pension for a member of generation t (to be received in t + 1) is given by:

$$\tilde{P}_t = b \cdot [(1 - a_t)\tilde{w}_t + a_t \tilde{w}_{t+1}] \tag{4}$$


with $0 < b < 1$ and $0 \le a_t \le 1$ for all t. We assume that b, which determines the volume of intergenerational transfers, is constant throughout.⁵ To clarify Eq. (4), we discuss its extreme cases:

• $a_t = 0$: The PAYG pension of generation t is a constant share of their earnings during working age: $\tilde{P}_t = b\tilde{w}_t$. Thus, b represents the replacement rate, i.e., the ratio between the pension and the pensioner’s previous earnings. From Eq. (3), we obtain that the contributions levied on those working must amount to $B_{t+1} = b w_t$; clearly, they are a function of their elders’ past earnings.

• $a_t = 1$: The pension of generation t is a constant share of their children’s gross earnings, which may be understood as (a proxy for) the living standard prevailing among the working people in t + 1: $\tilde{P}_t = b\tilde{w}_{t+1}$. Consequently, the contribution to the PAYG scheme amounts to $\tilde{B}_{t+1} = b\tilde{w}_{t+1}$; it is a constant share of current earnings such that b now may be interpreted as the contribution rate.

If $a_t = 0$ for all t, then the PAYG scheme has a fixed replacement rate (FR scheme), while for $a_t = 1$ for all t, it works with a fixed contribution rate (FC scheme). For intermediate values $0 < a_t < 1$, pensions $P_t$ and contributions $B_{t+1}$ in t + 1 are (proportional to) a convex combination of today’s and last period’s wage rates $\tilde{w}_{t+1}$ and $\tilde{w}_t$. We will often interpret a as a measure for intergenerational risk sharing: It measures the extent to which pensions are affected by current wage shocks or, equivalently, the degree to which young individuals can shift their income risk to their parents. Note that if a is fixed at a value other than 0 or 1 over several periods, then neither the replacement rate nor the contribution rate is constant. For generation t, (implicit) contribution and replacement rates can be calculated as $B_t/w_t$ and $P_t/w_t$, respectively.

Individuals take all parameters, prices, and distributions as given. The pension scheme is not a priori fixed, but emerges as the result of a political process. Since we consider b as given, the policy problem is one-dimensional and decisions are to be made only on the values of $a_t$, i.e., on the degree of intergenerational risk sharing in the course of time. Such decisions are made (at most) in every model period. The decision on $a_t$, which directly affects the pension payments of generation t and the contribution payments for generation t + 1, is made at date t + 1, but still before the wage and interest rates $\tilde{w}_{t+1}$ and $\tilde{R}_{t+1}$ have been realized. As an ingredient to political decision making in welfarist social choice approaches (which include the democratic ones) we need information on the citizens’ most preferred values of the policy parameters. To gather this information is the aim of the following sections.
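The pension rule in Eq. (4) and the balanced-budget condition in Eq. (3) can be traced through with a few numbers. The following is a minimal sketch rather than part of the paper's analysis: the function names `payg_flows` and `implicit_rates` and all numerical values are illustrative assumptions. It shows that a = 0 pins down the replacement rate at b, that a = 1 pins down the contribution rate at b, and that intermediate values fix neither.

```python
def payg_flows(a, b, w_prev, w_now):
    """Pension paid to the old and contribution levied on the young in one period.

    a      : risk-sharing parameter in [0, 1] (0 = FR scheme, 1 = FC scheme)
    b      : scale of the intergenerational transfer, 0 < b < 1
    w_prev : wage the current retirees earned when young (w_t)
    w_now  : wage realized for the current workers (w_{t+1})
    """
    if not 0.0 <= a <= 1.0:
        raise ValueError("a must lie in [0, 1]")
    pension = b * ((1.0 - a) * w_prev + a * w_now)  # Eq. (4)
    contribution = pension                           # Eq. (3): balanced budget
    return pension, contribution


def implicit_rates(a, b, w_prev, w_now):
    """Replacement rate P_t / w_t of the retirees and contribution
    rate B_{t+1} / w_{t+1} of the current workers."""
    pension, contribution = payg_flows(a, b, w_prev, w_now)
    return pension / w_prev, contribution / w_now


if __name__ == "__main__":
    # Hypothetical numbers: wages fall from 100 to 80 between the two periods.
    for a in (0.0, 0.5, 1.0):
        print(a, implicit_rates(a, b=0.2, w_prev=100.0, w_now=80.0))
```

With a wage drop, the FR rule (a = 0) leaves the pension untouched and raises the workers' contribution rate, whereas the FC rule (a = 1) keeps the contribution rate at b and lets the pension fall.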

3. The preferred degree of risk sharing for the retirees

Suppose we are at the end of period t. Generation t will retire at the beginning of period t + 1 and afterwards Nature will draw the interest factor and the wage rate for that period.

⁵ Papers that endogenize b are, to name but a few, Browning (1975), Hu (1982), Sjoblom (1985) or Casamatta et al. (2000). In the latter paper, the pension rule looks similar to Eq. (4); it also contains a parameter a (called Bismarckian factor) that measures, however, the degree to which the pension scheme is redistributive.


As a result of his past decision, each individual close to retirement holds some nonnegative amount of savings s, which is now beyond his control. Shortly before date t + 1, members of generation t are thus only interested in

$$E_{t+1}\hat{u}(\tilde{R}_{t+1}s + \tilde{P}_t) = \int_{\mathcal{R}} \int_{\mathcal{W}} \hat{u}(\tilde{R}_{t+1}s + b \cdot [(1 - a_t)w_t + a_t\tilde{w}_{t+1}])\, h(\tilde{R}_{t+1}, \tilde{w}_{t+1})\, d\tilde{w}_{t+1}\, d\tilde{R}_{t+1}, \tag{5}$$

where h is the bivariate density function of $\tilde{R}_{t+1}$ and $\tilde{w}_{t+1}$. Define

$$a^o_t(w_t) := \arg\max_{0 \le a \le 1} E_{t+1}\hat{u}(\tilde{R}_{t+1}s + b \cdot [(1 - a)w_t + a\tilde{w}_{t+1}]) \tag{6}$$

as the most preferred degree of risk sharing for generation t in its old age. By definition, $a^o_t$ is a function of generation t’s past labor earnings. As Eq. (5) is still a double integral, some of the more interesting properties of $a^o_t(w_t)$ will depend on the nature of stochastic dependence between the interest factor $\tilde{R}_{t+1}$ and the wage rate $\tilde{w}_{t+1}$. To capture these effects, we employ a refined concept of stochastic dependence due to Lehmann (1966), which was used in the economics literature, e.g., by Cheng and Magill (1985) and Magill and Nermuth (1986): Two random variables $\tilde{R}$ and $\tilde{w}$ are called positively (negatively) dependent if, for all $(a,b) \in \mathbb{R}^2$,

$$\mathrm{Prob}[\tilde{R} \le a, \tilde{w} \le b] \;\ge (\le)\; \mathrm{Prob}[\tilde{R} \le a] \cdot \mathrm{Prob}[\tilde{w} \le b] \tag{7}$$

with strict inequality for some $(a,b) \in \mathbb{R}^2$. If there is equality of the two sides above for all $(a,b) \in \mathbb{R}^2$, then $\tilde{R}$ and $\tilde{w}$ are independent in the standard sense. Positively dependent random variables tend to “hang together”. It can be shown that, if $\tilde{R}_{t+1}$ and $\tilde{w}_{t+1}$ are positively (negatively) dependent and if g is a decreasing function, then $g(\tilde{R}_{t+1})$ and $\tilde{w}_{t+1}$ are negatively (positively) dependent (Lehmann, 1966, Lemma 1).

Proposition 1. Suppose that $\hat{u}$ satisfies $R_{\hat{u}}(c) \le 1$ for all c.

1. There exist $\underline{w}_t$ and $\bar{w}_t$ with $\underline{w}_t < \bar{w}_t$ such that

$$a^o_t(w_t) \begin{cases} = 1 & \text{if } w_t \le \underline{w}_t \\ \in (0,1) & \text{if } \underline{w}_t < w_t < \bar{w}_t \\ = 0 & \text{if } w_t \ge \bar{w}_t. \end{cases}$$

Here, $\underline{w}_t < E_{t+1}\tilde{w}_{t+1}$. Further, $\bar{w}_t < (>)[=]\; E_{t+1}\tilde{w}_{t+1}$ if $\tilde{R}_{t+1}$ and $\tilde{w}_{t+1}$ are positively (negatively) dependent [independent].

2. The function $a^o_t(w_t)$ is strictly decreasing in $w_t$ for all $\underline{w}_t < w_t < \bar{w}_t$.

Proof. Differentiating Eq. (5) with respect to a yields:

$$F(a, w_t) := b\,E_{t+1}\big((\tilde{w}_{t+1} - w_t)\,\hat{u}'(\tilde{R}_{t+1}s + b \cdot [(1 - a)w_t + a\tilde{w}_{t+1}])\big). \tag{8}$$


In an interior solution for $a^o_t(w_t)$, we have $F(a^o_t(w_t), w_t) = 0$.

• Consider the case that a = 0. Then

$$F(0, w_t) = b \cdot E_{t+1}\big((\tilde{w}_{t+1} - w_t)\,g(\tilde{R}_{t+1})\big)$$

where we set $g(\tilde{R}_{t+1}) := \hat{u}'(\tilde{R}_{t+1}s + bw_t)$. The function g is decreasing: $g'(\tilde{R}_{t+1}) = s\hat{u}'' < 0$. Hence, $g(\tilde{R}_{t+1})$ and $\tilde{w}_{t+1}$ are negatively (positively) dependent [independent] whenever $\tilde{R}_{t+1}$ and $\tilde{w}_{t+1}$ are positively (negatively) dependent [independent]. From Lehmann (1966, Lemma 3), it is known that if two random variables are positively (negatively) dependent [independent], their covariance is positive (negative) [zero]. Applying this result, we obtain

$$b^{-1} \cdot F(0, w_t) = E_{t+1}\big((\tilde{w}_{t+1} - w_t)\,g(\tilde{R}_{t+1})\big) \;> (<)[=]\; E_{t+1}(\tilde{w}_{t+1} - w_t) \cdot E_{t+1}g(\tilde{R}_{t+1})$$

if $\tilde{w}_{t+1}$ and $\tilde{R}_{t+1}$ are negatively (positively) dependent [independent]. If $a^o_t = 0$ is supposed to hold (i.e., the retirees’ most preferred value of $a_t$ just happens to be zero), we must have $0 = F(0, w_t)$ and thus, due to g > 0, that $w_t > (<)[=]\; E_{t+1}\tilde{w}_{t+1}$ if $\tilde{w}_{t+1}$ and $\tilde{R}_{t+1}$ are negatively (positively) dependent [independent].

• If $a^o_t$ is an interior most-preferred value, we have $F(a^o_t, w_t) = 0$. By the implicit-function theorem,

$$\frac{da^o_t(w_t)}{dw_t} = -\frac{F_{w_t}(a, w_t)}{F_a(a, w_t)}.$$

Here, $F_a = b^2 E_{t+1}\big((\tilde{w}_{t+1} - w_t)^2\hat{u}''\big) < 0$. Hence, $a^o_t$ is decreasing in $w_t$ if and only if $F_{w_t} < 0$. Calculate that

$$F_{w_t}(a_t, w_t) = b \cdot E_{t+1}\big(-\hat{u}'(\tilde{c}^2_t) + (\tilde{w}_{t+1} - w_t)\,b(1 - a)\,\hat{u}''(\tilde{c}^2_t)\big) = b \int_{\mathcal{R}} \int_{\mathcal{W}} \big(-\hat{u}'(\tilde{c}^2_t) + (\tilde{w}_{t+1} - w_t)\,b(1 - a)\,\hat{u}''(\tilde{c}^2_t)\big)\, h(\tilde{R}_{t+1}, \tilde{w}_{t+1})\, d\tilde{w}_{t+1}\, d\tilde{R}_{t+1}. \tag{9}$$

As $\hat{u}' > 0$, this is clearly negative for a = 1. However, as shown in Appendix A, the second term is strictly positive for a < 1 and thus tends to offset the first. Yet, verify that

$$F_{w_t}(a_t, w_t) < b\,E_{t+1}\big(-\hat{u}'(\tilde{c}^2_t) - [b(1 - a)w_t + ba\tilde{w}_{t+1}]\,\hat{u}''(\tilde{c}^2_t)\big) < b\,E_{t+1}\big(-\hat{u}'(\tilde{c}^2_t) - \tilde{c}^2_t\,\hat{u}''(\tilde{c}^2_t)\big) = b\,E_{t+1}\big(-\hat{u}'(\tilde{c}^2_t) \cdot [1 - R_{\hat{u}}(\tilde{c}^2_t)]\big)$$

which is negative if $R_{\hat{u}} \le 1$ for all $\tilde{c}^2_t$. Define the wage level $\bar{w}_t$ as that (smallest) one where $a^o_t = 0$; formally, $\bar{w}_t$ satisfies $a^o_t(\bar{w}_t) = 0 = F(a^o_t(\bar{w}_t), \bar{w}_t)$. Further set $\underline{w}_t$ to be that (largest) value of $w_t$ such that $a^o_t = 1$; formally, $\underline{w}_t$ solves $a^o_t(\underline{w}_t) = 1$ and $F(a^o_t(\underline{w}_t), \underline{w}_t) = 0$. Given that F is strictly decreasing in $w_t$, both $\bar{w}_t$ and $\underline{w}_t$ are unique and $\bar{w}_t > \underline{w}_t$. Proposition 1 then readily follows. □

From Proposition 1, the most preferred value of $a_t$ for members of generation t shortly prior to their retirement is a (weakly) decreasing function of their earnings during working age. Moreover, there exists an earnings level $\bar{w}_t$ above which generation t prefers $a_t$ to be zero (FR scheme). Finally, there exists a low wage level $\underline{w}_t$ below which retirees always prefer $a_t$ to be one, which corresponds to the FC scheme.

At first glance, these results seem intuitive: A positive value of $a_t$ means that generation t’s pensions depend on the yet uncertain future wage rate $\tilde{w}_{t+1}$. Risk-averse individuals will find that unattractive per se, and they will be more reluctant the higher their own wage level, upon which their pension would otherwise be contingent. However, there are two obstacles that stand against this intuitive reasoning and that make Proposition 1 look a bit more complicated.

First, the argument that risk-averse members of generation t will find it generally unattractive to make their pension contingent on uncertain future incomes needs an important qualification. The argument entails that risk-averse retirees would never accept any lottery about their pension whose expected value does not exceed the certain pension they could ensure for $a_t = 0$. This intuition is flawed; it overlooks that retirees already face some risk for their old-age consumption because the returns on their savings are stochastic.⁶ As Proposition 1 states, this might shift the limit value $\bar{w}_t$ (above which a positive value of $a_t$ is never welcome) above the expected value of future wages $E_{t+1}\tilde{w}_{t+1}$. This will always be the case when future wages and the interest rate are negatively dependent.
To understand this, recall that, for $a_t = 0$, the riskiness of old-age consumption is solely due to the return risk on savings, while positive levels of $a_t$ add to this another risk via wages $\tilde{w}_{t+1}$. Such an additional risk will be welcome if it entails diversification possibilities, which happens if and only if stochastic returns are negatively correlated. Then generation t is willing to accept some additional risk even if it goes along with a loss in terms of expected return compared to the “safe return” they would earn for $a_t = 0$. Of course, if future interest and wage rates are positively dependent, an increase in $a_t$ will increase the exposure to the interest rate risk without offering any opportunities for diversification. Hence, retirees will at most accept such a higher risk exposure if they are compensated by a higher return, i.e., for $E_{t+1}\tilde{w}_{t+1} > w_t$.

Proposition 1 reflects the potential diversification possibilities of multi-pillar pension schemes that have recently been elaborated by Hauenschild (1999) and Dutta et al. (2000). Diversification possibilities generally exist whenever the rates of return for two assets are negatively dependent; they imply that risk-averse individuals are also willing to include the low-return asset in their portfolios. This is readily transferred to the present analysis: a negative correlation between wages and interest rates can make the exposure to additional pension risks worthwhile.

⁶ In all formal propositions, the case of non-stochastic interest rates is covered by the case that interest and wage risks are independent. This does, however, not imply that deterministic and wage-independent interest rates are the same thing in general; this only holds for the qualitative properties and boundary values that are mentioned in the results. Indeed, the interest rate risk is a background risk with highly complex comparative-static effects on the decision on a.
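The role of the dependence structure can be illustrated numerically. The sketch below is only a toy calibration under stated assumptions, not the paper's method: logarithmic utility (so relative risk aversion equals one), a two-point distribution for each of the wage and the interest factor, hypothetical parameter values for s and b, and a simple grid search over a instead of the first-order condition. The function name `preferred_a` and the probability tables are made up for illustration.

```python
import numpy as np

# Joint probabilities probs[i][j] of (rates[i], wages[j]); illustrative numbers only.
NEGATIVE = [[0.1, 0.4], [0.4, 0.1]]  # a high interest factor tends to come with a low wage
POSITIVE = [[0.4, 0.1], [0.1, 0.4]]  # a high interest factor tends to come with a high wage


def preferred_a(w_t, probs, wages=(0.6, 1.4), rates=(0.8, 1.4),
                s=0.5, b=0.2, grid=np.linspace(0.0, 1.0, 1001)):
    """Grid-search the retirees' most preferred a, cf. Eq. (6), for log utility."""
    probs = np.asarray(probs, dtype=float)
    R = np.asarray(rates, dtype=float)[:, None, None]    # shape (2, 1, 1)
    w = np.asarray(wages, dtype=float)[None, :, None]    # shape (1, 2, 1)
    a = grid[None, None, :]                               # shape (1, 1, n)
    consumption = R * s + b * ((1.0 - a) * w_t + a * w)   # old-age consumption per state
    expected_utility = (probs[:, :, None] * np.log(consumption)).sum(axis=(0, 1))
    return float(grid[np.argmax(expected_utility)])


if __name__ == "__main__":
    # The expected future wage is 1.0 under both probability tables.
    for label, probs in (("negative", NEGATIVE), ("positive", POSITIVE)):
        print(label, [preferred_a(w_t, probs) for w_t in (0.6, 0.9, 1.0, 1.1, 1.4)])
```

In this calibration, retirees whose own wage equals the expected future wage still choose a positive a under negative dependence, whereas they choose a = 0 under positive dependence. This matches the ordering of the upper threshold relative to the expected future wage in Proposition 1.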


The second obstacle against a first-order intuition behind Proposition 1 is related to the observation that, the higher their own wage rate $w_t$ has been, the more reluctant members of generation t will be to accept higher values of $a_t$. Eq. (9) identifies two effects of an increase in $w_t$ on $a^o_t$ that point in opposite directions. The first one (captured in $-\hat{u}'$) is always negative and may be termed a substitution effect: as an additional unit of the lottery over $\tilde{w}_{t+1}$ (i.e., a marginal increase in $a_t$) always means one unit of $w_t$ foregone in the pension formula, an increase in $w_t$ makes the lottery over $\tilde{w}_{t+1}$ more costly and less attractive. Hence, higher values of a become less desirable. The second effect (captured by the expression containing $\hat{u}''$) may be termed DARA or income effect: Higher wages $w_t$ induce a higher propensity to risk-taking. This effect is unambiguously positive (as shown in Appendix A) and thus works to counter the substitution effect. The assumption that relative risk aversion $R_{\hat{u}}$ does not exceed one then amounts to preventing the DARA effect from dominating the substitution effect.

The assumption that relative risk aversion does not exceed unity is often made. In particular, Cheng et al. (1987) and Choi et al. (2001) identify it as being equivalent to a series of “plausible” comparative statics results in various settings under uncertainty, ranging from portfolio choices over liquidity preferences to optimal firm behavior with price uncertainty. In our framework $R_{\hat{u}} \le 1$ is merely a sufficient, not a necessary, condition for Proposition 1 to hold.⁷
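To see what the condition on relative risk aversion means in a familiar case, consider the following worked example. It assumes a constant-relative-risk-aversion (CRRA) utility function with parameter $\rho > 0$; this functional form is chosen purely for illustration and is not imposed in the text.

```latex
% Assumption for illustration only: CRRA utility with parameter \rho > 0.
\[
\hat{u}(c) = \frac{c^{1-\rho}}{1-\rho} \quad (\rho \neq 1), \qquad
\hat{u}(c) = \ln c \quad (\rho = 1).
\]
% Derivatives, absolute and relative risk aversion:
\[
\hat{u}'(c) = c^{-\rho}, \qquad
\hat{u}''(c) = -\rho\, c^{-\rho-1}, \qquad
-\frac{\hat{u}''(c)}{\hat{u}'(c)} = \frac{\rho}{c}, \qquad
R_{\hat{u}}(c) = -\frac{c\,\hat{u}''(c)}{\hat{u}'(c)} = \rho .
\]
% Absolute risk aversion \rho / c falls in c, so DARA holds for every \rho > 0.
% The bound used in the proof of Proposition 1 becomes
\[
-\hat{u}'(c) - c\,\hat{u}''(c) = -\hat{u}'(c)\,[1 - R_{\hat{u}}(c)] = -(1-\rho)\,c^{-\rho} \le 0
\quad \Longleftrightarrow \quad \rho \le 1 .
\]
```

Under this assumed functional form, the sufficient condition of Proposition 1 reduces to $\rho \le 1$, with logarithmic utility as the boundary case.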

4. The preferred degree of risk sharing for the young

We now turn to the ex ante most preferred value of $a_t$ for generation t + 1. The parameter $a_t$ (at least) determines that generation’s contributions to the PAYG scheme. As before, we assume that decisions on $a_t$ take place in knowledge of $w_t$, but before Nature draws $\tilde{w}_{t+1}$. Ex ante, the expected lifetime utility of generation t + 1 amounts to:

$$U_{t+1} = E_{t+1} \max_{s_{t+1}} \big\{ u\big(\tilde{w}_{t+1} - b((1 - a_t)w_t + a_t\tilde{w}_{t+1}) - s_{t+1}\big) + E_{t+2}\hat{u}(\tilde{R}_{t+2}s_{t+1} + \tilde{P}_{t+1}) \big\}, \tag{10}$$

where we take into account that generation t + 1 decides on saving after $\tilde{w}_{t+1}$ is known. For their old age in t + 2, generation t + 1 must form expectations not only about interest and wage rates $\tilde{R}_{t+2}$ and $\tilde{w}_{t+2}$, but also about the pension parameter $a_{t+1}$ that will prevail then. In principle, a rational expectations analysis where the young agents correctly infer the distributions of all relevant future variables (including the endogenous values of $a_{t+k}$) is required here. Such an analysis is not only very demanding (both for the model maker and for the agents populating the model), but also requires being specific about the political process(es) that determine future decisions on a. We avoid that complexity here, at the cost of some rationality in individual decision making.

For retrieving generation t + 1’s preferences over $a_t$, the future parameter $a_{t+1}$ plays a role in so far as the young might expect it to vary with today’s $a_t$. To capture these effects, we assume that (not fully rational) individuals perceive the future value $a_{t+1}$ to depend, in a deterministic way, on $a_t$ and, possibly, on the (realized value of the) future wage rate $\tilde{w}_{t+1}$:⁸

$$a_{t+1} = C(a_t, \tilde{w}_{t+1}). \tag{11}$$

⁷ (i) The necessary condition $F_{w_t} < 0$ in Eq. (9) does not seem to have any accessible equivalent in terms of preferences and/or underlying stochastics. (ii) A slightly weaker sufficient condition than $R_{\hat{u}} \le 1$ can be obtained with the help of the concept of partial relative risk aversion (Menezes and Hanson, 1970; Cheng et al., 1987) which, for a given utility function v(c) with $c = Y + \tilde{x}$, is defined by $P_v(Y, \tilde{x}) := -\tilde{x}\,v''(Y + \tilde{x})/v'(Y + \tilde{x})$. The condition $R_{\hat{u}} \le 1$ in Proposition 1 can then be replaced by $P_{\hat{u}}(Y, \tilde{x}) \le 1$ for all $Y = \tilde{R}_{t+1}s$ and $\tilde{x} = b(1 - a)w_t + ba\tilde{w}_{t+1}$.

In period $t+1$, the members of generation $t+1$ choose saving $s$ so as to maximize

$$u(w_{t+1} - B_{t+1} - s) + E_{t+2}\, \hat u\big(\tilde R_{t+2} s + \tilde P_{t+1}\big). \tag{12}$$

Observe in Eq. (12) that the wage rate $w_{t+1}$ and the pension contribution $B_{t+1}$, which is determined by $a_t$, are (at the point in time in question) already given. Furthermore, the pension parameter $a_{t+1}$ is known. Nevertheless, the second-period expectation is a double integral since stochastics extend over the interest factor $\tilde R_{t+2}$ and the wage rate $\tilde w_{t+2}$, with the latter affecting the pension payment $\tilde P_{t+1}$. Optimal savings are characterized by the FOC that the expected marginal utilities of consumption have to be equalized across time:

$$-u'(c^1_{t+1}) + E_{t+2}\big(\tilde R_{t+2}\, \hat u'(\tilde c^2_{t+1})\big) = 0. \tag{13}$$

For the sake of reference, it will be useful to consider the income effect on savings right here. Define first-period income of generation $t+1$ as $Y_{t+1} := \tilde w_{t+1} - b(a_t \tilde w_{t+1} + (1-a_t)w_t)$. Implicit differentiation of Eq. (13) yields that the marginal saving rate is positive, but smaller than one:

$$s_Y := \frac{\partial s_{t+1}}{\partial Y_{t+1}} = \frac{u''(c^1_{t+1})}{u''(c^1_{t+1}) + E_{t+2}\big(\tilde R^2_{t+2}\, \hat u''(c^2_{t+1})\big)} \in (0,1). \tag{14}$$
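The bound in Eq. (14) can be illustrated numerically. The following is only a minimal sketch under assumptions that are not in the paper: both utility functions are CRRA with coefficient 0.5, the interest factor follows a two-point distribution, and the pension is a fixed number. The names `foc`, `optimal_saving` and `marginal_saving_rate` are hypothetical helpers.

```python
GAMMA = 0.5  # assumed CRRA coefficient (relative risk aversion below one)

# Assumed two-point law of the interest factor R and an assumed fixed pension P.
R_VALUES = [(0.8, 0.5), (1.6, 0.5)]  # (realisation, probability)
PENSION = 0.2


def mu(c):
    """Marginal utility c**(-GAMMA), used for both u and u-hat."""
    return c ** (-GAMMA)


def mu_prime(c):
    """Second derivative of utility, -GAMMA * c**(-GAMMA - 1)."""
    return -GAMMA * c ** (-GAMMA - 1.0)


def foc(s, income):
    """Left-hand side of Eq. (13): -u'(Y - s) + E[R * u_hat'(R*s + P)]."""
    return -mu(income - s) + sum(p * r * mu(r * s + PENSION) for r, p in R_VALUES)


def optimal_saving(income, tol=1e-12):
    """Solve the FOC by bisection; foc() is strictly decreasing in s."""
    lo, hi = 0.0, income - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if foc(mid, income) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def marginal_saving_rate(income):
    """Right-hand side of Eq. (14), evaluated at the optimal saving level."""
    s = optimal_saving(income)
    num = mu_prime(income - s)
    den = num + sum(p * r * r * mu_prime(r * s + PENSION) for r, p in R_VALUES)
    return num / den


if __name__ == "__main__":
    print(marginal_saving_rate(1.0))
```

Comparing `marginal_saving_rate(1.0)` with a finite difference of `optimal_saving` around an income of one is a quick way to check that the implicit-differentiation formula and the numerical optimum agree in this parameterization.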

Generally, the assessment of $a_t$ by the younger generation can be expressed as:

$$\frac{\partial U_{t+1}}{\partial a_t} = E_{t+1}\Big\{ b\Big[ (w_t - \tilde w_{t+1})\, u'(\tilde c^1_{t+1}) + \frac{\partial \Gamma}{\partial a_t}\, E_{t+2}\big( (\tilde w_{t+2} - \tilde w_{t+1})\, \hat u'(\tilde c^2_{t+1}) \big) \Big] \Big\}. \tag{15}$$

All effects of $a_t$ on saving cancel out in Eq. (15) by the envelope theorem.

8 An alternative (also not fully rational) approach would be to introduce stochastics over $a_{t+1}$ that are conditional on $a_t$, i.e., $H(x \mid a_t) := \mathrm{Prob}(a_{t+1} \le x \mid a_t)$. Then changes in $a_t$ cause the stochastics of $a_{t+1}$ to change. Some experiments with such an approach, however, suggest that its insights are (even) more limited than with assumption (11).


4.1. Case I: Future decisions are independent of today's

We first assume that today's decision on $a_t$ has no (perceived) effect on the value of $a_{t+1}$: $\partial \Gamma/\partial a_t \equiv 0$. We denote by $a_t^y(w_t)$ generation $t+1$'s most preferred value of $a_t$ in that case.9

Proposition 2. Let $a_{t+1}$ be independent of $a_t$. Suppose that $R_u(c) \le 1$ and that savings are convex in income: $s_{YY} \ge 0$.

1. There exist $\check w_t < \hat w_t < E_{t+1}\tilde w_{t+1}$ such that

$$a_t^y(w_t) \begin{cases} = 0 & \text{if } w_t \le \check w_t, \\ \in (0,1) & \text{if } \check w_t < w_t < \hat w_t, \\ = 1 & \text{if } w_t \ge \hat w_t. \end{cases}$$

2. The function $a_t^y(w_t)$ is strictly increasing for $\check w_t < w_t < \hat w_t$.

Proof. With $a_{t+1}$ independent of $a_t$, the second term in Eq. (15) disappears. We thus only have to be concerned with the properties of $G(a_t,w_t) := E_{t+1}[(w_t - \tilde w_{t+1})\, u'(c^1_{t+1})]$. If $a_t^y$ is an interior solution, we have $G(a_t^y(w_t),w_t) = 0$.

• We first verify that $u'(c^1_{t+1})$ is a decreasing function of $\tilde w_{t+1}$:

$$\frac{\partial u'(c^1_{t+1})}{\partial \tilde w_{t+1}} = u''(c^1_{t+1}) \cdot \Big( (1 - b a_t) - \frac{\partial s_{t+1}}{\partial \tilde w_{t+1}} \Big) < 0.$$

Here, the inequality follows from Eq. (14): $\partial s_{t+1}/\partial \tilde w_{t+1} = (1 - b a_t)\, s_Y < (1 - b a_t)$. Since $(w_t - \tilde w_{t+1})$ (trivially) decreases in $\tilde w_{t+1}$, applying Chebyshev's Inequality to $G$ yields

$$G(a_t,w_t) = E_{t+1}[(w_t - \tilde w_{t+1})\, u'(c^1_{t+1})] > E_{t+1}(w_t - \tilde w_{t+1}) \cdot E_{t+1} u'(c^1_{t+1}).$$

Hence, for $w_t \ge E_{t+1}\tilde w_{t+1}$, we have $G(a_t,w_t) > 0$ for all $a$. Thus, $a_t^y = 1$ is the most preferred choice. In order for $a_t^y$ to be an interior solution, we must have $0 = G(a_t,w_t)$, which requires $w_t < E_{t+1}\tilde w_{t+1}$.

• We now wish to show that $G(a_t^y,w_t)$ is strictly increasing in $w_t$. Calculate:

$$G_{w_t} = E_{t+1}\big[ u'(c^1_{t+1}) + (1 - a_t)\, b\, (1 - s_Y)(\tilde w_{t+1} - w_t)\, u''(c^1_{t+1}) \big] \tag{16}$$

9 The (predetermined) value of $a_{t+1}$ clearly also impacts on the preferred value of $a_t^y$, i.e., $a_t^y = a_t^y(w_t, \bar a_{t+1})$. Since the (quite complex) comparative statics of $a_t^y$ with respect to $\bar a_{t+1}$ are of no relevance for our analysis, we do not present them here.


where we used Eq. (14). $G_{w_t}$ contains a positive effect ($u' > 0$) and a negative one (see Appendix B). Yet,

$$\begin{aligned} G_{w_t} &> E_{t+1}\big[ u'(c^1_{t+1}) + \big(\tilde w_{t+1}(1 - b a_t) - b(1 - a_t) w_t\big)(1 - s_Y)\, u''(c^1_{t+1}) \big] \\ &\overset{(*)}{\ge} E_{t+1}\Big[ u'(c^1_{t+1}) + \big(\tilde w_{t+1}(1 - b a_t) - b(1 - a_t) w_t\big) \frac{c^1_{t+1}}{Y_{t+1}}\, u''(c^1_{t+1}) \Big] \\ &= E_{t+1}\big[ u'(c^1_{t+1})\big(1 - R_u(c^1_{t+1})\big) \big] > 0. \end{aligned}$$

Here, the first inequality follows from $u'' < 0$ and $b < 1$ while the final one comes from $R_u < 1$. The equality in the third line holds by definition. The second inequality (*) comes from the convexity of saving (equivalently, the concavity of $c^1_{t+1}$) in $Y_{t+1}$:

$$1 - s_Y = \frac{\partial c^1_{t+1}}{\partial Y_{t+1}} \le \frac{c^1_{t+1}}{Y_{t+1}}.$$

Now define the wage level $\hat w_t$ mentioned in the claim as that (smallest) one where $a_t^y = 1$; formally, $\hat w_t$ satisfies $a_t^y(\hat w_t) = 1$ and $0 = G(a_t^y(\hat w_t),\hat w_t)$. Further define $\check w_t$ as that (largest) level of $w_t$ such that $G(0,w_t) \le 0$. As $G_{w_t} > 0$, both $\hat w_t$ and $\check w_t$ are unique and $\check w_t < \hat w_t$. Proposition 2 then readily follows from combining the two items above. □

Proposition 2 states that the young generation will only accept a value of $a_t$ below unity if their parents' earnings fall below a certain level $\hat w_t$, which itself is smaller than the expected value $E_{t+1}\tilde w_{t+1}$ of generation $t+1$'s own earnings. The intuition is immediate: First, a large $w_t$ means that the (non-stochastic) burden imposed on the young by a pension scheme with $a_t < 1$ is larger, the lower $a_t$ becomes. Furthermore, the lower $a_t$, the higher is the exposure of generation $t+1$ to its own wage risk (recall that $\partial Y_{t+1}/\partial \tilde w_{t+1} = 1 - b a_t$). Thus, by increasing $a_t$, the risk of the own income can be partially shifted to the elder generation whose high pension claims stemming from $w_t$ will in turn be "devalued" by higher values of $a_t$. These two effects imply that $a_t^y = 1$ always holds when $w_t$ exceeds $E_{t+1}\tilde w_{t+1}$. But even if $w_t$ is, up to a certain amount, smaller than $E_{t+1}\tilde w_{t+1}$, the most preferred value of $a_t$ for the young is unity since there the risk reduction effect dominates the (now negative) income effect. This is an immediate implication of risk aversion. If $w_t$ gets sufficiently smaller than $E_{t+1}\tilde w_{t+1}$ (precisely, if it is smaller than $\hat w_t$), the young will successively attach higher utility to the income effect (i.e., leaving their parents with the low pension claim derived from $w_t$ rather than having them participate in the then relatively good prospects for their own income) than to the risk shifting effect; the optimal value of $a_t^y$ now falls below unity. For small levels of $w_t$ (below $\check w_t$), it may even reach zero.
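The threshold structure of Proposition 2 can be reproduced in a stripped-down numerical sketch. The assumptions here are illustrative and not taken from the paper: saving is switched off (so $c^1_{t+1} = Y_{t+1}$ and the saving function is trivially convex), utility is CRRA with coefficient 0.5, $b = 0.2$, and the future wage takes the values 0.5 and 1.5 with equal probability (so its mean is one). The function names are hypothetical.

```python
GAMMA = 0.5                        # assumed CRRA coefficient, so R_u = 0.5 <= 1
B = 0.2                            # assumed value of the parameter b
WAGES = [(0.5, 0.5), (1.5, 0.5)]   # assumed law of the t+1 wage: (value, prob.)


def marginal_utility(c):
    """u'(c) for CRRA utility."""
    return c ** (-GAMMA)


def G(a, w_t):
    """G(a_t, w_t) = E[(w_t - w) u'(c1)], with c1 = w(1 - b a) - b(1 - a) w_t."""
    total = 0.0
    for w, p in WAGES:
        c1 = w * (1.0 - B * a) - B * (1.0 - a) * w_t
        total += p * (w_t - w) * marginal_utility(c1)
    return total


def preferred_a_young(w_t, tol=1e-10):
    """Most preferred a_t in [0, 1].

    Expected utility is concave in a, so G is decreasing in a and a corner
    solution is detected by the sign of G at the end points.
    """
    if G(0.0, w_t) <= 0.0:
        return 0.0
    if G(1.0, w_t) >= 0.0:
        return 1.0
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if G(mid, w_t) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for w_t in (0.70, 0.83, 0.845, 0.86, 0.90, 1.00):
        print(w_t, round(preferred_a_young(w_t), 4))
```

In this parameterization the preferred value is zero for low $w_t$, rises through a narrow interior range, and reaches one at a wage level that still lies below the expected future wage, in line with the ordering of the two thresholds in the proposition.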
Proposition 2 further states that generation $t+1$'s desire to cede risks to their parents (i.e., their preference for higher levels of $a_t$) increases with their parents' earnings, at least in the range $(\check w_t,\hat w_t)$: The larger $w_t$ and thus the larger the inherited pension burden $b(1 - a_t)w_t$, the smaller is the disposable income of generation $t+1$ during working age. With DARA, this means that the reluctance to take risks is greater and the willingness to expose oneself to the own wage risks is smaller the higher is $w_t$. Hence, the optimal level of $a_t^y$ increases.

While this sounds natural, two caveats must be added. The first is analogous to one discussed in the previous section for the older generation: An increase in $w_t$ entails a (now) positive substitution and a negative DARA effect. The assumption $R_u \le 1$ then ensures that the former dominates the latter. The second caveat involves the saving function: Risk aversion is formulated in terms of consumption $c^1_{t+1}$, not in terms of income $Y_{t+1}$. The above reasoning is, however, based on an argument related to income, not to consumption (= net wage minus saving). If we want the argument to be phrased in terms of utility $u$ as a function of $c^1$, we must control for the fact that part of any reduction in incomes is accommodated by a reduction in savings (see Eq. (14)). Convexity of the saving function ensures that the marginal reduction in consumption is always smaller than the average one. Under this proviso, properties like DARA and the size of relative risk aversion $R_u$ can be related to disposable incomes, not only to consumption.

Convexity of the savings function or, equivalently, concavity of the consumption function is an often-made "Keynesian" assumption in macroeconomic models which also has empirical support (see Carroll and Kimball, 1996 for further references). Of course, the curvature properties of savings or consumption functions are determined by the properties of the utility functions $u$ and $\hat u$. Carroll and Kimball (1996, Theorem 1) provide a full characterization showing that the consumption function is concave (but not necessarily strictly so) if preferences belong to the HARA class, which is well compatible with relative risk aversion lying below unity.
Convexity of $s$ is a sufficient, but not a necessary assumption for the validity of Proposition 2; if savings are concave in income (but not too much so), the young's most preferred level of $a_t$ will still be increasing in their parents' earnings as long as $R_u$ falls sufficiently below unity.

Fig. 2 depicts the optimal values of $a_t$ from the perspectives of the young and the old generations, plotted as functions of $w_t$. The graphs can easily be derived from Propositions 1 and 2. We will return to the implications of these preferences for political decision-making in Section 5.

Fig. 2. Preferences for $a_t$ (independent decisions).

4.2. Case II: No revoting opportunities

So far, we have supposed that the young assume today's decision on $a_t$ to have no implication for their own pension $\tilde P_{t+1}$. Against the background of our stochastic setting without any autocorrelation in variables, it seems natural to assume that $a$ should change over time and that past decisions do not preclude or restrict future choices. One might, however, assume instead that today's decision $a_t$ is a "historical singularity" (viz., a decision on the type of the pension scheme) that will have validity forever or at least for the foreseeable future as it is relevant for those living today. Starting with the seminal paper by Browning (1975), no-revoting assumptions are made in many prominent models of voting on social security (Boadway and Wildasin, 1989; Casamatta et al., 2000; Veall, 1986, and others). In this section, we examine the young's most preferred risk-sharing parameter when today's decision is expected to be still fully effective when today's young will retire:

$$a_{t+1} = a_t. \tag{17}$$

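We denote the most preferred value of the risk sharing parameter under assumption (17) by $\tilde a_t^y = \tilde a_t^y(w_t)$. Invoking Eq. (17), Eq. (15) turns into:

$$\frac{\partial U_{t+1}}{\partial a_t} = b\, E_{t+1}\Big\{ (w_t - \tilde w_{t+1})\, u'(c^1_{t+1}) + E_{t+2}\big( (\tilde w_{t+2} - \tilde w_{t+1})\, \hat u'(\tilde c^2_{t+1}) \big) \Big\}. \tag{18}$$

The young's assessment of $a_t$ differs from that in the previous section by the additional term

$$X(a_t) := E_{t+1} E_{t+2}\Big\{ (\tilde w_{t+2} - \tilde w_{t+1})\, \hat u'\big( \tilde R_{t+2} \tilde s_{t+1} + b(a_t \tilde w_{t+2} + (1 - a_t)\tilde w_{t+1}) \big) \Big\} \tag{19}$$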


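which contains stochastics for $\tilde w_{t+1}$, $\tilde w_{t+2}$ and $\tilde R_{t+2}$. Roughly speaking, if $X > 0$ [$< 0$], then the function $\tilde a_t^y(w_t)$ lies above [below] $a_t^y(w_t)$. The analysis of $X$ is quite involved so that only partial results can be obtained:

Proposition 3.

1. Assume that $\tilde w_{t+2}$ and $c^2_{t+1}$ are negatively dependent or independent. If $E_{t+2}\tilde w_{t+2} \ge E_{t+1}\tilde w_{t+1}$, then $X(a_t) > 0$.
2. Assume that $\tilde w_{t+2}$ and $c^2_{t+1}$ are positively dependent. Then $X(a_t) < 0$ if $E_{t+2}\tilde w_{t+2} \le E_{t+1}\tilde w_{t+1}$ and if the covariance of $\tilde w_{t+1}$ and $E_{t+2}\hat u'(c^2_{t+1}(\tilde w_{t+1}))$ is small.
3. (i) If $\tilde w_{t+2}$ and $\tilde R_{t+2}$ are independent, then $\tilde w_{t+2}$ and $c^2_{t+1}$ are positively dependent for $a_t > 0$ and independent for $a_t = 0$. (ii) If $\tilde w_{t+2}$ and $\tilde R_{t+2}$ are positively dependent, then $\tilde w_{t+2}$ and $c^2_{t+1}$ are also positively dependent. The converse is not true.

Proof. Let $a_t \in [0,1]$. Then:10

$$\begin{aligned} X &= E_{t+1} E_{t+2}\big( \tilde w_{t+2}\, \hat u'(\tilde c^2_{t+1}) \big) - E_{t+1}\big( \tilde w_{t+1}\, E_{t+2} \hat u'(\tilde c^2_{t+1}) \big) \\ &\;(*)\; E_{t+1}\big( E_{t+2}\tilde w_{t+2}\; E_{t+2} \hat u'(\tilde c^2_{t+1}) \big) - E_{t+1}\big( \tilde w_{t+1}\, E_{t+2} \hat u'(\tilde c^2_{t+1}) \big) \\ &= \big[ E_{t+2}\tilde w_{t+2} - E_{t+1}\tilde w_{t+1} \big]\, E_{t+1} E_{t+2} \hat u'(\tilde c^2_{t+1}) - \mathrm{Cov}_{t+1}\big( \tilde w_{t+1},\, E_{t+2} \hat u'(\tilde c^2_{t+1}) \big). \end{aligned} \tag{20}$$

Here, $\hat u'$ has to be evaluated at

$$c^2_{t+1} = \tilde R_{t+2} \tilde s_{t+1} + b\big( (1 - a_t)\tilde w_{t+1} + a_t \tilde w_{t+2} \big). \tag{21}$$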

The first equality in Eq. (20) is from Fubini's Theorem. The second line and its missing sign will be explained below. The third line emerges from the second by application of the covariance formula and collecting terms (note that from the viewpoint of $t+1$, the expectation $E_{t+2}\hat u'$ is, as a function of $\tilde w_{t+1}$, a random variable). We now verify that the covariance in the final line of Eq. (20) is negative. This follows from Chebyshev's inequality since $E_{t+2}\hat u'$ decreases with $\tilde w_{t+1}$. Namely, for all $a_t \in [0,1]$:

$$\frac{\partial}{\partial \tilde w_{t+1}} E_{t+2} \hat u'(c^2_{t+1}) = (1 - b a_t)\, s_Y\, E_{t+2}\big( \tilde R_{t+2}\, \hat u''(c^2_{t+1}) \big) + b(1 - a_t)\, E_{t+2} \hat u''(c^2_{t+1}) < 0.$$

Let us turn to the second line of Eq. (20) and its missing sign (*). Due to Lemma 3 in Lehmann (1966), this sign has to read $>$, $<$ or $=$ if, respectively, $\tilde w_{t+2}$ and $c^2_{t+1}$ are negatively dependent, positively dependent or independent. (Note that $\hat u'$ is decreasing.) Given that the covariance in the final line is negative, the first two items in Proposition 3 follow immediately.

Now recall Eq. (21). It is immediate that, as wage rates are serially independent by assumption, any sort of dependence of $c^2_{t+1}$ and $\tilde w_{t+2}$ can fully be attributed to dependence between $\tilde w_{t+2}$ and $\tilde R_{t+2}$. Furthermore, as $c^2_{t+1}$ is a function of $\tilde w_{t+2}$ for all $a_t > 0$, $c^2_{t+1}$ and $\tilde w_{t+2}$ cannot be stochastically independent unless $a_t = 0$ and $\tilde R_{t+2}$ and $\tilde w_{t+2}$ are independent. To show that (for $a_t \ne 0$) stochastic independence of $\tilde R_{t+2}$ and $\tilde w_{t+2}$ implies positive dependence of $c^2_{t+1}$ and $\tilde w_{t+2}$, we combine Results (iii) and (iv) in Lehmann (1966, p. 1139). Together, they imply that, if two random variables $\tilde x_1$ and $\tilde x_2$ are stochastically independent, then $\tilde x_1$ and $g(\tilde x_1 + \tilde x_2)$ are positively dependent whenever $g$ is non-decreasing. For our case, put $\tilde x_1 = \tilde w_{t+2}$, $\tilde x_2 = \tilde R_{t+2}$ and $g = c^2_{t+1}$, which is affine in $(\tilde w_{t+2},\tilde R_{t+2})$. Similarly, one shows that the positive dependence of $\tilde R_{t+2}$ and $\tilde w_{t+2}$ implies positive dependence of $c^2_{t+1}$ and $\tilde w_{t+2}$. □

10 To avoid notational clutter, we slightly abuse notation here when writing $E_{t+2}$ both for the expectation over the (marginal) density of $\tilde w_{t+2}$ and over the joint density of $(\tilde R_{t+2},\tilde w_{t+2})$. Whenever the expectations operator applies to arguments involving both $\tilde R_{t+2}$ and $\tilde w_{t+2}$ it should be thought of as a double integral, of course.

Proposition 3 can best be understood by viewing a young individual who believes that the $a_t$ chosen today will remain valid until his retirement age (and maybe beyond) as a person who combines both the preferences of a young individual without such beliefs (Proposition 2) and of an individual close to retirement (Proposition 1). The term $X$ then captures the young individuals' anticipation of their old age. It is closely related to the term $F$ defined in Eq. (8). From Proposition 1, we know that a pensioner who expects future wages to match today's will accept a positive level of $a$ if and only if the pension scheme offers diversification opportunities for his saving, i.e., whenever future wages and interest rates are non-positively correlated. The first item of Proposition 3 restates exactly this: If risk sharing by the pension scheme contributes to diversification of old-age consumption (namely, if the own consumption and the wage rate of one's children tend to vary inversely) a young individual will agree to a positive value of $a$ today, although he would not do so were there a revoting opportunity on $a$.
Conversely, if increasing $a$ exacerbates the risk exposure of old-age consumption (namely, if wages and interest rates in $t+2$ are positively dependent), then a young person today is, cet. par., more reluctant to accept positive values of $a$ than he would be under the assumptions of Proposition 2 (i.e., if $a$ were only to hold for a single period). This is stated in the second item of Proposition 3. There is a counter-effect at work here, however, captured by the covariance term in Eq. (20). This covariance term (with the negative sign) is always positive. It drives the young to accept higher values of $a_t$, regardless of the stochastic relationships between the random variables. It may be interpreted as an intertemporal diversification effect.

For a special setting, the following result compares the optimal values of $a_t^y$ for the present and the previous scenarios:

Proposition 4. Assume that $E_{t+1}\tilde w_{t+1} = E_{t+2}\tilde w_{t+2}$, that $\tilde w_{t+2}$ and $\tilde R_{t+2}$ are independent, and that $X(1) < 0$. Let $\check w_t$ and $\hat w_t$ be defined as in Proposition 2. Then there exist $w_t'$ and $w_t''$ with $w_t' < \check w_t < \hat w_t < w_t''$ such that:

$$\tilde a_t^y(w_t) \begin{cases} = 0 & \text{if } w_t \le w_t', \\ \in (0,1) & \text{if } w_t' < w_t < w_t'', \\ = 1 & \text{if } w_t \ge w_t''. \end{cases}$$
The proof is straightforward from Proposition 3. Fig. 3 depicts the new situation.
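The sign pattern behind Eq. (20) and the premise $X(1) < 0$ of Proposition 4 can be checked by exact enumeration in a small discrete example. This is only a sketch under simplifying assumptions that are not the paper's: saving is held fixed at a constant (in the model it responds to $\tilde w_{t+1}$), utility is CRRA with coefficient 0.5, and wages and the interest factor are independent two-point variables with equal expected wages in both periods. All names and numbers are illustrative.

```python
from itertools import product

GAMMA, B, SAVING = 0.5, 0.2, 0.3   # assumed parameters; saving held fixed
W1 = [(0.5, 0.5), (1.5, 0.5)]      # wage in t+1: (value, probability)
W2 = [(0.5, 0.5), (1.5, 0.5)]      # wage in t+2, same mean as W1
R2 = [(0.8, 0.5), (1.6, 0.5)]      # interest factor in t+2, independent of W2


def u_hat_prime(c):
    """Marginal utility of old-age consumption (CRRA)."""
    return c ** (-GAMMA)


def c2(a, w1, w2, r):
    """Old-age consumption as in Eq. (21), with saving fixed at SAVING."""
    return r * SAVING + B * ((1.0 - a) * w1 + a * w2)


def expected_mu_given_w1(a, w1):
    """E_{t+2} u_hat'(c2) for a given realisation of the t+1 wage."""
    return sum(p2 * pr * u_hat_prime(c2(a, w1, w2, r))
               for (w2, p2), (r, pr) in product(W2, R2))


def X(a):
    """The additional term of Eq. (19), computed by exact enumeration."""
    return sum(p1 * p2 * pr * (w2 - w1) * u_hat_prime(c2(a, w1, w2, r))
               for (w1, p1), (w2, p2), (r, pr) in product(W1, W2, R2))


def third_line(a):
    """Third line of Eq. (20): mean-difference term minus the covariance."""
    mean_w1 = sum(p * w for w, p in W1)
    mean_w2 = sum(p * w for w, p in W2)
    mean_mu = sum(p * expected_mu_given_w1(a, w) for w, p in W1)
    cov = sum(p * (w - mean_w1) * (expected_mu_given_w1(a, w) - mean_mu)
              for w, p in W1)
    return (mean_w2 - mean_w1) * mean_mu - cov


if __name__ == "__main__":
    for a in (0.0, 0.5, 1.0):
        print(a, X(a), third_line(a))
```

At $a_t = 0$ old-age consumption does not depend on $\tilde w_{t+2}$, so the sign (*) is an equality and $X$ coincides with the positive covariance term. For $a_t > 0$ the positive dependence of $\tilde w_{t+2}$ and $c^2_{t+1}$ makes (*) read "<", and at $a_t = 1$ the term $X$ is negative in this example.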


Fig. 3. Preferences of the young: independent decisions vs. no-revoting.

Compared to the most-preferred parameter value $a_t^y$ of the previous scenario, the new curve $\tilde a_t^y(w_t)$ is stretched and, on average, flattened. Contrary to Proposition 2, we can no longer show that $\tilde a_t^y$ is monotonic in $w_t$ unless we impose rather restrictive conditions on preferences. Generally, there is no clear-cut relation between $a_t^y$ and $\tilde a_t^y$. Hu (1982, Proposition 5) makes a related observation in a model with voting on the contribution rate $b$: an unambiguous relation between the most-preferred contribution rates with and without a revoting hypothesis does not exist there either.

4.3. Case III: Dependent choices

Let us briefly discuss the general case: $a_{t+1} = \Gamma(a_t,\tilde w_{t+1})$, where $\Gamma$ is supposed to be neither independent of $a_t$ (as in Case I) nor always equal to $a_t$ (as in Case II). This captures the idea that today's choice of $a_t$ is expected to bear some non-trivial significance for future values of $a$. Similar to the term $X$ in Eq. (19), we can from Eq. (15) define a function

$$\hat X(a_t) := E_{t+1}\Big\{ \frac{\partial \Gamma}{\partial a_t} \cdot E_{t+2}\Big[ (\tilde w_{t+2} - \tilde w_{t+1})\, \hat u'\big( \tilde R_{t+2}\tilde s_{t+1} + b(a_{t+1}\tilde w_{t+2} + (1 - a_{t+1})\tilde w_{t+1}) \big) \Big] \Big\} \tag{22}$$


with the interpretation that, compared to the function $a_t^y$ defined in Proposition 2, the young's most preferred value of $a_t$ weakly increases [decreases] if $\hat X$ is positive [negative]. We first consider the case where $a_{t+1}$ is (perceived to be) independent of the realization of $\tilde w_{t+1}$: $\partial \Gamma/\partial \tilde w_{t+1} \equiv 0$. Then $\hat X$ equals

$$\hat X(a_t) := \frac{\partial \Gamma}{\partial a_t} \cdot E_{t+1} E_{t+2}\Big\{ (\tilde w_{t+2} - \tilde w_{t+1})\, \hat u'\big( \tilde R_{t+2}\tilde s_{t+1} + b(a_{t+1}\tilde w_{t+2} + (1 - a_{t+1})\tilde w_{t+1}) \big) \Big\}.$$

The double expectation closely resembles the original version of $X$. In particular, its sign can be analyzed along the same lines as in Proposition 3. If individuals expect that $a_{t+1}$ varies positively with $a_t$, then Propositions 3 and 4 go through (with $X$ being replaced by $\hat X$): The range of $w_t$ where the young's most preferred value of $a_t$ is an intermediate one becomes larger as compared to the independence case of Proposition 2; cf. Fig. 3. The assumption $\partial \Gamma/\partial a_t > 0$ means that a higher value of $a_t$ is seen as a signal for higher values of $a_{t+1}$. However, this is only a subjective belief, and the converse opinion might also be held. If individuals believe that future values of $a$ vary negatively with that of today, all inequality signs referring to $X$ in Proposition 3 have to be reversed for $\hat X$. In turn, Proposition 4 also changes: The resulting function of $w_t$ on average gets steeper (not flatter) than the original function $a_t^y(w_t)$.

This reasoning holds if $a_{t+1}$ is independent of $\tilde w_{t+1}$. Individuals in generation $t+1$ might, however, also believe that their wage $w_{t+1}$ systematically impacts on future decisions on $a$. Unfortunately, non-zero partials of $a_{t+1}$ with respect to $\tilde w_{t+1}$ render Eq. (22) inaccessible. The sign of $\hat X$ then also depends on the sign and magnitude of the cross-partial $\partial^2 \Gamma/(\partial a_t\, \partial \tilde w_{t+1})$. As we are not able to justify any assumption on this derivative, we refrain from a further analysis of that case.11

5. Implications for the political economy of intergenerational risk sharing

Political decisions on $a$ (or any other issue) should, at least in a welfarist understanding, account for the citizens' preferences. Typically, social and public choice mechanisms are formally depicted as mappings from the preference space (possibly times some other space) to the outcome space, and the various social or public choice procedures are distinguished by the structure of these mappings. A common example is the case of majority voting which, under certain properties of the outcome space, reduces to a very simple mapping: The median voter's preferences determine the outcome of the political process.

11 In his analysis of voting over the contribution rate, Hu (1982) has to resort to hypotheses about beliefs on the impact of today's choices and the primitives of the model on future decisions. This also includes assumptions on signs and magnitudes of partials and cross-partials of the belief function. In particular, Hu assumes that today's decisions positively affect future ones, which corresponds to $\partial \Gamma/\partial a_t > 0$ in our framework. The assumptions on the cross-partials in Hu (1982) seem, however, rather ad hoc.


In a 2-OLG model without intracohort differences, the median voter is the representative agent of the more populous generation. As we assumed at the outset that all generations are of equal size, a unique median voter equilibrium does not exist in our framework. However, it is an easy exercise to incorporate different population sizes into the analysis. As long as population changes are deterministic and fully foreseen, none of Propositions 1 to 4 is affected since all population effects can be thought of as being hidden in the constant $b$. Hence, in a model with different sizes of neighboring generations a unique majority equilibrium for voting on $a_t(w_t)$ exists for every wage rate $w_t$. It has the property that, if the young outweigh the old, political decisions on $a_t$ will follow the $a^y$-curve (cf. Propositions 2 and 4) and the $a^o$-curve otherwise (see Proposition 1).

These curves represent the preferences of individuals in different positions in the life cycle. None of these preferences is of the bang-bang type: Between the ranges where the most preferred value of $a$ is an extreme one, there always exists an interval where intermediate values are most welcome. That is, even if we assume that political decisions are organized as majority voting, our 2-OLG model can explain intermediate values for $a$, and thus fluctuations both in the replacement rate and in the contribution rate. Even if the political decision procedure involves one generation outvoting the other, there are large ranges of $w_t$ such that the decisive generation chooses to implement some intergenerational risk sharing ($a_t \in (0,1)$) rather than entirely loading the pension risk upon one of the two generations ($a_t \in \{0,1\}$). Contrary to what the median-voter model predicts, one typically does not observe that pension decisions in the real world evolve through one generation outvoting and fully exploiting the other (Veall, 1986; Breyer and Craig, 1997).
More realistically, one might view policy-making as an attempt to reconcile the interests of the groups in a society. This is, e.g., done in models of representative democracy, which have successfully been applied in the pensions context by Verbon (1986, 1993), or Verbon and Verhoeven (1992). Formally, these approaches boil down to the maximization of a weighted sum of the expected utilities of a (representative) worker and a (representative) pensioner; in our context:

$$\max_{a}\, (U_t + k \cdot U_{t+1}) \tag{23}$$

with $k \ge 0$ as the political weight of the younger generation relative to the older one. In models where governments maximize expected electoral support (as, e.g., in Hettich and Winer, 1988), political choices can also be represented as the maximization of a function like Eq. (23).

Let us first consider the case where the young believe that today's decision on $a_t$ has no implications beyond those on today's contributions and pensions: $\partial \Gamma/\partial a_t = 0$. Fig. 2 then exhibits that for large values of last period's wage rate $w_t$ (precisely, for $w_t > \max\{\bar w_t,\hat w_t\}$) the generations' interests are at full clash: the young want the largest degree of risk sharing, $a_t = 1$, while the old prefer not to have any risk sharing at all, $a_t = 0$. A similar clash of interests occurs at very low levels of $w_t$, but now the roles of the generations have been interchanged. In both of these cases, a political process of type (23) might nevertheless home in on an intermediate value for $a_t$, provided that the political weights of the two generations alive do not differ too much (viz., $k$ is close to one). The same will obviously also happen between these two "clash zones", i.e., in the range where both generations desire some, but not full, risk sharing (both $a_t^o$ and $a_t^y$ are positive, but smaller than one). However, even if we now assume that the political weight of one of the generations vastly exceeds that of the other one, we will still end up with an intermediate value of $a$ in that range, given that generational preferences are as in Fig. 2. Our model in fact predicts that for a very large class of political decision rules the outcome will always be some intermediate degree of intergenerational risk sharing.

When we speak of "small" or "large" values of $w_t$ this always relates to the expected future wage rate $E_{t+1}\tilde w_{t+1}$. Changing the perspective, we can thus state: if generations expect a slump in wage rates ($E_{t+1}\tilde w_{t+1} < w_t$), or if they expect a splendid boom ($E_{t+1}\tilde w_{t+1} \gg w_t$), then their conflict of interest will be most severe. If they expect a moderate increase in wage levels ($E_{t+1}\tilde w_{t+1}$ is larger than $w_t$, but not too much so), their interests are more in harmony: both generations prefer some intermediate degree of risk sharing (but not necessarily the same one).12

Turning to the case that the young assume the pension parameter $a$ to stay in effect for longer than just for period $t+1$, Fig. 3 demonstrates that (under the assumptions of Proposition 4) the realm for which the interests of the generations are diametrically opposed gets smaller as compared to the previous case (at least it will not expand). It is now even conceivable that both generations wish to have interior values of $a_t$ when they expect future wages to fall below today's. Generally, harmony of interests with respect to intergenerational risk sharing via the PAYG scheme is greater in slightly optimistic circumstances where future incomes are expected to be moderately higher than today's. Strong conflicts arise if the economy anticipates either a decline or exceptional growth for the future.
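How a rule of type (23) can turn a full clash of interests into an intermediate outcome may be illustrated with a stylized sketch. Every ingredient below is an assumption for illustration only: CRRA utility with coefficient 0.5, $b = 0.2$, a fixed saving level of the old, no saving by the young (Case I), two-point distributions for the future wage and the interest factor, and a past wage $w_t = 1.2$ above the expected future wage of one. The helper names are hypothetical.

```python
GAMMA, B, SAVING = 0.5, 0.2, 0.3      # assumed parameters; old's saving fixed
WAGES = [(0.5, 0.5), (1.5, 0.5)]      # assumed law of the t+1 wage
RETURNS = [(0.8, 0.5), (1.6, 0.5)]    # assumed law of the interest factor


def mu(c):
    """Marginal utility of CRRA preferences, used for both generations."""
    return c ** (-GAMMA)


def d_old(a, w_t):
    """dU_t/da = b E[(w - w_t) u_hat'(R s + b((1 - a) w_t + a w))]."""
    return B * sum(pw * pr * (w - w_t)
                   * mu(r * SAVING + B * ((1 - a) * w_t + a * w))
                   for w, pw in WAGES for r, pr in RETURNS)


def d_young(a, w_t):
    """dU_{t+1}/da = b E[(w_t - w) u'(w(1 - b a) - b(1 - a) w_t)]."""
    return B * sum(pw * (w_t - w) * mu(w * (1 - B * a) - B * (1 - a) * w_t)
                   for w, pw in WAGES)


def political_choice(k, w_t, tol=1e-10):
    """Maximiser of U_t + k * U_{t+1} over a in [0, 1].

    Both utilities are concave in a, so the weighted slope is decreasing
    and bisection on its sign locates an interior optimum.
    """
    def slope(a):
        return d_old(a, w_t) + k * d_young(a, w_t)

    if slope(0.0) <= 0.0:
        return 0.0
    if slope(1.0) >= 0.0:
        return 1.0
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if slope(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for k in (0.0, 0.65, 5.0):
        print(k, round(political_choice(k, 1.2), 4))
```

With these numbers the old alone ($k = 0$) would pick $a_t = 0$ and a heavily weighted young generation would pick $a_t = 1$, while a moderate weight such as $k = 0.65$ yields an interior value. The range of weights producing an interior outcome depends entirely on the assumed parameters.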

6. Concluding remarks

We started from the observation that in real-world PAYG pension schemes neither the contribution nor the replacement rate is constant over time. One possible interpretation for this is that PAYG policies involve some positive, but variable degree of intergenerational risk sharing. The aim of the paper then was to elucidate the preferences which risk-averse individuals in different positions of their life cycles and under various initial conditions (represented by the "historical" wage rate) develop for risk sharing through the pension scheme. We did so in a two-pillar pension scheme where individuals have access to private old-age savings. Note, however, that this two-pillar pension model in fact includes three assets: the one to which private savings go, a riskless PAYG asset and a risky PAYG asset. The parameter $a$ determines the blend of the two PAYG assets within the first pillar. Individual preferences for PAYG mixtures are influenced by the return risk in the second pillar (savings), which should not come as a surprise if one views old-age provisions as a portfolio problem. Correlations (and not only return differentials) play an essential role if one wishes to assess PAYG schemes correctly.

12 In the intersection of $a_t^y$ and $a_t^o$ in Fig. 3 there is of course perfect coincidence of interests and thus unanimous support for that level of $a$. However, this is a non-generic event.


We assumed the parameter $b > 0$ to be exogenous and invariable. This chiefly means that we do not question the existence of the PAYG scheme itself. Advocates of radical changes in social security sometimes argue that PAYG schemes should be entirely abolished ($b = 0$) due to their low returns. However, since saving in our model earns stochastic returns, there is indeed scope for PAYG social security here, even for schemes with relatively low expected rates of return (Gale, 1991; Hauenschild, 1999): rational individuals will wish to include PAYG schemes in their old-age income mix for the motives of diversification and insurance. Assuming that $b$ is constant presupposes a constant scale of intergenerational transfers (at least in expected terms) and, given that saving and (FC) funded pensions are perfect substitutes, implicitly also invariant shares of funded and PAYG components in the two-pillar pension mix.

Our focus here was on the type of the PAYG scheme, represented by the risk-sharing parameter $a$. With regular voting on social security, the size of the pension scheme or the blending of PAYG and funded pensions certainly figure more prominently than $a$ on many a policy agenda. Recent reforms, e.g., in Germany (2001; see Bonin, 2001), Sweden (1997; see Wadensjö, 2000), and Italy (Dini reform 1995; see Antichi and Pizzuti, 2000), which deliberately shifted pension policies towards an FC regime, indicate, however, that $a$ might indeed be regarded as a policy variable in itself. For future research, it might be interesting to combine voting on both the transfer and the risk-sharing components of PAYG schemes, although, given the multi-dimensionality of that problem, a collective choice analysis will presumably become more problematic. Our main question was to analyze individual preferences over the risk-sharing parameter $a$.
An important finding was that under slightly optimistic expectations for the future insurance and diversification motives are important from all individuals’ points of view. Aggregating these individual preferences in the political process then might explain PAYG pension formulae that have both the contribution and the replacement rate fluctuating. While different political mechanisms will, for sure, induce different outcomes with respect to intergenerational risk sharing (i.e., generate different numerical values for a), the polar outcomes of pure FC or pure FR schemes will emerge from very few mechanisms only. Even mechanisms that allocate all political powers to one generation will generate mixed PAYG pension schemes, provided that expectations for the future are at least slightly optimistic. As moderate optimism seems to have been the dominant expectation with respect to future economic growth over the past decades after World War II, this might explain why most real-world PAYG schemes indeed are of an intermediate type.

Acknowledgements

I am indebted to Friedrich Breyer, Georges Casamatta, Rüdiger Pethig, Hans-Werner Sinn, and Erling Steigum for fruitful discussions. Conference participants in Paris, Munich, and Linz provided valuable comments. I wish to thank two anonymous referees for their productive suggestions and helpful comments. Financial support by Deutsche Forschungsgemeinschaft (DFG) is gratefully acknowledged.

A. Wagener / European Journal of Political Economy 20 (2004) 181–206

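Appendix A. Further Analysis of Eq. (9)

We will show that the second term in Eq. (9) is positive for all $a < 1$. We are done if $E_{t+1}[(\tilde{w}_{t+1} - w_t)\,\hat{u}''(\cdot)] > 0$. Expand the expression under the expectation operator by $\hat{u}'$ to obtain:

\[
E_{t+1}\big((\tilde{w}_{t+1} - w_t)\,\hat{u}''\big)
= E_{t+1}\Big((\tilde{w}_{t+1} - w_t)\,\hat{u}' \cdot \frac{\hat{u}''}{\hat{u}'}\Big)
> E_{t+1}\big((\tilde{w}_{t+1} - w_t)\,\hat{u}'\big) \cdot E_{t+1}\Big(\frac{\hat{u}''}{\hat{u}'}\Big) = 0.
\]

The inequality can be established by a non-monotone version of Chebyshev’s Inequality (see footnote 13). First, note that from the DARA assumption $\hat{u}''/\hat{u}'$ is increasing in $\tilde{w}_{t+1}$. Next, verify that the value of $(\tilde{w}_{t+1} - w_t)\,\hat{u}'$ is positive if and only if $\tilde{w}_{t+1} > w_t$; in terms of $\tilde{w}_{t+1}$, the function $(\tilde{w}_{t+1} - w_t)\,\hat{u}'$ thus changes its sign once from negative to positive. Hence, for all $x$ in the support of $\tilde{w}_{t+1}$,

\[
\int_{\underline{w}}^{x} (\tilde{w}_{t+1} - w_t)\,\hat{u}'\,h(\cdot, \tilde{w}_{t+1})\,d\tilde{w}_{t+1}
< E_{t+1}\big((\tilde{w}_{t+1} - w_t)\,\hat{u}'\big) = 0.
\]

The final equality follows from Eq. (8), evaluated at $a_t^o$. Clearly, $\int_{\underline{w}}^{x} h(\cdot, \tilde{w}_{t+1})\,d\tilde{w}_{t+1} > 0$ for all $x$ in the support of $\tilde{w}_{t+1}$. Hence, for all $x$,

\[
\frac{\int_{\underline{w}}^{x} (\tilde{w}_{t+1} - w_t)\,\hat{u}'\,h(\cdot, \tilde{w}_{t+1})\,d\tilde{w}_{t+1}}{\int_{\underline{w}}^{x} h(\cdot, \tilde{w}_{t+1})\,d\tilde{w}_{t+1}}
< 0 = E_{t+1}\big((\tilde{w}_{t+1} - w_t)\,\hat{u}'\big).
\]

Chebyshev’s Inequality as presented in footnote 13 thus applies:

\[
E_{t+1}\big((\tilde{w}_{t+1} - w_t)\,\hat{u}''\big)
> E_{t+1}\big(\hat{u}''/\hat{u}'\big) \cdot E_{t+1}\big((\tilde{w}_{t+1} - w_t)\,\hat{u}'\big) = 0.
\]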
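As a purely numerical illustration of the argument in Appendix A, the following Python sketch checks the inequality on a discretized example. Everything specific in it is an assumption made for illustration and is not part of the model: log utility (one DARA specification), consumption set equal to the wage, a uniform wage distribution on [0.5, 1.5], and the function name `chebyshev_check`. The reference wage w_t is chosen so that the analogue of the first-order condition, E[(w - w_t) u'] = 0, holds exactly on the grid.

```python
def chebyshev_check(n=2001, lo=0.5, hi=1.5):
    """Check E[f*g] >= E[f]*E[g] for an illustrative version of Appendix A.

    Assumptions (hypothetical, for illustration only): u(c) = ln(c), which is
    DARA; consumption equals the wage w; w is uniform on an n-point grid.
    """
    grid = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    prob = 1.0 / n  # equal probability mass on each grid point

    def u1(c):  # first derivative u'(c) of log utility
        return 1.0 / c

    def u2(c):  # second derivative u''(c) of log utility
        return -1.0 / c ** 2

    # Pick w_t such that E[(w - w_t) * u'(w)] = 0 on the grid.
    w_t = 1.0 / sum(prob / w for w in grid)

    f = [u2(w) / u1(w) for w in grid]      # u''/u' = -1/w, increasing in w
    g = [(w - w_t) * u1(w) for w in grid]  # changes sign once, from - to +

    e_f = sum(prob * x for x in f)
    e_g = sum(prob * x for x in g)
    e_fg = sum(prob * x * y for x, y in zip(f, g))  # E[(w - w_t) * u'']

    # Condition (24): the mean of g conditional on w <= x never exceeds E[g].
    running_sum, cond_24 = 0.0, True
    for k, g_k in enumerate(g, start=1):
        running_sum += g_k
        if running_sum / k > e_g + 1e-12:
            cond_24 = False

    return {"w_t": w_t, "E_f": e_f, "E_g": e_g, "E_fg": e_fg, "cond_24": cond_24}


if __name__ == "__main__":
    for name, value in chebyshev_check().items():
        print(name, value)
```

In this example the expectation of g is zero by construction, condition (24) holds at every grid point, and the computed value of E[f g] = E[(w - w_t) u''] is strictly positive, in line with the sign claimed for the second term in Eq. (9).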

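13 Let $f, g: [\underline{w}, \bar{w}] \to \mathbb{R}$ be integrable functions of a random variable $w$ and let $f$ be increasing. Furthermore, let $h: [\underline{w}, \bar{w}] \to \mathbb{R}_+$ be a density for $w$. If, for all $x \in (\underline{w}, \bar{w})$,

\[
\frac{\int_{\underline{w}}^{x} g(w)\,h(w)\,dw}{\int_{\underline{w}}^{x} h(w)\,dw} \le \int_{\underline{w}}^{\bar{w}} g(w)\,h(w)\,dw, \tag{24}
\]

then $E_h(f(w)g(w)) \ge E_h f(w)\,E_h g(w)$. Equality will only hold if Eq. (24) always holds with equality. In our application, put $w \equiv \tilde{w}_{t+1}$, $f(\tilde{w}_{t+1}) = \hat{u}''/\hat{u}'$, $g(\tilde{w}_{t+1}) = (\tilde{w}_{t+1} - w_t)\,\hat{u}'$, and $h(w) \equiv h(\cdot, \tilde{w}_{t+1})$ as the marginal density of $\tilde{w}_{t+1}$. See Mitrinovic et al. (1993, p. 248).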

Appendix B. Further Analysis of Eq. (16)

We want to show that, given DARA and the convexity of the saving function, the second term in Eq. (16) is negative. Since $s_Y$ is not independent of $\tilde{w}_{t+1}$, we cannot directly apply the same argument as in Appendix A. Inspection of that proof shows, however, that the non-monotone version of Chebyshev’s Inequality (cf. footnote 13) remains applicable if $(1 - s_Y)\,u''(c^1_{t+1})/u'(c^1_{t+1})$ is increasing in $\tilde{w}_{t+1}$. Given that $u$ exhibits DARA, a sufficient (yet not necessary) condition for this to hold is that $s_{YY} \ge 0$:

\[
\frac{\partial}{\partial \tilde{w}_{t+1}}\left[(1 - s_Y)\,\frac{u''(c^1_{t+1})}{u'(c^1_{t+1})}\right]
= (1 - a_t b)\left[-s_{YY} \cdot \frac{u''(c^1_{t+1})}{u'(c^1_{t+1})} + (1 - s_Y)^2 \cdot \left(\frac{u''(c^1_{t+1})}{u'(c^1_{t+1})}\right)'\right] > 0.
\]

The positive sign holds as $u''/u'$ is negative and increasing in $c^1_{t+1}$ from DARA.

References

Antichi, M., Pizzuti, F.R., 2000. The public pension system in Italy: observations on the recent reforms, methods of control and their application. In: Reynaud, E. (Ed.), Social Dialogue and Pension Reform. International Labour Office, Geneva, pp. 81–95.
Boadway, R., Wildasin, D.E., 1989. A median voter model of social security. International Economic Review 30, 307–328.
Bohn, H., 1999. Should the social security trust fund hold equities? An intergenerational welfare analysis. Review of Economic Dynamics 2, 666–697.
Bonin, H., 2001. Will it last? An assessment of the 2001 German pension reform. Geneva Papers on Risk and Insurance. Issues and Practice 26, 253–270.
Breyer, F., Craig, B., 1997. Voting on social security: evidence from OECD countries. European Journal of Political Economy 13, 705–724.
Browning, E., 1975. Why the social insurance budget is too large in a democracy. Economic Inquiry 13, 373–388.
Carroll, C.D., Kimball, M.S., 1996. On the concavity of the consumption function. Econometrica 64, 981–992.
Casamatta, G., Cremer, H., Pestieau, P., 2000. The political economy of social security. Scandinavian Journal of Economics 102, 503–522.
Cheng, H., Magill, M.J., 1985. Futures markets, production, and diversification of risk. Journal of Mathematical Analysis and Applications 107, 331–355.
Cheng, H.-C., Magill, M.J., Shafer, W.J., 1987. Some results on comparative statics under uncertainty. International Economic Review 28, 493–507.
Choi, G., Kim, I., Snow, A., 2001. Comparative static predictions for changes in uncertainty in the portfolio and savings problems. Bulletin of Economic Research 53, 61–72.
Demange, G., Laroque, G., 1999. Social security and demographic shocks. Econometrica 67, 527–542.
Demange, G., Laroque, G., 2000. Social security, optimality and equilibria in a stochastic overlapping generations economy. Journal of Public Economic Theory 2, 1–23.
Dutta, J., Kapur, S., Orszag, J.M., 2000. A portfolio approach to the optimal funding of pensions. Economics Letters 69, 201–206.
Enders, W., Lapan, H.E., 1993. A model of first and second-best social security programs. In: Felderer, B. (Ed.), Public Pension Economics. Journal of Economics, Supplementum 7. Springer, Wien, pp. 65–90.
Gale, D., 1991. The efficient design of public debt. In: Dornbusch, R., Draghi, M. (Eds.), Public Debt Management: Theory and History. Cambridge Univ. Press, Cambridge, pp. 14–51.

Hauenschild, N., 1999. Uncertain incomes, pay-as-you-go-systems, and diversification. Zeitschrift für Wirtschafts- und Sozialwissenschaften 119, 491–507.
Hettich, W., Winer, S., 1988. Economic and political foundations of tax structure. American Economic Review 78, 701–712.
Hu, S.C., 1982. Social security, majority-voting equilibrium and dynamic efficiency. International Economic Review 23, 269–287.
Lehmann, E., 1966. Some concepts of dependence. Annals of Mathematical Statistics 37, 1137–1153.
Lindbeck, A., 2002. Pensions and contemporary socioeconomic change. In: Feldstein, M., Siebert, H. (Eds.), Social Security Pension Reform in Europe. University of Chicago Press, Chicago, pp. 19–44.
Magill, M., Nermuth, M., 1986. On the qualitative properties of futures market equilibrium. Journal of Economics 46, 233–252.
Menezes, C.F., Hanson, D.L., 1970. On the theory of risk aversion. International Economic Review 11, 481–487.
Merton, R.C., 1983. On the role of social security as a means for efficient risk sharing in an economy where human capital is not tradable. In: Bodie, Z., Shoven, J.B. (Eds.), Financial Aspects of the United States Pension System. The University of Chicago Press, Chicago, pp. 325–358.
Mitrinovic, D.S., Pecaric, J., Fink, A., 1993. Classical and New Inequalities in Analysis. Kluwer Academic Publishing, Dordrecht.
Peled, D., 1982. Informational diversity over time and the optimality of monetary equilibria. Journal of Economic Theory 28, 255–274.
Rangel, A., Zeckhauser, R., 2001. Can market and voting institutions generate optimal intergenerational risk sharing? In: Campbell, J.Y., Feldstein, M. (Eds.), Risk Aspects of Investment Based Social Security Reform. Chicago University Press, Chicago, pp. 113–141.
Richter, W.F., 1993. Intergenerational risk sharing and social security in an economy with land. In: Felderer, B. (Ed.), Public Pension Economics. Journal of Economics, Supplementum 7. Springer, Wien, pp. 91–103.
Sjoblom, K., 1985. Voting for social security. Public Choice 45, 225–240.
Thøgersen, Ø., 1998. A note on intergenerational risk sharing and the design of pay-as-you-go pension programs. Journal of Population Economics 11, 373–378.
VDR, 2002. Rentenversicherung in Zeitreihen. DRV-Schriften, vol. 22. Verband Deutscher Rentenversicherungsträger, Frankfurt/Main. URL: http://www.vdr.de; as of November 2003.
Veall, M., 1986. Public pensions as optimal social contracts. Journal of Public Economics 31, 237–251.
Verbon, H.A.A., 1986. Altruism, political power, and public pensions. Kyklos 39, 343–358.
Verbon, H.A.A., 1993. Public pensions. The role of public choice and expectations. Journal of Population Economics 6, 123–135.
Verbon, H.A.A., Verhoeven, M.J., 1992. Decision making on pension schemes under rational expectations. Journal of Economics 56, 71–97.
Wadensjö, E., 2000. Sweden: Reform of the public pension system. In: Reynaud, E. (Ed.), Social Dialogue and Pension Reform. International Labour Office, Geneva, pp. 67–80.
Wagener, A., 2002. Pensions as a portfolio problem: fixed contribution rates vs. fixed replacement rates reconsidered. Journal of Population Economics 16, 111–134.
Werding, M., 1998. Zur Rekonstruktion des Generationenvertrages. Mohr Siebeck, Tübingen.