Term structures of asset prices and returns

Accepted Manuscript Term structures of asset prices and returns David Backus, Nina Boyarchenko, Mikhail Chernov PII: DOI: Reference: S0304-405X(18)3...

Download PDF

3MB Sizes 0 Downloads 98 Views

Report

Full Text

Accepted Manuscript

Term structures of asset prices and returns David Backus, Nina Boyarchenko, Mikhail Chernov PII: DOI: Reference:

S0304-405X(18)30099-0 10.1016/j.jfineco.2018.04.005 FINEC 2885

To appear in:

Journal of Financial Economics

Received date: Revised date: Accepted date:

31 January 2017 6 July 2017 10 August 2017

Please cite this article as: David Backus, Nina Boyarchenko, Mikhail Chernov, Term structures of asset prices and returns, Journal of Financial Economics (2018), doi: 10.1016/j.jfineco.2018.04.005

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

ACCEPTED MANUSCRIPT

AN US

David Backus,† Nina Boyarchenko,‡ and Mikhail Chernov§

CR IP T

Term structures of asset prices and returns∗

Abstract

M

We explore the term structures of claims to a variety of cash flows, namely, US government bonds (claims to dollars), foreign government bonds (claims to foreign currency), inflationadjusted bonds (claims to the price index), and equity (claims to future equity indexes or dividends). The average term structures reflect the dynamics of the dollar pricing kernel, cash flow growth, and the interaction between the two. We use an affine model to illustrate how these two components can deliver term structures with a wide range of levels and shapes. Finally, we calibrate a representative agent economy to show that the evidence is consistent with the equilibrium models.

ED

JEL Classification Codes: G12, G13.

PT

Keywords: entropy; coentropy; term structure; yields; excess returns; affine models; recursive preferences; disasters

∗

AC

CE

We are grateful to the anonymous referee and the editor, Ron Kaniel, for their thoughtful comments, which have helped us improve the manuscript. We also thank Jarda Borovicka, Lars Hansen, Christian Heyerdahl-Larsen, Mahyar Kargar, Lars Lochstoer, Bryan Routledge, Andres Schneider, Raghu Sundaram, Fabio Trojani, Bruce Tuckman, Stijn Van Nieuwerburgh, Jonathan Wright, Liuren Wu, and Irina Zviadadze for their comments on earlier drafts and the participants of the seminars at the 2015 BI-SHoF conference in Oslo, the 2014 Brazilian Finance Meeting in Recife, Carnegie Mellon University, City University of Hong Kong, the Board of Governors of the Federal Reserve System, Goethe University, ITAM, the 6th MacroFinance Workshop, McGill University, the 2015 NBER meeting at Stanford, New York University, the 2014 SoFiE conference in Toronto, the Swedish House of Finance, UCLA, and VGSF. Disclaimer: the views expressed here are the authors’ and are not representative of the views of the Federal Reserve Bank of New York or the Federal Reserve System. † Deceased. ‡ Federal Reserve Bank of New York; 33 Liberty Street New York, NY 10045, USA; [email protected]. § Corresponding author. Anderson School of Management, UCLA, NBER, and CEPR; 101 Westwood Plaza, Los Angeles, CA, 90095, USA; [email protected].

ACCEPTED MANUSCRIPT

1. Introduction

CR IP T

Perhaps the most striking recent challenge to the representative agent models comes from evidence about the term structure of risk premiums. Several papers have argued that the patterns computed for zero-coupon assets across different investment horizons cannot be replicated using workhorse models, such as long-run risk, habits, or disasters (Binsbergen and Koijen, 2017, provide a comprehensive review). In an endowment economy, representative agent models have two basic components: an equilibrium-based pricing kernel that prices all of the assets in the economy and an exogenously specified cash flow process for a given asset. The evidence implicitly suggests which features these two components must possess. In this paper, we develop a methodology that allows researchers to establish these required features.

M

AN US

We start by introducing complementary evidence on the average buy-and-hold log excess returns across different horizons of a diverse set of assets, namely, foreign-currency bonds, inflation-protected bonds, and the dividend yields associated with equity dividend strips. We find log excess returns attractive because, as we show, the change in their averages with the horizon tracks the difference between the term spreads of two yield curves: one associated with US dollar (USD) bonds and the other corresponding to the other individual assets. Because the foreign-currency, inflation-protected, and equity dividend assets are claims to different cash flows, the recovered term structures of their average log excess returns have disparate levels and shapes.

CE

PT

ED

What determines the level and shape of the term structure for a given asset? We follow the approach of Alvarez and Jermann (2005), Backus, Chernov, and Zin (2014), and Bansal and Lehmann (1997) in articulating the key modeling points. These authors connect the average level of excess returns to the entropy of the pricing kernel or, equivalently, to the largest risk premium in a given economy. The entropy of the USD pricing kernel has to be sufficiently large to be consistent with the magnitudes of the observed returns. Next, the changes in the entropy of the pricing kernel associated with changes in an investment horizon, known as horizon dependence, should be moving one for one with how the yield term spreads change with the horizon. These changes are small, and this, in general, imposes a limit on how large the entropy can be.

AC

We can apply the same logic to the average log excess returns by suitably redefining the pricing kernel. We define the transformed pricing kernel as the product of the USD pricing kernel and the growth rate of the cash flows of a given asset. This type of pricing kernel is often referred to as the foreign pricing kernel in the context of currencies and the real pricing kernel in the context of inflation, although it does not have any special moniker in the context of equity dividends. Given the empirical slopes of the term structures, each of these transformed pricing kernels should have a large entropy and small horizon dependency. As with the pricing kernel itself, it is convenient to separate the consideration of the shape of the term structure of the log excess returns from the consideration of the level of the term structure. Because this term structure is driven by the differences between the US curves

ACCEPTED MANUSCRIPT

and asset-specific yields, we can infer an empirically plausible specification of the cash flow dynamics for each asset by comparing the transformed pricing kernel to the nominal one.

AN US

CR IP T

As it turns out, it is difficult to match the cash flow dynamics capable of replicating the term structure pattern in the excess returns with the level of one-period excess returns. We characterize this tension between shapes and levels quantitatively using an affine term structure model. The US nominal term structure allows us to fix an empirically plausible model of the nominal pricing kernel. We follow the logic of term structure models in identifying the transformed pricing kernels that correspond to the yield curves for each of the other assets. We establish that the cross-sectional differences in the shapes of the yield curves for these assets are driven by the cross-sectional differences between the levels of persistence of the expected cash flows and by the difference between their persistence and the persistence of the US nominal pricing kernel. In practice, this means that the expected cash flow growth should be affected by at least two state variables. One is common across all assets, including the US nominal term structure, while the other is asset-specific and has a different level of persistence.

M

Turning next to the levels of the term structures, we find that the observed one-period excess returns are too high; that is, the entropy is too low in the calibrated model. We argue that the affine term structure models that are used to describe the shape of the yield curve have to be augmented with nonnormal innovations, which are frequently modeled via jumps. For nonnormal innovations to the cash flows to affect the level of excess returns, the pricing kernel and the cash flow growth process should have coincident jumps.

PT

ED

The shape of the yield curve imposes an important constraint on the dynamics of this additional shock. To maintain the empirically plausible horizon dependence of log excess returns, this joint jump must be independently identically distributed (iid), that is, neither the probability of a jump occurring nor the conditional distribution of the jump sizes can have persistent components. This separation between horizon dependence and the level of the yield curve allows us to calibrate the jump distribution separately by matching the average and variance of the one-period risk premiums of the corresponding assets.

AC

CE

In lognormal environments, log risk premiums are captured by the covariance of the log pricing kernel and log cash flows. When both the pricing kernel and the cash flow process have nonnormal innovations, covariance is no longer a sufficient statistic for the comovement between the two. Building on the entropy research, we introduce the concept of coentropy as a measure of dependence that directly generalizes the computation of log risk premiums to nonnormal environments. Coentropy is equal to an infinite sum of the joint cumulants of the log pricing kernel and log cash flows, with the first joint cumulant being the covariance. This concept is useful for computing risk premiums in our models. Furthermore, the interpretation of coentropy as an infinite sum of joint cumulants allows us to highlight the role of nonnormal innovations in generating realistic risk premiums.

2

ACCEPTED MANUSCRIPT

Indeed, we show that our proposed extension to the affine term structure model successfully matches the one-period risk premiums. Quantitatively, a modest nonnormality, i.e., small cumulants of shocks to the cash flow growth process, translates into large risk premiums. This happens because the cash flow cumulants interact with large cumulants of the nonnormal innovations to the pricing kernel.

CR IP T

We introduce a model of an endowment economy featuring a representative agent with recursive preferences to illustrate how the modeling insights of the affine model manifest themselves in an equilibrium setting. The model shows which features need to be incorporated in the equilibrium models to satisfy the empirical targets presented by the term structure evidence.

M

AN US

We highlight three important modeling components. First, unlike the literature that models the variance of consumption growth as either an AR(1) or ARG(1) (square-root) process, we assume that the volatility of the consumption growth is an AR(1) process. This feature allows the generation of upward sloping nominal and real yield curves. Second, the consumption growth features an iid jump similar to the one in Barro (2006). This is important for resolving the highlighted tension between matching the shapes of the yield curves and levels of the risk premiums. Third, the expected cash flow growth depends on two state variables. One of the state variables also affects the expected consumption growth, which is the traditional, albeit less persistent, “long-run risk” component of consumption growth. The other state variable is asset specific as in our affine model. 1.1. Related literature

AC

CE

PT

ED

Our work is primarily motivated by two strands of recent literature. First, there is growing evidence, both nonparametric and model-based, on the risk premium patterns of zero-coupon securities across different horizons. A partial list of the research in this area includes Belo, Collin-Dufresne, and Goldstein (2015), Binsbergen, Brandt, and Koijen (2012), Binsbergen et al. (2012), Dahlquist and Hasseltoft (2013, 2016), Dew-Becker et al. (2017), Giglio, Maggiori, and Stroebel (2015), Hansen, Heaton and Li (2008), Hasler and Marfe (2016), Lustig, Stathopolous, and Verdelhan (2018), and Zviadadze (2017). We complement this body of research by offering evidence on the log excess returns, which are cousins of risk premiums. This switch allows us to connect evidence across the different horizons in a more transparent way. We also differ from the literature in that we use the evidence to establish features that a successful asset-pricing model should possess instead of estimating and testing specific models. Second, an important stream of theoretical literature, exemplified by Alvarez and Jermann (2005), Hansen (2012), Hansen, Heaton, and Li (2008), and Hansen and Scheinkman (2009), analyzes the interaction of cash flows and the pricing kernel at the infinite horizon. Our approach has a deep connection to these papers, which we highlight in the main text. The main difference is that we rely on the existing evidence at intermediate horizons to characterize the transition in the risk premiums across these horizons. 3

ACCEPTED MANUSCRIPT

Our paper is also connected to earlier research seeking to understand the value premium in the cross-section of equities, such as Hansen, Heaton, and Li (2008), Lettau and Wachter (2007), and Santos and Veronesi (2010). The evidence on zero-coupon assets was not available at that time, so these papers confront a different set of facts that do not have an explicit horizon dependence, which forces them to make different modeling choices.

AN US

CR IP T

The work of Lettau and Wachter (2011), who reach across different types of assets by modeling the aggregate stock market, cross-section of equities, and the yield curve via an affine pricing kernel, is particularly close to our paper. Because of limited data availability, they focus on different moments than we do. Moreover, cross-sectional differences in cashflows arise from the additivity constraint on individual firms, whereas we do not explore this mechanism at all in this paper. Because we do not study individual firms, our model of cross-sectional differences in cash flows arises from the different exposures to the common component and from the asset-specific cash flow components. Finally, we emphasize the tension in the term structure of zero-coupon asset returns and their one-period counterparts. We argue that the only way to resolve this tension is to allow for a nonnormal shock to the pricing kernel and cash flows.

PT

2. Evidence

ED

M

In this last respect, our paper is related to the literature on the impact of jumps on asset prices dating back to Merton (1976). More recent work, such as Backus, Chernov, and Martin (2011), Barro (2006), Longstaff and Piazzesi (2004), Rietz (1988), and Wachter (2013), has focused on the ability of jumps to explain the asset risk premiums. Our approach differs from this body of literature because it emphasizes that the jump component is not only helpful but must also be present in our asset-pricing models. Moreover, this component must be iid to resolve the tension between the relatively flat term structures of returns and relatively high one-period risk premiums. Finally, we introduce the concept of coentropy, which is helpful in characterizing the log risk premiums and expected log excess returns in the presence of jumps.

AC

CE

We focus on the properties of the observed term structures of prices and returns, so it is helpful to begin with the data. Consider a cash flow process dt with growth rate gt,t+n = dt+n /dt over n periods. We are interested in the zero-coupon claims to gt,t+n with a price denoted by pbnt . In the special case of a claim to the cash flow of one US dollar, the price is denoted by pnt . We define a yield on such an asset as ybtn = −n−1 log pbnt . Examples include nominal risk-free bonds with gt,t+n = 1 (we reserve the special notation n ytn ≡ n−1 log rt,t+n for a yield or, equivalently, an n−period holding period return on a US nominal bond); foreign bonds if dt is an exchange rate; inflation-linked bonds if dt is the price level; and equities if dt is a dividend. Returns are connected to yields. Consider the hold-to-maturity n−period log return log rt,t+n = log(gt,t+n /b pnt ) = log gt,t+n + nb ytn . 4

ACCEPTED MANUSCRIPT

We can thus express the term spread between the average per-period returns as n−1 E log rt,t+n − E log rt,t+1 = E(b ytn − ybt1 ).

Define the per-period excess holding return as

n log rxt,t+n = n−1 (log rt,t+n − log rt,t+n ).

(1)

CR IP T

Therefore, the average difference between the one- and n-period excess returns is equal to the difference between the average term spreads: E(log rxt,t+n − log rxt,t+1 ) = E(b ytn − ybt1 ) − E(ytn − yt1 ).

(2)

AN US

This connection between yields and excess returns simplifies the ordinarily difficult task of reliably computing the holding period returns over long horizons. The number of nonoverlapping data points available decreases when computing the historical average of realized returns. In contrast, yields are available in every period, so the number of available data points does not change with the horizon n and does not require observations of the cash flows. We only need to compute the average excess return for n = 1 and then propagate it across horizons using the yields.

PT

ED

M

We report the summary statistics for the one-period excess returns for a number of examples in Table 1. We choose assets for which zero-coupon approximations exist, namely the various bonds and dividend strips. This exercise is meant to be illustrative, so we do not exhaustively analyze all possible assets (see Binsbergen and Koijen, 2017; Giglio and Kelly, 2018 for a more comprehensive list). Based on data availability, we select one quarter as one period. We observe a large cross-sectional dispersion in the returns of around 1.36% per quarter or about 5.5% per year. Departures of excess returns from normality are evident despite the relatively low frequency, with the skewness of the returns ranging from −0.5 (Australian dollars) to 0.75 (S&P dividends).

CE

Table 2 reports the yield curves and the departures of the term spreads from those of the US term structure. The US dollar term structure starts low, on average, reflecting the low average returns on short-term default-free dollar bonds. The mean yields increase with maturity, and the mean spread between 1-quarter and 40-quarter yields is about 2% annually.

AC

Assets with cash flows also have term structures, although they typically have less market depth at long maturities than bonds. In general, they differ in both the starting point (the one-period return on a spot contract) and in how they vary with maturity. Some assets have steeper yield curves, some are flatter, and some have different shapes. In Fig. 1, we plot the term spreads of the US Treasury yield, ytn − yt1 , and the differences between the mean term spreads on a number of other assets and US Treasury yields, E(b ytn − ybt1 ) − E(ytn − yt1 ). Because the latter is equal to the average difference between one- and 5

ACCEPTED MANUSCRIPT

n-period excess returns, excess returns decline with the horizon in all of the examples with the exception of the dividend strips. Moreover, the cross-sectional spread in the excess returns widens as the horizon increases. The additional spread is about 1 percent higher annually than the one-quarter excess returns.

3. Term structures of prices and returns

CR IP T

In summary, the evidence points to large cross-sectional differences in excess returns. Because short-term excess returns are nonnormal, part of the returns can come from the compensation for the tail risk. The differences in returns increase with the horizon, suggesting that the persistence of asset yields is different from the persistence of interest rates.

AN US

We now model the term structures of asset prices and returns. We do this by showing how the concepts of entropy and horizon dependence can help translate the evidence on excess returns into the language of term structure modeling. In particular, we highlight the tension between a model’s ability to fit how risk premiums change with the horizon versus how large the risk premiums are over one period. 3.1. Term structure of zero-coupon bonds

ED

M

Returns and risk premiums follow from the no-arbitrage theorem. There exists a positive pricing kernel m that satisfies Et mt,t+1 rt,t+1 = 1 (3)

PT

for all returns r. An asset pricing model is then a stochastic process for m and r. We want to characterize what the asset prices tell us about these stochastic processes. In this section, we start with a process for m. The equality (3) implies a bound on the expected log excess returns: (4)

CE

1 E(log rt,t+1 − log rt,t+1 ) ≤ E[log Et mt,t+1 − Et log mt,t+1 ] ≡ E[Lt (mt,t+1 )].

AC

We refer to the inequality as the entropy bound, to Lt as conditional entropy, and to its unconditional expectation as entropy (see Alvarez and Jermann, 2005, proof of Proposition 2; Backus, Chernov, and Martin, 2011, Section I.C; Backus, Chernov, and Zin, 2014, Sections I.C and I.D; Bansal and Lehmann, 1997, Section 2.3). Thus, entropy is the highest possible expected excess return that an asset can generate in an economy that features the pricing kernel mt,t+1 . To facilitate computation, we express entropy in terms of the cumulant generating function (cgf) of log x. The cgf of log x, if it exists, is the log of its moment-generating function, kt (s; log xt+1 ) = log Et es log xt+1 . (5) 6

ACCEPTED MANUSCRIPT

Conditional entropy is therefore Lt (xt+1 ) ≡ log Et xt+1 − Et log xt+1 = kt (1; log xt+1 ) − Et log xt+1 . Example 1. Two-horizon price of risk. Consider the following model of the pricing kernel, (6)

CR IP T

log mt,t+1 = log β + a0 wt+1 + a1 wt ,

with {wt } iid standard normal. Although this model has valuation implications for horizons beyond two, the information about one- and two-horizon assets is sufficient to identify its properties. The price of risk a0 is constant in this model. Conditional entropy,

AN US

Lt (mt,t+1 ) = a20 /2,

is the maximum risk premium regardless of the state. So, the entropy, E[Lt (mt,t+1 )], is the same. The model can generate high risk premiums via large values of a0 . Although one could be tempted to choose a high a value as desired, the issue is whether discipline can be imposed on the choice of this value.

M

One source of such discipline is the yield curve. In an arbitrage-free setting, the bond prices inherit their properties from the pricing kernel. Pricing has a simple recursive structure. Applying the pricing relation (3) to the bond prices gives us n−1 pnt = Et mt,t+1 pt+1 = Et mt,t+n , (7)

ED

where mt,t+n = mt,t+1 mt+1,t+2 · · · mt+n−1,t+n .

PT

The dynamics of the pricing kernel are reflected in what Backus, Chernov, and Zin (2014) call horizon dependence, which is the relation between the n−period entropy Lm (n), Lm (n) ≡ n−1 E[Lt (mt,t+n )] = n−1 E[log Et mt,t+n ] − E log mt,t+1 ,

(8)

CE

and the time horizon represented by the function Hm (n) = Lm (n) − Lm (1).

AC

Backus, Chernov, and Zin (2014) show that horizon dependence is connected to bond yields via Hm (n) = −E(ytn − yt1 ).

(9)

In the iid case, Hm (n) = 0, and the yield curve is flat or, equivalently, the entropy does not change with n. Bond yields are then the same at all maturities and are constant over time. If the mean yield curve slopes upwards, then Hm (n) is negative and slopes downward, 7

ACCEPTED MANUSCRIPT

reflecting the dynamics in the pricing kernel. One important implication of this result is that the iid components of m will affect the level of the yield curve but not its shape.

CR IP T

These concepts relate to the work of Alvarez and Jermann (2005), Hansen (2012), Hansen, Heaton, and Li (2008), and Hansen and Scheinkman (2009) when the horizon n is pushed to infinity. The connection is detailed in Appendix A. The major difference is that by considering the intermediate n, we can connect the theory to data. Example 1. Two-horizon price of risk, continued. Using the cgf definition (5), the guess-and-verify approach, and the law of iterated expectations, we obtain the cgf of the n−period pricing kernel: kt (s; log mt,t+n ) = Cn (s) + sa1 wt ,

AN US

where the constant is Cn (s) = ns log β + (n − 1)s2 (a0 + a1 )2 /2 + s2 a20 /2. Thus, the (log) bond prices are log pnt = kt (1; log mt,t+n ) = n log β + (n − 1)(a0 + a1 )2 /2 + a20 /2 + a1 wt . The one-period yield is

yt1 = − log p1t = − log β − a20 /2 − a1 wt . The horizon dependence is

M

Hm (n) = (1 − 1/n)[(a0 + a1 )2 − a20 ]/2.

(10)

(11)

(12)

PT

ED

The pricing kernel becomes iid when a1 = 0. Consistent with the earlier observation, the horizon dependence is constant across horizons in this case. That is, a1 affects the slope of the yield curve in this model. Also, a1 is pinned down by the volatility of the one-period interest rate (11). Thus, the quantity that is helpful in generating the large one-period risk premium, a0 , is constrained by the value of a1 and the need to fit the term spreads, −Hm (n), or, equivalently, by how the largest n-period risk premium differs from the one-period premium.

AC

CE

To demonstrate this, we put some numbers on the parameters. We use the properties of the US nominal Treasury data described in Tables 1 and 2 for calibration. We focus on one- and two-period bonds only because of the two-horizon structure of the pricing kernel. At a quarterly frequency, the short rate yt1 in Eq. (11) has a standard deviation of 0.0084. Thus, we set the absolute value of a1 to this value. The mean of the two-quarter yield spread y 2 − y 1 is 0.0004; equivalently, the two-period horizon dependence in Eq. (12) is −0.0004. We reproduce this value by setting a0 = −0.0994. This value of a0 corresponds to the maximum risk premium of 2% per year (a20 /2 · 400). This low magnitude of the maximum risk premium reflects the tension between fitting the yield curve and generating large one-period risk premiums within the same pricing kernel. We tackle the question of how to resolve this tension in Section 5. 8

ACCEPTED MANUSCRIPT

To conclude this section, we contrast the result (9) with Hansen and Jagannathan’s (1991) characterization of the pricing kernel via the bound 1 1 Et (rt,t+1 − rt,t+1 )/Vart (rt,t+1 − rt,t+1 )1/2 ≤ Vart (mt,t+1 )1/2 /Et (mt,t+1 ).

(13)

We extend this to an n−period case by characterizing the mean and variance of the pricing kernel via the cgf:

CR IP T

Et mt,t+n = ekt (1;log mt,t+n ) ,

Vart mt,t+n = Et m2t,t+n − (Et mt,t+n )2 = ekt (2;log mt,t+n ) − e2kt (1;log mt,t+n ) . The n-period bound is then

ekt (2;log mt,t+n )−2kt (1;log mt,t+n ) − 1 1/2 2 2 = e(n−1)(a0 +a1 ) +a0 − 1 ,

1/2

AN US

Vart (mt,t+n )1/2 /Et (mt,t+n ) =

where the last line corresponds to the simple model from our example. The term in the exponent is a positive constant that gives us a nonlinear relation between the maximum Sharpe ratio and maturity n, even in the iid case.

M

Thus, the entropy conveys the term structure effects in a more intuitive fashion. Fig. 2 compares the Sharpe ratios with the entropies for the iid and non-iid cases at different horizons. The dashed lines show the departures from iid for the two-horizon model, which are evident in the case of entropy. This is why the evidence in Section 2 is presented in terms of log excess returns rather than Sharpe ratios.

ED

3.2. Term structures of other assets

CE

PT

Bonds are simple assets in the sense that their cash flows are known. All of the action in valuation comes from the pricing kernel. When we introduce uncertain cash flows, the pricing reflects the interaction between the pricing kernel and the cash flows. Nevertheless, we can think about the term structures of these other assets in a similar way. Our approach mirrors that of Hansen and Scheinkman (2009, Sections 3.5 and 4.4).

AC

The pricing relation (3) gives us n−1 pbnt = Et mt,t+1 gt,t+1 pbt+1

= Et m b t,t+1 pbn−1 t+1

= Et m b t,t+n ,

(14)

where m b t,t+1 = mt,t+1 gt,t+1 is the transformed pricing kernel, m b t,t+n = 0 m b t,t+1 m b t+1,t+2 · · · m b t+n−1,t+n , and pbt = 1. This has the same form as the bond pricing Eq. (7), with m b replacing m.

Example 2. Cash flows. We complement the pricing kernel of Example 1 by adding a process for cash flow growth, log gt,t+1 = log γ + b0 wt+1 + b1 wt . 9

(15)

ACCEPTED MANUSCRIPT

The transformed pricing kernel is then log m b t,t+1 = log β + log γ + (a0 + b0 )wt+1 + (a1 + b1 )wt ≡ log βb + b a0 wt+1 + b a1 wt .

CR IP T

Our focus is on the differences between the two term structures, specifically those shown in Section 2 in the mean excess returns and in the slopes and shapes of the mean yield curves. Combining Eq. (2) with the definition of horizon dependence, we see that the term difference in the log excess return on an asset is equal to E(log rxt,t+n − log rxt,t+1 ) = Hm (n) − Hm b (n).

AN US

The term spread of the US nominal yield curve tells us about the properties of the nominal pricing kernel m via its horizon dependence Hm (n), while the term difference in the log excess returns tells us about the properties of the cash flow process g because the differences in g are the only source of the differences in Hm (n) − Hm b (n). Example 2. Cash flows, continued. We need bond prices to compute horizon dependence. When the pricing kernel is m, b the expression is the same as Eq. (10), but with “hats” over the appropriate parameters. In particular, the one-period yield is ybt1 = − log βb − b a20 /2 − b a1 wt .

M

Therefore, the horizon dependence is

Hm a0 + b a1 )2 − b a20 ]/2. b (n) = (1 − 1/n)[(b

(16)

PT

ED

Thus, the average term spreads in the log excess returns are determined by the properties of the cash flow growth process. Cross-sectional differences in the term spreads are driven by cross-sectional differences in the cash flows. In our simple model, these are manifested by b0 and b1 .

AC

CE

As an illustration, we use the evidence on S&P 500 dividend futures (S&P) and the British pound (GBP) described in Tables 1 and 2. In the former case, the cash flow growth process (15) represents the equity index dividend growth, and in the latter case, it is the currency depreciation rate. The calibration proceeds similarly to the nominal US yield curve. The short interest rate ybt1 corresponds to the dividend yield on a one-period S&P strip and to the yield on a one-period UK bond. Their respective volatilities are 0.0402 and 0.0105. These values determine b a1 for each of the assets. The horizon dependence Hm b (2) is −0.0016 and −0.0005, implying values of −0.0997 and −0.1005 for b a0 , respectively. These coefficients imply b0 = −0.0003 and b1 = 0.0318 for S&P, and b0 = −0.0011 and b1 = 0.0021 for GBP. This simple example indicates the potential cross-sectional differences between the cash flows that are implied by the respective term structures of their respective asset prices.

10

ACCEPTED MANUSCRIPT

3.3. One-period returns We have already shown that there is tension between the fit of the model to the term structure of nominal yields and the implied largest risk premium. We can highlight this tension further by computing the one-period expected log excess returns for different assets.

log rxt,t+1 = log gt,t+1 + ybt1 − yt1 .

In our example, its expectation is

CR IP T

From Eq. (1), the one-period log excess return is (17)

E log rxt,t+1 = log γ − log βb − b a20 /2 + log β + a20 /2 = (a20 − b a20 )/2 = −b20 /2 − a0 b0 .

AN US

In the case of our two asset classes, S&P and GBP, this expression implies −0.29 · 10−4 and −1.09 · 10−4 , respectively. Consistent with our observation about the relatively small maximum risk premium in this model, the absolute values of these quantities are much smaller than those of their sample counterparts reported in Table 1.

ED

M

It could be argued that the observed tension between the term spreads and one-period premiums stems from an overly simple model of the pricing kernel and cash flows. It can very well be that specific values are due to model misspecification. However, consider the general implications. The loading on the shock wt+1 in the pricing kernel, a0 in this model, controls the one-period risk premium in any model. The loading has to be large because some of the one-period risk premiums are large. In contrast, the term spreads, which are controlled by the same loading on wt+1 , are an order of magnitude smaller than these one-period risk premiums. We confirm these observations and explore a resolution of this tension in a more realistic model in the next section.

PT

4. Interpreting term structure evidence using an affine model

AC

CE

One of the implications of our discussion is that the term spreads in the log excess returns can be viewed as the term spreads in yields of bonds corresponding to the suitably transformed pricing kernel, m. b Thus, the tools developed in the term structure literature can be used to model the behavior of the risk premiums on a cross-section of assets. As is the case with the US nominal pricing kernel m, we want our models to deliver a large entropy of m b and changes in entropy that are consistent with the yield curve corresponding to m. b Because cross-sectional differences in m b are solely due to cross-sectional differences in g, considering different transformed pricing kernels together with the US nominal pricing kernel will help us to identify the properties of the cash flows. We present an affine term structure model that captures these features.

11

ACCEPTED MANUSCRIPT

4.1. The model Consider a simplified version of the model in Koijen, Lustig, and Van Nieuwerburgh (2017), which we refer to as the KLV model. Specifically, we complement the (essentially) affine model of the pricing kernel in Duffee (2002) by adding a process for cash flow growth: log gt,t+1 = log γ + θg> xt + η0 wt+1 , xt+1 = Φxt + Iwt+1 ,

CR IP T

> log mt,t+1 = log β + θm xt − λ2t /2 + λt wt+1 ,

AN US

where xt = (x1t , x2t )> , λt = λ0 + λ1 x1t , θm = (θm1 , 0)> , Φ = diag(ϕ1 , ϕ2 ), I is the identity matrix, and {wt } is iid standard normal. This model is intentionally restricted compared to the most general identifiable two-factor affine model (e.g., the risk premium depends on one state only, and only one disturbance drives the dynamics of the state). Our goal is to present the simplest model that highlights the features necessary to capture the evidence. Note that if λ1 = 0, the pricing kernel can be re-written as

log mt,t+1 = log β − λ20 /2 + a0 wt+1 + a1 wt + a2 wt−1 + . . . ,

M

where a0 = λ0 , a1 = θm1 , aj = aj−1 ϕ1 , and j ≥ 2. We recover the model of Example 1 with ϕ1 = 0. Thus, our simple model misses two important features: the time variation in risk premiums and persistent state variables. The conditional entropy of the pricing kernel,

ED

Lt (mt,t+1 ) = (λ0 + λ1 x1t )2 /2, is the maximum risk premium in state x1t . The entropy is its mean:

PT

Lm (1) = [λ20 + λ21 /(1 − ϕ1 )2 ]/2.

CE

Thus, the model has two avenues for generating high risk premiums. The first is the large values of the coefficients λ0 and λ1 , which control the exposure to the state affecting the volatility of m. The second is the high persistence ϕ1 of the state. Horizon dependence imposes discipline on the choice of their values, as we demonstrate below.

AC

The bond prices satisfy log pnt = An + Bn xt with > > B n = θm (I + Φ∗ + Φ∗2 + · · · + Φ∗n−1 ) = θm (I − Φ∗ )−1 (I − Φ∗n ), n−1 n−1 X X An = n log β + λ0 Bj> e + 1/2 (Bj> e)2 , e> = (1, 1), j=0

j=0

where ∗

Φ =

ϕ1 + λ1 0 λ1 ϕ2 12

ACCEPTED MANUSCRIPT

is the matrix of persistence coefficients under the risk-neutral measure. These expressions are obtained by using the guess for the log bond price and applying the law of iterated expectations to Eq. (7). The result is standard in the affine term structure literature. In particular, the one-period yield is yt1 = − log p1t = − log β − θm1 x1t . 

Hm (n) = n−1 An − A1 = n−1 λ0

n−1 X

Bj> e + 1/2

j=0

CR IP T

The horizon dependence is

(18)

n−1 X j=0



(Bj> e)2  .

(19)

AN US

The quantities that are helpful in generating the large one-period risk premiums (λ0 , λ1 ) and ϕ1 are constrained by the need to fit the n-period term spreads or, equivalently, by how the largest n-period risk premium differs from the one-period premium (recall that Bn depends on λ1 ). The transformed pricing kernel has the same form

> b2 /2 + λ bt wt+1 , log m b t,t+1 = log mt,t+1 + log gt,t+1 ≡ log βb + θbm xt − λ t

(20)

ED

M

> = (θ with suitably redefined parameters: log βb = log β + log γ + λ0 η0 + η02 /2, θbm m1 + θg1 + bt = λ b0 + λ1 x1t , and λ b0 = λ0 + η0 . The expression for horizon dependence when η0 λ1 , θg2 ), λ the pricing kernel is m b is the same but with parameters with a hat:   n−1 n−1 X X −1 b bj> e + 1/2 bj> e)2  , Hm λ0 B (B (21) b (n) = n j=0

j=0

PT

> bn = θbm B (I − Φ∗ )−1 (I − Φ∗n ).

AC

CE

Thus, the horizon dependence of the log excess returns is determined by the differences between the impacts of the state variables xt on the nominal pricing kernel and the transformed pricing kernel (θm and θbm , respectively) and by the differences between the loadings b0 ). These on the shocks common to the nominal and transformed pricing kernels (λ0 and λ > differences, −(θg1 + η0 λ1 , θg2 ) , and η0 , respectively, are determined by the properties of the cash flow growth process, that is, by the exposure of its conditional mean to the state variables and by its exposure to the shock. As a next step, we relate this model to data.

13

ACCEPTED MANUSCRIPT

4.2. US dollar bonds

CR IP T

We use the properties of the US nominal Treasury data described in Tables 1 and 2 to calibrate the pricing kernel m. At a quarterly frequency, the short rate yt1 in equation (18) has a standard deviation of 0.0084 and an autocorrelation of 0.9487. The mean of the 40-quarter (10-year) yield spread y 40 − y 1 is 0.0045; equivalently, the horizon dependence in Eq. (19) is −0.0045. We reproduce each of these features by choosing the parameter values θm1 = 0.0026, ϕ1 = 0.9487, and λ0 = −0.1225. The parameter controlling the time variation in the risk premium is set to match the curvature of the yield curve. Typically, this results in Φ∗11 ≡ ϕ∗1 being very close to 1. We set it to 0.9999, implying that λ1 = 0.0512. All of these values are summarized in Panel A of Table 3. The level of the term structure can then be set by adjusting log β.

AN US

It is important to clarify the roles of the various parameters. Here, θm1 and ϕ1 control the variance and autocorrelation of the short rate and λ0 controls the slope of the mean yield curve. The different signs of θm1 and λ0 produce the upward slope in the mean yield curve. The difference in the absolute values of λ0 and θm1 (the former is roughly two orders of magnitude greater) implies a large entropy and small horizon dependence. 4.3. Other term structures

ED

M

We complement the analysis in section 4.2 by characterizing the empirical properties of cash flow growth g. We keep the parameter values we used earlier for US bonds, (θm , ϕ1 , λ0 , λ1 ), and choose others, (θg , ϕ2 , η0 ), to mimic the behavior of the cash flow of interest. Thus, the first set of parameters is common to all assets, while the second is asset specific. We suppress an asset-specific notation for simplicity.

PT

4.3.1. Foreign currency bonds

AC

CE

There is an extensive set of markets for bonds denominated in foreign currencies, which are linked by a similarly extensive set of currency markets. The term structure of a foreign sovereign yield curve depends on the interaction of the dollar pricing kernel and the depreciation rate of the dollar relative to a specific foreign currency, with the depreciation rate corresponding to the growth rate of the cash flow in this setting. For symmetry between the interest rates in the US and other countries and for simplicity of calibration, we assume that θg1 = −θm1 − λ1 η0 (so that in Eq. (20), θbm1 = 0). Then the one-period yield is ybt1 = − log βb − θg2 x2t .

Thus, the asset-specific parameters ϕ2 and also θg2 are calibrated by analogy with US nominal bonds using serial correlation and the variance of the one-period yields. The term 14

ACCEPTED MANUSCRIPT

b0 = λ0 + η0 from Eq. (21). spread of the foreign curve can then be used to back out λ Because we already know λ0 from the US curve, we can determine η0 . Panel B of Table 3 lists the calibrated values.

CR IP T

We observe a dramatic difference in the persistence of the cash flow specific shock, ϕ2 , across the different countries. The volatility θg2 and the risk premium contribution η0 retain the same qualitative features as their US counterparts in that they have different signs and the former is much smaller than the latter. Quantitatively, we observe cross-sectional variations in both parameters.

AN US

The literature views foreign exchange rates as being close to a random walk. In our bn1 = (1 + η0 λ1 /θm1 )Bn1 model, this means θg2 = 0 and θg1 = 0. These values imply that B bn2 = Bn2 = 0. The foreign term spread is (approximately) a scaled version of the US and B term spread, contradicting the term structure evidence. Thus, the information captured in the term structure of the sovereign bonds provides additional information that may be useful in modeling the one-period dynamics of exchange rates.

4.3.2. Inflation-linked bonds

M

Another implication of the calibrated model is that in contrast to many theoretical models of exchange rates, the domestic and foreign pricing kernels are asymmetric. We use quotation marks because we simply express the same projection of the pricing kernel in different units. Thus, depending on the setup of the general equilibrium model, the marginal rates of substitution of domestic and foreign economic agents could still be symmetric.

CE

PT

ED

Conceptually, the analysis of inflation-linked bonds is similar to that of foreign bonds. Exchange rates and foreign bonds tell us about the transitions between domestic and foreign economies, while the price level (CPI) and Treasury Inflation Protected Securities (TIPS) tell us about transition between the real and nominal economy. Therefore, we use the same model and the same calibration strategy in this case. We maintain the same US nominal pricing kernel, so the calibration of cash flow growth, or inflation in this case, is the only novel part relative to the previous section.

AC

Assuming that θbm1 = 0 would be too restrictive in this case because of the extremely low volatility of returns associated with trading TIPS at quarterly frequency. Table 1 shows that the volatility of returns to holding TIPS is two orders of magnitude smaller than those of foreign bonds, and Table 2 shows that the difference in the term spreads of TIPS and US nominal bonds is in the middle of the range for the spreads of foreign bonds. These quantities create tension between the dual role of η0 , which controls both the cross-section (term spreads of bonds) and the time series (conditional volatility of cash flow growth and, therefore, returns). Thus, we reconsider the calibration strategy of cash flow growth in the case of inflation by relaxing the zero constraint on θbm1 . In this case, the real short interest rate is equal to 15

ACCEPTED MANUSCRIPT

a linear combination of two AR(1) processes; that is, it is an ARMA(2,1): ybt1 = constant − (θbm1 + θbm2 )st ,

(22)

st = (ϕ1 + ϕ2 )st−1 − ϕ1 ϕ2 st−2 + wt − (θbm1 ϕ2 + θbm2 ϕ1 )(θbm1 + θbm2 )

−1

wt−1 .

See Appendix B.

CR IP T

First, we can calibrate θbm and ϕ2 by matching the variance and first- and second-order autocorrelations of ybt1 . As a second step, we can calibrate η0 using Hm b (40). Finally, we can b calibrate θg1 because θm1 = θm1 + θg1 + η0 λ1 . All of the required expressions are provided in Appendix B.

AN US

The results are reported in the first line of Table 3B. We see that the persistence of x2t is much lower than in the currency examples. This is natural because we rely on two factors to model the real short interest rate. 4.3.3. Equity

M

Dividend strips have recently attracted interest in the literature because the term structure of the associated Sharpe ratios seems to offer prima facie evidence against the major asset pricing models. Although we study excess log returns instead of Sharpe ratios, a comparison of Eq. (4) and (13) clearly demonstrates that these objects are related.

ED

We make best use of the available data by mixing two-quarter strip prices from Binsbergen, Brandt, and Koijen (2012) with summary statistics for ybtn − ytn , n ≥ 4 quarters from Binsbergen et al (2013) and making a number of bold assumptions (see the description in Table 2 and Appendix C). All of this evidence is worth revisiting as more data become available.

AC

CE

PT

Our calibrated model shares the qualitative traits of those matched to bond prices in the preceding sections. Quantitatively, we observe a dramatic drop in persistence ϕ2 . We note the cross-sectional variation in ϕ2 appears earlier, but the equity model is the lowest (excluding the CPI model that features a two-factor structure of the relevant short interest rate). Most of the representative agent models that have been confronted with the Sharpe ratio evidence feature exogenously specified cash flows with persistence connected to that of expected consumption growth and, therefore, the real pricing kernel. Our results suggest that different levels of persistence of cash flows and the pricing kernel must be explored before the final opinion on the equilibrium component of these models can be expressed.

5. One-period risk premiums The discussion in the previous section shows that evidence on the behavior of average excess returns can be translated across different horizons into the language of term structure 16

ACCEPTED MANUSCRIPT

modeling. Given the remarkable success of the no-arbitrage modeling of zero-coupon yield curves, it is not surprising that we can model the yield curves for other assets by suitably redefining the pricing kernel.

5.1. Term structure implications for one-period returns

AN US

From Eq. (1), the one-period log excess return is

CR IP T

This approach circumvents two issues. First, the term structure approach focuses on the differences in excess returns along the maturity curve (“slope” in the term structure literature) and does not address the question of the level of excess returns. If the term structure model successfully captures the horizon dependence of returns, we can recast the question of the level of excess returns in terms of the one-period excess return, that is, in terms of log rxt,t+1 . The second concern with the term structure approach is whether the same dynamic behavior can be generated in an equilibrium model. In this section, we focus on the question of the level of the term structure.

log rxt,t+1 = log gt,t+1 + ybt1 − yt1 .

In the lognormal environment of the models that we have discussed, the expected log excess return is given by Et log rxt,t+1 = −vart (log rxt,t+1 )/2 − covt (log mt,t+1 , log rxt,t+1 )

(23)

M

= −vart (log gt,t+1 )/2 − covt (log mt,t+1 , log gt,t+1 ).

Equation (23) implies for the KLV model: Et log rxt,t+1 = −η02 /2 − (λ0 + λ1 x1t )η0 .

ED

Conditional expectations are not observable, so the theoretical counterpart to average excess returns is

PT

E log rxt,t+1 = −η02 /2 − λ0 η0 .

CE

Here, we consider what the model that was calibrated to match the term structure evidence implies for one-period excess returns. The first column of Table 3C reports the calculated values, which depart dramatically from their data counterparts displayed in Table 1. Thus, the KLV model does a good job in matching the term structure of excess returns but not their level.

AC

In the language of entropy, the presented model can match the horizon dependences associated with the various assets but not the one-period entropies of the respective pricing kernels. In fact, the message is more refined because one-period entropy is related to the unobserved maximal risk premium. Here, we show that the observed risk premiums on specific assets that cannot be matched exceed this model-based maximal risk premium. Thus, we reach the same conclusion as in the simple two-horizon example. In the remainder of this section, we discuss the possible extensions of the model to rectify this shortcoming. 17

ACCEPTED MANUSCRIPT

5.2. Proposed extension I: normal shocks As noted earlier, an iid component of the pricing kernel has identical implications for horizon dependence regardless of the horizon. Thus, adding an iid component to the pricing kernel is the only avenue that will allow the one-period risk premium to be changed without affecting the implications for the term spreads.

CR IP T

We start by adding a normal iid shock to the KLV model:

> log mt,t+1 = log β + θm xt − λ2t /2 + λt wt+1 + λ2 εt+1 ,

log gt,t+1 = log γ + θg> xt + η0 wt+1 + η2 εt+1 .

We calibrate (λ2 , η2 ) to match the expected excess returns

E log rxt,t+1 = −η02 /2 − η22 /2 − λ0 η0 − λ2 η2 .

AN US

The expected excess return gives us one target for two parameters. To calibrate both parameters, we also use the unconditional variance of excess returns var log rxt,t+1 = η02 [1 + λ21 (1 − ϕ21 )−1 ] + η22 .

Thus, the variance of the excess returns implies η2 ; then, given η2 , the expected excess returns imply λ2 .

PT

ED

M

Table 3C reports the calculated values in the second and third columns. Both parameters have indeterminate signs, so we report their absolute values. The inferred values of λ2 are dramatically different across the different assets. In fact, the values have to be the same because λ2 reflects the exposure of the US nominal pricing kernel to the shock ε. Thus, a normal shock is not capable of capturing the levels of the risk premiums. The statistics in Table 1 also indicate that the observed one-period excess returns are nonnormal, further suggesting that a nonnormal shock is needed. 5.3. Coentropy

AC

CE

Before we proceed with a nonnormal extension of the KLV model, we introduce the concept of coentropy and its properties. This will be helpful for developing a nonnormal counterpart to the risk premium formula in Eq. (23) and for understanding the role of nonnormality in generating realistic risk premiums. We define the coentropy of two positive random variables x1 and x2 as the difference between the entropy of their product and the sum of their entropies: C(x1 , x2 ) = L(x1 x2 ) − [L(x1 ) + L(x2 )].

(24)

Coentropy captures a notion of dependence of two variables. If x1 and x2 are independent, then L(x1 x2 ) = L(x1 ) + L(x2 ) and C(x1 , x2 ) = 0. If x1 = ax2 for a > 0, then the 18

ACCEPTED MANUSCRIPT

coentropy is positive. If x1 = a/x2 , then L(x1 x2 ) = L(a) = 0 and the coentropy is negative. Coentropy is also invariant to noise. Consider a positive random variable y independent of x1 and x2 or, in other words, noise. Then, C(x1 y, x2 ) = C(x1 , x2 y) = C(x1 , x2 ).

CR IP T

As with entropy, we can express coentropy in terms of cgfs. The cgf of log x = (log x1 , log x2 ) is k(s1 , s2 ) = log E(es1 log x1 +s2 log x2 ). The cgfs of the components are k(s1 , 0) and k(0, s2 ). The coentropy is therefore C(x1 , x2 ) = k(1, 1) − k(1, 0) − k(0, 1).

(25)

The connection to the joint cgf can be made more explicit by representing it as a sum of joint cumulants, κi,j

j=1 p=0

j!

j! sj−p sp , p!(j − p)! 1 2

AN US

k(s1 , s2 ) =

j ∞ X X κj−p,p

This representation defines joint cumulants as the respective partial derivatives of the cgf evaluated at zero: κi,j =

∂ i+j k(0, 0) ∂si1 ∂sj2

.

M

Joint cumulants are close relatives of comoments:

ED

mean1 = κ1,0

mean2 = κ0,1

variance1 = κ2,0 variance2 = κ0,2

PT

covariance = κ1,1

CE

Coentropy can be expressed in terms of joint cumulants:

AC

C(x1 , x2 ) = =

j−1 ∞ X X

κj−p,p p!(j − p)!

j=2 p=1 1,1

κ |{z}

(log)normal term

+ κ2,1 /2! + κ1,2 /2! + · · · {z } |

(26)

high-order joint cumulants

Intuitively, coentropy focuses on the joint distribution of the two variables by removing all of the terms pertaining to the respective marginal distributions. Expression (26) suggests that coentropy could deviate from covariance in a nonnormal case. Consider a Poisson mixture of normals. Jumps j are Poisson with intensity ω. 19

ACCEPTED MANUSCRIPT

Conditional on j jumps, log x ∼ N (jµ, j∆), where the matrix ∆ has the elements δij . > > The cgf is k(s) = ω es µ+s ∆s/2 − 1 . The entropies are L(xi ) = ω eµi +δii /2 − 1 − ωµi L(x1 x2 ) = ω e(µ1 +µ2 )+(δ11 +δ22 +2δ12 )/2 − 1 − ω(µ1 + µ2 ).

CR IP T

The coentropy is therefore

C(x1 , x2 ) = ω e(µ1 +µ2 )+(δ11 +δ22 +2δ12 )/2 − eµ1 +δ11 /2 − eµ2 +δ22 /2 + 1 .

In contrast, the covariance is cov(log x1 , log x2 ) = ω(µ1 µ2 + δ12 ). We show in Appendix D that coentropy also differs from the other concepts of dependence introduced in the literature.

AN US

A numerical example illustrates this point. Let ω = µ1 = 1 and ∆ = 0 (a two-by-two matrix of zeros). If µ2 = 1, C(x1 , x1 ) > cov(x1 , x2 ), but if µ2 = −1, the inequality goes the other way because the odd high-order cumulants change signs. Similarly, it is not difficult to construct examples in which the covariance and coentropy have opposite signs.

M

Another numerical example shows how different they can be. Let µ1 = µ2 = −0.5 and 1 ρ ∆ = δ . ρ 1

PT

ED

We set ρ = 0 and δ = 1/ω. We then vary ω to identify what happens to the covariance and coentropy. We see in Fig. 3 that the two can be very different. When the jump intensity ω is low, the coentropy exceeds the covariance, but as the jump intensity increases, the coentropy becomes smaller than the covariance between the two processes. 5.4. Proposed extension II: Poisson shocks

CE

We introduce jumps into the KLV model so that the new pricing kernel and cash flow growth specification is

AC

> m log mt,t+1 = log β + θm xt − λ2t /2 + λt wt+1 + λ2 zt+1 ,

log gt,t+1 = log γ +

θg> xt

+ η0 wt+1 +

g η2 zt+1 ,

(27) (28)

where ztm and ztg are compound Poisson processes with the same arrival rate of ω and jump 2 ) and N (µ , δ 2 ), respectively. For the jumps to cash flow size distributions of N (µm , δm g g growth to be priced, the jump processes ztm and ztg must have coincident jumps. Thus, the only difference between the nonnormal innovations to the pricing kernel and the cash flow is in the jump size.

20

ACCEPTED MANUSCRIPT

In this case, the expression for expected excess returns (23) no longer applies. However, the main asset valuation Eq. (3) still applies, and we use it to derive a generalization of (23) to nonnormal settings. Taking the logs of Eq. (3), we obtain 0 = log Et (mt,t+1 rt,t+1 ) = Lt (mt,t+1 rt,t+1 ) + Et log mt,t+1 + Et log rt,t+1

This equation implies that the (log) risk premium is

CR IP T

= Lt (mt,t+1 rt,t+1 ) − Lt mt,t+1 − Lt rt,t+1 + log Et mt,t+1 + log Et rt,t+1 .

1 log Et rt,t+1 − log rt,t+1 = −Ct (mt,t+1 , rt,t+1 ),

and the expected (log) excess return is

1 Et log rt,t+1 − log rt,t+1 = Lt mt,t+1 − Lt (mt,t+1 , rt,t+1 ) = −Lt rt,t+1 − Ct (mt,t+1 , rt,t+1 ).

AN US

Here, Ct (x1t+1 , x2t+1 ) is the conditional version of the definition in Eq. (24). The last equation implies that in the case of zero-coupon claims, Et log rxt,t+1 = −Lt gt,t+1 − Ct (mt,t+1 , gt,t+1 ).

M

As in the normal examples, we have unconditional expectations, so it is helpful to introduce the additional notation Cmg (n) = n−1 ECt (mt,t+n , gt,t+n ). The definition of coentropy implies (29)

ED

Cmg (n) − Cmg (1) = Hm b (n) − Hm (n) − Hg (n).

PT

Although the horizon dependence is not affected by the addition of the Poisson iid shock, the entropy of the pricing kernel is affected: 2 2 Lm (1) = λ20 /2 + λ21 (1 − ϕ21 )−1 /2 − ωλ2 µm + ω eλ2 µm +λ2 δm /2 − 1 .

CE

Thus, it is easy to compute the n−period entropy via Lm (n) = Lm (1) + Hm (n), where Hm (n) is exactly the same as in Eq. (19). The one-period coentropy is

AC

Cmg (1) = λ0 η0 + kz (λ2 , η2 ) − kz (λ2 , 0) − kz (0, η2 ),

(30)

with

2 2

2 2

kz (s1 , s2 ) = ω(es1 µm +s2 µg +(s1 δm +s2 δg )/2 − 1).

(31)

Eq. (29) implies the n−period coentropy. Although Cmg (n) is affected by the jump components for any n, the change in coentropy across horizons is not−a result of the jump components being iid. Thus, Cmg (n) − Cmg (1) 21

ACCEPTED MANUSCRIPT

is affected only by the differences in the covariances between log m and log g at different horizons.

CR IP T

To calibrate the jump parameters, we normalize the jump loadings λ2 and η2 to 1 because they are not identified separately from the jump volatilities δm and δg , respectively. We borrow the parameters that control the jumps in the pricing kernel from the CI2 model 2 = (−10 · of Backus, Chernov, and Zin (2014): ω = 0.01/4, µm = −10 · (−0.15) = 1.5, δm 2 2 0.15) = 1.5 . These parameter values represent a milder version of the Barro (2006) disaster calibration, as we discuss in the context of a consumption-based model in the next section. We can use information about cash flows or, equivalently, about one-period excess returns, to infer the asset-specific µg and δg . The one-period excess (log) returns are log rxt,t+1 = log gt,t+1 + ybt1 − yt1 Thus,

AN US

g . = −λ0 η0 − η02 /2 − kz (λ2 , η2 ) + kz (λ2 , 0) − λ1 η0 x1t + η0 wt+1 + η2 zt+1

E log rxt,t+1 = −η02 /2 − λ0 η0 − kz (λ2 , η2 ) + kz (λ2 , 0) + ωη2 µg , and

M

var log rxt,t+1 = η02 [1 + λ21 (1 − ϕ21 )−1 ] + η22 ω(µ2g + δg2 ).

PT

ED

Table 3B reports the results of the calibration procedure. As discussed in the example of a bivariate Poisson process, the nonnormality of the pricing kernel and cash flow growth manifests itself in large differences between coentropy and covariance, as reported in Table 3C. These substantial deviations of coentropy from covariance highlight the ability of models with nonnormal innovations to generate large expected returns and large cross-sectional differences between them.

AC

CE

As a reality check, we verify whether the calibrated processes for cash flows, log g, resemble the data. We focus on two basic summary statistics: variance and serial correlation (the mean can be mechanically matched by adjusting log γ). We use the model to compute the population values of these two statistics at the calibrated parameters. Further, we simulate 100,000 artificial histories of the respective cash flow growth rates, which allows us to compute the finite-sample distribution of the same two statistics. Table 4 compares these theoretical results with empirical values and indicates that they are sufficiently close to the data. A second reality check is to use the calibrated model of equity to see what it implies for the equity premium. The value of an equity claim is an infinite sum of zero-coupon claims, so the model-implied premium should be consistent with the one in the data. In Appendix 22

ACCEPTED MANUSCRIPT

E, we explain how we solve this for the equity premium in our model and demonstrate that the premium is indeed matched.

CR IP T

We want to highlight how the nonnormalities featured in our model affect the coentropy and, therefore, the risk premiums via the joint cumulants of the pricing kernel and cash flows. Because the (log) normal component, λ0 η0 , of coentropy in Eq. (30) is standard and well-explored in the literature, we focus on the effect the jumps have on coentropy using the decomposition in Eq. (26). The joint cgf associated with jumps in our model (27)−(28) is provided in Eq. (31). The cumulants corresponding to the marginal distributions have the following well-known form: κ0,1 κ0,2 κ0,3 κ0,4

= ωµm , 2 ), = ω(µ2m + δm 2 ), = ωµm (µ2m + 3δm 4 2 2 4 ), = ω(µm + 6µm δm + 3δm

= ωµg , = ω(µ2g + δg2 ), = ωµg (µ2g + 3δg2 ), = ω(µ4g + 6µ2g δg2 + 3δg4 ),

AN US

κ1,0 κ2,0 κ3,0 κ4,0

and so on. Because the jump size distributions of the two variables are independent of each other, the joint cumulants have a particularly transparent representation: κi,j = ω −1 κi,0 κ0,j .

(32)

ED

M

This expression clarifies how joint cumulants contribute to generating risk premiums. The nonnormalities present in both variables and reflected in the magnitudes of the respective cumulants reinforce each other multiplicatively in the joint cumulants. Further, because the first term, κi,0 , reflects the properties of the pricing kernel, the cross-sectional differences in the risk premiums are determined by the marginal cumulants of the cash flow process.

CE

PT

The question is which of the joint cumulants matter quantitatively in the model presented here. The property (32) allows us to obtain a simple interpretation of coentropy. The high-order cumulants of the log pricing kernel are very large—larger than ten for i > 2. Regardless of the asset, the cumulants of the cash flows are relatively small. Thus, it is helpful to use the log scale to compare the quantitative effects of the two. On the log scale, each joint cumulant is simply a sum of the two log marginal cumulants, up to a constant.

AC

To illustrate the quantitative effect of coentropy, consider the case of GBP. The first row of Fig. 4 displays the log marginal cumulants. Each log joint cumulant with an index (i, j) is equal to the sum of the two marginals with indexes i and j, as displayed in panel C of the figure. We see that the log joint cumulants corresponding to a large i are dominated by the properties of the pricing kernel, and vice versa for large j. Panel C ignores the fact that the contributions of joint cumulants to coentropy are divided by i!j!, so the effects of the higher-order terms on risk premiums and coentropy 23

ACCEPTED MANUSCRIPT

should be diminished. Fig. 4D demonstrates how fast this occurs for GBP by showing the joint cumulants (not logs) scaled by the factorials. We see that the quantitative effect of high-order joint cumulants decreases quickly. The cash flow matters for j = 1, 2, while the effect of the pricing kernel starts to decline at i = 4 but is still visible at i = 10.

CR IP T

Fig. 5 displays the contribution of the joint cumulants to coentropy for all of the other assets. Qualitatively, the implications are the same in that the nonnormalities in the pricing kernel matter much more than those of the cash flows. Obviously, if the cash flows have no nonnormalities, the joint cumulants will be equal to zero. Thus, despite being small, the nonnormal shocks in the cash flows play an important role for the risk premiums.

6. The representative agent with recursive preferences

AN US

We have demonstrated that the evidence on risk premiums can be represented with affine term structure models that feature different levels of persistence of the pricing kernel and cash flows to capture the term differences in the risk premiums and an iid jump component to capture the level of premiums. These insights are important because they highlight the features that a pricing kernel and cash flows should possess to match the empirical evidence. However, this analysis does not address the question of whether the evidence is consistent with an equilibrium model.

PT

ED

M

We would like to emphasize the reasons for considering an equilibrium model. It is not our intent to offer an improvement of the existing models. We merely want to demonstrate how certain features of the data translate into required features in a model. We have already highlighted such features in the affine framework. The issue is whether the restrictions imposed by the economic theory affect the model’s ability to produce quantitatively realistic results. Although we argue that the resulting model is reasonable according to basic metrics, we leave it to our readers to pursue this formulation or its extensions in their research.

CE

We focus on an endowment economy with a representative agent. Because such an economy delivers implications for the real pricing kernel, we focus on whether we can generate something similar to the real affine pricing kernel implied by the combination of the nominal pricing kernel and the growth process for CPI (inflation). To recap, we have

AC

log m b t,t+1 = (log β + log γ + λ0 η0 + η02 /2) + (θm1 + θg1 + η0 λ1 )x1t + θg2 x2t

g m − (λ0 + η0 + λ1 x1t )2 /2 + (λ0 + η0 + λ1 x1t )wt+1 + λ2 zt+1 + η2 zt+1 > b2 /2 + λ bt wt+1 + zbm , ≡ log βb + θbm xt − λ (33) t t+1

m arrive at rate ω with jump sizes N (b 2 ) with µ where jumps zbt+1 µm , δbm bm = λ2 µm + η2 µg , and 2 2 2 2 2 δbm = λ2 δm + η2 δg . We conduct a reverse-engineering exercise in which the specification of consumption growth is motivated by the objective of having a similar functional form of the consumption-based real pricing kernel.

24

ACCEPTED MANUSCRIPT

6.1. Equilibrum real pricing kernel

We define utility with the time aggregator, Ut = [(1 − β)cρt + βµt (Ut+1 )ρ ]1/ρ , and certainty equivalent function,

(34)

AN US

α 1/α µt (Ut+1 ) = [Et Ut+1 ] ,

CR IP T

We assume that there is a representative agent with recursive preferences, as developed by Kreps and Porteus (1978), Epstein and Zin (1989), and Weill (1989), among many others. The benefit of this specification is that a loglinear approximation of the agent’s value function leads to a (restricted) affine pricing kernel. Thus, the insights that we have gained from the no-arbitrage case have the best chance of being successfully translated into an equilibrium setting.

where ct is the aggregate consumption. In standard terminology, ρ < 1 captures the time preference (with intertemporal elasticity of substitution 1/[1 − ρ]), and α < 1 captures the risk aversion (with coefficient of relative risk aversion 1 − α).

M

The time aggregator and certainty equivalent functions are homogeneous of degree one, which allows us to scale everything by the current consumption. If we define scaled utility as ut = Ut /ct , Eq. (34) becomes c ut = [(1 − β) + βµt (gt+1 ut+1 )ρ ]1/ρ ,

(35)

ED

c = ct+1 /ct is consumption growth. This relation serves as the recursive utility where gt,t+1 analog of a classical Bellman equation.

PT

With this utility function, the real pricing kernel is c c c m b t,t+1 = β(gt,t+1 )ρ−1 [gt,t+1 ut+1 /µt (gt,t+1 ut+1 )]α−ρ .

(36)

c c log gt,t+1 = g c + θc> xt + (σ0 + σ1 x1t )wt+1 + zt+1 ,

(37)

CE

The primary input to the pricing kernels of these models is a consumption growth process. We assume that

AC

where jumps arrive at the rate of ω and the jump sizes are distributed as N (µc , δc2 ). The factors xt are as the same as in the KLV model. This specification has two novel features that arise directly from the functional form of the affine pricing kernel. First, the conditional volatility of the consumption growth is captured by a normally distributed variable. Because the sign of the volatility is not determined, it is appropriate to use such a specification. Commonly used specifications either use a normal variable for the variance of consumption growth or model the variance 25

ACCEPTED MANUSCRIPT

via the square-root or ARG processes. We propose our specification to match the essentially affine form of the affine m b t,t+1 . It could be a useful alternative in future applications. Second, the expected consumption growth follows an ARMA(2,1) process instead of the commonly used AR(1) process. We derive the pricing implications from a loglinear approximation of Eq. (35), (38)

CR IP T

c log ut ≈ u0 + u1 log µt (gt,t+1 ut+1 ),

around the point log µt = E(log µt ). This approximation is exact when ρ = 0, in which case u0 = 0 and u1 = β. We guess a value function of the form

log ut+1 = u + a1 x1t+1 + a2 x21t+1 + bx2t+1 .

AN US

We verify this guess and derive the values of a1 , a2 , and b in Appendix F.1. Then Eq. (36) implies the real pricing kernel log m b t+1 = m b + [(ρ − 1)θc1 − (α − ρ)α(σ0 + a1 + b)(σ1 + 2a2 ϕ1 )(1 − 2αa2 )−1 ]x1t + (ρ − 1)θc2 x2t − (α − ρ)α(σ1 + 2a2 ϕ1 )2 (1 − 2αa2 )−1 x21t /2

+ [(α − 1)σ0 + (α − ρ)(a1 + b) + ((α − 1)σ1 + (α − ρ)2a2 ϕ1 )x1t ]wt+1

2 c + (α − ρ)a2 wt+1 + (α − 1)zt+1 ,

(39)

M

where m b is a constant whose explicit expression as a function of the model parameters is omitted.

CE

PT

ED

We calibrate the model to match the properties of the affine pricing kernel (33), see Appendix F.2. The calibrated preference parameters and parameters controlling the dynamics of consumption are listed in Panel D of Table 3. Table 4 shows that the model-implied positive serial correlation of consumption growth is similar to the empirical one. Yet in contrast to the traditional implementations of the long-run risk paradigm, the real yield curve is upward sloping. This property arises from the specification of the consumption volatility. Intuitively, x1t , which affects the conditional volatility of the pricing kernel in line three of Eq. (39), acts similarly to a habit or a preference shock (e.g., Creal and Wu, 2016; Wachter, 2006). The technical details of how this works are provided in Appendix F.3. Thus, our model of consumption growth transcends the specific objective of matching the affine pricing kernel and could be useful in other equilibrium setups.

AC

The serial correlation of the expected consumption growth can be computed using the formulas in Appendix B and is equal to 0.6786. This number is much lower than is typically used in the long-run risk literature. Implicitly, this number is determined by the shape of the real yield curve. As a result, the normal component of the model will not be able to generate risk premiums of realistic magnitudes. Here, jumps in consumption help in matching the levels of the risk premiums. The calibrated jump parameters are consistent with a modest version of the Barro (2006) disaster model in that jumps take place once in a hundred years, and the average jump in consumption is −15%, with a volatility of 15%. 26

ACCEPTED MANUSCRIPT

6.2. Nominal pricing kernel and other assets

CR IP T

To obtain the nominal pricing kernel in our endowment economy, we assume an exogenous process of inflation as in Bansal and Shaliastovich (2013), Piazzesi and Schneider (2006), and Wachter (2006). Specifically, we use the process specified and calibrated in Sections 4.3.2 and 5.4. Given that our consumption-based real pricing kernel is quite close to the affine one, the nominal pricing kernel will be (nearly) matched by construction. Following the same logic, we can match the transformed pricing kernels associated with currencies and equities using the calibrated cash flows (28).

7. Concluding remarks

M

AN US

Given that the state variables x1t and x2t control the expected consumption growth in the recursive model, we can reinterpret the cash flow model in the context of what is typically used in endowment economies. The Bansal and Yaron (2004) specification expects consumption growth to be determined by a single AR(1) factor, and an asset’s expected cash flow growth has a different exposure to this factor. Our model has two AR(1) factors, and the exposure to both changes as we move from expected consumption growth to expected cash flow growth. Put differently, we can write expected consumption growth as an ARMA(2,1) process, while expected cash flow growth cannot be written as a different exposure to the same ARMA(2,1) process, because both the exposure to common shocks and the process change. These two departures from the traditional specifications are helpful in matching the observed patterns of average multihorizon returns.

CE

PT

ED

We focus on how risk is priced in the cross-section of assets and across investment horizons. Empirically, we link the average log holding period returns on a given asset in excess of US interest rates to the difference between the yield curve corresponding to this asset (dividend yield, foreign yield, or real yield) and the US yield curve. The crosssectional dispersion of one-period excess returns is very large and continues to increase with the horizon. For a given asset, excess log returns decline with the horizon, but the rate of decline is different in the cross-section.

AC

Theoretically, we introduce the concept of coentropy, which serves as a generalized measure of covariance in the nonnormal, multiperiod world. Coentropy of the pricing kernel and cash flows is closely related to the shown cross-sectional differences in yields. Thus, these differences in yields must reflect the differences in cash flows. We demonstrate that to capture the shown patterns in excess log returns, an asset-pricing model has to feature iid extreme outcomes, a persistent component, and cross-sectional variation in the persistence of cash flows. A model of the representative agent with recursive preferences and consumption that features disasters and persistent variation in its expected value is capable of capturing the evidence.

27

ACCEPTED MANUSCRIPT

Appendix A. Long horizons

CR IP T

We use the term long horizon to refer to the behavior of asset prices and entropy as the time horizon approaches infinity. Hansen and Scheinkman (2008) echo the PerronFrobenius theorem and consider the problem of finding a positive dominant eigenvalue ν and associated positive eigenfunction vt satisfying Et mt,t+1 vt+1 = νvt . (A.1)

If such a pair exists, we can construct the Alvarez-Jermann (2005) decomposition mt,t+1 = m1t,t+1 m2t,t+1 with m1t,t+1 = mt,t+1 vt+1 /(νvt ), m2t,t+1 = νvt /vt+1 .

AN US

By construction Et (m1t,t+1 ) = 1, hence Hansen and Scheinkman (2009) refer to it as a martingale component of the pricing kernel. Qin and Linetsky (2017) demonstrate how this decomposition works in non-Markovian environments.

M

Given such an eigenvalue-eigenfunction pair, the long yield converges to − log ν. The ∞ long bond one-period return is not constant, but its expected value also converges: rt,t+1 = n limn→∞ rt,t+1 = 1/m2t,t+1 = vt+1 /(νvt ), so that E(log r∞ ) = − log ν. See Alvarez and Jermann (2005, Section 3).

ED

The special case m1t,t+1 = 1 has gotten a lot of recent attention; see, for example, the review in Borovicka, Hansen, and Scheinkman (2016). The pricing kernel becomes mt,t+1 = m2t,t+1 . Since the long bond return is its inverse, the long bond is the high return asset. Realistic or not, it’s an interesting special case. In logs, the pricing kernel becomes

PT

log mt,t+1 = log ν + log vt − log vt+1 .

CE

The log pricing kernel is the first difference of a stationary object, namely v, plus a constant. In a sense, it’s been over differenced.

AC

Example 1. Two-horizon price of risk, continued. We guess an eigenvector of the form log vt = c0 wt + c1 wt−1 . If we substitute into Eq. (A.1) we find: c0 = a1 ,

c1 = 0,

log ν = log β + (a0 + a1 )2 /2.

Horizon dependence is Hm (∞) = log ν − E log mt,t+1 = log ν − log β = (a0 + a1 )2 /2.

If a1 = −a0 , then m1t,t+1 = 1, and Hm (∞) = 0.

28

ACCEPTED MANUSCRIPT

Moving on to other assets, we introduce two equations analogous to Eq. (A.1). One is for cashflow growth: Et (gt,t+1 ut+1 ) = ξut ,

leading to a decomposition m b t,t+1 = νbm b 1t,t+1 vbt /b vt+1 .

The decompositions are related to each other via:

CR IP T

1 leading to a decompistion gt,t+1 = ξgt,t+1 ut /ut+1 . The other is for transformed pricing kernel: Et m b t,t+1 vbt+1 = νbvbt , ds (A.2)

1 νbm b 1t,t+1 vbt /b vt+1 = m b t,t+1 ≡ mt,t+1 gt,t+1 = νξm1t,t+1 gt,t+1 (vt ut )/(vt+1 ut+1 ).

(A.3)

AN US

There is not, in general, a close relation between νb, ν, and ξ, but there is in some special cases. One special case is a stationary cash flow, which leads to the martingale component 1 gt,t+1 = 1 as in the example above. In this case, the simplified Eq. (A.3) implies that the value νξ and function vt ut solve Eq. (A.2). Therefore, νb = νξ, the martingale components coincide, m b 1t,t+1 = m1t,t+1 , and long-horizon excess returns are equal to zero: E log rxt,t+n → 0,

as n → ∞.

(A.4)

ED

M

1 The reverse is also true: if m b 1t,t+1 = m1t,t+1 , it must be the case that gt,t+1 = 1. Indeed, 1 in this case Eq. (A.3) implies that the level of gt,t+1 (vt ut )/(vt+1 ut+1 ) must be stationary 1 because vbt is. Because vt and ut are stationary as well, the martingale gt,t+1 must be a constant (we can normalize it to one without loss of generality).

Example 2. Cash flows, continued. The Perron-Frobenius theory implies log ut = d0 wt + d1 wt−1 with

PT

d0 = b1 ,

d1 = 0,

and log vbt = b c0 wt + b c1 wt−1 with

CE

b c0 = b a1 ,

b c1 = 0,

log ξ = log γ + (b0 + b1 )2 /2,

log νb = log βb + (b a0 + b a1 )2 /2.

If b1 = −b0 , then cash flow is stationary, log ξ = log γ, and log νb = log β + log γ + (a0 + a1 )2 /2 = log ν + log ξ.

AC

Another special case is one in which the price-dividend ratio pb is constant, see the October 2005 version of Hansen, Heaton, and Li (2008), Section 3.2. Consider a factorization of the dividend dt into a growth component d∗t and a stationary component st , so that ∗ ∗ 1 dt = d∗t · st , and gt,t+1 ≡ d∗t+1 /d∗t (if gt,t+1 is a constant, then gt,t+1 = 1.) Because st is ∗ ∗ stationary, the two transformed pricing kernels m b t,t+1 and mt,t+1 ≡ mt,t+1 gt,t+1 will have the same eigenvalue νb. The eigenfunctions will be vbt and vbt · st , respectively. Thus, if a dividend is such that its vbt = 1, or, equivalently, st equals the eigenfunction associated with m∗t,t+1 , then pb is constant. 29

ACCEPTED MANUSCRIPT

Appendix B. Details of the inflation model

Then

θbm1 θbm2 x1t + x2t . θbm1 + θbm2 θbm1 + θbm2

(1 − ϕ1 L)(1 − ϕ2 L)st = = So

θbm1 θbm2 (1 − ϕ2 L)wt + (1 − ϕ1 L)wt θbm1 + θbm2 θbm1 + θbm2 ! ! θbm1 θbm2 1− ϕ2 + ϕ1 L wt . θbm1 + θbm2 θbm1 + θbm2

AN US

st =

CR IP T

First we show that a sum of two AR(1) processes is an ARMA(2,1). Consider

st = φ1 st−1 + φ2 st−2 + wt + θ1 wt−1 , with φ1 = ϕ1 + ϕ2 ,

φ2 = −ϕ1 ϕ2 ,

θ1 = −

θbm1 θbm2 ϕ2 + ϕ1 θbm1 + θbm2 θbm1 + θbm2

(B.1) !

.

M

The first step of calibration requires the knowledge of the unconditional moments of ARMA(2,1). Denote variance by v0 and autocovariances by vj . Then multiply st−j on both side of Eq. (B.1) and take expectation

ED

vj = φ1 vj−1 + φ2 vj−2 + Ewt st−j + θ1 Ewt−1 st−j .

PT

We are interested in j = 0, 1, 2. We use v−i = vi ; Ewt−i st−j = 0 if j > i, and Ewt−i st−j = ψi−j , if j ≤ i where ψj is a coefficient in the MA representation of st . As a result we get the following linear system of equations in the variables of interest:

CE

v0 = φ1 v1 + φ2 v2 + 1 + θ1 (φ1 + θ1 ), v 1 = φ 1 v 0 + φ 2 v 1 + θ1 , v2 = φ1 v1 + φ2 v0 .

AC

The solution is:

v0 = D−1 (1 + 2θ1 φ1 − θ12 (φ2 − 1) − φ2 ),

v1 = D−1 (φ1 + θ12 φ1 + θ1 (1 + φ21 − φ22 )),

v2 = D−1 (φ2 + φ21 − φ22 + θ12 (φ2 + φ21 − φ22 ) + θ1 φ1 (1 + φ21 + 2φ2 − φ22 )), D = ((φ2 − 1)2 − φ21 )(1 + φ2 ).

Having obtained these moments, we can construct serial autocorrelations of ybt1 = constant− (θbm1 + θbm2 )st : v1 /v0 , and v2 /v0 , and its variance: v0 (θbm1 + θbm2 )2 . 30

ACCEPTED MANUSCRIPT

Appendix C. Details of the dividend strips

CR IP T

Dividend strips are forward contracts on annual dividends paid out n years from now. So, assuming a time step of one quarter, these are not zero-coupon claims. The issue is how to summarize the data and to value these contracts in a setup where one quarter is the shortest time step. Suppose dt+1 is a one-quarter dividend that is paid out at time t + 1. The corresponding (log) growth rate is log gt,t+1 = log(dt+1 /dt ). One-year dividend is (m) dt

=

m X

dt−m+i ,

m = 4.

i=1

(m)

AN US

A k−year forward contract specifies at date t the exchange of its price, or strike, for (m) dt+km at date t + n, n = km. Denote its price by Qnt . Binsbergen et al. (2013) report summary statistics for k −1 [log dt − log Qnt ]. Specifically, they report averages that are (m) estimates of k −1 [E log dt − E log Qnt ]. This section establishes how is this object related to E log rxt,t+n in our paper. (m)

(m)

(m)

Consider a claim to gt,t+n ≡ dt+km /dt

with a price denoted by pbnt . The corresponding (m)

M

yield, as before, is ybtn = −n−1 log pbnt . By no arbitrage, pnt qtn = pbnt , with qtn = Qnt /dt pnt is a price of a US nominal zero-coupon bond that pays $1 at time t + n.

ED

Then,

(m)

− log Qnt ] = k −1 E[− log pbnt + log pnt ] = mE[b ytn − ytn ]. (m)

PT

k −1 E[log dt

Now consider return on the claim to gt,t+n :

CE AC Therefore,

(m)

n log rxt,t+n = n−1 [log gt,t+n − log pbnt − log rt,t+n ] n X (m) = n−1 [ log gt+j−1,t+j ] + ybtn − ytn . j=1

E(log rxt,t+n − log rxt,t+1 ) = E(b ytn − ytn ) − E(b yt1 − yt1 ) n X (m) (m) −1 + n E[log gt+j−1,t+j − log gt,t+1 ] =

j=1 n E(b yt − ybt1 )

31

− E(ytn − yt1 ).

, and

ACCEPTED MANUSCRIPT

So, the Binsbergen et al. (2013) statistic allows computing average term spread in excess returns. We need to clarify what ybt1 is because the smallest n = 4 in Binsbergen et al. (2013). We will use the results from Binsbergen, Brandt, and Koijen (2012) to approximate this (m) (m) (m) quantity. One-period asset yield corresponds to a claim to gt,t+1 ≡ dt+1 /dt . Its price is (m) −1

=

m−1 X

(m) (dt )−1

)

(m)

Et (mt,t+1 dt+1 ) (m) −1

dt−m+1+i + (dt

)

CR IP T

(m)

pb1t = Et (mt,t+1 gt,t+1 ) = (dt

Et (mt,t+1 dt+1 ).

i=1

(m)

Prices of six-month contracts, that is, claims to gt,t+2 are: (m)

(m) −1

(m) −1

= (dt

)

[

m−2 X

)

Et [mt,t+2

m X

dt−m+2+i ]

AN US

pb2t = Et [mt,t+2 gt,t+2 ] = (dt

i=0

dt−m+2+i + Et (mt,t+1 dt+1 ) + Et (mt,t+2 dt+2 )].

i=1

M

Binsbergen, Brandt, and Koijen (2012) report Pt2 = Et (mt,t+1 dt+1 ) + Et (mt,t+2 dt+2 ). If we assume that Et (mt,t+1 dt+1 ) ≈ 1.02Et (mt,t+2 dt+2 ) (the one-period price is just a bit higher than the two-period price), then we can obtain an estimate of ybt1 : (m)

− log(dt−2 + dt−1 + dt + Pt2 · 0.495).

ED

ybt1 = − log pb1t ≈ log dt

The reported shape of the corresponding curve does not materially depend on reasonable variations in the approximating assumption.

AC

CE

PT

The issue with theoretical valuation of these securities is that they are not literally zero coupon. Therefore, computation of yields would involve taking logs of sums of variables, which is not convenient. For this reason, we will exploit the persistence of dividends. That is, annual dividend divided by four (quarterly average) should not be too much different from the quarterly dividend. Fig. 6 confirms this intuition. As a result, our theoretical model will treat dividend strips as if they were claims on quarterly dividends.

Appendix D. Copula, mutual information, and coentropy Consider two random variables x1 and x2 with a joint pdf p(x1 , x2 ) and marginals p1 (x1 ) and p2 (x2 ). The corresponding marginal cdf’s are P1 (x1 ) and P2 (x2 ). Sklar’s theorem enables one to decompose p using copula “density” c: p(x1 , x2 ) = c(P1 (x1 ), P2 (x2 )) · p1 (x1 ) · p2 (x2 ). 32

ACCEPTED MANUSCRIPT

(The general result is P (x1 , x2 ) = Cop(P1 (x1 ), P2 (x2 )), where Cop is copula.) Mutual information is I(x1 , x2 ) ≡ E log

p(x1 , x2 ) = E log c(P1 (x1 ), P2 (x2 )). p1 (x1 ) · p2 (x2 )

CR IP T

Coentropy is C(x1 , x2 ) ≡ L(x1 x2 ) − L(x1 ) − L(x2 )

= log E(x1 x2 ) − E log(x1 x2 ) − (log Ex1 − E log x1 + log Ex2 − E log x2 ) x1 x2 x1 x2 = −E log + E log + E log . E(x1 x2 ) E(x1 ) E(x2 )

AN US

To relate coentropy to mutual information and copula, define new probabilities: p˜(x1 , x2 ) = p(x1 , x2 )x1 x2 /E(x1 x2 ), and −j denotes “not j.” We have the following marginals Z Z p˜j (xj ) = p˜(x1 , x2 )dx−j = p(x1 , x2 )x1 x2 /E(x1 x2 )dx−j Z = xj pj (xj )/E(x1 x2 ) p(x−j |xj )x−j dx−j = xj pj (xj )E(x−j |xj )/E(x1 x2 )

M

= pj (xj )xj /E(xj ).

ED

Therefore,

C(x1 , x2 ) = −E log p˜/p + E log p˜1 /p1 + E log p˜2 /p2 = −E log

p˜/p p˜1 /p1 · p˜2 /p2

p˜ p + E log p˜1 · p˜2 p1 · p2 ˜ ˜ = −E log c˜(P1 (x1 ), P2 (x2 )) + E log c(P1 (x1 ), P2 (x2 ))

PT

= −E log

CE

= −E log c˜/c.

AC

In words, coentropy is the difference between mutual informations corresponding to two different joint probabilities. Consider a specific example when the new probability is defined by p˜(m, g) = p(m, g)mg/E(mg). Then the first marginal, p˜1 is the risk-adjusted probability. Chabi-Yo and Colacito (2017) introduce a concept of coentropy. It is different from coentropy in this paper despite the same name. Expanding on their definition, we obtain: K(x1 , x2 ) ≡ 1 −

L(x1 x2 ) + L(x1 ) − L(x2 ) C(x1 , x2 ) + 2L(x1 ) L(x2 ) = = . L(x1 x2 ) + L(x1 ) L(x1 x2 ) + L(x1 ) C(x1 , x2 ) + 2L(x1 ) + L(x2 )

In their notation, x1 = x and x2 = y/x. We have relabeled the variables to match our use with theirs: x1 ultimately becomes m, and x2 is g. 33

ACCEPTED MANUSCRIPT

Appendix E. The equity premium The main objective of this section is to establish equity premium in the KLV model that was calibrated to match returns on dividend strips. We exploit the Campbell-Shiller loglinearaization

CR IP T

log rt,t+1 = log gt,t+1 − log pdt + κ log pdt+1 ,

where pdt = pbt /dt is the price-to-dividend ratio, and κ = E(pdt )/(1 + E(pdt )). Then Eq. (3) implies

1 = Et elog mt,t+1 +log gt,t+1 −log pdt +κ log pdt+1 , or, equivalently,

AN US

b t,t+1 +κ log pdt+1 log pdt = log Et elog m .

Guess log pdt = A + Bxt , and solve for A and B by plugging the guess into the valuation equation. We obtain:

M

b0 κB > e + (κB > e)2 /2 + kz (λ2 , η2 )], A = (1 − κ)−1 [log βb + λ B = θb> (I − κΦ∗ )−1 .

ED

The risk free rate in the presence of jumps is

yt1 = − log Et elog mt,t+1 = − log β − kz (λ2 , 0) − θm1 x1t .

PT

Therefore, excess log returns are

log rxt,t+1 = log gt,t+1 − A − Bxt + κA + κBxt+1 + log β + kz (λ2 , 0) + θm1 x1t

CE

and expected excess returns are b0 κB > e + (κB > e)2 /2 + kz (λ2 , η2 )] + log β + kz (λ2 , 0) E log rxt,t+1 = log γ + ωη2 µg − [log βb + λ b0 κB > e − (κB > e)2 /2 + kz (λ2 , 0) + ωη2 µg − kz (λ2 , η2 ). = −λ0 η0 − η 2 /2 − λ

AC

0

Given the calibrated parameters for zero-coupon equity claim and a commonly used value of κ = 0.963, the implied equity premium is 0.0033, or 1.33% per year. While this appears to be on the low end of the customary equity premium estimates (the Shiller dataset implies 4% with a standard error of 1.65%), the number is close to average realized log excess returns on S&P 500 of 1.34% during 1996 − 2011. This is the period that was effectively used for calibration of equity cash flow growth as it is the sample corresponding to data from Binsbergen, Brandt, and Koijen (2012) and Binsbergen et al. (2013). 34

ACCEPTED MANUSCRIPT

Appendix F. The recursive utility pricing kernel Appendix F.1. Derivation

We guess a value function of the form log ut+1 = u + a1 x1t+1 + a2 x21t+1 + bx2t+1

CR IP T

We derive the pricing kernel for a representative agent model with recursive utility, loglinear consumption growth dynamics, stochastic volatility, and jumps with constant intensity.

2 = u + a1 ϕ1 x1t + a2 ϕ21 x21t + bϕ2 x2t + (a1 + b + 2a2 ϕ1 x1t )wt+1 + a2 wt+1 .

Then

AN US

log(gt,t+1 ut+1 ) = g + u + (θc1 + a1 ϕ1 )x1t + (θc2 + bϕ2 )x2t + a2 ϕ21 x21t

2 c + [σ0 + a1 + b + (σ1 + 2a2 ϕ1 )x1t ]wt+1 + a2 wt+1 + zt+1 .

Therefore, using the identity

we obtain

M

1 2 log Et eα1 wt+1 +α2 wt+1 = − [log(1 − 2α2 ) − α12 (1 − 2α2 )−1 ], 2

α 1 2 2 log(1 − 2αa2 ) + (σ0 + a1 + b)2 (1 − 2αa2 )−1 + α−1 ω(eαµc +α δc /2 − 1) 2α 2 + [θc1 + a1 ϕ1 + α(σ0 + a1 + b)(σ1 + 2a2 ϕ1 )(1 − 2αa2 )−1 ]x1t + [θc2 + bϕ2 ]x2t

ED

log µt (gt,t+1 ut+1 ) = g + u −

PT

+ [a2 ϕ21 + 0.5α(σ1 + 2a2 ϕ1 )2 (1 − 2αa2 )−1 ]x21t .

Lining up terms in expression (38), we get b = u1 θc2 (1 − u1 ϕ2 )−1 , and

CE

a1 = u1 (θc1 + α(σ0 + b)(σ1 + 2a2 ϕ1 )(1 − 2αa2 )−1 )(1 − u1 (ϕ1 + α(σ1 + 2a2 ϕ1 )(1 − 2αa2 )−1 ))−1 ,

AC

while a2 is the negative root of a2 = u1 (a2 ϕ21 + 0.5α(σ1 + 2a2 ϕ1 )2 (1 − 2αa2 )−1 ).

We select the negative root because in this case a2 → 0 when σ1 → 0. Then Eq. (36) implies the real pricing kernel in Eq. (39).

35

ACCEPTED MANUSCRIPT

Appendix F.2. Matching the affine real pricing kernel We introduce the following shorthand notation for the various elements of the real pricing kernel (39): log m b t+1 = m b + [(ρ − 1)θc1 − (α − ρ)α(σ0 + a1 + b)(σ1 + 2a2 ϕ1 )(1 − 2αa2 )−1 ]x1t + (ρ − 1)θc2 x2t − (α − ρ)α(σ1 + 2a2 ϕ1 )2 (1 − 2αa2 )−1 x21t /2 2 c + (α − ρ)a2 wt+1 + (α − 1)zt+1 ,

CR IP T

+ [(α − 1)σ0 + (α − ρ)(a1 + b) + ((α − 1)σ1 + (α − ρ)2a2 ϕ1 )x1t ]wt+1

rec ≡ m b + θ1rec x1t + θ2rec x2t − ϑrec x21t /2 + (λrec 0 + λ1 x1t )wt+1 2 rec + λrec 2 wt+1 + zt+1 .

(F.1)

AN US

Observe that both the affine, (33), and the recursive, (F.1), pricing kernels can be written as 1/2

2 m log m b t+1 = −b yt1 − convexity + vt wt+1 + v2 wt+1 + zbt+1 ,

where ybt1 is given in Eq. (22) in the affine case and

rec rec rec 2 2 ybt1 = constant − (θ1rec + λrec − [λrec 0 λ1 )x1t − θ2 x2t + (ϑ 1 ] )x1t /2

M

b0 λ b1 x1t +λ b2 x2 /2 and λrec λrec x1t + in the recursive case; the convexity, ignoring constants, is λ 1 1t 0 1 1/2 2 2 rec rec b b [λrec 1 ] x1t /2. The volatility of the pricing kernel, vt , is λ0 + λ0 x1t and λ0 +λ1 x1t . Finally, v2 is zero in the affine case and λrec 2 in the recursive case.

CE

PT

ED

The convexity term in Eq. (F.1) does not offset the quadratic term in the interest rate completely, as it does in Eq. (33). Moreover, the structural model features the squared 2 in the pricing kernel, and such a term is absent from the affine pricing kernel. shock wt+1 Thus, we will not be able to construct a perfect equilibrium counterpart to our affine model. However, this is not our goal, as we simply want to obtain a model that is quantitatively consistent with the highlighted features of the data, and we use an affine model to guide this search.

AC

We focus on calibrating the parameters controlling consumption growth θ1c , θ2c , σ0 , and σ1 to match the linear components of the interest rates and conditional volatilities of the pricing kernels. The jump parameters have an intuitive one-to-one mapping. In addition, θc2 = θbm2 (ρ − 1)−1 , µc = µ bm (α − 1)−1 , δc = δbm (α − 1)−1 , while θc1 , σ0 , and σ1 solve a nonlinear system of equations: rec θbm1 = θ1rec + λrec 0 λ1 , b0 = λrec , λ 0 b λ1 = λrec . 1

We select α = −9 and ρ = 1/3, which are ubiquitous in the literature. 36

ACCEPTED MANUSCRIPT

Appendix F.3. Understanding implications for the real yield curve In this appendix, we follow the approach of Backus, Chernov, and Zin (2014) and focus on the serial covariance of the real pricing kernel, which governs the slope of the real yield curve. Note that we can represent the consumption growth process in Eq. (37) as:

CR IP T

c log gt,t+1 = g c + γ(B)wt+1 + σ ¯t wt+1 ,

where σ ¯t = σ1 x1t is the demeaned conditional volatility of consumption growth. We omit the iid jumps here because they have no implications for the shape of the yield curve.

AN US

P j The term γ(B) = ∞ j=0 B γj is a lag polynomial that captures the serial dependence of consumption growth. In the specific case of our model, it arises from the ARMA(2,1) structure of the expected consumption growth process; the first three elements of the lag polynomial in our model are γ0 = σ0 , γ1 = θc1 + θc2 , γ2 = (θc1 + θc2 )(φ1 + θ1 ), with φ1 = ϕ1 + ϕ2 and θ1 = −(θc1 ϕ2 + θc2 ϕ1 )(θc1 + θc2 )−1 as in Appendix B. The lag polynomial notation γ(B) is more general than our particular model. We use this notation to emphasize that the specifics of expected consumption growth are not important for our discussion here. Finally, notice that, in our model, the time-varying component of volatility is additive. Instead, in the standard ARG(1) and AR(1) specifications of variance, volatility is multiplicative, and the consumption growth process is represented as:

ED

where σt is the volatility.

M

c log gt,t+1 = g c + γ(B)σt wt+1 ,

Following the same notation, the real pricing kernel (39) can be rewritten as: log m b t+1 = constant + [(ρ − 1)γ(B) + (α − ρ)γ(u1 )]wt+1

PT

− (α − ρ)αγ(u1 )(σ1 + 2a2 ϕ1 )(1 − 2αa2 )−1 ν(B)Bwt+1 + [(α − 1)σ1 + (α − ρ)2a2 ϕ1 ]σt wt+1 + . . . ,

AC

CE

P j where γ(u1 ) = ∞ j=0 u1 γj , lower dots represent the omitted quadratic terms, and ν(B) is a lag polynomial that arises from the AR(1) structure of x1t and whose first three elements equal to ν0 = 1, ν1 = ϕ1 , and ν2 = ϕ21 . The last two lines of this expression represent components of the pricing kernel associated with time-varying volatility of consumption growth: if σ1 = 0, then a2 = 0, and these terms vanish. The first line in this expression represents the pricing kernel corresponding to Bansal and Yaron (2004), Model I−time-varying expected consumption growth with constant volatility. As pointed out by Backus, Chernov, and Zin (2014), extensions to Model I using ARG(1) or AR(1) processes to capture the volatility dynamics do not help in changing the sign of the term spread. In contrast, the linear specification of volatility, reflected in the second line of the recursive pricing kernel expression, offers additional flexibility in generating 37

ACCEPTED MANUSCRIPT

AC

CE

PT

ED

M

AN US

CR IP T

serial covariance of the linear part of the pricing kernel. For instance, the leading term (ρ − 1)γ0 + (α − ρ)γ(u1 ) is unaltered, but the second one changes from (ρ − 1)γ1 to (ρ − 1)γ1 − (α − ρ)αγ(u1 )(σ1 + 2a2 ϕ1 )(1 − 2αa2 )−1 ν0 . Using the calibrated parameters in Table 3D, we see that indeed the extra term changes the sign of the serial covariance of the pricing kernel as compared to the traditional specifications of variance. Thus, serial covariance of the pricing kernel becomes negative and the slope of the real curve becomes positive.

38

ACCEPTED MANUSCRIPT

References Alvarez, F., Jermann, U., 2005. Using asset prices to measure the persistence of the marginal utility of wealth. Econometrica 73, 1977-2016. Backus, D., Chernov, M., Martin, I., 2011. Disasters implied by equity index options. Journal of Finance 66, 1969-2012.

CR IP T

Backus, D., Chernov, M., Zin, S., 2014. Sources of entropy in representative agent models. Journal of Finance 69, 51-99. Bansal, R., Lehmann, B., 1997. Growth-optimal portfolio restrictions on asset pricing models. Macroeconomic Dynamics 1, 333-354. Bansal, R., Shaliastovich, I., 2013. A long-run risks explanation of predictability puzzles in bond and currency markets. Review of Financial Studies 26, 1-33.

AN US

Bansal, R., Yaron, A., 2004. Risks for the long run: A potential resolution of asset pricing puzzles. Journal of Finance 59, 1481-1509. Barro, R., 2006. Rare disasters and asset markets in the twentieth century. Journal of Economics 121, 823-867.

Quarterly

Belo, F., Collin-Dufresne, P., Goldstein, R., 2015. Dividend dynamics and the term structure of dividend strips. Journal of Finance 70, 1115-1160.

M

Binsbergen, J.H.v., Brandt, M., Koijen, R., 2012. On the timing and pricing of dividends. American Economic Review 102, 1596-1618.

ED

Binsbergen, J.H.v., Koijen, R., 2017. The term structure of returns: facts and theory. Journal of Financial Economics 124, 1-21. Journal of

Borovicka, J., Hansen, L.P., Scheinkman, J., 2016. Misspecified recovery. Finance 71, 2493-2544.

Journal of

PT

Binsbergen, J.H.v., Hueskes, W., Koijen, R., Vrugt, E., 2013. Equity yields. Financial Economics 110, 503-519.

CE

Chabi-Yo, F., Colacito, R., 2017. The Term Structures of Co-Entropy in International Financial Markets. Management Science, forthcoming. Chernov, M., Mueller, P., 2012. The term structure of inflation expectations. Financial Economics 106, 367-394.

Journal of

AC

Cochrane, J., 1992. Explaining the variance of price-dividend ratios. Review of Financial Studies 5, 243-280. Cover, T., Thomas, J., 2006. Elements of Information Theory (Second Edition). John Wiley & Sons, New York. Creal, D., Wu, J.C., 2016. Bond Risk Premia in Consumption-Based Models. Unpublished working paper. Chicago Booth School of Business, Chicago, IL.

39

ACCEPTED MANUSCRIPT

Dahlquist, M., Hasseltoft, H., 2013. International bond risk premia. tional Economics 90, 17-32.

Journal of Interna-

Dahlquist, M., Hasseltoft, H., 2016. International bond risk premia. In: Veronesi, P. (Ed.), Handbook of Fixed-Income Securities. Wiley, New York, pp. 171-190.

CR IP T

Dew-Becker, I., Giglio, S., Le, A., Rodriguez, M., 2017. The price of variance risk. Journal of Financial Economics 123, 225-250. Duffee, G., 2002. Term premia and interest rate forecasts in affine models. Finance 57, 405-443.

Journal of

Epstein, L., Zin, S., 1989. Substitution, risk aversion, and the temporal behavior of consumption and asset returns: A theoretical framework. Econometrica 57, 937-969. Garlappi, L., Skoulakis, G., Xue, J., 2015. Do industry portfolios predict the aggregate market? Unpublished working paper. Sauder School of Business, Vancouver, BC.

AN US

Giglio, S., Kelly, B., 2018. Excess volatility: Beyond discount rates. Quarterly Journal of Economics 113, 71-127. Giglio, S., Maggiori, M., Stroebel, J., 2015. Very long-run discount rates. Journal of Economics 130, 1-53.

Quarterly

Gurkaynak, R., Sack, B., Wright, J., 2007. The US Treasury yield curve: 1961 to the present. Journal of Monetary Economics 54, 2291-2304.

M

Gurkaynak, R., Sack, B., Wright, J., 2010. The TIPS yield curve and inflation compensation. American Economic Journal: Macroeconomics 54, 70-92.

ED

Hansen, L.P., 2012. Dynamic value decomposition in stochastic economies. Econometrica 80, 911-967. Hansen, L.P., Heaton, J.C., Li, N., 2008. Consumption strikes back? Measuring long-run risk. Journal of Political Economy 116, 260-302.

PT

Hansen, L.P., Jagannathan, R., 1991. Implications of security market data for models of dynamic economies. Journal of Political Economy 99, 225-262.

CE

Hansen, L.P., Scheinkman, J., 2009. Long term risk: an operator approach. Econometrica 77, 177-234.

AC

Hasler, M., Marfe, R., 2016. Disaster recovery and the term structure of dividend strips. Journal of Financial Economics 122, 116-134 Koijen, R., Lustig, H., Van Nieuwerburgh, S., 2017. The bond risk premium and the cross-section of equity returns. Journal of Monetary Economics 88, 50-69. Kreps, D., Porteus, E., 1978. Temporal resolution of uncertainty and dynamic choice theory. Econometrica 46, 185-200. Lettau, M., Wachter, J., 2007. Why is long-horizon equity less risky? A duration-based explanation of the value premium. Journal of Finance 62, 55-92. 40

ACCEPTED MANUSCRIPT

Lettau, M., Wachter, J., 2011. The term structures of equity and interest rates. of Financial Economics 101, 90-113.

Journal

Longstaff, F., Piazzesi, M., 2004. Corporate earnings and the equity premium. Journal of Financial Economics 74, 401-421.

CR IP T

Lustig, H., Stathopoulos, A., Verdelhan, A., 2018. The term structure of currency carry trade risk premia. Unpublished working paper. Stanford Graduate School of Business, Palo Alto, CA. Martin, I., 2013. Consumption-based asset pricing with higher cumulants. Economic Studies 80, 745-773.

Review of

Merton, R., 1976. Option pricing when underlying stock returns are discontinuous. Journal of Financial Economics 3, 125-144.

AN US

Piazzesi, M., Schneider, M., 2006. Equilibrium yield curves. In: Acemoglu, D., Rogoff, K., Woodford, M. (Eds.), NBER Macroeconomics Annual. MIT Press, Cambridge, pp. 389-472. Rietz, T., 1988. The equity risk premium: A solution. 22, 117-131.

Journal of Monetary Economics

Qin, L., Linetsky, V., 2017. Long term risk: A martingale approach. Econometrica 85, 299-312.

M

Shiller, R.,, 1989. Long term stock, bond, interest rate and consumption data. Spreadsheet posted on http://www.econ.yale.edu/∼shiller/data.htm

ED

Santos, T., Veronesi, P., 2010. Habit formation, the cross section of stock returns and the cash-flow risk puzzle. Journal of Financial Economics 98, 385-413. Wachter, J., 2006. A consumption-based model of the term structure of interest rates. Journal of Financial Economics 79, 365-399.

PT

Wachter, J., 2013. Can time-varying risk of rare disasters explain aggregate stock market volatility? Journal of Finance 68, 987-1035.

CE

Weill, P., 1989. The equity premium puzzle and the risk-free rate puzzle. Journal of Monetary Economics 24, 401-421. Wright, J., 2011. Term premia and inflation uncertainty: Empirical evidence from an international panel dataset. American Economic Review 101, 1514-1534.

AC

Zviadadze, I., 2017. Term structure of consumption risk premia in the cross section of currency returns. Journal of Finance 72, 1529-2017.

41

CR IP T

ACCEPTED MANUSCRIPT

Table 1 Properties of excess dollar returns

Entries are sample moments of quarterly observations of (quarterly) log excess returns: log r − log r1 , where

r is a (gross) return and r1 is the (gross) return on a three-month bond. All of these returns are measured in dollars. Sample periods: US TIPS, 1971-2014 (source: Gurkaynak, Sack, and Wright, 2010; Chernov

AN US

and Mueller, 2012); US nominal bonds, 1971-2014 (source: Gurkaynak, Sack, and Wright, 2007; FRED); Australian nominal bonds, 1987-2014 (source: Reserve Bank of Australia; Wright, 2011); UK nominal bonds, 1979-2014 (source: Bank of England); German nominal bonds, 1973-2014 (source: Bundesbank; Wright, 2011); exchange rate to the USD (source: FRED; EUR was complemented by Deutsche mark (DM), which was converted using the official EUR/DM rate); S&P 500 dividend strips, 1996-2009 (source: Binsbergen, Brandt, and Koijen, 2012). The shortest maturity available for dividend strips is two quarters,

Standard Mean deviation bonds (TIPS) 0.0022 0.0078

ED

Asset Inflation-protected CPI Currencies AUD EUR (Germany) GBP Equity S&P 500 div fut

M

so we extrapolate to one quarter as described in Appendix C.

Excess kurtosis

Entropy, L(rx)

0.1785

0.8223

0.00002

0.0108 −0.0015 −0.0008

0.0677 0.0614 0.0607

−0.5134 0.2748 −0.0816

0.7206 0.6517 1.4681

0.0016 0.0018 0.0015

−0.0159

0.0270

0.7491

0.6273

0.0004

PT

CE AC

Skewness

CR IP T

ACCEPTED MANUSCRIPT

Table 2 Average curves

Entries are means of yields on various assets of various maturities. All of these yields are expressed in decimals, on a quarterly basis. The second line shows the difference in term spreads relative to the US nominal curve. A term spread is defined as the difference between an n−quarter yield and a one-quarter yield. Sample periods: US nominal bonds, 1971-2014 (source: Gurkaynak, Sack, and Wright, 2007; FRED);

AN US

US TIPS, 1971-2014 (source: Gurkaynak, Sack, and Wright, 2010; Chernov and Mueller, 2012); Australian nominal bonds, 1987-2014 (source: Reserve Bank of Australia; Wright, 2011); UK nominal bonds, 1979-2014 (source: Bank of England); German nominal bonds, 1973-2014 (source: Bundesbank; Wright, 2011); two-quarter S&P 500 dividend strips, 1996-2009 (source: Binsbergen, Brandt, and Koijen, 2012); annual S&P 500 dividend futures, 2002-2011 (source: Binsbergen et al., 2013). Dividend strip/futures prices are

Australia

0.0165

Germany

0.0120

UK

0.0168 −0.0072

AC

CE

S&P 500

2 0.0128 0.0043 −0.0003 0.0164 −0.0003 0.0118 −0.0006 0.0173 0.0002 −0.0056 0.0013

4 0.0138 0.0043 −0.0013

ED

1 0.0124 0.0042

PT

Asset or Country US US TIPS

M

not available at the one-quarter horizon, so we extrapolate to one quarter as described in Appendix C.

0.0118 −0.0015 0.0166 −0.0016 −0.0001 0.0061

Maturity, quarters 8 12 20 0.0144 0.0149 0.0157 0.0045 0.0047 0.0052 −0.0017 −0.0020 −0.0023 0.0161 0.0164 0.0170 −0.0021 −0.0024 −0.0029 0.0124 0.0130 0.0139 −0.0015 −0.0014 −0.0014 0.0168 0.0170 0.0175 −0.0020 −0.0022 −0.0026 0.0018 0.0024 0.0035 0.0077 0.0078 0.0081

24 0.0160

28 0.0163 0.0056 −0.0025

0.0142 −0.0014 0.0177 −0.0028 0.0043 0.0084

0.0145 −0.0014 0.0178 −0.0032 0.0048 0.0086

40 0.0169 0.0063 −0.0024 0.0177 −0.0040 0.0151 −0.0014 0.0181 −0.0035

ACCEPTED MANUSCRIPT

Table 3 Calibrated parameters Entries are the model parameters expressed in quarterly terms. With the exception of CPI, θg1 in Panel B is a derived parameter. The first column of Panel C displays expected log excess returns implied by the model in Example 2. The subsequent two columns display the loadings of cash flows and the pricing kernel, respectively, on a normal shock in the extension of this model in Section 5.2. The last two columns

ϕ1 0.9487

CR IP T

highlight differences in covariance and entropy implied by the final model featuring jumps.

Panel A. Common parameters (US nominal economy) θm1 λ0 λ1 ω µm 0.0026 −0.1225 0.0512 0.0025 1.5000

δm 1.5000

ED

M

AN US

Panel B. Asset-specific parameters Asset ϕ2 θg1 θg2 η0 µg Inflation-protected bonds (TIPS) CPI 0.2240 −0.0016 0.0016 0.0077 −0.0438 Currencies AUD 0.9404 −0.0058 0.0029 0.0642 −0.1728 EUR 0.8356 −0.0039 0.0045 0.0254 −0.4365 GBP 0.9664 −0.0056 0.0027 0.0587 0.1765 Equity S&P 500 0.6846 −0.0014 0.0292 −0.0225 −0.0799

AC

CE

PT

Panel C. Some derived quantities Asset E log rxt,t+1 |η2 | |λ2 | covmg Inflation-protected bonds (TIPS) CPI 0.0013 0.0004 3.3067 −0.0001 Currencies AUD 0.0058 0.0188 0.2749 −0.0079 EUR 0.0028 0.0557 0.0491 0.0007 GBP 0.0055 0.0123 0.5014 −0.0069 Equity S&P 500 −0.0030 0.0145 0.8834 0.0042

δg 0.0200 0.0173 1.0268 0.0589 0.3736

Cmg (1) −0.0023 −0.0129 −0.0001 −0.0009 0.0024

Panel D. Parameters from the representative agent model Consumption Preferences θc1 θc2 σ0 σ1 µc δc α ρ 0.0009 −0.0025 −0.0085 −0.0045 −0.1456 0.1500 −9 1/3

ACCEPTED MANUSCRIPT

CR IP T

Table 4 Variance and serial correlation of cash flows

The data module reports summary statistics and the corresponding standard errors in parentheses in the second line. The model module reports population values at the calibrated parameters in the first line. The second line report in parentheses the 2.5th and 97.5th percentiles of the distribution of the respective statistics computed from 100,000 artificial histories of log g simulated from the model at calibrated parameters. We use annual data on dividends expressed in quarterly units. The reason is that dividends are highly seasonal and lumpy, as highlighted in Garlappi, Skoulakis, and Xue (2015). As a result, the Shiller (1989) annual data are an accurate representation of annual dividends, but it is oversmoothing at

AN US

higher frequencies. To match the annual data with the quarterly model, we simulate annual dividends. Consumption data are from quarterly NIPA tables from 1947 to 2014. Variance of consumption growth is matched by construction, so we do not report its sampling characteristics to emphasize this.

AC

CE

PT

ED

M

Data Asset Var×102 AR(1) Inflation-protected bonds (TIPS) CPI 0.0078 0.5889 (0.0008) (0.0625) Currencies AUD 0.3132 0.0598 (0.0422) (0.0950) EUR 0.3748 0.0015 (0.0410) (0.0788) GBP 0.3126 0.1271 (0.0370) (0.0828) Equity S&P 500 0.3829 0.2599 (0.0459) (0.0812) Macro Cons. growth 0.0004 0.0877 – (0.0600)

Model

Var×102

AR(1)

0.0084 (0.0062, 0.0123)

0.2202 (0.0520, 0.3848)

0.44309 (0.3224, 0.5577) 0.3763 (0.0562, 2.9027) 0.3604 (0.2784, 0.4586)

−0.0191 (−0.2018, 0.1625) 0.0655 (−0.0726, 0.2490) −0.0278 (−0.1886, 0.1328)

0.2947 (0.1981, 0.4033)

0.1097 (−0.0819, 0.2961)

0.0004 –

0.0487 (−0.0941, 0.2081)

US TIPS AUS GER UK S&P

10

PT

CE

0

ED

−0.004

M

0.000

AN US

0.004

0.008

CR IP T

ACCEPTED MANUSCRIPT

20 Horizon, quarters

30

40

Fig. 1. The black solid line shows average US nominal term spreads, E(ytn −yt1 ) at different maturities n. The

remaining lines represent the term spread in average excess returns, E(log rxt,t+n − log rxt,t+1 ), measured by differences of average term spreads on several assets relative to US Treasuries, E(b ytn − ybt1 ) − E(ytn − yt1 ).

AC

Data sources are the same as in Table 2.

AN US

0.4 0.3

M

0.2

CE

0

PT

0.0

ED

0.1

Entropy and HJ bounds

0.5

0.6

CR IP T

ACCEPTED MANUSCRIPT

10

20

30

40

Horizon, quarters

Fig. 2. The figure compares how the Hansen-Jagannathan (HJ) bound (purple lines) and entropy, nLm (n), (blue lines) change with horizon in the benchmark iid case (solid lines) and in the two-horizon model (dashed

AC

lines).

CR IP T

ACCEPTED MANUSCRIPT

1.8

1.2 1.0 0.8 covariance

0.6 0.4

ED

0.2 0

1

PT

0.0

AN US

1.4

M

Coentropy and Covariance

1.6

coentropy

2 3 Poisson intensity ω

4

5

CE

Fig. 3. The figure compares coentropy and covariance for the Poisson mixture of bivariate normals described

AC

at the end of Section 5.3. As we vary ω, we adjust δ to hold the variance constant.

ACCEPTED MANUSCRIPT

(A) Cumulants of the pricing kernel (log)

(B) Cumulants of cash flow growth (log) 0

12

CR IP T

-2

10 -4

8

-6 -8

4

-10

2

-12

AN US

6

-14

0

1

2

3

4

5

6

7

8

(C) Joint cumulants (log)

5

-5

2

3

4

5

6

7

8

CE

CF

9 10

3

4

5

6

7

8

9

10

#10-4

PT

-10

2

(D) Contributions to coentropy

ED

0

1

10

M

10

1

9

10 8 9 7 6 4 5 3 PK 1 2

8 6 4 2 0 1

2

3

4

5

CF

6

7

8

9 10

10 8 9 7 6 4 5 3 PK 1 2

Fig. 4. The figure displays marginal cumulants of the jump component of the pricing kernel and the GBP

AC

depreciation rate in panels (A) and (B), respectively. Panels (C) and (D) show how they combine to contribute to coentropy. In contrast to plots in other panels that are depicted on the log scale, the plot in panel (D) is in levels. Further, in contrast to panel (C), it accounts for factorials that strongly discount contribution of high order joint cumulants to coentropy.

ACCEPTED MANUSCRIPT

(A) AUD #10-4

#10-3

0

3

-2

2

-4

1 0

-6

-1

-8

2

AN US

-2

1

1

3

4

5

6

CF

7

8

9 10

ED

0

3

4

CE

2

PT

-2

5

AC

CF

6

7

8

9 10

3

4

5

6

CF

7

8

9 10

10 8 9 7 6 4 5 3 PK 1 2

(D) S&P 500

M

#10-4

-1

2

9 10 7 8 6 5 3 4 2 PK 1

(C) CPI

1

CR IP T

(B) EUR

10 8 9 7 6 4 5 3 PK 1 2

#10-4 4 2 0 -2 -4 1

2

3

4

5

CF

6

7

8

9 10

10 8 9 7 6 4 5 3 PK 1 2

Fig. 5. The figure compares contribution of joint cumulants to coentropies of different assets.

Dec 1999

PT

Jun 1998

CE

Dec 1996

ED

4

M

5

6

AN US

7

CR IP T

ACCEPTED MANUSCRIPT

Jun 2001

Dec 2002

Jun 2004

Dec 2005

Jun 2007

Sep 2008

Fig. 6. The figure displays quarterly dividends (black solid line) and quarterly average of annual dividends (red dashed line). The sample corresponds to the availability of short-term dividend prices in Binsbergen,

AC

Brandt, and Koijen (2012).

Term structures of asset prices and returns

Term structures of asset prices and returns

Recommend Documents