Orthant-based variance decomposition in investment portfolios


ARTICLE IN PRESS

JID: EOR

[m5G;December 6, 2019;23:16]

European Journal of Operational Research xxx (xxxx) xxx

Contents lists available at ScienceDirect

European Journal of Operational Research journal homepage: www.elsevier.com/locate/ejor

Javier Giner

Department of Economics, Accounting and Finance, University of La Laguna, Camino La Hornera s/n, Santa Cruz de Tenerife 38071, Spain

Article info

Article history: Received 27 February 2019; Accepted 14 November 2019; Available online xxx

Keywords: Finance; Mixture; Correlation; Variance decomposition; Bivariate normal distribution

Abstract

A traditional and useful approach in portfolio theory is to consider the total risk of one security partitioned into two components: market risk and specific risk. In this paper, we propose a new variance decomposition based on a four-orthant partitioning of a bivariate normal distribution representing the returns on two-stock portfolios. Four Euclidean quadrants around the central mean point are considered, with their corresponding truncated distributions. We can consider stock pairs as events in which both stocks rise together, both decline, or one rises and the other declines. The questions that arise are what each quadrant contributes to the overall mean return and what each quadrant contributes to the total variance. We consider the mixture of four truncated bivariate normal distributions, where the weighting coefficients coincide with the quadrant probabilities. Through the law of total variance and the first and second moments of each truncated distribution, the requested decomposition formulas are deduced. These results are validated with straightforward simulations. The equations obtained here show a higher variance concentration in diagonal quadrants than would be expected from the probability mass of the subset. These results show that pair trading and low-variance strategies could be better interpreted with this variance decomposition. Finally, a comparison with principal component theory is carried out, showing that greater variance concentration can be found within this orthant scheme. © 2019 Elsevier B.V. All rights reserved.

https://doi.org/10.1016/j.ejor.2019.11.028
0377-2217/© 2019 Elsevier B.V. All rights reserved.

Please cite this article as: J. Giner, Orthant-based variance decomposition in investment portfolios, European Journal of Operational Research, https://doi.org/10.1016/j.ejor.2019.11.028

1. Introduction

In the financial literature, there is growing interest in an alternative approach to studying relations between variables: a general bivariate scheme with special emphasis on behavior in four scenarios, or quadrants, defined by the signs of the variables. The bivariate normal distribution (BND) yields especially interesting outcomes, because a useful interpretative tool exists in Sheppard's formula (Sheppard, 1899). This formula is a clear and simple expression for the joint probability that two variables fall in a given quadrant, which greatly facilitates quantitative analysis and can be very useful for studying the joint behavior of stocks and financial assets.

Acar and Lequeux (1996, 1998) were the first to adopt this orthant scheme, making use of Sheppard's formula, particularly to calculate the expected return and variance of trading rules. They also obtained interesting results such as the correlation between different trading rules. Subsequently, Lundin and Satchell (2002) reformulated the standard variance framework to account for correlations between assets in terms of orthant probabilities in order to measure portfolio risk. Deshpande, Ertley, Lundin and Satchell (2019) and Lundin and Satchell (2016, 2018) have also applied the orthant scheme to the estimation of correlation in different regimes simultaneously, obtaining interesting performance indicators. Moreover, there have been efforts to address the general multivariate problem, but this has not always been analytically tractable, although several advances have been made with numerical expressions for the moments of the truncated multivariate normal distribution (Horrace, 2005; Arismendi, 2013).

Sheppard's formula and, in general, quadrant-based bivariate analysis are quite useful. The most important bivariate information relating two variables is usually summarized by the Pearson correlation coefficient, but its interpretation can be imprecise and difficult to grasp. With Sheppard's formula, it is possible to answer a larger number of specific questions, because a probabilistic interpretation of correlation opens the door to more practical interpretations. Although Sheppard's formula is derived for variables under a BND, Giner, Mendoza and Morini (2018) showed its reasonable validity for financial assets. As a continuation of that work, this paper focuses on variance contributions. The aim is not to search for direct financial applications, which may require more effort and probably exceed the scope of this paper, but to develop a useful theoretical framework that opens the field for the subsequent search for applications. Thus, this is a theoretical work on partitioning the variance between quadrants, as a first step before testing with empirical data and to encourage the search for new applications.

This article is structured as follows. Section 2 presents a mixture approach for partitioning a general probability distribution function (pdf). Section 3 describes in detail the case of four quadrants under a BND. In Section 4, this conceptual framework is used to deduce the decomposition of the variance based on the diagonal quadrants, applying the results to two-stock portfolios, with some numerical examples. Section 5 explores the regressions based on positive and negative diagonal quadrants, with some suggestions and ideas concerning future work, and Section 6 concludes.

2. A mixture based on the whole set of truncated distributions

Vectors and matrices are written in bold, with capital letters for matrices. Thus, for example, variance–covariance matrices appear in bold capitals, although it is common to find scientific papers with matrix symbols not in bold. Suppose two assets whose returns $X$ and $Y$ are represented by the bivariate random variable $\mathbf{r} = (X, Y)'$, where $\mathbf{a}'$ denotes the transpose of vector $\mathbf{a}$. The variable $\mathbf{r}$ follows a bivariate joint probability distribution function $f(\mathbf{r}\mid\theta)$ parameterized by $\theta$. For a general mixture distribution over a two-dimensional feature space, $\mathbb{R}^2$, the general probability distribution function $f(\mathbf{r}\mid\theta)$ can be represented by

$$f(\mathbf{r}\mid\theta) = \sum_{i=1}^{n} p_i\, f_i(\mathbf{r}\mid\theta_i),$$

with mixture weights that obey $\sum_{i=1}^{n} p_i = 1$ and $n$ component probability density functions $f_i(\mathbf{r}\mid\theta_i)$, each conditioned on some kind of restriction represented by the parameter vector $\theta_i$. In mixture distributions, the $k$th moment of each variable, $X$ and $Y$ in our case, has the property

$$E[X^k] = \sum_{i=1}^{n} p_i\, E_i[X^k] \quad\text{and}\quad E[Y^k] = \sum_{i=1}^{n} p_i\, E_i[Y^k],$$

where $E_i$ is the expectation operator for each truncated distribution domain. The product moment also has the property

$$E[XY] = \sum_{i=1}^{n} p_i\, E_i[XY].$$

The return $\mathbf{r} = (X, Y)'$ is characterized by a mean return $\boldsymbol{\mu} = (\mu_x, \mu_y)'$ and a variance–covariance matrix $\boldsymbol{\Sigma}$ formed by the variances $\sigma_x^2$, $\sigma_y^2$ and the covariance $\sigma_{xy} = \rho\sigma_x\sigma_y$, with $\rho$ being the coefficient of correlation. Each component probability distribution function, $i = 1, \ldots, n$, is characterized by a subpopulation mean $\boldsymbol{\mu}_i = (\mu_{x,i}, \mu_{y,i})'$ and a covariance matrix $\boldsymbol{\Sigma}_i$, with variances $\sigma_{x,i}^2$, $\sigma_{y,i}^2$ and covariance $\sigma_{xy,i}$.

From now on a major assumption is made. Mixture model components usually belong to the same parametric family of distributions but with different parameters, for example, a normal probability distribution where each component has a different mean or variance. Instead, we consider components that result from breaking down the domain of the original probability distribution function into $n$ truncated probability functions, each defined only on its corresponding subdomain, such that together they completely cover the overall domain. We make this assumption because we are interested in seeing how each component influences the overall function, and how the first and second moments of the overall population can be decomposed into different components. It must be noted that, in this special kind of mixture, each truncated probability distribution function applies to a specific domain, so component probability function and domain refer to the same concept.

With this assumption, there is an interesting general result that makes subsequent calculations easier. Since all the components belong to the same parametric family of distributions, it is appropriate to consider standardized variables $U$ and $V$ (zero mean and unit variance), with $X = \mu_x + \sigma_x U$ and $Y = \mu_y + \sigma_y V$. Then, it is straightforward to obtain the $i$-component means, variances and covariance expressed in terms of the standardized moments of each truncated distribution:

$$\boldsymbol{\mu}_i = \begin{pmatrix} \mu_{x,i} \\ \mu_{y,i} \end{pmatrix} = \begin{pmatrix} \mu_x + \sigma_x m_{u,i} \\ \mu_y + \sigma_y m_{v,i} \end{pmatrix} = \boldsymbol{\mu} + \Delta\boldsymbol{\mu}_i, \quad\text{with}\quad \Delta\boldsymbol{\mu}_i = \begin{pmatrix} \sigma_x m_{u,i} \\ \sigma_y m_{v,i} \end{pmatrix}, \tag{2.1}$$

$$\boldsymbol{\Sigma}_i = \begin{pmatrix} \sigma_{x,i}^2 & \sigma_{xy,i} \\ \sigma_{xy,i} & \sigma_{y,i}^2 \end{pmatrix} = \begin{pmatrix} \sigma_x^2 (m_{uu,i} - m_{u,i}^2) & \sigma_x\sigma_y (m_{uv,i} - m_{u,i} m_{v,i}) \\ \sigma_x\sigma_y (m_{uv,i} - m_{u,i} m_{v,i}) & \sigma_y^2 (m_{vv,i} - m_{v,i}^2) \end{pmatrix} = \boldsymbol{\Omega}_i - \Delta\boldsymbol{\mu}_i\, \Delta\boldsymbol{\mu}_i', \quad\text{with}\quad \boldsymbol{\Omega}_i = \begin{pmatrix} \sigma_x^2 m_{uu,i} & \sigma_x\sigma_y m_{uv,i} \\ \sigma_x\sigma_y m_{uv,i} & \sigma_y^2 m_{vv,i} \end{pmatrix}. \tag{2.2}$$

The terms $m_{u,i}$ and $m_{v,i}$ are the first-order moments of the $i$th truncated distribution of the standardized variables $U$ and $V$. In the same way, $m_{uu,i}$ and $m_{vv,i}$ are the standardized second-order moments and $m_{uv,i}$ is the product moment of both variables, all taken about the origin. Eq. (2.1) says that the intra-mean of component $i$ is the overall mean return plus a marginal mean return, which is the first-order moment of the standardized truncated distribution scaled by the standard deviation. For example, the $x$-variate mean return of component $i$, $\mu_{x,i} = \mu_x + \Delta\mu_{x,i}$, is the overall $x$-mean return plus the marginal mean return contribution $\Delta\mu_{x,i} = \mu_{x,i} - \mu_x = \sigma_x m_{u,i}$. Eq. (2.2) specifies that the intra-variance of component $i$ is obtained from the moments of the standardized truncated distribution, scaled by the overall variance. For example, $\sigma_{x,i}^2 = E_i[X^2] - E_i[X]^2 = \sigma_x^2 (m_{uu,i} - m_{u,i}^2)$, because $E_i[X^2] = E_i[(\mu_x + \sigma_x U)^2] = \mu_x^2 + 2\mu_x\sigma_x m_{u,i} + \sigma_x^2 m_{uu,i}$ and $E_i[X]^2 = (\mu_x + \sigma_x m_{u,i})^2 = \mu_x^2 + 2\mu_x\sigma_x m_{u,i} + \sigma_x^2 m_{u,i}^2$. These first statements let us decompose the total mean and variance into the contributions of each component $i$.

2.1. Orthant contributions to the overall mean return

Using the moment properties of mixtures and Eq. (2.1), the mean return decomposition is

$$\boldsymbol{\mu} = \sum_{i=1}^{n} p_i \boldsymbol{\mu}_i = \sum_{i=1}^{n} p_i (\boldsymbol{\mu} + \Delta\boldsymbol{\mu}_i) = \sum_{i=1}^{n} p_i \boldsymbol{\mu} + \sum_{i=1}^{n} p_i\, \Delta\boldsymbol{\mu}_i = \boldsymbol{\mu}, \tag{2.3}$$

where it is shown that the weighted sum of the marginal mean returns of the $n$ components is zero: they cancel each other out. Equivalently, as $\Delta\boldsymbol{\mu}_i = (\sigma_x m_{u,i}, \sigma_y m_{v,i})'$, the weighted sum of the standardized first-order moments of the $n$ components is zero.

2.2. Orthant contributions to the overall variance

Since the probability function is a weighted average of $n$ truncated bivariate distributions, the mean and other higher moments are the weighted averages of the truncated bivariate means and higher moments. However, the variance of the mixture is not the weighted average of the truncated variances. Therefore, the law of total variance (LTV) must be used,

$$\mathrm{Var}(\mathbf{r}) = E[\mathrm{Var}(\mathbf{r}\mid\theta_i)] + \mathrm{Var}[E(\mathbf{r}\mid\theta_i)],$$

where it can be seen that the overall variance comes from two sources: the mean of the intra-group variances plus the variance of the inter-group means. Applying the LTV and Eq. (2.2), the overall covariance matrix $\boldsymbol{\Sigma}$ can be obtained as

$$\boldsymbol{\Sigma} = \sum_{i=1}^{n} p_i (\boldsymbol{\Omega}_i - \Delta\boldsymbol{\mu}_i\, \Delta\boldsymbol{\mu}_i') + \sum_{i=1}^{n} p_i (\boldsymbol{\mu}_i - \boldsymbol{\mu})(\boldsymbol{\mu}_i - \boldsymbol{\mu})' = \sum_{i=1}^{n} p_i \boldsymbol{\Omega}_i. \tag{2.4}$$

Eq. (2.4) is a great simplification with respect to the LTV, because instead of $2n$ summands with unclear component identification, only $n$ summands are needed, and each is uniquely related to an individual component, so each $p_i \boldsymbol{\Omega}_i$ term can be identified as a net $i$-component variance contribution. This simplification will provide further interesting results. To see the results of Eq. (2.4) in more detail, let us focus only on the overall $x$-variance contributions,

$$\sigma_x^2 = E[X^2] - \mu_x^2 = \sum_{i=1}^{n} p_i E_i[X^2] - \mu_x^2 = \sum_{i=1}^{n} p_i (\mu_x^2 + 2\mu_x\sigma_x m_{u,i} + \sigma_x^2 m_{uu,i}) - \mu_x^2 = \sigma_x^2 \sum_{i=1}^{n} p_i\, m_{uu,i},$$

where the result equivalent to Eq. (2.4) is obtained using the standardization $X = \mu_x + \sigma_x U$ and the previous result that the weighted sum of the $n$ standardized first-order moments equals zero, as seen in Eq. (2.3). In effect, the LTV is just a shortcut linking the overall variance with the intra-group variances.

We now have a first version of the variance decomposition, in which the variance contribution of each component is related only to the $i$-component term. Additionally, the variance of a variable can be interpreted as the sum of $n$ net contributions, each being the standardized second-order moment $m_{uu,i}$ weighted by the probability $p_i$ of the partition and by the variance $\sigma_x^2$. Note that, in standardized terms, the $n$ net contributions $p_i m_{uu,i}$ must always sum to 1. As expected, Eq. (2.4) is independent of the total mean return vector $\boldsymbol{\mu}$, showing invariance to shifts in location. The second-order moment $\sigma_x^2 m_{uu,i}$ can be designated as an unweighted contribution to the overall variance, in the same way that each component makes an unweighted contribution to the total covariance matrix equal to

$$\boldsymbol{\Omega}_i = \begin{pmatrix} \sigma_x^2 m_{uu,i} & \sigma_x\sigma_y m_{uv,i} \\ \sigma_x\sigma_y m_{uv,i} & \sigma_y^2 m_{vv,i} \end{pmatrix}, \tag{2.5}$$

with $\boldsymbol{\Sigma} = p_1 \boldsymbol{\Omega}_1 + p_2 \boldsymbol{\Omega}_2 + \cdots + p_n \boldsymbol{\Omega}_n$ being the total variance–covariance matrix retrieved from the components. $\boldsymbol{\Omega}_i$ in Eq. (2.5) can be called the $i$-component second-order moment matrix. In the same way, $\boldsymbol{\mu}_i$ can be named the $i$-component unweighted contribution to the total mean, or the $i$-component first-order moment vector. The weighted sum of the $i$-component second-order moment matrices, the net contributions, equals the total covariance matrix. It should be noted that the variance–covariance matrix of component $i$ is

$$\boldsymbol{\Sigma}_i = \begin{pmatrix} \sigma_x^2 (m_{uu,i} - m_{u,i}^2) & \sigma_x\sigma_y (m_{uv,i} - m_{u,i} m_{v,i}) \\ \sigma_x\sigma_y (m_{uv,i} - m_{u,i} m_{v,i}) & \sigma_y^2 (m_{vv,i} - m_{v,i}^2) \end{pmatrix}, \tag{2.6}$$

but these covariance matrices are not additive, so they do not yield a correct variance decomposition. When a weighted average of the component matrices is taken, according to the LTV, it must be corrected with the inter-group variance to retrieve the original covariance matrix. Instead, Eq. (2.4) shows that the second-order moment matrices are enough to obtain these specific variance contributions, and their weighted sum recovers the overall covariance matrix. The results of Section 2 are applicable to any distribution and any truncation pattern. We emphasize that only the second moments about the origin of the standardized truncated distribution functions are needed to obtain the variance contributions.
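Because the Section 2 identities hold for any distribution and any truncation pattern, they can be checked on an arbitrary finite sample treated as the whole population. The following is a minimal sketch under that assumption; the function names and the synthetic data are illustrative and do not come from the paper.

```python
import random


def quadrant_index(u, v):
    """Return 0..3 for quadrants I..IV around the population mean."""
    if u >= 0 and v >= 0:
        return 0
    if u < 0 and v >= 0:
        return 1
    if u < 0 and v < 0:
        return 2
    return 3


def net_contributions(xs, ys):
    """Per-quadrant (p_i, m_u_i, m_uu_i) for the standardized x-variate."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sx = (sum((x - mx) ** 2 for x in xs) / n) ** 0.5  # population std
    counts = [0] * 4
    sum_u = [0.0] * 4
    sum_uu = [0.0] * 4
    for x, y in zip(xs, ys):
        u = (x - mx) / sx
        i = quadrant_index(u, y - my)  # sign of y - my equals sign of v
        counts[i] += 1
        sum_u[i] += u
        sum_uu[i] += u * u
    out = []
    for i in range(4):
        p = counts[i] / n
        m_u = sum_u[i] / counts[i] if counts[i] else 0.0
        m_uu = sum_uu[i] / counts[i] if counts[i] else 0.0
        out.append((p, m_u, m_uu))
    return out


# Hypothetical synthetic returns, used only to exercise the identities.
rng = random.Random(0)
xs = [rng.gauss(0.01, 0.2) for _ in range(5000)]
ys = [0.6 * x + rng.gauss(0.0, 0.1) for x in xs]

parts = net_contributions(xs, ys)
sum_first = sum(p * m_u for p, m_u, _ in parts)     # Eq. (2.3): zero
sum_second = sum(p * m_uu for p, _, m_uu in parts)  # Eq. (2.4), standardized: one
```

The two sums reproduce, up to floating-point error, the statements that the weighted first-order moments cancel and that the standardized net variance contributions add to one, without any reference to the inter-group term of the LTV.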

3. Bivariate normal distribution

In this paper, a bivariate normal distribution (BND) is considered, and this distribution is broken down into four quadrants centered on the mean, so the previous section must be particularized to a mixture of four truncated bivariate normal distributions (TBND). Thus, the mixture model that relates the overall pdf to the four components is

$$f(x, y) = p_1 f_1(x, y) + p_2 f_2(x, y) + p_3 f_3(x, y) + p_4 f_4(x, y),$$

where $f_i(x, y)$ is the pdf of the truncated bivariate normal distribution (domain restricted to quadrant $i$): $f_1(x, y) = f(x, y \mid X \ge \mu_x, Y \ge \mu_y)$, $f_2(x, y) = f(x, y \mid X < \mu_x, Y \ge \mu_y)$, $f_3(x, y) = f(x, y \mid X < \mu_x, Y < \mu_y)$ and $f_4(x, y) = f(x, y \mid X \ge \mu_x, Y < \mu_y)$. The weighting coefficients satisfy $0 \le p_i \le 1$, with $\sum_{i=1}^{4} p_i = 1$. The domain splitting requires the corresponding quadrant probabilities, with $p_i$ being the proportion of subpopulation $i$ in the overall population. Since we are dealing with a BND split into four quadrants around the mean, they coincide with Sheppard's probabilities (Johnson & Kotz, 1971):

$$p_1 = p_3 = \frac{1}{4} + \frac{1}{2\pi}\arcsin(\rho) \quad\text{and}\quad p_2 = p_4 = \frac{1}{4} - \frac{1}{2\pi}\arcsin(\rho).$$
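As a quick numerical illustration of Sheppard's probabilities, the sketch below evaluates them for a given correlation. The function name `sheppard_probabilities` is hypothetical and chosen only for this example.

```python
import math


def sheppard_probabilities(rho):
    """Return (p1, p2): p1 = p3 for same-sign quadrants, p2 = p4 otherwise."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    shift = math.asin(rho) / (2.0 * math.pi)
    return 0.25 + shift, 0.25 - shift


p1, p2 = sheppard_probabilities(0.5)  # arcsin(0.5) = pi/6, so p1 = 1/3, p2 = 1/6
```

With zero correlation each quadrant has probability 1/4; as the correlation grows, mass moves from quadrants II and IV into quadrants I and III.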

In the following subsections, Eqs. (2.1)–(2.4) will be particularized for the BND case with this four-quadrant splitting scheme.

3.1. Mean contributions

We are interested in the mean and variance contribution of each quadrant. These can be determined using Eqs. (2.1) and (2.2), but we need to know the first and second moments of each TBND. What is each quadrant's contribution to the mean return? Eq. (2.1) shows that the orthant contribution requires the first-order moment of the truncated bivariate normal distribution, specifically the distribution singly truncated with respect to both variables. Rosenbaum (1961) derives the first- and second-order moments of a truncated standardized bivariate normal distribution, so we just need to particularize those expressions to our problem. The detailed solution can be seen in Appendix A. The four quadrant means are:

$$\boldsymbol{\mu}_1 = \boldsymbol{\mu} + \Delta\boldsymbol{\mu}_1 = \boldsymbol{\mu} + \sqrt{\frac{2}{\pi}}\, \frac{1+\rho}{4 p_1} \begin{pmatrix} \sigma_x \\ \sigma_y \end{pmatrix}, \qquad \boldsymbol{\mu}_3 = \boldsymbol{\mu} - \Delta\boldsymbol{\mu}_1, \tag{3.1a}$$

$$\boldsymbol{\mu}_2 = \boldsymbol{\mu} + \Delta\boldsymbol{\mu}_2 = \boldsymbol{\mu} + \sqrt{\frac{2}{\pi}}\, \frac{1-\rho}{4 p_2} \begin{pmatrix} -\sigma_x \\ \sigma_y \end{pmatrix}, \qquad \boldsymbol{\mu}_4 = \boldsymbol{\mu} - \Delta\boldsymbol{\mu}_2. \tag{3.1b}$$

Following Eq. (2.3), these four mean contributions can be gathered into the overall mean return, allowing a proper mean return decomposition in terms of the net and marginal mean return contribution of each quadrant:

$$\boldsymbol{\mu} = \sum_{i=1}^{4} p_i \boldsymbol{\mu}_i = \sum_{i=1}^{4} p_i (\boldsymbol{\mu} + \Delta\boldsymbol{\mu}_i) = \boldsymbol{\mu} + \sum_{i=1}^{4} p_i\, \Delta\boldsymbol{\mu}_i,$$

where we have two parts: pure contribution to the total plus marginal contribution to the total. The first term sums to the total mean return and the second term sums to zero, but each component is very important when considered separately. A similar result was already presented in Giner et al. (2018); it is included here for the sake of completeness and with clearer vector notation.

3.2. Variance contributions

As we are particularizing to the BND, the second moments about the origin of the standardized truncated bivariate distributions are needed, specifically those of the quadrant-truncated distributions around the overall mean return vector $\boldsymbol{\mu}$. These values are obtained following Rosenbaum (1961); the derivation is shown in Appendix B. The required quadrant second-order moment matrices are:

$$\boldsymbol{\Omega}_1 = \boldsymbol{\Omega}_3 = \begin{pmatrix} \sigma_x^2 + \sigma_x^2\, \dfrac{\rho\sqrt{1-\rho^2}}{2\pi p_1} & \sigma_{xy} + \sigma_x\sigma_y\, \dfrac{\sqrt{1-\rho^2}}{2\pi p_1} \\ \sigma_{xy} + \sigma_x\sigma_y\, \dfrac{\sqrt{1-\rho^2}}{2\pi p_1} & \sigma_y^2 + \sigma_y^2\, \dfrac{\rho\sqrt{1-\rho^2}}{2\pi p_1} \end{pmatrix} = \boldsymbol{\Sigma} + \Delta\boldsymbol{\Omega}_1, \tag{3.2a}$$

$$\boldsymbol{\Omega}_2 = \boldsymbol{\Omega}_4 = \begin{pmatrix} \sigma_x^2 - \sigma_x^2\, \dfrac{\rho\sqrt{1-\rho^2}}{2\pi p_2} & \sigma_{xy} - \sigma_x\sigma_y\, \dfrac{\sqrt{1-\rho^2}}{2\pi p_2} \\ \sigma_{xy} - \sigma_x\sigma_y\, \dfrac{\sqrt{1-\rho^2}}{2\pi p_2} & \sigma_y^2 - \sigma_y^2\, \dfrac{\rho\sqrt{1-\rho^2}}{2\pi p_2} \end{pmatrix} = \boldsymbol{\Sigma} + \Delta\boldsymbol{\Omega}_2. \tag{3.2b}$$

In the same way as the quadrant first-order moment vector is $\boldsymbol{\mu}_i = \boldsymbol{\mu} + \Delta\boldsymbol{\mu}_i$, the total mean return plus an increment, the quadrant second-order moment matrix is $\boldsymbol{\Omega}_i = \boldsymbol{\Sigma} + \Delta\boldsymbol{\Omega}_i$, the total variance plus an increment. Using Eq. (2.4), $\boldsymbol{\Sigma} = p_1\boldsymbol{\Omega}_1 + p_2\boldsymbol{\Omega}_2 + p_3\boldsymbol{\Omega}_3 + p_4\boldsymbol{\Omega}_4$, and applying Eqs. (3.2a) and (3.2b), the total covariance matrix is retrieved, $\boldsymbol{\Sigma} + 2p_1\Delta\boldsymbol{\Omega}_1 + 2p_2\Delta\boldsymbol{\Omega}_2 = \boldsymbol{\Sigma}$, as expected. The incremental variance contributions, which satisfy $p_2\Delta\boldsymbol{\Omega}_2 = -p_1\Delta\boldsymbol{\Omega}_1$, are crucial:

$$p_1\Delta\boldsymbol{\Omega}_1 = \frac{\sqrt{1-\rho^2}}{2\pi} \begin{pmatrix} \sigma_x^2\rho & \sigma_x\sigma_y \\ \sigma_x\sigma_y & \sigma_y^2\rho \end{pmatrix} \quad\text{and}\quad p_2\Delta\boldsymbol{\Omega}_2 = -\frac{\sqrt{1-\rho^2}}{2\pi} \begin{pmatrix} \sigma_x^2\rho & \sigma_x\sigma_y \\ \sigma_x\sigma_y & \sigma_y^2\rho \end{pmatrix}. \tag{3.3}$$

At this point, it is convenient to analyze the composition of these quantities in more detail. For example, the $x$-variance can be seen as four contributions,

$$\sigma_x^2 = \sum_{i=1}^{4} p_i\, \sigma_x^2 m_{uu,i} = \sigma_x^2 \left[ \left(p_1 + \frac{\rho\sqrt{1-\rho^2}}{2\pi}\right) + \left(p_2 - \frac{\rho\sqrt{1-\rho^2}}{2\pi}\right) + \left(p_3 + \frac{\rho\sqrt{1-\rho^2}}{2\pi}\right) + \left(p_4 - \frac{\rho\sqrt{1-\rho^2}}{2\pi}\right) \right].$$

Simplifying by grouping quadrants I+III and II+IV (diagonal quadrants):

$$\sigma_x^2 = \sigma_x^2 \left[ 2p_1 m_{uu,1} + 2p_2 m_{uu,2} \right] = \sigma_x^2 \left[ \left(2p_1 + \frac{\rho\sqrt{1-\rho^2}}{\pi}\right) + \left(2p_2 - \frac{\rho\sqrt{1-\rho^2}}{\pi}\right) \right]. \tag{3.4}$$
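The closed-form quadrant moments can be checked against the mixture identities of Section 2. The sketch below is an illustration only: it assumes the standardized moments read off Eqs. (3.1) and (3.2), restricts the correlation to the open interval (-1, 1) so that no quadrant probability is zero, and uses hypothetical helper names.

```python
import math


def quadrant_moments(rho):
    """Standardized moments of quadrants I..IV of a BND, per Eqs. (3.1)-(3.2)."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    shift = math.asin(rho) / (2.0 * math.pi)
    p1, p2 = 0.25 + shift, 0.25 - shift
    root = math.sqrt(1.0 - rho * rho)
    c = math.sqrt(2.0 / math.pi)
    d1 = c * (1.0 + rho) / (4.0 * p1)  # first moment, same-sign quadrants
    d2 = c * (1.0 - rho) / (4.0 * p2)  # first moment, opposite-sign quadrants
    uu1 = 1.0 + rho * root / (2.0 * math.pi * p1)
    uu2 = 1.0 - rho * root / (2.0 * math.pi * p2)
    uv1 = rho + root / (2.0 * math.pi * p1)
    uv2 = rho - root / (2.0 * math.pi * p2)
    return [
        {"p": p1, "m_u": d1, "m_v": d1, "m_uu": uu1, "m_uv": uv1},    # I
        {"p": p2, "m_u": -d2, "m_v": d2, "m_uu": uu2, "m_uv": uv2},   # II
        {"p": p1, "m_u": -d1, "m_v": -d1, "m_uu": uu1, "m_uv": uv1},  # III
        {"p": p2, "m_u": d2, "m_v": -d2, "m_uu": uu2, "m_uv": uv2},   # IV
    ]


def weighted_sum(quadrants, key):
    """Probability-weighted sum of one moment over the four quadrants."""
    return sum(q["p"] * q[key] for q in quadrants)


quads = quadrant_moments(0.5)
```

For any admissible correlation, the weighted first moments should cancel, the weighted second moments of $U$ should add to one, and the weighted product moments should return the correlation itself, which is the content of Eqs. (2.3) and (2.4) in standardized form.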

Eq. (3.4) enables an interesting interpretation. Some quadrants add variance, I and III; these are variance sources. Others cancel these values, II and IV; these are variance holes. It is very important not to confuse the variance of a quadrant (these cannot be added) with the contribution of that quadrant to the variance (these can be added). One could argue that there is no use in finding that something can be split: 7 equals 4 plus 3, or 5 plus 2, and one can always write x as a plus x − a. But in this particular case of diagonal quadrant grouping, the variance of each group is well defined and independent of the other group, in the sense that the events of one group are unrelated to those of the other, and adding both variances, weighted by the group probabilities, gives the overall variance. With diagonal grouping, the component contribution variances are intra-variances themselves because there is no inter-group variance, since each component mean matches the overall mean. This important advantage is exploited in the next section.

4. Positive and negative quadrant decomposition

The quadrant splitting results are somewhat disappointing, since the quadrant variance contributions are not real variances; they are only contributions, with no real meaning as individual items. Recall that total variance is the mean of the intra-variances (n components) plus the variance of the means (n components); that is, the mean variance corrected by the variance of the means. We propose a simple solution to this problem: studying the combined result of quadrants I+III against II+IV. For this combination of quadrants, a correct variance decomposition can be obtained. As a first step, the specific domain variables must be defined.
With $(X_i, Y_i)$ being the variable in quadrant $i$, we can group I+III (same sign, positive diagonal) and II+IV (different sign, negative diagonal) and define the positive and negative subsets:

$$(X_P, Y_P) = (X_1, Y_1) \cup (X_3, Y_3), \qquad (X_N, Y_N) = (X_2, Y_2) \cup (X_4, Y_4).$$

We refer to this type of quadrant combination as diagonal quadrant combination. Using the mean return decomposition of Section 3.1, both variables have the same mean return $\boldsymbol{\mu}$ as the overall population. For example, in the positive diagonal P, the positive excess mean return of quadrant I is cancelled by the negative excess mean return of quadrant III. So:

$$\boldsymbol{\mu}_P = \begin{pmatrix} E_P[X] \\ E_P[Y] \end{pmatrix} = \begin{pmatrix} \mu_x \\ \mu_y \end{pmatrix} = \boldsymbol{\mu} \quad\text{and}\quad \boldsymbol{\mu}_N = \begin{pmatrix} E_N[X] \\ E_N[Y] \end{pmatrix} = \begin{pmatrix} \mu_x \\ \mu_y \end{pmatrix} = \boldsymbol{\mu}.$$

This must be emphasized because it is the reason why the contribution matrices equal their own variance matrices: if the subset mean equals the overall mean, the LTV is greatly simplified.

4.1. Diagonal quadrant variance decomposition

Quadrants I and III show the same variance contribution, in terms of probability mass and standardized second-order moment, as do quadrants II and IV. Using the Section 3.2 results on variance decomposition, the corresponding diagonal variances and covariances can be obtained.


Let $\mathbf{r}_P = (X_P, Y_P)'$ be the positive diagonal subset. Applying the LTV, its covariance matrix is

$$\boldsymbol{\Sigma}_P = 0.5(\boldsymbol{\Omega}_1 - \Delta\boldsymbol{\mu}_1 \Delta\boldsymbol{\mu}_1') + 0.5(\boldsymbol{\Omega}_3 - \Delta\boldsymbol{\mu}_3 \Delta\boldsymbol{\mu}_3') + 0.5(\boldsymbol{\mu}_1 - \boldsymbol{\mu})(\boldsymbol{\mu}_1 - \boldsymbol{\mu})' + 0.5(\boldsymbol{\mu}_3 - \boldsymbol{\mu})(\boldsymbol{\mu}_3 - \boldsymbol{\mu})' = 0.5\,\boldsymbol{\Omega}_1 + 0.5\,\boldsymbol{\Omega}_3 = \boldsymbol{\Omega}_1,$$

because each subpopulation (I and III) has the same proportion in the combined population (I+III). In the same way, the variance–covariance matrix of $\mathbf{r}_N = (X_N, Y_N)'$ is $\boldsymbol{\Sigma}_N = 0.5\,\boldsymbol{\Omega}_2 + 0.5\,\boldsymbol{\Omega}_4 = \boldsymbol{\Omega}_2$. Each of these diagonal subpopulations has a variance–covariance matrix equal to the corresponding second-order moment matrix in Eq. (3.2):

$$\boldsymbol{\Sigma}_P = \boldsymbol{\Omega}_P = \boldsymbol{\Omega}_1 = \boldsymbol{\Sigma} + \Delta\boldsymbol{\Omega}_1 \quad\text{and}\quad \boldsymbol{\Sigma}_N = \boldsymbol{\Omega}_N = \boldsymbol{\Omega}_2 = \boldsymbol{\Sigma} + \Delta\boldsymbol{\Omega}_2.$$

In this case, the fact that the mean returns of $\mathbf{r}_P$ and $\mathbf{r}_N$ both coincide with the overall mean return simplifies the variance derivation. As $(X, Y) = (X_P, Y_P) \cup (X_N, Y_N)$ is the union of two subpopulations with proportions $2p_1$ and $2p_2$, or equivalently a mixture of two components, applying the general Eq. (2.4) gives the overall matrix

$$\boldsymbol{\Sigma} = 2p_1 \boldsymbol{\Omega}_P + 2p_2 \boldsymbol{\Omega}_N = 2p_1 \boldsymbol{\Sigma}_P + 2p_2 \boldsymbol{\Sigma}_N, \tag{4.1}$$

where, fortunately, the variance contributions are real variances, not only contributions, so an interesting variance decomposition scheme can be developed. For a better perspective, let us focus on a single variate, for example $X = X_P \cup X_N$. Using Eq. (3.2), the variances of $X_P$ and $X_N$ are:

$$\sigma_{X_P}^2 = \sigma_x^2 m_{uu,1} = \sigma_x^2 + \frac{\rho\sqrt{1-\rho^2}}{2 p_1 \pi}\,\sigma_x^2 \quad\text{and}\quad \sigma_{X_N}^2 = \sigma_x^2 m_{uu,2} = \sigma_x^2 - \frac{\rho\sqrt{1-\rho^2}}{2 p_2 \pi}\,\sigma_x^2.$$

It is important to emphasize again that these are not contributions to the variance; they are strictly variances. Finally, the following variance decomposition of variate $X$ can be considered:

$$\sigma_x^2 = 2p_1 \sigma_{X_P}^2 + 2p_2 \sigma_{X_N}^2 = \left[ 2p_1 \sigma_x^2 + \frac{\rho\sqrt{1-\rho^2}}{\pi}\,\sigma_x^2 \right] + \left[ 2p_2 \sigma_x^2 - \frac{\rho\sqrt{1-\rho^2}}{\pi}\,\sigma_x^2 \right]. \tag{4.2}$$

This is an interesting result: a two-term variance decomposition in which each term is a proper variance in itself. In addition, in proportional terms this expression is a function only of the correlation parameter, leaving aside the overall variance scale factor. In the same way:

$$\sigma_y^2 = 2p_1 \sigma_{Y_P}^2 + 2p_2 \sigma_{Y_N}^2 = \left[ 2p_1 \sigma_y^2 + \frac{\rho\sqrt{1-\rho^2}}{\pi}\,\sigma_y^2 \right] + \left[ 2p_2 \sigma_y^2 - \frac{\rho\sqrt{1-\rho^2}}{\pi}\,\sigma_y^2 \right],$$

$$\sigma_{xy} = 2p_1 \sigma_{X_P,Y_P} + 2p_2 \sigma_{X_N,Y_N} = \left[ 2p_1 \sigma_{xy} + \frac{\sqrt{1-\rho^2}}{\pi}\,\sigma_x\sigma_y \right] + \left[ 2p_2 \sigma_{xy} - \frac{\sqrt{1-\rho^2}}{\pi}\,\sigma_x\sigma_y \right].$$

We wish to emphasize that the correlation coefficient is a key value in these formulas. Correlation is not neutral information: knowing it provides a great deal of information. If you know how large the correlation is, you can estimate the value of one variable given the value of another. Correlation can determine, almost completely, the value of one variable as a function of another. For example, the linear regression of $X$ on $Y$ gives $X$ as a function of $Y$ (variable $X$ is given when $Y$ takes a determined value),

$$X = \mu_x + \rho\,\frac{\sigma_x}{\sigma_y}\,(Y - \mu_y) + \varepsilon_x.$$

In finance, it is common to use expressions like the market variance decomposition:

$$\sigma_x^2 = \sigma_{\text{Systematic}}^2 + \sigma_{\text{Specific}}^2.$$

This leads to many interesting financial applications. So, perhaps, in the same way as for the market model, other applications can be found for the diagonal decomposition. The market variance decomposition is just a consequence of OLS regression properties. Market risk relations are frequently presented in terms of beta, which in this context is $\beta = \rho\sigma_x/\sigma_y$, so $\sigma_x^2 = \beta^2\sigma_y^2 + \sigma_\varepsilon^2$ or $\sigma_x^2 = \rho^2\sigma_x^2 + \sigma_\varepsilon^2$, and then

$$\sigma_x^2 = \rho^2\sigma_x^2 + (1 - \rho^2)\,\sigma_x^2. \tag{4.3}$$

Moreover, $R^2 = \rho^2$ is the coefficient of determination, which gives the proportion of the variance explained by the independent variable. The OLS risk decomposition, like the diagonal risk decomposition, is a function of $\rho$ alone (at least in proportional terms, omitting the $\sigma_x$ scale factor). In the case of the diagonal risk decomposition, what has been obtained is

$$\sigma_x^2 = 2p_1 \sigma_{X_P}^2 + 2p_2 \sigma_{X_N}^2. \tag{4.4}$$

Eq. (4.4) highlights the parallelism with Eq. (4.3): there exists a pure variance decomposition. This expression gives the proportion of variance (POV) corresponding to events with the same sign (++ or −−), $\mathrm{POV}_{X_P}$, and the proportion of variance corresponding to events with different signs (+− or −+), $\mathrm{POV}_{X_N}$. Dividing expression (4.2) by $\sigma_x^2$ gives these proportions:

$$\mathrm{POV}_{X_P} = \frac{2p_1 \sigma_{X_P}^2}{\sigma_x^2} = 2p_1 + \frac{\rho\sqrt{1-\rho^2}}{\pi} \quad\text{and}\quad \mathrm{POV}_{X_N} = \frac{2p_2 \sigma_{X_N}^2}{\sigma_x^2} = 2p_2 - \frac{\rho\sqrt{1-\rho^2}}{\pi}. \tag{4.5}$$

In Fig. 1, both proportions are presented for any correlation coefficient.

Fig. 1. Variance decomposition, Eq. (4.5): $\sigma_x^2$ is decomposed into two proportions of variance, the variance of quadrants I+III, $\mathrm{POV}_{X_P}$, and the variance of quadrants II+IV, $\mathrm{POV}_{X_N}$.
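The proportion of variance on the positive diagonal can be evaluated directly from Eq. (4.5) and compared with a simple simulation. The sketch below is illustrative only: the function names, the sample size and the seed are assumptions, and the simulated figure is an estimate rather than an exact value.

```python
import math
import random


def pov_positive(rho):
    """Closed-form POV of the positive diagonal, Eq. (4.5)."""
    p1 = 0.25 + math.asin(rho) / (2.0 * math.pi)
    psi = rho * math.sqrt(1.0 - rho * rho) / math.pi
    return 2.0 * p1 + psi


def simulate_pov_positive(rho, n, seed):
    """Share of the sum of U^2 that falls on same-sign (U, V) draws."""
    rng = random.Random(seed)
    root = math.sqrt(1.0 - rho * rho)
    same_sign = 0.0
    total = 0.0
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)
        v = rho * u + root * rng.gauss(0.0, 1.0)  # standard BND with correlation rho
        total += u * u
        if u * v >= 0.0:
            same_sign += u * u
    return same_sign / total


exact = pov_positive(0.5)                         # about 0.80
estimate = simulate_pov_positive(0.5, 100000, 1)  # seeded Monte Carlo estimate
ols_share = 0.5 ** 2                              # R^2 = rho^2 = 0.25
```

The simulation works with standardized variables whose true mean is zero, so the share of squared deviations falling on same-sign pairs estimates $2p_1\sigma_{X_P}^2/\sigma_x^2$ directly. The negative-diagonal proportion is the complement, `1 - pov_positive(rho)`.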


Fig. 2. POVXP is the diagonal probability mass 2 p1 plus and additional term Psi.

Fig. 4. ρ comes from two diagonal contributions, ContP (ρ ) at P diagonal, and ContN (ρ ) at N diagonal.

positive diagonal, and POVXN = 20% at negative diagonal, 2 2 σx2 = 2 p1 σx,P + 2 p2 σx,N = σx2 (0.8 + 0.2 ).

But using the OLS decomposition, the explicative variable accounts for 25% of the variance with the remaining 75% as residual variance,

σx2 = σx2 ρ 2 + σx2 (1 − ρ 2 ) = σx2 (0.25 + 0.75 ). Moreover, if ρ = 0, OLS decomposition gives 0% of the variance in the explicative variable and 100% as error variance, but a diagonal decomposition provides 50% of the variance at P and 50% at N. 4.2. Diagonal quadrant correlation contributions

Fig. 3. Diagonal decomposition, POVXP = 2 p1 + Psi, vs. OLS decomposition, R2 = ρ 2 .

The overall correlation coefficient ρ can also be split into two independent diagonal contributions. Focusing on covariance elements at Eq. (3.2), correlation coefficient comes directly from the sum of two diagonal contributions,

σxy 2 p1 σx σy muv,P + 2 p2 σx σy muv,N = = 2 p1 muv ,1 + 2 p2 muv ,2 σx σy σx σy   1 − ρ2 1 − ρ2 = 2 p1 ρ + + 2 p2 ρ − , π π

ρ= Eq. (4.5) shows that the variance decomposition is basically 2 p1 and 2 p2 , the diagonal probability mass, but √ they present two addiρ 1−ρ 2

tional terms, +Psi and –Psi, being P si = , as can be seen in π Fig. 2 for the positive diagonal case, POVXP = 2 p1 + P si. The proportion of variance at the positive diagonal (P subset) should be 2 p1 , but there is an √ additional term, Psi relative term, which is max/min at ρ = ± 0.5 = ±0.71, where an additional ±16% of variance is produced. This additional term is important because it means that there is an asymmetry in the proportion of variance between diagonals, more than can be expected regarding the probability mass. A significant additional variance concentration is produced in the positive diagonal, when the overall correlation coefficient is positive, and in the negative diagonal, when the correlation coefficient is negative. This quadrant decomposition is different from the OLS variance decomposition, which can be described as σx2 = σx2 ρ 2 + σx2 (1 − ρ 2 ). In the same way as R2 = ρ 2 is the proportion of variance explained by the model, POVXP is the proportion of variance explained by the positive diagonal. Fig. 3 shows the big differences between them. There are large differences between both methods. For example, when ρ =0.5, diagonal decomposition shows POVXP = 80% at

because $m_{uv,P} = m_{uv,1}$ and $m_{uv,N} = m_{uv,2}$. So ρ = ContP(ρ) + ContN(ρ), where the correlation contributions of the positive and negative diagonals are

$$ContP(\rho) = 2p_1\rho + \frac{\sqrt{1-\rho^2}}{\pi} \quad\text{and}\quad ContN(\rho) = 2p_2\rho - \frac{\sqrt{1-\rho^2}}{\pi},$$

as represented in Fig. 4. Note that, for |ρ| < 1, the positive diagonal contribution is always greater than 0 and the negative diagonal contribution is always less than 0. It is very important not to confuse these contributions with the actual within-diagonal correlations, for example

$$\rho_P = \frac{\sigma_{xy,P}}{\sigma_{x,P}\,\sigma_{y,P}} = \frac{\sigma_x\sigma_y\,(m_{uv,P} - m_{u,P}m_{v,P})}{\sigma_x(m_{uu,P} - m_{u,P}^2)^{1/2}\,\sigma_y(m_{vv,P} - m_{v,P}^2)^{1/2}} = \frac{m_{uv,P}}{(m_{uu,P}\,m_{vv,P})^{1/2}} = \frac{m_{uv,1}}{m_{uu,1}}.$$

Then, substituting,

$$\rho_P = \frac{2\pi p_1\rho + \sqrt{1-\rho^2}}{2\pi p_1 + \rho\sqrt{1-\rho^2}}. \qquad (4.6a)$$


Fig. 5. ρP, diagonal P, is always above 0.5, and ρN, diagonal N, is always below −0.5.

In the same way,

$$\rho_N = \frac{m_{uv,2}}{m_{uu,2}} = \frac{2\pi p_2\rho - \sqrt{1-\rho^2}}{2\pi p_2 - \rho\sqrt{1-\rho^2}}. \qquad (4.6b)$$
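The closed forms of Sections 4.1 and 4.2 are easy to evaluate numerically. The following is a minimal sketch rather than the author's code: the function names are hypothetical, and it assumes that p1 and p2 are the standard bivariate-normal quadrant probabilities, p1 = 1/4 + arcsin(ρ)/(2π) and p2 = 1/4 − arcsin(ρ)/(2π), which reproduce the values 2p1 = 2/3 and 2p2 = 1/3 quoted in the text for ρ = 0.5.

```python
import math


def quadrant_probs(rho):
    """Probability mass of quadrants I and II (assumed arcsine formula)."""
    shift = math.asin(rho) / (2.0 * math.pi)
    return 0.25 + shift, 0.25 - shift


def diagonal_summary(rho):
    """Closed-form diagonal quantities for a standard BND with |rho| < 1."""
    p1, p2 = quadrant_probs(rho)
    root = math.sqrt(1.0 - rho * rho)
    psi = rho * root / math.pi  # extra share of variance, Psi
    two_pi = 2.0 * math.pi
    return {
        "POVXP": 2.0 * p1 + psi,                 # variance share, diagonal P
        "POVXN": 2.0 * p2 - psi,                 # variance share, diagonal N
        "ContP": 2.0 * p1 * rho + root / math.pi,  # correlation contribution, P
        "ContN": 2.0 * p2 * rho - root / math.pi,  # correlation contribution, N
        "rhoP": (two_pi * p1 * rho + root) / (two_pi * p1 + rho * root),
        "rhoN": (two_pi * p2 * rho - root) / (two_pi * p2 - rho * root),
    }


if __name__ == "__main__":
    for rho in (0.0, 0.5, 0.8):
        print(rho, {k: round(v, 3) for k, v in diagonal_summary(rho).items()})
```

The two contributions ContP and ContN add up to ρ, whereas the within-diagonal correlations rhoP and rhoN stay above 0.5 and below −0.5 respectively, which is the distinction the text draws.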

In Fig. 5, both diagonal correlations ρP and ρN are drawn for every correlation coefficient ρ. Note that ρP is always positive and greater than 0.5, and ρN is always negative and less than −0.5, regardless of the overall ρ. This constant-sign property of the diagonal subsets does not hold for the correlation of the individual i-quadrant subsets, where each ρi has the same sign as ρ, positive or negative, and is always lower in absolute value, |ρi| ≤ |ρ|. It can be seen that $\rho_i = \sigma_{xy,i}/(\sigma_{x,i}\,\sigma_{y,i})$, using the intra-quadrant matrix elements of Eq. (2.6) with the corresponding BND moment expressions of Section 3. The first intra-quadrant correlation coefficient of the BND was also studied in Johnson and Kotz (1972), where a sample of numerical values was shown, although no analytical expression was presented.

4.3. Principal component analysis

A question that naturally arises is how Principal Component Analysis (PCA) compares with the diagonal, or Positive and Negative quadrant (P + N), results. When considering N = 2 standard variates (X, Y), the eigenvalues are

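$$\lambda_+ = 1 + \rho \quad\text{and}\quad \lambda_- = 1 - \rho,$$

and the normalized eigenvectors are

$$v_+ = \begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix} \quad\text{and}\quad v_- = \begin{pmatrix} 1/\sqrt{2} \\ -1/\sqrt{2} \end{pmatrix}.$$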

In principal components, the proportion of total population variance due to the kth principal component equals

$$\frac{\lambda_k}{\lambda_1 + \lambda_2 + \cdots + \lambda_N},$$

and the sum of the N eigenvalues coincides with the sum of the variable variances, $\sum_i \lambda_i = \sum_i \sigma_i^2$. In our case of two variables (X, Y), the total standardized population variance is 2, so the PCA explained proportions of total variance are

$$POV_{\lambda+} = \frac{\lambda_+}{2} = \frac{1}{2} + \frac{\rho}{2} \quad\text{and}\quad POV_{\lambda-} = \frac{\lambda_-}{2} = \frac{1}{2} - \frac{\rho}{2}.$$
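The PCA proportion can be put side by side with the diagonal proportion POVXP = 2p1 + Psi of Section 4.1. The sketch below is only illustrative: the function names are hypothetical, and p1 = 1/4 + arcsin(ρ)/(2π) is assumed to be the quadrant probability used in the paper.

```python
import math


def pov_pca_plus(rho):
    """Share of total variance on the main principal component, lambda_+/2."""
    return 0.5 + rho / 2.0


def pov_diagonal_p(rho):
    """POVXP = 2*p1 + Psi, with p1 from the assumed arcsine formula."""
    p1 = 0.25 + math.asin(rho) / (2.0 * math.pi)
    return 2.0 * p1 + rho * math.sqrt(1.0 - rho * rho) / math.pi


if __name__ == "__main__":
    for rho in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"rho={rho:4.2f}  PCA={pov_pca_plus(rho):.3f}  "
              f"diagonal={pov_diagonal_p(rho):.3f}")
```

Both measures coincide at ρ = 0 and ρ = 1; in between, the diagonal share is the larger of the two by a few percentage points.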

Fig. 6. POVXP, positive diagonal proportion of variance explained, and POVλ+, PCA proportion of variance explained.

This is useful because it can be compared with the results for diagonals P and N. The I+III quadrant contribution, the proportion of variance POVXP, and the II+IV quadrant contribution, the proportion of variance POVXN, are

$$2p_1 + \frac{\rho\sqrt{1-\rho^2}}{\pi} \quad\text{and}\quad 2p_2 - \frac{\rho\sqrt{1-\rho^2}}{\pi}.$$

In Fig. 6, considering a positive overall correlation coefficient, the two measures POVXP and POVλ+ are compared. The proportion of total population variance explained by the P subset (quadrants I+III) is very close to that of PCA's main eigenvalue, and slightly greater for positive correlations. In the case of ρ < 0, the diagonal N subset (quadrants II+IV) would explain more.

4.4. Orthant algebra: two-semiplane decomposition

Decomposing the mean and variance into four quadrants can also be used, for example, to analyze different strategies in which this information is combined in different ways. The two-diagonal decomposition, the I+III and II+IV breakdown into positive and negative diagonals, has already been studied. In this section, we consider a two-semiplane decomposition, an up and down breakdown into above and below semiplanes. With (Xi, Yi) being the variable in the i-quadrant, we can group those above and below and define

$$r_A' = (X_A, Y_A) = (X_1, Y_1) \cup (X_2, Y_2), \qquad r_B' = (X_B, Y_B) = (X_3, Y_3) \cup (X_4, Y_4).$$

These domains correspond directly to Y ≥ μy (above) and Y < μy (below), without any restriction on X. Using the previous results on quadrants, the mean and variance of these semiplanes can be obtained. Writing $\Delta\mu_i = \mu_i - \mu$ for the marginal mean contribution of subset i, the mean return of the above semiplane is

$$\mu_A = 2p_1\mu_1 + 2p_2\mu_2 = \mu + 2p_1\Delta\mu_1 + 2p_2\Delta\mu_2 = \mu + \sqrt{\frac{2}{\pi}}\begin{pmatrix}\rho\sigma_x \\ \sigma_y\end{pmatrix} = \mu + \Delta\mu_A,$$

where $\Delta\mu_A' = (\sqrt{2/\pi}\,\rho\sigma_x,\ \sqrt{2/\pi}\,\sigma_y)$. The component $\Delta\mu_{x,A}$ is proportional to ρ, but $\Delta\mu_{y,A}$ is independent of ρ. The mean return of the below semiplane is

$$\mu_B = 2p_3\mu_3 + 2p_4\mu_4 = \mu - 2p_1\Delta\mu_1 - 2p_2\Delta\mu_2 = \mu - \Delta\mu_A,$$

where the marginal return of the below domain is equal in size and opposite in sign to that of the above domain, $\Delta\mu_B = -\Delta\mu_A$. Note that the overall mean return is recovered when both semiplanes are combined, $\mu = 0.5\mu_A + 0.5\mu_B = \mu + 0.5\Delta\mu_A - 0.5\Delta\mu_A = \mu$. The variance of the above semiplane can be obtained using the LTV. Writing $S_i = E[(r-\mu)(r-\mu)' \mid \text{subset } i]$ for the second-moment matrix of subset i about the overall mean,

$$\Sigma_A = 2p_1(S_1 - \Delta\mu_1\Delta\mu_1') + 2p_2(S_2 - \Delta\mu_2\Delta\mu_2') + 2p_1(\mu_1 - \mu_A)(\mu_1 - \mu_A)' + 2p_2(\mu_2 - \mu_A)(\mu_2 - \mu_A)'.$$

But it can also be deduced more easily as

$$\Sigma_A = E[(r_A-\mu)(r_A-\mu)'] - \Delta\mu_A\Delta\mu_A' = S_A - \Delta\mu_A\Delta\mu_A' = \begin{pmatrix}\sigma_x^2\left(1 - \frac{2}{\pi}\rho^2\right) & \sigma_{xy}\left(1 - \frac{2}{\pi}\right) \\ \sigma_{xy}\left(1 - \frac{2}{\pi}\right) & \sigma_y^2\left(1 - \frac{2}{\pi}\right)\end{pmatrix},$$

because $S_A = 2p_1 S_1 + 2p_2 S_2 = \Sigma$. It is important to note that the y-variance of the above semiplane is equal to the univariate truncated variance, and that the x-variance of the above semiplane decreases as the correlation coefficient grows in absolute value. The variance of the below semiplane is, by symmetry, equal to the variance of the above semiplane, $\Sigma_B = \Sigma_A$. It can be checked that the overall matrix is retrieved by combining both semiplanes through the LTV:

$$\Sigma = 0.5(S_A - \Delta\mu_A\Delta\mu_A') + 0.5(S_B - \Delta\mu_B\Delta\mu_B') + 0.5(\mu_A - \mu)(\mu_A - \mu)' + 0.5(\mu_B - \mu)(\mu_B - \mu)' = 0.5\,S_A + 0.5\,S_B = \Sigma.$$

Fig. 7. Distributions of XA = (X|Y > 0), XB = (X|Y < 0) and X for standard BND with ρ = 0.8.

Note that, curiously, the marginal mean return is unbalanced: its x-component differs from its y-component. The above-semiplane mean return has a useful property: if the correlation coefficient is positive, there is a positive marginal mean return on the x-variate, with no condition imposed on X. This is the scenario analyzed by Acar and Lequeux (1996). Set Y as a signal-trading rule, typically measured at time t−1, which, when positive (above domain), is the signal required to trigger a buying operation at time t. On average, this trading rule has a positive mean return, proportional to the correlation coefficient. This result could also have been obtained using a semiplane splitting, properly adapting the standardized first- and second-order moments of the BND, instead of quadrant splitting, as done in Acar and Lequeux (1996). It must be emphasized that this semiplane decomposition does not answer the question of how much of the variability in one variable is explained by the above semiplane and how much by the below semiplane. Contributions are identified, but a proper variance splitting cannot be obtained because the intra-group means do not coincide with the overall mean, thus $\Sigma_A \neq S_A$. Another two-semiplane decomposition could be studied, a left and right breakdown, but it coincides with the above-below breakdown with the X and Y variates interchanged.

4.5. Marginal distribution: skew-normal distribution

Although quadrant partitions are versatile because they allow an algebra of quadrants, the two-semiplane decomposition with only above and below components is especially interesting because it answers the question of what the marginal distribution of the X-variate conditioned on Y > 0 is. It is a well-known feature of the BND that the conditioned variable XA = (X|Y ≥ μy), X above, follows a skew-normal distribution (Arnold, Beaver, Groeneveld & Meeker, 1993).
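The first two moments of this conditioned variable are the above-semiplane moments of Section 4.4, and they can be checked with a small simulation. The sketch below uses only the Python standard library; the function names, sample size and seed are arbitrary choices made for illustration, and the closed forms coded are the mean shift √(2/π)(ρσx, σy) and the semiplane covariance matrix with entries σx²(1 − 2ρ²/π), σxy(1 − 2/π) and σy²(1 − 2/π).

```python
import math
import random


def above_semiplane_moments(rho, sx=1.0, sy=1.0):
    """Closed-form marginal mean and covariance of (X, Y) given Y >= mu_y."""
    c = math.sqrt(2.0 / math.pi)
    d_mu = (c * rho * sx, c * sy)
    sxy = rho * sx * sy
    k = 1.0 - 2.0 / math.pi
    cov = [[sx * sx * (1.0 - 2.0 * rho * rho / math.pi), sxy * k],
           [sxy * k, sy * sy * k]]
    return d_mu, cov


def simulate_x_above(rho, n=200_000, seed=0):
    """Monte Carlo mean and variance of X given Y > 0 for a standard BND."""
    rng = random.Random(seed)
    root = math.sqrt(1.0 - rho * rho)
    xs = []
    for _ in range(n):
        y = rng.gauss(0.0, 1.0)
        x = rho * y + root * rng.gauss(0.0, 1.0)  # corr(x, y) = rho
        if y > 0:
            xs.append(x)
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return mean, var


if __name__ == "__main__":
    d_mu, cov = above_semiplane_moments(0.8)
    print("closed form:", round(d_mu[0], 4), round(cov[0][0], 4))
    print("simulation: ", tuple(round(v, 4) for v in simulate_x_above(0.8)))
```

With ρ = 0.8, the case drawn in Fig. 7, the closed forms give a mean of about 0.64 and a variance of about 0.59 for X given Y > 0.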

Fig. 8. Pdf’s of variables XP , XN and X for standard BND with ρ = 0.8.

Then X1 = (XA|XA ≥ μx) is a Left-Truncated Skew Distribution (Left-TSD) and X2 = (XA|XA < μx) is a Right-Truncated Skew Distribution (Right-TSD), and the mixture of these truncated pieces over both semiplanes retrieves the normal distribution (Fig. 7 clarifies this aspect). Correlation makes X and Y follow each other, which in the context of the BND gives rise to a strong displacement of the probability mass. On the other hand, rP is a mixture of TBND, first and third quadrants. But what is the marginal distribution of the XP variate? Based on the positive diagonal, XP is a mixture of two pdfs: a Left-TSD, referring to the first quadrant, and a mirrored Left-TSD, referring to the third quadrant. The same applies to the variable XN, whose marginal pdf is a mixture of two pdfs: a Right-TSD, referring to the second quadrant, and a mirrored Right-TSD, referring to the fourth quadrant. Fig. 8 shows the marginal distributions of XP and XN, where they can be interpreted under the mirrored truncated skew-normal distribution scheme.
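For a standard BND the skew-normal density of X given the sign of Y has a simple closed form, 2φ(x)Φ(±αx) with α = ρ/√(1 − ρ²). This is a standard result rather than an expression taken from the paper, and the helper names and integration grid below are illustrative assumptions. The sketch checks that the density integrates to one, that its mean is ρ√(2/π), and that the equal-weight mixture of the above and below densities gives back the standard normal density.

```python
import math


def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def Phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def pdf_x_given_y_sign(x, rho, above=True):
    """Density of X given Y > 0 (above) or Y < 0 (below), |rho| < 1."""
    alpha = rho / math.sqrt(1.0 - rho * rho)
    return 2.0 * phi(x) * Phi(alpha * x if above else -alpha * x)


def trapezoid(f, a=-10.0, b=10.0, n=20000):
    """Plain trapezoidal rule on [a, b] with n sub-intervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h


if __name__ == "__main__":
    rho = 0.8
    print("mass:", trapezoid(lambda x: pdf_x_given_y_sign(x, rho)))
    print("mean:", trapezoid(lambda x: x * pdf_x_given_y_sign(x, rho)))
```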


4.6. Portfolios: diagonal-quadrant mean and variance decomposition

Consider a two-stock portfolio with return vector r′ = (X, Y) and weights w′ = (wx, wy). The portfolio return is R = w′r = wxX + wyY, the portfolio mean return is μR = w′μ = wxμx + wyμy, and the portfolio variance is VR = w′Σw. The portfolio return can be broken down into four quadrant contributions, where the i-quadrant contribution to the portfolio mean return is

$$p_i\mu_{R,i} = p_i\,w'(\mu + \Delta\mu_i) = p_i\mu_R + p_i\,w'\Delta\mu_i, \qquad (4.7)$$

$\Delta\mu_i$ being the marginal contribution defined at (3.1). As before, the case of the positive and negative diagonals, the P and N subsets, is especially interesting, so we focus on quadrants I+III (P) and II+IV (N). The portfolio mean return can be split into two terms, μR = 2p1 μR,P + 2p2 μR,N, where μR,P and μR,N are the portfolio mean returns at diagonals P and N. Since the marginal returns cancel each other within the P and N subsets, μR,P = wxμx + wyμy = μR and μR,N = wxμx + wyμy = μR, and the proportions of the portfolio mean return at each diagonal are obtained:

$$\mu_R = 2p_1\mu_R + 2p_2\mu_R.$$

This is an interesting result: quadrants I+III concentrate a proportion 2p1 of the total portfolio return, while quadrants II+IV concentrate a proportion 2p2. This is expected and not surprising. Marginal returns cancel each other inside each subset and produce no net return: the returns of quadrants I and III cancel, as do those of II and IV. On average, then, the return produced by the P and N subsets is proportional to their probability masses.

The portfolio variance is the quadratic form VR = w′Σw and, using Eq. (4.1),

$$V_R = 2p_1\,w'\Sigma_P w + 2p_2\,w'\Sigma_N w,$$

so the portfolio variance at the diagonal P subset is $V_{R,P} = w'\Sigma_P w$, and at the diagonal N subset $V_{R,N} = w'\Sigma_N w$. The portfolio variances at diagonals P and N can be easily obtained by taking into account that $\Sigma_P$ and $\Sigma_N$ equal Σ plus the correction terms of quadrants I and II, respectively. Then, substituting from Eq. (3.2),

$$V_{R,P} = (w_x^2\sigma_x^2 + w_y^2\sigma_y^2 + 2w_xw_y\sigma_{xy}) + \frac{\sqrt{1-\rho^2}}{2\pi p_1}\,(w_x^2\sigma_x^2\rho + w_y^2\sigma_y^2\rho + 2w_xw_y\sigma_x\sigma_y),$$

$$V_{R,N} = (w_x^2\sigma_x^2 + w_y^2\sigma_y^2 + 2w_xw_y\sigma_{xy}) - \frac{\sqrt{1-\rho^2}}{2\pi p_2}\,(w_x^2\sigma_x^2\rho + w_y^2\sigma_y^2\rho + 2w_xw_y\sigma_x\sigma_y), \qquad (4.8)$$

where it can be seen that the first summands in brackets are VR. The diagonal net contributions to the total portfolio variance are:

$$2p_1V_{R,P} = 2p_1V_R + \frac{\sqrt{1-\rho^2}}{\pi}\,(w_x^2\sigma_x^2\rho + w_y^2\sigma_y^2\rho + 2w_xw_y\sigma_x\sigma_y),$$

$$2p_2V_{R,N} = 2p_2V_R - \frac{\sqrt{1-\rho^2}}{\pi}\,(w_x^2\sigma_x^2\rho + w_y^2\sigma_y^2\rho + 2w_xw_y\sigma_x\sigma_y). \qquad (4.9)$$

And this is an unexpected result. The variance is distributed between the two diagonals according to the corresponding probability weights but, unlike the mean return decomposition, an important positive term appears in quadrants I+III, and the same term appears with negative sign in quadrants II+IV. So there is real variance arising from one subset (a source) and another subset that makes variance disappear (a hole). These are not merely formal contributions, since they are the true within-diagonal variances.

Adding the previous equations:

$$V_R = (2p_1V_R + \Lambda) + (2p_2V_R - \Lambda), \qquad (4.10)$$

with $\Lambda = \frac{\sqrt{1-\rho^2}}{\pi}\,(w_x^2\sigma_x^2\rho + w_y^2\sigma_y^2\rho + 2w_xw_y\sigma_x\sigma_y)$. The total portfolio variance is decomposed into two main factors, 2p1 and 2p2, plus a correction, +Λ and −Λ. This correction term can also be expressed as

$$\Lambda = \frac{\sqrt{1-\rho^2}}{\pi}\,[\rho V_R + 2w_xw_y\sigma_x\sigma_y(1-\rho^2)] \quad\text{or}\quad \Lambda = Psi\cdot V_R + \frac{2w_xw_y\sigma_x\sigma_y(1-\rho^2)^{3/2}}{\pi}.$$

This expression is more interesting when it is calculated in relative terms with respect to the overall portfolio variance VR. So we define $\Lambda_R$:

$$\Lambda_R = \frac{\Lambda}{V_R} = \frac{\rho\sqrt{1-\rho^2}}{\pi} + \frac{(2/\pi)\,w_xw_y\sigma_x\sigma_y(1-\rho^2)^{3/2}}{V_R} = Psi + \frac{(2/\pi)\,w_x\sigma_x w_y\sigma_y(1-\rho^2)^{3/2}}{V_R} = Psi + \psi_R.$$

Dividing Eq. (4.10) by VR,

$$1 = (2p_1 + \Lambda_R) + (2p_2 - \Lambda_R), \qquad (4.11)$$

where the proportion of variance of the portfolio in the positive diagonal P, POVRP, can be identified, and likewise POVRN for the negative diagonal N. These expressions are:

$$POVRP = 2p_1 + \Lambda_R = 2p_1 + Psi + \psi_R \quad\text{and}\quad POVRN = 2p_2 - \Lambda_R = 2p_2 - Psi - \psi_R. \qquad (4.12)$$

Fig. 9. Additional term $\Lambda_R = Psi + \psi_R$ as a function of ρ (case wx = wy).
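Eq. (4.12) can be evaluated directly. The following sketch is illustrative only: the function name and the returned dictionary keys are hypothetical, and the quadrant probability p1 = 1/4 + arcsin(ρ)/(2π) is assumed. It returns the portfolio variance VR, the two parts of the additional term (Psi and ψR) and the diagonal proportions POVRP and POVRN.

```python
import math


def portfolio_diagonal_pov(rho, wx, wy, sx=1.0, sy=1.0):
    """Diagonal proportions of portfolio variance, following Eq. (4.12)."""
    p1 = 0.25 + math.asin(rho) / (2.0 * math.pi)  # assumed arcsine formula
    p2 = 0.5 - p1
    root = math.sqrt(1.0 - rho * rho)
    v_r = (wx * sx) ** 2 + (wy * sy) ** 2 + 2.0 * wx * wy * rho * sx * sy
    psi = rho * root / math.pi
    psi_r = (2.0 / math.pi) * wx * sx * wy * sy * root ** 3 / v_r
    extra = psi + psi_r  # additional term Psi + psi_R
    return {"VR": v_r, "Psi": psi, "psiR": psi_r,
            "POVRP": 2.0 * p1 + extra, "POVRN": 2.0 * p2 - extra}


if __name__ == "__main__":
    for rho, w in ((0.0, (1, 1)), (0.5, (1, 1)), (0.5, (-1, 2))):
        out = portfolio_diagonal_pov(rho, *w)
        print(rho, w, {k: round(v, 3) for k, v in out.items()})
```

For standard variables with wx = wy = 1 this gives POVRP of about 0.82 at ρ = 0 and about 0.94 at ρ = 0.5, while the long-short weights (−1, 2) at ρ = 0.5 give about 0.53.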

Note that the additional term $\Lambda_R = Psi + \psi_R$ has an interpretation similar to that of Psi: it is the extra percentage of variance concentrated at diagonal P. Fig. 9 shows the value of this additional term as a function of the overall correlation coefficient ρ, in the specific case of standard variables with equal weighting coefficients wx = wy. To illustrate the portfolio variance decomposition, take an example. Assuming standard variables and weights wx = wy = 1, when ρ = 0 the total portfolio variance is VR = 2. Using the previous definitions, Psi = 0 and ψR = 0.32, so $\Lambda_R$ = 32% is the percentage of variance, in addition to the probability mass, that is added at diagonal P, and POVRP = 2p1 + $\Lambda_R$ = 0.5 + 0.32 = 0.82. Then 82% of the overall portfolio variance comes from diagonal P, and 18% comes from diagonal N. In this example orthants I+III provide 50% of the mean portfolio return and orthants II+IV provide the other 50%, but orthants I+III provide 82% of the total portfolio variance and orthants II+IV just 18%.

As another example, suppose that ρ = 0.5, so the total portfolio variance is VR = 3. Then Psi = 0.14 and ψR = 0.14, so $\Lambda_R$ = 28% is the percentage of variance, in addition to the probability mass, that is added at diagonal P, and POVRP = 2p1 + $\Lambda_R$ = 0.667 + 0.276 = 0.94. Then 94% of the overall portfolio variance comes from diagonal P, and 6% comes from diagonal N. Orthants I+III provide 67% of the mean portfolio return, whereas orthants II+IV provide 33%. However, orthants I+III provide 94% of the total portfolio variance, whereas orthants II+IV provide only 6%. In Fig. 10 both proportions, POVRP and POVRN, are represented for standard variables with equal weighting coefficients, wx = wy. Since POVRP = 2p1 + $\Lambda_R$, the percentages 2p1 and ψR are included in the figure in order to highlight the terms that make up the proportion of variance POVRP.

The previous equations show the main result. One would expect to find only the probability masses 2p1 and 2p2 as relative variances at diagonals P and N, but this is not the case: the portfolio variance at P must be increased by an additional term $\Lambda_R = Psi + \psi_R$. In the same way that the proportion of variance of X at diagonal P must be increased by the percentage Psi, the portfolio variance at diagonal P must be increased by the percentage $\Lambda_R = Psi + \psi_R$.

Fig. 10. POVRP = VR,P/VR = 2p1 + $\Lambda_R$ is the proportion of variance of the portfolio in the positive diagonal P, and POVRN the proportion in the negative diagonal N (case wx = wy). POVRP equals the probability mass 2p1 plus the term $\Lambda_R$.

4.7. Portfolios: examples

To better understand how different portfolio weights affect the orthant-based mean and variance contributions, some examples that break down the overall average return and the portfolio variance into their four quadrants are shown. For example, consider a portfolio of two stocks (X, Y) with mean daily returns (μx, μy) = (0.04, 0.04), unit weighting coefficients (wx, wy) = (1, 1), unit standard deviations (σx, σy) = (1, 1) and ρ = 0.5. The average return μR = E[wxX + wyY] = 0.08 and the return variance $\sigma_R^2$ = Var(wxX + wyY) = 3, split into their four orthant contributions, are shown in Table 1.

Table 1. Mean and variance of the return R = wxX + wyY with (wx, wy) = (1, 1), split into their four orthant contributions, absolute and percentage values. Diagonal contribution percentages are P = I+III and N = II+IV.

The overall portfolio mean return μR = 0.08 is split between the four orthant contributions (left side of Table 1). For example, the I-quadrant contribution to the portfolio mean return, using Eqs. (4.7) and (3.1) and writing $\Delta\mu_i$ for the marginal contribution of quadrant i, is

$$p_1\mu_{R,1} = p_1(\mu_R + w'\Delta\mu_1) = (1/3)\,0.08 + \sqrt{2/\pi}\,(1 + 0.5)(w_x\sigma_x + w_y\sigma_y)/4 = 0.63,$$

and the IV-quadrant contribution is

$$p_4\mu_{R,4} = p_4(\mu_R + w'\Delta\mu_4) = (1/6)\,0.08 + \sqrt{2/\pi}\,(1 - 0.5)(w_x\sigma_x - w_y\sigma_y)/4 = 0.01.$$

These values are better interpreted as percentages of the total mean return 0.08: the days on which both stocks have a positive sign are responsible for 0.63/0.08 = 781% of the overall mean return. These days of good news (quadrant I) are cancelled by the days of really bad news (quadrant III), resulting in a net positive diagonal contribution of 67% (orthants I+III), while the negative diagonal contribution is 33% (orthants II+IV). As shown in Eq. (4.3), the diagonal mean return contribution is equal to the probability mass, 2/3 of the total return at P and 1/3 at N.

But things are different for the variance. The overall portfolio variance $\sigma_R^2$ = 3 is also split into the four quadrants (right side of Table 1). Quadrants I and III make the same variance contribution, $p_1V_R + (w_x^2\sigma_x^2\rho + w_y^2\sigma_y^2\rho + 2w_xw_y\sigma_x\sigma_y)\sqrt{1-\rho^2}/(2\pi) = 1.41$, which is half of the contribution of the positive diagonal in Eq. (4.9), and quadrants II and IV contribute 0.09 each. These values are 47% and 3% when expressed as percentages of the overall variance. Finally, from the diagonal point of view, the positive diagonal produces 94% of the variance (2·1.41/3 = 0.94), and the negative diagonal only 6%. In this example orthants I+III provide 67% of the mean return and orthants II+IV provide 33%, but orthants I+III provide 94% of the total variance and orthants II+IV just 6%. In view of these results there is clearly a huge imbalance between diagonals P and N, especially in terms of variance. If possible, one would like to remain in quadrants II and IV where, although only 33% of the total portfolio return is produced, it is obtained with very little variance, only 6% of the portfolio variance.

In general, the variance is smaller in quadrants II and IV (positive correlation assumed), where stocks with returns of opposite sign are combined, producing a much smaller range of returns. These orthant patterns can explain differences between investment strategies. Following the previous example, if the weighting coefficients were (wx, wy) = (−1, 2), no diagonal variance imbalance would be produced, because nearly 50% of the variance comes from P and the other 50% from N, as can be seen in Table 2. This portfolio is not very different from the previous one: it has the same overall variance $\sigma_R^2$ = 3 and half the overall mean return, μR = 0.04.
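The four orthant contributions discussed in this example can be reproduced with a few lines of code. The sketch below is an illustration, not the computation behind Tables 1 and 2: the function name is hypothetical, p1 is assumed to be 1/4 + arcsin(ρ)/(2π), and the quadrant II and III mean contributions are obtained from those of quadrants IV and I by the sign symmetry described in the text.

```python
import math


def orthant_contributions(rho, mu, w, sigma=(1.0, 1.0)):
    """Mean and variance contributions of quadrants I-IV to R = wx*X + wy*Y."""
    (mx, my), (wx, wy), (sx, sy) = mu, w, sigma
    p1 = 0.25 + math.asin(rho) / (2.0 * math.pi)  # assumed arcsine formula
    p2 = 0.5 - p1
    mu_r = wx * mx + wy * my
    v_r = (wx * sx) ** 2 + (wy * sy) ** 2 + 2.0 * wx * wy * rho * sx * sy
    c = math.sqrt(2.0 / math.pi) / 4.0
    same = c * (1.0 + rho) * (wx * sx + wy * sy)  # marginal part, quadrant I
    opp = c * (1.0 - rho) * (wx * sx - wy * sy)   # marginal part, quadrant IV
    extra = (rho * (wx * sx) ** 2 + rho * (wy * sy) ** 2
             + 2.0 * wx * wy * sx * sy) * math.sqrt(1.0 - rho * rho) / (2.0 * math.pi)
    mean = {"I": p1 * mu_r + same, "II": p2 * mu_r - opp,
            "III": p1 * mu_r - same, "IV": p2 * mu_r + opp}
    var = {"I": p1 * v_r + extra, "II": p2 * v_r - extra,
           "III": p1 * v_r + extra, "IV": p2 * v_r - extra}
    return mean, var


if __name__ == "__main__":
    for w in ((1, 1), (-1, 2)):
        mean, var = orthant_contributions(0.5, (0.04, 0.04), w)
        print(w, {k: round(v, 3) for k, v in mean.items()},
              {k: round(v, 3) for k, v in var.items()})
```

The contributions add up to the overall mean return and the overall variance in both cases, and for the weights (−1, 2) the positive diagonal carries about 53% of the variance.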


Table 2. Mean and variance of the return R = wxX + wyY with (wx, wy) = (−1, 2), split into their four orthant contributions, absolute and percentage values. Diagonal contribution percentages are P = I+III and N = II+IV.

Portfolios whose variance is well balanced across orthants can be obtained by investing long and short at the same time, assuming positive correlation. This does not guarantee safer investments, but it seems that some advantage can be obtained from orthant information, as this orthant property characterizes pair trading strategies. This sub-topic merits further consideration but is outside the scope of this discussion.

5. Extensions

A first step towards finding applications or extensions of the orthant-based variance decomposition is to think in parallel about the meaning of the classical variance decomposition of linear regression. What are the practical applications of a variance decomposition based on linear regression? On reflection, there is no easy answer. Let us take an example. A stock with correlation ρ = 0.8 with the market shows $\rho^2$ = 64% of market risk and $(1 - \rho^2)$ = 36% of specific risk. This is the classical variance decomposition based on linear regression, and it is a very illustrative example. Going further, the coefficient of determination of the regression is 64% as well. However, let us consider what the real use of this information could be. For practical purposes, how can this information influence investment decisions? Beyond the fascination of splitting risk between market risk and the risk inherent in the stock, how can this information be used from a practical point of view? In our opinion, the main interest resides in the regression that lies behind the variance decomposition: it is the linear regression that holds the useful information. Therefore, a question naturally arises. What is the regression behind the orthant-based variance decomposition? And what about its coefficient of determination? Sections 5.1 and 5.2 are devoted to studying regressions based on the diagonal orthants P and N. Some answers are found, although many questions will have to be resolved in further work.

5.1. Regressions: OLS vs positive and negative regressions

Following classical linear regression, the X variable can be linearly regressed on Y, X = f(Y), say

$$X = \mu_x + \rho\,\frac{\sigma_x}{\sigma_y}\,(Y - \mu_y) + \varepsilon_x,$$

and, given Y, the expected regressed value of X is

$$E[X|Y] = \mu_x + \rho\,\frac{\sigma_x}{\sigma_y}\,(Y - \mu_y).$$

Fig. 11. $R_P^2$ and $R_N^2$, coefficients of determination at the P and N subsets, and overall $R^2 = \rho^2$.

The quality of the OLS regression is measured by the coefficient of determination $R^2 = \rho^2$. The greater the correlation, the better the regression. We can think of the pair (X, Y) as the union of two diagonal variables, (XP, YP) and (XN, YN), with a mixture characterization determined by the probabilities 2p1 and 2p2:

$$(X, Y) = \begin{cases}(X_P, Y_P) & \text{with probability } 2p_1 \\ (X_N, Y_N) & \text{with probability } 2p_2\end{cases}$$

The linear regressions at diagonals P and N are

$$X_P = \mu_x + \beta_P(Y_P - \mu_y) + \varepsilon_{XP}, \qquad X_N = \mu_x + \beta_N(Y_N - \mu_y) + \varepsilon_{XN},$$

where each component mean matches the overall mean, (μxP, μyP) = (μx, μy) and (μxN, μyN) = (μx, μy). These diagonal regression coefficients, βP and βN, are the OLS ones inside each diagonal, based on the positive and negative correlation coefficients computed in Section 4.2, Eq. (4.6), and they can be expressed as

$$\beta_P \equiv \rho_P\,\frac{\sigma_{x,P}}{\sigma_{y,P}} = \rho_P\,\frac{\sigma_x}{\sigma_y} = \frac{m_{uv,1}}{m_{uu,1}}\,\frac{\sigma_x}{\sigma_y} \quad\text{and}\quad \beta_N \equiv \rho_N\,\frac{\sigma_{x,N}}{\sigma_{y,N}} = \rho_N\,\frac{\sigma_x}{\sigma_y} = \frac{m_{uv,2}}{m_{uu,2}}\,\frac{\sigma_x}{\sigma_y}.$$

Then, the regressed (expected) values are

$$E[X_P|Y_P] = \mu_x + \rho_P\,\frac{\sigma_x}{\sigma_y}\,(Y_P - \mu_y) \quad\text{and}\quad E[X_N|Y_N] = \mu_x + \rho_N\,\frac{\sigma_x}{\sigma_y}\,(Y_N - \mu_y).$$

This kind of regression requires YP (or YN) to be used as the input variable, which is a strong requirement. The coefficients of determination of these regressions at the P and N subsets are $R_P^2 = \rho_P^2$ and $R_N^2 = \rho_N^2$, and they are larger than $R^2 = \rho^2$ whenever they are applied in their natural subsets, positive and negative correlation, as can be seen in Fig. 11. It is important to take into account that each regression is defined on its own subset, P, N, and overall P + N, so they are not fully comparable. Still, can we take advantage of these high coefficients of determination? A combination of regressions is a possibility, assuming pre-classified data.
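The diagonal slopes can be checked by simulation when the data are pre-classified. The sketch below is an illustration under stated assumptions: it draws a standard BND with the standard library generator, classifies each pair as P when X·Y > 0 and as N otherwise, and estimates each slope through the known zero mean, so that for standardized variables it should approach ρP and ρN of Eq. (4.6). The function names, sample size and seed are arbitrary, and the closed forms use the assumed quadrant probability p1 = 1/4 + arcsin(ρ)/(2π).

```python
import math
import random


def closed_form_diagonal_corr(rho):
    """rho_P and rho_N of Eq. (4.6) for |rho| < 1."""
    p1 = 0.25 + math.asin(rho) / (2.0 * math.pi)
    p2 = 0.5 - p1
    root = math.sqrt(1.0 - rho * rho)
    two_pi = 2.0 * math.pi
    rho_p = (two_pi * p1 * rho + root) / (two_pi * p1 + rho * root)
    rho_n = (two_pi * p2 * rho - root) / (two_pi * p2 - rho * root)
    return rho_p, rho_n


def simulated_diagonal_slopes(rho, n=200_000, seed=1):
    """Slopes of X on Y inside the P and N diagonals of a simulated sample."""
    rng = random.Random(seed)
    root = math.sqrt(1.0 - rho * rho)
    sum_xy = {"P": 0.0, "N": 0.0}
    sum_yy = {"P": 0.0, "N": 0.0}
    for _ in range(n):
        y = rng.gauss(0.0, 1.0)
        x = rho * y + root * rng.gauss(0.0, 1.0)
        key = "P" if x * y > 0 else "N"  # pre-classification by sign
        sum_xy[key] += x * y
        sum_yy[key] += y * y
    return sum_xy["P"] / sum_yy["P"], sum_xy["N"] / sum_yy["N"]


if __name__ == "__main__":
    print("closed form:", closed_form_diagonal_corr(0.5))
    print("simulated:  ", simulated_diagonal_slopes(0.5))
```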


Fig. 12. $R_P^2$ and $R_N^2$, coefficients of determination at diagonals P and N; $R_{PN}^2$, coefficient of determination of the combined regression of P and N; $R^2 = \rho^2$ refers to the overall domain (classical regression).

Fig. 13. MSE($X$) is the benchmark, the classical regression of two standardized variables, for each ρ. MSE($\tilde{X}$) refers to the P + N combined regression with pre-classification. MSE($\tilde{\tilde{X}}$) refers to the P + N combined regression without pre-classification.

5.2. Combined regression: coefficient of determination

A combination of the previous regressions can be defined in the following way:

$$\tilde{X} = \begin{cases}\tilde{X}_P = \mu_x + \beta_P(Y_P - \mu_y) \\ \tilde{X}_N = \mu_x + \beta_N(Y_N - \mu_y)\end{cases} \qquad (5.1)$$

where $\tilde{X}_P = E[X_P|Y_P]$ and $\tilde{X}_N = E[X_N|Y_N]$. We start by assuming that we have the information needed to distinguish the kind of input that is received, a YP value or a YN value. If this is the case, model (5.1) produces an output whose sign agrees with the real one, not necessarily in value, sign($\tilde{X}$) = sign(X). This is very useful because it makes the error smaller than it would be without this information. What is the quality of this combined regression? The coefficient of determination, $R_{PN}^2$, can be deduced (see proof details in Appendix C). This value is:

$$R_{PN}^2 = 2p_1\,\rho_P^2\,m_{vv,1} + 2p_2\,\rho_N^2\,m_{vv,2} \qquad (5.2)$$
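Eq. (5.2) can be evaluated for any ρ once the standardized quadrant moments are available. The sketch below is illustrative and rests on assumptions not stated in this section: the function name is hypothetical, p1 = 1/4 + arcsin(ρ)/(2π), and the moments are taken as m_uv,1 = ρ + √(1 − ρ²)/(2πp1), m_uu,1 = m_vv,1 = 1 + ρ√(1 − ρ²)/(2πp1), with the analogous quadrant II expressions carrying a minus sign and p2, which is the form consistent with Eq. (4.6).

```python
import math


def r2_combined(rho):
    """Coefficient of determination of the pre-classified P + N regression."""
    p1 = 0.25 + math.asin(rho) / (2.0 * math.pi)
    p2 = 0.5 - p1
    root = math.sqrt(1.0 - rho * rho)
    two_pi = 2.0 * math.pi
    m_uv1 = rho + root / (two_pi * p1)
    m_uu1 = 1.0 + rho * root / (two_pi * p1)  # equals m_vv,1 by symmetry
    m_uv2 = rho - root / (two_pi * p2)
    m_uu2 = 1.0 - rho * root / (two_pi * p2)  # equals m_vv,2 by symmetry
    rho_p = m_uv1 / m_uu1
    rho_n = m_uv2 / m_uu2
    return 2.0 * p1 * rho_p ** 2 * m_uu1 + 2.0 * p2 * rho_n ** 2 * m_uu2


if __name__ == "__main__":
    for rho in (0.0, 0.3, 0.5, 0.8):
        print(f"rho={rho:.1f}  R2={rho * rho:.3f}  R2_PN={r2_combined(rho):.3f}")
```

At ρ = 0 the function returns 4/π², about 0.405, and on the grid printed above the combined value stays above ρ².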

This is an interesting result. In Fig. 12, we can see how this mixed regression takes advantage of the P and N subset information, and $R_{PN}^2$ exceeds $R^2 = \rho^2$ for every value of the correlation coefficient. For example, when ρ = 0, the OLS regression shows a null coefficient of determination, $R^2 = 0$, but the mixed regression leads to $R_{PN}^2 = 0.405$, and for any other value of the correlation coefficient the mixed regression is always better in terms of the coefficient of determination. It uses the best of the P regression and the best of the N regression. If we did not know what kind of input variable is arriving, P or N, YP or YN, we would have no choice but to use the pure variable Y, without orthant distinction. In this case, the expected value of X obtained, say $\tilde{\tilde{X}}$, is:

$$\tilde{\tilde{X}} = \begin{cases}\tilde{\tilde{X}}_P = \mu_x + \beta_P(Y - \mu_y) & \text{with probability } 2p_1 \\ \tilde{\tilde{X}}_N = \mu_x + \beta_N(Y - \mu_y) & \text{with probability } 2p_2\end{cases} \qquad (5.3)$$

where we can only randomly assign the P or N subset regression, since we have no information. This random procedure cannot be as good as the previous one because useful information is absent or ignored. Moreover, an appropriate coefficient of determination cannot be obtained, because the regression coefficients βP and βN are not OLS estimators: they are correct for YP and YN input values, but not for Y. In this case, we cannot use the sum of squares decomposition SST = SSR + SSE, so a proper coefficient of determination cannot be calculated. In its place, another loss function should be used, for example the mean squared error (MSE). This is shown in Fig. 13. Sadly, but as expected, the results without pre-classification are always worse than those of the classical regression. Results improve only when pre-classification is used, in which case the MSE is lower. How can we do pre-classification? When Y arrives, is it a P or an N variable? If P, X·Y > 0; if N, X·Y < 0. Since we have incomplete knowledge of the kind of input we are processing (P or N), this combination of regressions cannot completely improve on the original regression. However, there can be situations where some probabilistic information about the variable is available. This is difficult to imagine, but we can think of situations such as online auctions, display advertisements, ride-hailing services, stock options, etc., specific and complex situations in which the signs of the variables are the same or opposite. This is a challenging question, and it would probably require deep experience in different fields to untangle. In finance, what could be an indicator of equal return sign between stocks? It does not need to be a direct relationship; an indirect relationship would suffice: an additional piece of information that only tells, for example, on which days the sign of return X is equal to the sign of return Y, and on which days it is opposite. No more would be needed. If this information were available, for example in the ρ = 0 case, it could increase the coefficient of determination from 0 to 0.405, as we have seen. This is further work that falls beyond the scope of this paper.

5.3. More than two variables

Two-stock portfolios have been a first step in this study. This paper's scope is limited to two stocks, although the two-stock portfolio problem is very enlightening because it allows the study of the relationship of a stock with the market, which is the basis of the single-index market model (Sharpe, 1963), reducing the relationships among stocks to relationships between each stock and the market. Moreover, pair trading strategies also deal with only two assets, and pair trading is a technique that is widely analyzed and used. We would like to explore the problem for N-stock portfolios because it is very common to have more assets within portfolios, but


the complexity of this approach can be huge. The orthant number n for N variables growths exponentially, n = 2N , which is the number of variations with repetition of two elements (+ and −) taken by N-tuples. For example, for N = 3 variables, the orthant number is n = 23 = 8, and for N = 8 variables there are n = 28 = 256 orthants. We have seen that Section 2 results are applicable to any distribution and any truncation pattern. Eq. (2.4) establishes the general result that allows the variance of any variable to be decomposed as the sum of the second moments about the origin of the standardized truncated distribution function, weighted with the probability  pi of each partition, σx2 = σx2 ni=1 pi muu,i , with n being the number of orthants for N variables (U is the standardized variable of X). As we did with the BND case, pdf’s with symmetry between opposite orthants are a very interesting case to be considered. Orthant symmetry enables natural simplification that appears by grouping the orthants in pairs, defining symmetric diagonals where marginal mean return contributions are cancelled between them. Moreover, in these orthant pairs the contribution variance is equal to the diagonal variance, and we again find diagonals that are variance sources (those where variables participate with positive sign, in the case of positive correlation), and other diagonals that are variance holes (those where variables participate with negative sign, in case of positive correlation). In this way, the use of diagonals reduces the study to these particular domains. The number of diagonals is half of the number of orthants, nd = n/2 = 2N−1 , because every two orthants give rise to one diagonal. So, expression (2.4) is simplified to

$$\sigma_x^2 = \sigma_x^2 \sum_{i=1}^{n/2} 2\, p_i\, m_{uu,i},$$

where the contributions of all diagonals are captured. Diagonals not only make the variance contributions easier to compute; they are also an important part of the decomposition, because the variance of each diagonal coincides with the contribution that it supports, so

$$\sigma_x^2 = \sigma_x^2 \sum_{i=1}^{n/2} 2\, p_i\, \mathrm{Var}(D_i),$$

a correct variance decomposition is obtained that is the weighted sum of the variances of the diagonal subsets $D_i$, with $i = 1, \ldots, n/2$.

5.4. Different distributions

Considering N variables instead of 2 would mean a straightforward change from the BND to the MND (Multivariate Normal Distribution), which seems a natural and harmless change, but in practice it would bring great computational difficulties. The first problem is that there is no analytical expression for the probability mass of each orthant in the general case of N variables. The second problem is that there is no closed-form expression for the second-order moment of the truncated multivariate normal distribution. Therefore, the analytical expression of Eq. (2.4) cannot be computed. In any case, numerical solutions can always be used to obtain the correct variance decomposition as a weighted sum of diagonal variances.

Coming back to the bivariate case, other distributions can be considered beyond the BND. In the field of finance, interesting candidates are distributions capable of representing large return oscillations beyond those of the normal distribution (fat tails), such as the bivariate t-distribution (Kotz & Nadarajah, 2004) or the bivariate α-stable distribution (Menn & Rachev, 2005). The lack of simple formulae for the cumulative distribution would again force the use of numerical solutions for orthant-based variance decomposition.
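The numerical route mentioned above can be sketched with a plain simulation. The following Python example is only an illustration under stated assumptions: the function names (`orthant_decomposition`, `diagonal_contributions`, `equicorrelated_normal`), the one-factor equicorrelated normal sampler, the sample size, the seed and ρ = 0.5 are all hypothetical choices, not taken from the paper. It estimates $p_i$ and $m_{uu,i}$ for every orthant, groups opposite orthants into diagonals, and checks that the contributions $p_i\, m_{uu,i}$ add up to one. For N = 2 it also compares the simulated first-quadrant values with Sheppard's formula and the closed form of Appendix B.

```python
import math
import random
from collections import defaultdict


def orthant_decomposition(samples, target=0):
    """Return {sign pattern: (p_i, m_uu_i)} for the variable at index `target`.

    p_i is the empirical orthant probability and m_uu_i the second moment
    about the origin of the standardized target variable inside that orthant.
    """
    n_obs, n_var = len(samples), len(samples[0])
    means = [sum(row[j] for row in samples) / n_obs for j in range(n_var)]
    var_x = sum((row[target] - means[target]) ** 2 for row in samples) / n_obs
    counts, sum_sq = defaultdict(int), defaultdict(float)
    for row in samples:
        # Orthant label: sign of each variable relative to its sample mean.
        key = tuple(1 if row[j] >= means[j] else -1 for j in range(n_var))
        counts[key] += 1
        sum_sq[key] += (row[target] - means[target]) ** 2 / var_x
    return {k: (counts[k] / n_obs, sum_sq[k] / counts[k]) for k in counts}


def diagonal_contributions(orthants):
    """Group opposite orthants (s and -s) and add their contributions p_i * m_uu_i."""
    diagonals = defaultdict(float)
    for key, (p, m_uu) in orthants.items():
        opposite = tuple(-s for s in key)
        diagonals[max(key, opposite)] += p * m_uu
    return dict(diagonals)


def equicorrelated_normal(n_obs, n_var, rho, rng):
    """Hypothetical sampler: one common factor gives every pair correlation rho >= 0."""
    a, b = math.sqrt(rho), math.sqrt(1.0 - rho)
    out = []
    for _ in range(n_obs):
        common = rng.gauss(0.0, 1.0)
        out.append(tuple(a * common + b * rng.gauss(0.0, 1.0) for _ in range(n_var)))
    return out


rng = random.Random(12345)
rho = 0.5

# N = 3: eight orthants, four diagonals; the contributions must add up to one.
orth3 = orthant_decomposition(equicorrelated_normal(100_000, 3, rho, rng))
diag3 = diagonal_contributions(orth3)
total3 = sum(diag3.values())

# N = 2: compare the first quadrant with the closed forms available for the BND.
orth2 = orthant_decomposition(equicorrelated_normal(100_000, 2, rho, rng))
p1_sim, muu1_sim = orth2[(1, 1)]
p1 = 0.25 + math.asin(rho) / (2.0 * math.pi)
muu1 = 1.0 + rho * math.sqrt(1.0 - rho ** 2) / (2.0 * math.pi * p1)

print(f"N=3: {len(orth3)} orthants, {len(diag3)} diagonals, total = {total3:.6f}")
print(f"N=2: p1 = {p1_sim:.4f} (theory {p1:.4f}), m_uu,1 = {muu1_sim:.4f} (theory {muu1:.4f})")
```

Because `orthant_decomposition` only needs a list of observations, the same function could be applied to draws from a fat-tailed distribution or to historical returns without modification; only the sampler would change.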


5.5. Risk parity

There is a strong parallelism between orthant-based variance decomposition and the Risk Parity methodology. Risk Parity is a portfolio investment strategy in which every portfolio asset exhibits the same risk contribution, which results in portfolios that are balanced in terms of risk (Maillard, Roncalli & Teiletche, 2010). Therefore, in the same way as with equally-weighted risk contribution portfolios, we could design portfolios in which the orthant-based variance contribution is the same for every orthant. This would result in a kind of well-diversified portfolio where smoother oscillations would replace the more aggressive scenarios in which very good days (all stocks go up, typically quadrant I) are followed by really bad days (all stocks go down, typically quadrant III). Considering N-stock portfolios, it seems possible to find numerical solutions that minimize the differences between orthant-based variance contributions, but this approach would require a special study that extends beyond the scope of this paper.
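A toy two-asset version of this idea can be sketched as follows. This is a hedged illustration, not the study the text refers to: it assumes standardized BND returns (U, V), a portfolio $R = w_u U + w_v V$, and it measures the contribution of each quadrant as its share of the portfolio second moment, $p_i\,E[R^2 \mid \text{quadrant } i] / \mathrm{Var}(R)$, built from the second and product moments of Appendix B. The function names `quadrant_shares` and `imbalance`, the weights, the grid and ρ = 0.5 are hypothetical choices.

```python
import math


def quadrant_shares(w_u, w_v, rho):
    """Share of the portfolio variance contributed by quadrants I, II, III, IV.

    Assumes standardized BND returns with correlation rho and the portfolio
    R = w_u * U + w_v * V (an illustrative set-up, not the paper's method).
    """
    root = math.sqrt(1.0 - rho ** 2)
    p1 = 0.25 + math.asin(rho) / (2.0 * math.pi)  # Sheppard's formula
    p2 = 0.5 - p1
    # Appendix B moments, already multiplied by the quadrant probability
    # (m_vv equals m_uu in every quadrant of the BND).
    p_muu = {1: p1 + rho * root / (2.0 * math.pi), 2: p2 - rho * root / (2.0 * math.pi)}
    p_muv = {1: rho * p1 + root / (2.0 * math.pi), 2: rho * p2 - root / (2.0 * math.pi)}
    var_r = w_u ** 2 + w_v ** 2 + 2.0 * w_u * w_v * rho
    shares = []
    for quadrant in (1, 2, 1, 2):  # quadrant III mirrors I, quadrant IV mirrors II
        second_moment = (w_u ** 2 + w_v ** 2) * p_muu[quadrant] + 2.0 * w_u * w_v * p_muv[quadrant]
        shares.append(second_moment / var_r)
    return shares


def imbalance(shares):
    """Spread between the largest and the smallest quadrant share (0 = perfectly balanced)."""
    return max(shares) - min(shares)


rho = 0.5
long_only = quadrant_shares(0.5, 0.5, rho)
long_short = quadrant_shares(0.5, -0.5, rho)

# Coarse grid search over w_v (with w_u fixed at 1) for the most balanced mix.
grid = [i / 100.0 for i in range(-100, 101)]
best_w = min(grid, key=lambda w: imbalance(quadrant_shares(1.0, w, rho)))

print("long-only  shares:", [round(s, 3) for s in long_only])
print("long-short shares:", [round(s, 3) for s in long_short])
print("most balanced w_v on the grid:", best_w)
```

Under these assumptions the long-only mix concentrates almost all of its variance in quadrants I and III, while a long-short mix spreads it far more evenly, which is in line with the remark that equalizing orthant contributions leads to smoother portfolios.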

6. Conclusions

In the same way that classical regression between two variables decomposes the variance into a regressed or explained part plus a non-regressed or error part, this paper discusses a variance decomposition based on an orthant framework. This decomposition consists of one part of the variance explained by the events with co-movement (same-sign variables, both positive or both negative) and another part explained by events with opposite signs. This is a four-quadrant scheme where every bivariate pdf can be split into four components and where each quadrant shows a mean and a variance contribution. The combined result of quadrants I+III compared to II+IV for diagonally symmetric distributions, as in the BND case, is of particular interest. When we focus on these diagonal quadrants, the marginal mean returns of opposite quadrants cancel each other out and a proper variance decomposition is obtained. Moreover, the equations used herein show a high variance concentration in the first and third quadrants when the overall correlation coefficient is positive, more than would be expected from the probability mass.

In short, it has been confirmed that long and short positions can produce well-balanced orthant portfolios regardless of the overall variance. Therefore, pair trading strategies are expected to reflect this smoother orthant pattern. Indeed, in the same way that the Risk Parity methodology looks for equally-weighted risk contribution portfolios, equally-weighted orthant-based risk contribution portfolios could be good candidates for portfolio optimization. However, further study should be conducted to confirm this and other properties.

In addition, in order to find parallelism with the classical variance decomposition, some linear regressions based on diagonals P and N were performed. This revealed that greater coefficients of determination can be achieved using orthant-based regression coefficients, although this requires additional information to pre-classify the type of subset the independent variable belongs to. Finally, we compared our approach to principal component analysis, showing that there was slightly greater variance concentration than found with PCA.
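To put a number on the concentration claim, the following worked example evaluates the closed forms for an illustrative correlation of ρ = 0.5 (a value chosen here for convenience, not one reported in the paper), using Sheppard's formula and the first-quadrant second moment of Appendix B:

```latex
\begin{aligned}
p_1 &= \tfrac{1}{4} + \tfrac{1}{2\pi}\arcsin(0.5) = \tfrac{1}{4} + \tfrac{1}{12} = \tfrac{1}{3},
  \qquad 2p_1 = \tfrac{2}{3} \approx 0.6667, \\
2\,p_1\, m_{uu,1} &= 2p_1 + \frac{\rho\sqrt{1-\rho^2}}{\pi}
  = \tfrac{2}{3} + \frac{0.5\sqrt{0.75}}{\pi} \approx 0.6667 + 0.1378 = 0.8045, \\
2\,p_2\, m_{uu,2} &= 2p_2 - \frac{\rho\sqrt{1-\rho^2}}{\pi}
  \approx 0.3333 - 0.1378 = 0.1955 .
\end{aligned}
```

In this case quadrants I and III hold about 67% of the probability mass but about 80% of the variance, while quadrants II and IV hold about 33% of the mass and only about 20% of the variance.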

Acknowledgments

The author would like to thank the two anonymous reviewers for their insightful comments, as well as colleagues Sandra Morini, Judit Mendoza and Valeriy Zakamulin for their useful suggestions.


Appendix A

Following Rosenbaum (1961), suppose that $U \sim N(0,1)$ and $V \sim N(0,1)$ jointly follow a bivariate normal distribution with correlation coefficient $\rho$, truncated to the region $U > h$, $V > k$. The first moment $m_u$ of the variable U is given by

$$L(h,k;\rho)\, m_u = \phi(h)\, Q\!\left(\frac{k-\rho h}{\sqrt{1-\rho^2}}\right) + \rho\, \phi(k)\, Q\!\left(\frac{h-\rho k}{\sqrt{1-\rho^2}}\right), \tag{A.1}$$

where $u = h$, $v = k$ are the truncation points, $L$ is the total probability in the truncated distribution, $\phi$ and $\Phi$ are the standard frequency and distribution functions of the univariate normal distribution, and $Q(x) = 1 - \Phi(x)$ is the corresponding upper-tail probability. By symmetry, the other first moment $m_v$ is given by

$$L(h,k;\rho)\, m_v = \rho\, \phi(h)\, Q\!\left(\frac{k-\rho h}{\sqrt{1-\rho^2}}\right) + \phi(k)\, Q\!\left(\frac{h-\rho k}{\sqrt{1-\rho^2}}\right). \tag{A.2}$$

We are interested in the symmetric case, so we particularize expressions (A.1) and (A.2) for h = 0 and k = 0, where $L(0,0;\rho) = p_1$ is given by Sheppard's formula, $p_1 = 1/4 + \arcsin(\rho)/(2\pi)$. In this case, $p_1 m_{u,1}$ and $p_1 m_{v,1}$ are obtained, where index 1 refers to the first quadrant. So,

$$(p_1 m_{u,1},\; p_1 m_{v,1}) = \left(\sqrt{\frac{2}{\pi}}\,\frac{1+\rho}{4},\; \sqrt{\frac{2}{\pi}}\,\frac{1+\rho}{4}\right)$$

is the first quadrant net contribution to the overall mean return for the standardized variable (U, V). By symmetry, the third quadrant first order moments are equal to the first quadrant ones but with the opposite sign. The second quadrant first order moments are equal to the first quadrant moments with $\rho$ replaced by $-\rho$ and a negative sign for the variable U. The fourth quadrant is the opposite of the second quadrant.

$$(p_2 m_{u,2},\; p_2 m_{v,2}) = \left(-\sqrt{\frac{2}{\pi}}\,\frac{1-\rho}{4},\; \sqrt{\frac{2}{\pi}}\,\frac{1-\rho}{4}\right)$$

$$(p_3 m_{u,3},\; p_3 m_{v,3}) = \left(-\sqrt{\frac{2}{\pi}}\,\frac{1+\rho}{4},\; -\sqrt{\frac{2}{\pi}}\,\frac{1+\rho}{4}\right)$$

$$(p_4 m_{u,4},\; p_4 m_{v,4}) = \left(\sqrt{\frac{2}{\pi}}\,\frac{1-\rho}{4},\; -\sqrt{\frac{2}{\pi}}\,\frac{1-\rho}{4}\right)$$

As can be seen, under the standard bivariate normal distribution, every quadrant mean contribution is linear in the correlation coefficient. With these first order moments, Eq. (3.1) is obtained.

Appendix B

Following the same reference, Rosenbaum (1961), the second moment about the origin of the standardized variable U is obtained by

$$L(h,k;\rho)\, m_{uu} = L(h,k;\rho) + h\,\phi(h)\, Q\!\left(\frac{k-\rho h}{\sqrt{1-\rho^2}}\right) + \rho^2 k\,\phi(k)\, Q\!\left(\frac{h-\rho k}{\sqrt{1-\rho^2}}\right) + \frac{\rho\sqrt{1-\rho^2}}{\sqrt{2\pi}}\,\phi\!\left(\sqrt{\frac{h^2-2\rho hk+k^2}{1-\rho^2}}\right).$$

The second moment about the origin of the standardized variable V is

$$L(h,k;\rho)\, m_{vv} = L(h,k;\rho) + \rho^2 h\,\phi(h)\, Q\!\left(\frac{k-\rho h}{\sqrt{1-\rho^2}}\right) + k\,\phi(k)\, Q\!\left(\frac{h-\rho k}{\sqrt{1-\rho^2}}\right) + \frac{\rho\sqrt{1-\rho^2}}{\sqrt{2\pi}}\,\phi\!\left(\sqrt{\frac{h^2-2\rho hk+k^2}{1-\rho^2}}\right).$$

The product moment about the origin of U and V is

$$L(h,k;\rho)\, m_{uv} = \rho\, L(h,k;\rho) + \rho\, h\,\phi(h)\, Q\!\left(\frac{k-\rho h}{\sqrt{1-\rho^2}}\right) + \rho\, k\,\phi(k)\, Q\!\left(\frac{h-\rho k}{\sqrt{1-\rho^2}}\right) + \frac{\sqrt{1-\rho^2}}{\sqrt{2\pi}}\,\phi\!\left(\sqrt{\frac{h^2-2\rho hk+k^2}{1-\rho^2}}\right).$$

Particularizing to the values h = 0 and k = 0, the 1st and 3rd quadrant moments are obtained,

$$(m_{uu,1},\; m_{vv,1},\; m_{uv,1}) = \left(1 + \frac{\rho\sqrt{1-\rho^2}}{2\pi p_1},\; 1 + \frac{\rho\sqrt{1-\rho^2}}{2\pi p_1},\; \rho + \frac{\sqrt{1-\rho^2}}{2\pi p_1}\right).$$

And by symmetry, the second quadrant and fourth quadrant moments,

$$(m_{uu,2},\; m_{vv,2},\; m_{uv,2}) = \left(1 - \frac{\rho\sqrt{1-\rho^2}}{2\pi p_2},\; 1 - \frac{\rho\sqrt{1-\rho^2}}{2\pi p_2},\; \rho - \frac{\sqrt{1-\rho^2}}{2\pi p_2}\right).$$

Eqs. (3.2a) and (3.2b) are obtained with these second order and product moments.

Appendix C

The coefficient of determination of the combination of diagonal regressions is the proportion of the total sum of squares (SST) that is explained by the regression (SSR), $R^2_{PN} = \mathrm{SSR}/\mathrm{SST}$. Taking into account that SST, SSR and SSE can each be decomposed into two parts with $n = n_P + n_N$, the following relations hold:

$$\mathrm{SST} = \sum_{i=1}^{n} (x_i - \mu_x)^2 = \sum_{i=1}^{n_P} (x_i - \mu_x)^2 + \sum_{i=1}^{n_N} (x_i - \mu_x)^2 = \mathrm{SST}_P + \mathrm{SST}_N$$

$$\mathrm{SSR} = \sum_{i=1}^{n} (\tilde{x}_i - \mu_x)^2 = \sum_{i=1}^{n_P} (\tilde{x}_i - \mu_x)^2 + \sum_{i=1}^{n_N} (\tilde{x}_i - \mu_x)^2 = \mathrm{SSR}_P + \mathrm{SSR}_N$$

$$\mathrm{SSE} = \sum_{i=1}^{n} (\tilde{x}_i - x_i)^2 = \sum_{i=1}^{n_P} (\tilde{x}_i - x_i)^2 + \sum_{i=1}^{n_N} (\tilde{x}_i - x_i)^2 = \mathrm{SSE}_P + \mathrm{SSE}_N$$

In terms of variance, SSR and SST can be expressed as

$$\mathrm{SSR} = n_P\, \mathrm{Var}(\tilde{X}_P) + n_N\, \mathrm{Var}(\tilde{X}_N) = n_P\, \rho_P^2\, \sigma_{x,P}^2 + n_N\, \rho_N^2\, \sigma_{x,N}^2 = n_P\, \rho_P^2\, \sigma_x^2\, m_{uu,1} + n_N\, \rho_N^2\, \sigma_x^2\, m_{uu,2}$$

and $\mathrm{SST} = n \sigma_x^2$. Dividing both expressions, and using $n_P/n = 2p_1$, $n_N/n = 2p_2$, $R_P^2 = \rho_P^2$ and $R_N^2 = \rho_N^2$: $R^2_{PN} = \mathrm{SSR}/\mathrm{SST} = 2 p_1 m_{uu,1} R_P^2 + 2 p_2 m_{uu,2} R_N^2$. Since the BND satisfies $m_{uu,i} = m_{vv,i}$, Eq. (5.2) is proved. In the same way, the mean squared error (MSE) of the different regressions analyzed can easily be obtained, that is, classical regression, combination with pre-classification and combination without pre-classification:

$$\mathrm{MSE} = 1 - \rho^2$$

$$\mathrm{MSE}(\tilde{X}) = 2 p_1\, \mathrm{MSE}(\tilde{X}_P) + 2 p_2\, \mathrm{MSE}(\tilde{X}_N) = 2 p_1 m_{uu,1} (1 - \rho_P^2) + 2 p_2 m_{uu,2} (1 - \rho_N^2)$$

$$\mathrm{MSE}(\tilde{\tilde{X}}) = 2 p_1\, \mathrm{MSE}(\tilde{\tilde{X}}_P) + 2 p_2\, \mathrm{MSE}(\tilde{\tilde{X}}_N) = 2 p_1 (1 - 2\rho_P \rho + \rho_P^2) + 2 p_2 (1 - 2\rho_N \rho + \rho_N^2)$$

References

Acar, E., & Lequeux, P. (1996). Dynamic strategies: A correlation study. In C. Dunis (Ed.), Forecasting financial markets (pp. 93–123). London: Wiley.
Acar, E., & Lequeux, P. (1998). Trading rules profits and the underlying time series properties. In P. Lequeux (Ed.), Financial markets tick by tick (pp. 255–301). London: Wiley.
Arismendi, J. C. (2013). Multivariate truncated moments. Journal of Multivariate Analysis, 117, 41–75.
Arnold, B. C., Beaver, R. J., Groeneveld, R. A., & Meeker, W. Q. (1993). The nontruncated marginal of a truncated bivariate normal distribution. Psychometrika, 58(3), 471–488.
Deshpande, A., Ertley, B., Lundin, M., & Satchell, S. (2019). Risk discriminating portfolio optimization. Quantitative Finance, 19(2), 177–185.
Giner, J., Mendoza, J., & Morini, S. (2018). Correlation as probability: Applications of Sheppard’s formula to financial assets. Quantitative Finance, 18(5), 777–787.
Horrace, W. C. (2005). Some results on the multivariate truncated normal distribution. Journal of Multivariate Analysis, 94(1), 209–221.


Johnson, N. L., & Kotz, S. (1972). Distributions in statistics: Continuous multivariate distributions. John Wiley.
Kotz, S., & Nadarajah, S. (2004). Multivariate t distributions and their applications. Cambridge University Press.
Lundin, M., & Satchell, S. (2002). Performance measurement of portfolio risk with orthant probabilities. In J. Knight, & S. Satchell (Eds.), Performance measurement in finance (pp. 261–284). Butterworth and Heinemann.
Lundin, M., & Satchell, S. (2016). Risk management for return enhancement. Risk Magazine, February 2016, 126–131.
Lundin, M., & Satchell, S. (2018). Orthant probability-based correlation. In J. Alcock, & S. Satchell (Eds.), Asymmetric dependence in finance (pp. 133–151). New York: Wiley.
Maillard, S., Roncalli, T., & Teiletche, J. (2010). On the properties of equally-weighted risk contributions portfolios. The Journal of Portfolio Management, 36(4), 60–70.
Menn, C., & Rachev, S. T. (2005). A GARCH option pricing model with alpha-stable innovations. European Journal of Operational Research, 163, 201–209.
Rosenbaum, S. (1961). Moments of a truncated bivariate normal distribution. Journal of the Royal Statistical Society, Series B (Methodology), 23(2), 405–408.
Sharpe, W. F. (1963). A simplified model for portfolio analysis. Management Science, 9(2), 277–293.
Sheppard, W. F. (1899). On the application of the theory of error to cases of normal distribution and normal correlation. Philosophical Transactions of the Royal Society of London Series A, 192, 101–167.
