Notes and comment
A note on the standardized covariance

David V. Budescu a,∗, Yuanchao Emily Bo b

a Fordham University, United States
b UCLA, United States

∗ Corresponding author. E-mail address: [email protected] (D.V. Budescu).
Highlights

• We show that the standardized covariance is a measure of additive association.
• In simple cases the standardized covariance is a function of the difference between the ranges of the two lotteries.
Article history: Received 10 September 2015; received in revised form 3 July 2016; available online xxxx.

Keywords: Standardized covariance; Coefficient of additivity; Metric association
Abstract

In a recent paper, Andraszewicz and Rieskamp (2014) proposed using the standardized covariance as a "measure of association, similarity and co-riskiness between choice options". They stress that the standardized covariance is not a measure of linear association, but do not specify its exact nature. We relate the standardized covariance to Zegers and ten Berge's (1985) family of metric association measures and show that it is a measure of additive association, and we analyze some properties of this measure for binary lotteries. We distinguish between the case where both lotteries are driven by one (common) probability distribution and the more general case where the two are resolved, independently, by two distinct distributions, and we show how the range of the outcomes offered by the lotteries and their probability distributions drive the value of the standardized covariance.
Recently, Andraszewicz and Rieskamp (2014) (henceforth AR) proposed using the standardized covariance as a "measure of association, similarity and co-riskiness between choice options". They discuss some of its properties and illustrate its relevance in evaluating and comparing various quantitative models of choice between risky prospects. They focused mostly on the case of pairs of binary (two-outcome) gambles, which is prevalent in the decision literature, and highlighted the differences between the new measure and the familiar correlation coefficient, which is uninformative in the case of binary outcomes. They stress that, unlike the correlation, the standardized covariance is not a measure of linear relationship. The goal of the present note is to put this proposed measure in the proper statistical context by relating it to a more general family of metric association coefficients, providing a more general perspective on its interpretation and, in the process, clarifying some points in AR's exposition in order to facilitate the interpretation of the proposed measure. Zegers and Ten Berge (1985) and Zegers (1986) described a family of four association coefficients for metric scales, designed
to capture the nature of the relationship between two random variables, X and Y, with finite moments µ_x, σ_x, µ_y, σ_y and σ_xy. The natural benchmark is the "traditional" Pearson correlation, r, which is a coefficient of linearity, defined as

r = σ_xy / (σ_x σ_y)   (1)
and quantifies agreement with a perfect linear relation of the type Y = α + βX. The (chance-corrected) coefficient of proportionality, p, is defined as

p = σ_xy / [√((µ_x² + σ_x²)(µ_y² + σ_y²)) − µ_x µ_y]   (2)
and measures agreement with a perfect proportional relation of the type Y = βX (Equation 7 in Zegers, 1986). The coefficient of additivity, a, is defined as

a = σ_xy / [(σ_x² + σ_y²) / 2]   (3)
and measures fit with a perfect additive relation of the type Y = α + X (Equation 19 in Zegers & Ten Berge, 1985). Finally, the (chance-corrected) coefficient of identity is defined as

e = σ_xy / {[(σ_x² + σ_y²) + (µ_x − µ_y)²] / 2}   (4)

and measures agreement with a perfect identity relation, Y = X (Equation 6 in Zegers, 1986).
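As a concrete aside (ours, not part of AR's or Zegers and ten Berge's papers), the four coefficients are easy to compute from population moments. The following minimal Python sketch implements Eqs. (1)–(4) as reconstructed above; the function name and the example data are ours.

import numpy as np

def coefficients(x, y, w=None):
    """Linearity r, proportionality p, additivity a, and identity e (Eqs. (1)-(4)),
    computed from (optionally weighted) population moments of paired values."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    w = np.full(x.size, 1.0 / x.size) if w is None else np.asarray(w, float)
    mx, my = w @ x, w @ y
    vx, vy = w @ (x - mx) ** 2, w @ (y - my) ** 2
    cxy = w @ ((x - mx) * (y - my))
    r = cxy / np.sqrt(vx * vy)                                    # Eq. (1)
    p = cxy / (np.sqrt((mx**2 + vx) * (my**2 + vy)) - mx * my)    # Eq. (2)
    a = cxy / ((vx + vy) / 2)                                     # Eq. (3): standardized covariance
    e = cxy / (((vx + vy) + (mx - my) ** 2) / 2)                  # Eq. (4)
    return r, p, a, e

x = np.array([1.0, 2.0, 4.0, 7.0])
print(coefficients(x, x + 3))      # Y = X + 3: a = 1 exactly, r = 1, but e < 1 and p < 1
print(coefficients(x, 2 * x + 3))  # Y = 2X + 3: r = 1, but a < 1 (not a pure additive shift)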
All four coefficients use the covariance between X and Y, σ_xy, in the numerator, but they have different denominators, meaning that they re-scale σ_xy in different ways. All four coefficients are 0 if, and only if, the covariance is 0, so they all agree in the case of no association, but they differ in the way they measure departures from this common benchmark. Given the four definitions, it is clear that the coefficients define a partial hierarchy, such that |r| ≥ |a|, |p| ≥ |e| (a and p lie between r and e in magnitude). Since the linear relation involves two free parameters – a slope and an intercept – it is always possible to select two values to achieve a perfect linear correlation, r = ±1, between binary random variables (the sign of the correlation being defined by the sign of the slope). However, the other three coefficients have fewer free parameters (the coefficient of identity, e, has no free parameters), are not subject to this degeneracy, and can be meaningfully used to compare binary prospects. This explains why AR (2014) found it necessary to avoid the measure of linear relationship, r. AR (2014) stress that the standardized covariance measures a non-linear relationship, but they do not specify the exact nature of the relationship being captured. Our analysis clarifies this ambiguity: AR's standardized covariance is, exactly, Zegers and ten Berge's coefficient of additivity (we will use the two names interchangeably), which measures the magnitude of the additive relationship between two random variables (lotteries). In this context, it is instructive to consider the conditions under which the various coefficients coincide with the linear correlation, r. It is easy to show that:

e = r ⟺ µ_x = µ_y and σ_x = σ_y   (5)

p = r ⟺ σ_x/µ_x = σ_y/µ_y ⟺ CV(X) = CV(Y)   (6)

a = r ⟺ σ_x = σ_y.   (7)

Eq. (5) shows that e = r only if X and Y have identical means and variances. Eq. (6) states that p = r only if the two gambles X and Y have identical coefficients of variation, indicating that the relative riskiness of the two gambles is the same. The third equality, (7), shows that a = r only if the variances of the two gambles X and Y are equal, indicating that they are equally risky. The standardized covariance, a, is a Pearson correlation coefficient only under the special condition of homogeneity of variances. Indeed, Mehta and Gurland (1969) show that this coefficient is the maximum likelihood estimate of the linear correlation between two normal variables with equal variances.

The standardized covariance, a.k.a. the coefficient of additivity, preserves most of the properties of the "regular" correlation. It is: (1) double bounded: −1 ≤ a ≤ +1; (2) invariant, except for sign reversal, under change of orientation of either variable: a(x, −y) = a(−x, y) = −a(x, y) and, of course, a(x, y) = a(−x, −y); (3) symmetric: a(x, y) = a(y, x); and (4) invariant under additive shifts: a(x, y) = a(x + b, y + c). However, as AR (2014) point out, unlike Pearson's correlation, it is not invariant under changes of scale: a(x, y) ≠ a(kx, qy).

In the rest of the note we explore a, the coefficient of additivity (a.k.a. standardized covariance), in order to provide some context for its interpretation. Following AR's (2014) lead we focus on binary lotteries and we distinguish between the case where the two lotteries are driven by one (common) probability distribution and the more general case where the two are resolved, independently, by two distinct distributions.
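A quick numerical check of conditions (5)–(7) (again our own illustration, not AR's code): two variables with equal variances give a = r, and equal means and variances give e = r.

import numpy as np

def r_a_e(x, y):
    mx, my, vx, vy = x.mean(), y.mean(), x.var(), y.var()   # population moments
    cxy = ((x - mx) * (y - my)).mean()
    r = cxy / np.sqrt(vx * vy)                              # Eq. (1)
    a = cxy / ((vx + vy) / 2)                               # Eq. (3)
    e = cxy / (((vx + vy) + (mx - my) ** 2) / 2)            # Eq. (4)
    return r, a, e

x = np.array([0.0, 1, 2, 3, 4, 5])
y = np.array([0.0, 1, 2, 4, 3, 5])          # a permutation of x: same mean and variance

r, a, e = r_a_e(x, y + 5)                   # equal variances, unequal means
print(np.isclose(a, r), np.isclose(e, r))   # True False: condition (7) met, condition (5) not
r, a, e = r_a_e(x, y)                       # equal variances and equal means
print(np.isclose(a, r), np.isclose(e, r))   # True True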
Standardized covariance for two lotteries with a common probability distribution

Consider two binary lotteries, X and Y, with a single (common) probability resolution mechanism. Let X offer X_H with probability p, and X_L with probability (1 − p), where the subscripts H and L refer to the higher and lower outcomes, respectively. AR (2014) show that the results are invariant across all probabilities, so they do not play a role in this derivation (as will be seen below, the common factor p(1 − p) cancels in Eq. (3)). It follows that:

µ_X = pX_H + (1 − p)X_L, and σ_X = √(p(1 − p)) |X_H − X_L| = √(p(1 − p)) Δ_X.   (8)
Let Y offer Y_H and Y_L with the same probabilities. The values of X and Y are perfectly correlated, but the sign of the correlation may vary. If P(Y_H) = p, matching the pattern in X, the two lotteries are positively correlated (r = +1); if, on the other hand, P(Y_H) = (1 − p), the reverse of the pattern in X, the two lotteries are negatively correlated (r = −1). By symmetry:
µ_Y = pY_H + (1 − p)Y_L, and σ_Y = √(p(1 − p)) |Y_H − Y_L| = √(p(1 − p)) Δ_Y.   (9)
It follows that σ_XY = p(1 − p)Δ_XΔ_Y in the positively correlated case (a sign reversal simply carries over), so the common factor p(1 − p) cancels and the standardized covariance is given by:

a = 2Δ_XΔ_Y / (Δ_X² + Δ_Y²).   (10)
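The invariance to p can be checked directly (a minimal sketch of ours, assuming the common resolution mechanism described above): compute a from the joint two-point distribution via Eq. (3) and compare it with Eq. (10).

import numpy as np

def a_common(x_hi, x_lo, y_hi, y_lo, p):
    """Standardized covariance (Eq. (3)) for two binary lotteries resolved by one common
    event: (x_hi, y_hi) occurs with probability p, (x_lo, y_lo) with probability 1 - p."""
    w = np.array([p, 1 - p])
    x = np.array([x_hi, x_lo], float)
    y = np.array([y_hi, y_lo], float)
    mx, my = w @ x, w @ y
    vx, vy = w @ (x - mx) ** 2, w @ (y - my) ** 2
    cxy = w @ ((x - mx) * (y - my))
    return cxy / ((vx + vy) / 2)

dx, dy = 8.0, 3.0                              # Delta_X and Delta_Y of the example lotteries
for p in (0.1, 0.25, 0.5, 0.9):
    print(round(a_common(10, 2, 5, 2, p), 6))  # the same value for every p
print(round(2 * dx * dy / (dx**2 + dy**2), 6)) # Eq. (10): 48/73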
If we let δ = Δ_Y/Δ_X denote the ratio of the two standard deviations (which in the binary case is also the ratio of the ranges of the two sets of values), and re-arrange terms, we can show that
a = 2δ / (1 + δ²),   (11)

i.e., a function of the ratio δ alone. It is easy to see that a peaks at δ = 1 (where a = 1) and decreases as δ moves away from 1 in either direction, i.e., when the range of values of one of the two lotteries increases (or decreases) markedly compared to the other lottery. The decrease is symmetric, towards 0, on the log(δ) scale. In other words, one can think of a, the standardized covariance, as a simple measure of the disparity between the ranges of outcomes offered by the two lotteries.
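A small numerical illustration (ours) of the symmetry just described: a depends on δ only through the pair {δ, 1/δ}, so it is an even function of log(δ) with a maximum of 1 at δ = 1.

def a_of_delta(d):
    return 2 * d / (1 + d**2)           # Eq. (11)

for d in (0.1, 0.5, 2.0, 10.0):
    # a(d) and a(1/d) coincide: the decrease is symmetric on the log(delta) scale
    print(d, round(a_of_delta(d), 4), round(a_of_delta(1 / d), 4))
print(a_of_delta(1.0))                  # maximum of exactly 1 at delta = 1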
Similarity between two lotteries with distinct probability distributions

Consider now two binary lotteries, X and Y, with distinct, and independent, probability distributions. Let X offer X_H with probability p_X, and X_L with probability (1 − p_X), and let Y offer Y_H with probability p_Y, and Y_L with probability (1 − p_Y). In both cases the subscripts H and L refer to the higher and lower outcomes, respectively. The moments of the lotteries are (see Appendix for details):

µ_X = p_X X_H + (1 − p_X)X_L and σ_X² = p_X(1 − p_X)Δ_X²   (12)
and, by symmetry:
µ_Y = p_Y Y_H + (1 − p_Y)Y_L and σ_Y² = p_Y(1 − p_Y)Δ_Y².   (13)
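The moment formulas in Eqs. (12) and (13) can be confirmed by direct enumeration; the following is a minimal sketch of ours.

import numpy as np

def binary_moments(hi, lo, p):
    """Mean and variance of a lottery paying `hi` with probability p and `lo` otherwise."""
    w = np.array([p, 1 - p])
    v = np.array([hi, lo], float)
    mean = w @ v
    var = w @ (v - mean) ** 2
    return mean, var

x_hi, x_lo, p_x = 10.0, 2.0, 0.3
mean, var = binary_moments(x_hi, x_lo, p_x)
print(np.isclose(mean, p_x * x_hi + (1 - p_x) * x_lo))        # Eq. (12), mean
print(np.isclose(var, p_x * (1 - p_x) * (x_hi - x_lo) ** 2))  # Eq. (12), variance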
Because the covariance of two independent lotteries is, by definition, 0, AR (2014) introduced a new measure of the strength of similarity, s_XY (Eq. 7 in their paper), which, in this case, takes on a simple symmetric form (see Appendix for details):

s_XY = 4p_X p_Y (1 − p_X)(1 − p_Y)Δ_XΔ_Y.   (14)
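Reading s_XY as in the Appendix derivation below — the expected product of the two independent lotteries' absolute deviations from their means — Eq. (14) can be confirmed numerically. The sketch and its function name are ours, not AR's code.

def s_xy(x_hi, x_lo, p_x, y_hi, y_lo, p_y):
    """Expected product of absolute deviations of two independent binary lotteries
    (the form of s_XY derived in the Appendix)."""
    m_x = p_x * x_hi + (1 - p_x) * x_lo
    m_y = p_y * y_hi + (1 - p_y) * y_lo
    e_x = p_x * abs(x_hi - m_x) + (1 - p_x) * abs(x_lo - m_x)
    e_y = p_y * abs(y_hi - m_y) + (1 - p_y) * abs(y_lo - m_y)
    return e_x * e_y

x_hi, x_lo, p_x = 10.0, 2.0, 0.3
y_hi, y_lo, p_y = 5.0, 1.0, 0.6
d_x, d_y = x_hi - x_lo, y_hi - y_lo
lhs = s_xy(x_hi, x_lo, p_x, y_hi, y_lo, p_y)
rhs = 4 * p_x * p_y * (1 - p_x) * (1 - p_y) * d_x * d_y   # Eq. (14)
print(abs(lhs - rhs) < 1e-12)                             # True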
Finally, the new standardized global similarity (defined in Equation 9 of AR's paper) is given by:

S_XY = 8d_XY p_X p_Y(1 − p_X)(1 − p_Y)Δ_XΔ_Y / [p_X(1 − p_X)Δ_X² + p_Y(1 − p_Y)Δ_Y²],   (15)

where d_XY is a sign indicator (defined in Equation 8 in AR). First, it is important to emphasize that this is not a direct generalization of the standardized covariance from the previous section. If we let p_X = p_Y = p, we find that:

S_XY = 8d_XY p²(1 − p)²Δ_XΔ_Y / [p(1 − p)Δ_X² + p(1 − p)Δ_Y²] = 8d_XY p(1 − p)Δ_XΔ_Y / (Δ_X² + Δ_Y²) = 4d_XY p(1 − p) a,   (16)

where a is the standardized covariance defined in Eqs. (3) and (11). So the measure is not independent of the probability distribution, as AR correctly point out. When the variance is maximal (i.e., p = 0.5 and p(1 − p) = 0.25) the relation simplifies to S_XY = d_XY a.

Let us denote by δ = Δ_Y/Δ_X the ratio of the two ranges of values, and define γ = p_Y(1 − p_Y) / [p_X(1 − p_X)], a ratio related to the key probabilities but independent of the outcomes. Eq. (15) can be re-arranged as (see Appendix for details):

S_XY = 8d_XY p_X(1 − p_X) γδ / (1 + γδ²).   (17)

In the special case where p_X = 0.5, the standardized global similarity reduces to:

S_XY = d_XY 2γδ / (1 + γδ²).   (18)

Fig. 1 displays the value of S_XY when p_X = 0.5 as a function of the two ratios, δ and γ.

Fig. 1. The value of S_XY as a function of δ and γ when p_X = 0.5.

If we slice the 3-D plot along the log(γ) axis it is easy to see that S_XY peaks at γ = 1 (log(γ) = 0), where the two probabilities are equal, and decreases symmetrically as the ratio of the two probabilities increases (or decreases). For any fixed value of log(δ), S_XY is a monotonically increasing function of log(γ). We further study to what degree the similarity is affected by the relation between the two sets of outcomes, as measured by δ, and the relation between the probability distributions, as measured by γ. To this end we compare the partial derivatives:

∂S_XY/∂γ = 8d_XY p_X(1 − p_X) [δ(1 + γδ²) − γδ³] / (1 + γδ²)² = 8d_XY p_X(1 − p_X) δ / (1 + γδ²)²,   (19)

and

∂S_XY/∂δ = 8d_XY p_X(1 − p_X) [γ(1 + γδ²) − γδ · 2γδ] / (1 + γδ²)² = 8d_XY p_X(1 − p_X) (γ − γ²δ²) / (1 + γδ²)²,   (20)

where δ > 0 and 0 < γ ≤ 0.25/[p_X(1 − p_X)].
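The two partial derivatives can be verified symbolically. The following sketch (ours) differentiates Eq. (17) with sympy and confirms the closed forms in Eqs. (19) and (20).

import sympy as sp

gamma, delta, p_x = sp.symbols('gamma delta p_X', positive=True)
d_xy = sp.Symbol('d_XY')                      # sign indicator (+1 or -1)

S = 8 * d_xy * p_x * (1 - p_x) * gamma * delta / (1 + gamma * delta**2)   # Eq. (17)

target_19 = 8 * d_xy * p_x * (1 - p_x) * delta / (1 + gamma * delta**2)**2
target_20 = 8 * d_xy * p_x * (1 - p_x) * (gamma - gamma**2 * delta**2) / (1 + gamma * delta**2)**2

print(sp.simplify(sp.diff(S, gamma) - target_19))   # 0  -> Eq. (19)
print(sp.simplify(sp.diff(S, delta) - target_20))   # 0  -> Eq. (20)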
To better explore the relationship between the probability distributions and S_XY, we plot the partial derivatives with respect to the two ratios for three values of p_X = 0.1, 0.3 and 0.5, as shown in Fig. 2. In these plots we set d_XY = 1; of course, if d_XY = −1, the plots are reversed.

Fig. 2. The partial derivatives of S_XY with respect to δ and γ when p_X = 0.1, 0.3 and 0.5.

The partial derivative with respect to the ratio of probabilities, ∂S_XY/∂γ, is always positive, and the partial derivative with respect to the ratio of outcomes, ∂S_XY/∂δ, is positive only if γδ² < 1, i.e., when σ_Y² < σ_X². The two partial derivatives are equal (i.e., the two surfaces intersect) when (γ − δ) = γ²δ².

We now turn our attention to the case where the two lotteries have identical expected values. By setting, without any loss of generality, X_L = Y_L = 0, the equal-expectations constraint µ_X = µ_Y implies that p_Y = p_X/δ, and we can rewrite Eq. (15) as (see Appendix for details):

S_XY = 8d_XY p_X(1 − p_X) (δ − p_X) / (δ − 2δp_X + δ²).   (21)

Next we take the two partial derivatives: Eq. (22) is given in Box I and Eq. (23) is given in Box II, where δ > 0 and 0 < p_X < 1. Because of the equal-means constraint, d_XY = −1 in this case.

Box I.
∂S_XY/∂p_X = ∂[8d_XY p_X(1 − p_X)(δ − p_X)/(δ − 2δp_X + δ²)]/∂p_X
= [8d_XY (δ − 2p_X − 2δp_X + 3p_X²)(δ − 2δp_X + δ²) + 8d_XY p_X(1 − p_X)(δ − p_X) · 2δ] / (δ − 2δp_X + δ²)²   (22)

Box II.
∂S_XY/∂δ = ∂[8d_XY p_X(1 − p_X)(δ − p_X)/(δ − 2δp_X + δ²)]/∂δ
= [8d_XY p_X(1 − p_X)(δ − 2δp_X + δ²) − 8d_XY p_X(1 − p_X)(δ − p_X)(1 − 2p_X + 2δ)] / (δ − 2δp_X + δ²)²   (23)

Figs. 3 and 4 present the relationship between the two partial derivatives (defined in Eqs. (22) and (23)) and p_X and δ.

Fig. 3. Partial derivative of S_XY, when the two lotteries have equal expected values, with respect to δ, as a function of δ and p_X.

Fig. 4. Partial derivative of S_XY, when the two lotteries have equal expected values, with respect to p_X, as a function of δ and p_X.

As shown in the two plots, the rate of change of S_XY as we change p_X and hold δ fixed (over the plotted range of 0–10) is much higher than the rate of change of S_XY as we change δ and hold p_X fixed. This is confirmed by a comparison of the absolute values of ∂S_XY/∂p_X and ∂S_XY/∂δ, which shows that the former is higher in most cases (the exceptions are in the vicinity of p_X = 0.5, when the two probabilities are similar), suggesting that, in general, p_X is a more powerful driver of S_XY than δ. Thus, in general, the probability distributions of lotteries with identical expected values are more critically related to changes of S_XY than the outcomes of the two lotteries.
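The equal-expected-values analysis can be reproduced along the following lines (our own sketch; the grid limits are arbitrary choices): differentiate Eq. (21) symbolically (these derivatives correspond to Boxes I and II) and tally how often |∂S_XY/∂p_X| exceeds |∂S_XY/∂δ| over the admissible region.

import sympy as sp

p_x, delta = sp.symbols('p_X delta', positive=True)
d_xy = -1                                     # d_XY = -1 under the equal-means constraint

S = 8 * d_xy * p_x * (1 - p_x) * (delta - p_x) / (delta - 2 * delta * p_x + delta**2)  # Eq. (21)

dS_dp = sp.simplify(sp.diff(S, p_x))          # corresponds to Box I / Eq. (22)
dS_dd = sp.simplify(sp.diff(S, delta))        # corresponds to Box II / Eq. (23)

f_p = sp.lambdify((p_x, delta), sp.Abs(dS_dp))
f_d = sp.lambdify((p_x, delta), sp.Abs(dS_dd))

# Tally how often |dS/dp_X| exceeds |dS/d delta| on a coarse grid of the admissible
# region (delta > p_X keeps p_Y = p_X / delta inside the unit interval).
larger = total = 0
for k in range(1, 20):
    p = 0.05 * k
    for j in range(1, 21):
        dl = 0.5 * j
        if dl <= p:
            continue
        total += 1
        larger += f_p(p, dl) > f_d(p, dl)
print(f"|dS/dp_X| > |dS/d delta| at {larger} of {total} grid points")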
Final remarks

AR's (2014) analysis highlights the inappropriateness of the standard correlation for analyzing the relations between binary lotteries. They proposed the standardized covariance as an alternative measure, but failed to specify what exactly it measures. It is not entirely clear whether AR (2014) suggested the use of a because of its intuitive and appealing form, or because they consider it optimal based on some of its properties. We showed that when both binary lotteries have a common probability distribution, the standardized covariance is a simple measure of the disparity between the ranges of outcomes offered by the two lotteries. When the two lotteries have distinct probability distributions and identical expected values, the standardized covariance is more critically related to the probabilities than to the outcomes of the two lotteries. By placing the standardized covariance in the context of a larger family of alternative coefficients and identifying its precise nature, we open the door for the exploration of alternative measures, such as the coefficients of identity and/or proportionality, and others. Future work should explore the merits of the various coefficients in this context.
Appendix. Derivations
Eq. (12):

σ_X² = p_X(X_H − µ_X)² + (1 − p_X)(X_L − µ_X)²
= p_X(X_H − p_X X_H − (1 − p_X)X_L)² + (1 − p_X)(X_L − p_X X_H − (1 − p_X)X_L)²
= p_X(1 − p_X)²Δ_X² + (1 − p_X)p_X²Δ_X²
= p_X(1 − p_X)Δ_X².

Eq. (14):

s_XY = p_X p_Y √((X_H − µ_X)²(Y_H − µ_Y)²) + p_X(1 − p_Y)√((X_H − µ_X)²(Y_L − µ_Y)²) + (1 − p_X)p_Y √((X_L − µ_X)²(Y_H − µ_Y)²) + (1 − p_X)(1 − p_Y)√((X_L − µ_X)²(Y_L − µ_Y)²)
= [p_X √((X_H − µ_X)²) + (1 − p_X)√((X_L − µ_X)²)] × [p_Y √((Y_H − µ_Y)²) + (1 − p_Y)√((Y_L − µ_Y)²)]
= (p_X(1 − p_X)Δ_X + p_X(1 − p_X)Δ_X) × (p_Y(1 − p_Y)Δ_Y + p_Y(1 − p_Y)Δ_Y)
= 4p_X p_Y(1 − p_X)(1 − p_Y)Δ_XΔ_Y.

Eq. (17):

S_XY = 8d_XY p_X p_Y(1 − p_X)(1 − p_Y)Δ_XΔ_Y / [p_X(1 − p_X)Δ_X² + p_Y(1 − p_Y)Δ_Y²]
= [8d_XY p_X p_Y(1 − p_X)(1 − p_Y)Δ_XΔ_Y / (p_X(1 − p_X)Δ_X²)] / [(p_X(1 − p_X)Δ_X² + p_Y(1 − p_Y)Δ_Y²) / (p_X(1 − p_X)Δ_X²)]
= 8d_XY p_Y(1 − p_Y)(Δ_Y/Δ_X) / [1 + p_Y(1 − p_Y)Δ_Y² / (p_X(1 − p_X)Δ_X²)]
= 8d_XY p_Y(1 − p_Y)δ / (1 + γδ²)
= 8d_XY p_X(1 − p_X) γδ / (1 + γδ²).

Eq. (21):

S_XY = 8d_XY p_X p_Y(1 − p_X)(1 − p_Y)Δ_XΔ_Y / [p_X(1 − p_X)Δ_X² + p_Y(1 − p_Y)Δ_Y²]
= 8d_XY p_X(1 − p_X) [p_Y(1 − p_Y)/(p_X(1 − p_X))] δ / [1 + [p_Y(1 − p_Y)/(p_X(1 − p_X))] δ²]
= 8d_XY p_X(1 − p_X) (1 − p_X/δ) / [(1 − p_X) + δ(1 − p_X/δ)]   (using p_Y = p_X/δ)
= 8d_XY p_X(1 − p_X) (δ − p_X) / [(δ − δp_X) + δ(δ − p_X)]
= 8d_XY p_X(1 − p_X) (δ − p_X) / (δ − 2δp_X + δ²).

References

Andraszewicz, S., & Rieskamp, J. (2014). Standardized covariance: A new measure of association, similarity and co-riskiness between choice options. Journal of Mathematical Psychology, 61, 25–37.
Mehta, J. S., & Gurland, J. (1969). Testing equality of means in the presence of correlation. Biometrika, 56, 119–126.
Zegers, F. E. (1986). A family of chance-corrected association coefficients for metric scales. Psychometrika, 51, 562–595.
Zegers, F. E., & Ten Berge, J. M. F. (1985). A family of association coefficients for metric scales. Psychometrika, 50, 17–24.