Extensions of Pearson’s inequality between skewness and kurtosis to multivariate cases


Statistics and Probability Letters xx (xxxx) xxx–xxx

Contents lists available at ScienceDirect

Statistics and Probability Letters journal homepage: www.elsevier.com/locate/stapro

Haruhiko Ogasawara

Otaru University of Commerce, 3-5-21, Midori, Otaru 047-8501, Japan

Article history: Received 12 January 2017; received in revised form 11 July 2017; accepted 13 July 2017.

Abstract: An extension of Pearson’s inequality between squared skewness and kurtosis to the case with three possibly distinct variables is obtained. A similar extension to the multivariate analogue of skewness defined by Mardia (1970) is also derived. © 2017 Elsevier B.V. All rights reserved.

Keywords: multivariate cumulants; Cauchy–Schwarz inequality; Pearson’s inequality; infinitely divisible

1. Introduction


Denote the skewness and kurtosis of a variable by sk and kt, respectively, with the assumption of their existence. Then,

\[
\mathrm{sk}^2 - 2 \le \mathrm{kt} \tag{1.1}
\]

holds, which is known as Pearson’s inequality. The proof has been given by G.N. Watson (Pearson, 1916, p. 432), Wilkins (1944), Rohatgi and Székely (1989), Móri et al. (1993) and Sen (2012). The constant term $-2$ in (1.1) has been improved to $-186/125$ for unimodal distributions by Klaassen et al. (2000), where $-6/5$ can be used for symmetric unimodal distributions. Further, Móri et al. (1993) showed that the term $-2$ in (1.1) can be improved to 0 for infinitely divisible distributions, e.g., normal and Poisson.

Móri et al. (1993) gave a result similar to (1.1) for multivariate analogues of skewness and kurtosis. Their derivation, when applied to a single variable to give (1.1), seems to be the simplest one among the proofs mentioned above, using the Cauchy–Schwarz inequality. In this paper, the multivariate third and fourth cumulants for standardized variables with unit variances are focused on. Some relationships similar to Pearson’s inequality are obtained, where the results reduce to Pearson’s inequality for a single variable. A corresponding inequality for the summarized third cumulants for standardized variables defined by Mardia (1970) is derived. The extended inequalities are shown to be improved when a vector variable is infinitely divisible.

2. Extensions of Pearson’s inequality to multivariate cases

Let $\mathbf{X} = (X_1, \ldots, X_p)'$ be a random vector consisting of $p$ variables $X_i$ ($i = 1, \ldots, p$). Denote a multivariate central moment corresponding to possibly distinct variables $X_a, X_b, \ldots, X_e$ of arbitrary order, with the assumption of its existence, by

\[
\sigma_{ab \cdots e} \equiv E\{(X_a - \mu_a)(X_b - \mu_b) \cdots (X_e - \mu_e)\} \quad (a, b, \ldots, e = 1, \ldots, p) \tag{2.1}
\]

E-mail addresses: [email protected], [email protected]. http://dx.doi.org/10.1016/j.spl.2017.07.003 0167-7152/© 2017 Elsevier B.V. All rights reserved.

with $\mu_a \equiv E(X_a)$ ($a = 1, \ldots, p$) (similar existence assumptions will be employed throughout this article). Assume that the covariance matrix of $\mathbf{X}$, denoted by $\boldsymbol{\Sigma} = (\sigma_{ab})$ with $\sigma_a^2 = \sigma_{aa}$ ($a, b = 1, \ldots, p$), is non-singular. Define

\[
\rho_{ab \cdots e} \equiv E(Z_a Z_b \cdots Z_e) = \frac{\sigma_{ab \cdots e}}{\sigma_a \sigma_b \cdots \sigma_e} \tag{2.2}
\]

with $Z_a = (X_a - \mu_a)/\sigma_a$ ($a, b, \ldots, e = 1, \ldots, p$). When $\mathbf{X}$ is standardized as $\mathbf{Z} \equiv \boldsymbol{\Sigma}^{-1/2}\{\mathbf{X} - E(\mathbf{X})\}$ with $\mathbf{Z} = (Z_1, \ldots, Z_p)'$, where $\boldsymbol{\Sigma}^{-1/2}$ is a symmetric matrix square root of $\boldsymbol{\Sigma}^{-1}$, the same notation as in (2.2) is used for simplicity, with $\rho_{ab} = \delta_{ab}$ being the Kronecker delta ($a, b = 1, \ldots, p$).

Let $g(\mathbf{X})$ and $h(\mathbf{X})$ be functions of $\mathbf{X}$. Let $E(\cdot)^2 = E\{(\cdot)^2\}$ and $E^2(\cdot) = \{E(\cdot)\}^2$. Then, we have the following result.

Lemma 1. The inequality

\[
E^2[\{g(\mathbf{X}) - E(g(\mathbf{X}))\}\, h(\mathbf{X})] \le E\{g(\mathbf{X}) - E(g(\mathbf{X}))\}^2 \, E\{h(\mathbf{X})^2\} \tag{2.3}
\]

between the left- and right-hand sides is improved to

\[
E^2[\{g(\mathbf{X}) - E(g(\mathbf{X}))\}\, h(\mathbf{X})] \le E\{g(\mathbf{X}) - E(g(\mathbf{X}))\}^2 \, E\{h(\mathbf{X})^2\} - E\{g(\mathbf{X}) - E(g(\mathbf{X}))\}^2 \, E^2\{h(\mathbf{X})\}. \tag{2.4}
\]

Proof. Inequality (2.3) holds by the Cauchy–Schwarz inequality, while (2.4) is similarly given by noting $E[\{g(\mathbf{X}) - E(g(\mathbf{X}))\} h(\mathbf{X})] = E[\{g(\mathbf{X}) - E(g(\mathbf{X}))\}\{h(\mathbf{X}) - E(h(\mathbf{X}))\}]$. Inequality (2.4) is generally sharper than (2.3), since the right-hand side of (2.4) is smaller than that of (2.3) except in the degenerate cases with $g(\mathbf{X}) = E(g(\mathbf{X}))$ and/or $h(\mathbf{X}) = 0$. □

Let $Z$ be a generic variable denoting one of $Z_1, \ldots, Z_p$, which may possibly be correlated. In Lemma 1, consider $g(\mathbf{X}) = Z$ and $h(\mathbf{X}) = Z^2$. Then, (2.3) gives $\mathrm{sk}^2 = E^2(Z^3) \le E(Z^4) = \mathrm{kt} + 3$, whereas (2.4) gives $\mathrm{sk}^2 \le E(Z^4) - 1 = \mathrm{kt} + 2$. The latter is a simple proof of Pearson’s inequality. Note that Móri et al. (1993, Theorem 1) used (2.4) when $g(\mathbf{X}) = E(\mathbf{Z}'\mathbf{Z}\mathbf{Z}')\mathbf{Z}$ and $h(\mathbf{X}) = \mathbf{Z}'\mathbf{Z}$ under $\rho_{ab} = \delta_{ab}$ ($a, b = 1, \ldots, p$), yielding

\[
\sum_{a,b,c=1}^{p} \rho_{abb}\,\rho_{acc} \le 2p + \sum_{a,b=1}^{p} \kappa^{(Z)}_{aabb}, \tag{2.5}
\]

where $\kappa^{(Z)}_{abcd} = \rho_{abcd} - \rho_{ab}\rho_{cd} - \rho_{ac}\rho_{bd} - \rho_{ad}\rho_{bc}$. Note that the left-hand side of (2.5) is a multivariate analogue of squared skewness, which is different from Mardia’s (1970, Equation (2.19)) definition:

\[
\sum_{a,b,c=1}^{p} \rho^2_{abc} \quad \text{with } \rho_{ab} = \delta_{ab} \ (a, b = 1, \ldots, p). \tag{2.6}
\]

Note also that $\sum_{a,b=1}^{p} \kappa^{(Z)}_{aabb}$ with $\rho_{ab} = \delta_{ab}$ ($a, b = 1, \ldots, p$) in (2.5) is Mardia’s (1970, Equation (3.5)) multivariate analogue of kurtosis when the normalizing constant $p^2 + 2p$ for the normal distribution is subtracted. Using Lemma 1 with $g(\mathbf{Z}) = Z_a$ and $h(\mathbf{Z}) = Z_b Z_c \cdots Z_e$ and exchanging the variables symmetrically, we have the following result.
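As a numerical sanity check (not part of the original derivation), Pearson’s inequality (1.1) and the bound (2.5) can be verified on the empirical distribution of a standardized sample, for which both results hold exactly; the exponential-based construction, sample size, and dimension below are arbitrary choices of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100_000, 3

# --- Pearson's inequality (1.1): sk^2 - 2 <= kt ---
x = rng.exponential(size=n)
z = (x - x.mean()) / x.std()
sk = np.mean(z**3)                 # skewness E(Z^3)
kt = np.mean(z**4) - 3.0           # kurtosis E(Z^4) - 3
assert sk**2 - 2.0 <= kt

# --- Mori et al.'s bound (2.5) for Z = Sigma^{-1/2}{X - E(X)} ---
x = rng.normal(size=(n, p)) + 0.5 * rng.exponential(size=(n, 1))  # correlated, skewed
xc = x - x.mean(axis=0)
w, v = np.linalg.eigh(xc.T @ xc / n)
zm = xc @ (v * w**-0.5) @ v.T      # symmetric square root of Sigma^{-1}

rho3 = np.einsum('ia,ib,ic->abc', zm, zm, zm) / n
rho4 = np.einsum('ia,ib,ic,id->abcd', zm, zm, zm, zm) / n
eye = np.eye(p)
kappa4 = (rho4 - np.einsum('ab,cd->abcd', eye, eye)
               - np.einsum('ac,bd->abcd', eye, eye)
               - np.einsum('ad,bc->abcd', eye, eye))
lhs = np.einsum('abb,acc->', rho3, rho3)    # multivariate squared skewness
rhs = 2 * p + np.einsum('aabb->', kappa4)   # upper bound in (2.5)
assert lhs <= rhs
print(f"(1.1): sk^2 - 2 = {sk**2 - 2:.3f} <= kt = {kt:.3f}")
print(f"(2.5): {lhs:.3f} <= {rhs:.3f}")
```

The assertions are deterministic rather than approximate, because the standardized sample defines a valid distribution (with exact identity covariance) to which the inequalities apply.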

Theorem 1.

\[
\begin{aligned}
\rho^2_{abc \cdots e} &\le \rho_{bbcc \cdots ee} - \rho^2_{bc \cdots e},\\
\rho^2_{abc \cdots e} &\le \rho_{aacc \cdots ee} - \rho^2_{ac \cdots e},\\
&\;\;\vdots\\
\rho^2_{abc \cdots e} &\le \rho_{aabb \cdots dd} - \rho^2_{ab \cdots d}
\end{aligned}
\quad (a, b, c, \ldots, e = 1, \ldots, p). \tag{2.7}
\]

When $\rho_{abc \cdots e} = \rho_{abc}$, Theorem 1 gives the following result.

Corollary 1.

\[
\begin{aligned}
\rho^2_{abc} &\le \rho_{aabb} - \rho^2_{ab} = \kappa^{(Z)}_{aabb} + 1 + \rho^2_{ab},\\
\rho^2_{abc} &\le \rho_{aacc} - \rho^2_{ac} = \kappa^{(Z)}_{aacc} + 1 + \rho^2_{ac},\\
\rho^2_{abc} &\le \rho_{bbcc} - \rho^2_{bc} = \kappa^{(Z)}_{bbcc} + 1 + \rho^2_{bc}
\end{aligned} \tag{2.8}
\]

and equivalently

\[
\begin{aligned}
\rho^2_{abc} &\le 1 + \min\left(\kappa^{(Z)}_{aabb} + \rho^2_{ab},\ \kappa^{(Z)}_{aacc} + \rho^2_{ac},\ \kappa^{(Z)}_{bbcc} + \rho^2_{bc}\right)\\
&\le 1 + (1/3)\left(\kappa^{(Z)}_{aabb} + \kappa^{(Z)}_{aacc} + \kappa^{(Z)}_{bbcc} + \rho^2_{ab} + \rho^2_{ac} + \rho^2_{bc}\right) \quad (a, b, c = 1, \ldots, p).
\end{aligned} \tag{2.9}
\]
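Corollary 1 can likewise be checked mechanically. The sketch below (my own illustration, not from the paper; the gamma-based construction is arbitrary) verifies the first line of (2.8) and the identity $\rho_{aabb} - \rho^2_{ab} = \kappa^{(Z)}_{aabb} + 1 + \rho^2_{ab}$ on a correlated, skewed sample standardized coordinate-wise.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 100_000, 3
x = rng.gamma(2.0, size=(n, p)) + 0.6 * rng.gamma(1.0, size=(n, 1))  # correlated, skewed
z = (x - x.mean(axis=0)) / x.std(axis=0)   # Z_a = (X_a - mu_a)/sigma_a

rho2 = z.T @ z / n                          # rho_ab, non-diagonal in general
rho3 = np.einsum('ia,ib,ic->abc', z, z, z) / n
rho4 = np.einsum('ia,ib,ic,id->abcd', z, z, z, z) / n
kappa4 = (rho4 - np.einsum('ab,cd->abcd', rho2, rho2)
               - np.einsum('ac,bd->abcd', rho2, rho2)
               - np.einsum('ad,bc->abcd', rho2, rho2))

for a in range(p):
    for b in range(p):
        # identity behind (2.8): rho_aabb - rho_ab^2 = kappa_aabb + 1 + rho_ab^2
        assert np.isclose(rho4[a, a, b, b] - rho2[a, b]**2,
                          kappa4[a, a, b, b] + 1.0 + rho2[a, b]**2)
        for c in range(p):
            # first inequality of (2.8)
            assert rho3[a, b, c]**2 <= rho4[a, a, b, b] - rho2[a, b]**2 + 1e-9
```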


When $b = c$ with $a \ne b$, Corollary 1 yields

\[
\rho^2_{abb} \le 1 + \min\left(\kappa^{(Z)}_{aabb} + \rho^2_{ab},\ \kappa^{(Z)}_{bbbb} + 1\right) \le 1 + (1/2)\left(\kappa^{(Z)}_{aabb} + \kappa^{(Z)}_{bbbb} + 1 + \rho^2_{ab}\right) \quad (a, b = 1, \ldots, p;\ a \ne b). \tag{2.10}
\]

When $a = b = c$, Corollary 1 reduces to Pearson’s inequality. Note that the result of Corollary 1 is an extension of Pearson’s inequality to multivariate cases, where possibly nonzero $\rho_{ab}$ ($a, b = 1, \ldots, p$) are used as well as fourth multivariate cumulants for standardized variables with unit variances. An upper bound for Mardia’s multivariate analogue of squared skewness defined by (2.6) is obtained from (2.9) of Corollary 1 with $\rho_{ab} = \delta_{ab}$ ($a, b = 1, \ldots, p$).

Corollary 2. When $\rho_{ab} = \delta_{ab}$ ($a, b = 1, \ldots, p$),

\[
\sum_{a,b,c=1}^{p} \rho^2_{abc} \le p^3 + p^2 + p \sum_{a,b=1}^{p} \kappa^{(Z)}_{aabb}. \tag{2.11}
\]

The upper bound of (2.11) is larger than or equal to the upper bound $2p + \sum_{a,b=1}^{p} \kappa^{(Z)}_{aabb}$ of $\sum_{a,b,c=1}^{p} \rho_{abb}\,\rho_{acc}$ (see (2.5)). They are equal if and only if $p = 1$.
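Corollary 2 admits a direct check in terms of Mardia’s two indexes, since $\sum_{a,b=1}^{p} \kappa^{(Z)}_{aabb} = E\{(\mathbf{Z}'\mathbf{Z})^2\} - p^2 - 2p$ when $\rho_{ab} = \delta_{ab}$. The sketch below (my own, not in the paper; the lognormal construction is arbitrary) verifies (2.11) on a standardized sample.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 100_000, 4
c = np.linalg.cholesky(0.3 * np.ones((p, p)) + 0.7 * np.eye(p))
x = rng.lognormal(sigma=0.5, size=(n, p)) @ c.T   # skewed, correlated
xc = x - x.mean(axis=0)
w, v = np.linalg.eigh(xc.T @ xc / n)
z = xc @ (v * w**-0.5) @ v.T                       # Z = Sigma^{-1/2}{X - E(X)}

rho3 = np.einsum('ia,ib,ic->abc', z, z, z) / n
b1 = np.sum(rho3**2)                               # Mardia's squared skewness, (2.6)
b2 = np.mean(np.einsum('ij,ij->i', z, z)**2)       # Mardia's kurtosis E{(Z'Z)^2}
sum_kappa = b2 - p**2 - 2 * p                      # sum of kappa_aabb under rho_ab = delta_ab
assert b1 <= p**3 + p**2 + p * sum_kappa
print(f"(2.11): {b1:.3f} <= {p**3 + p**2 + p * sum_kappa:.3f}")
```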

3. Upper and lower bounds for kurtosis

Let $\kappa_{(q)}$ be the $q$th cumulant of $Z$. Then, we have the following result.

Theorem 2. For $\mathrm{kt} = \kappa_{(4)}$ with $\mathrm{sk} = \kappa_{(3)}$,

\[
(1/2)\left\{9 - \left(4\kappa_{(6)} + 36\,\mathrm{sk}^2 + 105\right)^{1/2}\right\} \le \mathrm{kt} \le (1/2)\left\{9 + \left(4\kappa_{(6)} + 36\,\mathrm{sk}^2 + 105\right)^{1/2}\right\}. \tag{3.1}
\]

Proof. When $\rho_{abc \cdots e}$ in Theorem 1 is $\rho_{aaaa}$, we have

\[
\begin{aligned}
0 &\ge \rho^2_{aaaa} - \left(\rho_{aaaaaa} - \rho^2_{aaa}\right)\\
&= \left(\kappa_{(4)} + 3\right)^2 - \left(\kappa_{(6)} + 15\kappa_{(4)} + 10\kappa^2_{(3)} + 15 - \kappa^2_{(3)}\right)\\
&= \kappa^2_{(4)} - 9\kappa_{(4)} - \kappa_{(6)} - 9\kappa^2_{(3)} - 6
\end{aligned} \tag{3.2}
\]

(see Stuart and Ord, 1994, Equation (3.38)), which gives (3.1). □
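Theorem 2 can also be checked numerically. In the sketch below (my own illustration; the gamma choice is arbitrary), the sixth cumulant of the standardized variable is obtained from the standard moment–cumulant relation $\kappa_{(6)} = \mu_6 - 15\mu_4 - 10\mu_3^2 + 30$ for a zero-mean, unit-variance variable, and the two bounds of (3.1) are verified on the empirical distribution.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.gamma(3.0, size=500_000)
z = (x - x.mean()) / x.std()
m3, m4, m6 = (np.mean(z**k) for k in (3, 4, 6))
sk, kt = m3, m4 - 3.0
k6 = m6 - 15.0 * m4 - 10.0 * m3**2 + 30.0    # sixth cumulant of standardized Z
half = 0.5 * np.sqrt(4.0 * k6 + 36.0 * sk**2 + 105.0)
# (3.1): (9 - sqrt(...))/2 <= kt <= (9 + sqrt(...))/2
assert 4.5 - half <= kt <= 4.5 + half
print(f"(3.1): {4.5 - half:.3f} <= kt = {kt:.3f} <= {4.5 + half:.3f}")
```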

Inequalities between the fourth multivariate cumulants for standardized variables with unit variances are similarly given in the following theorem.

Theorem 3.

\[
\left(\kappa^{(Z)}_{abcd} + \rho_{ac}\rho_{bd} + \rho_{ad}\rho_{bc}\right)^2 \le \left(\kappa^{(Z)}_{aabb} + 1 + \rho^2_{ab}\right)\left(\kappa^{(Z)}_{ccdd} + 1 + \rho^2_{cd}\right) \quad (a, b, c, d = 1, \ldots, p) \tag{3.3}
\]

and

\[
\left(\kappa^{(Z)}_{aacc} + 2\rho^2_{ac}\right)^2 \le \left(\kappa^{(Z)}_{aaaa} + 2\right)\left(\kappa^{(Z)}_{cccc} + 2\right) \quad (a, c = 1, \ldots, p;\ a \ne c). \tag{3.4}
\]

When $\rho_{ab} = \delta_{ab}$ ($a, b = 1, \ldots, p$),

\[
\left(\kappa^{(Z)}_{aacc}\right)^2 \le \left(\kappa^{(Z)}_{aaaa} + 2\right)\left(\kappa^{(Z)}_{cccc} + 2\right) \quad (a, c = 1, \ldots, p;\ a \ne c). \tag{3.5}
\]

Proof. In Lemma 1, when $g(\mathbf{X}) = Z_a Z_b$ and $h(\mathbf{X}) = Z_c Z_d$ ($a, b, c, d = 1, \ldots, p$), we have $(\rho_{abcd} - \rho_{ab}\rho_{cd})^2 \le (\rho_{aabb} - \rho^2_{ab})(\rho_{ccdd} - \rho^2_{cd})$, which yields (3.3). When $a = b \ne c = d$ in (3.3), we obtain (3.4) and (3.5). □

Note that $\kappa^{(Z)}_{aaaa}$ and $\kappa^{(Z)}_{cccc}$ in (3.4) and (3.5) are the kurtoses of $Z_a$ and $Z_c$, respectively.
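In moment form, (3.3) reads $(\rho_{abcd} - \rho_{ab}\rho_{cd})^2 \le (\rho_{aabb} - \rho^2_{ab})(\rho_{ccdd} - \rho^2_{cd})$, which the sketch below (my own; the exponential construction is arbitrary) verifies over all index quadruples of a correlated, coordinate-wise standardized sample.

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 100_000, 3
x = rng.exponential(size=(n, p)) + rng.exponential(size=(n, 1))  # correlated, skewed
z = (x - x.mean(axis=0)) / x.std(axis=0)

rho2 = z.T @ z / n
rho4 = np.einsum('ia,ib,ic,id->abcd', z, z, z, z) / n
for a in range(p):
    for b in range(p):
        for c in range(p):
            for d in range(p):
                lhs = (rho4[a, b, c, d] - rho2[a, b] * rho2[c, d])**2
                rhs = ((rho4[a, a, b, b] - rho2[a, b]**2)
                       * (rho4[c, c, d, d] - rho2[c, d]**2))
                assert lhs <= rhs + 1e-9     # moment form of (3.3)
```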


4. Infinitely divisible distributions

Assume that

\[
\mathbf{Z} = \sum_{i=1}^{J} \mathbf{Y}^{(i)}, \tag{4.1}
\]

where $\mathbf{Y}^{(i)}$ ($i = 1, \ldots, J$) are independent and identically distributed. Let $\mathbf{Y} = (Y_1, \ldots, Y_p)'$ be a generic random vector denoting one of $\mathbf{Y}^{(i)}$ ($i = 1, \ldots, J$). Define $\boldsymbol{\Sigma}_Y = (\sigma_{Yab}) = \mathrm{cov}(\mathbf{Y})$ with $\sigma^2_{Ya} = \sigma_{Yaa}$ ($a, b = 1, \ldots, p$). Then, from (4.1), we obtain $J\boldsymbol{\Sigma}_Y \equiv \mathbf{P} = (\rho_{ab}) = \mathrm{cov}(\mathbf{Z})$. Let $\mathbf{Z}_Y = (Z_{Y1}, \ldots, Z_{Yp})' \equiv (Y_1/\sigma_{Y1}, \ldots, Y_p/\sigma_{Yp})'$ and $\mathbf{P}_Y = (\rho_{Yab}) = \mathrm{cov}(\mathbf{Z}_Y)$.

Denote the fourth multivariate cumulants of $Z_{Ya}$, $Z_{Yb}$, $Z_{Yc}$ and $Z_{Yd}$ by $\kappa^{(Z_Y)}_{abcd}$ ($a, b, c, d = 1, \ldots, p$). Then, we have the following property.

Lemma 2. When (4.1) holds,

\[
J^{-(t-2)/2}\, \kappa^{(Z_Y)}_{a_1 \cdots a_t} = \kappa^{(Z)}_{a_1 \cdots a_t}, \tag{4.2}
\]

where $\kappa^{(Z_Y)}_{a_1 \cdots a_t}$ ($\kappa^{(Z)}_{a_1 \cdots a_t}$) is the $t$th multivariate cumulant of $Z_{Ya_1}, \ldots, Z_{Ya_t}$ ($Z_{a_1}, \ldots, Z_{a_t}$, respectively) ($a_1, \ldots, a_t = 1, \ldots, p$) (see e.g., Stuart and Ord, 1994, Section 3.29).

Proof. From (4.1), we have $J \kappa^{(Y)}_{a_1 \cdots a_t} = \kappa^{(Z)}_{a_1 \cdots a_t}$ and $J \kappa^{(Y)}_{a_1 a_2} = \kappa^{(Z)}_{a_1 a_2}$, where $\kappa^{(Y)}_{a_1 a_2} = \mathrm{cov}(Y_{a_1}, Y_{a_2})$ and $\kappa^{(Z)}_{a_1 a_2} = \rho_{a_1 a_2} = \mathrm{cov}(Z_{a_1}, Z_{a_2})$ ($a_1, \ldots, a_t = 1, \ldots, p$). Standardizing each of the $Y_{a_k}$ ($k = 1, \ldots, t$) by multiplying by $J^{1/2}$ and noting $J \kappa^{(Y)}_{a_1 \cdots a_t} = \kappa^{(Z)}_{a_1 \cdots a_t}$, with e.g., $J E(Y_{a_1} Y_{a_2} Y_{a_3}) = J^{-1/2} E(J^{1/2}Y_{a_1} \cdot J^{1/2}Y_{a_2} \cdot J^{1/2}Y_{a_3})$, give (4.2). □

Inequality (2.9) is improved when (4.1) holds.

Theorem 4. When a standardized random vector $\mathbf{Z}$ with unit variances and possibly non-zero correlations is infinitely divisible,

\[
\rho^2_{abc} \le (1/3)\left(\kappa^{(Z)}_{aabb} + \kappa^{(Z)}_{aacc} + \kappa^{(Z)}_{bbcc}\right) \quad (a, b, c = 1, \ldots, p) \tag{4.3}
\]

and

\[
\sum_{a,b,c=1}^{p} \rho^2_{abc} \le p \sum_{a,b=1}^{p} \kappa^{(Z)}_{aabb}. \tag{4.4}
\]

Proof. Using Corollary 1 for $Z_{Ya}$, $Z_{Yb}$ and $Z_{Yc}$ and Lemma 2, we obtain

\[
\begin{aligned}
\rho^2_{abc} &= \{J E(Y_a Y_b Y_c)\}^2 = J^{-1}\left\{E\left(J^{1/2}Y_a \cdot J^{1/2}Y_b \cdot J^{1/2}Y_c\right)\right\}^2 = J^{-1}\{E(Z_{Ya} Z_{Yb} Z_{Yc})\}^2\\
&\le J^{-1}\left[1 + (1/3)\left(\kappa^{(Z_Y)}_{aabb} + \kappa^{(Z_Y)}_{aacc} + \kappa^{(Z_Y)}_{bbcc} + \rho^2_{ab} + \rho^2_{ac} + \rho^2_{bc}\right)\right]\\
&= (1/3)\left(\kappa^{(Z)}_{aabb} + \kappa^{(Z)}_{aacc} + \kappa^{(Z)}_{bbcc}\right) + J^{-1}\left\{1 + (1/3)\left(\rho^2_{ab} + \rho^2_{ac} + \rho^2_{bc}\right)\right\} \quad (a, b, c = 1, \ldots, p).
\end{aligned} \tag{4.5}
\]

When $J \to +\infty$, we have (4.3) and consequently (4.4). □

Móri et al. (1993, Theorem 2) obtained a similar result when (4.1) holds with $\rho_{ab} = \delta_{ab}$ ($a, b = 1, \ldots, p$):

\[
\sum_{a,b,c=1}^{p} \rho_{abb}\,\rho_{acc} \le \sum_{a,b=1}^{p} \kappa^{(Z)}_{aabb}, \tag{4.6}
\]

which was derived in a similar way, though the proof of Theorem 4 is simplified by using Lemma 2. It is obvious that the equalities in (4.4) and (4.6) hold under multivariate normality. It is an open problem to prove that the inequalities of (4.4) and (4.6) are tight except under normality. Note that the upper bound for Mardia’s multivariate analogue of squared skewness in (4.4) is $p$ times that of (4.6). Note also that (4.4) holds for general $\rho_{ab}$ ($a, b = 1, \ldots, p$), while Mardia’s index is defined when $\rho_{ab} = \delta_{ab}$ ($a, b = 1, \ldots, p$).
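Two consequences of this section can be illustrated directly (my own sketch, not in the paper). For $p = 1$, (4.4) reduces to $\mathrm{sk}^2 \le \mathrm{kt}$; a Poisson variable attains it with equality, since its standardized $t$th cumulant is $\lambda^{-(t-2)/2}$. The scaling in Lemma 2 can also be seen by Monte Carlo: a sum of $J$ i.i.d. unit exponentials is a Gamma($J$) variable, whose skewness is $J^{-1/2}$ times that of one exponential.

```python
import numpy as np

# (4.4) with p = 1: sk^2 <= kt for infinitely divisible variables.
# Poisson(lam): standardized cumulants kappa_t = lam^{-(t-2)/2},
# so sk^2 = 1/lam = kt, i.e., equality.
for lam in (0.5, 1.0, 4.0):
    sk, kt = lam**-0.5, 1.0 / lam
    assert sk**2 <= kt + 1e-12

# Lemma 2 scaling (t = 3): the skewness of a sum of J i.i.d. exponentials
# (a Gamma(J) variable) is 2/sqrt(J), since one exponential has skewness 2.
rng = np.random.default_rng(5)
J = 25
g = rng.gamma(J, size=1_000_000)
zs = (g - g.mean()) / g.std()
assert abs(np.mean(zs**3) - 2.0 / np.sqrt(J)) < 0.05
```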


5. Discussion

In Móri et al. (1993), $\sum_{a,b,c=1}^{p} \rho_{abb}\,\rho_{acc}$ is focused on, while in this paper $\sum_{a,b,c=1}^{p} \rho^2_{abc}$ is dealt with. Both indexes are important. For instance, in the higher-order asymptotic bias of the Akaike information criterion (AIC; Akaike, 1973), both types of summarized squared skewness, as well as $\sum_{a,b=1}^{p} \kappa^{(Z)}_{aabb}$, appear in the exponential family of distributions under canonical parametrization (Ogasawara, 2016, Corollary 2). In the case of a single canonical parameter, the two indexes of squared skewness become identical, where the ratio of squared skewness to kurtosis plays an important role (Móri et al., 1993, p. 548; Ogasawara, 2013).


Acknowledgment


This work was partially supported by a Grant-in-Aid for Scientific Research from the Japanese Ministry of Education, Culture, Sports, Science and Technology (JSPS KAKENHI, Grant No. 17K00042).


References

Akaike, H., 1973. Information theory and an extension of the maximum likelihood principle. In: Petrov, B.N., Csáki, F. (Eds.), Proceedings of the 2nd International Symposium on Information Theory. Akadémiai Kiadó, Budapest, pp. 267–281.
Klaassen, C.A.J., Mokveld, P.J., van Es, B., 2000. Squared skewness minus kurtosis bounded by 186/125 for unimodal distributions. Statist. Probab. Lett. 50, 131–135.
Mardia, K.V., 1970. Measures of multivariate skewness and kurtosis with applications. Biometrika 57, 519–530.
Móri, T.F., Rohatgi, V.K., Székely, G.J., 1993. On multivariate skewness and kurtosis. Theory Probab. Appl. 38, 547–551.
Ogasawara, H., 2013. Asymptotic cumulants of the estimator of the canonical parameter in the exponential family. J. Statist. Plann. Inference 143, 2142–2150.
Ogasawara, H., 2016. Asymptotic cumulants of some information criteria. J. Japanese Soc. Comput. Statist. 29, 1–25.
Pearson, K., 1916. Mathematical contributions to the theory of evolution. XIX. Second supplement to a memoir on skew variation. Phil. Trans. R. Soc. A 216, 429–457.
Rohatgi, V.K., Székely, G.J., 1989. Sharp inequalities between skewness and kurtosis. Statist. Probab. Lett. 8, 297–299.
Sen, A., 2012. On the interrelation between the sample mean and the sample variance. Amer. Statist. 66, 112–117.
Stuart, A., Ord, J.K., 1994. Kendall's Advanced Theory of Statistics: Distribution Theory, Vol. 1, 6th ed. Arnold, London.
Wilkins, J.E., 1944. A note on skewness and kurtosis. Ann. Math. Statist. 15, 333–335.
