Journal of Multivariate Analysis 145 (2016) 224–235
Contents lists available at ScienceDirect
Journal of Multivariate Analysis journal homepage: www.elsevier.com/locate/jmva
Matrix-variate distribution theory under elliptical models-4: Joint distribution of latent roots of covariance matrix and the largest and smallest latent roots Francisco J. Caro-Lopera a , Graciela González Farías b,∗ , Narayanaswamy Balakrishnan c a
Universidad de Medellín, Department of Basic Sciences, Carrera 87 No. 30-65, of. 5-103, Medellín, Colombia
b
CIMAT A. C., Department of Probability and Statistics, Callejón de Jalisco s/n, Mineral de Valenciana, 36240 Guanajuato, Guanajuato, Mexico c
McMaster University, Department of Mathematics and Statistics, Hamilton, Ontario, Canada L8S 4K1
article
info
Article history: Received 12 February 2015 Available online 2 January 2016 AMS subject classifications: 62E15 60E05 12E10
abstract In this work, we derive the joint distribution of the latent roots of a sample covariance matrix under elliptical models. We then obtain the distributions of the largest and smallest latent roots. In the process of these derivations, we also correct some results present in the literature. © 2015 Elsevier Inc. All rights reserved.
Keywords: Likelihood ratio statistics Elliptical models Zonal polynomials Sample covariance matrix Latent roots
1. Introduction The zonal polynomials of James [12] have played a key role in the development of matrix-variate distributions under Gaussian models. Various developments have taken place in this direction during the last five decades, which have been authoritatively revised by Muirhead [18]. Recently, extensions of these results to elliptical models have been made. For example, Caro-Lopera et al. studied the generalized Wishart distribution and its application in the likelihood ratio test for the homogeneity of covariance matrices [2], discussed the sphericity test [3], and developed the likelihood ratio test for testing for a scale matrix and specified values for the location vector and the scale matrix [4], all under elliptical models. All these works have generalized the results of the Gaussian models presented in Chapter 8 of the book by Muihead [18]. In the present work, we generalized the results in Chapter 9 [18] concerning the joint distribution of the latent roots of a sample covariance matrix and the distribution of the largest and smallest latent roots to the case of elliptical models. In the process of these derivations, we also correct some erroneous results present in the literature. The expressions of the distributions derived here in terms of zonal polynomials can all be efficiently computed, with suitable modifications of the algorithms of Koev and Edelman [15] as demonstrated by Caro-Lopera et al. [3].
∗
Corresponding author. E-mail addresses:
[email protected] (F.J. Caro-Lopera),
[email protected] (G. González Farías),
[email protected] (N. Balakrishnan).
http://dx.doi.org/10.1016/j.jmva.2015.12.012 0047-259X/© 2015 Elsevier Inc. All rights reserved.
F.J. Caro-Lopera et al. / Journal of Multivariate Analysis 145 (2016) 224–235
225
The rest of this paper proceeds as follows. In Section 2, we derive the joint distribution of the latent roots of a sample covariance matrix under elliptical models. Then in Section 3, we derive the distributions of the largest and smallest latent roots from a sample covariance matrix under an elliptical model. In the process of these derivations, we also correct some erroneous results concerning zonal polynomials present in the literature. 2. Joint distribution of latent roots of a sample covariance matrix In this section, we provide the basic tools for a generalization of the classical work on distributions of the latent roots of a sample covariance matrix under the case of elliptical models. First, recall that an n × m matrix X is said to have a matrix-variate elliptically contoured distribution, denoted by En×m (µ, 2 ⊗ 6, h), if its density function is given by gX (X) =
1
|6|n/2 |2|m/2
h tr 2−1 (X − µ)6−1 (X − µ)⊤ ,
where : m × m, 2 : n × n, 6 > 0, 2 > 0, and the function h : ℜ → [0, ∞), termed the generator function, is such that ∞ nm6 −1 u h(u2 )du < ∞. 0 Now, let us denote the Elliptical Wishart E W m (n, 6, h) distribution as the distribution of the m × m random matrix A = Z⊤ Z, where the n × m matrix Z is En×m (0, In ⊗ 6, h) and n ≥ m; then, from [2], we have the density function of A as
π mn/2 |A|(n−m−1)/2 h tr 6−1 A , 0m (n/2) |6|n/2 m where 0m (a) = π m(m−1)/4 i=1 0 (a − (i − 1)/2) and Re(a) > (m − 1)/2.
(1)
From [2], we have the following generalization of Theorem 3.2.18 of [18]. Lemma 1. If A is E W m (n, 6, h) with n > m − 1, the joint density function of the latent roots l1 , . . . , lm (l1 > · · · > lm > 0) of A is ∞ m m h(k) (0) Cκ (6−1 )Cκ (L) π m(n+m)/2 (n−m−1)/2 (li − lj ) . li n / 2 0m (n/2) |6| 0m (m/2) i=1 k! Cκ (Im ) κ i
(2)
Here and throughout this work, h(k) (a) denotes the kth derivative of the elliptical generator function h(u) with respect to u and dk h(u)
evaluated at u = a, ( duk |u=a ), κ denotes the summation over all partitions κ = (k1 , . . . , km ), k1 ≥ · · · ≥ km ≥ 0 of k, and Cκ (6) is the zonal polynomial indexed by κ in the latent roots of the matrix argument.
Now, the sample covariance matrix S is E W m (n, 1n 6, h). Then, the joint density function of l1 , . . . , lm , the latent roots of S, is given as follows: Corollary 2. Let nS have the E W m (n, 6, h) distribution, with n > m − 1. Then, the joint density function of the latent roots l1 , . . . , lm (l1 > · · · > lm > 0) of S is given by m m ∞ π m(n+m)/2 nmn/2 h(k) (0) Cκ (6−1 )Cκ (nL) (n−m−1)/2 l ( l − l ) . i j 0m (n/2) |6|n/2 0m (m/2) i=1 i k! C κ ( Im ) κ i
(3)
The Gaussian version due to James [11] (see also [18, Theorem 9.4.1]) follows as a particular case by considering h(y) = e−y/2 /(2π )mn/2 , and h(k) (0) = (−1/2)k /(2π )mn/2 , in which case we deduce the following result. Corollary 3. Let nS have the Wm (n, 6) distribution, with n > m − 1. Then, the joint density function of the latent roots l1 , . . . , lm (l1 > · · · > lm > 0) of S is given by
(n/2)mn/2
2 m m ∞ π m /2 |6|−n/2 1 Cκ (6−1 )Cκ (−nL/2) (n−m−1)/2 . li (li − lj ) 0m (n/2) 0m (m/2) i=1 k! κ C κ ( Im ) i
When 6 = λIm , the joint density function of the latent roots l1 , . . . , lm of the sample covariance matrix S becomes m m ∞ h(k) (0) π m(n+m)/2 nmn/2 (n−m−1)/2 l ( l − l ) × Cκ (nλ−1 L), i j i mn / 2 0m (n/2) λ 0m (m/2) i=1 k ! κ i
from which we readily obtain the following result.
(4)
226
F.J. Caro-Lopera et al. / Journal of Multivariate Analysis 145 (2016) 224–235
Corollary 4. When 6 = λIm , the joint density function of the latent roots l1 > · · · > lm > 0 of the sample covariance matrix S is given by
(n/λ)mn/2
m m π m(n+m)/2 (n−m−1)/2 h tr nλ−1 L li (li − lj ). 0m (n/2) 0m (m/2) i=1 i
(5)
Once again, in the Gaussian case with h(y) = e−y/2 /(2π )mn/2 , we deduce the following result. Corollary 5. When 6 = λIm , the joint density function of the latent roots l1 > · · · > lm > 0 of the sample covariance matrix S is given by mn/2
(n/(2λ))
2 m m m π m /2 n (n−m−1)/2 exp − li li (li − lj ). 0m (n/2) 0m (m/2) 2λ i=1 i=1 i
(6)
3. Distributions of the largest and smallest latent roots of the sample covariance matrix This section discusses the distributions of the largest and smallest latent roots of a sample covariance matrix under an elliptical model. The Gaussian case can be found for example in [18,10]. We first start with the largest latent root and its distribution. Theorem 6. If A is E W m (n, 6, h), n > m − 1, and > 0 is an m × m matrix, then the probability that − A is positive definite is Pr(A < ) =
∞ h(k) (0) π mn/2 0m ((m + 1)/2) −1 n/2 (n/2)κ C κ 6 −1 . |6 | 0m ((n + m + 1)/2) k! κ ((n + m + 1)/2)κ k=0
Proof. Recall that if A is E W m (n, 6, h) with n ≥ m − 1, then the density function of A is
π mn/2 |A|(n−m−1)/2 h tr 6−1 A , 0m (n/2) |6|n/2 and so we want to compute the integral Pr(A < ) =
π mn/2 0m (n/2) |6|n/2
0
|A|(n−m−1)/2 h tr 6−1 A (dA).
As usual, the integral is reduced over 0 < X < I upon substituting A = 1/2 X1/2 , which implies (dA) = ||(m+1)/2 (dX). Then, Pr(A < ) =
π mn/2 ||n/2 0m (n/2) |6|n/2
0
|X|(n−m−1)/2 h tr 6−1 1/2 X1/2 (dX).
Upon expanding in terms of zonal polynomials, we get Pr(A < ) =
∞ π mn/2 ||n/2 h(k) (0) |X|(n−m−1)/2 Cκ 6−1 1/2 X1/2 (dX), n / 2 0m (n/2) |6| k! 0
and then the required result follows by Theorem 7.2.10 of [18].
The classical Gaussian case of Muirhead [18, Theorem 9.7.1] is obtained by considering h(y) = e−y/2 /(2π )mn/2 and h(k) (0) = (−1/2)k /(2π )mn/2 , which is presented in the following corollary. Corollary 7. If A is Wm (n, 6), n > m − 1, and > 0 is an m × m matrix, then the probability that − A is positive definite is
n/2 0m ((m + 1)/2) −1 6 /2 1 F1 n/2; (n + m + 1)/2; −6−1 /2 , 0m ((n + m + 1)/2) ∞ 1 (a)κ where 1 F1 (a; b; X) = k=0 k! κ (b)κ Cκ (X); see [18]. Pr(A < ) =
Now, we can apply Theorem 6 for the elliptical Wishart matrix S. In this case, we are interested in the distribution of the largest latent root l1 of S ∼ E W m (n, 6/n, h); in other words, we want to find Pr(l1 < x), which is equivalent to finding Pr(S < xI); but recalling that A = nS ∼ E W m (n, 6), we have Pr(l1 < x) = Pr(S < xI) = Pr(A < nxI), thus resulting in the following corollary.
F.J. Caro-Lopera et al. / Journal of Multivariate Analysis 145 (2016) 224–235
227
Corollary 8. The distribution of the largest latent root l1 of the covariance matrix S ∼ E W m (n, 6/n, h) is given by ∞ π mn/2 0m ((m + 1)/2) h(k) (0) (n/2)κ |nx6−1 |n/2 Cκ nx6−1 . n + m + 1 )/ 2 0m ((n + m + 1)/2) k ! (( ) κ κ k=0
Pr(l1 < x) =
The corresponding Gaussian version of Muirhead [18, Corollary 9.7.2] is deduced as follows: Corollary 9. The distribution of the largest latent root l1 of the sample covariance matrix S ∼ Wm (n, 6/n) is given by
n/2 0m ((m + 1)/2) nx6−1 /2 1 F1 n/2; (n + m + 1)/2; −nx6−1 /2 . 0m ((n + m + 1)/2)
Pr(A < nxI) =
Since the distribution of the smallest root lm of the covariance matrix S is also of interest, the computation of Pr(x > lm ) = Pr(xI > S) = Pr(nxI > A) is required, but as noted by Muirhead [18, p. 421], it cannot be obtained from a given probability Pr(A < ), since for m ≥ 2, Pr(A > ) ̸= 1 − Pr(A < ). So, we need to perform the integration over A > by using properties of zonal polynomials. The key for this is to change the set of integration from A > to 0 < Y, following the idea of transformation from a beta type I integral into a beta type II integral. Specifically, when A ∼ E W m (n, 6, h), n > m − 1, we want to compute the integral
π mn/2 Pr(A > ) = 0m (n/2) |6|n/2
A>
|A|(n−m−1)/2 h tr 6−1 A (dA).
Let us perform the transformation of A > into Y > 0 through A = 1/2 (I + Y)1/2 yielding (dA) = ||(m+1)/2 (dY). Then, we obtain
π mn/2 Pr(A > ) = |A|(n−m−1)/2 h tr 6−1 A (dA) 0m (n/2) |6|n/2 A> π mn/2 n/2 = | | |Y|(n−m−1)/2 |I + Y−1 |(n−m−1)/2 h tr 1/2 6−1 1/2 Y + tr 6−1 (dY). 0m (n/2) |6|n/2 0
h tr
6 −1
1/2
∞ t h(t ) tr 1/2 6−1 1/2 Y Y + tr 6 = tr 6−1 . t! t =0 −1
Upon using an idea of Khatri [14], with r = (n − m − 1)/2 being a positive integer, we may expand |I + Y−1 |(n−m−1)/2 in terms of zonal polynomials as
|I + Y−1 |r =
mr (−1)k (−r )κ ∗ Cκ ∗ Y−1 , k! k=0 κ∗
where the summation runs over partitions κ ∗ = (k1 , . . . , km ) of k such that k1 ≤ r, and given that r is a positive integer the series of the determinant ends; see, for example, [18, p. 259]. Then, by combining these expansions, we obtain Pr(A > ) =
t mr ∞ tr 6−1 π mn/2 |6−1 |n/2 (−1)k (−r )κ ∗ 0m (n/2) k! t =0 t! k=0 κ∗ × h(t ) tr 1/2 6−1 1/2 Y |Y|(n−m−1)/2 Cκ ∗ (Y−1 )(dY).
(7)
0
The above integral has been given by Li [16, Theorem 2.2], but a revision of that result is necessary. First, we need to discuss a preliminary result. 3.1. About Lemma 7.2.12 of [18] Let us start with a result of Caro-Lopera et al. [1]. Theorem 10. Let Z be a complex symmetric m × m matrix with Re(Z) > 0 and let Y be a symmetric m × m matrix. Then,
X>0
h(tr XZ)|X|a−(m+1)/2 Cκ (XY)(dX) =
|Z|−a (a)κ 0m (a)Cκ (YZ−1 ) S, 0 (ma + k)
(8)
228
F.J. Caro-Lopera et al. / Journal of Multivariate Analysis 145 (2016) 224–235
where ∞
h(w)w ma+k−1 dw < ∞,
S=
(9)
0
(a)κ =
m
(a − (i − 1)/2)ki ,
κ = (k1 , . . . , km ), k1 ≥ · · · ≥ km > 0,
m
ki = k,
i =1
i=1
(a)k = a(a + 1) · · · (a + k − 1),
(a)0 = 1,
and
0m (a) = π m(m−1)/4
m
0 (a − (i − 1)/2) ,
Re(a) > (m − 1)/2.
i=1
This result was based on the following fact. Lemma 11. If Y = diag(y1 , . . . , ym ) and X = (xij ) is an m × m positive definite matrix, then k
Cκ (XY) = dκ y11 · · · ykmm |X1 |k1 −k2 |X2 |k2 −k3 · · · |X|km + terms of lower weight in the y’s,
(10)
where Xp = (xij ), i, j = 1, . . . , p, κ = (k1 , . . . , km ) and dκ is the coefficient of the term of highest weight in Cκ (·); see, for example, [18, Lemma 7.2.6]. Note that the expression ‘‘terms of lower weight in the z’s’’, commonly used in Definition (1), p. 228 in [18], and the original references by Constantine, Khatri, is not appropriate here since it refers to partition values less than κ (in Littlewood’s notations); it does not include symmetric terms of κ . Thus, the required result in (7), which is a particular case of the more general integral g (Y, Z) =
X >0
h(tr XZ)|X|a−(m+1)/2 Cκ (YX−1 )(dX),
(11)
should be based on the expansion presented in the following lemma. Lemma 12 (Muirhead [18, Lemma 7.2.12]). If Z = diag(z1 , . . . , zm ) and Y = (yij ) is an m × m positive definite matrix, then km Cκ (Y−1 Z) = dκ z11 · · · zm |Y1 |−(k1 −k2 ) |Y2 |−(k2 −k3 ) · · · |Y|−km + terms of lower weight in the z’s, k
(12)
where Yp = (yij ), i, j = 1, . . . , p, κ = (k1 , . . . , km ) and dκ is the coefficient of the term of highest weight in Cκ (·). However, it seems that Lemma 12 is incorrectly specified and to demonstrate this we provide a simple counterexample. If
0 z2
y2 y3
z Z= 1 0
and y Y= 1 y2
is positive definite, then
y3 z1 − 2 − y y 1 3 + y2 Y −1 Z = y2 z1 −y1 y3 + y2 2
y2 z2
2
−y1 y3 + y2 y1 z2 − 2 −y1 y3 + y2
with the latent roots as y1 z2 + y3 z1 ±
y1 2 z2 2 − 2 y1 z2 y3 z1 + y3 2 z1 2 + 4 y2 2 z1 z2 2(y1 y3 − y2 2 )
.
Recall that C2 (W) = M2 (W) +
2 3
M12 (W) = w12 + w22 +
2 3
w1 w2 ,
where w1 and w2 are the latent roots of the 2 × 2 positive definite matrix W.
F.J. Caro-Lopera et al. / Journal of Multivariate Analysis 145 (2016) 224–235
229
Thus, C2 (Y−1 Z) =
3 y1 2 z2 2 + 2 y1 z2 y3 z1 + 3 y3 2 z1 2 + 4 y2 2 z1 z2
3 y1 y3 − y2 2
2
= (z22 |Y1 |2 |Y|−2 + symmetric terms (independent of z22 )) + terms of lower weight in the z’s,
(13)
which contradicts Muirhead’s version of the result C2 (Y−1 Z) = (z12 |Y1 |−2 |Y|−(0) + symmetric terms(independent of z12 )) + terms of lower weight in the z’s.1 One can also check that (13) holds in structure for quaternion matrices (see [17]). Perhaps, it is a lexicographical error, because if we follow the directions of the proof given by Muirhead [18, p. 256], faithfully, i.e. by translating the proof of Muirhead [18, Lemma 7.2.6], we do obtain the following corrected version: Lemma 13 (Corrected Version of Muirhead [18, Lemma 7.2.12]). If Z = diag(z1 , . . . , zm ) and Y = (yij ) is an m × m positive definite matrix then k1 Cκ (ZY−1 ) = (dκ z1m · · · zm |Y1 |km−1 −km |Y2 |km−2 −km−1 · · · |Y|−k1 k
+ symmetric terms) + terms of lower weight in the z’s,
(14)
where Xp = (xij ), i, j = 1, . . . , p, κ = (k1 , . . . , km ) and dκ is the coefficient of the term of highest weight in Cκ (·). The above result coincides in form with the following lemma of Li and Xue [17] in the context of quaternion matrices. Lemma 14. If Z = diag(z1 , . . . , zm ) is a real diagonal matrix and Y = (yij )m×m is a m × m positive definite quaternion matrix, then Cκ ( Y
−1
Z) =
k dκ z 1 m
···
k1 km−1 −km zm y11
y11 y21
km−2 −km−1 y12 · · · |Y|−k1 + terms of lower weight , y22
(15)
where κ = (k1 , . . . , km ) and dκ is the coefficient of the term of highest weight in Cκ (·). The difference appears in Y = (yij ), which is an m × m positive definite quaternion matrix, instead of a positive definite matrix. In fact, Li and Xue [17] followed the proof given by Muirhead [18] closely, but they did not notice that their result differs in structure from Lemma 12. 3.2. About Theorem 2.2 of [16] concerning the main integral Now, we can evaluate the integral in (11) and then as a consequence the one in (7), which gives the distribution of the smallest latent root of the sample covariance matrix S. As mentioned earlier, the result follows by translating the proof for Theorem 10 [1] via Lemma 13. However, we must note that this integral was already attempted by Li [16, Theorem 2.2] based on the incorrectly stated Lemma 7.2.12 of [18] and so the corollaries and applications provided in his work need to be corrected as well. The mentioned result of Li [16] is as follows. Theorem 15 (Li [16, Theorem 2.2]). Let Z be a complex symmetric p × p matrix with Re(Z) > 0 and U be a symmetric p × p ∞ matrix. Suppose σ = 0 f (z )z pa−k−1 dz < ∞, where k is a given positive integer. Then,
a−(p+1)/2
|W| W>0
π p(p−1) f (tr ZW)Cκ (W
−1
U)(dW) =
p
0 (a − ki − (i − 1)/2)
i=1
0 (ma − k)
· σ · |Z|−a Cκ (UZ),
(16)
where a > k1 + (p − 1)/2.2
1 As pointed out by a referee, a version of Muirhead’s Lemma and corrections of their consequences previously published by different authors, were derived by Díaz-García and Gutiérrez-Jáimez [8] and Díaz-García and Gutiérrez-Jáimez [7]; both works are based on [6, Lemma 3.1, p.23]. After revising that Lemma, it is easy to check an error in the last part of that published proof, i.e., application of property (i) on page 24 does not hold for Cκ (Y−1 Z). Again, our counterexample (13) demonstrates the particular error when we compare it with the following published version: C2 (Y−1 Z) = (z12 |Y1 |2 |Y|−2 + symmetric terms (independent of z12 )) + terms of lower weight in the z’s. 2 Now, it is clear that all the results in [8,7] based on the erroneous version of Díaz-García and Gutiérrez-Jáimez [6, Lemma 3.1, p.23] need to be corrected. However, there is an unusual aspect here in that our expressions and those derived in [8] are apparently the same, but it can be checked that it is just a coincidence because of the symmetric terms weighted by partition κ in the expansions.
230
F.J. Caro-Lopera et al. / Journal of Multivariate Analysis 145 (2016) 224–235
So for correctness, we will include all the proofs, using Lemma 13 and then present a complete proof which is a direct translation of the proof of Theorem 10 given by Caro-Lopera et al. [1]. Theorem 16 (Corrected Version of Li [16, Theorem 2.2]). Let Z be a complex symmetric m × m matrix with Re(Z) > 0 and Y be a symmetric m × m matrix. Then,
X>0
h(tr XZ)|X|a−(m+1)/2 Cκ (X−1 Y)(dX) =
|Z|−a 0m (a, −κ)Cκ (YZ) S, 0 (ma − k)
(17)
where ∞
h(w)w ma−k−1 dw < ∞
S= 0
and
0m (a, −κ) =
(−1)k 0m (a) , (−a + (m + 1)/2)κ
Re(a) > (m − 1)/2 + k1 .
Proof. Given g (Y, Z) =
X>0
h(tr XZ)|X|a−(m+1)/2 Cκ (YX−1 )(dX),
we have g (Y, Im ) =
h(tr X)|X|a−(m+1)/2 Cκ (X−1 Y)(dX).
X >0
(18)
For any H ∈ O(m), we have g (HYH⊤ , Im ) =
h(tr X)|X|a−(m+1)/2 Cκ (X−1 HYH⊤ )(dX).
X>0
(19)
Let U = H⊤ XH, which implies (dU) = (dX); then, g (HYH⊤ , Im ) =
U>0
h(tr(U))|U|a−(m+1)/2 Cκ (U−1 Y)(dU)
= g (Y, Im ).
(20)
Thus, g (Y, Im ) is a symmetric function of Y. From (19) and (20), upon integrating with respect to the normalized invariant measure (dH) on O(m), we obtain g (Y, Im ) =
O(m)
= O(m)
g (Y, Im )(dH) g (HYH⊤ , Im )(dH)
= O(m)
= X>0
= X>0
= =
X>0
h(tr X)|X|a−(m+1)/2 Cκ (X−1 HYH⊤ )(dH)(dX)
h(tr X)|X|a−(m+1)/2 h(tr X)|X|a−(m+1)/2
Cκ (Y)
Cκ (Im )
X>0
g (Im , Im ) C κ ( Im )
O(m)
Cκ (X−1 HYH⊤ )(dH)(dX)
Cκ (X−1 )Cκ (Y) Cκ (Im )
(dX),
h(tr X)|X|a−(m+1)/2 Cκ (X−1 )(dX)
Cκ (Y),
where the result of James [12] was used for the integration over O(m). Now, for reducing the multiple integral in (21), we consider the following facts.
(21)
F.J. Caro-Lopera et al. / Journal of Multivariate Analysis 145 (2016) 224–235
231
By the definition of zonal polynomials, they are linear combinations of monomial symmetric functions in the latent roots of Y, and so they can be expressed as Cκ (Y) = dκ Mκ (Y) +
µ<κ
dµ Mµ (Y)
k
= dκ (y11 · · · ykmm + · · · + yk1m · · · ykm1 ) +
µ<κ
dµ Mµ (Y)
= dκ (yk1m · · · ykm1 + symmetric terms) + terms of lower weight in the y’s. Thus, g (Y, Im ) =
=
g (Im , Im ) Cκ (Im ) g ( Im , Im ) Cκ (Im )
Cκ (Y)
(dκ yk1m · · · ykm1 + symmetric terms) + terms of lower weight in the y’s,
(22)
and so by Lemma 13 we have g (Y, Im ) =
X>0
=
h(tr X)|X|a−(m+1)/2 Cκ (X−1 Y)(dX)
k dκ y1m
···
ykm1
a−(m+1)/2
X >0
h(tr X)|X|
|X1 |
km−1 −km
km−2 −km−1
|X2 |
· · · |X|
−k1
(dX) + symmetric terms
+ terms of lower weight in the y’s.
(23) ⊤
To evaluate the last integral, let us factorize X as X = T T, where T is upper triangular with positive diagonal elements, and
(dX) = 2m
m
tiim+1−i (dT).
i=1
Then, tr X =
m
tij2 ,
2 2 2 |X1 | = t11 , |X2 | = t11 t22 , . . . , |X| =
i≤j
m
tii2 .
i =1
Now by using the results of Fang and Zhang [9] and then performing some simplification, we obtain
g (Y, Im ) =
k dκ y1m
···
ykm1
···
h tij
m i ≤j
tij2
1
2(a−ki −(m−i)/2)−1 m tii 2
(dT) + symmetric terms
i=m
+ terms of lower weight in the y’s m m 0 (a − ki − (m − i)/2) 0 12 i =1 i
∞
m
h(w)w i=1
×
(a−ki −(m−i)/2)+m(m−1)/4−1
dw + symmetric terms
0
+ terms of lower weight in the y’s m π m(m−1)/4 0 (a − ki − (m − i)/2) ∞ km i=1 k1 = h(w)w ma−k−1 dw + symmetric terms d κ y 1 · · · y m 0 (ma − k) 0 + terms of lower weight in the y’s 0m (a, −κ) = dκ yk1m · · · ykm1 S + symmetric terms + terms of lower weight in the y’s, 0 (ma − k)
(24)
where the last equation follows from [13, eq. (4), p. 469] and we have denoted
S= 0
∞
h(w)w ma−k−1 dw < ∞.
(25)
232
F.J. Caro-Lopera et al. / Journal of Multivariate Analysis 145 (2016) 224–235
As suggested by the referees, we can simplify Khatri’s expression as
0m (a, −κ) = π m(m−1)/4
m
0 (a − ki − (m − i)/2)
i=1
=
(−1)k 0m (a) , (−a + (m + 1)/2)κ
Re(a) > k1 + (m − 1)/2,
a fact which can be derived from 0 (b − ki ) = (−1)ki 0 (b)/(−b + 1)ki , i = 1, . . . , m. k
k
Equating coefficients of y1m · · · ym1 in (24) and (22), we get g (Im , Im )
=
Cκ (Im )
0m (a, −κ) S, 0 (ma − k)
and using this in (21), we obtain (17) for Z = Im :
X>0
h(tr X)|X|a−(m+1)/2 Cκ (X−1 Y)(dX) =
0m (a, −κ) Cκ (Y)S , 0 (ma − k)
(26)
where S is as defined in (25). Now, consider the integral in (17) when Z > 0 is real. By setting V = Z1/2 XZ1/2 , so that (dV) = |Z|(m+1)/2 (dX), (17) becomes g (Y, Z) = |Z|−a
=
V >0
h(tr V)|V|a−(m+1)/2 Cκ (V−1 Z1/2 YZ1/2 )(dV)
|Z|−a 0m (a, −κ) Cκ (YZ)S , 0 (ma − k)
where the last equation follows from (26). Thus, the theorem is true for real Z > 0 and it follows for complex Z with Re(Z) > 0 by analytic continuation. One can now compare the similarity between Theorems 10 and 16. In both cases, we expect that the associated integrals with Gaussian kernels are obtained in a straight forward manner. For example, from Theorem 10 (see [1]), we obtain the following: (1) Generalization of [13, Lemma 7, eq. (19)] (which was originally established by expanding a Laplace transform). Corollary 17.
X >0
etr(−XZ)(tr(XZ))j |X|a−(m+1)/2 Cκ (XY)(dX) =
0m (a)(a)κ 0 (ma + j + k) −a |Z| Cκ (YZ−1 ), 0 (ma + k)
(27)
where Y is a symmetric m × m matrix and Z is a complex symmetric m × m matrix with Re(Z) > 0. Proof. Taking h(y) = e−y yj in (9), we have S = we have Lemma 7, eq. (19) of [13].
∞ 0
h(y)yma+k−1 dy = 0 (ma + j + k), and so (27) follows. When Z = Im ,
(2) The result of Constantine [5, eq.(1)], widely used by Muirhead [18], is obtained in a similar manner. Corollary 18.
X >0
etr(−XZ)|X|a−(m+1)/2 Cκ (XY)(dX) = (a)κ 0m (a)|Z|−a Cκ (YZ−1 ),
(28)
where Y is a symmetric m × m matrix and Z is a complex symmetric m × m matrix with Re(Z) > 0. Proof. We simply set j = 0 in Corollary 17.
(3) Generalized multivariate elliptical function: Taking k = 0, we obtain the following result. Corollary 19.
a−(m+1)/2
X >0
h(tr XZ)|X|
0m (a) −a (dX) = |Z| 0 (ma)
∞
h(w)w ma−1 dw,
(29)
0
where Z is a complex symmetric m × m matrix with Re(Z) > 0. (4) If we take Z = 6−1 and h(w) = e−w/2 , we have the classical result for a multivariate gamma function proved by Muirhead [18, p. 61–63], as follows.
F.J. Caro-Lopera et al. / Journal of Multivariate Analysis 145 (2016) 224–235
233
Corollary 20.
1 etr − 6−1 X |X|a−(m+1)/2 (dX) = 2ma 0m (a)|6|a 2 X>0
(30)
where 6 is a complex symmetric m × m matrix with Re(6) > 0. When the arguments of the zonal polynomials change from YZ to YZ−1 , some of the parallel results of the above corollaries can be found in [13,18]. However, one can check that it is not possible to obtain them by using Theorem 15. Next, we demonstrate some consequences of Theorem 16. (1) Generalization of [13, Lemma 7, eq. (20)] (which was originally established by expanding a Laplace transform). Corollary 21.
X>0
etr(−XZ)(tr(XZ))j |X|a−(m+1)/2 Cκ (X−1 Y)(dX) =
|Z|−a 0m (a, −κ)Cκ (YZ) 0 (ma + j − k), 0 (ma − k)
(31)
where Y is a symmetric m × m matrix and Z is a complex symmetric m × m matrix with Re(Z) > 0. Proof. Taking h(y) = e−y yj in (25), we have S = we have Lemma 7, eq. (20) of [13].
∞ 0
h(y)yma−k−1 dy = 0 (ma + j − k), and so (31) follows. When Z = Im ,
(2) Generalization of Theorem 7.2.13 of [18]. Corollary 22.
X>0
etr(−XZ)|X|a−(m+1)/2 Cκ (X−1 Y)(dX) = 0m (a, −κ)|Z|−a Cκ (YZ−1 ),
(32)
where Y is a symmetric m × m matrix and Z is a complex symmetric m × m matrix with Re(Z) > 0. Proof. We simply set j = 0 in Corollary 21.
When Y = Im , we have Theorem 7.2.13 of [18]; it was fully derived in pages 256–258 of this work, but it was based on the erroneous Lemma 12, and yet the final result turns out to be correct due to the symmetry of the involved monomials and determinants. (3) Generalized multivariate elliptical function and multivariate gamma function: Taking k = 0, we get Corollaries 19 and 20, as expected. 3.3. Distribution of the smallest latent root of the sample covariance matrix Finally, we can now deal with the task of finding the distribution of the smallest latent root lm of the sample covariance matrix. We are now ready to compute (7) as Pr(A > ) =
mr ∞ t (−1)k 1 π mn/2 |6−1 |n/2 tr 6−1 (−r )κ ∗ 0m (n/2) k! t =0 t ! k=0 κ∗ × h(t ) tr 1/2 6−1 1/2 Y |Y|(n−m−1)/2 Cκ ∗ (Y−1 )(dY), 0
with a = n/2. Applying Theorem 16, we obtain the following result. Theorem 23. Let A be E W m (n, 6, h), n > m − 1, and > 0. If r = (n − m − 1)/2 is a positive integer, then Pr(A > ) = π mn/2
mr ∞ Cκ ∗ (6−1 ) k!0 (ma − k) t =0 k=0 κ ∗
∞ 0
t h(t ) (w)w ma−k−1 dw tr 6−1 , t!
where a = n/2 and the summation runs over partitions κ ∗ = (k1 , . . . , km ) of k such that k1 ≤ r. For a specific elliptical model, the above expressions simplify considerably as shown below. For example, for the Gaussian kernel, we obtain the following results.
234
F.J. Caro-Lopera et al. / Journal of Multivariate Analysis 145 (2016) 224–235
Corollary 24. Let A be Wm (n, 6), n > m − 1, and > 0. If r = (n − m − 1)/2 is a positive integer, then
mr 1 −1 1 1 −1 Pr(A > ) = etr − 6 6 , Cκ ∗ 2 k! κ ∗ 2 k=0
(33)
where the summation runs over partitions κ ∗ = (k1 , . . . , km ) of k such that k1 ≤ r. Proof. Taking h(y) = e−y/2 /(2π )mn/2 so that h(t ) (y) = (−1/2)t e−y/2 /(2π )mn/2 , we have (−1)t π −mn/2 2−k−t 0 (mn/2 − k), and then using the fact that ak Cκ (X) = Cκ (aX), we get Pr(A > ) =
mr Cκ ∗ k=0 κ ∗
1 2
∞ 6 −1 − 21 tr 6−1
k!
t!
t =0
∞ 0
h(t ) (w)w ma−k−1 dw =
t .
Upon using the Taylor expansion of the exponential series, we get the required result.
By a suggestion of a referee, we illustrate the above theorem with another model. Consider the so-called Kotz type I class of distributions which includes the Gaussian distribution. The generator function for this distributions is given by: h(y) =
RT −1+mn/2 0 (mn/2)
π mn/2 0 (T − 1 + mn/2)
yT −1 exp{−Ry},
where R > 0, 2T + mn > 2 and dk dyk
T −1
y
exp{−Ry} = (−R) y
k T −1
exp{−Ry} 1 +
p−1 k k p=1
p
(T − 1 − i) (−Ry)
−p
,
i=0
see [1]. Here, the parameter of interest is T . The first few cases for integers T are as follows:
mr G tr R6−1 Cκ ∗ (R6−1 ) −1 , Pr(A > ) = etr −R6 k! k=0 κ ∗ where G tr R6−1 is certain function of tr R6−1 . Let denote g = tr R6−1 , then: For R > 0 and T = 1: G(g ) = 1; note that when R = 1/2, we have the Gaussian case in Corollary 24. −2k+mn+2g For R > 0 and T = 2: G(g ) = . mn For R > 0 and T = 3: G(g ) = For R > 0 and T = 4: G(g ) =
4k2 +2mn+(mn+2g )2 −4k(1+mn+2g ) . mn(2+mn) −8k3 +8mn+12k2 (2+mn+2g )+(mn+2g )(mn(6+mn)+4mng +4g 2 )−2k(3m2 n2 +12mn(1+g )+4(2+3g (1+g ))) . mn(2+mn)(4+mn) −1
When the model increases the power T , more complex polynomials with powers of tr R6 appear; this polynomial increasing behavior in terms of Gaussian laws, has been noted earlier in other extensions under elliptical models; see [2–4]. As a final example, we consider the family of Pearson type VII distributions. The generator function is given by:
h(y) =
0 (s) (1 + y/R)−s , (π R)mn/2 0 (s − mn/2)
where R > 0, s > mn/2, and h(t ) (y) =
0 (s)(−1)t (s)t (1 + y/R)−(s+t ) , (π R)mn/2 0 (s − mn/2) Rt
then the required Pr(A > ) is: 1 + g /R
2 −(s−mn/2)
−k mr Cκ ∗ (R−1 6−1 ) (s − mn/2)k 1 + g /R2 , k! k=0 κ ∗
with g = tr R6−1 . When s = (mn + R)/2, we have the matrix variate t-distribution with R degrees of freedom. Other families of distributions such as Bessel, general Kotz, and Jensen-Logistic can be studied in this regard and their derivatives are given in [1]. Finally, we conclude this section by deriving the distribution of the smallest latent root lm of a sample covariance matrix S under an elliptical model. As stated earlier, it requires the computation of Pr(x > lm ) = Pr(xI > S) = Pr(nxI > A). Now, by replacing by nxI in Theorem 23, we get the required result as follows.
F.J. Caro-Lopera et al. / Journal of Multivariate Analysis 145 (2016) 224–235
235
Corollary 25. The distribution of the smallest latent root lm of a sample covariance matrix S ∼ E W m (n, 1n 6, h), n > m − 1, when r = (n − m − 1)/2 is a positive integer, is given by Pr(lm > x) = π
ma
mr ∞ Cκ ∗ (nx6−1 ) k!0 (ma − k) t =0 k=0 κ ∗
∞ 0
t h(t ) (w)w ma−k−1 dw tr nx6−1 , t!
where a = n/2 and the summation runs over partitions κ ∗ = (k1 , . . . , km ) of k such that k1 ≤ r. The Gaussian case (see [18, Corollary 9.7.4]) follows immediately from Corollary 26 as given below. Corollary 26. If lm is the smallest latent root of S, where A = nS is Wn (n, 6) and if r = (n − m − 1)/2 is a positive integer, then
1 Pr(lm > x) = etr − nx6−1 2
mr k=0
1 k! κ ∗
Cκ
1 2
nx6−1
,
where the summation is over those partitions κ = (k1 , . . . , km ) of k with k1 ≤ r. For other models such as Kotz type I and Pearson type VII, simply replace by nxI in the examples given for Theorem 23. All the distributions derived in the preceding sections can be computed by suitable modifications of the algorithms of Koev and Edelman [15] given for hypergeometric functions, but we need to find the general derivatives of the kernel function. Caro-Lopera et al. [1] obtained these derivatives for the classical families of elliptical distributions such as Pearson VII, Kotz, Bessel and Jensen-Logistic, and this will be useful in the required computations. In concluding, it needs to be mentioned that by proceeding on similar lines of work, we may study noncentral and inverted Wishart distributions under elliptical models and examine their properties and applications. We are currently working on these problems and hope to present the findings in a future paper. Acknowledgments This research work was supported by a joint project between the University of Medellin (Medellin, Colombia) and the Research Center in Mathematics (CIMAT, Guanajuato, Mexico), with partial support from the Laboratory M3 from CIMAT. N. Balakrishnan thanks the Natural Sciences and Engineering Research Council of Canada for supporting his research through an individual discovery grant (5-36028). Our sincere thanks also go to the two anonymous reviewers, the Editor in Chief and the Associate Editor for their incisive comments and suggestions on an earlier version of this manuscript which led to this much improved version. References [1] F.J. Caro-Lopera, J.A. Díaz-García, G. González-Farías, Noncentral elliptical configuration density, J. Multivariate Anal. 101 (2009) 32–43. [2] F.J. Caro-Lopera, G. González-Farías, N. Balakrishnan, On generalized Wishart distributions—I: Likelihood ratio test for homogeneity of covariance matrices, Sankhya¯ Ser. A 76 (2014) 179–194. [3] F.J. Caro-Lopera, G. González-Farías, N. Balakrishnan, On generalized Wishart distributions—II: Sphericity test, Sankhya¯ Ser. A 76 (2014) 195–218. [4] F.J. Caro-Lopera, G. González-Farías, N. Balakrishnan, Matrix-variate distribution theory under elliptical models-3: Likelihood ratio test statistics for testing location and scale and their exact null distribution, 2015, submitted for publication. [5] A.G. Constantine, Noncentral distribution problems in multivariate analysis, Ann. Math. Statist. 34 (1963) 1270–1285. [6] J.A. Díaz-García, R. Gutiérrez-Jáimez, Zonal polynomials: A note, Chil. J. Statist. 2 (2010) 21–30. [7] J.A. Díaz-García, R. Gutiérrez-Jáimez, Distributions of the compound and scale mixture of vector and spherical matrix variate elliptical distributions, J. Multivariate Anal. 102 (2011) 143–152. [8] J.A. Díaz-García, R. Gutiérrez-Jáimez, Some comments on zonal polynomials and their expected values with respect to elliptical distributions, Acta Math. Appl. Sin. Engl. Ser. (2011) http://dx.doi.org/10.1007/s10255-011-0092-8. [9] K.T. Fang, Y.T. Zhang, Generalized Multivariate Analysis, Science Press, Springer-Verlag, Beijing, 1990. [10] H. Hashiguchi, N. Niki, Numerical computation on distributions of the largest and the smallest latent roots of the Wishart matrix, J. Japanese Soc. Comput. Statist. 19 (1) (2006) 45–56. [11] A.T. James, The distribution of the latent roots of the covariance matrix, Ann. Math. Statist. 31 (1960) 151–158. [12] A.T. James, Distributions of matrix variates and latent roots derived from normal samples, Ann. Math. Statist. 39 (1964) 1711–1718. [13] C.G. Khatri, On certain distribution problems based on positive definite quadratic functions in normal vectors, Ann. Math. Statist. 37 (1966) 468–479. [14] C.G. Khatri, On the exact finite series distribution of the smallest or the largest root of matrices in three situations, J. Multivariate Anal. 2 (1972) 201–207. [15] P. Koev, A. Edelman, The efficient evaluation of the hypergeometric function of a matrix argument, Math. Comp. 75 (2006) 833–846. [16] R. Li, The expected values of invariant polynomials with matrix argument of elliptical distributions, Acta Math. Appl. Sin. 13 (1997) 64–70. [17] F. Li, Y. Xue, Zonal polynomials and hypergeometric functions of quaternion matrix argument, Comm. Statist. Theory Methods 38 (2009) 1184–1206. [18] R.J. Muirhead, Aspects of Multivariate Statistical Theory, John Wiley & Sons, New York, 1982.