Singular matric and matrix variate t distributions


Journal of Statistical Planning and Inference 139 (2009) 2382 -- 2387


José A. Díaz-García (a,∗), Ramón Gutiérrez-Jáimez (b)

(a) Department of Statistics and Computation, Universidad Autónoma Agraria Antonio Narro, 25315 Buenavista, Saltillo, Coahuila, Mexico
(b) Department of Statistics and OR, University of Granada, Granada 18071, Spain

ARTICLE INFO

Article history:
Received 11 April 2008
Received in revised form 5 November 2008
Accepted 7 November 2008
Available online 21 November 2008

ABSTRACT

Singularities of the matric and matrix variate t distributions are studied and their corresponding densities are derived. An application is then given in the context of sensitivity analysis. © 2008 Elsevier B.V. All rights reserved.

PACS: 62E15; 60E05

Keywords: Matrix-variate t distribution; Singular distribution; Matricvariate t distribution; Externally studentised residual

Contents

1. Introduction
2. Double singular matricvariate t distribution
3. Singular matrix-variate t distribution
4. Application
5. Conclusions
Acknowledgements
References

1. Introduction

The multivariate t distribution has been studied by several authors. An excellent review of the question can be found in Kotz and Nadarajah (2004). Let us recall that the r-dimensional t distribution with n degrees of freedom, mean vector μ ∈ R^r, and positive definite correlation matrix R ∈ R^{r×r} has the following two representations, see Kotz and Nadarajah (2004, pp. 2–7):

• t = S^{-1} Y + μ;   (1)

∗ Corresponding author. E-mail addresses: [email protected] (J.A. Díaz-García), [email protected] (R. Gutiérrez-Jáimez). doi:10.1016/j.jspi.2008.11.001


where Y ∼ N_r(0, R), i.e. Y is a non-singular r-dimensional normal random vector with mean 0 and positive definite covariance matrix R, R > 0; and nS^2 ∼ χ^2(n), independent of Y, is a chi-squared random variable with n degrees of freedom;

• t = (V^{1/2})^{-1} Y + μ;   (2)

where Y ∼ N_r(0, nI_r) is independent of V, I_r is the r-dimensional identity matrix, and V^{1/2} is the symmetric square root of V, i.e. V^{1/2} V^{1/2} = V ∼ W_r(n + r − 1, R^{-1}), where W_r(n, R) denotes the non-singular Wishart distribution with n degrees of freedom and positive definite matrix parameter R.

Let us now consider the matrix versions of these representations:

• For the first generalisation assume that Y ∼ N_{m×r}(0, I_m ⊗ R) (Muirhead, 1982, pp. 89–80); then T1 ∈ R^{m×r} is defined in a similar way to (1), but with μ ∈ R^{m×r}, and the corresponding density is called the matrix-variate non-singular t distribution, see Fang and Anderson (1990, p. 208), Sutradhar and Ali (1989) and Gupta and Varga (1993, p. 76), among others.
• For the generalisation of (2), assume that Y ∼ N_{m×r}(0, I_m ⊗ nI_r); then the density of T ∈ R^{m×r} is called the non-singular matricvariate t distribution, see Dickey (1967), Press (1982, pp. 138–141), Box and Tiao (1972, pp. 441–448) and Kotz and Nadarajah (2004, Section 5.11).

Note that the rows of the matrices T1 and T have the same marginal distribution in both representations, but the densities of T1 and T are different, see Dickey (1967) and Sutradhar and Ali (1989), among others.

A common problem in matric and matrix variate t distributions is encountered when the matrix Y is singular. More explicitly, if the matrix Y ∈ R^{m×r} represents the information of m objects or observations on which r variables (characteristics) are evaluated per object, then the singularity is due to the following reasons: the rows (observations) and/or the columns (characteristics) of the matrix Y are linearly dependent, see Díaz-García et al. (1997) and Díaz-García and González-Farías (2005). In addition, for the matricvariate distribution we need to consider another class of singularity, found when V is positive semidefinite (V ≥ 0). From a practical point of view, this singularity (V ≥ 0) can arise for many reasons; for example, when the number of observations is lower than the number of characteristics (dimensions) of the problem, m < r, see Díaz-García et al. (1997).
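The column singularity just described is easy to exhibit numerically. In this sketch (numpy assumed; the dimensions are illustrative), a rank-deficient column covariance R forces every realisation of Y to be rank-deficient as well:

```python
import numpy as np

rng = np.random.default_rng(1)
m, r, r_R = 5, 4, 2                        # observations, variables, rank(R)

A = rng.standard_normal((r, r_R))
R = A @ A.T                                # R >= 0 with rank r_R < r

# Y ~ N_{m x r}(0, I_m (x) R): each row is N_r(0, R), so the columns of Y
# are linearly dependent and rank(Y) <= min(m, r_R) almost surely.
Y = rng.standard_normal((m, r_R)) @ A.T
```

The symmetric construction with a rank-deficient row covariance produces linearly dependent observations (rows) instead.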
In any case, if the matrix Y and/or the matrix V is singular, then the matrices T and T1 are singular, and the corresponding densities of the matric and matrix variate distributions exist with respect to the Hausdorff measure, see Billingsley (1986), Díaz-García et al. (1997) and Díaz-García and González-Farías (2005). In the case of the matrix-variate t distribution there is only one type of singularity, a consequence of the singularity of Y. However, in the case of the matricvariate t distribution we have the following cases:

• Non-singular: V > 0, Y is non-singular;
• Singular 1: V > 0, Y is singular;
• Singular 2: V ≥ 0, Y is non-singular;
• Double singular: V ≥ 0, Y is singular.

In this work we propose an expression for the doubly singular matricvariate t distribution, and as particular cases we find expressions for the non-singular, singular 1 and singular 2 matricvariate t distributions. Similarly, we propose an expression for the singular matrix-variate t distribution. Finally, an application of the singular matricvariate t distribution is proposed; this is obtained by studying the generalised multivariate version of the externally studentised residual when the number of observations is less than the dimension of the problem.

2. Double singular matricvariate t distribution

Given any symmetric matrix A, let A^+ and A^- be the Moore–Penrose inverse and any symmetric generalised inverse of A, respectively, see Rao (1973). If ch_i(A) denotes the i-th eigenvalue of the matrix A and we use the notation of Díaz-García et al. (1997) for the singular matrix-variate normal, Wishart and pseudo-Wishart distributions, then we have:

Theorem 1 (Double singular matricvariate t distribution). Let T be the random m × r matrix

T = (V^{1/2})^+ Y + μ,

where V^{1/2} V^{1/2} = V ∼ W_m^q(n, N), N (m × m) ≥ 0 with rank(N) = r_N ≤ m and q = min(m, n), independent of Y ∼ N_{m×r}^{r_R}(0, I_m ⊗ R), R (r × r) ≥ 0 with rank(R) = r_R ≤ r and m ≥ q ≥ r_R. Then the singular random matrix T has the density

dF_T(T) = [c(m, n, q, q1, r_N, r_R, r*) / (∏_{i=1}^{r_R} ch_i(R)^{m/2} ∏_{j=1}^{r_N} ch_j(N)^{n/2})] ∏_{l=1}^{r*} ch_l[N^- + (T − μ)R^-(T − μ)']^{-(n+r_R)/2} (dT),   (3)
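Theorem 1 is stated in terms of (V^{1/2})^+, the Moore–Penrose inverse of the symmetric square root of a positive semidefinite V. A small numpy sketch of that object (illustrative dimensions; the eigenvalue cutoff is an assumption of the sketch):

```python
import numpy as np

rng = np.random.default_rng(2)
m, q = 4, 2                         # V is m x m, positive semidefinite, rank q

B = rng.standard_normal((m, q))
V = B @ B.T                         # V >= 0 with rank q < m

# Spectral decomposition keeping only the strictly positive eigenvalues:
# V^{1/2}     = Q_+ diag(sqrt(w_+))   Q_+'
# (V^{1/2})^+ = Q_+ diag(1/sqrt(w_+)) Q_+'
w, Q = np.linalg.eigh(V)
pos = w > 1e-10
V_half = (Q[:, pos] * np.sqrt(w[pos])) @ Q[:, pos].T
V_half_pinv = (Q[:, pos] / np.sqrt(w[pos])) @ Q[:, pos].T
```

The two checks worth running are V^{1/2} V^{1/2} = V and agreement of the hand-built pseudoinverse with `np.linalg.pinv`.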


where

c(m, n, q, q1, r_N, r_R, r*) = [π^{n(q−r_N)/2 − (n+r_R)(q1−r*)/2 − m r_R/2} Γ_{q1}[(n + r_R)/2]] / [2^{(m r_R + n r_N)/2 − (n+r_R)r*/2} Γ_q[n/2]],   (4)

with q1 = min(m, n + r_R), r* = rank[N^- + (T − μ)R^-(T − μ)'] ≤ m, and (dT) denotes the Hausdorff measure.

Proof. From Díaz-García et al. (1997), the joint density of V and Y is

dF_{Y,V}(Y, V) ∝ |L|^{(n−m−1)/2} etr(−½(N^- V + Y R^- Y')) (dY)(dV),

where etr(·) = exp(tr(·)), V = H1 L H1' is the non-singular part of the spectral decomposition of V, H1 is an m × q semi-orthogonal matrix, and L = diag(l1, ..., lq), l1 > ··· > lq; the corresponding constant of proportionality is

c = [2^{−n r_N/2} π^{n(q−r_N)/2}] / [(2π)^{m r_R/2} Γ_q[n/2] ∏_{i=1}^{r_R} ch_i(R)^{m/2} ∏_{j=1}^{r_N} ch_j(N)^{n/2}].

First consider the transformation T = (V^{1/2})^+ Y = (V^{1/2})^+ (Y1, ..., Yr), where the Yj, j = 1, ..., r, are the columns of Y and lie in the column space of V^{1/2}, so that Y = V^{1/2} T. Now make the change of variables Y = V^{1/2}(T − μ). Then by Díaz-García (2007), with m ≥ q ≥ r_R,

(dY) = |L|^{r_R/2} (dT).

The joint density of T and V is given by

dF_{T,V}(T, V) ∝ |L|^{(n+r_R−m−1)/2} etr(−½(N^- + (T − μ)R^-(T − μ)')V) (dT)(dV).

Integrating with respect to V ≥ 0 we have (see Díaz-García et al., 1997)

dF_T(T) ∝ [2^{(n+r_R)r*/2} Γ_{q1}[(n + r_R)/2]] / [π^{(n+r_R)(q1−r*)/2} ∏_{l=1}^{r*} ch_l(N^- + (T − μ)R^-(T − μ)')^{(n+r_R)/2}] (dT),

and the result follows by noting that q1 = min(m, n + r_R) and that r* ≤ m is the rank of the matrix N^- + (T − μ)R^-(T − μ)'. □

Note that within the singular 2 and doubly singular types there are several particular cases of the matricvariate t distribution, according to the singularity sources of the density of V, see Díaz-García et al. (1997). Some of these particular cases are obtained by the following substitutions:

(1) If r* = m, then

[c(m, n, q, q1, r_N, r_R, m) / (∏_{i=1}^{r_R} ch_i(R)^{m/2} ∏_{j=1}^{r_N} ch_j(N)^{n/2})] |N^- + (T − μ)R^-(T − μ)'|^{-(n+r_R)/2} (dT).

(2) Case singular 1:

[c(m, n, m, m, m, r_R, m) / (∏_{i=1}^{r_R} ch_i(R)^{m/2} |N|^{n/2})] |N^{-1} + (T − μ)R^-(T − μ)'|^{-(n+r_R)/2} (dT).

This distribution was obtained previously, in an alternative way, by Díaz-García and Gutiérrez-Jáimez (2006b).

(3) Case singular 2: in general we have

[c(m, n, q, n + r, r_N, r, m) / (|R|^{m/2} ∏_{j=1}^{r_N} ch_j(N)^{n/2})] |N^- + (T − μ)R^{-1}(T − μ)'|^{-(n+r)/2} (dT),

or, in the particular case when V has a non-singular pseudo-Wishart distribution,

[c(m, n, n, n + r, m, r, m) / (|R|^{m/2} |N|^{n/2})] |N^{-1} + (T − μ)R^{-1}(T − μ)'|^{-(n+r)/2} (dT).
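When r* = m the kernel reduces to an ordinary determinant, which is best evaluated on the log scale. A hedged numerical illustration (numpy assumed; identity parameter matrices and small dimensions are simplifying choices of the sketch):

```python
import numpy as np

rng = np.random.default_rng(3)
m, r, n = 3, 2, 6
mu = np.zeros((m, r))
N = np.eye(m)                       # illustrative non-singular parameter matrices
R = np.eye(r)
T = rng.standard_normal((m, r))

# log of |N^{-1} + (T - mu) R^{-1} (T - mu)'|^{-(n + r_R)/2}, with r_R = r here,
# computed stably via slogdet rather than det
M = np.linalg.inv(N) + (T - mu) @ np.linalg.inv(R) @ (T - mu).T
sign, logdet = np.linalg.slogdet(M)
log_kernel = -(n + r) / 2.0 * logdet
```

With N = I the matrix M = I + (T − μ)(T − μ)' has all eigenvalues at least 1, so the log-kernel is never positive.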


Observe that the non-singular case studied by Dickey (1967) and Box and Tiao (1972, pp. 439–447) is obtained as a particular case of Theorem 1 by taking q = q1 = r_N = r* = m and r_R = r:

[Γ_m[(n + r)/2] / (π^{mr/2} Γ_m[n/2] |R|^{m/2} |N|^{n/2})] |N^{-1} + (T − μ)R^{-1}(T − μ)'|^{-(n+r)/2} (dT),

where (dT) is the Lebesgue measure.

3. Singular matrix-variate t distribution

In this section we study the density function of the singular matrix-variate t distribution.

Theorem 2 (Singular matrix-variate t distribution). Let T1 be the random m × r matrix

T1 = S^{-1} Y + μ,

where nS^2 ∼ χ^2(n) is independent of Y ∼ N_{m×r}^{r_H, r_R}(0, H ⊗ R), R (r × r) ≥ 0 with rank(R) = r_R ≤ r and rank(H) = r_H ≤ m. Then T1 has the density

dF_{T1}(T1) = [π^{−g/2} Γ[(n + g)/2] / (Γ[n/2] ∏_{i=1}^{r_R} ch_i(R)^{r_H/2} ∏_{j=1}^{r_H} ch_j(H)^{r_R/2})] (1 + tr H^-(T1 − μ)R^-(T1 − μ)')^{-(n+g)/2} (dT1),   (5)
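In density (5) the generalised inverses H^- and R^- enter only through a scalar trace. A sketch of that computation, with Moore–Penrose inverses standing in for the generalised inverses (numpy assumed; dimensions and ranks are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
m, r, r_H, r_R, n = 5, 4, 3, 2, 7

# Rank-deficient H (m x m) and R (r x r); pinv is one valid generalised inverse
B = rng.standard_normal((m, r_H)); H = B @ B.T
A = rng.standard_normal((r, r_R)); R = A @ A.T
mu = np.zeros((m, r))
T1 = rng.standard_normal((m, r))

# tr H^-(T1 - mu) R^-(T1 - mu)' -- a nonnegative quadratic form
quad = np.trace(np.linalg.pinv(H) @ (T1 - mu) @ np.linalg.pinv(R) @ (T1 - mu).T)
g = r_R * r_H
log_kernel = -(n + g) / 2.0 * np.log1p(quad)   # log of (1 + tr ...)^{-(n+g)/2}
```

Since both pseudoinverses are positive semidefinite, the trace term is nonnegative and the log-kernel is never positive.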

where g = r_R r_H and (dT1) denotes the Hausdorff measure.

Proof. The joint density of Y and S is

dF_{Y,S}(Y, s) ∝ s^{(n−2)/2} exp(−½(s + tr H^- Y R^- Y')) (dY)(ds),

where the corresponding constant of proportionality is

c1 = 1 / [2^{n/2} (2π)^{r_H r_R/2} Γ[n/2] ∏_{i=1}^{r_R} ch_i(R)^{r_H/2} ∏_{j=1}^{r_H} ch_j(H)^{r_R/2}].

Let Y = S^{1/2}(T1 − μ); then by Díaz-García (2007), (dY) = s^{r_H r_R/2} (dT1). The joint density of T1 and S is given by

dF_{T1,S}(T1, s) ∝ s^{(n + r_H r_R − 2)/2} exp(−½(1 + tr H^-(T1 − μ)R^-(T1 − μ)')s) (dT1)(ds).
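Marginalising s from a joint density of this exponential-times-power form is a one-dimensional gamma integral; the identity can be checked numerically (scipy assumed; the values of n, r_H, r_R and the trace term are illustrative):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gammaln

n, r_H, r_R = 5, 2, 3
a = n + r_H * r_R            # combined exponent n + r_H r_R
c = 1.7                      # stands in for 1 + tr H^-(T1 - mu) R^-(T1 - mu)'

# Numerical value of the integral over s > 0
num, _ = quad(lambda s: s ** ((a - 2) / 2) * np.exp(-c * s / 2), 0, np.inf)

# Closed form: 2^{a/2} Gamma(a/2) c^{-a/2}, assembled on the log scale
closed = np.exp((a / 2) * np.log(2.0) + gammaln(a / 2) - (a / 2) * np.log(c))
```

The substitution u = cs/2 reduces the integral to the standard gamma integral, which is exactly how the (1 + tr ...)^{-(n+g)/2} kernel acquires its exponent.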

Integrating with respect to S and using

∫_{s>0} s^{(n + r_H r_R − 2)/2} exp(−½(1 + tr H^-(T1 − μ)R^-(T1 − μ)')s)(ds) = 2^{(n + r_H r_R)/2} Γ[(n + r_H r_R)/2] (1 + tr H^-(T1 − μ)R^-(T1 − μ)')^{-(n + r_H r_R)/2}

gives the required marginal density function for T1. □

Theorem 2 allows generalisations of the results in Sutradhar and Ali (1989) and Fang and Anderson (1990, p. 208) to the singular matrix-variate case. Observe, also, that this distribution can be obtained in an alternative way from Theorem 1 in Díaz-García and Gutiérrez-Jáimez (2006c), see also Díaz-García and González-Farías (2005).

4. Application

The distributions of certain classes of residuals associated with a multivariate general linear model play a fundamental role in the context of sensitivity analysis, see Caroni (1987) and Díaz-García and Gutiérrez-Jáimez (2006a), among many others. Consider the multivariate general linear model

Y = XB + E,   (6)

where X is the m × p (known) design matrix of rank r_X ≤ min(m, p) and B is the p × r matrix of unknown parameters. We shall assume that E ∼ N_{m×r}(0, I_m ⊗ R), with R ≥ 0 the r × r unknown covariance matrix. We also assume that m < r, i.e. the number of observations is less than the dimension r.
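A minimal simulation of this setting (numpy assumed; sizes, seed and R = I_r are illustrative choices) makes the shortage of degrees of freedom concrete:

```python
import numpy as np

rng = np.random.default_rng(5)
m, p, r = 6, 2, 10               # m observations, r variables, m < r
X = rng.standard_normal((m, p))  # design matrix
B = rng.standard_normal((p, r))  # coefficients (known here only by construction)
E = rng.standard_normal((m, r))  # errors with R = I_r for simplicity
Y = X @ B + E

r_X = np.linalg.matrix_rank(X)
# Any residual-based covariance estimate is built from at most m - r_X
# linearly independent rows, hence has rank at most m - r_X < r.
```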


The maximum likelihood or least squares estimator of XB is given by

X̂B ≡ X B̂ = X(X'X)^- X'Y = XX^+ Y,

which is invariant under the choice of generalised inverse of X'X. The covariance matrix R can be unbiasedly estimated by

R̂ = (1/(m − r_X)) (Y − X B̂)'(Y − X B̂).

The matrix R̂ is, however, a singular random matrix of rank m − r_X < r; moreover, (m − r_X)R̂ has a pseudo-Wishart distribution of rank (m − r_X) with (m − r_X) degrees of freedom and symmetric matrix of parameters R, which is denoted as (m − r_X)R̂ ∼ PW_r^{m−r_X}((m − r_X), R), see Díaz-García et al. (1997).

The residual matrix is defined as

Ê = Y − Ŷ = Y − X B̂ = (I_m − XX^+)Y = (I_m − H)Y,

where H = XX^+ is the orthogonal projector onto the column space of X. Then Ê has a singular matrix-variate normal distribution of rank r(m − r_X), i.e. Ê ∼ N_{m×r}^{(m−r_X), r}(0, (I_m − H) ⊗ R). Also, observe that the i-th row of Ê, denoted Ê_i, has a non-singular r-variate normal distribution, i.e. Ê_i ∼ N_r(0, (1 − h_ii)R), H = (h_ij), for all i = 1, ..., m. Given that the Ê_i's are linearly dependent, we need to define an index set I = {i1, ..., ik}, with i_s ∈ {1, ..., m}, s = 1, ..., k and k ≤ (m − r_X), in such a way that the vectors Ê_{i1}, ..., Ê_{ik} are linearly independent. Thus we write

Ê_I = (Ê_{i1}', ..., Ê_{ik}')'.   (7)

Observe that Ê_I has a non-singular matrix-variate normal distribution, namely Ê_I ∼ N_{k×r}(0, (I_k − H_I) ⊗ R). Here H_I is obtained by deleting the rows and columns of H not indexed by I.

A generalised multivariate version of the externally studentised residual when the number of observations is less than the dimension (R̂ ≥ 0) is given by (see Caroni, 1987 for the classical definition)

u_i = (1/(1 − h_ii)^{1/2}) (R̂_{(i)}^{1/2})^+ Ê_i,

where R̂_{(i)} is an r × r singular matrix of rank (m − r_X − 1), obtained by removing the i-th observation from the sample. Given the index set I, the following definition is established:

u_I = (I_k − H_I)^{-1/2} Ê_I (R̂_I^{1/2})^+,

where R̂_I is obtained by removing from the sample the observations indexed by I.

Theorem 3 (Externally studentised residual). Under the general multivariate linear model (6), W = u_I/√s has a singular matricvariate t distribution (singular 2); moreover,

dF_W(W) = [Γ_{m−r_X}[(m − r_X)/2] / (π^{(m−r_X)(m−r_X−r)/2 + kr/2} Γ_s[s/2])] |I_r + W'W|^{-(m−r_X)/2} (dW),

where s = m − r_X − k.

Proof. The demonstration follows from Theorem 1, by observing that W = T', (I_k − H_I)^{-1/2} Ê_I ∼ N_{k×r}(0, I_k ⊗ R) and that (m − r_X − k)R̂_I ∼ PW_r^{(m−r_X−k)}((m − r_X − k), R). □
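A rough empirical analogue of the construction behind Theorem 3 can be sketched as follows (numpy assumed; the choice of index set, the use of pinv-based square-root inverses, and the way R̂_I is built from the remaining residual rows are all illustrative assumptions of the sketch, not the paper's exact estimator):

```python
import numpy as np

rng = np.random.default_rng(6)
m, p, r = 8, 2, 12                           # m < r, as in the application
X = rng.standard_normal((m, p))
Y = X @ rng.standard_normal((p, r)) + rng.standard_normal((m, r))

H = X @ np.linalg.pinv(X)                    # orthogonal projector X X^+
E_hat = (np.eye(m) - H) @ Y                  # residual matrix
r_X = np.linalg.matrix_rank(X)

k = 3                                        # size of the index set I
I = list(range(k))                           # first k rows, for illustration
J = [i for i in range(m) if i not in I]
E_I = E_hat[I]

# Covariance estimate with the observations indexed by I removed (illustrative)
R_I = E_hat[J].T @ E_hat[J] / (m - r_X - k)

# (I_k - H_I)^{-1/2} via eigendecomposition; H_I keeps rows/columns in I
H_I = H[np.ix_(I, I)]
w, Q = np.linalg.eigh(np.eye(k) - H_I)
IH_inv_half = (Q / np.sqrt(w)) @ Q.T

# Moore-Penrose inverse of the symmetric square root of the singular R_I
wR, QR = np.linalg.eigh(R_I)
pos = wR > 1e-10
R_half_pinv = (QR[:, pos] / np.sqrt(wR[pos])) @ QR[:, pos].T

u_I = IH_inv_half @ E_I @ R_half_pinv
W = u_I / np.sqrt(m - r_X - k)               # W = u_I / sqrt(s), s = m - r_X - k
sign, logdet = np.linalg.slogdet(np.eye(r) + W.T @ W)  # enters Theorem 3's density
```

Since W'W is positive semidefinite, I_r + W'W is always positive definite, so the determinant in the density of Theorem 3 is well defined even though m < r.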

5. Conclusions

The singular matrix-variate and matricvariate t distributions have been studied under a number of types of singularity. Several particular cases of interest in the statistical literature were examined, and previously published results were recovered as particular cases of the general densities derived in this paper. Finally, some of these results were applied in the context of sensitivity analysis, specifically to study the joint distribution of the externally studentised residual.


Acknowledgements

The authors wish to thank the Associate Editor and the anonymous reviewers for their constructive comments on the preliminary version of this paper. This research work was partially supported by IDI-Spain, Grant MTM2005-09209, and CONACYT-México, research Grant no. 81512. This paper was written during J.A. Díaz-García's stay as a visiting professor at the Department of Statistics and OR of the University of Granada, Spain. We also thank Francisco José Caro Lopera (CIMAT-México) for his careful reading and excellent comments.

References

Billingsley, P., 1986. Probability and Measure, second ed. Wiley, New York.
Box, G.E., Tiao, G.C., 1972. Bayesian Inference in Statistical Analysis. Addison-Wesley, Reading, MA.
Caroni, C., 1987. Residuals and influence in the multivariate linear model. The Statist. 36, 365–370.
Dickey, J.M., 1967. Matricvariate generalizations of the multivariate t-distribution and the inverted multivariate t-distribution. Ann. Math. Statist. 38, 511–518.
Díaz-García, J.A., 2007. A note about measures and Jacobians of singular random matrices. J. Multivariate Anal. 98, 960–969.
Díaz-García, J.A., González-Farías, G., 2005. Singular random matrix decompositions: distributions. J. Multivariate Anal. 94, 109–122.
Díaz-García, J.A., Gutiérrez-Jáimez, R., 2006a. The distribution of the residual from a general elliptical multivariate linear regression model. J. Multivariate Anal. 97, 1829–1841.
Díaz-García, J.A., Gutiérrez-Jáimez, R., 2006b. Distribution of the generalised inverse of a random matrix and its applications. J. Statist. Plann. Inference 136, 183–192.
Díaz-García, J.A., Gutiérrez-Jáimez, R., 2006c. Wishart and pseudo-Wishart distributions under elliptical laws and related distributions in the shape theory context. J. Statist. Plann. Inference 136, 4176–4193.
Díaz-García, J.A., Gutiérrez-Jáimez, R., Mardia, K.V., 1997. Wishart and pseudo-Wishart distributions and some applications to shape theory. J. Multivariate Anal. 63, 73–87.
Fang, K.T., Anderson, T.W., 1990. Statistical Inference in Elliptically Contoured and Related Distributions. Allerton Press, New York.
Gupta, A.K., Varga, T., 1993. Elliptically Contoured Models in Statistics. Kluwer Academic Publishers, Dordrecht.
Kotz, S., Nadarajah, S., 2004. Multivariate t Distributions and Their Applications. Cambridge University Press, Cambridge, UK.
Muirhead, R.J., 1982. Aspects of Multivariate Statistical Theory. Wiley Series in Probability and Mathematical Statistics. Wiley, New York.
Press, S.J., 1982. Applied Multivariate Analysis: Using Bayesian and Frequentist Methods of Inference, second ed. Robert E. Krieger Publishing Company, Malabar, FL.
Rao, C.R., 1973. Linear Statistical Inference and Its Applications. Wiley, New York.
Sutradhar, B.C., Ali, M.M., 1989. A generalization of the Wishart distribution for the elliptical model and moments for the multivariate t model. J. Multivariate Anal. 29, 155–162.