ARTICLE IN PRESS
Statistics & Probability Letters 77 (2007) 407–416 www.elsevier.com/locate/stapro
Multivariate extensions of Spearman’s rho and related statistics Friedrich Schmid, Rafael Schmidt Department of Economic and Social Statistics, University of Cologne, Germany Received 11 October 2005; received in revised form 24 May 2006; accepted 7 August 2006 Available online 5 September 2006
Abstract Multivariate measures of association are considered which, in the bivariate case, coincide with the population version of Spearman’s rho. For these measures, nonparametric estimators are introduced via the empirical copula. Their asymptotic normality is established under rather weak assumptions concerning the copula. The asymptotic variances are explicitly calculated for some copulas of simple structure. For general copulas, a nonparametric bootstrap is established. r 2006 Elsevier B.V. All rights reserved. MSC: Primary 62H20; 62G05; 62G20; secondary 60G15; 62G30 Keywords: Spearman’s rho; Multivariate measure of association; Copula; Nonparametric estimation; Empirical copula; Weak convergence; Asymptotic variance; Nonparametric bootstrap
1. Introduction Spearman’s rho is a widely used measure for the strength of association between two random variables X and Y. It is invariant with respect to the marginal distributions of X and Y, and can be expressed as a function of their copula. This property is also known as ‘scale invariance’. There are various ways to extend Spearman’s rho to a (multivariate) measure of association between d random variables X 1 ; . . . ; X d . This is of interest in many fields of application, e.g., in the multivariate analysis of financial asset returns where one wants to express the amount of dependence in a portfolio by a single number. The focus of this paper is on the nonparametric estimation of multivariate population versions of Spearman’s rho via the empirical copula. Using empirical process theory it can be shown that the estimators are asymptotically normally distributed under rather weak assumptions concerning the copula of X 1 ; . . . ; X d . We obtain compact expressions for the asymptotic variances of these estimators, which are determined by the
Corresponding author. Seminar fur Wirtschafts- und Sozialstatistik, Universitat zu Koln, Albertus-Magnus-Platz, D-50923 Koln, ¨ ¨ ¨ ¨
Germany. Tel.: +49 221 470 2283; fax: +49 221 470 5074. E-mail addresses:
[email protected] (F. Schmid),
[email protected] (R. Schmidt). URL: http://www.uni-koeln.de/wiso-fak/wisostatsem/. 0167-7152/$ - see front matter r 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.spl.2006.08.007
ARTICLE IN PRESS F. Schmid, R. Schmidt / Statistics & Probability Letters 77 (2007) 407–416
408
copula and its partial derivatives. If the copula possesses a simple structure, these obtained formulas are suitable for explicit computations. Otherwise, we provide a bootstrap algorithm. The structure of the paper is as follows. The next section introduces the notation and some definitions used in the following. Section 3 reviews three multivariate extensions of the population version of Spearman’s rho and provides some analytical properties. Section 4 introduces the related nonparametric estimators which are based on the empirical copula and establishes their asymptotic normality. Thereafter, formulas for their asymptotic variances are given and a bootstrap procedure is presented. Finally, in Section 5 we explicitly calculate the asymptotic variance for various copulas.
2. Notation and definitions Throughout the paper we write bold letters for vectors, e.g., x:¼ðx1 ; . . . ; xd Þ is a d-dimensional vector. Inequalities xpy are understood componentwise, i.e, xi pyi for all i ¼ 1; . . . ; d. The indicator function on a set A is denoted by 1A . The set ½a; bd ; aob, refers to the d-dimensional Cartesian product ½a; b ½a; b Rd . The space ‘1 ðTÞ comprises all uniformly bounded real-valued functions on some set T. We equip the space with the uniform metric mðf 1 ; f 2 Þ ¼ supt2T jf 1 ðtÞ f 2 ðtÞj. Let X 1 ; X 2 ; . . . ; X d be dX2 random variables with joint distribution function F ðxÞ ¼ PðX 1 px1 ; . . . ; X d pxd Þ;
x ¼ ðx1 ; . . . ; xd Þ 2 Rd ,
and marginal distribution functions F i ðxÞ ¼ PðX i pxÞ for x 2 R and i ¼ 1; . . . ; d. If not stated otherwise, we will always assume that the F i are continuous functions. Thus, according to Sklar’s (1959) theorem, there exists a unique copula C : ½0; 1d ! ½0; 1 such that F ðxÞ ¼ CðF 1 ðx1 Þ; . . . ; F d ðxd ÞÞ
for all x 2 Rd .
The copula C is the joint distribution function of the random variables U i ¼ F i ðX i Þ; i ¼ 1; . . . ; d. Moreover, d 1 1 CðuÞ ¼ F ðF 1 is defined by 1 ðu1 Þ; . . . ; F d ðud ÞÞ for all u 2 ½0; 1 . The generalized inverse function G 1 1 G ðuÞ:¼ inffx 2 R [ f1g j GðxÞXug for all u 2 ð0; 1 and G ð0Þ:¼ supfx 2 R [ f1g j GðxÞ ¼ 0g. A detailed treatment of copulas is given in Nelsen (2006) and Joe (1997). A copula is said to be radially symmetric if, and only if, ¯ uÞ CðuÞ ¼ PðUpuÞ ¼ PðU41 uÞ¼:Cð1
for all u 2 ½0; 1d .
It is well known that every copula C is bounded in the following sense: W ðuÞ:¼ maxfu1 þ þ ud ðd 1Þ; 0g pCðuÞp minfu1 ; . . . ; ud g¼:MðuÞ for all u 2 ½0; 1d , where M and W are called the upper and lower Fre´chet– Hoeffding bounds, respectively. The upper bound M is a copula itself and is also known as the comonotonic copula. It represents the copula of X 1 ; . . . ; X d if F 1 ðX 1 Þ ¼ ¼ F d ðX d Þ with probability one, i.e., if there is (with probability one) a strictly increasing functional relationship between X i and X j ðiajÞ. By contrast, the lower bound W is a copula only for dimension d ¼ 2. Another important copula is the independence copula
PðuÞ:¼
d Y
ui ;
u 2 ½0; 1d ,
i¼1
describing the dependence structure of stochastically independent random variables X 1 ; . . . ; X d .
ARTICLE IN PRESS F. Schmid, R. Schmidt / Statistics & Probability Letters 77 (2007) 407–416
409
3. Multivariate extensions of Spearman’s rho The following three multivariate population versions of Spearman’s rho are considered ðdX2Þ: Z d þ1 d r1 ¼ hðdÞ 2 , CðuÞ du 1 with hðdÞ ¼ d d 2 ðd þ 1Þ ½0;1 Z r2 ¼ hðdÞ 2d PðuÞ dCðuÞ 1 and ½0;1d ( ) X d 1 Z r3 ¼ hð2Þ 22 C kl ðu; vÞ du dv 1 , 2 ½0;12 kol where C kl ðu; vÞ refers to the bivariate marginal copula of C which corresponds to the kth and lth margin. The estimation of r1 has been originally considered in Ruymgaart and van Zuijlen (1978) and was later discussed by Wolff (1980), Joe (1990), and Nelsen (1996). A related class of multivariate measures of tail dependence is developed in Schmid and Schmidt (2006a). The version r2 appears first in Joe (1990) and later in Nelsen (1996). Further, r3 is known as the population version of the average pairwise Spearman’s rho given, for example, in Kendall (1970, Chapter 6). Remark. The average of r1 and r2 leads to another multivariate version of Spearman’s rho which is a multivariate measure of concordance, as discussed in Taylor (2006) and Dolati and U´beda-Flores (2006). The asymptotic normality of the related non-parametric estimator (cf. Section 4) can be established using the following results. Note that r3 is a multivariate measure of concordance, too. The motivation of r3 as the weighted average of all pairwise Spearman’s rhos is obvious. By contrast, the motivation of r1 and r2 becomes clearer by the following. Spearman’s rho of a two-dimensional random vector X with copula C can be written as R1 R1 Z 1Z 1 uv dCðu; vÞ ð1=2Þ2 CovðU 1 ; U 2 Þ r ¼ pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffipffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ¼ 0 0 pffiffiffiffiffiffiffiffiffiffipffiffiffiffiffiffiffiffiffiffi Cðu; vÞ du dv 3, ¼ 12 VarðU 1 Þ VarðU 2 Þ 1=12 1=12 0 0 where ðU 1 ; U 2 Þ have joint distribution function C. This expression is equal to R R R R ½0;12 Cðu; vÞ du dv ½0;12 Pðu; vÞ du dv ½0;12 uv dCðu; vÞ ½0;12 uv dPðu; vÞ R R r¼R ¼R ½0;12 Mðu; vÞ du dv ½0;12 Pðu; vÞ du dv ½0;12 uv dMðu; vÞ ½0;12 uv dPðu; vÞ R R because of ½0;12 Mðu; vÞ du dv ¼ 13 and ½0;12 Pðu; vÞ du dv ¼ 14. Thus, r can be interpreted as the normalized average distance between the copula C and the independent copula Pðu; vÞ ¼ uv. The following d-dimensional extension of r to r1 (and analogously to r2 ) is now straightforward: R R Z d þ1 ½0;1d CðuÞ du ½0;1d PðuÞ du R R 2d CðuÞ du 1 ¼ r1 . ¼ d 2 ðd þ 1Þ ½0;1d ½0;1d MðuÞ du ½0;1d PðuÞ du For d ¼ 2, Spearman’s r coincides with r1 ¼ r2 ¼ r3 , though, for d42 the values of ri ; i ¼ 1; 2; 3, are different in general. There exists, however, an interesting relationship between r1 and r2 . Consider a random vector X ¼ ðX 1 ; . . . ; X d Þ and an index set I f1; . . . ; dg with cardinality 2pjIjpd. We denote by r1;I the jIjdimensional version of r1 corresponding to those variables X i where i 2 I. The following relationship holds: r2;f1;...;dg ¼
d X k¼2
ð1Þk
hðdÞ 2d X r . hðkÞ 2k If1;...;dg 1;I jIj¼k
This can be derived using partial integration and the inclusion–exclusion principle. There is an immediate consequence of this relationship if C is radially symmetric. In this case r1;f1;...;dg and r2;f1;...;dg coincide.
ARTICLE IN PRESS 410
F. Schmid, R. Schmidt / Statistics & Probability Letters 77 (2007) 407–416
Moreover, if d is odd, then both measures of association can be expressed by the lower dimensional r1;I : d 1 X hðdÞ 2d1 X ð1Þk r . r1;f1;...;dg ¼ r2;f1;...;dg ¼ hðkÞ 2k If1;...;dg 1;I k¼2 jIj¼k
If d is even, we have d 1 X hðdÞ 2d 0¼ ð1Þk hðkÞ 2k k¼2
X
r1;I .
If1;...;dg jIj¼k
Further, r1 and r3 are closely related to each other for the following copula C 0 , which is a convex combination of copulas: X d 1 Y C 0 ðuÞ ¼ C kl ðuk ; ul Þ uj , (1) 2 kol jal;k where C kl refers to the marginal copula which corresponds to the kth and lth margin of (any) copula C. In this R particular case, r3 ðCÞ ¼ hð2Þf2d ½0;1d C 0 ðuÞ du 1g, which coincides with r1 ðC 0 Þ except for a normalizing factor. Note that r1 ðC 0 ÞphðdÞ=3. We remark that r3 has certain disadvantages as a multidimensional measure of association, as it is determined by the bivariate copulas only. Consider, for example, the three-dimensional copula Cðu; v; wÞ ¼ uvw þ luð1 uÞvð1 vÞwð1 wÞ; jljp1, which is of Farlie–Gumbel–Morgenstern (FGM) type discussed in Section 5. This copula possesses independent bivariate margins and, therefore, r3 ¼ 0. By contrast, r1 ¼ l 63 a0 if la0. 4. Nonparametric estimation via the empirical copula Consider a random sample ðXj Þj¼1;...;n from a d-dimensional random vector X with joint distribution function F and copula C which are completely unknown. The marginal distribution functions F i are estimated by their empirical counterparts n 1X F^ i;n ðxÞ ¼ 1fX ij pxg for i ¼ 1; . . . ; d and x 2 R. n j¼1 ^ j;n ¼ ðU^ 1j;n ; . . . ; U^ dj;n Þ. Note that Further, set U^ ij;n :¼F^ i;n ðX ij Þ for i ¼ 1; . . . ; d; j ¼ 1; . . . ; n, and U 1 U^ ij;n ¼ ðrank of X ij in X i1 ; . . . ; X in Þ. n The estimation of ri ; i ¼ 1; 2; 3, will therefore be based on ranks (and not on the original observations). In other words, we consider rank order statistics. The copula C is estimated by the empirical copula which is defined as n Y d 1X 1 ^ for u ¼ ðu1 ; . . . ; ud Þ 2 ½0; 1d . C^ n ðuÞ ¼ n j¼1 i¼1 fU ij;n pui g Empirical copulas were introduced and studied by Deheuvels (1979) under the name of ‘empirical dependence functions’. The estimators of ri ; i ¼ 1; 2; 3, are given by ( ) Z n Y d 2d X d ^ ð1 U^ ij;n Þ 1 , r^ 1;n ¼ hðdÞ 2 C n ðuÞ du 1 ¼ hðdÞ n j¼1 i¼1 ½0;1d ( ) Z n Y d 2d X d ^ ^ r^ 2;n ¼ hðdÞ 2 U ij;n 1 , PðuÞ dC n ðuÞ 1 ¼ hðdÞ n j¼1 i¼1 ½0;1d n X d 1 Z 12 d 1 X X ð1 U^ kj;n Þð1 U^ lj;n Þ r^ 3;n þ 3 ¼ 12 C^ kl;n ðu; vÞ du dv ¼ 2 n 2 ½0;12 kol kol j¼1
ARTICLE IN PRESS F. Schmid, R. Schmidt / Statistics & Probability Letters 77 (2007) 407–416
411
with hðdÞ ¼ ðd þ 1Þ=ð2d d 1Þ and C^ kl;n ðu; vÞ being the bivariate marginal empirical copula of C^ n which corresponds to the kth and lth margin. The right-hand formula of r^ 2;n is developed in Eq. (8) later. The estimator r^ 1;n for d ¼ 2 differs slightly from the traditional sample version of Spearman’s rho r^ S;n ¼ 1
n 6n X ðU^ 1j;n U^ 2j;n Þ2 , n2 1 j¼1
(2)
which ispffiffiused if no ties are present in the sample. It can be shown that r^ 1;n pr^ S;n for n 2 N and ffi limn!1 nfr^ 1;n r^ S;n g ¼ 0 a.s. Therefore, in the bivariate case, r^ 1;n and r^ S;n have the same asymptotic distribution. An equivalent result for r^ 3;n and the traditional sample version of r3 , as given, e.g., in Kendall (1970, Chapter 6), holds. A method for approximating the asymptotic variance of r^ S;n is developed in Borkowf (1999, 2002). pffiffiffi The derivation of the limiting laws for pffiffiffi n^ ðr^ i;n ri Þ involves the following theorem concerning the asymptotic behavior of the copula process nfC n ðuÞ CðuÞg which has been investigated in various settings, e.g., by Ru¨schendorf (1976), Stute (1984), Ga¨nXler and Stute (1987), Fermanian et al. (2004), and Tsukahara (2005). Theorem 1. Let F be a continuous d-dimensional distribution function with copula C. Under the additional assumption that the ith partial derivatives Di CðuÞ exist and are continuous for i ¼ 1; . . . ; d, we have pffiffiffi w nfC^ n ðuÞ CðuÞg ! GC ðuÞ. Weak convergence takes place in ‘1 ð½0; 1d Þ and GC ðuÞ ¼ BC ðuÞ
d X
Di CðuÞBC ðuðiÞ Þ.
i¼1 ðiÞ
The vector u denotes the vector where all coordinates, except the ith coordinate of u, are replaced by 1. The process BC is a tight centered Gaussian process on ½0; 1d with covariance function EfBC ðuÞBC ðvÞg ¼ Cðu ^ vÞ CðuÞCðvÞ, i.e., BC is a d-dimensional Brownian bridge. 1 is closely related to the next theorem which we need in order to establish the limiting law of pffiffiTheorem ffi nðr^ 2;n r2 Þ. Theorem 2. Let F be a continuous d-dimensional distribution function with copula C. Suppose U is distributed ¯ with copula C and define CðuÞ ¼ PðU4uÞ. Consider the estimator ^¯ ðuÞ ¼ 1 C n n
n Y d X
1fU^ ij;n 4ui g
for u ¼ ðu1 ; . . . ; ud Þ 2 ½0; 1d .
j¼1 i¼1
Under the assumptions and using the notation of Theorem 1 we have pffiffiffi ^ w ¯ nfC¯ n ðuÞ CðuÞg ! GC¯ ðuÞ. (3) Pd d 1 ¯ Weak convergence takes place in ‘ ð½0; 1 Þ and GC¯ ðuÞ ¼ BC¯ ðuÞ þ i¼1 Di CðuÞB C¯ ðuðiÞ Þ with uðiÞ denoting the vector where all coordinates, except the ith coordinate of u, are replaced by 0. The process BC¯ is a tight centered ¯ _ vÞ CðuÞ ¯ CðvÞ. ¯ Gaussian process on ½0; 1d with covariance function EfBC¯ ðuÞBC¯ ðvÞg ¼ Cðu Proof. Consider the estimator ^¯ % ðuÞ ¼ 1 C n n
n Y d X
1fU ij 4ui g
for u 2 ½0; 1d
with U ij ¼ F i ðX ij Þ
j¼1 i¼1
and F i ; i ¼ 1; . . . ; d, denoting the marginal distribution functions of F . The corresponding empirical process converges weakly in ‘1 ð½0; 1d Þ to a d-dimensional Brownian bridge BC¯ with covariance structure ¯ _ vÞ CðuÞ ¯ CðvÞ. ¯ EfBC¯ ðuÞBC¯ ðvÞg ¼ Cðu The verification is standard; for example, marginal convergence is
ARTICLE IN PRESS F. Schmid, R. Schmidt / Statistics & Probability Letters 77 (2007) 407–416
412
proven via a multivariate version of the Lindeberg–Feller theorem for triangular arrays, see Araujo and Gine´ (1980, p. 41). This empirical process and the empirical process in (3) are related to each other as follows: pffiffiffi ^ pffiffiffi pffiffiffi 1 1 ¯ ¯ nfC¯ n ðuÞ CðuÞg þ Oð1= nÞ ¼ n½F^¯ n fF^ 1;n ðu1 Þ; . . . ; F^ d;n ðud Þg CðuÞ ¼ ðÞ, where we write 1 F^¯ n ðxÞ ¼ n
n Y d X
1fX ij 4xi g .
j¼1 i¼1
With F¯ ðxÞ ¼ PðX4xÞ we have pffiffiffi pffiffiffi ^ 1 ¯ ¯ þ n½F¯ fF^ 1 ðÞ ¼ nfC^¯ %n ðuÞ CðuÞg 1;n ðu1 Þ; . . . ; F d;n ðud Þg CðuÞ þ
d 1 X 1 ^ 1 ^ 1 ½H n fF 1 1 ðu1 Þ; . . . ; F i1 ðui1 Þ; F i;n ðui Þ; . . . ; F d;n ðud Þg i¼1
1 ^ 1 ^ 1 H n fF 1 1 ðu1 Þ; . . . ; F i ðui Þ; F iþ1;n ðuiþ1 Þ; . . . ; F d;n ðud Þg, pffiffiffi pffiffiffi ^ % w ¯ n ðuÞ CðuÞg ¯ ! BC¯ in ‘1 ð½0; 1d Þ. Further, where H n ¼ nðF^¯ n F¯ Þ. We mentioned that nfC
ð4Þ
d X pffiffiffi w ðiÞ ^ 1 ¯ ¯ n½F¯ fF^ 1 Di CðuÞB C ðu Þ 1;n ðu1 Þ; . . . ; F d;n ðud Þg CðuÞ ! i¼1
due to the Bahadur (1966) representation of the empirical process for uniformly distributed random variables and an application of the functional delta method (Van der Vaart and Wellner, 1996, p. 374). Weak convergence takes place in ‘1 ð½0; 1d Þ, cf. Fermanian etpal. ffiffiffi (2004). The last sum of formula (4) converges to zero in probability due to the weak convergence of nðF^¯ n F¯ Þ in ‘1 ð½1; 1d Þ. An application of the continuous mapping theorem yields the asserted weak convergence (utilize a.s. convergent versions of H n .) Finally, the fact BC ðuðiÞ Þ ¼ BC¯ ðuðiÞ Þ a.s. for all uðiÞ ; uðiÞ 2 ½0; 1d ; i ¼ 1; . . . ; d, completes the proof. & Theorem 3. Let r^ i;n ; i ¼ 1; 2; 3, be the estimators as defined above. Under the assumptions and using the notation of Theorems 1 and 2, pffiffiffi d nðr^ i;n ri Þ ! Zi Nð0; s2i Þ. The variances are s21 ¼ 22d hðdÞ2 s22 ¼ 22d hðdÞ2 s23
¼ 144
Z ½0;1d
Z
Z ½0;1d
kol sot
ð5Þ
EfGC¯ ðuÞGC¯ ðvÞg du dv,
ð6Þ
Z
½0;1d ½0;1d X 2 Z
d 2
EfGC ðuÞGC ðvÞg du dv,
½0;1d
Z ½0;1d
EfGC ðuðk;lÞ ÞGC ðvðs;tÞ Þg du dv
ð7Þ
with uðk;lÞ :¼ð1; . . . ; 1; uk ; 1; . . . ; 1; ul ; 1; . . . ; 1Þ. For related results on bivariate linear rank order statistics of similar type we refer to Ru¨schendorf (1976), Ga¨nXler and Stute (1987), and Genest and Re´millard (2004). Alternative derivations of the asymptotic behavior of rank order statistics such as r^ 1;n are given in Stepanova (2003). pffiffiffi Proof. The distributional convergence of nðr^ 1;n r1 Þ follows with Theorem 1 and the continuous mapping theorem (see Van der Vaart and Wellner, 1996, Theorem 1.3.6), since the integral operator is a continuous linear map on ‘1 ð½0; 1d Þ into R and GC is a tight Gaussian process. The formula for the variance s21 follows then by an application of Fubini’s theorem.
ARTICLE IN PRESS F. Schmid, R. Schmidt / Statistics & Probability Letters 77 (2007) 407–416
413
For r^ 2;n we first establish a useful relationship. Let Uo be distributed according to the (empirical) copula C^ n ðÞðoÞ for fixed o. Note that the last expression is indeed a multivariate distribution function. Further, consider i.i.d. random variables Z 1 ; . . . ; Z d which are uniformly distributed on the interval ½0; 1. Then for any fixed o, Z Z Z Z ^ ^ C^¯ n ðuÞðoÞ du. PðuÞ dC n ðuÞðoÞ ¼ PðZouÞ dC n ðuÞðoÞ ¼ PðUo 4uÞ du ¼ ½0;1d
½0;1d
½0;1d
½0;1d
Hence, we may rewrite the estimator r^ 2;n as follows: Z Z n Y d 2d X d d ^ U^ ij;n . PðuÞ dC n ðuÞ ¼ 2 (8) C^¯ n ðuÞ du ¼ r^ 2;n =hðdÞ þ 1 ¼ 2 n j¼1 i¼1 ½0;1d ½0;1d pffiffiffi Weak convergence of nðr^ 2;n r2 Þ and the form of s22 follows now by Theorem 2, along the same argumentation as above. pffiffiffi For the weak convergence of nðr^ 3;n r3 Þ, observe that X d 1 Z pffiffiffi pffiffiffi nfC^ n ðuðk;lÞ Þ Cðuðk;lÞ Þg du nðr^ 3;n r3 Þ ¼ 12 d 2 ½0;1 kol pffiffiffi is a continuousPlinear map on ‘1 ð½0; 1d Þ of the empirical copula process nfC^ n ðuÞ CðuÞg and, thus, R converges to 12 kol ðd2 Þ1 ½0;1d GC ðuðk;lÞ Þ du according to Theorem 1. & Remark. The process GC ðuðk;lÞ Þ in formula (7) takes the following form (using the notation of Theorem 1): GC ðuðk;lÞ Þ ¼ BC ðuðk;lÞ Þ Dk Cðuðk;lÞ ÞBC ðuðkÞ Þ Dl Cðuðk;lÞ ÞBC ðuðlÞ Þ because BC ð1Þ ¼ 0 a.s. Moreover, formula (7) can be rewritten as 2 Z Z 6 X d 2 2 6 s3 ¼ 1444 EfGC ðuðk;lÞ ÞGC ðvðk;lÞ Þg dðuk ; ul Þ dðvk ; vl Þ 2 2 2 ½0;1 ½0;1 kol
þ
X kol;ros fk;lgafr;sg
d 2
!2 Z ½0;12
Z ½0;12
3 7 EfGC ðuðk;lÞ ÞGC ðvðr;sÞ Þg dðuk ; ul Þ dður ; us Þ7 5.
Proposition 4. Let C be a radially symmetric copula. Under the additional assumptions and using the notation of Theorems 1 and 2, the processes GC ðuÞ and GC¯ ð1 uÞ are equally distributed. Moreover, the asymptotic variances s21 and s22 in formulas (5) and (6) coincide. ¯ ¯ Proof. Regarding the first assertion, note that Di CðtÞj t¼1u ¼ Di Cð1 uÞ ¼ Di CðuÞ and the covariance structure of BC ðuÞ equals ¯ ¯ uÞCð1 ¯ vÞ Cðu ^ vÞ CðuÞCðvÞ ¼ Cfð1 uÞ _ ð1 vÞg Cð1 which corresponds to the covariance function of BC¯ ð1 uÞ. The second assertion follows by an appropriate substitution of the integrals in formulas (5) and (6). & Obviously, the integral expressions in Theorem 3 cannot be explicitly evaluated for the majority of known copulas, even in the case d ¼ 2. However, the following theorem shows that the nonparametric bootstrap works. In this context, let ðXBj Þj¼1;...;n denote the bootstrap sample which is obtained by sampling from ðXj Þj¼1;...;n with replacement. An empirical analysis of this bootstrap procedure is presented in Schmid and Schmidt (2006b) and its applicability in the context of measuring price comovements in financial markets is discussed in Penzer et al. (2006). Theorem 5 (The bootstrap). Let r^ i;n ; i ¼ 1; 2; 3, be the estimators as defined in the beginning of the present section and r^ Bi;n denote the corresponding estimators for the bootstrap sample ðXBj Þj¼1;...;n . Then, under the
ARTICLE IN PRESS F. Schmid, R. Schmidt / Statistics & Probability Letters 77 (2007) 407–416
414
pffiffiffi B assumptions of Theorems pffiffi1ffi and 2, the sequences nfr^ i;n r^ i;n g; i ¼ 1; 2; 3, respectively, converge weakly to the same Gaussian limit as nfr^ i;n ri g; i ¼ 1; 2; 3, with probability one. B Proof. Denote the empirical copula of ðXBj Þj¼1;...;n by C^ n . For dimension d ¼ 2, Fermanian et al. (2004) show pffiffiffi B pffiffiffi that the process nfC^ n C^ n g converges weakly to the same Gaussian process as nfC^ n Cg with probability one. Weak convergence takes place in ‘1 ð½0; 12 Þ. The multidimensional generalization of this result is proven along the same lines as Theorem 2. The conclusion follows now by the continuous mapping theorem, see Van der Vaart and Wellner (1996, Theorem 1.3.6). &
5. Examples The independence copula: Consider a random vector X ¼ ðX 1 ; . . . ; X n Þ where X 1 ; . . . ; X n are stochastically independent (but not necessarily identically distributed). The related copula is the independence copula P. Q Proposition 6. Let C be the independence copula PðuÞ ¼ di¼1 ui . Then, the asymptotic variances in Theorem 3 are given by 1 d ðd þ 1Þ2 f3 þ d 3ð4=3Þd g 2 2 2 s1 ¼ s2 ¼ and s3 ¼ . (9) d 2 2 3ð1 þ d 2 Þ Proof. Note and, hence, s21 ¼ s22 according to Proposition 4. Obviously, Q that C P is radially symmetric ðiÞ Di CðuÞ ¼ kai uk . Further, for iaj and u :¼ð1; . . . ; 1; ui ; 1; . . . ; 1Þ we have Z Z EfDi CðuÞBC ðuðiÞ Þ Dj CðvÞBC ðvðjÞ Þg du dv ¼ 0. ½0;1d
½0;1d
R
Moreover, ½0;1d Z Z ½0;1d
Z
R
½0;1d
½0;1d
EfBC ðuÞ Di CðvÞBC ðvðiÞ Þg du dv Z
¼ ½0;1d
EfBC ðuÞBC ðvÞg du dv ¼ 3d 22d and for i ¼ 1; . . . ; d we derive
½0;1d
EfDi CðuÞBC ðuðiÞ Þ Di CðvÞBC ðvðiÞ Þg du dv ¼
222d 22d . 3
R R Collecting terms, we obtain ½0;1d ½0;1d EfGC ðuÞGC ðvÞg du dv ¼ 3d 22d dð222d 31 22d Þ. Finally, the respective normalization yields the left-hand formula in Eq. (9). For s23 observe that Z Z 1 , EfGC ðuðk;lÞ ÞGC ðvðk;lÞ Þg du dv ¼ 144 ½0;1d ½0;1d and for k; l; r; s 2 f1; . . . ; dg such that kol and ros and fk; lg \ fr; sg ¼ ; we have Z Z EfGC ðuðk;lÞ ÞGC ðvðr;sÞ Þg du dv ¼ 0. ½0;1d
½0;1d
The latter expression is also 0 if we consider the case where fk; lg \ fr; sg have exactly one element in common. Insertion of the above findings into Eq. (5) results in the right-hand formula of Eq. (9). & A variance stabilizing transformation for the FGM copula: The family of Farlie–Gumbel–Morgenstern copulas (in short: FGM copulas) is defined by Cðu; vÞ ¼ uv þ luvð1 uÞð1 vÞ
for all jljp1.
Because of their simple analytical form, FGM copulas have been widely used in statistics, for example, in order to obtain efficiency results on nonparametric tests for stochastic independence. A list of applications and references is given in Nelsen (2006, p. 77). Recall that for d ¼ 2 all the asymptotic variances s2i , as given in Theorem 3, coincide. Hence, the subscripts i may be dropped. Direct calculation shows that s2 ¼ 1
11 2 l . 45
ARTICLE IN PRESS F. Schmid, R. Schmidt / Statistics & Probability Letters 77 (2007) 407–416
415
Further, Spearman’s rho takes the form r ¼ l=3 which implies that jrjp 13 and s2 can be expressed as a function of r, i.e., s2 ¼ 1 r2 11 5 . A variance stabilizing transformation h, which satisfies pffiffiffi d nðhðr^ i;n Þ hðrÞÞ ! Nð0; 1Þ; i ¼ 1; 2; 3, is then derived via the delta method. In particular, we obtain rffiffiffiffiffi rffiffiffiffiffi ! 5 11 5 hðrÞ ¼ arcsin r for jrjp . 11 5 11 ^ is well defined. For This transformation may appropriately be extended to the domain ½1; 1 such that hðrÞ 1 jrjp 3, transformation h is close to Fisher’s z transformation lnfð1 þ rÞ=ð1 rÞg=2. The Kotz– Johnson copula: The following copula is called Kotz–Johnson copula (or Kotz and Johnson’s iterated FGM copula), see Nelsen (2006, p. 82): Cðu; vÞ ¼ uv þ luvð1 uÞð1 vÞð1 þ yuvÞ for all 1pl; yp1. It forms a generalization of the FGM copula. Spearman’s rho for this copula is r ¼ l=3 þ ly=12 and a direct calculation yields 2 1 11 2 53 2 2 1 3 1 3 2 s2 ¼ 1 þ 25 ly 11 45l 90l y 5040l y þ 450l y þ 1800l y .
A generic family of copulas: The next family of copulas, introduced in Rodrı´ guez-Lallena and U´beda-Flores (2004), can also be seen as a generalization of FGM copulas: Cðu; vÞ ¼ uv þ lf ðuÞgðvÞ, where f and g are nonzero and absolutely continuous functions on ½0; 1 such that f ð0Þ ¼ f ð1Þ ¼ gð0Þ ¼ gð1Þ ¼ 0. The range of the parameter l, which obviously depends on the choice of f and g, is specified in the last reference. Spearman’s rho takes the form Z 1 Z 1 r ¼ 12 l f ðuÞ du gðvÞ dv. 0
0 2
The asymptotic variance s involves the following functionals: Z 1 Z 1 Z 2 T 1 ðf Þ ¼ f ðuÞ du; T 2 ðf Þ ¼ f ðuÞ du and T 3 ðf Þ ¼ 0
0
Z
1
1
Z
u
F ðuÞ du ¼ 0
f ðxÞ dx du 0
0
and T i ðgÞ, i ¼ 1; 2; 3, which are analogously defined. For this family of copulas, we obtain s2 ¼ 1 þ l 144f2T 3 ðf Þ T 1 ðf Þgf2T 3 ðgÞ T 1 ðgÞg þ l2 144f2T 2 ðf ÞT 21 ðgÞ þ 2T 2 ðgÞT 21 ðf Þ 7T 21 ðf ÞT 21 ðgÞg.
References Araujo, A., Gine´, E., 1980. The Central Limit Theorem for Real and Banach Valued Random Variables. Wiley, New York. Bahadur, R.R., 1966. A note on quantiles in large samples. Ann. Math. Statist. 37, 577–580. Borkowf, C.B., 1999. A new method for approximating the asymptotic variance of Spearman’s rank correlation. Statist. Sinica 9, 535–558. Borkowf, C.B., 2002. Computing the nonnull asymptotic variance and the asymptotic relative efficiency of Spearman’s rank correlation. Comput. Statist. Data Anal. 39, 271–286. Deheuvels, P., 1979. La fonction de de´pendance empirique et ses proprie´te´s. Acad. Roy. Belg. Bull. Cl. Sci. 65 (5), 274–292. Dolati, A., U´beda-Flores, M., 2006. On measures of multivariate concordance, Shiraz University, preprint. Fermanian, J.D., Radulovic´, D., Wegkamp, M., 2004. Weak convergence of empirical copula processes. Bernoulli 10 (5), 847–860. Ga¨nXler, P., Stute, W., 1987. Seminar on Empirical Processes, DMV-Seminar, vol. 9. Birkha¨user Verlag, Basel. Genest, C., Re´millard, B., 2004. Tests of independence and randomness based on the empirical copula process. Test 13 (2), 335–369. Joe, H., 1990. Multivariate concordance. J. Multivariate Anal. 35, 12–30. Joe, H., 1997. Multivariate Models and Dependence Concepts. Chapman & Hall, London. Kendall, M.G., 1970. Rank Correlation Methods. Griffin, London. Nelsen, R.B., 1996. Nonparametric measures of multivariate association. In: Distribution with Fixed Marginals and Related Topics. IMS Lecture Notes—Monograph Series, vol. 28, pp. 223–232.
ARTICLE IN PRESS 416
F. Schmid, R. Schmidt / Statistics & Probability Letters 77 (2007) 407–416
Nelsen, R.B., 2006. An Introduction to Copulas, second ed., Springer, New York. Penzer, J., Schmid, F., Schmidt, R., 2006. Measuring large comovements in financial markets, University of Cologne, preprint. Rodrı´ guez-Lallena, J.A., U´beda-Flores, M., 2004. A new class of bivariate copulas. Statist. Probab. Lett. 66 (3), 315–325. Ru¨schendorf, L., 1976. Asymptotic normality of multivariate rank order statistics. Ann. Statist. 4, 912–923. Ruymgaart, F.H., van Zuijlen, M.C.A., 1978. Asymptotic normality of multivariate linear rank statistics in the non-i.i.d. case. Ann. Statist. 6 (3), 588–602. Schmid, F., Schmidt, R., 2006a. Multivariate conditional versions of Spearman’s rho and related measures of tail dependence.J. Multivariate Anal., in print, available online at www.sciencedirect.com. Schmid, F., Schmidt, R., 2006b. Bootstrapping Spearman’s multivariate rho. In: Rizzi, A., Vichi, M. (Eds.), Proceedings of COMPSTAT 2006, pp. 759–766. Sklar, A., 1959. Fonctions de re´partition a` n dimensions et leurs marges. Publ. Inst. Statist. Univ. Paris 8, 229–231. Stepanova, N.A., 2003. Multivariate rank tests for independence and their asymptotic efficiency. Math. Methods Statist. 12 (2), 197–217. Stute, W., 1984. The oscillation behavior of empirical processes: the multivariate case. Ann. Probab. 12, 361–379. Taylor, M.D., 2006. Multivariate measures of concordance. Ann. Inst. Statist. Math., forthcoming. Tsukahara, H., 2005. Semiparametric estimation in copula models. Canad. J. Statist. 33 (3), 357–375. Van der Vaart, A.W., Wellner, J.A., 1996. Weak Convergence and Empirical Processes. Springer, New York. Wolff, E.F., 1980. N-dimensional measures of dependence. Stochastica 4 (3), 175–188.