Computational Statistics and Data Analysis 53 (2009) 3291–3304
The geometry of time-varying cross-correlation random fields

F. Carbonell a,b,∗, K.J. Worsley a,b, N.J. Trujillo-Barreto c, M. Vega-Hernandez c

a Department of Mathematics and Statistics, McGill University, Montreal, Canada
b McConnell Brain Imaging Centre, Montreal Neurological Institute, McGill University, Montreal, Canada
c Cuban Neuroscience Center, Havana, Cuba

Article history: Received 31 July 2008; received in revised form 27 January 2009; accepted 14 February 2009; available online 4 March 2009.

Abstract

The goal of this paper is to assess the P-value of local maxima of time-varying cross-correlation random fields. The motivation for this comes from an electroencephalography (EEG) experiment, where one seeks connectivity between all pairs of voxels inside the brain at each time point of the recording window. In this way, we extend the results of Cao and Worsley [Cao, J., Worsley, K.J., 1999b. The geometry of correlation fields with an application to functional connectivity of the brain. The Annals of Applied Probability 9 (4), 1021–1057] by searching for high correlations not only over all pairs of voxels, but over all time points as well. We apply our results to an EEG data set from a face recognition paradigm. Our analysis determines those time instants at which there are significantly correlated regions involved in face recognition.
1. Introduction

The main idea behind functional connectivity of brain images is to establish a connection between two different regions of the brain on the basis of a similar functional response. Several approaches have been followed for assessing functional connectivity (McIntosh and Gonzalez-Lima, 1994; Strother et al., 1995; Horowitz et al., 1996; Friston et al., 1997). A more general approach uses the sample correlation coefficient between all pairs of voxels for estimating connectivity between images (Cao and Worsley, 1999b; Worsley et al., 1998, 2004, 2005). This has been applied to functional connectivity within resting state fMRI data (Worsley et al., 2005) and anatomical connectivity within cortical thickness measurements (Worsley et al., 2004, 2005). The challenging statistical problem is how to assess the P-value of local maxima of the resulting smooth 6D cross-correlation random field (RF), defined by the sample correlation coefficient between all pairs of voxels (Cao and Worsley, 1999b).

We extend this idea to time-varying cross-correlation, as follows. Let X_i(u, t) be the value of the ith RF of one set of zero mean independent RFs at spatial point u ∈ U and time point t in the time interval [0, T], i = 1, …, ν. Let Y_i(v, t) be the value of the ith RF of a second set of zero mean independent RFs at spatial point v ∈ V and time point t, i = 1, …, ν. Then the time-varying connectivity between spatial points u and v at time point t is measured by the correlation
$$ C(u, v, t) = \frac{\sum\limits_{i=1}^{\nu} X_i(u, t)\, Y_i(v, t)}{\sqrt{\sum\limits_{i=1}^{\nu} X_i(u, t)^2\; \sum\limits_{i=1}^{\nu} Y_i(v, t)^2}}. \qquad (1) $$
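To fix ideas, here is a minimal numerical sketch of how (1) can be evaluated on arrays of data; the array layout (trials × space × time) and the function name are our own illustrative choices, not part of the original formulation.

```python
import numpy as np

def cross_correlation(X, Y):
    """Time-varying sample cross-correlation C(u, v, t) of Eq. (1).

    X : array of shape (nu, n_u, n_t), zero-mean fields X_i(u, t)
    Y : array of shape (nu, n_v, n_t), zero-mean fields Y_i(v, t)
    Returns an array C of shape (n_u, n_v, n_t).
    """
    num = np.einsum('iut,ivt->uvt', X, Y)          # sum_i X_i(u,t) Y_i(v,t)
    sx = np.sqrt(np.einsum('iut,iut->ut', X, X))   # sqrt(sum_i X_i(u,t)^2)
    sy = np.sqrt(np.einsum('ivt,ivt->vt', Y, Y))   # sqrt(sum_i Y_i(v,t)^2)
    return num / (sx[:, None, :] * sy[None, :, :])

# Illustrative dimensions only: 74 trials, 50 voxels per region, 80 time points.
rng = np.random.default_rng(0)
C = cross_correlation(rng.standard_normal((74, 50, 80)),
                      rng.standard_normal((74, 50, 80)))
```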
The motivation comes from a typical EEG experiment. A stimulus is presented to the subject and the current sources underlying the recorded evoked potential are reconstructed at all spatial points (voxels) in the brain at every time point
measured from the onset of the stimulus. This is repeated over many trials. We then seek connectivity between all pairs of voxels at each time point. In other words, we search for high correlations not only over all pairs of voxels, but over all time points as well. Hence, our main task is to calculate a threshold, based on RF theory, that controls the false positive rate at, say, α = 0.05. Thus we need a good approximation to

$$ \Pr\Big( \max_{u \in U,\, v \in V,\, t \in [0, T]} C(u, v, t) \ge c \Big). $$
This problem was studied in the seminal works of Adler and Hasofer (1976) and Hasofer (1978) for the case of Gaussian RFs. The key idea consists of approximating the P-value that the local maximum exceeds a high threshold value via the geometry of the excursion set (i.e. the set of points where the RF exceeds the threshold value). For high thresholds, the excursion set consists of isolated regions, each containing a local maximum. The Euler characteristic (EC) of the excursion set then counts the number of such connected components. For high thresholds near the maximum of the RF, the EC takes the value one if the maximum is above the threshold, and zero otherwise. Thus, for these high thresholds, the expected EC approximates the P-value of the maximum of the RF. During the past decade this approach has become an important tool in different areas of applied probability, owing to the many applications that have emerged, principally in medical imaging and astrophysics (Gott III et al., 1990; Beaky et al., 1992; Vogeley et al., 1994; Worsley, 1995a,b; Worsley et al., 1998; Cao and Worsley, 2001). Results are also available for RFs of standard univariate statistics such as the χ², t and F RFs (Worsley, 1994). Other useful extensions were obtained in Cao and Worsley (1999a) for Hotelling's T² and in Cao and Worsley (1999b) for cross-correlation RFs. Recent results have also been obtained for some multivariate RFs, such as Roy's maximum root and maximal canonical correlations (Taylor and Worsley, 2008).

The work most closely related to this paper, and the one that we shall generalize, is Cao and Worsley (1999b). That paper treats two types of random fields: the cross-correlation RF C(u, v, 0), a special case of (1) with t fixed (say at 0), and the homologous correlation RF C(0, 0, t), a special case of (1) with u and v fixed (say at 0). However, the existing results for the cross-correlation and the homologous correlation RFs do not solve the problem of finding the P-value of the local maxima of (1). Therefore, the extension of the main tools developed in Cao and Worsley (1999b) is compelling.

The plan of the paper is the following. In Section 2 we state the notation as well as some preliminaries about RF theory. In Section 3 we present the main results of the paper, which consist of explicit expressions for the EC densities of the RF C(u, v, t). Finally, the analysis of real EEG data from a face recognition experiment is presented in the last section.

2. Preliminaries

2.1. Notation

We use I_m to denote the identity matrix of order m, and Normal_m(μ, Σ) to denote the multivariate normal distribution with mean μ and covariance matrix Σ. If X is an m × m matrix, we denote Var(vec(X)) by V_m when Cov(X_ij, X_kl) = ε(i, j, k, l) − δ_ij δ_kl, where ε(i, j, k, l) is symmetric in its arguments and δ_ij = 1 if i = j and 0 otherwise. Let Wishart_m(ν, Σ) represent the Wishart distribution of an m × m matrix with expectation νΣ and ν degrees of freedom, and let χ²_ν denote the chi-square distribution with ν degrees of freedom. For any vector we use the subscripts j, |j and the superscript |j to denote the jth component, the first j components and the last j components, respectively. In a similar way, for any matrix, the subscript |ij denotes the submatrix formed by the first i rows and first j columns, and the superscript |ij denotes the submatrix formed by the last i rows and last j columns.
Combinations of subscripts and superscripts can then be used to denote submatrices. For any scalar b, let b⁺ = b if b > 0 and zero otherwise. For any symmetric matrix B, let detr_j(B) be the sum of the determinants of all j × j principal minors of B, and define detr_0(B) = 1. Finally, if f(u, v) is a scalar function, then

$$ f^u \equiv \frac{\partial f}{\partial u}, \qquad f^v \equiv \frac{\partial f}{\partial v}, \qquad f^{uv} \equiv \frac{\partial^2 f}{\partial u\, \partial v'}. $$
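Because detr_j(B) reappears throughout the EC-density computations below, a direct (combinatorially naive) implementation may help fix the definition; the function name is ours.

```python
import itertools
import numpy as np

def detr(B, j):
    """detr_j(B): sum of the determinants of all j x j principal minors of B,
    with detr_0(B) = 1 by convention."""
    if j == 0:
        return 1.0
    return sum(np.linalg.det(B[np.ix_(s, s)])
               for s in itertools.combinations(range(B.shape[0]), j))
```

For instance, detr_1(B) is the trace of B, and detr_m(B) for an m × m matrix B is det(B).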
2.2. The geometry of random fields

Let C = C(s), s ∈ S ⊂ R^D, be a real-valued isotropic RF, where S is a compact subset of R^D with a twice differentiable boundary ∂S. Denote the first derivative of C by Ċ and the D × D matrix of second-order partial derivatives of C by C̈. The d-dimensional Euler characteristic (EC) density is defined as

$$ \rho_d(c) = E\big( \dot C_d^{+}\, \det(-\ddot C_{|d-1}) \,\big|\, \dot C_{|d-1} = 0,\; C = c \big)\, \phi_{|d-1}(0, c), \qquad (2) $$

where E(·) denotes mathematical expectation and φ_{|d−1}(·, ·) is the joint probability density function of Ċ_{|d−1} and C. Let a_d = 2π^{d/2}/Γ(d/2) be the surface area of a unit (d − 1)-sphere in R^d. Define μ_D(S) to be the Lebesgue measure of S and μ_d(S), d = 0, …, D − 1, to be the d-dimensional intrinsic volumes, or Minkowski functionals. For instance, for D = 3, the quantities μ_d(S), d = 0, 1, 2, are nothing but the Euler characteristic, twice the caliper diameter and half the surface area of S, respectively.
The expected EC χ of the excursion set A_c = {s ∈ S : C(s) ≥ c} of C above the threshold c inside S is given by

$$ E(\chi(A_c)) = \sum_{d=0}^{D} \mu_d(S)\, \rho_d(c), \qquad (3) $$

where ρ_0(c) = P(C ≥ c). Then the P-value of the maximum of C inside S is well approximated by

$$ P\Big( \max_{s \in S} C \ge c \Big) \approx E(\chi(A_c)). \qquad (4) $$
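Computationally, (3) and (4) reduce threshold selection to one-dimensional root finding. A minimal sketch follows, assuming the intrinsic volumes mu[d] and a callable rho(d, c) for the EC densities are available (both names are our placeholders):

```python
from scipy.optimize import brentq

def expected_ec(c, mu, rho):
    """E[chi(A_c)] = sum_d mu_d(S) rho_d(c), Eq. (3)."""
    return sum(mu[d] * rho(d, c) for d in range(len(mu)))

def p_value_threshold(mu, rho, alpha=0.05, lo=0.0, hi=0.999):
    """Solve E[chi(A_c)] = alpha for c, using the approximation (4).
    Assumes the expected EC decreases through alpha on [lo, hi]."""
    return brentq(lambda c: expected_ec(c, mu, rho) - alpha, lo, hi)
```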
3. Time-varying cross-correlation field

The RF (1) can be defined more formally as

$$ C(u, v, t) = \frac{X(u, t)'\, Y(v, t)}{\sqrt{X(u, t)'\, X(u, t)\; Y(v, t)'\, Y(v, t)}}, \qquad (5) $$

where X(u, t) = (X_1(u, t), …, X_ν(u, t))′ and Y(v, t) = (Y_1(v, t), …, Y_ν(v, t))′, u ∈ U ⊂ R^M, v ∈ V ⊂ R^N, t ∈ T ⊂ R^L, are vectors of independent, identically distributed Gaussian RFs with mean 0 and variance 1. Notice that in this section we deal with the general case L ≥ 1.

As in Cao and Worsley (1999b), our goal is to evaluate (3) for the particular case of the time-varying cross-correlation RF defined by (5). Two obstacles arise. First, the boundary of U × V × T is not twice differentiable (even under the assumption that the boundaries ∂U, ∂V and ∂T are twice differentiable). Second, even if the field is isotropic in each of the variables u, v or t for every fixed value of the other two, the field is not isotropic in (u, v, t). Therefore we cannot apply the results of the previous section directly. However, if C(u, v, t) is stationary in (u, v, t), isotropic in each of the three arguments for every fixed value of the other two, and satisfies the regularity conditions of Theorem 5.2.2 in Adler (1981), then

$$ E(\chi(A_c)) = \sum_{i=0}^{M} \sum_{j=0}^{N} \sum_{k=0}^{L} \mu_i(U)\, \mu_j(V)\, \mu_k(T)\, \rho_{ijk}(c), \qquad (6) $$

where A_c = {u ∈ U, v ∈ V, t ∈ T : C(u, v, t) ≥ c} and ρ_{ijk}(c) has the same expression as in (2), with the parameters of the RF restricted to the first i components of u, the first j components of v and the first k components of t. The equality above follows straightforwardly from Lemma 3.1 in Cao and Worsley (1999b).
.
According to (2), the main task is the evaluation of the terms E(det(C |D−1 )|C |D−1 = 0, C = c ), which must be combined .+
C with E(C D |C = c ) and φ|D−1 (0, c ), where D = M + N + L. Some simplifications hold, namely, ρM ,N ,0 (c ) = ρM ,N (c ), C where ρM ,N (c ) denotes the EC densities of the classical cross-correlation RF in Cao and Worsley (1999b). In a similar way,
ρ0,0,k (c ) = ρ1H (c ), where ρkH (c ) denotes the EC densities of the homologous RF defined in Cao and Worsley (1999b). Thus, we only need to calculate the EC densities ρM ,N ,L (c ) for M + N > 0 and L > 0. This final operation requires a lot of algebra
and very long computations. Hence, in order to clarify the exposition, we are going to follow the same structure used in Cao and Worsley (1999b). That is, our result shall be based on a series of lemmas, which for ease of exposition are presented in the Appendix section. Let
3x 0
0
λx IL
and
3y
0
λ y IL
0
(u,t)
(v,t)
be the variance matrices of the components of X and Y in (5), respectively. Without loss of generality we can write 3x = IM and 3y = IN , since results for arbitrary 3x and Λy follow by a linear transformation of the coordinates u and v. Next theorem is the main result of this paper. Explicit expressions for the EC densities are provided for L = 1, which is a very important case from the application point of view. The EC densities for the case L > 1 are obtained in a similar way after long and tedious manipulations. Theorem 1. Let D = M + N + 1. For ν > D, N > 0 the EC densities for the RF C are given by M
ρM ,N ,1 (c ) =
N
1 1 [ 2 ] [ 2 ] N −2j XX det(3x ) 2 det(3y ) 2 2ν−2−D M !N ! X D
π 2 +1 −1 +i+j+k 2 ν−D 2
× (1 − c )
(−1)i+j+k c D−1−2(i+j+k)
i=0 j=0 k=0
Ii,j (λx , λy )Γ (ν + i + j −
D 2
)
i!j!k!(ν − 1 − D + 2i + 2j + k)!(M − 2i − k)!(N − 2j − k)!
,
3294
F. Carbonell et al. / Computational Statistics and Data Analysis 53 (2009) 3291–3304
where Ii,j (λx , λy ) =
1
Z
z
ν−M −1 +i 2
ν−N −1 +j
(1 − z )
2
1
(z λy + (1 − z )λx ) 2 dz .
0
Proof. From Lemma 3,

$$ E(\dot C_D^{+} \mid C = c, a, b) = (2\pi)^{-1/2} (1-c^2)^{1/2} s_1^{1/2} = (2\pi)^{-1/2} (1-c^2)^{1/2} (ab)^{-1/2} (a\lambda_y + b\lambda_x)^{1/2} \qquad (7) $$

and

$$ \phi_{|D-1}(0, c \mid a, b) = \frac{\Gamma(\frac{\nu}{2})}{\Gamma(\frac{\nu-1}{2})}\, 2^{-\frac{D-1}{2}}\, \pi^{-\frac{D}{2}}\, (1-c^2)^{-\frac{D-\nu+2}{2}}\, a^{\frac{M}{2}}\, b^{\frac{N}{2}}, \qquad (8) $$

where a, b ∼ χ²_ν independently. On the other hand, also from Lemma 3 we obtain

$$ \ddot C \mid (\dot C = 0) \;\stackrel{D}{=}\; \tilde Q + (1-C^2)^{1/2} \tilde H + \tilde z \tilde w' + \tilde w \tilde z', $$

where

$$ \tilde Q = \begin{pmatrix} \alpha_1 Q_x^{11} & \alpha_2 Q_{xy}^{11} & \alpha_1 \lambda_x^{1/2} Q_x^{12} + \alpha_2 \lambda_y^{1/2} Q_{xy}^{12} \\ \alpha_2 Q_{yx}^{11} & \alpha_3 Q_y^{11} & \alpha_3 \lambda_y^{1/2} Q_y^{12} + \alpha_2 \lambda_x^{1/2} Q_{yx}^{12} \\ \alpha_1 \lambda_x^{1/2} Q_x^{21} + \alpha_2 \lambda_y^{1/2} Q_{yx}^{21} & \alpha_3 \lambda_y^{1/2} Q_y^{21} + \alpha_2 \lambda_x^{1/2} Q_{xy}^{21} & \alpha_1 \lambda_x Q_x^{22} + \alpha_3 \lambda_y Q_y^{22} + \alpha_2 \lambda_x^{1/2} \lambda_y^{1/2} (Q_{xy}^{22} + Q_{yx}^{22}) \end{pmatrix}, \qquad (9) $$

$$ \tilde H = \begin{pmatrix} \beta_1 H_x^{11} & 0 & \beta_1 \lambda_x^{1/2} H_x^{12} \\ 0 & \beta_2 H_y^{11} & \beta_2 \lambda_y^{1/2} H_y^{12} \\ \beta_1 \lambda_x^{1/2} H_x^{21} & \beta_2 \lambda_y^{1/2} H_y^{21} & \beta_1 \lambda_x H_x^{22} + \beta_2 \lambda_y H_y^{22} \end{pmatrix}, \qquad (10) $$

and

$$ \tilde z = (0, 0, z_4')', \qquad \tilde w = (\beta_2 w_1', -\beta_1 w_2', -\beta_1 \beta_2 w_4')'. $$

Here

$$ Q = \begin{pmatrix} Q_x & Q_{xy} \\ Q_{yx} & Q_y \end{pmatrix} \sim \text{Wishart}_{M+N+2L}(I_{M+N+2L}, \nu - 2), $$

H_x ∼ Normal_{M+L}(0, V(I_{M+L})), H_y ∼ Normal_{N+L}(0, V(I_{N+L})), w_1 ∼ Normal_M(0, I_M), w_2 ∼ Normal_N(0, I_N), z_4, w_4 ∼ Normal_L(0, I_L), all independently, and α_1 = −C a^{−1}, α_2 = (ab)^{−1/2}, α_3 = −C b^{−1}, β_1 = (1 − C²)^{1/2} a^{−1/2}, β_2 = (1 − C²)^{1/2} b^{−1/2}. Therefore, from Lemma 9 we obtain that, conditional on Ċ_{|D−1} = 0, C = c, a, b,

$$ E\big( \det(-\ddot C_{|D-1}) \mid \dot C_{|D-1} = 0,\, C = c,\, a,\, b \big) = \sum_{i=0}^{[\frac{M}{2}]} \sum_{j=0}^{[\frac{N}{2}]} \sum_{k=0}^{N-2j} (-1)^{i+j+k}\, 2^{-(i+j)}\, c^{D-1-2(i+j+k)}\, (1-c^2)^{i+j+k} \\ \times a^{-M+i}\, b^{-N+j}\, \frac{(\nu-2)!\, M!\, N!}{i!\, j!\, k!\, (\nu - 1 - D + 2i + 2j + k)!\, (M - 2i - k)!\, (N - 2j - k)!}. \qquad (11) $$
Using (7), (8) and (11) together we have

$$ E\big( \dot C_D^{+} \det(-\ddot C_{|D-1}) \mid C = c, a, b \big)\, \phi_{|D-1}(0, c \mid a, b) = \frac{\Gamma(\frac{\nu}{2})}{\Gamma(\frac{\nu-1}{2})}\, 2^{-\frac{D}{2}}\, \pi^{-\frac{D+1}{2}} \sum_{i=0}^{[\frac{M}{2}]} \sum_{j=0}^{[\frac{N}{2}]} \sum_{k=0}^{N-2j} (-1)^{i+j+k}\, 2^{-(i+j)} \\ \times c^{D-1-2(i+j+k)}\, (1-c^2)^{\frac{\nu-D-1}{2}+i+j+k}\, a^{-\frac{M+1}{2}+i}\, b^{-\frac{N+1}{2}+j}\, (a\lambda_y + b\lambda_x)^{1/2} \\ \times \frac{(\nu-2)!\, M!\, N!}{i!\, j!\, k!\, (\nu - 1 - D + 2i + 2j + k)!\, (M - 2i - k)!\, (N - 2j - k)!}. $$

In order to take expectations over a and b in the expression above, it is convenient to make the change of variables w = a + b, z = a/(a + b), which are independently distributed as χ²_{2ν} and Beta(ν/2, ν/2), respectively. Hence
$$ E\big( \dot C_D^{+} \det(-\ddot C_{|D-1}) \mid C = c, w, z \big)\, \phi_{|D-1}(0, c \mid w, z) = \frac{\Gamma(\frac{\nu}{2})}{\Gamma(\frac{\nu-1}{2})}\, 2^{-\frac{D}{2}}\, \pi^{-\frac{D+1}{2}} \sum_{i=0}^{[\frac{M}{2}]} \sum_{j=0}^{[\frac{N}{2}]} \sum_{k=0}^{N-2j} (-1)^{i+j+k}\, 2^{-(i+j)} \\ \times c^{D-1-2(i+j+k)}\, (1-c^2)^{\frac{\nu-D-1}{2}+i+j+k}\, z^{-\frac{M+1}{2}+i}\, (1-z)^{-\frac{N+1}{2}+j}\, \big( z\lambda_y + (1-z)\lambda_x \big)^{1/2} \\ \times w^{-\frac{M+N+1}{2}+i+j}\, \frac{(\nu-2)!\, M!\, N!}{i!\, j!\, k!\, (\nu - 1 - D + 2i + 2j + k)!\, (M - 2i - k)!\, (N - 2j - k)!}. $$

Since E(w^l) = 2^l Γ(ν + l)/Γ(ν), taking expectations over w yields

$$ E\big( \dot C_D^{+} \det(-\ddot C_{|D-1}) \mid C = c, z \big)\, \phi_{|D-1}(0, c \mid z) = \frac{\Gamma(\frac{\nu}{2})}{\Gamma(\frac{\nu-1}{2})\, \Gamma(\nu)}\, 2^{-D}\, \pi^{-\frac{D+1}{2}} \sum_{i=0}^{[\frac{M}{2}]} \sum_{j=0}^{[\frac{N}{2}]} \sum_{k=0}^{N-2j} (-1)^{i+j+k} \\ \times c^{D-1-2(i+j+k)}\, (1-c^2)^{\frac{\nu-D-1}{2}+i+j+k}\, z^{-\frac{M+1}{2}+i}\, (1-z)^{-\frac{N+1}{2}+j}\, \big( z\lambda_y + (1-z)\lambda_x \big)^{1/2} \\ \times \frac{(\nu-2)!\, M!\, N!\, \Gamma\!\big(\nu + i + j - \tfrac{D}{2}\big)}{i!\, j!\, k!\, (\nu - 1 - D + 2i + 2j + k)!\, (M - 2i - k)!\, (N - 2j - k)!}, $$

and the result follows from the factorization

$$ l! = \frac{2^l\, \Gamma\big(\tfrac{l+2}{2}\big)\, \Gamma\big(\tfrac{l+1}{2}\big)}{\pi^{1/2}}. \qquad \square $$
Corollary 2. For λ_x = λ_y = λ,

$$ \rho_{M,N,1}(c) = \frac{\det(\Lambda_x)^{1/2} \det(\Lambda_y)^{1/2}\, \lambda^{1/2}\, 2^{\nu-2-D}\, M!\, N!}{\pi^{\frac{D}{2}+1}} \sum_{i=0}^{[\frac{M}{2}]} \sum_{j=0}^{[\frac{N}{2}]} \sum_{k=0}^{N-2j} (-1)^{i+j+k}\, c^{D-1-2(i+j+k)}\, (1-c^2)^{\frac{\nu-D-1}{2}+i+j+k} \\ \times \frac{\Gamma\!\big(\nu + i + j - \tfrac{D}{2}\big)\, \Gamma\!\big(\tfrac{\nu-M+1}{2} + i\big)\, \Gamma\!\big(\tfrac{\nu-N+1}{2} + j\big)}{\Gamma\!\big(\nu + i + j - \tfrac{D-3}{2}\big)\, i!\, j!\, k!\, (\nu - 1 - D + 2i + 2j + k)!\, (M - 2i - k)!\, (N - 2j - k)!}. $$
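Theorem 1 is straightforward to transcribe numerically. The sketch below is ours, not the authors' code: it evaluates ρ_{M,N,1}(c), computing I_{i,j} by quadrature and skipping the terms that vanish because a factorial argument is negative.

```python
import numpy as np
from math import factorial, gamma, pi
from scipy.integrate import quad

def rho_MN1(c, M, N, nu, det_Lx=1.0, det_Ly=1.0, lam_x=1.0, lam_y=1.0):
    """EC density rho_{M,N,1}(c) of Theorem 1 (requires nu > D and N > 0)."""
    D = M + N + 1

    def I(i, j):  # the integral I_{i,j}(lambda_x, lambda_y), by quadrature
        f = lambda z: (z**((nu - M - 1) / 2 + i)
                       * (1 - z)**((nu - N - 1) / 2 + j)
                       * np.sqrt(z * lam_y + (1 - z) * lam_x))
        return quad(f, 0.0, 1.0)[0]

    const = (np.sqrt(det_Lx * det_Ly) * 2.0**(nu - 2 - D)
             * factorial(M) * factorial(N) / pi**(D / 2 + 1))
    total = 0.0
    for i in range(M // 2 + 1):
        for j in range(N // 2 + 1):
            for k in range(N - 2 * j + 1):
                if min(M - 2 * i - k, N - 2 * j - k,
                       nu - 1 - D + 2 * i + 2 * j + k) < 0:
                    continue  # term vanishes (negative factorial argument)
                total += ((-1)**(i + j + k)
                          * c**(D - 1 - 2 * (i + j + k))
                          * (1 - c**2)**((nu - D - 1) / 2 + i + j + k)
                          * I(i, j) * gamma(nu + i + j - D / 2)
                          / (factorial(i) * factorial(j) * factorial(k)
                             * factorial(nu - 1 - D + 2 * i + 2 * j + k)
                             * factorial(M - 2 * i - k)
                             * factorial(N - 2 * j - k)))
    return const * total
```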
4. Computational aspects

In this section we briefly discuss some computational aspects to be taken into account when implementing the previous results. Firstly, notice that applying formula (6) with the EC densities given by Theorem 1 requires numerical approximations of the terms det(Λ_x) and det(Λ_y), as well as of the intrinsic volumes μ_i(U), μ_j(V) and μ_k(T), for i = 0, …, M, j = 0, …, N, k = 0, …, L.

On the one hand, in the neuroimaging literature a Gaussian random field X(u), u ∈ U ⊂ R^M, is often modeled as white noise convolved with an isotropic Gaussian filter. The width of the filter is measured by its Full Width at Half Maximum, FWHM_x (Worsley et al., 1992), a standard parameter for characterizing the smoothness of a random field (the higher the FWHM, the smoother the field). It is then straightforward to show that

$$ \mathrm{Var}(\dot X^u) = \Lambda_x = \frac{4 \log 2}{\mathrm{FWHM}_x^2}\, I_M, $$

which motivates us to define FWHM_x = (4 log 2)^{1/2} det(Λ_x)^{−1/(2M)} for any stationary random field X. The parameter FWHM_x is easily estimated from the data according to Kiebel et al. (1999).

On the other hand, since real applications are usually carried out for the case M = N = 3 (3D brain regions U and V), the intrinsic volumes of the whole brain are commonly approximated by those of a sphere with the same volume, which gives a very good approximation even for non-spherical search regions. Indeed, the intrinsic volumes of spherical regions are easily computed according to Worsley et al. (1996), so the approximation of intrinsic volumes for particular brain regions such as hemispheres is straightforward.

Finally, according to (4), the time-varying cross-correlation threshold at significance level α = 0.05 can be found by equating the right-hand side of (6), with the EC densities given by Theorem 1, to 0.05.
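As a concrete illustration of the two approximations just described, here is a minimal sketch (function names ours); the ball radius is chosen to match the volume of the actual search region.

```python
import numpy as np

def lam_from_fwhm(fwhm):
    """Derivative variance parameter lambda = 4 log 2 / FWHM^2."""
    return 4.0 * np.log(2.0) / fwhm**2

def ball_intrinsic_volumes(volume):
    """mu_0,...,mu_3 of a 3D ball of the given volume: the EC, twice the
    caliper diameter, half the surface area, and the volume itself."""
    r = (3.0 * volume / (4.0 * np.pi))**(1.0 / 3.0)
    return [1.0, 4.0 * r, 2.0 * np.pi * r**2, volume]
```

Combining these with the rho_{M,N,1} sketch after Corollary 2 and the root-finding sketch after (4) gives, in principle, the α = 0.05 threshold used in the next section.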
5. Applications to EEG

As an illustration of the methods, we apply our results to thresholding a time-varying cross-correlation RF obtained from an electroencephalography (EEG) data set recorded during a face recognition experiment (Henson et al., 2003). Our goal is to determine at which time instants there are significant inter-hemisphere correlations. This would increase our understanding of the relationships between the different brain areas involved in the face recognition process.
Fig. 1. Left panel: average (over 76 observations) of the differences, Faces − Scrambled faces, for 8 of the 128 components of the vector M(t) (corresponding to electrodes CZ, PZ, C4, FPZ, FZ, OZ, P8 and P7) plotted against time t (ms). There is a negative EEG peak around t = 110 ms for the electrodes CZ, PZ, C4 in the temporal region, and FPZ and FZ in the frontal region, followed by a positive peak around 170 ms. The reverse happens in the occipito-temporal electrodes OZ, P8 and P7 (the so-called N170 component). Right panel: the EEG data Ĵ(t) reconstructed inside the brain at time t = 170 ms. The main activated sources are located in the occipital region (bilateral) and the left frontal and right temporal regions.
The experimental paradigm consists of the presentation of 76 images of faces to the subject: 38 famous faces and 38 unfamiliar faces. The EEG data were acquired on a 128-channel (128 electrodes on the scalp) Active-Two system, sampled at 2048 Hz. Each of the facial images was followed by the presentation of a scrambled version of the same face, and the resulting EEG data were subtracted, to give 76 differences (Faces − Scrambled faces). The face was presented at time zero, and data were recorded over the time window T = [−200, 600] ms. As explained in Henson et al. (2003), looking for differences between the two conditions (Faces and Scrambled faces) may reveal interesting aspects of the spatio-temporal brain topography associated with the face perception and face recognition processes. To give an idea of the primary data, the overall mean (over all 76 differences Faces − Scrambled faces) of the EEG recording is shown in Fig. 1. Notice that the EEG data consist of a set of 76 independent observations of multivariate time series of dimension 128 (the number of scalp recording electrodes), sampled over the time interval T = [−200, 600] ms. The left panel of Fig. 1 shows the average (over the 76 independent observations) for 8 representative recording sites.

In general, one should be cautious when interpreting the EEG topography in terms of neural activation regions, which requires the use of specific tomographic techniques. Indeed, the Low Resolution Electromagnetic Tomography (LORETA) method (Pascual-Marqui et al., 1994) allows the reconstruction of the 3D neural generators of the 128 EEG electrical recordings. The general idea is thus to obtain a 3D representation of the brain, based on the observed multivariate time series. Specifically, the relationship between the electric field J(t) at all points in the brain (written as a vector) at time t and the EEG measurements M(t) at all electrodes on the scalp (written as a vector of length 128) at time t is given by the simple linear model based on elementary physics:

$$ M(t) = K J(t) + e(t). $$

The matrix K, called the leadfield, is a known function of the positions of the electrodes on the scalp, the positions of the points inside the brain, and the electrical properties of the brain tissue and scalp. The measurement error e(t) is assumed to be a vector of independent Gaussian processes at time t. The LORETA solution of the inverse problem is

$$ \hat J(t) = (K'K + \lambda^2 L'L)^{-1} K' M(t), $$

where λ is a regularizing parameter and L is a discrete version of the Laplacian operator that forces the solution to be spatially smooth. Therefore, our 3D data X_i(u, t), Y_i(v, t) consist of the LORETA solutions Ĵ(t) corresponding to all time instants t and independent observations i = 1, …, 76. The 3D geometrical representation is obtained by mapping each LORETA solution over a brain template of 3244 voxels of dimension 7 mm × 7 mm × 7 mm.
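The regularized inverse above is an ordinary Tikhonov-type solve; a minimal sketch follows, with shapes and names illustrative rather than the authors' implementation.

```python
import numpy as np

def loreta_solution(K, L, M_t, lam):
    """J_hat(t) = (K'K + lam^2 L'L)^{-1} K' M(t), for one time sample.

    K   : leadfield, shape (n_electrodes, n_voxels)
    L   : discrete Laplacian, shape (n_voxels, n_voxels)
    M_t : scalp measurements at time t, shape (n_electrodes,)
    lam : regularization parameter
    """
    A = K.T @ K + lam**2 * (L.T @ L)
    return np.linalg.solve(A, K.T @ M_t)
```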
Fig. 2. The cross-correlations between hemispheres thresholded at c = 0.8019 (P = 0.05) at times t = (a) 110 ms, (b) 122 ms, (c) 164 ms, (d) 180 ms. Letters A, P, L, R in each subplot mean Anterior, Posterior, Left and Right, respectively. We conclude that there are statistically significant correlations between right frontal and left occipital regions.
The average of the 76 reconstructed LORETA solutions Ĵ(t) at time point t = 170 ms is shown in the right panel of Fig. 1. To remove the effect of familiar and unfamiliar faces depicted in Fig. 1, we used the residuals obtained by subtracting the average of each group (familiar, unfamiliar) from the data in that group, to give ν = 76 − 2 = 74 effectively independent observations.

We now look for time-varying connectivity of the 3D reconstructed images. We looked for connectivity between the two brain hemispheres, so that U represents the left hemisphere and V the right hemisphere, as depicted in Fig. 2. The spatial and temporal FWHMs were estimated as 20.28 mm and 14.17 ms, respectively (Kiebel et al., 1999). The intrinsic volumes were approximated as described in the previous section: μ_0(U) = μ_0(V) = 1, μ_1(U) = μ_1(V) = 191.24, μ_2(U) = μ_2(V) = 1.395 × 10⁴ and μ_3(U) = μ_3(V) = 5.037 × 10⁵. Evidently, the intrinsic volumes of T are μ_0(T) = 1 and μ_1(T) = 800. As explained in the previous section, the decision threshold c at significance level 0.05 is obtained from (4), (6) and Theorem 1, which gives the value c = 0.8019. The inter-hemispheric time-varying cross-correlations were then calculated and the resulting 7D images were thresholded at c = ±0.8019 (in order to detect significant negative correlations as well). It should be noted that voxels separated by less than one FWHM were ignored, since they have high correlation due to the spatial smoothness effect. Fig. 2 shows the significant cross-correlations between hemispheres at four different time instants. We conclude that there is significant correlation between the right frontal region and the left occipito-temporal region. These results are consistent with the classical cognitive "core model" for face recognition and perception (Gobbini and Haxby, 2007), which involves the processing and transfer of information in a neuronal network that includes bilateral brain structures in the occipital and frontal regions as well as in the superior-temporal area.

Acknowledgements

The authors would like to thank Rik Henson (MRC Cognition and Brain Sciences Unit, Cambridge) and Will Penny (FIL, Wellcome Department of Imaging Neuroscience, University College London) for kindly providing the experimental data. The first author's work was supported by a postdoctoral fellowship from the ISM-CRM, Montreal, Quebec, Canada.
Appendix

Lemma 3. The first and second derivatives of the time-varying cross-correlation RF (5) can be expressed as

$$ \dot C^u = (1-C^2)^{1/2} a^{-1/2} \Lambda_x^{1/2} z_1, \qquad \dot C^v = (1-C^2)^{1/2} b^{-1/2} \Lambda_y^{1/2} z_2, \qquad \dot C^t = (1-C^2)^{1/2} s_1^{1/2} z_3, $$

$$ \ddot C^{uu} = \Lambda_x^{1/2} \big[ -C a^{-1} z_1 z_1' - C a^{-1} Q_x^{11} - (1-C^2)^{1/2} a^{-1} (z_1 w_1' + w_1 z_1') + (1-C^2)^{1/2} a^{-1/2} H_x^{11} \big] \Lambda_x^{1/2}, $$

$$ \ddot C^{vv} = \Lambda_y^{1/2} \big[ -C b^{-1} z_2 z_2' - C b^{-1} Q_y^{11} - (1-C^2)^{1/2} b^{-1} (z_2 w_2' + w_2 z_2') + (1-C^2)^{1/2} b^{-1/2} H_y^{11} \big] \Lambda_y^{1/2}, $$

$$ \ddot C^{uv} = \Lambda_x^{1/2} \big[ -C (ab)^{-1/2} z_1 z_2' + (ab)^{-1/2} Q_{xy}^{11} \big] \Lambda_y^{1/2}, $$

$$ \ddot C^{ut} = \Lambda_x^{1/2} \big[ -C a^{-1} s_1^{-1/2} z_1 (a^{-1/2} z_3 + b^{-1/2} z_4)' - C a^{-1} \lambda_x^{1/2} Q_x^{12} - (1-C^2)^{1/2} a^{-1} s_1^{-1/2} \big( z_1 (a^{-1/2} w_3 + b^{-1/2} w_4)' + w_1 (a^{-1/2} z_3 + b^{-1/2} z_4)' \big) \\ + (1-C^2)^{1/2} a^{-1/2} \lambda_x^{1/2} H_x^{12} - C (ab)^{-1/2} s_1^{-1/2} z_1 (b^{-1/2} z_3 - a^{-1/2} z_4)' \big] + \Lambda_x^{1/2} (ab)^{-1/2} Q_{xy}^{12} \lambda_y^{1/2}, $$

$$ \ddot C^{vt} = \Lambda_y^{1/2} \big[ -C b^{-1} s_1^{-1/2} z_2 (b^{-1/2} z_3 - a^{-1/2} z_4)' - C b^{-1} \lambda_y^{1/2} Q_y^{12} - (1-C^2)^{1/2} b^{-1} s_1^{-1/2} \big( z_2 (b^{-1/2} w_3 - a^{-1/2} w_4)' + w_2 (b^{-1/2} z_3 - a^{-1/2} z_4)' \big) \\ + (1-C^2)^{1/2} b^{-1/2} \lambda_y^{1/2} H_y^{12} - C (ab)^{-1/2} s_1^{-1/2} z_2 (a^{-1/2} z_3 + b^{-1/2} z_4)' \big] + \Lambda_y^{1/2} (ab)^{-1/2} Q_{yx}^{12} \lambda_x^{1/2}, $$

$$ \ddot C^{tt} = -C a^{-1} s_1^{-1} (a^{-1/2} z_3 + b^{-1/2} z_4)(a^{-1/2} z_3 + b^{-1/2} z_4)' - C a^{-1} \lambda_x Q_x^{22} + (ab)^{-1/2} (\lambda_x \lambda_y)^{1/2} \big[ Q_{xy}^{22} + Q_{yx}^{22} \big] \\ - (1-C^2)^{1/2} \big[ s_2^{1/2} (z_3 w_3' + w_3 z_3' + z_4 w_4' + w_4 z_4') + s_4 (z_3 w_4' + w_4 z_3') \big] + (1-C^2)^{1/2} a^{-1/2} \lambda_x H_x^{22} \\ - C (ab)^{-1/2} s_1^{-1} \big[ (a^{-1/2} z_3 + b^{-1/2} z_4)(b^{-1/2} z_3 - a^{-1/2} z_4)' + (b^{-1/2} z_3 - a^{-1/2} z_4)(a^{-1/2} z_3 + b^{-1/2} z_4)' \big] \\ - C b^{-1} s_1^{-1} (b^{-1/2} z_3 - a^{-1/2} z_4)(b^{-1/2} z_3 - a^{-1/2} z_4)' - C b^{-1} \lambda_y Q_y^{22} + (1-C^2)^{1/2} b^{-1/2} \lambda_y H_y^{22}, $$

where s_1 = λ_x/a + λ_y/b, s_2 = λ_xλ_y/(ab), s_4 = λ_x/a − λ_y/b, and

a, b ∼ χ²_ν, z_1, w_1 ∼ Normal_M(0, I_M), z_2, w_2 ∼ Normal_N(0, I_N), z_3, z_4, w_3, w_4 ∼ Normal_L(0, I_L),

$$ H_x = \begin{pmatrix} H_x^{11} & H_x^{12} \\ H_x^{21} & H_x^{22} \end{pmatrix} \sim \text{Normal}_{(M+L)\times(M+L)}(0, V_{M+L}), \qquad H_y = \begin{pmatrix} H_y^{11} & H_y^{12} \\ H_y^{21} & H_y^{22} \end{pmatrix} \sim \text{Normal}_{(N+L)\times(N+L)}(0, V_{N+L}), $$

$$ Q = \begin{pmatrix} Q_x^{11} & Q_x^{12} & Q_{xy}^{11} & Q_{xy}^{12} \\ Q_x^{21} & Q_x^{22} & Q_{xy}^{21} & Q_{xy}^{22} \\ Q_{yx}^{11} & Q_{yx}^{12} & Q_y^{11} & Q_y^{12} \\ Q_{yx}^{21} & Q_{yx}^{22} & Q_y^{21} & Q_y^{22} \end{pmatrix} = \begin{pmatrix} Q_x & Q_{xy} \\ Q_{yx} & Q_y \end{pmatrix} \sim \text{Wishart}_{M+N+2L}(I_{M+N+2L}, \nu - 2), $$
independently and independent of C.

Proof. The RF C defined in (5) can be rewritten as

$$ C = C(p, q) = \frac{X(p)'\, Y(q)}{\sqrt{X(p)'\, X(p)\; Y(q)'\, Y(q)}}, $$

where p(u, v, t) := (u, t) and q(u, v, t) := (v, t). Hence, by simple rules of differential calculus,

$$ \dot C^u = \dot C^p_{|M}, \qquad \dot C^v = \dot C^q_{|N}, \qquad \dot C^t = \dot C^{p\,|L} + \dot C^{q\,|L}. $$

Since C(p, q) can be seen as a cross-correlation RF defined on R^{M+L} × R^{N+L}, by Lemma 4.2 in Cao and Worsley (1999b),

$$ \dot C^u = (1-C^2)^{1/2} a^{-1/2} \Lambda_x^{1/2} z_x^1, \qquad \dot C^v = (1-C^2)^{1/2} b^{-1/2} \Lambda_y^{1/2} z_y^1, \qquad \dot C^t = (1-C^2)^{1/2} \big( a^{-1/2} \lambda_x^{1/2} z_x^2 + b^{-1/2} \lambda_y^{1/2} z_y^2 \big), $$

and the first result follows by defining z_1 := z_x^1 ∼ Normal_M(0, I_M), z_2 := z_y^1 ∼ Normal_N(0, I_N) and

$$ z_3 := s_1^{-1/2} \big( a^{-1/2} \lambda_x^{1/2} z_x^2 + b^{-1/2} \lambda_y^{1/2} z_y^2 \big) \sim \text{Normal}_L(0, I_L), $$

with s_1 = λ_x/a + λ_y/b. In a similar way,

$$ \ddot C^{uu} = \ddot C^{pp}_{|MM}, \qquad \ddot C^{vv} = \ddot C^{qq}_{|NN}, \qquad \ddot C^{uv} = \ddot C^{pq}_{|MN}, \qquad \ddot C^{ut} = \ddot C^{pp\,|L}_{|M} + \ddot C^{pq\,|L}_{|M}, \qquad \ddot C^{vt} = \ddot C^{qp\,|L}_{|N} + \ddot C^{qq\,|L}_{|N}, $$
$$ \ddot C^{tt} = \ddot C^{pp\,|LL} + \ddot C^{pq\,|LL} + \ddot C^{qp\,|LL} + \ddot C^{qq\,|LL}. $$

Thus, by Lemma 4.2 in Cao and Worsley (1999b),

$$ \ddot C^{uu} = \Lambda_x^{1/2} \big[ -C a^{-1} z_1 z_1' - C a^{-1} Q_x^{11} - (1-C^2)^{1/2} a^{-1} (z_1 w_1' + w_1 z_1') + (1-C^2)^{1/2} a^{-1/2} H_x^{11} \big] \Lambda_x^{1/2}, $$
$$ \ddot C^{vv} = \Lambda_y^{1/2} \big[ -C b^{-1} z_2 z_2' - C b^{-1} Q_y^{11} - (1-C^2)^{1/2} b^{-1} (z_2 w_2' + w_2 z_2') + (1-C^2)^{1/2} b^{-1/2} H_y^{11} \big] \Lambda_y^{1/2}, $$
$$ \ddot C^{uv} = \Lambda_x^{1/2} \big[ -C (ab)^{-1/2} z_1 z_2' + (ab)^{-1/2} Q_{xy}^{11} \big] \Lambda_y^{1/2}, $$

where w_1 := w_x^1 ∼ Normal_M(0, I_M) and w_2 := w_y^1 ∼ Normal_N(0, I_N), distributed independently, and independent of z_1, z_2 and z_3. Furthermore,

$$ \ddot C^{ut} = \Lambda_x^{1/2} \big[ -C a^{-1} z_1 (z_x^2)' - C a^{-1} Q_x^{12} - (1-C^2)^{1/2} a^{-1} \big( z_1 (w_x^2)' + w_1 (z_x^2)' \big) + (1-C^2)^{1/2} a^{-1/2} H_x^{12} \big] \lambda_x^{1/2} \\ + \Lambda_x^{1/2} \big[ -C (ab)^{-1/2} z_1 (z_y^2)' + (ab)^{-1/2} Q_{xy}^{12} \big] \lambda_y^{1/2}, $$

$$ \ddot C^{vt} = \Lambda_y^{1/2} \big[ -C b^{-1} z_2 (z_y^2)' - C b^{-1} Q_y^{12} - (1-C^2)^{1/2} b^{-1} \big( z_2 (w_y^2)' + w_2 (z_y^2)' \big) + (1-C^2)^{1/2} b^{-1/2} H_y^{12} \big] \lambda_y^{1/2} \\ + \Lambda_y^{1/2} \big[ -C (ab)^{-1/2} z_2 (z_x^2)' + (ab)^{-1/2} Q_{yx}^{12} \big] \lambda_x^{1/2}, $$

$$ \ddot C^{tt} = \lambda_x^{1/2} \big[ -C a^{-1} z_x^2 (z_x^2)' - C a^{-1} Q_x^{22} - (1-C^2)^{1/2} a^{-1} \big( z_x^2 (w_x^2)' + w_x^2 (z_x^2)' \big) + (1-C^2)^{1/2} a^{-1/2} H_x^{22} \big] \lambda_x^{1/2} \\ + \lambda_x^{1/2} \big[ -C (ab)^{-1/2} z_x^2 (z_y^2)' + (ab)^{-1/2} Q_{xy}^{22} \big] \lambda_y^{1/2} + \lambda_y^{1/2} \big[ -C (ab)^{-1/2} z_y^2 (z_x^2)' + (ab)^{-1/2} Q_{yx}^{22} \big] \lambda_x^{1/2} \\ + \lambda_y^{1/2} \big[ -C b^{-1} z_y^2 (z_y^2)' - C b^{-1} Q_y^{22} - (1-C^2)^{1/2} b^{-1} \big( z_y^2 (w_y^2)' + w_y^2 (z_y^2)' \big) + (1-C^2)^{1/2} b^{-1/2} H_y^{22} \big] \lambda_y^{1/2}, $$

and the expressions for the second derivatives follow by defining

$$ z_4 := s_1^{-1/2} \big( b^{-1/2} \lambda_y^{1/2} z_x^2 - a^{-1/2} \lambda_x^{1/2} z_y^2 \big) \sim \text{Normal}_L(0, I_L), $$
$$ w_3 := s_1^{-1/2} \big( a^{-1/2} \lambda_x^{1/2} w_x^2 + b^{-1/2} \lambda_y^{1/2} w_y^2 \big) \sim \text{Normal}_L(0, I_L), \qquad w_4 := s_1^{-1/2} \big( b^{-1/2} \lambda_y^{1/2} w_x^2 - a^{-1/2} \lambda_x^{1/2} w_y^2 \big) \sim \text{Normal}_L(0, I_L), $$

which are distributed independently, and independent of z_1, z_2, z_3, w_1 and w_2, and noting that

$$ a^{-1} \lambda_x \big[ z_x^2 (w_x^2)' + w_x^2 (z_x^2)' \big] + b^{-1} \lambda_y \big[ z_y^2 (w_y^2)' + w_y^2 (z_y^2)' \big] = s_2^{1/2} (z_3 w_3' + w_3 z_3' + z_4 w_4' + w_4 z_4') + s_4 (z_3 w_4' + w_4 z_3'), $$

where s_2 = λ_xλ_y/(ab) and s_4 = λ_x/a − λ_y/b. □
Lemma 4. Let V_1, V_2 ∼ Wishart_P(I_P, ξ) independently, let A be a fixed P × P symmetric matrix, and let e_1, e_2 be fixed scalars satisfying e_1 + e_2 = s̃_1 and e_1 e_2 = s̃_2. Then

$$ E\big( \det(A - e_1 V_1 - e_2 V_2) \big) = \sum_{k=0}^{P} (-1)^{P-k}\, \operatorname{detr}_k(A)\, (P-k)! \sum_{l=0}^{[\frac{P-k}{2}]} \sum_{q=0}^{[\frac{P-k}{2}]-l} \binom{\xi}{l} \binom{\xi}{P-k-l} \binom{P-k-2l}{2q} \\ \times \tilde s_2^{\,l}\, 2^{-(P-k-2l-1)}\, (1+\delta_{P-k,2l})^{-1}\, (\tilde s_1^2 - 2\tilde s_2)^q\, \tilde s_1^{\,P-k-2l-2q}. $$

Proof. Since V =ᴰ U′VU for V = e_1 V_1 + e_2 V_2 and any orthonormal matrix U, it is easily obtained that

$$ E\big( \det(A - e_1 V_1 - e_2 V_2) \big) = \sum_{k=0}^{P} (-1)^{P-k}\, \operatorname{detr}_k(A)\, E\big( \det( e_1 (V_1)_{P-k} + e_2 (V_2)_{P-k} ) \big), $$

where (V_1)_{P−k} and (V_2)_{P−k} denote any (P − k) × (P − k) principal minor of V_1 and V_2, respectively. The proof concludes by applying Lemma A.6 in Cao and Worsley (1999b). □

Lemma 5. Let G ∼ Wishart_{N+2P}(I_{N+2P}, η) be a 3 × 3 block matrix whose diagonal blocks G^{11}, G^{22} and G^{33} are square matrices of dimensions P, N and P, respectively. Let α_1, α_2 and α_3 be fixed scalars. If
where (V1 )P −k and (V2 )P −k denote any (P − k) × (P − k) principal minor of V1 and V2 , respectively. The proof concludes by applying Lemma A.6 in Cao and Worsley (1999b). Lemma 5. Let G ∼ WishartN +2P (IN +2P , η) be a 3 × 3 block matrix where the diagonal blocks G11 , G22 and G33 are square matrices of dimension P , N and P, respectively. Let α1 , α2 and α3 be fixed scalars. If 1
α3 G22
˜ = G
1
1
α3 λy2 G23 + α2 λx2 G21
1
1
α3 λy2 G32 + α2 λx2 G12
!
1
α1 λx G11 + α3 λy G33 + α2 λx2 λy2 (G13 + G31 )
then k P −k ] [ P− ]−l P [X 2 2 X η!P ! X N η−N η−N (−1)P −k (α1 α4 λx )k k l P −k−l (η − N )! k=0 l=0 q=0 P − k − 2l l −(P −k−2l−1) × s˜2 2 (1 + δP −k,2l )−1 (˜s21 − 2s˜2 )q sP1 −k−2l−2q ,
˜ )) = α3N E(det(G
2q
where s˜1 = −α1 λx − α3 λy and s˜2 = (α1 α3 − α22 )λx λy . Proof. Let F∗ and F be 2 × 2 block matrices defined as G12 G32
F∗ :=
F :=
G11 G31
(G22 )−1 G21 G23 , 12 G13 G − 32 (G22 )−1 G21 33 G
G
G23 ,
which distribute independently as Wishart2P (I2P , N ) and Wishart2P (I2P , η − N ), respectively. It is easy to check that 11
˜ ) = det(G˜ ) det(G˜ det(G
22
21 11 12 − G˜ (G˜ )−1 G˜ ) 1
1
= α3N det(G22 ) det(α1 λx F11 + α3 λy F22 + α2 λx2 λy2 (F12 + F21 ) + α1 α4 λx F11 ∗ ),
(12)
α22
where α4 = 1 − α α . Let e1 , e2 be scalars satisfying e1 + e2 = −α1 λx − α3 λy and e1 e2 = (α1 α3 − α22 )λx λy . Then, 1 3 1
1
α1 λx F11 + α3 λy F22 + α2 λx2 λy2 (F12 + F21 ) = −e1 V1 − e2 V2 , where V1 , V2 ∼ WishartP (IP , η − N ) independently. Therefore
˜ )) = α3N E det(G22 ) det α1 α4 λx F11 E(det(G ∗ − e1 V1 − e2 V2
.
Then, by applying Lemma 4 with ξ = η − N, s˜1 = e1 + e2 , s˜2 = e1 e2 and A = α1 α4 λx F11 ∗ it is obtained that u u [ P− ] [ P− ]−l P 2 2 X X X η! N! P η−N η−N P − u u ˜ )) = α E(det(G (−1) (α1 α4 λx ) (P − u)! u (N − k)! l P −u−l (η − N )! u=0 l =0 q =0 P − u − 2l l −(P −u−2l−1) × s˜2 2 (1 + δP −u,2l )−1 (˜s21 − 2s˜2 )q s1P −u−2l−2q ,
N 3
2q
and the result follows after proper simplifications.
Lemma 6. Let Q ∼ Wishart_{M+N+2P}(I_{M+N+2P}, ν) be the 4 × 4 block matrix

$$ Q = \begin{pmatrix} Q_x & Q_{xy} \\ Q_{yx} & Q_y \end{pmatrix}, \qquad (13) $$

where in turn the diagonal blocks Q_x and Q_y are square 2 × 2 block matrices of dimensions M + P and N + P, respectively. Let α_1, α_2 and α_3 be fixed scalars and let Q̃ be defined as in (9). Then

$$ E(\det(\tilde Q)) = \alpha_1^M \alpha_3^N\, \frac{\nu!\, N!\, P!}{(\nu-M)!} \sum_{n=0}^{N} \sum_{p=0}^{P} \sum_{u=0}^{P-p} (-1)^{P-u}\, (\alpha_1 \lambda_x)^u (\alpha_3 \lambda_y)^p\, \alpha_4^{\,n+p+u} \binom{M}{n} \binom{M}{p} \binom{\nu-M}{N-n} \binom{N-n}{u} \\ \times \sum_{l=0}^{[\frac{P-p-u}{2}]} \sum_{q=0}^{[\frac{P-p-u}{2}]-l} \binom{\nu-M-N+n}{l} \binom{\nu-M-N+n}{P-p-u-l} \binom{P-p-u-2l}{2q}\, \tilde s_2^{\,l}\, 2^{-(P-p-u-2l-1)} \\ \times (1+\delta_{P-p-u,2l})^{-1}\, (\tilde s_1^2 - 2\tilde s_2)^q\, \tilde s_1^{\,P-p-u-2l-2q}, $$

where s̃_1 = −α_1λ_x − α_3λ_y, s̃_2 = (α_1α_3 − α_2²)λ_xλ_y and α_4 = 1 − α_2²/(α_1α_3).

Proof. It can be obtained that

$$ E(\det(\tilde Q)) = E\Bigg( \det(\tilde Q^{11})\, \det\Bigg( \begin{pmatrix} \tilde Q^{22} & \tilde Q^{23} \\ \tilde Q^{32} & \tilde Q^{33} \end{pmatrix} - \begin{pmatrix} \tilde Q^{21} \\ \tilde Q^{31} \end{pmatrix} (\tilde Q^{11})^{-1} \big( \tilde Q^{12}\;\; \tilde Q^{13} \big) \Bigg) \Bigg) = \alpha_1^M\, E\big( \det(Q_x^{11}) \big)\, E\big( \det( \alpha_3 \alpha_4 \tilde G_* + \tilde G ) \big), $$

where α_4 = 1 − α_2²/(α_1α_3),

$$ \tilde G = \begin{pmatrix} \alpha_3 G^{22} & \alpha_3 \lambda_y^{1/2} G^{23} + \alpha_2 \lambda_x^{1/2} G^{21} \\ \alpha_3 \lambda_y^{1/2} G^{32} + \alpha_2 \lambda_x^{1/2} G^{12} & \alpha_1 \lambda_x G^{11} + \alpha_3 \lambda_y G^{33} + \alpha_2 \lambda_x^{1/2} \lambda_y^{1/2} (G^{13} + G^{31}) \end{pmatrix}, \qquad \tilde G_* = \begin{pmatrix} G_*^{11} & \lambda_y^{1/2} G_*^{12} \\ \lambda_y^{1/2} G_*^{21} & \lambda_y G_*^{22} \end{pmatrix}, $$

with

$$ G_* := \begin{pmatrix} Q_{yx}^{11} \\ Q_{yx}^{21} \end{pmatrix} (Q_x^{11})^{-1} \big( Q_{xy}^{11}\;\; Q_{xy}^{12} \big), \qquad G := \begin{pmatrix} Q_x^{22} & Q_{xy}^{21} & Q_{xy}^{22} \\ Q_{yx}^{12} & Q_y^{11} & Q_y^{12} \\ Q_{yx}^{22} & Q_y^{21} & Q_y^{22} \end{pmatrix} - \begin{pmatrix} Q_x^{21} \\ Q_{yx}^{11} \\ Q_{yx}^{21} \end{pmatrix} (Q_x^{11})^{-1} \big( Q_x^{12}\;\; Q_{xy}^{11}\;\; Q_{xy}^{12} \big), $$

which are independently distributed as Wishart_{N+P}(I_{N+P}, M) and Wishart_{N+2P}(I_{N+2P}, ν − M), respectively. Let U be a 2 × 2 block orthonormal matrix such that G̃_{**} = U′ G̃_* U is diagonal. It is easy to see that U′ G̃ U has the same distribution as G̃, and hence

$$ \det( \alpha_3 \alpha_4 \tilde G_* + \tilde G ) \;\stackrel{D}{=}\; \det( \alpha_3 \alpha_4 \tilde G_{**} + \tilde G ). $$

Then det(α_3α_4 G̃_{**} + G̃) can be expanded as the summation of the products of the determinant of any l × l principal minor of α_3α_4 G̃_{**} (or α_3α_4 G̃_*) and the determinant of the (N + P − l) × (N + P − l) principal minor of G̃ formed by the remaining N + P − l rows and columns. For each l, let n and p = l − n be the numbers of rows or columns taken from the N × N upper diagonal block of G̃_* and from the P × P lower diagonal block of G̃_*, respectively. Then, by (12),

$$ E(\det(\tilde Q)) = \alpha_1^M\, E\big( \det(Q_x^{11}) \big) \sum_{n=0}^{N} \sum_{p=0}^{P} (\alpha_3 \alpha_4)^{n+p}\, E\big( \operatorname{detr}_{n,p}(G_*) \big) \\ \times E\Big( \det\big( (\alpha_3 G^{22})_{N-n} \big)\, \det\big( ( \alpha_1 \lambda_x F^{11} + \alpha_3 \lambda_y F^{22} + \alpha_2 \lambda_x^{1/2} \lambda_y^{1/2} (F^{12} + F^{21}) + \alpha_1 \alpha_4 \lambda_x F_*^{11} )_{P-p} \big) \Big). $$

Hence, Lemma 5 with η = ν − M and with N, P replaced by N − n and P − p, respectively, yields

$$ E(\det(\tilde Q)) = \alpha_1^M \sum_{n=0}^{N} \sum_{p=0}^{P} \binom{N}{n} \binom{P}{p} \frac{M!}{(M-n)!} \frac{M!}{(M-p)!} \frac{(\nu-M)!}{(\nu-M-N+n)!} \frac{\nu!}{(\nu-M)!}\, (\alpha_3 \alpha_4)^{n+p} \lambda_y^p\, \alpha_3^{N-n} \\ \times \sum_{u=0}^{P-p} (-1)^{P-u} (\alpha_1 \alpha_4 \lambda_x)^u\, \frac{(N-n)!}{(N-n-u)!} \binom{P-p}{u} (P-p-u)! \sum_{l=0}^{[\frac{P-p-u}{2}]} \sum_{q=0}^{[\frac{P-p-u}{2}]-l} \binom{\nu-M-N+n}{l} \binom{\nu-M-N+n}{P-p-u-l} \binom{P-p-u-2l}{2q} \\ \times \tilde s_2^{\,l}\, 2^{-(P-p-u-2l-1)}\, (1+\delta_{P-p-u,2l})^{-1}\, (\tilde s_1^2 - 2\tilde s_2)^q\, \tilde s_1^{\,P-p-u-2l-2q}, $$

and the result follows after proper simplifications. □
Lemma 7. Let Q̃ be defined as in (9), let γ_1, γ_2 and γ_4 be fixed real numbers, and let z̃ = (0, 0, z_4′)′, w̃ = (γ_1 w_1′, γ_2 w_2′, γ_4 w_4′)′, with

w_1 ∼ Normal_M(0, I_M), w_2 ∼ Normal_N(0, I_N), z_4, w_4 ∼ Normal_P(0, I_P),

independently and independent of Q̃. Then

$$ E\big( \det(\tilde Q + \tilde z\tilde w' + \tilde w\tilde z') \big) = \sum_{x=0}^{M} \sum_{y=0}^{N} \sum_{z=0}^{P} \Delta_{x,y,z}(\gamma_1, \gamma_2, \gamma_4; M, N, P)\, \alpha_1^{M-x} \alpha_3^{N-y}\, \frac{\nu!\, (N-y)!\, (P-z)!}{(\nu-M+x)!} \\ \times \sum_{n=0}^{N-y} \sum_{p=0}^{P-z} \sum_{u=0}^{P-z-p} (-1)^{P-u}\, (\alpha_1 \lambda_x)^u (\alpha_3 \lambda_y)^p\, \alpha_4^{\,n+p+u} \binom{M-x}{n} \binom{M-x}{p} \binom{\nu-M+x}{N-y-n} \binom{N-y-n}{u} \\ \times \sum_{l=0}^{[\frac{P-z-p-u}{2}]} \sum_{q=0}^{[\frac{P-z-p-u}{2}]-l} \binom{\nu-M-N+x+y+n}{l} \binom{\nu-M-N+x+y+n}{P-z-p-u-l} \binom{P-z-p-u-2l}{2q}\, \tilde s_2^{\,l}\, 2^{-(P-z-p-u-2l-1)} \\ \times (1+\delta_{P-z-p-u,2l})^{-1}\, (\tilde s_1^2 - 2\tilde s_2)^q\, \tilde s_1^{\,P-z-p-u-2l-2q}, $$

where

$$ \Delta_{x,y,z}(\gamma_1, \gamma_2, \gamma_4; M, N, P) = \begin{cases} 1, & x = y = z = 0, \\ -\gamma_4^2\, P(P-1), & x = y = 0,\; z = 2, \\ -\gamma_1^2\, M P, & y = 0,\; x = z = 1, \\ -\gamma_2^2\, N P, & x = 0,\; y = z = 1, \\ 0, & \text{elsewhere}, \end{cases} $$

s̃_1 = −α_1λ_x − α_3λ_y, s̃_2 = (α_1α_3 − α_2²)λ_xλ_y and α_4 = 1 − α_2²/(α_1α_3).

Proof. It is easy to see that det(Q̃ + z̃w̃′ + w̃z̃′) can be expanded as the summation of the products of the determinant of any l × l principal minor of z̃w̃′ + w̃z̃′ and the determinant of the (D − l) × (D − l) principal minor of Q̃ formed by the remaining D − l rows and columns. For each l, let x, y and z be the numbers of rows or columns taken from the first, second and third diagonal blocks of z̃w̃′ + w̃z̃′. Then

$$ E\big( \det(\tilde Q + \tilde z\tilde w' + \tilde w\tilde z') \big) = \sum_{x=0}^{M} \sum_{y=0}^{N} \sum_{z=0}^{P} E\big( \operatorname{detr}_{x,y,z}(\tilde z\tilde w' + \tilde w\tilde z') \big)\, E\big( \det(\tilde Q_{M-x,N-y,P-z}) \big), $$

where detr_{x,y,z}(z̃w̃′ + w̃z̃′) = 0 for x + y + z > 2. On the other hand, by the definition of z̃, detr_{x,y,0}(z̃w̃′ + w̃z̃′) = 0 unless x = y = 0. Hence the only triples (x, y, z) that contribute to the summation above are (0, 0, 0), (0, 0, 2), (1, 0, 1) and (0, 1, 1). Thus, similarly to Lemma A.7 in Cao and Worsley (1999b), it is easy to show that

$$ \operatorname{detr}_{0,0,0}(\tilde z\tilde w' + \tilde w\tilde z') = 1, \qquad E\big( \operatorname{detr}_{0,0,2}(\tilde z\tilde w' + \tilde w\tilde z') \big) = -\gamma_4^2\, P(P-1), $$
$$ E\big( \operatorname{detr}_{1,0,1}(\tilde z\tilde w' + \tilde w\tilde z') \big) = -\gamma_1^2\, M P, \qquad E\big( \operatorname{detr}_{0,1,1}(\tilde z\tilde w' + \tilde w\tilde z') \big) = -\gamma_2^2\, N P, $$

and the result follows by applying Lemma 6 with M, N and P replaced by M − x, N − y and P − z, respectively. □
Lemma 8. Let H_x ∼ Normal_{M+P}(0, V(I_{M+P})), H_y ∼ Normal_{N+P}(0, V(I_{N+P})) be 2 × 2 block matrices of dimensions M + P and N + P, respectively, and let β_1, β_2 be fixed scalars. Let H̃ be defined as in (10) and let A be a symmetric matrix with the same block structure as H̃ and such that A =ᴰ U′AU for any fixed orthonormal matrix U. Let A_{i,j,k} be any (i + j + k) × (i + j + k) principal minor of A, with i, j, k rows and columns corresponding to the first, second and third diagonal blocks of A, respectively. Then

$$ E(\det(\tilde H + A)) = \sum_{i=0}^{[\frac{M}{2}]} \sum_{j=0}^{[\frac{N}{2}]} \sum_{k=0}^{[\frac{P}{2}]} \frac{(-1)^{i+j+k}\, (2i)!\, (2j)!\, (2k)!}{2^{\,i+j+k}\, i!\, j!\, k!} \binom{M}{2i} \binom{N}{2j} \binom{P}{2k} \beta_1^{2i} \beta_2^{2j} \beta_3^{k}\, E\big( \det(A_{M-2i,N-2j,P-2k}) \big), $$

with β_3 = (β_1λ_x)² + (β_2λ_y)².

Proof. Firstly, notice that

$$ \det(\tilde H) = \det(\tilde H_1)\, \det(\tilde H_2) = \beta_1^M \beta_2^N\, \det(H_x^{11})\, \det(H_y^{11})\, \det(\tilde H_2), $$

where

$$ \tilde H_1 = \begin{pmatrix} \beta_1 H_x^{11} & 0 \\ 0 & \beta_2 H_y^{11} \end{pmatrix}, \qquad \tilde H_2 = \beta_1 \lambda_x H_x^{22} + \beta_2 \lambda_y H_y^{22} - \big( \beta_1 \lambda_x^{1/2} H_x^{21}\;\; \beta_2 \lambda_y^{1/2} H_y^{21} \big) (\tilde H_1)^{-1} \begin{pmatrix} \beta_1 \lambda_x^{1/2} H_x^{12} \\ \beta_2 \lambda_y^{1/2} H_y^{12} \end{pmatrix}. $$

Since

$$ \tilde H_2 = \beta_1 \lambda_x H_x + \beta_2 \lambda_y H_y, $$

with H_x = H_x^{22} − H_x^{21}(H_x^{11})^{−1}H_x^{12} ∼ Normal_{P×P}(0, V_P) and H_y = H_y^{22} − H_y^{21}(H_y^{11})^{−1}H_y^{12} ∼ Normal_{P×P}(0, V_P), independently, and independent of H_x^{11}, H_y^{11}, we obtain

$$ E(\det(\tilde H)) = \beta_1^M \beta_2^N\, E(\det(H_x^{11}))\, E(\det(H_y^{11}))\, E\big( \det(\beta_1 \lambda_x H_x + \beta_2 \lambda_y H_y) \big). $$

Hence,

$$ E(\det(\tilde H)) = \beta_1^M \beta_2^N \beta_3^{\frac{P}{2}}\, E(\det(H_x^{11}))\, E(\det(H_y^{11}))\, E(\det(H)), \qquad (14) $$

where H = β_3^{−1/2}(β_1λ_x H_x + β_2λ_y H_y) ∼ Normal_{P×P}(0, V_P), with β_3 = (β_1λ_x)² + (β_2λ_y)². On the other hand, since A =ᴰ U′AU for any fixed orthonormal matrix U, det(H̃ + A) can be expressed as the summation of the products of the determinant of any l × l principal minor H̃_l of H̃ and the determinant of the (D − l) × (D − l) principal minor A_{D−l} of A, formed from the remaining D − l rows and columns not included in H̃_l. For H̃_l, let i and j be the numbers of rows and columns from the first M × M and N × N diagonal blocks of H̃, respectively; hence k = l − i − j is the number of rows and columns from the lower P × P diagonal block of H̃. Therefore, from (14),

$$ E(\det(\tilde H + A)) = \sum_{i=0}^{M} \sum_{j=0}^{N} \sum_{k=0}^{P} \beta_1^{i} \beta_2^{j} \beta_3^{\frac{k}{2}}\, E\big( \operatorname{detr}_i(H_x^{11}) \big)\, E\big( \operatorname{detr}_j(H_y^{11}) \big)\, E\big( \operatorname{detr}_k(H) \big)\, E\big( \det(A_{M-i,N-j,P-k}) \big), $$

and the final result is obtained by Lemma 5.3.2 in Adler (1981). □
Lemma 9. Let H̃ be as in Lemma 8 and Q̃, z̃, w̃ as in Lemma 7. Then

$$ E\big( \det(\tilde H + \tilde Q + \tilde z\tilde w' + \tilde w\tilde z') \big) = \sum_{i=0}^{[\frac{M}{2}]} \sum_{j=0}^{[\frac{N}{2}]} \sum_{k=0}^{[\frac{P}{2}]} \frac{(-1)^{i+j+k}\, (2i)!\, (2j)!\, (2k)!}{2^{\,i+j+k}\, i!\, j!\, k!} \binom{M}{2i} \binom{N}{2j} \binom{P}{2k} \beta_1^{2i} \beta_2^{2j} \beta_3^{k} \\ \times \sum_{x=0}^{M-2i} \sum_{y=0}^{N-2j} \sum_{z=0}^{P-2k} \alpha_1^{M-2i-x}\, \alpha_3^{N-2j-y}\, \Delta_{x,y,z}(\gamma_1, \gamma_2, \gamma_4;\, M-2i,\, N-2j,\, P-2k)\, \frac{\nu!\, (N-2j-y)!\, (P-2k-z)!}{(\nu-M+2i+x)!} \\ \times \sum_{n=0}^{N-2j-y} \sum_{p=0}^{P-2k-z} \sum_{u=0}^{P-2k-z-p} (-1)^{P-u}\, (\alpha_1 \lambda_x)^u (\alpha_3 \lambda_y)^p\, \alpha_4^{\,n+p+u} \binom{M-2i-x}{n} \binom{M-2i-x}{p} \binom{\nu-M+2i+x}{N-2j-y-n} \binom{N-2j-y-n}{u} \\ \times \sum_{l=0}^{[\frac{P-2k-z-p-u}{2}]} \sum_{q=0}^{[\frac{P-2k-z-p-u}{2}]-l} \binom{\nu-M-N+2i+2j+x+y+n}{l} \binom{\nu-M-N+2i+2j+x+y+n}{P-2k-z-p-u-l} \binom{P-2k-z-p-u-2l}{2q} \\ \times \tilde s_2^{\,l}\, 2^{-(P-2k-z-p-u-2l-1)}\, (1+\delta_{P-2k-z-p-u,\,2l})^{-1}\, (\tilde s_1^2 - 2\tilde s_2)^q\, \tilde s_1^{\,P-2k-z-p-u-2l-2q}. $$

Proof. The result follows from applying Lemma 8 with A = Q̃ + z̃w̃′ + w̃z̃′ and then Lemma 7 for computing E(det(A_{M−2i,N−2j,P−2k})). □
References

Adler, R., Hasofer, A., 1976. Level crossings for random fields. The Annals of Probability 4 (1), 1–12.
Adler, R.J., 1981. The Geometry of Random Fields. John Wiley & Sons Inc.
Beaky, M.M., Scherrer, R.J., Villumsen, J.V., 1992. Topology of large-scale structure in seeded hot dark matter models. Astrophysical Journal 387, 443–448.
Cao, J., Worsley, K.J., 1999a. The detection of local shape changes via the geometry of Hotelling's T² fields. The Annals of Statistics 27 (3), 925–942.
Cao, J., Worsley, K.J., 1999b. The geometry of correlation fields with an application to functional connectivity of the brain. The Annals of Applied Probability 9 (4), 1021–1057.
Cao, J., Worsley, K.J., 2001. Applications of random fields in human brain mapping. In: Spatial Statistics: Methodological Aspects and Applications. Springer Lecture Notes in Statistics, vol. 159, pp. 170–182.
Friston, K.J., Buechel, C., Fink, G.R., Morris, J., Rolls, E., Dolan, R.J., 1997. Psychophysiological and modulatory interactions in neuroimaging. Neuroimage 6 (3), 218–229.
Gobbini, M., Haxby, J., 2007. Neural systems for recognition of familiar faces. Neuropsychologia 45 (1), 32–41.
Gott III, J.R., Park, C., Juszkiewicz, R., Bies, W.E., Bennett, D.P., Bouchet, F.R., Stebbins, A., 1990. Topology of microwave background fluctuations: Theory. Astrophysical Journal 352, 1–14.
Hasofer, A., 1978. Upcrossings of random fields. Advances in Applied Probability 10, 14–21.
Henson, R.N., Goshen-Gottstein, Y., Ganel, T., Otten, L.J., Quayle, A., Rugg, M.D., 2003. Electrophysiological and haemodynamic correlates of face perception, recognition and priming. Cerebral Cortex 13 (7), 793–805.
Horowitz, B., Grady, C.L., Mentis, M.J., Pietrini, P., Ungerleider, L.G., Rapoport, S.I., Haxby, J.V., 1996. Brain functional connectivity changes as task difficulty is altered. Neuroimage 3, S248.
Kiebel, S.J., Poline, J.B., Friston, K.J., Holmes, A.P., Worsley, K.J., 1999. Robust smoothness estimation in statistical parametric maps using standardized residuals from the general linear model. Neuroimage 10 (6), 756–766.
McIntosh, A.R., Gonzalez-Lima, F., 1994. Structural equation modeling and its application to network analysis in functional brain imaging. Human Brain Mapping 2 (1–2), 2–22.
Pascual-Marqui, R.D., Michel, C.M., Lehmann, D., 1994. Low resolution electromagnetic tomography: A new method for localizing electrical activity in the brain. International Journal of Psychophysiology 18 (1), 49–65.
Strother, S.C., Anderson, J.R., Schaper, K.A., Sidtis, J.J., Liow, J., Woods, R.P., Rottenberg, D.A., 1995. Principal component analysis and the scaled subprofile model compared to intersubject averaging and statistical parametric mapping I: Functional connectivity of the human motor system studied with 15O water PET. Journal of Cerebral Blood Flow and Metabolism 15 (5), 738–753.
Taylor, J.E., Worsley, K.J., 2008. Random fields of multivariate test statistics, with applications to shape analysis. Annals of Statistics 36, 1–27.
Vogeley, M.S., Park, C., Geller, M.J., Huchra, J.P., Gott III, J.R., 1994. Topological analysis of the CfA redshift survey. The Astrophysical Journal 420 (2), 525–544.
Worsley, K.J., 1994. Local maxima and the expected Euler characteristic of excursion sets of χ², F and t fields. Advances in Applied Probability 26 (1), 13–42.
Worsley, K.J., 1995a. Boundary corrections for the expected Euler characteristic of excursion sets of random fields, with an application to astrophysics. Advances in Applied Probability 27 (4), 943–959.
Worsley, K.J., 1995b. Estimating the number of peaks in a random field using the Hadwiger characteristic of excursion sets, with applications to medical images. The Annals of Statistics 23 (2), 640–669.
Worsley, K.J., Cao, J., Paus, T., Petrides, M., Evans, A.C., 1998. Applications of random field theory to functional connectivity. Human Brain Mapping 6 (5–6), 364–367.
Worsley, K.J., Chen, J.I., Lerch, J., Evans, A.C., 2005. Comparing functional connectivity via thresholding correlations and singular value decomposition. Philosophical Transactions of the Royal Society B: Biological Sciences 360 (1457), 913–920.
Worsley, K.J., Evans, A.C., Marrett, S., Neelin, P., 1992. Determining the number of statistically significant areas of activation in subtracted activation studies from PET. Journal of Cerebral Blood Flow and Metabolism 12, 900–918.
Worsley, K.J., Marrett, S., Neelin, P., Vandal, A.C., Friston, K.J., Evans, A.C., 1996. A unified statistical approach for determining significant signals in images of cerebral activation. Human Brain Mapping 4 (1), 58–73.
Worsley, K.J., Taylor, J., Tomaiuolo, F., Lerch, J., 2004. Unified univariate and multivariate random field theory. Neuroimage 23, 189–195.