Statistics and Probability Letters 80 (2010) 361–365
A note on the cross-covariance operator and on congruence relations for Hilbert space valued stochastic processes

David King ∗

School of Mathematics and Statistics, Arizona State University, Tempe, AZ 85287-1804, United States
Article history: Received 16 June 2009; received in revised form 11 November 2009; accepted 13 November 2009; available online 2 December 2009.

Abstract

Explicit formulas are derived for the congruence mappings that connect three Hilbert spaces associated with a second-order stochastic process. In particular, an insightful expression is obtained for the mapping that connects a process to its corresponding reproducing kernel Hilbert space. In addition, a useful infinite dimensional extension of a result from Khatri (1976) which pertains to cross-covariance operators is provided. © 2009 Elsevier B.V. All rights reserved.
MSC: Primary 62H20 62H25 62M99
1. Introduction

Let $\{X(t), t \in E\}$ be a zero-mean stochastic process with a finite index set $E = \{t_1, \ldots, t_n\}$. The set of all linear combinations of $X = (X(t_1), \ldots, X(t_n))'$ is isometrically isomorphic, or congruent, to the column space of the covariance matrix $\mathbf{K} = \{K(t_i, t_j)\}_{i,j=1}^{n}$, where $K$ is the covariance kernel defined by

$$K(s, t) = E[X(s)X(t)]. \tag{1}$$

The congruence mapping is determined uniquely by $\Psi(K(t, \cdot)) = X(t)$, $t \in E$, with the result that every linear combination $U$ of the $X$ vector with nonzero variance can be expressed as

$$U = \Psi(\mathbf{f}) = \mathbf{f}' \mathbf{K}^{\dagger} X \tag{2}$$

for some $\mathbf{f} \in \ker(\mathbf{K})^{\perp}$, with $\mathbf{K}^{\dagger}$ denoting the Moore–Penrose generalized inverse of $\mathbf{K}$. The congruence in (2) is a special case of the Loève–Parzen congruence that connects a second-order process to the reproducing kernel Hilbert space (RKHS) generated by its covariance kernel. In general, however, there is no simple closed form for the congruence mapping. This poses problems for the application of RKHS methods in areas such as functional data analysis (FDA; e.g., Eubank and Hsing, 2008).

Here we deal with a case of particular interest for FDA settings and show that (2) has a natural infinite dimensional extension for processes that take values in certain Hilbert function spaces. Let $E$ be a subset of $\mathbb{R}$ and $\nu$ a sigma-finite measure on $E$. We then consider the case where a zero-mean stochastic process $\{X(t), t \in E\}$ takes values in the Hilbert space $H = L^2(E)$ of square integrable functions on $E$ with inner product $(f, g)_H \equiv \int_E f(t) g(t)\, d\nu(t)$. Associated with $X(\cdot)$ is the covariance operator $S : H \mapsto H$ defined by
$$(f, Sg)_H \equiv \mathrm{Cov}\big((X, f)_H, (X, g)_H\big) = \int_H (X, f)_H (X, g)_H \, dP(X), \tag{3}$$

∗ Tel.: +1 480 862 2135. E-mail address: [email protected].

0167-7152/$ – see front matter © 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.spl.2009.11.011
for $f, g \in H$, with $P$ the probability measure induced by $X$ on $H$. It is well known (e.g., Laha and Rohatgi, 1979) that $S$ is positive, self-adjoint and Hilbert–Schmidt. As a result, $X$ admits a Karhunen–Loève expansion $X(\cdot) = \sum_{i=1}^{N} (X, \phi_i)_H \phi_i(\cdot)$, for $N = \mathrm{rank}(S)$ (possibly infinite) and $\{\phi_i\}_{i=1}^{N}$ the eigenvectors of $S$, whose corresponding eigenvalues $\{\lambda_i\}_{i=1}^{N}$ satisfy $E[(X, \phi_i)_H (X, \phi_j)_H] = (\phi_i, S\phi_j)_H = \lambda_i \delta_{ij}$, with $\delta_{ij}$ denoting the Kronecker delta. Alternatively, for $\lambda_i > 0$, we can write $X(\cdot) = \sum_{i=1}^{N} \sqrt{\lambda_i}\, \tilde{Z}_i \phi_i(\cdot)$ with

$$\tilde{Z}_i \equiv (X, \phi_i)_H / \sqrt{\lambda_i} \tag{4}$$
uncorrelated random variables with unit variances. The covariance kernel in (1) for $X(\cdot)$ generates a reproducing kernel Hilbert space $H(K)$ (see Berlinet and Thomas-Agnan, 2004) that is congruent to the Hilbert space $L^2_X$ of all linear functionals of the $X$ process. Specifically, $L^2_X$ contains all finite dimensional linear combinations of $\{X(t), t \in E\}$ and their limits under the inner product $(U, V)_{L^2_X} = E[UV]$, for $U, V \in L^2_X$, while $H(K) = \{f : f = \sum_{i=1}^{\infty} \lambda_i f_i \phi_i,\ \sum_{i=1}^{\infty} \lambda_i f_i^2 < \infty\}$. The congruence mapping $\Psi : H(K) \mapsto L^2_X$ is defined by requiring every finite dimensional linear combination $\sum_{i=1}^{n} a_i K(\cdot, t_i)$ to map to $\sum_{i=1}^{n} a_i X(t_i)$, for all $a_i \in \mathbb{R}$, $t_i \in E$, and $n \in \mathbb{N}$. An application of the integral representation theorem of Parzen (1961) then produces the following result.
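Theorem 1.1. Let $f(\cdot) = \sum_{i=1}^{N} \lambda_i f_i \phi_i(\cdot)$ be in $H(K)$. Then,

$$\Psi(f) = \sum_{i=1}^{N} f_i (X, \phi_i)_H \quad \text{and} \quad \Psi^{-1}\!\left(\sum_{i=1}^{N} f_i (X, \phi_i)_H\right) = \sum_{i=1}^{N} \lambda_i f_i \phi_i, \tag{5}$$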
with $\Psi^{-1} = \Psi^{*}$, where $\Psi^{*}$ denotes the adjoint of $\Psi$.

For the developments here we need a further congruence that connects the closure of the image of the square root of $S$, denoted as $S^{1/2}$, with $H(K)$.

Theorem 1.2 (Eubank and Hsing, 2008). The Hilbert spaces $\overline{\mathrm{Im}(S^{1/2})} = \ker(S)^{\perp}$ and $H(K)$ are congruent under the mapping $\Gamma : H \mapsto H(K)$ defined by

$$(\Gamma g)(\cdot) \equiv \sum_{i=1}^{N} \sqrt{\lambda_i}\, g_i \phi_i(\cdot), \tag{6}$$

where $g = \sum_{i=1}^{N} (g, \phi_i)\phi_i = \sum_{i=1}^{N} g_i \phi_i \in \ker(S)^{\perp}$. The inverse mapping

$$(\Gamma^{-1} f)(\cdot) \equiv \sum_{i=1}^{N} \sqrt{\lambda_i}\, f_i \phi_i(\cdot), \tag{7}$$

for $f = \sum_{i=1}^{N} \lambda_i f_i \phi_i(\cdot) \in H(K)$ is also the adjoint of $\Gamma$.
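As an illustration only, the congruences in Theorems 1.1 and 1.2 can be checked numerically in the finite index set setting of (2), where the operator $S$ reduces to a singular covariance matrix. The following NumPy sketch assumes a randomly generated rank-3 covariance matrix; the helper names `gamma` and `rkhs_inner` are hypothetical and do not come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy rank-3 covariance matrix on a 5-point index set (assumed example).
A = rng.standard_normal((5, 3))
K = A @ A.T

# Eigenpairs of K; keep only the strictly positive eigenvalues.
eigvals, eigvecs = np.linalg.eigh(K)
keep = eigvals > 1e-10 * eigvals.max()
lam, phi = eigvals[keep], eigvecs[:, keep]   # lam: (N,), phi: (5, N)

# Moore-Penrose inverse of K built from its eigendecomposition.
K_pinv = (phi / lam) @ phi.T

def gamma(g):
    """Finite analogue of (6): sum_i sqrt(lam_i) (g, phi_i) phi_i."""
    return phi @ (np.sqrt(lam) * (phi.T @ g))

def rkhs_inner(f1, f2):
    """Inner product of H(K) in the finite case: f1' K^+ f2."""
    return f1 @ K_pinv @ f2

# Two elements of ker(K)^perp, i.e. of the column space of K.
g1 = phi @ rng.standard_normal(lam.size)
g2 = phi @ rng.standard_normal(lam.size)

# Congruence property of Gamma: inner products are preserved.
lhs = rkhs_inner(gamma(g1), gamma(g2))
rhs = g1 @ g2

# Psi(f) = f' K^+ X has variance f' K^+ K K^+ f, which equals ||f||^2 in H(K).
f = gamma(g1)
var_psi_f = f @ K_pinv @ K @ K_pinv @ f
```

Under these assumptions `lhs` and `rhs` agree up to rounding, and `var_psi_f` matches the squared Euclidean norm of `g1`, which is the finite dimensional counterpart of the isometries stated above.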
Theorems 1.1 and 1.2 provide congruences that connect $L^2_X$ to $H(K)$ and $H(K)$ to $\ker(S)^{\perp}$. We can connect these mappings to obtain a congruence between $L^2_X$ and $\ker(S)^{\perp} \subseteq L^2(E)$. In the next section we use this to produce the natural extension of (2) to the case where $X$ is $L^2(E)$ valued.

2. Main results

First note that since (5)–(7) are congruences, the composition $\Omega \equiv \Psi \circ \Gamma$ is also a congruence mapping from $\ker(S)^{\perp}$ onto $L^2_X$. Thus, if we take $f = \sum_{i=1}^{N} f_i \phi_i \in \ker(S)^{\perp}$, then

$$\Omega(f) \equiv \Psi\!\left(\Gamma\!\left(\sum_{i=1}^{N} f_i \phi_i\right)\right) = \Psi\!\left(\sum_{i=1}^{N} \lambda_i \frac{f_i}{\sqrt{\lambda_i}} \phi_i\right) = \sum_{i=1}^{N} \frac{f_i}{\sqrt{\lambda_i}} (X, \phi_i)_{L^2(E)} = \sum_{i=1}^{N} f_i \tilde{Z}_i \tag{8}$$
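with $\tilde{Z}_i$ as defined in (4). However, if $f \in \mathrm{Im}(S^{1/2}) \subseteq \overline{\mathrm{Im}(S^{1/2})} = \ker(S)^{\perp}$, then $f = S^{1/2} g$ for some $g \in L^2(E)$, so we may further refine (8) to write

$$\Omega(f) = \begin{cases} (X, S^{1/2\dagger} f)_H & \text{for } f \in \mathrm{Im}(S^{1/2}), \\ \sum_{i=1}^{\infty} f_i \tilde{Z}_i & \text{whenever } f \in \overline{\mathrm{Im}(S^{1/2})} \setminus \mathrm{Im}(S^{1/2}) \text{ and } N = \infty, \end{cases} \tag{9}$$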
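with $S^{1/2\dagger}$ denoting the Moore–Penrose inverse of $S^{1/2}$. The inverse mapping $\Omega^{-1} : L^2_X \mapsto \ker(S)^{\perp}$ for any $U = \sum_{i=1}^{N} f_i (X, \phi_i)_H$ has the form

$$\Omega^{-1}(U) = \Gamma^{-1}\!\left(\Psi^{-1}\!\left(\sum_{i=1}^{N} f_i (X, \phi_i)_H\right)\right) = \sum_{i=1}^{N} \sqrt{\lambda_i}\, f_i \phi_i$$

with the coefficients $\{f_i\}$ satisfying $\sum_{i=1}^{N} \lambda_i f_i^2 < \infty$. However, in the special case that $U = (X, f)_H$ for some $f \in \ker(S)^{\perp}$, then $\Omega^{-1}(U) = S^{1/2} f$. Thus,

$$\Omega^{-1}(U) = \begin{cases} S^{1/2} f & \text{for } U = (X, f)_H \text{ with } f \in \ker(S)^{\perp}, \\ \sum_{i=1}^{\infty} \sqrt{\lambda_i}\, f_i \phi_i & \text{whenever } N = \infty \text{ and } U \in L^2_X \setminus Z, \end{cases} \tag{10}$$

where $Z$ is the subspace of $L^2_X$ which is expressible as an $L^2(E)$ inner product defined by

$$Z \equiv \{(X, f)_H : f \in L^2(E)\}. \tag{11}$$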
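A finite dimensional sketch may help fix ideas about (9)–(11). It assumes that $S$ is replaced by a randomly generated singular covariance matrix, so that every linear functional of $X$ is a weighted sum $w'X$ with variance $w'Sw$; the function name `omega_weights` is hypothetical and used only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy singular covariance matrix standing in for S (rank 2 on R^4).
B = rng.standard_normal((4, 2))
S = B @ B.T

vals, vecs = np.linalg.eigh(S)
keep = vals > 1e-10 * vals.max()
lam, phi = vals[keep], vecs[:, keep]

S_half = (phi * np.sqrt(lam)) @ phi.T        # S^{1/2}
S_half_pinv = (phi / np.sqrt(lam)) @ phi.T   # Moore-Penrose inverse of S^{1/2}
P = phi @ phi.T                              # projector onto ker(S)^perp

def omega_weights(f):
    """Weights w with Omega(f) = w' X, following the first case of (9)."""
    return S_half_pinv @ f

f = rng.standard_normal(4)
f1, f2 = P @ f, f - P @ f                    # f1 in ker(S)^perp, f2 in ker(S)

# Omega is an isometry on ker(S)^perp: Var(Omega(f1)) = ||f1||^2.
w = omega_weights(f1)
var_omega = w @ S @ w

# Omega(S^{1/2} f) equals (X, f1): the two weight vectors coincide.
w_back = omega_weights(S_half @ f)

# (X, f2) has zero variance, so (X, f) and (X, f1) agree with probability 1.
var_kernel_part = f2 @ S @ f2
```

In this toy setting `w_back` reproduces `f1`, which mirrors the first case of (10), and the vanishing variance of `(X, f2)` is the reason the elements of $Z$ can be indexed by $\ker(S)^{\perp}$ alone.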
The subspace $Z$ was discussed in Kupressanin (2008), and an extension of Theorem 3.38, p. 43, of that thesis is provided below.

Theorem 2.1. The following relations hold:
(i) $Z = \{(X, f)_{L^2(E)} \mid f \in \ker(S)^{\perp}\}$,
(ii) $Z = \Omega(\mathrm{Im}(S^{1/2}))$,
(iii) $\overline{Z} = L^2_X$, where $\overline{Z}$ denotes the closure of $Z$ in $L^2_X$.

Proof. To see (i), observe that for any $f \in \ker(S)$, $\mathrm{Var}[(X, f)_H] = (f, Sf)_H = 0$. Thus $(X, f)_H = 0$ with probability 1. Consequently, the orthogonal decomposition $f = f_1 + f_2$, with $f_1 \in \ker(S)^{\perp}$ and $f_2 \in \ker(S)$, ensures that $(X, f)_H = (X, f_1)_H$ with probability 1. For (ii), let $U \in \Omega(\mathrm{Im}(S^{1/2}))$. Then, for $f = f_1 + f_2 \in L^2(E)$ with $f_1 \in \ker(S)^{\perp}$ and $f_2 \in \ker(S)$, we have $U = \Omega(S^{1/2} f) = (X, S^{1/2\dagger} S^{1/2} f_1)_H + 0 = (X, f_1)_H$. This shows that $\Omega(\mathrm{Im}(S^{1/2})) \subset Z$. Conversely, if $U = (X, f)_H \in Z$ then $\Omega^{-1}(U) = S^{1/2} f$ by (10), which implies that $Z = \Omega(\mathrm{Im}(S^{1/2}))$. Finally, for (iii), since $\mathrm{Im}(S^{1/2})$ is a dense subspace of $\ker(S)^{\perp}$, the congruence property of $\Omega$ ensures that $Z = \Omega(\mathrm{Im}(S^{1/2}))$ is a dense subspace of $L^2_X$.

One application of Theorem 2.1 in the FDA setting is seen in the definition of functional canonical correlation as offered by He et al. (2002) or by Ramsay and Silverman (2005). In these works, if $X_1$ and $X_2$ are two $L^2(E)$ valued stochastic processes, then the squared principal canonical correlation $\rho_1^2$ is defined as the solution to the optimization problem

$$\rho_1^2 \equiv \sup_{f \in \mathrm{Im}(S_1^{1/2}),\ g \in \mathrm{Im}(S_2^{1/2})} \mathrm{Corr}^2[(X_1, f)_{L^2(E)}, (X_2, g)_{L^2(E)}], \tag{12}$$
where $\{S_1, S_2\}$ are defined in the same fashion as (3) with $X$ replaced by $\{X_1, X_2\}$. Using the isometries $\{\Omega_1, \Omega_2\}$ we see that under this definition

$$\rho_1^2 = \sup_{U \in Z_1,\ V \in Z_2} \mathrm{Corr}^2[U, V],$$

with $Z_i \equiv \{(X_i, h) : h \in L^2(E)\}$ for $i = 1, 2$. Consequently, the optimization problem in (12) takes place over the dense subspaces $Z_1 \subseteq L^2_{X_1}$ and $Z_2 \subseteq L^2_{X_2}$. By way of contrast, Eubank and Hsing (2008) define $\rho_1^2$ by

$$\rho_1^2 = \sup_{U \in L^2_{X_1},\ V \in L^2_{X_2}} \mathrm{Corr}^2[U, V]$$
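and so this is a more complete definition.

We can now utilize the formula for $\Omega$ in (9) to obtain the desired extension of (2).

Theorem 2.2. For any $\tilde{f} = \sum_{i=1}^{N} \lambda_i f_i \phi_i \in H(K)$,

$$\Psi(\tilde{f}) = \begin{cases} (X, S^{\dagger} \tilde{f})_H & \text{whenever } \tilde{f} \in \Gamma(\mathrm{Im}(S^{1/2})), \\ \sum_{i=1}^{\infty} \sqrt{\lambda_i}\, f_i \tilde{Z}_i & \text{for } N = \infty \text{ and } \tilde{f} \in \Gamma\big(\overline{\mathrm{Im}(S^{1/2})} \setminus \mathrm{Im}(S^{1/2})\big). \end{cases}$$

The inverse mapping for any $U = (X, f)_H = \sum_{i=1}^{N} f_i (X, \phi_i)_H$ with $f \in \ker(S)^{\perp}$ is $\Psi^{-1}(U) = Sf$.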
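Proof. Observe that $\Psi = \Omega \Gamma^{-1}$. Thus, for any $\tilde{f} = \sum_{i=1}^{N} \lambda_i f_i \phi_i \in H(K)$ it follows that

$$\Psi(\tilde{f}) = \Omega(\Gamma^{-1}(\tilde{f})) = \Omega\!\left(\sum_{i=1}^{N} (\sqrt{\lambda_i}\, f_i) \phi_i\right) = \sum_{i=1}^{N} (\sqrt{\lambda_i}\, f_i) \tilde{Z}_i.$$

In the special case that $\tilde{f} = \Gamma(f)$ for $f \in \mathrm{Im}(S^{1/2})$,

$$\Psi(\tilde{f}) = \Omega(\Gamma^{-1}(\tilde{f})) = (X, S^{1/2\dagger} \Gamma^{-1} \tilde{f})_{L^2(E)} = (X, S^{\dagger} \tilde{f})_{L^2(E)}.$$

Similarly, $\Psi^{-1} = \Gamma \Omega^{-1}$. So, for any $U = (X, f)_H \in L^2_X$, with $f = \sum_{i=1}^{N} f_i \phi_i \in \ker(S)^{\perp}$,

$$\Psi^{-1}(U) = \Gamma\!\left(\Omega^{-1}\!\left(\sum_{i=1}^{N} f_i (X, \phi_i)_H\right)\right) = \Gamma\!\left(\sum_{i=1}^{N} f_i (S^{1/2} \phi_i)\right) = Sf.$$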
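The two closed forms in Theorem 2.2 can be sanity-checked in a finite dimensional analogue. The sketch below assumes a random rank-3 covariance matrix in place of $S$ and represents each element of $L^2_X$ by its weight vector $w$ in $w'X$; the names `psi_weights` and `psi_inverse` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy rank-3 covariance matrix on R^6 standing in for S (assumed example).
B = rng.standard_normal((6, 3))
S = B @ B.T

vals, vecs = np.linalg.eigh(S)
keep = vals > 1e-10 * vals.max()
lam, phi = vals[keep], vecs[:, keep]

S_pinv = (phi / lam) @ phi.T                 # Moore-Penrose inverse of S
S_half_pinv = (phi / np.sqrt(lam)) @ phi.T   # Moore-Penrose inverse of S^{1/2}

def psi_weights(f_tilde):
    """Weights w such that Psi(f_tilde) = w' X, i.e. w = S^+ f_tilde."""
    return S_pinv @ f_tilde

def psi_inverse(w):
    """Psi^{-1}(w' X) = E[(w' X) X] = S w."""
    return S @ w

f = phi @ rng.standard_normal(lam.size)      # f in ker(S)^perp
f_tilde = S @ f                              # corresponding element of H(K)

# Direct formula (X, S^+ f_tilde) versus the composition Omega(Gamma^{-1} f_tilde),
# which in this setting applies the pseudo-inverse of S^{1/2} twice.
w_direct = psi_weights(f_tilde)
w_composed = S_half_pinv @ (S_half_pinv @ f_tilde)

# Mapping back with Psi^{-1} should return the original element of H(K).
round_trip = psi_inverse(w_direct)
```

Here `w_direct` recovers `f`, so that $\Psi(Sf) = (X, f)_H$, and `round_trip` recovers `f_tilde`, in line with $\Psi^{-1}(U) = Sf$.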
Now notice that there are actually simple representations for $\Gamma$, $\Omega$ and $\Psi$. To see this, note that $\{\tilde{Z}_i \equiv (X, \phi_i)_H / \sqrt{\lambda_i} = \Omega(\phi_i)\}_{i=1}^{N}$ and $\{\tilde{\phi}_i \equiv \sqrt{\lambda_i}\, \phi_i = \Gamma \phi_i\}_{i=1}^{N}$ are complete orthonormal systems for $L^2_X$ and $H(K)$, respectively. Thus, $\Gamma = \sum_{i=1}^{N} \phi_i \otimes_{L^2(E)} \tilde{\phi}_i$, $\Psi = \sum_{i=1}^{N} \tilde{\phi}_i \otimes_{H(K)} \tilde{Z}_i$ and $\Omega = \sum_{i=1}^{N} \phi_i \otimes_{L^2(E)} \tilde{Z}_i$, where the bilinear operation $\otimes_{H_1}$ is defined in such a way that for any abstract Hilbert spaces $H_1$, $H_2$ with $e, f \in H_1$ and $g \in H_2$, $(e \otimes_{H_1} g) f \equiv (e, f)_{H_1} g$.

For the second major result suppose $X_1$ and $X_2$ are two $L^2(E)$ valued stochastic processes with cross-covariance operator $S_{12}$ defined by $(f, S_{12} g)_{L^2(E)} = \mathrm{Cov}[(X_1, f)_{L^2(E)}, (X_2, g)_{L^2(E)}]$. Now let $L_i : L^2(E) \mapsto L^2_{X_i}$ be given by $L_i(f) = (X_i, f)_{L^2(E)}$ for $i = 1, 2$. The adjoint $L_i^{*}$ for any $U \in L^2_{X_i}$ is given by $L_i^{*}(U) = E[U X_i]$ (see Dauxois et al., 2004). Notice that for $i, j = 1, 2$ the operators $L_i$ and $L_j$ satisfy
$$\mathrm{Cov}[L_i f, L_j g] = E[(f, X_i)_{L^2(E)} (g, X_j)_{L^2(E)}] = (f, S_{ij} g)_{L^2(E)}.$$

It follows that $(f, L_i^{*} L_j g)_{L^2(E)} = (f, S_{ij} g)_{L^2(E)}$ and so $S_{ij}$ admits the factorization $S_{ij} = L_i^{*} L_j$. Observe also that since $\sum_{k=1}^{N} (L_i \phi_k, L_i \phi_k)_{L^2_{X_i}} = \mathrm{trace}(S_i) < \infty$, the $L_i$ are Hilbert–Schmidt and so cannot be surjective when $N = \infty$. We now obtain our second major result.

Theorem 2.3. Let $L_i$ and $S_{ij}$ be as defined above with $S_i = S_{ii}$ for $i, j = 1, 2$. Then

$$S_1 S_1^{\dagger} S_{12} = S_{12} \quad \text{and} \quad S_2 S_2^{\dagger} S_{21} = S_{21}. \tag{13}$$
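Proof. Observe that

$$S_1 S_1^{\dagger} S_{12} = (L_1^{*} L_1)(L_1^{*} L_1)^{\dagger} (L_1^{*} L_2) = L_1^{*} L_2 = S_{12}$$

holds in both finite and infinite dimensions; the second identity follows by interchanging the indices.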
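In the finite dimensional case treated by Khatri (1976), the identities in (13) are easy to verify numerically. The sketch below assumes two random vectors $X_1 = A_1 Z$ and $X_2 = A_2 Z$ driven by a common latent vector $Z$ with identity covariance, so that both covariance matrices are singular; the helper `sym_pinv` is a hypothetical utility written for this illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

# X1 = A1 Z and X2 = A2 Z share the latent vector Z; A1 and A2 are rank deficient.
A1 = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))   # rank 2
A2 = rng.standard_normal((6, 3)) @ rng.standard_normal((3, 4))   # rank 3

S1, S2 = A1 @ A1.T, A2 @ A2.T    # singular covariance matrices of X1 and X2
S12 = A1 @ A2.T                  # cross-covariance Cov(X1, X2)
S21 = S12.T

def sym_pinv(M, rel_tol=1e-10):
    """Moore-Penrose inverse of a symmetric PSD matrix via its eigendecomposition."""
    vals, vecs = np.linalg.eigh(M)
    keep = vals > rel_tol * vals.max()
    return (vecs[:, keep] / vals[keep]) @ vecs[:, keep].T

# Left-hand sides of the two identities in (13).
lhs1 = S1 @ sym_pinv(S1) @ S12
lhs2 = S2 @ sym_pinv(S2) @ S21
```

The products `S1 @ sym_pinv(S1)` and `S2 @ sym_pinv(S2)` are the orthogonal projectors onto the column spaces of `S1` and `S2`, and the cross-covariances already lie in those spaces, which is why `lhs1` and `lhs2` reproduce `S12` and `S21`.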
Theorem 2.3 can be viewed as an infinite dimensional extension of the result from Khatri (1976).

3. Example

To illustrate the developments in Section 2, let $\{X(t), t \in [0, 1]\}$ correspond to a Brownian motion process; i.e., $X(\cdot)$ is a normal process with $X(0) = 0$ and covariance kernel $K(s, t) = E[X(s)X(t)] = \min(s, t)$ for $s, t \in [0, 1]$. The covariance operator corresponds to the $L^2[0, 1]$ integral operator $(Sf)(t) = \int_0^1 K(s, t) f(s)\, ds = \int_0^t s f(s)\, ds + t \int_t^1 f(s)\, ds$, for which the eigenvalues and eigenfunctions are

$$\left\{\lambda_n = \left(\frac{1}{(n - \frac{1}{2})\pi}\right)^{2}\right\}_{n=1}^{\infty} \quad \text{and} \quad \left\{\phi_n(t) = \sqrt{2} \sin\!\left(\left(n - \tfrac{1}{2}\right)\pi t\right)\right\}_{n=1}^{\infty}$$
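for $t \in [0, 1]$. We may now utilize (6), (8) and Theorem 2.2 to develop formulae for $\Gamma$, $\Omega$ and $\Psi$. First, utilizing (6) we arrive at

$$(\Gamma f)(t) = \sum_{n=1}^{\infty} \sqrt{2}\, f_n \frac{1}{(n - \frac{1}{2})\pi} \sin\!\left(\left(n - \tfrac{1}{2}\right)\pi t\right), \quad \text{for all } t \in [0, 1], \tag{14}$$

with $f_n = (f, \phi_n)_H = \sqrt{2} \int_0^1 \sin\!\left(\left(n - \tfrac{1}{2}\right)\pi t\right) f(t)\, dt$ the Fourier coefficient for $f \in \ker(S)^{\perp} = L^2[0, 1]$. The isometry $\Omega$ applied to any $f = \sum_{n=1}^{\infty} (f, \phi_n)_H \phi_n = \sum_{n=1}^{\infty} f_n \phi_n$ produces

$$\Omega f = (X, S^{-1/2} f)_H = \sum_{n=1}^{\infty} f_n \tilde{Z}_n, \tag{15}$$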
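with $\tilde{Z}_n = (\sqrt{2}/\sqrt{\lambda_n}) \int_0^1 X(t) \sin\!\left(\left(n - \tfrac{1}{2}\right)\pi t\right) dt$ standard normal random variables. Finally, the isometry $\Psi$ applied to any $\tilde{f} = \sum_{n=1}^{\infty} \lambda_n f_n \phi_n \in H(K)$ with $\sum_{n=1}^{\infty} f_n^2 / \lambda_n < \infty$ results in

$$\Psi \tilde{f} = (X, S^{\dagger} \tilde{f})_H = \sum_{n=1}^{\infty} \frac{f_n}{(n - \frac{1}{2})\pi} \tilde{Z}_n. \tag{16}$$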
References

Berlinet, A., Thomas-Agnan, C., 2004. Reproducing Kernel Hilbert Spaces in Probability and Statistics. Kluwer Academic Publishers, Boston.
Dauxois, J., Nkiet, G.M., Romain, Y., 2004. Canonical analysis relative to a closed subspace. Linear Algebra and its Applications 388, 119–145.
Eubank, R.L., Hsing, T., 2008. Canonical correlation for stochastic processes. Stochastic Processes and their Applications 118, 1634–1661.
He, G., Müller, H., Wang, J., 2002. Methods of canonical analysis for functional data. Journal of Statistical Planning and Inference 122, 141–159.
Khatri, C.G., 1976. A note on multiple and canonical correlation for a singular covariance matrix. Psychometrika 41, 465–470.
Kupressanin, A., 2008. Topics in functional canonical correlation and regression. Ph.D. Thesis, Dept. Math and Stats., Arizona State University.
Laha, R., Rohatgi, V., 1979. Probability Theory. Wiley, New York.
Parzen, E., 1961. An approach to time series analysis. Annals of Mathematical Statistics 32, 951–989.
Ramsay, J.O., Silverman, B.W., 2005. Functional Data Analysis. Springer, New York.