An example of inconsistent MLE of spatial covariance parameters under increasing domain asymptotics

Statistics and Probability Letters 120 (2017) 108–113. http://dx.doi.org/10.1016/j.spl.2016.10.002

Tonglin Zhang, Department of Statistics, Purdue University, 250 North University Street, West Lafayette, IN 47907-2066, United States. E-mail address: [email protected]

Article history: Received 15 August 2014; Received in revised form 8 October 2014; Accepted 1 October 2016; Available online 11 October 2016.

Keywords: Fixed domain and increasing domain asymptotics; Gaussian random field; Inconsistency; Maximum likelihood estimator

Abstract: Asymptotic properties of estimators of covariance parameters in spatial statistics are commonly studied under two frameworks: increasing domain asymptotics and fixed domain asymptotics. Although inconsistency is the usual conclusion under fixed domain asymptotics, it is widely believed that consistency holds under increasing domain asymptotics. This article provides an example in which the maximum likelihood estimator (MLE) of the covariance parameter is inconsistent even under increasing domain asymptotics. Consistency can therefore fail under the increasing domain framework as well. © 2016 Published by Elsevier B.V.

1. Introduction

Spatial data are typically dependent: observations from nearby sites tend to be more similar than observations from distant sites. In practice, the dependence is usually modeled by a parametric covariance function in a Gaussian random field, and a likelihood-based approach is then used to estimate the parameters of the covariance function. For inference, the asymptotic properties of the maximum likelihood estimator (MLE) of these parameters are important. The derivation of these properties is complicated, however, by the existence of two different asymptotic frameworks: fixed domain and increasing domain asymptotics (Cressie, 1993, p. 350), where fixed domain asymptotics is also called infill asymptotics. Under fixed domain asymptotics, the study region is fixed and bounded, and the sampling locations become increasingly dense within it. Under increasing domain asymptotics, the study region grows without bound, and the minimum distance between sampled locations is bounded below by a positive constant. It is not surprising that the asymptotic behavior of the MLE of the covariance parameters differs markedly between the two frameworks. For example, it is known that the MLE is inconsistent under infill asymptotics (Du et al., 2009; Lahiri, 1996; Stein, 1999; Ying, 1991; Zhang, 2004). In contrast, it is believed that these parameters should be consistently estimable under increasing domain asymptotics, since it has been shown that, under regularity conditions, their MLEs are consistent and asymptotically normal (Mardia and Marshall, 1983). Consistency is therefore commonly taken to be the general conclusion under increasing domain asymptotics.

In this article, we provide an example in which the MLE is inconsistent even under increasing domain asymptotics. We derive a closed form expression for the likelihood function and show that the difference between the likelihood functions at distinct parameter values converges in distribution to a nondegenerate random variable, which can be larger than any positive value with positive probability.


Therefore, the consistency of the MLE does not hold. The example implies that consistency may still be a concern under the framework of increasing domain asymptotics.

2. The example

The basic model in spatial statistics can be generally expressed as

Y(s) = µ(s) + δ(s) + ϵ(s),  s ∈ D ⊆ R^d,  (1)

where µ(s) is the mean function, δ(s) is a spatially correlated process, and ϵ(s) is a white noise error process. The spatially correlated process δ(s) is commonly modeled by a stationary Gaussian process with E(δ(s)) = 0, V(δ(s)) = σ^2, and corr(δ(s), δ(s + h)) = c(h). The error process ϵ(s) is often assumed i.i.d. N(0, τ^2). Under these two assumptions, we have E(Y(s)) = µ(s), V(Y(s)) = σ^2 + τ^2, and Cov(Y(s), Y(s + h)) = σ^2 c(h) if h ≠ 0. In Model (1), σ^2 and τ^2 describe the strength of the spatial correlation and the nugget effect, respectively, where the nugget effect disappears if τ^2 = 0. Because the correlation function c(h) must satisfy a positive-definiteness condition, it is common to specify a stationary parametric model as

c(h) = c(h; θ) = corr(δ(s), δ(s + h)),  θ ∈ Θ ⊆ R^q.  (2)

There are many parametric models proposed for c(h; θ) in the literature. Among them, the most popular is the Matérn correlation family (Matérn, 1986), which includes the exponential correlation function as a special case, where q = 1 and c(h; θ) = e^{−θ∥h∥}, θ > 0. In this article, we construct an example in which the spatial correlation function is the exponential correlation function. In the example, we assume µ(s) = 0, σ^2 = 1, and τ^2 = 0, so that E(Y(s)) = 0, V(Y(s)) = 1, and corr(Y(s), Y(s + h)) = e^{−θ∥h∥} in Model (1). We show that the MLE of θ is inconsistent in this example. Since θ ∈ R^+ is the only parameter in the model, we use ρ = e^{−θ} to represent the unknown parameter instead. We assume ρ ∈ [0, 1), where ρ = 0 corresponds to letting θ → ∞. The consistency problem is equivalent whether the model is parameterized by θ or by ρ. Suppose spatial data are observed at irregularly spaced locations s_1, . . . , s_n. Let Y_i = Y(s_i) and y = (Y_1, . . . , Y_n). Then,

y ∼ N(0, R_ρ),  (3)

where R_ρ = (ρ^{∥s_i − s_j∥})_{n×n} is the covariance (also the correlation) matrix of y. The loglikelihood function of ρ is

ℓ(ρ) = −(n/2) log(2π) − (1/2) log|det(R_ρ)| − (1/2) y^T R_ρ^{−1} y.  (4)
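For concreteness, the following Python sketch (our own illustration, not part of the paper; it assumes numpy and scipy are available, and the function names are ours) simulates y ∼ N(0, R_ρ) at given one-dimensional locations and evaluates ℓ(ρ) in (4) through a Cholesky factorization of R_ρ.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def corr_matrix(s, rho):
    """Exponential correlation matrix R_rho with entries rho**|s_i - s_j|."""
    d = np.abs(s[:, None] - s[None, :])
    return rho ** d

def loglik(y, s, rho):
    """Log-likelihood (4) of rho for y ~ N(0, R_rho)."""
    R = corr_matrix(s, rho)
    c, low = cho_factor(R)
    logdet = 2.0 * np.sum(np.log(np.diag(c)))   # log det via the Cholesky factor
    quad = y @ cho_solve((c, low), y)           # y^T R^{-1} y without forming the inverse
    n = len(y)
    return -0.5 * (n * np.log(2 * np.pi) + logdet + quad)

rng = np.random.default_rng(0)
n, rho0 = 12, 0.6
s = 2.0 ** np.arange(n)                         # the doubling design s_i = 2^{i-1}
y = rng.multivariate_normal(np.zeros(n), corr_matrix(s, rho0))
print(loglik(y, s, rho0))
```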

The MLE of ρ, denoted by ρ̂, is obtained by maximizing ℓ(ρ). The asymptotic behavior of ρ̂ depends on the configuration of S = {s_1, . . . , s_n} as n → ∞. It has been shown that ρ̂ is consistent if s_i = i with d = 1 (Mardia and Marshall, 1983). Here, we consider another choice of S, in which s_i = 2^{i−1} for i = 1, . . . , n. We show that ρ̂ is an inconsistent estimator of ρ as n → ∞, which is enough to conclude that the MLE of θ is also an inconsistent estimator of θ in this case. We focus on the proof of the inconsistency of ρ̂ under S = {2^{i−1} : i = 1, . . . , n} in the remainder of this article. The (i, j)th entry of R_ρ can be represented as

r_{i,j}(ρ) = ρ^{2^{j−1} − 2^{i−1}},  1 ≤ i ≤ j ≤ n,  (5)

and r_{j,i}(ρ) = r_{i,j}(ρ) if i > j. If n > 1, then the determinant of R_ρ is

det(R_ρ) = ∏_{i=1}^{n−1} (1 − ρ^{2^i}).

Let V_ρ = R_ρ^{−1}. Then, the (i, j)th entry of V_ρ can be expressed as

v_{1,1}(ρ) = 1/(1 − ρ^2),
v_{i,i}(ρ) = ρ^{2^{i−1}}/(1 − ρ^{2^{i−1}}) + 1/(1 − ρ^{2^i}),  1 < i < n,
v_{n,n}(ρ) = 1/(1 − ρ^{2^{n−1}}),
v_{i,i+1}(ρ) = v_{i+1,i}(ρ) = −ρ^{2^{i−1}}/(1 − ρ^{2^i}),  1 ≤ i < n,  (6)

and v_{i,j}(ρ) = 0 if |i − j| ≥ 2.
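As a quick sanity check on (6) (our own illustration, assuming numpy; not from the paper), one can assemble the tridiagonal matrix from the closed-form entries and confirm numerically that it inverts R_ρ.

```python
import numpy as np

n, rho = 9, 0.75
s = 2.0 ** np.arange(n)                              # s_i = 2^{i-1}
R = rho ** np.abs(s[:, None] - s[None, :])

V = np.zeros((n, n))                                 # assemble V_rho from (6)
V[0, 0] = 1.0 / (1.0 - rho ** 2)
for i in range(2, n):                                # paper index i with 1 < i < n
    V[i - 1, i - 1] = (rho ** (2.0 ** (i - 1)) / (1.0 - rho ** (2.0 ** (i - 1)))
                       + 1.0 / (1.0 - rho ** (2.0 ** i)))
V[n - 1, n - 1] = 1.0 / (1.0 - rho ** (2.0 ** (n - 1)))
for i in range(1, n):                                # paper index i with 1 <= i < n
    V[i - 1, i] = V[i, i - 1] = -rho ** (2.0 ** (i - 1)) / (1.0 - rho ** (2.0 ** i))

print(np.allclose(V @ R, np.eye(n)))                 # V_rho R_rho = I
```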

If n > 1, then the loglikelihood function of ρ is

ℓ(ρ) = −(n/2) log(2π) − (1/2) Σ_{i=1}^{n−1} log(1 − ρ^{2^i}) − (1/2) y^T V_ρ y
     = −(n/2) log(2π) − (1/2) Σ_{i=1}^{n−1} log(1 − ρ^{2^i}) − (1/2) [Σ_{i=1}^{n−1} (Y_i − ρ^{2^{i−1}} Y_{i+1})^2/(1 − ρ^{2^i}) + Y_n^2].  (7)
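The closed forms (5)–(7) are easy to check numerically. The sketch below (our own illustration, assuming numpy; variable names are ours) verifies, for the design s_i = 2^{i−1}, that the determinant product and the closed-form loglikelihood (7) agree with the generic expressions computed from the dense matrix R_ρ; the identities hold for an arbitrary test vector y.

```python
import numpy as np

rng = np.random.default_rng(1)
n, rho = 10, 0.7
s = 2.0 ** np.arange(n)                       # s_i = 2^{i-1}
R = rho ** np.abs(s[:, None] - s[None, :])
y = rng.standard_normal(n)                    # any vector works for these identities

# determinant: det(R_rho) = prod_{i=1}^{n-1} (1 - rho^{2^i})
det_closed = np.prod(1.0 - rho ** (2.0 ** np.arange(1, n)))
print(np.allclose(np.linalg.det(R), det_closed))

# closed-form loglikelihood (7) versus the generic expression (4)
p = rho ** (2.0 ** np.arange(1, n))           # p[i-1] = rho^{2^i}
q = rho ** (2.0 ** np.arange(0, n - 1))       # q[i-1] = rho^{2^{i-1}}
ll7 = (-0.5 * n * np.log(2 * np.pi)
       - 0.5 * np.sum(np.log(1.0 - p))
       - 0.5 * (np.sum((y[:-1] - q * y[1:]) ** 2 / (1.0 - p)) + y[-1] ** 2))
ll4 = (-0.5 * n * np.log(2 * np.pi)
       - 0.5 * np.log(np.linalg.det(R))
       - 0.5 * y @ np.linalg.solve(R, y))
print(np.allclose(ll7, ll4))
```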

Lemma 1. Let ρ_0 be the true parameter of ρ in Model (3) with s_i = 2^{i−1} for i = 1, . . . , n. Assume ρ_0 ∈ (0, 1). Define Z_i = (Y_i − ρ_0^{2^{i−1}} Y_{i+1})/√(1 − ρ_0^{2^i}) for i = 1, . . . , n − 1 and Z_n = Y_n. Then, Z_1, . . . , Z_n are i.i.d. N(0, 1).

Proof. It is clear that z = (Z_1, . . . , Z_n) is a multivariate normal random vector with mean zero. Therefore, it is enough to show that the covariance matrix of z is the identity matrix. Straightforwardly, for any 1 ≤ i < n, we have

V(Z_i) = [V(Y_i) − 2ρ_0^{2^{i−1}} Cov(Y_i, Y_{i+1}) + ρ_0^{2^i} V(Y_{i+1})]/(1 − ρ_0^{2^i}) = 1

and

Cov(Z_i, Z_n) = [Cov(Y_i, Y_n) − ρ_0^{2^{i−1}} Cov(Y_{i+1}, Y_n)]/√(1 − ρ_0^{2^i}) = [ρ_0^{2^{n−1} − 2^{i−1}} − ρ_0^{2^{i−1}} ρ_0^{2^{n−1} − 2^i}]/√(1 − ρ_0^{2^i}) = 0.

For any 1 ≤ i < j < n, there is

Cov(Z_i, Z_j) = [Cov(Y_i, Y_j) − ρ_0^{2^{i−1}} Cov(Y_{i+1}, Y_j) − ρ_0^{2^{j−1}} Cov(Y_i, Y_{j+1}) + ρ_0^{2^{i−1} + 2^{j−1}} Cov(Y_{i+1}, Y_{j+1})]/√[(1 − ρ_0^{2^i})(1 − ρ_0^{2^j})]
= [ρ_0^{2^{j−1} − 2^{i−1}} − ρ_0^{2^{i−1}} ρ_0^{2^{j−1} − 2^i} − ρ_0^{2^{j−1}} ρ_0^{2^j − 2^{i−1}} + ρ_0^{2^{i−1} + 2^{j−1}} ρ_0^{2^j − 2^i}]/√[(1 − ρ_0^{2^i})(1 − ρ_0^{2^j})] = 0.

These are enough to conclude that the covariance matrix of z is the identity matrix. □

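The whitening in Lemma 1 can also be checked numerically (our own illustration, assuming numpy): build the bidiagonal matrix L that maps y to z and verify that the covariance of z, namely L R_{ρ_0} L^T, is the identity matrix.

```python
import numpy as np

n, rho0 = 8, 0.8
s = 2.0 ** np.arange(n)
R = rho0 ** np.abs(s[:, None] - s[None, :])

# Z_i = (Y_i - rho0^{2^{i-1}} Y_{i+1}) / sqrt(1 - rho0^{2^i}) for i < n, Z_n = Y_n
L = np.zeros((n, n))
for i in range(n - 1):                      # code index i corresponds to paper index i+1
    denom = np.sqrt(1.0 - rho0 ** (2.0 ** (i + 1)))
    L[i, i] = 1.0 / denom
    L[i, i + 1] = -rho0 ** (2.0 ** i) / denom
L[n - 1, n - 1] = 1.0

print(np.allclose(L @ R @ L.T, np.eye(n)))  # covariance of z = L y is the identity
```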
Lemma 2. Let

S_n = Σ_{i=1}^{n} Σ_{j=1}^{n} a_{i,j} Z_i Z_j,

where a_{i,j} depends only on (i, j), a_{1,1} > 0, and Z_1, . . . , Z_n are i.i.d. N(0, 1). If there exists a positive C such that

|a_{i,j}| ≤ C/(i + j)^5,  (8)

then for any M ∈ R there exists a positive constant η such that for any n,

P(S_n ≥ M) ≥ η.  (9)

Proof. Let φ and Φ be the PDF and CDF of N(0, 1), respectively. There is

lim_{x→∞} 2[1 − Φ(x)]/φ(x) = lim_{x→∞} 2/x = 0.

Then, there is a positive constant a such that P(|Z_i| ≥ x) ≤ φ(x) for any x ≥ a. Without loss of generality, we can choose a ≥ 2. Let

S̃_n = Σ_{i=1}^{n} Σ_{j=1}^{n} |a_{i,j}| i j.


Then,

S̃_n ≤ Σ_{i=1}^{n} Σ_{j=1}^{n} C i j/(i + j)^5 ≤ Σ_{i=1}^{n} Σ_{j=1}^{n} C/(i + j)^3,

which implies that S̃_n converges as n → ∞. Let S̃_{m,n} = S̃_n − S̃_m for m ≤ n. Then, we can find an m_0 ≥ a such that S̃_{m,n} ≤ 1 for any n > m ≥ m_0. Then,

P(S_n ≥ M) ≥ P(S_n ≥ M, |Z_{m_0+1}| ≤ m_0 + 1, . . . , |Z_n| ≤ n)

≥ P(S_{m_0} ≥ M + S̃_{m_0,n}, |Z_{m_0+1}| ≤ m_0 + 1, . . . , |Z_n| ≤ n)
≥ P(S_{m_0} ≥ M + 1, |Z_{m_0+1}| ≤ m_0 + 1, . . . , |Z_n| ≤ n)
= P(S_{m_0} ≥ M + 1) ∏_{i=m_0+1}^{n} P(|Z_i| ≤ i)
≥ P(S_{m_0} ≥ M + 1) ∏_{i=m_0+1}^{n} [1 − φ(i)]
≥ P(S_{m_0} ≥ M + 1) [1 − Σ_{i=m_0+1}^{n} φ(i)]
≥ P(S_{m_0} ≥ M + 1)(1 − b),

where b = Σ_{i=m_0+1}^{∞} φ(i) satisfies 0 < b < 1. It only remains to show that P(S_{m_0} ≥ M + 1) > 0 for any M ∈ R. Since m_0 is fixed and a_{i,j} does not depend on n, the distribution of S_{m_0} does not depend on n as n → ∞, so P(S_{m_0} ≥ M + 1) is a constant. Based on the properties of the multivariate normal distribution, it is enough to show that the event {S_{m_0} ≥ M + 1} is nonempty for any M ∈ R. If we choose Z_2 = Z_3 = · · · = Z_{m_0} = 0, then S_{m_0} = a_{1,1} Z_1^2, and since a_{1,1} > 0 we can find a value of Z_1 such that S_{m_0} ≥ M + 1. Therefore, we can choose η = P(S_{m_0} ≥ M + 1)(1 − b) so that (9) holds. □

Theorem 1. Let ρ_0 ∈ (0, 1) be the true parameter of ρ in Model (3) with s_i = 2^{i−1} for i = 1, . . . , n. Then, for any ρ ∈ (0, 1) with ρ < ρ_0 and any M ∈ R, there exists a positive η such that for any n,

P[ℓ(ρ) − ℓ(ρ_0) ≥ M] ≥ η.  (10)

Proof. Let u_n(ρ) = −(1/2) Σ_{i=1}^{n−1} log(1 − ρ^{2^i}) for n > 1. Then, u_n(ρ) converges uniformly on [a, b] ⊆ (0, 1) for any 0 < a < b < 1, and therefore u_n(ρ) is uniformly bounded on [a, b]. Let M_1 = sup_{n>1} sup_{ρ∈[a,b]} |u_n(ρ)|, which is finite. Without loss of generality, we can assume ρ, ρ_0 ∈ [a, b]. Based on the expression of ℓ(ρ) given by Eq. (7), there is

ℓ(ρ) − ℓ(ρ_0) − [u_n(ρ) − u_n(ρ_0)]
= (1/2) Σ_{i=1}^{n−1} { Y_i^2 [1/(1 − ρ_0^{2^i}) − 1/(1 − ρ^{2^i})] + Y_{i+1}^2 [ρ_0^{2^i}/(1 − ρ_0^{2^i}) − ρ^{2^i}/(1 − ρ^{2^i})] − 2 Y_i Y_{i+1} [ρ_0^{2^{i−1}}/(1 − ρ_0^{2^i}) − ρ^{2^{i−1}}/(1 − ρ^{2^i})] }
= (1/2) y^T B y,

where y = (Y_1, . . . , Y_n) and the (i, j)th entry of B = B_{ρ,ρ_0} is

b_{1,1}(ρ, ρ_0) = ρ_0^2/(1 − ρ_0^2) − ρ^2/(1 − ρ^2),
b_{i,i}(ρ, ρ_0) = [ρ_0^{2^i}/(1 − ρ_0^{2^i}) − ρ^{2^i}/(1 − ρ^{2^i})] + [ρ_0^{2^{i−1}}/(1 − ρ_0^{2^{i−1}}) − ρ^{2^{i−1}}/(1 − ρ^{2^{i−1}})],  1 < i < n,
b_{n,n}(ρ, ρ_0) = ρ_0^{2^{n−1}}/(1 − ρ_0^{2^{n−1}}) − ρ^{2^{n−1}}/(1 − ρ^{2^{n−1}}),
b_{i,i+1}(ρ, ρ_0) = ρ^{2^{i−1}}/(1 − ρ^{2^i}) − ρ_0^{2^{i−1}}/(1 − ρ_0^{2^i}),  1 ≤ i < n,

with b_{j,i}(ρ, ρ_0) = b_{i,j}(ρ, ρ_0) and b_{i,j}(ρ, ρ_0) = 0 if |i − j| > 1. Let Z_i = (Y_i − ρ_0^{2^{i−1}} Y_{i+1})/√(1 − ρ_0^{2^i}) for 1 ≤ i < n and Z_n = Y_n. Then, Z_1, . . . , Z_n are i.i.d. N(0, 1) according to Lemma 1, and

Y_n = Z_n,
Y_i = √(1 − ρ_0^{2^i}) Z_i + Σ_{l=i+1}^{n−1} ρ_0^{2^{l−1} − 2^{i−1}} √(1 − ρ_0^{2^l}) Z_l + ρ_0^{2^{n−1} − 2^{i−1}} Z_n,  1 ≤ i < n.

Based on z = (Z_1, . . . , Z_n), we can write

ℓ(ρ) − ℓ(ρ_0) − [u_n(ρ) − u_n(ρ_0)] = (1/2) z^T A z,

where

z^T A z = y^T B y = Σ_{i=1}^{n} b_{i,i}(ρ, ρ_0) Y_i^2 + Σ_{i=1}^{n−1} 2 b_{i,i+1}(ρ, ρ_0) Y_i Y_{i+1}.

Let a_{i,j}(ρ, ρ_0) be the (i, j)th entry of A for i, j = 1, . . . , n, and let t_{k,i} denote the coefficient of Z_i in the expression of Y_k above; that is, t_{i,i} = √(1 − ρ_0^{2^i}) for i < n, t_{n,n} = 1, t_{k,i} = ρ_0^{2^{i−1} − 2^{k−1}} √(1 − ρ_0^{2^i}) for k < i < n, t_{k,n} = ρ_0^{2^{n−1} − 2^{k−1}} for k < n, and t_{k,i} = 0 for i < k. Substituting Y_k = Σ_{i=k}^{n} t_{k,i} Z_i into y^T B y and collecting terms, we can compute

a_{i,j}(ρ, ρ_0) = Σ_{k=1}^{i} Σ_{l=1}^{j} t_{k,i} b_{k,l}(ρ, ρ_0) t_{l,j},  i, j = 1, . . . , n,

where only the terms with |k − l| ≤ 1 contribute because B is tridiagonal. In particular, a_{1,1}(ρ, ρ_0) = (1 − ρ_0^2) b_{1,1}(ρ, ρ_0).

Note that b_{i,j}(ρ, ρ_0) does not depend on n. Observing the terms above, we find that (a) a_{i,j}(ρ, ρ_0) does not depend on n if i, j ≠ n; (b) b_{i,j}(ρ, ρ_0) goes to zero exponentially fast as i and j increase if 0 ≤ ρ, ρ_0 < 1; and (c) a_{1,1}(ρ, ρ_0) = ρ_0^2 − ρ^2(1 − ρ_0^2)/(1 − ρ^2) is positive if ρ < ρ_0 and negative if ρ > ρ_0. Thus, a_{i,j}(ρ, ρ_0) satisfies Condition (8) for some positive C and, by (c), a_{1,1}(ρ, ρ_0) > 0 when ρ < ρ_0. Since |u_n(ρ) − u_n(ρ_0)| ≤ 2M_1 and hence ℓ(ρ) − ℓ(ρ_0) ≥ (1/2) z^T A z − 2M_1, applying Lemma 2 to S_n = (1/2) z^T A z with M replaced by M + 2M_1 yields Eq. (10). □

Corollary 1. For any ρ, ρ_0 ∈ (0, 1) with ρ < ρ_0, ℓ(ρ) − ℓ(ρ_0) converges in distribution to a nondegenerate random variable, which exceeds any positive number with a positive probability.

Proof. Note that ℓ(ρ) − ℓ(ρ_0) is a Cauchy sequence in probability, so we can find a random variable X such that ℓ(ρ) − ℓ(ρ_0) → X in distribution as n → ∞. According to Theorem 1, for any M > 0 there is an η > 0 such that P(X > M) ≥ η, which is the conclusion of the corollary. □

Theorem 2. If s_i = 2^{i−1} for i = 1, . . . , n in Model (3), then ρ̂ is an inconsistent estimator of ρ, which implies that the MLE of θ is also an inconsistent estimator of θ.

Proof. The conclusion follows directly from Corollary 1. □

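To illustrate Theorem 2 empirically, the simulation sketch below (our own, not from the paper; it assumes numpy and scipy, and the function names are ours) contrasts the MLE of ρ under the regular design s_i = i, where it is known to be consistent, with the MLE under the doubling design s_i = 2^{i−1}. As n grows, the first estimate typically settles near the true value ρ_0 = 0.6, while the second need not; a single run is of course only suggestive.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(3)
rho0 = 0.6

def corr(s, rho):
    """Exponential correlation matrix with entries rho**|s_i - s_j|."""
    return rho ** np.abs(s[:, None] - s[None, :])

def negloglik(rho, y, s):
    R = corr(s, rho)
    c, low = cho_factor(R)
    logdet = 2.0 * np.sum(np.log(np.diag(c)))
    return 0.5 * (len(y) * np.log(2 * np.pi) + logdet + y @ cho_solve((c, low), y))

def mle(y, s):
    res = minimize_scalar(negloglik, bounds=(1e-6, 1 - 1e-6),
                          args=(y, s), method='bounded')
    return res.x

for n in (50, 200, 800):
    s_grid = np.arange(1.0, n + 1)       # regular design s_i = i
    s_dbl = 2.0 ** np.arange(n)          # doubling design s_i = 2^{i-1}
    y_grid = rng.multivariate_normal(np.zeros(n), corr(s_grid, rho0))
    y_dbl = rng.multivariate_normal(np.zeros(n), corr(s_dbl, rho0))
    print(n, round(mle(y_grid, s_grid), 3), round(mle(y_dbl, s_dbl), 3))
```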
3. Discussion

We have constructed a special example in which the MLE of the covariance parameter is inconsistent under the framework of increasing domain asymptotics. The example may be classified as an infinitely sparse case of the pattern of irregularly spaced locations in spatial statistics. Compared to the case considered under the framework of infill asymptotics, which assumes the sampling locations are infinitely dense, this infinitely sparse case provides another way to understand the impact of location patterns on the asymptotic properties of estimators of covariance parameters in spatial statistics. It is therefore also worth considering this case when studying the asymptotic behavior of estimators of covariance parameters.

References

Cressie, N., 1993. Statistics for Spatial Data. Wiley, New York.
Du, J., Zhang, H., Mandrekar, V.S., 2009. Fixed-domain asymptotic properties of tapered maximum likelihood estimators. Ann. Statist. 37, 3330–3361.
Lahiri, S.N., 1996. On inconsistency of estimators based on spatial data under infill asymptotics. Sankhyā 58, 403–417.
Mardia, K.V., Marshall, R.J., 1983. Maximum likelihood estimation of models for residual covariance in spatial regression. Biometrika 73, 135–146.
Matérn, B., 1986. Spatial Variation, second ed. Springer-Verlag, Berlin.
Stein, M.L., 1999. Interpolation of Spatial Data: Some Theory for Kriging. Springer, New York.
Ying, Z., 1991. Asymptotic properties of a maximum likelihood estimator with data from a Gaussian process. J. Multivariate Anal. 36, 280–296.
Zhang, H., 2004. Inconsistent estimation and asymptotically equal interpolations in model-based geostatistics. J. Amer. Statist. Assoc. 99, 250–261.