On asymptotic properties of the mark variogram estimator of a marked point process

On asymptotic properties of the mark variogram estimator of a marked point process

Journal of Statistical Planning and Inference 137 (2007) 148 – 161 www.elsevier.com/locate/jspi On asymptotic properties of the mark variogram estima...

266KB Sizes 1 Downloads 23 Views

Journal of Statistical Planning and Inference 137 (2007) 148 – 161 www.elsevier.com/locate/jspi

On asymptotic properties of the mark variogram estimator of a marked point process Yongtao Guana,∗ , Michael Shermanb , James A. Calvinc a Department of Management Science, University of Miami, Coral Gables, FL 33124, USA b Department of Statistics, Texas A&M University, College Station, TX 77843, USA c Department of Statistics, Texas A&M University, College Station, TX 77843, USA

Received 8 November 2004; received in revised form 5 July 2005; accepted 31 October 2005 Available online 19 December 2005

Abstract The mark variogram [Cressie, 1993. Statistics for Spatial Data. Wiley, New York] is a useful tool to analyze data from marked point processes. In this paper, we investigate the asymptotic properties of its estimator. Our main findings are that the sample mark variogram is a consistent estimator for the true mark variogram and is asymptotically normal under some mild conditions. These results hold for both the geostatistical marking case (i.e., the case where the marks and points are independent) and the non-geostatistical marking case (i.e., the case where the marks and points are dependent). As an application we develop a general test for spatial isotropy and study our methodology through a simulation study and an application to a data set on long leaf pine trees. © 2005 Elsevier B.V. All rights reserved. Keywords: Marked point process; Mark variogram

1. Introduction Marked point process models are useful tools to analyze irregularly spaced data, especially those arising from ecological studies (see, e.g., Biondi et al., 1994; Stoyan and Stoyan, 1994; Gavrikov and Stoyan, 1995; Wälder and Stoyan, 1996; Stoyan and Wälder, 2000). The use of second-order structures, in particular the mark variogram (Cressie, 1993), plays an important role in modeling marked point processes. Examples of the use of the mark variogram include studying interactions among trees in a forest (Biondi et al., 1994), checking for independence between marks and points (Wälder and Stoyan, 1996), and testing for isotropy of marks (Guan et al., 2004a). For a complete list of second-order structures of a marked point process, see Schlather (2001). Different from its analogue in geostatistics, the mark variogram is a conditional quantity given the existence of points at the locations under consideration. To see this, let  = {[x, m(x)]} denote a stationary marked point process (Stoyan and Stoyan, 1994) and  denote the point process that generates the locations. Let t ∈ R2 denote a lag and 0

∗ Corresponding author.

E-mail addresses: [email protected] (Y. Guan), [email protected] (M. Sherman), [email protected] (J.A. Calvin). 0378-3758/$ - see front matter © 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.jspi.2005.10.004

Y. Guan et al. / Journal of Statistical Planning and Inference 137 (2007) 148 – 161

149

denote the origin. The mark variogram is then defined by 2(t) = Var[m(0) − m(t)|0, t ∈ ].

(1)

Equivalently, (t) = E[m(0)2 |0, t ∈ ] − E[m(0)m(t)|0, t ∈ ], where the conditioning -fields here and in (1) are the -field generated by realizations of  with events at both 0 and t. In the case of the marks and the points of  being independent, which is often referred to as the case of geostatistical marking in literature (Schlather et al., 2004), (1) then coincides with the variogram function defined in geostatistics, that is, 2(t) = Var[m(0) − m(t)]. In many ecological example, the marks and the points may be dependent. For example, the growth rates of trees (i.e., marks) in a forest may depend on the relative locations of trees (i.e., points) due to competition for nutrient and/or light among neighboring trees. Following Schlather et al. (2004), we will refer to this type of marked point processes as the case of non-geostatistical marking. See Schlather et al. (2004) and Guan (2004) for formal methods to test for geostatistical marking vs. non-geostatistical marking marked point processes. The mark variogram defined by (1) is typically estimated via non-parametric methods such as kernel smoothing (see, e.g., Cressie, 1993). The distributional properties of these estimators, however, have not been well studied. Guan et al. (2004a) established the asymptotic consistency and normality of the sample mark variogram for the geostatistical marking case, under the condition that the point process  is Poisson. To the authors’ knowledge, the cases of nonPoisson and non-geostatistical marking have not been considered. On the contrary, we note that the distributional properties of the sample variogram in geostatistics have been extensively studied. For examples, see Lahiri et al. (2002) and Guan et al. (2004a). In this article, we investigate the asymptotic properties of a mark variogram estimator which will be formally defined in Section 2. Specifically, we prove that the sample mark variogram is a consistent estimator for the true mark variogram and is asymptotically normal under some mild conditions. These results hold for both the geostatistical marking case and the non-geostatistical marking case. For the latter, we verify that the assumed conditions hold for several available models. As an illustrative example, we apply these results to test for isotropy of the marks. The rest of the article is organized as follows. In Section 2, we introduce some necessary notation. We develop the asymptotic properties of the sample mark variogram in the geostatistical marking case in Section 3 and then proceed to the non-geostatistical marking case in Section 4. We discuss an application of our results to testing for isotropy in Section 5 and conclude with a discussion in Section 6. 2. General notation Throughout this article, we use D ⊂ R2 to denote the region where observations are made, and jD to denote the boundary of D. Let |D| and |jD| denote the volume of D and the length of its boundary, respectively, let h be a positive constant and let w(·) be a bounded, non-negative, symmetric density function which takes positive values on only a finite support. We define the following kernel estimator for the mark variogram    t − xi + xj [m(xi ) − m(xj )]2 1 −2 ˆ (t) = (2) h w × , (2) h |D ∩ (D − xi + xj )|  (t) i=j

where (2) (t) is the second-order product density of  (Stoyan and Stoyan, 1994, p. 246). In practice, (2) (t) is usually unknown and must be estimated. An example of ˆ (2) (t) is given as    t − xi + xj 1 ˆ (2) (t) = h−2 w × . (3) h |D ∩ (D − xi + xj )| i=j

The resulting estimator ˆ (t) may not be non-negative definite in general. In the geostatistical marking case, note that non-negative definiteness (NND) is a necessary condition for ˆ (t) to be a valid variogram. A remedy in practice is to fit a parametric variogram model satisfying NND to the empirical variogram estimators so as to obtain a valid variogram function. In the non-geostatistical marking case, NND is not a necessary condition. In fact, many available

150

Y. Guan et al. / Journal of Statistical Planning and Inference 137 (2007) 148 – 161

mark variogram models in this case are not (conditionally) non-negative definite (see, e.g., Wälder and Stoyan, 1996; Stoyan and Wälder, 2000). The asymptotic properties of (3) have been established by Guan et al. (2004b). In this article, we are mainly concerned with the asymptotic properties of (2) assuming a known (2) (t). The more general results for the case with unknown (2) (t) can be obtained similarly by combining the results in Guan et al. (2004b) and our results to be detailed in Sections 3 and 4. To study the asymptotic properties of the mark variogram estimator, we need to account for the shape of the region, D, and the choice of bandwidth, h. Consider a sequence of increasing regions, Dn , and a sequence of bandwidths, hn . Throughout this article, we assume that There exist constants K1 , K2 ∈ (0, ∞) such that K1 n2 |Dn | K2 n2 for all nK2 , |jDn | = O(n),

(4)

and hn = O(n− )

for some  ∈ (0, 1).

(5)

Practically, condition (4) requires Dn to grow in all directions. This condition allows for a variety of regions, e.g., squares, circles and starshapes. Condition (5) allows for sufficient averaging for each ˆ n (t), where ˆ n (t) is ˆ (t) calculated on Dn . 3. Asymptotic results: The geostatistical marking case 3.1. Asymptotic consistency Let dx denote an infinitesimally small disc centered at x and (dx) denote the number of points of  in dx. In this section, we assume the existence of the kth-order cumulant density function of  defined as   Cum[(dx1 ), . . . , (dxk )] C (k) (x2 − x1 , . . . , xk − x1 ) ≡ lim , |dx1 |,...,|dxk |→0 |dx1 | × · · · × |dxk | where xi are pairwise different and Cum(·) denotes the cumulant function (see, e.g., Brillinger, 1975). Following Guan et al. (2004b), we assume the following conditions C (2) (·) is finite and continuous, C (3) (·, ·) is finite,   |C (2) (u)| du < ∞, |C (3) (u1 , u2 )| du1 < ∞, R2

and

R2

(6)

 R2

|C (3) (u1 , u1 + u2 )| du1 < ∞,

 R

2

|C (4) (u1 , u2 , u2 + u3 )| du2 < ∞

for all u1 , u2 , u3 ∈ R2 .

(7)

Conditions (7) is clearly satisfied if  is a homogeneous Poisson process. In addition, it holds for any model with finite dependence range. Guan (2003) proved further that condition (7) also holds for a class of commonly used cluster models. Let (4) (t) = E{[m(0) − m(t)]4 |0, t ∈ }. In addition to (6) and (7), we require the following conditions on the process generating the marks: (t), (4) (t) are finite and continuous,  |Cov{[m(0) − m(u1 )]2 , [m(t) − m(t + u2 )]2 }|dt < ∞, t∈R2

(8) (9)

Condition (9) is satisfied for any process with finite dependence range and finite fourth moment of the marks. For a  Gaussian process, R2 |Cov[m(0), m(t)]|dt < ∞ is sufficient for (9) to hold (see Guan, 2003). Many covariance models, e.g., Exponential, Gaussian, Spherical models, satisfy the integrability condition and consequently satisfy (9).

Y. Guan et al. / Journal of Statistical Planning and Inference 137 (2007) 148 – 161

151

The following theorem establishes the asymptotic consistency of ˆ n (t). Theorem 1. Consider a stationary marked point process with geostatistical marking. Let ˆ n (t) denote (2) obtained on Dn and assume that conditions (4)–(9) hold. Then  1 E[ˆn (t)] = (2) w(u)(t − hn u)(2) (t − hn u) du → (t) and  (t)  lim |Dn | × h2n × Cov[ˆn (t1 ), ˆ n (t2 )] =

n→∞

w 2 (v) dv ×

I (t1 = ±t2 ) × (4) (t1 ) , (2) (t1 )

where I (t1 = ±t2 ) = 1 if t1 = ±t2 and 0 otherwise. (2) Proof. Let wn (x) = h−2 n w(x/ hn ) and  (dx1 , dx2 ) = (dx1 )(dx2 )I (dx1  = dx2 ). Define the following functions:

(x2 − x1 , x3 − x1 , x4 − x1 ) ≡ E{[Z(x2 ) − Z(x1 )]2 [Z(x4 ) − Z(x3 )]2 }, ∗ (x2 − x1 , x3 − x1 , x4 − x1 ) ≡ Cov{[Z(x2 ) − Z(x1 )]2 , [Z(x4 ) − Z(x3 )]2 }. We first study the expectation of ˆ n (t).   wn (t − x1 + x2 ) 1 × (x2 − x1 ) × (2) (x2 − x1 ) dx1 dx2 (2)  (t) Dn Dn |Dn ∩ (Dn − x1 + x2 )|  1 wn (t + u)(u)(2) (u) du = (2)  (t) Dn −Dn  1 = (2) w(v)(t − hn v)(2) (t − hn v) dv  (t) → (t).

E[ˆn (t)] =

To derive the expression of the covariance term in Theorem 1, consider two lags, t1 and t2 . Cov[ˆn (t1 ), ˆ n (t2 )] × [(2) (t1 )(2) (t2 )] can be written as  wn (t1 − x1 + x2 ) × wn (t2 − x3 + x4 ) × (x2 − x1 , x3 − x1 , x4 − x1 ) |Dn ∩ (Dn − x1 + x2 )| × |Dn ∩ (Dn − x3 + x4 )| Dn

× E[(2) (dx1 , dx2 )(2) (dx3 , dx4 )]  wn (t1 − x1 + x2 ) × wn (t2 − x3 + x4 ) × (x2 − x1 ) × (x4 − x3 ) − |Dn ∩ (Dn − x1 + x2 )| × |Dn ∩ (Dn − x3 + x4 )| Dn

× E[(2) (dx1 , dx2 )]E[(2) (dx3 , dx4 )]. Note that E[(2) (dx1 , dx2 )(2) (dx3 , dx4 )] = (4) (x2 − x1 , x3 − x1 , x4 − x1 ) dx1 dx2 dx3 dx4 + (3) (x2 − x1 , x3 − x1 ) dx1 dx2 dx3 x1 (dx4 ) + (3) (x2 − x1 , x4 − x1 ) dx1 dx2 dx4 x1 (dx3 ) + (3) (x2 − x1 , x3 − x1 ) dx1 dx2 dx3 x2 (dx4 ) + (3) (x2 − x1 , x4 − x1 ) dx1 dx2 dx4 x2 (dx3 ) + (2) (x2 − x1 )dx1 dx2 x1 (dx3 )x2 (dx4 ) + (2) (x2 − x1 ) dx1 dx2 x1 (dx4 )x2 (dx3 ). Thus, the covariance term can in turn be written as the sum of seven terms, where the second to the seventh terms depend only on E[ˆn (t1 ) × ˆ n (t2 )]. Following the proof in Guan et al. (2004b), the second to the fifth terms can be

152

Y. Guan et al. / Journal of Statistical Planning and Inference 137 (2007) 148 – 161

shown to be of order 1/|Dn | due to the fact that (x2 − x1 , x3 − x1 , x4 − x1 ) is finite. To study the first term, first define F1 (x2 − x1 , x3 − x1 , x4 − x1 ) = (x2 − x1 , x3 − x1 , x4 − x1 ){C (4) (x2 − x1 , x3 − x1 , x4 − x1 ) + C (3) (x2 − x1 , x3 − x1 ) + C (3) (x2 − x1 , x4 − x1 ) + C (3) (x3 − x1 , x4 − x1 ) + C (3) (x3 − x2 , x4 − x2 ) + C (2) (x3 − x1 )C (2) (x4 − x2 ) + C (2) (x4 − x1 )C (2) (x3 − x2 ) + 2 C (2) (x3 − x1 ) + 2 C (2) (x4 − x1 ) + 2 C (2) (x3 − x2 ) + 2 C (2) (x4 − x2 )} and F2 (x2 − x1 , x3 − x1 , x4 − x1 ) = ∗ (x2 − x1 , x3 − x1 , x4 − x1 ){ 2 C (2) (x2 − x1 ) + 2 C (2) (x4 − x3 ) + C (2) (x2 − x1 )C (2) (x4 − x3 ) + 4 }. The first term can be written as the sum of the following two terms:  Dn

 Dn

wn (t1 − x1 + x2 ) × wn (t2 − x3 + x4 ) × F1 (x2 − x1 , x3 − x1 , x4 − x1 ) dx1 dx2 dx3 dx4 , |Dn ∩ (Dn − x1 + x2 )| × |Dn ∩ (Dn − x3 + x4 )| wn (t1 − x1 + x2 ) × wn (t2 − x3 + x4 ) × F2 (x2 − x1 , x3 − x1 , x4 − x1 ) dx1 dx2 dx3 dx4 . |Dn ∩ (Dn − x1 + x2 )| × |Dn ∩ (Dn − x3 + x4 )|

Following Guan et al. (2004b), the first quantity is of order 1/|Dn | due to that (x2 − x1 , x3 − x1 , x4 − x1 ) is finite. The second quantity is of order 1/|Dn | due to the proof of the marked-Poisson case in Guan et al. (2004a) and that the cumulant functions are finite. Consider the sixth term, which can be written as  Dn

wn (t1 − x1 + x2 ) × wn (t2 − x1 + x2 ) × (x2 − x1 ) × (2) (x2 − x1 ) dx1 dx2 |Dn ∩ (Dn − x1 + x2 )|2 

=  =

Dn −Dn

wn (t1 + u) × wn (t2 + u) × (u) × (2) (u) du |Dn ∩ (Dn − u)|

Dn −Dn

w(v) × w(v + (t2 − t1 )/ hn ) × (hn v − t1 ) × (2) (hn v − t1 ) dv. |Dn ∩ (Dn + t1 − hn v)| × h2n

Thus  lim |Dn | × h2n n→∞

× the sixth term =

w 2 (v) dv × (4) (t1 ) × (2) (t1 ) × I (t1 = t2 ). C

Similarly, we can show  lim |Dn | × h2n n→∞

× the seventh term =

Thus, Theorem 1 is proved.

w 2 (v) dv × (4) (t1 ) × (2) (t1 ) × I (t1 = −t2 ). C



Y. Guan et al. / Journal of Statistical Planning and Inference 137 (2007) 148 – 161

153

3.2. Asymptotic normality To study the asymptotic normality of ˆ (t), we quantify the strength of dependence in the random field through the following strong mixing coefficients (see, e.g., Politis and Sherman, 2001):

M (p; k) ≡ sup{|P (A1 ∩ A2 ) − P (A1 )P (A2 )| : A1 ∈ FM (E1 ), A2 ∈ FM (E2 ), E2 = E1 + x, |E1 | = |E2 |p, d(E1 , E2 ) k}, and

P (p; k) ≡ sup{|P (A1 ∩ A2 ) − P (A1 )P (A2 )| : A1 ∈ FP (E1 ), A2 ∈ FP (E2 ), E2 = E1 + x, |E1 | = |E2 |p, d(E1 , E2 ) k}, where the suprema are taken over all compact and convex subsets E1 ⊂ R2 , and over all x ∈ R2 such that d(E1 , E2 ) k. In the foregoing, FM (E) and FP (E) denote the -algebras generated by the random variables {Z(x) : x ∈ E} and {x : x ∈  ∩ E}, respectively. We assume the following mixing conditions: sup p

sup p

M (p; k) = O(k − ) p

for some  > 2,

(10)

P (p; k) = O(k − ) p

for some  > 2.

(11)

Conditions (10) and (11) require that dependence decreases at a polynomial rate in k for fixed p, for both the process generating the marks and the point process . This condition is weaker than the commonly used strong mixing condition (see, e.g., Doukhan, 1994) in that we allow the dependence to increase at a rate controlled by p for fixed k. Similar conditions have been used in, e.g., Sherman (1996), Heagerty and Lumley (2000), Politis and Sherman (2001), Guan et al. (2004a, b). We also require the following mild moment condition:  sup E[| |Dn | × hn × {ˆn (t) − E[ˆn (t)]}|2+ ] C for some > 0, C < ∞. (12) n

Condition (12) is only slightly stronger than the existence of the (standardized) asymptotic variance of ˆ n (t). In the case that the marked point process has a finite dependence range, we see that (12) can be relaxed (to = 0) due to an argument analogous to the proof of Theorem 6.4.2 in Brockwell and Davis (1991). Theorem 2. Consider a stationary marked-point process with geostatistical marking. Assume that conditions (4)–(12) hold. Then  D |Dn | × hn × {ˆn (t1 ) − E[ˆn (t1 )], . . . , ˆ n (tk ) − E[ˆn (tk )]} → N (0, ), where the elements of  can be obtained from results in Theorem 1. i , i = 1, . . . , k , Proof. Following Guan et al. (2004a), we divide Dn into non-overlapping l(n) × l(n) subsquares, Dl(n) n i where l(n) = n for some ∈ (0, 1). Now let l(n) = n − n for some 4/(2 + ) < < and obtain Dl(n) with the

i . Thus, d(D i , Dl(n) ) n for i  = j . Define same center as Dl(n) l(n)  2  ≡ w 2 (v) dv × (4) (t)/(2) (t), j

C

 Sn ≡ |Dn | × hn × {ˆn (t) − E[ˆn (t)]}, sn ≡

kn  i=1

 sni / kn ,

sn ≡

kn 

 (sni ) / kn ,

i=1

i i il(n) (t) − E[ˆil(n) (t)]} where ˆ il(n) (t) denotes the sample mark variogram obtained from Dl(n) , sn = l(n) × hn × {ˆ and (sni ) have the same marginal distributions as sni but are independent. Let n (x) and n (x) be the characteristic functions of sn and sn , respectively.

154

Y. Guan et al. / Journal of Statistical Planning and Inference 137 (2007) 148 – 161 p

D

Now we prove Sn → N (0, 2 ). To see this we first prove Sn − sn → 0. This is straightforward due to E(Sn − sn ) = 0 and Var(Sn − sn ) → 0. Then we show that n (x) − n (x) → 0. In what follows, we use I to denote the imaginary number. Define   sni , Ui ≡ exp I x √ kn then

n (x) = E

k n

Ui ,

n (x) =

i=1

kn

E(Ui ).

i=1

Thus, | n (x) − n (x)|

k n −1 j =1

⎧ ⎫ ⎧ ⎫ ⎨j +1 ⎬ j ⎨



E − E × E{U U U } i i j +1 . ⎩ ⎭ ⎩ ⎭ i=1 i=1    (A)

Define Xj =

j

Yj = Uj +1 ,

Ui ,

i=1

EN (Xj ) ≡ E(Xj |N ),

EN (Yj ) ≡ E(Yj |N ),

CovN (Xj , Yj ) ≡ Cov(Xj , Yj |N ), then (A) = Cov(Xj , Yj ) = E[CovN (Xj , Yj )] + Cov[EN (Xj ), EN (Yj )]. j j +1 i For given N, Xj |N is measurable with respect to F( i=1 Dm(n) ) and Yj |N is measurable with respect to F(Dm(n) ). j j +1 i | = j × m(n)2 , we have Since |Xj | 1, |Yj | 1 and |Dm(n) || i=1 Dm(n) CovN (Xj , Yj )16j × {n2 + n2 − 2n + } × n−  j

due to the assumed mixing condition (10). Note EN (Xj ) and EN (Yj ) are measurable with respect to FN ( and

j +1 FN (Dm(n) ),

i i=1 Dm(n) )

respectively; |EN (Xj )|1, |EN (Yj )| 1, we have

Cov[EN (Xj ), EN (Yj )]16j × (n2 + n2 − 2n + ) × n−  due to the assumed mixing condition (11). Combining the above results, we have | n (x) − n (x)|O(n4−2 −  ) → 0. D

d

Finally note that sn → N (0, 2 ) due to Lyapounov central limit theorem. Thus, Sn → N (0, 2 ). The proof of the joint normality follows directly by applying the Cràmer–Wold device.  4. Asymptotic results: The non-geostatistical marking case 4.1. Examples The literature on marked point process models in the non-geostatistical marking case is currently limited but rapidly evolving. We consider here the examples given in Schlather et al. (2004) and use them to verify some of the conditions

Y. Guan et al. / Journal of Statistical Planning and Inference 137 (2007) 148 – 161

155

made in the following sections while investigating the asymptotic properties of ˆ (t). For more example, see Wälder and Stoyan (1996), Stoyan and Wälder (2000) and Schlather et al. (2004). Example 1 (Wälder and Stoyan, 1996). Let  be a Poisson process of intensity . For each x ∈ , define m(x) as the number of points of  in the ball of radius R centered at x. Example 2 (Schlather et al., 2004). Let  be a Poisson process of intensity . For each x ∈ , define m(x) as the product of the number of points of  in the ball of radius R centered at x and a standard Gaussian random variable at x, Z(x). Z(x) at different locations are assumed to be identically and independently distributed. Example 3 (Wälder and Stoyan, 1996). Let  be a Poisson process of intensity . For each x ∈ , define m(x) as the distance to the nearest neighbor of x in . 4.2. Asymptotic consistency To establish the asymptotic consistency of ˆ (t), we first define two functions, denoted by (3) (x2 − x1 , x3 − x1 ) and  (x2 − x1 , x3 − x1 , x4 − x1 ), as follows:  E{[m(x1 ) − m(x2 )]2 × [m(x1 ) − m(x3 )]2 × 3i=1 (dxi )} lim , |dx1 |,|dx2 |,|dx3 |→0 |dx1 | × |dx2 | × |dx3 |  E{[m(x1 ) − m(x2 )]2 × [m(x3 ) − m(x4 )]2 × 4i=1 (dxi )} lim , |dx1 |,...,|dx4 |→0 |dx1 | × |dx2 | × |dx3 | × |dx4 | (4)

respectively, where xi s are pairwise different. We assume (t), C (2) (t) and (4) (t) are finite and continuous, (3) (u1 , u2 ) is finite,  |(4) (u1 , u2 , u2 + u3 ) − (u1 )(2) (u1 )(u3 )(2) (u3 )| du2 < ∞ for all u1 , u2 , u3 ∈ R2 . R2

(13) (14)

In the geostatistical marking case, conditions (6) and (8) imply (13) and conditions (7) and (9) imply (14). For the examples given in Section 4.1, all three variogram functions are bounded and continuous due to the expressions in Wälder and Stoyan (1996) and Schlather et al. (2004). Further, analysis reveals that (4) (t) is finite and continuous and that (3) (u1 , u2 ) is finite for any given u1 and u2 . Thus all three examples satisfy condition (13). Clearly condition (14) holds also for all the examples due to the fact that (4) (u1 , u2 , u2 + u3 ) − (u1 )(2) (u1 )(u3 )(2) (u3 ) can take non-zero values only on a finite support for any given u1 and u3 . Theorem 3. Consider a stationary marked-point process with non-geostatistical marking. Let ˆ n (t) denote (2) obtained on Dn and assume that conditions (4), (5), (13) and (14) are true. Then the same conclusions as in Theorem 1 hold. Proof. The derivation for the expectation term follows the same as that of the geostatistical marking case. For the covariance term, consider two lags, t1 and t2 . Cov[ˆn (t1 ), ˆ n (t2 )] × [(2) (t1 )(2) (t2 )] can be written as  wn (t1 − x1 + x2 ) × wn (t2 − x3 + x4 ) |Dn ∩ (Dn − x1 + x2 )| × |Dn ∩ (Dn − x3 + x4 )| Dn

× E{[m(x1 ) − m(x2 )]2 × (2) (dx1 , dx2 ) × [m(x3 ) − m(x4 )]2 × (2) (dx3 , dx4 )}  wn (t1 − x1 + x2 ) × wn (t2 − x3 + x4 ) − |Dn ∩ (Dn − x1 + x2 )| × |Dn ∩ (Dn − x3 + x4 )| Dn

× E{[m(x1 ) − m(x2 )]2 × (2) (dx1 , dx2 )}E{[m(x3 ) − m(x4 )]2 × (2) (dx3 , dx4 )}.

156

Y. Guan et al. / Journal of Statistical Planning and Inference 137 (2007) 148 – 161

As in the proof of Theorem 1, we have E{[m(x1 ) − m(x2 )]2 × (2) (dx1 , dx2 ) × [m(x3 ) − m(x4 )]2 × (2) (dx3 , dx4 )} = (4) (x2 − x1 , x3 − x1 , x4 − x1 ) dx1 dx2 dx3 dx4 + (3) (x2 − x1 , x3 − x1 ) dx1 dx2 dx3 x1 (dx4 ) + (3) (x2 − x1 , x4 − x1 ) dx1 dx2 dx4 x1 (dx3 ) + (3) (x2 − x1 , x3 − x1 ) dx1 dx2 dx3 x2 (dx4 ) + (3) (x2 − x1 , x4 − x1 ) dx1 dx2 dx4 x2 (dx3 ) + (2) (x2 − x1 ) dx1 dx2 x1 (dx3 )x2 (dx4 ) + (2) (x2 − x1 ) dx1 dx2 x1 (dx4 )x2 (dx3 ). Thus, the covariance can be written as the sum of seven terms, where the second to the seventh terms depend only on E[ˆn (t1 ) × ˆ n (t2 )]. The first term is of order 1/|Dn | due to condition (14). The second to the fifth terms are of order 1/|Dn | due to the fact that (3) (·, ·) is finite. The derivation of the sixth and the seventh term follows the same as in the proof of Theorem 1.  Theorem 3 suggests that the effect of the dependence between marks and points vanishes asymptotically in the sense that the asymptotic covariance between the empirical mark variogram estimators at any two distinct lags is zero. The same asymptotic covariance structure is obtained in the geostatistical marking case. However, it is worth noting that the effect of the dependence between marks and points on the asymptotic variances still remains. Specifically, the theoretical expression of the asymptotic variance in the non-geostatistical marking case is different to that in the geostatistical marking case. This is due to the fact that in the geostatistical marking case (4) (t) = E{[m(0) − m(t)]4 } but this is not true in the non-geostatistical marking case. For instance, consider example 2 introduced in Section 4.1. Assume that Z(x) is normal with mean zero and variance 2 . Let N (x) denote the number of points of  in the ball of R centered at x. Then some simple algebra yields (4) (t) = 64 {E[N (0)4 + N (0)2 N (t)2 |0, t ∈ ]}, which is clearly different to 124 as expected in the geostatistical marking case. As a result, the asymptotic variances are different. 4.3. Asymptotic normality To study the asymptotic normality, we first extend the strong mixing coefficients discussed in Section 3 to the non-geostatistical marking case as follows:

(p; k) ≡ sup{|P (A1 ∩ A2 ) − P (A1 )P (A2 )| : A1 ∈ F(E1 ), A2 ∈ F(E2 ), E2 = E1 + x, |E1 | = |E2 |p, d(E1 , E2 ) k}, where the supremum is taken over all compact and convex subsets E1 ⊂ R2 , and over all x ∈ R2 such that d(E1 , E2 ) k. In the foregoing, F(E) denotes the -algebras generated by the random vectors {[x, m(x)]}, where x are events from the point process  that fall within E. We assume the following mixing condition: sup p

(p; k) = O(k − ) p

for some  > 2.

(15)

In the geostatistical marking case, (10) and (11) together imply (15). To see this, let Mi , Ni , i =1, 2, denote two arbitrary Borel sets from the -algebras generated by the mark process and the point process, respectively. Let A1 = M1 × N1 , A2 = M2 × N2 , where × stands for product. Then |P (M1 × N1 ∩ M2 × N2 ) − P (M1 × N1 )P (M2 × N2 )| = |P (M1 ∩ M2 )P (N1 ∩ N2 ) − P (M1 )P (M2 )P (N1 )P (N2 )| |P (M1 ∩ M2 ) − P (M1 )P (M2 )| + |P (N1 ∩ N2 ) − P (N1 )P (N2 )|, which implies (p; k) M (p; k) + P (p; k). Thus, (10) and (11) implies (15) in the geostatistical marking case. In the non-geostatistical marking case, (15) holds for examples 1and 2 for any finite R. Furthermore, it holds if the Poisson process used in the definitions of these two examples is replaced by any process with a finite dependence range, e.g., the Matern cluster process (Stoyan and Stoyan, 1994) or by the Strauss process. The latter is due to the

Y. Guan et al. / Journal of Statistical Planning and Inference 137 (2007) 148 – 161

157

result of Jensen (1993a, b) that for the Strauss process P (p; k)Ape−Bk for some constants A and B. For example 3, a stronger condition holds and thus (15) also holds. To see this, let B = B1 ∩ B2 where B1 : the event that all points in E1 have nearest neighbors within k/2, B2 : the event that all points in E2 have nearest neighbors within k/2. Note that P (B c ), P (B1c ) and P (B2c ) are all of order O(exp−k Thus,

2 /4

) and that P (A1 ∩ A2 |B) = P (A1 |B1 )P (A2 |B2 ).

|P (A1 ∩ A2 ) − P (A1 )P (A2 )| = |P (A1 ∩ A2 |B)P (B) + P (A1 ∩ A2 |B c )P (B c ) − [P (A1 |B1 )P (B1 ) + P (A1 |B1c )P (B1c )][P (A2 |B2 )P (B2 ) + P (A2 |B2c )P (B2c )]| = |P (A1 |B1 )P (A2 |B2 )P (B) − P (A1 |B1 )P (A2 |B2 )P (B1 )P (B2 )| + O(exp−k |P (B) − P (B1 )P (B2 )| + O(exp

−k 2 /4

= O(exp

)

)

= |P (B1c ) + P (B2c ) − P (B c ) − P (B1c )P (B2c )| + O(exp−k −k 2 /4

2 /4

2 /4

)

).

Theorem 4. Consider a stationary marked-point process with non-geostatistical marking. Assume that conditions (4), (5), (12)–(15) hold. Then  D |Dn | × hn × {ˆn (t1 ) − E[ˆn (t1 )], . . . , ˆ n (tk ) − E[ˆn (tk )]} → N (0, ), where the elements of  can be obtained from results in Theorem 3. Proof. The proof is analogous to the proof of Theorem 2.



5. An example: Test for isotropy 5.1. A test statistic In this section, we propose a formal method to test for isotropy of the marks using the asymptotic results developed in the previous sections. We will assume throughout this section that the underlying point process is isotropic. Note that this assumption can be evaluated using the technique proposed in Guan et al. (2004b). Recall from Theorems 1 and 3 that  1 E[ˆn (t)] = (2) w(u)(t − hn u)(2) (t − hn u) du.  (t) Consider two lags such that t1  = t2 , but t1 = t2 . Since w(·) and (2) (·) are both symmetric functions, E[ˆn (t1 )] = E[ˆn (t2 )] under isotropy. This suggests the testable null hypothesis H0 : E[ˆn (t1 )] = E[ˆn (t2 )] if t1 = t2 . ˆ n ≡ {ˆn (ti ) : i =1, . . . , k} . Let {ti : i =1, . . . , k} denote a set of user-chosen lags, Gn ≡ {E[ˆn (ti )] : i =1, . . . , k} and G We then form appropriate contrasts AGn = 0 and test the hypothesis H0 : AGn = 0 based on the sample contrasts ˆ n . The idea of testing isotropy in this manner was given by Lu and Zimmerman (2001) who considered equally AG spaced geostatistical observations, and was also used in Guan et al. (2004a) who considered locations from a Poisson process and geostatistical marking. Here, we enlarge the class of applications to more general point processes satisfying conditions 6, 7 and 11 and possibly to the non-geostatistical marking case. ˆ n with 0, we need to estimate the covariance matrix of G ˆ n . Following Guan et al. (2004a), we To formally compare AG consider a subsampling approach. Specifically, let Dl(n) be a subfield with the same shape as Dn , where l(n) = cn for some positive constants c and < 1 defines the size of a subfield. Define a shifted copy Dl(n) (y) ≡ {s + y : s ∈ Dl(n) },

158

Y. Guan et al. / Journal of Statistical Planning and Inference 137 (2007) 148 – 161

ˆ l(n) (y) be the sample mark variogram on Dl(n) (y) and hl(n) be the where y ∈ Dn1−c ≡ {y : Dl(n) (y) ⊂ Dn }. Let G ˆ bandwidth used to obtain Gl(n) (y). The subsampling estimator (denoted by ˆ n ) is given by  1 ˆ l(n) (y) − Gl(n) ][G ˆ l(n) (y) − Gl(n) ] dy, |Dl(n) |h2l(n) [G (16) |Dn1−c |fn Dn1−c  ˆ l(n) (y) dy and fn = 1 − |Dl(n) |/|Dn | is a finite sample bias correction due to Guan where Gl(n) ≡ 1/|Dn1−c | Dn1−c G et al. (2004a). This is simply the (appropriately scaled) sample variance of the statistics computed from all subfields contained in the domain Dn . The asymptotic consistency of the subsampling estimator defined by (16) can be easily established under suitable conditions following the proof of Theorems 1 and 2 in Politis and Sherman (2001) and that of Theorem 3 in Guan et al. (2004a). In view of this result we propose the following test statistic: ˆ n A )−1 (AG ˆ n ). ˆ n ) (A TSn ≡ |Dn |h2n (AG D

TSn → 2d as n → ∞ by the multivariate Slutsky’s theorem (Ferguson, 1996), where d denotes the row rank of A. Thus, an approximate size test for isotropy rejects H0 if TSn is bigger than 2d, , i.e., the upper percentage point of a 2 distribution with d degrees of freedom. 5.2. A simulation experiment We considered realizations on a 20 × 20 square region from a stationary marked point process given in Stoyan and Wälder (2000). To do so, we first generated locations of points by a (modified) isotropic Poisson cluster process (see Diggle, 2003). Specifically, we generated parent locations by a Poisson process with intensity .25, the number of offspring per cluster by a Poisson random variable with parameter 4, and the position of each offspring relative to its parent by the probability density function f (t) =

1 exp[−t t/22 ]. 22

In contrast to the commonly-used Poisson cluster process, we retained both the offspring and parent locations. Thus, the foregoing design yields (an expected) 500 points in each realization. The parameter  determines the spread of each cluster in that 22 is the mean squared distance of an offspring from its parent. In the simulation we set  at 0.4 and 0.8. These values are similar to the estimates of  in Cressie’s (1993) analysis of long-leaf pine data and Diggle’s (2003) study of redwood seedlings, respectively. At these simulated point locations, we generated realizations from a√Gaussian process [denoted as Z(x), x ∈ R2 ] with mean 0 and covariance structure R(r; a) = exp(−ar), where r = t Bt and B is a 2 × 2 positive definite matrix. We then obtained the marks  Z(x) if x is a parent point, m(x) = Z(x) if x is an offspring point. The parameter a in the covariance structure of Z(x) defines the range and strength of dependence of the process. In the simulation study, we set a at 1 and 2, representing relatively strong and weak dependence, respectively. The matrix B determines the direction and strength of anisotropy. We used three B matrices:       1 0 0.8125 −0.3248 0.7656 −0.4060 , B2 = . B0 = , B1 = 0 1 −0.3248 0.4375 −0.4060 0.2969 Matrix B0 generates isotropic random processes, whereas B1 and B2 generate geometrically anisotropic random processes. The main anisotropic axes are in the 60- and 150-degree directions for both B1 and B2, whereas the anisotropy ratio (defined as the ratio of the lengths of the main axes) is 1:2 for B1 but 1:4 for B2. In the simulation, we set  at 0.5, 0.75, and 1. Note that  = 1 corresponds to the case of geostatistical marking. We simulated one thousand realizations for each combination of a, B, , and . For each realization, we compared the sample variogram values at unit-length lags in 0-, 45-, 90- and 135-degree directions. We used three different

Y. Guan et al. / Journal of Statistical Planning and Inference 137 (2007) 148 – 161

159

Table 1 Simulation results



0.4



1.00

0.75

0.50

0.8

1.00

0.75

0.50

h

a=1

a=2

B0

B1

B2

B0

B1

B2

0.2 0.6 1.0 0.2 0.6 1.0 0.2 0.6 1.0

0.071 0.071 0.066 0.071 0.058 0.056 0.065 0.054 0.051

0.138 0.240 0.174 0.133 0.255 0.186 0.134 0.220 0.162

0.374 0.702 0.570 0.349 0.676 0.534 0.233 0.493 0.397

0.058 0.049 0.057 0.066 0.059 0.054 0.059 0.067 0.054

0.073 0.154 0.098 0.110 0.144 0.111 0.078 0.128 0.090

0.230 0.478 0.322 0.202 0.495 0.328 0.154 0.370 0.231

0.2 0.6 1.0 0.2 0.6 1.0 0.2 0.6 1.0

0.059 0.061 0.052 0.049 0.053 0.052 0.054 0.053 0.059

0.199 0.441 0.372 0.172 0.425 0.350 0.114 0.304 0.276

0.507 0.911 0.863 0.481 0.912 0.850 0.319 0.802 0.751

0.059 0.047 0.042 0.057 0.038 0.053 0.068 0.045 0.053

0.108 0.245 0.170 0.115 0.230 0.133 0.083 0.206 0.114

0.342 0.790 0.636 0.269 0.729 0.603 0.204 0.624 0.469

Each table entry is the percentage of rejection at the 5% nominal level for one thousand simulations.

bandwidths, 0.2, 0.6 and 1, to assess how sensitive the test is to different values of h. Following the recommendations in Guan et al. (2004a), we used a subblock of 4 × 4 to obtain a covariance estimate of the sample variogram. Table 1 reports the empirical sizes and powers at the 5% nominal level. We see that regardless of the value of a, , and  and the choice of h, the empirical sizes are in general reasonably close to the nominal size. For fixed a, , , and h, we see that the powers increase when the anisotropic ratio increases. In addition, we see that the powers increase when Z(x) becomes more dependent (i.e., a decreases) but typically decrease when the marks and points become more dependent on each other (i.e.,  decreases). Further, we see that the powers generally decrease when the clustering strength increases (i.e.,  decreases). In practice, we must decide the value of the bandwidth, h. When h is too small, the sample size may not be large enough to detect anisotropy. When h is too large, however, overlapping in the smoothing may occur, which in turn will lead to less power. From the simulation results in Table 1, we see that the best power is typically achieved when h = 0.6 but not when h = 0.2 or 1.0 due to the foregoing reasons. In light of these observations, we recommend choosing h large enough such that, say, at least 200 distinct pairs of observations are used to calculate (2) following Guan et al. (2004b)’s recommendation, but small enough such that little or no overlapping occurs in the smoothing. 5.3. An application to the long-leaf pine data The long-leaf pine data consist of the location and diameter at breast height (dbh) of 584 long-leaf pine trees in a 200 m × 200 m square region. The data set was given in and analyzed by Cressie (1993), who detected spatial clustering among locations of trees. Stoyan and Stoyan (1996) fitted an isotropic (generalized) Thomas process, which is a stationary process, to these locations and obtained a reasonably good fit. Thus, our test approach appears to be appropriate for this data set. Fig. 1 plots the empirical E(r) quantity defined in Schlather et al. (2004) against the inter-point distance between two trees, r, where E(r) is the conditional expectation of a tree’s dbh given that there exists another tree at a distance r away. Clearly E(r) changes with respect to r, which indicates dependence between the dbh values and the relative tree locations. Our approach is still appropriate since it can be applied to the case of non-statistical marking. We translated the study region into a 20 × 20 region which is comparable to the region considered in the simulation study. In light of the simulation results presented in the previous section, we chose to calculate and compare the sample

160

Y. Guan et al. / Journal of Statistical Planning and Inference 137 (2007) 148 – 161 26 24 22 20 18 16 14 12 10 0

5

10

15

20

25

30

35

40

Fig. 1. Plot of the empirical conditional expectation of dbh. The x-axis represents the inter-point distance; the y-axis represents the empirical conditional expectation of a tree’s dbh given that there exists another tree at a given distance away.

√ √ √ √ mark variogram at lags (10, 0), (0, 10), (10/ 2, 10/ 2) and (−10/ 2, 10/ 2). We set A as follows:   1 −1 0 0 A = 1 0 −1 0 . 1 0 0 −1 We used a 40m × 40m square subsampling window to estimate the covariance matrix and a bandwidth h equal to 6m to obtain the sample mark variogram. These values are comparable to the ones used in the simulation study. The calculated test statistic is 3.2563. This yields a p value equal to .3538 when compared to a 2 distribution with three degrees of freedom. Thus, we did not detect any evidence against the null hypothesis of spatial isotropy. 6. Discussion The analysis of marked point process data, especially in the case of non-geostatistical marking, presents a challenge to statisticians. In this paper, we have developed the asymptotic properties of the mark variogram estimator (Section 2) in the simpler geostatistical marking case and extended the results to the more general non-geostatistical marking case. Under mild conditions we have proved that the sample mark variogram is consistent and asymptotically normal for the true mark variogram. In particular, we have illustrated in the non-geostatistical marking case that the results hold for several existing models. As an example, we have demonstrated through both a simulation study and an application to a real data example how our results can be applied to test for isotropy for spatial marked point processes. The literature on marked point process analysis in the non-geostatistical marking case is currently limited. Hopefully, future research will focus on developing more general models as well as mechanisms to fit such models, possibly through its second-order structure such as the mark variogram. Our findings in this paper can be used to obtain general guidelines on such practices. Acknowledgements The authors would like to thank the editor, the associate editor and a referee for their helpful comments. References

Y. Guan et al. / Journal of Statistical Planning and Inference 137 (2007) 148 – 161

161

Biondi, F., Myers, D.E., Avery, C.C., 1994. Geostatistically modeling stem size and increment in an old-growth forest. Canad. J. Forest Res. 24, 1354–1368. Brillinger, D.R., 1975. Time Series: Data Analysis and Theory. Rinehart & Winston Holt, Austin, TX. Brockwell, P.J., Davis, R.A., 1991. Time Series: Theory and Methods. Springer, New York. Cressie, N.A.C., 1993. Statistics for Spatial Data. Wiley, New York. Diggle, P.J., 2003. Statistical Analysis of Spatial Point Patterns. Wiley, New York. Doukhan, P., 1994. Mixing: Properties and Examples. Springer, New York. Ferguson, T., 1996. A Course in Large Sample Theory. Chapman and Hall, Boca Raton. Gavrikov, V.L., Stoyan, D., 1995. The use of marked point processes in ecological and environmental forest studies. Ecological Environ. Statist. 2, 331–344. Guan,Y., 2003. Nonparametric methods of assessing spatial isotropy. unpublished Ph.D. dissertation, TexasA&M University, Department of Statistics. Guan, Y., 2004. Tests for independence between marks and points of a marked point process. Biometrics, to appear. Guan,Y., Sherman, M., Calvin, J.A., 2004a. A nonparametric test for spatial isotropy using subsampling. J. Amer. Statist. Assoc. Theory and Methods 99, 810–821. Guan, Y., Sherman, M., Calvin, J.A., 2004b. Assessing isotropy for spatial point processes. Biometrics, to appear . Heagerty, P.J., Lumley, T., 2000. Window subsampling of estimating functions with application to regression models. J. Amer. Statist. Assoc. 95, 197–211. Jensen, J.L., 1993a. A note on asymptotic expansions for sums over a weakly dependent random field with application to the Poisson and Strauss processes. Ann. Inst. Statist. Math. 45, 353–360. Jensen, J.L., 1993b. Asymptotic normality of estimates in spatial point processes. Scand. J. Statist. 20, 97–109. Lahiri, S.N., Lee, Y., Cressie, N., 2002. On asymptotic distribution and asymptotic efficiency of least squares estimators of spatial variogram parameters. J. Statist. Plann. Infer. 103, 65–85. Lu, H., Zimmerman, D.L., 2001. Testing for isotropy and other directional symmetry properties of spatial correlation, preprint. Politis, D.N., Sherman, M., 2001. Moment estimation for statistics from marked point processes. J. Roy. Statist. Soc. Ser. B 63, 261–275. Schlather, M., 2001. On the second-order characteristics of marked point processes. Bernoulli 7 (1), 99–117. Schlather, M., Ribeiro Jr., P.J., Diggle, P.J., 2004. Detecting dependence between marks and locations of marked point processes. J. Roy. Statist. Soc. Ser. B 66, 79–93. Sherman, M., 1996. Variance estimation for statistic computed from spatial lattice data. J. Roy. Statist. Soc. Ser. B 58, 509–523. Stoyan, D., Stoyan, H., 1994. Fractals, Random Shapes and Point Fields. Wiley, New York. Stoyan, D., Stoyan, H., 1996. Estimating pair correlation functions of planar cluster processes. Biometrical J. 38, 259–271. Stoyan, D., Wälder, O., 2000. On variogram in point process statistics, II: models of markings and ecological interpretation. Biometrical J. 42, 171–187. Wälder, O., Stoyan, D., 1996. On variogram in point process statistics. Biometrical J. 38, 895–905.