Point processes and some related processes

D. N. Shanbhag and C. R. Rao, eds., Handbook of Statistics, Vol. 19 © 2001 Elsevier Science B.V. All rights reserved.

Point Processes and Some Related Processes

Robin K. Milne

1. Introduction

Investigations in diverse areas of science lead to data in the form of a countable collection of points distributed in an apparently random manner over some set. Often the data may comprise the times of occurrence of some phenomenon, or the locations at which some objects are observed or some phenomenon occurs. Simple examples of such data are: the times of emission of pulses in a nerve fibre, the times of arrival of patients at an intensive care unit, the locations of trees in a forest, ants nests in a region, or a specified type of particle or cell in a section (or volume) of tissue. More complex examples are: the times of occurrence, locations and magnitudes of earthquakes in a given region; the locations and instantaneous directions of movement of insects, spermatozoa etc. in a region at a specified time. The particular type of stochastic process which is used to model such data is known as a point process. Thus a point process is a mathematical model for describing a countable set of points randomly distributed over some space. Poisson processes provide the simplest and best known type of point process. As Stoyan et al. (1995, p. xiv) have noted, since point processes on the real line were predominantly the earliest considered, use of the term 'process' often reflects just the historical origins of the subject rather than implying any time dependence. The theory of point processes as presently developed is rich in mathematical structure, having many important connections with other areas of probability and stochastic process theory. Its early origins [cf. Daley and Vere-Jones (1988)] lie in the areas of life tables and renewal theory, counting problems beginning with work of S.-D. Poisson and leading to applications in particle physics and population processes, queueing theory and communication engineering. 
In a significant review paper which attempted to chart future directions in queueing theory, Kendall (1964) said: 'It is clear that progress with the problems mentioned here would be greatly facilitated by a richer theory of point processes ... Only so can queueing theory repay the debt it owes to the world of technology which gave it birth.' After explosive growth in the literature of point processes in subsequent years the debt would appear to have been amply repaid, with bonuses to many other areas of application.


In fact there are now even a number of books dealing with various aspects of point processes. Applications to queueing theory are considered especially in Brémaud (1981), Franken et al. (1982), Disney and Kiessler (1987), Baccelli and Brémaud (1987), and Brandt et al. (1990). Khinchin (1969) and its earlier editions had introduced many readers to basic results about stationary point processes on the real line, as well as applications to queueing systems and loss systems. Other related ideas were presented in Gnedenko and Kovalenko (1968, 1989) and König et al. (1967). Until the appearance of Kerstan et al. (1974), revised and translated as Matthes et al. (1978), there was no detailed systematic presentation of the theory of point processes. These books provided rigorous foundations for the theory of point processes on complete separable metric spaces, a setting which was seen to encompass Euclidean spaces and to be sufficiently general to fulfil demands from within the theory. They dealt also with the notion of a marked point process, that is a point process in which a supplementary variable or mark is associated with each point in the underlying process, and focused their account on theory for infinitely divisible processes. The generality and careful attention to detail resulted in a valuable reference, but one which makes heavy demands on the mathematical equipment of readers, especially in the areas of measure theory and integration, but also in topology and functional analysis. Similar comments could be made about Kallenberg (1983) and earlier editions (1st edn, 1975), although the terseness of this elegant work makes added demands on the reader. Daley and Vere-Jones (1988), attempting to be accessible to a wide range of readers, provides an excellent and comprehensive introduction to the theory. A still useful summary is available in the precursor, Daley and Vere-Jones (1972); some complementary aspects are discussed in Brillinger (1975).
Srinivasan (1974) gives a brief heuristic account of some aspects of point process theory, emphasizing product densities (drawing on Moyal (1962)) and considering some applications. Another informal and practically motivated approach to the key ideas is provided in the short monograph by Cox and Isham (1980). Earlier, Cox and Lewis (1966) gave an account of statistical analysis of point process data together with some related theory. Stoyan et al. (1995) surveys a variety of aspects of point process theory with a view to applications in stochastic geometry. The various chapters in Barndorff-Nielsen et al. (1999) provide a rich vista of theory and applications of point processes, also emphasizing connections with stochastic geometry.

The present article is a review of the theory of point processes and some related stochastic processes, with a choice of topics that is undoubtedly somewhat personal. The early sections attempt to provide a relatively non-technical summary of foundational ideas, with any technical details able to be passed over at first reading. Later sections indicate some of the rich variety of point process and other stochastic process models that can be built from more basic point processes, and consider aspects of statistical inference for point process data. A unifying tool throughout much of the presentation is provided by the probability generating functional, which stands in relation to a point process as does the probability generating function to a nonnegative integer-valued random variable.

The overall intention of this review is to whet the appetite: to introduce the reader to key ideas, preparing the way to the more specialized accounts in the many available books, and offering a window through which to view the wider literature. Mathematical generality has not been pursued for its own sake, and the early sections attempt an overview of the foundations that acknowledges but does not depend heavily on detailed understanding of measure theory. For this reason Poisson processes are not initially introduced in the most general way (with a specified mean measure), and product densities are discussed before the introduction of moment measures. Nevertheless, the reader who wishes to pursue some of the ideas of measure and integration theory will find a compact but useful summary in Section 1.8 of Stoyan et al. (1995), and a more detailed survey in Appendix 1 of Daley and Vere-Jones (1988). No attempt has been made to provide detailed references to all ideas discussed, but it is hoped that sufficient have been given for it to be clear where the reader might look initially for further information.

2. Foundations

2.1. Representation of realizations

The simplest view of a point process realization is, as suggested above, a (finite or) countable set $x = \{x_i\}$ of (distinct) points randomly located in some state space $S$. Often this space would be one-dimensional, commonly representing time. More generally, in many applications the state space $S$ will be a Euclidean space. For example, to deal with earthquakes would require, in principle, a five-dimensional space: a time dimension, three spatial dimensions for the locations of the epicentres, and a further dimension for the magnitudes. However, attention might be focused, at least initially, on just the epicentres. Spaces of which less structure is assumed can be used [cf. Kingman (1993, Chapter 2)], and are in fact required to deal with particular aspects of point process theory and some applications, for example in stochastic geometry and stereology [cf. Stoyan et al. (1995)]; some such applications are mentioned in Section 4.5. For many purposes the assumption that the state space is a complete separable metric space is adequate, and this is assumed throughout the present review. In such a space there is a notion of distance between points and furthermore a countable dense subset, ideas which generalize two key properties of Euclidean spaces. [For some discussion about the choice of type of state space see Daley and Vere-Jones (1988, pp. 152-153).] The more specific assumption of a Euclidean state space is made in some places, for example in dealing with stationarity and in defining a homogeneous Poisson process, though the essence here is really just the presence of a group structure and an associated invariant measure.

There are two other possible approaches to the representation of point process realizations. The first, limited to situations where the state space is one-dimensional and ordered, involves representing the realization as the sequence of 'intervals' between successive points and also specifying the location of this configuration of points in relation to the origin. Renewal processes are usually considered in this way. In the second approach, the point process realization is viewed as a counting measure $N(\cdot)$, that is as a measure defined on a suitable class of sets and taking values which are nonnegative integers or $+\infty$. (A measure, in the sense of the theory of measure and integration, is a set function which is nonnegative, zero for the empty set, and countably additive for sequences of pairwise disjoint sets. A probability measure is a more familiar special case where the measure is constrained to take the value one for the entire space on which it is defined; for a general measure the value taken on the whole space need not be one or even a finite value.) The connection between this and the initial view is that $N(B) = \#\{x_i \in B\}$, the number of points from the set $\{x_i\}$ which lie in $B$, considered for suitable subsets $B$ of the space. For technical reasons, the collection of such subsets would usually be assumed to be a $\sigma$-field, i.e. to include $S$ and be closed under complementation and countable unions and intersections. Commonly it would include at least all possible bounded sets which are intervals (on a line), rectangles or discs (in a plane), or boxes or balls (in space); the resultant collection of sets is called the ($\sigma$-field of) Borel sets of $S$ and denoted by $\mathcal{B}_S$. In addition, it is usual to assume local finiteness: that $N(B)$ is finite whenever the set $B$ is bounded. This assumption is likely to be satisfied in most applications; for an exception, where the origin is a limit point of every realization, see Section 9.4 of Kingman (1993).
The counting measure representation of realizations, although for many not the most intuitive, does have practical relevance in that an appropriate summary description of the data may often be provided by counts in some selected sets. For developing a nice mathematical theory of point processes in general spaces the counting measure view is compelling, in particular because it facilitates the treatment of possible 'multiple' points. (In the set view of realizations the points $x_i$ must, strictly speaking, be distinct, although an extension is possible where with each point $x_i$ is associated a mark $k_i$ giving the multiplicity of that point; see Section 4.2.) A disadvantage of adopting the counting measure viewpoint is that the resultant theory depends on the theory of measure and integration, which is often regarded as difficult, and is not common knowledge among many who are interested in applications of point processes. In spite of this, many of the ideas of point process theory can be explained with minimal dependence on the technical details of measure and integration theory, sometimes by resorting to the more heuristic 'set' view of realizations. As far as possible, this is the approach adopted in the present article.
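As a small illustration of the link between the 'set' view and the counting measure view, the following sketch treats a realization on the real line as a finite list of points and evaluates the count N(B), the number of listed points lying in B, for half-open intervals B = (a, b]. The function name `count_in_interval` and the sample points are illustrative assumptions, not notation from the text.

```python
from typing import Sequence


def count_in_interval(points: Sequence[float], a: float, b: float) -> int:
    """Return N((a, b]), the number of points x with a < x <= b.

    Repeated values in `points` are counted with their multiplicity,
    which is how the counting measure view accommodates multiple points.
    """
    return sum(1 for x in points if a < x <= b)


# A hypothetical realization on the real line, with a double point at 2.5.
realization = [0.4, 1.1, 2.5, 2.5, 3.7]

n_left = count_in_interval(realization, 0.0, 2.0)    # count in (0, 2]
n_right = count_in_interval(realization, 2.0, 4.0)   # count in (2, 4]
n_union = count_in_interval(realization, 0.0, 4.0)   # count in (0, 4]

# Counts over disjoint sets add up to the count over their union.
print(n_left, n_right, n_union)
```

Half-open intervals are used here so that adjacent intervals are disjoint and every point is counted exactly once.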

2.2. Probabilities and finite-dimensional distributions

In principle, to describe a point process there needs to be a specification of the joint distributions of counts in all possible finite sequences of Borel sets. For a fixed positive integer $k$, and (Borel) subsets $B_1, \ldots, B_k$ of the state space we need

$$\{\Pr(N(B_1) = n_1, \ldots, N(B_k) = n_k) : n_1, \ldots, n_k \in \{0, 1, \ldots\}\} .$$

This is simply the joint distribution of $N(B_1), \ldots, N(B_k)$, the numbers of points in the sets $B_1, \ldots, B_k$. These joint distributions, for all possible $k$ and subsets $B_1, \ldots, B_k$, are called the finite-dimensional (fidi) distributions of the point process. For a particular point process, they must be mutually consistent; for instance, the one-dimensional marginals of any fidi distribution for specified sets $B_1, \ldots, B_k$ must coincide with the separate one-dimensional distributions for those sets, as must the two-dimensional marginals etc. Moreover, for example, since for disjoint $B_1, B_2$ the counts must be additive in the sense that $N(B_1 \cup B_2) = N(B_1) + N(B_2)$ (with probability one), the family of one-dimensional distributions of a point process must be an 'additive' family of probability distributions. Although they are simple to describe in this way, such additivity and consistency conditions put severe limitations on the structure of the possible fidi distributions for a point process; in general it is a nontrivial task to specify a (new) point process model.

Formally, a (random) point process can be defined as a measurable mapping from some probability space $(\Omega, \mathcal{F}, P)$ into the measurable space $(M, \mathcal{M})$, where $M$ is the space of all counting measures defined on the state space $S$, and $\mathcal{M}$ is the smallest $\sigma$-field containing all cylinder sets, that is subsets of $M$ of the form $\{N(\cdot) \in M : N(B_1) = n_1, \ldots, N(B_k) = n_k\}$ for all possible positive integers $k$ and Borel subsets $B_1, \ldots, B_k$ of $S$. Alternatively, a point process can be defined directly as a probability measure on $(M, \mathcal{M})$. The former point of view is to be preferred mathematically if it is desired, for example, to consider $N(B)$ formally as a random variable.
In any case, a point process can be described as a random counting measure, in the same sense that a random variable can be described as a random real number or a random $m$-vector as a random point in $\mathbb{R}^m$. In general, the existence of a suitable probability measure on $\mathcal{M}$ is guaranteed by a refinement of the stochastic process existence theorem usually attributed to Kolmogorov provided a suitably consistent, additive family of potential fidi distributions can be specified. Readable early versions of the existence theorem for point processes were given by Nawrotzki (1962) and by Harris (1963) in his book on branching processes. Both versions restrict to $S = \mathbb{R}$; in the case of Harris (1963) there is a further restriction to processes for which $N(\mathbb{R}) < \infty$ almost surely. For complete separable metric state spaces the relevant existence result is stated and proved as Theorem 1.3.5 in Matthes et al. (1978); in Daley and Vere-Jones (1988) the corresponding result is Theorem 7.1.XI.

Although, in the manner indicated above, a point process can be determined by specifying its fidi distributions for all possible positive integers $k$ and Borel subsets $B_1, \ldots, B_k$ of $S$, it is enough to use a smaller class of sets than the Borel sets. For example, when $S = \mathbb{R}$ it is adequate to work with fidi distributions for sets $B_1, B_2, \ldots$ that are pairwise disjoint half-open intervals with rational end-points. The essential point, which can be extended to the case where $S$ is a complete separable metric space, is that finite unions of such sets form a semi-ring (i.e. a collection closed under finite unions and differences) whose generated $\sigma$-field (i.e. smallest $\sigma$-field containing the initial class) is $\mathcal{B}_S$, the $\sigma$-field of Borel subsets of $S$, where $\mathcal{B}_S$ is defined as the $\sigma$-field generated by the open sets in $S$.

As Neveu (1977) has commented, the existence theorem 'does not seem to have a great importance, most interesting point processes being constructed by special methods'. In studying the crossings of a fixed level by a stationary stochastic process [cf. Cramér and Leadbetter (1967)], the required probability measure is determined from that of a previously specified stochastic process. For Poisson processes there is a relatively simple construction [cf. Kingman (1993, Section 2.5)] which is related to the 'conditional' property (see Sections 2.4 and 2.8) and which can be used also as a means of simulating such a process. Many other processes can be obtained by a further random procedure, such as compounding or clustering (see Section 3.1), starting from some more basic point process, often a Poisson process. Provided that attention is restricted to point processes on some bounded subset $W$ of a state space $S$, a wide class of point processes can be defined by specifying their (Radon-Nikodym) densities, that is likelihood ratios, with respect to a homogeneous Poisson process of unit intensity defined on $S$. The specification through likelihood ratios with respect to a Poisson process facilitates simulation of such processes and also statistical inference for them. Section 4.6 discusses processes of this type.

A point process is called simple if, with probability one, its realizations have no multiple points, i.e. if $\Pr(N(\{x\}) = 0 \text{ or } 1 \text{ for every } x \in S) = 1$. A theorem due to Mönch (1971) [cf. Matthes et al. (1978, Theorem 1.4.9)] shows that the distribution of any simple point process is determined by its one-dimensional distributions for a suitably large subclass of Borel sets, specifically for the sets of a semi-ring generating the Borel $\sigma$-field (e.g. on the real line, all sets which are finite unions of half-open intervals). Mönch's theorem is a generalization of a result which in the Poisson case is due to Rényi (1967); see Section 2.8. In fact, the distribution of any simple point process is determined by the probabilities of zero counts for the sets of a semi-ring generating the Borel $\sigma$-field; see, for example, Matthes et al. (1978, Proposition 1.4.7). In the latter form the result is related to the characterization of a random set by its avoidance function [cf. Daley and Vere-Jones (1988, Section 7.3), Matheron (1975, Section 2.2), and Matthes et al. (1978, Theorem 1.4.10)].

2.3. Stationarity and related properties

In this section it is assumed that the state space is Euclidean, i.e. $S = \mathbb{R}^d$ for some positive integer $d$. A point process is called (strictly) stationary if its probabilistic properties are invariant under translation: specifically, if for every $u$ in the state space $S$, every positive integer $k$, every collection of Borel subsets $B_1, \ldots, B_k$ of the state space, and every collection $n_1, \ldots, n_k$ of nonnegative integers, we have

$$\Pr(N(B_1) = n_1, \ldots, N(B_k) = n_k) = \Pr(N(B_1 + u) = n_1, \ldots, N(B_k + u) = n_k) ,$$

where $B + u = \{x + u : x \in B\}$ denotes the translate of $B$ by $u$. For a stationary point process it follows that the distribution, and hence expected value, of the number of points falling in any translate of a given Borel set $B$ is the same as that for the set $B$. A further consequence is that $\mathrm{E}N(B) = \mu|B|$, where $|B|$ denotes the Lebesgue measure (length, area, volume etc.) of $B$ and $\mu$, called the intensity, denotes the expected number of points falling in any set $U$ having $|U| = 1$. The intensity of a stationary point process is thus always a nonnegative real number, or possibly $+\infty$. Also, for any stationary point process it can be shown that the limit

$$\rho = \lim_{h \to 0+} \frac{\Pr(N(B_h) > 0)}{h^d} , \qquad (1)$$

where $B_h = (0, h]^d$, exists and $0 \le \rho \le \infty$. The quantity $\rho$ is often called the rate of the point process, though this terminology is not standardized in the literature. Furthermore, since $\Pr(N(B_h) > 0) \le \mathrm{E}N(B_h)$ for all $h$, it follows that $\rho \le \mu$.

From any point process it is possible to derive an associated simple point process by 'forgetting' all multiplicities in the original process. This derived process is stationary whenever the original process is stationary, and its intensity $\mu^*$ can be shown to satisfy $\mu^* = \rho$, where $\rho$ is the rate of the original process and the common value $+\infty$ is permissible. This latter result is widely known in the literature as Korolyuk's theorem. Extending these ideas it is possible to prove [cf. Milne (1971)] the existence for any stationary point process of a batch-size distribution $\{\pi_k : k = 0, 1, \ldots\}$, and show that $\mu = \rho \sum_{k=0}^{\infty} k \pi_k$. When the point process is simple, $\pi_1 = 1$ and so $\mu = \rho$.

The distribution $\{\pi_k : k = 0, 1, \ldots\}$ can be viewed as the distribution of the number of occurrences at an arbitrary point, $x$, given that there is at least one occurrence there. A generalization of this idea is to look at the joint distribution of numbers of occurrences in some configuration of sets considered relative to an arbitrary point, $x$, given that there is an occurrence at $x$, and leads ultimately to the Palm distribution of a stationary point process. This can be defined as the conditional distribution of the process given that, for example, there is a point of the process at $x$. (Because of stationarity, this conditioning can be reduced to the demand that there be a point at the origin.) Such conditioning requires care because the conditioning event is clearly one of probability zero.
It is by means of the Palm distribution that we must approach, for example, the distribution of the times between successive points ('intervals') of a stationary point process on the real line when starting from a description of the process in terms of its fidi distributions. An important inversion formula allows the latter distributions to be expressed in terms of the Palm distribution. For further details concerning Palm distributions see Thorisson (1995), Sigman (1995, Chapter 4) or Stoyan et al. (1995, Section 4.4).

For spatial point processes ($d \ge 2$) one may also be interested in isotropy, a concept analogous to stationarity, defined by all the fidi distributions being invariant under rotations. For a stationary isotropic point process the Palm distribution is needed at least in order to define formally the nearest-neighbour distribution function (the extension of the distribution function of a typical interval between successive points for a stationary point process on the real line) and also the K-function. These concepts will be introduced in the discussion of statistical inference for such processes in Section 5.3, where it will be seen that the nearest-neighbour distribution function and the K-function each play an important role.

A point process is termed mixing essentially if events defined in terms of the numbers of points in widely separated sets are close to being independent. [For a formal definition see, for example, Daley and Vere-Jones (1988, Section 10.3).] Some such property is needed to ensure consistent estimation of the intensity of a stationary point process from a single realization; many use the related notion of ergodicity [cf. Stoyan and Stoyan (1994, p. 194) or Karr (1991, Section 1.8 and Chapter 9)].

2.4. Poisson, Bernoulli and renewal processes

A homogeneous Poisson process in a Euclidean space $S = \mathbb{R}^d$ can be defined by the following two requirements:

(P1) for every bounded Borel subset $B$ of the state space, $N(B) \sim \mathrm{Poi}(\lambda|B|)$ (i.e. $N(B)$ is Poisson distributed with parameter $\lambda|B|$), where $|\cdot|$ denotes Lebesgue measure; and

(P2) for every positive integer $k$ and all sequences $B_1, \ldots, B_k$ of pairwise disjoint bounded Borel sets, $N(B_1), \ldots, N(B_k)$ are mutually independent random variables.

Any point process satisfying property (P2) is commonly called completely random. Together, (P1) and (P2) specify the form of all fidi distributions and ensure the resulting process is stationary. Moreover, (P1) clearly ensures that $\lambda$ is the intensity of the process. It is straightforward to show that such a process is simple, and hence that $\lambda$ is also the rate of the process. A fundamental property of any homogeneous Poisson process is that, given the number of points in a bounded subset of the state space, these points are distributed independently and uniformly over the subset.
This 'conditional' property is an important tool in proving other results about homogeneous Poisson processes and about processes that are defined using them. It is also the key to simulation of such processes within a bounded set. The point process that results from such conditioning of a homogeneous Poisson process is a particular type of Bernoulli process [cf. Kingman (1993, Section 2.4)]. In general, such a process is defined on a compact (i.e. closed and bounded) set W by the demand that all its realizations have the same fixed (total) number of points and that these points be distributed independently and identically over W according to some specified probability distribution. As a consequence, these processes are easy to simulate (see Sections 4.1 and 4.6). Bernoulli processes are also called sample processes [cf. Kallenberg (1983, p. 15)] and binomial (point) processes [cf. Stoyan et al. (1995, Section 2.2)]. Such processes are, in a sense, more basic than Poisson processes.
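The simulation route suggested by the 'conditional' property can be sketched as follows: draw the total number of points in a rectangle W from the Poisson distribution required by (P1), then place that many points independently and uniformly over W, which is a Bernoulli (binomial) process given the total. This is only a minimal sketch under stated assumptions: the window is a rectangle, the function names are hypothetical, and the Poisson variate is generated by counting unit-rate exponential gaps (a simple method chosen to avoid dependencies outside the standard library).

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]


def sample_poisson(mean: float, rng: random.Random) -> int:
    """Draw a Poisson(mean) variate by counting unit-rate exponential
    inter-arrival times that fit into [0, mean]."""
    count = 0
    total = rng.expovariate(1.0)
    while total <= mean:
        count += 1
        total += rng.expovariate(1.0)
    return count


def binomial_process(n: int, width: float, height: float,
                     rng: random.Random) -> List[Point]:
    """Bernoulli (binomial) process: n independent uniform points on W."""
    return [(rng.uniform(0.0, width), rng.uniform(0.0, height))
            for _ in range(n)]


def homogeneous_poisson(lam: float, width: float, height: float,
                        rng: random.Random) -> List[Point]:
    """Homogeneous Poisson process of intensity lam on W = [0, width] x [0, height]."""
    n = sample_poisson(lam * width * height, rng)   # (P1): N(W) ~ Poi(lam |W|)
    return binomial_process(n, width, height, rng)  # 'conditional' property


rng = random.Random(2001)
points = homogeneous_poisson(lam=5.0, width=2.0, height=1.0, rng=rng)
print(len(points))  # one realization of N(W); its expected value is 10
```

For large values of the mean the exponential-gap method becomes slow, and a library Poisson generator would normally be preferred.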


Suppose now that, in the above definition of a Poisson process, it is agreed to relax the implicit stationarity requirement and allow the intensity to vary over the state space, with its value at any $u \in S$ being given by $\lambda(u)$ which is assumed nonnegative. In addition, it would usually now be assumed that $\int_B \lambda(u)\,du$ is finite for all bounded Borel sets $B$. If (P1) in the definition of a homogeneous Poisson process is replaced by

(P1)$'$ for every bounded Borel subset $B$ of the state space, $N(B) \sim \mathrm{Poi}\left(\int_B \lambda(u)\,du\right)$,

then the resultant process is called an inhomogeneous or nonhomogeneous Poisson process (a terminology not meant to exclude homogeneous Poisson processes as the special cases for which the intensity function is in fact constant), or simply a Poisson process with intensity function $\lambda(\cdot)$. The class of inhomogeneous Poisson processes includes, for example, processes for which $\lambda(u)$, or what may be preferable $\ln \lambda(u)$, exhibits some trend with $u$, is a periodic function, or is dependent on the values of some associated covariates at specified points $u$; see, for example, Cox (1972), Cressie (1991, pp. 654-657), or Lewis (1972b, Section 5). The classes of Poisson processes discussed here will be further extended in Section 2.8.

An (ordinary) renewal process on $S = [0, \infty)$ can be defined by the set $\{L_1, L_1 + L_2, L_1 + L_2 + L_3, \ldots\}$ of random points, where $L_1, L_2, L_3, \ldots$ are independent and identically distributed 'lifetime' (i.e. nonnegative) random variables. When the first lifetime, $L_1$, is allowed a distribution different from that of the other random variables the process is commonly called a modified renewal process. If the common distribution of $L_1, L_2, L_3, \ldots$ is an exponential distribution, then the process reduces to a homogeneous Poisson process on $[0, \infty)$. Aside from this special case, counting properties of a renewal process are not simple to describe, though some results can be derived; see, for example, Cox (1962, Chapter 3). Various generalizations of renewal processes have been attempted in a planar setting, i.e. with $S = \mathbb{R}^2$; for example, Hunter (1974a, b). Whilst such generalizations may have value for particular purposes, they were not intended to provide an approach to specification of general spatial point processes through 'interval' properties.
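The construction of an ordinary renewal process as the partial sums of independent lifetimes translates directly into code. The sketch below is illustrative only: `renewal_points` is a hypothetical helper, the process is observed on a finite horizon, and the gamma lifetime distribution in the second example is an arbitrary choice made for contrast with the exponential (Poisson) case.

```python
import random
from typing import Callable, List


def renewal_points(lifetime: Callable[[], float], horizon: float) -> List[float]:
    """Return the renewal points L1, L1 + L2, ... that fall in [0, horizon].

    `lifetime` is called once per point and returns a draw from the common
    lifetime distribution; lifetimes are assumed to be positive with
    probability one, so that the loop terminates.
    """
    points: List[float] = []
    t = lifetime()
    while t <= horizon:
        points.append(t)
        t += lifetime()
    return points


rng = random.Random(7)

# Exponential lifetimes with rate 2: a homogeneous Poisson process on [0, 10].
poisson_case = renewal_points(lambda: rng.expovariate(2.0), horizon=10.0)

# Gamma lifetimes (shape 4, scale 0.125) with the same mean 0.5 but less spread.
gamma_case = renewal_points(lambda: rng.gammavariate(4.0, 0.125), horizon=10.0)

print(len(poisson_case), len(gamma_case))
```

A modified renewal process could be obtained in the same way by drawing the first lifetime from a different distribution before entering the loop.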

2.5. Some processes derived from Poisson processes

Many types of point process can be defined in terms of simpler point processes, for example Poisson or renewal processes. Whilst fidi distributions could then be derived, in practice such derivations may be difficult or tedious, and full details may not be needed. Probability generating functions or moment generating functions of relevant fidi distributions, or selected summary measures or moments, may provide useful tools. Often it is possible to determine a usable expression for a generating functional (Section 2.10) of the derived process.

From Poisson processes, three broad classes of point processes can readily be constructed by introducing further randomness. A compound Poisson process is obtained from a Poisson process (homogeneous or inhomogeneous) if, independently of the other points, each point of the Poisson process is replaced by a random number of new points, with these numbers of new points identically distributed and every new point placed at its associated Poisson location. In general, this derived process may have points with multiplicity greater than one. A mixed Poisson process [cf. Grandell (1997)] is defined by allowing the parameter $\lambda$ of a homogeneous Poisson process to have a specified distribution. Such a process provides one of the simplest types of process which is not mixing, the lack of mixing being essentially a consequence of the dependence of the counts of points in even widely separated sets on the common value of $\lambda$. By allowing the intensity function of an inhomogeneous Poisson process to be a realization of some other stochastic process, a third class, that of doubly stochastic Poisson processes or Cox processes, is obtained; see, for example, Grandell (1976, 1997). The 'driving' stochastic process, a random field if the state space is $\mathbb{R}^2$, may represent an underlying (often unobservable) environmental heterogeneity. When the state space is the real line, taking the stochastic process governing the intensity function to be a continuous-time Markov chain with finitely many states yields a special type of Cox process known as a Markov modulated Poisson process; see, for example, Rydén (1995). The simplest case then arises from a Markov chain having just two states, with these corresponding to a high level and a low (or even zero) level for the intensity function.

As is intuitively clear from the presence of further randomness beyond that of a Poisson process, the compound Poisson, mixed Poisson and Cox process models have the property of being overdispersed relative to a Poisson process, in that the variance of any count $N(B)$ will usually be greater than the corresponding mean.
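The overdispersion of a mixed Poisson process can be checked numerically. The sketch below compares counts in a set B with |B| = 1 for a homogeneous Poisson process of fixed intensity 3 and for a mixed Poisson process whose intensity is drawn afresh for each realization. The gamma mixing distribution (shape 2, scale 1.5, hence mean 3 and variance 4.5) and all function names are assumptions made for illustration; they are not taken from the text.

```python
import random
import statistics


def sample_poisson(mean: float, rng: random.Random) -> int:
    """Draw a Poisson(mean) variate by counting unit-rate exponential
    inter-arrival times that fit into [0, mean]."""
    count = 0
    total = rng.expovariate(1.0)
    while total <= mean:
        count += 1
        total += rng.expovariate(1.0)
    return count


def poisson_count(lam: float, size: float, rng: random.Random) -> int:
    """N(B) for a homogeneous Poisson process with fixed intensity lam, |B| = size."""
    return sample_poisson(lam * size, rng)


def mixed_poisson_count(size: float, rng: random.Random) -> int:
    """N(B) for a mixed Poisson process: the intensity is drawn once per
    realization, here from an assumed gamma distribution with mean 3."""
    lam = rng.gammavariate(2.0, 1.5)   # shape 2, scale 1.5
    return sample_poisson(lam * size, rng)


rng = random.Random(19)
reps = 20000
plain = [poisson_count(3.0, 1.0, rng) for _ in range(reps)]
mixed = [mixed_poisson_count(1.0, rng) for _ in range(reps)]

for name, counts in (("Poisson", plain), ("mixed Poisson", mixed)):
    mean = statistics.fmean(counts)
    var = statistics.pvariance(counts)
    print(f"{name}: mean = {mean:.2f}, variance = {var:.2f}, ratio = {var / mean:.2f}")
```

Under these assumptions the variance-to-mean ratio should be close to 1 for the Poisson counts and close to 2.5 for the mixed Poisson counts, since the variance of the mixed count is the mean of the intensity plus its variance (3 + 4.5) when |B| = 1.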

2.6. Product densities

An approach to point process properties that is both intuitively appealing and useful is through what are called the product densities of the process. These are defined for any simple point process in a Euclidean state space, subject to some further conditions that will be discussed in the next section. Product densities can be described in terms of differentials as follows: the first-order product density is given, for any $u$ in the state space $S$, by $m_1(u)\,du = \Pr(N(du) = 1)$. The above probabilities are expressions for the probability of the event 'a point at $u$'. A Poisson process with intensity function $\lambda(\cdot)$ has $m_1(\cdot) = \lambda(\cdot)$. For a general point process, $m_1(\cdot)$ itself is called the intensity function of the process. When the state space is one-dimensional and interpreted as time, the term instantaneous intensity function is often used. Higher-order product densities, the 'coincidence densities' of Macchi (1975), are given analogously: for any positive integer $k$ and any distinct $u_1, \ldots, u_k$ in the state space, the $k$th order product density is the function $m_k(\cdot)$ given by

$$m_k(u_1, \ldots, u_k)\,du_1 \cdots du_k = \Pr(N(du_1) = 1, \ldots, N(du_k) = 1); \tag{2}$$


see also Daley and Vere-Jones (1988, Sections 5.4 and 5.7) and Srinivasan (1974, Chapter 2). For an inhomogeneous Poisson process with intensity function $\lambda(\cdot)$ the product densities are given by $m_k(u_1, \ldots, u_k) = \lambda(u_1) \cdots \lambda(u_k)$ for all $k$ and distinct $u_1, \ldots, u_k$. For a stationary point process $m_1(u) = \mu$, the intensity of the point process, and $m_2(u_1, u_2)$ is a function just of $u_2 - u_1$. If the process is also isotropic, $m_2(u_1, u_2)$ is a function just of $d(u_1, u_2)$, the Euclidean distance between $u_1$ and $u_2$. In the case of a homogeneous Poisson process with intensity $\lambda$ the product densities reduce to $m_k(u_1, \ldots, u_k) = \lambda^k$ for all $k$ and distinct $u_1, \ldots, u_k$. Moreover, for a mixed Poisson process the product densities are the respective moments (about the origin) of the (now random) intensity. The product densities of a renewal process on $S = [0, \infty)$ are

$$m_k(u_1, \ldots, u_k) = h_1(u_1)\,h(u_2 - u_1) \cdots h(u_k - u_{k-1})$$

for all $k$ and all $u_1 < u_2 < \cdots < u_k$, where $h$ is the (ordinary) renewal density and $h_1$ the modified renewal density (which takes account of the fact that the origin need not be a renewal point). For a general point process described by product densities, the simplest connection with moments of counts is that $m_1(u)\,du = \Pr(N(du) = 1) = \mathbb{E}N(du)$, or equivalently $\mathbb{E}N(B) = \int_B m_1(u)\,du$, and

$$\mathbb{E}\{N(B)[N(B) - 1]\} = \int_B \int_B m_2(u_1, u_2)\,du_1\,du_2,$$

the latter quantity being the second factorial moment of $N(B)$. These ideas lead inevitably to the consideration of moment measures.
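As a small numerical illustration of the last identity, consider a homogeneous Poisson process, for which $m_2 \equiv \lambda^2$. The sketch below, with illustrative values $\lambda = 3$ and $|B| = 2$ chosen here and not taken from the text, compares the second factorial moment of a Poisson count with the double integral $\lambda^2 |B|^2$.

```python
import math

def poisson_pmf(n, mean):
    """Poisson probability Pr(N = n), computed in log space."""
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))

def second_factorial_moment(mean, n_max=300):
    """E{N(N - 1)} for a Poisson count, by direct (truncated) summation."""
    return sum(n * (n - 1) * poisson_pmf(n, mean) for n in range(n_max))

lam, size = 3.0, 2.0                 # illustrative intensity and |B|
lhs = second_factorial_moment(lam * size)
rhs = lam ** 2 * size ** 2           # double integral of m_2 = lam^2 over B x B
print(lhs, rhs)
```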

2.7. Moment measures

Moment measures are, for a point process, the quantities which are the analogues of ordinary moments of a random variable. Such point process quantities are measures in the sense of the theory of measure and integration. Rather more knowledge of this theory than has been demanded in many earlier sections is needed to pursue a detailed study of moment measures, though even here some intuitive understanding can be developed. For any given point process, the set function $M_1(\cdot)$ defined for $B \in \mathcal{B}_S$, the $\sigma$-field of Borel subsets of $S$, by $M_1(B) = \mathbb{E}N(B)$, inherits the nonnegativity and additivity properties of counts associated with any realization and so is itself a measure, variously called the mean measure, the intensity measure, or the first moment measure of the process. Higher-order moment measures can be defined as follows. For $k = 2, 3, \ldots$ define the $k$th moment measure, $M_k(\cdot)$, on 'rectangles' $B_1 \times \cdots \times B_k$ in $S^k$, the Cartesian product of the state space with itself, by

$$M_k(B_1 \times \cdots \times B_k) = \mathbb{E}\{N(B_1) \cdots N(B_k)\}. \tag{3}$$

Each object, $M_k(\cdot)$, so defined, can be extended in a standard (measure-theoretic) way to a measure $M_k(\cdot)$ on the Borel sets of $S^k$, and this extension is always a


symmetric measure since the order in which $N(B_1), \ldots, N(B_k)$ are multiplied does not matter. The simplest aspects of the dependence structure of a point process are embodied in its second moment measure $M_2(\cdot)$. The covariance measure can be defined in a similar manner, starting from $C(B_1 \times B_2) = M_2(B_1 \times B_2) - M_1(B_1)M_1(B_2)$ for 'rectangles' $B_1 \times B_2$ in $S^2$. Although it is additive for disjoint sets, observe that $C(\cdot)$ may take negative values and so is a signed measure. Substituting $B_1 = B_2 = B$ gives $M_2(B \times B) = \mathbb{E}\{N(B)^2\}$, and also the variance function, a set function, defined by $\operatorname{var} N(B) = C(B \times B)$. When the state space is the real line and $B = (0, t]$, it is the (point) function $V(t) = \operatorname{var} N((0, t])$ which is usually termed the variance function. More generally, for any point process cumulant measures $C_1(\cdot), C_2(\cdot), \ldots$ can be defined in terms of the moment measures by the usual moment-cumulant formulae; see, for example, Prohorov and Rozanov (1969, p. 347). In particular, $C_1(\cdot) = M_1(\cdot)$ and $C_2(\cdot)$ is the covariance measure $C(\cdot)$ as defined above. Moment (and cumulant) measures suffer the disadvantage that the second and all higher-order measures of any point process have 'diagonal concentrations' [Daley and Vere-Jones (1988, Section 5.4)], as a consequence of them being able to be viewed as first moment measures of 'product point processes' constructed from the original point process. This disadvantage can be side-stepped, at least for simple point processes, by using instead the factorial moment measures, which are the point process analogues of factorial moments for a random variable. The first factorial moment measure coincides with the mean measure, while the second factorial moment measure $M_{[2]}(\cdot)$ can be defined by

$$M_{[2]}(B_1 \times B_2) = M_2(B_1 \times B_2) - M_1(B_1 \cap B_2) \tag{4}$$

for Borel subsets $B_1$ and $B_2$ of $S$. Observe that when $B_1 = B_2 = B$ (say) the above right-hand side reduces to the second factorial moment of $N(B)$, and that when $B_1$ and $B_2$ are disjoint it reduces to $M_2(B_1 \times B_2)$. Factorial cumulant measures can also be considered; for details see Daley and Vere-Jones (1988, Section 5.5). Each set of moment/cumulant measures can be linked in a natural way with a generating functional, as will be indicated for moment and factorial moment measures in Section 2.10. When $S = \mathbb{R}^d$ for some $d$, the factorial moment measures may be absolutely continuous with respect to Lebesgue measure (length, area, volume etc.) in the appropriate dimensional Euclidean space (e.g. $\mathbb{R}^d$ for the mean measure and $\mathbb{R}^{2d}$ for the second factorial moment measure), and so able to be defined by densities. These are then the densities which we introduced earlier in a heuristic manner under the name of product densities. An important and useful result about moment measures is Campbell's theorem [cf. Daley and Vere-Jones (1988, Section 6.4), and Kingman (1993, Section 3.2) in the Poisson case]. The simplest version of this result is that for any nonnegative measurable function, or any $M_1$-integrable function $g$, and any Borel subset $B$ of the state space

$$\mathbb{E}\Big\{\int_B g(u)\,N(du)\Big\} = \int_B g(u)\,M_1(du). \tag{5}$$

An alternative, possibly more intuitive expression of this is

$$\mathbb{E}\Big\{\sum_{i:\,X_i \in B} g(X_i)\Big\} = \int_B g(u)\,m_1(u)\,du,$$

where $\{X_i\}$ is the set of random points corresponding to the point process, and $m_1(u)$ its intensity function (assumed to exist). When the point process is stationary with intensity $\mu$, the latter integral simplifies to $\mu \int_B g(u)\,du$. Such results are automatically true whenever $g$ is the indicator function of some Borel subset of $S$, because they are then a consequence of the definition of the first moment measure. The extension to finite linear combinations of indicator functions with nonnegative coefficients (simple functions) is immediate, whilst the extension to the stated classes of functions can be achieved by the usual extension methods of measure and integration theory (i.e. approximation by increasing sequences of simple functions etc.). The above simple version of Campbell's theorem can be extended to the following result about higher-order moment measures: for any nonnegative measurable function, or any $M_k$-integrable function $g$, and any Borel subset $A$ of $S^k$,

$$\mathbb{E}\Big\{\int_A g(u_1, \ldots, u_k)\,N(du_1) \cdots N(du_k)\Big\} = \int_A g(u_1, \ldots, u_k)\,M_k(du_1 \cdots du_k). \tag{6}$$

There are also useful refinements involving the Palm distribution; see, for example, Stoyan et al. (1995, Section 4.4).
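The simple version (5) of Campbell's theorem lends itself to a Monte Carlo check. The following is a minimal sketch, assuming a homogeneous Poisson process of intensity 50 on $B = [0, 1]$ and the test function $g(u) = u^2$ (both choices are illustrative, as are the function names). The sample average of $\sum_i g(X_i)$ should then be close to $\mu \int_0^1 u^2\,du = 50/3$.

```python
import random

def sample_poisson_points(lam, rng):
    """Points of a homogeneous Poisson process on [0, 1], built from exponential gaps."""
    points, t = [], rng.expovariate(lam)
    while t <= 1.0:
        points.append(t)
        t += rng.expovariate(lam)
    return points

def campbell_estimate(g, lam, reps, seed=0):
    """Monte Carlo estimate of E{ sum_i g(X_i) } over independent realizations."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        total += sum(g(x) for x in sample_poisson_points(lam, rng))
    return total / reps

lam = 50.0
estimate = campbell_estimate(lambda u: u * u, lam, reps=4000)
exact = lam / 3.0   # intensity times the integral of u^2 over [0, 1]
print(estimate, exact)
```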

2.8. General Poisson processes

Suppose that $\mu(\cdot)$ is a measure on $(S, \mathcal{B}_S)$, where $S$ is not necessarily a Euclidean space and $\mu(B)$ is finite for bounded sets $B$. Replace (P1) in the definition of a homogeneous Poisson process (given in Section 2.4) by

(P1)'' for every bounded Borel subset $B$ of the state space, $N(B) \sim \mathrm{Poi}(\mu(B))$.

The existence of a point process satisfying (P1)'' and the complete randomness property (P2) is a consequence of the refined Kolmogorov theorem, since these two requirements specify a suitable form for all fidi distributions. It is immediate that this point process has mean measure $\mu(\cdot)$. Such a process is usually referred to as a Poisson process with mean measure $\mu(\cdot)$. Reflecting the property (P2), $\operatorname{cov}(N(B_1), N(B_2)) = \operatorname{var}(N(B_1 \cap B_2)) = \mu(B_1 \cap B_2)$, and hence the covariance measure $C(\cdot)$ of such a process is determined by $C(B_1 \times B_2) = \mu(B_1 \cap B_2)$ for all Borel sets $B_1$ and $B_2$ of $S$. This shows that the covariance measure of a Poisson


process is concentrated on the leading diagonal in $S \times S$, and that the second factorial cumulant measure is identically zero. One aspect of the generality of this definition is indicated by the observation that with the choice of a single point state space, for example $S = \{1\}$, the resultant point process corresponds to a single Poisson random variable, whilst when the state space has $m$ points, for example $S = \{1, 2, \ldots, m\}$, the point process corresponds to a random $m$-vector with independent Poisson distributed components. Another aspect is that when $S = \mathbb{R}^d$, homogeneous and inhomogeneous Poisson processes as introduced earlier are special cases with respectively $\mu(B) = \lambda|B|$ and $\mu(B) = \int_B \lambda(u)\,du$ for Borel sets $B$. In principle, with the above definition of a Poisson process the mean measure could have a discrete component, with part, or even all, of its mass concentrated on a set of points (atoms), of which there can be at most countably many and only a finite number in any bounded set $B$. For any $x$ with $\mu_x = \mu(\{x\}) > 0$, it follows that $\Pr(N(\{x\}) > 0) > 0$; thus such an $x$, often termed a fixed atom, is a possible multiple point of the process. (In fact, $N(\{x\}) \sim \mathrm{Poi}(\mu_x)$.) A Poisson process with mean measure $\mu(\cdot)$ on $S = \mathbb{R}$ is simple if and only if its mean measure has no atoms. In general the mean measure could have part of its mass concentrated on some lower-dimensional subspace, for example a line, and so in general a Poisson process could have a positive probability of points lying in such a subspace. None of these possibilities can occur for a homogeneous Poisson process, or even an inhomogeneous Poisson process as defined in Section 2.4.
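The effect of a fixed atom can be made concrete with a short calculation. The sketch below assumes an atom of mass $\mu_x = 0.7$ (an arbitrary illustrative value) and uses only $N(\{x\}) \sim \mathrm{Poi}(\mu_x)$ to evaluate the probability of at least one point, and of a multiple point, at $x$.

```python
import math

def atom_probabilities(mass):
    """Probabilities for the count at a fixed atom x, where N({x}) ~ Poi(mass)."""
    p0 = math.exp(-mass)   # Pr(N({x}) = 0)
    p1 = mass * p0         # Pr(N({x}) = 1)
    return {"at_least_one": 1.0 - p0, "multiple": 1.0 - p0 - p1}

probs = atom_probabilities(0.7)
print(probs)  # both probabilities are strictly positive, so the process is not simple
```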
Some, though not all, authors restrict attention to Poisson processes whose mean measure has no discrete component, and thereby implicitly to Poisson processes which are simple; see, for example, Kingman (1993) where this approach is motivated by a view of a Poisson process as a random set (which by definition is not allowed repeated points). Any Poisson process with mean measure $\mu(\cdot)$ has the following property: for any bounded set $B$ with $\mu(B) > 0$, any partition of $B$ into Borel sets $B_1, \ldots, B_k$, and any nonnegative integers $n_1, \ldots, n_k$,

$$\Pr(N(B_1) = n_1, \ldots, N(B_k) = n_k \mid N(B) = n) = \frac{n!}{n_1! \cdots n_k!} \prod_{i=1}^{k} \left[\frac{\mu(B_i)}{\mu(B)}\right]^{n_i} \tag{7}$$

whenever $n = n_1 + \cdots + n_k$. Thus, given $N(B) = n$, these $n$ points can be considered to be independently distributed over $B$ according to the probability measure $\mu_B(\cdot)/\mu(B)$ where $\mu_B(\cdot) = \mu(\cdot \cap B)$ denotes the restriction of $\mu(\cdot)$ to $B$. For a stationary Poisson process on $\mathbb{R}^d$ this measure reduces to the uniform distribution over $B$, and the property is exactly the 'conditional' property discussed earlier (in Section 2.4) for such processes. Kingman (1993) is an extended and erudite essay which presents the many beautiful properties and applications of Poisson and related processes, and shows how such processes can be defined in spaces having minimal structure. In particular, Section 3.4 of Kingman's book has an interesting discussion of a


fundamental result due to Rényi (1967). In one form this result shows that, although the defining assumptions (P1)'' and (P2) are not incompatible (and are conveniently taken as the common definition), in fact (P2) is redundant in the presence of (P1)''. Moreover, the latter assumption can be considerably weakened: provided the mean measure is finite on bounded sets and has no discrete component, it is enough to demand the Poisson form for the probabilities of zero counts for a suitably large subclass of Borel sets, specifically for the sets of a semiring generating the Borel $\sigma$-field (e.g. on the real line, all sets which are finite unions of half-open intervals). (See also the earlier discussion, in Section 2.2, of Mönch's theorem.)

2.9. Random measures

In studying some aspects of point processes one is soon led, for one reason or another, to consider random measures. From a mathematical point of view, as has been discussed in Section 2.1, it is natural to consider a point process realization as a counting measure, and it then seems equally natural to consider stochastic processes whose realizations are measures. Random measures arise naturally also from a modelling viewpoint:

• as models for the distribution of mass, for example of some mineral, over an area or in space;
• in dealing with a generalization of Cox processes as discussed in Section 2.5 where, in order to take account of possible environmental heterogeneity, we allow the mean measure of a Poisson process to be random [cf. Stoyan et al. (1995, Section 5.2)]; and
• as mark-sum processes built from an underlying marked point process (see Section 4.2) by means of $\xi(B) = \sum_{i:\,x_i \in B} k_i$, where the realization of the marked point process is denoted by $\{(x_i, k_i)\}$.

Some further reasons are based on connections with other stochastic processes:

• for a line segment process, consider $\xi(B)$ = total length of line segments intersecting $B$;
• for a stochastic process, consider $\xi(B)$ = length/area etc. of the intersection with $B$ of exceedance regions for a given level; and
• for a Boolean model [cf. Stoyan et al. (1995, Chapter 3)] built from discs centred at the points of a planar Poisson process, consider $\xi(B)$ = area of the intersection with $B$ of the random set defined by the union of the discs, or $\xi(B)$ = perimeter of the intersection with $B$ of the boundary of that random set.

For some other random measures that arise naturally in the study of stochastic geometry see Section 7.3.4 of Stoyan et al. (1995). In thinking about a realization of a random measure as compared with that of a point process, there arise considerations similar to those which arise when contemplating a general measure (or probability distribution) as compared with


one that is discrete; a realization of a random measure may 'smear' the mass over a region and/or it may place mass at a discrete set of points (atoms). The formal definition of a random measure can proceed in a manner similar to the point process case. In the present context, take the space of realizations to be $M$ = the set of all measures $\xi(\cdot)$ on $(S, \mathcal{B}_S)$ that are finite on bounded sets and let $\mathcal{M}$ = the smallest $\sigma$-field of subsets of $M$ containing all sets of the form $\{\xi(\cdot) \in M : \xi(B_1) \le u_1, \ldots, \xi(B_k) \le u_k\}$ for all possible positive integers $k$, all Borel sets $B_1, \ldots, B_k$ of $S$ and all nonnegative real numbers $u_1, \ldots, u_k$. Then, corresponding to the two points of view indicated for point processes in Section 2.2, a random measure can be defined either as a probability measure on the space $(M, \mathcal{M})$, or as a measurable mapping from some probability space $(\Omega, \mathcal{F}, P)$ into $(M, \mathcal{M})$. The finite-dimensional (fidi) distributions of a random measure $\xi$ are the (joint) distributions, on $\mathbb{R}^k$, of $\xi(B_1), \ldots, \xi(B_k)$ where $B_1, \ldots, B_k$ are bounded Borel sets of $S$ and $k$ is a positive integer. Since they will not, in general, be discrete distributions (as in the point process case) or distributions absolutely continuous with respect to the appropriate Lebesgue measure (and so able to be described by density functions), such fidi distributions need to be described by their distribution functions

$$F_k(B_1, \ldots, B_k; u_1, \ldots, u_k) = \Pr(\xi(B_1) \le u_1, \ldots, \xi(B_k) \le u_k), \qquad u_1, \ldots, u_k \ge 0.$$

There is a generalization to random measures of the refined Kolmogorov theorem described in the point process case. For the fidi distributions we require a consistent family of probability measures on the nonnegative orthant of $\mathbb{R}^k$ for positive integers $k$ (by contrast with $\{0, 1, \ldots\}^k$ for positive integers $k$ in the point process case). In addition, as in the point process case, conditions are required in order to ensure 'additivity' and the 'measure character' of realizations. Such results are provided, for example, in Daley and Vere-Jones (1988, Theorem 6.2.VII); see also Kallenberg (1983, Section 5.2). Moment measures can be introduced for random measures in much the same way as they were for point processes. In particular, the first moment measure, mean measure or intensity measure is given by $M_1(B) = \mathbb{E}\xi(B)$, $B \in \mathcal{B}_S$. A simple, though rather trivial, example of a random measure can be specified as follows. Suppose $S = \mathbb{R}^d$ and that $|\cdot|$ denotes Lebesgue measure. Let $X$ be a nonnegative random variable (e.g. gamma distributed) and set $\xi(B) = X|B|$ for any Borel set $B$ of $S$. A less trivial example is built, as is the Poisson process, by using complete randomness after specifying a suitable family of one-dimensional distributions. Suppose that $\beta$ is a positive real number and $\mu(\cdot)$ a measure on $(S, \mathcal{B}_S)$ which is finite for bounded sets $B$. A gamma random measure can be defined by the requirements:

(G1) for every bounded Borel subset $B$ of the state space, $\xi(B) \sim \mathrm{Gamma}(\mu(B), 1/\beta)$, where the latter is determined by the distribution function

$$F(u) = \frac{1}{\Gamma(\mu(B))\,\beta^{\mu(B)}} \int_0^u t^{\mu(B)-1} e^{-t/\beta}\,dt, \qquad u > 0,$$

or equivalently by the Laplace transform $\mathbb{E}\{e^{-t\xi(B)}\} = [1 + \beta t]^{-\mu(B)}$; and

(G2) for every positive integer $k$ and all sequences $B_1, \ldots, B_k$ of pairwise disjoint bounded Borel sets, $\xi(B_1), \ldots, \xi(B_k)$ are mutually independent random variables.

The complete randomness property (G2) ensures consistency, while it and the usual properties of gamma distributions with fixed scale parameter ensure the required additivity. Strictly, there is a need to check also a simple continuity condition [cf. Condition 6.2.4 of Daley and Vere-Jones (1988)], namely that for any non-increasing sequence $\{B_n\}$ of bounded sets from $\mathcal{B}_S$ that converges to the empty set $\emptyset$ as $n$ tends to infinity, the corresponding $\mathrm{Gamma}(\mu(B_n), 1/\beta)$ distributions converge to the degenerate distribution concentrated at the origin. Then, by appeal to a random measure version of the refined Kolmogorov theorem, the existence of a random measure $\xi$ which is completely random and has the above gamma distributions as its one-dimensional distributions can be assured. Since the distribution in (G1) has mean $\beta\mu(B)$, the mean measure of this gamma random measure is the measure $\beta\mu(\cdot)$. Such random measures were considered by Sewastjanov (1975, Chapter XII) in applications to branching processes. Because we are here dealing with a random measure, it may come as some surprise that any realization of such a gamma random measure can be shown to be a purely atomic measure. This result is well known from the theory of stochastic processes with independent increments [see, for example, Gikhman and Skorokhod (1969, Chapter VI) in the case $S = \mathbb{R}$]. Gamma random measures with $\mu(S) < \infty$ were used by Ferguson (1973) in a study of prior distributions on spaces of probability measures, and by Shorrock (1975) in a paper about discrete time extremal processes. Whilst there are few simple examples of random measures, essentially because there is a dearth of additive families of distributions, many general results are known.
For example, it is possible to characterize those random measures which are completely random, i.e. random measures satisfying just (G2); for an elegant treatment see Kingman (1993, Chapter 8). Kallenberg (1983) provides a systematic study of random measures, including completely random measures and the wider class of infinitely divisible random measures. Extensions of the theory of random measures to random signed measures are not routine. There is some discussion and a relevant example in Section 6.1 of Daley and Vere-Jones (1988); see also Section 7.1.1 of Stoyan et al. (1995), where it is pointed out that some theory exists and that this is important for studying curvatures of random closed sets.

2.10. Generating functionals

In the study of random variables and their distributions, various types of generating function (probability generating functions, moment generating functions or Laplace transforms, and characteristic functions) have proved to be useful


tools. Moment generating functions or Laplace transforms are well-suited to dealing with nonnegative random variables, while probability generating functions play a special role with nonnegative integer-valued random variables. All such generating functions have 'functional' analogues that are useful in the study of point processes. (Here the argument of a functional is a real-valued function, rather than a real number or a vector of real numbers.) These functionals provide a means of compactly summarizing information about point processes and enabling that information to be easily manipulated. The probability generating functional (pgfl) $G$ of a point process $N$ can be defined, for suitable functions $h$ with domain $S$ (and suitable conventions about the logarithm function when its argument is zero), by $G[h] = \mathbb{E}(\exp\{\int_S \ln h(u)\,N(du)\})$. The simplest class of such functions $h$ consists of those (measurable functions) taking values in the interval $[0, 1]$ and equal to one outside some bounded subset of $S$. Heuristically

$$G[h] = \mathbb{E}\Big(\prod_{u \in S} h(u)^{N(du)}\Big) = \mathbb{E}\Big(\prod_{u:\,N(\{u\}) > 0} h(u)^{N(\{u\})}\Big). \tag{8}$$

In the middle expression the integrand is a product integral, and shows that $G$ can be viewed as the joint probability generating function of all the random variables $N(du)$, $u \in S$; notice that at most countably many of these will be non-zero. Observe that for pairwise disjoint $B_1, \ldots, B_k$ and

$$h(x) = \begin{cases} z_i, & x \in B_i,\ i = 1, \ldots, k, \\ 1, & x \in S \setminus \bigcup_{i=1}^{k} B_i, \end{cases}$$

$G[h] = \mathbb{E}(z_1^{N(B_1)} \cdots z_k^{N(B_k)})$, which is the joint probability generating function of $N(B_1), \ldots, N(B_k)$. Working heuristically using (8), the defining properties of an inhomogeneous Poisson process with intensity function $\lambda(\cdot)$ (which, in particular, ensure that for $u \ne v$, $N(du)$ and $N(dv)$ are independent), and the form of the probability generating function of a Poisson distributed random variable, it follows that the pgfl of an inhomogeneous Poisson process with intensity function $\lambda(\cdot)$ is

$$G[h] = \prod_{u \in S} \exp\{[h(u) - 1]\lambda(u)\,du\} = \exp\Big\{\int_S [h(u) - 1]\lambda(u)\,du\Big\}, \tag{9}$$

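where the middle expression is again a product integral. Observe that (9) generalizes to $G[h] = \exp\{\int_S [h(u) - 1]\mu(du)\}$ for a Poisson process with mean measure $\mu(\cdot)$, and simplifies to $G[h] = \exp\{\lambda \int_S [h(u) - 1]\,du\}$ in the case of a homogeneous Poisson process with intensity $\lambda$. The Laplace functional (Lfl) of a random measure $\xi$ is defined by

$$L[h] = \mathbb{E}\Big(\prod_{u \in S} \exp\{-h(u)\,\xi(du)\}\Big) = \mathbb{E}\Big(\exp\Big\{-\int_S h(u)\,\xi(du)\Big\}\Big), \tag{10}$$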


where again the integrand in the middle expression is a product integral, for nonnegative (measurable) functions $h$ defined on $S$ that vanish outside some bounded subset of $S$. The requirements on the functions $h$ ensure that $\int_S h(u)\,\xi(du)$ is well-defined and finite almost surely. Observe that for pairwise disjoint $B_1, \ldots, B_k$ and $h$ given by $h(u) = \sum_{i=1}^{k} t_i 1_{B_i}(u)$ the Lfl of a random measure reduces to

$$L[h] = \mathbb{E}\Big(\exp\Big\{-\sum_{i=1}^{k} t_i\,\xi(B_i)\Big\}\Big), \tag{11}$$

that is to the Laplace transform of the joint distribution of $\xi(B_1), \ldots, \xi(B_k)$. Since any point process is a random measure, the Lfl can be used also for point processes. Working heuristically, as above for the pgfl of an inhomogeneous Poisson process, using (10), the Laplace transform of a gamma density and the defining properties of the gamma random measure of Section 2.9 with parameters $\beta$ and $\mu(\cdot)$, it follows that the Lfl of this random measure is

$$L[h] = \prod_{u \in S} \mathbb{E}(\exp\{-h(u)\,\xi(du)\}) = \prod_{u \in S} \exp\{-\ln[1 + \beta h(u)]\,\mu(du)\},$$

where the latter is a product integral expression which yields

$$L[h] = \exp\Big\{-\int_S \ln[1 + \beta h(u)]\,\mu(du)\Big\}. \tag{12}$$
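For a step function $h = \sum_i t_i 1_{B_i}$ on pairwise disjoint sets, (12) reduces to $\prod_i [1 + \beta t_i]^{-\mu(B_i)}$, which can be compared with a simulation based on (G1) and (G2). The following is a minimal sketch with illustrative values of $t_i$, $\mu(B_i)$ and $\beta$ chosen here; it relies on Python's `random.gammavariate(alpha, beta)`, whose first argument is the shape and whose second is the scale.

```python
import math
import random

def gamma_lfl_exact(ts, masses, beta):
    """exp{-sum_i ln(1 + beta * t_i) * mu(B_i)}, i.e. (12) for a step function h."""
    return math.exp(-sum(math.log1p(beta * t) * m for t, m in zip(ts, masses)))

def gamma_lfl_monte_carlo(ts, masses, beta, reps, seed=0):
    """Estimate E exp{-sum_i t_i xi(B_i)} with independent gamma masses xi(B_i)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        # (G1)-(G2): independent Gamma(shape mu(B_i), scale beta) masses on disjoint sets.
        xi = [rng.gammavariate(m, beta) for m in masses]
        total += math.exp(-sum(t * x for t, x in zip(ts, xi)))
    return total / reps

ts, masses, beta = [0.5, 1.0, 0.2], [1.0, 0.5, 2.0], 2.0   # illustrative values
exact = gamma_lfl_exact(ts, masses, beta)
estimate = gamma_lfl_monte_carlo(ts, masses, beta, reps=20000)
print(exact, estimate)
```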

For the reader desirous of a more formal approach to the derivation of either (9) or (12) it is straightforward to write down the relevant joint transform corresponding to any pairwise disjoint bounded Borel sets $B_1, \ldots, B_k$ and from this obtain the associated functional by taking a limit for a suitable sequence $\{h_n\}$ of arguments. That such a method for deriving a generating functional does work is a consequence of an appropriate characterization result; see, for example, Daley and Vere-Jones (1988, Theorem 7.4.II) for the pgfl case and their Exercise 6.4.2 for the Lfl version. Also of importance for each type of functional is a uniqueness result, which allows the distribution of a point process or random measure to be deduced from the form of its generating functional; see Jiřina (1964) for Lfls and Westcott (1972) for pgfls. Suppose that $\xi$ is a random measure on $(S, \mathcal{B}_S)$ with mean measure $\mu(\cdot)$. Consider a Cox process driven by this random measure, i.e. consider a point process which, given $\xi$, is a Poisson process on $S$ with mean measure $\xi(\cdot)$. Then it can be shown that the pgfl, $G$, of the Cox process is related to the Lfl $L$ of the driving random measure $\xi$ by $G[h] = L[1 - h]$, for any (measurable) function $h$ taking values in the interval $[0, 1]$ and equal to one outside some bounded subset of $S$. This generating functional relationship can be used to obtain various properties. It can be deduced that the mean measure of such a Cox process coincides with the mean measure of the driving random measure and also that the Cox process will be completely random iff its driving random measure is completely random. If $\xi$ is the gamma random measure discussed in Section 2.9, then the pgfl of the Cox process is

$$G[h] = \exp\Big\{-\int_S \ln[1 + \beta(1 - h(u))]\,\mu(du)\Big\}. \tag{13}$$

From this form it follows easily that the one-dimensional distributions of this Cox process are of negative binomial form with probability generating function

$$\mathbb{E}(z^{N(B)}) = [1 + \beta(1 - z)]^{-\mu(B)}, \qquad B \in \mathcal{B}_S. \tag{14}$$
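The negative binomial form in (14) can be verified directly. The sketch below assumes the standard parametrization in which the count has 'shape' $\mu(B)$ and success probability $1/(1 + \beta)$, and uses illustrative values $\mu(B) = 2.5$ and $\beta = 1.5$; it sums the probabilities against $z^n$ and compares the result with (14), and also reports the mean $\beta\mu(B)$ and variance $\beta\mu(B)(1 + \beta)$, which show the overdispersion of this Cox process.

```python
import math

def neg_binomial_pmf(n, shape, beta):
    """Pr(N(B) = n) when the pgf is [1 + beta * (1 - z)] ** (-shape), shape = mu(B)."""
    log_p = (math.lgamma(n + shape) - math.lgamma(shape) - math.lgamma(n + 1)
             + n * math.log(beta / (1.0 + beta)) - shape * math.log1p(beta))
    return math.exp(log_p)

def pgf_from_pmf(z, shape, beta, n_max=400):
    """Truncated series sum_n Pr(N(B) = n) z^n."""
    return sum(neg_binomial_pmf(n, shape, beta) * z ** n for n in range(n_max))

shape, beta, z = 2.5, 1.5, 0.3           # illustrative values
series = pgf_from_pmf(z, shape, beta)
closed_form = (1.0 + beta * (1.0 - z)) ** (-shape)
mean = sum(n * neg_binomial_pmf(n, shape, beta) for n in range(400))
var = sum((n - mean) ** 2 * neg_binomial_pmf(n, shape, beta) for n in range(400))
print(series, closed_form, mean, var)
```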

Generating functionals can be linked to the various types of moment/cumulant measures, for example by suitable differentiation or expansion, in ways which generalize the connections between moments/cumulants and the classical generating functions [cf. Westcott (1972), Stoyan et al. (1995, Sections 4.4 and 7.2) and Daley and Vere-Jones (1988, Section 7.4)]. For example, under conditions which ensure the convergence of the right-hand side, the pgfl can be expressed in terms of the factorial moment measures by

$$G[1 - h] = 1 + \sum_{k=1}^{\infty} \frac{(-1)^k}{k!} \int_S \cdots \int_S h(u_1) \cdots h(u_k)\,M_{[k]}(du_1 \cdots du_k), \tag{15}$$

and a similar expansion links $\ln G[1 - h]$ with the factorial cumulant measures. It is instructive to consider such expansions for a Poisson process. Similarly, for the Lfl there is an expansion in terms of the moment measures:

$$L[h] = 1 + \sum_{k=1}^{\infty} \frac{(-1)^k}{k!} \int_S \cdots \int_S h(u_1) \cdots h(u_k)\,M_k(du_1 \cdots du_k). \tag{16}$$

A further type of functional, the characteristic functional [cf. Daley and Vere-Jones (1988, Section 6.4)], can be used for random measures, including point processes, in a manner similar to the Lfl. As the name suggests, the characteristic functional is for random measures an analogue of the ordinary characteristic function.

3. Operations on point processes and associated limit results

3.1. Operations

New point processes can be generated in several ways from a process or processes already defined. Generating functionals provide a useful unifying device and operational tool.

Superposition. For two not necessarily independent point processes $N_1$ and $N_2$, a new process $N$, their superposition, can be defined by $N = N_1 + N_2$. This is to be interpreted as $N(B) = N_1(B) + N_2(B)$ for Borel subsets $B$ of the state space, or in intuitive terms as the pooling of the sets of points for the two processes (though


the problem with the latter view is that multiple points may be introduced by the pooling). For processes which are independent, the distribution of their superposition can be expressed as the convolution of the distributions of the summand processes. In particular, for example from (9), the superposition of two independent Poisson processes with respective mean measures $\mu_1(\cdot)$ and $\mu_2(\cdot)$ is again a Poisson process, with mean measure $\mu_1(\cdot) + \mu_2(\cdot)$. The superposition of two independent homogeneous Poisson processes with respective intensities $\lambda_1$ and $\lambda_2$ is another homogeneous Poisson process, with intensity $\lambda = \lambda_1 + \lambda_2$. The superposition $N$ of two independent point processes $N_1$ and $N_2$ having respective pgfls $G_1$ and $G_2$ has probability generating functional $G[h] = G_1[h]G_2[h]$. This general result can be employed to verify the above assertions about the superposition of two independent Poisson processes. In a similar way, the superposition of finitely many point processes can be considered and, under appropriate conditions, the superposition of a countable number of point processes; the latter extension is needed, for example, in order to deal with cluster processes as discussed below.
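At the level of a single count $N(B)$, the convolution statement for independent Poisson processes can be checked exactly. The sketch below uses illustrative intensities 1.2 and 2.3 and $|B| = 2$ (values chosen here): it convolves the two Poisson count distributions and compares the result with the Poisson distribution whose mean is the sum of the two means.

```python
import math

def poisson_pmf(n, mean):
    """Poisson probability Pr(N = n), computed in log space."""
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))

def convolve(p, q):
    """Distribution of the sum of two independent counts with pmfs p and q."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

lam1, lam2, size, n_max = 1.2, 2.3, 2.0, 60   # illustrative values
p1 = [poisson_pmf(n, lam1 * size) for n in range(n_max)]
p2 = [poisson_pmf(n, lam2 * size) for n in range(n_max)]
pooled = convolve(p1, p2)[:n_max]             # exact for counts below n_max
target = [poisson_pmf(n, (lam1 + lam2) * size) for n in range(n_max)]
max_gap = max(abs(a - b) for a, b in zip(pooled, target))
print(max_gap)
```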

Random deletion. Given a realization of a point process $N$, suppose that each point in the realization is deleted with probability $1 - p$ and retained with probability $p$, independently of all other points in the realization. This operation is variously referred to as (random) thinning, (random) deletion or sometimes Bernoulli deletion. When $m_1(u)$ is the intensity function for the original process $N$, the intensity function for the process of retained points is clearly $p\,m_1(u)$. (This is, for example, obvious from the product density interpretation.) Thus, if $N$ is stationary with intensity $\mu$ then the process of retained points is stationary, with intensity $p\mu$; further, if $N$ is a homogeneous Poisson process then so is the new process. This result for a Poisson process can be seen, for example, from (9), as will be indicated shortly. A generalization is to allow position dependent thinning. For example, a point at $x$ in a realization of the original point process could be deleted with probability $1 - p(x)$ and retained with probability $p(x)$, independently of all other points in the realization. By first conditioning on the original process, the pgfl $G_O[h]$ of the output (or thinned) process can be expressed in terms of the pgfl $G_I[h]$ of the input (original) process as

$$G_O[h] = \mathbb{E}\Big(\prod_{x \in S} \{p(x)h(x) + [1 - p(x)]\}^{N(dx)}\Big) = G_I[1 + p(\cdot)[h(\cdot) - 1]]. \tag{17}$$

Now observe that such position dependent thinning of a homogeneous Poisson process with intensity $\lambda$ results in an output process with pgfl $G_O[h] = \exp\{\lambda \int_S [h(x) - 1]p(x)\,dx\}$, and so, using (9), in an inhomogeneous Poisson process with intensity function $\lambda p(x)$.
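Position dependent thinning is easy to simulate. The following is a minimal sketch, assuming a homogeneous Poisson process of intensity 40 on $[0, 1]$ and the hypothetical retention probability $p(x) = x$. The count of retained points in $[0.5, 1]$ should then have mean $\lambda \int_{0.5}^{1} x\,dx = 15$ and, being Poisson, a variance close to the same value.

```python
import random

def poisson_points(lam, rng):
    """Points of a homogeneous Poisson process on [0, 1], built from exponential gaps."""
    points, t = [], rng.expovariate(lam)
    while t <= 1.0:
        points.append(t)
        t += rng.expovariate(lam)
    return points

def thin(points, p, rng):
    """Retain each point x independently with probability p(x)."""
    return [x for x in points if rng.random() < p(x)]

def retained_counts(lam, p, lo, hi, reps, seed=0):
    """Counts of retained points in [lo, hi] over independent realizations."""
    rng = random.Random(seed)
    counts = []
    for _ in range(reps):
        kept = thin(poisson_points(lam, rng), p, rng)
        counts.append(sum(lo <= x <= hi for x in kept))
    return counts

counts = retained_counts(40.0, lambda x: x, 0.5, 1.0, reps=4000)
mean = sum(counts) / len(counts)
var = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
print(mean, var)  # both should be near 40 * 0.375 = 15
```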


Random translation. This operation can be defined as follows for any point process $N$. Given a realization of $N$, each point in that realization is shifted, independently of all other points in the realization, the shift having some specified distribution function on $S$, where this distribution is the same for each point in the original realization. (The shifts are thus assumed independent and identically distributed for each point in the realization of $N$.) Such a randomly translated process is stationary whenever the original point process is stationary. Whenever the original point process is Poisson the randomly translated process is Poisson. This result can again be seen by a pgfl argument. Let the translation distribution be denoted by $F$. First condition on the original process, assumed to be Poisson with intensity measure $\mu(\cdot)$, to give

$$G[h] = \mathbb{E}\Big(\prod_{x \in S} \Big\{\int_S h(x + t)\,F(dt)\Big\}^{N(dx)}\Big) = \exp\Big\{\int_S \Big[\int_S h(x + t)\,F(dt) - 1\Big]\mu(dx)\Big\}. \tag{18}$$

The last expression can be rearranged to give

$$G[h] = \exp\Big\{\int_S [h(u) - 1]\,\mu_F(du)\Big\}, \tag{19}$$

where $\mu_F(\cdot)$ is defined by $\mu_F(B) = \int_S \mu(B - t)\,F(dt)$ with $B - t = \{u - t : u \in B\}$. (This is also easily seen using product densities.) Moreover, if $N$ is a homogeneous Poisson process then so is any random translation of it, and the intensities of the two processes are the same. A generalization, discussed for example in Daley and Vere-Jones (1988, Example 8.2(b)), allows the shift of each point $x$ in the original point process to be governed by a Markov kernel $H(\cdot \mid x)$, where $H(B \mid x)$ is the probability, given a point at $x$, of shifting that point into the set $B$. (Notice that now such a shift may be dependent on the position of the original point.) By allowing this kernel to be substochastic, i.e. $H(S \mid x) \le 1$, position dependent thinning can be brought within this framework.
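The invariance of a homogeneous Poisson process under random translation can be illustrated by simulation. The sketch below makes several assumptions of its own: the state space is the real line, the shifts are Gaussian with standard deviation 0.5, the intensity is 5, and the original process is simulated on the padded interval $[-5, 6]$ so that edge effects on the window $[0, 1]$ are negligible. The count of translated points in $[0, 1]$ should then have mean and variance both close to 5.

```python
import random

def translated_counts(lam, sd, reps, pad=5.0, seed=0):
    """Counts in [0, 1] of randomly translated points of a homogeneous Poisson process.

    The original process is generated on [-pad, 1 + pad]; each point receives an
    independent N(0, sd^2) shift (an illustrative choice of the distribution F).
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(reps):
        t = -pad + rng.expovariate(lam)
        count = 0
        while t <= 1.0 + pad:
            shifted = t + rng.gauss(0.0, sd)
            if 0.0 <= shifted <= 1.0:
                count += 1
            t += rng.expovariate(lam)
        counts.append(count)
    return counts

counts = translated_counts(lam=5.0, sd=0.5, reps=4000)
mean = sum(counts) / len(counts)
var = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
print(mean, var)  # both should be near the original intensity, 5
```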

Cluster processes. Now suppose that each point of an 'input' point process is replaced by the points of some subsidiary point process or cluster, and that the superposition of all these clusters is then the 'output' process or cluster process. In the simplest type of cluster process the input is a homogeneous Poisson process of specified intensity and the clusters are independent and identically distributed point processes, each with its origin translated to the associated point, often called the cluster centre, in the input process. When $S = \mathbb{R}$ two general types of cluster structure have been considered: (i) a finite renewal process where the number of points is either fixed or follows a specified distribution, and the interval (lifetime) distribution is also specified; or (ii) the number of points is either fixed or follows a specified distribution, and these points are placed independently and identically according to a specified distribution on S. The resultant cluster processes are respectively termed Bartlett-Lewis and Neyman-Scott (cluster) processes. Neyman-Scott processes can be considered also for Euclidean state spaces $S = \mathbb{R}^d$. In principle, a Bartlett-Lewis type process could also be considered in $S = \mathbb{R}^d$, though this extension seems less common. Random translation, as introduced earlier, is a particular case of each of these types of cluster structure in which each cluster has a single point. A cluster structure in which all the points of a given cluster are placed at the cluster centre is often referred to as a compounding. A compound Poisson process, as introduced in Section 2.5, is a special case. Bernoulli deletion, as introduced earlier, is a particular case of compounding. The type of generating functional approach considered for the operations of random deletion and random translation can be generalized to give an expression for the pgfl of a cluster process. Suppose that the pgfl of the cluster arising from an input point at x is $G[h\,|\,x]$. Then the pgfl of the output or cluster process given input $\{x_i\}$ is $\prod_i G[h\,|\,x_i]$, and therefore the unconditional pgfl of the output process is

$$G_o[h] = \mathbb{E}\Big(\prod_{u:N(\{u\})>0} \{G[h\,|\,u]\}^{N(\{u\})}\Big) = G_i[G[h\,|\,\cdot\,]], \tag{20}$$

where $G_i[\cdot]$ denotes the pgfl of the input process. If the input process is a Poisson process with mean measure $\mu(\cdot)$ then the pgfl of the resultant output process is

$$G_o[h] = \exp\Big\{\int_S (G[h\,|\,x] - 1)\,\mu(dx)\Big\}. \tag{21}$$

Such a process is called a Poisson cluster process. For a Neyman-Scott process, with the number of points in a typical cluster having probability generating function $H_c$ and the individual points in a cluster distributed about the cluster centre according to the distribution F, the pgfl $G[h\,|\,x]$ of the cluster arising from an input point at x is $G[h\,|\,x] = H_c\big(\int_S h(x+y)\,F(dy)\big)$. A question not addressed above is the important one of whether the output process exists, in the sense of there being almost surely a finite number of points in any bounded set. This is directly linked with the question of convergence of the infinite product at (20). For further discussion of these aspects see Daley and Vere-Jones (1988, Section 8.2).
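To make the Neyman-Scott construction concrete, the following is a minimal simulation sketch on an interval of the real line. Several choices here are illustrative assumptions rather than anything prescribed by the text: cluster sizes are taken to be Poisson, displacements from the cluster centre are normal, and cluster centres are simulated on a slightly enlarged interval to limit edge effects. The function names are hypothetical.

```python
import random


def sample_poisson(mean, rng):
    """Poisson(mean) variate: count unit-rate exponential arrivals in [0, mean]."""
    count, arrival = 0, rng.expovariate(1.0)
    while arrival <= mean:
        count += 1
        arrival += rng.expovariate(1.0)
    return count


def neyman_scott_1d(lam, mean_cluster_size, sigma, length, rng):
    """Points in [0, length] of a Neyman-Scott process whose cluster centres form
    a homogeneous Poisson process of intensity lam (assumed: Poisson cluster
    sizes and N(0, sigma^2) displacements about each centre)."""
    pad = 5.0 * sigma  # centres just outside [0, length] can still contribute points
    n_centres = sample_poisson(lam * (length + 2.0 * pad), rng)
    points = []
    for _ in range(n_centres):
        centre = rng.uniform(-pad, length + pad)
        for _ in range(sample_poisson(mean_cluster_size, rng)):
            y = centre + rng.gauss(0.0, sigma)
            if 0.0 <= y <= length:
                points.append(y)
    return sorted(points)


def count_mean_and_variance(runs=400, seed=1):
    """Sample mean and variance of the total count over repeated realizations."""
    rng = random.Random(seed)
    counts = [len(neyman_scott_1d(5.0, 3.0, 0.2, 10.0, rng)) for _ in range(runs)]
    mean = sum(counts) / runs
    var = sum((c - mean) ** 2 for c in counts) / (runs - 1)
    return mean, var
```

With these settings the mean count should be close to 5 × 3 × 10 = 150, and the sample variance should noticeably exceed the mean, reflecting the overdispersion of counts that clustering induces relative to a Poisson process.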

State space transformation. This operation is defined initially as a mapping which transforms points of the state space S into points of a new state space S*, and not directly on a process or processes. Such a mapping then transforms the set of points of any given point process realization in S into a corresponding realization in S*. Under state space transformation a Poisson process on S is transformed into another such process on S* [Kingman (1993, Section 2.3)]. If the initial process were a homogeneous Poisson process then the transformed process would in general be an inhomogeneous Poisson process. The operations of superposition and state space transformation of point processes can be extended to random measures. The remaining operations do not in general have direct analogues.

3.2. Limit results

With each of the operations of superposition, random deletion and random translation can be associated certain limit results concerning point processes. Such results had their beginnings in a result, enunciated by Palm (1943) and with a proof completed by Khinchin (1955, 1969), that the superposition of a number of independent and suitably sparse point processes tends to a homogeneous Poisson process as the number of 'summand' processes tends to infinity. This result is considered important for its capacity to explain the widespread usefulness of Poisson processes in many applications. Generalization of this result leads to a limit theorem [cf. Daley and Vere-Jones (1988, Proposition 9.2.IV)] for superpositions of point processes in a triangular array and thereby to a characterization of infinitely divisible point processes (see Section 4.4). For point processes in Euclidean state spaces, the simplest limit theorem for random deletions can be stated loosely as follows: suppose that points of an initial point process are subject to Bernoulli deletions, with a retention probability of p for any individual point, and that to compensate for this loss of points the scale is contracted so as to balance the deletions and preserve the intensity. It is then possible to prove convergence as $p \to 0$, starting from suitable initial point processes, to a homogeneous Poisson process. There are various limit results for random translations. One of the simplest assumes the points of a suitable initial point process move with independent and identically distributed random velocities, and establishes convergence to a homogeneous Poisson process as time tends to infinity. Substantial generalizations of these basic results can be considered. Generating functional methods can be used in constructing proofs. No details will be given here, since to do so would require more serious discussion of convergence of point processes.
For details of such convergence and various limit results see, for example, Chapter 9 of Daley and Vere-Jones (1988). The fundamental convergence concepts are developed also in Kallenberg (1983, Chapter 4) and in Matthes et al. (1978, Chapter 3).

4. Some other classes of point process

4.1. Mixed Bernoulli processes

Assume initially that the state space S is Euclidean and that W is a not necessarily bounded subset of S. Consider a general Bernoulli process defined on W, as in Section 2.4, with the distribution for any one of its points defined by some probability measure Q on W. A mixed Bernoulli process is obtained by allowing the total number of points in each realization to have some prescribed probability distribution. If the mixture distribution is taken as the Poisson distribution with parameter $\lambda$ then for any bounded Borel set B, $N(B) \sim \mathrm{Poi}(\lambda Q(B))$. Such a process is clearly a Poisson process with mean measure $\lambda Q(\cdot)$. However, since $\lambda Q(W) < \infty$ such a process is constrained to have $N(W)$ almost surely finite, i.e. its realizations have a finite total number of points with probability one. The importance of the above construction is that it provides a genuinely constructive approach to Poisson processes with (totally) finite mean measure, and yields a means of simulating Poisson processes on bounded sets. Thus, to simulate on a bounded set B a Poisson process having mean measure $\mu(\cdot)$: (i) choose a random integer n from a $\mathrm{Poi}(\mu(B))$ distribution, and (ii) choose n points independently according to the probability distribution $\mu_B(\cdot)/\mu(B)$, where $\mu_B(\cdot) = \mu(\cdot \cap B)$ denotes the restriction of $\mu(\cdot)$ to B. In fact, this construction works in an arbitrary state space and, in particular, is not dependent on its dimension. A 'pasting' argument [cf. Williams (1979, pp. 46-47)] can be used to extend to the $\sigma$-finite case: construct independent Poisson processes on $B_1, B_2, \ldots$ where $\{B_1, B_2, \ldots\}$ is a partition of S and $\mu(B_i) < \infty$, $i = 1, 2, \ldots$, and then 'paste' these together. This is essentially the approach taken by, for example, Kingman (1993, Section 2.5) in his treatment of existence of Poisson processes with a specified (nonatomic) mean measure. To obtain a homogeneous Poisson process requires only that (ii) be specialized to choosing n points independently and uniformly on B.
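The two-step simulation recipe can be sketched in a few lines. The code below is an illustration only: the function names are hypothetical, the caller is assumed to supply the total mass of the mean measure on the bounded set together with a sampler for the normalized restriction of the mean measure, and the Poisson variate is generated by counting unit-rate exponential arrivals (any other Poisson generator would serve equally well).

```python
import math
import random


def sample_poisson(mean, rng):
    """Poisson(mean) variate: count unit-rate exponential arrivals in [0, mean]."""
    count, arrival = 0, rng.expovariate(1.0)
    while arrival <= mean:
        count += 1
        arrival += rng.expovariate(1.0)
    return count


def simulate_poisson_process(total_mass, sample_point, rng):
    """Poisson process on a bounded set B with mean measure mu.

    total_mass   -- mu(B), assumed finite
    sample_point -- draws one point from the normalized measure mu_B / mu(B)
    """
    n = sample_poisson(total_mass, rng)            # step (i)
    return [sample_point(rng) for _ in range(n)]   # step (ii)


def uniform_on_rectangle(a, b):
    """Sampler for the uniform distribution on [0, a] x [0, b] (homogeneous case)."""
    return lambda rng: (rng.uniform(0.0, a), rng.uniform(0.0, b))


def linear_density_on_unit_interval(rng):
    """Sampler for the density 2x on [0, 1], by inversion of its distribution function."""
    return math.sqrt(rng.random())
```

As a usage example, a homogeneous Poisson process of intensity 25 on the rectangle [0, 2] × [0, 1] corresponds to `simulate_poisson_process(50.0, uniform_on_rectangle(2.0, 1.0), rng)`, while an inhomogeneous process on [0, 1] with intensity function 40x corresponds to `simulate_poisson_process(20.0, linear_density_on_unit_interval, rng)`.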
If the mixture distribution is taken as a negative binomial distribution then the counts in individual Borel sets have related negative binomial distributions whose probability parameters depend on the Borel sets being considered and a simple expression can be given for the pgfl. Such processes are reasonably termed negative binomial processes. The processes so generated can also be obtained as a mixed Poisson process where the mixing distribution is a gamma distribution; see, for example, Daley and Vere-Jones (1988, Section 7.4). For other types of negative binomial process see Diggle and Milne (1983a).

4.2. Marked point processes

As was indicated earlier, for modelling the location of earthquakes in a region, or trees in a forest, it is often appropriate to introduce a further random quantity, often called a mark, associated with each point in an underlying point process (of locations). The resultant process is known as a marked point process. Such a process can be considered as a point process on a product space S × K consisting of all pairs (x, m) where x is from the state space S of locations and m from a space K consisting of all possible marks. For the two examples indicated above the mark space would be taken as $K = (0, \infty)$, though other choices for K are possible. Any compound point process, in particular a compound Poisson process, can be viewed as a marked point process; in the stationary case the mark distribution is what earlier we called the batch size distribution. More generally, any cluster point process can be viewed as a marked point process with the marks in this case being the subsidiary point processes associated with each cluster centre. The general theory of point processes (where the state space can be any complete separable metric space) does not, strictly speaking, require extension to cover marked point processes. Nevertheless, the product structure of the new state space leads to related special structure in, for example, moment measures and product densities, and it is often helpful to recognize this; see, for example, Stoyan et al. (1995, Chapter 4). In particular, there is a version of Campbell's theorem for marked point processes [see also Stoyan and Stoyan (1994, Section 14.2)]. For modelling situations where there are two types of point, for example two types of tree in a forest or two types of cell in a region, a point process with a two point mark space, e.g. K = {1, 2}, can be used. A particular class of two-type point process is based on independent marking, where the marks are assigned independently to the points of some underlying point process of locations. This is formally the same as considering jointly the point process of 'retained' points and the point process of 'deleted' points for the case of Bernoulli deletions. It is then easy to write down an expression for the joint pgfl of the two processes (probability generating functional of the marked point process) in terms of the pgfl of the original process. In general, the joint pgfl G of two point processes $N_1$ and $N_2$ is given by

$$G[h_1, h_2] = \mathbb{E}\Big(\prod_{u \in S} h_1(u)^{N_1(du)} \prod_{v \in S} h_2(v)^{N_2(dv)}\Big), \tag{22}$$

for suitable functions $h_1$ and $h_2$ as in (8). By first conditioning on the input process, the pgfl $G_o[h_1, h_2]$ of an independently marked point process can be expressed in terms of the pgfl $G_i[h]$ of the input (original) process as

$$G_o[h_1, h_2] = \mathbb{E}\Big(\prod_{x \in S}\{p(x)h_1(x) + [1 - p(x)]h_2(x)\}^{N(dx)}\Big) = G_i\big[p(\cdot)h_1(\cdot) + [1 - p(\cdot)]h_2(\cdot)\big]. \tag{23}$$

When the input process is a homogeneous Poisson process with intensity $\lambda$ the pgfl of the independently marked point process is

$$G_o[h_1, h_2] = \exp\Big\{\lambda \int_S \big[p(x)h_1(x) + [1 - p(x)]h_2(x) - 1\big]\,dx\Big\}, \tag{24}$$

and this can be simplified to

$$G_o[h_1, h_2] = \exp\Big\{\lambda \int_S [h_1(x) - 1]\,p(x)\,dx\Big\} \times \exp\Big\{\lambda \int_S [h_2(x) - 1][1 - p(x)]\,dx\Big\}. \tag{25}$$


Thus, each of the processes $N_1$ and $N_2$ is an inhomogeneous Poisson process and these processes are independent, this independence being an unexpected consequence. The independent marking, or splitting, property of a Poisson process is termed the colouring property by Kingman (1993, Section 5.1); later in the same chapter he explores significant generalizations of the property. If the mark space is a finite set, taken without loss of generality to be K = {1, 2, ..., s}, then the marked point process is often called a multitype or multivariate point process; see, for example, Cox and Isham (1980, Chapter 5) or Diggle (1983, Chapter 6). The next section considers two-type Poisson processes; Diggle and Milne (1983b) have explored some bivariate Cox processes as possible models for spatial patterns exhibiting dependence between the points of two types.
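The independence of the two marked processes can be illustrated by simulation. The sketch below is an assumption-laden example rather than part of the original development: it takes the state space to be the unit interval, marks a point at x as type 1 with probability p(x) and as type 2 otherwise, and records the two total counts over many realizations. The function names are hypothetical. If the colouring property holds, the sample correlation between the two counts should be close to zero.

```python
import random


def sample_poisson(mean, rng):
    """Poisson(mean) variate: count unit-rate exponential arrivals in [0, mean]."""
    count, arrival = 0, rng.expovariate(1.0)
    while arrival <= mean:
        count += 1
        arrival += rng.expovariate(1.0)
    return count


def coloured_counts(lam, p, runs, seed=0):
    """Counts (n1, n2) of type 1 and type 2 points for independently marked
    homogeneous Poisson processes of intensity lam on [0, 1]."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(runs):
        n1 = n2 = 0
        for _ in range(sample_poisson(lam, rng)):
            x = rng.random()              # location of the point
            if rng.random() < p(x):       # independent mark, given the location
                n1 += 1
            else:
                n2 += 1
        pairs.append((n1, n2))
    return pairs


def correlation(pairs):
    """Sample correlation coefficient of the two coordinates."""
    n = len(pairs)
    m1 = sum(a for a, _ in pairs) / n
    m2 = sum(b for _, b in pairs) / n
    cov = sum((a - m1) * (b - m2) for a, b in pairs) / n
    v1 = sum((a - m1) ** 2 for a, _ in pairs) / n
    v2 = sum((b - m2) ** 2 for _, b in pairs) / n
    return cov / (v1 * v2) ** 0.5
```

By contrast, if the total number of points were held fixed rather than Poisson distributed, the two counts would sum to a constant and so be perfectly negatively correlated; the near-zero correlation is specific to the Poisson case.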

4.3. Two-type Poisson processes

Although the independent marking of a Poisson process, as considered above, leads to independent Poisson processes, it is not hard to conceive of situations giving rise to dependent Poisson processes. One of the simplest arises from applying the operation of random translation, according to a Markov kernel $H(\cdot\,|\,u)$ (as in Section 3.1), to an underlying Poisson process which is assumed to have mean measure $\mu(\cdot)$. Consider the input (underlying) Poisson process as one process and the output (randomly translated) process as the other; for example, all input points could be labelled with mark 1 and all output points with mark 2. Clearly, given an input point at u, the joint pgfl of the input and output arising from this point is

$$G[h_1, h_2\,|\,u] = h_1(u) \int_S h_2(v)\,H(dv\,|\,u), \tag{26}$$

and the joint pgfl of the input and output given input $\{u_i\}$ is

$$\prod_i G[h_1, h_2\,|\,u_i] = \prod_{u:N(\{u\})>0} \Big(h_1(u) \int_S h_2(v)\,H(dv\,|\,u)\Big). \tag{27}$$

Therefore, the unconditional joint pgfl of the input and output is

$$G_{io}[h_1, h_2] = \mathbb{E}_i\Big(\prod_{u:N(\{u\})>0} \Big\{h_1(u) \int_S h_2(v)\,H(dv\,|\,u)\Big\}\Big) = G_i\Big[h_1(\cdot) \int_S h_2(v)\,H(dv\,|\,\cdot)\Big]. \tag{28}$$

If now it is supposed that the input process is Poisson with mean measure $\mu(\cdot)$ then

$$G_{io}[h_1, h_2] = \exp\Big\{\int_S \Big[h_1(u) \int_S h_2(v)\,H(dv\,|\,u) - 1\Big]\mu(du)\Big\}, \tag{29}$$

and this can be written as

$$G_{io}[h_1, h_2] = \exp\Big\{\int_S \int_S [h_1(u)h_2(v) - 1]\,H(dv\,|\,u)\,\mu(du)\Big\}. \tag{30}$$

Observe that setting $h_2 \equiv 1$ recovers the pgfl of the input process, setting $h_1 \equiv 1$ yields the pgfl of the output process (as derived in Section 3.1), and setting $h_1 = h_2 = h$ gives the pgfl of the superposition, $N_1 + N_2$, of the input and output processes; these observations apply, in fact, to any joint pgfl. Since both the input and output processes are Poisson with respective mean measures $\mu(\cdot)$ and $\mu_H(B) = \int_S H(B\,|\,u)\,\mu(du)$, the input and output processes corresponding to (30) provide one example of a two-type Poisson process. As can be seen from the following section, such a point process is infinitely divisible. The most general form of infinitely divisible two-type Poisson process [cf. Milne (1974)] has pgfl

$$G[h_1, h_2] = \exp\Big\{\int_S [h_1(u) - 1]\,\mu_1(du) + \int_S [h_2(v) - 1]\,\mu_2(dv) + \int_S \int_S [h_1(u)h_2(v) - 1]\,\nu(du\,dv)\Big\}. \tag{31}$$

This can be interpreted as being the superposition of three independent processes: a Poisson process with mean measure $\mu_1(\cdot)$ contributing points with mark 1, a Poisson process with mean measure $\mu_2(\cdot)$ contributing points with mark 2, and a further process which contributes pairs of points, one of each type, where this latter process is of the same type as (30). Other two-type Poisson processes are possible [cf. Griffiths and Milne (1978) and Brown et al. (1981)], though such processes cannot be infinitely divisible. Setting $h_1 = h_2 = h$ in (31) gives the pgfl of the superposition, $N_1 + N_2$, of the two processes: this has pgfl

$$G[h] = \exp\Big\{\int_S [h(u) - 1]\,\mu(du) + \int_S \int_S [h(u)h(v) - 1]\,\nu(du\,dv)\Big\}, \tag{32}$$

where $\mu(\cdot) = \mu_1(\cdot) + \mu_2(\cdot)$. The corresponding point process is the Gauss-Poisson process studied in particular by Milne and Westcott (1972).

4.4. Infinitely divisible point processes

One of the simplest definitions of infinite divisibility is in terms of the pgfl: a point process is said to be infinitely divisible if for each positive integer n its pgfl G[h] can be expressed as $G[h] = (G_n[h])^n$ where $G_n[h]$ is another pgfl. This is equivalent to being able to express the process, for each positive integer n, as an n-fold superposition of independent and identically distributed processes, each having pgfl $G_n[h]$. Observe that any Poisson cluster process is infinitely divisible. This is really a consequence of the infinite divisibility of any Poisson process: a Poisson cluster process whose pgfl $G_o[h]$ is given by (21) can be seen to be infinitely divisible by the choice

$$G_n[h] = \exp\Big\{\int_S (G[h\,|\,x] - 1)\,n^{-1}\mu(dx)\Big\}, \tag{33}$$

which is clearly the pgfl of a Poisson cluster process whose input Poisson process has mean measure $n^{-1}\mu(\cdot)$. Poisson cluster processes constitute a large and important subclass of the class of infinitely divisible point processes. For a general infinitely divisible point process there is a representation theorem [see, for example, Theorem 8.4.V of Daley and Vere-Jones (1988)] giving a canonical form for its pgfl. This generalizes to point processes the classical result giving a compound Poisson representation for the probability generating function of an infinitely divisible nonnegative-valued random variable [cf. Section XII.2 of Feller (1968)]; the corresponding extension to random vectors was obtained by Dwass and Teicher (1957). In a sense, the representation for an infinitely divisible point process is a weaving together of the compound Poisson representation (as in Dwass and Teicher (1957)) for the (joint) probability generating function of each of the (necessarily infinitely divisible) finite-dimensional distributions of the process.
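The claim that the choice at (33) establishes infinite divisibility of a Poisson cluster process can be verified in one line, since raising an exponential to the power n simply multiplies its exponent by n. As a short check, using the same symbols as in (21) and (33):

```latex
\begin{align*}
\big(G_n[h]\big)^n
  &= \exp\Big\{ n \int_S \big(G[h\,|\,x] - 1\big)\, n^{-1}\mu(dx) \Big\} \\
  &= \exp\Big\{ \int_S \big(G[h\,|\,x] - 1\big)\, \mu(dx) \Big\}
   = G_o[h].
\end{align*}
```

The only property of the input process used here is that a Poisson mean measure can be scaled by $n^{-1}$ and still be a valid mean measure.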

4.5. Some other processes related to point processes

For providing language and a unifying framework within which other processes can be considered, the theory of point processes, and especially that of marked point processes, is often helpful. Semi-Markov processes with finitely many states can be viewed as multitype point processes [Cox and Isham (1980, Section 3.2)], although most properties of such processes can be conveniently derived without using this connection. Alternating renewal processes are a special case of semi-Markov processes in which there are two types of mark and two types of lifetime, each of which alternate over time. Shot noise processes [cf. Cox and Isham (1980, Section 5.6) or Snyder and Miller (1991, Chapter 4)] can also be conveniently viewed as marked point processes. In this case the mark attached to each point of an underlying point process is a possibly random multiple (usually independent and identically distributed for each point) of a fixed function; for example, the fixed function may be a negative exponential function with fixed decay parameter, representing a 'blip' of electric current associated with a typical point. The superposition (sum) of all the 'blips' of current, which is clearly a stochastic process and not a point process, is what is called a shot noise process. There is a connection between the joint characteristic function (Laplace transform) of the shot noise process at finitely many time points and the characteristic functional (Laplace functional) of the underlying point process [cf. Snyder and Miller (1991, Section 5.2.1)]. Shot noise processes and this connection are explored by Kingman (1993, Chapter 3) in the simplest setting, that is when the underlying point process is a Poisson process.


Marked point processes provide a framework for treating many other probability models of interest, especially in stochastic geometry and stereology. For example, consider a stochastic process whose realizations are a (finite or) countable number of line segments in the plane, with each segment specified by the random location (of its midpoint), together with its orientation and length. Such a process can be viewed as a marked point process in which the points of the underlying point process give the locations of the midpoints of the line segments, while the mark attached to each such point is a vector recording the orientation and length of the line segment to be associated with that (mid-)point. [Stoyan and Stoyan (1994, p. 265) give an example involving positions and orientations of flies on a leaf.] The simplest such model for a process of line segments assumes that the underlying point process is a homogeneous Poisson process in the plane. A related type of model can be built from an underlying homogeneous Poisson process in the plane by supposing that each point is independently marked with a positive number drawn from some specified distribution, the same for each point. Suppose now that each point of the underlying process is replaced by a disc centred at that point and with radius given by the associated mark. The union of all such discs then constitutes a realization of a stochastic process which is not itself a point process. Such a process is an example of a Poisson grain model or Boolean model and a particular example of a random set process. Boolean models are studied in Molchanov (1997), Stoyan et al. (1995, Chapter 3) and Stoyan and Stoyan (1994, Appendix F); for more general discussion of random sets see Molchanov (1999) or Stoyan et al. (1995, Chapter 6). One application of Boolean models is to modelling the distribution of cells of differing size over a region; a variety of other applications are given in Stoyan et al. (1995, p. 62).
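The disc construction of a Boolean model is easy to simulate. The sketch below estimates the probability that a fixed location (the origin) is covered by at least one disc, and compares it with the standard coverage formula for a stationary Boolean model, 1 − exp(−λ × mean disc area), which is quoted here as a known result rather than derived in the text. The radius distribution (uniform on an interval) and all function names are illustrative assumptions.

```python
import math
import random


def sample_poisson(mean, rng):
    """Poisson(mean) variate: count unit-rate exponential arrivals in [0, mean]."""
    count, arrival = 0, rng.expovariate(1.0)
    while arrival <= mean:
        count += 1
        arrival += rng.expovariate(1.0)
    return count


def origin_covered(lam, r_max, rng):
    """One realization: is the origin covered by some disc of the Boolean model?

    Disc centres form a homogeneous Poisson process of intensity lam; only
    centres in the square [-r_max, r_max]^2 can reach the origin, because the
    radii are (by assumption) uniform on [0, r_max]."""
    side = 2.0 * r_max
    for _ in range(sample_poisson(lam * side * side, rng)):
        cx = rng.uniform(-r_max, r_max)
        cy = rng.uniform(-r_max, r_max)
        radius = rng.uniform(0.0, r_max)
        if math.hypot(cx, cy) <= radius:
            return True
    return False


def coverage_estimate(lam, r_max, runs=4000, seed=0):
    """Monte Carlo estimate of the probability that the origin is covered."""
    rng = random.Random(seed)
    return sum(origin_covered(lam, r_max, rng) for _ in range(runs)) / runs


def coverage_formula(lam, r_max):
    """1 - exp(-lam * E[pi R^2]) with R uniform on [0, r_max]."""
    mean_area = math.pi * r_max ** 2 / 3.0
    return 1.0 - math.exp(-lam * mean_area)
```

Comparing `coverage_estimate(2.0, 0.5)` with `coverage_formula(2.0, 0.5)` gives a quick consistency check of the simulation.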
It is sometimes of interest to consider a process of lines (infinite in length) rather than a process of line segments; see Kingman (1993, Chapter 7), Daley and Vere-Jones (1988, Section 10.6), or Stoyan et al. (1995, Chapter 8). Stoyan et al. (1995, Chapter 9) consider other generalizations, for example to processes of fibres.

4.6. Gibbs processes, Markov point processes and related processes

In earlier sections, it has been seen that Poisson processes are a fundamental class of point processes, both as processes in their own right and as a basic building block for constructing other classes of processes. This section briefly surveys a way of constructing new point processes that once again builds on a Poisson process, but provides an approach that is very different from any of the other approaches discussed in earlier sections. Broadly, the aim here is to define classes of point processes by means of their densities (Radon-Nikodym derivatives) with respect to a Poisson process. However, in a Euclidean state space this approach is inappropriate for even homogeneous Poisson processes, since such processes are mutually singular if they have different intensities; see Stoyan et al. (1995, Example 5.6). This problem can be circumvented if attention is restricted to a bounded window W in the state space. In this setting a Gibbs (point) process is defined by its density, or likelihood ratio, relative to a homogeneous Poisson process of unit intensity; see Stoyan et al. (1995, Section 5.5). Such processes have their origins, and are widely used, in statistical mechanics. (The choice of intensity of the 'base' Poisson process is, in a sense, immaterial since on a bounded window W homogeneous Poisson processes of different intensity form a class of mutually absolutely continuous processes.) Gibbs processes form a large class of point processes, and it is usually necessary to make further assumptions about the structure of the densities in order to obtain more specific and useful classes of processes. One of the classes is that of Markov point processes [Ripley and Kelly (1977)]. These allow the introduction of a form of spatial dependence that is local or Markov [Diggle (1983, Section 4.9); Cressie (1991, Section 8.5.5); Stoyan et al. (1995, Section 5.5)] once a neighbourhood structure has been specified for the points in any realization through some reflexive and symmetric 'neighbourhood' relation defined for pairs of points. An example of such a relation is that for a specified finite r, two points $x_i$ and $x_j$ are r-neighbours if and only if the Euclidean distance $d(x_i, x_j)$ between them is at most r. For these processes a Hammersley-Clifford type theorem [Ripley and Kelly (1977)] shows that the densities (with respect to the unit Poisson process) are precisely of the form

$$f(x) = \prod_{x' \subseteq x} \theta(x'), \tag{34}$$

where x and x' here denote particular realizations (i.e. sets of finitely many points from W), $\theta(\cdot)$ is nonnegative, and $\theta(x') = 1$ unless all pairs of points in x' are neighbours. Such Markov point processes are surveyed in Baddeley and Møller (1989). These authors introduce also multitype generalizations and a more general class of Markov point processes based on a different type of neighbourhood relation (where fixed-range interactions are replaced by interactions between points that are neighbours according to a relation that may depend on the realization); see Section 4 of Baddeley and Møller (1989). A more manageable subclass of Markov point processes is the class of pairwise interaction processes, whose densities have the form

$$f(x) = \alpha\,\beta^{n(x)} \prod_{\{x_i, x_j\} \subseteq x} \varphi(x_i, x_j), \tag{35}$$

where $\alpha$ is a normalizing constant, $\beta > 0$ is an 'intensity' parameter, n(x) denotes the number of points in the realization x, and $\varphi(\cdot, \cdot)$ is a symmetric nonnegative function, often termed an interaction function, satisfying $\varphi(x_i, x_i) = 1$ and such that normalization is possible. Normalization is possible if $\mathrm{range}(\varphi) \subseteq [0, 1]$, corresponding to purely inhibitory processes, or if $\varphi(x_i, x_j) = h(d(x_i, x_j))$ where h is a function which is nonnegative, bounded and zero for values of its argument less than or equal to some specified value $\varepsilon$. Processes of the latter type are called hard core processes.


A particular parametric subclass of the class of pairwise interaction processes is the family of Strauss processes [Strauss (1975)], which are those whose densities with respect to the unit Poisson process are of the form

$$f(x) = \alpha\,\beta^{n(x)}\,\gamma^{s(x)}, \tag{36}$$

where $\alpha$ and $\beta$ are as above, $\gamma$ is an 'interaction' parameter satisfying $0 \le \gamma \le 1$, and s(x) counts the number of pairs of points in the realization x which are r-neighbours (in the sense defined above). The density is not integrable, and normalization is not possible, if $\gamma > 1$. The case $\gamma = 1$ gives a Poisson process with intensity $\beta$, while the case $\gamma = 0$ gives a hard core process in which Poisson realizations are conditioned to have no pairs of points that are r-neighbours. Cases with $0 < \gamma < 1$ yield processes exhibiting less strict inhibition, sometimes termed softcore processes. Strauss processes are arguably the simplest non-trivial Markov point processes. The densities of the Strauss family can be represented as a canonical exponential family [see, for example, Barndorff-Nielsen (1978, Section 8.1) and references cited there]: the dominating measure for the representation is the Poisson process of unit intensity, the canonical parameter vector is $[\ln\beta, \ln\gamma]$ and the canonical statistic is $[n(x), s(x)]$. A similar representation is possible for the densities for a number of other families of Markov point processes. Because maximum likelihood estimation is in principle straightforward for canonical exponential families, inference for such Markov point processes should also be so. The difficulty is that the likelihood function for such a Markov point process family would usually involve the (parameter dependent) normalizing constant, for which an explicit closed form is generally impossible, in contrast to what happens for the common exponential families. When the normalizing constant cannot be obtained, the conventional approach to maximum likelihood estimation, based on solving the likelihood equation(s), is not possible.
However, recent advances in computing power and statistical technology have made it feasible to avoid calculation of the normalizing constant and conventional maximum likelihood estimation, using instead the approach known as Markov chain Monte Carlo, which involves large scale simulation. For a good coverage of these ideas see Geyer (1999). Another family of Markov point processes which has a representation as a canonical exponential family is that consisting of so-called triplets processes; see Geyer (1999). Whereas Strauss processes can be obtained by a standard procedure for generating a canonical exponential family with specified canonical statistics, triplets processes can be generated in a similar way by using a further statistic w(x), counting the number of triples of points that are mutual r-neighbours in the realization x. The resultant exponential family has a three-dimensional canonical statistic $[n(x), s(x), w(x)]$. Such processes can, like the Strauss process, be fitted by Markov chain Monte Carlo methods [cf. Geyer (1999)]. Similar comments apply to the area interaction processes introduced in Baddeley and van Lieshout (1995) and the saturation processes described in Geyer (1999, Section 3.9.2), although it should be noted that processes of the latter type are not Markov point processes.
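The canonical statistics of the Strauss and triplets families are simple to compute for a given realization, even though the normalizing constant is not. The following sketch, with hypothetical function names and with realizations represented as lists of coordinate tuples, computes the pair count s(x), the triple count w(x), and the Strauss density up to the unknown normalizing constant (that is, the density divided by that constant).

```python
import math
from itertools import combinations


def s_pairs(x, r):
    """Number of pairs of points of x that are r-neighbours (distance at most r)."""
    return sum(1 for a, b in combinations(x, 2) if math.dist(a, b) <= r)


def w_triples(x, r):
    """Number of triples of points of x that are mutual r-neighbours."""
    return sum(
        1
        for a, b, c in combinations(x, 3)
        if math.dist(a, b) <= r and math.dist(a, c) <= r and math.dist(b, c) <= r
    )


def strauss_unnormalised_density(x, beta, gamma, r):
    """beta^n(x) * gamma^s(x): the Strauss density without its normalizing constant.

    For gamma = 0 this is zero as soon as any pair of r-neighbours is present
    (Python evaluates 0.0 ** 0 as 1.0, so configurations without such pairs
    keep the value beta^n(x))."""
    return beta ** len(x) * gamma ** s_pairs(x, r)
```

For instance, the configuration `[(0.0, 0.0), (0.05, 0.0), (0.5, 0.5)]` with r = 0.1 has n(x) = 3 and s(x) = 1, so with intensity parameter 2 and interaction parameter 0.5 the unnormalised density is 2³ × 0.5 = 4.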


Important in the simulation of the processes discussed in this section, and in Markov chain Monte Carlo methods of inference for such processes, is a class of spatio-temporal stochastic processes known as spatial birth and death processes [Stoyan et al. (1995, Section 5.5.5), Baddeley and Møller (1989)]. Such a process is a continuous-time pure jump Markov process whose state space is the set of all possible realizations of point processes on W (that is, all finite subsets of W which is assumed, as above, to be a bounded subset of a Euclidean space), and whose only possible transitions are either the 'birth' of a new point, or the 'death' of a point in the preceding point process realization. (Note that a spatial birth and death process is Markov as regards time.) The essence of the connection with simulation is that, under certain conditions, the limiting distribution of a spatial birth and death process is a Markov point process as introduced above. A consequence of this is that realizations of such a Markov point process can be generated as observations on the relevant spatial birth and death process after it has been running for a long time [Geyer (1999), Møller (1999)].
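The idea of generating a realization by running a birth and death mechanism for a long time can be sketched as follows. Note that this is not the continuous-time spatial birth and death process described above: for simplicity it substitutes a discrete-time birth-death Metropolis-Hastings chain, a related Markov chain Monte Carlo scheme with the same kind of moves, targeting a Strauss process on the unit square. The window, the equal proposal probabilities for births and deaths, and the function names are all assumptions made for illustration.

```python
import math
import random


def conditional_intensity(u, x, beta, gamma, r):
    """beta * gamma^t, where t is the number of r-neighbours of u among the points
    of x; this equals the density ratio f(x with u added) / f(x)."""
    t = sum(1 for v in x if math.dist(u, v) <= r)
    return beta * gamma ** t


def strauss_birth_death_mh(beta, gamma, r, steps, seed=0):
    """Birth-death Metropolis-Hastings chain for a Strauss process on the unit
    square (window area 1, so the area does not appear in the ratios).

    Returns the final configuration and the trace of point counts."""
    rng = random.Random(seed)
    x, trace = [], []                      # start from the empty configuration
    for _ in range(steps):
        n = len(x)
        if rng.random() < 0.5:             # propose the birth of a uniform point
            u = (rng.random(), rng.random())
            ratio = conditional_intensity(u, x, beta, gamma, r) / (n + 1)
            if rng.random() < ratio:
                x.append(u)
        elif n > 0:                        # propose the death of a random point
            i = rng.randrange(n)
            rest = x[:i] + x[i + 1:]
            lam = conditional_intensity(x[i], rest, beta, gamma, r)
            if lam == 0.0 or rng.random() < n / lam:
                x = rest
        trace.append(len(x))
    return x, trace
```

Two sanity checks are available: with the interaction parameter equal to 1 the target is a Poisson process, so the long-run average number of points should be close to the intensity parameter; with the interaction parameter equal to 0 and an empty starting configuration, no two points of any visited configuration are r-neighbours.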

5. Statistical inference

5.1. Introductory remarks

The area of statistical inference for point processes is a difficult one which has been the subject of much recent growth. The remarks that follow will serve as some introduction. Any statistical analysis of point process data should be backed by suitable graphical displays. A plot of the point process realization should be included where this is feasible. In itself, this plot may suggest some form of interaction between points. For example, there may be a tendency to clustering (in a biological application, perhaps as a result of local reproduction), or to inhibition (possibly arising from competition for space or nutrients). When there is inhibition with a minimum permissible distance between points and a sufficiently high intensity, a tendency to regularity may be observed. Other plots of data summaries are possible. Some of these may play a purely descriptive or summary role; others may be relevant in fitting particular point process models, or assessing the goodness-of-fit of such models. Any approach should be driven primarily by the needs of the person who collected the data. Brillinger (1994) gives an interesting review of techniques for statistical analysis of time series and point processes; connections are drawn between the two areas and the techniques illustrated by real data.

5.2. Estimation of the intensity function

Suppose that the data is a partial realization of a stationary point process and that, for example, only those points within a bounded window W can be observed. The prime interest may then lie in estimation of the intensity. In a forestry
application, this may give valuable information, for example on the overall quantity of wood in the forest. However, in such an application it may be necessary to use a more complex model: one that introduces supplementary information on the sizes of individual trees, represented as a mark (Section 4.2) attached to each point in the point process, may lead to better information on the overall quantity of wood. If stationarity does not seem a reasonable assumption, it may be of interest to estimate the intensity function of the process. Here non-parametric kernel density estimation techniques could be used. Alternatively, based on an inhomogeneous Poisson process model a specific parametric form could be fitted for the intensity function. These approaches are discussed, for example, in Cressie (1991, Sections 8.2.4 and 8.5.1) and in Stoyan and Stoyan (1994, Section 13.3). Graphical displays in the form of a plot or contour plot of the estimated intensity function could be provided, respectively, for real line or planar data.
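
The two estimation settings mentioned in this subsection can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the stationary estimate is simply the number of observed points divided by the size of the window, and the kernel estimate uses a Gaussian kernel on real-line data with a user-chosen bandwidth and no edge correction.

```python
import math

def intensity_estimate(points, window_size):
    """Intensity of a stationary process estimated as (number of points) / |W|."""
    return len(points) / window_size

def kernel_intensity(times, t, bandwidth):
    """Gaussian kernel estimate of the intensity function at t for real-line data.

    No edge correction is applied, so the estimate tends to be too small
    near either end of the observation interval.
    """
    c = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))
    return sum(c * math.exp(-0.5 * ((t - s) / bandwidth) ** 2) for s in times)
```

Evaluating `kernel_intensity` over a grid of values of t gives the curve that would be plotted for real-line data; the choice of bandwidth matters considerably and is left open here.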

5.3. Nearest neighbour distributions and the K function

Consider now data which is a partial realization of a stationary isotropic (planar) point process. In this setting three other functions are often considered [see Diggle (1983, Chapter 2) or Cressie (1991, Sections 8.2.6 and 8.4)]. One of these functions is the so-called empty space function, defined by $F(r) = \Pr(d(u, \mathbf{x}) \le r)$, $r > 0$. This is the distribution function of the distance $d(u, \mathbf{x})$ from an arbitrary point $u$ in S to the nearest point of the process. Another function is $G(r) = \Pr(d(x, \mathbf{x} \setminus \{x\}) \le r)$, $r > 0$, the distribution function of the distance $d(x, \mathbf{x} \setminus \{x\})$ from an arbitrary point $x$ of $\mathbf{x}$ to the nearest other point of the process $\mathbf{x}$. This is the nearest-neighbour distribution function of the process. Finally, there is the reduced second moment function, or (Ripley) K-function [cf. Ripley (1981)], which can be defined for $r > 0$ by

$$K(r) = \mu^{-1}\, \mathrm{E}(\text{number of further points of } \mathbf{x} \text{ within distance } r \text{ of an arbitrary point of } \mathbf{x}),$$

where $\mu$ is the intensity of the process. To formally define the latter two functions requires consideration of the Palm distribution of the process; the expectation defining the K-function is in fact an expectation with respect to the (reduced) Palm distribution of the process [cf. Stoyan et al. (1995, Sections 2.4.1 and 2.4.3)]. For a homogeneous planar Poisson process with intensity $\lambda$ it can be shown [cf. Stoyan et al. (1995, Sections 4.4 and 4.5)] that $F(r) = G(r) = 1 - \exp\{-\lambda \pi r^2\}$ and $K(r) = \pi r^2$, $r > 0$. For a clustered point process F(r), for small r, will be less than the corresponding value for a homogeneous Poisson process, and for values of r close to the range of clustering G(r) and K(r) will each be greater than the corresponding Poisson value. For a point process showing inhibition, at values of r larger than the range of inhibition F(r) will exceed the homogeneous Poisson
equivalent, and for values of r close to the range of inhibition G(r) and K(r) will each be less than the corresponding Poisson value.

Since F = G for a homogeneous Poisson process, various proposals for assessing Poissonness of a given point process are based on comparing F with G. Reflecting a basic property of a homogeneous Poisson process some authors would speak here of assessing complete spatial randomness (often abbreviated to CSR); see, for example, Diggle (1983) or Cressie (1991). In this spirit, Diggle (1979) considered the statistic $\sup_r |\hat F(r) - \hat G(r)|$. Moreover, van Lieshout and Baddeley (1996) considered the function $J(r) = [1 - G(r)]/[1 - F(r)]$, defined for r such that $F(r) < 1$, suggesting it as a useful summary measure to indicate the strength and range of interpoint interactions in a point process. For a homogeneous planar Poisson process with intensity $\lambda$ it is clear that $J(r) \equiv 1$. Furthermore, $J(r) < 1$ indicates clustering and $J(r) > 1$ inhibition or regularity, while for many point processes J(r) is constant for r beyond the range of spatial interaction.

The remarks of the preceding paragraphs refer to the 'true' functions being considered, whereas in practice they would usually be estimated from some point process realization. Estimation of the functions F, G and K on the basis of the points in a bounded window raises special problems of edge effects, which have been discussed in detail by Baddeley (1999b). There are two main types of edge effects: sampling bias that is size dependent and related to the well-known problem of length-biased sampling (for example, widely separated nearest neighbour pairs are less likely to be represented in a fixed bounded sampling window), and censoring effects (which arise, for example, because the nearest point to a given point inside the window may be outside the window and therefore unobserved). Ripley (1988) and Baddeley (1999b) have discussed ways of dealing with these effects.
For example, extensions of Campbell's theorem play a key role in assessing bias in the estimation of F, G and K. It is possible to plot separately estimates $\hat F$, $\hat G$ and $\hat K$, together with their respective Poisson equivalents based on the estimated intensity. These plots can be used to make an assessment of fit of a homogeneous Poisson process model. Such assessment may be assisted by use of Monte Carlo tests (cf. Diggle, 1983). Here, for example based on $\hat F$, one would simulate 99 independent realizations from the homogeneous Poisson model with the estimated intensity, and then construct the upper and lower envelopes for F

$$U_F(r) = \max_i \hat F_i(r), \qquad L_F(r) = \min_i \hat F_i(r),$$

where the maximum and minimum are taken over the estimates $\hat F_i$ of F from each of the 99 simulations. The functions $U_F$ and $L_F$ are then plotted with $\hat F$ and its Poisson equivalent. To the extent that $\hat F$ lies between $U_F$ and $L_F$ the fit of the Poisson model is regarded as acceptable. (Notice that, whilst $(L_F(r), U_F(r))$ gives a 98% confidence interval for F(r) for any specified value of r, it cannot be asserted that the same confidence coefficient applies when all values of r in some interval are considered; this is a problem of simultaneous confidence intervals.)
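
The envelope construction just described can be sketched as follows. This is only an illustration under stated assumptions: the window is taken to be a rectangle, the empty space function is estimated naively from a regular grid of reference locations without any edge correction, and all function names are hypothetical. Because the same uncorrected estimator is applied to the data and to every simulation, the comparison remains like with like, but the curves themselves are biased near the boundary.

```python
import math
import random

def poisson_count(mean, rng):
    """Poisson(mean) variate: number of unit-rate exponential gaps fitting in [0, mean]."""
    n, t = 0, rng.expovariate(1.0)
    while t <= mean:
        n += 1
        t += rng.expovariate(1.0)
    return n

def empty_space_estimate(points, ref_points, r_values):
    """Fraction of reference locations whose nearest data point is within distance r."""
    if not points:
        return [0.0 for _ in r_values]
    nearest = [min(math.dist(u, p) for p in points) for u in ref_points]
    return [sum(d <= r for d in nearest) / len(nearest) for r in r_values]

def poisson_envelopes(points, width, height, r_values, n_sim=99, grid=10, seed=None):
    """Return (F_hat, Poisson F, lower envelope, upper envelope) on r_values."""
    rng = random.Random(seed)
    area = width * height
    lam_hat = len(points) / area  # estimated intensity
    ref = [((i + 0.5) * width / grid, (j + 0.5) * height / grid)
           for i in range(grid) for j in range(grid)]
    f_obs = empty_space_estimate(points, ref, r_values)
    f_poi = [1.0 - math.exp(-lam_hat * math.pi * r * r) for r in r_values]
    sims = []
    for _ in range(n_sim):
        # Homogeneous Poisson realization with the estimated intensity.
        n = poisson_count(lam_hat * area, rng)
        sim = [(rng.uniform(0.0, width), rng.uniform(0.0, height)) for _ in range(n)]
        sims.append(empty_space_estimate(sim, ref, r_values))
    upper = [max(s[k] for s in sims) for k in range(len(r_values))]
    lower = [min(s[k] for s in sims) for k in range(len(r_values))]
    return f_obs, f_poi, lower, upper
```

Plotting the four returned curves against `r_values` gives the display described in the text.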

A similar approach based on G, K or J could be used; the different functions each embody somewhat different information from the others, and so the plots should complement one another. It is recommended that attention should not be restricted to just one of the functions.

Variations are possible on the plots discussed above. For example, one could use a probability plot of P-P type where $\hat F(r)$ is plotted against the corresponding Poisson equivalent $\hat F_{\mathrm{Poi}}(r)$ for each r and, on the same plot, add the pairs $(L_F(r), \hat F_{\mathrm{Poi}}(r))$ and $(U_F(r), \hat F_{\mathrm{Poi}}(r))$ to give corresponding envelope functions. In the case of the K-function it has been found useful to plot either $\hat K(r) - \pi r^2$ against r, or $\sqrt{\hat K(r)/\pi} - r$ against r; see, for example, Diggle (1983) or Cressie (1991).

Any deviations from Poissonness that may be shown in the plots considered above provide clues as to what type of non-Poisson model may be appropriate. A more detailed description of any observed clustering or inhibition could then be attempted by the formulation and fitting of a more complex model [cf. Cressie (1991, Section 8.5)]: for example, the choice might initially be some simple Poisson cluster process or a Strauss process. Choosing and fitting a suitable model may not be entirely simple matters and it is likely that a statistician or applied probabilist knowledgeable about point processes would need to be involved, at least at this stage. The parameters of a reasonably fitting model provide a summary of the original data: for a homogeneous Poisson process this summary would involve just the intensity; for a Strauss process, as described in the previous section, the parameter $\beta$ is related to the intensity of the process, while $\gamma$ describes interactions between neighbouring points. Since the case $\gamma = 1$ yields a Poisson process, it is in principle possible to assess Poissonness parametrically within the family of Strauss processes by testing the hypothesis that $\gamma = 1$.
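
The two K-function displays mentioned in this subsection can be computed along the following lines. This is a hedged sketch: the function names are hypothetical, and the estimator shown is a naive one, |W| times the number of ordered pairs of distinct points at distance at most r, divided by n(n - 1), with no edge correction. In practice one of the edge-corrected estimators discussed in the references cited above would normally be preferred.

```python
import math

def k_estimate(points, area, r_values):
    """Naive estimate of K(r) without edge correction.

    Uses |W| / (n (n - 1)) times the number of ordered pairs (i, j),
    i != j, whose interpoint distance is at most r.
    """
    n = len(points)
    if n < 2:
        raise ValueError("at least two points are needed")
    dists = [math.dist(points[i], points[j])
             for i in range(n) for j in range(i + 1, n)]
    # Each unordered pair counts twice among the ordered pairs.
    return [2.0 * area * sum(d <= r for d in dists) / (n * (n - 1))
            for r in r_values]

def centred_k(k_values, r_values):
    """Return the curves K(r) - pi r^2 and sqrt(K(r) / pi) - r."""
    diff = [k - math.pi * r * r for k, r in zip(k_values, r_values)]
    root = [math.sqrt(k / math.pi) - r for k, r in zip(k_values, r_values)]
    return diff, root
```

Both returned curves are identically zero for the theoretical Poisson value K(r) = pi r^2, so departures from zero are what the plots are intended to reveal.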

5.4. Likelihood based inference

One of the problems that has impeded development of inference for point process models is the difficulty, and in most cases impossibility, of writing down an expression for the likelihood function. A notable exception is that the likelihood can be written down explicitly for a realization over a fixed time interval (0, T) of any (inhomogeneous) Poisson process on R; see Snyder and Miller (1991) or Kutoyants (1998). As indicated in Section 4.6, there is much current interest in the use of Markov chain Monte Carlo methods; see, for example, Geyer (1999) and Møller (1999). These are sophisticated simulation intensive methods which enable likelihood based inference to be implemented for parametric point process models even when a likelihood function cannot be written down explicitly. There is also the possibility of using pseudo-likelihood methods, in which the likelihood function is replaced by another closely related function which is then used as if it were the likelihood. Such methods grew from work of Besag (1974, 1978); the monograph by Särkkä (1993) provides a good review of these ideas
and some new applications; see also Jensen and Møller (1991), Särkkä (1995), Goulard et al. (1996) and Baddeley and Turner (2000).

5.5. Inference from multitype point process data

Statistical inference from multitype (multivariate) point process data is less well developed than inference for a single point process. Brillinger (1976), Lotwick and Silverman (1982), Chapters 6 and 7 of Diggle (1983) and Section 8.6 of Cressie (1991) deal with some basic ideas. These include estimation of cross-type versions of the functions G, K and J; see especially van Lieshout and Baddeley (1996) where attention is focused on multitype extensions of the J function. Goulard et al. (1996) have considered maximum pseudo-likelihood estimation for marked Gibbs processes, and so in particular for multitype Gibbs processes.

6. Simulation

It is straightforward to simulate a homogeneous Poisson process using its representation as a mixed Bernoulli process, as described in Section 4.1. Such a simulation can be effected in a window W of odd shape by initially simulating on a larger set, for example a rectangle, of more regular shape and rejecting those points which do not fall within W. Whilst the mixed Bernoulli approach can be used for any state space, on the real line there is also the possibility of simulating a homogeneous Poisson process by generating a sequence of 'intervals' from a suitable exponential distribution. An extension of the latter technique allowing non-exponential distributions facilitates simulation of renewal processes.

For simulating an inhomogeneous Poisson process Lewis and Shedler (1979) gave a simple technique based on thinning a realization of a suitable homogeneous Poisson process. The only requirement, which would surely be met in most practical circumstances, is that the process to be simulated have an intensity function which can be bounded above by some fixed constant over the window W on which the realization will be obtained. Suppose then that it is desired to generate a realization on W of an inhomogeneous Poisson process with intensity function $\lambda(\cdot)$ where, for some constant $\lambda^*$, $\lambda(u) \le \lambda^*$ for all $u \in W$. First generate a realization of a homogeneous Poisson process on W with intensity $\lambda^*$. Then delete points in this realization independently according to the following procedure: for a point at u, delete this point with probability $1 - p(u)$ and retain it with probability $p(u)$, where $p(u) = \lambda(u)/\lambda^*$. That the points of the original (homogeneous Poisson) realization which remain after this deletion process form a realization of an inhomogeneous Poisson process with intensity $\lambda(\cdot)$ is clear from the discussion near (17). The technique is relatively more efficient if the proportion of points retained is high; for a given $\lambda(\cdot)$, maximum possible efficiency is achieved when $\sup \lambda(u) = \lambda^*$, where the supremum is taken over all relevant $u \in W$. Variations on this 'thinning' (deletion) method, and refinements which can be used to improve its efficiency, are discussed by Lewis and Shedler (1979). The thinning
method can be used in particular to simulate a Poisson process with cyclic intensity function.

In principle a Gibbs process having a specified density (with respect to the Poisson process of unit intensity) which is a bounded function can be simulated by a rejection technique as described, for example, in Stoyan et al. (1995, Section 5.5.2); however, this may not provide an efficient approach. For efficient simulation of a Gibbs process, and in particular a Markov point process, the Markov chain Monte Carlo methods mentioned in Section 4.6 are often preferable. A realization of such a process in a bounded window can be obtained by generating an observation on a suitable spatial birth and death process after it has been running for a long time. Stoyan et al. (1995, Section 5.5.5) give an introduction to these ideas; for more detailed discussion see Geyer (1999) and Møller (1999).
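
The Poisson simulation techniques described earlier in this section can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function names are hypothetical, the window W is supplied through an assumed indicator function `inside`, and the thinning example is restricted to the real line. The caller is responsible for supplying a constant `rate_bound` that dominates the intensity function on the window.

```python
import math
import random

def poisson_count(mean, rng):
    """Poisson(mean) variate: number of unit-rate exponential gaps fitting in [0, mean]."""
    n, t = 0, rng.expovariate(1.0)
    while t <= mean:
        n += 1
        t += rng.expovariate(1.0)
    return n

def homogeneous_poisson(rate, t_end, rng):
    """Homogeneous Poisson process on (0, t_end] from exponential intervals."""
    times, t = [], rng.expovariate(rate)
    while t <= t_end:
        times.append(t)
        t += rng.expovariate(rate)
    return times

def homogeneous_poisson_window(rate, width, height, inside, rng):
    """Mixed Bernoulli construction on a bounding rectangle, keeping points in W."""
    n = poisson_count(rate * width * height, rng)
    pts = [(rng.uniform(0.0, width), rng.uniform(0.0, height)) for _ in range(n)]
    return [p for p in pts if inside(p)]

def thinned_poisson(intensity, rate_bound, t_end, rng):
    """Thinning: retain a point at u with probability intensity(u) / rate_bound."""
    kept = []
    for u in homogeneous_poisson(rate_bound, t_end, rng):
        p = intensity(u) / rate_bound
        if not 0.0 <= p <= 1.0:
            raise ValueError("rate_bound must dominate the intensity on the window")
        if rng.random() < p:
            kept.append(u)
    return kept
```

For a cyclic intensity such as 5(1 + cos(2 pi t)), which is bounded above by 10, one could call `thinned_poisson(lambda t: 5.0 * (1.0 + math.cos(2.0 * math.pi * t)), 10.0, 20.0, random.Random(0))`. Since the bound equals the supremum of the intensity, this is the most efficient constant bound for that example; about half of the proposed points are retained on average.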

7. Concluding remarks

7.1. Martingale theory of point processes

A different approach to point processes is needed for dealing with processes, such as arise in the study of queueing or communication systems or in survival analysis, which evolve dynamically over time and in a manner that may depend on the past history of the process. For a point process $N_t$, where t represents time and $N_t = N((0, t])$ in our previous notation, the stochastic intensity function $\lambda(t \mid \mathcal{H}_{t-})$ might be defined heuristically by $\lambda(t \mid \mathcal{H}_{t-})\,dt = \Pr(N(dt) = 1 \mid \mathcal{H}_{t-})$, where $\mathcal{H}_{t-}$ is a history of $N_t$ up to but not including time t. Using the mathematically well-developed theory of martingales, the martingale theory of point processes provides an approach to formalizing the notion of a stochastic intensity and to solving a wide variety of problems by means of the stochastic calculus that results. It is beyond the scope of the present article to enter into details of this extensive and technically rather difficult theory. Key references are Brémaud (1981), Karr (1991), Snyder and Miller (1991) and Andersen et al. (1993), the last of these being a comprehensive work focused on applications to survival analysis. Aalen (1997) is interesting for an overview and some historical comments on the development of the field.

7.2. Omissions and further bibliographic comments

Aside from not discussing any detail of the martingale theory of point processes, this article has at least two other major omissions: for reasons of time and space it has not attempted any serious presentation of either Palm theory or convergence results for point processes and random measures. Both these areas are well covered in Daley and Vere-Jones (1988), Matthes et al. (1978) and Kallenberg (1983). There has also been no discussion of spectral theory for point processes
[cf. Daley and Vere-Jones (1988, Chapter 11)] or of applications of point process theory to the study of statistics of extremes [cf. Resnick (1987)]. Daley and Milne (1973) provided a comprehensive annotated bibliography which may still be useful despite the later explosive growth in the area. The extensive references in, for example, Daley and Vere-Jones (1988) and Karr (1991) offer good updates. Several other books which have appeared in the last twenty years (see Section 1 and the further comments in the present section) have more specialized bibliographies. Those of the various chapters in the collection Barndorff-Nielsen et al. (1999) are worthy of particular mention. The encyclopedia articles Karr (1986, 1988) and Milne (1999) provide a compact summary of many key point process ideas. For those interested primarily in spatial data, especially in biological applications, Diggle (1983) is highly recommended and Matérn (1960, 1986) may prove useful. The examples in parts of Cressie (1991), Ripley (1981, 1988) and Stoyan and Stoyan (1994) are also good, and in all these books there is some discussion of theory. There is a wide-ranging review of spatial processes, including point processes, in the paper by Hjort and Omre (1994). Guttorp (1995, Chapter 5) offers a succinct overview of the theory of point processes and is well motivated by examples. The monograph Kingman (1993) is a masterly survey of the many beautiful properties and applications of Poisson processes and a good introduction to many central aspects of general point process theory. More detailed exposition of various aspects of the theory can be found in Daley and Vere-Jones (1972), Srinivasan (1974), Cox and Isham (1980) and Reiss (1993). Daley and Vere-Jones (1988) give a much more comprehensive, yet readable, presentation of the mathematical theory. 
A systematic and careful development of the mathematical foundations of point process theory can be found in Matthes, Kerstan and Mecke (1978), though this work is usually considered difficult, even by probabilists. Point process theory, with a view to applications in stochastic geometry, is dealt with in Ambartzumian (1990), Stoyan et al. (1995), Stoyan and Stoyan (1994) and many of the chapters in Barndorff-Nielsen et al. (1999). A good introduction to stochastic geometry, including its connections with point process theory, is provided in Baddeley (1999a). For those interested especially in statistical inference from point process data Diggle (1983), Cressie (1991), Ripley (1981, 1988) and Stoyan and Stoyan (1994) contain examples as well as an introduction to relevant theory; in particular Chapters 6 and 7 of Diggle (1983) and Section 8.6 of Cressie (1991) deal with aspects of inference for multitype point process data.

Acknowledgements

I am very grateful to Yih Chong Chin and a referee for reading an earlier version of the paper and offering a number of suggestions which have led to improvements.

References

Aalen, O. O. (1997). Counting processes and dynamic modelling. In Festschrift for Lucien Le Cam. Research Papers in Probability and Statistics (Eds., D. Pollard, E. Torgersen and G. L. Yang), pp. 1-12. Springer-Verlag, New York.
Ambartzumian, R. V. (1990). Factorization Calculus and Geometric Probability. Encyclopedia of Mathematics and its Applications, Vol. 33. Cambridge University Press, Cambridge UK.
Andersen, P. K., Ø. Borgan, R. D. Gill and N. Keiding (1993). Statistical Models Based on Counting Processes. Springer-Verlag, New York.
Baccelli, F. and P. Brémaud (1987). Palm Probabilities and Stationary Queues. Lecture Notes in Statistics, Vol. 41. Springer-Verlag, Berlin.
Baddeley, A. J. (1999a). A crash course in stochastic geometry. In Barndorff-Nielsen, Kendall and van Lieshout (1999), pp. 1-35.
Baddeley, A. J. (1999b). Spatial sampling and censoring. In Barndorff-Nielsen, Kendall and van Lieshout (1999), pp. 37-78.
Baddeley, A. J. and J. Møller (1989). Nearest-neighbour Markov point processes and random sets. Int. Statist. Rev. 57, 89-121.
Baddeley, A. J. and R. Turner (2000). Practical maximum pseudolikelihood for spatial point patterns. Austral. & New Zealand J. Statist. 42, to appear.
Baddeley, A. J. and M. N. M. van Lieshout (1995). Area-interaction point processes. Ann. Inst. Statist. Math. 47, 601-619.
Barndorff-Nielsen, O. (1978). Information and Exponential Families in Statistical Theory. John Wiley and Sons, Chichester UK.
Barndorff-Nielsen, O. E., W. S. Kendall and M. N. M. van Lieshout (eds.) (1999). Stochastic Geometry: Likelihood and Computation. Monographs on Statistics and Applied Probability, Vol. 80. Chapman and Hall/CRC, Boca Raton.
Besag, J. E. (1974). Spatial interaction and the statistical analysis of lattice systems. (With discussion.) J. Roy. Statist. Soc. Ser. B 36, 192-236.
Besag, J. E. (1978). Some methods for statistical analysis of spatial data. Bull. Int. Statist. Inst. 47(2), 77-92.
Brandt, A., P. Franken and B. Lisek (1990). Stationary Stochastic Models. John Wiley and Sons, Chichester.
Brémaud, P. (1981). Point Processes and Queues. Martingale Dynamics. Springer-Verlag, New York.
Brillinger, D. R. (1975). Statistical inference for stationary point processes. In Stochastic Processes and Related Topics (Ed., M. L. Puri), pp. 55-99. Academic Press, New York.
Brillinger, D. R. (1976). Estimation of second-order intensities of a bivariate stationary point process. J. Roy. Statist. Soc. Ser. B 38, 60-66.
Brillinger, D. R. (1994). Time series, point processes and hybrids. Canad. J. Statist. 22, 177-206.
Brown, T. C., B. W. Silverman and R. K. Milne (1981). A class of two-type point processes. Zeit. für Wahrscheinlichkeitstheorie verw. Gebiete 58, 299-308.
Cox, D. R. (1962). Renewal Theory. Methuen, London.
Cox, D. R. (1972). The statistical analysis of dependencies in point processes. In Lewis (1972a), pp. 55-66.
Cox, D. R. and V. Isham (1980). Point Processes. Chapman and Hall, London.
Cox, D. R. and P. A. W. Lewis (1966). Statistical Analysis of Series of Events. Methuen (now Chapman and Hall), London.
Cramér, H. and M. R. Leadbetter (1967). Stationary and Related Processes. Sample Function Properties and Their Applications. John Wiley and Sons, New York.
Cressie, N. A. (1991). Statistics for Spatial Data. John Wiley and Sons, New York.
Daley, D. J. and R. K. Milne (1973). The theory of point processes: a bibliography. Int. Statist. Rev. 41, 183-201.
Daley, D. J. and D. Vere-Jones (1972). A summary of the theory of point processes. In Lewis (1972a), pp. 299-383.
Daley, D. J. and D. Vere-Jones (1988). An Introduction to the Theory of Point Processes. Springer-Verlag, New York.
Diggle, P. J. (1979). On parameter estimation and goodness-of-fit testing for spatial point patterns. Biometrics 35, 87-101.
Diggle, P. J. (1983). Statistical Analysis of Spatial Point Patterns. Academic Press, London.
Diggle, P. J. and R. K. Milne (1983a). Negative binomial quadrat counts and point processes. Scand. J. Statist. 10, 257-267.
Diggle, P. J. and R. K. Milne (1983b). Bivariate Cox processes: Some models for bivariate spatial point patterns. J. Roy. Statist. Soc. Ser. B 45, 11-21.
Disney, R. L. and P. C. Kiessler (1987). Traffic Processes in Queueing Networks: A Markov Renewal Approach. Johns Hopkins, Baltimore MD.
Dwass, M. and H. Teicher (1957). On infinitely divisible random vectors. Ann. Math. Statist. 28, 461-470.
Feller, W. (1968). An Introduction to Probability Theory and Its Applications. Vol. I. (3rd edn) John Wiley and Sons, New York.
Ferguson, T. S. (1973). A Bayesian analysis of some nonparametric problems. Ann. Statist. 1, 209-230.
Franken, P., D. König, U. Arndt and V. Schmidt (1982). Queues and Point Processes. John Wiley and Sons, Chichester.
Geyer, C. (1999). Likelihood inference for spatial point processes. In Barndorff-Nielsen, Kendall and van Lieshout (1999), pp. 79-140.
Gikhman, I. I. and A. V. Skorokhod (1969). Introduction to the Theory of Random Processes. W.B. Saunders Company, Philadelphia PA. (Translated from the 1st Russian edition: Nauka Press, Moscow, 1965).
Gnedenko, B. V. and I. N. Kovalenko (1989). Introduction to Queueing Theory. Birkhäuser, Boston MA. [1st Russian edition: Nauka Press, Moscow, 1966; translated 1968].
Goulard, M., A. Särkkä and P. Grabarnik (1996). Parameter estimation for marked Gibbs point processes through the maximum pseudo-likelihood method. Scand. J. Statist. 23, 365-379.
Grandell, J. (1976). Doubly Stochastic Poisson Processes. Lecture Notes in Mathematics, 529. Springer-Verlag, Berlin.
Grandell, J. (1997). Mixed Poisson Processes. (Monographs on Statistics and Applied Probability, 77.) Chapman and Hall, London.
Griffiths, R. C. and R. K. Milne (1978). A class of bivariate Poisson processes. J. Multivariate Anal. 8, 380-395.
Guttorp, P. (1995). Stochastic Modelling of Scientific Data. Chapman and Hall, London.
Harris, T. E. (1963). The Theory of Branching Processes. Springer-Verlag, Berlin.
Hjort, N. L. and H. Omre (1994). Topics in spatial statistics. Scand. J. Statist. 21, 289-357.
Hunter, J. J. (1974a). Renewal theory in two dimensions: basic results. Adv. Appl. Probab. 6, 376-391.
Hunter, J. J. (1974b). Renewal theory in two dimensions: asymptotic theory. Adv. Appl. Probab. 6, 546-562.
Jensen, J. L. and J. Møller (1991). Pseudo-likelihood estimation for exponential family models of spatial point processes. Ann. Appl. Probab. 3, 445-461.
Jiřina, M. (1964). Branching processes with measure-valued states. In Transactions of the Third Prague Conference on Information Theory, Statistical Decision Functions, Random Processes. Czechoslovak Academy of Sciences, pp. 333-357.
Kallenberg, O. (1983). Random Measures. (3rd edn.) Akademie-Verlag, Berlin.
Karr, A. F. (1986). Article 'Point process, stationary'. In Encyclopedia of Statistical Sciences (Eds., S. Kotz and N. L. Johnson), Vol. 7, pp. 15-19. John Wiley and Sons, New York.
Karr, A. F. (1988). Article 'Stochastic processes, point'. In Encyclopedia of Statistical Sciences (Eds., S. Kotz and N. L. Johnson), Vol. 8, pp. 852-859. John Wiley and Sons, New York.
Karr, A. F. (1991). Point Processes and Their Statistical Inference. (2nd edn.) Marcel Dekker, New York. (1st edn., 1986).
Kendall, D. G. (1964). Some recent work and further problems in the theory of queues. Theory Probab. Appl. 9, 1-12.
Kerstan, J., K. Matthes and J. Mecke (1974). Unbegrenzt teilbare Punktprozesse. Akademie-Verlag, Berlin.
Khinchin, A. Ya. (1969). Mathematical Methods in the Theory of Queueing. (2nd edn.) Griffin, London. [1st Russian edn., 1955; translated, 1960].
Kingman, J. F. C. (1993). Poisson Processes. Clarendon Press, Oxford.
König, D., K. G. Matthes and K. Nawrotzki (1967). Verallgemeinerung der Erlangschen und Engsetschen Formeln (Eine Methode in der Bedienungstheorie.) Akademie-Verlag, Berlin.
Kutoyants, Yu. A. (1998). Statistical Inference for Spatial Poisson Processes. Lecture Notes in Statistics, Vol. 134. Springer-Verlag, New York.
Lewis, P. A. W. (1972a). Stochastic Point Processes: Statistical Analysis, Theory and Applications. Wiley-Interscience, New York.
Lewis, P. A. W. (1972b). Recent results in the statistical analysis of univariate point processes. In Lewis (1972a), pp. 1-54.
Lewis, P. A. W. and G. S. Shedler (1979). Simulation of non-homogeneous Poisson processes by thinning. Naval Res. Logistics Quart. 26, 403-413.
Lotwick, H. W. and B. W. Silverman (1982). Methods for analysing spatial processes of several types of points. J. Roy. Statist. Soc. Ser. B 44, 406-413.
Macchi, O. (1975). The coincidence approach to stochastic point processes. Adv. Appl. Probab. 7, 83-122.
Matérn, B. (1960). Spatial Variation: Stochastic Models and Their Application to Some Problems in Forest Surveys and Other Sampling Investigations. Meddelanden från Statens Skogsforskningsinstitut 49, nr 5, 1-144.
Matérn, B. (1986). Spatial Variation. (2nd ed.) Lecture Notes in Statistics, 36. Springer-Verlag, Berlin.
Matthes, K., J. Kerstan and J. Mecke (1978). Infinitely Divisible Point Processes. John Wiley and Sons, Chichester.
Milne, R. K. (1971). Simple proofs of some theorems on point processes. Ann. Math. Statist. 42, 368-372.
Milne, R. K. (1974). Infinitely divisible bivariate Poisson processes. (Abstract) Adv. Appl. Probab. 6, 226-227.
Milne, R. K. (1998). Article on 'Point Processes'. In Encyclopedia of Biostatistics (Eds., P. Armitage and T. Colton), John Wiley and Sons, Chichester. Vol. 4, pp. 3385-3398.
Milne, R. K. and M. Westcott (1972). Further results for Gauss-Poisson processes. Adv. Appl. Probab. 4, 151-176.
Molchanov, I. S. (1997). Statistics of the Boolean Model for Practitioners and Mathematicians. John Wiley and Sons, Chichester.
Molchanov, I. S. (1999). Random closed sets: results and problems. In Barndorff-Nielsen, Kendall and van Lieshout (1999), pp. 285-331.
Møller, J. (1999). Markov chain Monte Carlo and spatial point processes. In Barndorff-Nielsen, Kendall and van Lieshout (1999), pp. 141-172.
Mönch, G. (1971). Verallgemeinerung eines Satzes von A. Rényi. Stud. Sci. Math. Hung. 6, 81-90.
Moyal, J. E. (1962). The general theory of stochastic population processes. Acta Math. 108, 1-31.
Nawrotzki, K. (1962). Ein Grenzwertsatz für homogene zufällige Punktfolgen. Math. Nachr. 24, 201-217.
Neveu, J. (1977). Processus Ponctuels. In Lecture Notes in Mathematics. 598, 249-447. Springer-Verlag, Berlin.
Palm, C. (1943). Intensitätsschwankungen im Fernsprechverkehr. Ericsson Technics 44, 1-189.
Prohorov, Yu. V. and Yu. A. Rozanov (1969). Probability Theory: Basic Concepts, Limit Theorems, Random Processes. Springer-Verlag, Berlin.
Reiss, R.-D. (1993). A Course on Point Processes. Springer-Verlag, New York.
Rényi, A. (1967). Remarks on the Poisson process. Stud. Sci. Math. Hung. 2, 119-123.
Resnick, S. (1987). Extreme Values, Regular Variation, and Point Processes. Springer-Verlag, New York.
Ripley, B. D. (1981). Spatial Statistics. John Wiley and Sons, New York.
Ripley, B. D. (1988). Statistical Inference for Spatial Processes. Cambridge University Press, Cambridge UK.
Ripley, B. D. and F. P. Kelly (1977). Markov point processes. J. Lond. Math. Soc. 15, 188-192.
Rydén, T. (1995). Consistent and asymptotically normal parameter estimates for Markov modulated Poisson processes. Scand. J. Statist. 22, 295-303.
Särkkä, A. (1993). Pseudo-likelihood Approach for Pair Potential Estimation of Gibbs Processes. University of Jyväskylä Studies in Computer Science, Economics and Statistics, Vol. 22. University of Jyväskylä, Jyväskylä.
Särkkä, A. (1995). Pseudo-likelihood approach for Gibbs point processes in connection with field observations. Statistics 26, 89-97.
Sewastjanov, B. A. (1975). Verzweigungsprozesse. R. Oldenbourg-Verlag, München. (Original Russian edition: Nauka Press, Moscow, 1971.)
Shorrock, R. W. (1975). Extremal processes and random measures. J. Appl. Probab. 12, 316-323.
Sigman, K. (1995). Stationary Marked Point Processes. An Intuitive Approach. Chapman and Hall, New York.
Snyder, D. L. and M. I. Miller (1991). Random Point Processes in Time and Space. 2nd edn. Springer-Verlag, New York. (1st ed., Snyder only: John Wiley and Sons, New York, 1975).
Srinivasan, S. K. (1974). Stochastic Point Processes. Griffin, London.
Stoyan, D., W. S. Kendall and J. Mecke (1995). Stochastic Geometry and Its Applications. 2nd ed. John Wiley and Sons, Chichester. (1st ed. 1987, joint with Akademie Verlag, Berlin).
Stoyan, D. and H. Stoyan (1994). Fractals, Random Shapes and Point Fields: Methods of Geometrical Statistics. John Wiley and Sons, Chichester.
Strauss, D. J. (1975). A model for clustering. Biometrika 63, 467-475.
Thorisson, H. (1995). On time- and cycle-stationarity. Stoch. Proc. Appl. 55, 185-209.
van Lieshout, M. N. M. and A. J. Baddeley (1996). A nonparametric measure of spatial interaction in point patterns. Statist. Neerland. 50, 344-361.
van Lieshout, M. N. M. and A. J. Baddeley (1999). Indices of dependence between types in multivariate point patterns. Scand. J. Statist. (to appear).
Westcott, M. (1972). The probability generating functional. Austral. J. Math. 14, 448-466.
Williams, D. (1979). Diffusions, Markov Processes and Martingales. Vol. 1. Foundations. John Wiley and Sons, Chichester.