Time association between series of geophysical events


Physics of the Earth and Planetary Interiors, 71(1992)147—153 Elsevier Science Publishers B.V., Amsterdam


Francesco Mulargia

Dipartimento di Fisica, Settore di Geofisica, Università di Bologna, Viale Berti Pichat 8, 40127 Bologna, Italy

(Received 8 April 1991; revision accepted 9 December 1991)

ABSTRACT

Mulargia, F., 1992. Time association between series of geophysical events. Phys. Earth Planet. Inter., 71: 147–153.

The problem of evaluating the statistical significance of the time association of different series of events, i.e. the correlation between the instants of occurrence, is often encountered in geophysics. It is of paramount importance in all the issues which have an essentially empirical basis, where chance may play a substantial role in affecting the results. Among others, there is the case of assessing the validity of seismic and volcanic precursors. A general procedure is presented together with a practical application to flank eruptive activity and local seismicity of Mount Etna volcano.

1. Introduction

Owing to the predominant importance that observation has in geophysics, and to the great difficulty of deriving reliable models, several lines of research are devoted to the identification of phenomenological correlations between different phenomena. In many cases interest is focused on the mere instants of occurrence of different types of events rather than on measured variables, and the correlation is then more properly called time association. The correlation of geomagnetic reversals to volcanism, and the search for precursors to both earthquakes and volcanic eruptions, are typical examples. Since in most cases time association itself provides the essential validation, it should be analyzed with great care. This has seldom been the rule in the geophysical literature (cf. Lutz and Watson, 1985; Rhoades and Evison, 1989), and the few exceptions are specific to particular applications (Vere-Jones, 1978; Rhoades and Evison, 1979; Molchan et al., 1990). The goal of the present note is to discuss the problem in general and to propose a simple formal procedure for evaluating the statistical significance of the time association between two given series, which should be applicable to many practical situations.

Correspondence to: Prof. F. Mulargia, Dipartimento di Fisica, Settore di Geofisica, Università di Bologna, Viale Berti Pichat 8, 40127 Bologna, Italy.

0031-9201/92/$05.00 © 1992 Elsevier Science Publishers B.V. All rights reserved

2. Stating the problem

Statistically, the time association between series of events has been quite thoroughly understood since the 1950s, but the applications have been limited to machinery malfunction and nerve cell discharges (Cox, 1955; Griffith and Horn, 1963; Brillinger, 1976). This existing statistical apparatus can be transferred with some manipulation to the case of geophysical events. In general, a trustworthy correlation must satisfy two basic requirements (cf. Huff, 1973): (1) it must be ‘statistically significant’; (2) it must have a credible ‘physical’ basis. Each of these two points requires careful scrutiny.

As regards point (1), there are two key factors: a correct ‘sampling procedure’ and an appropriate ‘formal testing’. A correct sampling procedure requires the data to be selected in ‘large’ sets and ‘at random’. Unfortunately, the available sets in geophysics are often small. Thus, a strictly random selection of the data becomes crucial, since fluctuations may produce a wide variety of ‘apparent’ anomalies. Just throw two coins and count separately the series of heads and tails for each coin. By selecting ad hoc ‘small’ subsets (about 5–10 units) in the two series it is possible to identify many formally significant, but totally meaningless, correlations and regularities. It should be noted that this problem may be masked by a choice of variables which gives a predominant weight to some elements, reducing an initially large set to a few units. A typical example is represented by the energy or the scalar moment released by seismic events. These are usually evaluated through empirical laws of the type $\log E = DM + C$, where $M$ is the magnitude, $C$ and $D$ are constants, and log is the logarithm in base 10. The number of earthquakes $N$ is distributed according to a formally similar relation $\log N = bM + a$, where $b$ and $a$ are also constants, but $b$ [...] inherent to the methodology. This is the case with pattern recognition, which aims at classifying events according to a set of ‘characteristic’ features. The recognition can be more or less quantitative and objective (cf. e.g. Mantovani et al., 1987; Keilis-Borok et al., 1988), but always implies a ‘learning’ set on which the characteristic features are identified. Testing the statistical significance on this same set implies a non-random sampling and is, therefore, meaningless. A common solution consists of testing the results on sets different from the one used in learning (cf. Gelfand et al., 1976; Keilis-Borok et al., 1988). However, since one always tends to influence the process of data selection to some extent, no retrospective analysis guarantees a truly random sampling, which is possible only on ‘future’ occurrences (Rhoades and Evison, 1989; Mulargia et al., 1991).

Formal testing requires the use of a statistical procedure appropriate to the study of time series. The problem is discussed in detail in the next section.

Satisfying point (2) for geophysical events is often difficult, since the physical processes are not known in detail and there may exist important phenomena which have yet to be investigated. Nevertheless, a formally valid correlation is just the hint of a true correlation as long as it is not backed up by a physically sound explanation. Indeed, an effective research strategy requires formally valid phenomenological correlations to be the starting point of theoretical modeling.
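The coin-tossing remark of Section 2 can be made concrete with a small simulation. The Python sketch below tosses two independent fair coins and then searches, after the fact, for short windows in which the two series agree (or disagree) at every toss. The window width of 8, the series length and the function names are assumptions chosen only for illustration; they do not come from the original text.

```python
import random


def matching_windows(a, b, width):
    """Start indices of windows in which a and b agree, or disagree, at every position."""
    hits = []
    for start in range(len(a) - width + 1):
        agree = sum(x == y for x, y in zip(a[start:start + width],
                                           b[start:start + width]))
        if agree in (0, width):  # perfect correlation or perfect anticorrelation
            hits.append(start)
    return hits


def simulate(n_tosses=2000, width=8, seed=0):
    """Toss two independent fair coins and list the 'perfect' windows found by scanning."""
    rng = random.Random(seed)
    coin_a = [rng.randint(0, 1) for _ in range(n_tosses)]
    coin_b = [rng.randint(0, 1) for _ in range(n_tosses)]
    return coin_a, coin_b, matching_windows(coin_a, coin_b, width)


if __name__ == "__main__":
    width = 8
    coin_a, coin_b, hits = simulate(width=width)
    # Chance that ONE window fixed in advance shows perfect (anti)agreement.
    p_single = 2 * 0.5 ** width
    print(f"probability for a single window fixed in advance: {p_single:.4f}")
    print(f"windows found by scanning after the fact: {len(hits)}")
```

Each such window, taken alone, would pass a nominal 0.01 test (the single-window probability is 2 × 0.5^8 ≈ 0.0078). Because the windows are selected ad hoc after inspecting the data, finding some of them is to be expected even though the two series are independent by construction.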

3. Statistical formulation

Statistically, time series of geophysical events are represented by stochastic point processes. These are characterized by the number of points $N(I, \omega)$ in the interval $I = (a, b)$ relative to the realization $\omega$. This number is random since it is in general different for each realization. The statistical analysis of time series is much simplified under two assumptions (Cox and Lewis, 1966): (1) absence of multiple events, i.e. interevent times always $> 0$; (2) stationarity of the process, i.e.

$$[N(I_1, \omega), \ldots, N(I_k, \omega)] \sim [N(I_1 + t, \omega), \ldots, N(I_k + t, \omega)]$$

for all $t$ and $k = 1, 2, \ldots$, where $\sim$ denotes equality in distribution and $I + t$ denotes the interval $(a + t, b + t)$. The first assumption is usually satisfied by most geophysical time series. The second one sometimes is not. In this case it is still possible to use the stationarity assumption by breaking the series into quasi-stationary sections and performing calculations separately on each of them. Obviously, a sufficient number of events (about 10–20) in each section is necessary.

Given two stationary stochastic point processes without multiple events, M and N (consisting of $M(T)$ and $N(T)$ events), the association between the two is described by the ‘intensity cross product’ (Brillinger, 1976)

$$p_{MN}(u) = \lim_{h, h' \to 0} \frac{\mathrm{Prob}[\text{point M in } (t + u - h,\ t + u + h) \text{ and point N in } (t - h',\ t + h')]}{4hh'} \qquad (1)$$


which, for small $h, h' > 0$, gives

$$\mathrm{Prob}[\text{point N in } (t - h',\ t + h')] \approx p_N\, 2h' \qquad (2)$$

and

$$\mathrm{Prob}[\text{point M in } (t + u - h,\ t + u + h) \text{ and point N in } (t - h',\ t + h')] \approx p_{MN}\, 4hh' \qquad (3)$$

Using the definition of conditional probability we have:

$$\mathrm{Prob}[\text{M in } (t + u - h,\ t + u + h) \text{ given a point N in } (t - h',\ t + h')] = \frac{\mathrm{Prob}[\text{point M in } (t + u - h,\ t + u + h) \text{ and point N in } (t - h',\ t + h')]}{\mathrm{Prob}[\text{point N in } (t - h',\ t + h')]} \approx \frac{p_{MN}}{p_N}\, 2h \qquad (4)$$

Since the total number $n(u, h)$ of events associated by both chance and true association is the number of $s_i$ such that

$$s_i + u - h < t_j < s_i + u + h \quad \text{for some } j \qquad (5)$$

where $s_i$ and $t_j$ are the occurrence times of the N and M events, respectively, we estimate the intensity cross product for a given realization by

$$\hat{p}_{MN}(u, h) = \frac{n(u, h)}{N(T)\, 2h}\, \hat{p}_N = \frac{n(u, h)}{2hT} \qquad (6)$$

(with $\hat{p}_N = N(T)/T$), which is in general a function of both $u$ and $h$. Defining the ‘mean intensity’ of the processes M and N, $p_M$ and $p_N$, as

$$p_M = \lim_{h \to 0} \mathrm{Prob}[\text{point M in } (t,\ t + h)]/h \qquad (7a)$$

$$p_N = \lim_{h \to 0} \mathrm{Prob}[\text{point N in } (t,\ t + h)]/h \qquad (7b)$$

in the presence of no true association, i.e. with the events M and N assumed independent, the intensity cross product takes the form $p_M p_N$, which allows the definition of functions such as $p_{MN}/(p_M p_N)$ to measure the true association (Brillinger, 1976). However, following this approach the problem is complex since the distributions $p_M$, $p_N$ and $p_{MN}$ are in general unknown. A simpler and more reliable approach is possible by considering the intensity cross function due to chance $p_{MN0}$, which is defined identically to $p_{MN}$, but accounts only for the $n_0(u, h)$ associations which take place by mere chance. It can be estimated by

$$\hat{p}_{MN0} = \frac{n_0(u, h)}{N(T)\, 2h}\, \hat{p}_N = \frac{n_0(u, h)}{2hT} \qquad (8)$$

Practical estimates of $\hat{p}_{MN0}$ require the hypothesis of two general independent processes M and N appropriate for geophysical time series. A small number of weak assumptions guarantees the widest possible applicability. One solution is the following. Let us call $(0, T)$ the observation interval and $M(T)$, $N(T)$ the total number of events M and N. The union of the $M(T)$ intervals of amplitude $2h$ centered at each M event defines on the time axis a set $\Omega$ of total normalized length approximately equal to $M(T)2h/T$ if few M events occur with an interevent time smaller than $h$, i.e. if

$$h < (t_{j+1} - t_j) \quad \text{for most } j;\ j = 1, \ldots, M(T) - 1 \qquad (9)$$
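The counting rule (5), the estimator (6) and the check of condition (9) can be sketched in a few lines of Python. The sketch assumes the reading of (5) in which each N event at time s is matched to M events falling in (s + u − h, s + u + h), so that positive u means the M event follows the N event. Event times are plain numbers in a common unit (e.g. days); the function names and the toy data are hypothetical.

```python
def count_associated(m_times, n_times, u, h):
    """n(u, h): number of N events s having an M event t with s + u - h < t < s + u + h."""
    return sum(
        any(s + u - h < t < s + u + h for t in m_times)
        for s in n_times
    )


def cross_intensity_estimate(m_times, n_times, u, h, total_time):
    """Estimate of the intensity cross product, n(u, h) / (2 h T), as in eqn. (6)."""
    return count_associated(m_times, n_times, u, h) / (2.0 * h * total_time)


def fraction_satisfying_condition_9(m_times, h):
    """Fraction of consecutive M interevent times larger than h (condition (9) asks for 'most')."""
    ordered = sorted(m_times)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if not gaps:
        return 1.0
    return sum(gap > h for gap in gaps) / len(gaps)


if __name__ == "__main__":
    m_times = [10.0, 50.0]        # toy M events (assumed values)
    n_times = [9.0, 30.0, 52.0]   # toy N events (assumed values)
    print(count_associated(m_times, n_times, u=0.0, h=2.0))
    print(cross_intensity_estimate(m_times, n_times, u=0.0, h=2.0, total_time=100.0))
    print(fraction_satisfying_condition_9(m_times, h=2.0))
```

Each N event is counted at most once, however many M events fall in its window, which matches the "for some j" wording of (5). The strict inequalities mean that an M event lying exactly on a window edge is not counted.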

The independence of the processes M and N can be obtained using this constraint without any assumption on the distribution of M events, provided that the N events occur according to a stationary Poisson process, which is appropriate for most geophysical time series (non-stationarities to be treated as above). The number $n_0(u, h)$ of events N which fall in $\Omega$ then follows, by definition, a binomial distribution with parameters $p = M(T)2h/T$ and $n = N(T)$. This also implies that if $M(T)2h/T$ is small and $N(T)$ is large, $n_0(u, h)$ follows a Poisson distribution with mean $\mu = M(T)N(T)2h/T$ (see, e.g. Cramer, 1946). In other words, the intensity cross product in the case of no association is distributed either binomially or according to Poisson, with parameters readily estimated from the data. The presence of a true association can then be assessed in terms of decision theory by testing the null hypothesis that the association is due to mere chance, $p_{MN} = p_{MN0}$, or equivalently that $n(u, h) = n_0(u, h)$, through

$$H_0: n(u, h) \text{ is binomial with parameters } p = 2hM(T)/T;\ n = N(T) \quad \text{if } 2hM(T) \sim T \qquad (10a)$$

$$H_0: n(u, h) \text{ is Poisson with mean } 2hT\hat{p}_M\hat{p}_N \quad \text{if } 2hM(T) \ll T \text{ and } N(T) \gg 0 \qquad (10b)$$


with the alternative of true association. The level of statistical significance s.l. of true association (the risk of being wrong in rejecting the hypothesis of an association merely due to chance) is therefore evaluated one-tailed on the upper part of the cumulative curves

$$\mathrm{s.l.} = \sum_{x = n(u,h)}^{N(T)} \binom{N(T)}{x} [2hM(T)/T]^{x}\, [1 - 2hM(T)/T]^{N(T) - x} \quad \text{if binomial} \qquad (11a)$$

$$\mathrm{s.l.} = \sum_{x = n(u,h)}^{\infty} \exp[-(2hT\hat{p}_M\hat{p}_N)]\, \frac{(2hT\hat{p}_M\hat{p}_N)^{x}}{x!} \quad \text{if Poisson} \qquad (11b)$$

The significance level is directly computed from eqn. (11a) for the binomial distribution. For the Poisson distribution it is found by comparing the observed $n(u, h)$ with the critical values for the cumulative distribution (11b), which is tabulated in a number of texts (e.g. Pearson and Hartley, 1966; Abramowitz and Stegun, 1970) and is also included in computer libraries (e.g. IMSL subroutine MDTPS). In $n(u, h)$ both $u$ and $h$ are random variates and several subsidiary functions can be constructed (provided that $h$ remains ‘small’). Two of them appear particularly interesting: $n(u, h = \text{const.})$, which shows the ‘peaks’ of association (here $h = \text{const.}$ has essentially the meaning of a unit of measure), and $n(u = h, h)$, which is consistent with the standard definition of precursors and precursory time (equal to $2h$). Their practical application will be illustrated in the next section.

TABLE 1

Etna flank eruptions and seismic sequences within Etna volcanic edifice for the period 1 January 1974–1 January 1991, according to Gasperini et al. (1990). All entries are starting dates.

| Etna flank eruptions (year month day) | Local earthquake sequences (year month day) |
|---|---|
| 1974 01 31 | 1974 01 21 |
| 1978 04 29 | 1976 05 01 |
| 1978 08 23 | 1978 02 23 |
| 1978 11 23 | 1978 05 17 |
| 1979 08 03 | 1978 11 27 |
| 1981 03 17 | 1979 08 05 |
| 1983 03 28 | 1981 03 17 |
| 1985 03 10 | 1983 03 27 |
| 1985 12 25 | 1986 10 05 |
| 1986 10 30 | 1987 08 13 |
| 1989 09 28 | 1989 08 21 |
|  | 1990 03 16 |
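The significance levels of eqns. (11a) and (11b) are short enough to evaluate directly rather than from tables. The following Python sketch does so with the standard library only; the function names and the helper `critical_count` are illustrative assumptions, and the example values (11 M events, 12 N events, T = 6205 days, h = 5 days) are those quoted for the Etna application in Section 4.

```python
from math import comb, exp, factorial


def binomial_sl(n_obs, m_count, n_count, h, total_time):
    """Upper-tail binomial probability of eqn. (11a): P(X >= n_obs)."""
    p = 2.0 * h * m_count / total_time
    return sum(
        comb(n_count, x) * p ** x * (1.0 - p) ** (n_count - x)
        for x in range(n_obs, n_count + 1)
    )


def poisson_sl(n_obs, m_count, n_count, h, total_time):
    """Upper-tail Poisson probability of eqn. (11b): P(X >= n_obs)."""
    mu = 2.0 * h * m_count * n_count / total_time  # equals 2 h T p_M p_N
    below = sum(exp(-mu) * mu ** x / factorial(x) for x in range(n_obs))
    return 1.0 - below


def critical_count(level, m_count, n_count, h, total_time, sl=poisson_sl):
    """Smallest observed count whose significance level is at or below `level` (level > 0)."""
    n_obs = 0
    while sl(n_obs, m_count, n_count, h, total_time) > level:
        n_obs += 1
    return n_obs


if __name__ == "__main__":
    m_count, n_count, total_time, h = 11, 12, 6205.0, 5.0
    for n_obs in range(5):
        print(n_obs,
              f"binomial {binomial_sl(n_obs, m_count, n_count, h, total_time):.5f}",
              f"Poisson {poisson_sl(n_obs, m_count, n_count, h, total_time):.5f}")
    print("critical count at 0.01:", critical_count(0.01, m_count, n_count, h, total_time))
```

With these values the Poisson mean is about 0.21, and three or more associated events are needed to reach the 0.01 level. The binomial and Poisson tails are close here because 2hM(T) is much smaller than T.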

4. An example: local seismicity precursor to Mount Etna flank eruptions or vice versa?

The existence of a time correlation between local seismicity and Mount Etna volcano eruptive activity has been suggested by several authors, with earthquakes preceding (Sharp et al., 1981; Mulargia et al., 1987) and following eruptions (Nercessian et al., 1991). While both issues are interesting from a scientific point of view, the first one, if confirmed, would be a potentially useful warning tool, since flank activity every few years causes considerable damage on the densely populated slopes of Etna.

Let us apply the above technique to the period 1 January 1974–1 January 1991, for which reliable catalogs of both seismic and volcanic activity are available. A total of 11 flank eruptions occurred in this period (see Table 1). Focusing our interest on the seismicity directly linked to eruptions, i.e. the earthquakes which occurred within the volcanic edifice, a total of 12 seismic sequences have been recorded in the same period (cf. Gasperini et al., 1990; see Table 1 for the list). These sets are rather small. However, they should represent a random sample since they account for all the data available in a given time period, and the criteria for their selection had an independent objective basis (Gasperini et al., 1990). Neither of the two series shows evidence of non-stationarities, as can be easily checked (Fig. 1) through a visual examination of the cumulative event curves (cf. Cox and Lewis, 1966). Condition (9) also appears well satisfied. Note that, since we wish to analyze the possibility that each of the two series of seismic clusters and flank eruptions is a precursor to the other one, we test both for fitting a stationary Poisson process. The result is positive, with a Kolmogorov–Smirnov two-tail test (e.g. Conover, 1980) yielding KS values of 0.213 for the first and 0.199 for the second series. These correspond to a lowest significance level for the rejection of a stationary Poisson process equal to 0.298 and 0.130, respectively. The total period $T$ of analysis is 6205 days and the estimated probabilities $p_M$ and $p_N$ are, respectively, $1.77 \times 10^{-3}$ and $1.93 \times 10^{-3}$.

We take as positive the delay of volcanic eruptions with respect to earthquakes, i.e. earthquakes occurring first. Evaluating $n(u, h = 5 \text{ days})$, i.e. within a window of $2h = 10$ days centered at $u$, we can use the corresponding critical values of the Poisson distribution since $2hM(T) \ll T$ and $N(T) \gg 0$. These indicate that a highly significant association peak exists (see Fig. 2) (at a level lower than 0.01) for $u$ of between −7 and 3 days. No other significant peak is apparent. Physically, this gives definite support to the issue of local earthquakes and flank eruptions being concomitant phenomena and suggests that crustal stress is a fundamental variable of Mount Etna eruptive dynamics. In terms of precursors, it indicates the quasi-symmetric result that bursts in local seismicity significantly precede flank eruptions and flank eruptions significantly precede local seismicity, both by a few days.

A more conventional precursor evaluation is provided by $n(u = h, h)$, which accounts for all the associated events within a variable precursory time $2h$. As Fig. 3 shows, this function suggests a significant association (at the 0.01 level) for $u$ of between −25 and 100 days.

Fig. 1. The cumulative number of occurrences of seismic clusters (a) and flank eruptions (b). Horizontal axis: year.

Fig. 2. The function n(u, h = 5 days) showing the association peaks within a window of 2h = 10 days (see text) for Mount Etna local seismicity and flank eruptive activity. Positive u values indicate earthquakes occurring before eruptions. Significant association is above the 0.01 level line. Horizontal axis: u (days), from −100 to 100.

Fig. 3. The association function n(u = h, h), which is consistent with the current definition of precursory time (equal to 2h). The meaning of the symbols is as in Fig. 2. Horizontal axis: 2h (days), from −100 to 100.
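As a rough cross-check of the Etna application, the Python sketch below recomputes n(u, h = 5 days) from the starting dates of Table 1 and attaches the Poisson significance level of eqn. (11b). The dates are entered as read from Table 1, T = 6205 days is the value quoted in the text, delays are taken as eruption date minus sequence date (positive when the earthquakes come first), and the function names are hypothetical. It is not an attempt to reproduce Figs. 2 and 3 exactly.

```python
from datetime import date
from math import exp, factorial

# Starting dates as read from Table 1 (after Gasperini et al., 1990).
ERUPTIONS = [  # M events: Etna flank eruptions
    date(1974, 1, 31), date(1978, 4, 29), date(1978, 8, 23), date(1978, 11, 23),
    date(1979, 8, 3), date(1981, 3, 17), date(1983, 3, 28), date(1985, 3, 10),
    date(1985, 12, 25), date(1986, 10, 30), date(1989, 9, 28),
]
SEQUENCES = [  # N events: local earthquake sequences
    date(1974, 1, 21), date(1976, 5, 1), date(1978, 2, 23), date(1978, 5, 17),
    date(1978, 11, 27), date(1979, 8, 5), date(1981, 3, 17), date(1983, 3, 27),
    date(1986, 10, 5), date(1987, 8, 13), date(1989, 8, 21), date(1990, 3, 16),
]
T_DAYS = 6205  # total period of analysis quoted in the text


def n_assoc(u, h):
    """Number of sequences with an eruption whose delay (in days) lies in (u - h, u + h)."""
    return sum(
        any(u - h < (t - s).days < u + h for t in ERUPTIONS)
        for s in SEQUENCES
    )


def poisson_sl(n_obs, h):
    """P(X >= n_obs) for a Poisson variate with mean 2 h M(T) N(T) / T."""
    mu = 2.0 * h * len(ERUPTIONS) * len(SEQUENCES) / T_DAYS
    return 1.0 - sum(exp(-mu) * mu ** x / factorial(x) for x in range(n_obs))


if __name__ == "__main__":
    h = 5
    for u in range(-20, 21, 5):
        n = n_assoc(u, h)
        print(f"u = {u:+4d} days  n = {n}  s.l. = {poisson_sl(n, h):.5f}")
```

With these dates, n(0, 5) counts four sequences (those starting in November 1978, August 1979, March 1981 and March 1983), which under the Poisson model lies well below the 0.01 significance level, in line with the peak near u = 0 described in the text.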

References

Abramowitz, M. and Stegun, I.A., 1970. Handbook of Mathematical Functions. Dover, New York, 1046 pp.

Brillinger, D.R., 1976. Measuring the association of point processes: a case history. Am. Math. Monthly, 83: 16–22.

Conover, W.J., 1980. Practical nonparametric statistics, 2nd Ed. Wiley, New York, 493 pp.

Cox, D.R., 1955. Some statistical methods connected with series of events. J. R. Stat. Soc. B, 17: 129–164.

Cox, D.R. and Lewis, P.A.W., 1966. The Statistical Analysis of Series of Events. Chapman and Hall, London, 285 pp.

Cramer, H., 1946. Mathematical Methods of Statistics. Princeton University, Princeton.

Gasperini, P., Gresta, S. and Mulargia, F., 1990. Statistical analysis of seismic and eruptive activities at Mt. Etna during 1978–1987. J. Volcanol. Geoth. Res., 40: 317–325.

Gelfand, I.M., Guberman, S.H., Keilis-Borok, V.I., Knopoff, L., Press, F., Ranzman, E., Rotwain, I. and Sadovsky, A.M., 1976. Pattern recognition applied to earthquake epicenters in California. Phys. Earth Planet. Inter., 11: 227–283.

Griffith, J.S. and Horn, G., 1963. Functional coupling between cells in the visual cortex of the unrestrained cat. Nature, 199: 876–895.

Huff, D., 1973. How to Lie with Statistics. Penguin, London, 164 pp.

Keilis-Borok, V.I., Knopoff, L., Rotwain, I.M. and Allen, C.R., 1988. Intermediate-term prediction of occurrence times of strong earthquakes. Nature, 335: 690–693.

Lutz, T.M., 1985. The magnetic reversal record is not periodic. Nature, 317: 404–407.

Mantovani, E., Mucciarelli, M. and Albarello, D., 1987. Evidence of interrelation between the seismicity of the Southern Apennines and the Southern Dinarides. Phys. Earth Planet. Inter., 49: 259–263.

Molchan, G., Dmitrieva, O., Rotwain, I. and Dewey, J., 1990. Statistical analysis of results of earthquake prediction based on burst of aftershocks. Phys. Earth Planet. Inter., 61: 128–139.

Mulargia, F., Gasperini, P. and Tinti, S., 1987. Identifying different regimes in eruptive activity: an application to Etna volcano. J. Volcanol. Geoth. Res., 34: 89–106.

Mulargia, F., Achilli, V., Broccio, F. and Baldi, P., 1991. Is a destructive earthquake imminent in southeastern Sicily? Tectonophysics, 188: 399–402.

Nercessian, A., Hirn, A. and Sapin, M., 1991. A correlation between earthquakes and eruptive phases at Mt Etna: an example and past occurrences. Geophys. J. Int., 105: 131–138.

Pearson, E.S. and Hartley, H.O. (Editors), 1966. Biometrika Tables for Statisticians. Cambridge University Press, Cambridge, 604 pp.

Rhoades, D.A. and Evison, F.F., 1979. Long-range earthquake forecasting based on a single predictor. Geophys. J. R. Astron. Soc., 59: 43–56.

Rhoades, D.A. and Evison, F.F., 1989. On the reliability of precursors. Phys. Earth Planet. Inter., 58: 137–140.

Sharp, A.D.L., Lombardo, G. and Davis, P.M., 1981. Correlation between eruption of Mount Etna, Sicily, and regional earthquakes as seen in historical records from 1582 A.D. Geophys. J. R. Astron. Soc., 65: 507–523.

Vere-Jones, D., 1978. Earthquake prediction – a statistician’s view. J. Phys. Earth, 26: 129–146.