5 October 1998 PHYSICS LETTERS A
Physics Letters A 247 (1998) 47-52
On the constructive role of noise in spatial systems
Alessandro Giuliani a, Alfredo Colosimo b, Romualdo Benigni a, Joseph P. Zbilut c,¹
a Laboratory of Comparative Toxicology and Ecotoxicology, Istituto Superiore di Sanità, Viale Regina Elena 299, 00161 Rome, Italy
b Department of Biochemistry, University of Rome "La Sapienza", Rome, Italy
c Department of Molecular Biophysics and Physiology, Rush University, 1653 W. Congress, Chicago, IL 60612, USA
Received 28 May 1998; revised manuscript received 13 July 1998; accepted for publication 13 July 1998
Communicated by C.R. Doering
Abstract
The detection of weak signals has made a dramatic advance with the recognition of the constructive role that noise can play in the study of weak periodic signals. In this paper we show that the enhancement of weak signals has a wider reach, and is not confined to dynamical systems. In particular, the addition of artificial noise to a principal components analysis of a geographical problem permitted the discrimination between very weak but informative components and the noise inherent in the data. The mechanism involved is discussed. © 1998 Elsevier Science B.V.
PACS: 43.50; 72.70.+m; 74.#.+k; 80
Keywords: Signals; Noise; Principal components; Singular values; Stochastic resonance; Pattern recognition
1. Introduction
A crucial factor for pattern recognition is the ability to discriminate between noise and signal. This discrimination is normally accomplished by devising a filter (either physical or mathematical), which eliminates any signal below a given threshold [1-3]. Although a strong signal is certainly made clearer by a filter, semantically important information may also be carried by weak signals, which may be significantly degraded by the filtering [3,4-6]. The discovery of the mechanism of stochastic resonance (SR) [7,8] has radically changed the way in which the problem of discriminating weak signals from background noise has traditionally been approached. SR consists of a nonlinear cooperative effect, which arises when a weak periodic signal enters into resonance with random fluctuations, thus producing the amplification of the periodic component, i.e., a maximum for the signal-to-noise ratio. More recently, claims have been made that the addition of a noise component can result in multiple maxima [2]. Although this phenomenon has traditionally been described with respect to dynamical systems, in principle there is no reason why a spatially ordered system may not also exhibit signal enhancement due to added noise. The mathematical equivalence between principal components analysis (PCA) of a multivariate data set and the singular spectrum analysis of a time series [9] stimulated us to consider this possibility. This attempt relies on the notion that a temporal structure is a correlation structure with no fundamental difference from any non-time-dependent correlation structure [10,11].
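As a minimal illustration of the SR effect described above, the following Python sketch passes a subthreshold sine wave plus Gaussian noise through a simple threshold detector and measures how well the detector output tracks the signal. The threshold detector is a simplification chosen here for brevity, not the bistable system of Refs. [7,8]; the function name `threshold_response` and all parameter values are hypothetical.

```python
import numpy as np

def threshold_response(noise_sd, n=20000, amp=0.5, threshold=1.0, period=100, seed=0):
    """Correlation between a subthreshold sine wave and a noisy threshold detector's output."""
    rng = np.random.default_rng(seed)
    t = np.arange(n)
    signal = amp * np.sin(2.0 * np.pi * t / period)    # peak amplitude 0.5 stays below the threshold 1.0
    noisy = signal + rng.normal(0.0, noise_sd, size=n)
    output = (noisy > threshold).astype(float)         # 1 wherever the threshold is crossed
    if output.std() == 0.0:                            # no crossings at all: nothing is detected
        return 0.0
    return float(np.corrcoef(signal, output)[0, 1])

# Hypothetical noise levels: far too little, moderate, far too much.
noise_levels = [0.01, 0.4, 50.0]
responses = [threshold_response(sd) for sd in noise_levels]
for sd, r in zip(noise_levels, responses):
    print(f"noise SD = {sd:6.2f}  correlation with signal = {r:.3f}")
```

With too little noise the threshold is never crossed, and with too much noise the crossings are nearly random, so under these assumptions the detector should follow the signal best at the intermediate noise level.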
¹ Corresponding author; e-mail: j~ilut~ms~.~u.
0375-9601/98/$ - see front matter © 1998 Elsevier Science B.V. All rights reserved. PII S0375-9601(98)00570-2
Table 1
Distances of European cities (km) from the main cities of Latium

City          Rome   Latina   Frosinone   Viterbo   Rieti
Amsterdam      430      447         449       415     409
Athens         347      321         331       346     364
Barcelona      283      305         293       292     271
Beograd        227      222         236       220     238
Berlin         393      400         409       374     373
Bern           227      249         247       220     205
Bonn           353      370         372       339     330
Brussels       388      406         406       371     365
Bucharest      364      355         368       359     378
Budapest       268      261         274       246     259
Calais         418      448         446       418     405
Copenhagen     510      522         527       492     491
Dublin         622      645         641       615     600
Edinburgh      637      655         655       625     615
Frankfurt      318      333         336       302     295
Hamburg        435      448         453       417     414
Helsinki       727      729         739       706     713
Istanbul       452      430         443       443     464
Lisbon         615      637         622       624     604
London         474      494         493       464     456
Luxembourg     325      346         346       315     307
Madrid         449      470         458       460     440
Marseille      200      223         213       202     183
Moscow         782      773         785       759     774
Munich         230      245         250       216     213
Oslo           664      675         682       646     645
Paris          365      386         383       357     343
Prague         305      313         320       286     290
Sofia          294      273         286       280     301
Stockholm      653      658         668       632     636
Warsaw         435      433         444       413     421
Vienna         255      254         265       233     240
Zurich         227      246         246       214     205
2. Methods
The test material for our analysis is shown in Table 1, which reports the distances of 33 European cities from the 5 main towns of the Latium region (a province in central Italy containing Rome). The weak signal to be detected was the spatial pattern of the European cities, based on their distances from the Latium towns. In this case, the information linked to the average characteristics of the data (strong signal) was highly degenerate, since it represented only the distance of European cities from Latium, and did not permit the reconstruction of the relative spatial orientation of the European cities (for example, Warsaw and Madrid have almost the same distance from Latium). The semantically important information, i.e., spatial orientation, was linked to the minor components of the variability of distance data (weak signal).
This specific problem was selected because of its difficulty: (a) the differences between the distances of the 5 Latium towns from each of the European cities are extremely small with respect to the correlated portion of the distances (weak signal versus strong signal), i.e., the area ratio of Latium to Europe is 0.00163 [12]; (b) since Latium is in an eccentric position with respect to Europe, and since there is no regular pattern in the distribution of European cities, the weak signal was very irregular and asymmetric; (c) the sample of data was rather small (33 statistical units); and (d) the distances were measured manually on a 1:3 000 000 scale European map in order to generate a certain amount of internal noise (1 mm = 3 km). On the other hand, the success or failure of the identification of the weak signal is easily recognizable by simple comparison of the results with the map of Europe.
The problem was modeled with PCA, which can be considered as a filter for correlated information [9,13-15]. We compute the principal components with an $M \times M$ covariance matrix $C_X$, which is diagonalized, and the eigenvalues are ranked in decreasing order,
$$\Lambda_X = E_X^{T} C_X E_X , \qquad (1)$$
where $\Lambda_X = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_M)$ is the diagonal matrix with $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_M > 0$ and $E_X$ is the $M \times M$ matrix having the corresponding eigenvectors $E_k$, $k = 1, \ldots, M$, as its columns. The eigenvalue $\lambda_k$ gives the variance of the time series in the direction given by the eigenvector $E_k$, while the square roots of the eigenvalues are called the singular values. Projection of the data onto the eigenvectors yields the corresponding principal components [16,17].
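The diagonalization of Eq. (1) can be sketched in Python on the distances of Table 1. This is only an illustrative sketch under stated assumptions: the numbers are transcribed from the table as printed (the entry printed as 7.59 is read here as 759), the variables are standardized before computing the covariance matrix (an assumption, so that the matrix is the correlation matrix), and names such as `table1` and `scores` are hypothetical.

```python
import numpy as np

# Distances transcribed from Table 1: one row per Latium town (in the table's
# column order), one entry per European city (Amsterdam ... Zurich).
table1 = np.array([
    [430, 347, 283, 227, 393, 227, 353, 388, 364, 268, 418,
     510, 622, 637, 318, 435, 727, 452, 615, 474, 325, 449,
     200, 782, 230, 664, 365, 305, 294, 653, 435, 255, 227],
    [447, 321, 305, 222, 400, 249, 370, 406, 355, 261, 448,
     522, 645, 655, 333, 448, 729, 430, 637, 494, 346, 470,
     223, 773, 245, 675, 386, 313, 273, 658, 433, 254, 246],
    [449, 331, 293, 236, 409, 247, 372, 406, 368, 274, 446,
     527, 641, 655, 336, 453, 739, 443, 622, 493, 346, 458,
     213, 785, 250, 682, 383, 320, 286, 668, 444, 265, 246],
    [415, 346, 292, 220, 374, 220, 339, 371, 359, 246, 418,
     492, 615, 625, 302, 417, 706, 443, 624, 464, 315, 460,
     202, 759, 216, 646, 357, 286, 280, 632, 413, 233, 214],  # 759: printed as 7.59 (assumed typo)
    [409, 364, 271, 238, 373, 205, 330, 365, 378, 259, 405,
     491, 600, 615, 295, 414, 713, 464, 604, 456, 307, 440,
     183, 774, 213, 645, 343, 290, 301, 636, 421, 240, 205],
], dtype=float)
X = table1.T                          # 33 cities (rows) x M = 5 distance variables

# Assumption: standardize each variable, so that C_X is the correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
C = np.cov(Z, rowvar=False)           # the M x M matrix C_X

eigval, eigvec = np.linalg.eigh(C)    # symmetric eigendecomposition (ascending order)
order = np.argsort(eigval)[::-1]      # rank the eigenvalues in decreasing order
eigval, eigvec = eigval[order], eigvec[:, order]

Lambda = eigvec.T @ C @ eigvec        # Eq. (1): numerically equal to diag(eigval)
singular_values = np.sqrt(eigval)     # square roots of the eigenvalues
explained = eigval / eigval.sum()     # proportion of variance per component
scores = Z @ eigvec                   # principal components (one score per city and PC)

print("proportion of explained variance:", np.round(explained, 6))
```

Using `np.linalg.eigh` rather than a general eigen-solver is a design choice that exploits the symmetry of the covariance matrix and guarantees real eigenvalues.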
3. Results
Since the PCs are extracted in order of importance, the correlated portion of information is amplified and incorporated into the first PC, whereas the minor PCs are more and more representative of noise [3,16,17]. Table 2 reports the factor loadings and the proportion of explained variance relative to the PCs generated by PCA of the 5 variables in Table 1. As expected, the first PC (PC1) explained almost all of the variance (more than 99%), the 5 distance variables being highly intercorrelated due to the small dimension of Latium with respect to Europe. The subsequent PCs represented a minimum amount of the initial information, and, at first glance, could be mistakenly considered as noise. However, the plot of PC2 and PC3 scores (Fig. 1) clearly shows that these two PCs reconstructed the angular distance map of the European cities, as they are observed from the Latium point of view. In the figure, the European cities form a semicircle around Latium, going from the Balkans to the Iberian peninsula. Cities having the same bearing with respect to Latium are close to each other in the figure, even if they are distant from each other in real space (the size component being PC1).

Table 2
Factor loadings and proportions of explained variance

Variables             PC1       PC2        PC3        PC4        PC5
Rome                0.9997    0.0137    -0.0184    -0.0120     0.0001
Frosinone           0.9973   -0.0715     0.0132     0.0011     0.0029
Latina              0.9987   -0.0420    -0.0272     0.0058    -0.0024
Rieti               0.9909    0.0162     0.0393    -0.0009    -0.0023
Viterbo             0.9964    0.0837    -0.0070     0.0060     0.0017
Explained variance  0.9965    0.0029     0.000569   0.000043   0.000005

The analytical demonstration of the above finding was that the angle, $\theta$, which is formed by the European cities with respect to the north-south axis passing through Rome, as computed with the PC GLOBE 5.0 software [18], was accurately predicted based on the PC2 and PC3 values (PC1 was not significantly correlated with the angle),
$$\theta = -4.8306 + 41.{\sim}7\,(\mathrm{PC2}) - 27.086\,(\mathrm{PC3}), \qquad r = 0.97, \quad p < 0.0001. \qquad (2)$$
Thus, PCA broke down the information into "size" (PC1) and "shape" (PC2, PC3) components [6], and separated effects relative to different measurement scales. PC4 and PC5 were not correlated with any meaningful characteristic of the European map, and were considered to be noise derived from the measurement of distances on a European map using a ruler.
[Figure 1: scatter plot of the European cities in the plane of PC2 (horizontal axis) and PC3 (vertical axis).]
Fig. 1. Projection of the European towns on the second and third PCs obtained from Table 1 data. It is emphasised that this projection represents the angular orientation centered on Latium, thus distorting the usual visual presentation of the European map. Furthermore, the distance component (PC1) is required for complete characterization (see text).
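The separation into a "size" component and "shape" components, and a regression of the kind shown in Eq. (2), can be illustrated on purely synthetic data. The sketch below is not the authors' computation: the geometry (5 observation points within about 50 km of the origin, 200 "cities" spread over a semicircle at 600 to 2400 km), the noise-free distances, and all variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical geometry (not the paper's data): 5 nearby observation points
# and 200 distant "cities" spread over a semicircle of bearings.
towns = rng.uniform(-50.0, 50.0, size=(5, 2))                  # km, clustered near the origin
n_cities = 200
radius = rng.uniform(600.0, 2400.0, size=n_cities)             # km from the origin
bearing = rng.uniform(-np.pi / 2, np.pi / 2, size=n_cities)    # angle from the north axis (rad)
cities = np.column_stack([radius * np.sin(bearing), radius * np.cos(bearing)])

# Distance table: one row per city, one column per observation point.
X = np.linalg.norm(cities[:, None, :] - towns[None, :, :], axis=2)

# PCA through the covariance matrix, eigenvalues in decreasing order.
Xc = X - X.mean(axis=0)
eigval, eigvec = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = np.argsort(eigval)[::-1]
eigval, eigvec = eigval[order], eigvec[:, order]
explained = eigval / eigval.sum()
scores = Xc @ eigvec

# "Size": PC1 should track the average distance of each city.
size_corr = abs(np.corrcoef(scores[:, 0], X.mean(axis=1))[0, 1])

# "Shape": least-squares regression of the bearing on the PC2 and PC3 scores.
A = np.column_stack([np.ones(n_cities), scores[:, 1], scores[:, 2]])
coef, *_ = np.linalg.lstsq(A, bearing, rcond=None)
shape_corr = np.corrcoef(A @ coef, bearing)[0, 1]

print("explained variance:", np.round(explained, 6))
print("|r| between PC1 and mean distance:", round(size_corr, 4))
print("r between fitted and true bearing:", round(shape_corr, 4))
```

In this synthetic setting the first component should absorb nearly all of the variance, while the two following components carry the directional information, which mirrors the qualitative behaviour reported for the Latium data.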
[Figure 2: three panels plotting the proportion of explained variance against noise level: (a) PC1; (b) weak signal PCs; (c) noise components.]
Fig. 2. Noise was added to the Table 1 variables at different SNR values, and the resulting data were subjected to PCA. The figure reports the variation of the proportion of explained variance relative to the first 3 PCs in the presence of different amounts of noise. The noise level is the standard deviation of the zero-mean Gaussian deviates added to the original variables. (a) Behavior of the strong signal PC; (b) relative to the weak signal PCs; (c) relative to the noise components. The abscissa of the graphs is expressed in standard deviation (SD) units (mm) (1 mm = 3 km).

[Figure 3: correlation between the noise-corrupted PCs and their original counterparts plotted against noise level.]
Fig. 3. The figure displays the degree of recognition (Pearson's correlation coefficient) between the noise-corrupted PCs and their original counterparts, for different amounts of noise. The abscissa is expressed in SD units (mm) (1 mm = 3 km).

In order to test the possibility of inducing an SR-like phenomenon able to magnify the relative importance of the weak signal, increasing amounts of Gaussian noise were added to the Table 1 variables. The weak signal amplitude was estimated by multiplying the mean variance of the distance variables times the proportion of variance explained by PC2 plus PC3. The signal-to-noise ratio (SNR) ranged from 939% to 52%, corresponding to added noise with a standard deviation of 1 and 18 mm, respectively. New PCAs of the noise-corrupted distances were performed; for
different noise values, Fig. 2 displays the variance explained by the new PCs. The proportion of variance explained by the strong signal (PC1) decreased when the added noise increased (decreasing SNR values) (Fig. 2a), whereas the variance explained by the weak signal components (PC2 and PC3) increased (Fig. 2b). From a purely computational point of view this result is due to the fact that white noise corresponds to a flat eigenvalue spectrum (all the components explain the same amount of variability) [17].² This implies that adding white noise to our data set tends to equalize the normalized eigenvalue spectrum (proportion of explained variability), diminishing the proportion explained by strong signal components, and enhancing the proportion of variability relative to weak signal and noise components. The variance explained by the noise components (PC4 and PC5) increased as well (Fig. 2c).
Fig. 3 demonstrates the results of a "recognition test" between the original PCs and the PCs relative to the noise-corrupted data. The correlation coefficients between homologous PCs (1 to 3) indicated that both the strong and the weak signals remained recognizable (statistically significant correlation coefficients for the entire simulation range). On the contrary, the pure noise components had no relationship with their original counterparts. Thus, the overall procedure shown in Figs. 2 and 3 (corruption of the original data with noise, followed by the computation of the correlations between homologous PCs) permitted a clear discrimination between strong signal, weak signal and noise.

² We note that this approach is similar to a class of techniques termed "prewhitening" and advocated by Tukey in the 1950s. See, e.g., Ref. [24].

4. Discussion and conclusions
Our results support the view that noise can have a constructive role in the recognition of weak signals in non-dynamical frames; however, the method may not be similar to the mechanism as found in SR of dynamical systems. The basic paradigm for SR is given by a double well potential,
$$V(x) = \frac{b}{4}x^{4} - \frac{a}{2}x^{2}, \qquad (3)$$
the minima being located at $\pm x_m$, where $x_m = (a/b)^{1/2}$. A potential barrier, whose height is given by $\Delta V = a^{2}/4b$, separates the minima, while its top is located at $x_b = 0$. With a periodic driving force, the double well potential, $V(x,t) = V(x) - A_0 x \cos(\Omega t)$, tilts back and forth, raising and lowering the potential barriers to the right and left antisymmetrically. If the period of the driving approximately equals twice the noise induced escape time, a synchronized hopping to the globally stable state will occur [8].
In our case, however, we do not have a driving force, and the noise does not derive from a time dependent process. Instead, the noise results from errors of measurement scale. In our example, the noise may be considered a variant of the "quantization" problem encountered in analog-to-digital (A/D) conversions and floating point calculations: it is well known that such processes, which map a continuous range to, e.g., an integer range, generate an error that is often correlated with the signal (calculation) being transformed. Generally speaking, such "rounding" error is not significant if the signal is large amplitude and wideband. If the signal, however, is small amplitude and/or narrow bandwidth (such as in the case of our small PCs), significant correlations can occur, which distort the input signal. To overcome this problem, audio engineers have employed the technique of "dithering", whereby noise is added to sampled signals to make the quantization error independent of the input signal [19,20]. Thus if $d(n)$ is a white noise source added to $x(n)$, then
$$e(n) = y(n) - x(n) \qquad (4)$$
is also white, regardless of the spectrum of $x$, where $e(n)$ is the error, $y(n)$ is the output, and $x(n)$ is the input. That this occurs in our example can be seen in the gradual (but still statistically significant) decorrelation of PCs 2 and 3; lack of change for PC1 (large amplitude); whereas PCs 4 and 5 are unchanged, since they already are white (Fig. 3).
A range of applications is possible for the present findings. An immediate application is to add increasing amounts of noise to a data set, in order to check the congruency of the hypotheses made on the weak informative components. This kind of implementation can be imagined in many instances, ranging from the study of the relationships between chemical structure and biological activity [21] to the analysis of biological signals [22]. This is particularly important in light of the increasing role of PCA and regression on PCs as tools to investigate many different phenomena [23]. The trends of the correlations between noise-corrupted components and original components, and consequently their identification as proper signals, may be investigated easily by applying classical inferential methods, such as a simple one-way analysis of variance to check for the statistical significance of the decay of the correlation between original and degraded homologous components at increasing noise levels, the quantity of added noise being the source of variability.

Acknowledgement
Dr. Barbara Camerini and Dr. Marta Menghini are gratefully acknowledged for the continued interest in our work; Ms. Eve Silvester is acknowledged for the patient editing of the text. JPZ acknowledges useful discussions with Zeev Schuss.

References
[1] C.E. Shannon, W. Weaver, The Mathematical Theory of Communication (University of Illinois Press, Urbana, IL, 1949).
[2] K. Wiesenfeld, F. Moss, Nature 373 (1995) 33.
[3] E. Oja, Neural Networks 5 (1992) 927.
[4] P. Thompson, in: Proc. 13th Asilomar Conf. Circuits, Systems, and Computers (Pacific Grove, CA, 1979) pp. 529-533.
[5] H. Frauenfelder, S.G. Sligar, Science 254 (1991) 1598.
[6] J.N. Darroch, J.E. Mosimann, Biometrika 72 (1985) 241.
[7] R. Benzi, A. Sutera, A. Vulpiani, J. Phys. A 14 (1981) L453.
[8] L. Gammaitoni, P. Hänggi, P. Jung, Rev. Mod. Phys. 70 (1998) 223.
[9] R.W. Preisendorfer, Develop. Atmosph. Sci. 17 (Elsevier, Amsterdam, 1988).
[10] F. Moss, K. Wiesenfeld, Scientific American 273(2) (1995) 50.
[11] B.J. West, Physica D 195 (1995) 12.
[12] Anonymous, Il grande atlante dell'Europa e del Mondo (DeAgostini Editore, Novara, 1975).
[13] L. Lebart, A. Morineau, K.M. Warwick, Multivariate Descriptive Statistical Analysis (Wiley, New York, 1984).
[14] R. Benigni, A. Giuliani, Am. J. Physiol. 266 (1994) R1697.
[15] J.A. Anderson, J.W. Silverstein, S.A. Ritz, S.J. Randall, Psychol. Rev. 5 (1977) 413.
[16] R. Vautard, P. Yiou, M. Ghil, Physica D 58 (1992) 95.
[17] D.S. Broomhead, G.P. King, Physica D 20 (1986) 217.
[18] PC GLOBE 5.0, PC GLOBE Inc.
[19] I. Kollár, Periodica Polytechnica Ser. Electrical Engineering 28 (1984) 173.
[20] B. Widrow, I. Kollár, M.-C. Liu, IEEE Trans. Instrum. & Meas. 45 (1996) 353.
[21] R. Benigni, A. Giuliani, Mutat. Res. 306 (1994) 181.
[22] T. Elbert, W.J. Ray, Z.J. Kowalik, J.E. Skinner, K.E. Graf, N. Birbaumer, Physiol. Rev. 74 (1994) 1.
[23] E.S. Soofi, J. Am. Stat. Assoc. 89 (1994) 1243.
[24] D.R. Brillinger, ed., The Collected Works of John W. Tukey, Vol. I, Time Series (Wadsworth Advanced Books, Belmont, CA, 1984); M.B. Priestley, Spectral Analysis and Time Series (Academic Press, London, 1981).
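The noise-addition check described in Section 3 (adding zero-mean Gaussian noise to the variables, repeating the PCA, and correlating homologous PCs) can be sketched as follows. This is a rough sketch on synthetic distances, not a reproduction of Figs. 2 and 3: the geometry, the internal measurement noise of 3 km, the three noise levels, and the helper name `pca` are all assumptions made for illustration.

```python
import numpy as np

def pca(X):
    """Explained-variance proportions and PC scores from the covariance matrix of X."""
    Xc = X - X.mean(axis=0)
    eigval, eigvec = np.linalg.eigh(np.cov(Xc, rowvar=False))
    order = np.argsort(eigval)[::-1]                  # decreasing eigenvalues
    return eigval[order] / eigval.sum(), Xc @ eigvec[:, order]

rng = np.random.default_rng(1)

# Synthetic stand-in for a distance table (hypothetical geometry, distances in km).
towns = rng.uniform(-50.0, 50.0, size=(5, 2))
radius = rng.uniform(600.0, 2400.0, size=200)
bearing = rng.uniform(-np.pi / 2, np.pi / 2, size=200)
cities = np.column_stack([radius * np.sin(bearing), radius * np.cos(bearing)])
X = np.linalg.norm(cities[:, None, :] - towns[None, :, :], axis=2)
X += rng.normal(0.0, 3.0, size=X.shape)               # assumed internal measurement noise

explained0, scores0 = pca(X)                          # reference PCA of the original data

results = {}
for noise_sd in (3.0, 18.0, 54.0):                    # hypothetical levels (1, 6, 18 mm at 1 mm = 3 km)
    Xn = X + rng.normal(0.0, noise_sd, size=X.shape)  # corrupt the data with white noise
    explained, scores = pca(Xn)
    # "Recognition test": |Pearson r| between homologous PCs (the sign of a PC is arbitrary).
    recognition = [abs(np.corrcoef(scores0[:, k], scores[:, k])[0, 1]) for k in range(5)]
    results[noise_sd] = (explained, recognition)
    print(noise_sd, np.round(explained[:3], 5), np.round(recognition, 2))
```

Under these assumptions the share of variance carried by PC1 should fall as noise is added while the shares of the minor components rise, and the strong-signal component should stay highly correlated with its original counterpart.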