Signal Processing 4 (19821 139-153 North-Holland Publishing Company
139
TIME D E L A Y ESTIMATION IN NON-LINEAR SYSTEMS USING A V E R A G E AMOUNT OF MUTUAL INFORMATION ANALYSIS N.J.I. M A R S Section Medical Data Processing, Uni~'ersity of Leiden, Wassenaarseweg 62, 2333 AL Leiden, ITle Netherlands
G.W. van A R R A G O N Vimtron Medical B. V., P.O. Box 76, 6950 A B Dieren, The Netherlands Received 29 April 1980 Revised 24 june 1981 and 11 December 1981
Abstract. This paper discusses the problem of time delay estimation in non-linear systems. A new analysis method is proposed using the concept of Average Amount of Mutuai Information (AAMI). From this information measure an AAMI-function for non-linear systems is derived in analogy to the crosscorrelation function for linear systems. An algorithm is proposed which comprises an iterative probability density estimation procedure using the optimal kernel functions derived by Epanechnikov. The results of Monte Carlo simulations for three simple systems are supplied: a linear system, a rectifying system and a squaring system. These results indicate that the AAMI-function is useful for the estimation of time delay in non-linear systems. Zusammenfassung. Die vorliegende Untersuchung besch~iftigt sich mit dem Problem der Sch~itzung yon Laufzeiten in nicht-linearen Systemen. Es wird ein neues Analyse-Verfahren vorgeschlagen, bei dem das Prinzip des Average Amount of Mutual Information (AAMI oder mittlerer Transinformationsgehalt) verwendet wird. Von diesem Informationsparameter wird-analog zur Kreuzkorrelations-Funktion fiir lineare Systeme-eine AAMI-Funktion fiir nicht-lineare Systeme abgeleitet. Es wird ein Algorhythmus angegeben, der ein iteratives Sch~itzungsverfahren enthalt, bei dem optimale Kern-Funktionen verwendet werden, wie von Epanechnikov beschrieben. Die Ergebnisse von Monte Carlo Simulationen werden fiir die drei folgenden einfachen Systeme dargesteltt: ein lineares System, ein gleichrichtendes System und ein Quadrierer. Die Ergebnisse lassen erkennen, dass die AAMI-Funktion fiir die Sch~itzung yon Laufzeiten in nicht-linearen Systemen yon Nutzen ist. R~sum~. L'~tude pr~sent6e concerne le probl~me de l'estimation du retard dans des syst~mes non-lin~aires. Une nouvelle m6thode d'analyse est propos~e par l'usage du principe "Average Amount of Mutual Information (AAMI) ou quantit~ moyenne de l'information mutuelle". Une fonction AAMI pour des syst~mes non-lin~aires est d~riv6e de ce param~tre. Elle est analogue ~ la fonction d'intercorr~lation pour des syst~mes lin~aires. L'algorithme propos~ contient une m6thode it~rative d'estimation de la densit~ de probabilit6 par utilisation de fonctions noyau optimales propos~es par Epanechnikov. Des simulations par la m&hode de Monte Carlo ont ~t6 r6alis~,es pour trois syst~mes simples: un syst~me lin6aire, un syst~me rectifiant et un syst~me quadratique. Les resultats indiquent que la fonction AAMI convient pour l'estimation du retard dans des syst;~mes non-lin6aires. Keywords. Mutual information, time delay estimation, non-linear systems, probability density estimation, optimal kernel functions, Monte Carlo simulations.
1. Introduction
A d d i t i o n a l y , t h e i n p u t signals a r e o f t e n s t o c h a s t i c a n d n o t u n d e r t h e c o n t r o l of t h e o b s e r v e r . T h e s e
T h e s t u d y of c o m p l i c a t e d s y s t e m s is f r e q u e n t l y hampered
by the fact that the relation between
t h e i r i n p u t a n d o u t p u t signals is h i g h l y n o n - l i n e a r . 0165-1684/82/0000-0000/$02.7501982
c i r c u m s t a n c e s t e n d to limit t h e u s e f u l n e s s of c o n v e n t i o n a l s p e c t r a l analysis t e c h n i q u e s for t h e s t u d y of b i o l o g i c a l s y s t e m s .
North-Holland
140
N.J.L Mars and G. IV. van Arragon
In this paper we describe a new method for the analysis of non-linear n o - m e m o r y systems, using the concept of Average A m o u n t of Mutual Information (AAMI). A measure for Mutual Informal tion was introduced in Shannon's landmark paper [9] in 1949 and extended and generalized by Gel'fand and Yaglom [3] 1. A A M I can be used to characterize the relation between input and output signals of non-linear systems. Although the main goal in the development of our method has been the construction of a tool for the estimation of time delay in non-linear systems, we feel that our approach is potentially of more general use. Our practical experience, however, has been limited to time delay estimation in which the A A M I values are used in a relative sense. In Section 2 of this paper we discuss the theoretical background for A A M I . The computation of A A M I requires knowledge of the probability density functions of the signals under study. Therefore, we treat in Section 3 the estimation of probability density functions from a finite number of samples. In Section 4 we demonstrate the computation of A A M I for three simple systems. Finally, we show in Section 5 how A A M I can be used for the estimation of time delay in non-linear systems. Again, the computation is illustrated for three simple systems.
Time delay in non-linear systems
which only assume a finite number of discrete values and for stochastic variables in which the realizations assume continuous values. We will show the definition of Average A m o u n t of Mutual Information (AAMI) separately for these two cases. For the discrete case we consider two stochastic variables X and Y. We assume that the realizations assume values from the finite sets {xi}, i = 1 . . . . . M and {yj}, j = 1 . . . . . N, respectively. The elements xi and Yi have a probability of occurence given by px(xi) and P~(Yi) in which:
px(xi) = P r o b ( X = xi),
(1)
PY(Yi) = P r o b ( Y = yj),
(2)
px(x~)>-O,
i = 1, 2 . . . . . M,
(3)
pg(yj) ~>0,
] = 1 , 2 . . . . . N,
(4)
Y px(x,)= 1,
(5)
i=l N
Z PY(Yi)= 1.
(6)
i=1
The joint probability of occurence is given by p x y ( X i , Yi) with:
pXy(Xi, yi)
= Prob(X =
pxy(Xi, yi)~O,
xi, Y
(7)
= Yi),
i = 1, 2 , . . . , M, ] = 1, 2 . . . . . N,
(8)
M" N
2. Theory Using the mathematical theory of communication of Shannon [9], Gel'fand and Yaglom extended and generalized Shannon's measure of mutual information [3]. Their measure provides the amount of information about a random vector contained in another such vector. This measure of mutual information can be defined for stochastic variables with realizations The quantity we denote by AAM~ is often called Mutual Information or Transinformation in the literature on information theory; we added " a m o u n t " to distinguish the information itself from its measure and we added "average" for emphasis. Signal Processing
~, pxr(Xi, y~)= 1.
(9)
i=1 ] = l
For this discrete case the A A M I is defined as: A A M I ( X , Y) = ~
.v
,. z pxy(x,,
i=1 jffil
Yi) log
p
pxy(Xi, Yi)
YTy,)
(When pxy(X~, yi) = 0, the corresponding term in the summation is taken to be zero.) The unit is bits, nats or hartleys, if the base of the logarithm is respectively 2, e ( 2 . 7 1 8 2 8 . . . ) or 10. For the continuous case we consider again two stochastic variables X and Y and assume that
141
N.J.L Mars and G. W. van Arragon / Time delay in non-linear systems
these variables have marginal probability densities f x ( X ) and fy(y) and a joint probability density fxy(X, y) in which: fx(x)>~O,
-co < x < +co,
(11)
f~.(y)>~O,
-co < y < +ec,
(12)
fxv(x, y)~>0, - c c < x < + c o , - e c < y <+co, fx(x) dx
I
(13)
= 1,
(14)
-oc
[61.
oc
I_ /,,(y) dy = 1, cc
given reviews of this literature. Following Wegman, we distinguish four methods for density estimation: orthogonal series, kernels (or windows), histograms and maximum likelihood estimators. Neither Wegman nor Fryer prefer one estimator above all others, although both reject the maximum likelihood estimator and the histogram-type estimator. Of the four methods, the kernel estimator appears to be the most suitable one for our application. This estimator has been studied by Parzen The kernel estimator in general is given by:
(15)
/.,(yl, y2..... y,.)
:~
1 N
--::12
.~
-- ~ . ~ , .I]_i ~
1
/ yet - -
Xmn\
g / ~,--~ )
(18)
For this continuous case AAMI is defined as: AAMI(X, Y) =
fill ~
~ fxy(X,
.
y)l°g'~x"7[V"~ : " /~-~-:Z "lYtY) ' : dx dy. (17)
The unit is again bits, nats or hartleys depending on the base of the logarithm. In both the discrete and the continuous case AAMI is equal to zero if the variables are independent and greater than zero if the variables are interdependent. Estimation of AAMI using formula (17) requires knowledge of the joint and marginal densities. Because these densities are normally not known a priori, they have to be estimated. We will first discuss methods for estimating these densities.
3. Estimation of probability density functions 3.1. Choice of method
The problem of estimating a probability density function from a number of samples has been discussed extensively in the literature [4, 6, 7, 8, 10, 12, 13, 14]. Wegman [15, 16] and Fryer [2] have
Here M is the number of dimensions of the density, N the number of M- dimensional samples, h . , ( N ) the kernel width for N samples, y.~ the variable in the mth dimension, x ~ the nth sample value in the mth dimension and K ( . ) the kernel function. Parzen [6] has derived conditions for K ( . ) which assure asymptotic unbiasedness, asymptotic consistency and uniform consistency. These properties can be achieved with a wide class of kernel functions. Epanechnikov [1] has used this freedom in the choice of K ( . ) to derive the non-negative kernel form and kernel width which give the minimum relative global approximation error over all densities, in case the true probability density function has a Taylor expansion in all its arguments everywhere under the restriction that h,~(N)= h ( N ) , i.e., that the kernel width is independent of i7/.
The optimal kernel function Epanechnikov is given by:
derived
by
if - 4 5 < y < +,/5, =0
(19) elsewhere. V o l . 4, Nos. 2 a n d 3, A p r i l 1 9 8 2
142
Mars and G. W. van Arragon / Time delay in non-linear systems
N'.J.L
Note that the optimal kernel function has a simple quadratic form and is independent of the true probability density and of the sample size. The function has finite support which makes the computation easier. Within this class of optimal kernel functions, the optimal kernel width which minimizes the relative global approximation error of the density to be estimated is given by [1]: [MLM] '/'M÷~) hopt(N) = L ~ J
(20)
in which 0o
L = f_ K~,., (y) dy oo
-,/g
3
dY=54g
(21)
and CO
OO
O0
oo
30
¢c
,0-f(xl,
.....
xM)
dxl dx2 " " • dXM. (22)
The kernel width which is optimal for the estimation of the density is thus a function of the number of samples and of the density itself. We have used the Epanechnikov kernel estimator in an iterative way, in which an estimated kernel width is used to provide an estimate of the probability density function (using (18)), which is used to provide an improved estimate of the optimal kernel width. We have been unable to prove the convergence of this iterative procedure. In our experiments with estimation of densities based on samples from some known probability density functions we have not experienced any problem in this respect. 3.2. The algorithm for the density estimation
In this section we will describe in more detail our iterative probability density estimation procedure. In Section 3.1 we stated that the optimal kernel width is given by [20]. Replacing the actual Signal Processing
density f in (22) by the estimated density/N(Yb y: . . . . . yN, /~(N)) using some estimate /~(N) of hopt(N) gives:
r M., ],,,-*,, /~(N) = LND(/;(N))J
(23)
in whichthe dependence of D on the kernel width /~(N) has been made explicit. Solution of (23) gives an estimateof the optimal kernel width. This implicit equation for/~ can be solved iterativelyusing the bisection method. Our algorithmfor the estimationof probability density functionsin detail is: Step I. Normalizethe data to sample meanzero and sample variance one. The densityvalues will be computed at discrete equidistant points. The range for these points is -3o" to +3o'. Step 2. Choose a starting estimate /~(N) of hopt(N). Step 3. Solve (23) to find an approximation /~(N) to the optimal kernel hopt(N) using the bisection method, until the required accuracy is achieved (in our experiments we used a relative change of 5% in/~(N) as the stop criterion). 3.1. Compute the contribution of the kernels with this /~(N) for the samples at the discrete points where the density has to be estimated (using (18)). We used 31M equidistant points, which seemed to be sufficient after visual inspection. 3.2. Compute the second derivative of the estimated densities for the M dimensions and compute D in (22). In our experiments we computed the second derivative by fitting a polynomial in M variables of degree 2 in the least squares sense through 4 M points of the estimated density and differentiating this polynomial twice analytically. Integration was performed numerically using the degree 3, two-dimensional Simpson product-rule [11, p. 231]. Step 4. Use the density found in the last execution of Step 3.1 as the estimated density. We have embodied this algorithm in an ANSI Standard Fortran computer program. Computations have been performed using large scale computers.
N.J.I. Mars and G. IV. van Arragon / Time delay in non-linear systems
• Computation o [ A A M I
143
4.2. A linear system
Fhe algorithm of Section 3.2 is used to compute ; density values at a rectangular grid within a age of -3o- to +30-. To compute the AAMI lue according to (17), the integrand of (17) is ~egrated, again using Simpson's rule [11, p. 231]. ae use of the range -30- to +30. for the estima)n of the probability density function implies at the densities are assumed to be zero outside Lat range. We tried two different methods for the computtion of (17). In the first method the marginal ensities fx (x) and fv(y) were obtained by numerial integration of[xv(X, y). In the second approach he marginal densities were estimated indepenlently, using the one-dimensional equivalent of 18). Extensive simulations showed that the .econd method always gave superior results (i.e., :loser to the theoretically expected values). In the fimulation results to be discussed below always the second method has been used.
Fig. 1 shows the first system studied. The delay zx is set to zero. N~(i) and N2(i) are mutually independent stochastic processes (white Gaussian noise) with zero mean and standard deviation respectively one and 0.N2. Low-pass filtering is performed according to the formula: x ( / ) = p " x(i-1)+41---L-~ • n~(i)
(24)
in which {n~(i)} is a sequence of independent pseudo random numbers and {x(i)} is the output signal. Thus the variance of the process X(i) is one and the correlation between successive samples is p. In the linear system the signal is multiplied by a constant chosen so that {z(i)} has sample variance 0.~. The signal to noise ratio S I N of Y is defined as:
S / N = O'z/O'N.. z 2
(25)
2
0.N2 and o-~ are chosen such that
4. Computation of AAMI for three simple systems 4.1. Introduction To assess the feasibility of the computational procedure described and the practical usefulness of AAMI, we performed experiments with three known systems: a linear system, a rectifying system and a squaring system. The input signal for all three systems was recursively low-pass filtered Gaussian noise.
NI
_
2
2
"~
O'z+ 0.N~ = o"~-= 1. Here 0.2, 0.2 2 and 0.2 are the sample variances of respectively Z(i), N2(i) and Y(i). We will first derive the theoretical value of AAMI between X and Y as a function of the correlation coefficient between X and Y for this system. According to the definitions of N~ and N2, the probability density functions of X and Y are independent of the correlation coefficient between successive samples p and of the signal to noise
FILTER
Z
Y
Fig. 1. A simple linear system with Gaussian noise Nt and N2 as input and X and Y as output signals. Vol. 4, Nos. 2 and 3, April 1982
N.ZL Mars and G. W. van Arragon / Time delay in non-linear systems
144 ratio S / N and given by: fx(x) = ~
1
exp(-~x"),
1
fv(y)=-~exp(-sy
1 2
).
With the correlation coefficient p' between X and Y the joint (bi-variate) density function of X and Y is given by:
fxy(X, y)
combination of number of samples, correlation coefficient p and signal to noise ratio 10 simulations with different realizations of the noise processes were performed. Table 1 gives the results for 128 samples and Table 2 for 1024 samples. For each combination of correlation p and signal to noise ratio S I N the sample mean and the sample standard deviations (in parentheses) of A A M I are given for the 10 runs. We have identified two sources for the bias which is evident in these tables; one is the limited sample size and the second is the finite range of
1 [ _ 1 x 2 - 2 p ' x y + y2., 2 r r ' , / 1 - - ~ exp!, 2 1 _p,2 ] Substitution of (27), (28) and (29) in (17) and some formula reduction gives: A A M I ( X , Y) = - 3 log(1 _p,2)
(30)
Table 1 AAMI in nats as a function of signal to noise ratio S/N and correlation between successive samples rho for a linear system using 128 samples in the simulations
S/N•0.00 0.50
which is a function of p' only. A A M I ( X , Y) ranges from 0 to co if the correlation coefficient ranges
0.11
from 0 to 1. As
, P =°'z=
0.25
( S/N ~1/2 I+S/N]
(31)
2.33
we find: A A M I ( X , Y) = ~ log(1 + S I N ) .
(32)
We performed Monte Carlo simulations using this system in order to compare the empirical results with the theoretical results. The simulations were performed for 5 different values of the correlation p between the successive samples of X. Theoretically, the value of A A M I is independent of the correlation between samples p. The probability density function estimator we used, however, assumes independent samples. Therefore, we included simulations for several values of p to determine the effect of the violation of this assumption. For a fixed value of the correlation coefficient p, the simulations were repreated for 5 different signal to noise ratios S I N . To assess the influence of the number of samples, simulations were performed with 128 and with 1024 samples. For each Signal Processing
1.00
9.00
0.077 (0.019) 0.112 (0.026) 0.254 (0.050) 0.437 (0.066) 0.799 (0.053)
0.70
0.90
0.95
Theoretical
0.071 0.074 0.075 0.073 0.053 (0.016) (0.014) (0.025) (0.024) 0.105 0.108 0.103 0.098 0.112 (0.020) (0.019) (0.035) (0.028) 0.264 0.268 0.231 0.209 0.347 t0.047) (0.051) (0.077) (0.069) 0.439 0.449 0.387 0.354 0.602 (0.066) (0.070) (0.086) (0.097) 0.811 0.806 0.740 0.671 1.151 (0.066) (0.078) (0.108) (0.115)
Table 2 Results from analogous simulations as in Table 1 using 1024 samples 0.00
0.11 0.25 1.00 2.33 9.00
0.058 (0.008) 0.104 (0.010) 0.298 (0.013) 0.520 (0.024) 0.977 (0.037)
0.50
0.060 (0.009) 0.106 (0.011) 0.301 (0.021) 0.524 (0.026) 0.979 /0.032)
0.70
0.059 (0.012) 0.107 (0.015) 0.303 (0.021) 0.519 (0.023) 0.970 (0.033)
0.90
0.95
Theoretical
0.059 (0.012) 0.103 (0.015) 0.287 (0.024) 0.502 (0.030) 0.923 (0.031)
0.057 (0210) 0.099 (0.014) 0.282 (0.025) 0.478 (0,032) 0,889 (0.055)
0.053 0.112 0.347 0.602 1.151
145
N.J.L Mars and G. W. van Arragon / Time delay in non-linear systems
integration used in the numerical evaluation of the integral (17). In the range of parameter values we are interested in the two contributions to the bias are of opposite sign and cancel partially. As can also be seen from these tables, the standard deviation of the estimated AAMI value increases with increasing correlation between samples and decreases (relatively) with increasing signal to noise ratio.
Table 3 AAMI in nats as a function of signal to noise ratio and correlation between successive samples for a rectifying system using 128 samples S/N~
0.00
0.50
0.70
0.90
0.95
O. 11
0.064 (0.023) 0.088 (0.023) 0.212 (0.048) 0.356 (0.053) 0.625 (0.043)
0.057 (0.024) 0.080 I0.030) 0.209 t0.071) 0.349 t0.069) 0.631 t0.071)
0.065 (0.027) 0.094 (0.046) 0.228 (0.078) 0.374 (0.071) 0,645 (0.092)
0.068 (0.034) 0.101 (0.050) 0.224 (0.100) 0.361 (0.117) 0.632 (0.136)
0.075 (0.035) 0.106 (0.048) 0.229 (0.107) 0.348 (0.121) 0.636 (0.151)
0.25 1.00
4.3. A rectifying system The system studied here is the same as in the previous section, with the exception that the linear system has been replaced by a rectifying system. The delay A is again set to zero. With this modified system the same simulations as in the previous section were performed. The rectified signal is multiplied by a constant chosen such that its sample variance is cr2. Table 3 gives the results for 128 samples and Table 4 for 1024 samples. The relation between the sample standard deviation of the estimate and the number of samples, signal to noise ratio and correlation between samples is generally the same as for the linear system.
2.33 9.00
Table 4 Results from analogous simulations as in Table 3 using 1024 samples S/N~
0.00
0.50
0.70
0.90
0.95
0.11
0.049 (0.008) 0.088 (0.010) 0.256 (0.023) 0.428 (0.023) 0.772 (0.046)
0.051 ~0.009) 0.088 (0.013) 0.259 (0.025) 0.429 (0.032) 0.756 (0.068)
0.053 i0.010) 0.092 (0.016) 0.257 (0.030) 0.427 (0.035) 0.731 (0.085)
0.052 (0.013) 0.087 (0.017) 0.239 (0.034) 0.395 (0.049) 0.673 (0.099)
0.047 (0.010) 0.081 (0.015) 0.225 (0.035) 0.366 (0.056) 0.622 (0.105)
0.25 1.00
4.4. A squaring system
2.33
The system of Section 4.2 has been modified by replacing the linear system by a squaring system. With this system we performed again the described simulations, obtaining the results of Table 5 for 128 samples and of Table 6 for 1024 samples. Again, the squared result is multiplied by a constant to make its sample variance O"Z. 2 The relation between the standard deviation of the estimated AAMI values and the number of samples, signal to noise ratio and correlation between samples is again similar to the linear case.
9.00
4.5. Discussion The simulations described above confirm that the mutual information is a function of the signal to noise ratio SIN. The probability density func-
tion estimation procedure we have used degenerates at very high values of p. In the case of a linear system, the values of AAMI were found to be in satisfying agreement with the theoretical values, validating the computational approach. The AAMI is a measure of the degree of relationship between input and output signals of a system. In our application to the estimation of time delay in non-linear systems we only compare AAMI for different lag-values, analogous to the crosscorrelation function. This application will be described in the next section. Vol.4. Nos. 2 and 3, April 1982
N.J.L Mars and G. W. van Arragon / Time delay in non-linear systems
146
5. Application of AAMI to time delay estimation T o estimate time delays in non-linear systems we computed the value of A A M I between the input and output signal for a range of lag-values, in analogy to the crosscorrelation function. The lag-value where the A A M I reaches a maximum is an estimate of the time delay of the system studied.
1.0 I SIN
//~BHO
= 0.00
=
9.00
_
SIN -
9.00
-
2.33 _ - 1.00 0.25 _ .
_-I
I
I
I
l
-5.0 -3.0 -I.0 1.0 3.0 LAG 05RMPLE53
5.0
1.0 SIN
_
2.33
_
1.00
_
~
~
-
~
,~S~--~
I
l
5.0
~
_
0.11
_
I
l
-5.0 -3.0 -I.0 1.0 3.0 LAG CSRMPLES"I
5.0
51N = 9.00
F |
I
BHO
I
T
=:
.
-
51N = 9.00 _ 2.33 1.00 _ ~ 0,25
1.0
I
1.0
_
I
~ "-'-"'--
-5.0 -3.0 -I.0 I. ~" 3.0 LRG CSAMPL-53
0. I1
I
~
_
0.25 _ - I
= 0.50
_--
I
-~
-
9.00
~ 0 , .
_--
A
0.11 I
1.0 I
2.33 _ -|.00 0.25 _ - -
~
_
0.11
To assess the usefulness of this approach we performed simulations with the same three systems as in Section 4. Without loss of generality the time delay was set to zero. The A A M I was computed for a lag of - 5 samples to +5 samples. The experiments were performed using 128 samples and 1024 samples. Fig. 2 gives the mean A A M I for 10 runs with the linear system for 128 samples and Fig. 3 for 1024 samples. In this linear case
-
I
J
-5.0 -3.0 - 1 . 0 I. r 3.0 LAG C5RMPL.~57
5.0
I
I
~
0.95
L _
2,33 1.00
_
0.25
_
0. II
_
~
~
I
I
--
I
I
-5.0-3.0-I.0 1,0 3.0 LAG CSAMPLE'5D
I
[
5.0
Fig. 2. A A M I in nats from Monte Carlo simulations with a linear system using 128 samples per realization from both output signals X and Y in Fig. 1. The zero-level for each value of S / N is indicated by the horizontal lines to the left of the cur~,es. Signal Processing
N.J.L Mars and G. W. van Arragon I Time delay in non-linear s)'srems
/~QHO
,.°I
= 0.00
SIN -
51N
1.0 I
?~,
-
~~~____~
147
ffHO = C.50
B.OO _ _ 2.33
_
_
1.00
_
-
1.00
_
0.Z5
_
O, Z5
_
0.11
_ L
0. II
_
I
-
I
I
I
-5.0 -3.0 -I.0 1.0
I
I
3.0
5.0
C5AMPLE53
LAG
_ I
~ I
1
I
I
-5.0 -3.0 -I.0 1.0 3.0 LAG C5RNPLE5]
i
5.0
I..0
,.
.
SIN 9.00 _ 2,33
_- ~
1.00
_
O, 25 _ -
0.11
~
2.33 1.00
~
~
I
I
~
_ 1
I
I
-5.8 -3.0 -I,0 1.0 3.0 LAG 55RMPLE53
5.0
1.0 I
g.o0
~
~
_
~
0
I
,
9
_ _
0.Z5
_
0. II
_ r
I
-5.0-3.0-I.0 I.O 3.~ LAG C5RMPLE57
I
I
I
i
5.0
5
~
2.33 _ " ~.oo
_
0.25
_
0,11
~
~
_
I
I
-5.0 -3.0 -I.0 1.0 3.0 LAG CSRMPLE52
[
I
I
I
5.0
Fig. 3. R e s u l t s f r o m a n a l o g o u s s i m u l a t i o n s as in Fig. 2 using 1024 s a m p l e s .
the A A M I as a function of lag is a monotonous function of the crosscorrelation function. Figs. 4 and 5 give the mean A A M I for respectively 128 and 1024 samples for the rectifying system. Figs. 6 and 7 show analogous results for the squaring
system for respectively 128 and 1024 samples. The value of the peaks can be found in the Tables 1 to 6. These figures show that using the A A M I analysis the correct time delay can be estimated even for non-linear systems. Vol. 4, Nos. 2 and 3. April 1982
N.J.L ~tIars and G. W. van Arragon I Time delay in non-linear systems
148
F
F
1.0 I
9.00
_~
2 . 3 3
_
l.oo
_ -
BHO
9,00
_--
2 . 3 3
_
/ \
1.oo
_ -
~
0.25
_
0.
_
--
0.25 _ - 0.11
_
I
I
I
I
I
-5.0 -3.0 -I.0 1.0 3.0 LAG C5RMPLE53
5.0
,.o I SIN
l.O I
- 0.00
I
BHO
II
I
I
5.0
9.00 _
_
- -
2.33
_
--
2 . 3 3
1.00
0.25
_
0.Z5
_
0.11
_
0.
_
I
I
I
I
-5.0 -3.0 -I.0 1.0 3.0 LRO CSRMPLE53
5.0
F
I 0 I
51N .
I
~HO
-
I
I
I
II
- ~
"
--
_
I
/ \
-5.0 -3.0 -I.0 1.0 3.0 LAG CSAMPLES3
9 . 0 0
1.00
_
I
I
-5.0 -3.0 -I.0 1.0 3.0 LAG 05ANPLE53
I
!
I
I
5.0
o.95
L
9.00 _ 2.33 _ l . O0
_
0.25 _ 0.11
_
I
I
-5.0 -3.0 -I.0 l.O 3.0 LAG CSAMPLE53
I
I
I
I
5.0
Fig. 4. R e s u l t s f r o m s i m u l a t i o n s with a rectifying s y s t e m using 128 s a m p l e s .
Signal Processing
0.50
--
1.0
5/N
-
-
--
I
O. 70
-
BHO
N.J.I. Mars and O. W. van Arragon
/ Time delay in non-linear systems
/A SIN .
2.33
_
.
.
.
0.50
9.00
_
2.33
I.OO
.
0.25
. . . .
o.11
_
/
_
0.25
_
o. II I
--
I.oo
~
~
I
I
I
i
1.0
-5.0-3.0-1.0
3.0
I.
I
I
,o[
1.0
5.0
3.
LRO ZSRFIPLE5 ]
~_
- 0.70
-
I
-5.0-3.0-1.3
5.0
LAG 0 5 R M P L E S 3
9.00
-
51N -
9.00
51N
BHC
149
I
5/N
o
o
F| L
.
BNC
~
,
-
O.
~
_
_
1.00
_
- - - - - - - - - ~
0.25
_
.
O.
_
--
0.11
_
_
0,1l
2.33 1.00
I -5.0
I
I
I
-3.0-I,0
.
|.0
I.
3.0
1 5.0
LAG C S R M P L E 5 ]
1.o f
25
90
-I
i
-5.0-3.O-'.0 1.0 :.0 LR5 - 5 A M P L E E Z
i
I
I
5.0
RHO - 0.95
51N 9.00
_/
2.33
_
1.00
_
0.25
_
0.
I1
-
-
-
-
-
-
-
~
_
L I I I I -5.0 -3.0 -I.0 1.0 3.0 LRO E 5 R M P L E 5 3
I 5.0
Fig. 5. Results from simulations with a rectifying system using 1024 samples.
Vol.~..Nos,] and 3. April1982
150
N.J.L ,tfars and O. W. van Arragon
,o[ 5/N
BHO
/
Time delay in non-linear systems
F
= 0.00
1.0 I
-
9.00
_
2.33
_
/~
9.00
_ ~
2.33
_
-
-
_
- -
1.00
_ -
0.25
_
--
0.25
_
0.11
_
0.11
_
1.00
I
-5.0
1.0
I
I
-3.0
I
I
-I.0 1.0 3.0 LAG C5RMPLES2
I
5.0
F|
BHO
I
R 0.70
1.0 SIN
-
I
I
- I . 0 1.0 3.0 LAG C S R M P L E 5 ]
I
I
5.0
F
I
~H0
- 0.90
=
_ ~
9.00
2.33
_ ~
2.33
_
1,00
_
1.00
_
0,25
_
0.25
_
0. II
_
0.
_
--
I
1.0
I
-3.0
t
9.00
-5.0
--
-5.0
L SIN
flH0 - 0 . 5 0
I
I
-3.0
I
I
I
-I.0 1.0 3.0 LAG C S A N P L E S 3
5.0
F /
BH0
I I
_ ~
~
~
I
-5.0
I
-3,0
I
I
- I . 0 1.0 3.0 LflG CSflNPLE53
I
[
5.0
- 0.95
L SIN
-
9.00
--
2.33
_
1.00
_
0.25
_
0.11
_
~
.
~ ~
I
-5.0
-
I
-3.0
~
I
I
ml. 0 1.0 3.0 LAG C S R M P L E 5 3
I
I
5.0
Fig. 6. Results from simulations with a squaring system using 128 samples.
SignalProcessing
N.J.L Mars and G. W. uan Arragon / Time delay in non-linear s)stems
,o I 51N
F
BHO - 0.00
1.0 | L
0.50
-
-
9.00
_
--
9.00
_
2.33
_
--
2.33
_
1.00
_
0.Z5
_
0. II
_
~.oo •- - I " - -
I -5.0
l -3.0
I
I
I
-I .0 1.0 3.0 CSRMPLE53 LAG
5.0
,.oI SIN
I
_
-/~
0.11
_
I
I
-5.0 -3.0 -1.0 l.O 3.0 LAG ESR~PLE52
I
I
I
5.0
F
I
RHO
1.0 I
C,SO
L
51N _
--
2.33 1.00
-
_
0.25
BHO - 0.70
-
9.00
9.00
_ - ~
2.33
_
0.25
_--
0. II
_
_
0.25 0.
BHO
151
II
_
I
I
I
I
I
-5.0 -3.0 -1.0 1.0 3.0 LflG ESRMPL£S3
5.0
F
1.0 I L
I
I
I
-5.0 -3.0 -I.0 !.0 3.0 LAG C5-7,~PLE52
I
I
1
I
5.0
RHO - 0.95
SIN 9.00
_
2.33
_
1.oo
~
_
O.Z5
_
0.11
_ I
I
-5.0 -3.0 -I.0 1.0 3.0 LAG C5RMPLE53
I
I
I
I
5.0
Fig. 7. R e s u l t s from s i m u l a t i o n s w i t h a s q u a r i n g s y s t e m using 1024 s a m p l e s .
Vol. 4, Nos. 2 and 3, April 1982
152
N.J.L Mars and G. W. van Arragon / Time delay in non-linear systems
Table 5 An, Nil in nats as a function of signal to noise ratio and correlation between successive samples for a squaring system using 128 samples S/N~ 0.11
0'00
0.50
0.70
0.90
0.95
0.061
0.053 (0.021) 0.073 (0.030) 0.157 (0.055) 0.250 (0.063) 0.440 (0.078)
0.062 (0.026) 0.082 (0.035) 0.175 (0.075) 0.274 (0.090) 0.470 (0.127)
0.067 (0.037) 0.093 (0.052) 0.176 (0.097) 0.261 (0.134) 0.466 (0.187)
0.075 (0.037) O.104 (0.068) 0.191 (0.136) 0.279 (0.176) 0.445 (0.235)
(0.022) 0.25 1.00 2.33 9.00
0.082 (0.023) 0.163 (0.037) 0.248 (0.047) 0.442 (0.062)
system t i m e d e l a y can be e s t i m a t e d with the A A M I - a n a l y s i s . F u r t h e r w o r k r e m a i n s to be d o n e to d e t e r m i n e the quatity of this t i m e d e l a y estimator. O u r m o t i v a t i o n for the d e v e l o p m e n t of this time delay estimator comes from a medical problem: the l o c a l i z a t i o n of an e p i l e p t o g e n i c focus in the b r a i n of p a t i e n t s suffering f r o m e p i l e p t i c seizures. L o c a l i z a t i o n of this focus will b e d o n e b y c o m p u t ing t i m e d e l a y s b e t w e e n E E G - s i g n a l s r e c o r d e d d u r i n g e p i l e p t i c seizures with e l e c t r o d e s i m p l a n t e d in the brain. R e s u l t s f r o m this r e s e a r c h will be r e p o r t e d e l s e w h e r e [5].
Acknowledgement Table 6 Results from analogous simulations as in Table 5 using 1024 samples S/N~
0"00
0.50
0.11
0.046 (0.008) 0.078 (O.OLO) o.195 (o.o17) 0.305 (0.027) 0.539 (0.041)
0.048 (0.009) 0.079 (0.012) 0.197 (0.017) 0.309 (0.031) 0.542 (0.054)
0.70
0.90
H e l p f u l discussions with Prof. Ir. G r S n e v e l d are g r a t e f u l l y a c k n o w l e d g e d .
E.
W.
0.95
References 0.050 (0.012) 0.083 (0.016) 0.203 (0.019) 0.313 (0.025) 0.533 (0.042)
0.049 (0.014) 0.080 (0.017) 0.188 (0.027) 0.288 (0.037) 0.495 (0.060)
0.045 (0.013) 0.075 (0.018) 0.177 (0.035) 0.272 (0.047) 0.464 (0.063)
6. Discussion
[1] V.A. Epanechnikov, "'Non-parametric estimation of a multivariable probability density", Theor. Probab. Appl., Vol. 14, 1969, pp. 153-158. [2] M.J. Fryer, "A review of some non-parametric methods of density estimation", J. Inst. Maths. Applics., Vol. 20, 1977, pp. 335-354. [3] I.M. Gel'fand and A.M. Yaglom, "Calculation of the amount of information about a random function contained in another such function", Amer. Math. Sac. Transl., Vol. 12, 1959, pp. 199-246. [4] D.O. Loftsgaarden and C.P. Quesenberry, "A non-parametric estimate of a multivariate density function", Ann. Math. Statist., Vol. 36, 1965, pp. 1049-1501. [5] N.J.I. Mars and G.W. van Arragon, In preparation for publication in Electroencephalography and Clinical
In this p a p e r w e p r e s e n t e d t h e o r e t i c a l aspects a n d s o m e e m p i r i c a l results o b t a i n e d using A v e r a g e A m o u n t of M u t u a l I n f o r m a t i o n A n a l y s i s . W e c o n f i r m e d with M o n t e C a r l o s i m u l a t i o n s that A A M I d e s c r i b e s the d e g r e e of ( n o n - ) l i n e a r r e l a t i o n s h i p b e t w e e n two signals. T h e r e is a p r o mising a p p l i c a t i o n in t i m e d e l a y e s t i m a t i o n w h e r e t h e A A M I is u s e d in a r e l a t i v e way. S i m u l a t i o n s s h o w e d t h a t for a rectifying s y s t e m a n d a s q u a r i n g
[6] E. Parzen, "On the estimation of a probability density function and mode", Ann. Math. Statist., Vol. 33, 1962, pp. 1065-1076. [7] M. Rosenblatt, "Remarks on some non-parametric estimates of a density function", Ann. Math. Statist., Vol. 27, 1956, pp. 823-837. [8] S.C. Schwartz, "Estimation of a probability density by an orthogonal series", Ann. Math. Statist., Vol. 38, 1967, pp. 1261-1265. [9] C.E. Shannon and W. Weaver, The Mathematical Theory of Communication, University of Illinois Press, Chicago, IL, 1948.
0.25 1.oo 2.33 9.00
Neurophysiology.
Signal Processing
N.J.L Mars and G. W. van Arragon / Time delay in non-linear systems [10] D.F. Specht, "Series estimation of a probability density function", Technornetrics, Vol. 13, 1971, pp. 409--424. [11] A.H. Stroud, "Approximate calculation of multiple integrals", Prentice-Hall, Englewood Cliffs, NJ, 1971. [12] M.E. Tarter, R. k. Holcomb and R.A. Kronmal, "'After the histogram what? A description of new co~nputer methods for estimating the population density", Proc. ACM, Vol. 22, 1967, pp. 511-519. [13] G. Wahba, " A polynomial algorithm for density estimation", Ann. Math. Statist., Vol. 42. 1971, pp. 1870-1886.
153
[14] G.S. Watson and M.R. keadbetter, "On the estimation of the probability density, I", Ann. Math. Stat., Vol. 34, 1963, pp. 480--491. [15] E.J. Wegman, "'Non-parametric probability density estimation, I: A summary of available methods", Technometrics, Vol. 14, 1972, pp. 533-546. [16] E.J. Wegman, "Non-parametric probability density estimation, II: A comparison of density estimation methods", 3. Statist. Comput. Sim., Vol. 1, 1972, pp. 225-245.
Vol. 4. Nos. 2 and 3, April 1952