Copyright e IFAC ModcIing and Ccntrol in Biomedical Systems, GalveIUlD, Texas, USA, 1994
IDENTIFICATION OF NONLINEAR SYSTEMS Michael J, Korenberg Department of Electrical Engineering Queen's University, Kingston, Ontario K7L 3N6, Canada
Abstract
impulse) response is defined using a slice of a cross correlation of the input with the latest residue. In this way, the basis functions in effect change with the system to be identified, which generally provides a good rate of convergence. Following the dynamic linear element, the static nonlinearity, either a polynomial or other combination of basis functions such as gate functions, is then best-fit to the residue. As pointed out [3], it is trivial to extend the procedure by adding a second dynamic linear element to the cascade under development. Mo and Elkasabgy [5] found that this significantly increased the rate of convergence in simulated examples. In their simulations, the second dynamic linear element was a differential equation best-fit to the residue.
A method is proposed for approximating dynamic nonlinear systems using parallel cascades of alternating dynamic linear and static nonlinear elements. A key advantage of the proposed method is its effectiveness in approximating nonlinear systems which cannot be well fit using the first few terms of a Volterra series.
Introduction Consider a discrete-time, finite memory, causal nonlinear system where the output y(n) is a function of the input x(n) and delayed values x(n-l), ... , x(n-R): y(n) = F[x(n), ... , x(n-R)]
Suppose the static nonlinearity in each of the parallel cascades is a polynomial . Then the identified parallel array is equivalent to a Volterra series whose finite order is determined by the highest degree polynomial in the parallel cascade. The nonlinear system to be identified might not be accurately approximatable, for the given input, by a Volterra series unless the series order is very high. This would require use of higher-degree polynomials to represent the static nonlinearities in the parallel cascades, or replacement by other basis functions. The method described in the next section uses instead a Fourier transform approach to develop the parallel cascade approximation for systems ill fit by low order Volterra series.
(1)
Suppose that the input and output are available for n = 1, ... , N. Assume that the nonlinear function F in (1) is a continuous mapping of x(n), .. . , x(n-R), so that small changes in the input values result in small changes in the output values. Then Palm [1] has shown that the nonlinear system can be approximated to an arbitrary degree of accuracy by a sum of a sufficient number of LNL cascades. Each LNL cascade comprises [2] a dynamic linear element followed by a static nonlinearity which is itself followed by a second dynamic linear element. Palm's result [1] is of great value, but he did not suggest a method for identifying such a parallel cascade representation of a nonlinear system.
Identification Method
More recently, Korenberg [3,4] proposed a procedure for rapidly building a parallel cascade approximation from input output data, without a priori knowledge of the structure of the nonlinear system to be approximated. In this method, the cascades are identified individually, with each being "best-fit" to the output residue remaining after adding the previous cascade. Each cascade begins with a dynamic linear element whose delta (discrete
Assume that F in (1) has multidimensional Fourier transform H[COo, ... , roRl. Then the system output is y(n) = 1: ... COo
1: H[COo, ... , roRl· ro R
exp GCOo x(n» ... expGro R x(n-R»
561
(2)
chosen to be one of 0, ±Ol. If the M distinct input values are O,± 1, ±2, ... , then one alternative is to define the fundamental
Note that (2) is a sum of terms of form
H[Olo, ... ,
ro.J
exp( j
R 1: ~ x(n-i» i=O
(3) ffi
Expression (3) represents a dynamic linear operation (on input x) followed by the static noniinearity
H[Olo, ... ,
ro.J
Then set each random from
(4)
expO·)
= 21t/M. Olj
in (5) to be one of ± k
k = 0,1, ... ,[M/2],
(7) Ol,
chosen at
(8)
Thus (2) is clearly a parallel cascade of dynamic linear I static noniinear elements.
where the upper limit on k is the greatest integer not exceeding M/2.
For real input and output, each cascade comprises the dynamic linear operation
In practice, more rapid convergence was accomplished by first performing a simple linear transformation on each input value:
u(n) =
R 1: i=O
Olj
x(n-i)
m x(n) +b
(5)
so that the transformed points belong to the closed interval [-1,1]. Thus if a and 6 are respectively the maximum and minimum of the original input values, set 2 (10) m=a-6
followed by the static nonlinear operation
v=
A cos(u)
+
B sin (u)
+
C
(9)
(6)
The constant C allows the DC component of the nonlinear system to be "copied" more rapidly. Note that when (5) and (6) are used to define the first and second elements respectively in each cascade, the resulting parallel array can represent nonlinear systems forming a broad class. In particular, the parallel array can conveniently represent nonlinear systems not accurately approximatable by Volterra series unless the series had high order. This is because F in (1 ) can be any nonlinear function, not necessarily a polynomial, so long as its multi-dimensional Fourier transform exists. Thus nonlinear systems not readily amenable to Volterra series approximation can be modelled to arbitrary accuracy by this parallel cascade approach.
b
= -(6m+1)
(11)
Following the transformation in (9), use (7) to define the fundamental. Then set each Olj in (5) to be one of the harmonics ± kOl, where each k is randomly chosen from (8). Then the linear element output u(n) can be calculated. Next, to define the static nonlinear operation of (6) for the r-th cascade, r~l, find A,B,C to leastsquare fit V to the residue Yr.! remaining after adding the previous cascade. With A,B,C determined, V can be calculated. Then, a second dynamic linear element optionally can be best-fit to the residue Yr.!. Indeed one could continue to best-fit alternating static nonlinear and dynamic linear elements in the cascade under development.
To begin each cascade with the dynamic linear operation of (5), the Olj must be chosen. As assumed, the input output record is available for n= 1, ... ,N. Suppose that there are M distinct input values in this record. For example, the input might be a 1000 point ternary sequence, with distinct values 0, ± 1. In this case, define the fundamental to be ffi = 21t/3 and each ffij in (5) can be randomly
After the cascade has been completed, its output is subtracted from the residue Yr-h to yield the new residue Yr' Then a further cascade can be added analogously. The process may be terminated when 562
the mean-square error of the parallel cascade fit is acceptably small, or a pre-set number of cascades have been added to the parallel array. Termination may also be elected when no further candidate cascade can reduce the mean-square error beyond a threshold amount.
followed by the static nonlinear operation of raising to 0.01 power the absolute value of the linear element output. Input was 1000 points of zero-mean white Gaussian noise with variance unity. The best /iecond-()rder Volterra series with memory of 6 (corresponding to input lags 0, ... ,5) had a mse of 57.9%. Four parallel LNL cascades, with each dynamic linear element allowing input lags 0, ... ,5, had mse = 57.3 % . With 20 cascades, mse = 38.9%; with 50 cascades the mse = 22.2%. While far more parameters were used in the identified parallel cascade model than in the Volterra series, the improved mse performance of the identified parallel array was also true for novel inputs.
It can readily be shown [3,4] that each cascade added reduces the mean-square error (mse) by an amount equal to the mean-square of the cascade output. Before adding the r-th cascade, say defined by (5), (6), one may optionally require that
'fl >
y2r _t .L / (N-R+ 1)
(12)
Conclusion The proposed parallel cascade method may be particularly suited to identifying systems with highorder nonlinearities but fairly low memory. This method may for example be applicable to satellite communication channels. It may also be combined with the earlier parallel cascade method [3,4], for example, either to choose some of the cascades, or to model the static nonlinearities in the cascades.
where the overbar denotes time average over n = R, ... ,N. (L >= 11 corresponds to approximately 99.9% confidence interval calculated if the residue were zero-mean Gaussian noise.) Use of requirement (12) helps to avoid adding a cascade that is merely fitting noise [4]. The above scheme for choosing the coj in (5) resulted in initial rapid convergence in test simulatioDS. Since the transformed input values lie in the closed interval [-1,1], and there are M distinct values, the fundamental frequency is co
= 1t (M-l)/M
References 1.
Palm, G. On the representation and approximation of nonlinear systems. Part 11: Discrete time. BioI. Cybern. Vol. 34, pp. 4952; 1979.
2.
Korenberg, M.J.; Hunter, LW. The identification of nonlinear biological systems:LNL cascade models. BioI. Cybern. Vol. 55, pp. 125-134, 1986.
3.
Korenberg, M.J. Statistical identification of parallel cascades of linear and nonlinear systems. IFAC Symp. Ident. Sys. Param. Est. Vol. 1, pp. 580-585; 1982.
4.
Korenberg, M.]. Parallel Cascade identification and kernel estimation for nonlinear systems. Annals of Biomedical Engineering, Vol. 19, pp. 429-455; 1991.
5.
Mo, L.; Elkasabgy, N. Elec-841 Report, Dept. Elect. Eng., Queen's Univ., Kingston, Ontario, Canada; 1984.
(13)
One would expect that each coj in (5) could be randomly chosen to be one of the harmonies ±kco, subject to (8). However defining co by (13) did not result in initial rapid convergence, while use of (7) did. Once the convergence slowed, expanding the range for wj ' by switching to (13) to define the fundamental frequency, continued the convergence. Of course, many other schemes may be used to search the frequency band with limits ± [M/2]w in order to choose the Wj in (5). Example The test system comprised a dynamic linear element with delta response for m=0, ... ,5 g(m) = exp(-O.25m) - exp(-O.5m)
(14) 563