Kernel Method for Nonlinear Analysis: Identification of a Biological Control System ATSUSHI WATANABE* AND LAWRENCE STARK Departments of Mechanical Engineering, of Physiological Optics, and of Electrical Engineering and Computer Sciences, University of California, Berkelq, 94720
ABSTRACT Due to the high dimensionality of computation involved, the Volterra-Wiener kernel method has found only limited application in nonlinear system analysis. By means of a finite-dimensional approximation combined with the least-square error method, an efficient algorithm was developed for computing kernels of a nonlinear system subjected to random inputs. An application to a physiological control system demonstrates the practicality of this algorithm as well as suggesting the prospective usage of the kernel method for nonlinear-system analysis in the future.
The idea of polynomial-like expansion of a nonlinear functional, now called the Volterra series, can be traced back to the works of Frechet and Volterra [l] early in this century. Norbert Wiener [2] first introduced the use of these series into the analysis of a nonlinear engineering system in 1942. By the use of the Volterra series, a single-input, single-output system can be represented by a set of functions, he, h,, h,,. . ., which are respectively called the zeroth-order kernel, the first-order kernel, the second-order kernel,. . . , in the form y(t)=h,+
j-h,(s)x(t-s)ds +
s Sh*(s,,s,)x(t-s,)x(t-s,)ds,ds~
+.... where x and y are the input and the output of the system, respectively. This method is a natural extension of the weighting-function representation of *Dr. Watanabe is now with the Sumitomo-Chiba Ichihara-shi, Chiba-ken, Japan. MATHEMATICAL
BIOSCIENCES
0 American Elsevier Publishing
21, 99-108
Company,
Chemical
(1975)
Inc., 1975
Co., Ltd., Anesaki-Kaigan,
99
100
ATSUHI
WATANABE
AND
LAWRENCE
STARK
linear systems to nonlinear systems, and applies to a wide class of systems with “well-behaved” properties such as time invariance, nonanticipativeness, finite memory and a continuous input-output relationship.’ In 1946 Cameron and Martin [3] developed a series expansion of a nonlinear functional which had an orthogonality property on the ensemble of white Gaussian input functions. This concept, introduced into the framework of the Volterra series, enabled Wiener [4] to develop a theory of orthogonal functionals which are now called Wiener G-functionals; his approach was essentially the Gram-Schmidt orthogonalization of the Volterra series. A number of engineering studies at MIT and elsewhere extended and modified the method of functional series [S-7]. A comprehensive survey of various orthogonal functional series was provided by Zadeh [8]. In spite of the generality and the refinement of the theory of functional series, only a few applications to actual systems have been attempted. In 1963 Sandberg and Stark [9-121 first measured the kernels of the pupil-light reflex system up to the second order by the cross-correlation method [13] developed by Lee and Schetzen and by the multiimpulse method [14], and showed a prospective use of the method, which gives a new way of looking [ 151 also measured the at a structually unknown system. Katzenelson kernels of the pupil-light reflex system by applying his own measurement theory, wherein he introduced a Wiener-Hopf-type integral equation to solve for the kernels. Another application recently published is by Marmarelis and Naka [16]. They applied the cross-correlation method to a neuron chain in the catfish retina and measured the first-and second-order kernels. By inspecting the shapes of the kernels, they were able to derive some characteristics and the topology of this system in similar manner to the earlier work of Sandberg and Stark [9-121. One of the difficulties which has prevented further applications is perhaps the formidable amount of computation required to calculate the kernels. This difficulty is due to the inevitable fact that the best approximate kth-order kernel must be chosen from the whole space of k-variable functions. On the other hand, there is usually some a priori knowledge about the characteristics of the system response. This often enables one to find a subspace of the k-variable functions from which the best kernel is chosen rather than from the whole space of k-variable functions. Consequently the amount of computation involved can be considerably reduced. For example, from an impulse response test we may determine the existence of a delay time, the duration of the response, and the oscillatory or non-oscillatory nature of the response. Then the search for the kernels ‘Most actual systems in practice have these properties for a relatively short time of observation. The property of finite memory may be replaced by a weaker condition that the memory decays about exponentially as the time elapses.
KERNEL
METHOD
FOR NONLINEAR
ANALYSIS
101
can be restricted to the subspaces where functions have such properties as account for the characteristics of the response. As an implementation of the above idea, an efficient computation algorithm [17,18] was developed with the following principle. Initially a small number of functions are selected with as many of the properties mentioned above as possible. Then these functions, after orthonormalization and multiplication to generate higher-order orthonormal sets of functions, are used as the base functions which span the subspaces from which the best kernels are chosen as minimizing the mean-square error between the output of the model represented by the kernels and that of the actual system. This algorithm seems more economical and convenient than those previously employed in our earlier studies and in other similar works. When five functions are initially chosen, it takes only 5 seconds and 30 seconds for a CDC 6600 computer to obtain the second- and the third-order models, respectively, for a data series of 585 points. An experiment and calculation on a physiological control system demonstrated the practicality of the present method. We again took the human pupil-light reflex system as the experimental object, because this system has interesting features typically found in many physiological control systems and may be a good example of general nonlinear systems [ 121. The pupil reflex to light attempts to regulate the total light flux reaching the retina and results in rapid changes of the pupil diameter. The most remarkable feature of this response is its dynamical asymmetry: a positive change of the light flux causes a fast, overshooting response, while a negative change causes a relatively slow, dead-beat response. This phenomenon has been accounted for in terms of the asymmetrical sensitivity of the system. Besides the rapid reflex response, there also exist many modes of retinal adaptation to changes in light flux which are in large part completed by about 1 minute for a light adaptation and by about 30 minutes for a dark adaptation. As the reflex response is quite rapid compared to slower retinal adaptation phenomena in our experiment, we presumed that the reflex dynamics, including rapid adaptations, may be treated independently for a short period of observation. An outline of the experiment is as follows. The stimulus light, which was randomly modulated, was focused in the center of the pupil, eliminating the influence of pupil diameter on incoming stimulus, and thus driving the system in an open-loop fashion [ 121. The pupil diameter was measured by a special television-camera pupillometer, recorded together with intensity of the light stimulus by a two-pen recorder, and also sampled by an on-line laboratory computer. The longest run without blinks of the eye used for calculations was about 1 minute long: Fig. 1 shows the input (light intensity) on the first line and the output (pupil diameter) on the second line. Values were sampled every tenth second and digitized, yielding a data series
102
ATSUHI
WATANABE
AND
LAWRENCE
STARK
of 585 points. From a simple step-response test we also knew that the system had a latency of response (pure delay) of about 0.3 second and that the dynamic response disappeared less than 5 seconds after application of a step change of the input. As base functions we chose Laguerre-type functions with a delay time and an exponential decay factor which were suitably determined from the above observations on the step response. Using these base functions, first-, secondorder to investigate the effect accuracy
of the models,
and third-order of the number
four to ten functions
models were calculated. In of the base functions on the were used for calculations.
In Fig. 2 are plotted the mean-square errors for the models as functions of the number of parameters in the models. Although it is quite natural that, for a limited-length data series, error should decrease as the number of parameters in the model increases, a more remarkable fact shown in the figure is that the errors for each model order distinctively fall on a fairly horizontal line, demonstrating that error reduction is not solely due to increasing degrees due to the increase
of freedom of the models, of model order.
11
z-
but also and more importantly
TRACE
A
TRACE
B
TRACE
C
kz go E a-2
0
/ IO
I 20 TIME
I I 30 40 (SECONDS)
I 50
FIG. 1. Random-input measurement of pupil light reflex system. Trace A: Light intensity (millilumens); mean value (1.05 millilumen) subtracted. Trace B: Pupil diameter (millimeters); mean value (5.3 millimeters) subtracted. Trace C: Response of second-order model.
KERNEL
METHOD
cc
FOR NONLINEAR
O6
g .05 w
“0
4$#FlRST
IO I
ORDER
20I NUMBER
FIG. 2. is indicated
103
ANALYSIS
30I
40I OF
MODEL
50I
60I
70I
80I
6
90 J
PARAMETERS
Mean-square errors of models. Number for each point.
of base functions
employed
in model
Fixing the number of the base functions at five, step responses of three orders of models are calculated for both positive and negative step inputs (Fig. 3). The first-order (that is, linear) model gives only symmetric responses. The second-order model exhibits the dynamical asymmetry which is a typical property of the pupil light reflex system. The overshoot to a positive step is large, while that to a negative step is not as large as in the linear case. The third-order model gives similar nonlinear responses except that more apparent overshoots occur even to negative step inputs; this result is inconsistent with the actual system step responses. It was felt that the length of the data sequence was not long enough in comparison with the number of parameters in the third-order model, and it was thus difficult to determine them with a sufficient degree of accuracy. We took the secondorder model as the most appropriate one; it not only reproduced the observed input-output data in the noise experiment with fairly good agreement, but also best simulated such dynamic properties of the system as the step response. The third line of Fig. 1 shows the system output reproduced by the second-order model. The first- and second-order kernels of the second-order model are shown in Fig. 4 (A) and (B) respectively. The first-order kernel produces only dynamically symmetric responses; the existence of an even-order kernel, the second-order kernel, accounts for dynamical asymmetry. For example, when a step input is applied, the second-order kernel produces a negative response regardless of the sign of the input step, adding to the response produced by the first-order kernel for a positive step, and subtracting from it for a negative step; thus the total response is asymmetric. Remarkable features of the second-order kernel are its dominant values along the lines
104
ATSUHI WATANABE AND LAWRENCE STARK
TIME FIG. 3.
Step responses of models: (A) first-order model, (B) second-order model, (C) Inputs are steps at time 0 of -C0.05, kO.10 millilumens. Time values are tenths of a second. third-order
model.
which are parallel to the time axes at a distance that the shapes along these lines are analogous
equal to the latency, and to that of the first-order
kernel. Off these lines the values of the kernels are relatively small. From this observation it may be postulated that the principal part of the secondorder kernel may be generated by a two-dimensional product of the first-order kernel and an impulse-like function. Although the third-order model calculated ffom the present measurement is not reliable, it is interesting to see what the third-order kernel looks like. Figure 5 shows a “section” of the third-order kernel of the third-order model by a hyperplane perpendicular to one of the time axes. This is the first third-order kernel of an actual system ever realized. Gathering the foregoing observations on the kernels together, we can give a “structure” of the second-order model as in Fig. 6. H, is a linear element whose weighting function is the first-order kernel. H, is also a
KERNEL
METHOD
FOR NONLINEAR
105
ANALYSIS
20 TIME
30 T
40
(a)
kernels of second-order model: FIG. 4. Calculated are tenths of a second); (b) second order kernel.
(a) first-order
kernel (time values
ATSUHI WATANABE AND LAWRENCE STARK
106
FIG. 5. Third-order kernel of third-order model: cross-section by hyperplane T, = 0.4 second. linear
element,
whose
response
is very
fast,
function is almost an impulse function. After DC term, the two outputs are multiplied second-order
system.
The pure
delay
and
therefore
its weighting
the output of H, is biased by a together, thus resulting in a
element
D is not necessarily
lumped
and anterior to the linear elements, but may be included in the linear elements or distributed throughout the whole system. N is a nonlinear function, which is traditionally placed there to account for the logarithmic property of the retinal response to light flux. This element could not be evaluated by the present measurements, since input values were collected only in the vicinity of the mean value, and thus such a logarithmic nonlinearity would not have much effect on the output values. The model structure of Fig. 6 is only heuristic; it can neither be interpreted physiologically nor predict all the input-output relations of the pupil light reflex system. The experimental run was too short and also too noisy to identify every feature of the system response. Although this specific example of measurement has these shortcomings, it can still demonstrate the usefulness and the practicality of the approach used here. It is believed that the kernel
method,
which
Wiener
introduced
into system
analysis
more
than thirty years ago, is one of very few powerful methods for nonlinear analysis. It can now be practiced as easily as other conventional methods such as frequency
response
for linear
systems.
KERNEL
METHOD
FOR NONLINEAR
FIG. 6.
Heuristic
107
ANALYSIS
structure
of second-order
model.
The computer graphics of kernels for this publication were produced at Aerospace Medical Research Laboratory, Wright Patterson Air Force Base, Ohio, where A. Watanabe was Resident Research Associate of the National Research Council. The authors wish to thank Captain M. E. Jerrel and Captain D. L. Brungart for their kind assistance in the use of the CDC 6600 computer for these graphics. REFERENCES 1 2 3
4 5 6 I 8 9 10 11 12 13
V. Volterra, Theory of Function& and of Integral and Integrodifferential Equations, Blakie, London, 1930. N. Wiener, Response of a nonlinear device to noise, Rep., Radiat. Lab., MIT 168 (Apr. 1942). R. H. Cameron and W. T. Martin, The orthogonal development of nonlinear functionals in series of Fourier-Hermite functionals, Ann. Math. 48, 385-392(Apr. 1947). N. Wiener, Nonlinear Problems in Random Theov, Technology Press (MIT) and Wiley, 1958. A. G. Bose, A theory of nonlinear systems, Tech. Rep., Res. Lab. of Electron., MIT, No. 309, May 1956. M. B. Brilliant, Theory of the analysis of nonlinear systems, Tech. Rept., Res. Lab. of Electron., MIT, No. 345, Mar. 1958. D. A. George, Continuous nonlinear systems, Tech. Rep., Rex Lab. of Electron., MIT, No. 355, Mar. 1958. L. A. Zadeh, On the representation of nonlinear operators, in IRE Weston Conoentional Record, Part 2, 1957, pp. 105-l 13. L. Stark, Black box description and physical element identification in the pupil system , in Q. Prog. Rep., Res. Lab. of Electron., MIT No. 69, Apr. 1963, pp. 247-250. A. Sandberg and L. Stark, Wiener G-functional analysis as an approach to nonlinear characteristics of human pupil light reflex, Brain Res. 11, 194211 (1968). L. Stark, The pupillary control system: Its nonlinear adaptive and stochastic engineering design characteristics, Automatica 5, 665676 (1969). L. Stark, Neurological Control Systems, Studies in Bioengineering, Plenum, New York, 1968. Y. W. Lee, and
M. Schetzen,
Measurement
Prog. Rept., Res. Lab. of Electron.,
MIT,
of kernels
No. 60, Jan.
by cross-correlation,
1961, pp. 118-130.
in Q.
ATSUHI
108
WATANABE
AND
LAWRENCE
STARK
14
M. Schetzen, Measurement of the kernels of a nonlinear system of finite order, Prog. Rept., Res. Lab. of Electron., MIT, Jan. 1963, pp. 122-130.
15
J. Katzenelson, Synthesis of nonlinear filters, Sc.D. Thesis, Electr. Eng. Dept., MIT, Sept. 1963. P. 2. Marmarelis and K. Naka, White noise analysis of a neuron chain: an application of the Wiener theory, Science 175. 12761278 (Mar. 1972). A. Watanabe, Characterization of nonlinear systems by discrete Volterra series and Ph.D. Thesis, Univ. of Calif., Berkeley, its application to a biological system, June 1972. A. Watanabe and L. Stark, Characterization of nonlinear systems by discrete Volterra kernel method, in Proceedings, Fourth Symposium on Nonlinear Estimation and
16 17
18
Its Applications,
San Diego, Calif., Sept. 1973.
in Q.