CHOICE OF SAMPLING INTERVALS

G. C. Goodwin
Department of Electrical Engineering, University of Newcastle, New South Wales, 2308, Australia

R. L. Payne
Department of Systems and Control, University of New South Wales, New South Wales, 2031, Australia
1. INTRODUCTION
2. DESIGN CRITERIA
3. CONSTRAINTS
4. NONUNIFORM SAMPLING THEORY
5. SEQUENTIAL DESIGN OF NONUNIFORM SAMPLING INTERVALS
6. UNIFORM SAMPLING INTERVAL DESIGN
7. ANALYTIC DESIGNS FOR UNIFORM SAMPLING INTERVALS
8. CONCLUSIONS
REFERENCES
1. INTRODUCTION

In most experiments there are a number of variables which can be adjusted, subject to certain constraints, so that the information provided by the experiment is maximized. Experiment design procedures have been described for a number of diverse modelling problems in many fields, including agriculture, economics, social sciences, chemical engineering, industrial engineering and biological sciences [1] to [3]. The main emphasis to date has been on non-dynamic systems, but recently there has
been growing interest in systems where dynamic effects predominate [9] to [45]. For dynamic systems, experiment design includes the choice of input and measurement ports, test signals, sampling instants and presampling filters. In general the effects of these design parameters are interrelated [46], and a complete joint design should be carried out. However, in practice it is often the case that the appropriate input and measurement ports are predetermined. Furthermore, for the case of continuous observations, or of discrete time systems, the problem of the choice of sampling intervals and filters does not arise. In these cases the complete design reduces to the design of test signals.

One of the earliest publications on optimal test signal design appeared in 1960 [9], the major result being that an input having an impulsive autocorrelation function is optimal for the estimation of the parameters of a moving average process in the presence of white measurement noise. Since then, more general classes of models have been considered, and a number of test signal design algorithms based on optimal control principles have been proposed [13] to [15], [17], [20], [24], [25], [32], [33], [34]. Recently it has been shown that design algorithms originally developed for non-dynamic systems [4] to [8] are applicable to the design of test signals in the frequency domain [39], [40], [43], [44], [45].

For those problems where the choice of sampling instants and pre-sampling filters does arise, it has been recognized that this choice often has a significant effect on the information return from the experiment [46] to [50]. The choice is particularly critical when the available computer storage and analysis time are limited. In these cases, the experiment should be designed to maximize the average information content of each sample.

In the sampling problem, four cases can be distinguished, depending upon whether the samples are uniformly or nonuniformly spaced in time and whether the test signal is prespecified or is
to be jointly designed.
The purpose of this chapter is to present what is, in the authors' opinion, the state of the art in the design of sampling intervals in each of these cases.

2. DESIGN CRITERIA

To form a basis for the comparison of different experiments, a measure of the "goodness" of an experiment is required. A logical approach is to choose a measure related to the expected accuracy of the parameter estimates to be obtained from the data collected. Clearly the parameter accuracy is a function of both the experimental conditions and the form of the estimator. However, it is usual to assume that the estimator used is efficient in the sense that the Cramér-Rao lower bound on the parameter covariance matrix is achieved, that is, the covariance matrix is given by the inverse of Fisher's information matrix [51]. (This property is true, at least asymptotically, for most commonly used estimators, e.g., maximum likelihood [51].)
A suitable measure of the "goodness" of an experiment is therefore

    J = φ(M)                                                    (2.1)

where φ is a scalar function and M is Fisher's information matrix. Various choices exist for the function φ. Possible choices are

i)   −Trace(M);
ii)  Det(M^{-1}) (equivalently −Det(M) or −Log Det(M));
iii) Trace(WM^{-1}), where W is a nonnegative definite matrix.

The choice between these alternatives depends upon the objective of the experiment. In fact it could be argued that a Bayesian decision theoretic approach is called for, that is, the measure J should be a Bayesian risk function which would reflect the ultimate use to which the data from the experiment will be put. This approach has been followed by a number of authors in the
literature [33], [34], [38]. A useful approximation to the full Bayesian formulation is to use:

    J = Trace[M^{-1}W]                                          (2.2)

where W is a weighting matrix given by the second derivative of the risk function with respect to the parameters [13], [38].

An alternative approach is to use an information measure, where information is used in the mathematical sense [52]. Under the usual Gaussian assumptions the information measure reduces to

    J = Det[M^{-1}]                                             (2.3)
Of the three possible criteria mentioned above, choice (i) seems at first sight particularly attractive since, for input design, use of this criterion leads to a quadratic optimization problem [17], [20], [32], [33]. However, it has been pointed out [13], [25], [26], [35] that this criterion can lead to unidentifiability conditions and is therefore not recommended.

Other criteria, apart from the three discussed above, are possible. Most commonly met criteria (including the three above) have the following desirable property:

    φ(M_1) ≤ φ(M_2)   if   M_1 − M_2 is nonnegative definite    (2.4)
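As a concrete illustration of how these criteria rank experiments, the following sketch (all numbers are hypothetical and not from the chapter) evaluates choices (i)-(iii) for two small information matrices and checks the ordering property (2.4) for each criterion:

```python
def det2(M):
    # determinant of a 2x2 matrix
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def inv2(M):
    # inverse of a 2x2 matrix
    d = det2(M)
    return [[M[1][1] / d, -M[0][1] / d], [-M[1][0] / d, M[0][0] / d]]

def trace2(M):
    return M[0][0] + M[1][1]

def matmul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def criteria(M, W):
    Minv = inv2(M)
    return {
        "neg_trace": -trace2(M),                 # choice (i)
        "det_inv": det2(Minv),                   # choice (ii)
        "trace_WMinv": trace2(matmul2(W, Minv)), # choice (iii)
    }

# M1 - M2 is nonnegative definite here (M1 = M2 + I), so every criterion
# satisfying property (2.4) must rate M1 at least as good (cost no larger).
M2 = [[4.0, 1.0], [1.0, 2.0]]
M1 = [[5.0, 1.0], [1.0, 3.0]]
W = [[1.0, 0.0], [0.0, 1.0]]

J1, J2 = criteria(M1, W), criteria(M2, W)
for name in J1:
    assert J1[name] <= J2[name], name    # property (2.4) holds
```

A larger information matrix yields a smaller cost under every listed criterion, which is exactly the ordering (2.4) formalizes.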
This property will be used in later sections.

3. CONSTRAINTS

In the optimization of the design criteria introduced in Section 2, it is obvious that the optimization must be carried out with due regard to the constraints on the allowable experimental conditions. Typical constraints that might be met in practice are:
(1) Input, output or state amplitude constraints,
(2) Input, output or state energy/power constraints,
(3) Total experiment time,
(4) Total number of samples,
(5) Maximum sampling rate,
(6) Total computation time for design and analysis,
(7) Availability of transducers and filters,
(8) Availability of hardware and software for analysis,
(9) Model accuracy specifications,
(10) Non-disruption of normal production.
Which of the above constraints (or others) are important in a particular experiment depends on the situation, and it is not always obvious which should be taken into account during the design. A useful approach leading to simplified designs is to work with a subset of the constraints (such as input energy and total number of samples) and to subsequently check for violation of the other constraints.

4. NONUNIFORM SAMPLING THEORY

In this section the following problem is considered:

a. The parameters in a continuous linear system are to be estimated from a fixed number of samples, N.

b. The samples are taken at times t_1, t_2, ..., t_N, where the sampling intervals Δ_k = t_{k+1} − t_k, k = 0, ..., N−1, are not necessarily equal (t_0 corresponds to the beginning of the experiment).

c. The experimental conditions available for adjustment are the system input, the pre-sampling filter and the sampling intervals Δ_0, Δ_1, Δ_2, ..., Δ_{N−1}.

A general model for a linear time invariant system is [53]:

    dx'(t) = A'x'(t)dt + B'u(t)dt + dω(t)                       (4.2)

    dy'(t) = C'x'(t)dt + dη(t)                                  (4.3)

where y'(t): T → R^m is the output prior to filtering and sampling, u(t): T → R^r is the input vector, and x'(t): T → R^{n'}
is a state vector; η(t) and ω(t) are Wiener processes whose incremental covariance is given by equation (4.4). The above model is based on the usual assumptions of a rational input-output transfer function and a rational output noise power density spectrum.

In general, to minimize the information loss due to sampling, some form of pre-sampling filter is required. Optimization of the filter requires careful specification of the allowable filter structures, as otherwise impractical solutions will result. For example, with no restrictions on the allowable structure, a filter which is in fact a complete identifier could be constructed.
A common restriction is to use a linear filter of fixed dimension. The filter design then reduces to optimization of the filter parameters. A suitable linear filter is a fixed-dimension linear system driven by y'(t) (equations (4.5), (4.6)), where y(t) is the filter output.

Equations (4.5), (4.6) can be combined with equations (4.2), (4.3) to produce a composite model for the system and filter of the following form:

    dx(t) = Ax(t)dt + Bu(t)dt + dξ(t)                           (4.7)

    y(t) = Cx(t)                                                (4.8)

where x(t) is the state of the composite model and ξ(t) is a Wiener process with incremental covariance Σdt.

For simplicity of exposition, the class of allowable inputs is now restricted to piecewise constant functions, that is

    u(t) = u_k ,   t ∈ [t_k, t_{k+1})                           (4.9)
This class of inputs has the added advantage of ease of implementation with the specified sampling scheme. However, the analysis can be readily extended to other classes of inputs, e.g., absolutely continuous functions.

As a first step towards obtaining an expression for the information matrix of the system parameters, a discrete time model for the sampled observations is now obtained. Integration of (4.7) and (4.8) gives

    x_{k+1} = Φ(Δ_k)x_k + Ψ(Δ_k)u_k + ν_k                       (4.10)

    y_k = Cx_k                                                  (4.11)

where

    Φ(Δ_k) = exp(AΔ_k)                                          (4.12)

    Ψ(Δ_k) = ∫_0^{Δ_k} exp(As)B ds                              (4.13)

    ν_k = ∫_{t_k}^{t_{k+1}} exp(A(t_{k+1} − s)) dξ(s)           (4.14)

and

    Δ_k = t_{k+1} − t_k                                         (4.15)

The sequence {ν_k} is a sequence of independent normal random variables having zero mean and covariance

    E{ν_k ν_kᵀ} = ∫_0^{Δ_k} exp(As) Σ exp(Aᵀs) ds ≜ Q_k         (4.16)

by definition.
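For a scalar composite model dx = ax dt + bu dt + dξ with incremental covariance σ² dt, the quantities in (4.12)-(4.16) have closed forms: Φ(Δ) = e^{aΔ}, Ψ(Δ) = (b/a)(e^{aΔ} − 1) and Q(Δ) = (σ²/2a)(e^{2aΔ} − 1). The sketch below (illustrative values a = −1, b = 1, σ² = 1, not from the chapter) checks these closed forms against the defining integrals:

```python
import math

def phi(a, dt):
    # state transition over an interval dt: exp(a*dt), cf. (4.12)
    return math.exp(a * dt)

def psi(a, b, dt):
    # integral of exp(a*s)*b over [0, dt], cf. (4.13)
    return b * (math.exp(a * dt) - 1.0) / a

def q_cov(a, sigma2, dt):
    # covariance of the discrete noise nu_k, cf. (4.16)
    return sigma2 * (math.exp(2.0 * a * dt) - 1.0) / (2.0 * a)

def integral(f, lo, hi, n=100000):
    # simple midpoint rule, accurate enough for a numerical check
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

a, b, sigma2, dt = -1.0, 1.0, 1.0, 0.3
assert abs(psi(a, b, dt) - integral(lambda s: math.exp(a * s) * b, 0.0, dt)) < 1e-6
assert abs(q_cov(a, sigma2, dt) - integral(lambda s: math.exp(2.0 * a * s) * sigma2, 0.0, dt)) < 1e-6
```

Note how Q_k grows with Δ_k: longer sampling intervals mean more accumulated process noise per sample.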
Equations (4.10) and (4.11) can now be used to develop an expression for the information matrix of the parameters using the samples y_1, ..., y_N. The standard expression for the information matrix [53] is:

    M = E_{Y|β} { [∂ log p(Y|β)/∂β] [∂ log p(Y|β)/∂β]ᵀ }        (4.17)

where Y is a random variable corresponding to the observations y_1, ..., y_N and β is the p-vector of parameters in the system model (4.2) and (4.3); p(Y|β) is the likelihood function for the parameters, and E_{Y|β} denotes expectation over the distribution of Y given β. Using Bayes' rule the likelihood function can be expressed as:

    p(Y|β) = ∏_{k=1}^{N} p(y_k | y_{k−1}, ..., y_1, β)          (4.18)
It follows from the assumption of Gaussian noise processes that the conditional distributions in equation (4.18) can be expressed as

    p(y_k | y_{k−1}, ..., y_1, β) = (2π)^{−m/2} (det S_k)^{−1/2} exp{ −1/2 (y_k − ȳ_k)ᵀ S_k^{−1} (y_k − ȳ_k) }      (4.19)

where ȳ_k is the conditional mean and satisfies the following equations [54]:

    ȳ_k = Cx̄_k                                                  (4.20)

    x̄_{k+1} = Φ_k x̄_k + Ψ_k u_k + Γ_k (y_k − Cx̄_k)              (4.21)

where x̄_k is the conditional mean of x_k.
In equation (4.21) the simplified notation Φ_k = Φ(Δ_k, β), Ψ_k = Ψ(Δ_k, β) has been used. In equation (4.21) the matrix
Γ_k is given by:

    Γ_k = Φ_k P_k Cᵀ (C P_k Cᵀ)^{−1}                            (4.22)

and P_k is the conditional covariance of x_k and is the solution of a matrix Riccati equation:

    P_{k+1} = Φ_k P_k Φ_kᵀ + Q_k − Γ_k C P_k Φ_kᵀ               (4.23)

(Note: The matrix [CPCᵀ] in equations (4.22) and (4.23) will usually be nonsingular, but if singular, then the correct results are obtained by using the pseudo inverse [59].) The conditional covariance, S_k, of y_k is related to P_k as follows:

    S_k = C P_k Cᵀ                                              (4.24)
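The recursions (4.21)-(4.24) can be sketched numerically. The fragment below runs the covariance recursion for a two-state composite model with scalar output y_k = Cx_k, C = [1, 0]; all matrices are illustrative, not from the chapter:

```python
# One-step predictor covariance recursion, cf. (4.22)-(4.24), using
# hand-rolled matrix helpers so the sketch is self-contained.

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_add(A, B):
    return [[A[i][j] + B[i][j] for j in range(len(A[0]))] for i in range(len(A))]

def mat_sub(A, B):
    return [[A[i][j] - B[i][j] for j in range(len(A[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

Phi = [[0.9, 0.1], [0.0, 0.8]]   # Phi_k, held constant for this sketch
Q = [[0.02, 0.0], [0.0, 0.05]]   # Q_k, discrete process noise covariance
C = [[1.0, 0.0]]

P = [[1.0, 0.0], [0.0, 1.0]]     # initial covariance P_0
for _ in range(200):
    S = mat_mul(mat_mul(C, P), transpose(C))[0][0]          # (4.24), scalar
    PCt = mat_mul(P, transpose(C))
    Gamma = [[row[0] / S] for row in mat_mul(Phi, PCt)]     # (4.22)
    # Riccati update (4.23): P <- Phi P Phi' + Q - Gamma C P Phi'
    P = mat_sub(mat_add(mat_mul(mat_mul(Phi, P), transpose(Phi)), Q),
                mat_mul(mat_mul(Gamma, mat_mul(C, P)), transpose(Phi)))

S = mat_mul(mat_mul(C, P), transpose(C))[0][0]
assert S > 0.0                          # innovation variance stays positive
assert abs(P[0][1] - P[1][0]) < 1e-9    # P remains symmetric
```

For a stable Φ, P converges to a steady-state value, so S_k and Γ_k also converge; this is why uniform-sampling analyses can work with a single stationary innovation variance.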
The boundary conditions for equations (4.21) and (4.23) are assumed to be x̄_0 and P_0 respectively, where the initial state has been taken to be normally distributed with mean x̄_0 and covariance P_0. Substituting (4.19) into (4.18) yields

    p(Y|β) = (2π)^{−mN/2} ( ∏_{k=1}^{N} det S_k )^{−1/2} exp{ −1/2 Σ_{k=1}^{N} (y_k − ȳ_k)ᵀ S_k^{−1} (y_k − ȳ_k) }      (4.25)

which, when substituted into (4.17), yields:

    M_ij = E{ Σ_{k=1}^{N} (∂ȳ_k/∂β_i)ᵀ S_k^{−1} (∂ȳ_k/∂β_j) } + 1/2 Σ_{k=1}^{N} Trace{ S_k^{−1} (∂S_k/∂β_i) S_k^{−1} (∂S_k/∂β_j) }      (4.26)
The proof of equation (4.26) is straightforward, though somewhat long winded [50], and has therefore been left to the reader as an exercise.

The quantities ∂ȳ_k/∂β_i in equation (4.26) can be evaluated from the following sensitivity equations, obtained by differentiating (4.20) and (4.21) with respect to β_i:

    ∂ȳ_k/∂β_i = (∂C/∂β_i) x̄_k + C (∂x̄_k/∂β_i)                  (4.27)

    ∂x̄_{k+1}/∂β_i = ∂/∂β_i { Φ_k x̄_k + Ψ_k u_k + Γ_k (y_k − Cx̄_k) }      (4.28)

Equation (4.28) can be combined with equation (4.21) to give a linear recursion for the composite state

    z_k = [ x̄_kᵀ, (∂x̄_k/∂β_1)ᵀ, ..., (∂x̄_k/∂β_p)ᵀ ]ᵀ           (4.29)

    z_{k+1} = F_k z_k + G_k u_k + H_k (y_k − ȳ_k)               (4.30)

and the forms of the matrices F_k, G_k and H_k follow from (4.28) and (4.21). Also from equation (4.27) the quantities ∂ȳ_k/∂β_i can be expressed in terms of the composite state z_k as follows:

    ∂ȳ_k/∂β_i = C̃_i z_k                                         (4.31)

where

    C̃_i = [ ∂C/∂β_i, 0, ..., 0, C, 0, ..., 0 ]                  (4.32)

with C appearing in the (i+1)-th block position. Now substituting (4.31) into (4.26) leads to the following expression for the ij-th element of the information matrix:
    M_ij = E{ Σ_{k=1}^{N} z_kᵀ C̃_iᵀ S_k^{−1} C̃_j z_k } + 1/2 Σ_{k=1}^{N} Trace[ S_k^{−1} (∂S_k/∂β_i) S_k^{−1} (∂S_k/∂β_j) ]      (4.33)
Performing the expectation in equation (4.33) yields:

    M_ij = Σ_{k=1}^{N} { ẑ_kᵀ C̃_iᵀ S_k^{−1} C̃_j ẑ_k + Trace[ C̃_iᵀ S_k^{−1} C̃_j T_k ] } + 1/2 Σ_{k=1}^{N} Trace[ S_k^{−1} (∂S_k/∂β_i) S_k^{−1} (∂S_k/∂β_j) ]      (4.34)

where ẑ_k and T_k are the mean and covariance respectively of z_k and are given by:

    ẑ_{k+1} = F_k ẑ_k + G_k u_k ,    ẑ_0 = 0                    (4.35)

    T_{k+1} = F_k T_k F_kᵀ + H_k S_k H_kᵀ ,    T_0 = 0          (4.36)
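To make (4.34)-(4.35) concrete, the following sketch evaluates the information about a single parameter a in x_{k+1} = e^{aΔ}x_k + β u_k, y_k = x_k + n_k with Var(n_k) = S constant. In that case the trace terms of (4.34) vanish (∂S_k/∂a = 0), and, as a simplifying assumption of this sketch only, the covariance term T_k is dropped. All numbers are illustrative:

```python
import math

a, b, S = -1.0, 1.0, 0.1     # illustrative system and noise values
N, dt = 40, 0.25             # number of samples and uniform interval

alpha = math.exp(a * dt)                                   # Phi(dt)
beta = b * (alpha - 1.0) / a                               # Psi(dt)
dalpha = dt * alpha                                        # d(alpha)/da
dbeta = b * (dalpha * a - (alpha - 1.0)) / a ** 2          # d(beta)/da

x, dx = 0.0, 0.0    # predicted state and its sensitivity dx/da
M = 0.0             # scalar information matrix, first term of (4.34)
for k in range(N):
    u = 1.0 if k % 8 < 4 else -1.0                         # square-wave input
    # propagate state and sensitivity together, cf. (4.35);
    # the tuple assignment uses the OLD x in both updates
    x, dx = alpha * x + beta * u, alpha * dx + dalpha * x + dbeta * u
    M += dx * dx / S       # (dy_k/da)^2 / S, since y_k = x_k here

assert M > 0.0
```

The point of the computation is that M is available before any data are collected: it depends only on the input sequence and the sampling interval, which is what makes pre-experiment design possible.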
Summing up, equation (4.34) provides a means for computing the information matrix before the experiment is performed; that is, it is possible to evaluate a scalar function φ of the information matrix corresponding to a particular set of experimental conditions. It follows from the development of equation (4.34) that M, and hence φ, depend on the experimental conditions, namely the input sequence (u_0, ..., u_{N−1}), the sampling intervals (Δ_0, ..., Δ_{N−1}) and the pre-sampling filter (equations (4.5) and (4.6)). Hence, it would be possible, at least in principle, to optimize φ with respect to these design variables. Suitable algorithms have been developed for this type of optimization [50], but they are extremely time consuming except for
simple situations and unfortunately do not give a great deal of insight into the problem.

In the next section a suboptimal design scheme will be described which allows the theory developed in this section to be applied to practical systems. In subsequent sections it will be shown that, with certain constraints on the design, alternative simple design methods exist. These alternative methods give further insight into the general sampling problem.

5. SEQUENTIAL DESIGN OF NONUNIFORM SAMPLING INTERVALS

This section describes a suboptimal design procedure for
sequential determination of the input and sampling intervals. It follows from equation (4.34) that the information matrix is given by the following recursion:

    M_{k+1} = M_k + Γ̄_{k+1} ,    M_0 = 0                        (5.1)

where Γ̄_{k+1} is the information increment resulting from the (k+1)-th sample and has ij-th element

    [Γ̄_{k+1}]_ij = ẑ_{k+1}ᵀ C̃_iᵀ S_{k+1}^{−1} C̃_j ẑ_{k+1} + 1/2 Trace[ S_{k+1}^{−1} (∂S_{k+1}/∂β_i) S_{k+1}^{−1} (∂S_{k+1}/∂β_j) ] + Trace[ C̃_iᵀ S_{k+1}^{−1} C̃_j T_{k+1} ]      (5.2)

Using (4.35), equation (5.2) becomes:

    [Γ̄_{k+1}]_ij = (F_k ẑ_k + G_k u_k)ᵀ C̃_iᵀ S_{k+1}^{−1} C̃_j (F_k ẑ_k + G_k u_k) + 1/2 Trace[ S_{k+1}^{−1} (∂S_{k+1}/∂β_i) S_{k+1}^{−1} (∂S_{k+1}/∂β_j) ] + Trace[ C̃_iᵀ S_{k+1}^{−1} C̃_j T_{k+1} ]      (5.3)
Considering the situation at time k, if it is assumed that the past inputs and sampling intervals have been specified, then
it follows from equations (5.1) and (5.3) that M_{k+1} is a function of only the two variables u_k and Δ_k. Hence it is a simple optimization problem to choose u_k and Δ_k to optimize the criterion φ(M_{k+1}). Then, moving on to time (k+1), the procedure can be repeated to choose u_{k+1} and Δ_{k+1}, etc.

This procedure is analogous to one-step-ahead control algorithms such as minimum variance control and will not generally be globally optimal in the sense that φ(M_N) is optimized with respect to u_0, u_1, ..., u_{N−1} and Δ_0, Δ_1, ..., Δ_{N−1}. The procedure described above is a generalization to the sampling problem of the sequential test signal design procedures originally described by Arimoto and Kimura [16] and by Keviczky [28], [29].

The sequential design algorithm is now illustrated by a simple example. Consider the first order system [46]:
    ẋ = ax + bu                                                 (5.4)

    y = x + n                                                   (5.5)

where the noise, n(t), is assumed to have wide bandwidth (approximately "white"). Equation (5.4) can be discretized as follows:

    x_{k+1} = α_k x_k + β_k u_k                                 (5.6)

    y_k = x_k + n_k                                             (5.7)

where α_k = exp(aΔ_k), β_k = (b/a)(α_k − 1), and n_k has zero mean and approximately constant variance. (This assumption will be valid provided the smallest sampling interval allowed is much greater than the reciprocal of the noise bandwidth; this point will be taken up again later in this section.) For this system, equation (5.1) becomes
    M_{k+1} = M_k + (1/S) Ĥ ẑ_{k+1} ẑ_{k+1}ᵀ Ĥᵀ                 (5.8)

where S is the (approximately constant) variance of n_k and

    Ĥ = [ 0  1  0
          0  0  1 ]                                             (5.9)

selects the sensitivities of the predicted output with respect to a and b. The composite sensitivity state

    ẑ_k = [ x̂_k,  ∂x̂_k/∂a,  ∂x̂_k/∂b ]ᵀ                          (5.10)

evolves as

    ẑ_{k+1} = [ α_k       0     0
                Δ_k α_k   α_k   0
                0         0     α_k ] ẑ_k + [ β_k
                                              ∂β_k/∂a
                                              ∂β_k/∂b ] u_k     (5.11)
Here the optimality criterion is taken to be Det(M_{k+1}^{−1}) or, equivalently, −Det(M_{k+1}). Hence, using equation (5.8):

    Det(M_{k+1}) = Det(M_k) Det[ I + (1/S) M_k^{−1} Ĥ ẑ_{k+1} ẑ_{k+1}ᵀ Ĥᵀ ]
                 = Det(M_k) ( 1 + (1/S) ẑ_{k+1}ᵀ Ĥᵀ M_k^{−1} Ĥ ẑ_{k+1} )
                 = Det(M_k) ( 1 + (1/S) (F_k ẑ_k + G_k u_k)ᵀ Ĥᵀ M_k^{−1} Ĥ (F_k ẑ_k + G_k u_k) )      (5.12)

where F_k and G_k denote the matrices in (5.11). Now, assuming that u_k is constrained in amplitude and lies in the interval −1 ≤ u_k ≤ 1, it can be seen from equation (5.12) that the optimality criterion is minimized with respect to u_k if u*_k is chosen as

    u*_k = Sign( G_kᵀ Ĥᵀ M_k^{−1} Ĥ F_k ẑ_k )                   (5.13)

Substituting (5.13) back into (5.12) yields

    Det(M_{k+1}) = Det(M_k) ( 1 + (1/S) (F_k ẑ_k + G_k u*_k)ᵀ Ĥᵀ M_k^{−1} Ĥ (F_k ẑ_k + G_k u*_k) )      (5.14)
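The one-step-ahead procedure can be sketched as follows for a scalar-parameter variant of the example (estimating a only, so the information matrix is scalar and the determinant update is immediate). All numbers are illustrative, and a grid search over Δ_k stands in for the one dimensional optimization:

```python
import math

# Sequential one-step-ahead choice of u_k in {-1, +1} and Delta_k from a
# grid, maximizing the scalar information increment for the parameter a
# in dx = a x dt + b u dt, y_k = x_k + n_k, Var(n_k) = S.
a, b, S = -1.0, 1.0, 0.1
deltas = [0.05 * j for j in range(1, 41)]    # candidate intervals (0.05..2.0)

def step(x, dx, u, dt):
    # propagate the predicted state and its sensitivity dx/da
    alpha = math.exp(a * dt)
    beta = b * (alpha - 1.0) / a
    dalpha = dt * alpha
    dbeta = b * (dalpha * a - (alpha - 1.0)) / a ** 2
    return alpha * x + beta * u, alpha * dx + dalpha * x + dbeta * u

x, dx, M = 0.0, 0.0, 0.0
schedule = []
for k in range(25):
    best = None
    for u in (-1.0, 1.0):
        for dt in deltas:
            _, dx_new = step(x, dx, u, dt)
            gain = dx_new * dx_new / S       # information increment, cf. (5.1)
            if best is None or gain > best[0]:
                best = (gain, u, dt)
    gain, u, dt = best
    x, dx = step(x, dx, u, dt)
    M += gain
    schedule.append((u, dt))

assert M > 0.0
assert all(0.05 <= dt <= 2.0 for _, dt in schedule)
```

As in the chapter's two-parameter example, coupling the choice of u_k and Δ_k generally yields more information per sample than optimizing the input alone with a fixed interval.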
A one dimensional optimization can now be used to find the best value for Δ_k. This value can then be substituted into (5.13) to give u*_k. Results of this optimization are shown in Figure 5.1. It can be seen from Figure 5.1 that, for this example, a significant improvement in information occurs if the input is sequentially optimized with fixed sampling intervals. However, a further improvement is achieved by a coupled design of both the input and the sampling intervals. Of course the design algorithm is only optimal in the one-step-ahead sense, and presumably further improvements could be obtained by extending the optimization horizon, but at the cost of greatly increased computational complexity.

It should be noted that, in this example, the sampling was performed without the inclusion of a presampling filter. However, since the noise had very wide bandwidth, drastic improvements in estimation accuracy could have been achieved by the simple expedient of including a low pass filter prior to sampling. The improvement in accuracy will be of the order of the ratio of the noise bandwidth to the system bandwidth, provided that the filter is chosen appropriately. (The variance of the discrete noise on the samples is roughly proportional to the noise bandwidth.) This point will be discussed further for the case of uniform sampling intervals in the next section.

6. UNIFORM SAMPLING INTERVAL DESIGN

In this section the case of uniform sampling interval design
is considered. For this case it will be shown that it is a simple matter to simultaneously find optimal test signals, pre-sampling filters and sampling rates.

For the purposes of the current discussion, consider the following model of a constant coefficient linear single input-multiple output system with coloured measurement noise:

    ẋ(t) = Ax(t) + Bu(t)                                        (6.1)
[Figure 5.1: Comparison of sequential designs. Log det M_k^{−1} is plotted against the number of samples k. Legend: A — square wave input, Δ_k = 0.2; B — sequentially optimized input, Δ_k = 0.2; C — sequentially optimized input and sampling intervals.]
    y(t) = Cx(t) + Du(t)                                        (6.2)

where t ∈ T = [0, t_f]; x: T → R^n is the state vector, u: T → R^1 is the input vector, y: T → R^m is the output vector and w: T → R^m is the measurement noise vector, assumed to have a zero mean Gaussian distribution with power density spectrum Φ(f) of full rank for all f ∈ (−∞, ∞).
For large experiment time, t_f, it is convenient to transform the model equations (6.1), (6.2) to the frequency domain:

    y(s) = T(s)u(s) + w(s) ;   s = j2πf                         (6.3)

where

    T(s) = C(sI − A)^{−1}B + D                                  (6.4)

Fisher's information matrix for the parameters, β, in A, B, C and D (assuming for the moment continuous observations over [0, t_f]) is given by:

    M = t_f M̄_β                                                 (6.5)

where M̄_β is the average information matrix per unit time, with ik-th element given by:

    [M̄_β]_ik = ∫_{−∞}^{∞} [∂T(−j2πf)/∂β_i]ᵀ Φ^{−1}(f) [∂T(j2πf)/∂β_k] dξ(f)      (6.6)

where ξ(f) is the input spectral distribution function [55], [56] satisfying:

    ∫_{−∞}^{∞} dξ(f) = 1                                        (6.7)

(Note: The input power has been constrained to be unity.) Now, introducing the matrix Q having i-th column given by

    Q_i(s) = ∂T(s)/∂β_i                                         (6.8)
the average information matrix may be written as:

    M̄_β = ∫_{−∞}^{∞} Qᵀ(−j2πf) Φ^{−1}(f) Q(j2πf) dξ(f)          (6.9)

Assuming that the test signal spectrum is band limited to [−f_h, f_h], equation (6.9) may equivalently be written as:

    M̄_β = ∫_{−f_h}^{f_h} Qᵀ(−j2πf) Φ^{−1}(f) Q(j2πf) dξ(f)      (6.10)

(Note: The assumption of a band limited test signal will later be shown to be consistent with the conditions for optimality.)

The effect of sampling on the average information matrix is now investigated. Suppose that the output, y(t), t ∈ [0, t_f], is sampled at greater than the Nyquist rate [57] (that is, the sampling frequency f_s is greater than 2f_h). This form of sampler does not distort that part of the spectrum of y arising from the input. The part of the output spectrum arising from the noise will, however, be distorted due to aliasing. This distortion will result in an "information" loss, as indicated by the following expression for the average information matrix (per unit time) from the sampled observations:

    M̄_β^s = ∫_{−f_h}^{f_h} Qᵀ(−j2πf) Φ_s^{−1}(f) Q(j2πf) dξ(f)  (6.11)

where Φ_s(f) is the aliased noise spectrum given by [58]:

    Φ_s(f) = Φ(f) + Σ_{k=1}^{∞} [ Φ(kf_s + f) + Φ(kf_s − f) ] ,   f ∈ (−f_s/2, f_s/2)      (6.12)

The fact that M̄_β − M̄_β^s is nonnegative definite follows from equation (6.12), which indicates that Φ_s(f) − Φ(f) is nonnegative definite, and hence that Φ^{−1}(f) − Φ_s^{−1}(f)
is nonnegative definite. It follows that, for all optimality criteria φ satisfying (2.4), φ(M̄_β) ≤ φ(M̄_β^s); that is, the experiment with sampled data cannot be better than the corresponding experiment with continuous observations. This result is, of course, quite expected. However, it will now be shown that, by the inclusion of an optimal presampling filter, equality can be achieved.

THEOREM 6.1: With sampling rate f_s satisfying f_s > 2f_h, a presampling filter having transfer function F with the property

    |F(j2πf)| = 0 for all f ∈ (−∞, −f_s/2] ∪ [f_s/2, ∞), and F(j2πf) invertible for all f ∈ (−f_s/2, f_s/2) to which ξ assigns nonzero measure      (6.14)

gives the same value for φ using the sampled data as with continuous observations.
Proof. The inclusion of a presampling filter having transfer function F(s) modifies the expression for the average information matrix (6.11) to:

    M̄_β^{s,F} = ∫_{−f_h}^{f_h} Qᵀ(−j2πf) Fᵀ(−j2πf) Φ_{F,s}^{−1}(f) F(j2πf) Q(j2πf) dξ(f)      (6.15)

where Φ_{F,s}(f) is the aliased version (in the sense of (6.12)) of the filtered noise spectrum Φ_F(f), and Φ_F(f) is given by:

    Φ_F(f) = F(j2πf) Φ(f) Fᵀ(−j2πf)                             (6.17)

Now, for a presampling filter satisfying (6.14), Φ_{F,s}(f) reduces to Φ_F(f), as is readily verified. Furthermore, substituting (6.17) into (6.15) yields

    M̄_β^{s,F} = ∫_{−f_h}^{f_h} Qᵀ(−j2πf) Φ^{−1}(f) Q(j2πf) dξ(f)      (6.18)

and comparison of (6.18) with (6.10) yields:

    M̄_β^{s,F} = M̄_β                                             (6.19)

This completes the proof of the theorem. ∎

The significance of (6.19) is that the specified presampling filter, F, eliminates the information loss due to sampling. This form of filter is often called an aliasing filter [58].

So far, only the average information matrix per unit time
has been considered. However, if the total number of samples is constrained to be N, say, then it is more appropriate to consider the average information matrix per sample. If the sampling rate is f_s samples per second, then the average information matrix per sample is given by:

    M̂_β = (1/f_s) M̄_β^s                                         (6.20)

where M̄_β^s is the average information matrix per unit time corresponding to an input spectrum having highest frequency component f_h and samples collected at f_s > 2f_h. Furthermore, inclusion of an aliasing filter ensures that M̄_β^s in (6.20) can be replaced by M̄_β:

    M̂_β = (1/f_s) M̄_β                                           (6.21)
It would now be possible, at least in principle, to choose the test signal spectrum (or equivalently the spectral
Choice of Sampling Intervals
distribution function <(f) , f € [-fh, f,])
271
to minimize some
scalar function I$ of, the average information matrix per unit
-
time, M
It is obvious that the tighter the constraint on the
8'
allowable input spectrum (i.e. the smaller fh)
4.
be the "cost"
the greater will
Also from (6.21) and the property (2.4) of 4 ,
it can be seen that
is made small by increasing
$(fig)
fS'
fs > 2fh, a compromise will be reached between
However, since
4 due to the decrease in fs and the
the improvement in
4 due to the tightening of the constraint on
degradation in
the allowable input spectrum. The optimization problem outlined above is greatly simplified by the observation that only a finite number of frequency components in the input spectrum need to be considered. This result is established in the following theorem: THEOREM 6.2: function
,
<,(f)
For a n y t e s t s i g n a l s p e c t r a l d i s t r i b u t i o n
f 6 [-fh, fh] w i t h c o r r e s p o n d i n g a v e r a g e i n -
fig(<,)
formation m a t r i x p e r sample,
f
€
[-fh, fh] ,
l i n e s and
Rg(<,)
Proof.
,
there i s a l w a y s a
the s p e c t r u m o f w h i c h c o n t a i n s a t most =
fig(<,)
where
R (5 B
2
)
<,(f)
,
p(p+1)/2+1
corresponds to
5,.
It follows from equations (6.21), (6.7) and (6.9)
that the set of all possible average information matrices per sample arising from power constrained inputs is the convex hull of the set of average information matrices per sample corresponding to inputs containing a single frequency component.
Since
any information matrix is symmetric, it can be represented as an element in a
p(p+1)/2
dimensional space and hence it follows
from a theorem due to Caratheodory [6] that there exist k = 1,.
..,R;
R
=
p(p+1)/2
B8K1) where (from (6.9))
:
=
-
+
Xk, fkl
1 such that
R
(6.22)
G . C. Goodwin and R . L. Payne
2 72
and
II
EXk =
k=l Defining
c2(f)
(6.24)
1
t o be t h e s p e c t r a l d i s t r i b u t i o n f u n c t i o n which
Xk
a s s i g n s measure
to
fk,
. ., p ( p + 1 ) / 2 +
k = 1,.
1
and zero
otherwise completes t h e proof of t h e theorem.. I n p a r t i c u l a r , t h e r e e x i s t s a $-optimal t e s t s i g n a l spectrum containing n o t more than
p(p+1)/2
+
1 lines.
problem now reduces t o minimization of fsr
fp(p+l)/2+1'
flr--.r
2fk < f s ,
s t r a i n t that
Al""Jp(p+l)/2
$(aB )
The o p t i m i z a t i o n with r e s p e c t t o
s u b j e c t t o the con-
k = 1,. . . , p (p+1)2 + 1. Obviously with
t h e problem a s s t a t e d above, t h e optimal value of
...,
fs
w i l l be
= max(fl, f In practice t h e p ( p + l )/ 2 + d m a l i a s i n g f i l t e r w i l l n o t be i d e a l and so, t o avoid a l i a s i n g ,
2fm where
f
i s taken t o be
2 ( 1 + € ) f m where
E
i s related t o the cutoff
c h a r a c t e r i s t i c s of t h e f i l t e r and should be small.
fS
F i n a l l y , from
(6.22) t h e experiment design may be summarized a s :
(6.25)
(6.26) (6.27)
R
CA,=l
(6.28)
k=l
fm = and
$*
max(f l,...,
fR)
(6.29)
i s t h e optimal v a l u e f o r t h e design c r i t e r i o n f u n c t i o n .
2 73
Choice of Sampling Intervals
The quantities A,,
A2,.-., have the physical interpret-
ation of being the fraction of the total power at the corresponding frequencies fl, f2,...,
in the test signal spectrum.
The
dimension of the problem is reduced even further for single input systems for which it has been proven that only p+l
lines are
required in the test signal spectrum [44]. Furthermore, for the -1 usual design criteria, e.g. det(M-l) or trace(M ) I it can be shown that the optimal information matrix is a boundary point of the convex hull, [6], and hence the maximum number of lines required is reduced to p(p+1)/2
in general and
p
for single
input-single output systems. Although the analysis in this section has been restricted to the single input case, it is a relatively simple matter to extend the results to the multiple input case.
Details of the
appropriate background theory may be found in references [401 and 1431. The above development has assumed that a single constant sampling Kate will be used for the complete experiment. However it will be clear from equation (6.25) that it is preferable, if possible, to divide the experiment into a number of subexperiments each having a single sinewave input and each having a different aliasing filter and different Nyquist sampling rate. Equation (6.25) then becomes:
(6.30) fll.. . I
fR
The use of more than one sampling rate is often advantageous especially where there is a wide spread in time constants in the system.
Further discussion on this point may be found in [601.
In the next section, the theory developed above will be applied to a number of examples leading to analytic results.
274
7.
G. C.Coodwn and R . L. Payne
ANALYTIC DESIGNS FOR UNIFORM SAMPLING INTERVALS 7.1
One Parameter Model
Consider a first order system described by the following model : (7.1 -1) where
u(s)
, y(s)
are the Laplace Transforms of the input and
output respectively and
e(s)
represents coloured measurement
noise having spectral density: $(f)
=
1 2 2
w
;
a w +1
Here the only parameter of interest is T
= 2llf
(7.l.2)
the system time con-
stant, and hence a suitable design criterion is simply -1 det(M ) = 1/M where M is the (1x1) information matrix. Following the discussion of Theorem 6.2, an optimal one line input spectrum can be found (since p = 1 for this example). The information matrix per unit time corresponding to a single frequency input at f = w / 2 ~ is given by:
M
=
2
2 2 2 (a w +l)w 2 2 2 (T w +1)
(7.l. 3 )
Hence it follows that, for continuous observations, the input which minimizes det(k-’)
-
w
=
2 (T
-
has frequency
-1/2 2a)
That is, if the noise is wide band
if
2a2 < T2
(a + 0)
given by (7.1.4)
then the following
appealing result is obtained:
-
w = -1 T
(7.1.6)
2 75
Choice of Sampling Interwrlr
The above results apply only to the case where there are no restrictions on the allowable number of samples (i.e., continuous observations are possible).
The case where the total number of
samples is fixed will now be considered. Assuming the optimal aliasing filter has been incorporated, the average information matrix per sample, (6.251, is: 2 2 2r(a w +l)w 2 2 2 (1+E)(T w +1)
m = where
E
(7.1.7)
is the aliasing filter constant.
put frequency G
which minimizes
det(fi-’)
From (7.1.7) the insatisfies the
equation: 2 2-4 2 a - c w +3(a
which for wide band noise
13
Note that G
=
+ 1
=
(7.1.8)
0
(a -+ 0) yields:
-+
-
1
(7.1.9)
-
(sampled observations) is less than w
observations) as expected. is fs -
7.2
2 -2
- T ) W
(continuous
The sampling rate corresponding to 3
1+E w Tr
First Order Two Parameter Model
The system of (7.1) is generalized to include a gain parameter: K TS+l
= -
where
u(s), y(s)
and
e(s)
u(s) + e(s)
(7.2.1)
are as for subsection 7.1.
From equation (6.23) the average information matrix per unit time corresponding to a single line is:
2 76
G . C . Goodwin and R . L. Payne
I where
T
B
1 2 2 2 (T w +1)
(T2w 2+1)
(7.2.2)
= (K,T).
For the determinant optimality criterion, it would appear from the discussion in Section 6, that the optimal input spectrum will contain two lines since p = 2.
However, it can be
shown that, in fact, one line is sufficient for this problem. The verification of this result is straightforward depending on the detailed structure of the convex hull and is left as an exercise for the reader. Thus, for continuous observations, the optimal input --1 frequency, w, minimizing det(M ) satisfies: 2 2-4
- a T w +3(a which for wide band noise
2
2 -2
- T ) W
(a
-+
+ 1
=
(7.2.3)
0
0) yields: (7.2.4)
The variation of
det(fi
B
)
with
w
is shown in Fig. ( 7 . 1 ) .
Now considering the case where the total number of samples is fixed, the average information matrix per sample is - given by: 2 -KTU 2 2 2 (T w +1) 2 2 2 r ( a w +1)
W(l+E)
(7.2.5) 2
-KTW 2 2 2 (T w +1)
K
2
2
W 2 2 2 (T w +1)
Hence the optimal input frequency with a fixed number of samples is

    ω̄ = (1/T² − 2/a²)^{1/2}        (7.2.6)

    ω̄ = 0   if   a ≤ √2 T        (7.2.7)

[Figure 7.1 — Variation of Information Matrix with Input Frequency for First Order System with White Output Noise and Fixed Experiment Time.]

The corresponding optimal sampling frequency is

    f_s = ((1 + ε)/π) ω̄        (7.2.8)
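With a fixed number of samples, the determinant criterion reduces (up to constant factors) to maximizing (a²ω² + 1)/(T²ω² + 1)² over ω, whose stationary point gives ω̄² = 1/T² − 2/a² when a > √2 T and ω̄ = 0 otherwise. The following sketch checks this numerically (the reduced objective is an assumption of this sketch, distilled from (7.2.5); Python is not part of the original text):

```python
import math

def objective(w, a, T):
    # monotone transform of the per-sample determinant criterion
    return (a * a * w * w + 1.0) / (T * T * w * w + 1.0) ** 2

def w_opt(a, T):
    # stationary point: w^2 = 1/T^2 - 2/a^2, valid only when a > sqrt(2) T
    x = 1.0 / (T * T) - 2.0 / (a * a)
    return math.sqrt(x) if x > 0.0 else 0.0

a, T = 3.0, 1.0
grid = [i * 1e-4 for i in range(50001)]  # w in [0, 5]
w_grid = max(grid, key=lambda w: objective(w, a, T))
print(w_grid, w_opt(a, T))  # the two agree closely

# wide band noise (a small) gives w_opt = 0: "apply a low frequency
# input and sample slowly"
print(w_opt(0.5, 1.0))  # 0.0
```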
The result in (7.2.7) for wide band noise is, of course, impractical but can correctly be interpreted as meaning "apply a low frequency input and sample slowly."
Just how slowly will depend on the total allowable experiment time.

7.3  Second Order Two Parameter Model
Consider a second order system having undamped natural frequency ω₀ and damping ratio ζ:

    y(s) = [ω₀²/(s² + 2ζω₀s + ω₀²)] u(s) + n(s)        (7.3.1)

where u(s), y(s) are the Laplace transforms of the input and output respectively and n(s) denotes wide band noise.
Again using equation (6.23), the average information matrix per unit time corresponding to an input having a single frequency component is given by (7.3.3), where βᵀ = [ω₀, ζ].
For this problem it can again be shown that the optimal input spectra need only contain one line for the determinant criterion. The optimal value of the input frequency with continuous observations is:

    ω̄ = (ω₀/√5) {(1 − 2ζ²) + [(1 − 2ζ²)² + 15]^{1/2}}^{1/2}        (7.3.4)

The variation of ω̄ with the damping ratio ζ is shown in Figure 7.3.1. It can be seen that the optimal input frequency drops as the damping increases. This result might have been expected from intuitive reasoning.

Proceeding as in the previous examples, the optimal input frequency, ω̄, with sampled observations is
    ω̄ = (ω₀/√3) {(1 − 2ζ²) + [(1 − 2ζ²)² + 3]^{1/2}}^{1/2}        (7.3.5)
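The continuous and sampled optima for the second order model can be compared numerically. This sketch (Python, not in the original text; the white-noise determinant ω⁶/|D(jω)|⁸ used for the brute-force check, with D(s) the characteristic polynomial, is an assumption of this sketch) evaluates (7.3.4) and (7.3.5) and verifies both the ordering and the continuous formula:

```python
import math

def w_cont(zeta, w0=1.0):
    u = 1.0 - 2.0 * zeta * zeta
    return w0 * math.sqrt((u + math.sqrt(u * u + 15.0)) / 5.0)  # eq. (7.3.4)

def w_samp(zeta, w0=1.0):
    u = 1.0 - 2.0 * zeta * zeta
    return w0 * math.sqrt((u + math.sqrt(u * u + 3.0)) / 3.0)   # eq. (7.3.5)

# sampled optimum never exceeds the continuous one; equal at zeta = 0
for z in [0.0, 0.2, 0.5, 0.7, 1.0]:
    assert w_samp(z) <= w_cont(z) + 1e-12

# brute-force check of the continuous formula against det ~ w^6/|D(jw)|^8
# for w0 = 1 and zeta = 0.5
zeta = 0.5
det = lambda w: w ** 6 / (((1.0 - w * w) ** 2 + (2.0 * zeta * w) ** 2) ** 4)
w_star = max((i * 1e-3 for i in range(1, 3001)), key=det)
print(w_star, w_cont(zeta))  # the two agree closely
```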
The corresponding optimal sampling rate is given by (7.3.6). The variation of ω̄ with the damping ratio is also shown in Figure 7.3.1. It can be seen from the figure that the optimal input frequency with sampled observations is always less than or equal to the optimal input frequency with continuous observations (equality occurs only when the system is marginally stable). This is in line with the general observations made in Section 6.

[Figure 7.3.1 — Variation of Optimal Input Frequency versus Damping Ratio for Second Order System with "White" Output Noise. Legend: A — ω̄, continuous observations; B — ω̄, sampled observations.]

8.  CONCLUSIONS

This chapter has considered the problem of design of experiments for dynamic system identification. The main emphasis has been on the design of the sampling system, namely the choice of presampling filters and sampling intervals. It has been demonstrated that, in order to achieve maximal return from an experiment, a coupled design of the sampling system and test signal is required.

The general problem, in which nonuniform sampling intervals are allowed, has been formulated and expressions developed for the information return from an experiment.
Special identifi-
cation algorithms are, of course, required to analyze data collected at nonuniform sampling intervals but an appropriate maximum likelihood algorithm follows from the expressions developed for the likelihood function in the text.
The general experiment design problem with nonuniform sampling requires a very sophisticated optimization to be carried out.
To simplify
the computations a suboptimal one-step-ahead design algorithm has been described.
The algorithm would be suitable for on-line
experiment design.
Examples have been presented which show that
significant improvements in identification accuracy can be achieved with this simple optimization procedure.
Presumably,
further improvements could be achieved by extending the optimization horizon but the complexity of the design would be very greatly increased.
For the case of uniform sampling intervals
and power constrained test signals it has been shown that the experiment design problem has a simple interpretation in the frequency domain.
The usual aliasing filter is optimal in a well
defined sense and the optimal input spectra need only contain a finite number of lines.
These facts lead to an optimization in a
space of low dimension.
Analytic designs are possible for simple
cases. The final choice between the different design methods outlined in this chapter (and other methods also) depends upon the nature of the constraints on the allowable experimental conditions.
Care must be exercised in stating the constraints as
otherwise impractical or trivial designs can result.
However, with judicious choice of the constraints, significant improvements in estimation accuracy can be achieved from relatively simple experiment design procedures similar to those outlined
in this chapter.

It has been the authors' intention that this chapter should not only provide methods for experiment design but, perhaps more importantly, also provide insight into the ramifications of the information structure in identification problems.

Acknowledgments
The work reported in this chapter grew largely from joint work of the authors whilst they were in the Department of Computing and Control at Imperial College.
The authors wish to
acknowledge many helpful suggestions made by the staff and students of Imperial College.
Special credit is due to Professor
David Mayne who was a constant source of inspiration and to Martin Zarrop who originally developed some of the techniques described in Sections 4 and 5 as part of his doctoral research.

REFERENCES

The References have been collected under the following headings for the reader's convenience:

1. Experiment Design for Non-Dynamic Systems
2. Test Signal Design for Dynamic System Identification
3. Design of Sampling Intervals for Dynamic System Identification
4. Mathematical Background

1.  EXPERIMENT DESIGN FOR NON-DYNAMIC SYSTEMS

1. Cox, D. R., Planning of Experiments, Wiley, 1958.

2. Davies, O. L., Design and Analysis of Industrial Experiments, Oliver and Boyd, 1954.

3. Kempthorne, O., Design and Analysis of Experiments, Wiley, New York, 1952.
4. Kiefer, J. and J. Wolfowitz, "The Equivalence of Two Extremum Problems," Canadian Journal of Maths., Vol. 12, 1966, pp. 363-366.

5. Karlin, S. and W. J. Studden, "Optimal Experimental Designs," Annals of Math. Stat., Vol. 37, 1966, pp. 783-815.

6. Federov, V. V., Theory of Optimal Experiments, Academic Press, New York and London, 1971.

7. Whittle, P., "Some General Points in the Theory of Optimal Experimental Design," J. R. Statist. Soc., B, 35, No. 1, 1973, pp. 123-130.

8. Wynn, H. P., "Results in the Theory and Construction of D-Optimum Experimental Designs," J. R. Statist. Soc., B, 34, No. 2, 1972, pp. 133-147.
2.  TEST SIGNAL DESIGN FOR DYNAMIC SYSTEM IDENTIFICATION

9. Levin, M. J., "Optimal Estimation of Impulse Response in the Presence of Noise," IRE Trans. on Circuit Theory, Vol. CT-7, March 1960, pp. 50-56.

10. Litman, S. and W. H. Huggins, "Growing Exponentials as a Probing Signal for System Identification," Proc. IEEE, Vol. 51, June 1963, pp. 917-923.

11. Levadi, V. S., "Design of Input Signals for Parameter Estimation," IEEE Trans. Automatic Control, Vol. AC-11, No. 2, April 1966, pp. 205-211.

12. Gagliardi, R. M., "Input Selection for Parameter Identification in Discrete Systems," IEEE Trans. Automatic Control, Vol. AC-12, No. 5, October 1967.

13. Goodwin, G. C. and R. L. Payne, "Design and Characterization of Optimal Test Signals for Linear Single Input-Single Output Parameter Estimation," Paper TT-1, 3rd IFAC Symposium, The Hague/Delft, June 1973.

14. Goodwin, G. C., R. L. Payne and J. C. Murdoch, "Optimal Test Signal Design for Linear Single Input-Single Output Closed Loop Identification," CACSD Conference, Cambridge, 2-4th April 1973.

15. Goodwin, G. C., J. C. Murdoch and R. L. Payne, "Optimal Test Signal Design for Linear Single Input-Single Output System Identification," Int. J. Control, Vol. 17, No. 1, 1973, pp. 45-55.
16. Arimoto, S. and H. Kimura, "Optimal Input Test Signals for System Identification - An Information Theoretic Approach," Int. J. Systems Science, Vol. 1, No. 3, 1971, pp. 279-290.

17. Mehra, R. K., "Optimal Inputs for System Identification," IEEE Trans. Automatic Control, Vol. AC-19, June 1974, pp. 192-200.
18. Smith, P. L., "Test Input Evaluation for Optimal Adaptive Filtering," Preprints 3rd IFAC Symposium, The Hague/Delft, Paper TT-5, 1973.

19. Box, G. E. P. and G. M. Jenkins, Time Series Analysis Forecasting and Control, Holden Day, San Francisco, 1970, pp. 416-420.

20. Nahi, N. E. and D. E. Wallis, Jr., "Optimal Inputs for Parameter Estimation in Dynamic Systems with White Observation Noise," Paper IV-A5, Proc. JACC, Boulder, Colorado, 1969, pp. 506-512.

21. Inoue, K., K. Ogino and Y. Sawaragi, "Sensitivity Synthesis of Optimal Inputs for Parameter Identification," Paper 9-7, IFAC Symposium, Prague, June 1970.

22. Rault, A., R. Pouliquen and J. Richalet, "Sensitivizing Inputs and Identification," Preprints 4th IFAC Congress, Warsaw, Paper 26.2, 1969.

23. Van den Bos, A., "Selection of Periodic Test Signals for Estimation of Linear System Dynamics," Paper TT-3, Preprints of 3rd IFAC Symposium, The Hague/Delft, 1973.

24. Goodwin, G. C., "Optimal Input Signals for Nonlinear System Identification," Proc. IEE, Vol. 118, No. 7, 1971, pp. 922-926.

25. Goodwin, G. C., "Input Synthesis for Minimum Covariance State and Parameter Estimation," Electronics Letters, Vol. 5, No. 21, 1969.

26. Reid, D. B., "Optimal Inputs for System Identification," Stanford University Rpt. No. SUDAAR 440, May 1972.

27. Sawaragi, Y. and K. Ogino, "Game Theoretic Design of Input Signals for Parameter Identification," Paper TT-6, Preprints 3rd IFAC Symposium, The Hague/Delft, June 1973.
28. Keviczky, L. and C. S. Banyasz, "On Input Signal Synthesis for Linear Discrete-Time Systems," Paper TT-2, Preprints 3rd IFAC Symposium, The Hague/Delft, June 1973.

29. Keviczky, L., "On Some Questions of Input Signal Synthesis," Report 7226(B), Lund Institute of Technology, Division of Automatic Control, November 1972.

30. Aoki, M. and R. M. Staley, "On Approximate Input Signal Synthesis in Plant Parameter Identification," presented at 1st Hawaii International Conference on System Sciences, University of Hawaii, January 1968.

31. Aoki, M. and R. M. Staley, "Some Computational Considerations in Input Signal Synthesis Problems," 2nd International Conference on Computing Methods in Optimization Problems, sponsored by SIAM, San Remo, Italy, September 1968.

32. Aoki, M. and R. M. Staley, "On Input Signal Synthesis in Parameter Identification," Automatica, Vol. 6, 1970, pp. 431-440. Originally presented at 4th IFAC Congress, Warsaw, June 1969.

33. Nahi, N. E. and G. A. Napjus, "Design of Optimal Probing Signals for Vector Parameter Estimation," Paper W9-5, Preprints IEEE Conference on Decision and Control, Miami, December 1971.

34. Napjus, G. A., "Design of Optimal Inputs for Parameter Estimation," Ph.D. Dissertation, University of Southern California, August 1971.

35. Tse, E., "Information Matrix and Local Identifiability of Parameters," Paper 20-3, JACC, Columbus, Ohio, June 1973.

36. Minnich, G. M., "Some Considerations in the Choice of Experimental Designs for Dynamic Models," Ph.D. Dissertation, University of Wisconsin, 1972.

37. Kalaba, R. E. and I. Spingarn, "Optimal Inputs and Sensitivities for Parameter Estimation," JOTA, 11, 1, 1973, pp. 56-67.

38. Payne, R. L. and G. C. Goodwin, "A Bayesian Approach to Experiment Design with Applications to Linear Multivariable Dynamic Systems," IMA Conference on Computational Problems in Statistics, Univ. of Essex, July 1973.

39. Mehra, R. K., "Frequency Domain Synthesis of Optimal Inputs for Parameter Estimation," Techn. Report No. 645, Div. of Eng. and Appl. Physics, Harvard University, July 1973.
40. Mehra, R. K., "Synthesis of Optimal Inputs for Multiinput-Multioutput (MIMO) Systems with Process Noise," Techn. Report No. 649, Div. of Eng. and Appl. Physics, Harvard University, February 1974.

41. Van den Bos, A., "Construction of Binary Multifrequency Test Signals," Paper 4-6, IFAC Symposium on Identification, Prague, June 1967.

42. Mehra, R. K., "Optimal Measurement Policies and Sensor Designs for State and Parameter Estimation," Milwaukee Symposium on Automatic Controls, March 1974.

43. Goodwin, G. C., R. L. Payne and M. B. Zarrop, "Frequency Domain Experiment Design for Multiple Input Systems," Report No. 75/17, Department of Computing and Control, Imperial College, London.

44. Payne, R. L. and G. C. Goodwin, "Simplification of Frequency Domain Experiment Design for Single Input - Single Output Systems," Report No. 74/3, Department of Computing and Control, Imperial College, London.

45. Viort, B., "D-Optimal Designs for Dynamic Models - Part I - Theory," Techn. Report 314, Department of Statistics, University of Wisconsin, October 1972.

3.  DESIGN OF SAMPLING INTERVALS FOR DYNAMIC SYSTEM IDENTIFICATION

46. Goodwin, G. C., M. B. Zarrop and R. L. Payne, "Coupled Design of Test Signal, Sampling Intervals and Filters for System Identification," IEEE Trans. AC Special Issue on Identification, December 1974.

47. Payne, R. L., G. C. Goodwin and M. B. Zarrop, "Frequency Domain Approach for Designing Sampling Rates for System Identification," Automatica, March 1975.

48. Astrom, K. J., "On the Choice of Sampling Rates in Parameter Identification of Time Series," Inf. Sci., Vol. 1, 1969, pp. 273-287.

49. Gustavsson, I., "Choice of Sampling Intervals for Parametric Identification," Report 7103, Lund Institute of Technology, 1971.

50. Zarrop, M. B., "Experimental Design for System Identification," M.Sc. Dissertation, Imperial College of Science and Technology, 1973.
4.  MATHEMATICAL BACKGROUND

51. Silvey, S. D., Statistical Inference, Penguin, 1970, pp. 35-44.

52. Kullback, S., Information Theory and Statistics, Dover, New York, 1968.

53. Astrom, K. J., Introduction to Stochastic Control Theory, Academic Press, 1970.

54. Kailath, T., "An Innovation Approach to Least Squares Estimation - Part 1: Linear Filtering in Additive White Noise," IEEE Trans. Automatic Control, Vol. AC-13, No. 6, December 1968.

55. Cox, D. R. and H. D. Miller, The Theory of Stochastic Processes, Chapman Hall, London, 1967.

56. Wong, E., Stochastic Processes in Information and Dynamical Systems, McGraw Hill, 1970.

57. Yen, J. L., "On Nonuniform Sampling of Bandwidth Limited Signals," IRE Trans. Circuit Theory, December 1956, pp. 251-257.

58. Bendat, J. S. and A. G. Piersol, Measurement and Analysis of Random Data, Wiley, New York, 1966.

59. Anderson, B. D. O., K. L. Hitz and N. D. Diem, "Recursive Algorithm for Spectral Factorization," IEEE Trans. Circuits and Systems, Vol. CAS-21, No. 6, 1974.

60. Goodwin, G. C. and R. L. Payne, "Dynamic System Identification: Experiment Design and Data Analysis," book to appear, 1977.