Choice of Sampling Intervals


G. C. Goodwin, Department of Electrical Engineering, University of Newcastle, New South Wales 2308, Australia, and R. L. Payne, Department of Systems and Control, University of New South Wales, New South Wales 2031, Australia

1. INTRODUCTION
2. DESIGN CRITERIA
3. CONSTRAINTS
4. NONUNIFORM SAMPLING THEORY
5. SEQUENTIAL DESIGN OF NONUNIFORM SAMPLING INTERVALS
6. UNIFORM SAMPLING INTERVAL DESIGN
7. ANALYTIC DESIGNS FOR UNIFORM SAMPLING INTERVALS
8. CONCLUSIONS
REFERENCES

1. INTRODUCTION

In most experiments there are a number of variables which can be adjusted, subject to certain constraints, so that the information provided by the experiment is maximized. Experiment design procedures have been described for a number of diverse modelling problems in many fields, including agriculture, economics, social sciences, chemical engineering, industrial engineering and the biological sciences [1] to [3]. The main emphasis to date has been on non-dynamic systems, but recently there has


been growing interest in systems where dynamic effects predominate [9] to [45].

For dynamic systems, experiment design includes the choice of input and measurement ports, test signals, sampling instants and presampling filters.

In general the effects of these design parameters are interrelated [46] and a complete design should be carried out. However, in practice, it is often the case that the appropriate input and measurement ports are predetermined. Furthermore, for the case of continuous observations, or discrete time systems, the problem of the choice of sampling intervals and filters does not arise. In these cases, the complete design reduces to the design of test signals.

One of the earliest publications on optimal test signal design appeared in 1960 [9], the major result being that an input having an impulsive autocorrelation function is optimal for the estimation of the parameters of a moving average process in the presence of white measurement noise.

Since then, more general classes of models have been considered and a number of test signal design algorithms based on optimal control principles have been proposed [13] to [15], [17], [20], [24], [25], [32], [33], [34]. Recently it has been shown that design algorithms originally developed for non-dynamic systems [4] to [8] are applicable to the design of test signals in the frequency domain [45], [39], [40], [43], [44].

For those problems where the choice of sampling instants and pre-sampling filters does arise, it has been recognized that this choice often has a significant effect on the information return from the experiment [46] to [50].

The choice is particularly critical when the available computer storage and analysis time are limited. In these cases, the experiment should be designed to maximize the average information content of each sample.

In the sampling problem, four cases can be distinguished, depending upon whether the samples are uniformly or nonuniformly spaced in time and whether the test signal is prespecified or is


to be jointly designed.


The purpose of this chapter is to present what is, in the authors' opinion, the state of the art in the design of sampling intervals in each of these cases.

2. DESIGN CRITERIA

To form a basis for the comparison of different experiments

a measure of the "goodness" of an experiment is required. A logical approach is to choose a measure related to the expected accuracy of the parameter estimates to be obtained from the data collected. Clearly the parameter accuracy is a function of both the experimental conditions and the form of the estimator. However, it is usual to assume that the estimator used is efficient, in the sense that the Cramér-Rao lower bound on the parameter covariance matrix is achieved; that is, the covariance matrix is given by the inverse of Fisher's information matrix [51].

(This property is true, at least asymptotically, for most commonly used estimators, e.g., maximum likelihood [51].) A suitable measure of the "goodness" of an experiment is therefore

J = φ(M)                                                    (2.1)

where φ is a scalar function and M is Fisher's information matrix.

Various choices exist for the function φ. Possible choices are:

i)   -Trace(M)
ii)  Det(M^{-1}) (equivalently -Det(M) or -Log Det(M))
iii) Trace(WM^{-1}), where W is a nonnegative definite matrix.

The choice between these alternatives depends upon the objective of the experiment.

In fact it could be argued that a Bayesian decision theoretic approach is called for; that is, the measure J should be a Bayesian risk function which would reflect the ultimate use to which the data from the experiment will be put. This approach has been followed by a number of authors in the


literature [33], [34], [38]. A useful approximation to the full Bayesian formulation is to use

J = Trace[WM^{-1}]                                          (2.2)

where W is a weighting matrix given by the second derivative of the risk function with respect to the parameters [13], [38].

An alternative approach is to use an information measure, where information is used in the mathematical sense [52]. Under the usual Gaussian assumptions the information measure reduces to

J = Det[M^{-1}]                                             (2.3)
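Criteria (i)-(iii) are simple matrix functionals, and it can be instructive to compute them side by side. The following is a minimal sketch (Python, not part of the original text; the matrices M and W are invented for illustration):

```python
# Illustrative only: evaluating the design criteria (i)-(iii) of Section 2
# for a hand-picked 2x2 information matrix M and weighting matrix W.

def det2(M):
    # determinant of a 2x2 matrix
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def inv2(M):
    # inverse of a 2x2 matrix (assumes det != 0)
    d = det2(M)
    return [[M[1][1] / d, -M[0][1] / d],
            [-M[1][0] / d, M[0][0] / d]]

def trace2(M):
    return M[0][0] + M[1][1]

def mat2mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

M = [[4.0, 1.0], [1.0, 2.0]]     # illustrative information matrix
W = [[1.0, 0.0], [0.0, 1.0]]     # W = I makes criterion (iii) an A-optimality measure

crit_i   = -trace2(M)                      # criterion (i)
crit_ii  = det2(inv2(M))                   # criterion (ii): Det(M^-1), D-optimality
crit_iii = trace2(mat2mul(W, inv2(M)))     # criterion (iii): Trace(W M^-1)
```

For this M, det(M) = 7, so criterion (ii) evaluates to 1/7 and criterion (iii), with W = I, to 6/7.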

Of the three possible criteria mentioned above, choice (i) seems to be, at first sight, particularly attractive since, for input design, use of this criterion leads to a quadratic optimization problem [17], [32], [20], [33]. However, it has been pointed out [35], [26], [25], [13] that this criterion can lead to unidentifiability conditions and is therefore not recommended. Other criteria, apart from the three discussed above, are possible.

Most commonly met criteria (including the three above) have the following desirable property:

φ(M₁) ≤ φ(M₂)   if   M₁ - M₂ is nonnegative definite        (2.4)

This property will be used in later sections.

3. CONSTRAINTS

In the optimization of the design criteria introduced in Section 2, it is obvious that the optimization must be carried out with due regard to the constraints on the allowable experimental conditions. Typical constraints that might be met in practice are:

(1) Input, output or state amplitude constraints,
(2) Input, output or state energy/power constraints,
(3) Total experiment time,
(4) Total number of samples,
(5) Maximum sampling rate,
(6) Total computation time for design and analysis,
(7) Availability of transducers and filters,
(8) Availability of hardware and software for analysis,
(9) Model accuracy specifications,
(10) Non-disruption of normal production.

Which of the above constraints (or others) are important in a particular experiment depends on the situation, and it is not always obvious which should be taken into account during the design. A useful approach leading to simplified designs is to work with a subset of the constraints (such as input energy and total number of samples) and subsequently to check for violation of the other constraints.

4. NONUNIFORM SAMPLING THEORY

In this section the following problem is considered:

a. The parameters in a continuous linear system are to be estimated from a fixed number of samples, N.

b. The samples are taken at times t₁, t₂, ..., t_N, where the sampling intervals Δ_k = t_{k+1} - t_k, k = 0, ..., N-1, are not necessarily equal (t₀ corresponds to the beginning of the experiment).

c. The experimental conditions available for adjustment are the system input, the pre-sampling filter and the sampling intervals Δ₀, Δ₁, Δ₂, ..., Δ_{N-1}.

A general model for a linear time invariant system is [53]:

are the system input, the pre-sampling filter and the sampling intervals Lo, All A 2 r - - - i A N - l * A general model for a linear time invariant system is 1531: dx' (t)

=

dy' (t) = where

y'(t)

:

sampling, u(t)

T

+

:

Rr

T

-+

A'x' (t)dt + B'u(t)dt C'x' (t)dt

+ dq(t)

(4.2)

+ du(t)

(4.3)

is the output prior to filtering and Rm

is the input vector, x'(t)

:

T

-+

n' R

G . C. Goodwin and R . L. F'ayne

256

is a state vector and

n(t) , w(t)

are Wiener Processes with

incremental covariance:

.( The above model is based on the usual assumptions of rational input-output transfer function and rational output noise power density spectrum. In general, to minimize the information l o s s due to sampling, some form of pre-sampling filter is required. Optimization of the filter requires careful specification of the allowable filter structures as otherwise impractical solutions will result. For example, with no restrictions on the allowable structure, a filter which is in fact a complete identifier could be constructed.

A common restriction is to use a linear filter of fixed

dimension. The filter design then reduces to optimization of the filter parameters.

where

y(t)

A

suitable linear filter is

is the filter output.

Equations (4.5) , (4.6) can be combined with equations (4.2), (4.3) to produce a composite model for the system and filter of

the following form:

where

x(t)

is the state of the composite model and

E(t)

is a

Wiener Process with incremental covariance Cdt. For simplicity of exposition, the class of allowable inputs is now restricted to piecewise constant functions, that is

Choice of Sampling I n t e n d s


This class of inputs has the added advantage of ease of implementation with the specified sampling scheme. However, the analysis can be readily extended to other classes of inputs, e.g., absolutely continuous functions.

As a first step towards obtaining an expression for the information matrix of the system parameters, a discrete time model for the sampled observations is now obtained. Integration of (4.7) and (4.8) gives

x_{k+1} = Φ_k x_k + ψ_k u_k + ξ_k                           (4.10)
y_k = Cx_k                                                  (4.11)

where

Φ_k = Φ(Δ_k, 0) = exp(AΔ_k)                                 (4.12)
ψ_k = ψ(Δ_k, 0) = ∫₀^{Δ_k} exp(Aτ)B dτ                       (4.13)
Q_k = ∫₀^{Δ_k} exp(Aτ) Σ exp(Aᵀτ) dτ                         (4.14)

and

Δ_k = t_{k+1} - t_k                                         (4.15)

The sequence {ξ_k} is a sequence of independent normal random variables having zero mean and covariance Q_k:

E{ξ_k ξ_kᵀ} = Q_k                                           (4.16)


Equations (4.10) and (4.11) can now be used to develop an expression for the information matrix of the parameters using the samples y₁, ..., y_N. The standard expression for the information matrix [53] is:

M = E_{Y|β} { [∂ log p(Y|β)/∂β] [∂ log p(Y|β)/∂β]ᵀ }        (4.17)

where Y is a random variable corresponding to the observations y₁, ..., y_N, β is the p-vector of parameters in the system model (4.2) and (4.3), p(Y|β) is the likelihood function for the parameters, and E_{Y|β} denotes expectation over the distribution of Y given β. Using Bayes' rule, the likelihood function can be expressed as:

p(Y|β) = ∏_{k=1}^{N} p(y_k | y_{k-1}, ..., y₁, β)            (4.18)

It follows from the assumption of Gaussian noise processes that the conditional distributions in equation (4.18) can be expressed as

p(y_k | y_{k-1}, ..., y₁, β) = (2π)^{-r/2} (det S_k)^{-1/2} exp{ -(1/2)(y_k - ȳ_k)ᵀ S_k^{-1} (y_k - ȳ_k) }   (4.19)

where ȳ_k is the conditional mean and satisfies the following equations [54]:

ȳ_k = Cx̄_k                                                 (4.20)

x̄_{k+1} = Φ_k x̄_k + ψ_k u_k + Γ_k (y_k - Cx̄_k)             (4.21)

where x̄_k is the conditional mean of x_k. In equation (4.21) the simplified notation Φ_k = Φ(Δ_k, 0), ψ_k = ψ(Δ_k, 0) has been used, and the matrix Γ_k is given by:

Γ_k = Φ_k P_k Cᵀ (CP_k Cᵀ)^{-1}                             (4.22)

where P_k is the conditional covariance of x_k and is the solution of a matrix Riccati equation:

P_{k+1} = Φ_k [P_k - P_k Cᵀ (CP_k Cᵀ)^{-1} CP_k] Φ_kᵀ + Q_k   (4.23)

(Note: The matrix [CP_kCᵀ] in equations (4.22) and (4.23) will usually be nonsingular, but if singular, then the correct results are obtained by using the pseudo inverse [59].)
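One step of the predictor equations (4.21)-(4.24) can be sketched directly from the formulas. The Python fragment below is illustrative only: a two-state, one-output model with invented values of Φ_k, Q_k, C and P_k, computing the gain (4.22), the innovation covariance (4.24) and the Riccati update (4.23):

```python
# Illustrative sketch of one step of the gain (4.22) and Riccati update (4.23)
# for a 2-state, 1-output system; all numerical values are invented.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

Phi = [[1.0, 0.0], [0.0, 1.0]]    # Phi_k (identity, for simplicity)
Q   = [[0.1, 0.0], [0.0, 0.1]]    # Q_k
C   = [[1.0, 0.0]]                # one output: y_k is the first state
P   = [[2.0, 1.0], [1.0, 2.0]]    # P_k

# S_k = C P_k C^T, a scalar here (4.24)
S = matmul(matmul(C, P), transpose(C))[0][0]

# Gamma_k = Phi_k P_k C^T (C P_k C^T)^{-1}   (4.22)
PCt = matmul(P, transpose(C))
Gamma = [[row[0] / S] for row in matmul(Phi, PCt)]

# P_{k+1} = Phi_k [P_k - P_k C^T (C P_k C^T)^{-1} C P_k] Phi_k^T + Q_k   (4.23)
CP = matmul(C, P)
correction = [[PCt[i][0] * CP[0][j] / S for j in range(2)] for i in range(2)]
P_reduced = [[P[i][j] - correction[i][j] for j in range(2)] for i in range(2)]
P_next = matmul(matmul(Phi, P_reduced), transpose(Phi))
P_next = [[P_next[i][j] + Q[i][j] for j in range(2)] for i in range(2)]
```

With these numbers S = 2, the gain is [1, 0.5]ᵀ, and the observed state component collapses to the process-noise floor after the update.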

of

yk

is related to

k'

as follows Sk

T CPkC

=

(4.24)

The boundary conditions for equation (4.21) and (4.23) are assumed to be

-

x

and

0

Po

respectively where the initial state

has been taken to be normally distributed with mean covariance Substituting ( 4 -19) to (4.18) yields

nN

P(YIB)

det Sk )-1/2

k=l exp

1

-

N

1/2

C

k=l

which when substituted into (4.17

+

N 1/2

Trace ( S ; ' k=l

yields:

21

Trace

1

F}(4.26)

-1 ask

Sk

260

G . C . Goodwin and R . L. Payne

The proof of equation (4.26) is straightforward, though somewhat long winded [SO], and has therefore been left to the reader as an exercise. The quantities

a;,/aB,

in equation (4.25) can be evaluated

from the following sensitivity equations:

(4.27)

(4.28) Equation (4.28) can be combined with equation (4.21) to give:

(4.30) and the forms of the matrices

Fk, Gk

and

%

follow from (4.28)

and (4.21). Also from equation (4.27) the quantities expressed in terms of the composite state 2,

ayk/aBi

can be

as follows: (4.31)

where i'

ac ,

0

. . . I

0, c, 0,

. . . I

(4.32)

Now substituting (4.31) into (4.26) leads to the following expression for the ij-th element of the information matrix:

Choice of Sampling Intenah

+

N 1/2 k=l

-1

-1 ask Trace [ Sk a@,

Trace [S;'%/

261

(4.33)

Performing the expectation in equation (4.33) yields:

h

where xk

2k and

are

and

Tk

are the mean and covariance respectively of

given by:

A

X

k+l

Tk+l

A

=

F G + G u k k k k '

x

= o

(4.35)

=

T T F T F + H S H k k k k k k '

T

= O

(4.36)

0 0

S w i n g up, equation (4.34) provides a means for computing the information matrix before the experiment is performed, that is, it is possible to evaluate a scalar function $I

of the in-

formation matrix corresponding to a particular set of experimental conditions. (4.34) that M

It follows from development of equation

and hence $I

depend on the experimental condi-

tions namely the input sequence intervals

(Ao,...rAN-l)

(4.5) and (4.6)).

( u ~ ~ . . . ~ ,u ~the - ~ sampling )

and the pre-sampling filter (equation

Hence, it would be possible, at least in

principle, to optimize I$ with respect to these design variables. Suitable algorithms have been developed for this type of optimization [50] but are extremely time consuming except for

262

G . C. Goodwin and R . L. Payne

simple situations and unfortunately do not give a great deal of insight into the problem. In the next section a suboptimal design scheme will be described which allows the theory developed in this section to be applied to practical systems.

In subsequent sections it will be

shown that, with certain constraints on the design, alternative simple design methods exist. These alternative methods give further insight into the general sampling problem. 5.

SEQUENTIAL DESIGN OF NONUNIFORM SAMPLING INTERVALS This section describes a suboptimal design procedure for

sequential determination of the input and sampling intervals. It follows from equation (4.34) that the information matrix is given by the following recursion: %+1 where

I',

%

=

+

'k+l

'

M

0= o

(5.1)

is the information increment resulting from the k+l-th

sample and has ij-th element ['k+llij

-- G~k+lfiTs-l fi.; 1 k+l J

+

1/2 Trace

k+l

+

Trace

ask+l

Using (4.35) equation (5.2) becomes:

-

+

(FkGk + G,~,)~fiTs;:,fi~(F~;~

1

Trace fiT.S-l fi.T i k + l J k+l

+

Gkuk)

(5.3)

Considering the situation at time k , if it is assumed that the past inputs and sampling intervals have been specified, then

Choice of Sampling I n t e n d s

263

is a %+1 Ak. Hence it is a

it follows from equations (5.1) and (5.3) that function of only the two variables

\

and

4,

simple optimizatitm problem to choose u and to maximize k the optimality criterion Then moving on to time (k+l) the procedure can be repeated to choose u and Ak+l etc. k+l This procedure is analogous to the one-step-ahead control algorithms such as minimum variance control [4.3] and will not generally be globally optimal in the sense that I$(%)

is

optimized with respect to uo, ul,...,u

and Ao,Al,...,AN~l~ N-1 The procedure described above is a generalization to the

sampling problem of the sequential test signal design procedures originally described by Arimoto and Kimura [16] and by Keviczky [281, [291. The sequential design algorithm is now illustrated by a simple example. Consider the first order system [461.

+

2

=

ax

y

=

x + n

where the noise, n(t)

bu

(5.4) (5.5)

is assumed to have wide bandwidth

(approximately "white"). Equation (5.4) can be discretized as

follows:

k' where

4~

=

-

exp(aAk) ,

xk + nk

8,

=

b/a(ak

(5.7)

-

and approximately constant variance.

1)

and

\

has zero mean

(This assumption will be

valid provided the smallest sampling interval allowed is much greater than the reciprocal of the noise bandwidth; this point will be taken up again later in this section). For this system, equation (5.1) becomes

G.C. C o o d w n and R . L. Payne

264

(5.9) and 0

ak h

X

k+ 1

-

0

"k'k

ak

, o

0

-

%'k

+

ak (5.11)

BkUk

Here t h e o p t i m a l i t y c r i t e r i o n i s t a k e n t o be

-

equivalently

-

Det(\+l)

Det(M

-1

or

)

Hence u s i n g e q u a t i o n ( 5 . 8 ) :

Det(M). =

-

Det(\)Det[I

=

-

Det(\)

=

-

Det (Mk) (1+(%Gk+Bkuk)

+
"T T -1 (1 + x ~ + ~ MkS ~slxk+l) A

T T - 1

fi

\

R(%Xk+BkUk)

)

(5.12)

Now assuming t h a t i n the interval

-1

uk

u

k

5

i s c o n s t r a i n e d i n a m p l i t u d e and l i e s 1,

t h e n it can be s e e n from e q u a t i o n

(5.12) t h a t t h e o p t i m a l i t y c r i t e r i o n i s minimized w i t h r e s p e c t to

uk

if

\ u* k

i s chosen as =

T T - 1

Sign (Bkfi

% mkxk)

(5.13)

A

S u b s t i t u t i n g (5.13) back i n t o (5.12) y i e l d s

+

T T -1

B

k

Mk fiBkl

(5.14)

2 65

Choice of Sampling 1nterua.k

A one dimensional optimization can now be used to find the

best values for

.

This value can then be substituted into

(5.13) to give uc. Results of this optimization are shown in Fig. (5.1). It can be seen from Figure (5.1) that, for this example, a significant improvement in information occurs if the input is sequentially optimized with fixed sampling intervals. However, a further improvement is achieved by a coupled design of both the input and sampling intervals. Of course the design algorithm is only optimal in the one-step-ahead sense and presumably further improvements could be obtained by extending the optimization horizon but at the cost of greatly increased computational complexity. It should be noted that, in this example, the sampling was performed without the inclusion of a presampling filter. However, since the noise was very wide bandwidth, drastic improvements in estimation accuracy could have been achieved by the simple expedient of including a low pass filter prior to sampling. The improvement in accuracy will be of the order of the ratio of the noise bandwidth to the system bandwidth, provided that the filter is chosen appropriately.

(The variance of the discrete noise on

the samples is roughly proportional to the noise bandwidth at the samples.)

This point will be discussed further for the case of

uniform sampling intervals in the next section. 6.

UNIFORM SAMPLING INTERVAL DESIGN In this section the case of uniform sampling interval design

is considered. For this case it will be shown that it is a simple matter to simultaneously find optimal test signals, pre-sampling filters and sampling rates. For the purposes of the current discussion, consider the following model of a constant coefficient linear single inputmultiple output system with coloured measurement noise: k(t)

=

Ax(t)

+

Bu(t)

(6.1)

15

Log det

%

-1

Figure 5.1 : COMPARISON OF SEQUENTIAL DESIGNS

Legend A - square wave input, A k = 0.2 B - sequentially optimized input Ak

10

C

-

A

5 t u h m

0

-B

-5

0

I

I

5

10

I

1

15

20

=

sequentially optimized input and sampling intervals.

C

I

25 Number of samples

30

k

0.2

Choice of Sampling InteruaLc

t € T = [O, tf] , x : T

where u

:

and

T

Cx(t) + Du(t)

=

y(t)

R1

-+

w

Rn,

is the input vector, y

T

:

-+

+

+.

Rm

w(t)

267

(6.2)

is the state vector, :

T

-+

Rm

is the output vector

is the measurement noise vector, assumed to have

a zero mean Gaussian distribution with power density spectrum $(f)

with full rank for all

f6

(-(arm).

For large experiment time, tf,

it is convenient to trans-

form the model equations (6.1), (6.2) to the frequency domain: =

y(s)

T(s)u(s)

+

w(s)

;

s = j2nf

(6.3)

where

Fisher's information for the parameters, and

D

(assuming for the moment continuous observations over

[ O r tfl)

is given by: M

where

6, in A, B, C

-

M

B

B

=

t i

(6.5)

fB

is the average information matrix per unit time with

ik-th element given by: m

where

<(f)

is the input spectral distribution function 1551 ,

[56] satisfying:

(Note: The input power has been constrained to be unity). Now, introducing the matrix Qi(s)

=

aT(s) aBi

Q

having i-th column given by (6.8)

268

G . C. Goodwn and R . L. Payne

the average information matrix may be written as:

iB

=

[

QT(-J2,f)$-'(f)Q(j2,f)

(6.9)

dc(f)

Assuming that the test signal spectrum is band limited to [-fh, fh], then equation (6.9) may equivalently be written as:

(Note: The assumption of a band limited test signal will later be shown to be consistent with the conditions for optimality.) The effect of sampling on the average information matrix is now investigated. Suppose that the output, y(t),

t € [ O r tfl

is sampled at greater than the Nyquist rate, [571, for is the sampling frequency

fsl is greater than

2fh).

fh

This form

of sampler does not distort that part of the spectrum of arising from the input.

(that

y

The part of the output spectrum arising

from the noise will, however, be distorted due to aliasing.

This

distortion will result in an "information" l o s s as indicated by the following expression for the average information matrix ( p e r u n i t t i m e ) from the s a m p l e d observations:

is B

=

fh

QT(-J2af)$S1(f)Q(j2,f)

d<(f)

(6.11)

-fh where

is the aliased noise spectrum given by [581 :

$s(f)

m

$s(f)

=

$(f) +

k=l

($(kfs + f) + $(kfs

the fact that

- is



f))

i

(- $ >) f

f

-

,

(6.12)

is nonnegative definite follows from

equation (6.121, which indicates that $,(f) definite and hence that

-

$(f)

is positive

Choice of Sampling IntervaLr

269

is nonnegative definite. It follows that, for all optimality criteria 4 (2.4) ,

(I

(Mi) 5 $(Mo) ,

satisfying

that is the experiment with sampled data

cannot be better than the corresponding experiment with continuous observations. This result is, of course, quite expected. However, it will now be shown that, by the inclusion of an optimal presampling filter, equality can be achieved. THEOREM 6.1:

fs, s a t i s f y i n g

4

With

f

s

s a t i s f y i n g (2.4)

=

sampling r a t e ,

> 2fh a p r e s a m p l i n g f i l t e r h a v i n g t r a n s f e r

f u n c t i o n , F, w i t h the p r o p e r t y :

IF(j21'rf)I

, and

0

for a l l

and invertible f o r a l l g i v e s the same v a l u e f o r

f

-

f C(-,

t o which

5

$1

U[>

, a)

(6.14)

a s s i g n s nonzero m e a s u r e ,

4 u s i n g the sampled d a t a a s w i t h

continuous observations.

Proof.

The inclusion of a presampling filter having trans-

fer function F(s)

modifies the expression for the average

information matrix (6.11) to: c

where

and

QF(f)

is the filtered noise spectrum given by: (6.17)

G. C. Goodwin and R . L . Payne

270

Now, for a presampling filter satisfying (6.141, reduces to

qF(f)

as is readily verified.

Ss(f) Furthermore, sub-

stituting (6.17) into (6.15) yields

and comparison of (6.18) with (6.10) yields: (6.19)

This completes the proof of the theorem. H The significance of (6.19) is that the specified presampling filter, F, eliminates the information loss due to sampling. This form of filter is often called an aliasing filter [58]. So far, only the average information matrix per unit time

has been considered. However, if the total number of samples is

N, say, then it is more appropriate to consider

constrained to be

the average information matrix per sample. is

fs

If the sampling rate

samples per second then the average information matrix

per sample is given by:

fi

=

B

where

Cs B

-1

is

(6.20)

fs

is the average information matrix per unit time

corresponding to an input spectrum having highest frequency component fh

and samples collected at

fs > 2fh.

inclusion of an aliasing filter ensures that

MB :

be replaced by

fi

B

=

1

-

is B

Furthermore, in (6.20) can

M

(6.21)

fs

It would now be possible, at least in principle, to choose the test signal spectrum (or equivalently the spectral

Choice of Sampling Intervals

distribution function <(f) , f € [-fh, f,])

271

to minimize some

scalar function I$ of, the average information matrix per unit

-

time, M

It is obvious that the tighter the constraint on the

8'

allowable input spectrum (i.e. the smaller fh)

4.

be the "cost"

the greater will

Also from (6.21) and the property (2.4) of 4 ,

it can be seen that

is made small by increasing

$(fig)

fS'

fs > 2fh, a compromise will be reached between

However, since

4 due to the decrease in fs and the

the improvement in

4 due to the tightening of the constraint on

degradation in

the allowable input spectrum. The optimization problem outlined above is greatly simplified by the observation that only a finite number of frequency components in the input spectrum need to be considered. This result is established in the following theorem: THEOREM 6.2: function

,

<,(f)

For a n y t e s t s i g n a l s p e c t r a l d i s t r i b u t i o n

f 6 [-fh, fh] w i t h c o r r e s p o n d i n g a v e r a g e i n -

fig(<,)

formation m a t r i x p e r sample,

f



[-fh, fh] ,

l i n e s and

Rg(<,)

Proof.

,

there i s a l w a y s a

the s p e c t r u m o f w h i c h c o n t a i n s a t most =

fig(<,)

where

R (5 B

2

)

<,(f)

,

p(p+1)/2+1

corresponds to

5,.

It follows from equations (6.21), (6.7) and (6.9)

that the set of all possible average information matrices per sample arising from power constrained inputs is the convex hull of the set of average information matrices per sample corresponding to inputs containing a single frequency component.

Since

any information matrix is symmetric, it can be represented as an element in a

p(p+1)/2

dimensional space and hence it follows

from a theorem due to Caratheodory [6] that there exist k = 1,.

..,R;

R

=

p(p+1)/2

B8K1) where (from (6.9))

:

=

-

+

Xk, fkl

1 such that

R

(6.22)

G . C. Goodwin and R . L. Payne

2 72

and

II

EXk =

k=l Defining

c2(f)

(6.24)

1

t o be t h e s p e c t r a l d i s t r i b u t i o n f u n c t i o n which

Xk

a s s i g n s measure

to

fk,

. ., p ( p + 1 ) / 2 +

k = 1,.

1

and zero

otherwise completes t h e proof of t h e theorem.. I n p a r t i c u l a r , t h e r e e x i s t s a $-optimal t e s t s i g n a l spectrum containing n o t more than

p(p+1)/2

+

1 lines.

problem now reduces t o minimization of fsr

fp(p+l)/2+1'

flr--.r

2fk < f s ,

s t r a i n t that

Al""Jp(p+l)/2

$(aB )

The o p t i m i z a t i o n with r e s p e c t t o

s u b j e c t t o the con-

k = 1,. . . , p (p+1)2 + 1. Obviously with

t h e problem a s s t a t e d above, t h e optimal value of

...,

fs

w i l l be

= max(fl, f In practice t h e p ( p + l )/ 2 + d m a l i a s i n g f i l t e r w i l l n o t be i d e a l and so, t o avoid a l i a s i n g ,

2fm where

f

i s taken t o be

2 ( 1 + € ) f m where

E

i s related t o the cutoff

c h a r a c t e r i s t i c s of t h e f i l t e r and should be small.

fS

F i n a l l y , from

(6.22) t h e experiment design may be summarized a s :

(6.25)

(6.26) (6.27)

R

CA,=l

(6.28)

k=l

fm = and

$*

max(f l,...,

fR)

(6.29)

i s t h e optimal v a l u e f o r t h e design c r i t e r i o n f u n c t i o n .

2 73

Choice of Sampling Intervals

The quantities A,,

A2,.-., have the physical interpret-

ation of being the fraction of the total power at the corresponding frequencies fl, f2,...,

in the test signal spectrum.

The

dimension of the problem is reduced even further for single input systems for which it has been proven that only p+l

lines are

required in the test signal spectrum [44]. Furthermore, for the -1 usual design criteria, e.g. det(M-l) or trace(M ) I it can be shown that the optimal information matrix is a boundary point of the convex hull, [6], and hence the maximum number of lines required is reduced to p(p+1)/2

in general and

p

for single

input-single output systems. Although the analysis in this section has been restricted to the single input case, it is a relatively simple matter to extend the results to the multiple input case.

Details of the

appropriate background theory may be found in references [401 and 1431. The above development has assumed that a single constant sampling Kate will be used for the complete experiment. However it will be clear from equation (6.25) that it is preferable, if possible, to divide the experiment into a number of subexperiments each having a single sinewave input and each having a different aliasing filter and different Nyquist sampling rate. Equation (6.25) then becomes:

(6.30) fll.. . I

fR

The use of more than one sampling rate is often advantageous especially where there is a wide spread in time constants in the system.

Further discussion on this point may be found in [601.

In the next section, the theory developed above will be applied to a number of examples leading to analytic results.

274

7.

G. C.Coodwn and R . L. Payne

ANALYTIC DESIGNS FOR UNIFORM SAMPLING INTERVALS 7.1

One Parameter Model

Consider a first order system described by the following model : (7.1 -1) where

u(s)

, y(s)

are the Laplace Transforms of the input and

output respectively and

e(s)

represents coloured measurement

noise having spectral density: $(f)

=

1 2 2

w

;

a w +1

Here the only parameter of interest is T

= 2llf

(7.l.2)

the system time con-

stant, and hence a suitable design criterion is simply -1 det(M ) = 1/M where M is the (1x1) information matrix. Following the discussion of Theorem 6.2, an optimal one line input spectrum can be found (since p = 1 for this example). The information matrix per unit time corresponding to a single frequency input at f = w / 2 ~ is given by:

M

=

2

2 2 2 (a w +l)w 2 2 2 (T w +1)

(7.l. 3 )

Hence it follows that, for continuous observations, the input which minimizes det(k-’)

-

w

=

2 (T

-

has frequency

-1/2 2a)

That is, if the noise is wide band

if

2a2 < T2

(a + 0)

given by (7.1.4)

then the following

appealing result is obtained:

-

w = -1 T

(7.1.6)

2 75

Choice of Sampling Interwrlr

The above results apply only to the case where there are no restrictions on the allowable number of samples (i.e., continuous observations are possible).

The case where the total number of

samples is fixed will now be considered. Assuming the optimal aliasing filter has been incorporated, the average information matrix per sample, (6.251, is: 2 2 2r(a w +l)w 2 2 2 (1+E)(T w +1)

m = where

E

(7.1.7)

is the aliasing filter constant.

put frequency G

which minimizes

det(fi-’)

From (7.1.7) the insatisfies the

equation: 2 2-4 2 a - c w +3(a

which for wide band noise

13

Note that G

=

+ 1

=

(7.1.8)

0

(a -+ 0) yields:

-+

-

1

(7.1.9)

-

(sampled observations) is less than w

observations) as expected. is fs -

7.2

2 -2

- T ) W

(continuous

The sampling rate corresponding to 3

1+E w Tr

First Order Two Parameter Model

The system of (7.1) is generalized to include a gain parameter: K TS+l

= -

where

u(s), y(s)

and

e(s)

u(s) + e(s)

(7.2.1)

are as for subsection 7.1.

From equation (6.23) the average information matrix per unit time corresponding to a single line is:

2 76

G . C . Goodwin and R . L. Payne

$$\bar M = 2(a^2\bar\omega^2 + 1)\begin{bmatrix} \dfrac{1}{T^2\bar\omega^2 + 1} & \dfrac{-KT\bar\omega^2}{(T^2\bar\omega^2 + 1)^2} \\[3mm] \dfrac{-KT\bar\omega^2}{(T^2\bar\omega^2 + 1)^2} & \dfrac{K^2\bar\omega^2}{(T^2\bar\omega^2 + 1)^2} \end{bmatrix} \tag{7.2.2}$$

where $\beta^T = (K, T)$.

For the determinant optimality criterion, it would appear from the discussion in Section 6 that the optimal input spectrum will contain two lines, since $p = 2$. However, it can be shown that, in fact, one line is sufficient for this problem. The verification of this result is straightforward, depending on the detailed structure of the convex hull, and is left as an exercise for the reader. Thus, for continuous observations, the optimal input frequency $\bar\omega$ minimizing $\det(\bar M^{-1})$ satisfies:

$$-a^2 T^2 \bar\omega^4 + 3(a^2 - T^2)\bar\omega^2 + 1 = 0 \tag{7.2.3}$$

which for wide band noise ($a \to 0$) yields:

$$\bar\omega = \frac{1}{\sqrt{3}\,T} \tag{7.2.4}$$
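The one-line optimum for wide band noise can be verified from the determinant of the two-parameter information matrix of (7.2.2); the following sketch is not from the original text, and the values K = 3, T = 2 are illustrative:

```python
def det_info(w, K, T, a):
    # determinant of (7.2.2), including the scalar factor 2(a^2 w^2 + 1)
    c = 2.0 * (a**2 * w**2 + 1.0)
    d = T**2 * w**2 + 1.0
    m11 = 1.0 / d
    m12 = -K * T * w**2 / d**2
    m22 = K**2 * w**2 / d**2
    return c**2 * (m11 * m22 - m12**2)   # simplifies to c^2 K^2 w^2 / d^4

K, T = 3.0, 2.0
# wide band noise limit a -> 0; grid search for the determinant maximizer
w_star = max((1e-4 * i for i in range(1, 100001)),
             key=lambda w: det_info(w, K, T, 0.0))
print(w_star, 1.0 / (3.0 ** 0.5 * T))   # both near the value in (7.2.4)
```

The cross-term cancellation in the determinant is what makes the optimum independent of K.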

The variation of $\det(\bar M)$ with $\bar\omega$ is shown in Fig. 7.1.

Now considering the case where the total number of samples is fixed, the average information matrix per sample is given by:

$$\bar m = \frac{2\pi(a^2\bar\omega^2 + 1)}{\bar\omega(1+\varepsilon)}\begin{bmatrix} \dfrac{1}{T^2\bar\omega^2 + 1} & \dfrac{-KT\bar\omega^2}{(T^2\bar\omega^2 + 1)^2} \\[3mm] \dfrac{-KT\bar\omega^2}{(T^2\bar\omega^2 + 1)^2} & \dfrac{K^2\bar\omega^2}{(T^2\bar\omega^2 + 1)^2} \end{bmatrix} \tag{7.2.5}$$

[Figure 7.1: Variation of Information Matrix with Input Frequency for First Order System with White Output Noise and Fixed Experiment Time]

Hence the optimal input frequency with a fixed number of samples is

$$\bar\omega = \frac{(a^2 - 2T^2)^{1/2}}{aT} \qquad \text{if } a > \sqrt{2}\,T \tag{7.2.6}$$

$$\bar\omega = 0 \qquad \text{if } a \le \sqrt{2}\,T \tag{7.2.7}$$

The corresponding optimal sampling frequency is

$$f_s = \frac{(1+\varepsilon)\bar\omega}{\pi} \tag{7.2.8}$$

The result in (7.2.7) for wide band noise is, of course, impractical, but can correctly be interpreted as meaning "apply a low frequency input and sample slowly." Just how slowly will depend on the total allowable experiment time.
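The switch at a = √2 T in (7.2.6) and (7.2.7) can be seen numerically: in the determinant of the per-sample matrix, the ω̄² from the parameter block cancels against the 1/ω̄² of the leading factor, and what remains is monotone decreasing in ω̄ unless the noise bandwidth parameter is large. A sketch (not from the original text; K = 3, T = 2 illustrative):

```python
import math

def det_per_sample(w, K, T, a, eps=0.1):
    # determinant of (7.2.5); the w^2 in the block determinant cancels the 1/w^2
    c = 2.0 * math.pi * (a**2 * w**2 + 1.0) / (w * (1.0 + eps))
    d = T**2 * w**2 + 1.0
    det_block = K**2 * w**2 / d**4       # determinant of the bracketed matrix
    return c**2 * det_block              # proportional to (a^2 w^2 + 1)^2 / d^4

K, T = 3.0, 2.0
results = {}
for a in (0.5, 5.0):                     # a <= sqrt(2) T, then a > sqrt(2) T
    results[a] = max((1e-3 * i for i in range(1, 10001)),
                     key=lambda w: det_per_sample(w, K, T, a))
    print(a, results[a])   # first case pins to the grid minimum (w_bar -> 0)
```

For the second case the grid optimum matches the closed form of (7.2.6), ω̄ = (a² − 2T²)^{1/2}/(aT).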

7.3 Second Order Two Parameter Model

Consider a second order system having undamped natural frequency $\omega_0$ and damping ratio $\zeta$:

$$y(s) = \frac{\omega_0^2}{s^2 + 2\zeta\omega_0 s + \omega_0^2}\,u(s) + n(s) \tag{7.3.1}$$

where $u(s)$, $y(s)$ are the Laplace transforms of the input and output respectively and $n(s)$ denotes wide band noise. Again using equation (6.23), the average information matrix per unit time corresponding to an input having a single frequency component is given by (7.3.3), where $\beta^T = [\omega_0, \zeta]$.

For this problem it can again be shown that the optimal input spectra need only contain one line for the determinant criterion. The optimal value of the input frequency with continuous observations is:

$$\bar\omega = \frac{\omega_0}{\sqrt{5}}\left\{(1 - 2\zeta^2) + \left[(1 - 2\zeta^2)^2 + 15\right]^{1/2}\right\}^{1/2} \tag{7.3.4}$$

The variation of $\bar\omega/\omega_0$ with the damping ratio $\zeta$ is shown in Figure 7.3.1. It can be seen that the optimal input frequency drops as the damping increases. This result might have been expected from intuitive reasoning.

Proceeding as in the previous examples, the optimal input frequency $\bar\omega$ with sampled observations is

$$\bar\omega = \frac{\omega_0}{\sqrt{3}}\left\{(1 - 2\zeta^2) + \left[(1 - 2\zeta^2)^2 + 3\right]^{1/2}\right\}^{1/2} \tag{7.3.5}$$
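The two formulas can be tabulated against ζ (a sketch, not from the original text) to reproduce the behaviour plotted in Figure 7.3.1:

```python
def w_continuous(z, w0=1.0):
    # Eq. (7.3.4): optimal input frequency, continuous observations
    q = 1.0 - 2.0 * z**2
    return w0 * ((q + (q**2 + 15.0) ** 0.5) / 5.0) ** 0.5

def w_sampled(z, w0=1.0):
    # Eq. (7.3.5): optimal input frequency, fixed number of samples
    q = 1.0 - 2.0 * z**2
    return w0 * ((q + (q**2 + 3.0) ** 0.5) / 3.0) ** 0.5

for z in (0.0, 0.25, 0.5, 1.0, 2.0):
    print(z, w_continuous(z), w_sampled(z))
# at z = 0 (marginal stability) both equal w0; for z > 0 the sampled
# optimum lies strictly below the continuous one, and both fall with z
```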

The corresponding optimal sampling rate is

$$f_s = \frac{(1+\varepsilon)\bar\omega}{\pi} \tag{7.3.6}$$

The variation of $\bar\omega/\omega_0$ with the damping ratio is also shown in Figure 7.3.1. It can be seen from the figure that the optimal input frequency with sampled observations is always less than or equal to the optimal input frequency with continuous observations (equality occurs only when the system is marginally stable). This is in line with the general observations made in Section 6.

8. CONCLUSIONS

This chapter has considered the problem of design of experiments for dynamic system identification. The main emphasis has been on the design of the sampling system, namely the choice of presampling filters and sampling intervals. It has been demonstrated that, in order to achieve maximal return from an

[Figure 7.3.1: Variation of Optimal Input Frequency versus Damping Ratio for Second Order System with "White" Output Noise. Legend: A: $\bar\omega/\omega_0$, continuous observations; B: $\bar\omega/\omega_0$, sampled observations.]

experiment, a coupled design of the sampling system and test signal is required.

The general problem, in which nonuniform sampling intervals are allowed, has been formulated and expressions developed for the information return from an experiment. Special identification algorithms are, of course, required to analyze data collected at nonuniform sampling intervals, but an appropriate maximum likelihood algorithm follows from the expressions developed for the likelihood function in the text. The general experiment design problem with nonuniform sampling requires a very sophisticated optimization to be carried out. To simplify the computations a suboptimal one-step-ahead design algorithm has been described. The algorithm would be suitable for on-line experiment design. Examples have been presented which show that significant improvements in identification accuracy can be achieved with this simple optimization procedure. Presumably, further improvements could be achieved by extending the optimization horizon, but the complexity of the design would be very greatly increased. For the case of uniform sampling intervals and power constrained test signals it has been shown that the experiment design problem has a simple interpretation in the frequency domain. The usual aliasing filter is optimal in a well defined sense and the optimal input spectra need only contain a finite number of lines. These facts lead to an optimization in a space of low dimension. Analytic designs are possible for simple cases.

The final choice between the different design methods outlined in this chapter (and other methods also) depends upon the nature of the constraints on the allowable experimental conditions. Care must be exercised in stating the constraints, as otherwise impractical or trivial designs can result. However, with judicious choice of the constraints, significant improvements in estimation accuracy can be achieved from relatively simple experiment design procedures similar to those outlined in this chapter.

It has been the authors' intention that this chapter should not only provide methods for experiment design but, perhaps more importantly, also provide insight into the ramifications of the information structure in identification problems.

Acknowledgments

The work reported in this chapter grew largely from joint work of the authors whilst they were in the Department of Computing and Control at Imperial College. The authors wish to acknowledge many helpful suggestions made by the staff and students of Imperial College. Special credit is due to Professor David Mayne, who was a constant source of inspiration, and to Martin Zarrop, who originally developed some of the techniques described in Sections 4 and 5 as part of his doctoral research.

REFERENCES

The references have been collected under the following headings for the reader's convenience:

1. Experiment Design for Non-Dynamic Systems
2. Test Signal Design for Dynamic System Identification
3. Design of Sampling Intervals for Dynamic System Identification
4. Mathematical Background

1. EXPERIMENT DESIGN FOR NON-DYNAMIC SYSTEMS

1. Cox, D. R., Planning of Experiments, Wiley, 1958.

2. Davies, O. L., Design and Analysis of Industrial Experiments, Oliver and Boyd, 1954.

3. Kempthorne, O., Design and Analysis of Experiments, Wiley, New York, 1952.


4. Kiefer, J. and J. Wolfowitz, "The Equivalence of Two Extremum Problems," Canadian Journal of Mathematics, Vol. 12, 1960, pp. 363-366.

5. Karlin, S. and W. J. Studden, "Optimal Experimental Designs," Annals of Math. Stat., Vol. 37, 1966, pp. 783-815.

6. Federov, V. V., Theory of Optimal Experiments, Academic Press, New York and London, 1971.

7. Whittle, P., "Some General Points in the Theory of Optimal Experimental Design," J. R. Statist. Soc., B, Vol. 35, No. 1, 1973, pp. 123-130.

8. Wynn, H. P., "Results in the Theory and Construction of D-Optimum Experimental Designs," J. R. Statist. Soc., B, Vol. 34, No. 2, 1972, pp. 133-147.

2. TEST SIGNAL DESIGN FOR DYNAMIC SYSTEM IDENTIFICATION

9. Levin, M. J., "Optimal Estimation of Impulse Response in the Presence of Noise," IRE Trans. on Circuit Theory, Vol. CT-7, March 1960, pp. 50-56.

10. Litman, S. and W. H. Huggins, "Growing Exponentials as a Probing Signal for System Identification," Proc. IEEE, Vol. 51, June 1963, pp. 917-923.

11. Levadi, V. S., "Design of Input Signals for Parameter Estimation," IEEE Trans. Automatic Control, Vol. AC-11, No. 2, April 1966, pp. 205-211.

12. Gagliardi, R. M., "Input Selection for Parameter Identification in Discrete Systems," IEEE Trans. Automatic Control, Vol. AC-12, No. 5, October 1967.

13. Goodwin, G. C. and R. L. Payne, "Design and Characterization of Optimal Test Signals for Linear Single Input-Single Output Parameter Estimation," Paper TT-1, 3rd IFAC Symposium, The Hague/Delft, June 1973.

14. Goodwin, G. C., R. L. Payne and J. C. Murdoch, "Optimal Test Signal Design for Linear Single Input-Single Output Closed Loop Identification," CACSD Conference, Cambridge, 2-4th April 1973.

15. Goodwin, G. C., J. C. Murdoch and R. L. Payne, "Optimal Test Signal Design for Linear Single Input-Single Output System Identification," Int. J. Control, Vol. 17, No. 1, 1973, pp. 45-55.

16. Arimoto, S. and H. Kimura, "Optimal Input Test Signals for System Identification - An Information Theoretic Approach," Int. J. Systems Science, Vol. 1, No. 3, 1971, pp. 279-290.

17. Mehra, R. K., "Optimal Inputs for System Identification," IEEE Trans. Automatic Control, Vol. AC-19, June 1974, pp. 192-200.

18. Smith, P. L., "Test Input Evaluation for Optimal Adaptive Filtering," Preprints 3rd IFAC Symposium, The Hague/Delft, Paper TT-5, 1973.

19. Box, G. E. P. and G. M. Jenkins, Time Series Analysis Forecasting and Control, Holden Day, San Francisco, 1970, pp. 416-420.

20. Nahi, N. E. and D. E. Wallis, Jr., "Optimal Inputs for Parameter Estimation in Dynamic Systems with White Observation Noise," Paper IV-A5, Proc. JACC, Boulder, Colorado, 1969, pp. 506-512.

21. Inoue, K., K. Ogino and Y. Sawaragi, "Sensitivity Synthesis of Optimal Inputs for Parameter Identification," Paper 9-7, IFAC Symposium, Prague, June 1970.

22. Rault, A., R. Pouliquen and J. Richalet, "Sensitivizing Inputs and Identification," Preprints 4th IFAC Congress, Warsaw, Paper 26.2, 1969.

23. Van den Bos, A., "Selection of Periodic Test Signals for Estimation of Linear System Dynamics," Paper TT-3, Preprints of 3rd IFAC Symposium, The Hague/Delft, 1973.

24. Goodwin, G. C., "Optimal Input Signals for Nonlinear System Identification," Proc. IEE, Vol. 118, No. 7, 1971, pp. 922-926.

25. Goodwin, G. C., "Input Synthesis for Minimum Covariance State and Parameter Estimation," Electronics Letters, Vol. 5, No. 21, 1969.

26. Reid, D. B., "Optimal Inputs for System Identification," Stanford University Rpt. No. SUDAAR 440, May 1972.

27. Sawaragi, Y. and K. Ogino, "Game Theoretic Design of Input Signals for Parameter Identification," Paper TT-6, Preprints 3rd IFAC Symposium, The Hague/Delft, June 1973.


28. Keviczky, L. and C. S. Banyasz, "On Input Signal Synthesis for Linear Discrete-Time Systems," Paper TT-2, Preprints 3rd IFAC Symposium, The Hague/Delft, June 1973.

29. Keviczky, L., "On Some Questions of Input Signal Synthesis," Report 7226(B), Lund Institute of Technology, Division of Automatic Control, November 1972.

30. Aoki, M. and R. M. Staley, "On Approximate Input Signal Synthesis in Plant Parameter Identification," presented at 1st Hawaii International Conference on System Sciences, University of Hawaii, January 1968.

31. Aoki, M. and R. M. Staley, "Some Computational Considerations in Input Signal Synthesis Problems," 2nd International Conference on Computing Methods in Optimization Problems, sponsored by SIAM, San Remo, Italy, September 1968.

32. Aoki, M. and R. M. Staley, "On Input Signal Synthesis in Parameter Identification," Automatica, Vol. 6, 1970, pp. 431-440. Originally presented at 4th IFAC Congress, Warsaw, June 1969.

33. Nahi, N. E. and G. A. Napjus, "Design of Optimal Probing Signals for Vector Parameter Estimation," Paper W9-5, Preprints IEEE Conference on Decision and Control, Miami, December 1971.

34. Napjus, G. A., "Design of Optimal Inputs for Parameter Estimation," Ph.D. Dissertation, University of Southern California, August 1971.

35. Tse, E., "Information Matrix and Local Identifiability of Parameters," Paper 20-3, JACC, Columbus, Ohio, June 1973.

36. Minnich, G. M., "Some Considerations in the Choice of Experimental Designs for Dynamic Models," Ph.D. Dissertation, University of Wisconsin, 1972.

37. Kalaba, R. E. and I. Spingarn, "Optimal Inputs and Sensitivities for Parameter Estimation," JOTA, Vol. 11, No. 1, 1973, pp. 56-67.

38. Payne, R. L. and G. C. Goodwin, "A Bayesian Approach to Experiment Design with Applications to Linear Multivariable Dynamic Systems," IMA Conference on Computational Problems in Statistics, Univ. of Essex, July 1973.

39. Mehra, R. K., "Frequency Domain Synthesis of Optimal Inputs for Parameter Estimation," Techn. Report No. 645, Div. of Eng. and Appl. Physics, Harvard University, July 1973.

40. Mehra, R. K., "Synthesis of Optimal Inputs for Multiinput-Multioutput (MIMO) Systems with Process Noise," Techn. Report No. 649, Div. of Eng. and Appl. Physics, Harvard University, February 1974.

41. Van den Bos, A., "Construction of Binary Multifrequency Test Signals," Paper 4-6, IFAC Symposium on Identification, Prague, June 1967.

42. Mehra, R. K., "Optimal Measurement Policies and Sensor Designs for State and Parameter Estimation," Milwaukee Symposium on Automatic Controls, March 1974.

43. Goodwin, G. C., R. L. Payne and M. B. Zarrop, "Frequency Domain Experiment Design for Multiple Input Systems," Report No. 75/17, Department of Computing and Control, Imperial College, London.

44. Payne, R. L. and G. C. Goodwin, "Simplification of Frequency Domain Experiment Design for Single Input - Single Output Systems," Report No. 74/3, Department of Computing and Control, Imperial College, London.

45. Viort, B., "D-Optimal Designs for Dynamic Models - Part I - Theory," Techn. Report 314, Department of Statistics, University of Wisconsin, October 1972.

3. DESIGN OF SAMPLING INTERVALS FOR DYNAMIC SYSTEM IDENTIFICATION

46. Goodwin, G. C., M. B. Zarrop and R. L. Payne, "Coupled Design of Test Signal, Sampling Intervals and Filters for System Identification," IEEE Trans. Automatic Control, Special Issue on Identification, December 1974.

47. Payne, R. L., G. C. Goodwin and M. B. Zarrop, "Frequency Domain Approach for Designing Sampling Rates for System Identification," Automatica, March 1975.

48. Astrom, K. J., "On the Choice of Sampling Rates in Parameter Identification of Time Series," Inf. Sci., Vol. 1, 1969, pp. 273-287.

49. Gustavsson, I., "Choice of Sampling Intervals for Parametric Identification," Report 7103, Lund Institute of Technology, 1971.

50. Zarrop, M. B., "Experimental Design for System Identification," M.Sc. Dissertation, Imperial College of Science and Technology, 1973.

4. MATHEMATICAL BACKGROUND

51. Silvey, S. D., Statistical Inference, Penguin, 1970, pp. 35-44.

52. Kullback, S., Information Theory and Statistics, Dover, New York, 1968.

53. Astrom, K. J., Introduction to Stochastic Control Theory, Academic Press, 1970.

54. Kailath, T., "An Innovations Approach to Least Squares Estimation - Part 1: Linear Filtering in Additive White Noise," IEEE Trans. Automatic Control, Vol. AC-13, No. 6, December 1968.

55. Cox, D. R. and H. D. Miller, The Theory of Stochastic Processes, Chapman Hall, London, 1967.

56. Wong, E., Stochastic Processes in Information and Dynamical Systems, McGraw Hill, 1970.

57. Yen, J. L., "On Nonuniform Sampling of Bandwidth Limited Signals," IRE Trans. Circuit Theory, December 1956, pp. 251-257.

58. Bendat, J. S. and A. G. Piersol, Measurement and Analysis of Random Data, Wiley, New York, 1966.

59. Anderson, B. D. O., K. L. Hitz and N. D. Diem, "Recursive Algorithm for Spectral Factorization," IEEE Trans. Circuits and Systems, Vol. CAS-21, No. 6, 1974.

60. Goodwin, G. C. and R. L. Payne, "Dynamic System Identification: Experiment Design and Data Analysis," book to appear, 1977.