Parameter estimation using transform estimation in time-evolving models

Parameter estimation using transform estimation in time-evolving models

Parameter Estimation Using Transform Estimation in Time-Evolving Models HANS-JURGEN SCHUH* ANDRICHARD L. TWEEDIE CSIRO, Division of Mathematics and ...

2MB Sizes 7 Downloads 138 Views

Parameter Estimation Using Transform Estimation in Time-Evolving Models HANS-JURGEN

SCHUH* ANDRICHARD L. TWEEDIE

CSIRO, Division of Mathematics and Statistics, P. 0. Box 1965, Canberra City, A. C. T 2601, Australia

Received

I June 1978; revised 5 December

1978

ABSTRACT Parameter estimation can be complex when populations change in time and p(t), the probability of an individual having a desired characteristic at time t, can only be estimated by the population proportion at time t having that characteristic. We consider sampling at the random time T where T has an exponential distribution; the population proportion then estimates the Laplace transform of p(t). This reduces the complications of parameter estimation in many common situations-for example, when p(t) is a convolution of functions, each involving different parameters. We show that a stratified random sampling technique gives a more efficient parameter estimation method, and that this lends itself to a relatively straightforward parameter estimation technique based on ordinary fixed sampling times. The efficiency of this method is explored in two ways: analytically in a simple case where p(t) is the probability of transition by time t of an organism in a two-stage model with Erlangian transition times; and by simulation in a model where p(t) is the probability of a biological organism being in a given stage of development and the parameters are the death rate and the maturing rates in each stage. Finally, data on the hatching of parasitic nematodes 0. circumcincta are fitted by least squares and by transform methods. Particularly in the complex simulation context the transform method has several desirable properties; explicit parameter estimators can be found, and they compare well with other estimation methods.

1.

INTRODUCTION In many

models

of processes

evolving

through

time, the events

of interest

can be described by a function p@(t) representing the probability that at time t the event occurs, where 8 is the value of the parameter of the model. For example, pa(t) may be the probability of an organism being at a certain stage of development at time t, and 8 represents the parameters of the maturation times and death rates in the various stages of the organism’s development. Often, whilst an explicit form of p@(t) is difficult to find, the

*Now at Department of Statistics, University of Melbourne, Melbourne, Australia. MATHEMATICAL

BIOSCIENCES

QElsevier North Holland, Inc., 1979

45~37-67

37

(1979)

0025-5X4/79/050037

+ 3 1$02.25

38

HANS-JiJRGEN

SCHUH AND RICHARD

L. TWEEDIE

model can be expressed fairly simply in terms of the Laplace transform &e(s)= Jo”pe(t)e-“‘dt: this is a well-known phenomenon in, for example, queueing and dam theory, in biological modeling situations, in manpower models, and in reliability theory. We are interested in situations where it is fairly simple to estimate p@(t), say by a population proportion n(t) produced by sampling at time t: in the biological model above, n(t) would be the proportion of organisms found in the relevant stage when sampled at time t. However, we wish to estimate 0 rather than p@(t), and in the cases we consider, the complexity of p@(t) as a function of 8, or sometimes the fact that no closed form for pe(t) exists, makes estimation of 0 from estimation of p@(t) a cumbersome if not impossible task. In this paper, we propose that, rather than estimating 8 by that 4 for which the functionp,(t) is “closest” to n(t), we might estimate 0 by the 4 for which the function G@(s)= s&(s) is “closest” to n(T), where T is a random sampling time with exponential density se-“. Neither the idea of parameter estimation via some sort of transform estimation, nor the idea of estimating Laplace transforms by using exponentially distributed sampling times, is particularly new. For example, empirical characteristic functions have been used, especially for estimating the parameters of stable laws, by Heathcote [lo]; Feigin and Heathcote [7]; and Paulson, Holcomb, and Leitch [ 151; whilst Kingman [12] suggests using random sampling times, finding some optimal properties of such counts, and Basawa [3] looks at maximum likelihood properties of transform estimators themselves in a random sampling time situation. Gaver [8] uses an approach very like ours, but to approximate Laplace transforms for numerical inversion. We shall look at some of the properties of the parameter estimators derived from transform estimators, and shall show that in many cases of interest, they are computationally far easier than other methods, often even having explicit forms, whilst also being reasonably efficient in estimating the parameters of the model. In Sec. 2 we give details of three Laplace-transform estimation methods we propose: a simple Poisson count approach, a more sophisticated stratified random-time sampling procedure, and an approximate method using fixed-time sampling, which may be the most useful in practice. In Sec. 3 we investigate a simple two-stage biological model, in which the properties of parameter estimators derived from both fixed-time estimation of p@(t) and random-time estimation of &e(s) can be found analytically. We given some detailed comparisons of the mean squared errors of these estimators, and see that in particular the stratified sampling version of the transform estimator does not compare too unfavorably with the estimator derived from pe( t).

PARAMETER ESTIMATION USING TRANSFORMS

39

We look at a hierarchical model for population growth, which is also a standard manpower planning model, in Sec. 4. In this context parameter estimation via p@(t) can be computationally extremely cumbersome. We show that estimation via Laplace transforms can be much simpler, and in the case where death, or wastage, rates are constant over growth stages, or grades, we find explicit estimators for maturation, or promotion, parameters. Properties of these are investigated using simulated data, and they appear to perform rather well. Finally, in Sec. 5 we fit the models of Sec. 4 to data on the hatching of the sheep nematode 0. Circumcincta, and compare the results with those given by a standard least-squares procedure. Results of these sections are generally encouraging, and indicate that this approach may be of considerable use in the fitting of models to inherently difficult biological data. 2. 2.1.

ESTIMATION

OF TRANSFORMS

RANDOM- TIME SAMPLING

ESTIMATORS

We consider throughout a situation where the event of interest can occur at times t E[O, co): the case where the time set is (0, 1,. . . } can be handled completely analogously. We assume we have a model in whichp,(t) represents the probability that the event occurs at time t; here B is an unknown parameter which we wish to estimate. We assume that we have a number of independent populations which can each be sampled only once, but at arbitrary time points; this occurs, for example, if sampling is destructive. If a population 9Z is sampled at time t, the proportion n(t) of the population 9Z in which the desired event occurs is assumed to satisfy E(n(t))=Pe(t). Our first tool is then a simple result of the strong law of large numbers and the use of conditional expectations. THEOREM

I

(i) If T has a density a(t) and is independent of the sampled population %, then n(T) is an unbiased estimator of Szp@(t)a(t)dt. (ii) If T,,T,,... are independent and identically distributed as T, and if the sampled populations GSL, ,9&, . . . are independent of each other and of the sequence { Ti}, then m -‘Zjm_ ,n( Tj) is an unbiased and strongly consistent estimator of Jo”Pe(t)c~(t)dt as m+oo. Remark. Notice that each of the quantities n(q) is obtained from a separate population, since sampling is destructive. For simplicity of notation, however, we shall use the present notation throughout the paper; this should not cause any difficulty.

HANS-JiJRGEN SCHUH AND RICHARD L. TWEEDIE

40

Example I. If (r(t)=se-S’, t > 0, then ~-‘rn-~Il$,n(Tj) is an unbiased and strongly consistent estimator of the Laplace transform @@e(s)= So”pe(t)e-S’dt. Example 2. When the time set is (0, 1, . . . }, the analogous result is that, taking cr(l)=x’[l-x], t=O,l,..., O 0, then m - ‘Z;ln( I;.) is an unbiased and strongly sistent estimator of

sjad”-’

- ((a - l)!

+8e(s)_

&“-i

with con-

P-1)

The Laplace transform, as in Example 1, and the analogous generatingfunction transform, can be particularly useful in simplifying estimation procedures when p@(t) is obtained as a convolution of other probabilities. Section 4.1 illustrates this point more than adequately. The transform in Example 3 may not be an obvious quantity to wish to estimate; however, as a+w with a/s held constant, the sampling time tends in distribution to the fixed sampling time t = a/s, and one can thus expect that the mean squared error of the estimators of 0 derived from this transform estimator might tend to that of the estimator of 8 derived from fixed-time sampling. On the other hand, we might also hope that in many cases the parameter 8 would occur in the derivatives of &,(s) in a more accessible way than in the probabilities p@(t) themselves, and so we could achieve computational simplicity using Example 3 whilst retaining most of the accuracy of fixed-time sampling. This is illustrated in Sec. 3.3. 2.2.

STRATIFIED

TRANSFORM

SAMPLING

The method of Sec. 2.1 is essentially inefficient, since we are generating ancillary variables (the sampling times q) and then failing to condition on the known values of the q (see [S, p. 381). One possibility of increasing the efficiency of estimation of an integral is stratified sampling [9, Sec. 51. M+l} be a set of points with O=c,
where 4 = j?+u(t)dt.

Further

set T=(T(‘),

T(‘),.. ., TcM)).

PARAMETER

ESTIMATION

41

USING TRANSFORMS

We get immediately: THEOREM

2

The estimator

based on the stratification

n,(T)=

5

S given by

$n(TCI’))

j=l

is an unbiased

estimator

of j Fp@(t)a(t)dt.

We now show that this estimator n,(T) is also strongly consistent in an appropriate sense. For this we need some technical preparations. Let X be an individual of our population. We will say (*) holds if there exists a sequence e, with lim,~, =O, such that for any t, t,, t2 with Sf-,2a(t)dt=

S:+‘la(t)dt=

l/n,

P(X changes its stateduring

the time interval

[t - t,, t + t,])
It is easily checked that (*) holds in all situations we consider in this paper. (S, = {0= k~1 Let now S,, S, ,..., 5, ,... be a sequence of stratifications
k4=

max IG
/

*‘+‘a(t)dt aC,

+ k-m

0.

(**I

Thus, the later stratifications are refinements of the earlier ones, and the a-weights of the strata [k+&+,) converge uniformly to zero. Define for Sk analogously to T, T(l), . . . for S . kT,kTti),... We sample now a sequence of independent populations ‘X,, &, . . . , all of the same initial size N, according to the following sampling rule: Take 92 ,,“‘? 9ZMI’ the proportions n(, T(l)), . . . ,n(, Tc”l)) from the populations Now for every 1
42

HANS-JiJRGEN

THEOREM

SCHUH AND RICHARD L. TWEEDIE

3

If conditions (*) and (**) hold, and if the proportions n(k T(j)) are taken according to the sampling rule above, then

Proof. combined 2.3.

This is again a consequence of the strong law of large numbers, with condition (*); we omit the details.

FIXED-TIME

SAMPLING

In general, persuading an experimenter to sample at random time points is a nontrivial exercise; and even if he is amenable to the idea, he may already have data sets sampled in a more conventional manner. This latter is the case with the data set analyzed in the final section of this paper. The stratified sampling scheme of Sec. 2.2 suggests the following transform estimator: let t ,,. .., t,,,, be the fixed sampling times, and choose the strata {[ci,cj+ ,)} of Sec. 2.2 such that ti e[cj,ci+ J. Then provided p@(t) and a(t) are reasonably smooth, q(t)=

5 n(ti)Zi i=l

should be an adequate, if biased, estimator of the transform Jrp@(t)a(t)dt. Bounds on the bias can easily be found in particular cases when the variation of p@(t)a(t) over the ranges [+q+i) is known; and the variability of the estimator is again reduced, since we are no longer introducing sampling-time variability. This method is in fact estimating the Stieltjes sum approximating the integral in which we are interested. As such it is related to the methods of empirical characteristic function estimation of Heathcote [lo]; Feigin and Heathcote [7]; and Paulson, Holcomb, and Leitch [ 151. For practical purposes the fixed-time estimators may be the most useful of those we propose here, and in Sets. 4 and 5 we see that they do seem to give reasonable parameter estimation compared with the estimates from random-time sampling and with nontransform estimation. 2.4.

CHOICE

OF SAMPLING

RATES

One of the problems arising above is the choice of the sampling rate s when T has density se-“. In general, it seems reasonable to choose s - ‘, the mean sampling time, near values of t where pe( t) is changing fairly rapidly;

43

PARAMETER ESTIMATION USING TRANSFORMS

if p@(t) is, say, a cumulative distribution function with density go(t), then this would involve most sampling near the mode of g@(t). This point will not, of course, usually be known; but one of the advantages of the fixed-sampling-time method of Sec. 2.3 is that, unlike the random-time methods of Sec. 2.1 and 2.2, it does not demand that s be fixed in advance. Hence estimation can be carried out iteratively until the value of s is chosen in a satisfactory range relative to the estimated parameters, i.e., relative to an estimate of the mean or mode of g@(t). Some effects of varying s for fixed 0 are illustrated in Sec. 3.2, and an example of an iterative estimation procedure is given in Sec. 5. It is also interesting to note that, as well as getting an estimator n(T) of I,!J~(s)for the chosen rate s as in Theorem 1, one also has for any u

E(n(T)e‘“-“‘yJ/S))

= q&(u);

thus it is possible to estimate &(a) from n(T) for all values of u, rather than only for u = s. In this sense n(T) contains information on the whole Laplace transform r&,(s), and hence 13can be estimated using methods such as those of Paulson, Holcomb, and Leitch [IS]. Finally, we mention that optimal choice of s is somewhat related to the work of Box and Lucas [4], Kiefer [ 111, and White [18]. 3.

RELATIVE

EFFICIENCY

IN TWO-STAGE

MODELS

To examine the viability of random-time sampling of transforms compared with a more normal estimation procedure, we investigate the relative efficiencies (defined essentially as relative mean squared errors) of fixedtime and random-time estimators in the simple case of organisms which pass through only one stage, and which can be sampled individually; the example of Sec. 5 is similar to this model, although there sampling is usually only done in groups. 3.1.

THE MODEL

We consider a situation where we have N organisms with lifetimes (i.e., times to pass through some stage in their development; for example, the egg and identically stage in Sec. 5) Xi, i= 1,. . ., N, which are independent distributed with distribution function p*(t) = P(X, < t), t > 0, depending on parameters 8. The situation we will investigate is that where the exact times to transition are essentially unobservable, so that if the jth organism is sampled at time t,, the data will consist of 5 = 1 if the organism has made its transition and q =0 if not. An industrial example of this type of “censored reliability” is given by Denby, Fowlkes and Roe [6].

44

HANS-JtiRGEN

We shall compare (i) Fixed-time ZZyrj is

the following

SCHUH AND RICHARD L. ‘lWEEDIE

two situations:

All t,= t, and the likelihood

sampling.

of the data r=

I,=( f)[Pe(t)lr[l -P&)K so that the maximum-likelihood

estimator

of 8 is the solution

of

pdt>= r/N.

(3.1)

We shall deal only with models for which this solution is unique. (ii) Random-time sampling. The sampling times q are independent and identically distributed with density se-S’. The likelihood of the data r = CT? is now L=

(

;

>[+3(s)l’[l

-h’e(~v+~

where q@(s) = S&(S) = J o”pe(t)se e-s’dt. Hence estimator of 0 is that value of 0 satisfying

the

maximum-likelihood

#O(S) = r/N. Again in our models the solution To illustrate where

the behavior

will be unique.

of these estimators,

pe(t) = where g(x) is the Erlangian

(3.2)

we shall consider

the case

I0fg(x)dx,

density with a > 1 a fixed integer, given by

g(x) = x a-le-hxhu/(a-

l)!,

x>o,

A>O.

(3.3)

In this case the unknown parameter 0 is h, and I+!+(S)= [X/(h + s)p is just the Laplace transform of g(x). When (3.3) holds, (3.1) becomes l-ePA’

a-1 (th)’ _ r 2 j!-N'

(3.4)

j=O

and we write x(t) for the estimator the estimator fi,(s)=$r/N)“‘[

solving (3.4); whilst from (3.2), we have

1 -(r/N)““]-‘,

(3.5)

PARAMETER

ESTIMATION

USING

45

TRANSFORMS

so that random-time sampling leads to an explicit estimator contrast to the fixed-time sampling. 3.2.

RATIO

OF ASYMPTOTIC

MEAN

SQUARED

for any a, in

ERRORS

In order to get some idea of the relative efficiencies of the estimators x(t) and i,(s), we would like to compare their mean squared errors. However, E@(t) -A]*) and E@,(s) -Alz) are not very meaningful expressions, since both A(t) and x,(s) take the value cc with positive probability. Before showing how to overcome this difficulty, we first study the probabilities that A(t) or A,(s) are infinite, By (3.4) and (3.5) i(t)= cc or A,(s) = cc if and only if r = N, and hence P(r;(t)=co)=p,(t)N, and

P@,(s)= co)=&e(s)“. To compare these probabilities, we choose s= t-l (so that the expected value of the sampling time is equal to t). We further take t = a/X, to match the sampling time with the mean transition time. In this case we get p(fi(+,)=

=y-1 (l--e-“‘a-l

-i._i,l-?4: j=o

.

N )

P(K,(S)=+(l-&-)aN=(l--&)~N = : B(cqN. Now A(l)= 1 --e-l and A(a).J$ as a+cc (by the central limit theorem and some analysis); but B(l)= i and B(a)Je-’ as a-+ocl. Hence we always have (independent of a and N) P(Q)=

co) >P&(S)

= cc).

On the other hand, as N-co clearly P@(t) = cc) and P(x,(s) = co) go to 0, and the influence of this effect on the estimators disappears geometrically fast. In order to get some measure for the relative efficiences of x(t) and x,(s), we have to restrict ourselves to the event {r < N}.

46

HANS-JtiRGEN

SCHUH AND RICHARD L. TWEEDIE

Let i = i(t) or x,(s) respectively. We use E*(e) to represent E(. )r
l_e-Yr

lb,’

_x

if

j!

fi=fi(t),

j=O

or SX’/U f(x)=

Sincef(r/N)=x

if

___ 1 -Xl/O

X=8X,(s).

andf(E(r/N))=X,

But from the Taylor series approximation E*([f(x)-j(E*(~))]~)=Var*(x)[f’(E*(x))]~+.** we get

Using (3.7), we now evaluate (3.6) asymptotically and show that the last two terms on the right hand side tend to 0 as N goes to infinity. Let

and q = 1 -p.

(Note that p < 1 always.) We get

NP

N-l

q

(1 -PNj2 _-

1

1 -pN

(3.8)

41

PARAMETER ESTIMATION USING TRANSFORMS

Hence +

pq=NVar(G),

f’(E*(i))Nzf(P)j

N-+CC

and hence the right-hand analysis)

side of (3.7) is asymptotically

equal to (after some

pquYp))2=q [ys)]2[(1+;)a_I], Now m

]E*(fi-f(E*(~)))I

< [ N~*([j(~)-f(E*($))]2)]“2
by (3.7). Further,

~p-f(“*($))l=q_f(q~))-f(E*(~))I =m.~(~)~.IE(f)-E*(~)l < const.

fl

.pN- 1-p I-pN’

by (3.8).

[where

TJ is a suitable value between E*(r/N) and @r/N)]. Hence is asymptotically equal to (3.9) or (3.10) respectively. We are now able to define the ratio of the asymptotic mean squared errors of i(t) and i,(s) as

N-E* ((A -A)2)

RSE@(t),fi,(s);

E*([fi(‘)-X12) A) = lim N-+* E*( [&(s)-h]3 =ratio

of (3.9) to (3.10),

which can be considered as an indicator of relative efficiencies. In Table 1 we give numerical values for RSE@( t),,%,(s); A) when h = 1, in the two simplest cases when a = 1 and a=2. We look at a range of fixed

48

HANS-JtiRGEN

SCHUH AND RICHARD L. TWEEDIE

TABLE 1 Ratio of Asymptotic Mean Squared Errors, R=RSE&t),&(s);X) (h= 1) of FixedTime and Exponential Sampling Estimators When Data Have Exponential (a= 1) or Erlangian (a = 2) Density for Failure Timesa tss-’

0.1 0.2 0.25 0.33 0.5 1 2

t=s-’

R a=1

a=2

0.869 0.769 0.727 0.670 0.577 0.43 1 0.355

1.567 1.274 1.160 1.014 0.793 0.479 0.293

R

a=1 3 4 5 6 10 +O -+M

a=2

0.398 0.536 0.819 1.369 18.203 1 co

0.255 0.276 0.345 0.484 3.812 2 co

“For a= 1, we have R-(e’l)/r(r+ 1)2, and for a=2, Rd(e’--tl)/t2(t+ 1)(2t+ 1). The fixed sampling time is t = s - ‘, the mean of the random sampling times.

sampling times t, with t = SC’, so that the mean sampling times are identical in each scheme. Provided the mean sampling time t is not too far from the mean transition time a of the population, the fixed-time sampling scheme is clearly better, by up to a factor of 4 in these simple cases. When the sampling times are far from the true mean transition time, the randomsampling-time scheme seems much better; however, this is to some extent illusory, since sampling in the extreme tails of the transition time distribution will give very bad estimates from either scheme. 3.3. IMPROVEMENT ERLA NGIAN

IN THE ASYMPTOTIC SAMPLING TIMES

MEAN

SQUARED

ERROR

WITH

Since the exponential sampling times are less efficient than the fixed-time sampling schemes, we now examine the quality of Erlangian sampling schemes, as described in Example 3 of Sec. 2.1. Again we assume the transition times are Erlangian with parameters (a,X), with a known; but now for the density of the sampling times we take an Erlangian with parameters (b,s). The maximum-likelihood estimator &,(.s) can be shown to be the solution of (3.11) [Note that in the special case b = 1 (i.e., exponential sampling notation A,(s) is consistent with x,(s), defined in (3.5).]

times) the

49

PARAMETER ESTIMATION USING TRANSFORMS TABLE 2 Ratio of Asymptotic Mean Squared Errors R, =RSE(&(2s),i,(s); A))-_(~s+ 1)(2s+ lp/ 16s(s + lp of Er_langim_(6 = 2) and Exponential (6 = 1) Sampling Estimators, and R, = RSE(A(s- ‘),h2(2s);h)- 16s4(e’/”- 1)/(4s + 1)(2s+ 1)’ of Fixed-Time and Erlangian (6 = 2) Sampling Estimatorsa s

0.2 0.25 0.33 0.5 1

R, 0.765 0.720 0.685 0.667 0.703

R2

1.070 0.744 0.585 0.532 0.611

s

RI

R2

2 3 4 5 -+co -+O

0.781 0.829 0.861 0.882 1 M

0.738 0.805 0.845 0.871 1 0

aData have exponential (a= 1, X- 1) density for failure times; mean sampling time is constant in all schemes at s- ‘.

It is informative to study the ratio of the asymptotic mean squared errors of &(s) and i(t), since this will shed some light on the increased efficiency derived from using Erlangian sampling times. If we consider the simplest case, a = 1, of exponential transition times, then

and

N.E*([);(I)-X12)-t-2[eth_I]. A comparison of fi,(2s) with x,(s) in terms of RSE(i2(2s), i,(s); h), again with mean sampling times held constant, is given in Table 2. It is clear that a relatively uniform improvement results over a wide range of sampling rates when the lifetime has a simple exponential distribution with A= 1; from the form of Z?, in Table 2 it follows that taking b = 2 is more efficient provided s > (fl -2)/6=0.108.. . . Table 2 also gives a comparison of i(s-I) and &(2s). 3.4. COMPARISON

OF OPTIMAL

MEAN

SQUARED

ERRORS

In the preceding sections we compared asymptotic mean squared errors in cases where the sampling means were equal. This, we think, compares the estimators which would compete in a practical situation. However, if the true parameter X is known, then the mean sampling times which minimize

50

HANS-JiJRGEN

SCHUH AND RICHARD L. TWEEDIE

the asymptotic mean squared error for the different schemes are in fact different. For example, if the data are exponential with mean A= 1, then for random exponential sampling times, the asymptotic mean squared error is minimized by choosing s = 1 also; if the sampling times are Erlangian with parameter a =2, then the optimal sampling rate is s =(l + fi )/21.618...; if sampling is at one fixed time, then the optimal time is at the solution of 2e’ - 2 - te’ = 0, which is approximately t = 1.5936.. . . The ratios of these optimal asymptotic mean squared errors are then RSE(fi( 1.5936),x1( 1); 1)=0.386, RSE(i(1.5936),&(1.618);

l)-0.557.

These are perhaps more reasonable measurements of the true loss of efficiency in using random- rather than fixed-time methods. 3.5.

STRATIFIED

SAMPLING

EFFICIENCY

As remarked in Sec. 2.2, a stratified sampling procedure should give a more efficient estimator than those considered above. In this section we examine some of the properties of such stratified estimators. We assume that we have M strata [c,,c,+,), i=l,..., M, with O=c,
hence an unbiased

estimator

of the transform

c$~(.s)= J eps’po(t)dt is given

by

s-$qM,s)=s-’

2 Ik$

k=l

where for notational convenience we have suppressed the dependence of $(M,s) on the fixed partition {ck}. The variance of $(M,s) can be found easily, since r, is binomial. For simplicity we shall consider the case of equiprobable strata (i.e., Ik sl4 -I). We find that Var(~(M,s))=N-‘[~(s)-MZn,2],

(3.12)

51

PARAMETER ESTIMATION USING TRANSFORMS

compared

with the unstratified

case (3.13)

Var(t&l,s))=Var(r/N)=N-‘$(s)[l-4(s)].

If fi(M,s) is the estimator of A from using $((M,s) in place of (r/N) in (3.5) then from the form of the Taylor series expansions given in Sec. 3.2, we have that the ratio of the asymptotic mean squared errors Var($(M,s))

RSEo;(M,s),~(l,s);h)=

Var(d(l,s>)

(3.14) ’

and so the increased efficiency of using stratifying is given by the ratio of (3.13) to (3.12). Finding this ratio analytically seems nontrivial in general, but in the simplest case it is possible. Let us assume that g(t) is exponential (i.e., a = 1) with parameter X, and that our exponential sampling time has the same parameter s =A. From Sec. 3.4 we know that this is the optimal sampling rate for M = 1. After some algebra, we find that

so that

The ratio (3.13) is then easily calculated;

we find

Var(&M2h))

=$+iM-2

Var(J(l,X))

(3.15) 3

*

As M+co, then, this ratio tends to $ ; hence if, from Table 1, we wish to compare the asymptotic mean squared error of the fixed-time estimator A( 1) with that of fi(M, 1) in the case h= 1, we have that approximately for large M, RSE@( I), A( M, 1); A) = 0.644. Thus we see that the stratified sampling procedure in this case lies midway between the random-time sampling procedure and the fixed-time procedure in efficiency.

52

HANS-JiJRGEN

SCHUH

AND

RICHARD

L. TWEEDIE

There are several points about (3.15) that bear mentioning. One is that (3.15) decreases monotonely as A4 increases; hence with N points to be used, the optimal strategy seems to be to calculate $(N,s). However, this involves calculating random times using N different distributions, and the extra work involved may not be worth the increased efficiency; it is also obvious that, in (3.19, the gain in efficiency is minimal once M has reached a value of 4 or 5. Now let us consider the ratio (3.14) when the sampling rate s is not equal to A, but where again g(t) is exponential with parameter A. If we let 6 be such that (a-

l)s=A,

then using the fact that C~= - s- ’ log[ 1 - (k - 1)/M],

as M-too

we get

with k/M held fixed. But now

Using this, we have that for large M, Var<$( M,s)) Var(&(l,s))

---

6 26-

(3.16) ’ ’

which agrees with (3.15) when 6 =2. Using this approximate result, we tabulate in Table 3 the values of (3.16) for A- 1 and various d-values, and we also give a comparison of the asymptotic mean squared error of i(t) with that of fi(M,s); this shows that the stratifying makes the Laplace-transform-derived estimator a viable competitor of the fixed-time estimator over a much greater range than is the unstratified transform estimator.

PARAMETER

ESTIMATION

53

USING TRANSFORMS TABLE 3

Limiting Ratio of Asymptotic Mean Squared Errors R,= limM_eo RSE&M,s), i(l,s); A)z(t + 1)/(2t + 1) of stratified and unstratified sampling estimators, and RSE.&r),i(M,s); +-(2r+ l)(e’- I)/r(r+ 1)3 of fixed-time and stratified R, = lim,,, sampling estimator@ t=s-’ 0.1

0.2 0.25 0.33 0.5 1 2

R3

R4

0.917 0.857 0.833 0.801 0.750 0.667 0.600

0.948 0.897 0.873 0.836 0.769 0.644 0.592

t=s-’

4

R4

3 4 5 6 10 +O 400

0.571 0.556 0.546 0.539 0.524 1 L 2

0.696 0.965 1.501 2.542 34.75 1 1 co

*Data have exponential (a= 1, h= 1) density for failure times; mean sampling times are constant at t = s - ‘. 3.6.

CONCLUSIONS

We have investigated the comparative behavior of fixed- and randomtime estimators in this simple model in some detail, not only because of the intrinsic interest of the model but because here we have been able to find analytic expressions for the quantities involved. It is clear that the randomtime estimators are more variable than the fixed-time estimators. This is not surprising; the random estimation times form ancillary statistics for estimation, and by not conditioning on their (known) values, we are bound to increase variability. However, use of either Erlangian sampling schemes or (more likely) stratified sampling schemes does make the Laplace-transform method more comparable with the fixed-time method; and considering the ease of computation of the random-time estimators, increased variation of the order exhibited in Tables l-3 may be quite acceptable in many situations. 4. 4.1.

ESTIMATION THE CLASS

IN HIERARCHICAL

MODELS

OF MODELS

There are many systems where organisms evolve in time through a hierarchy of stages 1,2,3 , . . . , Z, with the lifetime in stage i influenced by a “death rate” h, competing with a random time until maturation to stage i+ 1 with density gi(x), x > 0. Bartholomew [2, Chapter 51 describes a manpower model of this type, with h the wastage rate from grade i and

HANS-JtiRGEN

54 g,(x)

describing the time to promotion; model in a biological context. The evolution of each independent described by the probabilities

SCHUH AND RICHARD

L. TWEEDIE

Read and Ashford

[16] give the

organism

X in such

models

is

p;(t) = P(X in stage i at time t/X starts stage 1 at time zero), which, if behavior

is also independent Pi(t)=h~*h~**‘*

in different

stages, has the form

*h&,*Hi(t)f

(4.1)

J ,“gi(x)dx, and * denotes convolution. where h,(t)=g,(t)e-H’, H,(t)=e-fir The model of Sec. 3 is a special two-stage version of this model with initial “populations” containing only one individual each. Typically in our situation, sampling is destructive of the population, and so when sampling at time t, the only measurable variables are N,(r), the number of organisms alive in stage i at time t, and N, the original population size. This is much less information than one would usually have in the manpower model, where actual times of retirement or resignation and times of promotion from grade to grade are available for each individual. The estimation of the parameter set [which in this case consists of the b and parameters of gi(t) for i = 1,. . . , I] by fitting Ni(tk)/ Nk to pi( tk), where Nk is the initial size of the population sampled at time tk, is difficult because of the form of (4.1). Even in simple cases (4.1) is a complex function of the parameters, as we shall see in Sec. 4.2. If, however, we let K denote independently ?.istributed sampling times with exponential density se -“, then the quantities

(4.2)

are consistent for parameter $j,

estimators estimation

of Gj(s) = S+J(S)= s 1 ,“pj(t)ees’dt; and using this is facilitated by the simple functional form of

namely

~(~)=~P,(~+~LI)..‘pi-I(~+~--_) where pi(s) = J $e-s’gi(t)dt.

l-Pj(s+Fj)

s+y



(4.3)

PARAMETER ESTIMATION USING TRANSFORMS SPECIFIC DISTRIBUTIONAL

4.2.

55

MODELS

If each of the densities gi(t) in (4.1) is exponential with parameter provided pi = b +h, are all distinct, one has [2, p. 1541

Xi, then

i-l P;(t)=

II,&

i

e-”

mal

(Pm-P/J-I.

h=l m#h

However, exponential maturation times are not usually a good fit to biological data, because the mode is at zero; and a more reasonable assumption is frequently a gamma distribution of shape parameter not equal to unity (see Figs. 1 and 2 in Sec. 5). The first model we shall investigate is thus (4.1) with gi(t) taken to be an Erlangian of order a and unknown parameter 4.. If pi =b+& are still distinct, then one can show that p,(t)

=

i-l

o-l

“v -

j-0

Ix

+

(4.4)

Clearly, parameter estimation by fitting (4.4) to population proportions is computationally nontrivial, even for a =2, although Read and Ashford [16] and Ashford, Read, and Vickers [l] do manage this on some data sets. We remark in passing that K. L. Q. Read, in his Exeter Ph.D. thesis, gives (4.4) for the cases a = 1,2,3, using direct evaluation of (4.1). His form is superficially different from ours, but some algebra shows them to be equivalent; in the case a= 1 the crucial step is the use of the algebraic identity, for distinct real pi, i r=l

ii

(Pj-Pr)-'=09

j=l

j#r

a variety of proofs of which are given in [17]. If, instead of using an Erlangian of known constant order a and varying scale parameter & for the density g,(t), we use a gamma distribution with known constant scale parameter h and stage-dependent shape parameter a,, then the pi(t)% no longer have a closed form; however, in the special case of

56

HANS-JtiRGEN

constant

SCHUH AND RICHARD L. TWEiEDIE

b~i-p they do have the uncomplicated

integral form

p,(t)=Aj-,(t)-A,(t),

(4.5)

where 1dx

and

In contrast to (4.4) and (4.9, the transforms $(s) have a particularly simple explicit form under the assumption that g,(t) is a gamma distribution with shape parameter a, and scale parameter Xi; this gives us (4.3) with

ww=( x,+;+s)a’. In the Erlangian case where a,za is known as in (4.4), or when known as in (4.9, then using the functional forms (4.3) and obvious computational advantages. A more detailed analysis of which might be used in this situation, and of the fixed time mentioned above, will be given elsewhere. 4.3.

THE CONSTANT-DEATH-RATE

&=A is (4.6) has methods methods

CASE

Here we will look at the particularly pleasant case where h=l in every stage. In the biological literature this is the case most frequently analysed (see e.g. [16, 1, 13, 141) and has simplifying properties even in the fixedsampling-time situation; see for example (4.5). The crucial advantage of taking p constant is that it can then be estimated separately from the ?+ If the sample times are T,, . . . , TK, then for each organism X, P(X is alive at time TklX begins stage 1 at time zero) = e-p%; hence, if N(T,)=Zf, ,Ni(Tk) is the total (stage-independent) at time Tk in the kth population, when the Tk are independent with parameter s,

m k=l

E

=

Se-s’e-P’dt= J



hk k=l

s

S+P ’

J

_.

number alive exponentials

57

PARAMETER ESTIMATION USING TRANSFORMS Thus a strongly

consistent

estimator

of p is given

5 N/c

k=l

ji=.S

-1.

5 k=l

(4.7)

N(Tk)

It is worth noting that fi is always non-negative, no matter what the sample values. Using this estimator for ~1, we can now derive actual explicit estimators for the other parameters in both the constant-shape case (4.4) and the constant-scale case (4.5). Let us look at the former case first. Assume a,=a is known, and notice that, from (4.6) &=

!J+s

[p,(s)]-““-

1’

so that if we have estimators of p and pi(s) which are strongly consistent, then by substituting in (4.8) we can achieve a consistent estimator of 4. Let us write

(j)N(T,)= i

N(G)

;=j

for the number we have

seen in stage j or higher at time T,.From (4.6)

of organisms

P,(s)=l-(~+s)~,(s)/s, and substituting gives us

N( T,Jfrom (4.7) and d,(s) from (4.2)

(i + s)/s = ZN,/C

1

$,

;$* Ni(=k) $ N(T,)

‘l(‘)=

(4.9) Induction

then leads us to the general estimator

C (i+ *)N(G) &)=

k

T WN(=k)



(4.10)

58

HANS-JiJRGEN

SCHUH AND RICHARD L. TWEEDIE

and substituting (4.10) in (4.8) gives a complete the set &, of the form

set of explicit estimators

for

(4.11)

Again, since 0 < &s) < 1 for any sample, h.(s) is always well defined except when &s) is 0 or 1, i.e., when no organisms are seen in stages i + 1 onward, or when no organisms are seen in stage i; and in the former case estimation of A, is intrinsically impossible anyway, whilst in the latter it is only possible to estimate the maturation times from i - 1 to i + 1. The constant-scale situation is similar. Take &~h to be known. From (4.6) we have

lo&(s)

ai=log[X/(X

+ s + ZL)-J ’

and so: using (4.10) and (4.7) again, we find that estimators parameters are given by l”gBi(s)

;ds)=log[X/(X+s+fi)] Provided 4.4.

pi(s)>O,

PROPERTIES

.

of the shape

(4.12)

C+(S)is well defined for any sample, as with k.(s) above. OF THE EXPLICIT

ESTIMATORS

In this section we give the results of a variety of simulation experiments which give some idea of the properties of the estimators fi and & given by (4.7) and (4.1 l), and of iii given by (4.12). In each simulation, the initial population Nk was taken as Nk = 50, and the number of populations (i.e., the number of sampling times) was taken as K= 20. The number of stages was taken as Z= 4. The distributions gi were first taken as Erlangian of second order, and the parameters & were taken as Ai= 1 in each stage i = 1,2,3. The death rate was taken as I= 0.17, so that mean time of death was at the mean of stage 3. Three different sets of sampling times T,,. . ., T2,, were generated as independent exponential variables, with sampling rate s taking successively the values s = 1, s = 0.333, and s = 0.2; these correspond respectively to the mean sampling time equaling the means of stages 1, 2, and 3. For each of these three simulations, the parameters fi and & were estimated using (4.6) and (4.10), assuming a=2 to be known. Fifty replicates of each of the above schemes were carried out, the means and standard errors of the fifty estimators thus derived being given as the leftmost of the three entries in each place in Table 4.

PARAMETER

ESTIMATION

59

USING TRANSFORMS

TABLE 4 Mean and Standard Deviation of 50 Simulations of Four Stage Populations, Sampled with Exponential Rate s; Estimates of & under the Assumption ai=2 Known and Estimates of a, under the Assumption &= 1 Known

Theoretical

P’O.17 x, = 1.0 h,=l.O AJ= 1.0 =I =2 a,=2

a,==2

Random

0.173 (.0!5) 1.025 (.219) 1.018 (.337) 0.889 (.980) 2.009 (.291) 2.086 (.477) 2.447 (.800)

Random stratified

0.172 (.014) 1.005 (.053) 1.006 (.133) 1.057 (.914) 1.995 (.074) 2.009 (.191) 2.08 1 (.594)

Fixed stratified

0.169 (.013) 1.000 (.045) 0.978 (.125) 0.946 (.286) 2.002 (.063) 2.048 (.194) 2.193 (.563)

s = 0.333 jL’O.17 h, = 1.0 h*=l.O A,= 1.0 a,=2 a,=2 a,=2

0.165 (.037) 1.037 (.336) 1.001 (.310) 1.002 (.410) 2.075 (.517) 2.142 (.577) 2.181 (.625)

0.173 (.012) 0.997 (.058) 1.041 (.088) 1.032 (.177) 2.010 (.097) 1.945(. 136) 1.988 (.286)

0.171 (.012) 1.016 (.058) 1.015 (.093) 1.029(. 156) 1.978 (.096) 1.986 (. 154) 1.984 (.25 1)

s=o.2 pzo.17 A,=l.O A*= 1.0 h,=l.O a,=2 a,=2 a,=2

0.179 (.0&l) I .262 (.871) 1.069 (.488) 1.064 (.437) 2.020 (.8 14) 2.128 (.677) 2.123 (.680)

0.173 (.009) 1.003 (.065) 0.989 (.074) 1.037(.118) 2.002 (.114) 2.028 (. 132) 1.958 (.197)

0.174 (.009) 1.007 (.062) 1.011 (.084) 1.044 (.136) 1.992 (.103) 1.991 (.136) 1.952 (.226)

From the Bi(s) values calculated in this simulation, the ai were also estimated using (4.12) under the assumption that &rh= 1 was known. The mean and standard deviation of the fifty estimates of each ui are given in Table 4, again as the leftmost figure. A similar series of simulations was performed using the values ai ~3, & = 1, and p = 0.111; so that again, the mean time of death occurred at the mean exit time from stage three. The estimates of p, of & assuming ui =3 known and of ui assuming hi- 1 known are given as the leftmost entries in Table 5.

60

HANS-JGRGEN SCHUH AND RICHARD L. TWEEDIE TABLE 5 Mean and Standard Deviation of 50 Simulations of Four Stage Populations, Sampled with Exponential Rate s; Estimates of & under the Assumption ai= 3 Known and Estimates of a, under the Assumption h = 1 Known

Theoretical

Random

Random stratified

Fixed stratified

s = 0.666 p=O.lll x,=1.0 h,= 1.0 h,= 1.0 a,=3

a,-3 a,=3

0.106 (0.026) 0.960 (0.294) 0.948 (0.358) 1.166(0.948) 3.252 (0.686) 3.480 (1.293) 3.239 (1.195)

0.111 (0.011) 1.COl(0.052) 0.989 (0.153) 1.030(0.517) 3.002 (0.118) 3.069 (0.370) 3.419 (1.332)

0.107 (.009) 1.006(.051) 0.978 (. 134) 0.940 (.2%) 2.990 (.113) 3.083 (.312) 3.322 (.782)

s = 0.222 ~‘0.111 h,=l.O AZ,-1.0 h,= 1.0 a,=3 a,=3 a,=3

0.117 (0.031) 1.145(0.526) 1.061(0.303) 1.059(0.329) 3.016 (0.993) 3.037(0.775) 3.079 (0.858)

0.110 (0.007) 1Xl00(0.042) 1.012(0.083) 1.011(0.155) 3.005 (0.110) 2.984 (0.215) 3.028 (0.433)

0.113 (.007) 1.006(.042) 1.013(.0&o 1.021(.149) 2.986 (.112) 2.981 (.216) 2.992 (.391)

s=o.133 p==O.lll h,=l.O x2= 1.0 &= 1.0 a,==3 a2=3 a,=3

0.120 (0.031) 1.202(0.573) 1.111(0.404) 1.125(0.439) 2.987 (1.160) 3.017 (0.971) 3.048 (1.092)

0.111 (0.006) 1.002(0.049) 1.036(0.122) 1.020(0.149) 3.001 (0.129) 2.935 (0.300) 2.992 (0.359)

0.112 (.006) 1.020(.052) 1.014(.llO) 1.044(.147) 2.954 (.132) 2.989 (.277) 2.926 (.331)

*Values estimated on less than 50 simulations, since no third-stage organisms were seen in (from left to right) 11, 2, and 1 of the simulations. It is clear that the estimators have little or no bias for both the scale and the shape estimation cases. As one would expect, the results vary with varying sampling rates; and it is particularly noticeable that the standard deviations of the third-stage parameters are very high when the mean sampling time s -’ is early, and correspondingly those of stage one are high when s- ’ is late. Overall, the middle value of s-l performs best, but in all cases the standard deviations are rather large.

PARAMETER

ESTIMATION

61

USING TRANSFORMS

Typically, also, errors are larger for the later stages. This is due to the implicit dependence of A(s) on 4(s),j
STRATIFIED AND FIXED- TIME SAMPLING ESTIMATORS

We would expect that reducing the sampling-time variances by stratification would reduce the observed variances of our estimators, and this does prove to be the case. Suppose that 0= c, CC, < . . . CC,
I=,ck"se-s;D,(t)dt Ik

,

hence

l&s)= ,t,

IkF

(4.13)

is an unbiased estimator of b(s), and strongly Proceeding as in Sec. 4.3, we find that

consistent

as in Theorem

P=s[ (p,W-‘-I] is strongly

consistent

a set of estimators

for CL,and that if a, ra

are known,

3.

(4.14)

then, writing

for & is given by A

w =[&;;s,”_1 ; whilst if & E h are known,

the ai can be estimated

(4.15)

using

1ogbii,(s)

4(S) = log[h/(A

+ s + $)] .

(4.16)

62

HANS-JtJRGEN

SCHUH AND RICHARD L. TWEEDIE

Properties of these estimators were also examined using the simulations described in the previous section: the twenty sampling times Tk were distributed with m= 1 in each of the twenty strata, and the c, were chosen so that Zk~0.05, i.e., the strata were chosen equiprobable. Means and standard deviations over fifty replications are given as the central entries in the cells of Tables 4 and 5: it is clear that considerable increase in accuracy is gained by using stratified random sampling times rather than completely random times, with factors of up to 14 being recorded; and that in general, standard deviations are improved by a factor of 3-4. Finally, the effect of taking fixed sampling times Tk within each stratum was investigated: estimates were found using (4.13)-(4.16) but now the times Tk in (4.13) were taken with Tk as a fixed point in the stratum [ck,ck+,). In our simulations we chose Tk= -{log[l-(2k-1)/401)/s, so that $$e-“‘dt= Z,/2=0.025. The results of these estimates are tabulated as the rightmost entries in the cells of Tables 4 and 5. It is obvious that this method does not differ greatly from the stratified random-sampling method, and that both give fairly satisfactory results. This is fortunate, as in practice data will usually come as collected at fixed sampling times; and it seems possible to use the estimators (4.13)-(4.16) with the Tk fixed and the strata defined around them, as it were, and still achieve reasonable results. Essentially, one is thus using a weighting of the ZVi(Tk) which approximates, not p,(t), but the Laplace transform &(s); and this reinterpretation of the data enables the use of the explicit estimators above. Finally, we remark that a variety of simulations have been carried out using the techniques described above, but with nonconstant Z.L;and that the estimators seemed fairly robust under the particular conditions we imposed. 5.

ESTIMATING

VITAL RATES OF SHEEP NEMATODES

We conclude by applying the methods discussed above to the hatching of eggs to larvae of the parasitic nematode Ostertagia circumcincta. Here we have exactly the situation of Sec. 4; eggs are initially collected at (as closely as possible) an identical stage of development, and are then incubated at a constant temperature in a number of essentially identical environments; in the case we consider, these were glass jars containing a known mass of the fecal material in which the eggs were collected. Details of the experimental technique can be found in D. J. Overend’s La Trobe honours-degree thesis, and a complete analysis will be published elsewhere. Sampling times were at half-hour intervals; at the kth sampling time, the number of eggs and the number of first stage larvae were counted. Sampling was destructive, so that further analysis of each “population” was not

PARAMETER

possible. In stage, stage N,(T,J and rate in both

ESTIMATION

63

USING TRANSFORMS

this experiment the number of stages is Z=2; stage 1 is the egg 2 the first-stage larvae. At sampling time T,,the quantities N,(T,)were collected; the assumption was made that the death stages was effectively zero, so that Nk was taken as Nk = N,( Tk)

+N;?(TE). Thus the only variable

in this situation is the maturation time density g,(r). Using the fixed-time methods described in Sec. 4.4, we fitted two different classes of models to g,(r). Firstly, Erlangian distributions of orders u= 1, a =2, and a=3 with unknown parameter A were fitted, hi being estimated by (4.15). In practice, this was done iteratively: an arbitrary value si of s was chosen, i,(s,) was a was chosen; iI was then found found from (4.15), and then sz=i,(s,)/ from (4.15), and so on until the values si converged satisfactorily. This ensured that the value s at which hi was finally estimated was approximately the estimated mean of g,(t); this iteration only took two or three steps in general, which seems to indicate that the method is rather robust under choice of s. There was one further, data-specific complication. There was a delay, of unknown length G, before hatching really began, and since, due to slight sample nonhomogeneity, populations sampled before G might have one or two hatched larvae visible, it was not possible to estimate G by the time of first hatching. The values of A,(S) for a= 1,2,3 and a variety of delays G around the correct value were calculated. These are given in Table 6, and the graphs of the data and the fitted curves for G=6 and 7 are given in Figs. 1 and 2; it seems clear that G does take values around this point, and that once G is estimated, the Erlangians of order 2 and 3 with A, estimated as above do provide good fits to the data. It is also clear that the method is sensitive to the choice of G; for G = 6, the u = 3 curve is best, whilst for G = 7, the a = 2 curve is more satisfactory. Since the method of Sec. 4.4 is essentially a matter of exponentially weighting the data, this sensitivity to a change of the origin is not too

TABLE 6 Values of A,(s) for Delays G and Shape Parameters a in the Data of Figs. I and 2

1

2 3

0.162 0.388 0.625

0.187 0.45 1 0.728

0.224 0.540 0.875

0.273 0.668 1.087

HANS-JiJRGEN

64

i

SCHUH AND RICHARD L. TWEEDIE

0

8 FIG.

L

2.

16

_--__I

__~_I_

24

As in Fig. 1, but with the delay G=7.

_i-_

TlllE(IN HALF-HOURS)

~

1

~

L-

32

~

~~_

66

HANS-JtiRGEN

SCHUH AND RICHARD L. TWEEDIE

TABLE I Values of i,(s) for Delays G and Scale Parameters h in the Data of Figs. 1 and 2

0.5 1.0 2.0

2.413 4.552 8.637

2.178 3.969 7.487

1.877 3.373 6.265

1.586 2.192 5.112

but it does indicate that satisfactory estimation of a delay term G, if one occurs in the model, is of some importance. A two-parameter model for these data, with a =2 and G and hi as parameters, was also fitted by ordinary least-squares techniques. The fitted values were found to be 6 = 7.28 and fi, =0.571; the correspondence between these values and the Laplace-transform estimate at G = 7 is also quite gratifying. Finally, a model of gi(t) as a gamma with unknown shape parameter a and with scale parameter A known was fitted using (4.16), again iterating with s until s converged satisfactorily on B,(s)/h. The values of B,(s) for known X, -0.5, 1, and 2 are given in Table 7. surprising;

This paper owes more than most to the comments and help of friencis and colleagues with whom we have discussed it. In particular, we are grateful to Leigh Callinan, Norman Anderson, and Rick Young for bringing us the original problem, outlined in Sec. 4, and for providing the data in Sec. 5; and to P. D. Feigin, P. A. P. Moran, R. Morton, I. W. Saunders, R. Gnanadesikan, C. Mallows, J. B. Kruskal, and others for many ideas and inspirations. NOTE ADDED

IN PROOF:

Various sets of data for hatching of eggs of 0. circumcincta have been fitted by delayed Erlangian models as described in Sec. 5, and this function seems to provide a good fit under various conditions; see [19] for details. Multistage data have been fitted using the transform techniques of Sec. 4 and the overall fit of the model in this more complicated situation seems to be adequate also; see [20] for details. REFERENCES 1 J. R. Ashford, K. L. Q. Read, and G. G. Vickers, A system of stochastic models applicable to studies of animal population dynamics, J. Anim. Ecology 39:29-50 (1970).

PARAMETER

ESTIMATION

USING TRANSFORMS

67

D. J. Bartholomew, Stochastic Models for Social Processes, 2nd ed., Wiley, London, 1973. 3 I. V. Basawa, Maximum likelihood estimation of parameters in renewal and Markovrenewal processes, Austral. J. Statist. 16:33-43 (1974). 4 G. E. P. Box and H. L. Lucas, Design of experiments in non-linear situations, 2

Biometrika

5 6

7 8 9 10

46:77-90

(1959).

D. R. Cox and D. V. Hinkley, Theoretical Statistics, Chapman and Hall, London, 1974. L. Denby, E. B. Fowlkes, and R. J. Roe, Estimation from censored, interval data of the median breaking point of polyethylene subjected to stress-cracking tests: a Monte Carlo study. J. Appt. Mech. 42:607-612 (1975). P. D. Feigin and C. R. Heathcote, The empirical characteristic function and the Cramer-von Mises statistic, Sankhyci, Series A, 38:30!-325 (1976). D. P. Gaver, Jr., Observing stochastic processes, and approximate transform inversion, Operations Res. 14:444-459 (1966). J. M. Hammersley and D. C. Handscomb, Monte Carlo Methods, Methuen, London, 1965. C. R. Heathcote, A test of goodness of fit for symmetric random variables, Austral. J. Statist.

14:172-181

(1972).

11 J. Kiefer, Optimum experimental design V, with applications to systematic and rotatable designs, in Proceedings of the 4th Berkeley Symposium on Mathematical Statistics and Probability, 196 1. 12 J. F. C. Kingman, Poisson counts for random sequences of events, Ann. Math. Statist. 34:1217-1232

(1963).

13 K. Kiritani and F. Nakasuji, Estimation of the stage-specific survival rate in the insect population with overlapping stages, Res. Popul. Ecol. 9: 143-152 (1967). 14 B. F. J. Manly, A comparison of methods for the analysis of insect stage-frequency data, Oecologia 17:335-348 (1974). 15 A. S. Paulson, E. W. Holcomb and R. A. Leitch, The estimation of the parameters of the stable laws, Biometrika 62: 163-170 (1975). 16 K. L. Q. Read and J. R. Ashford, A system of models for the life cycle of a biological organism, Biometrika 55:21 I-221 (1968). 17 H.-J. Schuh and R. L. Tweedie, A nematode identity, CSZRO DMS Newsletter 21:4-5

(1976).

18 L. V. White, An extension of the general equivalence theorem to nonlinear models, Biometrika

60:345-248

(1973).

19 R. R. Young, N. Anderson, D. Overend, R. L. Tweedie, K. W. J. Malafant, and G. A. N. Preston, The effect of temperature on times to hatching of eggs of the sheep nematode Ostertagia circumcincta (submitted for publication). 20 R. R. Young, R. M. Nicholson, R. L Tweedie, and H.-J. Schuh, The effect of temperature on the development of the free-living stages of Ostertagia ostertagi under field and controlled conditions (submitted for publication).