Journd ~~A~m~spheri~ and Tcrrrsrrinl
Physics.
Vol. SO,No. 1. pp. 21-32. 1988.
~2l-9~69/88 83.00-t .OO 0 I988 Pergamon Journals Ltd.
Printed in Great Britain.
Statisticai errors in the determination of wind velocities by the spaced antenna technique P. T. MAY* Physics Department, University of Adelaide, Adelaide, Australia 5001 (Received
infinal form
15 July 1987)
Abstract--This paper is concerned with the statistical errors which are present when wind veIoc.ities in the atmosphere are determined by the radar method known as the spaced antenna technique. It is assumed that the (complex) data is processed by the method known as full correlation analysis (FCA). A theory is
first developed to give the error in the dete~nation of the Position of the m~imum of a crow-co~elation function and the value of lag such that the auto-correlation falls to a value equal to that of the crosscorrelation at zero lag. These are the basic quantities needed for the application of FCA. These error estimates are tested with a variety of numerically simulated data and shown to be realistic. The results are applied to real data and, using the standard techniques for the propagation of errors, they lead to estimates of the errors in the derived wind velocities. In order to test these estimates, an experiment was carried out in which two independent wind determinations were made simultaneously. The differences were used to obtain experimental estimates of the errors. It was found that the theory overestimates the error in the wind velocities by about 50%. Possible reasons for this are discussed.
INTRODUCTION The Spaced Antenna
(SA) technique has been used to measure wind velocities in the ionosphere for more than 30 years (see, for example, BRIGGS, 1977, for a review). Recently the technique has been used to measure winds in the troposphere and stratosphere with VHF radars (VINCENTand R~~TTGER,1980). In view of the extensive use of the technique, it is important to be able to attach error estimates to the measured quantities. However, a statistical treatment of the errors has never been given. Some limited numerical studies were made by AWE (1964a, b), but no explicit expressions for the errors were given. BUCKLEY(1971) also considered the problem and obtained an expression for the error in the determination of the position of the maximum of a cross-correlation function. This is an important element in the discussion, but not the only error to be considered. Buckley’s expression is discussed later and a modification is suggested. Some aspects of the problem have also been considered by FEDOR(1967), CHANDRA et al. (1972) and CHANDRA(1980). In the present paper we will not attempt a rigorous discussion of the errors, which is difficult, but estimates of the errors will be obtained by reasonably straightforward arguments. These estimates will be tested by numerical simulations and by the use of real data, and shown to be realistic. The discussion will be confined to the random statistical type of errors which ~.
_-~ -* Present address: RASC, Kyoto University, Uji, Kyoto 611. Japan. 21
arise from the use of records of finite length and from the presence of external noise. We will not discuss the question of whether systematic errors may also be present. In the SA technique a pulsed radar system employing a vertically directed transmitting antenna and three or more spaced receiving antennas is used. The fading of echoes returned from selected heights in the atmosphere is analyzed, either ‘on line’ in real time or after digital storage. In earlier work only the timevarying amplitudes were used. More recently the phase, as well as the amplitude, of the echoes has been utilized (e.g. R~~TTGEEand VINCENT,1978); the advantages of this are discussed later. The phase and amplitude can be combined in the usual way to give time-var~ng complex signals, one from each receiving channel. In practice it is usual to measure in-phase and quadrature components of the signals, rather than phase and amplitude. We may envisage a random pattern of complex field strength moving past the array of spaced receiving antennas and inducing in them the complex signals. The pattern moves with twice the wind velocity at the selected height (FELGATE,1970). To estimate this velocity, time differences for transit between pairs of antennas are needed. These are best obtained by computing complex cross-correlation functions between the time series, taken in pairs. It is also necessary to compute the auto-collation function of each time series. The auto-correlation functions should all be the same, apart from statistical fluctuations. The subsequent analysis has become known as ‘full correlation analysis’ (FCA). A recent review of this analysis has been given by BRIGGS (1984) and the
Fig. 1. Notation. The standard deviation of z, is denoted by CJ
formulation and notation of that paper will be foflowed here. The parameters which must be obtained from the correlation functions in order to apply this analysis are (i) the time shift for which the crosscorrelation is a maximum (r:) and (ii) the value of lag Z, such that the auto-correlation function falls to the value of cross-correlation at zero lag (Fig. 1). The notation used here implies that we are considering a pair of antennas arranged along the x axis, There will be values of these quantities for each pair of antennas. FCA derives from these the values of various parameters describing the moving pattern, as well as its velocity, for example, the scale of the pattern and the rate at which it changes as it moves. Variations of the FCA use other quantities than T.~,but expressions can be derived in a similar manner to that which is shown in this paper. The plan of the present paper is as follows. We first obtain theoretical expressions for the errors in ri and rs. These are then tested using simulated data. Finally, the resulting errors in derived winds are compared with those existing in real data. The latter are found by using two separate spaced antenna arrays to derive winds. The arrays are so far apart that the wind measurements may be regarded as statistically independent and the differences between them therefore give estimates of the errors. PRELIMINARY DISCUSSION
Before commencing a discussion of the errors, it is necessary to discuss signal averaging, since this affects the signal-to-noise ratio, and to explain how complex signals are treated in FCA.
In many radar systems the time between successive pulses is much less than the fading time of the returned signal. If each pulse were used as a data point the complex time series would be oversampled. On the other hand, receiver noise and some types of external noise will be incoherent from one oulse to the next. Thus the signal-to-noise ratio can be improved by averaging several successive pulses. This is normally done digitally, after analogue to digital conversion. Suppose n successive pulses spaced at time intervals At are averaged to form a data point of the desired time series. Provided nAt is less than the characteristic fading time of the signals, this will not affect the time series, but will tend to average out the noise. If only the amplitude of the returned signal is utilized the signal-to-noise power ratio wiil be improved by a factor Jrz.However, if the transmitted pulses are phase coherent and the complex returned signal is used it can be shown that the same averaging procedure (called coherent integrtion) will improve the effective signalto-noise power ratio by a factor n (FARLEY, 1983: R~TTGER, 1984). This can be a very significant improvement, particularly in the case of radars with a high pulse repetition frequency, such as a VHF radar, where the averaging may be over more than 1000 successive pulses. Note that the time series is effectively sampled every nAt s and that the (reduced) noise on the successive samples is uncorrelated. When, in the subsequent analysis, we introduce a factor to represent random noise, it is the noise after coherent averaging to which we are referring. If the time interval nAt is of the order of the fading time or greater, the averaging process will attenuate the higher frequency components of the wanted time
The
series and thus reduce the signal power. However, the subsequent FCA can still proceed, but it will utilize the low frequency components only. The effects of such filtering procedures have been discussed by CHANDRAand BRIGGS(1978). In all the early papers on FCA it was assumed that only the amplitudes of the signals were utilized. More recently, advances in experimental techniques have made it feasible to utilize both amplitude and phase information, as explained earlier. Although this is now almost universal, the procedures and modifications which are needed to apply FCA to complex signals have been rather inadequately described in the literature. We therefore give a discussion of the relevant points here. Let the time-varying complex signals from the mth and nth receivers be given by
-k,(t) = G(t)+iQntW
23
spaced ant:enna technique
(14
PA(Z)=
((A,(t)-A,)(A,(r+z)-A,))
(((A,(t)-A,)Z)((A,(t)-A,)Z))“Z’
(3)
where the brackets ( ) indicate a time average. These functions, together with the analogous auto-correlation functions (n = m), formed the starting point for FCA. When using complex signals, the complex cross-correlation function for receivers m and n is defined as
m2(~)~“0+4)
PE(T) = {(E:E,)(E,*E”)}1’2
(4)
where * denotes a complex conjugate. In this case no means have to be removed, for the reasons already explained. If we denote the normalizing constant in the denominator of equation (4) by No and expand the expression in terms of the in-phase and quadrature components we obtain
PAZ) = ((Zm(t)-jQ,(f))(Z,(t+z>
and
.%(f) = Zn(t)+iQ.(t),
(lb)
respectively, where Z denotes ‘in-phase’ and Q ‘quadrature’. In principle, Z and Q should be random time series with the same statistical properties (since there is nothing special about the choice of the origin for phase) and they should both have zero mean with probability distributions symmetrical around zero. In practice, the mean values are found to be close to zero, but may not be precisely zero for several reasons : (i) there may be d.c. offsets in the phase-sensitive detector circuits ; (ii) for short records, a non-zero mean may occur for statistical reasons ; (iii) in some radars nonfading echoes, such as ground clutter, may be present. It is therefore usual to subtract any mean values from Z and Q before proceeding to further analysis. We assume that this has been done, so that Z and Q in equations (la) and (lb) represent the desired fading signal only, with (I) = (Q) = 0. It follows that (E) = 0 also. The amplitudes of the signals from the mth and nth receivers are
‘h!(t) = En(
W
A,(t) = MOl
CW
and
and these are, of course, positive, with definite mean values A, and A,. In the earlier forms of analysis, using amplitude only, the cross-correlation function for this pair of receivers was defined as
+jQn(t+z)>>/No = G(~M~+~))lN, +/No +~Um(t)Qn(t+4)/No --~(Q&Vn(~+4>/No.
(9
For the case where there is no Doppler shift the two imaginary terms will both be zero, since the in-phase and quadrature components are uncorrelated (in practice small imaginary components may be present due to statistical fluctuations, especially for short records). The first two terms are equal. Therefore, for this case, pE(r) will be real and equal to the (real) cross-correlation function of either the in-phase or the quadrature component. It should now be evident that pE(r) can be used just as effectively as P”(Z) for the determination of winds. For, if a complex pattern of signal amplitude moves across the antenna array, particular values of in-phase or quadrature components will tend to be repeated with a lag depending on the velocity, in exactly the same way as a particular value of amplitude A would be repeated. In fact, as we will show later, there are advantages in using pE(r) rather than ~~(7) for the determination of winds. A practical point arises in connection with the small residual imaginary terms in Pi arising from statistical fluctuations. It would be permissible to simply remove these terms on the grounds that the true values are zero, i.e. to use the real part of ~~(7). It is more convenient, however, for a reason which will appear
.‘-I
P .I
in a moment, to USCthe modulus of Pi, i.c. tpI,(r)I. Since this is obtained by adding the square of the real part to the square of the imaginary part and since the latter is very small compared with the former, it is clear that it is practically identical with the crosscorrelation function which would be obtained by taking the real part. Either procedure will result in a suitable, purely real, cross-correlation function. as is required for the subsequent application of FCA. The assumption ha:, been made above that the signal phasor moves randomly in an Argand diagram. This is only true if the mean frcqucncy of the returned signal is the same as the frequency transmitted. This is not always true. since the wave may be reflected from a layer or scattered from irrcgularitics which have a mean vertical component of velocity. so that there is a mean Doppler shift. We will now show that provided the modulus of i’+,(r) is used (rather than. forcxample. the real part) the results will be unaffected by such a vertical motion. If there is a Doppler shift the signal phasor will have a systematic rotation in the Argand diagram, which will be superimposed on the random variations. The in-phase and quadrature components will therefore have a tendency to vary sinusoidally at the Doppler frequency and with a relative phase difference of x/2. Their variations will no longer be uncorrelated. The complex signal must now be written E,,,(t)=
[I,,(r)+jQb,(t)]e”““‘d’,“‘,
(6)
where w is the Doppler shift (or rate of change of phase) and 4 is an arbitrary phase constant. I’ and Q’ are new in-phase and quadrature components which are independent and have the same statistical properties as I and Q for the case of no Doppler shift. The cross-correlation function for two such signals from receivers m and n is now ykl(d = ([I&,(t)-,jQb,(t)]e
‘(““‘“J~~)[Z~~(?+~)
ilM~ t CJt +,,I ) ,N,,
+.iQb(f+r)le =
M
2i
Very similar considerations apply to the cornpic~L auto-correlation function. which IS defined as for the cross-correlation function, but wtth tn = II. i.e. in thu absence of a Doppler shift the ‘complex’ auto-correlation function should in fact he real. apart from residual statistical fluctuations. We take the modulus of the function for use in FCA and can use the phase factor to determine the Doppler shift (if prcscnt) and hence find the vertical velocity. From now onwards in the present paper when the terms auto- or cross-correlation functions are used we will be referring to the modulus of the complex functions, so that any mean Doppler shifts are removed. Similarly, the symbol /J will apply to the modulus, without modulus signs being included. With this convention all the results will apply without modification either to real signals or complex signals. We shall later consider the effect of additive high frequency noise. It has already been pointed out that even when coherent integration over many pulses is used. such noise will be uncorrelated from one sample to the next. As the correlation functions can only be computed for time increments of the sample spacing and its multiples, this means that the auto-correlation function ofeach signal will have a ‘spike’ at the origin. It is usual to interpolate through the spike and then t-e-normalize the auto-correlation function to unity at the origin. This removes any systematic ctfcct of the noise. ic. the re-normalized correlation function is the same. on average, as it would have been in the absence of noise, though the random errors associated with each point of the function will. of course be increased. The cross-correlation functions must also bc tenor. malized. They may or may not have a ‘spike’ at the origin, depending on the nature of the noise. Externa! impulsive noise which arrives at each receiver simultaneously will produce such a spike, but rcceivcr noise will not. since it is uncorrelated between the separate channels. This spike, if it exists, may also bc interpolated through and the interpolated value used as an estimate of the cross-correlation at zero lag.
(IzZ,(t)-.iQ;(t)l[lb(t+z) (7)
As for the case of no Doppler shift, the first term ( ) will be real. The complex ~~(7) therefore has a phase which varies linearly with z, at a rate w. Thus, by taking the modulus of yE(r) we eliminate the effect of the Doppler shift and obtain a real function which can be used in FCA to determine horizontal drifts, as before. In addition, we may use the slope of the phase factor to estimate the Doppler shift and hence the vertical velocity (e.g. WOODMANand GUILLEN, 1974).
RRROR ESTIMATESFOR r: AND *, As already explained, the drift analysis makes use of quantities derived from auto- and cross-correlation functions. For a particular pair of receivers oriented in the x direction these quantities are z:, the time shift for which the cross-correlation function has its maximum value, and t,, the time shift for which the auto-correlation function takes the same value as the value of the cross-correlation at zero lag (Fig. 1). In general. neither of these values will fall at an integer
The spaced antenna technique value of lag for which the correlation functions are computed and so inte~olation is used to determine them, making use of several points on the function around the interpolated value. In order to estimate the errors in r: and z, we must first consider the error in each computed correlation coefficient, i.e. the error in each point on the correlation functions. For a real time series of N completely independent data points taken from a Gaussian population the standard deviation op of the sample correlation coefficient p is given to a good approximation by the equation
25
where pm is the maximum value of p(r) occurring at r = r, and s is a constant which determines the width of the Gaussian function. The sample cross-correlation function will be P(T) = Pmexp(-(~-zm)*/sZ)+&~~),
where E(T)is an error term whose standard deviation is gP, given by equation (9), and with (E) = 0. To locate the maximum 7: of p(z) we differentiate equation (11) with respect to T and set dp/dz = 0. This gives [-2(r:--a,)/s’]p,exp
cP = (1 -p’)/dN.
(11)
(-(r:-r,)2/S2~+de/dr
= 0.
(8) (12)
This equation is a good approximation provided the sample is large enough for p to be a good estimate of the population correlation coefficient and provided p is not too close to unity. However, if adjacent data points are correlated, the number of effective data points will be less than the actual number. AWE (1964a) has shown that the effective number of data points is given approximately by T/r ,,2, where r ,,2 is the lag at which the auto-correlation function of the time series falls to 0.5 and T is the sample length. z r,z is often referred to as the fading time. Replacing N by T/z ,!z gives 6, = (~,,~/~“z(l
-P2).
(9)
It is not immediately evident that equation (9) will apply to a complex time series, with p equal to the modulus of the complex correlation coefficient, and r ,,* the lag at which the modulus of the complex autocorrelation function takes the value 0.5. However, tests with artificial complex data generated numerically (in the manner described in Section 4) show that it does, in fact, hold to a good approximation (within 10%) for complex time series, as well as real time series. For present purposes we need only make the assumption that oP is proportional to the quantity on the ~ght-hand-side of equation (9), since normalizing constants will be determined empirically. We will now consider how the errors in the correlation values will be carried over into errors in the quantities zi and r,. For these derivations we will assume that the correlation functions have a Gaussian form. This form is a reasonable approximation to the actual functions in most cases and the expressions are only changed by small factors if other functional forms, such as the quadratic, are assumed. Thus, for the population cross-correlation function we assume a form p(t) = pnrexp(-(r--r~)“/s”)~
(10)
In realistic cases z: will oniy deviate from the true value r,,, by a small amount compared to the width of the correlation function, i.e. (r:-r,,,) <
(13)
To estimate ((d.s/dr)2) we use AWE’S (1964a) result that the time scale of the fluctuation of E is proportional to the half-width of the correlation function z rj2’The reason for the correlation of the errors in the correlation coefficients, which are due to finite data length, is easily seen by considering the case where there are zero population cross-chelations. In this case there will still be some (statistically insi~ificant) correlations and these will vary in magnitude with a similar fading time to that of the auto-correlation functions, as this is the time scale for the data of each time series to become independent. Thus ((d=s/dr)‘) 0~ (~,/r,,z)*. Then, combining obtain
equations
(14)
(9), (13) and (14) we
6, = Kr1,&1,2/lr)“*E(l
-dmA
(1%
in which we have used the fact that x1,2GCs and
I”. I
‘0
absorbed all the numerical constants into one constant K,. In practice, Kr will be determined empirically. We will now consider the other quantity :, which is needed for the application of FC‘A. WC assume ;I Gaussian form for the population auto-correlation function Q(T). which is wriltcn i’(r) =- exp ( --?zi,vL). and a sample auto-correlation
(16)
function
will be
f,(z) = exp( --T’/.v’)+c,(~),
(17)
where c,(t) is the error in i)(z). To find 7r we take the value of cross-correlation at zero lag pu and equate the auto-correlation to this value. There will be an error c2 in I>(,.Thus, exp(-Tt/s’}+r:,
= pn+c2.
(18)
The two error terms will both have zero mean values and their standard deviations (T,,are given by equation (9) with p = po. The errors in the two correlation functions will themselves be partly correlated, since they are calculated from the same pairs of time series. Thus, when combining the error terms the effective error is reduced compared with the case of independent errors. in general the new error term will depend on such parameters as the t&ding time and the maximum value of cross-correlation, but for the following analysis we assume that this is a weak dependence and that the new error term is simply some numerical factor multiplied by the error on either side of equation (18). If this is not a good approximation we would expect the following analysis to break down and poor agreement with simulated data would be found. Thus we write exp { -7.2 (s? ) = [),I i-c,
(19)
with E << po. Taking the logarithm of both sides and using a first order expansion of the logarithm, we find -z~;s~
= In (po)+c:/po.
Note that for c = 0 we have the expected --s;!s”
= In (p,,).
Q-0) result that.
(21)
The standard deviation of zi will be (?/P,,)[(E’)]“* and of z, will be this divided by 22,. We will denote the standard deviation of z, by cu. Since s rx 7 ,,?, we obtain C, cc (ri ~~~,p~)[(~z>]’
?.
(22)
In the actual experiments we again use interpolation procedures and curve fitting in the neighborhood of p0 to find z,, usually involving 3-5 points on the autoand cross-correlation functions. This procedure will
VI \‘i
reduce the uncertainty in ‘II by some numerical factor. Assuming that ((I;‘))’ ’ is proportional to o,, as given by equation (9) (for the reasons explained above), UC* c4>tain the following ‘j
= A,(:;
expression for G,
.,‘T,, I(;,
_,-/-)I ‘II 1--p:‘l/,l,,{.
(231
whert: K, is a numerical constant to be determined empirically. The interpolation procedures which are used (allowing the measurement of 7: and z, to LI precision of a fraction of a lag) are not expected to significantly reduce the errors because, as was noted before, the errors in adjacent correlation coefficients arc correlated. This implies that the results should not be sensitive to the choice of interpolation procedure. The functional forms given in equations (15) and (23) will be tested using simulated data and the values of the constants Ki and K, determined.
GENERATION
OF SIMULATED
RADAR
DATA
In this section the method used to generate simulated data which have a realistic form and known population correlation functions is described. By finding the differences between the sample and the population correlation functions and the differences between the measured and the model values of 6, and t, we can test the results derived in the previous section. Two complex time series were generated, each with an amplitude distribution which is described by a Rayleigh function and with a uniform phase distribution between 0 and 2n, consistent with data from radar returns from turbulent scatter. Furthermore. the two time series had the maximum of their population cross-correlation function at some known lag. The data were constructed using a technique similar to that described in BRIGCSand PACE (1955). First, two complex series S, and S1 were constructed from a population of Gaussian distributed real random numbers. The real and imaginary components are formed separately, since they are independent. Therefore, the data set had no mean Doppler shift, although this could be easily introduced. In order to give these series some correlation width they were each convolved with a Gaussian function of known width s. Thus the autocorrelation functions of the new series S’, and S; were Gaussian with width s. From these two series we can now construct two time series with the same mean signal amplitude which have a population cross-correlation function with its maximum value pm at some lag 7m by forming two functions E, and E,, such that E,,,(i) = S’.
(24a)
27
The spaced antenna technique and J%(i+r,)
= p,s;
-t-(1-&“‘S$.
(24b)
At this stage it is simple to add noise to the data by taking two series of complex random numbers, again constructed from a Gaussian population of real numbers, with mean amplitudes N,, N2 and adding these to E,,, and En, respectively. We now calculate the mean auto. ovariance functions and the cross-covariance functions of Emand E,, out to + 10 lags: The auto-covariance functions will, in general, have a spike at zero lag due to the high frequency noise added to the data. This is inte~olated through and all the covariance functions normalized. The signal-to-noise ratio can be estimated from the ratio of the size of the noise spike and the interpolated value of auto-covariance at zero lag. The position of the maximum of the cross-correlation function is found. If this has a value of lag at the limit of the computed correlation function an error flag is set. This is only implant if the maximum value of correlation is small or the time series has a low signal-to-noise ratio, in which cases the correlation functions have large fluctuations. Given good data, a quartic function is fitted around the maximum and the value of 7: is found by locating the lag such that the slope of the quartic is zero. A quartic is used in case the cross-correlation function is slightly skew. If the curve fitting breaks down another flag is set, but this occurred rarely. The value of r, is found by first reading off the value of cross-correlation at zero lag p,, and then searching along the auto-correlation function until the correlation has fallen below this value. Using the previous two points ofthe curve and this value, Aitken’s scheme of linear interpolation is used to estimate the value z,. We know the value of lag such that the population auto-correlation function falls to 0.5 from the width of the Gaussian function we used in the convolution when constructing the functions Em, E,, and will use this in expressions (15) and (23). Of course, in real experiments the value of r 1,zmust be estimated from the sample auto-correlation function. For the simulation tests we will be setting the lag corresponding to the maximum value of the population cross-~o~elation to - 1 and varying pm, z,,~, the signal-to-noise ratio and the record length. Since the form of the cross-correlation function and its maximum value are set, the population value of r, is also fixed. RFSULTS FOR NO ADDED NOISE
In this section we will examine the results of the simulations and compare the RMS deviations of the
sample values of r: and 7, from their population values with the theoretical expressions derived in the previous section. In addition, we will show that our formula for the error in 7: gives better estimates than the equation derived by BUC~Y (1971). Results for data with a significant added noise component will be discussed in the following section. Numerous simulations have been carried out in which the population values of Pmand 7 I,zwere varied. The total number of simulations for each particular set of population parameters ranged from 60 to 120. The record length was held constant at 256 data points. In Fig. 2 a scatter plot of the observed standard deviations of 7: against the standard deviations found from equation (15) is shown. The values of z ,,* used were 2.35, 3.53 and 4.71 lags for data sets that had maximum population correlation values pn of 0.3,0.5, 0.7 and 0.9. The results show a good agreement to a straight line with a slope of 0.52 (found by linear regression analysis). This indicates that equation (15) gives a good estimate of the error in r> for a variety of different conditions using a value of KV= 0.52. An alternative expression for the error in 7: has been derived by BUCKLEY(1971) and is quoted in BRIGGS(1984). This expression is 0 = r1,2(7,,2/7Y2[(l
-PZJlP,l”*.
(25)
It has the same functional dependence on T,,~ and T as the one derived here, but a different dependence on p,,,.Buckley’s relation was formulated for data points very close together compared with 7,,2 and the estimate of the position of the cross-correlation maximum was simply the lag of the maximum correlation value. This does not allow for any interpolation procedure around the maximum, so Buckley’s formula may not be appropriate for the case considered here. In Fig. 3 we have plotted the theoretical error estimates derived from both Buckley’s formula and equation (15) against P,,, for values of T = 256 and 7 112= 4.71 lags. The values of o, from the simulations are also plotted. Equation (1.5) is seen to be in much better agreement with the simulation than is Buckley’s equation. This result is repeated for other values of the fading time. The population values of pm, 7, and %,,* fix the popuIation values of p. and 7, when the form of the correlation function is known. Figure 4 shows a scatter plot of the standard deviations of the sample values of 7, found for many simulations against the predictions of equation (23). Note that again a straight line gives an excellent fit to the points, confirming that we have the correct functional form for the error estimate. By performing a linear regression
P. ‘I‘. MA\
2s
t!’
'36-
.
I a2
I
L
04
1
I, 0.6
1
04
I,
L
14
I
1.2
I
I.
14
I, 1.6
1
I,.
M
24
D,(W)
Fig. 2. Test of equation (15) using simulated data sets. The observed values of err (the standard deviation of T;) are plotted as the ordinate against the values of o, (theory) calculated from equation (15) as the abscissa. 0 Values obtained from data sets with no added noise ; A valuesobtained from data sets with population p”$= 0.7, but sample correlation reduced by added high frequency noise.
Fig. 3. Test of the form of the dependence of (T,on pn. The solid curve shows the dependence given by equation (15) with K, = 0.52 and the dotted curve the dependence given by equation (25). Values of cr obtained from the simulations are shown as follows : 0 values obtained from data sets with no added noise; n values obtained from data sets with population pm = 0.7, but sample correlation reduced by added high frequency noise.
we find the line of best fit has a slope K, = 0.50. Finally, we can test both of these relations with respect to their dependence on T by using records of different lengths, in this case 64 and 128 points. Figure 5a shows the results for the error in r:. It is noticeable
that for 128 points the observations have a larger slope than for N = 256. This effect is even greater when the data length is reduced to 64 points. This indicates that the dependence of the error on the record length is more complicated than in the expressions given. What appears to be happening is that for short records, which are only a small number of fading periods in duration, the sample correlation becomes a poor estimate of the population value, i.e. equation (9) starts to break down. It has been suggested (HOCKING, 1981) that reliable estimates of the correlation functions require that the record length should be of the order of 100 times the fading period. These simulations, for the short record lengths, fall short of this criterion and are of the order of 50 times the fading period. Good estimates of the errors are seen to be obtained from the above theory for records of greater than 50 times the fading time. In the case of record lengths of only a few fading periods the spurious oscillations in the correlation functions become much larger and therefore more important, leading to the detection of erroneous maxima. Figure 5b shows the analogous results for r,. The error in z, is not as much affected by the spurious oscillations, so that the agreement between theory and experiment is better than for the error in z:, but it is clear that there are still sign&ant perturbations for short data lengths. From these simulations it has been shown that the estimates of the errors in z’, and z, are, to a good
The spaced antenna technique
0.2
Q4
0.6
08
1.0
1,2
1.4
1.6
Fig. 4. Test of equation (23) using simulated data sets. The observed values of ux (the standard deviation of 7,) are plotted as the ordinate against the values of ux (theory) calculated from equation (23) as the abscissa.
o-2 0.3 04
0.5 0.6 0.7
p (thay)
cr;(*Yt
Fig. S(a). Test of the dependence of CT,on the length of the data set. The observed values of a, are plotted against the values of uz (theory) obtained from equation (15) with K = 0.52. The solid line with a slope of unity is the best fitting line for data sets of length 256 points. 0 Values obtained from data sets of length 128 points; x values obtained from data sets of length 64 points.
Fig. 5(b). Test of the dependence of a, on the length of the data set. The observed values of a, are plotted against the values u, (theory) obtained from equation (23) with K = 0.5. The solid line with a slope of unity is the best fitting line for data sets of length 256 points. 0 Values obtained from data sets of length 128 points ; x values obtained from data sets of length 64 points.
degree of accuracy, given by equations (15) and (23) with K%= 0.52 and K, = 0.50, providing that the record lengths are at least of the order of 50 times the fading period.
significant external noise in the data. We will consider only the case in which the noise is uncorrelated from one data point to the next; this will be the situation for added high frequency noise, even when coherent integration is used. Therefore, the noise will manifest itself as a spike at the origin of the auto-correlation function, which can be interpolated across and the auto- and cross-correlation functions can be renor-
THE FZFFECTOF NOISE So far, the errors considered are due solely to the finite record length. In many experiments there will be
.K!
I’
‘f.
malizcd. The noise will also generate tluctuations in the correlation functions which are uncorrelated from one lag to the next. However, unless the sample correlation is very high or the amplitude of the noise is of comparable amplitude to that of the signal component, these fluctuations will have only a small effect on the determination of r; and t,. This is due to the dependence of the errors on the fading time and the use of interpolation procedures around the points of interest on the correlation functions, which act to decrease the direct effect of noise on the values of r’, and T_~.Similarly, once the auto-correlation function is renormalized the sample value of t,,? will be in reasonable agreement with the popuIation value. However, the presence of the noise means that the unnormalized correlation coefficients b* are systematically reduced from the population values and these reduced values must be used in equation (9) thus increasing the standard deviations of all the correlation estimates. It is hypothesized that this is the major effect of noise in most cases. By assuming the above hypothesis we can use expressions (15) and (23) to estimate the errors by using the sample correlation values without renormalization (i.e. before interpolating across any noise spike) and using the population fading time of the signal component (or the sample fading time after the auto-correlation function is renormalized) in the error estimates. This interpretation was tested using the simulation procedure described before. The population maximum cross-correlation p,,$ was set to a value of 0.7, but the sample correlations were reduced by Gaussian distributed noise added to the data. Various levels of noise were used, ranging from zero up to a level such that the variance of the noise component was the same as that of the ‘signal’ component. This is a reasonable limit, since noise levels in excess of this reduce the sample correlation coefficients to levels of very marginal significance for the record lengths used, even if the population values of the correlation coefficients without noise are very high. The errors are then so large that the data are not usable for the type of analysis employed. More complex analysis techniques have been devised to extract small signals from large noise and clutter levels, but these have not been applied here. Data with the sample correlation values reduced by the presence of noise are shown in Fig. 3. The data again had a characteristic fading time of z I,‘Z= 4.71 lags and a close fit to the theoretical expression is observed. Similar data for other values of fading time are shown in Fig. 2 and again it is clear that expression (23), with I(, = 0.50, gives the correct error estimate, provided the reduced values of sample correlation are
used. This confirms the hypothesis made carlicr in this subsection and allows us to estimate the errors in T
and r, by the use of equations ( 15) and (23). even when there is significant noise in the data. WC cz;In now sum~nari~c the theoretical results as follows. The equations
and 0.L= 0.5O(z:,~/r,)(~,:z/T)“‘[(l
-p;)jp,]
(27)
can be used to estimate the standard deviations of T: and z,, respectively, provided the record lengths are at least 50 times the fading period zIiZ and provided (in the case where external noise is present) the values of p are the actual sample values, not corrected for the presence of noise.
EXPERIMENTAL
CONSIDERATIONS
The theoretical estimates for the errors can be used to estimate the uncertainties in the wind measurements by using the standard formulae for the propagation of errors. Note that the equations used to estimate the wind velocity must be expressed in terms of Z: and z, in order to do this. For actual observations one must use the sample correlation maxima and fading times, rather than the population values used so far, in order to estimate the errors. The use of these values is, of course, an approxima~on. In the rest case there are a number of further factors to be considered. To begin with, the experimental correlation space is two-dimensional, whereas we have only considered the one-dimensional case. However, the most important factor not so far considered is probably the effect of applying rejection criteria to the SAD analysis. In order to test whether the error analysis can be applied to routine observations of winds the rejection criteria commonly used in such measurements (e.g, BRIC;GS. 1984) must be used. One of the most important, both in the rate of data rejection and the implications on the uncertainty of the wind measurement, is the Percentage Time Discrepancy (PTD) described in MEEK et al. (1979), which is given by
Data are rejected if the PTD exceeds 30%. If the maximum correlations are approximately equal we have gi = constant for the three cross-correlation maxima. Therefore, PTD N (l.‘la,/]ZzLl) x 100. For typical measurements we could have r,,? = 4 lags, pm = 0.7 and C(zil = 2.5 lags. Using equation (26) for 6, gives a PTD of 33%. This shows that purely
31
The spaced antenna technique
statistical errors will cause a considerable proportion of records to be rejected on the basis of the Percentage Time Discrepancy. The ones which are retained will have lower statistical errors than the entire sample. We may therefore anticipate that the theory will tend to overestimate the errors which occur in practice. (The Percentage Time Discrepancy criterion was originally introduced in order to reject cases in which the wrong maximum had been used on the crosscorrelation function. However, with the rejection criterion set at 30% it appears that many records will be rejected in which the identification of maxima is quite correct and purely random errors are involved. If this is so, consideration should be given to relaxing the condition to some extent.) In order to provide a suitable test, independent measurements of the mean wind field were made such that the observations were taken simultaneously and close together spatially, so there was no real difference in the wind field. Using the large antenna array at Buckland Park (BRIGGSet al., 1969) it was possible to set up two independent spaced antenna experiments, with six arrays, each consisting of a square array of 4 dipoles. These were arranged in two triangles with array separations of the order of 180 m and a separation between the ccntres of the two triangles of greater than 500 m. Since the scale of the pattern formed on the ground is of the order of 300 m, these observations were statistically independent, but variations of the wind field over horizontal distances of a few hundred metres averaged over the period taken to accumulate the data should be negligible. Measurements were taken over a total period of 4 h on two occasions two weeks apart. Data were taken using 6 receiver channels for durations of 102.4 s to give a time series of 256 points. Partial reflections from altitudes between 70 and 90 km were sampled, a region where the signal-to-noise ratio ranges from about 0 dB up to more than 20 dB. The data were analyzed in real time, including estimates for the uncertainties in the wind measurements from the relations derived above. It was found in most cases that the theoretical uncertainties in z: and Z, contributed about the same amount to the total error. The theoretical RMS difference was found by taking the square root of the sum of the expected variance of the two wind measurements. It was found that the two estimates had similar theoretical estimates of their variance, as is expected. This was true even if the correction in going from the ‘apparent velocity’, which only uses the values oft:, to the ‘true velocity’ was small. These estimates were then compared with the actual differences between wind measurements. For a given theoretical uncertainty
we expect a continuous
distribution
of deviations, with about 68% of the results lying within the estimated standard deviation, if the distribution is approximately Gaussian. In order to test the relations, the results of a given range in the mean theoretical uncertainty over the two arrays were placed into a ‘bin’ and the median deviation of the observed difference of the wind measurements in each bin computed. The median is favoured over the 68% point because of the greater reliability of the estimate of the median from a fairly small sample. The RMS differences are then estimated as J2 times the median value. For these observations the wind speed was greater than 70 m s-’ for most of the observations, with some measurements over 100 m s-‘, for several successive records from both arrays. The distribution of the error estimates is strongly peaked at about 10-20 m s-‘, which is consistent with the overall distribution of the RMS differences between pairs of observations, which had a peak around 10-15 m ss’. The ‘bins’ for the theoretical uncertainty have a width of 5 m SK’and we have only considered samples where there are greater than ten measurements in a given uncertainty ‘bin’. The graph (Fig. 6) shows the theoretical uncertainty plotted against the observed RMS difference, which, if the error estimates are correct, should give a straight line of slope one. The reason for the graph seeming to tilt backwards for larger error estimates is probably due to the data rejection criteria becoming important when T~,~ is large or the maximum correlation is small. As discussed above, the effect of the rejection criteria could also explain why the errors are overestimated. The uncertainty in the error estimates themselves is likely to be of the order of lO-20%. Sometimes the error estimates are unrealistically large, so some care must
35 I-
----
ZONAL
-
MERIDIONAL
VEL. VEL.
47
ova ’ 5
lo
”
’
’
15
20
2s
30
OBSERVED
RI6
DIFFERENCE
”
35
40
’
45
ms-’
Fig. 6. Comparison of the r.m.s. velocity difference given by the theory with the observed r.m.s. velocity difference. (The latter was obtained as J2 times the median velocity difference.) The straight line has a slope of unity.
I’. I
3’
be taken in the interpretation mates.
of individual
error esti-
SUMMARY
In this paper the use of complex data in FCA is discussed and a theory has been developed for the error in the determination of the position of the maximum in a cross-correlation function and the value of lag such that the auto-correlation falls to some specified value of the cross-correlation (here taken to be the cross-correlation at zero lag). These estimates have been tested with a variety of simulated data with quite widely varying properties of the population functions. They have also been applied to real
MA\ data and shown to give useful estimates of the errors in the determination of the wind velocities using the SA technique. However, the estimates are generally a factor of about 1.5 too great. Possible reasons why the theory might overestimate the errors have also been given, with the most important effect being ascribed to the influence of the rejection criteria. For practical use it is suggested that the theoretical error estimates be appropriately scaled. Acknowledgements-This work was carried out while the author held an Australian Commonwealth Postgraduate Scholarship. It forms part of a program of work on atmospheric dynamics supported by the Australian Research Grants Scheme. The help and encouragement of Dr B. H. BRIGGS are acknowledged with thanks.
REFERENCES
AWE 0. AWE 0. BRIGGSB. H. BIUGGSB. H. BRIGGSB. H. and PAGEE. S.
1964a 1964b 1977 1984 1955
BRICGSB. H., ELFIZIRD W. G., FEZATE D. G., GOLLEY M. G., ROSSITER D. E. and SMITHJ. W. CHANDRA H. CHANDRA H. and BRIGGSB. H. CHANDRAH., DESHPANDE M. R. and RAS~OGIR. C. FARLEYD. T.
1969
J. atmos. terr. Phys. 26, 1239. J. atmos. terr. Phys. 26, 1257. J. atmos. terr. Phys. 39, 1023. MAP Handbook, Vol. 13 (VINCENTR. A., Ed.), p. 166. Report of the Physical Society Conference on Physics of the Ionosphere. Physical Society, London, p. 119. Nature 223, 1321.
1980 1978 1972 1983
Indian J. Radio Space Phys. 9, 32. J. atmos. terr. Phys. 40, 541. J. Instn Telecommun. Engrs, New Delhi 18, 340. MAP Handbook, Vol. 9 (BOWHILL S. A.
and
EDWARDSS., Eds), p. 507. FEDORL. S. FELGATED. G. MEEKC. E., MANSONA. H. and GREGORYJ. B. R~TTGERJ.
1967 1970 1979 1984
J. geophys. Res. 72, 5401. J. atmos. terr. Phys. 32, 241. J. atmos. terr. Phys. 41, 251. MAP Handbook, Vol. 13 (VINCENT R. A., Ed.),
R&TGER J. and VINCENTR. A. VINCENTR. A. and R~~TZERJ. WOODMAN R. F. and GUILLENA.
1978 1980 1974
Geophys. Res. Left. 5,917. Radio Sci. 15, 319. J. atmos. Sci. 31,493.
p. 187.
Reference is also made to the following unpublished material:
BUCKLEYR.
1971
HOCKINGW. K.
1981
Some notes on practical digital power spectral, autoand cross-correlation analysis using the FFT, Department of Physics Report, University of Adelaide. Ph.D. Thesis, University of Adelaide.