A comparison between least squares and utmost correlation methods

A comparison between least squares and utmost correlation methods

Nuclear Instruments and Methods in Physics Research 221 (1984) 443-448 North-Holland, Amsterdam 443 A C O M P A R I S O N B E T W E E N L E A S T S ...

322KB Sizes 0 Downloads 43 Views

Nuclear Instruments and Methods in Physics Research 221 (1984) 443-448 North-Holland, Amsterdam

443

A C O M P A R I S O N B E T W E E N L E A S T S Q U A R E S AND U T M O S T C O R R E L A T I O N M E T H O D S A. D E L U N A S and V. M A X I A Gruppo Nazionale di Struttura della Materia del CNR, Istituto di Fisica Superiore dell'Universita, Cagliarh Italy

G. BUSSETTI Istituto di Fisica Superiore dell'Universitb, Torino, Italy

Received 22 March 1983 and in revised form 6 July 1983

The problem of fitting experimental curves is faced by means of a method based on the Schwartz inequality. It consists of searching for the function which holds the maximum correlation with the experimental curve dealt with. The method, therefore, concerns applications in which a proportionality factor between the curve and the fitting function can be left out. Comparison with the least squares method is performed, taking into special consideration the application to Mrssbauer spectroscopy.

I. Introduction The least squares (LS) method can be conveniently applied to problems in which an experimental curve recorded by means of the data points x i, Yi must be fitted by an analytical function f ( x i ; a t ) depending on some unknown parameters a r ( r = 1, 2 . . . h). T h e values of these parameters, using proper computational procedures, are determined in such a way that the function f ( x , ; Otr) approaches as closely as possible the curve dealt with. However, there exists a class of problems in which parameters holding actual physical interest should be determined by means of a fitting procedure which searches for a function f ( x ~ ; a t ) merely proportional to the experimental data Yi. This, obviously, is different from the LS method. To this class belongs, for instance, emission spectroscopy. In fact when an emission spectrum is to be analyzed, the interesting parameters are the wavelengths of the lines, their half-height widths and their relative intensities. The absolute intensities of the recorded lines are devoid of interest, since they depend on spectrometer light collection efficiency and on the brightness of the light source. These features do not concern the physical nature of the spectrum. Thus spectrum fitting should be performed using arbitrary units on the vertical axis. Other examples are supplied by some nuclear measurements. Let us consider, for instance, the y ray angular distribution in a f l - y angular correlation experiment. The counting level in each angular channel is clearly proportional to source activity, counting time, detector efficiency and so on. It follows that the fitting of the ~ ray angular distribution should be performed taking into account the counting levels irrespective of a proportionality factor. Particular attention must be given to the case of resonant y ray absorption spectra, that is, the MOssbauer spectra. The parameters involved are the peak positions, their absorption coefficients, their half-height half-line-widths and using the LS method, the base line, that is, when considering a doublet spectrum, seven parameters. However, the base line is devoid of physical meaning, since it merely intervenes as a multiplying factor which allows for the actual counting levels per velocity channel. The latter, as happens for measurements on the y ray angular distribution, are related to contingent experimental conditions. Therefore, leaving out the base line, the doublet spectrum concerns only six parameters. In the following, we will show how the utmost correlation (UC) method can be suitably applied to this class of problems, avoiding superfluous scale parameters, which on the contrary, must be accounted for in the LS method. 0167-5087/84/$03.00 © Elsevier Science Publishers B.V. (North-Holland Physics Publishing Division)

444

A. Delunas et al. .," f'itting experimental curves

2. Least squares and utmost correlation equations

We now come to the fitting of the curve x i, y , ( i = 1,2 ..... K) by means of the analytical function

f ( x , a~, a 2. . . . . %). Introducing a set of positive definitive weights w,>~0,

(1)

the quantity K

w,[y,- Xf(x,; a,)] 2,

xZ()k) = •

for ( r = 1, 2 . . . . h),

(2)

/=1

turns out to be always positive, that is X2(X) >~0.

(3)

In eqs. (2) and (3), X is an arbitrary parameter. When the LS approach is chosen, we put ), = 1

(LS),

(4)

so that eqs. (2) and (3) yield K

X 2= Y~ w , [ y i - f ( x , ; a,)]2 >~0,

(5)

i=l

where X2 - X2(1). This inequality constitutes the starting point of the LS method. Utilizing eq. (5), function f yields the best fit of curve y~ when the parameters a, are determined in such a way that X2 attains a minimum. If, alternatively, we take into account the value of ~ which makes X2(X) a minimum, we obtain K

K

X,m. = ~_, wiyJ(x,; or,)/Y~ w,f2(xi; or,) i=1

(UC),

(6)

i=1

which, substituting in eq. (2) and allowing for eq. (3), yields

wy, 2. ~ wff2(x,; a~)i=1

i=l

w,y,f(x,; a~)

~ O.

(7)

i=l

This inequality leads directly to the weighted Schwartz inequality O= Z

w i Y i f ( x i ; Olr)/

i=1

wiYi2" i

%f2(x,,

at)

~<

1,

(8)

i

where p means the correlation coefficient. The UC method actually is based on eqs. (7) and (8). When the curve's best fit is performed by means of eq. (8), the parameters a, are determined in such a way that p, though less than unity, attains a maximum value. This actually is equivalent to minimizing the left hand member of eq. (7). It is to be remarked that, according to the LS method, the ultimate best fit corresponds to the obtaining of f(xi; a , ) = y i, while according to the UC method, it corresponds to the obtaining of f ( x i ; a,) proportional to y~. This circumstance precisely fulfills the different requirements evidenced in sect. 1, concerning the different classes of experimental curves. Assuming as a supplementary condition that the experimental values yi and the function f(x~; ofr) have the same normalization, that is K

K

E w,yi2= Z w,f2(xi; at), i=1

i=1

(9)

A. Delunas et al, / Fitting experimental curves

445

the UC equations become equivalent to the LS ones. In fact, eq. (9) leads to

w,Y,z" E wif2(xi; a,) i=l

wiYi2 + E Wff2(Xi; a,) .

= 1

i=1

(10)

i=1

which allows eq. (8) to be rewritten as K

K

K

K

2(1 - p) Y'. w, yi2 = E wiYi2 + E wif2(x,; %) - 2 Y'. wiyif(x,; a,). i=1

i=1

i=l

(11)

i=1

Thus, by putting K

2(1 - O) )-'. wiyi2 = X 2,

(12)

i=l

and taking into account that when p approaches unity X 2 goes to zero, eqs. (11) and (5) turn out to be coincident. This fact proves that, under condition (9), the LS method can be regarded as a special case of the UC method. We come now to the determination of the parameters a r. When these parameters intervene linearly, inequalities (5) and (7) allow the direct determination of their minimizing values. If, on the contrary, the dependence is non-linear, suitable linearizing methods must be applied. The one most generally utilized is the iterative Gauss-Newton method [1]. It takes into account the expansion h

f ( x , ; % ) = f ( x i ; a~o))+ y ' ~ 3 a ~ o ) . . . r=l o a ; -

(13)

in which a~°) are suitably chosen zero approximation values of parameters. Substituting eq. (13) in eqs. (5) or (7), their left hand members minimization is turned to minimization with respect to 3a~°), which actually is a linear problem. Thus, starting from a~°~, successive approximations of the parameters can be evaluated, that is

ot(,.~'+l}=ot~t°+Sa~t'),

where (g = 0, 1, 2 , . - . ).

(14)

The procedure is ended when each 8a~ ~) becomes negligible. In this way, taking into account eq. (5), we have h

At, 3%

=

where (s = 1, 2 , . - . h),

(15)

(LS),

(16)

r=l

where

K

3f

= E w,-i=1

3f

and X w,[ y, - f ( x , ; a~"))] ~ Of BJ ~'= i=,E

(LS).

(17)

Eqs. (15) are a set of h linear equations with respect to the h unknown quantities 8a~ "). When eq. (7) is applied, eq. (15) holds again, but we have in this case

~ W,Y, 0---~) af ~ W'Y'a-7~} Of 2., AL"' = Eg w, 3____f_f 3f . EK wjY2 - 2., i=1

j=l

i=l

v



j=l

V°ts

(UC),

(18)

446

Delunas et aL / tqttmg experimental curves

/l.

and K

K

B~" = i=lZ~;yJ(x,;

t~•

%v, of

Of(rif) ) " 1= E

~ 0 OL,`if)

-

-

/~-

Y'~ w , / ( x , ; c(~") 1--. 1

3)" "/E 1 W! V/2 ~

(uc).

(19)

=

There are some alternative linearization methods which, however, do not differ substantially from the Gauss-Newton method. A good account of these methods is given in ref. [1].

3. An exemplifying application In order to compare the features of LS and UC methods, we now apply the previous equations to the case of M6ssbauer absorption spectra. An experimental M6ssbauer spectrum can be easily simulated by means of a Monte Carlo procedure introducing a statistical distribution of fluctuations. Indeed, in the absence of fluctuations, the counting in the velocity channel x~ is given by

N~=f(x,)--B

i °='Ll+l-Z-o - )] 1-

~

/x~-vo

2 '

where(i=l,2,...,K),

(20)

in which B is the base line counting level, L is the number of Lorentzian peaks and Ao, v, and F, are the absorption coefficient, the minimum velocity and the half-height half-line-width of the o th peak, respectively. The simulated experimental spectrum is

y,=N,+3N,,

,K),

where (i = 1, 2 , . . .

(21)

where 8N, are Poissonian fluctuations generated using pseudo-random numbers. In actual experiments using moderate counting rates, indeed, instrumental effects are negligible and Poissonian fluctuations can be regarded as adequate. Using eqs. (20) and (21) and choosing the following values of parameters:

97500 7_

// g3000

,/ /

,// / / ,

,

/,

/

,

,(

88508

i t

84000

* ~

7'9500

75880

. . _ _ _ 2"

52

78

L I03

Chonnels

Fig. 1. Crosses representa computergenerated MOssbauerdoublet with Poissonianfluctuations.Full lines representthe fluctuation-free Lorentzian components of the doublet and its fitting function.

A. D e l u n a s et al. / F i t t i n g e x p e r i m e n t a l c u r v e s

447

/

96. 77

E D

¢,,

J © C3_ (53

+÷ +÷

9 2 . 88 ++

S3 @ 42 [3

~ + 4+

89. O0

L

+, 81, 22

*

t

87480

83523

91438

L .

85995

99353

Counts

Fig. 2. Correlation plot between the doublet of fig. 1 and the fitting function. Horizontal and vertical scales were adjusted in such a way that lengths were the same on both axes. B = 100.000, v 1 = 58, A 1 = 0.12, /"1 = 15 a n d Vz = 70, A 2 = 0.12, /"2 = 15, w e s i m u l a t e d a s y m m e t r i c a l d o u b l e t ( L = 2). T h e n u m b e r o f c h a n n e l s w a s K = 128. In fig. (1), the d o u b l e t is s h o w n together w i t h the o u t l i n e of its L o r e n t z i a n c o m p o n e n t s . T h i s d o u b l e t , a s s u m i n g w i = 1 (i -- 1, 2, . . - , K ) a n d u s i n g eq. (15) together w i t h eqs. (16) a n d (17), or (18) a n d (19), w a s a n a l y z e d b y m e a n s of b o t h L S a n d U C m e t h o d s * A p a r t f r o m the b a s e line, w h i c h in the U C case is u n n e c e s s a r y , the zero a p p r o x i m a t i o n v a l u e s of p a r a m e t e r s w e r e c h o s e n so as to be the s a m e for b o t h m e t h o d s . T h e y are reported in table 1 together w i t h t h o s e o f the sixth a p p r o x i m a t i o n , w h i c h a c t u a l l y y i e l d s the final results, since fifth a n d sixth a p p r o x i m a t i o n s were f o u n d to differ b y less than 0.0001%. In order to a l l o w a c o m p a r i s o n b e t w e e n L S a n d U C data, u p to five d e c i m a l digits are s h o w n in table 1. It a p p e a r s that the d i f f e r e n c e s are negligible, since t h e y lie w i t h i n 0.1%. O n the other h a n d , the d i f f e r e n c e s w i t h respect to the v a l u e s of the f l u c t u a t i o n free s p e c t r u m (i.e. the p r o b a b l e values) reach 1%. T h i s fact m e a n s that in practice LS a n d U C data are e q u i v a l e n t . S u c h a * A special BASIC programme is available on request. Table 1 Values of parameters for the analysis of the MiSssbauer doublet of fig. 1 by means of least squares and utmost correlation methods. Parameters v 1, F l, v 2 and F2 are expressed in channels. Probable values

Zero approximation values

Sixth approximation values

Least squares

v1 A1 /'1 v2 A2 F2 B

58 0.12 15 70 0.12 15 100.000

55 0.10 13 73 0.10 13 100.000

57.5585 0.11016 14.8766 69.3798 0.12864 15.2419 100.402

Utmost correlation

v1 /11 /'1 l)2 Az

58 0.12 15 70 0.12 15

55 0.10 13 73 0.10 13

57.5628 0.11009 14.8890 69.3724 0.12865 15.2554

Fz

448

A. Dehmax et al. / FJtting e~perirnental ~urt:es

result holds unchanged if values of B different from 100.000 (that is, the figure corresponding to the fluctuation free spectrum) were utilized as zero approximation values. In fig. 1, the best-fit curve obtained with LS data is also shown. The good agreement with the simulated spectrum is evident. This circumstance is substantiated by the chi-square value, which, utilizing all the 128 counting channels, turns out to be 53.9. With the UC data, that is, in the absence of the base line parameter, we cannot compare the simulated spectrum with the best-fit curve, as done in fig. 1 using kS data. In spite of this, the comparison can be performed again by means of a correlation plot [2]. It consists of reporting on a horizontal axis the values of y,, that is, the simulated counting levels. On the vertical axis, we reported the values of N, obtained by substituting in the right hand member of eq. (20) the UC sixth approximation parameters and by putting B = 100. In this way, a succession of points can be drawn, each corresponding to a velocity channel which, if the values of ~ and y, are really proportional, lie close to a straight line. Such a plot is shown in fig. 2. To obtain an easily readable plot, the horizontal and vertical scales were adjusted so that the spectrum took up the same length on both axes. In this way, the straight line is expected to be at 45 ° with respect to the axes. It appears that the points form, in fact, a 45 ~' line, if errors due to statistical fluctuations are accounted for. It is to be pointed out in this connection, that the good sensitivity of the correlation plot for evidencing small differences in MOssbauer spectra has been previously proved [2,3].

4. Final remarks The above example seems sufficient to explain why the UC method constitutes a useful alternative to the LS method for the special class of problems considered. As a matter of fact, in the previous example, the use of UC equations leads to a saving of time of about 20% with respect to the LS equations. But the most important feature lies in the fact that the LS method requires that in MiSssbauer experiments a notable part of the base line be recorded. Lacking this, it is in fact difficult to estimate a good zero approximation value of the base line parameter. Actually the base line recording take up a time which is subtracted from the peak recording time, thus worsening statistical accuracy. This drawback is avoided with the UC method, It must be noted, on the other hand, that in performing spectrum analyses, it is convenient to take into account only channels in which the absorption signal is high. The base line channels certainly introduce statistical noise without significantly contributing to the signal [1,4].

References [1] [2] [3] [4]

B.J. Duke and T.C. Gibb, J. Chem. Soc. A(9) (1967) 1478. F. Aramu, V. Maxia and C. Muntoni, Nucl. Instr. and Meth. 133 (1976) 503. B. Renard, H. Pollak, R. Quartier and P. Walter, Nucl. Instr. and Meth. 190 (1981) 565. F. Aramu, V, Maxia and S. Serci, Nucl. Instr. and Meth. 186 (1981) 553.