Journal of Molecular
Structure
ElsevierPublishingCompany, Amsterdam. Printedin the Netherlands
FORCE-CONSTANT
COMPUTATIONS*
III. FORCE-CONSTANT RANGES FOR PERCHLORYL FLUORIDE AND SOME IMPROVEMENTS IN THE FLETCHER-POWELL METHOD OF REFINEMENT PETERGANS Department
of Inorganic
and StructuraZ Cixemistry, The University,
Leeds LS2 9JT (England)
(Received 17 February 1972)
ABSTRACT
The frequency data from one isotopic species of perchloryl fluoride are used to compute approximate values of some of the force-constants. The uncertainty in these values which arises from the absence of sufficient frequency data is made to manifest itself in the form of ranges, e.g. 9.5fO.l N cm-’ for Cl-0 stretching and 3.5kO.5 N cm-’ for Cl-F stretching. The ranges are derived by systematic variation of the arbitrarily chosen off-diagonal symmetry force-constants, and are delimited by the extremal force-constant sets F’. The relationships between the Ff matrices, multiple frequency assignments, and a criterion for “complete mixing”
of vibrations is discussed. The Fletcher-Powell
method of force-constant
refinement is improved by a new initiation procedure and its performance is shown to be optional. Failure diagnostics for the Fletcher-Powell method are also presented.
INTRODUCTION
The application of the Fletcher-Powell misation
to
the
computation
of
vibrational
(FP)
method of function mini-
force-constants
has
been
shown
to
overcome the mathematical difficulties, that is, oscillation and non-convergence of the refinement, that can arise when the conventional Gauss-Newton-Raphson (GNR) method is used. The theory and practical aspects of both methods have been presented in detail’. However, these mathematical difficulties arise from the * Part II is given in ref. 1. J. Mol.
Structure,
12 (1972)
411
use of a physically inappropriate force-field: for example, a set of off-diagonal symmetry force-constants arbitrarily chosen such that no real values of the diagonal force-constants then exist which will permit the observed frequencies to be fitted exactly. The force-field may be improved by computing an extremal set of force constants I;‘, for which an exact frequency fit is predicated by dehnition2. These techniques are now applied to perchloryl fluoride, ClO,F, by refinement of on-diagonal symmetry force-constants; the off-diagonal symmetry forceconstants are chosen arbitrarily and are varied within “reasonable” limits, bounded at one end of the range by an F’ matrix. Thus a progression of force-fields is generated, all of which satisfy the observed frequencies exactly. There follows a discussion of various improvements, observations and failure diagnostics for the FP method with examples drawn from these force-constant computations.
FORCE-CONSTANT RESULTS AND DISCUSSION
The internal coordinates, ri, are indicated in Fig. 1, where s and t are changes in the bond lengths and cc and /I are changes in the inter-bond angles. The internal force-constants, Fki,are defined in Table 1. The orthogonal transformation to symmetry coordinates 4 = Uv’ specifies the symmetry force-constants Kii as K = UFD and these are given in Table 2. The observed vibration frequencies3 were used without correction for anharmonicity, and the G matrix was calculated from the results of an electron diffraction study4. There is one redundant A, symmetry coordinate associated with the six angle bending deformations. This
Fig_ 1. Internal coordinates of perchIory1 fluoride.
412
J. Mol. Structure, 12 (1972)
TABLE
1
DEFINITION
TABLE DEFINITION
OF THE INTERNAL
FORCE-CONSTANTS
&
=
azv/afpj
2 OF
THE
SYMMETRY
FORCE-CONSTANTS
K,,
=
a2 v/ a+;
was “eliminated” by finding an orthogonal transformation to linearly independent symmetry coordinates, r”’ = Mu” in which the redundant coordinate I$’ is zero, so that the fourth row and column of 3 = MUGofi consists of only zeros’. The symmetry force-constants in terms of P are given by .F = MUF~L@ so that ~ii = Kii except for i orj = 3 or 4. Fi4, SZ4, Fs4, and Fa4 are indeterminate. ConsequentIy the only internal force-constants that can be determinedat present are those that relate to coordinates not involved in the redundancy conditions, namely, F,, E”, F’t and F’=. The diagonal force-constants ~ii were refined using fixed values for the off-diagonal force-constants *ii (i # j). The A1 and E symmetry blocks were refined separately. It was found that the refinement of the E block force-constants aIways converged to R = 0 [eqn. (3)] for a variety of values in FS6, gS7 and .FG7. In particular, with SS6 = F5, = SG7 = 0 the values SSS = 9.255 N cm-l, F 66 -- 1.75 bending units (b-u., 1 b-u. = lo-’ N rad-I) and FT7 = 1.31 b.u. were obtained. These values were then used later to compute F” and F” since the assumption that F’B- F’$ = F, - FiE = FoIs - FLfl= 0 is “reasonable”. J. Mol. Structure, 12 (1972)
413
TABLE3 sz
*In
MATRICES
FOR A VARIETY
OF ORIGINAL
0.00
0.00
0.00
0.20
0.20
0.20
0.20 0.20 0.20 -0.20 -0.20 -0.20 -0.20
-0.20 0.20 -0.20 0.20 -0.20 0.20 -0.20
0.20 -0.20 -0.20 0.20 0.20 -0.20 -0.20
SYMMETRY
FORCE-CONSTANTSa
9.7i 3.33 2.71 --&I3 -0.08 9.96 3.31 2.68 0.03 0.10 9.56 3.33 2.77 -0.03 -0.32 10.05 3.30 2.67 0.13 O.I6 9.74 3.31 2.74 0.11 -0.23 9.64 3.38 2.70 -0.38 0.04 9.07 3.4# 2.82 -0.46 -0.39 9.81 3.35 2.77 -0.28 -0.25 9.43 3.37 2.82 -0.46 -0.39
-0.53 -0.56 -0.51 -0.57 -0.55 -0.52 -0.42 -0.49 -0.42
9.41 9.49 9.36 9.52 9.42 9.38 9.19 9.39 9.19
3.33 3.21 3.33 3.30 3.30 3.38 3.44 3.37 3.44
-0_08
0.15 0.02 0.23 -0.02 0.10 0.26 0.08 0.06 0.16 -0.22 0.12 -0.26 -0.06 -0.16 0.06 -0.26 -0.06
thisand foIlowingtablesthe foIIowingreIationships hold:
913
=
-0.8509K~3-0.5253Kt4,
54-23
=
-0.8509K2,-0.5253&a,
9733
=
0.7240K33+0.8943K34+0.2759Kqq.
For the A, symmetry block the initial refmernents all converged to a minimum of R # 0. The three off-diagonal force-constants were then systematically alloted the value of +0.2 or -0.2 and after an FP refinement had been performed an 9’ matrix was computed, according to eqn. (13). The results are given, in Table 3, as the Sf matrix and the values of 9r2, F1 3 and Sz3 from which it derives. Thus, all vaIues of ri3 lie between -0.42 and -0.57 stretch-bend units (s.b.u., 1 s,b.u. = 10-r* .J rad-‘). In view of the systematic nature of the original guesses for HI 2, S1 3 and Sz3 it is therefore very likely that the true value of Fz3 should be negative and greater than -0.4 s.b.u. This in turn means that I;;, or E;B (or both) must be relatively large and positive, and suggests that the Cl-F bond strength is a sensitive function of the FCIQ and OCIO bond angles. The significance of this result is that even without sufficient frequency data it has proved possible to place an upper limit on an off-diagonal force-constant. We then proceeded to use the information gained by computing the 9’ matrices to compute a variety of force-constant solutions which will fit the observed frequencies exactly without being extremal. A selection of the results is shown in Table 4_ These were obtained by setting SI 3 = 0.0, F12 = 0.0, 0.1 and 0.2 and 9 23 was set larger than the corresponding 9% value, and all could have been computed by the GNR method since the refinements converged to R = 0. While thedatainTable4 isclearlynotexhaustivetheyservetoindicatemeaningfuivalues for the diagonal symmetry force-constants; a slightly wider range of values would be obtained if a greater variety of off-diagonal values had been chosen. From the values obtained it follows that &” must be ca. 9_5rf=O.l N cm-‘, P; is probably ca. 3.6kO.4 N cm-l, but the values of F’ and Fti are not closely defined. The range of permitted values of F, is increased by the possibility of another assignment of the observed frequencies. If v2 and v3 are interchanged a new family 414
J. MoL Strwture,12 (1972)
TABLE
4 ASSIGNMENTP~
NON-EXTREMALFORCE-CONSTANTSOLUUONSFORTHE
=
1061 cm-‘,VZ
= 715cm-‘,
uz = 549 cm’ L
I=
2 3 4 5 6 7” 8 9 10 11 12 13” 14 15 16 17
0.20 0.20 0.20 0.20 0.20 0.20 0.10 0.10 0.10 0.10 0.10 0.10 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00
-0.58 -0.60 -0.65 -0.70
0.00
-0.75
0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
-0.80
0.00 0.00
0.00
-00.56 -0.60 -0.65 -0.70 -0_75 -0.80 -0.55 -0.60 -0.65 -0.70 -0.75
9.98
9.98 9.99
10.00 10.00 10.01 9.92 9.93 9.94 9.95 9.96 9.96 9.87 9.88 9.89 9.90 9.90
2.58 2.51 2.42 2.36 2.32 2.29 2.69 2.49 2.41 2.35 2.31 2.29 2.70 2.48 2.40 2.35 2.3 1
3.44 3.54 3.70 3.82 3.91 4.00 3.30 3.58 3.73 3.84 3.93 4.01 3.31 3.62 3.76 3.87 3.96
9.50 9.50 9.50 9.50 9.50
9.51 9.48 9.48 9.48 9.49 9.49 9.49 9.46 9.46 9.47 9.47
9.47
0.11 0.11 0.11 0.11 0.11 0.11 0.06 0.06 0.06 0.06 0.06 0.06 0.00 0.00 0.00 0.00 0.00
3.44 3.54 3.70 3.82 3.91 4.00 3.30 3.58 3.73 3.84 3.93 4.01 3.31 3.62 3.76 3.87 3.96
0.24 0.24 0.24 0.25 0.25 0.25 0.22 0.22 0.23 0.23 0.23 0.23 0.20 0.21 0.21 0.21 0.21
D Solution approximately extremal i.e. s TABLE 5 N~N-~X~~MAL~O~CE-C~~~;S~ANT~~L~~~N~FORTHEA~~IGN~~E~~~~ V3 =715cm-’
=
Solution
1 2 3 4 5 6
1061 cm-‘,
1’2 =
549cm-‘,
2F12
2Fls
Sz3
911
922
.F33
I=..
E;
r;;,
F.
0.20 0.20 0.20 0.20 0.10 0.00
0.00 0.00 0.00 0.00 0.00 0.00
-0.58 -0.60 -0.65 -0.70
9.97 9.96 9.96 9.96
3.16 3.08 2.97 2.90
2.81 2.89 3.02 3.12
9.49 9.49 9.49 9.49
3.16 3.08 2.97 2.90
0.11 0.11 0.11 0.11
0.24 0.23 0.23 0.23
-0.65
9.91
2.96
3.04
9.47
2.96
0.06
0.22
-0.65
9.86
2.97
3.06
9.46
2.97
0.00
0.20
of solutions
is obtained,
of which
a few examples
are given in Table
5.
Since both
v2 and v3 correspond to vibrations of the same symmetry it is difficult to make an a priori assignment of these frequencies. The range of possible values for F, therefore becomes
even larger,
ca. 3.5_tO.S
N cm-
‘.
Mixing of vibrations The effect of varying the appropriate off-diagonal force-constant 2Fii “outward” from an S* set is to decrease mixing between rm and I$‘_This is illustrated in Table 6 which gives the principal potential energy contributions of ry to the J. Mol. Structtdre,12 (1972)
415
TABLE
6
POTENTIAL v,,
is the
ENERGY
DISTRIBUTION
contribution
CORRESPONDING
TO FORCE-CONSTANT
of rJ’ to normal coordinate
Qt. Fxz
SOLUTIONS
= 0.20 and 9,~
XN TABLES
4
AND 5
= 0.0 throughout
43
-0.70
97.7
1.9
2.2
1.9
90.1
32.3
0.5
13.8
3s
-0.65
97.6
1.9
2.4
2.1
83.9
37.5
0.5
19.3
71.2 65.0
2”
-0.60
97.4
1.8
2.6
2.3
74.9
45.4
0.4
27.7
56.3
I*
-0.58
97.3
1.8
2.7
2.5
68.9
50.7
0.4
33.4
50.5
606
59.2
03
42.3
41-7
-0.568554
97.1
1.7
2-9
2-7
lC
-0.58
97.0
I.6
3.2
2.9
51.3
67.9
0.2
51.2
32.9
2=
-0.60
97.0
1.5
3.3
3.0
45.9
73.7
0.2
57.0
27.2
b
3’
-0.65
96.9
1.4
3.5
3.1
37.9
82.4
0.1
65.8
19.0
4=
-0.70
96.9
1.3
3.7
3.2
32.6
88.5
0.1
72.0
13.7
0 Table 4.
b H,,.
c Table 5.
9 96
Fig. 2. Variation 4 and 5.
416
of SIX,
Sz2 and 9ir,, with Fz3
for Sxs
= 0.2, Pi-13 = 0.0; data from Tables
.r. &for. structure, 12 (1972)
normal coordinate Qi as “ii = L;i -Flii (normalised SO that p Lki Lii Skj = 100 Ok) for fixed values of Sr2 and F1 3 and various values of -Pz3. vI is seen to correspond to a virtually pure symmetric Cl-O stretching vibration, whereas v2 and v3 are mixtures of Cl-F stretching and angle deformation vibrations. It must be stressed that unless the value of F23 can be determined independently, the extent of mixing between r$’ and r: cannot be accurately assessed. As the mixing varies with Sz3 (for fixed vaIues of Fr2 and Sr3) so do the computed values for Sri , F22 and F3 3. This is illustrated in Fig. 2. The upper part of the curves for St1 and Sz2 derives from the assignment v2 > v3 while the lower part derives from the assignment v2 x v3. The opposite holds for the S3s curve. The point where the slope of the curves is infinite is the 9* matrix for the given values of *I2 and P1 3. An algebraic proof is given elsewhere6 that under the condition that S12, Fz3, vl, v2 and v3 are constant, Si3 is stationary in the strict mathematical sense that l/(aFiJaFz3) = 0 for all i. The curves in Fig. 2 give a geometrical illustration of this extremal nature of 9’ and support the algebraic proof. Each curve clearly resembles a part of a force-constant ellipse, as first described by Torkington’ for the system of two interacting coordinates. By analogy with Torkington’s and related work’ one can say that in a system of more than two interacting coordinates there is “complete mixing” of vibrations for any 5’ matrix. The form of the mixing can be described as follows. From the definition’ of 9’ the columns of V must be linearly dependent. It follows that the contribution of one symmetry coordinate to the potential energy of any normal coordinate is not an independent quantity, but can be expressed as a linear combination of the contributions of the other symmetry coordinates to that normal coordinate. In the present example this can be written as Vi3 = Xk’ii +yVi, for i = 1, 2, 3 and Table 7 contains values of x and y corresponding to the matrices 2-9 in Table 3. Clearly the potential energy contribution from r: to each normal coordinate is TABLE
7
COEFFICIENTS VI3
No.”
=
OF LINEAR
DEPENDENCE
IN POTENTIAL
ENERGY
DISTRIBUTION
MATRIX
v,
DEFINED
BY
xK,+vvra X
Y
0.0077 0.0226 0.0075 0.0203 0.0026 0.0215 0.0025 0.0171
0.9926
0.9777
0.9927 0.9799
0.9976 0.9789 0.9979 0.9833
a The number corresponds to the S J. Mol. Structure,
12 (1972)
matrix in Table 3.
417
almost equal to the contribution from ry . Exact equality would be achieved if 9f 23 were also stationary with respect to all elements of the inverse eigenvector matrix, in which case V2, and V,, would be zero’. However in general it is probable that mixing will be extensive between the two coordinates with subscripts i andj corresponding with the .9ij that is stationary in the first sense defined above. The remaining coordinates will be involved in the “complete mixing” situation to a minor extent. The extent of mixing between two coordinates labelled i andj is decreased as *ii is varied “outward” from the value 2$, but an individual element V, may be either increased or decreased according to the two different assignments of normal coordinates to frequencies, say, Q, + v,, Q6 + v, or Q, + v,, Q, -+ v, as is shown in Table 6. At the point defined by 9*, the two assignments are indistinguishable-this is another way of saying that there is “complete mixing”. Thus, the computation of an 9* matrix locates a saddle point between different possible frequency assignments and emphasises the difficulties of making correct vibrational assignments of the observed frequencies.
CONCLUSIONS
It is questionable whether the computation for force-constants from insufficient data is a valuable excercise. When sufficient data can be collected by isotopic substitution techniques and measurement of Coriolis (<) and centrifugal distortion (0) constants a full analysis can and should be undertaken. Unfortunately many molecules are not volatile enough to permit measurement of 5 and D constants and isotopic substitution may not be practicable. A recent discussionl” has shown that restricted data can yield valuable results when they are interpreted with care. If the absence of experimental data other than the frequencies of one isotopic species is regarded as experimental uncertainty, then the computed force-constants must be subject to uncertainty. The results given in this paper show that useful if not ‘precise information can be obtained if values are chosen for the off-diagonal force-constants which are reasonable by comparison with similar molecules for which a rigorous analysis has been done. Thus, the Cl-O stretching force-constant is restricted to a relatively small range. This arises because there can be little mixing of the Al Cl-O symmetry coordinate with the other Al symmetry coordinates unless *I2 or *rs are unacceptably large. The Cl-F stretching force-constant is less well defined because there are no criteria available by which to judge what is “acceptable” for Fz 3. Some other authors have analysed their results in terms of force-constant ranges ‘r , but it has not always been stressed that when alternative frequency assignments are possible one cannot meaningfully apply the group frequency approximation--in the present example the computations have shown 418
L Mol. Structure,12 (1972)
that the “C&-F stretching frequency” may be perhaps assigned to either the 715 cm-’ or the 549 cm- ’ band, and is therefore an undetermined concept. The results from force-constant computations of this nature must be interpreted with great caution, particuktrly where the mixing of symmetry coordinates is concerned. In those cases where extensive mixing may be suspected, the extent of mixing can only be properly assessed if the relevant off-diagonal force-constant is more precisely defined by other data as, for example, from isotopic substitutions_ A further unce~ainty, not discussed here, arises from the use of anharmonic fre-
quencies.
COMPUTATiONAL
REWLTS
AND DISCUSSION
The notation used here is mostly as before’ and only the essential features are repeated. The computation of diagonal symmetry force-constantsfi (i = I.. _n> is defined by the rth iteration cycle, eqns. (1-12). $CJF’L’ = L’I1’, fi =LFii A’ = diag (A, _ _ _ An), 2.: = 42c2(v;)= E (~;/1302.8)~
(1) (2) (3) (4) (5)
, f)’
z
(6)
A’-_B’
(7)
A’ = fiWJ Avl;.
(8)
(9) Af’ = --H’g’
(FP method)
Afr = - (A’)-’
H ‘+I = H’+Y’,
(10)
g’ (C?NR method)
Hr*’
+ (II’+‘)-’
(FP method)
(11) (12)
and when the minimisation is complete and Rmin# 0 an extremal force-constant set is given by eqn. (13). *f
=t-‘nL-l
J. Mol. Srructure,
12 (1972)
(13)
419
The definition of A’ and B’ differs slightly from before and the matrix Yr symbolizes the updating formula of Fletcher and Powell I2 . ( There is confusion in the Iiterature concerning terminofogy-some authors caI1 eqn. (11) the Gauss-Newton method and define the Gauss-Newton-Raphson method according to Ar = - (D’)- ‘g’.) The frequencies v, are obtained from experiment and & is equal to 47~~~~.The results to be discussed here all derive from the A, symmetry block (n = 3). Unit weights were used throughout except where stated to the contrary, wr, = I, with consequent simplification in eqns. (3), (S), (7) and (8). For example cond (A’) = (cond (Jr))*. Initiation Previous FP refinements’ began with Ho = E, the unit matrix. Use of the approximate inverse Hessian Ho = (A “)- ’ reduced the computation time by about one third on average, as judged by the number of function evaIuations, rzr, comprising solutions to eqns. (I), (3), (4) and (5). The number of iteration cycles, n was also reduced. A typical refinement starting with the approximate inverse l&Sian is shown in Table 8. Here the rate of refinement is virtuaffy the best that can be expected when using the FP method since a sum of squares cannot generahy be minimised in Iess than n+ 1 iterations and at least two function evaluations per iteration are needed to calculate CCin eqn. (9). This is a significant result since most computational papers which compare the speed of the FP method with others use the initial inverse Hessian Ho = E. For instance, Marquart’s method, which has been used in force-constant work13 was adjudged faster than the FP method on test functionsr4, but Bard has “[used] the approximate inverse Hessian and obtained some reduction in iterations”r5. Moreover, Davidon, from whose work the FP method originates, cIearIy states I6 that fewer iterations are to be expected when an approximate inverse Hessian is used for R*. The improvement in speed is onIy marginally off-set by the need to invert A ‘. The FP method requires about double the number of function evaluations that the GNR method requires, but as it does not require one matrix inversion per iteration the overall time required is very similar.
The matrix El’ is required to be positive definite in order. that the vector Af’, eqn- (lo), may point in a ““downhih” direction, that is, a direction along which it is possible to reduce R so that R’+r < R’. The FP method has been made to fail by finding conditions under which W ceases to be positive definite. In one group of cases this arose when A0 was not positive definite because of rounding errors introduced during its computation, eqn, (7), and hence when R” was taken as (A’)“l it too was not positive definite. These cases occurred when 420
f.
Mol.
Structure,
12 (I 972)
In
11~
*
3.aooaoaa 2,4958294 3.0675789 3.0663045 3.0661396 2,a~ooooo 2,5210553 2.4970086 2,491?775 284983065
9’33
b Norms as definedin ref. 1,
Ia~o~ooaoo Pep467779 9.7673648 9.7492773 9.7492200
Cumulativetotat.
13 2 6 3 8 4 10
a
r
9.07868 x103 8.36125~10~ 5.98324~10~ 5.98240~10~ 5.98240~10~
R’
= $23
=
cored(A’)
ok)
II(‘II”
cond (W)
8,24414~10-~2.39179~10 2.90120x IO4 23.9179 7916981 x 1O-4 1.98917~10 2.88100~10~ 5.60638 3.78075~10 9.36359~10~ 2.88516~10~ 5.64709 4.76614~10~ 1.17986x 10' 2.91793~10~ 5.91646 5.51466~10~ 1.36510~10~~2.64516~10~ 5.36770
II(‘II”
A TYPICAL FP REFINEMENT USING AN APPiROXIMATE INITIAL INVERSE HESSIAN: 9~12 = $1,
I= s &?
ii
3
P
3
; L 2”
9”ll
2,6511952 2,6650671 2,6731341 2.6801435 2,6841505 2,6881721 2.6898751 2,6905006 2,6905185 2,6905175
383011274 3.3011056 3,3011067
1.4103944 2,2951195 2,3923613 2.5034706 2.5649522 2.6060829 2,6334303
933
3.3316897 3.3221935 3.3136065 3.3088238 3.3039408 3.3018889
3.6211202 3.5202319 3.4486433 3.4047162 3.3688713 3.3485170
3.0000000 3.7805833
V...*2,55774x lo4 3,20683x lo2 1.10980x IO2 1.93498x 10 4.03514 9,13067x10-’ 1.88532x 10-l 4.31331x10-2 8.95084x IO- 3 2.27344x lO-3 5.41225x 1O-4 1.87669x 1O-4 8.47748x lO-5 6.84339x 1O-5 6.71986x 1O-5 6.71961x IO-5 6.71961x 1O-5
R'
Oe1732051,S13 = O,O,St3 = -0,566375S
* Cumulative total, b $tt matrix dcrivcd from this is P’t 1I = 9,959341,f%,, = 3.3011438,$=$33 bi& = -0S66348,
9.9593437 9.9593436 9.9593437
2 =
*FGr22
FD Rff~I~~M~~T: Fi
1” 9,2000000 4 10,0717631 6 9.8718545 8 10.0082978 10 9.9464320 12 9.9707812 14 9,9580641 16 9.9629605 18 9.9596559 20 9.9605271 22 949596361 24 9,9597374 26 9.9594065 28 9.9593854
1t.f
14 30 15 32 16b 34
1 2 3 4 5 6 7 8 9 10 It 12 13
0
r
A “SLOW”
TABLE9
=
II
7.55947 5.03483 8.47947 I,90072 x 10 4.11431 xl0 8,68660x 10 1.95614x lo2 4.05873x lo* g-73663x lo2 2.06960x lo3 5.85356x lo3 1.54584~ lo4 1.14308x lo5 1.51234x lo6 2.15644x IO” 5.11393x10” 3.54260x lOI4
cond (A?
2,6905480,9$2 = 0.1731937,#,a
1.67671x10-j 1,90120x 10-J 3918581x 1O-4 7,13678x 10-j 1‘~404oxlo-3 3,24568x IO- 3 7,29159x 10-z 1,51140x 10-2 3.62261x 10’ 2 7.69665x 1O-2 2.17593x lo- ’ 5.74504x lo- 1 4.24722 5.61868x 10 8.01136x lo4 1,89987x 10’ I,31661 x 1O’O
kW-'
= -0.0000036,
4.50851x lo4 3,98872x IO4 4.37304x 104 3,50363x 104 3.52663x lo4 3.12805x IO4 3,52103x lo4 3.47999x lo4 3.46807x lOA 3.62310x lo4 3.47861x lo4 3.52419x 104 3.39974x 104 3.49942x 104 3.26292x 10” 3.20774x IO4 2.69312x lo4
ilW>-‘It
6.63680 7.76312 8.66846 I‘80408x 10 3.80698x 10 8.05847x 10 1.60464x lo2 3.77978x lo2 7.82082x IO2 1a85728x 10” 3.70384x 103 9,48958x lo3 1.62681x lo4 2.08827x lo4 2.04284x 10” 1.74847x lo4
7s5947
cond (fir)
a refinement was started from a point very close to the minimum with cond (?ri’) z 10”. In other cases the failure must be interpreted to imply that the refinement had reached a point in R-space between two distinct minima in R, when R ceases to be a convex function of the parameters f. The failures can be recognised by notirig that one or more of the diagonal elements of Ii’ will become negative. Both failures can be avoided by choosing a new initial parameter vectorf”, It is also known that the FP method may fait if El=, whiIe remaining positive definite, tends to become singular17. Hr may become singular according to eqns. (12) and (6) if Br is a null matrix while A’ is singular. This can happen according
to eqns. (7) and (8) if A$ = 0 fork = 1 . . . n while A’ is singular, a circumstance which requires that 9’ = S’, an extremal set, by definition2. In practice as a choice of off-diagonal constants close to those of an S’ set is taken the rate of refinement decreases (n, and mf increase), but it has not yet failed to converge in these cases. An example is given in Table 9. It will be seen that as R’ decreases towards zero cond (H’) increases, though cond (A’) increases more rapidly. The way in which ns varies with the final value of cond (H’) for an aImost constant starting point is illustrated in Table 10. “Slow’” GNR refinements have been mentioned by various authors without
adducing any examples, though they have been correctly associated with nearly singular A’ matrices18. As li’ tends to zero the FP and GNR refinements become identical as H’ -P (A’)-I. Other failures in the FP method have occurred when H’ is positive definite but a value of at’ such that Rr + 1 -C R’ could not be found-failures in linear minimisation. Presumably inaccuracies in the computation of Ar, eqn. (IO), cause this vector to point in a direction that is not “downhill” but parallel to the R-contours TABLE
IO
DEPENDENCE
STARTS
FROM
VALUES
OF gz3
OF
RATE
5i;rr
=
OF
9.2,
REFINEhtENT
922
UPON
= 3.0,
THE
9-33
=
FINAL
VALUE
1.41,
$12
OF COND
= 1/3,
s-_13
(ff?). =
EACH 0.0,
REFINEMENT
AND
VARIOUS
923
-0.2947570 -0.4421355 -0.5600383 -0.5659334 -0.5663755 -0.5664179b -0.5666703 - O.‘S674072 -OS895140 -0.7368924
4 6 9 13 16 18 12 10 8 4
10
14 21 28 34 39 26 22 18 IO
1.42 x IO3 2.90 x 102 7.57 x lo- I 4.66x10-” 6.72 x IO-” 4.78 x 1O-6
5.30
9.38 1.86 2.10 1.75 2.31 2.68 6.25 2.21 5.01
x x x x x x x
IO2 lo3 104 10’ lo3 lo2 10’
o Refinement ternGnates when R’ < 1.0 x 10m6 b Closest approximation to .@23.
423
“uphill”. This situation arose when using Ho = (A’)-’ from points very close to the minimum. An occasional failure in linear minimisation occurred which was characterised by a continuously decreasing value of ct’ with r. It is clear that both types of failure in linear minimisation are associated with an inverse Hessian H’ which is nearly not positive definite and are related to the failures previously described. They occur near either a minimum or a saddle-point in the R-surface, and are usually dependent upon the value off’ used initially. or even
StabJlity of the FP re_finement The minimisation of a sum of squares (R) which is non-linear function of the parameters (fi) depends on two separate but related factors: (a) the method used to linearize the problem for each iteration and (b) the method used to solve the linear problem. Linearization in the GNR process is achieved by ignoring all derivates of v higher than the first, as in eqn. (1 l), while in the FP method second derivates are implicitly included, see eqn. (12), but higher derivates are still ignored. Thus the FP method uses a better approximation, but for very non-linear situations it may still be inadequate_ In the GNR method the linear problem involves a matrix inversion, which is therefore subject to rounding error. However this error is much less than has been often supposed. If the computer uses t-digit base-b floating point arithmetic and if cond (A’) = /3P,then it can be shownlv that the vector Af obtained by solving eqn. (11) will be correct to approximately (t-p) digits. Thus if t = 12 and /.I= 10, as on KDF9, Af’ should be correct to three digits if cond (A’) z 10’. Instead the refinement diverged’ with cond (A’) as low as 103, showing that the non-convergence was not due to rounding error. The linear problem in the FP method is the solution of eqn. (10). Since this involves only matrix multiplication rounding errors will be very small, especially if double-length inner products are used. The updating formula (12) is more subject to rounding errors since it involves quantities such as (grtl -g’) which are differences of two very similar numbers. While Fletcher and Powell’s updating formula normally guarantees that H’ remains positive definiteI some of the failures above may have resulted from rounding error introduced when the gradient was almost constant, gr+’ E gr. The norms and condition numbers given in Tables 8 and 9 are intended to form the basis for subsequent discussions on the nature of break-downs in the refinement process. Of particular interest is the fact that the FP refinement was successful even with cond (H’) > 104. Also, the “slow” refinements occurred with such low values of cond (H’) that the slowness must be attributed either to a deficiency in the linearization, or to rounding errors in the updating formula. Since “slow” refinements can occur with the GNR process the former explanation is more likely to be correct. It is hoped to apply this kind of analysis to other systems, e.g. involving data from chemical kinetics*‘, where the FP method has failed. The scaling of equations is important as shown by the fact that the sum of 424
J. Mol. Srrucrure,
12 (1972)
squares in eqn. (3) was minimised more rapidly than the one in eqn. (14).
R’ =
&%k-%;)“wk
The rate of minimisation w, = 1 to W, = l/12, or factor
(14)
of this R-factor varied as the weights were varied from
By contrast the rate of refinement of the R(3) was scarcely affected by the choice of weights as )vk = 1, l/vk or l/v,‘. rv, =
l/1,2_
ACKNOWLEDGEMENT’S
copy
I thank Dr. D. E. Freeman for correspondence and Dr. A. H. Clark for a of his $9 matrix.
REFERENCES 1 P. GANS, J. Chem. Sot., A, (1971), 2017. 2 P. GANS, Chem. Phys. Lett., 6 (1970) 561; see also C%enz.Phys. Lett., 7 (1970) 396. 3 D. R. LIDE AND D. E. MANN, J. Chem. Phys., 25 (1956) 1128. 4 A. H. CLARK, B. BEAGLEY, D. W. J. CRUICK~HANK AND J. G. HEWI~, J. Chem. Sot., A, (1970) 872. 5 P. GANS, Vibrating Molectdes, Chapman and Hall, London, 1971, p. 120. 6 P. GANS, Chem. Phys. Lett., 12 (1972) 471. 7 P. TORKINGTON, J. Chem. Phys., 17 (1949) 357. 8 W. SAWODNY, J. Mol. Spectrosc., 30 (1969) 56. 9 D. E. FREEMAN, Chem. Phys. Lett., 8 (1971) 271. 10 J. W. NIRLER AND G. C. PIMENI-EL,J. Mol. Spectrosc-, 26 (1968) 294. 11 W. J. LEHMANN, L. BEcehiANN AND L. GIJTJAHR,J. Chem. Phys_, 44 (1966) 1654. 12 R. FLETCHERAND M. J. D. POWELL, Computer J., (1963) 163. 13 D. E. FREEMAN,J. Mol. Spectrosc., 20 (1966) 75. 14 Y. BARD, J. Sot. Id. Appl. Math., 7 (1970) 157. 15 Y. BARD, personal communication. 16 W. C. DAVIDON, U.S.A.E.C. Report, 1959, ANL 5990. 17 Y. BARD, Math. Computation, 22 (1968) 665. 18 J. ALDOUS AND 1. M. MILLS, Spectrochim. Actu, 18 (1962) 1073. 19 G. E. FORSYTHEAND C. B. MOLER, Computer Solution of Linear Algebraic Systems, PrenticeHall, Englewood Cliffs, N.J., 1967, p. 50. 20 R. N. F. THORNLEY, A. G. SYKES AND P. GANS, .i. Chem. Sot., A, (1971) 1494.
J. Mol. Structure,
12 (1972)
425