0097-8485/84 $3.00 + .00 Pergamon Press Ltd.
ESTIMATION OF THE BASICITY CONSTANTS OF WEAK BASES BY THE TARGET TESTING METHOD OF FACTOR ANALYSIS
Ü. HALDNA and A. MURSHAK
Institute of Chemistry, Estonian SSR Academy of Sciences, 15 Akadeemia tee, Tallinn 200026, U.S.S.R.
Abstract-The basicity constants ($K_{BH^+}$) and corresponding solvation parameters ($m^*$) of weak bases are extensively studied in aqueous solutions of strong acids by UV-VIS spectrophotometry. However, calculating the $pK_{BH^+}$ and $m^*$ values is often difficult because of “medium effects” in the UV-VIS absorption spectra considered. To overcome these difficulties a computer program based on the target testing method of factor analysis has been written. The program generates a large number of combinations of the given $pK_{BH^+}$ and $m^*$ values. Using the target testing method, the pair of values that best fits the set of measured UV-VIS spectra is chosen.

1. INTRODUCTION
The UV-VIS absorption spectra of weak bases may easily be used for calculating the respective basicity constant ($K_{BH^+}$) and solvation parameter ($m^*$) values if (i) on the plot of molar extinction coefficient vs strong acid concentration two plateaus are observed, corresponding to the unionized base (B) and its conjugate acid (BH+), and (ii) the B to BH+ form ratio does not depend on the wavelength chosen. For a number of weak bases these two requirements do not hold (Hammett, 1970) because the absorption maxima shift with increasing strong acid concentration. The empirical correction methods for these “medium effects” (Hammett, 1970) have not yielded a solution to the problem. For this reason Edward & Wong suggested using factor analysis (FA) in basicity studies (Edward & Wong, 1977). The method they suggested is a version of principal component analysis (PCA) described in detail by Simonds (Simonds, 1963). In some papers the results obtained by this method were found to be quite satisfactory (Cox et al., 1979; Edward & Wong, 1977; Zalewski, 1979; Haldna et al., 1980; Haldna et al., 1980).
But the application of Simonds’ version of PCA to amide protonation was unsuccessful: the coefficient of the first vector did not reach a stable final value in sulfuric acid solutions where the base is practically all in the BH+ form; instead, this coefficient continued to vary slowly with increasing acid concentration (Cox & Yates, 1981). This failure of Simonds’ method is not surprising because it yields only an abstract FA solution to the problem. The abstract vectors and their coefficients are good for short-cut reproduction of the initial data matrix but may have no real physical or chemical meaning (Malinowski & Howery, 1980). Using digital simulation models, the ability of the Simonds method to recover the shapes of the vectors and their coefficients used to generate the data matrixes has been estimated (Haldna & Murshak, 1983). It was found that the results obtained by the Simonds method should be treated with caution because in a number of cases the agreement between the initial and recovered shapes of vectors and their coefficients was rather poor (Haldna & Murshak, 1983). This could be anticipated as the Simonds method (Simonds, 1963) is in fact PCA, i.e. first the maximum possible variation is assigned to the first eigenvector, then the maximum amount of the remainder is assigned to the second, etc. Thus the Simonds method tends to overestimate the first vector at the expense of all the others, the second at the expense of the remaining ones, etc.

2. PRINCIPLES
The failure of the Simonds method (Simonds, 1963) does not mean that factor analysis (FA) should be disqualified as a tool in protonation studies. For further use of FA in protonation studies the abstract FA solution must be transformed into a more meaningful solution. The mathematical basis for such a transformation is summarized by the sequence of equations (Malinowski & Howery, 1980)
$$[D] = [R]_{PCA}[C]_{PCA} = [R]_{PCA}[T][T]^{-1}[C]_{PCA} = [R]_{TFA}[C]_{TFA} \tag{1}$$
where $[D]$ is the spectral data matrix, and $[R]$ and $[C]$ are the row matrix and column matrix, respectively. The index PCA indicates that the matrix has been obtained by PCA. The transformed matrixes have the index TFA. $[T]$ is the transformation matrix and $[T]^{-1}$ its inverse. Two distinctly different approaches, i.e. abstract rotation and target testing (TT), may be used to transform the PCA solution. The first of these, abstract rotation, transforms the abstract PCA matrixes into other abstract matrixes and is therefore not useful for our purposes. The second approach, target testing (TT), is a unique method for testing potential factors one at a time. TT enables us to evaluate ideas concerning the nature of the factors and thereby develop physically significant models for the data considered (Malinowski & Howery, 1980). For this reason we decided to apply the TT approach to the PCA solution in order to estimate meaningful values of $pK_{BH^+}$ and $m^*$ for weak bases whose UV-VIS spectra show a complicated pattern of behaviour in strongly acidic solutions.

3. ALGORITHM
The program, written in FORTRAN IV, is based on the mathematical procedures given by Malinowski (Malinowski & Howery, 1980). In the first stage, the PCA stage, the covariance about the origin of the data is used. The iterations for decomposition of the covariance matrix are repeated until no element of the eigenvector $C_j$ considered changes by more than $1 \times 10^{-4}$% between two subsequent steps (FORTRAN IV double precision mode was applied). The number of abstract factors (NF) necessary in PCA was determined using the variance (Malinowski & Howery, 1980):

$$\mathrm{VAR} = \frac{\lambda_{k+1}}{\sum_{j=1}^{k} \lambda_j} \tag{2}$$

where $\lambda_1, \ldots, \lambda_k$ are the eigenvalues already calculated and $\lambda_{k+1}$ is the new one. If VAR < Q, the new $(k+1)$th factor was rejected and NF = k. If VAR > Q, the $(k+1)$th factor was accepted. After some experimentation we chose Q = 0.0008. This Q value yields NF < 5, and the square mean element of the residual matrix $[RE] = [D] - [R]_{PCA}[C]_{PCA}$ was < 2% of the respective element of the data matrix $[D]$.

For target testing we must have a test vector $R_l$ quantitatively representing the model under test. As $R_l$ we used the protonation fraction of the base

$$R = \frac{[BH^+]}{[B] + [BH^+]} \tag{3}$$

where [B] and [BH+] are the molar concentrations of the base B and its protonated form BH+, respectively. The elements of $R_l$ were calculated using the excess acidity method (Cox & Yates, 1978) and a combination of $pK_{BH^+}$ and $m^*$ values generated by the program:

$$r_{il} = \frac{I}{1 + I} \tag{4}$$

where $I = [BH^+]/[B]$ is the ionization ratio,

$$\log I = \log C_{H^+} + m^* X + pK_{BH^+} \tag{5}$$

and $C_{H^+}$ and $X$ are the molar concentration of hydrated protons and the excess acidity, respectively. The program used all combinations of the $pK_{BH^+}$ and $m^*$ values included. The number of included $pK_{BH^+}$ values (NPK) and $m^*$ values (NM) was NPK ≤ 25 and NM ≤ 25, respectively. Thus the total number of $R_l$ vectors tested (NT) was NT ≤ 625. Each $R_l$ was subjected to the TT (Malinowski & Howery, 1980):

$$\hat{R}_l = [R]_{PCA} T_l \tag{6}$$

where the test vector

$$T_l = [\lambda]_{PCA}^{-1} [R]_{PCA}^{T} R_l \tag{7}$$

$[\lambda]_{PCA}$ is the diagonal matrix with the eigenvalues $\lambda_1, \lambda_2, \ldots$ obtained in PCA on the main diagonal. If the suspected test vector $R_l$ is a real one, i.e. fits the measured set of UV-VIS spectra, regeneration according to Eq. (6) will be successful: each element of $\hat{R}_l$ will reasonably equal the corresponding element of $R_l$.
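The generation of test vectors from Eqs. (3)-(5) can be sketched in a few lines. The following Python fragment is only an illustration under stated assumptions, not the authors' FORTRAN IV program: the function names (`protonation_fraction`, `parameter_grid`, `build_test_vectors`) are hypothetical, and the solution data (`log_c_h`, `x`) are made-up numbers rather than entries from the tables of Cox & Yates (1978).

```python
def protonation_fraction(log_c_h, x, pk_bh, m_star):
    """Eqs. (4) and (5): r = I / (1 + I) with log I = log C_H+ + m* X + pK_BH+."""
    log_i = log_c_h + m_star * x + pk_bh
    # I / (1 + I) rewritten as 1 / (1 + 1/I) so that only 10**(-log I) is needed.
    return 1.0 / (1.0 + 10.0 ** (-log_i))


def parameter_grid(lowest, highest, step):
    """Evenly spaced parameter values from lowest to highest, both inclusive."""
    count = int(round((highest - lowest) / step)) + 1
    return [round(lowest + k * step, 10) for k in range(count)]


def build_test_vectors(log_c_h, x, pk_values, m_values):
    """One test vector R_l (one element per solution) per (pK_BH+, m*) pair."""
    if len(pk_values) > 25 or len(m_values) > 25:
        # Limit quoted in the text: NPK <= 25 and NM <= 25, hence NT <= 625.
        raise ValueError("at most 25 values of each parameter are allowed")
    return {
        (pk, m): [protonation_fraction(lc, xi, pk, m) for lc, xi in zip(log_c_h, x)]
        for pk in pk_values
        for m in m_values
    }


# Hypothetical solution parameters (illustrative numbers only).
log_c_h = [0.0, 0.4, 0.8, 1.0]
x = [0.0, 1.0, 2.6, 3.5]

vectors = build_test_vectors(
    log_c_h,
    x,
    parameter_grid(-4.0, 0.0, 0.2),   # pK_BH+ over 4 units in steps of 0.2
    parameter_grid(0.5, 1.3, 0.05),   # m* over 0.8 units in steps of 0.05
)
print(len(vectors), "test vectors generated")
```

Generating the grid by integer index and rounding each value avoids the drift that repeated floating-point addition of the step would introduce.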

The program calculates and stores for each $R_l$ tested the value of $Z$ (Eq. 8), a sum of misfits computed from $\hat{r}_{il}$ and $r_{il}$, the elements of $\hat{R}_l$ and $R_l$, respectively. When all the tests have been performed, the smallest $Z$ value is chosen. The $pK_{BH^+}$ and $m^*$ values used to obtain the minimal $Z$ value are considered to be the best of those tested for describing the protonation equilibrium studied. Of course, if in the run the $pK_{BH^+}$ and $m^*$ are varied in large steps, a new run with smaller $pK_{BH^+}$ and $m^*$ steps (over a narrower range) is necessary in order to obtain $pK_{BH^+}$ and $m^*$ values within the commonly used precision of ±0.01 in the corresponding units. The parameters for the new run are determined by the researcher on the basis of the four best combinations of $pK_{BH^+}$ and $m^*$ values printed out by the program. When the most suitable combination of $pK_{BH^+}$ and $m^*$ values has been found, the program reproduces the spectral data matrix used. For this purpose the test matrix $[T]$ is generated

$$[T] = [\lambda]_{PCA}^{-1} [R]_{PCA}^{T} [\bar{R}] \tag{9}$$

where the row matrix tested $[\bar{R}]$ is made by replacing the first column of $[R]_{PCA}$ by the $\hat{R}_l$ corresponding to the minimal $Z$ value (Eq. 8). Postmultiplying the abstract row matrix $[R]_{PCA}$ by $[T]$ yields $[R]_{TFA}$, the row matrix in the new coordinate system:

$$[R]_{TFA} = [R]_{PCA}[T] \tag{10}$$

The column matrix in the new coordinate system is obtained by

$$[C]_{TFA} = [T]^{-1}[C]_{PCA}. \tag{11}$$

The program calculates the reconstructed data matrix

$$[D]_{TFA} = [R]_{TFA}[C]_{TFA} \tag{12}$$

and the difference

$$[ER] = [D]_{TFA} - [D]. \tag{13}$$

The square mean element of $[ER]$ was found to be ≤ 2.5% of the same element of $[D]$.
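
The target testing and ranking steps can be sketched as follows. This is a minimal Python illustration under several assumptions, not a reproduction of the authors' program: the eigenvectors are extracted by power iteration with deflation (the text only says that the decomposition is iterative), $Z$ is taken to be the sum of squared differences between regenerated and tested elements (the exact form of Eq. 8 is assumed), and all function names and the synthetic data are hypothetical. The demonstration uses spectra that do not shift with acidity, which the paper itself calls the trivial case, purely so that the expected answer is known.

```python
import math


def abstract_factors(d, nf, tol=1e-12, max_iter=100000):
    """PCA about the origin; returns (eigenvalues, [R]_PCA) for nf factors."""
    ncol = len(d[0])
    # Covariance about the origin, Z = D^T D (rows: solutions, columns: wavelengths).
    z = [[sum(row[a] * row[b] for row in d) for b in range(ncol)] for a in range(ncol)]
    eigenvalues, vectors = [], []
    for _factor in range(nf):
        v = [1.0 / (j + 1.0) for j in range(ncol)]  # arbitrary start vector
        for _step in range(max_iter):
            w = [sum(zr[b] * v[b] for b in range(ncol)) for zr in z]
            norm = math.sqrt(sum(c * c for c in w))
            w = [c / norm for c in w]
            converged = max(abs(p - q) for p, q in zip(w, v)) < tol
            v = w
            if converged:
                break
        zv = [sum(zr[b] * v[b] for b in range(ncol)) for zr in z]
        lam = sum(p * q for p, q in zip(v, zv))
        eigenvalues.append(lam)
        vectors.append(v)
        # Deflation: remove the factor just found before extracting the next one.
        z = [[z[a][b] - lam * v[a] * v[b] for b in range(ncol)] for a in range(ncol)]
    # Column k of [R]_PCA is D multiplied by the k-th eigenvector.
    r = [[sum(row[j] * v[j] for j in range(ncol)) for v in vectors] for row in d]
    return eigenvalues, r


def target_test(eigenvalues, r, test_vector):
    """Eqs. (6) and (7): T_l = [lambda]^-1 [R]^T R_l, then R_hat_l = [R] T_l."""
    nf = len(eigenvalues)
    t = [sum(r[i][k] * test_vector[i] for i in range(len(r))) / eigenvalues[k]
         for k in range(nf)]
    return [sum(row[k] * t[k] for k in range(nf)) for row in r]


def misfit(predicted, test_vector):
    """Assumed form of Z: sum of squared differences."""
    return sum((p - q) ** 2 for p, q in zip(predicted, test_vector))


def best_combinations(d, candidates, nf, keep=4):
    """Rank (pK_BH+, m*) candidates by misfit, smallest first; keep the best few."""
    eigenvalues, r = abstract_factors(d, nf)
    scored = [(misfit(target_test(eigenvalues, r, vec), vec), params)
              for params, vec in candidates.items()]
    scored.sort()
    return scored[:keep]


# Hypothetical solution parameters and constant spectra (illustrative numbers only).
log_c = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.1, 1.2]
x = [0.0, 0.5, 1.0, 1.8, 2.6, 3.5, 4.5, 6.0]
a = [0.2, 0.9, 1.5, 0.7, 0.1]    # protonated form BH+
b = [1.2, 0.8, 0.3, 0.1, 0.05]   # unprotonated form B


def frac(pk, m):
    return [1.0 / (1.0 + 10.0 ** (-(lc + m * xi + pk))) for lc, xi in zip(log_c, x)]


true_r = frac(-2.4, 0.9)
d = [[(1.0 - ri) * bj + ri * aj for aj, bj in zip(a, b)] for ri in true_r]
candidates = {(pk, m): frac(pk, m) for pk in (-3.0, -2.4, -1.8) for m in (0.7, 0.9, 1.1)}
ranked = best_combinations(d, candidates, nf=2)
for z_value, (pk, m) in ranked:
    print(f"pK_BH+ = {pk:5.2f}  m* = {m:4.2f}  Z = {z_value:.3e}")
```

Keeping the four smallest misfits mirrors the printout described above, from which the ranges for a finer second run would be chosen.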

4. RESULTS
Digital simulation models were applied in order to check whether the program enables us to recover the $pK_{BH^+}$ and $m^*$ values used to generate the data matrixes. At the wavelength $j$ the molar extinction coefficient for the solution $i$ was calculated by

$$\varepsilon_{ij} = (1 - r_{il})\,b_{ij} + r_{il}\,a_{ij} \tag{14}$$

where $a_{ij}$ and $b_{ij}$ are the molar extinction coefficients for the protonated and unprotonated forms of the base, respectively; $r_{il}$ is given by Eq. (4). If $a_{ij}$ and $b_{ij}$ do not depend on the strong acid concentration of the solution, the problem is trivial and there is no reason to use FA. Thus, when generating digital models for FA we have to use some kind of functions $a_{ij} = \varphi_a$(% acid) and $b_{ij} = \varphi_b$(% acid). In run 1 we chose for the B and BH+ form spectra Gaussian curves shifting towards shorter wavelengths with increasing strong acid concentration. In run 2 only the spectrum of the BH+ form shifts, in the same way as in run 1, while the spectrum of the B form was held constant. Run 3 was similar to run 1 but the shifting Gaussian curves were replaced by shifting parabolas. In runs 1-3 the wavelengths corresponding to the maxima in the spectra were linear functions of the strong acid concentration in the solution. In run 4 the spectra of both B and BH+ forms were held constant but a special “medium effect” $p_i g_j$ was added to $\varepsilon_{ij}$:

$$\varepsilon'_{ij} = \varepsilon_{ij} + p_i g_j \tag{15}$$

where $p_i$ is the % (w/w) of sulfuric acid in solution and $g_j$ is an arbitrary function of the wavelength. In all runs two sets of $pK_{BH^+}$ and $m^*$ values were used to obtain $r_{il}$ (Eq. 4). These are $pK_{BH^+} = -1.58$, $m^* = 0.69$ and $pK_{BH^+} = -2.40$, $m^* = 0.90$. In Fig. 1 an example of a generated set of spectra is presented. No matter how the sets of shifting spectra were constructed (i.e. in runs 1-4), the program was always able to suggest as the “best fitting values” exactly those $pK_{BH^+}$ and $m^*$ values which were used to generate the corresponding set of spectra. This seems to demonstrate that the TT method in FA separates protonation effects from “medium effects” in the UV-VIS absorption spectra. We should mention that the use of the excess acidity method for generating the test vectors $R_l$ is not a requirement: any other approach, for example the Bunnett-Olsen method (Bunnett & Olsen, 1966), is also acceptable.

Fig. 1. Simulated spectra for a hypothetical weak base ($pK_{BH^+} = -2.40$ and $m^* = 0.90$). The spectra of the B and BH+ forms are Gaussian curves shifting towards shorter wavelengths with increasing sulfuric acid concentration. The spectra for the following sulfuric acid solutions are given (% H2SO4, w/w): 1, 5.0; 2, 33.0; 3, 40.0; 4, 45.0 and 5, 86.0.

REFERENCES
Bunnett, J. F. & Olsen, F. P. (1966), Can. J. Chem. 44, 1899.
Cox, R. A., Smith, C. R. & Yates, K. (1979), Can. J. Chem. 57, 2952.
Cox, R. A. & Yates, K. (1978), J. Am. Chem. Soc. 100, 3861.
Cox, R. A. & Yates, K. (1981), Can. J. Chem. 59, 1560.
Edward, J. T. & Wong, S. C. (1977), J. Am. Chem. Soc. 99, 4229.
Haldna, U., Murshak, A. & Kuura, H. (1980), Organic Reactivity 17, 384.
Haldna, U., Murshak, A. & Kuus, H. (1980), Organic Reactivity 17, 313.
Haldna, U. & Murshak, A. (1984), Comput. Chem. 8, 390.
Hammett, L. P. (1970), Physical Organic Chemistry. New York, McGraw-Hill.
Malinowski, E. R. & Howery, D. G. (1980), Factor Analysis in Chemistry. New York, J. Wiley.
Simonds, J. L. (1963), J. Opt. Soc. Am. 53, 968.
Zalewski, R. I. (1979), J. Chem. Soc. Perkin Trans. II, 1637.
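
APPENDIX I
[Flow chart of the program. Box contents, in order: (a) input of the spectral data matrix [D], the number of rows and columns of the spectral data matrix, the root mean square error in the spectral data (ED), the acid % for each solution, and the lowest and highest values and step lengths for $pK_{BH^+}$ and the solvation parameter $m^*$ (i); (b) calculation and storing of all $pK_{BH^+}$ and $m^*$ values to be used, and calculation of the solution parameters $X$ and $C_{H^+}$ using the acid % (ii); (c) factor analysis (PCA), in which new factors are included until a criterion on the estimated root mean square error is met; the abstract row ([R]) and column ([C]) matrixes are obtained and the residual matrix is printed; (d) cycles over all combinations of $pK_{BH^+}$ and $m^*$ values, each cycle including (1) calculation of the test vector, (2) target testing, (3) calculation of the sum of misfits, and (4) storing of the last quantity together with the $pK_{BH^+}$ and $m^*$ values used; (e) the misfits are arranged, the minimal one first, and the four smallest misfits are printed together with the respective $pK_{BH^+}$ and $m^*$ values; (f) new [R] and [C] matrixes are obtained using the test vector yielding the minimal misfit, the spectral data matrix $[D]_{TFA}$ is reproduced, the error matrix $[ER] = [D]_{TFA} - [D]$ is calculated, and $[D]_{TFA}$ and $[ER]$ are printed.]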
(i) Ranges for $pK_{BH^+}$ and $m^*$ are estimated first very roughly using available literature data for the protonation of bases of a similar structure. For example, the $pK_{BH^+}$ may be varied within the range of about 4 units by a step of 0.2 and the $m^*$ values within the range of 0.8 units by a step of 0.05, respectively.

(ii) Equations for calculation of the X-function are those of Cox (Cox & Yates, 1978). The values of $C_{H^+}$ were obtained as follows. For aqueous sulfuric acid solutions (P = H2SO4 % w/w, Z = 0.05 P), two expressions of the form $C_{H^+} = P \times (\text{polynomial in } Z)$ were used, one for P < 65.0 and one for P > 65.0 [polynomial coefficients not legible in this copy]. For aqueous perchloric acid solutions (P = HClO4 % w/w):

$$C_{H^+} = \frac{P}{10.066 - 0.059526\,P}$$
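
As a small worked example, the perchloric acid expression of note (ii), read here as $C_{H^+} = P/(10.066 - 0.059526\,P)$, can be evaluated directly. The Python function below is a sketch under that reading; its name `c_h_perchloric` and the range check are assumptions added for illustration and are not part of the original program.

```python
def c_h_perchloric(p):
    """C_H+ (mol/l) for aqueous HClO4 of p % (w/w), per the expression in note (ii)."""
    if not 0.0 <= p <= 100.0:
        # Range check added for this sketch; a weight percentage outside 0-100 is meaningless.
        raise ValueError("p must be a weight percentage between 0 and 100")
    return p / (10.066 - 0.059526 * p)


for p in (10.0, 40.0, 70.0):
    print(f"{p:5.1f} % HClO4 (w/w): C_H+ = {c_h_perchloric(p):.3f} mol/l")
```

Because perchloric acid is a strong acid, the values returned are close to the ordinary molarity of the acid at the same weight percentage.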