ARCHIVES OF BIOCHEMISTRY AND BIOPHYSICS
Vol. 194, No. 2, May, pp. 542-551, 1979

Quantitative Structure-Activity Relationships of Chymotrypsin-Ligand Interactions: An Analysis of Interactions in ρ3 Space

CIRO GRIECO,* CORWIN HANSCH,†,1 CARLO SILIPO,*,2 R. NELSON SMITH,† ANTONIO VITTORIA,* AND KELVIN YAMADA†

*Institute of Pharmaceutical and Toxicological Chemistry, University of Naples, Naples, Italy, and †Department of Chemistry, Pomona College, Claremont, California 91711

Received October 13, 1978; revised December 27, 1978
Fourteen derivatives of L-alanine of the type CH3CH(NHCO-3-C5H4N)COOR3 have been synthesized and their hydrolysis by chymotrypsin was studied with the object of characterizing the enzymic space (ρ3) to which R3 binds. The binding of R3 (log 1/Km) was shown via correlation analysis to correlate with the molar refractivity (MR) of R3 rather than with its hydrophobicity (π). The results confirmed our earlier predictions. A correlation equation for the hydrolysis of 77 acylamino acid esters of the general formula R2CH(NHCOR1)COOR3, relating log (kcat/Km) to the molar refractivity of R1, R2, and R3 and to σ* (Taft's polar parameter) of R3, was formulated. The general picture of ligand interactions with chymotrypsin as seen with correlation analysis is discussed.
This report continues our study of the interactions of various substrates and inhibitors with chymotrypsin (1-4). In formulating quantitative structure-activity relationships (QSAR)^3 for chymotrypsin-ligand interactions, we have employed the nomenclature and model for binding of Hein and Niemann (5).
The three substituents in structure I on the α-carbon of the acyl ester of an L-amino acid are shown interacting with the ρ1, ρ2, and ρ3 spaces of the enzyme. The α-H projects behind the plane of the page and is said to interact in ρH space.

[Structure I: acyl ester of an L-amino acid with R1, R2, and R3 positioned in the ρ1, ρ2, and ρ3 spaces of the enzyme]

Hein and Niemann postulated that characteristic constants could be found for the interaction of R1, R2, and R3 with ρ1, ρ2, and ρ3 of the enzyme; in fact, they believed that many different binding modes would occur with each ligand and that all of these constants would have to be considered in order to rationalize the binding of any given ligand (6). We have approached the problem with a different view and assumed that each ligand binds in essentially one mode. Starting with this assumption (2, 4), we formulated Eq. [1],

$$\log 1/K_m = 0.80(\pm 0.11)\,\mathrm{MR}_1 + 1.09(\pm 0.11)\,\mathrm{MR}_2 + 0.52(\pm 0.13)\,\mathrm{MR}_3 - 0.63(\pm 0.26)\,I + 1.26(\pm 0.28)\,\sigma^*_3 - 0.057(\pm 0.013)\,\mathrm{MR}_1 \cdot \mathrm{MR}_2 \cdot \mathrm{MR}_3 - 1.61(\pm 0.47), \tag{1}$$

where n = 71, r = 0.979, s = 0.332. MR in this expression refers to the molar refractivity of groups R1, R2, and R3. I is an indicator variable which takes the value of 1 for the case where R2 = isopropyl and the value of 0 for all other examples of R2; its negative coefficient indicates that the isopropyl group hinders binding, all other factors being equal. The parameter σ* is Taft's polar constant which is a measure of the inductive effect of R3. The negative coefficient with the cross product (MR1·MR2·MR3) indicates that large groups interact cooperatively to inhibit binding when R1, R2, and R3 reach a certain size. One must bear in mind that MR {MR = [(n² - 1)/(n² + 2)]·(MW/d)} is largely a measure of molar volume; it is also viewed as a measure of the polarizability of a molecule or a substituent. Equation [1] is based on 71 data points (n), r is the correlation coefficient, and s is the standard deviation from regression. We have assumed for a first approximation that 1/Km can be taken as a binding constant. Equation [1] shows that the dependence of binding on the three types of substituents R1, R2, and R3 varies considerably, with coefficients ranging from 1.1 for R2 to 0.5 for R3.

A difficult problem in this rather nonspecific kind of binding correlated by MR is that one often finds (7) another kind of nonspecific binding correlated by π. The parameter π is derived from partition coefficients (8) and is assumed to be a measure of the classic partitioning process which is postulated to be driven primarily by the forces involved in desolvation. Considerable evidence has accumulated from X-ray studies of protein structure to show that apolar amino acid residues in proteins are placed together to form hydrophobic pockets (9). This means that polar groups must constitute the main elements of certain other domains. The partitioning of proteins into polar and apolar domains is by no means perfect and there are regions of mixed character. Our results (7, 10) seem to show two limiting types of nonspecific binding: one, characterized by π, which we presume to correlate binding in apolar space and another, characterized by MR, which is assumed to correlate binding in polar space. Our present objective is to use chymotrypsin as one model system in which these types of binding can be better delineated.

One of the serious shortcomings of many past structure-activity studies of ligands interacting with enzymes is that in choosing substrates, the authors paid little or no attention to making changes in substituents so as to provide a set of congeners having orthogonal vectors with respect to the known substituent constants such as σ, Es, π, and MR. To a considerable extent, the data upon which Eq. [1] is based suffer from this difficulty. Therefore, to more precisely characterize binding in ρ1, ρ2, and ρ3 space, we have decided to reinvestigate the interaction of acyl esters of amino acids with chymotrypsin. For this first study we have synthesized a set of L-esters of the general formula II.

[Structure II: CH3CH(NHCO-3-C5H4N)COOR3]

Collinearity between MR and π and between MR and σ* for R3 of the esters upon which Eq. [1] rests is: r²(MR, π) = 0.91; r²(MR, σ*) = 0.61. Table II lists the enzymic and physicochemical parameters for the 14 esters we have prepared and studied. The collinearity among the pertinent variables is given in Table III. While we have very good separation of σ* and MR, collinearity between MR and π is still higher than one would like. This degree of collinearity (r²(MR, π) = 0.45) was accepted in order to avoid excessively difficult synthetic problems. Even with this degree of collinearity, the results clearly show how much more important MR is for binding than π.

1 To whom all correspondence should be addressed.
2 Visiting Scientist from the University of Naples.
3 Abbreviations used: QSAR, quantitative structure-activity relationships; MR, molar refractivity; ρ3, enzymic space; π, hydrophobicity; σ*, Taft's polar parameter.

0003-9861/79/060542-10$02.00/0. Copyright © 1979 by Academic Press, Inc. All rights of reproduction in any form reserved.

EXPERIMENTAL PROCEDURES

Substituent Constants

We have used parameters from our recent compilations (11, 12) to correlate the data in Tables II and IV. Octanol/water log P values were determined as usual (8); in addition, we have taken σ* values from Taft (13). We have used the value of Rhodes and Vargas (14) for -CH2-C6H5 and Kruglikova and Kalinina's value (15) for -CH2OCH3. To estimate σ* for the -CH2-tetrahydrofuryl group of compound 8 we have used σ* for -CH2CH(OH)CH3 (16).
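
The molar refractivity used throughout follows the Lorentz-Lorenz expression MR = [(n² - 1)/(n² + 2)]·(MW/d). As a minimal sketch of that arithmetic, the Python function below evaluates it for a whole liquid. The function name is hypothetical, and the toluene constants are approximate handbook values assumed here for illustration; they are not taken from this paper. The substituent MR values in the tables are group contributions from the cited compilations, so this only illustrates the definition. The factor of 0.1 mirrors the scaling of MR values described in the Discussion.

```python
def molar_refractivity(refractive_index, mol_weight, density):
    """Lorentz-Lorenz molar refractivity in cm^3/mol.

    refractive_index: n (dimensionless), mol_weight: g/mol, density: g/cm^3.
    """
    n_sq = refractive_index ** 2
    return (n_sq - 1.0) / (n_sq + 2.0) * (mol_weight / density)


# Assumed, approximate handbook constants for toluene (not from the paper).
mr_toluene = molar_refractivity(refractive_index=1.4961,
                                mol_weight=92.14,
                                density=0.8669)

# MR values in the correlation equations are scaled by 0.1.
mr_scaled = 0.1 * mr_toluene

print(f"MR = {mr_toluene:.2f} cm^3/mol, scaled MR = {mr_scaled:.2f}")
```

Because the refractive index of organic liquids varies little, the first factor stays near 0.3 and MR tracks the molar volume MW/d closely.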
Preparation of α-N-Nicotinyl-L-Alanine Esters

Method A. Most of the compounds of Table I were prepared by a general procedure (2): A mixture of L-alanine (0.027 mol) and the appropriate alcohol (100 ml) was saturated with anhydrous hydrogen chloride. The mixture was then heated at 80-90°C for 2 h, after which the excess alcohol was evaporated under reduced pressure or extracted with ether from the basic (K2CO3) solution. In the latter case, the L-alanine ester hydrochloride was reprecipitated from the dry ether solution. Nicotinyl azide (0.022 mol) and ethyl acetate were added to the free base (0.020 mol). After standing overnight, this mixture was extracted with 4% HCl, the extract neutralized with 4% NaHCO3, and then finally extracted with ethyl acetate. The final extract was dried and the solvent removed under reduced pressure.

Method B. Some of the compounds in Table I were made from the corresponding carbobenzoxy alanine ester (17). To a solution of 0.02 mol of carbobenzoxy alanine in 20 ml of acetone, 0.02 mol of the appropriate alcohol and 2 ml of pyridine were added. The calculated amount of dicyclohexylcarbodiimide was added and the mixture held at room temperature for 4 h. The N,N'-dicyclohexylurea which separated was filtered off and the filtrate was evaporated to dryness. The carbobenzoxy group was removed by catalytic (Pd) hydrogenolysis in ethyl acetate. The course of the reaction was followed by absorbing the carbon dioxide and measuring the hydrogen. The free base was treated with nicotinyl azide as mentioned in Method A.

Method C. A solution of L-alanine benzyl ester (18) in ethyl acetate was treated with nicotinyl azide to yield the nicotinyl ester. Hydrogenolysis in methanol was used to remove the benzyl moiety. The acids obtained were dissolved in ethyl acetate with triethylamine, the appropriate halogen compound (ClCH2CN, ClCH2COCH3, ClCH2OCH3) was added, and the mixture was allowed to stand for 4 h. After filtration, the solution was extracted with 4% HCl. This extract was neutralized with NaHCO3 and extracted with ethyl acetate. After drying the extract over Na2SO4, the solvent was removed under reduced pressure.

Purification of all of the compounds was achieved by column chromatography and/or recrystallization. Purity of all of the compounds was established by tlc and nuclear magnetic resonance. All new compounds of Table I gave carbon, hydrogen, and nitrogen analyses which agreed within 0.3% of the theoretical value.

TABLE I
PROPERTIES AND METHOD OF PREPARATION OF CH3CH(NH-3-COC5H4N)COOR3

No.  R3                    Method  Solvent of recrystallization  mp (°C)   [α]D     c (1 N HCl)  log P          π^a
1    CH(CH3)2                      Acetone-pentane               85-86     -20.1    4.71          1.11 ± 0.01    0.93
2    CH3                           Acetone-pentane               75-76     -24.4    2.26          0.18 ± 0.00    0.00
3    C3H7                          Acetone-pentane               49-50     -22.1    4.62          1.16 ± 0.01    0.98
4    C4H9                          Acetone-pentane               45-46     -27.3    3.03          1.80 ± 0.00    1.62
5    C2H5                          Acetone-pentane               92-93     -19.9    4.57          0.64 ± 0.01    0.46
6    C(CH3)3                       Ethyl acetate-hexane          87-88     -35.3    3.62          1.61 ± 0.01    1.43
7    CH2CH2CH(CH3)2        A       Liquid                        -         -17.7    5.00          2.16 ± 0.05    1.98
8    CH2-tetrahydrofuryl   B       Ethyl acetate-hexane          60-61     -15.0    6.14          0.50 ± 0.01    0.32
9    CH2CH2Cl                      Acetone-pentane               78-79     -19.2    5.33          0.78 ± 0.01    0.60
10   CH2CH2OCH3                    Acetone-pentane               41-42     -25.9    2.41          0.02 ± 0.01   -0.16
11   CH2COCH3                      Acetone-pentane               122-123   -5.9     6.65         -0.20 ± 0.02   -0.38
12   CH2OCH3                       Liquid                        -         -9.84    4.70          0.16 ± 0.02   -0.02
13   CH2C6H5                       Acetone-pentane               112-113   -11.9    5.08          2.03 ± 0.05    1.85
14   CH2CN                         Acetone-pentane               76-77     -18.2    3.41         -0.24 ± 0.02   -0.42

a π is defined with respect to CH3; that is, compound 2 is taken as the parent molecule and its log P subtracted from each of the others.
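
The π values of Table I follow from the log P values by the rule stated in the table footnote: the log P of the methyl ester (compound 2) is subtracted from that of each congener. A short sketch of that subtraction, using the log P values of Table I (the variable names are illustrative only):

```python
# Octanol/water log P of the 14 esters, keyed by compound number (Table I).
LOG_P = {
    1: 1.11, 2: 0.18, 3: 1.16, 4: 1.80, 5: 0.64, 6: 1.61, 7: 2.16,
    8: 0.50, 9: 0.78, 10: 0.02, 11: -0.20, 12: 0.16, 13: 2.03, 14: -0.24,
}
PARENT = 2  # the methyl ester is taken as the parent molecule

# pi(R3) = log P(ester) - log P(methyl ester), rounded as in the table.
pi_values = {no: round(value - LOG_P[PARENT], 2) for no, value in LOG_P.items()}

for no in sorted(pi_values):
    print(no, pi_values[no])
```

Defined this way, π is zero for the parent by construction and negative for groups such as CH2CN and CH2COCH3 that lower log P relative to methyl.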
Kinetic Measurements

α-Chymotrypsin, a 3× crystallized product prepared free of autolysis products and low molecular weight contaminants by the method of Yapel et al. (19), was obtained from Worthington Biochemical Corporation. Titration of the α-chymotrypsin with trans-cinnamoylimidazole, using the procedure of Schonbaum et al. (20), showed the active site content to be 89.0%. This value was used in the calculation of kcat and kcat/Km. Stock solutions usually contained 2.0 mg enzyme/ml of 0.050 M NaCl (reagent grade) and were used over a period of at most 2 days. Substrate stock solutions were made shortly before use by dissolving weighed quantities of ester in a measured volume of 0.050 M NaCl, spectroquality CH3CN, or reagent grade DMSO, the choice depending on the solubility and uncatalyzed rate of hydrolysis of the ester. These solutions were normally of the order of 1-4 mg/ml, with care being taken to minimize the percent of organic solvent in the reaction mixture. In the case of the t-butyl derivative, it was necessary to use about 42 mg/ml of CH3CN.

The rate of enzymic hydrolysis of the ester substrates was followed by automatic titration, using a Radiometer pH-stat set at 7.90. Standardized millimolar NaOH was stored under N2 and used as a titrant in a flowing N2 ("prepurified" from Matheson Gas Co.) atmosphere. The temperature of the reaction vessel was maintained at 25.0°C with a thermostated circulating water jacket. For a given run, 0.90 ml of solution composed of a suitable combination of stock substrate solution and 0.050 M NaCl was placed in the titration cup and adjusted to the desired pH with the automatic titrator, the desired pH being sufficiently above 7.90 so that on addition of enzyme, the resulting pH would be 7.90. The concentrated enzyme solution used for the t-butyl derivative was prepared by dissolving a weighed sample of enzyme in 0.050 M NaCl, adjusting the pH to 7.90, and then diluting with 0.30 M NaCl to give a final NaCl concentration of 0.050 M. After selecting an appropriate time span for the recorder, enzymic hydrolysis was initiated by rapid addition of 0.100 ml of α-chymotrypsin stock solution from a special pipet. The active site concentration of enzyme was usually 7.1 × 10^-6 M, though for three of the esters it was more convenient to deviate from this (0.5 times for -CH2CN, 0.2 times for -CH2C6H5, and five times for the t-butyl derivative). The substrate concentrations were generally in the range of 0.01 to 0.0005 M.

In calculating the initial rates, a suitable correction involving the method of least squares was made for the reduction in rate caused by the progressive dilution resulting from addition of titrant. Except for the acetylmethyl ester, no correction was made for the uncatalyzed hydrolysis of the esters since this rate was negligible in comparison with the rate of enzymic hydrolysis. A correction was made for the rate of uncatalyzed hydrolysis of the acetylmethyl derivative. For a given ester, initial rates were obtained for at least eight different substrate concentrations, and these were employed in a Lineweaver-Burk plot using the method of least squares. In all cases the correlation coefficients of these plots were 0.990 or better. The results are summarized in Table II (1/Km) and Table IV (kcat/Km).

RESULTS
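
The kinetic constants analyzed in this section come from least-squares Lineweaver-Burk plots of initial rates at eight or more substrate concentrations. The following is a minimal sketch of such a fit, not the authors' own procedure: the function names are hypothetical, the rate data are synthetic (generated noise-free from the Michaelis-Menten equation with the Km and kcat listed for the methyl ester in Table II), and the unit of kcat is assumed to be s^-1.

```python
def linear_fit(x, y):
    """Unweighted least-squares line; returns (slope, intercept, r)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, sxy / (sxx * syy) ** 0.5


def lineweaver_burk(substrate_conc, initial_rates, enzyme_conc):
    """Fit 1/v = (Km/Vmax)(1/[S]) + 1/Vmax; returns (Km, kcat, kcat/Km, r)."""
    slope, intercept, r = linear_fit([1.0 / s for s in substrate_conc],
                                     [1.0 / v for v in initial_rates])
    v_max = 1.0 / intercept
    k_m = slope / intercept            # Km = slope / intercept
    k_cat = v_max / enzyme_conc        # per active site
    return k_m, k_cat, k_cat / k_m, r


# Synthetic example (assumed values, see the note above).
KM_TRUE, KCAT_TRUE, E0 = 0.0417, 0.646, 7.1e-6   # M, s^-1 (assumed), M
substrate = [0.0005, 0.001, 0.002, 0.003, 0.004, 0.006, 0.008, 0.01]  # M
rates = [KCAT_TRUE * E0 * s / (KM_TRUE + s) for s in substrate]

k_m, k_cat, k_cat_over_k_m, r = lineweaver_burk(substrate, rates, E0)
print(f"Km = {k_m:.4f} M, kcat = {k_cat:.3f}, kcat/Km = {k_cat_over_k_m:.1f}, r = {r:.4f}")
```

Note that kcat/Km depends only on the slope of the plot (it equals 1/(slope·[E]0)), whereas Km also requires the intercept. This is why kcat/Km is better determined than 1/Km when the line passes close to the origin.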
We have developed Eqs. [2] and [3] from the data in Table II. Equation [2] is the best single-variable equation:

$$\log 1/K_m = 1.21(\pm 0.48)\,\sigma^*_3 + 1.95(\pm 0.22), \tag{2}$$

where n = 14, r = 0.844, s = 0.350. Equation [3] is the "best" equation for the QSAR of compound II:

$$\log 1/K_m = 1.36(\pm 0.28)\,\sigma^*_3 + 0.46(\pm 0.20)\,\mathrm{MR}_3 + 1.05(\pm 0.40), \tag{3}$$

where n = 14, r = 0.957, s = 0.198. The addition of the term in MR3 to Eq. [2] is a highly significant improvement (F1,11 = 26.1; F1,11; α=0.001 = 19.7). If π3 is used in Eq. [3] in place of MR3, a much poorer equation results (r = 0.893); in fact, this equation is not a very significant improvement over Eq. [2] (F1,11 = 4.55; F1,11; α=0.05 = 4.84). Hence it is clear that MR is the parameter of choice for rationalizing the data of Table II. This provides convincing evidence that ρ3 is not typically hydrophobic.

TABLE II
PARAMETERS FOR THE DERIVATION OF EQS. [2] AND [3] FOR THE CHYMOTRYPSIN INTERACTION WITH CH3CH(NHCO-3-C5H4N)COOR

                                                            Log 1/Km
No.  O-R                   kcat     Km (M)     kcat/Km   Obsd^a  Calcd^b  Calcd^c   σ3*     MR3    π3      Es-3
1    iso-C3H7              0.479    0.0427     11.2      1.37    1.58     1.82     -0.19    1.71    0.93   -1.71
2    CH3                   0.646    0.0417     15.5      1.38    1.41     1.73      0.00    0.79    0.00   -1.24
3    C3H7                  1.45     0.0363     39.9      1.44    1.67     1.91     -0.12    1.71    0.98   -1.60
4    C4H9                  1.62     0.0214     75.7      1.67    1.87     2.07     -0.13    2.17    1.62   -1.63
5    C2H5                  0.289    0.0178     16.2      1.75    1.49     1.77     -0.10    1.25    0.46   -1.31
6    t-C4H9                0.0131   0.0162     0.809     1.79    1.64     1.84     -0.30    2.17    1.43   -2.78
7    CH2CH2CH(CH3)2        1.09     0.0102     107       1.99    2.04     2.20     -0.16    2.64    1.98   -1.59
8    CH2-tetrahydrofuryl   1.20     0.00479    250       2.32    2.46     2.61      0.16    2.60    0.32   -2.31
9    CH2CH2Cl              0.913    0.00324    282       2.49    2.27     2.61      0.39    1.75    0.60   -2.14
10   CH2CH2OCH3            0.251    0.00324    77.5      2.49    2.27     2.48      0.24    1.94   -0.16   -2.01
11   CH2COCH3              0.536    0.00245    219       2.61    2.69     2.91      0.60    1.80   -0.38   -1.99
12   CH2OCH3               0.491    0.00145    339       2.84    2.60     2.85      0.64    1.48   -0.02   -1.43
13   CH2C6H5               1.10     0.000955   1150      3.02    2.88     2.98      0.26    3.22    1.85   -1.61
14   CH2CN                 1.66     0.000589   2820      3.23    3.42     3.67      1.30    1.30   -0.42   -2.18

a This paper.
b Calculated using Eq. [3].
c Calculated using Eq. [4].

A most important aspect of Eq. [3], however, is the excellent agreement of the coefficients of MR and σ* with those of Eq. [1]. We cannot compare the intercepts of Eqs. [1] and [3] because we cannot include terms in MR1, MR2, and MR1·MR2·MR3 in Eq. [3]. The first two of these variables are constant in the data of Table II. The cross-product term is perfectly collinear with MR3. The collinearity among the variables considered in the formulation of Eq. [3] is given in Table III. To test the possibility that the largest R3 groups might be departing from linearity in the correlation of log 1/Km, we added a term in (MR3)² to Eq. [3]. This did not lower the standard deviation; in fact, it was increased.

TABLE III
SQUARED CORRELATION MATRIX FOR VARIABLES CONSIDERED IN THE DERIVATION OF EQ. [3]

         MR3     σ3*     π3      Es-3
MR3      1.00    0.05    0.45    0.08
σ3*              1.00    0.43    0.02
π3                       1.00    0.00
Es-3                             1.00

The data in Table II can be combined with those used (4) to derive Eq. [1] to give Eq. [4]. The coefficients in Eq. [4] agree closely with those of Eq. [1] and the quality of fit is essentially the same:

$$\log 1/K_m = 0.77(\pm 0.11)\,\mathrm{MR}_1 + 1.13(\pm 0.11)\,\mathrm{MR}_2 + 0.47(\pm 0.11)\,\mathrm{MR}_3 - 0.56(\pm 0.25)\,I + 1.35(\pm 0.22)\,\sigma^*_3 - 0.055(\pm 0.01)\,\mathrm{MR}_1 \cdot \mathrm{MR}_2 \cdot \mathrm{MR}_3 - 1.64(\pm 0.46), \tag{4}$$

where n = 84, r = 0.977, s = 0.333. A more stringent test is given in the calculated columns of Table II, where log 1/Km values calculated with Eqs. [3] and [4] can be compared with the observed values. The results obtained with Eq. [4] are not quite as good as those obtained with Eq. [3]. The results upon which Eq. [4] is based contain much more structural variation and are from several different laboratories, which accounts for the less good fit of Eq. [4]. The congeners which fit most poorly are those which are most weakly bound. This is probably the inevitable result of the very large confidence interval which is always associated with 1/Km (the x-intercept) obtained from a Lineweaver-Burk plot that goes very close to the origin, even though the plot itself has an excellent correlation coefficient. Of the 84 substrates used for Eq. [4], one studied by Niemann corresponds to compound No. 2 (the methyl ester) in Table II; for this substrate, Niemann (21) reported log (1/Km) as 1.43, in close agreement with our value of 1.38. We have used our value in formulating Eq. [4]; we found log (kcat/Km) to be 1.19 and Niemann (21) obtained exactly the same value.

Another way of viewing enzymic QSAR, one that reduces the uncertainty that is associated with 1/Km, is to use the parameter kcat/Km. This parameter is related to the slope of the Lineweaver-Burk plot which, if the plot has an excellent correlation coefficient, will be known with a small confidence interval even though the plot goes very close to the origin. Using this parameter (see Table IV), we have derived Eq. [5]:

$$\log k_{cat}/K_m = 0.76(\pm 0.14)\,\mathrm{MR}_1 + 3.19(\pm 0.35)\,\mathrm{MR}_2 + 0.56(\pm 0.13)\,\mathrm{MR}_3 + 1.30(\pm 0.26)\,\sigma^*_3 - 2.27(\pm 0.28)\,I - 0.32(\pm 0.08)\,(\mathrm{MR}_2)^2 - 0.067(\pm 0.02)\,\mathrm{MR}_1 \cdot \mathrm{MR}_2 \cdot \mathrm{MR}_3 - 3.21(\pm 0.61), \tag{5}$$

where n = 77, r = 0.988, s = 0.369. In our previous study (4), 57 values of kcat were found. Of those, one was so poorly fit that we did not include it in deriving Eq. [5], and one was identical to that which we have now measured; hence only 55 values have been used for Eq. [5]. Thirteen points (kcat/Km) have been added from our study (the t-butyl analog is very poorly fit and not included); in addition, we have used 8 values from the study of Béchet et al. (22) and one value from Dorovska et al. (23) (R2 = sec-butyl). Equation [5] is a rather complex expression. Its stepwise development is shown in Table V and the degree of collinearity among its variables is given in Table VI. The important difference between Eqs. [4] and [5] is that the latter contains an additional term in (MR2)². Also in Eq. [5], I is given the value of 1 for the sec-butyl group in addition to the isopropyl group. No such group was present in the data set on which Eq. [1] is based.

TABLE IV
CONSTANTS USED FOR DERIVING EQ. [5]

                                                                    log kcat/Km
No.   NHCOR1               R2               OR3                    Obsd^a   Calcd^b  |Δ|    MR1    MR2    MR3    σ3*     I   Reference
1     L-NHCOMe             i-C3H7           O-i-C3H7               -0.33    0.16     0.49   1.49   1.50   1.71   -0.19   1   (4)
2     L-NHCOMe             i-C3H7           OEt                     0.08    0.09     0.01   1.49   1.50   1.25   -0.10   1   (4)
3     L-NHCOMe             i-C3H7           OMe                     0.13    0.03     0.10   1.49   1.50   0.79    0.00   1   (4)
4     L-NHCOMe             Me               OMe                     0.32    0.00     0.32   1.49   0.56   0.79    0.00   0   (4)
5     L-NHCOMe             CH(Me)Et         OMe                     0.39    0.95     0.56   1.49   1.96   0.79    0.00   1   (23)
6     L-NHCOCH2Cl          i-C3H7           OMe                     0.47    0.36     0.10   1.98   1.50   0.79    0.00   1   (4)
7     L-NHCO-[illegible]   Me               OMe                     0.48    0.95     0.47   2.80   0.56   0.79    0.00   0   (4)
8     L-NHCO-2-pyridyl     Me               OMe                     0.59    1.23     0.64   3.18   0.56   0.79    0.00   0   (4)
9     L-NHCO-furyl         Me               OMe                     1.02    0.86     0.16   2.67   0.56   0.79    0.00   0   (4)
10    L-NHCOPh             Me               OEt                     1.06    1.50     0.44   3.46   0.56   1.25   -0.10   0   (4)
11    L-NHCOMe             i-C3H7           OCH2CH2Cl               1.08    0.93     0.15   1.49   1.50   1.75    0.39   1   (4)
12    L-NHCOPh             i-C3H7           OMe                     1.15    1.37     0.22   3.46   1.50   0.79    0.00   1   (4)
13    L-NHCO-4-pyridyl     Me               OMe                     1.16    1.23     0.07   3.18   0.56   0.79    0.00   0   (4)
14    L-NHCOMe             Et               OMe                     1.30    1.22     0.08   1.49   1.03   0.79    0.00   0   (4)
15    L-NHCOMe             COOEt            OEt                     1.36    0.59     0.77   1.49   1.75   1.25   -0.10   1   (4)
16    L-NHCOPh             Me               OMe                     1.37    1.44     0.07   3.46   0.56   0.79    0.00   0   (4)
17    L-NHCO-thienyl       Me               OMe                     1.48    1.30     0.18   3.28   0.56   0.79    0.00   0   (4)
18    L-NHCOPh-2-NH2       Me               OMe                     1.71    1.76     0.05   3.90   0.56   0.79    0.00   0   (4)
19    L-NHCOMe             C3H7             O-i-C3H7                1.97    2.43     0.46   1.49   1.50   1.71   -0.19   0   (4)
20    L-NHCOPh             Et               OMe                     2.36    2.61     0.25   3.46   1.03   0.79    0.00   0   (4)
21    L-NHCOMe             C3H7             OMe                     2.42    2.30     0.12   1.49   1.50   0.79    0.00   0   (4)
22    L-NHCOMe             Ph               OEt                     2.43    1.99     0.51   1.49   2.54   1.25   -0.10   1   (4)
23    L-NHCOMe             C4H9             OMe                     3.09    3.21     0.12   1.49   1.96   0.79    0.00   0   (4)
24    L-NHCOMe             i-C4H9           OMe                     3.12    3.21     0.09   1.49   1.96   0.79    0.00   0   (4)
25^c  L-NHCOMe             C6H13            OMe                     3.32    4.65     1.33   1.49   2.89   0.79    0.00   0   (4)
26    L-NHCOOCH2Ph         i-C3H7           OPh-4-NO2               3.36^d  3.70     0.35   4.19   1.50   3.40    1.14   1   (22)
27    L-NHCOPh             C3H7             OMe                     3.46    3.64     0.18   3.46   1.50   0.79    0.00   0   (4)
28    L-NHCOMe             C3H7             OCH2CH2Cl               3.67    3.20     0.47   1.49   1.50   1.75    0.39   0   (4)
29    L-NHCOMe             C5H11            OMe                     3.92    4.00     0.08   1.49   2.42   0.79    0.00   0   (4)
30    L-NHCOMe             CH2Ph            O-S-sec-C4H9            4.59    4.87     0.28   1.49   3.00   2.17   -0.21   0   (4)
31    L-NHCOMe             CH2Ph            O-R-CH(Me)-c-C6H11      4.60    5.24     0.64   1.49   3.00   3.60   -0.21   0   (4)
32    L-NHCOMe             CH2Ph            OMe                     4.62    4.79     0.17   1.49   3.00   0.79    0.00   0   (4)
33    L-NHCOMe             CH2Ph            O-S-CH(Me)-c-C6H11      4.65    5.24     0.59   1.49   3.00   3.60   -0.21   0   (4)
34    L-NHCOOCH2Ph         Me               OPh-4-NO2               4.67^d  4.49     0.18   4.19   0.56   3.40    1.14   0   (22)
35    L-NHCOMe             CH2Ph            O-R-sec-C4H9            4.75    4.87     0.12   1.49   3.00   2.17   -0.21   0   (4)
36    L-NHCOMe             CH2Ph            OEt                     4.80    4.77     0.03   1.49   3.00   1.25   -0.10   0   (4)
37    L-NHCOMe             CH2-cyclohexyl   OMe                     4.90    4.93     0.03   1.49   3.13   0.79    0.00   0   (4)
38    L-NHCO-furyl         CH2Ph-4-OH       OMe                     5.08    5.68     0.60   2.67   3.18   0.79    0.00   0   (4)
39    L-NHCOMe             CH2Ph-4-OH       OEt                     5.43    4.97     0.46   1.49   3.18   1.25   -0.10   0   (4)
40    L-NHCOMe             CH2-indolyl      OMe                     5.46    5.75     0.29   1.49   4.23   0.79    0.00   0   (4)
41    L-NHCOMe             CH2Ph-4-OH       OMe                     5.56    5.00     0.56   1.49   3.18   0.79    0.00   0   (4)
42    L-NHCOMe             CH2-indolyl      OEt                     5.75    5.68     0.07   1.49   4.23   1.25   -0.10   0   (4)
43    L-NHCOOCH2Ph         Et               OPh-4-NO2               5.83^d  5.30     0.53   4.19   1.03   3.40    1.14   0   (22)
44    L-NHCOOCH2Ph         CH2CONH2         OPh-4-NO2               5.91^d  5.95     0.04   4.19   1.49   3.40    1.14   0   (22)
45    L-NHCOOCH2Ph         i-C4H9           OPh-4-NO2               5.95^d  6.48     0.53   4.19   1.96   3.40    1.14   0   (22)
46    L-NHCOPh             CH2Ph            OMe                     5.95    5.97     0.02   3.46   3.00   0.79    0.00   0   (4)
47    L-NHCOOCH2Ph         C4H9             OPh-4-NO2               5.97^d  6.48     0.51   4.19   1.96   3.40    1.14   0   (22)
48    L-NHSO2Me            CH2Ph            OPh                     6.08    6.14     0.06   1.82   3.00   2.77    0.60   0   (4)
49    L-NHCOOCH2Ph         C3H7             OPh-4-NO2               6.18^d  5.96     0.23   4.19   1.50   3.40    1.14   0   (22)
50    L-NHCOMe             CH2Ph            O-S-CH(Me)Ph            6.26    5.59     0.67   1.49   3.00   3.36    0.11   0   (4)
51    L-NHSO2Me            CH2Ph            OPh-4-Me                6.26    6.07     0.19   1.82   3.00   3.33    0.46   0   (4)
52    L-NHCOMe             CH2Ph            O-R-CH(Me)Ph            6.32    5.59     0.73   1.49   3.00   3.36    0.11   0   (4)
53    L-NHSO2Me            CH2Ph            OPh-4-OMe               6.36    5.96     0.40   1.82   3.00   3.46    0.36   0   (4)
54    L-NHSO2Me            CH2Ph            OPh-4-COMe              6.43    6.72     0.29   1.82   3.00   3.78    0.90   0   (4)
55    L-NHSO2Me            CH2Ph            OPh-4-Cl                6.44    6.43     0.01   1.82   3.00   3.27    0.75   0   (4)
56    L-NHCOOCH2Ph         CH2-indolyl      OPh-4-Cl                6.46    6.58     0.12   4.19   4.23   3.27    0.75   0   (4)
57    L-NHCOMe             CH2Ph            OPh-4-NO2               6.51    6.94     0.43   1.49   3.00   3.40    1.14   0   (4)
58    L-NHSO2Me            CH2Ph            OPh-3-NO2               6.55    6.90     0.36   1.82   3.00   3.40    1.09   0   (4)
59    L-NHCOPh             CH2Ph-4-OH       OEt                     6.59    5.94     0.65   3.46   3.18   1.25   -0.10   0   (4)
60    L-NHCOPh             CH2Ph-4-OH       OMe                     6.70    6.15     0.55   3.46   3.18   0.79    0.00   0   (4)
61    L-NHCOOCH2Ph         CH2-indolyl      OPh-4-COMe              6.73    6.45     0.28   4.19   4.23   3.78    0.90   0   (4)
62    L-NHSO2Me            CH2Ph            OPh-4-NO2               6.93    6.96     0.03   1.82   3.00   3.40    1.14   0   (4)
63    L-NHCOOCH2Ph         CH2-indolyl      OPh-4-NO2               6.95    7.01     0.06   4.19   4.23   3.40    1.14   0   (4)
64    L-NHCOMe             CH2-indolyl      OPh-4-NO2               7.18    7.58     0.40   1.49   4.23   3.40    1.14   0   (4)
65    L-NHCOOCH2Ph         CH2Ph            OPh-4-NO2               7.59^d  7.13     0.46   4.19   3.00   3.40    1.14   0   (22)
66^c  L-NHCO-3-pyridyl     Me               O-t-butyl              -0.09    1.45     1.54   3.18   0.56   2.17   -0.30   0   e
67    L-NHCO-3-pyridyl     Me               OCH(Me)2                1.05    1.39     0.34   3.18   0.56   1.71   -0.19   0   e
68    L-NHCO-3-pyridyl     Me               OMe                     1.19    1.23     0.04   3.18   0.56   0.79    0.00   0   e
69    L-NHCO-3-pyridyl     Me               OEt                     1.21    1.30     0.09   3.18   0.56   1.25   -0.10   0   e
70    L-NHCO-3-pyridyl     Me               OPr                     1.60    1.48     0.12   3.18   0.56   1.71   -0.12   0   e
71    L-NHCO-3-pyridyl     Me               OBu                     1.88    1.67     0.21   3.18   0.56   2.17   -0.13   0   e
72    L-NHCO-3-pyridyl     Me               OCH2CH2OMe              1.89    2.05     0.16   3.18   0.56   1.94    0.24   0   e
73    L-NHCO-3-pyridyl     Me               OCH2CH2CH(Me)2          2.03    1.83     0.20   3.18   0.56   2.64   -0.16   0   e
74    L-NHCO-3-pyridyl     Me               OCH2COMe                2.34    2.45     0.11   3.18   0.56   1.80    0.60   0   e
75    L-NHCO-3-pyridyl     Me               OCH2-Fur-H4             2.40    2.23     0.17   3.18   0.56   2.60    0.16   0   e
76    L-NHCO-3-pyridyl     Me               OCH2CH2Cl               2.45    2.16     0.29   3.18   0.56   1.75    0.39   0   e
77    L-NHCO-3-pyridyl     Me               OCH2OMe                 2.53    2.37     0.16   3.18   0.56   1.48    0.64   0   e
78    L-NHCO-3-pyridyl     Me               OCH2C6H5                3.06    2.64     0.42   3.18   0.56   3.22    0.26   0   e
79    L-NHCO-3-pyridyl     Me               OCH2CN                  3.45    3.14     0.31   3.18   0.56   1.30    1.30   0   e

a Calculated from results of Refs. (4, 22, 23) and the present authors.
b Calculated using Eq. [5].
c This molecule not used in deriving Eq. [5].
d Compounds for Ref. (22) were corrected according to Dupaix et al. (24) using the expression kcat = k/[1 + [H+]/K'] where [H+] refers to the pH of the solution and K' is the equilibrium constant for the protonation of the acyl-enzyme intermediate.
e Compounds made and tested in this work.

TABLE V
DEVELOPMENT OF EQ. [5]

Intercept   MR2    σ3*    I       (MR2)²   MR1    MR3    MR1·MR2·MR3   r       s       F^a     F^b
 0.79       1.49                                                       0.777   1.461   114.4   114.4
 0.49       1.35   2.08                                                0.888   1.074    64.9   138.3
 0.93       1.30   1.87   -2.30                                        0.949   0.745    80.7   218.5
-0.01       2.70   1.90   -2.74   -0.33                                0.963   0.640    27.1   229.2
-1.45       3.23   1.52   -2.65   -0.42    0.39                        0.972   0.563    21.9   240.9
-2.34       1.88   1.39   -1.86            0.71   0.66   -0.083        0.977   0.508           249.8
-3.21       3.19   1.30   -2.27   -0.32    0.76   0.56   -0.067        0.988   0.369    63.5   414.4

a F1,60; α=0.001 = 11.1.
b F7,60; α=0.001 = 4.09.

TABLE VI
SQUARED CORRELATION MATRIX FOR VARIABLES PERTAINING TO EQ. [5]

        MR1     MR2     MR3     σ3*     I
MR1     1.00    0.11    0.04    0.17    0.04
MR2             1.00    0.11    0.03    0.01
MR3                     1.00    0.46    0.04
σ3*                             1.00    0.02
I                                       1.00

DISCUSSION

Equation [3], correlating binding of the esters in the formation of the enzyme-substrate complex, is as interesting for the parameters it does not contain as it is for those it does. The addition of a term in Es to Eq. [3] does not give an improved correlation. Even the t-butyl group, with its great steric demands, is reasonably well fit without the use of Es, which suggests that the formation of the enzyme-substrate intermediate is not far advanced at the rate-limiting point in this process. The t-butyl analog is badly fit in the correlation of kcat/Km, showing that steric effects intervene in the acylation-deacylation process. In earlier correlation work (10) we found Es to have at best a marginal role in correlating log 1/Km for a small set of hippurate esters. By contrast, in correlating (10) the work of Fife, Es was found to have a major role in the QSAR for log k2/Km of compounds of the type RCOOC6H4NO2; however, our attempts to find a role for Es in the large data set (Eq. [5]) were unsuccessful. The addition of a term in Es was significant with our own data set (Eq. [3]) only if the t-butyl congener was included. Further testing of more sterically hindering R3 groups is needed to more clearly define the role of Es. It is of course satisfying that the coefficients with σ3* and MR3 are close to those of Eq. [1]. The R3 groups upon which Eq. [1] is based constitute a rather poor selection. Our new results support our earlier tentative view that ρ3 space is not typically hydrophobic.

While the results of Eqs. [3] and [4] show that MR is a parameter of overwhelming importance in correlating ligand-enzyme interaction, the exact meaning of this correlation is not clear. In introducing the use of MR for correlating the binding of organic compounds with biological material, Pauling and Pressman assumed that MR modeled dispersion forces (25). In further developing this idea, Agin et al. (26) followed a similar line of thinking. There is another aspect of MR which must be considered, especially in enzyme interactions. As we have pointed out (4), MR is essentially a measure of molar volume with a correction related to the index of refraction. Since the index of refraction has a relatively small range for organic compounds, this factor does not greatly alter the molar volume. Conformational changes are highly important in enzymic processes where the geometry of the parts of the enzyme acting catalytically upon the substrate is so crucial; hence, MR is most probably associated with binding via dispersion forces and the induction of conformational changes in the enzyme. It is not yet possible to say exactly how important MR is in producing an induced fit of substrate and enzyme.

Another way of viewing QSAR in enzymic processes which has often been used is to relate the parameter kcat/Km to structural variation (27). Table IV includes all of the published data we could find for kcat/Km, as well as values for the congeners of this study. Equation [5] has been derived from these data. Although Eq. [5] has a high correlation in terms of r, the standard deviation is somewhat high; this is because there is a range of almost 10^8 in kcat/Km. Thus, although Eq. [5] accounts for 97.6% of the variance in log (kcat/Km), the remaining 2.4% is large enough to produce the high standard deviation. In Eqs. [1], [4], and [5] the coefficient of σ* has the expected positive sign. One expects and finds that the formation of the tetrahedral intermediate is promoted by electron withdrawal by substituents (28). The role of σ* in Eq. [5] is the same as in Eqs. [3] and [4]. It is clear from Eq. [5] that, starting with H, initial increases in MR of R1, R2, and R3 favor hydrolysis; however, as R2 becomes larger, the (MR2)² term begins to take over and activity falls off. The optimal value of MR for R2 is about 5 (other factors being constant). It must be noted that, as usual, we have scaled all MR values by 0.1 to make them more nearly equiscalar with π. It is well known from many studies that ρ2 has limited bulk tolerance. The terms in MR2 in Eq. [5] describe this mathematically.
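
To make the bulk-tolerance argument concrete, the sketch below evaluates Eq. [5] and locates the value of MR2 at which the parabolic term stops further gains. It uses the rounded coefficients of Eq. [5] as printed, so calculated values can differ in the second decimal from those of Table IV; the function names are illustrative and not part of the original work.

```python
def log_kcat_over_km(mr1, mr2, mr3, sigma3, indicator=0):
    """Eq. [5] with its rounded coefficients (MR values scaled by 0.1)."""
    return (0.76 * mr1 + 3.19 * mr2 + 0.56 * mr3 + 1.30 * sigma3
            - 2.27 * indicator - 0.32 * mr2 ** 2
            - 0.067 * mr1 * mr2 * mr3 - 3.21)


def optimal_mr2(mr1=0.0, mr3=0.0):
    """MR2 at which d(log kcat/Km)/d(MR2) = 0 for fixed MR1 and MR3.

    Derivative: 3.19 - 2(0.32)MR2 - 0.067 MR1 MR3 = 0.
    """
    return (3.19 - 0.067 * mr1 * mr3) / (2 * 0.32)


# Compound 68 of Table IV: R1 = 3-pyridyl, R2 = Me, OR3 = OMe.
calc_68 = log_kcat_over_km(mr1=3.18, mr2=0.56, mr3=0.79, sigma3=0.00)

# Optimum from the MR2 terms alone, and with the cross term for an
# N-acetyl methyl ester series (MR1 = 1.49, MR3 = 0.79).
mr2_opt_alone = optimal_mr2()
mr2_opt_acetyl = optimal_mr2(mr1=1.49, mr3=0.79)

# Fraction of variance explained, from r = 0.988.
variance_explained = 0.988 ** 2

print(round(calc_68, 2), round(mr2_opt_alone, 2),
      round(mr2_opt_acetyl, 2), round(variance_explained, 3))
```

The cross-product term lowers the optimum slightly as R1 and R3 grow, which is one way of reading the statement that large groups in the three regions interact cooperatively.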
Equation [4] does not require an (MR2)² term and, in addition, the R3 = t-butyl congener is well fit in the correlation of log 1/Km [but not in the correlation of log (kcat/Km)]. As we noted before (4), bulky groups in ρ2 space promote the activity of chymotrypsin at three important points: ES complex formation, acylation, and deacylation. In the catalytic step involving acylation and deacylation, R2 would appear to favor reaction by producing an induced fit; too large an R2 appears to cause an overinduced fit, resulting in lower activity. Since the R3 = t-butyl congener fits well in the step governed by Km, and since it is not involved in the deacylation step (k3), the fact that it is not fit by Eq. [5] means that the trouble lies in acylation (k2). We noted (4) earlier that isopropyl is much less active than one would expect in the acylation step. This process appears to be very sensitive to the steric character of R3 and needs further study.

As we have pointed out before (5), the triple cross-product term (MR1·MR2·MR3) implies that the three regions ρ1, ρ2, and ρ3 are linked together so that a bulky group interacting in one region can, if large enough, inhibit interactions in the other regions. This phenomenon needs further study with a better selection of substituents to more firmly establish this term. In the development of Eq. [5], (MR2)² and MR1·MR2·MR3 are the least significant terms (see Table V) in that they are the last to enter the equation in its stepwise development.

There are numerous discussions in the literature about the hydrophobic interactions of various ligands with chymotrypsin. Our present study, as well as our earlier ones, does not support the view that chymotrypsin has a classical-type hydrophobic pocket. This is especially true of ρ1 and ρ3 space; both of these regions have been studied with sets of congeners for which π and MR are reasonably orthogonal and in every instance MR is far superior to π in the correlation equations. The types of ligands binding in ρ2 space have not been as extensively varied as those binding in ρ1 and ρ3 space; even so, MR2 consistently gives better results than π2.
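
The comparison of MR with π for ρ3 can be retraced from the 14 data points of Table II. The following is a minimal sketch, not the authors' program: it fits log 1/Km to σ* plus either MR3 or π3 by ordinary least squares (a small pure-Python normal-equations solver, whose names are hypothetical) and forms the partial F statistic for adding the second term to the σ*-only equation.

```python
# Table II: observed log 1/Km, sigma*, MR and pi of R3 for esters 1-14.
LOG_INV_KM = [1.37, 1.38, 1.44, 1.67, 1.75, 1.79, 1.99,
              2.32, 2.49, 2.49, 2.61, 2.84, 3.02, 3.23]
SIGMA = [-0.19, 0.00, -0.12, -0.13, -0.10, -0.30, -0.16,
         0.16, 0.39, 0.24, 0.60, 0.64, 0.26, 1.30]
MR = [1.71, 0.79, 1.71, 2.17, 1.25, 2.17, 2.64,
      2.60, 1.75, 1.94, 1.80, 1.48, 3.22, 1.30]
PI = [0.93, 0.00, 0.98, 1.62, 0.46, 1.43, 1.98,
      0.32, 0.60, -0.16, -0.38, -0.02, 1.85, -0.42]


def ols(columns, y):
    """Least-squares fit with intercept; returns ([b0, b1, ...], r)."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    # Augmented normal equations: (X^T X | X^T y).
    a = [[sum(row[i] * row[j] for row in rows) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda idx: abs(a[idx][c]))
        a[c], a[pivot] = a[pivot], a[c]
        for idx in range(k):
            if idx != c:
                factor = a[idx][c] / a[c][c]
                a[idx] = [u - factor * v for u, v in zip(a[idx], a[c])]
    coef = [a[i][k] / a[i][i] for i in range(k)]
    y_mean = sum(y) / len(y)
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    ss_res = sum((yi - sum(b * x for b, x in zip(coef, row))) ** 2
                 for row, yi in zip(rows, y))
    return coef, (1.0 - ss_res / ss_tot) ** 0.5


def partial_f(r_reduced, r_full, n_points, n_coef_full):
    """F statistic for adding one term to a nested regression."""
    return (r_full ** 2 - r_reduced ** 2) / (
        (1.0 - r_full ** 2) / (n_points - n_coef_full))


_, r_sigma = ols([SIGMA], LOG_INV_KM)
coef_mr, r_mr = ols([SIGMA, MR], LOG_INV_KM)   # [intercept, sigma*, MR3]
coef_pi, r_pi = ols([SIGMA, PI], LOG_INV_KM)   # [intercept, sigma*, pi3]
f_mr = partial_f(r_sigma, r_mr, 14, 3)
f_pi = partial_f(r_sigma, r_pi, 14, 3)

print("sigma* + MR:", [round(b, 2) for b in coef_mr], round(r_mr, 3), round(f_mr, 1))
print("sigma* + pi:", [round(b, 2) for b in coef_pi], round(r_pi, 3), round(f_pi, 1))
```

With these data the σ* + MR3 fit should come out close to Eq. [3], and the π3 variant should give a lower r and an F below the 0.05 critical value quoted in the Results. The partial F for MR3 obtained this way can differ slightly from the published 26.1 because the tabulated values are rounded.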
As we have pointed out before (2), the so-called hydrophobic pocket in chymotrypsin is not really hydrophobic. The analysis of Dickerson and Geis (29) shows this pocket to be circumscribed by two peptide sequences:
Gly184-Ala185-Ser186-Gly187-Val188-Ser189-Ser190-Cys191-Met192
Ilu212-Val213-Ser214-Trp215-Gly216-Ser217-Ser218-Thr219-Cys220-Ser221-Thr222-Ser223-Thr224-Pro225-Gly226-Val227
These two sequences are largely composed of hydrophilic residues. One would not expect binding into such an environment to be correlated with parameters based on octanol/water partition coefficients.
The good correlations of Eqs. [4] and [5] support our contention that substrates bind and interact in one mode with chymotrypsin. This is in contrast to the view of Hamilton et al. (6), who have suggested that many modes of binding occur for each substrate; this view means that binding constants for all possible interactions of R1, R2, and R3 in p1, p2, and p3 space need to be considered. Fortunately, it is not necessary to take such a complex view of the ligand interaction with chymotrypsin.
It is gratifying that our earlier surmise about the nature of p3 is correct. The close correspondence between the comparable parameters of Eqs. [1] and [3] adds to the evidence already in hand (30) that correlation equations have valuable predictive ability.
ACKNOWLEDGMENT
This work was supported by Grant CA-11110 from the National Cancer Institute.
REFERENCES
1. SMITH, R. N., POINDEXTER, T. P., AND HANSCH, C. (1975) Physiol. Chem. Phys. 7, 423-436.
2. YOSHIMOTO, M., AND HANSCH, C. (1976) J. Org. Chem. 41, 2269-2273.
3. GRIECO, C., SILIPO, C., VITTORIA, A., AND HANSCH, C. (1977) J. Med. Chem. 20, 586-588.
4. HANSCH, C., GRIECO, C., SILIPO, C., AND VITTORIA, A. (1977) J. Med. Chem. 20, 1420-1435.
5. HEIN, G. E., AND NIEMANN, C. (1962) J. Amer. Chem. Soc. 84, 4487-4494.
6. HAMILTON, C. L., NIEMANN, C., AND HAMMOND, G. (1966) Proc. Nat. Acad. Sci. USA 55, 664-670.
7. SILIPO, C., AND HANSCH, C. (1975) J. Amer. Chem. Soc. 97, 6849-6861.
8. LEO, A., HANSCH, C., AND ELKINS, D. (1971) Chem. Rev. 71, 525-616.
9. KUNTZ, I. D. (1972) J. Amer. Chem. Soc. 94, 8568-8572.
10. HANSCH, C., AND COATS, E. (1970) J. Pharm. Sci. 59, 731-743.
11. HANSCH, C., LEO, A., UNGER, S. H., KIM, K. H., NIKAITANI, D., AND LIEN, E. J. (1973) J. Med. Chem. 16, 1207-1216.
12. UNGER, S. H., AND HANSCH, C. (1976) Progr. Phys. Org. Chem. 12, 91-118.
13. TAFT, R. W. (1956) in Steric Effects in Organic Chemistry (Newman, M. S., ed.), pp. 556-675, Wiley, New York.
14. RHODES, Y. E., AND VARGAS, L. (1973) J. Org. Chem. 38, 4077-4078.
15. KRUGLIKOVA, R. I., AND KALININA, G. R. (1971) Zh. Org. Khim. 7, 857-860.
16. PERRAULT, G. (1967) Canad. J. Chem. 45, 1063-1067.
17. BODANSZKY, M., AND DU VIGNEAUD, V. (1959) J. Amer. Chem. Soc. 81, 5688-5691.
18. KRUG, R., AND NOWAK, K. (1965) Rocz. Chem. 39, 1343-1345.
19. YAPEL, A., HAN, M., LUMRY, R., ROSENBERG, A., AND SHIAO, D. F. (1966) J. Amer. Chem. Soc. 88, 2573-2584.
20. SCHONBAUM, G. R., ZERNER, B., AND BENDER, M. L. (1961) J. Biol. Chem. 236, 2930-2935.
21. RAPP, J. P., NIEMANN, C., AND HEIN, G. E. (1966) Biochemistry 5, 4100-4104.
22. BÉCHET, J. J., DUPAIX, A., AND ROUCOUS, C. (1973) Biochemistry 12, 2566-2572.
23. DOROVSKA, V. N., VARFOLOMEYEV, S. D., KAZANSKAYA, N. F., KLYOSOV, A. A., AND MARTINEK, K. (1972) FEBS Lett. 23, 122-124.
24. DUPAIX, A., BÉCHET, J. J., AND ROUCOUS, C. (1973) Biochemistry 12, 2559-2566.
25. PAULING, L., AND PRESSMAN, D. (1945) J. Amer. Chem. Soc. 67, 1003-1012.
26. AGIN, D., HERSH, D., AND HOLTZMAN, L. (1965) Proc. Nat. Acad. Sci. USA 53, 952-958.
27. BENDER, M. L., AND KÉZDY, F. J. (1965) Annu. Rev. Biochem. 34, 49-76.
28. WILLIAMS, R. E., AND BENDER, M. L. (1971) Canad. J. Biochem. 49, 210-217.
29. DICKERSON, R. E., AND GEIS, I. (1969) The Structure and Action of Proteins, p. 85, Harper & Row, New York.
30. HANSCH, C. (1977) in Biological Activity and Chemical Structure (Buisman, J. A. K., ed.), p. 47, Elsevier, Amsterdam.