EXPONENTIAL MODEL FOR A TWO-LIGAND, REGULATORY ENZYME. PART 1: COMPUTER PROGRAMS FOR THE DETERMINATION OF THE MODEL CONSTANTS FROM INITIAL VELOCITY DATA
J. KINDERLERER, Department
S. AINSWORTH and R.B. GREGORY
of Biochemistry.
(Received 22nd January,
The University, Sheffield SI 0 2TN f United Kingdom)
1981)
The esponential model for a regulatory enzyme with two ligands (either two substrates or one substrate and an effector) provides an expression for the initial velocity of the catalysed reaction in terms of the fractional saturation of the enzyme by each ligand. This paper describes a program which determines the constants of the model from velocity data. measured with respect to the concentrations of the two ligands. when the corresponding fractional saturations are unknown. Listings of the essential routines, written in BASIC, are provided.
Introduction The exponential model for a regulatory enzyme (Sturgill and Biltonen, 1976: Ainsworth, 1977) assumes that the most relevant measure of protein configuration (itself determining the kinetic behaviour of the enzyme) is the apparent association constant, cxp, measured for the given fractional saturation of the enzyme by the substrate, A. It further assumes that the original state of the protein in solution, (Ye, is destabilized by an increment of energy, AC;, that is proportional to the fractional saturation, p, of the enzyme by substrate so that formation of the configuraI tional state, ap, can be represented by -AGi/R
T = kp = ln(&*/a,)
where 0~~is the binding constant
P’
(1) at zero saturation.
Aa, exp kp
Equation
1 then gives
(2)
1 + Aa0 exp kp
which describes both positive and negative co-operativity with respect to the binding of A. Equation 2 can also be applied to rate data if p is equated with v/V, the ratio of the measured initial velocity of the enzyme catalysed reaction to the maximum velocity established at a saturating concentration of A. The form of the relationship 291 Int. J. Bio-Medical Gwnputing(12) (1981) 291-313 0 Elsevier/North-Holland Scientific Publishers Ltd.
J. Kinderlerer, S. Aimworth, R.B. Gregory
292
between p and A, defined by given constants,
can be calculated
directly as A = f(p) by
1
_,+P 1 -p
o0 expkp
or asp = f(A) by iteration
P n+1 =P
-
n
IX(P, -I)+ X2--kX+
according to the equation
11x
1979)
(4)
k
where X = 1 -I- Acre exp kp,. The constants o_e and k can be determined transformation
ln(&-*-!)=
(Ainsworth,
lno,,=lncu,+kp
from @,A) data either by the linear
(5)
or by a computer program ‘JK2’ (Kinderlerer and Ainsworth, 1978) which refines initial estimates obtained from the log plot by fitting equation 2 directly. (The program also establishes a precise value of V, or its equivalent, complete saturation). The exponential model was subsequently extended to describe the binding of two ligands, A and B, by a regulatory enzyme .(Ainsworth and Gregory, 1978). The binding equation now appears as:
=Ina,, *naa,
+ A,, pE + (kAo
In this equation, tional saturations pB = b: that is,
I-P,
4~ PE) PA
(6)
CITYrepresents the binding constant for A measured when the fracof the enzyme by A and B take the values, respectively,~~ = u and
- . PA
&ab =
+
1
-
(7)
A
The corresponding binding constant measured when pA = pe = zero is Q, while kAo is the difference (Incure - InoW), determined by the binding constants for A measured at PA = 1 and PA = 0, respectively, in the absence of B. The effect of bound B on the binding of A is described by the constants 4, and f,, which are defined by the equations
Model of regulatory enzyme
293
(9) By analogy, the binding equation
ln&b=
lnpoo + IAOPA + (km
for B is given as
+ !‘lxP*)Pi3
(10)
The application of equations 6 and ld to describe the behaviour of a regulatory enzyme depends on the operational relationship of the two ligands to one another: if A and B are both substrates then: -- Affab
V
vc
=
PAPB
=
1 +Aol,,
.
Bflab
(11)
1 + B&b
an equation which embodies the assumptions that the basic enzyme mechanism is equilibrium random order in type (Cleland, 1963) and that the global maximum velocity VG is independent of saturation (Ainsworth, 1977). Equations 6 and 10 can be employed directly to evaluate their constants if PA and pB are known but, if the experimental data are limited to (v,A,B), then constants can only be ascertained for those ligand-protein systems in which pB (say) is independent of pA . For, when this condition applies, all the constants in equation 6 can be obtained by re-plotting the primary slopes and intercepts as functions of pB - which can itself be determined from data collected at saturating concentrations of A where PA = 1. The second alternative to be considered is where A is a substrate and B is an allosteric effector. Here, we can write (Ainsworth and Gregory, 1978).
+=
PA
f1 +&b-l))
(12)
G
In this equation, V, represents the rate of product formation when the enzyme is completely saturated by A in the absence of B. A different rate is measured when the enzyme is also saturated by B and the change is defined by the constant y: inhibition is associated with values 0 < y < 1 and activation with values 7 > 1. Note that the effector B also influences the rate of reaction through its effect on crab: indeed, this is its only influence when y = 1. Equation 12 can be used to analyse rate data when pB is independent of pa . A selection of rate data for which it appeared reasonable to assume that pB was independent of pA was examined by Ainsworth and Gregory (1978) but we now wish to revoke the restriction. The purpose of this paper, therefore, is to describe procedures which permit the analysis of rate data provided by enzymes whose saturations by A and B depend on one another: these procedures are applicable to systems in which A and B are both substrates and to those in which B is an effector with
294
J. Kinderlerer.
S. A insworth,
R. B. Gregory
y f 1. In both cases, the analysis is based on the family of curves provided when the velocity is plotted against the concentration of one of the ligands, with the other ligand concentration kept constant at a number of fixed values. Principles of analytical
procedures
Examination of equations 6 and 10 shows that eight constants are required by the exponential model for a two-substrate regulatory enzyme. In addition, analysis of rate data requires the specification of V, and, when B is an effector, the value of y. The determination of these constants by non-linear regression methods has proved unsuccessful. The Gauss-Newton algorithm leads to severe ill-conditioning even when initial estimates of the constants are provided which are very close to their correct values. In response to these problems, a new algorithm has been developed which employs multiple linear regressions. This method will be outlined as it applies to the twosubstrate problem: the modifications that are required when B is an effector will then be introduced in the detailed description which follows. The algorithm starts by assigning values to pA and pB for each A,B. Several alternative options exist, but typically both saturations are set equal to 0.5 throughout and are then multiplied by Vc to obtain a velocity (equal to 0.25 I$). This velocity is then compared with the observed value of v for eachA,B to obtain correction factors by which the values of pA and pB are adjusted in equal proportion (0
295
Model of regulatory enzyme
The description is not intended to be exhaustive but only full enough to reveal the feature concerned. A detailed specification of the computer programs involved is given in the appendix together with listings of the ‘MAIN’, ‘EFF’ and ‘PAPB’ routines. Detailed description of analytical procedures A. Constraints
The procedure described in the last section can be employed in a form in which linear regression on equations 6 and 10 in turn determines, respectively, all the constants that describe the binding of A and B. We have found, unfortunately, that such an unconstrained analysis of (VP&) data yields solutions which vary widely as the initial estimates of PA and pe are changed. A much improved stability can be achieved, however, if VG (and, where applicable, y) is determined independently and if the sums of given interaction and binding constants are constrained to pre-determined values. The determination of these constraints is, therefore,the first step in the overall analysis of (v,,4,B) data. We shall deal with the second substrate and effector treatments separately. A and Bare substrates: First, consider the principle of the determination. By Setting pB = 1, equation 11 becomes Val _
-=
v,
P
”-
___.Aol,l 1 +Aol,,
where Val is the observed velocity and VG is the global maximum can also be simplified to lnffal
=
l%o + I”, + (&lo+ I&PA1
(13)
velocity. Equation
6
(14)
Equations 13 and 15 are fitted by the program ‘JK2’ so as to evaluate lnar,, and kA, and thereby constrain the constants that enter into their definitions. ‘JK2’ also provides an estimate of VG. In the same way, application of ‘JK2’ to (VIb,B) data provides the constraints In&, and kIs and gives a second estimate of V,. In practice it is difficult and even hazardous (because of the possibility of substrate inhibition) to employ concentrations of B that are sufficiently large to make pB approach 1. However, when equation 15 is fitted with artificial data for which pr, is approximately 0.9, apparent values of lmol and kA1 can be obtained which are close, and sometimes very close, to their correct values. The estimated value of V, is, or course, depressed
296
J. Kinderlerer,
S. Ainsworth,
R.B. Gregory
in proportion to pB but the estimate can be improved by multiplication with (1 -!- B&r)/&3,,. In this factor B is the concentration of B employed in determining V,, and PII is given by the estimates of I@,,, and klB obtained by applying ‘JK2’ to data obtained in the ‘opposite’ experiment where A is saturating and B is varied. These investigations are reported more fully in an accompanying paper. B is an effector: When B is an effector, velocity measurements can be obtained in the absence of B. Thus, three constants Vc, lnoe, and kAOcan be evaluated by the ‘JK2’ analysis of (v,A)~= e data. The remaining constants, 7, It@,, and kls can be determined from data measured when the concentration of A is approximately saturating. For these conditions, equation 12 gives
P1B
&b =
‘GPAb
(16)
vmw)
where V,(,,,, .= VGpAb (7 - 1). The program ‘JK2’ can then be applied to analyse [vrb-&pAb] = f(B): the analysis gives lnfl,e and km directly, while y can be calculated from the value of VGC,ppjwhich is also provided. These estimates can only be approxrmate when pAb is not equal to 1, but a first correction can be introduced by giving pAb a COnStaUt Vahe ecpd to Aa,, exp k,,/(l + A&,, exp kA,,): this correction term can then be improved, if desired, when provisional estimates of all the constants are to hand. Tests with artificial data have shown, however, that the above approach gives satisfactory estimates of the constraints and that refinement of the constants is better left to a subsequent stage. These tests are described in an accompanying paper. B. Initialisation The ‘INDEXP’ and ‘INPUT’ sub-routines require the nature of A and B to be defined and then request values of the constraints appropriate to the system. The program then accepts (v,A,B) data together with initial estimates of pA and pB that are either given directly or based upon given values of the constants appearing in equations 6 and 10. In the absence of estimates of this kind, pA and pB are set to 0.5 throughout. When B is an effector, data at B = 0 is excluded. C. Adjustment of pA and pn The values of pA and pB given initially or provided by the preceding stage of the iterative cycle are adjusted to bring the calculated and observed values of the velocity into agreement. From equations 11 and 12 we can write VCalC
=
v,P,z
(17)
Model of regulatory
297
enzyme
where 2 equals either pe or {I + pe (y - 1)) when B is either a second substrate or an effector, respectively. A factor, F A,B is obtained by dividing the observed by the calculated velocity for each (A,B) datum point: thus for the ith iteration F A,B
of3
%bs IkpA,i’i
=
Adjusted values of pA and Z are then given by pA,i+
(19)
I = PA,i (&,B)*‘*
and
zi+,= .Zi (FA,B)“2
(20)
When B is a substrate Z = pB and pB,i
+I =
PB,i
(21)
(FA,B)~‘*
but when B is an effector, Z = {I + pB (y - 1)) and
11+PB,j(Y-l)) pB,i+l
(F,,B)“*-1
(22)
= Y---l
The adjusted values pA,i + , and pB,i + 1are employed
in the next stage of the calculation.
D. Determination of the constants When A and B are both substrates, tion 15 into equation 6 gives
of the constraints
lncYol+ ‘!A#A
-
ln%(,
When B is an effector, to give lncrOO +
-
kAOPA
lnaa,
= 1,~
(1 -pB)
the constraints
= dBPB
substitution
+ IxBPA(l
-
PB)
lncu,, and kA,, are introduced
+ z&?APB
defined by equa-
(23) into equation
6
(24)
Equation 23, employed in ‘MAIN’, and equation 24, employed in ‘EFF’ can be solved for /es and Ix. by the standard least squares method. The determination of 4, and IAX employs the same constraints when B is either a second substrate or an effector and proceeds by performing a linear regression on an analogous form of equation 23, given by ltilo
+
kBpB
-
1n&,
=l,,(l-PA)+I,,PB(l--A)
(25)
J. Kinderlerer, S. Ainsworth, R.B. Gregory
298
Both ‘MAIN’ and ‘EFF’ are able to perform weighted linear regressions. The programs set the weighting applied to each (v,A,B) point either to VW or to (v/A)w where w is chosen by the operator. Alternatively, the operator can introduce any other weighting function, With Zen, IXBy IAo and r,x to hand, the remaining constants are calculated from the constraints. The set of constants can then be employed to recalculate pA andPB. However, before the program enters the ‘p,p, ’ sub-routine, the values of the k and I interaction constrants are compared with a free energy limit L which is set by the operator usually to a value of 3. Two reasons can be given for this introduction. First, Weber (1975) has pointed out that ligand free energies rarely exceed an absolute value of 3 RT at 25’C. Second, Ainsworth (1979) has demonstrated the existence of discontinuities in the function pA = f(A) when kAB > 4. At the practical level, limiting the interaction constants to feasible values generally improves the speed of convergence towards a minimum discrepancy between the observed and calculated velocities. Hence, when an individual k or I constant exceeds I!,, the particular constant is arbitrarily set equal to 2 with its original sign and the remaining constants adjusted in accordance with the given constraints. E. Recalculation
of pA and pu by ‘pApu’ sub-routine The calculation of values of pA and pB consistent with the constants obtained in the last step is achieved by a Newton iteration according to equation 4 but with the substitutions, for the calculation of pA and pB, respectively, of
XA= 1 +A%exP[(k~,, + zX~pB)p~ + &PBI kA =
kAO + ‘XBPB
and
XB
=
kB =
1 +
ko,
&&,exP[(keB
+
bXPA)PB
+ b,PA
1
(27)
+ IAXPA
In the first step of the calculation, X, and kA are evaluated by inserting the given constants (including A) into equation 26 together with the initial estimates of pa and pB that were provided, for example pA = pB = 0.5. Iteration by equation 4 now provides a new value of pA : note that at each step in the iteration X, is recalculated but kA remains constant. The new value of pA is now introduced as a fixed constant into equation 27 together with, the initial estimate pB = 0.5 and the remaining constants, including B. Again, iteration by equation 4 provides a new value of pB. Thereafter, thedterations for pA and pB proceed alternately: at each stage n, the saturation that
Model
of regulatory
enzyme
299
is not being iterated on is treated as a constant with the value obtained at the immediately preceding stage, (n - l), while the initial value at stage n of the saturation that is being sought is that achieved at the penultimate stage, (n - 2). The program termmates when the values of both pA and pe do not change by a significant amount. The values of pa and pB calculated for any (A,@ point are stored and used again as the initial estimates for ‘pApB’ when the routine is applied to an improved set of constants obtained by the constrained linear regressions. Values of vcalcare calculated by inserting pA and ps together with the constant VG (and 7 when B is an effector) into equation 11 or 12 as appropriate. F. Error criteria
The values of v,,,~ obtained from the previous program stage are employed together with the observed velocities, vobbs, to provide estimates of the fitting error. Two error criteria are calculated: the sum of squared residuals given by
(28) and the mean deviation given by
(29) Both S and D become smaller as the fit to the data improves but in different ways: thus, S is an absolute error whilst D is relative. A good fit according to the S criterion may therefore be associated with small relative error in the high velocity points but with considerable relative error in the low velocity points. The reverse arises when D is used to measure goodness of fit - large absolute errors in high velocity points may be associated with small absolute errors in the low velocities. For this reason, we have used the error product DS as the means whereby an overall goodness of fit can be judged, but print both D and S individually so that the progress of the fitting procedure can be more easily understood. The D criterion is also used to compare fits obtained with different sets of data because, unlike S, its value is not influenced by the number or magnitude of the data points employed in the fitting procedure. G. Entry to the next iterative cycle and termination The values of pA and ps calculated for each (A,@ datum
point enter the next iterative cycle, the first stage of which is their adjustment by the method described in section C. The constants and fitting errors obtained at each iterative cycle are printed and the iteration is terminated usually after less than 20 cycles. Tests with artificial data have shown that the error product DS changes with iteration number in several ways: (1) a rapid decrease in DS to small values is followed by a persistant slow
J. Kinderlerer,
300
S. Ainsworth, R.B. Gregory
increase in DS; (2) DS shows a slow, monotonic decrease; and (3) DS passes through a number of minima which may or may not include the global minimum itself. Whichever pattern of behaviour is revealed, the constants associated with the minimum value of DS are entered into the refinement procedures described in the next sections. H. Refinement by ‘PAPB’ Both the ‘MAIN’ and ‘EFF’ programs employ ‘pApe’ as a sub-routine to calculate the values of pa and pB. In addition, the sub-routine can be accessed directly in the program ‘PAPB’. This program serves two purposes: Firstly, it calculates the values of pA and ps from a set of constants supplied by the operator. In this context ‘PAPB’ is applied to a set of (A,B) points in which A is varied at a constant B, before proceeding to a second set of points in which B takes a new constant value. Thus except when changing sets, only minor changes in pa and ps are to be expected. The speed of operation is therefore increased by limiting the initial condition pA = pg = 0.5 to the first calculation. Thereafter, the initial values of pA and pB required for any calculation are taken to be those given by ‘PAPB’ for the preceding (A,B) condition. Secondly, the ‘PAPB’ program performs a ‘one at a time’ search, in which an improvement in the fit is sought by the adjustment of the model constants. The search exists in constrained and unconstrained forms that can be independently accessed by the operator. To begin with the constrained search. When A and B are both substrates, the constants are linked in pairs and an increment added to one is subtracted from the other as follows
1
Shift
Add increment Subtract
to
increment
from
2
3
‘n%”
kAO
‘OS
‘OS
‘XB
‘%
4
0
5
6
I
8
‘XB
‘“pa,
kOB
‘A0
‘AX
Kay
*A0
‘AX
‘@e”
k~B
When A is a substrate and B an effector the same shifts are applied but, because the values of h-tar,, and kAo are determined directly and are not constrained respectively with I,, and IxB, lncz,, and kAo are reset to their starting values. This simply gives:
Shift
Add increment Subtract
to
increment
from
1
2
_
_
‘OB
IXB
3
4
‘OB
1x3 _
-
6
I
8
WJ,
koB
1AO
IAX
‘A0
‘AX
‘@cm
kOB
5
301
Model of regulatory enzyme
Note that, in both instances, the constants remain constrained throughout. For each shift, the values of S and D are calculated and the error product DS is formed. Should the error product decrease, the adjusted value of the constant is retained and a further increment is applied to it and tested. When no further reduction in the error product is obtained, the routine proceeds to the next constant in the list. On completing the set of eight shifts, the routine returns to the beginning of the list to repeat the procedure with the same increment value. When no improvement in the fit is obtained for a particular increment value, or a complete circuit without alteration of the constants occurs, the size of the increment is halved and the routine returns to the beginning of the list of constants as before. This procedure continues until the size of the increment is 0.01, whereupon the original increment value stipulated by the operator is restored and the unconstrained search is initiated. For the unconstrained search, the value of the increment is first added to and then, if this fails, subtracted from, each constant in turn, in the order: lno,, , &IOTb, I,,, ln&,, , koB, Z,,, and finally JAX. After each adjustment the error product is calculated. If an improvement in fit is obtained, a further increment in the same direction is tested before the routine proceeds to the next constant in the list. Tests with artificial data have shown that these search procedures are useful in refining the values of the constants obtained by the ‘MAIN’ and ‘EFF’ routines. Examples of these tests are given elsewhere. I. Additional features (a) Data input: The data input routine accepts values of the constraints,
of V, and, when B is an effector, a value of y. The experimental one of three formats:
an estimate data can be entered in
(a) A,B,v (b)A,B,v,p,,p, (c)A, B, v and initial estimates
of all the constants. If format (c) is chosen, values of pA and ps are calculated by using ‘PAPB’ (b) Printing routine: The routine ‘PRINT’ prints the values of the constants together with the calculated values of pA, pe and the velocity for each (v,A,B) point employed in the fitting procedure. (c) Residual plots: Ln principal, the appropriate weighting functions and the value of the exponent w, may be ascertained by examination of the standard deviation determined for each (v,,4,@ point as a function of velocity. In practice, however, the large number of velocity determinations required to provide accurate estimates of the standard deviations are rarely performed, and the choice of an appropriate weighting function is largely a matter of operator judgement. To facilitate such a judgement the program routine ‘RESID’ may be employed. This routine plots (vohs - v,,,~) and (V”bS - %~ic)/%bs against “ohs. This option enables the operator to judge whether a systematic trend is present that might be eliminated by the use of a more appropriate weighting function.
302
J. Kinderlerer, S. Aimworth. R.B. Gregory
(d) The plotting routine: The program routine ‘PLOT’ plots calculated values of v and observed velocities against either A or B. The plot may be set at 4, 5, 8 or 10 inches square. The routine calculates suitable scales and an origin given the range of substrate concentrations and velocities required. The velocity is calculated from the constants obtained from ‘MAIN’, ‘EFF’ or the ‘PAPB’ search sub-routine by using ‘pApB’and is plotted as the continuous functions v = f(A), or v = f(B)A , the input data appearing as points. (e) Dufa storage and retrieval: The data storage routine ‘STORE’ packs the (v,A,B) data points and the exponential model constants and constraint values correct to three significant digits as a record in a disc data file ‘EXPDATA’. Data that has been stored in the data file may be retrieved by using the data retrieval routine ‘LOAD’. Linked functions It is a characteristic of models for regulatory enzymes which can be represented ratio of polynomials in the concentrations of two ligands that
ii%?), =(iEJB
as a
(30)
when A and B possess an equal number of binding sites (Wyman, 1964). The exponential model also predicts equation 30 when IA,, = la1 = I,, = IIB, i.e. bX = IXB = zero (Ainsworth and Gregory, 1978). These additional restrictions simplify the regressions appearing in ‘MAIN’ and ‘EFF’, thus equations 23 and 25 (for ‘MAIN’) reduce to regressions for the solution of I,, and I,,, , respectively. The average of I,, and la0 is then obtained and subtracted from lnaol to give lncyoo and from ln&, to give In&* The same procedure applies to ‘EFF’, (equations 24 and 2.5) except that I,, is not required to establish Ins, which is provided directly as a constraint. It will also be noted that the exponential constants kA, and k,, are equal to the corresponding constraints kA, and k,, because I,, and IaX are both zero. The ‘MAIN’ and ‘EFF’ routines, modified in these ways have been designated ‘WYMAN’ and ‘WYMEFF’, respectively. Notes on the program line changes are given in the appendix. The linkage restrictions allow the refinement procedure ‘PAPB’ to be similarly simplified. For example, in the constrained mode, any adjustment applied to I,, is applied to IA,-, and hence, by operation of the constraint lr@,,, to ln&, with reversed sign. However, because of the absence of I,, and lax, adjustments to kA,, and k,, are effectively unconstrained. In the unconstrained mode, the only restriction that is imposed is that ho and I,,B are kept equal throughout. These alternative operations are incorporated in ‘PAPB’ and are automatically put into effect when refining parameters that have been provided by either ‘WYMAN’ or ‘WYMEFF’.
Model of regulatory enzyme
303
It was stated earlier, when considering the principles of the analytical procedures, that non-linear regression on the 8 constants of the general model was unsuccessful. This approach, however, can be employed when the data are consistent with S linked constants. For two substrates, equation 11 can be fitted by a routine, called ‘FIVE’, similar to ‘JK2’ (Kinderlerer and Ainsworth, 1978) but with the following derivatives for each constant to be determined:
-=-av
V
aQo0 %OXA -=-av
'PA
%A0
x,
-=-av
(32)
V
woo
PO,%
-=-av
'PB
%I,
X,
g
(31)
=
v(z +?)
(33)
(34)
(35)
where X, and X, as defined in equations 26 and 27 are used with I = I,, = Z,, and IAX = ‘Xi3 = 0. The initial estimates of the constants required for the iterative procedure are the final values obtained by using ‘WYMAN’ and ‘PAPB’ as described earlier in this section, with consistent values of pA and pe, calculated from these constants, for each A,B datum point. When it appears reasonable to assume that I = zero, a four constant routine, called ‘FOUR’, can be employed based on equation 11, and equations 3 l-34. The step length calculated by the routines for each of the four or five constants is halved if DS is greater than 1 .lS times that provided by the previous iteration and the errors (differentials) are recalculated. This is repeated until the conclusion (DS), + 1 < 1 .lS (DS), is attained. The constants are then adjusted and printed-out, together with the standard deviation of the estimate obtained from the diagonal of the correlation matrix. The next iteration is then entered and the process continues until a minimum in DS is obtained. Similar procedures based on equation 12 are employed when B is an effector. The relevant derivatives are as follows.
av
V _-
a’yoo -
(36) &OOxA
304
J. Kinderlerer,
av
'PA
av -=
(37)
vGPAPB(~ -
%o av
1)
(38)
&JB v,PAP;(r
-=
akoB =
- 1)
(39)
43
av G
S. Ainsworth. R.B. Gregory
VI PAPB
PA(Y-l) XI3
The application
+
I1 +PB(“f-l)l
(40)
XA
of these programs is described in a subsequent
paper.
Discussion Tests with artificial data have shown that the computer routines described above give good estimates of the constants of the exponential model (Ainsworth et. al., 1981a,b). The time required to obtain the estimates is not long in comparison with the time that would be required to accumulate real data: one working day generally suffices to complete the whole process, and most of this time is devoted to refinement by the search routine. One important omission remains: the routines in their present form do not offer a measure of the error of individual constants unless they are linked. References Ainsworth, S., J. Theor. Biol., 68 (1977) pp. 391-413. Ainsworth, S., and Gregory, R.B.,J. Theor. Biol., 75 (1978) pp. 97-114. Ainsworth, S., J. Theor. Biol., 77 (1979) pp. 27-35. Ainsworth, S., Gregory,R.B. and Kinderlerer, J.,lnC. .I. Bio-Med. Comput., 12 (1981a) pp. 315-334. Ainsworth, S., Kinderlerer, J.andGregory, R.B.,Znt. J. Bio-Med. Comput., 12 (1981b)pp. 335-348. Cleland, W.W., Biochim. Biophys. Actu, 67 (1963) pp. 104-137. Kinderlerer, J. and Ainsworth, S.,Jnt. J. Bio-Med. Cornput., 10 (1978) pp. 29-36. Sturgiil, T. and Biltonen, R.L., Biopolymers, 15 (1976) pp. 337-354. Weber, G., Adv. Protein Chem., 29 (1975) pp. l-83. Wyman, J.,Adv. Protein. Chem., 19 (1964) pp. 223-286.
Appendix The analysis of rate data according to the two ligand exponential model is programmed in basic on a Wang 2200 PCS11 desk-top computer with 8 K available bytes. The programs involve a number of routines using common storage of data which are overlaid using index routines to allow the user to select the appropriate program. The
305
Model of regulatory enzyme
two index pages, which would normally
iWLT1PLE
LINEAR
HEGWSSlON TNnEX 2*.**.. T~PI_I t rou t Fat Pb and
0 1
be displayed
on a screen are printed EXPUNENTIAL
- PAGE
below:
MODEL
t
F’ase
IY A, B* V and optionat the coefficients routine - MAIN Mu1 tiplr linear resression Calculation of Pa and Pb Print results of regression analrsis Plot of rcsidualst unscaled Plot V asainst either A or B sto1.e data on disc 45 rrclaim data from disc 45 linear regression with linkage relationshiP !ax=Ixb=Op Tao=Iob
8 9
:L
no
Ok'TION'?
MULTIPLE
0 1
2
3 4 5 6
LINFAR
REGRESSION INDEX - PAGE
EXPONENTIAL
MODEL
2
Page l...?.. Interface routine - converts data for use in JK2 Single substrate regression - JK2 Four constant re3rEss ion Five constant regression Random normal deviates on v or Vm Averaged random normal deviates
The common storage area linking the routines is defined for up to 84 A, B, and v points where A and B contain the data for ligand concentrations and V the reaction velocities. The nine coefficients (lncuee, kao, IeB, I,,, ln&,,, koB, IA,,, I,, and V,) are contained in an array C; the number of data points in the data set is stored in N and the last calculated values of the fractional saturations PA and pi are in the P and Q arrays respectively(these are set initially to 0.5). Y3$ is used as a pointer as to whether (1) data exists and (2) the data set involves B as a substrate or modifier. In the latter case EO is set to the value of y. The array IV is set to the four constraints calculated from the data extrapolated to saturation for each ligand. The other variables in the common set are labels for printing the calculated data which are reset on entry to the index routine. The MAIN routine is listed below, comments are inserted as appropriate. This is the routine which uses two linked multiple regression analyses to calculate the values of the 8 constants considered variable ( VG is not altered).
J. Kinderlerer. S. Ainsworth, R.B. Gregory
306
PROGRAM FILE MAIN
The data can be weighted by a positive or negative power of (1) V; (2) v/A; or (3) some other method which may be defined by the user. The weights are calculated for each data point and stored in array WS .+I
CIATA "slmp~e
v"~"slrn~le v/A"~"func.tion":MAT
READ
WlQ,
40 ;'&Ttii' illr‘Xf(.13t>D)ti.IB+ "WEiGHTS",,iEX(ODCA):FRJNT "t"vLl$(l):P?INT "2"rWl8(2):PFiX~T "3"l"dcffn,";W1%(3)1HEX(ODUA):INPUT "KFY lt2tor 3",W9 I.& IF WSOBTHEN
60:STOP
"enter
function
at
line 570
II
The limit value (L9) is set to limit the values of the constants. 70 iqA'TX=ZEH:mAT
Y=IER:M=M+l:FOR
do WS(I)=V(I,~WI:GOTO
I=lTO N:ON
W9GOTD
80~90~1.00
110
90 W~~ci).=t3(;)/A(,ji^Wl:GUTO
110
i00 W5tl)=FNb(V(I)>
The F value (equation 1%
ik OdJ){iTH!?N
130
PRINT
The
18) is calculated, and PA and pe adjusted to lie between 0 and 1. l~~O:R(~f-,99:F(I)=FNP(I)
I:FNA(F)t
first regression on the A constants
307
Model of regulatory enzyme
Limit value correction are set to +2 or -2 160 ~(2):
to the calculated
constants.
If greater than the limit value they
,J:NEXT I:i’RINT H$:ilAT X=INV(X):MAT K:Nt:xT ABS(13)(LSTHFN ~~~:C(Z)=~+SGN(LI):Z(~)=W(Z)-C(~)
r\;fXT IF
Z=X+Y:D=W(2)-
The second regression, identical to the first but on the remaining B constants.
2.1rj I\ltxT t::hk.xT J:NEXT 1:h~T X=INV(X):MAT BS(tl)(I..STHEN 720:C(h)=Z+SGN(Lt):Z(2):Wo_C(b)
a40
1;i&h
IFI
A ) : SE I .1: 2T 2%
'5..XLECT F’R I NT
t-15, FN. 3:
005
kl=o:
PRINT (64
F:IR
~15~8O):GiXUR
) : GOTCl I=lTG
Z=X+Y?D=W(J)-Z(2):
IF
A
J 1 : rR r9~ HEX t OAOA~AOAOAO
70
N:P=P(I):Q=O(I):R=P~Q
Calculation of the fractional saturations uses subroutine ‘3 which includes calculation of S (residual sum of squares ~ equation 28), D (mean deviation - equation 29) and DS (the product of D and S): XO
K=C(2)+C(4)*R
2,?C/
k=l
iH0
T=P-L:F’=T:
IF
ABS(L/P)
310
T-G-l,:
Q=T:
iF
AH5
.%‘O
S=P+O:
IF
33~:)
;;(~)=P:~(I)=Q:V=C(~)*~*~~:~~II~T F’RINT V(1 );FNA(V):NEXT
(P)oFNA(R)i:
cA(I)*rbPtC(l)+R+P+C(3)+n):L=X*(X*(~-l)+~)/(X*X.-t:*X+K)
ABS(
t L/U
).K?THEN
270
) 1 . 02THElu
300
(K-S)/S)(.O2THLN
330:H=S:GOTO
260
A(I,:B(I),:PRINTUSTNG 1:PRINT H9:RETlJRN
3901FNA
308
J. Kinderlerer,
Printing routine:
prints coefficients,S,
340 DEFFN’1:P&INT h:” J))*B(I+4)~FNA(C(X+J)):NEXT
370 iK;fiN’4:IAT (2)=2(2)+V+V:NEXT
Limit value correction (iO(3 IlEkFN’2S:FOR
520
ii-
ABS(Z(i)+7(2))(LSTHEN
430
h’t TURN
D and DS :=1TCi 4:F’RINT C$(9);C(9)1:GOSUB
CB(I!+FNA(C( ‘4
I=lTfl N:V=P(I>+~(I):Z(1)=Zrl)+V+Vo:I )/Z(2):PHINT “Urn (cst )“;V
N:V=C(9)+F(I)+O(I!:H=V(I)-V:S1=S1+H,H:F=V( I:S2=FNA(100+SRR(Fl/N)):RETURN
for combinations I=lTO
R.B. Gregory
~tet-atlnns”:FOR 1:F’RXNT
Z=SEH:FOH I:V=Z(i
381.1 S~~F~=~:F-I)H I=lTt3 I)/V:Fl=Fl+(i-F)*(i-F):NEXT
S. Aimworth,
2: IF
of constants ABS(Z(I)
430:MAT
)(L9THEN
~~I:):Z(I)=Z+SGN(Z(I)
Z=(ABS(2/(Z(1)+7(2))))+%
PROGRAM FILE EFF EFF is the same as MAIN except that the equation fitted is 12 rather than 11. It is loaded if Y3$ indicates that B is an effector and not a second substrate. It differs marginally from MAIN and the changed instructions, identified by line number are listed below: li0 F~~~l+l~(X)*(G-1):F=V(I)/(C(9)*P~I)~F.~):F(I)=~~,~)*~~~(F):Q(~)= (SRR(F-)+F3-1)/(G-l):IF P(I)(lTHEN 120:P(I)=.99:R(I)=(FNF’(I)-.l)/( c-i ) 120
IF
Q(I)(ITHFN
121:Q(I)=.99:P(I)=FNF’(I)/F3
121 TF’ n(I))~.:THEN (I)-.999
~~!-);~(T,=,~I-)~~.:~~~(I!=F’JF’(I):IF
J.3CI PRINT
t
I;FNA(b-)
i’(I)(lTHEN
J:3o:P
309
Model of regulatory enzyme
140 F(I)=Q(I~:F(2)=Q(I)+~(I):Ll=LOG(F(J)/((1-F(I))*A(I~)):L=L~-C (l)-C(Z?)+F(I)
Calculation
of pA and PB
830 F~1)=F:Q~I)=Q:V=C(9)+P+o+G+a):PRINT A(I);Fc(I)~:FRINTClSI NG 390tFNAiP)+FNA(O); :PRXNT V(I):F~lA(V):~tiEXT T:PRINT :RETURN
WYMAN and WYMEF are again similar to MAIN but involve a linkage relationship. C(3) is set equal to C(7) (Z,,B = 4,) and C(4) = C(8) = 0 (Zxe and lax). In practice C(3) and C(7) are calculated, their mean taken, and stored in both. No limit is assumed and L9 is not used. The major changes are: WYMAN 140 F1=Q(I)-l:L1=LOG(F(I)/((l-~(I))):L=Ll-~~(~)-~(~)*~(I) 150 F(2)=F(2!+L+F1+W5(I):F(l)=t-(l>+FlsFt++W~(I) 160 NEXT
ItPRINT
H!S:C(3)=F(2)/F(l)
I:lTCl N:Fl=P(I)-1:L1=LOG(Q(Z)/o)+B(I))*B(I))) So MAT t=ZER:FOR :L=C1-W(3)-W(4)*Q(1) 1'00 F(IL')=F(2)+LwFl+W5(I):F(l)=F(1)+Fl+Fl*W5(I):NEXT
FPINT .215(AO) 005 L64 ) : GOT0 70
240 KIYUB
'3:SFLECT
A 1: SL K.1:CT
F’R I NT
-i‘;iJSlJH
’ 1: PR I NT
I
tiEX ( OAOAOAOAC)AO
WYMEF
160 NEiT
I:FtiINT
i90
:1:(3i=f(2)/F(l)
MAT F=ZER:FOR I=lTO :L=~.l-W(3)-W(I)*Q(I)
N:F1=P(I)-1:Ll=LOG(Q(I)/((1-ao)+H(I))*~(I)))
700 F(2)=F(2)+L+Fl+W5(I):Fo=F(I)+T1+FlrW5(I) 210 NEXT :~4(3
GOSUB
I:~i7)=F(~)/F(l):C(7)1Co=(C~~)+C~3))/2:C(~)=W~3)-C(7) ‘3:FiELECT
PRINT
215180):GOSUR
SELFCT Pr? MT ~.50.5 ( 64 ) t C~TO 70
'1:PRINT
:P#INT
:FRINT
:
J. Kinderlerer,
310
S. Ainsworth, R.B. Gregory
PAPB is largely the same as subroutine ‘3 in the previous listings but allows for systematic alteration of the constants. It takes account of: (a) B being either a substrate or modifier; (b) the presence of a linkage relationship if C(4) is identically equal to zero. The latter sets a pointer E7 so that C(4) and C(8) are kept equal to zero and C(3) and C(7) are kept equal. The routine is listed in full below: PROGRAM FILE PAPB
2C! DATA
40 PRINT
3~411,217t81516
HEX(O3)1 "CALCULATION
OF Pa AND Pb"
(1) The values of P and Q for a given set of constants can be calculated. (2) The effect of varying the constants first in a constrained manner where the constraints linking the constants (normally C(1) - C(3), C(2) - C(4), C(5) - C(7) and C(6) - C(8)) are maintained and secondly removing the constraints, is examined (3) The effect of varying the constants when unconstrained with respect to one another is examined. D9 is used as the initial increment/decrement of the constants, D is initially set to the value of D9 and is subsequently halved whenever changes to all the constants fails to decrease DS 50
PRINT lt "with haven constants" :FHINT 2~"constrained":PRXNT "unconstrained":INPUT "112 or 3"+Ei:PRINT
60 IF fl(2 THEN 70: INPlJT "INCREMENT",D9:D=D9:M,MI=O:E;=U:XF OOTHEN 70:If' C(B)OOTHEN 70:PRINT "WYMAN FIT":E7=1 70 <.;L~SIJB '3:CDSUB
'4:f3=FZtIf
El=lTHEN
30:IF
Ei-BTHEN
3,
C(4)
400
The values of pA and pB and the initial value of DS is calculated for the set of constants in the array Con entry to the routine, Array G is set to the values in C. The constants in G are now varied using the increment D. 2 is the pointer to the current constant and J the pointer to the complementary constant. If allowed (by the definition of the function) G(J) is decremented as G(Z) is incremented. 80 FOR Y=lTO 20:HO=O:FOR Z=iTO 8:READ J:IF Z=HUTHEN IhO:IF Z=MlT HEN J.80:[F E7= OTHEN 90:IF Z=STHEN thC~:If Z=7THEN 16O:IF Y35( )HEX (02L)THEN 90:IF Z=2THEN 160:IF Z=ITHEN i6U
311
Model of regulatory enzyme
90 G(Z)=G(Z)+D:G!J)=G(J)-D:PRINT'"SET";Y, )THEN 1OO:G(l)=C(l):G(2)=C(2) 100
"SHIFT";Z:IF Y38=HEXt01
TF Ei'=OTi-IENtZI,:G(4)lG(8)=0:G(7)=G(3):G(S)=Wo-G(7)
120 W=W+1:GOSllB '3:IF FZ(F3THF.N 15O:G(Z)=G(Z)-D:G(J)=G(J)+U:F2=F 3:IF Y~B=HEXI~-)~)THEN13O:G(l)=C(l):G(2)=C(2) 130 IF E7=OTHEN 160:G(4)~G(8)=0:G(7)=G(3):G~5~=W~3~-G~7):GOTO 0
16
150 GOSUH 'J:HO=J:F~=F~:M~=Z:GQTCI 90
If the cycle is complete array C is compared to G. If they are the same no changes have occurred and D is halved. The process is then repeated. Otherwise the set is repeated with the same value of D until the shift attempted would be the same as that which was last successful. In this case D is halved and the process restarted. 160 NExr 0 180
Z:RESTORE :FOR Z=lTO 8:IF G(Z)OC(Z)THEN
170:NEXT Z:GOT
170 MAT C=G:NEXT Y 180 RESTORE :WAT C=G:D=D/2:Wl=O:IF
D(.OlTHEN 4OO:GOTO 80
190 DETFN'B:SI,Fl,F 2=O:MAT Z=IER:FflR I=ITO N:P=P(I):Q-R(I):R=P+Q -200
K=G(2)+G(4)+Q
210 X=l+A~I)+EXP(G(l)+K+P~G~3)~Q):L=X*(X*(P-l)+l)/(X*X-K*X+K) 720 T=P-L:IF ABS(L/P)(,OOITHEN
230: P=T:GOTO
210
270:Q=T:GOTO
250
230 P=T
740 K=G(6)+G(B)*P
260 T=Q-L:IF
ABS(L/Q)<.004THEN
?70. S=F+Q:TF
AHS((H_S)/S)(.004THEN
280 P(I)=P:Q(I)=Q:IF
Y3$=HEX(02)THEN
?RO:R=S:GOTO
200
29O:V=G(9)+F'+Q:GOTO 3(3O
290 V=G(S)+P+(l+(EO-l)+Q) 300 H=V(I)-V:S1=S1+H*H:F=Vo/V:Fl=Fl+(l-F)*(l-~)
At each step in this procedure the value of DS is compared to DS,, . If greater the procedure fails and returns to the step at which it was called (lines 120,430 or 470).
312
J. Kinderlerer,
S. Ainsworth, R.B. Gregory
310 E3=FYA(SI~:EJ=FNA~lOO~SQR~Fl~N)):F.~~FNA(E3*~4~:IF 0: IF F~)=FL”THI:N 320:PRXNT “FAIL ON ITEM”: It”SHIFT”:Z: RETURN YZO PRINT A(V):NEXT
A(I):Ei(I) 1:RETURN
,:PRINTUSING
This routine decreased.
is called to print the constants
330 lKFFN’4:SFLECT PRINT ))~C%(I-I.~)~F,~A(G(T+~)):NEXT 1) THEN 340: F’R I NT “GAMMA”, 340
PR 1 NT
“Res~dua)
350 PRINT “Error PRINT : PRINT SE:Lk CT F’RXNT
370
DEFFN’B:kiFLECT
380
~iEF-FNA~V~=INT~10OC~+V+.~~/lOOO
390
I
1 fxx
64)
PRINT
“Mean
C$(I)eFNA(G(I :F V339=HEX(O
dcviation”;EJ:
Shi ft”;Zg
“X”
“STEP”;D:F’RINT
:
: RETlJRN iK~5(64):i,OAD
DC F”INISEXP”
t r; fif;
400 D=Ll9: SELFCT PRINT 215:PRINT : PRINT : SELECT PRINT 005: PRINT 410 420:
I=lTO 4:F’RINT CBt9)tFNA(G(9)):
“Set”:Y;”
360
V(Ij:FN
and errors where the value of DSmin is
sql,ares”;E3~
product”;FZt
005(
39OeFNA(P)+FNA(Q)::PRINT
‘215(8O):FOR I:PRINT EO
.surn of
M=OTH&N 32 I=N:NEXT I:
FOFi Y=lTO IF ;=4THEN
‘HEX(OE) ; “UNCONSTRAINED” M =M+1: MAT G=C: MI-O
20:FOR Z=lTCl 8:G9=G(Z):IF 490: IF 7)6T.+EN 490
Ml=ZTHEN
t HEX (OF) :
52O:IF
E7=OTHF.N
Unconstrained variation of the constants: The increment is reset to the starting value D9. The value of each constant in turn is incremented by D. If this leads to a decrease in DS then it is repeated, otherwise the value is decremented by Dand tested. If successful it is again decremented otherwise the routine moves to the next constant. 420 G(Z)=G9+D:PRINT )=G(3) 430
GOSUB
440
IF
450
GOSUB
‘3:IF
“SET”;Y~“SHIFT”;Z~“++++“:IF
FZtF3THEN
G9( )C(Z)THEN
450:G(Z)=G9:IF
49O:GOTO
PRINT
“SET”:
E7=OTHEN
43O:G(7
440:G(7)=G(3)
460
‘4:F3=F2:G9=G(Z):Ml=Z:GOTO
460 G( Z) =G9-C: )=G(3)
E7=OTHEN
Y, “SHIFT”:
420 Z, “----“:
IF
E7=OTHEN
470:G(
7
313
Model of regulatory enzyme
'3:IF
470 GOSUB GOT0 490
F2tF3THEN
480:C(Z!=G9:IF
480 GL~SLIB '4:F3=FZ:G9=G(Z):Ml=Z:GOTO 490 NEXT
E7=OTHEN
49O:GC7)=G(3):
460
Z
500 FOR
Z=.lTCl8:XF
G(Z)OC(Z)THEN
510 MAT
C=G:NEXT
520 MAT
C=G:b=D/2:Ml=O:IF
S10:NEXT
Z:GClTO 520
Y CI)=.Ol.THEN 41O:GOSUB
'8