Statistical design of experiments for protein crystal growth and the use of a precrystallization assay


Journal of Crystal Growth 90 (1988) 60-73
North-Holland, Amsterdam

STATISTICAL DESIGN OF EXPERIMENTS FOR PROTEIN CRYSTAL GROWTH AND THE USE OF A PRECRYSTALLIZATION ASSAY

Charles W. CARTER, Jr., Eric T. BALDWIN and Lloyd FRICK *

Department of Biochemistry, CB 7260, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599-7260, USA

Received 29 September 1987; manuscript received in final form 14 March 1988

Statistical design of crystallization experiments greatly reduces the amount of protein necessary to find conditions for crystal growth and leads naturally to a useful data base for improving crystallization conditions in cases where the initial trials do not produce adequate results. Although it is counterintuitive to vary simultaneously all the factors to be screened, this apparent loss of control over experimental parameters actually costs very little in terms of the statistical strength of inferences to be drawn from the resulting data. We have used incomplete factorial designs to crystallize all of the proteins we are now studying (tryptophanyl-tRNA synthetase, cytidine deaminase, and a manganese catalase). In each case we used a rudimentary microscopic examination to determine the experimental outcomes. From these data we have generated for each protein a “factor profile” showing the relative importance of the factors studied. In at least one case, the specific information in this profile led to a substantial improvement in the crystals we were able to grow. Considerable potential power lies in the ability to feed information obtained from all experiments in an initial trial back into the design of subsequent improvements in crystal growth conditions. Exploitation of this feedback is limited by the accuracy of the experimental assay procedure. Microscopic examination has many limitations, including subjectivity and low precision. More importantly, it can be seriously misleading if kinetic factors force suitable nuclei to shower as a microcrystalline array which is mistakenly taken to be a precipitate. The dilution curve assay based on dynamic light scattering was developed (Kam et al., J. Mol. Biol. 123 (1978) 539-555) from the concept that hydrodynamic measurements at different protein concentrations could reflect thermodynamic interactions in aggregates without forcing the system to produce macroscopic manifestations. Hence, it should discriminate between cases involving a kinetic barrier to large crystal growth and those where thermodynamic interactions between molecules are poor. Preliminary measurements of the dilution curves of phosphoglucomutase in ammonium sulfate solutions with and without PEG-400 suggest that the dilution curve can identify microcrystalline samples.

* Present address: Section of Experimental Therapy, Burroughs Wellcome Laboratory, Research Triangle Park, North Carolina 27709, USA.

0022-0248/88/$03.50 © Elsevier Science Publishers B.V. (North-Holland Physics Publishing Division)

1. Introduction

Many factors can influence protein crystal growth, and in particular the ability to grow good crystals of a chosen protein may depend critically on factors whose relationship to crystal growth seems capricious. The problem of screening many such potentially important factors at the outset of a new project has generally been avoided by protein crystallographers because the number of experiments required for such a screening seems intuitively to be quite large. The obvious success of protein crystal growers, evidenced by the rapidly growing number of new crystal forms, suggests that perhaps this problem does not represent an insurmountable barrier to the crystallization of a new protein. One nevertheless suspects that rational and efficient exploration of potential factors should play an important role not only in attempts to crystallize proteins which have resisted crystallization but in most new protein crystal growth projects.

Since we introduced the use of statistical designs for protein crystallization experiments [1], we have used incomplete factorial designs containing between 20 and 35 experiments to screen conditions for growing crystals of six new proteins that have come to us via colleagues and a tRNA [2]. Our results compare quite favorably with those obtained using other approaches to screening. We crystallized six of the seven macromolecules and the tRNA under at least one condition in the initial incomplete factorial design. Five crystal forms obtained from the initial screening for three of the proteins were of sufficient quality that three-dimensional structure determinations could be pursued without modifying the conditions from the initial trial. The factorial experiment with the tRNA^Trp was carried out with the cognate tryptophanyl-tRNA synthetase as one of the factors, and produced crystals both of the tRNA and of a complex (Laurie Betts, unpublished results). Statistical analysis of the results from the first screening for one of the remaining three proteins produced sufficient information to improve dramatically the first crystals obtained. The fifth and sixth proteins have so far produced for us only small needles, and have resisted all attempts to produce diffraction quality crystals. We strongly suspect that the factor limiting us in the case of these last three proteins is the purity of the protein itself.

The statistical approach to initial screening experiments is counterintuitive to most protein crystal growers, and consequently it presents a perceptible psychological barrier to its implementation. In the interest of lowering this barrier we wish here to describe in familiar and unsophisticated terms why statistically designed experiments can be so effective, to summarize what we have achieved using such experiments, and finally to describe what we have been doing to enhance the power of the approach by combining it with a more quantitative and accurate precrystallization assay [3,4].

2. Statistical experimental designs

The use of statistical experimental designs is by no means a recent development. It has its roots in the work of the British statistician R.A. Fisher [5] and has been described numerous times both in textbooks [6] and in more popular accounts [7]. The conceptual framework of a statistical experimental design, of which the incomplete factorial is perhaps the most radical example, involves several assumptions. Most basic is the assumption of their purpose, which is to determine whether or not a particular experimental factor exerts a significant influence on the outcome. Thus, at the outset of a new crystallization search in our laboratory, we first ask which, of all possible factors governing crystal growth, are those that actually affect the outcome and determine whether solutions will produce high quality crystals or precipitates. This means that the goal of such an experiment is not so much to produce high quality crystals (although we have found it often does do this very effectively as well, and for similar reasons) as to effect a crude sorting of the possible factors into two groups, those which are important and those which are not.

A companion assumption is that, to a level of approximation appropriate to the screening we are trying to achieve, the factors, $F_i$, contribute in a linear fashion to the experimental outcome, $Q$, where

$Q = \mathrm{constant} + \sum_i \beta_i F_i$.

As with most mathematical models, this has the virtue of providing a simple means of analysing experimental data. With such a model it is straightforward to determine the coefficients, $\beta_i$, and from these to make inferences about different factors at the same level of approximation to which the experiment is directed, namely a sorting of factors into those with significant and those with insignificant coefficients.

Perhaps the most important underlying principle, and certainly the least obvious, concerns the nature and power of averaging outcomes from several experiments. Without claiming any specialized expertise in this area, we nevertheless find it useful to describe qualitatively and by examples how averaging can substantially increase the efficiency of properly designed screening experiments. Understanding such examples, one comes to appreciate more fully the design of factorial and incomplete factorial experiments. The following examples are hypothetical, and are adapted from a similar presentation in ref. [7]. The conclusions they illustrate, however, can be supported by more sophisticated mathematical arguments, and are quite general.

2.1. Main effects and interactions

Suppose that it is possible to evaluate quantitatively, on a scale from 1 to 10, the results of a series of crystallization experiments. The first example is a set of experiments carried out at two different pH's and, say, at 4°C. Obviously, whatever the precision of the individual measurements, the confidence level of our inference can be increased by doing multiple experiments at the two pH's and taking averages of all those for a given pH (fig. 1). In the example, we have decided to use averages of four experiments for each experimental point, in order to establish a satisfactory confidence level for our results. In a strictly proper implementation of this approach one would choose this number based on actual knowledge of the precision with which the outcome could be measured. In the case of protein crystal growth the choice is problematic, as will be discussed further below.

If crystals grown at pH 7.5 can be shown reproducibly to be better than those grown at pH 6.5, as indicated in fig. 1, then according to standard statistical terminology we say that there is a significant main effect of pH. There is little argument that, on the basis of results such as those shown here, it would make sense to grow crystals at the higher pH. Moreover, the pH having been shown by this experiment to be a significant main effect, we can also say that it makes sense to carry out a more detailed study of exactly how pH influences crystal growth to find the optimum pH, which is not necessarily at pH 7.5.

Suppose now that we also wish to determine whether or not temperature has a significant effect. We could of course proceed by doing eight more experiments, four at 20°C and another four at 4°C. An immediate and important question arises: at which pH do we do these eight new experiments? Common sense suggests that we do them at the pH previously found to be “better”. However, a better choice for eight new experiments would be to do them all at 20°C, four at pH 6.5 and four at pH 7.5 (fig. 2). In this way, the two main effects can be evaluated from averages of the same number of experiments. An even better approach would be to do the sixteen experiments in fig. 2 at the same time, to avoid the possible effects of unknown factors that differed on two different days. From fig. 2 we verify the main effect of pH and conclude that there is also a significant main effect of temperature on crystal growth.

The design in fig. 2 is what is called a factorial experiment. This is the name given to a design which tests simultaneously for main effects of more than a single factor. The two factors are each tested at two different levels, and so this design is also called a “two-level” factorial design. There are several notable features of this design:

(1) With two factors there is always the additional possibility that they are interdependent. Here, for example, the combination of pH 7.5 and 20°C could give a result significantly better than the combined effects of pH and temperature would predict. The technical term for this interdependence is a first-order interaction. To ensure that a possible interaction can be detected if it exists, it is important to balance the design, with an equal number of experiments for each combination of the two factors. An interaction is described mathematically as a cross term in the linear approximation for Q.

pH     Individual results    Average of 4 (s.d.)
6.5    3, 4, 2, 3            3.00 (0.71)
7.5    5, 4, 6, 6            5.25 (0.83)

Fig. 1. A main effect of pH. Averages and their standard deviations are given in the second row. The standard deviation is the standard deviation of the mean.

Temp.           pH 6.5                      pH 7.5                      Main effect of temperature
20°             5, 5, 5, 4   4.75 (0.43)    7, 8, 9, 9   8.25 (0.83)    6.50 (1.87)
4°              2, 3, 3, 4   3.00 (0.71)    4, 5, 6, 6   5.25 (0.83)    4.13 (1.36)
Main effect
of pH           3.88 (1.05)                 6.75 (1.71)

Fig. 2. A two-level, complete factorial design, showing significant main effects of temperature and pH. Sixteen experiments are distributed four to each box. Averages and their standard deviations are as described for fig. 1.
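The averages in figs. 1 and 2 can be checked with a short calculation. The following Python sketch uses the hypothetical scores of fig. 2 and assumes that the bracketed spreads are population standard deviations, which is the convention that reproduces the printed values. The function names and the definition of the interaction contrast (half the difference between the pH effect at the two temperatures) are illustrative choices, not taken from the text.

```python
from statistics import mean, pstdev

# Scores of the hypothetical example in fig. 2, keyed by (temperature in C, pH).
results = {
    (20, 6.5): [5, 5, 5, 4],
    (20, 7.5): [7, 8, 9, 9],
    (4, 6.5): [2, 3, 3, 4],
    (4, 7.5): [4, 5, 6, 6],
}

def pooled(factor_index, level):
    """All scores whose temperature (index 0) or pH (index 1) equals `level`."""
    return [q for key, scores in results.items()
            if key[factor_index] == level for q in scores]

def summary(scores):
    """Average and population standard deviation, rounded to two decimals."""
    return round(mean(scores), 2), round(pstdev(scores), 2)

# Average and spread for each box of the 2 x 2 design.
cell_summary = {key: summary(scores) for key, scores in results.items()}

# Main effects: difference between the averages of the eight scores at each level.
ph_effect = mean(pooled(1, 7.5)) - mean(pooled(1, 6.5))
temp_effect = mean(pooled(0, 20)) - mean(pooled(0, 4))

# Interaction contrast (assumed convention): half the difference between
# the pH effect at 20 C and the pH effect at 4 C.
interaction = ((mean(results[(20, 7.5)]) - mean(results[(20, 6.5)]))
               - (mean(results[(4, 7.5)]) - mean(results[(4, 6.5)]))) / 2

print(cell_summary)
print(ph_effect, temp_effect, interaction)
```

Every score contributes to both main effects and to the interaction contrast, which is the sense in which each experiment in a factorial design does several jobs at once.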

(2) The main effects of pH and temperature are apparent in all one-dimensional rows and columns, as was the main effect of pH in fig. 1. This will be true under fairly general conditions related to those under which the assumption of linearity is valid. Thus, to the level of approximation intended for a factorial experiment, any two columns will show approximately the same “main effect”. Each individual experiment is therefore contributing consistently to the evaluation of both main effects and any possible interaction.

(3) A stronger statement can be made regarding the foregoing observation. In general, it can be shown that the power of averages from different experiments in a factorial design depends strongly only on the precision of individual measurements and only very weakly on the number of different factors being screened. The impact of this statement can be appreciated by considering the results in fig. 3, in which only half of the 16 experiments have been included. The same main effects are evident with approximately the same statistical strength using only half as many experiments! In this case, the power of averaging four results at each pH is achieved by using two experiments at 4°C and two at 20°C. In fact, using eight experiments deployed as in fig. 3, we have determined the main effect of pH with nearly the same confidence achieved with eight experiments at the same temperature shown in fig. 1. We have also used the same eight experiments to determine equally confidently the main effect of temperature! Each individual experiment in a factorial design therefore serves multiple functions. For the experiment in fig. 3 these same eight experiments would faithfully reveal whether or not there were an interaction. In fact, it can be shown that the total number of main effects and interactions of all orders that can be demonstrated by N experiments in such a design is N - 1. Thus, by varying other possible factors, also at two levels, we can determine simultaneously a total of 7 main effects and/or interactions without seriously diluting the statistical strength of inferences.

Temp.           pH 6.5         pH 7.5         Main effect of temperature
20°             4.50 (0.50)    8.50 (0.50)    6.50 (2.06)
4°              3.00 (1.00)    5.50 (0.50)    4.25 (1.45)
Main effect
of pH           3.75 (1.09)    7.00 (1.58)

Fig. 3. The design shown in fig. 2 but using only eight of the sixteen experiments. The averages and their standard deviations have approximately the same values as do the corresponding ones in fig. 2. The same main effects can therefore be inferred with approximately the same confidence.

2.2. Incomplete factorial designs in practice

From these simple examples we can learn much about why factorial experiments are so efficient and how to go about setting such an experiment up to screen protein crystal growth conditions. The incomplete factorial design described by Carter and Carter [1] generalizes the two-level design shown in fig. 3 to one in which factors may have more than two levels, and in which the experimental matrix represented in fig. 3 with two experiments in each element will in fact be missing experiments from many of its elements. The concept of “level” is also generalized to include, for example, different cations as levels of the factor “cation”. However, we preserve from the design in fig. 3 the requirement that each first-order interaction be represented in projection by fully populated elements. Moreover, the distribution of experiments in the full multi-dimensional space is determined randomly, so as to ensure that sampling of matrix elements is spread evenly and hence is as efficient as possible. Randomization also helps to prevent corruption of inferences by systematic influences of unknown factors.

The experimental matrix is determined in accordance with three principles. The list of factors to be screened should be comprehensive; the levels of each factor are chosen at random; and the distribution of experiments among the factors (for main effects) and among the first-order interactions should be balanced. As a concrete example, tables 1 and 2 represent the design used to

crystallize adenosine and cytidine deaminases. We followed these steps:

(1) Make a thorough attempt to enumerate the possible experimental factors to be tested by discussing the crystal growth project with everyone concerned with the purification and crystallization processes. This step should not be overlooked or ignored: it is the only way to ensure that no important factors are left out of the design. Even a properly designed experiment will fail if it does not sample the area in factor space where crystals grow! In addition to obviously important factors, such as pH and temperature, we have found that monovalent cations are among the most important factors to screen. Others to be considered are potential ligands of all kinds (substrates, inhibitors, effectors), and various additional compounds, such as β-octyl glucoside and PEG-400. The factors we screened for the deaminases are listed in table 1.

Table 1
Factors and levels

Factor                     Level 1      Level 2    Level 3
(A) pH                     4.5          6.0        7.5
(B) Temperature            4            14         22
(C) Precipitating agent    PEG          PO4        (NH4)2SO4
(D) Monovalent cation      Na           K          Li
(E) Divalent cation        Nothing      Ca         Mg
(F) β-Octylglucoside       Yes, 0.1%    No

Table 2
Experimental matrix

                                                           Result, Q
Number  pH  Temp.  Agent  Cation  Bi-val  DCF a) / BOG b)  ADASE  CDASE
1       1   1      1      1       1       1                1      1
2       2   3      2      2       2       2                2      2
3       2   1      2      3       1       2                2      4
4       1   1      1      1       2       2                1      1
5       3   3      3      3       2       1                1      -
6       3   1      3      2       2       1                2      -
7       2   3      2      1       1       1                4      -
8       1   3      1      3       3       1                4      -
9       2   2      1      2       3       2                5      -
10      3   2      2      2       1       2                1      -
11      1   2      3      1       3       1                2      -
12      3   3      3      3       3       1                2      -
13      2   2      3      1       1       1                3      -
14      1   1      3      2       1       1                1      -
15      2   3      2      1       2       2                2      -
16      1   1      3      2       1       1                1      1
17      3   1      1      3       3       2                1      1
18      2   1      2      2       3       1                1      4
19      3   2      3      3       2       2                4      1
20      1   3      3      3       2       1                4      3

a) DCF, deoxycoformycin, used in the adenosine deaminase design.
b) BOG, β-octylglucoside, used in the cytidine deaminase design.
For experiments 5-15 the source lists only ten CDASE values (1 1 1 3 1 1 1 6 1 6) for eleven experiments, so they are not assigned to rows here.

For screening experiments it makes sense to divide the problem of crystal growth into two different searches, asking first what solution conditions promote protocrystalline interactions, and only then asking how to engineer nucleation and growth phases to best utilize these solution conditions to produce optimal crystals. In general, therefore, we test for factors, such as pH and ionic composition, that influence the thermodynamics of interactions between molecules. This decision puts us at some distance, philosophically, from the current emphasis on studying the kinetics of protein crystal growth. Our justification for concentrating on factors likely to change the strength and direction of dominant interactions between protein molecules is this: if a solution is headed toward precipitation there is little one can do to change the outcome by varying the level of supersaturation during nucleation and subsequent growth. On the other hand, if one already knows with some certainty, for example from experiments like those described here, that a set of solution conditions will promote a three-dimensional network of interactions, then it makes sense to investigate that set of conditions carefully, with particular emphasis on controlling nucleation and growth phases.

It is also important to note that concentrations of factors such as precipitating agents are not considered as levels in the sense of the design. Rather, all experiments are carried out by systematically approaching supersaturation via changes in these concentrations to ensure that each experiment is taken to completion. When possible, this is done several times for conditions producing precipitates, to ensure so far as possible that precipitation was not due simply to excessive precipitant concentrations. For this reason most of our screening experiments are done using microdialysis buttons and not with vapor diffusion. We do, however, use vapor diffusion for experimental conditions, such as organic precipitating agents and high molecular weight polymers, for which dialysis is inappropriate.
Finally, it should be noted here that there are probably good reasons to vary protein concentration as a factor, especially if one is using microscopic examination to determine the experimental outcome. Variations of protein concentration will change supersaturation conditions and hence the kinetics of all growth processes. This variation may in turn cause conditions which give microcrystalline precipitates at one concentration to produce macroscopic crystals at another (R. Boistelle, personal communication).

(2) Decide how many experiments are to be done. This will depend primarily on the outcome of step 1. However, the amount of available protein and other resources may also influence the choice. If the number of experiments which can be done is necessarily less than the number of factors to be tested, plus 1, it will be necessary to limit the objectives of the design by eliminating factors.

(3) Choose the level of each factor used in each experiment randomly. Imagine experiments as being located in a multi-dimensional space whose basis vectors are factors and whose coordinates are different levels. As noted above, the coordinates defining each experiment are determined by chance in order to ensure as even a distribution as possible. This will also place experiments as far apart from each other as possible, and hence ensure as efficient a sampling as possible.

(4) The levels chosen randomly are then readjusted slightly to ensure a balanced design. An example of a balanced first-order interaction is the pH-temperature interaction from the data in table 2 (fig. 4). This interaction is balanced because there is at least one experiment for each combination of the two factors. We have not yet seen any interactions obvious enough to catch our attention, but it is an important possibility to remember. A design in which all first-order interactions are balanced, as shown in fig. 4, will ensure that data will be available for demonstrating such interactions.
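Steps 3 and 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure: distinct experiments are drawn at random from the full list of level combinations, and the whole draw is repeated until every pair of factors has at least one experiment in every cell, instead of readjusting individual levels. The function names, the seed and the four-factor example (3, 3, 3 and 2 levels, 20 experiments) are hypothetical.

```python
import itertools
import random

def is_balanced(design, levels):
    """True if every pair of factors has at least one experiment in every cell."""
    for a, b in itertools.combinations(range(len(levels)), 2):
        occupied = {(row[a], row[b]) for row in design}
        if len(occupied) < levels[a] * levels[b]:
            return False
    return True

def random_balanced_design(levels, n_experiments, rng, max_tries=100_000):
    """Draw distinct random experiments until all first-order interactions are balanced.

    `levels` gives the number of levels of each factor; level numbers start at 1.
    """
    candidates = list(itertools.product(*(range(1, k + 1) for k in levels)))
    for _ in range(max_tries):
        # sample() draws without replacement, so no two experiments are identical.
        design = rng.sample(candidates, n_experiments)
        if is_balanced(design, levels):
            return design
    raise RuntimeError("no balanced design found; use more experiments or fewer levels")

# Hypothetical request: four factors with 3, 3, 3 and 2 levels, 20 experiments.
levels = [3, 3, 3, 2]
design = random_balanced_design(levels, 20, random.Random(0))
for number, row in enumerate(design, start=1):
    print(number, row)
```

With more factors the fraction of random draws that pass the balance test falls quickly, so a practical program would probably repair an unbalanced draw, as step 4 describes, instead of discarding it.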

Temp.    pH 4.5          pH 6.0       pH 7.5
4°       1, 4, 14, 16    3, 18        6, 17
14°      11              9, 13        10, 19
22°      8, 20           2, 7, 15     5, 12

Fig. 4. The balanced pH-temperature interaction taken from the experimental matrix in table 2. Entries are experiment numbers.
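The cell assignments in fig. 4 follow directly from the pH and temperature columns of table 2. As a minimal sketch in Python (the variable names are illustrative), the cross-tabulation and the balance check look like this:

```python
from collections import defaultdict

# Level values from table 1.
PH_LEVELS = {1: 4.5, 2: 6.0, 3: 7.5}
TEMP_LEVELS = {1: 4, 2: 14, 3: 22}

# Level numbers for experiments 1-20, read from the pH and Temp. columns of table 2.
ph = [1, 2, 2, 1, 3, 3, 2, 1, 2, 3, 1, 3, 2, 1, 2, 1, 3, 2, 3, 1]
temp = [1, 3, 1, 1, 3, 1, 3, 3, 2, 2, 2, 3, 2, 1, 3, 1, 1, 1, 2, 3]

# Collect the experiment numbers that fall in each (temperature, pH) cell.
cells = defaultdict(list)
for number, (p, t) in enumerate(zip(ph, temp), start=1):
    cells[(TEMP_LEVELS[t], PH_LEVELS[p])].append(number)

# The interaction is balanced when no cell of the 3 x 3 grid is empty.
balanced = len(cells) == len(PH_LEVELS) * len(TEMP_LEVELS)

for t in TEMP_LEVELS.values():
    print(t, [cells[(t, p)] for p in PH_LEVELS.values()])
print("balanced:", balanced)
```

The same tabulation applied to every other pair of columns shows whether the remaining first-order interactions are balanced as well.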


In practice, steps 3 and 4 are carried out by a computer program, INFAC, designed in our laboratory by Tom Mercolino. This program continues to generate designs according to input variables (the total number of experiments, the number of factors and the number of levels for each factor) until it finds a design in which there are no empty boxes in any first-order interactions and in which no two experiments are identical. Although this procedure can be done by hand (as was done in ref. [1]), it is rather a tedious job when the number of experiments exceeds about 10. The INFAC program is available from our laboratory as FORTRAN source code on an IBM PC diskette. INFAC does not check for such unfortunate combinations as Ca2+ and phosphate because it has no way of knowing how the experimental matrix it outputs will be used. Thus, in order to carry out experiments we usually examine the matrix output by INFAC and choose level assignments to minimize such combinations. It should be noted that such combinations are mandatory if one is screening simultaneously for Ca2+ and phosphate as factors. In such cases, we simply use very dilute solutions of Ca2+ and tolerate the resulting inorganic crystals. Other ways around this kind of problem could perhaps be devised, but we have not devoted serious attention to it.

(5) Carry out the experiments as dictated by the design. As noted, this means for us a series of dialysis experiments, usually requiring 35 µl of protein for each sample. The design of such an experiment precludes using the same buffer for any two experiments, so some thought should be given to rational preparation of stock solutions, and so on. We find that there is no good alternative to simply making up each buffer separately, in order to keep the ionic compositions consistent with design requirements.

(6) For each experiment brought to completion, determine a numerical value describing the result. Much has been made, informally, about problems associated with this step. Although, as discussed further below, we are developing other assay methods, we continue to use microscopic examination and an arbitrary scale such as that in table 3 to grade each experiment. This scale has many problems, notably that it is difficult to distinguish between microcrystalline precipitates and true precipitates. Nevertheless, we have generally found that evaluation of all experiments in a design using this scale gave significant and useful information when subjected to regression analysis. Our experience using this scale includes both successful and mistaken inferences, and is described below.

Table 3
Scale of crystal quality

Result                                    Quality
Cloudy precipitates                       1.0
Gelatinous or particulate precipitates    2.0
Spherulites                               3.0
Needles                                   4.0
Plates                                    5.0
Prisms                                    6.0

(7) Having determined Q-values for each experiment in the design, one can then analyze these data for main effects and interactions. In practice, we have only sought to identify main effects. We use a standard statistical package called SYSTAT [8], but any such package will serve as well. Data must be prepared for analysis by translating the original design parameters into a form that can be used for analysis. This involves describing each experiment by a row of numbers, one for each component in the design. Using the example of our deaminase screening experiment, we must translate the matrix in table 2 into a corresponding table in which columns corresponding to factors whose levels are themselves different components are expanded to include one column for each component studied. Thus for the precipitating agent there will be three columns, one each for PEG, phosphate, and ammonium sulfate. For each experiment, one of these three columns will have a value of 1, the other two will have values of 0. Except for the pH and temperature columns, the other columns in table 2 are likewise expanded. Each experiment in the design will have a row of thirteen entries, one each for pH, temperature, and each component of the crystallization solution, plus an entry for the Q value.
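The column expansion described in step 7 can be sketched as follows. This Python fragment is a hedged illustration, not the authors' data preparation: the column names, the helper functions and the decision to give “no divalent cation” its own indicator column are assumptions, chosen so that each row has the thirteen entries mentioned above. The four rows are experiments 1-4 of table 2 with their adenosine deaminase Q values.

```python
# Level values and component names from table 1 (column names are illustrative).
PH = {1: 4.5, 2: 6.0, 3: 7.5}
TEMP = {1: 4, 2: 14, 3: 22}
AGENTS = ["PEG", "PO4", "(NH4)2SO4"]
MONOVALENT = ["Na", "K", "Li"]
DIVALENT = ["none", "Ca", "Mg"]  # assumption: "Nothing" gets its own column

COLUMNS = ["pH", "temp"] + AGENTS + MONOVALENT + DIVALENT + ["additive", "Q"]

def one_hot(level, n_levels):
    """Indicator columns: 1 for the level used, 0 for the others."""
    return [1 if level == k else 0 for k in range(1, n_levels + 1)]

def expand(row):
    """Turn one row of level numbers plus Q into the expanded numeric row."""
    ph, temp, agent, cation, divalent, additive, q = row
    return ([PH[ph], TEMP[temp]]
            + one_hot(agent, 3) + one_hot(cation, 3) + one_hot(divalent, 3)
            + [1 if additive == 1 else 0, q])  # assumption: level 1 means additive present

# (pH, temp, agent, cation, divalent, additive, Q) for experiments 1-4 of table 2.
design = [
    (1, 1, 1, 1, 1, 1, 1),
    (2, 3, 2, 2, 2, 2, 2),
    (2, 1, 2, 3, 1, 2, 2),
    (1, 1, 1, 1, 2, 2, 1),
]
expanded = [expand(row) for row in design]

print(COLUMNS)
for row in expanded:
    print(row)
```

Rows in this form can be passed to any multiple regression routine, with Q as the dependent variable and the remaining columns as predictors.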


This expansion permits multiple regression analysis of Q versus all columns. We use stepwise multiple regression analysis, starting with a model including all columns. Stepwise multiple regression has a dubious reputation among initiated statisticians [8]; however, it serves us by eliminating factors with little impact on the outcome. The resulting list, including a smaller number of factors, is used to formulate a new model using only the significant factors. The β-coefficients are then evaluated by conventional multiple regression analysis. Significant main effects are indicated by the largest coefficients; positive coefficients indicate that the factor promotes crystal growth, negative coefficients that the factor impairs crystal growth. Examples of the type of information we have obtained from this analysis are presented in the following section.

3. Results from our own incomplete factorial experiments

As noted above, we have used incomplete factorial designs exclusively in our attempts to crystallize new proteins. We have studied the behaviour of six new proteins, including tryptophanyl-tRNA synthetase from B. stearothermophilus, a pseudocatalase from L. plantarum, cytidine deaminase from E. coli, adenosine deaminase from a rat hepatoma cell line (obtained from P. Hoffee), the A subunit of the UVRABC excinuclease, and the photoreactivating enzyme, both from E. coli. The first three designs produced directly five different crystals (two each for the synthetase and the pseudocatalase) with high resolution diffraction characteristics. We also used an incomplete factorial design to crystallize tRNA^Trp and a complex between the enzyme and the tRNA [2]. In no case have we been unable to produce crystals. Analysis of the data from the experiment for adenosine deaminase produced an important indication for how to improve the crystals. We have therefore found the approach quite effective, even with only a crude, microscopic assay.

3.1. Factor profiles

A useful presentation of these results is as a histogram representing the most important β-coefficients (figs. 5–7). This device provides at a glance the “factor profile” obtained from the experiment. Our first application of the incomplete factorial design for screening conditions for crystal growth was that for tryptophanyl-tRNA synthetase ([1]; fig. 5). Using thirty-five experiments we found one condition ideally suited for growth of crystals of this protein and its complexes in several different conformational states. We have previously noted [1] several prominent features of this factor profile: a significant positive effect of PEG 6000, the surprising negative effect of Na+, and positive indications for both high and low ionic strength.

Fig. 5. Main effects for the crystallization of the tryptophanyl-tRNA synthetase from B. stearothermophilus. Histograms show the β-coefficients from multiple regression analysis of the initial incomplete factorial experiment.

Factor profiles for the two deaminase experiments are shown in fig. 6. Here it is interesting to note that only one of six factors common to both designs, Mg2+, shows effects of the same sign for both proteins. The profile for adenosine deaminase proved quite useful to us. The initial screening experiment produced several crystalline samples, but none of these was suitable for diffraction work. However, the profile indicated that K+ ion was a very positive factor. Therefore, for all experiments with a Q of 4 or higher which had cations other than K+, we replaced the original cation with K+. Experiment #7, changed only in this way, produced very nice looking single crystals. These diffracted to high resolution, but they were badly mosaic. Nevertheless, the substitution indicated by the factor profile produced a dramatic improvement in the crystal growth. Subsequent efforts to reduce mosaicity were not successful, and we suggest that this problem may result from microheterogeneity in the protein sample.

Fig. 6. Main effects for the crystallization of cytidine (E. coli) and adenosine (rat hepatoma) deaminases. These histograms result from principal component analysis [8] of the figures in columns 9 and 8, respectively, of table 2.

The pseudocatalase design factor profile (fig. 7) has two important cautionary lessons. First, protein concentration had a very significant impact on the outcome. Two concentrations were used: 28 mg/ml and 32 mg/ml. No crystals were observed at the lower concentration, whereas seven of twelve experiments at the higher concentration produced different crystal forms. Second, ammonium sulfate has a very negative effect, yet our best crystals are grown in ammonium sulfate! We suspect in this case that many of the precipitates given a low quality rating were in fact microcrystalline. Our inability to detect any crystals in experiments at the lower protein concentration may also have resulted from the fact that at the lower protein concentration the kinetics of nucleation and growth were such that microcrystalline precipitates formed, which were consequently evaluated as poor conditions. This conclusion is supported by the fact that conditions which produce the crystals we are using for structural studies produce granular precipitates at lower protein concentrations. This example points up the most serious defect of using the microscope to evaluate crystallization trials: precipitates which are actually microcrystalline may be misinterpreted, leading to false factor profiles. Thus, in this case, the profile turned out to be seriously misleading, although we were lucky enough not to have to use it.

Fig. 7. Main effects for the crystallization of L. plantarum pseudocatalase.

3.2. Conclusions regarding the use of incomplete factorial designs

These examples suggest three important conclusions:

(1) The incomplete factorial design constitutes a very efficient search protocol. With a balanced, randomized design testing crystallization conditions, one can maximize the chances of finding a “winning combination” of factors with a relatively small number of experiments. This efficient sampling of experimental space is relatively independent of how accurately one evaluates outcomes. It is amply demonstrated by our overall success rate. Using typically 10–15 mg of protein, we have grown crystals of tryptophanyl-tRNA synthetase, a manganese catalase, cytidine deaminase, adenosine deaminase, and the A subunit of E. coli UVRABC excinuclease. The tRNA search was carried out with about 6 mg of tRNA(Trp). In no case were more than thirty-five experiments used.

(2) The statistical structure of the designs leads to experiments that are maximally informative. Evaluating the outcome of each experiment in a design is definitely worthwhile, even if this means using microscopic examination and a crude quality scale such as that in table 3. Individual experiments are much more informative when analysed together with others from the rational design. This is because a single experiment does many tasks when it is a part of such a design: it contributes to all averages used to compare different levels of each factor. The underlying properties of averages imply that they will reflect differences due to different levels of a given factor almost equally well regardless of whether or not all experiments utilize the same levels for the remaining factors. The factors most important for growing crystals often show up as statistically significant “main effects”, which we have represented as “factor profiles” (figs. 5–7). The signal in the factor profiles is certainly noisy, but even a crude factor profile will suggest modifications of crystal growth conditions that can sometimes dramatically improve crystals. We found this to be so with adenosine deaminase.

(3) Microscopic examination is certainly subjective and quantitatively imprecise. However, a more serious qualitative defect is its inability to discriminate between true and microcrystalline precipitates. This weakness means that inferences from regression analyses can be dead wrong, as was the case with the pseudocatalase design. There are at least two possible ways to rectify this defect. First, microscopic examination can be done using polarized light, and precipitates can be screened according to their birefringence properties. We have not been able to utilize polarized light because so many of our experiments were done in plastic dialysis buttons, which do not lend themselves to examination of birefringence. Second, evaluation of Q can be done using the protein concentration dependence of hydrodynamic measurements which depend on the aggregate size in solution. Our experience using this approach is described in the next section.

4. The dilution curve is an accurate precrystallization assay

Kam, Shore, and Feher [3] described a different approach to assaying for crystallization. Their device, which we call the Dilution Curve, consists of measuring the protein concentration dependence (hence, dilution) of some property related to the aggregate size in solution. The idea is that precipitates and crystals differ in the number and distribution of intermolecular bonds formed in the aggregate, and hence that the aggregate size distributions of the two types of aggregates will have different dependences on protein concentration. The degree of cooperativity of the dilution curve is determined by the number and orientation of intermolecular contacts within aggregates. Kam et al. [3] provided a simple quantitative model for calculating dilution curves from two parameters, the equilibrium constant for association of two monomers to form a dimer, K1, and that for adding a monomer to an aggregate, Km. For a crystalline aggregate Km >> K1 and for a precipitate Km ≈ K1. The ratio Km/K1 is thus a quantitative indication of the tendency to crystallize, and hence an appropriate measure for Q. It is important to note that although we use dynamic light scattering to determine dilution curves, other techniques have been proposed, including electron microscopy [9] and fluorescence anisotropy [10].

There are several reasons why dilution curves can be superior to microscopic examination. In principle, the cooperativity of the dilution curve should be a more precise and a more faithful assay for tendencies of macromolecular solutions to precipitate or crystallize. Moreover, since it does not depend on achieving a macroscopically detectable result, this property should be rather insensitive to the kinetics of crystal growth or precipitation. It should therefore accurately discriminate between cases where kinetic factors drive the system to produce microcrystals and those where thermodynamic interactions between molecules are insufficient to produce crystals. Finally, our results [4] suggest that the shape of the dilution curve is also rather insensitive to the concentration of precipitant and hence to supersaturation.

Dilution curves can be determined with a rather modest amount of material, which will vary depending on the molecular weight of the protein. Since the autocorrelation spectrum is actually recorded only from the illuminated volume, sample size is limited essentially by practical aspects of the sample holder design. Our configuration makes use of Beckman cellulose propionate airfuge tubes, which were suggested to us by Z. Kam as a convenient microcuvette. The minimum sample volume, about 30 µl, is determined by the need for a path through the sample above the curvature at the bottom of the tube and below the solution meniscus, both of which increase scattered light. The scattered intensity and hence the autocorrelation spectrum signal depends linearly on protein concentration up to rather high concentrations where solutions lose ideality. The amplitude of the autocorrelation function itself increases with the fourth power of the molecular weight [11]. Thus for solutions of constant weight concentration, the signal increases with the square of the sample molecular weight. For a protein the size of lysozyme, solutions of 40–50 mg/ml are required in order that a ten-fold dilution (4–5 mg/ml) will still have detectable signal. The amount of protein required decreases dramatically as molecular weight increases.

These considerations suggest that the dilution curve is an ideal tool for separating thermodynamic determinants of crystal (or precipitate) growth from the kinetics of the process. We have therefore devoted considerable effort to implementing dilution curve measurement as a practical tool in our searches for crystallization conditions. We will report here two preliminary results from these efforts. First, we have developed a practical way to measure autocorrelation spectra from samples under dialysis (we thank H.W. Wyckoff for suggesting this development). This will facilitate determination of quasi-equilibrium, and is potentially useful as a way to monitor solutions intended for crystal growth for the purpose of controlling nucleation and growth phases. A second experiment suggests that the dilution curve does in fact correctly indicate the types of interactions present in solution.

4.1. Time-dependence of aggregating systems measured by dynamic light scattering

We have previously noted [4] that it is important to verify that the aggregate system under study has reached the phase, known as quasi-equilibrium [3], in which the aggregate size distribution is constant over time. In order to study the time dependence of aggregating systems, and to assess the relaxation of such systems in response to changes in solution conditions, we have adapted the Zeppezauer dialysis technique [13] to use in our light scattering instrument. There are numerous good discussions of the theory and practice of using dynamic light scattering to study the hydrodynamic properties of macromolecules [11,12]. Our apparatus is similar to that previously described [4]. The dialysis sample, in a tube approximately the diameter of a Beckman Airfuge tube, is fitted into a black Teflon mask within a 3.0 ml fluorescence cuvette filled with the dialyzing solution. The 3 mm slits of the Teflon mask serve to reduce scattered light. This assembly is placed in a temperature controlled spectrophotometer block and cooled by a Haake 516/ED circulating water bath. The block height is adjusted so that the laser beam passes through the sample just below the meniscus. Light scattering data are collected at 90° in the photon counting mode. A Spectra-Physics model 120 S laser is the light source. The autocorrelation function is constructed by a Langley-Ford autocorrelator, which is in turn interfaced to a MASSCOMP 530 computer. Autocorrelation curves, G(τ), are fitted by a non-linear least squares analysis program provided by Dr. Michael Johnson (University of Virginia) to a single exponential decay, G(τ) = A + B exp(−kτ), and the Z-average diffusion coefficient, Dz, is calculated from the refined parameter, k.

Use of this dialysis setup to determine the time-dependence of a precipitating lysozyme solution is shown in fig. 8. From the autocorrelation function in fig. 8a it is clear that we get satisfactory measurements from a dialyzing sample. The actual time course, fig. 8b, shows that it takes this sample roughly 19 h (68,000 s) to reach a state that can be called quasi-equilibrium.
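The fitting step can be sketched as follows. This is a minimal illustration and not the program supplied by Dr. Johnson: it fits G(τ) = A + B exp(−kτ) by scanning k and solving for A and B by linear regression at each trial value, then converts k to a diffusion coefficient. The conversion assumes an intensity autocorrelation from a single diffusing species, so that k = 2·D·q², with q = (4πn/λ) sin(θ/2). The refractive index (1.33) and wavelength (632.8 nm, a He-Ne line) are assumptions; only the 90° angle comes from the text. All function names are hypothetical, and the data are synthetic.

```python
import math


def fit_offset_exponential(tau, g, k_min, k_max, grid=200):
    """Least-squares fit of g(tau) = A + B*exp(-k*tau).

    For each trial k, A and B follow from ordinary linear regression of g
    on exp(-k*tau); k is located by a coarse logarithmic scan followed by
    golden-section refinement around the best scan point.
    """
    n = len(tau)
    g_mean = sum(g) / n

    def solve(k):
        x = [math.exp(-k * t) for t in tau]
        x_mean = sum(x) / n
        sxx = sum((xi - x_mean) ** 2 for xi in x)
        sxy = sum((xi - x_mean) * (gi - g_mean) for xi, gi in zip(x, g))
        b = sxy / sxx
        a = g_mean - b * x_mean
        ssr = sum((gi - a - b * xi) ** 2 for xi, gi in zip(x, g))
        return ssr, a, b

    ks = [k_min * (k_max / k_min) ** (i / (grid - 1)) for i in range(grid)]
    best = min(range(grid), key=lambda i: solve(ks[i])[0])
    lo, hi = ks[max(best - 1, 0)], ks[min(best + 1, grid - 1)]

    phi = (math.sqrt(5) - 1) / 2
    for _ in range(60):
        c = hi - phi * (hi - lo)
        d = lo + phi * (hi - lo)
        if solve(c)[0] < solve(d)[0]:
            hi = d  # minimum lies in [lo, d]
        else:
            lo = c  # minimum lies in [c, hi]
    k = 0.5 * (lo + hi)
    _, a, b = solve(k)
    return a, b, k


def scattering_vector(n_refr, wavelength_cm, angle_deg):
    """Magnitude of the scattering vector q, in cm^-1."""
    half_angle = math.radians(angle_deg) / 2
    return 4 * math.pi * n_refr * math.sin(half_angle) / wavelength_cm


def diffusion_coefficient(k, q):
    """Assumes k = 2*D*q**2 (intensity autocorrelation, one species)."""
    return k / (2 * q * q)


# Synthetic, noise-free data built from an assumed diffusion coefficient.
D_true = 5.87e-7                                # cm^2/s
q = scattering_vector(1.33, 632.8e-7, 90.0)     # assumed n and wavelength
k_true = 2 * D_true * q * q
tau = [i * 2.25e-6 for i in range(1, 101)]      # 2.25 to 225 microseconds
g = [1.0 + 0.4 * math.exp(-k_true * t) for t in tau]

A_fit, B_fit, k_fit = fit_offset_exponential(tau, g, 1e3, 1e6)
D_fit = diffusion_coefficient(k_fit, q)
print(f"k = {k_fit:.4g} s^-1, D = {D_fit:.4g} cm^2/s")
```

Separating the linear parameters (A, B) from the nonlinear one (k) reduces the search to one dimension, which keeps the sketch dependency-free; a general-purpose nonlinear least squares routine would serve equally well. With real, noisy data the fit would also need error estimates, which this sketch omits.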

Although more time-consuming, dialysis has an important advantage over batch preparations for measuring dilution curves. Dialysis experiments are easily adjusted if too little or too much precipitant has been added. We have found on occasion that adding precipitants directly to protein solutions can produce very large aggregates which are not characteristic of quasi-equilibrium. Dialysis allows the concentration of precipitant to be raised slowly, thereby avoiding such sources of irreproducibility. The usefulness of this approach for monitoring actual crystal growth experiments to control nucleation and growth phase is evident. It will be important to determine empirically just what time point along the evolution of a crystallizing solution corresponds to “nucleation”. Nevertheless, fig. 8b shows that the kinetics of aggregation can indeed be monitored successfully using dynamic light scattering under conditions (dialysis) subject to experimental control.

Fig. 8. Time course for the aggregate size distribution of lysozyme (30 mg/ml) during dialysis against 30% saturated ammonium sulfate, 10 mM sodium acetate, pH 4.2, 20°C. (a) The autocorrelation function at the end of the time course (81,000 s), plotted against correlation time in microseconds. The Z-average diffusion coefficient is 5.87 × 10⁻⁷ cm²/s, compared to a value of 10.6 × 10⁻⁷ cm²/s for the monomer (our own results and ref. [12]). (b) The time course for changes in aggregate size distribution, plotted against time in seconds × 10³. Each point is the best fit of the autocorrelation function at a time, t, after beginning dialysis against precipitant. Error bars are taken from the error of the fit.

4.2. Use of dilution curves to discriminate between precipitating and crystallizing solutions

The claim [3] that the degree of cooperativity of the dilution curve reflects the tendency of a solution toward crystallization rests on sound physical reasoning, but it is important also to ask to what extent this claim is supported by experimental data. We have found an excellent correlation between very cooperative dilution curves and conditions that produce crystals. For example, we have determined dilution curves for tryptophanyl-tRNA synthetase and pseudocatalase under their crystallizing conditions and found that in fact both systems show very cooperative dilution curves. For the synthetase Km/K1 has a value of around 200, whereas for pseudocatalase it is around 1000.

The phosphoglucomutase system described by Ray and Bracker [14] offers an unusual test of the ability of dilution curves to discriminate between precipitating and crystallizing conditions. In this system the presence of PEG-400 transforms a system that normally gives precipitates (about 53% saturated ammonium sulfate) into one that always gives crystals. The true nature of interactions in solution in the absence of PEG-400 is not entirely clear, however, because this solution will on occasion give rise to crystals. It appears therefore to be an example for which thermodynamic interactions in ammonium sulfate favor crystallization, but where the kinetics of nucleation and growth force the system toward a microcrystalline precipitate. If so, as suggested by Ray and Bracker [14], a natural question to ask is how the dilution curves of the two systems compare.

The systems are not entirely straightforward to assemble or manipulate, owing to the tendency of PEG to induce a phase separation. However, under appropriate conditions, the ternary system ammonium sulfate : protein : PEG-400 does exist as a homogeneous solution. Solutions with and without PEG-400 were prepared under these conditions and provided to us by Dr. Ray, so that we could measure the respective dilution curves. For several technical reasons our measurements correspond only approximately to those studied by Ray and Bracker; most notably, our dilution curves were measured at 16°C and not at 25°C, and solutions were diluted relative to those used to grow crystals, in order to prevent crystallization during the experiment. Nevertheless, we have determined that with these differences dilution curves under the two conditions are completely superimposable (fig. 9).

Fig. 9. Dilution curves for phosphoglucomutase under crystallizing and “precipitating” conditions, plotted against [PGM]·K1. Each point represents an average of 5–7 measurements of Dz, normalized to the diffusion coefficient of the monomer. The points for phosphoglucomutase in ammonium sulfate, which normally gives precipitates, are superimposable on those for ammonium sulfate plus PEG 400, which gives crystals. Both dilution curves have a value for Km/K1 of about 30, indicating a strong tendency to crystallize.

The result provides important confirmation of the proposal that PEG acts as a nucleation catalyst [14] in the following sense. PEG-400 in this system probably does not alter thermodynamic interactions between protein molecules, but changes rates of subunit addition to crystal nuclei that form in ammonium sulfate solutions. If, as seems likely, interactions in solution are thermodynamically comparable with and without PEG-400, the explanation for the different macroscopic results would seem to be that PEG-400 changes the relative rates at which nuclei form and grow, thereby changing the macroscopic result. It seems very likely from the discussion of Ray and Bracker [14] and from the near identity of the two dilution curves that precipitates formed in ammonium sulfate are really microcrystalline. This would mean that at nucleation there are vastly too many nuclei, so none is able to grow to macroscopic size. In the presence of PEG-400, either because PEG changes the effective protein concentration or because it creates a separate protein-rich phase [14], nucleation is impaired sufficiently to allow growth of large, single crystals.

This is an encouraging result; it shows that dilution curves are more likely to identify correctly those solution conditions that lead to crystal growth in cases where microcrystalline precipitates observed by microscopic examination would be confused with true precipitates. The dilution curve precrystallization assay should provide both more precise and more accurate data than microscopic examination. Factor profiles constructed from such data should thus be considerably more informative than those shown here, perhaps permitting a significant enhancement of optimization procedures.

More generally, the near identity of the two dilution curves in fig. 9 implies that an entire class of crystallization results, those leading to precipitation, probably includes many conditions conducive to crystallization but prevented by kinetic factors from yielding macroscopic crystals. Intervention to produce crystals from these conditions is therefore qualitatively different from that required to produce crystals from a solution in which the prevailing mode of aggregation is a true precipitate. This conclusion supports the idea proposed by Ray and Bracker [14] that PEG-400 may be a general purpose reagent for adjusting the kinetics of nucleation and growth for proteins crystallized from concentrated salt solutions. It would therefore make sense to include it as a factor in incomplete factorial screening designs.
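To make the role of the ratio Km/K1 more concrete, the sketch below uses a generic nucleation-elongation aggregation scheme, in which the concentration of n-mers is taken as K1·Km^(n−2)·c^n for n ≥ 2, with c the free monomer concentration. This is an assumed textbook-style model chosen for illustration; it is not claimed to be the exact formulation of Kam et al. [3], and it reports a weight-average aggregate size rather than a diffusion coefficient. Function names are hypothetical, and concentrations are expressed in units of 1/Km.

```python
def free_monomer(total, k1, km, iters=200):
    """Free monomer concentration from mass conservation, by bisection.

    Assumed model: [n-mer] = k1 * km**(n - 2) * c**n for n >= 2, so that
    total = c + (k1 / km**2) * (x / (1 - x)**2 - x), with x = km * c < 1.
    """
    def mass(c):
        x = km * c
        return c + (k1 / km ** 2) * (x / (1 - x) ** 2 - x)

    lo, hi = 0.0, min(total, (1 - 1e-12) / km)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mass(mid) < total:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def weight_average_size(total, k1, km):
    """Weight-average number of monomers per aggregate at a given total."""
    c = free_monomer(total, k1, km)
    x = km * c
    second_moment = c + (k1 / km ** 2) * (x * (1 + x) / (1 - x) ** 3 - x)
    return second_moment / total


# Compare a non-cooperative case (Km = K1) with a cooperative one
# (Km/K1 = 1000) at a low and a high total concentration.
for label, k1 in (("Km/K1 = 1", 1.0), ("Km/K1 = 1000", 1e-3)):
    low = weight_average_size(0.5, k1, 1.0)
    high = weight_average_size(2.0, k1, 1.0)
    print(f"{label}: size at 0.5 = {low:.3f}, size at 2.0 = {high:.3f}")

isodesmic_high = weight_average_size(2.0, 1.0, 1.0)
isodesmic_low = weight_average_size(0.5, 1.0, 1.0)
cooperative_low = weight_average_size(0.5, 1e-3, 1.0)
cooperative_high = weight_average_size(2.0, 1e-3, 1.0)
```

Under these assumptions the cooperative case stays almost entirely monomeric below a total concentration of 1/Km and then aggregates abruptly, whereas the non-cooperative case grows gradually. That abrupt onset is the kind of concentration dependence the text calls a very cooperative dilution curve.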



Acknowledgments

We thank Hal Wyckoff for his encouragement and for useful suggestions, and Jan Hermans for use of the laser assembly and autocorrelator. Anne Rich and Frank Hage provided assistance in assembling the light scattering instrumentation computer interface. This work was supported by NIH research grant GM 26203.

References

[1] C.W. Carter, Jr. and C.W. Carter, J. Biol. Chem. 254 (1979) 12219.
[2] C.W. Carter, Jr., D.C. Green, C.S. Toomim and L. Betts, Anal. Biochem. 151 (1985) 515.
[3] Z. Kam, H.B. Shore and G. Feher, J. Mol. Biol. 123 (1978) 539.
[4] E.T. Baldwin, K.V. Crumley and C.W. Carter, Jr., Biophys. J. 49 (1986) 47.
[5] R.A. Fisher, The Design of Experiments (Oliver and Boyd, Edinburgh, 1951).
[6] G.E. Box, W.G. Hunter and J.S. Hunter, Statistics for Experimenters (Wiley-Interscience, New York, 1978).
[7] C.D. Hendrix, Chemtech (March 1979) 167.
[8] L. Wilkinson, SYSTAT, The System for Statistics (SYSTAT, Inc., 2902 Central St., Evanston, IL 60601, USA, 1986).
[9] S.D. Durbin and G. Feher, presented at Intern. Conf. on Crystal Growth of Biological Macromolecules, FEBS Lecture Course, Bischenberg, France, July 1987.
[10] M. Jullien, presented at Intern. Conf. on Crystal Growth of Biological Macromolecules, FEBS Lecture Course, Bischenberg, France, July 1987.
[11] R. Pecora, in: Measurement of Suspended Particles by Quasi-Elastic Light Scattering, Ed. B.E. Dahneke (Wiley-Interscience, New York, 1983) pp. 3–30.
[12] C.S. Johnson and D.A. Gabriel, in: Spectroscopy in Biochemistry, Vol. III, Ed. J.E. Bell (CRC Press, Boca Raton, FL, 1981) pp. 177–272.
[13] M. Zeppezauer, Arch. Biochem. Biophys. 126 (1968) 564.
[14] W.J. Ray, Jr., and C.E. Bracker, J. Crystal Growth 76 (1986) 562.