Copyright © IFAC Computer Applications in Biotechnology, Osaka, Japan, 1998
DESIGN OF AN EXPERT SYSTEM FOR SELECTION OF PROTEIN PURIFICATION PROCESSES: COMPARISON BETWEEN DIFFERENT SELECTION CRITERIA
M.E. Lienqueo, J.C. Salgado and J.A. Asenjo
Centre for Biochemical Engineering and Biotechnology, Department of Chemical Engineering, University of Chile, Beauchef 861, Santiago, Chile
Abstract: We have implemented a hybrid expert system that combines expert rules and mathematical correlations to manipulate physicochemical databases for selecting the optimum sequence of operations for the purification of proteins. Two criteria were implemented to select the optimum purification sequence. One uses the separation selection coefficient (SSC) and the other uses the final level of purity. The sequences suggested by the expert system under both criteria have been investigated experimentally, and both have been shown to be valid for practical application, but the sequences suggested by the Purity criterion have fewer steps than those suggested by the SSC criterion. Copyright © 1998 IFAC
Keywords: Expert System, Protein Purification
2. The second stage, called Purification, takes a multi-protein solution and purifies the protein of interest to a high level of purity. It may include preconditioning, several high-resolution purification steps and, for therapeutic applications, final polishing (purity range 99-99.9%). The purification process of clinical biotechnological products accounts for 70 to 80% of the total production costs.
1. INTRODUCTION
Since the 1970s, the development of modern biology and recombinant DNA technology has made possible the production of a number of new biotechnological protein products. These new bio-products include those for therapeutic use (human growth hormones, antibodies, blood factors, vaccines and diagnostic tests, amongst others) and enzymes for industrial use (amylases, proteases and lipases).
The basic heuristic rules for downstream processing design are (Prokopakis and Asenjo, 1990; Asenjo and Patrick, 1990):
Advances in the production of a bio-product depend not only on innovations in molecular biology but also on downstream process innovations and optimization (Asenjo and Chaudhuri, 1996). Downstream processing of proteins can be divided into two stages:
1. Choose the separation process based on differences in physicochemical and/or biochemical properties. In the case of protein purification, physicochemical properties such as surface charge (titration curve), surface hydrophobicity, molecular weight and pI are considered. Biochemical properties such as biospecificity for different ligands and stability at different temperatures and pH values are also useful.
1. The first stage, called Recovery, includes the unit operations used to transform the broth into a solution ready to undergo high-resolution purification sequences. It may include harvesting, cell disruption if the product is intracellular, separation of solid debris, protein solubilization and refolding if the product is in inclusion bodies, and precipitation of nucleic acids if disruption was needed.
2. Eliminate first those proteins and compounds that are found in the greatest percentage.
3. Use a high-resolution step as soon as possible. In the case of protein purification, consider only chromatographic techniques. The techniques, ranked according to their efficiency, are:
a. Affinity
b. Ion Exchange
c. Hydrophobic Interaction
d. Gel Filtration
Deviation Factor (DF): this variable relates the difference between a property of the target protein (for example, the dimensionless retention time in a specific chromatographic technique) and the same property of the contaminant.
4. Do the most arduous purification step at the end of the process
$$DF = |Kd_{product} - Kd_{contaminant}| \qquad (2)$$
The selection of an efficient purification process is the limiting step in a downstream process. The steps are not usually chosen in a rational manner or by using the basic heuristic rules for protein separation. To overcome these problems, it has been proposed to use artificial intelligence tools such as Expert Systems (ES). The use of Expert Systems for solving biotechnological problems in the selection of downstream processes has been documented elsewhere: Protein Purification Advisor (Eriksson et al., 1991a), Reactive Planning P8 (Eriksson et al., 1991b), FPLC assistant (Pharmacia) and Prot_Ex (Lienqueo et al., 1996).
Dimensionless Retention Time (Kd): this variable represents the behaviour of the proteins in a separation carried out by gel filtration, ion exchange or hydrophobic interaction chromatography. Mathematical relationships for predicting Kd have been derived using the physicochemical properties of proteins (Lienqueo et al., 1996). These relationships are shown in Table 1.
Efficiency Factor (η): this parameter accounts for the unequal capability of the different separation processes to separate different proteins. Its value is constant for each type of separation and chromatographic material used, and it has to be measured experimentally. The results obtained for a number of proteins are shown in Table 1.
The present work discusses the implementation of two algorithms for selecting the optimal purification sequence in Prot_Ex. Prot_Ex is a hybrid Expert System that combines rules stated by experts with mathematical relationships, using physicochemical databases of the proteins to be separated. This expert system is structured in two parts:
Concentration Factor (θ_i): its value represents the relative concentration of each contaminant, and it affects the selection criteria. In this way, contaminants that are present in a high concentration have to be eliminated first.
1. The first part is directed at selecting the optimum recovery sequence for proteins. This subsystem could be well structured using expert rules that follow the heuristics used by the experts (Leser and Asenjo, 1992). This part of the system is based only on heuristic rules.
After determining which chromatographic technique gives the maximum SSC value, it is necessary to calculate the new concentration of all contaminants after the selected chromatographic technique has been applied and to construct a new database of contaminant concentrations. With this new database the system calculates the purity level, and this value is compared with the required level of purity. Steps are chosen in this way until the required level of purity is reached. Finally, the system creates a list with the defined sequence of operations.
1.1 SSC criterion (Separation Selection Coefficient)
The expert system will select the best process using the values of the SSC (Separation Selection Coefficient) calculated for each chromatographic technique and each contaminant protein. The best process will be the one with the highest SSC value. The relationship developed for this purpose was:
$$SSC = DF \cdot \eta \cdot \theta_i \qquad (1)$$
$$\theta_i = \frac{\text{Concentration of contaminant } i}{\text{Concentration of all contaminants}} \qquad (3)$$
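The SSC relationship can be illustrated with a short Python sketch. It assumes SSC is the product of the deviation factor, the efficiency factor and the concentration factor, as described in the text. The function names, the contaminant labels "A" and "B" and all Kd values are hypothetical and chosen only for illustration; only the efficiency factors (1.00 and 0.66) are taken from Table 1.

```python
def deviation_factor(kd_product, kd_contaminant):
    """DF: absolute difference between the Kd of the product and of a contaminant."""
    return abs(kd_product - kd_contaminant)


def concentration_factors(contaminant_conc):
    """theta_i: concentration of contaminant i over the total contaminant concentration."""
    total = sum(contaminant_conc.values())
    return {name: c / total for name, c in contaminant_conc.items()}


def best_ssc(kd_product, kd_contaminants, efficiency, contaminant_conc):
    """Return (SSC, technique, contaminant) for the highest SSC = DF * eta * theta_i."""
    theta = concentration_factors(contaminant_conc)
    best = None
    for technique, kd_p in kd_product.items():
        for name, kd_c in kd_contaminants[technique].items():
            ssc = deviation_factor(kd_p, kd_c) * efficiency[technique] * theta[name]
            if best is None or ssc > best[0]:
                best = (ssc, technique, name)
    return best


# Hypothetical Kd values for a target protein and two contaminants "A" and "B".
kd_product = {"anion_exchange": 0.30, "gel_filtration": 0.13}
kd_contaminants = {
    "anion_exchange": {"A": 0.05, "B": 0.28},
    "gel_filtration": {"A": 0.35, "B": 0.20},
}
efficiency = {"anion_exchange": 1.00, "gel_filtration": 0.66}  # eta values from Table 1
contaminant_conc = {"A": 2.0, "B": 2.0}  # mg/ml, assumed equal

best = best_ssc(kd_product, kd_contaminants, efficiency, contaminant_conc)
print(best)
```

With equal contaminant concentrations the concentration factors are identical, so the choice is driven by the deviation factor weighted by the efficiency of each technique.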
The amount of contaminant eliminated after a chromatographic step is given in Figure 1 for different situations. The variable L corresponds to the peak width (Table 1). Values of L are independent of the protein concentration under the normal conditions used in protein purification processes (Lienqueo et al., 1996).
2. The second part is directed at the purification of the target protein. Structuring this knowledge base has been more difficult than structuring that of the recovery section. For this reason, two criteria for selecting the best sequence of purification operations have been implemented:
Table 1: Expressions and parameters used for the SSC criterion and the Purity criterion
Chromatographic technique | Retention time Kd | Efficiency factor η | Peak width L
Anion Exchange | 7383 (Q·10^25/mw) / [1 + 15844 (Q·10^25/mw)] | 1.00 | 0.15
Cation Exchange | 5972 (Q·10^25/mw) / [1 + 17065 (Q·10^25/mw)] | 1.00 | 0.15
Hydrophobic Interaction | | 0.86 | 0.22
Gel Filtration | -0.4691 log mw + 2.3902 | 0.66 | 0.46
Where: Q represents surface charge [coulomb/molecule]; mw is molecular weight [Da].
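The ion exchange and gel filtration expressions for Kd in Table 1 can be written as small Python functions. This is a sketch under several assumptions that the table does not state explicitly: the ion exchange expressions are read as a ratio (numerator over 1 plus a second term), the logarithm is taken as base 10, and the charge is passed already scaled by 10^25 and as a magnitude, because the sign convention is not given. The function names are hypothetical.

```python
import math


def kd_anion_exchange(q_scaled, mw):
    """Kd for anion exchange; q_scaled is Q * 1e25 (assumed magnitude), mw in Da."""
    x = q_scaled / mw
    return 7383.0 * x / (1.0 + 15844.0 * x)


def kd_cation_exchange(q_scaled, mw):
    """Kd for cation exchange; same assumed conventions as for anion exchange."""
    x = q_scaled / mw
    return 5972.0 * x / (1.0 + 17065.0 * x)


def kd_gel_filtration(mw):
    """Kd for gel filtration; log is assumed to be base 10."""
    return -0.4691 * math.log10(mw) + 2.3902


# Molecular weights of BSA and thaumatin as listed in Table 2.
print(kd_gel_filtration(67000))
print(kd_gel_filtration(22200))
# Charge magnitude 1.68 (scaled by 1e25) with the molecular weight of BSA.
print(kd_anion_exchange(1.68, 67000))
```

Under these assumptions the ion exchange expressions saturate as the charge-to-mass ratio grows (towards 7383/15844 for anion exchange), and the gel filtration expression gives a smaller Kd for the larger protein.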
Where C1 is the contaminant concentration before a chromatographic step and C2 is the contaminant concentration after a chromatographic step.
1.2 Purity criterion
Considering that the most important value is the final purity level, and that we had now developed an algorithm to calculate the purity after a purification step, we implemented this criterion. It compares the final purity level obtained after each particular chromatographic technique has been applied.
The Purity concept is defined as:
$$\text{Purity} = \frac{\text{Concentration of the target protein}}{\text{Concentration of the proteins present}} \qquad (4)$$
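The purity definition, and a stepwise selection that picks the technique giving the highest purity at each stage, can be sketched in Python as follows. This is a minimal sketch, not the Prot_Ex implementation: the function names are hypothetical, and the fraction of each contaminant removed by each technique is an assumed input (in the paper it is derived from Figure 1, which is not reproduced here). The initial concentrations of 2 mg/ml are those of Table 2, so the resulting purities are illustrative and do not reproduce the reported values.

```python
def purity(conc, target):
    """Purity: concentration of the target protein over that of all proteins present."""
    return conc[target] / sum(conc.values())


def apply_step(conc, removed_fraction):
    """Return new concentrations after removing the given fraction of each protein."""
    return {name: c * (1.0 - removed_fraction.get(name, 0.0)) for name, c in conc.items()}


def select_by_purity(conc, target, steps, required, max_steps=10):
    """Greedily pick, at each stage, the step that gives the highest purity."""
    sequence = []
    current = dict(conc)
    while purity(current, target) < required and len(sequence) < max_steps:
        name, after = max(
            ((n, apply_step(current, frac)) for n, frac in steps.items()),
            key=lambda item: purity(item[1], target),
        )
        if purity(after, target) <= purity(current, target):
            break  # no step improves the purity any further
        sequence.append((name, purity(after, target)))
        current = after
    return sequence


# Initial concentrations in mg/ml (Table 2).
mixture = {"BSA": 2.0, "Ovalbumin": 2.0, "SBTI": 2.0, "Thaumatin": 2.0}
# Hypothetical removal fractions per technique, for illustration only.
steps = {
    "cation_exchange_pH6": {"Thaumatin": 1.0},
    "anion_exchange_pH7": {"Thaumatin": 1.0, "SBTI": 0.5},
    "hydrophobic_interaction": {"Ovalbumin": 1.0},
}

sequence = select_by_purity(mixture, "BSA", steps, required=0.6)
print(purity(mixture, "BSA"))
print(sequence)
```

Because every contaminant enters the purity calculation, a step that partially removes a second contaminant can be preferred over one that removes a single contaminant completely.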
After determining which chromatographic technique gives the highest purity level, the system chooses it as the technique to use at this step. It then compares the purity with that required. Steps are chosen in this way until the required level of purity is reached. Finally, the system creates a list with the defined sequence of operations.
In this work both criteria for selecting the optimal purification process have been compared. An example has been tested experimentally with a model protein mixture.
2. METHODS
2.1 Characterisation of Model Proteins
Four proteins were used: thaumatin (Tau), soybean trypsin inhibitor (SBTI), bovine serum albumin (BSA) and ovalbumin (Ova). All proteins and chemicals were from Sigma Chemical Co. (St Louis, Mo., USA). Thaumatin was a gift of 4F Nutrition, Northallerton, UK.
Figure 1 Criteria to determine the percentage of contaminant eliminated after a chromatographic step.
four proteins (BSA, SBTI, ovalbumin and thaumatin). The data on this mixture used in the consultations are given in Table 2.
Isoelectric Point and Titration Curves: Isoelectric points and titration curves were obtained with a Phast System (Pharmacia Biotechnology, Uppsala, Sweden) using PhastGel IEF 3-9 (linear pH gradient from 3 to 9). Details about programming the instrument and running the methods can be found in "Separation Technique File N° 100, IEF and electrophoretic titration curve analysis" (Pharmacia Biotechnology). The gels were developed using Coomassie Blue, as described in "Development Technique File N° 200" (Pharmacia Biotechnology).
Three different values for the final purity level were investigated (94%, 99% and 99.9%). The results obtained for 94% purity are shown in Table 3.
The SSC criterion selects a purification sequence based on the elimination of the contaminant that gives the highest SSC value. Its contribution is described by the product of the concentration factor (θ), the efficiency factor (η) and the deviation factor (DF). In those cases where all contaminants have the same concentration (equal concentration factor) and the efficiency is constant, DF is the variable that makes the main contribution. The SSC criterion is then based on the elimination of the contaminant whose properties differ most from those of the target protein. In this example, cation exchange chromatography at pH 6.0 is useful for eliminating thaumatin. Nevertheless, the purity achieved after this step is only 33%. However, if anion exchange chromatography at pH 7.0 is used, as suggested by the Purity criterion, it is possible to eliminate thaumatin and part of the SBTI, obtaining a purity of 64%. This has been confirmed experimentally, as shown in Figures 2 and 3.
This situation arises because the Purity criterion chooses the optimum chromatographic step considering all the contaminants present, whereas the SSC criterion considers only the contaminant that gives the highest SSC value. For this reason the chromatographic step chosen using the Purity criterion is the optimum for that stage and gives a higher purity than that obtained when the SSC criterion is used. In the example, the second step suggested by both criteria was hydrophobic interaction chromatography, which eliminates ovalbumin. Finally, the SSC criterion requires an additional step (anion exchange chromatography at pH 7.0) to eliminate SBTI and reach a final purity level of 97%.
Determination of molecular weight: Molecular weights were obtained with a Phast System using gradient PhastGel 8-25, following the SDS-PAGE protocol described in "Separation Technique File N° 110". The gels were developed with Coomassie Blue as described in "Development Technique File N° 200" (Pharmacia Biotechnology).
Hydrophobic Interaction Chromatography: FPLC (fast protein liquid chromatography) was used. The columns and their packing material were purchased from Pharmacia Biotechnology. An HR 5/5 column (10 * 100 mm) was packed with Phenyl-Sepharose Fast Flow. The buffer was Trizma base 20 mM, pH 7.0 (Buffer A). Each protein was dissolved in buffer A (2 mg/ml). All buffers were degassed with helium for 10 min. All protein and buffer solutions were filtered through a 0.2 µm membrane.
2.2 Validation
Anion Exchange Chromatography: A Q-Sepharose Fast Flow column with Bis-Tris propane 20 mM at pH 7.0 buffer was used.
Cation Exchange Chromatography: An S-Sepharose Fast Flow column with MES 50 mM at pH 6.0 buffer was used.
Hydrophobic Interaction Chromatography: A Phenyl-Sepharose Fast Flow column with Trizma base 20 mM at pH 6.0 plus (NH4)2SO4 1.5 M buffer was used.
3.2 Experimental investigation
The sequences suggested by the Purity and SSC criteria were investigated experimentally. The chromatograms for the purification sequence suggested by the SSC criterion are shown in Figure 2. Figure 2a shows the separation of thaumatin from the mixture. Figure 2b shows the separation of ovalbumin from the mixture. Finally, Figure 2c shows the separation of part of the SBTI from the mixture.
2.3 Criteria Implementation
Both algorithms were implemented using Nexpert Object™, a commercial shell from Neuron Data.
The chromatograms for the purification sequence suggested by the Purity criterion are shown in Figure 3.
3. RESULTS AND DISCUSSION
3.1 Purification of BSA
Consultations were carried out using both criteria (SSC criterion and Purity criterion) implemented in Prot_Ex for purification of BSA from a mixture of
Table 2: Physicochemical properties of the protein mixture
Protein | Initial concentration [mg/ml] | Molecular weight [Da] | Hydrophobicity [(NH4)2SO4, M] | Charge at pH 4.0, 5.0, 6.0, 7.0, 8.0 [Coulomb/molecule × 10^-25]
BSA | 2 | 67000 | 0.86 | 1.03, -0.14, -1.16, -1.68
Ovalbumin | 2 | 43800 | 0.54 | 1.40, -0.76, -1.65, -2.20, -2.36
SBTI | 2 | 24500 | 0.90 | 1.22, 1.94, -0.76, -1.54, -2.17, -2.13, -2.05
Thaumatin | 2 | 22200 | 0.89 | 1.90, 1.87, 0.91, 1.98
Where:
Molecular weight was measured by SDS-PAGE with Phast media in a Phast System.
Hydrophobicity was measured by hydrophobic interaction chromatography using a phenyl-superose gel. It is expressed as the concentration (M) of ammonium sulphate at the protein elution point.
Charge was measured by electrophoretic titration curve analysis with PhastGel IEF 3-9 in a Phast System.
Table 3: Sequences suggested by the expert system to obtain a purity above 94%
SSC criterion
Chromatography | Purity
Cation Exchange at pH 6.0 | 33.11%
Hydrophobic Interaction | 49.45%
Anion Exchange at pH 7.0 | 97.02%
Purity criterion
Chromatography | Purity
Anion Exchange at pH 7.0 | 63.65%
Hydrophobic Interaction | 94.5%
Fig. 2c Third step suggested by the SSC criterion. Anion Exchange Chromatography at pH 7.0. [Chromatogram: AU versus ml; peak labelled SBTI.]
Figure 3a shows the separation of thaumatin and part of the SBTI from the mixture. Figure 3b shows the separation of ovalbumin from the mixture. It can be seen that a percentage of SBTI (50%) is not eliminated in the previous step (Figure 3a).
Figures 2 and 3 clearly show that both schemes for purification of BSA suggested by the expert system are perfectly valid for this process.
Fig. 2a First step suggested by the SSC criterion. Cation Exchange Chromatography at pH 6.0. [Chromatogram: AU versus ml.]
Fig. 2b Second step suggested by the SSC criterion. Hydrophobic Interaction Chromatography. [Chromatogram: AU versus ml; peak labelled Ovalbumin.]
criteria are recommended as a first starting point for the experimental purification process, instead of the commonly used trial-and-error method.
Finally, it is possible to discover new knowledge using this expert system by establishing assumptions, implementing rules and testing their validity.
REFERENCES
Fig. 3a First step suggested by the Purity criterion. Anion Exchange Chromatography at pH 7.0. [Chromatogram: AU versus ml.]
Asenjo J.A. and Patrick I. (1990) Large-scale protein purification, in Protein Purification Applications: A Practical Approach. Harris E.L.V. and Angal S., Eds. IRL Oxford University Press, Oxford, U.K.
Asenjo J.A. and Chaudhuri J.B. (1996) Innovative separation methods in bioprocessing, in Separation Processes in the Food and Biotechnology Industries. Grandison A.S. and Lewis M., Eds. Woodhead Publishing Limited, U.K.
Eriksson H., Sandahl K., Forslund G. and Osterlund B. (1991a) Knowledge-based planning for protein purification. Chemometrics and Intelligent Laboratory Systems: Laboratory Information Management, 13: 173-184.
Fig. 3b Second step suggested by the Purity criterion. Hydrophobic Interaction Chromatography.
3.3 Knowledge Discovery
Eriksson H., Sandahl K., Brewer J. and Osterlund B. (1991b) Reactive planning for chromatography. Chemometrics and Intelligent Laboratory Systems: Laboratory Information Management, 13: 185-194.
The development of the present expert system has allowed us to find and compare two different criteria for the selection of optimal protein purification stages. Furthermore, this implementation has allowed us to identify the Purity criterion as the better one for choosing the most efficient purification step. This knowledge was not available before.
Leser E.W. and Asenjo J.A. (1992) The rational design of purification processes for recombinant proteins. J. Chromatogr. 584: 43-57.
Lienqueo M.E., Leser E.W. and Asenjo J.A. (1996) An expert system for selection and synthesis of multistep protein separation processes. Comp. Chem. Engng, 20: 189-194.
4. CONCLUSIONS
Both the Purity criterion and the SSC criterion suggest valid purification sequences that have been validated experimentally; however, the sequences suggested by the Purity criterion have fewer steps than those suggested by the SSC criterion. On a large scale it is necessary to obtain the highest possible yield while minimizing the resources used. One way to achieve this is to minimize the number of purification steps. The Purity criterion therefore suggests a purification sequence that is optimal and also more efficient than those suggested by the SSC criterion. This is particularly true when a large number of contaminants are present in similar concentrations.
Prokopakis G.J. and Asenjo J.A. (1990) Synthesis of downstream processes, in Separation Processes in Biotechnology. J.A. Asenjo, Ed. Marcel Dekker, New York.
ACKNOWLEDGEMENTS
Financial assistance from Fundación Andes and Vicerrectoría Académica, Universidad de Chile (Beca PG/080/97) is gratefully acknowledged. This project was also supported by Proyecto Fondecyt 1950620 and Proyecto Cátedra Presidencial en Ciencias.
When there are many contaminants, it is very important to decide on an appropriate initial step, because the first step is used to eliminate the main contaminants while the subsequent steps are used to eliminate contaminants of lesser importance. The sequences suggested by the expert system have been investigated experimentally using both criteria, and both have been shown to be valid. These