Chemical Engineering Journal 207–208 (2012) 822–831
Support vector regression models for trickle bed reactors

Shubh Bansal a, Shantanu Roy a,*, Faical Larachi b

a Department of Chemical Engineering, Indian Institute of Technology – Delhi, Hauz Khas, New Delhi 110 016, India
b Department of Chemical Engineering, Laval University, Quebec City, Quebec, Canada G1V 0A6

* Corresponding author. Tel.: +91 11 2659 6021; fax: +91 11 2658 1120. E-mail address: [email protected] (S. Roy).
Highlights
• Regression method based on Support Vector Machines.
• Data mined from over 22,000 experimental conditions from various authors.
• SVR method shows remarkably good predictability over this wide range of data.
• Improves on earlier neural network-based heuristic learning proposed by Larachi and co-workers [1].
• Scalable and extendable to other multiphase reactor systems.
Article info
Article history: Available online 31 July 2012
Keywords: Support Vector Machines (SVMs); Support Vector Regression (SVR); Correlations; Trickle bed reactors; Machine learning
Abstract
Transport phenomena in multiphase reactors are poorly understood, and first-principles modeling approaches have hitherto met with limited success. Industry thus continues to depend heavily on engineering correlations for variables like pressure drop, transport coefficients and wetting efficiencies. While immensely useful, engineering correlations typically show wide variations in their predictive capability when venturing outside the domain in which they were developed, and hence universally applicable correlations are rare. In this contribution, we present a machine learning approach for modeling such multiphase systems, specifically using the Support Vector Regression (SVR) algorithm. An application to trickle bed reactors is considered, wherein key design variables for which numerous correlations exist in the literature (with a large variation in their predictions) are all correlated using the SVR approach, with remarkable accuracy of prediction across all the different literature data sets with wide-ranging databanks.
© 2012 Elsevier B.V. All rights reserved.
1. Introduction

Estimation of various design variables in multiphase reactors has been a major challenge in the Chemical Engineering discipline since its early years. For design and scale-up, it is crucial to estimate properties such as heat and mass transport coefficients, wetting efficiencies and friction factors, as well as secondary functions like the discrimination of hydrodynamic flow regimes (e.g. [1,2]). Prior to the early 1990s, detailed modeling methodologies like computational fluid dynamics (CFD) and multi-scale and multi-physics approaches were not available; hence almost all industrial design and scale-up was based on correlations and, at best, phenomenological one-dimensional models. Two decades hence, we have made significant progress in multi-physics and multi-scale modeling approaches, and even commercially available packages like Comsol® Multiphysics, Ansys® Multiphysics and Fluent® profess the relative ease with which it is now possible to solve flow and transport problems in complex geometries. Be that as it may, these detailed modeling methodologies most often depend on closure models for the various phenomena occurring at different scales. And since progress in our understanding of the multi-scale physics is limited, these powerful numerical platforms are also limited in their predictions of key design variables. Thus, for all practical purposes, the technical basis of industrial design has remained at a status quo, with empirical correlations still continuing to be its workhorse. Effective mining of experimental data collected over large ranges, cast into simple, easy-to-use correlations, is still a desirable objective in all designs. The perennial question is: can the correlations themselves be made more predictive, in terms of accuracy and versatility, by grasping, via heuristic (or soft) modeling, the physical features concealed in wide-ranging databanks, while bypassing the need to construct ad hoc first-principles closures? A "parallel" approach to better physical models and better numerical algorithms for solving the first-principles transport equations (progress in which is welcome, and which may in the future become the workhorse of the design process) would thus be to develop rigorous methods that make correlation predictions more accurate over the data sets on which they were developed, and that also allow, if possible, extrapolation to situations in which data could not be or cannot be collected. This forms the motivation for this contribution.
Nomenclature

AARE  average absolute relative error, = (1/N) Σ(i=1..N) |ycalc,i − yexp,i| / yexp,i (–)
aLG  specific gas–liquid interfacial area (m²/m³)
C  SVM parameter (trade-off cost) (–)
dc  column diameter (m)
dp  equivalent diameter (m)
DG  diffusivity in gas phase (m²/s)
DL  diffusivity in liquid phase (m²/s)
FrL  liquid Froude number, = vSL²/(g dh) (–)
g  acceleration due to gravity (m/s²)
kLa  liquid phase side volumetric mass transfer coefficient (1/s)
kGa  gas phase side volumetric mass transfer coefficient (1/s)
ScL  liquid phase Schmidt number, = μL/(DL ρL) (–)
ScG  gas phase Schmidt number, = μG/(DG ρG) (–)
ShL  liquid phase Sherwood number, = kLa dh²/DL (–)
ShG  gas phase Sherwood number, = kGa dh²/DG (–)
Sb  bed correction factor, = a dh/(1 − ε) (–)
vSL or UL  liquid superficial velocity (m/s)
vSG or UG  gas superficial velocity (m/s)
x(i)  set of inputs merged into a vector, for the ith training sample (–)
xj(i)  jth feature/attribute/input of the ith training sample (–)
y(i)  output value/label corresponding to the ith training sample

Greek letters
ε  bed porosity (–)
σ  standard deviation of the relative (percentage) error of prediction, = [Σ(i=1..N) (|ycalc,i − yexp,i|/yexp,i − AARE)² / (N − 1)]^(1/2) (–)
φ  sphericity of packing particles (–)
σL  surface tension of liquid (N/m)
μ  dynamic viscosity (kg/m s)

It is generally well known that there is a wide variation in the predictions of the same dependent variable by correlations from different sources [1]. For example, a comparison of two of the best-known correlations for the gas-phase Sherwood number in trickle beds, from Refs. [3,4], with experimental values is reported in [1]. The correlation of Ref. [3] shows an AARE (average absolute relative error) of 54.9% (standard deviation of 141%), while that of Ref. [4] shows AAREs of 89.4% (standard deviation of 266%) and 37.2% (standard deviation of 167%), respectively, for the low interaction regime and the high interaction regime that prevail in trickle beds (see Fig. 5 in [1], which is reproduced as Fig. 1 here).
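These two statistics, AARE and σ as defined in the Nomenclature, are used throughout the paper to quantify correlation quality. As a concrete restatement of the definitions, the following minimal sketch (Python/NumPy; our illustration here, not code from [1]) computes both from vectors of predicted and experimental values:

import numpy as np

def aare_and_sigma(y_calc, y_exp):
    """AARE and the standard deviation of the relative error,
    exactly as defined in the Nomenclature."""
    y_calc = np.asarray(y_calc, dtype=float)
    y_exp = np.asarray(y_exp, dtype=float)
    rel_err = np.abs(y_calc - y_exp) / y_exp      # relative error per sample
    aare = rel_err.mean()                         # average absolute relative error
    sigma = np.sqrt(np.sum((rel_err - aare) ** 2) / (len(rel_err) - 1))
    return aare, sigma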
In other words, even the best and most popular correlations for various design parameters (the Sherwood number being only one illustrative example) have rather poor predictive capability when used outside the domains over which they were originally developed. In recent years, some researchers (e.g. [1]) have attempted to consolidate a large body of published experimental data for hydrodynamics and transport in multiphase reactors using artificial neural network (ANN) models. What these researchers did was to develop neural network-based models that were "trained" on a large body of data (from different, independently obtained datasets reported in the open literature), resulting in a heuristic learning tool that was able to predict the various design variables of choice with remarkable success. Fig. 1 clearly illustrates the improvement in predictability that could be achieved with the ANN approach [1]. Our current paper is inspired largely by this philosophy: an effective machine learning tool should be able to "wade through" a large body of empirical data and "learn" the trends, to the extent that its own predictability would far outweigh that of correlations developed on the individual data sets that constitute the data set in toto. We demonstrate this through a Support Vector Machines (SVM) based approach in the following sections. It is worth noting that the power and versatility of the SVM technique have already been applied to a limited number of Chemical Engineering applications, particularly in the context of process control and optimization [5–8], in the development of sensors [9] and in correlating data for gas holdup in bubble columns [10]. The last reference is in some ways similar to our current contribution, though in a different context, the current work being dedicated to multi-variable correlations in trickle bed reactors.

Fig. 1. Parity plot of predicted value of gas phase Sherwood number vs. experimental values obtained from all correlations in the database (adapted from [1]).

2. Relevant theory of Support Vector Machines (SVMs) and Regression (SVR)
Support Vector Machines, or SVMs [11], constitute a machine learning method that was conceived and gained popularity in the late 1990s, and they have many attractive features which seem to put them in a better position than artificial neural networks (ANNs) as learning tools for large sets of empirical data. In particular, they directly address the issue that commonly plagues ANN models, i.e., overfitting. SVR theory has been rigorously developed from computational theory, as opposed to ANNs, whose development has taken a more heuristic path [12]. Rather than minimizing empirical risk, or training error (as is done in ANNs), SVMs minimize structural risk [12]. In SVR, the objective function (for instance the error function which may need to be minimized) is convex, meaning that the global optimum is always reached. This is in sharp contrast to ANNs, where, for instance, the classical back-propagation learning algorithm is prone to convergence to "bad" local minima [12–14]. In practice, SVMs have greatly outperformed ANNs in a wide breadth of applications [12]. While a rigorous derivation of SVR data regression is outside the scope of this contribution and will be reported elsewhere (the derivation is along the same lines as the theory described in [12]), an illustration of the basic concept is in order. Suppose we want to model a quantity such as the specific gas–liquid interfacial area, aLG, as a function of some dimensionless groups, say the vector x = [ReL, ReG, WeL, WeG, FrL]. That is, suppose we had the following modeling problem, and we wanted to find a function f that is optimal in some sense:
aLG = f(ReL, ReG, WeL, WeG, FrL)    (1)
Conventionally, many correlations of relevance to Chemical Engineering are formulated by assuming a form of the function f that is a product of various powers of the dimensionless groups, pre-multiplied by a constant. Functions of this kind are inspired by the use of the Buckingham pi theorem for developing correlations. Notionally, such a form of f may be written as:
aLG = f(ReL, ReG, WeL, WeG, FrL) = k (ReL)^w1 (ReG)^w2 (WeL)^w3 (WeG)^w4 (FrL)^w5    (2)
Next, each element of the parameter vector [w1, w2, ..., w5] is estimated by varying its corresponding input (dimensionless group) while holding all other inputs fixed, and computing the slope of the best-fit line of log(aLG) vs. log(input_i). For instance, w1 can be found by varying only ReL while holding all of [ReG, WeL, WeG, FrL] fixed, using the following expression:

log(aLG) = w1 log(ReL) + log[(ReG)^w2 (WeL)^w3 (WeG)^w4 (FrL)^w5]    (3)
Note that when ReL is varied, w1 is estimated from the slope of the log(aLG) vs. log(ReL) graph, and the second term in Eq. (3) is merely the intercept (even though it in itself consists of dimensionless groups whose powers also have to be estimated). In numerical implementations, various approximations are used for defining the "best" estimate of the "slope", in order to estimate the exponents of the various dimensionless groups. Even though this has been regarded as a reasonable approach for decades, it clearly has some limitations. The first is that this is a very restricted functional space to explore, and the "true" relationship in our dataset may not really be best described by Eq. (2). Even if the functional form were optimal, this form of regression (as in Eq. (3)) will not in general give us the best set of parameters. Here, with respect to each input (on a logarithmic scale), the objective is to find a line that best represents the output's variation with respect to only that input. Note that even in multivariable regression, the various algorithms essentially follow this process (Eq. (3)), except that different variables may be optimized one at a time in different steps following some rules (depending on the algorithm used). In either case, when trade-offs exist (which they do, because different dimensionless groups may be affected by the same physical quantity, like viscosity or density), optimizing each wi independently may not result in the best estimate of the parameter vector w (= [w1, w2, ..., w5]). In fact, the conventional method of regression building highlighted above does not, in general, even minimize the error it is trying to minimize, i.e., the so-called training error, or the error over the data that is being used to build the correlation (which essentially "trains" the mathematical function f to give it predictive ability).
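As an aside, the whole of Eq. (2) can also be fitted in one shot, since taking logarithms makes it linear in [log k, w1, ..., w5]. The following minimal sketch (Python/NumPy, with placeholder data rather than the TBR databank) does exactly that by ordinary least squares; note that it minimizes only the training error, which is precisely the limitation discussed above:

import numpy as np

# Placeholder data: columns of X play the role of [ReL, ReG, WeL, WeG, FrL];
# y plays the role of the measured aLG. In practice these come from a databank.
rng = np.random.default_rng(1)
X = rng.lognormal(sigma=1.0, size=(200, 5))
y = 3.0 * X[:, 0] ** 0.5 * X[:, 4] ** -0.1 * rng.lognormal(sigma=0.05, size=200)

# Eq. (2) in log form: log y = log k + w1 log ReL + ... + w5 log FrL
A = np.column_stack([np.ones(len(y)), np.log(X)])
coef, *_ = np.linalg.lstsq(A, np.log(y), rcond=None)
k, w = np.exp(coef[0]), coef[1:]
print("k =", k, "exponents =", w)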
Fig. 2. Parity plot of predicted value of specific gas–liquid interfacial area for the same database. (a) Using ANN based correlation [1]. (b) Using SVM: Model 1 (Table 1) with dimensional value of aLG. (c) Using SVM: Model 2 (Table 1) with dimensionless value of aLG.
A more fundamental limitation of this approach is that it at best considers training error, and pays no heed to generalization error, i.e., it provides no probabilistic guarantee regarding its performance on new, unseen data. Thus, such correlations are almost never reliable for extrapolation into domains beyond the range in which the correlation was developed. In physical terms, this limitation applies not only to domains outside the operating conditions, but also to domains involving different materials and fluids. For instance, suppose our variable of interest enters a correlation through the Reynolds number. Not only is the reliability of the correlation rather limited for flow velocities outside the range of values in the data from which it was developed, it is also limited in its use for different kinds of fluids. As an example, if the correlation was developed using data from tests with air and water as the fluids, its extrapolation to a situation involving liquid hydrocarbons and hydrogen gas is rather questionable. The latter situation, as researchers will easily recognize, is a common limitation faced when "academic" correlations are developed on laboratory cold-flow units running on air–water while the intended use is in industrial reactors processing hydrocarbons.

SVR addresses this fundamental limitation of conventional correlations in the following manner. SVR only explores functions that are linear in the input space, i.e., functions of the following form; by design, this (almost always) guarantees a set of parameters w that minimizes the generalization error for a given data set [12–14]:

aLG = f(ReL, ReG, WeL, WeG, FrL) = w1 ReL + w2 ReG + w3 WeL + w4 WeG + w5 FrL + b    (4)
Compare this with the functional form in Eq. (2). In order to consider functions of the form considered earlier (Eq. (2)), one needs to transform the inputs. Thus, if in our original input space the input vectors were of the form x = [ReL, ReG, WeL, WeG, FrL], then in our transformed set of inputs, or mapped feature space, the inputs we consider are of the form:

φ(x) = [log ReL, log ReG, log WeL, log WeG, log FrL]    (5)

Now, instead of applying Support Vector Regression (SVR) on (x, aLG), we apply it on (φ(x), log aLG), so that the form of functions we will be exploring is:

log f = w1 log ReL + w2 log ReG + w3 log WeL + w4 log WeG + w5 log FrL + b    (6)
Table 1. SVR models for specific gas–liquid interfacial area (Fig. 2).

Model parameters and errors:
Model 1: Inputs: log [uL, uG, Dc, ε, dv, φ, ρL, μL, σL, ρG, μG]; Output: log aLG; Kernel: Gaussian (γ = 1, C = 64, ε = 0.03125); Support vectors: 758; Total samples: 1461; Training error: 8.05% (1061 points); Validation error: 12.77% (400 points).
Model 2: Inputs: log [ReL, ReG, WeL, WeG, ScL, StL, XG, MoL, FrL, EoM, Sb]; Output: log [aLG dh/(1 − ε)]; Kernel: Gaussian (γ = 1, C = 64, ε = 0.00391); Support vectors: 1369; Total samples: 1461; Training error: 10.9% (1061 points); Validation error: 17.2% (400 points).

Profile by dimensional quantities (percentiles: Min | 5% | 25% | 50% | 75% | 95% | Max):
aLG (m²/m³ reactor): 2.34E+01 | 7.69E+01 | 1.63E+02 | 3.30E+02 | 7.58E+02 | 2.45E+03 | 1.07E+04
uL (m/s): 5.81E-04 | 1.20E-03 | 3.60E-03 | 7.23E-03 | 2.12E-02 | 8.34E-02 | 1.26E-01
uG (m/s): 8.39E-03 | 2.77E-02 | 8.67E-02 | 2.50E-01 | 8.83E-01 | 2.00E+00 | 4.50E+00
Dc (m): 2.30E-02 | 2.30E-02 | 5.00E-02 | 5.00E-02 | 1.00E-01 | 2.00E-01 | 3.80E-01
ε (–): 2.63E-01 | 3.37E-01 | 3.74E-01 | 4.08E-01 | 6.70E-01 | 9.30E-01 | 9.40E-01
dv (m): 1.16E-03 | 1.16E-03 | 2.40E-03 | 3.37E-03 | 8.81E-03 | 2.04E-02 | 3.47E-02
φ (–): 1.26E-01 | 1.39E-01 | 4.55E-01 | 9.11E-01 | 1.00E+00 | 1.00E+00 | 1.00E+00
ρL (kg/m³): 8.05E+02 | 8.39E+02 | 1.00E+03 | 1.04E+03 | 1.08E+03 | 1.10E+03 | 1.12E+03
μL (Pa s): 6.32E-04 | 6.90E-04 | 1.12E-03 | 1.40E-03 | 1.55E-03 | 2.32E-02 | 6.60E-02
σL (N/m): 1.06E-02 | 2.56E-02 | 4.00E-02 | 6.28E-02 | 6.44E-02 | 7.40E-02 | 7.70E-02
DL (m²/s): 1.23E-10 | 2.40E-10 | 1.17E-09 | 1.60E-09 | 2.00E-09 | 3.97E-09 | 4.06E-09
ρG (kg/m³): 1.15E+00 | 1.17E+00 | 1.19E+00 | 1.24E+00 | 1.70E+00 | 1.26E+01 | 5.75E+01
μG (Pa s): 1.50E-05 | 1.71E-05 | 1.74E-05 | 1.78E-05 | 1.80E-05 | 1.80E-05 | 2.02E-05

Profile by dimensionless numbers (Min | 5% | 25% | 50% | 75% | 95% | Max):
a (dimensionless): 6.86E-09 | 2.52E-07 | 3.52E-06 | 1.89E-05 | 4.26E-04 | 5.96E-02 | 2.08E-01
ReL: 8.79E-02 | 1.18E+00 | 6.17E+00 | 2.31E+01 | 9.43E+01 | 3.08E+02 | 2.90E+03
ReG: 1.77E+00 | 5.47E+00 | 3.12E+01 | 1.28E+02 | 4.34E+02 | 1.89E+03 | 6.24E+03
WeL: 1.20E-05 | 9.42E-05 | 7.68E-04 | 4.57E-03 | 6.30E-02 | 4.09E-01 | 4.38E+00
WeG: 3.55E-06 | 7.78E-05 | 1.22E-03 | 8.36E-03 | 1.05E-01 | 1.18E+00 | 6.86E+00
ScL: 1.37E+08 | 1.50E+08 | 6.62E+08 | 8.39E+08 | 1.32E+09 | 1.05E+11 | 4.23E+11
StL: 1.95E-07 | 2.12E-06 | 1.39E-05 | 6.25E-05 | 2.18E-04 | 7.15E-03 | 7.03E-02
XG: 1.07E-02 | 7.09E-02 | 5.76E-01 | 1.46E+00 | 3.38E+00 | 1.11E+01 | 4.83E+01
MoL: 1.81E-11 | 3.59E-11 | 9.10E-11 | 1.36E-10 | 7.97E-10 | 2.16E-05 | 1.01E-02
FrL: 9.68E-06 | 3.65E-05 | 2.79E-04 | 9.89E-04 | 5.63E-03 | 3.68E-01 | 9.21E-01
EoM: 2.69E-02 | 6.64E-02 | 5.25E-01 | 7.72E-01 | 9.62E+00 | 1.07E+02 | 1.59E+02
Sb: 1.14E-06 | 2.16E-06 | 1.20E-05 | 2.35E-05 | 8.21E-04 | 9.34E-02 | 1.79E-01

Trends by dimensional quantities (average aLG over percentile bins: 5% | 25% | 50% | 75% | 95% | Max):
uL: 173.89 | 314.73 | 428.55 | 501.16 | 1183.67 | 2197.52
uG: 226.57 | 354.93 | 528.21 | 986.32 | 832.79 | 443.83
Dc: 536.98 | 838.08 | 1013.72 | 815.21 | 401.67 | 284.05
ε: 306.65 | 495.55 | 851.04 | 958.57 | 257.26 | 234.79
dv: 299.52 | 954.32 | 805.02 | 578.72 | 248.69 | 192.37
φ: 222.93 | 268.16 | 896.54 | 688.16 | 706.61 | 706.61
ρL: 763.76 | 792.01 | 418.38 | 515.57 | 514.06 | 1563.43
μL: 211.36 | 388.76 | 622.12 | 566.61 | 990.77 | 1075.10
σL: 998.36 | 519.43 | 914.98 | 392.71 | 572.09 | 275.08
DL: 1649.18 | 507.68 | 626.64 | 684.62 | 370.81 | 258.98
ρG: 191.32 | 223.57 | 430.07 | 783.56 | 1320.00 | 384.77
μG: 330.73 | 458.51 | 497.48 | 770.73 | 1002.19 | 931.35

Trends by dimensionless numbers (average dimensionless area over percentile bins: 5% | 25% | 50% | 75% | 95% | Max):
ReL: 2.72E-05 | 1.81E-05 | 8.15E-04 | 5.71E-03 | 1.67E-02 | 6.52E-02
ReG: 3.35E-06 | 1.56E-03 | 6.42E-05 | 1.35E-03 | 2.11E-02 | 7.16E-02
WeL: 2.55E-05 | 1.08E-03 | 2.26E-03 | 1.21E-02 | 1.94E-02 | 9.33E-03
WeG: 2.34E-05 | 2.99E-05 | 4.12E-05 | 3.85E-03 | 2.01E-02 | 5.90E-02
ScL: 1.33E-06 | 2.85E-02 | 7.04E-03 | 5.59E-03 | 1.31E-04 | 9.27E-06
StL: 2.22E-02 | 3.00E-02 | 4.59E-03 | 1.57E-04 | 2.22E-05 | 1.17E-05
XG: 4.35E-03 | 8.42E-04 | 3.56E-03 | 1.14E-02 | 1.47E-02 | 2.04E-02
MoL: 3.86E-02 | 1.89E-02 | 1.05E-02 | 1.31E-04 | 1.02E-04 | 3.51E-05
FrL: 6.88E-03 | 7.54E-03 | 4.66E-03 | 1.87E-02 | 3.90E-03 | 1.59E-05
EoM: 9.92E-08 | 7.66E-06 | 1.09E-05 | 1.54E-04 | 2.15E-02 | 6.37E-02
Sb: 9.92E-08 | 5.85E-06 | 1.19E-05 | 1.47E-04 | 2.19E-02 | 8.65E-02
which is equivalent to exploring the following form:

f = 10^b (ReL)^w1 (ReG)^w2 (WeL)^w3 (WeG)^w4 (FrL)^w5    (7)

Note that for k > 0 (Eq. (2)), Eq. (7) represents precisely the same family of functions as described by Eq. (2). Hence, we can make SVR learn precisely the same kind of correlations considered in the conventional approach; however, use of the form in Eq. (4) ensures that we are optimizing the generalization error, and optimizing it globally (not locally, which might happen even with advanced regression and pattern recognition techniques such as artificial neural networks (ANNs)). We have so far considered only one kind of input transformation (which, in machine learning parlance, is referred to as a feature mapping), i.e., one that maps the inputs logarithmically (Eq. (5)). There are many other transformations that work for different kinds of problems, such as polynomial and sigmoidal maps. These maps can be realized efficiently using different kernels. In particular, there exists the so-called Gaussian kernel [12–14], which maps each input vector to a vector with infinitely many terms. A detailed discussion of this kernel is beyond the scope of this short paper, but it suffices to say that it is extremely powerful and can adapt to almost any kind of non-linear relationship in data. Normally, Gaussian-kernel SVR is characterized by three parameters: C, γ and ε. It is arguably the Gaussian kernel that lends SVR its true power, and it is of great relevance here since we do not know the structure of the parameter space a priori in the context of the Chemical Engineering correlation-building exercise.

3. Implementation

LibSVM [15] (software available at: http://www.csie.ntu.edu.tw/~cjlin/libsvm) is a popular tool for SVM training that implements most of the functions required by the work contained in this paper. Thus, it was the platform on which the current work was implemented. The essential steps in the implementation for the current problem are described below, with a short end-to-end sketch following the list of steps.
Step 1: Assembling. Assemble a training data set. Ideally, the data instances should be randomly and independently sampled, and should number at least (about) ten times the number of input attributes. More practically, make sure that the training set reasonably samples the entire range of inputs over which the model or correlation is to be used.
Step 2: Take logarithms of every value in the dataset, both inputs and outputs. This serves several purposes. First, it roughly imposes a percentage-insensitive loss on the output. Second, correlation forms such as Eq. (2) become simpler (linear) in terms of logarithms and thus admit good best-fit estimates; in fact, the latter is one of the reasons why conventional correlations of the type of Eq. (2) are popular in the first place. It also imposes non-negativity constraints on the output and input variables, something that is required in most physical situations.
Step 3: Scaling. Normalize all inputs to some common range (a typical range could be [−1, 1]). This prevents larger-magnitude inputs from dominating the optimization. Another variant of normalization, called standardization, which scales each feature to zero mean and unit variance, often works better in the presence of outliers [12].
Step 4: Select features. It is well known that identifying and removing noisy inputs from the dataset can lead to significantly superior models. In this context, we propose that a logical method to choose a subset from the set of possible inputs (as has been used in this work) is to choose the subset that leads to a model with the lowest cross-validation error (CVE). If the set of inputs is small enough (say at most about 10 input dimensionless groups, which is most often the case in building chemical engineering correlations), an exhaustive approach becomes feasible.
Fig. 3. Parity plot of predicted value of mass transfer coefficient for the same database. (a) Using ANN based correlation [1]. (b) Using SVM: Model 1 (Table 2) with dimensional value of kLa. (c) Using SVM: Model 2 (Table 2) with dimensionless value of ShL.
We propose that the most robust way to identify the subset with the least CVE is then a brute-force search over all possible subsets. Alternatively, instead of optimizing the CVE alone, if it is required to impose some constraints on the correlation or model, one can add to the objective function a penalty for deviation of the model or correlation prediction from physical expectations, e.g., in the form of a priori topological–phenomenological rules. This is one of the ways in which physical constraints can be used to guide learning algorithms [16].
Step 5: Select kernel and model parameters. If the form of the function is known a priori (for instance, Eq. (2)), and only the function parameters are to be regressed, it is possible to "hand tune" a kernel that only allows the relevant functions. For instance, if we know that the (logarithm of the) friction factor depends linearly on the (logarithm of the) Reynolds number in the laminar region for flow through ducts, then it is possible to choose a kernel function accordingly. If the form of the functional relationship in the model or correlation is not known a priori (such as Eq. (2)), then popular kernels may be implemented, such as the linear, polynomial and Gaussian kernels. In this work, a Gaussian kernel has been implemented [12–14]. A popular way of optimizing model parameters and kernel parameters is the so-called grid-search CVE minimization [17]. In this search method too, it is possible to modify the objective function so as to preferentially choose physically consistent models.
Step 6: Use cross-validation, as described above, to find the best estimates of the parameters (such as the unknowns in Eq. (6)).
Step 7: Test the fitted model against a validation data set (which should be distinct from the training data set).
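To make Steps 1–7 concrete, here is a minimal end-to-end sketch in Python. It uses scikit-learn (whose SVR wraps LIBSVM internally) rather than the LibSVM command-line tools used in this work, the data are synthetic placeholders, and the grid values merely echo the parameter magnitudes seen in Tables 1–3; Step 4 (the subset search over inputs) is omitted for brevity:

import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVR

# Step 1: assemble the training set. X (one column per input group) and
# y (the output, e.g. aLG) are synthetic placeholders, not the TBR databank.
rng = np.random.default_rng(0)
X = rng.lognormal(sigma=1.0, size=(1461, 5))
y = 50.0 * X[:, 0] ** 0.4 * X[:, 1] ** 0.2 * rng.lognormal(sigma=0.1, size=1461)

# Step 2: take logarithms of all inputs and of the output.
X_log, y_log = np.log10(X), np.log10(y)

# Hold out a validation set, distinct from the training set (used in Step 7).
X_tr, X_val, y_tr, y_val = train_test_split(X_log, y_log,
                                            test_size=400, random_state=0)

# Steps 3 and 5: scale inputs to [-1, 1] and choose the Gaussian (RBF) kernel.
model = make_pipeline(MinMaxScaler(feature_range=(-1, 1)), SVR(kernel="rbf"))

# Steps 5 and 6: grid-search C, gamma and epsilon by cross-validation.
grid = GridSearchCV(model,
                    param_grid={"svr__C": [1.0, 4.0, 16.0, 64.0],
                                "svr__gamma": [0.25, 0.5, 1.0],
                                "svr__epsilon": [0.004, 0.01, 0.03]},
                    cv=5)
grid.fit(X_tr, y_tr)

# Step 7: test against the validation set; report AARE in linear units.
pred, actual = 10.0 ** grid.predict(X_val), 10.0 ** y_val
aare = np.mean(np.abs(pred - actual) / actual)
print(grid.best_params_, f"validation AARE = {aare:.1%}")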
4. Results and discussion

The methodology for using SVMs in correlation building, with the necessary background as described in Section 2, is sufficient to develop correlations for any chemical engineering process. Indeed, an instance of this has already been reported [10], wherein gas holdup in bubble columns was correlated using SVMs. In this work, we present an implementation of SVMs for developing generalized correlations for the fluid dynamic and mass transfer characteristics of trickle bed reactors (TBRs), by using historical databases (22,000 experiments in all, across different parameters) on TBR performance reported in the open literature. This is philosophically along the same lines as [1], except that we find remarkable improvement in overall prediction with the use of SVMs.
Table 2. SVR models for liquid side mass transfer coefficient (Fig. 3).

Model parameters and errors:
Model 1: Inputs: log [uL, uG, Dc, ε, dv, φ, ρL, μL, σL, ρG, μG]; Output: kLa; Kernel: Gaussian (γ = 0.5, C = 32, ε = 0.00781); Support vectors: 802; Total samples: 902; Training error: 12.5% (600 points); Validation error: 18.3% (302 points).
Model 2: Inputs: log [ReL, ReG, WeL, WeG, ScL, StL, XG, MoL, FrL, EoM, Sb]; Output: ShL; Kernel: Gaussian (γ = 1.0, C = 64, ε = 0.03125); Support vectors: 552; Total samples: 902; Training error: 11.3% (600 points); Validation error: 17.04% (302 points).

Profile by dimensional quantities (Min | 5% | 25% | 50% | 75% | 95% | Max):
kLa (1/s): 4.25E-04 | 5.01E-03 | 1.70E-02 | 4.12E-02 | 1.79E-01 | 1.17E+00 | 7.04E+00
uL (m/s): 6.67E-07 | 1.08E-03 | 4.65E-03 | 9.57E-03 | 2.29E-02 | 8.57E-02 | 1.49E-01
uG (m/s): 1.50E-03 | 2.45E-03 | 5.91E-02 | 1.48E-01 | 4.90E-01 | 2.00E+00 | 4.50E+00
Dc (m): 1.58E-02 | 2.50E-02 | 5.00E-02 | 5.10E-02 | 1.14E-01 | 1.72E-01 | 1.72E-01
ε (–): 3.56E-01 | 3.56E-01 | 3.85E-01 | 4.00E-01 | 5.90E-01 | 8.90E-01 | 8.90E-01
dv (m): 5.41E-04 | 1.77E-03 | 2.40E-03 | 3.63E-03 | 6.00E-03 | 1.77E-02 | 2.04E-02
φ (–): 1.33E-01 | 1.45E-01 | 5.62E-01 | 9.11E-01 | 1.00E+00 | 1.00E+00 | 2.33E+02
ρL (kg/m³): 6.91E+02 | 8.06E+02 | 9.98E+02 | 1.00E+03 | 1.08E+03 | 1.17E+03 | 1.17E+03
μL (Pa s): 6.30E-04 | 8.00E-04 | 1.00E-03 | 1.20E-03 | 1.54E-03 | 2.15E-02 | 2.48E-02
σL (N/m): 1.06E-02 | 2.53E-02 | 4.77E-02 | 7.20E-02 | 7.20E-02 | 7.56E-02 | 7.56E-02
DL (m²/s): 4.76E-11 | 2.27E-10 | 1.60E-09 | 1.82E-09 | 2.29E-09 | 2.89E-09 | 9.79E-09
ρG (kg/m³): 1.19E-01 | 1.12E+00 | 1.15E+00 | 1.24E+00 | 1.27E+00 | 2.30E+00 | 3.68E+01
μG (Pa s): 8.21E-06 | 1.45E-05 | 1.71E-05 | 1.77E-05 | 1.80E-05 | 1.80E-05 | 1.82E-05

Profile by dimensionless numbers (Min | 5% | 25% | 50% | 75% | 95% | Max):
ShL (–): 4.73E-03 | 6.69E+00 | 3.08E+01 | 2.59E+02 | 2.61E+03 | 3.10E+04 | 9.55E+04
ReL: 1.00E-03 | 4.46E-01 | 7.16E+00 | 3.09E+01 | 9.09E+01 | 6.40E+02 | 1.58E+03
ReG: 1.24E-02 | 9.29E-01 | 1.31E+01 | 5.22E+01 | 1.70E+02 | 1.39E+03 | 5.88E+03
WeL: 2.90E-11 | 5.03E-05 | 1.15E-03 | 6.57E-03 | 5.85E-02 | 6.24E-01 | 3.67E+00
WeG: 5.62E-09 | 5.58E-07 | 2.76E-04 | 2.11E-03 | 3.22E-02 | 8.79E-01 | 6.51E+00
ScL: 1.03E+02 | 3.51E+02 | 5.24E+02 | 6.13E+02 | 7.31E+02 | 7.66E+04 | 4.89E+05
StL: 7.60E-09 | 1.97E-06 | 2.06E-05 | 8.62E-05 | 2.62E-04 | 9.06E-03 | 6.83E-02
XG: 2.14E-03 | 1.78E-02 | 1.99E-01 | 6.65E-01 | 1.96E+00 | 9.31E+00 | 2.16E+02
MoL: 1.08E-11 | 1.12E-11 | 2.63E-11 | 5.44E-11 | 3.31E-09 | 1.78E-05 | 3.18E-05
FrL: 1.51E-11 | 2.43E-05 | 4.68E-04 | 2.17E-03 | 9.68E-03 | 2.27E-01 | 9.64E-01
EoM: 2.73E-02 | 1.50E-01 | 4.96E-01 | 7.72E-01 | 4.09E+00 | 5.94E+01 | 1.33E+05
Sb: 6.17E-01 | 2.53E+00 | 2.74E+00 | 3.28E+00 | 1.02E+01 | 2.15E+02 | 2.40E+02

Trends by dimensional quantities (average kLa over percentile bins: Min | 5% | 25% | 50% | 75% | 95% | Max):
uL: 0.00 | 0.01 | 0.03 | 0.05 | 0.14 | 0.55 | 1.80
uG: 0.00 | 0.01 | 0.04 | 0.09 | 0.25 | 0.80 | 0.10
Dc: 0.04 | 0.05 | 0.48 | 0.50 | 0.17 | 0.09 | 0.04
ε: 0.03 | 0.03 | 0.06 | 0.20 | 0.49 | 0.19 | 0.03
dv: 0.01 | 0.88 | 0.58 | 0.25 | 0.10 | 0.13 | 0.03
φ: 0.03 | 0.03 | 0.21 | 0.51 | 0.17 | 0.18 | 0.18
ρL: 0.00 | 0.03 | 0.31 | 0.12 | 0.13 | 0.40 | 0.33
μL: 0.02 | 0.05 | 0.04 | 0.13 | 0.14 | 0.56 | 0.72
σL: 0.03 | 0.62 | 0.49 | 0.17 | 0.05 | 0.11 | 0.33
DL: 0.03 | 0.66 | 0.30 | 0.31 | 0.22 | 0.20 | 0.03
ρG: 0.00 | 0.04 | 0.08 | 0.20 | 0.17 | 0.46 | 1.14
μG: 0.00 | 0.04 | 0.18 | 0.13 | 0.37 | 0.55 | 0.50

Trends by dimensionless numbers (average ShL over percentile bins: Min | 5% | 25% | 50% | 75% | 95% | Max):
ReL: 0.00 | 63.94 | 637.78 | 961.09 | 1353.76 | 13377.54 | 30272.60
ReG: 0.01 | 23.03 | 93.71 | 709.83 | 3100.60 | 8522.82 | 40159.92
WeL: 0.01 | 330.29 | 206.85 | 997.42 | 3507.32 | 14399.75 | 12848.76
WeG: 0.01 | 22.80 | 88.63 | 1395.35 | 2547.03 | 8688.75 | 39906.04
ScL: 0.06 | 24251.88 | 10323.43 | 3549.78 | 1540.89 | 1203.84 | 3603.12
StL: 0.09 | 7382.60 | 13139.75 | 3191.02 | 2509.96 | 740.72 | 4341.14
XG: 100.28 | 276.55 | 3335.67 | 2216.22 | 3501.01 | 10514.69 | 10006.43
MoL: 24.80 | 30071.75 | 11490.97 | 3093.68 | 1316.76 | 343.48 | 2494.86
FrL: 0.09 | 713.80 | 1478.39 | 4204.82 | 7391.72 | 6277.79 | 3942.60
EoM: 0.51 | 64.53 | 1048.73 | 710.76 | 1262.97 | 11660.12 | 44212.62
Sb: 34.53 | 206.17 | 127.98 | 483.81 | 1914.05 | 10750.30 | 46973.75
In [1], the researchers presented an extensive trickle bed flow database built up by consolidating experimental information from over 150 literature references spanning half a century. In doing so, the authors consolidated information over many gas–liquid systems (more than 30), over 30 column diameters, over 40 different packings and an equally large number of packing bed heights. In all, this represented over 22,000 experiments. Operating conditions ranged up to high pressure and temperature (10 MPa and 350 °C), with Newtonian, non-Newtonian, coalescing, non-coalescing, organic, pure, mixed and aqueous liquids. The data also spanned the various flow regimes, viz. the low interaction regime (LIR) and the high interaction regime (HIR). Arguably, mining this large database was an extensive and commendable task. The central idea of that paper was that in multiphase reactors such as trickle bed reactors, the available correlations lead to a wide diversity of results; thus, it is almost impossible to use any one correlation (or even a combination of a few) to make predictions of system design parameters with reliable accuracy. For instance, Fig. 1 clearly shows that for the gas-phase Sherwood number (dimensionless mass transfer coefficient), the predictions from the correlations of [3,4] deviate from experimental values with unacceptable (several hundred percent) standard deviations. Iliuta et al. [1] reported an ANN model which successfully reduced this variability to around 10%, a big improvement on what individual correlations could predict, with wide applicability through the entire database of 22,000 experimental conditions. It is however notable that not all parameters are populated in every observation. For our aLG model, for instance, the data points available in the entire database reported in [1] (which are individual data collected from different research works) number 1461 observations, which is still a very large number.

The same experimental database as in [1] was used in the current work, and SVMs were implemented to achieve the same task as in [1], using the logic outlined in Section 3. Fig. 2 shows the first of the comparisons, attempting to predict the specific gas–liquid interfacial area aLG in TBRs, a variable that is not only very important to estimate for design, but also one which conventional correlations, including those using ANNs [1], have found particularly difficult to model universally. A parity plot that summarizes the performance of popular "general" correlations, and of the ANN trained by Iliuta et al. [1], is presented in Fig. 2a. The important point to note from Fig. 2a is that the AARE of popular correlations was found to be as large as 148% when applied to the entire dataset, with an even higher standard deviation of as much as 356% in the so-called "low interaction regime". ANNs were able to improve prediction performance impressively, reducing the AARE to 28.1% with a standard deviation of 37%. Table 1 presents a summary of the SVR models that we have trained, and Fig. 2b and c present parity plots demonstrating their performance. We note a significant improvement in predictability compared to previous models: for instance, the prediction error reduced from 28.1% (for the ANN model trained on precisely the same dataset) to 12.8%. Note that in Table 1 and Fig. 2b and c, two different SVM regression models (or simply, SVR models) have been presented.

Model 1, which is designed to predict the dimensional gas–liquid interfacial area, aLG (m²/m³), is significantly better than Model 2, which is designed to predict the dimensionless gas–liquid interfacial area, aLG dh/(1 − ε). The reader should also note that the number of support vectors (which is an indicator of model complexity) is much larger for Model 2 than for Model 1, indicating that Model 2 is the worse model on both counts. Our next variable of interest was the liquid side mass transfer coefficient kLa, which in dimensionless terms is expressed as the liquid-side Sherwood number ShL. Fig. 3a shows the performance of previous models for the liquid phase Sherwood number, as reported in [1]. As for Fig. 2, the two kinds of SVR models were trained, and the results are summarized in Table 2. Their performance is demonstrated through parity plots in Fig. 3b and c, respectively. In this case, we find that Model 2 (which is based on the dimensionless number ShL being the output variable of the SVR) performs marginally better. One notes from Table 2 that even though the predictability performance of Models 1 and 2 is similar, the number of support vectors required to reach the optimal predictability in the case of Model 2 is almost two-thirds of that for Model 1. Some explanation for this is provided below. Finally, we present Fig. 4, the prediction of the gas side mass transfer coefficient kGa, which in dimensionless terms is expressed as the gas-side Sherwood number ShG. This should be viewed in the context of Fig. 1, which presents the best-case predictability from the ANN model of [1], along with a comparison with all the models available in the open literature. The parameters of the SVR models are presented in Table 3. Again, one notes in Fig. 4 significant improvements over anything previously presented in the open literature; notably, both the dimensional and the dimensionless descriptions lead to comparable optimal SVR models. From Table 3 and Fig. 4, it is clear that not only is the "goodness of fit" remarkable, but the required number of support vectors is also small (of the order of 300 or so). With the idea of providing some backup information regarding our SVR models, in Tables 1–3, in addition to the model details
Fig. 4. Parity plot of predicted value of mass transfer coefficient for the same database. (a) Using ANN based correlation [1]. (b) Using SVM: Model 1 (Table 3) with dimensional value of kGa. (c) Using SVM: Model 2 (Table 3) with dimensionless value of ShG.
described above, there are two tables of profiles, by dimensional and dimensionless quantities, in each case, as well as trends with dimensional and dimensionless quantities. Essentially, in each case we show the "range" of parameters within which the SVR models were created (considering dimensionless-number regression or dimensional, physical-quantity regression). This corresponds to Table 1 in [1]. The profile tables give the percentile distribution of each relevant parameter in the SVR regression model, indicating the ranges over which the model predictions would be most accurate. Further, in each of Tables 1–3, we show the "trends" in the output data with the various input variables: each cell of a trends table reports the average value of the output over one percentile bin of the input. In summary, the point to note is that over a reasonably wide range of parameters, the SVR regression works almost equally well, and the regression does not show any noticeable bias in predicting the trends.
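As an illustration of how such a "trends" entry can be computed, the sketch below averages the output over percentile bins of one input; the bin edges mirror the Min/5%/25%/50%/75%/95%/Max layout of Tables 1–3, and the function name and arrays are our own illustrative choices:

import numpy as np

def binned_output_means(x_input, y_output,
                        percentiles=(0, 5, 25, 50, 75, 95, 100)):
    """Average of y over bins of x delimited by the given percentiles of x."""
    edges = np.percentile(x_input, percentiles)
    means = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (x_input >= lo) & (x_input <= hi)   # samples falling in this bin
        means.append(y_output[in_bin].mean())
    return edges, np.array(means)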
In the above analysis, it is clear that making variables dimensionless does not necessarily lead to better mathematical descriptions in terms of correlation. Conventional correlations benefitted from making variables dimensionless because it was the most convenient and intuitive way of mining data in the times before machine learning. Also, physical meaning could be ascribed to the dimensionless groups, which broadly provided some insight into the physical phenomena and helped guide design and scale-up. However, if the latter requirement were not an overriding factor, the "structure" of the space of physical variables (not dimensionless variables) does not necessarily make the dimensionless correlation the optimal way of describing their interrelationships. Fig. 2 makes this amply clear. On the other hand, Figs. 3 and 4 seem to indicate that the predictions are similar no matter whether one chooses dimensional (physical) or dimensionless (compact) variables. In the latter cases, the space of variables would have to have been rather flat to permit this result, which we derive mathematically through the SVM regression method. One important issue of concern to engineers is that of "scale". Is the proposed regression method better suited to some scales and worse at others? The answer lies in the parity plots, which
Table 3. SVR models for gas side mass transfer coefficient (Fig. 4).

Model parameters and errors:
Model 1: Inputs: log [uL, uG, Dc, ε, dv, φ, ρL, μL, σL, ρG, μG]; Output: kGa; Kernel: Gaussian (γ = 0.25, C = 64, ε = 0.01563); Support vectors: 431; Total samples: 498; Training error: 3.06% (350 points); Validation error: 4.99% (148 points).
Model 2: Inputs: log [ReL, ReG, WeL, WeG, ScL, StL, XG, MoL, FrL, EoM, Sb]; Output: ShG; Kernel: Gaussian (γ = 0.5, C = 64, ε = 0.0039); Support vectors: 270; Total samples: 498; Training error: 3.69% (350 points); Validation error: 7.44% (148 points).

Profile by dimensional quantities (Min | 5% | 25% | 50% | 75% | 95% | Max):
kGa (1/s): 8.84E-03 | 1.72E-01 | 5.81E-01 | 1.16E+00 | 2.24E+00 | 4.11E+00 | 6.94E+00
uL (m/s): 4.56E-04 | 6.64E-04 | 1.52E-03 | 4.14E-03 | 6.67E-03 | 1.50E-02 | 1.62E-02
uG (m/s): 3.83E-03 | 2.40E-02 | 5.48E-02 | 1.48E-01 | 2.72E-01 | 8.30E-01 | 2.01E+00
Dc (m): 2.58E-02 | 5.00E-02 | 5.00E-02 | 5.00E-02 | 5.10E-02 | 1.52E-01 | 1.52E-01
ε (–): 2.73E-01 | 2.73E-01 | 3.65E-01 | 3.89E-01 | 4.53E-01 | 7.40E-01 | 9.30E-01
dv (m): 5.41E-04 | 1.35E-03 | 1.77E-03 | 2.40E-03 | 3.00E-03 | 2.21E-02 | 2.21E-02
φ (–): 1.39E-01 | 3.80E-01 | 8.13E-01 | 1.00E+00 | 1.00E+00 | 1.00E+00 | 1.00E+00
ρL (kg/m³): 9.00E+02 | 9.00E+02 | 9.56E+02 | 1.01E+03 | 1.09E+03 | 1.39E+03 | 1.39E+03
μL (Pa s): 1.00E-03 | 1.05E-03 | 1.50E-03 | 1.66E-03 | 3.20E-03 | 9.00E-03 | 9.00E-03
σL (N/m): 2.67E-02 | 2.67E-02 | 3.70E-02 | 4.00E-02 | 7.28E-02 | 7.77E-02 | 7.77E-02
DL (m²/s): 6.11E-06 | 1.40E-05 | 1.40E-05 | 1.40E-05 | 1.60E-05 | 1.60E-05 | 1.60E-05
ρG (kg/m³): 1.16E+00 | 1.17E+00 | 1.17E+00 | 1.19E+00 | 1.19E+00 | 1.24E+00 | 1.56E+00
μG (Pa s): 1.71E-05 | 1.74E-05 | 1.74E-05 | 1.76E-05 | 1.80E-05 | 1.80E-05 | 1.80E-05

Profile by dimensionless numbers (Min | 5% | 25% | 50% | 75% | 95% | Max):
ShG (–): 1.49E-04 | 3.52E-03 | 2.54E-02 | 8.87E-02 | 4.05E-01 | 7.17E+01 | 8.03E+02
ReL: 2.11E-01 | 6.02E-01 | 1.56E+00 | 4.70E+00 | 1.13E+01 | 3.38E+01 | 1.01E+02
ReG: 1.44E-01 | 2.22E+00 | 7.90E+00 | 1.80E+01 | 5.44E+01 | 7.84E+02 | 1.43E+03
WeL: 1.69E-06 | 2.51E-05 | 1.54E-04 | 6.24E-04 | 2.80E-03 | 3.43E-02 | 1.15E-01
WeG: 1.29E-07 | 1.35E-05 | 1.69E-04 | 1.26E-03 | 7.06E-03 | 1.44E-01 | 5.74E-01
ScL: 8.07E-01 | 9.14E-01 | 9.14E-01 | 1.07E+00 | 1.08E+00 | 1.10E+00 | 2.35E+00
StL: 6.20E-07 | 5.81E-06 | 4.11E-05 | 1.05E-04 | 2.50E-04 | 4.23E-04 | 1.80E-03
XG: 2.55E-02 | 1.38E-01 | 4.72E-01 | 1.26E+00 | 2.86E+00 | 1.61E+01 | 5.33E+01
MoL: 2.54E-11 | 2.99E-11 | 1.61E-10 | 1.02E-09 | 2.61E-08 | 1.35E-07 | 1.35E-07
FrL: 9.62E-07 | 8.03E-06 | 1.35E-04 | 6.16E-04 | 1.35E-03 | 3.00E-03 | 5.03E-03
EoM: 2.70E-02 | 3.52E-02 | 2.36E-01 | 4.50E-01 | 9.37E-01 | 1.03E+02 | 1.03E+02
Sb: 1.72E+00 | 1.72E+00 | 2.58E+00 | 2.85E+00 | 4.13E+00 | 2.56E+01 | 2.18E+02

Trends by dimensional quantities (average kGa over percentile bins: Min | 5% | 25% | 50% | 75% | 95% | Max):
uL: 1.07 | 1.33 | 1.39 | 1.48 | 2.05 | 2.24 | 1.67
uG: 0.08 | 0.45 | 1.14 | 2.10 | 2.20 | 3.99 | 6.94
Dc: 1.33 | 1.40 | 1.40 | 1.47 | 2.02 | 1.69 | 1.69
ε: 1.62 | 1.52 | 1.54 | 1.01 | 1.55 | 2.49 | 3.88
dv: 1.42 | 1.41 | 1.41 | 1.46 | 1.88 | 0.88 | 0.88
φ: 2.49 | 1.57 | 1.39 | 1.43 | 1.43 | 1.43 | 1.43
ρL: 1.45 | 1.34 | 1.23 | 1.97 | 1.71 | 1.69 | 1.69
μL: 1.26 | 1.53 | 1.33 | 1.58 | 1.63 | 1.69 | 1.69
σL: 1.45 | 1.27 | 1.26 | 1.43 | 1.82 | 1.73 | 1.73
DL: 1.42 | 1.50 | 1.50 | 1.59 | 1.82 | 1.82 | 1.82
ρG: 0.30 | 0.70 | 1.70 | 1.68 | 1.96 | 2.91 | 4.25
μG: 1.43 | 1.69 | 1.59 | 1.50 | 1.34 | 1.34 | 1.34

Trends by dimensionless numbers (average ShG over percentile bins: Min | 5% | 25% | 50% | 75% | 95% | Max):
ReL: 0.45 | 3.91 | 6.13 | 2.16 | 30.07 | 252.26 | 539.90
ReG: 0.00 | 0.02 | 0.06 | 0.16 | 50.33 | 177.31 | 110.84
WeL: 0.02 | 3.75 | 6.30 | 3.31 | 69.37 | 36.18 | 51.88
WeG: 0.00 | 0.02 | 0.08 | 0.45 | 34.64 | 246.95 | 803.03
ScL: 25.91 | 31.10 | 14.98 | 50.75 | 0.08 | 0.01 | 0.00
StL: 44.80 | 80.31 | 3.61 | 0.46 | 0.33 | 0.05 | 0.00
XG: 0.02 | 0.48 | 5.49 | 19.15 | 57.06 | 33.65 | 6.09
MoL: 0.12 | 53.36 | 0.09 | 0.08 | 21.75 | 31.10 | 31.10
FrL: 36.20 | 5.07 | 21.06 | 20.36 | 28.26 | 1.10 | 0.00
EoM: 0.01 | 0.03 | 0.11 | 0.11 | 65.39 | 55.75 | 55.75
Sb: 0.02 | 0.07 | 0.13 | 0.09 | 25.31 | 163.69 | 448.95
Table 4. Percentile distribution of AARE for the illustrative case of the liquid side Sherwood number (Fig. 3(c)). Entries are the AARE (%) within each percentile bin of the input variable.

Percentile bin (%): 0 | 1 | 25 | 50 | 75 | 99 | 100
ReL: – | 24 | 24 | 15 | 18 | 12 | 10
ReG: – | 24 | 19 | 21 | 17 | 11 | 9
WeL: – | 24 | 23 | 17 | 17 | 12 | 6
WeG: – | 24 | 22 | 15 | 20 | 11 | 9
ScL: – | 24 | 10 | 18 | 15 | 21 | 12
StL: – | 46 | 15 | 17 | 17 | 18 | 1
XG: – | 31 | 17 | 17 | 17 | 16 | 32
MoL: – | – | 14 | 17 | 17 | 21 | –
FrL: – | 24 | 24 | 16 | 15 | 13 | 12
EoM: – | 8 | 14 | 17 | 24 | 15 | 6
Sb: – | 24 | 13 | 24 | 15 | 12 | –
indicate that the same SVR models perform to similar levels of accuracy across the large variation in scale present in our data. To illustrate this point further, we have added Table 4 as an example of the fact that the error distribution does not show any noticeable scale dependence. Table 4, for the illustrative case of the liquid side Sherwood number, shows the average AARE over percentile bins of the different input parameters. Note that the overall average AARE is 17.05 ± 18.97% (Fig. 3c). Similar results were obtained in the other cases as well. Thus, "scale" dependence is well incorporated in the designed SVR regression correlations within the documented range of the databanks.

Finally, we present the generalizing ability of SVR in the present context. One merit of conventional correlations is that, since the number of "choosable" parameters is small, we intuitively expect good regression results on a large sample set to be statistically significant, and hence expect the results to generalize well to unseen data. In this experiment, we demonstrate the generalizing ability of SVR using again the TBR dataset, specifically modeling aLG. The SVR model was developed in this case using 400 randomly sampled points, and the remaining 1061 points were treated as unseen test data. The results are shown in Fig. 5, which summarizes the performance of this model on the unseen data. We see that despite the model having "seen" only a very small fraction of the dataset, it is able to successfully predict the large unseen fraction with a reasonable AARE of 21.6%. In other words, even if one attempts to "extrapolate" the predictions to more conditions than were originally used in developing the SVM correlations, the results are still acceptable.

5. Summary and conclusions

In this work, we have presented a machine learning method of empirical modeling, and its adaptation towards correlation building in Chemical Engineering problems, specifically trickle bed reactors. We have demonstrated its ability to adapt to highly non-linear relationships in data for which conventional correlations give poor predictions with a lot of scatter. Our method, using SVMs, avoids over-fitting while retaining the generalization power of simple, flat models. In that sense, the SVR approach improves significantly over the earlier "best option" available, based on correlation development using artificial neural networks [1]. We also show that making variables dimensionless is not necessarily the best option in terms of accuracy of correlations; depending on the structure of the parameter space, dimensional SVR correlations may actually be better. Of course, the relative merit of dimensionless groups in correlations lies in their physical interpretation. Until rigorous theory catches up with engineering need, empirical models combined with diversified and broad enough databanks will probably keep dominating the realm of design in industry. SVR is a very powerful learning paradigm that can lead to vastly superior predictive models compared to those that are popular today. The proposed algorithm allows us to extract physical insights from complex data by highlighting their interrelationships, and this can not only help us model the data better statistically, but also help derive rigorous theory.

Endnote

The programs developed as part of this work will be made available in the public domain and will be shared with anyone who wishes to obtain a copy. Interested persons are advised to write an email requesting the same to the corresponding author. The models are also being made available as Supplementary Material with the online version of this manuscript.

Appendix A. Supplementary material

Models created in this work (as Excel sheets) can be found, in the online version, at http://dx.doi.org/10.1016/j.cej.2012.07.081.

References
Fig. 5. Parity plot demonstrating the generalization ability of SVR models. The model was trained only on 400 randomly sampled points, and was tested on the remaining 1061 points.
[1] I. Iliuta, A. Ortiz-Arroyo, F. Larachi, B.P.A. Grandjean, G. Wild, Hydrodynamics and mass transfer in trickle-bed reactors: an overview, Chem. Eng. Sci. 54 (1999) 5329–5337.
[2] C. Vial, S. Poncin, G. Wild, N. Midoux, A simple method for regime identification and flow characterisation in bubble columns and airlift reactors, Chem. Eng. Process. 40 (2001) 135–151.
[3] W. Yaïci, A. Laurent, N. Midoux, J.C. Charpentier, Détermination des coefficients de transfert de matière en phase gazeuse dans un réacteur catalytique à lit fixe arrosé en présence de phases liquides aqueuses et organiques, Bull. Soc. Chim. France 6 (1985) 1032.
[4] G. Wild, F. Larachi, J.C. Charpentier, Heat and mass transfer in gas–liquid–solid fixed bed reactors, in: M. Quintard, M. Todorovic (Eds.), Heat and Mass Transfer in Porous Media, Elsevier, Amsterdam, The Netherlands, 1992, p. 616.
[5] A. Kulkarni, V.K. Jayaraman, B.D. Kulkarni, Knowledge incorporated support vector machines to detect faults in Tennessee Eastman Process, Comput. Chem. Eng. 29 (10) (2003) 2128–2133.
[6] A. Kulkarni, V.K. Jayaraman, B.D. Kulkarni, Control of chaotic dynamical systems using support vector machines, Phys. Lett. A 317 (5–6) (2003) 429–435.
[7] M. Agrawal, A.M. Jade, V.K. Jayaraman, B.D. Kulkarni, Support vector machines: a useful tool for process engineering applications, Chem. Eng. Prog. 98 (1) (2003) 57–62.
[8] S. Nandi, Y. Badhe, J. Lonari, U. Sridevi, B.S. Rao, S.S. Tambe, B.D. Kulkarni, Hybrid process modeling and optimization strategies integrating neural networks/support vector regression and genetic algorithms: study of benzene isopropylation on h-beta catalyst, Chem. Eng. J. 97 (2004) 115–129.
[9] K. Desai, Y. Badhe, S.S. Tambe, B.D. Kulkarni, Soft-sensor development for fed-batch bioreactors using support vector regression, Biochem. Eng. J. 27 (3) (2006) 225–239.
[10] A.B. Gandhi, J.B. Joshi, V.K. Jayaraman, B.D. Kulkarni, Development of support vector regression (SVR)-based correlation for prediction of overall gas hold-up in bubble column reactors for various gas–liquid systems, Chem. Eng. Sci. 62 (24) (2007) 7078–7089.
[11] C. Cortes, V. Vapnik, Support-vector networks, Mach. Learn. 20 (1995) 273–297.
[12] L. Wang, Support Vector Machines: Theory and Applications, Springer, New York, 2005.
[13] C. Burges, A tutorial on support vector machines for pattern recognition, Data Min. Knowl. Disc. 2 (1998) 121–167.
[14] D. Basak, S. Pal, D.C. Patranabis, Support vector regression, Neural Inf. Process. Lett. Rev. 11 (10) (2007) 203–224.
[15] C.-C. Chang, C.-J. Lin, LIBSVM: a library for support vector machines, ACM Trans. Intell. Syst. Technol. 2 (2011) 27:1–27:27.
[16] L. Tarca, B. Grandjean, F. Larachi, Reinforcing the phenomenological consistency in artificial neural network modeling of multiphase reactors, Chem. Eng. Process. 42 (8–9) (2003) 653–662.
[17] C.W. Hsu, C.C. Chang, C.J. Lin, A Practical Guide to Support Vector Classification, Department of Computer Science, National Taiwan University, 2010.