Journal of Molecular Liquids 216 (2016) 25–34
Journal of Molecular Liquids journal homepage: www.elsevier.com/locate/molliq
A corresponding states-based method for the estimation of natural gas compressibility factors
Arash Kamari a, Farhad Gharagheizi b, Amir H. Mohammadi a,b,c,d,⁎, Deresh Ramjugernath a,⁎⁎
a Thermodynamics Research Unit, School of Engineering, University of KwaZulu-Natal, Howard College Campus, King George V Avenue, Durban 4041, South Africa
b Department of Chemical Engineering, Texas Tech University, Lubbock, Texas 79409-3121, United States
c Institut de Recherche en Génie Chimique et Pétrolier (IRGCP), Paris Cedex, France
d Département de Génie des Mines, de la Métallurgie et des Matériaux, Faculté des Sciences et de Génie, Université Laval, Québec (QC), G1V 0A6, Canada
Article info
Article history: Received 1 June 2015; Received in revised form 21 November 2015; Accepted 30 December 2015; Available online xxxx
Keywords: Gene expression programming (GEP); Genetic algorithm; z-factor; Error analysis; Equation of state (EoS)
Abstract
In this communication, a corresponding states-based model for the estimation of the gas compressibility factor (z-factor) of natural gases is proposed. The model is developed with the gene expression programming (GEP) algorithm, and its input parameters are the pseudo-reduced pressure and the pseudo-reduced temperature. The performance and accuracy of the model are assessed with several statistical and graphical error analyses, and the model is compared with the most widely used correlations and equations of state (EoS) available in the literature. Furthermore, the Leverage approach (Williams plot) is used to determine the applicability domain of the new z-factor model and to detect any probable erroneous data points. The results obtained demonstrate that the newly proposed model is more reliable and more effective than the empirical models and EoS methods for the prediction of z-factors of natural gases. © 2015 Elsevier B.V. All rights reserved.
1. Introduction
Natural gas is a multi-component mixture in which methane is the key component, together with other important components such as carbon dioxide (CO2), nitrogen (N2), ethane (C2H6), propane (C3H8), and heavier hydrocarbons [1]. Natural gas is one of the cleaner and cheaper energy sources compared with other hydrocarbon-based materials like oil and coal. It also has a longer predicted future availability than crude oil and coal [2]. Furthermore, natural gas is of growing importance in meeting the world energy demand because of its versatility, its abundance compared to other fuels, and its clean-burning characteristics [3]. Therefore, it is important to develop reliable predictive methods for the physical properties of natural gas, like the gas compressibility factor (z-factor), to enable optimal exploitation and usage. The gas compressibility factor is a key thermodynamic parameter in the chemical and petroleum engineering disciplines such
⁎ Corresponding author. ⁎⁎ Correspondence to: A. H. Mohammadi, Thermodynamics Research Unit, School of Engineering, University of KwaZulu-Natal, Howard College Campus, King George V Avenue, Durban 4041, South Africa. E-mail addresses:
[email protected] (A.H. Mohammadi),
[email protected] (D. Ramjugernath).
http://dx.doi.org/10.1016/j.molliq.2015.12.103 0167-7322/© 2015 Elsevier B.V. All rights reserved.
as phase equilibria of various hydrocarbon and non-hydrocarbon mixtures, analysis of PVT behavior, upstream and downstream calculations in the petroleum industry, material balances, assessment of underground gas reserves, gas reservoir simulations, well-testing analysis, and calculations associated with gas processing [4,5]. Moreover, the z-factor plays an important role in process engineering calculations and in lower-complexity thermodynamic simulations. Generally, the volumetric properties of petroleum fluids are obtained from laboratory tests, empirically derived models, or thermodynamic models [6]. Normally, high-temperature and high-pressure apparatuses are used for the experimental measurement of the volumetric properties of natural gases [7]. However, these measurements are expensive and time-consuming, and it is impossible to measure properties for all possible compositions of natural gases [8]. In addition to laboratory tests, equations of state (EoS) and empirically derived models can predict the properties of petroleum fluids. For determining the natural gas z-factor, empirical correlations are faster and simpler than equations of state, which involve a large number of parameters, require longer computations, and are more complicated [9]. Furthermore, all EoS models are implicit in the gas compressibility factor, which must consequently be obtained as a mathematical root of the EoS [10]. It is worth noting that in spite of the above-mentioned drawback, EoSs have some advantages; for instance
they sacrifice a bit of precision in exchange for a smooth mathematical function and reliable behavior of its derivatives. The partial derivatives of the compressibility factor lead to various expressions, including the entropy, enthalpy, and Gibbs free energy residuals, which are in turn used to estimate fugacity coefficients and then to describe phase equilibria [11]. As a result, accurate and fast estimation of the z-factor is of great importance and is one of the main challenges in most of the commercial process simulators used in chemical and petroleum engineering.

Among the intelligent techniques, the genetic algorithm (GA) is a reliable population-based optimization method based on the concepts of evolution and genetics, proposed by John Holland [12]. Genetic programming (GP), proposed by John Koza [13], may be considered the next generation of GAs. GP is an evolutionary system which overcomes some restrictions associated with GAs. A variant and reformulation of this approach, gene expression programming (GEP), was proposed by Ferreira [14]. In this article, the GEP [14] approach is applied to develop an accurate and reliable method for the determination of the gas compressibility factor, using around 900 data points at different temperature and pressure conditions. Subsequently, the results obtained with the newly proposed GEP model are compared against literature-reported data and against previously published correlations and EoS calculations. Several statistical parameters are considered in assessing the validity of the GEP model. Moreover, the Leverage approach (Williams plot) is used to determine the applicability domain of the new z-factor model and to identify probable erroneous data points.
2. Gas compressibility factor

2.1. Ideal gas behavior

The gas compressibility factor is defined as the ratio of the molar volume actually occupied by a gas to the molar volume of an ideal gas at the same temperature and pressure [15]. In other words, the compressibility factor of a gas is a dimensionless quantity which is a function of pressure and temperature. According to the kinetic theory of gases [8], the volume of a molecule is insignificant compared to the total bulk volume. Also, it is assumed that there are neither attractive nor repulsive forces among the gas molecules [8]. An equation of state (EoS) is a mathematical relationship among pressure (P), volume (V), and temperature (T) for a given number of moles of gas (n′). For an ideal gas, this relationship is expressed by the following equation:

$$PV = n'RT \tag{1}$$

where P represents pressure, V denotes the volume, n′ is the number of moles, R is the universal gas constant, and T is the temperature.

2.2. Real gas behavior

Gases that deviate from ideal behavior are known as real gases. The ideal gas EoS shows low deviations from experimental data at atmospheric pressure (2–3% average absolute relative deviation), whereas its application at high pressures is not recommended due to the high deviations from real gas behavior [8]. The deviation increases dramatically with an increase in temperature and pressure and depends on the gas composition as well. To correlate the pressure, volume and temperature (PVT) parameters, various EoSs have been reported for real gases. In order to present the relationship among the parameters above (PVT variables), an equation is formulated as follows:

$$PV = Zn'RT \tag{2}$$

where P is the pressure, V denotes the volume, Z represents the gas compressibility factor, n′ is the number of moles, R is the universal gas constant, and T stands for temperature. Investigation of the compressibility factor for natural gases of different compositions has shown that the z-factor can be generalized with adequate accuracy for most engineering calculations when it is expressed in terms of two dimensionless properties [8], which are as follows:

$$P_{pr} = \frac{P}{P_{pc}} \tag{3}$$

$$T_{pr} = \frac{T}{T_{pc}} \tag{4}$$

where Ppr denotes the pseudo-reduced pressure, Tpr is the pseudo-reduced temperature, Tpc is the pseudo-critical temperature and Ppc is the pseudo-critical pressure. The pseudo-critical properties are given by the following equations:

$$P_{pc} = \sum_{i=1}^{n} y_i P_{ci} \tag{5}$$

$$T_{pc} = \sum_{i=1}^{n} y_i T_{ci} \tag{6}$$

where Pci denotes the critical pressure, Tci is the critical temperature and yi is the mole fraction of component i. The values of critical pressure and temperature (Pc and Tc) for the components of natural gases [16] are presented in the supplementary material.

2.3. Determination of the critical properties of plus fraction components

There are many empirically derived correlations that can be used for the determination of the properties of the plus fractions of natural gas. A review was undertaken by the late Ali Danesh [16]. As recommended there, one of the most reliable methods for this purpose is the correlation proposed by Twu [17], as follows:

critical temperature:

$$T_c = T_c^{\circ}\left[\frac{1 + 2f_T}{1 - 2f_T}\right]^2 \tag{7}$$

where

$$f_T = \Delta SG_T\left[-\frac{0.362456}{T_b^{1/2}} + \left(0.0398285 - \frac{0.948125}{T_b^{1/2}}\right)\Delta SG_T\right] \tag{8}$$

where

$$\Delta SG_T = \exp\left[5\left(SG^{\circ} - SG\right)\right] - 1 \tag{9}$$

critical volume:

$$V_c = V_c^{\circ}\left[\frac{1 + 2f_V}{1 - 2f_V}\right]^2 \tag{10}$$

where

$$f_V = \Delta SG_V\left[\frac{0.466590}{T_b^{1/2}} + \left(-0.182421 + \frac{3.01721}{T_b^{1/2}}\right)\Delta SG_V\right] \tag{11}$$

where

$$\Delta SG_V = \exp\left[4\left(SG^{\circ 2} - SG^{2}\right)\right] - 1 \tag{12}$$
and critical pressure:

$$P_c = P_c^{\circ}\left(\frac{T_c}{T_c^{\circ}}\right)\left(\frac{V_c^{\circ}}{V_c}\right)\left[\frac{1 + 2f_P}{1 - 2f_P}\right]^2 \tag{13}$$

where

$$f_P = \Delta SG_P\left[\left(2.53262 - \frac{46.1955}{T_b^{1/2}} - 0.00127885\,T_b\right) + \left(-11.4277 + \frac{252.140}{T_b^{1/2}} + 0.00230535\,T_b\right)\Delta SG_P\right] \tag{14}$$

where

$$\Delta SG_P = \exp\left[0.5\left(SG^{\circ} - SG\right)\right] - 1 \tag{15}$$

In the equations above, the superscript "o" denotes correlations specific to the n-alkanes, Tb is the normal boiling point temperature, SG stands for the specific gravity, and Tc, Vc and Pc are the critical temperature, critical volume, and critical pressure, respectively.

3. Model development

3.1. GEP strategy

The GEP strategy, a modified form of the genetic algorithm and genetic programming, uses populations of individuals, selects them according to fitness, and introduces genetic variation using one or more genetic operators [18]. In other words, the GEP [14] approach overcomes some of the GA and GP limitations. GEP [14], a variant resulting from the extension and modification of the GP algorithm [13], is a soft-computing program for solving regression problems. In GP, the population individuals are symbolic expression trees (ETs), whereas in GEP [19] the population individuals are encoded as linear chromosomes which are later translated into expression trees; that is, the genotype and the phenotype are separated from one another [20–25]. GEP [14] therefore operates with two elements: the chromosome and the ET. The chromosome encodes the candidate solution and is translated into an ET (the actual candidate solution). It should be noted that the genetic operators of the GEP algorithm act on the chromosome, not directly on the candidate solution (ET). The reproduction technique, together with the structural organization of the chromosome and its translation into an ET, permits unconstrained genetic modifications that always produce valid ETs [26]. In other words, the structure of the genes in the GEP [14] strategy enables the encoding of any soft-computing program for efficient evolution of the solutions [19]. Previous research studies [19] indicate that these characteristics permit the GEP method to outperform the genetic programming methodology by two to four orders of magnitude in convergence speed for symbolic regression and classification problems.

3.1.1. Mathematical performance

As previously mentioned, the GEP [14] strategy uses two entities: the chromosome and the expression tree. The chromosome comprises terminals (variables and constants) and functions, structured in one or more genes of equal length [26]. The functions and variables are input data, while the constants are produced by the algorithm in a range selected by the user. Each gene consists of a head composed of functions and terminals (variables and constants), and a tail composed only of terminals [26]. The head length (h) is an input parameter of the GEP method, while the tail length (t) is expressed as follows:

$$t = h(n - 1) + 1 \tag{16}$$

where t stands for the tail length of the gene, h is the head length, and n denotes the largest arity of the functions used in the gene's head. For instance, a two-gene chromosome can be made of four functions, Q, ∗, /, and + (Q denotes the square root function), and three terminals, a, b and c, together with its decoded ET and the related mathematical expression. The algebraic expression $(a/b) + \sqrt{a \times b}$ can be represented simply as a diagram or ET, or in the Karva language. Each character is placed in a position from zero to seven, indicated as 0 1 2 3 4 5 6 7.

3.2. The GEP computational procedure
Ferreira [19] presented the general computational procedure of the GEP [14] approach as follows: (1) initializing the population by randomly generating the chromosomes of a given number of individuals, each of which expresses a candidate correlation; (2) evaluating the population individuals on the basis of fitness functions; (3) selecting the individuals according to their fitness to reproduce with modification; (4) subjecting the new individuals to the same process, involving expression of the genomes, confrontation of the selection environment, selection, and reproduction with modification; and (5) repeating the steps above for a certain number of generations or until an optimum solution has been obtained. This procedure has been used in this study for developing the model. Generally, the reliability and applicability of any model or correlation depend on the comprehensiveness and validity of the dataset employed for its development [11,27–29]. Hence, a large dataset covering wide ranges of pressures and temperatures for estimating the z-factors of gases was collected from the literature [30–36]. The minimum, maximum and average values of the data are reported in Table 1. As can be observed in Table 1, the data points cover an extensive range of temperatures, pressures, and compositions, so the method proposed in this work is expected to be reliable for estimating other samples within these ranges. To develop a GEP model for the prediction of the z-factor, the database was divided into three sub-datasets, according to a rule of thumb [22]. These three sub-datasets are the "Training" set (784 data points, about 80% of the entire dataset), the "Validation" set (97 data points, about 10% of the entire dataset), and the "Prediction" set (97 data points, about 10% of the entire dataset). The partition of the available database into the three sub-datasets is performed randomly. The training set is used for the development of the model. Once the model is obtained, its validity is assessed using the validation set. As its name indicates, the prediction set is used to observe the predictive performance of the GEP model. One of the most important features of the GEP [19] strategy is that no specific functional form needs to be assumed in order to obtain the optimum estimation of the actual data [20–25]. Hence, the most precise functional form, including the most effective independent parameters, is found by the evolutionary approach itself. The parameters affecting the z-factor are the pseudo-reduced pressure (Ppr) and the pseudo-reduced temperature (Tpr). In this study, it is first
considered that the z-factor can be expressed as a function of the aforementioned properties as follows:

$$Z = f\left(P_{pr}, T_{pr}\right) \tag{17}$$
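The two inputs of Eq. (17) follow from the gas composition through Eqs. (3)–(6). The snippet below is a minimal sketch of that calculation, not the authors' code. The function name `pseudo_reduced`, the example composition, and the critical constants in `CRITICAL_PROPS` are assumptions for illustration: the constants are approximate handbook-style values (psia, °R), not the table in the supplementary material.

```python
# Approximate critical constants as (Pc in psia, Tc in degrees Rankine).
# Illustrative values only; the paper's own table is in its supplementary material.
CRITICAL_PROPS = {
    "CH4": (666.4, 343.0),
    "C2H6": (706.5, 549.6),
    "CO2": (1071.0, 547.6),
    "N2": (493.1, 227.2),
}


def pseudo_reduced(composition, pressure_psia, temperature_rankine):
    """Return (Ppr, Tpr) for a gas given as {component: mole fraction}."""
    total = sum(composition.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("mole fractions must sum to 1")
    # Eq. (5): mole-fraction-weighted pseudo-critical pressure
    p_pc = sum(y * CRITICAL_PROPS[name][0] for name, y in composition.items())
    # Eq. (6): mole-fraction-weighted pseudo-critical temperature
    t_pc = sum(y * CRITICAL_PROPS[name][1] for name, y in composition.items())
    # Eqs. (3) and (4): pseudo-reduced pressure and temperature
    return pressure_psia / p_pc, temperature_rankine / t_pc


# Hypothetical lean gas at 2000 psia and 610 degrees Rankine
gas = {"CH4": 0.90, "C2H6": 0.05, "CO2": 0.03, "N2": 0.02}
p_pr, t_pr = pseudo_reduced(gas, pressure_psia=2000.0, temperature_rankine=610.0)
```

A gas containing a heptane-plus fraction would additionally need critical properties for that fraction, for example from the Twu correlation of Section 2.3.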
4. Performance assessment

4.1. Statistical deviation parameters

To assess the accuracy of the newly proposed GEP model against existing empirically derived models and EoS methods, a number of statistical deviation parameters have been used: the average percent relative error (APRE), the average absolute percent relative error (AAPRE), the standard deviation of error (SD), the root mean square error (RMSE) and the coefficient of determination (R2). These parameters are defined as follows:

1. Average Percent Relative Error (APRE). It evaluates the relative deviation of the estimated z-factor data from the actual ones, and is expressed as follows:

$$E_r\% = \frac{1}{n}\sum_{i=1}^{n} E_i\% \tag{18}$$

where Ei% stands for the relative deviation of a represented/predicted value from its related actual value and is defined as the percent relative error:

$$E_i\% = \left[\frac{Z_{\exp} - Z_{\mathrm{rep./pred}}}{Z_{\exp}}\right] \times 100, \qquad i = 1, 2, 3, \ldots, n \tag{19}$$

2. Average Absolute Percent Relative Error (AAPRE). It measures the absolute relative deviation from the actual data and is expressed as follows:

$$E_a\% = \frac{1}{n}\sum_{i=1}^{n} \left|E_i\%\right| \tag{20}$$

3. Root Mean Square Error (RMSE). It measures the data scattering around the zero deviation, expressed as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(Z_{i,\exp} - Z_{i,\mathrm{rep./pred}}\right)^2} \tag{21}$$

4. Standard Deviation (SD). It is a measure of dispersion; a lower value indicates a smaller degree of dispersion. It is expressed as follows:

$$\mathrm{SD} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{Z_{i,\exp} - Z_{i,\mathrm{rep./pred}}}{Z_{i,\exp}}\right)^2} \tag{22}$$

5. Coefficient of Determination (R2). This parameter illustrates how well the model matches the data and accordingly expresses a measure of the usefulness of the GEP model. It is expressed as follows:

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(Z_{i,\exp} - Z_{i,\mathrm{rep./pred}}\right)^2}{\sum_{i=1}^{n}\left(Z_{i,\mathrm{rep./pred}} - \bar{Z}\right)^2} \tag{23}$$

where $\bar{Z}$ is the mean of the actual data values.

4.2. Graphical deviation study

Normally, two graphical techniques are used to visualize the performance and accuracy of a developed model: error distribution curves and crossplots.

1. Relative deviation distribution plot: It shows the distribution of the relative deviations around the zero error line in order to reveal whether the model has an error trend or not.
2. Crossplot or parity diagram: In this graphical analysis, all calculated/estimated z-factor data points are plotted against the actual values. A 45° straight line (unit slope line) on the parity diagram represents the perfect model line.

Table 1
Range and corresponding statistical parameters of the input/output data utilized in development of the model; data from References [30–36].

Property                     Min.     Max.     Average
Pressure, psi                154      7026     2820
Reservoir temperature, °F    40       300      147
Methane                      17.27    97.48    71.18
Ethane                       0        28.67    3.86
Propane                      0        13.16    1.44
Iso-Butane                   0        2.23     0.21
N-Butane                     0        3.10     0.36
Iso-Pentane                  0        2.85     0.18
n-Pentane                    0        0.79     0.10
Hexane                       0        2.68     0.20
Heptane plus                 0        8.17     0.64
Mw C7+                       0        150      50
SG C7+                       0        0.90     0.31
Hydrogen sulfide             0        73.85    13.92
Carbon dioxide               0        54.46    6.00
Nitrogen                     0        25.15    1.83
Tpr                          0.97     1.96     1.46
Ppr                          0.17     10.19    3.75
z-factor                     0.40     1.241    0.86

Table 2
Statistical error parameters of the developed model (including training, validation and prediction sets) to determine the z-factor.

Statistical parameter                      Training set   Validation set   Test set   Total
R2                                         0.897          0.883            0.921      0.898
Average absolute percent relative error    3.47           3.47             3.46       3.44
Standard deviation error                   0.04           0.04             0.04       0.04
Root mean square error                     0.04           0.04             0.04       0.04
N                                          784            97               97         978
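The deviation parameters of Eqs. (18)–(23) are simple to compute from paired lists of actual and predicted z-factors. The function below is a minimal sketch under stated assumptions rather than the authors' implementation: the name `error_metrics` and the dictionary keys are illustrative, and the R2 denominator follows Eq. (23) as printed (predicted values about the mean of the actual data), which differs from the more common definition based on the actual values.

```python
import math


def error_metrics(z_exp, z_pred):
    """Return APRE, AAPRE, RMSE, SD and R2 for actual vs. predicted z-factors."""
    n = len(z_exp)
    if n != len(z_pred) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")

    # Relative error of each point, Eq. (19) before multiplying by 100
    rel = [(e - p) / e for e, p in zip(z_exp, z_pred)]
    sq_err = [(e - p) ** 2 for e, p in zip(z_exp, z_pred)]

    apre = 100.0 * sum(rel) / n                        # Eq. (18)
    aapre = 100.0 * sum(abs(r) for r in rel) / n       # Eq. (20)
    rmse = math.sqrt(sum(sq_err) / n)                  # Eq. (21)
    sd = math.sqrt(sum(r * r for r in rel) / (n - 1))  # Eq. (22)

    # Eq. (23) as printed: predicted values about the mean of the actual data
    z_mean = sum(z_exp) / n
    r2 = 1.0 - sum(sq_err) / sum((p - z_mean) ** 2 for p in z_pred)

    return {"APRE": apre, "AAPRE": aapre, "RMSE": rmse, "SD": sd, "R2": r2}
```

Because APRE keeps the sign of each deviation, positive and negative errors can cancel; AAPRE is therefore the more informative of the two when ranking models.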
Fig. 1. Comparison between the results of the model developed (Eq. (24)) and the database values of z-factor.
Fig. 2. Relative deviations of the represented z-factor values by Eq. (24) from the database values.

5. Results and discussion

The computational steps described above were followed to achieve an efficient, reliable, and capable GEP model for the prediction of the z-factor. Moreover, as previously mentioned, the capability and performance of the developed GEP model have been assessed by statistical error analysis (AAPRE, APRE, SD, RMSE and R2) and by graphical error analysis (parity diagram and relative error distribution plot). The GEP [19] calculations select the required parameters, which yield the most precise model from the introduced parameters (Ppr and Tpr). Hence, one can consider several independent parameters for a particular problem and find the ones which have the most positive impact on the desired output. The final form of the z-factor equation obtained can be expressed as follows:

$$Z = 0.2625136 + \frac{3.1263651}{T_{pr}} - \frac{3.8916368}{T_{pr}^{2}} + \frac{1.0551763}{T_{pr}^{3}} + 0.5638878\ln P_{pr} - 0.3372525\left[\ln P_{pr}\right]^{2} + 0.061688\left[\ln P_{pr}\right]^{3} - \frac{1.3976452\ln P_{pr}}{T_{pr}} + \frac{0.5217521\ln P_{pr}}{T_{pr}^{2}} + \frac{0.447935\left[\ln P_{pr}\right]^{2}}{T_{pr}} \tag{24}$$
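Because Eq. (24) is explicit in Z, it can be evaluated without any iteration. The function below is a minimal sketch, assuming that the ten coefficients form a cubic polynomial in 1/Tpr and ln Ppr. In particular, the placement of the exponents in the three cross terms (ln Ppr/Tpr, ln Ppr/Tpr², [ln Ppr]²/Tpr) is a reading of the typeset equation and should be checked against the published version before use. The function name `z_factor_eq24` is illustrative.

```python
import math


def z_factor_eq24(p_pr, t_pr):
    """Evaluate Eq. (24) for pseudo-reduced pressure and temperature.

    Assumption: the cross terms are ln(Ppr)/Tpr, ln(Ppr)/Tpr**2 and
    ln(Ppr)**2/Tpr, as read from the typeset equation.
    """
    if p_pr <= 0.0 or t_pr <= 0.0:
        raise ValueError("Ppr and Tpr must be positive")
    x = 1.0 / t_pr       # inverse pseudo-reduced temperature
    y = math.log(p_pr)   # natural log of pseudo-reduced pressure
    return (
        0.2625136
        + 3.1263651 * x
        - 3.8916368 * x**2
        + 1.0551763 * x**3
        + 0.5638878 * y
        - 0.3372525 * y**2
        + 0.061688 * y**3
        - 1.3976452 * x * y
        + 0.5217521 * x**2 * y
        + 0.447935 * x * y**2
    )
```

The model was fitted on data with Tpr between 0.97 and 1.96 and Ppr between 0.17 and 10.19 (Table 1), so values outside these ranges are extrapolations.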
Table 3
Comparative statistical error analysis for the empirical correlations, EoSs, and an artificial intelligent technique, as well as the newly developed model.

Method                                  Er%      Ea%     RMSE     R2
van der Waals [45] EoS                  0.31     6.42    0.0696   0.771
Peng–Robinson [46] EoS                  −5.34    6.10    0.0599   0.891
Lawal–Lake–Silberberg [47] EoS          −2.66    4.43    0.0453   0.894
Patel–Teja [49] EoS                     −1.18    4.15    0.0447   0.880
Soave–Redlich–Kwong [48] EoS            −3.14    4.82    0.0493   0.893
Dranchuk–Abu–Kassem [37] Corr.          4.21     8.18    0.0992   0.574
Dranchuk–Purvis–Robinson [38] Corr.     4.66     4.77    0.0555   0.906
Hall–Yarborough [39] Corr.              1.46     3.59    0.0429   0.892
Beggs–Brill [40] Corr.                  4.95     5.07    0.0574   0.904
Shell Oil Company [41] Corr.            5.34     5.40    0.0596   0.908
Gopal [42] Corr.                        6.12     6.26    0.0910   0.737
Azizi et al. [43] Corr.                 4.26     6.25    0.0792   0.772
Heidaryan et al. [10] Corr.             3.61     5.80    0.0762   0.778
Sanjari–Lay [44] Corr.                  0.66     5.67    0.0697   0.811
ANN-PSO                                 −1.13    6.13    0.0647   0.736
ANN-GA                                  −1.85    7.46    0.0766   0.624
Eq. (24)                                −0.25    3.44    0.04     0.898
where Tpr is the pseudo-reduced temperature and Ppr denotes the pseudo-reduced pressure. The number of significant digits of the coefficients in the equation above has been determined by a sensitivity analysis of the predicted results against the actual values. The statistical error parameters show that the average absolute percent relative error and R2 over the whole dataset (the three sub-datasets together) are about 3.44 and 0.898, respectively. These values demonstrate acceptable accuracy of the developed method for the calculation of the z-factor of the gases studied. The detailed statistical error analysis of the proposed z-factor model is listed in Table 2. The results listed for the training, validation and testing phases in Table 2 reveal that the new model performs consistently across the three sets. A crossplot of the training, validation, and test datasets for the z-factor obtained by Eq. (24) is shown in Fig. 1. Most of the data points obtained by the newly developed GEP model lie close to the unit slope line, which indicates its good prediction capability. Fig. 2 presents the error distribution of the model for the determination of the z-factor of natural gases. The figure confirms that the proposed model has a small error range and a low scatter around
Fig. 4. Absolute percent relative error contour of gas compressibility factor for the Eq. (24) in the ranges of Ppr and Tpr.
the zero error line. This indicates the potential of Eq. (24) for estimating the z-factor with a small expected error. The performance of the model for the experimental data studied has been compared with that of some of the most widely used empirically derived models and equations of state available in the literature, including nine empirically derived models, viz. Dranchuk–Abu–Kassem [37], Dranchuk–Purvis–Robinson [38], Hall–Yarborough [39], Beggs–Brill [40], Shell Oil Company [41], Gopal [42], Heidaryan et al. [10], Azizi et al. [43], and Sanjari–Lay [44], and five EoS-based models, viz. van der Waals [45], Peng–Robinson [46], Lawal–Lake–Silberberg [47], Soave–Redlich–Kwong [48], and Patel–Teja [49]. Additionally, the feed-forward multi-layer artificial neural network (ANN) approach has been employed to compare the developed model with other kinds of artificial intelligence techniques. To this end, two reliable optimization methods, viz. particle swarm optimization (PSO) [50] and the genetic algorithm (GA) [12], have been used to tune the ANN adjustable parameters, i.e. the weights and biases. More information regarding these methods can be found elsewhere [51–53]. Table 3 reports the corresponding results. As is clear from Table 3, the model obtained in this study is simple and leads to reasonable deviations of the determined z-factor values from all the experimental data compared with the literature correlations,
Fig. 3. Calculated average absolute percent relative error for the empirical correlations, EoSs, and the artificial intelligent technique, as well as the proposed model (Eq. (24)).
EoS-based models, and ANN-based models. The bar plots in Fig. 3 show the average absolute percent relative errors of the z-factor for the newly proposed method, the EoS-based models, the ANN-based models, and the empirical correlations. The results in Fig. 3 show that Eq. (24) has the lowest average absolute percent relative error of the methods compared. The proposed model is easy to use and does not need any soft-computing program for its evaluation. The results
Fig. 5. Absolute percent relative error contour of gas compressibility factor for the comparative methods (set I) in the ranges of Ppr and Tpr.
clearly demonstrate that the GEP algorithm is more powerful than the ANN methodology in terms of accuracy, capability, and future usability. While the model developed in the current study on the basis of GEP estimates the gas compressibility factors of natural gases with an AAPRE of 3.44%, the AAPRE values obtained for the ANN optimized with the PSO and GA methods are 6.13% and 7.46%, respectively. Additionally,
Fig. 6. Absolute percent relative error contour of gas compressibility factor for the comparative methods (set II) in the ranges of Ppr and Tpr.
Fig. 7. Detection of probable doubtful data of z-factor and the applicability domain of the model developed.
over-fitting is a major problem in modeling with the ANN approach, in particular when a small dataset is used, because of the high number of adjustable parameters, viz. weights and biases. It is worth noting that this is the first time that the GEP approach has been implemented for the estimation of the z-factor of gases. The results obtained indicate that the strategy is very promising for the evaluation of other petroleum fluid properties. To show the applicability domain of all methods investigated in this study graphically, the absolute percent relative error contour of the gas compressibility factor has been plotted together with the collected database over the input variables Ppr and Tpr. Fig. 4 shows the absolute percent relative error contour of the gas compressibility factor predicted by the model developed in the current study. It is evident from the figure that the developed model is able to predict gas compressibility factors over the dataset range (Table 1). However, the model could not estimate the gas compressibility factor with high accuracy in the Tpr range of 1–1.2 and the Ppr range of 1–3. The absolute percent relative error contours of the gas compressibility factor estimated by the comparative methods mentioned above are shown in Figs. 5 and 6. These figures indicate that the method presented in this study is superior to the comparative methods investigated. Furthermore, Figs. 5 and 6 show that the comparative methods also have high errors in the Tpr range of 1–1.2 and the Ppr range of 1–3, similar to the developed method. This may be due to experimental errors in the laboratory measurements of the gas compressibility factors in this region.
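The data behind an error map such as Figs. 4–6 can be approximated by averaging the absolute percent relative error of the points that fall in each (Ppr, Tpr) cell. The function below is a rough sketch of that idea, assuming a regular grid; the name `binned_aapre`, the cell sizes, and the record layout are illustrative choices, and the paper does not state how its contours were generated.

```python
from collections import defaultdict


def binned_aapre(records, p_step=1.0, t_step=0.2):
    """Average absolute percent relative error per (Ppr, Tpr) grid cell.

    records: iterable of (p_pr, t_pr, z_exp, z_pred) tuples.
    Returns {(p_index, t_index): mean absolute percent relative error},
    where the indices are the integer cell numbers along each axis.
    """
    cells = defaultdict(list)
    for p_pr, t_pr, z_exp, z_pred in records:
        key = (int(p_pr // p_step), int(t_pr // t_step))
        cells[key].append(abs((z_exp - z_pred) / z_exp) * 100.0)
    return {key: sum(errs) / len(errs) for key, errs in cells.items()}
```

Cells with a high average error and many points indicate a region where the model is systematically weak, whereas a high error in a cell with very few points is more likely to reflect sparse or doubtful data.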
In the development of a predictive model or correlation, the leverage technique (detection of the outlier data points) plays a significant role to assess a group or groups of data which may differ from the bulk of the data present in a dataset [54–56]. As a matter of fact, the main objective of the leverage technique is that the data which are outliers (located out of applicability domain of the model) in each experimental/literature databank must be detected. A detailed description of computational procedure and equations for the leverage technique can be found elsewhere [54–56]. Hence, to check whether the GEP model is statistically acceptable; the Williams plot has been illustrated for the results obtained. The existence of the majority of data points in the ranges 0 ≤ H ≤ 0.0092 and −3 ≤ Standardized Residuals ≤ 3 confirms that the GEP model developed for the calculation of z-factor is statistically
accurate and reliable. As a consequence, good high leverage data points are located in the domain of 0.0092 b H for the method presented. Those good leverage points which are outside of the ranges −3 ≤ Standardized Residuals 3 may be regarded as outlier data points in terms of the applicability domain of the presented GEP model. The results of the z-factor predictive method illustrate that a few of the data points are located in the aforementioned domain (Fig. 7). 6. Conclusions In the present work, the gene expression programming approach was followed to develop a simple-to-utilize corresponding statesbased model for the calculation of z-factor values of more than 900 data values for natural gasses at different temperatures and pressures. The variables of the model include the pseudo-reduced pressure and pseudo-reduced temperature. 784 data points for z-factor (approximately 80% of the entire dataset) were used for developing the model and 97 data points (approximately 10% of the entire dataset), were applied for each of the validation and testing steps for the proposed model. A comparison of the method developed to other models (empirical correlations, equations of state, and an artificial intelligent technique) based on the statistical and graphical analyses showed the superiority of the newly developed method via indices such as R2, Ea and RMSE of 0.898, 3.45, and 0.04, respectively. This statistically indicates a satisfactory predictive tool. The model proposed in this study also provides a considerable improvement over previous proposed correlations and equations of state with broader applicability in terms of temperature and pressure ranges. Nomenclature ANN artificial neural network GA genetic algorithm GEP gene expression programming GP gene programming ET expression tree OF objective function pseudo-reduced pressure, dimensionless Ppr pseudo-reduced temperature, dimensionless Tpr P system pressure, psia
T  system temperature, °R
Ppc  pseudo-critical pressure, psia
Tpc  pseudo-critical temperature, °R
Pci  critical pressure of component i, psia
Tci  critical temperature of component i, °R
yi  mole fraction of component i
Z  gas compressibility factor
R  gas constant
V  volume
MSE  mean square error
RMSE  root mean square error
SD  standard deviation
R²  coefficient of determination
N  number of data points
n′  number of moles
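The leverage screening summarized in the Williams plot discussion can be sketched numerically. The following is a minimal Python sketch, not the authors' implementation. It assumes that the hat matrix is built from the two model inputs (Ppr and Tpr), that residuals are standardized by their sample standard deviation, and that the warning leverage follows the common convention H* = 3(p + 1)/n. With p = 2 inputs and an assumed total of n = 784 + 97 + 97 = 978 points, this convention gives H* ≈ 0.0092, which matches the limit quoted in the text. The function names and the synthetic data are hypothetical.

```python
import numpy as np


def williams_statistics(X, y_obs, y_pred):
    """Hat values, standardized residuals, and warning leverage H*.

    X      : (n, p) array of model inputs (here p = 2: Ppr and Tpr)
    y_obs  : (n,) reference z-factors
    y_pred : (n,) model-predicted z-factors
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    # Diagonal of the hat matrix H = X (X^T X)^-1 X^T
    xtx_inv = np.linalg.pinv(X.T @ X)
    hat = np.einsum("ij,jk,ik->i", X, xtx_inv, X)
    # Assumption: residuals are scaled by their sample standard deviation
    resid = np.asarray(y_obs, dtype=float) - np.asarray(y_pred, dtype=float)
    std_resid = resid / resid.std(ddof=1)
    # Warning leverage, conventionally 3(p + 1)/n
    h_star = 3.0 * (p + 1) / n
    return hat, std_resid, h_star


def classify(hat, std_resid, h_star, limit=3.0):
    """Label each point as 'valid', 'good high leverage', or 'outlier'."""
    labels = np.full(hat.shape, "valid", dtype=object)
    labels[(hat > h_star) & (np.abs(std_resid) <= limit)] = "good high leverage"
    labels[np.abs(std_resid) > limit] = "outlier"
    return labels


# Synthetic placeholder data only; this is not the paper's databank.
rng = np.random.default_rng(0)
n_points = 978  # assumed total: 784 + 97 + 97
X_demo = np.column_stack([
    rng.uniform(0.2, 15.0, n_points),   # hypothetical Ppr values
    rng.uniform(1.05, 3.0, n_points),   # hypothetical Tpr values
])
z_obs = rng.uniform(0.3, 1.7, n_points)
z_pred = z_obs + rng.normal(0.0, 0.04, n_points)

hat, std_resid, h_star = williams_statistics(X_demo, z_obs, z_pred)
labels = classify(hat, std_resid, h_star)
print(round(h_star, 4))  # 0.0092
```

Plotting `std_resid` against `hat`, with a vertical line at `h_star` and horizontal lines at ±3, gives a Williams plot of the kind referred to in Fig. 7.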
Appendix A. Supplementary data

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.molliq.2015.12.103.

References

[1] E. Sanjari, E.N. Lay, Estimation of natural gas compressibility factors using artificial neural network approach, J. Nat. Gas Sci. Eng. 9 (2012) 220–226.
[2] BP, Statistical Review of World Energy 2006, June 2006.
[3] X. Wang, M.J. Economides, Advanced Natural Gas Engineering, Gulf Publishing Company, Houston, TX, 2009.
[4] E. Heidaryan, A. Salarabadi, J. Moghadasi, A novel correlation approach for prediction of natural gas compressibility factor, J. Nat. Gas Chem. 19 (2010) 189–192.
[5] A. Chamkalani, S. Zendehboudi, R. Chamkalani, A. Lohi, A. Elkamel, I. Chatzis, Utilization of support vector machine to calculate gas compressibility factor, Fluid Phase Equilib. 358 (2013) 189–202.
[6] K.-L. Yan, H. Liu, C.-Y. Sun, Q.-L. Ma, G.-J. Chen, D.-J. Shen, X.-J. Xiao, H.-Y. Wang, Measurement and calculation of gas compressibility factor for condensate gas and natural gas under pressure up to 116 MPa, J. Chem. Thermodyn. (2013) DOI.
[7] K. Chylinski, M. Cebola, A. Meredith, G. Saville, W. Wakeham, Apparatus for phase equilibrium measurements at high temperatures and pressures, J. Chem. Thermodyn. 34 (2002) 1703–1728.
[8] T. Ahmed, Reservoir engineering handbook, Access Online via Elsevier, 2006.
[9] A.M. Elsharkawy, Efficient methods for calculations of compressibility, density and viscosity of natural gases, Fluid Phase Equilib. 218 (2004) 1–13.
[10] E. Heidaryan, J. Moghadasi, M. Rahimi, New correlations to predict natural gas viscosity and compressibility factor, J. Pet. Sci. Eng. 73 (2010) 67–72.
[11] A. Kamari, A. Hemmati-Sarapardeh, S.-M. Mirabbasi, M. Nikookar, A.H. Mohammadi, Prediction of sour gas compressibility factor using an intelligent approach, Fuel Process. Technol. 116 (2013) 209–216.
[12] J.H. Holland, Adaptation in natural and artificial systems: An introductory analysis with applications to biology, control, and artificial intelligence, U Michigan Press, 1975.
[13] J.R. Koza, On the programming of computers by means of natural selection, Genetic Programming, vol. 1, MIT Press, 1992.
[14] C. Ferreira, Gene expression programming: a new adaptive algorithm for solving problems, Complex Syst. 13 (2001) 87–129.
[15] N. Kumar, Compressibility factors for natural and sour reservoir gases by correlations and cubic equations of state, 2005.
[16] A. Danesh, PVT and phase behaviour of petroleum reservoir fluids, Elsevier, 1998.
[17] C.H. Twu, An internally consistent correlation for predicting the critical properties and molecular weights of petroleum and coal-tar liquids, Fluid Phase Equilib. 16 (1984) 137–150.
[18] M. Mitchell, An introduction to genetic algorithms (complex adaptive systems), DOI, 1998.
[19] C. Ferreira, Gene Expression Programming: Mathematical Modeling by an Artificial Intelligence (Studies in Computational Intelligence), Springer-Verlag New York, Inc., Secaucus, NJ, 2006.
[20] F. Gharagheizi, A. Eslamimanesh, M. Sattari, A.H. Mohammadi, D. Richon, Corresponding states method for determination of the viscosity of gases at atmospheric pressure, Ind. Eng. Chem. Res. 51 (2012) 3179–3185.
[21] F. Gharagheizi, P. Ilani-Kashkouli, A.H. Mohammadi, Estimation of lower flammability limit temperature of chemical compounds using a corresponding state method, Fuel (2012) DOI.
[22] F. Gharagheizi, A. Eslamimanesh, M. Sattari, A.H. Mohammadi, D. Richon, Corresponding states method for evaluation of the solubility parameters of chemical compounds, Ind. Eng. Chem. Res. 51 (2012) 3826–3831.
[23] F. Gharagheizi, P. Ilani-Kashkouli, A.H. Mohammadi, Corresponding states method for estimation of upper flammability limit temperature of chemical compounds, Ind. Eng. Chem. Res. 51 (2012) 6265–6269.
[24] F. Gharagheizi, A. Eslamimanesh, M. Sattari, B. Tirandazi, A.H. Mohammadi, D. Richon, Evaluation of thermal conductivity of gases at atmospheric pressure through a corresponding states method, Ind. Eng. Chem. Res. 51 (2012) 3844–3849.
[25] F. Gharagheizi, P. Ilani-Kashkouli, N. Farahani, A.H. Mohammadi, Gene expression programming strategy for estimation of flash point temperature of non-electrolyte organic compounds, Fluid Phase Equilib. 329 (2012) 71–77.
[26] L. Teodorescu, D. Sherwood, High energy physics event selection with gene expression programming, Comput. Phys. Commun. 178 (2008) 409–419.
[27] A. Kamari, A. Khaksar-Manshad, F. Gharagheizi, A.H. Mohammadi, S. Ashoori, Robust model for the determination of wax deposition in oil systems, Ind. Eng. Chem. Res. 52 (2013) 15664–15672.
[28] G. Scalabrin, P. Marchi, L. Bettio, D. Richon, Enhancement of the extended corresponding states techniques for thermodynamic modeling. II. Mixtures, Int. J. Refrig. 29 (2006) 1195–1207.
[29] A.H. Mohammadi, D. Richon, A mathematical model based on artificial neural network technique for estimating liquid water–hydrate equilibrium of water–hydrocarbon system, Ind. Eng. Chem. Res. 47 (2008) 4966–4970.
[30] R. Simon, J.E. Briggs, Application of Benedict–Webb–Rubin equation of state to hydrogen sulfide–hydrocarbon mixtures, AIChE J. 10 (1964) 548–550.
[31] R. Robinson Jr., R. Jacoby, Better compressibility factors, Hydrocarb. Process. 44 (1965) 141–145.
[32] T. Buxton, J. Campbell, Compressibility factors for lean natural gas-carbon dioxide mixtures at high pressure, Old SPE J. 7 (1967) 80–86.
[33] W.R. McLeod, Applications of molecular refraction to the principle of corresponding states, University of Oklahoma, 1968.
[34] E. Wichert, K. Aziz, Calculate Z's for sour gases, Hydrocarb. Process. 51 (1972) 119–122.
[35] C. Whitson, S. Torp, Evaluating constant volume depletion data, SPE Annual Technical Conference and Exhibition, 1981.
[36] A.M. Elsharkawy, S.G. Foda, EOS simulation and GRNN modeling of the constant volume depletion behavior of gas condensate reservoirs, Energy Fuels 12 (1998) 353–364.
[37] P. Dranchuk, H. Kassem, Calculation of Z factors for natural gases using equations of state, J. Can. Pet. Technol. 14 (1975).
[38] P. Dranchuk, R. Purvis, D. Robinson, Computer calculation of natural gas compressibility factors using the Standing and Katz correlation, Annual Technical Meeting, 1973.
[39] K.R. Hall, L. Yarborough, A new equation of state for Z-factor calculations, Oil Gas J. 71 (1973) 82–92.
[40] D.H. Beggs, J.P. Brill, A study of two-phase flow in inclined pipes, J. Pet. Technol. 25 (1973) 607–617.
[41] Shell Oil Company, Fluid Properties Package, DOI, 2003.
[42] V. Gopal, Gas z-factor equations developed for computer, Oil Gas J. (1977) 58–60 (Aug. 8, 1977), DOI.
[43] N. Azizi, R. Behbahani, M. Isazadeh, An efficient correlation for calculating compressibility factor of natural gases, J. Nat. Gas Chem. 19 (2010) 642–645.
[44] E. Sanjari, E.N. Lay, An accurate empirical correlation for predicting natural gas compressibility factors, J. Nat. Gas Chem. 21 (2012) 184–188.
[45] J.D. van der Waals, On the Continuity of the Gaseous and Liquid States, DoverPublications.com, 2004.
[46] D.-Y. Peng, D.B. Robinson, A new two-constant equation of state, Ind. Eng. Chem. Fundam. 15 (1976) 59–64.
[47] A.S. Lawal, Application of the Lawal–Lake–Silberberg Equation-of-State to Thermodynamic and Transport Properties of Fluid and Fluid Mixtures, Department of Petroleum Engineering, Texas Tech University, Lubbock, 1999, Technical Report TR-4-99, DOI.
[48] G. Soave, Equilibrium constants from a modified Redlich–Kwong equation of state, Chem. Eng. Sci. 27 (1972) 1197–1203.
[49] N.C. Patel, A.S. Teja, A new cubic equation of state for fluids and fluid mixtures, Chem. Eng. Sci. 37 (1982) 463–473.
[50] J. Kennedy, Particle swarm optimization, Encyclopedia of Machine Learning, Springer, 2010, pp. 760–766.
[51] A. Chamkalani, A. Mae'soumi, A. Sameni, An intelligent approach for optimal prediction of gas deviation factor using particle swarm optimization and genetic algorithm, J. Nat. Gas Sci. Eng. 14 (2013) 132–143.
[52] M. Kamyab, J.H. Sampaio, F. Qanbari, A.W. Eustes, Using artificial neural networks to estimate the z-factor for natural hydrocarbon gases, J. Pet. Sci. Eng. 73 (2010) 248–257.
[53] E.M.E.-M. Shokir, M.N. El-Awad, A.A. Al-Quraishi, O.A. Al-Mahdy, Compressibility factor model of sweet, sour, and condensate gases using genetic programming, Chem. Eng. Res. Des. 90 (2012) 785–792.
[54] A.H. Mohammadi, F. Gharagheizi, A. Eslamimanesh, D. Richon, Evaluation of experimental data for wax and diamondoids solubility in gaseous systems, Chem. Eng. Sci. (2012) DOI.
[55] A.H. Mohammadi, A. Eslamimanesh, F. Gharagheizi, D. Richon, A novel method for evaluation of asphaltene precipitation titration data, Chem. Eng. Sci. 78 (2012) 181–185.
[56] P.J. Rousseeuw, A.M. Leroy, Robust regression and outlier detection, Wiley.com, 2005.