A novel PSO-LSSVM model for predicting liquid rate of two phase flow through wellhead chokes

Journal of Natural Gas Science and Engineering 24 (2015) 228–237
http://dx.doi.org/10.1016/j.jngse.2015.03.013

Reza Gholgheysari Gorjaei a, Reza Songolzadeh a, Mohammad Torkaman a,*, Mohsen Safari b, Ghassem Zargar c

a Ahwaz Petroleum College, Petroleum University of Technology, Ahwaz, Iran
b Research Institute of Petroleum Industry, Tehran, Iran
c Abadan Petroleum College, Petroleum University of Technology, Abadan, Iran

Article info

Article history: Received 1 February 2015; Received in revised form 9 March 2015; Accepted 10 March 2015; Available online.

Abstract

Two-phase flow through chokes is common in the oil industry. Wellhead chokes regulate and stabilize the flow rate to prevent rapid reservoir pressure decline and water coning, and to protect downstream facilities against production fluctuations. Predicting the choke liquid rate is a basic requirement in production planning and choke design. In this study, for the first time, a least squares support vector machine (LSSVM) model is developed for predicting the liquid flow rate in two-phase flow through wellhead chokes. Particle swarm optimization (PSO) is applied to optimize the tuning parameters of the LSSVM model. The model inputs are choke upstream pressure, gas-liquid ratio (GLR) and choke size, all of which are surface-measurable variables. The flow rates calculated by the PSO-LSSVM model are in excellent agreement with the actual measured rates. Moreover, comparison of this model with related empirical correlations shows the accuracy and superiority of the model. The results of this work indicate that the PSO-LSSVM model is a powerful technique for predicting the liquid rate of chokes in the oil industry. © 2015 Elsevier B.V. All rights reserved.

Keywords: Choke; LSSVM; PSO; Two-phase flow; Kernel function

1. Introduction

Wellhead chokes in the oil industry are used to stabilize the flow rate, protect the reservoir and surface facilities from pressure swings, and control the production rate to prevent water or gas coning (Nasriani and Kalantari, 2011). Based on the bean setting, wellhead chokes may be either positive (fixed) or adjustable. In a positive choke the bean size is fixed and invariable, while adjustable chokes act like variable valves. Owing to the pressure loss along the production tubing and flow line, the pressure falls below the bubble point, so two-phase flow is common in chokes. Flow through wellhead chokes is described as either critical or subcritical. Critical flow occurs when the velocity reaches the sonic velocity of the fluid (Golan and Whitson, 1995; Guo et al., 2007). As a rule of thumb in two-phase oil and gas flow, critical flow is established when the upstream pressure is at least twice the downstream pressure. The ratio of upstream to downstream pressure at which critical flow occurs is called the critical pressure ratio.


If this ratio is greater than or equal to the critical pressure ratio, the flow is critical; otherwise it is subcritical. In the critical case the flow rate is independent of downstream disturbances: as mentioned, in critical flow the fluid in the choke throat is at its sonic velocity, and downstream disturbances such as pressure changes cannot travel faster than the sonic velocity, so the upstream pressure is independent of the downstream condition. In critical flow the rate therefore depends on the upstream pressure, while in subcritical flow the pressure difference across the choke influences the choke flow rate (Guo et al., 2007). The major problem related to two-phase flow through chokes is deriving a relation for calculating the flow rate from measurable variables such as wellhead pressure, bean size and gas-liquid ratio (GLR). Many researchers have therefore worked on two-phase flow through chokes and suggested correlations. The suggested methods for multiphase flow through chokes are categorized as empirical or analytical (Al-Attar, 2010). Tangren et al. (1949) made the first attempt at multiphase flow through restrictions: they added gas bubbles to an incompressible fluid above a critical velocity and showed that the medium is incapable of transmitting downstream pressure changes against the flow direction.


The best-known multiphase-flow choke correlation for the critical condition was developed by Gilbert (1954), in which the liquid rate is linearly proportional to the upstream pressure. His data comprise 268 production tests with choke sizes ranging from 6/64 to 18/64 inch (Tangren et al., 1949; Guo et al., 2007; Safar Beiranvand and Babaei Khorzoughi, 2012). Researchers such as Baxendell (1957), Ros (1960) and Achong (1961) modified the coefficients of Gilbert's (1954) equation and proposed new equations. The general form of Gilbert-type equations is as follows:


$Q_{liq} = \dfrac{h \, P_{upstream} \, D^{i}}{GLR^{j}}$   (1)

where Qliq, Pupstream, D and GLR are the liquid rate (STB/D), choke upstream pressure (psi), choke diameter (1/64 in.) and gas-liquid ratio (SCF/STB), respectively, and h, i and j are the specific coefficients of each equation, given in Table 1 (Guo et al., 2007). Ros (1960) extended Tangren's method to the case where gas is the continuous phase. The work of Ros was improved by Poettmann and Beck (1963), who set up charts for various crude oils of different API gravity and tested them against 108 production data points (Nasriani and Kalantari, 2011; Poettmann and Beck, 1963). Fortunati (1972) established correlations for both critical and subcritical flow through chokes; he also suggested a figure that determines the boundary between critical and subcritical flow (Fortunati, 1972). Ashford (1974a) developed a correlation for two-phase critical flow based on Ros's work (Ashford, 1974b). Al-Towailib and Al-Marhoun (1994) used 3930 production tests from Middle East fields to develop a new correlation for two-phase critical flow through chokes; their correlation was similar to Gilbert's equation, but they replaced the GLR term with the mixture density and obtained better results than previous correlations (Al-Towailib and Al-Marhoun, 1994). Al-Attar (2010) used 40 field tests to develop two equations for critical flow based on the bean-setting specification, achieving more precise correlations than previously proposed ones; he also implemented a discharge coefficient and modified the Ashford and Pierce (1975) and Fortunati (1972) subcritical correlations using 139 field data points (Ashford and Pierce, 1975; Al-Attar, 2010). Safar Beiranvand and Babaei Khorzoughi (2012) developed a new correlation using 182 data points from an Iranian oil field; they added base sediment and water (BS&W) and temperature to the Gilbert equation and obtained more accurate results than prior correlations (Safar Beiranvand and Babaei Khorzoughi, 2012). According to the literature, most of the suggested correlations for determining the oil flow rate in critical flow through wellhead chokes are derived by linear or nonlinear regression, which gives high errors, whereas artificial-intelligence techniques are a better alternative for complex problems when an adequate number of data points is available. An artificial-intelligence model for single-phase gas flow through chokes has been proposed in the literature (Nejatian et al., 2014), but no model exists for oil and gas two-phase flow. In this work, for the first time, LSSVM is used to model critical two-phase flow through wellhead chokes.

LSSVM is a kind of intelligent learning machine that remedies drawbacks of the support vector machine (SVM) method. LSSVM has parameters that must be set before training the model, and finding proper values for these parameters is one of the difficulties facing LSSVM users. These parameters are tuned here by PSO, an optimization algorithm for continuous nonlinear functions (Eberhart and Kennedy, 1995).

2. SVM background

SVM is a practical application of statistical learning theory to multidimensional functions (Vapnik, 1999). It is a learning machine originally defined for classification tasks such as optical character recognition (OCR) (Vapnik, 1995) and later developed for regression purposes (Drucker et al., 1997; Vapnik et al., 1997). SVM has recently been used in many engineering fields and has produced models with good accuracy (Esfahani et al., 2015; Meng et al., 2014; Nejatian et al., 2014; Zhou et al., 2011). In the simple case, input data $x \in \mathbb{R}^{d}$ are regressed by the hyperplane f(x):

$f(x) = \langle \omega, x \rangle + b$, with $\omega \in X$, $b \in \mathbb{R}$   (2)

where $\langle \omega, x \rangle$ denotes the inner product between x and $\omega$. A flat solution is attained if $\omega$ is small; in other words, its norm $\lVert \omega \rVert^{2} = \langle \omega, \omega \rangle$ should be minimal (Safari, 2014). For regression, Vapnik et al. (1997) defined the loss function illustrated in Fig. 1 and given in Eq. (3), which tolerates errors within a specified margin $\varepsilon$ and allows slack variables to lie outside this margin at some penalty (Vapnik et al., 1997):

$\xi = \lvert y_i - f(x_i; \omega) \rvert, \qquad \lvert \xi \rvert_{\varepsilon} := \begin{cases} 0 & \text{if } \lvert \xi \rvert \le \varepsilon \\ \lvert \xi \rvert - \varepsilon & \text{otherwise} \end{cases}$   (3)

This loss function gives more flexibility to the support vector machine regression method. Introducing positive slack variables $(\xi_i, \xi_i^{*})$, the optimization problem is formulated as:

minimize   $\dfrac{1}{2}\lVert \omega \rVert^{2} + C \sum_{i=1}^{\ell} (\xi_i + \xi_i^{*})$
subject to   $y_i - \langle \omega, x_i \rangle - b \le \varepsilon + \xi_i$, $\;\langle \omega, x_i \rangle + b - y_i \le \varepsilon + \xi_i^{*}$, $\;\xi_i, \xi_i^{*} \ge 0$   (4)

in which C is a positive constant that penalizes data whose deviation from f exceeds $\varepsilon$ (Cherkassky and Ma, 2004; Vapnik, 1998).

Table 1. Specific coefficients of the Gilbert-type correlations (Guo et al., 2007).
Correlation   h       i      j
Gilbert       0.1     1.89   0.546
Achong        0.262   1.88   0.65
Ros           0.057   2      0.5
Baxendell     0.105   1.93   0.546

Fig. 1. Vapnik linear loss function (Schölkopf and Smola, 2002).
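To make the Gilbert-type form of Eq. (1) and the coefficients of Table 1 concrete, the short Python sketch below evaluates the four correlations for the first data point of Table A.1 (Pup = 2183 psia, GLR = 1098.3, D = 48/64 in.), whose measured rate is listed there as 7277.3 STB/D. The function name and the assumption that GLR enters in SCF/STB (the units quoted for Table 2) are ours, not the authors'.

# Minimal sketch of the Gilbert-type correlation of Eq. (1):
#   Q_liq = h * P_upstream * D**i / GLR**j
# Coefficients (h, i, j) taken from Table 1; GLR assumed in SCF/STB.

GILBERT_TYPE_COEFFS = {
    #             h      i     j
    "Gilbert":   (0.100, 1.89, 0.546),
    "Ros":       (0.057, 2.00, 0.500),
    "Baxendell": (0.105, 1.93, 0.546),
    "Achong":    (0.262, 1.88, 0.650),
}

def gilbert_type_rate(p_upstream, choke_size, glr, correlation="Gilbert"):
    """Liquid rate (STB/D) from a Gilbert-type choke correlation, Eq. (1)."""
    h, i, j = GILBERT_TYPE_COEFFS[correlation]
    return h * p_upstream * choke_size ** i / glr ** j

if __name__ == "__main__":
    # First data point of Table A.1; its measured rate is 7277.3 STB/D.
    for name in GILBERT_TYPE_COEFFS:
        q = gilbert_type_rate(p_upstream=2183.0, choke_size=48.0,
                              glr=1098.3, correlation=name)
        print(f"{name:10s} Q_liq = {q:9.1f} STB/D")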


Table 2. Range of data used in the PSO-LSSVM model.
Parameter   Choke liquid rate (STB/D)   GLR (SCF/STB)    Choke size (1/64 in.)   Pressure (psig)
Range       668.4–14480.8               828.1–13095.1    21–68                   1646–3000

Fig. 2. PSO-LSSVM modeling flowchart: (a) tuning flowchart; (b) cost function.

2.1. Dual problem and quadratic programs

A Lagrange function is used to collect all constraints into a single function $L_{SVM}$:

$L_{SVM} = \dfrac{1}{2}\lVert \omega \rVert^{2} + C \sum_{i=1}^{\ell} (\xi_i + \xi_i^{*}) - \sum_{i=1}^{\ell} (\eta_i \xi_i + \eta_i^{*} \xi_i^{*}) - \sum_{i=1}^{\ell} \alpha_i \left( \varepsilon + \xi_i - y_i + \langle \omega, x_i \rangle + b \right) - \sum_{i=1}^{\ell} \alpha_i^{*} \left( \varepsilon + \xi_i^{*} + y_i - \langle \omega, x_i \rangle - b \right)$   (5)

where $\alpha_i, \alpha_i^{*}, \eta_i, \eta_i^{*}$ denote Lagrange multipliers. The saddle point of $L_{SVM}$ is the sought condition; at the optimum the partial derivatives of $L_{SVM}$ with respect to b, $\omega$, $\xi$ and $\xi^{*}$ are equal to zero:

$\dfrac{\partial L}{\partial b} = \sum_{i} (\alpha_i^{*} - \alpha_i) = 0, \quad \dfrac{\partial L}{\partial \omega} = \omega - \sum_{i} (\alpha_i - \alpha_i^{*})\, x_i = 0, \quad \dfrac{\partial L}{\partial \xi_i} = C - \alpha_i - \eta_i = 0, \quad \dfrac{\partial L}{\partial \xi_i^{*}} = C - \alpha_i^{*} - \eta_i^{*} = 0$   (6)

Substituting Eq. (6) into Eq. (5) yields the following dual optimization problem:

maximize   $-\dfrac{1}{2} \sum_{i,j=1}^{\ell} (\alpha_i - \alpha_i^{*})(\alpha_j - \alpha_j^{*}) \langle x_i, x_j \rangle - \varepsilon \sum_{i=1}^{\ell} (\alpha_i + \alpha_i^{*}) + \sum_{i=1}^{\ell} y_i (\alpha_i - \alpha_i^{*})$
subject to   $\sum_{i=1}^{\ell} (\alpha_i - \alpha_i^{*}) = 0$ and $\alpha_i, \alpha_i^{*} \in [0, C]$   (7)

In the derivation of Eq. (7), $\eta_i$ and $\eta_i^{*}$ are eliminated through the relations $\eta_i = C - \alpha_i$ and $\eta_i^{*} = C - \alpha_i^{*}$, which finally leads to Eq. (8):

$\omega = \sum_{i=1}^{\ell} (\alpha_i - \alpha_i^{*})\, x_i, \quad \text{thus} \quad f(x) = \sum_{i=1}^{\ell} (\alpha_i - \alpha_i^{*}) \langle x_i, x \rangle + b$   (8)
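The ε-insensitive regression problem of Eqs. (2)–(8) is available in standard libraries. As a purely illustrative sketch, not the authors' code, the snippet below fits scikit-learn's SVR on synthetic data; its C and epsilon parameters play the roles of C and ε in Eq. (4), and the Gaussian ("rbf") kernel anticipates the kernel functions introduced later in this section. The data and parameter values are arbitrary.

import numpy as np
from sklearn.svm import SVR

# Synthetic one-dimensional regression data (not the paper's dataset).
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(200, 1))
y = np.sinc(X).ravel() + 0.05 * rng.standard_normal(200)

# C penalizes deviations larger than epsilon, as in Eq. (4).
model = SVR(kernel="rbf", C=10.0, epsilon=0.05, gamma=0.5)
model.fit(X, y)
print("number of support vectors:", model.support_vectors_.shape[0])
print("prediction at x = 0.5:", model.predict([[0.5]])[0])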


f(x) is the support vector function that finds a linear relation between the input data $x_i$ and the target data $y_i$ (Smola and Schölkopf, 2004). When the $y_i$ relate nonlinearly to the $x_i$, linearization is achieved by transporting the input space to a feature space F using a map $\varphi$. The dot product between mapped data points is their kernel function:

$k(x_i, x_j) = \varphi(x_i)\, \varphi(x_j)$   (9)

All of the equations stated for the linear case remain valid for nonlinear conditions as well, provided that all dot products between the $x_i$ are replaced by their kernel (Scholkopf et al., 1999). The modified optimization problem is therefore:

maximize   $-\dfrac{1}{2} \sum_{i,j=1}^{\ell} (\alpha_i - \alpha_i^{*})(\alpha_j - \alpha_j^{*})\, k(x_i, x_j) - \varepsilon \sum_{i=1}^{\ell} (\alpha_i + \alpha_i^{*}) + \sum_{i=1}^{\ell} y_i (\alpha_i - \alpha_i^{*})$
subject to   $\sum_{i=1}^{\ell} (\alpha_i - \alpha_i^{*}) = 0$ and $\alpha_i, \alpha_i^{*} \in [0, C]$   (10)

Under this condition $\omega$ cannot be obtained explicitly and is therefore rewritten as:

$\omega = \sum_{i=1}^{\ell} (\alpha_i - \alpha_i^{*})\, \varphi(x_i)$   (11)

Finally, Eq. (12) is the modified support vector function for the condition in which the inputs relate nonlinearly to the outputs (Scholkopf et al., 1999; Smola and Schölkopf, 2004):

$f(x) = \sum_{i=1}^{\ell} (\alpha_i - \alpha_i^{*})\, k(x_i, x) + b$   (12)

Owing to the complexity of explicitly computing dot products between the nonlinear maps $\varphi(x_i)$, kernel functions are used to compute the kernel matrix implicitly. The linear kernel (Eq. (13)), the polynomial kernel of degree d (Eq. (14)) and the RBF kernel (Eq. (15)) are examples of such functions (Espinoza et al., 2003).

Fig. 3. PSO-LSSVM and Gilbert predicted rate versus actual rate: (a) training data; (b) testing data.

Fig. 4. Relative error percent of PSO-LSSVM predicted rate versus actual rate: (a) training data; (b) testing data.


Table 3. Comparison of the PSO-LSSVM model with empirical correlations using statistical parameters.

              Train                           Test
              R²       ARE%    AARE%          R²       ARE%    AARE%
PSO-LSSVM     0.9939   0.04    0.59           0.9641   0.39    2.49
Gilbert       0.2955   6.73    19.72          0.3516   10.06   21.63
Ros           0.3610   16.10   23.87          0.3666   12.08   25
Achong        0.3717   9.24    19.45          0.4073   5.08    21.79

1) Linear kernel, which is the direct inner product of $x_i^{T}$ and $x_j$:

$K(x_i, x_j) = x_i^{T} x_j$   (13)

2) Polynomial kernel of degree d:

$K(x_i, x_j) = \left( \dfrac{x_i^{T} x_j}{c} + 1 \right)^{d}$   (14)

where c is a tuning parameter.

3) Radial basis function (RBF) kernel:

$K(x_i, x_j) = \exp\!\left( -\dfrac{\lVert x_i - x_j \rVert^{2}}{\sigma^{2}} \right)$   (15)

where $\sigma^{2}$ is the RBF kernel tuning parameter and represents the RBF kernel width (Chapelle et al., 2002; Espinoza et al., 2003; Zhang et al., 2009).

Fig. 5. PSO-LSSVM and Gilbert predicted rate versus actual rate: (a) training, overall data; (b) testing, overall data.
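The three kernels of Eqs. (13)–(15) can be written directly as small functions. The sketch below is a plain NumPy illustration; the function names, parameter values and test vectors are our own choices, not part of the paper.

import numpy as np

def linear_kernel(xi, xj):
    # Eq. (13): direct inner product.
    return float(np.dot(xi, xj))

def polynomial_kernel(xi, xj, c=1.0, d=3):
    # Eq. (14): (x_i^T x_j / c + 1)^d with tuning parameter c and degree d.
    return float((np.dot(xi, xj) / c + 1.0) ** d)

def rbf_kernel(xi, xj, sigma2=0.5):
    # Eq. (15): exp(-||x_i - x_j||^2 / sigma^2); sigma2 sets the kernel width.
    diff = np.asarray(xi, float) - np.asarray(xj, float)
    return float(np.exp(-np.dot(diff, diff) / sigma2))

if __name__ == "__main__":
    a, b = np.array([1.0, 2.0]), np.array([0.5, -1.0])
    print(linear_kernel(a, b), polynomial_kernel(a, b), rbf_kernel(a, b))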

Table 4. Statistical summary of the PSO-LSSVM modeling of training, testing and total data for the overall dataset.

         Training                          Testing   Total
         PSO-training   PSO-validating
N        324            82                 71        477
R²       0.9550         —                  0.9467    0.9760

2.2. LSSVM methodology

In classical SVM the constraints are inequalities, which demand a quadratic-programming solution and increase the computational effort for large datasets (Suykens et al., 2002; Suykens and Vandewalle, 1999), whereas in LSSVM all inequality constraints are replaced by equality constraints. Hence, LSSVM solves a system of linear equations instead of a quadratic program, and its solution gives the support values (Shafiei et al., 2014; Suykens and Vandewalle, 1999; Ye and Xiong, 2007). The support values are related to the errors, instead of to support vectors as in classical SVM (Suykens and Vandewalle, 1999). In the regression case, LSSVM is written as:

$\min \; \dfrac{1}{2}\lVert \omega \rVert^{2} + \gamma \dfrac{1}{2} \sum_{i=1}^{N} \xi_i^{2}$   (16)

where the $\xi_i$ are slack variables and $\gamma \ge 0$ is a regularization parameter. A high $\gamma$ does not permit any slack and consequently increases model complexity, while a low $\gamma$ yields a model with high training errors. It is therefore critical to find a proper value for $\gamma$; it is one of the LSSVM tuning parameters that should be adjusted carefully (Lorena and De Carvalho, 2008; Zhang et al., 2009).

Considering the equality constraints of Eq. (17), the Lagrangian form of Eq. (16) is defined as Eq. (18):

$y_i = \omega\, \varphi(x_i) + b + \xi_i, \quad i = 1, \ldots, N$   (17)

$L_{LSSVM} = \dfrac{1}{2}\lVert \omega \rVert^{2} + \gamma \dfrac{1}{2} \sum_{i=1}^{N} \xi_i^{2} - \sum_{i=1}^{N} \alpha_i \left( \omega\, \varphi(x_i) + b + \xi_i - y_i \right)$   (18)

where the $\alpha_i$ (i = 1, ..., N) are Lagrange multipliers. Applying the optimality conditions to Eq. (18) gives:

I)   $\partial L_{LSSVM} / \partial \omega = 0 \;\rightarrow\; \omega = \sum_{i=1}^{N} \alpha_i \varphi(x_i)$
II)  $\partial L_{LSSVM} / \partial b = 0 \;\rightarrow\; \sum_{i=1}^{N} \alpha_i = 0$
III) $\partial L_{LSSVM} / \partial \xi_i = 0 \;\rightarrow\; \alpha_i = \gamma \xi_i, \quad i = 1, \ldots, N$
IV)  $\partial L_{LSSVM} / \partial \alpha_i = 0 \;\rightarrow\; \omega\, \varphi(x_i) + b + \xi_i - y_i = 0, \quad i = 1, \ldots, N$   (19)

Eliminating $\omega$ and $\xi_i$ from condition (IV) of Eq. (19) recasts the optimization problem of Eq. (16) as the linear system

$\begin{bmatrix} 0 & 1 & \cdots & 1 \\ 1 & k(x_1, x_1) + 1/\gamma & \cdots & k(x_1, x_N) \\ \vdots & \vdots & & \vdots \\ 1 & k(x_N, x_1) & \cdots & k(x_N, x_N) + 1/\gamma \end{bmatrix} \begin{bmatrix} b \\ \alpha_1 \\ \vdots \\ \alpha_N \end{bmatrix} = \begin{bmatrix} 0 \\ y_1 \\ \vdots \\ y_N \end{bmatrix}$   (20)

Finally, the ultimate form of the LSSVM prediction function is as follows (Suykens et al., 2002; Shafiei et al., 2014; Suykens and Vandewalle, 1999):

$f(x) = \sum_{i=1}^{N} \alpha_i \, k(x, x_i) + b$   (21)

Fig. 6. Relative error percent of PSO-LSSVM predicted rate versus actual rate: (a) training, overall data; (b) testing, overall data.
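Training an LSSVM therefore amounts to one dense linear solve of Eq. (20) followed by evaluation of Eq. (21). The sketch below shows this on synthetic data using the RBF kernel of Eq. (15); it is our minimal reading of the method rather than the authors' implementation, and the helper names, data and parameter values are illustrative only.

import numpy as np

def rbf(a, b, sigma2):
    # Pairwise RBF kernel matrix between the rows of a and b (Eq. (15)).
    d = a[:, None, :] - b[None, :, :]
    return np.exp(-np.sum(d * d, axis=2) / sigma2)

def lssvm_fit(X, y, gamma, sigma2):
    """Solve the KKT linear system of Eq. (20) for the bias b and the alphas."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf(X, X, sigma2) + np.eye(n) / gamma   # Omega + I/gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]                              # b, alpha

def lssvm_predict(X_new, X_train, alpha, b, sigma2):
    """Eq. (21): f(x) = sum_i alpha_i k(x, x_i) + b."""
    return rbf(X_new, X_train, sigma2) @ alpha + b

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.uniform(0.0, 1.0, size=(50, 3))   # e.g. scaled (P_up, GLR, D) inputs
    y = 2 * X[:, 0] - X[:, 1] + 0.5 * X[:, 2] + 0.01 * rng.standard_normal(50)
    b, alpha = lssvm_fit(X, y, gamma=100.0, sigma2=0.2)
    y_hat = lssvm_predict(X, X, alpha, b, 0.2)
    print("training MSE:", np.mean((y - y_hat) ** 2))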

Table 5. Comparison of the PSO-LSSVM model with empirical correlations using statistical parameters for the overall data.

              Train                  Test
              ARE%     AARE%         ARE%     AARE%
PSO-LSSVM     0.86     9.76          0.80     7.99
Gilbert       20.16    39.45         23.54    44.12
Ros           40.84    48.48         45.39    52.87
Achong        62.29    70.80         65.93    74.41

3. PSO algorithm

PSO is an algorithm for locating the optimum of continuous nonlinear functions, introduced by Eberhart and Kennedy (1995). PSO is a nature-inspired simulation of bird flocking, fish schooling and swarming theory, and it is related to both evolutionary programming and the genetic algorithm (GA) (Eberhart and Kennedy, 1995). In PSO, volumeless particles are distributed over the search space and seek the optimum location of the function. Position and velocity are the two characteristics of each particle, and a random function is used to determine the particles' initial positions and velocities. The particle with the best cost among its neighbouring particles at a given iteration is named the local best (denoted $p_{lb}$), and the local best with the lowest cost over the entire search space is called the global best ($p_{gb}$). In each iteration, all particles except the global best accelerate toward the global best and their own local best; the new particle velocity is therefore the sum of its last velocity and the local- and global-best trajectory vectors. Particle velocities and positions are updated at each step by Eq. (22) and Eq. (23):

$V_i^{n+1} = w \, V_i^{n} + c_1 r_1 \left( p_{lb}^{n} - X_i^{n} \right) + c_2 r_2 \left( p_{gb}^{n} - X_i^{n} \right)$   (22)

$X_i^{n+1} = X_i^{n} + V_i^{n+1}$   (23)
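To make the update rules of Eqs. (22) and (23) concrete, the sketch below implements a basic PSO loop with a linearly decreasing inertia weight, as recommended above. It is an illustrative implementation under our own assumptions; the swarm size, bounds, c1 and c2 values and the toy cost function are not the paper's settings.

import numpy as np

def pso_minimize(cost, bounds, n_particles=20, n_iter=100,
                 c1=2.0, c2=2.0, w_start=0.9, w_end=0.4, seed=0):
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, float)[:, 0], np.asarray(bounds, float)[:, 1]
    dim = len(lo)
    x = rng.uniform(lo, hi, size=(n_particles, dim))       # particle positions
    v = np.zeros_like(x)                                    # particle velocities
    p_lb = x.copy()                                         # local bests
    p_lb_cost = np.array([cost(p) for p in x])
    g = np.argmin(p_lb_cost)
    p_gb, p_gb_cost = p_lb[g].copy(), p_lb_cost[g]          # global best
    for n in range(n_iter):
        w = w_start - (w_start - w_end) * n / (n_iter - 1)  # decreasing inertia weight
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        v = w * v + c1 * r1 * (p_lb - x) + c2 * r2 * (p_gb - x)   # Eq. (22)
        x = np.clip(x + v, lo, hi)                                 # Eq. (23)
        costs = np.array([cost(p) for p in x])
        better = costs < p_lb_cost
        p_lb[better], p_lb_cost[better] = x[better], costs[better]
        if p_lb_cost.min() < p_gb_cost:
            g = np.argmin(p_lb_cost)
            p_gb, p_gb_cost = p_lb[g].copy(), p_lb_cost[g]
    return p_gb, p_gb_cost

if __name__ == "__main__":
    sphere = lambda p: float(np.sum(p ** 2))
    best, best_cost = pso_minimize(sphere, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
    print(best, best_cost)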

4. Model preparation

In this work, 276 production data points from an Iranian oil field are used for modeling; these data are given in Table A.1 in the Appendix. A PSO-LSSVM model is developed for predicting the choke liquid flow rate from measurable surface parameters, namely GLR, upstream pressure and choke size, whose ranges are given in Table 2.


Because the frequency of data with liquid rates lower than 2500 STB/D is small, the model prediction for this range of data is not reliable; therefore, the 9 data points that fall in the low-rate range are discarded in the first stage of model validation. The second stage investigates the model's validity for the data range discarded in stage one by adding some data from the literature to the whole set of 276 production data, without throwing any data away.

The RBF kernel function is a better choice than the other kernels mentioned in this study. Although the linear kernel performs well for linearly separable input data, it is not efficient for the non-linearly separable data of this work, and the polynomial kernel has two parameters that must be adjusted before use (the polynomial degree d and its tuning parameter c), whereas the RBF kernel has just one tuning parameter, σ², which shortens the tuning process. The RBF kernel is therefore the candidate implemented in this work. LSSVM has another tuning parameter, γ, that should be tuned besides the kernel parameter σ²; these two parameters are optimized by PSO.

For this purpose, the data are randomly divided into two groups: a training group and a testing group. It is vital to allocate the training, validating and testing portions properly: the number of training data must be neither so low that it causes model inaccuracy nor so high that it causes over-training; the testing data are used to evaluate model accuracy, whereas the validating data are used to prevent over-training. After multiple model runs, the best proportions among these groups were determined. The training group includes 82 percent of the total data and is used in the model training and validating process; the remaining 18% are placed in the testing group and reserved for testing the ultimate LSSVM model. The testing data do not participate in the PSO-LSSVM tuning or in training the final optimized LSSVM; they are used only to analyze the ultimate model performance. In the PSO-LSSVM tuning stage, the LSSVM training data are normalized and divided into two groups called PSO-training and PSO-validating data; in this work 72% of the training data are assigned to the PSO-training group and the rest to the PSO-validating group.

After the swarm size is chosen, the particles' initial positions are selected randomly in the PSO tuning process, as demonstrated in the flowchart of Fig. 2. Each particle position represents one pair of γ and σ² values that changes through the iterations until the defined criteria are reached. The particle position and the PSO-training and PSO-validating data are the cost-function inputs: in the cost function, an LSSVM model is trained on the PSO-training data and, in order to prevent over-training, the PSO-validating data are used for the prediction part. The mean square error (MSE) of the predicted data is taken as the cost-function criterion and is computed as:

$\mathrm{MSE} = \dfrac{1}{N} \sum_{i=1}^{N} \left( y_{i,\mathrm{actual}} - y_{i,\mathrm{predicted}} \right)^{2}$   (24)

where N is the number of data points, $y_{i,\mathrm{actual}}$ is the ith target value and $y_{i,\mathrm{predicted}}$ is the ith model-predicted value. The PSO iterations for determining the optimum γ and σ² continue until one of the stopping criteria is reached; the maximum number of iterations and the global-best MSE are the controlling criteria, set in this work to 150 iterations and MSE < 0.0001. If PSO has not yet reached the desired criteria, each particle's velocity and position are updated using Eq. (22) and Eq. (23), respectively, and the particles with their new positions repeat the costing cycle until a criterion is met. The final global-best position gives the optimized tuning parameters, and the training data together with the optimized γ and σ² are used to train the LSSVM model. Finally, the model is ready for data prediction.
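The cost function evaluated for each particle (Fig. 2b) can be read as: decode (γ, σ²) from the particle position, train an LSSVM on the PSO-training data, and return the MSE of Eq. (24) on the PSO-validating data. The sketch below is our interpretation of that procedure, not the published code; the helper and variable names and the synthetic data and 72/28 split are illustrative.

import numpy as np

def _rbf(A, B, sigma2):
    # Pairwise RBF kernel matrix (Eq. (15)).
    d = A[:, None, :] - B[None, :, :]
    return np.exp(-np.sum(d * d, axis=2) / sigma2)

def pso_lssvm_cost(position, X_tr, y_tr, X_val, y_val):
    """Cost of one PSO particle: train an LSSVM with the (gamma, sigma2)
    encoded in the particle position on the PSO-training data, then return
    the MSE of Eq. (24) on the PSO-validating data."""
    gamma, sigma2 = position
    n = len(y_tr)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = _rbf(X_tr, X_tr, sigma2) + np.eye(n) / gamma    # Eq. (20)
    sol = np.linalg.solve(A, np.concatenate(([0.0], y_tr)))
    b, alpha = sol[0], sol[1:]
    y_hat = _rbf(X_val, X_tr, sigma2) @ alpha + b               # Eq. (21)
    return float(np.mean((y_val - y_hat) ** 2))                 # Eq. (24)

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    X = rng.random((100, 3))
    y = X @ np.array([1.0, -2.0, 0.5]) + 0.01 * rng.standard_normal(100)
    # 72% of the training data for PSO-training, the rest for PSO-validating,
    # mirroring the split described above.
    X_tr, X_val, y_tr, y_val = X[:72], X[72:], y[:72], y[72:]
    print(pso_lssvm_cost((100.0, 0.3), X_tr, y_tr, X_val, y_val))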

5. Results and discussion

The defined maximum number of iterations stopped the tuning process, but the MSE was within the acceptable range as well (MSE = 0.002031). The final optimized γ and σ² were computed to be 681.3066 and 0.07265. The adjusted model is used to predict the flow rates of both the training and the testing data, and Figs. 3 and 4 show the model efficiency for both. Fig. 3 plots the model-predicted rates against the actual values for the training data (a) and the testing data (b); in order to compare the model with the Gilbert equation, the Gilbert-evaluated rates are added to this figure as well. The solid line (y = x) represents prediction with no error, and it is clear that the LSSVM-predicted rates match this line closely. The linear regressions and coefficients of determination (R²) of the predicted flow rates for the training and testing data are, respectively:

(a) $y = x + 65$, $R^{2}_{LSSVM} = 0.994$;   (b) $y = 0.98x + 173$, $R^{2}_{LSSVM} = 0.965$   (25)

The excellent agreement between the regression lines and y = x, together with the near-unity R², shows that the model is accurate not only for the training data but also for the testing data, which did not participate in the training process. The relative error percent (RE%) is the difference between the actual and predicted values divided by the actual value, as in Eq. (26):

$\mathrm{RE\%} = \dfrac{y_{\mathrm{actual}} - y_{\mathrm{predicted}}}{y_{\mathrm{actual}}} \times 100$   (26)
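The statistics reported in Tables 3–5 follow from Eq. (26) together with the usual definition of the coefficient of determination. A small sketch of how RE%, ARE%, AARE% and R² can be computed from measured and predicted rates is shown below; the helper name and the "predicted" values are ours, while the measured values are the first three rates listed in Table A.1.

import numpy as np

def error_statistics(y_actual, y_pred):
    y_actual = np.asarray(y_actual, float)
    y_pred = np.asarray(y_pred, float)
    re = (y_actual - y_pred) / y_actual * 100.0        # Eq. (26), per data point
    are = re.mean()                                    # average relative error, ARE%
    aare = np.abs(re).mean()                           # average absolute relative error, AARE%
    ss_res = np.sum((y_actual - y_pred) ** 2)
    ss_tot = np.sum((y_actual - y_actual.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot                         # coefficient of determination
    return {"ARE%": are, "AARE%": aare, "R2": r2, "RE%": re}

if __name__ == "__main__":
    measured = [7277.3, 7032.9, 9240.4]    # first three liquid rates of Table A.1 (STB/D)
    predicted = [7150.0, 7100.0, 9300.0]   # illustrative model outputs
    stats = error_statistics(measured, predicted)
    print({k: v for k, v in stats.items() if k != "RE%"})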

The relative error percent of each rate, for both the training and the testing data, is plotted against the actual rate in Fig. 4. The training-data relative error percent lies in the interval [−15.4, 7.6], with an average relative error (ARE%) of 0.04% and an average absolute relative error (AARE%) of 0.59%. The testing data have acceptable RE% values as well: their interval is [−14.2, 12.6] and the ARE% and AARE% are 0.39% and 2.49%, respectively. Most RE% values are close to zero, which reveals excellent agreement between the actual data and the corresponding predictions. The statistical parameters (R², ARE% and AARE%) are used in Table 3 to compare the PSO-LSSVM model with Gilbert-type equations such as those of Gilbert, Ros and Achong; the comparison confirms the superiority of the PSO-LSSVM model over the presented empirical correlations. Finally, in order to ensure the generality of the model and its proper performance under other choke conditions, reported data from the works of Ashford, Al-Attar et al. and Beiranvand et al. (Al-Attar and Abdul-Majeed, 1988; Ashford, 1974a, b; Safar Beiranvand and Babaei Khorzoughi, 2012) were extracted and added to our data for use in the LSSVM model. The optimized γ, σ² and MSE for these overall data are 8498.1762, 0.59491 and 0.00304, and details of this model are given in Table 4. As for our own data, the analysis of the overall data is presented in Figs. 5 and 6, and the comparison between the model and the other correlations is given in Table 5. In Fig. 5 the PSO-LSSVM and Gilbert predictions for the overall data are plotted against the actual data, together with the linear regression between them, and Fig. 6 shows the relative error percent of the PSO-LSSVM model against the actual rate. According to Figs. 5 and 6, the model's capability to predict the overall data is outstanding, and the comparison with the mentioned empirical correlations in Table 5 verifies that this model is considerably more accurate than the stated correlations.

6. Conclusion

Prediction of the oil flow rate when oil and gas flow simultaneously through a wellhead choke is a crucial need in production planning and choke design. Numerous empirical correlations have therefore been suggested for predicting the rate from measurable surface production data; Gilbert-type equations are the best known of these, but their precision and applicability are limited. In this work, for the first time, a PSO-LSSVM model is proposed for predicting the oil flow rate of two-phase flow through chokes. In the first step, 276 production data points from an Iranian oil field were used to train and test the model, with pressure, choke size and GLR as the model inputs for predicting the liquid rate. Model accuracy was checked by comparing the relative error percent of the model and of Gilbert-type empirical correlations: the coefficient of determination and the average absolute relative error percent of the rates predicted by the PSO-LSSVM model are 0.9935 and 0.70%, respectively. Furthermore, to confirm the generality of the model and its prediction capability over a wide range of data, reported production data were added to the first-step data, giving 477 data points in total for the PSO-LSSVM model. The reported R² and AARE for this step are 0.9760 and 4.77%, respectively, which prove the model's precision and flexibility. Finally, PSO proved to be an effective algorithm for optimizing the LSSVM tuning parameters.

Appendix

Table A.1. Iranian oil field production data used in the PSO-LSSVM model. The data are listed in blocks; within each block the four value rows give, in order, the choke upstream pressure Pup (psia), the gas-liquid ratio GLR (SCF/STB), the choke size D (1/64 in.) and the liquid rate Qliq (STB/Day).

Data points 1–56:

2183 2250 2423 2970 2500 2300 2825 2380 2158 2310 2825 2279 2780 2545 2216 2323 2970 2350 2300 2825 2280 2158 2285 2825 2329 2730 2545 2217 2423 2970 2400 2350 2825 2330 2208 2385 2875 2329 2900 2545 2224 2150 2523 2677 2970 2400 2480 2208 2385 2950 2329 3735 2830 2645 2217 2100

1098.3 1827.9 1711.4 1964.3 1040 1250 1817.4 1816.4 1701.6 1696.9 1753 1646.7 2023 890.3 1098.3 1713.1 1964.3 1040 1230.4 1815.6 1816.4 1701.6 1695.2 1753 1646.7 2023 884.2 1101.6 1713.1 1985.4 1040 1236.5 1811.9 1816.4 1699.9 1693.5 1753 1646.7 2018.9 884.2 1101.6 1824.3 1709.7 2229.6 1985.4 1040 1816.4 1699.9 1772.1 1853 1646.7 2777.6 2030 915.3 1425.3 1824.3

48 64 64 48 47 31 58 48 52 50 48 52 60 48 48 64 48 47 31 58 48 52 50 48 52 60 48 48 64 48 47 31 58 48 52 50 48 52 60 48 48 64 64 56 48 47 48 52 50 48 52 36 60 48 48 64

7277.3 7032.9 9240.4 8120.1 9610 4269.8 9384.7 5771.5 9624.2 9422.2 7417.5 7888.6 10227.1 10323.6 7107.8 9954.1 8120.1 10075 4337.9 9394.1 6573.1 9624.2 9584.7 7417.5 7543 10754.3 10395.7 7081.5 9231.2 8033.9 9920 4008.3 9412.8 6172.3 9246 8981.2 6772.5 7543 8979.8 10395.7 7045.7 7589 8525.2 6207.5 8033.9 9280 4969.9 9246 7321.4 5805 7543 2912.3 10084.8 8003.9 7081.5 7860.1

Data points 93–148:

2970 2400 2200 2825 2480 2308 2385 2975 2279 2855 2745 2136 2250 2423 2970 2400 2200 2825 2480 2208 2385 2925 2279 2855 2745 2191 2250 2423 2970 2400 2825 2480 2208 2385 2925 2279 2855 2745 2208 2423 2920 2400 2825 4340 2280 2358 2385 2925 2279 3835 2855 2745 2209 2250 2423 2677





1985.4 1040 1518.2 1819.2 1818.2 1801.4 1800.2 1753 1646.7 2025 1048.9 1425.3 1824.3 1709.7 1985.4 1040 1518.2 1819.2 1818.2 1805 1800.2 1753 1646.7 2025 1048.9 1425.3 1824.3 1709.7 1985.4 1040 1819.2 1818.2 1805 1800.2 1753 1646.7 2025 1048.9 1425.3 1709.7 1985.4 1040 1819.2 10276 1818.2 1805 1800.2 1753 1646.7 2777.6 2025 1048.9 1422.5 1818.8 1709.7 1605.9





  1 in: D 64 48 47 31 54 40 50 47 40 50 58 36 48 64 64 48 47 31 56 40 50 47 44 50 58 36 48 64 64 48 47 56 40 50 47 44 50 58 36 48 64 48 47 56 24 40 50 47 44 50 36 58 36 48 64 64 56

  STB Qliq Day 8033.9 9280 5353.2 8461 4965 8461.8 7292.3 5482.5 7888.6 9810.8 7165.8 7496.3 7047 9249.6 8033.9 9280 5353.2 8461 4965 9218.3 7292.3 6127.5 7362.7 9810.8 7165.8 7214.6 7047 9249.6 8033.9 9280 8461 4965 9218.3 7292.3 6127.5 7888.6 9810.8 7165.8 7127.6 9249.6 9621.6 9280 8461 1002 6566.6 8058.3 7292.3 6127.5 7888.6 3262.6 9810.8 7165.8 7136.7 7068 9249.6 6232.2







Data points 185–240:

2700 4360 1880 1758 2085 2775 2129 3835 2430 2445 2180 2250 2073 1740 2920 2400 2300 2825 3620 1830 1858 2285 2800 2179 3840 3030 2645 2207 2150 2577 1646 2945 2300 2400 3575 1958 2285 2775 2179 3575 2800 2495 2180 2200 2073 1703 2920 2400 2350 2825 3640 1830 1858 2285 2775 2179

1814.1 11344.7 1820 1785 1833.7 1753 1616.8 5706.6 2018.9 1049.9 1421.1 1824.3 1665 828.1 2010.9 1040 1271.2 1820.4 10753 1816.4 1785 1839.2 1753 1616.8 5599 2023 1052.5 1422.5 1826.1 1613.9 901 2010.9 1040 1279.2 11584.6 1805 1798.4 1753 1646.7 4699.6 2025 1050.9 1421.1 1824.3 1665 828.1 2010.9 1040 1271.2 1820.4 13095.1 1816.4 1785 1839.2 1753 1616.8






62 21 68 68 60 40 60 36 54 52 48 64 64 44 48 47 31 68 21 68 68 58 40 68 36 54 52 48 64 56 44 48 47 31 24 68 46 40 68 36 54 40 48 64 64 44 48 47 31 68 21 68 68 58 40 68




  STB Qliq Day 12502.6 668.4 9760 10065.5 12050 8062.5 8925.7 4201.5 12086.5 14480.8 7292.7 7047 11457.3 4493 9499.6 9280 4844.5 10258 1309.1 10180.3 11925 10652.6 7740 8580 4090.4 7900.4 11943.5 7146.9 7581.5 6753.6 5485.6 8715.8 9570 4074.2 1277.5 11151.7 7797.8 8062.5 8580 5632.8 10413.7 9242.6 7292.7 7318 11457.3 4582.3 9499.6 9280 4498.5 10258 1276.5 10180.3 11925 10652.6 8062.5 8580











Data points 57–92:

2423 2970 2400 2250 2825 2380 2158 2385 2975 2254 3735 2780 2695 2216 2150 2970 2650 2300 4045 2385 2279 3735 2206 2150 2970 2400 2300 2825 2780 2308 2385 2279 2695 2198 2150 2423

1709.7 1985.4 1040 1518.2 1819.2 1818.2 1801.4 1800.2 1753 1646.7 2777.6 2025 1048.9 1425.3 1824.3 1985.4 1040 1518.2 4015 1800.2 1646.7 2777.6 1425.3 1824.3 1985.4 1040 1518.2 1819.2 1818.2 1801.4 1800.2 1646.7 1048.9 1425.3 1824.3 1709.7






64 48 47 31 52 40 50 46 40 50 36 58 36 48 64 48 47 31 24 46 50 36 48 64 48 47 31 54 40 50 46 50 36 48 64 64










9249.6 8033.9 9280 5018.6 8461 5765.8 9624.2 7292.3 5482.5 8061.5 2912.3 10632.9 7584.9 7086.6 7589 8033.9 8555 4684 2711.4 7292.3 7888.6 2912.3 7137.8 7589 8033.9 9280 4684 8461 2562.6 8461.8 7292.3 7888.6 7584.9 7178.8 7589 9249.6

Data points 149–184:

2970 2400 2725 4340 2480 2358 2185 2975 2179 3885 2780 2745 2181 2250 2023 1921 2920 2300 2400 2800 3590 1830 1900 2285 2850 2179 3877 3030 2645 2071 2050 2473 3677 1756 2200 2300

1985.4 1040 1819.2 11466 1816.4 1805 1791.3 1753 1646.7 4444.9 2025 1048.9 1425.3 1824.3 1709.7 828.1 2010.9 1040 1272.5 1820.4 11052 1816.4 1805 1841 1753 1616.8 5615.6 2018.9 1051.9 1423.9 1822.4 1661.7 2722.1 833 1040 1272.5








48 47 60 24 40 50 54 44 50 36 62 36 48 64 64 44 48 47 31 68 21 68 68 58 40 68 36 54 52 48 64 64 36 44 47 31










8033.9 9280 10072.6 1002 4969.9 8058.3 8329.2 5482.5 8580 2744.7 10632.9 7165.8 7265.8 7047 11809.4 4055.7 9499.6 9570 4148.3 10746.4 1359.2 10180.3 11600.2 10642 7095 8580 3596.7 7916.1 11949.4 7836.9 8139.2 8657.7 3065.3 4427.9 9860 4839.7

Data points 241–276:

3910 3030 2690 2192 2250 2023 1677 2970 2350 2400 2725 3750 1880 1808 2335 2800 2179 3850 3030 2645 2076 2075 2488 3677 1790 2200 2400 2725 4360 1880 1743 2185 2760 2094 3868 2430

5599 2023 1052.5 1425.3 1824.3 1709.7 828.1 2006.9 1040 1272.5 1820.4 7274.2 1816.4 1805 1839.2 1753 1646.7 4766.6 2020.9 1050.9 1423.9 1822.4 1661.7 2722.1 833 1040 1272.5 1814.1 11344.7 1820 1785 1833.7 1753 1616.8 5706.6 2018.9







36 54 52 48 64 64 44 48 47 31 68 21 68 68 58 40 68 36 54 52 48 64 64 36 44 47 31 62 21 68 68 58 40 68 36 54




  STB Qliq Day 3166.8 7900.4 11380.7 7209.5 7047 11809.4 4645.1 7947.7 9425 3629.2 10270.5 1096.6 9779.5 12311.7 9402.4 7740 8580 2904.3 7908.3 9664.8 7811.3 7723.4 8551.9 3065.3 4346.3 9860 4148.3 12021.7 668.4 9760 10157.4 11367.3 8256 9167.7 3761.3 12086.5

References

Achong, I.B., 1961. Revised Bean and Performance Formula for Lake Maracaibo Wells. Shell Internal Report, October 1961.
Al-Attar, H., Abdul-Majeed, G., 1988. Revised bean performance equation for East Baghdad oil wells. SPE Prod. Eng. 3 (01), 127–131.
Al-Attar, H.H., 2010. New Correlations for Critical and Subcritical Two-phase Flow through Surface Chokes in High-rate Oil Wells.
Al-Towailib, A.I., Al-Marhoun, M., 1994. A new correlation for two-phase flow through chokes. J. Can. Petroleum Technol. 33 (05).
Angeline, P.J., 1998. Evolutionary Optimization versus Particle Swarm Optimization: Philosophy and Performance Differences. Paper presented at Evolutionary Programming VII.
Ashford, F., 1974a. An evaluation of critical multiphase flow performance through wellhead chokes. J. Petroleum Technol. 26 (08), 843–850.
Ashford, F., 1974b. Basic Methods of Least Squares Support Vector Machines. Least Squares Support Vector Machines, pp. 71–116.
Ashford, F.E., Pierce, P.E., 1975. Determining multiphase pressure drops and flow capacities in down-hole safety valves. J. Petrol. Technol. 27, 9.
Baxendell, P.B., 1957. Bean Performance of Lake Wells. Internal Report, October 1957.
Chapelle, O., Vapnik, V., Bousquet, O., Mukherjee, S., 2002. Choosing multiple parameters for support vector machines. Mach. Learn. 46 (1–3), 131–159.
Cherkassky, V., Ma, Y., 2004. Practical selection of SVM parameters and noise estimation for SVM regression. Neural Netw. 17 (1), 113–126.
Drucker, H., Burges, C.J., Kaufman, L., Smola, A., Vapnik, V., 1997. Support vector regression machines. Adv. Neural Inform. Process. Syst. 9, 155–161.
Eberhart, R.C., Kennedy, J., 1995. A New Optimizer Using Particle Swarm Theory. Paper presented at the Proceedings of the Sixth International Symposium on Micro Machine and Human Science.
Esfahani, S., Baselizadeh, S., Hemmati-Sarapardeh, A., 2015. On determination of natural gas density: least square support vector machine modeling approach. J. Nat. Gas Sci. Eng. 22, 348–358. http://dx.doi.org/10.1016/j.jngse.2014.12.003.
Espinoza, M., Suykens, J.A., De Moor, B., 2003. Least Squares Support Vector Machines and Primal Space Estimation. Paper presented at the 42nd IEEE Conference on Decision and Control.
Fortunati, F., 1972. Two-phase Flow through Wellhead Chokes. Paper presented at the SPE European Spring Meeting.
Gilbert, W., 1954. Flowing and gas-lift well performance. API Drill. Prod. Pract. 20 (1954), 126–157.
Golan, M., Whitson, C.H., 1995. 1-Concepts in Well Performance Engineering. In: Well Performance, second ed. Prentice Hall, Trondheim, Norway, pp. 1–109.
Guo, B., Lyons, W.C., Ghalambor, A., 2007. 5-Choke performance. In: Petroleum Production Engineering. Gulf Professional Publishing, Burlington, pp. 59–67.
Lorena, A.C., De Carvalho, A.C., 2008. Evolutionary tuning of SVM parameter values in multiclass problems. Neurocomputing 71 (16), 3326–3334.
Meng, Q., Ma, X., Zhou, Y., 2014. Forecasting of coal seam gas content by using support vector regression based on particle swarm optimization. J. Nat. Gas Sci. Eng. 21, 71–78. http://dx.doi.org/10.1016/j.jngse.2014.07.032.
Nasriani, H.R., Kalantari, A., 2011. Two-phase Flow Choke Performance in High Rate Gas Condensate Wells. Paper presented at the SPE Asia Pacific Oil and Gas Conference and Exhibition.
Nejatian, I., Kanani, M., Arabloo, M., Bahadori, A., Zendehboudi, S., 2014. Prediction of natural gas flow through chokes using support vector machine algorithm. J. Nat. Gas Sci. Eng. 18, 155–163. http://dx.doi.org/10.1016/j.jngse.2014.02.008.
Poettmann, F., Beck, R., 1963. New charts developed to predict gas-liquid flow through chokes. World Oil 184 (3), 95–100.
Ros, N., 1960. An analysis of critical simultaneous gas/liquid flow through a restriction and its application to flow metering. Appl. Sci. Res. 9 (1), 374–388.
Safar Beiranvand, M., Babaei Khorzoughi, M., 2012. Introducing a new correlation for multiphase flow through surface chokes with newly incorporated parameters. SPE Prod. Operations 27 (04), 422–428.
Safari, A., 2014. An e–E-insensitive support vector regression machine. Comput. Stat. 1–22.
Scholkopf, B., Mika, S., Burges, C.J., Knirsch, P., Muller, K., Ratsch, G., Smola, A.J., 1999. Input space versus feature space in kernel-based methods. IEEE Trans. Neural Netw. 10 (5), 1000–1017.
Schölkopf, B., Smola, A.J., 2002. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press.
Shafiei, A., Ahmadi, M.A., Zaheri, S.H., Baghban, A., Amirfakhrian, A., Soleimani, R., 2014. Estimating hydrogen sulfide solubility in ionic liquids using a machine learning approach. J. Supercrit. Fluids 95, 525–534. http://dx.doi.org/10.1016/j.supflu.2014.08.011.
Shi, Y., Eberhart, R.C., 1998. Parameter Selection in Particle Swarm Optimization. Paper presented at Evolutionary Programming VII.
Shi, Y., Eberhart, R.C., 1999. Empirical study of particle swarm optimization. Paper presented at the Proceedings of the 1999 Congress on Evolutionary Computation (CEC 99).
Smola, A.J., Schölkopf, B., 2004. A tutorial on support vector regression. Statistics Comput. 14 (3), 199–222.
Suykens, J.A.K., Van Gestel, T., De Moor, B., Vandewalle, J., 2002. Basic Methods of Least Squares Support Vector Machines. In: Least Squares Support Vector Machines. World Scientific Publishing Co. Pte. Ltd. http://www.worldscientific.com/doi/abs/10.1142/9789812776655_0003.
Suykens, J.A., Vandewalle, J., 1999. Least squares support vector machine classifiers. Neural Process. Lett. 9 (3), 293–300.
Tangren, R.F., Dodge, C.H., Seifert, H.S., 1949. Compressibility effects in two-phase flow. J. Appl. Phys. 20, 7.
Vapnik, V., 1995. The Nature of Statistical Learning Theory. Springer Verlag, New York.
Vapnik, V., Golowich, S.E., Smola, A., 1997. Support vector method for function approximation, regression estimation, and signal processing. In: Advances in Neural Information Processing Systems, pp. 281–287.
Vapnik, V.N., 1998. Statistical Learning Theory. Wiley-Interscience.
Vapnik, V.N., 1999. An overview of statistical learning theory. IEEE Trans. Neural Netw. 10 (5), 988–999.
Ye, J., Xiong, T., 2007. SVM versus Least Squares SVM. Paper presented at the International Conference on Artificial Intelligence and Statistics.
Yuhui, S., Eberhart, R., 1998. A Modified Particle Swarm Optimizer. Paper presented at the 1998 IEEE International Conference on Evolutionary Computation, IEEE World Congress on Computational Intelligence, 4–9 May 1998.
Zhang, W., Li, C., Zhong, B., 2009. LSSVM Parameters Optimizing and Non-linear System Prediction Based on Cross Validation. Paper presented at the Fifth International Conference on Natural Computation (ICNC '09).
Zhou, J., Shi, J., Li, G., 2011. Fine tuning support vector machines for short-term wind speed forecasting. Energy Convers. Manag. 52 (4), 1990–1998.

Nomenclature

ARE: average relative error
AARE: average absolute relative error
b: bias of the SVM function
C: penalty factor
D: choke diameter (1/64 in.)
f(x_i): LSSVM output predicted for the ith input
F: feature space
GA: genetic algorithm
gbest: global best
GLR: gas-liquid ratio (SCF/STB)
h, i, j: specific coefficients of the Gilbert-type correlations
k(x_i, x_j): kernel function of x_i and x_j
L_SVM: Lagrangian form of the SVM constraints
L_LSSVM: Lagrangian form of the LSSVM constraints
LSSVM: least squares support vector machine
MSE: mean square error
OCR: optical character recognition
PSO: particle swarm optimization
P_gb: global best position
P_lb: local best position
P_up, P_upstream: choke upstream pressure (psia)
Q_liq: choke liquid rate (STB/Day)
r_1, r_2: random functions
RBF: radial basis function
RE: relative error
R²: coefficient of determination
SVM: support vector machine
V_i: ith particle velocity
w: PSO inertia weight
x_i: LSSVM ith input
X_i: ith particle position
y_i: ith actual output value
α_i, α_i*, η_i, η_i*: Lagrange multipliers of the ith data point
ε: regression margin
(ξ_i, ξ_i*): slack variables, the amount by which a prediction deviates from the actual data
φ(x): mapped form of x
σ²: width (tuning parameter) of the RBF kernel
γ: regularization parameter
ω: SVM regression weight vector
#: data index
