A model ranking and uncertainty propagation approach for improving confidence in solids transport model predictions



Journal of Petroleum Science and Engineering (xxxx) xxxx–xxxx

Contents lists available at ScienceDirect

Journal of Petroleum Science and Engineering journal homepage: www.elsevier.com/locate/petrol

Frits Byron Soepyan (a), Selen Cremaschi (b,⁎), Cem Sarica (c), Hariprasad J. Subramani (d), Haijing Gao (d)

(a) The University of Tulsa, Russell School of Chemical Engineering, 800 South Tucker Drive, Tulsa, OK 74104, USA
(b) Auburn University, Department of Chemical Engineering, 212 Ross Hall, Auburn, AL 36849-5127, USA
(c) The University of Tulsa, McDougall School of Petroleum Engineering, 800 South Tucker Drive, Tulsa, OK 74104, USA
(d) Chevron Energy Technology Company, 1400 Smith Street, Houston, TX 77002, USA

ARTICLE INFO

Keywords: Solid particle transport; Threshold velocity prediction; Model evaluation; Uncertainty propagation; Confidence in model predictions; Monte Carlo simulation

ABSTRACT

The transport of solid particles in pipelines is of interest in the petroleum industry, and is needed to increase flow efficiency in the pipe and prevent pipeline damage due to the particles’ accumulation. To achieve this goal, the velocity of the carrier fluid in the pipe needs to exceed the threshold velocity. Many solids transport models are available for predicting the threshold velocity, but for the same input condition, the predictions of these models may vary by orders of magnitude, and information regarding the confidence of the models’ predictions is not readily available. To resolve these issues, this paper presents a model evaluation and uncertainty propagation approach that uses a novel combination of data clustering, model parameter fine-tuning, model screening and ranking, model uncertainty quantification, and Monte Carlo simulation methods. The inputs are the experimental database for solids transport, a set of solids transport models, and the input condition(s) where the models’ predictions are needed. The outputs of the methodology include the models’ rankings, and the envelopes of the models’ predictions to within a predetermined confidence level. By propagating the uncertainties of the models, experimental data, and input conditions, the highest-ranked models produce velocity envelopes at the 90% confidence level that cover the experimentally-observed values for 92% of the cases; while using the prediction of an individual model does not provide any information regarding the prediction confidence.

Corresponding author. E-mail address: [email protected] (S. Cremaschi).

http://dx.doi.org/10.1016/j.petrol.2016.12.025
Received 8 September 2016; Received in revised form 1 December 2016; Accepted 19 December 2016
0920-4105/ © 2016 Elsevier B.V. All rights reserved.

Please cite this article as: Soepyan, F.B., Journal of Petroleum Science and Engineering (2016), http://dx.doi.org/10.1016/j.petrol.2016.12.025

Nomenclature

$C$: particle volumetric concentration
$C_i$: particle volumetric concentration of datum point i
$C_0$: particle volumetric concentration of the input condition
$d_i$: weighted Euclidean distance between experimental datum point i and the input condition
$D$: hydraulic diameter of the conduit
$D_i$: hydraulic diameter of the conduit of datum point i
$D_0$: hydraulic diameter of the conduit of the input condition
$d_p$: particle diameter
$d_{p,i}$: particle diameter of datum point i
$d_{p,0}$: particle diameter of the input condition
$E_{\mathrm{MA},j}$: mean absolute error of model j
$E_{\mathrm{MAP},j}$: mean absolute error percentage of model j
$E_{\mathrm{MS},j}$: mean squared error of model j
$E_{P,i,j}$: error percentage of model j at experimental datum point i in the reduced database
$E_{\mathrm{SS},j}$: error sum of squares of the threshold velocity predictions of model j
$F_{\mathrm{calc}}(v)$: distribution function of the predicted threshold velocity in the reduced database
$F_{\mathrm{exp}}(v)$: distribution function of the experimentally-observed threshold velocity in the reduced database
$f(\mathbf{x}_l, \mathbf{k})$: equation of the model
$f(\mathbf{x}_{l,i}, \mathbf{k}_j)$: equation of model j at datum point i
$H_a$: alternative hypothesis
$h_l$: correlation between independent variable l and the threshold velocity
$H_0$: null hypothesis
$i$: index of the experimental data points
$j$: index of the models
$J$: number of ranked models
$\mathbf{k}$: vector that consists of the parameters (constants) of the model
$k_j$: number of parameters in model j
$\mathbf{k}_j$: vector that consists of the parameters of model j
$k_{j,\mathrm{non}}$: number of parameters in model j that become non-zero after the model parameter fine-tuning process
$l$: index of the independent variables
$\max_{\%,j}$: maximum between $\mathrm{over}_{\%,j}$ and $\mathrm{under}_{\%,j}$
$m_j$: slope between the predictions of model j and the experimentally-observed values of the threshold velocity
$m_{0,j}$: slope between the predictions of model j and the experimentally-observed values of the threshold velocity, with the intercept forced to be at the origin
$N_{\mathrm{Ar}}$: Archimedes number
$N_{\mathrm{Ar},i}$: Archimedes number of datum point i
$N_{\mathrm{Ar},0}$: Archimedes number of the input condition
$n_{\mathrm{data}}$: number of data points in the reduced database
$N_{\mathrm{data}}$: number of data points in the experimental database
$n_{\mathrm{indep},j}$: number of independent variables incorporated in model j
$N_{\mathrm{model}}$: total number of models available in the model database
$N_{\mathrm{Re},p}$: particle Reynolds number
$N_{\mathrm{Re},p,i}$: particle Reynolds number of datum point i
$N_{\mathrm{Re},p,i,j}$: particle Reynolds number predicted by model j for datum point i
$n_{\mathrm{trial}}$: total number of trials (replications) for the Monte Carlo simulation method
$N_{\mathrm{var}}$: total number of independent variables that describe the physical system
$\mathrm{over}_{\%,j}$: percentage of experimentally-observed threshold velocity overestimated by model j in the reduced database
$P_{\%,j}$: percentage of threshold velocity predictions produced by model j that lie outside $\pm(100\%) \times \min(1 - \varepsilon_l,\ \varepsilon_u - 1)$ of the experimental observations
$R^2_j$: $R^2$ statistic of model j
$R^2_{\mathrm{adj,dev},j}$: deviation of the modified adjusted-$R^2$ statistic of model j from the value of one
$R^2_{\mathrm{adj},j}$: adjusted-$R^2$ statistic of model j
$R^2_{\mathrm{adj,mod},j}$: modified adjusted-$R^2$ statistic of model j
$S_j$: score of model j
$t$: index of the trial (replication) of the Monte Carlo simulation method
$T_{\mathrm{SS}}$: total sum of squares of the experimentally-observed threshold velocity
$T_{\mathrm{SS,mod}}$: modified total sum of squares of the experimentally-observed threshold velocity
$T_1$: test statistic for the null hypothesis
$U_{\mathrm{exp},i}$: uncertainty of the experimentally-observed threshold velocity at datum point i
$\mathrm{under}_{\%,j}$: percentage of experimentally-observed threshold velocity underestimated by model j in the reduced database
$U_{x_l}$: uncertainty of independent variable l
$U_{x_{l,i}}$: uncertainty of independent variable l at datum point i
$v$: threshold velocity
$v_{\mathrm{calc},i,j}$: threshold velocity predicted by model j for experimental datum point i
$v_{\mathrm{exp}}$: experimentally-observed threshold velocity
$v_{\mathrm{exp,avg}}$: average value of the experimentally-observed threshold velocity in the reduced database
$v_{\mathrm{exp},i}$: experimentally-observed threshold velocity of datum point i
$v_{L,\mathrm{exp},i}$: lower bound of the value of the threshold velocity at experimental datum point i
$v_{M,i,j}$: estimated “true” value of the threshold velocity at the input condition given the error of model j at experimental datum point i
$v_{U,\mathrm{exp},i}$: upper bound of the value of the threshold velocity at experimental datum point i
$v_{0,\mathrm{avg}}$: average value of the threshold velocity predictions of all the ranked models for the input condition
$v_{0,\mathrm{dev},j}$: absolute deviation of $v_{0,j}$ from $v_{0,\mathrm{avg}}$
$v_{0,i}$: threshold velocity prediction of the model for the ith input condition
$v_{0,j}$: threshold velocity prediction of model j for the input condition
$x$: independent variable
$x_l$: value of independent variable l
$\mathbf{x}_l$: vector that contains the values of the independent variables
$x_{l,i}$: value of independent variable l at datum point i
$\mathbf{x}_{l,i}$: vector that contains the independent variables of datum point i
$\bar{x}_{l,i}$: normalized $x_{l,i}$
$x_{L,l,i}$: lower bound of the value of independent variable l at experimental datum point i
$x_{l,0}$: value of independent variable l at the input condition
$\bar{x}_{l,0}$: normalized $x_{l,0}$
$x_{U,l,i}$: upper bound of the value of independent variable l at experimental datum point i
$x_1$: first independent variable
$x_2$: second independent variable
$y_j$: statistic of model j
$z$: dependent variable
$\alpha_S$: level of significance
$\varepsilon_l$: acceptable lower bound of the ratio of the model's prediction to the value of the threshold velocity observed experimentally
$\varepsilon_u$: acceptable upper bound of the ratio of the model's prediction to the value of the threshold velocity observed experimentally
$\theta$: conduit inclination angle
$\theta_i$: conduit inclination angle of datum point i
$\theta_0$: conduit inclination angle of the input condition
$\mu_i$: fluid viscosity of datum point i
$\mu_0$: fluid viscosity of the input condition
$\rho_f$: fluid density
$\rho_{f,i}$: fluid density of datum point i
$\rho_{f,0}$: fluid density of the input condition
$\rho_S$: particle density
$\rho_{S,i}$: particle density of datum point i
$\rho_{S,0}$: particle density of the input condition

1. Introduction

In the petroleum industry, the need to hydraulically transport solid particles is encountered frequently. For instance, hydraulic fracturing involves injecting fluid [typically water, oil, acids, methanol (Pangilinan et al., 2016), or water mixed with drag-reducing polymer (Gu and Mohanty, 2015)] and proppants [typically sand, ceramic, or resin-coated ceramic or sand (Pangilinan et al., 2016)] at high rate and pressure (Shiozawa and McClure, 2016). This process creates fractures in the “geologic formations” (Pangilinan et al., 2016), which increases the permeability and the production rate of the oil reservoir (Zheng et al., 2015). In another application, during well drilling, the cuttings need to be transported by the drilling fluid (Akhshik et al., 2015) to prevent the formation of a stationary bed of solids at the bottom of the wellbore (Rodriguez Corredor et al., 2016). Consequences of having a stationary bed of solids include “slow drilling rate, and in severe cases, stuck pipe” (Rodriguez Corredor et al., 2016). In these cases, the fluid velocity must exceed the threshold velocity to successfully transport the solid particles in the pipe.

Many solids transport models exist that predict such velocity (Soepyan, 2015). Furthermore, different threshold velocity definitions exist (Soepyan et al., 2014), including the critical velocity (the fluid velocity that marks the boundary between the settling of solid particles at the bottom of the pipe and the particles’ full suspension) (Oroskar and Turian, 1980), saltation velocity (the minimum fluid velocity needed to prevent suspended solid particles from settling to the bottom of the pipe) (Zenz, 1964), equilibrium velocity (the fluid velocity where the rate at which the particles are transported by the fluid equals the rate at which the particles settle to the bottom of the pipe) (Gruesbeck et al., 1979), pick-up velocity (the fluid velocity required to initiate the motion of a solid particle initially at rest on a bed of solids) (Hayden et al., 2003), and incipient motion velocity (the fluid velocity required to initiate the motion of a solid particle initially at rest at the bottom of the pipe) (Rabinovich and Kalman, 2009a).

Different models may be developed using different assumptions regarding the dominant forces for solid particle transport, given the different velocity definitions (Soepyan et al., 2014; Soepyan, 2015). For cases where the models were developed by experimental data-fitting, different models may be developed using different ranges of independent variables (e.g., fluid density and viscosity; particle density, diameter and concentration; and pipe diameter and inclination angle) (Soepyan et al., 2014). For instance, the dominant forces are different for the case of initiating the motion of solid particles that are initially at rest (i.e., for predicting the pick-up or incipient motion velocity) (Soepyan et al., 2016) and for the case of preventing solid particles that are already in motion from settling to the bottom of the pipe (i.e., for predicting the saltation velocity) (Davies, 1987; Ponagandla, 2008).

Given the differences in the process of developing the solids transport models, for the same input condition (i.e., a set of independent variables where the models’ predictions are needed), the models’ velocity predictions may differ by orders of magnitude, as observed in our previous works (Cremaschi et al., 2015; Soepyan et al., 2013a). This discrepancy makes the selection of the appropriate operating fluid velocity for the input condition difficult.

Furthermore, neither quantitative nor qualitative information on the confidence of these models’ predictions is readily available. Such information is important for quantifying the probability that the threshold velocity of the fluid in the pipeline is sufficient for solid particle transport. For instance, suppose that a pipeline is operating at a particular fluid velocity that is recommended by a reliable solids transport model, and that we would like to be at least 90% confident that the operating fluid velocity is sufficient for particle transport. When the confidence behind the model's prediction is quantified, if the value of the confidence is less than 90%, the operating fluid velocity needs to be increased. Otherwise, the operating fluid velocity can remain unchanged. In other words, quantifying the confidence behind the model's prediction helps in the decision-making process regarding the operating fluid velocity in the pipe.

Given a set of solids transport models, and a solids transport experimental database where the performance of the models can be evaluated, the key challenges involve identifying the appropriate models for the given input condition, and quantifying the uncertainties of the models’ predictions to increase the confidence in these predictions. In other words, the questions are: (1) Given an input condition, which models should be used to predict the threshold velocity? (2) What is the confidence that these models’ predictions are “correct”?

A thorough literature search suggests that there has not yet been any work where both a model evaluation technique and an uncertainty propagation approach are applied to the field of solid particle transport. In one of our previous works (Soepyan et al., 2013a), we developed a model evaluation technique that recommends the solids transport models to use to predict the threshold velocity for a given input condition. In another work (Soepyan et al., 2016), we developed a semi-mechanistic solids transport model using a combination of the physics behind solid particle transport and experimental data-fitting. We quantified the uncertainties of our model's predictions using the model's errors at the experimental data. In this work, we answer the two questions in the previous paragraph with the development of a methodology that combines a model evaluation technique (to answer the first question) with an uncertainty propagation approach (to answer the second question), thus merging the two concepts discussed in our previous works.

Our model evaluation and uncertainty propagation technique uses a novel combination of data clustering, model parameter fine-tuning, model screening and ranking, model uncertainty quantification, and Monte Carlo simulation methods. The model screening and ranking approach uses standard model evaluation metrics in a novel way to evaluate the models’ accuracy. Meanwhile, our uncertainty propagation method advances the state of the art in the field of solid particle transport by utilizing all existing information regarding the effect of each source of uncertainty (experimental data, models, and input condition) to quantify the uncertainties of the models’ predictions.

This paper is organized as follows. Section 2 provides brief descriptions of the solids transport experimental database and models used in the analysis. Section 3 describes our model evaluation and uncertainty propagation approach in detail. Section 4 presents the results of the validation studies. Finally, Section 5 provides the concluding remarks.

2. Solids transport experimental database and models

The experimental database consists of 174 data points for the hydraulic conveying of solid particles at low particle concentrations (less than 100 ppm, or parts per million) and horizontal or near-horizontal flow (the references for the experimental data are listed in Supplementary Materials, and the experimental data are tabulated in Table S.1 of Supplementary Materials). There are 141 data points for experiments conducted using water as the carrier fluid, and the remaining 33 data points originate from experiments conducted using more viscous liquids, with viscosities ranging from $2.80 \times 10^{-3}$ Pa·s to $4.48 \times 10^{-1}$ Pa·s. We assume that the fluids used in the experiments are Newtonian.

In these experimental data, seven independent variables have been identified as the most significant (Soepyan et al., 2014): $C_i$ (particle volumetric concentration of datum point i), $\theta_i$ (conduit inclination angle of datum point i), $\rho_{f,i}$ (fluid density of datum point i), $\mu_i$ (fluid viscosity of datum point i), $\rho_{S,i}$ (particle density of datum point i), $d_{p,i}$ (particle diameter of datum point i), and $D_i$ (hydraulic diameter of the conduit of datum point i). Here, i denotes the index for the experimental data points. The dependent variable (i.e., the measured variable in the experiments) is the threshold fluid velocity for particle motion, $v_{\mathrm{exp},i}$.

Solids transport models are used to predict the threshold velocity of a single-phase carrier fluid required for particle transport. Fifty-four models were gathered from open literature (the references of the models are listed in Supplementary Materials, and the dimensionless equations of these models can be found in Table S.2 of Supplementary Materials). The models predict the threshold velocity for solid particle transport given an input condition, i.e., $C_0$, $\theta_0$, $\rho_{f,0}$, $\mu_0$, $\rho_{S,0}$, $d_{p,0}$, and $D_0$. Here, the subscript $i = 0$ is used to denote the input condition. Each solids transport model j (with j representing the index of the solids transport models) can be used to predict the threshold velocity $v_{\mathrm{calc},i,j}$ for the 174 experimental data points (i.e., for each experimental datum point i), and the threshold velocity $v_{0,j}$ of the input condition.
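For readers who want to handle the database programmatically, one datum point can be represented as in the minimal sketch below. The class names, field names, SI units, and the value of the gravitational acceleration are assumptions made for illustration; they are not taken from the Supplementary Materials. The two dimensionless groups follow the definitions of the Archimedes number and the particle Reynolds number given in Section 3.1, and the numerical values are illustrative only (they are not entries of Table S.1).

```python
from dataclasses import dataclass

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)


@dataclass(frozen=True)
class Condition:
    """Seven independent variables of a datum point or an input condition."""
    C: float      # particle volumetric concentration (volume fraction)
    theta: float  # conduit inclination angle, rad
    rho_f: float  # fluid density, kg/m^3
    mu: float     # fluid viscosity, Pa*s
    rho_s: float  # particle density, kg/m^3
    d_p: float    # particle diameter, m
    D: float      # hydraulic diameter of the conduit, m

    def archimedes_number(self) -> float:
        # N_Ar = g * d_p^3 * rho_f * (rho_S - rho_f) / mu^2
        return G * self.d_p ** 3 * self.rho_f * (self.rho_s - self.rho_f) / self.mu ** 2

    def particle_reynolds_number(self, v: float) -> float:
        # N_Re,p = d_p * v * rho_f / mu
        return self.d_p * v * self.rho_f / self.mu


@dataclass(frozen=True)
class DatumPoint:
    """One experimental record: the condition plus the measured threshold velocity."""
    condition: Condition
    v_exp: float  # experimentally-observed threshold velocity, m/s


# Illustrative values only: 1 mm sand in water, 50 ppm, horizontal pipe.
sand_in_water = Condition(C=5e-5, theta=0.0, rho_f=1000.0, mu=1.0e-3,
                          rho_s=2650.0, d_p=1.0e-3, D=0.05)
point = DatumPoint(condition=sand_in_water, v_exp=0.2)
print(sand_in_water.archimedes_number())
print(point.condition.particle_reynolds_number(point.v_exp))
```

The same `Condition` type can hold the input condition (subscript 0), which has no measured velocity attached.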

3. Model evaluation and uncertainty propagation methodology

Prior to developing the methodology, we have the input condition, the experimental data, and the solids transport models available. Therefore, three sources of uncertainties are identified: (1) the input condition (i.e., the uncertainties of the values of the independent variables of the input condition), (2) the experimental data (i.e., the uncertainties of the values of the threshold velocity and the independent variables of the experimental data), and (3) the models (i.e., the uncertainties originating from the errors of the models’ threshold velocity predictions). Throughout this paper, these three sources of uncertainties will be referred to as the uncertainty of the input condition, the uncertainty of the experimental data, and the uncertainty of the models, respectively.

The inputs to our methodology are: (1) the input condition, (2) the uncertainty of the input condition and its distributions, (3) the experimental database, (4) the uncertainty of the experimental database and its distributions, and (5) the equations of the solids transport models. The uncertainties of the solids transport models are not an input because these uncertainties and their distributions are quantified using our methodology (Section 3.4). The outputs of our methodology are: (1) a recommendation of the models to use for the input condition, (2) the distributions of the models’ threshold velocity predictions for the input condition, and (3) the envelopes of the models’ predictions to within a predetermined confidence level.

These outputs are produced using our methodology, which consists of five parts: (1) the data clustering component (Section 3.1), (2) the model parameter fine-tuning module (Section 3.2), (3) the model screening and ranking protocol (Section 3.3), (4) the model uncertainty quantification method (Section 3.4), and (5) the Monte Carlo simulation method (Section 3.5). The purpose and the procedure for implementing each component of the methodology are provided in the following subsections.

3.1. Data clustering component

The data clustering component generates a reduced database comprised of experimental data that are representative of the input condition, where these data points are selected from the experimental database. The distance between each experimental datum point and the input condition is used to select the data points that will populate the reduced database. The variables in the experimental data (which are used to calculate the above-mentioned distance) may have different scales. Thus, to ensure equal contributions from the variables in the data clustering process, the variables are normalized using Eq. (1) (Soepyan et al., 2013a):

$$\bar{x}_{l,i} = \frac{x_{l,i} - \min(x_{l,0}, x_{l,1}, x_{l,2}, \ldots, x_{l,N_{\mathrm{data}}})}{\max(x_{l,0}, x_{l,1}, x_{l,2}, \ldots, x_{l,N_{\mathrm{data}}}) - \min(x_{l,0}, x_{l,1}, x_{l,2}, \ldots, x_{l,N_{\mathrm{data}}})}, \quad i = 0, 1, 2, \ldots, N_{\mathrm{data}}, \quad l = 1, 2, \ldots, N_{\mathrm{var}}. \tag{1}$$

In the above equation, l represents the index of the independent variables, $N_{\mathrm{data}}$ the number of data points in the experimental database, $N_{\mathrm{var}}$ the number of independent variables that describe the physical system, $x_{l,i}$ the value of independent variable l at datum point i, $x_{l,0}$ the value of independent variable l at the input condition, and $\bar{x}_{l,i}$ the value of $x_{l,i}$ upon normalization. Using the normalized values of the independent variables of the input condition and experimental datum point i, the weighted Euclidean distance function is used to quantify the distance between the input condition and experimental datum point i:

$$d_i = \left\{ \sum_{l=1}^{N_{\mathrm{var}}} \left[ \left| h_l \right| \times \left( \bar{x}_{l,i} - \bar{x}_{l,0} \right)^2 \right] \right\}^{1/2}. \tag{2}$$

In Eq. (2), $d_i$ represents the weighted Euclidean distance between experimental datum point i and the input condition, and $h_l$ the correlation between independent variable l and the threshold velocity. The correlation represents the strength of the linear relation between two variables (Devore and Berk, 2007). If the absolute value of the correlation is equal to one, there is a strong possibility that the two variables are linearly dependent, while a correlation value of zero suggests that there is no linear dependence between the two variables. By using the absolute values of the correlations as weights in the Euclidean distance function, a datum point is penalized more heavily when it differs from the input condition in a more important variable. Here, the importance of the variable is determined using the absolute value of the correlation, where the more significant variables have correlations whose absolute values are closer to one. For instance, suppose that dependent variable $z$ is more linearly dependent on independent variable $x_1$ than on independent variable $x_2$ (i.e., the absolute value of the correlation between $z$ and $x_1$ is greater than the absolute value of the correlation between $z$ and $x_2$). When the weighted Euclidean distance between the input condition and a datum point is computed, a unit difference between the values of $x_1$ of the input condition and the datum point increases the distance more than a unit difference between the values of $x_2$ does.

There are cases where the dependent variable $z$ is not linearly dependent on many of the independent variables, such as the case of solid particle transport in pipes, where the threshold velocity is not linearly dependent on some of the independent variables (Soepyan et al., 2016). As a result, the use of the correlation in Eq. (2) may not adequately capture the importance of the independent variables. For these cases, a data transformation may be necessary. Using dimensionless groups to describe the physical phenomena is common practice in the field of solid particle transport (Kalman et al., 2005; Rabinovich and Kalman, 2009a, 2009b). Using dimensional analysis (Cimbala and Çengel, 2008), the seven independent variables ($C_i$, $\theta_i$, $\rho_{f,i}$, $\mu_i$, $\rho_{S,i}$, $d_{p,i}$, and $D_i$) and the threshold velocity can be combined to form the following dimensionless groups (among others): $C_i$, $\theta_i$, the ratio of the particle density to the fluid density ($\rho_{S,i} / \rho_{f,i}$), the ratio of the hydraulic diameter to the particle diameter ($D_i / d_{p,i}$), the Archimedes number [$N_{\mathrm{Ar},i} = g d_{p,i}^3 \rho_{f,i} (\rho_{S,i} - \rho_{f,i}) / \mu_i^2$] (Rabinovich and Kalman, 2009a), and the particle Reynolds number ($N_{\mathrm{Re},p,i} = d_{p,i} v_{\mathrm{exp},i} \rho_{f,i} / \mu_i$) (Rabinovich and Kalman, 2009a). These dimensionless groups were selected because they incorporate some of the most significant forces for solid particle transport (Soepyan et al., 2014).

Based on our experimental database (Table S.1 of Supplementary Materials), we found that the base-10 logarithm of the particle Reynolds number has a strong linear dependence on the base-10 logarithm of the Archimedes number and the base-10 logarithm of the ratio of the hydraulic diameter to the particle diameter. Thus, for Eq. (1), $N_{\mathrm{var}} = 5$, and $x_{1,i}$, $x_{2,i}$, $x_{3,i}$, $x_{4,i}$, and $x_{5,i}$ are substituted with $\log_{10}(C_i)$, $\cos(\theta_i)$, $\log_{10}(\rho_{S,i} / \rho_{f,i})$, $\log_{10}(D_i / d_{p,i})$, and $\log_{10}(N_{\mathrm{Ar},i})$. Here, the cosine of $\theta_i$ is used because, for pipe inclination angles of zero degrees (i.e., horizontal pipes), the logarithm of $\theta_i$ is undefined. Using these substitutions, Eq. (2) becomes:

$$d_i = \Big[ \left| h_1 \right| \left( \overline{\log_{10}(C_i)} - \overline{\log_{10}(C_0)} \right)^2 + \left| h_2 \right| \left( \overline{\cos(\theta_i)} - \overline{\cos(\theta_0)} \right)^2 + \left| h_3 \right| \left( \overline{\log_{10}(\rho_{S,i} / \rho_{f,i})} - \overline{\log_{10}(\rho_{S,0} / \rho_{f,0})} \right)^2 + \left| h_4 \right| \left( \overline{\log_{10}(D_i / d_{p,i})} - \overline{\log_{10}(D_0 / d_{p,0})} \right)^2 + \left| h_5 \right| \left( \overline{\log_{10}(N_{\mathrm{Ar},i})} - \overline{\log_{10}(N_{\mathrm{Ar},0})} \right)^2 \Big]^{1/2}. \tag{3}$$

Similar to Eq. (2), the overbars signify that the logarithm or cosine of the variables is normalized, while $h_1$, $h_2$, $h_3$, $h_4$, and $h_5$ represent the correlations between $\log_{10}(C)$ and $\log_{10}(N_{\mathrm{Re},p})$, $\cos(\theta)$ and $\log_{10}(N_{\mathrm{Re},p})$, $\log_{10}(\rho_S / \rho_f)$ and $\log_{10}(N_{\mathrm{Re},p})$, $\log_{10}(D / d_p)$ and $\log_{10}(N_{\mathrm{Re},p})$, and $\log_{10}(N_{\mathrm{Ar}})$ and $\log_{10}(N_{\mathrm{Re},p})$, respectively.

The Euclidean distance $d_i$ between the input condition and each datum point in the experimental database is quantified. Then, the experimental data are sorted based on the value of their Euclidean distance, from smallest to largest. The number of experimental data points ($n_{\mathrm{data}}$) that will be used to populate the reduced database is determined by first quantifying the difference of the Euclidean distances between two consecutive data points. Then, large values of these differences are determined using a distribution-free outlier detection method (Devore and Berk, 2007). A difference is considered large if it is greater than $d_{\mathrm{diff},u} + 1.5\, d_{\mathrm{diff},f}$. Here, $d_{\mathrm{diff},u}$ represents the upper fourth of the Euclidean distance difference, and $d_{\mathrm{diff},f}$ the fourth spread of the Euclidean distance difference. To increase the confidence that a large enough sample size is used for the subsequent steps of the methodology, the number of data points populating the reduced database should be greater than 50 (Devore and Berk, 2007). Thus, the number of data points selected to populate the reduced database is equal to the index i of the sorted datum point, with i > 50, where there is a large difference in the Euclidean distance between datum point i and i + 1.

3.2. Model parameter fine-tuning module

The model parameter fine-tuning module uses the experimental data points in the reduced database to adjust the parameters (i.e., constants) of the solids transport models such that the bias in the models’ predictions is removed as much as possible for the reduced database. The proposed methodology can be implemented without the model parameter fine-tuning module; in this case, the models with their original parameters (i.e., the original models) are used instead of the fine-tuned models.

If the models under consideration predict a quantity that has dimensions, which is the case for many solids transport models (Soepyan et al., 2014), the models’ equations are converted to dimensionless forms to prevent dimensional mismatch that may occur when the values of the parameters are fine-tuned. The models were converted to their dimensionless forms by multiplying both sides of their equations by ($d_{p,i} \rho_{f,i} / \mu_i$). This process yields the particle Reynolds number ($N_{\mathrm{Re},p,i,j} = d_{p,i} v_{\mathrm{calc},i,j} \rho_{f,i} / \mu_i$) on the left-hand side of each model's equation, which originally contains the threshold velocity. The right-hand side of the model's equation is then rearranged to form additional dimensionless groups.

For each model, the parameters that minimize the error sum of squares of the model for the reduced database are determined by solving a nonlinear optimization problem with the value of zero as the lower bounds of the parameters. This non-negativity constraint ensures that the forms of the models remain unchanged, as it is assumed that they have the correct form. The model parameter fine-tuning method presented in this section is described in detail in Soepyan (2015).

3.3. Model screening and ranking protocol

The model screening approach (Section 3.3.1) uses statistics to systematically eliminate inaccurate models from being considered for further ranking, while the model ranking approach (Section 3.3.2) uses additional statistics to assess the accuracy of the remaining models.

3.3.1. Model screening

The model screening process is performed using four statistics: (1) the modified adjusted-$R^2$ statistic ($R^2_{\mathrm{adj,mod},j}$) of model j; (2) the slope ($m_{0,j}$) between the predictions of model j and the experimentally-observed values of the threshold velocity, with the intercept forced to be at the origin; (3) the slope ($m_j$) between the predictions of model j and the experimentally-observed values of the threshold velocity; and (4) the result of the Smirnov statistical test (Conover, 1971), which determines if the distributions of the experimentally-observed and model-predicted threshold velocities are equal in the reduced database. The Smirnov statistical test uses the following hypotheses:

$$H_0 : F_{\mathrm{calc}}(v) = F_{\mathrm{exp}}(v) \ \text{for all} \ -\infty < v < +\infty, \tag{4}$$

$$H_a : F_{\mathrm{calc}}(v) \neq F_{\mathrm{exp}}(v) \ \text{for at least one value of} \ v. \tag{5}$$

In Eqs. (4) and (5), $v$ denotes the threshold velocity, $F_{\mathrm{calc}}(v)$ and $F_{\mathrm{exp}}(v)$ the distribution functions of the predicted and experimentally-observed threshold velocity in the reduced database, respectively, $H_0$ the null hypothesis, and $H_a$ the alternative hypothesis. The Smirnov statistical test assesses the validity of the null hypothesis $H_0$. An example of the Smirnov statistical test is shown in Supplementary Materials.

Our modified adjusted-$R^2$ statistic incorporates the modified total sum of squares ($T_{\mathrm{SS,mod}}$) of the experimentally-observed threshold velocity, and the error sum of squares ($E_{\mathrm{SS},j}$) of the predictions of model j:

$$T_{\mathrm{SS,mod}} = \sum_{i=1}^{n_{\mathrm{data}}} v_{\mathrm{exp},i}^2, \tag{6}$$

$$E_{\mathrm{SS},j} = \sum_{i=1}^{n_{\mathrm{data}}} \left( v_{\mathrm{calc},i,j} - v_{\mathrm{exp},i} \right)^2. \tag{7}$$

In Eq. (6), the modified total sum of squares represents the overall variation of the threshold velocity in the reduced database, instead of its deviation from the average value. Traditionally, the total sum of ndata (vexp, i − vexp, avg )2 (Devore and Berk, 2007), squares is computed as ∑i =1 where vexp,avg denotes the average value of the experimentallyobserved threshold velocity in the reduced database. In Eq. (6), the ndata (vexp, i −0)2 . Because modified total sum of squares is computed as ∑i =1 the value of zero, instead of vexp,avg, is used as the middle value, and the value of vexp,i is always nonnegative, the modified total sum of squares thus represents the overall variation of vexp. The values of the statistics shown in Eqs. (6) and (7) are used to calculated the modified adjusted-R2 statistic:

$$R^2_{adj,mod,j} = 1 - \left(\frac{E_{SS,j}}{T_{SS,mod}}\right)\left(\frac{n_{data}-1}{n_{data}-n_{indep,j}-1}\right). \tag{8}$$
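The statistic in Eq. (8) can be sketched in a few lines. The function below is illustrative only: its name is hypothetical, and it assumes that the error sum of squares is the sum of squared differences between the predicted and observed threshold velocities.

```python
def modified_adjusted_r2(v_exp, v_calc, n_indep):
    """Modified adjusted-R2 of one model on the reduced database, Eq. (8)."""
    n_data = len(v_exp)
    if n_data - n_indep - 1 <= 0:
        raise ValueError("need n_data > n_indep + 1")
    # Assumed definition: error sum of squares = sum of squared residuals.
    e_ss = sum((c - e) ** 2 for c, e in zip(v_calc, v_exp))
    # Modified total sum of squares: deviation from zero, not from the mean.
    tss_mod = sum(e ** 2 for e in v_exp)
    return 1.0 - (e_ss / tss_mod) * (n_data - 1) / (n_data - n_indep - 1)
```

A model that reproduces every observation returns one, whereas a model that predicts zero everywhere returns a negative value.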

Here, nindep,j represents the number of independent variables incorporated in model j. The conditions that are used to discard inaccurate models include:

$$R^2_{adj,mod,j} < 0, \tag{9}$$

$$m_{0,j} < \varepsilon_l, \tag{10}$$

$$m_{0,j} > \varepsilon_u, \tag{11}$$

$$m_j < 0, \tag{12}$$

or if the null hypothesis [H0, shown in Eq. (4)] is rejected at a predetermined level of significance according to the Smirnov statistical test. In Eqs. (10) and (11), εl and εu represent the acceptable lower and upper bounds of the ratio of the model’s prediction to the value of the threshold velocity observed experimentally (with εl < 1 and εu > 1). A negative value of the modified adjusted-R2 statistic [Eq. (9)] suggests that there is greater variation in the errors of model j compared to the variation in the experimental observations. Eqs. (10) and (11) remove models with average predictions that are outside the pre-determined acceptable deviation from the experimentally-observed values. Eq. (12) tests whether model j predicts an increase or decrease in the value of the threshold velocity if the experimental data suggest such a trend. Finally, if the Smirnov statistical test suggests that the alternative hypothesis (Ha) is favored over the null hypothesis (H0), it is implied that model j is unable to capture the overall trend that is observed experimentally. For our application, the value of the level of significance αS = 0.05 is used for the Smirnov statistical test, and the values of εl = 0.5 and εu = 1.5 are used for Eqs. (10) and (11). These values of εl and εu were used because from our previous studies (Soepyan et al., 2014), when the existing solids transport models were evaluated, the most accurate ones produce small biases in their errors (i.e., the average value of the error is nearly zero percent). However, the spread of the models’ errors may be large, where many of these models’ predictions lie within ± 50% of the experimentally-observed threshold velocities. Therefore, the ± 50% threshold is deemed acceptable for the solids transport models.

3.3.2. Model ranking

The model ranking approach is performed by first calculating the values of the following statistics:

• $E_{MS,j} = E_{SS,j}/n_{data}$: the mean squared error of model j.
• $E_{MAP,j} = \frac{100\%}{n_{data}}\sum_{i=1}^{n_{data}}\frac{\left|v_{calc,i,j}-v_{exp,i}\right|}{v_{exp,i}}$: the mean absolute error percentage of model j.
• $E_{MA,j} = \frac{1}{n_{data}}\sum_{i=1}^{n_{data}}\left|v_{calc,i,j}-v_{exp,i}\right|$: the mean absolute error of model j.
• $R^2_{adj,dev,j} = 1 - R^2_{adj,mod,j}$: the deviation of the modified adjusted-R2 statistic of model j from the value of one.
• $\max_{\%,j} = \max(over_{\%,j},\, under_{\%,j})$: the maximum between over%,j and under%,j. Here, over%,j denotes the percentage of experimentally-observed threshold velocity overestimated by model j in the reduced database, and under%,j the percentage of experimentally-observed threshold velocity underestimated by model j in the reduced database.
• P%,j: the percentage of predictions produced by model j that lie outside $\pm(100\%)\times\min(1-\varepsilon_l,\, \varepsilon_u-1)$ of the experimental observations. (As mentioned in Section 3.3.1, the values of εl = 0.5 and εu = 1.5 are used for our application.)
• $v_{0,dev,j} = \left|v_{0,j}-v_{0,avg}\right|$: the absolute deviation of v0,j from v0,avg. Here, v0,j is the threshold velocity prediction of model j for the input condition, and v0,avg is the average value of the threshold velocity predictions of all the ranked models (i.e., all the models that pass the model screening test) for the input condition.
• kj,non: the number of parameters in model j that become non-zero after the model parameter fine-tuning process. This statistic is used only if the methodology is implemented with the model parameter fine-tuning module.

After these statistics are quantified, the values of these statistics are normalized using Eq. (13):

$$\hat{y}_j = \frac{y_j - \min(y_1, y_2, \ldots, y_J)}{\max(y_1, y_2, \ldots, y_J) - \min(y_1, y_2, \ldots, y_J)}, \quad j = 1, 2, \ldots, J. \tag{13}$$

Here, $y_j$ denotes the statistic of model j (and may be substituted with EMS,j, EMAP,j, EMA,j, R2adj,dev,j, max%,j, P%,j, v0,dev,j, or kj,non), $\hat{y}_j$ represents the normalized value of $y_j$, and J is the number of ranked models. The values of the normalized statistics are then summed to obtain the score Sj of model j. The models with scores closer to zero are the ones that are most accurate for the input condition. The score Sj is calculated using Eq. (14):

$$S_j = w_1\hat{E}_{MS,j} + w_2\hat{E}_{MAP,j} + w_3\hat{E}_{MA,j} + w_4\hat{R}^2_{adj,dev,j} + w_5\widehat{\max}_{\%,j} + w_6\hat{P}_{\%,j} + w_7\hat{v}_{0,dev,j} + \left(w_8\hat{k}_{j,non}\right), \quad j = 1, 2, \ldots, J. \tag{14}$$

In the above equation, w1 to w8 represent the weights placed on each normalized statistic. The term $w_8\hat{k}_{j,non}$ is enclosed in parentheses because this term only appears if the model parameters are fine-tuned prior to the model screening and ranking process. For our application, the value of one is used for each weight of the normalized statistics, as it is assumed that each statistic has equal contribution for evaluating the accuracy and performance of the models.

3.4. Model uncertainty quantification

The model uncertainty quantification process generates the distribution of each model’s threshold velocity predictions by using the errors produced by the model at the reduced database. This method ignores the uncertainties of the input condition and experimental data. The model uncertainty is propagated by first quantifying the error percentage EP,i,j of model j for each experimental datum point i at the reduced database:

$$E_{P,i,j} = (100\%)\times\left(\frac{v_{calc,i,j}-v_{exp,i}}{v_{exp,i}}\right), \quad i = 1, 2, \ldots, n_{data}, \quad j = 1, 2, \ldots, N_{model}, \tag{15}$$

where Nmodel denotes the total number of models available. Assuming that the experimental data points in the reduced database are representative of the input condition, the error percentage EP,i,j of model j can be extended to the input condition. Here, EP,i,j is used to estimate the “true” value of the threshold velocity at the input condition given the errors of the model at the reduced database. Hence, substituting the value of vcalc,i,j with v0,j (the prediction of model j at the input condition), and the value of vexp,i with vM,i,j (the estimated “true” value of the threshold velocity at the input condition if the error of model j at experimental datum point i is the “true” error of the model) in Eq. (15), and solving for vM,i,j yields:

$$v_{M,i,j} = \frac{v_{0,j}}{1 + \left(E_{P,i,j}/100\%\right)}, \quad i = 1, 2, \ldots, n_{data}, \quad j = 1, 2, \ldots, N_{model}. \tag{16}$$

For each model j, the values of vM,i,j (i = 1, 2, …, ndata) are compiled. The distribution of vM,i,j is then produced to quantify the uncertainty of the threshold velocity prediction of model j at the input condition due to the uncertainty of model j.

3.5. Uncertainty propagation of input condition and experimental data

The Monte Carlo (MC) simulation method is used to propagate the uncertainties of the input condition and experimental data to determine the uncertainties and confidence bounds of the threshold velocity predictions of the models. The Monte Carlo simulation method for uncertainty propagation was selected due to the simplicity in its implementation and robustness (Roy and Oberkampf, 2011). Furthermore, the Monte Carlo simulation method is a non-intrusive uncertainty propagation approach that produces the distributions of the models’ predictions (Dieck, 2007). The distribution is crucial for generating the confidence intervals of the models’ predictions.
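The normalization and scoring of Eqs. (13) and (14), and the error inversion of Eqs. (15) and (16), can be sketched as follows. The function names, the dictionary layout of the statistics, and the handling of a statistic that is identical for all models are assumptions made for illustration.

```python
def min_max_normalize(values):
    """Eq. (13): rescale one statistic across the J ranked models to [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a statistic shared by all models does not affect the rank.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def model_scores(statistics, weights=None):
    """Eq. (14): weighted sum of normalized statistics (lower is better).

    statistics maps a statistic name to a list with one value per model.
    """
    names = list(statistics)
    n_models = len(statistics[names[0]])
    weights = weights or {name: 1.0 for name in names}
    scores = [0.0] * n_models
    for name in names:
        for j, y_hat in enumerate(min_max_normalize(statistics[name])):
            scores[j] += weights[name] * y_hat
    return scores


def estimated_true_velocities(v0_j, v_calc, v_exp):
    """Eqs. (15)-(16): one estimate of the 'true' velocity per datum point."""
    estimates = []
    for c, e in zip(v_calc, v_exp):
        error_pct = 100.0 * (c - e) / e                      # Eq. (15)
        estimates.append(v0_j / (1.0 + error_pct / 100.0))   # Eq. (16)
    return estimates
```

Sorting the models by ascending score gives the ranking, and the list returned by `estimated_true_velocities` is the sample from which the distribution of vM,i,j is built.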


Fig. 1. Algorithm of the Model Evaluation Methodology. [Flowchart blocks: (1) Input: input condition; (2) Inputs: experimental data; (3) Data clustering; (4) Inputs: models; (5) Fine-tune model parameters?; (6) Model parameter fine-tuning; (7) Model screening and ranking; (8) Is there at least one ranked model?; (9) Reduce the value of ndata; (10) Outputs: value of ndata and model ranking.]

The algorithm for the Monte Carlo simulation method is as follows:

Inputs: the input condition, the distributions of the uncertain variables of the input condition, the total number of replications (trials) for the Monte Carlo simulation (ntrial), the solids transport experimental database, the distributions of the uncertain variables of the experimental data, the equations of the solids transport models.

Outputs: a recommendation for the number of experimental data points to include in the reduced database (ndata), the ranking of the models, the distributions of the threshold velocity predictions of the models.

Determine the number of data points that will populate the reduced database (ndata) and the ranking of the models (Fig. 1):
1. Specify the input condition (Block 1 of Fig. 1)
2. Using the average values of the variables of the input condition and each experimental datum point as inputs (Blocks 1 and 2 of Fig. 1), perform the data clustering component of the methodology (Block 3 of Fig. 1)
3. Using the equations of the solids transport models as inputs (Block 4 of Fig. 1):
   a. If the fine-tuned models are to be used instead of the original models (Block 5 of Fig. 1), execute the model parameter fine-tuning module (Block 6 of Fig. 1)
   b. Implement the model screening and ranking protocol (Block 7 of Fig. 1)
4. In case all models are discarded during the model screening process with the current value of ndata (Block 8 of Fig. 1):
   a. Set the value of ndata equal to the previous datum point index i where there is a large difference in the Euclidean distance between datum point i and i + 1 (Block 9 of Fig. 1)
   b. Repeat Step 3 until there is at least one ranked model, i.e., at least one model passes the model screening test (Block 4 of Fig. 1)
5. Obtain the recommended value of ndata and the model ranking as outputs (Block 10 of Fig. 1)

Propagate the uncertainties of the input condition, experimental data, and models (Fig. 2):
For t = 1, 2, …, ntrial replications (Block 1 of Fig. 2)
6. Using the input condition, the experimental data, the distributions of the independent variables of the input condition and experimental data as inputs (Blocks 2 and 3 of Fig. 2), and a random number generator, produce random values of the independent variables of the input condition and experimental data to propagate the uncertainties of the independent variables of the input condition and experimental data (Block 4 of Fig. 2)
   a. This step produces Nvar × (Ndata + 1) random numbers
7. Using the value of ndata obtained from Step (5) as input (Block 5 of Fig. 2), and the perturbed input condition and experimental data obtained from Step (6) (Block 4 of Fig. 2), perform the data clustering procedure to select the ndata number of data points from the experimental database that are representative of the input condition (Block 6 of Fig. 2)
For i = 1, 2, …, ndata number of data points in the reduced database

8. Using the equations of the solids transport models as inputs (Block 7 of Fig. 2), compute the models’ threshold velocity predictions for experimental datum point i (Block 8 of Fig. 2)
9. Using the uncertainties of the experimentally-observed threshold velocity as inputs (Block 9 of Fig. 2), obtain random values of the uncertainties (Uexp,i) of the experimentally-observed threshold velocity from their distributions (Block 10 of Fig. 2)
10. Perturb the models’ threshold velocity predictions for experimental datum point i using the uncertainty Uexp,i (Block 11 of Fig. 2):

$$v_{calc,i,j} = f\left(x_{l,i}, k_j\right) + U_{exp,i}\times f\left(x_{l,i}, k_j\right) \tag{17}$$

   a. Here, $x_{l,i}$ represents the vector containing the independent variables, $k_j$ the vector containing the parameters of model j, and $f(x_{l,i}, k_j)$ the function (or equation) of model j
   b. Step (10) is performed because the experimentally-observed values of the threshold velocity are dependent on the values of the independent variables, but the exact dependence is unknown (which is the reason the models’ predictions are needed)
   c. Because of the dependence, the uncertainty of the experimentally-observed threshold velocity cannot be propagated independently in Step (6), and is instead incorporated through the model’s predictions
Next i
11. If the fine-tuned models are to be used instead of the original models (Block 12 of Fig. 2), perform the model parameter fine-tuning to reduce the bias in the models’ predictions (Block 13 of Fig. 2)
   a. Here, it is assumed that the uncertainties of the variables are time-invariant, and therefore, the model parameter fine-tuning needs to be performed only once for each trial t of the Monte Carlo simulation method
12. Perform the model uncertainty quantification process to propagate the uncertainty of each model (Block 14 of Fig. 2)
Next t (Blocks 15 and 16 of Fig. 2)
13. Obtain the distributions of the models’ threshold velocity predictions for the input condition as outputs (Block 17 of Fig. 2)

Fig. 2. Algorithm of the Uncertainty Propagation Methodology. [Flowchart blocks: (1) Set t = 1; (2) Inputs: input condition and uncertainty; (3) Inputs: experimental data and uncertainties; (4) Select random values of the independent variables of the input condition and experimental data; (5) Input: ndata; (6) Data clustering; (7) Inputs: models; (8) Compute models’ predictions; (9) Inputs: uncertainties of threshold velocity in experimental data; (10) Select random values of the errors of the experimentally-observed threshold velocity; (11) Perturb models’ predictions; (12) Fine-tune model parameters?; (13) Model parameter fine-tuning; (14) Model uncertainty quantification; (15) Is t = ntrial?; (16) t = t + 1; (17) Outputs: distributions of models’ predictions.]
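The replication loop in Steps (6) to (12) can be sketched for a single model as shown below. This is a simplified illustration and not the full procedure: it assumes one independent variable, skips the data clustering and fine-tuning steps (the reduced database is taken as fixed), treats all uncertainties as relative fractions, and draws zero-mean normal perturbations whose standard deviation is half the stated uncertainty. The function signature and argument names are hypothetical.

```python
import random


def monte_carlo_predictions(model, x0, data, u_x=0.0, u_v=0.0,
                            n_trial=100, seed=0):
    """Return n_trial * n_data estimates of the threshold velocity at x0.

    model: callable mapping the independent variable to a velocity
    data:  list of (x_i, v_exp_i) pairs forming the reduced database
    u_x:   relative uncertainty of the independent variable (assumed)
    u_v:   relative uncertainty of the observed velocity (assumed)
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_trial):
        # Step (6): perturb the input condition.
        x0_t = rng.gauss(x0, u_x * x0 / 2.0)
        v0 = model(x0_t)
        for x_i, v_exp_i in data:
            # Step (6): perturb the experimental datum point.
            x_it = rng.gauss(x_i, u_x * x_i / 2.0)
            f = model(x_it)                          # Step (8)
            u_exp = rng.gauss(0.0, u_v / 2.0)        # Step (9)
            v_calc = f + u_exp * f                   # Step (10), Eq. (17)
            # Step (12): error percentage, then Eq. (16).
            error_pct = 100.0 * (v_calc - v_exp_i) / v_exp_i
            samples.append(v0 / (1.0 + error_pct / 100.0))
    return samples
```

With both uncertainties set to zero, every replication returns the same ndata values, which corresponds to propagating only the uncertainty of the model.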

After the above steps are completed and the distributions of the models’ threshold velocity predictions for the input condition are generated, the statistics of each model’s prediction are quantified. Given that Steps (6) to (12) are repeated ntrial number of times, and that for each trial (i.e., replication), ndata number of model predictions are generated for each model, then, ntrial × ndata number of threshold velocity predictions are used to generate the distributions of each model’s predictions. All these threshold velocity predictions are used to ensure that the uncertainties of the input condition, the experimental data, and the solids transport models are propagated and incorporated properly to quantify the confidence and the bounds of the models’ predictions at the input condition. To quantify the confidence and uncertainty bounds of the models’ predictions, while propagating only the uncertainties of the models, Steps (1) to (4) highlighted above are performed, followed by the model uncertainty quantification step of the methodology. On the other hand, if the interest is to propagate only the uncertainties of the input condition, the algorithm is executed, while skipping the selection of the random values of the independent variables of the experimental data in Step (6) (the average values of the variables in each experimental datum point are used instead), and Steps (9), (10), and (12). Furthermore, the distributions of the models’ threshold velocity predictions are generated from the ntrial (instead of ntrial × ndata) number of samples. For our application, the value of ntrial is set equal to 100. For the experimental data, if the values of the uncertainties of the variables are specified by the experimenters, these values are reported in our experimental database. Otherwise, the assumed values shown in Table 1 are used for the uncertainties of the variables.

The values shown in Table 1 were obtained using the uncertainties observed in the experimental data and expert opinion. The uncertainty of each independent variable at datum point i is calculated as $U_{x_{l,i}} = (x_{U,l,i} - x_{L,l,i})/2$, where xU,l,i and xL,l,i denote the upper and lower bounds, respectively, of the value of independent variable l at experimental datum point i. Likewise, the uncertainty of the experimentally-observed threshold velocity of datum point i is calculated as $U_{exp,i} = (v_{U,exp,i} - v_{L,exp,i})/2$, where vU,exp,i and vL,exp,i denote the upper and lower bounds, respectively, of the value of the threshold velocity at experimental datum point i. It is assumed that the variables in the experimental data are normally distributed, with the mean equal to the values reported by the experimenters, and the standard deviation equal to $U_{x_{l,i}}/2$ for the independent variables, and to $U_{exp,i}/2$ for the experimentally-observed threshold velocity. Thus, at the 95% confidence level, the values of the variables are within the variables’ uncertainty ranges.

Table 1
Assumed Values of the Uncertainties of the Independent Variables.

Variable                             Uncertainty
Particle volumetric concentration    ± 10%
Conduit inclination angle            ± 2°
Fluid density                        ± 1%
Fluid viscosity                      ± 10%
Particle density                     ± 1%
Particle dimensions                  ± 40%
Conduit dimensions                   ± 1%
Coefficient of static friction       ± 40%
Experimental threshold velocity      ± 20%

Fig. 3. Velocity Envelopes of the Ranked Original Models and Five Highest-Ranked Fine-Tuned Models for Case Study 1 (Error Bars Represent Bounds of Predictions at 90% Confidence, and Dashed Lines Denote the Observed Threshold Velocity Envelope). [Two panels, “Case Study 1: Velocity Envelopes of the Ranked Original Models” and “Case Study 1: Velocity Envelopes of the 5 Highest-Ranked Fine-Tuned Models”; vertical axis: Velocity Prediction (m/s); legend: No Uncertainty, Input Uncertainty, Model Uncertainty, All Uncertainty.]

4. Results and discussions

This section evaluates the performance of our model evaluation and uncertainty propagation method. In Section 4.1, we test our methodology using a single input condition to highlight the outputs of our methodology, and to provide the readers with an example of the practical use of our methodology. In Section 4.2, we validate our methodology using 166 input conditions (the reason there are 166 input conditions is explained in Section 4.2), because at least 50 sample points are needed to make a statistically sound conclusion (Devore and Berk, 2007).

4.1. Our methodology tested using a single input condition

To test the performance of the methodology, we began by first removing an experimental datum point from the experimental database. This experimental datum point was used as the input condition for our methodology, and originates from one of the experiments conducted by Ramadan et al. (2003). We will refer to this input condition as Case Study 1. The values of the independent variables and experimentally-observed threshold velocity (vexp) for Case Study 1 are: C0 = 1.0 ppm, θ0 = 0.0 degree, ρf,0 = 998 kg/m3, μ0 = 1.0 mPa-s, ρS,0 = 2650 kg/m3, dp,0 = 125–500 µm, D0 = 0.0665 m, and vexp = 0.25–0.33 m/s. The uncertainty of the particle diameter provided by the experimenters was used as the uncertainty of the particle diameter of the input condition. For the remaining variables, the uncertainties

shown in Table 1 are used. For Case Study 1, the data clustering approach (Section 3.1) recommended the value of ndata = 71. Fig. 3 shows the threshold velocity envelopes produced by the ranked original models [Wasp et al. (1970), Rabinovich and Kalman (2007), Mantz (1977), Kalman et al. (2005), and Vocadlo and Charles (1972)] (for Case Study 1, only five models were not discarded during the model screening process) and the five highest-ranked fine-tuned models [Fine-Tuned Gruesbeck et al. (1979), Fine-Tuned Wasp et al. (1970), Fine-Tuned Turian et al. (1987), Fine-Tuned Almutahar (2006) Initial Approach, and Fine-Tuned Thomas (1961)]. There are 40 fine-tuned models remaining after the model screening process. In Fig. 3, we include the threshold velocity envelopes of only the five highest-ranked fine-tuned models for fair comparison and because of space limitation. The threshold velocity envelopes of all five ranked original models and all 40 ranked fine-tuned models are tabulated in Tables S.5 and S.6 of Supplementary Materials. Here, the envelopes denote the difference between the 95th percentile and 5th percentile values of the threshold velocity predictions of the models obtained from the uncertainty propagation process. The bars in Fig. 3 represent the threshold velocity predictions of the models obtained without any uncertainty propagation. The tips of the error bars represent the values of the threshold velocity predictions at the 5th percentile and 95th percentile for the following instances: (1) only the uncertainties of the input condition are propagated, (2) only the uncertainties of the models are propagated, and (3) the uncertainties from all sources (input condition, experimental data, and models) are propagated. The dashed lines in these graphs denote the experimentally-observed threshold velocity envelopes. In the horizontal axes of these graphs, the models are arranged in descending order, based on their rank, with the first-ranked model being the leftmost model.

The input condition for Case Study 1 involves the initiation of the motion of solid particles from a bed of solids using water as the carrier fluid (i.e., the prediction of the pick-up velocity is required). All five original models recommended by our model screening and ranking methodology were developed and validated using experimental data for solid particle transport in water. This observation suggests that our methodology is capable of selecting models that were developed for the same type of continuous phase as that of the input condition. Three of the five ranked original models (Rabinovich and Kalman, 2007; Mantz, 1977; Kalman et al., 2005) were developed to predict the pick-up velocity. The selection of these models suggests that our methodology is capable of selecting models that were developed for cases that are similar to the input condition in terms of the physics. As for the fine-tuned models, they were fitted to the experimental data in the reduced database. Therefore, these models have been adjusted such that they become appropriate for making threshold velocity predictions in the neighborhood of the input condition in terms of both the ranges of the independent variables and the physics behind solid particle transport.

Quantitatively, the threshold velocity predictions produced by the ranked original models and five highest-ranked fine-tuned models when the uncertainties are ignored all lie within the experimentally-observed threshold velocity envelope, with the exception of two of the ranked original models, which produce slight underestimates (Fig. 3). However, for most of these models, the threshold velocity predictions are closer to the lower end of the experimentally-observed threshold velocity envelope. Therefore, by using the threshold velocity predictions of these models, it is possible that the fluid velocity is not sufficient for solid particle transport (i.e., there is a possibility that the “true” threshold velocity lies in the middle or at the upper end of the experimentally-observed threshold velocity range). These observations highlight the importance and the need for propagating all sources of uncertainties for generating reliable operating threshold velocity envelopes.

When the uncertainties from all sources are propagated, the threshold velocity envelopes of the ranked models cover the experimentally-observed threshold velocity for the input condition of Case Study 1, and the values of the 95th percentiles of the threshold velocity predictions of the ranked original models and the five highest-ranked fine-tuned models are sufficient for particle transport. Here, we are particularly interested in the values of the 95th percentiles of the models’ predictions, because these values represent the fluid velocity where there is 95% confidence that the solid particles are transported. Fig. 3 also reveals that the threshold velocity envelopes suggested by the five highest-ranked fine-tuned models tend to be narrower compared to those of the ranked original models. However, the 95th percentile values of the threshold velocity suggested by the fine-tuned models are still sufficient for solid particle transport.

For many of the ranked original models and five highest-ranked fine-tuned models, the threshold velocity envelopes produced from propagating only the uncertainty of the model are nearly equal to the threshold velocity envelopes produced from propagating all uncertainties (Fig. 3). This observation suggests that most of the uncertainties in these models’ threshold velocity predictions originate from the model. For some of these models, the threshold velocity envelopes obtained from propagating only the uncertainty of the model are wider compared to the threshold velocity envelopes obtained from propagating all uncertainties. At first, this result may seem counter-intuitive. For the case of propagating only the uncertainty of the model, the threshold velocity distribution of each model is generated using ndata number of threshold velocity predictions (Section 3.4). On the other hand, for the case of propagating all uncertainties, ndata × ntrial threshold velocity predictions are used to generate the distributions of the predictions of each model (Section 3.5). Thus, the sample size used to generate the distributions of the threshold velocity prediction of the models is larger for the case of propagating all uncertainties compared to the case of propagating only the uncertainty of the model. For cases where the Monte Carlo simulation method is used to approximate the distribution of a variable, the sample size affects the values of the estimated percentiles, where a larger sample size yields smaller differences between the 5th percentile and 95th percentile values (Schoonjans et al., 2011). Thus, it is possible for the width of the threshold velocity envelope produced from propagating all uncertainties to become narrower compared to the width of the threshold velocity envelope generated from propagating only the uncertainty of the model (Devore and Berk, 2007).

Here, a cautionary note is appropriate. For the experimental condition of Case Study 1 (which was used as the input condition) with its uncertainty, the contribution of the uncertainty of the input condition is small. However, the impact of the uncertainty of the input condition may increase if the uncertainty of at least one independent variable becomes significant.
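The envelope construction used in this case study can be sketched as follows. The percentile routine uses linear interpolation between order statistics, which is one common convention and not necessarily the one used to produce Fig. 3; the function names and the containment-based definition of coverage are likewise assumptions.

```python
def percentile(sorted_values, q):
    """Percentile q (0 to 100) by linear interpolation between order statistics."""
    if not sorted_values:
        raise ValueError("no samples")
    position = (len(sorted_values) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (
        sorted_values[upper] - sorted_values[lower])


def velocity_envelope(samples, lower_q=5.0, upper_q=95.0):
    """5th and 95th percentiles, i.e. the bounds at 90% confidence."""
    ordered = sorted(samples)
    return percentile(ordered, lower_q), percentile(ordered, upper_q)


def covers(envelope, observed_low, observed_high):
    """Assumed criterion: the observed range lies inside the envelope."""
    low, high = envelope
    return low <= observed_low and observed_high <= high
```

For Case Study 1, the observed range passed to `covers` would be 0.25 to 0.33 m/s, and the upper bound of the envelope is the velocity of interest for ensuring particle transport.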

4.2. Our methodology tested using multiple input conditions

For this validation test, one experimental datum point was removed from the experimental database, and used as the input condition. After implementing the methodology using this input condition, the experimental datum point was returned to the experimental database. The procedure was repeated for all 174 experimental data points for the hydraulic conveying of solid particles in conduits (shown in Table S.1 of Supplementary Materials). Similar to Case Study 1, for each input condition, the uncertainties of the independent variables provided by the experimenters were used as the uncertainties of the variables of the input condition. For cases where the uncertainties of the variables are not provided by the experimenters, the uncertainties shown in Table 1 are used. The cumulative distribution functions of the errors produced by the first-ranked models suggested by the methodology for 166 of the 174 input conditions are shown in Fig. 4. Here, only the performance of the first-ranked models for these 166 input conditions is shown, because when the methodology is implemented using the original models, for six of the 174 input conditions, all models were discarded by the model

screening approach at Step (3) of the algorithm (Section 3.5) no matter how small the reduced database is; and for two of the 174 input conditions, only two experimental data points populate the reduced database after Steps (3) and (4) of the algorithm (Section 3.5) were performed. This sample size is too small for the Smirnov statistical test to be meaningful (Conover, 1971). Therefore, the threshold velocity predictions of the models for these eight input conditions are not included as part of the results discussed in this section. Instead, these eight input conditions will be further discussed in Section 4.2.1.

Fig. 4. Performances of the Highest-Ranked Original and Fine-Tuned Models, the Mantz (1977) Model, and the Oroskar and Turian (1980) Semi-Mechanistic Model for the Case Study. [Four panels: 1st Ranked Original Models, 1st Ranked Fine-Tuned Models, Mantz (1977) Model, and Oroskar and Turian (1980) Semi-Mechanistic Model; horizontal axis: Error (%); vertical axis: Cumulative Probability (%); legend: No Uncertainty, Input Uncertainty, Model Uncertainty, All Uncertainty, Zero Error.]

In Fig. 4, the errors of the models are calculated as $(100\%)\times(v_{0,i}-v_{exp,i})/v_{exp,i}$, where v0,i denotes the model’s threshold velocity prediction for the ith input condition used in the Case Study, for i = 1, 2, …, 166. Here, vexp,i represents the average value of the experimentally-observed threshold velocity for the ith input condition. The values of the errors are shown for four cases: (1) ignoring all uncertainties, (2) propagating only the uncertainty of the input condition, (3) propagating only the uncertainty of the model, and (4) propagating the uncertainties of all sources. For the cases where the uncertainty is not ignored, the 5th percentile and 95th percentile values of the threshold velocity predictions obtained from the distributions are compared against the experimentally-observed values. For comparison, Fig. 4 also provides the same information for the Mantz (1977) model and the Oroskar and Turian (1980) semi-mechanistic model. Here, the Mantz (1977) model and the Oroskar and Turian (1980) semi-mechanistic model are used for comparison because in our previous studies (Soepyan et al., 2012, 2013b), both of these models were found to be reliable for experimental data involving the hydraulic conveying of solid particles.

When the uncertainties of all sources are propagated, for most of the 166 input conditions, the experimentally-observed threshold velocities fall within the threshold velocity envelopes predicted by the models (Fig. 4). This is true for 92% of the input conditions for the case of the first-ranked original models, 93% of the input conditions for the case of the first-ranked fine-tuned models, 93% of the input conditions for the case of the Mantz (1977) model, and 92% of the input conditions for the case of the Oroskar and Turian (1980) semi-mechanistic model. As mentioned previously, for the case of propagating all uncertainties, Fig. 4 presents the errors produced by the 5th percentile and 95th percentile of the threshold velocity predictions of the models. Because the 5th percentile and 95th percentile of the threshold velocity predictions are quantified, the bounds produced by these percentiles represent the bounds of the models’ predictions at the 90% confidence level. Intuitively, given that the 90% confidence ranges of the models’ predictions are quantified, the models’ threshold velocity envelopes should cover the experimentally-observed values for at least 90% of the input conditions. When the 95th percentile values of the threshold velocity predictions obtained from propagating all uncertainties are used, underestimates of the experimentally-observed values are produced for only 4%, 4%, 3%, and 4% of the 166 input conditions when the first-ranked original models, the first-ranked fine-tuned models, the Mantz (1977) model, and the Oroskar and Turian (1980) semi-mechanistic models


developed for the same type of continuous phase as that of the input condition for all 166 input conditions. We also compared the type of velocity observed experimentally (these velocity definitions are provided in Section 1) and the type of velocity predicted by the first-ranked models. For 89 of the 166 input conditions (i.e., 54% of the input conditions), the first-ranked models selected by the methodology predict the same type of velocity as that observed at the input condition. When all the models that are selected for ranking are considered, at least one ranked model predicts the same type of velocity as that observed at the input condition for 108 of the 166 input conditions (i.e., 65% of the input conditions). We studied the 58 input conditions where none of the ranked models were developed to predict the same type of velocity as that observed at the input condition. Among these 58 input conditions, there are 5 input conditions where the critical velocity was observed experimentally, while there are only 10 data points in the experimental database where the critical velocity was measured; there are 8 input conditions where the saltation velocity was observed experimentally, while there are only 17 data points in the experimental database where the saltation velocity was measured; and there are 3 input conditions where the incipient motion velocity was measured, while there are only 3 data points in the experimental database where the incipient motion velocity was measured. Therefore, for most of the experimental data points in the reduced database, measurements were performed for a velocity other than the one observed experimentally at the input condition. The inclusion of these data points in the reduced database results in the selection of models that were not developed to predict the same type of velocity as that observed at the input condition. 
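As a quick consistency check on the counts quoted in this paragraph (a reader's bookkeeping aid, not part of the original analysis), the percentages and the number of remaining input conditions follow directly from the stated totals:

```latex
\begin{align*}
89/166 &\approx 0.54, & 108/166 &\approx 0.65, \\
166 - 108 &= 58, & 58 - (5 + 8 + 3) &= 42.
\end{align*}
```

The 42 input conditions left after subtracting the critical, saltation, and incipient motion velocity cases are the pick-up velocity cases.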
For the remaining 42 of the above-mentioned 58 input conditions, the pick-up velocity was observed at the input condition, and there are 136 data points in the experimental database where the pick-up velocity was measured. For these 42 input conditions, the number of experimental data points in the reduced database (ndata) ranges from 6 to 134. At least 50 data points are needed to make statistically-sound conclusions (Devore and Berk, 2007). There are 12 input conditions out of these 42 input conditions where ndata is less than 50. Therefore, for these input conditions, the number of experimental data points in the reduced database is too small such that the model screening and ranking approach cannot recommend models that were developed to predict the pick-up velocity. Further discussions regarding the input conditions where ndata is less than 50 can be found in Section 4.2.1. For the other 30 of the 42 input conditions where the pick-up velocity was observed and ndata is greater than 50, we noticed that although most of the data points in the reduced database are from experiments where the pick-up velocity was measured, the reduced database also includes experimental data points where the measured velocity is not the pick-up velocity. The inclusion of these experimental data points in the reduced database causes the model screening and ranking approach to recommend models that were not developed to predict the pick-up velocity for the input condition. From this exercise, we learn

[Fig. 5 plot. Title: Case Study: Number of Experimental Data Points in the Reduced Database; axis label: Number of Input Conditions.]

are used, respectively. Here, the 95th percentile values of the threshold velocity predictions are compared because the objective is to transport the solid particles in the conduit, and these values yield the threshold velocities where there is 95% confidence that the solid particles in the conduit are being transported by the carrier fluid. In contrast, using only the predictions of the models without any uncertainty propagation produces underestimates of the experimentally-observed threshold velocities for 49%, 45%, 49%, and 59% of the 166 input conditions for the cases of the first-ranked original models, the first-ranked fine-tuned models, the Mantz (1977) model, and the Oroskar and Turian (1980) semi-mechanistic model, respectively. This comparison suggests that by using the proposed methodology instead of relying on the threshold velocity predictions of individual models without any uncertainty propagation, the likelihood of underestimating the “true” value of the threshold velocity decreases. Using the 95th percentile values of the threshold velocity predictions generated from propagating all uncertainties produces errors that exceed 100% (compared to the average value of the experimental threshold velocity) for 42%, 33%, 48%, and 57% of the 166 input conditions when the first-ranked original models, the first-ranked fine-tuned models, the Mantz (1977) model, and the Oroskar and Turian (1980) semi-mechanistic model are used, respectively. The first-ranked fine-tuned models produce the smallest of these percentages. Therefore, the likelihood of grossly overestimating the “true” threshold velocity of the solid particles decreases when the first-ranked fine-tuned models are used instead of the first-ranked original models, the Mantz (1977) model, or the Oroskar and Turian (1980) semi-mechanistic model.
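The three percentages discussed here (envelope coverage, underestimation by the 95th percentile, and errors above 100%) can be illustrated with a minimal Python sketch. It assumes that a set of Monte Carlo samples of the predicted threshold velocity is already available for each input condition; the function name `envelope_stats` and its array layout are hypothetical and are not taken from the study.

```python
import numpy as np

def envelope_stats(samples, v_exp):
    """Summarize 90% envelopes against experimental values.

    samples: array of shape (n_conditions, n_samples) holding predicted
             threshold velocities (assumed layout, for illustration only).
    v_exp:   array of shape (n_conditions,) with the average
             experimentally-observed threshold velocities.
    """
    samples = np.asarray(samples, dtype=float)
    v_exp = np.asarray(v_exp, dtype=float)

    # 5th and 95th percentiles bound the 90% confidence envelope.
    p5 = np.percentile(samples, 5, axis=1)
    p95 = np.percentile(samples, 95, axis=1)

    # Envelope covers the experimental value.
    covered = (p5 <= v_exp) & (v_exp <= p95)
    # Even the 95th percentile lies below the experimental value.
    underestimated = p95 < v_exp
    # Percentage error of the 95th percentile: 100% * (v - v_exp) / v_exp.
    err95 = 100.0 * (p95 - v_exp) / v_exp

    return {
        "coverage_pct": 100.0 * covered.mean(),
        "underestimate_pct": 100.0 * underestimated.mean(),
        "gross_overestimate_pct": 100.0 * (err95 > 100.0).mean(),
    }
```

Applied to the case study, the three returned values would correspond to the 92–93%, 3–4%, and 33–57% figures reported in the text, provided the same percentile convention is used.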
When the errors of the first-ranked original models, the first-ranked fine-tuned models, the Mantz (1977) model, and the Oroskar and Turian (1980) semi-mechanistic model are compared, Fig. 4 shows that the area occupied by the collection of cumulative distribution functions is smallest for the first-ranked fine-tuned models. In other words, the ranges of the errors are reduced if the first-ranked fine-tuned models are used instead of the original models. Similar to Fig. 3 for Case Study 1, Fig. 4 shows that most of the uncertainties in the models’ threshold velocity predictions come from the models. The contribution of the model uncertainty to the overall uncertainty in the models’ threshold velocity predictions is reduced when the model parameters are fine-tuned. Based on Fig. 4, the percentage of input conditions where the experimentally-observed threshold velocities lie within the threshold velocity envelopes of the models decreases if only the uncertainties of the model are propagated. To determine if our methodology's selection of the first-ranked original models is appropriate in terms of the physics behind particle motion, we performed a comparison similar to the one in Section 4.1. As mentioned in Section 4.1, the fine-tuned models have been adjusted using the experimental data in the reduced database (i.e., the experimental data in the neighborhood of the input condition). Because the experimental data in the reduced database are similar to the input condition in terms of the physics and the values of the independent variables, the fine-tuned models are appropriate for the input condition. All of the input conditions that are used to validate the methodology involve the use of a liquid carrier to transport the solid particles. For 163 of the 166 input conditions (i.e., 98% of the input conditions), the first-ranked models were developed for the hydraulic transport of solid particles.
Therefore, for most of the input conditions, our methodology is capable of selecting first-ranked models that are developed for the same type of continuous phase as that of the input condition. For the three input conditions where the first-ranked model was developed for the pneumatic conveying of solids, at least one other model that was selected for ranking was developed for the hydraulic conveying of solids. In other words, when all the models that are selected for ranking are considered, at least one ranked model was

Fig. 5. Number of Experimental Data Points Selected to Populate the Reduced Database for the 174 Input Conditions of the Case Study. (Axis label: Number of Data Points.)


[Fig. 6 panels: Distribution of Ci in Experimental Database (Particle Concentration, ppm); Distribution of θi in Experimental Database (Pipe Inclination Angle, degree); Distribution of ρS,i / ρf,i in Experimental Database; Distribution of Di / dp,i in Experimental Database; Distribution of NAr,i in Experimental Database (Archimedes Number). Vertical axes: No. of Data Pts.]

Fig. 6. Distributions of the Dimensionless Groups in the Experimental Database.

[Fig. 7 plot title: Experimental Data Points with 600 < Di / dp,i < 7000 and NAr,i < 600.]

that our methodology is more likely to recommend models that were developed for the same type of threshold velocity (e.g., pick-up velocity) as that required at the input condition if all the data points in the reduced database are from experiments where the same type of threshold velocity (e.g., pick-up velocity) is measured.

4.2.1. Number of experimental data points in the reduced database

The proposed methodology works best when there are sufficient experimental data points in the reduced database that are representative of the input condition, as these data points are used for model parameter fine-tuning, model screening and ranking, and model uncertainty quantification. Furthermore, as mentioned previously, there needs to be a sufficient number of experimental data points in the reduced database for the statistical analysis to be meaningful. Thus, for the 174 input conditions used to test the methodology, the number of experimental data points selected to populate the reduced database (ndata) for each input condition is investigated and discussed in this subsection. Fig. 5 shows the distribution of the value of ndata for the 174 input conditions. For 90 of the 174 input conditions, the value of ndata is between 50 and 70, which is consistent with the data clustering procedure (Section 3.1). However, there are 47 input conditions where ndata < 50. In Fig. 5, ndata = 0 denotes the six input conditions where all original models are discarded during the model screening process regardless of the size of the reduced database. Because the data clustering was performed using dimensionless groups (Section 3.1), the ranges of the dimensionless groups for these 47 input conditions were investigated. Here, the threshold of ndata < 50 was selected for investigation because having fewer than 50 sample points may cause the quantified statistics for these sample points to become unreliable (Devore and Berk, 2007), as mentioned in Section 3.1.
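The grouping of input conditions by ndata used in this subsection can be sketched in a few lines of Python. The thresholds of 4 and 50 come from the text; the function names, the category labels, and the example list are hypothetical, and ndata = 0 follows the Fig. 5 convention for input conditions where all original models were discarded.

```python
from collections import Counter

def classify_ndata(n_data, min_for_test=4, min_reliable=50):
    """Assign an input condition to a group based on its reduced-database size.

    Labels are illustrative only:
      "excluded" : n_data < 4 (too few points for the statistical test,
                   or all original models discarded, shown as n_data = 0)
      "small"    : 4 <= n_data < 50 (statistics may be unreliable)
      "adequate" : n_data >= 50
    """
    if n_data < min_for_test:
        return "excluded"
    if n_data < min_reliable:
        return "small"
    return "adequate"

def summarize(n_data_values):
    """Count how many input conditions fall into each group."""
    return Counter(classify_ndata(n) for n in n_data_values)

# Made-up example values, not the study's data.
example_counts = summarize([0, 2, 4, 37, 50, 134])
```

With the study's 174 values of ndata, the "excluded" group would hold the eight input conditions left out of Section 4.2, and "excluded" plus "small" together would hold the 47 input conditions investigated here.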
For the above-mentioned 47 input conditions, the ranges of the dimensionless groups are: $1.0\ \mathrm{ppm} \le C_0 \le 19.7\ \mathrm{ppm}$, $\theta_0 = 0.0^\circ$, $2.0 \le \rho_{S,0}/\rho_{f,0} \le 3.8$, $695 \le D_0/d_{p,0} \le 6020$, and $4.2 \times 10^{-3} \le N_{Ar,0} \le 553$. As a comparison, for all 174 input conditions (i.e., the experimental database), the ranges of the dimensionless variables are: $1.0\ \mathrm{ppm} \le C_0 \le 36.4\ \mathrm{ppm}$, $0.0^\circ \le \theta_0 \le 3.2^\circ$, $1.1 \le \rho_{S,0}/\rho_{f,0} \le 7.8$, $7.5 \le D_0/d_{p,0} \le 11060$, and $4.2 \times 10^{-3} \le N_{Ar,0} \le 2.0 \times 10^{6}$.

[Fig. 7 plot: $N_{Ar,i}$ versus $D_i/d_{p,i}$ on logarithmic axes. Legend: Input Conditions with ndata < 4; Input Conditions with 4 ≤ ndata < 50; Data Points with Similar Ranges.]

Fig. 7. Experimental Data Points in the Same Neighborhood as the Input Conditions with ndata < 50.

Fig. 6 shows the distributions of the five dimensionless groups in the experimental database. Based on Fig. 6, it can be seen that for most of the data points in the experimental database, the particle concentration falls in the neighborhood of 1.0 ppm. Therefore, for the input conditions where ndata < 50 and the particle concentration is between 10 and 20 ppm, there are only a few data points in the experimental database that are representative of these input conditions. On the other hand, all 47 input conditions with ndata < 50 have pipe inclination angles of 0.0°, and so do most of the data points in the experimental database. Thus, in terms of the pipe inclination angle, there is sufficient experimental data in the neighborhood of these 47 input conditions. The 47 input conditions with ndata < 50 all lie in the following neighborhood: $2.0 \le \rho_{S,0}/\rho_{f,0} \le 4.0$, $600 \le D_0/d_{p,0} \le 7000$, and $N_{Ar,0} \le 600$. There are a total of 73 data points in the experimental database where the values of the dimensionless groups are within the same ranges. When only the ranges of $D_i/d_{p,i}$ and $N_{Ar,i}$ were considered, the


used to predict the value of the threshold velocity? (2) Given the predictions of the models, with how much confidence are we able to say that these predictions are correct? In this paper, we present a model ranking, fine-tuning, and uncertainty propagation approach to answer the above questions. The methodology uses a novel combination of data clustering, model parameter fine-tuning, model screening and ranking, model uncertainty quantification, and Monte Carlo simulation methods. Given an experimental database for solids transport, a set of solids transport models, and an input condition, the outputs of the methodology include the ranking of the models, the distributions of the models’ threshold velocity predictions, and the envelopes of the models’ predictions to within a predetermined confidence level. The proposed methodology is tested on a Case Study. Here, an experimental datum point was removed from the experimental database, and used as the input condition. For a single input condition (Case Study 1), it was found that the threshold velocity envelopes of the ranked original models and five highest-ranked fine-tuned models cover the experimentally-observed threshold velocity envelope. Operating the pipeline within the threshold velocity envelope recommended by the higher-ranked fine-tuned models instead of the higher-ranked original models is advantageous due to the narrower envelopes produced by some of the higher-ranked fine-tuned models. The proposed methodology is then tested on multiple input conditions. Similar to Case Study 1, an experimental datum point was removed from the experimental database, and used as the input condition. After implementing the methodology using this input condition, the experimental datum point was returned to the experimental database. This procedure was repeated for all 174 experimental data points for the hydraulic conveying of solid particles.
The performances of the first-ranked original and the first-ranked fine-tuned models recommended by the methodology were compared against the Mantz (1977) model and the Oroskar and Turian (1980) semi-mechanistic model. For these models, the threshold velocity envelopes produced when the uncertainties from all sources are propagated cover the experimentally-observed values for 92–93% of the 166 input conditions where the number of experimental data points in the reduced database is greater than or equal to four. This observation stresses the importance of propagating the uncertainties from all sources. When the 95th percentile values of the threshold velocity produced by these models are compared against the experimentally-observed threshold velocity, underestimates were produced only for 3–4% of these 166 input conditions. Thus, propagating the uncertainties from all sources increases the confidence that the solid particles are transported. Using the prediction envelopes recommended by the proposed methodology is advantageous over relying on the prediction of individual models without prior evaluation and uncertainty propagation. Without evaluating the model, it is difficult to determine if a model's prediction at the input condition is reliable. Even if the model is adequate for the input condition, without any uncertainty propagation, no information is provided regarding the confidence interval of the model's prediction. Therefore, by assessing the accuracy of the model in the neighborhood of the input condition, and propagating the uncertainties from multiple sources prior to using the model, the confidence in the model's prediction increases. Because the Monte Carlo simulation method is used as part of the methodology, executing the methodology for one input condition can be computationally expensive. Thus, to improve the methodology's efficiency, alternative uncertainty propagation methods may be considered.
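To make the idea of propagating input and model uncertainties by Monte Carlo simulation concrete, the following is a minimal Python sketch under loudly stated assumptions. The function `toy_threshold_velocity` is a hypothetical placeholder and is not one of the solids transport models evaluated in the paper; the normal input distributions, the lognormal model-error factor, and all default values are likewise assumptions chosen only for illustration.

```python
import numpy as np

def toy_threshold_velocity(d_p, rho_ratio):
    # Hypothetical placeholder model (NOT a model from the paper):
    # velocity scale built from particle diameter d_p [m] and the
    # solid-to-fluid density ratio.
    return 1.5 * np.sqrt(9.81 * d_p * (rho_ratio - 1.0))

def propagate(d_p, rho_ratio, rel_input_sd=0.05, model_log_sd=0.30,
              n_samples=10_000, seed=0):
    """Return the 5th, 50th, and 95th percentiles of the prediction."""
    rng = np.random.default_rng(seed)

    # Input uncertainty: assumed normal with a relative standard deviation.
    d_p_s = rng.normal(d_p, rel_input_sd * d_p, n_samples)
    rho_s = rng.normal(rho_ratio, rel_input_sd * rho_ratio, n_samples)

    # Keep the sampled inputs physically meaningful for the toy model.
    d_p_s = np.clip(d_p_s, 1e-12, None)
    rho_s = np.clip(rho_s, 1.0, None)

    # Model uncertainty: assumed multiplicative lognormal factor with
    # log-space standard deviation model_log_sd.
    factor = rng.lognormal(mean=0.0, sigma=model_log_sd, size=n_samples)

    v = toy_threshold_velocity(d_p_s, rho_s) * factor
    return np.percentile(v, [5, 50, 95])

# Example call with made-up inputs (300-micron particles, density ratio 2.65).
p5, p50, p95 = propagate(3e-4, 2.65)
```

The cost noted in the text is visible here: every input condition requires `n_samples` model evaluations, so cheaper propagation schemes become attractive when the model itself is expensive.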

number of data points in the experimental database where $600 \le D_i/d_{p,i} \le 7000$ and $N_{Ar,i} \le 600$ remains the same at 73. Therefore, in terms of $\rho_{S,i}/\rho_{f,i}$, there is sufficient experimental data in the neighborhood of the 47 input conditions. The values of $D_i/d_{p,i}$ and $N_{Ar,i}$ for the 73 experimental data points (which include the 47 input conditions) are plotted in Fig. 7. The input conditions where ndata < 50 are further partitioned based on the ranges of ndata < 4 and 4 ≤ ndata < 50. This partitioning is performed to determine the differences between the input conditions populated with only two experimental data points in the reduced database or where all original models were discarded during the model screening process, and the rest of the input conditions with similar values of $D_i/d_{p,i}$ and $N_{Ar,i}$. From Fig. 7, it can be seen that the eight input conditions where ndata < 4 are more isolated compared to the remaining input conditions. For instance, the rightmost input condition where ndata < 4 (shown in Fig. 7) has no neighboring data points. The remaining input conditions where ndata < 4 have very few neighboring experimental data points. This observation suggests that there are insufficient representative experimental data points in the neighborhood of these input conditions. As a result, during the data clustering process, experimental data points that are not necessarily representative of these input conditions were selected to populate the reduced database. Because of the presence of non-representative experimental data in the reduced database, none of the models could capture the solids transport phenomena for all the experimental data points in the reduced database. This causes the model screening process to discard all the original models. The four input conditions shown in Fig. 7 where $1.0 \times 10^{-3} < N_{Ar,0} < 1.0 \times 10^{-1}$ also seem isolated. However, for these input conditions, 37–38 experimental data points were selected to populate the reduced database.
This observation suggests that during the data clustering process, many experimental data points where $N_{Ar,i} > 1.0 \times 10^{-1}$ were selected to populate the reduced database. Because these data points (with $N_{Ar,i} > 1.0 \times 10^{-1}$) comprise the majority of the data points in the reduced database, it is possible that models that are accurate in the range of $N_{Ar,i} > 1.0 \times 10^{-1}$ were selected for ranking. As a result, for ndata = 37 or ndata = 38, at least one model was not discarded during the model screening process for these four input conditions. Fig. 7 also shows that of the 73 experimental data points where $600 \le D_i/d_{p,i} \le 7000$ and $N_{Ar,i} \le 600$, most of the 26 data points (which were also used as input conditions for the Case Study) where ndata > 50 lie at the outer edge of this set of 73 data points. Thus, during the data clustering process, not all of these 26 experimental data points were included in the reduced database for those input conditions where ndata < 50. Furthermore, given that most of these 26 data points are located at the outer edge of the ranges $600 \le D_i/d_{p,i} \le 7000$ and $N_{Ar,i} \le 600$, when they are used as input conditions for the Case Study, it is possible that experimental data points that lie outside these ranges are more representative (i.e., are closer to the input condition) compared to some of the data points that lie within these ranges. The main conclusion that is drawn from the analysis performed in this subsection can be stated as follows: for the investigation of the threshold fluid velocity for solid particle transport, there are gaps in the experimental data where the particle concentration ranges from 10 to 40 ppm, $600 \le D_i/d_{p,i} \le 7000$, and $N_{Ar,i} \le 600$. Additional experiments may increase the understanding of the solids transport phenomena in these ranges.

5. Conclusions

There are many solids transport models available to predict the threshold velocity required to transport solid particles in a pipe, but the predictions of these models may differ by orders of magnitude, and the uncertainties and confidence behind the predictions of many of these models are not readily available. Therefore, two questions are raised: (1) Given an input condition, which solids transport models should be

Acknowledgements

The authors would like to thank Dr. Gene E. Kouba of Chevron Energy Technology Company (Chevron ETC) for helping them in the development and improvement of their methodology; Dr. William


Ponagandla, V., 2008. Critical Deposition Velocity Method for Dispersed Sand Transport in Horizontal Flow [MS]. The University of Tulsa, Tulsa, OK, USA. Rabinovich, E., Kalman, H., 2007. Pickup, critical and wind threshold velocities of particles. Powder Technol. 176, 9–17. Rabinovich, E., Kalman, H., 2009a. Incipient motion of individual particles in horizontal particle-fluid systems: A. Experimental analysis. Powder Technol. 192, 318–325. Rabinovich, E., Kalman, H., 2009b. Incipient motion of individual particles in horizontal particle-fluid systems: B. Theoretical analysis. Powder Technol. 192, 326–338. Ramadan, A., Skalle, P., Johansen, S.T., 2003. A mechanistic model to determine the critical flow velocity required to initiate the movement of spherical bed particles in inclined channels. Chem. Eng. Sci. 58, 2153–2163. Rodriguez Corredor, F.E., Bizhani, M., Kuru, E., 2016. Experimental investigation of cuttings bed erosion in horizontal wells using water and drag reducing fluids. J. Pet. Sci. Eng. 147, 129–142. Roy, C.J., Oberkampf, W.L., 2011. A comprehensive framework for verification, validation, and uncertainty quantification in scientific computing. Comput. Methods Appl. Mech. Eng. 200, 2131–2144. Schoonjans, F., De Bacquer, D., Schmid, P., 2011. Estimation of population percentiles. Epidemiology 22 (5), 750–751. Shiozawa, S., McClure, M., 2016. Simulation of proppant transport with gravitational settling and fracture closure in a three-dimensional hydraulic fracturing simulator. J. Pet. Sci. Eng. 138, 298–314. Soepyan, F.B., 2015. A Model Evaluation and Uncertainty Propagation Method to Increase the Confidence of Model Predictions [PhD]. The University of Tulsa, Tulsa, OK, USA. Soepyan, F.B., Cremaschi, S., McLaury, B.S., Sarica, C., Subramani, H.J., Kouba, G.E., Gao, H., 2016. Threshold velocity to initiate particle motion in horizontal and near-horizontal conduits. Powder Technol. 292, 272–289. 
Soepyan, F.B., Cremaschi, S., Sarica, C., Subramani, H.J., 2012. Selection of the optimal critical velocity for sand transport at low concentrations for near-horizontal flow. Paper presented at: Offshore Technology Conference; 30 April–3 May 2012; Houston, TX, USA. Soepyan, F.B., Cremaschi, S., Sarica, C., Subramani, H.J., Kouba, G.E., 2013a. Model parameter fine-tuning and ranking methodology to improve the accuracy of threshold velocity predictions for solid particle transport. J. Pet. Sci. Eng. 110, 210–224. Soepyan, F.B., Cremaschi, S., Sarica, C., Subramani, H.J., Kouba, G.E., 2013b. Quantification and reduction of uncertainties for solids transport velocity predictions at low concentrations in near-horizontal flow. Paper presented at: Offshore Technology Conference; 6–9 May 2013. Houston, TX, USA. Soepyan, F.B., Cremaschi, S., Sarica, C., Subramani, H.J., Kouba, G.E., 2014. Solids transport models comparison and fine-tuning for horizontal, low concentration flow in single-phase carrier fluid. AIChE J. 60 (1), 76–122. Thomas, D.G., 1961. Transport characteristics of suspensions: II. Minimum transport velocity for flocculated suspensions in horizontal pipes. AIChE J. 7 (3), 423–430. Turian, R.M., Hsu, F.-L., Ma, T.-W., 1987. Estimation of the critical velocity in pipeline flow of slurries. Powder Technol. 51, 35–47. Vocadlo, J.J., Charles, M.E., 1972. Prediction of pressure gradient for the horizontal turbulent flow of slurries. Paper presented at: The Second International Conference on the Hydraulic Transport of Solids in Pipes; 20–22 September 1972; University of Warwick, Coventry, Great Britain. Wasp, E.J., Aude, T.C., Kenny, J.P., Seiter, R.H., Jacques, R.B., 1970. Deposition velocities, transition velocities, and spatial distribution of solids in slurry pipelines. Paper presented at: First International Conference on the Hydraulic Transport of Solids in Pipes; 1–4 September 1970; The University of Warwick, Coventry, Great Britain. Zenz, F.A., 1964. 
Conveyability of materials of mixed particle size. Ind. Eng. Chem. Fundam. 3 (1), 65–75. Zheng, C., Liu, Y., Wang, H., Zhu, H., Liu, Z., Ji, R., Shen, Y., 2015. Structural optimization of downhole fracturing tool using turbulent flow CFD simulation. J. Pet. Sci. Eng. 133, 218–225.

Coberly of the Department of Mathematics at The University of Tulsa (TU) for assisting them with the development of the uncertainty propagation method; Dr. Peyton Cook of the Department of Mathematics at TU for assisting them with the development of the statistical analysis for the data clustering component and the development of the uncertainty propagation method; Dr. Brenton McLaury of the Department of Mechanical Engineering at TU for recommending the values of the uncertainties of the variables for the cases where these uncertainties are unknown; and Jisup Shin of the Russell School of Chemical Engineering at TU for compiling some of the solids transport models developed for high particle concentrations. The authors gratefully acknowledge the financial support provided by the Tulsa University Center of Research Excellence (TUCoRE) program, a partnership between TU and Chevron ETC for conducting research benefitting the oil and gas industry.

Appendix A. Supporting information

Supplementary data associated with this article can be found in the online version at doi:10.1016/j.petrol.2016.12.025.

References

Akhshik, S., Behzad, M., Rajabi, M., 2015. CFD-DEM approach to investigate the effect of drill pipe rotation on cuttings transport behavior. J. Pet. Sci. Eng. 127, 229–244. Almutahar, F.M., 2006. Modeling of Critical Deposition Velocities of Sand in Horizontal and Inclined Pipes [MS]. The University of Tulsa, Tulsa, OK, USA. Cimbala, J.M., Çengel, Y.A., 2008. Essentials of Fluid Mechanics: Fundamentals and Applications. The McGraw-Hill Companies, Inc., New York, NY, USA. Conover, W.J., 1971. Practical Nonparametric Statistics. John Wiley & Sons, Inc, New York, NY, USA. Cremaschi, S., Shin, J., Subramani, H.J., 2015. Data clustering for model-prediction discrepancy reduction – a case study of solids transport in oil/gas pipelines. Comput. Chem. Eng. 81, 355–363. Davies, J.T., 1987. 
Calculation of critical velocities to maintain solids in suspension in horizontal pipes. Chem. Eng. Sci. 42 (7), 1667–1670. Devore, J.L., Berk, K.N., 2007. Modern Mathematical Statistics with Applications. Brooks/Cole, Cengage Learning, Belmont, CA, USA. Dieck, R.H., 2007. Measurement Uncertainty: Methods and Applications fourth ed. ISA— The Instrumentation, Systems, and Automation Society, Research Triangle Park, NC, USA. Gruesbeck, C., Salathiel, W.M., Echols, E.E., 1979. Design of gravel packs in deviated wellbores. J. Pet. Technol. 31 (1), 109–115. Gu, M., Mohanty, K.K., 2015. Rheology of polymer-free foam fracturing fluids. J. Pet. Sci. Eng. 134, 87–96. Hayden, K.S., Park, K., Curtis, J.S., 2003. Effect of particle characteristics on particle pickup velocity. Powder Technol. 131, 7–14. Kalman, H., Satran, A., Meir, D., Rabinovich, E., 2005. Pickup (critical) velocity of particles. Powder Technol. 160, 103–113. Mantz, P.A., 1977. Incipient transport of fine grains and flakes by fluids—extended Shields diagram. J. Hydraul. Div. 103 (HY6), 601–615. Oroskar, A.R., Turian, R.M., 1980. The critical velocity in pipeline flow of slurries. AIChE J. 26 (4), 550–558. Pangilinan, K.D., de Leon, A.C.C., Advincula, R.C., 2016. Polymers for proppants used in hydraulic fracturing. J. Pet. Sci. Eng. 145, 154–160.
