Accelerating Poisson solvers in front tracking method using parallel direct methods

Computers & Fluids 118 (2015) 101–113
M.T. Mehrabani a,1, M.R.H. Nobari a,*, G. Tryggvason b

a Department of Mechanical Engineering, Amirkabir University of Technology, 424 Hafez Ave., P.O. Box 15875-4413, Tehran, Iran
b Department of Aerospace and Mechanical Engineering, University of Notre Dame, United States

Article history:
Received 18 August 2014
Received in revised form 7 May 2015
Accepted 8 June 2015
Available online 14 June 2015

Keywords: Parallel direct solver; Poisson equation; Multiphase flow; Incompressible flow; Finite difference; Curved pipe

Abstract

Parallel direct computation is implemented in the multiphase flow simulation of bubbles in a curved duct using a front tracking method to reduce the computational cost. To solve the density Poisson equation, three methods, SOR, FGMRES and PARDISO, are tested under two strategies: full domain and masked bubble. A comparison of numerical simulations under the same conditions indicates that the PARDISO scheme with the masked bubble strategy considerably reduces the computational time and RAM usage, which enables simulations on very fine grid resolutions.

© 2015 Elsevier Ltd. All rights reserved.

1. Introduction

Three-dimensional finite difference multi-fluid flow problems based on structured grids suffer from severe limitations regarding computational time and memory requirements. Most traditional solvers employ iterative algorithms, but recently direct solvers have been used and tested successfully in various problems [1–3]. The choice between a direct and an iterative solver in large domains depends on the size and behavior of the physical problem. Iterative solvers are highly memory efficient and easier to parallelize on OpenMP or shared memory platforms [3]. Furthermore, iterative methods often require custom numerical tools, such as preconditioning techniques, to be efficient [4]. In addition, in problems involving highly ill-conditioned matrices, iterative solvers perform poorly. On the other hand, direct solvers are remarkably limited by their large memory requirements. Recently, parallel techniques have been adapted to sparse direct solvers and have greatly increased their computational efficiency. Among the sparse direct solvers, the PARDISO package (part of the Intel MKL library) is a high-performance, robust, memory-efficient and easy-to-use software package for solving large sparse symmetric and

asymmetric linear systems [5]. Using this package enables the simulation of CFD problems with massive grid sizes. Multiphase flow through a curved duct, computed with the front-tracking method, has been chosen as a benchmark problem for studying the performance improvement obtained with the PARDISO package, the FGMRES and the SOR methods. The front-tracking method [6] models the front interface and surface tension forces by accurately tracking the motion of the interfaces without loss of mass conservation between the phases. Because of the relatively huge number of computations required in bubbly flows, the use of parallel techniques is essential to make the simulations feasible. One of the most important tasks in CFD simulations is solving the Poisson equation, which arises in various contexts. In this work, we focus on parallel direct solvers for the density Poisson equations appearing in the front-tracking method to assess possible accelerating strategies in the computation. To do so, three different methods are implemented based on the available solvers: the parallel direct solver (PARDISO) [7,8], the Flexible Generalized Minimal Residual method (FGMRES) [9,10], and the traditional serial SOR method.

* Corresponding author. Tel.: +98 21 64543412; fax: +98 21 66419736. E-mail address: [email protected] (M.R.H. Nobari).

http://dx.doi.org/10.1016/j.compfluid.2015.06.013
0045-7930/© 2015 Elsevier Ltd. All rights reserved.

2. Governing equations

The methodology of the traditional front tracking method, along with the present strategy called masked bubble computing, is briefly described in the following.


Nomenclature

a        width of the duct
b        height of the duct
D_h      hydraulic diameter, D_h = 2ab/(a + b)
ds'      surface element area of the front
Eo       Eötvös number, Eo = g Δρ d_eq² / σ
F_st     surface tension body force
g        gravity
h        grid spacing
k        curvature
K        Dean number, Re/λ^(1/2)
MKL      Math Kernel Library
Mo       Morton number, Mo = g μ_l⁴ Δρ / (ρ_l² σ³)
n        unit normal vector
N        Archimedes number, N = ρ_l² d_e³ g / μ_l²
p        pressure
r        the point at which the equation is evaluated
r'       a point on the front
R_m      average radius of curvature, R_m = (R_i + R_o)/2
Re       Reynolds number, Re = w_m D_h / ν
R_i      inner bend
R_o      outer bend
t        time
U        velocity vector
u, v, w  velocity components in the r, θ and z directions, respectively
w_m      mean axial velocity
w_ijk    weight of grid point ijk with respect to element l

Greek symbols
δ        delta function
Δl_e     area of an element of the front
Δs_l     area of element l
λ        curvature ratio, λ = a/D_h
μ        viscosity of the fluid
ν        kinematic viscosity
ρ        density of the fluid
σ        surface tension coefficient
φ_ijk    an approximation to the grid value
φ_l      discrete approximation to the front value

Subscripts
i        inner
o        outer

2.1. Front tracking method

The computational code used in this work is developed in cylindrical coordinates for modeling bubbly flows in curved ducts. The physical domain is shown in Fig. 1. The Navier–Stokes equations are solved for the immiscible phases by treating them as a single fluid with variable material properties, using the front-tracking technique described below. The momentum equation in vectorial form is

$$\frac{\partial(\rho \mathbf U)}{\partial t}+\nabla\cdot(\rho\mathbf U\mathbf U)=-\nabla p+\nabla\cdot\mu\left((\nabla\mathbf U)+(\nabla\mathbf U)^{T}\right)+\rho\mathbf g+\mathbf F_{st}\tag{1}$$

In the cylindrical coordinates, it can be written in the r, θ and z directions, respectively, as

$$\frac{\partial(\rho u_r)}{\partial t}+\frac{1}{r}\frac{\partial(r\rho u_r u_r)}{\partial r}+\frac{1}{r}\frac{\partial(\rho u_\theta u_r)}{\partial\theta}+\frac{\partial(\rho u_z u_r)}{\partial z}-\rho\frac{u_\theta^{2}}{r}=-\frac{\partial p}{\partial r}+\frac{1}{r}\frac{\partial(r\sigma_{rr})}{\partial r}+\frac{1}{r}\frac{\partial\sigma_{r\theta}}{\partial\theta}-\frac{\sigma_{\theta\theta}}{r}+\frac{\partial\sigma_{rz}}{\partial z}+\rho g_r+F_{st,r}\tag{2}$$

$$\frac{\partial(\rho u_\theta)}{\partial t}+\frac{1}{r}\frac{\partial(r\rho u_r u_\theta)}{\partial r}+\frac{1}{r}\frac{\partial(\rho u_\theta u_\theta)}{\partial\theta}+\frac{\partial(\rho u_z u_\theta)}{\partial z}+\rho\frac{u_r u_\theta}{r}=-\frac{1}{r}\frac{\partial p}{\partial\theta}+\frac{1}{r^{2}}\frac{\partial(r^{2}\sigma_{r\theta})}{\partial r}+\frac{1}{r}\frac{\partial\sigma_{\theta\theta}}{\partial\theta}+\frac{\partial\sigma_{\theta z}}{\partial z}+\rho g_\theta+F_{st,\theta}\tag{3}$$

$$\frac{\partial(\rho u_z)}{\partial t}+\frac{1}{r}\frac{\partial(r\rho u_r u_z)}{\partial r}+\frac{1}{r}\frac{\partial(\rho u_\theta u_z)}{\partial\theta}+\frac{\partial(\rho u_z u_z)}{\partial z}=-\frac{\partial p}{\partial z}+\frac{1}{r}\frac{\partial(r\sigma_{rz})}{\partial r}+\frac{1}{r}\frac{\partial\sigma_{\theta z}}{\partial\theta}+\frac{\partial\sigma_{zz}}{\partial z}+\rho g_z+F_{st,z}\tag{4}$$

where

$$\sigma_{rr}=2\mu\frac{\partial u_r}{\partial r}\tag{5}$$

$$\sigma_{\theta\theta}=2\mu\left(\frac{1}{r}\frac{\partial u_\theta}{\partial\theta}+\frac{u_r}{r}\right)\tag{6}$$

$$\sigma_{zz}=2\mu\frac{\partial u_z}{\partial z}\tag{7}$$

$$\sigma_{r\theta}=\sigma_{\theta r}=\mu\left(r\frac{\partial}{\partial r}\left(\frac{u_\theta}{r}\right)+\frac{1}{r}\frac{\partial u_r}{\partial\theta}\right)\tag{8}$$

$$\sigma_{rz}=\sigma_{zr}=\mu\left(\frac{\partial u_z}{\partial r}+\frac{\partial u_r}{\partial z}\right)\tag{9}$$

$$\sigma_{\theta z}=\sigma_{z\theta}=\mu\left(\frac{1}{r}\frac{\partial u_z}{\partial\theta}+\frac{\partial u_\theta}{\partial z}\right)\tag{10}$$

$$\mathbf F_{st}=\int \sigma\kappa\,\mathbf n\,\delta(\mathbf r-\mathbf r')\,ds'\tag{11}$$

Because the viscosity is not constant, the viscous terms cannot be simplified, and this causes some difficulties in discretizing the equations on the staggered grid. Using the projection algorithm, the momentum equation can be factorized as

$$\frac{(\rho U)^{n+1}-\rho^{n+1}U^{*}}{\Delta t}=-\nabla_h p^{n+1}\tag{12}$$

Fig. 1. Geometry of the physical domain.


where the un-projected mass flux is defined as

$$\frac{\rho^{n+1}U^{*}-\rho^{n}U^{n}}{\Delta t}=-\nabla_h\cdot\left(\rho^{n}U^{n}U^{n}\right)+\nabla_h\cdot\mu^{n}\left(\nabla_h U^{n}+\nabla_h^{T}U^{n}\right)+\rho^{n}g+F_{st}^{n}\tag{13}$$

Taking the divergence of both sides, the Poisson equation for pressure is obtained as

$$\nabla_h\cdot\frac{1}{\rho^{n+1}}\nabla_h p=\frac{1}{\Delta t}\nabla_h\cdot U^{*}\tag{14}$$

The new velocity at time step (n + 1) is found as

$$U^{n+1}=U^{*}-\frac{\Delta t}{\rho^{n+1}}\nabla_h p\tag{15}$$

where the subscripts h and n refer to the discretization in space and time, respectively. The source term due to the surface tension force in Eq. (11) is smeared out from the interface onto the neighboring grid points using the immersed-boundary method presented by Peskin [11], where a cosine weighting function spreads the interfacial properties to the surrounding cells, as shown in Fig. 2, based on the Lagrangian–Eulerian treatment of moving front markers and stationary grid points. Peskin's interpolation can be expressed as

$$\delta(r)=\begin{cases}\dfrac{1}{4h}\left(1+\cos\dfrac{\pi r}{2h}\right), & |r|<2h\\[4pt] 0, & |r|\ge 2h\end{cases}\tag{16}$$

The weight of the grid point (i, j, k) for interpolation to $R_p=(r_p,\theta_p,z_p)$ is written as

$$w_{ijk}(x_p)=\delta\left(r_p-(r_i+i\,\delta r)\right)\,\delta\left(r_p\theta_p-j(r_i+i\,\delta r)\,\delta\theta\right)\,\delta\left(z_p-k\,\delta z\right)\tag{17}$$

The momentum and continuity equations are solved on a fixed Eulerian staggered grid, as shown in Fig. 2, using a second-order central difference scheme in space and explicit first-order time integration [12]. Since the time step is small due to the stability requirement, first-order time integration is commonly used [13]. The stability conditions are

$$\frac{\Delta t}{\min\left[\rho_b/\mu_b,\ \rho_o/\mu_o\right]\,\min\left[\Delta r^{2},\ (r\,\Delta\theta)^{2},\ \Delta z^{2}\right]}\le 1,\qquad \frac{\tfrac{1}{2}\left(\vec V_{max}\cdot\vec V_{max}\right)\Delta t}{\min\left[\mu_b/\rho_b,\ \mu_o/\rho_o\right]}\le 1\tag{18}$$

where the subscripts b and o denote the bubble and the ambient fluid, respectively. The interface is represented by Lagrangian grids moved by the velocity interpolated from the neighboring grid points. The tracked new interface points are used to compute the new density and viscosity fields and the surface tension force at the new time step, which is carried out using the immersed boundary method.

2.2. Updating material properties

In the front tracking method, the fluid properties, such as the density and viscosity, are computed from the interface location at each time step. At the new time step, the grid density gradient is first obtained by

$$\nabla\rho^{n+1}=\int \Delta\rho^{n+1}\,\mathbf n\,\delta(\mathbf r-\mathbf r_f)\,ds\tag{19}$$

and its discrete version is

$$\nabla_h\rho^{n+1}_{ijk}=\sum_l \Delta\rho^{n+1}\,w^{l}_{ijk}\,\mathbf n_e\,\Delta l_e\tag{20}$$

Once the grid density gradient has been constructed, the density field must be recovered. Taking the numerical divergence of the grid density gradient (the LHS of Eq. (19)) results in a numerical approximation to the Laplacian of the density field:

$$\nabla^{2}\rho^{n+1}=\nabla_h\cdot\nabla\rho^{n+1}_{ijk}\tag{21}$$

In other words, the RHS of Eq. (21) is obtained numerically from the location of the interface and is treated as a source term in the Poisson equation. The left hand side is approximated by the standard centered difference approximation for the Laplacian, and solving the resulting Poisson equation with the boundary conditions set to the ambient fluid density indicator yields the density field in the complete domain. When using the masked bubble strategy, we solve the Poisson equation for density in a smaller sub-domain enclosing one bubble, with the same boundary conditions as imposed in the regular procedure for the complete domain. This means that after the solution, the density field outside the mask is constant and equal to the ambient fluid density. Solving this elliptic equation for the entire domain causes two types of error [14]. One source of error stems from the unphysical density variation away from the interface, and the second results from the small over- and undershoots occasionally found near the interface. Small variations in density away from the interface can lead to unphysical buoyancy currents, and undershoots can lead to negative densities, causing trouble in the pressure solver. Moreover, solving the Poisson equations for the density and pressure fields takes the largest portion of the CPU time at each time step. To overcome these problems, we use a parallel direct solver (PARDISO) for the density equation based on the masked bubble strategy to minimize the computational time and the density fluctuations. In addition, this package is used for the pressure equation in the entire domain. To show the computational improvement obtained with the PARDISO solver, the results are compared with the other solvers using both the full domain and the masked bubble strategies. In the full domain case, all the grid nodes take part in the computation, but in the masked bubble procedure only the nodes inside the mask (box) enclosing a bubble are used. However, for the pressure equation the full domain strategy is always used.

2.3. Masked bubble strategy

Fig. 2. Staggered grid points and the front position.
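Peskin's cosine weighting (Eqs. (16) and (17)) can be sketched in one dimension as follows; the grid spacing and front marker position below are illustrative values, not taken from the paper:

```python
import numpy as np

# Peskin's cosine distribution (Eq. (16)) and, in one dimension, the
# weights it assigns to the grid nodes around a front marker. Eq. (17)
# builds the 3D weights as a product of three such one-dimensional factors.
def peskin_delta(r, h):
    """Cosine-smoothed delta function, nonzero only for |r| < 2h."""
    r = np.asarray(r, dtype=float)
    out = np.zeros_like(r)
    mask = np.abs(r) < 2.0 * h
    out[mask] = (1.0 + np.cos(np.pi * r[mask] / (2.0 * h))) / (4.0 * h)
    return out

h = 0.1                          # illustrative grid spacing
x = np.arange(20) * h            # grid node positions
xf = 0.93                        # illustrative front marker position
w = peskin_delta(x - xf, h) * h  # weights of the grid nodes for this marker
print(w.sum())
```

For any marker position inside the grid the weights sum to one (to round-off), which is the discrete conservation property that makes the spreading in Eq. (20) conservative.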

Instead of solving the density Poisson equation for the complete computational domain, which takes a large computational time, we implement the masked bubble strategy to reduce the computational time remarkably. This strategy exploits the fact that the density gradient is nonzero only near the interfaces; beyond that, the density is constant and equal to the ambient fluid density. Therefore, using the smallest possible domain that contains the regions with non-zero density gradients to solve the density Poisson equation lowers the computational cost and increases the convergence rate. We call this smallest region the masked region; it encloses one of the bubbles appearing in the


multi-fluid flow domain, regardless of its collisions and deformation. Hence, one large Poisson solve over the complete domain containing 'N' bubbles is converted into 'N' Poisson solves over 'N' masks, each including just one bubble.
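A one-dimensional sketch of this per-mask density recovery (Eqs. (19)-(21) restricted to a sub-domain): the density is rebuilt from its smeared gradient by a Poisson solve with ambient-density boundary values. The grid, the bubble extent and the SciPy solver are illustrative choices, not the authors' implementation:

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import spsolve

# 1D analogue of Eqs. (19)-(21): recover a density field from its smeared
# gradient by solving a Poisson equation with Dirichlet boundary values
# equal to the ambient density (here 1.0, with bubble density 0.5).
n, h = 200, 1.0 / 200
x = (np.arange(n) + 0.5) * h
rho_exact = np.where((x > 0.4) & (x < 0.6), 0.5, 1.0)

# Smeared gradient field (what the front spreading of Eq. (20) produces),
# and its numerical divergence as the Poisson source term (Eq. (21)).
grad = np.gradient(rho_exact, h)
src = np.gradient(grad, h)

# Standard centered-difference Laplacian with ambient-density Dirichlet BCs.
A = diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(n, n)) / h**2
rhs = src.copy()
rhs[0] -= 1.0 / h**2    # ghost value = ambient density at the left wall
rhs[-1] -= 1.0 / h**2   # ghost value = ambient density at the right wall
rho = spsolve(A.tocsc(), rhs)
```

Away from the interface the recovered field sits at the ambient value, and inside the bubble it returns to the bubble density, which is exactly why the solve can be restricted to a small mask around each bubble.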

$$\nabla^{2}\rho=\nabla_h\cdot\nabla\rho_{ijk}\quad\text{is converted into}\quad\left\{\left(\nabla^{2}\rho\right)^{m}=\left(\nabla_h\cdot\nabla\rho_{ijk}\right)^{m}\right\}_{m=1}^{N}\tag{22}$$

where 'm' indicates the masked regions, each enclosing one of the bubbles in the multi-fluid domain. Our masked bubble strategy allocates a separate layer for each bubble, of the same size as the main domain, initially filled with the ambient fluid density indicator, assigned as zero. After solving each masked bubble density Poisson equation, the results are transferred into the corresponding nodes of the bubble's density layer. For 'N' bubbles we thus have 'N' density layers. Based on the density fields stored in the 'N' layers, we build the final density layer, which initially contains the ambient fluid density in all its nodes, by taking the maximum of the densities appearing at each of the corresponding nodes of the 'N' bubble layers, as follows

$$\rho^{final}_{ijk}=\max\left\{\rho^{m}_{ijk}\right\}_{m=1}^{N},\qquad (ijk)\ \text{over the complete domain nodes}\tag{23}$$

The above final density field is the solution for the density field of the complete computational domain. As long as there is no collision between the bubbles, we can proceed with the computations to find other physical parameters such as the pressure. In case of collision of bubbles, an additional step is necessary to smooth the density field where the collision occurs. To find the collision locations, we compare the nodes of each bubble layer with the corresponding nodes of the other bubble layers. Nodes at which more than one layer holds a density greater than the ambient fluid density indicator mark the collision location, where we need to smooth the final density field using the following filtering method:

$$\rho^{final}_{i,j,k}=\frac{1}{7}\left(\rho^{final}_{i,j,k}+\rho^{final}_{i+1,j,k}+\rho^{final}_{i-1,j,k}+\rho^{final}_{i,j+1,k}+\rho^{final}_{i,j-1,k}+\rho^{final}_{i,j,k+1}+\rho^{final}_{i,j,k-1}\right)\tag{24}$$

where (i, j, k) indicates a node at which the values of density for at least two bubble layers are greater than the ambient fluid

Fig. 3. Masked bubble strategy, a: 3D region in the physical domain, b: 2D schematic of mask boundary notation.

Fig. 4. Flowchart of the masked bubble strategy.

Fig. 5. 2D schematics of the layers and a 1D representation of constructing the final layer.

The general procedure and algorithm of the masked strategy (with and without collision) is shown in Fig. 4. The algorithm starts by considering N bubbles in the domain. Then N matrices, called layers, are allocated at the same size as the computational domain, so that each bubble and its mask are located in their own layer. After solving the density Poisson equation in the mask regions, the results are copied to each layer individually. In other words, each layer contains only one bubble and one mask. When the computation of all layers is completed, the main or final layer, which is the physical domain of concern, must be computed. To do so, we first initialize the final layer to zero. Then, each node value of the final layer (zero) is compared with the corresponding node of each computational layer, and the maximum value is chosen. For example, comparing the density field of layer one with layer two, assuming that no collision has occurred, there are two layers with their own bubbles, and the outcome will be a final layer with two bubbles, as shown schematically in Fig. 5a and b. As can be seen from the figure, when no collision occurs, comparing two layers node by node and choosing the maximum node values produces the final density layer. However, when two or more bubbles collide with each other, the above procedure leads to sharp interfaces near the collision locations. Therefore, in the case of collision, a filtering procedure that averages the neighboring nodes around the interface must be applied to determine the final layer values. This filtering smooths the interface at the collision locations, as shown in Fig. 6a and b.
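The layer-maximum construction (Eq. (23)) and the collision filter (Eq. (24)) can be sketched as follows; the layers are synthetic arrays, and zero plays the role of the ambient-fluid density indicator as in the paper:

```python
import numpy as np

# Sketch of the final-layer construction (Eq. (23)) and the 7-point
# averaging filter applied at collision nodes (Eq. (24)).
def build_final_layer(layers):
    """Node-wise maximum over all bubble layers (Eq. (23))."""
    return np.maximum.reduce(layers)

def collision_nodes(layers):
    """Nodes where at least two layers exceed the ambient indicator (0)."""
    counts = sum((layer > 0).astype(int) for layer in layers)
    return counts >= 2

def smooth_collisions(final, mask):
    """Average each flagged node with its six neighbours (Eq. (24))."""
    out = final.copy()
    for i, j, k in np.argwhere(mask):
        if 0 < i < final.shape[0] - 1 and 0 < j < final.shape[1] - 1 \
                and 0 < k < final.shape[2] - 1:
            out[i, j, k] = (final[i, j, k]
                            + final[i + 1, j, k] + final[i - 1, j, k]
                            + final[i, j + 1, k] + final[i, j - 1, k]
                            + final[i, j, k + 1] + final[i, j, k - 1]) / 7.0
    return out

# Two overlapping synthetic "bubble" layers on a small grid.
a = np.zeros((8, 8, 8)); a[2:5, 2:5, 2:5] = 1.0
b = np.zeros((8, 8, 8)); b[4:7, 4:7, 4:7] = 1.0
final = build_final_layer([a, b])
mask = collision_nodes([a, b])
smoothed = smooth_collisions(final, mask)
```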

Fig. 6. Capturing the interface of colliding bubbles, 1D and 2D schematics.

density. In other words, two or more different layers whose corresponding nodes hold densities greater than the ambient fluid density indicate the collision of two or more bubbles, and there the density field needs smoothing.

2.3.1. How to build a mask

As explained in the previous section, each mask contains only one bubble, regardless of its deformation or its collision with one or more other bubbles. To reduce the computational cost, the bubble mask has to be the smallest possible box in which the density gradient takes place, based on the target bubble interface. To do so, we first find the minimum and maximum locations of the target bubble interface in each direction. Based on those locations, we find the corresponding nodes that surround the target bubble interface. The mask is then constructed by extending the surrounding nodes by two grid sizes on each side, to account for Peskin's distribution function. A sample mask is shown in Fig. 3a, and a 2D schematic of the mask boundaries enclosing a single bubble is shown in Fig. 3b for the cylindrical coordinate system. This procedure is expressed mathematically for our cylindrical coordinates as follows. Suppose nr, nθ and nz are the total numbers of grid points in the r, θ and z directions of the computational domain. The minimum and maximum grid indices corresponding to the minimum and maximum interface locations in the three directions, taking into account the two-grid extension on each side, are computed as

$$\begin{aligned}
i_{min}&=\mathrm{int}\big((r_{min}-R_i)\,nr/(R_o-R_i)\big)-2, & i_{max}&=\mathrm{int}\big((r_{max}-R_i)\,nr/(R_o-R_i)\big)+2,\\
j_{min}&=\mathrm{int}\big(\theta_{min}\,n\theta/\theta_e\big)-2, & j_{max}&=\mathrm{int}\big(\theta_{max}\,n\theta/\theta_e\big)+2,\\
k_{min}&=\mathrm{int}\big(z_{min}\,nz/z_e\big)-2, & k_{max}&=\mathrm{int}\big(z_{max}\,nz/z_e\big)+2
\end{aligned}\tag{25}$$
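A minimal sketch of Eq. (25) for the radial direction, with illustrative geometry values; the clamping to the domain limits is an added safeguard not stated in the paper:

```python
# Sketch of Eq. (25): converting a bubble's radial interface extent into
# mask grid-index bounds with a two-cell margin for Peskin's distribution
# function. All geometry values below are illustrative.
def mask_bounds(r_min, r_max, Ri, Ro, nr, margin=2):
    imin = int((r_min - Ri) * nr / (Ro - Ri)) - margin
    imax = int((r_max - Ri) * nr / (Ro - Ri)) + margin
    # Clamp to the computational domain so the mask stays inside the grid.
    return max(imin, 0), min(imax, nr - 1)

# A bubble spanning radii 103.1-105.3 in a duct with Ri=100, Ro=108, nr=90:
print(mask_bounds(103.1, 105.3, 100.0, 108.0, 90))  # (32, 61)
```

The θ and z bounds follow the same pattern with their own extents and grid counts.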

Fig. 7. Finite difference pattern of the discretized a: density equation and b: pressure equation.


2.4. Generating sparse matrices

Table 1
Parameters used in the simulation.

Channel size: Ri, Ro, θ, Z                       100, 108, π/9, 8
Resolution                                       90 × 90 × 90
Density ratio (ρG/ρL)                            0.5
Viscosity ratio (μG/μL)                          0.5
Diameter of bubbles                              2.2
Number of bubbles                                2
Average void fraction in the whole channel (%)   0.48
Eötvös number                                    12.1
Archimedes number                                605
Reynolds number                                  2
Dean number                                      0.59
Morton number                                    Mo = 2.5e6
Non-dimensional curvature                        11.55

The numerical simulation of multi-fluid incompressible flows with large density variations (such as liquid–gas systems) presents significant challenges, particularly in the solution of the pressure Poisson equation. In common solvers, the Laplacian terms for the density and pressure are approximated by standard centered differences over the entire domain. The discrete form of the density Poisson equation is

$$\frac{\rho_{i+1,j,k}-2\rho_{i,j,k}+\rho_{i-1,j,k}}{\Delta r^{2}}+\frac{1}{r_{i,j,k}}\frac{\rho_{i+1,j,k}-\rho_{i-1,j,k}}{2\Delta r}+\frac{\rho_{i,j+1,k}-2\rho_{i,j,k}+\rho_{i,j-1,k}}{r^{2}_{i,j,k}\Delta\theta^{2}}+\frac{\rho_{i,j,k+1}-2\rho_{i,j,k}+\rho_{i,j,k-1}}{\Delta z^{2}}=\nabla_h\cdot\nabla\rho_{ijk}\tag{26}$$

Similarly, the discrete form of the pressure Poisson equation is

$$\begin{aligned}
&\frac{1}{\Delta r}\left(\frac{1}{\rho_{i+1/2,j,k}}\frac{p_{i+1,j,k}-p_{i,j,k}}{\Delta r}-\frac{1}{\rho_{i-1/2,j,k}}\frac{p_{i,j,k}-p_{i-1,j,k}}{\Delta r}\right)\\
&\quad+\frac{1}{r_{i,j,k}\Delta\theta}\left(\frac{1}{\rho_{i,j+1/2,k}}\frac{p_{i,j+1,k}-p_{i,j,k}}{r_{i,j+1/2,k}\Delta\theta}-\frac{1}{\rho_{i,j-1/2,k}}\frac{p_{i,j,k}-p_{i,j-1,k}}{r_{i,j-1/2,k}\Delta\theta}\right)\\
&\quad+\frac{1}{\Delta z}\left(\frac{1}{\rho_{i,j,k+1/2}}\frac{p_{i,j,k+1}-p_{i,j,k}}{\Delta z}-\frac{1}{\rho_{i,j,k-1/2}}\frac{p_{i,j,k}-p_{i,j,k-1}}{\Delta z}\right)=\frac{1}{\Delta t}\nabla_h\cdot u^{*}
\end{aligned}\tag{27}$$

In Fig. 7a and b, the finite difference patterns of the discretized density and pressure equations are shown, respectively. The constants of the density matrices are



$$A=-\left(\frac{2}{\Delta r^{2}}+\frac{2}{(r\Delta\theta)^{2}}+\frac{2}{\Delta z^{2}}\right),\qquad B=\frac{1}{\Delta z^{2}},\qquad C=\frac{1}{(r\Delta\theta)^{2}},$$
$$D=\frac{1}{\Delta r^{2}}-\frac{1}{2r\Delta r},\qquad E=\frac{1}{\Delta r^{2}}+\frac{1}{2r\Delta r}\tag{28}$$

and for the pressure matrices

$$A=-\left[\frac{1}{\Delta r^{2}}\left(\frac{1}{\rho_{i+1/2,j,k}}+\frac{1}{\rho_{i-1/2,j,k}}\right)+\frac{1}{(r\Delta\theta)^{2}}\left(\frac{1}{\rho_{i,j+1/2,k}}+\frac{1}{\rho_{i,j-1/2,k}}\right)+\frac{1}{\Delta z^{2}}\left(\frac{1}{\rho_{i,j,k+1/2}}+\frac{1}{\rho_{i,j,k-1/2}}\right)\right],$$
$$B=\frac{1}{\Delta z^{2}}\frac{1}{\rho_{i,j,k-1/2}},\qquad C=\frac{1}{\Delta z^{2}}\frac{1}{\rho_{i,j,k+1/2}},\qquad D=\frac{1}{(r\Delta\theta)^{2}}\frac{1}{\rho_{i,j-1/2,k}},\qquad E=\frac{1}{(r\Delta\theta)^{2}}\frac{1}{\rho_{i,j+1/2,k}},$$
$$F=\left(\frac{1}{\Delta r^{2}}-\frac{0.5}{r\Delta r}\right)\frac{1}{\rho_{i-1/2,j,k}},\qquad G=\left(\frac{1}{\Delta r^{2}}+\frac{0.5}{r\Delta r}\right)\frac{1}{\rho_{i+1/2,j,k}}\tag{29}$$

Fig. 8. Bubble rising in the duct due to buoyancy.

Fig. 9. Density contour lines, left: full domain strategy, and right: masked region chosen around the bubble at various time steps.
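As an illustration (not the authors' code), the seven-point density stencil of Eqs. (26) and (28) can be assembled in sparse form and handed to a direct solver; SciPy's SuperLU stands in here for MKL PARDISO, and the grid sizes, spacings and boundary treatment are arbitrary choices:

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import splu

# Assemble the 7-point cylindrical density stencil of Eqs. (26)/(28) on a
# small grid and factorize it with a direct sparse solver. Dirichlet
# conditions are implied at the r and z boundaries; theta is periodic.
nr, nt, nz = 8, 8, 8
dr, dth, dz = 0.1, 0.05, 0.1
r = 1.0 + dr * np.arange(nr)           # radii of the grid points

def node(i, j, k):                     # flatten (i, j, k) to a matrix row
    return (i * nt + j) * nz + k

n = nr * nt * nz
A = lil_matrix((n, n))
for i in range(nr):
    for j in range(nt):
        for k in range(nz):
            row, ri = node(i, j, k), r[i]
            A[row, row] = -(2/dr**2 + 2/(ri*dth)**2 + 2/dz**2)            # A
            if i + 1 < nr: A[row, node(i+1, j, k)] = 1/dr**2 + 1/(2*ri*dr)  # E
            if i - 1 >= 0: A[row, node(i-1, j, k)] = 1/dr**2 - 1/(2*ri*dr)  # D
            A[row, node(i, (j + 1) % nt, k)] = 1/(ri*dth)**2                # C
            A[row, node(i, (j - 1) % nt, k)] = 1/(ri*dth)**2
            if k + 1 < nz: A[row, node(i, j, k+1)] = 1/dz**2                # B
            if k - 1 >= 0: A[row, node(i, j, k-1)] = 1/dz**2

lu = splu(A.tocsc())                   # factorize once...
rho = lu.solve(np.ones(n))             # ...then back-substitute per RHS
```

Keeping the factorization and re-using it for each new right-hand side is the main saving a direct solver offers when the mask geometry does not change between time steps.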


Fig. 10. CPU time and RAM usage of the code against void fraction in three different mesh sizes. (Panels: CPU time (s) and RAM usage (GB) versus void fraction (%) at grid resolutions 60³, 90³ and 110³; curves: SOR, FGMRES and PARDISO, each under the full domain and masked strategies.)

Table 2
Normalized CPU time (%) in three different mesh sizes.

Grid resolution   SOR (full)   FGMRES (full)   PARDISO (full)   SOR (masked)   FGMRES (masked)   PARDISO (masked)
                  serial       parallel        parallel         serial         parallel          parallel
110³              100          56.4            4.3              1.62           0.92              0.182
90³               100          34.13           1.8              0.445          0.262             0.050
60³               100          35.5            1.74             0.428          0.193             0.042


Fig. 11. Density field constructed by solving the Poisson equation and using the Peskin function (90³), a: full domain strategy, b: masked bubble strategy. (Panels: masked and full variants of SOR, FGMRES and PARDISO.)

One can see that these two patterns are structurally symmetric, and storing them in sparse matrix notation reduces the memory usage considerably. After extracting the sparse matrices, it is possible to use advanced solvers such as PARDISO or FGMRES. The PARDISO and FGMRES methods used here are based on the 2013 Intel MKL library, which incorporates the 2006 version of PARDISO. Furthermore, the convergence criterion in all our computations is set as a maximum relative error of 10⁻⁶. This paper concentrates on FGMRES, which is a generalization of GMRES and allows larger flexibility in the choice of the solution subspace. FGMRES preconditions the system matrix efficiently and generates a converging solution by minimizing the Euclidean norm of the system residual at each iterative cycle [15]. In this study, an ILUT preconditioner is used to accelerate the FGMRES solution of the linear algebraic equations.

Fig. 12. The interface of bubbles in collision before and after smoothing.

3. Results and discussion

The PARDISO package is a high-performance, robust, memory-efficient, and easy-to-use software package for solving large sparse symmetric and asymmetric linear systems of equations on shared-memory multiprocessors. The parallel pivoting method balances numerical stability and scalability during the


Fig. 13. Indicator function construction for collision of two bubbles.

factorization process. For sufficiently large domains, numerical experiments indicate that the scalability of the parallel algorithm is nearly independent of the shared-memory multiprocessing architecture. Here, all the experiments are performed on a PC with an Intel(R) Core(TM) i7-3770 processor (3.5–3.9 GHz, 8 MB cache), 16 GB of registered RAM, and a 1 TB SATA hard drive.
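A minimal sketch of the preconditioned-Krylov approach from Section 3, with SciPy's `spilu` standing in for the ILUT preconditioner and `gmres` for FGMRES (the model matrix and tolerances are illustrative, not the MKL routines the authors used):

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import gmres, spilu, LinearOperator

# Preconditioned GMRES on a 1D Poisson model matrix: an incomplete-LU
# factorization plays the role of the ILUT preconditioner paired with
# FGMRES in the paper.
n = 400
A = diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(n, n), format='csc')
b = np.ones(n)

ilu = spilu(A, drop_tol=1e-4)                 # ILUT-style factorization
M = LinearOperator((n, n), matvec=ilu.solve)  # preconditioner as an operator

x, info = gmres(A, b, M=M)                    # info == 0 means converged
res = np.linalg.norm(A @ x - b) / np.linalg.norm(b)
```

With a good incomplete factorization the iteration count collapses, which is the effect the ILUT preconditioner has on FGMRES in the full-domain pressure solves.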

3.1. Capability of the masked bubble strategy

The masked bubble strategy is first tested with the Navier–Stokes solver on a typical curved duct flow problem using the front-tracking method based on the projection algorithm. To do so, the code, adapted to the PARDISO, FGMRES and SOR methods, is evaluated for two bubbles rising by buoyancy in a curved duct with the no-slip condition at the walls and the periodic boundary condition at the inlet and outlet of the duct. The parameters used in the simulations are listed in Table 1. The buoyancy force causes the bubbles to move upward while deforming as they travel in the duct, as shown in Fig. 8. Fig. 9 shows the 2D density contours of the full domain strategy (left) and the masked bubble strategy (right) at different times.

3.2. Performance comparison of the solvers

To test the code performance and CPU time, the three solvers, i.e. PARDISO, SOR and FGMRES, are employed under the two strategies, full domain and masked bubble. This test is done for three grid resolutions: 60 × 60 × 60, 90 × 90 × 90, and 110 × 110 × 110. In Fig. 10, the CPU time and RAM usage are plotted against the void fraction (the ratio of the bubble volume to the physical domain volume). One can see from Fig. 10 that the highest CPU time for solving the density Poisson equation in all cases belongs to the SOR full domain approach, and it gradually increases with increasing void fraction. FGMRES uses less CPU time than SOR, but PARDISO consumes the least computational time in the full domain strategy. In addition, the CPU times of FGMRES and PARDISO in the full domain remain almost constant with void fraction. Their RAM usage, however, shows the opposite trend: the lowest RAM usage belongs to SOR and the highest to PARDISO. Similarly, in the masked bubble strategy, SOR has the highest CPU time and PARDISO the lowest. However, in the masked bubble strategy the RAM usage of all three methods remains almost the same and is not remarkably


Fig. 14. Indicator function against R·φ for three constant radii and four time snapshots (two bubbles). (Curves compare one bigger mask with two little masks.)

affected by increasing the void fraction. In fact, an increasing void fraction is accompanied by a CPU time increase, which is largest in the full domain computation. For example, at a given void fraction (1%), the CPU time normalized by the serial full domain SOR time is shown in Table 2. The values in the table show that the PARDISO solver in the masked bubble strategy has the lowest CPU time percentage; for example, at the 110 × 110 × 110 grid resolution it uses only 0.182 percent of the serial full domain SOR time.

3.3. Effect of masked bubble strategy on the quality of grid density

It has to be emphasized that using the full domain strategy to solve the density Poisson equation may cause small over- and undershoots near the interface. This problem is solved by restricting the computational domain to the masked region. The density contours for the three solvers are plotted in Fig. 11, where the grid resolution is 90 × 90 × 90 and the density ratio is 100. It is obvious that in all three cases the masked bubble strategy captures the interface more precisely than the full domain strategy.


Fig. 15. Indicator function construction for collision of three bubbles.

This means that the masked bubble strategy not only reduces the computational time considerably, but also leads to a more accurate interface gradient prediction.

3.4. Masked bubble strategy in multi-bubble collision

It was explained in the previous section that when two or more bubbles approach or collide with each other, two different methods can be used to solve the Poisson equation for the density field. Here, both methods are compared. It was also shown schematically that to handle the sharp interfaces created in the masked bubble strategy when multiple bubbles approach or collide, filtering of the node values on the interfaces is required. In Fig. 12 the density values before and after filtering are shown. As can be seen from the figure, the sharp interfaces are smoothed by a simple filter without changing the general shape of the interface. In Fig. 13 the difference between choosing one bigger mask and using separate masks for two bubbles in collision is shown. The first column shows the azimuthal variation of the indicator function at a constant radius through the center line of the collision, and the second and third columns show the cross section of the bubbles in two views. One approach supposes a bigger mask around the bubbles, which predicts the indicator function based on the traditional full domain strategy, and the other considers separate masks. For comparison, the azimuthal variation of the indicator function at different radii is plotted in Fig. 14. As can be seen from the figure, for the two bubbles in collision, the result of the filtering procedure is almost the same as that of the traditional full domain method. To show the reliability of the method in a more complex situation, we checked it for three bubbles in collision. The corresponding results are shown in Figs. 15 and 16. Figs. 13–16 indicate that when the penetration of the bubbles into each other is not remarkable, the masked bubble strategy produces the same indicator function as the bigger mask (full domain) strategy. As mentioned before, for a highly efficient solver, the masks must be chosen such that each encloses one bubble, providing the smallest possible volume. This guarantees the lowest CPU time usage by the solver. Collisions of bubbles are not predictable, and developing a general rule to find an optimum mask for bubbles in collision is not possible. Choosing separate masks and then interpolating the interfaces is much easier than finding a bigger mask boundary and then solving the density Poisson equation inside it. This algorithm remains unchanged for different collisions of the bubbles and has lower CPU time and RAM usage than the bigger mask approach.



Fig. 16. Indicator function against R·φ for three constant radii and four time snapshots (three bubbles), comparing one bigger mask with three little masks.

4. Summary and conclusion

Three different techniques for solving the density Poisson equation in the front tracking method were tested in serial and OpenMP implementations, using two strategies: the full domain and the masked bubble. In the full domain strategy, all grid nodes of the domain take part in solving the density Poisson equation, whereas in the masked bubble strategy only the smallest box enclosing one bubble is used. Although the masked bubble strategy reduces the CPU time considerably even in a simple serial code such as the SOR, it can be made more efficient by using fast solvers such as FGMRES and PARDISO. The serial SOR on the full domain has the largest CPU time, which grows with increasing void fraction. On the other hand, the lowest CPU time belongs to PARDISO combined with the masked bubble strategy: for example, on a 110 × 110 × 110 grid, PARDISO with masked bubbles solves the density Poisson equation about 550 times faster than the serial full domain SOR. It is also found that PARDISO, a parallel direct solver, uses more RAM than the other solvers in the full domain strategy, but with the masked bubble strategy it can be sped up with very low RAM usage. In addition, collisions of two and three bubbles were tested in detail, showing that for moderately deformed interfaces the masked bubble strategy performs best and predicts density values almost identical to those of the full domain strategy.
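As a deliberately simplified illustration of the masked direct solve, the sketch below assembles the standard 7-point Laplacian on a small box and solves it with a sparse direct factorization. In the paper the factorization is done by PARDISO from Intel MKL; here SciPy's SuperLU-based `spsolve` stands in for it, and homogeneous Dirichlet values replace the boundary data that would normally be taken from the surrounding field. The function name and interface are invented for this sketch.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def poisson_in_mask(rhs, h):
    """Direct solve of the 7-point Laplacian on a masked box (interior
    unknowns only, zero Dirichlet boundaries, uniform spacing h).
    A parallel direct solver such as PARDISO plays this role in the
    paper; SciPy's spsolve is the stand-in here."""
    nx, ny, nz = rhs.shape

    def lap1d(n):  # 1D second-difference operator [1, -2, 1]
        return sp.diags([1, -2, 1], [-1, 0, 1], shape=(n, n))

    Ix, Iy, Iz = (sp.identity(n) for n in (nx, ny, nz))
    # 3D Laplacian as a sum of Kronecker products (C-order node numbering)
    A = (sp.kron(sp.kron(lap1d(nx), Iy), Iz)
         + sp.kron(sp.kron(Ix, lap1d(ny)), Iz)
         + sp.kron(sp.kron(Ix, Iy), lap1d(nz))) / h**2
    u = spla.spsolve(A.tocsc(), rhs.ravel())
    return u.reshape(rhs.shape)
```

A mask of, say, 30³ interior nodes yields a system of only 27,000 unknowns, small enough that a direct factorization is both fast and memory-light, consistent with the CPU-time and RAM figures reported for the masked PARDISO runs.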


References

[1] Kwok Y-Y, Shen J. An efficient direct parallel spectral-element solver for separable elliptic problems. J Comput Phys 2007;225:1721–35.
[2] Borrell R, Lehmkuhl O, Trias FX, Oliva A. Parallel direct Poisson solver for discretisations with one Fourier diagonalisable direction. J Comput Phys 2011;230:4723–41.
[3] Kuo-Long P, Guan-Chen Y. Parallel strategies of front-tracking method for simulation of multiphase flows. Comput Fluids 2012;67:123–9.
[4] Thies J, Wubs F. Design of a parallel hybrid direct/iterative solver for CFD problems. In: Seventh IEEE international conference on eScience; 2011.
[5] Schenk O, Gärtner K. Sparse factorization with two-level scheduling in PARDISO. In: Proceedings of the 10th SIAM conference on parallel processing for scientific computing, Portsmouth, Virginia; 2001.
[6] Unverdi SO, Tryggvason G. A front-tracking method for viscous incompressible multi-fluid flows. J Comput Phys 1992;100:25–37.
[7] Schenk O, Gärtner K. Solving unsymmetric sparse systems of linear equations with PARDISO. Future Gener Comput Syst 2004;20:475–87.


[8] Schenk O, Gärtner K. Two-level dynamic scheduling in PARDISO: improved scalability on shared memory multiprocessing systems. Parallel Comput 2002;28:187–97.
[9] DeVries B, Iannelli J, Trefftz C, O'Hearn KA, Wolffe G. Parallel implementations of FGMRES for solving large, sparse non-symmetric linear systems. Proc Comput Sci 2013;18:491–500.
[10] DeCecchis D, López H, Molina B. FGMRES preconditioning by symmetric/skew-symmetric decomposition of generalized Stokes problems. Math Comput Simul 2009;79:1862–77.
[11] Peskin CS. Numerical analysis of blood flow in the heart. J Comput Phys 1977;25:220–52.
[12] Hoffman KA, Chiang ST. Computational fluid dynamics. 4th ed., vol. 2. Wichita: EES; 2000.
[13] Nobari MR, Jan Y-J, Tryggvason G. Head-on collision of drops – a numerical investigation. Phys Fluids 1996;8:29–42.
[14] Tryggvason G, Bunner B, Esmaeeli A. A front-tracking method for the computations of multiphase flow. J Comput Phys 2001;169:708–59.
[15] DeVries B, Iannelli J, Trefftz C, O'Hearn KA, Wolffe G. Parallel implementations of FGMRES for solving large, sparse non-symmetric linear systems. Proc Comput Sci 2013;18:491–500.