Development and assessment of semi-implicit nonhydrostatic models for surface water waves

Development and assessment of semi-implicit nonhydrostatic models for surface water waves

Ocean Modelling 144 (2019) 101489 Contents lists available at ScienceDirect Ocean Modelling journal homepage: www.elsevier.com/locate/ocemod Develo...

4MB Sizes 0 Downloads 11 Views

Ocean Modelling 144 (2019) 101489

Contents lists available at ScienceDirect

Ocean Modelling journal homepage: www.elsevier.com/locate/ocemod

Development and assessment of semi-implicit nonhydrostatic models for surface water waves Congfang Ai, Yuxiang Ma ∗, Changfu Yuan, Guohai Dong State Key Laboratory of Coastal and Offshore Engineering, Dalian University of Technology, Dalian 116024, China

ARTICLE

INFO

Keywords: Nonhydrostatic model Semi-implicit Three-dimensional Wave model GPU computing

ABSTRACT This paper presents two semi-implicit nonhydrostatic models to simulate surface water waves. Both models utilize the pressure correction method to solve three-dimensional (3D) incompressible Navier–Stokes equations (NSEs), but one is a noniterative model and the other is associated with iterating to solve the Poisson equation for pressure correction terms. To meet the high computational demands, the Poisson equation is solved on the graphics processing unit (GPU) in conjunction with the conjugate gradient method. Both the accuracy in resolving the linear dispersion relation and the serially implemented efficiency of the models are assessed and discussed. Then, both models are validated with several test cases ranging from shallow to deep water. The results of the two models are very similar and show good agreement with experimental data in the shallowwater test cases. The accuracy of both models are also evaluated in deep-water nonlinear sloshing waves and 2D/3D focusing waves. In addition, the acceleration due to GPU parallelization is evaluated in a final 3D test case.

1. Introduction Over the past two decades, nonhydrostatic models have attracted increased attention because of their accuracy and efficiency in coastal and ocean applications. Nonhydrostatic models are developed based upon incompressible Navier–Stokes equations (NSEs) and are capable of resolving surface water waves (Ai et al., 2014; Ai and Jin, 2010, 2012; Ai et al., 2011; Anthonio and Hall, 2006; Bradford, 2005; Ma et al., 2012; Stelling and Zijlema, 2003; Wu et al., 2010; Young and Wu, 2010a,b; Young et al., 2007; Yuan and Wu, 2004; Zijlema and Stelling, 2005) and internal waves (Fringer et al., 2006; Lai et al., 2010; Vitousek and Fringer, 2011, 2014). In such models, the free surface is defined as a single value function of the horizontal position and thus can be determined by the so-called free-surface equation, which is obtained by integrating the continuity equation over the water depth and applying kinematic free surface and bottom conditions. Because they do not need to capture the free surface by using a large number of vertical grids, nonhydrostatic models are computationally more efficient than other NSE-based numerical models (e.g., VOF models Higuera et al., 2013 or level-set models Chen and Yu, 2009) involving complicated free-surface flows. In the development of nonhydrostatic models, one of the main issues is the precise implementation of a homogeneous Dirichlet boundary condition for the nonhydrostatic pressure on the free surface. Under a standard staggered grid framework, several approaches, such as the incorporation of an integration method at the free-surface layer

(Young and Wu, 2010a; Young et al., 2007; Yuan and Wu, 2004), implementation of an extrapolation method (Anthonio and Hall, 2006) and embedding of the Boussinesq-type-like equations with a reference velocity to introduce an analytical form of pressure distribution at the top layer (Wu et al., 2010; Young and Wu, 2010b), have been employed to impose the zero pressure boundary condition at the free surface. In addition, Stelling and Zijlema (2003) proposed a Keller– box scheme to discretize the vertical momentum equation, in which nonhydrostatic pressure is defined at vertically facing cell faces. Such a definition of the nonhydrostatic pressure allows the zero pressure boundary condition to be imposed straightforwardly at the free surface and has been incorporated in some nonhydrostatic models (Ai et al., 2014; Ai and Jin, 2010; Ai et al., 2011; Ma et al., 2012). However, the Keller–box scheme and the definition of vertical velocity used in Stelling and Zijlema (2003) complicate the discretizations of the vertical momentum equation and the Poisson equation. In the vertical discretization, nonhydrostatic models usually employ either a boundary-fitted coordinate system (Ai and Jin, 2012; Ai et al., 2011; Bradford, 2005; Ma et al., 2012; Young and Wu, 2010b; Young et al., 2007; Zijlema and Stelling, 2005) or a 𝑧-coordinate system (Ai and Jin, 2010; Anthonio and Hall, 2006; Stelling and Zijlema, 2003; Wu et al., 2010; Young and Wu, 2010a; Yuan and Wu, 2004). Each of these systems has advantages and disadvantages. The boundaryfitted coordinate system can better fit the bed and surface geometry but may introduce large errors in the discretization of the pressure

∗ Corresponding author. E-mail address: [email protected] (Y. Ma).

https://doi.org/10.1016/j.ocemod.2019.101489 Received 25 January 2019; Received in revised form 3 September 2019; Accepted 29 September 2019 Available online 5 October 2019 1463-5003/© 2019 Elsevier Ltd. All rights reserved.

C. Ai, Y. Ma, C. Yuan et al.

Ocean Modelling 144 (2019) 101489

to GPU-accelerated nonhydrostatic models (Escalante et al., 2018) are rare. The remainder of this paper is organized as follows. Section 2 presents governing equations. Numerical algorithms are described in Section 3. In Section 4, the accuracy of the developed models in resolving the linear dispersion relation and the efficiency of both serial models are assessed and discussed. In Section 5, model validations are tested. Finally, conclusions are drawn in the last section.

gradient term in simulating steep surface waves (Young et al., 2007). In contrast, the discretization of the pressure gradient term can be accurately estimated in the 𝑧-coordinate system. However, in the 𝑧coordinate system, the bed and surface geometry must be represented by stepwise discontinuity, which can cause spurious flows. Recently, Ai et al. (2014) proposed a general boundary-fitted coordinate system, which allows for a great adaptability of the vertical discretization when wave run-up and run-down along the shoreline are not involved in the computation. The satisfactory results of Ai et al. demonstrate that the nonhydrostatic model with such a grid system is capable of accurately simulating nonlinear waves with steep free surfaces. In this paper, we develop nonhydrostatic models for surface waves based on two semi-implicit algorithms, thus producing two numerical models, i.e., a noniterative model and a model associated with iterating to solve the Poisson equation for the pressure correction term. Both developed models incorporate a novel grid arrangement following Ai et al. (2011) and a general vertical boundary-fitted coordinate system following Ai et al. (2014). The grid arrangement not only ensures that the homogeneous Dirichlet boundary condition for the nonhydrostatic pressure can be precisely and easily imposed on the free surface but also renders the models relatively simple in their discretized forms. In both models, the pressure correction method is used to solve the NSEs, and the free-surface gradient is discretized by the semi-implicit method (𝜃-method). The employment of the 𝜃-method is very popular (Casulli, 1999; Casulli and Zanolli, 2002; Fringer et al., 2006; Lai et al., 2010; Rijnsdorp and Zijlema, 2016) because this method can easily handle fast surface gravity wave timescales and slow timescales of baroclinic variability. The two semi-implicit models are developed based upon the same grid arrangement and numerical discretization method, which facilitates the following comparison between the two semi-implicit algorithms. Kang and Fringer (2005) proved theoretically that semi-implicit algorithms can be second-order accurate in time for nonhydrostatic free-surface flows. Vitousek and Fringer (2013) focused on the analysis of stability and consistency of nonhydrostatic free-surface models using semi-implicit algorithms and demonstrated that semi-implicit nonhydrostatic models are unconditionally stable for linear surface waves when 0.5 ≤ 𝜃 ≤ 1. Moreover, they also investigated the accuracy of their semi-implicit nonhydrostatic models in resolving the linear dispersion relation for a small range of dimensionless water depths. However, the performance of nonhydrostatic models incorporating semi-implicit algorithms needs to be further assessed. In this study, compared to the work of Vitousek and Fringer (2013), the accuracy of both of the developed semi-implicit models in resolving the linear dispersion relation is assessed and discussed for a wider range of dimensionless water depths. The efficiency of the models is also tested. Due to the lack of comparisons between semi-implicit nonhydrostatic models in the simulation of nonlinear surface waves, capabilities of both of the developed models in predicting nonlinear surface waves are tested by using several test cases ranging from shallow to deep water. Although nonhydrostatic models are relatively computationally efficient, their parallelization is imperative. Because the NSEs are taken as the governing equations, the models are computationally expensive for three-dimensional (3D) problems. An option for parallelization is the use of modern graphics processor unit (GPU). The modern GPU consists of a set of streaming multiprocessors (SM) and each SM is designed to execute several hundred of threads concurrently. In contrast to parallelization based on multicore CPU, GPU parallelization is gaining increasing attention because of its favorable cost to performance ratio. Because solving the Poisson equation is computationally intensive and is the most time consuming part of the models, in both of the developed models, the Poisson equation solver is accelerated on a GPU architecture. Several GPU implementations of NSE-based numerical models (e.g., the VOF model Reddy and Banerjee, 2015 or level-set model Kuo et al., 2011) involving complicated free-surface flows have been developed. However, to our knowledge, papers related

2. Governing equations 3D nonhydrostatic free-surface flows are described by the incompressible NSEs, which can be expressed in the following forms by splitting the pressure into hydrostatic and nonhydrostatic components such that 𝑝 = 𝑔(𝜂 − 𝑧) + 𝑞 𝜕𝑢 𝜕𝑣 𝜕𝑤 + + =0 𝜕𝑥 𝜕𝑦 𝜕𝑧 ( 2 ) 2 𝜕𝜂 𝜕𝑞 𝜕𝑢 𝜕𝑢 𝜕𝑢𝑣 𝜕𝑢𝑤 𝜕 𝑢 𝜕2 𝑢 𝜕2 𝑢 + + + = −𝑔 − +𝜈 + + 𝜕𝑡 𝜕𝑥 𝜕𝑦 𝜕𝑧 𝜕𝑥 𝜕𝑥 𝜕𝑥2 𝜕𝑦2 𝜕𝑧2 ( 2 ) 2 𝜕𝜂 𝜕𝑞 𝜕𝑣 𝜕𝑢𝑣 𝜕𝑣 𝜕𝑣𝑤 𝜕 𝑣 𝜕2 𝑣 𝜕2 𝑣 + + + = −𝑔 − +𝜈 + + 𝜕𝑡 𝜕𝑥 𝜕𝑦 𝜕𝑧 𝜕𝑦 𝜕𝑦 𝜕𝑥2 𝜕𝑦2 𝜕𝑧2 ( 2 ) 𝜕𝑞 𝜕 𝑤 𝜕2 𝑤 𝜕2 𝑤 𝜕𝑤 𝜕𝑢𝑤 𝜕𝑣𝑤 𝜕𝑤2 + + + =− +𝜈 + + 2 2 2 𝜕𝑡 𝜕𝑥 𝜕𝑦 𝜕𝑧 𝜕𝑧 𝜕𝑥 𝜕𝑦 𝜕𝑧

(1) (2) (3) (4)

where 𝑢, 𝑣 and 𝑤 are the velocity components in the horizontal 𝑥 and 𝑦 and vertical 𝑧 directions, respectively, 𝑡 is the time, 𝑝 is the normalized pressure divided by a constant reference density, 𝜂 is the free-surface elevation, 𝑞 is the nonhydrostatic pressure component, 𝑔 is the gravitational acceleration, and 𝑣 is the kinematic viscosity. Notably, diffusion terms in Eqs. (2)–(4) are not considered in this study and are not presented in the following section. Integrating Eq. (1) from the bottom surface 𝑧 = −ℎ(𝑥, 𝑦) to the free surface 𝑧 = 𝜂(𝑥, 𝑦) and applying Leibniz’s rule with kinematic bottom condition and freesurface condition, we can obtain the following free-surface equation: 𝜂 𝜂 𝜕𝜂 𝜕 𝜕 + 𝑢𝑑𝑧 + 𝑣𝑑𝑧 = 0 𝜕𝑡 𝜕𝑥 ∫−ℎ 𝜕𝑦 ∫−ℎ

(5)

At the free surface, a homogeneous Dirichlet boundary condition for the nonhydrostatic pressure is specified, i.e., 𝑞 = 0. At the inflow, normal velocity components are specified by analytical solutions. Furthermore, tangential velocity components are set to zero. At the outflow boundary, a Sommerfeld’s radiation condition or a sponge layer technique is implemented to minimize wave reflection into the computational domain. 3. Numerical algorithms 3.1. Grid system The 3D grid system used in both of the present models is built from horizontal rectangular grids and a general vertical boundary-fitted coordinate system. In this grid system, each rectangular block defines a grid cell. By introducing the grid indexes 𝑖, 𝑗, and 𝑘 in the 𝑥, 𝑦, and 𝑧 directions, respectively, the cell center is identified by (𝑖, 𝑗, 𝑘), and the cell faces are denoted by (𝑖 ± 1∕2, 𝑗, 𝑘), (𝑖, 𝑗 ± 1∕2, 𝑘), and (𝑖, 𝑗, 𝑘 ± 1∕2). In the horizontal plane, the horizontal velocity components and free-surface elevation are defined at staggered locations. As shown in Fig. 1, the horizontal velocity components normal to each cell face are defined at the point of intersection between the face and the segment joining the centers of the two cells that share the face. The free-surface elevation is located in the center of the cell. In the vertical direction, a novel grid arrangement is employed, which renders the models relatively simple in their discretized forms and the discretized Poisson equation for nonhydrostatic pressure corrections symmetric. Following Stelling and Zijlema (2003), to apply the Dirichlet boundary condition for the nonhydrostatic pressure precisely 2

C. Ai, Y. Ma, C. Yuan et al.

Ocean Modelling 144 (2019) 101489

Fig. 1. Arrangements of variables.

The continuity equation is: 𝜕𝛥𝑧𝑘 𝜕(𝛥𝑧𝑢)𝑘 𝜕(𝛥𝑧𝑣)𝑘 + + + 𝜔𝑘+1∕2 − 𝜔𝑘−1∕2 = 0 𝜕𝑡 𝜕𝑥 𝜕𝑦

(7)

The horizontal momentum equations are: 𝜕(𝛥𝑧𝑢)𝑘 𝜕(𝛥𝑧𝑢𝑢)𝑘 𝜕(𝛥𝑧𝑢𝑣)𝑘 + + + 𝜔𝑘+1∕2 𝑢𝑘+1∕2 − 𝜔𝑘−1∕2 𝑢𝑘−1∕2 𝜕𝑡 𝜕𝑥 𝜕𝑦 𝜕𝑞 𝜕𝜂 + 𝛥𝑧𝑘 =0 +𝑔𝛥𝑧𝑘 𝜕𝑥 𝜕𝑥 𝜕(𝛥𝑧𝑣)𝑘 𝜕(𝛥𝑧𝑢𝑣)𝑘 𝜕(𝛥𝑧𝑣𝑣)𝑘 + + + 𝜔𝑘+1∕2 𝑣𝑘+1∕2 − 𝜔𝑘−1∕2 𝑣𝑘−1∕2 𝜕𝑡 𝜕𝑥 𝜕𝑦 𝜕𝜂 𝜕𝑞 +𝑔𝛥𝑧𝑘 + 𝛥𝑧𝑘 =0 𝜕𝑦 𝜕𝑦

(8)

(9)

The vertical momentum equation is: 𝜕(𝛥𝑧𝑤)𝑘 𝜕(𝛥𝑧𝑢𝑤)𝑘 𝜕(𝛥𝑧𝑣𝑤)𝑘 + + + 𝜔𝑘+1∕2 𝑤𝑘+1∕2 − 𝜔𝑘−1∕2 𝑤𝑘−1∕2 𝜕𝑡 𝜕𝑥 𝜕𝑦 (10) 𝜕𝑞 =0 +𝛥𝑧𝑘 𝜕𝑧 where 𝛥𝑧𝑘 = 𝑧𝑘+1∕2 − 𝑧𝑘−1∕2 , and 𝜔𝑘+1∕2 is the vertical velocity relative to layer level 𝑧𝑘+1∕2 . By virtue of

Fig. 2. Schematic of the general vertical boundary-fitted coordinate system.

on the free surface, the nonhydrostatic term is defined at the centers of the top and bottom faces of each grid cell (see Fig. 1). However, the definition of the vertical velocity component does not follow a standard way (Stelling and Zijlema, 2003; Wu et al., 2010; Young and Wu, 2010a; Young et al., 2007; Yuan and Wu, 2004) and is placed at the cell center following Ai et al. (2011). In addition, both models employ the general vertical boundaryfitted coordinate system (Ai et al., 2014) in the vertical direction, in which the horizontal levels 𝑧𝑘+1∕2 are given by: ⎧𝑧 + (𝑘 − 𝑘 )[𝜂(𝑥, 𝑦, 𝑡) − 𝑧 ]∕(𝑁 − 𝑘 ) 𝑓 𝑓 𝑧 𝑓 ⎪ 𝑓 𝑧𝑘+1∕2 = ⎨ 𝑧𝑓 ⎪ −ℎ(𝑥, 𝑦) + 𝑘[𝑧𝑓 + ℎ(𝑥, 𝑦)]∕𝑘𝑓 ⎩

𝑘 > 𝑘𝑓 𝑘 = 𝑘𝑓 𝑘 < 𝑘𝑓

𝜕(𝛥𝑧𝑢)𝑘 𝜕𝑢 𝜕𝛥𝑧𝑘 = 𝛥𝑧𝑘 𝑘 + 𝑢𝑘 (11) 𝜕𝑡 𝜕𝑡 𝜕𝑡 substituting Eq. (7) into Eq. (11) gives the following expression: [ ] 𝜕𝑢 𝜕(𝛥𝑧𝑢)𝑘 𝜕(𝛥𝑧𝑣)𝑘 𝜕(𝛥𝑧𝑢)𝑘 = 𝛥𝑧𝑘 𝑘 −𝑢𝑘 + −(𝑢𝑘 𝜔𝑘+1∕2 −𝑢𝑘 𝜔𝑘−1∕2 ) (12) 𝜕𝑡 𝜕𝑡 𝜕𝑥 𝜕𝑦 Therefore, substituting Eq. (12) into Eq. (8) gives: ( ) 𝜕𝑢𝑘 𝜕𝜂 𝜕𝑞 + 𝐴𝑑𝑣 𝑢𝑘 = −𝑔 − (13) 𝜕𝑡 𝜕𝑥 𝜕𝑥 ( ) where 𝐴𝑑𝑣 𝑢𝑘 represents the advection term for 𝑢𝑘 and can be expressed as: [ ] [ ] ( ) 𝜕(𝛥𝑧𝑢𝑢)𝑘 𝜕(𝛥𝑧𝑢𝑣)𝑘 𝜕(𝛥𝑧𝑢)𝑘 𝜕(𝛥𝑧𝑣)𝑘 𝑢 1 + − 𝑘 + 𝐴𝑑𝑣 𝑢𝑘 = 𝛥𝑧𝑘 𝜕𝑥 𝜕𝑦 𝛥𝑧𝑘 𝜕𝑥 𝜕𝑦 𝜔𝑘+1∕2 𝜔𝑘−1∕2 + (𝑢𝑘+1∕2 − 𝑢𝑘 ) − (𝑢𝑘−1∕2 − 𝑢𝑘 ) 𝛥𝑧𝑘 𝛥𝑧𝑘

(6)

where 𝑧𝑓 = 𝑧𝑘𝑓 +1∕2 is a fixed level in the given layer 𝑘𝑓 , 𝑁𝑧 is the number of vertical layers, and h is the water depth. Eq. (6) shows that the vertical computational domain is divided into an upper and a lower region by the fixed level 𝑧𝑓 . In each region, the layer thickness is distributed uniformly. The horizontal levels in the upper region move with time, while these levels are fixed in the lower region. Fig. 2 shows a schematic of the general vertical boundary-fitted coordinate system. Notably, 𝑧𝑓 can be defined as a horizontal level or a function of the horizontal position. When 𝑧𝑓 = −ℎ(𝑥, 𝑦) and 𝑘𝑓 = 0, the general boundary-fitted coordinate system defined by Eq. (6) is identical to the traditional boundary-fitted coordinate system used in Ai et al. (2011). Following Ai et al. (2011) and Zijlema and Stelling (2005), the governing Eqs. (1)–(4) are first integrated over the vertical layer 𝑘, which is bounded by the levels 𝑧𝑘−1∕2 and 𝑧𝑘+1∕2 . This gives the following equations. Details about the integration procedures are given in Appendix A.

(14) Similarly, Eqs. (9) and (10) can also be written as follows: ( ) 𝜕𝑣𝑘 𝜕𝜂 𝜕𝑞 + 𝐴𝑑𝑣 𝑣𝑘 = −𝑔 − (15) 𝜕𝑡 𝜕𝑦 𝜕𝑦 ( ) 𝜕𝑤𝑘 𝜕𝑞 + 𝐴𝑑𝑣 𝑤𝑘 = − (16) 𝜕𝑡 𝜕𝑧 ( ) ( ) where 𝐴𝑑𝑣 𝑣𝑘 and 𝐴𝑑𝑣 𝑤𝑘 represent the advection terms for 𝑣𝑘 and 𝑤𝑘 , respectively, and are expressed as: [ ] [ ] ( ) 𝜕(𝛥𝑧𝑢𝑣)𝑘 𝜕(𝛥𝑧𝑣𝑣)𝑘 𝑣 𝜕(𝛥𝑧𝑢)𝑘 𝜕(𝛥𝑧𝑣)𝑘 1 𝐴𝑑𝑣 𝑣𝑘 = + − 𝑘 + 𝛥𝑧𝑘 𝜕𝑥 𝜕𝑦 𝛥𝑧𝑘 𝜕𝑥 𝜕𝑦 𝜔𝑘+1∕2 𝜔𝑘−1∕2 + (𝑣𝑘+1∕2 − 𝑣𝑘 ) − (𝑣𝑘−1∕2 − 𝑣𝑘 ) 𝛥𝑧𝑘 𝛥𝑧𝑘 (17) 3

C. Ai, Y. Ma, C. Yuan et al.

Ocean Modelling 144 (2019) 101489

[

]

( ) 𝜕(𝛥𝑧𝑢𝑤)𝑘 𝜕(𝛥𝑧𝑣𝑤)𝑘 𝑤 1 − 𝑘 𝐴𝑑𝑣 𝑤𝑘 = + 𝛥𝑧𝑘 𝜕𝑥 𝜕𝑦 𝛥𝑧𝑘 𝜔𝑘+1∕2 𝜔𝑘−1∕2 + (𝑤𝑘+1∕2 − 𝑤𝑘 ) − (𝑤𝑘−1∕2 − 𝑤𝑘 ) 𝛥𝑧𝑘 𝛥𝑧𝑘

[

𝜕(𝛥𝑧𝑢)𝑘 𝜕(𝛥𝑧𝑣)𝑘 + 𝜕𝑥 𝜕𝑦

]

For 𝑘 = 1, the continuity Eq. (1) is discretized in a half bottom layer, which gives the following discrete equation by setting 𝑤𝑛+𝜃 = 0: 𝑖,𝑗,1∕2 𝑢𝑛+𝜃 − 𝑢𝑛+𝜃 𝑖+1∕2,𝑗,1 𝑖−1∕2,𝑗,1

(18)

𝛥𝑥

where 𝑤𝑘−1∕2 = (𝑤𝑘−1 + 𝑤𝑘 )∕2 when 𝑘 < 𝑁𝑧 . For 𝑤𝑁𝑧 +1∕2 which is defined at the free surface, it is calculated by solving the continuity Eq. (1) in a half free surface layer.

The pressure correction method consisting of two major steps is employed to solve Eqs. (1), (13), (15) and (16). In the first step, 𝑛+1∕2 𝑛+1∕2 𝑛+1∕2 the intermediate velocities 𝑢𝑖+1∕2,𝑗,𝑘 , 𝑣𝑖,𝑗+1∕2,𝑘 and 𝑤𝑖,𝑗,𝑘 are obtained by solving the momentum Eqs. (13), (15) and (16), which contain the nonhydrostatic pressure at the preceding time level. To obtain models that are independent of free-surface wave speed, the horizontal gradient terms of the water surface in Eqs. (13) and (15) are discretized by a semi-implicit method. − 𝑢𝑛𝑖+1∕2,𝑗,𝑘 𝛥𝑡 − 𝑣𝑛𝑖,𝑗+1∕2,𝑘 𝛥𝑡

( ) + 𝐴𝑑𝑣 𝑣𝑛𝑖,𝑗+1∕2,𝑘 = −𝑔

𝜕𝜂 𝜕𝑦

)𝑛+𝜃

( −

𝑖,𝑗+1∕2

𝜕𝑞 𝜕𝑦

)𝑛 𝑖,𝑗+1∕2,𝑘

(20) 𝑛+1∕2 𝑤𝑖,𝑗,𝑘

− 𝑤𝑛𝑖,𝑗,𝑘

𝛥𝑡

( ) + 𝐴𝑑𝑣 𝑤𝑛𝑖,𝑗,𝑘 = −

(

𝜕𝑞 𝜕𝑧

)𝑛 (21)

𝑛+1 𝑛 − 𝜂𝑖,𝑗 𝜂𝑖,𝑗

𝛥𝑡

the reader can refer to Ai et al. (2011). In Eqs. (19)–(21), the gradient terms of the water surface and nonhydrostatic pressure are discretized by means of the central differencing scheme. and 𝑤𝑛+1 , 𝑣𝑛+1 In the second step, the new velocities 𝑢𝑛+1 𝑖,𝑗,𝑘 𝑖,𝑗+1∕2,𝑘 𝑖+1∕2,𝑗,𝑘



(

𝛥𝑡

=− (

𝑛+1∕2

𝑣𝑛+1 − 𝑣𝑖,𝑗+1∕2,𝑘 𝑖,𝑗+1∕2,𝑘 𝛥𝑡

=− (

𝑛+1∕2

𝑤𝑛+1 − 𝑤𝑖,𝑗,𝑘 𝑖,𝑗,𝑘 𝛥𝑡

=−

𝜕𝛥𝑞 𝜕𝑥 𝜕𝛥𝑞 𝜕𝑦 𝜕𝛥𝑞 𝜕𝑧

(22) 𝑖+1∕2,𝑗,𝑘

) (23) 𝑖,𝑗+1∕2,𝑘

)𝑛 (24) 𝑖,𝑗,𝑘

where 𝛥𝑞 = 𝑞 𝑛+1 − 𝑞 𝑛 is the nonhydrostatic pressure correction term and is computed from the Poisson equation, which is obtained by the combination of the discrete continuity and momentum equations. The continuity Eq. (1) is also discretized in semi-implicit finite difference form and this gives: (𝑢𝑛+𝜃 + 𝑢𝑛+𝜃 ) − (𝑢𝑛+𝜃 + 𝑢𝑛+𝜃 ) 𝑖+1∕2,𝑗,𝑘 𝑖+1∕2,𝑗,𝑘−1 𝑖−1∕2,𝑗,𝑘 𝑖−1∕2,𝑗,𝑘−1 +

2𝛥𝑦

+

𝑤𝑛+𝜃 − 𝑤𝑛+𝜃 𝑖,𝑗,𝑘 𝑖,𝑗,𝑘−1 𝛥𝑧𝑖,𝑗,𝑘−1∕2

=0

(26)

(27)

𝛥𝑡𝜃 −𝑔 𝛥𝑥

𝛥𝑧𝑛𝑖−1∕2,𝑗,𝑘

[𝑁 𝑧 ∑

( 𝛥𝑧𝑛𝑖+1∕2,𝑗,𝑘

𝑘=1

(

𝜕𝜂 𝜕𝑥

)𝑛+𝜃

]

𝜕𝜂 𝜕𝑥

)𝑛+𝜃 𝑖+1∕2,𝑗

𝑖−1∕2,𝑗

[𝑁 ( )𝑛+𝜃 ] ( )𝑛+𝜃 𝑁𝑧 𝑧 ∑ 𝜕𝜂 𝜕𝜂 𝛥𝑡𝜃 ∑ 𝑛 𝑛 −𝑔 − 𝛥𝑧 𝛥𝑧 𝛥𝑦 𝑘=1 𝑖,𝑗+1∕2,𝑘 𝜕𝑦 𝑖,𝑗+1∕2 𝑘=1 𝑖,𝑗−1∕2,𝑘 𝜕𝑦 𝑖,𝑗−1∕2 (𝑁 ) 𝑁𝑧 𝑧 ∑ 1 ∑ 𝑛 =− 𝛥𝑧𝑖+1∕2,𝑗,𝑘 𝑢𝑛𝑖+1∕2,𝑗,𝑘 − 𝛥𝑧𝑛𝑖−1∕2,𝑗,𝑘 𝑢𝑛𝑖−1∕2,𝑗,𝑘 𝛥𝑥 𝑘=1 𝑘=1 (𝑁 ) 𝑁𝑧 𝑧 ∑ 1 ∑ 𝑛 𝛥𝑧𝑖,𝑗+1∕2,𝑘 𝑣𝑛𝑖,𝑗+1∕2,𝑘 − 𝛥𝑧𝑛𝑖,𝑗−1∕2,𝑘 𝑣𝑛𝑖,𝑗−1∕2,𝑘 − 𝛥𝑦 𝑘=1 𝑘=1 ] [𝑁 𝑁𝑧 𝑧 ( ) ∑ ( ) ∑ 𝛥𝑡𝜃 𝑛 𝑛 𝑛 𝑛 𝛥𝑧 𝐴𝑑𝑣 𝑢𝑖+1∕2,𝑗,𝑘 − 𝛥𝑧𝑖−1∕2,𝑗,𝑘 𝐴𝑑𝑣 𝑢𝑖−1∕2,𝑗,𝑘 + 𝛥𝑥 𝑘=1 𝑖+1∕2,𝑗,𝑘 𝑘=1 [𝑁 ] 𝑁𝑧 𝑧 ( ) ∑ ( ) 𝛥𝑡𝜃 ∑ 𝑛 + 𝛥𝑧𝑖,𝑗+1∕2,𝑘 𝐴𝑑𝑣 𝑣𝑛𝑖,𝑗+1∕2,𝑘 − 𝛥𝑧𝑛𝑖,𝑗−1∕2,𝑘 𝐴𝑑𝑣 𝑣𝑛𝑖,𝑗−1∕2,𝑘 𝛥𝑦 𝑘=1 𝑘=1 [𝑁 ] ( )𝑛 ( )𝑛 𝑁𝑧 𝑧 ∑ 𝜕𝑞 𝜕𝑞 𝛥𝑡𝜃 ∑ 𝑛 𝛥𝑧𝑖+1∕2,𝑗,𝑘 − 𝛥𝑧𝑛𝑖−1∕2,𝑗,𝑘 + 𝛥𝑥 𝑘=1 𝜕𝑥 𝑖+1∕2,𝑗,𝑘 𝑘=1 𝜕𝑥 𝑖−1∕2,𝑗,𝑘 [𝑁 ] ( ) ( )𝑛 𝑁 𝑛 𝑧 𝑧 ∑ 𝜕𝑞 𝜕𝑞 𝛥𝑡𝜃 ∑ 𝑛 𝑛 𝛥𝑧 − 𝛥𝑧 + 𝛥𝑦 𝑘=1 𝑖,𝑗+1∕2,𝑘 𝜕𝑦 𝑖,𝑗+1∕2,𝑘 𝑘=1 𝑖,𝑗−1∕2,𝑘 𝜕𝑦 𝑖,𝑗−1∕2,𝑘 [𝑁 ] ( ) ( ) 𝑁𝑧 𝑧 ∑ 𝜕𝛥𝑞 𝜕𝛥𝑞 𝛥𝑡𝜃 ∑ 𝑛 +𝛽 𝛥𝑧𝑖+1∕2,𝑗,𝑘 − 𝛥𝑧𝑛𝑖−1∕2,𝑗,𝑘 𝛥𝑥 𝑘=1 𝜕𝑥 𝑖+1∕2,𝑗,𝑘 𝑘=1 𝜕𝑥 𝑖−1∕2,𝑗,𝑘 [𝑁 ] ( ) ( ) 𝑁𝑧 𝑧 ∑ 𝜕𝛥𝑞 𝜕𝛥𝑞 𝛥𝑡𝜃 ∑ 𝑛 𝑛 +𝛽 𝛥𝑧 − 𝛥𝑧 𝛥𝑦 𝑘=1 𝑖,𝑗+1∕2,𝑘 𝜕𝑦 𝑖,𝑗+1∕2,𝑘 𝑘=1 𝑖,𝑗−1∕2,𝑘 𝜕𝑦 𝑖,𝑗−1∕2,𝑘

)

2𝛥𝑥 𝑛+𝜃 (𝑣𝑖,𝑗+1∕2,𝑘 + 𝑣𝑛+𝜃 ) − (𝑣𝑛+𝜃 + 𝑣𝑛+𝜃 ) 𝑖,𝑗+1∕2,𝑘−1 𝑖,𝑗−1∕2,𝑘 𝑖,𝑗−1∕2,𝑘−1

𝑁𝑧 ∑ 𝑘=1

are obtained by correcting the intermediate values after including the nonhydrostatic pressure correction term. This gives: 𝑛+1∕2

𝛥𝑧𝑖,𝑗,1∕2

By substituting Eqs. (22) and (23) and Eqs. (19) and (20) into Eq. (28), we can obtain the following expression:

𝑖,𝑗,𝑘

where 𝜂 𝑛+𝜃 = (1 − 𝜃)𝜂 𝑛 + 𝜃𝜂 𝑛+1 , and 𝜃 is an implicitness factor and has to be chosen in the range 0.5 ≤ 𝜃 ≤ 1 for stability. In all the simulations in the present( paper, ) 𝜃 = 0.5( is used.) For details ( about ) the discretizations of 𝐴𝑑𝑣 𝑢𝑛𝑖+1∕2,𝑗,𝑘 , 𝐴𝑑𝑣 𝑣𝑛𝑖,𝑗+1∕2,𝑘 and 𝐴𝑑𝑣 𝑤𝑛𝑖,𝑗,𝑘 ,

𝑢𝑛+1 − 𝑢𝑖+1∕2,𝑗,𝑘 𝑖+1∕2,𝑗,𝑘

𝑤𝑛+𝜃 𝑖,𝑗,1

where 𝐀 is a sparse coefficient matrix, 𝛥𝐪 is a vector of the nonhydrostatic pressure correction term, and 𝐛 is a known vector related to explicit and intermediate velocities. The coefficients of Eq. (27) are provided in Appendix B. A semi-implicit finite difference approximation for Eq. (5) is taken to be: (𝑁 ) 𝑛+1 𝑛 𝑁𝑧 𝑧 𝜂𝑖,𝑗 − 𝜂𝑖,𝑗 ∑ 1 ∑ 𝑛 𝑛+𝜃 𝑛 𝑛+𝜃 + 𝛥𝑧 𝑢 − 𝛥𝑧 𝑢 𝛥𝑡 𝛥𝑥 𝑘=1 𝑖+1∕2,𝑗,𝑘 𝑖+1∕2,𝑗,𝑘 𝑘=1 𝑖−1∕2,𝑗,𝑘 𝑖−1∕2,𝑗,𝑘 ) (𝑁 (28) 𝑁𝑧 𝑧 ∑ 1 ∑ 𝑛 𝑛 𝑛+𝜃 = 0 𝛥𝑧 𝑣 𝛥𝑧𝑖,𝑗+1∕2,𝑘 𝑣𝑛+𝜃 − + 𝑖,𝑗−1∕2,𝑘 𝑖,𝑗−1∕2,𝑘 𝑖,𝑗+1∕2,𝑘 𝛥𝑦 𝑘=1 𝑘=1

( )𝑛+𝜃 ( )𝑛 ( ) 𝜕𝑞 𝜕𝜂 + 𝐴𝑑𝑣 𝑢𝑛𝑖+1∕2,𝑗,𝑘 = −𝑔 − 𝜕𝑥 𝑖+1∕2,𝑗 𝜕𝑥 𝑖+1∕2,𝑗,𝑘 (

𝛥𝑦

+

𝐀𝛥𝐪 = 𝐛

(19) 𝑛+1∕2 𝑣𝑖,𝑗+1∕2,𝑘

𝑣𝑛+𝜃 − 𝑣𝑛+𝜃 𝑖,𝑗+1∕2,1 𝑖,𝑗−1∕2,1

where 𝛥𝑧𝑖,𝑗,1∕2 = 𝛥𝑧𝑖,𝑗,1 ∕2. There is no need to discretize the continuity equation in a half free surface layer, because the following Poisson equation derived from Eqs. (25) and (26) is enough to determine all the unknown variables. In addition, 𝑤𝑛+𝜃 should be determined by the kinematic bottom 𝑖,𝑗,1∕2 condition for an uneven bottom. However, this has little effects on the results of test cases involving uneven bottom in this study. By substituting the expressions for the new velocities from Eqs. (22)–(24) into Eqs. (25) and (26), respectively, and considering the zero pressure condition on the free surface 𝛥𝑞𝑁𝑧 +1∕2 = 0, the Poisson equation for the nonhydrostatic pressure correction term is derived and can be written in matrix form as:

3.2. Numerical discretization

𝑛+1∕2 𝑢𝑖+1∕2,𝑗,𝑘

+

=0 (25)

where 𝑘 = 2, … , 𝑁𝑧 , and 𝛥𝑧𝑖,𝑗,𝑘−1∕2 = (𝛥𝑧𝑖,𝑗,𝑘−1 + 𝛥𝑧𝑖,𝑗,𝑘 )∕2.

(29) 4

C. Ai, Y. Ma, C. Yuan et al.

Ocean Modelling 144 (2019) 101489

Notably, a factor 𝛽 = 0 or 1 is introduced in the above equation. For consistency with Eq. (28), Eq. (7) is also discretized by the semi-implicit method to calculate 𝜔𝑘+1∕2 . This gives: 𝛥𝑧𝑛+1 − 𝛥𝑧𝑛𝑖,𝑗,𝑘 𝑖,𝑗,𝑘

1 + (𝛥𝑧𝑛𝑖+1∕2,𝑗,𝑘 𝑢𝑛+𝜃 − 𝛥𝑧𝑛𝑖−1∕2,𝑗,𝑘 𝑢𝑛+𝜃 ) 𝑖+1∕2,𝑗,𝑘 𝑖−1∕2,𝑗,𝑘 𝛥𝑡 𝛥𝑥 1 𝑛 𝑛+𝜃 𝑛 𝑛+𝜃 𝑛+𝜃 + (𝛥𝑧𝑖,𝑗+1∕2,𝑘 𝑣𝑖,𝑗+1∕2,𝑘 − 𝛥𝑧𝑖,𝑗−1∕2,𝑘 𝑣𝑖,𝑗−1∕2,𝑘 ) + 𝜔𝑖,𝑗,𝑘+1∕2 − 𝜔𝑛+𝜃 =0 𝑖,𝑗,𝑘−1∕2 𝛥𝑦 (30) 3.3. Solution procedure The overall solution procedure is summarized as follows: 𝑛+1 (1) Calculate the free surface 𝜂𝑖,𝑗 by solving Eq. (29), which neglects the nonhydrostatic pressure correction 𝛥𝑞. Eq. (29) constitutes a strongly diagonally dominant, symmetric, and positive definite system, which is solved by the conjugate gradient method. 𝑛+1∕2 𝑛+1∕2 (2) Calculate the intermediate velocities 𝑢𝑖+1∕2,𝑗,𝑘 , 𝑣𝑖,𝑗+1∕2,𝑘 , and 𝑛+1∕2

𝑤𝑖,𝑗,𝑘

Fig. 3. The normalized wave celerity obtained with two vertical layers versus 𝐾ℎ.

4. Accuracy and efficiency

by using Eqs. (19)–(21).

(3) Calculate the nonhydrostatic pressure correction 𝛥𝑞 by using Eq. (27), which is solved on GPU in conjunction with the conjugate gradient method. Details about the GPU implementation of Eq. (27) are given in Section 3.4. (4) Update the intermediate velocities by using Eqs. (22)–(24) to obtain the new velocities 𝑢𝑛+1 , 𝑣𝑛+1 , and 𝑤𝑛+1 . 𝑖,𝑗,𝑘 𝑖+1∕2,𝑗,𝑘 𝑖,𝑗+1∕2,𝑘

In this section, we focus on the accuracy of both models in resolving linear dispersion relations and the efficiency of both serial models by conducting a series of numerical simulations of linear standing waves. In all the simulations, the layer thickness is uniformly distributed over the water depth. We consider linear standing waves in a closed basin of constant depth. The basin has a length of 𝐿 with a constant depth of ℎ = 10 m. The initial free-surface displacement is given as: ( ) 2𝜋 𝜂(𝑥, 𝑡 = 0) = 𝑎 cos(𝑘𝑥) cos 𝑡 , 0≤𝑥≤𝐿 (31) 𝑇 where 𝑎 and 𝑇 are the amplitude and wave period of the standing wave, and 𝐾 = 2𝜋∕𝐿 is the wavenumber. Here, the wavelength equals the length of the basin. To generate various 𝐾ℎ values ranging from shallow-water to intermediate-water to deep-water waves, we change the length of the basin but maintain the linear wave condition by adjusting the wave amplitude. The horizontal grid size and the time step are firstly set to 𝛥𝑥 = 𝐿∕20 and 𝛥𝑡 = 𝑇 ∕40, respectively. The normalized wave celerity obtained with two vertical layers √ versus 𝐾ℎ is depicted in Fig. 3, in which 𝑐 and 𝑐0 = 𝑔ℎ are the wave speed and long wave celerity, respectively. The computed wave celerity is the average value estimated over ten wave periods. In the investigated range of 𝐾ℎ ≤ 5, the iterative model accurately predicts the wave celerity, but the noniterative model results gradually deviate from the analytical solution with increases in 𝐾ℎ. This implies that with the same number of vertical layers, the accuracy of the noniterative model may be not as good as that of the iterative model for higher dispersive waves in deep water depths if both models are implemented in the same horizontal and vertical spatial resolution and relatively coarse temporal resolution. This can be further demonstrated by Fig. 4. To present the effect of the number of vertical layers on both of the models’ accuracy, Fig. 4 displays normalized percentage errors versus a wide range of 𝐾ℎ values. With increasing vertical layers, the iterative model can accurately predict linear dispersive waves up to larger 𝐾ℎ values. We can define a normalized percentage error as 𝜀 = |𝑐linear − 𝑐comp |∕𝑐linear , where 𝑐linear is the phase speed of linear waves, and 𝑐comp is the average wave speed estimated over ten wave periods. For a given tolerance error of 𝜀 = 1%, the iterative model with two, three and five layers is accurate up to 𝐾ℎ = 7, 15 and 40, respectively. Accordingly, the noniterative model is only accurate up to 𝐾ℎ = 3. Notably, Vitousek and Fringer (2013) also investigated the accuracy of the two semi-implicit algorithms in resolving linear dispersion relations by using a similar test case. Both semi-implicit algorithms showed almost identical results in the investigated water depth range of 𝐾ℎ ≤ 4. However, in the study of Vitousek and Fringer (2013),

(5) Calculate the vertical velocity relative to layer level 𝜔𝑛+1 𝑖,𝑗,𝑘+1∕2 by using Eq. (30). Note that considering the kinematic bottom and = 0. = 𝜔𝑛+1 free-surface conditions, 𝜔𝑛+1 𝑖,𝑗,𝑁 +1∕2 1∕2 𝑧

The above procedure corresponds to 𝛽 = 0 in Eq. (29), which results in the noniterative semi-implicit model. When 𝛽 = 1, an iterative procedure is required by repeating steps 1 to 3 until convergence is reached. Clearly, the noniterative model may be computationally more efficient than the iterative model. As demonstrated theoretically by Kang and Fringer (2005), both models are second-order accurate in time for nonhydrostatic free-surface flows. It is worth mentioning that if the pressure projection method is employed, 𝛽 = 0 results in the first-order temporal accuracy and 𝛽 = 1 still enable the model to be second-order accurate in time (Vitousek and Fringer, 2013). 3.4. GPU implementation of the Poisson equation The sparse matrix 𝑨 in Eq. (27) contains 10 nonzero diagonals in the bottom layer and 15 nonzero diagonals in the other layers. Moreover, this matrix is symmetric, as presented in Appendix B. In this study, the conjugate gradient method is used to solve Eq. (27). Because solving the Poisson equation is computationally intensive and is the most time consuming part of both solvers, the solution of Eq. (27) is outsourced to the GPU via a function called kernel, which implements the conjugate gradient method. In the CUDA programming environment, a kernel is called by a host (CPU) and is executed on a device (GPU). Both the CPU and GPU have their own memory. Data must be transferred between CPU memory and GPU memory. However, to achieve high performance, data transfer should be minimized to reduce memory latency. In this study, before the kernel is called from the host, the vector 𝒃 and the nonzero coefficients of the matrix 𝑨 are transferred from CPU memory to GPU memory. Then, after the solution is obtained on GPU, it is copied from GPU memory to CPU memory. The computationally intensive algorithmic components of the conjugate gradient method mainly include vector addition and vector dot product. The implementations of these components on GPU are described in Sanders and Kandrot (2010). As presented in the above section, the Poisson equation may be solved many times at each time step for the iterative model. Normally, the matrix 𝑨 and the vector 𝒃 are copied from CPU memory to GPU memory every time before solving the Poisson equation. However, because the coefficients of matrix 𝑨 presented in Appendix B are constant within each time step, to reduce data transfer latency, they are transferred only once before first solving the Poisson equation. 5

C. Ai, Y. Ma, C. Yuan et al.

Ocean Modelling 144 (2019) 101489

Fig. 4. The normalized percentage errors obtained with different vertical layers versus 𝐾ℎ. Table 1 The normalized percentage errors obtained with different combinations of the horizontal grid size and the time step for 𝐾ℎ = 10. Iterative model

𝛥𝑥 = 𝐿∕20, 𝛥𝑥 = 𝐿∕20, 𝛥𝑥 = 𝐿∕20, 𝛥𝑥 = 𝐿∕40,

𝛥𝑡 = 𝑇 ∕40 𝛥𝑡 = 𝑇 ∕80 𝛥𝑡 = 𝑇 ∕160 𝛥𝑡 = 𝑇 ∕160

Noniterative model

Two layers

Three layers

Two layers

Three layers

8.63% 4.76% 3.74% 3.74%

5.51% 1.64% 0.55% 0.55%

3.74% 3.49% 3.49% 3.42%

0.55% 0.28% 0.21% 0.14%

the computed wave speed was only estimated over one wave period, and only a small range of dimensionless water depth was considered. Accordingly, the accuracy of both present models was assessed severely in this paper because the computed wave speed was estimated over ten wave periods, and a wider range of dimensionless water depth was considered. To show the influences of both the time step and the horizontal grid size on the two models, Table 1 presents the normalized percentage errors obtained with different combinations of the time step and the horizontal grid size for 𝐾ℎ = 10. Numerical results with only two and three layers are given, because the results with more vertical layers are very similar to those with three layers. For the noniterative model, by setting 𝛥𝑥 = 𝐿∕20, its accuracy is improved significantly by decreasing the time step. For the smallest time step of 𝛥𝑡 = 𝑇 ∕160, the accuracy of the noniterative model is close to that of the iterative model. However, by setting 𝛥𝑡 = 𝑇 ∕160, refining the horizontal grid size almost has no effect on the accuracy of the noniterative model. For the iterative model, both the time step and the horizontal grid size has little effect on its accuracy. The results presented in Fig. 4 and Table 1 imply that both developed models can accurately predict higher dispersive waves. With the same time step which is not sufficiently small, the iterative model is more accurate than the noniterative model. The accuracy of the iterative model can be improved significantly by refining the vertical layers, while the accuracy of the noniterative model can be increased by decreasing the time step. To better compare both semi-implicit models in resolving the linear dispersion relationship in terms of the spatial and temporal resolution, it is necessary to derive the numerical dispersion relationship following Vitousek and Fringer (2013). However, this paper only focuses on the development and numerical assessment of the two semi-implicit models. To show the serial implemented efficiency of both models in response to changes in the horizontal and vertical resolutions, we also consider a linear standing wave with 𝑎 = 0.01 m, ℎ = 10 m and 𝐾ℎ = 2.5. We define a reference case, in which the horizontal grid size of 𝛥𝑥 = 𝐿∕20 and two uniform vertical layers are used. The same time step is employed for the two models. Fig. 5(a) depicts the nondimensional CPU time versus the horizontal resolution-increasing factor, and Fig. 5(b) shows the comparison of the

Fig. 5. CPU time versus the horizontal resolution-increasing factor 𝑓ℎ : (a) the normalized CPU time for both models; and (b) the CPU time of the iterative model relative to that of the noniterative model.

computational expense between the two models. The CPU time shown in Fig. 5(a) is nondimensionalized with respect to that of the reference case, while the relative CPU time shown in Fig. 5(b) is the ratio of the computational expense of the iterative model to that of the noniterative model. The horizontal resolution-increasing factor 𝑓ℎ is defined as the ratio of the number of horizontal grid cells to the number of horizontal grid cells employed in the reference case. Fig. 5(a) shows that for both models, the CPU time increases with increasing horizontal resolution, and it increases dramatically when 𝑓ℎ ≥ 6. As expected, Fig. 5(b) shows that the iterative model is more time consuming than the noniterative model. The maximum relative CPU time reaches approximately 2.5, while the minimum value is only approximately 1.1. Fig. 6(a) and (b) show the nondimensional CPU time and the relative CPU time versus the vertical resolution-increasing factor, respectively. The vertical resolution-increasing factor 𝑓𝑣 is defined as the ratio of the number of vertical layers to the number of vertical layers employed in the reference case. Fig. 6 shows that for both models, the CPU time increases moderately with the increase in the vertical resolution, and the iterative model is also more time consuming than the noniterative model. The minimum relative CPU time is 1.1, and the maximum value only reaches approximately 1.4. Overall, the relative CPU time ranges from 1.1 to 2.5, and all of the relative CPU time costs in the following test cases are within this range. By comparing Figs. 5(a) and 6(a), it can also be concluded that, for both models, the computational efficiency is more sensitive to the horizontal resolution than to the vertical resolution. 6

C. Ai, Y. Ma, C. Yuan et al.

Ocean Modelling 144 (2019) 101489

0.1 m at the submerged bar. The upward and downward slopes of the bar are 1:20 and 1:10, respectively. Fig. 7 shows the numerical model setup and free-surface elevation measurement stations. The present study focuses on two test cases, namely, Case A and Case C, which involve sinusoidal incident waves with a wave height of 𝐻0 = 2.0 cm and 𝐻0 = 4.1 cm, respectively, and a wave period of 𝑇 = 2.02 s and 𝑇 = 1.01 s, respectively. Experimental data obtained by Luth et al. (1994) are used for comparison with the numerical results. Case A with 𝐾ℎ = 0.67 has been used to verify a number of nonhydrostatic models (Bradford, 2005; Casulli, 1999; Ma et al., 2012; Stelling and Zijlema, 2003; Wu et al., 2010; Yuan and Wu, 2004). However, Case C with 𝐾ℎ = 1.69 has a relatively short wavelength and, to our knowledge, only comparison results obtained by the nonhydrostatic model developed by Stelling and Zijlema (2003) have been published for this case. In this computation, for Cases A and C, the horizontal grid spacing values are set to 𝛥𝑥 = 0.025 m and 0.0125 m, respectively, and the time steps are 𝛥𝑡 = 0.01 s and 0.005 s, respectively. In both cases, the vertical boundary-fitted coordinate system is constructed by using two vertical layers with 𝑧𝑓 = −ℎ(𝑥, 𝑦) and 𝑘𝑓 = 0. Figs. 8 and 9 show comparisons of the free-surface elevation between the two sets of numerical results and the experimental data for Case A and C. Both models predict well wave propagation over the bar, which involves the generation of bound higher harmonics via the increase in nonlinearity at the shoaling stage and the transformation of wave patterns due to the release of higher harmonics at the deshoaling stage. There are almost no differences between the two model results. Both models with only two vertical layers are in good agreement with the measurements, indicating that both developed models can accurately predict 2D nearshore wave propagation, in which wave shoaling, nonlinearity and dispersion occur. 5.2. Wave transformation over a circular shoal This example examines the capabilities of both models to predict 3D nearshore wave propagation. The numerical results are compared with experimental data from Chawla (1995). The experimental setup is shown in Fig. 10, in which a circular shoal is placed on a flat bottom. The boundary of the shoal is given by:

Fig. 6. CPU time versus the vertical resolution-increasing factor 𝑓𝑣 : (a) the normalized CPU time for both models; and (b) the CPU time of the iterative model relative to that of the noniterative model.

5. Test cases Both developed semi-implicit models are further assessed with five test cases concerning nonlinear surface waves for which either analytical solutions or existing experimental data are available. The first two examples focus on nearshore wave propagation, in which wave shoaling, nonlinearity, dispersion, reflection, and diffraction phenomena occur. The third example is used to evaluate the capabilities of both models to simulate nonlinear sloshing waves. The last two examples of focusing waves are employed to test the abilities of both models to resolve nonlinear wave groups. The acceleration of the Poisson equation solver due to GPU parallelization is presented in the final test case, which is computationally very expensive. In all the computations, the same horizontal grid sizes and the same time step are used for the two models.

(𝑥 − 5)2 + (𝑦 − 8.98)2 = 6.6049

(32)

whereas the depth on the shoal is: √ 𝐻(𝑥, 𝑦) = ℎ + 8.73 − 82.81 − (𝑥 − 5)2 − (𝑦 − 8.98)2

(33)

where ℎ = 0.45 m is the still water depth. The incident wave with a wave height 𝐻0 = 1.18 cm and a wave period 𝑇0 = 1.0 s is specified at the left boundary, resulting in 𝐾ℎ = 1.90. At the right end of the computational domain, a Sommerfeld’s radiation boundary is used to minimize wave reflection. The same test case has been used to verify the nonhydrostatic model developed by Zijlema and Stelling (2005). In this computation, a uniform grid of 𝛥𝑥 = 𝛥𝑦 = 0.05m is used in the horizontal domain, and the constant time step is set to 𝛥𝑡 = 0.02 s. Similar to the former example, the vertical boundary-fitted coordinate system is also constructed by using two vertical layers with 𝑧𝑓 = −ℎ(𝑥, 𝑦) and 𝑘𝑓 = 0 for both models. The total grid number is therefore 440 × 364 × 2 = 320 320. Both simulations are run for 40 s. Fig. 11 shows comparisons among the normalized wave heights at four sections of the two sets of numerical results and experimental data. Only minor discrepancies exist between the two numerical models. Generally, both model results are in good agreement with the measured data. This demonstrates the capabilities of both models for resolving 3D nearshore wave propagation, in which wave shoaling, reflection, and diffraction occur.

5.1. Wave propagation over a submerged bar Wave propagation over a submerged bar is a classical twodimensional (2D) benchmark test case for the validation of nonlinear dispersive wave models. Beji and Battjes (1993) and Luth et al. (1994) have conducted a series of laboratory experiments of the propagation of regular waves over a submerged trapezoidal bar in a wave flume. In this study, the numerical wave flume is 35 m long with 10 m of a sponge layer at the right end. The still water depth ℎ is 0.4 m and is reduced to 7

C. Ai, Y. Ma, C. Yuan et al.

Ocean Modelling 144 (2019) 101489

Fig. 7. Sketch of the model setup and measurement stations of free-surface elevation.

Fig. 8. Comparisons of the free-surface elevation between the two sets of numerical results and experimental data for Case A. Iterative model (solid line), noniterative model (dashed line), and experimental data (circles).

8

C. Ai, Y. Ma, C. Yuan et al.

Ocean Modelling 144 (2019) 101489

Fig. 9. Comparisons of the free-surface elevation among the two sets of numerical results and experimental data for Case C. Iterative model (solid line), noniterative model (dashed line), and experimental data (circles).

occurs in the last few wave periods. The difference between the results of the two models is more apparent in the case of large wave steepness than in the case of small wave steepness. Overall, with the same horizontal grid spacing and the time step, the iterative model results are in good agreement with the analytical solution and are better than the noniterative model results.

5.3. Nonlinear sloshing waves In this example, we consider nonlinear sloshing motions in a closed basin. The basin has a length of 𝐿 = 2.0 m and a still water depth of ℎ = 1.0 m. The initial free-surface profile also follows Eq. (31), but two wave steepnesses of 𝑎𝐾 = 0.1 and 0.2 are considered in the simulations, in which 𝐾 = 2𝜋∕𝐿 such that 𝐾ℎ = 𝜋. Similar test cases have been used to verify other nonhydrostatic models (Ai et al., 2014; Young et al., 2007). The second-order analytical solution described in Wu et al. (1998) is used for the comparison with numerical results. A horizontal grid spacing of 𝛥𝑥 = 𝐿∕40 and time step of 𝛥𝑡 = 𝑇 ∕40 are used in the computations, where 𝑇 is the wave period. In the vertical direction, the grid system is constructed by using ten vertical layers with 𝑧𝑓 = −0.1 m and 𝑘𝑓 = 9 for both models. Comparisons of the time histories of the free-surface elevation at 𝑥 = 𝐿∕2 among the two sets of numerical results and analytical solutions are depicted in Fig. 12. A notable discrepancy between the results of the two models

5.4. 2D focusing waves The spatial–temporal focusing of wave groups at one point in space and time is believed to be one of the physical mechanisms contributing to the formation of extreme waves. Many researchers have made a great deal of effort to investigate focusing behavior by using laboratory experiments (Baldock et al., 1996; Baldock and Swan, 1996; Johannessen and Swan, 2001; Ma et al., 2010; Ning et al., 2009). In the last decade, nonhydrostatic models for simulating focusing wave groups have received increased attention. In this example, to assess both 9

C. Ai, Y. Ma, C. Yuan et al.

Ocean Modelling 144 (2019) 101489

Fig. 10. Sketch of the experimental setup of Chawla (1995).

Fig. 12. Comparisons of the time histories of the free-surface elevation at 𝑥 = 𝐿∕2 among the two sets of numerical results and analytical solution. Iterative model (solid line), noniterative model (dashed line), and analytical solution (circles).

Table 2 Experimental conditions for 2D focusing waves.

Case B55 Case D55

Period range (s)

Frequency band (Hz)

𝐾ℎ

0.6 ≤ 𝑇 ≤ 1.4 0.8 ≤ 𝑇 ≤ 1.2

0.71 ≤ 𝑓 ≤ 1.66 0.83 ≤ 𝑓 ≤ 1.25

7.82∼1.57 4.40∼2.02

the number of frequency components, and 𝑎𝑛 defines the amplitude of each component. Wave components are of equal amplitude and equally spaced within the period range. The inflow boundary is specified at 𝑥 = 0 m, and the imposed normal velocity components are given by: { } [ ] 𝑁 ∑ cosh 𝐾𝑛 (ℎ + 𝑧) [ ] 𝑎𝑛 𝜔 𝑛 𝑢(𝑥, 𝑧, 𝑡) = cos 𝐾𝑛 (𝑥 − 𝑥𝑓 ) − 𝜔𝑛 (𝑡 − 𝑡𝑓 ) sin h(𝐾𝑛 ℎ) 𝑛=1

models’ representation of 2D nonlinear focusing waves, experiments conducted by Baldock et al. (1996), which have been reproduced by few nonhydrostatic models (Ai et al., 2014; Young and Wu, 2010a), are used. The dimension of the computational domain is 0 ≤ 𝑥 ≤ 20 m. The still water depth is ℎ = 0.7 m. A 5 m sponge layer is applied at the end of the computational domain, 𝑥 = 20 m. The two period ranges or frequency bands indicated in Table 2 are considered in the present simulations. Case B55 corresponds to a broad bandwidth, and Case D55 represents a narrow bandwidth. Both cases have the same ∑ input wave amplitude of 𝐴 = 𝑁 𝑛=1 𝑎𝑛 = 55 mm, where 𝑁 = 29 is

(34) where 𝐾𝑛 and 𝜔𝑛 denote the wavenumber and frequency of each component, respectively, satisfying the linear dispersion relationship; and 𝑥𝑓 = 8 m and 𝑡𝑓 = 30 s are the theoretical focusing position and time, respectively. In this computation, for the narrow bandwidth case (Case D55), the horizontal domain is discretized by setting the grid spacing at

Fig. 11. Comparisons among the normalized wave heights of the two sets of numerical results and experimental data at four sections. Iterative model (solid line), noniterative model (dashed line), and experimental data (circles).

10

C. Ai, Y. Ma, C. Yuan et al.

Ocean Modelling 144 (2019) 101489

Table 3 Comparisons of the focusing position 𝑥𝑟𝑓 and time 𝑡𝑟𝑓 among the results of both models and the estimated measured data for 2D focusing waves. 𝑥𝑟𝑓 and 𝑡𝑟𝑓 are relative to 𝑥𝑓 and 𝑡𝑓 , respectively. Case B55

𝑥𝑟𝑓 (m) 𝑡𝑟𝑓 (s)

Case D55

Experimental data

Iterative model

Noniterative model

Experimental data

Iterative model

Noniterative model

0.386 0.169

0.4125 0.185

0.3375 0.135

1.284 0.670

1.275 0.65

1.150 0.58

Fig. 13. Comparisons of (a) the time histories of the free-surface elevation at the focusing position and (b) the free-surface profiles at the focusing time for Case B55 among the two sets of numerical results and experimental data. Iterative model (solid line), noniterative model (dashed line), and experimental data (circles).

Fig. 14. Comparisons of (a) the time histories of the free-surface elevation at the focusing position and (b) the free-surface profiles at the focusing time for Case D55 among the two sets of numerical results and experimental data. Iterative model (solid line), noniterative model (dashed line), and experimental data (circles).

𝛥𝑥 = 0.025 m, and the time step is taken as 𝛥𝑡 = 0.005 s. To accurately predict the focusing position and time in the broad bandwidth case (Case B55), a refined grid with 𝛥𝑥 = 0.0125 m and a small time step of 𝛥𝑡 = 0.0025 s are used. The vertical boundary-fitted coordinate system is constructed by using seven vertical layers with 𝑧𝑓 = −0.1m and 𝑘𝑓 = 6. Figs. 13 and 14 show comparisons of the temporal and spatial wave profiles among the two sets of numerical results and experimental data for Case B55 and Case D55, where 𝑡𝑟 = 𝑡 − 𝑡𝑓 and 𝑥𝑟 = 𝑥 − 𝑥𝑓 . Generally, the results of both models are in good agreement with the experimental data. However, the noniterative model slightly underestimates the main wave crest. Accordingly, the focusing position and time are also underpredicted by the noniterative model. Table 3 compares the calculated focusing positions and times with the measured data, which are estimated from Fig. 10 of Baldock et al. (1996). The iterative model overestimates the focusing position and time by 6.9% and 9.5%, respectively, for Case B55 and only underestimates focusing position and time by 0.7% and 3.0%, respectively, for Case D55. Accordingly, the noniterative model underestimates the focusing position and time by 12.6% and 20.1%, respectively, for Case B55 and by 10.4% and 13.4%, respectively, for Case D55.

nonhydrostatic models (Ai et al., 2014; Young and Wu, 2010b). As shown in Fig. 15, the dimensions of the computational domain are 0 ≤ 𝑥 ≤ 13m and 0 ≤ 𝑦 ≤ 25m. The still water depth is ℎ = 1.2 m. A 5.0 m sponge layer is applied at the end of the computational domain. As presented in Table 4, we considered two wave groups with different input amplitudes. The largest input amplitude, i.e., 𝐴 = 78 mm, is very close to the limit at which incipient wave breaking occurs. Each wave group consists of 𝑁 = 28 frequency components, and in each frequency, there are 𝑀 = 91 direction components. For each frequency component, the frequency 𝑓𝑛 is equally spaced within the frequency range, and the wave amplitude 𝑎𝑛 is proportional to 𝑓𝑛 −2 . For each direction component, the direction 𝜃𝑚 ranges from −45◦ to 45◦ . The inflow boundary is specified at 𝑥 = 0 m, and the imposed normal velocity components can be expressed as: { [ ] 𝑁 ∑ cosh 𝐾𝑛 (ℎ + 𝑧) 𝑢(𝑥, 𝑦, 𝑧, 𝑡) = ⋅ 𝑎𝑛 𝜔 𝑛 sin h(𝐾𝑛 ℎ) 𝑛=1 } (35) 𝑀 ∑ [ ] 𝑏𝑛𝑚 cos 𝐾𝑛 ((𝑥 − 𝑥𝑓 ) cos(𝜃𝑚 ) + (𝑦 − 𝑦𝑓 ) sin(𝜃𝑚 )) − 𝜔𝑛 (𝑡 − 𝑡𝑓 ) 𝑚=1

where (𝑥𝑓 , 𝑦𝑓 ) = (5.5,12.5) m is the theoretical focusing position; 𝑡𝑓 = 50 s is the theoretical focusing time; and 𝑏𝑛𝑚 is the directional spread ∑ of wave components, which is normalized so that 𝑀 𝑚=1 𝑏𝑛𝑚 = 1. In this study, the directional spread is independent of the wave frequency and is expressed by adopting the following typical spreading parameter: ( ) 𝑏(𝜃𝑚 ) = 𝐵 cos𝑠 𝜃𝑚 ∕2 (36)

5.5. 3D focusing waves In the last example, experimental data from Johannessen and Swan (2001) are used to verify both models’ representation of 3D nonlinear focusing waves. Similar problems have also been predicted by few 11

C. Ai, Y. Ma, C. Yuan et al.

Ocean Modelling 144 (2019) 101489

Fig. 16. Comparisons of the time histories of the free-surface elevation at the focusing position among the two sets of numerical results and experimental data. Iterative model (solid line), noniterative model (dashed line), and experimental data (circles).

in Fig. 12 of Johannessen and Swan (2001), the measured focusing time for Case D4555 is much lower than the best-fit polynomial for the measured data. For Case D4578, the iterative model underestimates the focusing position and time by 4.9% and 14.0%, respectively. Accordingly, the noniterative model underestimates the focusing position and time by 15.6% and 24.5%, respectively. To test the performance of the GPU implementation of the Poisson equation in both models, Case D4578 is solved by using the serial and GPU-accelerated versions of the code. All computations are carried out with single-precision floating-point numbers. The test environment is based on a desktop computer with an Intel(R) Core(TM) i7-7700K CPU and a NVIDIA GeForce GTX 1080 Ti GPU. The CPU is a quad-core processor with eight threads that has a base frequency of 4.2 GHz and peak frequency of 4.5 GH with 8 MB L3 cache. The GPU consists of 3584 CUDA cores and has a total memory bandwidth of 484 GB/s. Each CUDA core has a base clock rate of 1480 MHz and a boost frequency of 1582 MHz. The performance of the GPU-accelerated Poisson solver is presented in Table 6. The CPU time is measured by executing the serial code in a single thread, while the GPU time is the parallel running time on the GPU. The speedups with respect to a single-thread CPU are 16.61 and 17.18 for the noniterative model and iterative model, respectively. The greatest speedup value is obtained by the iterative model, which is attributed to the fact that the GPU-accelerated part of the iterative model is more computationally intensive than that of the noniterative model because for the iterative model, the Poisson equation may need to be solved many times at each time step.

Fig. 15. Schematic of the 3D focusing wave basin.

where 𝐵 is a normalizing coefficient, and 𝑠 is the directional spread parameter. In this computation, we choose the horizontal grid spacing of 𝛥𝑥 = 0.025 m and 𝛥𝑦 = 0.05 m and construct the vertical boundary-fitted coordinate system by employing ten vertical layers with 𝑧𝑓 = −0.12m and 𝑘𝑓 = 9. The chosen time step is 0.005 s, and the total simulation time is up to 54 s. Fig. 16 shows comparisons of the time histories of the free-surface elevation at the focusing position among the two sets of numerical results and experimental data. For the case with a small input amplitude (Case D4555), the results of both models are very close and are in good agreement with the experimental data. However, the iterative model results are slightly better than the noniterative model results for the case with a large input amplitude (Case D4578) because the main wave crest is slightly underestimated by the noniterative model. Comparisons of the focusing position and time among the results of both models and measured data are shown in Table 5. The measured data are estimated from Fig. 12 of Johannessen and Swan (2001). For Case D4555, both models predict the focusing position very well but overestimate the focusing time significantly. The iterative model overestimates the focusing position by only 1.8%, and the noniterative model underestimates the focusing position by only 2.0%. The significant overestimation of the focusing time by both models may be attributed to inaccurately measured data because, as easily found

6. Conclusions In this paper, two nonhydrostatic models, namely, iterative and noniterative models, are developed to simulate surface water waves. Both models employ the semi-implicit method to discretize the free surface and solve the 3D incompressible NSEs with the pressure correction method. To improve both models’ efficiency, the Poisson equation solver is run on GPU in conjunction with the conjugate gradient method. 12

C. Ai, Y. Ma, C. Yuan et al.

Ocean Modelling 144 (2019) 101489

Table 4 Experimental conditions for 3D focusing waves.

Case D4555 Case D4578

Input wave amplitude (mm)

Frequency band (Hz)

Spreading parameter

Directional range (deg)

𝐾ℎ

𝐴 = 55 𝐴 = 78

0.83 ≤ 𝑓𝑛 ≤ 1.25

𝑠= 45

−45 ≤ 𝜃𝑚 ≤ 45

7.46∼3.38

Table 5 Comparisons of the focusing position 𝑥𝑟𝑓 and time 𝑡𝑟𝑓 among the results of both models and the estimated measured data for 3D focusing waves. 𝑥𝑟𝑓 and 𝑡𝑟𝑓 are relative to 𝑥𝑓 and 𝑡𝑓 , respectively. Case D4555

𝑥𝑟𝑓 (m) 𝑡𝑟𝑓 (s)

Case D4578

Experimental data

Iterative model

Noniterative model

Experimental data

Iterative model

Noniterative model

0.663 0.303

0.675 0.365

0.650 0.355

1.393 0.808

1.325 0.695

1.175 0.610

Table 6 Performance of the GPU-accelerated Poisson solver for Case D4578.

Noniterative model Iterative model

Acknowledgments

Grid number

CPU time (h)

GPU time (h)

Speedup

520 × 500 × 10

18.07 33.37

1.09 1.94

16.61 17.18

This work was supported by the National Key Research and Development Program, China (Grant No 2017YFC1404200), the National Natural Science Foundation of China (Grant Nos. 51679031, 51720105010, 51309052), the High-Tech Ship Research Projects Sponsored by the Ministry of Industry and Information Technology (MIIT) of China (Grant No. 2016-23-7) and the Fundamental Research Funds for the Central Universities, China (Grant Nos. DUT16TD08, DUT18ZD401).

An assessment of both models’ accuracy in resolving the linear dispersion relation is conducted for a wide range of 𝐾ℎ values. It is inferred that for the long-term evolution of higher dispersive waves, the iterative model is more accurate than the noniterative model when the time step is not sufficiently small. By refining the vertical layers, the accuracy of the iterative model can be improved significantly. By decreasing the time step, the accuracy of the noniterative model can be increased.

Appendix A. Integration procedures for Eqs. (1)–(4) Eqs. (7)–(10) are obtained by integrating Eqs. (1)–(4). Some details about the integration procedures are given as follows. By integrating Eq. (1) over a layer thickness of 𝛥𝑧𝑘 = 𝑧𝑘+1∕2 − 𝑧𝑘−1∕2 and using the Leibniz’ rule, given by:

The serially implemented efficiency of both models is also tested in resolving linear dispersive waves. The iterative model is computationally more expensive than the noniterative model. The ratio of the CPU time consumed by the iterative model to that spent by the noniterative model is in the range of 1.1–2.5. Both models’ efficiency is more sensitive to the horizontal grid resolution than the vertical grid resolution.

𝑧𝑘+1∕2

∫𝑧𝑘−1∕2

(

𝜕𝑢 𝜕𝑣 𝜕𝑤 + + 𝜕𝑥 𝜕𝑦 𝜕𝑧

) 𝑑𝑧 =

𝜕(𝛥𝑧𝑢)𝑘 𝜕(𝛥𝑧𝑣)𝑘 + 𝜕𝑥 𝜕𝑦

(A.1)

𝜕𝑧 |𝑧𝑘+1∕2 𝜕𝑧 |𝑧𝑘+1∕2 − 𝑢 || − 𝑣 || + 𝑤𝑘+1∕2 − 𝑤𝑘−1∕2 = 0 𝜕𝑥 |𝑧𝑘−1∕2 𝜕𝑦 |𝑧𝑘−1∕2 The relations between 𝑤𝑘±1∕2 and 𝜔𝑘±1∕2 can be expressed as:

For wave propagation over uneven bottoms, the overall good agreements between the two sets of model results and experimental data indicate that both models, by using only two vertical layers, can accurately simulate nearshore wave propagation involving wave shoaling, nonlinearity, dispersion, reflection, and diffraction phenomena for the investigated 𝐾ℎ values. For nonlinear sloshing waves with 𝐾ℎ = 𝜋, with the same horizontal grid sizes and the same time step, the iterative model results are better than the noniterative model results and are in good agreement with the analytical solution. In the last two examples involving 2D and 3D focusing wave groups, the iterative model predicts slightly larger values of the focusing position and time than the noniterative model. By considering the measurement errors and uncertainties in the developed models, both models well resolved the 2D and 3D focusing wave groups.

𝑤𝑘±1∕2 = 𝜔𝑘±1∕2 +

𝜕𝑧𝑘±1∕2 𝜕𝑡

+ 𝑢𝑘±1∕2

𝜕𝑧𝑘±1∕2 𝜕𝑥

+ 𝑣𝑘±1∕2

𝜕𝑧𝑘±1∕2 𝜕𝑦

(A.2)

By substituting Eq. (A.2) into Eq. (A.1), the layer-integrated continuity Eq. (7) is obtained. To obtain Eq. (8), the left-side terms of Eq. (2) are first integrated over a layer thickness. Using the Leibniz’ rule, the integration result is written as: ) 𝑧𝑘+1∕2 𝑧𝑘+1∕2 ( 𝜕𝑢2 𝜕𝑢𝑣 𝜕𝑢𝑤 𝜕𝑢 𝑑𝑧 + + + 𝑑𝑧 ∫𝑧𝑘−1∕2 ∫𝑧𝑘−1∕2 𝜕𝑡 𝜕𝑥 𝜕𝑦 𝜕𝑧 ( ) ( ) 𝑧𝑘+1∕2 𝜕(𝛥𝑧𝑢)𝑘 ( 𝜕𝑧 )|| 𝜕𝑧 || 𝜕 2 − 𝑢 + 𝑢 + 𝑢 𝑑𝑧 = 𝜕𝑡 𝜕𝑡 ||𝑧𝑘+1∕2 𝜕𝑡 ||𝑧𝑘−1∕2 𝜕𝑥 ∫𝑧𝑘−1∕2 (A.3) ( ) 𝑧𝑘+1∕2 𝜕 + (𝑢𝑣)𝑑𝑧 𝜕𝑦 ∫𝑧𝑘−1∕2 ( ) ( ) 𝜕𝑧𝑘+1∕2 𝜕𝑧𝑘−1∕2 + 𝜔𝑘+1∕2 + 𝑢(𝑧𝑘+1∕2 ) − 𝜔𝑘−1∕2 + 𝑢(𝑧𝑘−1∕2 ) 𝜕𝑡 𝜕𝑡

The acceleration due to GPU parallelization is evaluated in the 3D focusing wave group test, which is very computationally expensive. A speedup of approximately 17 times for both of the developed models is observed due to the current GPU implementation of the Poisson equation.

where Eq. (A.2) has been substituted. The above Eq. (A.3) can be simplified as follows: ) 𝑧𝑘+1∕2 𝑧𝑘+1∕2 ( 𝜕𝑢 𝜕𝑢2 𝜕𝑢𝑣 𝜕𝑢𝑤 𝑑𝑧 + + + 𝑑𝑧 ∫𝑧𝑘−1∕2 𝜕𝑡 ∫𝑧𝑘−1∕2 𝜕𝑥 𝜕𝑦 𝜕𝑧

In general, by considering the accuracy and efficiency of both semiimplicit models, the noniterative model is preferable to the iterative model for predicting nearshore wave propagation. Both models are capable of accurately predicting the long-term evolution of surface waves in deep-water conditions and can perform well in the test cases involving focusing wave groups. Moreover, the computational times of both models can be reduced by using GPU parallelization.

(A.4) 𝜕(𝛥𝑧𝑢)𝑘 𝜕(𝛥𝑧𝑢𝑢)𝑘 𝜕(𝛥𝑧𝑢𝑣)𝑘 = + + + (𝜔𝑢)𝑘+1∕2 − (𝜔𝑢)𝑘−1∕2 𝜕𝑡 𝜕𝑥 𝜕𝑥 It can be easily found that the Eq. (A.4) is identical to the left-side terms of Eq. (8). Then, a similar integration procedure is applied to the right-side terms of Eq. (2). Because diffusion terms are not considered in the 13

C. Ai, Y. Ma, C. Yuan et al.

Ocean Modelling 144 (2019) 101489

study, and the water surface gradient is independent of the integration, only the layer-integrated nonhydrostatic pressure gradient is given as: ( ) 𝜕(𝛥𝑧𝑞)𝑘 ( 𝜕𝑧 )|| 𝜕𝑞 𝜕𝑧 || 𝑑𝑧 = − 𝑞 + 𝑞 | ∫𝑧𝑘−1∕2 𝜕𝑥 𝜕𝑥 𝜕𝑥 |𝑧𝑘+1∕2 𝜕𝑥 ||𝑧𝑘−1∕2 ( ) 𝜕𝑧𝑘+1∕2 𝜕𝑧𝑘−1∕2 𝜕𝑞 + (𝑞𝑘 − 𝑞𝑘+1∕2 ) = 𝛥𝑧𝑘 + (𝑞𝑘−1∕2 − 𝑞𝑘 ) 𝜕𝑥 𝑘 𝜕𝑥 𝜕𝑥

𝑤𝑛+𝜃∕2 = (1 − 𝜃)𝑤𝑛 + 𝜃𝑤𝑛+1∕2

(B.12)

𝑧𝑘+1∕2

Similarly, by substituting the expressions for the new velocities from Eqs. (22)–(24) into Eq. (26), the resulting equation for the half bottom layer (𝑘 = 1) can be expressed as:

(A.5)

𝑏1 𝛥𝑞𝑖,𝑗,3∕2 + 𝑏2 𝛥𝑞𝑖,𝑗,1∕2 + 𝑏3 𝛥𝑞𝑖−1,𝑗,3∕2 + 𝑏4 𝛥𝑞𝑖−1,𝑗,1∕2

The second and third terms in the above equation were neglected in the present models. However, for steep topography, one may recover these terms to reduce errors due to the treatment of the nonhydrostatic pressure gradient. Following a similar procedure, Eqs. (9) and (10) be obtained by integrating Eqs. (3) and (4), respectively. Details about this procedure are not given for brevity.

+𝑏5 𝛥𝑞𝑖+1,𝑗,3∕2 + 𝑏6 𝛥𝑞𝑖+1,𝑗,1∕2 where

Appendix B. Coefficients of the poisson equation for nonhydrostatic pressure corrections

𝑏1 =

𝜃𝛥𝑡 𝜃𝛥𝑡 𝜃𝛥𝑡 + − 𝛥𝑥2 𝛥𝑦2 𝛥𝑧𝑖,𝑗,1∕2 𝛥𝑧𝑖,𝑗,1

(B.14)

𝑏2 =

𝜃𝛥𝑡 𝜃𝛥𝑡 𝜃𝛥𝑡 + + 𝛥𝑥2 𝛥𝑦2 𝛥𝑧𝑖,𝑗,1∕2 𝛥𝑧𝑖,𝑗,1

(B.15)

𝑏3 = 𝑏4 = 𝑏5 = 𝑏6 = −

By substituting the expressions for the new velocities from Eqs. (22)–(24) into Eq. (25), the following equation for 𝑘 = 2, … , 𝑁𝑧 is obtained:

( ) ( ) 1 1 𝑛+𝜃∕2 𝑛+𝜃∕2 𝑛+𝜃∕2 𝑛+𝜃∕2 𝑢 − 𝑢𝑖−1∕2,𝑗,1 − 𝑣 − 𝑣𝑖,𝑗−1∕2,1 𝛥𝑥 𝑖+1∕2,𝑗,1 𝛥𝑦 𝑖,𝑗+1∕2,1 1 𝑛+𝜃∕2 − 𝑤 𝛥𝑧𝑖,𝑗,1∕2 𝑖,𝑗,1

(B.1)

+𝑎7 𝛥𝑞𝑖+1,𝑗,𝑘+1∕2 + 𝑎8 𝛥𝑞𝑖+1,𝑗,𝑘−1∕2 + 𝑎9 𝛥𝑞𝑖+1,𝑗,𝑘−3∕2 +𝑎13 𝛥𝑞𝑖,𝑗+1,𝑘+1∕2 + 𝑎14 𝛥𝑞𝑖,𝑗+1,𝑘−1∕2 + 𝑎15 𝛥𝑞𝑖,𝑗+1,𝑘−3∕2 = 𝑓 where 𝜃𝛥𝑡 𝜃𝛥𝑡 𝜃𝛥𝑡 𝑎1 = + − 2𝛥𝑥2 2𝛥𝑦2 𝛥𝑧𝑖,𝑗,𝑘−1∕2 𝛥𝑧𝑖,𝑗,𝑘 1

𝑎3 =

𝜃𝛥𝑡 𝜃𝛥𝑡 𝜃𝛥𝑡 + − 2𝛥𝑥2 2𝛥𝑦2 𝛥𝑧𝑖,𝑗,𝑘−1∕2 𝛥𝑧𝑖,𝑗,𝑘−1

𝑎4 = 𝑎6 = 𝑎7 = 𝑎9 = −

𝑎5 = 𝑎8 = −

𝛥𝑧𝑖,𝑗,𝑘−1

+

1 𝛥𝑧𝑖,𝑗,𝑘

𝜃𝛥𝑡 2𝛥𝑥2

𝑎11 = 𝑎14 = −

(B.2)

(B.3)

References Ai, C., Ding, W., Jin, S., 2014. A general boundary-fitted 3D non-hydrostatic model for nonlinear focusing wave groups. Ocean Eng. 89, 134–145. Ai, C., Jin, S., 2010. Non-hydrostatic finite volume model for non-linear waves interacting with structures. Comput. Fluids 39, 2090–2100. Ai, C., Jin, S., 2012. A multi-layer non-hydrostatic model for wave breaking and run-up. Coastal Eng. 62, 1–8. Ai, C., Jin, S., Lv, B., 2011. A new fully non-hydrostatic 3D free surface flow model for water wave motions. Internat. J. Numer. Methods Fluids 66, 1354–1370. Anthonio, S.L., Hall, K.R., 2006. High-order compact numerical schemes for non-hydrostatic free surface flows. Internat. J. Numer. Methods Fluids 52, 1315–1337. Baldock, T.E., Swan, C., 1996. Extreme waves in shallow and intermediate water depths. Coastal Eng. 27, 21–46. Baldock, T.E., Swan, C., Taylor, P.H., 1996. A laboratory study of nonlinear surface waves on water. Philos. Trans. R. Soc. Lond. A Math. Phys. Eng. Sci. 354, 649–676. Beji, S., Battjes, J.A., 1993. Experimental investigation of wave propagation over a bar. Coastal Eng. 19, 151–162. Bradford, S.F., 2005. Godunov-based model for nonhydrostatic wave dynamics. J. Waterw. Port Coast. Ocean Eng. 131, 226–238. Casulli, V., 1999. A semi-implicit finite difference method for non-hydrostatic, free-surface flows. Internat. J. Numer. Methods Fluids 30, 425–440. Casulli, V., Zanolli, P., 2002. Semi-implicit numerical modeling of nonhydrostatic free-surface flows for environmental problems. Math. Comput. Model. 36, 1131–1149. Chawla, A., 1995. Wave Transformation over a Submerged Shoal (M.Sc. Thesis). University of Delaware. Chen, H., Yu, K., 2009. CFD simulations of wave–current-body interactions including greenwater and wet deck slamming. Comput. Fluids 38, 970–980. Escalante, C., Morales de Luna, T., Castro, M.J., 2018. Non-hydrostatic pressure shallow flows: GPU implementation using finite volume and finite difference scheme. Appl. Math. Comput. 338, 631–659. Fringer, O.B., Gerritsen, M., Street, R.L., 2006. An unstructured-grid, finite-volume, nonhydrostatic, parallel coastal ocean simulator. Ocean Modell. 14, 139–173. Higuera, P., Lara, J.L., Losada, I.J., 2013. Realistic wave generation and active wave absorption for Navier–Stokes models Application to OpenFOAM® . Coastal Eng. 71, 102–118.

(B.4)

(B.5)

(B.6)

𝜃𝛥𝑡 4𝛥𝑦2

𝜃𝛥𝑡 2𝛥𝑦2

(B.7)

(B.8)

and [ ] 1 𝑛+𝜃∕2 𝑛+𝜃∕2 𝑛+𝜃∕2 𝑛+𝜃∕2 (𝑢𝑖+1∕2,𝑗,𝑘 + 𝑢𝑖+1∕2,𝑗,𝑘−1 ) − (𝑢𝑖−1∕2,𝑗,𝑘 + 𝑢𝑖−1∕2,𝑗,𝑘−1 ) 2𝛥𝑥 [ ] 1 𝑛+𝜃∕2 𝑛+𝜃∕2 𝑛+𝜃∕2 𝑛+𝜃∕2 − (𝑣𝑖,𝑗+1∕2,𝑘 + 𝑣𝑖,𝑗+1∕2,𝑘−1 ) − (𝑣𝑖,𝑗−1∕2,𝑘 + 𝑣𝑖,𝑗−1∕2,𝑘−1 ) 2𝛥𝑦 ( ) 1 𝑛+𝜃∕2 𝑛+𝜃∕2 − 𝑤𝑖,𝑗,𝑘 − 𝑤𝑖,𝑗,𝑘−1 𝛥𝑧𝑖,𝑗,𝑘−1∕2

(B.18)

The set of Eqs. (B.1) and (B.13) forms the Poisson equation (27), which is a sparse linear symmetric system that contains 10 nonzero diagonals in the bottom layer and 15 nonzero diagonals in the other layers.

)

𝜃𝛥𝑡 4𝛥𝑥2

𝑎10 = 𝑎12 = 𝑎13 = 𝑎15 = −

(B.17)

𝑔=−

+𝑎10 𝛥𝑞𝑖,𝑗−1,𝑘+1∕2 + 𝑎11 𝛥𝑞𝑖,𝑗−1,𝑘−1∕2 + 𝑎12 𝛥𝑞𝑖,𝑗−1,𝑘−3∕2

(

𝜃𝛥𝑡 2𝛥𝑦2

(B.16)

and

+𝑎4 𝛥𝑞𝑖−1,𝑗,𝑘+1∕2 + 𝑎5 𝛥𝑞𝑖−1,𝑗,𝑘−1∕2 + 𝑎6 𝛥𝑞𝑖−1,𝑗,𝑘−3∕2

𝜃𝛥𝑡 𝜃𝛥𝑡 𝜃𝛥𝑡 + + 𝛥𝑥2 𝛥𝑦2 𝛥𝑧𝑖,𝑗,𝑘−1∕2

𝜃𝛥𝑡 2𝛥𝑥2

𝑏7 = 𝑏8 = 𝑏9 = 𝑏10 = −

𝑎1 𝛥𝑞𝑖,𝑗,𝑘+1∕2 + 𝑎2 𝛥𝑞𝑖,𝑗,𝑘−1∕2 + 𝑎3 𝛥𝑞𝑖,𝑗,𝑘−3∕2

𝑎2 =

(B.13)

+𝑏7 𝛥𝑞𝑖,𝑗−1,3∕2 + 𝑏8 𝛥𝑞𝑖,𝑗−1,1∕2 + 𝑏9 𝛥𝑞𝑖,𝑗+1,3∕2 + 𝑏10 𝛥𝑞𝑖,𝑗+1,1∕2 = 𝑔

𝑓 =−

(B.9)

with 𝑢𝑛+𝜃∕2 = (1 − 𝜃)𝑢𝑛 + 𝜃𝑢𝑛+1∕2

(B.10)

𝑣𝑛+𝜃∕2 = (1 − 𝜃)𝑣𝑛 + 𝜃𝑣𝑛+1∕2

(B.11) 14

C. Ai, Y. Ma, C. Yuan et al.

Ocean Modelling 144 (2019) 101489 Vitousek, S., Fringer, O.B., 2011. Physical vs. numerical dispersion in nonhydrostatic ocean modeling. Ocean Modell. 40, 72–86. Vitousek, S., Fringer, O.B., 2013. Stability and consistency of nonhydrostatic freesurface models using the semi-implicit 𝜃-method. Internat. J. Numer. Methods Fluids 72, 550–582. Vitousek, S., Fringer, O.B., 2014. A nonhydrostatic, isopycnal-coordinate ocean model for internal waves. Ocean Modell. 83, 118–144. Wu, G.X., Ma, Q.W., Taylor, R.E., 1998. Numerical simulation of sloshing waves in a 3D tank based on a finite element method. Appl. Ocean Res. 20, 337–355. Wu, C.H., Young, C.-C., Chen, Q., Lynett, P.J., 2010. Efficient nonhydrostatic modeling of surface waves from deep to shallow water. J. Waterw. Port Coast. Ocean Eng. 136, 104–118. Young, C.-C., Wu, C.H., 2010a. Nonhydrostatic modeling of nonlinear deep-water wave groups. J. Eng. Mech. 136, 155–167. Young, C.-C., Wu, C.H., 2010b. A 𝜎-coordinate non-hydrostatic model with embedded boussinesq-type-like equations for modeling deep-water waves. Internat. J. Numer. Methods Fluids 63, 1448–1470. Young, C.-C., Wu, C.H., Kuo, J.-T., Liu, W.-C., 2007. A higher-order 𝜎-coordinate non-hydrostatic model for nonlinear surface waves. Ocean Eng. 34, 1357–1370. Yuan, H., Wu, C.H., 2004. An implicit three-dimensional fully non-hydrostatic model for free-surface flows. Internat. J. Numer. Methods Fluids 46, 709–733. Zijlema, M., Stelling, G.S., 2005. Further experiences with computing non-hydrostatic free-surface flows involving water waves. Internat. J. Numer. Methods Fluids 48, 169–197.

Johannessen, T.B., Swan, C., 2001. A laboratory study of the focusing of transient and directionally spread surface water waves. Proc. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 457, 971–1006. Kang, D., Fringer, O.B., 2005. Time accuracy of pressure methods for nonhydrostatic free-surface flows. In Estuarine and Coastal Modeling 2005 – Proceedings of 9th International Conference on Estuarine and Coastal Modeling, pp. 419-433. Kuo, S., Chiu, P., Lin, R., Lin, Y., 2011. GPU implementation for solving incompressible two-phase flows. World Acad. Sci. Eng. Technol. 52, 878–886. Lai, Z., Chen, C., Cowles, G.W., Beardsley, R.C., 2010. A nonhydrostatic version of FVCOM: 1. Validation experiments. J. Geophys. Res. 115 (C11010). Luth, H.R., Klopman, G., Kitou, N., 1994. Project 13G: Kinematics of waves breaking partially on an offshore bar: LDV measurements for waves with and without a net onshore current. Technical Report H1573, Delft Hydraulics. Ma, Y., Dong, G., Liu, S., Zang, J., Li, J., Sun, Y., 2010. Laboratory study of unidirectional focusing waves in intermediate depth water. J. Eng. Mech. 136, 78–90. Ma, G., Shi, F., Kirby, J.T., 2012. Shock-capturing non-hydrostatic model for fully dispersive surface wave processes. Ocean Modell. 43–44, 22–35. Ning, D.Z., Zang, J., Liu, S.X., Eatock Taylor, R., Teng, B., Taylor, P.H., 2009. Freesurface evolution and wave kinematics for nonlinear uni-directional focused wave groups. Ocean Eng. 36, 1226–1243. Reddy, R., Banerjee, R., 2015. GPU accelerated VOF based multiphase flow solver and its application to sprays. Comput. Fluids 117, 287–303. Rijnsdorp, D.P., Zijlema, M., 2016. Simulating waves and their interactions with a restrained ship using a non-hydrostatic wave-flow model. Coastal Eng. 114, 119–136. Sanders, J., Kandrot, E., 2010. CUDA by Example: An Introduction to General-Purpose GPU Programming, New Jersey. Stelling, G.S., Zijlema, M., 2003. An accurate and efficient finite-difference algorithm for non-hydrostatic free-surface flow with application to wave propagation. Internat. J. Numer. Methods Fluids 43, 1–23.

15