A fast wavelet-multipole method for direct BEM

ARTICLE IN PRESS Engineering Analysis with Boundary Elements 34 (2010) 673–679 Contents lists available at ScienceDirect Engineering Analysis with B...

Download PDF

560KB Sizes 5 Downloads 296 Views

Report

PDF Reader
Full Text

ARTICLE IN PRESS Engineering Analysis with Boundary Elements 34 (2010) 673–679

Contents lists available at ScienceDirect

Engineering Analysis with Boundary Elements journal homepage: www.elsevier.com/locate/enganabound

A fast wavelet-multipole method for direct BEM Jinyou Xiao a,, Johannes Tausch b a b

College of Astronautics, Northwestern Polytechnical University, Xi’an 710072, China Department of Mathematics, Southern Methodist University, Dallas, TX 75275, USA

a r t i c l e in fo

abstract

Article history: Received 20 August 2009 Accepted 17 January 2010 Available online 21 March 2010

A fast wavelet-multipole method (WMM) has been developed to achieve further speedup for the boundary element method in solving the direct boundary integral equations. The main idea is to compute the right-hand-side vector by the fast multipole method and to solve the linear system by the wavelet compression method. By using the variable order moments, almost linear complexity can be obtained. The primary advantages of the present WMM lie in that it (1) permits efﬁcient implementation; (2) is universal in handling practical problems with complicated geometries. Numerical examples with around 1 million unknowns, performed on nontrivial geometries, clearly show that the WMM can shorten the total computational time by reducing the time for computing the right-hand-side. & 2010 Elsevier Ltd. All rights reserved.

Keywords: Wavelet compression method Fast multipole method Direct boundary integral equation Boundary element method

1. Introduction There are indirect and direct formulations of the boundary integral equation (BIE) that describe a physical problems equivalently. The direct BIE has gained popularity among engineers and scientists, since the unknowns are the same as the real physical variables that govern the equations. By using the boundary element method (BEM) the direct BIE can be discretized into a linear system of equations. The coefﬁcient matrix is fully populated, and the right-hand-side (RHS) vector is obtained by a matrix–vector product with a dense matrix. Therefore, the total computational complexity for the numerical treatment of the direct BIE by the conventional BEM is at least OðN2 Þ, where N is the number of unknowns. The quadratic complexity can be reduced to almost linear, i.e. a OðNlog NÞa Z 0 , by using the fast boundary element techniques emerged in the past two decades. Representative examples of such fast techniques are fast multipole method (FMM) [1,2,14] and wavelet compression method (WCM) [7–10,15]. Other techniques include Hmatrices [3,4], adaptive cross approximation [13], precorrected-FFT method [5,6], etc. Here we note that Hmatrices is of similar idea with the FMM. Thus the following discussion about the FMM also applies to the Hmatrices. The essential of the aforementioned acceleration techniques is to use the iterative solver and to perform matrix–vector products approximately so that the complexity of one iteration can be reduced to almost linear. Even so, the overall efﬁciency of those techniques are generally different. We will focus on the FMM and

Corresponding author.

E-mail addresses: [email protected] (J. Xiao), [email protected] (J. Tausch). 0955-7997/$ - see front matter & 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.enganabound.2010.01.009

the WCM. Comparisons of the two methods can be found in [17], where we show that, although the setup of the FMM is faster than of the WCM, the matrix–vector application in the WCM is much more efﬁcient than in the FMM. Consequently, we conclude that, from the viewpoint of saving the computational time, the FMM is more suitable for problems with small number of iterations, while the WCM is more efﬁcient for problems with large number of iterations. This conclusion motivates us to exploit the combination of the two methods in order to achieve further speed-up in solving direct BIEs by BEM. Our idea is to compute the RHS vector by the FMM and to solve the resulting linear system by the WCM, which is referred to as wavelet-multipole method (WMM). Since in many practical problems relatively large numbers of iterations are necessary to get acceptable solutions, we can expect the WMM to reduce the total computational time. Theoretically, a WMM can be obtained by combining any one version of FMM and one WCM; for example, we can form a WMM by the FMM in [2] and the WCM in [9]. However, this is not a good combination from the viewpoint of simplicity and efﬁciency, because the principles of the two method are generally different and much extra work has to be done in setting up the WMM. Moreover, the WCM in [9] and its kin are not suitable for problems with very complex geometries due to the need of parametric surfaces. Thus, the key to form an efﬁcient WMM is to use FMM and WCM which are simple in implementation and more universal in handling realistic problems. We propose to use the FMM in [14] and the WCM in [11]. Both methods use Taylor expansion of the Green’s function and can achieve almost optimal complexity by using the variable order moments. In addition, they can be transformed from each other, which is detailed in [17]. It is worthy noting that a WMM can also be constructed using the WCM in [12] and the FMM in [2], and it

ARTICLE IN PRESS 674

J. Xiao, J. Tausch / Engineering Analysis with Boundary Elements 34 (2010) 673–679

may have similar complexity with the WMM in this paper. However, its implementation is more complicated. To our knowledge, the WMM is the ﬁrst reported work on the combination of the WCM and the FMM in order to further promote the efﬁciency of the BEM. We expect this to be the main contribution of this paper. For simplicity in presentation, our discussion is focused on the direct BIE of Laplace equation, but the method can be applied to solve direct BIEs in other problems including Helmholtz, elasticity and Stokes ﬂow, probably with minor modiﬁcations. We organize the paper as follows. The model problem as well as the main idea of the WMM are presented in Section 2. The basic settings of the FMM and the WCM are described in Sections 3 and 4, respectively. Numerical results are provided in Section 5 to test the performance of the WMM. Section 6 concludes the paper.

The motivation of the WMM comes from work [17] which shows that the WCM can be transformed into a FMM. Following this observation, the efﬁciency of the WCM and the FMM was compared with many large-scale practical examples, see [17–19]. We found that:

The CUP time requirement of FMM is lower than of the WCM

2. WMM for direct BEM

only when the number of iterations for solving the linear system is small (for example, less than 50), because in WCM most computational time is spent on computing the sparse matrix, once it is obtained and stored, the iteration is very fast. In FMM, however, one has to perform large amount of re-computations in order to reduce the memory requirement, thus the time consumption of an iteration in FMM is much longer than in WCM; see Table 2 in [17]. By using the a-posteriori compression, the memory requirement of the (sparse) system matrix of the WCM can be lower than the near-ﬁeld matrix of the FMM.

2.1. Direct BEM As a model equation (BIE) Z CBIE ðxÞuðxÞ ¼

problem we consider the direct boundary integral of the three-dimensional Laplace equation @ Gðx;yÞqðyÞ þ Gðx; yÞuðyÞ dGy ; x A G: ð1Þ @ny G

where CBIE is a constant whose value is dependent on the geometry of the position of the source point x ¼ ðx1 ; x2 ; x3 Þ; u and q are the temperature and heat ﬂux, respectively; ny is the outward normal at the ﬁeld point y and Gðx; yÞ the Green’s function Gðx;yÞ ¼ GðrÞ ¼

1 ; 4pr

SN

@Gðx;yÞ 1 @r ¼ ; @ny 4pr 2 @ny

r ¼ jxyj:

Let T h ¼ i ¼ 1 ti be a quasi uniform triangulation of G, where h ¼ maxN i ¼ 1 diamðti Þ. We can deﬁne the piecewise constant boundary element space Xh ¼ spanfwi : i ¼ 1; 2; . . . ; Ng; ð2Þ pﬃﬃﬃﬃﬃﬃﬃ in which wi ¼ 1= jti j on panel ti and vanishes elsewhere. Using the standard Galerkin method based on Xh, BIE (1) can be discretized as " # " # q1 u1 ¼ ½G1 G2 : ð3Þ ½H1 H2 q2 u2 The subscripts in (3) are used to indicate different boundary conditions. Here we simply suppose u2 and q1 are prescribed, then the linear system for the unknowns is Aa ¼ b;

ð4Þ

where "

A ¼ ½H1 G2 ;

# q1 ; a¼ u2

"

# q1 b¼B ; u2

B ¼ ½G1 H2 :

ð5Þ

In the conventional BEM, the coefﬁcient matrix A is densely populated and the complexity for solving (4) is at least OðN2 Þ. Moreover, the complexity for computing the known right-hand side vector b also scales like OðN 2 Þ. 2.2. Wavelet-multipole method The basic idea of the wavelet-multipole method (WMM) for direct BIE (1) is to compute the right-hand side vector b using the FMM while solving the linear system (4) using the WCM. The aim is to achieve further speed-up in solving large-scale problems using BEM.

Above comparisons indicate that the WCM is more suitable for problems which need relatively large number of iterations to converge, whereas the FMM is more efﬁcient for small number of iterations. Now we return to (4). We note that only one iteration is needed to compute vector b, thus the FMM is a better choice. On the other hand, in many real-life problems relatively large number of iterations is required to get a reasonable solution a, so we propose to solve (4) by the WCM. It is clear that the WMM has its advantage only when a certain number of iterations is required to solve (4). Our experience shows that this number is in most case around 50, see e.g., [17]. For problems requiring less than 50 iterations, a full FMM scheme in which not only the vector b is computed but the system (4) is also solved by FMM, may outperform the WMM. However, in most large-scale practical problems with complicated geometries and boundary conditions, usually large numbers (larger than 50) of iterations are necessary, thus we can expect the WMM to achieve further speed-up in solving direct BIEs; see numerical experiments in Section 5. In the following, we will consider the implementation of the WMM. Efﬁciency and simplicity are two factors we concern most. We will establish a WMM using the FMM in [14] and the WCM in [11], because: (1) the two methods have many common points, for example, they use the same tree structure, compute the moment matrices of cubes and use the same multipole-to-local Table 1 Computing times (s), memory usage (MB) and L2-error in Example 5.1. N

Trhs

Tmatrix

Tit

Ttotal

Nit

Memory

L2-error

648 2592 10,368 41,472 165,888 663,552

0 1 4 18 72 307

0 3 21 134 722 3278

0 0 0.02 0.12 0.60 2.33

0 4 25 154 806 3665

8 9 5 5 3 3

1.8 12.7 59.1 248.2 1031.6 4515.3

0.021143 0.008453 0.001125 0.000657 0.000285 0.000149

Table 2 Computing times (s), memory usage (MB) and L2 error in example 5.2. N

Trhs

Tmatrix

Tit

Ttotal

Nit

Memory

L2-error

11,788 47,152 188,608 754,432

4 20 79 316

6 44 188 710

0.01 0.06 0.24 1.12

11 74 319 1325

106 142 194 250

35.2 141.7 567.1 2303.9

0.00735 0.00240 0.00170 0.00128

ARTICLE IN PRESS J. Xiao, J. Tausch / Engineering Analysis with Boundary Elements 34 (2010) 673–679

transformations, this permits an efﬁcient implementation of the WMM; (2) they are based on Taylor expansion of Green’s functions, which allows a uniﬁed treatment of a large class of Green’s functions associated with elliptic boundary value problems.

a

coefﬁcient ln;n0 is given by

lan;n0 ¼

3.1. Tree structure and notations The construction of the multilevel tree begins with embedding T h into a level-0 cube C0 . The cube is subdivided into eight cubes of equal size and these cubes are repeatedly reﬁned until the ﬁnest cubes contain at most a predetermined number of panels. The set of cubes that have a non-empty intersection with T h at level-l is denoted by Cl . The collection of all the non-empty cubes S is C ¼ Ll ¼ 0 Cl . The collection of panels whose centroid is in cube n is Gn . The bounding box Bn of Gn is the smallest axiparallel box that contains Gn . xn is the center of Bn . The nonempty children of cube n are denoted by sonsðnÞ. The separation ratio of two cubes n and n0 at the same level is deﬁned as 8 jyy0 j > n a n0 ; < miny0 A Gn0 maxx A Gn ; jxy0 j Zn;n0 ¼ y A Gn > : n ¼ n0 : 1 0

The neighbor set for any cube n A C is deﬁned as maxfZn;n0 ; Zn0 ;n g 4 Zg;

ð6Þ

3.2. Basic FMM formulas In direct BEM the RHS b is obtained by a matrix–vector multiplication, see (5). This can be accelerated by the FMM. The key idea is to approximate the Green’s function in the interaction of well-separated cubes by a truncated Taylor series expansion. In the variable order scheme, the expansion order of the FMM is adjusted to the level. Here we let pl ¼ Ll þpL ;

ð7Þ

where pL is the order at the ﬁnest level L. Vector b is concerned with the computation of integrals of Green’s function as well as integrals of its normal derivative. For two well-separated cubes n and n0 at level l, these integrals can be approximated as Z g @ny GðjxyjÞ gðyÞ dGy X

X

Da þ b Gðjrn;n0 jÞ ðxxn Þa a!b! jaj r pl jbj r pl jaj X a ¼ ln;n0 ðxxn Þa ; xA Gn

Z

g

Gn0

jaj rpl ;

ð9Þ

where mbn0 ðgÞ is a moment of the function g, which is given by Z g mbn0 ðgÞ ¼ @ny ðyxn0 Þb gðyÞ dGy : ð10Þ Thus there are two types of moments, depending on whether g ¼ 0 or g ¼ 1. Since the translation operators for both types are identical, our notations will ignore this distinction. The computation of the local expansion coefﬁcients from the moments in (9) is the MtL translation. The moments of a cube n depend on the center xn . In order to recursively compute the moments of a cube from the moments of its children, it is necessary to translate the center. This operation is the MtM translation. Its coefﬁcients are derived easily from the multivariate binomial formula ! X X a ðxn0 xn Þab mbn0 ðgÞ; jajr pl ; l ¼ levelðnÞ: man ðgÞ ¼ n0 A sonsðnÞb r a

@ny ðxn0 yÞb gðyÞ dGy ð8Þ

jaj r pl

where rn;n0 ¼ xn xn0 ; a; b are multiindices and g ¼ 0 for the single and g ¼ 1 for the double layer operator. The local expansion

b

ð11Þ Note that in (11) the moments up to order pl of the son cubes are needed, however for those cubes only the moments up to order pl þ 1 r pl are computed before. Here we assume all higher order moments to vanish. It is shown in [14] that this do not amplify the error. The local expansion coefﬁcients of a cube n0 are computed from the expansion coefﬁcients of the cube’s parent n by translating the expansion center ! X a X aX a a ðxn0 xn Þab ðxxn0 Þb ln ðxxn Þ ¼ ln jaj r p

where levelðnÞ denotes the level of n; the parameter Z is predetermined. Two cubes which are not neighbors are said to be well-separated.

Gn0

Da þ b Gðjrn;n0 jÞ ð1Þb mbn0 ðgÞ; a!b! jbj r p jaj

Gn 0

The variable order FMM is recalled. This is the base of the WCM in Section 4. Most of the formulas in this section can be found in [14].

and

X l

3. FMM with variable order moments

N ðnÞ ¼ fn0 A C : levelðnÞ ¼ levelðn0 Þ

675

jaj r pl

0

bra

X BX ¼ @ jbj r pl

¼

X

aZb jaj r p

b

a b

!

1

aC ðxn0 xn Þab ln Aðxxn0 Þb

lbn0 ðxxn0 Þb :

ð12Þ

jbj r pl

The above operation is the LtL translation. With the above translation relations at hand, the matrix– vector multiplication in b can be readily performed by the standard procedures of FMM; see [14] for the details. 4. WCM with variable order wavelets It is well-known that the BEM using conventional nodal basis generates full matrix. The core of the WCM is to transform nodal basis into an equivalent wavelet basis, denoted by C, in which most functions have vanishing or quasi-vanishing moments. BEM using C leads to numerically sparse matrix. 4.1. Wavelet construction The wavelet functions can be transformed from the nodal functions by a sequence of local transformations within the tree structure established in Section 3.1. This process is described in detail in, e.g. [11,12,16]. Here we only give a brief summary to clarify the concept. In each ﬁnest-level cube, the nodal functions are transformed into ﬁnest-scale wavelets c and scaling functions f. In the coarser levels, the ffunctions of each cube’s children are transformed into c- and ffunctions of the cube. To understand this transformation, given a cube n, denote the ffunctions in the

ARTICLE IN PRESS 676

J. Xiao, J. Tausch / Engineering Analysis with Boundary Elements 34 (2010) 673–679

~ ...;f ~ . Their transformations f ^ ;...;f ^ is cube’s children by f 1 n 1 n given by a nonsingular n nmatrix Q and

f^ i ¼

n X

~ : ½Q j;i f j

ð13Þ

j¼1

~ ’s are supported in cube n, then the transformed Note that if the f j ^ functions f i are supported in n as well. The transformation matrix Q is at our disposition, but its choice has a large inﬂuence on the compressibility of the integral operator in the transformed basis. One approach to achieve high compression is to construct wavelets with vanishing moments up to a level-dependent order pl in (7). Note that the moments are deﬁned by (10). The moment ^ Þ is the matrix with the coefﬁcients ma ðf ^ Þ, where a is matrix Mn ðf i n the row and i the column index. Simple matrix algebra shows that ^ Þ ¼ Mn ðf ~ ÞQ . If we let Q be the matrix in the SVD Mn ð f ~ Þ ¼ URQ T , then the matrix Mn ðf ^ Þ has vanishing columns Mn ðf where the singular values vanish. Hence the corresponding transformed functions in (13) have the exact vanishing moment property. These functions are the cfunctions (denoted by cn;k ; k A I^ n ), the remaining functions are the ffunctions (denoted by fn;k ; kA I~ n ). To obtain more cfunctions and thus higher compression of the integral operator, it is possible to allow wavelets to have nonzero but sufﬁciently small moments, by including the columns corresponding to small singular values. We call these quasivanishing moment (QVM) wavelets [16]. An important parameter in this construction is the threshold esv of the singular values, which controls the residuals of the moments of the QVM-wavelets and the efﬁciency of the wavelet BEM. It can be given by

esv ¼ Cs 2l Zpl ;

ð14Þ

where Cs is a constant. In the above constructions, one needs to compute the moment matrix for each cube. This can be done in exactly the same way as in the FMM in Section 3.2; that is, in the ﬁnest level the moment of each cube is computed directly and in coarser levels it is transformed from the moments of the cube’s children by using MtM (11). 4.2. Matrix compression

compression is implemented according to the neighborhood deﬁned in (6); more speciﬁcally, we only have to compute the wavelet interactions associated with neighboring cubes. Consequently, we obtain a sparse NS-form matrix, denoted by A~ ns , which is an approximation of Ans . The nonzero coefﬁcients of A~ ns is OðNÞ. They can be computed in a fast manner (see [11,17] for the details) of which the computational cost is bounded by OðNÞ. Here we should note that, in calculating A~ ns , the MtL translation (9) in FMM is used to estimate the finteractions corresponding to well-separated cubes. The MtL translations consume much CUP time as they do in the FMM. Although A~ ns is sparse, it still contains a large number of coefﬁcients with small values. In order to yield further compression, thus to further reduce the memory requirement, we propose to use the a-posteriori compression [19]. The key idea is to discard the coefﬁcients of A~ ns below a level dependent threshold el , which can be given by

el ¼ Ce 2ðs2Þl Zpl þ s ;

ð15Þ

where, s is the order of singularity of the Green’s function; s ¼1 in this paper. Ce is a user-deﬁned parameter. Let A^ ns be the approximate NS-form after the a-posteriori compression, and let ½A~ ns lij be the coefﬁcient of A~ ns at level l. Then the compression strategy is given by ( ^ l ½A~ ns lij ½A ns ij Z el ; l ^ ð16Þ ½A ns ij ¼ ½A^ l o el : 0 ns ij

One should note that the a-posteriori compression does not always reduce the memory requirement. In the original WCM the non-zero blocks of the NS-form A~ ns is saved in dense matrix mode. To beneﬁt from the a-posteriori compression, however, the non-zero blocks in A^ ns must be kept in sparse matrix mode which requires storing the positions in addition to the actual values of the non-zero coefﬁcients. This additional memory overhead will probably cancel out the memory reduction of dropping the coefﬁcients. It is easy to show that, to beneﬁt from the aposteriori compression, one can only implement it to m n blocks in A~ ns satisfying nnz oCsp nm;

In FMM we do not compute and save the system matrix explicitly, instead we only use the block-wise low-rank decomposition of it in matrix–vector multiplication. However, in WCM since the matrix (e.g., A) can be compressed to a very sparse one a with only OðNlog NÞa Z 0 nonzero coefﬁcients, we can afford to compute and save it. Thus the iteration in the WCM is more fast than that in the FMM. This is a main difference of the two methods. The matrix in wavelet basis is the wavelet transformation of the matrix A in nodal basis. It has two representations, namely the standard form and the non-standard form (NS-form). This paper considers the latter. For the details of the former see [7,9,10]. In the following we let Ans be the NS-form wavelet transformation of A. The typical structure of Ans can be found in [19]. The nonzero coefﬁcients of the whole matrix Ans is OðN2 Þ. However, most of them have small values, and can be discarded without affecting the error of the result, which is known as the matrix compression. The compression scheme for the WCM based on NS-form is studied in [19]. Here it is summarized. The compression of Ans can be performed at two phases which are known as the a-priori compression and the a-posteriori compression. The former accounts mainly for the acceleration achieved by the WCM, while the latter can further reduce the memory requirement of the compressed matrix. The a-priori

ð17Þ

where, nnz is the number of non-zeros being retained after using (16); Csp ¼ Me =ðMe þ Mp Þ; Me and Mp are the memory requirements for storing a coefﬁcient and its position, respectively. When we obtain the compressed matrix A^ ns , the linear system (4) can be solved using a iterative solver, e.g. the generalized minimal residual method (GMRES). The NS-form matrix–vector multiplication can be found in [17]. We close this section by noting that: (1) the a-priori compression in the WCM is equivalent to the low-rank approximation in the FMM, they have the same accuracy; (2) the aposteriori compression is unique to the WCM, by which the memory requirement of the NS-form matrix can be further reduced to be even lower than the near-ﬁeld matrix of the FMM, as shown in [19].

5. Numerical experiments We demonstrate the performance of the WMM in solving direct BIEs in potential problems. All the computations are in core on an HP workstation with a Xeon 3.0 GHz CPU and 30 GB RAM. The resulting linear systems are solved by the GMRES.

ARTICLE IN PRESS J. Xiao, J. Tausch / Engineering Analysis with Boundary Elements 34 (2010) 673–679

677

Fig. 1. A crankshaft being discretized into 11,788 boundary elements.

5.1. A sphere model First, we test the accuracy of the WMM by solving Neumann problem of Laplace Eq. (1) on the surface of an unit sphere. Let uðxÞ ¼ x1 be the analytical solution of this problem, then the ﬂux boundary condition can be applied accordingly. The surface is partitioned into 648 triangular elements and then reﬁned 5 times. In the computations, we set the parameters pL ¼ 2; Z ¼ 0:4. We set the two parameters concerned with matrix compression to be Cs ¼ Ce ¼ 0:01. Tolerance for the iterative solver GMRES is 10 6. The computational times, total memory usage and the L2-error of potential u on boundary are listed in Table 1. Columns Trhs, Tmatrix, Tit and Ttotal are the time for computing the RHS using the FMM, time for computing the compressed matrix A^ ns , time for one iteration in the WCM and the total time for solving the problem; Nit stands for the number of iterations. The symbol ‘‘ 0’’ indicates the time is near 0s. It can be seen from the last column that the L2-error of potential u is of order OðhÞ. The memory usage and the computing times are of order OðNlogNÞ. In addition, one can see that the numbers of iterations Nit in this example are quite small, thus the FMM is a better choice to solve the direct BIE in this example (see Section 2.2 for the reason).

5.2. A crankshaft model Fig. 1 shows a crankshaft model1 whose boundary is initially partitioned into 11,788 elements and then reﬁned three times; thus the ﬁnest model consists of 754,432 elements. The potential u in the crankshaft is given by u ¼ x1, boundary conditions are enforced accordingly. Flux q is given at the two end sections (gray parts in Fig. 1) and u at the remainder part. In the WMM we set the parameters pL ¼ 1; Z ¼ 0:4. We set the two parameters concerned with matrix compression to be Cs ¼ Ce ¼ 0:1. Tolerance for the iterative solver GMRES is 10 5. The computational times, memory usage and L2-error are listed in Table 2. All Nit are larger than 100, and the WCM is preferable in solving system (4). The times Trhs are much lower than Tmatrix, which implies that the WMM can achieve further speedup by reducing the time for computing the RHS using the FMM. Comparing column Tit with column Trhs indicates that the iteration in the WCM is much faster than in the FMM. The same conclusion can be drawn from the comparison of Tit with the result in Fig. 5 in [2]. The computational times and memory usage increase almost linearly with N. However, the 1

[20]).

¨ The model has been generated by the program NETGEN by J. Schoberl (see

L2-error decays slower than OðhÞ, which may caused be the edges on the boundary. 5.3. An engine-block model In this example, an attempt has been made to apply the WMM to steady-state heat conduction analysis of a engine block. In order to study the scaling of the computational time and memory requirements, the engine block is ﬁrst discretized into 85,728 elements, then reﬁned two times; the largest model has 1,286,268 elements. The engine block as well as part of its coarsest boundary element partition are shown in Fig. 2. The conductivity of the model is 10 W=m3 C. The temperature of the inner surface of the oblique tube and the temperature of the bottom surface are assumed to be 75 3 C and 100 3 C, respectively; see Fig. 2. Convective condition is given on other part of the boundary, where the difference between the surface temperature u and the ambient temperature u0 is assumed to be linearly related to the heat ﬂux via a ﬁlm coefﬁcient hf, i.e., q ¼ hf ðuu0 Þ:

ð18Þ 23

In this problem, the ﬁlm coefﬁcient is hf ¼ 50 W=m C, the ambient temperature is u0 ¼ 22 3 C. One thing to note is that the WMM for linear system (3) cannot be applied to solve this problem in a straightforward manner due to the convective boundary condition. This can be ﬁxed, however, by a slight modiﬁcation to (3). Suppose q2 is given by the convective condition (18), then (3) can be recasted as " # " # q1 u1 ½H1 ðH2 hf G2 Þ ¼ ½G1 G2 : ð19Þ hf u0 u2 It is obvious that, in order to solve (19) using the WMM, Taylor expansion of function @Gðx; yÞ hf Gðx; yÞ @ny is needed. In the WMM we set the parameters pL ¼ 1; Z ¼ 0:4; Cs ¼ 1; Ce ¼ 0:1. Tolerance for the iterative solver GMRES is 10 4. Fig. 3(a) shows the computed temperature distribution on the surface of the engine block using the model with 85,728 elements. It is clear that the temperature distribution agrees well with that calculated by the ﬁnite element method (FEM) using ANSYS in Fig. 3(b). Table 3 lists the computational times and memory requirements of the models with different number of elements. Again, we see that the time for computing the RHS is much lower than that for the system matrix. In addition, the times and

ARTICLE IN PRESS 678

J. Xiao, J. Tausch / Engineering Analysis with Boundary Elements 34 (2010) 673–679

Fig. 2. An engine block with part of its boundary element partition.

Fig. 3. Temperature of the engine-block model computed by WMM and FEM. (a) WMM result. (b) FEM result.

Table 3 Computing times (s) and memory (MB) in example 5.3. N

Trhs

Tmatrix

Tit

Ttotal

Nit

Memory

85,728 325,774 1,286,268

32 77 438

121 355 2161

0.22 0.74 3.36

181 558 3234

114 157 170

400 1482 5772

memory usages scale almost linearly with the number of elements N.

6. Conclusion In order to achieve further speedup in solving direct BIEs, a WMM has been developed. The right-hand-side vector is computed using the FMM, and the linear system is solved using the WCM. The numerical results, performed on nontrivial

geometries, show that the WMM takes the advantages of both the FMM and the WCM, and can reduce the computational time in solving the direct BIEs of large-scale realistic problems.

Acknowledgments The work of Jinyou Xiao is in part supported by the Doctorate Foundation of Northwestern Polytechnical University under Grant no. CX200601 and the National Science Foundation of China under Grant no. 10674109. The work of Johannes Tausch is in part supported by the National Science Foundation under Grant DMS091522. The authors are also grateful to the anonymous referees for their helpful comments and suggestions that improved the quality and readability of the paper. References [1] Greengard L, Rokhlin V. A new version of the fast multipole method for the Laplace equation in three dimensions. Acta Num 1997;6:229–69.

ARTICLE IN PRESS J. Xiao, J. Tausch / Engineering Analysis with Boundary Elements 34 (2010) 673–679

[2] Shen L, Liu YJ. An adaptive fast multipole boundary element method for three-dimensional potential problems. Comput Mech 2007;39:681–91. [3] Hackbusch W, Nowak ZP. On the fast matrix multiplication in the boundary element method by panel clustering. Numer Math 1989;54:463–91. ¨ [4] Borm S. Construction of data-sparse H2 matrices by hierarchical compression. SIAM J Sci Comput 2009;31(3):1820–39. [5] Phillips JR, White J. A precorrected-FFT method for electrostatic analysis of complicated 3-D structures. IEEE Trans Comput Aided Des Integr Circuits Syst 1997;16(10):1059–72. [6] Nintcheu Fata S. Fast Galerkin BEM for 3D-potential theory. Comput Mech 2008;42(3):417–29. [7] Beylkin G, Coifman R, Rokhlin V. Fast wavelet transforms and numerical algorithms. Comm Pure Appl Math 1991;37:141–83. [8] Lage C, Schwab C. Wavelet Galerkin algorithms for boundary integral equations. SIAM J Sci Comput 1999;20:2195–222. [9] Dahmen W, Harbrecht H, Schneider R. Compression techniques for boundary integral equations—optimal complexity estimates. SIAM J Numer Anal 2006;43(6):2251–71. [10] Harbrecht H, Schneider R. Wavelet Galerkin schemes for boundary integral equations—implementation and quadrature. SIAM J Sci Comput 2006;27(4): 1347–70.

679

[11] Tausch J. A variable order wavelet method for the sparse representation of layer potentials in the non-standard form. J Numer Math 2004;12(3):233–54. [12] Tausch J, White J. Multiscale bases for the sparse representation of boundary integral operators on complex geometry. SIAM J Sci Comput 2003;24(5): 1610–29. [13] Bebendorf M, Rjasanow S. Adaptive low 2003;70:1–24. [14] Tausch J. The variable order fast multipole method for boundary integral equations of the second kind. Computing 2004;72:267–91. [15] Xiao J, Zhang D, Wen L. Fully discrete Alpert multi-wavelet Galerkin BEM in 2D. Eng Anal Bound Elem 2008;32:91–9. [16] Xiao J, Tausch J, Wen L. Approximate moment matrix decomposition in wavelet Galerkin BEM. Comput Meth Appl Mech Eng 2008;197:4000–6. [17] Xiao J, Wen L, Tausch J. On fast matrix–vector multiplication in wavelet Galerkin BEM. Eng Anal Bound Elem 2009;33:159–67. [18] Xiao J, Tausch J, Wen L, Cao Y. Combined equivalent charge formulations and fast wavelet Galerkin BEM for 3-D electrostatic analysis. Int J Numer Meth Eng 2009;79(6):753–72. [19] Xiao J, Tausch J, Hu Y. A-posteriori compression of wavelet-BEM matrices. Comput Mech 2009;44(5):705–15. ¨ [20] Schoberl J. NETGEN—an advancing front 2D/3D-mesh generator based on abstract rules. Comput Visual Sci 1997;1:41–52.

A fast wavelet-multipole method for direct BEM

A fast wavelet-multipole method for direct BEM

Recommend Documents