Journal of Computational and Applied Mathematics 230 (2009) 699–705

journal homepage: www.elsevier.com/locate/cam

Breakdown-free version of ILU factorization for nonsymmetric positive definite matrices

A. Rafiei*, F. Toutounian
Department of Mathematics, Ferdowsi University of Mashhad, P.O. Box 91775-1159, Mashhad, Iran

Article history: Received 20 August 2007; received in revised form 18 July 2008

Abstract: In this paper a new ILU factorization preconditioner for solving large sparse linear systems by iterative methods is presented. The factorization, which is based on the A-biorthogonalization process, is well defined for a general positive definite matrix. Numerical experiments illustrating the performance of the preconditioner are presented. A comparison with the well-known preconditioner RIFp of Benzi and Tůma is also included.

MSC: 65F10
Keywords: Implicit preconditioner; Sparse matrices; RIF; RIFp

© 2009 Elsevier B.V. All rights reserved.

1. Introduction

In this paper we consider the solution of linear systems of the form

Ax = b,    (1)

where the coefficient matrix A ∈ R^(n×n) is large, sparse and nonsymmetric positive definite (NSPD), and b is a given right-hand side vector, using preconditioned conjugate gradient-type methods. Suppose that A admits the factorization

A = LDU,    (2)

where L, U^T are unit lower triangular matrices and D is a diagonal matrix. If L̄ and Ū^T are sparse unit lower triangular matrices approximating (in some sense) the matrices L and U^T, respectively, and D̄ is a nonsingular diagonal matrix approximating D, then we say that the matrix M with

M = L̄D̄Ū ≈ A,    (3)

is an incomplete LU (ILU) factorization preconditioner for the matrix A. The transformed linear systems

AM^-1 u = b,  M^-1 u = x,    (4)

or

M^-1 Ax = M^-1 b,    (5)

have the same solution as system (1) and are often better conditioned, and hence easier to solve, than the original system (1). It is well known that an incomplete factorization of a general matrix A may fail due to the occurrence of zero pivots, regardless of



* Corresponding author. Tel.: +98 5118404288. E-mail addresses: [email protected] (A. Rafiei), [email protected] (F. Toutounian).

0377-0427/$ – see front matter © 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.cam.2009.01.011


whether A admits the LU factorization or not. Moreover, trouble (in the form of numerical instabilities) can be expected in the presence of small pivots. In the SPD case, a breakdown occurs when a nonpositive pivot is encountered. A number of remedies have been proposed to deal with the problem of breakdown, particularly in the SPD case, and significant progress for both the SPD case and the general case has been made in recent years. Some of these techniques can be found in [1–8]. In [3], Benzi and Tůma have presented a safe and reliable method, called RIF (robust incomplete factorization), for computing an approximate factorization A ≈ L̄D̄L̄^T, with L̄ a unit lower triangular matrix and D̄ a diagonal matrix. Unlike the standard Cholesky factorization, their technique is based on an inherently stable A-orthogonalization process and is therefore applicable to general SPD matrices. This is simply the Gram–Schmidt process with respect to the inner product generated by the SPD matrix. In [9], Rezghi and Hosseini have introduced an ILU preconditioner for NSPD matrices by using the conjugate Gram–Schmidt process. They have shown that, by applying the A-biconjugation algorithm to both the A and A^T matrices, it is possible to obtain an approximate factorization of an NSPD matrix. The aim of our paper is to present a new breakdown-free method, called RIFp-NS, to construct an approximate factorization for any NSPD matrix. The new method is based on the A-biconjugation process, but it uses only the matrix A itself.

This paper is organized as follows. In Section 2 we briefly review the A-biconjugation algorithm and discuss the implementation of the RIF and RIFp preconditioners. In Section 3 we describe the new techniques RIF-NS and RIFp-NS for computing an approximate factorization of the form (3). These techniques are applicable to any positive definite matrix. The resulting preconditioners are reliable (pivot breakdown cannot occur) and effective at reducing the number of iterations.
Section 4 is devoted to the numerical experiments that are done by using some test matrices. In Section 5 some remarks and conclusions are presented.

2. Reformulation of A-biconjugation algorithm

The A-biconjugation algorithm presented by Benzi and Tůma in [10] uses the matrices A and A^T to construct the unit upper triangular matrix Z, the diagonal matrix D, and the unit upper triangular matrix W in the inverse factorization

A^-1 = Z D^-1 W^T.    (6)

It has been shown that only for a limited class of matrices, i.e., H-matrices, is it guaranteed that no division by zero will occur during the run of the A-biconjugation algorithm. In [11,12] the authors have shown that, by a reformulation of the A-biconjugation algorithm, no division by zero will occur for a broader range of matrices, i.e., positive definite matrices. The reliable version of the AINV algorithm based on the reformulation of the A-biconjugation algorithm is as follows:

Algorithm 1 (Reformulation of the A-biconjugation algorithm).
1.  For i = 1, ..., n Do:
2.    w_i^(0) = e_i, z_i^(0) = e_i
3.  EndDo
4.  For i = 1, 2, ..., n Do:
5.    v_i = A w_i^(i-1),  u_i = A^T z_i^(i-1)
6.    q_i^(i-1) = (w_i^(i-1))^T v_i,  p_i^(i-1) = (z_i^(i-1))^T u_i
7.    If i = n goto 13
8.    For j = i+1, ..., n Do:
9.      q_j^(i-1) = (w_j^(i-1))^T v_i,  p_j^(i-1) = (z_j^(i-1))^T u_i
10.     w_j^(i) = w_j^(i-1) - (q_j^(i-1)/q_i^(i-1)) w_i^(i-1),  z_j^(i) = z_j^(i-1) - (p_j^(i-1)/p_i^(i-1)) z_i^(i-1)
11.   EndDo
12. EndDo
13. Let z_i := z_i^(i-1), w_i := w_i^(i-1) and p_i := p_i^(i-1) for 1 ≤ i ≤ n.
14. Return Z = [z_1, ..., z_n], W = [w_1, ..., w_n], and D = diag(p_1, p_2, ..., p_n).
From (2), (6), and the fact that the factorization of the form (2) is unique, we have

Z^-1 = U = D^-1 W^T A,  W^-T = L = A Z D^-1.    (7)

In [13], we have shown that for 1 ≤ i ≤ j ≤ n the following two relations hold

p_j^(i-1) = e_i^T A z_j^(i-1) = z_i^T A z_j^(i-1) = w_i^T A e_j,    (8)

and

q_j^(i-1) = e_i^T A^T w_j^(i-1) = w_i^T A^T w_j^(i-1) = z_i^T A^T e_j.    (9)

Therefore, by using relations (7)–(9) we obtain

u_ij = p_j^(i-1)/p_i^(i-1),  l_ji = q_j^(i-1)/q_i^(i-1).    (10)

Hence the L and U factors of A = LDU factorization can be obtained as a by-product of the A-biconjugation process at no extra cost. Based on relation (10) the following algorithm can be given. Algorithm 2. 0. 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14.

Set L = U = I. For i = 1, . . . , n Do: (0) wi(0) = ei , z i = ei EndDo For i = 1, 2, . . . , n Do: (i−1) vi = Awi(i−1) , ui = AT zi (i−1) (i−1) T (i−1) qi = (wi ) vi , pi = (zi(i−1) )T ui If i = n goto 14 For j = i + 1, . . . , n Do: (i−1) (i−1) = (zj(i−1) )T ui pj = (wj(i−1) )T vi , qj lji =

(i−1) qj

(i−1) qi

,

(i−1)

(i)

wj = wj

uij =

 −

(i−1) pj

(i−1) pi (i−1) qj (i−1) (i−1) i qi



w

,

(i)

zj

(i−1)

= zj



(i−1) pj



(i−1) pi

(i−1)

zi

EndDo EndDo (i−1) Let pi = pi for 1 ≤ i ≤ n. 0 p2

.. .

0 0

.

··· ··· .. .

0

0

···

pn

p1 0

 15.



Return L = (lji )1≤i


..  . .

The incomplete factorization based on the A-biconjugation process needs two drop tolerances: one for the incomplete A-biconjugation process, to be applied to the Z and W factors, and a second one to be applied to the entries of the L and U matrices. The latter is simply post-filtration, performed once a column of L and a row of U have been computed. Removing small entries from the columns and rows of the L and U factors leads to some degradation of the convergence rate of iterative methods, but this is often compensated by the saving in computational work obtained from having sparser L and U factors. As mentioned in [3], two variants of incomplete factorization based on the A-biconjugation process, called RIF (without post-filtration) and RIFp (with post-filtration), can be obtained. If a drop tolerance parameter T1 is used to sparsify the newly updated vectors w_j^(i) and z_j^(i) after step 11 of Algorithm 2, by dropping those entries that are less than T1 in absolute value, then the RIF preconditioner is obtained. If, in addition, a drop tolerance T2 is used to drop the small entries of the L and U factors after step 10 (post-filtration) of Algorithm 2, then the RIFp preconditioner is obtained. It should be mentioned that other versions of the RIF and RIFp preconditioners, called NRIF and NRIFp, were presented in [9]. To construct the NRIF and NRIFp preconditioners, step 11 of Algorithm 2 was replaced by

w_j^(i) = w_j^(i-1) - l_ji w_i^(i-1),  z_j^(i) = z_j^(i-1) - u_ij z_i^(i-1).

The most important issue to be mentioned is that, since for every i the vectors z_i^(i-1) and w_i^(i-1) are nonzero by construction, the numbers p_i = (z_i^(i-1))^T A^T z_i^(i-1) and q_i = (w_i^(i-1))^T A w_i^(i-1), 1 ≤ i ≤ n, are all positive, provided that the original matrix A is positive definite. Thus, in exact arithmetic, the RIF and RIFp preconditioners exist and no division by zero will occur during their construction.
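The dropping rule itself is elementary; a minimal sketch (the tolerance values match the ones used later in the experiments, while the sample vector is hypothetical):

```python
def drop_small(vec, tol):
    """Sparsify a vector: zero out every entry smaller than tol in magnitude.
    Applied with T1 to the updated w_j, z_j (giving RIF) and, as
    post-filtration, with T2 to the entries of L and U (giving RIFp)."""
    return [0.0 if abs(x) < tol else x for x in vec]
```

Because dropping only shrinks the off-diagonal fill of the nonzero vectors w_j, z_j, the quadratic-form pivots above remain positive after sparsification.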


3. RIF-NS method

Note that, in Algorithm 2, the matrix A^T generates the vectors {u_i}, i = 1, ..., n, at step 5. These vectors are needed for generating the numbers p_j^(i-1), for 1 ≤ i ≤ j ≤ n. We have shown in [13] that, by using the following proposition, it is possible to obtain the numbers p_j^(i-1) without using the matrix A^T.

Proposition. Let the p_j^(i-1)'s and q_j^(i-1)'s be the scalars produced by Algorithm 1. Then for 1 ≤ i ≤ j ≤ n we have

p_j^(i-1) = a_ij - Σ_{k=1}^{i-1} (p_j^(k-1)/p_k^(k-1)) q_i^(k-1)    (11)

and

q_j^(i-1) = a_ji - Σ_{k=1}^{i-1} (q_j^(k-1)/q_k^(k-1)) p_i^(k-1).    (12)

From (10) and (11), we immediately observe that

p_j^(i-1) = a_ij - Σ_{k=1}^{i-1} u_kj q_i^(k-1).    (13)

Based on (13), a transpose-free version of Algorithm 2 can be stated as follows:

Algorithm 3.
0.  Set L = U = I.
1.  Let w_i^(0) = e_i for 1 ≤ i ≤ n
2.  For i = 1, 2, ..., n Do:
3.    v_i = A w_i^(i-1)
4.    q_i^(i-1) = (w_i^(i-1))^T v_i
5.    p_i^(i-1) = q_i^(i-1)
6.    If i = n goto 15
7.    For j = i+1, ..., n Do:
8.      q_j^(i-1) = (w_j^(i-1))^T v_i
9.      l_ji = q_j^(i-1)/q_i^(i-1)
10.     p_j^(i-1) = a_ij - Σ_{k=1}^{i-1} u_kj q_i^(k-1)
11.     u_ij = p_j^(i-1)/p_i^(i-1)
12.     w_j^(i) = w_j^(i-1) - (q_j^(i-1)/q_i^(i-1)) w_i^(i-1)
13.   EndDo
14. EndDo
15. Let p_i := p_i^(i-1) for 1 ≤ i ≤ n.
16. Return L = (l_ji), U = (u_ij), and D = diag(p_1, p_2, ..., p_n).
In exact arithmetic the above algorithm, like Algorithm 2, is applicable to nonsymmetric positive definite matrices and no division by zero will occur. Of course, instabilities due to positive but extremely small pivots may occur in finite precision, and a thresholding technique may still be necessary to guard against such possibilities.

As in the previous section, we can use the drop tolerances T1 and T2 and obtain two variants of incomplete factorization based on the A-biconjugation process. The RIF-NS preconditioner is obtained by incorporating a single dropping strategy with drop tolerance T1 for sparsifying only the vectors w_j^(i) after step 12 of Algorithm 3. The RIFp-NS preconditioner is obtained by using a dropping strategy with drop tolerance T1 for sparsifying the vectors w_j^(i) after step 12 of Algorithm 3 and also a dropping strategy with drop tolerance T2 for removing small elements of the L and U factors after steps 9 and 11 of this algorithm.


Table 1
Convergence results for iterative methods without preconditioning.

Matrix     n     NZ      It-QMR  It-BiCG  It-GMRES(10)  Reference
hor-131    434   4710    †       †        †             MatrixMarket [14]
cdde1      961   4681    136     93       1089          MatrixMarket [14]
pde900     900   4380    130     76       22            MatrixMarket [14]
sherman1   1000  3750    408     320      520           MatrixMarket [14]
sherman4   1104  3786    133     87       67            MatrixMarket [14]
add20      2395  17319   266     272      168           MatrixMarket [14]
add32      4960  23884   54      33       8             MatrixMarket [14]
pde2961    2961  14585   286     141      44            MatrixMarket [14]
fidap029   2870  23754   76      51       11            MatrixMarket [14]
saylr3     1000  3750    434     307      520           MatrixMarket [14]
cavity05   1182  32747   414     472      4411          MatrixMarket [14]
cavity06   1182  32747   †       †        †             MatrixMarket [14]
rajat04    1041  9642    †       †        †             U.FCollection [15]
rajat12    1879  12926   4327    2883     †             U.FCollection [15]
raefsky1   3242  293409  314     229      1096          U.FCollection [15]
raefsky2   3242  293551  528     373      524           U.FCollection [15]
epb0       1794  7764    †       1324     †             U.FCollection [15]
poli       4008  8188    35      22       8             U.FCollection [15]

Since for k < i the elements u_kj are used to compute p_j^(i-1) in line 10 of Algorithm 3, the drop tolerance T2 should be chosen small, since a large value leads to a degradation of the convergence rate of the iterative methods.

4. Numerical experiments

In this section we focus our attention on a comparison between our new preconditioner RIFp-NS and the well-known RIFp preconditioner. All the tests were run on a PC with a 3 GHz CPU and 1.00 GB of RAM, and all the codes were written in Matlab. Results of the right preconditioned QMR, BiCGSTAB and GMRES(10) methods using both the RIFp and RIFp-NS preconditioners are presented in Tables 3–5. The results of the unpreconditioned methods and the matrix properties are listed in Table 1. In this table ''n'' means the dimension of the matrix, ''NZ'' stands for the number of nonzero entries of the matrix, and ''It-QMR'', ''It-BiCG'' and ''It-GMRES(10)'' denote the number of iterations of the QMR, BiCGSTAB and GMRES(10) methods, respectively. The nonsymmetric test matrices are taken from the University of Florida sparse matrix collection [15] and the Matrix Market collection [14]. The test matrices were not reordered and no scaling was used. The initial vector for all tests is the zero vector and the right-hand side vector is b = Ae, where e = [1, ..., 1]^T. The stopping criterion

||r_k||_2 ≤ 10^-6

was used, where r_k = b - Ax_k is the kth iterated residual of the linear system to be solved. In Table 1 a dagger (†) indicates that no convergence is achieved after 10 000 iterations.

In Table 2, properties of the preconditioners are listed. ''density'' means

density = (NZ(L̄) + NZ(Ū)) / NZ(A),

where NZ(X) is the number of nonzero entries of a matrix X. In this table ''Prcosts'' denotes the number of arithmetic operations (Flops) for constructing the preconditioner divided by NZ(A). For all matrices we have used the drop tolerances T1 = 0.1 and T2 = 0.01 to construct both preconditioners. In Tables 3–5, ''Itr'' stands for the number of iterations, ''Itcosts'' denotes the number of Flops needed for the iteration phase divided by NZ(A), and ''Tcosts'' is the total number of arithmetic operations, i.e., Tcosts = Prcosts + Itcosts. In Tables 3–5 a dagger (†) means that no convergence is achieved after 1000 iterations. In the computation of the two preconditioners, whenever the pivot element was less than 10^-16 we replaced it by 0.1. This case only happened for the matrices saylr3, cavity05, and cavity06. Table 2 shows that for T1 = 0.1 and T2 = 0.01, the nonzero density of RIFp-NS is close to that of the RIFp preconditioner. Thus the comparison of the number of iterations and of the Prcosts for the RIFp and RIFp-NS preconditioners is meaningful. Table 2 also shows that, for all matrices except epb0, the Prcosts of the RIFp-NS preconditioner is better than that of the RIFp preconditioner. Results of Tables 3–5 show that the two preconditioners give similar results from the point of view of the rate of convergence. Table 3 shows that for all matrices except cavity06, rajat04, and epb0, the total costs (Tcosts) of applying the QMR method with the RIFp-NS preconditioner are less than the total costs of applying the QMR method with the RIFp preconditioner. The results of Table 4 are similar to those of Table 3, except for cavity05, for which the Tcosts of the RIFp preconditioner is also smaller than that of the RIFp-NS preconditioner. Table 5 shows that for most problems the RIFp-NS
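As a toy numerical check of the density statistic above (the 2x2 factors and matrix are hypothetical illustrations, not taken from the test set):

```python
def nnz(M):
    """Number of nonzero entries of a dense-stored matrix."""
    return sum(1 for row in M for x in row if x != 0.0)

A = [[4.0, 1.0], [2.0, 5.0]]
Lbar = [[1.0, 0.0], [0.5, 1.0]]    # toy incomplete factors (assumed values)
Ubar = [[1.0, 0.25], [0.0, 1.0]]
density = (nnz(Lbar) + nnz(Ubar)) / nnz(A)   # = (3 + 3) / 4 = 1.5
```

A density near 1 thus means the two incomplete factors together carry about as many nonzeros as A itself.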


Table 2
Properties of implicit preconditioners.

           RIFp                RIFp-NS
Matrix     density  Prcosts    density  Prcosts
hor-131    3.641    1836.76    1.89     64.45
cdde1      2.29     48.459     1.96     18.96
pde900     2.09     71.529     1.94     27.69
sherman1   1.72     72.87      1.74     41.69
sherman4   2.19     40.33      2.25     28.12
add20      0.716    40.134     0.66     23.72
add32      1.45     23.03      1.44     14.95
pde2961    2.08     69.91      1.96     32.59
fidap029   1.32     22.18      1.31     14.41
saylr3     1.72     72.87      1.74     41.69
cavity05   1.053    58.15      1.12     47.81
cavity06   1.059    71.52      1.1      56.12
rajat04    0.935    40.56      1.16     29.77
rajat12    0.855    36.97      0.98     22.28
raefsky1   0.465    38.91      0.47     36.21
raefsky2   0.626    67.2       0.68     64.16
epb0       3.58     890.15     6.4      1077.51
poli       2.27     65.87      1.59     5.34

Table 3
Results of right preconditioned QMR.

           RIFp                        RIFp-NS
Matrix     Itr   Itcosts   Tcosts     Itr   Itcosts   Tcosts
hor-131    †     †         †          20    261.16    325.61
cdde1      26    429.188   477.647    29    440.99    459.95
pde900     17    265.79    337.319    16    240.55    268.24
sherman1   14    210.89    283.76     14    212.12    253.81
sherman4   25    437.58    477.91     23    408.11    436.23
add20      6     52.68     92.814     7     60.57     84.29
add32      4     49.49     72.52      4     49.3      64.25
pde2961    26    406.96    476.87     29    440.42    473.01
fidap029   4     43.08     65.26      4     43.002    57.412
saylr3     14    210.36    283.23     14    211.588   253.278
cavity05   80    705.63    763.78     69    628.51    676.32
cavity06   249   2203.92   2275.44    252   2273.61   2329.73
rajat04    45    428.98    469.54     57    596.75    626.52
rajat12    57    559.64    596.61     52    536.23    558.51
raefsky1   44    266.48    305.39     37    225.67    261.88
raefsky2   86    576.37    643.57     50    346.8     410.96
epb0       98    2176.92   3067.07    82    2747.55   3825.06
poli       8     160.96    226.83     5     83.48     88.82

Table 4
Results of right preconditioned BiCGSTAB.

           RIFp                        RIFp-NS
Matrix     Itr   Itcosts   Tcosts     Itr   Itcosts   Tcosts
hor-131    †     †         †          8     137.56    202.01
cdde1      17    358.174   406.633    18    342.8     361.76
pde900     8     160.83    232.359    9     172.25    199.94
sherman1   10    186.16    259.03     10    187.52    229.21
sherman4   15    326.56    366.89     14    310.63    338.75
add20      3     32.28     72.414     4     41.42     65.14
add32      2     34.21     57.24      2     34.04     48.99
pde2961    15    297.28    367.19     16    305.07    337.66
fidap029   2     29.9      52.08      2     29.82     44.23
saylr3     10    185.73    258.6      10    187.08    228.77
cavity05   43    469.16    527.31     43    488.005   535.815
cavity06   175   1909.66   1981.18    190   2122.03   2178.15
rajat04    56    630.55    671.11     52    658.38    688.15
rajat12    44    481.42    518.39     42    491.4     513.68
raefsky1   31    216.96    255.87     26    183.84    220.05
raefsky2   †     †         †          38    316.34    380.5
epb0       137   3973.72   4863.87    88    4052.21   5129.72
poli       4     103.02    168.89     3     64.45     69.79


Table 5
Results of right preconditioned GMRES(10).

           RIFp                          RIFp-NS
Matrix     Itr   Itcosts    Tcosts      Itr   Itcosts    Tcosts
hor-131    †     †          †           2     166.96     231.41
cdde1      6     720.818    769.277     6     681.33     700.29
pde900     2     232.69     304.219     2     226.83     254.52
sherman1   2     248.77     321.64      2     249.65     291.34
sherman4   6     840.66     880.99      5     707.14     735.26
add20      1     71.76      111.894     1     70.81      94.53
add32      1     104.32     127.35      1     104.09     119.04
pde2961    5     577.117    647.027     5     565.09     597.68
fidap029   1     79.4       101.58      1     79.3       93.71
saylr3     2     248.009    320.879     2     248.88     290.57
cavity05   94    4912.19    4970.34     54    2900.24    2948.05
cavity06   427   21001.11   21072.63    †     †          †
rajat04    †     †          †           †     †          †
rajat12    43    3264.35    3301.32     30    2351.35    2373.63
raefsky1   38    1298.87    1337.78     13    447.21     483.42
raefsky2   †     †          †           22    848.65     912.81
epb0       †     †          †           49    10230.37   11307.88
poli       1     191.25     257.12      1     177.1      182.44

preconditioner is cheaper (in terms of total costs) than the RIFp preconditioner, and that for some matrices (like cavity06 and rajat04) there is no convergence with the RIFp-NS preconditioner. From the above discussion we can conclude that RIFp-NS is a robust and effective preconditioner for iterative methods.

5. Conclusion

In this paper a breakdown-free variant of an ILU preconditioner for NSPD matrices, based on the reformulation of the A-biconjugation algorithm, was presented. The new preconditioner, called RIFp-NS, can be computed without using the matrix A^T. We observe that the RIFp-NS preconditioner is applicable to nonsymmetric positive definite matrices and that no breakdown will occur during its construction. Numerical experiments show that the RIFp-NS preconditioner is as effective as the RIFp preconditioner at reducing the number of iterations of iterative methods. Numerical results indicate that the nonzero densities of the RIFp-NS and RIFp preconditioners are about the same and that for most of the problems the RIFp-NS preconditioner is cheaper (in terms of preconditioning costs and total costs) than the RIFp preconditioner. We can conclude that the RIFp-NS preconditioner can be a useful tool for solving large and sparse NSPD linear systems.

References

[1] O. Axelsson, L.Yu. Kolotilina, Diagonally compensated reduction and related preconditioning methods, Numer. Linear Algebra Appl. 1 (1994) 155.
[2] M.A. Ajiz, A. Jennings, A robust incomplete Cholesky-conjugate gradient algorithm, Int. J. Numer. Methods Eng. 20 (1984) 949.
[3] M. Benzi, M. Tůma, A robust incomplete factorization preconditioner for positive definite matrices, Numer. Linear Algebra Appl. 10 (2003) 385–400.
[4] D.S. Kershaw, The incomplete Cholesky conjugate gradient method for the iterative solution of systems of linear equations, J. Comput. Phys. 26 (1978) 43.
[5] T.A. Manteuffel, An incomplete factorization technique for positive definite linear systems, Math. Comp. 34 (1980) 473.
[6] M. Tismenetsky, A new preconditioning technique for solving large sparse linear systems, Linear Algebra Appl. 154–156 (1991) 331–353.
[7] L.N. Trefethen, D. Bau III, Numerical Linear Algebra, SIAM, Philadelphia, PA, 1997.
[8] H.A. Van der Vorst, Iterative solution methods for certain sparse linear systems with a nonsymmetric matrix arising from PDE-problems, J. Comput. Phys. 44 (1981) 1.
[9] M. Rezghi, S.M. Hosseini, An ILU preconditioner for nonsymmetric positive definite matrices by using the conjugate Gram–Schmidt process, J. Comput. Appl. Math. 188 (2006) 150–164.
[10] M. Benzi, M. Tůma, A sparse approximate inverse preconditioner for nonsymmetric linear systems, SIAM J. Sci. Comput. 19 (1998) 968–994.
[11] M. Benzi, J.K. Cullum, M. Tůma, Robust approximate inverse preconditioning for the conjugate gradient method, SIAM J. Sci. Comput. 22 (2000) 1318–1332.
[12] S.A. Kharchenko, L.Yu. Kolotilina, A.A. Nikishin, A.Yu. Yeremin, A robust AINV-type method for constructing sparse approximate inverse preconditioners in factored form, Numer. Linear Algebra Appl. 8 (2001) 165–179.
[13] A. Rafiei, F. Toutounian, New breakdown-free variant of AINV method for nonsymmetric positive definite matrices, J. Comput. Appl. Math. 219 (1) (2008) 72–80.
[14] Matrix Market web page. http://math.nist.gov/MatrixMarket.
[15] University of Florida Sparse Matrix Collection web page. http://www.cise.ufl.edu/research/sparse/matrices.