Computer Physics Communications 139 (2001) 186–191 www.elsevier.com/locate/cpc
DTORH3 2.0: A new version of a computer program for the evaluation of toroidal harmonics ✩ Amparo Gil a , Javier Segura b,∗ a Instituto de Bioingeniería, Universidad Miguel Hernández, Elche, Alicante, Spain b Departamento de Matemáticas, Universidad Carlos III de Madrid, Leganés, Madrid, Spain
Received 8 January 2001; accepted 9 March 2001
Abstract An improved version of the program DTORH3 to evaluate toroidal harmonics is presented. The code incorporates a uniform asymptotic expansion for Pνm (x) for large m which extends the range of applicability for large x of the previous version. In addition, the new version √of the program takes advantage of the dual behaviour of the functions P and Q incorporating a “dual” algorithm for 1 < x < 2 which improves the accuracy of the code in this range. 2001 Elsevier Science B.V. All rights reserved. PACS: 02.60.6f; 02.30.Gp Keywords: Toroidal harmonics; Legendre functions; Laplace’s equation; Toroidal coordinates
NEW VERSION PROGRAM SUMMARY
Catalogue identifier of previous version: ADKV Does the new version supersede the original program: yes
Title of program: DTORH3
Catalogue identifier: ADOH
Computers on which this or another recent version has been tested: Hewlett Packard, Model 715/100; SUN Enterprise 3000; Pentium II 350 MHz
Program Summary URL: http://www.cpc.cs.qub.ac.uk/summaries/ADOH
Operating systems under which the program has been tested: UNIX, Linux
Program obtainable from: CPC Program Library, Queen’s University of Belfast, N. Ireland
Programming language used: FORTRAN 77
Reference in CPC to previous version: J. Segura, A. Gil, Comput. Phys. Commun. 124 (2000) 104
No. of bytes in distributed program, including test data, etc.: 13 011
Version number: 2.0
✩
This program can be downloaded from the CPC Program Library under catalogue identifier: http://cpc.cs.qub.ac.uk/summaries/ADOH
* Corresponding author.
E-mail address:
[email protected] (J. Segura). 0010-4655/01/$ – see front matter 2001 Elsevier Science B.V. All rights reserved. PII: S 0 0 1 0 - 4 6 5 5 ( 0 1 ) 0 0 1 8 8 - 6
A. Gil, J. Segura / Computer Physics Communications 139 (2001) 186–191 Distribution format: tar gzip file Keywords: Toroidal harmonics, Legendre functions, Laplace’s equation, toroidal coordinates Nature of the physical problem We include a new version of our code DTORH3 to evaluate toroidal harmonics. The algorithms find their application in problems with toroidal geometry (see Refs. [1,2]). Method of solution The codes are based on the application of recurrence relations for P s and Qs both over m and n. The forward and backward recursions (over n or over m) are linked through continued fractions for the ratio of minimal solutions and Wronskian relations; the CF is replaced by series expansion and asymptotic expansion when it fails to converge. Summary of revisions (1) We consider two different algorithms for the evaluation of the functions P , Q depending on the argument x: √ • When 1 < x < 2 the called “dual algorithm” is applied (see LONG WRITE-UP).
187
√ • When x > 2 the called “primal algorithm” is applied. (2) In addition to continued fractions and series expansions, a uniM (x) uniformly valid for x form asymptotic expansion for P−1/2 at large M. As a consequence, the option MODE=2 in the previous version of the code is eliminated. (3) The range of evaluable arguments x can be further extended by using quadruple precision (if supported for the FORTRAN compiler). Restrictions on the complexity of the problem The maximum degree (order) that can be reached with our method, for a given order (degree) m (n) and for a fixed real positive value of x, is provided by the maximum real number defined in our machine. The user can choose two different relative accuracies (10−8 or 10−12 ) in the interval 1.0001 < x < 10 000 for all available values of the orders and degrees. The range for x can be further extended by using quadruple precision for the input x and related variables (see LONG WRITE-UP). Typical running time Depends on the values of the argument x, the orders (m) and the degrees (n). For more details see text: LONG WRITE-UP, Section 4.
LONG WRITE-UP 1. Introduction In Ref. [1] a set of codes to evaluate toroidal harmonics Pνm (x), Qm ν (x) were presented. These algorithms use recurrence relations combined with continued fractions and Wronskian relations. The relative accuracy obtained was of the order of 10−12 . The code failed to converge for large values of m and x or too small x (x < 1.001). As discussed in [1,2], the slow convergence of two continued fractions: HQ (ν, m, x) :=
Qm ν (x) = Qm ν−1 (x)
1 2ν+1 ν+m x
−
m ν−m+1 Qν+1 (x) ν+m Qm ν (x)
,
(1)
and HP (ν, m, x) :=
Pνm (x) Pνm−1 (x)
=
(ν − m + 1)(ν + m) √2mx + 2 x −1
Pνm+1 (x) Pνm (x)
(2)
was responsible for this restriction. The convergence of the continued fraction (CF) from Eq. (1) fails when x is close to 1 and ν becomes large while the CF from Eq. (2) converges slowly for large x and m. This “dual” behaviour can be easily understood by considering the following relation [2]: 2 1/4 n π 3/2 n Qm x −1 Pm−1/2 (x), n−1/2 (λ) = (−1) √ 2Γ (n − m + 1/2)
(3)
188
A. Gil, J. Segura / Computer Physics Communications 139 (2001) 186–191
√ where λ = x/ x 2 − 1. Using this equation one can easily relate the CFs from Eqs. (1), (2): HQ (n − 1/2, m, λ) = HP (m − 1/2, n, x)/(m − n + 1/2). As discussed √ in [2], the duality between P s and Qs suggests the use of an alternative algorithm for small x (1 < x < 2). This algorithm (the “dual” algorithm) can be obtained from the primal one replacing: P by Q, the recurrences over n by recurrences over m and the Wronskian relating consecutive orders (m) by the 0 Wronskian relating consecutive degrees (n). Then, the starting point will be the evaluation of the values P−1/2 0 and P+1/2 ; the forward recurrence over degrees for P s will be applied together with a CF for Q0N−1/2 /Q0N−3/2 and a Wronskian relation in order to obtain Q0N−1/2 , and so on. In this case, the four CFs that would come into play M−1 0 1 M are Q0N−1/2 /Q0N−3/2 , PN−1/2 /PN−1/2 , Pk−1/2 /Pk−1/2 , k = 0, 1, where M, N are the maximum values of orders and degrees, respectively. One can avoid the use of the CF(2) when M, x are large in the primal algorithm or the use of (1) when N is large and x is close to 1 in the dual algorithm, by considering series and asymptotic expansions. In [1] we combined M (x) in powers of 1/x 2 . The expansion was used in the region x > 5 the CF(2) with a series expansion for P−1/2 and x/M > 0.22 for a precision better than 10−12 . However, when M became too large, series suffer from large roundoff errors. This limitation can be avoided by applying a uniform asymptotic expansion for large m of Pνm [2].
2. Uniform asymptotic expansion for Pνm (x), m large In [2] the following uniform asymptotic expansion for large m of Pνm (x) was obtained: Pνm (x) ∼ (−1)m+1
1
∞
l=0
k=0
a2k+l sin νπΓ (ν + m + 1) ν ξ Kν+l+1/2 (mα/2) , 3/2 π m2k+l
(4)
where ξ = (x − 1)/2, α = ln[(x + 1)/(x − 1)] and Kν+l+1/2 (mα/2) are the modified K-Bessel functions. The coefficients ai can be obtained as explained in [2]. In the code, we use: a0 =
α/(eα − 1),
1 3 1 1 α+ α − α5 , 48 2880 120960 7 2 13 α − α4 , a2 /a0 = 7680 322560 571 7 1697 α− α3 + α5 , a3 /a0 = 1920 2580480 154828800 31 2 22763 α + α4 , a4 /a0 = − 64512 495452160 31 5501381 7691 a5 /a0 = − α+ α3 − α5 , 16128 30965760 261598740480 127 2 44593 α − α4 . a6 /a0 = 245760 519045120
a1 /a0 = −
(5)
M (x) are evaluated using the The modified Bessel functions K0 (x) and K1 (x) needed in the evaluation of P−1/2 code by Cody and Stoltz which is obtained from NETLIB [3].
A. Gil, J. Segura / Computer Physics Communications 139 (2001) 186–191
189
3. Improved algorithm In [2] we performed comparative studies of the rate of convergence of CF, series and asymptotic expansion. We √ M (x) for any x 2 (in the primal algorithm) is: concluded that a possible strategy for the evaluation of P−1/2 (1) Use the CF(2) for moderate x (x 20). M (x) using the series expansion for x typically greater than 5 but for small M/x 3. (2) Evaluate P−1/2 (3) Use the asymptotic expansion (4) when the two previous approaches can not be used. √ In the dual algorithm, we use an analogous “dual” recipe for the evaluation of Q0N−1/2 (x). If λ = x/ x 2 − 1 then: (1) Use the CF(1) for moderate λ (λ 20). N (2) Evaluate Q0N−1/2 (x) using (3) and the series expansion for P−1/2 (λ) for λ typically greater than 5 but for small N/λ 3. (3) Evaluate Q0N−1/2 (x) using (3) and the the asymptotic expansion (4) when the two previous approaches can not be used. m Therefore, the full algorithm for the evaluation of the set of toroidal harmonics {Pn−1/2 (x), Qm n−1/2 (x), x > 1, n = 0, 1, √ . . . , N, m − 0, 1, . . . , M} can be summarized as follows: • If x > 2 use the improved primal algorithm. The primal algorithm was described in [1,2]. The improved m version takes into account the three possible strategies for the evaluation of P−1/2 (x) described before: CF, series and uniform asymptotic expansion. √ • If 1 < x 2 use the dual algorithm, which is obtained from the primal one replacing: P by Q, the recurrence over n by recurrences over m, and the Wronskian relating consecutive orders (m) by the Wronskian relating consecutive degrees (n). The dual algorithm together with the use of uniform asymptotic expansion extends both the range of evaluable orders as well as the x-range. In addition, the range of evaluable arguments x can be further extended when the FORTRAN compiler supports √ quadruple arithmetics, due to the fact that the gradual loss of precision in the evaluation of the argument x/ x 2 − 1 when x approaches to 1 can be avoided. For such a purpose, the version 2.0 of DTORH3 incorporates now the following lines: CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC C IN ORDER TO ENLARGE THE RANGES FOR C C THE ARGUMENTS AND IF THE C C FORTRAN COMPILER SUPPORT QUADRUPLE C C ARITHMETICS, UNCOMMENT THE C C FOLLOWING LINE: C CCCCC REAL*16 QZ,QDC1,QQ and CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC C IN ORDER TO ENLARGE THE RANGES FOR C C THE ARGUMENTS AND IF THE C C FORTRAN COMPILER SUPPORT QUADRUPLE C C ARITHMETICS, UNCOMMENT THE C C FOLLOWING LINE: C CCCCC QDC1=(QZ-1.Q0)*(QZ+1.Q0) C AND COMMENT THE FOLLOWING LINE: QDC1=(QZ-1.D0)*(QZ+1.D0) CCCCCCCCCCC END THE COMMENT CCCCCCCCCCC
190
A. Gil, J. Segura / Computer Physics Communications 139 (2001) 186–191
4. Subprogram specification Same as for the previous version of DTORH3, except that the selection MODE=2 is eliminated because the restrictions in the ranges for m and x in the previous version [1] are not present in DTORH3 2.0.
5. CPU times and available ranges Let us now compare typical CPU times spent by the old (DTORH3) and new version (DTORH3 2.0) of the code. In Table 1 we show CPU times (in 1/1000 s) spent in a Pentium II 350 MHz for the evaluation √ of PL(M, N), QL(M, N), M = 0, 1, . . . , 50 and N = 0, 1, . . . , 50. The argument of the functions is less than 2 in all the cases considered in order to show the efficiency of the dual algorithm. For the evaluation of the P and Q function, the precision obtained is better than 10−12 for moderate arguments for the two versions of our algorithm. For x < 1.0001 there is a loss of precision for the previous version of DTORH3 which is very mild in the new version and can be eliminated by using quadruple precision in the calculation of the arguments. To illustrate the efficiency of the asymptotic expansion in the primal algorithm, we show in Table 2 CPU times (in 1/1000 s) spent in a Pentium II 350 MHz for the evaluation of PL(M, N), QL(M, N), M = 0, 1, . . . , 5000 and Table 1 CPU times (in 1/1000 s) spent in evaluating PL(M, N ), QL(M, N ), M = 0, 1, . . . , 50 and N = 0, 1, . . . , 50 with the codes DTORH3 and DTORH3 2.0 x
DTORH3
DTORH3 2.0
103 × CPU-t. M 50; N 50
103 × CPU-t. M 50; N 50
1.000001
8.61
2.33
1.00001
4.51
2.38
1.0001
3.04
2.34
1.001
2.56
2.35
1.01
2.34
2.35
Table 2 CPU times (in 1/100 s) spent in evaluating PL(M, N ), QL(M, N ), M = 0, 1, . . . , 5000 and N = 0, 1, . . . , 10 with the codes DTORH3 and DTORH3 2.0 x
DTORH3
DTORH3 2.0
102 × CPU-t.
102 × CPU-t.
M 5000; N 10
M 5000; N 10
250.
5.73 (CF)
5.71
500.
5.87 (CF)
5.72
750.
6.00 (CF)
5.72
1000.
6.15 (CF)
5.73
1100.
CF fails
5.71
A. Gil, J. Segura / Computer Physics Communications 139 (2001) 186–191
191
√ N = 0, 1, . . . , 10, and for a precision of 10−12 . The argument of the functions is greater than 2 in all the cases considered. In DTORH3, we set MODE=2. As can be seen in Table 2, as x increases, the CF(2) used in DTORH3 needs progressively more terms to converge; this is the reason why the code slows down. For x = 1100 the CF fails to converge (we have limited the number of iterations of the CF to 10000). This is an example of how the use of the asymptotic expansion instead of the CF(2) enlarges the range of evaluable orders in the primal algorithm.
Acknowledgements The authors acknowledge the use of subroutines from the NETLIB Public Domain Library.
References [1] J. Segura, A. Gil, Comput. Phys. Commun. 124 (2000) 104. [2] A. Gil, J. Segura, N.M. Temme, J. Comput. Phys. 161 (81) (2000) 204. [3] http://netlib2.cs.utk.edu/specfun.