Computer Physics Communications 50 (1988) 375—393 North-Holland, Amsterdam
375
NJGRAF AN EFFICIENT PROGRAM FOR CALCULATION OF GENERAL RECOUPLING COEFFICIENTS BY GRAPHICAL ANALYSIS, COMPATIBLE WITH NJSYM -
A. BAR-SHALOM and M. KLAPISCH Racah Institute of Physics, Hebrew University, 91904 Jerusalem, Israel Received 30 May 1987
NJGRAF calculates angular momentum recoupling coefficients by graphical methods. The results are sums of products of 6j coefficients. These can be evaluated several times with different values of the arguments. The number and the range of summation variables are minimal. It is intended to replace NJSYM, with a gain of computer time of up to two orders of magnitude.
PROGRAM SUMMARY Title of program: NJGRAF
No. of bits in a word: 60
Catalogue number: ABBY
Peripherals used: reader, printer
Program obtainable from: CPC Program Library, Queen’s University of Belfast, N. Ireland (see application form in this issue)
No. of lines in combined program and test deck: 5215
Installation: Hebrew University Computing Center Computer
1):
CDC Cyber 72;
Nature of the physical problem This program calculates a general recoupling coefficient for an arbitrary number (see below) of integer or half-integer angular momenta.
Operating system: NOS/BE Programming language used
2):
FORTRAN 77
High speed storage required: 21600 words
1)
2)
Keywords: atomic, nuclear, scattering, recoupling coefficients, Racah coefficients, 3nj symbols, graph theory, angular momentum, sum rules
This version has been used also on the Univac 1110 of Paris-Sud Informatique, Université d’Orsay, Orsay 91405, France, which as 36 bit words. The only modification was the suppression of the comment sign in front of IMPLICIT REAL * 8 in subrouting GENSUM and DRACAH. It also runs on the Cray installation of Lawrence Livermore National Laboratory, Livermore, California, USA, and at some other places in the USA and in Europe. After the program was sent for publication, the computer of the Hebrew University has been changed for a CDC 170/855, working under NOS. Originally written in FORTRAN IV, this version was made compatible with FORTRAN 77 by C. Johnson, at the Theoretical Chemistry Dpt., Oxford, England.
Method of solution The recoupling coefficient is first transformed into a structureless graph, from which zero-valued angular momenta are taken out, Then one or more “flat diagrams” are created and searched for minimal “n-loops” so as to generate the optimal expression as a sum over products of 6j coefficients. The latter are evaluated by a subroutine of Scott and Hibbert. Due attention has been paid to minimalizing the numbers of values that summation variables are given. Restriction on the complexity of the problem The number of angular momenta is limited only by the dimensions of the arrays; in the present version: 60. Typical running time See text. The test case took 0.05 s on CDC Cyber 72. The gain of time with respect to the original NJSYM is a factor between 5 and 50.
OO1O-4655/88/$03.50 © Elsevier Science Publishers B.V. (North-Holland Physics Publishing Division)
376
A. Bar-Shalom, M. Klapisch
Unusual features of the program NJGRAF is compatible with NJSYM package, including the use of separate calls to GENSUM. provided minor changes in the calling program are made. A logical array FREE is defined; if the value of a given angular momentum J. may change
/
General recoupling coefficients (is “free”) between calls to GENSUM. then FREE(J) should be given the value TRUE as input. In that case, the formula is established in NJGRAF without taking into account any particular value of this angular momentum. e.g. 0.
LONG WRITE-UP I. Introduction This program is intended to replace NJSYM. the well-known program of Burke [11for the cornputation of recoupling coefficients. The input and results are the same, and it can be “plugged in” with only very minor modifications. The program NJSYM is one of the cornerstones of today’s computer-based atomic and molecular physics. It handles accurately and securely the ubiquitous angular coefficients (the “algebra”) intervening in atomic structure calculation, transition probabilities, electron collision cross sections, photoionisation, etc., and is used as such in several packages published in Computer Physics Communications [2—121.We used it in particular as a part of Grant’s package [11]. NJSYM is divided into two features. Its first stage, the NJSYM subroutine proper, establishes the formula to be calculated using successive recoupling steps. The second stage, subroutine GENSUM, calculates the resulting sum of products of Racah- i.e. “6j “-coefficients; it can be called a number of times with different numerical arguments to evaluate the same formula. This structure makes it attractive for a wide range of applications. However, it appeared to us that the computations were far more time consuming than they should be, and when our computing budget was severely cut down, an effort was attempted to make the program more efficient. The outcome was the present new program, and while it was in the phase of extensive tests (1982), our attention was drawn to the paper of Scott and Hibbert [10] stressing also the necessity of improving the computing times of NJSYM. However, it seems from their paper that the gain of time that they achieve comes mainly from a
new version of the 6j coefficients, and from the possibility of entering GENSUM directly in all cases, once the formula has been established, and only numerical values of angular momenta are allowed to change. On the other hand NJGRAF is completely new and uses graph analysis to generate a better formula. Indeed, a time analysis of the same kind as the one of Scott and Hibbert [10] showed that the computing time of NJSYM for a given recoupling coefficient increased drastically with the values of the angular momenta involved. It became clear that the recoupling strategy employed by NJSYM may generate a formula which is unnecessary complex to evaluate. A simplc example will make the point clear. In the test case number 4 of the original paper [1], the result is a sum over products of three 6j’s (see section 2). The values of the summation variable (Ii~) depend upon the values of the “fixed” angular momenta; when these are large, the range of values of Jj() permitted by the selection rules will also be large, so that there will be many calls to the subroutine DRACAH which is the main consumer of computing time. However, as shown in our test case no. 2 (section 6) the above coefficient can be transformed according to the well known Biedenham—Elliott sum rule (ref. [14], p. 134, ref. [15]), ~ [x]
x
(
/
lv) a
d
x
f
q
1~a±h+e+d+e±f+p+q+r
[p
k
{,
=
‘
X
b d
c
x
// c
pJ e
—
e
‘
g
~
~.
J
q
r
a
(1)
The computing time of the r.h.s. of eq. (1) is
A. Bar-Shalom, M. Klapisch
much shorter and much less dependent on the values of the angular momenta. There are of course many such sum rules, and it is nearly impossible to keep them in memory and to try and identify them on the final form of the formula as it is represented in GENSUM. The only possible way of producing the best formula, in the sense of the minimal number and values of summation variables, is to use graphical methods, as developed and explained by Yutsis, Levinson and Vanagas [13] (referred to later as YLV), Brink and Satchier [14] (BS), Judd [15], El-Baz [16], Lindgren and Morrison [18], etc. Another very important simplification which can be handled only by graphical methods, is taking zero-valued angular momenta into account, such as in test cases 3 and 4. The corresponding branches of the graph are simply taken out, and a complex formula can be reduced drastically. At the least, some summation variables are suppressed. As zero angular momenta are very common, especially in non-relativistic atomic physics, this possibility alone gives a noticeable reduction of computing times. The problems that remain are the coding of the graph in the computer and its analysis. Classically [13,17], a graph can be represented by a matrix whose lines and columns designate the nodes, and the matrix elements refer to the branches connecting the nodes. Alternatively, it can be represented by a matrix whose rows are nodes and columns are branches. An element will then be 1 or 0 according to whether the branch is connected to the node or not. It can be shown [17],that optimal formulae can then always be obtained, but at the expense of performing lengthy operations on the matrices, However, we wanted to keep the time spent at the first stage establishing a formula — as short as possible, in order to make the program efficient even for the simple cases, which actually occur very frequently. We then looked for a compromise between the simplicity of the algorithm and the optimality of the generated formula. The representation found the so-called Flat Graph (FG), has this property. As a consequence, the time spent in NJGRAF building and analyzing the FG will always be —
—
/
General recoupling coefficients
377
shorter than the corresponding time in the original NJSYM, not to mention the time spent in GENSUM calculating the actual values; this is now quicker by one or two of magnitude. In the following (section 2) we shall recall briefly some results of the graph theory relevant to the program, then we shall explain the structure of NJGRAF (section 3), and its subroutines before passing to the description of common blocks and variables (section 4). Section 5 is devoted to the description of input to the driver and of the way of implementation. The test cases are described in section 6 and there is a short discussion in section 7. In this paper the words graph and diagram are synonymous. Actually, both words are used in the literature, e.g. one refers to “Yutsis graphs”, and to “Feynman diagrams”.
2. Results of the graph theory 2.1. NJSYM [1] is based on the concept of trees, each branch representing an angular momentum. Graphs are generated from trees by three successive simple operations. (i) Put on each branch an arrow such that the arrows will go “down” in one tree and “up” in the other, and signs on the nodes indicating the order of coupling of the angular momenta at each node (— for clockwise, and + for counterclockwise). Arrows and signs are necessary to obtain the correct phase factom. There are several conventions for the arrows. We use the one of YLY [13], namely that each branch has always one and only one arrow; this is the most convenient convention for programming. (ii) Bring together the lines of the two trees bearing the same name, in such a way that the arrows “follow the stream”, thus closing the graph. Suppress the spurious arrow. (iii) Multiply by a simple factor and a phase (YLV, eqs. 22.1 and 22.2). The graph thus generated is invariant under any transformation which conserves the order of the branches
378
A. Bar-Shalom, M. Klapisch
/
General recoupling coefficients
changed by multiplying the value of the graph by (— 1)11 around the nodes.Change The signofofdirection a node may of be an
+
~J2~J3
arrow involves a multiplication by
(_j)21,
2.2.
~o)
+
We now formulate a number of rules, proven in YLV, BS, etc., whose object is to simplify the graph — i.e. taking out some parts and multiplying the result by some prescribed factor until it reduces to a product of factors like fig. 1 which represent the “identical” recoupling —
((i1i2)13
Fig. 2. Disposing of a zero branch. The rectangle represents an arbitrary graph. C2 = ([j~ l[ifl) 1/28~ /2)8( j~,j~).
~
=
~
(1112)13). Fig. 3. Cut on
I line. C3
This is often called a “triangular delta” to stress that the values 1~ 12 and j~ should satisfy the triangular inequality 1i~12~
[~~IIIII::~ xC2
________
~
=
l(i1][131Y ~
0)8(/~.J7)
1/28(1
2.2.1. Fig. 4. Cut on 2 lines. C4
A zero branch can be erased. Disposing of the remaining nodes introduces delta functions and weights like shown in fig. 2.
I
2.2.2.
____ I
_______
([/j)
18(/7
,, 2
If two parts of a graph are connected by one and only one line, the angular momentum represented this linetomust a value of zero the graphbyreduces the have product of the two and disconnected parts — this is “cutting on one line” (fig. 3).
=
‘3
‘2)
,,
r—i
I
I 4~ _______
]
H
‘~U
Fig. 5. Cut on 3 lines.
2.2.3.
If two parts of a graph are connected by only two lines, the graph can be cut according to fig. 4. Similarly, fig. 5 describes the cut on three lines.
Cuts through four and more lines involve the addition of an extra summation variable. 2.3.
_____________________
Fig. 1. Diagram of the unit graph, “triangular delta”,
Another way of reducing the graph is locating “N-loops”, or circuits of order N — i.e. made up of N branches. This is actually the main method used in this program. Disposing of a 2-loop (bubble) is made according to fig. 6, and taking out a triangle introduces a 6j like shown in fig. 7. A loop of order 4 introduces a summation variable and two 6j ‘s as in fig. 8.
A. Bar-Shalom, M. Klapisch
I
- k
2 =
x 06
Ii
18(li, 12). Fig. 6. Disposing of a “2-loop” (bubble). C6
I
=
3
[l~l”
1+
—
~
~8
7
=
{
}.
7,~2/3
=
(_
~
±‘2~
0
x[k]{’’~}/~. t’4’2’3)
d
~
+
d
e
+
~
hi g fci
with one summation variable).
Fig. 8. Disposing of a “4-loop” (square). C8
b
5 6
~
33
~4120
Argument
able and a 6j. The latter is actually the only rule used implicitly in the original NJSYM. The superiority of the formula generated by NJGRAF comes precisely from the possibility of identifying 2-loops (delta), triangles (6j) and squares (two
.
14
Node index
Another relation used in the program is the one represented in fig. 9: the permutation of two branches with the addition of a summation van-
3~\
33
Fig. 7. Disposing of a “3-loop” (triangle). C7
-
379
xC
\
,/
—
/
34
General recoupling coefficients
Table I Mode of storage of a 9j coefficient in the “fiat graph” (FG) system
+
2
/
09
b
+L—T”
0
5’1”[fl(~j. (~bf~
The flat graph (FG) representation used here consists in ordering a list of nodes (tniads) in such a way that the second argument of a triad is the third of the following one. An example is shown in fig. 10 representing a 9j symbol (BS, p. 125). This amounts to choosing a path on the graph going through all the nodes, in order to “flatten” it. The data are ordered in the memory as shown in table 1. It is clear that the difference of indices of the two triads having in common the same first argument, plus one, gives the order of the loop where in table 1, this a is argument in a loop ofbelongs. order 4 —For 1 + instance, 1 4. More =
details are explained in section 3.
Fig. 9. Interchange of 2 branches. C9 = (— l)
h
_
3.
Structure of the program
C
3,1. The overall subroutine calling structure is shown in fig. 11. There are many subroutines, but most
Fig. 10. Flat diagram of a 9j symbol.
of them are not called a large number of times.
______
380
A. Bar-Shalom, M. Klapisch
/
[NJGRAF~
General recoupling coefficient.c (~-~(~~F
SET DIM TAB SET
~
SETTAB
ZERO DIAGRM
~ ~
CUTNL
~
~
NT~~
______
~OTHER
~‘
J
I
‘~‘~‘~ ~~~IBOR
SEARCH ~
INTAB
CUT2L
CUTNL
NC’ NC+!
~
DELTA
I
SEARCH
iELP~2
F
P~YGN
““‘
r—f~~AT
WAY
NFREE
NBTR
,~CHANGEj--.—r~
~RDft~E~
DIAGRM
CUTIL
______
ORDER
DELTA
--—--n
r—I~~~~l
LOLPOP
BUBBLE
TRIANG
QUARE
DELTA
NBNODE’NBNODE-2
TRDEL
~
BNODE)NFI N
Fig. II. Subroutine structure of NJGRAF.
Y
NC’NCP
NBTR>0
N UMP’ 3
CHVAR
VAR
SPRATE
Y ORDER
The logical flowchart of the main routine NJGRAF is displayed in fig. 12. We shall now follow this flowchart to explain the role of the subroutines. Subroutines not mentioned in the following are sufficiently commented in the program. 3.2.
The first step of the program (subroutine SETTAB) is to build a structureless graph which is just the application of section 2.1: the triads of the two trees J2(I, K) and J3(I, K) are just put in one array J23(I, K) and arrows are created. The sign of the node is just the expression of the order of the 3 arguments of each triad J23(I, 1), J23(I, 2), J23(I, 3). The numerical factor connecting graphs
GENSUM
OUT Fig. 12 Flow chart of NJGRAF.
to recoupling coefficients is established (cf. YLV, eq. 22.1). 3.2.1.
The EG structure is very convenient for searching and locating “N-loops”. However, it is not convenient for erasing zero branches, because this may break the continuity of the FG. For this reason, the FG is constructed after zero branches have been taken care of. It happens frequently that some angular
A. Bar-Shalom, M. Klapisch
momentum may have the value zero when NJGRAF is called, but will have different values in subsequent numerical evaluation of the same formula. In that case, if the calling program keeps the formula established by NJGRAF with the zero value branch suppressed, the results of subsequent calls to GENSUM will give uncorrect results. For this reason, the logical array FREE(J) is defined, If the user wishes to establish a formula that does not take advantage of a given angular momentum J being zero at first call, i.e. whose value is free to change between subsequent calls to GENSUM, then the corresponding FREE(J) should be given the value .TRUE. on input,
/
General recoupling coefficients
381
next node on the axis of the FG. If no one or both have this property, the choice is indifferent and the first found is chosen. 3.3.3. If the program hits a dead end (I~and 12 marked as first) it proceeds from the first node in the reverse direction. This is why the array JDIAG is dimensioned to double the number of triads.
The possibility of forming a FG is connected with the existence of a Hamiltonian path on the graph i.e. a path going once and only once through every node. It can be shown [YLV] that such a path does not always exist for recoupling coefficients. In such a case, or in the event that the above procedure does not succeed in building one FG including all the nodes, the FG obtained will be connected to the rest of the graph by a certain number of lines (NFREE) that will be left as “free ends” when the analysis goes on. The program will —
3.2.2. Then, in subroutine ZERO, the angular momenta for which FREE(J) .FALSE. are tested for zero. Delta functions are created (subroutine DELTA) and tested. If they are not satisfied (for FREE(J) .FALSE.), the recoupling coefficient RECUP is put to zero and no graph is constructed. =
=
then proceed according to one of the two follow3.3. In subroutine DIAGRM, the FG is constructed. The way of forming the FG is the following: upon arrival onto a node (or triad) (1~~ 12, 13) one has to choose between two possibilities (J~or 12) in order to proceed to the next node. That is to say, which one of (1i~12) will be put second in the triad. j~must be third according to section 2.4 above. We use the following rules: 3.3.1. An argument that was once first in a triad i.e. not on the main axis of the FG cannot be second. This character is marked by setting the variable IAL(J) to 1 in subroutine WAY. —
—
3.3.2. If neither Ii nor 12 were used as first argument yet, the two other triads involving j~, say j~,14, j~—and 12 say 12, 16~17— are checked to see if one of their two other arguments (14, 15) and (16, 17) were marked as first in the FG being built. If one triad has this property it is chosen to represent the —
ing possibilities: 3.4.1. NFREE is smaller than 3. It is then possible to perform a cut through the free lines without destroying the structure of the FG. Then the cuts are performed in subroutines CUT1L and CUT2L. The generated FG is ordinary and closed. 3.4.2. If NFREE is larger or equal to 3, it is generally not possible to perform a cut without breaking the FG already built. In such a case, CUTNL is called. The FG has now branches which are attached to one node only (“free ends”). The FG is then reduced without removing the free ends. 3.4.3. In both cases, it is necessary to iterate and construct another FG after the reduction of the present one, starting from the free ends if there are any, until all the triads have been included in FG’s.
382
A. Bar-Shalom, M. Klapisch
3.5.
/
General recoupling coefficieni.s
The possible effects of this limitation are discussed in section 7.1.
When the FG is set, one has to identify N-loops of the smallest possible order. This is achieved in subroutine SEARCH, which looks for N-loops of a given order NC. The principle is simple: If a first argument Ii appears in two triads indexed n and p. p > n, a loop exists of order p n + 1. This corresponds of course only to the simple case where all the nodes of this N-loop are contiguous in one group (NPART 1). When it is not so, one should look for all the ways of partitioning NC. Thus, the routine SEARCH recognizes the diagrams of fig. 13. For NC > 4, it locates only the N-loops partitioned in 1 or 2 parts (NPART 2), because only those will be reduced by the permutation process (section 2.3.1) to an N-loop of order 4 without breaking the basic structure of the FG. —
=
3.5.1. The routine SEARCH is in a loop on NC, initialized at NC 1. NC is incremented by 1 when SEARCH returns with FIND FALSE.. and so on, until the smallest N-loop is found. After this N-loop is taken out (according to section 2.3), the number of nodes remaining in the FG, NBNODE, is smaller by 2, and the smallest possible N-loop that may be found next is now of order NC 1. =
=
—
3.6. The general case of reducing a N-loop is handied in the routine POLYGN which uses the relations of section 2.3.1. in order to reduce the number of contiguous nodes (in both parts if NPART 2) to the minimum, i.e. 2 for the case of fig. 13a, 3 for the case of fig. 13b and 4 for the others. =
~
2
--
~--
3.6.!.
~--
The remaining N-loop routines BUBBLE (NC and SQUARE (NC 4). SEARCH, it is taken out
_______
3
=
a
--
4
--
~
LTA. 1
b
=
b
L7”~~~II1
is then taken out in the 2), TRIANG (NC 3) If NC I was found in in LOLPOP. =
=
3.6.2. During this process LOLPOP and BUBBLE create Ifa it delta involves function. amomentum, summation This is checked variable inand DEa “constant” angular the summation variable is declared fixed by putting SUMVAR(J) .FALSE. If the delta function involves two summation variables, one is replaced by the other in all its occurrences. If it involves two “constants” the values are checked (for FREE(J) .FALSE.).
e
=
fl
=
W
e
3.6.3. In this process, TRIANG may produce a new triangular inequality i.e. one which was not implied by the initial triads, like in the Biedenham—Elliott sum rule (see test case 3). In that case TRDEL checks that it is valid and returns RECUP 0 if not, without going to GENSUM. This —
/ e
Fig. 13. Sorts of N-loops recognized by subroutine SEARCH.
=
A. Bar-Shalom, M. Klapisch
is only when FREE(J) .FALSE. for the concemned angular momenta. =
3.7. If the present FG did not include all the nodes of structureless graph (section 3.4) another FG is built, by another call to DIAGRM. If there were free ends, it may be necessary to order the remaining triads, in subroutine ORDER, before that. 3.8. When the process is completed, the subroutine SPRATE separates the formula in products of sums, if a cut on 1, 2 or 3 lines was detected, either in DIAGRM or at a later stage in BUBBLE or TRDEL. Use is made of subroutine CHVAR which changes the order of the variables in the arrays defining the phase, weight factors and 6J coefficients. 3.9. The resulting formula is transmitted to the calculating routine GENSUM in the same form as in NJSYM, i.e. as arrays containing the indices of the angular momenta which contribute to phase factor (— 1)~:J6, to (2j + 1)1/2: J7 and arguments of 6j ‘5: JW. However, there are now additional arrays for (_1)2~~:J8, (21 + 1)_3~~’2:J9 and delta functions: LDEL. 3.9.1. The subroutine GENSUM itself has been deeply modified. Indeed, it was realized when measuring the running times that too much time was spent in GENSUM itself, besides the calls to DRACAH. Most of this time was spent in tests of values of the arguments of the 6J ‘s to check if the triangular inequalities are satisfied. This is because in the original GENSUM, the limits of the values of the summation variable were not accurately calculated, and tests were performed on all the arguments of the 6j ‘s in order to check whether the latter should be computed or not. The number of such tests is approximately: NTEST
=
4*N6J*NVALUE* *NVAR,
/
General recoupling coefficients
383
where N6J is the number of 6j ‘s in the formula, NVALUE is the estimated mean number of values for each summation variable and NVAR is the number of such variables. There are 4 triangular inequalities in a 6j. NTEST can easily be of the order of millions, and hence the time spent. The modifications brought here to GENSUM are by themselves responsible for a noticeable part of the total gain of time for large values of angular momenta. 3.9.2. The problem is to evaluate the limits of variation of each summation variable. There are three cases of limitation by triangular inequalities: limitation by two constants, by one constant and another variable and by two other variables. The last case, three variables linked together with no other limitation, implies that these may tend to infinity and we conjecture that it never happens in actual cases. On the other hand, if all the variables were limited by two constants only, the exact limits could be fixed trivially. Now let us build a matrix MAT(I, J) connecting the variables I and J. The diagonal MAT(I, I) will be the number of possible values of I due to limitations by constants. The non-diagonal element will be the value of the constant linking I and J. The idea is to transform the matrix in a sort of diagonalization in order to find the most stringent limitations to each variable. The same pair (I, J) may be connected by more than one constant — e.g. they may appear in several 6j’s. So MAT(I, J) (I > I) will be the number of links between I and J and MAT(J, I) the value of the link if MAT(I, J) 1. If MAT(I, J)> 1, the several values are stored in another array J12(L, I, J). Then the matrix is studied, fixing first the limitations due to the diagonal, and then iteratively, giving to JSUM(1, I) and JSUM(2, I) the minimum and maximum values of I as influenced by the others. In the actual loop, I is initialized at JSUM(1, I) and the matrix MAT is used to check whether there are some more stringent conditions when all the variables are given their values. =
384
A. Bar-Shalom, M. Klapisch
3.9.3. The subroutine DRACAH, computing the 6j ‘s, was also modified. (It actually computed W coefficients, which differs from 6j’s only by a phase factor). However, our subroutine was then compared with the modification proposed by Scott and Hibbert [10], and was found to be 20—30% slower. For this reason, the subroutine of Scott and Hibbert is used in this version. The logarithm of the factorials are computed in subroutine FACT’T and stored in common block FACTS.
/
General recoupling coefficients
4.3. COMMON/CUTDIG/ contains only the logical variable CUT which is. TRUE. if a cut was detected. 44 COMMON/GRAPH/ relates to the FG.
—~
4. Description of common blocks and variables
—
4.1.
—
JDIAG (50, 3): arguments of the FG (can proceed either ways, starting at JDIAG(NBTR)). ARR (50, 3): (integer) arrows in the FG. TAB1 (60, 2): index of the 2 triads where a given J occurs as first argument. IL (50): current index (pointer) of an active triad in the FG when the FG is reduced, only —
COMMON/TIMEG/ transfers information to the DRIVER regarding time spent in the various subroutines,
these numbers change, the triads staying actually in the memory at their place even after reduction. IH (50): index of the original place of a triad in the FG as in the building of the FG. NPOINT: array containing the list of points (triads) pertaining to a given N-loop (fixed in SEARCH). NBNODE: number of active nodes currently in the FG. IFIRST: index of the first active triad (relative to IH). ILAST: index of the last active triad. IPARTS: number of nodes in the smaller part of the N-loop. IPARTL: same for the largest part. NPART: number of parts of the N-loop (1 or 2). ICROSS: (when NPART 2). If two lines cross ICROSS —1, if not +1. If NPART~2, ICROSS 0. NFREE: number of free ends on the FG. JTFREE: index of the nodes where there are free ends. NFIN: number of nodes with free ends remaining after the cut. NC: order of N-loop currently being analyzed. 4.5. -~
—
— -
—
NOGRAF: number of times NJGRAF was called. TGRAF: time spent in NJGRAF. NOGEN: number of times GENSUM was called. TGEN: time spent in GENSUM. NORAC: number of times DRACACH was called. TRAC: time spent in DRACAH.
—
—
—
4.2.
—
COMMON/TREE/. This common block relates to the structureless graph formed in SETTAB.
—
—
=
=
—
—
—
J23 (24, 3): arguments of the triads. ARROW (24, 3): (integer) arrows of the arguments: —1 goes into the node, +1 goes out. LINE (60, 2): gives the index of the two triads in which a given J appears in .J23. LCOL (60, 2): gives the position inside the triad of the latter. TABS (24): (logical) is .TRUE. when the triad is absent i.e. taken out of J23 by the process of building the FG. NBTR number of triads remaining in J23.
=
—
—
—
COMMON/COUPLE/: NJGRAF.
essentially data for
A. Bar-Shalom, M. Klapisch
—
—
—
— — —
M: number of angular momenta, 60. It is, at a stage, augmented by one having value zero, in order to code easily ~(j, o). N: number of triads in each of the original trees, 12. J1(60): value of the angular momenta, actually 2j + 1, in order to avoid half integers. J2(12, 3): left hand side tree, J3(12, 3): right hand side tree. FREE(60): (logical) .FALSE. if it is to be checked for zero while building the FG.
/
General recoupling coefficients
385
4.9. COMMON/DIM/: set in SETDIM and stores the maximum lengths of some arrays (see common block ARGU). —
J6CC, J7CC, J8CC, J9CC, JWCC, JDELC: maximum value of J6C, J7C, J8C, J9C, JWC, JDEL, respectively.
=
4.6. COMMON/ARGU/: arguments to be transmitted to GENSUM.
COMMON/BUILD/: set in DIAGRAM. Relates to the building of the FG. —
— — —
— — — — —
— — — — —
—
J6C: number of arguments for (21 + 1)~2. J7C: number of arguments (— 1)’. J8C: number of arguments for (_1)21. J9C: number of arguments for (2j + 1)1~’2. JWC: number of 6j coefficients (W). J6(120): arguments for (2j + 1)1~/2. J7(150): list of arguments for (—1)~. J8(120): arguments for (—1)2g. J9(40): arguments for (2j + 1)_1/2. KW(6,20): arguments for 6j’s. JDEL: number of delta functions. LDEL(20, 2): arguments of delta functions. SUMVAR(60): (logical) .TRUE. if this is a summation variable. MP: index of the last variable.
—
NAMSUB: name of the subroutine to be printed ‘n PRINTJ 1
4.8.
4 11 COMMON/JRAC/: arguments of a 6j transferred from GENSUM to DRACAH. 4 12
COMMON/ SUMARG/: set in SPRATE and contains information for partial, separated, sums to be evaluated in GENSUM. —
COMMON/KEEP/: contains information from CUTNL necessary to ORDER.
—
—
JKP(2, 3): triads involved in performing a cut. JARR(2, 3): arrows of the corresponding angular momenta. 1T2, 1T3, 1T5: position of the cut.
J6P, J7P, J8P, J9P, JWORD: correspond to J6, J7, J8, J9, KW of ARGU.
— —
—
—
—
=
COMMON/NAM: character type. —
—
IAL(50): initialized to 0, and has value 1 or 2 see section 3.3. IF1, 1F2: position of the arguments at the endtriads of the FG. NODE: position of the current node.
NBJ(10): the present number sum. of summation variables up to NB6J(10): number of 6j in each sum. K6CP(10), K7CP(10), K8CP(10), K9CP(10): number of arguments in the arrays J6P, etc., in each sum, JSUM6, indices of the 6j ‘s in each sum, JSUM5, position of the summation variables in each 6j, JSUM4 type of triangular relation for a pair of summation variable in a 6j, INV6J, index of a 6j in which a given summation variable appears.
A. Bar-Shalom, M. Klap~sch/ General recoupling coefficients
386
4.13.
5.1.6.
COMMON/ZER/: routine ZERO.
NGENSM: number of times GENSUM is to be called, in addition to the first time initiated by the main program NJGRAF.
—
—
information
from sub-
NZERO: number of zero valued momenta. JZERO(20): index of these,
angular
4.14.
5.2. Implementation
COMMON/ FACTS/: From FACIT to subroutine DRACAH. —
5.1.7. NGENSM cards like 5.1.4 with the new values of the angular momenta.
subroutine
GAM(500): logarithms of the factorials of the 500 first numbers.
5. Input to driver and implementation of NJGRAF 5.1. Input for the DRIVER: each of the following cards in format 2413, except 5.1.5.
5.2.1. When inserting NJGRAF instead of NJSYM, the DRIVER has to be removed and the variables in the common block COUPLE have to be set in the calling program. Usually, the calls to NJGRAF will be inside a loop on some angular momentum value, for instance the rank of a tensor in the Hamiltonian. It is convenient to call NJGRAF only on the first time the loop is performed in order to establish the formula, and call GENSUM directly the other times, to evaluate numerically the formula for this particular set of values of angular momenta. One should then declare for the angular momentum that is looped on, FREE(J) TRUE., whereas for all the angular momenta which are not changing in the innermost loop, FREE(J) FALSE. This procedure gives always the correct result, but does not give the simplest formula for the case that the loop angular momentum is zero. Other choices are possible if it is felt that more time can be gained by putting FREE(J) FALSE. for some or all the variables, but them GENSUM cannot be called directly. =
5.1.1. M, N, IBUG3. M: number of angular momenta. N: number of triads in each side of the recoupling coefficient. IBUG3: print option: I, for printing the full intermediate results; 0, prints only the final result. = =
=
=
-.
7 .~.
.
((J2(L, I), I 1, 3). L the left hand side. =
=
.
.
1, N): list of triads in 5.2.2
~J 3
The information transfered from NJGRAF to
((J3(L, I), I the right side.
1, N): list of triads in
GENSUM is in the common blocks ARGU and SUMARG.
5.1.4. J1(I), I 1, M: values of the angular momenta, actually as 2j + 1.
5.2.3. Other information necessary to NJGRAF includes the definition of the input/output files and the print option IBUG3.
-.
=
1, 3), L
=
=
5.1.5. FREE(I), I 1, M: logical variable inhibiting the suppression of a zero valued line in the graph, if given the value .TRUE. Format 24L3. =
5.2.4 The system subroutine SECOND(T) is called in NJGRAF, DRACAH and GENSUM to give time
A. Bar-Shalom, M. Klapisch
/
General recoupling coefficients
spent in various part of the program. For good performances, it should be bypassed in DRACAH, since the call to this subroutine might be more time consuming than DRACAH itself.
387
9
8 I 8
~~
~
‘9
The subroutine FAC~ should be called at the beginning of the main program, to load the common block/ FACTS/ with the values of the logarithm of the factorials of the number 1 to 500.
6 6 12
I
2
11
I3j
Cx
6. Description of the test cases 6.1. Test case 1
Il
Identical to test case 1 of NJSYM[I]. As it is short enough, the total output is reproduced, to illustrate the use of subroutine PRINTJ. The re-
,
13/T\
8A\. 13
7
coupling coefficient is: ((1112)15’(1314)16’17 I
Cx
J1’((12J3)38’14)9,17)
6
2
and the result is 1/2. 6.2. Test case 2 This is the same as test case 4 of NJSYM, with all angular momenta equal to 1. The recoupling coefficient is:
Fig. 14. Graph of test case number 2. Numbers are labels of angular momenta. (a) As it is built; 1~(i (b) after disposing of the loops of order 2, C = ([i12l[isI[i6IY 6~ .J9)~(.l5~J8)~(j~2. I14)~dotted line indicates cut on three lines; (c) After the cut, a product of two 6j ‘s is obtained: (111119113l (J11J19113 tJ~81216)
~J17 ~
(((1112)15,((1315)IiS,(118119)136)112)16’14’ 17 I((i1i2)18~((i3~(119is5)i17)ii3’1i8)iI4)i9’ 6.4. Test case 4
34,37)
The graph of this recoupling coefficient is shown in fig. 14 (without arrows for simplification). It illustrates the recognition by the program of the Biedenharn—Elliott sum rule. One obtains a product of two 6j ‘s without a sum. The loops of order 2 give only delta functions. 6.3. Test case 3
This is a very complicated case devised to check all the parts of the program, including cuts between the general graph and the FG. Fig. 15 shows the graph: a 9j and a 12j connected by 3 lines the dotted line shows the cut into the 2 trees. Here the gain of time with respect to NJSYM is tremendous. The computation with all the arguments equal to 4 (except j~ 0) took 3400 s with NJSYM and 0.028 s with NJGRAF on CDC Cyber 72. With no zero argument the time is 0.25 s. The gain of time is due to the smaller number of summation variables and the detection of cuts. It should be noted that all the times indicated here —
=
Same as preceding, with some arguments equal to zero like in NJSYM. The simplification is drastic: only delta functions are obtained, 8(j~~’ 119) explains why 119 2 gives 0. =
388
A. Bar-Shalom, M. Klapisch
/
General recoupling coefficients
25 2~\
29
4
26
/3
5/
/
~
I1~
6
/
6
/
2 —
~
~9 I
Fig. 15. Graph for test case number 4. Numbers are label of angular momenta.
were measured with the print option IBUG3 set to zero.
7.2. Running times Table 2 shows some running times on a CDC Cyber 72. for the two programs on the same configuration 3d84p2. Relativistically, d8 is d~/: d~/ 2 is P~2 + 2+ d~/2d~/2 + d~/2d~/2and p P1/2P3/2+P~/2. The entries of the columns are self-explanatory. The fourth column is the average time per coefficient in milliseconds. This table shows very clearly that the higher the angular momenta, the more NJGRAF is efficient with respect to NJSYM. For the last two lines, the introduction of the time measuring subroutines lowers the ratio. The times shown here include only the time spent in that part of the program (not the total time of MCP package).
7. Discussion 7.1. Checks NJGRAF has been inserted in the MCP package of Grant [11] and used to compute the angular matrix elements of the relativistic Hamiltoman for atomic configurations. Over a period of a few months, it was used routinely in parallel with the old program, and the resulting coefficients checked by a program comparing the two results. Some errors have been detected this way. Some other errors were recently pointed out to us by I.P. Grant and C. Johnson and corrected. We are confident that the present version should be correct.
Table 2 Comparison of running times of NJGRAF and NJSYM for matrix elements of the Hamiltonian for the configuration 3d84p2 for different total angular momenta (J). Time per coefficient is an average in milliseconds J
Number of coefficients
Total time(s) in NJGRAF
Time(ms) per coefficient
Total time(s) NJSYM
Ratio
0 1 2 3 4 5 6
205 673 1159 977 692 220 76
1.89 8.08 14.56 13.17 8.81 3.54 1.13
9 12 13 13 13 16 15
15.23 125.76 354.82 446.64 352.13 129.17 34.82
8.1 15.6 24.4 33.9 40.0 36.6 30.8
A. Bar-Shalom, M. Klapisch
/ General recoupling coefficients
389
7.3. Optimally
Acknowledgements
The above examples show that the compromise chosen between optimality and simplicity was justified. It was shown in section 3.5 that all loops of order 4 partitionned in at most two parts are located by SEARCH and subsequently reduced, This fact makes the program quite efficient, but some cases where the smallest loop is of order 4 partitionned in three or more groups, are not optimally reduced. Moreover, the first N-Loop encountered is reduced, without checking whether reducing another loop of the same order would have given a better result. On the other hand, the program begins the building of the flat graph arbitrarily at the first triad that appears in the input. A better formula could be obtained in some cases by starting at another triad, especially if more than one FG were necessary. Improvement on the above points would require a much more sophisticated program, at the expense of computing time in simple cases. On the other hand, if the program is to be used extensively in complicated cases, it would probably pay off to provide for a special subroutine for 9j’s, and modify SEARCH accordingly to recognize the appearance of these.
The authors wish to acknowledge with thanks for the numerous remarks of I.P. Grant making the manuscript clearer and pointing out errors in the program, and for the help of C. Johnson in finding and correcting some errors and making the program compatible with FORTRAN 77.
7.4. Use of FREE If NJGRAF is to be used in cases where many arguments are zero, it may be worthwhile to find a trade-off between calling several times NJGRAF with all FREE(J) .FALSE., and calling only one time NJGRAF with FREE .TRUE. and several times GENSUM, for each formula. This seems to be the case when working in LS coupling, due to the very frequent occurrence of s electrons (1 0) and of intermediate S 0. =
References [1] P.G. Burke, Comput. Phys. Coinmun. 1 (1970) 241. [21PG. Burke, Comput. Phys. Commun. 2 (1971) 173. [3] A. Hibbert, Comput. Phys. Commun. 2 (1971) 181. [4] I.P. Grant, Comput. Phys. Commun. 5 (1973) 161. [5] A. Hibbert, Comput. Phys. Commun. 9 (1975) 141. [6] A. Hibbert, Comput. Phys. Commun. 10 (1975) 436. [71R. Glass and A. Hibbert, Comput. Phys. Commun. 11 (1976) 125. [8] K.A. Berrington, P.G. Burke, J.J. Chang, AT. Chivers, W.D. Robb and K.T. Taylor, Comput. Phys. Commun. 8 (1974) 149. [9] K.A. Berrington, PG. Burke, M. Le Dourneuf, W.D. Robb, K.T. Taylor and Vo Ky Lan, Comput. Phys. Cornmun. 14 (1978) 367. [10] N.S. Scott and A. Hibbert, Comput. Phys. Commun. 25 (1982) 189. Also N.S. Scott and K.T. Taylor, Comput. Phys. Cornmun. 25(1982) 347. [11] I.P. Grant, B.J. McKenzie, PH. Norrington, D.F. Mayers and NC. Pyper, Comput. Phys. Commun. 21(1980) 207. [121 C. Froese-Fisher, Comput. Phys. Commun. 14 (1978) 145. [13] A.P. Yutsis, lB. Levinson and V.V. Vanagas, The Theory
[14] [15]
=
=
=
[16] [17] [18]
of Angular Momentum (Israel Program for Scientific Translation, Jerusalem, 1962). D.M. Brink and G.R. Satchler, Theory of Angular Momentum (Clarendon Press, Oxford, 1968). BR. Judd, Operator Techniques in Atomic Spectroscopy (McGraw-Hill, New York, Maidenhead, 1963). E. El Baz and B. Castel, Graphical Methods of Spin Algebras (Marcel Dekker, New York, 1972). Y. Bordarier, Ph.D. Thesis, Orsay (1970). I. Lindgren and J. Morrison, Atomic Many Body Theory (Springer Verlag, Berlin, 1982).
390
A. Bar-Shalom, M. Klapisch
/
General recoupling coefficients
TEST RUN OUTPUT *
TEST CASE hR.
INPUT: 621 124436 235156 542253 T T T T
T
1
*
T
PRINT OUT AFTER CALLING SUBROUTINE J23
NBTR1= +
1
1
2
2
3
3
4
4
4
—
124 +
+
436 —
—
+
235 —
—
+
156
L2 K2 41 4 2
J Li <1 1) 1 1 5) 3 3
+
2) 6)
16=
4
6
5
6
18=
1
4
2
1
19=
6
6
12 2 3
5
31 4 3
J23
NBTR1= +
1
2
2
3
3
4
4
22
32
4)
13
21
22
32
4)
13
21
ZERO
4
—
124 +
+
—
436 —
—
+
235 —
—
+
156
L2 1<2 4 1 4 2
J Li Ki 1) 1 1 5) 3 3
+
3)
6
PRINT OUT AFTER CALLING SUBROUTINE
1
SETTAF:
2) 6>
16=
4
6
5
6
18=
1
4
2
1
19=
6
6
12 2 3
5
6
31 4 3
3)
A. Bar-Shalom, M. Klapisch PRINT OUT AFTER CALLING SUBROUTINE NBNODE= IFIRST= IL
IH
1
4
2
5
3
6
4
7
4 4
/ General recoupling coefficients
391
DIAGRM
NBTR= 0 ILAST= 7
NFIN= 0 NFREE= 0
JDIAG +
+
-
124 +
—
532 +
+
—
463 —
+
—
156
TAIII 1)
4
7
SUMVAR=
iF
5> 2F
3F
4F
U23
~i Li 1<1 1) L 1 5) 3 3
5
7
SF
NI3TR1=
L2 1<2 4 1 4 2
2> 6)
2 3
1 2
16=
4
6
5
6
17=
2
3
5
4
3
6
18=
1
4
2
1
5
6
19=
6
6
6F
4)
4
6
7F
4
3 4
1 3
3)
2
2
3
2
4)
1
3
2
392
A. Bar-Shalom, M. Klapisch
PRINT
OUT AFTER
CALLING SUBROUTINE
NBNODE=
4 4
IFIRST= IL
IH
1
4
2
5
3
6
4
7
NBTR= ILAST=
/
General recoupling coefficients
SEARCH 0
NFIN= 0 NFREE= 0
7
.JDIAG
+
~1~
1
24
+
—
—
532 +
+
—
463 —
—
1
56
+
TABI 4
1)
7
SI 2F
SUMVAR=
iF
NC= 3
NPART=
NPOINT=
4
6
3F
5 4F
1
7
SF
4)
6F
IPARTS=
4
6
5
6
17
2
3
5
4
3
6
18=
1
4
2
1
5
6
19=
6
6
PRINT OUT AFTER CALLING SUBROUTINE NBNODE= IFIRST= IH
1
5
2
7
2 5
NBTR 0 ILAST= 7 JDIAG
+
—
532 +
+
—
253
TABI 5)
5
SUMVAR=
7
2>
iF
2F
3F
5 SF
4
6
5
6
17=
2
3
5
4
3
6
18=
1
4
2
1
5
6
19
6
6 6i)
5
3
7
4F
16
KW(ARG. OF 6
1
ICROSS=
7
16
IL
6
7F
2
IPARTL=
4
1
6F
7F
6
4
2
6
TRIANG NFIN= 0 NFREE= 0
0
A. Bar-Shalom, M. Klapisch PRINT OUT AFTER CALLING SUBROUTINE NBNODE= IFIRST= IL
IH
1
5
2
7
2 S
/
General recoupling coefficients
SEARCH
NBTR= 0 ILAST= 7
NFIN= 0 NFREE= 0
JDIAG +
—
532 +
+
—
253
TABI 5)
7
5
2)
SUMVAR=
iF
2F
3F
NC= 2
NPART= 1
NPOINT=
5
7
5 4F
SF
6F
7F
IFARTL= 1
IPARTS= 1
7
16=
4
6
5
6
17=
2
3
5
4
3
6
5
18=
1
4
2
1
5
6
6
I9=
6
6
KW(ARC. OF 6J1
5
3
1
4
3
2
2
6
PRINT OUT AFTER CALLING SUBROUTINE 5
4
2
2
I6=
4
6
5
6
17=
235436532532
18=
1
4
19=
6
6
2
KWIARG. OF 6~J) 5
5
1
3
5
3
1
6
1
N.JGRAF
6
4
5
2
6
PRINT OUT FROM SUBROUTINE GENSUM VALUES OF ANGULAR MOMENTA IN *REAL* FORMAT 2.0 1.5 .5 .5 2.0 1.0 NUMBER OF INDEPENDENT SUMS RACAH W FUNCTIONSIoU) ARGUMENTS IN *REAL* FORMAT NOT INVOLVING SUMMATION VARIABLE 2.0 .S 2.0 .5 1.5 1.0 NO SUMMATION. RECDUPLINC COEFFICJENT= RECOUPLING CDEFFICIENT= .70710o78
U.~LUE .22360650 .70710s72
SUMMARY FROM DRIVER SR. OF CALLS TO NJGRAF:
1
TIME SF-ENT:
.OSoO
SR. OF CALLS TO GENSUM:
1 TIME SPENT:
.0050
SR.
1
.0000
OF CALLS
TO DRACAM:
TIME SPENT:
ICROSS= 0
393