Pergamon
0045-7949(94)00370-x
EFFICIENT
Compurm & Srructurps Vol. 54. No. 4. pp. 731-742. 1995 Copyright (’ 1995 Elsevier Science Ltd Prmted ,n Great Bntam All rights reserved 0045.7949195 $9.50 + 0.00
LARGE SCALED PROFILE COMPUTATION PROGRAMS
MATRIX
M. S. Park Department
of Mechanical
Engineering,
373-l Kusong-dong,
Korea Advanced Institute of Science and Technology, Yusong-ku, Taejon, 305-701, Korea
(Received 12 July 1993) Abstract--Several efficient programs written in C that handle large scaled matrices, which are usually stored out of the core memory, are provided. A compacted sky-line matrix storage scheme is used for the symmetric coefficient matrices and Grout elimination is used for the solution of linear systems without core blocks. Scalar machine matrix multiplication, canonical multiplication and solutions of nonsymmetric augmented linear system routines are provided, which are easily modified to the vectorized machines. Numerical tests are done by personal computers and engineering workstations. Efficiency of storage and time is considered good, compared to the full in-core routines.
INTRODUCTION
whose size is relatively small compared to its matrix. When the block sky-line method is used, the memory size for the whole matrix storage is smaller than that of the frontal method but the size of core blocks and its accessing time is considerably larger [4]. In this study, the sky-line storage scheme, which is easy to implement on the vectorized machines, is adopted and computation algorithms that use small buffers instead of core blocks are implemented. Routines for multiplication, canonical multiplication, Crout variation of Gaussian elimination and solution of augmented non-symmetric system equations are coded in C and their numerical tests are undertaken by personal computers and engineering workstations. Their performances are also discussed.
Large amounts of memory are required for the analysis of structures and design sensitivities of the structures by finite element method. Almost all of this memory is to store the stiffness matrix, the mass matrix or their sensitivities to the design parameters. These matrices are usually symmetric and have a narrow band near the principal diagonal, so the efficient methods for storing ‘at least’ required coefficients are used. Nevertheless, cases when it is not sufficient are very common in practice. For example, a shell structure having 20,000 dofs needs 160 MBytes of memory (20,000*1000*8 Bytes), though the band matrix with its semi-bandwidth of 1000 is used. Similar amounts of in-core memory are rarely available for normal machines and the out-of-core remedy is necessary. The computation time by using such large scaled matrices is even more. Most of those matrix computations are categorized by multiplications and solution of linear systems of equations. In eigenvalue analysis, for example, such as natural frequency analysis or structural buckling analysis, matrix multiplication and solution of linear systems are required many times (or iteratively). For the band symmetric matrix of n x b, the number of scalar multiplications is nb for one matrix by vector multiplication, nb*/2 for triangularization of the matrix and nb for back substitution of a triangular matrix [ 11. If the matrices reside out of the core, most time is spent by I/O rather than CPU. So the program should handle I/O carefully and attain a minimum access time. The frontal method [2] and the block sky-line method [3] exist for the solution of linear system equations for the out-of-core matrix. Both of them use direct elimination and require ‘core blocks’
Compacted
matrix
storage
scheme
Assume the symmetric coefficient matrix A has the ‘profile’ or ‘sky-line’ shown in Fig. 1. Where elements of the matrix A which are located outside of the dashed line envelope have zero values, those elements are unused during any matrix calculation, i.e. multiplication, factorization and substitution. Also, the envelope or profile remains the same under matrix operations. So we store the matrix in a compact one-dimensional array, which contains the elements that are located inside the envelope and the lower triangular portion by rows. {a} = {A,,, A,,, A,,, A,,, A,3, A.,,, A,,
. . , A,,}. (1)
One piece of additional indexing is the following
information for the matrix (N + 1) integer array:
{L}={O,l,3,5,7,11,15}. 731
(2)
M. S. Park
132
pit
Matrix
$22
Usual lowing:
multiplication--Y matrix
= AX
multiplication
routines
are the fol-
A=
44 44
i%, Fig. 1. A profiled
symmetric
coefficient
matrix
Corresponding statements:
A.
loops
are similar
to the following
For i=1,2,...,N This array is a pointer array to locate diagonal elements. If the computer program is written in C, then array (1) may be a zero based indexing one and array (2) should contain the following values: {L} = {-I, Furthermore, ing indexing
0, 2,4, 6, 10, 14).
the array {L} is preferred scheme:
arithmatic
is performed
:
of the matrix
A:
A profile of the matrix varies not only with problems but also with node numbering methods. When the problem is specified and the finite element mesh is defined, the profile reduction algorithm may define the improved node numbering based upon the element connection given by the user. Then the array {L} in (4) can be determined before analysis as follows: L_,=
Keeping tracks of the loops (9) the sky-lined array (I) should be positioned in order, as in Fig. 2. If the array (1) resides out of the core memory, a positioning time like Fig. 2 prevails for the large scaled
: a[L[i]]
term
index of the ith row:
(i, j) element
(9)
Y,, + Y,, + A ‘,x,,
as follows:
width of the ith row starting
Forj=1,2,...,N
in the follow-
(4)
ith diagonal
Y&+-O
(3)
L[-1]=-1,L[0]=0,L[1]=2,...,L[5]=14. Then the indexing
Fork=1,2,...,M
B(i)=L[i]-L[i-l] S(i) = i -(L[i]
(9
- L[i - I]) + 1
a[S(i) + j]. coefficient array. So the loops must be modified for the minimum accessing time, that is, by the profile oriented loops. Expanding eqn (8) by using symmetry of the matrix A,
-1
for i = 0, 1,.
Y,,, = A,,, X,, + 4,:Xx
,N - 1
B, = max (i -j
i
+ l),
A, # 0,
L,=L,-,+ll,.
+
(10)
+ A,,,,Y,,,
j < i
(6)
In the program developed, large scaled array {a} is stored in a disk media with two small work arrays residing in the core having the size of Mband.
‘\
\
X
Mband = max (B,).
,
(7)
Fig. 2. Track
pointer
of the array plication (8).
\
>
{a} by the multi-
Matrix computation programs Equation (10) shows two parts of array {u}, the lower part and the upper part, and the following routine is proposed for the proper multiplication named in MMULO.
733
time, multiplication (12) is required. Here the multiplication routine is modified to use the Crout version of the factorized form of A. A=LDLT
and
S=X%DL’X,
(14)
{ w-0 where L is a lower triangular matrix with the diagonal terms of unity and D is a diagonal matrix. Both L and D are stored in a single array (u} of equation (I), which is described in the next section. Such a multiplication is composed of the following two steps:
For i=1,2,...,N {w}cith
row of {u}
Forj=1,2,...,i-1 Y’=X%
and
S=Y%Y.
(15)
YZk+ Y,k+ w,x,, In the program ments:
r,, + y,, + w, x,, Y,r+ Y,k+ M’,x,,
(11)
For the compacted, row-wise storage scheme, the array {w} begins with the first non-zero element in each row of A and ends with the diagonal term. Then the second loop of (11) can be modified to avoid multiplication with zeros. In the program MMULO, only one contiguous sweep of the array {u} completes the multiplication. Another form of frequently used matrix computation (e.g. subspace iteration during eigenvalue extraction) is the following canonical multiplication. S = X’AAX,
developed,
are the following
state-
{Y)+O For i=l,2,...,N {w}tith
row of {u}
Forj=1,2,...,i-1
(12) {S)+O
where the matrix A is the mass matrix or the stiffness matrix and X is the (N x M) rectangular matrix. The matrix S has a dimension of (M x M) and retains a symmetry because of the symmetry of the matrix A. Multiplication (12) can be performed by two consecutive multiplications using routine (11). Nevertheless, a memory saving routine is coded for the out-of-core matrix A as follows (MCANO):
Fori=1,2,...,N S.P-& Multiplication loops (16).
+ Y,, Q K,.
(14) is coded
(16) in KCANO
using the
Linear equation solution In many computational mechanics, it is necessary to solve the following linear system of equations:
IS)+0 Fori=1,2,...,N
Ax=b. {w}tith
(17)
row of {a}
Forj=1,2,...,i-1
S,,+ s,, + x,, W,x,,.
(13)
In routine (13), there need not be a (N x M) temporary array and multiplication (12) is completed by one contiguous sweep of array {u}. Sometimes the matrix A of equation (12) has already been factorized. It is true when the linear equation Ax = b should be solved many times without modifying the contents of A and, at the same
As mentioned before, when the size of the matrix A is very large, so that the core memory cannot contain the coefficients of A, it is necessary to store A out of the core area. Then the eqn (17) is solved by certain elimination processes. Two major categories of the solution procedure are the frontal method and the block sky-line method. Both of them use ‘coreblocks’ during elimination, particularly the block sky-line solver, which uses a compact matrix storage scheme of array (I) and, in general, needs blocks whose size is not small. Here the Crout version of the elimination process [5] (A = LDLT is coded without using such blocks, while using sky-line matrix storage. For a
734
M. S. Park
symmetric coefficient formed as follows:
A, factorization
matrix
For,j=l,2
is per-
,_..,
i-l
b,+b, - w,b,. Fori=1,2,...,N {ll,}cith
(20)
Solution of nugnzented non -.s~wunetric equation In recent years. the continuation or arc-length method is used for overcoming the limit point obstacles in the solution algorithms for the non-linear finite element equations. The general form of the continuation procedure can be expressed by augmenting a constraint equation as follows [6]:
row of {u}
Forj=1,2,...,i-1 {tl}+,jth
row of {u}
Ax=Yb--f
c’x+dy
Forj=1,2,...,i-1
w, + w, / D, D,+-M’, ith row of {a}+ It’.
(18)
Statements (18) are implemented in the program FACTO where the goal is the minimal use of disk access and the amount of the core memory, while the total factorization time (CPU time plus I/O time) remains the same as that of the in-core version solver. To obtain a solution by using the factorized coefficient matrix which is stored in the same array {u i. the substitution procedure follows: Lz = b
(forward
substitution)
Dy = z
(diagonal
scaling)
LTx = y
(backward
substitution).
(19)
The out-of-core version of substitution (19) is implemented in the program SOLO and its statements are as follows:
=g,
(23) where C,,,represents the M th unit vector in the dimension of (N + 1). So the last eqn in (23) is the special form of constraint eqn (22). It is known that if the local parameter index m is chosen such that &Z,,, # 0, the singularity of the augmented equation can be by-passed. Where the vector i is the predictor direction or the tangent vector and its dimension is (N + I), the solver for eqn (23), which is described in [7]. is coded according to the statements (24): If m = N + I, then
(9
L’ =g
(ii) solve Ax = y b - f
row of {ui otherwise
/-I
b,eb, - c w,b, I=/
(i) solve A,,?, =
D,+w,
b
0 o
solve &I%2 =
Fori=l,2,...,N b,+b,/D,
where A,, = -I (ii) rZ= row of {ai
+c,,,
Ae,,, 1
T e,,,
Fori=N.N-I,....1 {Ma}tith
(22)
where the RHS term in equation (21) represents the non-equilibrium force vector between the external applied yb and the internal stress resultant force f. The external load is not usually the process control variable, like in the conventional Newton method, but is determined by solving eqns (21) and (22) simultaneously.‘This situation arises in the predictor step, as well as in the corrector steps. In the program developed, the following augmented non-symmetric system of linear equations are solved, at which the m th degree of freedom is a local control parameter:
For i=1,2,...,N {rr)+ith
(21)
(.??I,, +I (-?I ),, +I - 1
0
9, -9,.
Matrix computation programs Table 1. Problem sizes and computation details by PC486 Neq
102
Aband (Mband) Storage
(2)
(kBytes) Factorization (set) Substitution (W
165 (2:)
240
426
(E)
(I;:,
537
942
(1::)
(I::,
17
36
62
148
210
495
3
6
13
3-i
54
156
I
4
6
10
12
19
where the basic ingredient of the solution procedure (24)is to solve the equation A,.? = i. It is solved by merely modifying some contents of the matrix A maintaining the original structure. The augmented coefficient matrix & has the following form: 0
Zl
42
1
2,
A?
0
A,
%I
B7
amI 7
4,wn %I2
0
The solution
r,FZ
=
100
(25)
r2
Z2 Y
:
of eqn (25) is the following
[i
%
71
[:,\I
942
Neq Aband (Mband)
(1;:)
Storage (MBytes) Factorization (set) Substitution (see) Core version Factorization (set)
1998 4056
6024 10332 20199
(E)
$&
(1:;:)
(Z,
(lE,
0.5
1.6
4.9
9.0
20.7
57.3
3.0
1I
50
117
405
1627
0.1
I
2
15
50
158
0.4
2
11
30
93
*
* ran out of core memory. (iii) y = T,~- aII zI - a,, z,, - a,L z2
~=(z,,z,,z2,Y)T.
Computation
h
procedure.
(i) -,n =h (ii) solve
Table 2. Problem sizes and computation details by HP750
and
r1
I I :I B
135
= (1: XI:]
(26)
examples
All of the programs developed herein are linked with a finite element package FYNS, which was developed for research in my laboratory. Numerical performances of two modules are provided. These are the factorization module FACTO and the substitution module SOLO. Problem sizes and numerical results are in Tables 1 and 2. Machines used are PC 486 DX2-66 and HP750. In the tables, Neq represents
Ftimc(scc)
Stime(sec)
-
Fatorization Time 0
Fig. 3. Computation
Substitution Time
time vs Neq plot.
736
M. S. Park 200
UI
1’ 3
+ s
-
Substitution Time 0
Factorization Time
100
1000
a
2
I
I
4
6
8
0
(*lop
NiZq*AtItNd Fig. 4. Computation
_J
time vs Neq x Aband plot.
200
1 2ooo
Ii
u) z $ s
F
LL
100
1ooc
PkpAtEWd' Fig. 5. Computation
time vs Ney x Aband’
plot.
Matrix
computation
the number of equations, Mband is the maximum semi-bandwidth [eqn (7)] and Aband is the following mean band width:
programs
737
program FACTO and SOLO is approximately four times larger than that of the full core version solver. These programs are becoming able to be efficiently used by linear, non-linear or eigenproblem packages.
Neg. REFERENCES
The value of Aband and the storage spaces needed are relatively small because the compact storage scheme is used after the profile reduction execution. The profile reduction algorithm used by FYNS is contained in [8]. Hundreds of equations are computed by PC 486 and thousands and millions of equations are solved by HP 750. The results of the module FACTP, which is the in-core version of the factorization, are shown in Table 2 for comparison purposes. In the tables, the factorization time and the substitution time are the total elapsed time, that is, CPU time plus disk transfering time. Execution time of FACTO is approximately four times larger than that of FACTP. This linear proportion, regardless of the problem size, is considered as the advantage of the current algorithm using hard disk for the coefficient matrix. Execution times are plotted with respect to the values of Neq, Neq x Aband and Neq x Aband in Figs 3, 4 and 5 respectively. The execution time axis is scaled by logarithm in Fig. 3. From these figures, it is shown that the factorization time is proportional to the value of Neq x Aband2, the substitution time is proportional to the value of Neq x Aband. These proportions are the same as those of the band solver if the value of Aband is replaced by the semi-bandwidth. However, the semi-bandwidth of the examples are the value of Mband and its value is considerably larger. CONCLUSION
The programs that perform efficient multiplications and eliminations of symmetric or augmented non-symmetric large scale sky-line matrices reside out of the core area are developed and coded in C. The core blocks, which are in general not small, are not used by the programs but two small buffers are used. Numerical results show that the order of the elimination time is the same for the developed program and the in-core solver. Total elimination time of the
1. 0. C. Zienkiewicz
2.
3.
4.
5. 6. 7.
8.
and R. L. Taylor, The Finite Element Method, 4th Edn, Vol. I, pp. 479495. McGraw-Hill, New York (1989). B. M. Irons, A frontal solution program for finite element analysis. Int. J. Namer Melh. Engng 2, 5-32 (1970). G. Dhatt, G. Touzot & G. Cantin, The Finite Element Method Displayed, pp. 269-291. John Wiley & Sons, New York (1984). E. Thompson and Y. Shimazak, A frontal procedure using skyline storage. Int. J. Numer Merh. Engng 15, 889-910 (1980). T. J. d. Hughes, The Finite Element Method, PP. 633444. Prentice Hall, New Jersey (1987). g Riks, Progress in collapse analysis. >. ‘Pi-es.;. Vessels Technol. 109, 3341 (1987). W. C. Rheinboldt, Numerical analysis of continuation methods for nonlinear structural problems. Comput. Struct. 13, 103-I 13 (1981). B. U. Koo and B. C. Lee, An efficient profile reduction algorithm based on the frontal ordering scheme and the graph theory. Compur. Struct. 44, 1339-1347 (1992).
APPENDIX-PROGRAM
DESCRIPTIONS
Listing of eight C functions are given below. These functions are facto, solo, factp, mmulo, mcano, kcano, solpad and prepad. All of the routines except solpad and prepad are explained in the context by their own names. Necessary documentation for using these routines is inscribed in the listing. All of the arrays should be properly dimensioned according to the documentation. The file of the coefficient matrix must be opened as follows: FILE
‘fp;
fp = fopen (“tile name”,“w
+ b”);-
Then the correct values should be stored into the file *fp. To solve eqn (23) according to the algorithm (24). the following calls of functions are needed.
prepad(n, m, L, NULL, fp, c, w);
[eon (WI
facfo(n, fp. L, w, v, x);
[con (26)l
for new force vectors
f and b
solpad(n, L, NULL, fp. b, c, f, w, x, m);
ieon (24)i
M. S. Park
738
KAIST
PROS0LVE.C
System& DesignLab.
Page: 1 Printed: Savodr 1993/ 6/26 1993/ 6/26
1)#include atdio.h> L t
g;iz
5 6 7 ;
#&fin *fin #define z;iz
10
REAL double INTEGER short int INTEGER4 Long (LIil-L[(il-11-11 ;:I; (Ii)-LIil+L[(i)-ll+l) A(i,jl aCLCil-(il+(jll MAX(x,Yl ((xv1 ? (Xl : (Yll
REAL fscto(n,fp,L,wi,uj,wkl
: A = L.D.L'
wk = DiaaIAl after factorization Trsce(AlVariables : n .... no of equations fo .... file winter to stiffness away . index array L’[“l . . . . dieg&l l w,Iml .... work array * srrajf wjtml . . . . wrk c m = mx(LIil-LCi-111 i=O,l,... l wkIn1 .... work array +..._____..___...________________________________________.,
l
l
l l
:: f: :; g
<.;WGER
i,j,k,m,iO,j~; d I*pOl 'pi lPJ' I
71
I
m = sizeof(REAL1; rcwind(f 1; frcad(Cw IOl,m.l,fpl; for (i=l; iw; i*+l c fseck(fp,mYLCi-11+11,01; frcad(wi,m,(Ltil-LIi-111,fpl; for (j=iO=S(il,pO=wi,fseck(fp,~(LIj-11-LCill.11; j'i; j++,pO++l c fr~d(uj,n,ILtjl-LIj-111,fDl; for (jO=s(Jl,k=HAX~iO,jOl,pi=~it~-iOI,pj=&ujIk-jOl ; kgj; k++,pi++,pj++l PpDl -= Ppil*Ppjl~ for (j=iO,pi=ui; j'i; j++, pi++1 c if (ukCj1 == 01 continue; (*pOl -= (*pil*(*pil/wkCJl; (*pi1 /= wkCj1; 1 uktil = I'pDl; fscck(fp,nP(LIi-11+11,01; fwrite(wi,m,(LCil-LCi-111,fpl; 1
F
:: 4t I.7
. ti
for (i=O,d=O; ia;
;I
i++l d += wkCi1;
return(d);
I INTEGER solo(n,fp,b,L.uil INTEGER n; INTEGER4 LCI; REAL uiII,bCl; FILE lfp; ,.._.___._.________.______________________________________. l SUBSTITUTIDNS (out-of-coreversion) . Output = SOL(X1 t A.x = L.D.L'.x = b l Return = icn if fail (i = zero diag index1 . =" if sucess * Variables : n . . . . no of equations t fp .... file pointer to stiffness array * LC"1 .... diagonal index array * uiIml .... work array t m = nux(LIil-L[i-111 i=O,l,... * bCn1 .... rhs vector (input1 * solution (output1 ._____...__.._.___..______~_____~~__~~~~__~~~_-~~~~_--~~_~, 1 <.INTEGER i,j,n,iO; REAL d: ,I
1
:
= sizeof(REAL1; rcwind(fp1; bI01 = bt01; for (i=l,fscck(fp,lrl,O?;i
2
for (i=O,rcwind(fpl;ia;
E 8:
l
i++,fscck(fp,~(LIil-LIi-11-11.111
Matrix computation programs
KAIST
PROS0LVE.C
S! teas&DesignLab.
77 ii ii ;i ii ii
C frebd(bd.m,l,fp);
if (d =I 0) return(i); ) btil /= d;
for (i=n-1; i>O; --i) for (j~iO=S(i),f~cek(fp~n*(LCi-ll+l~,O~,frced(wi,m,(Ltil-Lti-ll~,fP~; j
I!!
REAL frctp(n,a,L)
::s E E 108
!Z 111 1:: 114 115 116 117 118 119 120 121 122
Page
INTEGER n; INTEGER4 LCI; REAL (111; ,~........................................................* l FACTDRl2ATION (profile in-Core Version) : A = L.D.L' : Return Trace(A) l Variables : n .... no of equations . a .... stiffness array l L(nl .... diagonal index array . ..........................................................
C ;:W:GER
::t 125 126
i,j,k; d , *pO, *pi, lpjI
for (i=l,pO=ta[ll; ia; i++,pO++) C for (j=S(i); J'i; j*+,pO+*) for (k=mAx(s(i),S(j)),pi=&A(i,t).pj=Wj.k); kgj; k++.pi++.pj++) (*pO) -= (*pi)*(*pj)for (j=S(i),pi=M(i,j),pj=biLtjll; j
i+*) d
l
= atL(ill;
return(d);
::8' 1 129 z INTEGER nmulo(n,m,L,fs,x,y,w) INTEGER n,m; 132 INTEGER4 L(1; :% REAL x(l,y(l,u~l; lfa; 135 FILE 136 137 ,*............................................* l IUILTIPLICATIDN * an a Profile Matrix A (Mass) is * Dutput = Y * V = A.X 141 l Variables : n .... no of equations 111 :E: l .... dimension l Ltnl .... ding-1 index array 144 fa .... A[1 in file 145 t xCn*m].... two dim. array 146 l ytn*ml.... two dim. array (output) 147 l WC1 .... work space of size 148 l max(LCil-L[i-11) 149 * l . . . . . . . . . . . . . . . . . . . . . .......................*. 150 1:: C INTEGER i,j,k,iO; INTEGER4 mr: 1:: *xI,*xj,*yi3*yj,,va* REAi 155 156 mr = sizeof(REAL); for (i=O yi=y; icn- +*i) ::: for (k.6; kw; ++&,**yi) (*yi) = 0; 155 for (i=O,rcwind(fs);ia; ++i) % C fread(w~mr,LCil-Lti-11,fa); for (j=~O=S(i),xi=Lxti*mJ,yi=&yti*ml; j'i; ++j) for (k=O,yj=Py(j*ml,xj~Lx~j*ml,v~=w(j-iO1; ka; ++k) :z C yi(k1 += va*xj[kl. yjrkl += vr'xjtkl; > 164 for (k=O,va=utj-iO1;k&; ++k) 165 yiCk1 l = ve*xi(kl; ) :z return(O); :: 17( ) 171 17Z INTEGER mcano(n,m,L,fa,x,r,w)
:
sakradr 1993/ Printodr 1993/
2 6/26 C/36
M. S. Park
INTEGER ",m; INTEGER4 LII; REAL x[l,rcl,w[l; FILE lfe;
,*____________________________________________* l l
.
l
CANNGNICAL IUJLTIPLICATIOY Q) l Profile Matrix A (Ness) Output = R R = X’.A.X : n
vari9bles
. . . . no of equetions .... dimension m Lb1 .... disgcael index array l fa .... At1 in file * xWd.... two dim. errev t two dim. farrev(output1 l WI1 .... work spece of size l mex(LCil-Lti-111 '_____________________________________-------~,
l l l
rim%i....
C INTEGER i,j,k,l,iO; INTEGER4 mr; lxi,*xj,*rk,ve; REAL mr = sizeof(REAL); for (i=O; i*(mrl)/2; l*il rtil = 0; for (i=O,renind(fal;Ia; +*iI < fr6d(w,m,Ltil-LCi-l~~fe); for (j=1O=S(i),xi=Px[1ml; j
INTEGER kcno(n,m,L.fe.x.r.~,y.d) INTEGER n,m; INTEGERC LII; x~~,tI~,w~l,ytl.dIl; REAL FILE 'fa;
,.____________._______________________________.
'* CANNGNICAL WJLTIPLICATIW t m e Profile Factorized Matrix A (StiffwSS) Output = R R = X'.A.X = X'.L.A.L'.X Variebles : n . . . . noortions .... m Lb1 .... diegcael index arrey fa .... AC1 in file two dim. array XWRI.... rWml.... two dim. sway (output) ytrhl.... work space WC1 .... work spece of size mex(LCil-Lti-11) ._____________________________________-----~, c INTEGER i,j,k,l,iO; INTEGER4 mr; lrk,*yj,*xi,va; REAL mr = sizeofCREAL1; for (k-0; ka*(wll/2; ++kl for (j=O yj=y; ja; +*j) for tk=b: k-n: ++k.*+yj)
rtkl = 0;
for (i=O,renind(fe);i?n; ++if ( fred(w,mr,LCil-Ltl-ll,fel; for (j=rO=S(il,xi=&xCiW; j
Matrix
computation
Page:
KAIST
PROS0LVE.C
8av.d: Printdr
ystem& DesignLnb. )
IYTECER n,n; INTEGER4I. Cl; FILE lf*; REAL lU,btl,ctl,fU,u~l,xCl;
*._______________-~~~---_~_______~~~~~~~~--~~~~~~~~*
*
where
______. inputs n
e is a n-th
m et1 fsC> :I::
cwl+11 fCrnl1 WC1 Qltwts
7::
:
unit
vector,
the
local
parameter
dmension index of
of matrix [al local parameter factorized core stiffness array fectorized file stiffness array profile structure integer i externel reference toed of q-th rcu of A on input) : work arrey(contents : RtlS term : work array for file job solution vector (x2) i modified xl " (X l x2)*0+0/g ______________________________--*,
: profiled : profiled
I’
: ;:W:fiER i,i,k; y.d.R; lfm=stderr; FILE if (m==n) iw; ++i) xCi1 = g*b[il < for (i"O,g=ftnl; if,;f') k = solo(n,fa,x,L,u); k = solp(n, l,x,L); if (k I= n ) fprintf(fr,"\nw Mug-Solver) return((xtnl=g)); 1 for xcll
(i"D,g"D,x[nJ"
xW+l;
iw;
l+i)
xCi1
- ftil;
ZERO
PIVOT
at
" b[il;
" -g;
if (fa) k = solo(n,fa,x,L,u); k = solp(n, a,x,L); else if (k I= n ) fprintf(fn,"\n l * (Aug-Solver) XM " -g;
ZERO
for
+" o*cCil;
fM
(i"D,g"f[n],f[nl"ftnl;
iw;
++i)
fcil
PIVOT
at
" -9;
if (fa) k = solo(n,fa,f,L,w); else k = solp(n, 0,f.L); ftnl " -0; for C
(iLo; i
xtnl
-= ctil*xtil;
if ((d=x[nl-1) I= 0) y = ftnl/d; else fprintf(fm,"\n** (Aw-Solver) for
(i=O;
i<"n;
+*i)
xtil
= y*xtil
ZERO
)
DEWCU\n");
- ftil;
return(x[nl); >
REAL
prepedm,M,L,a,fa,c,w)
INTEGER iWTECER4 FILE REAL
".n: Lil: "fa;
etl,ccl,w~l;
,*.._____.......~~.~~~~~~~~----~~~-----~~~----~~~~~~~* *
Prepere
for SOLPADO
l
* . t
A "
CE d Cl :; ; ;;
"'
CE 0 Cl $ ; ;;
l
nd c " Cd l fl th q-th rou of A (output) .___...__......__...________________________________~,
l
l
741
programs
1993/ 1993/
4 6/26 C/26
M. S.Park
142
r
KAST
PROS0LVE.C
m&Da@Lib,
for
cj=o; j-n;
if t-4
++i1
ctj1
* 0.
c ctnl = 1; return(b); )
if (fM C Y = rirwf(REAL); rwlnd(fr); fr~(fr,lr~(L[r-11*0.0~; froad(u Y LW-Lfa-ll,fr); for (j=b,jb=S(m); ja; ++ ) < ctj1 = utjl: wtjl = d ; ) CW * NW; uw = 1; f~(f~,lr*(Llr-ll+l),o); fwrito(u,lr,LW-Lb-ll,fa); for (I*(: ja: ++I)
< ~6 = siji; _
1: :iz%
_
mr*(Ltj-ll*l+m-jO),o~; fredkijl mr 1 fr)f~~~fr,nr~~Lij~ll*~~-jo~,o~; > fwrit*(Zd.m.l.fa): . . . . ) )
l1w < for Cj=O,jO=SCm); ia; ++j) C ctjl = A(m,j); A(hj) = 0; 1 CW = Mm,m); A(n,m) = 1; for (j-1; j'n; ++j) if (S(j)*=m) C cCj1 = A(j,n); A(j,n) = 0; ) return(O);
Page : Printed: 8av.d:
5 1993/ 6/26 1993/ 6116