
Computer Physics Communications 19 (1980) 35—41 © North-Holland Publishing Company

THE NUMBER OF COMPUTERS IN A NON-HIERARCHICAL SYSTEM FOR TOKAMAK DATA PROCESSING

Atsushi OGATA and Kazuo YUASA *

Division of Large Tokamak Development, Japan Atomic Energy Research Institute, Tokai-mura, Naka-gun, Ibaraki-ken, 319-11 Japan

Received 6 June 1979

The data processing of a tokamak Ohmic heating experiment is summarized by a directed graph. A method which is a modification of the project-management technique PERT is developed and applied to this directed graph to analyse how many computers should be used for the data processing. An estimate of the number of computers is made for the data processor of the large tokamak JT-60.

1. Introduction

Data processing is essential to make a plasma confinement experiment successful. Initial effort has already been invested in data collection, and nowadays we have no difficulty in collecting data from devices with moderate confinement times such as low-beta tokamaks. Instead, we now face another problem: the amount of data has become large because of the increasing confinement time. It should also be noted that a requirement is now imposed on the data processor to analyse and display fresh data immediately after each discharge and to enable operators to feed it back to control the next discharge. Accordingly, a modern data processor should be capable of performing computations which hitherto have been carried out off-line by a large general-purpose computer system. However, we need a computer system which is different from a general-purpose system: the general-purpose system is designed to process indefinite jobs for indefinite users; here we need a system to process definite jobs in a definite order within a definite time interval, i.e., within the time interval between two successive discharges of a tokamak.

This paper concerns the construction of a multi-computer system to analyse the data from various plasma-diagnostic devices. The authors have in mind the diagnostics and analysis routines for the large tokamak JT-60 [1], but the method described here is applicable to any other data processor for a plasma confinement device. Here only the analyses necessary for the study of confinement in Ohmic heating experiments are taken into account; other analyses, for example those for the study of instability, are not, although it will also be possible to treat such analyses by the method described here. In this paper we summarize the computation requirements for the confinement study by a directed graph, which consists of directed branches interconnected at nodes [2]. Each node corresponds to a computational job, and a branch which grows from the node corresponds to a plasma parameter obtainable from the job.

Let us first consider the simple case of fig. 1, which shows the procedure for obtaining the β-value as the product of density N and temperature T. Suppose that the computational times for N, T and β are 100, 50 and 10 ms, respectively. If we have only one computer, the total elapsed time to compute all three parameters is 160 ms. If the number of computers is two, N and T are computable in parallel. However, we cannot compute β until the computation of N is finished, because β is a function of both N and T. As is shown in fig. 2(b), the necessary elapsed time cannot be shorter than 110 ms. Even if we have more than three computers, it is apparent that we cannot reduce the elapsed time any more.

The actual directed graph of the computation for the confinement study, which is given in section 2, is more complicated than this example. Although there are more plasma parameters involved, the fact that these parameters can be computed only in turn is not lost. It is plausible that, although parallel processing can reduce the elapsed time to a certain extent, it tends to a lower limit as the number of computers increases. This paper aims to derive the elapsed time needed to finish a set of jobs shown in a directed graph as a function of the number of computers. Certainly, there exists a method for evaluating parallel computer systems more strictly than the present method [3]. It is, however, laborious. In addition, we do not need a very strict method in the design stage of the data processor: the jobs to be carried out are not very clear yet, because they will be different from those imagined at the design stage. So we prefer a method which is simple and gives an overall idea.

Section 2 discusses the directed graph of the computational procedure for the confinement study. Section 3 describes the method for determining the number of computers in the multi-computer system which realizes the analysis shown in the graph in section 2. It is based on a computer program which gives the elapsed time as a function of the number of computers in the system. Some restrictions which are imposed on the multi-computer system organization are not taken into account, so that further consideration is necessary to make a concrete design based on this method. However, we can derive a general idea from this simple method, which is a modification of the PERT project-management technique [4]. In section 4, the method introduced in the previous section is applied to determine the number of computers in the data processor of the large tokamak JT-60 and some discussion is given.

* Permanent address: Oki Electric Industry Co., Ltd., Shibaura, Minato-ku, Tokyo.

Fig. 1. Simple example of directed graph.

Fig. 2. Schedule to finish the jobs in fig. 1. (a) Case of one computer. (b) Case of two computers.

2. The directed graph of the computational procedure
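The one- and two-computer schedules of fig. 2 can be reproduced with a short list-scheduling sketch. Python stands in for the paper's FORTRAN; the job names and the longest-comes-first tie-break come from the text, the rest is illustrative:

```python
# Sketch (not the paper's code): elapsed time for the fig. 1 jobs.
# N (100 ms) and T (50 ms) have no prerequisites; beta (10 ms) needs both.

times = {"N": 100, "T": 50, "beta": 10}   # computing times in ms
deps = {"N": [], "T": [], "beta": ["N", "T"]}

def elapsed(n_computers):
    finish = {}
    busy_until = [0] * n_computers        # next free time of each computer
    pending = dict(times)
    while pending:
        # jobs whose prerequisites have all finished
        ready = [j for j in pending if all(d in finish for d in deps[j])]
        # longest-comes-first, the paper's scheduling rule
        job = max(ready, key=lambda j: pending[j])
        k = min(range(n_computers), key=lambda i: busy_until[i])
        start = max(busy_until[k],
                    max((finish[d] for d in deps[job]), default=0))
        finish[job] = start + pending.pop(job)
        busy_until[k] = finish[job]
    return max(finish.values())

print(elapsed(1))   # 160, as in fig. 2(a)
print(elapsed(2))   # 110, as in fig. 2(b)
```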

Data processing for the confinement study has to derive various plasma parameters from data collected from various diagnostic devices. This procedure is summarized by the directed graph of fig. 3. Most of the nodes in fig. 3 have more than two incoming branches. It should be noted that the jobs corresponding to such nodes become executable only when their preceding plasma parameters are available; in other words, they have to wait until all computations of their preceding jobs, which are represented by the roots of their incoming branches, are finished. Nomenclature and a calculation method to derive each plasma parameter are given in table 1. The method of calculation is based on that given in a standard textbook of plasma diagnostics [5] or in the review paper [6].
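The waiting rule just described can be stated compactly. The sketch below (hypothetical job names, Python rather than the paper's FORTRAN) marks a job executable only when the set of its preceding parameters is contained in the set of finished ones:

```python
# Sketch of the readiness rule: a job with several incoming branches
# becomes executable only when ALL of its preceding parameters exist.
# Job names are illustrative, not the actual fig. 3 node set.

prereqs = {
    "n_e": set(),                 # primary parameter, straight from a diagnostic
    "T_e": set(),
    "W_e": {"n_e", "T_e"},        # electron pressure needs both n_e and T_e
}

def executable(finished):
    # a job is ready when its prerequisite set is a subset of `finished`
    return {job for job, pre in prereqs.items()
            if job not in finished and pre <= finished}

print(sorted(executable(set())))            # ['T_e', 'n_e']
print(sorted(executable({"n_e"})))          # ['T_e']  (W_e still waits)
print(sorted(executable({"n_e", "T_e"})))   # ['W_e']
```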

Fig. 3. Directed graph showing the computation procedures of plasma parameters in tokamak Ohmic heating experiments. Each node is labelled with its job number, its computational time and the plasma parameter obtained by the job. The number under each diagnostic shows the data amount estimated for the JT-60 in kW. Computational times are in ms.

From the viewpoint of project management, fig. 3 is a project graph [4], in which each node shows a computational job. Each node is labelled by its job number. The computational time necessary for each job is also shown in fig. 3; it depends both on the amount of incoming data and on the specific calculation method. The amount of data depends on the experiment, but the method does not.

There are a few physical problems in the derivation of some of the plasma parameters. These, together with their tentative solutions, are:

(1) Some parameters can be obtained independently from different diagnostics. For example, we have three methods to obtain Te: Thomson scattering, X-ray pulse-height analysis and cyclotron radiation. The X-ray pulse-height analysis gives Te averaged over a certain time interval within a certain observation angle, while Thomson scattering gives very local values both in space and time. In addition, its Te value is consistent with its ne value in the sense that these two are obtained from the same diagnostics. The cyclotron radiation measurement can give a detailed space-resolved time evolution. However, further studies are needed to apply this method to JT-60. Before proceeding to obtain the secondary parameters of Te, we should synthesize these three results into one Te. The computing job necessary for such a synthesis is shown by a bold circle in fig. 3. The computing time required for this synthesis is not clear at present. The times in the bold circles of fig. 3 are based on the assumption that they are proportional to the amount of incoming data with a coefficient of 50 μs/word.

(2) The preceding parameters of a secondary

Table 1
Nomenclature, calculation method and job number in fig. 3

Symbol    | Explanation                                                     | Calculation method                                                                                                                                  | Job no.
B_p       | poloidal magnetic flux density                                  | integration of magnetic probe signal                                                                                                                | 2
I_p       | plasma current                                                  | integration of Rogowski coil signal                                                                                                                 | 1
j         | current density                                                 | T_e^(3/2) law                                                                                                                                       | 14
l_i       | normalized internal inductance                                  | (1) Shafranov's method; (2) T_e^(3/2) law                                                                                                           | 19, 20
n_e       | electron density                                                | (1) exponential curve fitting of Thomson scattering data; (2) Abel transform of direct-reading type interferometry                                  | 3, 4
n_i       | ion density                                                     | derivation from n_e and n_Z                                                                                                                         | 27
-         | ionization rate                                                 | look-up table                                                                                                                                       | 17
n_Z^M     | density of the ion with charge Z and mass number M              | Abel transform and Mewe's method                                                                                                                    | 42-46
P_OH      | Ohmic heating power                                             | Ohm's law                                                                                                                                           | 29
P_rad     | radiation power                                                 | bolometer data                                                                                                                                      | 34
P_ei      | power transferred from electrons to ions                        | Spitzer's method                                                                                                                                    | 10
P_cx      | power loss by charge exchange                                   | integration of neutral particle measurement data                                                                                                    | 32
P_e,con   | power loss by electron conduction and convection                | difference of input power and power loss                                                                                                            | 9, 35
P_i,con   | power loss by ion conduction and convection                     | difference of P_ei and loss                                                                                                                         | 39
q         | safety factor                                                   | T_e^(3/2) law                                                                                                                                       | 20
R_p       | major radius of the plasma                                      | Shafranov's method (Fourier analysis)                                                                                                               | 12
T_e       | electron temperature                                            | (1) cyclotron radiation signal; (2) exponential curve fitting of X-ray pulse-height data; (3) exponential curve fitting of Thomson scattering data   | 33, 23, 24, 25
T_i       | ion temperature                                                 | (1) exponential curve fitting of Doppler broadening data; (2) exponential curve fitting of neutral particle measurement data                         | 6, 8
W_e       | electron pressure                                               | product of n_e and T_e                                                                                                                              | 26
W_i       | ion pressure                                                    | product of n_i and T_i                                                                                                                              | 11
Z_eff     | effective ion charge                                            | (1) Spitzer resistivity; (2) bremsstrahlung loss; (3) according to definition                                                                       | 7
β_p       | poloidal beta                                                   | integration of diamagnetic loop signal                                                                                                              | -
ε_0       | emission coefficient of recombination radiation                 | Abel transform                                                                                                                                      | -
Ω_p       | plasma resistance                                               | Spitzer's method                                                                                                                                    | 28
Λ         | β_p + l_i/2 - 1                                                 | Shafranov's method                                                                                                                                  | 12
τ_p       | particle confinement time                                       | particle balance equation                                                                                                                           | 21
τ_Ee      | gross electron energy confinement time                          | electron energy balance equation                                                                                                                    | 36
τ_Ei      | gross ion energy confinement time                               | ion energy balance equation                                                                                                                         | 37
τ_e,con   | electron energy confinement time for conduction and convection  | electron energy balance for conduction and convection                                                                                               | 40
τ_i,con   | ion energy confinement time for conduction and convection       | ion energy balance for conduction and convection                                                                                                    | 38


parameter often have space and time resolutions which are different from one another. We then encounter the problem of how to adjust these resolutions in order to compute the secondary parameter. In other words, we have to choose whether to interpolate the parameter with the coarser resolution, or to thin out (or smooth) the one with the finer resolution. The decision is made to choose the latter, for two reasons: firstly to save computing time, and secondly not to fabricate false data. In fig. 3, a computing job to obtain a secondary parameter therefore often includes the computation for the resolution adjustment.
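As a rough illustration of the choice made above, the finer-resolution signal can be thinned out by block averaging onto the coarser grid before forming the secondary parameter. The sampling factors and values below are hypothetical, and Python stands in for the paper's FORTRAN:

```python
# Sketch of the resolution adjustment: rather than interpolating the
# coarser signal (which would fabricate data points), the finer one is
# block-averaged down to the coarser sampling grid.

def thin_out(fine, factor):
    """Block-average `fine` over `factor` consecutive samples."""
    n = len(fine) // factor
    return [sum(fine[i * factor:(i + 1) * factor]) / factor for i in range(n)]

ne = [1, 2, 3, 4, 5, 6]        # hypothetical n_e, sampled every 1 ms
te = [10, 20]                  # hypothetical T_e, sampled every 3 ms
ne_coarse = thin_out(ne, 3)    # -> [2.0, 5.0], now on T_e's grid
we = [n * t for n, t in zip(ne_coarse, te)]   # secondary parameter n_e * T_e
print(we)                      # [20.0, 100.0]
```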

3. Elapsed time vs. number of computers

There are several ways to realize a computer system which fulfils the requirement given in fig. 3. Certainly, one large computer is a solution. However, a multi-computer system is preferable. The reasons are: (1) in the multi-computer system the portion which directly interfaces a diagnostic device can be separated from the main system in the stages of development, commissioning and test of diagnostics; (2) even if a part of the system fails, the multi-computer system does not lose all its functions.

Hierarchical structure is another solution, where lower levels mainly compute the primary parameters, while higher levels compute the secondaries. However, if we intend to make this structure reliable, we need a back-up computer for each hierarchical level. In addition, once the lower-level computers have sent their results to their higher levels, they have nothing to do but to be idle. A non-hierarchical structure is more attractive than a hierarchical one. Modern mini-computers are capable of a more organic structure than the hierarchical. The tools for a non-hierarchical structure to realize a store-and-forward technique include global path, shared memory and point-to-point connection [7].

The next question is how many computers should be used in the non-hierarchical computer network. The requirement in the computer system is clear: to finish a set of jobs given in fig. 3 within a given interval. The computing time of each job is determined once its amount of incoming data is given. We have developed a computer code MPERT which


calculates the time needed to finish all the jobs as a function of the total number of computers for a given project graph such as fig. 3. The code simulates the computation in a simplified way. The time needed to fetch a job is neglected. If the number of computers is large, the computers compete with one another for the same data, so bringing about a delay. This delay is also neglected. The computers are assumed to have a memory large enough to compute any of the given jobs.

First we define a matrix A. It is a Boolean matrix whose elements are either 1 or 0; i.e., a_ij = 1 represents the existence of the path from node i to node j. Note that if a_ij = 0 for all i, the job j has no preceding jobs, so that it is immediately executable. Fig. 4 shows the flowchart of the code. The matrix A indicates executable jobs. The code selects as many jobs as the number of empty computers from the executable ones and simulates the lapse of time for the job computations. Each element of the vector r in fig. 4 shows the remaining computing time of the job under execution. If any of the executing jobs are complete, the code omits the corresponding rows and columns and defines a new A matrix. It then searches for executable jobs again according to the matrix A.

The problem is how to select jobs from the executable ones. The code gives first priority to the job which requires the longest computation time among the executable ones. This longest-comes-first rule does not mathematically assure the best solution. However, more than a dozen test runs show that this rule gives a practically satisfactory answer. Another problem is how to assign the computers in the situation where the number of empty computers is larger than the number of executable jobs. If we number the computers and use DO loops to search for the empty ones, we subconsciously give priority to computers with lower numbers, so that jobs concentrate on these computers. The code MPERT deliberately makes the assignment random lest jobs should concentrate on certain computers, though this routine is not clearly shown in fig. 4.

The code MPERT has approximately 300 FORTRAN instructions. It requires 32 kW core memory and about 2 s for execution on a FACOM 230/75 to obtain the results given in figs. 5 and 6.
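The simulation loop just described (Boolean matrix A, remaining-time vector r, longest-comes-first selection, random assignment among empty computers) can be sketched as follows. This is a reconstruction from the text, not the MPERT source, and uses Python in place of FORTRAN:

```python
# Sketch of the MPERT-style simulation. A[i][j] = 1 iff job i precedes job j.
import random

def mpert(times, A, n_computers):
    n = len(times)
    remaining = list(times)       # r: remaining computing time of each job
    done = [False] * n
    running = {}                  # job index -> computer index
    t = 0.0
    while not all(done):
        # a job is executable when every predecessor is finished
        ready = [j for j in range(n) if not done[j] and j not in running
                 and all(done[i] for i in range(n) if A[i][j])]
        ready.sort(key=lambda j: -remaining[j])        # longest comes first
        free = [c for c in range(n_computers) if c not in running.values()]
        random.shuffle(free)      # random assignment, as in MPERT
        for j, c in zip(ready, free):
            running[j] = c
        # advance time to the next job completion
        dt = min(remaining[j] for j in running)
        t += dt
        for j in list(running):
            remaining[j] -= dt
            if remaining[j] <= 1e-12:
                done[j] = True
                del running[j]
    return t

# fig. 1 example: N = 100 ms, T = 50 ms, beta = 10 ms; beta depends on both
times = [100, 50, 10]
A = [[0, 0, 1], [0, 0, 1], [0, 0, 0]]
print(mpert(times, A, 1))   # 160.0
print(mpert(times, A, 2))   # 110.0
```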

Fig. 4. Flowchart of the code to determine the necessary elapsed time. N is the number of computers, A is the matrix, and r is the vector whose elements give the remaining computing times of the jobs under execution.

Fig. 6. Percentage of idling time of each computer as a function of the number of computers. A point indicates an idling time and a circle shows an averaged value.

4. Application to the design of the JT-60 data processor

The method has been applied to determine the number of computers needed in the JT-60 data processor. The amount of input data given in fig. 3 is estimated based on sampling intervals and output channel numbers of the diagnostic devices to be used. FORTRAN programs to calculate the plasma parameters have been coded so that the computation time for each job may be accurately estimated. However, we have to make the following assumption in order to apply our results to a mini-computer system organization, i.e., that the CPU times of these programs on the general-purpose large computer FACOM 230/75 indicate the relative computational times on a mini-computer. The computing time of each job on the FACOM 230/75 is also given in fig. 3.

Fig. 5 shows the necessary elapsed time for the jobs given in fig. 3 as a function of the number of FACOM 230/75 computers. The dashed line shows the critical-path time, which is the computation time of the jobs on the path shown by the bold line in fig. 3. However many computers may be used, the computation time cannot be shorter than this critical-path time. Fig. 5 shows that the use of more than six computers is meaningless. The more the number of computers increases, the more each computer has to idle, because the order of execution of the jobs is constrained by the directed graph. Fig. 6 shows the percentage of idling time of each computer.

Fig. 5. Elapsed time (s) for the jobs given in fig. 3 as a function of the number of FACOM 230/75 computers.
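The critical-path bound mentioned above is simply the longest chain of computing times through the project graph. A minimal sketch with a toy job set (illustrative values, not the actual fig. 3 data):

```python
# Sketch: the critical-path time of a project graph is the maximum,
# over all jobs, of the earliest possible finish time.
from functools import lru_cache

times = {"Ip": 22, "Bp": 28, "ne": 100, "Te": 50, "We": 10}  # ms, toy values
deps = {"Ip": [], "Bp": [], "ne": [], "Te": [], "We": ["ne", "Te"]}

@lru_cache(maxsize=None)
def finish_time(job):
    # earliest finish: all predecessors done, then run the job itself
    return times[job] + max((finish_time(d) for d in deps[job]), default=0)

critical_path_time = max(finish_time(j) for j in times)
print(critical_path_time)   # 110: the ne -> We chain dominates
```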


A few comments will be made on the results of figs. 5 and 6. The curve of fig. 5 is not smooth because of the irregularly long elapsed time for the case of four computers. In this case, the job scheduling is not very successful. This failure also appears in fig. 6 as an irregularly large idling ratio.

One may feel that five, the number of computers which gives the critical-path time, is too small compared with the complexity of fig. 3. It is actually the computation times of jobs 42 to 46 that decide the critical-path time. In fig. 3 the impurity lines are separated into five channels, only because five spectrometer units are proposed for the JT-60 diagnostics. However, the number of spectral lines is supposed to be 30, and the computations to derive n_Z from these lines are executable in parallel. If the number of channels of impurity lines is increased, we may be able to carry out more parallel processing than in the present model to reduce the critical-path time.

As was mentioned in section 3, the code MPERT neglects the time required to fetch jobs from disk. If the number of computers becomes large, we have to take into account the competition amongst computers to use the same data. In this sense, the results in figs. 5 and 6 are too optimistic for the case of several computers, say more than three or four.

From a different viewpoint we can consider that fig. 5 shows the "fail-soft" nature of the multi-computer system. Suppose that one or two out of n computers get into a fault condition; then fig. 5 indicates the computation time on n-1 or n-2 computers after the rearrangement of job scheduling. One estimate shows that a recent standard "midi" computer has a mean instruction-execution time ten times slower than the FACOM 230/75. This means


that the five midi computers need approximately 3 min to complete all the jobs. The time interval between two successive discharges will be 10 min in the JT-60. Taking account of the time required to display the results on CRT screens, the data transmission time, etc., we can conclude that four or five computers are necessary for the JT-60 data processor.

In conclusion, it has been demonstrated that the simple code MPERT can give an overall concept for the number of computers needed in the JT-60 data processor. This concept is useful enough at the preliminary stage of the design.

Acknowledgements The authors would like to thank Drs. Y. Shinohara, K. Asai, M. Fujii, T. Matoba and Y. Suzuki for helpful discussions.

References

[1] B.J. Green (ed.), Nucl. Fusion 19 (1979) 515.
[2] E.J. Henley and R.A. Williams, Graph theory in modern engineering (Academic Press, New York, 1973).
[3] J.L. Baer and J. Jenses, in: Measuring, modelling and evaluating computer systems, eds. H. Beilner and E. Gelenbe (North-Holland, Amsterdam, 1977).
[4] P.W. Muller, Schedule, cost and profit control with PERT (McGraw-Hill, New York, 1963).
[5] For example, W. Lochte-Holtgreven (ed.), Plasma diagnostics (North-Holland, Amsterdam, 1968); R.H. Huddlestone and S.L. Leonard (eds.), Plasma diagnostic techniques (Academic Press, New York, 1965).
[6] Equipe TFR, Nucl. Fusion 18 (1978) 647.
[7] B.K. Penny, Comput. Phys. Commun. 15 (1978) 515.