AUTOMATED DESIGN OF DISTRIBUTED COMPUTER CONTROL SYSTEMS WITH PREDICTABLE TIMING BEHAVIOUR

H. THIELEN
Technische Universität München, Lehrstuhl f. Prozeßrechner, D-80290 München, Germany
Abstract. When designing distributed computer control systems, there is a great variety of possibilities for assigning the system parameters, e.g. the task allocation. For a chosen parameter set, it is indispensable to prove that the system meets all deadlines even in the worst case. Under this condition, it is a lengthy process to find a feasible and cheap solution, so it is highly desirable to support the system designer in this process. In this paper, a proposal is presented to automate parts of the design process in order to find a realization that is as inexpensive as possible by using stochastic optimization methods.
Keywords. Real time computer systems, parameter optimization, minimal realization, computer selection and evaluation, stochastic optimization, computer aided system design.
1. INTRODUCTION
A substantial problem during the development of computer control systems with hard real-time constraints is the proof that the system meets all deadlines even in the worst case. In time driven systems, this is proven implicitly by the creation of a valid task schedule (Ramamritham, 1990; Kopetz, 1986; Fohler and Koza, 1990). However, time driven systems have disadvantages in principle when handling aperiodic processes. Event driven systems are more suitable for this kind of process, but to meet all deadlines they tend to be oversized because of inaccurate process models. Even with more accurate process models, as proposed e.g. by Gresser (1993a,b), there is a great variety of possible parameter settings, such as the number and performance of the computing nodes, the task allocation and so on. These parameter settings have a great influence on the costs of the system, which can be reduced by an automated optimization. Figure 1 shows the basic scheme of an automated design. The user describes the model of the technical processes and the tasks. Further constraints on the system are restrictions of the task allocation; the reasons for these restrictions can be e.g. requirements of fault tolerance (Gresser and Thielen, 1992) or special abilities of a computing node.
This work is part of a research project sponsored by the German Science Foundation (DFG) under Grant Fa 109/1!)'1.
Fig. 1. Basic scheme of the automated design.
The first step is the selection of a starting point in the variable parameter space. From this point, the optimization loop is entered. The given model with the selected parameters is analysed and an evaluation factor is computed. Depending on this evaluation factor and on additional goal-oriented improvement hints, the parameters are modified and the loop is repeated. The cycle stops as soon as the evaluation factor is sufficient or does not improve within a given number of iterations.
2. AUTOMATED DESIGN

This section presents a proposal for the automated design process. The implementation is currently under way; the concept and first results are presented in the following sections.
Fig. 2. Example of Event Function E(I)

Fig. 3. C(I) derived from E(I) (c = 2, d = 3)
2.1 Schedulability analysis

This section gives an overview of the system modelling and schedulability analysis method presented in (Gresser, 1993b), which is used as the exemplary basis for the present design automation approach. In his work, Gresser develops models for technical processes and tasks, as well as methods to calculate the timing behaviour of a given event driven system and thus to prove that all deadlines are met. The stimulation of the tasks by the technical processes is modelled by Event Streams, which describe the maximum possible number of events within an interval I. This leads to an Event Function E(I) for each Event Stream (Fig. 2). Tasks are described by their maximum execution times and the deadlines for the triggering events. With these values, one can determine the C(I)-function, which specifies how many units of execution time have to be finished within given intervals. The C(I)-function is derived from the Event Function by shifting it by the deadline d and multiplying it by the execution time c of the task (Fig. 3). For a complete node, the C(I)-function is built by adding the C(I)-functions of all tasks on this node. For earliest deadline first scheduling, Gresser has proved that each task meets its deadline if the C(I)-function always runs below the bisector, which specifies the maximum available execution time in each interval. This construction of the C(I)-function by simple addition is only valid for independent tasks. Therefore, (Gresser, 1993b) shows how dependences between tasks, which result from dependences of the triggering events, from precedence constraints, from internode communication or from mutual exclusion, can be transformed into a system of independent tasks that shows the same worst-case behaviour as the original task system.
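As a concrete illustration of this analysis, the following Python sketch checks the condition C(I) ≤ I at the discontinuities of a node's C(I)-function. The task representation (event arrival offsets standing in for the Event Stream, execution time c, deadline d) is a hypothetical simplification for illustration, not Gresser's original notation.

```python
# Sketch of the C(I)-based schedulability test for one node.
# A task is modelled as (events, c, d): the sorted arrival offsets that
# realize its Event Function E(I), its worst-case execution time c, and
# its deadline d relative to each triggering event.

def c_of_i(tasks, I):
    """C(I): execution demand that must be completed within an interval I."""
    demand = 0.0
    for events, c, d in tasks:
        # E(I - d): only events whose deadline t + d falls inside I count.
        demand += c * sum(1 for t in events if t + d <= I)
    return demand

def edf_feasible(tasks, horizon):
    """Feasible under EDF if C(I) <= I wherever C(I) jumps."""
    checkpoints = sorted({t + d for events, c, d in tasks for t in events
                          if t + d <= horizon})
    return all(c_of_i(tasks, I) <= I for I in checkpoints)

# Example with two independent tasks on one node (times in ms):
# task A as in Fig. 3 (c = 2, d = 3), triggered every 5 ms;
# task B with c = 1, d = 8, triggered every 10 ms.
tasks = [([0, 5, 10, 15], 2.0, 3.0),
         ([0, 10], 1.0, 8.0)]
print(edf_feasible(tasks, horizon=20.0))  # True: all deadlines met
```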
2.2 Parameters

Essential parameters of the system design are
• the number of computing nodes,
• the performance of the computing nodes,
• the performance of the communication system,
• and the task allocation.

The aim of the automated design is not only to find a parameter set suitable for a realization of the system, but a solution that allows the system to be built as cheaply as possible. Therefore the first three parameters, which have a great influence on the hardware costs, are not fixed by the designer but have to be determined by the automated design process. This leads to a combinatorial problem with a large parameter space, of which the task allocation problem alone is known to be NP-hard (Mok, 1983). Additional parameters that arise from the chosen analysis method (Gresser, 1993b) are
• the placing of interjacent deadlines if tasks communicate across node boundaries,
• the choice of priorities for interrupt service routines,
• and the selection of some operating system strategies, e.g. to solve priority inversion problems.

In the first step of the investigation, which is the focus of this paper, only the first set of parameters is taken into account. The second set will be considered in the tool described below.
2.3 Starting Point

The starting point of the optimization loop can be determined manually by the user, or it can be chosen randomly. A "good" starting point possibly speeds up the convergence of the optimization, but there is also the risk of getting stuck in a local minimum. This will be the subject of further investigations.
2.4 Objective Function
As the aim is to build the system as cheaply as possible, the costs must have a great impact on the objective function. Basically, the cost function is the sum of the costs of all units:

Costs = Σ_i costs(i)    (1)

The problem for the optimization is that the price of computing power is a discrete function with few distinct discontinuities. As an example, Fig. 4 shows the price-to-performance function of ordinary PC motherboards with twelve different 80x86 CPUs (qualitative; based on current prices and SPECint values).

Fig. 4. Price vs. performance of PC boards

As can be seen, the effect of small changes in the required computing power may not be visible in the cost function, and it is therefore sometimes not decidable which of two different parameter sets is superior. If further investigations show that this stays true, it will be necessary to provide the optimization with a finer-grained objective function. For the present analysis method, such a function can be derived from the previously mentioned C(I)-function: the computing power of a node can be scaled down so that all deadlines are just met (Fig. 5), i.e. to the minimal computing power of the node for which all deadlines can still be guaranteed.

Fig. 5. Scaling the computing power to achieve the objective function

A performance value derived in this way has the advantage that it may be used as the argument of the cost(performance) function. Thus the performance of the computing nodes does not need to be optimized independently of the other optimization parameters.
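A minimal sketch of this scaling step, reusing the hypothetical task representation and the c_of_i() function from the sketch in Section 2.1:

```python
# Sketch: minimal computing power of a node such that all deadlines are
# just met (cf. Fig. 5). If the node runs with speed factor s, the demand
# curve becomes C(I)/s, so the smallest feasible s is the maximum of
# C(I)/I taken over the discontinuities of C(I).

def required_performance(tasks, horizon):
    """Minimal speed factor s with C(I)/s <= I for all intervals I."""
    checkpoints = sorted({t + d for events, c, d in tasks for t in events
                          if 0 < t + d <= horizon})
    return max(c_of_i(tasks, I) / I for I in checkpoints)

# For the two-task example of Section 2.1 this yields 2/3: the node could
# be scaled down to 67 % of its computing power and still meet all deadlines.
print(required_performance(tasks, horizon=20.0))
```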
Another way to achieve a fine-grained objective function is to take the least available laxity into account (Fig. 6).

Fig. 6. Laxity evaluation to achieve the objective function

If it is necessary to combine this fine-grained function with the cost function, the objective function should return a compound value consisting of both the costs and the needed computing power:

O = (Costs, Perf)    (2)

with Costs as in (1) and Perf as the sum of the needed computing power of all nodes:

Perf = Σ_i perf(i)    (3)

To use this objective function in the optimization process, relations between two such values have to be defined. To compare two values O_1 and O_2, first the two cost components are compared, and only if this is not decisive are the performance components taken into account. As an example, O_1 < O_2 holds if

(Costs_1 < Costs_2) ∨ ((Costs_1 = Costs_2) ∧ (Perf_1 < Perf_2))    (4)

Other relations can be defined similarly. These relations are sufficient for some optimization algorithms, whereas others explicitly need a scalar value (e.g. Simulated Annealing, see below). In the latter case, the two values Costs and Perf can be combined into

O = a_1 · Costs + a_2 · Perf    (5)

The weighting factors a_1 and a_2 have to be chosen in such a way that the influence of Costs is always greater than that of Perf.
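The lexicographic relation (4) maps directly onto tuple comparison; the following sketch (names and prices purely illustrative) shows both the compound value and a scalar combination as in (5):

```python
# Sketch: compound objective value ordered as in relation (4).
# Python compares tuples element by element, which matches the rule
# "compare Costs first, break ties with Perf" exactly.
from typing import NamedTuple

class Objective(NamedTuple):
    costs: float  # Costs as in (1), the dominant component
    perf: float   # Perf as in (3), used only to break ties

o1 = Objective(costs=1200.0, perf=2.4)
o2 = Objective(costs=1200.0, perf=2.9)
print(o1 < o2)  # True: equal costs, but o1 needs less computing power

# Scalar combination as in (5) for algorithms that need a single value:
# choose a1 large enough that Costs always dominates Perf.
a1, a2 = 1000.0, 1.0
print(a1 * o1.costs + a2 * o1.perf)
```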
2.5 Stochastic Optimization
Because of the huge combinatorial dimension of the parameter domain, and because of the chaotic behaviour of the objective function with respect to the task allocation (Graham, 1966), it is not possible to use purely goal oriented optimization methods. Simple stochastic optimizations such as the Monte-Carlo method are not appropriate either, because of the great parameter space: finding an acceptable solution would take too much time.
On similar combinatorial problems, nature-analogous optimization methods have proved successful. For example, Simulated Annealing (Kirkpatrick et al., 1983) has been used successfully for solving the travelling salesman problem, for routing and placement problems of complex integrated circuits, and for allocating hard real-time tasks (Tindell et al., 1992). There are variants with faster convergence, e.g. Very Fast Simulated Re-Annealing (Ingber, 1989) and Adaptive Simulated Annealing (Ingber, 1993). For some optimization problems, slightly simplified variants lead to faster optimization, e.g. Threshold Accepting or the Record-to-Record Travel (Dueck and Scheuer, 1990; Dueck, 1993). All these methods allow the acceptance of changes for the worse with a probability that decreases over time, which reduces the risk of getting stuck in a local minimum; a sketch of this acceptance principle is given below. Another class of nature-analogous optimizations are the Genetic Algorithms (Goldberg, 1989); they have been used successfully especially on circuit partitioning problems (Hulin, 1992). Because the global minimum of the objective function is usually not known, the optimization is stopped after a certain number of optimization steps, or after stagnation for a certain number of iterations.
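The common acceptance principle of these methods can be illustrated by the classic Metropolis criterion of Simulated Annealing (Kirkpatrick et al., 1983); this minimal sketch is generic and not tied to the task allocation problem:

```python
import math
import random

def accept(delta, temperature):
    """Metropolis criterion: improvements (delta <= 0 when minimizing) are
    always accepted; changes for the worse are accepted with probability
    exp(-delta/temperature), which shrinks as the temperature is lowered
    over time. This controlled acceptance of worse solutions is what
    reduces the risk of getting stuck in a local minimum."""
    if delta <= 0:
        return True
    return random.random() < math.exp(-delta / temperature)
```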
2.6 Rule Based Optimization

To speed up the convergence of the stochastic optimization methods, more of the information generated by the schedulability analysis can be made use of. For the analysis method used in this work, one idea is to analyse the C(I)-function in depth: for each discontinuity of this function, it is known which tasks are responsible and how large the laxities of these tasks are. If a node is overloaded, one can thus identify particularly critical tasks and move them to another node.
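A possible form of such an improvement hint, again based on the hypothetical task representation and the c_of_i() function from the earlier sketches, could look as follows:

```python
# Sketch: rule-based hint for an overloaded node. At the first interval I
# where C(I) > I, select among the contributing tasks the one with the
# smallest laxity (deadline minus execution time) as a candidate to be
# moved to another node.

def most_critical_task(node_tasks, horizon):
    """Index of the task proposed for migration, or None if feasible."""
    checkpoints = sorted({t + d for events, c, d in node_tasks
                          for t in events if t + d <= horizon})
    for I in checkpoints:
        if c_of_i(node_tasks, I) > I:  # node overloaded at interval I
            contributing = [i for i, (events, c, d) in enumerate(node_tasks)
                            if any(t + d <= I for t in events)]
            return min(contributing,
                       key=lambda i: node_tasks[i][2] - node_tasks[i][1])
    return None
```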
2.7 Studied Issues and Comparison
The optimization methods will be evaluated and compared using realistic examples. In particular, the speedup of convergence and the improvement of the reachable optimum achieved by rule based methods will be examined. In addition, the effect of restrictions (e.g. for fault tolerance) on the optimization quality will be analysed.
3. TOOL
Currently, a tool for the automated design is under development. Figure 7 gives an overview of the modules of the tool. The User Interface allows the input of the process, task and system descriptions, either through an interactive graphical editor or from a text file. After the optimization, the design can be edited manually. The results of the analysis and optimization steps are shown in graphic windows. The modules Optimization and Analysis work as described above and will consider all parameters mentioned above.

Fig. 7. Structure of the tool for the automated design.
4. FIRST RESULTS
As a first step, the Record-to-Record Travel algorithm (Dueck, 1993) was implemented for a simple task model. Tasks are defined by their Event Functions, their deadlines and their maximum execution times. In this first stage, tasks have neither communication relations nor dependences.
The algorithm currently implemented works as follows:

1. Initial configuration: allocate each task to a randomly chosen node. The maximum number of nodes is equal to the number of tasks.
2. Choose a maximum allowed deviation (see below).
3. Get the value of the objective function for this configuration: record = costs(configuration). The implementation of costs() is explained below.
4. Get a new configuration new by randomly choosing a task and allocating it to a randomly chosen node.
5. If costs(new) < record + deviation, then keep this new configuration; else return the moved task to its previous node.
6. If costs(new) < record, then save this as the new record value: record = costs(new).
7. Stop if there has been no decrease of record for a long time, or if there have been too many iterations; otherwise return to step 4.

This algorithm has one parameter, deviation, that has to be determined by experiments. If deviation is low, the algorithm is fast, but the results produced are of minor quality. If deviation is higher, the algorithm slows down, but the results are better.

The determination of the objective function costs() takes place in the following steps (a sketch of the complete procedure is given below):

1. For each node, calculate the performance required to meet all deadlines by evaluating the C(I)-function as in Fig. 5.
2. For each node, determine the costs of the next available performance level (Fig. 4).
3. Sum up the costs of all nodes that have tasks allocated to them.

First experiments with small systems showed that the algorithm quickly leads to a result near the optimum. The system used had about 115000 possibilities to allocate the tasks; the algorithm stopped after a few thousand trials with results about 5 % above the minimum value.
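A compact Python sketch of this implementation, combining the Record-to-Record Travel loop with the cost determination; the price table is an illustrative stand-in for the step function of Fig. 4, and required_performance() is the routine from the sketch in Section 2.4:

```python
import random

# Illustrative (performance, price) steps standing in for Fig. 4.
PRICE_TABLE = [(0.25, 300.0), (0.5, 450.0), (1.0, 700.0), (2.0, 1400.0)]

def node_costs(perf_needed):
    """Price of the cheapest available performance level >= perf_needed."""
    for perf, price in PRICE_TABLE:
        if perf >= perf_needed:
            return price
    return float("inf")  # no available board is fast enough

def costs(allocation, tasks, horizon):
    """Objective: summed board prices of all nodes carrying tasks (steps 1-3)."""
    total = 0.0
    for node in set(allocation):
        node_tasks = [t for i, t in enumerate(tasks) if allocation[i] == node]
        total += node_costs(required_performance(node_tasks, horizon))
    return total

def record_to_record(tasks, horizon, deviation, max_iter=10000):
    n = len(tasks)
    allocation = [random.randrange(n) for _ in range(n)]  # step 1
    record = costs(allocation, tasks, horizon)            # step 3
    best = allocation[:]
    for _ in range(max_iter):                             # simplified step 7
        task = random.randrange(n)                        # step 4
        old_node = allocation[task]
        allocation[task] = random.randrange(n)
        new = costs(allocation, tasks, horizon)
        if new < record + deviation:                      # step 5: keep move
            if new < record:                              # step 6: new record
                record, best = new, allocation[:]
        else:
            allocation[task] = old_node                   # undo the move
    return best, record
```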
5. CONCLUSION AND FUTURE WORK

In this paper, a proposal was presented for automating the design of distributed computer control systems, with the main focus on finding a solution as cheap as possible. First results verified that this goal can be reached if the discontinuous cost values are chosen as the basis of the objective function instead of the nearly continuous performance values. The good results obtained by the optimization algorithm have to be verified in further tests, especially for the full optimization parameter set. The other optimization algorithms mentioned above will be implemented and compared.
6. REFERENCES

Dueck, G. and T. Scheuer (1990). Threshold Accepting: A General Purpose Optimization Algorithm Appearing Superior to Simulated Annealing. Journal of Computational Physics 90(1), 161-175.

Dueck, G. (1993). New Optimization Heuristics: The Great Deluge Algorithm and the Record-to-Record Travel. Journal of Computational Physics 104(1), 86-92.

Fohler, G., and C. Koza (1990). Scheduling for Distributed Hard Real-Time Systems using Heuristic Search Strategies. Forschungsbericht 12/90, Institut für Technische Informatik, Technische Universität Wien, Austria.

Goldberg, D. E. (1989). Genetic Algorithms in Search, Optimization & Machine Learning. Addison-Wesley, Reading, MA.

Graham, R. L. (1966). Bounds for certain multiprocessing anomalies. Bell System Tech. J. 45, 1563-1581.

Gresser, K., and H. Thielen (1992). Deadline Scheduling in Fault Tolerant Real Time Systems. Proc. Fourth Euromicro Workshop on Real-Time Systems, Athens, Greece, 184-189.

Gresser, K. (1993a). An Event Model for Deadline Verification of Hard Real-Time Systems. Proc. Fifth Euromicro Workshop on Real-Time Systems, Oulu, Finland, 118-123.

Gresser, K. (1993b). Echtzeitnachweis ereignisgesteuerter Realzeitsysteme. Ph.D. Thesis, Technische Universität München. Fortschrittsberichte VDI Reihe 10 Nr. 268. VDI Verlag, Düsseldorf.

Hulin, M. (1992). Evolutionsstrategien zur Schaltungspartitionierung. Ph.D. Thesis, Technische Universität München.

Ingber, L. (1989). Very fast simulated re-annealing. Mathl. Comput. Modelling 12(8), 967-973.

Ingber, L. (1993). Adaptive Simulated Annealing (ASA). Not yet published. [ftp.caltech.edu: /pub/ingber/asa.Z]

Kirkpatrick, S., C. D. Gelatt Jr., and M. P. Vecchi (1983). Optimization by Simulated Annealing. Science 220(4598), 671-680.
Kopetz, H. (1986). Scheduling in Distributed Real Time Systems. Proc. Advanced Seminar on Real-Time Local Area Networks, INRIA, Rocquencourt, France, 105-126.

Mok, A. K.-L. (1983). Fundamental Design Problems of Distributed Systems for the Hard-Real-Time Environment. Ph.D. Thesis, Massachusetts Institute of Technology.

Ramamritham, K. (1990). Allocation and Scheduling of Complex Periodic Tasks. Proc. 10th Conf. on Distributed Computing Systems, 108-115, IEEE.

Tindell, K. W., A. Burns and A. J. Wellings (1992). Allocating Hard Real-Time Tasks: An NP-Hard Problem Made Easy. The Journal of Real-Time Systems 4(2), 145-165.