Information Processing Letters 43 (1992) 265-270, North-Holland, 5 October 1992
Analysis of hierarchical bus-based multicomputer architectures

A.S. Pombortsis
Department of Informatics, Aristotle University of Thessaloniki, 54006 Thessaloniki, Greece

Communicated by W.M. Turski
Received 8 May 1992
Abstract

Pombortsis, A.S., Analysis of hierarchical bus-based multicomputer architectures, Information Processing Letters 43 (1992) 265-270.
Hierarchical interconnection networks with cluster structure have been proposed for use in multicomputer systems for parallel and distributed processing. However, little attention has been paid to the study of the underlying communication network at the message passing level. The main contribution of this paper is a performance model that relates the performance criteria to the architecture design and implementation, together with a method for optimizing the hierarchy in such systems. The proposed methodology can easily be augmented to include additional objectives and/or decision variables.

Keywords: Computer architecture; multicomputer systems; hierarchical interconnection networks; performance evaluation; performance models

Correspondence to: Dr. A.S. Pombortsis, Department of Informatics, Aristotle University of Thessaloniki, 54006 Thessaloniki, Greece.
Elsevier Science Publishers B.V.

1. Introduction

Multicomputer architectures are distributed-memory systems which consist of loosely coupled autonomous computing nodes (CNs) of identical structure, each with its own private memory. Autonomous but cooperating tasks are executed on different CNs, exchanging data via a message passing technique, according to the MIMD paradigm. There are many issues involved in multicomputer design and operation. Since the integrated design problem is quite complex, most research efforts study particular aspects of it in isolation. This paper attempts to address some of the design issues, at the message passing level, of a bus-oriented multicomputer architecture which uses hierarchical interconnection networks (HINs), focusing on optimization of the underlying communication system.

Hierarchical and parallel processing structures arise from a large class of software problems, and HINs arise naturally when a large number of processors are to be connected [1,3,7]. Topological aspects of hierarchical multicomputer systems can be found in [2]. The general problem under consideration is to develop a hierarchical cluster-based architecture which has to be optimal with respect to some criteria. More precisely, we concentrate on the performance of such systems as a function of the system design, bus technology, and traffic load distribution. Our work has been motivated by the observation that, although the communication system performance is a critical issue in a distributed computing system, most analyses assume that the communication network costs are negligible.
2. Characteristics of hierarchical interconnection networks

In a hierarchical multicomputer system, a number of CNs (processors equipped with local memory for program and data storage) are formed into a cluster (or module) by sharing a common bus, and these clusters are interconnected hierarchically. Buses, of the same or different technology, are used primarily for I/O and interprocessor communication. The conceptual organization of the HIN is shown in Fig. 1.

Fig. 1. The conceptual organization of a hierarchical interconnection network (L = 3).

The structure of the HINs considered here can be informally described by the following elements: (a) the total number of CNs in the system [N], (b) the number of levels in the hierarchy [L] (with the root as level 1), (c) the number of devices connected together to form a cluster at the i-level [B(i)], and (d) the number of clusters at the i-level [C(i)]. The number of CNs in the architecture is given by the relation

$$N = \sum_{i=1}^{L} C(i) \cdot B(i), \tag{1}$$

where $C(i) = C(i-1) \cdot B(i-1)$ with initial value $C(1) = 1$, and $1 \le B(1)$. Based on this notation the total number of CNs can be expressed as

$$N = \sum_{i=1}^{L} \sum_{k=1}^{C(i)} \left[ CN^{i}_{k} \cdot B(i) \right]. \tag{2}$$

This relation expresses the “formation” of the hierarchical architecture. The architecture can be seen as an extension of hierarchical multicomputer organizations in which each computer node is connected to only one processor “above” it, but may have multiple processors connected to it from “below”. A similar architecture has been proposed in order to support distributed simulation [3].

3. Message traffic representation

Each $CN^{i}_{k}$ acts like a coordinator for a lower cluster in the hierarchy and can send messages to CNs which belong to the same cluster and level (intracluster communication [ic]), to different clusters in the same level (intralevel communication [il]), and to clusters in different levels (interlevel communication within the hierarchy [h]). It is assumed that CNs are identical and are subjected to the same average arrival rate of tasks. This homogeneity of CNs implies that over a long time the external load imposed on each CN is the same [4]. Also, it is assumed that a CN's message is independent of its preceding messages (the so-called Request Independence Assumption [RIA]).

Let P be the total number of messages per unit time generated by a CN. The message distribution is denoted as $t^{m}_{i,j}$, where m is the level of the sender CN, j is the level of the receiver CN, and i is the level of the highest bus used in the hierarchy. For each CN in a given level m, the reference pattern is defined by the Message Distribution Matrix [$MDM^{(m)}$], which is an L x L matrix, and it holds that

$$\sum_{i=1}^{L} \sum_{j=1}^{L} t^{m}_{i,j} = 1. \tag{3}$$
The terms of an MDM depend on the actual behaviour of an application program. The most commonly used assumption for modelling the process of message generation is the Uniform Reference Model (URM), in which the probability of one CN sending a message to another CN is constant for the whole system and does not depend on the network topology. Generally, it is difficult to predict what specific values of $t^{m}_{i,j}$ one should expect in practice, but it is clear that a large number of parallel applications are in fact characterized by a communication structure that results in significant locality. Also, in some cases accurate values can be obtained from traces or careful analysis of the program.

In each MDM the terms $t^{m}_{i,j}$ with $i = m = j$ express the probability of intracluster communication. Also, the terms $t^{m}_{i,j}$ with (i) $m = j$ and $i > j$ and (ii) $m > j$ and $i > \min(m, j) + 1$ must be zero, since they express, respectively, a communication which is not supported by the architecture and a communication which has already been taken into account. It is assumed that CNs which belong to the same cluster have the same communication pattern. A CN in a level m generates P messages per unit time and these messages are distributed according to the Message Distribution Matrix [$MDM^{(m)}$].
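As an illustration of the matrix described above, the following Python sketch stores one MDM as a nested list and checks the basic properties stated in the text. The function names, the row/column convention (row i for the highest bus used, column j for the receiver level) and the numerical values are assumptions made for this example only; they do not come from the paper.

```python
import math


def validate_mdm(mdm, levels):
    """Check that mdm is an L x L matrix of probabilities that sum to one.

    mdm[i - 1][j - 1] holds t^m_{i,j}: i is the level of the highest bus
    used and j is the level of the receiver (an indexing choice made here).
    """
    if len(mdm) != levels or any(len(row) != levels for row in mdm):
        raise ValueError("MDM must be an L x L matrix")
    if any(entry < 0 for row in mdm for entry in row):
        raise ValueError("MDM entries must be non-negative")
    total = sum(sum(row) for row in mdm)
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"MDM entries sum to {total}, expected 1")
    return True


def intracluster_probability(mdm, m):
    """Return t^m_{m,m}, the term with i = m = j (intracluster traffic)."""
    return mdm[m - 1][m - 1]


# Hypothetical MDM for a sender in level m = 3 of an L = 3 hierarchy.
# Entries with i > j are zero (not supported by the architecture) and
# most of the traffic stays inside the sender's own cluster (locality).
MDM_LEVEL_3 = [
    [0.05, 0.05, 0.10],
    [0.00, 0.05, 0.15],
    [0.00, 0.00, 0.60],
]
```

With these made-up values, `validate_mdm(MDM_LEVEL_3, 3)` accepts the matrix and `intracluster_probability(MDM_LEVEL_3, 3)` returns the locality term 0.60. Values obtained from program traces could be checked in the same way.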
4. Delay functions

The multicomputer system may be composed of many kinds of communication links, and the overall system performance is affected by the delays on the buses and by the intensity and distribution of messages. Activities of an individual CN consist of two actions: internal processing followed by message transfer periods. A message is considered as a single entity with a fixed length. Also, it is assumed that the messages are of “non-wait type”, which means that the transmission ends as soon as the receiver CN receives the message.

In general, for a given message generation probability, the total delay of a bus is the sum of the delay due to arbitration protocols (delay before the bus is available) and the delay associated with the bus. The effect of the arbitration protocols on the performance of a system has been discussed in detail in the literature. In [5] five arbitration protocols, commonly employed in actual systems, were analyzed and it was concluded that the optimum arbitration protocol was the rotating priority (round robin). For this protocol the access time increases linearly with the number of processors. On the other hand, the delay associated with a bus depends on the number of devices connected to the bus and can be classified into four general types [6]: (1) constant, (2) linear, (3) logarithmic and (4) quadratic. Thus, for a given bus i the total delay can be expressed as

$$D(i) = D_{con} + K_{log} \cdot \log_2 N + K_{lin} \cdot N + K_{quad} \cdot N^{2} \quad [\text{t.u.}], \tag{4}$$
where the parameters K depend on the bus technology. In the above expression some of the delay components may be negligible or nonexistent. Usually, for ease of analysis, the arbitration times are assumed to be negligible. However, in the following analysis we consider a bus employing the rotating-priority protocol. At each cluster and level it is possible to use the delay function (4) by replacing the number N with the number of devices connected together to form the cluster under consideration, and by choosing the proper values of the parameters K. The evaluation of the delay functions is based on the relative values of i, m and j. The expression $D^{i}_{m,j}$ denotes the delay of a message when the sender CN is in level m, the receiver CN is in level j and the highest bus in the hierarchy used is in level i. Obviously $1 \le m \le L$, $1 \le j \le L$ and $1 \le i \le L$. The delay of a message is given by the expressions of Fig. 2.

Fig. 2. Delay functions in an L-level hierarchical interconnection network based on the relative values of i, m and j.

5. Bus load

In order to analyze the load of each bus in the hierarchy, the overall traffic has been divided into three types:

(a) Intracluster traffic. This consists of the traffic produced by CNs in the same cluster and it is expressed by the terms $t^{m}_{i,j}$ with $i = m = j$. Thus the bus load is given by the expression

$$L_{b=m}(ic) = B(m) \times P \times t^{m}_{m,m}. \tag{5}$$

(b) Intralevel traffic. This expresses the traffic among CNs of the same level but of different clusters. This traffic is supported by buses of higher levels. Thus, for a given level m, the level numbers of the various buses used for the communication will be between 1 and (m - 1). This type of traffic is represented by terms with the general form $t^{n}_{x,n}$ with $n = m, \ldots, L$, excluding the terms with $x = n$ and $x > m$ because they represent, respectively, intracluster communication and traffic which is not supported by the network. If m is the level number of the bus under consideration and i is the level number of a given bus that supports the intralevel communication, the load due to intralevel traffic of the m-level bus is given by

$$L^{i}_{b=m}(il) = \Big[ B(m) + [B(i) - 1] \times \prod_{n=i+1} B(n) \Big] \times P \times t^{m}_{i,m}. \tag{6}$$

Thus, the total load due to intralevel traffic is given by the relation

$$L_{b=m}(il) = \sum_{i=1}^{m-1} \sum_{k=m}^{L} \Big[ B(k) + [B(i) - 1] \times \prod_{n=i+1} B(n) \Big] \times P \times t^{k}_{i,k}. \tag{7}$$
(c) Interlevel traffic. This load expresses the traffic within the hierarchy between CNs of different levels. With respect to a bus in a given level m of the hierarchy, this type of traffic can be further divided into upstream (up) traffic, downstream (ds) traffic and traffic generated by coordinators (cd). These three categories are analyzed below:

(c1) The upstream traffic is generated by CNs which belong to the same cluster as the bus and by CNs of levels below the bus under consideration (i.e. by CNs with level number i where $(m - 1) < i \le L$). The bus load from this type of traffic is given by the expression

$$L^{up}_{b=m}(h) = \Big[ \ldots \, B(k) \times \sum_{i=1} \sum_{j=1} t^{k}_{i,j} \Big] \times P, \tag{8}$$

excluding the terms $t^{m}_{i,j}$ with: (a) $i > j$, because they represent traffic which is not supported by the architecture, and (b) $i = m = j$ and $m = j$, because they represent, respectively, intracluster and intralevel traffic which has already been taken into account. The load of a bus of the lowest level (m = L) is also given by relation (8), excluding only terms with $i = m = j$ and $m = j$.

(c2) A bus in a level m accepts downstream traffic which is produced by CNs in the hierarchy using buses with level numbers from 1 to (m - 1). The load of a given bus due to this type of traffic is given by the relation:

$$L^{ds}_{b=m}(h) = \ldots \times \prod_{n} B(n) \times \sum_{j} t^{k}_{i,j}. \tag{9}$$
In relation (9) the terms $t^{m}_{i,j}$ which represent intracluster and intralevel traffic must also be excluded.

(c3) Finally, each bus in the hierarchy, except the root bus (m = 1), accepts traffic from the coordinator CNs which is destined either to CNs connected to the bus under consideration or to CNs in lower levels. The bus load due to this traffic, for an internal bus, is given by the relation:

$$L^{cd}_{b=m}(h) = \sum_{k=2}^{m} \sum_{i=1}^{m-1} \sum_{j=1}^{L} t^{k}_{i,j}, \tag{10}$$

excluding the terms with $j < m$, and for a bus in the lowest level (m = L) by the relation:

$$L^{cd}_{b=L}(h) = \sum_{k=2}^{L} \sum_{i=1}^{m-1} t^{k}_{i,j}, \tag{11}$$

except the terms with $j < m$ and $i <$ (or $=$) $m$.
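The simplest of the load relations above, the intracluster load of relation (5), can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the dictionary of per-level MDMs, the indexing convention (row i, column j, both offset by one) and all numerical values are hypothetical, and the intralevel and interlevel relations (6)-(11) are deliberately left out.

```python
def intracluster_loads(B, P, mdms):
    """Evaluate relation (5), B(m) * P * t^m_{m,m}, for every level m.

    B    -- list with B[m - 1] = B(m), the cluster size at level m
    P    -- messages generated per unit time by one CN
    mdms -- dict mapping the sender level m to its L x L MDM, where
            mdms[m][i - 1][j - 1] stores t^m_{i,j} (convention assumed here)
    """
    loads = {}
    for m in range(1, len(B) + 1):
        t_mmm = mdms[m][m - 1][m - 1]  # term with i = m = j
        loads[m] = B[m - 1] * P * t_mmm
    return loads


# Hypothetical three-level system; every matrix sums to one.
B = [2, 4, 4]
P = 10.0
MDMS = {
    1: [[0.50, 0.20, 0.30], [0.00, 0.00, 0.00], [0.00, 0.00, 0.00]],
    2: [[0.10, 0.10, 0.10], [0.00, 0.60, 0.10], [0.00, 0.00, 0.00]],
    3: [[0.05, 0.05, 0.00], [0.00, 0.05, 0.05], [0.00, 0.00, 0.80]],
}

LOADS = intracluster_loads(B, P, MDMS)
```

Here `LOADS[m]` is the intracluster load, in messages per unit time, on one level-m bus. A fuller evaluation would add the intralevel and interlevel contributions for the same bus.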
6. Optimizing a bus-based hierarchical architecture with cluster structure
The main design objective considered here is to minimize the overall network delay and bus load. The method is based on the performance model derived above (relations (1)-(11)), and consists of the following steps:

Step 1. Estimation of the possible number of levels. With the assumption that a cluster in the hierarchy consists of more than one CN, the number of levels can be found in the range from $L_{min} = 1$, with $C(1) = 1$ and $B(1) = N$ (non-hierarchical architecture), to $L_{max} = \log_2 N$ (L-level architecture).

Step 2. Estimation of the number of possible architectures. For each L-level hierarchical architecture, using the relations (1) and (2) and by exhausting all combinations of C(i) and B(i) which satisfy the constraints of those relations, we are able to specify the number of possible architectures.

Step 3. Introduction of parameters in the delay functions of buses. Each cluster of the hierarchy can be implemented either with buses of the same technology or, as happens in real situations, with buses of different technologies. For each bus the parameters of its delay function must be given in this step.

Step 4. Introduction of traffic parameters (traffic intensity and message distribution). At this step the number of generated messages (P) and the distribution of messages in the various clusters of the hierarchy [$MDM^{(m)}$] are determined according to the application and the task assignment procedure.

Step 5. Performance evaluation of each architecture. For each possible architecture it is possible to compute (using relations (4)-(11), the delay expressions of Fig. 2, and the Message Distribution Matrices) the delays and bus loads associated with each cluster and level, as well as the total delay and load of buses.

Step 6. Selection of the architecture with optimum performance. The best architecture (in terms of given performance criteria) is selected.
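The enumeration and selection steps can be sketched in Python as follows. This is a minimal sketch under explicit assumptions, not the full procedure: candidate architectures are tuples (B(1), ..., B(L)) that satisfy relation (1) with at least two devices per cluster, every bus uses the same delay parameters, and the objective is a stand-in (the sum of one bus delay per level) that ignores the bus loads and Message Distribution Matrices used in Step 5. All function and parameter names are hypothetical.

```python
import math


def total_nodes(B):
    """Relation (1): N = sum of C(i) * B(i), with C(1) = 1, C(i) = C(i-1) * B(i-1)."""
    clusters, total = 1, 0
    for b in B:
        total += clusters * b
        clusters *= b
    return total


def enumerate_architectures(N, min_cluster_size=2):
    """Yield every tuple (B(1), ..., B(L)) whose relation-(1) total is exactly N."""
    def extend(prefix, clusters, placed):
        # Level sizes b for which the new level still fits into N.
        for b in range(min_cluster_size, (N - placed) // clusters + 1):
            candidate = prefix + (b,)
            new_placed = placed + clusters * b
            if new_placed == N:
                yield candidate
            else:
                yield from extend(candidate, clusters * b, new_placed)
    yield from extend((), 1, 0)


def bus_delay(n_devices, d_con=0.0, k_log=0.0, k_lin=0.0, k_quad=0.0):
    """Delay function (4), with N replaced by the devices on one bus."""
    return (d_con + k_log * math.log2(n_devices)
            + k_lin * n_devices + k_quad * n_devices ** 2)


def path_delay(B, **bus_params):
    """Stand-in objective (an assumption): one bus delay per level, summed."""
    return sum(bus_delay(b, **bus_params) for b in B)


def select_architecture(N, **bus_params):
    """Step 6 under the stand-in objective: the candidate with the least delay."""
    return min(enumerate_architectures(N),
               key=lambda B: path_delay(B, **bus_params))


CANDIDATES = list(enumerate_architectures(14))
BEST = select_architecture(14, d_con=1.0, k_lin=1.0)
```

For N = 14 the sketch finds three candidates, (2, 2, 2), (2, 6) and (14,), and with a constant term plus a linear (round-robin) term it selects the three-level hierarchy. Replacing `path_delay` with a load-weighted delay built from relations (4)-(11) would bring the sketch closer to Step 5.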
7. Conclusion

In this paper we have presented a performance model, at the message passing level, for bus-based hierarchical multicomputer systems with cluster structure. Based on relatively simple expressions, the model gives the network performance as a function of the system design, bus technology, and traffic load distribution, and permits the development of a methodology which can be used in selecting the best interconnection according to given design objectives. Other bus delay functions and patterns of communication can be substituted for the ones used here, but the basic analysis technique remains unchanged. The proposed methodology can easily be augmented to include additional objectives and/or decision variables in order to achieve a system architecture with the best performance/cost ratio.
References

[1] P.L. Borrill, High-speed 32-bit buses for forward-looking computers, IEEE Spectrum 7 (1989) 34-37.
[2] V. Cantoni, M. Ferretti and L. Lombardi, A comparison of homogeneous hierarchical interconnection structures, Proc. IEEE 79 (4) (1991) 416-427.
[3] A.I. Concepcion, A hierarchical computer architecture for distributed simulation, IEEE Trans. Comput. 38 (1989) 311-319.
[4] D.L. Eager, E.D. Lazowska and J. Zahorjan, Adaptive load sharing in homogeneous distributed systems, IEEE Trans. Software Engineering 12 (1986) 662-675.
[5] F. El Guibaly, Design and analysis of arbitration protocols, IEEE Trans. Comput. 38 (1989) 161-171.
[6] B.C. Winsor and T.N. Mudge, Analysis of bus hierarchies for multiprocessors, in: Proc. 15th IEEE Computer Architecture Conf. (1988) 100-107.
[7] S.B. Wu and M.T. Liu, A cluster structure as an interconnection network for large multicomputer systems, IEEE Trans. Comput. 30 (1981) 254-264.