Parallel Computing: Software Technology, Algorithms, Architectures and Applications G.R. Joubert, W.E. Nagel, F.J. Peters and W.V. Walter (Editors) © 2004 Elsevier B.V. All rights reserved.
A Broker Architecture for Object-Oriented Master/Slave Computing in a Hierarchical Grid System

M. Di Santo (a), N. Ranaldo (a), and E. Zimeo (a)

(a) Department of Engineering - Research Centre on Software Technology, University of Sannio - Benevento - ITALY

Even though many simulation frameworks for predicting performance in Grid systems have been implemented, real Grid middleware platforms lack effective resource managers that can help them decide dynamically where to place tasks. This paper proposes a broker architecture and its implementation for a hierarchical Grid system. The broker, whose logic is based on an economy-driven model, is able both to transparently split a sequential object-oriented application into tasks, according to the master/slave computing model, and to automatically distribute slave tasks to a set of computational resources selected so that the application executes within the QoS specified by the user. The goal is achieved by designing the broker with three patterns: Grid Broker, Reflection and Master/Slave.

1. INTRODUCTION

One of the most important features of a Grid environment is the ability to exploit a large, variable number of distributed heterogeneous resources in order to run an application within a deadline and without exceeding a predefined budget [4]. Such a service is typically provided by a specific middleware component: the broker. Recently, researchers in the field of Grid brokering and scheduling have developed many simulation frameworks [4, 5] for predicting application or system performance. Unfortunately, these simulators cannot be applied directly to real Grid middleware platforms to help them decide dynamically where to place tasks. In a real environment many questions arise. How should a real application be described? How can tasks and their dependencies be identified?
What resource parameters are significant for splitting a job into tasks and for scheduling them? How are tasks related to the programming model used? How should the application load be distributed on the basis of resource information? To our knowledge, only a few papers [11, 13, 14] have analyzed all these aspects of a real Grid broker. However, they do not (1) identify a reference software architecture, (2) define resource and application descriptors, (3) analyze dependencies among tasks, or (4) show how to reuse sequential code for executing an application in a Grid.

This paper attempts to answer these questions by defining a broker architecture for a hierarchical Grid system. The proposed broker, whose logic is based on an economy-driven model, is able both to transparently split a sequential object-oriented application into tasks, according to the master/slave computing model, and to automatically distribute slave tasks to a set of resources selected so that the application executes within the QoS specified by the user. The goal is achieved by designing the broker with three patterns: Grid Broker, Reflection and Master/Slave. In particular, the broker
is integrated in a Java-based Grid middleware, called Hierarchical Metacomputer Middleware (HiMM) [6, 7], which provides information and communication services and allows users to program applications by adopting a distributed object model.

The remainder of the paper is organized as follows. Section 2 introduces hierarchical Grids and briefly describes our middleware. Section 3 discusses different approaches for selecting resources in Grids. Section 4 presents the Grid Broker pattern, which is a pattern purposely defined for designing Grid middleware platforms. Section 5 tackles the problem of object-oriented task scheduling when the master/slave computing model is adopted. Section 6 concludes the paper and introduces future work.

2. HIERARCHICAL GRID ARCHITECTURES

A hierarchical topology suits Grid systems because it allows (1) remote resource owners to enforce their own policies on external users [11], (2) applications to exploit the performance of dedicated networks and (3) the system to be more scalable. Following these observations we have developed HiMM, a customisable middleware based on an architecture composed of four layers, as proposed in [9]. HiMM is able to exploit collections of computers (hosts) interconnected by heterogeneous networks to build Hierarchical Metacomputers (HiMs). At the connectivity layer, HiMs are composed of abstract nodes interconnected by a virtual network based on a hierarchical topology. The nodes allocated onto hosts that are hidden from the Internet or connected by dedicated, fast networks can be grouped in macro-nodes, each one controlled by a Coordinator (C). The coordinator of the highest hierarchical level is the root of a HiM and is interfaced with the Console. Each host wanting to donate its computing power runs a special server, the Host Manager (HM), which receives creation commands from the console and creates the required nodes. It is worth noting that the console can create nodes only at the highest hierarchical level of a HiM.
However, if a coordinator receives a creation command, it creates nodes inside the macro-node that it controls according to the supplied configuration information.

At the resource layer, HiMM provides an information service based on the interaction between the Resource Manager (RM) and the HM. An RM, allocated on one of the hosts taking part in a macro-node, is periodically contacted by the HMs running on the hosts belonging to the same macro-node and wanting to publish information about: (1) the CPU power and its utilization; (2) the available memory; (3) the communication performance. Information is collected by the RM and made available to subscribers.

3. RESOURCE SELECTION AND TASK MAPPING

As stated in [14], the most common current Grid broker is the user, but many research efforts [1, 17, 18] are underway to change this approach. The simplest and most immediate scheme for selecting resources presents the result of resource discovery directly to the user through the console; the user, on the basis of knowledge of the application, then selects a subset of the discovered machines for building a meta-system. A more sophisticated approach is based on the interaction among the information system, the user and the broker. With this scheme, resource selection is automatically performed by the broker. To this end, the broker needs some information about the resources, the application to run and the QoS desired by the user. The result of the selection can then be used to perform the mapping, either by the user through the console or directly by the broker (fig. 1.a). The mapping concerns the spatial allocation of application tasks to the selected machines, which, in our environment, means the creation of the processes
on the selected machines that will host object-oriented tasks. In this paper we focus on the third approach, in which the broker performs both selection and mapping, and which guarantees a transparent use of heterogeneous, multi-owner, distributed resources. With this approach, the broker acts as a mediator between the user (console) and Grid resources (HiM) by using middleware services (Info System and Host Managers). In particular, the broker becomes responsible for resource discovery, resource selection, process/task mapping and task scheduling, and presents the Grid (a HiM) as a single, unified resource. The broker architecture is distributed and hierarchical in order to match the organization of a hierarchical Grid (fig. 1.b). This architecture is not conceptually new [11], but the paper discusses how such an organization can be used to schedule object-oriented tasks dynamically produced by a running application.

4. GRID BROKER ARCHITECTURAL PATTERN
Identifying a software architecture for designing and studying complex distributed systems is a key factor for their engineering and widespread diffusion [13]. Software architectures are typically described through architectural and design patterns, which offer an effective way to reuse design solutions. Following this approach, our broker is designed according to the Grid Broker pattern, which is a variant of the Broker pattern [3] purposely modified to satisfy the requirements of Grids. Due to limited space, in this paper we give only a brief overview of the pattern, which will be fully described in a future work.

The pattern uses the broker as a key component to decouple clients (Grid users) from resources (computational power, communication bandwidth, storage capacity). Resources register themselves with the broker and make their functions available to clients. Clients use the resources by issuing jobs to the broker. The broker is responsible for discovering resources, selecting resources on the basis of QoS parameters, mapping tasks to the selected resources, scheduling tasks according to a pattern of dependencies and collecting results. Therefore, the broker hides all the details of a Grid system by offering a simple framework for the deployment and the execution of Grid applications (in the following called jobs).

The structure of the Grid Broker pattern comprises six participants: Job, Client, Resource, Task, Grid Broker and Information Service. Job is a generic application whose execution needs many resources. It is composed of the application code along with documents (descriptors) describing the parallel computing model to use and every other detail useful for a correct execution, such as its complexity, I/O operations, and sizes of inputs and outputs. These details can be used by the broker to select resources in order to meet the QoS constraints (such as deadline and budget), which the user also specifies within a descriptor.
In particular, when the application is unstructured, the job is a set of independent and self-contained tasks (code and input data). Client is the component which submits a job to the
Grid Broker for the execution. It (1) runs on the Grid user machine, (2) can interact with the broker, (3) can control and monitor the job execution, and (4) can terminate the job execution.

Figure 1. (a) Automatic resource selection and mapping; (b) hierarchical architecture of the broker

Figure 2. Structure and dynamics of the Grid Broker pattern

Resource is a generic resource which makes its services available to users for executing tasks. It is able to register itself in an information service in order to make its services available for Grid computing. Task is an object allocated on a Resource by the Grid Broker in order to run a task of the job. Grid Broker is the key component of the pattern. It exports an API to clients for submitting jobs and to resources for registering themselves. In order to accomplish its tasks, the Grid Broker interacts with the Information Service (which may or may not be integrated in the broker) for discovering resources. Moreover, the Grid Broker can select resources able to execute an application while satisfying user requirements, by using an algorithm chosen by the user. It also allocates tasks to the selected resources and schedules them for execution. Information Service is used by resources to register themselves. It provides registration, indexing and discovery services.

The sequence diagram in figure 2 shows a typical scenario for executing a job. The Grid Broker activity is divided into three phases: resource discovery, resource selection, and task mapping and scheduling. However, implementing the Grid Broker pattern in a real Grid requires addressing several problems regarding application and resource description, the algorithm used by the broker to select resources in order to satisfy the desired QoS, the scheduling of tasks and the management of their dependencies.
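The participants and the three broker phases described above can be sketched in Java as follows. This is a minimal, single-process illustration and not the HiMM implementation: the class names mirror the pattern's participants, but every method signature, the `maxCostPerTask` constraint, the round-robin mapping and the in-memory `InformationService` are assumptions made only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical participant types; signatures are assumptions, not the HiMM API.
interface Resource {
    String name();
    double costPerTask();
    Object run(Supplier<Object> task);   // executes one Task on this resource
}

class SimpleResource implements Resource {
    private final String name;
    private final double cost;
    SimpleResource(String name, double cost) { this.name = name; this.cost = cost; }
    public String name() { return name; }
    public double costPerTask() { return cost; }
    public Object run(Supplier<Object> task) { return task.get(); }
}

// Information Service: registration and discovery only.
class InformationService {
    private final List<Resource> registry = new ArrayList<>();
    void register(Resource r) { registry.add(r); }
    List<Resource> discover() { return new ArrayList<>(registry); }
}

// A Job here is a list of independent tasks plus one illustrative QoS constraint.
class Job {
    final List<Supplier<Object>> tasks;
    final double maxCostPerTask;
    Job(List<Supplier<Object>> tasks, double maxCostPerTask) {
        this.tasks = tasks;
        this.maxCostPerTask = maxCostPerTask;
    }
}

class GridBroker {
    private final InformationService info;
    GridBroker(InformationService info) { this.info = info; }

    List<Object> execute(Job job) {
        // Phase 1: resource discovery.
        List<Resource> discovered = info.discover();
        // Phase 2: resource selection against the QoS constraint.
        List<Resource> selected = new ArrayList<>();
        for (Resource r : discovered) {
            if (r.costPerTask() <= job.maxCostPerTask) selected.add(r);
        }
        if (selected.isEmpty()) throw new IllegalStateException("no resource satisfies the QoS");
        // Phase 3: task mapping (round-robin), scheduling and result collection.
        List<Object> results = new ArrayList<>();
        for (int i = 0; i < job.tasks.size(); i++) {
            Resource target = selected.get(i % selected.size());
            results.add(target.run(job.tasks.get(i)));
        }
        return results;
    }
}

class GridBrokerDemo {
    static List<Object> runDemo() {
        InformationService info = new InformationService();
        info.register(new SimpleResource("cheap", 1.0));
        info.register(new SimpleResource("expensive", 10.0));
        List<Supplier<Object>> tasks = new ArrayList<>();
        tasks.add(() -> 2 + 3);
        tasks.add(() -> "done");
        return new GridBroker(info).execute(new Job(tasks, 5.0));
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

In this sketch the client never sees which resource ran a task, which is the decoupling the pattern aims for; a real broker would replace the local `run` call with a remote allocation.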
4.1. Application and resource descriptors

In order to execute an application within the desired deadline and without exceeding a budget, the user must specify his or her requirements, whereas resource owners must specify the features of their resources. In order to collect this information, we defined XML-based descriptors for user requirements, jobs and resource features. These descriptors are called User Requirements Description Format (URDF), Job Description Format (JDF) and Resource Description Format (RDF), respectively. As regards RDF, it is worth noting that effective task scheduling can use resource information at several levels: low level (such as CPU speed), medium level (such as the time required for performing a simple operation), high level (such as the time required to execute a common task), and dynamically acquired (such as a specific benchmark executed by a mobile agent).
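The paper does not reproduce the descriptor schemas, so the following is only a sketch of how a broker might read the two QoS constraints named above from a URDF-like document. The element names `urdf`, `deadline` and `budget`, the `unit` attribute and the `UserRequirements` class are all hypothetical; only the standard JDK XML parser calls are real.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical holder for the two QoS constraints mentioned in the text.
class UserRequirements {
    final double deadlineSeconds;
    final double budget;

    UserRequirements(double deadlineSeconds, double budget) {
        this.deadlineSeconds = deadlineSeconds;
        this.budget = budget;
    }

    // Parses a URDF-like document; the element names are assumptions.
    static UserRequirements parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return new UserRequirements(
                Double.parseDouble(text(doc, "deadline")),
                Double.parseDouble(text(doc, "budget")));
        } catch (Exception e) {
            throw new IllegalArgumentException("invalid URDF document", e);
        }
    }

    // Returns the trimmed text of the first element with the given tag name.
    private static String text(Document doc, String tag) {
        NodeList nodes = doc.getElementsByTagName(tag);
        if (nodes.getLength() == 0) throw new IllegalArgumentException("missing <" + tag + ">");
        return nodes.item(0).getTextContent().trim();
    }
}

class UrdfDemo {
    // Illustrative document only; not the format used by HiMM.
    static final String SAMPLE =
        "<urdf><deadline unit=\"s\">3600</deadline><budget>250.0</budget></urdf>";

    public static void main(String[] args) {
        UserRequirements req = UserRequirements.parse(SAMPLE);
        System.out.println(req.deadlineSeconds + " s, budget " + req.budget);
    }
}
```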
4.2. Scheduling algorithms

The selection algorithms adopted by our broker implementation are based on the economy model proposed in [4]. In particular, the time minimization algorithm selects the resources whose aggregate cost is lower than the budget and that are able to complete the job execution as quickly as possible. The cost minimization algorithm selects the cheapest resources able to complete the application execution within the deadline. A third algorithm selects the resources whose cost is lower than the budget and that assure the completion of the job execution within the deadline. To facilitate the adoption of different scheduling algorithms, the Grid Broker could be designed according to the Strategy behavioral pattern.
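The following sketch shows how the Strategy pattern could host two of these algorithms. It is an illustration under simplifying assumptions, not the broker's actual code: the job is taken to be a set of identical independent tasks, each resource quotes a fixed time and price per task (the hypothetical `Offer` class), and the time minimization variant is a simple greedy heuristic rather than an exact optimizer.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical resource quote: time and price to run one task of the job.
class Offer {
    final String name;
    final double secondsPerTask;
    final double costPerTask;
    Offer(String name, double secondsPerTask, double costPerTask) {
        this.name = name;
        this.secondsPerTask = secondsPerTask;
        this.costPerTask = costPerTask;
    }
}

// Strategy interface: returns how many tasks each offer receives (input order).
interface SelectionStrategy {
    int[] allocate(List<Offer> offers, int tasks, double deadline, double budget);
}

// Cost minimization: fill the cheapest resources first, up to what fits in the deadline.
class CostMinimization implements SelectionStrategy {
    public int[] allocate(List<Offer> offers, int tasks, double deadline, double budget) {
        Integer[] order = new Integer[offers.size()];
        for (int i = 0; i < order.length; i++) order[i] = i;
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> offers.get(i).costPerTask));
        int[] counts = new int[offers.size()];
        int left = tasks;
        for (int i : order) {
            int capacity = (int) Math.floor(deadline / offers.get(i).secondsPerTask);
            counts[i] = Math.min(capacity, left);
            left -= counts[i];
        }
        if (left > 0) throw new IllegalStateException("deadline cannot be met");
        return counts;
    }
}

// Time minimization (greedy heuristic): give each task to the resource that would finish
// it earliest, keeping enough budget to pay the cheapest resource for the remaining tasks.
class TimeMinimization implements SelectionStrategy {
    public int[] allocate(List<Offer> offers, int tasks, double deadline, double budget) {
        double cheapest = Double.MAX_VALUE;
        for (Offer o : offers) cheapest = Math.min(cheapest, o.costPerTask);
        int[] counts = new int[offers.size()];
        double remaining = budget;
        for (int k = 0; k < tasks; k++) {
            double spendable = remaining - (tasks - k - 1) * cheapest;
            int best = -1;
            double bestFinish = Double.MAX_VALUE;
            for (int i = 0; i < offers.size(); i++) {
                Offer o = offers.get(i);
                double finish = (counts[i] + 1) * o.secondsPerTask;
                if (o.costPerTask <= spendable && finish < bestFinish) {
                    best = i;
                    bestFinish = finish;
                }
            }
            if (best < 0) throw new IllegalStateException("budget cannot be met");
            counts[best]++;
            remaining -= offers.get(best).costPerTask;
        }
        return counts;
    }
}

// Helpers to evaluate an allocation: total price and parallel completion time.
class Plan {
    static double cost(List<Offer> offers, int[] counts) {
        double total = 0;
        for (int i = 0; i < counts.length; i++) total += counts[i] * offers.get(i).costPerTask;
        return total;
    }
    static double makespan(List<Offer> offers, int[] counts) {
        double worst = 0;
        for (int i = 0; i < counts.length; i++) {
            worst = Math.max(worst, counts[i] * offers.get(i).secondsPerTask);
        }
        return worst;
    }
}
```

Because both classes implement `SelectionStrategy`, a broker can hold a reference to the interface and let the user choose the algorithm at submission time, which is the design choice the text suggests.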
4.3. Management of task dependencies

Managing all kinds of task dependencies is difficult. In this paper we focus on the master/slave model of parallelism applied to object-oriented applications, also known as the Master/Slave pattern [3]. This programming model has been used successfully for a wide class of parallel applications [16, 8] and is well suited to programming in heterogeneous Grid environments [2, 15]. In this model, the dependencies and the communication patterns among tasks are simple and statically definable. In particular, a special process, the master, assigns tasks to other processes, the slaves. So, for each slave the following actions have to be performed: 1) transmission of data and command to the slave; 2) execution of the computational task; 3) transmission of the result from the slave back to the master; 4) processing of the partial results produced by the slave in order to build the final result. In particular, in order to transparently subdivide application tasks among computational resources, programmers have to specify how: (1) the computation can be divided, using an algorithm or domain-specific information; (2) the final result of the whole task can be computed from the sub-results obtained from the slaves.

5. OBJECT-ORIENTED MASTER/SLAVE SCHEDULING

In our object-oriented environment, the master/slave parallel model requires the creation of the processes and the allocation onto them of globally referable remote (slave) objects, whose methods can be asynchronously invoked by the master object. These objects can be instances either of classes implementing the master/slave pattern or of regular existing classes. In both cases, when a resource is a Coordinator/Broker, the task is further subdivided into a number of smaller tasks. The splitting continues until no more coordinators are met or the task cannot be further subdivided.
The computation completes when all the subtasks are completed and the partial results have been collected by the master. In the first case (fig. 3), a client object invokes a remote method of a master object (service) that, by exploiting broker information, splits the task into a number of subtasks and assigns them to the high-level resources of a HiM. In the second case (fig. 4), any existing class can be used to instantiate an object that works as a master. A JDF descriptor indicates which method has to be used as the service method and which task has to be split into subtasks. Therefore, the pattern is dynamically implemented and every object can be turned into a master object able to transparently split the service task into subtasks. This approach is made possible by using reflection and a meta-object protocol (MOP) [12]. Among the several MOP schemes [10] presented in the literature, we chose the proxy-based run-time MOP to avoid modifications to virtual machines (VMs). Such an approach is based on the introduction at compile time of hooks in a program so as to reify run-time events such as object creations and method invocations. To avoid VM modifications, this scheme uses the Proxy design pattern:
surrogate objects (stubs) intercept object creation and method invocation events and pass them to the meta-level.

    class Master extends Task implements IMaster {
        public Object execute() { return service(); }
        public Object service() {
            splitWork(); callSlaves(); combineResults();
            return ...;
        }
    }

    class Slave extends Task implements ISlave {
        private Matrix b;
        private float[][] a;
        public Object execute() { return service(); }
        public Object service() { return b.multiply(a); }
    }

Figure 3. Structure and dynamics of the Grid Broker pattern specialized for the master/slave model
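The split, call and combine steps of the static master/slave case can be illustrated with a small runnable sketch. It is written under strong assumptions: a local thread pool stands in for the remote slave objects, the workload is a plain array sum rather than the matrix product of the figure, and the class names `SumMaster` and `SumSlave` are invented for this example; recursive splitting by coordinators is not modeled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Slave task: sums one slice of the input (stands in for a remote slave object).
class SumSlave implements Callable<Long> {
    private final int[] data;
    private final int from;
    private final int to;

    SumSlave(int[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    public Long call() {
        long s = 0;
        for (int i = from; i < to; i++) s += data[i];
        return s;
    }
}

class SumMaster {
    private final int[] data;
    private final int slaves;   // must be at least 1

    SumMaster(int[] data, int slaves) {
        this.data = data;
        this.slaves = slaves;
    }

    // splitWork: one contiguous slice per slave.
    List<SumSlave> splitWork() {
        List<SumSlave> parts = new ArrayList<>();
        int chunk = (data.length + slaves - 1) / slaves;
        for (int from = 0; from < data.length; from += chunk) {
            parts.add(new SumSlave(data, from, Math.min(from + chunk, data.length)));
        }
        return parts;
    }

    // service: split, invoke the slaves asynchronously, then combine partial results.
    long service() {
        ExecutorService pool = Executors.newFixedThreadPool(slaves);
        try {
            List<Future<Long>> pending = new ArrayList<>();
            for (SumSlave s : splitWork()) pending.add(pool.submit(s));   // callSlaves
            long total = 0;
            for (Future<Long> f : pending) total += f.get();              // combineResults
            return total;
        } catch (Exception e) {
            throw new IllegalStateException("slave failed", e);
        } finally {
            pool.shutdown();
        }
    }
}

class MasterSlaveDemo {
    public static void main(String[] args) {
        int[] data = new int[100];
        for (int i = 0; i < data.length; i++) data[i] = i + 1;
        System.out.println(new SumMaster(data, 4).service());
    }
}
```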
    // TASK executed by a generic Resource
    float[][] a = ...;
    float[][] b = ...;
    Matrix mb = (Matrix) MOP.newWrapper(new Matrix(b));
    float[][] c = mb.multiply(a);

Figure 4. Structure and dynamics of the Grid Broker pattern with reflection
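The reflective case can be approximated with the JDK's own dynamic proxies. This sketch swaps in `java.lang.reflect.Proxy` for the paper's MOP, so it differs in one important way: JDK proxies can only stand in for interfaces, whereas the listing above wraps the `Matrix` class itself. The names `Multiplier`, `PlainMatrix` and `MasterWrapper` are hypothetical, the service method name is hard-coded instead of being read from a JDF descriptor, and the row blocks are multiplied locally rather than on remote resources.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Arrays;

// Hypothetical service interface; required because JDK proxies work on interfaces only.
interface Multiplier {
    float[][] multiply(float[][] a);   // returns (this matrix) x a
}

// A regular sequential class that knows nothing about masters or slaves.
class PlainMatrix implements Multiplier {
    final float[][] rows;

    PlainMatrix(float[][] rows) { this.rows = rows; }

    public float[][] multiply(float[][] a) {
        float[][] c = new float[rows.length][a[0].length];
        for (int i = 0; i < rows.length; i++)
            for (int j = 0; j < a[0].length; j++)
                for (int k = 0; k < a.length; k++)
                    c[i][j] += rows[i][k] * a[k][j];
        return c;
    }
}

// Stand-in for MOP.newWrapper: intercepts the service method and splits it by row blocks.
class MasterWrapper implements InvocationHandler {
    private final PlainMatrix target;
    private final int parts;   // must be at least 1

    MasterWrapper(PlainMatrix target, int parts) {
        this.target = target;
        this.parts = parts;
    }

    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        // Any method other than the service method goes straight to the wrapped object.
        if (!method.getName().equals("multiply")) return method.invoke(target, args);
        int n = target.rows.length;
        float[][] result = new float[n][];
        int chunk = (n + parts - 1) / parts;
        for (int from = 0; from < n; from += chunk) {
            int to = Math.min(from + chunk, n);
            // Each row block plays the role of a slave object; here it runs locally.
            PlainMatrix slave = new PlainMatrix(Arrays.copyOfRange(target.rows, from, to));
            float[][] partial = (float[][]) method.invoke(slave, args);
            // Combine: place the partial rows at their original position.
            System.arraycopy(partial, 0, result, from, partial.length);
        }
        return result;
    }

    static Multiplier newWrapper(PlainMatrix m, int parts) {
        return (Multiplier) Proxy.newProxyInstance(
            Multiplier.class.getClassLoader(),
            new Class<?>[] { Multiplier.class },
            new MasterWrapper(m, parts));
    }
}

class ReflectionDemo {
    public static void main(String[] args) {
        float[][] b = { {1, 2}, {3, 4}, {5, 6} };
        float[][] a = { {2, 0}, {0, 2} };
        Multiplier mb = MasterWrapper.newWrapper(new PlainMatrix(b), 2);
        System.out.println(Arrays.deepToString(mb.multiply(a)));
    }
}
```

The caller's code is the same whether it holds the plain object or the wrapper, which mirrors the transparency the reflective approach is meant to provide.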
6. CONCLUSIONS AND FUTURE DIRECTIONS

The main contribution of this paper is the definition of a platform-independent architectural pattern for designing Grid Brokers for middleware platforms. Although such a pattern can be implemented in any middleware, in this paper it is implemented to provide HiMM with brokering services. This implementation uses XML-based documents for describing resources, applications and user requirements, and uses an economy-driven model for selecting resources from a pool in order to guarantee the execution of a job while respecting QoS constraints. Finally, the paper analyses the master/slave computing model and its integration with the Grid
Broker pattern. In particular, two implementations of the master/slave pattern are discussed: a static one and a dynamic, reflection-based one. In the future, we intend (1) to develop a Web Service based implementation of the broker for its integration with OGSA/Globus [19] and (2) to investigate reservation policies for guaranteeing QoS requirements when non-dedicated computers are used.

REFERENCES
[1] G. Allen et al., The Cactus Worm: Experiments with Dynamic Resource Discovery and Allocation in a Grid Environment. J. of High Performance Computing Applications, 15(4):345-358, 2001.
[2] F. Berman, High-performance schedulers. In I. Foster and C. Kesselman, ed., The Grid: Blueprint for a New Computing Infrastructure. Chap. 12. Morgan Kaufmann Publishers, July 1998.
[3] F. Buschmann et al., Pattern-Oriented Software Architecture: A System of Patterns. Wiley and Sons, 1996.
[4] R. Buyya and M. Murshed, GridSim: A Toolkit for the Modeling and Simulation of Distributed Resource Management and Scheduling for Grid Computing. The J. of Concurrency and Computation: Practice and Experience, Wiley Press, pp 1-32, May 2002.
[5] H. Casanova, Simgrid: A Toolkit for the Simulation of Application Scheduling. Proc. of the First IEEE/ACM Intl Symposium on Cluster Computing and the Grid, Brisbane, Australia, May 15-18 2001.
[6] M. Di Santo, F. Frattolillo, W. Russo, and E. Zimeo, A Portable Middleware for Building High Performance Metacomputers. Proc. of Intl Conf. PARCO'01, Naples, Italy, September 4-7, 2001.
[7] M. Di Santo, F. Frattolillo, W. Russo, and E. Zimeo, A Component-based Approach to Build a Portable and Flexible Middleware for Metacomputing. Parallel Computing, 28(12):1789-1810, Elsevier, 2002.
[8] K. Everaars and B. Koren, Using coordination to parallelize sparse-grid methods for 3-d cfd problems. Parallel Computing, 24(7):1081-1106, Elsevier, 1998.
[9] I. Foster, C. Kesselman and S. Tuecke, The Anatomy of the Grid: Enabling Scalable Virtual Organizations. Intl J. of Supercomputer Applications, 15(3), 2001.
[10] J. de O. Guimarães, Reflection for Statically Typed Language. LNCS 1445, pp 440-461, 1998.
[11] K. Krauter, R. Buyya, and M. Maheswaran, A Taxonomy and Survey of Grid Resource Management Systems for Distributed Computing. Intl J. of Software: Practice and Experience, 32(2), Wiley Press, USA, 2002.
[12] G. Kiczales, J. des Rivières, and D. G. Bobrow, The Art of the Metaobject Protocol. MIT Press, 1991.
[13] O. F. Rana and D. W. Walker, Service Design Patterns for Computational Grids. In Patterns and Skeletons for Parallel and Distributed Computing, ed. by F. A. Rabhi and S. Gorlatch. Springer-Verlag, 2002.
[14] J. M. Schopf, A General Architecture for Scheduling on the Grid. JPDC, Special Issue on Grid Computing, April, 2002.
[15] G. Shao, F. Berman, R. Wolski, Master/Slave Computing on the Grid. Heterogeneous Computing Workshop. IEEE Computer Society Press, 2000.
[16] L. M. Silva, et al., Using mobile agents for parallel processing. Proc. of the International Symposium on Distributed Objects and Applications, Sept. 1999.
[17] Condor, http://www.cs.wisc.edu/condor/. 2003.
[18] Apples, http://grail.ucsd.edu/. 2003.
[19] OGSA, http://www.globus.org/ogsa/. 2003.