A generic tool to federate WAN servers

Journal of Network and Computer Applications (2001) 24, 307–331
doi:10.1006/jnca.2001.0136, available online at http://www.idealibrary.com

Nawel Sabri† and Chantal Taconet‡

† Institut National de Recherche en Informatique et en Automatique, Domaine de Voluceau, BP 105, 78153 Le Chesnay Cedex, France. E-mail: [email protected]
‡ Institut National des Télécommunications, Dépt. Informatique, 9 rue Charles Fourier, 91011 Evry Cedex, France. E-mail: [email protected]

(Received 15 July 2000; accepted 23 July 2001)

On a Wide Area Network (WAN), services are more efficient if they are supported by several servers located near their respective clients. In this article, we present a generic tool which facilitates the federation of servers over a WAN. This tool is in charge of both dynamically managing the federation and propagating requests to all the federation's servers. It may be applied to any kind of widely available service; as far as we know, no existing generic tool lets WAN services federate their servers. We present the benefits such services could derive from our federation tool. The federation adapts dynamically to server addition, server failure and network topology modification. Each server has a global view of the federation, which is used in particular to propagate requests to the federation. We present the general model and the design of the tool, which is built upon group communication objects, and we discuss some implementation issues in a CORBA (Common Object Request Broker Architecture) environment. The tool is generic in that it may be applied to any service; we show how a specific service may use the generic federation tool, in particular to propagate typed requests. Finally, we present an application of this tool for federating CORBA traders. © 2001 Academic Press

1. Introduction

More and more computer services are available to a large number of clients spread over the Internet. In order to offer the service with an acceptable response time, these widely available services need to be supported by several servers disseminated around the network. Increasing the number of servers reduces the load on each server, and a good distribution of the servers over the network globally reduces propagation delays between clients and servers. In order to provide the service with distributed servers, service developers need tools to enable cooperation between servers.

There are many examples of widely available services provided thanks to the cooperation of a set of servers over a Wide Area Network (WAN). USENET News [1] has been distributed to end users thanks to the cooperation of several news servers for more than ten years. In the MBone [2], the subset of the Internet supporting multicast routing, packets are forwarded through tunnels set up between a set of cooperating multicast routers. And the CORBA trading service [3] may be provided by a federation of traders which cooperate to handle clients' queries.

As far as we know there is no tool to federate the servers of widely available services. The study of the above examples shows that the cooperation links between the servers are set up manually by human administrators, independently on each server. We argue that the current manual administration of a cooperating server federation has the following drawbacks. The federation links chosen may be inefficient: two linked servers may be very distant in the underlying network topology. The federation graph may include loops, which have to be handled when information is propagated. And the federation graph may not be fault tolerant, so that at propagation time information might never reach a server because of the failure of another server in the federation.

In this article, we present a tool that we call the Generic tool for Dynamic Federation of Servers (GDFS). GDFS is generic and so may be used to build a federation for any specific service which needs the cooperation of several servers distributed over a WAN. There are two main benefits from using GDFS. The first is that it provides dynamic administration of a federation instead of the manual administration currently used in the above examples. The second is that it provides a generic means to propagate typed information efficiently to the set of (or a subset of) servers in the federation. GDFS has been designed for a CORBA environment in order to meet the heterogeneity requirement of most distributed computing applications.

The presentation of GDFS is organized as follows. In Section 2, we present the model and the protocol used for GDFS: we show how we obtain dynamic administration of the federation and we explain the semantics of the propagation protocol. In Section 3, we present the GDFS object design and show how this design allows the application of GDFS to any service. In Section 4, we explain how to build specialized federated servers with GDFS and we present a case study: applying GDFS to a federation of CORBA traders. Finally, in Section 5 we give a conclusion outlining perspectives and open issues.

2. The GDFS model

In this section we present the general model used by GDFS. We first introduce the principles of GDFS. Then we present the piece of information shared by every server of the federation and show how it is used to handle the propagation of information. Finally, we explain the dynamic administration of the federation.

2.1 GDFS applications and principles

We distinguish two kinds of cooperation for servers of widely available services: cooperation for data replication and cooperation for data distribution. In the first approach, data are replicated on all the servers, which then have to maintain data consistency among themselves. As it is difficult to maintain strong consistency between WAN servers, the data distribution approach may be preferred.


In the second approach, the data is distributed over all the servers: each piece of data is stored near where it is produced. Thanks to the cooperation of all the servers, any client can access the information. This approach is efficient if the data evolve rapidly and if the production and the use of each piece of data often take place on the same server. GDFS has been designed for the cooperation of servers built with the data distribution approach. Developers can use GDFS to build widely available services; a wide range of services, such as a software distribution service or a discovery service, may benefit from it. We call a service built with GDFS a specific service.

GDFS is to be applied on a WAN. One federation is associated with one service. A federation is made up of a set of servers called the Federated Servers (FS). The FSs are geographically distributed on different sites which may be separated by long distances, which may in turn lead to significant propagation delays. Data are distributed among the FSs. Each FS may communicate with all other FSs. Each FS may be invoked by a set of clients. Each client has a preferred FS, which we call the proximity FS (as it is often the nearest to the client). The proximity FS is chosen by the client because of its good response time, but other FSs may offer the same service to the client if necessary (for example in case of failure of the proximity FS). The proximity FS may be chosen either by configuration of the client or by intelligent directory services such as a specialized DNS [4].

Propagation may be used either for the resolution of a client request, or for a service's internal needs. Any client request may be resolved either locally by the proximity FS, or with the cooperation of other FSs. If a significant proportion of queries is resolved locally, the distribution of FSs saves network and server resources. If a request cannot be resolved locally, it needs to be propagated to the federation.
For their own service needs, FSs also need to propagate information to other FSs. Propagation may be used to update shared information, advertise news or propagate queries among the FSs.

GDFS is built upon two main protocols: the propagation protocol and the administration protocol. The administration protocol provides the dynamic administration of the federation: thanks to it, the federation adapts to events such as the arrival of a new FS or the failure of an FS. The propagation protocol is used by FSs to propagate typed requests to all the FSs of a federation (or a subset of them). The GDFS protocols are explained in detail in [5]. In the next subsections we describe the broad outlines of these protocols.

2.2 A global view of the federation

All kinds of federations (i.e. manual federations and GDFS) need federation graphs to link their FSs. In the case of a manual federation, each FS has a partial


knowledge of the federation graph: each FS only knows its neighbours in the federation graph. This knowledge is acquired through manual configuration. In GDFS, by contrast, each FS has a global view of the federation graph.

Each vertex of the federation graph is an FS. Each edge is labeled with a network distance assessment. A network distance is representative of the communication cost between two FSs (e.g. financial cost, latency, bandwidth). For our Internet prototype, we choose the distance function used by most routing protocols, i.e. the number of hops between two hosts. Each FS is in charge of evaluating its network distance to every FS of the federation. There is a link between two FSs if they can communicate and if they have network distance information. Fig. 1 gives an example of a GDFS graph (the graph should be complete, but for readability reasons we present only a partial federation graph). As the Federation Graph evolves permanently (see subsection 2.4), each FS stores two Federation Graphs: the current version Federation Graph common to all the FSs and a local view of the Federation Graph which takes into account recent federation changes only known locally.

Figure 1. A federation graph example and its associated propagation tree. (a) Federation graph, with edges labeled by the estimated distance between two nodes; (b) propagation tree.

2.3 The propagation protocol

Thanks to the Federation Graph, each FS calculates a propagation tree. This tree is used to propagate information to every FS in the federation. We use a minimum-weight spanning tree, calculated with Prim's algorithm [6]. Fig. 1 gives the minimum-weight spanning tree associated with the example. The structure of the propagation tree has been chosen to: (i) minimize communication costs (according to the distance function); (ii) distribute the propagation costs between all the nodes; (iii) eliminate the stop control required by graph cycles.

We could use alternative propagation trees. We do not retain shortest-path trees because, as the federation graphs are often complete, we would obtain star trees: the depth of the trees would be minimal, but the propagation costs would not be distributed between the nodes. In order to be scalable, we would rather use trees which reach a good compromise between the communication cost and the degree of the tree, such as those calculated in [7]. These trees would limit the number of neighbours of each FS, which, as emphasized in [8], is essential to avoid the acknowledgement implosion problem (i.e. above a certain degree, the throughput of the protocol drops because of the cost of error control). Whichever tree is chosen, the propagation protocol is the same.

The propagation protocol is as follows. One FS, called the source FS, initiates the propagation and sends some information to all its neighbours in the federation tree. Each intermediate FS receives the information, acknowledges it, and is responsible for propagating the information to all its neighbours in the propagation tree except the one from which it received the information.
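As an illustration of the tree computation just described, the following Python sketch builds a minimum-weight spanning tree with Prim's algorithm. The data and function names are ours, not part of GDFS; the weights stand for estimated hop counts.

```python
import heapq

def prim_tree(graph, root):
    """Minimum-weight spanning tree of a weighted graph (Prim's algorithm).

    graph: {node: {neighbour: weight}}.
    Returns the tree as {node: set of tree neighbours}.
    """
    tree = {node: set() for node in graph}
    visited = {root}
    # heap entries: (weight, inside-node, outside-node)
    heap = [(w, root, n) for n, w in graph[root].items()]
    heapq.heapify(heap)
    while heap:
        w, u, v = heapq.heappop(heap)
        if v in visited:
            continue
        visited.add(v)
        tree[u].add(v)
        tree[v].add(u)
        for n, wn in graph[v].items():
            if n not in visited:
                heapq.heappush(heap, (wn, v, n))
    return tree

# toy federation graph (a complete graph works the same way)
federation = {
    "Paris":      {"Lille": 3, "Munich": 5, "Washington": 12},
    "Lille":      {"Paris": 3, "Munich": 4, "Washington": 15},
    "Munich":     {"Paris": 5, "Lille": 4, "Washington": 14},
    "Washington": {"Paris": 12, "Lille": 15, "Munich": 14},
}
tree = prim_tree(federation, "Paris")
```

Each FS would run this computation locally on its copy of the Federation Graph, so all FSs with the same graph version derive the same propagation tree.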
If, after several attempts, no acknowledgement is received from one neighbour, this FS may be down or unreachable; we then use the alternative behaviour in case of failure in order to propagate the information to the rest of its subtree (see Section 2.4.1). The semantics of the propagation protocol has been chosen for the propagation of discovery requests. Any node may be the source of the propagation. The propagation has to go on even if one FS fails: the request must reach all the live nodes. There is no need to order the requests. There is no recovery of lost data after the restart of a failed node: the requests have to be made immediately or never. Nevertheless, other kinds of specialized propagation protocols could be designed above GDFS if necessary.

2.4 The dynamic administration of the federation

The federation graph and the federation tree have to adapt dynamically to take into account the following events: FS addition or removal, modification of the underlying WAN topology (which leads to network distance changes) and the temporary failure of an FS or of a communication link.
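The propagation step performed at each FS, including the fallback to the failed neighbour's subtree, could be simulated as follows. This is a minimal sketch under our own naming: `send` models a remote invocation returning True on acknowledgement, and the fallback handles a single failed neighbour rather than the full recursive case.

```python
def propagate(tree, node, info, received_from, send):
    """Run at `node` once it holds `info`: forward to every tree
    neighbour except the one the information came from."""
    for neighbour in tree[node]:
        if neighbour == received_from:
            continue
        if send(neighbour, info):
            # the acknowledging neighbour carries on the propagation
            propagate(tree, neighbour, info, node, send)
        else:
            # alternative behaviour in case of failure: reach the failed
            # neighbour's subtree on its behalf (one level, as a sketch)
            for nn in tree[neighbour]:
                if nn != node and send(nn, info):
                    propagate(tree, nn, info, neighbour, send)

# toy propagation tree with a failed Paris node
tree = {
    "New York":   {"Washington"},
    "Washington": {"New York", "Paris"},
    "Paris":      {"Washington", "Lille", "Munich"},
    "Lille":      {"Paris"},
    "Munich":     {"Paris"},
}
down = {"Paris"}
delivered = set()

def send(target, info):
    if target in down:
        return False
    delivered.add(target)
    return True

propagate(tree, "New York", "discovery request", None, send)
```

In this run the information reaches every live node even though Paris is down, which is exactly the guarantee the protocol aims for.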


In order to be more efficient, and because some changes are stable while others are not, the model offers three levels for taking these events into account: (i) the alternative behaviour in case of failure; (ii) local reconfigurations; (iii) the global change of the federation version.

2.4.1 Alternative behaviour in case of failure.

As soon as an FS discovers a failure (i.e. when it cannot propagate information to one of its neighbours in the tree), this FS uses the alternative behaviour in case of failure. It propagates the information, on behalf of the failed neighbour, to the neighbours of the failed neighbour in the tree (for simplicity's sake we say "failed neighbour", but the communication failure may come from a failed FS or from a network failure). For example, in Fig. 2, if the Washington node cannot propagate to the Paris node, the Washington node will decide to propagate to the Amsterdam, Lille, Munich and Marseille nodes on behalf of the Paris node. This behaviour is possible because of the global knowledge of the broadcast tree. This behaviour maintains the continuity of the service. On request, the source FS may be informed that the information did not reach the failed FS.

Figure 2. Alternative behaviour in case of failure. (a) Node failure; (b) alternative behaviour.

2.4.2 Local reconfigurations.

The local reconfiguration level is used to take into account the long-term failure of an FS (e.g. the long-term failure threshold may be set to five minutes), recovery from a long-term failure, as well as FS addition and removal. Local reconfigurations enable federation changes at a lower cost than the propagation of a new federation version. A local reconfiguration consists in a coherent change of the propagation tree as seen by an FS, its neighbours and their neighbours. The first level of neighbours is necessary in order to maintain a propagation tree (even if this tree is no longer a minimum-weight spanning tree). The second level of neighbours is necessary to enable future reconfigurations. A local reconfiguration is possible if, and only if, after the reconfiguration, each node knows its own neighbours (level-1 neighbours) and the neighbours of its neighbours (level-2 neighbours) in the effective propagation tree. A node cannot participate in two local reconfigurations at the same time. After accepting a local reconfiguration, a node buffers incoming requests so as to retransmit them at the end of the reconfiguration process. A local tree reconfiguration enables rapid federation modification; the cost, however, is that the tree knowledge is no longer global. Each FS stores its own local reconfigurations. In the rest of this subsection we briefly present the local reconfiguration protocols: long-term failure, the addition of a new FS and the recovery of an FS after a failure.

• Long-term failure. In the case where all the neighbours agree on the long-term failure of an FS, a new configuration of the propagation tree is decided by its neighbours. For example, if the failure of the Paris node persists beyond the long-term failure threshold, a new configuration of the tree between the Washington, Amsterdam, Lille, Munich and Marseille nodes is calculated. This modification concerning the Paris node is made coherently on Paris's first-level neighbours (i.e. Amsterdam, Lille, Munich and Marseille) and on its second-level neighbours (i.e. New York) (see Fig. 3).

• Addition of an FS. In order to be added to a federation, a new FS has to obtain the current federation version from any FS. It then estimates its distance to all the federation's FSs and asks the nearest FS to add it as a leaf of the federation. This leads to a local reconfiguration on the nearest FS and its level-1 neighbours.

• Recovery of an FS after a failure. If the failure of an FS has led to a local reconfiguration, when this FS recovers it asks its nearest neighbour to add it back as a leaf of the tree. This procedure is nearly the same as the addition of a new FS.
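The "Addition of an FS" step can be sketched as follows. The names are ours and `hop_count` stands in for the prototype's hop-count distance function; a real join would also trigger the atomic reconfiguration on the chosen FS and its level-1 neighbours.

```python
def join_as_leaf(new_fs, tree, hop_count):
    """Attach `new_fs` as a leaf under its nearest federation member.

    `tree` maps each member to its set of tree neighbours; `hop_count`
    estimates the network distance between two hosts. Returns the
    member chosen as attachment point (in GDFS, this choice triggers a
    local reconfiguration on that member and its level-1 neighbours).
    """
    # estimate the distance from the newcomer to every current member
    distances = {m: hop_count(new_fs, m) for m in tree}
    nearest = min(distances, key=distances.get)
    tree[new_fs] = {nearest}
    tree[nearest].add(new_fs)
    return nearest

# toy federation tree and hop-count table (illustrative values)
tree = {"Paris": {"Lille"}, "Lille": {"Paris"}}
hops = {("Evry", "Paris"): 2, ("Evry", "Lille"): 5}
nearest = join_as_leaf("Evry", tree, lambda a, b: hops[(a, b)])
```

Joining as a leaf keeps the reconfiguration local: only the nearest FS and its immediate neighbours need to learn about the newcomer before the next federation version.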

Figure 3. Local changes (showing the level-one and level-two neighbours involved in the reconfiguration).

2.4.3 New version of the federation.

A Changing Version Federated Server (CVFS), chosen dynamically in the federation, triggers the new versions of the federation. Each FS sends its local long-term reconfigurations to the CVFS, including long-term failures, long-term distance changes, and node additions and removals. The CVFS triggers a new federation version when the degradation rate of the propagation tree goes past a given threshold (e.g. 30%). The new version is then propagated to all the federation's FSs using the propagation protocol. Two nodes (sender and receiver) have to agree on a version before they can communicate through GDFS. If the receiver has an older version, the request is buffered on the receiver until its version is updated. If, on the other hand, the sender has an older version, it sends the request back to the source node, which reinitiates the propagation after updating its federation version.

Thanks to the federation administration protocols, all federation updates are made dynamically. The propagation of a new version is a costly operation, but thanks to the three update levels, the number of version changes should remain low. Through network distances, the links between the FSs follow the evolution of the federation and of its underlying WAN topology. The model tolerates a great number of failures. If too many failures divide the federation into several isolated classes, server cooperation is limited inside each class. This may happen when two neighbours in the propagation tree fail at the same time, in which case reconfiguration becomes impossible.

In this section we have presented the general GDFS model; in the next section we present a design proposition for this general model.
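The version agreement rule between sender and receiver described in Section 2.4.3 might be sketched as follows, simplifying federation versions to integers of our own choosing.

```python
def on_receive(local_version, sender_version):
    """Decide what to do with a request propagated under
    `sender_version` when the receiving FS runs `local_version`.

    "handle": versions agree, process the request;
    "buffer": the receiver is behind, hold the request until its
              federation version is updated;
    "bounce": the sender is behind, return the request to the source,
              which re-propagates after updating its version.
    """
    if sender_version == local_version:
        return "handle"
    if sender_version > local_version:
        return "buffer"
    return "bounce"
```

The asymmetry matters: a lagging receiver can simply wait for the new version to arrive, whereas a lagging sender must restart the propagation from the source so that the request follows the up-to-date tree.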

3. GDFS design

In this section we present the object-oriented design of GDFS. We show how this design allows GDFS to be applied to a broad range of widely available


services. As mentioned in the introduction, we have chosen CORBA as our target development platform, and we present part of the GDFS administration and propagation interfaces in CORBA IDL. In Section 3.1, we present how we build GDFS upon a group communication facility, and in Section 3.2, we present the overall design of GDFS. In Sections 3.3 and 3.4, we show how we use this design for local tree reconfigurations and for propagating information in the federation. Finally, in Section 3.5 we discuss some implementation issues in the CORBA environment.

3.1 Group communication objects in GDFS

GDFS is built upon a group communication facility. A group communication facility allows information to be sent to all the members of a group in one operation, and may offer several semantics (see [9] and [10] for descriptions of group communication semantics).

3.1.1 Semantics of group communication in GDFS.

For GDFS, we need group communication support for two different purposes: firstly to multicast federation administration messages to a subset of the federation servers, and secondly to propagate information to all the servers of the federation. For these two purposes we need the two different group communication semantics presented below.

• Local tree changes. When the propagation tree is locally modified, reconfigurations have to be made coherently on a group of servers including the level-one and level-two neighbours of the centre of the modification. In this case, a coordinator is automatically chosen from the group members. Thanks to the propagation tree, this coordinator deduces the group members involved in the modification. The coordinator invokes this group with the guarantee that either all server members receive the invocation or none of them does. We assume that a member cannot be involved in two local tree changes at the same time. We do not need any ordering of messages. In the rest of this paper, we call this semantics the atomic multicast.

• Propagation of information. For the propagation of information, we need a different group communication semantics. Each server in the propagation chain which receives information to propagate is a coordinator and has to forward the information to the group made up of all its neighbours in the propagation tree (except the one which forwarded the information to it). With this semantics, the propagation has to go on even if one FS has not received the information; in this case (i.e. when one neighbour does not acknowledge) the coordinator triggers the alternative behaviour in case of failure. At the group communication level, there is no need to guarantee either atomicity or the ordering of messages. In the rest of this paper, we call this semantics the best effort multicast.


3.1.2 Group communication in CORBA.

In order to design and implement GDFS easily, we need a group communication facility. Existing group communication facilities have typically been implemented at the transport protocol level, mostly on homogeneous distributed computing environments. We need a group communication facility available in the CORBA environment. Unfortunately, there is currently no off-the-shelf CORBA tool, nor standard interfaces, for group communication; there is no accepted CORBA specification on group communication yet. In CORBA, two approaches are possible: the middleware approach and the service approach. The middleware approach is based on the idea that group communication is supported by the middleware itself. Orbix+Isis [11], Electra [12,13], Eternal [14] and Aqua [15] are examples of the middleware approach; the planned evolution of CORBA regarding group communication, the Unreliable Multicast Inter-ORB Protocol (MIOP), also promotes it [16]. In the service approach, a specific service offers group operations such as joining or leaving a group and multicast invocations. The group communication service does not have to be centralized: it can be made of several objects, located at different hosts on the network, which work together to provide the complete service. The Object Group Service (OGS) [17] is an example of the service approach. Fault-tolerant CORBA [18] is a hybrid solution: group management is handled through an interface, but the transport of the request is handled transparently by the ORB. For the moment, fault-tolerant CORBA handles only passive replication. In our opinion, the service approach offers some advantages. There are many different semantics for group communication and it seems difficult to integrate all of them in the ORB, because an ORB has to remain simple. The service approach is more open and flexible.
Lastly, in the middleware approach, transport is handled transparently, which does not fit our case because we need to know which members have received a request and which have not.

3.1.3 Group communication design in GDFS.

The group communication facility we have designed for GDFS is inspired by OGS. The main difference is that we do not need group management operations (join and leave operations), as the group members are always deduced from the propagation tree. We design the group communication facility as follows. There is a GroupAccessor object on the client side and an Invocable object on the server side of an invocation. A GroupAccessor object is the local representative of a group: it allows clients to address requests to the group. An Invocable intercepts a request before its invocation on the server. As we have two invocation semantics, we have two GroupAccessor classes: the AtomicGroupAccessor class, which provides atomic multicast invocations, and the BestEffortGroupAccessor class, which provides best effort multicast invocations. The AtomicInvocable intercepts atomic requests. In best effort invocation, we do not need a specific Invocable


interface because there is no need to intercept these requests; they are directly invoked on the target server object. With the AtomicInvocable interface, the save operation stores the atomic request in persistent memory. Then, if all the group servers are ready, the coordinator's AtomicGroupAccessor invokes the commit operation, which delivers the request to the group member; otherwise it invokes the abort operation to cancel the request. An extract from the CORBA interfaces of the group communication objects is given as follows.

module mGroupAccess {
  ...
  interface GroupAccessor {
    BoolSeq multicast(in GroupDescription group, in RequestDescription req);
  };
  interface ReliableGroupAccessor : GroupAccessor {
  };
  interface AtomicGroupAccessor : GroupAccessor {
    boolean isCommitted(in short reqId);
  };
  interface AtomicInvocable {
    boolean save(in RequestDescription req, in short reqId);
    void abort(in short reqId);
    void commit(in short reqId);
  };
};
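The save/commit/abort cycle of these interfaces can be sketched in Python as follows. This is a local simulation with class names of our own choosing; real GDFS members are remote CORBA objects and save writes to persistent memory.

```python
class AtomicMember:
    """Server-side role, modelled on the AtomicInvocable interface."""

    def __init__(self):
        self.pending = {}     # reqId -> saved request
        self.delivered = []   # requests actually invoked on the server

    def save(self, req, req_id):
        self.pending[req_id] = req
        return True           # ready to process the invocation

    def commit(self, req_id):
        self.delivered.append(self.pending.pop(req_id))

    def abort(self, req_id):
        self.pending.pop(req_id, None)

def atomic_multicast(members, req, req_id):
    """Client-side role, modelled on AtomicGroupAccessor.multicast:
    either every member delivers the request, or none of them does."""
    if all(m.save(req, req_id) for m in members):
        for m in members:
            m.commit(req_id)
        return True
    for m in members:
        m.abort(req_id)
    return False

group = [AtomicMember() for _ in range(3)]
ok = atomic_multicast(group, "addServer", 1)
```

If any member's save returns False, the whole invocation is aborted on every member, which is the all-or-nothing guarantee the atomic multicast needs for coherent tree reconfigurations.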

In GDFS we use the group communication objects that we have described in this section.

3.2 Federation objects in GDFS

The federation objects manage the life of the federation. They provide support for joining and leaving the federation, changing version and propagating requests. The classes of the federation objects, as well as their relationships with the communication objects, are shown in Fig. 4. This figure shows the class diagram of the dynamic federation expressed in UML (Unified Modelling Language) notation. In Fig. 4, we can see the federation objects. Graph, Node and PropagationTree provide graph management methods. The GenericFederatedServer object is the heart of the model: it manages all the federation administration operations and is in charge of propagating requests. The FederationView object represents the server's local view of the federation, while the ChangingVersionFederatedServer (CVFS) is in charge of deciding and triggering a new version of the federation. Finally, the SpecificFederatedServer is the server to which we apply the dynamic federation.


Figure 4. GDFS class diagram.

Fig. 5 gives an architectural view of an FS, showing its different components, interfaces and connections. An FS is made up of two parts: (i) the Specific Federated Server (SFS), which is in charge of the specific service, and (ii) the Generic Federated Server (GFS), which is in charge of the GDFS protocols. An SFS has to be linked to a GFScomponent (see Section 4.1 for how to build an SFS). As we can see in Fig. 5, a GenericFederatedServer object offers four interfaces. GFSpropagation receives requests to propagate from other GDFS components. GFSatomicManagt makes coherent modifications of the propagation tree. GFSopen is used by any FS to get or set federation information. GFSdynamic is used to receive specific requests from the SFS. An extract of the GenericFederatedServer CORBA interfaces is given as follows.

Figure 5. Federated Server Components.

module mGDFS {
  ...
  interface GFSopen {
    void setSpecificFederatedServer(in Object SFS);
    FederationDescription getFederation();
    Distance getDistance(in GFSId one, in GFSId two)
      raises (UnknownServer);
    boolean joinFederation(in GDFSheader h, in FederatedServerDesc f,
                           in DistanceSeq d)
      raises (UnknownFederation, NewerVersion);
    void changeVersion(in GDFSheader h,
                       in FederationDescription newVersion);
  };
  interface GFSatomicManagt {
    void addServer(in GDFSheader h, in AddServerEvent e)
      raises (NewerVersion, OlderVersion, NotNeighbour);
    void removeServer(in GDFSheader h, in RemoveServerEvent e)
      raises (UnknownServer, NewerVersion, OlderVersion, NotNeighbour);
  };
  interface GFSpropagation {
    void treePropagate(in GDFSheader h,
                       in mGroupAccess::RequestDescription req)
      raises (NewerVersion, OlderVersion, NotNeighbour);
  };
};

We briefly present the methods of these interfaces. An FS can propagate requests with the treePropagate operation (this operation is detailed in Section 3.4). A new FS can get a description of the federation through the getFederation operation; it will then try to join the federation with joinFederation. Coordinators can make coherent changes in the propagation tree through


addServer and removeServer operations. Changing the federation version is performed using the changeVersion operation. Federated servers send their local changes by invoking the receiveEvents operation on the CVFS. The kinds of events are server addition, server removal and distance change. The CVFSs store these events to build new federation versions. 3.3 Local tree modifications using atomic invocations In this section, we illustrate an atomic group communication invocation in GDFS with the addition of a new server (see Fig. 6). When a new FS is added to a federation, the following steps have to be performed: (1) first the new FS invokes the joinFederation method on a chosen FS (the nearest), then (2) the AtomicGroupAccessor’s multicast method is called on to multicast the addServer method. The atomic multicast requires two steps. First (3), the save method is invoked on every AtomicInvocable to store the request. The save method returns a boolean indicating if the AtomicInvocable is ready to process the invocation. Second if all the AtomicInvocables are ready, the commit method is invoked on all the AtomicInvocable which then invoke (4) the request (addServer) on the GenericFederatedServer object and the multicast is successful; otherwise, the abort method is invoked, the multicast returns a failure indication and the server addition is aborted and deferred. 3.4 Requests propagation handling In this section we illustrate a best effort group communication invocation in GDFS with a call for a specific method for propagation purpose (see Fig. 7).

Figure 6. Atomic multicast in GDFS.


Figure 7. Best effort multicast in GDFS.

In Fig. 7, (1) a specific method which requires propagation with best effort semantics is invoked on the GenericFederatedServer through its GFSdynamic interface; (2) the latter invokes the treePropagate method to propagate the request to its neighbours. (3) This method invokes the multicast method on the BestEffortGroupAccessor object, which invokes the specific method on the required group of servers. The treePropagate method tests the return parameter of the multicast method; if a server does not acknowledge the request, treePropagate triggers the alternative behaviour defined for failures. This whole process is repeated on each server until a leaf of the propagation tree is reached.

The propagation of messages is a classic problem in distributed systems. It raises many questions, such as propagation termination, handling of unreachable nodes and handling of output parameters. In our case, propagation characteristics are defined in a sort of briefing which forms part of propagated messages: the GDFSPropagationHeader presented below.

module mGDFS {
  struct GDFSPropagationHeader {
    FederatedServerIdentifier source;            // filled by the GFS
    RequestIdentifier         reqId;             // filled by the GFS
    FederatedServerIdentifier lastSender;        // filled by the GFS
    FederationState           state;             // filled by the GFS
    short                     propagationDepth;  // updated by the GFS
    RequestDescription        stoppingCondition;
    RequestDescription        deliveryCondition;
    short                     problemReportMode;
    Object                    replyHandler;
  };
};
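As an illustration, the way a node consumes this header during treePropagate can be sketched in plain Python. The Header and Node classes below are hypothetical stand-ins for the IDL struct and for an FS, and the two conditions are modelled as callables rather than SFS operations.

```python
# Sketch of header-driven propagation: deliver if the delivery condition
# holds, stop the branch on the stopping condition or when the remaining
# depth is exhausted, and never send the request back to its sender.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Header:
    source: str
    req_id: int
    last_sender: str
    depth: int                                # remaining propagation depth
    deliver: Callable = lambda node: True     # delivery condition
    stop: Callable = lambda node: False       # stopping condition

class Node:
    def __init__(self, name):
        self.name = name
        self.neighbours = []
        self.delivered = []

    def tree_propagate(self, h, request):
        if h.deliver(self):
            self.delivered.append(request)    # hand the request to the SFS
        if h.stop(self) or h.depth == 0:
            return                            # this branch of the tree ends
        for n in self.neighbours:
            if n.name == h.last_sender:
                continue                      # do not propagate back to sender
            n.tree_propagate(Header(h.source, h.req_id, self.name,
                                    h.depth - 1, h.deliver, h.stop), request)
```

On a chain A-B-C, a request issued at A with depth 2 reaches all three nodes exactly once.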


The GDFSPropagationHeader is prepared by the propagation source node and updated by each node visited. It specifies:

• The source server identifier.
• A unique request identifier for this source.
• The last sender server identifier in the propagation tree, so that the request is not propagated back to its sender.
• The federation state, which is compared to the local one; it is used to detect outdated local federation versions.
• The remaining depth of the propagation (updated on each node visited).
• A stopping condition operation and its list of parameters. It is an SFS operation which tests some parameters to decide whether the propagation must continue on this branch.
• A delivery condition operation with its list of parameters. It is used to check whether the message must be delivered to the node visited; it can depend on some properties of the node. It is also an SFS operation.
• A problem report flag. It expresses whether or not the source wants to receive notification of a propagation problem.
• If different from the source server, the reference of an object which will handle the replies to the propagated request.

3.5 Implementation issues

In this section, we present some of the mechanisms used to implement GDFS in the CORBA environment.

• Invoking requests using the Dynamic Invocation Interface (DII). GenericFederatedServer objects have to invoke requests on the SpecificFederatedServer without any knowledge of its interface. For this purpose, we need the CORBA DII mechanism, because DII allows clients to invoke requests without any knowledge of their IDL type at compilation time. Moreover, the DII mechanism allows requests to be invoked asynchronously, so it is possible to multicast requests by invoking all of them at once and obtaining all the answers at a later time. That is the principle guiding the implementation of the multicast method.
• Accepting requests using the Dynamic Skeleton Interface (DSI). GFS objects also have to accept specific requests which are unknown at GFS compilation time. For this purpose, we need the CORBA DSI mechanism, another dynamic mechanism of CORBA: it enables servers to accept requests even if their IDL declarations are unknown at compilation time.
• Multi-threading. GFScomponent objects have to handle requests coming both from their own clients and from other FSs. Some requests, such as those which need multicast, may take a long time to process, so multi-threading is necessary to optimize the FS response time. Each incoming request is handled in a separate thread chosen by the ORB from a pool of threads.
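The DII principle behind the multicast method, invoking every request at once and collecting the answers later, can be sketched without CORBA using a thread pool. This is a minimal illustration with hypothetical names, not the GDFS implementation.

```python
# Sketch of asynchronous multicast: fire the same request at every server
# concurrently, then gather all the replies afterwards.
from concurrent.futures import ThreadPoolExecutor

class Pinger:                       # stand-in for a remote server object
    def __init__(self, name):
        self.name = name

    def ping(self, msg):
        return f"{self.name}:{msg}"

def multicast(servers, method, *args):
    """Invoke `method` on every server at once; collect the answers later."""
    with ThreadPoolExecutor(max_workers=max(1, len(servers))) as pool:
        futures = [pool.submit(getattr(s, method), *args) for s in servers]
        return [f.result() for f in futures]    # deferred answers
```

The futures play the role of CORBA deferred invocations: all requests are in flight before any reply is consumed.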

4. GDFS and specific federation

Having presented the general model and design of GDFS, we show in this section how we apply GDFS to specific services. First, we present a general recipe for writing a service which uses GDFS to federate its servers. Then we present, as an example, the application of this general recipe to the federation of CORBA traders. Finally, we present the lessons learned through the application of GDFS to a traders' federation.

4.1 How to use GDFS to build a specific federation

GDFS may be applied to a broad range of widely available services. In this section we present the procedure which has to be followed in order to create a specific service which uses GDFS.

4.1.1 GFS component initialization. In order to create an SFS, a GFS has to be created first. Depending on the stage in the life cycle of the federation, different methods are used to create a GFS. To create the first GFS of the federation, the only parameter to supply is the federation identifier. To create subsequent GFSs, it is also necessary to supply the reference of an existing GFS in order to get the federation description through the GFSopen::getFederation operation. When a GFS recovers after a stop, it obtains the federation description from persistent memory. If the GFS has not been removed from the federation, it deals with any possible pending atomic group communication. On the other hand, if the GFS has been removed from the propagation tree by its neighbours, it has to rejoin the federation.

4.1.2 SFS initialization. In order to benefit from GDFS, an SFS has to be linked to a GFS: it must store the reference of the previously created GFS. Conversely, as a GFS delivers specific propagated requests to its SFS, the GFS has to be linked to the SFS. For this purpose, the SFS must invoke the GFSopen::setSpecificServer(this) method.

4.1.3 Federation knowledge. An SFS may send requests directly to any SFS of the federation. For this purpose, an SFS may use the GFSopen::getFederation method to get the references of all the federation's SFSs.

4.1.4 Methods to be propagated. The SFS uses GDFS to propagate requests. All the requests which have to be propagated are declared in a SpecificInterface. These methods must meet the following requirements: (1) they have neither output parameters nor application-specific exceptions,


because the propagation procedure does not send back replies to the source node; (2) they must carry GDFS parameters: the first argument must be of mGDFS::GDFSPropagationHeader type in order to set up some propagation parameters (see Section 3.4); the return parameter is a boolean indicating whether the propagation has to go on; and some exceptions may be thrown by the GFS to indicate propagation errors. A prototype of such a method is given as follows.

// Interface of the specific request
boolean oneSpecificRequest(
    inout mGDFS::GDFSPropagationHeader header,
    // ... in parameters ...
  ) raises (mGDFS::NewerVersion, mGDFS::OlderVersion,
            mGDFS::NotNeighbor);   // only GDFS exceptions
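The two-way link of Section 4.1.2 and a propagatable method obeying the rules above can be sketched as follows. The Python classes are hypothetical stand-ins, not the real GDFS API, and the header is reduced to a plain dict.

```python
# Sketch of GFS/SFS linking: the SFS stores its GFS reference and registers
# itself with set_specific_server, so the GFS can deliver propagated requests.
class GenericFederatedServer:
    def __init__(self, federation_id):
        self.federation_id = federation_id
        self.specific = None

    def set_specific_server(self, sfs):   # stand-in for GFSopen::setSpecificServer
        self.specific = sfs

    def deliver(self, method, *args):     # delivery of a propagated request
        return getattr(self.specific, method)(*args)

class SpecificFederatedServer:
    def __init__(self, gfs):
        self.gfs = gfs                    # 4.1.2: the SFS stores the GFS reference
        gfs.set_specific_server(self)     # ... and the GFS is linked back
        self.log = []

    def one_specific_request(self, header, payload):
        """Propagatable method: header first, boolean result, no out params."""
        self.log.append(payload)
        return True                       # True: the propagation goes on
```

The boolean returned by the specific method is exactly what the GFS tests to decide whether to keep propagating on this branch.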

On a source SFS, when there is a specific request to propagate, typically at the request of one of its clients, the source SFS has to prepare the header (including the maximum depth of the propagation, the stopping condition request and the delivery condition request) and send the request to its associated GFScomponent:

SFS::GFS.oneSpecificRequest(myHeader, ... specific parameters ...);

If necessary, the GFS may throw some propagation exceptions. Otherwise, when the delivery condition is true, the GFS sends the request to the SFS for processing (as shown in Fig. 7):

GFS::SFS.oneSpecificRequest(myHeader, ... specific parameters ...);

Then, if the propagation condition is true, the GFS propagates the request through its treePropagate method.

4.1.5 Handling of replies. The propagation protocol does not handle replies. The replies (and the problem reports) have to be sent directly by any SFS to the source SFS. The source SFS has to handle replies asynchronously through callbacks. For this purpose, the GDFSPropagationHeader may supply a replyHandler reference.

4.2 CORBA trading service and GDFS

In order to better understand the use of GDFS, its advantages and its drawbacks, we present the application of GDFS to a specific service. We have chosen the federation of CORBA Trading servers. We first briefly present the CORBA Trading Service specification, then the drawbacks of the specification's federation proposal, and finally GDFS applied to CORBA traders.

4.2.1 Presentation of the CORBA trading service. The CORBA trading service [3] provides a facility for discovering CORBA objects. It is often compared to a yellow pages service, as it furnishes a list of objects of a given Service Type filtered by clients' constraints. A trader stores a Service Type Repository and service offers. Each service offer contains the Service Type (e.g. MathematicsService, PrinterService), the CORBA


Interoperable Object Reference and a list of property-value pairs which describe the characteristics of the service (e.g. number of operations per second, printer colour property). In order to find CORBA objects, a client gives a Service Type and a list of constraints and preference properties. In return, the trader supplies a list of service offers which all satisfy the constraints and which is ordered according to the client's preferences. In order to extend the scope of a discovery, traders may be linked in a federation (see Fig. 8): a client query may then be made on the federation. In order to establish the federation, a target trader (e.g. trader B) may offer its knowledge to a source trader (e.g. trader A). The source trader is then able to invoke the target trader. These links are therefore explicitly created and are unidirectional. All the links form a directed graph called the trading graph.

4.2.2 Limitations of the current federation. Trader federation is undoubtedly an interesting feature in the context of WANs. However, we argue that the federation model can be improved on the following points.

Figure 8. One simple trader federation.


As they are now defined, trader federations must be established 'manually' by an administrator. As a result, they may not be suited to the underlying network topology. Furthermore, trader graphs may contain cycles (i.e. a search may visit the same trader several times). As the federation is usually static, it cannot react in a transparent way to network events such as trader or communication link failures, or changes in the underlying network topology. Therefore, it does not adapt to events occurring on the underlying WAN.

The trading service does not define a concept of distance between objects. Consequently, the client cannot express a search for the nearest service, and the trader cannot organize service offers according to the distance between clients and servers, even though this information could be important because of differences in communication costs in a WAN.

Finally, as query calls are synchronous, the results are only returned to clients at the end of a path (e.g. A-B-C-B-A), which is not scalable.

The integration of GDFS to manage the federation of traders brings solutions to the above remarks and therefore would improve and optimize this service. We present this integration in subsection 4.3.

4.3 Application of GDFS for the federation of traders

We propose an extension to the standard CORBA Trading Service which uses GDFS; we call it the Extended Trading Service in the rest of this section. The added values of GDFS are the dynamic administration of the federation and the propagation of methods. We present these two points applied to the Extended Trading Service.

4.3.1 Administration of the federation. With the extended trader, the administrator's tasks for federation purposes are greatly reduced: he only has to provide the name of the federation and, if it is not the first server in the federation, the name of one server of the federation. GDFS then automatically manages the federation graph and the changes in the federation.

4.3.2 Methods to be propagated.
The federation will be used to propagate methods. For the federation of traders, this is useful for two purposes. First, to manage the Service Type Repository: when the repository is modified on one trader, the modification is propagated to every trader, so each trader receives the description of new Service Types. Second, to extend the scope of object discovery to the federation. As an example, we present below the propagation used for object discovery.

To search for objects, clients make query requests to traders. The query method IDL is as follows ([3]):

interface Lookup {
  void query(
      in  ServiceTypeName type,
      in  Constraint      constr,
      in  Preference      pref,
      in  PolicySeq       policies,
      in  SpecifiedProps  desired_props,
      in  unsigned long   how_many,
      out OfferSeq        offers,
      out OfferIterator   offer_itr,
      out PolicyNameSeq   limits_applied)
    raises (IllegalServiceType, UnknownServiceType, IllegalConstraint,
            IllegalPreference, IllegalPolicyName, PolicyTypeMismatch,
            InvalidPolicyValue, IllegalPropertyName, DuplicatePropertyName,
            DuplicatePolicyName);
};
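What the trader does with such a query can be sketched in a few lines. In this hypothetical Python rendering the constraint and preference are modelled as callables, whereas a real trader parses the OMG constraint language.

```python
# Sketch of trader matching: keep the offers of the requested Service Type
# that satisfy the constraint, then order them by the client's preference.
def query(offers, service_type, constraint, preference):
    matched = [o for o in offers
               if o["type"] == service_type and constraint(o["props"])]
    return sorted(matched, key=lambda o: preference(o["props"]))

offers = [
    {"type": "PrinterService", "props": {"colour": True, "ppm": 8}},
    {"type": "PrinterService", "props": {"colour": False, "ppm": 20}},
    {"type": "MathematicsService", "props": {"ops": 1000}},
]
```

For example, asking for printers with at least 5 pages per minute, fastest first, returns the 20 ppm printer before the 8 ppm one.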

The PolicySeq input parameter is used to limit a search to a part of the federation. The OfferIterator output parameter is used to obtain the answers in several attempts. This feature is useful for a search in a federation: the client only waits for the first answers, and asks for further answers later through the OfferIterator interface.

As mentioned in Section 4.1, requests which have to be propagated must follow some rules: (i) they must contain a propagation header; (ii) they cannot have output parameters, so results have to be sent back via callback methods; (iii) they must throw GDFS exceptions if necessary. So the query method which is propagated has to be slightly different from the one used by the client. We define two new methods which have to be handled by the Extended Trader:

boolean queryPropagate(
    inout mGDFS::GDFSPropagationHeader header,
    in ServiceTypeName type,
    in Constraint      constr,
    in Preference      pref,
    in PolicySeq       policies,
    in SpecifiedProps  desired_props,
    in unsigned long   how_many)
  raises (mGDFS::NewerVersion, mGDFS::OlderVersion, mGDFS::NotNeighbor);

void queryAnswer(in RequestId requestId, in OfferSeq offers);

If an Extended Trader decides to propagate a search in the federation, it invokes the queryPropagate method on the GFSdynamic interface of its associated GFS (see Fig. 7). If one trader in the federation has Service Offers which satisfy this query, it sends them directly to the source Extended Trader with the queryAnswer method.

The trader's client is only aware of the Lookup interface. It sends a standard query request to its proximity trader. The delay needed to receive all the responses to the query on the source trader may be long, and the answers are received asynchronously. We therefore decided to block the client until the source trader receives the first queryAnswer; the client retrieves further offers later through the OfferIterator interface.
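The blocking behaviour described above can be sketched with a reply handler that wakes the client on the first queryAnswer. The names are hypothetical and plain Python threading stands in for CORBA callbacks.

```python
# Sketch of asynchronous reply handling: remote traders call query_answer;
# the client thread blocks in wait_first until the first answer arrives.
import threading

class QueryReplyHandler:
    def __init__(self):
        self.offers = []
        self._first = threading.Event()
        self._lock = threading.Lock()

    def query_answer(self, request_id, offers):   # invoked by remote traders
        with self._lock:
            self.offers.extend(offers)
        self._first.set()                         # unblock the waiting client

    def wait_first(self, timeout=None):
        """Block the client until the first queryAnswer, then return offers."""
        self._first.wait(timeout)
        with self._lock:
            return list(self.offers)
```

Later answers keep accumulating in the handler; the client picks them up afterwards, which plays the role of the OfferIterator.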


Nevertheless, we have to highlight some difficulties encountered in designing the propagated methods. Firstly, the source trader of a query has to handle asynchronous replies. Secondly, propagated methods used for updating (Service Type Repository updates) have to handle possible holes in the propagation. We present solutions for handling these difficulties below.

4.3.4 Handling of replies. GDFS propagation does not handle replies: they have to be sent directly to the source server. For each pending request, the source server has to create a specific handler which receives the asynchronous replies. Extended traders must define a garbage collection strategy to destroy these handlers.

4.3.5 Reliability of propagation. GDFS propagation is not reliable: a propagated request may be received by only a subset of the federation's servers. This happens when a server is down, or when the federation is partitioned during a propagation. GFS functionalities have to be added to work around this partial propagation problem. Several solutions can be implemented: first, the GFS which discovers the problem may send an error report to the source server, which takes the appropriate decision; second, all the requests may be numbered and stored on their source, so that a GFS may ask the source for the requests it has not received.
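The second strategy, numbering and storing requests at their source, can be sketched as follows. The helper classes are hypothetical; a real GFS would piggyback the sequence numbers on propagated requests.

```python
# Sketch of NACK-based recovery: the source stores every numbered request;
# a receiver that detects a gap asks the source to resend the missing ones.
class SourceLog:
    def __init__(self):
        self.seq = 0
        self.store = {}

    def send(self, request):
        self.seq += 1
        self.store[self.seq] = request        # keep a copy for retransmission
        return self.seq, request

    def resend(self, missing):                # serve a negative acknowledgement
        return {n: self.store[n] for n in missing if n in self.store}

class ReceiverLog:
    def __init__(self):
        self.received = {}

    def receive(self, seq, request):
        self.received[seq] = request

    def gaps(self):
        """Sequence numbers missing below the highest one seen so far."""
        if not self.received:
            return []
        return [n for n in range(1, max(self.received))
                if n not in self.received]
```

A receiver that missed, say, request number 2 detects the hole as soon as a later request arrives, and repairs it with one resend call.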

5. Conclusion

In this article, we have presented a generic tool which facilitates the federation of servers over a WAN. This generic tool is in charge of both dynamically managing the federation and propagating requests to all the federation's servers. The tool has been designed in the CORBA environment.

At present, many widely available services are provided through the cooperation of several servers spread over a WAN: thanks to server distribution, services may be available to a great number of clients with acceptable response times. We distinguish two methods used by these services for choosing the servers' resolution path. The first method is used by naming services, such as DNS [19] and the CORBA Naming Service [20]: the path is defined by the different parts of the names to resolve. Other services, such as USENET News [1], the CORBA trading service [3], and Internet Cache Protocol implementations [21], use a second method: the resolution path is defined by a federation graph configured manually and partially on each server. For this last method, as far as we know, there is no generic tool that these services can use to federate their servers. Consequently, each service is in charge of managing its federation, and the federation model they use is very basic. All these services could benefit from a generic federation tool.

In most widely available services, federation graphs are fixed through configuration files. Some examples are: implementations of CORBA Trading service federations [3], the Internet Cache Protocol [21], the


MBone tunnelling [22], RMTP (Reliable Multicast Transport Protocol) [23], and SIENA (Scalable Internet Event Notification Architecture) [24]. Their federation graphs are therefore static. With GDFS the federation graph evolves dynamically: GDFS takes the network topology into account to organize the neighbourhood through distance functions, and the propagation tree is changed dynamically in case of a server failure.

GDFS takes charge of the propagation of requests to every server of the federation. As shown in [10], most group communication protocols are designed for specific applications; the propagation protocol presented here is suited to discovery requests in a WAN, but other propagation protocols could be built upon GDFS. Like most scalable propagation protocols (such as RMTP [23], TMTP [25], and SIENA [24]), the propagation is achieved through an acyclic graph, so the propagation cost is shared between the servers. The main difference with the above protocols is that the GDFS propagation tree adapts itself in case of a server failure.

We have implemented the GDFS propagation protocol in the CORBA environment at the application level. Implementing the propagation protocol at this level presents some advantages. Firstly, servers may be installed in a heterogeneous environment. Secondly, the application may set up filters which may stop the propagation if, for example, the request is already fulfilled at one point of the propagation. Lastly, the requests are typed: they directly trigger methods of the specific applications. However, as shown in [26], implementing the propagation protocol at this level also has a cost. As CORBA is only implemented above TCP/IP, the protocol cannot benefit from multicast routing or transport protocols and it cannot rely on negative acknowledgements. Furthermore, CORBA, and all the more so CORBA dynamic invocations, induce a processing overhead on each server.
In previous work we enhanced the CORBA trading service federation model: we integrated our federation protocols into an implementation of the trading service [27]. The drawback of that first implementation was that the design was specific to the trading service. As our federation model may be applied to any kind of federation, we have decided to work on a design which makes the federation tool completely independent of the specialized service. In this paper we have presented the application of the generic federation tool to the CORBA trading service; the tool may now be applied to any specialized CORBA service which needs federation.

At this stage, we have worked on the design of this generic tool and have made an implementation feasibility study in the CORBA environment. We are currently working on a complete implementation and are aware that some questions are still open: how to choose a distance function adapted to routing protocols, how often federation partition problems are encountered in real conditions, and what the scalability of the protocol is (maximum number of servers in the federation). Nevertheless, the availability of federation tools for building widely available services is essential.


References

1. B. Kantor & P. Lapsley 1986. NNTP: Network News Transfer Protocol. Request For Comments RFC 977, February 1986.
2. S. E. Deering & D. R. Cheriton 1990. Multicast routing in datagram inter-networks and extended LANs. ACM Transactions on Computer Systems 8(2), 85–110.
3. Trading Object Service Specification. OMG document formal/00-06-27, May 2000.
4. K. Delgadillo 1999. Cisco DistributedDirector. At URL: http://www.cisco.com/warp/public/cc/pd/cxsr/dd/tech/dd wp.htm.
5. C. Taconet 1997. Graphes de Réseaux Coopérants et Localisation Dynamique pour les Systèmes Répartis sur Réseaux Étendus. PhD thesis, Université d'Evry Val d'Essonne, 22 October 1997.
6. R. C. Prim 1957. Shortest connection networks and some generalizations. Bell System Technical Journal 36, 1389–1401.
7. B. Boldon, N. Deo & N. Kumar 1996. Minimum-weight degree-constrained spanning tree problem: heuristics and implementation on an SIMD parallel machine. Parallel Computing 22(3), 369–382.
8. D. F. Towsley, J. F. Kurose & S. Pingali 1997. A comparison of sender-initiated and receiver-initiated reliable multicast protocols. IEEE Journal on Selected Areas in Communications 15(3), 398–406.
9. K. Birman, A. Schiper & P. Stephenson 1991. Lightweight causal and atomic group multicast. ACM Transactions on Computer Systems 9(3), 272–314.
10. K. Obraczka 1998. Multicast transport protocols: a survey and taxonomy. IEEE Communications Magazine 36(1), 94–102.
11. IONA Technologies / Isis Distributed Systems 1995. Orbix+Isis Programming Guide.
12. S. Maffeis 1995. Run-Time Support for Object-Oriented Programming. PhD thesis, Zurich University.
13. S. Maffeis & D. Schmidt 1997. Constructing reliable distributed communication systems with CORBA. IEEE Communications Magazine 14(2), 56–60.
14. L. E. Moser, P. M. Melliar-Smith & P. Narasimhan 1998. Consistent object replication in the Eternal system. Theory and Practice of Object Systems 4(2), 81–92.
15. M. Cukier, J. Ren, C. Sabnis, W. Sanders, D. Bakken, M. Berman, D. Karr & R. Schantz 1998. AQuA: an adaptive architecture that provides dependable distributed objects. In Proc. 17th Symp. on Reliable Distributed Systems, West Lafayette, IN, U.S.A., October 1998, pp. 245–253.
16. C. Gransart & J. M. Geib 1999. Using an ORB with multicast IP. In PCS'99 Parallel Computing Systems Conference, Mexico, August 1999.
17. P. Felber 1998. The CORBA Object Group Service: A Service Approach to Object Groups in CORBA. PhD thesis, EPFL, Lausanne, 1998.
18. Fault Tolerant CORBA. OMG document orbos/99-12-19, December 1999.
19. P. Mockapetris 1987. Domain Names: Implementation and Specification. RFC 1035, November 1987.
20. CORBA Services Specification. OMG Document 98-12-09, 1998.
21. D. Wessels & K. Claffy 1997. Internet Cache Protocol (ICP), version 2. Request For Comments RFC 2186, September 1997.
22. K. Almeroth 2000. The evolution of multicast: from the MBone to inter-domain multicast to Internet2 deployment. IEEE Network, January/February 2000, 10–21.
23. S. Paul, K. Sabnani, J. C. Lin & S. Bhattacharyya 1997. Reliable multicast transport protocol (RMTP). IEEE Journal on Selected Areas in Communications 15(3), 407–421.
24. A. Carzaniga, D. S. Rosenblum & A. L. Wolf 2000. Achieving scalability and expressiveness in an Internet-scale event notification service. In Proceedings of the Nineteenth ACM Symposium on Principles of Distributed Computing (PODC 2000), Portland, OR, U.S.A., July 2000, pp. 219–227.
25. R. Yavatkar, J. Griffoen & M. Sudan 1995. A reliable dissemination protocol for interactive collaborative applications. In ACM Multimedia, pp. 333–344.
26. S. Mishra, L. Fei, X. Lin & G. Xing 2001. On group communication support in CORBA. IEEE Transactions on Parallel and Distributed Systems 12(2), 193–208.
27. D. Belaid, N. Provenzano & C. Taconet 1998. Dynamic management of CORBA trader federation. In Proceedings of COOTS'98 (Conference on Object-Oriented Technologies and Systems), Santa Fe, New Mexico, April 1998, pp. 53–63.

Nawel Sabri received the Engineering Degree in computer science from the University of Algiers (USTHB), Algeria, in 1997. She joined the Distributed Systems team of the National Institute of Telecommunications (INT) in Evry (France) in 1999, to work on group communication applied to the design of a generic tool for the dynamic federation of CORBA servers. She is currently a PhD student at INRIA, Rocquencourt, France, working on advanced transaction services in component-based environments.

Chantal Taconet received the Engineering Degree in Computer Science from the University of Compi`egne (UTC), France in 1986 and the PhD Degree in Computer Science from the University of Evry, France in 1997. In 1997, she joined the Computer Science department of the National Institute of Telecommunications (INT) in Evry (France) as an Assistant Professor. Her research interests in computer networks and distributed systems include object middleware for large scale networks, object discovery services, and dynamic deployment of CORBA components.