Copyright © IFAC Information Control Problems in Manufacturing, Salvador, Brazil, 2004
ELSEVIER
IFAC PUBLICATIONS www.elsevier.com/locate/ifac
AN EVENT FILTERING SCHEME FOR DISTRIBUTED REAL-TIME EMBEDDED SYSTEMS

Carlos Mitidieri *,** Jörg Kaiser * Carlos Eduardo Pereira **

* University of Ulm, Germany
** Federal University of Rio Grande do Sul, Brazil
Abstract: Event filters provide the information selectivity needed to regulate the functional aspect of coordination in publish/subscribe systems. This paper tackles the question of filtering events throughout a publish/subscribe middleware that is intended to support the interaction among autonomous entities in networked embedded systems. Therefore, the tradeoff between, first, the accuracy with which subscribers can specify the events that are relevant to them and, second, the efficiency and predictability of filter execution has to be adjusted to cope with resource constraints. The proposed filtering method and the related algorithms aim to provide an adequate solution to this tradeoff. Copyright © 2004 IFAC

Keywords: communication protocols, embedded systems, real-time
1. INTRODUCTION

The development of industrial automation systems shows a clear trend towards distributed architectures. This reflects the decentralized nature of the control tasks involved in automation, and the need to adapt to dynamic conditions, as well as to evolving processes and ever-changing components. For dependability reasons, dependencies have to be avoided whenever possible. Hence, the autonomy of the distributed system components has to be enforced. Moreover, system responsiveness can be improved if the autonomous components are fine-grained at the hardware level. This can be accomplished by deploying intelligent sensors and actuators at the input/output interface. These are dedicated devices comprising a computational block (usually of low performance), a network interface and a transducer element. On a higher level of abstraction, the intelligent transducers have to coordinate with other autonomous objects (e.g. controllers) to perform useful tasks. This is accomplished through the underlying communication system.

Fieldbus technology has a central role in systems like these. Its properties support reliable and timely coordination of networked computing nodes. On the other hand, the issues relating to fieldbus communication are on a very low level of abstraction, and should be hidden from application developers handling high-level concerns such as the coordination of autonomous entities. A middleware layer providing appropriate abstractions is required, as discussed in previous papers (Kaiser and Mock, 1999; Pereira et al., 2001). To review briefly, the coordination model has to support spontaneous dissemination of events, which are generated at the sensor interface or in association with internal state transitions. Moreover, the information transported with events usually has to be disseminated efficiently to multiple destinations. Finally, the dissemination of events has to be decoupled from the individual control flows of the distributed objects.

The event-based communication entailed in the publish/subscribe (P/S) model is well known for supporting the requirements mentioned above (Oki et al., 1993; Rajkumar et al., 1995; Eugster et al., 2001; Meier and Cahill, 2002). However, the P/S model does not implicitly provide any level of reliability or timeliness for disseminating events. Thus, the COSMIC middleware (Kaiser et al., 2003a) has been introduced, which enriches the P/S model with abstractions and semantics that support the real-time dissemination of events in distributed embedded systems. First, it recognizes that "events" may represent occurrences in the sensed environment, so they must be associated with a context, which is described by attributes such as a location and the occurrence time. Second, it includes real-time event channels (Kaiser et al., 2003b), which are abstractions of the communication infrastructure, providing a high-level interface for programming the quality level required to disseminate events. Third, it includes a dynamic binding mechanism (Kaiser and Mock, 1999), which is crucial for the applicability of P/S systems in embedded systems.
This paper reports on mechanisms and algorithms for filtering events, which have been designed for integration into COSMIC. The proposed approach addresses the fundamental tradeoff between the accuracy of the subscription scheme, on one side, and the computational effort required to match events to subscriptions, on the other. This tradeoff has to be resolved considering the resource constraints existing in embedded systems.

The paper is organized as follows. Sec. 2 presents a review of the event filtering approach considered and further elaborated in this paper. The algorithms that have been developed for implementing this approach are analyzed in Sec. 3. Related work is discussed in Sec. 4. Conclusions are given in Sec. 5.

2. EVENT FILTERING

2.1 Overview

Two broad approaches exist for routing events to subscribers. In subject-based systems (Oki et al., 1993; Cabrera et al., 2001), a single identifier related to the events' contents has to be verified. In contrast, the entire functional data carried by events has to be inspected in content-based systems (Carzaniga et al., 1998; Pietzuch and Bacon, 2002). In this process, some of the parsed items have to be checked against predicates defined by subscriptions. Although the latter approach is more flexible and may select information more accurately, its computational cost is higher. For efficiency reasons, the subject-based approach has been adopted in COSMIC. However, practice has shown that it is usually possible to find groups of subscriptions that, although requiring events characterized by distinct attributes, can be traced back to a common topic. Of course, distinct subjects could be assigned to each of these groups of subscriptions to distinguish them. This implies that many subjects would become related to the same contents. Hence, the semantic relation between subject and contents would be lost. Moreover, it would be impossible to exploit the natural occurrence of overlapping interests when striving to optimize the filtering performance. As explained in Sec. 2.2, structural properties of the event representation can be exploited to distinguish the events on a subject and to support the detection of overlapping interests at a low computational cost.

2.2 Conformity-based filtering

Events present attributes which complement the contents. They relate, e.g., to the context in which events are generated and to the required quality of service. Attributes are included (or not) in events according to the applications' semantics. Hence, the structure defined by the presence or absence of attributes partially reveals the meaning of an event, and can be exploited in a method for filtering events. Conformity-based filters (Mitidieri and Kaiser, 2003) are such a method for exploiting the structure of events in a subscription scheme.

Consider that events are typed by a subject and enclose a collection of (attribute, value) pairs. Events pertaining to the same subject are allowed to present different collections of attributes, and hence distinct structures. An event that includes no attributes is also valid. When subscribing, an object must specify the subject and the attributes that an event has to include to be of interest. An event matches a subscription if:

(1) the event's subject matches the subject specified in the subscription;
(2) and the event's structure conforms to the structure specified in the subscription.

A given element is said to conform structurally to another element if it includes at least the same attributes that are included in the latter (Cardelli, 1988). If an event includes every attribute that is specified in a subscription, plus any other, then this event conforms to (matches) this subscription. An event which lacks even a single attribute that is required in a subscription does not conform to (match) this subscription.
Fig. 1. Structural conformance relationships.

The structural conformance relationship is transitive: for structures $\alpha$, $\beta$ and $\gamma$, if $\alpha$ conforms to $\beta$ and $\beta$ conforms to $\gamma$, then $\alpha$ conforms to $\gamma$. Structures can therefore be arranged in a conformance hierarchy. Fig. 1 shows such a hierarchy for three attributes, which are represented by A, B and C. The structures shown in the nodes may belong to an event or to a subscription. The directed edges represent the conforms-to relation. Note that the subject tagging the root node (which includes no attributes) applies to every node in the graph. The following examples illustrate this:

• Subscription (S, (B)) is matched by the events (S, (B)), (S, (A, B)), (S, (B, C)) and (S, (A, B, C));
• Event (S, (A, C)) matches the subscriptions (S, (A, C)), (S, (A)), (S, (C)) and (S, ()).
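The two examples above can be reproduced with a few lines of Python. This is only a minimal sketch of the matching rule of Sec. 2.2, assuming events and subscriptions are represented as (subject, attributes) pairs; the function name `matches` and the tuple layout are illustrative choices, not part of COSMIC.

```python
def matches(event, subscription):
    """Return True if the event matches the subscription (rule of Sec. 2.2)."""
    event_subject, event_attrs = event
    sub_subject, sub_attrs = subscription
    # (1) the subjects must be equal; (2) the event must include at least
    # every attribute required by the subscription (extra attributes are fine).
    return event_subject == sub_subject and set(sub_attrs) <= set(event_attrs)


# Events as (subject, attributes) pairs, mirroring the examples in the text.
events = [
    ("S", ("B",)),
    ("S", ("A", "B")),
    ("S", ("B", "C")),
    ("S", ("A", "B", "C")),
    ("S", ("A", "C")),
]

# Every event that carries at least attribute B matches subscription (S, (B)).
matched_by_b = [e for e in events if matches(e, ("S", ("B",)))]
```

Note that a subscription with an empty attribute collection, such as (S, ()), is matched by every event on subject S, which corresponds to the root node of the hierarchy.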
Filters are distributed over components and gateways. At components, filters must be lightweight and execute in a timely manner, to cope with severe resource constraints and control requirements. Conversely, gateways usually do not have to support stringent temporal requirements, since the WAN can barely support soft real-time communication. Moreover, since the filters in the gateways have to match every event that arrives at their network interfaces (footnote 1), they may require processors with relatively high performance.

3.3 Filtering events at subscribers' nodes
3. FILTERING EXECUTION
3.3.1. Data structures The data structure that supports event filtering at subscribers' nodes is depicted in Fig. 2. It includes a two-dimensional array, which stores references to subscription lists. The primary dimension of the array is indexed by subject identifiers and is labeled "SID" in Fig. 2. The secondary dimension is indexed by conformity filter specifications (CFspec). Each subscription list contains every local subscription to which the indexing CFspec conforms. Hence, filtering an event requires a single access to this table. This is a lightweight operation that presents practically no jitter.
3.1 Event frames and attributes vectors
Each subject is associated with a restricted set of specified attributes. For example, Fig. 2 relates to a particular case in which a subject is associated with three attributes. Each attribute is mapped to a fixed position in a row. This row has a bit-vector representation, called the attributes vector (AV). If an event includes a specific attribute, the respective position in the associated vector is turned "on"; otherwise it is turned "off". The attributes vector that represents a conformity filter specification (CFspec) is formed in the same way.
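As a rough illustration of this mapping, the sketch below builds an attributes vector as an integer bit mask. The attribute names and their bit positions are hypothetical; in practice the mapping would be fixed per subject and known to every node.

```python
# Hypothetical attribute-to-bit mapping for one subject (names are illustrative).
ATTRIBUTE_BITS = {"location": 0, "timestamp": 1, "deadline": 2}


def attributes_vector(attribute_names):
    """Build the bit vector: bit i is on iff the attribute mapped to i is present."""
    av = 0
    for name in attribute_names:
        av |= 1 << ATTRIBUTE_BITS[name]
    return av


# An event carrying location and deadline, and a CFspec requiring only location.
event_av = attributes_vector(["location", "deadline"])
cf_spec = attributes_vector(["location"])
```

With three attributes the vector fits in three bits, so it can also serve directly as an array index, which is what the filtering table of Sec. 3.3 relies on.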
Events are laid out in a frame for transmission. The layout of this frame has been designed to suit the execution of the developed algorithms. The event frame has fields for the subject, for the attributes vector, and for the values of the attributes, as shown at the top of Fig. 2. The order of the attributes' values in the frame is the same as in the attributes vector. In addition, the data type and the length of every attribute are known in the nodes that publish or subscribe to the related subject. Hence, the location of each attribute value in a frame can be calculated. Event frames are compact in size, because the vectors' mappings and the attributes' data types are known in every node. This is beneficial for fieldbuses, which often have very small message frames.
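The offset calculation described above can be sketched as follows. The concrete layout is an assumption made for illustration only: one byte for the subject, one byte for the attributes vector, little-endian values, and a hypothetical three-attribute schema. The actual COSMIC frame layout may differ (for instance in how the subject is carried on the bus).

```python
import struct

# Hypothetical per-subject schema: (attribute name, struct format), in AV bit order.
SCHEMA = [("location", "B"), ("timestamp", "H"), ("deadline", "H")]


def pack_event(subject_id, attributes):
    """Lay out subject, attributes vector and the present values into one frame."""
    av = 0
    payload = b""
    for bit, (name, fmt) in enumerate(SCHEMA):
        if name in attributes:
            av |= 1 << bit
            payload += struct.pack("<" + fmt, attributes[name])
    return struct.pack("<BB", subject_id, av) + payload


def unpack_event(frame):
    """Recover the values; each offset follows from the AV and the known types."""
    subject_id, av = struct.unpack_from("<BB", frame, 0)
    offset, attributes = 2, {}
    for bit, (name, fmt) in enumerate(SCHEMA):
        if av & (1 << bit):
            (attributes[name],) = struct.unpack_from("<" + fmt, frame, offset)
            offset += struct.calcsize("<" + fmt)
    return subject_id, av, attributes


frame = pack_event(7, {"location": 3, "deadline": 500})
```

Because absent attributes occupy no space, this example frame takes five bytes, which stays within the eight data bytes of a CAN frame.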
Fig. 2. Data structures for subscribers' nodes (figure labels: event frame, Attributes Vector, Attributes Values, CF spec., Subscriptions List).

3.3.2. Implementation The data structures are statically allocated and configured according to the local subscriptions. This static configuration is possible because the subscriptions that can be issued in a node are known in advance. It is also necessary when the microcontrollers embedded in the smart transducers are not powerful enough to support dynamic memory allocation.
3.2 Location of filters
The assumed communication infrastructure is composed of fieldbuses, which are connected via gateways to a wide area network (WAN). Fieldbuses are often embedded in equipment (rovers, manufacturing cells, etc.) whose safe operation requires tight coordination of the local computing nodes. These nodes can be viewed as application components, since each of them is dedicated to a specialized activity (smart transducers, controllers, etc.). Supported by gateways, components located on distinct fieldbuses may also coordinate across the WAN. This typically comprises non-critical interaction, e.g., cooperative ambient monitoring.
Footnote 1: The dynamic binding mechanism (Kaiser and Mock, 1999) avoids this need for component nodes.

3.3.3. Insertion of subscriptions The subject identifier and the conformity filter specification (CFspec) are read from each submitted subscription. Then, the subscription's CFspec is matched against each of the AV indices of the secondary array, according to the conformity relation. If an index conforms to the CFspec, a reference to the subscription is inserted in the respective list. An index conforms to a CFspec if the following condition is true:
$$(AV_{index} \,\&\, CF_{spec}) == CF_{spec} \qquad (1)$$

where & is the bitwise-AND operator. The algorithm is summarized below.

the less, this scheme is addressed for fieldbuses, particularly for the CAN-Bus (CAN, 1991). Hence, the number of attributes that can be included in events is already limited by the reduced frames sizes. For instance, the CAN-Bus frames may carry a maximum of 8 bytes. Thus, both the memory requirements and the execution of CFspec matching loop are expected to be feasible, even for small micro-controlled boards.
3.4 Filtering events at gateways

When filtering events at gateways, events do not have to be passed to a notification dispatcher. Hence, the data structures only have to store flags signaling which events should be forwarded. In addition, the subscriptions' identifiers are registered together with the corresponding flags, to manage deletions.
Algorithm 1
01 For each subscription s
02   hs = hash_function(subject_id(s))
03   cf_spec = attributes_vector(s)
04   For (av = 0; av < l; av++)
05     If ((av & cf_spec) == cf_spec)
06       insert(s, filtering_table[hs][av])
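The insertion procedure of Sec. 3.3.3 can be sketched in Python as below. This is an illustrative reading of the text, not the authors' implementation: the table row is a plain list of lists, the number of attributes is assumed to be three, and attributes A, B and C are assumed to map to bits 0, 1 and 2.

```python
N_ATTRIBUTES = 3                    # assumed number of attributes per subject
TABLE_LENGTH = 1 << N_ATTRIBUTES    # one slot per possible attributes vector


def new_subject_row():
    """Secondary array for one subject: a subscription list per AV index."""
    return [[] for _ in range(TABLE_LENGTH)]


def insert_subscription(row, subscription_id, cf_spec):
    """Register the subscription under every AV index that conforms to cf_spec."""
    for av in range(TABLE_LENGTH):
        if (av & cf_spec) == cf_spec:    # the conformity condition
            row[av].append(subscription_id)


row = new_subject_row()
insert_subscription(row, "sub_B", 0b010)     # requires attribute B only
insert_subscription(row, "sub_AC", 0b101)    # requires attributes A and C
```

The loop visits every index of the secondary array once, so the cost of an insertion grows with the array length, while the per-event lookup stays a single index operation.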
3.4.1. Data structures Gateways' filters include a filtering table, which is indexed by subject identifier (SID) and by the number of attributes (NA) in events, as indicated in Fig. 3. The stored elements are references to lists of CFspec entries. Each list contains only entries whose number of checked attributes equals the indexing NA. The presence of a CFspec indicates that the conforming events have to be forwarded. No duplicate of a CFspec has to be stored in a list. Hence, these lists can be structured as height-balanced binary search trees.
3.3.4. Events filtering The subject identifier is obtained from every received event and applied as the input of a hash function (Vitter, 1982). The attributes vector is also read from the event frame. Then, both the hash value and the attributes vector are used to index the filtering table, which returns a reference to a list of subscriptions. If this list is not empty, it contains all the subscriptions that have been matched. Hence, it is passed to a dispatching module, together with the event. This procedure is summarized in Alg. 2.
Algorithm 2
01 For each event e
02   hs = hash_function(subject_id(e))
03   av = attributes_vector(e)
04   *subscriptions_list = filtering_table[hs][av]
05   If (subscriptions_list is not empty)
06     dispatch(e, subscriptions_list)
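A minimal Python sketch of the lookup in Alg. 2 follows. It assumes a table that has already been filled at subscription time; a Python dict stands in for the hash function, and the `dispatch` callback, the subject number 42 and the subscription names are hypothetical.

```python
def filter_event(filtering_table, subject_id, av, dispatch):
    """Look up the matching subscriptions with one table access and dispatch them."""
    row = filtering_table.get(subject_id)    # dict lookup stands in for the hash step
    if row is None:
        return False
    subscriptions = row[av]                  # direct index, no per-subscription loop
    if subscriptions:
        dispatch(subject_id, av, subscriptions)
        return True
    return False


# Hypothetical pre-built table for one subject with two attributes (4 AV indices):
# s1 requires bit 0, s2 requires bits 0 and 1.
table = {42: [[], ["s1"], [], ["s1", "s2"]]}
delivered = []


def record(subject_id, av, subscriptions):
    delivered.append((subject_id, av, tuple(subscriptions)))


filter_event(table, 42, 0b11, record)    # matches s1 and s2
filter_event(table, 42, 0b10, record)    # matches nothing, not dispatched
```

The work per event does not depend on how many subscriptions are stored, which is the property the analysis in Sec. 3.3.5 relies on.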
3.3.5. Analysis The proposed algorithms were designed to ensure efficiency as well as predictability when filtering events, at the cost of a higher complexity when inserting and deleting subscriptions. Considering that subscriptions are inserted and deleted outside of critical control paths, while event filtering should usually be performed in real-time control paths, this is a reasonable solution.
Fig. 3. Data structures for gateways.
Accordingly, the filtering algorithm's execution time has complexity $O(1)$, both in relation to the number of attributes and to the number of stored subscriptions. On the other hand, when matching the CFspec of a subscription against the AV indexes (lines 04 and 05 in Alg. 1), the time complexity is $O(l)$, where $l$ is the length of the array. In addition, the insertion and deletion of a subscription in a list is $O(\log_2 s)$, where $s$ is the number of subscriptions, since the subscriptions' lists are implemented as binary search trees. The order in the trees is defined by the subscriptions' identifiers.
The required memory for each secondary array is $O(2^n)$, where $n$ is the number of attributes. Never-

3.4.2. Implementation While the SID array is static, the NA arrays and the addressed lists are dynamically created and destroyed in response to the insertion and deletion of subscriptions. Dynamic maintenance of this structure is necessary because it is not possible to know in advance which hosts (i.e., subscribers) will be connected to a network. In fact, it is assumed that they vary as the system evolves.

3.4.3. Insertion of subscriptions As outlined in Alg. 3, the subject identifier, the CFspec and the respective number of checked attributes are read for each submitted subscription (Lines 02, 03, 04). Then, the applicable tree is retrieved (Line 05) and an attempt to insert the CFspec is carried out (Line 06). If an identical CFspec is already inserted, then only the subscription identifier has to be recorded in the corresponding tree node. This is necessary to manage the deletion of subscriptions. If an insertion effectively occurs, an operation for balancing the height of the tree is performed. The algorithm for deleting a subscription includes comparable steps.
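The gateway-side insertion described in Sec. 3.4.3 might look as follows in Python. This is a sketch under simplifying assumptions: a plain dict replaces the height-balanced binary search tree (so no rebalancing step appears), and the table layout, function names and subscription identifiers are illustrative only.

```python
def popcount(vector):
    """Number of checked attributes (bits set) in an attributes vector."""
    return bin(vector).count("1")


def insert_subscription(gateway_table, subject_id, cf_spec, subscription_id, av_length):
    """Record cf_spec under (subject, NA); a duplicate only adds the subscriber id."""
    rows = gateway_table.setdefault(
        subject_id, [dict() for _ in range(av_length + 1)]
    )
    entries = rows[popcount(cf_spec)]    # a dict stands in for the balanced tree
    created = cf_spec not in entries
    entries.setdefault(cf_spec, set()).add(subscription_id)
    return created


gateway_table = {}
first = insert_subscription(gateway_table, 42, 0b101, "s1", av_length=3)
second = insert_subscription(gateway_table, 42, 0b101, "s2", av_length=3)
```

Keeping the subscription identifiers next to each CFspec is what allows a later deletion to decide whether the CFspec entry itself can be removed.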
ber of nodes in the respective binary search tree. The determination of such number of nodes is related to the problem of generating combinations of k elements (i.e. the number of checked attributes) taken from a set of I elements (i.e. the total length of the AV), without repetitions. The maximum number of combinations (corresponding to tree nodes) is given by :
$$\frac{l!}{k!\,(l-k)!}$$

which is a polynomial function of $l$. Hence, considering the following inequality:
Algorithm 3
01 For each subscription s
02   hs = hash_function(subject_id(s))
03   cf_spec = attributes_vector(s)
04   na = number_of_checked_attributes(s)
05   *bst = filtering_table[hs][na]
06   insert = insert_binary_tree(cf_spec, bst)
it is possible to conclude that the complexity for inserting or deleting a subscription is lower than $O(l)$, where $l$ is the AV length.
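This conclusion can be checked numerically. The sketch below relies on the standard fact that the number of k-element combinations of l elements never exceeds 2^l, so its base-2 logarithm never exceeds l; this may differ from the inequality the authors used. The AV length of 8 is an arbitrary value chosen for illustration.

```python
import math

AV_LENGTH = 8    # assumed AV length l, for illustration only

for k in range(AV_LENGTH + 1):
    max_nodes = math.comb(AV_LENGTH, k)    # l! / (k! (l - k)!)
    depth = math.log2(max_nodes)           # search depth in a balanced tree
    # The tree can never hold more than 2^l entries, so its depth stays below l.
    assert max_nodes <= 2 ** AV_LENGTH and depth <= AV_LENGTH

largest = max(math.comb(AV_LENGTH, k) for k in range(AV_LENGTH + 1))
```

For an AV length of 8, the largest tree corresponds to k = 4 and a balanced tree of that size has a depth of about 6, below the AV length.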
3.4.4. Events filtering The subject identifier, the AV and the number of checked attributes (NA) in the AV are obtained (Lines 02, 03, 04, Alg. 4) from each received event. Then, a loop is executed (Line 05) to retrieve CFspec lists. This loop ranges from NA down to zero. The first retrieved list contains only CFspec entries with the same NA as the processed event. Hence, a binary search is performed in the first list, since the current AV can conform to at most one of its entries. If a match does not occur, the remaining lists are retrieved (one at a time) and traversed, looking for a match. If a match occurs when traversing a tree, the event is forwarded and the search stops. In addition, a CFspec identical to the AV of the current event is inserted in the corresponding tree. Therefore, the complexity of matching subsequent events with the same AV will be much lower, as discussed in Sec. 3.4.5.
The time complexity for filtering events presents two cases (Lines 08 and 10 of Alg. 4). When an event whose AV equals no stored CFspec is received for the first time in a gateway, the complexity is bounded by $O(s)$, where $s$ is the number of subscriptions on the related subject. However, the time complexity for filtering the subsequent events of the same flow (which will present the same AV) will be lower than $O(l)$, which is exactly the complexity of insertions and deletions of subscriptions.
4. RELATED WORK

Structural properties of the event representation are implicitly considered in the filtering schemes of many P/S systems, although in these cases always tied to the evaluation of predicates (Carzaniga et al., 1998; Starovic et al., 1995; Meier and Cahill, 2002; Mühl et al., 2002; Gelernter, 1985). Omitting this association in the presented scheme allows the filtering process to be broken up into a number of stages. Such a division of labor may improve the overall efficiency of the filtering process, since the search space (i.e., the number of subscriptions to be matched) can be progressively restricted while the complexity of the matching mechanism increases. For instance, predicates could be evaluated after conformity-based filters, but then only for the subscriptions that have been pre-selected so far.
Algorithm 4
01 For each event e
02   hs = hash_function(subject_id(e))
03   na = number_of_checked_attributes(e)
04   av = attributes_vector(e)
05   For (i = na; i >= 0; i--)
06     *bst = filtering_table[hs][i]
07     If (i == na)
08       binary_search(av, bst)
09     Else
10       traverse_tree(av, bst)
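The two-case filtering of Sec. 3.4.4 can be sketched as follows for a single subject. The sketch assumes Python sets in place of the binary search trees (so the exact-match step is a set membership test rather than a binary search) and adds the caching step described in the text; all names are illustrative.

```python
def popcount(vector):
    """Number of checked attributes (bits set) in an attributes vector."""
    return bin(vector).count("1")


def conforms(av, cf_spec):
    """True if the attributes vector includes every attribute of cf_spec."""
    return (av & cf_spec) == cf_spec


def filter_event(rows, av):
    """rows[i] holds the stored CFspecs with i checked attributes.

    Returns True when the event must be forwarded.
    """
    na = popcount(av)
    if av in rows[na]:                  # same NA: only an identical CFspec can match
        return True
    for i in range(na - 1, -1, -1):     # fewer checked attributes: traverse each list
        if any(conforms(av, cf_spec) for cf_spec in rows[i]):
            rows[na].add(av)            # cache, so the next identical AV hits the fast path
            return True
    return False


rows = [set(), {0b010}, set(), set()]   # one stored CFspec requiring attribute B
first = filter_event(rows, 0b110)       # slow path: matched by traversal, then cached
second = filter_event(rows, 0b110)      # fast path: exact hit on the cached entry
third = filter_event(rows, 0b101)       # lacks B: not forwarded
```

The sketch does not handle the removal of cached entries when the subscription that justified them is deleted; a full implementation would have to keep that association.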
3.4.5. Analysis Gateways concentrate subscriptions. Hence, the algorithms described in this section are designed to support a larger number of subscriptions, as well as more frequent insertions and deletions, in comparison to the previous ones. As a matter of fact, the space complexity has been kept at $O(l)$, where $l$ is the AV length. As a consequence, the time required to filter an event unavoidably increases. This tradeoff is acceptable, since the interaction that occurs through the gateways is less stringent regarding timeliness, as already observed.
Structural properties reflecting the relations among information categories have been exploited by Kulik (2003) to speed up the matching and forwarding of events in P/S systems. That scheme was designed for disseminating information on the Internet; thus it assumes the existence of a centralized information collector, which would establish the categories of the incoming information flows. Such an infrastructure is not available, and is actually undesired, in the systems addressed in this work.
The time complexity for inserting or deleting a subscription is bounded to 0 (log2 n), where n is the num-
5. CONCLUSIONS AND FUTURE WORK

This paper builds on the exploitation of the publish/subscribe model for coordinating autonomous components in distributed real-time embedded systems. The question tackled relates to the fundamental compromise between the flexibility of a subscription scheme and the computational cost of its implementation. This tradeoff has been addressed considering the resource constraints typically found in distributed embedded systems. It has been argued that structural properties of the event representation can be exploited to improve the selectivity of subscriptions in subject-based systems, while still keeping the time complexity of filter execution low. The scheme has been implemented and its performance is being evaluated under a variety of conditions. Future research will focus on exploiting event attributes in the high-level coordination of autonomous devices.
ACKNOWLEDGMENTS

This work has been supported by the EC, through project IST-2000-26031 (CORTEX: COoperating Real-time senTient objects: architecture and EXperimental evaluation), and by the Brazilian research agencies CNPq and FINEP.
REFERENCES

Cabrera, L. F., M. B. Jones and M. Theimer (2001). Herald: Achieving a global event notification system. In: 8th Workshop on Hot Topics in Operating Systems (HotOS-VIII).
CAN (1991). CAN Specification version 2.0.
Cardelli, Luca (1988). Structural subtyping and the notion of power type. In: Conference Record of the Fifteenth Annual ACM Symposium on Principles of Programming Languages. San Diego, California. pp. 70-79.
Carzaniga, A., D. S. Rosenblum and A. L. Wolf (1998). Design of a scalable event notification service: Interface and architecture. Technical report. Department of Computer Science, University of Colorado.
Eugster, P. Th., P. Felber, R. Guerraoui and A. M. Kermarrec (2001). The many faces of publish/subscribe. Technical Report DSC ID:200104. EPFL. Lausanne, Switzerland.
Gelernter, D. (1985). Generative communication in Linda. ACM Trans. Prog. Lang. Syst. 7(1), 80-112.
Kaiser, J., C. Mitidieri, C. Brudna and C. E. Pereira (2003a). COSMIC: A middleware for event-based interaction on CAN. In: IEEE Conference on Emerging Technologies and Factory Automation. Lisbon, Portugal.
Kaiser, Joerg and M. Mock (1999). Implementing the real-time publisher/subscriber on the controller area network (CAN). In: 2nd International Symposium on Object-Oriented Real-time distributed Computing. Saint-Malo, France.
Kaiser, Joerg, Cristiano Brudna and Carlos Mitidieri (2003b). A real-time event channel model for the CAN-Bus. In: International Workshop on Parallel and Distributed Real-Time Systems (WPDRTS'2003). Nice, France.
Kulik, Joanna (2003). Fast and flexible forwarding for internet subscription systems. In: Second International Workshop on Distributed Event-Based Systems (DEBS '03).
Meier, Rene and Vinny Cahill (2002). Steam: Event-based middleware for wireless ad hoc networks. In: International Workshop on Distributed Event-Based Systems.
Mitidieri, C. and J. Kaiser (2003). Attribute-based filtering for embedded systems. In: Second International Workshop on Distributed Event-Based Systems (DEBS '03). San Diego, California.
Mühl, G., L. Fiege and A. P. Buchmann (2002). Filter similarities in content-based publish/subscribe systems. In: International Conference on Architecture of Computing Systems (ARCS). Karlsruhe, Germany. pp. 224-238.
Oki, Brian, Manfred Pfluegl, Alex Siegel and Dale Skeen (1993). The information bus - an architecture for extensible distributed systems. In: ACM Symposium on Operating System Principles. pp. 58-68.
Pereira, C. E., J. Kaiser, C. Mitidieri, C. Villela and L. B. Becker (2001). On evaluating interaction and communication schemes for automation applications based on real-time distributed objects. In: 4th Int. Symposium on Object-Oriented Real-Time Distributed Computing (ISORC '01). Magdeburg, Germany.
Pietzuch, P. and J. Bacon (2002). Hermes: A distributed event-middleware architecture. In: 1st International Workshop on Distributed Event Systems (DEBS '02).
Rajkumar, Ragunathan, Mike Gagliardi and Lui Sha (1995). The real-time publish/subscribe interprocess communication model for distributed real-time systems: Design and implementation. In: IEEE Real-Time Technology and Applications Symposium.
Starovic, Gradimir, Vinny Cahill and Brendan Tangney (1995). An event-based object model for distributed programming. In: OOIS (Object-Oriented Information Systems) '95. Springer-Verlag. London. pp. 72-86.
Vitter, J. S. (1982). Implementations for coalesced hashing. Communications of the ACM 25(12), 911-926.