Robotics and Computer-Integrated Manufacturing 27 (2011) 627–635
Contents lists available at ScienceDirect
Robotics and Computer-Integrated Manufacturing journal homepage: www.elsevier.com/locate/rcim
Evaluating a new communication protocol for real-time distributed control Jason J. Scarlett, Robert W. Brennan n Schulich School of Engineering, University of Calgary, 2500 University Dr. N.W. Calgary, Alberta, Canada T2N 1N4
a r t i c l e in f o
abstract
Article history: Received 15 April 2010 Received in revised form 8 October 2010 Accepted 29 October 2010 Available online 4 December 2010
The recent trend in distributed automation and control systems has been towards event-triggered system architectures such as UML and IEC 61499. Although existing communication protocols (e.g., Ethernet) can support high-level communication within these systems, there is contention as to which low-level protocol to use, or if any exist that meet the requirements of being event-triggered and hard real-time. This paper proposes a new way to measure communication performance. The goal of the new measurement method is to stress the necessity that a real-time communication protocol needs to be both efficient and fair. This is illustrated by comparing three communication strategies: Controller Area Network (CAN), Time-Triggered CAN (TTCAN) and Escalating Priority CAN (EPCAN). The first two represent the extremes between event-triggered and time-triggered communication strategies; the third is introduced to illustrate the benefits of a new event-based communication protocol proposed by the authors. & 2010 Elsevier Ltd. All rights reserved.
Keywords: Distributed control IEC 61499 Object-oriented programming
1. Introduction Given recent advances in distributed hardware and software technologies, distributed control is becoming a more attractive alternative for industrial applications. This is driven by the increasing performance and cost efficiency of smart devices interconnected by fieldbuses (bi-directional digital serial networks) as well as by the need to match the control model more closely with the physical system. This is particularly relevant to manufacturing control systems that are required to control widely distributed devices in an environment that is prone to disruptions. The natural fit of multi-agent systems technology to manufacturing problems has resulted in many applications in this domain [1]. Consequently, this has led to Holonic Manufacturing Systems (HMS), a manufacturing-specific application of the broader MAS approach [2]. Recently, there has been considerable interest in extending this work from the upper, planning and scheduling level of control, to the physical device level [3]. For example, members of the HMS consortium focused on defining a low-level control architecture [4,5] that is based on the IEC 61499 function block standard [6]. With this work as a basis, IEC 61499 function blocks have been used at the device level for dynamic reconfiguration [7], safety management [8,9], and system validation [10]. This has led to range of industrial applications of real-time distributed control [11] such as DaimlerChrysler’s holonic control implementation for engine assembly [12], Rockwell Automation’s design of a reconfigurable ship-based chilled-water system [13], the integration of RFID
n
Corresponding author. E-mail address:
[email protected] (R.W. Brennan).
0736-5845/$ - see front matter & 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.rcim.2010.10.009
(radio frequency identification) systems [14], and the use of holonic systems for sensor management on military platforms [15]. When agent technology is applied to physical devices, system design and analysis are of utmost concern given real time and safety issues at this level of control. For example, as devices such as controllers, sensors, and actuators become ‘‘smarter’’, safety functions that were previously performed by mechanical or electrical interlocks are assumed by computer software that may reside in a single device or might be distributed across multiple devices. Failures in these systems can have a significant impact on a company’s bottom line (e.g., profit reduction due to temporary loss of equipment resulting in production downtime), and more importantly, on the lives of people. Because of the more stringent requirements for latency, reliability, and availability, it follows that the step from the non-real time or soft real time domain is a large one, requiring new models and methodologies for distributed control. Like other control architectures, a distributed architecture must deal with safety issues such as redundancy, data validation, fault isolation, and timing issues. Timing issues in a distributed system must consider how the network resources are managed because the network is now comprised of a single digital communication bus shared between all devices. Timing of communication must deal with unpredictable collisions that result in delays. A problem arises however, when this model is applied to safety– critical systems because safety–critical systems require guaranteed communication delays. Two opposing paradigms currently exist for dealing with real-time communication: and only one of them is currently able to meet this requirement. The two paradigms commonly used in distributed control systems are either time-triggered or event-triggered [16,17].
628
J.J. Scarlett, R.W. Brennan / Robotics and Computer-Integrated Manufacturing 27 (2011) 627–635
The time-triggered paradigm uses a predefined cyclic time schedule (periodic) to drive the communication of the system. The eventtriggered paradigm relies on non-periodic (sporadic) stimuli to generate the need to communicate. To date, the time-triggered approach has been favoured in safety–critical systems because it can ensure that no communication delays occur by assigning dedicated communication time windows to specific nodes [18]. The purely event-triggered approaches have been unable to guarantee communication delays and have thus not gained acceptance. Recent work in this area has focused on making event-triggered systems more predictable [19]. The problem with the approach is that they have converted event-triggered communication into timetriggered communication by placing a minimum time period between successive communications. Although this approach results in deterministic communication, it essentially violates the idea that events take place independent of any time reference. In the next section, we provide the background to our work on communication protocols for real-time distributed control. Section 2 begins with a brief overview of the IEC 61499 distributed control model and the motivation for this work. We then summarize the extant communication protocols and propose a new communication protocol that is intended to closely match the IEC 61499 model. In Section 3, we propose a game theory based approach for assessing communication protocols. This approach is then applied in Section 4, where we compare our proposed protocol with no arbitration, a time-triggered approach, and an event-based approach. We conclude the paper in Section 5 with our conclusions and recommendations for future work.
Fig. 1. IEC 61499 distributed applications.
2. Background In this paper we focus on the problem of achieving predictable and timely behavior of real-time distributed control applications using IEC 61499 [6]. Although latencies can occur in distributed control systems for a variety of reasons (e.g., control application design, processor clock speed, processor memory, etc.), our focus will be on the important role the communication protocol plays in realising this goal when the system model is event-based and distributed. We begin this section with a brief overview of the IEC 61499 model as well as an example of how IEC 61499 distributed applications are affected by different communication protocols. Next, we look more closely at the extant communication protocols for real-time distributed control and propose a new protocol for IEC 61499 systems. We conclude this section with some background on the performance measures that can be used to assess competing protocols. 2.1. IEC 61499 and distributed control Much of the recent work on distributed intelligent control at the device level has relied on the IEC 61499 standard. This standard was developed by the International Electro-technical Commission to addresses the need for modular software that can be used for distributed industrial process control ([6]. In particular, IEC 61499 builds on the function block portion of the IEC 61131-3 standard for programmable logic controller (PLC) languages [20] and extends the function block language to meet more adequately the requirements of distributed control in a format that is independent of implementation. Current automation systems typically combine a control application in a single PLC with remote I/O (input/output) devices. With the continuing trend towards small, inexpensive programmable controllers, the possibility of distributed intelligent devices with local I/O is becoming more attractive. As is illustrated in Fig. 1, given a programming language that supports this type of distributed
Fig. 2. IEC 61499 function blocks.
application, this can be achieved by the distribution of functional units over a network of communicating processors (devices). One of the main limitations of existing approaches to PLC programming (e.g., ladder logic, IEC 61131-3) is that program execution is based on the conventional PLC scan cycle. This leads to considerable inflexibility in the scheduling of events, which can result in serious difficulties in meeting the requirements of hard real-time control. As is shown in Fig. 2, through a combination of the object concept (e.g., the IEC 61131-3 ‘‘function block’’) and the concept of state diagrams (e.g., the IEC 61131-3 ‘‘sequential function chart’’ or the French GRAFCET language [21] a solution can be found. As indicated at the bottom of Fig. 2, both scheduling of events (i.e., via function block event inputs and outputs) and data flow (i.e., via the typical function block connections) can be specified in a single notation. This result in an ‘‘enhanced’’ object or function block: the shaded portions of the function blocks shown in Fig. 2 are responsible for execution control and the non-shaded portions are responsible for functionality (e.g., algorithms and data). IEC 61499 extends the IEC 61131-3 standard through these two basic concepts of functional distribution and event-based execution control. A detailed description of the IEC 61499 standard and its associated models (i.e., function block, application, resource, device,
J.J. Scarlett, R.W. Brennan / Robotics and Computer-Integrated Manufacturing 27 (2011) 627–635
629
system) is beyond the scope of this paper. However, the reader may consult Refs. [22,23] or [7] for a comprehensive overview of these models and Ref. [6] for the detailed standard. Given the distributed nature of IEC 61499 applications, it is important that all messages (e.g., event signals and data) are delivered correctly and in a timely manner. For this paper, we assume that messages are delivered correctly (i.e., they are not corrupted): our focus is on the impact of untimely message delivery. From a safety point of view, timely delivery of messages is clearly important: i.e., if hard real-time constraints are not satisfied every time, the system cannot be used in a high-integrity environment. To address this issue, the typical approach is to use a timetriggered communication protocol that ensures a maximum, worst case latency for messages. Although this approach is considered deterministic in terms of message delivery times, it may not achieve the level of determinism required for real-time distributed control [24]. In other words, for IEC 61499 distributed applications, it is uncertain if one can say that the same inputs to the system will always result in the same outputs. This problem is illustrated by the simple conveyor control example Fig. 3. In this case, we have a single IEC 61499 function block application running across three devices as illustrated in Fig. 4: Device 1 is associated with Track 1, Device 2 is associated with Track 2, and Device 3 is responsible for the overall control of the conveyor system. Of concern here are the connections between FBA and FBC and between FBB and FBC. In this case, FBC controls a crossing point on an automated conveyor system. FBA and FBB signal FBC when pallets from Track 1 and Track 2 enter the crossing point, respectively; FBC uses the order of these signals to determine which track should be stopped to prevent a collision. The behavior of the application is dependent on the order in which the two respective events are received at FBC. In the case of a high-integrity application, the designer will want to ensure that hard real-time messages have maximum latency
guarantees. However, it will also be important that the application behaves as expected. For example, when FBA fires before FBB, the system output (determined by FBC, etc.) is always the same. To satisfy the guarantee on message latency, the designer would be inclined to select a time-triggered communication protocol. However, as illustrated in Fig. 5, this can result in a lack of determinism with respect to the distributed application execution (despite the fact that messages are received ‘‘on time’’). This timing diagram shows the order of event outputs from Device 1 (FBA) and Device 2 (FBB). In scenario 1, Device 1 has access to the bus when FBA is ready to transmit: as a result, the two events are received at FBC in the order they occurred on Devices 1 and 2 and the application responds as expected. In scenario 2, Device 2 has access to the bus and as a result, effectively blocks Device 1. This results in the order of event inputs to FBC being reversed. This would likely result in unexpected and unpredicted behavior. Of course, the designer could tighten the latency requirements to minimize this type of problem from occurring. However, given the nature of IEC 61499 distributed applications, this would require
Fig. 3. A simple conveyor system.
Fig. 5. The distributed application’s behavior using a time-triggered protocol.
Fig. 4. A simple IEC 61499 distributed application.
630
J.J. Scarlett, R.W. Brennan / Robotics and Computer-Integrated Manufacturing 27 (2011) 627–635
treating all distributed communications as hard real-time if the designer is to ensure predictable behavior. This may be fine for small applications, but it has the potential to create a real barrier to IEC 61499 adoption if required for all applications. In order to address this issue, it is important to look more closely at communication protocols in the context of IEC 61499. The IEC 61499 application layer plays an important role in this regard: i.e., the choice of communication protocol is made at this level (i.e., by the designer’s choice of communication service interface function block), which directly impacts message deliberation. For the majority of extant applications, communication function blocks are chosen that support Ethernet (i.e., no arbitration). For time critical control applications, communication function blocks that support time realtime communication protocols are necessary. In the remainder of this section, we briefly summarize the main protocols used in industry and propose a new event-triggered protocol for distributed intelligent control applications. We conclude the section with a brief discussion of common approaches to assessing communication protocols as well as their limitations.
2.2. Communication protocols One of the earliest communication protocols for real-time embedded systems is the Control Area Network (CAN) protocol, developed by Bosch [25]. This is an event-triggered protocol that does not guarantee message delivery times [26]. In order to provide a higher degree of predictability for safety-critical applications, a timetriggered component was either built into CAN with protocols such as TTCAN [27] and FTT-CAN [28,29], or a strictly time-triggered protocol was used such as TTP/C [30], Byteflight [31], and FlexRay [32]. In this section, we provide a brief overview of these five main communication protocols for real-time embedded systems. The details of these protocols are beyond the scope of this paper, however the reader may consult Ref. [17] or [33] for a comprehensive survey. TTCAN (Time-Triggered CAN) is a deterministic version of the CAN protocol. Unlike the event-triggered CAN model, TTCAN [34] implements a dual scheduling window system. In this scheme, time windows for both periodic and spontaneous messages are created allowing both time-triggered messages to coexist on the same network [35]. FTT-CAN was proposed by Ferreira et al. [28]. This protocol extends the dual scheduling window to a new level. The newly defined time windows are not only able to handle both time-triggered and eventtriggered messages, but are dynamically resizable. This allows the protocol to adapt to the present traffic conditions on the bus. TTP/C uses the TDMA (Time Division Multiple Access) scheme for bus access [27]. Each node is assigned a window during which they have exclusive broadcast rights. Currently, this protocol does not define support for event-triggered messages. In order to support event-triggered messaging, an additional protocol such as CAN must be implemented within TTP/C [32]. The Byteflight protocol was developed by BMW together with several semiconductor companies for safety-critical applications in automotive vehicles [32]. This protocol is not a true time-triggered protocol because it does not rely on global timing synchronization [35]. Instead, Byteflight provides access to the bus in a FTDMA (Flexible Time Division Multiple Access) scheme. FlexRay [36] is a hybrid protocol developed by a consortium of major manufacturers in the automotive industry (BMW, Bosch, DaimlerChrysler, General Motors, Philips and Volkswagen). Like TTCAN and FTT-CAN, FlexRay uses the dual time window approach: i.e., it allocates portions of network time to time-triggered messages and to prioritized message access [33]. Asynchronous messages are handled by the FTDMA Byteflight protocol and synchronous messages are handled by a TDMA scheme.
2.3. The EPCAN protocol The reliance on time-triggered protocols for real-time embedded systems is based on the assumption that time-triggered protocols are the only ones suited to safety-critical applications: i.e., time-triggered schemes are deterministic (higher degree of predictability) and event-triggered schemes are not [17,37]. As illustrated in the previous example, this approach does not match event-triggered models such as IEC 61499 very well. As a result, we propose a dynamic priority scheduling protocol, Escalating Priority CAN (EPCAN), which allows the maximum delay time to be deterministically predicted. In order to achieve this, the prioritysetting scheme must guarantee that each node will not have to wait longer than one cycle to broadcast. This maximum delay time is equivalent to the delay time a time-triggered protocol can guarantee. Our proposed EPCAN protocol involves two components. The first component is a calculation based on the message history within the node. The second component is a unique static identifier like the one used in the fixed priority scheduling method. Together these two components make up a dynamic priority code that is computed by each node. The calculated component is used to increase a node’s priority the longer it waits and is based on the time since the node last sent a message. An important feature of this is that no clock synchronization is required between the nodes as is the case with timetriggered systems. The static component is assigned pre run-time as a unique identifier for each node. It is required when two messages collide that have the same time priority. This happens only when two nodes that have not transmitted for greater than the cycle time attempt to transmit at the same time. In this case both nodes would have a time priority of one, and the static component would then be used to settle the arbitration. The following is a simplified example of the EPCAN protocol and a snapshot of the scheduling that takes place. In this case, we show five nodes with equal message lengths of one time unit. The cycle time of the system is calculated as the number of nodes times the message length (in this example, five): by convention, the lower the number, the higher the priority. Time Priority¼MAX(Cycle Time+ Message Length Time Since Last Message, 1) Node Priority ¼Unique Identification Number Dynamic Priority¼ CONCATENATE(Time Priority, Node Priority) As shown in Table 1, both Node 4 and Node 5 have a time priority of 1; however, Node 4 wins arbitration because it has the lowest node priority. Table 2 illustrates how the scheduling progresses over 8 time units. As noted previously, traditional evaluation of event and timetriggered approaches is based on protocol’s determinism: i.e., the ability of a system to respond with a consistent, predictable time delay between input and response [38]. Typically, the worst-case latency or average latency is used as a means to measure a protocol’s performance. Overall bandwidth efficiency and fairness come second to worst-case and probabilistic behavior.
Table 1 A sample calculation of dynamic priority. Node #
Time since last message sent
Time priority
Node priority
Dynamic priority
Node Node Node Node Node
4 3 2 8 9
2 3 4 1 1
1 2 3 4 5
21 32 43 14 15
1 2 3 4 5
J.J. Scarlett, R.W. Brennan / Robotics and Computer-Integrated Manufacturing 27 (2011) 627–635
Table 2 A sample calculation of dynamic priority scheduling.
Table 3 Bimatrix notation.
Node #
Dynamic priority
Node Node Node Node Node
21 32 43 14 15
11 22 33 – 15
– 12 23 – 15
– – 13 – 15
– – – 24 15
– 32 43 – 55
11 52 33 – 45
51 42 23 – 35
4 1
1 2
2 3
3 4
5 5
2 6
1 7
3 8
1 2 3 4 5
Broadcasting node Time
631
This at first may seem like an appropriate method for evaluating various strategies for safety-critical systems, but what happens when two protocols have the same worst-case behavior? When this happens we require alternative measures than measures typically provided. In the next section we propose a new approach to assess the performance of alternative communication protocols that is intended to provide further insight into competing protocols with similar worst-case behavior.
Table 4 State probabilities.
3. Assessing the communication protocols In this section, we propose a game theory based approach that can be used to evaluate and compare alternative communication protocols. The motivation for this work is to show that the proposed EPCAN communication protocol is well suited for event-based, distributed control models like IEC 61499. 3.1. Using game theory to compare the protocols Game theory is an interdisciplinary approach to the study of human behavior that has its roots in mathematics, economics, and the social sciences [39,40]. A game is an interaction between players who employ various strategies to achieve various outcomes. This fits very naturally with the idea of various nodes competing to send a message on a network. In this case, the players are the nodes and the strategies they employ will allow them to (or not allow them to) send a message. Sending the message results in a benefit for the node and is measured as a payoff. In order to characterize two-player games, a bimatrix notation can be used to summarize the payoffs for each player given various combinations of player strategies. For example, in Table 3, when Player 1 has a strategy of ‘‘A’’ and Player 2 has a strategy of ‘‘B’’, the payoff for player 1 is ‘‘w’’ and the payoff for player 2 is ‘‘x’’. Remaining of this paper is focused on the more complicated N-Player game. Our preliminary work on the 2-Player game can be found in Ref. [41]. For N-player games the bimatrix notation is not possible. Instead, the payoffs are represented in the arrays of form {A, B, C 9 w, x, y} where again, Player 1 has a strategy of ‘‘A’’ and payoff of ‘‘w’’, Player 2 has a strategy of ‘‘B’’ and payoff of ‘‘x’’ and Player 3 has a strategy of ‘‘C’’ and a payoff of ‘‘y’’. The number of arrays required to fully describe an N-player game is equal to the number of strategies raised to the power of N. Each node is considered to be a player that has one of two strategies: ‘‘Listen’’ or ‘‘Talk’’. In the case of nodes, the choice between two strategies is much less an issue of preference than it is of probability. It therefore makes sense to compute the strategy probability based on each node’s communication profile. Table 4 shows the probabilities for each quadrant if each node Listens 80% of the time and Talks 20% of the time. For example, the probability of both nodes listening is 0.8 0.8, or 0.64; the probability of Node 1 listening and Node 2 talking is 0.8 0.2, or 0.16.
The state probabilities can be multiplied by the bimatrix values to provide a weighted value under different communication profiles (network loading). When viewing nodes on a communication network in the context of game theory we need to think of the payoff each node can obtain while playing (i.e., communicating). Payoffs measure the benefit to a player given a particular outcome. The value of the payoff is arbitrary and is up to the person designing the game to select an appropriate value. In this case the selection is based on the ability of a node to send a message. The payoff is defined as the proportion of time that a message is successfully transmitted. For the N-player game, the payoff proportion can have any value between 0 (no successful transmissions) and 1 (all messages are successful). In the next section we summarize each of the main communication protocols using this two player notation. In addition to this characterization of the communication protocols, we also have to consider individual node behavior in order to evaluate how each protocol performs for various network loads.
3.2. The N-player scenario setup N-player games allow more complicated scenarios to be evaluated, but lose the intuitive feedback that a two player bimatrix notation provides. Instead, the notation takes on the form of an array of strategies and payoffs. In order to keep the notation as compact as possible, the strategies ‘‘Listen’’ and ‘‘Talk’’ will be represented as ‘‘L’’ and ‘‘T’’, respectively. Each scenario consists of N nodes communicating via a bus with one of two strategies: ‘‘L’’ or ‘‘T’’. Four scenarios will be considered in this section: no arbitration, CAN, TTCAN, and the proposed EPCAN protocol. The payoff for each node is determined based on the scenario’s ability to transmit messages given the complete set of strategies employed by each node. Each scenario’s payoff function is described next. Note that with this technique complicated analytical models and queuing theory is not required. Only a description of each nodes payoffs is required.
632
J.J. Scarlett, R.W. Brennan / Robotics and Computer-Integrated Manufacturing 27 (2011) 627–635
For the N-player no arbitration scenario, a payoff of 1 is only given when only one node has a strategy of ‘‘T’’. All other payoffs are 0 9 8 > > = < 0 when my strategy is L No Arb Payoffs ¼ 0 when any other strategy is T > ; : 1 when I am the only T strategy >
The Controller Area Network (CAN) scenario requires that each Node be assigned a fixed unique priority. This is done by setting the priority equal to its node number (i.e. 1, 2, 3, y, N). Evaluating the payoffs requires each node to be cognizant of the other node’s strategy and priority. When arbitration takes place, only the node with the highest priority receives a payoff of 1, all other receive a payoff of 0 8 > <0 CAN Payoffs ¼ 0 > :1
9 when my strategy is L > = when any node with a higher priority has strategy T > when I am the highest priority node with strategy T ;
The Time-triggered CAN (TTCAN) payoff is unique in that it is independent of the strategies that any of the other nodes employ. Because of this, the payoff is merely a function of the number of nodes as this represents the number of reserved time windows ( ) 0 when my strategy is L TTCAN Payoffs ¼ 1 when my strategy is T N For EPCAN, bus access is the same as a Controller Area Network (CAN) with the exception of how priorities are set. In typical CAN systems, priorities are static and do not change during operation. With the EPCAN protocol, priorities are assigned based on the nodes message broadcast history. One of the dynamic-priority protocols involves two components. The first component is a calculation based on the message history within the node. The second component is a unique static identifier like the one used in the fixed priority scheduling method. Together these two components make up a dynamic-priority code that is computed by each node. The calculated component is used to increase a node’s priority the longer it waits and is based on the time since the node last sent a message. An important feature of this is that no clock synchronization is required between the nodes as is the case with timetriggered systems. The static component is assigned pre run-time as a unique identifier for each node. It is required when two messages collide that have the same time priority. This happens only when two nodes that have not transmitted for greater than the cycle time attempt to transmit at the same time. Similar to the TTCAN scenario, the EPCAN scenario always has a payoff when the node strategy is ‘‘T’’. In this case however, the payoff is inversely proportional to the number of nodes with the ‘‘T’’ strategy ( EPCAN Payoffs ¼
0 1 nT
when my strategy is L where nT is the number of nodes with strategy T
)
4. Results Two types of comparisons are made as the number of players is increased in each scenario. The first compares the total payoffs of each scenario as the number of players is increased. The second compares the payoffs of individual nodes as the number of players in increased. In each case, message arbitration is determined by the
communication protocol; game theory is used to evaluate the relative performance of the protocols. 4.1. Comparison of total payoffs The impact of increasing the number of players on each scenario is shown in Fig. 6. As can be seen in this figure, EPCAN and CAN have the same total payoff; in both cases, total payoff increases as more players are added while TTCAN’s total payoff is independent of the number of players in the game. The No Arbitration scenario decreases its total payoffs and sees the peak performance occur at a decreased level of Talk % as the number of players increase. These three conclusions are consistent with the currently held understanding of each scenario’s ability to arbitrate collisions. What is interesting is the comparison of performance at lower levels of network traffic. The 10-player totals show that at a Talk % of 10%, the No Arbitration scenario outperforms TTCAN scenario by a factor of more than 3 as shown in Table 5. This is because as the Talk % decreases, the chance of a collision also decreases, and as a result of the collisions decreasing no arbitration is actually needed. This results in the No Arbitration scenario unobtrusively scheduling the bus. The level of Talk % in Table 5 is chosen based on the number of nodes in order to reach a 100% level of scheduled capacity of the bus. For example, if there are 2 nodes a 50% Talk % is used, and if there are 5 nodes a 20% Talk % is used. We can also note that for the 10-player game, the CAN and EPCAN scenarios still outperforms the No Arbitration scenario (0.38 versus 0.65). Figs. 7 and 8 contrast the case when the Talk % results in a bus that is scheduled 50% and 100% of the time, respectively. In contrast to the 100% scheduled bus, the 50% scheduled bus shows that the difference between the EPCAN/CAN and the No Arbitration scenario has almost disappeared. This may be in fact be a quantitative indication of why some experts feel that protocols such as Ethernet
Fig. 6. 2-, 5-, and 10-player scenario totals.
Table 5 Payoffs when the bus is scheduled at 100% capacity. Number of nodes
EPCAN
CAN
No Arbitration
TTCAN
2 3 4 5 6 7 8 9 10
0.7500 0.7037 0.6836 0.6723 0.6651 0.6601 0.6564 0.6536 0.6513
0.7500 0.7307 0.6836 0.6723 0.6651 0.6601 0.6564 0.6536 0.6513
0.5000 0.4444 0.4219 0.4096 0.4019 0.3966 0.3927 0.3897 0.3874
0.5000 0.3333 0.2500 0.2000 0.1667 0.1429 0.1250 0.1111 0.1000
J.J. Scarlett, R.W. Brennan / Robotics and Computer-Integrated Manufacturing 27 (2011) 627–635
Fig. 7. Payoffs when the bus is scheduled at 50% capacity.
633
Fig. 9. Sum of payoffs for 10 players, no arbitration.
Fig. 8. Payoffs when the bus is scheduled at 100% capacity. Fig. 10. Sum of payoffs for 10 players, CAN.
(which uses an arbitration technique only slight superior to the No Arbitration scenario) are the way of the future. These approaches typically segregate networks in such a way as to control the Talk % to a certain level. The findings here indicate that at these reduced level of communications, the No Arbitration scenario is a serious contender for responsiveness.
4.2. Individual node payoffs Figs. 9–12 illustrate the individual node payoffs when there are 10 players for the No Arbitration, CAN, EPCAN, and TTCAN scenarios, respectively. From these figures, one can see that nodes in the EPCAN, CAN, and the No Arbitration scenarios all show equivalent payoffs, and indication of fairness. However, as the number of players increase, each node in the No Arbitration scenario decreases its total payoff and sees its peak performance occur at a decreased level of Talk %. TTCAN’s total payoff is independent of the number of players in the game, while its individual nodes are always outperformed by individual nodes of the EPCAN scenario. Only Node 1 of the CAN scenario outperforms the nodes of the TTCAN scenario. In addition, there is one result that is quite unexpected. At lower traffic levels the EPCAN and No Arbitration scenarios approach the same performance. Using this game theory approach it is possible to justify the use of the No Arbitration technique under the specific condition of a low Talk % and a specific number of players. In this way, it is possible to see why the No Arbitration technique can be used in non safety–critical systems instead of a more complicated protocol. To add to this, the speed of the CAN protocol is limited because of its arbitration requirements (due to propagation delays) to 1 Mbits/s. At one time this seemed to provide more than enough bandwidth for control applications. The Ethernet protocol
Fig. 11. Sum of payoffs for 10 players, EPCAN.
however, is not limited by its arbitration technique and its speed is able to increase as technology improves. Speeds ten and one hundred times faster than CAN is already common and 1 Gbit/s networks are decreasing in cost. This disparity in network speeds and the introduction of intelligent switching means that the Ethernet protocol has a major advantage over CAN based protocols when network traffic levels are a concern.
4.3. N-player summary The N-node results are summarized in Table 6. The No Arbitration scenario is fair to all players and shows good responsiveness at low levels of network loading. Because of its higher speed, the resulting lower traffic levels, and its ability to handle additional
634
J.J. Scarlett, R.W. Brennan / Robotics and Computer-Integrated Manufacturing 27 (2011) 627–635
References
Fig. 12. Sum of payoffs for 10 players, TTCAN.
Table 6 Comparison of the N-node scenario.
No Arbitration CAN TTCAN EPCAN
Responsiveness
Fairness
4th Place 1st Place 4th Place 1st Place
1st Place 4th Place 1st Place 1st Place
nodes, its responsiveness must be considered at least comparable with TTCAN. The TTCAN scenario is again also fair to all players, provides adequate responsiveness usage under high network loading, but is outperformed by all other scenarios under low network loading. The CAN scenario matches the EPCAN total bandwidth usage but does so with increasing unfairness to additional players. Finally, the EPCAN scenario realizes maximum bandwidth usage by splitting the usage equally between all players just as TTCAN does. In doing so, EPCAN remains both fair and highly efficient/responsive. The N-node analysis has shown the impact of increased arbitration by increasing the number of nodes. For the TTCAN scenario, it shows that additional nodes decrease the responsiveness of the system, while all of others increase in responsiveness. The N-node comparison also shows that adding nodes in the CAN scenario does not impact the responsiveness of previous nodes.
5. Conclusions Using game theory has shown us that it is possible to represent a network of nodes as a game. The N-player scenarios confirm previously held understanding of each of the four scenarios. Further, the N-player games provided a measure of both the fairness and responsiveness of each scenario and demonstrated the effect of adding players into the game. This highlighted the responsiveness of the EPCAN scenario and also highlighted that the No Arbitration scenario is suited for low traffic levels. Using the game theory method allowed us to compare four diverse protocols using a single technique. This eliminates any potential bias towards a single scheduling approach. The simplicity of defining each scenarios payoffs also makes this technique easy to use and can potentially be used to define many other scenarios. Future work in this area will involve validating the results via discrete-event simulation, and ultimately, with physical hardware. As well, we plan to investigate games involving multiple players with mixed strategies. This could yield the most interesting results yet.
[1] Shen W, Norrie D. Agent-based systems for intelligent manufacturing: a stateof-the-art survey. Knowledge and Information Systems 1999;1:129–56. [2] Valckenaers P, Van Brussel H, Bongaerts L, Wyns J. Holonic manufacturing systems. Integrated Computer Aided Engineering 1997;4:191–201. [3] Brennan RW. Toward real-time distributed intelligent control: a survey of research themes and applications. IEEE Transactions on Systems, Man, and Cybernetics—Part C: Applications and Reviews 2007;37(5):744–65. [4] Olsen S, Wang J, Ramirez-Serrano A, Brennan RW. Contingencies-based reconfiguration of distributed factory automation. Robotics and Computerintegrated Manufacturing 2005;21(4–5):379–90. [5] Christensen J. HMS/FB architecture and its implementation In: Deen M, editor. Agent-based manufacturing: advances in the holonic approach, 2003, p. 53–88. [6] International Electrotechnical Commission, Function Blocks—Part 1: Architecture, International Electrotechnical Commission, Geneva, Switzerland, Technical Report IEC 61499-1, 2005. [7] Zoitl A. Real-Time Execution for IEC 61499. ISA; 2009. [8] Douglass B. Doing hard time: Developing real-time systems with UML, objects, frameworks, and patterns. Addison-Wesley; 1999. [9] Brennan RW, Fletcher M, Olsen S, Norrie DH. Safety-critical software for holonic control devices. In: Proceedings of the IMS international forum on global challenges in manufacturing, 17–19 May, 2004, Cernobbio, Italy, p. 767–74. [10] Vyatkin V, Hanisch H. Component design and formal validation of SFA systems: a case study. In: Proceedings of the fifth IEEE/IFIP international conference on information technology for balanced automation systems in manufacturing and services, 25–27 September, 2002, Cancun, Mexico, p. 313–22. ¨ [11] Brennan RW, Vrba P, Tichy P, Zoitl A, Sunder C, Strasser T, Marik V. Developments in dynamic and intelligent reconfiguration of industrial automation. Computers in Industry 2008;59:533–47. [12] Bussman S, Sieverding J. Holonic control of and engine assembly plant: an industrial evaluation. In: Proceedings of the 2001 IEEE international conference on systems, man, and cybernetics, Tucson, AZ, 2001. [13] Maturana F, Tichy´ P, Staron R, Slechta P. Using dynamically created decisionmaking organizations (holarchies) to plan, commit, and execute control tasks in a chilled water system, DEXA Workshops 2002, p. 613–22. [14] Vrba P, Macurek F, Marik V. Using radio frequency identification in agent-based manufacturing control systems. In: Marik V, Brennan RW, Pechoucek M, editors. Holonic and Multi-agent Systems for Manufacturing, Lecture Notes in Computer Science, Vol. 3593; 2005. p. 176–87. [15] Benaskeur A, McGuire P, Liggins G, Brennan RW, Wojick P. A distributed intelligent tactical sensor management system. International Journal of Intelligent Control and Systems 2007;12(2):97–106. [16] Albert A. Comparison of event-triggered and time-triggered concepts ¨ with regard to distributed control systems, Embedded World, Nurnberg, 2004. [17] Navet N, Song Y, Simonot-Lion F, Wilwert C. Trends in automotive communication systems. Proceedings of the IEEE 2005;93(6):1204–23. [18] Rushby J. A comparison of bus architectures for safety-critical embedded systems, CSL Technical Report, SRI International, 2001. [19] Tindell K, Burns A, Wellings. AJ. Calculating controller area network (CAN) message response times. Control Engineering Practice 1995;3(8):1163–9. [20] Lewis R. Programming Industrial Control Systems Using IEC 1131-3: revised edition. Stevenage, UK: Institution of Electrical Engineers; 1998. [21] Zaytoon J. On the recent advances in Grafcet. Production Planning and Control 2002;13(1):86–100. January/February. [22] Lewis R. Modelling control systems using IEC 61499. Stevenage, UK: Institution of Electrical Engineers; 2001. [23] Vyatkin V. IEC 61499 Function Blocks for Embedded and Distributed Control Systems Design. ISA; 2007. [24] Dubinin V, Vyatkin V, Hanisch H. Modelling and verification of IEC 61499 applications using Prolog. In: Proceedings of the 11th IEEE conference on emerging technologies and factory automation, Prague, Czech Republic, 2006. [25] Bosch GmbH. Bosch CAN Specification version 2.0, 1991. [26] Davis RI, Burns A, Bril RJ, Lukkien JJ. Controller area network (CAN) schedulability analysis: refuted, revisited and revised. Real Time Systems 2007;35: 239–72. [27] Marsh D. Network protocols compete for highway supremacy. EDN Europe 2003:26–38. [28] Ferreira P, Pedreiras, Almeida, Fonseca. The FTT-CAN protocol for flexibility in safety-critical systems. IEEE Micro 2001:81–92. [29] Almeida L, Pedreiras P, Fonseca J. The FTT-CAN protocol: why and how. IEEE Transactions On Industrial Electronics 2002;49(6):1189–201. [30] TTTech Computertechnik Gmbh., Time-Triggered Protocol TTP/C, High-Level Specification Document, Protocol Version 1.1, /http://www.tttech.comS, 2003. [31] FlexRay Consortium, FlexRay Communication System, Protocol Specification, Version 2.0, /http://www.flexray.comS, 2004. [32] Kopetz H. A comparison of TTP/C and FlexRay. TU Wien Research Report, 2001. [33] Koopman P. Critical embedded automotive networks. IEEE Micro 2002;22(4): 14–8. [34] International Standards Organisation, Road vehicles—controller area network (CAN) part 4: time-triggered communication, ISO 11 898-4, 2000.
J.J. Scarlett, R.W. Brennan / Robotics and Computer-Integrated Manufacturing 27 (2011) 627–635
[35] Shaheen S, Heernan D, Leen G. A comparison of emerging time-triggered protocols for automotive X-by-wire control networks. IMechE Journal Automobile Engineering 2003;217(Part D):13–22. [36] Paulitsch M, Hall B. FlexRay in aerospace and safety-sensitive systems. IEEE A & E Systems Magazine 2008. [37] Claesson V, Ekelin C, Suri N. The event-triggered and time-triggered mediumaccess methods. In: Proceedings of the IEEE international symposium on object-oriented real-time distributed computing. Hakodate, Japan, 2003.
635
[38] Taylor P. Real-time, determinism and Ethernet. Available at: /www.bara.org. uk/encyclopedia/ethernet/Hirschmann2.pdfS, 2003. [39] Owen G. Game Theory. 2nd ed. New York: Academic Press; 1982. [40] Osborne MJ, Rubinstein A. A Course in Game Theory. Cambridge, Massachusetts: The MIT Press; 1996. [41] Scarlett JJ, Brennan RW. A new evaluation method of communication for distributed control. In: Proceedings of the IEEE workshop on distributed intelligent systems, June 15–16, 2006, Prague, Czech Republic.