J. Parallel Distrib. Comput. 65 (2005) 643 – 653 www.elsevier.com/locate/jpdc
Scrolling partially ordered event displays
David Taylor
School of Computer Science, University of Waterloo, Waterloo, Ont., Canada N2L 3G1
Received 25 October 2003; received in revised form 28 December 2004; accepted 12 January 2005
Abstract Understanding the partial-order relationship between events is critical to understanding the behaviour of a distributed or parallel application. Many problems need to be solved in order to provide an accurate and useful display of a large partial order, such as will occur during the execution of any non-trivial application. A display will most likely resemble a process-time diagram, but the “time” axis does not have a fixed interpretation. Flexibility in positioning events along this axis allows more-useful displays to be created, but also greatly complicates scrolling through the event history. This paper first discusses the properties that are arguably desirable for such scrolling and then presents an algorithm that provides scrolling with those properties.
© 2005 Elsevier Inc. All rights reserved.
Keywords: Distributed systems; Process-time diagrams; Execution visualisation; Partially ordered events
E-mail address: [email protected]
URL: http://www.shoshin.uwaterloo.ca/~dtaylor
0743-7315/$ - see front matter © 2005 Elsevier Inc. All rights reserved. doi:10.1016/j.jpdc.2005.01.001
1. Introduction There are many situations in which it is important to understand the interactions between processes (or threads or other sequential entities) in a distributed or parallel application. One obvious example is debugging, in which a developer can compare a representation of the actual application behaviour with intended behaviour. Many other related needs also exist, such as performance tuning and documentation. For purposes of such understanding, it makes little difference whether the interacting entities are processes or threads, or even passive entities such as monitors or semaphores. Similarly, it makes little difference whether all entities reside on a single machine or on multiple machines, except that if multiple machines are involved, lack of clock synchronisation between machines will present problems for certain kinds of displays. Thus, to avoid repeated awkward phrasing, the remainder of the paper will simply refer to “processes” and “distributed systems” for lack of more general
terminology encompassing all alternatives. It is possible for an interaction to involve three or more processes directly, for example, barrier synchronisation, but for simplicity, this paper considers only pair-wise interactions. (Earlier work with PVM has demonstrated how multi-way interactions may be handled within this pair-wise framework [10].) Given this restriction, each event involved in an interaction is referred to as the “partner” of the other. The obvious way to represent process interactions is to draw a process-time diagram: a process is represented by a vertical line, an event by a symbol on the line for the corresponding process, and interactions between events (such as message sending and receiving or the acquisition of a semaphore by a thread) by a line connecting the two events. Such diagrams are often used informally to describe the behaviour of a distributed system and in some cases are a good choice for machine-generated representations of behaviour. There are several significant difficulties, however. One is that of scale—a technique that works well sketching 20 events in six processes on a piece of paper does not necessarily work well for displaying 20,000 events occurring in 600 processes on a CRT. Abstraction techniques are a vital aspect of dealing with large numbers of processes and
D. Taylor / J. Parallel Distrib. Comput. 65 (2005) 643 – 653
events and have been discussed previously [1,3,8,9]. This paper does not consider the problem of too many processes, but restricts itself to the apparently simple problem of moving backward and forward through a collection of events too large for all to be displayed simultaneously. The principal difficulty is that the fundamental relation between events in a distributed application is the partial-order “happened before” relationship [11]. The real time of an event occurrence is largely an artifact. If event A occurs at a slightly earlier real time than event B, it is possible that A must occur earlier or it may be that the two events could correctly occur in either order with no essential effect on application behaviour. Thus, the significant issue is whether A is a partial-order predecessor of B. Lack of highly accurate clock synchronisation in a distributed system may also create difficulties for an event display based on real time, but the fundamental difficulty is that relative real time of event occurrence is often not what the user most needs to know. For performance tuning and evaluation, real-time information is clearly very important, but for other purposes, such as debugging, it is often of very little value. Creating useful event displays based on the partial-order relationship rather than real time is significantly complex if the displays are to provide maximal information for the user and if moving from one display to another is to appear intuitive. After discussing related previous work in Section 2 and some background material in Section 3, the paper discusses the properties that are required to meet the preceding objectives in Section 4. Then, in Section 5 an algorithm is described that satisfies those properties. Finally, Section 6 summarises the paper and describes possible further work in the area.
2. Previous work Numerous tools provide some form of process-time diagram as a means of displaying the behaviour of a distributed application. In many cases, the display uses real time as one axis [5,6,18], in some cases explicitly adjusting real-time values to avoid anomalies when recorded real time is not consistent with the partial order [16]. One commercial product [13] does not adjust real times but marks explicitly a message that appears to be travelling backward in time, so that the user is aware of the problem. Most of these displays are explicitly or implicitly intended for rather small-scale use. They try to include a lot of information, which is potentially useful, but large numbers of processes and long executions can lead to a display that is hard to understand. ParaGraph [5], for example, can handle at most 128 processes in its space–time view. Zoom-in is provided to deal with problems of scale but is not a fully satisfactory solution. A very few tools use Lamport time rather than real time in constructing a display. Two examples are PVaniM [16] and the Animation Choreographer [7]. By using Lamport time, both of these provide displays that are easy to comprehend even when the time intervals between events vary widely and for many purposes are much more useful than real-time displays. The key difference from the work presented here is that both tools compute a specific set of Lamport times for use in creating displays and thus any pair of events has a fixed relative position when displayed. As discussed later in this paper, the use of fixed Lamport times does not provide sufficient flexibility for viewing large-scale executions. Substantial previous work in our research group has involved building displays based on the partial order of events. Those displays can be thought of as having Lamport time as one axis, but a set of Lamport times is not constructed and then used in creating all displays. Rather, in effect, a potentially different set of Lamport times is constructed for each display, allowing events to be shown in different relative locations in different displays. Lamport times are not actually computed and recomputed, but more-direct methods are used to position events in a manner consistent with Lamport time. Previous publications have not focused on this issue and an earlier patent [14] was effectively too ambitious in applying apparently reasonable constraints to the scrolling operations, as is described later in this paper.
3. Background The fundamental relationship between events in a distributed application is the “happened before” relationship [11]. Given two events, A and B, we have three possibilities: A happened before B, B happened before A, or A is concurrent with B. This relationship is transitive and antisymmetric and hence is a partial order. The fundamental problem then is to capture this partial order and display it to the user in a way that allows a user to understand it easily. This partial-order relationship is sometimes referred to as representing causality, although it is more accurate to describe it as potential causality. For example, a web server that simply constructs and returns web pages to browser clients will cause happened-before relationships to be established between the clients, although no causal relationships exist. In this case, the server is completely stateless, making it relatively easy to see that it induces no causal relationships between its clients, but the same could be true for some uses of a stateful server. For example, a web server that also allows a database to be updated by browser clients would have causal relationships between some requests, namely those that updated a given item and those that fetched a web page containing that item. In general, dealing with causality versus potential causality is both complex and application-specific and will not be considered here. One significant problem in representing behaviour as a partial order is synchronous events. The original description of the happened-before relationship included only asynchronous communication: successive events in a single process were in the partial order, send–receive pairs
were in the partial order, and the transitive closure of these gave the complete partial order. In many applications, synchronous communication is very important, for example, remote-procedure calls or the acquisition of a semaphore by a thread. Treating these as if they were asynchronous does not properly represent application behaviour and appears not to work at all if multi-way synchronous events exist, such as synchronisation barriers. (As mentioned above, such multi-way synchronisation is not considered here, but the techniques can be extended to handle it.) Unfortunately, the two events in a synchronous pair do not fit well into a partial-order framework. Having each event precede the other would violate the properties of a partial order. Having exactly one event precede the other would make the interaction equivalent to asynchronous communication, which is also not acceptable. The only remaining possibility is for the two events to be concurrent, but they are much more strongly related than two “ordinary” concurrent events. A different strategy is to represent synchronous communication by some number of events other than two. Representing each side of the communication by a “begin communication” event followed by an “end communication” event allows each “begin” to precede each “end” but uses substantial additional display space, particularly for large execution histories. Thus, although there are some drawbacks, this paper will adopt the view that a pair of synchronous events is really a single event that occurs simultaneously in two processes. This works well from a mathematical standpoint, but has the unfortunate property that one can no longer refer to the process in which an event occurred, but rather the process or processes. One might attempt to deal with the partial order by collapsing it to a consistent total order and then deal only with that total order. 
Assigning each event a Lamport-clock value [11] and using that value to position each event in the display would be one approach. If all processes are on a single machine or clocks are sufficiently closely synchronised, the real time at which events occurred also provides such a total order. If clocks are not sufficiently synchronised, it is also possible to adjust times so that they are consistent with the partial order, preserving the lengths of intervals between events as much as possible [15]. Such an approach has the serious difficulty that information about which events logically precede which other events is lost. In essence, we begin with the difficulty that the happened-before relationship represents potential causality rather than actual causality. It thus may contain some event relationships that are not relevant. Collapsing to a total order then adds, typically, many other event relationships that are known to be irrelevant, losing information of potentially great value to the user. Thus, maintaining the partial order of events and presenting that partial order to the user are very important. The partial order can be maintained through the use of vector timestamps as proposed by Fidge [4] and Mattern [12], with improved treatment of synchronous communication [2]. For
purposes of this paper, the precise means of maintaining the partial order is not significant. It is only important that there exists an efficient means of determining event relationships, since testing such relationships is an extremely frequent operation in building event displays. Much of our theoretical work on displaying partial orders has concerned abstraction, that is, how to cluster processes and create abstract events, in order to provide simpler and more comprehensible information to the user. This paper deals explicitly with only non-abstracted displays of events, but the principles are also applicable if abstraction is used.
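The relationship test that this section relies on can be illustrated with a small sketch. It assumes the usual componentwise comparison of Fidge/Mattern vector timestamps; the function names and the hand-written timestamps are hypothetical, and treating equal timestamps as "the same event" is an assumption made here to match the view that a synchronous pair is a single event. It is not the improved synchronous-communication scheme of [2].

```python
from typing import Sequence


def happened_before(a: Sequence[int], b: Sequence[int]) -> bool:
    """True if timestamp a precedes b: a <= b componentwise and a != b."""
    return tuple(a) != tuple(b) and all(x <= y for x, y in zip(a, b))


def concurrent(a: Sequence[int], b: Sequence[int]) -> bool:
    """True if neither timestamp precedes the other and they differ."""
    return (tuple(a) != tuple(b)
            and not happened_before(a, b)
            and not happened_before(b, a))


# Hypothetical timestamps for three processes (illustration only).
send = (1, 0, 0)       # a send in the first process
receive = (1, 1, 0)    # the matching receive in the second process
unrelated = (0, 0, 1)  # an event in the third process

print(happened_before(send, receive))  # the send precedes its receive
print(concurrent(receive, unrelated))  # no path in either direction
```

Each test costs time proportional to the number of processes in this naive form, which is one reason an efficient means of testing relationships matters when the test is performed very frequently.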
4. Scroll-algorithm properties The display of a partial order should properly represent the partial order to the user and, as the user moves backward and forward through the execution history of an application, successive displays should have an intuitive relationship to each other. If the display is to resemble a process-time diagram, we will have a line for each process, with one dimension of the display representing processes (using some arbitrary process numbering, which should be under user control, but is not of concern here). The type of diagram desired is illustrated in Fig. 1. There are three processes and six events. As discussed above, the “pairs” of events A–B and E–F are each formally single events occurring in two processes. The informal discussion here (and similar discussions later) refers to them as if they were separate events because repeated phrasing such as “the occurrence of event Z in P1” becomes extremely tedious. It happens that P1 and P2 are communicating only synchronously and P2 and P3 are communicating only asynchronously. Events C and G both represent asynchronous sends that happen in P2 and P3, respectively, before the receive occurs for either message. In P2 there is also synchronous communication with P1 (via event F) before the receive of the asynchronous message takes place. The diagram tells us, for example, that A precedes D because A is synchronous with B (they are formally the same event), B and C are in the same process with B occurring first, and C is an asynchronous send partnered with the asynchronous receive D. Although the diagram appears to suggest that G precedes F, because G appears at a higher position, no appropriate path exists between them (in either direction, so they are concurrent). Similarly, F does not precede D although F appears at a higher position.
In general, in such a diagram, it can be arranged that if event X precedes event Y then X is at a higher position than Y but the converse cannot be achieved unless the partial order is a total order. A first, obvious possibility for determining the vertical position of events in such a diagram would be to assign Lamport-clock values to events and use that value to determine event position. This effectively collapses the partial order to a total order for purposes of building the display, but the partial order could still be maintained and reflected to
Fig. 1. Simple example of an event diagram.
Fig. 3. Event display using obvious Lamport-clock values.
Fig. 2. Events displayed according to real time of occurrence.
the user. For example, the user could select an event and the system could explicitly mark those events that are partial-order predecessors and successors. The properties of such a display are reasonably obvious and intuitive, but problems may occur in displaying large execution histories, as discussed in the following paragraphs. The preceding diagram is consistent with this approach, although the obvious assignment of Lamport-clock values would move events G and D each one position higher. Understanding the problems of such a display requires a larger example. Consider the set of events shown in Fig. 2. In this figure, the vertical axis is intended to indicate real time. It would be possible to assign Lamport-clock values to produce this display. A more likely assignment of Lamport-clock values will produce the display shown in Fig. 3, in which, subject to the constraints of the partial order, each event is displayed in a position as high in the figure as possible.
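The "as high as possible" assignment can be sketched as a longest-path layering: each event receives one more than the largest value among its immediate predecessors. The event structure below is inferred from the prose description of Fig. 1 and is an assumption, not a transcription of the figure; `AB` and `EF` stand for the synchronous pairs treated as single events, and `H` is a hypothetical label for the unnamed receive of G's message. The function name is also hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Immediate predecessors of each event (assumed structure, see lead-in).
FIG1_PREDS = {
    "AB": [],
    "C": ["AB"],        # C follows B in P2
    "EF": ["AB", "C"],  # E follows A in P1, F follows C in P2
    "G": [],
    "D": ["G", "C"],    # D follows G in P3 and receives C's message
    "H": ["EF", "G"],   # H follows F in P2 and receives G's message
}


def lamport_levels(preds):
    """Give every event the smallest Lamport value the partial order allows."""
    levels = {}
    # static_order() yields each event only after all of its predecessors.
    for event in TopologicalSorter(preds).static_order():
        levels[event] = 1 + max((levels[p] for p in preds.get(event, ())),
                                default=0)
    return levels


print(lamport_levels(FIG1_PREDS))
```

Under these assumptions G receives value 1 and D receives value 3. Whatever the input, the values are fixed once computed, which is the property the following paragraphs identify as the source of trouble.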
If the display window has approximately half the height of the figure, then a display corresponding to the top half of the figure will be useful. A display corresponding to the bottom half of the figure, however, has a serious problem because it does not show any of the interaction between the rightmost pair of processes. It is not possible to determine how much of the interaction was concurrent, in real time, with the events displayed for the leftmost pair of processes but since the relevant events are concurrent in the partial order, it should be possible to see them in the same display. It would be possible to assign Lamport-clock values differently so that events are displayed in as low a position as possible rather than as high as possible, but this would simply shift the problem situation to the case of displaying the top half of the figure. In general, it is simply not possible to draw a large picture of event interaction such that any window positioned over it gives a suitable display. Even basing the display on Lamport-adjusted real times does not solve the problem, since the display shown in Fig. 3 could be a real-time display, if the third process goes idle for a long time waiting for the interaction with the first process. That long passage of real time does not make the immediately preceding events any less relevant for understanding the behaviour of the application once the processes do interact. Using Lamport-clock values, however cleverly determined, corresponds to assigning each event a permanent
vertical position relative to other events. The model behind such a display is effectively that a fixed conceptual display of the complete execution history is created, with any actual display being a window into that conceptual display. Such a fixed-position method does not provide sufficient flexibility for useful displays of arbitrary segments of the partial order. A better intuitive model is that the events are beads on wires (much like a very tall abacus) and that creating a display is equivalent to sliding events into position so that they become visible in a window that is fixed in position. The linkages between events constrain the possible relative positions, but in the situation shown in Fig. 2, the four pairs of events occurring between the rightmost pair of processes can be positioned as they are in Fig. 3 or they can be positioned much lower on the display. Each such position corresponds to a legitimate way of drawing a large picture such as Fig. 3 but the beads-and-wires model means that we do not commit ourselves to any one such picture. Such a model makes many alternative displays possible, so rules are required concerning the way in which a user specifies a desired display and the way in which events are selected for and positioned in the resulting display. For simplicity, we assume here that a new display is created by taking some event and dragging it to the top or bottom of the display area. Since the two cases are symmetric, we explicitly consider only the case of dragging an event to the top. (The implementation is, unfortunately, not symmetric because vector timestamps provide more help in locating predecessors than successors.) An implementation needs to deal with user-interface issues such as providing a reasonable means of identifying what event is to be moved to the top of the display. 
If the event is in the current display, that is straightforward, but some means of selecting events not currently displayed is also required in order to allow the user to move easily to distant parts of the execution history. For purposes of this paper, we simply assume that some means exists to specify an arbitrary event to be moved to the top of the display. The remainder of this section discusses five constraints that should be met by partial-order displays created in this way and then discusses another constraint that appears desirable but is not achievable. The justification for these constraints is, necessarily, intuitive rather than formal. Vertical-position constraint: An obvious first constraint for all displays is that they should be consistent with the partial order, i.e., if event A precedes event B and both are in the display, then A is displayed at a higher position. Another way of looking at this is that the vertical positions correspond to possible Lamport-clock values for the events, but those values may need to change each time the display is scrolled. Unidirectional-movement constraint: A second constraint is that if you move something upward, then nothing else should move downward. It is reasonable for other parts of the display not to move at all, because there may be no relevant partial-order relationship between them and the event being
moved, but it appears quite unreasonable to have other parts of the display move in the opposite direction. Minimal-change constraint: There should be a reason for moving events out of the display. In other words, an event should be retained unless one of the other constraints requires it to be removed. Maximal-display constraint: The display should contain as many events as possible; it should not be possible to add any event to the display without violating some other constraint. Priority constraint: The four constraints discussed above in essence state some principles for building a display that will avoid clearly undesirable behaviour, but say almost nothing about the events that a user is likely to want included in the display. If the partial order is viewed as fundamental, then an obvious approach is to use partial-order relationships to establish priority for inclusion in the display. Thus, if event A is being moved to the top of the display, then events that are partial-order successors of A should have the highest priority for inclusion in the display. The next priority should go to events that are partial-order predecessors of those events, then partial-order successors of those latter events, and so on. One desirable property achieved by this constraint is that if there is a large gap in the displayed events for a process, such as in the third process of Fig. 3, then moving an event in that process to the top of the display will tend to close the gap. To be precise, it will close the gap unless the partial order induced by interactions with other processes means that the gap is a fundamental property of the display. Some of the five constraints will often conflict with each other, in determining which events to include in a display, so an ordering among the constraints needs to be specified. A plausible ordering is the following: vertical-position, unidirectional-movement, priority, minimal change, and maximal display. 
The justification is that the first two establish basic ground rules, the third indicates how to select events for inclusion, and the last two “tidy up” by disallowing rather silly possibilities. The only subtle issue is the effect of the relative ordering of the priority and minimal-change constraints. If their ordering were reversed, the priority constraint would largely lose its effectiveness. Consider the example shown in Fig. 4, where the horizontal dashed line indicates the bottom edge of the current display window. If event A is moved to the top of the display window, events B and C can be moved onto the display, but only by pushing events D and E out. The priority constraint gives B and C priority over D and E, so D and E will be forced out of the display window. Intuitively, since there is a precedence relationship between A and B and the user has effectively expressed interest in A by choosing it as the event to move, then B is likely more interesting to the user than D or E, which are merely concurrent with A. If the minimal-change constraint were considered before the priority constraint, then D and E would have priority for inclusion in the display over B and C. If the user
Fig. 4. Display for which scrolling up is a problem.
first moved A to the top of the display window, then moved the event following A, then moved the event following that, the left side of the display (events in processes P1 and P2) would gradually become nearly empty. To force the display of more events in P1 and P2, the user would need to take some action with process P3 or P4 or some action involving event B. Since event B is not displayed, the user may need to guess which of P1 and P2 contains a useful event to drag into the display window. For example, dragging the next event in P1 into the display window might push event B out of the top of the display window, depending on the structure of events below those shown in the figure. It is possible to consider the fixed-position scrolling method in terms of these constraints. It satisfies the first two constraints but not the remaining three. The underlying, fixed display is consistent with the vertical-position constraint and, hence, so are any windows onto it. Because all relative positions are fixed, if something moves, everything else moves in the same direction by exactly the same amount, thus satisfying the unidirectional-movement constraint. However, it clearly cannot give priority to precedence-connected events because of the fixed relative positions, it may fail to display as many events as possible (as illustrated in Fig. 3), and it may move events out of the display unnecessarily. Thus, if the constraints are accepted as essential (or desirable), the fixed-position algorithm fails badly. One additional constraint on scrolling might appear desirable. If the intuition is that an event is being moved to the top of the display window, forcing out events above it and pulling in events below it, then we might want to enforce monotonic behaviour. To be (apparently) precise, if an event A precedes an event B, then moving B to the top of the display should push out of the display at least all the events pushed out by moving A to the top of the display.
It is clear that this is at best difficult to achieve, unless some simplifying result is available, since one could hardly trace the effects of moving each of many events to the top of the display simply to create a single new display. In fact, the situation is much worse, as the following example illustrates. In Fig. 5, we wish to move the synchronous event pair E–F to the top of the display. Other than events already at the top of the display, the only predecessors are the event pairs A–B
Fig. 5. Display for which monotonicity is a problem.
Fig. 6. Display after moving A–B to top.
and C–D. (There are subtle issues associated with “moving” an event to the top of the display when it is already there, but they do not need to be considered in this example.) If we move the event pair A–B to the top of the display, observing the five constraints previously enumerated, we obtain the display shown in Fig. 6. Unfortunately, this causes the event pair C–D to be forced off the top of the display. In all essential respects, the figure is symmetric, so moving the event pair C–D to the top of the display would force off the event pair A–B. The user cannot move both A–B and C–D to the top of the display, since moving one to the top will force the other out of the display. If the user first moves A–B to the top and then E–F, the display in Fig. 7 will result, but if the user first moves C–D, the mirror image of the figure will result. Thus, the “obvious” option of moving both of the other pairs prior to moving the pair E–F is impossible and the only feasible options, picking one of the two to move to the top, produce dramatically different results depending on which is chosen. The conclusion thus is that the possible monotonicity constraint is not even well defined and, hence, clearly unachievable.
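Unlike the monotonicity property, the first two constraints of this section are straightforward to state as checks on a display. The sketch below is illustrative only: a display is assumed to be a mapping from event name to row number (row 0 at the top), `precedes` is an assumed partial-order test supplied by the caller, and all names and data are hypothetical.

```python
from itertools import permutations


def vertical_position_ok(rows, precedes):
    """Vertical-position constraint: a predecessor is always drawn higher."""
    return all(rows[a] < rows[b]
               for a, b in permutations(rows, 2)
               if precedes(a, b))


def unidirectional_ok(old_rows, new_rows):
    """Unidirectional-movement constraint when an event is moved upward:
    no event present in both displays may end up lower than it was."""
    shared = old_rows.keys() & new_rows.keys()
    return all(new_rows[e] <= old_rows[e] for e in shared)


# Hypothetical partial order A -> B -> C (transitively closed); D is
# concurrent with all three.
ORDER = {("A", "B"), ("B", "C"), ("A", "C")}


def precedes(a, b):
    return (a, b) in ORDER


old = {"A": 2, "B": 3, "D": 1}
new = {"A": 0, "B": 1, "C": 2, "D": 1}  # A dragged to the top; D stays put

print(vertical_position_ok(new, precedes))
print(unidirectional_ok(old, new))
```

The remaining three constraints concern which events are chosen for the display rather than where they are drawn, so they cannot be checked by looking at a single pair of displays in this simple way.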
Fig. 7. Display after moving A–B, then E–F to top.
Fig. 8. Single-sweep algorithm for scrolling an event display.
5. Vertical scrolling of the display 5.1. A first attempt to achieve adequate event scrolling Before considering an algorithm that meets all five constraints, it may be useful to consider briefly a simpler algorithm that provides better behaviour than the fixed-position method but fails to meet all constraints. For reasons that will become obvious, this will be referred to as the single-sweep algorithm. The fundamental idea is to find an appropriate set of initial events and then “sweep down” the display window from them. The “sweeping down” involves placing in each successive line of the display window those events that have no predecessors in the set, then replacing each of those events by the succeeding event in the same process. The initial set of events is obtained by taking the event being moved to the top of the display window and, for each other process, the earliest event currently at or below the top of the display window which is not a predecessor of the event being moved to the top of the display window. Pseudocode for this algorithm is given in Fig. 8. The pseudocode is reasonably complete, but does not explicitly deal with the problem of “running off the end” of a process, either during the building of the initial set of events or during the sweep down. This algorithm was implemented and proved much better, in practice, than the fixed-position algorithm, but situations were observed in which its behaviour was not (intuitively) suitable. This prompted the enumeration of a set of constraints similar to those presented above, which allowed its deficiency to be more precisely characterised and which provided a standard for evaluating alternative scrolling algorithms. In terms of the constraints, the first two are clearly satisfied. 
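The "sweep down" step described above can be sketched as follows. This is not the pseudocode of Fig. 8, which is not reproduced here; it is a minimal illustration that assumes the initial set of events has already been chosen and is passed in as an index into each process's event list. The process lists and vector timestamps are assumptions inferred from the description of Fig. 1 (`AB` and `EF` are the synchronous pairs treated as single events, `H` is a hypothetical label for the receive of G's message), and all function and variable names are hypothetical.

```python
# Events of each process in order; a synchronous pair appears in both lists.
PROCESSES = {
    "P1": ["AB", "EF"],
    "P2": ["AB", "C", "EF", "H"],
    "P3": ["G", "D"],
}

# Hand-assigned vector timestamps (assumed, for illustration only).
VT = {"AB": (1, 1, 0), "C": (1, 2, 0), "EF": (2, 3, 0),
      "G": (0, 0, 1), "D": (1, 2, 2), "H": (2, 4, 1)}


def precedes(a, b):
    """Partial-order test by componentwise timestamp comparison."""
    ta, tb = VT[a], VT[b]
    return ta != tb and all(x <= y for x, y in zip(ta, tb))


def sweep_down(processes, start, precedes, height):
    """Fill successive display rows from an initial set of events.

    start maps each process to the index of its initial event. Each row
    receives the current events that have no predecessor among the other
    current events; each placed event is then replaced by its successor
    in the same process. A process that runs out of events is dropped.
    """
    pos = dict(start)
    rows = []
    while len(rows) < height:
        current = {processes[p][i] for p, i in pos.items()
                   if i < len(processes[p])}
        if not current:
            break
        ready = {e for e in current
                 if not any(precedes(f, e) for f in current if f != e)}
        rows.append(sorted(ready))
        for p, i in list(pos.items()):
            if i < len(processes[p]) and processes[p][i] in ready:
                pos[p] = i + 1
    return rows


for row in sweep_down(PROCESSES, {"P1": 0, "P2": 0, "P3": 0}, precedes, 10):
    print(row)
```

A synchronous event is placed only when it is the current event of both of its processes, because otherwise the earlier current event of the partner process is one of its predecessors in the set.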
The basic “sweep” algorithm is designed explicitly to satisfy the vertical-position constraint and, because the initial set of events is created by selecting events at or below the top of the old display, everything moves up or stays in place when an event is moved to the top of the display. The problem arises with the priority constraint. In a situation such as that shown in Fig. 4, when event A is moved up, events D and E are not moved because they are concurrent with A, thus preventing event B from moving into the display window. This algorithm is effectively making the minimal-change constraint more important than the priority constraint: it refuses to remove any event from the display if it is possible to build a legitimate display containing that event, with the specified event moved to its new position. Although the displays produced may have large “holes” in them because of the refusal to shift events such as D and E out of the display window, the algorithm does satisfy the maximal-display constraint. It is not possible to add any events to the display produced without violating the vertical-position constraint.
5.2. A satisfactory scrolling algorithm Given the deficiencies discovered in the scrolling algorithm just described, a more complex scrolling algorithm, called the multiple-sweep algorithm, was devised. This algorithm works as follows. It is based on a “sweep” algorithm slightly different from the one described above. This sweep algorithm is given an initial set of events. If sweeping down, it finds the events in this set which have no predecessors, positions each of them as high in the display window as is consistent with the vertical-position constraint, replaces each of these events with the next event from the same process, possibly adds the partner event for each of these new events, and repeats until it reaches the bottom of the display window. For each new event that has a partner, the partner event is added if this event is synchronous or an asynchronous send and the partner process is not currently represented in the set of events being considered or if the partner event is a predecessor of the event in the set for that process. (In the latter case, the partner event replaces the event already in the set for that process. This situation can only occur when no events have yet been placed in the display for the affected process.) Pseudocode for this sweep algorithm is given in Fig. 9.
The algorithm for sweeping up is the obvious reversal of the algorithm for sweeping down. Given an event to be positioned at the top of the display window, the algorithm first sweeps down from that event. During the sweep down, events will likely be included from
Fig. 9. Sweep algorithm used by the multiple-sweep event-display algorithm.
additional processes and for some of these processes it is possible to display earlier events than those placed in the display window during the sweep down. Thus, a set of events is constructed to use as the starting points for a sweep up. This set contains the event immediately preceding the top event for each process and for each receive event in the display, the corresponding send event if it is not yet displayed. If the preceding specifies more than one event for a process, the last of those events is the one included in the set. Then, a sweep up is performed from this set of events. This sweep may involve additional processes, so alternate sweeps down and up are performed until no new processes are encountered. At this point, each process not yet handled is examined, to determine the first event at or below the top of the old display that can be included in the new display. Using this set of top events as a starting point, a final sweep down is performed. No subsequent sweep up is required since no processes remain that could be added to the set under consideration. Throughout, events are repositioned by moving them as far up the display window as possible prior to a sweep down and as far down the display window as possible prior to a sweep up. This repositioning is needed because events added during sweeps down are positioned as high as possible in the display window and events added during sweeps up are positioned as low as possible. In some cases, particularly near the end of the execution history, a direct implementation of the above could add inappropriate events during a sweep up, so one additional constraint is added, that no event is added to the display that precedes the event being moved to the top of the display or any event at the top of the old display. When moving an event to the top of the display, this added test is not needed during a sweep down and thus is not shown in the pseudocode. Pseudocode for this algorithm is given in Fig. 10. 
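The construction of the starting set for a sweep up, as described above, can be sketched as follows. This is an illustration only, not part of Fig. 10; the representation of events as (process, index) pairs and the names `sweep_up_start_set`, `displayed` and `messages` are assumptions made for the example.

```python
from typing import Dict, Set, Tuple

Event = Tuple[str, int]  # (process name, index of the event within that process)


def sweep_up_start_set(displayed: Set[Event],
                       messages: Dict[Event, Event]) -> Dict[str, Event]:
    """Return, per process, the event from which a sweep up would start.

    displayed is the set of events placed so far; messages maps each
    asynchronous send to its receive.
    """
    send_of = {recv: send for send, recv in messages.items()}

    # Top (earliest) displayed event of each process.
    top: Dict[str, int] = {}
    for proc, idx in displayed:
        top[proc] = min(idx, top.get(proc, idx))

    start: Dict[str, Event] = {}

    def offer(event: Event) -> None:
        # If several events are specified for one process, keep the last one.
        proc, idx = event
        if proc not in start or idx > start[proc][1]:
            start[proc] = event

    # The event immediately preceding the top event of each process.
    for proc, idx in top.items():
        if idx > 0:
            offer((proc, idx - 1))

    # For each displayed receive, its send if that send is not yet displayed.
    for event in displayed:
        send = send_of.get(event)
        if send is not None and send not in displayed:
            offer(send)

    return start
```

Returning one event per process keeps the result directly usable as the initial set for the next sweep, which expects at most one event for each process.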
The following (admittedly informal) argument is intended to demonstrate that this algorithm satisfies all five constraints. The vertical-position constraint underlies all the event placement of the “sweep” algorithms and the repositioning prior to each sweep, and thus is satisfied. When
an event is moved to the top of the display window, each other process either retains the same top displayed event or has some subsequent event become its top displayed event. As just indicated, an explicit test is included to prevent any earlier events from being placed in the new display. Given the adjustment of position preceding the final sweep down, all events for other processes must retain their position or move up. Thus, the unidirectional-movement constraint is satisfied. The initial part of the algorithm is a direct attempt to implement the priority constraint; events which are connected by precedence to the initial event are positioned first, followed by events connected by precedence to those events, and so on. Thus, the priority constraint is satisfied. All of the sweeps up and down terminate when no more events can be placed in the display window, in the direction of the sweep. Also, any process that is added “in the middle” of a sweep is immediately afterward part of a sweep in the opposite direction. Thus, by the end of the algorithm, no events can be added to the display window without violating the vertical-position constraint, so the maximal-display constraint is satisfied. Throughout the algorithm, events found in the old display are eliminated only if including them would violate the vertical-position constraint with respect to events already placed in the new display or because they “lose” with respect to the priority constraint and some other event placed in the display. Thus, the minimal-change constraint is satisfied.
5.3. Non-intuitive behaviour
One notion underlying the constraints on event-scrolling algorithms is that user-observable behaviour should be intuitive and consistent. If monotonicity is viewed as intuitive, then one form of non-intuitive behaviour has already been discussed, but its “intuitive” status is surely doubtful. Another form of behaviour seems clearly non-intuitive but also unavoidable. In building a display, one ideally puts a set of concurrent events (possibly including synchronous event pairs) into a single horizontal row. If this approach is followed literally, however, illegible displays will result. For example, consider Fig. 5. The set of events immediately below the dashed line consists of two unary events and the synchronous pairs G–H and I–J, all concurrent, hence they should all be drawn at the same vertical position. If that had been done, the figure would be illegible, because the arrow connecting G and H would run through J and the two unary events, and similarly for the arrow connecting I and J. The only practical solution is to spread the events vertically enough to prevent any connecting arrow from contacting an unrelated event. How much vertical space is required thus depends partly on the particular display order of processes: if the user were to display the processes in the order (P1, P5, P3, P4, P2, P6), there would be no need for extra vertical space in the
Fig. 10. Multiple-sweep event-display algorithm.
set of events just described, but the second and third rows of events would both require extra vertical space. The consequence is that the number of events that can be displayed vertically depends partly on an apparently irrelevant factor, the permutation of processes selected for the display. A less-complex example is given in Fig. 11. If the dashed line indicates the bottom edge of the display window, the bottom two events in the figure cannot be placed in the display, because of the extra vertical space used in avoiding overlap of connecting lines and events. Assume this display was built by moving event A to the top of the display. The problem is that unless the extra vertical space required by the “overlapping” of the connecting arrows is taken into account, the event-scrolling algorithm will believe that the bottom two events can be included in the display and, if necessary, move up events in processes P2 and P4 to allow those events to be included. Note that
Fig. 11. Event display in which events must be discarded from initial matrix.
the situation here is essentially like that shown in Fig. 4, with the position of processes P2 and P3 reversed; the near-independence of the two sets of processes is not quite as
obvious because of the ordering on the display. Taking the additional space into account while selecting events to be placed in the display would be quite complex, because the display is not built in a single top-to-bottom sweep, so a two-stage process, as just implied, appears desirable, at least for the sake of simplicity.
One could imagine a more-complicated procedure that would essentially determine the vertical space required for the display after each event was added and therefore avoid including an event at one point and later removing it. A fundamental difficulty with this approach is that in building the display shown in Fig. 11, the events from processes P1 and P3 are included first, then the events from processes P2 and P4. Thus, at the point the bottom two events are being considered for inclusion, they appear to fit, because the overlap problem higher on the display has not yet arisen. The only solution appears to be an even more complicated procedure that would attempt to build a display according to the usual rules, check whether any anomalies occur in removing events from the display and, if so, start over assuming, in effect, that less vertical display space is available.
The complicated procedure just suggested would sometimes build displays containing apparently inexplicable blank space at the bottom, but there would be a more fundamental difficulty. If the processes were reordered as (P1, P3, P2, P4), then all the events shown in Fig. 4 could be displayed, because the other events would require less vertical space. If the display were built according to the complicated procedure and then the process order were changed, it would become possible to include additional events and the inclusion of those events would necessitate moving events upward in processes P2 and P4. Presumably, for consistency, reversing the change in process order would also need to reverse this scrolling.
Thus, a process reordering would have event scrolling as a side effect, which is very non-intuitive behaviour. On the other hand, if one decrees that this event scrolling will not take place, that process reordering will simply rearrange the events already included or “almost included” in the display, the operations of event scrolling and process reordering will not commute with each other. That is, reordering the processes as (P1, P3, P2, P4) and then moving event A to the top of the display produces a display radically different from that obtained by moving A to the top of the display and then reordering the processes. There appears to be a choice of three different sorts of non-intuitive behaviour but no possibility of avoiding all such behaviour. The situation initially suggested is arguably the least non-intuitive. The user will be aware that reordering of processes sometimes causes events to disappear or appear at the bottom of the display. Thus, the concept that events “almost included” in the display may affect the events that are displayed is not too hard to grasp, particularly because in many situations a different ordering of processes can be used to bring the troublesome events into the display.
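The dependence of vertical space on process order can be made concrete with a small sketch. It assumes a simplified model in which a logical row consists of unary events and synchronous pairs, and in which two items conflict when an event of one lies strictly between the endpoints of the other's connecting arrow. The first-fit assignment is a heuristic chosen for brevity, not the method used by the display software, and the function name and the example row are hypothetical.

```python
from typing import List, Sequence, Tuple


def subrows_needed(items: Sequence[Tuple[str, ...]], order: Sequence[str]) -> int:
    """Count the display rows used by one logical row of concurrent items.

    Each item is a 1-tuple (unary event) or a 2-tuple (synchronous pair) of
    process names; order is the left-to-right order of the processes.
    """
    pos = {proc: i for i, proc in enumerate(order)}
    spans = [sorted(pos[p] for p in item) for item in items]

    def conflict(a: List[int], b: List[int]) -> bool:
        # An arrow contacts an unrelated event if that event's column lies
        # strictly inside the arrow's span.
        return (any(a[0] < c < a[-1] for c in b)
                or any(b[0] < c < b[-1] for c in a))

    subrows: List[List[List[int]]] = []
    for span in spans:
        for sub in subrows:
            if not any(conflict(span, other) for other in sub):
                sub.append(span)  # first existing sub-row without a conflict
                break
        else:
            subrows.append([span])  # otherwise open a new sub-row
    return len(subrows)
```

For a hypothetical row containing the synchronous pairs P1–P3 and P2–P4, the order (P1, P2, P3, P4) needs two display rows because each arrow would cross an event of the other pair, whereas the order (P1, P3, P2, P4) needs only one.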
6. Conclusions and further work
This paper has described several issues that need to be addressed to make an event display truly useful in understanding the behaviour of a large distributed application. It began by discussing the problem of scrolling a partially ordered display of events. Scrolling that produces useful and intuitive results proved surprisingly difficult to achieve, with at least one intuitively appealing property not being achievable at all. Several desirable properties for event-scrolling algorithms were presented, with the desirability of those properties argued intuitively. Then, a somewhat complex scrolling algorithm was presented that does satisfy the complete set of properties.
The material presented here was developed as a result of building prototype software to aid in debugging distributed applications. A first prototype contained event-scrolling facilities that proved quite inadequate. A second prototype initially implemented the single-sweep algorithm, which was (subjectively) a great improvement, but was still unsatisfactory in some situations. An earlier version of the multiple-sweep algorithm was then implemented, which attempted to achieve the properties of the multiple-sweep algorithm discussed here, plus monotonicity. Its behaviour always seemed reasonable, but it was eventually noticed that it sometimes failed to meet the precise requirements of monotonicity. Further investigation led to the discovery that monotonicity was an unachievable property in this context. The work presented here provides a somewhat simplified algorithm, in which monotonicity is not attempted, but produces identical or similar results in most situations.
This paper arguably provides a complete solution to the problem of scrolling partially ordered displays of non-abstracted event data. Implementation issues have not been discussed and work is still required to obtain good scalability in the process dimension.
Two approaches are possible, one being to reduce the use of vector timestamps, since parts of the display construction can evidently be performed with only “local” information whereas other parts appear to require vector timestamps or something equivalent to them. The other approach is to develop variants of vector timestamps that are more compact [17]. A possible area for investigation is the inclusion of a scrollbar in combination with the “drag to top” form of scrolling discussed in this paper. Users from time to time request such scrolling, but its inclusion will require careful investigation because the obvious implementation would rely on using fixed Lamport times, with those Lamport times corresponding to scrollbar positions. Since dynamically recomputed Lamport times are a central feature of the scrolling discussed here, it is unclear how to merge the two schemes in a reasonable way. Another area requiring further investigation is the integration of partial-order scrolling with present and future techniques for abstraction.
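For readers unfamiliar with vector timestamps [4,12], the standard precedence test can be sketched as follows. It assumes one timestamp component per process and that distinct events carry distinct timestamps; the function names are illustrative. Each comparison inspects every component, which indicates why the cost grows with the number of processes.

```python
from typing import Sequence


def precedes(a: Sequence[int], b: Sequence[int]) -> bool:
    """True if the event stamped a precedes the event stamped b."""
    # a precedes b when a is component-wise <= b and the stamps differ.
    return all(x <= y for x, y in zip(a, b)) and tuple(a) != tuple(b)


def concurrent(a: Sequence[int], b: Sequence[int]) -> bool:
    """True if neither event precedes the other."""
    return tuple(a) != tuple(b) and not precedes(a, b) and not precedes(b, a)
```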
Acknowledgment
The work described in this paper began in the Shoshin project at the University of Waterloo and has more recently benefited from numerous discussions with members of the Centre for Advanced Studies and developers at the IBM Toronto Lab. The work was supported by NSERC, under Operating Grant OGP00003078 and a Senior Industrial Fellowship, the Information Technology Research Centre of Ontario, and by IBM.
References
[1] T. Basten, T. Kunz, J.P. Black, M.H. Coffin, D.J. Taylor, Vector time and causality among abstract events in distributed computations, Distributed Comput. 11 (1997) 21–39.
[2] H.W.H. Cheung, Process and event abstraction for debugging distributed programs, Ph.D. Thesis, University of Waterloo, Department of Computer Science, available as CCNG Technical Report T-189, 1989.
[3] W.H. Cheung, J.P. Black, Process and event abstraction: implications for debugging and distributed systems (abstract), in: Proceedings of the ACM/ONR Workshop on Parallel and Distributed Debugging, Santa Cruz, CA, 1991, pp. 213–215.
[4] C. Fidge, Logical time in distributed computing systems, IEEE Software 24 (8) (1991) 28–33.
[5] M.T. Heath, J.A. Etheridge, Visualizing the performance of parallel programs, IEEE Software 8 (5) (1991) 29–39.
[6] J.A. Kohl, G.A. Geist II, The PVM 3.4 tracing facility and XPVM 1.1, in: Hawaii International Conference on Systems Science, 1996, pp. 290–299.
[7] E. Kraemer, J.T. Stasko, Toward flexible control of the temporal mapping from concurrent program events to animations, in: Proceedings of the Eighth International Parallel Processing Symposium, 1994, pp. 902–908.
[8] T. Kunz, High-level views of distributed executions: convex abstract events, J. Automat. Software Eng. 4 (2) (1997) 179–197.
[9] T. Kunz, J.P. Black, D.J. Taylor, T. Basten, Poet: target-system-independent visualizations of complex distributed-application executions, Comput. J. 40 (8) (1997) 499–512.
[10] T. Kunz, D.J. Taylor, Visualizing PVM executions, in: Proceedings of the Third PVM User’s Group Meeting, 1995.
[11] L. Lamport, Time, clocks and the ordering of events in a distributed system, Comm. Assoc. Comput. Mach. 21 (7) (1978) 558–565.
[12] F. Mattern, Virtual time and global states of distributed systems, in: M. Cosnard et al. (Eds.), Proceedings of the International Workshop on Parallel and Distributed Algorithms, Elsevier Science Publishers B.V. (North-Holland), Amsterdam, Chateau de Bonas, France, 1988, pp. 215–226.
[13] Pallas GmbH, Bruehl, Germany, Vampir 2.0 User’s Manual, http://www.pallas.com/e/products/pdf/Vampir-userguide.pdf (June 1999).
[14] D.J. Taylor, Computer program product for enabling a computer to construct displays of partially ordered data, US Patent #5,640,500.
[15] D.J. Taylor, M.H. Coffin, Integrating real-time and partial-order information in event-data displays, in: Proceedings, CASCON 94, Toronto, 1994, pp. 157–165.
[16] B. Topol, J.T. Stasko, V.S. Sunderam, Dual timestamping methodology for visualizing distributed application behaviour, Internat. J. Parallel Distributed Systems Networks 1 (2) (1998) 43–50.
[17] P.A.S. Ward, A scalable partial-order data structure for distributed-system observation, Ph.D. Thesis, University of Waterloo, School of Computer Science, http://www.shoshin.uwaterloo.ca/~pasward/Publications/dissertation.ps (2001).
[18] J.C. Yan, S.R. Sarukkai, P. Mehra, Performance measurement, visualization and modeling of parallel and distributed programs using the AIMS toolkit, Software Practice Experience 25 (4) (1995) 429–461.
David Taylor is a Professor in the School of Computer Science at the University of Waterloo, where he has been a faculty member since 1977. He is also currently Associate Dean (Undergraduate Studies) for the Faculty of Mathematics. He obtained a B.Sc. in Mathematics from the University of Saskatchewan and an MMath and a Ph.D. in Computer Science from the University of Waterloo. His principal research interests are in distributed systems, particularly the visualisation and abstraction of distributed-application behaviour, and in fault-tolerant computing.