ARTIFICIAL INTELLIGENCE
The Use of Aggregation in Causal Simulation* Daniel S. Weld
MIT Artificial Intelligence Laboratory, Cambridge, MA 02139, U.S.A.

Recommended by Johan de Kleer

ABSTRACT

Aggregation is an abstraction technique for dynamically creating new descriptions of a system's behavior. Aggregation works by detecting repeating cycles of processes and creating a continuous process description of the cycle's behavior. Since this behavioral abstraction results in a continuous process, the powerful transition analysis technique may be applied to determine the system's final state. This paper reports on a program which uses aggregation to perform causal simulation in the domain of molecular genetics. A detailed analysis of aggregation indicates the requirements and limitations of the technique as well as problems for future research.
1. Introduction

A number of AI programs perform causal simulation to check plans, aid in problem solving, and generate mechanistic explanations of complex devices [5, 6, 12, 24, 27]. Causal (i.e. component-based) simulation uses a model in which the behavior of the individual components is described; the global behavior of the system is then deduced from the interactions of the components. Two fundamentally different models of change have been developed to represent the behavior of components: discrete and continuous process models. The important characteristic of a discrete model is that actions are atomic; actions are assumed to be abrupt so they can be defined with add and delete lists, as for example in STRIPS [10]. In continuous process models (e.g. qualitative process theory [14]) actions gradually change the value of quantities, such as the level of fluid in a container, over an interval of time. Although continuous processes are more complex, they allow a powerful reasoning technique variously termed limit analysis [13] or transition analysis [27], the term I will use here. Although causal simulation techniques can work on both qualitative and quantitative models, this paper will emphasize qualitative models.

* This paper describes research done at the Artificial Intelligence Laboratory of the Massachusetts Institute of Technology. Support for the laboratory's artificial intelligence research is provided in part by the Advanced Research Projects Agency of the Department of Defense under Office of Naval Research contract N00014-80-C-0505.

Artificial Intelligence 30 (1986) 1-34. 0004-3702/86/$3.50 © 1986, Elsevier Science Publishers B.V. (North-Holland)

One of the advantages of causal simulation is the ability to maintain dependency links, called a history structure [18], whenever the system state changes. These records of when and why changes occurred can be used for explanation generation and backtracking. In this paper I present a technique called aggregation, which uses the history structure to recognize repeating cycles of processes and generate a higher-level abstraction of the system's behavior. Because aggregation uses a continuous process to describe the effect of cyclic actions, transition analysis can be used to predict the eventual state without laboriously simulating every iteration of the cycle. Aggregation is flexible: cycles of both discrete and continuous processes can be aggregated, as can cycles that contain other, nested cycles.

This paper focuses on three issues:

- How can repetition be detected, and what techniques are necessary to determine which processes are repeating?
- Once a cycle has been discovered, how can a higher-level description of the system's behavior be generated?
- What are the requirements and limitations of aggregation? Can aggregation work with any representation of system state and process behavior, or does aggregation constrain the choice of description language?

This paper has two threads: (1) it describes an implemented simulator, PEPTIDE, which was instrumental in refining the aggregation technique; and (2) it provides the beginnings of a theory of aggregation. The paper also discusses several areas for future research.

1.1.
A simple example

Example 1.1 will help explain the concept of aggregation. Imagine a container filled with a finite number of two kinds of molecules: As, which are small, and Es, which are larger and have sites into which As can fit. Initially, there is only one E, but there are many As (Fig. 1).

[FIG. 1. Initial situation: one E and many As.]

Example 1.1. Suppose that there are three discrete processes that can happen:

Discrete process: DP1.
Preconditions: E's site is empty ∧ there is a free A.
Changes: E finds an A and binds it in its site (Fig. 2).

Discrete process: DP2.
Preconditions: E is holding an A.
Changes: E chemically changes the A to another small molecule, B, which is bound in E's site (Fig. 3).

Discrete process: DP3.
Preconditions: E is holding a B.
Changes: E drops the B, causing both molecules to float free (Fig. 4).
[FIG. 2. Situation after E grabs an A: one E-A complex and many As.]

[FIG. 3. Situation after A is transformed to B: one E-B complex and many As.]

[FIG. 4. Situation after E drops B: one E, one B, and many As.]

What molecules would remain in the container after an extended period of time? Clearly, all the As would be replaced by Bs. But what kind of reasoning allows one to arrive at that answer? One certainly doesn't simulate the above sequence of actions for all the As. One couldn't even if one wanted to, since the initial conditions did not specify how many As there were. More likely, one simulates E's behavior for a short while and realizes that the "same" things are happening repeatedly, in a cycle that will end only when all the As are gone.

Aggregation allows a causal simulator to solve the problem in the same manner. Simulation starts by predicting the effects of the individual discrete processes. First, E grabs one of the As. Then E changes the A into a B. Next, E drops the B, leaving the site ready to grab another. At this point it would seem necessary to simulate the first process again, but the aggregator notices that the process is similar to a previously simulated process. This is a key step. Next the aggregator searches through the recorded history of process behavior to determine that the repeating cycle is composed of three processes: (DP1, DP2, DP3). In this case it is easy to recognize the cycle, but in general it can be difficult. Finally the aggregator determines that the net effect of one iteration of the cycle is a decrease in the number of As and an increase in the number of Bs. The aggregator encodes this as a continuous process¹ by combining the change with the necessary preconditions for the processes in the cycle to occur: in this case, that there must be at least one A and at least one E. Generating a continuous process abstraction is the final task for aggregation. Now, rather than simulating the system discretely, transition analysis [28] is used to predict in one step that all the As will be transformed to Bs.
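The reasoning of Example 1.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not PEPTIDE's implementation: the state encoding (counts `A` and `B` plus a `site` field), the use of process names to decide that the "same" process is recurring, and the helper names `step`, `net_effect`, and `aggregate_and_jump` are all hypothetical. The sketch also uses a concrete count of As, whereas the example leaves that number unspecified.

```python
# Hypothetical encoding of Example 1.1; every name here is illustrative.
# State: number of free As, number of free Bs, and what E's site holds.

def dp1(s):  # DP1: E binds a free A
    if s["site"] is None and s["A"] > 0:
        return {**s, "A": s["A"] - 1, "site": "A"}
    return None

def dp2(s):  # DP2: E converts the bound A into a B
    if s["site"] == "A":
        return {**s, "site": "B"}
    return None

def dp3(s):  # DP3: E releases the B
    if s["site"] == "B":
        return {**s, "B": s["B"] + 1, "site": None}
    return None

PROCESSES = [("DP1", dp1), ("DP2", dp2), ("DP3", dp3)]

def step(state):
    """Apply the first active discrete process; return (name, new_state)."""
    for name, proc in PROCESSES:
        new_state = proc(state)
        if new_state is not None:
            return name, new_state
    return None, state

def net_effect(before, after, params=("A", "B")):
    """Net change in each ordered parameter over one cycle iteration."""
    return {p: after[p] - before[p] for p in params}

def aggregate_and_jump(state):
    """Simulate discretely until a process is about to recur, then apply
    the cycle's net effect until the boundary A == 0 is reached."""
    history, start = [], dict(state)
    while True:
        name, nxt = step(state)
        if name is None:               # nothing active: simulation is done
            return state, history, {}
        if name in history:            # name equality stands in for "same"
            break
        history.append(name)
        state = nxt
    delta = net_effect(start, state)
    # Assumes delta["A"] < 0, as in this example: A shrinks every iteration,
    # so the boundary A == 0 is reached after this many further iterations.
    iterations = state["A"] // (-delta["A"])
    final = {**state,
             "A": state["A"] + iterations * delta["A"],
             "B": state["B"] + iterations * delta["B"]}
    return final, history, delta
```

Calling `aggregate_and_jump({"A": 5, "B": 0, "site": None})` simulates a single pass through the three processes and then predicts the final state without stepping through the remaining iterations.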
1.2. Approach

Some types of change, like the flip of a light switch, are naturally thought of as being atomic, and hence are well modeled using discrete processes. Similarly, continuous processes are an appropriate representation for changes, like the flow of electricity from a battery through a filament, that occur gradually over time. An aggregator should be able to abstract the behavior of cycles containing both discrete and continuous processes. Thus the nature of these kinds of processes exerts a strong influence on the aggregation procedure. Since a goal of aggregation is to discover a cycle and produce an abstraction of the behavior that will support transition analysis, the aggregator must generate a continuous process. Thus the nature of continuous processes has a special impact on the way aggregation must work.

To understand how transition analysis constrains aggregation, it is necessary to investigate the mathematical underpinnings of continuous processes and the transition analysis technique. As is explained in Section 3.3.2, the transition analysis technique can be generalized considerably from the specifications in previous discussions [20, 27]. The functions describing the gradual change in a continuous process do not need to be continuous, differentiable, and real-valued. Relaxing the requirements for transition analysis is important, because without a more general foundation, cycles including discrete processes could not be aggregated.

¹ I use the word "continuous" to refer to a process which takes place gradually over an interval of time and defines the state of the system at all times in the interval. I do not require time to be connected.

1.3. Overview

Section 2 describes PEPTIDE, an initial implementation of a causal simulator which uses aggregation in the domain of molecular genetic systems. Sections 3 and 4 explain why PEPTIDE worked and what remains to be done. Section 3 discusses the issues in representing system state, time, and processes, since these directly affect aggregation. Section 4 presents the theory of aggregation: cycle recognition, process interference, generation of continuous processes, interaction between multiple cycles, and theoretical limitations. Section 5 considers the applicability of aggregation by summarizing the technique's requirements and outlining domains where aggregation could prove helpful. Finally, Section 6 compares aggregation to related work in system dynamics, learning, and qualitative simulation.

2. Demonstration
Written in ZETALISP on a Symbolics 3600, PEPTIDE was built to help explore the utility and theory of aggregation. It performs causal simulation in an idealized version of the molecular genetics domain. Using a specification language, users can describe a set of molecule types and define discrete processes to represent possible enzymatic behaviors. PEPTIDE then attempts to predict the behavior of the system, perhaps constructing an aggregate, continuous process description along the way.

PEPTIDE successfully predicts the behavior of a dozen small systems, including Examples 1.1 and 2.1; see [26] for another example and [25] for the complete details. This section summarizes the capabilities of the program by illustrating its behavior with one of the more complicated examples that it handled. The remainder of this paper can be considered a rational reconstruction of the system, explaining why it worked, where it fell short, and how it could be improved.

Example 2.1 is interesting because cycles containing other cycles must be aggregated to get the correct answer. This example is a simplified model of transcription, the complex process by which cells create RNA copies of their DNA genes. Three types of molecules are involved: DNA, RNA, and duplicase. DNA and RNA are each described by their length, a totally ordered parameter. Initially RNA has length one, but DNA is an unspecified amount longer. Duplicase is modeled by two parameters, dna-site and rna-site; the value of each of these parameters is the name of another molecule, or ∅ if no molecule is bound to the site. If a molecule with length greater than one is bound, then an extra parameter is necessary to describe the position of the attachment. When many identical molecules exist, PEPTIDE groups them together for simpler reasoning. Initially there are many RNA molecules in the RNA group, but only one DNA and one duplicase.

Example 2.1. Four discrete processes are defined: bind, grow, slide, and drop.

Discrete process: Bind.
Preconditions: Both sites of duplicase are empty ∧ nothing is binding the left end of DNA ∧ there is a free piece of short RNA available.
Changes: Set duplicase.dna-site = the left end of DNA ∧ set duplicase.rna-site = the right end of RNA.

Discrete process: Grow.
Preconditions: DNA is bound left of its right end ∧ RNA is bound at its right end.
Changes: Increment RNA.length, making it grow longer at the right end.²

Discrete process: Slide.
Preconditions: Both DNA and RNA are bound at a position left of the right end.
Changes: Increment the position of DNA in duplicase.dna-site ∧ increment the position of RNA in duplicase.rna-site.

Discrete process: Drop.
Preconditions: DNA is bound at its right end ∧ RNA is bound.
Changes: Set duplicase.dna-site, duplicase.rna-site to ∅.
In the initial situation all the preconditions of bind are satisfied, so an instance of the process is active (its changes are in effect), and the discrete simulator predicts that a duplicase-DNA-RNA complex will form as shown in Fig. 5. In addition there will be many single-element RNA fragments still floating free. The simulator records its prediction in the history structure by maintaining a network in which nodes represent groups of equivalent molecules and links describe the activity of every process.

² Note that this process specifies that the RNA chain gets longer without depleting any resources. This is done to simplify the model.
[FIG. 5. Duplicase state after bind process.]
Next, grow is active, so the RNA chain increases to length two (Fig. 6). At this point, slide is the sole active process, so it is simulated (Fig. 7). Since duplicase is back at the right end of RNA, grow is active once more. But before the discrete simulator can predict its effects, the aggregator notices that the two instances of grow are similar. Of course, there are differences in the state of the system, but PEPTIDE determines that these differences are not "important" (as explained in Section 4.2.1) and thus that a cycle of processes is repeating. A quick search of the history structure results in the discovery that the cycle is (grow, slide). PEPTIDE generates the following continuous process as an abstraction of the cycle:

Continuous process: CP1.
Preconditions: DNA.number > 0 ∧ RNA.number > 0 ∧ duplicase.number > 0 ∧ the position of DNA in duplicase.dna-site is before the right end ∧ the position of RNA in duplicase.rna-site is before the right end.
Changes: Increase RNA.length on the right ∧ increase the position of RNA in duplicase.rna-site ∧ increase the position of DNA in duplicase.dna-site.
[FIG. 6. Duplicase state after first grow process.]

[FIG. 7. Duplicase state after first slide process.]

[FIG. 8. Duplicase state after first transition analysis.]
Next PEPTIDE uses transition analysis to predict the eventual state of the system. In this case, since the length of RNA is increasing as fast as duplicase is moving towards the end, duplicase reaches the end of the DNA chain before reaching the end of the RNA chain (Fig. 8). Since the grow process did not use any of the seed RNA molecules to extend the chain (they were only used to start a chain), there are still many short RNA molecules present. At this point simulation reverts to the discrete model.

Next the discrete simulator analyzes the discrete process preconditions and determines that drop is the sole active process instance. Simulating it leaves duplicase, DNA, and the new long RNA chain floating free. In addition there are still a large number of the short, seed RNA molecules.

Next, the discrete simulator determines that an instance of the bind process is active. But before it can be simulated, the aggregator recognizes a cycle. Since the current situation is the same as the initial situation (only the number of short and long RNA molecules has changed), the sequence (bind, grow, slide, CP1, drop) is a cycle. It is quite important that aggregation is able to handle cycles containing both discrete and continuous processes, because this provides the ability to compose aggregations. PEPTIDE can abstract the behavior of this new cycle even though it contains a previously aggregated description. Since most complex systems involve such nested cycles, the power to compose aggregated descriptions is necessary when reasoning about most interesting situations.

The net influences for an iteration of the cycle are an increase in the number of long RNA strands and a decrease in the number of short RNA fragments. Since there is only one possible boundary value, transition analysis predicts that the resulting situation will contain zero RNA fragments and many long RNA gene transcripts.
When the simulator shifts back to discrete mode, there are no active processes, so the simulation is complete.

The ability to aggregate cycles within other cycles is one of PEPTIDE's most interesting attributes. Although PEPTIDE successfully aggregated a dozen test cases, subsequent analysis revealed that the cycle detection algorithm is flawed. The rest of this paper attempts a re-analysis of aggregation to explain this and other issues that PEPTIDE raised.
3. Representing State and Change

Understanding the representations used for system state and change over time is essential to a clear comprehension of aggregation for two reasons:

- Since aggregation depends on repetition, a key question is "When is the same thing happening over and over?" The task of determining when two changes are the same is tightly bound to the representation of change.
- Because the utility of an abstraction technique depends on the procedures that can be performed on the resulting description, aggregation's utility stems from the application of transition analysis to a continuous process. Thus the nature of continuous processes and their representation constrain aggregation.

In this section I discuss the issues involved in modeling state, change, and time in mathematical terms (see [22, 23] for background). This analysis will prove quite useful in Section 4.
3.1. Parameters

I assume that a state of the world is represented by a set of objects; each object is described by the values of its parameters. Both objects and parameters are typed; two parameters of the same type share a set of possible values. Two objects of the same type have the same parameters.

Several types of parameters have been used in AI theories: unordered sets, the integers ($\mathbb{Z}$), and the real numbers ($\mathbb{R}$) are all common and well understood. Since the focus here is on qualitative representations, I shall describe the mathematical properties of possible parameter value sets in terms that can be applied to qualitative representations as well as to the examples noted. The following properties are important to transition analysis and hence to aggregation.

- Ordering: If every two values, $x$ and $y$, of a parameter are related by exactly one of three possible relations, $x < y$, $x = y$, or $x > y$, and the relations satisfy transitivity, then the set of values is totally ordered. This criterion is necessary to specify meaningful directions of change in the value of a parameter, a prerequisite for transition analysis.
- Algebraic structure: Since the operations of addition and multiplication are useful when reasoning about the integers and reals, it is common to define the same operations on qualitative parameters. The set of signs $\{-, 0, +, ?\}$ is a common example: $-$ plus $0$ equals $-$, but the value of $-$ plus $+$ is $?$. This value set doesn't provide much structure, however; the lack of inverses means that it isn't even a group.
- Limit points: A value, $x$, in a value set is a limit point if every open set about $x$ (no matter how small) contains a point distinct from $x$. As will be shown, limit points are important to both transition analysis and mappings between value sets.

In the rest of this paper I will use $\mathbb{N}$ to represent the set of nonnegative integers and $P$ to represent the set of values for an arbitrary, totally ordered parameter.

3.2. Time
I assume that time, denoted $T$, is a totally ordered parameter consisting of decomposable intervals and nondecomposable instants. Two times are said to meet if they abut (no distinct time fits between them) but do not overlap [1]. As a notational convenience, I assume that time has a least element, 0.

One reason that the meet relation is so handy for modeling time is that it only partially specifies the temporal representation. Discrete (e.g. the integers), connected³ (e.g. the reals) [2], and mixed models [11] have been shown consistent with the meet relation. This is important because different temporal models have been used to represent discrete and continuous processes. Section 4 shows that aggregation can work with either model.

3.3. Processes
Both processes and devices have been proposed as models of change, and each representation has its advantages (for examples, see [4]). For the purposes of this paper, however, the differences are unimportant. For concreteness I represent possible changes to the parameters of objects with processes (as in [14, 19]). Processes consist of two parts: preconditions and changes.

Process descriptions are more general if they refer to types of objects rather than to specific individuals [7]. For example, when modeling contained fluids, one might define a general fluid flow process rather than one for each pipe. The term process instance refers to a process that has been instantiated during the simulation by replacing references to object types with specific objects. A process instance is said to be active if its preconditions are satisfied.

Two types of process, discrete and continuous, can be used to describe the behavior of a system. Parameters that are being modified by an active discrete process have undefined values while the process is active, while those that are changed by a continuous process are defined. As will be seen in the next two sections, this difference manifests itself in the changes part of the process definition.
³ A set is separated if there are two nonempty, open, disjoint subsets whose union equals the set. A set is connected if it has no separations. Thus the subset of $\mathbb{R}$, $A = \{x \mid 0 < x < 2\}$, is connected, but $B = A - \{1\}$ is not, because $\{x \mid 0 < x < 1\}$ and $\{x \mid 1 < x < 2\}$ separate $B$.
3.3.1. Discrete processes

Since discrete processes do not define the value of parameters while the values are being modified, the changes part of the process is quite simple: add and delete lists suffice. Because of this simplicity, discrete process representations have been common ever since their introduction in STRIPS [10].

The model of discrete process that I use here has two additional constraints. I assume that discrete processes act in an instant and act atomically, either completely or not at all. It is desirable to have a well-defined next state after each discrete action. This means that discrete processes require an unconnected temporal model, because a well-defined next state implies that there are two distinct time points with no points in between [22].

3.3.2. Continuous processes

Continuous processes define the value of all parameters as they change gradually over an interval of time. This raises three interesting questions:

- What does gradual change mean?
- Since each parameter is defined for all times, there must be functions, $f: T \to P$, called trend functions, which define each parameter's value for all times that the process instance is active. What class of functions are allowable trend functions?
- How should a trend function be specified in the changes part of a continuous process definition?

Since transition analysis is the primary reasoning technique applied to continuous processes, it is helpful to look at transition analysis' requirements when answering these questions.

3.3.2.1. Transition analysis

Transition analysis takes as input a description of a parameter's gradual change over time and a set of boundary values for the parameter. The boundary values (called limit points in [14] and landmark values in [20]) are simply possible values for the parameter that are of interest for some reason. Typically they are values mentioned in a process' preconditions, marking points at which an active process becomes inactive or vice versa.
The boundary values divide a totally ordered parameter space into intervals. Transition analysis determines if the parameter will move from one interval to another and, if more than one parameter is changing, which boundary value will be reached first. For example, if water is flowing out the drain of a sink, transition analysis could predict that eventually the parameter water.height will reach the boundary value of zero. If water were flowing into the sink from a tap while the drain was open, then the analysis would be more difficult.

Previous work on the requirements of transition analysis [21, 27] was aimed at real-valued parameters. Trend functions were required to be from the reals to the reals, continuous, and piecewise differentiable so the intermediate value theorem could be used to determine and order transitions across boundaries. Fortunately the intermediate value theorem is not necessary for transition analysis; trend functions do not need to be real-valued. For example, functions from discrete time to a discrete parameter space are possible.

Consider a jar of vitamins; the number of vitamins is best represented by the nonnegative integers, a discrete space. For the situation in which Joe removes one vitamin a day, time can also be modeled by the nonnegative integers. The trend function, $f: \mathbb{N} \to \mathbb{N}$, represents the number of vitamins as a function of time. Transition analysis can work here. For any boundary value, $v$, less than the initial number of vitamins, $n$, there exists a time $t_v = n - v$ such that $f(t_v) = v$. Since time is ordered, it is clear which boundary value will be reached first.

If transition analysis doesn't require trend functions to have real arguments and values, what are the requirements? Since continuity and differentiability don't have intuitive meanings for discrete spaces, new criteria must be developed.

3.3.2.2. Meticulous trend functions

A trend function, $f: T \to P$, is said to be strictly increasing if $\forall x, y \in T$, $x < y$ implies $f(x) < f(y)$. A trend function, $f: T \to P$, is said to be meticulously increasing if it is strictly increasing and onto the subset of $P$ defined by $\{x \mid x \ge f(0)\}$; in other words, it doesn't skip any values. Strictly decreasing and meticulously decreasing functions have the analogous meanings. A strictly monotonic function is either strictly increasing or strictly decreasing. A meticulous function is either meticulously increasing or meticulously decreasing.

Convergent functions pose problems: they can move steadily closer to a boundary value yet never arrive. To avoid this problem I assume that all trend functions are unbounded.⁴

If a trend function is meticulous, then transition analysis can be performed.
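The vitamin-jar argument can be made concrete with a short Python sketch. The function names `vitamin_trend`, `is_meticulously_decreasing`, and `transition_times` are hypothetical, and the meticulousness check only inspects a finite horizon of time steps, so it is a sanity check under these assumptions rather than a proof.

```python
def vitamin_trend(n):
    """Trend function f(t) = n - t for a jar that starts with n vitamins."""
    return lambda t: n - t

def is_meticulously_decreasing(f, horizon):
    """Check, on times 0..horizon, that f drops by exactly one per step:
    it is strictly decreasing and skips no integer value."""
    return all(f(t + 1) == f(t) - 1 for t in range(horizon))

def transition_times(n, boundaries):
    """For f(t) = n - t, a boundary v <= n is reached at t_v = n - v.
    Returns (boundary, time) pairs ordered by time of arrival."""
    reachable = [(v, n - v) for v in boundaries if 0 <= v <= n]
    return sorted(reachable, key=lambda pair: pair[1])
```

For a jar of 30 vitamins and boundary values 0, 10, and 25, `transition_times(30, [0, 10, 25])` orders the boundaries by arrival time, which is all that transition analysis needs in order to say which one is reached first.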
The condition that the function be strictly increasing is similar to the requirement in previous treatments that functions be differentiable with positive derivative. The condition that the function be meticulous captures the notion of continuity in connected spaces: no values are skipped. This completes the notion of gradual change: if a trend function is meticulous, then transition analysis can be performed. Naturally, the trend function for the number of vitamins, above, is meticulous.

If a trend function is not meticulous, then transition analysis cannot always be done. Consider Example 3.1.

Example 3.1. There is a basket which can hold exactly $n$ balls. If the balls are dropped in five at a time, then the trend function, $f: \mathbb{N} \to \mathbb{Z}$, is defined by $f(t) = 5t$. Suppose that the preconditions for this process specify that it will stop when the basket gets full, and the preconditions for another process say that the other process will become active when three or more balls have spilled on the floor. Transition analysis will be given the trend function, $f$, and the boundary points $n$ and $n + 3$. Since $f$ is still strictly increasing, transition analysis can determine that eventually a boundary will be crossed, but will the smaller boundary point be reached first? Given this model of the world, it is impossible to tell; if $1 \le n \bmod 5 \le 2$ then the two boundaries will be crossed in the same instant of time!

⁴ Forbus also makes this restriction, but further work is required to actually detect systems which violate this assumption.

In some cases it may be possible to perform transition analysis of nonmeticulous trend functions given algebraic information such as "the trend function is linear." Linearity would allow the reasoner to partition the parameter space into equivalence classes given the initial value. For example, if no other process added or removed balls and the basket was initially empty, then the reasoner could conclude that the basket will always contain $5i$ balls for some $i \in \mathbb{N}$. If $n = 65$, then $n$ would be in the same equivalence class as the initial value zero, and eventually the basket will be full with no balls on the floor. But if the reasoner has no algebraic information, then trend functions must be meticulous to allow transition analysis. In the remainder of this paper I will assume that all trend functions are meticulous.

How seriously does this condition restrict the class of trend functions?

- If both the time and parameter spaces are connected, then meticulous trend functions correspond to strictly monotonic, continuous functions.
- If the time space is discrete and the parameter space has limit points (or vice versa), there are no meticulous trend functions. Note that all nontrivial connected spaces have limit points. In the first case, some parameter values will be missed;⁵ in the second, the function will repeat values, thus violating strict monotonicity.
- If both the time and parameter spaces are discrete (say, $\mathbb{N}$ and $\mathbb{Z}$), then every meticulous function is of the form $f(t) = t + b$ (or $f(t) = b - t$ in the decreasing case). The only difference between the functions is the initial value, $b$.

Although the constraints seem quite strict in the last case, things aren't as grim as they might seem. First, the class includes many common functions; all the trends that PEPTIDE produces are in this class. Second, many qualitative representations have algebraic structure. Although I don't consider the possibilities in this paper, there is considerable potential for new approaches to transition analysis which would use this structure in lieu of a meticulous trend function.

⁵ This is deeper than the statement that there is no mapping from a countable set onto an uncountable set, since it shows that there is no meticulous function $f: \mathbb{Z} \to \mathbb{Q}$ even though the rationals are countable.
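The claim in Example 3.1, that the two boundaries are crossed in the same instant exactly when $1 \le n \bmod 5 \le 2$, can be checked numerically. The sketch below is illustrative only; the function names `first_crossing` and `boundaries_coincide` and the default arguments are assumptions introduced here.

```python
def first_crossing(boundary, step=5):
    """Smallest t in N with f(t) = step * t >= boundary."""
    return -(-boundary // step)   # ceiling division on integers

def boundaries_coincide(n, spill=3, step=5):
    """True when 'basket full' (n) and 'spill balls on the floor'
    (n + spill) are first crossed at the same time step."""
    return first_crossing(n, step) == first_crossing(n + spill, step)
```

With the basket capacity $n = 65$ from the text, `boundaries_coincide(65)` is false: the basket fills exactly and nothing spills. With $n = 66$ it is true, so a reasoner that knows only that $f$ is strictly increasing cannot order the two transitions.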
3.3.2.3. Representing trend functions

Although several different representations have been proposed for the changes part of a continuous process [8, 15, 27], they are all roughly equivalent; they all allow the generation of a meticulous trend function. I follow the approach in [14] and define the changes part of a continuous process by listing the "influences" on the various parameters, saying which parameters are increasing and by how much. If two continuous processes are active and both are affecting a parameter, then the influences must be summed to produce the actual direction and rate of change. It is easy to compute the trend function given an initial value and this description of change.

3.3.2.4. A key observation

The fact that all meticulous trend functions are strictly monotonic has important ramifications. It means that every parameter being modified by an instance of a continuous process takes on a new value (greater than or less than the previous value) for each instant in the interval that the instance is active. Thus continuous processes can only change the values of totally ordered parameters. This will prove to be an important foundation of aggregation's cycle recognition capabilities (Section 4.2).
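As a rough illustration of summing influences, the following Python sketch combines the influences of several active continuous processes and builds a trend function from the result. It assumes numeric, constant rates and integer time, which is a simplification made here for the example; a purely qualitative reasoner might combine signs instead. The names `net_influences` and `trend_function` are hypothetical.

```python
from collections import defaultdict

def net_influences(active_processes):
    """Sum the influences of all active continuous processes.
    Each process is a dict mapping parameter name -> signed rate."""
    total = defaultdict(int)
    for influences in active_processes:
        for param, rate in influences.items():
            total[param] += rate
    return dict(total)

def trend_function(initial, rate):
    """Trend under a constant net rate: f(t) = initial + rate * t.
    On integer time this is meticulous only when rate is +1 or -1."""
    return lambda t: initial + rate * t
```

For the sink described earlier, a drain with influence -2 on `water.height` and a tap with influence +1 sum to a net influence of -1, and `trend_function(10, -1)` then reaches the boundary value zero at time 10.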
3.4. Histories The history of an object is a record of the change in the values of the parameters of the object over time [18]. The histories of two objects are said to intersect when a single process instance instantiates both objects. I assume a closed-world hypothesis: all changes to system state are modeled by processes and the definitions of all processes are known. This implies that an object can be directly affected by another only when their histories intersect. Both objects are not necessarily affected when their histories intersect, however; an object is only affected when it is actually changed by the process instance rather than just being mentioned in the process' preconditions. I assume that the causal simulator maintains a history for each object in the system. Each history is marked when it intersects with another in a manner that affects the first history's object. Histories are important because they allow the aggregator to detect cycles.
4. Aggregation In the most general sense, aggregation is a technique for recognizing when processes repeat and for generating a more abstract continuous process description of the change over time. Since transition analysis is the best-understood reasoning technique for continuous processes, I will only discuss aggregation techniques that can generate meticulous trend functions. Thus an aggregator is a program which takes as input a history structure, a set of active processes, the process definitions, and a list of totally ordered parameters.
The aggregator recognizes cycles and produces a continuous process representation of the system behavior. Cycle recognition occurs in three phases. The repetition recognizer checks all active processes to discover when the "same" thing is happening. The cycle extractor determines what sequence of processes is repeating, and the cycle verifier checks this candidate cycle to see if it is really valid. If the cycle is valid and has a meticulous trend function, then a continuous process abstraction of the cycle can be generated. The rest of this section discusses issues in aggregation. What is a cycle? What does it mean for two processes to be the "same"? Will other processes interfere with a cycle, perhaps causing it to stop? Given a cycle, how can a continuous process be generated? Does the existence of multiple repeating cycles cause problems? Can all cycles be detected? Throughout the section, the issues will be illustrated with examples and details of PEPTIDE's performance.
4.1. Possible types of cycles Intuitively, a cycle is a collection of processes which can independently repeat activity. This repetition can come about in two ways, giving rise to serial and parallel cycles. Serial cycles are familiar: a single object does some action time after time. In a parallel cycle a group of similar objects all do the action once. In both cases the action gets repeated, but in parallel cycles the repetition is spatial rather than temporal. Consider the following examples: Example 4.1. Serial: The lifestyle of a person provides a simple example of a serial cycle. Each day the person sleeps and wakes. One object does the same pattern of actions over and over. Example 4.2. Parallel: A pile of heating popcorn illustrates a parallel cycle. Eventually all the kernels will pop, but each kernel pops only once. The "same" action is happening repeatedly, but the repetition is parallel. Example 4.3. Mixed: A mixed cycle has both serial and parallel elements. A line of people withdrawing money from the bank is an example. One clerk handles many similar transactions, so the cycle is serial, but a different person is helped each time, so the cycle is parallel. This paper concentrates on serial cycles. Often these cycles are just sequences of processes. But since more than one process can be active at a time, serial cycles can branch into concurrent paths. In general, a cycle will be a subgraph of the history structure. Because all the examples in this paper use discrete time, a cycle can be represented as a totally ordered sequence of equivalence classes of process instances; instances which happen at the same instant of time are in the same class.
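The discrete-time representation of a cycle described above, a totally ordered sequence of equivalence classes of process instances, can be sketched as follows. The `ProcessInstance` record and the `as_cycle` helper are hypothetical names introduced only for illustration.

```python
from collections import defaultdict
from typing import Iterable, List, NamedTuple, Tuple


class ProcessInstance(NamedTuple):
    process: str               # name of the process definition
    objects: Tuple[str, ...]   # objects instantiating the process
    time: int                  # discrete instant at which it was active


def as_cycle(instances: Iterable[ProcessInstance]) -> List[frozenset]:
    """Group instances by instant and order the groups by time.

    Instances active at the same instant fall into the same class.
    """
    by_time = defaultdict(set)
    for inst in instances:
        by_time[inst.time].add(inst.process)
    return [frozenset(by_time[t]) for t in sorted(by_time)]


history = [
    ProcessInstance("sleep", ("person",), 0),
    ProcessInstance("wake", ("person",), 1),
]
print(as_cycle(history))
```

Each element of the result is one equivalence class; a class with more than one member marks a point where the cycle branches into concurrent paths.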
4.2. Cycle detection Cycle detection requires three distinct phases:
- The repetition recognition phase notices when an active process instance is the "same" as one that has previously occurred. If such a pair of endpoint processes is found, then it is a strong clue that the cycle is repeating.
- The candidate cycle extraction phase searches through the history structure to determine what sequence of process instances connects the two instances that were found to be the "same".
- The cycle verification phase checks the sequence produced by the previous phase to ensure that all the process instances can repeat. This test, like repetition recognition, requires a test to see if pairs of process instances are the same.
Each of these phases is described and illustrated below.
4.2.1. Repetition recognition Recognizing when an active process instance is the "same" as one that has already happened depends critically on a set of sameness abstractions. Since intervening process instances have almost always modified some object, the system state is never identical after one iteration of a nontrivial cycle. The sameness abstractions allow the aggregator to define what should be considered relevant information. Two process instances are said to be the same if they are instances of a single process and the objects instantiating them are the same relative to the predicates in the process' preconditions. Two objects of identical type are the same relative to a set of predicates if (1) all the unordered parameters have identical values, and (2) the predicates do not distinguish between the values of any of the totally ordered parameters. Consider the following example. Example 4.4. Imagine a world with two objects: a frog and a cat, both modeled by two parameters, height and state. The values of height are totally ordered while the respective states are unordered sets.
The frog is at the bottom of a deep well, so his initial height is 0, but the cat has height top where top > 0. The frog's state is a member of {ready, tired, dead}, initially ready. The cat's initial state is asleep but hungry is also possible. There are three processes defined:
Discrete process: Jump.
Preconditions: frog.height < top ∧ frog.state = ready.
Changes: Increment frog.height by 3 ∧ set frog.state = tired.
Discrete process: Slide.
Preconditions: 2 < frog.height < top ∧ frog.state = tired.
Changes: Decrement frog.height by 2 ∧ set frog.state = ready.
Discrete process: Awaken.
Preconditions: cat.state = asleep ∧ frog.height > 0.
Changes: Set cat.state = hungry.
Simulating the frog system has the following results:
System state                                          Active process instances
Time 0: frog at 0 and ready; cat at top and asleep    Jump_frog
Time 1: frog at 3 and tired; cat at top and asleep    Slide_frog, Awaken_cat
Time 2: frog at 1 and ready; cat at top and hungry    Jump_frog
Simulation of the first jump results in the state on the second line: the frog is at height 3 and is tired. At this point two instances are active; simulation results in the state shown at time 2. Now an instance of jump is active again; perhaps it is repeating. For the two instances of jump to be the same, the parameters of frog at time 0 must be the same as at time 2. Since frog.state is an unordered parameter, the two values must be identical; they are. Since frog.height is a totally ordered parameter, the predicate in the preconditions of the jump process must not distinguish between the two values, 0 and 1. It doesn't: both 0 and 1 are less than top. Thus, the two instances of the jump process are the same. These sameness abstractions are carefully designed to ignore repetition if there is no chance that a transition analysis could be performed on the resulting cycle. To see that this is true, assume that there is a cycle. For transition analysis to work the cycle must have a meticulous trend function. Since unordered parameters cannot support such a trend function, they must not change value after an iteration of the cycle. This is why the sameness abstractions require unordered parameters to have identical values while totally ordered parameters are allowed to vary somewhat. If the values of both types of parameters were tested for distinguishability relative to the process' preconditions, then more cycles could be found, but the aggregator would be unable to generate a useful description for them. PEPTIDE's sameness abstractions are similar to the ones presented here, but it mistakenly assumes that a cycle exists whenever two process instances are the same. In fact, the cycle must be verified, as explained in Section 4.2.3.
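The sameness test for the jump instances of Example 4.4 can be sketched in a few lines. This is one possible reading of the sameness abstractions, not PEPTIDE's code: "the predicates do not distinguish between the values" is modeled as "every predicate over ordered parameters yields the same truth value for both snapshots". The constant `TOP`, the dictionary encoding, and all function names are assumptions made for the illustration.

```python
from typing import Callable, Dict, Iterable, Tuple

TOP = 10  # hypothetical height of the well


def jump_ordered_predicates(frog: Dict) -> Tuple[bool, ...]:
    """Truth values of jump's precondition clauses that test ordered parameters."""
    return (frog["height"] < TOP,)


def same(a: Dict, b: Dict, unordered: Iterable[str],
         ordered_predicates: Callable[[Dict], Tuple[bool, ...]]) -> bool:
    # (1) every unordered parameter must have identical values
    if any(a[key] != b[key] for key in unordered):
        return False
    # (2) the predicates must not distinguish the ordered values
    return ordered_predicates(a) == ordered_predicates(b)


frog_t0 = {"height": 0, "state": "ready"}
frog_t1 = {"height": 3, "state": "tired"}
frog_t2 = {"height": 1, "state": "ready"}

print(same(frog_t0, frog_t2, ["state"], jump_ordered_predicates))  # heights 0 and 1 are both below TOP
print(same(frog_t0, frog_t1, ["state"], jump_ordered_predicates))  # the unordered state differs
```

The design choice worth noting is that ordered parameters are never compared directly; only the outcome of the precondition predicates is compared, which is what allows the height to drift between iterations.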
Since recognizing repetition is the key step in locating cycles, the algorithm's efficiency is quite important. PEPTIDE checks each active process instance against all previously simulated instances, hence checking for sameness at time t takes O(t · n^2) matches, assuming that n active process instances are generated at each instant in time. Much faster algorithms should be possible. Computing a hash value as a function of the process type and the unordered parameters of the instantiating objects could form the basis of an algorithm whose speed is independent of t and linear in n. 4.2.2. Candidate cycle extraction Once it is known that two process instances are the same, an aggregator needs to extract a candidate cycle that links the endpoints. The obvious approach, collecting all the instances that occurred during the intervening interval, is obviously wrong. For Example 4.4, this approach would produce the candidate (jump, [slide, awaken]).6 But the awaken process instance has no business in the cycle, since it can't repeat. PEPTIDE improves on the naive extraction algorithm by comparing the objects instantiating the repeating process instances. If the two instances are acting on different sets of objects, then the repetition must be parallel. Otherwise, if any of the objects instantiated both process instances, then the cycle is serial or mixed. In these cases, PEPTIDE simply uses the history structure of the object to determine the sequence of processes which brought the object back into a state where a process could repeat. PEPTIDE proposes the whole sequence of instances connecting the two endpoint instances as a cycle. It is possible for the repetition recognizer to find endpoints which share two objects. Consider the example in Fig. 9. Process P2 modifies both X and Y objects, taking X from state n to state c and Y from A to B. Since there is only one X and one Y object, the successive instances of process P2 change them repeatedly.
In other words every instance of P2 is instantiated by the physically identical X and Y objects. If the extractor was given two instances of P2 as possible cycle endpoints, then it would realize that both X and Y were
FIG. 9. Cycle with two interlocked loops.
6The square brackets indicate that slide and awaken happened in the same instant of time.
involved, and the histories of both X and Y would be combined into the concurrent serial cycle (P1, P2, [P3, P5], P4). Unfortunately, this algorithm will not work reliably if the cycle extractor is given different endpoints. If the endpoints were instances of P1, for example, then this algorithm would not realize that Y was involved, and so would produce the plain serial cycle (P1, P2, P3, P4). If there were lots of Ys and the cycle could act on them in parallel, then this would be a reasonable cycle. But if there were only one Y, then the verifier (Section 4.2.3) would have to conclude that the cycle could not repeat. It is unclear how to efficiently extract the correct cycle in this case. Perhaps the search should be controlled by the verifier. 4.2.3. Cycle verification PEPTIDE assumes that a cycle must necessarily exist if there are two process instances that are the same. In Example 4.4 a serial cycle does exist, but there are simple cases which would confuse PEPTIDE. Example 4.5. Consider a slight change in Example 4.4. Suppose that the frog won't slide backward unless the cat is asleep.
Discrete process: Slide2.
Preconditions: 2 < frog.height < top ∧ frog.state = tired ∧ cat.state = asleep.
Changes: Decrement frog.height by 2 ∧ set frog.state = ready.
Simulation would result in the following sequence of states:
System state                                          Active process instances
Time 0: frog at 0 and ready; cat at top and asleep    Jump_frog
Time 1: frog at 3 and tired; cat at top and asleep    Slide2_frog,cat, Awaken_cat
Time 2: frog at 1 and ready; cat at top and hungry    Jump_frog
The aggregator would again notice the sameness of the two instances of jump and would extract the (jump, slide2) candidate cycle. But this sequence of processes can't repeat. Now that the cat is awake the frog will not slide a second time and will never be ready to jump again. The problem is that the repetition detector has only checked the sameness (and thus repeatability) of the endpoint process instances. The interior instances couldn't be checked because they hadn't been extracted yet. Given a
candidate cycle, the aggregator must now make certain that each process in the cycle can actually repeat. One way to do this would be to simulate the cycle for two full iterations. This would provide the aggregator with two instances of each process so that all interior instances could be tested for sameness. Assuming no interference from processes outside the cycle, this test would guarantee that the cycle could repeat.7 Simulating a candidate cycle for two iterations is unappealing when the cycle contains nested cycles. A key question is: "Is there a way to avoid the work of simulating a second iteration?" Suppose that the initial history is as follows:
S_A --P1--> S_B --P2--> S_C --P3--> S_D --P1-->
Suppose these processes are the only ones in the simulation. The repetition recognizer has discovered that the two instances of P1 are the same and the (P1, P2, P3) cycle has been extracted. A way to be sure that P2 can repeat would be to simulate P1 and compare the resulting state with state S_B. But this requires simulating P1. Perhaps it suffices to check the values of certain parameters in states S_A and S_D. If all the parameters that are mentioned in the preconditions of P2 are the same in state S_D as they were in S_A, then they should be the same after simulating P1 the second time as they were in S_B. It has not been proven that this is always the case, however. Possible difficulties in the proof are: interference from other processes and nondeterminism in the order and time of process activity. More research should be done to determine if and when it is possible to verify cycles after a single iteration.
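The two-iteration verification strategy can be sketched with a toy simulator for Examples 4.4 and 4.5. Everything here is an illustrative assumption: the dictionary encoding of state, the constant `TOP`, and the helpers `step` and `verify_cycle` are hypothetical, and the check used is the simple one that every member of the candidate cycle is active again at its turn during a second iteration.

```python
TOP = 10  # hypothetical height of the well


def do_jump(s):
    s["frog_height"] += 3
    s["frog_state"] = "tired"


def do_slide(s):
    s["frog_height"] -= 2
    s["frog_state"] = "ready"


def do_awaken(s):
    s["cat_state"] = "hungry"


# name -> (precondition, change); preconditions follow the process definitions in the text
JUMP = (lambda s: s["frog_height"] < TOP and s["frog_state"] == "ready", do_jump)
AWAKEN = (lambda s: s["cat_state"] == "asleep" and s["frog_height"] > 0, do_awaken)
SLIDE = (lambda s: 2 < s["frog_height"] < TOP and s["frog_state"] == "tired", do_slide)
SLIDE2 = (lambda s: SLIDE[0](s) and s["cat_state"] == "asleep", do_slide)

EXAMPLE_4_4 = {"jump": JUMP, "slide": SLIDE, "awaken": AWAKEN}
EXAMPLE_4_5 = {"jump": JUMP, "slide2": SLIDE2, "awaken": AWAKEN}


def step(state, processes):
    """Simulate one instant: find active processes, then apply all their changes."""
    active = {name for name, (pre, _) in processes.items() if pre(state)}
    new_state = dict(state)
    for name in active:
        processes[name][1](new_state)
    return active, new_state


def verify_cycle(state, cycle, processes):
    """Run a second iteration and check each cycle member is active at its turn."""
    for members in cycle:
        active, state = step(state, processes)
        if not set(members) <= active:
            return False
    return True


def state_after_first_iteration(processes):
    state = {"frog_height": 0, "frog_state": "ready", "cat_state": "asleep"}
    for _ in range(2):
        _, state = step(state, processes)
    return state


print(verify_cycle(state_after_first_iteration(EXAMPLE_4_4), [("jump",), ("slide",)], EXAMPLE_4_4))
print(verify_cycle(state_after_first_iteration(EXAMPLE_4_5), [("jump",), ("slide2",)], EXAMPLE_4_5))
```

In this sketch the (jump, slide) candidate passes, while (jump, slide2) fails at its second step because the cat is no longer asleep. The sketch does nothing to avoid the cost of the second iteration, which is the open question raised above.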
4.3. Process interference The previous section assumed that it was reasonable to consider cycles independent of other processes. But even though a valid cycle has been found, it may not be correct to generate a continuous process and perform transition analysis. Even if a system contains a cycle which would repeat if left alone, it may not be left alone. Example 4.6. Consider Example 4.4. Suppose in addition to the jump, slide, and awaken processes the following process was defined:
7Note that there is always the chance that a cycle will be detected at the precise moment that some totally ordered parameter counts up to a terminal boundary point. I assume that transition analysis will detect and correctly deal with these cases once the aggregator creates a continuous process.
Discrete process: Leap.
Preconditions: cat.state = hungry.
Changes: Leap into the well, smashing the frog.
Simulating this system for three instants results in the same initial history as the original system, but now leap is active as well as jump.
System state                                          Active process instances
Time 0: frog at 0 and ready; cat at top and asleep    Jump_frog
Time 1: frog at 3 and tired; cat at top and asleep    Slide_frog, Awaken_cat
Time 2: frog at 1 and ready; cat at top and hungry    Jump_frog, Leap_frog,cat
The simulator is quite justified in declaring the (jump, slide) cycle valid, but it would be improper to do transition analysis because the leap process is just about to terminate the cycle. In some cases it can be quite difficult to recognize this interference, especially if there is a chain of several distinct processes which occur before the cat actually leaps. PEPTIDE deals with this problem by refusing to generate a continuous process until all active processes are members of some cycle. Unfortunately, this approach is not sufficient, as the following modification to the leap process demonstrates:
Example 4.7.
Discrete process: Leap2.
Preconditions: cat.state = hungry ∧ frog.state = tired.
Changes: Leap into the well, smashing the frog.
Now the first three instants of history are identical to the original, and only jump is active. The second clause in leap2's preconditions has masked its potential for disruption. But it will terminate the cycle. It is not clear how a simulator can detect this type of interference. Perhaps there are efficient ways to show that certain histories can't interact, but it is doubtful that these could work for all interesting cases.
4.4. Continuous process generation Given a valid cycle and assuming no interference, the aggregator needs to construct a continuous process representation of the cycle's behavior so that transition analysis can be applied. For the original frog and cat cycle in
Example 4.4, the aggregator should produce the following continuous process:
Continuous process: CP2.
Preconditions: frog.height < top.
Changes: Increase frog.height.
Generation of the changes and preconditions parts can be done independently. 4.4.1. Generating changes As mentioned in Section 3.3.2.3, the changes part of a continuous process definition is a set of influences on parameters. These influences are assumed to define a meticulous trend function for the parameter, i.e. to specify the parameter's change over time as strictly monotonic and skipping no values. Sometimes this is easy. In the (jump, slide) cycle above, all that is required is a scan through the cycle. Jump increases frog.height by 3; slide decreases the parameter by 2. So the net increase is 1, and no values are skipped. But what if the slide process only decreased frog.height by 1? The net increase would then be 2, so every other value would be skipped and the resulting trend function would not be meticulous. What if one of the cycle's processes changes a totally ordered parameter by multiplying it by 3 modulo some constant? In this case the parameter's value would jump around in a confusing fashion. Since it would be impossible to construct a monotonic (let alone meticulous) trend function for such a cycle, what should the aggregator do? There are three approaches to dealing with these problems:
- Check all cycles and refuse to aggregate those which generate nonmeticulous trend functions. Thus if a cycle was causing a parameter to jump wildly in value, the simulator would simply simulate it directly without aggregation.
- Allow monotonic trend functions which aren't meticulous. Accept the resulting inaccuracy in transition analysis (Section 3.3.2). If the range of skipped values is bounded (e.g. by two in the modified frog example above) then subtract the bound from the boundary value, so that transition analysis is guaranteed to stop short of the boundary.
- Constrain the set of primitive processes so that only certain types of change are allowed on ordered parameters, thus ensuring that all cycles would engender meticulous trend functions.
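The scan through a cycle described above can be sketched for the simple case in which every process changes an ordered parameter by a constant integer. The classification labels and the function name `classify_cycle` are hypothetical; changes such as multiplication modulo a constant are outside what this sketch handles.

```python
from typing import Iterable


def classify_cycle(deltas: Iterable[int]) -> str:
    """Classify the per-iteration trend of one ordered, integer-valued parameter.

    deltas: the constant change each process in the cycle applies to the parameter.
    Assumption: one iteration of the cycle is treated as a single step.
    """
    net = sum(deltas)
    if net == 0:
        return "constant"      # no trend at all
    if abs(net) == 1:
        return "meticulous"    # strictly monotonic and skips no value
    return "monotonic"         # strictly monotonic but skips values


print(classify_cycle([3, -2]))  # jump +3, slide -2
print(classify_cycle([3, -1]))  # jump +3, modified slide -1
```

Under the first of the three approaches, an aggregator would refuse to aggregate any cycle for which a parameter is classified as anything other than meticulous or constant.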
It appears that meticulous trend functions will result if processes are allowed to increment, decrement and set parameters to arbitrary constants. Monotonic trend functions will result when constant values are added or subtracted. It may well be possible to expand this class of operations. A related question arises when considering cycles which are partially describable in terms of meticulous functions. For example, assume that the frog had an additional unordered parameter, color. Suppose that every time the frog jumped, it changed to a new color as determined by some complex function;
assume that the frog never takes the same value of color more than once. If the preconditions of jump and slide never test the frog's color, then the cycle (jump, slide) will repeat. If the goal of simulation is to produce a complete description of the system's eventual state, then the aggregator should ignore this cycle because transition analysis cannot predict how the unordered parameter, color, will change. This is precisely what the current sameness abstractions guarantee by requiring all unordered parameters to be identical. However, there are probably situations in which the modeler does not care about the frog's eventual color. Since the trend function for frog.height is meticulous, transition analysis can predict that the frog will get out of the well. To benefit from transition analysis of partially aggregated descriptions, the cycle recognizer would need to be augmented with a list of irrelevant parameters. The sameness abstractions could then allow differences in unordered parameters (like frog.color) if they were labeled uninteresting and did not appear in process preconditions. 4.4.2. Generating preconditions Continuous process preconditions arise for two reasons. Transition analysis should terminate when a cycle stops repeating and also when some new process becomes active. These causes are discussed in turn. 4.4.2.1. Internal termination The continuous process is a reasonable model of system behavior only when all of the cycle members can repeat. Thus some of the preconditions serve to ensure that the continuous process is only active when these member processes can repeat. As explained in Section 3.3.2.4, a continuous process can only change the value of totally ordered parameters. Assuming that no other process interferes, a valid cycle can only terminate due to the change in a value of a precondition which tests a totally ordered parameter. 
This means that all member preconditions which test unordered parameters need not be considered because their truth will not change. Similarly, if the value of a totally ordered parameter is constant throughout the cycle or is moving away from a precondition's boundary value, then the precondition need not be considered. The remaining, relevant preconditions from processes in the cycle should be included in the preconditions for the continuous process. 4.4.2.2. Noticing other processes Preconditions also arise from the need to terminate transition analysis when a new process becomes active. Since the new process could interfere with the cycle or cause some interesting effect that should be noted, the continuous process abstraction becomes a poor model of the system's behavior when new processes become active. Thus the continuous process should have preconditions which cause it to deactivate when a new process could become active.
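The filtering of member preconditions for internal termination can be sketched as follows. The encoding of a precondition clause as a `(parameter, operator, boundary)` triple, the `direction` map giving the sign of each parameter's net change per iteration, and the function name are all assumptions made for this illustration.

```python
from typing import Dict, Iterable, List, Set, Tuple

Clause = Tuple[str, str, object]  # (parameter, operator, boundary)


def relevant_preconditions(clauses: Iterable[Clause], ordered: Set[str],
                           direction: Dict[str, int]) -> List[Clause]:
    """Keep only clauses that could become false as the cycle repeats."""
    kept: List[Clause] = []
    for param, op, bound in clauses:
        if param not in ordered:
            continue  # unordered parameters do not change across iterations
        d = direction.get(param, 0)
        # "x < b" can fail only if x increases; "x > b" only if x decreases
        moving_toward = (op == "<" and d > 0) or (op == ">" and d < 0)
        if moving_toward and (param, op, bound) not in kept:
            kept.append((param, op, bound))
    return kept


# Clauses collected from jump and slide in Example 4.4
clauses = [
    ("frog.height", "<", "top"), ("frog.state", "=", "ready"),   # jump
    ("frog.height", ">", 2), ("frog.height", "<", "top"),         # slide
    ("frog.state", "=", "tired"),
]
print(relevant_preconditions(clauses, {"frog.height"}, {"frog.height": +1}))
```

For the (jump, slide) cycle only `frog.height < top` survives, which matches the precondition of the continuous process CP2 given earlier.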
The closed world assumption makes it easy to do this alone, but ensuring no false alarms is quite difficult. PEPTIDE does it the easy way; it looks at all the preconditions of inactive processes. If a precondition compares a totally ordered parameter to a boundary point, p, and the parameter is changing value in the direction of p, then the negation of the precondition is included in the continuous process definition. Example 4.8. Assume that the well had a frog sensor twenty feet below the top which would sound an alarm if the frog reached that height. This might be modeled with the following discrete process:
Discrete process: Alarm.
Preconditions: frog.height = top - 20 ∧ sensor.switch = on.
Changes: The alarm sounds.
Then the continuous process generated for the cycle must have the following preconditions:
Preconditions: frog.height < top ∧ frog.height ≠ top - 20.
But this extra precondition is only necessary when sensor.switch will be on when the frog reaches top - 20. In general, it is quite difficult to design an algorithm which can tell when the precondition does and does not need to be inserted. For example, what if one of the processes in the cycle toggles the value of sensor.switch? PEPTIDE's approach is to play it safe and simple. Unfortunately, false boundary values do more damage than just degrade the efficiency of transition analysis. They cause the histories resulting from simulation to become fragmented with irrelevant boundary points where nothing unusual happened. This makes the simulation trace difficult to understand. Perhaps a history postprocessor could splice these out.
4.4.3. Time shift
The basic power of aggregation results from considering each iteration of a cycle as a single operation and computing the net changes. In other words time gets compressed. This can greatly complicate precondition generation when a parameter gets changed by more than one process in the cycle. For example, during each iteration of the (jump, slide) cycle, the frog climbs one foot. If the well is one hundred feet high, how long will it take the frog to get out? PEPTIDE will predict one hundred rather than the correct answer of ninety-eight iterations. One conservative way to fix the problem would be to compute the maximum swing (up and down) for each parameter value during an iteration, then modify the preconditions accordingly. The drawback with this scheme is it might cause transition analysis to falsely predict transitions.
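The gap between one hundred and ninety-eight iterations can be reproduced with a short sketch. It steps the (jump, slide) cycle of Example 4.4 directly and compares the count with the estimate obtained from the net change alone. The function names are hypothetical, and counting one iteration per jump is an assumption of this sketch.

```python
def iterations_to_escape(top: int, jump: int = 3, slide: int = 2) -> int:
    """Count cycle iterations until the frog's height is no longer below top."""
    height, iterations = 0, 0
    while height < top:            # jump's precondition on height
        height += jump
        iterations += 1
        if 2 < height < top:       # slide's precondition on height
            height -= slide
    return iterations


def aggregated_estimate(top: int, jump: int = 3, slide: int = 2) -> int:
    """Estimate from the net change per iteration only (ceiling division)."""
    net = jump - slide
    return -(-top // net)


print(iterations_to_escape(100), aggregated_estimate(100))
```

The direct count is smaller because the final jump carries the frog past the boundary before the slide can pull it back; the net-change estimate ignores that swing within an iteration.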
Example 4.9. Suppose that the definition of slide did not test to see if the frog was below top; this test was left solely in the jump process:
Discrete process: Slide3.
Preconditions: 2 < frog.height ∧ frog.state = tired.
Changes: Decrement frog.height by 2 ∧ set frog.state = ready.
The adjusted algorithm would tell transition analysis that there was a boundary value at top-2 when in fact the frog would take top iterations to climb all the way out. The simulator would eventually make the correct prediction, but it would take longer (at least three more process simulations) and clutter the history with extra detail.
4.5. Multiple cycles If two valid cycles are recognized then the situation may become more complex. Two cycles are said to be independent when every parameter which is tested in the preconditions of a process in one cycle is not modified by any process in the other cycle. If two cycles are independent, then the activity of member processes will be unaffected by the other cycle. Thus, their continuous processes can be generated separately. If the two cycles are not independent, then careful analysis is necessary to detect the following type of cycle deadlock. Suppose cycle α consists of five discrete processes, P1, P2, ..., P5, and cycle β consists of seven discrete processes. The processes in the two cycles modify and test different parameters except for a single unordered parameter, toggle, which has possible values on and off. Suppose process P1 sets toggle on, and process P2 does not touch toggle. Process P3 is only active if the toggle is on and it turns the toggle off. Thus process P1 enables P3. If one of the processes in the β cycle turns the toggle off while P2 is active then the α cycle could be terminated, because P3 won't be active. If there were a second shared toggle which enabled a process in the β cycle, then the two cycles could deadlock [17]. Deadlocks and interference won't necessarily occur during the first iteration. To be sure of detecting any interaction for cases in which the lengths of the two cycles differ, the system must be simulated for the least common multiple of the lengths of the two cycles. When the lengths are relatively prime (as in this case), it could take quite a while. The problem is even knottier if one of the cycles contains a continuous process.
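The independence test and the simulation horizon for dependent cycles can be sketched as follows. The encoding of a process as a dictionary with `tests` and `modifies` sets, the process names in cycle β, and the helper names are hypothetical; the independence check is applied in both directions.

```python
from math import gcd
from typing import Dict, List, Set

Process = Dict[str, Set[str]]  # {"tests": {...}, "modifies": {...}}


def _union(cycle: List[Process], key: str) -> Set[str]:
    out: Set[str] = set()
    for process in cycle:
        out |= process[key]
    return out


def independent(cycle_a: List[Process], cycle_b: List[Process]) -> bool:
    """True if no parameter tested in one cycle is modified by the other."""
    return (not (_union(cycle_a, "tests") & _union(cycle_b, "modifies"))
            and not (_union(cycle_b, "tests") & _union(cycle_a, "modifies")))


def simulation_horizon(len_a: int, len_b: int) -> int:
    """Least common multiple of the two cycle lengths."""
    return len_a * len_b // gcd(len_a, len_b)


alpha = [
    {"tests": set(), "modifies": {"toggle"}},        # P1 sets toggle on
    {"tests": set(), "modifies": set()},             # P2 does not touch toggle
    {"tests": {"toggle"}, "modifies": {"toggle"}},   # P3 needs toggle on, turns it off
]
beta = [{"tests": {"y"}, "modifies": {"toggle"}}]    # some process in beta turns toggle off

print(independent(alpha, beta), simulation_horizon(5, 7))
```

For cycle lengths five and seven the horizon is 35 instants, which illustrates why relatively prime lengths are the expensive case.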
4.6. Theoretical limitations As explained in Section 4.4, generating a continuous process abstraction of a cycle requires the construction of a meticulous trend function. For this to work,
the aggregator has to know in advance which parameters are totally ordered. Aggregation will fail to recognize cycles that influence parameters whose ordered nature is undeclared. Example 4.10. Consider the following chemical scenario. Suppose there is one type of molecule, G, which can bind together with other Gs in the following way. Each G is described by two unordered parameters, site and handle. The value of G.handle is ∅ if no molecule has grabbed G, otherwise the value is the molecule doing the binding. The value of G.site is ∅ if G isn't binding any molecule, otherwise the value is the molecule being bound. Initially, there are two Gs, one bound to the other, and many unbound Gs. There is one process defined:
Discrete process: Grab.
Preconditions: There are two G molecules G1 and G2 ∧ G1.site = ∅ ∧ G1.handle ≠ ∅ ∧ G2.site = ∅ ∧ G2.handle = ∅.
Changes: Set G1.site = G2 ∧ set G2.handle = G1.
So, in other words, the Gs form chains. If a G is at the end of a chain and there is a free singleton G available, then the first G will grab the free G, extending the chain. Since the initial conditions specified that a short two G chain existed, the chain will continue to grow until all the free Gs are used up.
If an aggregator could determine that the length of the chain was a totally ordered parameter, then it could generate this description and predict the eventual result. But the system was described solely in terms of the unordered parameters, site and handle. The fact that an ordered parameter might be a useful description is apparent only as simulation proceeds. A natural question to ask, therefore, is: "Can one augment an aggregator so that it can learn new ways to describe a system, reformulate the description in terms of new parameters (in this case the chain length), and thus generate a continuous process description?" Unfortunately, the answer is: "In general, no." The problem is that, given a suitably rich set of primitive processes, arbitrarily complex situations can be devised. For example, the chains could include three binders: -A-B-C-A-B-C-, or form lattices in two or more dimensions. In fact, PEPTIDE's representation language for the biochemical domain is powerful enough to specify an arbitrary type 0 (unrestricted) grammar. Since this implies that a Turing machine could be built out of PEPTIDE processes, it becomes quite clear why an aggregator cannot always decide whether the simulation will halt. Of course, if other domains do not require such powerful representation languages, then the problem might be tractable. In any case I suspect that
predeclaring ordered parameters will cover most cases for most domains as well as it does for PEPTIDE's.
4.7. Summary
- Aggregation is an abstraction technique for dynamically creating new descriptions of a system's behavior during causal simulation. Aggregation works by detecting repeating cycles of processes and creating a continuous process description of the cycle's behavior. Transition analysis can then be applied in a single step to determine the system's eventual state.
- Cycle recognition occurs in three phases. The repetition recognizer checks all active process instances to see if they are the same as previous instances recorded in the history structure. If two instances are the same, then there may be a repeating cycle. The cycle extractor searches the history structure and proposes a candidate cycle which the cycle verifier checks for validity: will all the processes repeat?
- Given a valid cycle, it is easy to generate a continuous process if all parameters have meticulous trend functions. If the trend functions are only monotonic, then tradeoffs must be made.
- Cycle recognition considers whether groups of processes can repeat as a cycle, isolated from other processes. A thorny problem remains: how can an aggregator anticipate interference from other processes?
5. Applicability
For aggregation to be useful as a qualitative simulation technique, two conditions must hold for the domain in question:
- It must be possible to characterize interesting parts of the domain by totally ordered parameters which change gradually (Section 3.3.2).
- Some processes must repeat their actions cyclically. These repetitious actions can be either discrete, continuous, or both.
There are two steps necessary to apply aggregation to a new domain:
- Select a vocabulary for describing the primitive discrete and continuous processes.
- Identify a set of totally ordered parameters for use when recognizing sameness and generating the changes part of continuous processes.
Below I attempt to demonstrate how aggregation could be applied in the domain of digital circuits, and that of mechanical machinery.
5.1. Cascaded digital counters
Digital electronics is a natural domain for aggregation. There are many devices at different levels of detail whose states are totally ordered: counters, LIFO queues, and pipelines. Many processes repeat cyclically, and it is often interesting to consider the behavior of a system over time.
This section demonstrates how aggregation can be used to facilitate qualitative simulation of two cascaded decade counters. The circuit is shown in Fig. 10. It contains the following components: a clock generator, C, with one port, output; a J-K flip-flop, F, with four ports: clk, J, K and output; two decade counters, X and Y, each with three ports: clk, enable, and carry. Each counter has internal state which is represented by a totally ordered parameter, mem. All other quantities are binary and thus unordered. Ports are connected as shown in Fig. 10, and it is assumed that values flow along the connections from port to port.

The circuit behaves in the following manner. The clock, C, generates a square wave, and the counters only change when the pulse is high. The X counter increments on every pulse, but the Y counter only increments when X overflows. When Y finally overflows, it affects the J-K flip-flop, shutting the whole system down. The rest of this section shows how digital simulation can be augmented by an aggregator to recognize cycles (including nested ones) and summarize the repetitious behaviour of the system.

Example 5.1. Five discrete processes are used to formalize the circuit's behavior: clock-up, clock-down, count, carry, and latch. Clock-up is active whenever the clock's output is currently low; simulation of clock-up causes the output to rise. Clock-down is similar.
Discrete process: Clock-up.
Preconditions: clock.output = low.
Changes: Set clock.output = high.

Discrete process: Clock-down.
Preconditions: clock.output = high.
Changes: Set clock.output = low.
Count is active when the clock is high, enable is high, and the internal state is less than 9 (i.e. not about to overflow). Simulation of count increments the internal state of the counter (footnote 8).
Fig. 10. Two cascaded decade counters. [Figure not reproduced: counters X and Y with enable and carry ports, and the flip-flop with J, K and clk ports.]

Footnote 8. Notice that the discrete process representation abstracts away the details of actual edge-triggering.
Discrete process: Count.
Preconditions: counter.clk = high ∧ counter.enable = high ∧ counter.mem < 9.
Changes: Increment counter.mem ∧ set counter.carry = low.
Carry is similar to count except that it is active when the counter is about to overflow, and it sets the carry flag. The mem state variable is reset to zero.

Discrete process: Carry.
Preconditions: counter.clk = high ∧ counter.enable = high ∧ counter.mem = 9.
Changes: Set counter.mem = 0 ∧ set counter.carry = high.
The latch process is active whenever the flip-flop's J input is high, K is low and the clock is high; it then sets output low (footnote 9).

Discrete process: Latch.
Preconditions: flip-flop.clk = high ∧ flip-flop.J = high ∧ flip-flop.K = low.
Changes: Set flip-flop.output = low.
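To make the five process definitions concrete, here is a minimal Python sketch of plain discrete simulation, with no aggregation, in which every active process is applied simultaneously at each step. Because Fig. 10 is not reproduced here, the wiring is an assumption: every clk port reads C.output, X.enable reads F.output, Y.enable reads X.carry, F.J reads Y.carry, and F.K is held low. The labels `count_X`, `carry_Y` and so on, the function names, and the initial values (clock and carries low, F.output high, both counters at zero) are likewise choices made for this sketch.

```python
def step(s):
    """Apply all active discrete processes at once; return (new_state, active)."""
    new, active = dict(s), []
    clk = s["C.output"]

    # Clock-up / Clock-down.
    if clk == "low":
        new["C.output"] = "high"
        active.append("clock-up")
    else:
        new["C.output"] = "low"
        active.append("clock-down")

    # Count / Carry, instantiated for each counter (assumed wiring, see above).
    enable = {"X": s["F.output"], "Y": s["X.carry"]}
    for c in ("X", "Y"):
        if clk == "high" and enable[c] == "high":
            if s[c + ".mem"] < 9:
                new[c + ".mem"] = s[c + ".mem"] + 1
                new[c + ".carry"] = "low"
                active.append("count_" + c)
            else:
                new[c + ".mem"] = 0
                new[c + ".carry"] = "high"
                active.append("carry_" + c)

    # Latch: J is assumed to read Y.carry, and K is assumed to be held low.
    if clk == "high" and s["Y.carry"] == "high":
        new["F.output"] = "low"
        active.append("latch")
    return new, active


def run_until_latched(state, max_steps=1000):
    """Simulate step by step until the flip-flop output goes low."""
    steps = 0
    while state["F.output"] == "high" and steps < max_steps:
        state, _ = step(state)
        steps += 1
    return state, steps


INITIAL = {"C.output": "low", "F.output": "high",
           "X.mem": 0, "X.carry": "low", "Y.mem": 0, "Y.carry": "low"}

_, first_active = step(INITIAL)
second_state, second_active = step(step(INITIAL)[0])
final, steps = run_until_latched(INITIAL)
```

Under these assumptions the flip-flop latches only after a couple of hundred discrete steps, each of which a purely discrete simulator must execute explicitly. That step-by-step cost is what summarizing a cycle as a single continuous process is meant to avoid.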
Assume that initially X.carry, Y.carry, and C.output are low, but as a result of some presetting activity, F.output is high. In addition, X.mem and Y.mem are initialized to zero. Simulation proceeds as follows:

Initially an instance of clock-up is active; the clock rises. Next an instance of clock-down is active, as is count_X (footnote 10). Simulation of clock-down sets C.output low, which causes X.clk and Y.clk to read low also. Simulation of count_X causes X.mem to take the value 1. Now only clock-up is active; simulation causes C.output to be high. Once again, clock-down and count_X are active. Before they can be simulated, however, the aggregator notices a cycle: ([clock-down, count_X], clock-up). The following continuous process is generated:

Continuous process: CP3.
Preconditions: X.mem < 9.
Changes: Increase X.mem.
Footnote 9. Many more processes are necessary to completely describe a J-K flip-flop, but since they aren't needed for this example, they are omitted.
Footnote 10. By count_X, I mean the count process instantiated by the X counter.

Transition analysis determines that eventually X.mem will equal 9. Simulation continues at the discrete level: clock-down is active, as is carry_X. Simulation causes X.mem to be set to 0, X.carry to be set high, and C.output to be set low. Next clock-up is active, and the clock rises. Now three processes are active: count_X, count_Y, and clock-down. Simulation causes X.mem to be 1, X.carry to be low, Y.mem to be 1, and C.output to go low. Now clock-up is active once again. And since X.carry has been reset, the aggregator recognizes that this instance is the same as the first instance of clock-up. A search of the history graph results in the following cycle: (clock-up, CP3, [clock-down, carry_X], clock-up, [clock-down, count_X, count_Y]). The following continuous process is generated to summarize this cycle-containing cycle:

Continuous process: CP4.
Preconditions: Y.mem < 9.
Changes: Increase Y.mem.
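One way to picture how the changes part of such a process could be derived is to compare the ordered parameters at the start and end of one pass through the cycle and keep only those with a net change. The Python sketch below does that and then applies a deliberately simplified stand-in for transition analysis, which assumes each influenced parameter simply runs to the bound named in its precondition. The function names, the `bounds` table, and the snapshot values (X.mem at 1 before and after the pass, Y.mem going from 0 to 1) are assumptions for illustration only.

```python
def summarize_cycle(start, end, bounds):
    """Derive (influences, preconditions) from one pass through a cycle.

    start, end: ordered-parameter values before and after the pass.
    bounds: parameter -> (lowest value, highest value); an assumed table.
    """
    influences, preconditions = {}, {}
    for param, (low, high) in bounds.items():
        delta = end[param] - start[param]
        if delta > 0:
            influences[param] = "increase"
            preconditions[param] = ("<", high)
        elif delta < 0:
            influences[param] = "decrease"
            preconditions[param] = (">", low)
        # A zero net change yields no influence on that parameter.
    return influences, preconditions


def run_to_limit(state, influences, bounds):
    """Simplified transition analysis: influenced parameters reach their bound."""
    final = dict(state)
    for param, direction in influences.items():
        low, high = bounds[param]
        final[param] = high if direction == "increase" else low
    return final


bounds = {"X.mem": (0, 9), "Y.mem": (0, 9)}
start = {"X.mem": 1, "Y.mem": 0}   # assumed snapshot at the start of the pass
end = {"X.mem": 1, "Y.mem": 1}     # X.mem went 1 -> 9 -> 0 -> 1; Y.mem went 0 -> 1

influences, preconditions = summarize_cycle(start, end, bounds)
final = run_to_limit(end, influences, bounds)
```

With these snapshots only Y.mem is influenced, so the sketch reproduces the shape of CP4: a precondition of Y.mem < 9 and a single increasing influence, with X.mem left where the pass ended.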
The continuous process has no net influence on X.mem because although X.mem was increased from 1 to 9 in CP3, it was decreased from 9 to 0 in carry_X and increased back to 1 in count_X. Thus transition analysis predicts that the loop will terminate when Y.mem is equal to 9. Unaffected by the continuous process, X.mem remains equal to 1.

When simulation reverts to the discrete level, clock-up is active. The aggregator notices that the two instances of clock-up are the same, but the candidate cycle contains CP4. Since the preconditions of CP4 aren't satisfied (Section 4.2.3), the candidate is declared invalid, and clock-up is simulated discretely. Now count_X and clock-down are active; since X.mem was unaffected by CP4, it gets incremented from 1 to 2. Next, clock-up is active, but the aggregator finds a cycle: (clock-up, [clock-down, count_X]). The resulting continuous process, CP5, is identical to CP3. Transition analysis results in a situation with: X.mem equal to 9; Y.mem equal to 9; C.output, X.carry, and Y.carry all low; but F.output still high.

Clock-up is active and simulated discretely (footnote 11). Now carry_X and clock-down are active. Simulation resets X.mem to 0 and sets X.carry high. Now clock-up is active, but there is no cycle because X.carry has been set. Once the clock has risen, count_X, carry_Y, and clock-down are active. Simulation causes X.mem to be 1, X.carry to fall, Y.mem to be 0, Y.carry to go high, and C.output to go low.

Footnote 11. Again, clock-up is not part of a cycle despite a sameness match, because the nested continuous process has unsatisfied preconditions.
Again the clock rises. Afterwards, count_X, latch and clock-down are active. Simulation increments X.mem to 2 and causes F.output to go low, disabling both counters. Simulation continues, however, with clock-up; then clock-down. When clock-up becomes active once more, the aggregator notices the cycle and produces a process with no influences and no preconditions. Simulation terminates with transition analysis concluding that the clock will oscillate forever.

5.2. Internal combustion engine
Another domain in which aggregation could prove useful is that of complex mechanical devices such as an internal combustion engine. Internal combustion engines have many cyclic processes: the strokes of the pistons, the movements of valves, the rotation of gears, differentials and the crankshaft. These repetitious processes affect numerous interesting ordered parameters. Due to the spatially involved descriptions and general complexity of an engine, I will not present a concrete example. Instead this section hints at some of the cycles aggregation might detect:
- On each iteration of the four stroke cycle, the friction of the piston increases the cylinder's temperature, the spurt of fuel-injected gasoline lowers the level in the tank and increases it in the combustion chamber, and the crankshaft rotates.
- On each circuit through the coolant system, the water increases in temperature with a corresponding decrease in the temperature of the engine, followed by an increase in the temperature of the radiator with a corresponding decrease in the temperature of the water.
- On each pass through the engine the oil picks up dirt and deposits it in the oil filter.
A major advantage of using aggregation to generate the influences on these ordered parameters (rather than hard-wiring them into the domain model) is the ability to do both discrete and continuous modeling of the effects of malfunctioning parts and novel mechanisms.
Intuitive, aggregated explanations could be generated for situations where parts such as the fan or water pump are broken. To simulate the behavior of an engine with defective spark plugs, one could simply substitute a new process description for the plugs.

6. Related Work
This section discusses other work on causal simulation, aggregation's relationship to system dynamics, and the similarity between aggregation and two approaches to learning.

6.1. Loop recognition in causal simulation
The causal simulators of de Kleer [6], Forbus [14] and Kuipers [21] all have
limited cycle recognition abilities. When generating an envisionment (tree of possible futures for the system), they test each new global state to see if it is identical to a previous state. Since their cycles repeat verbatim (nothing changes), the cycles never stop. Thus these systems recognize only a small fraction of cycles. PEPTIDE finds more cycles by checking for sameness rather than equality and by comparing local rather than global state. Forbus [15] presents a hypothetical scenario in which the history of an oscillator is subjected to an energy analysis. However, the scenario is unimplemented, and Forbus does not propose a technique for summarizing the behavior of cycles.
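The equality test described above can be sketched in a few lines of Python: remember every global state seen so far and report a loop only when a state recurs exactly. The function `first_repeat` and the two toy transition functions are hypothetical and are not taken from any of the cited simulators; they only illustrate why exact matching on global state misses a cycle in which an ordered parameter keeps changing.

```python
def first_repeat(step, state, max_steps=100):
    """Return (earlier_step, current_step) for the first exactly repeated
    global state, or None if no state recurs within max_steps."""
    seen = {}
    for t in range(max_steps):
        key = tuple(sorted(state.items()))  # hashable snapshot of the global state
        if key in seen:
            return seen[key], t
        seen[key] = t
        state = step(state)
    return None


def clock_only(s):
    """A clock that toggles between low and high; nothing else changes."""
    return {"C.output": "high" if s["C.output"] == "low" else "low"}


def clock_and_counter(s):
    """The same clock, plus a counter that increments while the clock is high."""
    new = clock_only(s)
    counting = s["C.output"] == "high" and s["X.mem"] < 9
    new["X.mem"] = s["X.mem"] + 1 if counting else s["X.mem"]
    return new


verbatim = first_repeat(clock_only, {"C.output": "low"})
missed = first_repeat(clock_and_counter, {"C.output": "low", "X.mem": 0},
                      max_steps=10)
```

The bare clock repeats verbatim and is detected at once, but with the counter attached no global state recurs during the first ten steps, even though the same pair of processes is plainly repeating.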
6.2. System dynamics
My theory of aggregation is actually quite similar to the system dynamics techniques developed in [16]. Forrester explains how aggregation can aid the construction of a model "by combining similar factors into a single aggregate." Numerous examples are presented, all in the domain of industrial organizations.

For example, one might trace the path of an order, its formations, its clerical delays, its transmittal, its waiting in a backlog of unfilled orders, the shipment of goods in response to the order, and the transportation of the goods. Each item in the flow encounters these same circumstances. The channel is easiest to find and to visualize in terms of a single item of business. Having established the channel of one item, we then wish to aggregate into this channel as many separate items as possible. [16]

Although system dynamics models are simulated with computers, they are created by people. This is the inherent distinction between my aggregation and Forrester's: Forrester noticed the importance of aggregation, but didn't try to automate it. PEPTIDE creates new models dynamically without help from a human analyst. System dynamics also distinguishes between discrete events and continuous flows, but limits models to real-valued variables for simplicity.
6.3. Learning
My work on aggregation is closely related to both Dufay and Latombe's [9] and Andreae's [3] work on inductive generalization of procedures from examples. Both of these present algorithms for acquiring procedures containing loops and conditionals from multiple, externally controlled traces. The techniques used to detect loops are very similar to, although slightly less general than, aggregation's recognition of process cycles. For their matchers to work, both Dufay and Latombe's and Andreae's
algorithms require key events to be known to be equivalent; given a few initial event matches, the algorithms propagate correspondences, refining the generalized procedures with each new trace input. Dufay and Latombe's program requires corresponding events to be equal, but Andreae allows more flexibility by doing this secondary matching with respect to a domain-dependent hierarchy of actions, patterns, and conditions. The cycle recognizer described here is more general than Dufay and Latombe's and Andreae's algorithms since it doesn't require as input any known event equivalences. Like Andreae's work, aggregation allows very flexible secondary matching by allowing discrepancies in ordered parameters. However, unlike Dufay and Latombe's and Andreae's work, my cycle recognizer cannot detect loops with embedded conditionals.

7. Conclusion
This paper demonstrates the utility of aggregation in component-based, causal simulation. An aggregator can recognize repeating cycles of processes and generate a continuous process abstraction of the cycle for more powerful reasoning. In particular, aggregation allows a simulator to apply transition analysis to systems that are partially or completely defined in terms of discrete processes. Aggregation is a powerful technique: it works on cycles which contain other cycles, and is domain-independent. Aggregators have few requirements. They need to know which parameters are totally ordered and need access to a detailed history of the simulation. As a result of an experimental implementation, the strengths and limitations of aggregation are beginning to be understood. This theory should be tested by a second implementation. In addition, several open problems remain to be answered. How should cycles be verified? Is there a way to detect all types of process interference?

ACKNOWLEDGMENT
There are many people who supported this work. Randy Davis contributed substantially to the clarity of the ideas. Numerous discussions with Brian Williams, Ken Forbus, and David Chapman improved this work. Mark Shirley, Jerry Roylance, Rick Lathrop, Walter Hamscher, Margaret Fleck, Bruce Donald, Johan de Kleer, John Seely Brown, Dan Brotsky, Mike Brady, Steve Bagley, and Phil Agre read various drafts and provided useful comments. Without the creative and challenging atmosphere of the MIT AI Lab, this work would not have been done.

REFERENCES
1. Allen, J., Maintaining knowledge about temporal intervals, Comm. ACM 26 (11) (1983) 832-843.
2. Allen, J. and Hayes, P., A common-sense theory of time, in: Proceedings Ninth International Joint Conference on Artificial Intelligence, Los Angeles, CA (1985) 528-531.
3. Andreae, P., Constraint limited generalization: Acquiring procedures from examples, in: Proceedings Fourth National Conference on Artificial Intelligence, Austin, TX (1984) 6-10.
4. Bobrow, D.G. (Ed.), Qualitative Reasoning, Artificial Intelligence 24 (1984) Special Issue.
5. Brown, J.S., Burton, R.R. and de Kleer, J., Pedagogical, natural language and knowledge engineering techniques in SOPHIE I, II, and III, in: D. Sleeman and J.S. Brown (Eds.), Intelligent Tutoring Systems (Academic Press, New York, 1982) 227-282.
6. de Kleer, J., Causal and teleological reasoning in circuit recognition, AI-TR-529, MIT AI Lab, Cambridge, MA, 1979.
7. de Kleer, J. and Brown, J.S., Assumptions and ambiguities in mechanistic mental models, Cognitive and Instructional Sciences Series CIS-9, Xerox PARC, Palo Alto, CA, 1982.
8. de Kleer, J. and Brown, J.S., A qualitative physics based on confluences, Artificial Intelligence 24 (1984) 7-83.
9. Dufay, B. and Latombe, J., An approach to automatic robot programming based on inductive learning, in: Robotics Research: The First International Symposium (MIT Press, Cambridge, MA, 1984) 97-115.
10. Fikes, R.E. and Nilsson, N., STRIPS: A new approach to the application of theorem proving to problem solving, Artificial Intelligence 2 (1971) 189-208.
11. Fleck, M., Space with boundaries: The math behind interval MEET, Working Paper, MIT AI Lab, Cambridge, MA, 1986.
12. Forbus, K.D. and Stevens, A., Using qualitative simulation to generate explanations, Tech. Rept. 4490, Bolt Beranek and Newman, Cambridge, MA, 1981.
13. Forbus, K.D., Qualitative reasoning about physical processes, in: Proceedings Seventh International Joint Conference on Artificial Intelligence, Vancouver, BC (1981) 326-330.
14. Forbus, K.D., Qualitative process theory, Artificial Intelligence 24 (1984) 85-168.
15. Forbus, K.D., Qualitative process theory, AI-TR-789, MIT AI Lab, Cambridge, MA, 1984.
16. Forrester, J.W., Industrial Dynamics (MIT Press, Cambridge, MA, 1961).
17. Habermann, A., Introduction to operating system design, Science Research Associates, 1976.
18. Hayes, P., The second naive physics manifesto, in: J. Hobbs and R. Moore (Eds.), Formal Theories of the Commonsense World (Ablex, Norwood, NJ, 1985).
19. Hendrix, G., Modeling simultaneous actions as continuous processes, Artificial Intelligence 4 (1973) 145-180.
20. Kuipers, B., The limits of qualitative simulation, in: Proceedings Ninth International Joint Conference on Artificial Intelligence, Los Angeles, CA, 1985.
21. Kuipers, B., Qualitative simulation of mechanisms, Tech. Rept. TM-274, MIT Laboratory for Computer Science, Cambridge, MA, 1985.
22. Munkres, J.R., Topology, A First Course (Prentice-Hall, Englewood Cliffs, NJ, 1975).
23. Rudin, W., Principles of Mathematical Analysis (McGraw-Hill, New York, 3rd ed., 1976).
24. Weld, D.S., Explaining complex engineered devices, Tech. Rept. 5489, Bolt Beranek and Newman, Cambridge, MA, 1983.
25. Weld, D.S., Switching between discrete and continuous process models to predict genetic activity, AI-TR-793, MIT AI Lab, Cambridge, MA, 1984.
26. Weld, D.S., Combining discrete and continuous process models, in: Proceedings Ninth International Joint Conference on Artificial Intelligence, Los Angeles, CA, 1985.
27. Williams, B.C., Qualitative analysis of MOS circuits, Artificial Intelligence 24 (1984) 281-346.
28. Williams, B.C., The use of continuity in a qualitative physics, in: Proceedings Fourth National Conference on Artificial Intelligence, Austin, TX, 1984.
Received April 1985; revised version received March 1986