A process-grammar for shape


ARTIFICIAL INTELLIGENCE

Michael Leyton

Department of Psychology, Busch Campus, Rutgers University, New Brunswick, NJ 08903, U.S.A.

Recommended by N. Nagao and S. Pizer

ABSTRACT

Inference rules are developed by which process-history can be recovered from natural shapes such as tumors, clouds, and embryos. We argue that the inference of history arises from a newly discovered duality between curvature extrema and symmetry structure. We also develop a formal grammar by which someone who has two views of an entity at two developmental stages can infer the processes that produced the second stage from the first. More specifically, we find that a grammar of only six operations suffices to express the relationship between any two smooth shapes such that one shape is described as the extrapolation of processes inferred in the other under the above inference rules. In fact, a deformation is expressed as a transformation of process-records, a technique reminiscent of Chomsky's description of linguistic transformations in terms of transitions between phrase-structure trees. In the present case, our process-grammar has the psychological role of explaining the curvature extrema in terms of a sequence of psychologically meaningful deformations. Finally, we compare a process-based symmetry analysis, which we introduce in this paper, with other symmetry analyses in the literature; and we compare our process-based grammar with another grammar based on curvature extrema.

1. Introduction

Recently, some researchers have argued that shape is understood as the outcome of the processes that formed it. For example, in [9-14], I have elaborated certain inference rules by which an object is defined as the outcome of a history which is psychologically decomposed into phases where qualitatively different processes were acting. Along different lines, Pentland [20, 21] has proposed a fractal-based view by which an object can be characterized as the result of physical processes that randomly modified its shape through local action.

Process-based descriptions are closely related to prototype-based descriptions. This is because an object, prior to the processes which altered it, is usually understood as a prototype. Rosch [26] has shown that categories of objects are asymmetrically referenced to their prototypes; for example, off-red is referenced to red, leaning objects to the vertical, the number 99 to 100, etc. The reverse references do not happen, e.g. 100 is not referenced to 99. In [9-13, 15] I have attempted to extend this view by showing that reference to a prototype is stratified. For example, in a converging set of psychological experiments, I found that, when subjects are presented with a rotated parallelogram, Fig. 1(a), they reference it to a nonrotated one, Fig. 1(b), which they then reference to a rectangle, Fig. 1(c), which they then reference to a square, Fig. 1(d). These, and several other experimental results, have led me to argue that categories of objects are psychologically stratified into levels corresponding to phases in a formation history. The consequence is that a presented object is assigned a backward history through those phases, and thus through successively more primitive stages of prototypification.

Artificial Intelligence 34 (1988) 213-247. 0004-3702/88/$3.50 © 1988, Elsevier Science Publishers B.V. (North-Holland)

FIG. 1. An example of successive prototypification found by Leyton [9-13, 15].

The present paper will attempt to corroborate this view in a new direction. Whereas the previous analysis in [9-13, 15] looked at process-history recovered from the shape as a whole, in the present paper we shall find that a remarkably rich process-history is recovered from only a few psychologically salient points: the curvature extrema.

What are curvature extrema? First observe that curvature is the amount of bend. For example, the top line in Fig. 2(a) is straight, and thus has 0 curvature. Each successive line downwards in Fig. 2(a) has greater bend, i.e. curvature. Observe also that the final line has a point E where the amount of bend is greater than at any other point on that particular line. Such a point is called a curvature extremum. If the curve bends in one direction the extremum is called a maximum. If the curve bends in the other direction, the extremum is called a minimum. By simple mathematical considerations, minima and maxima have to alternate along the curve. Thus, consider the face in Fig. 2(b). The minima are given by the odd numbers and the maxima by the even numbers.

FIG. 2. (a) The descending curves have successively increasing curvature (bend); the final curve has a curvature extremum at point E. (b) A face with the curvature minima marked with odd numbers and the maxima marked with even numbers.

Curvature extrema are known to have two psychologically salient roles. First, as was demonstrated by Attneave [1], people choose curvature extrema when making a visual summary of a shape. Second, Hoffman and Richards [7] have provided compelling evidence that contours (curves) are psychologically segmented at negative minima. For example, on the face in Fig. 2(b), these extrema are the odd-numbered points. Going up the face, one finds that the successive pairs of these points correspond to the ends of the chin, the ends of the lower lip, the ends of the upper lip, the ends of the nose, and the ends of the forehead; i.e. the ends of the psychological parts of the face. In the present paper, I will argue that curvature extrema have a third very powerful psychological role: They are used to infer process-history.

The paper will be divided into three parts. Part I will consider the inference of process-history from a single shape. It will be argued that such inference takes place by the application of two simple but powerful rules. In Part II, we will consider a different problem: Given two shapes that are known to represent two stages in the development of a particular object (e.g. a tumor, cloud, or embryo), how can one infer the process-history that has occurred in the intervening time? We will solve this problem by introducing a new kind of logical language that we will call a process-grammar. Such a grammar generates the later shape from the earlier one by a viable process-history. Finally, in Part III, we compare a process-based symmetry analysis, introduced in this paper, with other symmetry analyses in the literature; and we compare our process-based grammar with another grammar based on curvature extrema.
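As a rough numerical sketch of the notion of curvature extrema described above, the following Python fragment estimates curvature along a sampled closed contour and lists its local maxima and minima. The function names, the turning-angle estimator, and the ellipse used as a test shape are illustrative assumptions, not part of the paper's method.

```python
import math

def discrete_curvature(points):
    """Signed curvature estimate at each vertex of a closed polygon.

    Assumed estimator: turning angle divided by the mean length of the
    two adjacent edges. Positive values mean anticlockwise turning.
    """
    n = len(points)
    kappa = []
    for i in range(n):
        x0, y0 = points[i - 1]
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        ax, ay = x1 - x0, y1 - y0          # incoming edge
        bx, by = x2 - x1, y2 - y1          # outgoing edge
        turn = math.atan2(ax * by - ay * bx, ax * bx + ay * by)
        mean_len = 0.5 * (math.hypot(ax, ay) + math.hypot(bx, by))
        kappa.append(turn / mean_len)
    return kappa

def curvature_extrema(kappa):
    """Indices and kinds ('max' or 'min') of strict local extrema
    of a cyclic sequence of curvature values."""
    n = len(kappa)
    found = []
    for i in range(n):
        prev, cur, nxt = kappa[i - 1], kappa[i], kappa[(i + 1) % n]
        if cur > prev and cur > nxt:
            found.append((i, "max"))
        elif cur < prev and cur < nxt:
            found.append((i, "min"))
    return found

# Hypothetical test shape: an ellipse with semi-axes 2 and 1,
# traversed anticlockwise.
n = 360
ellipse = [(2.0 * math.cos(2 * math.pi * i / n),
            math.sin(2 * math.pi * i / n)) for i in range(n)]
kappa = discrete_curvature(ellipse)
extrema = curvature_extrema(kappa)
kinds = [kind for _, kind in extrema]
```

On this ellipse the maxima fall at the ends of the long axis and the minima at the ends of the short axis, so the two kinds alternate around the contour.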

PART I. INFERENCE FROM A SINGLE SHAPE

2. The Two Basic Rules

We begin by elaborating rules by which the curvature extrema of an individual shape can be used to infer processes that have acted upon that shape. Throughout the paper, we will assume that the inputs to the rules are shapes represented in the form of planar outlines. However, we shall see that the rules presented in Part I correspond to exactly equivalent rules for the direct analysis of three-dimensional shape. Unless otherwise stated, the term curve or contour will refer, throughout the paper, to a closed smooth planar curve that is not a circle and is without self-intersections.

The inference, from curvature extrema to processes, will be seen as requiring two stages:

    curvature extrema → symmetry axes → processes

The intermediate role of symmetry axes will be crucial. This is because we shall show that significant symmetry axes terminate at curvature extrema, and that these axes are understood as the records of deformational processes. Given the above two-stage inference, Section 2.1 deals with the first stage, curvature extrema → symmetry axes; and Section 2.2 deals with the second stage, symmetry axes → processes.

2.1. Curvature extrema → symmetry axes

As just stated, central to the development of our process-inference rules will be the use of a symmetry analysis. Observe first, however, that, given two segments of a curve, it is rarely the case that a straight axis reflects one segment onto the other. Nevertheless, it may be possible to define symmetry in a differential sense. Consider the two bold curves c1 and c2 in Fig. 3. Although there is no mirror that reflects one curve onto the other, a mirror along line QO reflects the tangent vector, tA, at A, onto the tangent vector, tB, at B. It turns out that the existence of such a symmetry is equivalent to the existence of a circle that is tangential at both A and B. (The mirror will contain the circle center O.)

FIG. 3. Two curves c1 and c2 that have symmetric tangent vectors at A and B. The SAT-axis is the locus of circle centers (represented by the dots shown). The SLS-axis and PISA-axis would be the loci of points P and Q, respectively.

Now drag the circle along the two curves, while always maintaining the double-touching property. As in [15], one can define a differential symmetry axis to be the trajectory of some midpoint associated with the circle. For example, the Symmetric Axis Transform (SAT) of Blum [3] defines the symmetry axis to be the locus of circle centers O. This is represented by the trajectory of dots shown in Fig. 3. Again, the Smooth Local Symmetry (SLS) of Brady [4] defines the symmetry axis to be the locus of chord midpoints P. Alternatively, we shall propose here a new symmetry analysis in which the symmetry axis is the trajectory of the midpoint Q of the arc AB, as the circle moves. As we shall see in Part III, this analysis has remarkably different properties from the other two analyses, and these properties make the analysis particularly appropriate for the inference of processes. Because of its appropriateness to process-inference, the new analysis will be called Process-Inferring Symmetry Analysis (PISA).

It is now possible to state a result that is crucial to the entire paper. It is a theorem that was proposed and proved by Leyton [15], and it relates the curvature extrema of a smooth planar curve to the curve's symmetry structure:

Symmetry-Curvature Duality Theorem (Leyton [15]). Any segment of a smooth planar curve bounded by two consecutive curvature extrema of the same type (either both maxima or both minima) has a unique differential symmetry axis (under SAT, SLS, or PISA) and this axis terminates at the curvature extremum of the opposite type (minimum or maximum, respectively).

Figure 4 illustrates the theorem. Points m1 and m2 are two consecutive extrema of the same type (i.e. they bend in the same direction relative to the curve). The theorem states that, under any of the alternative differential symmetry analyses, there is one and only one symmetry axis. Furthermore, this axis terminates at the extremum M, of the opposite type (this extremum bends in the opposite direction).¹

FIG. 4. An illustration of the Symmetry-Curvature Duality Theorem.

Observe, finally, that the Symmetry-Curvature Duality Theorem can be regarded as an inference rule that assigns to each curvature extremum the unique symmetry axis that terminates at that extremum. For example, consider Fig. 5. It has eight extrema. Thus, under the theorem, there are eight unique axes associated with, and terminating at, those extrema. These axes are represented here by the dotted lines shown.

FIG. 5. Eight axes corresponding to eight extrema.

¹ We should note that the final circle in the trajectory of circles is tangential to only M. This means that it can be arbitrarily small. Thus, letting the circle shrink to zero radius, the symmetry point reaches the extremum even in the case of the SAT. In the SLS and PISA, the axis reaches the extremum even without shrinking the last circle.

2.2. Symmetry axes → processes

The value, to our argument, of identifying symmetry axes is given by the following crucial principle, which was proposed and extensively corroborated by Leyton [9-14, 16]:

Interaction Principle (Leyton [9]). The symmetry axes of a perceptual organization are interpreted as the principal directions along which processes are most likely to act or have acted.

The basis for this proposal is as follows: It was argued by Leyton [10] that if a transformation, acting on an organization, is one in which symmetry axes become invariant lines (eigenspaces) under the transformation, then the transformation will tend to preserve the symmetries; i.e. be structure-preserving on the organization. Two further stages of argument are then needed to obtain the above principle: (1) invariant lines are interpreted as principal directions of action, and (2) transformations that are most structure-preserving tend to be understood as most likely. The above principle has been psychologically corroborated on simple and complex shape, as well as motion perception [10-14].

The Interaction Principle² can be regarded as an inference rule. It claims that a symmetry axis is interpreted as the principal direction of a hypothesized process. In fact, since we are concerned, in the present paper, with the inference of processes that have taken place, the principle implies that a symmetry axis is interpreted as a record of a process.

2.3. Curvature extrema → processes

Now let us fit together the two inference rules given thus far. The first rule, the Symmetry-Curvature Duality Theorem, states that, for any extremum, E, there is a unique symmetry axis that is associated with, and terminates at, E. The second rule, the Interaction Principle, claims that symmetry axes are interpreted as records of processes. The two rules can be combined thus:

Each curvature extremum implies a process whose trace is the unique symmetry axis associated with, and terminating at, that extremum.

To illustrate: Consider Fig. 5 again. Under the rule just given, the axes are interpreted as the traces or records of processes that have acted. For example, each protrusion is explained as the result of pushing out the boundary in the direction of the associated axis. Again, each indentation is explained as the result of pushing in the shape along the associated axis. Thus the processes explain the extrema. Therefore, implicit in this analysis is a concept that we will now make explicit in the form of a rule:

Asymmetry Rule. Processes are understood as creating greater curvature variation; i.e. greater information in the sense of mathematical information theory.

The identification of curvature variation with information can be understood first intuitively in terms of the traditional identification of information with variety. However, more rigorously, Resnikoff [23] has recently shown that the measurement of curvature variation has the same form as the measurement of information in information theory.

It is important to observe that, given a smooth shape, if one uses the Asymmetry Rule to extrapolate backward in time, the ultimate starting point for the shape must have been a circle, because the latter has the least amount of curvature variation; i.e. information. One can now see why our discussion has been restricted to smooth shapes: The backward use of the Asymmetry Rule yields ultimately a circle only for shapes without corners; i.e. where curvature is defined at each point.

² The Interaction Principle is described as such because it determines how the stimulus structure interacts with the structure of allowable transformations of the organization.

3. Application of the Inference Rules

We will now see that our two inference rules yield highly appropriate process-analyses for a large collection of shapes. Let us consider all possible shapes that have eight extrema or fewer. Drawings of all such shapes have been provided by Richards, Koenderink and Hoffman [25]. In Fig. 6, we have taken all these drawings and applied our two inference rules. The lines with arrows represent the process-records inferred by our rules. As can readily be seen, the results correspond strongly with intuition.

FIG. 6. The application of the process-inference rules to all shapes which have up to eight extrema. (Panels: Level I, shapes P1 to P3; Level II, shapes T1 to T6; Level III, shapes Q1 to Q12.)

Let us call a shape, together with the set of processes inferred on it under the two inference rules, a process-diagram. Surveying the process-diagrams in Fig. 6, another important feature emerges: Purely structural considerations yield strong semantic constraints, as we shall now see.

First of all, note that the curvature variation along a contour can be represented by a function of the type shown in Fig. 7. (Position along the contour is represented by the horizontal axis, and curvature by the vertical axis.) Let us now represent the extrema of such a function in this way: Let M and m denote a local maximum and local minimum respectively, and + and - denote positive and negative curvature respectively.³ Then there are four types of extrema: M⁺, m⁻, m⁺, and M⁻, as illustrated in Fig. 7. Now turn back to Fig. 6. All the extrema on all the shapes have been labeled according to this classification.

It is important now to observe the following result: In surveying the shapes in Fig. 6, it becomes evident that the four types of extrema, given by the above purely structural characterization, correspond to semantic terms that people use to classify processes. The correspondence is as follows:

Semantic Interpretation Rule.

    M⁺    protrusion
    m⁻    indentation
    m⁺    squashing
    M⁻    internal resistance
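The labeling convention and the Semantic Interpretation Rule above can be sketched as a small lookup. This is only an illustration under stated assumptions: the function names are hypothetical, the labels are written in ASCII (`M+` for M⁺, `m-` for m⁻, and so on), and an extremum is assumed to be given by its kind (local maximum or minimum of the signed curvature function) together with the curvature value at that point.

```python
# Semantic Interpretation Rule as a lookup table (ASCII labels).
SEMANTIC_INTERPRETATION = {
    "M+": "protrusion",
    "m-": "indentation",
    "m+": "squashing",
    "M-": "internal resistance",
}

def label_extremum(kind, curvature):
    """Return 'M+', 'm-', 'm+' or 'M-' for an extremum.

    kind      -- 'max' or 'min' of the signed curvature function
    curvature -- signed curvature at the extremum (must be nonzero)
    """
    if kind not in ("max", "min"):
        raise ValueError("kind must be 'max' or 'min'")
    if curvature == 0:
        raise ValueError("the sign is undefined at a zero of curvature")
    letter = "M" if kind == "max" else "m"
    sign = "+" if curvature > 0 else "-"
    return letter + sign

def interpret(kind, curvature):
    """Semantic term associated with the extremum."""
    return SEMANTIC_INTERPRETATION[label_extremum(kind, curvature)]
```

For instance, `interpret("min", 0.25)` gives squashing, while `interpret("min", -0.25)` gives indentation: the same kind of extremum is read differently depending on the sign of the curvature.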

The Semantic Interpretation Rule will be useful to us later in interpreting the formal grammar to be derived in Part II of the paper.

FIG. 7. An illustration of our labeling convention for curvature extrema. (Vertical axis: curvature.)

³ Our conventions for the specification of curvature on a planar curve are as follows: (1) In traversing (going around) a curve, positive curvature is the rate of anticlockwise rotation, and negative curvature is the rate of clockwise rotation. (2) The direction chosen for traversing a curve is that which keeps the inside of the figure to the left of the curve and the outside to the right.

We conclude this section by adding two further comments related to different issues. First, the symmetry analysis used in Fig. 6 was PISA, the new analysis introduced in Section 2.1. In Part III, we shall consider more deeply the advantages of using this analysis, and its relation to default reasoning.

Second, Fig. 6 is stratified into levels according to number of extrema. The reader might wonder why, in the set of shapes with up to eight extrema, there are only three levels. To answer this, first observe that any process-diagram must have an even number of extrema because maxima and minima alternate, and the contours which we are considering are closed. Furthermore, recall that the Four-Vertex Theorem, in differential geometry, states that any closed smooth curve of the kind considered here that is not a circle must have at least 4 curvature extrema (Do Carmo [6, p. 37]). Thus, any shape with 8 extrema or fewer must have either 4, 6, or 8 extrema. These are the three levels in Fig. 6.
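The counting argument above (an even number of extrema, and at least four) can be written out in a few lines. The function name and the pairing with the level names of Fig. 6 are illustrative assumptions.

```python
def admissible_extremum_counts(limit):
    """Numbers of extrema a process-diagram can have, up to `limit`.

    Two constraints from the text: the count is even (maxima and minima
    alternate on a closed contour) and at least 4 (Four-Vertex Theorem).
    """
    return [n for n in range(4, limit + 1) if n % 2 == 0]

# The three levels of Fig. 6, keyed by level name.
levels = dict(zip(["I", "II", "III"], admissible_extremum_counts(8)))
```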

4. Three-Dimensional Analysis

Although, in this paper, we will be considering shape information presented only in two-dimensional form, we shall observe here that the two basic inference rules (the Symmetry-Curvature Duality Theorem and the Interaction Principle) also solve the three-dimensional situation. Two alternative routes are possible: We can use the two-dimensional rules directly on three-dimensional shape; or we can apply three-dimensional versions of the rules. The two alternatives are considered in turn.

The two-dimensional rules can be applied to three-dimensional shape where the shape has been represented by a one-parameter family of smooth planar cross-sections (as in generalized cylinder representations; see [2]). Consider the bone in Fig. 8. As one moves from left to right along the bone, the cross-section alters shape. For example, in Fig. 8, the cross-section, initially a circle, deforms into curves with a greater and greater number of extrema. In the general case, extrema can both be introduced and disappear along the shape. Now using the above two inference rules, each cross-section can be converted into a process-diagram as in Fig. 6. Furthermore, putting the cross-sections together, the processes form sheets through the shape. These sheets smoothly appear and disappear. Finally, observe that since the cross-section with the least number of extrema is the circle, the entire shape is interpreted as a set of actions on a cylinder.

FIG. 8. A bone parameterized as a generalized cylinder.

The type of analysis just described is useful in many situations; e.g. in recovering process-history from tree trunks. However, in many other situations one may wish to apply rules directly to the three-dimensional shape, rather than via sectional slices. Remarkably, our two basic rules have exact equivalents in three dimensions. First, a three-dimensional version of the Symmetry-Curvature Duality Theorem has recently been proved, and it states that, at each curvature extremum on a principal line of curvature, there is a unique symmetry sheet that terminates at that extremum (Yuille and Leyton [28]). Second, the Interaction Principle, of course, states that the symmetry sheets are process-records.⁴

PART II. INFERENCE OF INTERVENING HISTORY

5. Introduction

Part I of this paper elaborated rules by which processes can be inferred from a particular individual shape. However, in many situations, one is presented not just with a single shape but with a pair of shapes that are supposed to represent an object (e.g. a tumor or cloud) at two successive stages of development. The problem one is faced with is that of inferring what happened between the two stages. We now present a solution to this problem. What we will do is to develop a new kind of object which we will call a process-grammar. Such a grammar will generate the later shape from the earlier one by a plausible process-history.

It is important to observe the following. We have two shapes, which we can call the earlier and the later shape. The later shape must be explained in terms of the earlier one. Therefore, the later shape must be understood, as much as possible, as the outcome of what can be seen in the earlier one. That is, the later shape should be explained, as much as possible, as an extrapolation of the process-structure of the earlier shape.

Observe however that we have a very precise notion of what a process-structure is. It is a process-diagram: an outline together with a set of lines that represent the process-records. Thus the later shape must be understood, as much as possible, as the extrapolation of process-records in the earlier shape. Let us therefore, as a simple first cut, partition all process-extrapolations into two types:

(1) process-continuations,
(2) process-bifurcations.

Now recall that, in a process-diagram, any process-record terminates at an extremum. Thus our problem reduces to the following: We simply have to go through each of the four types of extrema, M⁺, m⁻, m⁺, M⁻, and elaborate the possible process-continuations and process-bifurcations that can occur. Section 5.1 elaborates the continuations, and Section 5.2 elaborates the bifurcations.

⁴ Nackman [18] and Nackman and Pizer [19] have developed a three-dimensional SAT.

5.1. Continuation

5.1.1. Continuation at M⁺ and m⁻

We begin by examining any one of the M⁺ extrema in Fig. 9. Note that, in accord with the Semantic Interpretation Rule (Section 3), it is the tip of a protrusion. Now suppose that the associated process continued; that is, suppose that the boundary were pushed further along the direction of the process. The M⁺ extremum would remain an M⁺ extremum. That is, continuation at M⁺ is structurally trivial. This, of course, simply means that the continuation of a protrusion must remain a protrusion.

Similarly, consider the m⁻ extremum in Fig. 9. Note that, in accord with the Semantic Interpretation Rule (Section 3), it is the endpoint of an indentation. Now suppose that the associated process continued; that is, suppose that the boundary were pushed further along the direction of the process. The m⁻ extremum would remain an m⁻ extremum. That is, continuation at m⁻ is structurally trivial. This simply means that the continuation of an indentation must remain an indentation.

FIG. 9. Continuations at M⁺ and m⁻ must be trivial with respect to curvature extrema.

Thus, of the four types of extrema, continuation at M⁺ and m⁻ is structurally trivial, and thus need not be considered. However, this is not true of continuation at the other two types of extrema, which we will now consider.

5.1.2. Continuation at m⁺

Consider the top m⁺ extremum in the shape T1 in Fig. 10. In accord with the Semantic Interpretation Rule, the extremum is the result of squashing. Continuation of the process at this extremum pushes the boundary inward until an indentation is introduced, as shown in the top of T2. That is, the m⁺ extremum in T1 has changed into an m⁻ extremum in T2. Furthermore, simple calculus considerations require two zeros of curvature (flat points) to be introduced around the m⁻ at the top of T2. These are represented by the dots on either side of the m⁻. Thus the developmental situation in Fig. 10 is completely specified by a simple rewrite rule on the discrete strings of extrema involved; i.e. the extremum m⁺ is replaced by the string 0m⁻0, where the zeros are the zeros of curvature. We shall label this rewrite rule Cm⁺, meaning Continuation at m⁺. That is, the rule is:

    Cm⁺: m⁺ → 0 m⁻ 0.

FIG. 10. An illustration of continuation at m⁺.

Observe that, although the situation is specified by a formal rewrite rule in discrete strings of extrema, the situation can be understood quite intuitively, using our Semantic Interpretation Rule, as: squashing continues till it indents.

5.1.3. Continuation at M⁻

Consider the shape T3 on the left of Fig. 11. It should be observed that our symmetry analysis, PISA (Section 2.1), gives a subtle process-structure for the shape. PISA reveals that the structure of the top half of the shape is not an ordinary indentation; it is an indentation that has involved a source of internal resistance. The internal resistance is given by the upward process terminating at M⁻. Thus the shape could be the outline of an island in which there has been an inflow of water, which has been resisted at M⁻ by a ridge of mountains. The result is the formation of a bay rather than a mere inlet.

FIG. 11. An illustration of continuation at M⁻.

Now let us consider the continuation of the resistance at M⁻. Continuation of this process pushes the contour upward until it bursts out and becomes a protrusion, as shown at the top of shape T4, on the right of Fig. 11. Thus, returning to our island example, there might have been a volcano, in the ridge of mountains, that erupted, sending lava down into the sea.

Structurally, the situation is simply this. The initial M⁻, in the bottom of the bay in the left-hand shape, is replaced by the M⁺ at the tip of the protrusion in the right-hand shape. Simultaneously, the mathematics requires the introduction of a zero of curvature on either side of the new M⁺. Thus the developmental situation is specified by replacing M⁻ by 0M⁺0. This transition will be labeled CM⁻, meaning Continuation at M⁻. That is, we have the following rewrite rule:

    CM⁻: M⁻ → 0 M⁺ 0.

Observe again that, although this is a rule on discrete symbols, we can use our Semantic Interpretation Rule to give it an intuitive description as follows: internal resistance continues until it protrudes.
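The two continuation rules above can be sketched as string rewriting on a list of extremum labels. This is a minimal illustration: the function name, the ASCII labels (`m+` for m⁺, `M-` for M⁻, `0` for a zero of curvature), and the four-symbol example string are assumptions made for the sketch and are not taken from the figures.

```python
# Structurally significant continuations (ASCII labels).
CONTINUATION_RULES = {
    "m+": ["0", "m-", "0"],   # Cm+: squashing continues till it indents
    "M-": ["0", "M+", "0"],   # CM-: internal resistance continues until it protrudes
}

EXTREMA = ("M+", "m-", "m+", "M-")

def continue_process(extrema, index):
    """Apply continuation to the extremum at `index`.

    Continuation at M+ or m- is structurally trivial, so the string is
    returned unchanged in those cases.
    """
    symbol = extrema[index]
    if symbol not in EXTREMA:
        raise ValueError("no process-record terminates at %r" % symbol)
    replacement = CONTINUATION_RULES.get(symbol, [symbol])
    return extrema[:index] + replacement + extrema[index + 1:]

# Hypothetical string of extrema read around a closed contour.
shape = ["M+", "m+", "M+", "m+"]
indented = continue_process(shape, 1)   # Cm+ at the first m+
unchanged = continue_process(shape, 0)  # trivial continuation at M+
```

Here `indented` is `["M+", "0", "m-", "0", "M+", "m+"]`: the squashed side has become an indentation flanked by two zeros of curvature.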

5.2. Bifurcation

We have shown above that there are only two forms that continuation can take and be structurally significant on extrema. We shall now derive the only forms that bifurcation can take. The bifurcation of a process can be regarded as the splitting of the associated extremum E into two copies of the same extremum. For mathematical reasons, an intervening extremum, e, must necessarily be introduced; thus giving the rewrite rule

    E → EeE.

If E is a maximum (or minimum), then e is necessarily a minimum (or maximum, respectively). Furthermore, e can be either the same sign as E or the opposite sign. However, the case of the opposite sign can be regarded as the case of the same sign followed by one of the two continuation operations defined above. Therefore, we need consider only cases where e has the same sign. We can now conclude that, since there are only four types of extrema, there can be only four rules, as follows. They are all of the above form, E → EeE, where e is uniquely determined by E, because e has the same sign as E and is a maximum (minimum) if E is a minimum (maximum). We consider the four rules, as follows.
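The derivation above determines the intervening extremum e mechanically from E, which the following sketch makes explicit. The function names and ASCII labels (`M+` for M⁺ and so on) are illustrative assumptions; the last lines show how the opposite-sign case reduces to a same-sign bifurcation followed by a continuation.

```python
EXTREMA = ("M+", "m-", "m+", "M-")

def intervening_extremum(symbol):
    """The extremum e in E -> EeE: opposite kind, same sign as E."""
    if symbol not in EXTREMA:
        raise ValueError("unknown extremum label %r" % symbol)
    letter, sign = symbol[0], symbol[1]
    return ("m" if letter == "M" else "M") + sign

def bifurcation_rule(symbol):
    """Right-hand side of the bifurcation rule for E."""
    return [symbol, intervening_extremum(symbol), symbol]

# The four bifurcation rules, one per type of extremum.
BIFURCATION_RULES = {s: bifurcation_rule(s) for s in EXTREMA}

# Opposite-sign case for E = M+: first the same-sign bifurcation,
# then continuation at the new m+ (m+ -> 0 m- 0).
same_sign = bifurcation_rule("M+")
opposite_sign = same_sign[:1] + ["0", "m-", "0"] + same_sign[2:]
```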

5.2.1. Bifurcation at M⁺

The symbol BM⁺ will be used to mean Bifurcation at M⁺. Observe that the above considerations allow us to specify BM⁺, in advance, as:

    BM⁺: M⁺ → M⁺ m⁺ M⁺.

To illustrate this rule, consider the shape T4, on the left of Fig. 12. At the top of T4, there is a protrusion process terminating at the M⁺ extremum. If this process bifurcates, one branch goes to the left and the other goes to the right. These branches therefore become the leftward and rightward processes in the upper lobe of Q6 (the right-hand shape in Fig. 12). Note that the extrema at the ends of the two branches remain M⁺, in accord with the rewrite rule. Observe, once again, that although the rule is simply a formal rewrite rule on discrete strings of extrema, it can be intuitively understood as meaning: a nodule turns into a lobe.

5.2.2. Bifurcation at m⁻

We shall use the symbol Bm⁻ to mean Bifurcation at m⁻. Observe that the earlier considerations allow us to specify Bm⁻, in advance, as:

    Bm⁻: m⁻ → m⁻ M⁻ m⁻.

One can illustrate this rule in the following way: Consider the shape P2, on the left of Fig. 13. At the top of P2 there is an indentation process terminating at the m⁻ extremum. If this process bifurcates, one branch goes to the left and the other goes to the right. These branches therefore become the leftward and rightward processes in the bay of T3 (the right-hand shape in Fig. 13). Note that the extrema at the ends of the branches remain m⁻, in accord with the rewrite rule.

M+

M + O M

+

..,,_.



M+

M + ~ M

t T4 FIG. 12. An illustration of bifurcation at M ÷.

t Q6

+

228

M. L E Y T O N

M+ M+

t

t

P2

T3

FIG. 13. A n i l l u s t r a t i o n of b i f u r c a t i o n at m .

One should observe, once again that, although the rule is simply a formal rewrite rule on discrete strings of extrema, it can be intuitively understood as meaning: an inlet turns into a bay. 5.2.3. B i f u r c a t i o n at m + The symbol B m + will be used to mean B i f u r c a t i o n at m +. Again, we can specify B m +, in advance, as: B m + : m * --> m+ M +rn + .

To illustrate this rule, consider the shape P1, on the left of Fig. 14. At the top of this shape there is a downward squashing process terminating at the m+ extremum. If this process bifurcates, the m+ extremum splits and becomes the two m+ extrema on either side of the shape T1 (the right-hand shape in Fig. 14). Observe that, for mathematical reasons, a M+ extremum is necessarily introduced in between these extrema, i.e. at the top of T1. Therefore, because a M+ extremum means a protrusion, the above rewrite rule on extrema can be intuitively characterized as: a protrusion is introduced.

FIG. 14. An illustration of bifurcation at m+ (shapes P1 and T1).

FIG. 15. An illustration of bifurcation at M- (shapes T3 and Q3).

5.2.4. Bifurcation at M-

We shall use the symbol BM- to mean Bifurcation at M-. Again, we can specify BM-, in advance, as:

BM-: M- → M- m- M-.

To illustrate this rule, consider the shape T3, on the left of Fig. 15. In the center of the bay in T3, there is an upward resisting process terminating at the M- extremum. If this process bifurcates, the M- extremum splits and becomes the two M- extrema on either side of the lagoon in Q3 (the right-hand shape in Fig. 15). Observe that, for mathematical reasons, a m- extremum is necessarily introduced in between these extrema, i.e. at the bottom of the lagoon. Therefore, because a m- extremum means an inlet, the above rewrite rule on extrema can be intuitively characterized as: an inlet is introduced.

6. The Formal and Intuitive Simplicity of the Grammar

The above discussion has shown that, remarkably, only six operations (two continuation operations and four bifurcation operations) are required to generate all possible process extrapolations. In fact, the six operations together form a grammar that allows the relationship between any two shapes to be expressed generatively in terms of process-extrapolations; i.e. such that the process-diagram of one shape is transformed into the process-diagram of the other via the successive extrapolation of the first. The complete grammar is shown on the left-hand side of Table 1.

TABLE 1. The process-grammar.

Rewrite rules              Semantic interpretation
Cm+: m+ → 0 m- 0           Squashing continues till it indents.
CM-: M- → 0 M+ 0           Internal resistance continues till it protrudes.
BM+: M+ → M+ m+ M+         A protrusion bifurcates; e.g. a nodule becomes a lobe.
Bm-: m- → m- M- m-         An indentation bifurcates; e.g. an inlet becomes a bay.
Bm+: m+ → m+ M+ m+         A protrusion is introduced.
BM-: M- → M- m- M-         An indentation is introduced.

Observe also that, although the operations are expressed purely in terms of formal rewrite rules on discrete strings of extrema, each rewrite rule captures a situation that is perceptually very meaningful. That is, the six rules can be semantically described as the six situations on the right-hand side of Table 1.

7. The Process Stratification of Shape-Space

The concluding sections of Part II show some of the insights that can be gained by using the grammar. First, we investigate how the grammar organizes shape-space. Recall that Fig. 6 stratifies shape-space into levels, where each level is the number of extrema. Examining Fig. 6, one might wish to ask whether shape-space has any further structure than this. Let us consider what organization is induced by the process-grammar.

It might be thought that, if one interlinked the shapes by all possible uses of the grammatical operations, one would obtain a very disorganized network. In fact, the opposite is true. One obtains the highly structured system shown in Fig. 16. Close examination of this system reveals that it consists of six intersecting strata-systems where each system is a set of parallel planes. One stratification is that which we had before. It is the descending system of horizontal planes, where each plane corresponds to a level in Fig. 6. However, there are five other strata-systems which the grammar now reveals. Figure 17 shows Fig. 16 six times, each time revealing a different strata-system. For example, Fig. 17(a) shows the sequence of planes that are obtained by the successive iteration of the grammatical operation Cm+; Fig. 17(b) shows a sequence of planes that are elaborated by a successive iteration of the operations BM+ and Bm+; Fig. 17(c) shows a sequence of planes elaborated by a successive iteration of the operations BM- and Bm-; etc.
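The six rewrite rules of Table 1 can be sketched as plain string rewriting. The following is a minimal illustration only, not the author's implementation: the list representation, the function names `apply_rule` and `level`, and the use of the string M+ m+ M+ m+ for an ellipse are assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Rewrite rules of Table 1; "0" marks a zero of curvature.
# The dictionary layout is an assumption made for this sketch.
RULES: Dict[str, Tuple[str, List[str]]] = {
    "Cm+": ("m+", ["0", "m-", "0"]),
    "CM-": ("M-", ["0", "M+", "0"]),
    "BM+": ("M+", ["M+", "m+", "M+"]),
    "Bm-": ("m-", ["m-", "M-", "m-"]),
    "Bm+": ("m+", ["m+", "M+", "m+"]),
    "BM-": ("M-", ["M-", "m-", "M-"]),
}


def apply_rule(shape: List[str], op: str, index: int) -> List[str]:
    """Rewrite the extremum at `index` using operation `op`."""
    target, replacement = RULES[op]
    if shape[index] != target:
        raise ValueError(f"{op} needs {target} at {index}, found {shape[index]}")
    return shape[:index] + replacement + shape[index + 1:]


def level(shape: List[str]) -> int:
    """Number of curvature extrema; zeros of curvature are not counted."""
    return sum(1 for symbol in shape if symbol != "0")


# Hypothetical example: an ellipse written as a cyclic string of extrema.
ellipse = ["M+", "m+", "M+", "m+"]
squashed = apply_rule(ellipse, "Cm+", 1)  # squashing continues till it indents
bumped = apply_rule(ellipse, "Bm+", 1)    # a protrusion is introduced
```

In this sketch the continuation leaves the number of extrema unchanged while the bifurcation adds two, which is the level structure that the horizontal planes of the stratification record.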

FIG. 16. The strong organization of shape-space yielded by the process-grammar.


FIG. 17. The six stratifications of shape-space yielded by the process-grammar.

Those of us working in shape perception have not previously suspected that shape-space is so highly structured. Nevertheless, the stratifications embody the psychologically meaningful phenomenon of process-extrapolation.

8. The Multi-Resolution Heuristic

The purpose of the grammar is to bridge any two shapes with a plausible developmental history. The two shapes would be two nodes in the network in Fig. 16. The developmental history would be some path through the network bridging the two points. It is clear that, due to commutativities in the network, the path is nonunique. In fact, the commutativities are fewer than one might first suspect, because there is no way to move inward into the network, along any of the horizontal planes. Nevertheless, there are commutativities. We therefore need a further rule to constrain the order in which the grammatical operations are applied in defining a developmental history.

The rule we suggest is as follows: Observe that, when one gradually blurs a shape, smaller processes (e.g. smaller indentations and protrusions) will disappear before larger ones. This significantly orders the processes. Has this order any relationship to the order required in the last paragraph? In other words, is it possible to substantiate a correspondence between blurring order and developmental order? An affirmative answer is given by the following heuristic:


FIG. 18. Two shapes to be linked by the grammar (T6 and Q5).

Size-is-Time Heuristic. Size corresponds to time. That is, later processes have had a shorter time to develop than earlier ones and are thus smaller.

This rule is only a heuristic because it clearly has exceptions. Nevertheless, it is sufficiently valid to make it a useful means of constraining search through the path-space between two shapes.

9. Application of the Grammar

Let us now illustrate the intuitive power of the grammar to yield process-explanations. Let us choose two arbitrary shapes, for example, the pair shown in Fig. 18. The assumption is that the two shapes are two stages in the development of the same object; e.g. a tumor, cloud, island, embryo, etc. We now show that the process-grammar gives a compelling account of the intervening development.

The first step is to locate the two shapes, T6 and Q5, in the state-transition diagram, Fig. 16. One then identifies a path between them, using the Size-is-Time Heuristic to constrain the path-space. Let us suppose that the appropriate path is T6 → T5 → Q7 → Q5. This sequence of shapes is shown in Fig. 19. The network also reveals that the sequence of grammatical operations is CM- * BM+ * Cm+. The sequence is the process-explanation for the intervening development. Using our semantic rules, the explanation can now be given as follows:

(1) One particular process turns out to be crucial to the entire development.



FIG. 19. A process-history between the two shapes in Fig. 18 (T6, T5, Q7, Q5).


It is the internal resistance represented by the bold upward arrow in the first shape of Fig. 19 (the arrow terminating at M-).
(2) This continues upward and creates the protrusion in the second shape of Fig. 19.
(3) This same process then bifurcates, creating the lobe in the third shape of Fig. 19, where a downward squashing process has also been introduced from above.
(4) The new squashing process continues, creating the top indentation shown in the fourth shape of Fig. 19.

10. Comments

(1) The inside of a closed curve, e.g. Fig. 20(a), can be either solid, as in Fig. 20(b), or a hole, as in Fig. 20(c). That is, a curve on its own is ambiguous as to figure and ground. Since we are interested, in the present paper, in defining operations only with respect to extrema, we observe that the figure and ground of a curve can be reversed by interchanging the extrema labels thus:

M+ ↔ m-,   M- ↔ m+.

This interchange operation will be called the Duality Operation, D. Note that, while the operation changes the figure-ground relationship, it will not alter the shape of the curve itself. Now observe that, because the process-diagram is the collection of (directed) symmetry axes obtained from only the curve information, the process-diagram must be invariant under the Duality Operation. Thus, for example, consider the upward arrow in Fig. 20(d). It must appear in exactly the same place in the process-diagram of the solid object, Fig. 20(b), and in the process-diagram of the hole, Fig. 20(c). In the former, it is internal resistance within the solid; in the latter, it is squashing against the inside wall of the hole.

FIG. 20. (a) A curve is ambiguous as to (b) figure or (c) ground. (d) However, the inference rules yield a process-record in the same place for both interpretations.


Generally, of course, the Duality Operation permutes the effect of the Semantic Interpretation Rule in this way:

protrusion ↔ indentation,   squashing ↔ internal resistance.

Now observe that the Duality Operation performs the following permutation on the process-grammar:

Cx ↔ CD(x),   Bx ↔ BD(x)   (where x is any extremum).

Finally, observe that the Duality Operation is commutative over composition of the grammatical operations. For example,

D(CM- * Cm+ * Bm+ * CM-) = D(CM-) * D(Cm+) * D(Bm+) * D(CM-).
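The action of the Duality Operation on extrema, on operation names, and on a composed sequence of operations can be checked mechanically. The sketch below is illustrative only: the dictionary representation and the helper names (`dual_string`, `dual_op`, `dual_sequence`, `rules_closed_under_duality`) are assumptions introduced for the example, and the rule table simply restates the six rewrite rules of the process-grammar.

```python
from typing import Dict, List, Tuple

# Duality Operation D on extremum labels; a zero of curvature is left fixed.
DUAL: Dict[str, str] = {"M+": "m-", "m-": "M+", "M-": "m+", "m+": "M-", "0": "0"}

# The six rewrite rules of the process-grammar, restated so the block runs alone.
RULES: Dict[str, Tuple[str, List[str]]] = {
    "Cm+": ("m+", ["0", "m-", "0"]),
    "CM-": ("M-", ["0", "M+", "0"]),
    "BM+": ("M+", ["M+", "m+", "M+"]),
    "Bm-": ("m-", ["m-", "M-", "m-"]),
    "Bm+": ("m+", ["m+", "M+", "m+"]),
    "BM-": ("M-", ["M-", "m-", "M-"]),
}


def dual_string(extrema: List[str]) -> List[str]:
    """Apply D label by label (figure-ground reversal of a string of extrema)."""
    return [DUAL[e] for e in extrema]


def dual_op(op: str) -> str:
    """D(Cx) = CD(x) and D(Bx) = BD(x)."""
    return op[0] + DUAL[op[1:]]


def dual_sequence(ops: List[str]) -> List[str]:
    """Apply D to each operation of a composed history, in order."""
    return [dual_op(op) for op in ops]


def rules_closed_under_duality() -> bool:
    """Check that dualizing both sides of each rule gives the rule of the dual operation."""
    for op, (target, replacement) in RULES.items():
        dual_target, dual_replacement = RULES[dual_op(op)]
        if DUAL[target] != dual_target or dual_string(replacement) != dual_replacement:
            return False
    return True


history = ["CM-", "Cm+", "Bm+", "CM-"]
dual_history = dual_sequence(history)
```

For the example history in the text, the dual history in this sketch is Cm+ * CM- * BM- * Cm+, obtained operation by operation.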

(2) The stratifications in Fig. 16 have interesting consequences for the accessibility of shapes via the operations. One consequence is as follows: Observe that, according to the Asymmetry Rule, the least deformed shape in any level must be the shape with no negative curvature; i.e. P1, T1, Q1, etc. (see Fig. 6). However, this does not mean that all the other shapes on a particular level are derivable from the least deformed shape on that level. For example, by consulting Fig. 16, we see that T3 and T6 are not derivable from T1 (the least deformed shape on the second level); and we see that Q2, Q3, Q9, Q10, Q11, Q12 are each not derivable from Q1 (the least deformed member on the third level), etc. Nevertheless, we can see that all shapes are derivable from the least deformed shape, P1, in the entire hierarchy.

(3) The grammar is, of course, applicable not only to curvature but to any single-valued function on the real line. In particular, the smoothing of any such function can be discretely represented by the grammar. For example, Witkin [27] defines the scale-space image of a one-dimensional signal as its successively smoothed versions, under Gaussian convolution. (See also Koenderink and van Doorn [8], Mokhtarian and Mackworth [17], Pizer, Oliver and Bloomberg [22], and Yuille and Poggio [29].) Witkin also discretizes the scale-space by extracting a tree that represents the history of the extrema as they are formed through successively finer scales. The tree however is presented simply as a composite of lines. Our grammar has the advantage of representing the tree as a discrete piece of algebra.
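The relation between smoothing and extrema mentioned in comment (3) can be illustrated numerically. The sketch below is not Witkin's implementation: the sampled test signal, the truncation of the Gaussian kernel at three standard deviations, the cyclic boundary handling, and the function names are all assumptions made for the example.

```python
import math
from typing import List


def gaussian_kernel(sigma: float) -> List[float]:
    """Sampled Gaussian, truncated at 3*sigma and normalized to sum to 1."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_cyclic(signal: List[float], sigma: float) -> List[float]:
    """Convolve a cyclic signal (e.g. a function on a closed curve) with a Gaussian."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    return [
        sum(kernel[j + radius] * signal[(i + j) % n] for j in range(-radius, radius + 1))
        for i in range(n)
    ]


def count_extrema(signal: List[float]) -> int:
    """Count strict local maxima and minima of a cyclic sequence."""
    n = len(signal)
    count = 0
    for i in range(n):
        prev, cur, nxt = signal[i - 1], signal[i], signal[(i + 1) % n]
        if (cur > prev and cur > nxt) or (cur < prev and cur < nxt):
            count += 1
    return count


# Hypothetical signal: a coarse oscillation carrying a small, fine ripple.
n = 128
signal = [math.sin(2 * math.pi * 2 * i / n) + 0.2 * math.sin(2 * math.pi * 16 * i / n)
          for i in range(n)]
before = count_extrema(signal)
after = count_extrema(smooth_cyclic(signal, sigma=4.0))
```

Here the fine ripple is removed by the blur before the coarse oscillation is affected, so `after` is smaller than `before`. Read in the direction of successively finer scales, each such pair of extrema appearing corresponds to the kind of event that the bifurcation operations describe discretely.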


PART III. COMPARISONS

In the third and final part of the paper, we take two crucial components of our system and compare them with current approaches to shape-description.

The first component is the symmetry analysis. Recall that the symmetry analysis is crucial because it yields the process-records. We shall argue that, although current symmetry analyses solve the important problem of representing shape in terms of generalized ribbons (i.e. swept-out figures), they are less appropriate for the rather different problem of process-representation. We shall find that the analysis, PISA, introduced in this paper, is particularly attuned to process-description. Indeed, we shall find that the new analysis is useful outside the area discussed so far: for example, to understanding grasping in robotics.

The second comparison we make is between alternative grammars that describe shape generation in terms of extrema. We shall find that our process-grammar (which is an example of such a grammar) not only accords more strongly with intuition but is much more economical.

11. Symmetry Analyses

Recall, from Section 2.1, that three differential symmetry analyses have currently been proposed. Given two curves, as in Fig. 3, the three analyses use the same trajectory of doubly tangential circles (by virtue of being differential symmetry analyses), but each selects a different midpoint associated with the moving circle. It is the trajectory of the chosen midpoint that is called the symmetry axis. The three analyses, the SAT, SLS, and PISA, are distinguished as follows:

- SAT: the axis is the locus of circle centers, O.
- SLS: the axis is the locus of chord midpoints, P.
- PISA: the axis is the locus of arc midpoints, Q.

We shall now argue that, while the SAT and SLS are more appropriate for other tasks in shape-representation, PISA is more appropriate for process-description.
Our argument will proceed by examining common shape situations and showing that PISA infers the intuitively obvious processes where the other analyses do not.

11.1. The inference of indentation

We will now see that PISA infers indentation correctly whereas the SAT and SLS do not. Consider the shape shown in Fig. 21(a). It has an indentation at the top. Our purpose is to derive a symmetry axis that correctly represents the trace of the process that created the indentation. Recall that the trajectory of doubly tangential circles is the same whichever symmetry analysis is chosen. Observe also that the doubly tangential circles must work their way down the mouth of the indentation such that each circle is tangential simultaneously to two points, one on either side of the indentation.

FIG. 21. The SAT has strong discontinuity problems on indentations.

We will consider first the SAT: the trajectory of circle centers. Observe first that, in the present shape, the initial circle has the following strange property: As shown in Fig. 21(a), in order to touch the two endpoints of the indentation, the circle must exscribe the shape. Thus the circle center is below the contour, as represented by the dot. The next thing to observe is that an immediately subsequent circle, in the trajectory, has to look as shown in Fig. 21(b). That is, in order for the circle to work its way into the indentation, i.e. such that the two tangent points are further into the indentation, the circle has to have a wider radius. However, observe that the circle center has therefore moved down, as shown by the dot at the bottom of Fig. 21(b).

Now, in order to reach tangent points that are still further in the indentation, the circle has to widen further until it has infinite radius as shown in Fig. 21(c). Here, the circle center has gone off the bottom of the page, to infinity. To continue further into the indentation, the circle then becomes finite as shown in Fig. 21(d). Thus the circle center has reappeared. However, this time, it has come in from infinity via the top of the page. Figures 21(e), 21(f) and 21(g) show the remainder of the trajectory of circles and their centers, as the circles work their way down into the indentation.

The trajectory of circle centers for the entire sequence is represented in Fig. 21(h). As we have seen, this trajectory, i.e. the SAT-trace, is very badly behaved: It breaks into two infinite pieces that are strongly discontinuous; i.e. they cannot be meaningfully joined because the start point is below the finish point.


This type of trace is inappropriate for an indentation, because an indentation is understood to be continuous and finite. We conclude therefore that the SAT is inappropriate for the inference of indentation.

Now let us consider the SLS. This symmetry analysis is badly behaved in a different way. We will not illustrate this fully here. However, consider the same sequence of diagrams, i.e. Fig. 21. Recall that the SLS axis is the trace of cross-section midpoints. Careful examination of Figs. 21(a) to 21(d) shows that the cross-section moves upward and then downward on this shape. Thus the SLS actually changes direction. Although the effect is a weak one for this particular shape, it is a very strong one for shapes such as a smooth dumb-bell. More specifically, the effect becomes stronger, the more the ears of the shape, shown in Fig. 21(a), point away from each other (while still retaining the indentation). The fact that the SLS axis changes direction makes it an inappropriate representation for the trace of an indentation process. The obvious reason is that indentation is understood as a simple unidirectional process into the shape.

Now let us consider the appropriateness of the PISA axis as a trace for indentation. As pointed out earlier, PISA uses the same doubly tangential circles as the SAT and SLS, but chooses the circumference midpoint as the symmetry point. Thus in Fig. 22 we consider the same trajectory of circles, while constructing PISA. However, we now follow the trace of the circumference midpoint. As shown in Fig. 22(a), the point starts just above the mouth of the indentation, which is intuitively appropriate for an indenting process. In Fig. 22(b), it has moved smoothly down, as the circle has become larger. In Fig. 22(c), as the circle has become infinite, the point has again made a simple movement downward. Again, in Fig. 22(d), even though the circle now faces the other way, the circumference point has nevertheless continued on its simple downward trajectory. In Figs. 22(e), 22(f) and 22(g) we see this trajectory completing its course.
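The three symmetry points compared in this section can be computed for a single doubly tangential circle. The following is a minimal sketch under stated assumptions: the circle is finite, its center and the two tangent points A and B are already known, and Q is taken on the shorter arc between A and B. The function name `symmetry_points` is hypothetical, and the infinite-radius case of Fig. 22(c) is not handled.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def symmetry_points(center: Point, a: Point, b: Point) -> Tuple[Point, Point, Point]:
    """Return (O, P, Q) for a circle centred at `center` touching the curve at a and b.

    O is the circle center (SAT), P the chord midpoint (SLS), and Q the midpoint
    of the shorter arc between a and b (the PISA point, under the assumption above).
    """
    ox, oy = center
    radius = math.hypot(a[0] - ox, a[1] - oy)
    p = ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)
    dx, dy = p[0] - ox, p[1] - oy
    distance = math.hypot(dx, dy)
    if distance == 0.0:
        # Antipodal tangent points: both arcs have equal length.
        raise ValueError("arc midpoint is ambiguous for antipodal tangent points")
    # Q lies on the circle, on the ray from O through P.
    q = (ox + radius * dx / distance, oy + radius * dy / distance)
    return center, p, q


# Hypothetical example: unit circle touching a curve at +60 and -60 degrees.
a = (0.5, math.sqrt(3) / 2.0)
b = (0.5, -math.sqrt(3) / 2.0)
o, p, q = symmetry_points((0.0, 0.0), a, b)
```

As the tangent points A and B approach each other, P and Q both approach the point of contact while O stays a full radius away, which is one way to see why Q, unlike O, always remains on the boundary side of the construction.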

FIG. 22. PISA does not have problems with indentations.


Thus, Fig. 22(h) shows the entire trace created by PISA. One can see that, unlike the SAT, which involves two disconnected infinite pieces, the trace is finite and connected. Again, unlike the SLS, which changes direction, it is unidirectional. In short, the PISA-axis has the required properties of the trace of an indentation process.

11.2. The inference of squashing or grasping

Consider an ellipse. As shown in Fig. 23(a), an ellipse has four extrema E, F, G and H. Since the assumption (by the Asymmetry Rule of Section 2.3) is that this shape was deformed from a circle, an obvious hypothesis is that it was the result of vertically squashing the circle. This means that points E and F, as placed on the original circle, moved along the arrows shown in Fig. 23(b), arriving at their present position. That is, the arrows are the traces of those points. Thus the process-traces are external and inwardly directed.

We should also observe that, in robotics applications, these points are the most likely grasp-points on the ellipse. This is no random coincidence: Grasping requires the same constraining action as squashing. Thus the inference of squash points is important not just for the analysis of deformation, but also for robot manipulation.

Now consider points G and H on the ellipse. The most likely interpretation for these extrema is that, as placed on the original undeformed shape, i.e. the circle, they moved outward, creating the traces shown in Fig. 23(c), i.e. internal and outwardly directed traces.

We shall now see that, whereas the second pair of traces (at G and H) could have been created by the SAT or SLS, the first pair of traces (at E and F) could not. That is, the SAT and SLS cannot infer squashing processes. To see why, observe first that any cross-section of an ellipse must lie inside the ellipse. This means that any SLS-axis (the locus of cross-section midpoints) must lie inside the ellipse. Thus the four SLS-axes, associated with extrema E, F, G and H, must be as shown in Fig. 23(d). However, observe that internal axes at E and F would not be viable traces of processes. A similar type of argument leads to the rejection of the SAT.

FIG. 23. (a) An ellipse has four extrema, the labeled points shown. (b) The squashing interpretation. (c) The collection of likely interpretations. (d) An unlikely interpretation in the vertical direction is yielded by the SAT and SLS.

It appears therefore that, to infer the most likely traces, i.e. those shown in Fig. 23(c), we need a symmetry analysis that has a rather strange property: It must create internal axes at G and H, and external axes at E and F. We shall say that the analysis must be extremum-sensitive; i.e. sensitive to the type of extremum to which it is moving; that is, it must switch sides depending on the type of extremum.

We shall now see that our new symmetry analysis, PISA, has the property of extremum-sensitivity. Let us return to the ellipse. Figure 24(a) shows the right half of the ellipse. In Fig. 24(b), the trajectory of doubly tangential circles, for this segment, has been drawn. Furthermore, a dot on each circle marks the symmetry point of the new analysis, i.e. the circumference point Q. Finally, in Fig. 24(c), we show the trace of circumference points Q. Thus, the axis, in this new analysis, is as we would want it for extremum H; i.e. outwardly moving but internal. However, for this segment, the SAT and SLS would each produce an axis with these properties.

In contrast, consider now the bottom half of the ellipse shown in Fig. 24(d). Here the central extremum is F. Figure 24(e) shows the doubly tangential circles that arise from this segment. The important thing to observe is that each circle exscribes the ellipse. Thus the circumference points Q must necessarily lie outside the ellipse. These points are the dots shown on the circles in Fig. 24(e). Finally, Fig. 24(f) shows the dots on their own. Thus we see that the axis created by the new symmetry analysis is external and inwardly directed at F. In contrast, the SAT and SLS would have to produce axes above F, because the cross-sections would still lie inside the ellipse.

FIG. 24. (a) The right half of an ellipse. (b) The doubly tangential circles to that half. (c) The process-record yielded by PISA. (d)-(f) The corresponding diagrams for the bottom half of the ellipse. The PISA process-record, shown in (f), could not have been yielded by the SAT and SLS.

A mathematical proof that PISA is generally extremum-sensitive, i.e. not just on ellipses, can be obtained by putting together a number of proofs of Leyton [15]. In particular, the latter paper proves that, surrounding any M+ or m- extremum, there is a region of curve on which any doubly tangential circle must inscribe the region. Conversely, the paper also proves that, surrounding any M- or m+ extremum, there is a region of curve on which any doubly tangential circle must exscribe the region. The extremum-sensitivity of the new analysis follows immediately.

11.3. The inference of protrusion

Indentation and protrusion are figure-ground reversals of each other. However, whereas the SAT and SLS do very badly on indentation, they usually do better on protrusion. The reason is as follows: As can be seen from Fig. 21, the discontinuity problems with the SAT and SLS, on indentation, arise because, at the entrance to an indentation, the contour switches from being outside to being inside the general body of the shape. In protrusions, this is not usually the case; i.e. the contour remains outside, all along the protrusion.

However, severe problems can arise when protrusions do not have the latter property. For example, let us consider a figure-ground reversal of the shape in Fig. 21. That is, the shape is now that of a hole; e.g. a throat with a flap protruding into it. Because symmetry analyses act on the curve information only, and are therefore impervious to the figure-ground structure, the same topological consequences result from applying the three symmetry analyses here, as resulted when we considered the structure to be that of an indentation. That is, the SAT and SLS will generate the same problems, and PISA will not. Thus we conclude that PISA is, on the whole, better behaved on protrusions than the SAT and SLS.

11.4. The inference of internal resistance

Internal resistance is the figure-ground reversal of squashing. Furthermore, the topological consequences that resulted from applying the three symmetry analyses to a squashing situation depended on properties of the curve arbitrarily close to the extremum. Therefore, we conclude that the SAT and SLS go wrong on internal resistance as universally as they go wrong on squashing. Correspondingly, PISA remains well-behaved.

11.5. PISA as a genuine record

There is another reason why PISA seems more appropriate for the description of process-records. Note that, by the record of a process, one essentially means the trace of some physically significant point. Observe now that an extremum on a contour is a physically significant boundary point. However, the PISA symmetry point, Q, is also a boundary point: it lies on the circumference of a disc (Fig. 25). Thus, as illustrated in Fig. 25, PISA is a trajectory of boundary points Q of which the last is the extremum. Therefore PISA can be regarded as the record of boundary movement; i.e. the genuine record of a process.

FIG. 25. PISA as boundary movement under default reasoning.

We can make these observations more precise. Move the circle along the contour, and stop the circle at a pair of points A and B, as in Fig. 25. One can regard the circle segment, between the points A and B, as the conjectured completion of the remaining contour, with the minimal curvature variation; i.e. with the minimal amount of information. As the circle moves further along the curve, more information is obtained, while the curve's completion is always a circular arc because, in the absence of further information, this involves the least informational commitment. Eventually, A and B together reach the extremum, where the least commitment becomes a point. Thus, PISA represents boundary movement in the decreasing absence of information to the contrary; i.e. under a strategy of continuous default reasoning.

12. A Codon-Grammar

Recall that, whereas, in previous papers [9-16], I defined deformation with respect to the entire shape-structure, a purpose of the present paper is to see what discoveries can be made when one expresses deformation only in terms of the extrema involved. We shall call a system of deformations that are minimally expressed in this way, an extrema-grammar. That is, more generally, let us define an extrema-grammar to be a means of generatively elaborating a space of functions such that the generative relations between functions are expressed only in terms of the extrema of various derivatives.

Thus, one of the purposes of the present paper is to investigate what can be understood about perception from an extrema-grammar on the curvature function. However, the process-grammar is clearly not the only extrema-grammar that is possible. The possibility of an alternative such grammar arises in a paper by Richards and Hoffman [24]. This alternative is of particular interest to us because it is closely related to our Symmetry-Curvature Duality Theorem, in the following way:


Recall that the theorem states that the curve segment yielding a unique symmetry axis is a segment whose endpoints are two consecutive curvature extrema of the same type (either both maxima or both minima). We shall call such a segment an extrema-triple (or simply, triple), because it consists of three consecutive extrema, as illustrated in Fig. 4.

Richards and Hoffman [24] have carried out an investigation of the extrema-triples which have minima for endpoints. They call such triples, codons. Observe that the set of all triples partitions into the set of codons and the set of codon-duals (where duality is defined in the first paragraph of Section 10). Hoffman and Richards point out that there are five types of nontrivial codons depending on the arrangement of the curvature maximum and possible curvature inflections (0's) in between the bounding minima. The five types are represented by the top row of curve segments in Fig. 26. The symbols 0+, 0-, 1+, 1-, 2, in the figure, are Hoffman and Richards' labels for the individual codons. Our labels have been inserted for the extrema involved.

Richards and Hoffman suggest that rewrite rules on codons should be used to generate shape-space. Clearly, a grammar of such rules would constitute an example of what we call an extrema-grammar. In this section we will develop a codon-based grammar and, in the next section, we will compare it with the very different solution offered by the process-grammar.

The codon rewrite rules that we develop will differ from those of Hoffman and Richards in two ways:

(1) We extend the set of primitives to include not only codons but their duals; i.e. our set of primitives will be the set of all extrema-triples. This entire set is the set of segments shown in Fig. 26, where the bottom row is the set of codon-duals. By an abuse of terminology, we shall call the entire set of extrema-triples, codons.
(2) Besides the differing primitives, the other way in which our analysis will differ from Hoffman and Richards' is that our actual operations will be different.

Recall that the process-grammar has two types of operations: continuations and bifurcations. What is important to note here is that the former are level-preserving and the latter are level-increasing, where "level" means number of extrema. The reader can see these two alternatives as the horizontal and vertical directions in the levels in Fig. 6. Since the formal method we are going to use here is codon substitution, these two alternative directions can be realized as follows:

(1) Level-preserving. Replace a codon with a codon.
(2) Level-increasing. (i) Replace an extremum with a codon, and (ii) replace a codon with a pair of consecutive codons.

The first type of substitution will be denoted by C-C ("codon-by-codon" substitution). The two types of level-increasing operations will be denoted by E-C ("extremum-by-codon" substitution) and C-2C ("codon-by-2-codon" substitution). These three methods will now be discussed in turn.

FIG. 26. The upper row shows the nontrivial codons and the lower row shows their duals.

12.1. Codon-by-codon substitution

Observe, from Fig. 26, that since 0+ is the only codon which is bounded by m+ extrema, it cannot be replaced by another codon. The same type of argument implies that D(0+), 1+, D(1+), 1-, D(1-) cannot be replaced by a codon. Therefore only four of the codons can be substituted by another codon. The substitutions are the C-C operations shown in Table 2.
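The endpoint argument above can be replayed mechanically. The sketch below is illustrative only: the extremum strings assigned to each codon are assumptions consistent with the description in the text (minima at both ends, a maximum between them, and zero, one or two curvature zeros), and the left-to-right orientation chosen for 1+ and 1- is a guess that does not affect the result. The function names are hypothetical.

```python
from typing import Dict, List, Tuple

DUAL: Dict[str, str] = {"M+": "m-", "m-": "M+", "M-": "m+", "m+": "M-", "0": "0"}

# Assumed extremum strings for the five nontrivial codons ("0" is a curvature zero).
CODONS: Dict[str, List[str]] = {
    "0+": ["m+", "M+", "m+"],
    "0-": ["m-", "M-", "m-"],
    "1+": ["m+", "M+", "0", "m-"],
    "1-": ["m-", "0", "M+", "m+"],
    "2": ["m-", "0", "M+", "0", "m-"],
}


def with_duals(codons: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Extend the primitives with their duals, D(name), as done in the text."""
    extended = dict(codons)
    for name, extrema in codons.items():
        extended[f"D({name})"] = [DUAL[e] for e in extrema]
    return extended


def endpoints(extrema: List[str]) -> Tuple[str, str]:
    """The bounding extrema of a codon."""
    return extrema[0], extrema[-1]


def codon_by_codon(codons: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """All (old, new) pairs of distinct codons that share both endpoints."""
    return sorted(
        (old, new)
        for old in codons
        for new in codons
        if old != new and endpoints(codons[old]) == endpoints(codons[new])
    )


substitutions = codon_by_codon(with_duals(CODONS))
```

Under these assumptions exactly four substitutions survive, pairing 0- with 2 and D(0-) with D(2) in both directions, in agreement with the count of four C-C operations stated above.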

12.2. Extremum-by-codon substitution

Observe that an extremum, x, can be replaced by a codon only if the codon's endpoints are of the same extremum type and sign as x. This means that the replacing codon must be symmetrical. Thus the four asymmetrical codons, 1+, D(1+), 1-, D(1-), cannot replace extrema. Therefore, the only possible substitutions are the E-C operations shown in Table 2.

12.3. Codon-by-2-codon substitution

In this case, the ends of the original codon must be the same as the ends of the replacement codon-pair. Thus the four asymmetrical codons 1+, D(1+), 1-, D(1-) cannot be replaced, unless each is replaced by itself and a symmetrical codon bounded by an extremum of the same type as one end of the asymmetrical codon. However, this type of pair substitution is the same as an E-C substitution, and thus will not be duplicated here. There are therefore eight nontrivial examples of codon-by-2-codon substitution: the C-2C operations shown in Table 2.

TABLE 2
The codon-grammar

Codon operations
  C-C1:   0-    → 2
  C-C2:   D(0-) → D(2)
  C-C3:   2     → 0-
  C-C4:   D(2)  → D(0-)
  E-C1:   m+    → 0+
  E-C2:   M-    → D(0+)
  E-C3:   m-    → 0-
  E-C4:   m-    → 2
  E-C5:   M+    → D(0-)
  E-C6:   M+    → D(2)
  C-2C1:  0+    → 1+ 1-
  C-2C2:  D(0+) → D(1+ 1-)
  C-2C3:  0-    → 2 2
  C-2C4:  0-    → 1- 1+
  C-2C5:  D(0-) → D(2 2)
  C-2C6:  D(0-) → D(1- 1+)
  C-2C7:  2     → 0- 0-
  C-2C8:  D(2)  → D(0- 0-)

Process equivalents: [column entries not recoverable from the extracted text]

13. Comparison between the Process-Grammar and Codon-Grammar

We now have two extrema-grammars which generate shape-space via curvature extrema. Table 1 displays the process-grammar, and Table 2 displays the codon-grammar. The first important fact that emerges from comparing the two tables is that the process-grammar is much more economical. It consists of only six operations, whereas the codon-grammar consists of eighteen. Furthermore, every one of the codon operations can be represented by the process operations. This can be seen by considering the far right-hand column of Table 2, which represents each codon operation in terms of purely process operations. Thus the process operations achieve everything that the codon operations achieve, and with significantly fewer elements.

Besides the advantage of economy, the process-grammar has the considerable advantage of according with intuition concerning the structural relations between shapes. The difficulty with codons arises from their very advantage with respect to a different phenomenon: they are very successful in describing parts.⁵ However, a consequence of this is that codon substitutions (i.e. codon rewrite rules) are essentially cutting and pasting operations.
That is, one excises a part and pastes another part across the created gap. For example, one would cut the hand from an arm and paste on a nose. In contrast, under the process-grammar, any change from one structure to another is expressed purely developmentally; i.e. by growth. To illustrate the difference between the two approaches, consider the codon operation

C-2C3: 0- → 2 2.

Because of the intuitive difficulty of understanding this operation, it takes a while to discover an example of it; but an example does exist. It is, in fact, the codon relationship between the two shapes in Fig. 18. This relationship is the replacement of one section of the first curve, with an alternative section, by cutting and pasting. However, examination of the two shapes reveals that cutting and pasting, alone, cannot obtain the second shape from the first. In contrast, the relationship established by the process-grammar is that one shape grows or develops into the other! Furthermore, the grammar elaborates the stages of the growth: they are the sequence of developmental stages listed at the end of Section 9, and shown as the succession of shapes given in Fig. 19. Thus, the process-grammar presents a relationship that is conceptually very meaningful.

⁵ Recall, from Section 1, that Hoffman and Richards [7] propose that a perceptual part is a curve segment bounded by two consecutive negative minima.
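As a rough check on the count of eighteen, the codon operations can be written down as data and tested against the endpoint conditions of Sections 12.1 to 12.3. This is only a sketch: the operation lists are one reading of Table 2, and the `ENDPOINTS` encoding (in particular the orientation of the asymmetrical codons) is an assumption introduced here, not something the original specifies in this form.

```python
# Bounding extrema (start, end) of each codon and dual; the orientation of the
# asymmetrical codons is an assumption made for this sketch.
ENDPOINTS = {
    "0+": ("m+", "m+"), "0-": ("m-", "m-"), "2": ("m-", "m-"),
    "1-": ("m-", "m+"), "1+": ("m+", "m-"),
    "D(0+)": ("M-", "M-"), "D(0-)": ("M+", "M+"), "D(2)": ("M+", "M+"),
    "D(1-)": ("M+", "M-"), "D(1+)": ("M-", "M+"),
}

# Codon operations as read from Table 2 (old codon or extremum, replacement).
C_C = [("0-", "2"), ("D(0-)", "D(2)"), ("2", "0-"), ("D(2)", "D(0-)")]
E_C = [("m+", "0+"), ("M-", "D(0+)"), ("m-", "0-"),
       ("m-", "2"), ("M+", "D(0-)"), ("M+", "D(2)")]
C_2C = [
    ("0+", ("1+", "1-")), ("D(0+)", ("D(1+)", "D(1-)")),
    ("0-", ("2", "2")), ("0-", ("1-", "1+")),
    ("D(0-)", ("D(2)", "D(2)")), ("D(0-)", ("D(1-)", "D(1+)")),
    ("2", ("0-", "0-")), ("D(2)", ("D(0-)", "D(0-)")),
]

def endpoints_consistent() -> bool:
    """Check every listed operation against the endpoint-matching conditions."""
    ok_cc = all(ENDPOINTS[old] == ENDPOINTS[new] for old, new in C_C)
    ok_ec = all(ENDPOINTS[codon] == (x, x) for x, codon in E_C)
    ok_c2c = all(
        ENDPOINTS[first][0] == ENDPOINTS[old][0]          # same left end
        and ENDPOINTS[first][1] == ENDPOINTS[second][0]   # pair joins up
        and ENDPOINTS[second][1] == ENDPOINTS[old][1]     # same right end
        for old, (first, second) in C_2C
    )
    return ok_cc and ok_ec and ok_c2c

codon_grammar_size = len(C_C) + len(E_C) + len(C_2C)
PROCESS_GRAMMAR_SIZE = 6  # as stated in the text for Table 1

print(endpoints_consistent(), codon_grammar_size, PROCESS_GRAMMAR_SIZE)
```

The tally is 4 + 6 + 8 = 18 codon operations against the six process operations. The sketch checks only endpoint consistency of the listed operations; it does not attempt to express them in terms of process operations.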

14. Summary

We began by presenting two rules by which curvature extrema can be used to infer processes that have acted upon a two-dimensional shape, or a three-dimensional shape where the latter has been represented as a smooth generalized cylinder. The first inference rule, the Symmetry-Curvature Duality Theorem, states that, for any extremum, E, there is a unique symmetry axis that is associated with, and terminates at, E. The second rule, the Interaction Principle, claims that symmetry axes are interpreted as records of processes. Thus, together, these two rules yield a process-record at each curvature extremum. We defined a process-diagram to be a shape together with the set of processes inferrable via the use of these rules. Process-diagrams were provided for all curves with up to eight inferred processes.

We then elaborated a process-grammar, of only six operations, to specify the relationship between two shapes in such a way that one shape is described as the extrapolation of processes inferred in the other. More formally, this means that the relationship is understood as the transformation of process-diagrams; an approach reminiscent of that of Chomsky [5] where transformations are expressed as transitions between phrase-structure derivations. Under the process-grammar, the transformations are either process-continuations or process-bifurcations; and they have the advantage of being expressed in terms of only the extrema involved. We found also that the grammar organizes shape-space into six intersecting stratifications where each stratification corresponds to a history of successive modification by process-extrapolation. Thus the grammar represents shape-space as a highly organized state-transition diagram where each state is itself a process-diagram.
We then argued that, while the SAT and SLS solve the important problem of representing shape in terms of generalized ribbons, an alternative symmetry analysis developed here, called PISA, is more appropriate for the rather different problem of inferring processes. In particular, with respect to processes, PISA has two advantages: (1) it infers several processes correctly, and (2) it can be regarded as the record of boundary movement.

Finally, we constructed a codon-based grammar as another approach to the expression of shape deformation purely in terms of curvature extrema. However, we found that the codon-grammar is much less economical than the process-grammar, despite the fact that the latter is as exhaustive as the former. We also found that the process-grammar is much more psychologically meaningful as an expression of shape relationships.

ACKNOWLEDGMENT

I wish to thank Whitman Richards and Stephen Pizer for giving many helpful suggestions on the presentation of this material. The research was supported by NSF grant IST-8418164 to Harvard; and by NSF grant IST-8312240 and AFOSR grant F49620-83-C-0135 to MIT. The research was carried out while the author was at Harvard University. Revisions, requested by AI journal, were incorporated while the author was at SUNY Buffalo.

REFERENCES

1. Attneave, F., Some informational aspects of visual perception, Psychol. Rev. 61 (1954) 183-193.
2. Binford, O.B., Visual perception by computer, IEEE Systems, Science, and Cybernetics Conference, Miami, FL, 1971.
3. Blum, H., Biological shape and visual science (Part I), J. Theor. Biol. 38 (1973) 205-287.
4. Brady, M., Criteria for representations of shape, in: A. Rosenfeld and J. Beck (Eds.), Human and Machine Vision (Erlbaum, Hillsdale, NJ, 1983).
5. Chomsky, N., Syntactic Structures (Mouton, The Hague, 1957).
6. Do Carmo, M., Differential Geometry of Curves and Surfaces (Prentice-Hall, Englewood Cliffs, NJ, 1976).
7. Hoffman, D.D. and Richards, W.A., Parts of recognition, Cognition 18 (1985) 65-96.
8. Koenderink, J.J. and van Doorn, A.J., Dynamic shape, Biol. Cybern. 53 (1986) 383-396.
9. Leyton, M., Perceptual organization as nested control, Biol. Cybern. 51 (1984) 141-153.
10. Leyton, M., Generative systems of analyzers, Comput. Vision Graph. Image Process. 31 (1985) 201-241.
11. Leyton, M., Principles of information structure common to six levels of the human cognitive system, Inf. Sci. 38 (1) (1986) 1-120.
12. Leyton, M., A theory of information structure I: General principles, J. Math. Psychol. 30 (1986) 103-160.
13. Leyton, M., A theory of information structure II: A theory of perceptual organization, J. Math. Psychol. 30 (1986) 257-305.
14. Leyton, M., Nested structures of control: An intuitive view, Comput. Vision Graph. Image Process. 37 (1987) 20-53.
15. Leyton, M., Symmetry-curvature duality, Comput. Vision Graph. Image Process. 38 (1987) 327-341.
16. Leyton, M., A limitation theorem for the differential prototypification of shape, J. Math. Psychol., to appear.
17. Mokhtarian, F. and Mackworth, A., Scale-based description and recognition of planar curves and two-dimensional shapes, IEEE Trans. Pattern Anal. Mach. Intell. 8 (1986) 34-43.
18. Nackman, L.R., Three-dimensional shape description using the symmetric axis transform, Ph.D. Dissertation, Department of Computer Science, University of North Carolina, Chapel Hill, NC, 1982.


19. Nackman, L.R. and Pizer, S.M., Three-dimensional shape description using the symmetric axis transform I: Theory, IEEE Trans. Pattern Anal. Mach. Intell. 7 (1985) 187-202.
20. Pentland, A.P., Fractal-based description, in: Proceedings IJCAI-83, Karlsruhe, F.R.G. (1983) 973-981.
21. Pentland, A.P., Perceptual organization and the representation of natural form, Artificial Intelligence 28 (1986) 293-331.
22. Pizer, S.M., Oliver, W. and Bloomberg, S.H., Hierarchical shape description via the multiresolution of the symmetric axis transform, IEEE Trans. Pattern Anal. Mach. Intell. 9 (1987) 505-511.
23. Resnikoff, H.L., The Illusion of Reality: Topics in Information Science (Springer, New York, 1987).
24. Richards, W. and Hoffman, D.D., Codon constraints on closed 2D shapes, Comput. Vision Graph. Image Process. 31 (1985) 265-281.
25. Richards, W., Koenderink, J.J. and Hoffman, D.D., Inferring 3D shapes from 2D silhouettes, AI Memo 840, MIT, Cambridge, MA, 1985.
26. Rosch, E., Cognitive reference points, Cognitive Psychol. 7 (1975) 532-547.
27. Witkin, A.P., Scale-space filtering, in: Proceedings IJCAI-83, Karlsruhe, F.R.G. (1983) 1019-1022.
28. Yuille, A. and Leyton, M., 3-D symmetry-curvature duality theorems, in: Proceedings ICCV-87.
29. Yuille, A. and Poggio, T.A., Scaling theorems for zero crossings, IEEE Trans. Pattern Anal. Mach. Intell. 8 (1986) 15-25.

Received July 1986; revised version received June 1987