Computer Networks and ISDN Systems 30 (1998) 1841–1852
Classification: a basis for understanding tools in declarative modelling
Laurent Champciaux *
École des Mines de Nantes, Computer Science Department, 4, rue Alfred Kastler, BP 20722, 44307 Nantes cedex 03, France
Abstract
In declarative modelling, the major problem of the understanding phase is dealing with a large number of solution scenes. We present a classification tool that addresses this problem as a whole. The classification technique automatically categorises a large number of solution scenes. The resulting classification is a class hierarchy in which each class clusters many similar solution scenes. This hierarchy serves as a basis for the tools we also present. These tools let the designer navigate the solution space and help him better understand large solution sets. All of our tools make a declarative modeller World-Wide-Web compliant. © 1998 Elsevier Science B.V. All rights reserved.
Keywords: Declarative modelling; Understanding; Large solution set; Classification; Navigation tool
1. Introduction
The traditional geometrical models of common CAD systems force the designer to use imperative methods in a bottom-up design methodology. Such an approach is not suitable when the scene to describe is complex or not well known, as the designer thinks about it in a more abstract way. Another approach makes the scene specification more declarative and dynamic. We refer to it as 'declarative modelling' [1–3]. The iterative design process in a declarative modeller is made of three stages (Fig. 1). Firstly, the designer describes the topology, geometry, etc. of the scene in a declarative way by
* E-mail: [email protected] or http://www.emn.fr/info/image/.
means of properties and constraints on objects and their spatial configuration. Then, if the description is consistent, a generating phase produces at least one of the numerous solution scenes related to this description, and potentially all of them. The third stage consists of understanding the structure of the solutions produced, and so involves a visualisation step. As the design process is iterative, the cycle of description–generation–understanding may have to be performed several times before the system computes the solutions that match the designer's idea. One advantage of a declarative method over an imperative one is that the declarative method, as a top-down technique, does not provide a single solution but a model from which the system computes one or more solutions. Another advantage is that the underlying geometrical model of the scenes is hidden from the user. Therefore, a declarative design process requires less effort than imperative methods. In return, one major problem can impede the iterative design process during the understanding phase: this phase may be arduous when the number of solutions produced is very large. This is often the case when the initial description is a bit sketchy, an inherent consequence of all declarative methods.
0169-7552/98/$ - see front matter © 1998 Elsevier Science B.V. All rights reserved. PII: S0169-7552(98)00178-0
Fig. 1. Declarative modelling.
One can look at the understanding phase from two different angles. Firstly, the designer may want to understand the inner characteristics of a single solution scene. Many tools have been proposed for this aspect of the understanding phase [4,5]. Generally, these tools place the designer at a viewpoint from which most of the scene properties can be grasped. Secondly, the designer may want to navigate the solution set so as to get an overview of the different solution scenes. This aspect is particularly useful when the designer is faced with a large number of solutions (hundreds or millions). In this case, he needs tools to help him find his way in such a big solution set. We can distinguish two different tools that would be useful in this situation. The first one should be able to find the representative instances in the solution set, under the assumption that many solutions are similar. This is a classification tool. Another tool should assist the designer when he is looking for specific solutions in the solution set. This is a navigation tool or a browsing tool. In the declarative modelling literature, one cannot find any technique that can deal with vast solution sets. Only [6] has proposed a model to visualise more than one solution scene, with the multiview tool, but this tool does not rely on any classification scheme. Our recent investigations in the machine learning domain [7] have directed our work toward a learning
paradigm, namely 'conceptual clustering' [8], which can serve as a basis for the two tools mentioned above. The aim of this paper is to show how we address the above problem with this machine learning technique. Section 2 gives an overview of our declarative modeller. Section 3 explains the need for a classification tool. Section 4 details the classification tool and the way we use it. Section 5 focuses on a future tool for the understanding phase, based on another machine learning technique. Sections 6 and 7 discuss related work and the WWW compliance of our tools.
2. System overview
To illustrate our approach, we are developing a declarative modeller for building simple houses with a construction set. Our system mainly handles two kinds of data:
1. A set of objects which are the components of the scene (house, windows, doors, garage, ...).
2. A set of constraints on and between objects. Our system handles geometrical, topological and symbolic constraints, among others.
The dialogue between the system and the designer is carried out in a specific object-scene oriented language through an interactive user interface. The description made by the designer is a conjunction of sentences describing the different objects of the scene (Table 1). A sentence is a logical disjunction of constraints depicting a particular object. Some constraints, essentially the numerical ones, can also be quantified.
Table 1
A description
* The house is quite spacious
* The roof is gently sloping
* Doors are taller than wide
* The house is quite wide
* Windows are wide
* Windows are as many as possible
* Doors are not quite large
* The house is made of bricks or yellow plastic
* The roof is made of grey plastic
* Doors are made of plastic
* The garage is behind the house
* Windows are made of glass
* The roof is not flat
* Windows are as tall as wide
* There are 2 or 3 doors
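As an illustration of this structure, a description can be held as a list of sentences (a conjunction), each sentence being a list of constraints (a disjunction). The minimal Python sketch below only checks a candidate scene against such a description. It is not the resolution process of our modeller, and the attribute names (`house_material`, `n_doors`) as well as the encoding of constraints as predicates are assumptions made for the example.

```python
from typing import Callable, Dict, List, Union

Scene = Dict[str, Union[str, float]]   # attribute-value pairs of one scene
Constraint = Callable[[Scene], bool]   # hypothetical encoding: a predicate
Sentence = List[Constraint]            # disjunction of constraints
Description = List[Sentence]           # conjunction of sentences


def satisfies(scene: Scene, description: Description) -> bool:
    """True if every sentence has at least one constraint that holds."""
    return all(any(c(scene) for c in sentence) for sentence in description)


# Two sentences of Table 1, written with assumed attribute names.
description: Description = [
    # "The house is made of bricks or yellow plastic"
    [lambda s: s["house_material"] == "brick",
     lambda s: s["house_material"] == "yellow plastic"],
    # "There are 2 or 3 doors"
    [lambda s: s["n_doors"] == 2, lambda s: s["n_doors"] == 3],
]

scene = {"house_material": "brick", "n_doors": 3}
print(satisfies(scene, description))
```

A scene fails as soon as one sentence has no satisfied constraint, which mirrors the conjunction of disjunctions described above.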
The resolution process¹ of the generation phase accepts the description as input and produces all the different solution scenes which satisfy the description constraints. A solution scene is described by an ordered set of attribute-value pairs. Note that in our implementation, the generation and understanding phases are two concurrent processes. This allows the designer to visualise a solution scene as soon as it is computed, while the generation phase is still in progress. The realistic images of Figs. 2–4 show three different solution scenes of the problem described in Table 1. This description has more than 6 million solutions.
3. The need for a classification tool
Fig. 2. A simple rectangular house.
As mentioned in Section 1, the number of solutions produced at the generation phase can be very large. With hundreds of solutions, it is inconceivable for the designer to examine each one, which is a serious handicap for the understanding phase. Many authors in declarative modelling [10–12] have been confronted with this problem, but none has provided any solution except sampling methods that depend only on random drawings. The tool we are looking for is typically a classification tool. The main role of this tool is to reduce the number of solutions to the number of solution classes, where a class consists of solutions which are similar. Of course, the solution scenes produced in a declarative modeller are all different in some ways. We therefore consider two solutions similar if they share regularities across a number of attributes. "Looking for regularities in a data set is the task of a machine learning paradigm called Conceptual Clustering" [13]. A conceptual clustering system accepts a set of object descriptions (typically vectors of attributes describing events, observations, facts, ...) and produces a classification scheme from the observations. These objects have not been assigned to classes by a 'teacher'. Instead, an evaluation function is used to discover classes with good conceptual descriptions. Hence, this form of learning is often called unsupervised learning. As mentioned in Section 2, our declarative modeller allows the designer to visualise the solution
¹ The description of the resolution process is outside the scope of this paper. More details can be found in [9].
Fig. 3. A simple L-shaped house.
Table 2
A simple house description

Nominal attributes
  House shape       rectangular
  House material    brick
  Window shape      hexagonal
  Door material     wood

Numeric attributes
  House width       20.4
  House length      18.6
  House height      12.0
  # windows         6
  # doors           1
Fig. 4. A simple U-shaped house.
scenes as soon as they are computed, while the generation phase is still in progress. If the designer wants to see not a particular solution but the representative instance of a solution class, then the classification tool must be able to process each solution as soon as it is computed. In this way, a set of solution classes is always present in the system for visualisation. This incrementality constraint excludes nonincremental conceptual clustering systems such as [14] or [8]. Incremental clustering systems are grouped around a subparadigm of conceptual clustering: concept formation. There are many models of the concept formation process: Feigenbaum's Epam [15], Lebowitz's Unimem [16], Fisher's Cobweb [17], Gennari's Classit (a Cobweb variant) [18], etc. The model which has evolved more than any other over the last ten years is Fisher's (Cobweb), and as pointed out in [18], "The well-defined evaluation function of Cobweb constitutes an advance over previous work on concept formation". For this reason and others described below, our classification tool is based on COBWEB/3, its latest development [19].
4. A classification tool for the understanding phase
Cobweb has been heavily influenced by research in cognitive psychology on basic-level and typicality effects [20]. Briefly, human beings form concepts by organising their observations on the basis of shared characteristics among the things they observe. For example, the general concept of house is built up incrementally over time after many observations of particular houses of varying appearance. The concept becomes more general and useful as one sees more houses, and it becomes possible to discriminate houses from other buildings that look similar. Similarly, Cobweb takes the observations presented to it and organises them, forming concepts that summarise the instances they cover.
4.1. Representation and organisation of knowledge
Cobweb accepts as input a series of instance descriptions. An instance is represented as an ordered set of attribute-value pairs. This representation is well suited, as our solution scenes are described in the same manner. To illustrate how solutions are represented, Table 2 shows a simple house described in terms of nominal and numeric attributes.² Instead of forming flat classes that merely group similar instances together as in [14], Cobweb forms a concept hierarchy. It adds each instance it receives to the hierarchy, adding knowledge by changing information within the concept nodes and in some cases changing the overall structure of the hierarchy. The concept hierarchy is a tree in which each node describes a concept. The concepts are ordered from most general (the root of the tree, summarising all objects encountered) to most specific (the leaves of the tree, specific instances). Each concept node describes a class of instances. The information stored at a concept node k differs depending on whether the attribute
² To simplify, we have omitted many other attributes describing one of our simple houses.
values are nominal or numeric. In both cases, Cobweb stores a count $N_k$ so as to compute the probability $P(C_k)$ of the concept occurrence:
\[ P(C_k) = \frac{N_k\ (\text{\# instances incorporated in class } C_k)}{N_p\ (\text{\# instances incorporated in the parent class } C_p)}. \]
It also stores information about every attribute observed in the instances that are covered by the concept. For each nominal attribute, an integer count is stored for each of its values. It makes it possible to compute the conditional probability of the attribute value, given membership in the class covered by the concept. For example, a concept k for the class of big houses has an entry for large, $P(\text{large} \mid C_k)$.
For each numeric attribute, a sum of values $\sum x_i$ and a sum of squares $\sum x_i^2$ are stored. They make it possible to incrementally compute the mean $\bar{x}$ and the standard deviation $\sigma$ that represent the continuous normal distribution for the attribute.
4.2. Operators for classification and the learning algorithm
Cobweb relies on four operators (Fig. 5) to incorporate an object into the hierarchy. At each level of the concept hierarchy, the system applies one of these operators to classify the object into the hierarchy at that level. The operators apply locally to the subtree composed of the last concept node to which the object was classified and the children of this node. Initially, all objects are classified at the root node. The operator to apply is selected by evaluating alternative classifications with the evaluation function discussed in Section 4.3. The Incorporate operator is used when the object fits well into an existing concept. This operator adds the instance to one of the child nodes. If the child is not a leaf of the tree (Fig. 5a1), the conditional probabilities for the concept and each of the attribute values are updated. If this child is a leaf, the hierarchy is extended downward as shown in Fig. 5a2.
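Incorporating an instance into a node amounts to updating the statistics of Section 4.1. The Python sketch below shows one possible way to keep them; the class name `ConceptNode`, the dictionary encoding of instances and the use of the population (rather than sample) standard deviation are assumptions of this example, not details taken from Cobweb itself.

```python
import math
from collections import defaultdict


class ConceptNode:
    """Statistics kept at one concept node (children omitted in this sketch)."""

    def __init__(self):
        self.count = 0                                              # N_k
        self.value_counts = defaultdict(lambda: defaultdict(int))  # nominal counts
        self.sums = defaultdict(float)                              # sum of x_i
        self.sums_sq = defaultdict(float)                           # sum of x_i^2

    def update(self, instance):
        """Update the counts of the node with one instance (a dict)."""
        self.count += 1
        for attr, value in instance.items():
            if isinstance(value, str):      # nominal attribute
                self.value_counts[attr][value] += 1
            else:                           # numeric attribute
                self.sums[attr] += value
                self.sums_sq[attr] += value * value

    def probability(self, attr, value):
        """P(attr = value | C_k), assuming every instance defines attr."""
        return self.value_counts[attr].get(value, 0) / self.count

    def mean(self, attr):
        return self.sums[attr] / self.count

    def std(self, attr):
        """Population standard deviation from the two stored sums."""
        variance = self.sums_sq[attr] / self.count - self.mean(attr) ** 2
        return math.sqrt(max(variance, 0.0))  # guard against rounding below 0


node = ConceptNode()
node.update({"house_shape": "rectangular", "house_width": 20.0})
node.update({"house_shape": "L-shaped", "house_width": 22.0})
print(node.probability("house_shape", "rectangular"),
      node.mean("house_width"), node.std("house_width"))
```

Because only counts and sums are stored, a node can absorb a new instance without revisiting earlier ones, which is what the incrementality constraint of Section 3 requires.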
Fig. 5. Classification operators.
The Create new disjunct operator is used when the object has very different characteristics from any existing concept at the current level, as determined by the evaluation function. The object is placed in a category of its own, a sibling of the other concept nodes (Fig. 5b). The last two operators are used to restructure the hierarchy without reprocessing previous instances. They allow the system to reorganise the hierarchy in light of new incoming knowledge while remaining incremental. The Merge operator is used when the hierarchy is overly branched and when combining two classes provides a good concept into which to classify the incoming object. The operator merges two child nodes and incorporates the object into the new combined class (Fig. 5c). The Split operator is applied when the hierarchy contains a node that is too general and therefore less useful for classification. The split is achieved by removing this node and replacing it with its children (Fig. 5d). The object is incorporated into one of these more specific nodes. Our classification algorithm is based on Cobweb:

CLASSIFY (Object, Root - of a classification tree)
{
    Update counts of Root.
    If Root is a leaf
    Then Incorporate to leaf, extend downward.
    Else Find S and T, the 2 children of Root that best host Object,
         and perform the operation below that has the best score
         as determined by the evaluation function:
            1. Incorporate: call CLASSIFY(Object, S)
         or 2. Create a new disjunct with Object
         or 3. Merge S and T and call CLASSIFY(Object, merged node)
         or 4. Split S and call CLASSIFY(Object, best child of S)
}

Updating the counts of a node k (like Root above) consists of updating the count $N_k$ of the node ($N_k = N_k + 1$), updating the integer count for each nominal value covered by Object (count = count + 1) and updating the sum of values and the sum of squares for each numeric value covered by Object ($\sum x_i = \sum x_i + x_i$ and $\sum x_i^2 = \sum x_i^2 + x_i^2$).
4.3. The evaluation function
At each level of the hierarchy, the system must be able to evaluate alternative classifications and apply the operator which produces the best one. To score these alternatives, it uses an evaluation function: category utility [21]. Different classifications of a new instance result in different partitions of all the instances into classes. Category utility gives a high score to partitions which maximise similarity among class members (intra-class similarities) and differences between members of different classes (inter-class differences). More precisely, category utility is a trade-off between the predictiveness of each attribute value (the probability of an instance's membership in a class, given its attribute value) and the predictability of the value (the probability of the value, given that an instance is a member of a class).
Eq. (1) shows the category utility evaluation function as defined by Fisher in [17]. We do not have space to rederive this function; more details can be found in [21,17].
\[ cu = \frac{1}{K} \cdot \left( \sum_{k=1}^{K} P(C_k) \sum_{i=1}^{I} \sum_{j=1}^{J} P(A_i = V_{ij} \mid C_k)^2 - \sum_{i=1}^{I} \sum_{j=1}^{J} P(A_i = V_{ij})^2 \right) \tag{1} \]
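To make Eq. (1) concrete, the following Python sketch scores a partition of instances described by nominal attributes only. The function names and the encoding of a partition as a list of lists of dictionaries are assumptions of this example, and probabilities are estimated by simple relative frequencies.

```python
from collections import Counter


def sum_sq_probs(instances):
    """Sum over attributes and values of P(A_i = V_ij)^2 within `instances`."""
    n = len(instances)
    total = 0.0
    for attr in instances[0]:
        counts = Counter(inst[attr] for inst in instances)
        total += sum((c / n) ** 2 for c in counts.values())
    return total


def category_utility(partition):
    """Eq. (1) for a partition given as a list of classes (lists of dicts)."""
    parent = [inst for cls in partition for inst in cls]
    n = len(parent)
    within = sum(len(cls) / n * sum_sq_probs(cls) for cls in partition)
    return (within - sum_sq_probs(parent)) / len(partition)


rect = {"window_shape": "rectangular"}
diamond = {"window_shape": "diamond"}
print(category_utility([[rect, rect], [diamond, diamond]]))  # classes follow the attribute
print(category_utility([[rect, diamond], [rect, diamond]]))  # classes ignore the attribute
```

The first partition groups instances by window shape and scores higher than the second, whose classes tell nothing more about the attribute than the parent does.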
In Eq. (1), $I$ is the number of attributes and $J$ the number of values of an attribute. $P(A_i = V_{ij} \mid C_k)$ is the conditional probability of a particular value $V_{ij}$ given membership in the class (predictability). $P(A_i = V_{ij})$ is the probability of a particular value at the parent of the node classes being considered (predictiveness). As written, category utility only handles nominal attributes. Fig. 6 shows an example of a partition. In [18], Gennari modifies this function so that it can deal with numeric values, but only numeric values (Eq. (2)).
\[ cu = \frac{1}{K} \cdot \left( \sum_{k=1}^{K} P(C_k) \sum_{i=1}^{I} \frac{1}{\max(\alpha_i, \sigma_{ik})} - \sum_{i=1}^{I} \frac{1}{\max(\alpha_i, \sigma_{ip})} \right) \tag{2} \]
$\sigma_{ik}$ and $\sigma_{ip}$ are the standard deviations of attribute $i$ in class $C_k$ and in the parent class $C_p$ respectively. $\alpha_i$ is a user-defined system parameter, acuity, one for each attribute. If a concept node describes a single instance, then $\sigma = 0$ and $1/\sigma$ is infinite. Acuity then serves as a minimum value for $\sigma$. It represents the minimum detectable difference between instances. Mixing nominal and numeric attributes in a single instance description is an open issue in the literature on clustering. However, [18,19] present evidence that summing together terms from both forms of the category utility equation works well in domains with mixed data. As our simple
Fig. 6. A partition.
houses are described by nominal and numeric attributes, our implementation handles both attribute types. The category utility evaluation function becomes (Eq. (3)):
\[ cu = \frac{\sum_{k=1}^{K} P(C_k) \cdot \left[ \bar{P}(C_k) - \bar{P}(C_p) \right]}{K}, \tag{3} \]
\[ \bar{P}(C_k) = \frac{1}{I} \cdot \sum_{i=1}^{I} \bar{P}(A_i \mid C_k), \tag{4} \]
\[ \bar{P}(A_i \mid C_k) = \sum_{j=1}^{J} P(A_i = V_{ij} \mid C_k)^2 \quad \text{(nominal case)}, \tag{5a} \]
\[ \bar{P}(A_i \mid C_k) = \frac{\alpha_i}{\max(\alpha_i, \sigma_{ik})} \quad \text{(numeric case)}. \tag{5b} \]
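The class quality of Eq. (4) and the category utility of Eq. (3) can be sketched in Python as follows for instances mixing nominal and numeric attributes. This is an illustration under stated assumptions rather than our implementation: a single `acuity` value is shared by all numeric attributes (the text allows one per attribute), the population standard deviation is used, and all names are hypothetical.

```python
import math
from collections import Counter


def attribute_quality(values, acuity=1.0):
    """Eq. (5a) for nominal values (strings), Eq. (5b) for numeric values."""
    n = len(values)
    if isinstance(values[0], str):
        return sum((c / n) ** 2 for c in Counter(values).values())
    mean = sum(values) / n
    sigma = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return acuity / max(acuity, sigma)


def class_quality(instances, acuity=1.0):
    """Eq. (4): average attribute quality over the attributes of a class."""
    attrs = list(instances[0])
    return sum(attribute_quality([inst[a] for inst in instances], acuity)
               for a in attrs) / len(attrs)


def category_utility(partition, acuity=1.0):
    """Eq. (3): weighted quality gain of the classes over their parent."""
    parent = [inst for cls in partition for inst in cls]
    q_parent = class_quality(parent, acuity)
    n = len(parent)
    return sum(len(cls) / n * (class_quality(cls, acuity) - q_parent)
               for cls in partition) / len(partition)


narrow = [{"shape": "rect", "width": 12.0}, {"shape": "rect", "width": 12.0}]
spread = [{"shape": "rect", "width": 12.0}, {"shape": "rect", "width": 16.0}]
print(class_quality(narrow), class_quality(spread))
```

A class whose members agree on every attribute reaches the maximum quality of 1, and the quality decreases as nominal values diverge or as the numeric spread exceeds the acuity.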
$\bar{P}(C_k)$ (Eq. (4)) defines the individual quality of class $C_k$ and ranges from 0 to 1. $\bar{P}(A_i \mid C_k)$ is defined by Eq. (5a) for nominal values and by Eq. (5b) for numeric values.
4.4. Using the class hierarchy
Our declarative modeller allows the designer to use the class hierarchy produced by the classification system in two ways. Firstly, one can browse the hierarchy from root to leaves as one does with a file system. Secondly, one can examine each class of a particular partition, a partition being a level in the class hierarchy. The description of Table 3 (a restriction of the description of Table 1) is used to illustrate most of this section. This description mainly describes houses that vary along two dimensions: the width of the house ("The house is wide or very very wide") and the shape of the windows ("Windows are rectangular or diamond shaped"). The generation phase produces 378 solution scenes with this description. The classification of this solution set results in the class hierarchy³ of Fig. 7. The class hierarchy is correct as it exhibits these two dimensions in its first two levels (the root is the 0th level). Level 1 discriminates between houses with different windows and level 2 discriminates between houses of different widths. Note that in many cases (i.e. with more complex descriptions), a level partitions solution classes according to more than one dimension. Before we detail the class browser and the partition viewer of our declarative modeller, we must explain how we build the geometrical representation of a class when the class represents not a single solution scene but a set of many solution scenes.
Table 3
A restricted description
* The house has a rectangular shape
* The roof is gently sloping
* Doors are taller than wide
* The house is wide or very very wide
* Windows are wide
* Windows are as many as possible
* Doors are not quite large
* The house is made of bricks
* The roof is made of grey plastic
* Doors are made of plastic
* Windows are made of glass
* The roof has 4 slopes
* Windows are as tall as wide
* There are 2 doors
* Windows are rectangular or diamond shaped
Fig. 7. A class hierarchy.
³ We only have space for showing the first two levels of the class hierarchy.
4.5. Selecting the representative instance of a class
As described in Section 4.1, a node in the hierarchy is a class that clusters many instances. For instance, the bottom left class of Fig. 7 clusters 44 instances. How, then, do we build the geometrical representation of such a class? A class is a probabilistic concept described essentially in terms of mean values. As the search space of the resolution process (generation phase) is discrete and finite⁴, a mean value is generally not a valid domain value for a particular attribute that describes the scene. For instance, the attribute for the house width may have to take its value in the domain {12.0, 12.4, 12.8, 13.2}. A mean value such as 12.6 is therefore not a valid domain value. So, it is impossible to set all the different attributes of a solution scene to these mean values. In addition, the resulting geometrical representation could be inconsistent with the constraints of the description. In order to visualise a class, we must find its most representative instance, which is one leaf of the class subtree. As a class clusters instances that look similar, any one of these instances could be selected. Unfortunately, class partitions are not always perfect and a randomly selected instance could be completely unrepresentative of the class. We select the most representative instance of a class k by evaluating the removal from that class of each of its instances. The instance whose removal produces the largest loss of quality in k, as determined by $\bar{P}(C_k)$ (see Section 4.3, Eq. (4)), is the most representative. This amounts to finding the instance that shares most of the class characteristics.
⁴ See [9] for more details.
Fig. 8. The class browser.
4.6. The class browser
The first tool which uses the class hierarchy is a class browser (Fig. 8). The top part of the browser window shows the path followed by the designer in the hierarchy. The bottom part of the window shows the subclasses (child nodes) of the last class in the path. In the example of Fig. 8, the designer examines the various widths a house with rectangular windows can have (left part of level 2 in Fig. 7). The class browser is particularly useful when looking for solution scenes which exhibit specific characteristics. The class browser is clearly the navigation tool we mentioned in Section 1.
4.7. The partition viewer
Sometimes the user may want an overview of the classes that represent the whole solution set. To provide it, the system must select an entire level in the class hierarchy. Fig. 9 shows that the class set of a level l may retain classes from upper levels when these classes have no subclasses at level l.
Fig. 9. A partition of the whole class hierarchy at level 2.
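The rule illustrated by Fig. 9 can be sketched as a short recursive traversal. In the Python example below, the `Node` class and the function name `partition_at_level` are hypothetical; the sketch only assumes that each class knows its child classes and that the root is level 0.

```python
class Node:
    """A class of the hierarchy; `children` is empty for a leaf."""

    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []


def partition_at_level(node, level):
    """Classes found at depth `level`, plus leaves met above that depth."""
    if level == 0 or not node.children:
        return [node]          # reached the level, or nothing deeper to show
    classes = []
    for child in node.children:
        classes.extend(partition_at_level(child, level - 1))
    return classes


# A leaf at level 1 (B) is retained in the partition at level 2.
root = Node("root", [Node("A", [Node("A1"), Node("A2")]), Node("B")])
print([c.name for c in partition_at_level(root, 2)])
```

Keeping the leaves of upper levels ensures that the selected class set still covers the whole solution set.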
Fig. 10. The partition viewer.
To obtain classes that are really representative, the system selects a level which exhibits a good overall quality in its classes. The average class quality $Q_l$ of each level $l$ is computed by:
\[ Q_l = \frac{1}{K} \sum_{k=1}^{K} \bar{P}(C_k) \tag{6} \]
with $K$ the number of classes at level $l$. $Q_l$ is compared with a user-defined parameter that specifies the minimum $Q$ value for the level to select. $\bar{P}(C_k)$ varies from 0 to 1. Our experiments have shown that the user-defined parameter is well chosen in the range [0.8, 0.9]. The partition viewer is really useful when the designer wants to see at a glance all the different solution scenes produced by the generation phase. The partition viewer window of Fig. 10 illustrates the partition viewer tool. In the example of Fig. 10, the solution classes are those of the description in Table 1. Note that in our system, the designer can take a solution class in the class browser and put it into the partition viewer so as to visualise all of its sibling classes. In the same way, he can take a solution class in the partition viewer and put it into the class browser so as to visualise its subclasses. Mixing the two approaches is the key to exploring the solution set in a natural manner.
5. An advanced tool for the understanding phase
As pointed out in [13], "Clusters may signify some deeper similarity, it can be useful to define them as a preliminary step before performing supervised learning". With this remark in mind, our research effort focuses on the development of a new selection tool for the understanding phase. This tool is based on a supervised machine learning technique called the version space strategy [22]. A version space makes it possible to induce the general description of a concept from a sequence of positive and negative examples of that concept. Each example is classified as positive or negative by the user. In the case of our declarative modeller, the designer may want to induce the general description of a particular type of house. For instance, the designer can look for the general description of the house he sees through his window. Perhaps he made a sketchy description of this house during the description phase, and now tries to find in the solution set those scenes that really match his initial idea. This tool is clearly a selection tool. The general idea behind the version space strategy is to organise hypotheses in a lattice based on generality (Fig. 11). Hypotheses are the various plausible general descriptions of the target concept. A version space VS(H,E) with respect to a hypothesis space H and training examples E is the subset of the hypotheses of H that are consistent with E (i.e. that classify all examples of E correctly). A version space is represented by its general boundary G (its set of most
Fig. 11. A version space.
general members) and its specific boundary S (its set of most specific members). The interaction scheme between the system and the designer is depicted in Fig. 12. The examples that are classified by the user are the clusters (classes) of a level in the class hierarchy. The version space training algorithm consists in building the sets S and G from the user-classified examples. It may converge to the correct target concept, provided that the training set is error-free and that the target concept is in H. As the version space (equivalently, the general description of the concept) becomes more precise, one can filter the cluster set by removing the clusters that the version space already classifies unambiguously as positive or negative. This reduces the size of the cluster set, and hence the number of candidate training instances. The instance whose classification is requested by the system must be the most informative training instance. An instance is really informative when the system cannot accurately predict its classification by the user. The most informative training instance is the one which comes closest to matching one half of the nodes in the version space. It is positive or negative with a probability near 0.5, and finding out its classification helps reject one half of the currently plausible general descriptions. An instance that matches only 10 percent of the nodes is rather uninformative because the system can accurately predict its classification as negative. Training may stop when the sets S and G converge or when the cluster set is empty. Then, all the clusters which are classified as positive by the version space are the representatives of the target concept.
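The selection of the most informative instance can be sketched on a toy hypothesis space. The Python example below deliberately replaces the boundary sets S and G with an explicit enumeration of every hypothesis (a list-then-eliminate scheme), which is only practical for very small spaces. The hypothesis language (one value or a wildcard per nominal attribute) and all names are assumptions of this example.

```python
from itertools import product

WILDCARD = "?"


def hypotheses(domains):
    """Every conjunctive hypothesis: one value or a wildcard per attribute."""
    choices = [list(values) + [WILDCARD] for values in domains.values()]
    return [dict(zip(domains, combo)) for combo in product(*choices)]


def matches(hypothesis, instance):
    return all(v == WILDCARD or instance[a] == v for a, v in hypothesis.items())


def update(version_space, instance, positive):
    """Keep the hypotheses that agree with the user's answer."""
    return [h for h in version_space if matches(h, instance) == positive]


def most_informative(version_space, candidates):
    """Candidate matched by the number of hypotheses closest to one half."""
    half = len(version_space) / 2
    return min(candidates,
               key=lambda inst: abs(sum(matches(h, inst) for h in version_space) - half))


domains = {"windows": ["rectangular", "diamond"], "width": ["wide", "very wide"]}
vs = hypotheses(domains)
vs = update(vs, {"windows": "rectangular", "width": "wide"}, positive=True)
vs = update(vs, {"windows": "diamond", "width": "wide"}, positive=False)
candidates = [{"windows": "rectangular", "width": "very wide"},
              {"windows": "diamond", "width": "very wide"}]
print(len(vs), most_informative(vs, candidates))
```

After the two answers, the second candidate is matched by no remaining hypothesis, so its classification as negative is already known and asking about it would bring no information.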
Fig. 12. Version space, interaction scheme.
Today, our version space interaction scheme is operational and useful as a selection tool for the understanding phase. This tool relies on a supervised machine learning technique with the designer as teacher. The designer navigates the solution set simply by answering "Yes, I like this solution scene" or "No, I dislike this solution scene". This tool allows the designer to select solution scenes in a more subjective way than the other tools. In addition, the selection process is usually faster because the designer may have to examine only a few solution scenes. At present, we are interested in reusing the induced general description of the concept. As the general description of the target concept encapsulates some of the designer's preferences, it can be used in other tasks of the iterative design process. For instance, it can be used in the description phase for two different purposes. It can help to improve the semantics of some descriptive constraints, as it encapsulates some of the designer's preferences, and it can be used as a descriptive object because it describes a particular type of object. So, our present and future work is concerned with the influence of the understanding phase on the whole iterative design process.
6. Comparison to other works
The declarative modelling approach as described in Section 1 may be compared to other works such as [23–26] because they share the same main goal, which is to provide the user with high level design tools. The main similarity between these works is that they allow a declarative specification of the task to be performed. On the other hand, the main conceptual difference is the way the generation phase is approached. For instance, [26] uses generative shape grammars [27] whereas [9] uses constraint satisfaction problems. Descriptions of some other well-known generation techniques can be found in [1]. Concerning the topic of our paper, the use of machine learning techniques in the understanding phase, we did not find any related work in the literature. We think that our classification and selection tools are two new concepts in the declarative modelling approach. They really help the designer in the process of finding the final good solution scenes. In some other works such as [28], the selection process is carried out by the designer alone: there is no need for a classification tool because their solution scenes are supposed to be all different.
7. World Wide Web compliance
The interactive GUI of a classical CAD system imposes a very long specification phase during which the designer continually visualises and manipulates the 3D scene being built. For bandwidth reasons, it is difficult to carry out such an interaction through the World Wide Web. By contrast, the description process of a 3D scene in a declarative modeller appears to be more World Wide Web compliant. In the declarative case, the description generally consists of a simple textual description that can be easily and quickly captured with an HTML form [28,26]. In addition, the design process in a declarative modeller does not require continual visualisation of the scene being designed, which is good for the data transfer rate. So, the declarative character of our modelling method is well suited to the internet. Furthermore, we believe that the categorisation process described in this paper is essential for a declarative modelling web server, as the class hierarchy produced serves as a basis for the navigation and selection tools also presented here. These tools allow the designer to visualise only a few solution scenes (the representatives of the whole solution space) instead of a large number of solution scenes, which is crucial for the data transfer rate in an internet environment. Our different tools may be adapted for the web with Java applets, for instance. Beyond the declarative modelling discussion, note also that the concept formation technique described in this paper could be used for web page classification and navigation as an alternative to the traditional keyword classification scheme.
8. Conclusions This paper has presented a classification tool for the understanding phase of a declarative modeller. This classification tool is based on Cobweb, a well-known concept formation technique. We have shown that this machine learning technique is well suited to the architecture of a declarative modeller, and that it globally solves the major problem of the understanding phase, which is to cope with a large number of solution scenes. To help the designer in this task, we have developed two additional tools around this classification tool. One allows the designer to browse a class hierarchy and the other gives the designer an overview of all solution classes. With these tools, the designer better understands large solution sets. As mentioned in Section 5, this classification tool is also a starting point for our work on advanced understanding tools. These new tools should be based on machine learning techniques, for we strongly believe in the association of learning techniques and declarative modelling. Furthermore, our first results are promising and we intend to go further in this direction. Our next objective is to let the understanding phase influence the description and generation phases according to the designer’s wishes.
References
[1] M. Lucas, E. Desmontils, Les modeleurs déclaratifs (Declarative modellers), Rev. Int. CFAO Inf. Graph. 10 (6) (1995) 559–585.
[2] M. Daniel, M. Lucas, Towards declarative geometric modelling in mechanics, in: Proc. 1st Int. Conf. on Integrated Design and Manufacturing in Mechanical Engineering (IDMME’96), Nantes, France, 1996.
[3] P. Martin, D. Martin, An expert system for polyhedra modeling, in: Proc. Eurographics’88, Nice, France, September 1988, pp. 221–232.
[4] C. Colin, Calcul automatique de ‘bonnes vues’ d’une scène (Automatic computation of good ‘viewpoints’), in: Proc. MICAD’90, Paris, 1990, pp. 736–751.
[5] D. Plemenos, M. Benayada, Intelligent display in scene modelling, new techniques to automatically compute good views, in: Proc. Graphicon’96, Moscow, 1996.
[6] L. Pajot-Duval, M. Lucas, Un mécanisme général pour la gestion déclarative de vues multiples (A general mechanism that declaratively deals with multiple views), Rev. Int. CFAO Inf. Graph. 9 (4) (1994).
[7] L. Champciaux, De l’applicabilité des techniques d’apprentissage en modélisation déclarative (Using machine learning techniques in declarative modelling), in: Proc. 3IA’96, Limoges, France, 1996, pp. 163–185.
[8] R.S. Michalski, R. Stepp, Learning from observation: Conceptual clustering, in: R.S. Michalski, J.G. Carbonell, T.M. Mitchell (Eds.), Machine Learning: An Artificial Intelligence Approach, Tioga, Palo Alto, CA, 1983.
[9] L. Champciaux, Declarative modelling: speeding up the generation, in: Proc. CISST’97, Int. Conf. on Imaging Science, Systems, and Technology, Las Vegas, NV, June 30–July 3, 1997, pp. 120–129.
[10] C. Colin, Towards a system for exploring the universe of polyhedral shapes, in: Proc. Eurographics’88, Nice, France, 1988, pp. 209–220.
[11] F. Poulet, Modélisation déclarative de scènes tridimensionnelles par énumération spatiale: le projet spatioformes (Declarative modelling of 3D scenes by spatial enumeration: the spatioFormes project), Ph.D. Thesis 1171, University of Rennes 1, 1994.
[12] S. Liège, G. Hégron, A declarative method for urban layout modelling, in: Proc. Compugraphics’95, 4th Int. Conf. on Computational Graphics and Visualisation Techniques, GRASP, Portugal, 1995.
[13] J.W. Shavlik, T.G. Dietterich (Eds.), Readings in Machine Learning, Morgan Kaufmann, San Mateo, CA, 1990.
[14] P. Cheeseman, J. Kelly, M. Self, W. Taylor, D. Freeman, Autoclass: a Bayesian classification system, in: Proc. 5th Int. Conf. on Machine Learning, Ann Arbor, MI, 1988, pp. 54–64.
[15] E.A. Feigenbaum, H. Simon, EPAM-like models of recognition and learning, Cognitive Sci. 8 (1984) 305–336.
[16] M. Lebowitz, Experiments with incremental concept formation: UNIMEM, Mach. Learn. 2 (1987) 103–138.
[17] D. Fisher, Knowledge acquisition via incremental conceptual clustering, Mach. Learn. 2 (1987) 139–172.
[18] J.H. Gennari, P. Langley, D. Fisher, Models of incremental concept formation, Artif. Intell. 40 (1989) 11–61.
[19] K. Thompson, K. McKusik, COBWEB/3: A portable implementation, Technical Report FIA-90-6-18-2, AI Research Branch, NASA Ames Research Center, 1990.
[20] E. Rosch, The principles of categorization, in: E. Rosch, B.B. Lloyds (Eds.), Cognition and Categorization, Erlbaum, Hillsdale, NJ, 1978.
[21] M. Gluck, J. Corter, Information, uncertainty and the utility of categories, in: Proc. 7th Annual Conf. of the Cognitive Science Society, Irvine, CA, 1985, pp. 283–287.
[22] T.M. Mitchell, Generalization as search, Artif. Intell. 18 (1982) 203–226.
[23] S. Kochhar, CCAD: a paradigm for human–computer cooperation in design, IEEE Comput. Graph. Appl., May 1994.
[24] L. Khemlani, GENWIN: A generative computer tool for window design in energy conscious architecture, Build. Environ. 30 (1) (1995).
[25] A. Mahdavi, L. Berberidou-Kallivoka, A generative simulation tool for architectural lighting, in: Proc. 4th Int. Conf. on Building Simulation, Madison, 1995.
[26] A. Rau-Chaplin, B. MacKay-Lyons, P. Spierenburg, The LaHave house project: towards an automated architectural design service, in: Proc. Int. Conf. on Computer-Aided Design (CADEX’96), IEEE Computer Society, September 1996, pp. 24–31.
[27] G. Stiny, Introduction to shape grammars, Environ. Plan. B: Plan. Design 7 (1980) 343–351.
[28] A. Rau-Chaplin, B. MacKay-Lyons, T. Doucette, J. Gajewski, X. Hu, P. Spierenburg, Graphics support for a World-Wide-Web based architectural design service, in: Proc. Compugraphics’96, GRASP, Portugal, December 1996, pp. 83–92.

Laurent Champciaux received his Ph.D. in 1998 from the University of Nantes in France. His primary interests are in computer graphics, declarative modelling and learning methods to enhance the efficiency of the iterative design process of CAD modelers.