Benchmarking shape signatures against human perceptions of geometric similarity


Computer-Aided Design 38 (2006) 1038–1051 www.elsevier.com/locate/cad Benchmarking shape signatures against human perceptions of geometric similarity...



Doug E.R. Clark a, Jonathan R. Corney b, Frank Mill c, Heather J. Rea b,∗, Andrew Sherlock c, Nick K. Taylor a
a School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh, UK
b School of Engineering and Physical Sciences, Heriot-Watt University, Edinburgh, UK
c School of Engineering and Electronics, The University of Edinburgh, Edinburgh, UK

Received 6 October 2005; accepted 5 May 2006

Abstract

Manual indexing of large databases of geometric information is both costly and difficult. Because of this, research into automated retrieval and indexing schemes has focused on the development of methods for characterising 3D shapes with a relatively small number of parameters (e.g. histograms) that allow ill-defined properties such as “geometric similarity” to be computed. However, although many methods of generating these so-called shape signatures have been proposed, little work on assessing how closely these measures match human perceptions of geometric similarity has been reported. This paper details the results of a trial that compared the part families identified by both human subjects and three published shape signatures. To do this, a similarity matrix for the Drexel benchmark datasets was created by averaging the results of twelve manual inspections. Three different shape signatures (D2 shape distribution, spherical harmonics and surface partitioning spectrum) were computed for each component in the dataset, and then used as input to a competitive neural network that sorted the objects into numbers of “similar” clusters. Comparison of human and machine generated clusters (i.e. families) of similar components allows the effectiveness of the signatures at duplicating human perceptions of shapes to be quantified. The work reported makes two contributions. Firstly, the results of the human perception test suggest that the Drexel dataset contains objects whose perceived similarity levels ranged across the recorded spectrum (i.e. 0.1 to 0.9). Secondly, the results obtained from benchmarking the three shape signatures against human perception demonstrate a low rate of false positives for all three signatures and a false negative rate that varied almost linearly with the amount of perceived similarity. In other words, the shape signatures studied were reasonably effective at matching human perception in that they returned few wrong results and excluded parts in direct proportion to the level of similarity demanded by the user. © 2006 Elsevier Ltd. All rights reserved.

Keywords: Geometric similarity; Shape perception; D2 shape distribution; Spherical harmonics; Surface partitioning spectrum; Artificial neural networks

1. Introduction

The most widespread and valuable types of 3D data are the CAD models created by commercial manufacturing companies who commonly have tens of thousands of component models

∗ Corresponding address: Heriot-Watt University, School of Engineering and Physical Sciences, Riccarton Campus, Edinburgh EH14 4AS, United Kingdom. Tel.: +44 131 449 511. E-mail addresses: [email protected] (D.E.R. Clark), [email protected] (J.R. Corney), [email protected] (F. Mill), [email protected] (H.J. Rea), [email protected] (A. Sherlock), [email protected] (N.K. Taylor).

© 2006 Elsevier Ltd. All rights reserved. 0010-4485/$ - see front matter. doi:10.1016/j.cad.2006.05.003

stored on their hard disks. These models are not only used internally to record the exact shape and dimensions of products, but also to communicate specifications to both customers and sub-contractors. Consequently, 3D CAD models are of great importance and commercial value to all manufacturers. Currently 3D models (like engineering drawings) are indexed by alpha-numeric “part numbers” with a syntax specific to each company. While this indexing system works well in the context of ongoing maintenance and development of individual parts, it offers little scope for “data mining” (i.e. exploration) of a company’s inventory of designs. For example currently it is impossible to identify if existing

D.E.R. Clark et al. / Computer-Aided Design 38 (2006) 1038–1051

production tooling can be reused by automatically searching a database of CAD data for parts that are of “similar” shape to a new design [1]. Consequently, identifying some form of computable “shape signature” that enables a general-purpose, searchable, geometric indexing scheme is seen as an important research goal. However, all of the approaches described in the literature are limited to some extent by the expressive power of the individual shape signatures used and the computational methods used to compare them (e.g. Minkowski L1) [2]. The broad aim of the authors’ research is to investigate computational methods for enabling internal company reuse of existing 3D mechanical product data and external sourcing of production expertise. The specific objective of the work reported here is to investigate whether a combination of Artificial Neural Network (ANN) and shape signatures can be used to locate families of similar shapes that correspond to those identified by humans within a test dataset. The paper is structured as follows: Section 2 briefly outlines the literature in the areas of shape recognition and describes the nature of the three signatures used in this work: shape distributions (D2), Spherical Harmonics (SPH) and Surface Partitioning Spectrum (SPS); Section 3 details the procedure used to manually assess shape similarity within the Drexel benchmark dataset; Section 4 reports how a competitive Neural Network was used to assess the different signatures’ ability to sort a test set of components into similar collections; and lastly, Section 5 discusses the results and Section 6 draws some conclusions.

2. Background

2.1. Literature

Although many aspects of engineering and art have demonstrated that human beings are remarkably efficient at recalling, comparing and envisaging three dimensional shapes, psychologists, computer vision researchers and most recently CAD researchers have struggled to understand the mechanisms underlying this ability. While there is a rich literature on visual perception of shape found in several different areas of academia, each addresses different aspects of the problem and none appears to be directly applicable to the work described here. For example, much of the psychological work on shape assessment in the medical literature [3] has been reported by paediatric researchers studying cognitive development who typically focus on the comparison of primitive volumes (e.g. sphere, cube, cone etc.). Similarly, vision researchers deal with 2D images of 3D objects and are typically concerned with identifying differences in gross form, say distinguishing a dog from a horse [4] (i.e. shape classification). However, the study of perceived similarity amongst shapes [37,38] is directly relevant to this work. Interestingly, these early experiments (to make quantitative measurements of shape similarity) suggested that humans frequently perceive asymmetric relationships (e.g. subjects judge A as similar to B, but do not always identify B as similar to A). Furthermore, the strength of perceived similarity appeared to be dependent on


the set of shapes from which the selection is made.¹ Clearly geometric measures are only one aspect of a complex, overall judgement humans make about the similarity of objects. In contrast, CAD researchers are concerned with the full 3D models of complex forms that frequently have re-entrant features (i.e. holes, pockets, slots). Their objective is not to identify identical shapes (which is relatively trivial) but to locate shapes that are judged in some way similar to a target form. On the other hand, cognitive and epistemic literature does provide some useful insights into the problem that can be exploited in the design and assessment of 3D retrieval algorithms. For instance, [5,6] established that concepts of geometric similarity, like beauty, are in the “eye” of the beholder and vary between cultural and ethnic groups. This was reinforced by the authors’ own experience, which indicated that mathematicians judge shape similarity differently than mechanical engineers do. Consequently, participants in our study of human shape perception were drawn uniformly from mechanical engineering backgrounds. Geometric modelling researchers have also been able to draw directly on the algorithms and methods proposed in other areas of computer graphics for content based retrieval of the mesh models used in visualisation and animation (e.g. chairs, planes, ships, cars etc.). From this work a number of different measures have been applied to CAD data, such as shape distributions [7–10], reflective symmetry [11], spherical harmonics [12,13], weighted point sets [14], 2D slices [15] and characteristic views [16]. Systematic assessments of these different approaches have been reported in [17,18] and [19]. However, the methodology of these studies has been to compare the effectiveness of proposed methods in classifying the shapes into families based on properties such as manufacturing method (e.g. machined parts) or function (chairs). To the authors’ knowledge, with the exception of [18], no studies that benchmark proposed shape signatures for 3D mechanical CAD against human perception of similarity have been reported. Reviews of shape retrieval research that specifically addresses mechanical CAD models can be found in [20–23]. Recently the establishment of benchmark data sets [18,24], against which researchers can test their methods, has dramatically improved the ease with which different methods of shape retrieval for mechanical CAD can be compared.

2.2. Experimental rationale

Automated retrieval systems return a set of results judged to match given search criteria (in our case similarity to the shape of a target part). Typically there is a trade-off between the number of “hits” returned (termed “Recall”) and the completeness of the results (termed “Precision”).

¹ It should be noted that these observations arose from a series of pairwise comparisons of shapes, whereas the authors’ experimental protocol asked subjects to group shapes into similar families and so precludes any asymmetric relationships.

The


relationship between the parameters of recall (R) and precision (P) (both expressed as percentages of the maximum values possible) characterises the behaviour of a given retrieval system and is illustrated by plots of PR values. Intuitively this reflects the fact that the larger the number of hits returned the more likely it is that all “possible” matches will have been identified. The behaviour of retrieval systems can also be characterized by the proportion of wrong “hits” (false positives, fp), “correct” hits (true positives, tp) and missing “hits” (false negatives, fn). These values also vary with the number of results returned and can be illustrated graphically (i.e. ROC plots). PR and ROC plots reflect different views of the same data [25]. Crucially, both PR and ROC parameters can only be calculated if the correct answer (i.e. set of results) is known for a given query (so P, fp, tp and fn rates can be calculated) and the sensitivity of the method can be varied (so differing R rates can be quantified). Furthermore, the fp, tp, fn, P and R values should be calculated in such a way that they reflect the typical behaviour of the system for a range of retrieval operations across a given dataset (rather than the results returned by one specific query). However, the Drexel benchmark dataset is structured in terms of classifications (e.g. manufacturing method) rather than generic shape similarity. In other words, if one is testing the performance of a retrieval system to identify components whose shapes are designed to be manufactured by casting (i.e. with tapers and blended edges) then the precision can be easily quantified. But if a researcher wishes to test the ability of a system to identify components whose shapes are perceived to be similar (regardless of manufacturing method or functional intent) to a given target, then the Drexel benchmark, unlike [18], does not define what the result set should contain (i.e. the correct answer).

Because only the human user can define the correct result, there is an inescapable requirement for manual classification of the benchmark dataset. However, to ask human subjects to manually identify similar parts in a manner directly analogous to the procedure used to characterise automated retrieval systems (e.g. exhaustively ranking the similarity of the contents across all permutations) is impractical for all but small collections. Instead, users were asked to cluster components into families, allowing a similarity matrix to be defined (see Section 3). Having established a measure against which the precision of search results could be quantified, the next step is to incorporate proposed shape signatures in a retrieval system whose output will allow direct comparison with the human assessment. This was done by using an unsupervised neural net to cluster components into classes determined to be similar on the basis of the information contained in the shape signatures (see Section 4). In previous work the authors have proposed a shape signature, known as the Surface Partitioning Spectrum (SPS), and this was used to benchmark against both human perception and the following measures proposed by other researchers:
• D2 shape distribution (widely used in other benchmarking studies); and

• Spherical harmonics (SPH) (which has shown impressive performance [17]).

Each of these three shape signatures is now discussed in detail.

2.3. Shape distributions (D2)

One theme of recent activity has focused on the use of variants of “shape distributions” [7,8,26] (i.e. probability distributions) for three-dimensional objects that are generated by explicit computation. These methods have two distinct steps:
• generation of random points over the surface of the object, and
• application of a “shape function”.

On first encounter it is not clear why the probability distributions arising from different shape functions are of the form they are. However, for simple shapes there is an obvious correlation between the geometry of the object and the form of its 2D distribution curve. For example, Fig. 1 shows how the probability distribution for a cube generated by the D2 shape function (distance between pairs of random surface points) is a combination of the probability distributions for individual faces, pairs of orthogonal faces and pairs of parallel faces. After scaling the distances/lengths of the probability distributions by the diagonal length of their bounding box and using the same number of points to generate each curve we observe three things:
• Both the single face (Fig. 1(a)) and orthogonal faces (Fig. 1(b)) produce distributions that progress smoothly from zero to the maximum distance.
• The distribution produced by parallel faces in Fig. 1(c) has a sharp increase at the distance of separation between the faces. This is due to the fact that there is a high probability of the distance between two points closely approximating the distance between the faces.
• For the entire cube in Fig. 1(d), the smooth curve from 0 to the point of sharp increase is due to the distribution of distances on each of the faces.

Osada et al. [26] have successfully used the D2 shape function to differentiate between grossly dissimilar shaped objects (e.g. aeroplanes and animals). However, Ip et al. [8] have shown that the difference between mechanical parts cannot be distinguished using this simple distribution alone, and have, accordingly, developed the function further to identify the measured distances as within the solid shape, across a void, or through both solid and void. In other words, the D2 shape distributions of objects with even moderate re-entrant features prove difficult to analyse as they have surfaces with very similar separations. The global shape of the complex object drowns information about individual features and, conversely, the effect of internal features on the distribution obscures information about the global shape (see Fig. 3). In response to these problems Ip et al. [8] describe three new distributions based on geometric properties of the intersection between a random line segment and the model. They demonstrate that their approach improves the ability of the D2 shape function to distinguish between mechanical parts.
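The two steps of the basic D2 shape function (random surface points, then the distance between pairs of points) can be illustrated with a minimal Python sketch. It is not the implementation used in [7,8,26]: the function names, the bin count, the number of sampled pairs and the `unit_cube` test mesh are all assumptions made for illustration, and the mesh is assumed to be given as NumPy arrays of vertices and triangle indices. Distances are scaled by the bounding-box diagonal as described above, and a Minkowski L1 distance is included as one possible way of comparing two histograms.

```python
import numpy as np

def sample_surface_points(vertices, triangles, n_points, rng):
    """Draw points uniformly over the surface of a triangle mesh."""
    tri = vertices[triangles]                       # shape (T, 3, 3)
    # Facet areas, so that facets are chosen in proportion to their size.
    cross = np.cross(tri[:, 1] - tri[:, 0], tri[:, 2] - tri[:, 0])
    areas = 0.5 * np.linalg.norm(cross, axis=1)
    chosen = rng.choice(len(triangles), size=n_points, p=areas / areas.sum())
    # Uniform barycentric coordinates (square-root construction).
    r1 = np.sqrt(rng.random(n_points))
    r2 = rng.random(n_points)
    a, b, c = tri[chosen, 0], tri[chosen, 1], tri[chosen, 2]
    return ((1 - r1)[:, None] * a
            + (r1 * (1 - r2))[:, None] * b
            + (r1 * r2)[:, None] * c)

def d2_histogram(vertices, triangles, n_pairs=20000, n_bins=64, seed=0):
    """Normalised histogram of distances between random surface point pairs."""
    rng = np.random.default_rng(seed)
    p = sample_surface_points(vertices, triangles, n_pairs, rng)
    q = sample_surface_points(vertices, triangles, n_pairs, rng)
    dist = np.linalg.norm(p - q, axis=1)
    # Scale by the bounding-box diagonal so that all distances lie in [0, 1].
    diagonal = np.linalg.norm(vertices.max(axis=0) - vertices.min(axis=0))
    hist, _ = np.histogram(dist / diagonal, bins=n_bins, range=(0.0, 1.0))
    return hist / hist.sum()

def l1_distance(hist_a, hist_b):
    """Minkowski L1 distance between two histograms of equal length."""
    return float(np.abs(hist_a - hist_b).sum())

def unit_cube():
    """Hypothetical test mesh: a unit cube as 8 vertices and 12 triangles."""
    vertices = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0],
                         [0, 0, 1], [1, 0, 1], [1, 1, 1], [0, 1, 1]], dtype=float)
    triangles = np.array([[0, 2, 1], [0, 3, 2], [4, 5, 6], [4, 6, 7],
                          [0, 1, 5], [0, 5, 4], [3, 7, 6], [3, 6, 2],
                          [0, 4, 7], [0, 7, 3], [1, 2, 6], [1, 6, 5]])
    return vertices, triangles
```

Calling `d2_histogram(*unit_cube())` gives a distribution of the kind shown for the whole cube in Fig. 1(d); restricting the triangle list to a single face or a pair of faces gives the constituent curves.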


Fig. 1. Constituent parts of cube D2: (a) single square face; (b) right-angled faces; (c) parallel faces; (d) cube.

However, the generation of these distributions requires repeated intersection calculations and it is reasonable to ask if simpler functions can produce comparable performance. Liu et al. [10] have introduced a variant of the D2 shape histogram known as a Thickness Histogram (TH), which estimates the uniform thickness of a model. The peaks of the D2 are effectively enhanced for parallel opposing regions by using the dot products between the D2 lines and the normals of the triangles occupied by the points as a weighting factor. Liu et al. suggest that the TH should identify similar CAD/mechanical models more effectively than the D2 shape function, because such models contain a high number of opposing parallel planes. However, this has not been shown quantitatively.

2.4. Spherical harmonics (SPH)

Kazhdan et al. [13] generate a 3D histogram using the spherical harmonics approach. This approach begins by scaling the model so that its bounding sphere fits into a 64 × 64 × 64 voxel grid with its centre of mass at the centre of the grid. Voxel values are binary, being set to unity if they intersect the model boundary and zero otherwise. From the voxel grid a family of 32 spherical functions is generated, one for each of a succession of concentric spheres with radii ranging from 1 to 32. Finally, each of the 32 spherical functions is decomposed into its first 16 harmonics (cf. decomposing a function into its Fourier components) and, given that the norms of the harmonics are rotationally invariant, it is the harmonic norms that are used to define the signature for the shape.

Fig. 2 shows spherical harmonic signatures generated for four simple geometric shapes using the Spherical Harmonic getSig program [27]. The plots illustrate the harmonics (x-axis, 1 to 16) and their amplitude (vertical z-axis) present in the shape signature generated for each of the 32 radii examined (y-axis). Considering each in turn:
• Fig. 2(a): In theory a sphere can be exactly represented by a single harmonic, whose function “covers” a surface at a given radius. The program generates a signature close to this ideal, but blurred by both the tessellation (i.e. faceting) of the sphere and the voxelization of the surface. The resultant signal approximates the sphere surface by two harmonics that peak around the radius of the sphere (r = 16). At small radii no signal is detected (inside the “hollow” sphere). The larger radii (outside the sphere) produce no signals; this is because the bounding sphere used to size the voxel grid is generated from a faceted representation rather than an analytical surface.
• Fig. 2(b): Unlike the sphere, the cube cannot be approximated by a single harmonic at any radius and consequently a series of higher harmonics appear in the signature.
• Fig. 2(c): The addition of a small hole in the centre of the block generates signals at smaller radii (circled). Note that at larger radii the signal remains largely unchanged as the “expanding” sphere encounters voxels identical to the block in Fig. 2(b).
• Fig. 2(d): The cylinder is notable for the regularity of the signal across a range of radii. Soon after the radius of the sphere is larger than that of the cylinder the intersection


Fig. 2. Spherical Harmonics of simple shapes: (a) sphere; (b) cube; (c) cube with 1 hole; (d) cylinder.

between the two shapes will be two circular areas on opposite sides of the sphere. Although the relative size of this intersection will decrease (i.e. its area as a proportion of the entire sphere) it will be present across a range of radius values. Consequently the plot is much broader than that of other shapes.

This measure has performed well in comparative assessments carried out on datasets of generic components [17] (i.e. not mechanical CAD parts). However, one potential difficulty of applying this approach to mechanical parts is the sensitivity of the signature to the location of the centre of the spheres. Potentially, visually similar objects could generate quite different signals depending on the location of the spheres used to create the spherical harmonics.

2.5. A surface partitioning spectrum (SPS)

Many of the shape signatures proposed in the literature have been designed to distinguish between grossly different shapes (e.g. cows from aeroplanes), which have limited

surface depressions or re-entrant features. When considering manufacturing components, human observers are generally struck by the regularity of their surfaces (i.e. large areas having no curvature discontinuities) and the repetitions of patterns (i.e. features) across the component [28]. Consequently it is reasonable to speculate that any histogram, or signature, derived from a 3D shape should be influenced by both the curvature of the surface and the relationship between adjacent surfaces. B-Rep models, for example, represent the shape of objects in terms of face, edge and vertex elements that are generally synonymous with derivative discontinuities. Such discontinuities effectively define the boundaries of these elements (consequently they have no internal derivative discontinuities). In other words, the “natural” B-Rep faces used in commercial CAD systems represent maximally connected regions of an object’s surface bounded by discontinuities of tangent planes. However, the criterion for separation does not have to be limited to discontinuity of tangent planes; higher derivatives or non-geometric properties (e.g. colour, surface texture) could be used to partition a shape’s surface.


Fig. 3. D2 distribution and Spherical Harmonics of complex shapes.

Fig. 4. Angles between adjacent facets.
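The angle between two adjacent triangular facets (Fig. 4) can be computed from the facet normals. The Python sketch below is illustrative only and rests on several assumptions that the paper does not state explicitly: each facet is a 3 × 3 array of vertex coordinates listed counter-clockwise when seen from outside the solid, the two facets share exactly one edge, and the angle is reported so that 180° means coplanar, values below 180° mean a concave edge and values above 180° mean a convex edge. The function names are hypothetical.

```python
import numpy as np

def facet_normal(p0, p1, p2):
    """Unit normal of a triangle with counter-clockwise vertex order."""
    n = np.cross(p1 - p0, p2 - p0)
    return n / np.linalg.norm(n)

def facet_angle_deg(tri_a, tri_b):
    """Angle across the edge shared by two facets, in degrees.

    Assumed convention: 180 = coplanar, below 180 = concave edge,
    above 180 = convex edge.
    """
    n_a = facet_normal(*tri_a)
    n_b = facet_normal(*tri_b)
    # Angle by which the surface normal turns when crossing the edge.
    bend = np.degrees(np.arccos(np.clip(np.dot(n_a, n_b), -1.0, 1.0)))
    # The vertex of tri_b that is not on tri_a shows which way the surface folds.
    shared = [any(np.allclose(v, w) for w in tri_a) for v in tri_b]
    apex = tri_b[shared.index(False)]
    convex = np.dot(n_a, apex - tri_a[0]) < 0       # apex lies behind facet a
    return 180.0 + bend if convex else 180.0 - bend
```

Under these assumptions two facets on neighbouring faces of a cube give 270°, and a floor facet meeting a wall facet at an inside corner gives 90°.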

These observations have motivated the development of a shape signature known as the Surface Partitioning Spectrum [29] that reflects:
• Both concave and convex geometry
• Repetition of features
• The types of surfaces found on the component and their adjacent relationships.

The division of continuous surfaces into neat collections of faces and edges is a problem well known in the literature on reverse engineering and point cloud manipulation. Surface curvature is frequently used as a basis for region separation and a large literature exists on both the robust estimation of surface curvature from polygon meshes [30–32] and mesh segmentation algorithms. Rather than attempt to define unambiguous criteria for segmenting an object’s surface into “faces” (i.e. regions or patches), the SPS systematically varies a boundary threshold

and records how the number of “faces” defined by each value varies. In this work all the 3D objects are represented by closed meshes of triangles (based on STL data) and the angle between adjacent triangular facets (see Fig. 4) is used as the segmentation criterion. For a given threshold the SPS constructs patches of contiguous triangles, each of which is connected to the patch under construction by at least one edge whose dihedral angle is less than, or equal to, the threshold value. For many threshold values the patches formed will be disjoint and many triangles will not be incorporated into any patch. Because of this the exact position of patch boundaries is not significant and so the approach should be robust across different resolutions of faceting. To generate the SPS, the criterion for separation is varied across a range of values, denoted $\Delta\theta$. By setting the range $\Delta\theta$ to between 179° and 181°, contiguous planar collections of triangles will be defined as patches; this planar range is given


Fig. 5. Merging of patches

Fig. 6. SPS graph of Clover shape.

the special notation $\Delta\theta^{*}$ (Fig. 5(b)). If the range $\Delta\theta$ is set to include 90° to 181° (denoted $\Delta\theta_{90}^{180}$), then planar patches separated by a concave edge would merge to form a single patch (see Fig. 5(c)). By plotting the number of patches formed on the model against a range of tolerance values between 0° and 360°, a profile is created which reflects both the topology and geometry of the model. If the SPS simply counted the number of faces defined by a particular curvature threshold the results would be ambiguous. For example, if the threshold were set to $\Delta\theta^{*}$ then the number of planar faces would be counted, and clearly many different objects could be constructed with identical numbers of planar faces. However, because the SPS is generated by systematically varying the boundary threshold the patches will merge in a manner determined by their adjacency relationships. In other words, while the level of the SPS histogram (i.e. number of patches) will be determined by an object’s surface geometry, the position of the step changes in the SPS histogram will reflect both the geometry and topology of a shape. Consider for example the clover-shaped extrusion presented in Fig. 6. To generate the spectrum, the tolerance is first decreased from 180° in increments of 10° down to 0° and then up to 360°. At each step the number of patches formed is calculated. The SPS

graph plots this tolerance angle against the number of patches generated. At $\Delta\theta^{*}$ the SPS graph has a value of two, representing the top and bottom planar faces of the mould. As the angle is decreased six additional patches are formed at $\Delta\theta_{100}^{180}$. These patches occur at the six concave creases of the clover shape. A similar increase in the number of patches can be seen when the angle is increased to $\Delta\theta_{180}^{190}$, where the six convex surfaces of the clover form patches. Again, at $\Delta\theta_{180}^{270}$, the number of patches drops to one, due to the orthogonal nature of the object. Visual inspection of the SPS graphs suggested that they might offer good object discrimination for complex shapes (see Figs. 3 and 7).

3. Human assessment of shape similarity with the Drexel benchmark dataset

As explained earlier, models in the Drexel benchmark dataset [33,34] are not classified by their geometric similarity, but by other criteria such as method of manufacture (e.g. castings, machined) or mathematical class (i.e. geometric primitives). In order to establish a measure of the human perception of similarity the following exercise was carried out with 12 subjects. Each subject was presented with card-mounted images of all the objects in the dataset (see Fig. 8);


Fig. 7. SPS graphs of complex 3D shape.

Fig. 8. Manual assessment of the geometric similarity within the dataset.

as an induction process, one card was picked randomly, and the subject was asked to choose 5 similar objects from the remaining data set. The subject was then asked to consider the entire data set and to divide it into groups, or families, of similar objects. No limit was placed on the number of these families, and unclassified objects were to be left in families of one. Once the subject had completed this process the groups were recorded and a similarity matrix was constructed for each subject. Each object is assigned a row and a column in the similarity matrix. For a given family member’s row, the elements in the columns associated with the other family members are set to one and non-family member elements are set to zero. Twelve subjects participated in the test. The benchmark similarity matrix (Fig. 9) was the average of their individual similarity matrices, that is,

$$B = \frac{1}{12} \sum_{n=1}^{12} S_n,$$

where $B$ is the benchmark similarity matrix, and $S_n$ is the nth subject’s similarity matrix. This provides graded entries in the similarity matrix (i.e. values between 0 and 1). Examples of relatively similar objects as defined by the matrix are shown in Table 1. The Similarity Matrix in Fig. 9 defines relationships for the Drexel datasets given in Table 2.
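The construction of the benchmark matrix can be sketched in a few lines of Python. This is an illustration rather than the authors' procedure as implemented: it assumes that each subject's grouping is recorded as one family label per object, and that the diagonal of each subject's matrix is set to one (the text does not say how an object's similarity to itself was recorded). The function names are hypothetical.

```python
import numpy as np

def family_similarity(labels):
    """Binary similarity matrix for one subject.

    labels[i] is the family that the subject assigned to object i.
    Entry (i, j) is 1 when objects i and j share a family, else 0.
    Assumption: the diagonal is 1 (an object is similar to itself).
    """
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)

def benchmark_matrix(all_labels):
    """Average the per-subject matrices to obtain graded similarities."""
    return np.mean([family_similarity(l) for l in all_labels], axis=0)
```

For example, with three subjects and four objects, `benchmark_matrix([[0, 0, 1, 1], [0, 0, 0, 1], [0, 1, 1, 1]])` gives a similarity of 2/3 between objects 0 and 1 (two of the three subjects placed them in the same family) and 0 between objects 0 and 3.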

The high degree of similarity found in blocks of components around the diagonal (e.g. group A in Fig. 9) suggests that many of the datasets are well chosen and contain inherently similar parts. However, not all the components judged to be similar lie close to the diagonal. Close inspection shows there are numerous examples of parts that are judged similar to components found in other datasets (e.g. group B in Fig. 9). To enable comparison with the output of the Artificial Neural Networks’ clustering (described in the next section) the matrix has to be filtered to contain only binary values (i.e. similarity measures of 1 or 0). To achieve this, a threshold value is used; above the threshold value, similarity values are rounded up to 1, and below, rounded down to 0. Varying the threshold in this way creates similarity benchmark matrices with more, or less, demanding judgements of similarity as shown in Table 3. For example, the table indicates that 64% of the test objects are judged to be geometrically similar by 90% of the test subjects.

4. ANN based similarity assessment

Artificial Neural Networks (ANNs) have proved to be very successful in performing classification tasks. Given a set of attributes which describe a collection of objects and a label indicating the class to which each object belongs, ANNs can form a mapping from the former to the latter which can be more comprehensive than other classification mechanisms. Unsupervised ANNs do not rely on pre-determined classes; they “discover” classes by grouping the data presented to them into clusters such that all objects within a given cluster have more in common with each other than with any object in any other cluster. In this way problems of subjectivity can be avoided (although identifying the semantics for the classes created by this type of system can be very difficult) [35]. The ANN used in this study is the standard competitive network provided in the Matlab Neural Network Toolbox.
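The idea behind an unsupervised competitive network can be illustrated with a minimal winner-take-all sketch in Python. This is not the Matlab Neural Network Toolbox implementation used in the study, and it omits refinements that such toolboxes may apply; the function name, learning rate, number of epochs and initialisation scheme are all assumptions chosen for illustration. Each unit holds a weight vector; for every input signature the closest unit "wins" and its weights are moved a little towards that input, so that the units settle on cluster centres.

```python
import numpy as np

def competitive_cluster(signatures, n_clusters=12, epochs=50, lr=0.1, seed=0):
    """Cluster signature vectors with winner-take-all competitive learning.

    Returns one cluster index per input row. Hypothetical sketch: the
    parameters are illustrative, not those of the study.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(signatures, dtype=float)
    # Initialise each unit's weights on a randomly chosen input vector.
    weights = x[rng.choice(len(x), size=n_clusters, replace=False)].copy()
    for _ in range(epochs):
        for i in rng.permutation(len(x)):
            # The unit closest to the input wins and moves towards it.
            winner = np.argmin(np.linalg.norm(weights - x[i], axis=1))
            weights[winner] += lr * (x[i] - weights[winner])
    # Final assignment: every object goes to its nearest unit.
    dists = np.linalg.norm(x[:, None, :] - weights[None, :, :], axis=2)
    return dists.argmin(axis=1)
```

In this sketch every object is assigned to exactly one cluster, although some units may end up with no members.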
Three competitive ANNs, one for each criterion (i.e. SPS vs D2 vs SPH distributions), were set to cluster the objects into 12 distinct families (a value determined by the range of possible families identified by manual classification). Every member of the dataset is assigned to a particular cluster and the number of members assigned to each cluster is shown in Table 4. The threshold matrices were previously given in Table 3. The sparse


Table 1 Relative similarity of objects as assessed by humans

matrix output from the ANN was converted to a similarity matrix containing binary values by assigning values of 1 to all relationships between members of a cluster (and 0 to relationships with all members outside the grouping).

5. Results and discussion

The ANNs’ outputs were compared to the thresholded benchmark matrix. The resulting rates of false positive vs false negative, true positive vs false positive and precision vs recall have been plotted in Fig. 10, where the perception threshold was varied from 0.1 up to 0.9 in steps of 0.2 (a threshold of 0.9 will define a subset of components that 90% of the human subjects agreed were similar). Fig. 10(a) suggests that all three measures have very low false positive rates (i.e. they produce very few wrong answers)

and have false negative rates that vary linearly with the level of perceived similarity (i.e. threshold value). In other words, if the output of the ANN clustering is compared to the human similarity matrix thresholded at 0.9 (i.e. 90% of subjects agree on the assessment), the number of components “missed” (i.e. the false negative rate) is low (SPH 30%, D2 50%, SPS 42%); lower thresholds produce higher false negative values. Overall, the SPH produces lower false negative rates than the D2 or SPS based signatures. Varying the perception threshold affects the false-negative rate of all three measures in a linear manner, a behaviour that can be understood by considering the fragments of a similarity matrix shown in Fig. 11. Fig. 11(a) depicts a graduated human similarity matrix, and Fig. 11(b) and (c) show the same matrix thresholded at 90% and 50% respectively. If an ANN produces


Fig. 9. Similarity matrix for Drexel datasets.

Table 2 Location of Drexel datasets in human similarity matrix

Row/column number    Drexel dataset
1–11                 Bricks
12–52                Cast then machined
53–68                Cubes
69–77                Brackets
78–88                Gears
89–94                Housings
95–107               Linkages
108–114              Nuts
115–132              Screws
133–137              Springs
138–144              Xshapes
145–148              Wheel
149–178              Plate
179–183              Cylinders
184–238              Machined

a similarity matrix (Fig. 11(d)) using a given criterion, we calculate the number of false positives (fp) (i.e. the number of black squares present in the machine similarity matrix which are not in the human similarity matrix) to be 4 when compared with the 90% threshold matrix, and 1 when compared with the 50% threshold matrix. The false negatives (fn) (i.e. the number of black squares which are present in the human similarity matrix, but not in the machine similarity matrix) number 1 when compared with the 90% threshold matrix and 2 when compared with the 50% threshold matrix. Although from this example it appears

that the false-positive rate should increase as the threshold level is raised, it is important to note that the false-positive rate is the percentage of the maximum number of possible false positives for that similarity matrix. From Fig. 11(b) and (c) it can be seen that the maximum number of false positives will increase as the threshold level increases, and so the percentage remains constant.

Fig. 10(b) illustrates the behaviour of the true positive rate when plotted against the false positive rate for various levels of threshold. The results mirror the fp/fn plot and again the SPH measure performs slightly better when compared with human perceptions of similarity. A further insight into the system’s behaviour can be gained from the relationship between Precision and Recall values (Fig. 10(c)). Here, although SPH appears to have superior performance, its behaviour can be seen to be more erratic than the other measures, which suggests that there are weaknesses in its overall ability to match human perception (possibly linked to the difficulties of determining a representative location of the centre of the spheres).

While the authors believe that the procedure used to generate the human similarity matrix has produced a credible result, it also raises a couple of issues worthy of debate, namely:

• The human subjects used 2D views to classify the 3D objects. Would the results have been different if 3D models of each part had been available?
• A total of 12 subjects were used. Would the results have been different if 100 had been used?


Table 3 The threshold matrices

Although these are valid questions, the authors believe that, since each picture was generated from a view deliberately chosen to illustrate the model and the domain of application is engineering, the results provide a useful benchmark. The question of the number of trial subjects is also interesting. The matrices generated by each individual are available at [36] and one can consider the variance within the sample. If all the subjects had perceived exactly the same geometric similarity amongst the models, there would be no shades of grey, only black and white (i.e. 100% or 0%), in Fig. 9. We have investigated this issue by applying thresholds to the benchmark and comparing the results at different thresholds.

Lastly, the functioning of the ANN must also be considered when discussing these results. The ANN variables are:

• The number of clusters (output nodes).
• The number of input parameters.
• The scaling of the input parameters.

Considering each of these in turn:

• The number of clusters was set at a value that does not produce any empty clusters.
• The number of parameters used to characterise each shape is determined by the nature of the shape signatures.
• Scaling of the input parameters did have an effect. A number of different scaling values were applied to each of the shape signatures, and the scaling values that produced the most favourable result were used.

Finally, translation difficulties with some of the B-Rep models prevented STL files being generated. The components affected are listed in the Appendix and were excluded from the analysis presented in the results section.

Table 4 Number of objects in each ANN produced cluster

Cluster no    SPS    SPH    D2
1             16     19     27
2             19     22     23
3             15     10     15
4             20     14     8
5             33     8      15
6             19     12     12
7             13     32     22
8             13     13     10
9             12     19     17
10            5      18     17
11            19     26     17
12            15     6      16
Total         199    199    199

6. Conclusions

There are two principal conclusions that can be drawn from the work reported:

(1) The results of the human perception test suggest that the Drexel dataset contains a large number of perceptually similar objects that range from almost exact matches (i.e. 100%) to debatable similarities (i.e. 10%). It is hoped that the similarity matrix reported for the Drexel dataset will form a valuable resource for other researchers.


Fig. 10. Graphs comparing the performance of SPS, SPH and D2. (a) False positives vs false negatives; (b) true positive rate vs false positive rate; (c) precision vs recall.

Fig. 11. Example of variation in false-positive and false-negative rates.
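The comparison summarised in Figs. 10 and 11 can be sketched in a few lines of Python. The sketch binarises a graded human similarity matrix at a chosen threshold, builds a machine similarity matrix from cluster labels, and counts false positives and false negatives. The function names, the toy data, the choice to count each distinct pair of objects once (ignoring the diagonal), and the treatment of a value exactly equal to the threshold as similar are all assumptions made for illustration; the paper does not state these details.

```python
from itertools import combinations


def threshold_matrix(similarity, threshold):
    """Binarise a graded similarity matrix.

    Assumption: a value exactly equal to the threshold counts as similar.
    """
    return [[1 if value >= threshold else 0 for value in row] for row in similarity]


def cluster_matrix(labels):
    """Binary co-membership matrix: 1 if two objects share a cluster label."""
    return [[1 if a == b else 0 for b in labels] for a in labels]


def compare(human, machine):
    """Count agreements over distinct pairs (i < j) and derive the usual rates."""
    tp = fp = fn = tn = 0
    for i, j in combinations(range(len(human)), 2):
        h, m = human[i][j], machine[i][j]
        if m and h:
            tp += 1
        elif m and not h:
            fp += 1  # machine says similar, humans do not
        elif h and not m:
            fn += 1  # humans say similar, machine missed it
        else:
            tn += 1

    def ratio(num, den):
        return num / den if den else 0.0

    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        # fp as a fraction of the maximum possible number of false positives
        "fp_rate": ratio(fp, fp + tn),
        "fn_rate": ratio(fn, fn + tp),
        "tp_rate": ratio(tp, tp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
    }


# Hypothetical 4-object example: fraction of subjects judging each pair similar.
human_scores = [
    [1.0, 0.9, 0.5, 0.1],
    [0.9, 1.0, 0.6, 0.0],
    [0.5, 0.6, 1.0, 0.2],
    [0.1, 0.0, 0.2, 1.0],
]
machine = cluster_matrix([0, 0, 1, 1])  # hypothetical cluster labels

strict = compare(threshold_matrix(human_scores, 0.9), machine)
lenient = compare(threshold_matrix(human_scores, 0.5), machine)
print(strict)
print(lenient)
```

In this toy example the number of false negatives rises from 0 at the 0.9 threshold to 2 at the 0.5 threshold, while the false-positive count stays at 1; the numbers are illustrative only and are not taken from the study.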

(2) The results obtained from benchmarking the three shape signatures against human perception of the dataset indicate that they generally have a low false-positive rate and a false-negative rate that varied almost linearly with the amount of similarity perceived by human subjects. This was a better result than the researchers expected and suggests that many practical applications could be viable with existing measures and a knowledge of the relative precisions. For example, an interactive shape browsing system might require a low false-negative rate, because the final selection will be made by the user and it is

important not to exclude potential choices. On the other hand, a system to provide cost estimates based on a job’s similarity to previously produced parts cannot be allowed to introduce mistakes, so a low false-positive rate is essential.

7. Future work

The authors intend to repeat this experiment with a different dataset (e.g. [18]) to investigate if the nature of the components in the dataset has a significant effect on the


outcome. Although current benchmark datasets appear to offer a good range of similar components, there is a need for large collections of objects with complex surfaces. These would allow the behaviour of search and retrieval algorithms on this important class of components to be assessed; consequently the authors are working towards the creation of such a collection.

Acknowledgements

The work on the human similarity matrix owes a substantial debt to Tom Davenport, who spent many hours creating the cards and supervising the experiments.

Appendix. Drexel Benchmark dataset components not included in the study

File Name
assy 4.hh
base1-Part 1.hh
bracket 1.hh
Caddy06 4.hh
Caddy06 5.hh
cognit 1.hh
CompletePart 16.hh
CompletePart 63.hh
compressor 1.hh
crankcse 1.hh
good chamberor 1.hh
impeller 1.hh
linkage arm 42 1.hh
linkage arm 43 1.hh
linkage arm1 43 1.hh
main block 1.hh
s bracket 1.hh
sliding stop 1.hh
ToolingBlock1 1.hh
ToolingPlate 1.hh
torp s1 1.hh
WINKEL 1.hh

References

[1] Rea HJ, Corney JC, Clark DER, Taylor NK. Commercial and business issues in the e-sourcing and reuse of mechanical components. The International Journal of Advanced Manufacturing Technology 2006; (March):1–7. Also available at: http://www.springerlink.com/openurl.asp?genre=article&id=doi:10.1007/s00170-005-0068-z.
[2] Kullback S. Information theory and statistics. Dover; 1968.
[3] Graham SA, Kilbreath CS, Welder AN. Thirteen-month-olds rely on shared labels and shape similarity for inductive inferences. Journal of Child Development 2004;75:409.
[4] Tarr MJ, Bulthoff HH. Object recognition in man, monkey and machine. MIT Press; 1998.
[5] Roberson D, Davidoff J, Shapiro L. Squaring the circle: The cultural relativity of ‘good’ shape. Journal of Cognition and Culture 2002;2:29–51.
[6] Sloman S, Malt B, Shi M, Ennari S, Wang Y. Are bottles similar to one another? Sorting and naming by Chinese, Argentineans, and Americans. In: Procs SimCat 97 an interdisciplinary workshop on similarity and categorisation. Edinburgh: Department of Artificial Intelligence, University of Edinburgh; 1997.
[7] Ankerst M, Kastenmuller G, Kriegal H-P, Seidl T. 3D shape histograms for similarity search and classification in spatial databases. In: Guting RH, Papadias D, Lochovsky F, editors. SSD 99 - Advances in spatial databases. Hong Kong, Berlin, Heidelberg: Springer-Verlag; 1999. p. 207–26.
[8] Ip CY, Lapadat D, Sieger L, Regli WC. Using shape distributions to compare solid models. In: Solid modeling. Saarbrucken (Germany): ACM; 2002.
[9] Osada R, Funkhouser T, Chazelle B, Dobkin D. Matching 3D models with shape distributions. Shape Modelling International 2001.
[10] Lui Y, Pu J, Zha H, Liu W, Uehara Y. Thickness histogram and statistical harmonic representation for 3D model retrieval. In: 2nd international symposium on 3D data processing, visualization, and transmission. Thessaloniki (Greece): IEEE Computer Society; 2004.
[11] Kazhdan M, Chazelle B, Dobkin D, Funkhouser T, Rusinkiewicz S. A reflective symmetry descriptor for 3D models. Algorithmica 2003.
[12] Zhang D, Herbert M. Harmonic maps and their applications in surface matching. In: IEEE conference on computer vision and pattern recognition. 1999.
[13] Kazhdan M, Funkhouser T, Rusinkiewicz S. Rotation invariant spherical harmonic representation of 3D shape descriptors. In: Symposium on geometry processing. Aachen (Germany): Eurographics Association; 2003.
[14] Tangelder JWH, Veltkamp RC. Polyhedral model retrieval using weighted point sets. Utrecht (The Netherlands): Institute of Information and Computing Sciences; 2002.
[15] Pu J, Lui Y, Xin G, Zha H, Liu W, Uehara Y. 3D model retrieval based on 2D slice similarity measurements. In: 2nd international symposium on 3D data processing, visualization, and transmission. Thessaloniki (Greece): IEEE Computer Society; 2004.
[16] Ansary TF, Vandeborre J-P, Mahmoudi S, Daoudi M. A Bayesian framework for 3D models retrieval based on characteristic views. In: 2nd international symposium on 3D data processing, visualization, and transmission. Thessaloniki (Greece): IEEE Computer Society; 2004.
[17] Shilane P, Min P, Kazhdan M, Funkhouser T. The Princeton shape benchmark. In: Giannini F, Pasko A, editors. International conference on shape modeling and applications. Genova (Italy): IEEE Computer Society; 2004. p. 167–78.
[18] Iyer N, Jayanti S, Ramani K. An engineering shape benchmark for 3D models. In: ASME international design engineering technical conferences and computer and information in engineering conference; 2005. DECT2005-85612 available at: http://me98pc26.ecn.purdue.edu/papers%5CAn Engineering.pdf.
[19] Bespalov D, Ip CY, Regli WC, Shaffer J. Benchmarking CAD search techniques. In: Symposium on solid and physical modelling. Cambridge (MA, USA): ACM; 2005. p. 275–86. Available at http://doi.acm.org/10.1145/1060244.1060275.
[20] Cardone A, Gupta SK, Karnik M. A survey of shape similarity assessment algorithms for product design and manufacturing applications. Journal of Computing and Information Science in Engineering 2003;3:109–18.
[21] Bustos B, Keim D, Saupe D, Schreck T, Vranić D. An experimental comparison of feature-based 3D retrieval methods. In: 2nd international symposium on 3D data processing, visualization, and transmission. Thessaloniki (Greece): IEEE Computer Society; 2004.
[22] Iyer N, Jayanti S, Lou K, Kaylyanaraman Y, Ramani K. Three-dimensional shape searching: state-of-the-art review and future trends. Computer-Aided Design 2005;37:509–30.
[23] Tangelder J, Veltkamp R. A survey of content based 3D shape retrieval methods. In: Giannini F, Pasko A, editors. International conference on shape modeling and applications. Genova (Italy): IEEE Computer Society; 2004. p. 145–56.
[24] Bespalov D, Shokoudandeh A, Regli WC, Sun W. Local feature extraction using scale-space decomposition. In: ASME 2004 design engineering technical conferences and computers and information in engineering conference. Salt Lake City (Utah): ASME International; 2004.
[25] Davis J, Goadrich M. The relationship between precision-recall and ROC curves. Madison (WI, USA): University of Wisconsin, Madison Computer Science Department; 2006. Available at: http://www.cs.wisc.edu/~richm/articles/davisgoadrichpr.pdf.
[26] Osada R, Funkhouser T, Chazelle B, Dobkin D. Shape distributions. ACM Transactions on Graphics 2002;21.
[27] Kazhdan M. Spherical harmonics calculation, get sig. Available at: http://www.cs.jhu.edu/~misha/HarmonicSignatures/; 2004.
[28] Mills BI, Langbein FC, Marshall AD, Martin RR. Estimate of frequencies of geometric regularities for use in reverse engineering of simple mechanical components. Cardiff: Geometry and Vision Group, Dept. of Computer Science, Cardiff University; 2001. Available at http://www.langbein.org/files/BoRG/survey.pdf.
[29] Rea H, Corney J, Clark D, Taylor N. A surface partitioning spectrum (SPS) for retrieval and indexing of 3D CAD models. In: 2nd international symposium on 3D data processing, visualization, and transmission. Thessaloniki (Greece): IEEE Computer Society; 2004.
[30] Hamann B. Curvature approximation for triangulated surfaces. Computing Supplement 1993;8:139–53.
[31] Kobbelt B. Discrete fairing and variational subdivision for free-form surface design. Visual Computer 2000;16:142–58.
[32] Mangan A, Whitaker R. Partitioning 3D surface meshes using watershed segmentation. IEEE Transaction on Visualisation and Computer Graphics 1999;5.
[33] Regli WC. An overview of the NIST repository for design, process planning, and assembly. CAD 1997;29:895–905.
[34] Regli WC. National design repository benchmark. 2004. http://www.designrepository.org.
[35] Rumelhart DE, Hinton GE, Williams RJ. Learning representations by back-propagating errors. Nature 1988;323:533–6.
[36] Corney JC, Rea H, Sung R. ShapeSearch.net. Heriot-Watt University, Edinburgh; 2001. http://www.shapesearch.net/.
[37] Tversky A. Features of similarity. Psychological Review 1977;84:327–52.
[38] Mumford D. Mathematical Theories of shape: do they model perception? In: Geometric Methods in Computer Vision. SPIE, vol 1570. 1991.