A new perceptual organization approach to 3D measuring system based on the fuzzy integral



Image and Vision Computing 24 (2006) 381–393 www.elsevier.com/locate/imavis

A. Bigand a,b,*, L. Evrard a,b, J.P. Dubus a,b

a Littoral University, BP649, 62228 Calais Cedex, France
b Laboratoire ID3, Lille1 University, Bat.P3, 59655 Villeneuve d’Ascq Cedex, France

Received 27 July 2001; received in revised form 12 December 2005; accepted 12 December 2005

Abstract

A new algorithm for perceptual grouping using the fuzzy integral, primarily aimed at the analysis of static scenes (industrial images), is presented. Our purpose is to build the planar surfaces of three-dimensional (3D) polyhedric objects from labeled line segments using an active vision system (projection of laser planes on the object and 3D reconstruction using a CCD camera). Each line segment is first characterized by three geometric constraints, each of which is assigned a specific membership function. These constraints are used in geometric relations between image features (such as collinear and parallel relations) through the fuzzy integral in order to group the line segments accurately, since edge detection gives imperfect and incomplete information.
© 2006 Elsevier B.V. All rights reserved.

Keywords: Computer vision; Perceptual organization; Fuzzy integral

1. Introduction

3D measuring systems are now used in many industrial applications, such as object modeling, medical diagnosis, CAD/CAM and virtual reality systems. Although present-day range scanners handle complicated scenes, some specific applications require new approaches to this well-known problem. This paper presents a new method for the perceptual grouping of line segments leading to a 3D polyhedral reconstruction of a scene, devoted to large industrial objects. Two different ways of describing scenes of three-dimensional (3D) objects are known. The first, 3D scene description, uses region-based 3D reconstruction techniques [1] or the invariants of 3D structures to obtain reliable 3D primitives; it has been investigated by many authors and requires two or more perspective views and the application of projective geometry. The second, 2D scene description, is more classic but has to deal with large 3D depth uncertainties, so that it is difficult to detect points belonging to the same planar surface.

* Corresponding author. Address: Littoral University, BP649, 62228 Calais Cedex, France. E-mail address: [email protected] (A. Bigand).

0262-8856/$ - see front matter © 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.imavis.2005.12.003

Well adapted to 3D object recognition, the first method requires high computational complexity for high performances, in particular when taking noise into account [2]. In the second method, structural description for 2D objects in vision processing is a challenging area of image understanding and pattern recognition. However, these descriptions are difficult to extract from the low-level vision processing stage because of imprecision from various sources. This noise may be divided into two classes [3]. First, there are weak perturbations which have repercussions on the whole image (segmentation errors, lighting differences, etc.). The second category is made up of important noises such as occlusions, which spoil small parts of the image. We now propose to use tools, which make it possible to take uncertainty, inaccuracy, loss of information (small occlusions) and image variability into account. A brief review of the methods concerning 3D measuring systems (passive and active methods) is made in [4]. These authors show that the active measuring method is widely applied on industrial quantitative measurements, and they propose a new approach to increase precision, based on phase variations in the active projected light. Nevertheless, noise and computational complexity remain the main obstacles to the achievement of such a system. On the other hand, industrial applications often require high performance methods that are easy to implement and to modify (for a flexible production). Perceptual grouping based on meaningful geometric relations [5,6], and fuzzy operator [7,8] to combine these


relations, makes it possible to obtain reliable information for high-level vision processing. We therefore propose a new method of obtaining the planar surfaces of 3D objects with a high degree of accuracy in 2D scene description. The scene is illuminated with patterned light, and an effective decision theory tool, the fuzzy integral, is used to deal with depth uncertainty and to secure a high degree of accuracy and, above all, flexibility. Our device is composed of one CCD camera and a laser plant which is able to generate eleven parallel planes through an optical head. The calibration procedure and the extraction of the light pattern, which gets rid of the optical defects inherent to this vision system, have been previously described in [9,10]. The initial treatment can be summarized in four steps: the first step is the application of the laser signal on a polyhedric plant, the second is the acquisition of the scene in the dark, the third is the labeling of the lines and the fourth is the extraction of the light pattern. Thus, at the end of the initial treatment, we obtain labeled lines that are structured in labeled segments. This treatment can be interpreted as an arborescent image (Fig. 1). If a 2D segment belongs to a 3D surface, we need at least two segments and their attributes to completely characterize a 3D facet. Two methods are possible for this characterization:

• Grandjean [11] uses the 3D data from stereoscopic segments and from telemetric segments to formulate the coplanarity assumption.
• It is possible to match the 2D segments of two adjacent stripes, as well as to detect the edges between two 3D facets by image analysis. This method makes it possible to work with the raw sensor data and avoids processing the 3D data (the reconstructed 3D data are imprecise).

In this work, the second method is used. Indeed, it appears preferable to assess coplanarity from attributes of 2D segments rather than of 3D reconstructed segments, considering the disparity of the 3D space. A decision tool, based on the Sugeno (fuzzy) measures and the Choquet integral, is then used to provide a confidence measure on the matching of two 2D segments under the coplanarity hypothesis.

The second stage of our work is the matching of the extracted segments into homogeneous surfaces (perceptual grouping). The uncertainty of the 3D points is dealt with in two ways. The patterned light makes it possible to obtain labeled segments belonging to the same surface with a high degree of accuracy (this is one of the advantages of active vision), but there remains some uncertainty about their matching, for they are not exactly coplanar. For example, the parallel relation between two line segments will never be exactly obtained. So, for a more flexible and robust measurement, it was decided to adopt an approach based on fuzzy sets to detect geometric relations among line segments, as proposed in [7]. A membership function is thus assigned to each of the three appropriate constraints (parallelism, overlap ratio and distance) defining the geometrical relation for each line segment. Then it is necessary to make a qualitative decision (do the segments belong to the same surface or not?) under uncertainty in a finite setting. Bayesian methods may be used here, as the underlying distributions are known in this case. In general, probabilistic methods employing Bayesian theory do not need heuristic adaptations (like thresholds), but at the cost of a complex model [3]. The authors of [3] propose a perceptual organization approach based on Dempster–Shafer theory and obtain good results for 2D images compared to a Bayesian network method. In a similar way, we decided to choose the Choquet integral-based utility, a generalization of expected utility that is sum-decomposable for such acts in this numerical framework. This method gives good results (as with 3D scene description, it has an average variance of 5% for the length and 1% for the angles), and proves the interest of the fuzzy integral introduced in information fusion for image processing. In particular, a learning method that is well adapted to perceptual organization and to the changes in production that often occur in industrial tasks is proposed.
This paper is organized as follows: Section 2 sums up the concept of fuzzy sets and fuzzy operators (in particular the fuzzy integral) used in image processing. Section 3 presents perceptual grouping from image data based on fuzzy sets, and fuzzy based geometric relations using the Choquet integral. Section 4 describes the experimental results using our perceptual grouping system on real images. The industrial application is described in Section 5 and finally, conclusions are drawn in Section 6.

Fig. 1. Geometrical interpretation of the image.


2. Fuzzy sets and aggregation operators

2.1. Background

Fuzzy sets, introduced by Zadeh [12], are tools commonly used to deal with ambiguous or imprecise data. The main idea is to allocate to an element $x$ belonging to a physical universe a membership degree to a fuzzy set $F$. Zadeh then proposed to extend the Boolean algebra of crisp sets to fuzzy sets, based on the use of the min and max operators for intersection and union, respectively. These original definitions have been completed by the introduction of triangular norms and conorms (see [13] for instance), coming from probabilistic metric spaces. The use of the implication operators of fuzzy logic together with triangular norms (for modeling conjunction) and conorms (for modeling disjunction) has been very extensive for the last few years in fuzzy expert systems and control systems. But properties such as idempotency and scale invariance may also be of interest in the numerous multicriteria decision-making problems that are encountered in image processing. Grabisch [14] has demonstrated that the different fuzzy aggregation operators constitute a vast family that can be generalized by fuzzy integrals and may be successfully used in classification. So one particular approach to perceptual grouping may be the fuzzy pattern matching methodology we develop in Section 2.2. Before that, the main properties of the fuzzy operators we can use in image processing remain to be defined.

2.2. Fuzzy operators

The problem with the aggregation of numerical valuations is to find an operator $H$ such that the global valuation of an object $o$ is given by $u(o) = H(u_1(o), \ldots, u_i(o))$, where $u_1(o), \ldots, u_i(o)$ represent the scores (or mono-dimensional utility functions) according to the criteria 1 to $i$, respectively. In general, the decision maker needs some properties such as idempotency, scale invariance, the possibility to assign weights to the criteria, etc., in order to define the utility function $u(o)$.
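As a small illustration of the idempotency property mentioned above, the following sketch evaluates a few candidate operators $H$ on a vector of partial scores and checks whether $H(c, \ldots, c) = c$. The function names and the sample scores are illustrative assumptions, not taken from the paper; the product is used here simply as one well-known example of a non-idempotent t-norm.

```python
import math

def h_min(scores):
    """Minimum operator (a t-norm)."""
    return min(scores)

def h_product(scores):
    """Product operator (another t-norm)."""
    return math.prod(scores)

def h_max(scores):
    """Maximum operator (a t-conorm)."""
    return max(scores)

def h_mean(scores):
    """Arithmetic mean (an averaging operator)."""
    return sum(scores) / len(scores)

def is_idempotent_at(H, c, n=3, tol=1e-12):
    """Check H(c, ..., c) == c for one value c and n criteria."""
    return abs(H([c] * n) - c) <= tol

# Hypothetical partial scores u_1(o), u_2(o), u_3(o) of one object o.
scores = [0.9, 0.4, 0.7]
for H in (h_min, h_product, h_mean, h_max):
    print(H.__name__, round(H(scores), 3), is_idempotent_at(H, 0.5))
```

The check is made at a single value of $c$, so it can only refute idempotency, not prove it; it is meant as a quick sanity test rather than a characterization of the operators.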
The different families of operators used in fuzzy logic are the t-norms (and conorms), the averaging operators and the OWA operators, which have been studied in [15]. In image processing, the t-norms (and conorms) are not very often used because t-conorms are too tolerant and t-norms are not idempotent (except for the minimum operator). The averaging operators are characterized by a behavior situated between the t-norms and the t-conorms, and by the properties of idempotency, commutativity and monotonicity; they are therefore well adapted to the aggregation of criteria. In perceptual grouping, it is necessary to detect meaningful geometric relations from image data (idempotency is important in this case), and since the exact detection of geometric relations between segments is impossible, computing a reasonable likelihood for a relation is desirable. So it is important to take into account the last family of fuzzy operators, the ordered weighted averaging (OWA) operators, which make it possible to assign weights to criteria in order to take their importance into account, using ordered values (see [16]). These operators may be useful in such an application, but the fuzzy integral was preferred to them as an aggregation operator. It is more general than the OWA operators (as is shown in [17]), as we shall presently develop.

2.3. The Choquet integral as an aggregation operator

The Choquet integral is a multiattribute utility function that makes it possible to quantify the evaluation and decision-making process in decision problems. It is thus a way to aggregate information from multiple sources, and it has been introduced in this sense by Sugeno [18], Denneberg [19] and Grabisch [14,17]. Some applications in image processing have been proposed in the last few years; see [20–22] for examples. The fuzzy integral is a non-linear functional operator, which makes it possible to calculate integrals relative to non-additive measures (most of them are the fuzzy measures introduced by Sugeno [18,22]). The fuzzy integrals compute a kind of distorted ‘average’ of the different inputs, so that they can be used like aggregation operators, but they differ from these operators in that both objective information (provided by the sources) and the expected worth of subsets of these sources (subjective information) are considered in the combination process. The two most common definitions of the fuzzy integral are the original definition of Sugeno and the Choquet integral. The Sugeno integral is based on non-linear operators (min and max) and is suitable for ordinal aggregation, whereas the Choquet integral is based on the usual linear operators and is suitable for cardinal aggregation. That is why we have chosen to use the Choquet integral in the present application, where the attributes of features define a discrete space $X$ and provide numerical values. These integrals operate on ordered values (like the OWA operators, which they generalize). The fuzzy measures, and then the Choquet integral, will now be defined.

2.3.1. Fuzzy measures

Let us denote by $X = \{x_1, \ldots, x_n\}$ the set of criteria (or attributes), and by $\mathcal{P}(X)$ the power set of $X$, i.e. the set of all subsets of $X$. A fuzzy measure on $X$ is a set function $\mu : \mathcal{P}(X) \to [0, 1]$ satisfying the following axioms:

• Boundary conditions. $\mu(X) = 1$ and $\mu(\emptyset) = 0$.
• Monotonicity. If $A \subset B$, then $\mu(A) \le \mu(B)$ for $A$ and $B$ belonging to $\mathcal{P}(X)$.
• Continuity. If $A_1 \subset \cdots \subset A_n \subset \cdots$, then $\lim \mu(A_i) = \mu\left(\bigcup A_i\right)$.

A fuzzy measure $\mu$ may be non-additive, in order to have a flexible representation of complex interaction phenomena between criteria (attributes). Depending on the nature of the fuzzy measure $\mu$, the measure of the union of disjoint sets cannot be directly calculated from the measure of each set. So Sugeno introduced the so-called $\lambda$-measure satisfying the


additional property:

$$\forall A, B \subset X \text{ and } A \cap B = \emptyset: \quad \mu(A \cup B) = \mu(A) + \mu(B) + \lambda\,\mu(A)\,\mu(B) \tag{1}$$

for some $\lambda > -1$. For $\lambda = 0$, we recover the additivity axiom of probability measures. Let us define $a_i = \mu(\{x_i\})$, which are called the fuzzy densities of the measure. The fuzzy density defines the importance of an individual information source. If we know the fuzzy densities, the value of $\lambda$ can be found from

$$\lambda + 1 = \prod_{i=1}^{n} \left[1 + \lambda a_i\right] \tag{2}$$
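Eq. (2) always has the trivial root $\lambda = 0$; the root of interest is the other one in $(-1, +\infty)$. The following is a minimal sketch of one way to find it, by bisection, assuming all densities lie strictly between 0 and 1. The function name `sugeno_lambda`, the tolerances and the sample densities are illustrative assumptions, not part of the paper.

```python
def sugeno_lambda(densities, additive_tol=1e-6, tol=1e-12, max_iter=200):
    """Non-zero root lambda > -1 of Eq. (2); returns 0.0 in the additive case.

    Assumes 0 < a_i < 1 for every density a_i.
    """
    total = sum(densities)
    if abs(total - 1.0) < additive_tol:
        return 0.0  # densities already sum to 1: additive measure

    def g(lam):
        # g(lam) = prod(1 + lam * a_i) - (lam + 1); Eq. (2) holds when g = 0.
        prod = 1.0
        for a in densities:
            prod *= 1.0 + lam * a
        return prod - (lam + 1.0)

    if total > 1.0:
        lo, hi = -1.0 + 1e-9, -1e-7      # the root lies in (-1, 0)
    else:
        lo, hi = 1e-7, 1.0               # the root lies in (0, +inf)
        for _ in range(60):              # grow the bracket until g changes sign
            if g(hi) > 0.0:
                break
            hi *= 2.0
        else:
            raise ValueError("no positive root (need two non-zero densities)")

    g_lo = g(lo)
    for _ in range(max_iter):            # plain bisection on the bracket
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_lo < 0.0) == (g_mid < 0.0):
            lo, g_lo = mid, g_mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Hypothetical densities whose sum exceeds 1, so lambda is negative.
lam = sugeno_lambda([0.3, 0.4, 0.5])
print(lam)
```

When the densities sum to more than 1 the root is negative, and when they sum to less than 1 it is positive, which is why the bracket is chosen from the sign of the sum minus 1.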

2.3.2. The Choquet integral

The Choquet integral (c) may be defined in the following form:

$$(c)\!\int f \, d\mu = \int_0^1 \mu(A_\alpha)\, d\alpha \tag{3}$$

where $A_\alpha = \{x \mid f(x) \ge \alpha,\ x \in X\}$, and $f$ is a measurable function on $(X, \mathcal{P}(X))$ which associates a value $f_i$ to each attribute $x_i$, this value being named the marginal valuation (or partial score) of the attributes of features. $\mu$ is a fuzzy measure as defined previously. Within a discrete frame, the Choquet integral may be defined as follows:

$$(c)\!\int f \, d\mu = \sum_{i=1}^{n} \mu(A_i)\left[f(\bar{x}_i) - f(\bar{x}_{i-1})\right] \tag{4}$$

or

$$(c)\!\int f \, d\mu = \sum_{i=1}^{n} f(\bar{x}_i)\left[\mu(A_i) - \mu(A_{i+1})\right] \tag{5}$$

with

$$\mu(A_i) = \mu\left(\{\bar{x}_i, \bar{x}_{i+1}, \ldots, \bar{x}_n\}\right) \quad \text{and} \quad f(\bar{x}_0) = 0$$

In these equations, the $\bar{x}_j$ represent a reordering of the $x_i$ for the marginal valuations $f_j = f(\bar{x}_j)$, relative to each attribute, such that:

$$f(\bar{x}_1) \le f(\bar{x}_2) \le \cdots \le f(\bar{x}_n)$$

For an application with a vector of attributes of length $n$, there exist $2^n$ coefficients to define the measure $\mu$. This is one of the difficulties in applying the Choquet integral: these coefficients must be properly identified, and the computational complexity increases very rapidly. The complexity (and the interest) of the fuzzy integral proceeds from the measures $\mu(A_i)$. In fact, it is possible to define, for instance, $\mu(\{x_i, x_j\})$ with $i \ne j$, and so to express the interaction between the attributes $x_i$ and $x_j$, which is not possible with the arithmetic averaging operator, for example.
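The discrete form of Eq. (4) can be sketched as follows for a Sugeno $\lambda$-measure. This is an illustrative sketch only: the function names are hypothetical, $\lambda$ is assumed to be supplied by the caller (for example as the root of Eq. (2)), and the measure of a subset is obtained by applying Eq. (1) repeatedly, which gives $\mu(A) = \frac{1}{\lambda}\left[\prod_{x_i \in A}(1 + \lambda a_i) - 1\right]$ for $\lambda \ne 0$.

```python
def lambda_measure(subset_densities, lam):
    """mu(A) of a Sugeno lambda-measure, from the densities of the elements of A."""
    if not subset_densities:
        return 0.0                      # mu(empty set) = 0
    if abs(lam) < 1e-12:
        return sum(subset_densities)    # additive case, lambda = 0
    prod = 1.0
    for a in subset_densities:
        prod *= 1.0 + lam * a
    return (prod - 1.0) / lam


def choquet(scores, densities, lam):
    """Discrete Choquet integral of Eq. (4).

    scores[i] is the marginal valuation f(x_i), densities[i] the density a_i.
    """
    # Reorder the attributes so that the scores are increasing.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    total, previous = 0.0, 0.0          # previous plays the role of f(x_0) = 0
    for rank, idx in enumerate(order):
        # A_i contains the attribute of this rank and all higher-ranked ones.
        tail = [densities[i] for i in order[rank:]]
        total += lambda_measure(tail, lam) * (scores[idx] - previous)
        previous = scores[idx]
    return total


# Hypothetical example: with lambda = 0 and densities summing to 1,
# the Choquet integral reduces to a weighted arithmetic mean.
print(choquet([0.9, 0.4, 0.7], [0.2, 0.3, 0.5], 0.0))
```

With a non-zero $\lambda$ the same scores are weighted according to which attributes rank highest, which is how the interaction between attributes enters the aggregation.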

To calculate the synthetic valuation of the Choquet integral, three steps can be defined:

• The first step is the choice of the attributes $X = \{x_1, x_2, \ldots, x_n\}$ (with $n \ge 2$). In our application, they will be chosen in relation to the three attributes of the features (segments) for the perceptual grouping operation.
• The second step defines the Sugeno $\lambda$-measure $\mu$, which determines the degree of importance, on the set $\mathcal{P}(X)$, given to the different attributes and to their interactions.
• The third step considers the marginal valuations $f(x_j)$ obtained for each attribute $x_j$. In the present application, the values $f(x_j)$ are the three similarity functions $f_1$, $f_2$ and $f_3$ defined in Section 3.2.

2.3.3. Identification of the fuzzy measure

For an application with a vector of attributes of length $n$, there are $2^n$ coefficients to determine (in fact exactly $2^n - 2$, because the measures of the empty set and of the universal set are, by definition, 0 and 1, respectively). The major difficulty encountered in applications of the Choquet integral is the assessment of these coefficients to define the fuzzy measure. As a consequence of the non-additivity property of this measure, the exact assessment of the fuzzy measure requires many more subsets than in the additive case. Thus, fuzzy measures have been identified by recursive methods [23] or mathematical programming methods [18]. Tanaka and Sugeno [24] have proposed a combined approach of quadratic programming with a relaxation method. Grabisch and Sugeno [25] use an automatic learning procedure minimizing a quadratic-error-like criterion under the constraints induced by the monotonicity property of the fuzzy measure. All these methods demand a large amount of data, obtained from decision makers or experts. As no fuzzy measure $\mu$ on $(X, \mathcal{P}(X))$ can solve this system exactly, it is necessary to find an optimal approximation of Eq. (4). The term ‘inverse problem of the synthetic valuation’ is employed by some authors [26]. The usual technique consists in minimizing the quadratic error

$$e = \frac{1}{2}\sum_{k=1}^{m} e_k^{2} = \frac{1}{2}\sum_{k=1}^{m}\left(E^{k} - \hat{E}^{k}\right)^{2} \tag{6}$$

where $e_k$ is the error on sample $k$, $E^k$ is the global valuation (of the training input vector $k$, $f^k(x_1), f^k(x_2), f^k(x_3)$, of the knowledge base) given by the decision maker (with $m$ samples), and $\hat{E}^k$ is defined by the Choquet integral. The Sugeno $\lambda$-measure $\mu$ on $X$ is characterized by $n$ real values $a_j = \mu(\{x_j\}) \in [0, 1]$. Wang and Wang [26] used the following form for the quadratic error:

$$e' = \sqrt{\frac{1}{m}\sum_{k=1}^{m}\left(E^{k} - \hat{E}^{k}\right)^{2}} \tag{7}$$

The non-linearity of this expression does not make it possible to find the relation $\partial e / \partial a_j$, and Wang and Wang used a neural net to calculate the Sugeno measures.


In the special case of three attributes, we suggest using an optimization method (the Gauss–Newton algorithm) associated with a knowledge base to solve this problem, because we are able to calculate the derivatives of the quadratic error in that case. Using the Sugeno measure, the minimization of the quadratic error previously defined is given by $\partial e / \partial a_j \to 0$, where the derivative of the error relative to $a_j$ is given by:

$$\frac{\partial e}{\partial a_j} = \sum_{k=1}^{m}\left(E^{k} - \hat{E}^{k}\right)\frac{\partial \hat{E}^{k}}{\partial a_j} \tag{8}$$

The difficulty is then to express the derivative of a synthetic valuation relative to the densities $a_j$. In the special case of three attributes $x_1$, $x_2$, $x_3$, we can rewrite the Choquet integral expression using the general equation (4):

$$\hat{E}^{k} = \left[f^{k}(\bar{x}_1) - f^{k}(\bar{x}_0)\right]\mu(\{x_1, x_2, x_3\}) + \left[f^{k}(\bar{x}_2) - f^{k}(\bar{x}_1)\right]\mu(\{x_2, x_3\}) + \left[f^{k}(\bar{x}_3) - f^{k}(\bar{x}_2)\right]\mu(\{x_3\}) \tag{9}$$

To simplify this expression, it is assumed that the attributes $x_i$ correspond to the ordered attributes $\bar{x}_i$, and we do not consider the exponent ‘$k$’, which does not affect the derivatives. The values $\mu(A_i)$, defined in Section 2.3.2, are given using the densities $a_i$ and $\lambda$:

$$\begin{cases} \mu(\{x_1, x_2, x_3\}) = \dfrac{1}{\lambda}\left[(1 + \lambda a_1)(1 + \lambda a_2)(1 + \lambda a_3) - 1\right] \\[2ex] \mu(\{x_2, x_3\}) = \dfrac{1}{\lambda}\left[(1 + \lambda a_2)(1 + \lambda a_3) - 1\right] \\[2ex] \mu(\{x_3\}) = a_3 \end{cases} \tag{10}$$

The value of $\lambda$, depending on the densities $a_i$, is given using Eq. (2). The combination of Eqs. (2), (9) and (10) makes it possible to express $\partial \hat{E} / \partial a_j$ for $j = 1, 2, 3$ (with the functions of the densities $K_1 = (1 + \lambda a_1)$, $K_2 = (1 + \lambda a_2)$, $K_3 = (1 + \lambda a_3)$, $K_4 = (a_1 a_3 + a_2 a_3 + a_1 a_2 + 2 a_1 a_2 a_3 \lambda)$ to simplify the following expressions):

$$\begin{cases} \dfrac{\partial \hat{E}}{\partial a_1} = \left[f(x_1) - f(x_0)\right]\left[K_2 K_3 + K_4 \dfrac{\partial \lambda}{\partial a_1}\right] \\[2ex] \dfrac{\partial \hat{E}}{\partial a_2} = \left[f(x_1) - f(x_0)\right]\left[K_1 K_3 + K_4 \dfrac{\partial \lambda}{\partial a_2}\right] + \left[f(x_2) - f(x_1)\right]\left[K_3 + (a_2 a_3)\dfrac{\partial \lambda}{\partial a_2}\right] \\[2ex] \dfrac{\partial \hat{E}}{\partial a_3} = \left[f(x_1) - f(x_0)\right]\left[K_1 K_2 + K_4 \dfrac{\partial \lambda}{\partial a_3}\right] + \left[f(x_2) - f(x_1)\right]\left[K_2 + (a_1 a_2)\dfrac{\partial \lambda}{\partial a_3}\right] + \left[f(x_3) - f(x_2)\right] \end{cases} \tag{11}$$

The problem is that $\lambda$ is in fact a function of the fuzzy densities. Hence, the partial derivative of $\lambda$ with respect to each density should be included in the derivation, which makes the derivatives much more complicated. However, the error will not affect the training results since we use a numerical differentiation method in the partial derivative implementation, so that the partial derivatives with respect to the densities can be negative ($\lambda$ is updated at each incrementation of the applied method, as shown in [27]). The analysis of Eq. (11) makes it possible to express a generic formulation for the derivative $\partial \hat{E} / \partial a_j$:

$$\frac{\partial \hat{E}}{\partial a_j} = \begin{cases} \displaystyle\sum_{i=1}^{j}\left[f(x_i) - f(x_{i-1})\right]\left(\prod_{\substack{l=i \\ l \ne j}}^{n}(1 + \lambda a_l) + \prod_{\substack{u=1 \\ u \ne j}}^{n}(a_i a_u) + \prod_{\substack{u=1,\,v=1 \\ u \ne j,\,v \ne u}}^{n}(a_u a_v + 2 a_i a_u a_v \lambda)\right), & \text{for } j < n \\[4ex] \displaystyle\sum_{i=1}^{j-1}\left[f(x_i) - f(x_{i-1})\right]\left(\prod_{\substack{l=i \\ l \ne j}}^{n}(1 + \lambda a_l) + \prod_{\substack{u=1 \\ u \ne j}}^{n}(a_i a_u) + \prod_{\substack{u=1,\,v=1 \\ u \ne j,\,v \ne u}}^{n}(a_u a_v + 2 a_i a_u a_v \lambda)\right) + \left[f(x_n) - f(x_{n-1})\right], & \text{for } j = n \end{cases} \tag{12}$$

In the special case where $\lambda = 0$, a similar analysis gives:

$$\frac{\partial \hat{E}}{\partial a_j} = \sum_{i=1}^{j}\left[f(x_i) - f(x_{i-1})\right] \tag{13}$$

The resolution of Eq. (8), depending on Eqs. (12) and (13), is non-linear and needs a non-linear optimization method to be solved.
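The identification procedure of this section can be sketched numerically as follows. This is not the Gauss–Newton scheme used in the paper: as a simpler stand-in, the sketch applies gradient descent with central finite differences and step halving to the quadratic error of Eq. (6), recomputing $\lambda$ from Eq. (2) at every evaluation. All function names, the bounds on the densities, the step sizes and the synthetic knowledge base are illustrative assumptions. Note also that differentiating Eq. (6) literally gives $\partial e / \partial a_j = -\sum_k (E^k - \hat{E}^k)\,\partial \hat{E}^k / \partial a_j$; the finite-difference gradient below carries that sign automatically.

```python
import random

def solve_lambda(a):
    """Non-zero root of Eq. (2) by bisection; 0.0 when the densities sum to 1."""
    s = sum(a)
    if abs(s - 1.0) < 1e-6:
        return 0.0
    def g(lam):
        p = 1.0
        for ai in a:
            p *= 1.0 + lam * ai
        return p - 1.0 - lam
    if s > 1.0:
        lo, hi = -1.0 + 1e-9, -1e-7       # root in (-1, 0)
    else:
        lo, hi = 1e-7, 1.0                # root in (0, +inf)
        while g(hi) <= 0.0 and hi < 1e12:
            hi *= 2.0
    g_lo = g(lo)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if (g(mid) < 0.0) == (g_lo < 0.0):
            lo = mid
        else:
            hi = mid
        if hi - lo < 1e-13:
            break
    return 0.5 * (lo + hi)

def choquet(f, a):
    """Discrete Choquet integral, Eq. (4), for the lambda-measure with densities a."""
    lam = solve_lambda(a)
    order = sorted(range(len(f)), key=lambda i: f[i])
    total, prev = 0.0, 0.0
    for rank, idx in enumerate(order):
        tail = [a[i] for i in order[rank:]]
        if lam == 0.0:
            mu = sum(tail)
        else:
            p = 1.0
            for ai in tail:
                p *= 1.0 + lam * ai
            mu = (p - 1.0) / lam
        total += mu * (f[idx] - prev)
        prev = f[idx]
    return total

def quadratic_error(a, samples):
    """Eq. (6): e = 1/2 * sum over k of (E^k - Ehat^k)^2."""
    return 0.5 * sum((E - choquet(f, a)) ** 2 for f, E in samples)

def fit_densities(samples, a0, steps=100, h=1e-5):
    """Finite-difference gradient descent with step halving (not Gauss-Newton)."""
    a, err, lr = list(a0), quadratic_error(a0, samples), 0.5
    for _ in range(steps):
        grad = []
        for j in range(len(a)):
            up, dn = a[:], a[:]
            up[j] += h
            dn[j] -= h
            grad.append((quadratic_error(up, samples)
                         - quadratic_error(dn, samples)) / (2 * h))
        while lr > 1e-10:
            # Keep every density inside (0, 1); the bounds are an assumption.
            trial = [min(0.99, max(0.01, aj - lr * gj)) for aj, gj in zip(a, grad)]
            trial_err = quadratic_error(trial, samples)
            if trial_err < err:
                a, err = trial, trial_err
                break
            lr *= 0.5
        else:
            break                          # no improving step was found
    return a, err

def make_samples(true_a, m=12, seed=0):
    """Synthetic knowledge base: m random score vectors labelled with a known measure."""
    rng = random.Random(seed)
    fs = [[rng.random() for _ in true_a] for _ in range(m)]
    return [(f, choquet(f, true_a)) for f in fs]

samples = make_samples([0.2, 0.5, 0.4])    # hypothetical "true" densities
a0 = [0.3, 0.3, 0.3]
a_fit, e_fit = fit_densities(samples, a0)
print("initial error:", quadratic_error(a0, samples))
print("fitted densities:", [round(x, 3) for x in a_fit], "final error:", e_fit)
```

Because a step is accepted only when it lowers the error, the final error can never exceed the initial one; whether the recovered densities match the ones used to generate the samples depends on the data and is not guaranteed by this sketch.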

2.3.4. Conclusion Therefore, the Choquet integral may be a good aggregation operator based on the uncertainty approach of the fuzzy measure, which is important in image processing where noise and computational complexity are often major obstacles to the achievement of reliable systems. Recent results [28,29] show that significant improvement over peak signal-to-noise ratio is achieved when applying this useful tool, without high computational complexity. These two points are very important for the industrial application under study. The main downside of the Choquet integral is the identification of the fuzzy measure. That is why we have defined an easy method to identify the densities, which are often to be seen in


image processing, for the special case $n = 3$ (it is also true for $n = 4$, if a curvature attribute is added). The next point will deal with the application of this method to the special case of perceptual organization, and with the results we obtain in the industrial application.

3. Perceptual organization

3.1. Structural level

The goal of Perceptual Organization in computer vision is to organize image primitives into higher level primitives, thus explicitly representing the structure contained in the image data. This aims at reducing ambiguity in the image data or in the initial segmentation and thus at increasing the robustness and efficiency of subsequent processing steps. The ideas of perceptual grouping for computer vision have their roots in the well-known work of the Gestalt psychologists at the beginning of the twentieth century, who described, among other things, the ability of the human visual system to organize parts of the retinal stimulus into ‘Gestalten’, i.e. organized structures. The importance of perceptual grouping for computer vision was recognized in the mid 1980s by Lowe [5], Kanade [30], Sarkar and Boyer [31] and others (CMU project). We are working in this paper with a contour-based approach (the image is initially segmented and approximated by straight line segments). These segments are then used to define a hierarchy of grouping hypotheses with growing complexity using the Gestalt laws of collinearity, parallelism and symmetry. Boldt et al. [32] and Ade [33] have worked on the matching of linear segments, using similarity and collinearity for the first two authors, and symmetry for the third, and we use similar techniques in our application. Various techniques may then be used to obtain the perceptual organization (in general these techniques are used in stereoscopic matching too). Group theory uses local invariance to calculate the Euclidean similarity. Rule bases were used by Ade [33]. This technique was also used by Jain and Hoffmann [34] and associated with the evidence concept. Graph theory is also well adapted to perceptual organization, using criteria like proximity, continuity, etc. Probabilistic methods, like relaxation, may be used too. Among these techniques, rule bases are well adapted to our work, which is based on the primitive level (group theory is better adapted to 3D projection; graph theory and probabilistic methods are well adapted to higher levels of organization). During grouping, uncertainties are generated because of the incomplete data. To handle the uncertainties, perceptual grouping should provide a significance measure that determines the extent to which the interpretation is likely at all: the fuzzy integral has been tested here to aggregate the importance scores we obtain.

3.2. Perceptual organization properties

We have chosen to match linear segments (primitive level) to an identical 3D surface with a sufficient degree of confidence. The meaningful relations between line segments defined here are collinearity, parallelism and symmetry. As proposed in [7], we use two constraints to detect the collinearity between two line segments (the orientation difference $\Theta_{ij}$ between the two line segments and the perpendicular distance dist from the longer line segment to the shorter one), and two constraints to detect the parallelism between two line segments (the orientation difference $\Theta_{ij}$ and the perpendicular distance dist from the longer line segment to the shorter one, here used to detect a non-zero distance); the symmetry relation between two line segments is defined as the ratio of the overlapped length to the longest line’s length (overlap ratio). Fig. 2 illustrates the complementarity between the parallelism relation and the symmetry relation on a simple example where three line segments are parallel (in 2D) but are not collinear in the same plane (in 3D). It also explains the importance of the latter relation (overlap ratio) in a perceptual grouping process, in order to obtain the coplanarity in 3D from 2D attributes (if the value of this attribute is zero, the segments cannot belong to the same surface). Kang and Walker [7] then compute the collinearity relation using the fuzzy set intersection between the constraints defining this relation, and the parallelism relation in the same way, and define a measure of significance (for collinear grouping) using other operators on these relations. In fact, the collinearity relation and the parallelism relation are computed using the same constraints, so we decided to compute only a global (instead of a hierarchical) measure of significance (for the perceptual grouping) using the Choquet integral, which takes into account the interactions that exist between the three

Fig. 2. Overlap ratio attribute.


constraints we defined previously. The perceptual grouping system is thus simplified. A global approach is better adapted to perceptual grouping (see, for example, the rule base introduced by Jain and Hoffmann [34]). The luminous pattern (coming from the laser) generates parallel straight lines. If these straight lines are parallel (in 3D), they are also parallel in the camera frame, provided that only the central part of the projected light frame is considered. We now have to define the three chosen attributes characterizing these 2D line segments. Assessing the degree of collinearity of two line segments is difficult, because of the numerous uncertainties attached to each line segment. So the vagueness about each attribute (orientation, distance, overlap (recovery) area) will be represented by membership values ($f_1$, $f_2$, $f_3$) assigned to these attributes. Recall that the illuminated image is represented by a set of stripes $R$:

$$R = \left\{R^{(1)}, R^{(2)}, \ldots, R^{(N)}\right\}$$

where $N$ is the number of stripes, and each stripe $R^{(k)}$ is composed of $M_k$ linear segments $S_j^{(k)}$:

$$R^{(k)} = \left\{S_1^{(k)}, S_2^{(k)}, \ldots, S_{M_k}^{(k)}\right\} = \left\{\left(d_1^{(k)}, f_1^{(k)}\right), \left(d_2^{(k)}, f_2^{(k)}\right), \ldots, \left(d_{M_k}^{(k)}, f_{M_k}^{(k)}\right)\right\}$$

where

$$S_j^{(k)} = \left(d_j^{(k)}, f_j^{(k)}\right) \quad \text{and} \quad \begin{cases} d_j^{(k)} = \left(u_{d_j}^{(k)}, v_{d_j}^{(k)}\right) \\[1ex] f_j^{(k)} = \left(u_{f_j}^{(k)}, v_{f_j}^{(k)}\right) \end{cases}$$

The values $d_j^{(k)}$ and $f_j^{(k)}$ define the beginning and the end of the segment $j$ of the stripe $k$, and are composed of the image coordinates $u$ and $v$ (varying between 0 and 512 pixels in our case). The initial image treatment is the cause of multiple imprecisions, ranging from the observed phenomena to the algorithm artifacts. So the objective evidence supplied by the three attributes defined for this application is evaluated according to a fuzzy model. To each physical value of the attributes is adjoined a membership degree to a fuzzy set, represented by various graphic forms. This treatment makes it possible to keep the initial imprecision (coming from low-level image processing), and to improve the quality of the aggregation process. To implement the Choquet integral, we need to calculate a marginal valuation $f_i$ for each attribute (or constraint): the orientation difference $\Theta_{ij}$ between two line segments, the perpendicular distance between the two line segments, and the overlap (recovery) area. The first function we define is $f_1$, which gives a measure of similarity on the orientation of the segment $i$ of the stripe $k$ ($S_i^{(k)}$) and the segment $j$ of the stripe $k+1$ ($S_j^{(k+1)}$). The membership function $f_1$ is assigned to the difference in orientation using the following relation (modulo $\pi$):

387

 j f1 Z jcoslC1 QðkÞ ij

(14)

with l = 0, 1, 2, ... The positive integer coefficient l makes the function $f_1$ more selective (we have used l = 2). If the segments are parallel, the similarity measure tends to 1; otherwise it tends to 0. Since a segment is defined by the image coordinates of its extremities, $\Theta_{ij}^{(k)}$ is given by the following expression:

$$\Theta_{ij}^{(k)} = \arctan\left(\frac{u_{d_j}^{(k+1)} - u_{f_j}^{(k+1)}}{v_{d_j}^{(k+1)} - v_{f_j}^{(k+1)}}\right) - \arctan\left(\frac{u_{d_i}^{(k)} - u_{f_i}^{(k)}}{v_{d_i}^{(k)} - v_{f_i}^{(k)}}\right) \qquad (15)$$

The second membership function to define is $f_2$, which evaluates the similarity measure (in [0, 1]) relative to the overlap ratio. This calculation is based on the orthogonal projection of each segment onto their bisector and defines three classes of overlap (recovery): partial recovery, complete recovery and separation. We have to distinguish complete recovery, which indicates that the segments belong to the same surface, from separation. Complete recovery implies the value 1 for the function $f_2$, and separation implies the value 0. The uncertainty concerns partial recovery, so we have chosen for the membership function $f_2$ the classical S(g) function defined by Zadeh [12] in fuzzy logic. The function S(g) is defined with b = (a + c)/2 and w = c − a (b is the cross-over point and w the support of the function S(g)). The parameter g is the ratio of the overlapped length of the line segment to the length of the longest line of the processed image (a = 0 (separation) and c = 1 (total recovery)). The S function gives the value of the overlap ratio relative to one segment. The membership function $f_2$ has to calculate this value relative to both the right and left segments $S_i^{(k)}$ and $S_j^{(k+1)}$. To obtain good accuracy, the two segments have to be close, so that the function $f_2$ may be written as

$$f_2 = \frac{1}{2}\left[ S\left(\frac{L_{ij}}{L_i^{(k)}}\right) + S\left(\frac{L_{ij}}{L_j^{(k+1)}}\right) \right] \qquad (16)$$

where $L_{ij}$ is the length of the overlap between the segments $S_i^{(k)}$ and $S_j^{(k+1)}$, $L_i^{(k)}$ is the length of the segment $S_i^{(k)}$ and $L_j^{(k+1)}$ is the length of the segment $S_j^{(k+1)}$. These lengths are defined by:

$$\begin{cases} L_i^{(k)} = \sqrt{\left(u_{d_i}^{(k)} - u_{f_i}^{(k)}\right)^2 + \left(v_{d_i}^{(k)} - v_{f_i}^{(k)}\right)^2} \\[6pt] L_j^{(k+1)} = \sqrt{\left(u_{d_j}^{(k+1)} - u_{f_j}^{(k+1)}\right)^2 + \left(v_{d_j}^{(k+1)} - v_{f_j}^{(k+1)}\right)^2} \end{cases}$$
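As an illustration, the orientation and overlap memberships of Eqs. (14) to (16) can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, each segment is assumed to be given as a pair of (u, v) endpoints, `atan2` is used in place of the arctangent of a quotient to avoid division by zero (this changes the angle only by a multiple of π, which the absolute value in Eq. (14) ignores), the S function is the standard piecewise-quadratic form with cross-over point b = (a + c)/2, and the overlap length $L_{ij}$ is assumed to be computed elsewhere.

```python
import math

def orientation_difference(seg_i, seg_j):
    """Difference of orientation between two segments (cf. Eq. (15)).

    Each segment is ((u_d, v_d), (u_f, v_f)): beginning and end in pixels.
    """
    def angle(seg):
        (u_d, v_d), (u_f, v_f) = seg
        # Assumption: atan2(du, dv) stands in for arctan(du / dv).
        return math.atan2(u_d - u_f, v_d - v_f)
    return angle(seg_j) - angle(seg_i)

def f1(theta, l=2):
    """Orientation similarity |cos(theta)|^(l+1), cf. Eq. (14)."""
    return abs(math.cos(theta)) ** (l + 1)

def zadeh_s(g, a=0.0, c=1.0):
    """Standard S function with cross-over point b = (a + c) / 2."""
    b = (a + c) / 2.0
    if g <= a:
        return 0.0
    if g >= c:
        return 1.0
    if g <= b:
        return 2.0 * ((g - a) / (c - a)) ** 2
    return 1.0 - 2.0 * ((g - c) / (c - a)) ** 2

def segment_length(seg):
    """Euclidean length of a segment in pixels."""
    (u_d, v_d), (u_f, v_f) = seg
    return math.hypot(u_d - u_f, v_d - v_f)

def f2(overlap_length, seg_i, seg_j):
    """Overlap similarity, cf. Eq. (16); overlap_length plays the role of L_ij."""
    return 0.5 * (zadeh_s(overlap_length / segment_length(seg_i))
                  + zadeh_s(overlap_length / segment_length(seg_j)))
```

With these definitions, two parallel segments give `f1` equal to 1 and two perpendicular segments give a value close to 0, while `f2` moves from 0 (separation) to 1 (complete recovery) as the overlap length grows.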

The overlap length depends closely on the distance separating the segments. We therefore define a third attribute:

$$f_3 = \frac{1}{1 + \dfrac{\text{dist}}{512/N}} \qquad (17)$$
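A minimal sketch of Eq. (17) in Python is given here for illustration. The function name and the `resolution` parameter are assumptions introduced for the example; `dist` is taken to be the distance in pixels between the centers of the two segments, and `n_stripes` is N.

```python
def f3(dist, n_stripes, resolution=512):
    """Distance membership of Eq. (17): 1 / (1 + dist / (resolution / N))."""
    reference = resolution / n_stripes  # the "reasonable" distance 512 / N
    return 1.0 / (1.0 + dist / reference)
```

For example, with eight stripes the reference distance is 64 pixels, so `f3(64, 8)` evaluates to 0.5, while `f3(0, 8)` evaluates to 1.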

The value dist represents the distance (in pixels) separating the centers of each segment, while the ratio 512/N defines a


reasonable distance, with 512 being the horizontal resolution of the image (we have chosen this value, 512, as the longest line to simplify our process, but it may be parameterized depending on the choice of camera) and N the number of stripes. When the stripes are close, dist ≤ 512/N and $f_3$ tends to 1. When they are widely spaced, dist ≥ 512/N and $f_3$ tends to 0. The membership function $f_3$ is equal to 1/2 if the distance separating two segments is reasonable (dist = 512/N).

3.3. Choquet integral

To calculate the synthetic valuation of the Choquet integral, three steps have been defined:

• The first step is the choice of the attributes $X = \{x_1, x_2, \ldots, x_n\}$ (with n ≥ 2). In our application, the attributes were chosen depending on the perceptual organization: parallelism, overlap ratio, and distance.
• The second step defines the Sugeno measure μ, which determines the degree of importance, on the set $P(X)$, given to the different attributes and to their interactions.
• The third step considers the marginal valuations $f(x_i)$ obtained for each attribute $x_i$, using the three specific membership functions $f_1$, $f_2$ and $f_3$.

The implementation of the Choquet integral for decision-making is characterized by a process with multiple inputs and one output. In practice, an expert system (or an a priori knowledge base) may provide multiple synthetic valuations associated with some experiments. We obtain the following chart with m samples:

$$\begin{pmatrix} f_1^1 & f_2^1 & \cdots & f_n^1 \\ f_1^2 & f_2^2 & \cdots & f_n^2 \\ \vdots & \vdots & & \vdots \\ f_1^m & f_2^m & \cdots & f_n^m \end{pmatrix} \quad \begin{pmatrix} E^1 \\ E^2 \\ \vdots \\ E^m \end{pmatrix}$$

For this chart, m is the number of experiments (training input vectors) and n the number of attributes for a given application; $E^m$ and $f_n^m$ denote the mth total evaluation and the mth partial evaluation of item n, respectively. In our case, the knowledge base was represented by 12 synthetic images that we built for our application. There exists no general method to find the optimal number of samples needed. According to simulation results, [23] shows that 81 data were necessary with four attributes and 30 data for three attributes, and Grabisch [14] shows that the learning data must amount to at least three for three attributes, for matrices to be well-conditioned. With 12 synthetic images, we use 36 data, which may be considered reasonable. The synthetic valuation $\hat{E}^k$ is then defined by

$$\hat{E}^k = (c)\int f^k \, d\mu, \quad \forall k = 1, 2, \ldots, m \qquad (18)$$
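For a finite set of attributes, the discrete Choquet integral can be sketched as follows. This is an illustrative sketch only, under the assumption that the fuzzy measure is a Sugeno λ-measure built from one density per attribute, with λ obtained from the usual normalization condition $\prod_i (1 + \lambda g_i) = 1 + \lambda$. The function names, the bisection solver and the sample densities in the usage note are hypothetical and are not taken from the paper.

```python
def sugeno_lambda(densities, iterations=200):
    """Solve prod(1 + lam * g_i) = 1 + lam for the non-zero root lam > -1.

    Assumes at least two densities, each strictly between 0 and 1.
    """
    total = sum(densities)
    if abs(total - 1.0) < 1e-12:
        return 0.0  # additive measure: the Choquet integral is a weighted mean

    def h(lam):
        prod = 1.0
        for g in densities:
            prod *= 1.0 + lam * g
        return prod - (1.0 + lam)

    if total > 1.0:
        lo, hi = -1.0 + 1e-12, -1e-9   # root lies in (-1, 0)
    else:
        lo, hi = 1e-9, 1.0             # root lies in (0, +inf)
        while h(hi) < 0.0:
            hi *= 2.0
    sign_lo = h(lo) > 0.0
    for _ in range(iterations):        # plain bisection
        mid = 0.5 * (lo + hi)
        if (h(mid) > 0.0) == sign_lo:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lambda_measure(indices, densities, lam):
    """Measure of a subset, using mu(A + {x}) = mu(A) + g + lam * mu(A) * g."""
    mu = 0.0
    for i in indices:
        mu = mu + densities[i] + lam * mu * densities[i]
    return mu

def choquet(values, densities, lam):
    """Discrete Choquet integral of the marginal valuations `values`."""
    order = sorted(range(len(values)), key=lambda i: values[i])  # ascending
    total, previous = 0.0, 0.0
    for pos, i in enumerate(order):
        upper = order[pos:]  # attributes whose valuation is >= values[i]
        total += (values[i] - previous) * lambda_measure(upper, densities, lam)
        previous = values[i]
    return total
```

As a usage example, with equal densities of 1/3 the measure is additive (λ = 0) and `choquet([0.9, 0.6, 0.3], [1/3, 1/3, 1/3], 0.0)` reduces to the arithmetic mean of the three valuations, which is the equilibrium point used as the starting vector of the learning procedure.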

The initial vector is $X_0 = \{1/3, 1/3, 1/3\}$, and an error ε of 0.01 is allowed. We follow the basic idea from Grabisch [25] that, in the absence of any information, the least specific way of aggregation is the arithmetic mean. Any input of information then tends to move the fuzzy measure away from this equilibrium point. This means that, when few data are available, the coefficients of the fuzzy measure that are not concerned by the data are kept as near as possible to the equilibrium point, in order to ensure monotonicity. After 32 iterations, we obtain the following optimal values, which are in accordance with the degree of importance of each attribute:

$$a_1 = 0.7246, \quad a_2 = 0.5324, \quad a_3 = 0.1209$$

The value λ is equal to 1.9232. This version of the automatic learning procedure is effective, but it is not representative of the powerful modeling of interactions provided by this operator, as mentioned by Grabisch [35]. The three densities of the Sugeno measure μ obtained were applied to a first test image (Fig. 3) with three stripes, which can be represented by the following set:

$$R = \{R^{(1)}, R^{(2)}, R^{(3)}\}, \quad \text{with} \quad \begin{cases} R^{(1)} = \{S_1^{(1)}, S_2^{(1)}\} \\ R^{(2)} = \{S_1^{(2)}, S_2^{(2)}, S_3^{(2)}\} \\ R^{(3)} = \{S_1^{(3)}, S_2^{(3)}, S_3^{(3)}\} \end{cases}$$

The results were the following:

• the 2D segments whose 3D homologous segments belong to the same surface have a synthetic valuation greater than 0.7652;
• the 2D segments whose 3D homologous segments do not belong to the same surface have a synthetic valuation smaller than 0.5515.

When the segments $S_i^{(1)}$ and $S_j^{(2)}$ are matched, we deal with groups Gr of segments belonging to the same surface:

$$Gr = S_i^{(1)} \cup S_j^{(2)} \cup \cdots \cup S_m^{(N)}$$
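The construction of the groups Gr from pairwise matches can be sketched with a small union-find structure. This is a hypothetical illustration rather than the procedure used in the paper: the function name, the dictionary layout of the valuations, the sample values and the threshold of 0.7 (chosen only because it lies between the two bounds reported above) are all assumptions.

```python
def group_segments(valuations, threshold):
    """Group segments whose synthetic valuation exceeds `threshold`.

    `valuations` maps a pair ((stripe, i), (stripe + 1, j)) of segment
    labels to the synthetic valuation computed for that pair.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for (seg_a, seg_b), value in valuations.items():
        root_a, root_b = find(seg_a), find(seg_b)
        if value > threshold and root_a != root_b:
            parent[root_a] = root_b  # matched: merge the two groups

    groups = {}
    for seg in list(parent):
        groups.setdefault(find(seg), set()).add(seg)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical valuations between segments of stripes 1-2 and 2-3.
example = {
    ((1, 1), (2, 1)): 0.83,
    ((1, 1), (2, 2)): 0.15,
    ((1, 2), (2, 2)): 0.81,
    ((2, 1), (3, 1)): 0.85,
    ((2, 2), (3, 2)): 0.19,
}
print(group_segments(example, threshold=0.7))
```

Each returned list contains the labels (stripe, index) of the segments assigned to one planar surface; segments that are never matched remain in a group of their own.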

where the functions $f^k$ appearing in Eq. (18) are $f^k(x_i) = f_i^k$, $i = 1, 2, \ldots, n$.

Fig. 3. Test image.


Fig. 6. Original image of the first example.

Fig. 4. Planar surface reconstruction.

For each group, we know the 2D coordinates (1d, 2d, ...) of the different segments. Since each stripe is labeled, we also know the equation of the plane corresponding to the surface created by these segments. The equation of the 3D plane is then calculated by a least-squares method and illustrated in Fig. 4. The 3D reconstruction of the test image is illustrated in Fig. 5.

4. Results

Let us now present the application of our algorithm to two examples (real images), obtained by projecting eight (or nine) luminous planes onto stacked blocks.

4.1. First example

Fig. 6 shows the original image, which is processed. The following tables give some examples of synthetic valuations obtained between line segments of adjacent stripes, the underlined values being those above the fixed threshold and corresponding to the decision to match these line segments to the same surface.

Fig. 5. Planar surface reconstruction of the test image.

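[Matrices of synthetic valuations for stripes 1–2, stripes 2–3 and stripes 6–7. The row and column arrangement and the underlining are not recoverable here; the entries, in the order in which they appear, are:
0.8353, 0.1530, 0.4955, 0.8295, 0.1604, 0.1827, 0.8083, 0.2001, 0.4993, 0.1479, 0.8494, 0.8452, 0.1891, 0.4993, 0.1552, 0.8066, 0.1814, 0.1525, 0.8722, 0.1183, 0.4997, 0.1744, 0.7353, 0.1621, 0, 0;
next to the label "stripes 1–2": 0.1926, 0.4967; next to the label "stripes 2–3": 0.1173, 0.4920; next to the label "stripes 6–7": 0.2330, 0, 0;
0.9314, 0, 0, 0.1711; 0.4844, 0.2628; 0.5000, 0.1592, 0.9028; 0, 0, 0, 0, 0, 0.]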

To conclude, Fig. 7 illustrates the grouping result obtained from the four groups of line segments characterizing the stacked blocks image. In this first example, the angles between pairs of facets have been estimated (Table 1), as has the average length of the line segments (Table 2). The average angle error for this object is 1.3°. The average metric error is 0.6 mm.

Fig. 7. Surfaces reconstruction of the first example.


Table 1. Estimation of the error on the angles for the first example

Couple of facets    Real angle (degrees)    Reconstructed angle (degrees)
1–2                 90                      88.21
2–3                 90                      90.47
3–4                 90                      91.65
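The average angle error quoted for the first example can be checked from the values of Table 1 with a short calculation. The sketch below assumes that the error is the mean absolute difference between real and reconstructed angles; the function name is hypothetical.

```python
def mean_absolute_error(real, reconstructed):
    """Mean of the absolute differences between paired measurements."""
    errors = [abs(r - m) for r, m in zip(real, reconstructed)]
    return sum(errors) / len(errors)


real_angles = [90.0, 90.0, 90.0]            # Table 1, real angles (degrees)
reconstructed_angles = [88.21, 90.47, 91.65]  # Table 1, reconstructed angles
print(round(mean_absolute_error(real_angles, reconstructed_angles), 2))  # 1.3
```

The individual errors are 1.79°, 0.47° and 1.65°, which average to about 1.3°.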

Table 2. Estimation of the length of the line segments for the first example

Facet    Measure direction    Real length (cm)    Average reconstructed length (cm)    Standard deviation (cm)
2        Vertical             2.5                 2.48                                 5.2 × 10^-2
4        Vertical             2.5                 2.56                                 3.8 × 10^-2

4.2. Second example

A more complex stacked blocks image has been successfully processed. Fig. 8 illustrates the original image and shows the nine stripes analyzed in our grouping process. The grouping result is given in Fig. 9. The angular differences between two facets have been estimated (Table 3), as has the average length of the line segments (Table 4). The average angle error is 2.14° and the metric error on the length is 0.52 mm.

4.3. Conclusion

These simple examples show that the proposed algorithm is suitable for accurate 3D reconstruction. The accuracy obtained is comparable with that of other existing methods [36]. We could also compare our algorithm to the state of the art courtesy of the Range Image Segmentation Comparison Project (segmentation algorithms available via http://marathon.csee.usf.edu/range/seg-comp/SegComp.html), which proposes a tool to objectively evaluate algorithms for segmenting a range image. In fact, the methodology we follow takes into account the new research directions proposed in [37], and particularly the role of learning in perceptual grouping. Learning is important to obtain strong geometrical consistency with the given model. Most authors [2] use probabilistic frameworks with success. The drawback of this framework is a long calculation time, and probabilistic noise modeling has

Fig. 9. Reconstruction of the stacked blocks of the second example.

Table 3. Estimation of the angles between facets for the second example

Couple of facets    Real angle (degrees)    Reconstructed angle (degrees)
2–3                 90                      91.99
3–4                 90                      87.68
1–4                 70                      70.56
4–5                 90                      86.32

Table 4. Estimation of the length of the line segments for the second example

Facet    Measure direction    Real length (cm)    Reconstructed length (cm)    Standard deviation (cm)
1        Inclination          3                   3.17                         1.8 × 10^-2
2        Horizontal           1                   0.98                         8.3 × 10^-2
3        Vertical             3                   2.99                         4.4 × 10^-2
4        Horizontal           1.5                 1.49                         2.5 × 10^-2

to be performed. The aim of our work is to create a simple learning methodology that may be used in an industrial environment. Indeed, users are often not familiar with probabilistic frameworks, and have some difficulty in adapting the 3D measurement tools they employ. The motivation for using a fuzzy integral approach is therefore to develop a flexible tool for the industrial application that is now presented (in industry today, production lines often change and measurement tools have to adapt, depending on the cost of their development). Only 12 training/testing runs are sufficient to obtain good results, using 30–40 training epochs in each run. Thus, the results obtained on real images are satisfactory, and the recovered edges of the objects may be used for measuring tasks.

5. Industrial application

5.1. 3D sensor presentation

Fig. 8. Original image of the second example.

The application of the 3D sensor is devoted to the measurement of critical dimensions of pipes used in gas pipelines. The tolerances on these dimensions are very strict, since the pipes are


Fig. 10. General presentation of the 3D sensors.

welded together. It is important to note that this type of measure may change, depending on the demand for pipes or other parts, and the fuzzy measure may be relearned several times during the year in a simple way. In practice, two sensors are placed to measure the profiles of each extremity of the pipe. These pipes are about 12 m long, with a diameter of 20 to 56. According to the dimension of the tube to be measured, the sensors are placed between the positions P2 and PL2 (see Fig. 10). In this paper, we do not present the detailed device that makes the placement and rotation of the pipe possible, in association with the appropriate optical and position sensors. Then, depending on the dimension of the pipe, a laser plane is projected onto the extremity of the pipe (see Fig. 11), and the pipe is rotated 100 times about its axis to perform the measurement. This rotation makes it possible to check the curvature of the pipe (the different images obtained at each step are concatenated to obtain the whole image, as in the examples presented in Section 4). The end of the pipe is very reflective (mirror-like surface), because of the welding to be made. The industrial environment is subject to heavy perturbations:

Fig. 12. Observed image with bad lighting.

Fig. 13. Observed image in good lighting conditions.

• dust
• high lighting (from 240 to 5000 lux)
• temperature variations
• mechanical vibrations

The lighting variations are therefore very significant, and Figs. 12 and 13 show two examples of the observed laser ray before 3D reconstruction. The different measures made are summarized in Table 5. The average precision obtained is 0.1 mm on the lengths and 0.1° on the angles, a precision that is required for welding. In particular, these results demonstrate the robustness of the method to lighting variations. Fig. 14 defines the measures made, and Fig. 15 presents the device used (showing the CCD camera and the laser projector). Finally, Fig. 16 presents an example of a part of the 3D reconstruction (the image is fitted to an ideal plane by the LMSE technique; the complete image is too large to be presented). To measure the accuracy of the proposed approach, the 3D coordinates are drawn in a wire frame, as shown in Fig. 17.

Fig. 11. 3D sensor.

Table 5. Measures table

Measure designation        Minimum value    Maximum value
Bevel angle 1 (degrees)    28               40
Bevel angle 2 (degrees)    09               20
Root (cm)                  0.8              2.6
Curvature (diameter)       20″              56″

Fig. 14. Measures.

Fig. 15. Used device.

Fig. 16. 3D reconstruction of a part of tube.

Fig. 17. The 3D coordinates drawn in a wire frame.

6. Conclusion

In this paper, we discussed perceptual grouping based on geometric relations using three geometrical attributes. These attributes are then used with the Choquet integral and Sugeno measures for perceptual organization. This original method for matching segments to planar surfaces provides results as satisfactory as the 3D planar surface reconstructions obtained by Tarel [1,36], for instance, and it remains easy to implement (we have used well-known software, Matlab). The greatest advantage of the fuzzy integral in this application is that we assign degrees of importance to the interactions between attributes, which is not possible in other aggregation methods (hierarchical representation, for example). This makes it possible to deal with the depth uncertainty of the 3D reconstruction by managing the balance between accuracy and fragmentation in grouping, depending on the data and the objects being extracted. The fuzzy membership functions associated with each attribute provide explicit models for the global aggregation method using the Choquet integral. The application, devoted to 3D measurement on a device of great dimensions, has two strong prerequisites:

• robustness to the bad conditions of this application in an industrial environment (and particularly strong variations of lighting)
• the need to change the geometry of the parts being measured.

We have shown that the fuzzy integral, associated with a fuzzy measure, may be an answer to this problem. The aggregation system (fuzzy integral) is based on the (implicit) expression of uncertainty as a fuzzy vector (overlap ratio, distance and orientation) and makes it possible to fuse this uncertainty. The learning algorithm we propose is easy to implement and computationally inexpensive when the number of attributes is limited (3 or 4). This algorithm does not need the noise modeling and analysis that probabilistic algorithms carry out in a Monte Carlo (or other) framework. This methodology may be of interest in an industrial environment, since it makes a flexible production line possible (without carrying out a study for each case of part measurement). However, the explicit modeling of noise in probabilistic frameworks (Bayesian networks, etc.) is replaced by the need to learn the fuzzy measure, and this remains a problem for the application of the fuzzy integral (when the number of attributes increases, for instance). We are therefore now working on the applicability of statistical learning theory to this problem. The applications in vision may be numerous: 3D reconstruction of complex objects


(linear and curved primitives), dynamic vision, classification, color imaging, etc.

Acknowledgements

The authors would like to thank Europipe Society, Dunkerque for supporting this study. We also want to thank the anonymous reviewers for their helpful remarks, and S. Target and J. Marichez for their help.

References

[1] J.-P. Tarel, J.-M. Vezien, A generic approach for planar patches stereo reconstruction, in: Proceedings of the Scandinavian Conference on Image Analysis, Uppsala, Sweden, June 1995, pp. 1061–1070.
[2] I.K. Park, K.M. Lee, S. Uk Lee, Perceptual grouping of line features in 3D space: a model-based framework, Pattern Recognition 37 (2004).
[3] P. Vasseur, C. Pegard, E. Mouaddib, L. Delahoche, Perceptual organization approach based on Dempster-Shafer theory, Pattern Recognition 32 (1999).
[4] L.C. Fang, L.C. Yang, A new approach to high precision 3-D measuring system, Image and Vision Computing 17 (1999).
[5] D.G. Lowe, Perceptual Organization and Visual Recognition, Kluwer Academic Publishers, Hingham, MA, Boston, 1985.
[6] D.G. Lowe, Three-dimensional object recognition from single two-dimensional images, Artificial Intelligence 31 (1987).
[7] H.B. Kang, E.L. Walker, Perceptual grouping based on fuzzy sets, in: Proceedings of the FUZZ-IEEE'92, 1992, pp. 651–660.
[8] E.L. Walker, Characterizing and controlling approximation in hierarchical perceptual grouping, Fuzzy Sets and Systems 65 (1994) 187–223.
[9] L. Evrard, A. Bigand, J.P. Dubus, Fuzzy image analysis using multistripe structured light, in: Europto/SPIE Congress, Besançon, 1996.
[10] L. Evrard, A. Bigand, J.P. Dubus, Structural fuzzy-image analysis and structured light, in: Third France–Japan Congress of MECHATRONICS, Besançon, 1996.
[11] P. Grandjean, Perception Multisensorielle et Interprétation de Scènes, PhD thesis, LAAS Toulouse, 1991.
[12] L.A. Zadeh, Fuzzy sets as a basis for a theory of possibility, Fuzzy Sets and Systems 1 (1978) 3–28.
[13] B. Bouchon-Meunier, Aggregation and Fusion of Imperfect Information, Physica Verlag, Berlin, 1998.
[14] M. Grabisch, Fuzzy Integral as a Flexible and Interpretable Tool of Aggregation, in: Aggregation and Fusion of Imperfect Information, Physica Verlag, Berlin, 1998.
[15] I. Bloch, Information combination operators for data fusion: a comparative review with classification, IEEE Transactions on SMC 26 (1996) 52–67.
[16] R.R. Yager, On ordered weighted averaging aggregation operators in multicriteria decision making, IEEE Transactions on SMC 18 (1988) 183–190.
[17] M. Grabisch, On equivalence classes of fuzzy connectives: the case of fuzzy integrals, IEEE Transactions on Fuzzy Systems 3 (1) (1995) 96–109.
[18] M. Sugeno, Theory of Fuzzy Integrals and Its Applications, PhD thesis, Tokyo Institute of Technology, 1974.
[19] D. Denneberg, Non-Additive Measure and Integral, Kluwer, London, 1994.
[20] H. Tahani, J.M. Keller, Information fusion in computer vision using the fuzzy integral, IEEE Transactions on SMC 20 (3) (1990) 733–741.
[21] M. Grabisch, J.M. Nicolas, Classification by fuzzy integral: performance and tests, Fuzzy Sets and Systems 65 (1994) 255–271.
[22] M. Grabisch, T. Murofushi, M. Sugeno, Fuzzy Measures and Integrals, Physica Verlag, 2000.
[23] K. Ishii, M. Sugeno, A model of human evaluation process using fuzzy integral, International Journal of Man–Machine Studies 22 (1985) 19–38.
[24] K. Tanaka, M. Sugeno, A study on subjective evaluations of printed color images, International Journal of Approximate Reasoning 5 (1991) 213–222.
[25] M. Grabisch, M. Sugeno, Multi-attribute classification using fuzzy integral, in: Proceedings of the FUZZ-IEEE'92, 1992, pp. 47–54.
[26] J. Wang, Z. Wang, Using neural networks to determine Sugeno measures by statistics, Neural Networks 10 (1) (1997) 183–195.
[27] J.-H. Chiang, Choquet fuzzy integral-based hierarchical networks for decision analysis, IEEE Transactions on Fuzzy Systems 7 (1) (1999) 63–71.
[28] J. Li, G. Chen, Z.C. Lu, Image coding quality assessment using fuzzy integral with a three-component image model, IEEE Transactions on Fuzzy Systems 12 (1) (2004).
[29] S. Auephanwiriyakul, J.M. Keller, P.D. Gader, Generalized Choquet fuzzy integral fusion, Information Fusion 3 (2002) 69–85.
[30] Y. Ohta, T. Kanade, Stereo by intra and inter scanline search using dynamic programming, IEEE Transactions on PAMI 7 (2) (1985) 139–154.
[31] S. Sarkar, K.L. Boyer, Perceptual organization in computer vision: a review and a proposal for a classificatory structure, IEEE Transactions on SMC 23 (2) (1993) 382–399.
[32] M. Boldt, R. Weiss, E. Riseman, Token-based extraction of straight lines, IEEE Transactions on SMC 19 (6) (1989) 1581–1595.
[33] A. Ylä-Jääski, F. Ade, Grouping symmetrical structures for object segmentation and description, Computer Vision and Image Understanding 63 (3) (1996) 399–417.
[34] A.K. Jain, R. Hoffmann, Evidence-based recognition of 3-D objects, IEEE Transactions on PAMI 10 (6) (1988) 783–802.
[35] M. Grabisch, A graphical interpretation of the Choquet integral, IEEE Transactions on Fuzzy Systems 8 (5) (2000) 627–631.
[36] A. Hoover, et al., An experimental comparison of range image segmentation algorithms, IEEE Transactions on PAMI 18 (7) (1996) 673–689.
[37] K.L. Boyer, S. Sarkar, Perceptual organization in computer vision: status, challenges and potential, Computer Vision and Image Understanding 76 (1) (1999) 1–4.