Robustness of a multiscale scheme of feature points detection




Pattern Recognition 33 (2000) 1437–1453

Jacques Fayolle*, Laurence Riou, Christophe Ducottet
Laboratoire Traitement du Signal et Instrumentation, UMR CNRS n° 5516, 23 rue du Docteur P. Michelon, 42023 Saint-Etienne Cedex, France
Received 19 February 1998; received in revised form 11 June 1999; accepted 11 June 1999

Abstract

We present a new scheme for feature points detection on a grey level image. Its principle is the study of the gradient phase signal along object edges and the characterization of the behavior across scales of the wavelet coefficients of this signal. The feature points are determined as transition points of this signal. In the second part, we study the robustness of the detection scheme against changes of the acquisition parameters: the viewpoint and the zoom of the camera, the object rotation, the luminescence variation and noise. The results show the method efficiency: most of the points are still detected even if these parameters vary. © 2000 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.

Keywords: Feature points; Wavelet transform; Local curvature; Method efficiency; Acquisition parameter; Viewpoint; Scale variation

1. Introduction

The detection of feature points on grey level images is a common problem of image processing. The main applications are the comparison between human perception and computer vision, the recognition or the tracking of objects in a scene, and object recognition in an image database. For example, the knowledge of the positions of feature points allows Talluri to locate a mobile robot in three dimensions [1]. These points are also used in the recognition of objects such as human face profiles [2]. However, the main application of feature points detection is the determination of motion between successive frames. The motion measurement is done through the tracking of these points. The tracking problem is known as the correspondence problem, and many papers address this subject [3–8]. Therefore, it is essential to get an algorithm of feature points detection which gives the same set of points even if the position or the orientation of the object has changed.

* Corresponding author. Tel.: 00-33-4-77-48-5131; fax: 00-33-4-77-48-5120. E-mail address: [email protected] (J. Fayolle).

We retain in this paper the definition of feature points given by Attneave in 1954 [9]: feature points are high curvature points. This definition is based on the consideration that human perception is more sensitive to high curvature points than to all other points. Following Chen [10], we distinguish two main approaches to detection schemes: the polygonal approximation of edges and the detection of grey level corners. The first one (the more classical) consists in the segmentation of the image, the representation of the object boundary by a chain code and then the detection of corners as points where the direction of edges changes [11–15]. Obviously, the performance of this kind of algorithm is strongly linked to the quality of the segmentation. Unfortunately, for many types of images, the segmentation task is difficult. The second approach avoids this drawback. Indeed, the second class of methods takes into account the grey level image and not the object shape [16–19]. Most of these methods use as criterion the changes of the direction of the grey level gradient. For example, Lucas and Kanade propose a detection scheme which uses a segmentation of eigenvalues of a contrast matrix [20]. More recent algorithms detect points through the behavior of wavelet coefficients across scales. For example, Zheng proposes a detection scheme through Gabor wavelet decomposition and the search for points for which the



variation of coefficients across scales is maximum [8]. Similarly, Chen uses the evolution of wavelet coefficients to isolate corner candidates (with a wavelet defined as the first derivative of a Gaussian) [10]. The method proposed in this paper is closely related to this last approach: detection through the behavior of coefficients across scales. More precisely, we determine the gradient phase signal along the edges and extract the feature points from this signal. The feature points are those corresponding to a model of smoothed singularity (named transition). We characterize these points by the length and the amplitude of the grey level transition. These measurements allow the determination of local curvatures and therefore the detection of robust feature points. The quality of a scheme of feature points detection is strongly linked to its robustness against changes of object position and orientation. We expect the algorithm to detect the same set of feature points on the object image for any viewpoint. Indeed, this property of the detection scheme is essential both for object tracking and for object recognition in an image database. For instance, consider the following case: we get an image of an object and we do not know how this image has been acquired (the camera viewpoint, the ratio between the object size and the object-image size, the lighting direction, …). The aim is to recognize this image in a database very quickly. One method is to test the correlation between the set of feature points detected on this image and the sets of feature points detected on each image of the database. The maximum of this correlation indicates the corresponding image of the database. The first task of this kind of application is to ensure that the same set of feature points is detected even if the acquisition parameters have changed. For object tracking, the problem is the same. Our purpose is to study the robustness of the proposed detection scheme against some parameters.
We have chosen the following experimental protocol. We study images of painting reproductions seen from different viewpoints and we test whether the set of feature points detected on the object of these paintings is linked to the acquisition parameters. We consider five parameters:

- the relative position of the camera and the painting reproduction (the object),
- the scale parameter (corresponding to a zoom effect),
- the image rotation (corresponding to a rotation of the camera around its optical axis),
- the luminescence variations,
- the signal-to-noise ratio.

Our aim is to determine the percentage of points which are still detected even if these parameters vary. We will say that the detection is efficient if the feature points detected for two different viewpoints correspond to the same localization on the objects represented on the paintings. This experimental protocol is introduced according

to the application of image database consultation, but the results obtained are more general. Indeed, the set of tested parameters is large enough to cover many applications such as motion determination or pattern recognition. The rest of this paper is organized as follows. In Section 2, we present the multiscale scheme used for feature points detection. In particular, we give details on the theoretical variation across scales of the modulus of the wavelet transform coefficients along maxima lines. The studied case is the transition between two stable levels of the grey level function. In Section 3, we show some results of detection on test images, and we discuss the robustness, advantages and drawbacks of the method in Section 4. Concluding remarks are given in Section 5.

2. Multiscale detection of feature points

The proposed detection scheme is composed of three main stages: the detection of object edges (we define precisely the notion of edges hereafter), the localization of feature points and then the characterization of these feature points (in particular, the estimation of the curvature at each feature point). Each step is detailed below.

2.1. Theory of multiscale detection using wavelets

The algorithm of feature points detection proposed in this paper is based on the theory of multiscale edge detection using wavelets. In this section, we first present the principle of multiscale edge detection and we highlight its link with the wavelet transform. Only the one-dimensional (1D) case is considered here (the two-dimensional (2D) case can be easily extrapolated). A more detailed presentation of this theory can be found in the papers of Mallat, Zhong and Hwang [21,22]. The edges of a signal are particular points where this signal has sharp variations. In a 1D signal, such points are isolated. In 2D signals, they form continuous lines and provide the location of object contours. Whereas this notion is clear in the case of a binary object, it becomes fuzzier for grey level images. For two-dimensional signals, edges are made up of points where the gradient vector modulus (of the grey value) is locally maximum in the gradient vector direction. According to this definition, initially proposed by Canny [23], edges correspond to inflection lines in the signal. If the gradient maximum is high, the inflection corresponds to a sharp variation. On the contrary, if it is low, the inflection corresponds to a smooth variation. In the 1D case, edges are particular points where the first derivative modulus is locally maximum. We describe here the principle of the multiscale detection of edges. We consider a smoothing function φ(x)


whose integral is equal to 1 and which converges to 0 at infinity. A typical example of such a smoothing function is the Gaussian function. In order to obtain smoothed versions of a signal at different scales, we introduce the smoothing function φ_s(x) at scale s defined by

φ_s(x) = (1/s) φ(x/s).   (1)



Multiscale edges are defined using smoothed versions of the signal and a first- or a second-order derivative operator. Here, we choose the first-order operator, for the following reasons. First, we are interested in inflection points of the signal, and these points can be detected as the local maxima of the first derivative of the signal. The second reason is linked to the choice of the wavelet used, and we will explain it later. However, all the mathematical developments presented hereafter can be made with the second-order operator (the principle and the results are identical). With the previous choice, multiscale edges are defined as the modulus maxima points of the first derivative of smoothed versions of the signal. They are inflection points of the filtered signal. If the smoothing function is differentiable, the first derivative of a function f smoothed at scale s can be expressed as



d/dx (f * φ_s)(x) = (f * dφ_s/dx)(x) = (1/s²) (f * (dφ/dx)(·/s))(x),   (2)

where * denotes the convolution product. This multiscale edge detection can be expressed in terms of the wavelet transform. Let us introduce the wavelet function ψ(x) and the wavelet function ψ_s(x) at scale s:

ψ(x) = dφ/dx (x),   ψ_s(x) = (1/s) ψ(x/s).   (3)

Then,

d/dx (f * φ_s)(x) = (1/s) (f * ψ_s)(x) = (1/s) Wf(s, x),   (4)

where Wf denotes the wavelet transform of the function f, with respect to the wavelet ψ(x), at scale s and position x [24–26]. The function ψ(x) is a wavelet because its integral is equal to 0. Eq. (4) proves that multiscale edges can be detected by means of a wavelet transform. The first step consists in computing the wavelet transform of the signal using a wavelet which is the first derivative of a smoothing function (Eq. (3)). The important point is the choice of the wavelet. We must use a wavelet which is the derivative of a smoothing function (other ones, such as Morlet wavelets, are not appropriate for edge detection). But we can choose the derivation order of the


smoothing function. If we choose as wavelet the nth derivative of a smoothing function, then we can detect edges corresponding to singularity orders less than n [22,27]. Here, we are only interested in the edges of the signal, and therefore in singularity orders less than 1. That is why we choose as wavelet the first derivative of a Gaussian. The following algorithm can also be developed for wavelets obtained with the second derivative of a Gaussian, but the interpretation of the extracted points is more difficult. The second step of the multiscale edge detection consists in detecting the local maxima of the modulus of the wavelet transform. The wavelet transform used must not be undersampled, and the simplest way to compute it is to evaluate, in the Fourier space, the convolution product of Eq. (4) [28]. The calculation can be made for any value of the scale parameter s. Multiscale edges can be represented in a scale space diagram where the horizontal axis corresponds to the space variable x and the vertical axis corresponds to the scale variable s. While the scale varies, maxima points of the wavelet transform modulus are connected along lines called maxima lines. In the scale space diagram, the maxima lines converge to the inflection points of the original signal when the scale reduces to 0. We will study hereafter the variation of the wavelet transform modulus along these lines. Therefore, we have to get many points along them in order to assure a good precision for our algorithm. This is the reason why we use a continuous wavelet transform and not a dyadic one. In the rest of the paper, the wavelet used is defined as the first derivative of a Gaussian function and the algorithm used for the wavelet transform is based on the Fourier transform.
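As an illustration of this computation, the wavelet transform with a first-derivative-of-Gaussian wavelet can be evaluated as a product in Fourier space, using ψ̂_s(ω) = i s ω e^(−(sω)²/2). The sketch below is a minimal illustration assuming NumPy (the function name dog_cwt is ours, not the authors'):

```python
import numpy as np

def dog_cwt(f, scales):
    """Continuous wavelet transform of a 1D signal with the first
    derivative of a Gaussian, evaluated as a product in Fourier space.
    Returns an array of shape (len(scales), len(f))."""
    n = len(f)
    w = 2 * np.pi * np.fft.fftfreq(n)      # angular frequencies
    F = np.fft.fft(f)
    out = np.empty((len(scales), n))
    for i, s in enumerate(scales):
        # Fourier transform of psi_s(x) = (1/s) psi(x/s), with psi = G'
        psi_hat = 1j * s * w * np.exp(-(s * w) ** 2 / 2)
        out[i] = np.real(np.fft.ifft(F * psi_hat))
    return out
```

For an ideal unit step, the modulus at the step abscissa stays close to 1/√(2π) at every scale, in agreement with Eq. (11) below for σ = 0; maxima lines are then obtained by chaining the local maxima of |Wf(s, x)| across scales.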
The advantages of the multiscale edge detection are:

- the choice of the scale, and therefore of the level of detail retained on each edge,
- obtaining, in addition to the edge localization, the direction and the modulus of the gradient vector at each edge point (Fig. 1).

This additional information is useful for the next step: the detection of feature points along edges. In summary, the edge detection step consists in the calculation of the wavelet transform of the image at one scale (chosen according to the level of detail we want to retain) and the extraction of the maxima of its modulus in the gradient direction.

2.2. Detection of feature points

Feature points (defined above as high curvature points) correspond obviously to points where the edge direction changes rapidly or (equivalently) to points where the gradient direction changes rapidly. The


Fig. 1. (a) Initial image; (b) detected edges and gradient vectors at each edge point.

gradient direction criterion is more powerful. Indeed, the edge detection gives this information in a "nearly continuous space" (because the sampling frequency of the wavelet can be chosen as small as needed), whereas the edge direction is known only on the discrete grid of the image. Thus, the edge direction is more sensitive to aliasing and noise defects than the gradient direction. Consequently, we propose the following scheme to detect high curvature points. We localize points corresponding to high variations in the signal of gradient direction. This 1D signal is constructed from the results of the first stage: we follow each edge with an edge route and we record the value of the gradient direction at each abscissa. Then, the detection of feature points is equivalent to the detection of sharp variations in this one-dimensional signal. Some of these variations are not significant (local variations due to noise). On the contrary, other variations correspond to real transitions of the phase signal. These transition points are retained as feature points. To detect them, we study the behavior across scales of the wavelet transform coefficients of the gradient phase signal. This kind of approach is also used by Chen [10]. It can be split into two subtasks: the localization of sharp variations and then the selection of transition points among all the variations. The detection of sharp transitions in a one-dimensional signal with wavelet transforms is not an original algorithm: it was first proposed by Mallat [21] and used in many applications. The originality of our algorithm is to use this detection scheme on a phase signal, and to extend the Mallat algorithm in order to characterize feature points very precisely (Section 2.3). Let us recall the Mallat algorithm. We compute the wavelet transform of the signal for a set of scales and we extract the maxima lines of the wavelet transform modulus. These lines (functions of the scale parameter) point to the transition abscissas when the scale goes down (Fig. 2).

Fig. 2. An original signal (top), its wavelet transform (middle) and the detected maxima lines (bottom).
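The construction of the gradient-direction signal from the results of the first stage can be sketched as follows, as a minimal illustration assuming NumPy; the ordered edge chain and the direction map are taken as given, and the phase unwrapping is our own precaution against artificial 2π jumps:

```python
import numpy as np

def phase_signal(chain, direction):
    """Gradient-direction signal along an ordered edge chain.
    chain: sequence of (row, col) pixel coordinates along one edge;
    direction: 2D array of gradient angles (radians) from the edge
    detection stage.  The angles are unwrapped so that only real
    transitions of the phase remain, not 2*pi wrap-arounds."""
    raw = np.array([direction[r, c] for r, c in chain])
    return np.unwrap(raw)
```

Sharp transitions of this one-dimensional signal are then located with the same wavelet machinery as in Section 2.1.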

To discriminate between the different types of variations of the signal, we study the evolution of the wavelet coefficients along the maxima lines. We consider only variations corresponding to peaks or steps, and we represent them by filtered Dirac and filtered Heaviside distributions. The logarithmic slope, at large scales, of the evolution of the wavelet coefficients along maxima lines holds the information on the variation type. This slope is equal to 0 in the step case and to −1 in the Dirac case (Fig. 3). By extension, we consider that a positive slope indicates a filtered step and a negative one a filtered Dirac distribution. The proof of these behaviors can be found in Ref. [21]. For the detection of feature points, we are interested in transitions between two stable levels, and therefore in variations corresponding to the step case.
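This discrimination can be sketched as follows, as a minimal illustration assuming NumPy; the tracking window size and the least-squares fit over log s are our choices, not the authors':

```python
import numpy as np

def dog_cwt_1d(f, s):
    """Wavelet transform at one scale (first derivative of a Gaussian),
    computed as a product in the Fourier domain, as in Section 2.1."""
    w = 2 * np.pi * np.fft.fftfreq(len(f))
    psi_hat = 1j * s * w * np.exp(-(s * w) ** 2 / 2)
    return np.real(np.fft.ifft(np.fft.fft(f) * psi_hat))

def log_slope(f, x0, scales):
    """Logarithmic slope of the wavelet modulus along the maxima line
    converging to abscissa x0: a slope near 0 indicates a step,
    a slope near -1 a Dirac-like peak."""
    vals, pos = [], x0
    for s in scales:
        mod = np.abs(dog_cwt_1d(f, s))
        lo = max(pos - 5, 0)                        # follow the maximum in
        pos = lo + int(np.argmax(mod[lo:pos + 6]))  # a window along the line
        vals.append(mod[pos])
    return np.polyfit(np.log(scales), np.log(vals), 1)[0]
```

On a synthetic step the measured slope is near 0, and on a narrow peak it is near −1, reproducing the behavior of Fig. 3.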


Fig. 3. Logarithmic slope of the evolution of the wavelet coefficient modulus along maxima lines for the step (■) and the Dirac distribution (•).

Fig. 4. Estimation of two parameters on a filtered step: the transition length σ and the amplitude A.

On the other hand, we associate noise with variations corresponding to the Dirac type. In short, feature points are detected as points for which the behavior of the wavelet coefficients of the gradient-phase signal along maxima lines is similar to the transition behavior.

2.3. Characterization of feature points

The knowledge of the feature point type allows us to characterize these points precisely, and finally to estimate the local curvature. We measure two quantities on a filtered step: a transition length named σ and the amplitude of the step named A (Fig. 4). The direct measurement of these quantities from the grey level image is the main improvement of the proposed algorithm over classical schemes of feature points detection. These two parameters are obtained by fitting the experimental variation of the wavelet transform modulus along maxima lines with the theoretical variation. This theoretical variation is computed for a filtered step in the following manner. (The wavelet used is defined as the first derivative of the Gaussian, as for the first stage of the algorithm.) A filtered step S_σ(x), expressed as a function of the Heaviside distribution H(x), can be written as

S_σ(x) = A · H(x) * G_σ(x),   (5)

where

G_σ(x) = (1/σ) G(x/σ).   (6)

In these formulae, G is the Gaussian function (with average and standard deviation equal to 0 and 1, respectively) and * denotes the convolution product. If we take into account the expression of S_σ(x), the wavelet transform at scale s of this step S_σ is defined by

S̃_σ,s(x) = S_σ(x) * ψ_s(x) = A · H(x) * G_σ(x) * ψ_s(x) = A · H(x) * G_σ(x) * (1/s) (dG/dx)(x/s),   (7)

where ψ_s(x) is the wavelet at scale s, since we have chosen this wavelet as the first derivative of a Gaussian: ψ_s(x) = (1/s) (dG/dx)(x/s). Since (1/s) (dG/dx)(x/s) = s (dG_s/dx)(x), we obtain

S̃_σ,s(x) = A · H(x) * G_σ(x) * s (dG_s/dx)(x) = As · d/dx (H(x) * G_σ(x) * G_s(x)),   (8)

S̃_σ,s(x) = As · δ(x) * G_σ(x) * G_s(x) = As · G_σ(x) * G_s(x) = As · G_√(s²+σ²)(x).   (9)

Therefore, the modulus of the wavelet transform is

|S̃_σ,s(x)| = (A s / (√(2π) √(s² + σ²))) e^(−x²/(2(s²+σ²))).   (10)

A maxima line across scales is given by the maxima of |S̃_σ,s(x)| with respect to the variable x. The maximum is obtained for x = 0 whatever the value of s. Thus, the theoretical evolution of the wavelet transform modulus along a maxima line is given by

m(s) = A s / (√(2π) √(s² + σ²)).   (11)

Fitting this theoretical variation to the experimental one allows the measurement of the two quantities σ and A (Fig. 5). The simplest estimation of σ and A is obtained from the values of m(s) at two different scales s₁ and s₂ (we have two equations with two unknowns). This approach gives the following results:

σ² = s₁² (1 − m²(s₁)/m²(s₂)) / (m²(s₁)/m²(s₂) − s₁²/s₂²),   (12)

A = m(s₁) · √(2π) · √(1 + σ²/s₁²).   (13)

Thus, we only need to know the values of the wavelet coefficients at two different scales to extract these characteristics. Another way to determine these two parameters is to estimate them through a least mean square optimization. Indeed, from Eq. (11), s²/m²(s) is a linear function of s², so we can make a linear fitting of the function s²/m²(s)


Fig. 5. Fitting of the experimental variations of the wavelet transform modulus along maxima lines (•) with the theoretical ones (continuous lines) for different values of σ.

in the variable s². This fitting gives us the two previous parameters σ and A. Obviously, the least mean square estimation is more robust and more precise than the simple two-scale estimation. Note that, in our case, the original signal represents the evolution of the gradient direction along the edge. Therefore, the two quantities σ and A acquire a special significance: the transition length σ corresponds to an arc length on the edge, and the amplitude A is an angle variation of the gradient direction. Consequently, an estimation of the local curvature at each feature point is given by the ratio between A and σ. Fig. 6 is an illustration of the feature point detection scheme. We detect the edge of an object (Fig. 6, top) and the signal of gradient phase along this edge (Fig. 6, middle). Then, we extract the sharp transitions of this signal and, for each of them, we measure the two parameters A and σ. This is done through the study of the evolution of the wavelet transform modulus along maxima lines. (Obviously, there is one maxima line per transition detected on the gradient phase signal.) We have represented the ratio between A and σ, which is an estimation of the local curvature (Fig. 6, bottom). For illustration, some feature points detected in this example are marked both on the object and on the curvature signal.

2.4. Selection of points according to the local curvature

The main advantage of our detection scheme is the ability to estimate the local curvature (in addition to the singularity type selection). This curvature information is not available in the algorithm proposed by Chen [10]. Moreover, the knowledge of the local curvature at each feature point allows an a posteriori selection between these

Fig. 6. Detection of feature points on an object (top): the signal of gradient direction and the estimated curvature A/σ are represented (middle and bottom). Some feature points are identified for illustration.


Fig. 7. Example of feature points selection as a function of the local curvature criterion. The curvature signal is segmented at different thresholds c: (a) c = 0.2; (b) c = 0.3; (c) c = 0.4.

points. We can choose to retain only points for which the curvature is above a given threshold. This curvature information therefore brings our scheme closer to the classical ones (detection on binary images) while keeping the advantages of the multiscale approach. An example of feature points selection as a function of the local curvature criterion is shown in Fig. 7.
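To make Sections 2.3 and 2.4 concrete, the closed-form two-scale estimation of Eqs. (12) and (13), the curvature estimate A/σ and the threshold selection can be sketched as follows (a minimal illustration assuming NumPy; the function names are ours, not the authors'):

```python
import numpy as np

def m_theory(s, sigma, A):
    """Theoretical modulus along the maxima line of a filtered step (Eq. (11))."""
    return A * s / (np.sqrt(2 * np.pi) * np.sqrt(s**2 + sigma**2))

def estimate_step(m1, m2, s1, s2):
    """Recover (sigma, A) from the modulus at two scales s1 < s2 (Eqs. (12)-(13))."""
    r2 = (m1 / m2) ** 2
    sigma2 = s1**2 * (1 - r2) / (r2 - s1**2 / s2**2)
    return np.sqrt(sigma2), m1 * np.sqrt(2 * np.pi) * np.sqrt(1 + sigma2 / s1**2)

def select_points(amplitudes, lengths, c):
    """Indices of the transitions whose curvature estimate A/sigma
    exceeds the threshold c (Section 2.4)."""
    curv = np.asarray(amplitudes) / np.asarray(lengths)
    return np.flatnonzero(curv > c)
```

With the scales used in Section 3 (s₁ = 4.0, s₂ = 4.4), feeding the theoretical moduli back into estimate_step recovers σ and A exactly; on real data the least mean square fit of s²/m²(s) against s² is preferable.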

3. Experimental results

In this section, we provide the results of two experiments conducted to test the efficiency of the proposed multiscale detection scheme. To illustrate the ability of the algorithm to detect feature points on grey level images, we apply it to two reproductions of famous paintings by Van Gogh and Kandinsky. The first test image is the famous Van Gogh painting "The Siesta" (Fig. 8). On this image, objects cannot be defined easily; therefore, the classical detection of feature points fails. But the proposed multiscale detection allows a correct detection of edges and of feature points along them, even if the interpretation of the obtained points is not obvious (Fig. 9). The other test is realized on an image of the Kandinsky painting "Signe" (Fig. 8). Unlike the previous image, the object edges are very clear and the definition of each edge is easy. This case is more representative of a realistic scene where objects have well defined edges. Therefore, it is important to test the efficiency of our algorithm on this image. The result obtained (Fig. 9) is quite good and the detected feature points have the expected localization. The edge detection is made at scale 2.5 (i.e. the standard deviation of the derivative of the Gaussian used as the basic wavelet is equal to 2.5). The estimation of the transition length, the amplitude and therefore the curvature is made at scales 4.0 and 4.4. These scales are chosen sufficiently high to avoid numerical noise on the gradient phase signal. Indeed, if these scales are lower, we detect shorter transitions in the signal and therefore points

Fig. 8. The two test images: "The Siesta" by Van Gogh (a) and "Signe" by Kandinsky (b).

corresponding to thinner details. The threshold used on the curvature signal is equal to 0.4 in both cases. These results are obtained for the following configuration: the painting reproductions are seen from a perpendicular viewpoint (we will refer to this position as the 0° viewpoint). These results are taken as reference results for the following study of the method robustness.

4. Robustness of the method

We have seen in the previous section that the proposed algorithm is able to detect feature points on grey level images even if the segmentation of the object is difficult. We now discuss the robustness of the method with respect to five parameters: additive noise, changes of the viewpoint (a zoom or a change of the view angle of the camera), the position of the light source and the rotation of the image. We call these parameters "transformation parameters". This set of parameters is chosen in order to test the ability of the method to detect the same feature points on the objects represented on a 2D image even if this image is seen from different viewpoints. The aim is to show that the detected feature points are


Fig. 9. Experimental results of feature points detection for the two test images. Feature points are represented with blank disks. These images are obtained for the 0° viewpoint.

representative of the object shape. If this result is reached, the use of the set of feature points is justified for many applications (recognition of an object in an image database, motion determination or pattern recognition). The experimental protocol used is the following. We acquire an image of the painting reproduction from a given viewpoint, we detect feature points on it, and we apply a numerical transform to these points in order to determine their localization in the coordinate system of the reference image. Then we can compare the sets of feature points detected for this image and for the reference image (0° viewpoint). The robustness of the method is given by the percentage of feature points still detected (i.e. feature points corresponding to the same detail of the object seen from different viewpoints). These tests are made for each of the transformation parameters and for both the Van Gogh and the Kandinsky paintings. The set of obtained results is a good estimator of the method robustness. In addition, we compare our method with the results obtained by other detection schemes of feature points [29].

4.1. Evaluation method

We evaluate the robustness of our method through the measurement of an efficiency coefficient. This coefficient is equal to the percentage of the total number of points of the first image which are still detected in the second image (this second image is obtained after the application of one transformation parameter). If we denote by (P_i) and (Q_i) the sets of feature points detected on the two different images of the same object, these sets should be

linked by a transformation matrix H between the two images. Theoretically, we have the following relation:

(P_i) = H(Q_i).   (14)

In real situations, even if the points are correctly detected, the points obtained by H(Q_i) are not strictly at the positions of (P_i). Therefore, we use a coefficient of efficiency defined by the following ratio:

(number of points Q_i for which distance(P_i, HQ_i) ≤ ε) / (number of points P_i).   (15)

We choose ε equal to √2 pixels. Then, this ratio is equal to the percentage of feature points which are still detected (in a 3×3 pixel neighborhood around the reference point) when we apply a transformation to the image (change of the viewpoint, change of the position of the light source, …). Obviously, an important task for the evaluation of the efficiency is the determination of the transformation matrix H. This task is a well-known problem in robotics. Indeed, the determination of the transformation matrix between two images of the same object is equivalent to the determination of the transformation matrix between the two cameras in stereoscopy. Therefore, we can suppose that two virtual cameras acquire the two images and we calibrate these cameras independently. The knowledge of the transformation between these two cameras gives us the apparent transformation of the object. To determine the transformation matrix, we use a standard calibration technique. The principle of this method is to calibrate each of the virtual cameras to


Fig. 10. Images and feature points detected at 40°.

obtain their positions and their orientations from the known positions of the 3D points of a calibration pattern [30]. Knowing these positions and orientations, a simple calculation gives the transformation matrix H (in the object coordinate system). We present below the results of the robustness evaluation for each transformation parameter.

4.2. Results of the robustness evaluation

4.2.1. Viewpoint

The first parameter tested is the change of viewpoint. We acquire a set of images of the same object (either the Van Gogh or the Kandinsky painting) with different orientations. The camera is fixed and the object is hung on a micrometric rotation and translation stage. The rotation ranges over [0°, 40°] in steps of 5°. We use the image acquired at 0° as the reference image and we compare the set of points detected on this image with the sets of points detected on the images at 5°, 10°, … up to 40°. Fig. 10 shows the images obtained at 40° for both the Van Gogh sequence and the Kandinsky sequence. The curves of Figs. 10 and 11 show the evolution of the efficiency coefficient for the two sequences. These results indicate that even for a change of viewpoint of 40°, around 35% of the feature points are still correctly detected. As soon as the viewpoint changes, the efficiency coefficient falls to 70%, and the value of the angle is not very important. Indeed, when we increase the viewpoint angle, the coefficient decreases, but this decrease is smaller than the initial one (this behavior is particularly true for the Van Gogh sequence). Moreover, we have a better efficiency for the Van Gogh

Fig. 11. Evolution of the efficiency of the feature point detection according to the viewpoint angle (䉬 Van Gogh sequence, * Kandinsky sequence).

sequence than for the Kandinsky one (which is theoretically simpler). Indeed, the detection on the Kandinsky painting of feature points which are not very representative (such as points along vertical lines of color change) explains this surprising behavior. If we retain only the points corresponding to transitions of large amplitude, we obtain a result similar to the Van Gogh case (with fewer detected points).

4.2.2. Scale
The second parameter tested is the scale change (i.e. zooming in and out on the object). To simulate this phenomenon, we change the focal length of the camera lens from 24 to 108 mm. Fig. 12 shows the images obtained for the extreme values for the Van Gogh sequence. The scale ratio between these two images is 4.5.

Fig. 12. Images and feature points detected for the extreme scales: 0.48 (a) and 2.2 (b).

Fig. 13. Evolution of the efficiency of the feature point detection according to a scale change (䉬 Van Gogh sequence, * Kandinsky sequence).

We choose a reference image in the middle of the sequence. Thus, we test the effects of both zooming in and zooming out. The following results (Figs. 12 and 13) are presented as a function of the scale ratio. The evolution of the efficiency coefficient expresses the sensitivity of the method to scale changes. Indeed, even though the results are somewhat better for the Kandinsky sequence than for the Van Gogh one, they are not very satisfying. As for the first parameter, we note a significant decrease for small changes around the reference image, followed by a less significant decrease. Beyond this first drop, the loss of efficiency is linked to the intrinsic change of the edge shape. Indeed, strong scale variations directly alter the shape of the objects. For example, in the case of a zoom out, the thin details of edges are erased and, consequently, the gradient phase signal along edges is not

the same. Thus, it is normal that only a few points are still correctly detected. However, even for a very large scale change, around 20% of the feature points are correctly detected in the difficult case of the Van Gogh sequence. Moreover, these points are those corresponding to high-amplitude transitions. Furthermore, compared with other methods (such as the Harris detector [18]), our result is slightly better [29]. This is probably due to the multiscale approach used here. We conclude that our method is more robust to scale change than the others, but the results are still not satisfying. A way to improve the results of our method is to retain as feature points only those which remain at a large scale. Indeed, if we increase the scale of edge detection, many details are erased along with the edges. Expressed in terms of the gradient phase signal, we eliminate the small transitions; only large variations remain. Obviously, these large transitions have a more robust localization than the small ones (which are more sensitive to noise). Therefore, the feature points detected from these large transitions have a more robust localization. For example, in the case of a zoom out, the details along the object edges are lost and therefore only the feature points detected at a large scale can still be detected. In this case, in order to preserve a good detection efficiency during the zoom out, we should detect only the points corresponding to important features and not thin details. This is achieved through the choice of a sufficiently high detection scale. To prove this effect, we increase the scale of feature point detection (6.0 instead of 4.0) for the reference image. For the other images (taken with a different zoom)

we still detect at the scale 4.0. This choice simulates the image database consultation application (the reference image is the one belonging to the database and the other images are the samples we try to recognize). For this application, it is legitimate to choose the detection scale for the reference image; on the other hand, all the samples should be processed in the same manner even if they are very different. The results obtained for the two test sequences are shown in Fig. 14. As expected, the detection of feature points at a large scale for the reference image improves the efficiency of the method, in particular in the zoom-out case. The average gain is around 20% for the Van Gogh test and 7% for the Kandinsky test. The improvement is greater for the Van Gogh painting because it contains more points corresponding to very thin details. These points, which disappear during the zoom out, are no longer detected at large scales, and therefore the efficiency of the detection scheme increases.

4.2.3. Scene lighting
We now test the effect of changes of the scene lighting. We distinguish two types of lighting change: a simple one, which is uniform over the entire scene, and a complex one, which corresponds to a variation of the lighting direction. We test both effects. We use a continuous light source whose power supply can be controlled (a halogen lamp). For the study of the effect of uniform variation, we acquire a sequence of images with different values of the lighting power and express the variation of the lighting through the variation of the average grey level. The lighting varies from 200 lux to around 5000 lux, corresponding to average grey levels of 57–210 for the Van Gogh sequence and 75–215 for the Kandinsky sequence. As in the previous case, the results are presented with respect to a reference image corresponding to an average lighting.
On the whole, these results are satisfactory, in particular for the Kandinsky sequence. Indeed, for this sequence,

more than 80% of the feature points are still detected even for a large variation of the average grey level. For the Van Gogh sequence, the results are not as good. In particular, when the average grey level is higher than the reference grey level, the efficiency is only 40%. This is mainly because of an overexposure of the scene (some objects disappear). The same type of phenomenon also explains the poor efficiency obtained for low lighting values (for the Van Gogh sequence). Indeed, the contrast is very weak (Fig. 15) and the edges of some objects become very fuzzy, almost invisible. The feature points of these objects are not detected. This phenomenon is less important for the Kandinsky sequence because the edges of its objects are better defined. Even if the contrast is weak, we can localize these edges and detect the corresponding points. In conclusion, the method gives good results as long as the lighting variation does not induce the apparent elimination of objects (due to over- or underexposure of the CCD array) (Fig. 16). We also study the effect of the direction of the light source. To test this effect, we move the light source along a portion of a circle around the object. The variation of the lighting direction ranges over [0°, 35°] in steps of 5°. Fig. 17 shows images of the Van Gogh and Kandinsky sequences for this complex variation of lighting. The light source position used as reference (called the "zero degree position") is the one used for the previous tests: the lighting direction and the object plane are almost perpendicular (the approximation is due to the impossibility of aligning the light source and the camera). As in the case of uniform variations, the results are good (and better for the Kandinsky sequence). The reason is the same: the objects in the Kandinsky painting are better defined and the effect of overexposure is smaller. We can always determine the object edges and therefore detect the feature points.
For the Kandinsky sequence, the efficiency of the detection scheme is around 75% for light source directions below 30°. For greater values of the light source direction, the

Fig. 14. Evolution of the efficiency of the feature point detection according to a scale change. Comparison between detection at a small scale and at a large scale: (a) Van Gogh sequence, 䉬 small scale, 䊐 large scale; (b) Kandinsky sequence, 䢇 small scale, * large scale.

Fig. 15. Images and feature points detected for the extreme values of uniform lighting change: 0.43 (a) and 1.57 (b).

Fig. 16. Evolution of the efficiency of the feature point detection according to uniform variations of lighting (䉬 Van Gogh sequence, * Kandinsky sequence).

light reflection becomes very important and the images are quite overexposed (Fig. 17). The effect of the light direction is more important for the Van Gogh sequence. In this case, the object edges are less clear and therefore more sensitive to overexposure effects. On the whole, for changes of the lighting direction which do not induce overexposed images (changes below 20°), the results are quite satisfying (Fig. 18).

4.2.4. Image rotation
The fourth parameter tested is the rotation of the image around the optical axis. It is very difficult to realize this rotation physically because of the poor determination of both the optical center and the optical axis. Therefore, we choose to simulate this transformation numerically: we rotate the image by image processing. In

this case, we do not have to calibrate the acquisition system since we know exactly the transformation between the two images, and we can therefore apply the inverse transformation to compare the sets of points. Owing to the symmetry of the rotation effect, we test only the influence of rotations with angles ranging over [0°, 90°], sampled in steps of 5°. The curves (Fig. 19) show the evolution of the efficiency coefficient for both the Van Gogh and the Kandinsky sequences. The two curves follow the same evolution: an initial decrease and then a stabilization around an efficiency coefficient of 80% for the Kandinsky sequence and 55% for the Van Gogh sequence. These results are very satisfying because the efficiency coefficient is essentially independent of the rotation angle. This indicates the robustness of the method with respect to this parameter. The percentage of feature points preserved during the transformation is linked to the image type: if the objects are well defined, many points are preserved (80% in the Kandinsky case); for blurred objects, on the contrary, points corresponding to very thin details are not preserved during the rotation.

4.2.5. Noise
Finally, the last parameter tested is the effect of additive noise. As in the previous case, we simulate this transformation numerically. We generate uniform noise with different amplitudes and add it to a reference image (either the Van Gogh or the Kandinsky painting). In this way, we obtain a sequence of images with different signal-to-noise ratios (SNR). Fig. 20 shows the two images obtained for the lowest values of
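Because the rotation is applied numerically, the inverse transformation is known exactly: the detected points of the rotated image can be mapped back by the known angle before the comparison. A minimal sketch of the point-set rotation involved (function and parameter names are our own):

```python
import math

def rotate_points(points, angle_deg, center):
    """Rotate 2D points about `center` by `angle_deg` (counter-clockwise)."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    cx, cy = center
    out = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        out.append((cx + c * dx - s * dy, cy + s * dx + c * dy))
    return out
```

Rotating by the test angle and then by its negative recovers the original coordinates up to floating-point error, which is why no calibration step is needed here.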

Fig. 17. Images and feature points detected with a light source direction of 35°.

Fig. 18. Evolution of the efficiency of the feature point detection according to complex variations of lighting (䉬 Van Gogh sequence, * Kandinsky sequence).

Fig. 19. Evolution of the efficiency of the feature point detection according to image rotation (䉬 Van Gogh sequence, * Kandinsky sequence).

the SNR in both cases. We compare the points detected on these images with those detected on the original image. The results (Figs. 20 and 21) are presented according to a signal-to-noise ratio defined as

SNR = 10 log (signal amplitude / noise amplitude).  (16)

The behavior of the detection scheme is identical for the Van Gogh and the Kandinsky sequences: the efficiency of the detection decreases as the SNR decreases. This behavior is foreseeable because the objects become more and more blurred as the noise amplitude increases. Therefore, the loss of some feature points is expected. The decrease is slower for the Kandinsky case, for which the object edges are well defined. In this case, even if the noise amplitude equals the signal amplitude, 60% of the feature points are still correctly detected. In the case of the Van Gogh sequence, the decrease is steeper. However, 40% of the points are still correctly detected for the noise levels studied (except for the last measurement: 25% of the points are still detected when noise and signal have the same amplitude). In conclusion, the addition of noise induces a decrease of the efficiency coefficient, but this coefficient remains sufficiently high to ensure the robustness of the detection scheme. (The points which are still detected are those corresponding to the main features of the images.)
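Assuming the logarithm of Eq. (16) is base 10 (the text does not specify), the noise generation and the SNR computation can be sketched as follows; the function names and the zero-mean noise convention are our own choices:

```python
import math
import random

def snr_db(signal_amplitude, noise_amplitude):
    """Eq. (16): SNR = 10 log10(signal amplitude / noise amplitude), in dB."""
    return 10 * math.log10(signal_amplitude / noise_amplitude)

def add_uniform_noise(image, amplitude, seed=0):
    """Add zero-mean uniform noise in [-amplitude/2, amplitude/2] to a grey image."""
    rng = random.Random(seed)
    return [[px + rng.uniform(-amplitude / 2, amplitude / 2) for px in row]
            for row in image]
```

With this convention, equal signal and noise amplitudes (the last measurement above) correspond to 0 dB.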

4.3. Discussion of the results
We should also remark that the efficiency curves presented above are not strictly monotonic. This phenomenon is intrinsic to the measurement process. Indeed, we

Fig. 20. Images and feature points detected for the lowest SNR.

Fig. 21. Evolution of the efficiency of the feature point detection according to noise contamination (䉬 Van Gogh sequence, * Kandinsky sequence).

always compare the results obtained for the current values of the transformation parameters with the reference image. For example, for the study of the viewpoint effect, we compare the results obtained from the 5°, 10°, … up to 40° viewpoints with the 0° viewpoint. A feature point which has disappeared, for example, in the image at 35° can reappear in the image at 40°, and this induces the oscillations of the efficiency curves. The fact that a feature point can disappear in one image and reappear in the next is linked both to the calibration process and to the detection scheme. Indeed, the calibration process introduces some errors, and it is possible that the two calibrated images are not exactly in correspondence but differ by 1 or 2 pixels. This first phenomenon is particularly important for the rotation effect. On the other hand, the shape of the object edges is altered

by the transformation of the image. Therefore, the gradient phase signal extracted from an edge can be very different from one test to another, and different feature points will be detected. This second phenomenon is particularly important for the change of viewpoint and the zoom-in or zoom-out effects. However, the oscillations of the efficiency curves remain small compared with their main evolution, and the average variations express the behavior of the feature point detection with respect to the tested set of parameters. In order to smooth out the perturbations around the average variations, we can apply a statistical approach: measuring the effect of the same parameter many times and averaging all the efficiency curves would give a more robust evaluation of the detection scheme.

4.4. Comparison with other methods
In this section, we compare the results presented above with those of other feature point detection schemes. In [29], Schmid tested many feature point detection algorithms (in particular those of Harris [18], Forstner [31], and Horaud [32]). The experimental protocol used for these tests is very close to ours: the same set of parameters is addressed and the ranges of values tested for each parameter are comparable. Moreover, these tests were also carried out on images of the Van Gogh painting. Therefore, the comparison of the different results is significant. In order to simplify this comparison, we compare our results only with the best of all the techniques tested by Schmid: an improvement of the

Fig. 22. Comparison between multiscale detection of feature points using wavelets (䉬) and Harris detector (*) for the following parameters: (a) viewpoint, (b) scale, (c) uniform lighting, (d) complex lighting, (e) image rotation.

Harris detector. Briefly, this detection scheme is based on the study of the following matrix:



M = exp(−(x² + y²)/2σ²) ⊗ [ I_x², I_x I_y ; I_x I_y, I_y² ].  (17)

In this expression, I represents the grey value function and the subscript x or y denotes the derivative of this function in the x- or y-axis direction. If the eigenvalues of this matrix are high, the point is retained as a feature point. The improvement of the Harris detector consists in smoothing the grey level function before computing the derivatives; the first derivative is then less sensitive to noise and the results obtained are more robust. (The smoothing function used is a Gaussian.) We discuss hereafter the efficiency of our method compared with the smoothed Harris detector (Fig. 22). On the whole, the results are quite similar. The robustness of the two methods to changes of viewpoint, scale, lighting, noise or rotation is almost equivalent. Moreover, for both techniques, many points are retained as representative of the object shapes and we can select among these points the most representative ones (through segmentation of the local curvature). However, with our approach we have access to a more precise characterization of the feature points: for each point, we know the amplitude and the length of the gradient phase transition. These parameters are useful.

For example, the knowledge of the length of the transition informs us about the smoothness of the object shape. If we examine the robustness of both methods in detail, we can draw the following conclusions. The changes of viewpoint and lighting direction are the worst cases for our detection scheme: the efficiency of the proposed method is slightly lower than that of the smoothed Harris detector. However, the calibration process used by Schmid is not identical to ours, and we cannot separate the effects intrinsic to the detection scheme from those of the calibration method. On the other hand, scale change is the parameter for which our method performs best relative to the other methods: although the results are not very good, they are better than the others (the average efficiency gain is around 10%). For the rotation parameter, there are some differences between the two methods: the results obtained with the Harris detector are better. However, we note that for rotations which are multiples of 22.5°, the results are identical. It therefore seems that the weaker efficiency of our algorithm is due to aliasing artifacts in our calibration process. Finally, the effect of uniform variation of the lighting on the results is very similar for both methods (if we consider only the non-overexposed cases). The last parameter to compare is the noise effect, but the test carried out by Schmid [29] is not conclusive. Indeed, the experimental protocol for this parameter was to acquire different images with no changes in the acquisition parameters and then to test whether the feature points have the same localization. Since the signal-to-noise ratio remains constant during this test, the effect of noise cannot be compared with our result. In conclusion of this comparison, the robustness of the proposed feature point detection scheme is equivalent to that of one of the best techniques proposed elsewhere.
It is advantageous to use the proposed algorithm in the case of object scale change. On the other hand, our method is more sensitive to changes of viewpoint and lighting direction.

5. Conclusion
We have presented in this paper a new scheme of feature point detection based on the behavior across scales of wavelet coefficients and on the extraction and characterization of edges. The proposed algorithm can detect feature points directly on grey level images, even when object segmentation is not easy (for example in the case of non-uniform lighting of the background). The main originality of our method comes from the study of the variations of a one-dimensional signal: the direction of the gradient vectors along edges. We select among all variations those corresponding to a step. This is done

through the study, along the maxima lines, of the logarithmic slope of the wavelet coefficient modulus. We then propose a characterization of the feature points by the length and the amplitude of the phase transition; the ratio between these two quantities gives an estimation of the local curvature. This characterization and the selection of feature points on the curvature criterion are the main advantages of our method over the classical detection schemes. (We can retain only the points of very high curvature among the whole set of feature points.) In the second part of this paper, we presented an experimental study of the robustness of the method against five parameters: noise, image rotation, viewpoint, light source and scale changes. The results obtained show that the proposed method is robust against image rotation, noise contamination, and variation of the lighting (uniform and/or variation of the direction of the light source). For the two other parameters, the results are not as satisfying: the efficiency of the detection scheme is lower for changes of scale or viewpoint. However, the efficiency is always higher than 40% for viewpoint transformation and 20% for scale change. Moreover, the detection of feature points at a greater scale improves this result: the average gain is around 7% for a "simple" image like the Kandinsky painting and 20% for a "complex" one like the Van Gogh painting. In addition, our results for the scale effect are better than those obtained by other detection schemes. On the whole, the robustness of the method is good (in particular when the object edges are well defined, like those of the Kandinsky painting). Therefore, it can be used as the first step of algorithms such as motion determination, pattern recognition or image database consultation.

References
[1] R. Talluri, J.K. Aggarwal, Mobile robot self location using model image feature correspondence, IEEE Trans. Robotics Autom. 12 (1) (1996) 63–77.
[2] K. Yu, X.Y. Jiang, H. Bunke, Robust facial profile recognition, IEEE Int. Conf. Image Process. 3 (1996) 491–494.
[3] J. Fayolle, C. Ducottet, T. Fournel, J.P. Schon, Motion characterization of unrigid objects by detecting and tracking feature points, IEEE Int. Conf. Image Process. 3 (1996) 803–806.
[4] J. Fayolle, C. Ducottet, J.P. Schon, Application of multiscale characterization of edges to motion determination, IEEE Trans. Signal Process. 46 (4) (1998) 1174–1179.
[5] L.S. Shapiro, J.M. Brady, Feature based correspondence: an eigenvector approach, Image Vision Comput. 10 (5) (1992) 283–288.
[6] I.J. Cox, S.L. Hingorani, An efficient implementation of Reid's multiple hypothesis tracking algorithm and its evaluation for the purpose of visual tracking, IEEE Trans. Pattern Anal. Mach. Intell. 18 (2) (1996) 138–150.

[7] I.K. Sethi, R. Jain, Finding trajectories of feature points in a monocular image sequence, IEEE Trans. Pattern Anal. Mach. Intell. 9 (1) (1987) 56–73.
[8] Q. Zheng, R. Chellappa, Automatic feature point extraction and tracking in image sequences for arbitrary camera motion, Int. J. Comput. Vision 15 (1995) 31–76.
[9] F. Attneave, Some informational aspects of visual perception, Psychol. Rev. 61 (3) (1954) 183–193.
[10] C.H. Chen, J.S. Lee, Y.N. Sun, Wavelet transformation for grey level corner detection, Pattern Recognition 28 (6) (1995) 853–861.
[11] C.H. Teh, R.T. Chin, On the detection of dominant points on digital curves, IEEE Trans. Pattern Anal. Mach. Intell. 11 (8) (1989) 859–872.
[12] A. Rosenfeld, E. Johnston, Angle detection on digital curves, IEEE Trans. Comput. C-22 (1973) 875–878.
[13] P.V. Sankar, C.V. Sharma, A parallel procedure for the detection of dominant points on a digital closed curve, Comput. Graphics Image Process. 7 (1978) 403–412.
[14] W.Y. Wu, M.J.J. Wang, Detecting the dominant points by the curvature-based polygonal approximation, CVGIP: Graphical Models Image Process. 55 (2) (1993) 79–88.
[15] M.J. Laboure, J. Azema, T. Fournel, Detection of dominant point on a digital curve, Acta Stereologica 11 (2) (1992) 169–174.
[16] L. Kitchen, A. Rosenfeld, Grey level corner detection, Pattern Recognition Lett. 1 (1982) 95–102.
[17] H. Moravec, Rover visual obstacle avoidance, Proceedings of the Seventh International Joint Conference on Artificial Intelligence, 1981, pp. 785–790.
[18] C. Harris, M. Stephens, A combined corner and edge detector, Proceedings of the Fourth Alvey Vision Conference, 1988, pp. 147–151.
[19] D. Reisfeld, H.J. Wolfson, Y. Yeshurun, Context free attentional operators: the generalized symmetry transform, Int. J. Comput. Vision 14 (1995) 119–130.

[20] B. Lucas, T. Kanade, An iterative image registration technique with an application to stereo vision, Proceedings of the Seventh International Joint Conference on Artificial Intelligence, 1981, pp. 674–679.
[21] S. Mallat, S. Zhong, Characterization of signals from multiscale edges, IEEE Trans. Pattern Anal. Mach. Intell. 14 (7) (1992) 710–732.
[22] S. Mallat, W.L. Hwang, Singularity detection and processing with wavelets, IEEE Trans. Inf. Theory 38 (2) (1992) 617–643.
[23] J. Canny, A computational approach to edge detection, IEEE Trans. Pattern Anal. Mach. Intell. 8 (6) (1986) 679–698.
[24] I. Daubechies, The wavelet transform, time frequency localization and signal analysis, IEEE Trans. Inf. Theory 36 (5) (1990) 961–1005.
[25] I. Daubechies, Ten Lectures on Wavelets, SIAM, Philadelphia, 1992.
[26] Y. Meyer, Ondelettes et opérateurs I: ondelettes, Hermann, Paris, 1990.
[27] A. Arneodo et al., Ondelettes, multifractales et turbulences, Diderot Editeur, Arts et Sciences, 1995.
[28] B. Torresani, Analyse continue par ondelettes, InterEditions/CNRS Editions, Paris, 1995.
[29] C. Schmid, Appariement d'images par invariants locaux de niveaux de gris, application à l'indexation d'une base d'objets, Ph.D. thesis, University of Grenoble, 1996.
[30] L. Riou, J. Fayolle, T. Fournel, PIV measurement using multiple cameras: the calibration method, Proceedings of the Eighth International Symposium on Flow Visualization, Sorrento, September 1998.
[31] W. Forstner, A framework for low level feature extraction, Proceedings of the Third European Conference on Computer Vision, 1991.
[32] R. Horaud, T. Skordas, F. Veillon, Finding geometric and relational structures in an image, Proceedings of the First European Conference on Computer Vision, 1990, pp. 374–384.

About the Author: JACQUES FAYOLLE was born in 1970 in France. He received his postgraduate diploma on "Images" in 1993 and the Ph.D. degree on "Image analysis and image processing" in 1996 from the Saint-Etienne University. He is currently a professor at the "Institut Universitaire Professionnalisant Télécommunications" and at the research laboratory "Traitement du Signal et Instrumentation-UMR CNRS 5516" at Saint-Etienne. His current research interests include the determination of motion and deformation. The image processing methods used are based on mathematical tools such as the continuous wavelet transform. The main principle of the motion determination algorithms is to track feature points of the scene. His current research addresses the problem of 3D motion determination. Other fields of interest are pattern recognition through feature point characterization, camera calibration, segmentation of fuzzy objects and, more generally, the applications of the continuous wavelet transform. The main applications of his research are the determination of displacements in turbulent flows and the quantification of structures on X-ray images.

About the Author: LAURENCE RIOU was born in Saint-Etienne, France in 1973. She graduated from the "Institut Supérieur des Techniques Avancées de Saint-Etienne" (ISTASE), France in 1996 and also received her postgraduate diploma on "Images" in 1996. Since October 1996, she has been a Ph.D. student at the "Traitement du Signal et Instrumentation" laboratory of the CNRS (UMR 5516), Saint-Etienne, France. Her research interests include the areas of image processing and computer vision applied to camera calibration and 3D motion determination.

About the Author: CHRISTOPHE DUCOTTET was born in Lyon, France, in 1967. He graduated from the Ecole Nationale Supérieure de Physique de Marseille, France, in 1990 and received the Ph.D. degree in image processing from Saint-Etienne University, France in 1994.
He is currently a Professor at the Institut Supérieur des Techniques Avancées de Saint-Etienne (ISTASE), France, and a researcher at the Traitement du Signal et Instrumentation (TSI) laboratory of the CNRS (UMR 5516), Saint-Etienne, France. His research interests are image processing applied to unrigid motion determination, 3D motion determination and the segmentation of fuzzy objects.