Pattern Recognition, Vol. 29, No. 9, pp. 1485-1493, 1996. Copyright © 1996 Pattern Recognition Society. Published by Elsevier Science Ltd. Printed in Great Britain. 0031-3203/96 $15.00+.00
PII: S0031-3203(96)00004-0
RESOLVING VIEW SENSITIVITY WITH SURFACE LOCALITY

XIAOBU YUAN† and SIWEI LU

Department of Computer Science, Memorial University of Newfoundland, St. John's, Newfoundland, Canada A1C 5S7
(Received 15 March 1994; in revised form 4 April 1995; received for publication 11 January 1996)

Abstract--If a three-dimensional (3D) object is defined under a coordinate system that depends on the viewing direction, its representation changes when the view changes. This is known as view sensitivity in object recognition. As an approach to solving the problem, this paper introduces an object modeling method that provides a view-insensitive 3D representation. Surface locality is investigated to define localized surfaces so that 3D objects can be described independently of viewing directions by their boundary surfaces and topological relations. The stability of this method is analysed and tested with range images. Copyright © 1996 Pattern Recognition Society. Published by Elsevier Science Ltd.

3D Representation    View sensitivity    Surface locality    Topological structure    Homogeneous transformation    Object recognition    Computer vision    Geometric uncertainty    Disambiguating features
1. INTRODUCTION
Object recognition is important in computer vision(1,2) because it enables a computer system to understand its environment by matching observed objects with the system's models.(3-5) A 3D object is a spatial representation(6,7) defined under a certain coordinate system,(8) with a translation vector ω and a rotation vector θ = (α, β, γ) specifying its relation with the universal coordinate system. Therefore, if there are N models in the system, the set of models ℳ is a group of ordered (object, translation, rotation) triples:

ℳ = { ⟨Ai, ωi, θi⟩ }, i = 0, 1, ..., N − 1,   (1)
where Ai is the representation of the ith object with position ωi and rotation θi. At an arbitrary viewing direction, the appearance of an object is a projection(9) under a translation ω'k and a rotation θ'k that relate the view with the universal system:

P(⟨Ak, ω'k, θ'k⟩).   (2)

Recognizing the object from this appearance is to identify an inverse mapping on the model set ℳ so that the following equation exists:

⟨Ak, ω'k, θ'k⟩ = P⁻¹(⟨Ai, ωi, θi⟩),   (3)
where ⟨Ai, ωi, θi⟩ ∈ ℳ, or to prove that such an inverse mapping does not exist. Identifying one particular appearance of an object from the model set ℳ is theoretically impossible. In addition to the part-to-whole confusion involved in projection,(10) the spatial relation between the two pairs (ω'k, θ'k) and (ωi, θi) in the mapping of equation (3) includes a transformation of six degrees of freedom. It

† Author for correspondence.
results in a number of combinations up to N × ℜ⁶, where ℜ is the set of real numbers. Different methods have been proposed to solve the problem in one way or another,(11-13) but a resolution that makes a one-to-one matching in equation (3) still needs more work.(2) Introduced in this paper is a method that uses localized surfaces and their topological relations to describe 3D objects, and thereby eliminates the obstinate effects of (ω'k, θ'k) and (ωi, θi) in object recognition.

2. LOCALIZED SURFACE PARAMETERS
In geometric surface modeling methods,(14) objects are represented by boundary surfaces. If the ith object has mi surfaces, its triple in equation (1) can be further broken down into a set of surface representations:

Ai = {Si(0), ..., Si(mi − 1)}.   (4)
As a result, object recognition now works on the following inverse mapping:

({Sk(0), ..., Sk(mk − 1)}, ω'k, θ'k) = P⁻¹(({Si(0), ..., Si(mi − 1)}, ωi, θi)).   (5)
In order to be transformation invariant, an object representation must remain the same under any rigid transformation. Consequently, the invariance of boundary surface models shifts to surface invariance,* i.e. the description of boundary surfaces and their topological relations should be invariant with respect to any transformation.
* A surface characteristic is transformation invariant if it does not change under any rigid transformations that do not affect the visibility of that surface, i.e. invariant if visible.(9)
2.1. Surface locality
In the sense of digitization, an object surface S is a set of surface points {(xl, yl, zl)}, l = 0, 1, ..., n − 1, in the universal coordinate system {Fv}.(15) When observed in a viewing direction, S becomes another point set Su = {(ui, vi, wi)} in the new coordinate system {Fu}. Let (ω, θ) be the transformation pair that relates {Fu} to {Fv}. The definition of surface S under {Fu} then depends on both ω and θ, i.e. Su = S(ω, θ). As the vectors change from (ω, θ) to (ω', θ') due to a changed viewing coordinate system {Fw}, each examined surface point (uk, vk, wk) is mapped into {Fw} as an element (u'k, v'k, w'k) of Sw. Since the observed surface point set is now S(ω', θ'), the surface description(16,17) obtained from it has a different format:

S(ω', θ') ≠ S(ω, θ).   (6)
Therefore, the choice of coordinate systems in which surfaces are defined plays a key role in surface invariance. Differential geometry has been successfully applied to image segmentation(12,18) because view-independent properties capture the spatial properties of surface shapes. Similarly, when surface invariance(19-21) is concerned, the coordinate systems chosen to define object surfaces should originate from surface characteristics that depend only on surface shapes. In a homogeneous system, if T(ω) is the translation matrix and R(θ) is the rotation matrix for {Fu}, the matrix product R(θ)T(ω) relates {Fu} to {Fv}. It is also true that R(θ')T(ω') relates {Fw} to {Fv}. As a result, {Fw} relates to {Fu} with a product of homogeneous transformation matrices, τ = T⁻¹(ω')R⁻¹(θ')R(θ)T(ω), which can be further simplified as:

τ = [ r00  r01  r02  t0 ]
    [ r10  r11  r12  t1 ]
    [ r20  r21  r22  t2 ]
    [ 0    0    0    1  ].

Consequently, the projected surface points on the two image planes are related by a homogeneous operation:

[ u'k ]   [ r00  r01  r02  t0 ] [ uk ]
[ v'k ] = [ r10  r11  r12  t1 ] [ vk ]
[ w'k ]   [ r20  r21  r22  t2 ] [ wk ]
[ 1   ]   [ 0    0    0    1  ] [ 1  ].   (7)
Since a surface centroid is the average surface point, the two surface centroids Cu and Cw for Su and Sw are derived from the following two equations:

Cu = ( (1/n) Σ ul, (1/n) Σ vl, (1/n) Σ wl )   (8)

Cw = ( (1/n) Σ u'l, (1/n) Σ v'l, (1/n) Σ w'l ),   (9)

where each sum runs over l = 0, 1, ..., n − 1. By substituting equation (7) into (9), Cw can be derived directly from the surface points of Su:

Cw = ( (1/n) Σ (r00 ul + r01 vl + r02 wl + t0),
       (1/n) Σ (r10 ul + r11 vl + r12 wl + t1),
       (1/n) Σ (r20 ul + r21 vl + r22 wl + t2) ).

From equation (8), the above equation can be further rewritten as:

[Cw 1]ᵗ = τ [Cu 1]ᵗ.   (10)
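The relation of equation (10) can be checked numerically: the centroid of a transformed point set equals the transformation applied to the original centroid. The following Python sketch uses made-up values throughout (NumPy, the rotation angle, the translation, and the random point set are all illustrative assumptions, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def homogeneous(R, t):
    """Build a 4x4 homogeneous matrix from a 3x3 rotation and a translation."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# A made-up viewing transformation tau: rotation about z plus a translation.
a = np.pi / 5
R = np.array([[np.cos(a), -np.sin(a), 0.0],
              [np.sin(a),  np.cos(a), 0.0],
              [0.0,        0.0,       1.0]])
tau = homogeneous(R, np.array([1.0, -2.0, 3.0]))

# Random surface points S_u and their images S_w under tau.
S_u = rng.normal(size=(50, 3))
S_w = S_u @ tau[:3, :3].T + tau[:3, 3]

# Equation (10): the centroid of the mapped points is tau applied to the
# centroid of the original points.
C_u = S_u.mean(axis=0)
C_w = S_w.mean(axis=0)
assert np.allclose(np.append(C_w, 1.0), tau @ np.append(C_u, 1.0))
```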
Equation (10) shows the one-to-one mapping relation of Cw and Cu. In particular, Cu is the real surface centroid of the surface primitive itself when {Fu} is the universal reference system. Therefore, all the surface centroids of observed surfaces in different viewing coordinate systems are mapped from the same real surface centroid. When a coordinate system {Fs} is set up on surface S so that its origin coincides with the surface centroid and its z axis points in the average surface normal direction, {Fs} is the same on surface S whether it is observed in {Fu} or {Fw}. The locally defined frame {Fs} is not only fixed at a viewpoint-invariant location of the surface, but its z axis indicates the visible direction of that surface as well. In addition, according to differential geometry there are other local spatial characteristics at the surface centroid that are transformation invariant, such as the maximum and minimum curvatures, and the spatial Euclidean distance to any point on the surface. When the x axis of {Fs} is chosen as a direction determined by these characteristics, and the y axis makes {Fs} a right-handed Cartesian system with the other two axes, the local coordinate system fixed on the surface is independent of viewing directions. If defined under such a coordinate system, the localized description S̄ of surface S inherits the view insensitivity as well.

2.2. Object models
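The construction of the local frame {Fs} described above can be sketched in code. The Python fragment below is an illustration only: the SVD stands in for the average-normal computation, and `x_hint` plays the role of the shape-derived x direction (curvature direction or centroid-to-edge distance); both names and the sample points are assumptions, not from the paper:

```python
import numpy as np

def local_frame(points, x_hint):
    """Sketch of a surface-local frame {Fs}: origin at the centroid, z along
    the surface normal (approximated by the least-variance direction of the
    point cloud), x from a shape-derived hint projected into the surface
    plane, and y completing a right-handed system (y = z x x)."""
    C = points.mean(axis=0)                  # surface centroid
    _, _, Vt = np.linalg.svd(points - C)     # principal directions
    z = Vt[-1]                               # approximate average normal
    x = x_hint - np.dot(x_hint, z) * z       # project hint into the plane
    x = x / np.linalg.norm(x)
    y = np.cross(z, x)                       # right-handed completion
    return C, np.column_stack([x, y, z])     # columns are the frame axes

# Illustrative planar patch (values made up for the demonstration).
pts = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0],
                [-1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.5, 0.5, 0.0]])
C, Rf = local_frame(pts, np.array([1.0, 0.0, 0.0]))
```

Because centroid, normal, and the hint direction all depend only on surface shape, the frame returned is the same no matter which viewing coordinate system the points were observed in.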
For the N object models of ℳ, let the mi surfaces of the ith object be Si(j), 0 ≤ j ≤ mi − 1 and 0 ≤ i ≤ N − 1. A local coordinate system {Fi(j)} can then be set up on each individual Si(j) so that a localized surface description S̄i(j) can be derived as discussed in the previous section.

Fig. 1. An example of local coordinate systems.

Fig. 2. The front and side view of a machinery part.

The z axis of {Fi(j)} is the visible direction of
Si(j), i.e. the average surface normal; the x axis is perpendicular to z, in the direction of the minimum/maximum curvature at the surface centroid Ci(j) for curved surfaces, or of the longest/shortest distance from Ci(j) to edge points for planar surfaces; and y = z × x. The origin of {Fi(j)} is located at Ci(j), unless Si(j) has a symmetric axis, in which case the representation of S̄i(j) is simpler if {Fi(j)} is on the axis. Furthermore, the topological structure of the object can be specified by the spatial relationships of the boundary surfaces, or the topological relations between local coordinate systems. In the modeling method, rather than explicitly defining the relation between each pair of local coordinate systems, an object coordinate system {Fi} is introduced. When the spatial relation from each individual local coordinate system {Fi(j)} to {Fi} is defined with a homogeneous transformation matrix τi(j), the topological relation between any two local systems {Fi(v)} and {Fi(u)} is simply a matrix product of their homogeneous matrices, i.e. τi(v)⁻¹τi(u). Since {Fi} is used only as a reference to relate local coordinate systems, it can be set up arbitrarily, for example as the average of all local systems. In consequence, the view-sensitive boundary model ⟨{Si(0), ..., Si(mi − 1)}, ωi, θi⟩ in equation (5) is changed to a set of transformation-invariant localized surface parameters that include localized surface descriptions {S̄i(j)} and homogeneous transformation matrices {τi(j)}:
Mi = { ⟨τi(0), S̄i(0)⟩, ..., ⟨τi(mi − 1), S̄i(mi − 1)⟩ }.   (11)

The basic object structure defined with localized surface parameters is shown in Fig. 1. While the elimination of ωi and θi from equation (5) results in the view-independent surface description S̄i(j), the introduction of the homogeneous matrix τi(j) provides a convenient processing method for spatial operations. They both play important roles in view-insensitive object recognition.

As an example, a machinery part is shown with parameter labels* in Fig. 2. It is one component of a three-part bearing support. The other two include a sided pole on which the bearing is placed and another base that houses the pole together with the shown component. The boundary of the shown object consists of 21 pieces of planar and curved surfaces. When 21 local coordinate systems are defined according to the previous discussion, the object is described in localized surface parameters as follows:

Mt = { ⟨τ(0), S̄(0)⟩, ..., ⟨τ(20), S̄(20)⟩ },   (12)

where the first two localized surface descriptions are:

S̄(0) = ⟨V0(0), V1(0), V2(0), V3(0)⟩
     = ⟨⟨15.62, 0.0, 0.0⟩, ⟨2.82, 15.36, 0.0⟩, ⟨−15.62, 0.0, 0.0⟩, ⟨−2.82, −15.36, 0.0⟩⟩

S̄(1) = ⟨⟨−4.47, 1.77, 0.0⟩, ⟨−0.48, −3.98, 0.0⟩, ⟨5.27, 0.0, 0.0⟩, ⟨4.13, 1.64, 0.0⟩, ⟨0.02, −1.20, 0.0⟩, ⟨−4.47, 1.77, 0.0⟩⟩.

* In the section showing experiment results these parameters are instanced to particular values and also scaled in range images by sampling. The parameter values are given in the two lists: H0-10 = {3, 4, 12, 13, 9, 7, 2, 6, 2, 24} and R0-2 = {7, 5, 4}.
The matrices relating the two local coordinate systems to {F} are τ0 and τ1:

τ0 = [ 0.0    0.0    1.0  −2.06 ]
     [ 0.640 −0.768  0.0   2.11 ]
     [ 0.768  0.640  0.0  −5.44 ]
     [ 0.0    0.0    0.0   1.0  ]

τ1 = [ −0.600 −0.800  0.0  −2.06 ]
     [  0.800 −0.600  0.0  −7.89 ]
     [  0.0    0.0    1.0  11.56 ]
     [  0.0    0.0    0.0   1.0  ].
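The topological relation between these two local systems is the matrix product τ0⁻¹τ1, which can be computed directly from the values above (an illustrative Python check; NumPy is an assumption):

```python
import numpy as np

# tau_0 and tau_1 as listed above for the two local coordinate systems.
tau0 = np.array([[0.0,    0.0,   1.0, -2.06],
                 [0.640, -0.768, 0.0,  2.11],
                 [0.768,  0.640, 0.0, -5.44],
                 [0.0,    0.0,   0.0,  1.0]])
tau1 = np.array([[-0.600, -0.800, 0.0, -2.06],
                 [ 0.800, -0.600, 0.0, -7.89],
                 [ 0.0,    0.0,   1.0, 11.56],
                 [ 0.0,    0.0,   0.0,  1.0]])

# Topological relation between the two local systems; Section 3 compares
# exactly this kind of product when matching surfaces.
rel = np.linalg.inv(tau0) @ tau1
```

Because the object frame {F} cancels out in the product, `rel` is independent of where {F} was placed, which is what makes the relation usable for view-insensitive matching.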
3. OBJECT RECOGNITION
The major problem in identifying an observed object from a set of object models is that the matching between an arbitrary appearance of that object and the model set suffers from a transformation of six degrees of freedom. However, when modeled with localized surface parameters, neither the observed object nor the system's models are transformation-related triples, but sets of view-insensitive matrix-surface pairs, as indicated in equation (11). Accordingly, view-sensitive object recognition is shifted to an inverse mapping defined in the following:

{ ⟨τk(0), S̄k(0)⟩, ..., ⟨τk(mk − 1), S̄k(mk − 1)⟩ } = P⁻¹( { ⟨τi(0), S̄i(0)⟩, ..., ⟨τi(mi − 1), S̄i(mi − 1)⟩ } ).   (13)

The disappearance of (ω'k, θ'k) and (ωi, θi) from equation (5) makes object recognition with the new representation independent of viewing direction.

3.1. Matching models with object

A system model is identical to the object only if they have the same shape. In terms of boundary surface representation, the surfaces of the model and the object must have exactly the same spatial form and topological relations. To check whether model surfaces and object surfaces match in spatial form, since the surface descriptions on the left and right sides of equation (13) are independent of viewing directions, the locally defined surfaces S̄k(l) and S̄i(j) are compared directly, where 0 ≤ l < mk, 0 ≤ j < mi, and 0 ≤ i < N. As only surfaces are concerned, model Mi can be simplified into a set of localized surfaces:

{ S̄i(0), ..., S̄i(j), ..., S̄i(mi − 1) }.

The simplification also applies to the observed object: { S̄k(0), ..., S̄k(l), ..., S̄k(mk − 1) }. As a result, the problem of surface matching is re-expressed as a "set operation" that defines the subset relation between the two surface sets. In other words, each of the object surfaces must have at least one model surface that has the same spatial form; otherwise, no matching exists:

{ S̄k(0), ..., S̄k(mk − 1) } ⊆ { S̄i(0), ..., S̄i(mi − 1) }.   (14)

On the other hand, the topological relation between any pair of surfaces is specified by the relative spatial position and orientation between their local coordinate systems. As homogeneous transformation matrices have been used to relate local systems with an object system, the topological relation between any pair of local systems can be easily obtained from a matrix operation. For instance, the topological relation of two object surfaces S̄k(h) and S̄k(g) comes from the result of multiplying the inverse of τk(h) with τk(g), i.e. τk(h)⁻¹τk(g) for g ≠ h and 0 ≤ g, h ≤ mk − 1. Similarly, τi(v)⁻¹τi(u) defines the relation between two model surfaces S̄i(v) and S̄i(u). If S̄i(v) matches with S̄k(h) and S̄i(u) matches with S̄k(g) in equation (14), τi(v)⁻¹τi(u) must be the same as τk(h)⁻¹τk(g) for them to satisfy the requirement of identical topological relationship. Since this condition applies to all the model surfaces, after taking out those surfaces that do not match with object surfaces either in spatial form or in topological relation, the object is deemed to be the model if the following one-to-one mapping exists. If there is no system model that provides the unique mapping, the object is different from any model:

{ S̄k(0), ..., S̄k(g), ..., S̄k(h), ..., S̄k(mk − 1) }
        ↕   τk(h)⁻¹τk(g) = τi(v)⁻¹τi(u)
{ S̄i(0), ..., S̄i(u), ..., S̄i(v), ..., S̄i(mi − 1) }.   (15)

3.2. Practical advantages

The identified model cannot be anything but the object, as their shapes are identical. Unfortunately, a perfect one-to-one mapping rarely exists in real applications. In most cases, only part of the object's features are presented because others are not obtainable at certain viewing directions. As a result, object recognition is practically a mapping from a partial object description to the model set, where m'k is less than mk, instead of equation (13):

{ ⟨τk(0), S̄k(0)⟩, ..., ⟨τk(m'k − 1), S̄k(m'k − 1)⟩ } = P⁻¹( { ⟨τi(0), S̄i(0)⟩, ..., ⟨τi(mi − 1), S̄i(mi − 1)⟩ } ).   (16)

Incomplete object features give no promise of either a one-to-one mapping or a correct identification. Multiple candidates may be selected in the former case, where the missing features cause ambiguous recognition. Alternatively, even if only one model meets the requirement, the object could be only a part of the model, which is termed a part-to-whole confusion. The two problems have been addressed with localized surface parameters. Ambiguous recognition is resolved by setting and checking a disambiguating feature among the selected models; and the part-to-whole confusion is clarified by back-projecting the model to verify that no surfaces are visible in the viewing direction except those that match with object
surfaces. Detailed discussion is available from reference (22).
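The two matching conditions can be sketched in code. The following Python fragment is a simplified illustration only: the surface comparison is a placeholder, the function and variable names are invented, and a real system would search over all candidate pairings rather than taking the first:

```python
import numpy as np

def same_surface(s1, s2, tol=1e-6):
    """Placeholder spatial-form test: a real system would compare localized
    surface descriptions (vertex lists, curvatures, radii, ...)."""
    return s1.shape == s2.shape and np.allclose(s1, s2, atol=tol)

def match_object_to_model(obj_surfaces, obj_taus, mdl_surfaces, mdl_taus,
                          tol=1e-3):
    """Sketch of the two conditions: equation (14), every object surface has
    a model surface of the same spatial form, and equation (15), matched
    pairs share identical topological relations tau(h)^-1 tau(g)."""
    # Condition (14): a candidate model surface for every object surface.
    pairing = []
    for s_o, t_o in zip(obj_surfaces, obj_taus):
        cands = [j for j, s_m in enumerate(mdl_surfaces) if same_surface(s_o, s_m)]
        if not cands:
            return None                      # unmatched surface: reject model
        pairing.append((t_o, cands))
    # Condition (15): pairwise topological relations for the first candidate
    # assignment (a full system searches all combinations).
    chosen = [(t_o, mdl_taus[c[0]]) for t_o, c in pairing]
    for to_g, tm_u in chosen:
        for to_h, tm_v in chosen:
            if not np.allclose(np.linalg.inv(to_h) @ to_g,
                               np.linalg.inv(tm_v) @ tm_u, atol=tol):
                return None                  # topological relations differ
    return chosen
```

Because τ(h)⁻¹τ(g) cancels any common rigid motion of the object frame, the test succeeds no matter how the observed object happens to be placed relative to the model.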
4. STABILITY ANALYSIS

As implied in equation (14), it is the identical descriptions of localized surfaces and their topological relations that make the direct comparison possible. Therefore, the stability of these descriptions is the main concern in the analysis. The description of a 3D feature is specified by a 3D coordinate vector x together with an n-dimensional parameter vector p:

f(x, p) = 0,  x ∈ ℜ³, p ∈ ℜⁿ.   (17)

For surfaces, the vector function f(·) is a family of parameterized features, such as all quadratic surfaces. Any specific value pi of p describes a particular instance in the feature family f(·), such as a sphere. Generally, a geometric model consists of a set of parameter vectors with particular values to specify its geometric features. The stability of a vision system depends directly on the tolerance when matching these parameter values.

Geometric uncertainty is defined with multivariate Gaussian distributions.(23,24) Given a mean vector μ and a covariance matrix Λ, the distribution predicts the probability of obtaining a particular parameter value pi as:

g(pi) = (1/√((2π)ⁿ det(Λ))) e^(−(1/2)(pi − μ)ᵗ Λ⁻¹ (pi − μ)),   (18)

where n is the dimension of the corresponding parameter vector p in equation (17). Suppose two parameter values in the same feature family f(·), pi from the model and pj from observed features, are obtained with the covariance matrices Λi and Λj. The attempt to compare the similarity of pi and pj involves an uncertainty tolerance within a maximum allowable distance.(24) The distance is normalized by a joint covariance Λij as:

NormDist(pi, pj, Λij) = (pi − pj)ᵗ Λij⁻¹ (pi − pj),   (19)

where Λij = Λi + Λj. NormDist(pi, pj, Λij) determines the likelihood that the two features are the same. For example, for a 90% confidence the distance must be no greater than 4, i.e. NormDist(pi, pj, Λij) ≤ 4, or in the 1D case, Abs(pi − pj) ≤ 2σ.

In addition, when a geometric feature pi is transformed to p'i because of a change in description format or reference coordinate system, the new covariance matrix Λ'i can be calculated from the original Λi by a Jacobian matrix that makes the transformation.(23) Suppose Jp and Jf are the Jacobians for the transformations of pi to p'i due to parameter and system changes, respectively. The new covariance Λ'i for the changed parameter vector is:

Λ'i = (Jpᵗ)⁻¹ Λi Jp⁻¹   (20)

in the new description format, or

Λ'i = Jf Λi Jfᵗ   (21)

in the new coordinate system. More specifically, the covariance matrix Λi has three independent deviations when a geometric feature Si is obtained from an image: σx, σy and σz in the three axis directions. The uncertainty inherited by the localized description is:

Λ'i = (Jpᵗ)⁻¹ Jf Λi Jfᵗ Jp⁻¹.   (22)

As an example, let the image function for an observed surface S be h(x). Due to sensor noise, the image function is corrupted with an additive random noise n(x). If there are 2^Nz range levels in an image and the resolution is 2^Nr × 2^Nr, the final measurement of surface S is an integer array f̂(i, j) determined by the equation:

f̂(i, j) = ⌊h(x)⌋Nr = ⌊h(x) + n(x)⌋Nr.   (23)

Investigating the three components of the coordinate vector x, the deviation of x and y caused by the digitization of i and j can be no greater than half a pixel, and the noise added to each pixel does not exceed 9.2 absolute range levels in practical applications.(9) This makes σx ≤ 0.5, σy ≤ 0.5 and σz ≤ 9.2. For simplicity, suppose there is no need to transform formats or reference systems when comparing Si and Sj, i.e. Λ'i = Λi and Λ'j = Λj. From equation (19), the distance between any parameter of Si and Sj should be no greater than 4 for a 90% confidence:

Δx² σx + Δy² σy + Δz² σz = 0.5(Δx² + Δy²) + 9.2Δz² ≤ 4.

If Δx² and Δy² take the maximum difference as 1.0, then the absolute value of Δz must be no greater than 0.6 range level. Or, if only quantization noise exists, i.e. σz = 0.0, the 2D distance of two features must be less than √8 pixels.
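The normalized distance of equation (19) is straightforward to implement. The sketch below treats the per-axis deviations quoted in the text as diagonal covariance entries for illustration; NumPy and the sample parameter vectors are assumptions, not values from the paper:

```python
import numpy as np

def norm_dist(p_i, p_j, cov_i, cov_j):
    """Normalized distance of equation (19) with joint covariance
    Lambda_ij = Lambda_i + Lambda_j."""
    d = np.asarray(p_i, dtype=float) - np.asarray(p_j, dtype=float)
    return d @ np.linalg.inv(cov_i + cov_j) @ d

# Per-axis deviations from the text (0.5, 0.5, 9.2), used here as diagonal
# covariance entries for illustration; the parameter vectors are made up.
cov = np.diag([0.5, 0.5, 9.2])
p_model = np.array([10.0, 20.0, 5.0])
p_observed = np.array([10.8, 19.4, 6.0])
d = norm_dist(p_model, p_observed, cov, cov)
accept = d <= 4.0   # the 90% tolerance used in the paper
```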
5. EXPERIMENT WITH RANGE IMAGES

The LSP method has been tested with range images. Two machine parts are demonstrated in this section as an indication of the usefulness of the LSP in conducting view-insensitive object recognition. Figure 3(a) is the range image of a sided pole. When housed by two angled bases, one of which has been illustrated in Fig. 2, the pole supports a rotating bearing. As indicated in equation (11), the LSP model of an object is a set of localized surface parameters, each of which is a pair of a local surface description and a homogeneous transformation matrix related with a boundary surface of the object. To extract boundary surfaces from the range image, normal vectors are computed first (Appendix A1). In addition, a Gaussian curvature
Fig. 3. Extracting the surfaces from another range image.

Fig. 4. Extracting the surfaces from a range image.
(K) and a mean curvature (H) are also calculated:

K = (fxx fyy − fxy²) / (1 + fx² + fy²)²

H = ((1 + fy²)fxx + (1 + fx²)fyy − 2 fx fy fxy) / (2(1 + fx² + fy²)^(3/2)),

where fxx, fxy and fyy are obtained from fx and fy. For instance, fxx is calculated from the following equation in a discrete case:

fxx = [fx(i + 1, j) − fx(i, j)] sz / sx.
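The curvature computation above can be sketched with finite differences. The Python fragment below approximates the derivatives with `numpy.gradient` (central differences) rather than the paper's one-sided differences; this is an implementation choice for the sketch, not the paper's exact scheme:

```python
import numpy as np

def mean_gaussian_curvature(f, sx=1.0, sy=1.0, sz=1.0):
    """Estimate Gaussian (K) and mean (H) curvature at each pixel of a range
    image f(i, j); sx, sy, sz are the sampling intervals."""
    fx, fy = np.gradient(f * sz, sx, sy)   # first partial derivatives
    fxx, fxy = np.gradient(fx, sx, sy)     # second partial derivatives
    _, fyy = np.gradient(fy, sx, sy)
    w = 1.0 + fx**2 + fy**2
    K = (fxx * fyy - fxy**2) / w**2
    H = ((1 + fy**2) * fxx + (1 + fx**2) * fyy - 2 * fx * fy * fxy) / (2 * w**1.5)
    return K, H

# A unit-sphere patch has K = 1 and |H| = 1 near its pole (illustrative check).
n, h = 41, 0.01
xs = (np.arange(n) - n // 2) * h
X, Y = np.meshgrid(xs, xs, indexing="ij")
K, H = mean_gaussian_curvature(np.sqrt(1.0 - X**2 - Y**2), h, h)
```

Checking the signs of K and H per pixel then yields the surface-type segmentation described below.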
By checking the signs of K and H at each pixel, a range image is then segmented into surfaces of up to nine different types.(9) Figure 3(a) involves three types of surfaces: planar, cylindrical and conic. While the equation of a planar surface can easily be derived from the cross product of two vectors on the plane, the geometric features representing cylindrical or conic surfaces can be derived from three points on the surfaces (Appendices A2 and A3). After edge detection, a propagation-shrinking process takes place on each isolated surface to recover surface representations in the viewing coordinate system. The process results in symbolically represented
spatial features.(25) The extracted surfaces are further localized to construct a partial model of the object.(26) While Fig. 3 shows the range image and extracted features, the LSP model of the object is given by Mp:

Mp = { ⟨τp(0), S̄p(0)⟩, ⟨τp(1), S̄p(1)⟩, ⟨τp(2), S̄p(2)⟩, ⟨τp(3), S̄p(3)⟩, ⟨τp(4), S̄p(4)⟩ }.

To check whether Mp is the same object as that defined by Mt in equation (12), every surface of Mp is first compared with all the surfaces of Mt, as required by equation (14). Take surface S̄p(2) as an example. It is a cylinder whose radius is 25 and whose height is 100. Although Mt has two cylindrical surfaces, neither shows so large a ratio between height and radius. The existence of an unmatched surface violates the primary condition for an object to be identical to a model. Therefore, the object captured in Fig. 3(a) cannot be the same as the model Mt. In comparison, the object in Fig. 4(a) involves the same three types of surfaces. Similar techniques are applied to the image to extract spatial features [Fig. 4(b)] and surface descriptions. As a result, another LSP model Mb is obtained:

Mb = { ⟨τb(0), S̄b(0)⟩, ..., ⟨τb(k), S̄b(k)⟩, ..., ⟨τb(14), S̄b(14)⟩ }.
In the verification of the conditions in equations (14) and (15) between Mb and Mt, not only the localized surface representations but also the topological relations are matched. For instance, the two surfaces S(0) and S(1) in Fig. 2 are now Sb(0) and Sb(1) after being extracted from Fig. 4:

S̄b(0) = ⟨VA(0), VB(0), VC(0), VD(0)⟩
      = ⟨⟨78.1, 0.0⟩, ⟨14.1, 76.8⟩, ⟨−78.1, 0.0⟩, ⟨−14.1, −76.8⟩⟩

S̄b(1) = ⟨VA(1), VB(1), VC(1), VD(1), VE(1), VF(1)⟩
      = ⟨⟨−25.0, 0.0⟩, ⟨−3.0, −21.0⟩, ⟨−1.0, 7.0⟩, ⟨19.0, −8.0⟩, ⟨−24.0, −7.0⟩, ⟨−16.0, −13.0⟩⟩,

and their transformation matrices are τb(0) and τb(1):

τb(0) = [ −0.684  −0.257   0.683  137.2 ]
        [  0.180  −0.967  −0.183  152.0 ]
        [  0.707  −0.002   0.707  110.8 ]
        [  0.0     0.0     0.0      1.0 ]

τb(1) = [ −0.602  −0.402  −0.690   90.6 ]
        [  0.796  −0.368  −0.481   68.3 ]
        [ −0.061  −0.838   0.542  133.0 ]
        [  0.0     0.0     0.0      1.0 ].
Although τb(0) ≠ τ0 and τb(1) ≠ τ1, which is due to the different setting of the object coordinate systems, their orientational relations in τ0⁻¹τ1 and τb(0)⁻¹τb(1) are actually the same. As for the translational relations, they are approximately the same when sampling is counted in. This makes the topological relation of S̄b(0) and S̄b(1) in Mb identical to the topological relation of S̄(0) and S̄(1) in Mt. In addition, taking sampling into consideration, S̄b(0) and S̄b(1) are the same as S̄(0) and S̄(1) correspondingly:

τ0⁻¹τ1 = [  0.512  −0.384  0.768   6.66 ]
         [ −0.615   0.461  0.640  18.57 ]
         [ −0.600  −0.800  0.0     0.0  ]
         [  0.0     0.0    0.0     1.0  ]

τb(0)⁻¹τb(1) = [  0.512  −0.384  0.768  33.29 ]
               [ −0.615   0.461  0.640  92.83 ]
               [ −0.600  −0.800  0.0     0.0  ]
               [  0.0     0.0    0.0     1.0  ].
Consequently, the satisfaction of both conditions leads to the conclusion that they are the same object. In case the captured object originates from the other supporting base, the representations of the two examined surfaces are still the same. However, as indicated by R(τ'b(0)⁻¹τ'b(1)), the rotational component of τ'b(0)⁻¹τ'b(1), they have a different topological relation. Therefore, it is distinguished from model Mt by the condition in equation (15):

R(τ'b(0)⁻¹τ'b(1)) = [  0.512   0.384  −0.768  0.0 ]
                    [ −0.615  −0.461  −0.640  0.0 ]
                    [ −0.600   0.800   0.0    0.0 ]
                    [  0.0     0.0     0.0    1.0 ].
6. CONCLUSIONS

LSP is a method to obtain view-insensitive 3D representations. By introducing local coordinate systems that are related with surface properties only, localized surfaces are obtained for view-insensitive object descriptions. In this way, observed objects can be compared directly with object models in spite of any spatial transformations. Fundamental ideas have been discussed in the paper. Further research will be directed to the application to complex objects.

Acknowledgement--The work is supported by the Natural Sciences and Engineering Research Council of Canada under Grant OGP0155411.

REFERENCES
1. V. S. Nalwa, A Guided Tour of Computer Vision. Addison-Wesley, New York (1993).
2. R. Chellappa and A. Rosenfeld, Current issues in computer vision, Sadhana-Academy Proc. Eng. Sci. 18, 149-158 (1993).
3. X. Yuan, A mechanism of automatic 3D object modeling, IEEE Trans. PAMI 17(3), 307-311 (1995).
4. D. Wilkes and J. K. Tsotsos, Active object recognition, IEEE Comput. Soc. Conf. CVPR, 136-141 (1992).
5. A. P. Pentland and R. C. Bolles, Learning and recognition in natural environments, Proc. SDF Benchmark Symp. Robot. Res. (1989).
6. P. J. Besl and N. D. McKay, A method for registration of 3D shapes, IEEE Trans. PAMI 14(2), 239-256 (1992).
7. D. Hearn and M. P. Baker, Computer Graphics, 2nd edn. Prentice Hall, New Jersey (1994).
8. S. G. Hoggar, Mathematics for Computer Graphics. Cambridge University Press (1992).
9. P. J. Besl, Surfaces in Range Image Understanding. Springer-Verlag, New York (1988).
10. M. Magee and M. Nathan, Spatial reasoning, sensor repositioning and disambiguation in 3D model based recognition, Spatial Reasoning and Multi-Sensor Fusion: Proceedings of the 1987 Workshop, pp. 262-271. Academic Press, New York (1987).
11. F. Arman and J. K. Aggarwal, Model-based object recognition in dense-range images--a review, Comput. Surv. 25(1), 5-43 (1993).
12. E. Barth, T. Caelli and C. Zetzsche, Image encoding, labeling, and reconstruction from differential geometry, CVGIP: Graphical Models Image Process. 55(6), 428-446 (1993).
13. F. Quek, R. Jain and T. E. Weymouth, An abstraction-based approach to 3-D pose determination from range images, IEEE Trans. PAMI 15(7), 722-736 (1993).
14. J. Foley, A. van Dam, S. Feiner and J. Hughes, Computer Graphics: Principles and Practice, 2nd edn. Addison-Wesley, New York (1992).
15. M. Mantyla, An Introduction to Solid Modeling. Computer Science Press, Rockville, MD (1988).
16. R. Figueiredo and H. D. Tagare, Curves and surfaces in computer vision, in Curves and Surfaces in Computer Vision and Graphics (Proceedings of SPIE), pp. 10-16, Santa Clara (1990).
17. J. Warren and S. Lodha, Free-form quadric surface patches, in Curves and Surfaces in Computer Vision and Graphics (Proceedings of SPIE), pp. 30-40, Santa Clara (1990).
18. P. Liang and J. S. Todhunter, Representation and recognition of surface shapes in range images: a differential geometry approach, Computer Vision, Graphics, and Image Processing 50, 77-109 (October 1990).
1492
X. YUAN and S. LU
19. J. B. Burns, R. S. Weiss and E. M. Riseman, View variation of point-set and line-segment features, IEEE Trans. PAMI 15(1), 51-68 (1993). 20. H. Bunke and T. Glauser, Viewpoint independent representation and recognition of polygonal faces in 3-D, IEEE Trans. Robot. Automat. 9(4), 457-463 (1993). 21. R. L. Stevenson and E. J. Delp, Viewpoint invariant recovery of visual surfaces from sparse data, IEEE Trans. PAMI 14(9), 897-909 (1992). 22. X. Yuan, Object recognition based on localized surface parameters, Proc. AMSE Int. Conf. lnf Process. Methodol. Appl pp. 71-80. Orlando, Florida (1993). 23. R. Smith, M. Self and P• Cheeseman, Uncertain geometry in robotics, Proc. Intl. Conf. Robot. Automat. 850-856 (1987). 24. J. L. Crowley and F. Ramparany, Mathematical tools for representing uncertainty in perception, Spatial Rea-
APPENDIX

A1. Normal vector computation

Suppose (x, y) is a pixel in the range image of a surface. Its four neighboring pixels are (x + Δx, y), (x - Δx, y), (x, y + Δy) and (x, y - Δy), where Δx and Δy are the increments in the x and y directions. If the image value f(x, y) is the depth of the surface at (x, y), the surface in a neighborhood around (x, y) can be approximated piecewise by four triangular planar surfaces whose vertex projections on the x-y plane are listed below:

upper left triangle:    (x, y), (x - Δx, y), (x, y - Δy);
upper right triangle:   (x, y), (x + Δx, y), (x, y - Δy);
bottom left triangle:   (x, y), (x - Δx, y), (x, y + Δy);
bottom right triangle:  (x, y), (x + Δx, y), (x, y + Δy).

Let V_ul, V_ur, V_bl and V_br denote the normal vectors of the four triangles, respectively. Consider the normal of the bottom right triangle, V_br. The slope of the side along the x axis of the triangle is given by f_x:

    f_x = Δf/Δx = [f(x + Δx, y) - f(x, y)]/Δx.

Similarly, the slope of the side along the y axis is given by f_y:

    f_y = Δf/Δy = [f(x, y + Δy) - f(x, y)]/Δy.

The sloped grid segments can then be represented by two vectors V_x and V_y:

    V_x = (1, 0, f_x),
    V_y = (0, 1, f_y).

Thus, the normal vector V_br of the bottom right triangle is determined by the cross product of V_x and V_y:

    V_br = V_x × V_y = (1, 0, f_x) × (0, 1, f_y) = (-f_x, -f_y, 1).

After normalization, V_br takes the following form as a unit normal vector:

    (-f_x, -f_y, 1) / sqrt(1 + f_x² + f_y²).

The other three unit normal vectors V_ul, V_ur and V_bl can be found in a similar manner. Hence, from the depth information of the grid points, the surface normal V_n is derived as the average of the four vectors:

    V_n = (V_br + V_bl + V_ur + V_ul)/4.

Since a range image is a discrete function that consists of M rows and N columns of pixels, the depth is an integer function f(i, j), where i and j are also integers, 0 ≤ i ≤ M and 0 ≤ j ≤ N. If the sampling intervals along the three axes are s_x, s_y and s_z, the slope computation needs a modification:

    f_x = Δf/Δx = [f(i, j + 1) - f(i, j)] s_z / s_x,
    f_y = Δf/Δy = [f(i + 1, j) - f(i, j)] s_z / s_y.

The sloped grid segment vectors are also updated correspondingly:

    V_x = (s_x, 0, f_x),
    V_y = (0, s_y, f_y).

The normal vector in the discrete case is therefore

    V_br = V_x × V_y = (s_x, 0, f_x) × (0, s_y, f_y) = (-f_x s_y, -f_y s_x, s_x s_y),

and its unit normal vector is then computed from the following equation:

    (-f_x s_y, -f_y s_x, s_x s_y) / sqrt((f_x s_y)² + (f_y s_x)² + (s_x s_y)²).
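As a concrete illustration, the discrete four-triangle estimate can be sketched in Python; the function name, the row-major list-of-lists image layout, and the assumption that (i, j) is an interior pixel are illustrative choices, not part of the paper:

```python
import math

def pixel_normal(f, i, j, sx=1.0, sy=1.0, sz=1.0):
    """Estimate the unit surface normal at interior pixel (i, j) of a
    range image f (row-major list of lists) by averaging the unit
    normals of the four neighboring triangles."""
    def tri_normal(fx, fy):
        # V_x x V_y = (sx, 0, fx) x (0, sy, fy) = (-fx*sy, -fy*sx, sx*sy)
        n = (-fx * sy, -fy * sx, sx * sy)
        mag = math.sqrt(n[0] ** 2 + n[1] ** 2 + n[2] ** 2)
        return tuple(c / mag for c in n)

    # Slopes toward each neighbor, scaled by the sampling intervals.
    fx_pos = (f[i][j + 1] - f[i][j]) * sz / sx   # toward +x
    fx_neg = (f[i][j] - f[i][j - 1]) * sz / sx   # toward -x
    fy_pos = (f[i + 1][j] - f[i][j]) * sz / sy   # toward +y
    fy_neg = (f[i][j] - f[i - 1][j]) * sz / sy   # toward -y

    # One triangle per quadrant around (i, j); average and renormalize.
    tris = [tri_normal(fx_pos, fy_pos), tri_normal(fx_neg, fy_pos),
            tri_normal(fx_pos, fy_neg), tri_normal(fx_neg, fy_neg)]
    avg = tuple(sum(c) / 4.0 for c in zip(*tris))
    mag = math.sqrt(sum(c * c for c in avg))
    return tuple(c / mag for c in avg)
```

On a planar patch all four triangle normals coincide; a constant-depth image, for instance, yields the normal (0, 0, 1).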
A2. Axis computation for cylindrical surface

Let the axis of a cylindrical surface be represented by a unit vector V_a = (x_a, y_a, z_a). Suppose (x_i, y_i, z_i), i = 1, 2, 3, are three points on the surface such that no two points lie on a line parallel to the axis. A rotational transformation matrix A can be found from the relation A · V_a = (0, 0, 1); with r = sqrt(x_a² + y_a²),

        | x_a z_a / r    y_a z_a / r    -r  |
    A = |    -y_a / r       x_a / r      0  |
        |     x_a            y_a        z_a |

Applying matrix A to the three points on the surface transforms their coordinates into (x'_i, y'_i, z'_i) in the new coordinate system, i = 1, 2, 3. The radius of the cylindrical surface can then be determined from the projection of the transformed points on the x'-y' plane. Since z' = 0 on the projection plane, the projected axis is (x'_a, y'_a). Substituting the coordinates of the three projected points into the equation of the projected circle results in three equations:

    (x'_1 - x'_a)² + (y'_1 - y'_a)² = R²,
    (x'_2 - x'_a)² + (y'_2 - y'_a)² = R²,
    (x'_3 - x'_a)² + (y'_3 - y'_a)² = R².

As a result, the three unknowns x'_a, y'_a and R can be determined by solving this group of equations.
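Subtracting pairs of these circle equations cancels R² and leaves a 2×2 linear system in the centre coordinates, which Cramer's rule solves directly. A Python sketch of this step (the helper name and the (x, y) tuple format are assumptions):

```python
import math

def circle_through(p1, p2, p3):
    """Centre (xc, yc) and radius R of the circle through three points
    in the projection plane, as needed to locate the cylinder axis
    after the rotation A has aligned it with the z axis."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # Subtracting pairs of circle equations removes R^2 and leaves a
    # linear 2x2 system a*xc + b*yc = c.
    a1, b1 = 2.0 * (x1 - x2), 2.0 * (y1 - y2)
    c1 = x1 ** 2 - x2 ** 2 + y1 ** 2 - y2 ** 2
    a2, b2 = 2.0 * (x1 - x3), 2.0 * (y1 - y3)
    c2 = x1 ** 2 - x3 ** 2 + y1 ** 2 - y3 ** 2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        raise ValueError("the three points are collinear")
    xc = (c1 * b2 - c2 * b1) / det
    yc = (a1 * c2 - a2 * c1) / det
    return xc, yc, math.hypot(x1 - xc, y1 - yc)
```

The collinearity guard matters: three points that project onto one line determine no circle, which corresponds to the paper's requirement that no two sample points lie on a line parallel to the axis.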
A3. Axis computation for conic surface

Suppose the normal vectors at three points P_i on the conic surface are V_i = (x_i, y_i, z_i), i = 1, 2, 3, and no two of the points lie on a line passing through the top vertex. Let the axis of the conic surface be denoted by V_a = (x_a, y_a, z_a). When V_a, V_1, V_2 and V_3 are all unit vectors, every surface normal makes the same angle with the axis, so they must satisfy the following equations when the three points are on the conic surface:

    x_1 x_a + y_1 y_a + z_1 z_a = x_2 x_a + y_2 y_a + z_2 z_a = x_3 x_a + y_3 y_a + z_3 z_a,
    x_a² + y_a² + z_a² = 1.

Again, V_a = (x_a, y_a, z_a) can be obtained by solving the above equations.
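Subtracting pairs of the equal dot products shows that V_1 - V_2 and V_1 - V_3 are both orthogonal to V_a, so their cross product gives the axis direction up to sign. A Python sketch of this observation (the function name is illustrative):

```python
import math

def cone_axis(n1, n2, n3):
    """Axis direction of a cone from the unit surface normals at three
    points. Each normal makes the same angle with the axis, so the
    differences n1 - n2 and n1 - n3 are orthogonal to the axis and
    their cross product recovers it (up to sign)."""
    d1 = tuple(a - b for a, b in zip(n1, n2))
    d2 = tuple(a - b for a, b in zip(n1, n3))
    # Cross product d1 x d2 is parallel to the axis.
    cx = d1[1] * d2[2] - d1[2] * d2[1]
    cy = d1[2] * d2[0] - d1[0] * d2[2]
    cz = d1[0] * d2[1] - d1[1] * d2[0]
    mag = math.sqrt(cx * cx + cy * cy + cz * cz)
    if mag < 1e-12:
        raise ValueError("degenerate normals: points violate the "
                         "no-common-generator condition")
    return (cx / mag, cy / mag, cz / mag)
```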
About the Author--XIAOBU YUAN received the B.S. degree in computer science from the University of Science and Technology of China in 1982 and the M.S. degree in computer science from the Institute of Computing Technology, Academia Sinica, in 1984. He received the Ph.D. degree in computer science from the University of Alberta, Canada, in 1993. Since January 1993 he has been an Assistant Professor in the Department of Computer Science, Memorial University of Newfoundland, Canada. His research interests include 3D representation, computer vision, interactive computer graphics, artificial intelligence and software engineering.
About the Author--SIWEI LU was born in Jiangsu, China. He graduated from the Department of Electrical Engineering, Tsinghua University, Peking, China, in 1967, and received the M.S. and Ph.D. degrees from the Department of Systems Design Engineering, University of Waterloo, in 1982 and 1986, respectively. He was a Visiting Assistant Professor in the Department of Computer Science, Concordia University, Montreal, Canada. He is an Associate Professor in the Department of Computer Science, Memorial University of Newfoundland, Canada. He is a Senior Member of the IEEE. His present research interests include image processing, computer vision, artificial intelligence, neural networks and pattern recognition.