Can multiple views make up for lack of camera registration?
Harit P Trivedi
It is known that the epipolar lines in stereo images can be determined from images themselves even when neither the camera registration nor the camera geometry is known; also that given relative orientation, relative translation can be obtained (and vice versa). In the process three equations are obtained for the four parameters of the (image) origin shifts, leaving at least one degree of freedom undetermined. If the origin shifts could be computed, however, then one could obtain the relative orientation and translation, whence depth. It could be argued that taking a third image into consideration would add at least three more equations while introducing only two more unknowns (the origin shift in the third image), thus making a total of six equations in the six unknowns and possibly yielding a unique solution. (More images could be added to overconstrain the problem and to compensate for measurement errors.) We show that this scheme would not work. The three equations in four parameters turn out to be equivalent to two equations and an identity. Thus each new image would add two more equations and as many unknowns, and the shortfall could never be made up. The considerations involved here also have a bearing on direct calculations of rotation, translation and origin shifts as, for example, when only degenerate data (e.g. all points lying on a plane etc.) are available.
Keywords: stereo imaging, unregistered images, camera geometry
BP Research Centre, Chertsey Road, Sunbury-on-Thames, Middlesex TW16 7LN, UK
0262-8856/88/06029904 $03.00 © 1988 Butterworth & Co. (Publishers) Ltd
vol 6 no 1 february 1988
It is possible to determine the epipolar lines in stereo images from the images themselves even when the origin (where the optic axis meets the image plane) and the orientation of the image coordinate systems are unknown, i.e. the camera registration information is unavailable. In addition, given the relative orientation of cameras, it is possible to calculate relative translation, and vice versa¹. In the process, three equations are obtained for the four parameters of the (image) origin shifts, leaving one degree of freedom undetermined and admitting infinitely many possible solutions. However, it would appear that by taking one more image into consideration, three more equations would be obtained at the cost of only two more unknowns (viz. the origin shift of the third image), thus making up for the shortfall in equations. Even more images could be considered in order to overconstrain the problem and to alleviate the effects of the inevitable measurement errors. The motivation for wanting to determine the origin shifts is provided by the simple but powerful fact that, given the latter, it is possible to compute the camera geometry (the location and the orientation of one camera with respect to the other), and hence depth, for all matched points in the images²,³ from image data alone.
There is also a more subtle reason for investigating these equations. The camera geometry is computed from the image data²,³ in two steps. First, a set of nine simultaneous linear homogeneous equations is solved subject to fixed normalization, the coefficient matrix coming solely from coordinate measurements on the images. The second step consists of interpreting the nine quantities so obtained to compute the camera geometry. For certain degenerate cases (all points on a plane etc.⁴), the linear equations become linearly dependent and cannot be solved, thus precluding the second step. In a previous paper¹ it was shown that in an arbitrary image coordinate system the same nine simultaneous linear homogeneous equations obtain, but the new unknowns are linear combinations of the old ones. This means that for degenerate cases, as before, the linear dependence would persist, the linear equations remaining unsolvable.
It turns out that if, instead of going via the two-step procedure, one solves directly for the rotation and translation, the degenerate cases pose no problems⁵. However, when the image origins are arbitrary, the relation governing corresponding pairs of points in the two images naturally involves not only the rotation and the translation, but also the origin shifts in the two images. Surprisingly, a straightforward attempt to solve directly for all these parameters (subject to the three equations mentioned at the beginning) fails. This failure suggests that the three aforementioned equations provide less than three constraints on the origin shifts. This paper shows that this is indeed the case. The three equations in four parameters turn out to be the equivalent of two equations and an identity. Thus each new image would add two more constraints and as many unknowns, and the shortfall could never be made up.
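The counting argument above can be tabulated with a short sketch. It assumes, as the minimal case of the text, that each new image is paired with one earlier image, so that n images give n - 1 pairs; the function name count_constraints and its parameters are illustrative and not part of the original work.

```python
def count_constraints(n_images, per_pair):
    """Return (equations, unknowns) for n_images views.

    Assumption: each new image is paired with one earlier image,
    giving n_images - 1 pairs, and every image carries one
    two-parameter origin shift.
    """
    pairs = n_images - 1
    equations = per_pair * pairs
    unknowns = 2 * n_images
    return equations, unknowns


for n in range(2, 7):
    naive_eq, unknowns = count_constraints(n, per_pair=3)   # three equations per pair
    actual_eq, _ = count_constraints(n, per_pair=2)         # two equations and an identity
    print(n, unknowns, naive_eq, actual_eq, unknowns - actual_eq)
```

With three equations per pair the counts balance at three images (six and six); with only two independent equations per pair the deficit stays at two however many images are added.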
CONSTRAINT EQUATIONS ON THE ORIGIN SHIFTS
First, consider the image coordinate system in which the origin is taken to be at the point where the optic axis meets the image plane at a right angle. In this case, the image coordinates of a corresponding pair of points satisfy²,³ the equation

$$\sum_{i,j=1}^{3} x'_i Q_{ij} x_j = 0 \qquad (1)$$

where $x'_3 = x_3 = 1$. The primed coordinates refer to the right image and the unprimed to the left. The matrix $Q = RS$ is defined in terms of the rotation matrix $R$ and the antisymmetric matrix $S$ related to the translation vector $t$ by

$$S = \begin{bmatrix} 0 & t_3 & -t_2 \\ -t_3 & 0 & t_1 \\ t_2 & -t_1 & 0 \end{bmatrix}$$

Equation (1) provides a set of linear homogeneous simultaneous equations for the nine elements of Q. Rotation R and translation t can be computed from Q following References 2 and 3. When the left and the right image origins are shifted by $u = (u_1, u_2, 0)$ and $u' = (u'_1, u'_2, 0)$ respectively, the new image coordinates $\bar{x}$ and $\bar{x}'$ of corresponding points (the overbar marks coordinates measured from the shifted origins) obey an equation of the same form as Equation (1), viz.

$$\sum_{i,j=1}^{3} \bar{x}'_i Q'_{ij} \bar{x}_j = 0 \qquad (2)$$

where $\bar{x}'_3 = \bar{x}_3 = 1$, $x = u + \bar{x}$ and $x' = u' + \bar{x}'$. The matrices Q and Q' are related to each other by¹

$$Q' = \begin{bmatrix} Q_{11} & Q_{12} & Q_{13} + r \\ Q_{21} & Q_{22} & Q_{23} + s \\ Q_{31} + r' & Q_{32} + s' & Q_{33} + w + v \end{bmatrix} \qquad (3)$$

and

$$Q = \begin{bmatrix} Q'_{11} & Q'_{12} & Q'_{13} - r \\ Q'_{21} & Q'_{22} & Q'_{23} - s \\ Q'_{31} - r' & Q'_{32} - s' & Q'_{33} - w - v \end{bmatrix} \qquad (4)$$

Here

$$\begin{aligned}
r &= Q_{11}u_1 + Q_{12}u_2 = Q'_{11}u_1 + Q'_{12}u_2 \\
s &= Q_{21}u_1 + Q_{22}u_2 = Q'_{21}u_1 + Q'_{22}u_2 \\
r' &= Q_{11}u'_1 + Q_{21}u'_2 = Q'_{11}u'_1 + Q'_{21}u'_2 \\
s' &= Q_{12}u'_1 + Q_{22}u'_2 = Q'_{12}u'_1 + Q'_{22}u'_2 \\
w &= Q_{31}u_1 + Q_{32}u_2 + r'u_1 + s'u_2 = Q'_{31}u_1 + Q'_{32}u_2 \\
v &= Q_{13}u'_1 + Q_{23}u'_2 = Q'_{13}u'_1 + Q'_{23}u'_2 - u'_1 r - u'_2 s
\end{aligned} \qquad (5)$$

Note that Q' can be constructed from Q by the following sequence of operations in either order.
• Multiply column 1 by $u_1$, column 2 by $u_2$, and add both to column 3.
• Multiply row 1 of the new construct by $u'_1$, row 2 by $u'_2$, and add both to row 3.
Going the other way, Q can be constructed from Q' by the same sequences of operations except that u, u' are replaced by their negatives. In the absence of camera registration what is measured is Q' since the origin shifts are not known. The latter are treated as unknowns in Equation (4) and appropriate constraints to determine their numerical values are sought. As pointed out in Reference 2, the symmetric matrix $Q^TQ$ has the form

$$Q^TQ = \begin{bmatrix} t^2 - t_1^2 & -t_1t_2 & -t_1t_3 \\ -t_1t_2 & t^2 - t_2^2 & -t_2t_3 \\ -t_1t_3 & -t_2t_3 & t^2 - t_3^2 \end{bmatrix} \qquad (6)$$

with $t^2 = t_1^2 + t_2^2 + t_3^2$. The three diagonal and the three off-diagonal elements are clearly related. One way of writing these three relations is

$$4\left[(Q^TQ)_{ij}\right]^2 = \left[\mathrm{tr}(Q^TQ) - 2(Q^TQ)_{ii}\right] \times \left[\mathrm{tr}(Q^TQ) - 2(Q^TQ)_{jj}\right], \qquad i \neq j \qquad (7)$$

While the three relations above obviously cannot determine the four parameters of origin shifts, it is quite logical to argue that taking one more image into account would give at least three more equations (between the left image and the new image, say) while adding only two more unknowns (the origin shifts in the new image), thus restoring the balance between the unknowns and the equations relating them to six each. Taking three or more images, these equations could be solved and Q computed for each pair, from which the relative camera geometries could be obtained following References 2 and 3. It is shown below why this scheme could never work.
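The relations between Q, Q' and the origin shifts in Equations (1) to (7) can be checked numerically with a sketch such as the following. It assumes the camera model X' = R(X - t) of Reference 2, and all numerical values (rotation angles, translation, scene point, shifts) as well as the helper names are hypothetical test choices, not taken from the paper.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def rotation(phi, theta):
    # Hypothetical test rotation: about Z by phi, then about X by theta.
    cz, sz = math.cos(phi), math.sin(phi)
    cx, sx = math.cos(theta), math.sin(theta)
    rz = [[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]]
    rx = [[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]]
    return matmul(rx, rz)

def skew(t):
    # The antisymmetric matrix S built from the translation t.
    t1, t2, t3 = t
    return [[0.0, t3, -t2], [-t3, 0.0, t1], [t2, -t1, 0.0]]

def shift_origins(q, u, up):
    # Equations (3) and (5): Q' from Q and the shifts u (left), up (right).
    r = q[0][0] * u[0] + q[0][1] * u[1]
    s = q[1][0] * u[0] + q[1][1] * u[1]
    rp = q[0][0] * up[0] + q[1][0] * up[1]
    sp = q[0][1] * up[0] + q[1][1] * up[1]
    w = q[2][0] * u[0] + q[2][1] * u[1] + rp * u[0] + sp * u[1]
    v = q[0][2] * up[0] + q[1][2] * up[1]
    return [[q[0][0], q[0][1], q[0][2] + r],
            [q[1][0], q[1][1], q[1][2] + s],
            [q[2][0] + rp, q[2][1] + sp, q[2][2] + w + v]]

def bilinear(xp, q, x):
    # Left-hand side of Equations (1) and (2).
    return sum(xp[i] * q[i][j] * x[j] for i in range(3) for j in range(3))

t = [0.3, -0.5, 0.8]
R = rotation(0.4, -0.25)
Q = matmul(R, skew(t))

# Equation (6): Q^T Q = t^2 I - t t^T
G = matmul(transpose(Q), Q)
t_sq = sum(c * c for c in t)
eq6_error = max(abs(G[i][j] - ((t_sq if i == j else 0.0) - t[i] * t[j]))
                for i in range(3) for j in range(3))

# Equation (7) for the three index pairs with i != j
trace = G[0][0] + G[1][1] + G[2][2]
eq7_error = max(abs(4.0 * G[i][j] ** 2
                    - (trace - 2.0 * G[i][i]) * (trace - 2.0 * G[j][j]))
                for i, j in [(0, 1), (0, 2), (1, 2)])

# One scene point seen by both cameras (assumed model X' = R (X - t))
X = [0.7, -0.2, 4.0]
d = [X[k] - t[k] for k in range(3)]
Xp = [sum(R[i][k] * d[k] for k in range(3)) for i in range(3)]
x = [X[0] / X[2], X[1] / X[2], 1.0]
xp = [Xp[0] / Xp[2], Xp[1] / Xp[2], 1.0]

# Shift both origins and re-express the same point pair
u, up = [0.11, -0.07], [-0.05, 0.02]
Q_shifted = shift_origins(Q, u, up)
x_new = [x[0] - u[0], x[1] - u[1], 1.0]
xp_new = [xp[0] - up[0], xp[1] - up[1], 1.0]

res_original = bilinear(xp, Q, x)
res_shifted = bilinear(xp_new, Q_shifted, x_new)
# Undoing the shifts (negated u, u') should return the canonical Q.
Q_back = shift_origins(Q_shifted, [-u[0], -u[1]], [-up[0], -up[1]])
roundtrip_error = max(abs(Q_back[i][j] - Q[i][j]) for i in range(3) for j in range(3))
print(eq6_error, eq7_error, res_original, res_shifted, roundtrip_error)
```

All five printed quantities should vanish to rounding error: the first two confirm Equations (6) and (7), the next two confirm that the point pair satisfies Equation (1) with Q and Equation (2) with Q', and the last confirms that negating the shifts recovers Q as in Equation (4).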
CANONICAL FORM OF Q
The matrix Q = RS is the product of an orthogonal matrix and an antisymmetric matrix. This is called the canonical form to distinguish it from its general form Q' which obtains when the origins of the image coordinate systems are arbitrary. In what follows, it is convenient to consider the canonical form. Suppose that Q' has been calculated from the measurements, and that the origin shifts are not known. The computed Q' is inserted in the right-hand side of Equation (4) to obtain a parametric form of Q, which must be canonical for correct origin shifts, i.e. there must exist an orthogonal matrix R (i.e. the as yet unknown rotation matrix) such that $A = R^TQ$ (with Q the parametric Q) is antisymmetric. Note that A is a function of u, u' and R alone. The antisymmetry imposes six conditions

$$a_{ij} = -a_{ji}, \qquad i, j = 1, 2, 3 \qquad (8)$$

where $a_{ij}$ is the element of A occupying the ith row and the jth column. These six equations in the unknowns R, u and u' capture the canonical form of Q. It is easy to verify that these six equations imply that the determinant of A, det|A|, vanishes. Conversely, any five of these six equations together with the condition det|A| = 0 can be generally shown to imply the remaining sixth equation (see Appendix 1). Thus we are at liberty to impose det|A| = 0 along with any five of the six conditions from Equation (8) on parametric Q if it is to be of canonical form.
It is now shown that det|A| = 0 is an identity which holds for any R, u and u', thus leaving in effect only five conditions on parametric Q. Consider the canonical Q = RS. Its determinant vanishes identically (since S is antisymmetric) whatever the numerical values of the rotation and the translation parameters. Since, as mentioned immediately after Equation (5), Q' is obtained by linearly combining rows and columns of Q, its determinant det|Q'| also vanishes, whatever the numerical values of the coefficients of combination, viz. the four parameters of origin shifts. Thus the determinant of measured Q' would be found to vanish, and hence the determinant of parametric Q constructed from it would also vanish. It then follows that det|A| vanishes identically, shedding no light on the numerical values of R (three parameters), u (two parameters) and u' (two parameters). Thus we have at most five conditions on seven parameters, leaving at least two degrees of freedom undetermined. Noting that it is possible to obtain R given the origin shifts (from Q, using the methods of References 2 and 3, or directly⁵), it can only be concluded that the two undetermined degrees of freedom pertain to the origin shifts alone. Going back to Equation (7), two undetermined degrees of freedom imply that one of the three conditions is redundant. It is neither easy nor necessary to perform an analogous analysis on them directly and no attempt is made to do so here.
Above, it was mentioned that for degenerate configurations (where Q cannot be determined because of linear dependence between equations) it is still possible to compute the rotation and the translation directly (bypassing Q) if the origin shifts are known. When the origin shifts are not known, however, they must be incorporated in the problem as unknowns to be determined with the help of constraints. It was also mentioned above that a naive attempt at this fails to work. (A naive attempt might consist of trying to determine three parameters of the origin shifts given the fourth.) The reason is now clear: the equations still leave one degree of freedom undetermined. In fact, things can get more involved. For instance, when there is no rotation and the translation is perpendicular to both the optic axes, an identity and two equations in $(u_1 - u'_1)$ and $(u_2 - u'_2)$ are obtained (see Appendix 2) from the skew symmetry of Q. A strange consequence of this is that when $u_i$ and $u'_i$ (i = 1, 2) are known to be the same (as would be the case had the same camera been used to obtain the two images), this fact, instead of providing two constraints, reduces the two equations to identities. Similarly, if the origin shifts in one direction are known and this information (naively, two pieces of information) is incorporated, a degree of freedom is still left undetermined.
image and vision computing
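The determinant argument of this section can be illustrated by a small numerical experiment. The sketch below writes the row and column operations that turn Q into Q' as multiplication by unit-determinant shift matrices, which is an equivalent formulation adopted here for brevity; the helper names, the test rotation and the random shifts are assumptions of the sketch.

```python
import math
import random

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def skew(t):
    t1, t2, t3 = t
    return [[0.0, t3, -t2], [-t3, 0.0, t1], [t2, -t1, 0.0]]

def shift_matrix(u):
    # Adds u1 * column 1 + u2 * column 2 to column 3 when applied on the right.
    return [[1.0, 0.0, u[0]], [0.0, 1.0, u[1]], [0.0, 0.0, 1.0]]

def rotation(phi, theta):
    # Hypothetical test rotation: about Z by phi, then about X by theta.
    cz, sz = math.cos(phi), math.sin(phi)
    cx, sx = math.cos(theta), math.sin(theta)
    rz = [[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]]
    rx = [[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]]
    return matmul(rx, rz)

R = rotation(0.6, -0.35)
Q = matmul(R, skew([0.3, -0.5, 0.8]))      # canonical form Q = RS

# For the true origins, A = R^T Q is antisymmetric and det|A| = 0.
A = matmul(transpose(R), Q)
antisym_error = max(abs(A[i][j] + A[j][i]) for i in range(3) for j in range(3))
det_A = det3(A)

# det|Q'| stays zero for arbitrary origin shifts, so it constrains nothing.
rng = random.Random(0)
worst_det = abs(det3(Q))
for _ in range(1000):
    u = [rng.uniform(-2.0, 2.0), rng.uniform(-2.0, 2.0)]
    up = [rng.uniform(-2.0, 2.0), rng.uniform(-2.0, 2.0)]
    q_shifted = matmul(transpose(shift_matrix(up)), matmul(Q, shift_matrix(u)))
    worst_det = max(worst_det, abs(det3(q_shifted)))

print(antisym_error, det_A, worst_det)
```

Because each shift matrix has unit determinant, the determinant of the shifted matrix equals that of Q, which is why the largest determinant found over the random shifts should remain at rounding level.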
CONCLUSIONS
It has been shown that Equation (7) determining the origin shifts in a stereo pair of images amounts only to two equations and an identity. Thus taking more images into consideration would add as many constraints as unknowns, and the shortfall of two equations could never be made up. A benefit of the considerations developed here is the light they shed on computing rotation, translation and the origin shifts directly (without calculating Q), as one might wish to do to overcome the difficulties associated with degeneracies in data (which might be unavoidable at times, e.g. if the object of interest is planar). A feasible scheme might be to attempt to find the smallest origin shifts compatible with the data.
ACKNOWLEDGEMENT
The author wishes to thank BP for permission to publish this work.
REFERENCES
1 Trivedi, H P ‘On the reconstruction of a scene from two unregistered images’ Proc. 5th National Conf. AAAI-86, Philadelphia, PA, USA (1986) pp 652-656
2 Longuet-Higgins, H C ‘A computer algorithm for reconstructing a scene from two projections’ Nature Vol 293 (1981) pp 133-135
3 Tsai, R Y and Huang, T S ‘Uniqueness and estimation of three-dimensional motion parameters of rigid objects with curved surfaces’ IEEE Trans. PAMI Vol 6 (1984) pp 13-26
4 Longuet-Higgins, H C ‘The reconstruction of a scene from two projections - configurations that defeat the 8-point algorithm’ Proc. 1st Int. Conf. Appl. AI, Denver, CO, USA (1984) pp 395-397
5 Trivedi, H P ‘Estimation of stereo and motion parameters using a variational principle’ Image Vision Comput. Vol 5 No 2 (May 1987) pp 181-183
6 Thompson, E H ‘A rational algebraic formulation of the problem of relative orientation’ Photogrammetric Record Vol 3 (1959) pp 152-159
APPENDIX 1: ANTISYMMETRY CONDITIONS
It is shown that the six parts of Equation (8) are equivalent to any five, together with the condition that det|A| = 0, as long as $t_1 \neq 0$, $t_2 \neq 0$ and $t_3 \neq 0$. It is readily checked that antisymmetry of A implies that det|A| = 0 identically. We now proceed to show that det|A| = 0 along with any five of the six conditions of Equation (8) implies the remaining sixth condition. Rewriting the six conditions explicitly gives

$$a_{11} = 0 \qquad (9)$$
$$a_{22} = 0 \qquad (10)$$
$$a_{33} = 0 \qquad (11)$$
$$a_{12} + a_{21} = 0 \qquad (12)$$
$$a_{13} + a_{31} = 0 \qquad (13)$$
$$a_{32} + a_{23} = 0 \qquad (14)$$

Omit Equation (9) but include det|A| = 0. Then, on expanding the determinant, $\det|A| = 0 = -a_{23}a_{32}a_{11} = a_{23}^2 a_{11}$, implying $a_{11} = 0$, i.e. Equation (9), because $a_{23} = S_{23} = t_1 \neq 0$. Similar considerations apply to Equations (10) and (11). Omit Equation (12). Then $\det|A| = 0 = a_{12}a_{23}a_{31} + a_{21}a_{32}a_{13} = -(a_{12} + a_{21})a_{13}a_{23}$, implying that $a_{12} + a_{21} = 0$, i.e. Equation (12), since neither $a_{13}$ ($= -t_2$) nor $a_{23}$ ($= t_1$) vanishes. Similar procedures recover Equations (13) and (14). Thus, the six parts of Equation (8) are equivalent to any five of the six, plus the condition that det|A| = 0, as long as $t_1 \neq 0$, $t_2 \neq 0$, $t_3 \neq 0$.
The provision that $t_1 \neq 0$, $t_2 \neq 0$ and $t_3 \neq 0$ is not as restrictive as it might appear at first. Start by considering $t = (-|t|, 0, 0)$. By rotating the X-Y plane about the Z-axis, it could always be so arranged that $t = (t_1, t_2, 0)$, $t_1 \neq 0$ and $t_2 \neq 0$. To make $t_3$ nonzero, simply swap the labels ‘left’ and ‘right’. The translation t now has all three components nonvanishing. The only time this scheme fails is when the two optic axes are parallel and the direction of translation is normal to them both. In that case, there is no rotation. The case is treated in Appendix 2.

APPENDIX 2: ANTISYMMETRY CONDITIONS WITHOUT ROTATION
In the absence of rotation, A = Q of Equation (4) must be antisymmetric. Noting that the elements $Q_{ij}$ (i, j = 1, 2) remain unchanged under origin shifts, we can parametrize

$$Q' = \begin{bmatrix} 0 & \alpha & \beta' \\ -\alpha & 0 & \zeta' \\ -\beta'' & -\zeta'' & \delta \end{bmatrix} \qquad (15)$$

Then, from Equation (4)

$$Q = \begin{bmatrix} 0 & \alpha & \beta' - \alpha u_2 \\ -\alpha & 0 & \zeta' + \alpha u_1 \\ -\beta'' + \alpha u'_2 & -\zeta'' - \alpha u'_1 & \delta + \beta'' u_1 + \zeta'' u_2 - u'_1(\beta' - \alpha u_2) - u'_2(\zeta' + \alpha u_1) \end{bmatrix} \qquad (16)$$

This must be antisymmetric. Elements (13) and (31) give

$$(u'_2 - u_2) = (\beta'' - \beta')/\alpha \qquad (17)$$

Elements (23) and (32) give

$$(u'_1 - u_1) = (\zeta' - \zeta'')/\alpha \qquad (18)$$

Finally, element (33) reduces to $\det|Q'|/\alpha^2$, vanishing identically.
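The rotation-free case of Appendix 2 can be reproduced with a short sketch. It assumes a translation with $t_3 \neq 0$, so that $\alpha \neq 0$ and the divisions in Equations (17) and (18) are defined; the translation, the shifts, the common offsets and the helper names are hypothetical test choices.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def skew(t):
    t1, t2, t3 = t
    return [[0.0, t3, -t2], [-t3, 0.0, t1], [t2, -t1, 0.0]]

def shift_matrix(u):
    return [[1.0, 0.0, u[0]], [0.0, 1.0, u[1]], [0.0, 0.0, 1.0]]

def apply_shifts(q, u, u_prime):
    # Matrix form of Equation (3); negated shifts give Equation (4).
    return matmul(transpose(shift_matrix(u_prime)), matmul(q, shift_matrix(u)))

def asymmetry(m):
    # Largest |m_ij + m_ji|; zero for an antisymmetric matrix.
    return max(abs(m[i][j] + m[j][i]) for i in range(3) for j in range(3))

t = [0.4, -0.3, 0.5]                      # assumed translation with t3 != 0
u_true, up_true = [0.10, -0.20], [0.25, 0.05]
q_meas = apply_shifts(skew(t), u_true, up_true)   # stands in for the measured Q'

# Read off the parameters of Equation (15).
alpha = q_meas[0][1]
beta1, zeta1 = q_meas[0][2], q_meas[1][2]
beta2, zeta2 = -q_meas[2][0], -q_meas[2][1]
delta = q_meas[2][2]

diff_u2 = (beta2 - beta1) / alpha         # Equation (17): u'_2 - u_2
diff_u1 = (zeta1 - zeta2) / alpha         # Equation (18): u'_1 - u_1
identity_33 = delta - (beta2 * zeta1 - beta1 * zeta2) / alpha   # det|Q'| / alpha^2

# Adding the same offset c to both trial shifts leaves the parametric Q antisymmetric.
worst = 0.0
for c in ([0.0, 0.0], [1.0, -2.0], [-0.7, 0.3]):
    u_trial = [u_true[0] + c[0], u_true[1] + c[1]]
    up_trial = [up_true[0] + c[0], up_true[1] + c[1]]
    q_trial = apply_shifts(q_meas,
                           [-u_trial[0], -u_trial[1]],
                           [-up_trial[0], -up_trial[1]])
    worst = max(worst, asymmetry(q_trial))

print(diff_u1, diff_u2, identity_33, worst)
```

Only the differences of the shifts are recovered, and every common offset tried yields an equally valid antisymmetric Q, which is the two-parameter freedom in the origin shifts that the antisymmetry conditions cannot remove.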