Parametric Model of the Perspective Projection of a Road with Applications to Lane Keeping and 3D Road Reconstruction

Parametric Model of the Perspective Projection of a Road with Applications to Lane Keeping and 3D Road Reconstruction

Computer Vision and Image Understanding Vol. 73, No. 3, March, pp. 414–427, 1999 Article ID cviu.1998.0737, available online at http://www.idealibrary...

300KB Sizes 4 Downloads 81 Views

Computer Vision and Image Understanding Vol. 73, No. 3, March, pp. 414–427, 1999 Article ID cviu.1998.0737, available online at http://www.idealibrary.com on

Parametric Model of the Perspective Projection of a Road with Applications to Lane Keeping and 3D Road Reconstruction Antonio Guiducci Systems Engineering Department, Istituto Elettrotecnico Nazionale “Galileo Ferraris,” Strada delle Cacce 91, 10135 Turin, Italy Received October 7, 1997; accepted September 2, 1998

This paper describes an algorithm for the computation of the 3D structure of a road and of the motion state of an in-vehicle CCD camera from the lane borders on the image plane. A parametric model of the projective projection of the lane borders on the image plane is introduced and a fast and reliable algorithm to compute its parameters is described. The physical significance of the model is demonstrated by showing that the model parameters completely determine the position of the vehicle inside the lane, its heading direction, and the local structure of the road. The conditions of applicability of the model are also given and the results of its application to a sequence of images taken during a test run discussed. The algorithm is suited to real time applications and indeed an implementation on the dedicated hardware of the mobile laboratory MOBLAB runs at a frame rate of 12 images per second. c 1999 Academic Press °

1. INTRODUCTION The only information needed by a human driver to safely follow its course on a modern highway, is the visual appearance of the lane boundaries. Then, it is expected that the position of the lane borders on the image plane of an on-board camera is of utmost importance to an autonomous steering system too. Indeed most of the existing autonomous vehicles rely on this information, possibly augmented by some external inputs (like the vehicle speed), to determine the motion state of the vehicle— distance from the lane borders, heading direction, and possibly their time derivatives—and the 3D structure of the visible portion of the road (or perhaps only its curvature at the vehicle position). Many algorithms to solve the above problems can be found in the literature. Some approaches, under the assumption of a flat, horizontal road, determine the vehicle position [1, 2] or heading direction [3], and the road curvature, without even requiring the explicit identification of the road borders on the image plane. Most authors, however, do make use of the position of the lane borders on a single image or on a sequence. The vanishing point and the position of the borders of a straight horizontal road on the bottom rows of a single image are used by Yeh et al. [4] to determine the heading direction and the lateral position of the 414 1077-3142/99 $30.00 c 1999 by Academic Press Copyright ° All rights of reproduction in any form reserved.

vehicle. For a curved horizontal road, Yokoyama and coauthors [5] computed the road curvature assuming that the position and heading of the vehicle was known from magnetic sensors. By applying the formula of [4] to all the points of a horizontal road border and to their tangents, on a sequence of images, and by making use of sequential estimation (Kalman) techniques, Hsu et al. [6] determined the lateral position of the vehicle, its heading direction, the time derivatives of the above quantities, and the road curvature. Dickmanns and Mysliwetz [7] abandoned the constraint of flatness and used a differential geometry representation of the road structure and a spatio-temporal model of the driving process to recursively estimate the vehicle position, heading, time derivatives, horizontal and vertical curvatures, and changes of curvature. Also the complete 3D reconstruction of the visible portion of the road from its image plane borders has received considerable attention. Many solutions to this problem exist in the literature. They differ on the model assumed to idealize the real roads, ranging from the “flat-earth” geometry model [8, 9], to the incremental polygonal earth model [10], to the “zero-bank” model [11, 12], to the local, pointwise reconstruction algorithms of DeMenthon [13], Kanatani [14, 15], and Guiducci [16]. This paper focuses on the parametric description of the projective projection of the borders of the road ahead of a vehicle equipped with a CCD camera. We give a very simple and fast procedure for the computation of the model parameters and discuss where and when the model is applicable. Modeling the image of the road borders is useful in that the model parameters are directly linked to both the road local 3D structure (curvature) and to the position and heading of the camera (and then of the vehicle) with respect to the road. As the computation of the model parameters is easily implementable in real time, the algorithm is suited to the realization of an autonomous lane keeping system. Moreover it is shown how, starting from the model, a complete 3D reconstruction of the road can be achieved. The latter information is of the utmost importance for detecting possible obstacles on the vehicle lane, thus making the lane keeping system not only reliable but also safe. In this paper the determination of the model parameters is based on the road borders from a single image. It can be expected

PARAMETRIC MODEL OF THE PROJECTION OF A ROAD

415

that a procedure taking into account the temporal evolution of the scene should produce better results; however, the author believes that it is important to analyze how much information can be gathered from a single image. Indeed the performance of any sequential estimator is expected to grow with the accuracy of the input data. A model similar to the one here described was introduced by Kluge [3] as a global constraint on the edgels found by an edge finding algorithm. The model is not fully justified there, however, and the limits of applicability are not stated. The parametric model is described in Section 2 and the computation of its parameters in 3. Two approaches are discussed: a batch, least squares technique and a sequential, state estimation algorithm. Section 4 describes the applications of the parametric model to lane keeping and 3D road reconstruction while the experimental testing of the whole approach is discussed in Section 5. An appendix describes in more details some lengthy calculations. 2. PARAMETRIC MODEL High speed roads are so designed that the curvature C (the inverse of the radius of curvature R) is constant or linearly variable as a function of the arc length s: C = 1/R = C0 + C1 s (see Dickmanns [7]). Moreover, both the horizontal and vertical radii of curvature are large, typically greater than 1000 m. We envisage a CCD camera, mounted on a vehicle running on a highway, and derive a parametric model of the projection of the lane borders on the image plane. We suppose, for the moment, that the framed portion of the road is both planar and of constant curvature (we show later in this section how a linearly changing curvature affects the model and in Section 4 how the planarity constraint can be relaxed). Figure 1a represents the circle to which the framed road arc belongs. The camera is at O and the optical axis Oz forms an angle θ with the tangent to the road (right) border. Obviously, in a real situation the radius of curvature R0 is much greater than the distance OA to the road border and the optical axis is almost in the vehicle heading direction; hence, the angle θ is much smaller than represented in the figure. The plane through O and perpendicular to the optical axis Oz intersects the road circle in the two points A and B. The view lines OA and OB are parallel to the image plane and the points A and B are then points at infinity for the conic on which the circle projects on the image plane. In other words the framed road arc projects into an arc of hyperbola (see Fig. 1b) whose asymptotes are the images of the tangents to the circle at A and B: AC and BC. These asymptotes are related to the position and heading direction of the vehicle. Indeed, with reference to Fig. 2, let O be the center of projection of the CCD and 5 I the image plane at a distance f = 1 from O; that is, 5 I is the calibrated image plane to which the physical pixel plane of the CCD can be reduced by an intrinsic camera calibration such as Tsai’s [17]. For simplicity let us suppose that the x axis of the x y reference on the image

FIG. 1. The framed road arc.

plane is parallel to the horizon, that is, to the vanishing line of the ground plane 5G on which the road lies, and that the projection of the optical axis Oz on 5G is in the heading direction of the vehicle (the determination of the coordinate rotation needed to attain this is discussed in Appendix B). The point A lies (in both Figs. 1a and 2) at the intersection of the road border with the plane 5 O through O and parallel to the image plane 5 I . The tangent to the road border at A, AT , projects onto the image line VT of equation x − x0 = a(y − y0 ), where V ≡ (x0 , y0 ) is the vanishing point of the tangent.

416

ANTONIO GUIDUCCI

field of view (see Fig. 3), the vertical variation 1y of the second asymptote over W I is given by 1y =

FIG. 2. The geometry of the framing system.

AT being parallel to OV and hence OA parallel to VT, a=

x − x0 MA MA = = cos ϕ y − y0 OM H

(1)

tan θ = x0 cos ϕ

(2)

y0 = −tan ϕ.

(3)

In these equations MA is negative if A is on the left of M (looking along the z axis), that is, for the left road border; H is the height of the camera above the road plane 5G , ϕ is the angle between the optical axis z and 5G , and θ is the angle between the tangent AT and the heading direction of the vehicle (that is the projection of the z axis on 5G ). Similar equations hold for the other intersection of the circular road and the plane 5 O (point B in Fig. 1a) and the respective asymptote: a0 =

x00

x− MB cos ϕ = y − y0 H

tan θ 0 = x00 cos ϕ y00 = −tan ϕ = y0 ,

WI · H WI = |a 0 | MB cos ϕ

and its horizontal slant is less than one pixel if 1y/W I < 1/N x , where N x is the number of pixels in a row of the CCD. In normal conditions OB À OA and θ ¿ 1 (see Fig. 1a) and then MB ' OB ' 2R0 , cos ϕ ' 1. Assuming H ' 1 m and N x = 400 pixels, the condition that the horizontal tilt be less than one pixel becomes R0 > N x /2 = 200 m. For what concerns point (2) above, having shown that the second asymptote is almost horizontal, it (almost) coincides with the horizon line if, from Fig. 1b, H = RH0 tan θ . Such a difference is lower yC ' y0 , but yC − y0 ' AC H than one pixel if R0 tan θ < F1Y having denoted by FY the focal length (in pixels) in the y direction. Assuming H = 1 m, θ < 5◦ we have for a wide angle lens (FY ∼ 500 pixels) R0 > 44 m, and for a telephoto lens (FY ∼ 2000 pixels) R0 > 175 m. Since the minimum horizontal radius of curvature ranges from about 1200 m for a primary highway in a flat region to about 200 m for a secondary rural highway in a mountainous region [18], we conclude that, in normal conditions, the second asymptote coincides with the horizon line, within one pixel, over the whole field of view. To sum up, the road borders project into a hyperbola whose first asymptote has equation x − x0 = a(y − y0 ) and whose second asymptote is (almost) horizontal; that is, it has equation y = y0 . The equation of the hyperbola can then be written x − x0 = a(y − y0 ) +

b . y − y0

(8)

More generally, we can think of (8) as the first two terms of a power series valid also for a changing curvature road, and write x − x 0 = a(y − y0 ) +

b0 b00 + + · · ·. y − y0 (y − y0 )2

(4) (5) (6)

V 0 ≡ (x00 , y0 ) being the vanishing point of this asymptote. From Fig. 1b it is clear that the intersection of the two asymptotes (image of the point C of Fig. 1a) does not coincide with V . We will show now that, normally, the second asymptote (that of equation x − x00 = a 0 (y − y0 )) is (1) almost horizontal (that is, almost parallel to the x axis), and moreover, (2) almost coincident with the horizon line, where almost means that the second asymptote is indistinguishable from the horizon line on the pixel plane; that is, it is nowhere displaced from the horizon line by more than one pixel. Indeed if W I = xmax − xmin is the horizontal

(7)

FIG. 3. The two asymptotes and the horizon line.

(9)

417

PARAMETRIC MODEL OF THE PROJECTION OF A ROAD

The coefficients a, x 0 , y0 are related to the position and heading of the vehicle through the formulas (1)–(3): a=

tan θ MA cos ϕ, x0 = , y0 = −tan ϕ, H cos ϕ

that is, 1x = 1a(y − y0 ) +

(10)

1b0 1b00 + + · · ·, y − y0 (y − y0 )2

(15)

while the coefficients b0 and b00 are related to the curvature C0 and change of curvature C1 = dC/ds, respectively, as shown in Appendix A, Eqs. (67) and (68).

but, in the approximation of horizontal second asymptote, 1x → 0 for y → y0 and then 1b0 = 1b00 = · · · = 0. To sum up, for a segment of planar road we have the relation, obvious if the road is straight, less obvious in the general case,

3. COMPUTATION OF THE MODEL PARAMETERS

1x = 1a(y − y0 ).

3.1. Least Squares Determination of Parameters x0 , a, b0 , b00 , . . . In the previous section we concluded that the projective projection of a generic segment of a road border can be described by the parametric model x − x0 = a(y − y0 ) +

b0 b00 + + · · ·. (11) y − y0 (y − y0 )2

Our task is now to determine the model parameters x0 , y0 , a, b0 , b00 , . . . that best describe a given segment of road border on the image plane. Were y0 known, the other parameters would be easily determined by a general least squares approach, that is, by fitting the data (the road border position on the image plane) with the model x(y) = aY1 (y) + x0 Y0 (y) + b0 Y−1 (y) + b00 Y−2 (y) + · · ·, (12) where the base functions are Yk (y) = (y − y0 )k .

(13)

In order to determine y0 we can exploit the fact that a road has two borders that, being parallel, have the same vanishing point: (x0 , y0 ). In effect this is not exactly true. Indeed, if the angle θ between the optical axis and the road direction is not exactly zero, the left and right asymptotes are not exactly parallel, hence the image of their intersection does not lie exactly on the horizon line. However, if the conditions of validity of the parametric model, stated in the previous section, are satisfied, the two asymptotes intersect in a point of the image plane that is displaced from the horizon line by much less than one pixel. They can then be treated as parallel introducing a totally negligible error. Hence, labeling with 1 and 2 the quantities pertaining to the left and right border, respectively, we have x2 − x1 = (a2 − a1 )(y − y0 ) +

b20 − b10 b00 − b100 + 2 + · · ·, y − y0 (y − y0 )2 (14)

(16)

Hence, the horizontal distance between the two road borders on the image plane is a linear function of y and the parameters 1a and y0 can be easily computed by fitting a straight line to the data 1x(y). Once 1a = a2 − a1 and y0 are known, being 1b0 = 0, that is 0 b2 = b10 , we can use both road borders to determine the parameters x0 , a = 12 (a1 + a2 ), and b = b10 = b20 that describe the center line of the road on the image plane (in the following, for simplicity, we consider only the case of a circular road arc, b00 = 0; that is, we neglect the change of curvature). Indeed, if x1 (y) and x2 (y) are the measured position of the left and right borders on row y, the center line of the road is at 12 [x1 (y) + x2 (y)] and its model parameters, determined by the general least squares approach (12), are x 0 , a, and b. The left and right road borders are then described by the equations ¶ µ 1a b (y − y0 ) + (left border) (17) x1 − x0 = a − 2 y − y0 µ ¶ 1a b (right border) (18) (y − y0 ) + x2 − x0 = a − 2 y − y0 It is worth noting that: 1. We have assumed in the previous discussions that the x axis was parallel to the horizon line. This condition is assured by the initial extrinsic calibration that will be briefly described in Appendix B. The small swings of the vehicle on its suspensions are treated as noise. 2. 1x(y) is a straight line only if the road is planar. If this is not the case the 1x(y) curve can be segmented into straight segments and the value of y0 for each stroke defines the slope of the corresponding road plane (see Fig. 7 in section 4.2). 3.2. Determination of Parameters x0 , a, and b by State Estimation The least squares determination of the model parameters x0 , a, and b can give incorrect results if the image plane position of one or both of the two borders is wrong, due, for example, to the presence of another vehicle or of a shadow (see Fig. 4). This difficulty can be overcome by making use of a recursive state estimation approach (Kalman filter). Indeed, tracking the center line of the road, e.g., bottom up, the innovation variance

418

ANTONIO GUIDUCCI

the differential equation (20) can be written 

 0 1 0   ˙ X(t) = Φ(t) X(t) with Φ(t) =  0 0 1  (22) 0 0 − 3t and its solution with the initial condition X(t) = X(t1 ) for t = t1 is ·Z t ¸ X(t) = exp Φ(t) dt X(t1 ) 

t1

0 t − t1 0 0 = exp 0

FIG. 4. Error in the determination of the position of one road border.

0

 0 t − t1   X(t1 ). ¡ t ¢3 −ln t1

The road borders are sampled at a discrete set of y and hence of t. Letting t1 = kT and t = (k + 1)T , where T is the y distance between two adjacent image rows, the “discrete time” version of the above equation becomes (writing Xk for X(kT )) Xk+1 = F(k)Xk ,

x10

of the filter could be used to disregard a measure (as in Fig. 4) that lies too far from the position predicted by the filter, that is, a measure that has a low probability to be correct. If both measures must be discarded (both borders are wrong), a blind tracking step can be performed and, after a predetermined number of blind steps, the procedure terminated. The last estimation is then used to compute the model parameters x0 , a, and b. Let us examine in more details how all this can be performed. First of all the model equations must be put in a form suited to be treated by linear estimation techniques; that is, we must construct a state space representation of the model. To this end write the model Eq. (8) in the form b x = x0 − at − , t

(19)



0 T 0 0 F(k) = exp 0 

1 T

 = 0

1

0

0

0 T2 ln α

−ln

 0  T  ¡ k + 1 ¢3 k

£

¡ ¢¤  1 − ln1α 1 − α1  ¡ ¢ T , 1 − α1 ln α 

(25)

1 α

with α ≡ ( tt1 )3 = ( k +k 1 )3 > 1. Note that for y in the range [ymax , y0 ], k varies between kmin = −(ymax − y0 )/T and 0.

(20)

If we introduce the state vector X(t) whose components are   x X(t) =  x˙  , x¨

(24)

where the state transition matrix F(k) is

where t ≡ y0 − y, −(ymax − y0 ) ≤ t < 0 (see Fig. 5). Denoting by a dot the derivative with respect to t, we get x˙ = −a + b/t 2 , x¨ = −2b/t 3 ,˙¨ x = 6b/t 4 = −(3/t)x¨ . Hence the image plane projection of a (constant curvature) road border is a solution of the linear differential equation 3 ˙¨ x = − x¨ . t

(23)

(21) FIG. 5. The model equation in the t − x plane.

419

PARAMETRIC MODEL OF THE PROJECTION OF A ROAD

For each k, the measured quantity is the x positions of the road borders, affected by a noise that we consider Gaussian. Hence the measurement equation reads Zk = HXk + n k = (1 0 0)Xk + n k ,

(26)

where n k is a white noise sequence with zero mean and variance R ≡ Var[n k ] = σn2 (assumed known). The sequential solution of the state estimation problem is well known [19] and we report here the Kalman estimator/predictor equations only for the sake of notational reference: ˆ k+1|k = FX ˆ k|k X

(27)

Pk+1|k = FPk|k F>

(28)

In this equation t0 ≡ k0 T where k0 is the last valid value of k (k0 = −1 if the row below the horizon y0 was reached) and T is the y distance between two adjacent rows.

4.1. Lane Keeping (30)

ˆ k+1|k+1 = X ˆ k+1|k + Wk+1 (Zk+1 − HX ˆ k+1|k ) X

(31)

Pk+1|k+1 = Pk+1|k − Wk+1 HPk+1|k .

(32)

ˆ k|k denotes the estimation of the state vecIn these equations X ˆ k+1|k tor Xk , given the measurements between kmin and k and X denotes the prediction for the next step k + 1. Likewise Pk|k and Pk+1|k are the covariance matrices of the estimation and prediction errors, respectively. The initialization equations, denoting by x 0 , x 00 , x 000 the first three measures (that is those for k = kmin , kmin + 1 and kmin + 2, respectively), are x 000 000 (x − x 00 )/T 000

00

  

(33)

With reference to Fig. 6 let x1 (y) and x2 (y) be the left and right borders of the lane. Starting from the bottom of the image (that is, considering the portion of the road closest to the vehicle) it is easy to determine the range 1y of y in which the curve 1x(y) = x 2 (y) − x1 (y) is well approximated by a straight line and to compute the coefficients 1a and y0 of such a line. The algorithms introduced in the previous section allow us to compute the parameters of the hyperbolic model that best matches the lane borders in the interval 1y. The two lane borders are then described by the equations x − x0 = (a ∓ 1a/2)(y − y0 ) + b/(y − y0 ) where the upper (lower) sign corresponds to the left (right) border. The model parameters x0 , y0 , 1a, a, b completely determine the position of the vehicle inside the lane, its heading direction, and the local structure (that is the curvature) of the road close to the vehicle. Indeed, as shown in Section 2, Eqs. (2) and (3),

0

(x − 2x + x )/T 

Pkmin +2|kmin +2 =

σn2

b b t03 , a = −X 2 + 2 , x0 = X 1 + at + . (35) 2 t0 t0

(29)

Wk+1 = Pk+1|k H> S−1 k+1

ˆ kmin +2|kmin +2 =  X 

b = −X 3

4. APPLICATIONS

Sk+1 = HPk+1|k H> + R



regarded (the corresponding border position is wrong) and the filter is fed with the other (or a blind tracking step is performed if both borders are wrongly computed). When either no more measures are available or a number of blind steps greater than a given value has been performed, the tracking terminates and the model parameters x0 , a, and b are determined from the last ˆ ≡ (X 1 X 2 X 3 )> as valid estimated state vector X

tan ϕ = −y0 , 2

1 1/T 1/T   ·  1/T 2/T 2 3/T 3  . 1/T 2 3/T 3 6/T 4

(36) x0

tan θ = x0 cos ϕ = q , 1 + y02

(34)

Equation (34) is derived in Appendix C. To sum up, the first (bottom) three measures of the horizontal position of the center line of the road (mean of the left and right borders) are used, through Eqs. (33) and (34), to initialize the filter. The following measures up to k = 1, that is, to y = y0 + T (remember that the y axis is downward) are then used to estimate the position of the center line in that row. In each row y, the (y − y0 ) center line is the mean of the two quantities x 1 (y) + 1a 2 (y − y ), where x (y) and x (y) are the positions and x2 (y) − 1a 0 1 2 2 of the left and right borders, respectively. If in a given row, one of the above two quantities is not within an innovation standard deviation S1/2 (Eq. (29) where S is a 1 × 1 matrix that is a real number) of the predicted position, that quantity must be dis-

(37)

where ϕ is the angle between the optical axis and the road plane, and θ is the angle between the heading direction of the vehicle and the road direction. Moreover, let d1 and d2 be the distances of the left and right road borders to the projection of the optical center O of the camera on the ground plane. From Eq. (1) 1a −d1 = cos ϕ 2 H +d2 1a = cos ϕ a2 ≡ a + 2 H

a1 ≡ a −

(38) (39)

whence 1a ≡ a2 − a1 =

W d1 + d2 cos ϕ = cos ϕ H H

(40)

420

ANTONIO GUIDUCCI

FIG. 6. The road boundaries and the curve 1x(y).

4.2. 3D Road Reconstruction

and a 1 d2 − d1 dC = = 1a 2 d2 + d 1 W

µ −0.5 ≤

¶ dC ≤ +0.5 , W

(41)

where H is the height of the camera above the ground plane, W is the lane width, and dC is the distance of the projection of O on the ground, from the center line of the lane. To sum up, the position of the vehicle with respect to the center line of the lane is given by dC /W = a/1a

(42)

and the heading direction by tan θ = x0

±q 1 + y02 .

(43)

Moreover if H is known, the instantaneous lane width W can be computed by the equation (40) q W = 1a H 1 + y02 .

(44)

Conversely if W is known, H can be determined and this gives a method to calibrate the height H of the camera, by framing a road of known width. Finally the curvature C0 of the road is linked to the parameter b through Eq. (67) of Appendix A: C0 H ≡

H 2b =¡ ¢3 . R0 1 + y02 2

(45)

The modelization of a range of road borders around an image point P (midpoint of the segment P1 P2 , see Fig. 7) allows us to reconstruct the 3D structure of the road around the corresponding 3D point. Indeed indicating as usual the two lane borders with x1 (y) and x2 (y), and letting 1x(y) = x2 (y) − x1 (y), the tangent to the curve 1x(y) at P intersects the y axis at y0 . On the image plane the straight line y = y0 represents the vanishing line of the road plane at the 3D point that projects to P. Given y0 , the hyperbolic model can be fitted to the segment of road between A1 A2 and B1 B2 (Fig. 7) in which the curve 1x(y) is approximable by its tangent at P. Let x0 , y0 , 1a, a, b be the so computed model parameters; the two road borders are then given by x − x0 = (a ∓ 1a/2)(y − y0 ) + b/(y − y0 ) where the upper (lower) sign corresponds to the left (right) border. The coordinates (x R , y R ) of the intersection R of the two tangents at P1 and P2 can easily be computed and are x R = x P − a(y P − y0 ) +

b , y R = y0 , y P − y0

(46)

where x P = 12 (x P1 + x P2 ) is the x coordinate of the midpoint P of the segment P1 P2 . As shown in [16] the knowledge of the tangents at the two points P1 and P2 allows us to compute the 3D structure of the road in a neighborhood of the 3D point P 0 that projects to P −→ on the image plane (see Fig. 8), that is, the vector OP0 and the triad of unit vectors uˆ P , vˆ P , wˆ P , respectively, along the road cross segment, orthogonal to the road plane, and along the road, at P 0 .

PARAMETRIC MODEL OF THE PROJECTION OF A ROAD

421

FIG. 7. The road boundaries around a generic point P and the curve 1x(y).

Suppose we are given a reference frame OXYZ, with origin in the optical center O of the camera, Z axis (unit vector wˆ v ) in the heading direction of the vehicle, Y axis (unit vector vˆ v ) perpendicular to the ground plane under the vehicle, and X axis (unit vector uˆ v ) determined by the right hand rule: uˆ v = vˆ v × wˆ v (Appendix B shows how such a reference can be determined). Then the formulae in [16] give, for the components of the three unit vectors in that reference, wˆ P = N [x R , y0 , 1]

(47)

uˆ P = N [ˆv P × wˆ P ] = N [1, 0, −x R ] £ ¤ vˆ P = N [wˆ P × uˆ P ] = N −y0 x R , 1 + x R2 , −y0 .

(48) (49)

(N [·] stands for the normalization operator: N [v] = v/|v|), and, −→ for the vector OP0 , −→ (ˆv P · mˆ Q )mˆ P1 + (ˆv P · mˆ P1 )mˆ Q OP0 = , W 2|ˆv P × (mˆ Q × mˆ P1 )|

(50)

where mˆ P1 = N [x P1 , y P , 1] and mˆ P2 = N [x P2 , y P , 1] are the unit vectors toward P1 and P2 , respectively, and we have defined mˆ Q ≡ N [(mˆ P2 × wˆ P ) × (mˆ P1 × uˆ P )]. The distance OP0 is given by (50) in units of the road width W . In the hypothesis that the road has constant width in the range from the vehicle to the point P 0 , W can be identified with the quantity measured in the previous section (formula (44)) and the distance OP0 is then known in meters. It is worth noting that the image of the road cross segment at P 0 (in the direction uˆ P ) does not coincide, in general, with the image segment P1 P2 . 5. EXPERIMENTAL RESULTS

FIG. 8. The 3D structure of the road at P 0 .

To verify the validity of the model and of the physical interpretation of its parameters, the algorithms described in the previous sections have been implemented on the Mobile Laboratory MOBLAB [20], equipped with a CCD camera with a wide angle lens (focal length of about 500 pixels). A test run was performed on a highway, grabbing and elaborating a sequence of 1000 images over a time stretch of about 83 s (frame rate of 12 images per second). The highway tract covered consisted of a wide curve toward the right followed by a straight portion. The

422

ANTONIO GUIDUCCI

FIG. 9. The lane boundaries and the computed road model for the first image of the sequence of 1000 images.

lane boundaries were detected using the algorithm described in [21] and fed to the model computation module. Figure 9a shows the lane boundaries detected on the first image of the sequence. The hyperbolic model described in Section 2 was fitted to these boundaries and the model parameters x0 , y0 , 1a, a, b computed. Figure 9b shows the curves x = x0 + (a + k1a/2)(y − y0 ) + b/(y − y0 ) with k = −3, −1, +1, +3, drawn upon the same image, together with the straight line y = y0 (horizon line). The model fits very well to the image of the road, even in the regions (close to the vehicle and toward the horizon) without boundary data available. The same model parameters were computed for all the images of the sequence and the results shown in Fig. 10. The parameter y0 , that is, the vertical position of the horizon line (Fig. 10a), is given by the superposition of a constant component, since the road is flat, and a noisy component due to the shake of the vehicle on its suspensions. Figure 10b shows the ratio a/1a, that is, the distance of the vehicle from the center line of the lane in units of the lane width (Eq. (42)). The wavy behavior is due to the zigzag path purposedly held by the vehicle on the whole test run. The parameter x0 , proportional to the heading direction θ of the vehicle (Eq. (43)), is shown in Fig. 10c and the parameter b, tied to the curvature of the road (Eq. (45)), in Fig. 10d. The transition from the curved to the straight portion of the road is well visible in this last figure (between images 765 and 780). In the previous sections we have almost always neglected the term proportional to the power −2 of y − y0 , that is, the parameter b00 , tied, as shown in Appendix A, Eq. (68), to the change of curvature of the road. To see if this approximation is indeed correct, the complete model, that is, including the b00 term, has been fitted in the least square sense (see Section 3.1) to the lane boundaries between images 750 and 800, that is, across the transition from curved to straight road, where the contribution of the parameter b00 can only be significant. Both the parameters

b0 ≡ b and b00 are presented in Fig. 11. It can be see that |b00 | attains a maximum value of about 2 × 10−5 in correspondence to a b0 value that is decreasing from about 10−3 to 0. On the pixel plane, denoting by FX and FY the focal lengths in pixels in the x and y directions, the neglected term b00 /(y − y0 )2 decreases below 1 pixel for y − y0 > (FX b00 )1/2 , that is, for image rows displaced from the horizon more than 1Y ≡ FY (FX b00 )1/2 pixels. With FX = FY ∼ 500 pixels and b00 ∼ 2 × 10−5 , we have, for this worst case, 1Y ∼ 50 pixels. Note that, in the test sequence of 1000 images, b00 is appreciably different from zero only on about 20 images, that is, for about 1.6 out of 83 s. The above figures and considerations show that the hyperbolic model is qualitatively correct. To assess those results also from a quantitative point of view the following two tests have been performed. First, the lane width has been computed from Eq. 44 for each image of the sequence. The result was W = 3.52 ± 0.06 m while the true, measured value was 3.50 m. Moreover, since the vehicle followed a wavy trajectory, a second test was possible, namely to compute the travel speed from the algorithm outputs and confront it with the vehicle speedometer readout. Indeed, let µ

1d W

µ

¶ = k

a 1a

µ

¶ − k+1

a 1a

¶ k

be the lateral displacement of the vehicle between the frames k and k + 1. The heading direction of the vehicle at the same instant is   x 0 −1 . θk = tan q 1 + y02 k

PARAMETRIC MODEL OF THE PROJECTION OF A ROAD

423

FIG. 10. The model parameters computed for each image of the sequence of 1000 images.

Hence the travel speed, in lane widths per frame, is given by vk =

(1d/W )k sin θk

[W /frame].

Obviously, the above determination of the travel speed gives an indeterminate result ( 00 ) when the vehicle is moving along the road (θ = 0). Due to the zigzag trajectory, however, a meaningful result was obtained by averaging vk over some “wavelengths.” Figure 12 shows the moving average of vk over a window of length 200: hvk i200 . With W = 3.52 m and a frame rate of 12 frame per second, the speed change from about 0.4 to about 0.55 W /frame corresponds to a speed variation from 60 to 80 km/ h, consistent with the readouts of the vehicle speedometer. For what concerns the 3D reconstruction, the equations of Section 4.2 do not give, for the above test sequence, any information that is not contained in the lane keeping data yet. Indeed close to the vehicle, where a wide angle lens can give meaningful road borders, the road can be completely reconstructed given the position and heading of the vehicle inside the lane and the road local curvature. Much more interesting is the application of the reconstruction formulas to a system with two cameras: one equipped with a wide angle and the other with a telephoto

lens. Indeed, for such a system, the wide angle camera framing the neighborhood of the vehicle supplies the lane keeping informations, while the telephoto camera frames a more distant portion of the road that can lie on a different plane and hence needs the formulas of Section 4.2 to recover the 3D structure. Such a two-camera system is under development at the author’s institution and results will be reported in a forthcoming paper. 6. CONCLUSIONS This paper has described an algorithm to infer the 3D structure of the road ahead of an in-vehicle CCD camera and its motion state, from the lane borders on the image plane. A parametric model of the projective projection of the lane borders has been introduced and a fast and reliable algorithm to compute its parameters has been described. The usefulness of such a model rests on the physical significance of its parameters. Indeed, as it has been demonstrated, the model parameters completely determine the position of the vehicle inside the lane, its heading direction, and the local structure of the road close to the vehicle. Moreover, the modelization of the road borders around a given image point P allows us to reconstruct the 3D structure (road plane, road direction, and position vector from the projection

424

ANTONIO GUIDUCCI

merical tests have been performed by computing the road width and the vehicle travel speed from the model parameters and the results compared to the known values of the same quantities. The determination of the model parameters is based on the road borders from a single image and should be considered only as the input to a sequential estimator that takes into account the temporal evolution of the scene. Such a system is under study at the author’s institution and the results will be reported in a forthcoming paper. APPENDICES A. Relation of Parameters b0 and b00 to Curvature and Change of Curvature With reference to Fig. 13, let O be the optical center of the CCD camera, at a height H above the ground, and AP a segment of a clothoidal road border (curvature constant or changing linearly over arc length). Let OX R Y R Z R be a reference frame with ˆ parallel to the tangent to the road the Z R axis (unit vector w) ˆ horizontal, and let the border in A and X R axis (unit vector u) camera reference be specified by the unit vectors xˆ , yˆ , and zˆ with xˆ horizontal (see Appendix B). As shown by Dickmanns [7], for small changes of the road direction, that is, for ξ/ζ ¿ 1 (see Fig. 13), the road segment is approximable by the parametric equations (s is the arc length) FIG. 11. The parameters b0 (curvature) and b00 (change of curvature).

center) of the road around the 3D point that projects to P. The conditions of applicability of the model are also discussed. To verify the validity of the model and the physical interpretation of its parameters, the modelization algorithm has been implemented on the mobile laboratory MOBLAB [20] and a test run was performed on a highway, grabbing and processing a sequence of 1000 images at a frame rate of 12 images per second. The results are reported in graphical form. Moreover two nu-

FIG. 12. Moving average of the vehicle speed computed over a window of length 200.

(

ξ = 12 C0 s 2 + 16 C1 s 3 ζ = s,

FIG. 13. Computation of the coordinates of a road border point.

(51)

425

PARAMETRIC MODEL OF THE PROJECTION OF A ROAD

and Eqs. (58) and (59) become

where C0 , and C1 are the curvature and the change of curvature, respectively. Denoting the position vector from O to the generic point P with arc length s by X P (s) we can write µ

¶ 1 1 2 3 X P (s) = d + C0 s + C1 s uˆ + H vˆ + s w, ˆ 2 6

(52)

(53)

vˆ = 0 · xˆ + cos ϕ yˆ + sin ϕ zˆ

(54)

wˆ = sin θ xˆ − sin ϕ cos θ yˆ + cos ϕ cos θ zˆ .

(55)

d + sθ + 12 s 2 C0 + 16 s 3 C1 −dθ cos ϕ + H sin ϕ + s cos ϕ

y=

dθ sin ϕ + H cos ϕ − s sin ϕ −dθ cos ϕ + H sin ϕ + s cos ϕ

=−

where d is the horizontal distance of the camera to the road border. If θ and ϕ are defined as in Fig. 13, the relation between the ˆ vˆ , wˆ is two triads of unit vectors: xˆ , yˆ , zˆ , and u, uˆ = cos θ xˆ + sin ϕ sin θ yˆ − cos ϕ sin θ zˆ

x=

sin ϕ H/cos ϕ + . cos ϕ −dθ cos ϕ + H sin ϕ + s cos ϕ

d + sθ + 12 s 2 C0 + 16 s 3 C1 D(s) α = x0 + + β D(s) + γ D(s)2 D(s)

x=

Substituting into (52) we get the components of X P (s) in the camera reference xˆ , yˆ , zˆ and remembering that the image coordinates are given by X P (s) · xˆ X P (s) · zˆ

(56)

y=

X P (s) · yˆ , X P (s) · zˆ

(57)

H/cos ϕ D(s)



D(s) =

(63) H/cos ϕ , y − y0

(64)

where the unknown constant (i.e., independent of s) coefficients x0 , α, β, and γ are computed by equating equal powers of s on both sides of the equation d + sθ + 12 s 2 C0 + 16 s 3 C1 = α + x0 D(s) + β D(s)2 + γ D(s)3 . The result, taking into account the approximations θ ¿ 1, ξ/ζ ¿ 1, C1 /C0 ∼ 1/ς ¿ 1, C0 ≡ 1/R0 ¿ 1, is x0 '

we finally get

1 1 C0 C1 θ , α ' d, β ' 2 2 γ ' 6 3 cos ϕ cos ϕ cos ϕ

(65)

and Eq. (63), taking into account the value of D(s) given by (64), becomes

x= d cos θ +s sin θ + 12 s 2 C0 cos θ + 16 s 3 C1 cos θ −d cos ϕ sin θ + H sin ϕ +s cos ϕ cos θ − 12 s 2 C0 cos ϕ sin θ − 16 s 3 C1 cos ϕ sin θ

x − x0 =

(58) y= d sin ϕ sin θ + H cos ϕ −s sin ϕ cos θ + 12 s 2 C0 sin ϕ sin θ + 16 s 3 C1 sin ϕ sin θ −d cos ϕ sin θ + H sin ϕ +s cos ϕ cos θ − 12 s 2 C0 cos ϕ sin θ − 16 s 3 C1 cos ϕ sin θ

(59) We now introduce the approximation θ ¿ 1 (cos θ ' 1, sin θ ' θ, θ 2 ' 0), which is reasonable since the vehicle is running along the road. Since we also have ξ/ζ ¿ 1, the s dependent part of the denominators simplifies to 1 1 s cos ϕ cos θ − s 2 C0 cos ϕ sin θ − s 3 C1 cos ϕ sin θ 2 6 # " 1 C s 2 + 16 C1 s 3 2 0 ' s cos ϕ 1 − θ s = s cos ϕ[1 − θ(ξ/ζ )] ' s cos ϕ

(62)

We introduce now the notation D(s) ≡ −dθ cos ϕ + H sin ϕ + s cos ϕ and y0 = −sin ϕ/cos ϕ and write

y = y0 +

x=

(61)

α cos ϕ β H/cos ϕ γ H 2 /cos ϕ 2 (y − y0 ) + + H y − y0 (y − y0 )2

≡ a(y − y0 ) + .

b0 b00 + . y − y0 (y − y0 )2

(66)

We conclude that x0 = θ/cos ϕ, a = d cos ϕ/H as previously derived in Section 2 (Eqs. (2) and (1), respectively) and moreover that the sought for equations for b0 and b00 as a function of C0 and C1 are b0 =

1 C H 2 0 cos ϕ 3

(67)

b00 =

1 C H2 6 1 cos ϕ 5

(68)

B. Extrinsic Camera Calibration (60)

The procedure described in the text requires the knowledge of a reference frame (the vehicle reference uˆ v , vˆ v , wˆ v ) lined up

426

ANTONIO GUIDUCCI

with the vehicle, that is, with wˆ v in the heading direction, vˆ v orthogonal to the ground plane under the vehicle (downward), and uˆ v determined by the right hand rule. An easy way to determine such a reference (extrinsic calibration) is as follows. The wˆ v unit vector is defined to be in the heading direction. Suppose that, while the vehicle is running on a straight horizontal road, it grabs a sequence of N frames, and let Ri ≡ (x Ri , y Ri ), i = 1 . . . N be the vanishing points of the road in the sequence. The heading direction can be approximated by the unit vector from O to the mean of thePRi ; hence, if we define mˆ Ri ≡ N [x Ri , y Ri , 1], we have wˆ v = N1 i mˆ Ri . Let, as before, mˆ Ri be the unit vectors in the direction of the vanishing points of a straight horizontal road, in a sequence of N images. Suppose now, however, that the vehicle is performing a slow turn. Then the points Ri span (approximately) the horizon line (that is, the vanishing line of the ground plane of the vehicle) to the ground and the unit vector vˆ v , defined to be orthogonal P plane, will be determined by the equation i (ˆvv · mˆ Ri )2 = min with the condition vˆ v · wˆ v = 0. A solution of the above constrained least squares problem can be obtained as follows. Let v⊥ be a generic vector satisfying the constraint, that is, lying in the plane through O and orthogonal to wˆ v , and let uˆ 1 and uˆ 2 be any two orthonormal vectors in the same plane. We have · ¸ z1 ≡ AEz , v⊥ ≡ z 1 uˆ 1 + z 2 uˆ 2 ≡ [uˆ 1 , uˆ 2 ] z2

(69)

C. Kalman Filter Covariance Initialization The estimation error resulting from using (33) to initialize the filter state estimation equation is (writing, for simplicity, i instead of kmin + i)   n2  ˜ 2|2 ≡ X ˆ 2|2 − X2 =  (n 2 − n 1 )/T X   , (72) 2 (n 2 − 2n 1 + n 0 )/T where n i is the measurement noise on row kmin + i (n i is assumed to be a white noise sequence with zero mean and variance σn2 ). Hence, since the covariance matrix P2|2 is defined as © ª ˜ 2|2 X ˜> , P2|2 = E X 2|2 it follows that [P2|2 ]11 = σn2 [P2|2 ]12 = [P2|2 ]13 = [P2|2 ]22 = [P2|2 ]23 = [P2|2 ]33 =

>

where A is a 3 × 2 matrix and Ez ≡ [z 1 , z 2 ] a column vector with two components. The above constrained least squares problem can now be written (using the notation a> b to denote the dot product a · b) X¡

mˆ >Ri v⊥

i

¢2

=



¢2 X > > (Ez ) A mˆ Ri mˆ >Ri AEz = min, mˆ >Ri AEz =

i

i

(70) that is, >

(Ez ) A

>

µX

mˆ Ri mˆ >Ri

¶ AEz = min.

(71)

i

P If the eigenvalues of the 2 × 2 matrix A> ( i mˆ Ri mˆ >Ri )A are σ1 < σ2 and the corresponding orthonormal eigenvectors are ζE1 and ζE2 , the solution of Eq. (71) is ζE1 (the eigenvector corresponding to the smallest eigenvalue). We conclude that vˆ v = AζE1 . The third unit vector of the sought for vehicle triad must be orthogonal to the other two and it is obviously given by uˆ v = AζE2 = vˆ v × wˆ v . The extrinsic calibration is completed by the determination of the height H of the camera above the ground. This can be computed from Eq. (44) if the width W of the road is known.

1 σ2 E{n 2 (n 2 − n 1 )} = n T T 1 σn2 E{n (n − 2n + n )} = 2 2 1 0 T2 T2 1 2σ 2 E{(n 2 − n 1 )(n 2 − n 1 )} = 2n 2 T T 1 3σn2 E{(n − n )(n − 2n + n )} = 2 1 2 1 0 T3 T3 1 6σn2 E{(n − 2n + n )(n − 2n + n )} = . 2 1 0 2 1 0 T4 T4 REFERENCES

1. D. Pomerleau, RALPH: Rapidly adapting lateral position handler, Proc. of IEEE Symp. on Intelligent Vehicles, Detroit, Michigan, 1995. 2. D. Pomerleau and T. Jochem, Rapidly adapting machine vision for automated vehicle steering, IEEE Expert 11(2), 1996, 19–27. 3. K. Kluge, Extracting road curvature and orientation from image edge points without perceptual grouping into features, Proc. of Intelligent Vehicle Symp., Paris, 1994, pp. 109–114. 4. E. C. Yeh, J. C. Hsu, and R. H. Lin, Image-based dynamic measurement for vehicle steering control, Proc. of Intelligent Vehicle Symp., Paris, 1994, pp. 326–332. 5. T. Yokoyama, A. Tachibana, T. Suzuki, and H. Inoue, Automated vehicle system using both a computer vision and magnetic field sensors, Proc. of Intelligent Vehicle Symp., Tokyo, 1993, pp. 157–162. 6. J. C. Hsu, W. L. Chen, R. H. Lin, and E. C. Yeh, Estimations of previewed road curvatures and vehicular motion by a vision-based data fusion scheme, Mach. Vision Appl. 9, 1997, 179–192. 7. E. D. Dickmanns and B. D. Mysliwetz, Recursive 3-D road and relative ego-state recognition, IEEE Trans. Pattern Anal. Mach. Intell. 14(2), 1992, 199–213. 8. M. A. Turk, D. G. Morgenthaler, K. D. Gremban, and M. Marra, VITS—A vision system for autonomous land vehicle navigation, IEEE Trans. Pattern Anal. Mach. intell. 10(3), 1988, 342–361. 9. D. G. Morgenthaler, S. J. Hennessy, and D. DeMenthon, Range-video fusion and comparison of inverse perspective algorithms in static images, IEEE Trans. Syst. Man Cybern. 20(6), 1990, 1301–1312.

PARAMETRIC MODEL OF THE PROJECTION OF A ROAD 10. J. S. Kay and C. E. Thorpe, STRIPE: Supervsed teleRobotics using incremental polygonal Earth geometry, Proc. of the Third Int. Conf. on Intelligent Autonomous Systems, Pittsburgh, PA, 1993. 11. D. DeMenthon, A zero-bank algorithm for inverse perspective of a road from a single image, Proc. IEEE Int. Conf. Robotics Automation, Raleigh, NC, March–April 1987, pp. 1444–1449. 12. K. Kanatani and K. Watanabe, Reconstruction of 3-D road geometry from images for autonomous land vehicles, IEEE Trans. Robotics Automat. 6(1), 1990, 127–132. 13. D. DeMenthon and L. S. Davis, Reconstruction of a road by local image matches and global 3D optimization, Proc. IEEE Int. Conf. Robotics Automation, Cincinnati, OH, May 1990, pp. 1337–1342. 14. K. Kanatani and K. Watanabe, Road shape reconstruction by local flatness approximation, Adv. Robotics 6, 1992, 197–213. 15. K. Kanatani, Geometric Computation for Machine Vision, Clarendon Press, Oxford, 1993.

427

16. A. Guiducci, 3D road reconstruction from a single view, Comput. Vision Image Understanding 70(2), 1998, 212–226. 17. R. Y. Tsai, An efficient and accurate camera calibration technique for 3D machine vision, Proc. IEEE Int. Conf. on Computer Vision and Pattern Recognition, Miami Beach, 1986, pp. 364–374. 18. G. Colombo, Manuale dell’Ingegnere, Hoepli, Milano, 1990. 19. A. P. Sage and J. L. Melsa, Estimation Theory with Applications to Communications and Control, McGraw-Hill, New York, 1971. 20. A. Cumani, S. Denasi, P. Grattoni, A. Guiducci, G. Pettiti, and G. Quaglia, MOBLAB: a mobile laboratory for testing real-time vision-based systems in path monitoring, Proc. SPIE Int. Conf. Photonics for Industrial Applications, Mobile Robots IX, Boston, 1994, pp. 228–238. 21. A. Guiducci and G. Quaglia, Attitude of a vehicle moving on a structured road, 5th International Workshop on Time Varying Image Processing and Moving Object Recognition, Florence, Italy, 1996, pp. 295– 300.