
Mathl. Comput. Modelling Vol. 27, No. 9-11, pp. 189-203, 1998
© 1998 Elsevier Science Ltd. All rights reserved. Printed in Great Britain
0895-7177/98 $19.00 + 0.00
PII: S0895-7177(98)00059-4

Model-Based Tracking of Complex Innercity Road Intersections

F. HEIMES
Fraunhofer-Institut für Informations- und Datenverarbeitung (IITB)
Fraunhoferstr. 1, D-76131 Karlsruhe, Germany

H.-H. NAGEL
Fraunhofer-Institut für Informations- und Datenverarbeitung (IITB)
Fraunhoferstr. 1, D-76131 Karlsruhe, Germany
and
Institut für Algorithmen und Kognitive Systeme, Fakultät für Informatik der Universität Karlsruhe (TH)
Postfach 6980, D-76128 Karlsruhe, Germany
hhn@iitb.fhg.de

T. FRANK
Fraunhofer-Institut für Informations- und Datenverarbeitung (IITB)
Fraunhoferstr. 1, D-76131 Karlsruhe, Germany

Abstract. Vision-based automatic driving along innercity roads and across complex innercity intersections requires detecting and tracking road markings and lane boundaries in order to determine the position and orientation of the vehicle relative to the ground. The complexity of intersection scenes and the disturbances in the detected contours enforce the use of model-based state estimation techniques. We recorded monocular image sequences of complex innercity road intersections from a moving small experimental truck in order to track intersection models using a Kalman filter. A scalar distance measure for the distance between image contour points and model edge segments turned out to be advantageous for the estimation process. This scalar measure is based on the simultaneous exploitation of edge element location and direction, thus using more information about the image gray value variation than previously, when only the perpendicular distance between the edge element and the model segment was taken into account. We report about the theoretical foundation of this approach, its implementation, and experimental results from several intersection sequences, as well as detailed comparisons with a previous, less sophisticated approach. These examples demonstrate the kind of problems which are likely to occur more frequently at intersection scenes than at those road scenes encountered while following a more or less straight, uninterrupted highway lane with clearly marked boundaries. © 1998 Elsevier Science Ltd. All rights reserved.

Keywords: Autonomous road vehicle, Intersection models, Image sequence analysis, Kalman filtering.

1. INTRODUCTION

In 1995, we reported about our achievements in developing a driver's warning assistant, algorithms for automatic evasion maneuvers, and highway road following algorithms [1]. These goals were necessary steps on our way to an automatic evaluation of traffic situations. The driving experience obtained with our experimental vehicles MB 609D and BMW 735 iL enabled us to extend our set of examined situations to more complex innercity traffic scenes.




Figure 1. Aggregation of edge elements to model segments. Points represent relevant edge elements. The thick line is a model segment, the thin line represents a data segment obtained by aggregation of edge elements within an acceptance corridor around the model segment.

Figure 2. Assignment of edge elements to model segments.

Figure 3. Comparison of the Kalman filter results in a right turn at the innercity intersection "Moltke" in Karlsruhe. The different trajectories will be discussed in detail in Figures 6-9.



Figure 4. Comparison of the Kalman filter results in a left turn at the innercity intersection "Tulla" in Karlsruhe. The predicted trajectory would lead the vehicle to the left, off the lane. "Prediction" denotes the trajectory calculated entirely from vehicle sensor data using dead-reckoning. Both approaches result in trajectories which appear to be close to the actually driven path. Figures 10-12 illustrate the details.

Figure 5. Comparison of the Kalman filter results to a driving experiment on the parking lot of our premises ("Parking"). The images in Figures 13-15 demonstrate a case of similar behavior of both approaches.

The scene presented to a driver who crosses an innercity intersection usually contains many different objects like cars, pedestrians, markings, and buildings. These objects need to be recognized and tracked by the driver or by a vision-based system that controls the road vehicle in order to avoid collisions and to obey traffic rules. A subtask of this problem consists in the detection and tracking of the boundaries and markings of an intersection. Since the elevation angle of our camera relative to the groundplane is less than 20°, most objects occlude other objects, many of these being part of road markings and lane boundaries.

The lighting conditions depend on the time of day and the weather conditions. During daylight, bright sunshine causes reflections and shadows with sharp edges, whereas rain puddles cause reflections and cloudy weather yields more uniform illumination. Exploitation of color information in the image sequences may help to find road markings [2] and road surface regions [3,4]. However, many lanes are not marked in color and sidewalks often have a color similar to that of the lane.



Figure 6. First frame of the sequence “Moltke” with the initial projected model and associated image features. The upper part illustrates the result of the aggregation, the lower part the result of the assignment approach. The image area near the rear window of the car illustrates nicely how irrelevant data segments or edge elements due to the occluding car come dangerously close to a road model segment and thus can seriously disturb the road model state update process.

It is thus necessary to use a lane detection and tracking method which:

1. can cope with partial occlusion of markings and lane boundaries by natural and artificial objects,
2. can handle changing lighting conditions, and
3. is able to detect different kinds of lane boundaries.



In a purely data driven approach, gray value changes in several search windows are examined to determine the lane boundaries. Such an approach is fast but works only for boundaries with good contrast in the image, without significant occlusion, and under good lighting conditions. The detection and tracking of more or less straight road markings and lane boundaries has been solved by several groups, e.g., [1,5,6]. Solutions have even been realized in hardware [7]. Additional examinations of textured regions and the orientation of road markings provide a possibility to detect junctions [8].

To improve the robustness of our tracking process [9], we use a model driven approach. This is especially useful for dealing with Problems 1 and 2 above. Our intersection model consists of pairs of polygonal lines corresponding to the road markings and lane boundaries (see Figure 3). These are currently extracted from a detailed road map. Roll and yaw movements of our experimental vehicles due to rough road segments made it necessary to estimate all six degrees of freedom of the vehicle pose with an Iterated Extended Kalman Filter (IEKF). We report about the development of matching complex models of innercity intersections to different kinds of image features using an IEKF.
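To make the matching concrete, the following sketch shows how a polygonal ground-plane model could be projected into the image for a given 6-DOF pose, i.e., the prediction step that precedes any data association. This is a generic pinhole-camera illustration under assumed intrinsics, pose conventions, and function names, not the authors' implementation.

```python
# Illustrative sketch (not the paper's code): project a polygonal ground-plane
# intersection model into the image for a given 6-DOF camera pose.
# Focal length, principal point, and the pose convention are assumptions.
import numpy as np

def rotation_matrix(roll, pitch, yaw):
    """R = Rz(yaw) @ Ry(pitch) @ Rx(roll) -- one common convention."""
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def project_model(polylines, pose, f=800.0, cx=384.0, cy=288.0):
    """Project ground-plane polylines (lists of (x, y, 0) points, metres)
    into the image for pose = (x, y, z, roll, pitch, yaw) of the camera."""
    t = np.asarray(pose[:3], dtype=float)
    R = rotation_matrix(*pose[3:])
    projected = []
    for line in polylines:
        pts = []
        for p in line:
            pc = R.T @ (np.asarray(p, dtype=float) - t)  # world -> camera coordinates
            if pc[2] <= 0.1:                             # skip points behind the camera
                continue
            pts.append((f * pc[0] / pc[2] + cx, f * pc[1] / pc[2] + cy))
        projected.append(pts)
    return projected
```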

2. MATCHING INTERSECTION MODELS TO DATA SEGMENTS AND TO EDGE ELEMENTS

A standard approach for edge detection aggregates adjacent edge elements into data segments provided their angles do not differ by more than a given fixed threshold. Due to disturbances, however, it is in many cases not possible to extract data segments with approximately the same length as the model segments with which they should be associated. A variation of this approach was tested by assigning multiple data segments to one model segment, yielding more and better assignments. The main weakness of this variant, however, is the fact that information about the length of data segments was not taken into account. This resulted in situations where a few short data segments could drag the model segment away from a long data segment with which it should have been associated.

In order to overcome this weakness, we restricted the aggregation process (edge elements → data segments) to those edge elements within a tolerance corridor defined by fixed thresholds for the perpendicular distance and for the angle difference with respect to a model segment (see Figure 1). The resulting aggregated data segments were then fitted to the model segment. Most of these data segments match the image line segments reasonably well and have a similar angle. Edge elements caused by image noise can be included into the aggregation process, because the covariances of the edge element angle and location have been neglected so far. This increases the noise sensitivity of the position and angle of the aggregated data segments (see Section 5).

This approach, which will be referred to as the aggregation approach, nevertheless enabled us to track a model of the parking lot on our premises along a 1332 frame image sequence recorded from our test vehicle [9]. The algorithm used little information about the scene, so it was quite sensitive to the disturbances present in an image sequence of an innercity intersection (see Section 5).
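The corridor test of the aggregation approach can be illustrated by the following minimal sketch. The thresholds, the total-least-squares fit, and all names are assumptions introduced for illustration; none of them come from the paper.

```python
# Minimal sketch of the aggregation idea: keep only edge elements inside a
# fixed corridor around a projected model segment, then fit one data segment.
import numpy as np

def aggregate_data_segment(edges, p0, p1, max_dist=5.0, max_angle=np.radians(15)):
    """edges: iterable of (u, v, theta) edge elements; p0, p1: model segment endpoints."""
    p0 = np.asarray(p0, dtype=float)
    d = np.asarray(p1, dtype=float) - p0
    length = np.hypot(d[0], d[1])
    t, n = d / length, np.array([-d[1], d[0]]) / length   # tangent and normal
    seg_angle = np.arctan2(d[1], d[0])

    accepted = []
    for u, v, theta in edges:
        r = np.array([u, v]) - p0
        along, perp = r @ t, r @ n
        # edge gradient should be roughly normal to the segment direction (mod pi)
        diff = theta - seg_angle - np.pi / 2
        dang = abs((diff + np.pi / 2) % np.pi - np.pi / 2)
        if 0.0 <= along <= length and abs(perp) <= max_dist and dang <= max_angle:
            accepted.append((u, v))
    if len(accepted) < 2:
        return None
    pts = np.array(accepted)
    mean = pts.mean(axis=0)
    # total least-squares line fit: principal direction of the accepted points
    _, _, vt = np.linalg.svd(pts - mean)
    direction = vt[0]
    proj = (pts - mean) @ direction
    return mean + proj.min() * direction, mean + proj.max() * direction
```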

3. ASSOCIATION OF EDGE ELEMENTS TO MODEL SEGMENTS

In order to improve the tracking robustness, we assigned multiple edge elements to one model segment. Although this approach is computationally expensive, it offers the opportunity to study an intuitively more attractive distance measure along with its covariance (see Figure 2). An edge element is a potential candidate for the association to a model segment if its gradient direction is approximately perpendicular to the model segment orientation and if its perpendicular distance to the model segment is small. The distance measure thus is the vector $f = (\Delta\theta, a)^T$,


Figure 7. Since the number of edge elements per data segment was not taken into account, the aggregation approach failed to pull the model onto the interrupted road markings in frame 80.

where $a$ is the perpendicular distance of the edge element from the model segment and
$$\Delta\theta = \theta_{\mathrm{element}} - \arctan\!\left(\frac{dv_m}{du_m}\right) - \frac{\pi}{2}$$
is the angle difference between the gray value gradient direction and the normal to the projected model segment. Given the covariances $\sigma_\theta$, $\sigma_u$, and $\sigma_v$ for the angle and the coordinates of edge elements, the covariance matrix $R$ for our distance measure can be derived as
$$R = \begin{pmatrix} \sigma_\theta^2 & 0 \\ 0 & \dfrac{du_m^2\,\sigma_v^2 + dv_m^2\,\sigma_u^2}{du_m^2 + dv_m^2} \end{pmatrix}. \tag{1}$$
See the Appendix for a detailed derivation.

Figure 8. After the markings reached the foreground in frame 160, both approaches yielded almost equivalent state estimations.
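The following sketch shows how the vector measure $f = (\Delta\theta, a)^T$ and its covariance $R$ might be computed for a single edge element. It is our reconstruction for illustration, not the authors' code: the sign and offset conventions and the function name `distance_measure` are assumptions, and the noise values follow Section 5.

```python
# Hedged sketch of the vector distance measure f = (delta_theta, a)^T.
# Sigma values follow Section 5 (sigma_theta = 5 deg, sigma_u = sigma_v = 1 pixel).
import numpy as np

SIGMA_THETA = np.radians(5.0)
SIGMA_U = SIGMA_V = 1.0

def distance_measure(edge, p0, p1):
    """edge = (u, v, theta): edge element position and gradient direction.
    p0, p1: endpoints of the projected model segment (du_m, dv_m = p1 - p0).
    Returns f = (delta_theta, a) and its 2x2 covariance matrix R."""
    u, v, theta = edge
    du_m, dv_m = p1[0] - p0[0], p1[1] - p0[1]
    l2 = du_m**2 + dv_m**2

    # perpendicular (signed) distance of the edge element from the segment
    a = (du_m * (v - p0[1]) - dv_m * (u - p0[0])) / np.sqrt(l2)
    # angle between the gradient direction and the segment normal
    delta_theta = theta - np.arctan2(dv_m, du_m) - np.pi / 2

    R = np.array([[SIGMA_THETA**2, 0.0],
                  [0.0, (du_m**2 * SIGMA_V**2 + dv_m**2 * SIGMA_U**2) / l2]])
    return np.array([delta_theta, a]), R
```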



Figure 9. Final frame of the sequence "Moltke". The roll angle was estimated insufficiently in the previous frames. This resulted in an estimation error on the right lane boundary for both approaches.

This approach represents a variation of the one presented in [10], applied to individual edge elements.

4. ORIENTATION SENSITIVE DISTANCE MEASURE

The combination of the two distance measure components of $f$ into one scalar distance measure $b$ is the next step to reduce the algorithm's run time without loss of robustness. This approach will be referred to as the assignment approach. This is not a different assignment algorithm; experiments, therefore, yielded the same assignments for both distance measures $f$ and $b$, albeit with a different distance value.



Figure 10. First frame of the sequence "Tulla" with the initial projected model and associated image features. The upper part illustrates the result of the aggregation, the lower part the result of the assignment approach. Please note that the white lines are the projected model and not extracted image features; the extracted features are indicated here as black line segments (upper part) for the aggregation approach and as black edge elements (lower part) for the assignment approach.

The orientation sensitive distance measure is
$$b = \frac{a}{\cos \Delta\theta}$$
(see Figure 2). The corresponding covariance $\sigma_b^2$ can be calculated from $R$ in equation (1) with the partial derivatives of $b$. See the Appendix for a detailed derivation.
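A short sketch of the scalar measure and its propagated variance follows. It is again a hedged illustration: the Jacobian-based propagation mirrors the Appendix, while the function name `scalar_distance` is an assumption.

```python
# Sketch of the scalar, orientation sensitive measure b = a / cos(delta_theta)
# and its variance by first-order propagation of R (our reading of eq. (1)).
import numpy as np

def scalar_distance(delta_theta, a, R):
    """R is the 2x2 covariance of (delta_theta, a) from the previous sketch."""
    c = np.cos(delta_theta)
    b = a / c
    # Jacobian of b with respect to (delta_theta, a)
    J = np.array([a * np.sin(delta_theta) / c**2, 1.0 / c])
    sigma_b2 = J @ R @ J
    return b, sigma_b2
```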



Figure 11. The car occludes some image features in frame 177 that are crucial for a proper estimation of the model's pitch angle. This results in an estimation error that accumulates in the following frames.

Using edge elements directly, instead of data segments obtained by data-driven aggregation of edge elements, yields better assignments and more robust Kalman filter results, because the set of edge elements represents the projected object more accurately than the configuration of aggregated data segments (see Figure 7).

5. EXPERIMENTAL RESULTS

The video data for our experiments was obtained with our experimental vehicle MB 609D (see, e.g., [1]). The thin lines in Figures 3, 4, and 5 represent the polygonal model which was constructed from a detailed map of road markings and lane boundaries.



Figure 12. In the final frame, the estimation error caused in previous frames could not be entirely compensated by the aggregation approach, although the lateral position estimation appears to be correct in both images. The white projected lane model in the lower image is almost correctly positioned on the boundary of the right junction. The assignment approach was less affected by the disturbances illustrated in Figure 11.

Thin dotted lines represent road markings and lane boundaries which were not visible in the recorded image sequence. The dotted, the thick solid, and the dashed lines are the trajectories derived from the vehicle sensors, calculated with the aggregation approach, and calculated with the assignment approach, respectively. The first, the last, and an intermediate image from three sequences are presented along with the projected model and the associated image features in Figures 6-15. The white solid lines are the projected model, the black solid lines in the upper part of these figures are the aggregated data segments which were associated with the model segments. The dots in the lower part represent the associated edge elements.


Figure 13. First frame of the sequence “Parking” with the initial projected model and associated image features. The upper part illustrates the result of the aggregation, the lower part the result of the assignment approach.

6. CONCLUSIONS AND OUTLOOK

The efforts to improve the tracking of objects in image sequences of outdoor scenes led to several approaches for fitting image data to projections of models of these objects. The standard procedures, which performed well on indoor scenes, turned out to be unsuitable for our purposes. In this work, we introduced two approaches for associating edge elements with model segments: the aggregation and the assignment approach. Both approaches enabled us to track complex polygonal models of innercity road intersections with a 6-DOF Iterated Extended Kalman Filter (IEKF) with identical parameter values for different image sequences. Our approaches can cope with partial occlusion and can track different kinds of lane boundaries (sidewalks as well as road markings).

The main difference between the two approaches is their trade-off between run time and accuracy. The aggregation approach needs fewer computations per iteration because it uses fixed thresholds for the association of edge elements, while the assignment approach computes the Mahalanobis distance for each edge element. The assignment approach needs six IEKF iterations per frame on average, compared to four for the aggregation approach. This could be related to the fact that the Kalman filter with the assignment approach has ten times as many data-model associations available for fine-tuning its state estimation, compared to the aggregation approach.
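The Mahalanobis distance test mentioned above can be illustrated by a short generic sketch. This is not the authors' code; the residual convention and the chi-square threshold are assumed values chosen for illustration.

```python
# Illustrative Mahalanobis gating of a measurement residual against an
# innovation covariance S (threshold assumed, ~95% for 2 degrees of freedom).
import numpy as np

def gate(residual, S, chi2_threshold=5.99):
    """Accept a measurement if its squared Mahalanobis distance with respect
    to the innovation covariance S falls below a chi-square threshold."""
    d2 = residual @ np.linalg.solve(S, residual)
    return d2 <= chi2_threshold
```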

Figure 14. The left turn in frame 600 of the sequence "Parking".

There are some problems which are typical for innercity road intersections, compared to more or less straight lanes like motorways, that temporarily increase the error in the state estimation or could even make both approaches diverge. Road markings vanish temporarily or they are interrupted in the first place. Markings, lane boundaries, and other objects occur in arbitrary orientation; on straight lanes, markings are usually parallel to the vehicle's heading. This circumstance allows the use of custom made road marking detection algorithms for highways which do not work on image sequences of urban traffic scenes. Urban road surfaces are usually patchier than motorways due to more frequent repairs. These patches often cause data segments near lane boundaries which, unfortunately, can be caught by the Kalman filter.

The experiments have demonstrated that we have to anticipate a large variety of disturbances. A sufficient robustness against disturbances is of great importance, in particular while tracking innercity intersections. We therefore study different approaches irrespective of the computational requirements in order to explore the 'space' spanned by dimensions of robustness against computational efforts.

APPENDIX
DERIVATIONS

In a Kalman filter, the uncertainty of a distance measure is modeled explicitly in the form of a covariance matrix. The elements of this matrix can usually not be estimated directly, in contrast to the covariances of the underlying measurements.



Figure 15. Final frame of the sequence "Parking". This image sequence demonstrates a case in which both approaches produce about the same estimation results.

In our case, the distance measure is derived from edge elements that have been calculated from the image (see Section 3). We observed a standard deviation of $\sigma_\theta = 5°$ for the edge element angle and $\sigma_u = \sigma_v = 1$ pixel for the edge element coordinates. The covariance of the distance measure $f$ can be determined by transforming the $(\theta, u, v)$-space into the $(\Delta\theta, a)$-space:

$$R = \begin{pmatrix} \dfrac{\partial \Delta\theta}{\partial \theta} & \dfrac{\partial \Delta\theta}{\partial u} & \dfrac{\partial \Delta\theta}{\partial v} \\[8pt] \dfrac{\partial a}{\partial \theta} & \dfrac{\partial a}{\partial u} & \dfrac{\partial a}{\partial v} \end{pmatrix} \begin{pmatrix} \sigma_\theta^2 & 0 & 0 \\ 0 & \sigma_u^2 & 0 \\ 0 & 0 & \sigma_v^2 \end{pmatrix} \begin{pmatrix} \dfrac{\partial \Delta\theta}{\partial \theta} & \dfrac{\partial \Delta\theta}{\partial u} & \dfrac{\partial \Delta\theta}{\partial v} \\[8pt] \dfrac{\partial a}{\partial \theta} & \dfrac{\partial a}{\partial u} & \dfrac{\partial a}{\partial v} \end{pmatrix}^{T}$$

Using $l^2 = du_m^2 + dv_m^2$ yields

$$R = \begin{pmatrix} \sigma_\theta^2 & 0 \\ 0 & \dfrac{du_m^2\,\sigma_v^2 + dv_m^2\,\sigma_u^2}{l^2} \end{pmatrix}.$$

In the same fashion, the covariance for the orientation sensitive distance measure can be computed from the distance measure $f$ by transforming the $(\Delta\theta, a)$-space into the $b$-space:

$$\sigma_b^2 = \begin{pmatrix} \dfrac{\partial b}{\partial \Delta\theta} & \dfrac{\partial b}{\partial a} \end{pmatrix} R \begin{pmatrix} \dfrac{\partial b}{\partial \Delta\theta} \\[6pt] \dfrac{\partial b}{\partial a} \end{pmatrix} = \frac{\sigma_a^2}{\cos^2 \Delta\theta} + \frac{a^2\,\sin^2 \Delta\theta}{\cos^4 \Delta\theta}\,\sigma_\theta^2,$$

where $\sigma_a^2$ denotes the lower right element of $R$.
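As a quick plausibility check of the propagation above, the analytic $\sigma_b^2$ can be compared with an empirical variance obtained by sampling. This check is not part of the paper; it reuses the hypothetical `distance_measure` and `scalar_distance` sketches from Sections 3 and 4 and made-up test values.

```python
# Optional numerical check (not in the paper) of the first-order propagation:
# sample noisy edge elements, compute b, and compare the empirical variance
# with the analytic sigma_b^2. All inputs below are made-up test values.
import numpy as np

rng = np.random.default_rng(0)
sigma_theta, sigma_u, sigma_v = np.radians(5.0), 1.0, 1.0
p0, p1 = np.array([100.0, 200.0]), np.array([300.0, 250.0])
edge = np.array([180.0, 230.0,
                 np.arctan2(p1[1] - p0[1], p1[0] - p0[0]) + np.pi / 2])

samples = []
for _ in range(20000):
    noisy = edge + rng.normal(0.0, [sigma_u, sigma_v, sigma_theta])
    f, _ = distance_measure(noisy, p0, p1)        # from the Section 3 sketch
    samples.append(f[1] / np.cos(f[0]))           # b = a / cos(delta_theta)
print("empirical var(b):", np.var(samples))

f0, R = distance_measure(edge, p0, p1)
_, sigma_b2 = scalar_distance(f0[0], f0[1], R)    # from the Section 4 sketch
print("analytic  var(b):", sigma_b2)
```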

REFERENCES

1. H.-H. Nagel, W. Enkelmann and G. Struck, FhG-Co-Driver: From map-guided automatic driving by machine vision to a cooperative driver support, Mathl. Comput. Modelling 22 (4-7), 185-212, (1995).
2. K. Kluge and G. Johnson, Statistical characterization of the visual characteristics of painted lane markings, In Proceedings Intelligent Vehicles '95 Symposium, September 25-26, 1995, Detroit, MI, pp. 488-493.
3. C. Fernandez-Maloigne and W. Bonnet, Texture and neural network for road segmentation, In Proceedings Intelligent Vehicles '95 Symposium, September 25-26, 1995, Detroit, MI, pp. 344-349.
4. E. De Micheli, R. Prevete, G. Piccioli and M. Campani, Color cues for traffic scene analysis, In Proceedings Intelligent Vehicles '95 Symposium, September 25-26, 1995, Detroit, MI, pp. 466-471.
5. A. Broggi, A massively parallel approach to real-time vision-based road markings detection, In Proceedings Intelligent Vehicles '95 Symposium, September 25-26, 1995, Detroit, MI, pp. 84-89.
6. K. Kluge and S. Lakshmanan, Deformable-template approach to lane detection, In Proceedings Intelligent Vehicles '95 Symposium, September 25-26, 1995, Detroit, MI, pp. 54-59.
7. T. Haga, K. Sasakawa and S. Kuroda, The detection of lane boundary markings using the modified spoke filter, In Proceedings Intelligent Vehicles '95 Symposium, September 25-26, 1995, Detroit, MI, pp. 293-297.
8. E. Ekinci and B.T. Thomas, Navigating an autonomous road vehicle in complex road networks, In Asian Conference on Computer Vision (ACCV) '95, December 5-8, 1995, Singapore, Volume I, pp. 229-233.
9. V. Gengenbach, H.-H. Nagel, F. Heimes, G. Struck and H. Kollnig, Model-based recognition of intersections and lane structures, In Proceedings Intelligent Vehicles '95 Symposium, September 25-26, 1995, Detroit, MI, pp. 512-517.
10. R. Deriche and O. Faugeras, Tracking line segments, Image and Vision Computing 8 (4), 261-270, (1990).