IFAC Artificial Intelligence in Real-Time Control, Kuala Lumpur, Malaysia, 1997
A visual, knowledge-based robot tracking scheme

Khoo Bee Ee and M. G. Rodd
Department of Electronic and Electrical Engineering, University of Wales Swansea, Singleton Park, SA2 8PP, United Kingdom
{K.B.Ee, M.G.Rodd}@swansea.ac.uk
Abstract
Robots are major components in most forms of automation. However, despite their extensive use, there are still many important aspects of their control and supervision which require attention. Firstly, there is the problem of controlling low-cost, flexible systems, often in situations in which the extensive use of sensors is not possible. Secondly, there is the continual problem of ensuring safety and supporting dynamic fault recovery. In all these cases, there is an urgent need for solutions which provide accurate and complete position information, even when the robot is in motion. In this paper, a simple, knowledge-based system for tracking the robot in motion is proposed. The system, which employs a model-based approach, uses a single, stationary camera and requires no special markings on the robot. The system, which has been proven in practical experiments, recovers information about the precise position of the robot. This information can then be used for control, fault detection and fault recovery.
Keywords: Robot monitoring, image sequence analysis, motion tracking, model-based vision, fault recovery
1. Introduction

Robots have been increasingly used to help humans with repetitive and tedious work, especially in difficult environments such as welding, mining and medicine. This increasing usage requires robots to have better, more precise control, together with increased capabilities, such as the ability to recognise objects in their environment and avoid collisions. These needs grow in the case of dangerous environments, where there is a need not only to ensure safe, reliable operation but also to control the robot remotely. From the control point of view, there is a pressing need to develop precise, reliable robots which are relatively low cost and highly re-configurable. To this end, much research is being undertaken towards the production of highly flexible systems, the so-called "sloppy, floppy robot". The objective here is to build systems which are light, fast and inherently flexible.
Clearly, the dynamics of such systems will be complex, and the assumptions which are made in the control of most conventional robots, mainly relating to the rigidity of the component members, are simply not acceptable. The consequence is that advanced control algorithms must be introduced. Such algorithms, though, are totally dependent on feedback from the robot, with the need for good, timely information regarding speed, position, etc.
At the same time, an important aspect of using robots in any situation is safety, and this relates directly to the question of reliability, since a safe system can only be built from a reliable one. Robots may, for example, collide with each other, with the humans around them or with other objects in their environment. To prevent such accidents and to ensure that the operating robots work accurately and reliably, their operation should be closely monitored. Thus, for example, the robot should be allowed to work only within a defined envelope. Any movement outside this envelope can be considered to be an error and must be dealt with accordingly.
The consequence of the need to provide better control and also to ensure more reliable operation is that accurate and timely position information is required. Of course, this can be achieved using a variety of sensors distributed around the robot, indicating its position, velocity, joint angles, etc. However, currently most such sensors need to be attached to the robot, with obvious consequences for the payload and wiring complexity. The use of vision systems to observe the position of a robot therefore becomes very attractive, and this is an area of research which has attracted much attention. Such visual monitoring can, potentially, be used not only to aid in control itself, but also directly to detect erroneous operation. The visual information can then also be used for fault recovery. Normally, in the case of a failure, a robot is reset to its starting position or to a known reference position. Thus the process it has been carrying out has to be abandoned and the system started afresh. With visual monitoring, however, the information gained about the robot throughout the process can be used to help restore the situation to a suitable recovery position which existed before the failure occurred, hence possibly avoiding having to restart the process from a defined initial point and losing work in progress.
Many existing robotic systems have incorporated vision. Haass [7], for example, developed an automatic visual surveillance system for robots in industrial workroom environments, with an emphasis on the prevention of collisions between the robot and human workers. The system is based on detection, recognition and tracking of moving objects in digital workroom images. Leou et al. [8] proposed a vision system to monitor multiple operating robot manipulators for on-line collision avoidance by image sequence analysis. Their system employs multiple camera views to monitor the scene. For each view, the three most recent image frames (the current and previous two frames) are used to predict each robot's shape and location. Here, three-dimensional (3D) robot operation monitoring for collision avoidance is decomposed into collision checking performed via multiple 2D image sequences.
Mulligan et al. [9] describe a model-based analysis-by-synthesis method to estimate the position of an excavator's arm. This approach utilises the fact that the position of each link depends on the position of the preceding link. The method uses geometric constraints on the possible position and orientation of manipulator components to segment the object and to estimate its pose. The approach iteratively computes the difference between the real image and the synthetic image generated from an arm and camera model until a pre-selected criterion is minimised.
Dhome et al. [6] present a method to determine the pose of an articulated object from a single perspective view. This method requires a CAD model of the object and a sufficient set of matchings between image segments and model segments. Vincze et al. [13] proposed a laser tracking system to measure the position and orientation of a robot end effector under motion. A retroreflector is mounted on the robot's end effector and a vision system is used to analyse the profile of the reflected laser beam. In the following sections, a system to monitor and track a robot at work is described. The proposed system, which uses a static camera and requires no markers on the robot, recovers the parameters of the position of the robot. The information gained can then be used to determine the envelope the robot occupies and, in the case of a failure, aid in repositioning the robot to a suitable position. The information can also be used for direct control of the robot, given that it provides an accurate position of all of the robot's component parts. The approach is based on the use of a priori knowledge of the robot's physical structure.
2. The proposed approach

The system introduced here employs a model-based approach. A general framework, common to many model-based tracking systems, is adopted; the major components of the computational system are image synthesis, image analysis, feature matching, state estimation and tracking (see Figure 1).
In summary, in the image synthesis stage, a pre-captured robot model is rendered into an image to predict the appearance of the robot in the real image. The image analysis component performs feature extraction, which extracts edge pixels and links and groups them into line-segment descriptions. Each synthesised image feature is then matched to the corresponding real image feature using the Mahalanobis distance between line attributes. The matched features are passed to a state estimation module, which recovers the states of the model. Finally, the tracking module takes the information from previous frames to predict the future states of the object as it will appear in the next acquired frame. This key idea allows faster feature extraction and fast automatic matching. The predicted states are also used as the initial estimates for the next state estimation process. A structural sketch of this loop is given below.
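To make the data flow concrete, the loop below sketches how these five components might interact, one frame at a time. This is an illustrative outline in Python, not the implementation used here; all helper names (render_wireframe, extract_line_segments, match_lines, estimate_state) are hypothetical placeholders for the stages of Figure 1.

```python
# Structural sketch of the per-frame tracking loop (hypothetical helpers).

def track(frames, model, kalman, initial_state):
    """Yield one recovered robot state per acquired image frame."""
    state = initial_state
    for frame in frames:
        predicted = kalman.predict()                      # tracking: predict next state
        model_lines = render_wireframe(model, predicted)  # image synthesis
        data_lines = extract_line_segments(frame)         # image analysis
        pairs = match_lines(model_lines, data_lines)      # feature matching
        state = estimate_state(model, pairs, predicted)   # least-squares state estimation
        kalman.update(state)                              # refine prediction for next frame
        yield state
```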
3. Robot model
To ensure a high speed of operation, it is essential to use models which are as simple as possible. Thus, in the present work, the robot is represented using simple geometrical polygonal shapes, from which a wire-frame appearance is rendered. Each robot part is modelled with a local co-ordinate system, and every part is connected to the other components by simple rotation and translation transformations. In the example given here, a robot with three revolute joints has been modelled. The translation components are fixed, and in this case only the rotation parameters represent the state that needs to be recovered. One of the robot's components is chosen as the reference part, here the robot base. The model is then perspectively projected into an image. It should be noted that the transformation from the robot base to the camera has been obtained off-line, based on the calibration procedure suggested by Tsai [5]. To avoid unnecessary matching later, a simple hidden-line removal algorithm is also employed to remove lines which are invisible from the viewing point.
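As an illustration of this representation, the fragment below builds one model edge in a link's local frame, applies a revolute-joint rotation and an assumed base-to-camera transformation, and projects the result perspectively. The numbers and the single y-axis joint are invented for the example; the real system chains three such joints over the full wire-frame model and uses the calibrated camera transformation.

```python
import numpy as np

def rot_y(theta):
    """Rotation matrix for a revolute joint turning about the local y axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def project(pts_cam, f=1.0):
    """Perspective projection of camera-frame points onto the image plane."""
    return f * pts_cam[:, :2] / pts_cam[:, 2:3]

# One wire-frame edge of a link, expressed in the link's local co-ordinates.
edge = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])

theta1 = np.deg2rad(30.0)             # joint angle (the state to be recovered)
R_cam = np.eye(3)                     # toy base-to-camera rotation ...
t_cam = np.array([0.0, 0.0, 5.0])     # ... and translation (Tsai calibration in practice)

pts_base = edge @ rot_y(theta1).T     # joint rotation; translations are fixed
pts_cam = pts_base @ R_cam.T + t_cam  # into the camera frame
print(project(pts_cam))               # 2D endpoints of the projected model line
```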
4. Matching

As is suggested in the work by Lowe [1], the process of matching and state estimation is done iteratively. An initial estimate is provided so that the system can find some possible initial matches. These potential matches are supplied to a state estimation module, which employs a least-squares method to determine the required parameters. The process is repeated with the new estimates to find additional correspondences. The iteration is stopped if no further improvement in the matching is possible or if a threshold number of iterations has been reached. At the end of each iteration, the set of matched data and model lines is submitted to a least-squares estimation process to recover the states. The set which has the smallest residual is chosen for the state update. The average error per matched edge segment, weighted by the number of unmatched model lines, the number of unmatched data lines and the line lengths, is chosen as the criterion for the smallest residual.
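A compact sketch of this match-and-refine iteration is given below. It is only an outline under assumed interfaces: match_lines, estimate_least_squares and weighted_residual are hypothetical names for the matching, estimation and residual-scoring steps just described.

```python
# Iterative matching/estimation outline (hypothetical helper functions).

def refine_state(model_lines, data_lines, state, max_iterations=10):
    """Alternate matching and least-squares estimation until no improvement."""
    best_state, best_residual = state, float("inf")
    for _ in range(max_iterations):
        pairs = match_lines(model_lines, data_lines, state)   # Mahalanobis matching
        state, errors = estimate_least_squares(pairs, state)  # recover joint angles
        # Average error per matched segment, weighted by unmatched model lines,
        # unmatched data lines and segment length, as described in the text.
        residual = weighted_residual(errors, pairs, model_lines, data_lines)
        if residual >= best_residual:                         # stop: no improvement
            break
        best_state, best_residual = state, residual
    return best_state
```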
Matching is performed using the edge segments and a midpoint representation of line segments, as described in [4]. A line segment is characterised by the attribute vector X = (x_m, y_m, θ, l), where (x_m, y_m) is the midpoint, θ represents the orientation of the line and l is the length of the line segment. The correspondence between image lines and model lines is based upon the Mahalanobis distance between the attributes of the line segments. Denoting the attribute vector of a model segment by X_mi and the attribute vector of a data segment by X_dj, the Mahalanobis distance between X_mi and X_dj is defined by

d²_ij = (X_mi − X_dj)ᵀ Λ⁻¹ (X_mi − X_dj)

where Λ = diag(σ_x², σ_y², σ_θ², σ_l²) is a diagonal covariance matrix, with

σ_x² = (l_m − l_i)·(l_m − l_i);  σ_y² = (l_m − l_i)·(l_m − l_i);  σ_θ² = variable (see text);  σ_l = (l_m − l_i)/2

where l_i is the length of the data line and l_m is the length of the model line.
The Mahalanobis distance, d, is weighted by the covariance Λ. The covariance Λ is chosen such that a priori knowledge can be utilised to govern the matching. Waite et al. [14] proposed the use of variances to constrain the pose estimation process in a Kalman filtering framework, and a similar idea is employed here. For example, a priori knowledge of the orientation of the line segments of link 1 is available, since rotation about the y axis alone will not change the orientation of the model in the image. The corresponding variance is chosen to be small, so that only a small deviation from the values of the model is allowed. A correspondence is established for a model segment if the data segment has the smallest Mahalanobis distance to that model segment, provided that this distance is less than a given threshold, which is determined experimentally.
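The following self-contained fragment shows how such a weighted matching score can be computed for the midpoint representation. The variance values are invented for the example; in the system they encode the a priori knowledge discussed above.

```python
import numpy as np

def line_attributes(p1, p2):
    """Midpoint representation X = (x_m, y_m, theta, l) of a 2D segment [4]."""
    p1, p2 = np.asarray(p1, float), np.asarray(p2, float)
    mid = 0.5 * (p1 + p2)
    dx, dy = p2 - p1
    return np.array([mid[0], mid[1], np.arctan2(dy, dx), np.hypot(dx, dy)])

def mahalanobis_sq(x_model, x_data, variances):
    """Squared Mahalanobis distance with the diagonal covariance defined above."""
    diff = x_model - x_data
    return float(np.sum(diff ** 2 / variances))

# Toy usage: score a projected model edge against an extracted image segment.
x_m = line_attributes((10, 10), (50, 10))
x_d = line_attributes((12, 11), (49, 12))
variances = np.array([4.0, 4.0, 0.01, 25.0])  # assumed (sx^2, sy^2, stheta^2, sl^2)
print(mahalanobis_sq(x_m, x_d, variances))    # accept the match if below a threshold
```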
Least-squares method

After finding the matches between projected model edges and image edges, these are supplied to a state estimation module which employs a least-squares method to recover the parameters. The least-squares method used here is based on the Levenberg-Marquardt algorithm [10, 11]. This algorithm minimises the distance between the projected lines and the image edges over the possible values of the parameters to be recovered. Here, an infinite image line constraint is used, as experiments have shown that it is more robust when image line fragmentation occurs, which is highly likely in the feature extraction process. The infinite image line constraint is derived from the end points of the projected model line and the infinitely extended image line. (The projected model line segments are aligned with the infinitely extended image line.) The extracted image line is expressed in the form x sin θ − y cos θ = d, where θ is the orientation of the line with respect to the x axis and d is the signed perpendicular distance of the line from the origin. Substituting a projected model point (x', y') into the left-hand side of this equation gives a new value d'. The perpendicular distance of the point to the line is then d' − d. Therefore the minimisation criterion is
min over x = (θ1, θ2, θ3) of Σ (d' − d)²

where the sum runs over the end points of all matched projected model lines.
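A minimal sketch of this minimisation using the Levenberg-Marquardt routine in SciPy is shown below. The single-joint, orthographic "projector" is a toy stand-in for the full perspective model; only the infinite-line residual itself follows the formulation above.

```python
import numpy as np
from scipy.optimize import least_squares

def line_residual(pt, theta, d):
    """Signed distance of a point to the infinite line x*sin(t) - y*cos(t) = d."""
    return pt[0] * np.sin(theta) - pt[1] * np.cos(theta) - d

def residuals(x, project_edges, image_lines):
    """Distances of both projected model end points to each matched image line."""
    r = []
    for endpoints, (theta, d) in zip(project_edges(x), image_lines):
        r.extend(line_residual(pt, theta, d) for pt in endpoints)
    return np.array(r)

# Toy projector: one unit model edge rotating about the origin, viewed
# orthographically; the real system projects the wire frame perspectively.
def project_edges(x):
    return [np.array([[0.0, 0.0], [np.cos(x[0]), np.sin(x[0])]])]

image_lines = [(np.deg2rad(120.0), 0.0)]  # extracted line: (theta, d)
fit = least_squares(residuals, x0=[0.2], args=(project_edges, image_lines),
                    method="lm")
print(fit.x)                              # recovered joint angle (radians)
```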
5. Tracking algorithm

The task of the tracking module is to predict the state of the robot in the next scene to be captured, based on previous knowledge. Each time the state estimation module recovers a state, it is passed to the tracking module. A standard Kalman filter [12] is employed to perform the prediction. The first two frames of the image sequence are used to bootstrap the initial values of the velocities. Here, a constant velocity model for the robot movement is used.
The characteristics of the filter are as follows. The state is given by (θ1, θ2, θ3, θ̇1, θ̇2, θ̇3), where θ̇1, θ̇2, θ̇3 are the angular velocities of the rotation of each robot component. The state prediction matrix is

Φ = | 1 0 0 Δt 0  0  |
    | 0 1 0 0  Δt 0  |
    | 0 0 1 0  0  Δt |
    | 0 0 0 1  0  0  |
    | 0 0 0 0  1  0  |
    | 0 0 0 0  0  1  |
where Δt is the time between two consecutive images. The measurement prediction matrix is

M = | 1 0 0 0 0 0 |
    | 0 1 0 0 0 0 |
    | 0 0 1 0 0 0 |
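The prediction step of this filter is small enough to state directly. The sketch below builds Φ and M for the three-joint, constant-velocity model and bootstraps the velocities from the first two recovered states; the frame interval and angle values are invented for illustration.

```python
import numpy as np

def state_prediction_matrix(dt, n=3):
    """Constant-velocity transition Phi over state (theta_1..3, thetadot_1..3)."""
    phi = np.eye(2 * n)
    phi[:n, n:] = dt * np.eye(n)
    return phi

def measurement_matrix(n=3):
    """Measurement prediction M: only the three joint angles are observed."""
    return np.hstack([np.eye(n), np.zeros((n, n))])

dt = 0.04                                # assumed time between consecutive images
theta_a = np.array([0.10, 0.50, -0.20])  # angles recovered from frame 1
theta_b = np.array([0.12, 0.48, -0.18])  # angles recovered from frame 2
state = np.concatenate([theta_b, (theta_b - theta_a) / dt])  # bootstrap velocities

predicted_state = state_prediction_matrix(dt) @ state        # prediction for frame 3
predicted_angles = measurement_matrix() @ predicted_state
print(predicted_state, predicted_angles)
```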
6. Experiments and results

Based on the above ideas, the approach has been tested with data from a sequence of real images, captured with a camera and frame grabber. The robot used is a small, crude laboratory system, but ideal for this work in that it has very poorly defined edges, etc. The algorithm has been tested on 20 frames of an image sequence, in which the initial estimate for the process was given manually. The number of iterations used was 10 and, based on experimental results, the threshold value chosen for the Mahalanobis distance was 10. Some of the results are shown in Figure 3, which illustrates the success of the approach.
7. Conclusion

In this paper, a system to track an operational robot has been proposed. The system, which employs a knowledge-based approach, uses a single, stationary camera and requires no special markings on the robot. The system recovers information about the position of the robot, and this data can then be used for control, fault detection and fault recovery. The system has been validated using real image sequences captured from a laboratory robot. Experiments have shown that the approach is valid, although it is not yet robust enough for industrial use, largely due to cumulative errors. Future work will include an initialisation module and fine-tuning of the covariance matrix used in the matching process.
References

[1] D. G. Lowe, "Three-dimensional object recognition from single two-dimensional images", Artificial Intelligence, Vol. 31, 1987, pp. 355-395.
[2] D. G. Lowe, "Fitting parameterized three-dimensional models to images", IEEE Trans. Pattern Anal. Machine Intell., Vol. 13, No. 5, May 1991, pp. 441-450.
[3] Rakesh Kumar, Allen R. Hanson, "Robust methods for estimating pose and sensitivity analysis", CVGIP: Image Understanding, Vol. 60, No. 3, Nov. 1994, pp. 313-342.
[4] R. Deriche, O. Faugeras, "Tracking line segments", Lecture Notes in Computer Science, Vol. 427, 1990, pp. 259-268.
[5] R. Y. Tsai, "A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses", IEEE Journal of Robotics and Automation, Vol. RA-3, No. 4, August 1987, pp. 323-344.
[6] M. Dhome, A. Yassine, J. M. Lavest, "Determination of the pose of an articulated object from a single perspective view", BMVC 93, pp. 95-104.
[7] Uwe L. Haass, "A visual surveillance system for tracking of moving objects in industrial workroom environments", Proc. 6th International Conference on Pattern Recognition, 1982, pp. 757-759.
[8] Jin-Jang Leou, Yung-Liu Chang, Jian-Shing Wu, "Robot operation monitoring for collision avoidance by image sequence analysis", Pattern Recognition, Vol. 25, No. 8, 1992, pp. 855-867.
[9] I. Jane Mulligan, Alan K. Mackworth, Peter D. Lawrence, "A model-based vision system for manipulator position sensing", Workshop on Interpretation of 3D Scenes, Austin, Texas, 1989, pp. 186-193.
[10] Kenneth Levenberg, "A method for the solution of certain non-linear problems", Quarterly of Applied Mathematics, Vol. 2, 1944, pp. 164-168.
[11] Donald W. Marquardt, "An algorithm for least squares estimation of non-linear parameters", J. Soc. Indust. Appl. Math., Vol. 11, No. 2, June 1963, pp. 431-441.
[12] A. H. Jazwinski, "Stochastic Processes and Filtering Theory", Academic Press, London, 1970.
[13] M. Vincze, J. P. Prenninger, H. Gander, "A laser tracking system to measure position and orientation of robot end effectors under motion", International Journal of Robotics Research, Vol. 13, No. 4, August 1994, pp. 305-314.
[14] M. Waite, M. Orr, R. Fisher, J. Hallam, "Statistical partial constraints for 3D model matching and pose estimation problems", Technical Report, Dept. of Artificial Intelligence, University of Edinburgh.
Figure 1. General framework for the model-based tracking system
Figure 2. Robot model
Figure 3. Reconstruction results superimposed on images 3, 5, 7, 9, 11 and 18 of the sequence