Copyright © IFAC Information Control Problems in Manufacturing Technology, Suzdal, USSR, 1986
A FLEXIBLE AND USER-FRIENDLY VISION SYSTEM FOR ADAPTIVE INDUSTRIAL ROBOTS
A. A. Petrov and S. A. Kuz'min
Institute of Control Sciences, Moscow, USSR
Abstract. A vision system for adaptive robots is described which provides automatic generation of the necessary number of classifying features: independent integral uniform functionals computable by a single algorithm. An efficient regular technique is proposed to find the orientation of parts in the working plane. Experimental results are presented which demonstrate the flexibility and user-friendliness of the vision system in an adaptive robotic complex.

Keywords. Robots; computer vision; image processing; parts palletizing.
INTRODUCTION

Flexible and user-friendly readjustment of robot actions when the range of manipulated objects, the operations or the working environment changes is of crucial importance in the design of adaptive robots with visual sensors for FMS. Up-to-date control systems provide a wide range of means for forming the desired motions of the robot (teaching, off-line programming in appropriate terms, automatic motion formation in the adaptive mode), while sensory systems themselves often do not provide sufficient flexibility of their own operation. Readjustment of industrial visual systems designed to determine the type, position and orientation of objects handled by robots (which is required whenever a new batch of parts is to be treated) is still, as a rule, far from being automated. This is true even for simpler visual tasks (binary images, 2-D scenes, non-touching parts), irrespective of the visual system characteristics in speed, resolution, image size, capability of teaching-by-showing, etc.

Object classification is usually carried out in terms of a fixed set of features (area, perimeter, number of holes, characteristic radii and the like), not only in hardware implementations of algorithms for visual information analysis but in software implementations as well. Since a good many objectively different parts cannot be discriminated by such features, a human expert must take part in the preliminary study of each new batch of parts to find out whether the implemented algorithms are applicable and to add some new independent features if necessary. Automatic generation of sufficient sets of independent features based on the known, theoretically unlimited systems of formal characteristics (such as moment invariants) requires too many calculations to be reasonably carried out in real time.
Research aimed at the elaboration of methods and means for the adaptation of industrial robots on the basis of computer vision has been under way at the Institute of Control Sciences for a number of years. Subsystems of videodata acquisition and analysis, which determine the type, position and orientation of parts in the working scene, and subsystems of on-line robot action planning and adaptive motion generation in accord with this information are designed jointly, the necessity of their close interaction as components of a single system of robot vision and adaptation being taken into account. It is also highly desirable to employ comparatively cheap lot-manufactured computers without breaking the strict requirement of real-time problem solution (i.e. the robotized manufacturing process must not be slowed down).

In order to achieve this purpose the system TEZA (Russian abbreviation for Artificial Vision and Adaptation) has been designed for robots intended for handling non-oriented objects, e.g. for parts ordering: palletizing, sorting, selecting, etc. (Petrov and Kuz'min, 1985). The TEZA system is based on the microcomputer ELEKTRONIKA-60 and may use any videosensor working in the TV standard, including TV-cameras and solid-state matrices. It is of principal importance that the inclusion of TEZA in a robotic complex does not imply any change in the standard robot controller hardware, provided it possesses computing means compatible with ELEKTRONIKA-60. Extra investment in new software is not required either. Methods have been developed for interfacing and data exchange between the processor of TEZA and that of the robot controller, with parallelization of the processing operations of these units. Hence TEZA may be treated as a supplementary block whose coupling to a conventional industrial robot provides it with adaptive capabilities.
For example, Fig. 1 presents the block diagram of an adaptive robotic complex organized by plugging the TEZA system into the UKM-772 controller, which is commercially available in the USSR and is delivered with the Soviet industrial robot TUR-10.

TECHNIQUES FOR INFORMATION ANALYSIS IN ROBOT VISION SYSTEM

The visual subsystem of TEZA determines the type, position and orientation of each of a number of various non-touching objects lying on the working plane (table, conveyor belt, etc.). It is assumed that a chaotic pile of parts can be mechanically separated using well-developed technologies. In this paper we consider the frequent practical case when all necessary data on the object are contained in its silhouette. This permits one to confine processing to binary images.

At the teaching stage one may enter some a priori information on the objects. However, industrial practice shows that it is extremely desirable not to require high competence of a human operator to teach the system. The most preferable way is teaching by showing all the parts to be handled by the robot to its vision system. It should be stressed that commercially available industrial visual systems employing teaching-by-showing methods can recognize only relatively narrow classes of objects and are unable to solve the problem for arbitrary shapes of silhouettes even if the sensor resolution is quite sufficient. This is mainly due to the heuristic choice of classifying features (such as area, perimeter, maximal and minimal radii of an object, number of holes and the like). Since many objectively different industrial parts cannot be distinguished by such feature sets, the addition of some new independent features is required, which makes necessary the participation of a highly skilled expert in the preliminary study of each new batch of parts. Thus Fig. 2 shows examples of silhouettes which have equal areas, perimeters, numbers of holes and vertices, and moments up to the second order inclusive.
In principle such objects could be distinguished, say, by increasing the order of moments used for classification or by utilizing some other theoretically unlimited sets of formal features. However, such features are too complex to compute, and in practice a sufficiently complete set of them can seldom be calculated in real time. In view of the above we have proposed for the TEZA system, and theoretically verified, quite another set of formal classifying features, invariant to translations and/or rotations of objects in the videosensor field of view (Petrov and Kuz'min, 1981). The features have the form

$$J_k = \sum_{i,j=1}^{N} \varphi_k\left(\|x_i - x_j\|^2\right), \quad k = 1, \ldots, M, \qquad (1)$$

where $N$ is the number of points $\{x_i\}$ of the analyzed contour of the object image and $\{\varphi_k(u)\}$ is a set of functions with
derivatives linearly independent in their common domain of existence. It has been shown that in this case the features $J_k$ of (1) are functionally independent. This fact, together with the uniformity of the proposed features (unlike conventional classifying features, they are calculated by a single algorithm, which is computationally convenient), makes them very efficient: it becomes possible to increase the number of classifying features automatically if necessary. It should be emphasized that the uniformity of the proposed features is preserved for 3-D object forms (range images), thus allowing for parallel computation.

In commonly used visual systems certain difficulties are also connected with finding the object orientation. Hence our approach (Petrov and Kuz'min, 1982) has been further developed to obtain a regular technique which does not depend on object symmetry and does not require any "prompts" from a human operator during teaching. The technique makes teaching efficient and simple and is based on the correspondence between the orientation and the argument of a complex Fourier coefficient of the object contour expressed in polar coordinates. Accordingly, the angle $\psi$ of object orientation relative to a reference state (the one specified during the teaching stage) is determined by a change of the argument
$\arg W_n^{W} - \arg W_n^{T}$ of the Fourier coefficient $W_n$ of the silhouette $f(\rho, \varphi)$ in the polar coordinate frame $(\rho, \varphi)$ attached to its geometrical centre. Here

$$W_n(\rho) = \int_0^{2\pi} f(\rho, \varphi)\, \exp(i \varphi n)\, d\varphi,$$

the index $T$ corresponding to the teaching stage and $W$ to the working stage. In order to reduce the amount of computation, a coefficient of the one-dimensional Fourier transform with respect to the angle $\varphi$ is calculated, the radius $\rho$ being fixed. A procedure has been developed and theoretically verified to choose both the radius providing the least sensitivity of the coefficient to image discretization errors and the number $n$ of the coefficient corresponding to the object symmetry properties. This radius $\rho_0$ is to maximize the absolute value of $W_n(\rho_0)$. It can be shown that the method may be successfully applied to determine the upper (facing the TV-camera) side of an object.
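The orientation procedure above can be sketched as follows. This is a minimal illustration, not the TEZA implementation: it assumes the silhouette has already been sampled along a circle of fixed radius around the centroid into an angular profile with uniformly spaced samples, and the names `fourier_coefficient`, `choose_radius` and `orientation_angle` are hypothetical. Dividing the change of argument by n is an assumption that follows from the shift property of the Fourier coefficient; it recovers the angle only modulo 2π/n.

```python
import numpy as np


def fourier_coefficient(profile, n):
    """Rectangle-rule estimate of W_n = integral over [0, 2*pi) of f(phi) * exp(i*n*phi)."""
    profile = np.asarray(profile, dtype=float)
    k = profile.size
    phi = 2.0 * np.pi * np.arange(k) / k
    return np.sum(profile * np.exp(1j * n * phi)) * (2.0 * np.pi / k)


def choose_radius(profiles, n):
    """Index of the profile (one per candidate radius) with the largest |W_n|."""
    moduli = [abs(fourier_coefficient(p, n)) for p in profiles]
    return int(np.argmax(moduli))


def orientation_angle(profile_teach, profile_work, n):
    """Rotation of the working profile relative to the taught one, modulo 2*pi/n."""
    w_teach = fourier_coefficient(profile_teach, n)
    w_work = fourier_coefficient(profile_work, n)
    # arg(W_work) - arg(W_teach), wrapped to (-pi, pi]
    delta = np.angle(w_work * np.conj(w_teach))
    return (delta / n) % (2.0 * np.pi / n)


# Synthetic example: a profile that is 1 on an arc of 1 radian and 0 elsewhere.
k = 3600
phi = 2.0 * np.pi * np.arange(k) / k
teach = (phi < 1.0).astype(float)
work = np.roll(teach, 300)  # same profile rotated by 300 samples, i.e. 30 degrees
angle = orientation_angle(teach, work, n=1)
best = choose_radius([np.ones(k), teach], n=1)  # a constant profile carries no orientation
```

A constant profile gives a vanishing coefficient, which is why the radius with the largest modulus of the coefficient is preferred: its argument is the least disturbed by sampling errors.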
HARDWARE-SOFTWARE IMPLEMENTATION

The efficient implementation of the visual subsystem is provided by algorithms and software oriented to available microcomputers (like ELEKTRONIKA-60) with minimal supplementary hardware. Account has been taken of the fact that an image grabbed by a TV-camera has large information redundancy for common robotic applications. First, for ordering non-oriented industrial parts it is enough to supply the robot with data on only one of the parts in the scene, and a fragment which covers the silhouette of this part is often much smaller than the whole image. Second, all the information on the silhouette is contained in its contour, which makes it possible to process only contour points, i.e. only a small portion of the total number of image pixels. Third, different contour points are not equally informative, e.g. a straight-line segment of the contour may be represented by its end points without losing any useful information. Thus goal-oriented videodata selection during the acquisition and preprocessing stages considerably reduces the amount of information to be processed.

The following principles have been used in the proposed algorithms of videodata processing and analysis.
1. The possibility to process incomplete information.
2. The organization of informational feedbacks between different processing levels in order to select and verify videodata.
3. Reduction to contour images.
4. Maximal utilization of a priori information acquired during the teaching stage.

The study of the specific requirements on image processing algorithms for robots has shown that the hardware module of the visual system must be capable of entering videodata in portions, i.e. separate fragments of the image with specified parameters (position, size, resolution) should be grabbed rather than the whole image at once. Besides, a programmable setting of quantization levels must be possible in the hardware module to implement algorithms of automatic selection of the binarization threshold. In accord with the above requirements the tasks of the hardware module may be expressed as follows: 1) digitization of videosensor signals; 2) videodata selection corresponding to the parameters of a fragment to be grabbed; 3) hardware implementation of certain operations for videodata processing; 4) data transfer to the computer. In the proposed system videodata read-in is provided by the software while the check for data readiness is wired-in, thus combining software flexibility and hardware speed during image entry.

The general structure of the device is shown in Fig. 3. The main functional blocks are: a discriminator for videosignal binarization and synchropulse detection (the binarization threshold is set programmatically via a D/A converter); a synchronizer for generation of synchropulses corresponding to the specified fragment parameters (the digitized videosignal is entered into a shift register according to these pulses); a block of buffer registers; a block of counters in which the number of pixels in the fragment with brightness levels exceeding the threshold is counted (this block permits one to speed up the calculation of the brightness histogram for the threshold selection algorithm); and an interface for information exchange. Data are transferred from and to the computer via the inner bus of the device, thus allowing hardware reconfiguration by adding new modules.

The threshold selection is divided into two main subtasks: 1) fast calculation of the histogram and 2) detection of the two main peaks of the histogram. While the former can be fulfilled by hardware counters, execution of the latter is complicated by the fact that a real image histogram is practically never bimodal. Along with the two main peaks it has a lot of other ones, the highest of which correspond to shadows and reflections, while the rest are due to non-uniform sensitivity of the videosensor field and to quantization errors. Histogram smoothing techniques have been tried to overcome this difficulty (Weszka, 1978). However, they lead to displacements of peaks and, hence, to errors in threshold selection which result in changes of the geometrical size of a silhouette.

We have proposed a technique free of these shortcomings. The idea is not to average the histogram but rather to detect all its local maxima in order to find the main peaks by comparing them to each other. The following procedure is used for the search of the main maxima. First of all the global maximum of the histogram is found, and then for all local maxima the ratio $\alpha_i$
is calculated, where $M_i$ is the value of a local maximum and $m_i$ is the value of the global minimum in the range between the global maximum and the local one in question (Fig. 4). The 1 in the denominator is added to avoid division by zero. The local maximum having the maximal value of $\alpha_i$ is chosen as the second main peak. The advantage of the described algorithm is twofold: 1) it precisely finds the positions of the main peaks irrespective of minor ones, since it does not imply averaging; 2) it works even in the case when the heights of minor maxima are comparable with those of the main peaks (the values of $\alpha_i$ for different maxima practically do not influence each other). Thus this algorithm is applicable even in cases when smoothing does not work.

Software support of the visual system is made as a set of modules which can be classified into four functional groups:
1) modules for videodata input into the computer (grabbing image fragments, sampling the brightness histogram);
2) modules for image preprocessing (optimal threshold selection, fragment masking, object localization on the coarse image, binary filtering, contour points extraction, contour following and representation as coordinate sequences, elimination of less informative contour points);
3) modules for videoinformation analysis (features calculation, correction of possible ranges of feature values, selection of a minimal subset of informative features, centroid coordinates estimation, calculation of Fourier coefficients in polar coordinates, classification, finding of the orientation and upper side of the object);
4) modules for control and service (control, diagnostics, dialogue with an operator, compulsory expansion of feature value ranges, display of obtained contours).

The main function of the modules of the first group is fulfilled by hardware means; software only controls the hardware, responding to its requests, the necessary amount of computation being sufficiently small. Modules of the other groups are now implemented by software means and may work quasi-parallel with the input modules provided the computational process is organized properly. In order to reduce memory requirements videodata are preprocessed in packed words (16 points per word). The processing speed may also be considerably increased with this technique.

The position and orientation of a part found by the visual subsystem are the input data for the subsystem of robot motion formation. The latter subsystem generates an output array of data transferred to the robot controller. These data specify the working mode of the robot (adaptive or playback of a rigid program) and contain all the information necessary for the controller to support servoing of the formed motion. In the adaptive mode sequences of manipulator joint coordinates are generated to be executed by the robot drives. They provide the transfer of the robot gripper to the selected part, positioning of the gripper in a grasping state and, finally, transportation of the grasped part to some standard location. Then the robot may execute completely deterministic actions under control of the UKM-772 NC system. They are formed as combinations of some primitive movements corresponding to rigid programs taught in a priori.
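The packed-word preprocessing mentioned above (16 points per word) can be illustrated with a short sketch. The source does not specify the bit order or the operations applied to packed words, so everything here is an assumption made for illustration: the first pixel is placed in the most significant bit, pixels outside the row are treated as background, and the function names are hypothetical. The point of the sketch is that one word-level operation handles 16 pixels at once.

```python
def pack_row(bits):
    """Pack a row of 0/1 pixels into 16-bit words, first pixel in the most significant bit."""
    words = []
    for start in range(0, len(bits), 16):
        chunk = bits[start:start + 16]
        word = 0
        for b in chunk:
            word = (word << 1) | (b & 1)
        word <<= 16 - len(chunk)  # pad a short final chunk with background zeros
        words.append(word)
    return words


def count_object_pixels(words):
    """Number of set pixels in a packed row."""
    return sum(bin(w).count("1") for w in words)


def horizontal_edges(words):
    """Mark every pixel whose right-hand neighbour has a different value."""
    edges = []
    for i, w in enumerate(words):
        nxt = words[i + 1] if i + 1 < len(words) else 0
        # Align each pixel with its right-hand neighbour; the last pixel of a word
        # takes its neighbour from the top bit of the next word.
        shifted = ((w << 1) & 0xFFFF) | (nxt >> 15)
        edges.append(w ^ shifted)
    return edges


# A 20-pixel row: 3 background pixels, 15 object pixels, 2 background pixels.
row = [0] * 3 + [1] * 15 + [0] * 2
words = pack_row(row)
n_object = count_object_pixels(words)
edges = horizontal_edges(words)
```

In this example the two set bits of `edges` correspond to pixels 2 and 17, i.e. the pixels immediately to the left of each change in the row.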
The action planning subsystem chooses a sequence of manipulations in accord with the visually classified part type and the technological operation code. For example, during part palletizing the decision is made whether or not the current part should be placed onto a pallet and, if so, onto which pallet and into which of its cells, whether the part should be reoriented (and in which way), etc.

EXPERIMENTAL

A number of experiments have been carried out on the robotic complex with the visual subsystem (Fig. 1) to check the efficiency of the developed methods. In particular, the algorithm of binarization threshold selection, the discriminative properties of the features (1) and the algorithm of object orientation finding
have been investigated. One way to estimate the efficiency of the proposed algorithm for automatic selection of the binarization threshold is to compare its results with those of one of the most popular algorithms, which determines the threshold as the point of minimum between two peaks of the smoothed histogram corresponding to the brightness levels of object and background (e.g. Weszka, 1978). The experimental technique implied a multiple selection of thresholds with both the known and the proposed algorithms for different working scenes composed of lathed industrial parts and for several variants of image contrast. Then for each series of 20 images the relative dispersion of thresholds was calculated by the formula
where $T_{max}$ and $T_{min}$ are the maximal and minimal threshold values for the given series, and $f_{max}$ and $f_{min}$ are the maximal and minimal values of pixel brightness. This parameter gives a measure of algorithm robustness and insensitivity to various noises and allows us to compare different algorithms objectively. The experiments have shown that, though both algorithms had approximately the same results when the contrast was high, the proposed algorithm worked much more consistently in the case of low contrast, when noise peaks of the histogram became comparable with the peaks corresponding to object and background brightness levels.

The next set of experiments was devoted to the study of the discriminative properties of the proposed classifying features (1): their distinctions for objects more or less similar in size and shape, their sensitivity to discretization noise, as well as the dependence of these properties (and of the computation needed to obtain the features) on the form of the functions $\varphi_k$ in (1). According to the experimental procedure, in each series two kinds of objects (say, the shafts of Fig. 2) were tested. For 20 images of the objects having various locations and orientations the following sets of features
$$J_m^{ij} = \sum_{k,l=1}^{N} \exp\left(\left(\|x_k - x_l\|^2 - r_m\right)/\sigma^2\right), \quad m = 1, \ldots, M,$$

were calculated with different values of the parameter $r_m$ ($J_m^{ij}$ corresponds to the $i$-th presentation of the $j$-th object). For these sets feature variation ranges were obtained for each object (Fig. 5). This has permitted us to choose optimal values of the algorithm parameters.
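The uniform features of (1) and the feature ranges collected over several presentations can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the functions passed in `phis` are a hypothetical family, $\varphi_k(u) = \exp(-u / s_k^2)$, chosen only because it stays bounded for large distances; the names `uniform_features` and `feature_ranges` are likewise hypothetical. Because the features depend only on pairwise distances between contour points, they do not change when the contour is rotated or translated, which the example checks numerically.

```python
import numpy as np


def uniform_features(contour, phis):
    """J_k = sum over all point pairs (i, j) of phi_k(||x_i - x_j||^2), as in (1)."""
    pts = np.asarray(contour, dtype=float)
    diff = pts[:, None, :] - pts[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)  # N x N matrix of squared distances
    return np.array([np.sum(phi(sq_dist)) for phi in phis])


def feature_ranges(feature_vectors):
    """Per-feature minimum and maximum over several presentations of one object."""
    fv = np.asarray(feature_vectors, dtype=float)
    return fv.min(axis=0), fv.max(axis=0)


# Hypothetical function family (an assumption, not taken from the paper).
phis = [lambda u, s=s: np.exp(-u / s ** 2) for s in (1.0, 2.0, 4.0)]

# Contour points of an L-shaped part and the same part rotated and shifted.
contour = np.array([(0, 0), (3, 0), (3, 1), (1, 1), (1, 2), (0, 2)], dtype=float)
theta = 0.7
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta), np.cos(theta)]])
moved = contour @ rot.T + np.array([5.0, -2.0])

j_ref = uniform_features(contour, phis)
j_moved = uniform_features(moved, phis)
low, high = feature_ranges([j_ref, j_moved])
```

Adding a further feature only requires appending one more function to `phis`; the computation itself is unchanged, which is the uniformity property described in the text.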
The experimental study of orientation errors for objects has been carried out with different values of pixel size in order to investigate the influence of system resolution on the accuracy of orientation calculation and the reliability of upper side identification.

A typical experimental test of the robotic complex efficiency implied palletizing different industrial parts (which were placed on the working plane at random) by the robot TUR-10. The complex operation is based on the TEZA system, which after viewing the working scene automatically generates motions for grasping a non-oriented part and transferring it to some standard location. This permits maximal utilization of the rigid programs which have been memorized in the UKM-772 controller while teaching the robot under nominal conditions. Thus parallelization of operations is achieved between the processor of TEZA and that of the UKM-772 controller for sequential operation of the robot in the adaptive mode and in the mode of rigid program playback.

Numerous experiments of the above type have substantiated the reliable operation of the adaptive robotic complex in ordering non-oriented industrial parts of wide nomenclature. It is important that the stage of teaching TEZA how to deal with the parts of a concrete batch is highly user-friendly. The user should only show a part to the videosensor and specify its name (number), a grasping point and the code of a prestored rigid program of robot actions to be performed after the grasped part is transferred to a standard location. All the features necessary for parts recognition are formed automatically.

During the working stage TEZA grabs an image of the current part, extracts its contour points, classifies the part, and finds its orientation and centroid. According to these data a set of reference points of the robot trajectory for the adaptive mode is generated to grasp the part and to transfer it to a standard location. Then priority is given to the robot controller UKM-772. The grasped part is placed into the corresponding cell of the desired pallet. (If necessary the robot can reorient the part properly before palletizing.) Defective parts (if any) are removed by the robot
from the working field. The whole cycle (input of videodata, image processing with an effective resolution of 256x256, part recognition, action planning and robot motion generation) lasts 0.8 to 1.3 s per typical part, the TEZA system working in parallel with the servoing of prestored motions by the robot.

CONCLUSIONS

Principles of a flexible visual system for robot adaptation are proposed. The videosensor interface structure and videodata processing algorithms are presented. The implemented techniques of goal-oriented selection of information and distributed control of computations speed up the system operation and make it less sensitive to variations in lighting conditions. The speed and accuracy of the TEZA system are quite sufficient for proper and fast grasping of industrial parts by the robot TUR-10.

REFERENCES

Kuz'min, S.A., A.A. Petrov (1981). Algorithms for Silhouette Images Classification and Their Parameters Calculation in a Visual System of a Robot. In Problems of Computer Vision in Robotics. Inst. Appl. Math. of the USSR Acad. Sci., Moscow, 140-151 (in Russian).
Petrov, A.A., S.A. Kuz'min (1982). Sensory Image Perception in Information Control System of a Robot. In Prepr. 2-nd Int. Conf. AIICSR Smolenice (CSSR), 181-184.
Petrov, A.A., S.A. Kuz'min (1985). Efficient Uniform Algorithms for Image Analysis in Vision Systems of Adaptive Robots. In Prepr. SYROCO'85, Barcelona, Spain. 4 -147.
Weszka, J.S. (1978). A Survey of Threshold Selection Techniques. CGIP v.7, No.2, 259-265.
Fig. 1. Block diagram of an adaptive robotic complex. (Blocks: TV-camera, videomodule, microcomputer "ELEKTRONIKA-60", controller UKM-772, robot TUR-10K.)
Fig. 2. Examples of objects with the same conventional features.
Fig. 3. The general structure of the videomodule. (Blocks: A-D converter, D-A converter, histogram counters, synchronizer, buffer, internal bus, interface to computer.)
Fig. 4. On the new technique of threshold selection.
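The peak search illustrated in Fig. 4 can be sketched as follows. The formula for the ratio is not legible in the source text, so the sketch assumes $\alpha_i = M_i / (m_i + 1)$, which is consistent with the statement that a 1 is added to the denominator to avoid division by zero. It further assumes that the threshold is placed at the lowest histogram bin between the two main peaks. The function names and the sample histogram are hypothetical.

```python
def local_maxima(hist):
    """Indices of bins higher than the left neighbour and not lower than the right one."""
    peaks = []
    for i, value in enumerate(hist):
        left = hist[i - 1] if i > 0 else -1
        right = hist[i + 1] if i + 1 < len(hist) else -1
        if value > left and value >= right:
            peaks.append(i)
    return peaks


def main_peaks(hist):
    """Return (global maximum, second main peak) using the assumed ratio M_i / (m_i + 1)."""
    g = max(range(len(hist)), key=hist.__getitem__)
    second, best_alpha = None, -1.0
    for i in local_maxima(hist):
        if i == g:
            continue
        lo, hi = sorted((i, g))
        m = min(hist[lo:hi + 1])   # deepest valley between the two maxima
        alpha = hist[i] / (m + 1)  # the +1 avoids division by zero
        if alpha > best_alpha:
            second, best_alpha = i, alpha
    return g, second


def select_threshold(hist):
    """Assumed rule: threshold at the lowest bin between the two main peaks."""
    g, second = main_peaks(hist)
    if second is None:
        raise ValueError("histogram has a single maximum")
    lo, hi = sorted((g, second))
    return min(range(lo, hi + 1), key=hist.__getitem__)


# Hypothetical 16-level histogram: main peaks at bins 3 and 13,
# plus a tall minor peak at bin 5 (e.g. a shadow) next to the global maximum.
hist = [0, 2, 40, 90, 60, 70, 30, 14, 6, 3, 1, 5, 30, 60, 25, 4]
peaks = main_peaks(hist)
threshold = select_threshold(hist)
```

In this example the minor peak at bin 5 is higher than the second main peak at bin 13, yet its ratio is small because no deep valley separates it from the global maximum, so bin 13 is still selected.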
Fig. 5. Experimental results on discriminative properties of the proposed features.