10th IFAC Symposium on Robot Control International Federation of Automatic Control September 5-7, 2012. Dubrovnik, Croatia
Automatic Decision System for the Structure of Vision-Force Robotic Control

M. Bdiwi, J. Suchý
Department of Robotic Systems, Chemnitz University of Technology, Chemnitz, Germany (e-mail: [email protected]).

Abstract: Robotic applications are gradually taking on a large role in our everyday lives, even in tasks which were previously thought to be feasible only for humans. Most of these applications require the robot to interact with the environment, with objects or even with humans, which is achieved by combining vision and force feedback. In general there are five types of vision-force control: pure position control, pure force control, traded control, shared control and hybrid control. The important questions are: how should the most appropriate control mode be defined for every part of different tasks, and when should the control system switch from one control mode to another? In this work an automatic decision system is suggested which defines the most appropriate control mode for uncertain tasks and chooses the optimal structure of vision/force control depending on the surrounding environment and the conditions of the task. This research focuses on the operations of library automation, such as sorting, storage and retrieval of imprecisely placed objects, as a real application of the proposed control system.

Keywords: Vision/force robot control, hybrid control, traded control, shared control, automated library.
1. INTRODUCTION

In recent years the development of robotic systems and their applications has increased rapidly. The challenge today is, e.g., that the robot performs many different subtasks in order to achieve one or more complicated tasks, similarly to a human. If somebody wants to take a book from a shelf, this requires calculating the distance between the hand and the book, determining the position and orientation (pose) of the book, comparing it with the shelf and the other books, finding the gap where the fingers can enter to grip the book, etc. These subtasks and decisions are complicated enough for a robot. In advanced robotic applications, using only one kind of feedback is sometimes insufficient to achieve the desired goals. In order to obtain full information about the work environment it is preferable to use different kinds of sensors, such as vision, force, acceleration and tactile sensors. From the point of view of control, more sensors mean more possibilities for the structure of the control system. In the literature a number of control algorithms and different structures for robot control can be found. In force control, for example, there are explicit and implicit force control [1], impedance control [2], stiffness control [3], admittance control [4], etc. Using more sensors, further approaches appear; examples of position/force control are hybrid position/force control [5] and hybrid impedance control [6]. Furthermore, different approaches can be found which use force and vision sensors together, such as shared, traded and hybrid control [7]. As is well known, vision and force sensors are the most common external sensors in robotic applications. This work proposes an automatic decision system which defines the most appropriate vision/force control mode for different kinds of tasks and chooses the best structure of vision-force control depending on the surrounding environment and the
conditions of the task. The automatic decision system in this work relies on analyzing the conditions of the task with the help of the vision system and image processing. The vision system is not only used for simple image processing and feedback (e.g., calculating the distance between the robot tool and the target object); it also extracts the relation between the target object and the surrounding environment and objects in order to define the optimal structure of the control system. In addition, as a real application, the proposed system is tested on different subtasks which can help in library automation. In the next section, the types of combination of vision and force feedback are described. Section III explains the proposed structure of the robot control system. Section IV explains the automatic decision algorithms and how the system chooses the optimal control mode for the task. Section V presents the hardware of the experiment and the diagrams from real experiments with automated library subtasks. The last section contains the conclusion, future work and the benefits of improving such kinds of tasks in different applications.

2. VISUAL SERVOING AND FORCE CONTROL

Vision and force feedback provide complementary information: vision is used for motion planning and obstacle avoidance, whereas force adjusts the robot motion so that local constraints imposed by the environment are satisfied. In general, it is common to use vision feedback for gross motion and force feedback for fine motion. Nelson (1996) [7] presented the fusion of vision and force sensing within the feedback loop of a robot manipulator in three approaches: traded, shared and hybrid control. Since then many papers have addressed this field of research, such as [8] and [9]. The general characteristics of these papers are the following: 1. When the robot is far from an object or the environment, visual
servoing is adopted, and the relative position of the robot with respect to the object is calculated with the help of the vision system. 2. When the robot is in contact with the object or the environment, some kind of interaction control strategy is adopted, and the relative position of the robot with respect to the object is updated recursively using vision, force and/or joint position measurements. 3. A controller structure specified in advance for the given task and its circumstances is used. 4. The main role of the vision system is to determine the relative position of the robot with respect to the object.
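As an illustration of this generic pattern (points 1 and 2 above), the following sketch switches from visual servoing to a simple force law once the vision-based distance falls below a threshold. It is only a schematic illustration under assumed interfaces, gains and threshold, not the controller proposed in this paper.

```python
import numpy as np

SWITCH_DISTANCE = 0.01  # [m], assumed threshold for switching to contact control

def control_step(x_target_cam, x_ee, f_measured, f_desired,
                 k_v=1.0, k_f=0.002):
    """Return a Cartesian command for one control cycle (illustrative)."""
    error = np.asarray(x_target_cam) - np.asarray(x_ee)   # from the vision system
    if np.linalg.norm(error) > SWITCH_DISTANCE:
        return k_v * error                                 # far: visual servoing (gross motion)
    # near or in contact: simple explicit force control (fine motion)
    return k_f * (np.asarray(f_desired) - np.asarray(f_measured))
```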
Fig. 1 Hybrid vision/force controller

Fig. 1 shows the typical scheme of hybrid vision/force control. With a deeper analysis of the vision/force control structures, they can be divided into two extreme cases (pure position control and pure force control) and three approaches of vision/force combination (traded, shared and hybrid control). 1. In pure position control all positions and orientations are position controlled. This scheme can be used in free space or when the target position is known in advance. 2. In pure force control the motion is constrained by the environment or the object, or occurs during contact with a human. 3. Traded control can be applied to every direction separately: a task-space direction is alternately controlled using the vision sensor or the force sensor. In other words, the manipulator motion is first controlled by visual feedback, and the controller then switches to force control when the robot is sufficiently near to the environment. 4. In shared control, vision and force sensors control the same direction of the task space simultaneously. 5. In hybrid control, different directions of the task space are simultaneously controlled using vision or force sensors. All these schemes have different advantages and disadvantages, which will be discussed later. However, there are some important questions connected with these schemes: How should the control system choose the most appropriate scheme to achieve a given goal? How can the control system automatically find which directions should be position controlled, which force controlled and which vision controlled? When should the control system automatically switch from one scheme to another in order to perform different tasks without human intervention? To our knowledge, nobody has so far tried to discuss these topics. This work focuses on answering these questions.

3. CONTROL SYSTEM STRUCTURE

This section discusses the structure of the proposed control system and the algorithms for fusing vision and force feedback. Fig. 2 shows the scheme of the control system, which consists of the vision system, the position control loop, the force control loop and the environment. 1. Vision system: The vision system consists of the camera, the feature extraction box and the automatic decision box. The camera sends the captured image to the feature extraction box; there the system analyzes the image and extracts its features in order to define the characteristics of the target object (position, orientation, width, length, etc.) and also the characteristics of the surrounding objects. The feature extraction box then sends the pose of the target object to the position loop and sends the characteristics of the target object and the surrounding objects to the automatic decision box. The automatic decision box does the following: a. it analyzes the conditions of the task for every direction; b. it calculates the relative positions and orientations (poses) of the target object with respect to the surrounding objects and the environment; c. it tests the reliability of using vision and force information for every direction. All these operations are performed in order to define the contacting or grasping algorithm and to specify the most appropriate combination structure of vision-force control.
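One possible way to picture the data passed from the feature extraction box to the automatic decision box is sketched below. The field names, types and layout are assumptions made for illustration; they are not taken from the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional
import numpy as np

@dataclass
class TargetFeatures:
    """Output of the feature extraction box (illustrative layout)."""
    pose: np.ndarray                              # (x, y, z, theta_x, theta_y, theta_z)
    width: float
    length: float
    surrounding_poses: List[np.ndarray] = field(default_factory=list)

@dataclass
class AxisDecision:
    """Per-direction result of the automatic decision box (illustrative)."""
    vision_reliable: bool                         # result of the reliability test
    desired_force: Optional[float]                # None if no contact force is required
    mode: str = "position"                        # 'position', 'vision', 'force', 'shared' or 'traded'
```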
Fig. 2 Scheme of the control system

At the end, the automatic decision box controls the values of S_p, S_v and S_f, where S_p, S_v and S_f are the position, vision and force selection matrices. The selection matrices are diagonal 6×6 matrices whose diagonal entries are either 1 or 0:

S_p = diag(s_p1, …, s_p6),  S_v = diag(s_v1, …, s_v6),  S_f = diag(s_f1, …, s_f6),  with s_pi, s_vi, s_fi ∈ {0, 1}
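A minimal sketch of how such selection matrices can be built; the axis ordering and the example configuration are illustrative assumptions.

```python
import numpy as np

# Task-space order assumed here: x, y, z, theta_x, theta_y, theta_z
def selection_matrix(active_axes):
    """Build a 6x6 diagonal selection matrix with 1 on the listed axes."""
    d = np.zeros(6)
    d[list(active_axes)] = 1.0
    return np.diag(d)

# Illustrative configuration: y and z vision controlled, x position controlled,
# a contact force desired along z. The same direction may appear in more than
# one matrix; the proposed system allows this overlap, unlike hybrid control
# where the selections are complementary.
S_p = selection_matrix([0])
S_v = selection_matrix([1, 2])
S_f = selection_matrix([2])
```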
With the help of the automatic decision box, the system decides automatically which subspaces will be force controlled, which subspaces will be position or vision controlled, and when the system switches from one mode to the other.

2. Force control loop: In this loop the system calculates the (6×1) force error vector ΔF from the desired force vector F_d and the measured force vector F_m:

ΔF = F_d − F_m          (1)

The system then applies force control only in the subspaces which should be force controlled, by multiplying the force error vector with the force selection matrix S_f:

ΔF_f = S_f · ΔF          (2)

The output of the force control is F_f, the force/torque vector corresponding to the force/torque error.
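Equations (1) and (2) amount to one line each. Since the paper does not spell out the force control law that maps ΔF_f to the output F_f, the sketch below assumes a simple proportional law purely for illustration.

```python
import numpy as np

def force_loop(F_d, F_m, S_f, K_f=0.001):
    """Equations (1)-(2) plus an assumed proportional law for the output."""
    dF = np.asarray(F_d) - np.asarray(F_m)   # (1) 6x1 force/torque error
    dF_f = S_f @ dF                          # (2) keep only force-controlled subspaces
    return K_f * dF_f                        # output F_f (illustrative gain)
```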
3. Position control loop: Let X_E = (X, Y, Z, θ_x, θ_y, θ_z)^T denote the pose vector of the end-effector, and let the indices m, d and v denote whether a pose vector X is measured, command-desired or obtained from vision. The automatic decision box defines which subspaces can reliably be vision controlled and which subspaces will be controlled with the desired position X_d, using the matrices S_p and S_v. In the position control loop the system calculates X_pd and X_vd, which determine the subspaces that will be position or vision controlled:

X_pd = S_p · X_d          (3)

X_vd = S_v · X_v          (4)

The measured pose X_m is multiplied by the sum of S_p and S_v, because position control is applied only in the subspaces controlled by the vector X_v (desired position from vision) or by the command-desired position vector X_d:

X_md = (S_p + S_v) · X_m          (5)

At the end, the system calculates the pose error vector ΔX_p as in equation (6):

ΔX_p = X_pd + X_vd − X_md          (6)

The output of the position control is F_p, the force/torque vector corresponding to the pose error. In the proposed system the selection matrices are not tied to each other as in hybrid control, where one selection matrix is the complement (1 −) of the other. The proposed system therefore includes all possible combinations of vision and force, and with the help of the automatic decision box it defines the values of the selection matrices automatically, depending on the situation and the conditions of the task.

4. Environment: It consists of the manipulator (Stäubli RX90 robot) and the workplace of the application. In this paper the proposed system is tested on sorting, storage and retrieval tasks with imprecisely placed objects. The choice of these applications is intentional: such tasks can be found everywhere, they are monotonous for humans and they require a lot of time and effort. Moreover, in places such as libraries and pharmacies, where the workers are skilled, a large part of the normal workday of the staff is spent reading object codes, searching shelves and rearranging books. These tasks are distinctive in their difficulty: they consist of many nested sub-tasks, and the target objects can occur in an unlimited number of different situations, see Fig. 3 (3a, 3b).

Fig. 3 Different poses and situations of objects

In Fig. 3a the book F can easily be grasped using only vision feedback. If the book A is to be grasped afterwards, vision feedback alone is insufficient, because the target book A is stuck between two other books and there is not enough space to insert the parallel fingers of the robot gripper. In this case the algorithm for gripping the book is different, as shown in Fig. 4, and force control becomes necessary and helpful.

Fig. 4 Steps of getting out an object

Fig. 4 shows how the robot grips a stuck object. The third finger presses on the target object with a specified force/torque and pulls the object out so that the two parallel fingers can grip it, just as a human pulls out a book stuck between other books.

4. AUTOMATIC DECISION SYSTEM

In general, robot tasks which need visual servoing can be classified into three types. 1. Visual servoing with respect to an object without any contact with it, for example vision-based teleoperation tasks [10]. In these tasks the vision feedback takes the main role in the control and, at the same time, the force sensor can be monitored in some subspaces (guarded move): if the sensed force exceeds a threshold, the motion is stopped immediately. 2. Visual servoing with respect to an object with contact with it, for example milling, cutting, drilling, etc. In these tasks the proposed system extracts only the features of the target object or place, and the automatic decision system then defines the control structure depending on these features. 3. Visual servoing with respect to an object with contacting and gripping it, such as sorting and palletizing systems. In these tasks the proposed system extracts the features of the target object and the features of the surrounding objects in order to define the control structure and the gripping algorithm.

Fig. 5 shows the main part of the automatic decision algorithm, which defines the control structure for fusing vision and force feedback in the directions x, y and z. The algorithm is repeated for every direction x, y and z. First the system tests whether the direction is parallel to the camera axis, which means that the camera cannot measure in this direction unless it is a 3D camera. If the camera is 2D and parallel to this direction, vision feedback cannot be used in this direction. In that case, if there is a desired force in this direction, the control mode is force control (S_f = 1); if there is no desired force, this direction is position controlled (S_p = 1) by the desired position. If, on the contrary, the camera can monitor this direction, the system tests whether the image processing results can be used reliably. This test is carried out by comparing the results of the last captured image with the results of the previously captured image. If the difference is not plausible and there is no match between the robot motion and the updated results, vision cannot be used reliably. This situation occurs, for example, when the robot moves slowly in the direction x only, while the last two captured images show that the relative position of the target object with respect to the end-effector has changed by large values in all directions. This means that the detection of the target object in one of the two images is incorrect and the vision feedback cannot be used reliably.
In this case, if there is a desired force in this direction, the control mode will be shared control, i.e. S_v = 1 and S_f = 1; if there is no desired force, this direction will be only vision controlled (S_v = 1) with a guarded move (the force sensor is monitored). Whether vision feedback is used in these last two situations depends on a reliability factor: it makes no sense for the system to discard all vision feedback because of one or two images with wrong results. In the proposed system, if the reliability factor is greater than or equal to 80%, the automatic decision system activates the vision.

Fig. 5 Automatic decision algorithm for directions x, y, z

On the other hand, if the vision information can be used reliably and there is no desired force in this direction, the control mode will be vision control, i.e. S_v = 1 and S_f = 0. However, if there is a desired force in this direction and the vision information can be used reliably, there are two situations, depending on the type of target object and the conditions of the task. If the motion speed of the robot is more important than the impact force (the target object is not impact sensitive), the control mode is shared control, S_v = 1 and S_f = 1. If the situation is reversed, for example the target object is impact sensitive, the control mode will be traded control, as equations (7) and (8) show:

if |X_v − X_m| > ε then S_v = 1, S_f = 0          (7)

if |X_v − X_m| ≤ ε then S_v = 0, S_f = 1          (8)

Here ε is a small distance at which the controller switches from vision feedback to force feedback. The system defines the value of ε depending on the priorities of impact quality and motion speed.

Fig. 6 shows the main part of the automatic decision algorithm which defines the control structure for combining vision and force feedback in the orientation angle θ_z, depending on the control structure of the x and y directions. The same algorithm is then applied to the other angles (θ_x depending on y and z, and θ_y depending on x and z). The proposed system tests the control structure of x and y. If they have the same source of feedback, e.g. if they both use force feedback, the control mode in θ_z is force control (S_f = 1). The explanation of this decision is as follows: usually the force/torque sensor is mounted between the last joint of the robot and the gripper, so the coordinate system of the force/torque sensor lies in this sensor. This means that if the robot has a contact force at one point of the gripper, this force produces a torque in the sensor coordinate system. If contact forces between the gripper and the object are desired in x and y, the torque value in the force/torque sensor coordinate system will be measured about the z orientation, so θ_z will be force controlled to ensure a better quality of impact. However, if both x and y are vision controlled, the control mode is as follows: if there is a desired torque about θ_z, the control mode is either shared or traded, depending on the priorities of impact quality and speed of robot motion, as explained in the previous paragraph; if there is no desired torque about θ_z, the control mode is vision control, i.e. S_v = 1 and S_f = 0. On the other hand, one of the two directions x and y, or both of them, may use different feedback (vision and force). In this case, if one of the axes x and y is parallel to the camera axis (the camera cannot measure in this orientation), there are two situations: if there is a desired torque about θ_z, the control mode in θ_z is force control; if there is no desired torque defined about θ_z, the control mode is position control (S_p = 1) with a force-guarded move. If, on the contrary, neither axis is parallel to the camera axis, there are again two situations: if there is no desired torque about θ_z, the control mode is vision control with a force-guarded move; if there is a desired torque about θ_z, the control mode is either shared or traded control.

Fig. 6 Automatic decision algorithm for the orientation angles

Hybrid control is not shown explicitly in Fig. 5 and Fig. 6, because these algorithms are designed for every single direction or orientation separately, whereas hybrid control concerns the whole task space. Hybrid control becomes visible only in the complete selection matrices S_p, S_v and S_f.

In general, every control mode combining vision and force feedback has advantages and disadvantages. Previous work [12] discussed the problems of shared and traded control and suggested mixing both. The main benefit of traded control is the stable impact with the target surface; its limitation is, on the contrary, the high contact forces which can arise unless the system switches from vision feedback to force feedback at an adequate distance before contact. Shared control is useful on surfaces which cannot be detected reliably with vision feedback. Its disadvantages become obvious when the vision system commands motions: the resulting accelerations cause oscillations of the force control
system because of end-effector inertial effects. Hybrid control ignores much of the information provided by visual feedback.
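The per-direction decision logic of Fig. 5, including the traded switching of equations (7) and (8), can be condensed into a short routine. The sketch below is a simplification under assumed inputs (camera-axis test, reliability factor, desired force, impact sensitivity); it is not the authors' implementation.

```python
def decide_axis_mode(camera_sees_axis, vision_reliable, desired_force,
                     impact_sensitive, dist_to_target, eps):
    """Decision for one translational direction, following Fig. 5 (sketch).

    'vision_reliable' stands for the image-consistency test combined with the
    >= 80% reliability factor; all argument names are illustrative assumptions.
    """
    if not camera_sees_axis:                 # 2D camera parallel to this axis
        return 'force' if desired_force is not None else 'position'
    if not vision_reliable:
        # vision kept only as far as the reliability factor allows (guarded move)
        return 'shared' if desired_force is not None else 'vision-guarded'
    if desired_force is None:
        return 'vision'
    if not impact_sensitive:                 # robot speed has priority
        return 'shared'
    # impact-sensitive target -> traded control, equations (7) and (8)
    return 'vision' if dist_to_target > eps else 'force'
```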
In conclusion, the proposed system analyzes the situation of the task and the conditions of vision and force feedback in every direction, in order to exploit all the advantages of the different vision-force control structures and to avoid their disadvantages.

5. EXPERIMENTAL RESULTS

In this section the equipment is described and the experimental results are illustrated. Fig. 7 shows the hardware of the experiment, which consists of the robot Stäubli RX90, a 2D camera Sony XCD-700 and a multi-axis force/torque sensor JR3 (120M50A) with a measurement range of ±200 N for force and ±20 N·m for torque. The system sorts, rearranges and transports books according to their alphabetic/numeric ordering. The vision system detects the objects and defines their characteristics (position, orientation, length, width, etc.) and the relations between them using image processing techniques. It then identifies the code of each book using SIFT features. The automatic decision system, with the help of the vision system, determines the following: how the target object will be grasped, which subspaces will be force controlled and which vision or position controlled, which combination of vision-force control is the most appropriate for the task, and when the system should switch from one mode to the other.

Fig. 7 Hardware equipment

Fig. 8 and Fig. 9 present the measured pose and force/torque values of the robot end-effector in order to illustrate the grasping of book H and book A (see Fig. 7). Books H and A have completely different poses and therefore require distinct processing algorithms; their grasping tasks are solved in different ways. Fig. 8 consists of two diagrams. The first diagram shows the position values of the robot end-effector for x, y and z; the second diagram presents the angle values of the robot end-effector for θ_x, θ_y and θ_z. The vision system detects book H, its characteristics and its pose (in Fig. 7 book H has a diagonal pose). The vision system then analyzes the relation between book H and the other books to see whether the fingers of the end-effector can enter between the books. In the case of book H the system has decided that the gap between book H and the other books allows inserting the parallel fingers of the gripper between the books. In this case the force control loop is not needed. In addition, the camera in this experiment is 2D, so according to the automatic decision algorithms the subspaces are controlled as follows: y, z and θ_x are vision controlled; x, θ_y and θ_z are position controlled with a force-guarded move.

Fig. 8 Grasping book H

As shown in Fig. 8, the task is divided into three phases. In the first phase the robot moves so as to align the gripper with book H in a way that the book can easily be grasped. In the second phase the robot moves only in the x direction to position the fingers of the gripper around book H. In the last phase the gripper grasps book H, after which the robot can sort or transport it.
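For the book H task, the resulting subspace assignment can be written directly as selection matrices; the sketch below only restates the configuration given above, with an assumed guarded-move threshold.

```python
import numpy as np

# Axes ordered (x, y, z, theta_x, theta_y, theta_z).
# Book H: y, z and theta_x vision controlled; x, theta_y and theta_z position
# controlled with a force-guarded move; no force-controlled subspace.
S_v = np.diag([0.0, 1.0, 1.0, 1.0, 0.0, 0.0])
S_p = np.diag([1.0, 0.0, 0.0, 0.0, 1.0, 1.0])
S_f = np.zeros((6, 6))                 # force control loop not needed for book H

GUARD_LIMIT = 10.0                     # [N], assumed guarded-move threshold

def guard_triggered(F_m):
    """Stop the motion if any measured force component exceeds the limit."""
    return bool(np.any(np.abs(np.asarray(F_m)[:3]) > GUARD_LIMIT))
```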
Fig. 9 Grasping book A

Fig. 9 consists of three diagrams. The first and second diagrams present the pose values of the robot end-effector x, y, z, θ_x, θ_y and θ_z. The third diagram shows the measured force/torque values of the robot end-effector (among them the contact force F_z and the torque M_y). As shown in Fig. 7, book A stands at a right angle and is stuck between two other books.
After detecting book A and finding its characteristics, pose and relations to the other books using vision, the system recognizes that there is not enough gap between the books to insert the parallel fingers of the gripper. Hence, the system decides to grasp the book in the way discussed with Fig. 4. As shown in Fig. 9, the task is divided into five phases. In the first phase the robot moves in all directions to align the gripper with the book (y, z and θ_x vision controlled; x, θ_y and θ_z position controlled). In the second phase the robot moves only in the x direction to position the third finger of the gripper above the book. In the third phase the z direction is switched from vision control to force control; in other words, in this phase the robot moves in the z direction and presses on the book until the contact between the robot and the book reaches the range of the desired force F_z (position-based explicit force control), as shown in equation (9):

if F_zm > F_zd + η or F_zm < F_zd − η then Z_d = Z_m − K · sign(F_zd − F_zm)          (9)

where F_zd and F_zm are the desired and measured force in the z direction, η is the limit value of the force and K is the gain of the force control loop.
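A sketch of one iteration of this position-based explicit force control step, equation (9); the band width η and the gain are assumed values used only for illustration.

```python
import numpy as np

def explicit_force_step(z_measured, f_z_measured, f_z_desired,
                        eta=0.5, gain=0.0005):
    """One iteration of equation (9): adjust the commanded z position until the
    measured contact force stays inside [F_zd - eta, F_zd + eta].
    Units, eta and gain are illustrative."""
    if abs(f_z_measured - f_z_desired) <= eta:
        return z_measured                      # contact force inside the desired band
    return z_measured - gain * np.sign(f_z_desired - f_z_measured)
```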
The fourth phase starts when the contact force reaches the range of the desired force for the first time. The situation in the fourth phase is the following: the robot has good contact with the book (the third finger presses on the book with the desired force), and the robot then tries to rotate the book about the y direction. At the same time it slowly pulls the book out in the x direction while controlling the desired force/torque applied to the book, following the same strategy as a human. As long as the conditions of the force control are satisfied, the rotation and pulling-out operations are performed until the book reaches a pose in which the parallel fingers of the gripper can grasp it, as in equation (10):

if M_yd + η > M_ym > M_yd − η then θ_yd = θ_ym − λ_θ and X_d = X_m − λ_x          (10)

where M_yd and M_ym are the desired and measured torque about the y direction, η is the limit value of the torque, and λ_θ and λ_x are the steps of rotating about y and pulling in x. Equation (10) is repeated until θ_y reaches a desired value such that the two parallel fingers can grasp the object. If, however, the measured force leaves the desired force range, the robot moves again according to equation (9). In this phase the torque and the angle about the θ_y direction are controlled at the same time. In the last phase the robot grips book A and sorts it or transports it to another place. On the whole, according to the automatic decision box the subspaces are controlled as follows: x and θ_z are position controlled, y and θ_x are vision controlled, z is traded vision/force controlled and θ_y is shared position/force controlled.

6. CONCLUSION AND FUTURE WORK

This work has suggested an automatic decision system which automatically decides the most appropriate vision/force control structure for different tasks depending on the surrounding environment and the preconditions of the task. This work has used all possible types of vision/force control combinations, and it can use different control structures in different subspaces of the same task in order to ensure the best quality of control. This strategy allows the robot to benefit from all the advantages of the different control structures and to perform complex tasks without the need to be reprogrammed or to rely on human intervention. This work has focused on library automation, with operations such as sorting, storage and retrieval of imprecisely placed objects, as a real application of the proposed control system. The ideas presented here can also be used in pharmacies, warehouses, factories, supermarkets, etc.

In this work we have emphasized the importance of improving the process of choosing the control structure of the robot. In the existing research in this field one can also find different structures of robot control systems which depend on tactile sensors. Hence, future work will concentrate on the fusion of force, vision and tactile sensors.

REFERENCES

[1] D. E. Whitney, "Historical Perspective and State of the Art in Robot Force Control", Int. J. of Robotics Research, 6(1) (1987), pp. 4-17.
[2] N. Hogan, "Impedance Control: An Approach to Manipulation", American Control Conference (1984), pp. 304-314.
[3] K. J. Salisbury, "Active Stiffness Control of a Manipulator in Cartesian Coordinates", 19th IEEE Conf. on Decision and Control, Albuquerque (1980), pp. 95-100.
[4] H. Seraji, "Adaptive Admittance Control: An Approach to Explicit Force Control in Compliant Motion", IEEE Int. Conf. on Robotics and Automation (1988), pp. 1185-1190.
[5] M. Mason, "Compliance and Force Control for Computer Controlled Manipulators", IEEE Trans. on Systems, Man, and Cybernetics, SMC-11 (1981), pp. 418-432.
[6] R. Anderson and M. W. Spong, "Hybrid Impedance Control of Robotic Manipulators", IEEE J. of Robotics and Automation, 4(5) (1988), pp. 549-556.
[7] B. Nelson, D. Morrow, and P. Khosla, "Robotic Manipulation Using High Bandwidth Force and Vision Feedback", Mathematical and Computer Modelling (1996), Vol. 24, No. 5/6, pp. 11-29.
[8] J. Baeten, W. Verdonck, H. Bruyninckx and J. De Schutter, "Combining Force Control and Visual Servoing for Planar Contour Following", Int. J. of Machine Intelligence & Robotic Control (2000), Vol. 2, No. 2, pp. 69-75.
[9] J. Baeten and J. De Schutter, "Hybrid Vision/Force Control at Corners in Planar Robotic Contour Following", IEEE/ASME Transactions on Mechatronics (2002), Vol. 7, No. 2, pp. 143-151.
[10] J. Kofman, X. Wu, T. Luu, and S. Verma, "Teleoperation of a Robot Manipulator Using a Vision-Based Human-Robot Interface", IEEE Transactions on Industrial Electronics (2005), Vol. 52, No. 5.
[11] M. Bdiwi, A. Winkler, J. Suchý and G. Zschocke, "Traded and Shared Vision-Force Robot Control for Improved Impact Control", 8th IEEE International Multi-Conference on Systems, Signals & Devices, Sousse (2011), pp. 154-159.