Copyright © IFAC Intelligent Assembly and Disassembly, Bucharest, Romania, 2003
TELE-ASSEMBLY AND -DISASSEMBLY IN REAL AND VIRTUAL ENVIRONMENTS: CONCEPTS, SYSTEMS AND APPLICATIONS

G. Schmidt, A. Kron, J. Hoogen

Institute of Automatic Control Engineering, Technische Universität München, D-80290 München, Germany
[email protected], [email protected], [email protected]

Abstract: This paper discusses novel concepts, components and systems supporting human operators in tele-assembly and -disassembly tasks, both in real remote and in virtual environments. To increase task performance, the introduction of telepresence techniques such as multimodal feedback and an appropriate display of information to the operator is proposed. The approaches discussed in this article consider those human senses or modalities most relevant for the execution of assembly tasks, i.e. vision, audition and haptics. For enhancement of visual and auditory feedback to the human operator, a novel interactive stereo vision and audition setup is presented. Guidelines for the interconnection of appropriate haptic display components are proposed for the generation of high-fidelity haptic feedback from the task environment at the operator's hand and fingers. The paper also outlines details of developed system components supporting the proposed display concepts. Usability and effectiveness of the developed components and systems are demonstrated in various standard and non-standard assembly/disassembly tasks, both in real remote and in virtual environments. Copyright © 2003 IFAC

Keywords: tele-assembly, virtual prototyping, haptic display, multimodal feedback, human operator
1. INTRODUCTION

Modern production requires the development of advanced assembly/disassembly techniques and systems satisfying demands for shorter production times, frequent and rapid product changes, as well as reduced development times for new product designs. Associated with such demands, there has been growing interest in Internet-based technologies for executing tele-assembly and -disassembly tasks, both in real remote environments (RE) and in virtual environments (VE). A major interest in the execution of assembly tasks in VEs arises from Virtual Prototyping (VP), as employed in the automobile and other industries (NATIBO, 1996). VP tools assist designers in carrying out detailed examinations of virtual objects and avoid substantial expenditures, in terms of time and technology, for the development
of physical prototypes. Moreover, the application of VE technologies proves to be extremely useful for the implementation of novel training systems and simulators, as required for industrial training (Oliveira et al., 2000) or even medical education (Riener et al., 2000).
Fig. 1. Closed-loop interaction with VE and RE.
from others. For the application area under consideration here we will restrict the discussion to the human modalities vision, audition and haptics (touch), neglecting smell and taste. Moreover, in teleaction systems the display of multimodal feedback information often depends on inevitable time-delays (latencies) when transferring data from a RE to the HO or vice versa by Internet-based or other communication technologies. However, for simplification of the subsequent presentation of conceptual ideas, we assume that time-delays are small, e.g. < 5 ms, which proves to be reasonable for communication in LANs. Consequently, the focus of the HSI concepts outlined next will be directed to the goal of providing improved feedback of visual, auditory and haptic information by use of advanced telepresence technologies and systems.
Tele-assembly in real REs is required for physical interaction with hazardous and/or inaccessible surroundings, e.g. in nuclear facilities, in micro-assembly tasks (Reinhart et al., 2001), in underwater and space operations, or in the removal of explosive ordnance. Corresponding technologies are also useful for tele-surgery applications (Ortmaier et al., 2001). In contrast to autonomously executed assembly tasks - e.g. in industrial production - tele-assembly and -disassembly are characterized by strong human-system interaction. The human operator (HO) and the corresponding task environment form a closed-loop system, as illustrated in Fig. 1. Satisfactory performance in exploratory and manipulatory task execution rests upon a sufficient degree of immersion of the HO into the RE. Strong immersion is particularly achieved by perception of multimodal feedback information from the VE or RE (Lederman and Klatzky, 2001). Nowadays, interaction with a RE is typically performed over a human-system interface (HSI) comprising integrated multimodal feedback actuators, each related to one of the five human senses: vision, audition, touch, smell and taste. Evidently, the more modalities are fed back with high quality to the HO, the better his/her task performance will be. On the other hand, the effort for system design and the complexity of such a comprehensive HSI will increase, too. Consequently, technical and economic constraints often require the development of HSI concepts keeping a balance between sufficient multimodal immersiveness and an acceptable system complexity.
2.1 Vision and Audition

Vision and audition are human sensory abilities for perceiving spatial information about objects without touching them. A HO uses this type of information particularly for effectively planning the operations necessary for task execution. A typical manipulatory action is the grasping of a reachable object. For a human it is usually easy to perform such a task in his/her direct environment, since object grasping can make use of multimodal 3D perception. However, if a HO needs to perform the same task in a RE via a teleoperation system, or if he/she is interacting with a VE, performance usually degrades. This results mainly from the fact that in conventional HSIs the visual modality is typically implemented in 2D (Buss and Schmidt, 1999; Baier et al., 1999). Consequently, the question is which improvements in task performance are achievable by introducing stereoscopic vision into visual feedback.
This paper discusses some of our approaches to the application of Internet-based multimodal telepresence technologies for improving various types of tele-assembly and -disassembly tasks. The concepts underlying the corresponding feedback, display and HSI designs take into account typical limitations of hardware devices. The improvements in task performance achievable by introducing advanced forms of multimodal feedback are demonstrated by sample application-oriented experiments both in REs and VEs.
Furthermore, if a HO observes an object at a certain distance by means of stereoscopic vision, the axes of the human eyes cross at this object. The eye axes form an angle with each other, called the vergence angle. Its value depends on the object distance and the individual eye basis. It is well-known that the vergence angle affects the quality of binocular visual feedback. Therefore, high-fidelity stereo vision needs to take into account some sort of automatic vergence control.
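For symmetric fixation, the vergence angle follows from the eye basis b and the object distance d as 2·arctan(b/2d). The sketch below is a minimal calculation of this geometry; the 6.5 cm eye basis is an assumed typical value, not a figure from the system described here:

```python
import math

def vergence_angle_deg(eye_basis_m: float, distance_m: float) -> float:
    """Vergence angle in degrees for two eye axes crossing at an object
    at the given distance; symmetric fixation is assumed, with
    eye_basis_m the interocular distance."""
    return math.degrees(2.0 * math.atan(eye_basis_m / (2.0 * distance_m)))

# Assumed typical eye basis of 6.5 cm at 0.5 m viewing distance:
theta = vergence_angle_deg(0.065, 0.5)  # roughly 7.4 degrees
```

The rapid growth of this angle at short distances is one reason why a fixed vergence setting degrades binocular feedback for near objects.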
The rest of the paper is organized as follows: Section 2 presents novel concepts for multimodal interaction, focusing on the modalities vision, audition and haptics. Section 3 outlines details of newly developed system components. Results of experiments in REs and VEs are discussed in Section 4, followed by concluding remarks and an outlook on future work in Section 5.
Last but not least, humans are accustomed to actively selecting the view direction corresponding to the current region of interest. However, most visual feedback systems are only capable of displaying images with a more or less fixed view direction. On the other hand, the degree of immersion into an environment is apparently increased if the HO is enabled to perform interactive vision into this environment. The following discussions will show that improvements in visual feedback for purposes of tele-assembly can be achieved by means of an interactive stereo vision system.
2. CONCEPTS FOR MULTIMODAL INTERACTION

Presently, the display of multimodal feedback is often limited by available actuator technology as well as by the ability to integrate different technologies into an overall HSI design. A more detailed analysis shows, however, that most tele-assembly applications do not require feedback for all human senses. Individual applications make it possible to distinguish the more important feedback modalities
Improvements in auditory feedback also aim at an enhancement of the HO's feeling of immersion. On the one hand, we may assume that unexpected
auditory feedback may disturb the HO's overall multimodal perception. On the other hand, if a HO interactively changes view direction, the direction of auditory signals should change too for reasons of spatial consistency. The latter requires some sort of interactive auditory feedback. In addition, high fidelity auditory feedback should provide stereo sound. Thus, we can assume that a major improvement of auditory immersion can be achieved by means of an interactive stereo audition system.
c) single or double displays, d) active or passive supplements.

a) Low-powered or high-powered displays. For many applications the use of low-powered force displays proves to be sufficient. These devices are not capable of overriding the force or motion of a HO, e.g. (Massie and Salisbury, 1994). Main application areas are found in micro-assembly (Reinhart et al., 2001) or in the field of surgery (Ortmaier et al., 2001). However, there exist other application scenarios where heavy interaction between the HO and the haptic display is relevant, e.g. the simulation of free-floating heavy objects for astronaut assembly training (Carignan and Akin, 2003), or medical systems for knee joint simulation (Riener et al., 2000) or for rehabilitation. These applications require devices with high force output for the display of high mechanical stiffness, as well as of mass and damping effects. Another motivation for the development of strongly actuated kinesthetic devices is the need to support substantial payloads. Such payloads are additional haptic displays, e.g. kinesthetic finger displays or tactile actuators, mounted at the endeffector of the ground-based display. These devices can also be passive supplements.
2.2 Haptics
Haptic feedback means simultaneous and consistent evocation and processing of kinesthetic and tactile information. In addition to an exchange of signals, the haptic sense also enables a bilateral exchange of energy between the HO and the environment, thus allowing active manipulation of and in the environment. The complexity of human tactile perception and the variety of skin receptors require distinguishing a multitude of tactile submodalities, e.g. vibration, temperature, small-scale shape information, etc. Humans are highly experienced in performing dextrous interactions with objects based on widespread haptic fingertip stimuli, as well as in multi-fingered and bimanual operations. Haptic displays, such as the well-known PHANToM device (Massie and Salisbury, 1994), are commonly characterized by enabling a HO to sense and manipulate an environment by receiving kinesthetic and/or some kind of tactile feedback. Despite considerable progress in the design of haptic feedback devices in the recent past (Burdea, 1996), currently available displays are still far away from the ultimate goal of touching and manipulating objects in REs by intuitive hand and finger motions together with combined kinesthetic and tactile feedback (Kammermeier et al., 2001). Most haptic display devices still force the user into a "system-adapted" behavior, reducing the HO's skills and efficiency. Corresponding displays often prove to be highly specialized, bulky hardware setups enabling a user to perceive a single haptic submodality only and prohibiting system integration.
Fig. 2. Scheme of possible haptic display configurations.
b) Parallel or serial coupled displays. If the human finger or hand contacts an object, tactile skin receptors detect tactile stimuli. Perception of body movement, posture and resulting forces, subsumed as kinesthesia, is sensed by receptors embedded inside the human body. With respect to the human kinematic chain, touch and kinesthesia are coupled serially. A corresponding serial mechanical setup for haptic feedback display purposes would only allow interaction between the skin and a variety of distributed tactile actuators. These tactile stimulators would be mounted on kinesthetic devices - often high-powered displays - at the finger level. As a result, the finger devices would provide the sum of all tactile stimuli forces and torques. This type of hierarchical serial system configuration is rather demanding from a hardware developer's viewpoint. The lacking robustness and signal range of available tactile actuators often prohibit a serial haptic system design. On
The multitude of displays allows classification into low-powered and high-powered haptic display devices providing an HO with force feedback at hands and arms (see Fig. 2). For more comprehensive forms of haptic feedback, such displays are often coupled with active displays for kinesthetic or tactile finger feedback as well as with components providing passive tactile feedback. While many applications do not always require widespread haptic feedback, display of the haptic submodalities relevant for the corresponding task execution may often be appropriate. Due to technical limitations and our objective of reducing system complexity, we propose certain guidelines for the design of advanced haptic displays, focusing on: a) low- or high-powered displays, b) parallel or serial coupled displays,
d) Active or passive supplements. In many cases system complexity can be reduced if active tactile actuator components are replaced by passive components, providing a HO with passive feedback. Passive components can be endeffector designs or tools the HO would use while performing similar tasks in real local situations. For instance, if the HO mounts screws, optimal tactile feedback will be achieved if the HO interacts with a real screwdriver as endeffector instead of the common handle often used in haptic displays. Holding the screwdriver instead of a handle results in high-fidelity perception and strong immersion at reduced system complexity, but at the cost of a loss of generality, since the system can then be used for specific applications only. Nevertheless, the integration of passive supplements instead of active actuator components is a remarkable aspect of low-cost haptic display design.
the other hand, a serial feedback approach will typically result in strongly immersive systems, however at the price of considerable cost and system complexity. An alternative to the serial paradigm is to combine tactile and kinesthetic devices in a more or less parallel configuration. In this case, kinesthetic feedback on individual fingers is not provided via the tactile finger devices, but as a parallel supplement. Analogously, hand/wrist kinesthetic feedback directly affects the HO's hand. As an example of such a configuration, let us assume a HO wearing a hand force exoskeleton coupled at the wrist to a low- or high-powered haptic feedback display (see Fig. 1). Now wrist and finger forces are applied by force stimuli displayed in parallel at the HO's hand. The assumption underlying the parallel paradigm is that multimodal human perception can cope with certain shortcomings concerning intermodal consistency as long as certain threshold values are not exceeded. In this case a HO needs to fuse the haptic stimuli displayed in parallel into a correct overall haptic perception. Although the parallel feedback approach may be capable of overcoming several technical limitations of combined kinesthetic and tactile feedback, it depends heavily on the HO's sensory fusion abilities.
The choice between the proposed concepts for haptic display development certainly depends on the kind of application. The next section presents a selection of system components representing technical realizations of some of the proposed concepts.

3. SELECTED SYSTEM COMPONENTS
c) Single or double displays. Some assembly tasks can be executed by the HO on a single-handed basis. Such tasks only require single display configurations generating haptic feedback. However, human skill for complex task execution in everyday life mainly depends on the capability of coordinated bimanual action. Psychological studies classify human bimanual operation into symmetric and asymmetric fashion (Lindeman, 1999; Hinckley, 1996). Moreover, the physiology and psychology literature advocates a move away from the traditional view that people are either right- or left-handed (Guiard, 1987). Instead, task execution is accomplished using two hands performing different roles. This knowledge is summarized in a commonly accepted kinematic chain model of both hands (Guiard, 1988), resulting in 3 hypotheses which should be taken into account for the design of bimanual HSIs:
3.1 Interactive Stereo Vision and Audition Setup
Since appropriate hardware is getting less expensive, state-of-the-art applications with VEs improve the feeling of presence and task performance by making use of stereoscopic vision, e.g. (Barfield et al., 1999; Koh et al., 1999). In contrast, the application of stereoscopic vision to REs causes more difficulties due to the necessity of communicating two synchronous image streams to the HO. In addition, various geometric boundary conditions have to be met to achieve realistic visual impressions (Brooker et al., 1999). Currently, applications employing stereoscopic visual telepresence are barely developed. Therefore, an interactive stereo vision system has been developed in our lab (Baier et al., 2000; Baier et al., 2001), as depicted in Fig. 3.
• The role of the non-dominant hand (NDH) is not only to provide stability to the object acted upon by the dominant hand (DH). The NDH additionally provides a reference frame for the work done by the DH.
• The NDH has a much coarser resolution of motion in comparison to the DH. Therefore, the DH accomplishes the actions requiring higher precision.
• NDH actions always have temporal precedence over DH actions. This means that the reference frame must be set by the NDH before the DH undertakes precise actions.
Fig. 3. Architecture of an interactive stereo vision telepresence system.

A stereo camera pair mounted on a pan-tilt-roll head, together with framegrabber and JPEG compression hardware on a SUN workstation, is employed for image acquisition in the RE. The stereo video stream is sent to a head-mounted
Often, consideration of these hypotheses makes it possible to duplicate single-handed haptic displays for bimanual use. This results, however, in a more complex hardware setup with double displays.
are coupled by a mechanical connector. Fig. 4b illustrates that the HO's forearm is fixed behind the wrist by means of a strap. With this type of coupling the HO retains control of all passive DoFs of his/her wrist motion, thus ensuring intuitive hand motion.
display (HMD) at the operator site via the Internet. The orientation of the HO's head is tracked in 3 DoFs by means of a gyro-sensor and communicated to the angular position control system of the pan-tilt-roll head. In addition, the vergence angle is controlled depending on an infrared distance measurement. This system allows the HO an interactive stereo view into a remote scenario, leading to improved performance in teleoperation, as evaluated in (Baier et al., 2000). The current frame rate of the stereo stream is 3 to 5 frames/s, while the bitrate is about 1 Mbit/s.
In addition to this type of kinesthetic force feedback, integrated tactile fingertip modules (TactTip) allow the display of vibrotactile and thermal stimuli directly at the fingertips as further parallel supplements (Kron and Schmidt, 2002; Kron and Schmidt, 2003), see Fig. 5.
The pan-tilt-roll head is also equipped with a microphone pair recording stereo sound information in the RE. Auditory information is displayed to the HO via headphones. The overall system ensures interactive visual and auditory perception of REs. Experimental results using this setup for a benchmark assembly task are outlined in Section 4.1.
Vibrotactile feedback is produced by means of a module comprising a miniaturized DC-motor with a free-wheeling out-of-balance mass at the top of the motor shaft. With a maximum speed of 10,000 RPM, the vibration motor produces vibratory frequencies of up to 166 Hz with amplitudes > 3 µm. This type of vibrotactile feedback stimulates Pacinian as well as Meissner corpuscles embedded in the human fingertip (Bruggencate, 1994).
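The relation between rotor speed and vibration frequency follows from the fact that an eccentric mass produces one force cycle per shaft revolution; a quick check of the quoted figures:

```python
def vibration_frequency_hz(rpm: float) -> float:
    """An eccentric-mass vibration motor produces one unbalance-force
    cycle per shaft revolution, so the vibration frequency equals the
    rotor speed divided by 60."""
    return rpm / 60.0

f_max = vibration_frequency_hz(10_000)  # about 166.7 Hz at maximum speed
```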
3.2 Wrist/Finger Haptic Display For purposes of comprehensive haptic feedback perception we have also designed a Wrist/Finger Haptic Display (WFHD), see Fig. 4. The concept underlying this device is display of forces at the HO's wrist with a low-powered display in parallel to kinesthetic and tactile stimuli at the fingers by means of additional actuator supplements (Kron et al., 2000; Kron and Schmidt, 2002).
Fig. 5. (a) Tactile fingertip feedback module, (b) system integration of TactTip modules into a haptic glove.
Wrist force feedback is provided by the non-portable high-performance Desktop Kinesthetic Feedback Device (DeKiFeD4), as shown in Fig. 4a. This SCARA-type robotic arm allows the HO to generate proprioceptive inputs with 4 active DoFs (3 translations, 1 rotation) in Cartesian space. The DeKiFeD4 has a comparatively wide workspace of about 80×25×30 cm³ and a high force capability of up to 120 N and 20 Nm. Kinesthetic feedback is generated by means of DC-motors with harmonic drives directly mounted in the joints. The displayable force range is appropriate for providing a HO with kinesthetic feedback during typical assembly/disassembly tasks in REs.
The additionally integrated temperature feedback display comprises 4 serially connected Peltier elements producing cold as well as hot temperatures. The elements are pasted onto a thin aluminium plate and achieve heating and cooling rates of +4.0 K/s and -2.5 K/s, respectively, within a temperature range from 8 °C to 80 °C. For safety reasons the highest and lowest output temperatures are limited to 42 °C and 15 °C. To avoid excessive module heating, a water cooling circuit is integrated into the module (see Fig. 5a). The miniature size of the TactTip module allows multi-fingered system integration into an exoskeleton-based kinesthetic feedback display, as illustrated in Fig. 5b. The presented WFHD system demonstrates an overall system design towards more holistic haptic feedback generation, considering combined kinesthetic and tactile feedback in a parallel coupled system configuration. Experimental results using this display in a VP scenario are outlined in Section 4.3.
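A temperature command for such a module has to respect both the safe output range and the achievable heating/cooling rates. The following sketch combines the figures quoted above into one update step; the function and its control structure are illustrative, not the authors' implementation:

```python
def next_setpoint(current_c: float, target_c: float, dt_s: float,
                  heat_rate: float = 4.0, cool_rate: float = 2.5,
                  t_min: float = 15.0, t_max: float = 42.0) -> float:
    """One update step of a hypothetical TactTip temperature command:
    the target is clamped to the safe output range (15..42 degC) and
    approached no faster than the module's heating (+4.0 K/s) and
    cooling (-2.5 K/s) rates allow."""
    target = min(max(target_c, t_min), t_max)  # safety clamp
    max_step = (heat_rate if target > current_c else cool_rate) * dt_s
    step = target - current_c
    if abs(step) > max_step:                   # rate limiting
        step = max_step if step > 0 else -max_step
    return current_c + step
```

With a 1 s step, a command of 100 °C from 20 °C is first clamped to 42 °C and then approached at only +4 K per second.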
Fig. 4. (a) Desktop Kinesthetic Feedback Device with 4 DoFs (DeKiFeD4), (b) combined hand and finger force feedback display.

Additional finger force feedback is implemented in a parallel configuration by wearing the commercially available sensorized hand force exoskeleton CyberGrasp from Immersion Corp., as depicted in Fig. 4b. This haptic glove system produces finger forces of up to 10 N per finger, which proved to be adequate for perceiving fingertip contacts with objects. Both kinesthetic feedback devices
3.3 Bimanual Haptic Telepresence System

Nowadays, bimanual HSIs for interaction in VEs or REs with haptic feedback are still weakly developed. Recently proposed displays for bimanual coordinated action in VEs are the Haptic Workstation from Immersion (Immersion Corp., 2002) and the string-based device SPIDAR-8 from the
Tokyo Institute of Technology (Walairacht et al., 2001). To date, bimanual haptic telepresence systems have only been designed for telesurgery applications, as discussed in (Cavusoglu et al., 1999; Waldron and Tollon, 2003).
cuboid overlap of approximately 60×20×30 cm³ is achieved. Use of the total, much larger bimanual workspace at the operator and teleoperator sites is ensured by means of an indexing procedure.¹ This overall bimanual telepresence system takes into account the design rules proposed earlier for bimanual coordinated human-system interaction. The principle of human hands accomplishing asymmetric roles is considered by means of different gripper configurations which are interchangeable for right- or left-handed use. The similar, mirrored kinematic chains of the SCARA-type robotic arms at the operator and teleoperator sites resemble the similarity of the kinematic chains formed by the two human arms. Experimental results obtained when using this bimanual haptic telepresence system for tele-assembly tasks are outlined in Section 4.2.
For purposes of bimanual coordinated task execution in real REs, we designed a novel bimanual haptic telepresence system as depicted in Fig. 6. To ensure bimanual position input as well as display of bimanual kinesthetic feedback in 4 DoFs at each of the HO's hands, we extended the right-handed DeKiFeD4 display (see Section 3.2) by a second desktop device with mirrored joint configuration for left-handed use (see Fig. 6a). Object manipulation in the RE can be achieved by means of a bimanual teleoperator system, as illustrated in Fig. 6b. By duplicating both DeKiFeD4s and mounting two-jaw grippers as endeffectors instead of the handles for the human hand, we built a left- and right-sided Desktop Kinesthetic Teleoperator (DeKiTop4) system, each with 4 active DoFs positioning the gripper and an additional active DoF for grasping.
Fig. 7. Cartesian workspace of the bimanual HSI and teleoperator system.

3.4 Robotic Haptic Display
Fig. 6. (a) Bimanual haptic feedback display, (b) bimanual teleoperator system.
For a system configuration with a high-powered haptic display we employ an off-the-shelf industrial robot (see Fig. 8): the medium-sized Stäubli RX90 with the well-known 6 DoF PUMA-like kinematics. The spherical workspace has a radius of 98.5 cm around the base. The joints are actuated by geared brushless AC motors and support a nominal payload of 6 kg at the endeffector. Resolvers on the motor axes provide analog joint angle measurements, the signals being transformed to encoder counts in the joint power amplifiers. A JR3 6 DoF force/torque sensor is mounted on the endeffector for measuring the force and torque exerted on the HO's hand.
Taking into account that the two hands accomplish different roles in task execution, we designed one gripper with horizontally arranged jaws. This gripper is used by the NDH, performing basically passive actions, such as maintaining a remote object in a stable grasp position for more dextrous manipulation by the DH. The NDH gripper has a maximum opening of 9 cm and produces grasp forces of up to 70 N at each jaw. The second gripper, controlled by the DH, has a vertical arrangement of the jaws. With grasp forces of up to 40 N at each jaw and a gripper opening of 6 cm, this gripper is used for performing dextrous and active actions, such as screwing. Each gripper is equipped with force sensors measuring grasp forces. The gripper opening is continuously controlled by potentiometers integrated into the user's handles. Both gripper configurations allow a sufficient overlap of the two teleoperator workspaces, which is required for the execution of bimanual coordinated assembly tasks.
The original RX90 control system rests upon a VME-Bus architecture with cards for sensor reading, actuator output and a processor board for Cartesian control loops, path planning and application programming. In order to achieve higher, deterministic sampling rates for implementation of more complex controllers and integration of VE models, a parallel PC-based controller has been
The bimanual Cartesian workspace at the operator and teleoperator sites is mainly determined by the first two joint angles q1, q2 of the SCARA-type kinematic chains. Fig. 7 shows the common bimanual workspace when keeping the remaining joint angles q3, q4 fixed. As a result, a
¹ An indexing procedure temporarily interrupts the interconnection between the operator and teleoperator sites. After changing the current DeKiFeD4 configuration, the new relative pose input at the operator site is added to the actual teleoperator pose.
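The indexing procedure of the footnote can be sketched in one dimension as a clutching scheme: while the interconnection is interrupted the teleoperator holds its pose, and after re-engagement operator motion is applied as a relative offset. Class and method names here are hypothetical:

```python
class PoseIndexing:
    """Minimal 1-D sketch of an indexing (clutching) procedure: while
    the clutch is engaged the teleoperator holds its pose; on release,
    subsequent operator motion is added as a relative offset to the
    held teleoperator pose."""

    def __init__(self) -> None:
        self.offset = 0.0       # teleoperator pose minus operator pose
        self.clutched = False
        self.teleop_pose = 0.0

    def engage_clutch(self) -> None:
        self.clutched = True    # interconnection interrupted

    def release_clutch(self, operator_pose: float) -> None:
        # Re-reference: the current operator pose now maps to the held pose.
        self.offset = self.teleop_pose - operator_pose
        self.clutched = False

    def update(self, operator_pose: float) -> float:
        if not self.clutched:
            self.teleop_pose = operator_pose + self.offset
        return self.teleop_pose
```

An operator reaching the edge of the HSI workspace can thus clutch, move the hand back, release, and continue the teleoperator motion from where it stopped.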
developed. Configuration and calibration of the robot are still performed with the original architecture, while for haptic display purposes control is passed to the PC by switching the joint amplifier input/output signals of the robot to the appropriate signal processing cards in the PC. In the latter case the control runs under Real-Time Linux at 4 kHz and 50% processor load on a Pentium III 1 GHz, leading to high accuracy and stability of the implemented control loops.
Table 1. Evaluation results using stereo vision for operations in the RE.

vision            | It [s]: mean |   σ   | If [N²s]: mean |   σ
monocular         |    31.70     | 12.61 |     138.99     | 145.57
binocular, fixed  |    25.49     |  6.47 |     111.70     |  73.67
binocular, auto   |    24.63     |  9.47 |     138.67     | 167.37

teleaction system in a prototypical manipulation task (Baier et al., 2000). The kinesthetic teleaction system comprises the teleoperator DeKiTop4 (right-sided) equipped with a two-jaw gripper and the desktop device DeKiFeD4 (right-sided), as described in Section 3.3. The scenario under consideration is shown in Fig. 9. The application is a peg-in-hole tele-reassembly task. The HO is supposed to grasp a cog, disassemble it from a gearbox, install it at another location and release it. This sequence of operations has to be repeated for all four jacks of the gearbox. Forces between cog and environment are measured and fed back to the HO via the kinesthetic feedback device. Vision information is received through a HMD from the interactive camera pair mounted on the remote pan-tilt head.
Fig. 8. Industrial robot used as a haptic display.

As the robot can exert high forces and the HO is coupled to the robot by holding its endeffector, substantial effort has been put into avoiding human injury in case of system failure. A first step is to prevent the HO from entering the workspace with head or body. This has been realized by mounting the robot on a table-like base with guardrails at hip and head height (see Fig. 8). This is crucial, as the operator usually wears a HMD and cannot see the robot. Other mechanical safety features are a commercial pneumatic overload clutch and a magnetic clutch, both mounted on the endeffector. These devices assure that the operator is disengaged from the haptic display if forces exceed a specified maximum in any of the 6 DoFs. Switches inside the clutches are wired in series with common emergency buttons as well as a dead-man switch. An additional safety board has been developed which checks various watchdog signals, maximum joint velocities and commanded torques. The third level of the safety measures is software-based. The maximum allowed workspace and velocity are monitored in Cartesian and joint space. The respective algorithms run on the PC and on the Stäubli controller simultaneously, such that again high reliability is ensured by redundancy. The PC also monitors maximum forces via the force/torque sensor. All hardware and software safety checks are linked to the robot safety board, which disables the joint amplifiers and engages the robot brakes in case of system failure. The measures described above provide a maximum level of safety for interaction between a HO and the industrial robot. Corresponding experimental results achieved with this robotic haptic display are outlined in Section 4.4.
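The software level of such a safety layer can be sketched as a simple combined limit check on workspace, velocity and force. All numeric limits below are illustrative assumptions, except the 98.5 cm workspace radius taken from the RX90 description; the actual thresholds of the system are not given in the text:

```python
from dataclasses import dataclass

@dataclass
class SafetyLimits:
    max_speed: float = 0.5           # m/s, Cartesian (assumed value)
    max_force: float = 60.0          # N, force/torque sensor (assumed value)
    workspace_radius: float = 0.985  # m, spherical RX90 workspace

def safety_ok(pos, speed: float, force: float,
              limits: SafetyLimits = SafetyLimits()) -> bool:
    """Sketch of a software safety check: returns False (amplifiers
    disabled, brakes engaged) if the endeffector leaves the allowed
    workspace, moves too fast, or the measured force is excessive."""
    in_workspace = sum(p * p for p in pos) ** 0.5 <= limits.workspace_radius
    return in_workspace and speed <= limits.max_speed and force <= limits.max_force
```

In the described system such a check would run redundantly on the PC and on the robot controller, with its result wired to the safety board.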
Fig. 9. Experimental setup of a multimodal tele-reassembly task environment.

For comparison purposes the re-assembly task is always executed in identical order, however with the following two conflicting objectives during task completion: accomplishment of the task
• in minimal time, i.e. It,
• and with minimal integral of squared forces applied to the objects, i.e. If.
This evaluation experiment was performed with 3 male and 1 female subjects between 20 and 30 years old, all familiar with telepresence techniques. Each subject performed the 4 single re-assembly tasks 4 times. In all cases the pan-tilt head was controlled interactively by the HO's head motion. The first 4 single tasks were accomplished with monocular vision, the second with binocular vision and fixed vergence angle γ, the third with binocular vision and automatic vergence control, and the fourth 4 single tasks again with monocular vision. Table 1 presents quantitative evaluation results taking learning effects into account under the assumption of linear learning progress.
4. EXPERIMENTS IN REAL AND VIRTUAL ENVIRONMENTS
4.1 Tele-Assembly/Disassembly assisted by Interactive Stereo Vision

The benefits of binocular active vision are demonstrated by using the interactive stereo vision and audition setup together with a 4 DoF kinesthetic feedback device.
Obviously the task can be performed on average in about 6 s, i.e. about 20% faster with binocular compared to monocular vision. An additional second of execution time can be saved by
use of automatic vergence adjustment compared to fixed vergence. However, shorter execution time requires higher manipulation speeds, and faster motion leads to substantial overshoot with rapidly increasing contact forces that can no longer be controlled by the HO. This seems to be the reason why the force index I_f does not decrease when higher visual performance is provided. The experiment demonstrates an increase in task performance when executing a tele-assembly task with stereo vision.
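The learning-effect correction with the assumption of a linear learning progress can be sketched by least-squares fitting a linear trend over the repetitions of a task. This is one plausible reading of how such a correction is performed; the function and variable names below are hypothetical, not taken from the paper.

```python
# Sketch: fit t_i ≈ a + b*i over repetitions i = 0..n-1 by least squares.
# 'a' is the execution time extrapolated to repetition 0, 'b' the
# per-repetition learning gain (negative when the subject speeds up).
# Requires at least two repetitions.

def fit_learning_trend(times):
    n = len(times)
    xs = range(n)
    mx = sum(xs) / n
    mt = sum(times) / n
    b = (sum((x - mx) * (t - mt) for x, t in zip(xs, times))
         / sum((x - mx) ** 2 for x in xs))
    a = mt - b * mx
    return a, b
```

Comparing the trend-corrected values a between vision conditions removes the bias that later conditions benefit from practice gained in earlier ones.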
with feed-forward action and adaptive (contact-force-dependent) gains. Typically, HOs are not capable of reacting fast enough to a display of high force impulses with their own reactive forces applied to the haptic display. This leads to operator position displacements which may even result in instability of the user-teleoperator loop. To avoid such instability we adapt measured teleoperator forces by non-linear filter algorithms as well as by force scaling methods. Fig. 11 shows experimental results of measured and displayed forces while unscrewing the lid of the flask.
4.2 Bimanual Haptic Tele-Assembly
As a benchmark test for evaluating the performance of the bimanual haptic telepresence system presented in Section 3.3, we performed two-handed opening and closing procedures of a remote flask. This kind of interaction primarily requires grasping the flask with the NDH and lifting it into 3D space. According to the asymmetric role of the hands, the NDH sets the reference frame by means of a stable grasp, while the DH performs the more precise action, i.e. screwing or unscrewing the lid of the flask. The capability of unscrewing a flask in a RE could find realistic application in scenarios where object manipulation in dangerous, e.g. contaminated or radioactive, environments is required, see Fig. 10.
(Fig. 11 plots: forces (N) along the X-, Y- and Z-axes and torque (Nm) about the Z-axis, for the non-dominant and the dominant hand, over 0-35 s; annotated events include "flask lifted into 3D space", "start of closing flask", "pressed on top of flask" and "flask is closed".)
Fig. 11. Measured teleoperator forces and forces adapted for display at the HSI while closing the flask.

As mentioned earlier, indexing is used for utilizing the bimanual workspaces on the operator and teleoperator site. Several successive indexing operations may lead to entirely different configurations at the operator and teleoperator site. The situation occurs that a HO is enabled to close the flask in the tele-environment with hands a wide distance apart from each other on the operator site. This is contrary to a typical flask closing procedure in a real physical environment. It is, however, noteworthy that such a situation does not disturb the HO's overall haptic perception of the given task. Apparently, the HO mainly depends on visual and kinesthetic feedback from the RE for task execution and neglects pose/motion input of his/her own hand configuration. A more serious problem occurs when indexing operations result in a desired pose input exceeding the available teleoperator workspace. Such a situation can be avoided by analyzing the pose input relative to a soft range around the maximum teleoperator workspace. In case of penetration, computed reactive force feedback lets the HO actively sense and avoid the teleoperator's workspace constraints. We found also that scaling of pose input can improve system performance. For manipulating the flask we did not scale translational DoFs
Fig. 10. Bimanual tele-assembly application: opening and closing of a flask.

In the presented system the HO controls the left and right DeKiTop4 to open or close the remote flask by providing two-handed position inputs via the left and right DeKiFeD4 at the operator site. Utilization of the overall workspace on both sites is ensured by means of indexing. Gripper openings are adjusted by potentiometers placed at the DeKiFeD4s' handles. At the teleoperator site, position controllers ensure endeffector positioning at the desired pose. In case of contact with remote objects, 6-axis JR3 force/torque sensors measure the resulting contact forces. To avoid application of excessive contact forces, the DeKiTop4s' high stiffness is reduced by position-based impedance control (Carignan and Smith, 1994). The impedance controller ensures adjustment of a desired teleoperator compliance in contact situations. Measured contact forces are sent via the lab LAN to the operator site. 6-axis JR3 force/torque sensors integrated in the DeKiFeD4s enable control of the forces displayed between the devices and the HO. Standard force controllers are applied
Fig. 12. Architecture of the haptic rendering engine, sampling rate 1 kHz.

the car dashboard. During this action the HO is capable of perceiving multimodal information: 3D visual feedback using shutter glasses, detailed kinesthetic feedback at wrist and fingers, as well as temperature and vibratory feedback at the fingertips of the acting hand by using the WFHD.
but we scaled the rotational DoF with a ratio of 1:2. Such scaling avoids overly large rotations at the operator site and speeds up task completion. Further improvements can be achieved by strategies for detecting possible telemanipulator collisions. Corresponding algorithms generate reactive force feedback warning the HO of upcoming collisions. This measure helps to avoid undesirable teleoperator configurations.
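The soft-range workspace check, in which reactive force feedback is generated once the pose input penetrates a band around the teleoperator workspace limit, can be sketched for one axis as follows. The soft-range width and wall stiffness are illustrative assumptions.

```python
def boundary_force(x, x_max, soft_range=0.05, k_wall=500.0):
    """Reactive force pushing the HO's input back once it enters the
    soft range below the workspace limit x_max: zero inside the free
    workspace, spring-like inside the band. Values are placeholders."""
    penetration = x - (x_max - soft_range)
    return -k_wall * penetration if penetration > 0 else 0.0
```

Applied per DoF, this lets the operator feel the workspace boundary as a gradually stiffening wall before the hard limit is reached, instead of having motion cut off abruptly.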
The stereoscopic view supports optimal planning of the insertion operation. Detailed kinesthetic feedback at the wrist and fingers enables the HO to experience the shape of the radio, to sense constraints in case of object-to-object contacts and to perceive at his/her hand the state of a performed grasp. Vibratory feedback proves to be useful for texture feedback of object surfaces as well as for additional augmentation of object sliding, e.g. if the radio is held by an unstable grasp. Temperature feedback enables the HO to perceive the heating-up of the switched-on radio. In addition, a gripped radio is visually augmented by an illuminated grasp detection signal.
The experiment demonstrated that the bimanual haptic telepresence system enables intuitive HO motions. Users were able to accomplish bimanually coordinated task execution in the RE with asymmetric roles of the two two-jaw grippers. The behavior in the RE resembles the accustomed bimanual behavior when manipulating a flask in reality. Several actions during closing and opening operations demonstrated the necessity of force feedback, particularly if vision is occluded, e.g. when mounting the lid of the flask on the screw thread, or when deciding whether the flask is completely closed or not. The bimanual telepresence system demonstrates that bimanual assembly can be achieved by simply duplicating single displays into double displays. Future work will extend this system by integrating hand force exoskeletons - as used for the WFHD (see Section 3.2) - into the presented bimanual HSI for additional finger force feedback.
4.3 Virtual Prototyping with Wrist/Finger Haptic Display

Fig. 13. Insertion of a radio in a VP environment.

Display of high-fidelity feedback from the VP environment in all proposed haptic feedback modalities requires appropriate haptic rendering algorithms. Fig. 12 depicts the scheme of the overall haptic rendering engine. Fingertip, hand, and object position/orientation inputs are required for computing haptic feedback by means of 3 specific rendering engines related to the haptic submodalities. Fingertip-to-object interaction is detected by means of simple point-to-object collision tests. A penalty-based method is applied to compute the user's interactive forces on environmental objects
As a benchmark for evaluation of the presented WFHD (see Section 3.2) and the respective haptic rendering algorithms, we implemented a VP environment with the objective of performing non-trivial multi-fingered haptic exploration and manipulation tasks (Kron and Schmidt, 2001). The task under consideration here emulates the insertion of a radio receiver into a virtual car dashboard, as illustrated in Fig. 13. During the insertion process the HO interacts via his/her virtual hand avatar with stiff box-type objects in the VE. The task comprises grasping the radio located on a rack and inserting it into the corresponding slot within
4.4 Applications with Robotic Haptic Display
assuming a linear spring-damper system forced by the object penetration depths (Kron and Schmidt, 2001). Texture feedback computation algorithms take into account tapping and sliding as typical procedures when exploring virtual object surfaces (Kron and Schmidt, 2002). Temperature feedback considers physical heat transmission phenomena by modelling the tissue-layered dynamic temperature distribution inside the human fingertip (Kron and Schmidt, 2003).
High-powered haptic displays can also be useful in VP environments. As a typical example we implemented a virtual screwdriver experiment in 1 DoF using the robotic haptic display (see Section 3.4). In this scenario the HO is enabled to tighten or loosen a virtual screw on the side mirror of a car (see Fig. 15b) by turning the endeffector of the haptic display (see Fig. 15a). As the operator can feel the torque, he/she can make full use of his/her natural mechanical skills to perform this task. Visual feedback is provided via a HMD. In addition, realistic tactile feedback is passively provided by using a screwdriver directly as the endeffector of the robotic haptic display.
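A 1 DoF virtual screw can be rendered with a very simple torque law. The sketch below, with assumed friction and thread-stiffness values rather than the parameters of the actual experiment, illustrates the idea: constant thread friction while the screw turns freely, plus a stiff ramp once it seats.

```python
def screw_torque(angle, angle_tight, k_thread=0.8, t_friction=0.3):
    """Torque (Nm) felt when turning a virtual screw: constant thread
    friction below the seating angle angle_tight (rad), then a stiff
    linear ramp. Parameter values are illustrative assumptions."""
    if angle < angle_tight:
        return t_friction
    return t_friction + k_thread * (angle - angle_tight)
```

Displaying this torque at the robot's endeffector lets the operator feel the screw "bite" and tighten, which is the cue exploited by natural mechanical skills.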
The currently implemented VP application is still restricted to interaction with non-deformable rigid objects, guaranteeing the real-time capability of the developed rendering algorithms. The radio receiver is represented as a box-type object with a mass of 0.75 kg. For the computation of interactive wrist and finger forces, stiffness coefficients were selected as 750 N/m and damping coefficients as 0.5 Ns/m. The implemented algorithms are capable of computing correct interactive forces and produce realistic feedback for each haptic submodality. As an example, Fig. 14 illustrates typical computed interactive finger and wrist forces during a grasp task and while lifting the radio into 3D space.
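With the coefficients stated above, the penalty-based force computation reduces to a scalar spring-damper law per contact point. A minimal sketch, assuming penetration depth and its rate are already available from the collision test:

```python
def penalty_force(depth, depth_rate, k=750.0, d=0.5):
    """Penalty contact force (N) from penetration depth (m) and its
    rate (m/s), using the paper's coefficients: k = 750 N/m stiffness,
    d = 0.5 Ns/m damping. Zero when the fingertip is not in contact,
    and clamped at zero so damping cannot produce a pulling force."""
    if depth <= 0.0:
        return 0.0
    return max(0.0, k * depth + d * depth_rate)
```

At the 1 kHz haptic rate this evaluation is cheap enough to run for every fingertip-object pair, which is what keeps the rendering real-time capable for rigid objects.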
Fig. 15. Robotic haptic device in a VP scenario.
It is noteworthy that the operator using the WFHD was capable of performing intuitive motion patterns during task execution. The parallel display of multi-fingered and wrist kinesthetic feedback was perceived by the HO as a single high-fidelity force sensation despite the two separately applied force stimuli. Besides force feedback, HOs demonstrated their capability to sense additional parallel tactile stimuli, fusing all haptic stimuli into a correct perception of the VP environment.
Another experimental setup makes use of the robotic haptic display for an orthopaedic training system. Injuries, diseases, and pre- and post-surgical properties of the human knee joint can be evaluated by performing different kinds of clinical tests based on specific movements of the knee. However, much experience is required to diagnose pathological joint properties. Due to limited access to patients, effective training of medical students is difficult and time-consuming. Therefore, a multimodal virtual human knee joint simulator was developed in our lab by a team of engineers and orthopaedic physicians (Riener et al., 2000; Hoogen et al., 2002). The goal of the project is to provide a simulator for medical training where students can move an artificial shank which is attached to a haptic display (see Fig. 16). While the user gets a realistic tactile impression from touching the artificial shank, the haptic display lets the operator perceive the impedance dynamics of the human knee joint. Thereby the student can experience and compare the dynamic properties of healthy and pathological human knees.
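Rendering knee impedance on the haptic display amounts to commanding a resistive torque from the measured joint state. The sketch below uses a strongly simplified angle-dependent stiffness, not the full biomechanical model of Riener et al.; the saturation value reflects the stiffness magnitudes reported for the knee, while the gain shape and damping are illustrative assumptions.

```python
def knee_torque(theta, theta_ref, omega, k_max=200.0, d=1.0):
    """Resistive knee torque (Nm) for joint angle theta (rad), reference
    angle theta_ref and angular velocity omega (rad/s). Stiffness grows
    with deflection and saturates at k_max = 200 Nm/rad; this is a
    simplified stand-in for the nonlinear biomechanical model."""
    dtheta = theta - theta_ref
    k = k_max * min(1.0, abs(dtheta))  # stiffness ramps up toward k_max
    return -(k * dtheta + d * omega)
```

Swapping in different stiffness profiles for healthy and pathological joints is then what lets the student feel the diagnostic differences during the simulated clinical tests.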
In conclusion we can say that wrist force feedback enables the HO to perceive physical object constraints. It avoids the unrealistic HO experience known from vision-only systems, i.e. penetrating objects with hands and fingers without feeling any resistance. Multi-fingered force feedback supports a more intuitive operator behavior. The tactile feedback successfully supplements the multimodal perception and helps to make task execution more realistic. More advanced VP applications will require a HO to use both hands for more complex task executions. For this reason we are currently investigating an extension of the WFHD design approach by duplicating the system for bimanual use.
Fig. 16. Multimodal knee joint simulator for orthopaedic training.

Due to the complex biomechanics, the dynamic model of the 6 DoF human knee joint is highly
Fig. 14. Computed interactive finger and wrist forces during a grasp procedure.
of this approach is demonstrated by an assembly benchmark experiment: inserting a radio receiver into a virtual car dashboard.
nonlinear (Riener et al., 1996). The dominant dynamic properties are stiffnesses of up to 200 Nm/rad and 100,000 N/m for different angles and pathologies.
Future work will extend the bimanual haptic feedback display with multi-fingered haptic feedback generation, as already applied to advantage in the single-handed approach. This WFHD-based extension is expected to increase the remote work efficiency of bimanually coordinated tele-assembly and -disassembly. However, the approach of developing a universal multimodal HSI for arbitrary assembly tasks is not recommended. It seems more appropriate that system designers carefully select quality and type of feedback modality with respect to individual task requirements, system complexity and usability.
5. CONCLUSIONS AND OUTLOOK

This paper discussed several ideas and techniques for improving the execution of tele-assembly and -disassembly tasks, both in VEs and REs. A main assumption of our research is that task performance can be increased by introducing telepresence techniques into this application area. The novel concepts and systems developed for providing multimodal feedback and display to the HO focus on those human senses most significant for assembly task execution: vision, audition and haptics.
Finally, tele-assembly/disassembly based on telepresence techniques as presented above may suffer from problems of substantial time delays, e.g. > 5 ms, due to a long distance between the HO and the RE or unexpected network traffic. Research in telecommunication and networking is currently underway to resolve these issues, at least partially.
Improvements in visual and auditory feedback from the task environment are achieved by allowing the HO to be interactive in the VE or RE and by providing high-fidelity stereo feedback. For that purpose an interactive stereo vision and audition setup was developed. Application of this stereo system to a basic tele-assembly task led to increased task performance, i.e. shorter task execution times compared to the monocular-vision case. High-fidelity stereo vision requires vergence control for adapting to the current region of interest in the task environment.
ACKNOWLEDGMENTS

This work was supported in part by the German Research Foundation (DFG) within the Collaborative Research Centre SFB 453 on "High-Fidelity Telepresence and Teleaction" and by the Network Project "VOR" of the German Ministry of Education and Research (No. 01 IR A15 A). We are also grateful to various members of our laboratory, particularly the technicians, for their expert assistance in the design of the experimental setups.

REFERENCES
Haptic feedback to the HO comprises the display of a multitude of submodalities, e.g. kinesthetic and tactile. However, a holistic and widespread feedback approach often fails because of limitations of available actuator and display technology as well as high system complexity. The paper proposed more selective, task-oriented display approaches instead.
Baier, H., F. Freyberger and G. Schmidt (2001). Interaktives Stere
With respect to the human kinematic chain, touch and kinesthesia are coupled serially from a mechanical viewpoint. Corresponding serial display configurations seem to be rather reliable but often result in high costs and system complexity. Application of a robotic haptic display with an artificial shank for an orthopaedic training system is one remarkable approach in this context. However, a less complex approach is to combine tactile and kinesthetic devices in a parallel configuration, as presented in connection with the WFHD. This type of parallel display generates reliable haptic perceptions and takes advantage of the human sensory fusion ability. The usability
Kron, A., M. Buss and G. Schmidt (2000). Exploration and Manipulation of Virtual Environments Using a Combined Hand and Finger Force Feedback System. In: Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems. Tsukuba. pp. 1328-1333. Lederman, S.J. and R.L. Klatzky (2001). Designing Haptic and Multimodal Interfaces: A Cognitive Scientist's Perspective. In: Proc. of the Workshop on Advances in Interactive Multimodal Telepresence Systems. München. pp. 71-80. Lindeman, R.W. (1999). Bimanual Interaction, Passive-Haptic Feedback, 3D Widget Representation, and Simulated Surface Constraints for Interaction in Immersive Virtual Environments. PhD thesis. School of Engineering and Applied Science of The George Washington University. Massie, T.H. and J.K. Salisbury (1994). The PHANToM Haptic Interface: A Device for Probing Virtual Objects. In: Proc. of the ASME Winter Annual Meeting, Symposium on Haptic Interfaces for Virtual Environment and Teleoperator Systems. Chicago. pp. 295-301. NATIBO (1996). Collaborative Virtual Prototyping Sector Study: An Assessment of CVP Technology Integration, and Implementation. Technical report. North American Technology and Industrial Base Organization. Oliveira, J.C., S. Shirmohammadi and N.D. Georganas (2000). A Collaborative Virtual Environment for Industrial Training. In: Proc. of the IEEE Virtual Reality (VR'2000), ISBN: 0-7695-0478-7. pp. 288-289. Ortmaier, T., D. Reintsma, U. Seibold, U. Hagn and G. Hirzinger (2001). The DLR minimally invasive robotics surgery scenario. In: Proc. of the Workshop on Advances in Interactive Multimodal Telepresence Systems. München. pp. 135-147. Reinhart, G., O. Anton, M. Ehrenstrasser and B. Petzold (2001). Telepresent microassembly at a glance. In: Proc. of the Workshop on Advances in Interactive Multimodal Telepresence Systems. München. pp. 21-31. Riener, R., J. Hoogen, G. Schmidt, M. Buss and R. Burkart (2000).
Knee Joint Simulator Based on Haptic, Visual and Acoustic Feedback. In: Proc. of the 1st IFAC Conference on Mechatronic Systems. pp. 579-583. Riener, R., J. Quintern and G. Schmidt (1996). Biomechanical Model of the Human Knee Evaluated by Neuromuscular Stimulation. J. Biomechanics 29, 1157-1167. Walairacht, S., M. Ishii, Y. Koike and M. Sato (2001). Two-Handed Multi-Finger String-Based Haptic Interface Device. IEICE Transactions on Information and Systems 84(3), 365-373. Waldron, K.J. and K. Tollon (2003). Mechanical Characterization of the Immersion Corp. Haptic, Bimanual, Surgical Simulator Interface. In: Proc. of the 8th International Symposium on Experimental Robotics (ISER'02), Siciliano, B. and Dario, P. (Eds.), Springer Verlag, ISBN 3-540-00305-3. pp. 106-112.
(P. Deetjen and E. J. Speckmann, Eds.). pp. 43-55. Schwarzenberg, Oldenburg. Burdea, G.C. (1996). Force and Touch Feedback for Virtual Reality. John Wiley & Sons, New York. Buss, M. and G. Schmidt (1999). Advances in Control, Highlights of ECC'99, Chap. Control Problems in Multi-Modal Telepresence Systems, pp. 65-101. Springer Verlag, London Berlin Heidelberg. Carignan, C.R. and D.L. Akin (2003). Using Robots for Astronaut Training. IEEE Control Systems Magazine 23(2), 46-59. Carignan, C.R. and J.A. Smith (1994). Manipulator Impedance Accuracy in Position-Based Impedance Control Implementations. In: Proc. of the International Conference on Robotics and Automation. pp. 1216-1221. Cavusoglu, M.C., F. Tendick, W. Winthrop and S.S. Sastry (1999). A Laparoscopic Telesurgical Workstation. IEEE Trans. on Robotics and Automation 15(4), 728-739. Guiard, Y. (1987). Asymmetric Division of Labor in Human Skilled Bimanual Action: The Kinematic Chain as a Model. Journal of Motor Behavior 19(4), 486-517. Guiard, Y. (1988). The Kinematic Chain as a Model for Human Skilled Bimanual Action. In: Cognition and Action in Skilled Behaviour, Colley, A. and Beech, J., Eds., Elsevier Science Publishers B.V., North-Holland. pp. 205-228. Hinckley, K. (1996). Haptic Issues for Virtual Manipulation. PhD thesis. School of Engineering and Applied Science at the University of Virginia. Hoogen, J., R. Riener and G. Schmidt (2002). Control Aspects of a Robotic Haptic Display for Kinesthetic Knee Joint Simulation. Control Engineering Practice 10(11), 1301-1308. Immersion Corp. (2002). Technical specifications: Immersion's haptic workstation. http://www.immersion.com. Kammermeier, P., A. Kron and G. Schmidt (2001). Towards intuitive multi-fingered haptic exploration and manipulation. In: Proc. of the Workshop on Advances in Interactive Multimodal Telepresence Systems. München. pp. 57-70. Koh, G., T.E. von Wiegand, R.L. Garnett and N.I.
Durlach (1999). Use of Virtual Environments for Acquiring Configurational Knowledge about Specific Real-World Spaces. Presence 8, 632-656. Kron, A. and G. Schmidt (2001). Multi-fingered Haptic Interaction in a Virtual Prototyping Environment. Systems Science 27(4), 9-22. Kron, A. and G. Schmidt (2002). Multi-fingered Haptic Feedback from Virtual Environments by means of Midget Tactile Fingertip Modules. In: Proc. of the 8th Mechatronics Forum International Conference. University of Twente, Netherlands. pp. 1191-1200. Kron, A. and G. Schmidt (2003). Multi-fingered Tactile Feedback from Virtual and Remote Environments. In: Proc. of the 11th Symposium on Haptic Interfaces for Virtual Environment and Teleoperator Systems. Biltmore Hotel, Los Angeles, USA. pp. 16-23.