Robotics & Computer-Integrated Manufacturing, Vol. 5, No. 2/3, pp. 261-267, 1989
0736-5845/89 $3.00 + 0.00 Pergamon Press plc
Printed in Great Britain
• Paper
TRANSFER OF MOTOR SKILLS TO MACHINES
RAJKO TOMOVIĆ
Faculty of Electrical Engineering, Belgrade University, Belgrade, Yugoslavia
This paper deals with problems related to the capture of knowledge memorized in the form of skills, i.e. automatic and reflex biological motor actions. The broader goal is to develop skill based expert systems. Since methods for knowledge encoding at the cognitive level are not applicable in the case of skills, new approaches must be developed in order to design skill based expert systems. The principle of motion invariants and scene simplification is proposed as a guiding rule for the transfer of skills to machines. Applications of the above principles to the representation of skills related to manipulation, grasping and locomotion are presented. How the skill based AI system can improve the design of intelligent robots with dextrous multifingered hands is outlined.
INTRODUCTION
The term "skill" refers to sensory driven functional motions controlled at the execution level by reflexes and automatic actions acquired by learning and training. Control activities of this kind thus represent a form of knowledge which may be evoked without involvement of the voluntary level. Expert systems, based on the transfer of human knowledge to a machine at the conscious level, are in wide use today. Conversely, machine control by skill based AI systems is still in the early development stage. The scope of skill oriented AI research comprises different areas. The basic research explores the mechanisms of skill acquisition in general in order to be able to reproduce such processes in machines. Cognitive psychology pursues such research, also using AI methods. 1 Although the mechanisms by which verbal knowledge is transformed into functional motions are still far from being well understood, skill based AI systems may be designed by heuristic approaches. For that purpose, it is necessary to describe human skill activities in machine representable form. If this can be done, the design of the corresponding expert system becomes feasible. This last issue is the concern of this paper. The task to be solved reads as follows. Given a class of human motor skills, design a skill based AI controller which will make a machine perform in the same way as man. The parallel between man and machine must not be interpreted literally. For instance, the biomechanical and the mechanical structure are not required to be a replica of each other nor use the same number of degrees of freedom. What matters is skill reproduction by the machine. The AI methodology by which the above task can be solved consists of the following elements: (a) A procedure to capture, i.e. identify, human skills stored in the form of automatic and reflex actions. (b) Representation of automatic and reflex actions in the machine. (c) Design of an expert system controlling the machine reproduction of skills. The design of a skill based expert system for the above defined task does not require any special AI tools. Available expert system design methods are applicable here as well. Consequently, the main concern of this paper is to present methods for the identification of reflexes and automatic reactions involved in the execution of human skills and to describe ways in which they can be represented in the machine.
CAPTURING SKILL TYPE KNOWLEDGE
The methods used to identify human expertise involving reasoning are of no use at the automatic level. In the former case, the identification of human expertise is done via language communication. At the skill level, such a tool is evidently of little help.
By definition, automatic features of motor skills are expressed as space-time events. Consequently, the capture of such knowledge requires new identification methods which rely to a large extent on external manifestations of neuromotor control mechanisms. A general method of capturing skill expertise so that it may be encoded in a machine is nonexistent. Available methods are valid for specific classes of motor skills. Although they apply to limited domains, general guiding principles underlying this research can be formulated. The possibility of encoding skill type activities is due to an interesting phenomenon observed in the execution of functional motions. Let us consider, for instance, bipedal locomotion. Even by simple observation, it is easy to conclude that each subject walks in his own way if the comparison is made at the analog level. On the other hand, it has also been established that certain automatic reactions are common to all creatures using bipedal locomotion. The same applies to target reaching processes by the upper extremities, e.g. grasping, writing, sports, etc. The fact that common automatic activities are found among the large variety of individual ways in which a given functional task is performed may serve as a starting point for the development of skill based AI systems. The basic idea is to identify invariant features of a given class of functional motions performed in identical circumstances by groups of individuals. Skill invariants thus obtained represent the relevant data on which an AI system is built. In many instances, as in locomotion and manipulation, motion invariants have already been identified for rehabilitation and other purposes. However, the promotion of skill oriented AI systems will require further multidisciplinary studies involving computer science and life sciences.
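The idea of separating invariant features from individual variation can be illustrated with a minimal Python sketch. It is not a method taken from the gait literature: the event labels, the timings, and the rule "keep only events seen in every subject and require the same order" are all assumptions chosen for illustration.

```python
from typing import Dict, List, Optional, Tuple

# One recording = list of (time in seconds, event label) for a single gait cycle.
# Labels and timings below are hypothetical.
Recording = List[Tuple[float, str]]


def event_order(recording: Recording) -> Tuple[str, ...]:
    """Drop the analog timing and keep only the order of discrete events."""
    return tuple(label for _, label in sorted(recording))


def shared_event_order(recordings: Dict[str, Recording]) -> Optional[Tuple[str, ...]]:
    """Return the event sequence common to all subjects, or None.

    Events missing for some subject are treated as individual variation and
    discarded; the remaining events must occur in the same order for every
    subject to count as an invariant.
    """
    orders = [event_order(r) for r in recordings.values()]
    if not orders:
        return None
    common = set(orders[0]).intersection(*(set(o) for o in orders[1:]))
    filtered = [tuple(e for e in order if e in common) for order in orders]
    first = filtered[0]
    return first if all(f == first for f in filtered) else None


# Made-up data for two subjects walking on level ground.
RECORDINGS: Dict[str, Recording] = {
    "subject_a": [(0.00, "heel_contact"), (0.12, "foot_flat"),
                  (0.55, "heel_off"), (0.62, "toe_off")],
    "subject_b": [(0.00, "heel_contact"), (0.15, "foot_flat"),
                  (0.33, "trunk_sway"), (0.60, "heel_off"), (0.71, "toe_off")],
}

if __name__ == "__main__":
    print(shared_event_order(RECORDINGS))
```

The two subjects differ in timing and in one extra event, yet the ordered sequence of foot contact events is the same for both; in this toy setting that shared sequence plays the role of the motion invariant.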
In order to get a better insight into the meaning of invariant feature analysis for skill capture, two classes of functional motions will be used as examples. It will be shown how industrial and medical robots can be designed in this way.
NONNUMERICAL CONTROL OF LOCOMOTION
Rule based locomotion control has been applied with considerable success in rehabilitation engineering. 6,7 However, this aspect of research is of no concern here. In the present context, the main goal is to outline a methodology by which locomotion machines may be controlled by skill based AI systems.
Due to its repetitive nature, locomotion is the ideal case to start with. To be more specific, bipedal locomotion on level ground is taken as an illustration of the principle of motion invariants used to capture skill type expertise. As a matter of fact, gait analysis studies provide all that is needed to encode motion invariants in the form of production rules. A standardized representation of gait invariants encountered in all bipedal locomotion is given in Ref. 3. The next step in the design of rule based control for this class of functional motions requires that pattern matching invariants are encoded in a form suitable for machine representation. This can be done by taking into account the fact that motion invariants in this case are actually singular events embedded within analog processes. The term singular event has the following meaning in this context: (a) Sensory data, proprioceptive and exteroceptive, undergo sharp, discrete changes at instants of singular events. (b) At instants of singular events, the joints change states in an automatic, fixed order. In terms of joint states, four are possible: free, locked, flexion, extension. Having in mind the above facts, it is easy to establish that in walking on level ground, the following sensors are exposed to discrete (0, 1) changes: heel contact, middle foot contact, toe contact. A binary variable can also be associated with the presence or absence of flexion and extension terminal angles of leg joints. A configuration of discrete joint states (hip, knee, ankle) corresponds to each of these sensory patterns. Formal representation of singular events taking place in biped locomotion on level ground is now self-explanatory. The relevant rules can be expressed in the following general form
$$B(x_1, x_2, \ldots, x_n) \rightarrow F(J_1(k), J_2(k), \ldots, J_m(k)) \qquad (1)$$
where $B(\cdot)$ is a Boolean function, $x_i$ are binary sensory inputs, and $F(\cdot)$ is a combination of discrete joint states with $k = 1, 2, 3, 4$. The pattern matching operator in Eq. (1) reflects the learnt or genetically determined invariant behavior in the execution of the given class of functional motions. Thus, the sensory driven process represents skill type knowledge in a form suitable for machine representation. Monosynaptic, nonmodulated reflexes involved in motor actions, like the patellar reflex and evasion reflex, are also pattern matching phenomena. Therefore, the rule based control of machines relying on Eq. (1) has been given the name robot control by artificial reflexes. 8
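A rule of the form of Eq. (1) can be sketched in Python as a Boolean condition over binary foot sensors paired with a configuration of discrete joint states. The four joint states and the three contact sensors come from the text; the rule names and the particular sensor-to-joint assignments below are illustrative placeholders only and are not taken from gait analysis data.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List, Optional


class JointState(Enum):
    """The four discrete joint states, k = 1..4."""
    FREE = 1
    LOCKED = 2
    FLEXION = 3
    EXTENSION = 4


Sensors = Dict[str, bool]            # binary inputs x_i
JointConfig = Dict[str, JointState]  # J_1(k), ..., J_m(k)


@dataclass
class ReflexRule:
    name: str
    condition: Callable[[Sensors], bool]  # Boolean function B(x_1, ..., x_n)
    action: JointConfig                   # joint state combination F(...)


# Hypothetical rule set; the joint assignments are placeholders.
RULES: List[ReflexRule] = [
    ReflexRule("heel_strike", lambda s: s["heel"] and not s["toe"],
               {"hip": JointState.FLEXION, "knee": JointState.LOCKED,
                "ankle": JointState.FREE}),
    ReflexRule("foot_flat", lambda s: s["heel"] and s["mid"] and s["toe"],
               {"hip": JointState.EXTENSION, "knee": JointState.LOCKED,
                "ankle": JointState.LOCKED}),
    ReflexRule("push_off", lambda s: not s["heel"] and s["toe"],
               {"hip": JointState.EXTENSION, "knee": JointState.FLEXION,
                "ankle": JointState.EXTENSION}),
    ReflexRule("swing", lambda s: not (s["heel"] or s["mid"] or s["toe"]),
               {"hip": JointState.FLEXION, "knee": JointState.FLEXION,
                "ankle": JointState.FREE}),
]


def match(rules: List[ReflexRule], sensors: Sensors) -> Optional[ReflexRule]:
    """Return the first rule whose Boolean condition matches the sensor pattern."""
    for rule in rules:
        if rule.condition(sensors):
            return rule
    return None


if __name__ == "__main__":
    fired = match(RULES, {"heel": False, "mid": False, "toe": True})
    print(fired.name if fired else "no reflex")
```

The sketch evaluates the rules sequentially for simplicity; a hardware or table-lookup realization could evaluate all Boolean conditions at once.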
The proposed methodology was extended to other bipedal gait modes (stair climbing, walking on a ramp, adaptive step length, etc.) including rules reflecting human behavior in the case of an abnormal evolution of the desired functional motion. In this way, a general knowledge base was designed capable of easy extension to all gait modes, including emergency reactions. 5 As is known, AI approaches have an important feedback value. By transferring human expertise to the computer, AI systems extend machine performance in new directions. On the other hand, in the process of encoding human expertise, much can be learnt about the nature of cognitive processes. The same statement applies to skill based AI systems. Following this line of thinking one is, evidently, tempted to explore what kind of new information may be derived from the AI approach to reflexes. The question is challenging but, at this early stage of the development of skill based AI systems, a comprehensive answer cannot be produced. However, some hints based on the limited experience so far gathered may be offered:
1. Abrupt discrete changes of sensory inputs are responsible for simple reflex reactions. This ensures high reliability of such actions against "false" inputs.
2. Certain proprioceptive and exteroceptive sensory patterns can be identified by simultaneous appearance of discrete inputs. Thus, the pattern recognition process means matching the stored Boolean expression to the external binary signal combination.
3. The direct matching of patterns to motor actions represents the fastest control method. In terms of mathematics, it requires input and output sets without any internal organization and constraints such as semigroups, groups, vector spaces, etc. In addition to speed of response, the control governed by Eq. (1) is independent of the number of inputs since the matching of Boolean expressions is parallel.
This is a remarkable feature of reflex type control. Naturally, the rigidity of this kind of nonarticulated control limits its field of application. This may be overcome by introducing higher level interventions and adaptive control. It should be also kept in mind that skill based AI systems are not limited just to rule based control. Very interesting and instructive learning problems can be conceived in this area of AI research as well. Assuming that rule based forward walking control is available, one can formulate the seemingly simple but highly challenging learning problem: what constraints and rule modifications must be introduced in
order to generate a knowledge base for rearward walking. Hopefully, the study of learning problems at the skill level will produce more insight into the evolution of learning processes starting from the elementary pattern matching level.
SKILL BASED MANIPULATOR CONTROL
Skill based manipulator control in this context will be limited to target reaching and grasping tasks in free space using multifingered smart hands. The shape of the target to be grasped is arbitrary. The application of the principle of motion invariants for skill identification is relatively straightforward in the case of bipedal gait. Reaching and grasping an object in a stable way is, however, an incomparably more complex task. The initial hand-target position may vary in the three-dimensional working space in an infinite number of ways. The same holds for the target shape. In other words, the main feature of upper extremity functional motions is non-repetitiveness, in contrast to the repetitiveness of locomotion. Consequently, the basic issue to be explored in the reflex control of manipulation is whether automatic control can be applied to phenomena which are nonrepetitive. If the answer is positive, as is evident from human experience, the next step is to explain by what mechanisms wide variations in manipulation tasks can be reduced to invariant elements and, thus, to automatic and reflex actions. It will be shown how skill based AI research can throw interesting light on the ways by which automatism is introduced in the execution of highly diversified functional motions. The methodology by which the skills involved in manipulation may be identified for AI purposes relies on the following postulates:
1. Decomposition. According to the decomposition principle, a grasping action consists of two phases. In the first phase, taking place in the target approach process, the preparation for stable grasp is made on the basis of vision. In the second phase, the actual grasping action occurs using tactile sensory information as well. The essential elements for a stable grasp, such as hand preshaping, hand opening, grasp depth, alignment, are prepared while approaching the target.
2. Reduction to geometric primitives.
Decomposition of the manipulation task into preliminary and implementing phases means that preparation for stable grasp is made without taking into account the details of the target contour. The hand preshaping and the hand preopening are adjusted according to the smoothed, enlarged target contour. In order to make this statement operational, a reasonable
assumption is to replace the actual target contour by a circumscribed regular geometric body for the sake of preshaping and preopening of the hand. This is what we call the reduction to geometric primitives and scene simplification. In Fig. 1 an illustration of this postulate is given for an egg-shaped object. In Fig. 2, the phases of the preshaping process in man for a ball type primitive are shown.
Fig. 1. Example of the geometric primitive for the egg-shaped target.
The implications of the reduction postulate to geometric primitives are very important. In this way, one can handle an infinite number of target shapes contained within a primitive geometric form, using just one form of hand preshaping. The details of the target contour are taken care of in the actual grasping process by local reflex control as discussed later. The number of hand preshaping forms needed is thus reduced to the number of geometric primitives (parallelepiped, cylinder, pyramid, sphere, etc.), which is quite small. In practice, this number is less than ten, bearing in mind that industrial manipulators, unlike human hands, are designed for given classes of objects. The above approach makes it feasible to design a knowledge base for the control of smart multifingered hands. It suffices to classify human hand preshaping forms when approaching geometric primitives and to store them in a knowledge base. A 3-D vision system is needed to identify the target contour. The important part of the picture processing operation deals with the replacement of the actual target contour by the geometric primitive.
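The reduction step can be sketched in Python under strong simplifying assumptions. The sketch takes a set of 3-D contour points (as a hypothetical vision system might deliver), circumscribes an axis-aligned box, and picks a primitive from the ratio of the box extents. The classification heuristic, the tolerance, the preshape labels, and the opening margin are all assumptions for illustration; they do not describe the picture processing actually used.

```python
import math
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float, float]

# Hypothetical knowledge base: one stored preshaping form per primitive.
PRESHAPES: Dict[str, str] = {
    "sphere": "spherical preshape",
    "cylinder": "cylindrical wrap preshape",
    "parallelepiped": "flat opposed-finger preshape",
}


def bounding_extents(points: Sequence[Point]) -> Tuple[float, float, float]:
    """Extents of the circumscribed axis-aligned box (details of the contour are dropped)."""
    xs, ys, zs = zip(*points)
    return (max(xs) - min(xs), max(ys) - min(ys), max(zs) - min(zs))


def classify_primitive(extents: Tuple[float, float, float], tol: float = 0.25) -> str:
    """Crude heuristic: compare box extents to choose a primitive (assumption)."""
    a, b, c = sorted(extents)

    def similar(u: float, v: float) -> bool:
        return abs(u - v) <= tol * max(u, v)

    if similar(a, c):                     # all three extents alike
        return "sphere"
    if similar(a, b) or similar(b, c):    # two extents alike, one different
        return "cylinder"
    return "parallelepiped"


def prepare_hand(points: Sequence[Point], margin: float = 0.2) -> Tuple[str, str, float]:
    """Return (primitive, preshape, preopening) for the enlarged, smoothed target."""
    extents = bounding_extents(points)
    primitive = classify_primitive(extents)
    # Assumption: the hand opens across the narrowest dimension, enlarged by a margin.
    opening = (1.0 + margin) * min(extents)
    return primitive, PRESHAPES[primitive], opening


# Extreme points of a made-up egg-like body with semi-axes 2, 2 and 3.
EGG: Sequence[Point] = [(2, 0, 0), (-2, 0, 0), (0, 2, 0), (0, -2, 0), (0, 0, 3), (0, 0, -3)]

if __name__ == "__main__":
    print(prepare_hand(EGG))
```

Any target whose contour fits inside the same circumscribed body maps to the same stored preshape, which is the point of the postulate: the knowledge base grows with the number of primitives, not with the number of target shapes.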
Fig. 2. Hand preshaping for grasping.
The reduction of external contour details to geometric primitives in conjunction with artificial reflex control may be indicative of the role of smoothing and abstraction processes in the neural system. As is known, our perception of emergency situations (car accidents, etc.) is always deprived of details. Skill activities in sports are another example of this phenomenon.
3. Reduction to grasp primitives. Taken as an analog device, the multifingered hand with articulated fingers represents, in fact, a continuously deformable structure. The control goal in this case is unique. Namely, the hand surface is deformed so as to fit the target shape as closely as possible. As indicated above, certain reduction mechanisms must be applied in the preparation for stable grasp in order to be able to use experience and automatic actions. Another similar mechanism, used in the preparation for a stable grasp, is related to the selection of grasp modes or grasp primitives, as they are called here. According to the task description, target texture, size, etc., the desired initial hand structure is chosen. Namely, the total number of degrees of freedom of the hand need not always be used. On the basis of accumulated experience, it is possible to decide a priori whether a 2-, 3-, 4- or 5-fingered grasp is needed. In this way, by varying the hand structure, the number of controllable degrees of freedom can be significantly reduced. The a priori selection of grasp primitives such as fist, pinch, power grip, lateral grasp, etc. also greatly eases the burden of the final tactile control phase. It is surprising that the number of grasp primitives is quite small, having in mind the richness of grasping tasks met in practice. Extensive studies in biomechanics and rehabilitation engineering have shown that the number of grasp primitives capable of handling most situations in everyday life is less than ten.
In many industrial applications, this number is even smaller. This type of invariant behavior in man related to grasping can also be represented in the corresponding segment of the knowledge base. 2
4. Alignment. Task requirements determine the selection of the grasping center and the grasping zone on the target surface. On the other hand, it was pointed out that the actual irregular target contour can be approximated by smooth, tangential surfaces in the preparation of the hand for the stable grasp. Due to such image processing, the optimal hand-target orientation can be reduced to a routine. This can be best explained by using a generic grasping task. In Fig. 3, a parallelepiped is used as
the geometric primitive. In the approach phase the hand will be preshaped in such a way as to reproduce the geometric primitive of the target. The matching of the hand to target shapes is not a sufficient condition to assure a stable grasp. Stability requirements are, evidently, satisfied only if the axes of the target geometric primitive and the hand are aligned, or, if this cannot be done, the corresponding angle differences are minimized. The correctness of this statement can be best verified by considering the contrary case, Fig. 4. In a nonconstrained approach, the nonaligned hand-target position will
Fig. 3. Correctly aligned hand-target position.
Fig. 4. Incorrectly aligned hand-target planes in the preparation for grasping.
never be used. Consequently, in all tasks where the grasping zone offers a single stable position, the hand-target alignment is unique. The preference for axis alignment in the grasp which, in many instances, leads to deterministic solutions, may be interpreted as the outcome of a heuristic optimization process governed by a criterion function of the following nature. The grasp which, within given constraints (grasp mode, grasp depth, etc.), maximizes the hand-target contact surface and results in even pressure distribution on the hand is preferred. The alignment solution is the consequence of this optimization criterion and vice versa. Incidentally, the alignment operation is another example of the scene simplification mechanism by which all data irrelevant to the actual control task are filtered out.
5. Shape and grasping force adaptation. In the preceding section, it was pointed out that the preparation for a stable grasp is done by vision. In the next phase, when physical contact between the hand and the target is established, proprioceptive and exteroceptive sensory feedback becomes dominant. These sensors feed back information to hand controllers to carry out the grasping operation until the detailed shape matching process is accomplished. How the shape and the grasping force adaptation can be made automatic in the case of multifingered hands has been shown in earlier papers. 9
The sole goal of this paper was to explore methods by which human, or animal, skills may be captured for machine representation. Indications of the lines along which this can be done have been outlined. Following the proposed methodology it is possible to proceed to the next step of computerization and develop skill based expert systems for robot control. 10 The structural diagram of such an intelligent robot system, which is being developed, can be seen in Fig. 5.
The system consists of the following elements:
• 3-D computer vision for target identification and reduction to geometric primitives,
• skill based expert system whose inputs are task description and computer vision. The expert system generates as output the hand preshaping form, grasping mode, initial hand opening, initial grasping force, grasping depth, alignment control and target approach trajectory,
• commercial industrial manipulator,
• dextrous multifingered hand with the local controller. 4
The supporting information for the control of the
Fig. 5. Structural diagram of skill controlled robotic system. [Diagram blocks: target, computer vision, geometric primitive, size parameters, task description, expert system, approach trajectory, finger positioning information, pressure sensor, slip sensor, finger positioning vector, multifingered hand.]
above robotic system is obtained by the transfer of motor skills from man to machines.
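The input-output role of the skill based expert system in this structure can be sketched in Python as follows. The output fields follow the list given above (preshaping form, grasping mode, initial hand opening, initial grasping force, alignment); everything else is assumed for illustration, including the class and function names, the selection thresholds, the numerical force values, and the use of the angle between hand and target axes as the alignment measure. Grasp depth and approach trajectory are omitted.

```python
import math
from dataclasses import dataclass
from typing import Dict, Tuple

Vec3 = Tuple[float, float, float]

# Hypothetical preshape table, one entry per geometric primitive.
PRESHAPES: Dict[str, str] = {
    "sphere": "spherical preshape",
    "cylinder": "cylindrical wrap preshape",
    "parallelepiped": "flat opposed-finger preshape",
}


@dataclass
class TaskDescription:
    primitive: str      # geometric primitive reported by vision
    size_m: float       # characteristic target dimension in metres
    fragile: bool       # taken from the task description
    target_axis: Vec3   # main axis of the geometric primitive
    hand_axis: Vec3     # current main axis of the hand


@dataclass
class GraspPlan:
    preshape: str
    grasp_mode: str
    hand_opening_m: float
    initial_force_n: float
    alignment_error_deg: float


def axis_misalignment_deg(u: Vec3, v: Vec3) -> float:
    """Angle between two axes in degrees, folded to [0, 90] since an axis has no sign."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0.0:
        raise ValueError("axis vectors must be nonzero")
    return math.degrees(math.acos(min(1.0, abs(dot) / norm)))


def plan_grasp(task: TaskDescription) -> GraspPlan:
    """Map vision and task inputs to grasp preparation outputs (illustrative rules)."""
    if task.size_m < 0.03:               # assumed threshold for small targets
        mode = "pinch"
    elif task.primitive == "cylinder":
        mode = "power grip"
    else:
        mode = "lateral grasp"
    return GraspPlan(
        preshape=PRESHAPES.get(task.primitive, "generic open hand"),
        grasp_mode=mode,
        hand_opening_m=1.2 * task.size_m,                  # assumed 20% enlargement
        initial_force_n=2.0 if task.fragile else 10.0,     # made-up force values
        alignment_error_deg=axis_misalignment_deg(task.hand_axis, task.target_axis),
    )


if __name__ == "__main__":
    task = TaskDescription("cylinder", 0.08, False, (0.0, 0.0, 1.0), (1.0, 0.0, 0.0))
    print(plan_grasp(task))
```

In such a sketch the alignment error would be handed to the manipulator controller to be driven toward zero during the approach, while the remaining fields initialize the hand before contact; tactile adaptation after contact is left to the local hand controller.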
CONCLUSION
Due to the very nature of skill based AI systems, the capture of knowledge expressed in the form of automatic functional motions presents a special problem. A comprehensive methodology of skill transfer to machines is still lacking. Further research in this direction will require extensive multidisciplinary cooperation in which experts in biomechanics, neurophysiology, motor functions, biology, evolutionary psychology, etc., must be involved. The study of skill based AI systems is important from both theoretical and applied points of view. The multilevel control of intelligent industrial robots, involving the skill based level, is certainly an important applied research field. The theoretical potential of this AI approach is equally important. It may turn out that by analyzing the role of automatic mechanisms in the execution of functional motions by AI methods, we can get a better insight into the operation of neural networks and the evolution of learning processes.
REFERENCES
1. Anderson, J.R.: Knowledge compilation: the general learning mechanism. In Machine Learning, An Artificial Intelligence Approach, Vol. II. Morgan Kaufmann, Los Altos, 1986.
2. Huan, L., Bekey, G.: A generic grasp mode selection paradigm for robot hands on the basis of object geometric primitives. University of Southern California, Computer Science Report, Los Angeles, 1987.
3. Norkin, C., Levangie, P.: Joint Structure and Function: A Comprehensive Analysis. F.A. Davis, Philadelphia, 1983.
4. Rakić, M.: Multifingered robot hand with selfadaptability. Robotics Comput.-Integ. Mfg 5: 269-276, 1989.
5. Tepavac, D.: Design of expert system for locomotion control. M.S. Thesis, Faculty of Electrical Engineering, Belgrade, 1987.
6. Tomović, R., et al.: Active modular unit for lower limb assistive devices. Advances in External Control of Human Extremities, pp. 1-12. Yugoslav Committee for Electronics, Belgrade, 1981.
7. Tomović, R.: Control of assistive systems for motor deficiencies. In Control Aspects of Biomedical Engineering: Trends and Prospectives, Nalecz, M., ed. Pergamon Press, London, 1987.
8. Tomović, R., Bekey, G.: Robot control by reflex actions. Proc. IEEE Conference on Robotics and Automation, Vol. 1, pp. (1)240-(1)248. IEEE Computer Society Press, 1986.
9. Tomović, R., Stojiljković, Z.: Multifunctional terminal device with adaptive grasping force. Automatica 11: 567-571, 1975.
10. Tomović, R., Bekey, G., Karplus, W.J.: A strategy for grasp synthesis with multifingered robot hands. Proc. IEEE Conference on Robotics and Automation, Vol. 1, pp. 83-89. IEEE Computer Society Press, 1987.