Robotics & Computer-Integrated Manufacturing, Vol. 12, No. 1, pp. 55-64, 1996

APPLICATION OF DISCRETE LEARNING CONTROL TO A ROBOTIC MANIPULATOR

A. N. POO, K. B. LIM and Y. X. MA
Department of Mechanical and Production Engineering, National University of Singapore, 10 Kent Ridge Crescent, Singapore 0511

An effective iterative learning control law is developed in the discrete time domain for improving the trajectory-tracking performance of robotic manipulators repeating the same task from cycle to cycle. With this method, information on the current cycle is intentionally introduced into the learning law so that the convergence rate can be further improved. An analysis of convergence, formulated completely in discrete time, is given. Experimental results based on an implementation on a commercial robot, which demonstrate the effectiveness of the learning control method, are also presented and discussed.

1. INTRODUCTION

The highly demanding control problem of driving an industrial robot to precisely follow a given trajectory in an unconstrained or constrained environment has led to the application of sophisticated control techniques. Learning control offers a promising solution to this problem, as it generally relies on less calculation and requires less a priori knowledge about the system dynamics than the more sophisticated adaptive controllers. An effective human learner is marked by progressively improving performance, and a robot executing repetitive work patterns can employ learning control algorithms with similar advantages. The application of classical or modern control theory to repetitive-motion robots over the finite cyclical time duration [0, T] often leads to unsatisfactory performance, especially when the motion is required to be fast. This is because it takes a finite time in each cycle for the system response to approach the desired one; that is, good performance is guaranteed only after time t is greater than some value t_p (Fig. 1a). The industrial robot is usually called upon to execute repetitive operations. In recognition of this distinct feature, it appears attractive to exploit the cyclic behaviour of the robot and to control its motion at t = t_j (t_j < T) in each cycle by taking into account information earlier in the same cycle as well as information gathered in earlier cycles, whether at instants prior or subsequent to the corresponding instant t_j within the period T. A learning scheme for an industrial robot can be compared with an adaptive control scheme. Both adaptive control and learning control are introduced to handle unknown parameters and uncertainty. Adaptive robot control is usually concerned with extracting information about the dynamic behaviour of the robotic system, explicitly or otherwise, and then making use of this to construct the adaptation law.

Fig. 1. Difference between adaptation and learning: (a) adaptation; (b) learning.

It is therefore general in application and does not specifically take advantage of the nature of the robot operation. One weakness of adaptive control is that it cannot perform well when the actual trajectory of a robot manipulator needs to be tightly controlled all the time, because it takes a finite time to adapt. In learning control, the full robot dynamics is seldom considered; usually the measured outputs of the plant, in the previous cycles of operation and/or in the current cycle, are already sufficient for generating the control signal to decrease the error between the desired and the actual motion. Learning control is particularly applicable when the same operation is required from cycle to cycle. When repetitive operations are involved, even without using full knowledge of the robot dynamics, the learning law can eventually lead to better overall tracking performance than conventional adaptive control [2]. The difference between conventional adaptation and learning can be appreciated from Figs 1a and 1b. With conventional adaptive control there is no learning, and the performance of subsequent cycles of operation does not improve over the first. On the other hand, for learning control, even though the performance of the first cycle of operation may not be good, continuous improvement is achieved with each cycle of operation until eventually the overall trajectory-tracking performance is better than that of adaptive control.

2. REVIEW OF LEARNING CONTROL FOR MANIPULATORS

Uchiyama [19] was perhaps the first to propose the application of the methodology of repetition to robot control. This has been elaborated into a more formal theory of iterative learning by Arimoto and his associates [2]. Their idea is to apply a simple algorithm repetitively to a robot until practically perfect tracking is achieved. The continuous-time iterative algorithm takes on the familiar PID structure, and the control is based on the difference between the measured variables in consecutive cycles. Arimoto, Kawamura and their associates have also published other papers [3-5] extending the understanding of learning control as applied to manipulators. Although Arimoto et al. stipulated the condition that the learning gain matrices must satisfy to ensure convergence, no clear guidelines for choosing these matrices were given. In fact, without realistic means to determine these matrices, satisfactory operation over arbitrary working trajectories cannot be taken for granted, especially given the highly coupled, time-varying and nonlinear properties of a robot system. Togai and Yamano [17] described a robot system by a set of linear time-varying discrete state equations and proposed a discrete learning control algorithm.

They applied system optimization to learning control and identified the learning gains in several ways, but did not give a rigorous proof of convergence. Craig [10] combined the basic learning concept with the computed-torque method to achieve a continuous-time learning controller aimed at learning the frictional effects acting at the joints. As the computed-torque method requires models of the relatively complicated Coriolis/centrifugal terms and gravity terms of the actual manipulator, the advantage of the basic learning strategy was apparently not utilized to its full extent. Oh et al. [15] proposed an iterative learning control method for a class of linear systems, in which the control sequence is synthesized by adding a compensation signal, generated from the estimated inverse model, to the current control. However, the method did not handle the nonlinearities present in the manipulator dynamics. The model-based robot learning method of An et al. [1] made use of a full dynamic model of the robot in learning control. It would seem that the capacity of the basic learning mechanism for dealing with uncertainties was even less exploited here than in the computed-torque case of Craig [10]. Gu and Loh [12] proposed a learning control scheme which relies on the data obtained from a few adjacent sampling instants in the previous cycle to determine the updated input torque for the subsequent sampling interval in the current cycle. As no proof of convergence was given, there would be difficulty in choosing the values of the interpolation coefficients. Bien and Huh [7] synthesized the updated torque u_{k+1}(t) in the current cycle from multiple past-cycle trajectory error and control data pairs (e_k, u_k), (e_{k-1}, u_{k-1}), ..., (e_{k-N}, u_{k-N}), not simply from the one data pair (e_k, u_k), in order to enhance the convergence performance of the learning control algorithm. Although a rigorous proof of convergence conditions was presented, it is questionable whether there will exist weighting matrices P_i and Q_i (applied to the vectors u_{k-i+1} and e_{k-i+1}, respectively) satisfying the convergence conditions for arbitrary trajectories, especially when the number of past-history data pairs is large. It is true that the more past information is exploited, the more flexible the learning process may become. However, more past information does not necessarily mean better performance. Rather, the performance of learning depends not only on how much information is used but also on whether the information is crucial and how effectively the overall information is fused together. There is increasing concern about choosing suitable gain factors or matrices in conventional learning controllers, particularly with reference to convergence. Bondi et al. [8] derived an iterative method in the spirit of model-reference adaptive control. In the proposed scheme, the position error, velocity error and acceleration error were used for modifying the control input. They showed the existence of feedback


gains which would ensure trajectory convergence. Moore et al. [14] combined an adaptive method with the scheme proposed by Bondi et al. [8] to develop a learning control scheme in which the Euclidean norm computed at each instant of time in the interval [0, T] in the previous cycle is used to change the gain at the corresponding instant of time during the current work cycle. The authors stated that the strategy enables the system to adaptively choose the gains so as to guarantee convergence. In contrast to the previous method, it does not rely on the acceleration error to generate the control. Based on a linear model, Pourboghrat [16] suggested a scheme combining model-reference adaptive control with a learning strategy. The correction term in the continuous-time learning controller is updated according to the output of the adjoint system corresponding to the robotic system, as well as the error between the output of the reference model and the actual output. It is noted that the existing learning control algorithms have mostly been developed and analysed in the continuous time domain. From the implementation point of view, however, it is more reasonable to formulate an iterative learning control scheme fully in the discrete time domain, because the storage of past data in digital memory is definitely required for any practical learning control implementation. It is well known that many control algorithms in continuous time remain applicable in discrete time provided that the sampling theorem is satisfied. Conceptually, there certainly exist restrictions, such as the sampling theorem, on the choice of the sampling period in a learning control system if the scheme is to remain effective and convergence is to be guaranteed in discrete time. Unfortunately, these issues are seldom discussed in the literature. With a view to improving the performance of trajectory-tracking control, a discrete formulation scheme has been presented by the authors [18] on a Cartesian basis. The following sections describe the development and analysis of a joint-based learning controller in a discrete time formulation. Implementation details and experimental results are presented.

3. DEVELOPMENT AND JUSTIFICATION OF THE ALGORITHM IN DISCRETE TIME

Consider a nonredundant manipulator with an n × 1 joint-variable vector θ(t). The equation of motion of the n-joint rigid-body system is given in joint space as

    M(\theta)\ddot{\theta} + V(\theta, \dot{\theta}) + G(\theta) + H(\dot{\theta}) = \tau    (1)

where θ is the n × 1 vector of generalized joint angular positions, M(θ) is the symmetric n × n inertia matrix, the n-dimensional vector V(θ, θ̇) represents the Coriolis and centrifugal terms, the vector G(θ) signifies the gravity effects, H(θ̇) is the n × 1 frictional torque vector, and τ = [τ_1, ..., τ_n]^T specifies the generalized torques acting on the joint shafts. By defining a collective nonlinear function

    \omega(t, \theta, \dot{\theta}) = -M^{-1}(\theta)\,[V(\theta, \dot{\theta}) + G(\theta) + H(\dot{\theta})]    (2)

we obtain a more concise dynamic formulation of the manipulator system, choosing y = θ̇ as the system output:

    \dot{y}(t) = \ddot{\theta}(t) = \omega(t, \theta, \dot{\theta}) + M^{-1}(t)\,\tau(t) = \omega(t, \theta, y) + M^{-1}(t)\,\tau(t).    (3)

Mathematically, a reasonable objective of the discrete learning strategy in robot motion control is to provide the control torque τ_{j+1} at the (j+1)th work cycle such that, as j increases,

    \|\theta_{j+1}(\cdot) - \theta_d(\cdot)\|_\infty \to 0    (4)

where θ_d is the desired position trajectory, θ_{j+1} is the measured trajectory at the (j+1)th cycle, and ||·||_∞ corresponds to the norm defined in the discrete frame as

    \|\theta_{j+1}(\cdot) - \theta_d(\cdot)\|_\infty = \max\{|\theta_{j+1,i}(k) - \theta_{d,i}(k)|;\ i = 1, ..., n;\ k = 1, ..., N\}    (5)

where n denotes the number of degrees of freedom of the manipulator, and N = T/Δt is the number of sampling instants in one cycle duration.
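In code, the norm in Eq. (5) is simply the largest absolute joint-position error over all joints and all sampling instants of a cycle. A minimal sketch (ours, not from the paper), assuming the trajectories are stored as N × n arrays:

```python
import numpy as np

def max_error_norm(theta, theta_d):
    """Eq. (5): max |theta_i(k) - theta_d,i(k)| over n joints and N instants.

    theta, theta_d : arrays of shape (N, n) holding one cycle of joint positions.
    """
    return float(np.max(np.abs(theta - theta_d)))
```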

It is known that

    \theta(t) = \theta(0) + \int_0^t \dot{\theta}(\sigma)\, d\sigma    (6)

and hence we have

    \theta_{j+1}(t) - \theta_d(t) = \theta_{j+1}(0) - \theta_d(0) + \int_0^t [\dot{\theta}_{j+1}(\sigma) - \dot{\theta}_d(\sigma)]\, d\sigma.    (7)

Provided that the manipulator takes up the same position θ_d(0) at every cycle of the operation, i.e. θ_{j+1}(0) = θ_d(0) for all j, the objective (4) can be achieved if

    \|\dot{\theta}_{j+1}(\cdot) - \dot{\theta}_d(\cdot)\|_\infty = \|y_{j+1}(\cdot) - y_d(\cdot)\|_\infty \to 0 \quad \text{as } j \to \infty.    (8)

In the following, a discrete learning control law is proposed such that the output y_{j+1}(t) = θ̇_{j+1}(t) comes closer and closer to the desired output y_d(t) = θ̇_d(t) over the operation duration as the cycle number j increases. By making use of Taylor's expansion, the output y_j at the time instant (k+1) can be approximated in the jth cycle as

    y_j(k+1) \approx y_j(k) + \dot{y}_j(k)\,\Delta t = y_j(k) + [\omega_j(k, \theta, y) + M_j^{-1}(k)\,\tau_j(k)]\,\Delta t.    (9)

Similarly, at the (j+1)th cycle we have

    y_{j+1}(k+1) \approx y_{j+1}(k) + \dot{y}_{j+1}(k)\,\Delta t = y_{j+1}(k) + [\omega_{j+1}(k, \theta, y) + M_{j+1}^{-1}(k)\,\tau_{j+1}(k)]\,\Delta t.    (10)

From Eq. (10), given the torque τ_{j+1}(k), the corresponding output y_{j+1}(k+1) can be approximately determined. Conversely, the torque τ_{j+1}(k) which forces y_{j+1}(k+1) to approach y_d(k+1) can be solved for by replacing y_{j+1}(k+1) with y_d(k+1) in Eq. (10), provided that the nonlinear function ω(t, θ, y) is known; that is, τ_{j+1}(k) may be determined from

    y_d(k+1) \approx y_{j+1}(k) + [\omega_{j+1}(k, \theta, y) + M_{j+1}^{-1}(k)\,\tau_{j+1}(k)]\,\Delta t.    (11)

In practice it is difficult to model the nonlinear function ω(t, θ, y). Therefore the data in the previous cycle are used to learn the torque τ_{j+1}(k) that will result in y_{j+1}(k+1) approaching y_d(k+1) as j tends to infinity. By using ω_j(k, θ, y), obtained from Eq. (9), in place of ω_{j+1}(k, θ, y) in Eq. (11), our recursive learning control law is derived as

    \tau_{j+1}(k) = M_{j+1}(k)\, M_j^{-1}(k)\, \tau_j(k) + M_{j+1}(k)\, \{[y_d(k+1) - y_j(k+1)] - [y_{j+1}(k) - y_j(k)]\}/\Delta t.    (12)

The new discrete control law proposed above is designed to drive the manipulator along a desired velocity trajectory y_d(t) during the operation duration t ∈ [0, T] based on learning from previous experience, on the assumption that the robot returns to the same initial position at the beginning of every cycle. More than one data set is deliberately employed for better control: the torque update combines the previous-cycle control at the same sampling instant, a correction based on the difference between the velocity vectors in the current and previous cycles, and a correction based on the error between the previous-cycle velocity vector and the desired one at the following sampling instant. There is hence an element of anticipatory control provided in this scheme. The relative simplicity of the control strategy is illustrated in Fig. 2, where the data storage paths are shown as dashed lines. It is a distinct advantage of the proposed approach that the algorithm is directly formulated in discrete time; it is clear and simple to implement in applications. Another notable feature of this learning controller is that the full manipulator model is not required for computing the required torque: with the uncertainties all relegated to the collective function ω(t, θ, y), only the inertia matrix M is involved in the computation. With the inertia matrix M as a component of the controller, there is a clear-cut guideline on the choice of the effective 'gain factor', so there is no need to adjust the weightings by trial and error. If the matrix expression M in any particular system cannot be computed, the matrix elements can be treated as gains to be chosen by the user, much as in many existing learning algorithms.
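To make the update concrete, the following sketch implements one evaluation of Eq. (12). It is our illustration, not the authors' code; the function name and array layout are assumptions:

```python
import numpy as np

def learning_torque(k, dt, tau_prev, y_prev, y_curr_k, y_des, M_prev_k, M_curr_k):
    """Eq. (12): torque for instant k of cycle j+1 from stored cycle-j data.

    tau_prev : (N, n) torques applied in cycle j
    y_prev   : (N+1, n) joint velocities measured in cycle j
    y_curr_k : (n,) joint velocity measured at instant k in cycle j+1
    y_des    : (N+1, n) desired joint velocities
    M_prev_k, M_curr_k : (n, n) inertia matrices M_j(k) and M_{j+1}(k)
    """
    # Previous-cycle torque carried over through M_{j+1}(k) M_j^{-1}(k)
    carry = M_curr_k @ np.linalg.solve(M_prev_k, tau_prev[k])
    # Anticipatory term: error at the *next* instant of the previous cycle
    e_next = y_des[k + 1] - y_prev[k + 1]
    # Current-cycle term: deviation from the previous cycle at instant k
    e_now = y_curr_k - y_prev[k]
    return carry + M_curr_k @ (e_next - e_now) / dt
```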

Fig. 2. Learning control scheme using partial model knowledge.


Of course, the learning controller Eq. (12) is effective only if the algorithm converges as j tends to infinity. The following section gives an analysis of this convergence issue.

4. CONVERGENCE ISSUE

Theorem. For a robot manipulator governed by Eq. (3) with an inertia matrix M available, assume that:

(a) the robot takes up the same initial position θ_j(0) = θ_d(0) at every work cycle;

(b) the unknown nonlinear function ω(t, θ, y) satisfies the Lipschitz continuity condition [13], that is, there exists a constant L such that for any pair {(θ_{j+1}, y_{j+1}), (θ_j, y_j)} and any t ∈ [0, T]

    \|\omega_{j+1} - \omega_j\|_\infty = \|\omega(t, \theta_{j+1}, y_{j+1}) - \omega(t, \theta_j, y_j)\|_\infty \le L\, \|y_{j+1} - y_j\|_\infty    (13)

(c) the nonlinear function ω(t, θ, y) is continuous and differentiable, and there exists a positive constant Ω_m such that

    \Omega_m = \max\{|\dot{\omega}(t, \theta, y)|;\ t \in [0, T];\ \theta, y \in R^n\}.    (14)

Then the iterative discrete learning control law Eq. (12) guarantees that, for a given output y_d(t) = θ̇_d(t), t ∈ [0, T], the maximum error norm ||y_{j+1}(·) − y_d(·)||_∞ is bounded within β/(1 − α) as j → ∞, where α = LΔt/(1 − LΔt) and β = Ω_m Δt²/(1 − LΔt), provided that α < 1.

Proof. The torque vector τ_{j+1}(k) is applied to the manipulator joint actuators at the time instant k and, in the discrete scheme, τ remains unchanged between consecutive sampling instants, i.e. τ(t) = τ(k) for any t ∈ [kΔt, (k+1)Δt]. The output y_{j+1} at the following sampling instant k+1 can therefore be calculated as

    y_{j+1}(k+1) = y_{j+1}(k) + \int_{k\Delta t}^{(k+1)\Delta t} \{\omega_{j+1}(t, \theta, y) + M_{j+1}^{-1}(k)\,\tau_{j+1}(k)\}\, dt.    (15)

Substituting the learning control law Eq. (12) into Eq. (15), we obtain

    y_{j+1}(k+1) = y_d(k+1) - [y_j(k+1) - y_j(k)] + \int_{k\Delta t}^{(k+1)\Delta t} \omega_{j+1}(t, \theta, y)\, dt + \int_{k\Delta t}^{(k+1)\Delta t} M_j^{-1}(k)\,\tau_j(k)\, dt.    (16)

Similar to Eq. (15), we have the following relation for the jth cycle:

    y_j(k+1) = y_j(k) + \int_{k\Delta t}^{(k+1)\Delta t} \{\omega_j(t, \theta, y) + M_j^{-1}(k)\,\tau_j(k)\}\, dt.    (17)

By combining Eq. (16) with Eq. (17), it follows that

    y_{j+1}(k+1) - y_d(k+1) = \int_{k\Delta t}^{(k+1)\Delta t} [\omega_{j+1}(t, \theta, y) - \omega_j(t, \theta, y)]\, dt
    = \int_{k\Delta t}^{(k+1)\Delta t} \{[\omega_{j+1}(t, \theta, y) - \omega_{j+1}(k, \theta, y)] - [\omega_j(t, \theta, y) - \omega_j(k, \theta, y)] + [\omega_{j+1}(k, \theta, y) - \omega_j(k, \theta, y)]\}\, dt.    (18)

Using the mean value theorem and Eq. (14), we have

    |\omega_{j+1}(t, \theta, y) - \omega_{j+1}(k, \theta, y)| \le \Omega_m (t - k\Delta t)
    |\omega_j(t, \theta, y) - \omega_j(k, \theta, y)| \le \Omega_m (t - k\Delta t).    (19)

Taking norms of both sides of Eq. (18), using the relationship described by Eq. (19), and invoking the Lipschitz continuity condition Eq. (13) yields

    \|y_{j+1}(\cdot) - y_d(\cdot)\|_\infty \le 2\Omega_m \int_{k\Delta t}^{(k+1)\Delta t} (t - k\Delta t)\, dt + L \int_{k\Delta t}^{(k+1)\Delta t} \|y_{j+1}(\cdot) - y_j(\cdot)\|_\infty\, dt = \Omega_m \Delta t^2 + L\Delta t\, \|y_{j+1}(\cdot) - y_j(\cdot)\|_\infty.    (20)

By making use of the norm property

    \|a - b\| \le \|a\| + \|b\|    (21)

inequality (20) can be rearranged as

    \|y_{j+1}(\cdot) - y_d(\cdot)\|_\infty \le \beta + \alpha\, \|y_j(\cdot) - y_d(\cdot)\|_\infty    (22)

where

    \alpha = \frac{L\,\Delta t}{1 - L\,\Delta t} \quad \text{and} \quad \beta = \frac{\Omega_m\,\Delta t^2}{1 - L\,\Delta t}.

It follows immediately from Eq. (22) that

    \|y_{j+1}(\cdot) - y_d(\cdot)\|_\infty \le \beta + \alpha\beta + \alpha^2\beta + \cdots + \alpha^j\beta + \alpha^{j+1}\, \|y_0(\cdot) - y_d(\cdot)\|_\infty = \beta\,\frac{1 - \alpha^{j+1}}{1 - \alpha} + \alpha^{j+1}\, \|y_0(\cdot) - y_d(\cdot)\|_\infty.    (23)

Therefore it is clear that the error norm ||y_{j+1}(·) − y_d(·)||_∞ is bounded within β/(1 − α) as j → ∞, provided that α < 1, that is, Δt < 1/(2L).

In the implementation of the learning control law Eq. (12), it is recognized that the sampling interval Δt is limited by computer hardware considerations. From the convergence point of view, however, it must be chosen to guarantee that α < 1. Theoretically, the smaller the value of Δt, the faster the rate of convergence tends to become; this is also verified in the experiments. For a reasonably fast convergence we can set α = 0.2, for example, which gives Δt = 1/(6L). Without explicit knowledge of L in any particular system, this suggests that the learning controller can be tuned in each case by reducing Δt until convergence is achieved and its rate is satisfactory. The penalty of choosing too small a value for Δt is, of course, an excessive processing burden for the particular system.
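The recursion in Eq. (22) is easy to check numerically. The short sketch below is ours; the values of L, Ω_m, Δt and the initial error are arbitrary examples, not from the paper. It iterates the bound and compares it with the limit β/(1 − α):

```python
# Iterate the error bound of Eq. (22): e_{j+1} = beta + alpha * e_j.
L, Omega_m, dt = 5.0, 50.0, 0.02   # example Lipschitz constant, bound on omega-dot, sampling period
assert dt < 1.0 / (2.0 * L)        # convergence condition alpha < 1, i.e. dt < 1/(2L)

alpha = L * dt / (1.0 - L * dt)
beta = Omega_m * dt ** 2 / (1.0 - L * dt)

e = 1.0                            # assumed initial error norm ||y_0 - y_d||
for j in range(30):
    e = beta + alpha * e           # bound after cycle j+1
print(f"bound after 30 cycles: {e:.6f}, limit beta/(1-alpha): {beta / (1.0 - alpha):.6f}")
```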

5. IMPLEMENTATION DETAILS FOR EXPERIMENTS

Figure 3 shows an overall view of the ZERO robot and the associated control system used in the experiments. The ZERO drive system consisted of DC motors driving each joint through a combination of shafts and gears. Incremental optical encoders were mounted directly on each motor shaft for position feedback. Each joint motor of the robot was directly controlled by an HCTL-1000 motor control chip, which accepts a TTL-level quadrature shaft-encoder signal and outputs a pulse-width-modulated (PWM) signal to the motor power amplifier. An HCTL1 motor control board provided the trajectory-generation functions, which were selectable via software. These functions were raw position control with velocity and trapezoidal acceleration-deceleration profiling. Like many industrial robot controllers, the ZERO control system could only provide point-to-point position control; if the robot is required to follow a prescribed trajectory, the trajectory has to be specified by a series of via points.

Fig. 3. Elements of ZERO's control system.

In order to implement the learning control algorithms on the ZERO robot, an IBM 386 (25 MHz) compatible PC with a co-processor was used. The functions provided by the HCTL1 control board were by-passed, and the PC controlled the joint motors and received encoder feedback directly through the HCTL-1000 control chips. The registers of each HCTL-1000 control chip were memory-mapped directly into the IBM PC's memory. Robot command routines were written in C. During the operation of the robot with the learning controller, a real-time clock constantly interrupted the foreground program resident in the PC at a preselected sampling period Δt. The foreground program generated the desired trajectory and stored the data at every sampling instant. At each interrupt, the foreground program read the joint encoder readings through the HCTL-1000 chips, generated a new set of control signals using the learning control law, and finally dispatched these signals to the power amplifiers of the appropriate joint motors.

Fig. 4. Geometrical configuration of the ZERO robot.
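Purely as an illustration of the loop structure just described, and not the authors' C implementation, a per-cycle sampling loop might take roughly the following shape. Here read_encoders and write_amplifiers are hypothetical stand-ins for the memory-mapped HCTL-1000 register accesses, learning_torque is the sketch given after Eq. (12), and initialization of the very first cycle's data is omitted:

```python
import time
import numpy as np

def run_cycle(dt, N, tau_prev, y_prev, y_des, M_of, read_encoders, write_amplifiers):
    """Run one work cycle of N sampling instants; log data for the next cycle."""
    tau_log = []
    y_log = [read_encoders() * 0.0]                # y(0) = 0: each cycle starts at rest
    theta_old = read_encoders()
    for k in range(N):
        tick = time.monotonic()
        theta = read_encoders()                    # sample joint positions
        y_k = (theta - theta_old) / dt             # back-difference velocity estimate
        theta_old = theta
        tau_k = learning_torque(k, dt, tau_prev, y_prev, y_k, y_des,
                                M_of(k), M_of(k))  # Eq. (12); here M_j(k) = M_{j+1}(k)
        write_amplifiers(tau_k)                    # dispatch to joint power amplifiers
        tau_log.append(tau_k)
        y_log.append(y_k)
        # stand-in for the real-time clock interrupt: wait out the sampling period
        time.sleep(max(0.0, dt - (time.monotonic() - tick)))
    return np.array(tau_log), np.array(y_log)
```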

6. EXPERIMENTAL RESULTS

The implementation was done only for Link 2 and Link 3 of the ZERO arm, with Joint 1 held steady at the zero position. As the motion of Joint 3 affected the three joint angles of the wrist, proportional controllers were used to maintain the wrist joints J4, J5 and J6 at their zero positions (see Fig. 4). Figure 5 shows the definition of the measured angular positions of the two joints, where θ_{2z} = π/2 − θ_2. The orientation of the wrist was not a subject of control in the set of experiments described. Modelling the arm as two rigid links (of lengths l_1, l_2) with point masses (m_1, m_2) at the distal ends of the links, we obtain the inertia matrix M in terms of the physical parameters as follows:

    M(\theta) = \begin{bmatrix} l_1^2(m_1 + m_2) + l_2^2 m_2 + 2 l_1 l_2 m_2 \cos\theta_2 & l_2^2 m_2 + l_1 l_2 m_2 \cos\theta_2 \\ l_2^2 m_2 + l_1 l_2 m_2 \cos\theta_2 & l_2^2 m_2 \end{bmatrix}    (24)

where l_1 = 0.279 m, l_2 = 0.369 m, m_1 = 3 kg and m_2 = 2 kg.

Fig. 5. Angle definition of the ZERO robot's two joints.

The desired trajectories are described as functions of the joint angles. A cubic polynomial with four coefficients was employed [9] to generate the continuous path by using the following four constraints:

    \theta(0) = \theta_0, \quad \theta(T) = \theta_f, \quad \dot{\theta}(0) = 0, \quad \dot{\theta}(T) = 0.

Therefore, given T = 1 sec, θ(0) = [(π/2), −(3π/4)]^T and θ(T) = [(π/4), −π]^T, the desired position and velocity trajectories for θ_1 and θ_2 are:

    \theta_{1d}(t) = 1.57 - 2.36\,t^2 + 1.57\,t^3, \quad 0 \le t \le 1 sec
    \theta_{2d}(t) = -2.36 - 2.36\,t^2 + 1.57\,t^3, \quad 0 \le t \le 1 sec

and

    \dot{\theta}_{1d}(t) = \dot{\theta}_{2d}(t) = -4.71\,t + 4.71\,t^2, \quad 0 \le t \le 1 sec.
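A minimal sketch (ours) of this cubic trajectory generator, which reproduces the coefficients quoted above for Joint 1:

```python
import numpy as np

def cubic_trajectory(theta0, thetaf, T):
    """Cubic with theta(0) = theta0, theta(T) = thetaf and zero boundary velocities:
    theta_d(t) = theta0 + (thetaf - theta0) * (3 (t/T)^2 - 2 (t/T)^3)."""
    d = thetaf - theta0
    pos = lambda t: theta0 + d * (3 * (t / T) ** 2 - 2 * (t / T) ** 3)
    vel = lambda t: d * (6 * t / T ** 2 - 6 * t ** 2 / T ** 3)
    return pos, vel

# Joint 1: from pi/2 to pi/4 in T = 1 sec gives 1.571 - 2.356 t^2 + 1.571 t^3
pos1, vel1 = cubic_trajectory(np.pi / 2, np.pi / 4, 1.0)
```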

The learning performance for different sampling periods is shown in Fig. 6, for which the sampling periods are: (a) Δt = 15 msec; (b) Δt = 20 msec; (c) Δt = 30 msec; (d) Δt = 40 msec; and (e) Δt = 50 msec. It is important to note that the sampling rate has a significant influence on the performance of any digital control system. In general, as Δt gets larger, the effects of sampling and discretization become more pronounced and the degradation of control system performance becomes more noticeable [6]. The theoretical analysis in the previous sections shows the particularly important role the sampling rate plays in the proposed learning system. The experimental results for the 20 msec sampling period are examined in more detail below.

Fig. 6. Dependence of convergence on sampling period.

Fig. 7. Tracking performance (desired and measured trajectories for cycles 1, 3, 8 and 15, plotted against sampling instant).

The actual and desired angular position and velocity trajectories, as well as the voltage signals applied to the joint motors, are plotted in Fig. 7 for Δt = 20 msec. It is noted that the tracking performance is satisfactory after only 8 operation cycles. In Fig. 8, the effects of choosing different estimates M̂, with Δt = 20 msec, are examined. In these experiments the estimate M̂ is chosen as a constant matrix, not as a function of time or joint angles. For the curves shown in Fig. 8, the values of M̂ are:

(a) \hat{M} = \begin{bmatrix} 0.8 & 0.0 \\ 0.0 & 0.5 \end{bmatrix}
(b) \hat{M} = \begin{bmatrix} 0.8 & 0.1 \\ 0.1 & 0.5 \end{bmatrix}
(c) \hat{M} = \begin{bmatrix} 0.8 & 0.2 \\ 0.2 & 0.5 \end{bmatrix}
(d) \hat{M} = \begin{bmatrix} 0.8 & 0.3 \\ 0.3 & 0.5 \end{bmatrix}
(e) \hat{M} = \begin{bmatrix} 0.8 & 0.5 \\ 0.5 & 0.5 \end{bmatrix}
(f) \hat{M} = \begin{bmatrix} 0.8 & -0.1 \\ -0.1 & 0.5 \end{bmatrix}
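It may be worth noting (our restatement, following directly from Eq. (12)) that when a constant estimate M̂ replaces both M_{j+1}(k) and M_j(k), the product M_{j+1}(k) M_j^{-1}(k) reduces to the identity and the law simplifies to

    \tau_{j+1}(k) = \tau_j(k) + \hat{M}\,\{[y_d(k+1) - y_j(k+1)] - [y_{j+1}(k) - y_j(k)]\}/\Delta t

so that M̂ acts purely as a constant learning-gain matrix.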

Fig. 8. Learning performance with different M̂.

Comparing Fig. 8 with Fig. 6b, we see that the performance does not vary much. This supports our view that M̂ can be treated as a tuning gain when the full expression for M is not available.

7. CONCLUSIONS

A rapidly converging and easily tuned iterative learning control scheme has been developed. The partial model-based approach, as proposed, is able to take into account automatically the coupling effects, and it suggests rational values for the off-diagonal as well as the diagonal elements of the learning gain matrix; there is hence no need to adjust the weightings by trial and error. Fortunately, it is relatively easy to model and compute M compared with the frictional and other terms associated with the dynamics of the robot [11], and this results in ease of implementation for applications. Owing to the particular nature of convergence of the proposed learning strategy, stable and effective learning is not contingent on the selection of numerous controller parameters, quite unlike many other learning control algorithms in the literature. Of course, should the inertia matrix expression M in any particular system be unavailable for computation, the matrix elements can be treated as gains to be chosen by the user. Indeed, the controller algorithm remains the same for any manipulator to be controlled, with the only parameter subject to in situ tuning being Δt, the sampling interval for the discrete implementation. In this case, tuning is simply guided by observation of the rate of convergence. This is a significant contribution of this investigation, because relatively few discussions can be found in the literature on effective methods of tuning learning controllers.

Theoretically, according to the results of the theorem presented, the rate of convergence can be increased by simply reducing the sampling interval Δt. However, the computation delay associated with particular hardware becomes significant when too small a sampling period is used, and the assumptions upon which the theoretical analysis is based then become invalid. In such cases, as the experimental results indicate, too small a sampling interval leads to poor tracking performance.

REFERENCES

1. An, C. H., Atkeson, C. G., Hollerbach, J. M.: Model-based Control of a Robot Manipulator. MIT Press, 1988.
2. Arimoto, S., Kawamura, S., Miyazaki, F.: Bettering operation of robots by learning. J. Robot. Syst. 1(2): 123-140, 1984.
3. Arimoto, S., Kawamura, S., Miyazaki, F., Tamaki, S.: Learning control theory for dynamic systems. In Proceedings of the 24th IEEE CDC, Fort Lauderdale, Florida, 1985, pp. 1375-1380.
4. Arimoto, S., Miyazaki, F., Kawamura, S.: Motion control of robotic manipulator based on motor program learning. In IFAC Robot Control 1988 (SYROCO '88), Karlsruhe, FRG, 1988, pp. 169-176.
5. Arimoto, S., Naniwa, T., Suzuki, H.: Selective learning with a forgetting factor for robot motion control. In 1991 IEEE International Conference on Robotics and Automation, Sacramento, CA, 1991, pp. 1528-1533.
6. Astrom, K. J., Wittenmark, B.: Computer Controlled Systems: Theory and Design. Prentice-Hall, 1990.
7. Bien, Z., Huh, K. M.: Higher-order iterative learning control algorithm. IEE Proc. Pt. D 136(3): 105-112, 1989.
8. Bondi, P., Casalino, G., Gambardella, L.: On the iterative learning control theory for robot manipulators. IEEE J. Robot. Autom. 4(1): 14-22, 1988.
9. Craig, J. J.: Introduction to Robotics: Mechanics & Control. Addison-Wesley, 1986.
10. Craig, J. J.: Adaptive Control of Mechanical Manipulators. Addison-Wesley, 1988.
11. Fijany, A., Bejczy, A. K.: An effective method for computation of the manipulator inertia matrix. In Proceedings of the 1989 IEEE International Conference on Robotics and Automation, Arizona, 1989, pp. 1366-1373.
12. Gu, L. Y., Loh, N. K.: Learning control in robotic systems. In Proceedings of the IEEE International Symposium on Intelligent Control 1987, Philadelphia, PA, 1987, pp. 360-364.
13. Lambert, J. D.: Computational Methods in Ordinary Differential Equations. John Wiley, 1976.
14. Moore, K. L., Dahleh, M., Bhattacharyya, S. P.: Adaptive gain adjustment for a learning control method for robots. In Proceedings of the 1990 IEEE International Conference on Robotics and Automation, Ohio, 1990, pp. 2095-2099.
15. Oh, S. R., Bien, Z., Suh, I. H.: An iterative learning control method with application for robot manipulator. IEEE J. Robot. Autom. 4(5): 508-514, 1988.
16. Pourboghrat, F.: Adaptive learning control for robots. In Proceedings of the 1988 IEEE International Conference on Robotics and Automation, Philadelphia, PA, 1988, pp. 862-866.
17. Togai, M., Yamano, O.: Analysis and design of an optimal learning control scheme for industrial robots: a discrete system approach. In Proceedings of the 24th IEEE CDC, Fort Lauderdale, Florida, 1985, pp. 1399-1404.
18. Tso, S. K., Ma, Y. X.: Cartesian-based learning control for robots in discrete time formulation. IEEE Trans. Syst. Man Cybern. 22(5): 1198-1204, 1992.
19. Uchiyama, M.: Formation of high speed motion pattern of mechanical arm by trial. Trans. Soc. Instrum. Control Eng. (Jpn) 19(5): 706-712, 1978.