Learning control of mobile robots using a multiprocessor system

ARTICLE IN PRESS Control Engineering Practice 14 (2006) 1279–1295 www.elsevier.com/locate/conengprac Learning control of mobile robots using a multi...

Download PDF

466KB Sizes 0 Downloads 61 Views

Report

PDF Reader
Full Text

ARTICLE IN PRESS

Control Engineering Practice 14 (2006) 1279–1295 www.elsevier.com/locate/conengprac

Learning control of mobile robots using a multiprocessor system P. Antonini, G. Ippoliti, S. Longhi Dipartimento di Ingegneria Informatica, Gestionale e dell’Automazione, Universita` Politecnica delle Marche, Via Brecce Bianche, 60131 Ancona, Italy Received 12 February 2004; accepted 28 June 2005 Available online 15 August 2005

Abstract A real-time multiprocessor system is proposed for the solution of the tracking problem of mobile robots operating in a real context with environmental disturbances and parameter uncertainties. The proposed control scheme utilizes multiple models of the robot for its identiﬁcation in an adaptive and learning control framework. Radial Basis Function Networks (RBFNs) are considered for the multiple models in order to exploit the net non-linear approximation capabilities for modeling the kinematic behavior of the vehicle and for reducing unmodeled contributions to tracking errors. The training of the nets and the tests of the achieved control performance have been done in a real experimental setup. The proposed control architecture improves the robot tracking performance achieving fast and accurate control actions in presence of large and time-varying uncertainties in dynamical environments. The experimental results are satisfactory in terms of tracking errors and computational efforts. r 2005 Elsevier Ltd. All rights reserved. Keywords: Robotics; Autonomous vehicles; Multiple models; Multiprocessor systems; Neural networks

1. Introduction Mobile robots usually work with different tasks in operative environments which assume different forms depending on speciﬁc applications to be performed. Environments may represent different terrains on which the system operates or different loads in the work space. In this context, a multiplicity of robot models can be used to identify different operative conditions and to determine the control actions with the necessary autonomy to cope with rapid environment changes. For the control problem, if a single identiﬁcation model is used, when operating conditions change, this model should be adapted accordingly to the new environment before a control action is taken. Therefore, a multiplicity of models may be required to identify different environments and to take the necessary control action rapidly when changes in the environments are of considerable magnitude and the autonomy of the robot Corresponding author. Tel.: +39 071 220 4451; fax: +39 071 220 4474. E-mail address: [email protected] (S. Longhi).

0967-0661/$ - see front matter r 2005 Elsevier Ltd. All rights reserved. doi:10.1016/j.conengprac.2005.06.012

must be improved (Chen & Narendra, 2000; Ippoliti & Longhi, 2004; Morse, 1995; Narendra & Xiang, 2000; Weller & Goodwin, 1994). The kinematic constraint on the use of commercially available vehicles which imposes to control the robot by setting its linear and angular velocities, hinders to take explicitly into account parameters variations in the design of the control algorithm for such vehicles. In fact, the commercial available vehicles impose to use a ‘‘kinematic viewpoint’’ based on the kinematic model which does not contain information on vehicle wheels dimensions, robot mass and inertia. These limits associated with the use of the kinematic model, can be overcome by the use of Neural Networks (NNs)-based controllers (Corradini, Ippoliti, & Longhi, 2003), where the non-linear approximation capabilities of Neural Nets have been exploited for reducing, with a simple kinematic control scheme, the effects of unmodeled phenomena, that are strong when relative high linear and angular accelerations are imposed by the navigation system. In this paper, NN-based control methodologies are further investigated within the context of multiple

ARTICLE IN PRESS 1280

P. Antonini et al. / Control Engineering Practice 14 (2006) 1279–1295

models control of mobile robots in an adaptive and learning control framework. A switching strategy among these models determines the best possible control input at any given instant. Radial basis function networks have been used for identiﬁcation and control in order to avoid the non-linear optimization techniques used in the learning algorithm of the Multilayer Neural Networks (MNN) considered in Corradini et al. (2003) and the related problems of local minima. The experimental results using Radial Basis Function Networks (RBFNs) control schemes (D’Amico, Ippoliti, & Longhi, 2001b) have shown that RBFN are a possible alternative to control schemes based on MNN (Corradini et al., 2003). Moreover, these nets are useful for the identiﬁcation and control of complex non-linear dynamical systems in the presence of uncertainties because the particular structure of RBFN allows their applicability to adaptive and adaptive-learning control schemes (Jagannathan, Lewis, & Pastravanu, 1994; Ahmed-Zaid, Ioannou, Polycarpou, & Youssef, 1993; Polycarpou & Ioannou, 1995; Polycarpou, 1996; Sadegh, 1995; Sanner & Slotine, 1991, 1994; Sanner & Essex, 1996). The real-time implementation of this switching controller with on-line learning of new models and controllers requires a hierarchical architecture where the control actions and the on-line learning are two independent processes coordinated by a supervisor. The supervisor task is double: to deduce the controller to be applied to the robot and to manage the on-line learning procedure. This solution is computation intensive and requires a high-performance computer architecture based on a multiprocessor system. The proposed hierarchical control structure imposes to share the computation on two independent processors, where a processor is dedicated to the implementation of the realtime multiple model control scheme and of the supervisor, while on the second processor the on-line learning algorithm is implemented. The computation of control actions and the on-line learning of new models and controllers are partially independent processes and are performed on independent computers connected by a local communication network. The technical aspects of this multiprocessor architecture will be analyzed in this paper. Experimental tests of the proposed NNs control schemes have been performed on the LabMate mobile base, available at the Robotics Lab of the Universita` Politecnica delle Marche and provided with an automatic navigation system. Experiments have been performed in a real indoor environment with unknown masses, friction, drive train compliance and with meaningful trajectories characterized by relatively strong linear and angular accelerations imposed by the planner. The experimental results show noticeable improvements of the vehicle performance with respect to ﬁxed and

single-model adaptive RBFN-based control scheme (D’Amico et al., 2001b), conﬁrming that the theoretical concepts on switching, learning and tuning developed in Chen and Narendra (2000), Narendra and Xiang (2000) can be applied to achieve signiﬁcant improvements in the control of mobile robots. The paper is organized as follows: in Section 2, the multiple models control approach for non-linear dynamical systems is described. The RBFNs used for implementing the multiple models control scheme and their learning algorithm are recalled in Section 3. The details on the multiple models control scheme based on NNs and on the multiprocessor architecture are described in Section 4, while experimental results are reported in Section 5. The paper ends with comments on the performance of the proposed controllers and on the necessity of using a hierarchical architecture for the realtime implementation of the switching controller.

2. A multiple models control approach for non-linear dynamical systems In this section, basic deﬁnitions and concepts of a multiple models control approach for non-linear dynamical systems are recalled. This methodology has been developed for improving the performance of dynamical systems operating in rapidly varying environments. The approach is based on the use of multiple models, switching, adaptation and learning (Chen & Narendra, 2000; Narendra & Xiang, 2000). The multiple models are based on neural networks that are an useful tool for the identiﬁcation and control of complex non-linear dynamical systems in the presence of uncertainties. Assuming that L identiﬁcation models M j ; j 2 f1; Lg are used in parallel with the given plant, the objective is to determine which among them is closest (according to some criterion) to the plant at any given instant and to use the corresponding controller C j ; j 2 f1; Lg, designed for controlling the corresponding model M j , to control the plant (Chen & Narendra, 2000; Narendra & Xiang, 2000). With this control structure, the performance of the different controllers cannot be directly evaluated in parallel because at any time instant only one control input can be chosen. Therefore, the output responses yj ; j ¼ 1; 2; . . . ; L of the multiple models and the plant output y are used for an estimate of the controller performance. Hence, at every instant, some measures of the identiﬁcation errors ej :¼ yj y are determined with suitable performance indices J j ðkÞ; j ¼ 1; 2; . . . ; L. The model corresponding to minfJ j ðkÞgLj¼1 is chosen to determine the plant control input at that time instant, i.e. the controller corresponding to the model with the minimum index is used (switched) for controlling the plant.

ARTICLE IN PRESS P. Antonini et al. / Control Engineering Practice 14 (2006) 1279–1295

When the process to be controlled is non-linear, NNbased models fM j gLj¼1 can be used for the theoretical ability of NN to approximate arbitrary non-linear mappings (Cybenko, 1989). The determination of the models fM j gLj¼1 , representing different operative conditions, has to be based on the a priori information concerning the plant and when a new conﬁguration of the plant occurs, a new identiﬁcation model M Lþ1 has to be learned online becoming an a priori known environment for all future performance of the control system. The approach here recalled has been specialized to the control of a mobile robot. In particular, RBFNs have been used for the identiﬁcation and control. In the following section, the main features of the RBFNs are recalled. These nets are useful for the identiﬁcation and control of complex non-linear dynamical systems in the presence of uncertainty because the particular structure of RBFN allows their applicability to adaptive and adaptive-learning control schemes.

3. Learning algorithm for radial basis function networks 3.1. Radial basis function networks A RBFN with input pattern x 2 Rn and a scalar ^ 2 R implements a mapping f : Rn ! R output c r according to ^ ¼ f ðxÞ ¼ l0 þ c r

nr X

li fðkx ci kÞ,

(1)

1281

output samples acquired in a ﬁnite set of past time instants (Hunt, Sbarbaro, Zbikowski, & Gawthrop, 1992; Warwick et al., 1996a). In the RBFN the functional form fðÞ and the centers ci are suitably ﬁxed. By providing a set of network input pattern xðkÞ and the corresponding desired output cðkÞ to be approximated by the net, for k ¼ 1; 2; . . . ; N, the values of the weights li , i ¼ 0; 1; . . . ; nr can be determined using the linear Least-Squared (LS) method. The choices of centers ci must be carefully considered in order the RBFN closely matches the performance of MNNs, while theoretical investigation and practical results show that the choice of the non-linearity fðÞ, a function of the distance d i between the current input x and the center ci , does not signiﬁcantly inﬂuence the performance of the RBFN (Chen et al., 1991). Therefore, the following gaussian function is considered: fðd i Þ ¼ expðd 2i =b2 Þ;

i ¼ 1; 2; . . . ; nr ,

(2)

where d i ¼ kx ci k and the real constant b is a scaling or ‘‘width’’ parameter (Chen et al., 1991). The centers ci , i ¼ 1; 2; . . . ; nr are generally chosen from the data set fxðkÞgN k¼1 . The Orthogonal Least Square (OLS) algorithm (Chen et al., 1991) is an efﬁcient method for selecting centers from the data set obtaining adequate and parsimonious RBFN thus reducing computational complexity and numerical illconditioning. The application of this method requires to rewrite the RBFN expressed by (1) as a linear regression model

i¼1

where fðÞ is a given function from Rþ to R, k k denotes the Euclidean norm, li , i ¼ 0; 1; . . . ; nr are the weights or parameters, ci 2 Rn , i ¼ 1; 2; . . . ; nr , are the radial basis functions centers and nr is the number of centers (Chen, Cowan, & Grant, 1991; Girosi & Poggio, 1990; Unar & Murray-Smith, 1999; Warwick, Kambhampati, Parks, & Mason, 1996a). The block diagram representing the implementation of considered network (1) is shown in Fig. 1. Note that, if a RBFN is used for the estimation of the output of a dynamical system, the system dynamics can be taken into account through the input pattern x, that must be composed of a proper set of system input and

x1 x2

1(||x-c1||)

2(||x-c1||)

1 1 2

0 Σ

ψˆ

cðkÞ ¼

M X

pi ðkÞdi þ ðkÞ,

where M :¼ nr þ 1, cðkÞ is the desired output to be ^ approximated by the net, ðkÞ :¼ cðkÞ cðkÞ is the error signal, di :¼ li1 are the parameters to be estimated, pi ðkÞ :¼ fðkxðkÞ ci kÞ are given ﬁxed function of xðkÞ where the centers ci have to be ﬁxed and p1 ðkÞ ¼ 1. The error signal ðkÞ is assumed to be uncorrelated with pi ðkÞ. The OLS method, is considered for the selection of centers ci from the data set fxðkÞgN k¼1 , with a reduced number M of pi ðÞ. These signiﬁcant regressors pi ðkÞ, i ¼ 1; 2; . . . ; M can be selected using the OLS algorithm operating in a forward regression way (Chen et al., 1991). Arranging (3) for k ¼ 1; 2; . . . ; N in the following matrix form: (4)

W ¼ PD þ E, T

xn

n (||x-cn ||) r

nr

r

Fig. 1. Radial basis function neural network.

(3)

i¼1

T

where W :¼ ½cð1Þ cðNÞ , D :¼ ½d1 dM , E :¼ ½ð1Þ ðNÞT and P :¼ ½p1 pM with pi :¼ ½pi ð1Þpi ð2Þ pi ðNÞT , i ¼ 1; 2; . . . ; M the OLS method involves the transformation of the set of pi into a set of orthogonal basis vectors wi , i ¼ 1; 2; . . . ; M, where the space spanned by the set of orthogonal basis vectors wi , i ¼ 1; 2; . . . ; M, is the

ARTICLE IN PRESS P. Antonini et al. / Control Engineering Practice 14 (2006) 1279–1295

1282

same space spanned by the set of pi , i ¼ 1; 2; . . . ; M, and (4) can be rewritten as (5)

W ¼ Wg þ E

with W ¼ ½w1 ; w2 ; . . . ; wM . This algorithm makes it possible to calculate the individual contribution to the desired output energy from each basis vector. Because wi and wj are orthogonal for iaj, the sum of squares or energy of WðkÞ is WT W ¼

M X

g2i wTi wi þ E T E

(6)

i¼1

and this relation suggests to consider as the error reduction ratio due to wi the following quantity: ½erri :¼ g2i wTi wi =ðWT WÞ;

i ¼ 1; 2; . . . ; M.

(7)

The regressors selection procedure terminates at the M s th step when 1

Ms X

½errj or,

(8)

3.3. ‘‘Width’’ of the radial basis functions Once the centers have been selected, the normalization parameter b2 of (2), that represents a measure of the spread of the data associated with each centre, has to be determined. No rigorous method exists to calculate this parameter. In the developed solution an Akaiketype criteria (Chen et al., 1991) is proposed for choosing the normalization parameter obtaining a good compromise between estimate accuracy and network complexity (D’Amico et al., 2001a). Akaike-type criteria which compromises between the performance and the number of parameters has the following form: AICðwÞ ¼ N logðs2e Þ þ M s w,

(9)

where w is the critical value of the chi-squared distribution with one degree of freedom and for a given level of signiﬁcance, N is the number of data set, s2e is the variance of the net estimate residuals and M s is the number of signiﬁcant regressors. The procedure terminates when AICðwÞ reaches its minimum.

j¼1

where 0oro1 is a chosen tolerance. This gives rise to a model containing only M s signiﬁcant regressors. The orthogonal property makes the whole selection procedure simple and efﬁcient and the tolerance r is an important parameter in balancing the accuracy and the complexity of the ﬁnal network. 3.2. K-means clustering algorithm The OLS method can be employed as a forward regression procedure (Chen et al., 1991) to select a suitable set of centers ns pnr , from a large initial set of candidates, for the RBFN f r : Rn ! R. In the developed solution the initial set of centers is obtained using the kmeans unsupervised clustering algorithm that starting from a reasonable high number of centers randomly chosen in the input space, moves them in the most signiﬁcative regions of the input space (D’Amico, Ippoliti, & Longhi, 2001a; Hush & Horne, 1993). The k-means clustering algorithm is given by the sequential execution of the following steps: Step 1: Initialize the cluster centers cj , j ¼ 1; 2; . . . ; nc . This centers are randomly chosen from the input data set fxðkÞgN k¼1 . Step 2: Group all the inputs of the data set fxðkÞgN k¼1 in the sets Oj , j ¼ 1; 2; . . . ; nc , where Oj is the set of input vectors xðÞ closest to the cluster center cj , i.e. N kxðkÞ ci kg. Oj :¼ fxðÞ 2 fxðkÞgN k¼1 j kxðkÞ cj k ¼ mini¼1P Step 3: For all j ¼ 1; 2; . . . ; nc do cj ¼ M1 j xðkÞ2Oj xðkÞ, where M j is the number of input vectors xðkÞ 2 Oj . Step 4: If there is no change in the cluster assignments of Step 3 from one iteration to the next the algorithm is stopped otherwise go back to Step 2.

3.4. On-line adaptive algorithm The structure of RBFNs allows to introduce adaptive and adaptive-learning on-line algorithms for improving the non-linear approximation capabilities of NNs. The RBFN is a special two-layer network which is linear in the parameters li ; i ¼ 0; 1; . . . ; nr , by ﬁxing all radial basis functions centers and non-linearities in the hidden layer. Therefore, the second layer implements a linear combiner of the outputs of the hidden layer with weights li ; i ¼ 0; 1; . . . ; nr . The considered on-line adaptive algorithm computes the weights li by standard LS methods. Therefore, this particular structure of RBFN allows their applicability to adaptive control schemes, where the weights of the network second layer are online adapted making use of plant input and output signals. This adaptation allows the use of nets in a control scheme able to cope with parameters variations and external disturbances (D’Amico et al., 2001a,b; Warwick, Kambhampati, Parks, & Mason, 1996b). 3.5. On-line adaptive clustering algorithm The structure of the RBFN allows also an on-line adaptive clustering by updating the number of the radial basis function centers and the weights of the second layer. The proposed on-line adaptive clustering algorithm, for each new input vector of the RBFN, checks if the outputs of the ﬁrst RBFN layer are signiﬁcantly different from zero. If the outputs of the hidden layer are under a proper ﬁxed threshold it means that the new input vector is outside the input space covered by the RBFN. In this case, the algorithm adds a new center to

ARTICLE IN PRESS P. Antonini et al. / Control Engineering Practice 14 (2006) 1279–1295

1283

the RBFN to cover the new input vector and update the weight of the second layer with the LS method recalled in Section 3.4 (D’Amico et al., 2001a). This on-line adaptive clustering algorithm adds new centers to the RBFN to cover new input vectors outside the input space covered by the RBFN without taking in account the centers selected off-line with the OLS algorithm. In order to optimize the position of these new centers, such new centers are combined with centers selected off-line, by the on-line implementation of the OLS algorithm, as stated in the following Section 3.6.

4.1. The kinematic controller

3.6. On-line OLS algorithm

AT ðqÞ_q ¼ ½sinðym Þ

The introduction of new centers detected by the above algorithm, can be optimized by the use of an on-line OLS algorithm which adds new centers to the NN on the basis of new input vectors which are outside the input space covered by the RBFN. If f r ðxÞ : Rn ! R is the off-line trained NNs with a number ns of centers, the algorithm, on the basis of the new input vectors computes a new NN f r ðxÞ : Rn ! R of the form

which corresponds to the hypothesis of pure rolling and non-slipping condition. Deﬁning a full rank matrix SðqÞ, such that AT ðqÞSðqÞ ¼ 0, a vector u exists satisfying

c ¼ f r ðxÞ ¼ l0 þ

ns X

li fðkx ci kÞ

i¼1

þ

nr X

li fðkx ci kÞ,

ð10Þ

i¼ns þ1

where ci ; i ¼ ns þ 1; . . . ; nr are the new centers introduced for covering the new input vectors. The new centers ci ; i ¼ ns þ 1; . . . ; nr are derived by the k-means clustering algorithm recalled in Section 3.2 which is now applied to the set of the new input vectors which are outside the input space covered by f r ðxÞ : Rn ! R. With these set of centers ci ; i ¼ ns þ 1; . . . ; nr the OLS learning algorithm (Chen et al., 1991; D’Amico et al., 2001a) is on-line implemented for deducing the strictly necessary centers ci ; i ¼ ns þ 1; . . . ; ns , with ns pnr , which are orthogonal among them and orthogonal to the off-line determined centers ci ; i ¼ 1; 2; . . . ; ns .

A unicycle mobile robot is fully described by a threedimensional vector of generalized coordinates q constituted by the coordinates ðxm ; ym Þ of the midpoint between the two robot driving wheels and by the orientation angle ym with respect to a ﬁxed frame (see Fig. 2): q ¼ ½q1

q2

q3 T ¼ ½xm

ym

ym T .

The vehicle is subjected to an independent velocity constraint of the form cosðym Þ 0_q ¼ 0,

q_ ¼ SðqÞu,

(12)

(13)

where u :¼ ½v oT , with o ¼ y_ m (the angular velocity) and qﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ v ¼ x_m 2 þ y_m 2 (the linear velocity). Eq. (13) is the wellknown kinematic model of the unicycle vehicle, with cosðym Þ sinðym Þ 0 T . (14) SðqÞ ¼ 0 0 1 Now consider a reference trajectory of the form x_ r ¼ vr cosðyr Þ, y_ r ¼ vr sinðyr Þ, y_ r ¼ or ,

ð15Þ

where vr and or are the linear and angular reference velocities, respectively, and deﬁne the tracking errors as #" " # " # xr xm cosðym Þ sinðym Þ e1 :¼ , yr ym sinðym Þ cosðym Þ e2 e3 :¼ yr ym .

ð16Þ

Y

4. Architecture of the real-time multiple models neural network controller

v(t) ym (t)

The use of multiple models in an adaptive and learning control architecture is developed for the control of mobile robots. The main idea is to generate the robot control inputs making use of the model which closely approximates the mobile robot at each instant. This leads to a switching control scheme which improves the performance of the tracking errors. The technical aspects of the proposed solution are stated in the following, where the kinematic controller of a mobile robot and its implementation with NNs are preliminary recalled.

(11)

C

θm(t)

O xm (t) Fig. 2. Mobile robot with its vector of generalized coordinates.

X

ARTICLE IN PRESS P. Antonini et al. / Control Engineering Practice 14 (2006) 1279–1295

1284

The dynamics of the robot tracking errors has the following form (de Wit, Khennouf, Samson, & Sordalen, 1993): 32 3 2 3 2 e1 e_ 1 0 o 0 76 7 6 7 6 6 e_ 2 7 ¼ 6 o 0 0 76 e2 7 54 5 4 5 4 _e3 e3 0 0 0 3 2 3 2 vr cosðe3 Þ 1 0 " # 7 v 6 7 6 7 6 0 7 . ð17Þ þ6 5 4 vr sinðe3 Þ 5 þ 4 0 o or 0 1 In view of the experimental application of proposed control laws the controller design has been performed in the discrete time framework. Therefore, the model (17) has been discretized with a sampling time T c . Integrating equations (17) from kT c to ðk þ 1ÞT c and assuming the control efforts u :¼ ½v oT to be constant over the same time-interval ½kT c ðk þ 1ÞT c Þ, the following discretetime model is obtained: e1 ðk þ 1Þ ¼ e1 ðkÞ þ f 1 ðe2 ÞoðkÞ þ f 2 ðe3 ; vr Þ T c vðkÞ, e2 ðk þ 1Þ ¼ e2 ðkÞ þ f 3 ðe1 ÞoðkÞ þ f 4 ðe3 ; vr Þ, e3 ðk þ 1Þ ¼ e3 ðkÞ þ f 5 ðor Þ T c oðkÞ,

ð18Þ

where the argument k denotes the variable evaluated in t ¼ kT c and f i ; i ¼ 1; . . . ; 5 are the following non-linear function: Z ðkþ1ÞT c f 1 ðe2 Þ :¼ e2 ðtÞ dt, kT c

Z

vr ðtÞ cosðe3 ðtÞÞ dt,

Z

e1 ðtÞ dt, ðkþ1ÞT c

f 4 ðe3 ; vr Þ :¼ Z

vr ðtÞ sinðe3 ðtÞÞ dt, kT c ðkþ1ÞT c

f 5 ðor Þ :¼

e1 ðkÞ

or ðtÞ dt.

ð19Þ

kT c

Consider the following control laws for the linear and angular velocities: 1 ðk1 e1 ðkÞ þ f 1 ðe2 ÞoðkÞ þ f 2 ðe3 ; vr ÞÞ, Tc 1 ðf ðor Þ þ k2 e2 ðkÞ þ k3 e3 ðkÞÞ, oðkÞ ¼ Tc 5

3

6 7 T c vr ðkÞ 7 54 e2 ðkÞ 5. e3 ðkÞ ð1 k3 Þ (21)

Hence, the control gains k1 ; k2 and k3 of control law (20) are chosen to impose the asymptotic stability of (21), i.e. the eigenvalues l1 ; l2 and l3 of system (21) are imposed in module less than one. This choice guarantees the local asymptotic stability of the closed-loop system described by Eqs. (18) and (20). The eigenvalues l2 and l3 of (21) depend on vr ðkÞ, therefore the gain k2 must be related to the reference velocity vr ðkÞ (velocity scaling (de Wit et al., 1993)). For example, eigenvalues l1 , l2 and l3 are assigned to system (21) by the following choice of the gains k1 , k2 and k3 : k1 ¼ 1 þ l1 , k2 ¼ ðl1 l2 þ l1 l2 þ 1Þ=T c vr , k3 ¼ 2 l1 l2 . 4.2. Radial basis function neural networks control for mobile robots Before to introduce the multiple models control, a simple RBFN-based ﬁxed controller for mobile robots is recalled (D’Amico et al., 2001b). In this controller, the universal approximation ability of NNs is used for the estimation of f 2 ðe3 ; vr Þ and f 5 ðor Þ in the implementation of (20), with the integral term f 1 ðe2 Þ substituted with its rectangular approximation expressed by: (22)

k1 1 ^ f ðkÞ, e1 ðkÞ þ oðkÞe2 ðkÞ þ Tc 2 Tc k2 k3 1 ^ f ðkÞ, oðkÞ ¼ e2 ðkÞ þ e3 ðkÞ þ ð23Þ Tc 5 Tc Tc where the estimation f^2 of f 2 ðe3 ; vr Þ is performed by the neural net ha : R‘a þ2 ! R of the form vðkÞ ¼

kT c

Z

32

Therefore, the developed control law has the following form:

kT c ðkþ1ÞT c

f 3 ðe1 Þ :¼

0

f 1 ðe2 Þ ’ T c e2 ðkÞ.

ðkþ1ÞT c

f 2 ðe3 ; vr Þ :¼

form: 2 3 2 0 ð1 k1 Þ e1 ðk þ 1Þ 6 e ðk þ 1Þ 7 6 0 1 4 2 5¼4 e3 ðk þ 1Þ 0 k2

a

a

a

a

y ¼ h ðx Þ ¼

la0

þ

nr X

la‘ fðkxa ca‘ kÞ

(24)

‘¼1

with the output ya ðkÞ :¼ f^2 ðkÞ and the input pattern xa 2 R‘a þ2 deﬁned as

vðkÞ ¼

ð20Þ

where k1 , k2 and k3 are the control gains. Under the assumption of vr ðkÞ different to zero (de Wit et al., 1993) and a small sampling time (T c is chosen small enough to produce good estimations of simple integral terms with their rectangular approximations), the closed-loop system described by Eqs. (18) and (20) around the equilibrium point ðe1 ¼ e2 ¼ e3 ¼ 0Þ has the following

xa :¼ ½e3 ðkÞ e3 ðk 1Þ vr ðk ‘a þ 1Þ vr ðk ‘a þ 2Þ vr ðk 1Þ vr ðkÞT ,

ð25Þ

the estimation f^5 of f 5 ðor Þ is performed by the neural net hb : R‘b ! R of the form: b

b

b

b

y ¼ h ðx Þ ¼

lb0

þ

nr X ‘¼1

lb‘ fðkxb cb‘ kÞ

(26)

ARTICLE IN PRESS P. Antonini et al. / Control Engineering Practice 14 (2006) 1279–1295

with the output yb ðkÞ :¼ f^5 ðkÞ and the input pattern xb 2 R‘b deﬁned as xb :¼ ½or ðk ‘b þ 1Þ or ðk ‘b þ 2Þ or ðk 1Þ or ðkÞT .

ð27Þ

The net input vectors xa and xb are composed of the tracking error measures and by the linear and angular velocities of the reference trajectories acquired in a ﬁnite set of successive sampling time-instants. The terms la‘ , lb‘ , ca‘ and cb‘ of the NNs ha and hb are off-line estimated by training on a set of experimental data given by the pairs ðxa ; ya Þ and ðxb ; yb Þ composed of the measures of the robot tracking errors and of the actual and reference robot control efforts acquired on a set of reference trajectories tracked by the mobile robot in real environments. In particular xa and xb , which have the form (25) and (27), are the input patterns of networks ha and hb and ya and yb are the desired outputs of the networks ha and hb , which from Eq.(18) have the forms ya ¼ e1 ðk þ 1Þ e1 ðkÞ T c e2 ðkÞoðkÞ þ T c vðkÞ, yb ¼ e3 ðk þ 1Þ e3 ðkÞ T c oðkÞ,

ð28Þ

where e1 ðÞ, e2 ðÞ, e3 ðÞ and vðÞ, oðÞ are the measures of the robot tracking errors and of the actual control efforts, respectively, acquired on a set of reference trajectories tracked by the mobile robot. The algorithm used for the net training minimizes the NN functional approximation errors along the considered reference trajectories. By (18) the NN functional approximation errors ra and rb of the neural networks ha and hb have the following expressions:

where eji ðÞ and ei ðÞ, i ¼ 1; 2; 3, are the model tracking errors and the measured robot tracking errors respectively, with j ¼ 1; 2; . . . ; L. Moreover, a set of L controllers fC j gLj¼1 , implementing the following control laws, respectively, for the linear and angular velocities of the mobile robot, is considered: 1 ðk1 e1 ðkÞ þ f j1 ðe2 ÞoðkÞ þ f j2 ðe3 ; vr ÞÞ, Tc 1 j ðf ðor Þ þ k2 e2 ðkÞ þ k3 e3 ðkÞÞ, oj ðkÞ ¼ Tc 5

vj ðkÞ ¼

T c e2 ðkÞoðkÞ þ T c vðkÞ f^2 ðkÞ, rb ðkÞ :¼ f f^ ¼ e3 ðk þ 1Þ e3 ðkÞ 5

T c oðkÞ f^5 ðkÞ.

ð29Þ

The training algorithm based on the OLS algorithm recalled in Section 3.1 is proposed for the net tuning. This RBFN-based controller implementing the control law (23) with f^2 and f^5 estimated by NNS (24) and (26) is used in the multiple models control scheme introduced in the next Section 4.3. 4.3. Multiple models framework for mobile robot control Consider a set of L discrete-time models fM j gLj¼1 that have the same form as the dynamics of the mobile robot tracking errors (18) ej1 ðk þ 1Þ ¼ ej1 ðkÞ þ f j1 ðe2 ÞoðkÞ þ f j2 ðe3 ; vr Þ T c vðkÞ, ej2 ðk þ 1Þ ¼ ej2 ðkÞ þ f j3 ðe1 ÞoðkÞ þ f j4 ðe3 ; vr Þ, ej3 ðk þ 1Þ ¼ ej3 ðkÞ þ f j5 ðor Þ T c oðkÞ,

ð30Þ

ð31Þ

where the controller C j is designed for controlling the corresponding model M j ; j ¼ 1; 2; . . . ; L. In the proposed multiple models controller (see part Multiple models controller in Fig. 3), at every time instant k only one controller of the set fC j gLj¼1 is applied to the robot and a parallel architecture of L models fM j gLj¼1 yields the outputs yj ðkÞ :¼ ½ej1 ðkÞ ej2 ðkÞ ej3 ðkÞT ; j ¼ 1; 2; . . . ; L, where each model M j is driven by the same control input uðkÞ :¼ ½vðkÞ oðkÞT of the robot. With this control structure, the performance of the different controllers cannot be directly evaluated in

ra ðkÞ :¼ f 2 f^2 ¼ e1 ðk þ 1Þ e1 ðkÞ

5

1285

Fig. 3. Hierarchical control scheme.

ARTICLE IN PRESS P. Antonini et al. / Control Engineering Practice 14 (2006) 1279–1295

1286

parallel because at any time instant only one control input can be chosen. Therefore, the output responses of the multiple models are used for the estimation of the controller performance. Deﬁning the prediction errors ej ðkÞ :¼ yðkÞ yj ðkÞ;

j ¼ 1; 2; . . . ; L,

(32)

where yðkÞ :¼ ½e1 ðkÞ e2 ðkÞ e3 ðkÞT is the vector of the measured robot tracking errors, the following performance indices are considered: J j ðkÞ :¼ aeTj ðkÞej ðkÞ þ b

k1 X

l ðkt1Þ eTj ðtÞej ðtÞ,

t¼k0

j ¼ 1; 2; . . . ; L; kX0; k 2 Z,

ð33Þ

where the design parameters a and b are positive constants, the module of l is lesser then one and k0 is the initial time instant. The parameters a, b determine the relative weight given by instantaneous and long-term measures, while l determines the memory of the index. In this multiple models control scheme, at every time instant the performance indices are monitored in order to deduce the controller to be applied to the robot, i.e. the controller corresponding to the model with the minimum index is used (switched) for controlling the robot. Different models fM j gLj¼1 representing different operative conditions and different control strategies for the controllers fC j gLj¼1 can be incorporated into this multiple models framework. Moreover, in this approach when a new conﬁguration of the mobile robot occurs, a new model M Lþ1 has to be learnt on line becoming an a priori known model with a corresponding controller C Lþ1 for all future occurrences of the same mobile robot conﬁguration. The proposed multiple models control scheme has been developed with a starting conﬁguration incorporating three different controllers C 1 ; C 2 ; and C 3 , i.e. L ¼ 3. This simple structure based on three controllers meets the requirement of reduced implementation complexity with the capability to cope with environment disturbances and parameter uncertainties. In fact C 1 is a ﬁxed control, C 2 is an adaptive control and C 3 is an adaptive-learning control. Each controller C j is designed for controlling the corresponding model M j .

(i) The ﬁrst controller C 1 is a ﬁxed RBFN-based control system described by (31) with j ¼ 1, where the terms f 12 ðe3 ; vr Þ and f 15 ðor Þ are substituted by their estimates obtained by ﬁxed off-line trained RBFNs ha1 and hb1 , which have the same structures of (24) and (26), respectively. These nets are off-line trained with data acquired in a set of trajectories chosen with different shapes, curvature radius and tracking velocities as described in Section 4.2. The corresponding model M 1 has the form (30) with j ¼ 1, where f 12 ðe3 ; vr Þ and f 15 ðor Þ are estimated by ha1

and hb1 , respectively, and f 11 ðe2 Þ, f 13 ðe1 Þ and f 14 ðe3 ; vr Þ are approximated by T c e2 ðkÞ, T c e1 ðkÞ and T c e3 ðkÞvr ðkÞ, respectively. (ii) The second controller C 2 described by (31) with j ¼ 2, is a single model adaptive RBFN-based controller, where the terms f 22 ðe3 ; vr Þ and f 25 ðor Þ are substituted by their estimates obtained by adaptive RBFNs ha2 and hb2 , which have the same structures of (24) and (26), respectively. In this controller, adaptive RBFNs ha2 and hb2 are considered, where the weights of the nets second layers are on-line tuned making use of the on-line adaptive algorithm described in Section 3.4. The corresponding model M 2 has the form (30) with j ¼ 2, where f 22 ðe3 ; vr Þ and f 25 ðor Þ are estimated by the considered NNs ha2 and hb2 , respectively, and f 21 ðe2 Þ, f 23 ðe1 Þ and f 24 ðe3 ; vr Þ are approximated by T c e2 ðkÞ, T c e1 ðkÞ and T c e3 ðkÞ vr ðkÞ, respectively. (iii) The third controller C 3 described by (31) with j ¼ 3, is an adaptive-learning RBFN-based controller, where the terms f 32 ðe3 ; vr Þ and f 35 ðor Þ are substituted by their estimates obtained by adaptive-learning RBFNs ha3 and hb3 , that have the same structures of (24) and (26), respectively. In this controller, adaptive-learning RBFNs ha3 and hb3 are considered where the number of centers of the net ﬁrst layer and the weight of the net second layer are on-line updated making use of the on-line adaptive clustering algorithm described in Section 3.5. The corresponding model M 3 has the form (30) with j ¼ 3, where f 32 ðe3 ; vr Þ and f 35 ðor Þ are estimated by ha3 and hb3 , respectively, and f 31 ðe2 Þ, f 33 ðe1 Þ and f 34 ðe3 ; vr Þ are approximated by T c e2 ðkÞ, T c e1 ðkÞ and T c e3 ðkÞ vr ðkÞ, respectively. During the robot tasks new controllers are derived by new NNs-based models M j of the form (30) where the terms f j2 ðe3 ; vr Þ and f j5 ðor Þ with j ¼ 4; 5; . . ., are estimated with NNs where the centers of the ﬁrst layer and the weight of the second layer are updated by the on-line OLS algorithm recalled in Section 3.6. The new NNs obtained with this on-line algorithm characterize new models able to describe new operative conditions. These new models M Lþi are stored and used as a priori known models with the corresponding controllers C Lþi for all future occurrences. This requires to introduce new performance indices J Lþi ; i ¼ 1; 2; . . ., that are monitored as the indices considered at the starting conﬁguration in order to deduce the controller to be applied to the robot. In this way the set of models increases during the robot tasks and the new models can be also stored at the end of robot tasks and used in a new trajectory. The implementation of the developed multiple models control scheme requires a hierarchical architecture, where the control actions and the on-line learning strategy are two independent processes coordinated by a

ARTICLE IN PRESS P. Antonini et al. / Control Engineering Practice 14 (2006) 1279–1295

1287

Fig. 4. Multiprocessor system.

supervisor (see Fig. 3). The supervisor task is double: to monitor the performance indices J j expressed by (33) with j ¼ 1; 2; . . . ; L þ i and i ¼ 1; 2; . . ., and to manage the on-line procedure recalled in Section 3.6. For this second task the supervisor checks on-line if the outputs of the RBFNs hidden layer are under a proper ﬁxed threshold for a speciﬁed time lag. When this threshold is exceeded, this means that the new set of input vectors is outside the input space covered by RBFNs and the online OLS algorithm recalled in Section 3.6 is applied for computing a new NN, describing a new model. The supervisor adds new models with the relative controllers to the multiple models control scheme increasing its ability to cope with different operating environments. This solution is computation intensive and requires a high-performance computer architecture based on a multiprocessor system. The developed hierarchical control structure suggests to share the computation on two independent processors, where a processor is dedicated to the implementation of the real-time multiple models control scheme and of the supervisor, while on the second processor the on-line OLS algorithm recalled in Section 3.6 is implemented. The processors are independent and an asynchronous data link is used for coordinating the supervisor with the on-line learning algorithm.

The architecture of the developed multiprocessors controller is represented in Fig. 4. Two independent personal computers are used. On the ﬁrst personal computer PC1 the functions of the multiple models controller are implemented and all the function for monitoring the existing models are pointed out in Fig. 4. PC1 acquires also the sensor measurements on the mobile base and it manages the transmission of the acquired data to the second personal computer. On the second personal computer PC2 the on-line implementation of the OLS algorithm is performed. The procedures implemented on the independent personal computers PC1 and PC2 do not require strong interaction. In fact there are asynchronous reduced data exchanges between the two computers only when the learning procedures locate a new network model.

5. Experimental results The real-time experimental tests of the control algorithms have been performed on the LabMate mobile robot, available at the Robotics Lab of the Universita` Politecnica delle Marche and provided with an automatic navigation system (Conte, Longhi, & Zulli, 1995). The navigation system is based on a hybrid structure

ARTICLE IN PRESS 1288

P. Antonini et al. / Control Engineering Practice 14 (2006) 1279–1295

(Angeloni, Leo, Longhi, & Zulli, 1996; Conte, Longhi, & Zulli, 1996; Jetto, Longhi, & Venturini, 1999) composed of the motion planning module which runs off-line, the localization and control modules, the perception module and the obstacle avoidance module which work on-line with real-time implementations. The path planning module considers the non-holonomic constraints of the mobile robot and the a priori knowledge of the environment for planning a collision free path from given start and goal conﬁgurations (Conte et al., 1996). In real-time the localization and the control modules generate the estimate of the position and orientation of the mobile robot making use of external sensors (sonar sensors) and internal sensors (dead reckoning sensors that are incremental encoders on the driving wheels) (Jetto, Longhi, & Venturini, 1999; Jetto, Longhi, & Vitali, 1999) and the control inputs of the vehicle (the linear and the angular velocities). The absolute localization module based on adaptive extended Kalman ﬁlter (Jetto, Longhi, & Venturini, 1999; Jetto, Longhi, & Vitali, 1999) guarantees high accuracy with average error in module lower than 1 cm. The environment perception module acquires data on the environment by the same set of external sensors (sonar sensors) and produces a local map of the environment with the detected obstacles (Angeloni et al., 1996). This map is used by the real-time obstacle avoidance module for modifying the planned trajectory to be tracked with a new collision free trajectory. All these modules of the navigation system are implemented on a PC-based architecture composed by two independent personal computers PC1 and PC2, connected by an Ethernet link and both installed on the vehicle as discussed at the end of the above Section 4.3. PC1 is a 486 DX2 with PC 104 bus and PC2 is a Pentium II. On PC1 the functions of the multiple model controller are implemented together with the conditioning and acquisition procedures of sensor measurements. The path planning module, the real-time localization module and the real-time obstacle avoidance module are implemented on PC2 together with the on-line implementation of the OLS algorithm. Fig. 5 shows the LabMate mobile base equipped with the PC-based architecture. In the developed control scheme, the controllers C j ; j ¼ 2; 3; . . ., are based on NNs trained by the online algorithms recalled in Sections 3.4 – 3.6 starting by the off-line trained NNs that are used for implementing controller C 1 (see Section 4.2). Therefore, RBFNs with one hidden layer with gaussian activation functions and one output layer with linear activation functions have been employed for the estimation of f j2 ðe3 ; vr Þ and f j5 ðor Þ in the control law (31), j ¼ 1; 2; . . . ; L þ i, i ¼ 1; 2; . . ., with the integral term f j1 ðe2 Þ substituted with its rectangular approximation expressed by (22) with f j1 ðe2 Þ instead of f 1 ðe2 Þ.

Fig. 5. LabMate mobile robot.

The proposed control law has the following form: j 1 ðk1 e1 ðkÞ þ oðkÞe2 þ f^2 ðe3 ; vr ÞÞ, Tc 1 ^j ðf ðor Þ þ k2 e2 ðkÞ þ k3 e3 ðkÞÞ, ð34Þ oj ðkÞ ¼ Tc 5 j where the estimate f^2 of f j2 ðe3 ; vr Þ is performed by the neural net haj : R‘a þ2 ! R of the form

vj ðkÞ ¼

a;j

yaj

¼

haj ðxa Þ

¼

la;j 0

þ

nr X

a;j a la;j ‘ fðkx c‘ kÞ

(35)

‘¼1 j with the output yaj ðkÞ :¼ f^2 ðkÞ and the input pattern xa 2 j R‘a þ2 deﬁned as in (25) and the estimation f^5 of f j5 ðor Þ is performed by the neural net hbj : R‘b ! R of the form: b;j

ybj

¼

hbj ðxb Þ

¼

lb;j 0

þ

nr X

b;j b lb;j ‘ fðkx c‘ kÞ

(36)

‘¼1 j

with the output ybj ðkÞ :¼ f^5 ðkÞ and the input pattern xb 2 R‘b deﬁned as in (27). In the following subsection, the implementation and technical details of ﬁxed RBFNs of controller C 1 , that are the starting conﬁguration for nets of the other controllers C j ; j ¼ 2; 3; . . . ; are described. 5.1. Structure and validation of implemented fixed RBFNs The training and testing phases of NNs of C 1 have been performed with data acquired in a set of planned trajectories where experimental data have been used composed by the measures of the robot tracking errors and of the actual and reference control efforts applied to

ARTICLE IN PRESS P. Antonini et al. / Control Engineering Practice 14 (2006) 1279–1295

the robot along the set of trajectories. A set of 20 trajectories, chosen with different shapes, curvature radius and tracking velocities, have been used for the training phase. For each trajectory from 180 to 400 pairs of input and output samples have been considered. In particular the set of experimental data is given by the pairs ðxa ; ya Þ and ðxb ; yb Þ which have the forms speciﬁed in (25), (27), (28). These data acquired by the mobile base in the considered real operative conditions are used for training the nets by OLS-based algorithm. Solutions found by the algorithm may be not optimal but useful if yield satisfactory performance. b;1 The positive integers na;1 r , nr , ‘ a and ‘ b , which specify the complexity of RBFNs described by (35) and (36) for j ¼ 1, have been chosen for matching the approximation capability of the nets to a low net complexity which is necessary for a real-time implementation on the control module of the mobile base. The number of neurons has been chosen for obtaining a good trade off among the time complexity of learning, the computation efforts of the resulting NNs control schemes and the accuracy of predictions for obtaining satisfactory control performance. Therefore, the hidden layers of nets (35) and (36) for j ¼ 1 are chosen with a number of neurons na;1 r ¼6 and nb;1 r ¼ 9. The number of input variables, stated in relations (25) and (27), is speciﬁed with ‘a ¼ ‘b ¼ 2. This choice qualiﬁes the nets to learn the mobile robot behaviors in steady-state conditions for linear and angular accelerations and this choice is adequate for the smooth trajectories produced by the path planner and by the obstacle avoidance module implemented in the navigation system. Different choices with an increment of the number of inputs, as for example with ‘a ¼ ‘b ¼ 3, have been tested without signiﬁcative improvements of the performance. A tolerance r ¼ 0:01 has been chosen in relation (8) for balancing the accuracy and complexity of the ﬁnal network. The choices operated on the structure of the nets represented by (35) and (36) with j ¼ 1, have been conﬁrmed by the performed cross-validation tests, which have been experimentally used to monitor the performance generalization of the networks in order to avoid the overﬁt of the data used for training the nets. Different robot trajectories not used during the training phase have been considered for cross-validation tests. The trained NNs have shown good capability to generalize the predictions of f 2 and f 5 in all considered robot validation trajectories different to the robot trajectories of the training phase. Samples of the performed tests are shown in Figs. 6 and 7 for nets (35) and (36) with j ¼ 1, respectively, where for robot trajectories not used for the training phase, the nets (35) and (36) with j ¼ 1 used in the ﬁrst controller C 1 show the good capability to reproduce the terms f 2 and f 5 . In Figs. 6(a) and 7(a), the dashed lines are the desired net outputs f 2 and f 5 , respectively and the continuous lines

1289

70

60

50

40

30

20

10 0

50

100

150 samples

200

250

0

50

100

150 samples

200

250

(a) 4 3 2 1 0 -1 -2 -3 (b)

Fig. 6. Cross-validation of ha1 : (a) the continuous line denotes the 1 output f^2 ðkÞ of ha1 and the dashed line denotes the desired output f 2 ðkÞ and (b) difference between the desired output and the net output.

are the nets outputs f^12 and f^15 , respectively. The difference between the desired output and the net output is shown in Figs. 6(b) and 7(b) for nets described by (35) and (36) with j ¼ 1, respectively. The Anderson’s test (Mihram, 1972) that analyzes the sampled autocorrelation function of the tested signal to determine its whiteness has been also used for NNs 1 validation. The estimate residuals r1 :¼ f f^ , r1 :¼ a

1

2

2

b

f 5 f^5 , which have the form (29), are obtained from the identiﬁcation performed by the two NNs used in the ﬁrst controller C 1 . These residuals are generally reduced (the mean-square NNs functional approximation errors are less than 103 ) and white. The autocorrelation functions of residuals lie within the 95% of the considered conﬁdence strip, as shown in Fig. 8 where a sample of

ARTICLE IN PRESS P. Antonini et al. / Control Engineering Practice 14 (2006) 1279–1295

1290

20

1.2

15

1

10

0.8

5

0.6

0

0.4

-5

0.2

-10

0

-15

-0.2 -0.4

-20 0

50

100

(a)

150 samples

200

250

0

50

100

0

50

100

(a)

150 samples

200

250

200

250

0.3

1.2 0.2

1

0.1

0.8

0

0.6

-0.1

0.4 0.2

-0.2

0 -0.3

-0.2 -0.4 0 (b)

50

100

150 samples

200

250

Fig. 7. Cross-validation of hb1 : (a) the continuous line denotes the 1 output f^5 ðkÞ of hb1 and the dashed line denotes the desired output f 5 ðkÞ and (b) difference between the desired output and the net output.

these tests are reported for the same validation trajectories considered in Figs. 6 and 7. Figs. 8(a) and 8(b) show the autocorrelation functions relative to the estimate residuals of networks (35) and (36) for j ¼ 1, respectively. As above stated, the nets implementing controllers C j ; j ¼ 2; 3; . . . ; are chosen with the same structure of nets used in the implementation of C 1 . In particular the RBFNs of controllers C j ; j ¼ 2; 3; . . . ; have the structures described by (35) and (36) with the same number of input variables, i.e. ‘a ¼ ‘b ¼ 2. Moreover the number b;2 b;1 of neurons na;2 coincides with na;1 r ; nr r ; nr , while b;j na;j ; n , j ¼ 3; 4; . . . ; could increase with respect to the r r b;1 starting values na;1 ; n (as stated in Sections 3.5 and r r 3.6) to cope with the environment disturbances and parameter uncertainties.

-0.4 (b)

150 samples

Fig. 8. Autocorrelation functions of the estimate residuals obtained from the identiﬁcation performed by neural networks ha1 and hb1 . The dashed straight lines denote the 95% conﬁdence strip: (a) autocorrelation function of the estimate residuals r1a of neural network ha1 and (b) autocorrelation function of the estimate residuals r1b of neural network hb1 .

5.2. Experimental tests for the multiple models-based control scheme The real-time multiple models RBFN-based controller and the supervisor have been implemented on PC1. A sampling time of 0.4 s has been used, in order to guarantee a maximum vehicle velocity of 0.3 m/s. Moreover, PC1 acquires data from the actual trajectory for RBFNs on-line learning. The data are sent to PC2 connected to PC1 by an Ethernet link. PC2 implements the on-line OLS algorithm for updating the off-line determined RBFN centers using the data acquired by the local net. The new NNs are then sent to the PC1 and are used by the control module to add a new model

ARTICLE IN PRESS

mm

P. Antonini et al. / Control Engineering Practice 14 (2006) 1279–1295

3000 2500 Constant disturbances start point (20th sample) 2000 1500 1000 500 100th sample 0 1st sample -500 150th sample -1000 Constant disturbances end point -1500 (200th sample) -2000 -2500 250th sample -3000 -3500 -3500 -3000 -2500 -2000 -1500 -1000 -500 mm

0

500

Fig. 9. Planned trajectory.

1291

M j ; j ¼ 4; 5; . . . ; with the relative controller to the multiple model control scheme (see Fig. 4). The proposed multiple models RBFN-based control scheme has been tested in many different environmental conditions and with different planned trajectories obtaining satisfactory results. Signiﬁcant samples of the performed experimental tests on the proposed controller are reported in the following. The planned trajectories of Figs. 9 and 12 are considered. The proposed multiple models RBFN-based controller has been tested with constant unmeasurable disturbances acting on the control inputs of the mobile robot (de Wit & Khennouf, 1995). The constant disturbances start at the 20th and 50th sample and end at the 200th and 150th sample for trajectories of Figs. 9 and 12, respectively. In the performed experimental tests, the vehicle covers two time the same trajectory with the same constant

40

40

30

20

20

0

mm

mm

10 0

-20 -40

-10 -60 -20 -80

-30

-100

-40 0 (a)

50

100

150

200

0

250

samples

50

100

150

200

250

samples

(b)

0.5 0.4 0.3 0.2

rad

0.1 0 -0.1 -0.2 -0.3 -0.4 -0.5 0 (c)

50

100

150

200

250

samples

Fig. 10. (a)–(c) tracking errors e1 ðkÞ, e2 ðkÞ and e3 ðkÞ, respectively. The dotted lines and the continuous lines have been obtained when the vehicle covers the ﬁrst time and the second time the trajectory, respectively.

ARTICLE IN PRESS P. Antonini et al. / Control Engineering Practice 14 (2006) 1279–1295

1292

4

index j

3

2

1 0

50

100

150

200

250

200

250

samples

(a) 4

index j

3

2

1 0

50

100

150 samples

(b)

Fig. 11. Index j of the applied sequence of regulators C j . (a) The ﬁrst time the robot covers the trajectory (j ¼ 1; 2; 3) and (b) the second time the robot covers the trajectory (j ¼ 1; 2; 3; 4).

7000 6000

Constant disturbances start point (50th sample)

5000

mm

4000 3000 1st sample

2000 1000 0

Constant disturbances end point (150th sample) 200th sample

-1000 -3500 -3000 -2500 -2000 -1500 -1000 -500 mm Fig. 12. Planned trajectory.

0

500

disturbances acting on the control inputs in order to verify the learning capability of the multiple models control scheme based on the multiprocessor system. In these tests, the multiprocessor control scheme stores the new operating conditions through new NNs trained by the on-line OLS learning algorithm. These nets are used for implementing model M 4 and the relative controller C 4 . Figs. 10 and 13 show the tracking errors obtained with the developed multiprocessor control scheme. The dotted lines represent the tracking errors of the proposed multiprocessor controller when the vehicle covers the ﬁrst time the planned trajectory, while the continuous lines represent the tracking errors when the vehicle covers the second time the trajectory with the same constant disturbance acting on the control inputs. In this second case the control scheme is based on 4 models, where the fourth model M 4 is that trained by the on-line OLS learning algorithm when the mobile base covers for the ﬁrst time the trajectory. The ﬁrst time that the mobile base covers the trajectories of Figs. 9 and 12, the multiprocessor control scheme uses controllers C j ; j ¼ 1; 2; 3, and the applied sequence of C j ; j ¼ 1; 2; 3, is displayed in Figs.11(a) and 14(a) for trajectories of Figs. 9 and 12, respectively; while the second time that the robot covers the same trajectory, the sequence of applied controllers C j ; j ¼ 1; 2; 3; 4, is displayed in Figs. 11(b) and 14(b) for trajectories of Figs. 9 and 12, respectively (Figs. 9–14). Note that the considered multiple models control structure switches on the adaptive and adaptive-learning controllers C j ; j ¼ 2; 3; 4 only in the presence of constant disturbance inputs, switching back to the ﬁxed controller C 1 when the disturbances end. This fact shows that the ﬁxed RBFN-based controller is suitable for trajectories without disturbances. In fact, it is based on NNs trained on a large set of different trajectories without disturbances and therefore these nets have the ability to generalize to trajectories without disturbances. For a sample of signiﬁcant trajectories, characterized by different shapes, curvature radius and tracking velocities, the reduction of the tracking errors obtained by the multiprocessor control scheme is summarized in Table 1(a). The considered tests have been performed considering constant unmeasurable disturbances acting on the control inputs of the mobile robot (de Wit & Khennouf, 1995). In all performed tests the investigated multiprocessor RBFN-based control scheme is satisfactory in terms of tracking errors and it shows an improvement of the control performance when the vehicle covers the second time the trajectories in the case of constant unmeasurable disturbance acting on the vehicle. Moreover, the performed experimental tests show a noticeable improvement of the control performance of the multiprocessor RBFN-based control scheme with respect to ﬁxed and adaptive RBFN-based

ARTICLE IN PRESS P. Antonini et al. / Control Engineering Practice 14 (2006) 1279–1295

1293

40

60

20

40

0 -20

0

mm

mm

20

-20

-40 -60 -80

-40 -100 -60

-120 -140

-80 0

20

(a)

40

60

80

100 120 140 160 180 200

samples

0

20

40

60

(b)

80

100 120 140 160 180 200 samples

0.4 0.3 0.2

rad

0.1 0 -0.1 -0.2 -0.3 -0.4 0

20

40

60

(c)

80

100 120 140 160 180 200 samples

Fig. 13. (a)–(c) tracking errors e1 ðkÞ, e2 ðkÞ and e3 ðkÞ, respectively. The dotted lines and the continuous lines have been obtained when the vehicle covers the ﬁrst time and the second time the trajectory, respectively.

controllers (D’Amico et al., 2001a,b) as summarized in Tables 1(b) and (c).

6. Conclusions In this paper, a multiprocessor multiple models RBFN-based approach for the tracking problem of a mobile vehicle is analyzed. The proposed controller has been developed and compared also to ﬁxed and adaptive NN-based controllers by experimental tests performed on the mobile base LabMate. The proposed approach is an application of a general learning control methodology which is based on the use of multiple models, switching and tuning (Chen & Narendra, 2000; Narendra & Xiang, 2000). Fixed, adaptive and adaptive-learning RBFN-based control strategies are integrated into this multiple

models framework. The NNs have been trained on experimental data acquired from the mobile base LabMate. Different training trajectories have been considered. The proposed control algorithm makes use of multiple models of the mobile robot and of a switching scheme which determines the speciﬁc control input that drives the robot actuators. The main idea is to generate a control input which is based on the identiﬁcation model which best approximates the robot at that instant. Moreover, the proposed adaptivelearning control scheme is able to introduce new models during robot tasks. Therefore, the developed multimodel control scheme increases its knowledge on the environment by the introduction of new models and related controllers and it is able to cope with timevarying operating conditions. The real-time implementation of this switching controller with on-line learning of new models and

ARTICLE IN PRESS P. Antonini et al. / Control Engineering Practice 14 (2006) 1279–1295

1294

Table 1

4

e e

Percent mean square errors ei % :¼ i ei i 100, i ¼ 1; 2; 3; where ei is the mean square tracking error of the multi-processor control scheme when the robot covers the second time the considered trajectories T j ; j ¼ 1; 2

index j

3

2

1 0

20

40

60

80

(a)

100 120 140 160 180 200 samples

4

T1

T2

(a) e1 % e2 % e3 %

33.029 7.102 14.892

63.789 18.412 26.814

(b) e1 % e2 % e3 %

87.342 46.434 41.114

91.878 56.428 60.649

(c) e1 % e2 % e3 %

65.565 23.367 19.090

80.031 27.013 31.230

(a) ei is the mean square error of the multi-processor control scheme when the robot covers the ﬁrst time the trajectories. (b) ei is the mean square error of the ﬁxed RBFN-based control scheme. (c) ei is the mean square error of the adaptive RBFN-based control scheme.

index j

3

2

1 0 (b)

20

40

60

80

100 120 140 160 180 200 samples

Fig. 14. Index j of the applied sequence of regulators C j : (a) the ﬁrst time the robot covers the trajectory (j ¼ 1; 2; 3) and (b) the second time the robot covers the trajectory (j ¼ 1; 2; 3; 4).

controllers requires a hierarchical architecture where the control actions and the on-line learning are two independent processes coordinated by a supervisor. This solution is computation intensive and requires a highperformance computer architecture based on a multiprocessor system. The hierarchical developed control structure suggests to share the computation on two independent processors, where a processor is dedicated to the implementation of the real-time multiple model control scheme and of the supervisor, while on the second processor the on-line learning algorithm is implemented. The computation of control actions and the on-line learning of new models and controllers are partially independent processes and are performed on independent computers connected by a local communication network. In all the performed experimental tests the proposed control scheme showed signiﬁcant improvements

of the tracking behavior in different environmental conditions. These results conﬁrm the non-linear approximation capabilities of the considered NN in modeling the kinematic behavior of the vehicle which can be used also for compensating unmodeled tracking errors. Concluding the proposed control scheme is able to cope with signiﬁcant environment disturbances and parameter uncertainties affecting the control of mobile robots operating in real contexts and the developed multiprocessor architecture represents an efﬁcient and simple solution for a real-time implementation of this new control scheme.

References Ahmed-Zaid, F., Ioannou, P. A., Polycarpou, M. M., & Youssef, H. M. (1993). Identiﬁcation and control of aircraft dynamics using radial basis function networks. In Proceedings of the second IEEE conference on control applications (pp. 567–572). Vancouver, BC. Angeloni, A., Leo, T., Longhi, S., Zulli, R. (1996). Real time collision avoidance for mobile robots. In Proceedings of the sixth symposium on measurement and control in robotics (ISMCR 96) (pp. 239–244). Brussels, Belgium. Chen, S., Cowan, C. F. N., & Grant, P. M. (1991). Orthogonal least squares learning algorithm for radial basis function networks. IEEE Transactions on Neural Networks, 2(2), 302–309. Chen, L., & Narendra, K. S. (2000). Nonlinear adaptive control using neural networks and multiple models. In Proceedings of the 2000 American control conference (vol. 6, pp. 4199–4203). Conte, G., Longhi, S., & Zulli, R. (1995). Nonholonomic motion planning using distance ﬁeld. In Proceedings of the 1995 IFAC conference on intelligent autonomous vehicles (pp. 93–98). Helsinky, Finland.

ARTICLE IN PRESS P. Antonini et al. / Control Engineering Practice 14 (2006) 1279–1295 Conte, G., Longhi, S., & Zulli, R. (1996). Motion planning for unicycle and car-like robots. International Journal of Systems Science, 27(8), 791–798. Corradini, M., Ippoliti, G., & Longhi, S. (2003). Neural networks based control of mobile robots: Development and experimental validation. Journal of Robotic Systems, 20(10), 587–600. Cybenko, G. (1989). Approximation by superpositions of a sigmoidal function. Mathematics of Control, Signal and Systems, 2(4), 303–314. D’Amico, A., Ippoliti, G., Longhi, S. (2001a). Adaptation and learning in neural networks multiple models based control of mobile robots. In Proceedings of the 2001 IFAC workshop on adaptation and learning in control and signal processing (ALCOSP 2001) (pp. 199–206). Cernobbio-Como, Italy. D’Amico, A., Ippoliti, G., Longhi, S. (2001b). A radial basis function networks approach for the tracking problem of mobile robots. In Proceedings of the 2001 IEEE/ASME international conference on advanced intelligent mechatronics (AIM ’01) (pp. 498–503). Como, Italy. de Wit, C. C., & Khennouf, H. (1995). Quasi-continuous stabilizing controllers for nonholonomic systems: Design and robustness considerations. In Proceedings of the third European control conference (pp. 2630–2635). Rome, Italy. de Wit, C. C., Khennouf, H., Samson, C., & Sordalen, O. J. (1993). Nonlinear control design for mobile robots. In Y. F. Zheng (Ed.), Recent trends in mobile robots (pp. 121–156). Singapore: World Scientiﬁc. Girosi, F., & Poggio, T. (1990). Neural networks and the best approximation property. Biological Cybernetics, 63, 169–176. Hunt, K. J., Sbarbaro, D., Zbikowski, R., & Gawthrop, P. J. (1992). Neural networks for control systems—a survey. Automatica, 28(6), 1083–1112. Hush, D. R., & Horne, B. G. (1993). Progress in supervised neural networks. IEEE Signal Processing Magazine, 10, 8–39. Ippoliti, G., & Longhi, S. (2004). Multiple models for adaptive control to improve the performance of minimum variance regulators. In IEE Proceedings - Control Theory and Applications (vol. 151, no. 2, pp. 210–217). Oxford, UK. Jagannathan, S., Lewis, F., Pastravanu, O. (1994). Model reference adaptive control of nonlinear dynamical systems using multilayer neural networks. In Proceedings of the 1994 IEEE World congress on computational intelligence, 1994 IEEE international conference on neural networks (pp. 4766–4771). Orlando, FL, USA. Jetto, L., Longhi, S., & Venturini, G. (1999). Development and experimental validation of an adaptive extended kalman ﬁlter for the localization of mobile robots. IEEE Transactions on Robotics & Automation, 15, 219–229.

1295

Jetto, L., Longhi, S., & Vitali, D. (1999). Localization of a wheeled mobile robot by sensor data fusion based on a fuzzy logic adapted kalman ﬁlter. Control Engineering Practice, 7, 763–771. Mihram, G. A. (1972). Simulation: Statistical foundations and methodology. Mathematics in science and engineering (vol. 92). New York, USA: Academic Press. Morse, A. S. (1995). Control using logic-based switching. In A. Isidori (Ed.), Trends in control: A European perspective (pp. 69–113). London, Great Britain: Springer. Narendra, K. S., & Xiang, C. (2000). Adaptive control of discrete-time systems using multiple models. IEEE Transactions on Automatic Control, 45(9), 1669–1686. Polycarpou, M. M. (1996). Stable adaptive neural control scheme for nonlinear systems. IEEE Transactions on Automatic Control, 41(3), 447–451. Polycarpou, M. M., & Ioannou, P. A. (1995). Adaptive bounding techniques for stable neural control systems. In Proceedings of the 34th conference on decision and control (pp. 2442–2447). New Orleans, LA. Sadegh, N. (1995). A nodal link perceptron network with applications to control of a nonholonomic system. IEEE Transactions on Neural Networks, 6(6), 1516–1523. Sanner, R. M., & Essex, C. (1996). Multiresolution radial basis function networks for the adaptive control of robotic systems. In UKACC international conference on control ’96 (pp. 894–898). Sanner, R. M., & Slotine, J.-J. E. (1991). Stable adaptive control and recursive identiﬁcation using radial gaussian networks. In Proceedings of the 30th conference on decision and control (pp. 2116–2123). Brighton, England. Sanner, R. M., & Slotine, J.-J. E. (1994). Function approximation, ‘‘neural’’ networks, and adaptive nonlinear control. In Proceedings of the third IEEE conference on control applications (pp. 1225–1232). Glasgow, UK. Unar, M. N., & Murray-Smith, D. J. (1999). Automatic steering of ships using neural network. International Journal of Adaptive Control and Signal Processing, 13, 203–218. Warwick, K., Kambhampati, C., Parks, P., & Mason, J. (1996a). Dynamic systems in neural networks. In K. Hunt, G. Irwin, & K. Warwick (Eds.), Neural network engineering in dynamic control systems (pp. 27–41). London, Great Britain: Springer. Warwick, K., Kambhampati, C., Parks, P., & Mason, J. (1996b). Dynamic systems in neural networks. In K. Hunt, G. Irwin, & K. Warwick (Eds.), Neural network engineering in dynamic control systems (pp. 27–41). London, Great Britain: Springer. Weller, S. R., & Goodwin, G. C. (1994). Hysteresis switching adaptive control of linear multivariable systems. IEEE Transactions on Automatic Control, 39(7), 1360–1375.

Learning control of mobile robots using a multiprocessor system

Learning control of mobile robots using a multiprocessor system

Recommend Documents