ARTICLE IN PRESS
Control Engineering Practice 14 (2006) 1279–1295 www.elsevier.com/locate/conengprac
Learning control of mobile robots using a multiprocessor system P. Antonini, G. Ippoliti, S. Longhi Dipartimento di Ingegneria Informatica, Gestionale e dell’Automazione, Universita` Politecnica delle Marche, Via Brecce Bianche, 60131 Ancona, Italy Received 12 February 2004; accepted 28 June 2005 Available online 15 August 2005
Abstract A real-time multiprocessor system is proposed for the solution of the tracking problem of mobile robots operating in a real context with environmental disturbances and parameter uncertainties. The proposed control scheme utilizes multiple models of the robot for its identification in an adaptive and learning control framework. Radial Basis Function Networks (RBFNs) are considered for the multiple models in order to exploit the net non-linear approximation capabilities for modeling the kinematic behavior of the vehicle and for reducing unmodeled contributions to tracking errors. The training of the nets and the tests of the achieved control performance have been done in a real experimental setup. The proposed control architecture improves the robot tracking performance achieving fast and accurate control actions in presence of large and time-varying uncertainties in dynamical environments. The experimental results are satisfactory in terms of tracking errors and computational efforts. r 2005 Elsevier Ltd. All rights reserved. Keywords: Robotics; Autonomous vehicles; Multiple models; Multiprocessor systems; Neural networks
1. Introduction Mobile robots usually work with different tasks in operative environments which assume different forms depending on specific applications to be performed. Environments may represent different terrains on which the system operates or different loads in the work space. In this context, a multiplicity of robot models can be used to identify different operative conditions and to determine the control actions with the necessary autonomy to cope with rapid environment changes. For the control problem, if a single identification model is used, when operating conditions change, this model should be adapted accordingly to the new environment before a control action is taken. Therefore, a multiplicity of models may be required to identify different environments and to take the necessary control action rapidly when changes in the environments are of considerable magnitude and the autonomy of the robot Corresponding author. Tel.: +39 071 220 4451; fax: +39 071 220 4474. E-mail address:
[email protected] (S. Longhi).
0967-0661/$ - see front matter r 2005 Elsevier Ltd. All rights reserved. doi:10.1016/j.conengprac.2005.06.012
must be improved (Chen & Narendra, 2000; Ippoliti & Longhi, 2004; Morse, 1995; Narendra & Xiang, 2000; Weller & Goodwin, 1994). The kinematic constraint on the use of commercially available vehicles which imposes to control the robot by setting its linear and angular velocities, hinders to take explicitly into account parameters variations in the design of the control algorithm for such vehicles. In fact, the commercial available vehicles impose to use a ‘‘kinematic viewpoint’’ based on the kinematic model which does not contain information on vehicle wheels dimensions, robot mass and inertia. These limits associated with the use of the kinematic model, can be overcome by the use of Neural Networks (NNs)-based controllers (Corradini, Ippoliti, & Longhi, 2003), where the non-linear approximation capabilities of Neural Nets have been exploited for reducing, with a simple kinematic control scheme, the effects of unmodeled phenomena, that are strong when relative high linear and angular accelerations are imposed by the navigation system. In this paper, NN-based control methodologies are further investigated within the context of multiple
ARTICLE IN PRESS 1280
P. Antonini et al. / Control Engineering Practice 14 (2006) 1279–1295
models control of mobile robots in an adaptive and learning control framework. A switching strategy among these models determines the best possible control input at any given instant. Radial basis function networks have been used for identification and control in order to avoid the non-linear optimization techniques used in the learning algorithm of the Multilayer Neural Networks (MNN) considered in Corradini et al. (2003) and the related problems of local minima. The experimental results using Radial Basis Function Networks (RBFNs) control schemes (D’Amico, Ippoliti, & Longhi, 2001b) have shown that RBFN are a possible alternative to control schemes based on MNN (Corradini et al., 2003). Moreover, these nets are useful for the identification and control of complex non-linear dynamical systems in the presence of uncertainties because the particular structure of RBFN allows their applicability to adaptive and adaptive-learning control schemes (Jagannathan, Lewis, & Pastravanu, 1994; Ahmed-Zaid, Ioannou, Polycarpou, & Youssef, 1993; Polycarpou & Ioannou, 1995; Polycarpou, 1996; Sadegh, 1995; Sanner & Slotine, 1991, 1994; Sanner & Essex, 1996). The real-time implementation of this switching controller with on-line learning of new models and controllers requires a hierarchical architecture where the control actions and the on-line learning are two independent processes coordinated by a supervisor. The supervisor task is double: to deduce the controller to be applied to the robot and to manage the on-line learning procedure. This solution is computation intensive and requires a high-performance computer architecture based on a multiprocessor system. The proposed hierarchical control structure imposes to share the computation on two independent processors, where a processor is dedicated to the implementation of the realtime multiple model control scheme and of the supervisor, while on the second processor the on-line learning algorithm is implemented. The computation of control actions and the on-line learning of new models and controllers are partially independent processes and are performed on independent computers connected by a local communication network. The technical aspects of this multiprocessor architecture will be analyzed in this paper. Experimental tests of the proposed NNs control schemes have been performed on the LabMate mobile base, available at the Robotics Lab of the Universita` Politecnica delle Marche and provided with an automatic navigation system. Experiments have been performed in a real indoor environment with unknown masses, friction, drive train compliance and with meaningful trajectories characterized by relatively strong linear and angular accelerations imposed by the planner. The experimental results show noticeable improvements of the vehicle performance with respect to fixed and
single-model adaptive RBFN-based control scheme (D’Amico et al., 2001b), confirming that the theoretical concepts on switching, learning and tuning developed in Chen and Narendra (2000), Narendra and Xiang (2000) can be applied to achieve significant improvements in the control of mobile robots. The paper is organized as follows: in Section 2, the multiple models control approach for non-linear dynamical systems is described. The RBFNs used for implementing the multiple models control scheme and their learning algorithm are recalled in Section 3. The details on the multiple models control scheme based on NNs and on the multiprocessor architecture are described in Section 4, while experimental results are reported in Section 5. The paper ends with comments on the performance of the proposed controllers and on the necessity of using a hierarchical architecture for the realtime implementation of the switching controller.
2. A multiple models control approach for non-linear dynamical systems In this section, basic definitions and concepts of a multiple models control approach for non-linear dynamical systems are recalled. This methodology has been developed for improving the performance of dynamical systems operating in rapidly varying environments. The approach is based on the use of multiple models, switching, adaptation and learning (Chen & Narendra, 2000; Narendra & Xiang, 2000). The multiple models are based on neural networks that are an useful tool for the identification and control of complex non-linear dynamical systems in the presence of uncertainties. Assuming that L identification models M j ; j 2 f1; Lg are used in parallel with the given plant, the objective is to determine which among them is closest (according to some criterion) to the plant at any given instant and to use the corresponding controller C j ; j 2 f1; Lg, designed for controlling the corresponding model M j , to control the plant (Chen & Narendra, 2000; Narendra & Xiang, 2000). With this control structure, the performance of the different controllers cannot be directly evaluated in parallel because at any time instant only one control input can be chosen. Therefore, the output responses yj ; j ¼ 1; 2; . . . ; L of the multiple models and the plant output y are used for an estimate of the controller performance. Hence, at every instant, some measures of the identification errors ej :¼ yj y are determined with suitable performance indices J j ðkÞ; j ¼ 1; 2; . . . ; L. The model corresponding to minfJ j ðkÞgLj¼1 is chosen to determine the plant control input at that time instant, i.e. the controller corresponding to the model with the minimum index is used (switched) for controlling the plant.
ARTICLE IN PRESS P. Antonini et al. / Control Engineering Practice 14 (2006) 1279–1295
When the process to be controlled is non-linear, NNbased models fM j gLj¼1 can be used for the theoretical ability of NN to approximate arbitrary non-linear mappings (Cybenko, 1989). The determination of the models fM j gLj¼1 , representing different operative conditions, has to be based on the a priori information concerning the plant and when a new configuration of the plant occurs, a new identification model M Lþ1 has to be learned online becoming an a priori known environment for all future performance of the control system. The approach here recalled has been specialized to the control of a mobile robot. In particular, RBFNs have been used for the identification and control. In the following section, the main features of the RBFNs are recalled. These nets are useful for the identification and control of complex non-linear dynamical systems in the presence of uncertainty because the particular structure of RBFN allows their applicability to adaptive and adaptive-learning control schemes.
3. Learning algorithm for radial basis function networks 3.1. Radial basis function networks A RBFN with input pattern x 2 Rn and a scalar ^ 2 R implements a mapping f : Rn ! R output c r according to ^ ¼ f ðxÞ ¼ l0 þ c r
nr X
li fðkx ci kÞ,
(1)
1281
output samples acquired in a finite set of past time instants (Hunt, Sbarbaro, Zbikowski, & Gawthrop, 1992; Warwick et al., 1996a). In the RBFN the functional form fðÞ and the centers ci are suitably fixed. By providing a set of network input pattern xðkÞ and the corresponding desired output cðkÞ to be approximated by the net, for k ¼ 1; 2; . . . ; N, the values of the weights li , i ¼ 0; 1; . . . ; nr can be determined using the linear Least-Squared (LS) method. The choices of centers ci must be carefully considered in order the RBFN closely matches the performance of MNNs, while theoretical investigation and practical results show that the choice of the non-linearity fðÞ, a function of the distance d i between the current input x and the center ci , does not significantly influence the performance of the RBFN (Chen et al., 1991). Therefore, the following gaussian function is considered: fðd i Þ ¼ expðd 2i =b2 Þ;
i ¼ 1; 2; . . . ; nr ,
(2)
where d i ¼ kx ci k and the real constant b is a scaling or ‘‘width’’ parameter (Chen et al., 1991). The centers ci , i ¼ 1; 2; . . . ; nr are generally chosen from the data set fxðkÞgN k¼1 . The Orthogonal Least Square (OLS) algorithm (Chen et al., 1991) is an efficient method for selecting centers from the data set obtaining adequate and parsimonious RBFN thus reducing computational complexity and numerical illconditioning. The application of this method requires to rewrite the RBFN expressed by (1) as a linear regression model
i¼1
where fðÞ is a given function from Rþ to R, k k denotes the Euclidean norm, li , i ¼ 0; 1; . . . ; nr are the weights or parameters, ci 2 Rn , i ¼ 1; 2; . . . ; nr , are the radial basis functions centers and nr is the number of centers (Chen, Cowan, & Grant, 1991; Girosi & Poggio, 1990; Unar & Murray-Smith, 1999; Warwick, Kambhampati, Parks, & Mason, 1996a). The block diagram representing the implementation of considered network (1) is shown in Fig. 1. Note that, if a RBFN is used for the estimation of the output of a dynamical system, the system dynamics can be taken into account through the input pattern x, that must be composed of a proper set of system input and
x1 x2
1(||x-c1||)
2(||x-c1||)
1 1 2
0 Σ
ψˆ
cðkÞ ¼
M X
pi ðkÞdi þ ðkÞ,
where M :¼ nr þ 1, cðkÞ is the desired output to be ^ approximated by the net, ðkÞ :¼ cðkÞ cðkÞ is the error signal, di :¼ li1 are the parameters to be estimated, pi ðkÞ :¼ fðkxðkÞ ci kÞ are given fixed function of xðkÞ where the centers ci have to be fixed and p1 ðkÞ ¼ 1. The error signal ðkÞ is assumed to be uncorrelated with pi ðkÞ. The OLS method, is considered for the selection of centers ci from the data set fxðkÞgN k¼1 , with a reduced number M of pi ðÞ. These significant regressors pi ðkÞ, i ¼ 1; 2; . . . ; M can be selected using the OLS algorithm operating in a forward regression way (Chen et al., 1991). Arranging (3) for k ¼ 1; 2; . . . ; N in the following matrix form: (4)
W ¼ PD þ E, T
xn
n (||x-cn ||) r
nr
r
Fig. 1. Radial basis function neural network.
(3)
i¼1
T
where W :¼ ½cð1Þ cðNÞ , D :¼ ½d1 dM , E :¼ ½ð1Þ ðNÞT and P :¼ ½p1 pM with pi :¼ ½pi ð1Þpi ð2Þ pi ðNÞT , i ¼ 1; 2; . . . ; M the OLS method involves the transformation of the set of pi into a set of orthogonal basis vectors wi , i ¼ 1; 2; . . . ; M, where the space spanned by the set of orthogonal basis vectors wi , i ¼ 1; 2; . . . ; M, is the
ARTICLE IN PRESS P. Antonini et al. / Control Engineering Practice 14 (2006) 1279–1295
1282
same space spanned by the set of pi , i ¼ 1; 2; . . . ; M, and (4) can be rewritten as (5)
W ¼ Wg þ E
with W ¼ ½w1 ; w2 ; . . . ; wM . This algorithm makes it possible to calculate the individual contribution to the desired output energy from each basis vector. Because wi and wj are orthogonal for iaj, the sum of squares or energy of WðkÞ is WT W ¼
M X
g2i wTi wi þ E T E
(6)
i¼1
and this relation suggests to consider as the error reduction ratio due to wi the following quantity: ½erri :¼ g2i wTi wi =ðWT WÞ;
i ¼ 1; 2; . . . ; M.
(7)
The regressors selection procedure terminates at the M s th step when 1
Ms X
½errj or,
(8)
3.3. ‘‘Width’’ of the radial basis functions Once the centers have been selected, the normalization parameter b2 of (2), that represents a measure of the spread of the data associated with each centre, has to be determined. No rigorous method exists to calculate this parameter. In the developed solution an Akaiketype criteria (Chen et al., 1991) is proposed for choosing the normalization parameter obtaining a good compromise between estimate accuracy and network complexity (D’Amico et al., 2001a). Akaike-type criteria which compromises between the performance and the number of parameters has the following form: AICðwÞ ¼ N logðs2e Þ þ M s w,
(9)
where w is the critical value of the chi-squared distribution with one degree of freedom and for a given level of significance, N is the number of data set, s2e is the variance of the net estimate residuals and M s is the number of significant regressors. The procedure terminates when AICðwÞ reaches its minimum.
j¼1
where 0oro1 is a chosen tolerance. This gives rise to a model containing only M s significant regressors. The orthogonal property makes the whole selection procedure simple and efficient and the tolerance r is an important parameter in balancing the accuracy and the complexity of the final network. 3.2. K-means clustering algorithm The OLS method can be employed as a forward regression procedure (Chen et al., 1991) to select a suitable set of centers ns pnr , from a large initial set of candidates, for the RBFN f r : Rn ! R. In the developed solution the initial set of centers is obtained using the kmeans unsupervised clustering algorithm that starting from a reasonable high number of centers randomly chosen in the input space, moves them in the most significative regions of the input space (D’Amico, Ippoliti, & Longhi, 2001a; Hush & Horne, 1993). The k-means clustering algorithm is given by the sequential execution of the following steps: Step 1: Initialize the cluster centers cj , j ¼ 1; 2; . . . ; nc . This centers are randomly chosen from the input data set fxðkÞgN k¼1 . Step 2: Group all the inputs of the data set fxðkÞgN k¼1 in the sets Oj , j ¼ 1; 2; . . . ; nc , where Oj is the set of input vectors xðÞ closest to the cluster center cj , i.e. N kxðkÞ ci kg. Oj :¼ fxðÞ 2 fxðkÞgN k¼1 j kxðkÞ cj k ¼ mini¼1P Step 3: For all j ¼ 1; 2; . . . ; nc do cj ¼ M1 j xðkÞ2Oj xðkÞ, where M j is the number of input vectors xðkÞ 2 Oj . Step 4: If there is no change in the cluster assignments of Step 3 from one iteration to the next the algorithm is stopped otherwise go back to Step 2.
3.4. On-line adaptive algorithm The structure of RBFNs allows to introduce adaptive and adaptive-learning on-line algorithms for improving the non-linear approximation capabilities of NNs. The RBFN is a special two-layer network which is linear in the parameters li ; i ¼ 0; 1; . . . ; nr , by fixing all radial basis functions centers and non-linearities in the hidden layer. Therefore, the second layer implements a linear combiner of the outputs of the hidden layer with weights li ; i ¼ 0; 1; . . . ; nr . The considered on-line adaptive algorithm computes the weights li by standard LS methods. Therefore, this particular structure of RBFN allows their applicability to adaptive control schemes, where the weights of the network second layer are online adapted making use of plant input and output signals. This adaptation allows the use of nets in a control scheme able to cope with parameters variations and external disturbances (D’Amico et al., 2001a,b; Warwick, Kambhampati, Parks, & Mason, 1996b). 3.5. On-line adaptive clustering algorithm The structure of the RBFN allows also an on-line adaptive clustering by updating the number of the radial basis function centers and the weights of the second layer. The proposed on-line adaptive clustering algorithm, for each new input vector of the RBFN, checks if the outputs of the first RBFN layer are significantly different from zero. If the outputs of the hidden layer are under a proper fixed threshold it means that the new input vector is outside the input space covered by the RBFN. In this case, the algorithm adds a new center to
ARTICLE IN PRESS P. Antonini et al. / Control Engineering Practice 14 (2006) 1279–1295
1283
the RBFN to cover the new input vector and update the weight of the second layer with the LS method recalled in Section 3.4 (D’Amico et al., 2001a). This on-line adaptive clustering algorithm adds new centers to the RBFN to cover new input vectors outside the input space covered by the RBFN without taking in account the centers selected off-line with the OLS algorithm. In order to optimize the position of these new centers, such new centers are combined with centers selected off-line, by the on-line implementation of the OLS algorithm, as stated in the following Section 3.6.
4.1. The kinematic controller
3.6. On-line OLS algorithm
AT ðqÞ_q ¼ ½sinðym Þ
The introduction of new centers detected by the above algorithm, can be optimized by the use of an on-line OLS algorithm which adds new centers to the NN on the basis of new input vectors which are outside the input space covered by the RBFN. If f r ðxÞ : Rn ! R is the off-line trained NNs with a number ns of centers, the algorithm, on the basis of the new input vectors computes a new NN f r ðxÞ : Rn ! R of the form
which corresponds to the hypothesis of pure rolling and non-slipping condition. Defining a full rank matrix SðqÞ, such that AT ðqÞSðqÞ ¼ 0, a vector u exists satisfying
c ¼ f r ðxÞ ¼ l0 þ
ns X
li fðkx ci kÞ
i¼1
þ
nr X
li fðkx ci kÞ,
ð10Þ
i¼ns þ1
where ci ; i ¼ ns þ 1; . . . ; nr are the new centers introduced for covering the new input vectors. The new centers ci ; i ¼ ns þ 1; . . . ; nr are derived by the k-means clustering algorithm recalled in Section 3.2 which is now applied to the set of the new input vectors which are outside the input space covered by f r ðxÞ : Rn ! R. With these set of centers ci ; i ¼ ns þ 1; . . . ; nr the OLS learning algorithm (Chen et al., 1991; D’Amico et al., 2001a) is on-line implemented for deducing the strictly necessary centers ci ; i ¼ ns þ 1; . . . ; ns , with ns pnr , which are orthogonal among them and orthogonal to the off-line determined centers ci ; i ¼ 1; 2; . . . ; ns .
A unicycle mobile robot is fully described by a threedimensional vector of generalized coordinates q constituted by the coordinates ðxm ; ym Þ of the midpoint between the two robot driving wheels and by the orientation angle ym with respect to a fixed frame (see Fig. 2): q ¼ ½q1
q2
q3 T ¼ ½xm
ym
ym T .
The vehicle is subjected to an independent velocity constraint of the form cosðym Þ 0_q ¼ 0,
q_ ¼ SðqÞu,
(12)
(13)
where u :¼ ½v oT , with o ¼ y_ m (the angular velocity) and qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi v ¼ x_m 2 þ y_m 2 (the linear velocity). Eq. (13) is the wellknown kinematic model of the unicycle vehicle, with cosðym Þ sinðym Þ 0 T . (14) SðqÞ ¼ 0 0 1 Now consider a reference trajectory of the form x_ r ¼ vr cosðyr Þ, y_ r ¼ vr sinðyr Þ, y_ r ¼ or ,
ð15Þ
where vr and or are the linear and angular reference velocities, respectively, and define the tracking errors as #" " # " # xr xm cosðym Þ sinðym Þ e1 :¼ , yr ym sinðym Þ cosðym Þ e2 e3 :¼ yr ym .
ð16Þ
Y
4. Architecture of the real-time multiple models neural network controller
v(t) ym (t)
The use of multiple models in an adaptive and learning control architecture is developed for the control of mobile robots. The main idea is to generate the robot control inputs making use of the model which closely approximates the mobile robot at each instant. This leads to a switching control scheme which improves the performance of the tracking errors. The technical aspects of the proposed solution are stated in the following, where the kinematic controller of a mobile robot and its implementation with NNs are preliminary recalled.
(11)
C
θm(t)
O xm (t) Fig. 2. Mobile robot with its vector of generalized coordinates.
X
ARTICLE IN PRESS P. Antonini et al. / Control Engineering Practice 14 (2006) 1279–1295
1284
The dynamics of the robot tracking errors has the following form (de Wit, Khennouf, Samson, & Sordalen, 1993): 32 3 2 3 2 e1 e_ 1 0 o 0 76 7 6 7 6 6 e_ 2 7 ¼ 6 o 0 0 76 e2 7 54 5 4 5 4 _e3 e3 0 0 0 3 2 3 2 vr cosðe3 Þ 1 0 " # 7 v 6 7 6 7 6 0 7 . ð17Þ þ6 5 4 vr sinðe3 Þ 5 þ 4 0 o or 0 1 In view of the experimental application of proposed control laws the controller design has been performed in the discrete time framework. Therefore, the model (17) has been discretized with a sampling time T c . Integrating equations (17) from kT c to ðk þ 1ÞT c and assuming the control efforts u :¼ ½v oT to be constant over the same time-interval ½kT c ðk þ 1ÞT c Þ, the following discretetime model is obtained: e1 ðk þ 1Þ ¼ e1 ðkÞ þ f 1 ðe2 ÞoðkÞ þ f 2 ðe3 ; vr Þ T c vðkÞ, e2 ðk þ 1Þ ¼ e2 ðkÞ þ f 3 ðe1 ÞoðkÞ þ f 4 ðe3 ; vr Þ, e3 ðk þ 1Þ ¼ e3 ðkÞ þ f 5 ðor Þ T c oðkÞ,
ð18Þ
where the argument k denotes the variable evaluated in t ¼ kT c and f i ; i ¼ 1; . . . ; 5 are the following non-linear function: Z ðkþ1ÞT c f 1 ðe2 Þ :¼ e2 ðtÞ dt, kT c
Z
vr ðtÞ cosðe3 ðtÞÞ dt,
Z
e1 ðtÞ dt, ðkþ1ÞT c
f 4 ðe3 ; vr Þ :¼ Z
vr ðtÞ sinðe3 ðtÞÞ dt, kT c ðkþ1ÞT c
f 5 ðor Þ :¼
e1 ðkÞ
or ðtÞ dt.
ð19Þ
kT c
Consider the following control laws for the linear and angular velocities: 1 ðk1 e1 ðkÞ þ f 1 ðe2 ÞoðkÞ þ f 2 ðe3 ; vr ÞÞ, Tc 1 ðf ðor Þ þ k2 e2 ðkÞ þ k3 e3 ðkÞÞ, oðkÞ ¼ Tc 5
3
6 7 T c vr ðkÞ 7 54 e2 ðkÞ 5. e3 ðkÞ ð1 k3 Þ (21)
Hence, the control gains k1 ; k2 and k3 of control law (20) are chosen to impose the asymptotic stability of (21), i.e. the eigenvalues l1 ; l2 and l3 of system (21) are imposed in module less than one. This choice guarantees the local asymptotic stability of the closed-loop system described by Eqs. (18) and (20). The eigenvalues l2 and l3 of (21) depend on vr ðkÞ, therefore the gain k2 must be related to the reference velocity vr ðkÞ (velocity scaling (de Wit et al., 1993)). For example, eigenvalues l1 , l2 and l3 are assigned to system (21) by the following choice of the gains k1 , k2 and k3 : k1 ¼ 1 þ l1 , k2 ¼ ðl1 l2 þ l1 l2 þ 1Þ=T c vr , k3 ¼ 2 l1 l2 . 4.2. Radial basis function neural networks control for mobile robots Before to introduce the multiple models control, a simple RBFN-based fixed controller for mobile robots is recalled (D’Amico et al., 2001b). In this controller, the universal approximation ability of NNs is used for the estimation of f 2 ðe3 ; vr Þ and f 5 ðor Þ in the implementation of (20), with the integral term f 1 ðe2 Þ substituted with its rectangular approximation expressed by: (22)
k1 1 ^ f ðkÞ, e1 ðkÞ þ oðkÞe2 ðkÞ þ Tc 2 Tc k2 k3 1 ^ f ðkÞ, oðkÞ ¼ e2 ðkÞ þ e3 ðkÞ þ ð23Þ Tc 5 Tc Tc where the estimation f^2 of f 2 ðe3 ; vr Þ is performed by the neural net ha : R‘a þ2 ! R of the form vðkÞ ¼
kT c
Z
32
Therefore, the developed control law has the following form:
kT c ðkþ1ÞT c
f 3 ðe1 Þ :¼
0
f 1 ðe2 Þ ’ T c e2 ðkÞ.
ðkþ1ÞT c
f 2 ðe3 ; vr Þ :¼
form: 2 3 2 0 ð1 k1 Þ e1 ðk þ 1Þ 6 e ðk þ 1Þ 7 6 0 1 4 2 5¼4 e3 ðk þ 1Þ 0 k2
a
a
a
a
y ¼ h ðx Þ ¼
la0
þ
nr X
la‘ fðkxa ca‘ kÞ
(24)
‘¼1
with the output ya ðkÞ :¼ f^2 ðkÞ and the input pattern xa 2 R‘a þ2 defined as
vðkÞ ¼
ð20Þ
where k1 , k2 and k3 are the control gains. Under the assumption of vr ðkÞ different to zero (de Wit et al., 1993) and a small sampling time (T c is chosen small enough to produce good estimations of simple integral terms with their rectangular approximations), the closed-loop system described by Eqs. (18) and (20) around the equilibrium point ðe1 ¼ e2 ¼ e3 ¼ 0Þ has the following
xa :¼ ½e3 ðkÞ e3 ðk 1Þ vr ðk ‘a þ 1Þ vr ðk ‘a þ 2Þ vr ðk 1Þ vr ðkÞT ,
ð25Þ
the estimation f^5 of f 5 ðor Þ is performed by the neural net hb : R‘b ! R of the form: b
b
b
b
y ¼ h ðx Þ ¼
lb0
þ
nr X ‘¼1
lb‘ fðkxb cb‘ kÞ
(26)
ARTICLE IN PRESS P. Antonini et al. / Control Engineering Practice 14 (2006) 1279–1295
with the output yb ðkÞ :¼ f^5 ðkÞ and the input pattern xb 2 R‘b defined as xb :¼ ½or ðk ‘b þ 1Þ or ðk ‘b þ 2Þ or ðk 1Þ or ðkÞT .
ð27Þ
The net input vectors xa and xb are composed of the tracking error measures and by the linear and angular velocities of the reference trajectories acquired in a finite set of successive sampling time-instants. The terms la‘ , lb‘ , ca‘ and cb‘ of the NNs ha and hb are off-line estimated by training on a set of experimental data given by the pairs ðxa ; ya Þ and ðxb ; yb Þ composed of the measures of the robot tracking errors and of the actual and reference robot control efforts acquired on a set of reference trajectories tracked by the mobile robot in real environments. In particular xa and xb , which have the form (25) and (27), are the input patterns of networks ha and hb and ya and yb are the desired outputs of the networks ha and hb , which from Eq.(18) have the forms ya ¼ e1 ðk þ 1Þ e1 ðkÞ T c e2 ðkÞoðkÞ þ T c vðkÞ, yb ¼ e3 ðk þ 1Þ e3 ðkÞ T c oðkÞ,
ð28Þ
where e1 ðÞ, e2 ðÞ, e3 ðÞ and vðÞ, oðÞ are the measures of the robot tracking errors and of the actual control efforts, respectively, acquired on a set of reference trajectories tracked by the mobile robot. The algorithm used for the net training minimizes the NN functional approximation errors along the considered reference trajectories. By (18) the NN functional approximation errors ra and rb of the neural networks ha and hb have the following expressions:
where eji ðÞ and ei ðÞ, i ¼ 1; 2; 3, are the model tracking errors and the measured robot tracking errors respectively, with j ¼ 1; 2; . . . ; L. Moreover, a set of L controllers fC j gLj¼1 , implementing the following control laws, respectively, for the linear and angular velocities of the mobile robot, is considered: 1 ðk1 e1 ðkÞ þ f j1 ðe2 ÞoðkÞ þ f j2 ðe3 ; vr ÞÞ, Tc 1 j ðf ðor Þ þ k2 e2 ðkÞ þ k3 e3 ðkÞÞ, oj ðkÞ ¼ Tc 5
vj ðkÞ ¼
T c e2 ðkÞoðkÞ þ T c vðkÞ f^2 ðkÞ, rb ðkÞ :¼ f f^ ¼ e3 ðk þ 1Þ e3 ðkÞ 5
T c oðkÞ f^5 ðkÞ.
ð29Þ
The training algorithm based on the OLS algorithm recalled in Section 3.1 is proposed for the net tuning. This RBFN-based controller implementing the control law (23) with f^2 and f^5 estimated by NNS (24) and (26) is used in the multiple models control scheme introduced in the next Section 4.3. 4.3. Multiple models framework for mobile robot control Consider a set of L discrete-time models fM j gLj¼1 that have the same form as the dynamics of the mobile robot tracking errors (18) ej1 ðk þ 1Þ ¼ ej1 ðkÞ þ f j1 ðe2 ÞoðkÞ þ f j2 ðe3 ; vr Þ T c vðkÞ, ej2 ðk þ 1Þ ¼ ej2 ðkÞ þ f j3 ðe1 ÞoðkÞ þ f j4 ðe3 ; vr Þ, ej3 ðk þ 1Þ ¼ ej3 ðkÞ þ f j5 ðor Þ T c oðkÞ,
ð30Þ
ð31Þ
where the controller C j is designed for controlling the corresponding model M j ; j ¼ 1; 2; . . . ; L. In the proposed multiple models controller (see part Multiple models controller in Fig. 3), at every time instant k only one controller of the set fC j gLj¼1 is applied to the robot and a parallel architecture of L models fM j gLj¼1 yields the outputs yj ðkÞ :¼ ½ej1 ðkÞ ej2 ðkÞ ej3 ðkÞT ; j ¼ 1; 2; . . . ; L, where each model M j is driven by the same control input uðkÞ :¼ ½vðkÞ oðkÞT of the robot. With this control structure, the performance of the different controllers cannot be directly evaluated in
ra ðkÞ :¼ f 2 f^2 ¼ e1 ðk þ 1Þ e1 ðkÞ
5
1285
Fig. 3. Hierarchical control scheme.
ARTICLE IN PRESS P. Antonini et al. / Control Engineering Practice 14 (2006) 1279–1295
1286
parallel because at any time instant only one control input can be chosen. Therefore, the output responses of the multiple models are used for the estimation of the controller performance. Defining the prediction errors ej ðkÞ :¼ yðkÞ yj ðkÞ;
j ¼ 1; 2; . . . ; L,
(32)
where yðkÞ :¼ ½e1 ðkÞ e2 ðkÞ e3 ðkÞT is the vector of the measured robot tracking errors, the following performance indices are considered: J j ðkÞ :¼ aeTj ðkÞej ðkÞ þ b
k1 X
l ðkt1Þ eTj ðtÞej ðtÞ,
t¼k0
j ¼ 1; 2; . . . ; L; kX0; k 2 Z,
ð33Þ
where the design parameters a and b are positive constants, the module of l is lesser then one and k0 is the initial time instant. The parameters a, b determine the relative weight given by instantaneous and long-term measures, while l determines the memory of the index. In this multiple models control scheme, at every time instant the performance indices are monitored in order to deduce the controller to be applied to the robot, i.e. the controller corresponding to the model with the minimum index is used (switched) for controlling the robot. Different models fM j gLj¼1 representing different operative conditions and different control strategies for the controllers fC j gLj¼1 can be incorporated into this multiple models framework. Moreover, in this approach when a new configuration of the mobile robot occurs, a new model M Lþ1 has to be learnt on line becoming an a priori known model with a corresponding controller C Lþ1 for all future occurrences of the same mobile robot configuration. The proposed multiple models control scheme has been developed with a starting configuration incorporating three different controllers C 1 ; C 2 ; and C 3 , i.e. L ¼ 3. This simple structure based on three controllers meets the requirement of reduced implementation complexity with the capability to cope with environment disturbances and parameter uncertainties. In fact C 1 is a fixed control, C 2 is an adaptive control and C 3 is an adaptive-learning control. Each controller C j is designed for controlling the corresponding model M j .
(i) The first controller C 1 is a fixed RBFN-based control system described by (31) with j ¼ 1, where the terms f 12 ðe3 ; vr Þ and f 15 ðor Þ are substituted by their estimates obtained by fixed off-line trained RBFNs ha1 and hb1 , which have the same structures of (24) and (26), respectively. These nets are off-line trained with data acquired in a set of trajectories chosen with different shapes, curvature radius and tracking velocities as described in Section 4.2. The corresponding model M 1 has the form (30) with j ¼ 1, where f 12 ðe3 ; vr Þ and f 15 ðor Þ are estimated by ha1
and hb1 , respectively, and f 11 ðe2 Þ, f 13 ðe1 Þ and f 14 ðe3 ; vr Þ are approximated by T c e2 ðkÞ, T c e1 ðkÞ and T c e3 ðkÞvr ðkÞ, respectively. (ii) The second controller C 2 described by (31) with j ¼ 2, is a single model adaptive RBFN-based controller, where the terms f 22 ðe3 ; vr Þ and f 25 ðor Þ are substituted by their estimates obtained by adaptive RBFNs ha2 and hb2 , which have the same structures of (24) and (26), respectively. In this controller, adaptive RBFNs ha2 and hb2 are considered, where the weights of the nets second layers are on-line tuned making use of the on-line adaptive algorithm described in Section 3.4. The corresponding model M 2 has the form (30) with j ¼ 2, where f 22 ðe3 ; vr Þ and f 25 ðor Þ are estimated by the considered NNs ha2 and hb2 , respectively, and f 21 ðe2 Þ, f 23 ðe1 Þ and f 24 ðe3 ; vr Þ are approximated by T c e2 ðkÞ, T c e1 ðkÞ and T c e3 ðkÞ vr ðkÞ, respectively. (iii) The third controller C 3 described by (31) with j ¼ 3, is an adaptive-learning RBFN-based controller, where the terms f 32 ðe3 ; vr Þ and f 35 ðor Þ are substituted by their estimates obtained by adaptive-learning RBFNs ha3 and hb3 , that have the same structures of (24) and (26), respectively. In this controller, adaptive-learning RBFNs ha3 and hb3 are considered where the number of centers of the net first layer and the weight of the net second layer are on-line updated making use of the on-line adaptive clustering algorithm described in Section 3.5. The corresponding model M 3 has the form (30) with j ¼ 3, where f 32 ðe3 ; vr Þ and f 35 ðor Þ are estimated by ha3 and hb3 , respectively, and f 31 ðe2 Þ, f 33 ðe1 Þ and f 34 ðe3 ; vr Þ are approximated by T c e2 ðkÞ, T c e1 ðkÞ and T c e3 ðkÞ vr ðkÞ, respectively. During the robot tasks new controllers are derived by new NNs-based models M j of the form (30) where the terms f j2 ðe3 ; vr Þ and f j5 ðor Þ with j ¼ 4; 5; . . ., are estimated with NNs where the centers of the first layer and the weight of the second layer are updated by the on-line OLS algorithm recalled in Section 3.6. The new NNs obtained with this on-line algorithm characterize new models able to describe new operative conditions. These new models M Lþi are stored and used as a priori known models with the corresponding controllers C Lþi for all future occurrences. This requires to introduce new performance indices J Lþi ; i ¼ 1; 2; . . ., that are monitored as the indices considered at the starting configuration in order to deduce the controller to be applied to the robot. In this way the set of models increases during the robot tasks and the new models can be also stored at the end of robot tasks and used in a new trajectory. The implementation of the developed multiple models control scheme requires a hierarchical architecture, where the control actions and the on-line learning strategy are two independent processes coordinated by a
ARTICLE IN PRESS P. Antonini et al. / Control Engineering Practice 14 (2006) 1279–1295
1287
Fig. 4. Multiprocessor system.
supervisor (see Fig. 3). The supervisor task is double: to monitor the performance indices J j expressed by (33) with j ¼ 1; 2; . . . ; L þ i and i ¼ 1; 2; . . ., and to manage the on-line procedure recalled in Section 3.6. For this second task the supervisor checks on-line if the outputs of the RBFNs hidden layer are under a proper fixed threshold for a specified time lag. When this threshold is exceeded, this means that the new set of input vectors is outside the input space covered by RBFNs and the online OLS algorithm recalled in Section 3.6 is applied for computing a new NN, describing a new model. The supervisor adds new models with the relative controllers to the multiple models control scheme increasing its ability to cope with different operating environments. This solution is computation intensive and requires a high-performance computer architecture based on a multiprocessor system. The developed hierarchical control structure suggests to share the computation on two independent processors, where a processor is dedicated to the implementation of the real-time multiple models control scheme and of the supervisor, while on the second processor the on-line OLS algorithm recalled in Section 3.6 is implemented. The processors are independent and an asynchronous data link is used for coordinating the supervisor with the on-line learning algorithm.
The architecture of the developed multiprocessors controller is represented in Fig. 4. Two independent personal computers are used. On the first personal computer PC1 the functions of the multiple models controller are implemented and all the function for monitoring the existing models are pointed out in Fig. 4. PC1 acquires also the sensor measurements on the mobile base and it manages the transmission of the acquired data to the second personal computer. On the second personal computer PC2 the on-line implementation of the OLS algorithm is performed. The procedures implemented on the independent personal computers PC1 and PC2 do not require strong interaction. In fact there are asynchronous reduced data exchanges between the two computers only when the learning procedures locate a new network model.
5. Experimental results The real-time experimental tests of the control algorithms have been performed on the LabMate mobile robot, available at the Robotics Lab of the Universita` Politecnica delle Marche and provided with an automatic navigation system (Conte, Longhi, & Zulli, 1995). The navigation system is based on a hybrid structure
ARTICLE IN PRESS 1288
P. Antonini et al. / Control Engineering Practice 14 (2006) 1279–1295
(Angeloni, Leo, Longhi, & Zulli, 1996; Conte, Longhi, & Zulli, 1996; Jetto, Longhi, & Venturini, 1999) composed of the motion planning module which runs off-line, the localization and control modules, the perception module and the obstacle avoidance module which work on-line with real-time implementations. The path planning module considers the non-holonomic constraints of the mobile robot and the a priori knowledge of the environment for planning a collision free path from given start and goal configurations (Conte et al., 1996). In real-time the localization and the control modules generate the estimate of the position and orientation of the mobile robot making use of external sensors (sonar sensors) and internal sensors (dead reckoning sensors that are incremental encoders on the driving wheels) (Jetto, Longhi, & Venturini, 1999; Jetto, Longhi, & Vitali, 1999) and the control inputs of the vehicle (the linear and the angular velocities). The absolute localization module based on adaptive extended Kalman filter (Jetto, Longhi, & Venturini, 1999; Jetto, Longhi, & Vitali, 1999) guarantees high accuracy with average error in module lower than 1 cm. The environment perception module acquires data on the environment by the same set of external sensors (sonar sensors) and produces a local map of the environment with the detected obstacles (Angeloni et al., 1996). This map is used by the real-time obstacle avoidance module for modifying the planned trajectory to be tracked with a new collision free trajectory. All these modules of the navigation system are implemented on a PC-based architecture composed by two independent personal computers PC1 and PC2, connected by an Ethernet link and both installed on the vehicle as discussed at the end of the above Section 4.3. PC1 is a 486 DX2 with PC 104 bus and PC2 is a Pentium II. On PC1 the functions of the multiple model controller are implemented together with the conditioning and acquisition procedures of sensor measurements. The path planning module, the real-time localization module and the real-time obstacle avoidance module are implemented on PC2 together with the on-line implementation of the OLS algorithm. Fig. 5 shows the LabMate mobile base equipped with the PC-based architecture. In the developed control scheme, the controllers C j ; j ¼ 2; 3; . . ., are based on NNs trained by the online algorithms recalled in Sections 3.4 – 3.6 starting by the off-line trained NNs that are used for implementing controller C 1 (see Section 4.2). Therefore, RBFNs with one hidden layer with gaussian activation functions and one output layer with linear activation functions have been employed for the estimation of f j2 ðe3 ; vr Þ and f j5 ðor Þ in the control law (31), j ¼ 1; 2; . . . ; L þ i, i ¼ 1; 2; . . ., with the integral term f j1 ðe2 Þ substituted with its rectangular approximation expressed by (22) with f j1 ðe2 Þ instead of f 1 ðe2 Þ.
Fig. 5. LabMate mobile robot.
The proposed control law has the following form: j 1 ðk1 e1 ðkÞ þ oðkÞe2 þ f^2 ðe3 ; vr ÞÞ, Tc 1 ^j ðf ðor Þ þ k2 e2 ðkÞ þ k3 e3 ðkÞÞ, ð34Þ oj ðkÞ ¼ Tc 5 j where the estimate f^2 of f j2 ðe3 ; vr Þ is performed by the neural net haj : R‘a þ2 ! R of the form
vj ðkÞ ¼
a;j
yaj
¼
haj ðxa Þ
¼
la;j 0
þ
nr X
a;j a la;j ‘ fðkx c‘ kÞ
(35)
‘¼1 j with the output yaj ðkÞ :¼ f^2 ðkÞ and the input pattern xa 2 j R‘a þ2 defined as in (25) and the estimation f^5 of f j5 ðor Þ is performed by the neural net hbj : R‘b ! R of the form: b;j
ybj
¼
hbj ðxb Þ
¼
lb;j 0
þ
nr X
b;j b lb;j ‘ fðkx c‘ kÞ
(36)
‘¼1 j
with the output ybj ðkÞ :¼ f^5 ðkÞ and the input pattern xb 2 R‘b defined as in (27). In the following subsection, the implementation and technical details of fixed RBFNs of controller C 1 , that are the starting configuration for nets of the other controllers C j ; j ¼ 2; 3; . . . ; are described. 5.1. Structure and validation of implemented fixed RBFNs The training and testing phases of NNs of C 1 have been performed with data acquired in a set of planned trajectories where experimental data have been used composed by the measures of the robot tracking errors and of the actual and reference control efforts applied to
ARTICLE IN PRESS P. Antonini et al. / Control Engineering Practice 14 (2006) 1279–1295
the robot along the set of trajectories. A set of 20 trajectories, chosen with different shapes, curvature radius and tracking velocities, have been used for the training phase. For each trajectory from 180 to 400 pairs of input and output samples have been considered. In particular the set of experimental data is given by the pairs ðxa ; ya Þ and ðxb ; yb Þ which have the forms specified in (25), (27), (28). These data acquired by the mobile base in the considered real operative conditions are used for training the nets by OLS-based algorithm. Solutions found by the algorithm may be not optimal but useful if yield satisfactory performance. b;1 The positive integers na;1 r , nr , ‘ a and ‘ b , which specify the complexity of RBFNs described by (35) and (36) for j ¼ 1, have been chosen for matching the approximation capability of the nets to a low net complexity which is necessary for a real-time implementation on the control module of the mobile base. The number of neurons has been chosen for obtaining a good trade off among the time complexity of learning, the computation efforts of the resulting NNs control schemes and the accuracy of predictions for obtaining satisfactory control performance. Therefore, the hidden layers of nets (35) and (36) for j ¼ 1 are chosen with a number of neurons na;1 r ¼6 and nb;1 r ¼ 9. The number of input variables, stated in relations (25) and (27), is specified with ‘a ¼ ‘b ¼ 2. This choice qualifies the nets to learn the mobile robot behaviors in steady-state conditions for linear and angular accelerations and this choice is adequate for the smooth trajectories produced by the path planner and by the obstacle avoidance module implemented in the navigation system. Different choices with an increment of the number of inputs, as for example with ‘a ¼ ‘b ¼ 3, have been tested without significative improvements of the performance. A tolerance r ¼ 0:01 has been chosen in relation (8) for balancing the accuracy and complexity of the final network. The choices operated on the structure of the nets represented by (35) and (36) with j ¼ 1, have been confirmed by the performed cross-validation tests, which have been experimentally used to monitor the performance generalization of the networks in order to avoid the overfit of the data used for training the nets. Different robot trajectories not used during the training phase have been considered for cross-validation tests. The trained NNs have shown good capability to generalize the predictions of f 2 and f 5 in all considered robot validation trajectories different to the robot trajectories of the training phase. Samples of the performed tests are shown in Figs. 6 and 7 for nets (35) and (36) with j ¼ 1, respectively, where for robot trajectories not used for the training phase, the nets (35) and (36) with j ¼ 1 used in the first controller C 1 show the good capability to reproduce the terms f 2 and f 5 . In Figs. 6(a) and 7(a), the dashed lines are the desired net outputs f 2 and f 5 , respectively and the continuous lines
1289
70
60
50
40
30
20
10 0
50
100
150 samples
200
250
0
50
100
150 samples
200
250
(a) 4 3 2 1 0 -1 -2 -3 (b)
Fig. 6. Cross-validation of ha1 : (a) the continuous line denotes the 1 output f^2 ðkÞ of ha1 and the dashed line denotes the desired output f 2 ðkÞ and (b) difference between the desired output and the net output.
are the nets outputs f^12 and f^15 , respectively. The difference between the desired output and the net output is shown in Figs. 6(b) and 7(b) for nets described by (35) and (36) with j ¼ 1, respectively. The Anderson’s test (Mihram, 1972) that analyzes the sampled autocorrelation function of the tested signal to determine its whiteness has been also used for NNs 1 validation. The estimate residuals r1 :¼ f f^ , r1 :¼ a
1
2
2
b
f 5 f^5 , which have the form (29), are obtained from the identification performed by the two NNs used in the first controller C 1 . These residuals are generally reduced (the mean-square NNs functional approximation errors are less than 103 ) and white. The autocorrelation functions of residuals lie within the 95% of the considered confidence strip, as shown in Fig. 8 where a sample of
ARTICLE IN PRESS P. Antonini et al. / Control Engineering Practice 14 (2006) 1279–1295
1290
20
1.2
15
1
10
0.8
5
0.6
0
0.4
-5
0.2
-10
0
-15
-0.2 -0.4
-20 0
50
100
(a)
150 samples
200
250
0
50
100
0
50
100
(a)
150 samples
200
250
200
250
0.3
1.2 0.2
1
0.1
0.8
0
0.6
-0.1
0.4 0.2
-0.2
0 -0.3
-0.2 -0.4 0 (b)
50
100
150 samples
200
250
Fig. 7. Cross-validation of hb1 : (a) the continuous line denotes the 1 output f^5 ðkÞ of hb1 and the dashed line denotes the desired output f 5 ðkÞ and (b) difference between the desired output and the net output.
these tests are reported for the same validation trajectories considered in Figs. 6 and 7. Figs. 8(a) and 8(b) show the autocorrelation functions relative to the estimate residuals of networks (35) and (36) for j ¼ 1, respectively. As above stated, the nets implementing controllers C j ; j ¼ 2; 3; . . . ; are chosen with the same structure of nets used in the implementation of C 1 . In particular the RBFNs of controllers C j ; j ¼ 2; 3; . . . ; have the structures described by (35) and (36) with the same number of input variables, i.e. ‘a ¼ ‘b ¼ 2. Moreover the number b;2 b;1 of neurons na;2 coincides with na;1 r ; nr r ; nr , while b;j na;j ; n , j ¼ 3; 4; . . . ; could increase with respect to the r r b;1 starting values na;1 ; n (as stated in Sections 3.5 and r r 3.6) to cope with the environment disturbances and parameter uncertainties.
-0.4 (b)
150 samples
Fig. 8. Autocorrelation functions of the estimate residuals obtained from the identification performed by neural networks ha1 and hb1 . The dashed straight lines denote the 95% confidence strip: (a) autocorrelation function of the estimate residuals r1a of neural network ha1 and (b) autocorrelation function of the estimate residuals r1b of neural network hb1 .
5.2. Experimental tests for the multiple models-based control scheme The real-time multiple models RBFN-based controller and the supervisor have been implemented on PC1. A sampling time of 0.4 s has been used, in order to guarantee a maximum vehicle velocity of 0.3 m/s. Moreover, PC1 acquires data from the actual trajectory for RBFNs on-line learning. The data are sent to PC2 connected to PC1 by an Ethernet link. PC2 implements the on-line OLS algorithm for updating the off-line determined RBFN centers using the data acquired by the local net. The new NNs are then sent to the PC1 and are used by the control module to add a new model
ARTICLE IN PRESS
mm
P. Antonini et al. / Control Engineering Practice 14 (2006) 1279–1295
3000 2500 Constant disturbances start point (20th sample) 2000 1500 1000 500 100th sample 0 1st sample -500 150th sample -1000 Constant disturbances end point -1500 (200th sample) -2000 -2500 250th sample -3000 -3500 -3500 -3000 -2500 -2000 -1500 -1000 -500 mm
0
500
Fig. 9. Planned trajectory.
1291
M j ; j ¼ 4; 5; . . . ; with the relative controller to the multiple model control scheme (see Fig. 4). The proposed multiple models RBFN-based control scheme has been tested in many different environmental conditions and with different planned trajectories obtaining satisfactory results. Significant samples of the performed experimental tests on the proposed controller are reported in the following. The planned trajectories of Figs. 9 and 12 are considered. The proposed multiple models RBFN-based controller has been tested with constant unmeasurable disturbances acting on the control inputs of the mobile robot (de Wit & Khennouf, 1995). The constant disturbances start at the 20th and 50th sample and end at the 200th and 150th sample for trajectories of Figs. 9 and 12, respectively. In the performed experimental tests, the vehicle covers two time the same trajectory with the same constant
40
40
30
20
20
0
mm
mm
10 0
-20 -40
-10 -60 -20 -80
-30
-100
-40 0 (a)
50
100
150
200
0
250
samples
50
100
150
200
250
samples
(b)
0.5 0.4 0.3 0.2
rad
0.1 0 -0.1 -0.2 -0.3 -0.4 -0.5 0 (c)
50
100
150
200
250
samples
Fig. 10. (a)–(c) tracking errors e1 ðkÞ, e2 ðkÞ and e3 ðkÞ, respectively. The dotted lines and the continuous lines have been obtained when the vehicle covers the first time and the second time the trajectory, respectively.
ARTICLE IN PRESS P. Antonini et al. / Control Engineering Practice 14 (2006) 1279–1295
1292
4
index j
3
2
1 0
50
100
150
200
250
200
250
samples
(a) 4
index j
3
2
1 0
50
100
150 samples
(b)
Fig. 11. Index j of the applied sequence of regulators C j . (a) The first time the robot covers the trajectory (j ¼ 1; 2; 3) and (b) the second time the robot covers the trajectory (j ¼ 1; 2; 3; 4).
7000 6000
Constant disturbances start point (50th sample)
5000
mm
4000 3000 1st sample
2000 1000 0
Constant disturbances end point (150th sample) 200th sample
-1000 -3500 -3000 -2500 -2000 -1500 -1000 -500 mm Fig. 12. Planned trajectory.
0
500
disturbances acting on the control inputs in order to verify the learning capability of the multiple models control scheme based on the multiprocessor system. In these tests, the multiprocessor control scheme stores the new operating conditions through new NNs trained by the on-line OLS learning algorithm. These nets are used for implementing model M 4 and the relative controller C 4 . Figs. 10 and 13 show the tracking errors obtained with the developed multiprocessor control scheme. The dotted lines represent the tracking errors of the proposed multiprocessor controller when the vehicle covers the first time the planned trajectory, while the continuous lines represent the tracking errors when the vehicle covers the second time the trajectory with the same constant disturbance acting on the control inputs. In this second case the control scheme is based on 4 models, where the fourth model M 4 is that trained by the on-line OLS learning algorithm when the mobile base covers for the first time the trajectory. The first time that the mobile base covers the trajectories of Figs. 9 and 12, the multiprocessor control scheme uses controllers C j ; j ¼ 1; 2; 3, and the applied sequence of C j ; j ¼ 1; 2; 3, is displayed in Figs.11(a) and 14(a) for trajectories of Figs. 9 and 12, respectively; while the second time that the robot covers the same trajectory, the sequence of applied controllers C j ; j ¼ 1; 2; 3; 4, is displayed in Figs. 11(b) and 14(b) for trajectories of Figs. 9 and 12, respectively (Figs. 9–14). Note that the considered multiple models control structure switches on the adaptive and adaptive-learning controllers C j ; j ¼ 2; 3; 4 only in the presence of constant disturbance inputs, switching back to the fixed controller C 1 when the disturbances end. This fact shows that the fixed RBFN-based controller is suitable for trajectories without disturbances. In fact, it is based on NNs trained on a large set of different trajectories without disturbances and therefore these nets have the ability to generalize to trajectories without disturbances. For a sample of significant trajectories, characterized by different shapes, curvature radius and tracking velocities, the reduction of the tracking errors obtained by the multiprocessor control scheme is summarized in Table 1(a). The considered tests have been performed considering constant unmeasurable disturbances acting on the control inputs of the mobile robot (de Wit & Khennouf, 1995). In all performed tests the investigated multiprocessor RBFN-based control scheme is satisfactory in terms of tracking errors and it shows an improvement of the control performance when the vehicle covers the second time the trajectories in the case of constant unmeasurable disturbance acting on the vehicle. Moreover, the performed experimental tests show a noticeable improvement of the control performance of the multiprocessor RBFN-based control scheme with respect to fixed and adaptive RBFN-based
ARTICLE IN PRESS P. Antonini et al. / Control Engineering Practice 14 (2006) 1279–1295
1293
40
60
20
40
0 -20
0
mm
mm
20
-20
-40 -60 -80
-40 -100 -60
-120 -140
-80 0
20
(a)
40
60
80
100 120 140 160 180 200
samples
0
20
40
60
(b)
80
100 120 140 160 180 200 samples
0.4 0.3 0.2
rad
0.1 0 -0.1 -0.2 -0.3 -0.4 0
20
40
60
(c)
80
100 120 140 160 180 200 samples
Fig. 13. (a)–(c) tracking errors e1 ðkÞ, e2 ðkÞ and e3 ðkÞ, respectively. The dotted lines and the continuous lines have been obtained when the vehicle covers the first time and the second time the trajectory, respectively.
controllers (D’Amico et al., 2001a,b) as summarized in Tables 1(b) and (c).
6. Conclusions In this paper, a multiprocessor multiple models RBFN-based approach for the tracking problem of a mobile vehicle is analyzed. The proposed controller has been developed and compared also to fixed and adaptive NN-based controllers by experimental tests performed on the mobile base LabMate. The proposed approach is an application of a general learning control methodology which is based on the use of multiple models, switching and tuning (Chen & Narendra, 2000; Narendra & Xiang, 2000). Fixed, adaptive and adaptive-learning RBFN-based control strategies are integrated into this multiple
models framework. The NNs have been trained on experimental data acquired from the mobile base LabMate. Different training trajectories have been considered. The proposed control algorithm makes use of multiple models of the mobile robot and of a switching scheme which determines the specific control input that drives the robot actuators. The main idea is to generate a control input which is based on the identification model which best approximates the robot at that instant. Moreover, the proposed adaptivelearning control scheme is able to introduce new models during robot tasks. Therefore, the developed multimodel control scheme increases its knowledge on the environment by the introduction of new models and related controllers and it is able to cope with timevarying operating conditions. The real-time implementation of this switching controller with on-line learning of new models and
ARTICLE IN PRESS P. Antonini et al. / Control Engineering Practice 14 (2006) 1279–1295
1294
Table 1
4
e e
Percent mean square errors ei % :¼ i ei i 100, i ¼ 1; 2; 3; where ei is the mean square tracking error of the multi-processor control scheme when the robot covers the second time the considered trajectories T j ; j ¼ 1; 2
index j
3
2
1 0
20
40
60
80
(a)
100 120 140 160 180 200 samples
4
T1
T2
(a) e1 % e2 % e3 %
33.029 7.102 14.892
63.789 18.412 26.814
(b) e1 % e2 % e3 %
87.342 46.434 41.114
91.878 56.428 60.649
(c) e1 % e2 % e3 %
65.565 23.367 19.090
80.031 27.013 31.230
(a) ei is the mean square error of the multi-processor control scheme when the robot covers the first time the trajectories. (b) ei is the mean square error of the fixed RBFN-based control scheme. (c) ei is the mean square error of the adaptive RBFN-based control scheme.
index j
3
2
1 0 (b)
20
40
60
80
100 120 140 160 180 200 samples
Fig. 14. Index j of the applied sequence of regulators C j : (a) the first time the robot covers the trajectory (j ¼ 1; 2; 3) and (b) the second time the robot covers the trajectory (j ¼ 1; 2; 3; 4).
controllers requires a hierarchical architecture where the control actions and the on-line learning are two independent processes coordinated by a supervisor. This solution is computation intensive and requires a highperformance computer architecture based on a multiprocessor system. The hierarchical developed control structure suggests to share the computation on two independent processors, where a processor is dedicated to the implementation of the real-time multiple model control scheme and of the supervisor, while on the second processor the on-line learning algorithm is implemented. The computation of control actions and the on-line learning of new models and controllers are partially independent processes and are performed on independent computers connected by a local communication network. In all the performed experimental tests the proposed control scheme showed significant improvements
of the tracking behavior in different environmental conditions. These results confirm the non-linear approximation capabilities of the considered NN in modeling the kinematic behavior of the vehicle which can be used also for compensating unmodeled tracking errors. Concluding the proposed control scheme is able to cope with significant environment disturbances and parameter uncertainties affecting the control of mobile robots operating in real contexts and the developed multiprocessor architecture represents an efficient and simple solution for a real-time implementation of this new control scheme.
References Ahmed-Zaid, F., Ioannou, P. A., Polycarpou, M. M., & Youssef, H. M. (1993). Identification and control of aircraft dynamics using radial basis function networks. In Proceedings of the second IEEE conference on control applications (pp. 567–572). Vancouver, BC. Angeloni, A., Leo, T., Longhi, S., Zulli, R. (1996). Real time collision avoidance for mobile robots. In Proceedings of the sixth symposium on measurement and control in robotics (ISMCR 96) (pp. 239–244). Brussels, Belgium. Chen, S., Cowan, C. F. N., & Grant, P. M. (1991). Orthogonal least squares learning algorithm for radial basis function networks. IEEE Transactions on Neural Networks, 2(2), 302–309. Chen, L., & Narendra, K. S. (2000). Nonlinear adaptive control using neural networks and multiple models. In Proceedings of the 2000 American control conference (vol. 6, pp. 4199–4203). Conte, G., Longhi, S., & Zulli, R. (1995). Nonholonomic motion planning using distance field. In Proceedings of the 1995 IFAC conference on intelligent autonomous vehicles (pp. 93–98). Helsinky, Finland.
ARTICLE IN PRESS P. Antonini et al. / Control Engineering Practice 14 (2006) 1279–1295 Conte, G., Longhi, S., & Zulli, R. (1996). Motion planning for unicycle and car-like robots. International Journal of Systems Science, 27(8), 791–798. Corradini, M., Ippoliti, G., & Longhi, S. (2003). Neural networks based control of mobile robots: Development and experimental validation. Journal of Robotic Systems, 20(10), 587–600. Cybenko, G. (1989). Approximation by superpositions of a sigmoidal function. Mathematics of Control, Signal and Systems, 2(4), 303–314. D’Amico, A., Ippoliti, G., Longhi, S. (2001a). Adaptation and learning in neural networks multiple models based control of mobile robots. In Proceedings of the 2001 IFAC workshop on adaptation and learning in control and signal processing (ALCOSP 2001) (pp. 199–206). Cernobbio-Como, Italy. D’Amico, A., Ippoliti, G., Longhi, S. (2001b). A radial basis function networks approach for the tracking problem of mobile robots. In Proceedings of the 2001 IEEE/ASME international conference on advanced intelligent mechatronics (AIM ’01) (pp. 498–503). Como, Italy. de Wit, C. C., & Khennouf, H. (1995). Quasi-continuous stabilizing controllers for nonholonomic systems: Design and robustness considerations. In Proceedings of the third European control conference (pp. 2630–2635). Rome, Italy. de Wit, C. C., Khennouf, H., Samson, C., & Sordalen, O. J. (1993). Nonlinear control design for mobile robots. In Y. F. Zheng (Ed.), Recent trends in mobile robots (pp. 121–156). Singapore: World Scientific. Girosi, F., & Poggio, T. (1990). Neural networks and the best approximation property. Biological Cybernetics, 63, 169–176. Hunt, K. J., Sbarbaro, D., Zbikowski, R., & Gawthrop, P. J. (1992). Neural networks for control systems—a survey. Automatica, 28(6), 1083–1112. Hush, D. R., & Horne, B. G. (1993). Progress in supervised neural networks. IEEE Signal Processing Magazine, 10, 8–39. Ippoliti, G., & Longhi, S. (2004). Multiple models for adaptive control to improve the performance of minimum variance regulators. In IEE Proceedings - Control Theory and Applications (vol. 151, no. 2, pp. 210–217). Oxford, UK. Jagannathan, S., Lewis, F., Pastravanu, O. (1994). Model reference adaptive control of nonlinear dynamical systems using multilayer neural networks. In Proceedings of the 1994 IEEE World congress on computational intelligence, 1994 IEEE international conference on neural networks (pp. 4766–4771). Orlando, FL, USA. Jetto, L., Longhi, S., & Venturini, G. (1999). Development and experimental validation of an adaptive extended kalman filter for the localization of mobile robots. IEEE Transactions on Robotics & Automation, 15, 219–229.
1295
Jetto, L., Longhi, S., & Vitali, D. (1999). Localization of a wheeled mobile robot by sensor data fusion based on a fuzzy logic adapted kalman filter. Control Engineering Practice, 7, 763–771. Mihram, G. A. (1972). Simulation: Statistical foundations and methodology. Mathematics in science and engineering (vol. 92). New York, USA: Academic Press. Morse, A. S. (1995). Control using logic-based switching. In A. Isidori (Ed.), Trends in control: A European perspective (pp. 69–113). London, Great Britain: Springer. Narendra, K. S., & Xiang, C. (2000). Adaptive control of discrete-time systems using multiple models. IEEE Transactions on Automatic Control, 45(9), 1669–1686. Polycarpou, M. M. (1996). Stable adaptive neural control scheme for nonlinear systems. IEEE Transactions on Automatic Control, 41(3), 447–451. Polycarpou, M. M., & Ioannou, P. A. (1995). Adaptive bounding techniques for stable neural control systems. In Proceedings of the 34th conference on decision and control (pp. 2442–2447). New Orleans, LA. Sadegh, N. (1995). A nodal link perceptron network with applications to control of a nonholonomic system. IEEE Transactions on Neural Networks, 6(6), 1516–1523. Sanner, R. M., & Essex, C. (1996). Multiresolution radial basis function networks for the adaptive control of robotic systems. In UKACC international conference on control ’96 (pp. 894–898). Sanner, R. M., & Slotine, J.-J. E. (1991). Stable adaptive control and recursive identification using radial gaussian networks. In Proceedings of the 30th conference on decision and control (pp. 2116–2123). Brighton, England. Sanner, R. M., & Slotine, J.-J. E. (1994). Function approximation, ‘‘neural’’ networks, and adaptive nonlinear control. In Proceedings of the third IEEE conference on control applications (pp. 1225–1232). Glasgow, UK. Unar, M. N., & Murray-Smith, D. J. (1999). Automatic steering of ships using neural network. International Journal of Adaptive Control and Signal Processing, 13, 203–218. Warwick, K., Kambhampati, C., Parks, P., & Mason, J. (1996a). Dynamic systems in neural networks. In K. Hunt, G. Irwin, & K. Warwick (Eds.), Neural network engineering in dynamic control systems (pp. 27–41). London, Great Britain: Springer. Warwick, K., Kambhampati, C., Parks, P., & Mason, J. (1996b). Dynamic systems in neural networks. In K. Hunt, G. Irwin, & K. Warwick (Eds.), Neural network engineering in dynamic control systems (pp. 27–41). London, Great Britain: Springer. Weller, S. R., & Goodwin, G. C. (1994). Hysteresis switching adaptive control of linear multivariable systems. IEEE Transactions on Automatic Control, 39(7), 1360–1375.