Accepted Manuscript Real-time nonparametric reactive navigation of mobile robots in dynamic environments Sungjoon Choi, Eunwoo Kim, Kyungjae Lee, Songhwai Oh
PII: DOI: Reference:
S0921-8890(16)30039-2 http://dx.doi.org/10.1016/j.robot.2016.12.003 ROBOT 2763
To appear in:
Robotics and Autonomous Systems
Received date : 24 January 2016 Revised date : 28 October 2016 Accepted date : 11 December 2016 Please cite this article as: S. Choi, et al., Real-time nonparametric reactive navigation of mobile robots in dynamic environments, Robotics and Autonomous Systems (2016), http://dx.doi.org/10.1016/j.robot.2016.12.003 This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Real-Time Nonparametric Reactive Navigation of Mobile Robots in Dynamic Environments Sungjoon Choi, Eunwoo Kim, Kyungjae Lee, Songhwai Oh∗ Department of Electrical and Computer Engineering, Seoul National University, Seoul, Korea
Abstract In this paper, we propose a nonparametric motion controller using Gaussian process regression for autonomous navigation in a dynamic environment. Particularly, we focus on its applicability to low-cost mobile robot platforms with low-performance processors. The proposed motion controller predicts future trajectories of pedestrians using the partially-observable egocentric view of a robot and controls a robot using both observed and predicted trajectories. Furthermore, a hierarchical motion controller is proposed by dividing the controller into multiple sub-controllers using a mixture-ofexperts framework to further alleviate the computational cost. We also derive an efficient method to approximate the upper bound of the learning curve of Gaussian process regression, which can be used to determine the required number of training samples for the desired performance. The performance of the proposed method is extensively evaluated in simulations and validated experimentally using a Pioneer 3DX mobile robot with two Microsoft Kinect sensors. In particular, the proposed baseline and hierarchical motion controllers show over 65% and 51% improvements over a reactive planner and predictive vector field histogram, respectively, in terms of the collision rate. Keywords: Autonomous navigation, dynamic environments, Gaussian process regression, and learning curve of Gaussian process regression
∗ Corresponding
author Email addresses:
[email protected] (Sungjoon Choi),
[email protected] (Eunwoo Kim),
[email protected] (Kyungjae Lee),
[email protected] (Songhwai Oh)
Preprint submitted to Journal of LATEX Templates
December 22, 2016
1. Introduction The use of robots has been extending steadily from structured industrial factories to unstructured and cluttered daily living spaces. To reflect this trend, numerous navigation algorithms for dynamic environments have been studied [1, 2, 3, 4, 5, 6, 7]. While 5
motion planning in a static environment can be done relatively easily using existing algorithms, such as sampling-based path planning or model predictive control, not all methods are applicable for more realistic dynamic environment. One important reason behind this discrepancy is that most of the algorithms assume structured environments, e.g., positions and dynamics of moving obstacles are assumed to be given beforehand.
10
Moreover, considering the real-time constraint of navigation algorithms, computationally heavy methods such as sampling-based or optimization-based methods are often intractable for low-cost mobile robots. On the other hand, reactive control algorithms, such as the potential field method [4], dynamic window approach (DWA) [8], and vector field histogram (VFH) [9, 10], are computationally efficient, and thus, suitable for
15
real-time navigation applications. However, these methods are vulnerable to noises and changes in dynamic environments as they only consider current measurements. In this paper, we propose a nonparametric motion controller suitable for low-cost mobile robots operating in a dynamic environment to overcome issues, such as structured environments assumption and heavy computational loads. The proposed motion
20
model predicts future positions of objects using an autoregressive Gaussian process motion model to overcome the limitations of the partially-observable view of a robot. In particular, the future position of an object is predicted given three previous consecutive positions. The prediction performance of the proposed motion model is shown to be more robust against measurement noises compared to existing linear model ap-
25
proaches, e.g., a constant velocity or acceleration model, often used in practice. The proposed motion controller using Gaussian process regression learns how to act from exhaustive offline simulations (training phase) using a receding time horizon control and achieves the state-of-the-art navigation performance in real-time in the test phase. Moreover, in order to further reduce the computational complexity, the motion con-
30
troller is divided into multiple sub-controllers using a mixture-of-experts framework
2
for a mixture of Gaussian processes. The overall control process takes less than 4ms making it suitable for low-cost or embedded processors. An efficient method of approximating the upper bound of the generalization error of Gaussian process regression, which is known as a learning curve [11], is also proposed. Using this learning curve, 35
we can efficiently determine the required number of training samples for a desirable level of performance of the proposed algorithm. A preliminary version of this work appeared in [7]. The current work extends [7] by incorporating a learning method using Hamiltonian Monte Carlo, and introducing a hierarchical Gaussian process motion controller using mixture-of-experts framework
40
to further reduce the computational complexity and an approximation to the learning curve of Gaussian process regression for estimating a required number of training samples for the desired accuracy of the proposed motion model and motion controller. A more extensive set of experiments is also included in the current work. The remainder of this paper is organized as follows. In Section 2, related work
45
is discussed. In Section 3, Gaussian processes and a mixture of Gaussian processes are described. The proposed motion model and motion controller are described in Section 4 and 5, respectively. The performance of the proposed algorithm is extensively validated both in simulations and real-world experiments. The results are shown in Section 6 and 7.
50
2. Related Work A number of studies in autonomous navigation for a dynamic environment have been conducted. We can categorize various navigation methods into four different categories using two main criteria. The first criterion is the perspective of a view used by the navigation method (reference or egocentric) and the second is the method used
55
for selecting control (optimization-based or reactive control). The classification of navigation algorithms and our proposed method is summarized in Table 1. We can further categorize motion prediction methods into two groups based on the method used for predicting the future positions of dynamic obstacles (model-based and datadriven).
3
Table 1: Classification of navigation algorithms.
60
Optimization-based
Reactive control
Reference view
[3, 12, 13, 2, 14, 1]
[15, 4]
Egocentric view
[16, 17, 18, 19, 20]
This work, [10]
In the reference view approach, the positions of a robot and other obstacles in the environment are assumed to be given or measured by external devices. From this complete environmental information, the level of safety of a planned path can be calculated or sometimes guaranteed in a probabilistic manner [3, 2]. However, these algorithms are limited to static environments or dynamic environments with known
65
obstacle trajectories. On the contrary, the egocentric view approach uses information from the perspective of an operating robot. Since the information is coming from a partially-observable view, it is sometimes hard to guarantee complete collision avoidance [17, 19]. However, since these algorithms do not need additional devices for localizing a robot and obstacles, it can be easily deployed for many practical settings.
70
We also categorize navigation algorithms based on the method used to make a control input for navigating in a dynamic environment. In particular, an optimization-based method outputs a finite-length trajectory minimizing some cost functional where a randomized sampling is often used for optimization. On the other hand, a reactive controlbased method outputs a control input for navigation in response to received measure-
75
ments. Whereas the optimization-based method can produce more cost-effective control inputs, it often requires a considerable amount of computations, making such methods unsuitable for low-cost mobile robot platforms. Lastly, navigation algorithms can be categorized into a model-based prediction and data-driven prediction methods based on how they predict future positions of dynamic
80
obstacles. In a model-based prediction approach, it is assumed that a dynamic obstacle follows a certain type of dynamics, e.g., a constant velocity model, and the prediction is made using a chosen dynamic model. In a data-driven approach, however, the dynamic
4
model of an obstacle is learned from experience. In particular, nonparametric methods, such as Gaussian process regression, have been used in this regard. 85
2.1. Reference View + Optimization-Based Method Luders et al. [3] have proposed a chance-constrained rapidly-exploring random tree (CC-RRT) which guarantees probabilistic feasibility for obstacles with linear dynamics. Assuming that the linear dynamic model is corrupted by a white Gaussian noise and the shape of each obstacle is a polygon, probabilistic feasibility at each time
90
step can be established based on the predictive distribution of the state of obstacles. [12] proposed a model predictive control (MPC) approach under the RRT framework to speed up the computation time. The proposed algorithm selects the best trajectory according to the cost of traversing a potential field. In [21], Joseph et al. used a mixture of Gaussian processes with a Dirichlet process prior to model and cluster an unknown
95
number of motion patterns. In [13], Aoude et al. combined CC-RRT [3] and the motion model from [21] and proposed an RRT-Reach Gaussian Process (RR-GP) path planning algorithm [13, 2]. From diverse simulations, they have shown that an RR-GP can identify probabilistically safe trajectories in the presence of dynamic obstacles. Trautman et al. [1] proposed a multiple goal interacting Gaussian process (mIGP) which
100
is an extension of an interacting Gaussian process (IGP) from [14]. From a number of experiments in a university cafeteria, the proposed algorithm outperformed a noncooperative algorithm similar to [13] and a reactive planner [8]. In their experiment, the positions of pedestrians and a robot are obtained using Point Grey Bumblebee2 stereo cameras mounted in the ceiling.
105
2.2. Reference View + Reactive Control In [15], a number of features that can be effectively used in socially compliant robot navigation scenarios have been presented including density, speed and orientation, velocity, and social forces. From extensive sets of comparisons, Vasquez et al. emphasized the importance of motion prediction with respect to smooth and human-
110
like robot motion. A potential function based path planner is proposed in [4]. An
5
elliptical shape potential field is used for signifying the predicted position and the direction of an obstacle. It is shown that it is possible to encode probabilistic motion information into the potential field formulation. 2.3. Egocentric View + Optimization-Based Method 115
In [16], SPENCER, a socially aware service robot for guiding and assisting passengers in busy airports, has been introduced. The motion planning module considers social rules learned from Bayesian inverse reinforcement learning and the policy function is trained with Gaussian process inverse reinforcement learning. In [17], Park et. al. proposed a sampling based navigation algorithm for a wheelchair robot in a dy-
120
namic environment where model predictive equilibrium point control is used to follow a selected trajectory. However, whereas the sensor information are gathered from the real wheelchair robot, the experiments are performed in simulated environments. Furthermore, future positions of pedestrians are predicted using a constant velocity model. In [18], a large set of feasible paths are generated using kinodynamic RRT [22] and the
125
best path for a robot is chosen with respect to multiple cost functions considering goal configuration, nearby pedestrians, and obstacles. In particular, a normalized weightedsum method had been used to combine multiple costs. In [19], a socially adaptive path planning framework is proposed by combining Bayesian inverse reinforcement learning with graph-based path planning for a robotic wheelchair mounted with an RGB-D
130
camera. In particular, the social adaptability is learned from human demonstrations and the velocities of pedestrians are estimated with an RGB-D optical flow algorithm. Several real-world experiments are conducted using a robotic wheelchair with an RGBD camera and its performance is compared to the dynamic window approach (DWA) planner [8]. In [20], Fulgenzi et al. have modeled motion patterns of moving obstacles
135
using a mixture of Gaussian processes by assuming that typical motion patterns exist, which is similar to [21]. The likelihood of the future trajectory of an obstacle is computed by conditioning the current trajectory of a moving obstacle given training data sequences and used for computing risk of collision. This risk of collision is used inside the motion planner, called Risk-RRT, which is similar to [3].
6
140
2.4. Egocentric View + Reactive Control In [10], Ulrich et al. presented the VFH* algorithm, which is an enhanced version of VFH. While the original VFH only looks for gaps in locally constructed polar histogram, VFH* predicts the future condition of a robot with a look-ahead verification and generates a control input considering both current and future conditions.
145
2.5. Model-Based Prediction vs. Data-Driven Prediction Parametric motion prediction methods have been widely used to estimate the motions of dynamic objects using a constant velocity model [17, 23]. In [3], Luders et. al. assumed that any dynamic obstacles are assumed to follow known, deterministic paths and provided probabilistically safe path planning method using a chance-constrained
150
RRT (CC-RRT). For the data-driven approach, Gaussian processes have been extensively used. Gaussian process motion patterns (GPMP) is proposed in [13] by extending CC-RRT with data-driven predictions. The GPMP is defined as a mapping from locations to a distribution over trajectory derivatives where a Gaussian process is used to model the
155
mapping function [13]. In [2], Aoude et. al. proposed a RRT-Reach Gaussian process (RR-GP) which combines GPMP in a closed-loop RRT algorithm where the GPMP is used to sample nodes of an RRT search tree. Similar motion pattern modeling using a mixture of Gaussian processes has been studied in [21, 20]. However, the aforementioned motion prediction methods assume that there exists certain motion patterns at
160
each location which is not suitable for open spaces where dynamic objects can freely move around. 3. Preliminaries 3.1. Gaussian Process Regression A Gaussian process defines a distribution over functions and is completely speci-
165
fied by its mean function m(x) and covariance function k(x, x0 ) [11]. For notational simplicity, the mean function is usually set to be zero.
7
Let D = {(xi , yi ) | i = 1, · · · , n} be a set of input-output pairs. Let x =
{x1 , . . . , xn } and y = [y1 . . . yn ]T . The conditional distribution of y? at a new input x? given data D becomes
y? |D ∼ N (ˆ µ(x? |D), σ ˆ 2 (x? |D)),
(1)
2 µ ˆ(x? |D) = k(x? , x)T (K(x, x) + σw I)−1 y,
(2)
σ ˆ 2 (x? |D) = k(x∗ , x∗ ) − k(x∗ , x)T (K(x, x))−1 k(x∗ , x),
(3)
where
and
where k(x? , x) ∈ Rn and K(x, x) ∈ Rn×n are a vector and a covariance matrix
defined as [k(x? , x)]i = k(x? , xi ) and [K(x, x)]ij = k(xi , xj ), respectively.
Due to its nonparametric flexibility and rich expressiveness, GPR has been exten170
sively used to model complex and nonlinear motion patterns of pedestrians [3, 13, 2, 21, 20]. In the aforementioned works, the motion pattern is defined as a distribution over trajectory derivatives as a function of locations. Whereas we also utilize GPR to model motion patterns of pedestrians, the proposed motion prediction method in Section 4 utilized Gaussian process regression (GPR) to model the next positions of
175
pedestrians given previous positions relative to the position of a robot to capture how pedestrians move when they are approaching to a robot. Furthermore, it is also shown from the experiments that the prediction with GPR is more robust to the measurement noises compared to the parametric constant velocity model. 3.2. A Mixture of Gaussian Processes
180
In Gaussian process regression, both training data and test data are assumed to follow a joint Gaussian distribution via the covariance function k(x, x0 ). While this joint Gaussian distribution makes the mean function (2) of a Gaussian process analytically tractable, it also makes it hard to model a function with different degrees of smoothness. Nonstationary Gaussian processes [24] can model functions with inhomogeneous
185
smoothness, but it requires an infinite dimensional hyperparameter (function), increasing computational requirements. 8
In [25], Tresp proposed a mixture of Gaussian processes (MGP) inspired by the mixture of experts method. This MGP model allows Gaussian processes to model more general functions by clustering the training data into groups with the similar prop190
erty and adjusting hyperparameters within each group (or cluster). By doing this, an MGP is shown to be more robust to model functions with inhomogeneous smoothness. Moreover, an MGP can also alleviate the memory requirement as well as computational complexity. While Gaussian process regression has O(n2 ) memory requirement and O(n3 ) computational complexity, an M clustered mixture of Gaussian processes 2
195
has O( nM ) memory requirement for saving M matrices of size
n M
computational complexity for inverting the kernel matrix.
×
n M
3
n and O( M 2)
Let M be the number of mixtures and Fµ (x) = {µ1 (x), ... , µM (x)} be M mean
functions of an MGP. Then the mean function of an MGP given a test input x∗ can be obtained as E(y|x∗ , Fµ (x)) =
M X j=1
µj (x∗ )P (z = j|x∗ ),
(4)
where P (z = j|x∗ ) is a gating network which outputs the contribution of each (Gaussian process) expert and z is a random variable indicating the assignment of data to clusters. For modeling a set of mean functions, Fµ (x), we need M hyperparameters and gating networks Fz (x) = {P (z = 1|x), . . . , P (z = M |x)} to compute cluster assignment probabilities. The expectation-maximization (EM) algorithm is used to estimate hyperparameters of an MGP model. In particular, the contribution of the ith training data (xi , yi ) to the jth cluster is estimated as
where
gˆ(zi |j, xi , yi ) Pˆ (zi = j|xi , yi ) = PM ˆ(zi |m, xi , yi ), m=1 g gˆ(zi |k, xi , yi ) = Pˆ (zi = k|xi )N (yi |µk (xi ), kk (xi )),
200
(5)
(6)
and µk (x) and kk (x) are, respectively, mean and variance functions of a Gaussian process trained with data in the kth cluster. In [26], an infinite mixture of Gaussian process experts (IMGPE) is proposed. Unlike an MGP which have a finite number of clusters, the IMGPE model can au9
tomatically adjust the number of clusters by setting a Dirichlet prior over mixtures. 205
However, this nonparametric Bayesian approach does not likely work well in practice, especially, when the number of training data is not sufficiently large. It is mainly due to its sampling-based prediction and the vague prior distribution on the concentration parameter which directly influences the number of clusters. In our work, we used an MGP approach to motion controller to alleviate the computational cost.
210
3.3. Hamiltonian Monte Carlo Method Hamiltonian Monte Carlo (HMC) is a Markov Chain Monte Carlo (MCMC) method, in which Hamiltonian dynamics is applied to a Markov chain [27]. Hamiltonian dynamics explains the motion of an object using its location x and momentum p, which are related to potential energy U (x) and kinetic energy K(p), respectively. Hamiltonian H(x, p) = U (x) + K(p), which is a constant in a frictionless space, gives us the relationship between U (x) and K(p) by following equations: ∂H(x, p) ∂K(p) ∂x = = ∂t ∂p ∂p ∂p ∂H(x, p) ∂U (x) =− =− . ∂t ∂x ∂x
(7) (8)
The concept of Hamiltonian dynamics can be applied to sampling based maximum likelihood estimation (MLE) for finding hyperparameters θ and this method is often called Hamiltonian Monte Carlo (HMC). In our case, we treat hyperparameters θ as the location x and the negative log likelihood of hyperparameters θ with respect to all training data as the potential energy U (x) as follows: U (x) = − log p(y|X, θ) =
1 T −1 1 y K y + log |K| + C, 2 2
where K is a kernel matrix and C is a constant. The momentum p is sampled from a Gaussian distribution N (0, Σp ) and works as a random search direction. The kinetic energy K(p) is defined as − log(N (p|0, Σp )). More details can be found in [27]. We
have used HMC to learn hyperparameters of proposed Gaussian process motion model
215
and motion controller. In a hierarchical Gaussian process motion controller discussed in Section 5.3, the same set of hyperparameters is used by all mixture components.
10
4. Autoregressive Gaussian Process Motion Model In this section, we focus on the problem of predicting the future trajectory of a dynamic obstacle given a recent trajectory using autoregressive Gaussian process re220
gression. Predicting the future trajectory is significantly important in robotics because an accurate and robust motion prediction is a key element in successful navigation in the environment, in which a robot coexists with humans. 4.1. Autoregressive Gaussian Process Motion Model m Let xt ∈ R2 be the position of an object at time t and τt−1 be the trajectory or a
m collection of recent m positions until t − 1, i.e., τt−1 = [xTt−m . . . xTt−1 ]T ∈ R2m .
An autoregressive model is a way of representing a time-varying random process and an autoregressive model of order m is defined as xt = c +
m X
(9)
ψi xt−i + wt
i=1
where c, ψ1 , ... , ψm are parameters of an autoregressive model and wt is a process 225
noise which is often modeled by a zero-mean Gaussian random variable. Under the Gaussian noise assumption, the parameters of (9) can be estimated using ordinary least squares. We use Gaussian process regression to generalize an autoregressive model such m that xt is a Gaussian process whose input is τt−1 , i.e., given ω ∈ Ω, xt is a function of
230
(xt−m , . . . , xt−1 ), and xt has the following Gaussian distribution using (1): xt
∼
m m N (µ(τt−1 ), Σ(τt−1 )),
(10)
m m ) and Σ(τt−1 ) are the mean vector and covariance matrix, respectively, where µ(τt−1 m based on the recent m positions τt−1 .
In detail, we use the following kernel function for the proposed autoregressive Gaussian process motion model (AR-GPMM): 0
kSE (τ, τ ) =
σf2
exp −
m X (xt−i+1 − x0t−i+1 )2
2σx2i
i=1
11
!
2 + σw δτ,τ 0 ,
(11)
where τ = [xTt−m . . . xTt−1 ]T ∈ R2m is a trajectory or a sequence of m positions
2 } are hyperparameters of a Gaussian process, which and θ = {σf2 , σx21 , . . . , σx2m , σw
235
are estimated using the HMC method described in Section 3.3.
Let DM = {(τ m , xnext )i |i = 1, 2, ... , k} be k training data for the motion model,
which consists of previous m positions τ m and the corresponding next position xnext . m The collection of trajectories in DM is denoted by τD and the collection of next M
positions is denoted by xDM . The next position can be predicted using the AR-GPMM 240
as follows: ˆt x
m = E(xt ) = µ(τt−1 )
(12)
m m T m m −1 = kSE (τD , τt−1 ) KSE (τD , τD ) xDM , M M M m m m where kSE (·) is a kernel vector where the ith element is kSE (τim , τt−1 ), and KSE (τD , τD ) M M m m is a kernel matrix with [KSE (τD , τD )]ij = kSE (τim , τjm ). M M
4.2. Comparison With Linear Autoregressive Models We compared our proposed AR-GPMM with a linear autoregressive model. The 245
autoregressive model in (9) is a linear model, such that the next position is computed as a linear combination of past m positions. A constant velocity or constant acceleration model, often used in practice, can be categorized into this group. The simplicity of a linear model makes it easy to be implemented. However, in practice, a linear model often suffers from its vulnerability to noise. In other words, in
250
the presence of measurement noises, a linear (parametric) model is more likely to show poor prediction performance. To validate the robustness of the proposed AR-GPMM, we have conducted experiments of predicting the next position given past positions. We first collected trajectories of people moving around a robot using a Vicon motion capture system1 at a sampling
255
rate of 1 Hz. The collected trajectories consist of 837 points and 5-fold cross validation is used to compute the average prediction error. 1
URL: http://www.vicon.com/
12
(a)
(b)
Figure 1: (a) Collected trajectories of people. (b) Average prediction errors of a linear autoregressive model and AR-GPMM. Figure 1(a) shows collected trajectories in black lines and the position of a robot with a red circle. We assume noisy measurements and computed prediction accuracies under different measurement errors. The average prediction errors of the linear autore260
gressive model and AR-GPMM are shown in Figure 1(b). We can see that under the noiseless scenario, both methods predict well. However, in the presence of measurement noises, the proposed AR-GPMM outperforms the linear autoregressive model in terms of the prediction error. In particular, one snapshot of both prediction methods is shown in Figure 1(a). Three consecutive true positions and noisy measurements are
265
shown in red triangles and blue circles, respectively. The predicted positions using the linear autoregressive model and the proposed AR-GPMM are shown in a red circle and a green circle, respectively. 4.3. Learning Curves for Gaussian Process Regression As described in Section 4.2, the proposed AR-GPMM shows good prediction per-
270
formance as well as robustness to noise. Moreover, it is known that given an infinite number of training data, a predictor using a Gaussian process converges to the true target function with respect to the point-wise loss function [11]. When the mean and variance of a Gaussian process are evaluated, a matrix inversion operation is required, which results in time complexity of O(n3 ), where n is the number
13
275
of training data. Hence, it is desirable to limit the number of training data. On the other hand, we want to use a large number of training data in order to improve the prediction accuracy of Gaussian process regression. Hence, it is crucial to determine the minimum required number of training data for the desired accuracy. This trade-off between the prediction performance and the number of training data
280
is called the learning curve [28]. To be specific, we focus on the average-case learning curve which shows the average generalization performance as a function of the number of training data under the assumption that the target function is drawn from a Gaussian process and an underlying covariance structure is known. Details of existing learning curves for Gaussian process regression can be found in [11]. Under the assumption that a target function f is drawn from a Gaussian process with known covariance function k(x, x0 ) and we observe n d-dimensional locations of f , resulting a training set Dn = (X, y) of size n, where X are locations and y are outputs of the training set, the generalization error is given by Z g E g (f ) = EX (f )dX,
where g EX (f ) = 285
Z
k(x∗ , x∗ ) − k(x∗ )T K(X)−1 k(x∗ ) p(x∗ )dx∗ ,
(13)
(14)
k(x∗ ) = k(X, x∗ ) ∈ Rn is a kernel vector, K(X) = K(X, X) ∈ Rn×n is a kernel
matrix, and p(x∗ ) is the distribution of the test input x∗ . Note that the first term inside the integral is equivalent to the predictive variance (3).
Numerically approximating
(13) with Monte Carlo integration is usually infeasible as it requires integrating over all possible configurations of the test input, x∗ , and n training samples, X. In particu290
lar, handling n training samples requires heavy computational resources as the sample space is a product of n d-dimensional spaces, i.e., Rn×d . Many existing learning curves for Gaussian process regression find the upper or lower bound of the generalization error using eigenfunction analysis. For example, the asymptotic property of Gaussian process regression is derived from the eigenfunction
295
analysis of kernel functions. However, computing eigenfunctions of a kernel function can only be analytically tractable for a squared exponential function with the Gaussian
14
measure [11]. Moreover, the number of eigenfunctions is infinite for a non-degenerate kernel function, making the analysis even harder. We propose an efficient method for approximating the upper bound of the gener300
alization error (13) for Gaussian process regression with an isotropic kernel function. The proposed approximation scheme can numerically approximate the upper bound and is independent from the dimension of the input space, making its computation tractable. Finding the upper bound of the generalization error is highly useful since it can guarantee the worst-case performance. The predictive variance of a Gaussian process using n training data Dn is shown in
(3). For simplicity, it can be represented as follows:
2 σX (x∗ ) = k(x∗ , x∗ ) − k(x∗ , X)K(X, X)−1 k(X, x∗ )
¯ ∗ , X)K(X, ¯ ¯ = σf2 − σf2 k(x X)−1 k(X, x∗ ),
(15)
¯ ¯ where σf2 = k(x∗ , x∗ ) is the gain of the kernel function and k(·) and K(·) are a
¯ kernel vector and a kernel matrix computed using a normalized kernel function k(·), ¯ x0 ). Then we can rewrite (15) as i.e., k(x, x0 ) = σf2 k(x, 2 ¯T K ¯ , ¯ −1 k σX (x∗ ) = σf2 1 − k
where
and
305
¯ ∗ , x1 ) k(x
¯= k ¯ ∗ , X2:n ) k(x ¯ 1 , x2:n ) k(x
1
(16)
¯ = . K ¯ 2:n , x1 ) k(x ¯ 2:n , x2:n ) k(x
Applying the matrix inversion lemma, (16) becomes 2 σX (x∗ )
¯ ∗ , x1 )2 ) = σf2 (1 − k(x ¯ 2:n , x∗ ) − k(x ¯ 2:n , x1 )k(x ¯ 1 , x∗ ))T − σf2 (k(x ¯ 2:n , x2:n ) − k(x ¯ 1 , x2:n )T k(x ¯ 1 , x2:n ))−1 × (k(x ¯ 2:n , x∗ ) − k(x ¯ 1 , x2:n )T k(x ¯ 1 , x∗ )). × (k(x
15
(17)
2 The first term of (17) is the predictive variance σD of a Gaussian process using only 1
one training sample (x1 , y1 ) and the second term is always nonnegative due to the positive-definitess of a kernel function. Using this fact, we can bound the generalization error of Gaussian process regression using a subset of training data. Lemma 1. The generalization error (13) for Gaussian process regression satisfies Z Z E g (f ) ≤ EUg (f ) = σx2 c (x∗ )p(X)p(x∗ )dXdx∗ , (18) 310
where xc is a training input which is nearest to x∗ , X is a set of n training inputs, p(x∗ )
and p(X) are probability density functions of a test input and a set of training inputs, ¯ ∗ , xc )2 ) is the predictive variance computed respectively, and σx2 c (x∗ ) = σf2 (1 − k(x using only one training data xc .
It is clear that the nearest point xc gives the most tight upper bound among all 315
inputs in X, if a single such input is chosen. Computing (18) is easier than computing (13) as it requires no matrix inversion, however, we still have to integrate over x∗ ∈ Rd
and X ∈ Rd×n , where d is the dimension of the input data and n is the number of training data. We first make two assumptions:
• Assumption 1: The kernel function is isotropic, i.e., k(x, x0 ) = kI (||x − x0 ||). 320
• Assumption 2: The training data points are uniformly distributed around x∗ . Let
R be the radius of a ball around x∗ , such that there are n data points uniformly placed inside the ball.
The following theorem from [29] is used to estimate the upper bound EUg (f ). Theorem 1 ([29]). In a binomial point process (BPP) consisting of n points uniformly and randomly distributed in a d-dimensional ball of radius R centered at the origin, the Euclidean distance Rm from the origin to its mth nearest point follows a generalized beta distribution, i.e., for r ∈ [0, R], d B(m − 1/d + 1, n − m + 1) R B(n − m + 1, m) r d 1 ×β ; m − + 1, n − m + 1 , R d
fRm (r) =
where β(x; a, b) = (1/B(a, b))xa−1 (1 − x)b−1 and B is the beta function. 16
Radius (R): 1 / Dimension (d): 2
Radius (R): 1
Generalization Error
Upper-bound of Generalization Error (BPP)
1
E g (f ) EUg (f ) ˜ g (f ) E U
0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0
5
10
15
20
25
30
35
0.8
d=1 d=2 d=3 d=4 d=5 d=6
0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 10 0
40
Number of Training Inputs (n)
10 1
10 2
10 3
10 4
10 5
Number of Training Data
(a)
(b)
Figure 2: (a) Learning curves for Gaussian process regression estimated from simulation (d = 2 and R = 1). The learning curve (13) and its upper bound (18) are shown in blue circles and green triangles, respectively. The proposed approximation (20) to the upper bound using a binomial point process is shown in red squares. (b) Approximate upper bounds (20) of the generalization error of Gaussian process regression at different input space dimensions. By Assumption 1, the predictive variance σx2 c (x∗ ) is a function of the distance r between xc and x∗ , i.e., r = ||x∗ − xc ||2 . By defining σI2 (r) = σx2 c (x∗ ) and using the fact data points are uniformly distributed from Assumption 2, we have Z R g EU (f ) = σI2 (r)g(r)dr,
(19)
0
where g(r) is a probability density function of r, the distance between the test input point x∗ and the nearest input point xc . From Assumption 2, we can assume that a ball of radius R centered at the nearest input point xc also contains approximately n input points, which are distributed uniformly. Hence, Theorem 1 can be applied and g(r) can be computed using fR1 (r). Therefore, we can approximate EUg (f ) as follows: ˜ g (f ) = EUg (f ) ≈ E U 325
Z
0
R
σI2 (r)fR1 (r)dr,
(20)
Simulation results of a Gaussian process learning curve and its upper bounds using (18) and (20) are shown in Figure 2(a). We assume that the input space of data points is a unit ball in R2 . Monte Carlo integrals of (18) and (20) are computed by uniformly 17
sampling 1, 000 test inputs for a training set of 1, 000 data points in a two-dimensional unit ball and 1, 000 radii within [0, 1]. The original learning curve (13) and its upper 330
bound (18) are shown in blue circles and green triangles, respectively, and the proposed upper bound using (20) is shown in red squares. First of all, we can see that the required number of training data for Gaussian process regression to achieve similar performance increases exponentially as the dimension of the input space increases (see Figure 2(b)). Furthermore, while both upper bounds, (18) and (20), come up with
335
similar results as expected, the computation times for approximating the upper bound differs significantly.2
In [28], the authors derived the upper bound of the learning
curve similar to ours. However, their method is restricted to one dimensional problem whereas our method can be applied regardless of the dimension of the input space. 5. Nonparametric Bayesian Motion Controller Many existing navigation methods for dynamic environments assume that veloci-
340
ties of dynamic obstacles in the environments are available, which is often impossible or too expensive to obtain as a real-world application [19]. In this section, we develop a real-time motion controller for a mobile robot to avoid incoming (or static) obstacles. The main motivation is that numerous navigation algo345
rithms that work well in off-line or simulated environments may not be implementable for a real robot as a real-time application due to their high computational complexity. Moreover, when it comes to navigating in dynamic environments, where moving obstacles exist, more computations are often required. Hence, it is crucial for a navigation algorithm to run in real-time while maintaining the desirable performance in a dynamic
350
environment. 5.1. Gaussian Process Motion Controller Navigating in dynamic environments requires predicting future states and incorporating predicted states into finding the optimal control. However, exact dynamic 2
In particular, numerically computing (18) with 1, 000 test inputs and 1, 000 data points takes more than
2, 400s, whereas computing (20) with 1, 000 radii takes less than 60ms.
18
models as well as current state estimations are needed for an accurate prediction. Fur355
thermore, making an optimal control using both current and predicted states requires heavy computations. While existing methods first predict future states based on current measurements and then find an optimal control based on current and predicted states, we aim to directly find a mapping function that maps the current state to a corresponding control considering predicted states to alleviate the computational cost. In order
360
to do this, additional training phase is required and it is described in Section 5.2. This separation of a training phase and an execution phase plays a significant role in lightening the computational requirement and making the whole process suitable for real-time applications. In particular, we again use Gaussian process regression to model the mapping func-
365
tion Fu from a trajectory space T ⊂ R2m , where m is the length of a trajectory, to a
control input space U ⊆ R2 in case of a simple ground robot, such as a unicycle dynamic model. The covariance function ku (·) for the Gaussian process is specified as
follows: ku (τtm , τt0m )
=
σf2
exp −
2 + σw δτtm ,τt0m ,
m X (xt−i+1 − x0t−i+1 )2 i=1
2σx2i
! (21)
where τtm = [xTt−m+1 . . . xTt ]T ∈ R2m is a trajectory or a sequence of m positions
370
2 } are hyperparameters of a Gaussian process. In fact and θ = {σf2 , σx21 , . . . , σx2m , σw
(21) is equivalent to (11) as the input to the motion model and motion controller are the same. However, note that the output of an AR-GPMM is the next position of the object, whereas the output of the Gaussian process motion controller is the control input which avoids collisions. 375
The order m of the input space of a Gaussian process motion controller (GPMC) is an important parameter. If the order is too low, then it cannot fully capture the dynamics of a moving object. If the order is too high, the required number of training data will be too high as the required number of training data increases exponentially as the dimension of the input space increases as shown in Section 4.3. Furthermore, since
380
the GPMC utilizes a partially observable egocentric view, it takes more time to collect a trajectory with a sufficient length. Hence, as the order increases, the robot becomes 19
less reactive to dynamic obstacles. We have investigated an appropriate order for the GPMC from simulation. Using a training set of size 3,000, we have conducted 100 runs for different orders from 1 to 6. The average collision rates of the GPMC under 385
different orders are shown in Table 2. The best performance can be obtained when the order is three and it is used for the GPMC throughout the paper. Order
1
2
3
4
5
6
Collision ratio (%)
27
31
16
23.2
19
30
Table 2: Average collision rates of the proposed GPMC under different autoregressive orders. Since we do not assume any additional tracking system in our proposed framework, the GPMC solely depends on measurements from the egocentric-view of a robot. A nearest neighbor tracking algorithm is used to handle multiple trajectories by assigning 390
a detected object (or pedestrian) to the closest known trajectory, but a more sophisticated multi-target tracking algorithm, such as [30], can be applied. In the egocentric view approach, the view of a robot is constantly changing as a robot moves, which makes it hard to obtain continuous trajectories. Also motion planning considering the current measurement only is likely to fail to provide safe
395
navigation. In our approach, we tightly entangle the AR-GPMM into the GPMC by predicting the positions of pedestrians outside the current field of view. By using both observed and predicted trajectories, the GPMC can effectively overcome the limitation of the egocentric view. When more than one pedestrian is detected, we first predict the future positions of pedestrians and make control using the trajectory of the pedestrian
400
who is closest to the robot. Snapshots of the proposed motion controller with and without an AR-GPMM are shown in Figure 3. By incorporating predicted positions, a robot can navigate more safely through dynamic obstacles even when the robot cannot detect nearby pedestrians outside its current field of view.
20
Figure 3: Snapshots from GPMC-based collision avoidance simulations. GPMC-based collision avoidance without the motion prediction is shown in the top row. GPMCbased collision avoidance with the AR-GPMM motion prediction is shown in the bottom row. A robot and moving obstacles are shown in a green square and yellow ellipses, respectively. Observed and predicted trajectories are represented by black circles and red stars, respectively. 5.2. Training a GPMC 405
An exhaustive sampling-based receding horizon control is used to collect the training data for a GPMC. The objective is to find an optimal path of a fixed length where the cost function is inversely proportional to the distance to the goal location while maintaining a distance of 1, 000mm from a nearby pedestrian. A snapshot from the training phase is shown in Figure 4, where sampled paths, the selected optimal path,
410
and the future pedestrian path are shown in gray, red, and blue lines, respectively. Note that in the training phase, we assume that the dynamics of a pedestrian is known and collected positions of a pedestrian and a control input of a robot at each time to generate training data for the GPMC. As explained in Section 5.1, the proposed GPMC has separated training and execution phases, consequently, collecting the training data is
415
free from timing restrictions. In order to gather a diverse set of training data, the initial position of an obstacle in each trial is randomly sampled from the field of view of the robot. At each position, the velocity and acceleration of the obstacle is again randomly chosen. Once dense training 21
Figure 4: A sampling-based receding horizon control to generate training samples for the proposed GPMC. data are collected, a detrimental point process (DPP) [31] is used to select an effective 420
subset of training data while minimizing overlaps in the training data. Specifically, we used k-DPP algorithm for collecting a subset of size k and (21) is used as a kernel function. The collected training data is further clustered using a mixture of Gaussian processes for a hierarchical Gaussian process motion controller discussed in the next section.
425
5.3. Hierarchical Gaussian Process Motion Controller In Section 3.2, a mixture of Gaussian processes (MGP) is presented. While an MGP can effectively model a nonstationary function by clustering data into appropriate groups, it still suffers from its high-computational complexity for computing the cluster assignment probability of each data using the gating network.
430
In [32], Kim et al. proposed piecewise Gaussian processes for modeling nonstationary phenomena by partitioning the region of interest into disjoint subregions, such that each subregion has the same covariance structure. A Voronoi tessellation is used to partition the region. A Voronoi tessellation is defined by the cluster centers c = (c1 , ..., cM ), which divide the region D into M disjoint subregions, R1 , ..., RM ,
435
such that points within Ri are closer to ci than other center cj for j 6= i.
Instead of using a Voronoi tessellation, we use a support vector machine (SVM) 22
Figure 5: An overview of the hierarchical Gaussian process motion controller (HGPMC). classifier to partition the region. We first cluster the training data, i.e., training input and output pairs, using a mixture of Gaussian processes. Then, we train the classifier using input training data and its cluster labels. In the execution phase, once a new input 440
data is given, we compute the cluster label using the trained SVM classifier and make regression using the training data categorized into the classified cluster. The proposed method has two major benefits. By partitioning the region, one can efficiently model a nonstationary function. The other is the computational efficiency. Since the computational complexity of inverting a kernel matrix is cubic in the number
445
of training data, by dividing data into M clusters, we can reduce the computational 3
n complexity from O(n3 ) to O( M 2 ).
Using piecewise Gaussian processes, we propose a hierarchical Gaussian process motion controller (HGPMC). Each cluster in an HGPMC can be interpreted as a specific motion reaction. In other words, one motion control cluster might indicate going 450
backward as an obstacle is too close to the robot and the other cluster might indicate turning right as an obstacle is approaching from the left. This phenomenon is reflected to the simulation results given in Table 3, which shows that the velocity at collision is reduced for the HGPMC. This is mainly because, unlike a GPMC, an HGPMC first chooses which cluster current measurement belongs to and makes a control using the
23
455
corresponding training data only. If a robot is about to collide, the output of corresponding training data will likely to have backward motions making the control output more likely to go backward. An overview of the HGPMC is shown in Figure 5. 6. Simulation In this section, we perform a number of comparative simulations to validate the
460
performance of the proposed navigation algorithms, GPMC and HGPMC. 6.1. Setup To make a realistic setup, we assume that a robot can only obtain information from its egocentric view. The field of view (FOV) is set to 120◦ and this is due to the fact that Microsoft Kinect used in our experiments has 57◦ FOV and we use two Kinects for a
465
larger FOV. The number of pedestrians varies from one to seven to validate the safety of the proposed method under different conditions. For each setting, 100 independent runs are performed. For each simulation, the goal is to reach the goal position which is located at 5m away from the initial position of a robot. For comparison, we apply five differ-
470
ent navigation algorithms: GPMC, HGPMC, reactive planner, vector field histogram (VFH), and autoregressive vector field histogram (AR-VFH). When there is no incoming pedestrian, the robot tries to reach the goal using pure pursuit (PP) low-level controller [33]. 6.2. Determining the Number of Training Data
475
As illustrated in Section 4.3, the performance of GPR increases as the number of training data increases. However, due to the O(n3 ) computational complexity of GPR, determining an appropriate number of training data is important when GPR is used for a practical application. In Section 4.3, we have shown that the generalization error of Gaussian process
480
regression using n training data (13) is upper bounded by that of using the nearest training data (18) which requires no matrix inversion. Furthermore, we also proposed an efficient way of estimating the upper bound (20) using the distance distribution of a 24
Figure 6: An approximated upper bound of the Gaussian process learning curve (20). binomial point process. Since there is no analytic solution for computing (18), Monte Carlo integration is used to approximate. It is known that the standard deviation of the 485
approximation error is proportional to
√V , m
where V is the volume of the integration
and m is the number of samples. In other words, the required number of samples for the Monte Carlo integral is proportional to the square of the volume. In our problem formulation, the proposed AR-GPMM has 6 input dimensions and the volume of integration of the proposed approximate upper bound (20) is [0, R], n+1
490
whereas (18) is [0, R]6
, where R is the sensing radius of a robot and n is the number
of training data. Mean squared prediction errors of the AR-GPMM using whole training data and the closest to a test point are demonstrated in Figure 6, where the Euclidean norm is used to measure the distance. The radius 5 of a hyper-ball in a six dimensional space 495
reflects 5m of the egocentric sensing radius and an autoregressive model of order three in a two dimensional space. The proposed upper bound of the learning curve is depicted in Figure 6. Even though the proposed upper bound is not exact, we can still obtain a good insight about the required number of training data. As this setting is identical to the proposed AR-
500
GPMM and GPMC, we have collected 3, 000 training samples to train both AR-GPMM and GPMC since the learning curve reaches a saturated region when there are 3, 000
25
training samples. 6.3. Compared Navigation Algorithms 6.3.1. Reactive Planner 505
This planner first predicts future trajectories of pedestrians using our AR-GPMM. Then it randomly generates 20 trajectories and calculates the cost of each trajectory. The cost function is defined as the minimum distance between a randomly generated trajectory and predicted pedestrian trajectories. 20 trajectories are examined to match the computation time of reactive planner similar to the proposed method. The minimum
510
cost control input is applied at each time. This approach is similar to [17, 12] in that it makes a locally optimal decision using the receding horizon control framework. 6.3.2. Vector Field Histogram (VFH) VFH [34] is used as the baseline navigation algorithm. 6.3.3. Autoregressive Vector Field Histogram (AR-VFH)
515
VFH [34] and its variants such as VFH+ [35] and VFH* [10] assume that a robot is navigating through a static environment. In our experiment, however, obstacles are constantly moving making VFH not suitable. In order to improve the performance of VFH, we implemented an autoregressive vector field histogram (AR-VFH). The only difference is that we track moving objects and make a polar histogram using both
520
current and predicted positions of objects. For predicting the next position of a moving object, we use our AR-GPMM in Section 4. 6.3.4. Discussion of Untested Navigation Algorithms In Section 2, a number of navigation algorithms designed for dynamic environments are listed. However, algorithms categorized under the reference-view approach
525
are not compared because our goal of this work is to develop a navigation algorithm which can be applied to a wide range of situations without the help of an external tracking system. Similarly, location-specific motion patterns [20, 13, 2] are not compared since we assume that obstacles can move freely in the operating space.
26
A potential field method is not compared since it requires tedious parameter tun530
ing. Moreover, Koren and Borenstein [9] have discussed substantial shortcomings of potential field based methods and proposed VFH to overcome discussed shortcomings. 6.4. Results Snapshots of four navigation algorithms, reactive planner, vector field histogram (VFH), proposed Gaussian process motion controller (GPMC), and autoregressive vec-
535
tor field histogram (AR-VFH), under the same scenario are shown in Figure 7. Due to the limited field of view (FOV) of a robot, algorithms without motion prediction, reactive planner and VFH, fail to navigate safely through dynamic obstacles, whereas AR-VFH and GPMC successfully avoid moving obstacles by predicting the future locations. Furthermore, the proposed GPMC shows the safest behavior as it considers
540
future obstacle locations into its control framework. Four different criteria are used to evaluate the performance of the proposed algorithm: the collision rate, the velocity at the collision, the total moved distance, and the average computation time. The results are shown in Figure 8 and Table 3. Figure 8(a) indicates the collision rates of different algorithms at different numbers of mov-
545
ing obstacles. The overall average collision rates are shown in Table 3. The proposed methods, GPMC and HGPMC outperforms VFH and the reactive planner in terms of the collision rate. One interesting result is the collision rates of VFH and AR-VFH. The performance of VFH improves dramatically as we incorporate predicted positions into account. This indicates that incorporating current as well as predicted positions is
550
crucial in navigating in a dynamic environment. The average elapsed time to reach the goal position while avoiding moving obstacles are shown in Figure 8(b). Our proposed motion controllers require more time to reach the goal since they make detours to avoid collisions. In particular, the HGPMC takes the longest time as it makes more conservative motion controls, which can also
555
be seen by the collision velocity shown in Table 3. The average velocity of each algorithm at a collision is shown in Table 3.
The
proposed GPMC and HGPMC show over 65% and 51% improvement, respectively, over AR-VFH which combines VFH with motion prediction with autoregressive Gaus27
Figure 7: Simulations of five different navigation algorithms for the same scenario. From the top row, we have the reactive planner, VFH, AR-VFH, proposed GPMC, and HGPMC. A robot and moving obstacles are shown in a green square and yellow ellipses, respectively. Observed and predicted trajectories are represented by black circles and red stars, respectively. sian process regression. 560
Note that the velocity at a collision for the HGPMC is the
smallest. This is mainly because the piecewise Gaussian process discussed in Section 5.1 clusters trajectories and corresponding input control pairs into clusters sharing similar attributes. In other words, once a moving obstacle approaches nearby, an HGPMC makes a control based on the motion cluster that controls the robot backward to avoid an immediate collision. The required computation times for five algorithms are also 28
(a)
(b)
Figure 8: Simulation results of five navigation algorithms. (a) Average collision rates at different numbers of moving obstacles. (b) Average moving distances until reaching the goal location. Averages from 100 independent runs are reported. Reactive
VFH
AR-VFH
GPMC
HGPMC
25.7
36.4
25
8.57
12.14
Computation Time (ms)
112.26
0.91
1.25
11.9
3.23
Collision Velocity (cm/s)
10
19.6
17.9
12.4
0.47
Collision Rate (%)
Table 3: A performance comparison of five navigation algorithms in terms of average collision rates, average computation time, and the average directional velocity of a robot at collision.
565
shown in Table 3. We have found that the proposed GPMC and HGPMC can make some interesting motions. One is to go slowly if a pedestrian is coming closely and even going backward if a pedestrian is coming too closely. These reactive motions are reasonable in that an usual differential drive mobile robot has a nonholonomic constraint (robot can only
570
move in the direction normal to the axis of the driving wheels) as well as the maximum speed lower than that of a moving pedestrian. In order for a robot to safely navigate in a dynamic environment, its motion strategy should be defensive rather than aggressive.
29
Figure 9: A Pioneer 3DX mobile robot with two Kinect cameras. 7. Experiments 7.1. Setup 575
We used a Pioneer 3DX differential drive mobile robot with two Microsoft Kinect cameras mounted on top of the robot as shown in Figure 9. All programs are written in MATLAB using mex-compiled ARIA package.3 The positions of pedestrians are detected using the skeleton grab API of a Kinect camera. Pedestrian tracking and the proposed navigation algorithm run at about 10 Hz on a 2.1 GHz notebook where the
580
Kinect image acquisition takes the most of the computing time. Eight sonar sensors equipped in the Pioneer platform are used to avoid an immediate collision. This exception routine using sonar sensors is activated if the minimum distance between the robot and pedestrians are less that 40 cm since the Kinect camera cannot detect pedestrians closer than 40 cm. 3 http://robots.mobilerobots.com/wiki/ARIA
                                1 Person    2 Persons
Collision Rate (%)               9.1         13.6
Average Minimum Distance (mm)    627.7       601.5
Average Moved Distance (mm)      5131.2      7663.2

Table 4: Experimental results of two navigation scenarios: avoiding 1 person and 2 persons. 22 experiments are performed in each scenario.
7.2. Results

We performed experiments in a hall under two different scenarios: reaching a goal position 5 m away while avoiding one or two pedestrians. For each scenario, 22 experiments were performed, and we computed the average collision rate, the average minimum distance to a pedestrian, and the total moved distance. The results are summarized in Table 4. If the minimum distance between the robot and a pedestrian is smaller than 40 cm, we declare that a collision has occurred. For the one-pedestrian case, the average minimum distance was 627.7 mm and the collision rate was only 9.1%. For the two-pedestrian case, the average minimum distance was 601.5 mm and the collision rate was 13.6%.
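For clarity, the reported statistics can be computed from per-run logs as in the following sketch; the numbers below are hypothetical and only five runs are shown instead of the actual 22.

import numpy as np

# Hypothetical logs from a handful of runs: per-run minimum robot-pedestrian
# distance (m) and total moved distance (m).
min_dist_m = np.array([0.62, 0.38, 0.71, 0.55, 0.90])
moved_m    = np.array([5.4, 5.1, 5.6, 4.9, 5.2])

COLLISION_THRESH_M = 0.40  # a run counts as a collision if min distance < 40 cm

collision_rate_pct = 100.0 * np.mean(min_dist_m < COLLISION_THRESH_M)
avg_min_dist_mm    = 1000.0 * min_dist_m.mean()
avg_moved_mm       = 1000.0 * moved_m.mean()
print(collision_rate_pct, avg_min_dist_mm, avg_moved_mm)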
The performance gap between these experimental results and the simulation results in Table 3 is due to the poor detection performance of the Kinect camera; Kinect cameras occasionally miss detections when the robot rotates. We expect that the performance of the motion controller can be improved with a better detection algorithm.

We have also tested the GPMC in an L-shaped corridor and a school cafeteria to validate that the proposed motion planner can work in more complex and dynamic environments. In the L-shaped corridor case, the goal locations were (6 m, 0 m) and (6 m, −6 m). In the school cafeteria case, the robot navigated to its goal 10 m away and returned to its starting location. In both cases, six pedestrians were moving freely in the environment. For the L-shaped corridor experiment, the elapsed time, total moved distance, and average directional velocity were 48 s, 12.79 m, and 0.24 m/s, respectively, and the average and minimum distances to obstacles estimated from the sonar sensors were 1.461 m and 0.429 m. For the school cafeteria experiment, the elapsed time, total moved distance, and average directional velocity were 91.7 s, 21.88 m, and 0.23 m/s, respectively, and the minimum and average distances to obstacles estimated from the sonar sensors were 0.351 m and 2.108 m. Some snapshots from the experiments are shown in Figures 10 and 11. Both the L-shaped corridor and school cafeteria experiments were conducted in indoor environments since our system detects pedestrians using a Kinect sensor, which is not reliable in outdoor environments. However, the proposed navigation method can be used in outdoor environments with another sensor that can reliably provide pedestrians' relative positions, e.g., by detecting human legs in 2D range data [36].

Figure 10: Snapshots from the L-shaped corridor experiment. Each snapshot shows a photo taken by a third person, a photo taken by the robot, and the robot's internal state visualized on the floor plan. Note that the robot does not know the floor plan or the map of the area.
Figure 11: Snapshots from the school cafeteria experiment. Each snapshot shows a photo taken by a third person, a photo taken by the robot, and the robot's internal state visualized on the floor plan. Note that the robot does not know the floor plan or the map of the area.

8. Conclusions

In this paper, we have proposed a novel Gaussian process motion controller that allows low-cost mobile robots to navigate through a crowded dynamic environment using measurements from the egocentric view of the robot. The proposed method is robust against the measurement noise of inexpensive sensors and is computationally efficient enough for real-time operation on low-cost mobile robots. To overcome the limitations of using measurements from the egocentric view of a robot, we proposed a robust motion prediction algorithm, the autoregressive Gaussian process motion model (AR-GPMM). To compute collision-avoiding controls efficiently, we have taken a data-driven approach and proposed the Gaussian process motion controller (GPMC) and the hierarchical Gaussian process motion controller (HGPMC). The performance of the proposed methods has been extensively validated in a number of simulations and experiments. We compare the proposed methods with relatively simple algorithms since we focus on low-cost motion controllers suitable for inexpensive mobile robots. The proposed motion controllers, GPMC and HGPMC, have shown the best results in terms of the collision rate while maintaining control frequencies acceptable for real-time operation, about 100 Hz for the GPMC and 300 Hz for the HGPMC. Due to the sensing range of the Kinect sensor, which is about 5 m, the robot had to be operated at a lower speed. We plan to apply the proposed method with a longer-range sensor in order to operate the robot at a higher speed in more crowded environments.
References

[1] P. Trautman, J. Ma, R. M. Murray, A. Krause, Robot navigation in dense human crowds: the case for cooperation, in: Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2013.
[2] G. S. Aoude, B. D. Luders, J. M. Joseph, N. Roy, J. P. How, Probabilistically safe motion planning to avoid dynamic obstacles with uncertain motion patterns, Autonomous Robots (2013) 1–26.
[3] B. Luders, M. Kothari, J. P. How, Chance constrained RRT for probabilistic robustness to environmental uncertainty, in: Proc. of the AIAA Guidance, Navigation, and Control Conference, 2010.
[4] N. Pradhan, T. Burg, S. Birchfield, Robot crowd navigation using predictive position fields in the potential function framework, in: Proc. of the American Control Conference (ACC), 2011.
[5] P. Henry, C. Vollmer, B. Ferris, D. Fox, Learning to navigate through crowded environments, in: Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2010.
[6] C.-P. Lam, C.-T. Chou, K.-H. Chiang, L.-C. Fu, Human-centered robot navigation—towards a harmoniously human–robot coexisting environment, IEEE Transactions on Robotics 27 (1) (2011) 99–112.
[7] S. Choi, E. Kim, S. Oh, Real-time navigation in crowded dynamic environments using Gaussian process motion control, in: Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2014.
[8] D. Fox, W. Burgard, S. Thrun, The dynamic window approach to collision avoidance, IEEE Robotics & Automation Magazine 4 (1) (1997) 23–33.
[9] Y. Koren, J. Borenstein, Potential field methods and their inherent limitations for mobile robot navigation, in: Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 1991.
[10] I. Ulrich, J. Borenstein, VFH*: local obstacle avoidance with look-ahead verification, in: Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2000.
[11] C. Rasmussen, C. Williams, Gaussian Processes for Machine Learning, Vol. 1, MIT Press, Cambridge, MA, 2006.
[12] M. Svenstrup, T. Bak, H. J. Andersen, Trajectory planning for robots in dynamic human environments, in: Proc. of the IEEE International Conference on Intelligent Robots and Systems (IROS), 2010.
[13] G. S. Aoude, J. Joseph, N. Roy, J. P. How, Mobile agent trajectory prediction using Bayesian nonparametric reachability trees, in: Proc. of the AIAA Infotech, 2011.
[14] P. Trautman, A. Krause, Unfreezing the robot: Navigation in dense, interacting crowds, in: Proc. of the IEEE International Conference on Intelligent Robots and Systems (IROS), 2010.
[15] D. Vasquez, B. Okal, K. O. Arras, Inverse reinforcement learning algorithms and features for robot navigation in crowds: An experimental comparison, in: Proc. of the IEEE International Conference on Intelligent Robots and Systems (IROS), 2014.
[16] R. Triebel, K. Arras, R. Alami, L. Beyer, S. Breuers, R. Chatila, M. Chetouani, D. Cremers, V. Evers, M. Fiore, et al., Spencer: A socially aware service robot for passenger guidance and help in busy airports, Field and Service Robots.
[17] J. J. Park, C. Johnson, B. Kuipers, Robot navigation with model predictive equilibrium point control, in: Proc. of the IEEE International Conference on Intelligent Robots and Systems (IROS), 2012.
[18] G. Ferrer, A. Sanfeliu, Multi-objective cost-to-go functions on robot navigation in dynamic environments, in: Proc. of the IEEE International Conference on Intelligent Robots and Systems (IROS), 2015.
[19] B. Kim, J. Pineau, Socially adaptive path planning in human environments using inverse reinforcement learning, International Journal of Social Robotics (2015) 1–16.
[20] C. Fulgenzi, A. Spalanzani, C. Laugier, C. Tay, Risk based motion planning and navigation in uncertain dynamic environment, INRIA Research Report.
[21] J. Joseph, F. Doshi-Velez, A. S. Huang, N. Roy, A Bayesian nonparametric approach to modeling motion patterns, Autonomous Robots 31 (4) (2011) 383–400.
[22] S. M. LaValle, J. J. Kuffner, Randomized kinodynamic planning, The International Journal of Robotics Research 20 (5) (2001) 378–400.
[23] M. Zucker, J. Kuffner, M. Branicky, Multipartite RRTs for rapid replanning in dynamic environments, in: Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2007.
[24] C. Paciorek, M. Schervish, Nonstationary covariance functions for Gaussian process regression, in: Advances in Neural Information Processing Systems (NIPS), 2004.
[25] V. Tresp, Mixtures of Gaussian processes, in: Advances in Neural Information Processing Systems (NIPS), 2000.
[26] C. E. Rasmussen, Z. Ghahramani, Infinite mixtures of Gaussian process experts, Advances in Neural Information Processing Systems (NIPS) 2 (2002) 881–888.
[27] R. Neal, MCMC using Hamiltonian dynamics, Handbook of Markov Chain Monte Carlo 2.
[28] C. K. Williams, F. Vivarelli, Upper and lower bounds on the learning curve for Gaussian processes, Machine Learning 40 (1) (2000) 77–102.
[29] S. Srinivasa, M. Haenggi, Distance distributions in finite uniformly random networks: Theory and applications, IEEE Transactions on Vehicular Technology 59 (2) (2010) 940–949.
[30] S. Oh, S. Russell, S. Sastry, Markov chain Monte Carlo data association for multi-target tracking, IEEE Transactions on Automatic Control 54 (3) (2009) 481–497.
[31] A. Borodin, Determinantal point processes, arXiv preprint arXiv:0911.1153.
[32] H.-M. Kim, B. K. Mallick, C. Holmes, Analyzing nonstationary spatial data using piecewise Gaussian processes, Journal of the American Statistical Association 100 (470) (2005) 653–668.
[33] O. Amidi, C. E. Thorpe, Integrated mobile robot control, Tech. Rep., Robotics Institute, Carnegie Mellon University (1991).
[34] J. Borenstein, Y. Koren, The vector field histogram-fast obstacle avoidance for mobile robots, IEEE Transactions on Robotics and Automation 7 (3) (1991) 278–288.
[35] I. Ulrich, J. Borenstein, VFH+: Reliable obstacle avoidance for fast mobile robots, in: Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 1998.
[36] K. Arras, O. Mozos, W. Burgard, Using boosted features for the detection of people in 2D range data, in: Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2007.
Sungjoon Choi (S'12) received the B.S. degree in electrical and computer engineering from Seoul National University, Seoul, Korea, in 2012. Currently, he is working towards the Ph.D. degree at the Department of Electrical and Computer Engineering, Seoul National University. His current research interests include nonparametric Bayesian methods, kernel methods, and machine learning algorithms with applications to robotics.

Eunwoo Kim (S'11) received the B.S. degree in Electrical and Electronics Engineering from Chung-Ang University, and the M.S. degree in Electrical Engineering and Computer Sciences from Seoul National University, in 2011 and 2013, respectively, where he is currently pursuing the Ph.D. degree. His current research interests include machine learning, computer vision, and pattern recognition.

Kyungjae Lee (S'15) received the B.S. degree in electrical and computer engineering from Seoul National University, Seoul, Korea, in 2015. Currently, he is working towards the Ph.D. degree at the Department of Electrical and Computer Engineering, Seoul National University. His current research interests include cyber-physical systems, machine learning, robotics, optimization, learning from demonstration, and their applications.
Songhwai Oh (S'04–M'07) received the B.S. (Hons.), M.S., and Ph.D. degrees in electrical engineering and computer sciences from the University of California, Berkeley, CA, USA, in 1995, 2003, and 2006, respectively. He is currently an Associate Professor with the Department of Electrical and Computer Engineering, Seoul National University, Seoul, Korea. Before his Ph.D. studies, he was a Senior Software Engineer at Synopsys, Inc., Mountain View, CA, USA, and a Microprocessor Design Engineer at Intel Corporation, Santa Clara, CA, USA. In 2007, he was a Post-Doctoral Researcher with the Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA, USA. From 2007 to 2009, he was an Assistant Professor of Electrical Engineering and Computer Science in the School of Engineering, University of California, Merced, CA, USA. His current research interests include cyber-physical systems, robotics, computer vision, and machine learning.
- A nonparametric motion controller using Gaussian process regression for autonomous navigation in a dynamic environment is proposed.
- The limited sensing capabilities of the egocentric view of a robot are effectively handled using the proposed autoregressive Gaussian process motion model.
- An efficient method to approximate the upper bound of the learning curve of Gaussian process regression is proposed and used to determine the number of training data for the motion controller.