Indoor multi-sensor fusion positioning based on federated filtering


HuiXia Li a,b, LongHui Ao b,c, Hang Guo a,b,*, XiaoYi Yan b,c

a Key Laboratory of Poyang Lake Environment and Resource Utilization of Ministry of Education, School of Resources Environment and Chemical Engineering, Nanchang University, Nanchang, Jiangxi 330031, China
b Institute of Space Science and Technology, Nanchang University, Nanchang, Jiangxi 330031, China
c Information Engineering School of Nanchang University, Nanchang, Jiangxi 330031, China

* Corresponding author: Hang Guo ([email protected]).


ABSTRACT


In order to meet the requirements of high-precision indoor navigation, a fast point cloud registration method based on Voxel-SIFT (Scale-Invariant Feature Transform) is proposed to improve the registration efficiency of LiDAR (Light Detection and Ranging). Because the traditional federated filter cannot adapt the information distribution factors dynamically, a hybrid federated filter based on the weighted least squares method is designed: the main filter adopts the minimum variance criterion and fuses the optimal estimates of the sub-filters according to an optimally weighted coefficient algorithm, so that the information distribution factors are updated dynamically in real time and a globally optimal estimate is obtained. The experimental results show that the method significantly improves the indoor positioning accuracy of mobile robots. The average positioning error is 0.22 meters, which is more accurate than the visual odometer method or the LiDAR odometer method used alone.

Keywords: Federated filter; Visual positioning navigation; 3D point cloud registration; Indoor positioning


1. Introduction

As the demand for a more convenient daily life grows, service robots have become an important direction for the future development of robotics. In their development, continuous and accurate state estimation is a prerequisite for the effective operation of home service robots. Scholars have therefore carried out extensive research on mobile robot state estimation [1-8]. Mur-Artal et al., Forster et al. and Engel et al. proposed several navigation and positioning methods based on visual odometry; these visual navigation algorithms can perform stable and robust pose estimation under stable motion conditions [1-4].

Kohlbrecher et al. and Hess et al. proposed two simultaneous localization and mapping (SLAM) methods that rely solely on laser data and do not require other sensor information; by solving the laser scan matching problem, they achieved good positioning accuracy [9,10]. Light detection and ranging (LiDAR) has long been a common sensor for sensing the surroundings and for positioning, and it has been widely used in the field of navigation and positioning. In addition, visual SLAM, which uses cameras to collect information about the surroundings, has also become a research hotspot [11]. However, SLAM based on cameras or laser sensors places strict requirements on the environment: when the surrounding information is weak, the positioning error becomes large [12]. In contrast, the pedestrian dead reckoning (PDR) algorithm based on an inertial measurement unit (IMU) can output information in real time at high frequency and demands little information from the surrounding environment [13].

Due to the limitations of single-sensor navigation and positioning, multi-sensor information fusion technology has gradually become a key technology of integrated navigation systems [14]. Li et al. fused vision and IMU data with an extended Kalman filter to obtain state estimates in real time [15]. Qin et al. proposed a tightly coupled scheme between vision and IMU, which combines the residual term of the visual structure and the residual term of the IMU into a joint optimization problem [16]. Shi et al. used the pose estimate obtained by the visual odometer as the initial value of ICP and then matched the laser data, achieving good results in terms of real-time performance and accuracy [17]. Currently, the most widely used and most efficient fusion method is the federated filtering algorithm based on Kalman filtering [18]. In [19,20], federated filtering is applied to the decentralized filtering of multi-sensor integrated navigation systems. Reklaitis et al. applied the federated Kalman filtering technique to navigation data fusion, but the navigation and positioning accuracy still leaves much room for improvement [21]. Because of the characteristics of the sensors and the influence of the specific environment, the reliability of the sensor measurements differs; if this factor is not considered, it has a large negative impact on the filtering structure and may even make the estimate completely unreliable.

In view of the above, an IMU/Vision/LiDAR integrated navigation and positioning method based on federated filtering is proposed in this paper, which helps improve positioning robustness. The three-sensor combined indoor positioning based on federated filtering is implemented in four steps. Firstly, a Voxel-SIFT feature extraction method based on the scale-invariant feature transform (SIFT) algorithm [22] is proposed to extract features from three-dimensional (3D) laser point cloud data, and the initial registration is completed. The random sample consensus (RANSAC) algorithm [23] is then used to eliminate mis-registered points and optimize the initial transformation parameters. Finally, based on the optimized initial transformation parameters, the K-dimensional tree (Kd-Tree) nearest-neighbor search method [24] is combined with the iterative closest point (ICP) algorithm [25] to complete a fast and accurate registration of the original point cloud data, from which the LiDAR positioning information is obtained. In the second step, two visual odometer methods, the feature-point method and the direct sparse odometry (DSO) method [3], are compared, and the DSO method is selected for the vision part to obtain monocular visual positioning information. In the third step, since the traditional federated filter cannot adapt the information distribution factors dynamically, a hybrid federated filter based on the weighted least squares method is designed: the main filter adopts the minimum variance criterion and fuses the optimal estimates of the sub-filters according to the optimal coefficient weighting method, so that the information distribution factor is updated dynamically in real time and the global optimal estimate is obtained. In the fourth step, several experiments show that this method can significantly improve the indoor positioning accuracy of mobile robots.

2. Feature extraction and registration of 3D laser point cloud

The working principle of the 3D laser scanning sensor is that the laser emitter periodically drives the laser diode to emit laser pulses, and the time of flight of each pulse is used to measure the distance from the target to the scanning center [26]. The 3D point cloud coordinates are local coordinates expressed in a polar coordinate system with the center of the LiDAR as the pole, as shown in Fig. 1.
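As an illustration of the time-of-flight principle, a pulse echo received Δt = 66.7 ns after emission corresponds to a range of R = c·Δt/2 ≈ (3×10⁸ m/s × 66.7×10⁻⁹ s)/2 ≈ 10 m.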

Fig.1. 3D laser coordinate system

In Fig. 1, R is the distance between the laser and the obstacle, ω is the angle between the beam and the y-axis in the vertical direction of the multi-line 3D laser, and α is the angle between the scan line and the y-axis in the horizontal direction. The 3D laser point cloud is converted from these polar measurements to the Cartesian coordinates (x, y, z) by equation (1):

x = R·cos(ω)·sin(α)
y = R·cos(ω)·cos(α)        (1)
z = R·sin(ω)
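The conversion in equation (1) can be sketched in a few lines of NumPy (a minimal illustration using the form of equation (1) above, assuming the angles are supplied in radians; the function name is ours, not from the paper):

```python
import numpy as np

def polar_to_cartesian(R, omega, alpha):
    """Convert a LiDAR range measurement to local Cartesian coordinates, Eq. (1).

    R     : range from the scan center to the target (m)
    omega : vertical angle between the beam and the horizontal plane (rad)
    alpha : horizontal angle between the scan line and the y-axis (rad)
    """
    x = R * np.cos(omega) * np.sin(alpha)
    y = R * np.cos(omega) * np.cos(alpha)
    z = R * np.sin(omega)
    return np.stack([x, y, z], axis=-1)

# example: a 10 m return at omega = 5 deg, alpha = 30 deg
point = polar_to_cartesian(10.0, np.radians(5.0), np.radians(30.0))
```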

2.1. Feature extraction based on intrinsic shape signatures (ISS)

The intrinsic shape signatures (ISS) [27] method has four main steps. First, assuming that a point cloud frame has m points and that the 3D coordinates of any point P_i are (x_i, y_i, z_i), i = 0,1,…,m−1, a local Cartesian coordinate system is defined for P_i and a search radius ρ is set. The weight of this point is given by equation (2), where P_j is a point in the neighborhood of P_i. The covariance of P_i and P_j is then computed to obtain the covariance matrix of P_i, as shown in equation (3). Next, the eigenvalues (λ1_i, λ2_i, λ3_i) of the 3×3 covariance matrix are computed and arranged in descending order. Finally, two thresholds ε1 and ε2 are set, and the points satisfying equation (4) are taken as ISS feature points.

w_i = 1 / ||p_i − p_j||,  p_j ∈ {||p_i − p_j|| < ρ},  i ≠ j ∈ (0,1,…,m−1)        (2)

cov(p_i) = Σ_{||p_j − p_i|| < ρ} w_i (p_i − p_j)(p_i − p_j)^T / Σ_{||p_j − p_i|| < ρ} w_i,  i ≠ j ∈ (0,1,…,m−1)        (3)

λ2_i / λ1_i ≤ ε1,  λ3_i / λ2_i ≤ ε2        (4)
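For concreteness, the selection rule of equations (2)-(4) can be sketched as follows (a minimal illustration only; the radius ρ and the thresholds ε1, ε2 are assumed example values, and SciPy's cKDTree is used for the radius search):

```python
import numpy as np
from scipy.spatial import cKDTree

def iss_keypoints(points, rho, eps1=0.6, eps2=0.6):
    """Minimal ISS detector following Eqs. (2)-(4).

    points     : (m, 3) array holding one point-cloud frame
    rho        : neighborhood search radius
    eps1, eps2 : thresholds on the eigenvalue ratios (example values only)
    """
    tree = cKDTree(points)
    keypoints = []
    for i, p_i in enumerate(points):
        idx = [j for j in tree.query_ball_point(p_i, rho) if j != i]
        if len(idx) < 3:
            continue
        diffs = p_i - points[idx]                                # p_i - p_j
        w = 1.0 / np.linalg.norm(diffs, axis=1)                  # Eq. (2)
        cov = (w[:, None, None] * diffs[:, :, None] * diffs[:, None, :]).sum(axis=0) / w.sum()  # Eq. (3)
        lam3, lam2, lam1 = np.linalg.eigvalsh(cov)               # ascending, so lam1 >= lam2 >= lam3
        if lam2 / lam1 <= eps1 and lam3 / lam2 <= eps2:          # Eq. (4)
            keypoints.append(i)
    return np.asarray(keypoints)
```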

2.2. Feature extraction based on Voxel-SIFT 3D point cloud

In practical application scenarios, 3D laser point cloud data usually involve a large amount of data, irregular and unevenly distributed point density, and complex, noisy scenes. The extraction and recognition of feature points is therefore a major research difficulty. The choice of feature extraction method strongly affects the registration accuracy of the 3D laser point cloud, and thus the overall positioning and mapping accuracy of the laser SLAM. At present, ISS feature extraction [27] and Harris feature extraction [28] are the mainstream methods of 3D laser feature point extraction. A 3D point cloud feature extraction algorithm based on Voxel-SIFT is presented in this paper. The Voxel-SIFT feature extraction method is described as follows (a small sketch of steps 1)-3) is given after this list).

1) The 3D laser point cloud is used to construct a 3D voxel grid model. According to the point cloud information in the z-axis direction, the point cloud data are divided into spatial layers from the outside to the inside. Assuming that the original voxel grid model of each layer is M(x,y,z), the scale space model M_k(x,y,z) of each layer is the convolution of the original voxel grid model with a 3D Gaussian filter, as shown in equations (5) and (6):

M_k(x,y,z) = M(x,y,z) ⊗ G(x,y,z,kδ)        (5)

G(x,y,z,kδ) = 1 / (√(2π)·kδ)³ · e^(−(x² + y² + z²) / (2(kδ)²))        (6)

where G(x,y,z,kδ) is the Gaussian function, δ is the spatial scale parameter, and k is the scale adjustment parameter; different k values represent different scale spaces.

2) Calculate the Gaussian difference model of adjacent scale layers:

DoG_k = M_k(x,y,z) − M_{k−1}(x,y,z)        (7)

3) Detect extreme points in the different directions of the Gaussian difference space.

4) Re-divide the spatial region containing 8×8×8 small cells with the extreme point as the center, further divide each subregion into 4×4×4 grid regions, and after the division calculate the normal vector of each point in the point set of each grid region.

5) A weighted histogram similar to that in [29] is constructed in the 3D neighborhood around the given extreme point, and the feature vector is finally generated.
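The sketch below illustrates steps 1)-3) only (voxelization, the Gaussian scale space of equations (5)-(6), and the DoG extrema of equation (7)); for simplicity the whole cloud is voxelized as a single layer rather than using the paper's layered division, the descriptor construction of steps 4)-5) is omitted, and the voxel size, δ and k values are assumed example settings:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter

def voxel_dog_extrema(points, voxel_size=0.1, delta=1.0, k_values=(1.0, 1.6, 2.56)):
    """Sketch of Voxel-SIFT steps 1)-3): voxelize the cloud, build the Gaussian
    scale space of Eqs. (5)-(6), and keep local maxima of the DoG volumes, Eq. (7)."""
    # 1) occupancy voxel grid M(x, y, z) (single layer for simplicity)
    mins = points.min(axis=0)
    idx = np.floor((points - mins) / voxel_size).astype(int)
    M = np.zeros(idx.max(axis=0) + 1, dtype=float)
    M[tuple(idx.T)] = 1.0

    # 2) Gaussian scale space M_k = M (*) G(k*delta) and DoG_k = M_k - M_{k-1}
    scales = [gaussian_filter(M, sigma=k * delta) for k in k_values]
    dogs = [s1 - s0 for s0, s1 in zip(scales[:-1], scales[1:])]

    # 3) candidate feature voxels: positive local maxima of each DoG response
    candidates = []
    for dog in dogs:
        local_max = (dog == maximum_filter(dog, size=3)) & (dog > 0)
        candidates.append(np.argwhere(local_max))
    return candidates
```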

2.3. 3D point cloud registration

The purpose of 3D laser point cloud registration is to transform two point cloud frames of the same scene collected at different locations (source point cloud P and target point cloud Q) into the same coordinate system through a transformation matrix. The flow chart of the Voxel-SIFT based point cloud registration method proposed in this paper is shown in Fig. 2.

Fig.2. Registration algorithm flow chart

The specific implementation steps of the point cloud registration method based on Voxel-SIFT are as follows (a generic sketch of the final ICP refinement follows these steps).

(1) Extract Voxel-SIFT feature points.

(2) Use the feature points for initial registration: the random sample consensus method is used to eliminate mis-registered points, and the transformation parameters between the two feature point cloud frames are then calculated. Based on these results, the ICP algorithm is applied to the feature point clouds to obtain the optimized transformation parameters.

(3) Based on the optimized transformation parameters, the ICP registration algorithm is used again to complete the accurate registration of the original point cloud data, and the Kd-Tree nearest-neighbor search method is used to improve the registration efficiency.
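The refinement of step (3) is, in essence, ICP accelerated by a Kd-Tree nearest-neighbor search. A generic sketch is given below (our own minimal implementation with an SVD-based rigid-transform update, not the paper's code; the iteration count and tolerance are assumed values):

```python
import numpy as np
from scipy.spatial import cKDTree

def icp_refine(source, target, T_init=np.eye(4), max_iter=30, tol=1e-6):
    """Minimal ICP refinement with a Kd-Tree nearest-neighbor search.

    source, target : (n, 3) and (m, 3) point arrays
    T_init         : 4x4 initial transform, e.g. from the Voxel-SIFT feature matching
    """
    tree = cKDTree(target)
    T = T_init.copy()
    prev_err = np.inf
    for _ in range(max_iter):
        src = source @ T[:3, :3].T + T[:3, 3]          # apply current estimate
        dist, idx = tree.query(src)                    # Kd-Tree correspondences
        err = np.mean(dist ** 2)
        if abs(prev_err - err) < tol:
            break
        prev_err = err
        # closed-form rigid transform (Kabsch/SVD) between the matched sets
        mu_s, mu_t = src.mean(0), target[idx].mean(0)
        H = (src - mu_s).T @ (target[idx] - mu_t)
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:                       # avoid reflections
            Vt[-1] *= -1
            R = Vt.T @ U.T
        t = mu_t - R @ mu_s
        dT = np.eye(4)
        dT[:3, :3], dT[:3, 3] = R, t
        T = dT @ T                                     # accumulate the update
    return T
```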

3. Selection principles of visual odometry algorithm

According to whether feature points are extracted and matched, visual odometry algorithms can be divided into two categories: feature-based methods and direct methods. ORB-SLAM is a typical feature-based SLAM system that uses oriented FAST and rotated BRIEF (ORB) features for tracking, mapping, relocation and loop closing. For each frame, feature points are first extracted and matched with the reference frame; a uniform velocity model is used to accelerate the feature matching process. The initial pose estimate is calculated by motion-only bundle adjustment (BA). Local BA is then performed on all frames in the sliding window to optimize all 3D points and camera poses and improve the accuracy of the system. The DSO method optimizes the pixel depths and the camera pose directly by minimizing photometric errors, without feature extraction and matching. Compared with the feature-based method, the DSO method can use more image information, is more robust in scenes with little texture, and is more suitable for 3D reconstruction. However, the DSO method relies on the assumption of constant brightness, and its performance is severely tested when the lighting changes rapidly.

3.1. Comparison of feature-based method and direct method

In the experiments, the performance of ORB-SLAM and DSO was tested in indoor experimental scenarios. Fig. 3 shows the key points extracted by ORB-SLAM and DSO. It can be seen that the number of key points extracted by the DSO method is significantly larger than that of the ORB-SLAM method, so more image information is used. Indoor scenes often contain low-feature areas such as white walls. In these cases, ORB-SLAM, which is based on the feature-point method, performs poorly and is difficult to initialize or crashes after tracking for a while. In comparison, since the robot does not move particularly fast indoors, the DSO method adapts better to this kind of scene and shows better robustness and positioning performance. Fig. 4 shows the positioning results of the two methods in the same scene and environment. Compared with the ORB-SLAM method, the tracking of the DSO method is completed more stably and is basically consistent with the ground truth. The serious drift of the feature-based method is mainly caused by image motion blur at the corners, which leaves insufficient feature points and leads to the collapse of the system. The direct method, in contrast, can extract more points and complete the visual tracking and pose estimation more stably. In summary, the DSO method is used as the visual front end of the loosely coupled visual-inertial fusion navigation method.

Fig.3. Comparison of the points selected by the direct method and the feature-based method: (a) video frame; (b) DSO; (c) ORB-SLAM

Fig.4. Comparison of indoor positioning results of the DSO method and the ORB-SLAM method

4. Federated filter model

Federated filtering is a two-stage decentralized filtering method with block estimation and a two-step cascade. It consists of multiple sub-filters and a main filter and can process data in parallel and hierarchically. During the fusion process, each sub-filter performs its measurement update independently, and the local optimal estimates of the sub-filters are fused in the main filter to obtain the globally optimal estimate. The decentralized structure of the federated filtering method helps reduce the computational load of the system. At the same time, the system can still output a filtering result when some sub-filters fail, which improves the fault tolerance of the entire system. Considering that the IMU can output measurements stably, the IMU is combined with the 3D laser and with the camera, respectively, to form sub-filter 1 and sub-filter 2; the local optimal results of these two sub-filters are then fused to obtain the optimal combined result. In this federated filtering system, if sub-filter 2 fails because the camera visual positioning solution fails, sub-filter 1 can continue the filtering solution, which improves the fault tolerance of the positioning solution. The algorithm flow is shown in Fig. 5.

Fig.5. Federated filtering method flow chart

Based on the inertial navigation system (INS), a strapdown inertial navigation system is built as the common reference system. The state variables are:

X_INS = [δL_IE, δλ_IN, δv_IE, δv_IN, φ_IE, φ_IN, φ_IU, ∇_E, ∇_N, ε_E, ε_N, ε_U]        (8)

where δL_IE and δλ_IN are the latitude and longitude errors, δv_IE and δv_IN are the eastward and northward velocity errors, φ_IE, φ_IN and φ_IU are the platform heading, roll and pitch angle errors, respectively, ∇_E and ∇_N are the eastward and northward accelerometer biases, and ε_E, ε_N and ε_U are the eastward, northward and azimuth gyro drifts, respectively.
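The fault-tolerant fusion step of the main filter in Fig. 5 can be summarized by the following sketch (an illustration of the structure only; the health flags and the fallback logic are our assumptions, and the dynamic weight a is computed as described in Section 4.3):

```python
import numpy as np

def federated_fuse(x1, ok1, x2, ok2, a=0.5):
    """Main-filter fusion step of the federated structure in Fig. 5 (illustrative only).

    x1, x2   : local optimal estimates of sub-filter 1 (IMU/LiDAR) and sub-filter 2 (IMU/camera)
    ok1, ok2 : health flags of the sub-filters (False when a sensor solution has failed)
    a        : dynamic weight of sub-filter 1, b = 1 - a (updated as in Section 4.3)
    """
    if ok1 and ok2:
        return a * x1 + (1.0 - a) * x2          # weighted fusion, Eq. (18)
    if ok1:                                     # camera solution failed: fall back to sub-filter 1
        return x1
    if ok2:                                     # LiDAR solution failed: fall back to sub-filter 2
        return x2
    raise RuntimeError("both sub-filters failed")
```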

4.1. IMU and 3D laser data filtering

The fusion strategy of the IMU and the 3D LiDAR is as follows: by processing the 3D laser data, the travel distance of the robot platform can be measured, and the travel time combined with the measurement interval is used to suppress the position drift of the IMU. The state variables of the 3D laser are shown in equation (9):

X_LiDAR = [δL_L, δλ_L, δv_EL, δv_NL, δψ_L, δθ_L, δγ_L]        (9)

where δL_L and δλ_L are the latitude and longitude errors of the 3D laser output, δv_EL and δv_NL are the eastward and northward velocity errors, δψ_L is the heading angle error, δθ_L is the pitch angle error, and δγ_L is the roll angle error. The state equation of the INS/3D laser integrated navigation system is:

Ẋ_k1(t) = [Ẋ_INS; Ẋ_LiDAR] = [F_INS 0; 0 F_LiDAR]·[X_INS; X_LiDAR] + W        (10)

where W = [W_δLI, W_δλI, W_δvEI, W_δvNI, W_δψI, W_δθI, W_δγI, 0, 0, 0, 0, 0, W_δLL, W_δλL, W_δvEL, W_δvNL, W_δψL, W_δθL, W_δγL]^T is white noise with zero mean and constant variance Q, F_INS is a 12×12 matrix, and F_LiDAR is a 7×7 matrix.

F_LiDAR = diag{−1/τ_LL, −1/τ_λL, −1/τ_vEL, −1/τ_vNL, −1/τ_ψL, −1/τ_θL, −1/τ_γL}        (11)

The correlation times τ_LL, τ_λL, τ_vEL, τ_vNL, τ_ψL, τ_θL and τ_γL are selected between 100 and 150 s. The position, velocity and attitude errors of the 3D laser assumed above are all first-order Markov processes. Real-time position, velocity and attitude information can be obtained by processing the 3D laser data. In the design of sub-filter 1, the position, velocity and attitude are taken as the system observations. The measurement equation of the INS/3D laser integrated navigation system is:

Z = [L_I − L_L; λ_I − λ_L; v_EI − v_EL; v_NI − v_NL; ψ_I − ψ_L; θ_I − θ_L; γ_I − γ_L] = HX + V = [H_INS H_LiDAR]·[X_INS; X_LiDAR] + V        (12)

where H_LiDAR = −I_{7×7}, and

H_INS =
[ 0_{4×4}    0_{4×1}                              0_{4×1}                              0_{4×1}    0_{4×5} ]
[ 0_{1×4}    C12C32/(C12² + C22²)                 C22C32/(C12² + C22²)                 −1         0_{1×5} ]
[ 0_{1×4}    C22/(1 − C32²)                       C12/(1 − C32²)                       0          0_{1×5} ]
[ 0_{1×4}    (C21C33 + C23C31)/(C31² + C33²)      (C13C31 + C11C33)/(C31² + C33²)      0          0_{1×5} ]        (13)

The measurement noise V is white noise with zero mean and constant variance R; V and W are uncorrelated.
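Each sub-filter is a standard Kalman filter over the stacked error state. The following generic predict/update cycle (a sketch, not the paper's implementation; the first-order discretization is an assumption, with dimensions following equations (10)-(12)) illustrates how sub-filter 1 processes one INS/LiDAR measurement epoch:

```python
import numpy as np

def kf_step(x, P, F, Q, z, H, R, dt):
    """One predict/update cycle of sub-filter 1 (generic discrete Kalman filter sketch).

    x, P : 19-dimensional state [X_INS; X_LiDAR] and its covariance
    F, Q : continuous-time system matrix of Eq. (10) and process-noise covariance
    z    : 7-dimensional measurement of Eq. (12) (INS minus LiDAR position/velocity/attitude)
    H, R : measurement matrix [H_INS  H_LiDAR] and measurement-noise covariance
    dt   : filter period (s)
    """
    n = len(x)
    Phi = np.eye(n) + F * dt                  # first-order discretization of Eq. (10)
    x = Phi @ x                               # time update
    P = Phi @ P @ Phi.T + Q * dt
    S = H @ P @ H.T + R                       # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)            # Kalman gain
    x = x + K @ (z - H @ x)                   # measurement update
    P = (np.eye(n) - K @ H) @ P
    return x, P
```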

4.2. IMU and monocular data filtering

In the integrated navigation system that combines the IMU and monocular vision, the velocity-matching mode is used, and the difference between the carrier velocity calculated by the INS and that of the monocular visual solution is taken as the observation. The state vector is as follows:

X_k2 = [δL_k2, δλ_k2, δv_Nk2, δv_Ek2, δφ_Nk2, δφ_Ek2, δφ_Dk2, δk]^T        (14)

In equation (14), δL_k2 and δλ_k2 are the latitude and longitude errors of the carrier, δv_Nk2 and δv_Ek2 are the north and east velocity errors of the carrier, δφ_Nk2, δφ_Ek2 and δφ_Dk2 are the attitude error angles of the north, east and vertical directions of the carrier, respectively, and δk is the camera scale factor error. δk includes a constant error and a random error; in this system, the constant error is modelled as a random constant, the random error is represented by a first-order Markov process, and the inverse correlation time constant is set to 1/τ_c. The system state equation is:

Ẋ_k2(t) = [F_INS 0_{7×1}; 0_{1×7} −1/τ_c]·X_k2(t) + W_k2        (15)

F_INS is a 7×7 matrix. The difference between the carrier velocity v^n_b(INS) calculated by the INS and the carrier velocity v^n_bc calculated by the monocular solution is taken as the measurement vector. The measurement equation of the system is shown in equation (16):

Z_k2 = [1 0 0; 0 1 0]·(v^n_b(INS) − v^n_bc) = H·X_k2(t) + V(t)        (16)

The observation matrix is shown in equation (17):

H = [ 0  0  1  0  0             v^n_{bc,D}    −v^n_{bc,E}    −v^n_{c,N} ]
    [ 0  0  0  1  −v^n_{bc,D}   0              v^n_{bc,N}    −v^n_{c,E} ]        (17)

where v^n_{bc,N}, v^n_{bc,E} and v^n_{bc,D} are the northward, eastward and downward carrier velocities calculated by the monocular solution, respectively, v^n_{c,N} and v^n_{c,E} represent the projections of the monocular-solution velocity in the north and east directions, and V(t) is the measurement noise.
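As a small illustration of equation (16), the sub-filter 2 observation is simply the north/east part of the velocity difference (a sketch; the function and argument names are ours):

```python
import numpy as np

def velocity_measurement(v_ins, v_cam):
    """Sub-filter 2 observation of Eq. (16): north/east components of the difference
    between the INS velocity and the monocular (scaled) velocity, navigation frame.

    v_ins, v_cam : 3-vectors [v_N, v_E, v_D]
    """
    S = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])           # selection matrix of Eq. (16)
    return S @ (np.asarray(v_ins) - np.asarray(v_cam))
```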

4.3. Global filter optimal estimation

The iterative weighted least squares algorithm is used to fuse the optimal estimates of the two sub-filters. In general, the least squares method treats every sample in the time series as equally important, while in fact the impact of the time series data on the subsequent result differs. A more reasonable approach is therefore a weighted method: different weights are assigned depending on the importance of the data, with larger weights assigned to the relatively more important data. The updated local optimal estimates of the two sub-filters are X̄_k1(t) and X̄_k2(t), and the relationship between the weight distribution coefficients of the two sub-filters is shown in equation (18):

X(t) = a·X̄_k1(t) + b·X̄_k2(t),  a + b = 1        (18)

where a and b are the weights to be determined. The measurement data of the last k epochs are used in the least squares criterion shown in equation (19):

Σ_{t=m}^{m+k} [X_INS − (a·X̄_k1(t) + b·X̄_k2(t))]² = Min        (19)

The parameters π‘Ž and 𝑏 in the π‘˜ times are obtained, and then the two local optimal estimation values π‘‹π‘˜1(𝑑) and π‘‹π‘˜2(𝑑) are sent to the main filter to obtain the optimal estimate 𝑋(𝑑) according to equation (18). 5. Experiments

5. Experiments

5.1. Voxel-SIFT feature extraction algorithm verification experiment

To verify the effect of the Voxel-SIFT feature extraction algorithm on 3D laser point cloud data, a single point cloud frame was selected in each of two typical scenarios, and experiments were performed with both the Voxel-SIFT and the ISS feature extraction algorithms on the actually collected data set (16-line R-Fans sensor) after point cloud pre-processing. Fig. 6 shows the comparison of the feature extraction results of the Voxel-SIFT and ISS algorithms on the outdoor residential road (scene 1) and the indoor gallery (scene 2).

Fig.6. Comparison results of the two algorithms on scene 1 and scene 2

Table 1 Comparison of feature point detection results

Scene      Number of points    ISS features    ISS detection time per hundred points (s)    Voxel-SIFT features    Voxel-SIFT detection time per hundred points (s)
Scene 1    56271               234             2.05                                         242                    0.14
Scene 2    27552               122             1.15                                         172                    0.12

The comparison in Table 1 shows that, in both test scenarios, the number of feature points extracted by the Voxel-SIFT algorithm is larger than that extracted by the ISS algorithm, and the detection time is shorter. Considering both the detection time and the number of feature points, the Voxel-SIFT algorithm significantly reduces the detection time while preserving the number of feature points, making it a feature extraction method well suited to the subsequent 3D laser point cloud registration.

5.2. 3D point cloud registration experiment

To verify the performance of the registration algorithm proposed in this paper, feature points were detected on the data sets of the two typical scenarios (the outdoor residential road and the indoor gallery). Point cloud registration simulation experiments were performed on a computing platform with a Core i5-4210U low-voltage processor. The transformation parameters between the two point cloud frames were set as follows: the horizontal offsets between the fixed source point cloud and the target point cloud along the x-axis and the y-axis were 1 m. The yaw angle was increased gradually from zero, and a registration result was recorded for every 5 degrees of rotation between the two frames, until the angle became so large that the registration error exceeded the set threshold, at which point the statistics were stopped. Fig. 7 shows the point cloud registration results in the two scenarios.

Fig.7. Registration results of point cloud (scene 1 and scene 2)

Fig.8. Comparison of registration accuracy (scene 1 and scene 2)

Fig.9. Comparison of registration time (scene 1 and scene 2)

From the comparisons in Fig. 8 and Fig. 9, it can be seen that as the rotation angle increases, the average registration error of the proposed Voxel-SIFT algorithm remains smaller than that of the ICP algorithm, and the registration time is shortened to 16%~30% of that of the traditional ICP algorithm; on average, the registration time is 78% shorter than with traditional ICP registration. While maintaining higher registration accuracy, the algorithm greatly increases the usable working frequency of the 3D laser and reduces the real-time obstacles of multi-sensor fusion navigation and positioning.

5.3. LiDAR/camera/IMU fusion navigation positioning experiment based on federated filtering

The TurtleBot2 mobile robot is used as the data acquisition device. It is equipped with a 9-degree-of-freedom (DOF) inertial sensor (three-axis accelerometer, three-axis gyroscope and three-axis magnetometer), a rolling-shutter camera and a 3D laser (16-line R-Fans sensor). The data acquisition frequency of the IMU is 200 Hz, the image acquisition rate is 30 Hz, and the image resolution is 640×480. The experimental environment is an office area. The experimental platform and environment are shown in Fig. 10.

Fig.10. Experimental platform and experimental environment

The positioning performance is compared among the vision-based odometer method (DSO), the LiDAR-based odometer method, and the proposed IMU/Vision/LiDAR combined navigation and positioning algorithm based on federated filtering.

Fig.11. Positioning result of zigzag path

Fig.12. Positioning result of circle path

Fig. 11 and Fig. 12 show the tests on the zigzag and circle paths. In these figures, the black diamond marks the start and end point, the brown line is the ground truth measured by a laser rangefinder, the green line is the trajectory of the proposed method, the red line is the pure visual trajectory, and the blue line is the trajectory of the 3D laser odometer method. The trajectory of the proposed IMU/Vision/LiDAR integrated navigation system based on the federated filter is clearly closer to the real path; the detailed evaluation results are given in Table 2 and Table 3.

Table 2 The positioning error of different methods (zigzag path)

Positioning method    Mean error (m)    95th percentile error (m)
Visual method         2.15              4.89
LiDAR method          0.43              0.78
Proposed method       0.21              0.35

Table 3 The positioning error of different methods (circle path)

Positioning method    Mean error (m)    95th percentile error (m)
Visual method         2.62              5.11
LiDAR method          0.48              0.81
Proposed method       0.23              0.37

6. Conclusion

To solve the problem of multi-sensor indoor navigation and positioning, a 3D point cloud feature extraction algorithm based on Voxel-SIFT was proposed. The experimental results show that the Voxel-SIFT feature extraction algorithm can significantly shorten the extraction time while extracting more feature points. On this basis, an initial value is quickly obtained by using the feature points in the initial registration, the source point cloud is adjusted with this initial value, and ICP fine registration is then applied to improve the registration accuracy and speed of the point cloud. In addition, combined with the Voxel-SIFT based fast point cloud registration method, an IMU/Vision/LiDAR integrated navigation and positioning method based on federated filtering was designed. The experimental results show that the positioning accuracy reaches an average error of 0.22 meters, and the different test scenarios verify that the system has higher positioning accuracy and robustness than the traditional pure vision method and the 3D laser odometer method.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work is supported by the National Key Technologies R&D Program (2016YFB0502002) and the National Natural Science Foundation of China (41764002). Corresponding author: Professor Hang Guo, E-mail: [email protected], Nanchang, Jiangxi 330031, China.

References

[1] Mur-Artal R, Montiel J M M, Tardos J D. ORB-SLAM: a versatile and accurate monocular SLAM system. IEEE Transactions on Robotics 31(5) (2015): 1147-1163.
[2] Forster C, Pizzoli M, Scaramuzza D. SVO: Fast semi-direct monocular visual odometry. IEEE International Conference on Robotics and Automation, 2014: 15-22.
[3] Engel J, Koltun V, Cremers D. Direct sparse odometry. IEEE Transactions on Pattern Analysis and Machine Intelligence 40(3) (2017): 611-625.
[4] Gao M, Yu M, Guo H, Yu M. Mobile robot indoor positioning based on a combination of visual and inertial sensors. Sensors 19(8) (2019): 1773.
[5] Guo H, Li H, Xiong J, Yu M. Indoor positioning system based on particle swarm optimization algorithm. Measurement 134 (2019): 908-913.
[6] Li H, Wen X, Guo H, Yu M. Research into Kinect/inertial measurement units based on indoor robots. Sensors 18(3) (2018): 839.
[7] Chiang K W, Tsai G J, Chang H W, Joly C, El-Sheimy N. Seamless navigation and mapping using an INS/GNSS/grid-based SLAM semi-tightly coupled integration scheme. Information Fusion 50 (2019): 181-196.
[8] Wang L. Automatic control of mobile robot based on autonomous navigation algorithm. Artificial Life and Robotics (2019): 1-5.
[9] Kohlbrecher S, Von Stryk O, Meyer J, Klingauf U. A flexible and scalable SLAM system with full 3D motion estimation. 2011 IEEE International Symposium on Safety, Security, and Rescue Robotics, IEEE, 2011: 155-160.
[10] Hess W, Kohler D, Rapp H, Andor D. Real-time loop closure in 2D LIDAR SLAM. 2016 IEEE International Conference on Robotics and Automation, 2016: 1271-1278.

[11] Chen Y, Tang J, Jiang C, Zhu L, Lehtomäki M, et al. The accuracy comparison of three simultaneous localization and mapping (SLAM)-based indoor mapping technologies. Sensors 18(10) (2018): 3228.
[12] Xu X, Dai W, Yang B, et al. A visual inertial SLAM method based on graph optimization in indoor environment. Journal of Chinese Inertial Technology 25(3) (2017): 313-319.
[13] Liu D, Pei L, Qian J, et al. A novel heading estimation algorithm for pedestrians using a smart-phone without attitude constraints. 2016 Fourth IEEE International Conference on Ubiquitous Positioning, Indoor Navigation and Location-Based Service, Shanghai, China, 2016: 29-37.
[14] Li Dengbiao, Jia Wei, Zhao Jumin, et al. Integrated navigation algorithm based on extended Kalman filter for SLAM/GNSS/INS. The 9th China Satellite Navigation Academic Annual Meeting, Harbin: Springer, 2018.
[15] Li M, Mourikis A I. 3-D motion estimation and online temporal calibration for camera-IMU systems. 2013 IEEE International Conference on Robotics and Automation, IEEE, 2013: 5709-5716.
[16] Qin T, Li P, Shen S. VINS-Mono: A robust and versatile monocular visual-inertial state estimator. IEEE Transactions on Robotics 34(4) (2018): 1004-1020.
[17] Shi J, He B, Zhang L, Zhang J. Vision-based real-time 3D mapping for UAV with laser sensor. 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems, IEEE, 2016: 4524-4529.
[18] Zhang Pei. Research on multi-sensor information fusion algorithm for multi-rotor aircraft. Taiyuan: North University, 2017.
[19] Zhu H, Guan Y, Guan G X. Information fusion of GPS/DR integrated navigation based on federal filter. Microcomputer Information 24(10) (2008): 233-234.
[20] Zhao L. Federated adaptive Kalman filtering and its application. IEEE Proceedings of the 7th World Congress on Intelligent Control and Automation, New York, 2008: 1369-1372.
[21] Reklaitis G V, Ravindran A, Ragsdell K M. Engineering optimization: methods and applications. New York: Wiley, 1983.
[22] Lowe D G. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision 60(2) (2004): 91-110.
[23] Yan Lei. Research on automatic image stitching algorithm based on feature matching. University of Science and Technology of China.
[24] Samet H. K-nearest neighbor finding using MaxNearestDist. IEEE Transactions on Pattern Analysis and Machine Intelligence 30(2) (2007): 243-252.
[25] Chen C, Qi Y, Zhu Y, Pei L, Xu Q C. The analysis of influence factors and evaluation indexes on ICP algorithm. Navigation Positioning and Timing 5(5) (2018): 67-72.
[26] Zhang Bu. Research on 3D laser point cloud data registration. Xi'an University of Science and Technology, 2015.
[27] Zhong Y. Intrinsic shape signatures: a shape descriptor for 3D object recognition. 2009 IEEE 12th International Conference on Computer Vision Workshops, IEEE, 2009: 689-696.
[28] Harris C G, Stephens M. A combined corner and edge detector. Proceedings of the Fourth Alvey Vision Conference, 1988: 147-151.
[29] Rusu R B, Blodow N, Beetz M. Fast point feature histograms (FPFH) for 3D registration. 2009 IEEE International Conference on Robotics and Automation, IEEE, 2009: 3212-3217.

Author Contributions: Hang Guo supervised and guided the research work. Huixia Li and Longhui Ao conceived and designed the experiments. Longhui Ao and Xiaoyi Yan performed the experiments and analyzed the data. Huixia Li and Longhui Ao wrote the paper. Huixia Li revised the paper.

Highlights
• A fast point cloud registration method based on Voxel-SIFT is proposed to improve the registration efficiency of LiDAR.
• A hybrid federated filter based on the weighted least squares method is designed.
• The proposed approach is robust to system errors when registering the point cloud.
• The proposed approach has higher positioning accuracy when tracking the robot.