3D Occupancy Mapping Framework Based on Acoustic Camera in Underwater Environment

12th IFAC Symposium on Robot Control (SYROCO 2018), Budapest, Hungary, August 27-30, 2018

IFAC PapersOnLine 51-22 (2018) 324–330. doi: 10.1016/j.ifacol.2018.11.562

Yusheng Wang ∗, Yonghoon Ji ∗∗, Hanwool Woo ∗, Yusuke Tamura ∗, Atsushi Yamashita ∗, Hajime Asama ∗

∗ Department of Precision Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan (e-mail: [email protected])
∗∗ Department of Precision Mechanics, Chuo University, 1-13-27 Kasuga, Bunkyo-ku, Tokyo 112-8551, Japan (e-mail: [email protected])

Abstract: In this paper, we present a novel probabilistic three-dimensional (3D) mapping framework that uses acoustic images captured in an underwater environment. The acoustic camera is a forward-looking imaging sonar that has recently become common in underwater inspection; however, the loss of elevation angle information makes it difficult to get a good understanding of the underwater environment. To cope with this, we apply a probabilistic occupancy mapping framework with a novel inverse sensor model suitable for the acoustic camera in order to reconstruct the underwater environment in a volumetric representation. Simulation and experimental results demonstrate that our mapping framework for the acoustic camera can successfully reconstruct dense 3D models of underwater targets.

© 2018, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved.

Keywords: 3D reconstruction, 3D occupancy mapping, underwater sensing, acoustic camera, inverse sensor model

1. INTRODUCTION

In recent years, underwater operations such as maintenance, ship hull inspection, and construction have become much more important; however, hazards may prohibit human access (e.g., the Fukushima Daiichi nuclear power station, which has been in crisis since the 2011 earthquake off the Pacific coast of Tohoku in east Japan). Moreover, the limited field of vision due to turbidity and the lack of illumination makes underwater operations difficult. Therefore, in order to fulfill underwater tasks, a process for reconstructing the three-dimensional (3D) underwater environment with robots such as autonomous underwater vehicles (AUVs) or remotely operated underwater vehicles (ROVs) is necessary.

Sonars are useful for sensing an underwater environment since they are less affected by turbidity, illumination, and absorption than optical or laser sensors. A few studies exist on 3D underwater environment reconstruction using sonars such as side scan sonars (SSSs) and mechanically scanned imaging sonars (MSISs). Sun et al. employed an approach based on a Markov random field (MRF) that detects shadows in SSS data to reconstruct a 3D model of the sea floor (Sun et al. (2008)). Kwon et al. applied occupancy mapping theory to MSISs to realize 3D reconstruction of underwater objects (Kwon et al. (2017)). However, these sonar sensors suffer from problems such as inflexibility or low resolution. SSSs are usually mounted on a ship and cannot change their pose; the ship itself has to move forward to measure the surrounding environment. In other words, they are not suitable for mounting on AUVs or ROVs with high degrees of freedom. MSISs are used in relatively small AUVs or ROVs thanks to their small size; however, they have comparatively low resolution and very low sampling rates.

Recently, the development of acoustic cameras such as the dual frequency identification sonar (DIDSON) and the adaptive resolution imaging sonar (ARIS), which can generate high-resolution and wide-range images, has facilitated understanding of the underwater situation (Belcher et al. (2002)). This kind of sensor is relatively small, easy to mount on AUVs or ROVs, and can gather information over a larger area much faster. However, a wide fan-shaped beam is emitted and information is lost during imaging, which makes 3D reconstruction difficult. For example, we face an elevation uncertainty of 14 deg in the case of the ARIS EXPLORER 3000 (Sound Metrics), the latest-model acoustic camera.

To the best of our knowledge, most previous studies that achieve 3D reconstruction using acoustic cameras are feature-based, because it is possible to calculate 3D coordinate values through matching of corresponding feature points extracted from multiple acoustic images, like stereo matching in optical images. Corners or lines are usually used as features for 3D reconstruction. Mai et al. achieved 3D reconstruction by tracing such features using an extended Kalman filter (EKF) (Mai et al. (2017a))(Mai et al. (2017b)). However, automatically detecting such features in acoustic images is not as easy as in optical images. Moreover, this kind of sparse 3D model consisting of low-level features is not suitable for representing complex objects.

(This work was in part supported by Tough Robotics Challenge, ImPACT Program (Impulsing Paradigm Change through Disruptive Technologies Program).)

To solve this problem, occupancy grid mapping theory can be considered as a way of building a dense 3D volumetric representation. In 2005, Fairfield et al. successfully inspected a sinkhole in Mexico using occupancy grid mapping theory with pencil beams, which proves the effectiveness of such a volumetric representation in an underwater environment (Fairfield et al. (2005)). In 2016, Teixeira et al. accomplished 3D reconstruction of a ship hull consisting of volumetric submaps using acoustic images (Teixeira et al. (2016)). However, when processing time of flight (TOF) data from acoustic images, either the first or the strongest return above a certain threshold is kept, so too many data are discarded, which is not efficient. Guerneve et al. also applied a grid-based method to the acoustic camera, but their method is not fully probabilistic and the robot can only be moved along the z-axis (i.e., the height direction), which is not useful for an unmanned robot with arbitrary motion and not robust enough to noise (Guerneve and Petillot (2015)). In this paper, we also use a volumetric 3D model to reconstruct the underwater environment, and Bayesian inference is used to update the probability of each voxel constituting the 3D model. Here, in order to make occupancy mapping theory more suitable for the acoustic camera, we design a novel inverse sensor model. By using occupancy mapping theory and applying the novel inverse sensor model, it is possible to reconstruct a dense underwater 3D environment robustly and efficiently from arbitrary acoustic views.

The remainder of this paper is organized as follows. Section 2 explains the principles of the acoustic camera. Section 3 describes 3D occupancy mapping with the acoustic camera. The effectiveness of the proposed 3D mapping framework is evaluated with experimental results in Section 4. Section 5 gives conclusions and future work.

Fig. 1. Acoustic projection model: (a) shows the geometrical model of the acoustic image and (b) illustrates beam slices in the acoustic camera.

2. PRINCIPLES OF ACOUSTIC CAMERA

An acoustic camera can sense a wide 3D area and generate images like an optical camera; however, the imaging principle is totally different. For an optical camera, depth information is lost. On the contrary, for an acoustic camera, elevation angle information is lost. This phenomenon can be explained by the model described in this section.

Acoustic cameras insonify a fan-shaped acoustic wave in the forward direction with azimuth angle θmax and elevation angle φmax within the scope of the maximum range rmax, as shown in Fig. 1a. Once the acoustic wave hits underwater objects, it is reflected back to the sensor because of backscattering. As shown in Fig. 1b, the acoustic camera is a multiple-beam sonar: the 3D acoustic wave can be separated into 2D beam slices in the azimuth angle direction, and for each beam, only range information is acquirable. The result is that elevation angle information is lost.

In the imaging geometry of the acoustic camera, different points in the 3D sensing area with the same range r and the same azimuth angle θ are mapped to the same pixel on the 2D acoustic image. In other words, points represented by (r, θ, φ) in polar coordinates (i.e., (x, y, z) in Cartesian coordinates) are mapped to (r, θ) in the 2D acoustic image. Conversely, if the elevation angle φ of each point is recovered, we can reconstruct a 3D point cloud in the 3D sensing area for each measurement. More details on the principles of the acoustic camera can be found in (Kwak et al. (2015)).
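
To make the lost dimension concrete, here is a short Python sketch (our illustration, not code from the paper): two camera-frame points that share the same range and azimuth but have opposite elevation angles project to the same pixel (r, θ).

```python
import numpy as np

def project_to_acoustic_image(point):
    """Map a camera-frame 3D point (x_c, y_c, z_c) to acoustic image
    coordinates (r, theta); the elevation angle phi is discarded."""
    x, y, z = point
    r = np.sqrt(x**2 + y**2 + z**2)  # range
    theta = np.arctan2(x, y)         # azimuth angle
    # phi = np.arcsin(z / r) would be the elevation angle, lost in imaging
    return r, theta

# Two points with identical (r, theta) but opposite elevation angles
print(project_to_acoustic_image((1.0, 3.0, 0.5)))
print(project_to_acoustic_image((1.0, 3.0, -0.5)))  # same image pixel
```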

3. 3D RECONSTRUCTION FRAMEWORK

Figure 2 shows an overview of our 3D reconstruction framework. The input and output of our framework are:

Input:
• Acoustic images from multiple viewpoints
• Corresponding six degrees of freedom (6-DOF) camera poses

Output:
• Dense 3D reconstruction of an underwater environment

The process is divided into four steps: image segmentation, generation of input point clouds, 3D occupancy mapping, and refinement. The detailed process for each step is explained in the next subsections.

Fig. 2. Overview of 3D environment reconstruction framework based on occupancy mapping.
Fig. 3. Image segmentation process: (a) original acoustic image, (b) the image after binarization, and (c) the segmentation result, in which pixels are classified into occupied, free and unknown areas, represented by red, green and blue, respectively.

3.1 Image Segmentation

After acoustic images are input into the system, they are binarized using a threshold set in advance. Figure 3b shows an acoustic image after the binarization process. The binarized acoustic image is then segmented into three areas, defined as occupied, free and unknown. As mentioned above, the acoustic wave can be discretized into 128 2D beams in the azimuth angle direction. Considering one beam, every pixel in this radial direction is searched and labeled depending on its intensity and its relationship with several adjacent pixels. Algorithm 1 describes the image segmentation process. Here, n denotes the number of pixels in the radial direction and j represents the pixel index in one radial direction from the sensor origin. I is the intensity value of each pixel; if j is bigger than n, Ij equals zero. Figure 3c shows the acoustic image after segmentation, where the red area represents occupied, the green area represents free and the blue area represents unknown.

Algorithm 1 Image segmentation
Input: Acoustic image
Output: Classified pixels
i ← 0
while i < 128 do
    Flag ← 0
    j ← 1
    while j ≤ n do
        if Ij > 0 then
            label pixel j as occupied
        else if Ij ≤ 0 and Flag = 0 then
            label pixel j as free
        else
            label pixel j as unknown
        end if
        if Ij > 0 and Ij+1, Ij+2, . . . , Ij+k ≤ 0 then
            Flag ← 1
        end if
        if j + k > n then
            Flag ← 1
        end if
        j ← j + 1
    end while
    i ← i + 1
end while
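
Below is a minimal Python sketch of the per-beam labeling in Algorithm 1, written by us from the description above; the shadow-gap length k and the sample beam are illustrative assumptions, not values from the paper.

```python
import numpy as np

def segment_beam(intensity, k=5):
    """Label each pixel along one beam (radial direction) as
    'occupied', 'free', or 'unknown' following Algorithm 1.
    intensity: 1D array of binarized pixel intensities along the beam.
    k: number of consecutive empty pixels that starts a shadow region."""
    n = len(intensity)
    labels = []
    flag = 0  # set to 1 once a shadow (or the far end of the beam) is reached
    for j in range(n):
        if intensity[j] > 0:
            labels.append('occupied')
        elif flag == 0:
            labels.append('free')
        else:
            labels.append('unknown')
        # shadow begins: an echo followed by k empty pixels
        if intensity[j] > 0 and np.all(intensity[j + 1:j + 1 + k] <= 0):
            flag = 1
        if j + k > n:  # too close to the far end to decide; treat as unknown
            flag = 1
    return labels

# Example: echo at pixels 3-4, shadow behind it
beam = np.array([0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0])
print(segment_beam(beam, k=5))
```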

3.2 Point Cloud Generation

After the image segmentation, the pixel data of each group from all images are converted into input point clouds by generating elevation angles φ. To begin with, each pixel in an acoustic image is represented by (r, θ). φ is generated as

$$\phi_j = \frac{\phi_{\max}}{2} - \frac{j\,\phi_{\max}}{M}, \quad (1)$$

where the elevation direction is equally divided into M intervals; thus, j is an integer from zero to M, and φmax denotes the sensing scope of the acoustic camera in the elevation angle direction. Then, the generated points are converted from the camera spherical coordinates (r, θ, φ) to the camera Cartesian coordinates (xc, yc, zc) as:

$$\begin{bmatrix} x_c \\ y_c \\ z_c \end{bmatrix} = \begin{bmatrix} r\cos\phi\sin\theta \\ r\cos\phi\cos\theta \\ r\sin\phi \end{bmatrix}. \quad (2)$$

From the camera Cartesian coordinates, the points are then transformed to the world Cartesian coordinates (xw, yw, zw) as:

$$\begin{bmatrix} x_w \\ y_w \\ z_w \\ 1 \end{bmatrix} = \begin{bmatrix} c_\theta c_\phi & c_\phi s_\theta s_\psi - c_\psi s_\phi & s_\psi s_\phi + c_\psi c_\phi s_\theta & x \\ c_\theta s_\phi & c_\psi c_\phi + s_\theta s_\psi s_\phi & c_\psi s_\theta s_\phi - c_\phi s_\psi & y \\ -s_\theta & c_\theta s_\psi & c_\theta c_\psi & z \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_c \\ y_c \\ z_c \\ 1 \end{bmatrix}, \quad (3)$$

where (x, y, z, ψ, θ, φ) indicates the 6-DOF pose of the acoustic camera (i.e., the camera viewpoint). Here, s and c represent the sine and cosine functions, respectively. Figures 4a and 4b show an example of an input acoustic image after the segmentation process and the corresponding 3D point cloud data generated as above. The point cloud is also divided into three groups by labeling.
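
As a hedged illustration of Eqs. (1)–(3), the following sketch spreads a single pixel over the M + 1 candidate elevation angles and transforms the resulting points into the world frame; the values of M, the pixel, and the pose are arbitrary examples, not taken from the paper.

```python
import numpy as np

def generate_points(r, theta, phi_max, M):
    """Spread one image pixel (r, theta) over M+1 candidate elevation
    angles (Eq. (1)) and convert to camera Cartesian coords (Eq. (2))."""
    j = np.arange(M + 1)
    phi = phi_max / 2 - j * phi_max / M                   # Eq. (1)
    return np.stack([r * np.cos(phi) * np.sin(theta),     # x_c
                     r * np.cos(phi) * np.cos(theta),     # y_c
                     r * np.sin(phi)], axis=1)            # z_c

def camera_to_world(points_c, pose):
    """Transform camera-frame points to the world frame (Eq. (3)).
    pose = (x, y, z, psi, theta, phi): translation plus rotation angles."""
    x, y, z, psi, th, ph = pose
    sps, cps = np.sin(psi), np.cos(psi)
    sth, cth = np.sin(th), np.cos(th)
    sph, cph = np.sin(ph), np.cos(ph)
    R = np.array([
        [cth * cph, cph * sth * sps - cps * sph, sps * sph + cps * cph * sth],
        [cth * sph, cps * cph + sth * sps * sph, cps * sth * sph - cph * sps],
        [-sth,      cth * sps,                   cth * cps]])
    return points_c @ R.T + np.array([x, y, z])

pts_c = generate_points(r=2.0, theta=np.deg2rad(5.0),
                        phi_max=np.deg2rad(14.0), M=100)
pts_w = camera_to_world(pts_c, pose=(0.5, 0.0, 1.0, 0.1, 0.0, 0.2))
print(pts_w.shape)  # (101, 3): one arc of candidate points in the world frame
```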


Algorithm 2 Inverse sensor model
Input: 3D points
Output: L(mi|Ct)
if It is an occupied 3D point then
    return loccupied
end if
if It is a free 3D point then
    return lfree
end if
if It is an unknown 3D point then
    return lunknown
end if


After the 3D point cloud data are generated, the point clouds are downsampled by a voxel grid filter (Rusu and Cousins (2011)) in order to prevent over-updating of the probabilities in the subsequent occupancy mapping process: if k points fall in one voxel, that voxel would be updated k times, which not only increases the computing burden but also drives the probability too high or too low.

3.3 3D Occupancy Mapping

The typical occupancy grid mapping algorithm is used to realize 3D reconstruction from the point cloud data (Elfes (1989)). We use the OctoMap library, which produces an octree-based 3D model, to implement occupancy mapping (Hornung et al. (2013)).

Bayesian inference. We define the point cloud data from discrete time step t = 1 to t = T as C1, C2, . . . , CT. Here, the time step has the same meaning as the index of the image used. The 3D environment map is denoted as m and each voxel as mi. Under the assumption that each voxel is conditionally independent of the others given the measurements, the posterior probability of the map can be written as:

$$p(m\,|\,C_{1:t}) = \prod_i p(m_i\,|\,C_{1:t}). \quad (4)$$

By using Bayesian inference, p(mi|C1:t) can be calculated as:

$$p(m_i\,|\,C_{1:t}) = \frac{p(m_i\,|\,C_t)\,p(C_t)\,p(m_i\,|\,C_{1:t-1})}{p(m_i)\,p(C_t\,|\,C_1,\ldots,C_{t-1})}. \quad (5)$$

To simplify the calculation, we define L(x) using log odds as follows:

$$L(x) = \log\left(\frac{P(x)}{1-P(x)}\right). \quad (6)$$

Therefore, Eq. (5) can be written as:

$$L(m_i\,|\,C_{1:t}) = L(m_i\,|\,C_{1:t-1}) + L(m_i\,|\,C_t), \quad (7)$$

where L(mi|Ct) is called the inverse sensor model. Each voxel has a probability of occupancy, and after setting a threshold, the voxels can be classified into occupied, free and unknown. Normally, the occupied voxels constitute the dense 3D reconstruction result.
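
The log-odds update of Eqs. (6) and (7) can be sketched in a few lines of Python; we use a plain dictionary as a sparse voxel map purely for illustration, whereas the paper itself relies on the OctoMap library. The voxel index and measurement values are placeholders.

```python
import math
from collections import defaultdict

def log_odds(p):
    """Eq. (6): convert a probability into log odds."""
    return math.log(p / (1.0 - p))

def probability(l):
    """Inverse of Eq. (6): convert log odds back into a probability."""
    return 1.0 - 1.0 / (1.0 + math.exp(l))

# Sparse voxel map: voxel index -> accumulated log odds L(m_i | C_1:t);
# 0.0 log odds corresponds to probability 0.5, i.e., unknown.
voxel_map = defaultdict(float)

def update_voxel(voxel, l_measurement):
    """Eq. (7): add the inverse sensor model output to the stored log odds."""
    voxel_map[voxel] += l_measurement

l_occupied = log_odds(0.6)            # one occupied measurement
update_voxel((3, 1, 2), l_occupied)
update_voxel((3, 1, 2), l_occupied)   # a second observation of the same voxel
print(probability(voxel_map[(3, 1, 2)]))  # ~0.69, above the 0.5 threshold
```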

Fig. 4. Generation process of input point cloud data: (a) segmented acoustic image and (b) generated 3D point cloud data. Here, green, red and blue respectively represent the free, occupied and unknown groups for one measurement.

Fig. 5. Refinement process: (a) raw 3D reconstruction result and (b) 3D reconstruction result after filtering.

Inverse sensor model. In order to apply acoustic camera images to occupancy mapping, a novel inverse sensor model is designed based on the principles of the acoustic camera. The inverse sensor model can be represented as:

$$L(m_i\,|\,C_t) = \begin{cases} l_{\mathrm{occupied}} \\ l_{\mathrm{free}} \\ l_{\mathrm{unknown}} \end{cases} \quad (8)$$

and is calculated by Algorithm 2. In the inverse sensor model, we take acoustic shadow into consideration, and the following assumptions are made. Here, we define |lfree/loccupied| as the ratio of empty.

• Free space is more likely to be free, so the ratio of empty is larger than 1.
• Acoustic shadows are difficult to distinguish from the background; they are assumed to be mixed with the background.
• Since shadows are not detected in the images, areas where shadows are supposed to exist are processed as unknown.

A ratio of empty larger than 1 is important because, if there is no reflection signal in one radial direction, we can be sure that there is nothing in this direction. On the other hand, an occupied measurement may include plenty of "false measurements" due to the point cloud generation method. Assume that lfree = −2.2 with an occupancy probability of 0.1, loccupied = 0.41 with an occupancy probability of 0.6, and the threshold of occupancy probability is 0.5, which is zero in log odds. If we acquire a measurement that a voxel is free, we need more than five occupied measurements to change the state of this voxel from free to occupied. This is also the reason why we take shadow into consideration: if a shadow area is simply processed as free, we need plenty of measurements to change the state of these voxels, which sometimes makes our reconstruction unsuccessful. If the ratio of empty → ∞, occupancy mapping can be seen as a paradigm of space carving (Klingensmith et al. (2014))(Aykin and Negahdaripour (2017)).
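
The "more than five" count follows directly from the log-odds arithmetic, as this quick Python check reproduces:

```python
import math

l_free = math.log(0.1 / 0.9)       # -2.197, occupancy probability 0.1
l_occupied = math.log(0.6 / 0.4)   # +0.405, occupancy probability 0.6

# After one free measurement, count how many occupied measurements are
# needed before the accumulated log odds exceeds the threshold (0.0).
L, n = l_free, 0
while L <= 0.0:
    L += l_occupied
    n += 1
print(n)  # 6, i.e., more than five occupied measurements
```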



Fig. 6. Simulation experiment with virtual images: (a) the simulation environment and (b) examples of the generated virtual images.



Fig. 7. Simulation result: (a) dense 3D reconstruction result of the complex environment and (b) the 3D reconstruction result rendered together with the model used as ground truth.

3.4 Refinement

After generating the 3D occupancy map, noise remains that degrades the reconstruction result. Therefore, we apply a radius outlier filter (Rusu and Cousins (2011)), which is one of the simplest refinement methods. Figure 5 shows our raw result and the result after filtering.
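
A naive numpy sketch of such a radius outlier filter is shown below (the paper relies on the Point Cloud Library implementation; the radius and neighbor-count values here are placeholders):

```python
import numpy as np

def radius_outlier_filter(points, radius=0.05, min_neighbors=5):
    """Keep only points that have at least min_neighbors other points
    within the given radius (O(N^2) brute force; PCL uses a k-d tree)."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    # count neighbors within radius, excluding the point itself
    neighbor_counts = np.sum(dist < radius, axis=1) - 1
    return points[neighbor_counts >= min_neighbors]

cloud = np.random.rand(1000, 3)
filtered = radius_outlier_filter(cloud, radius=0.1, min_neighbors=3)
print(len(cloud), '->', len(filtered))
```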

4. EXPERIMENT

To prove the feasibility of our 3D reconstruction framework, simulation experiments were conducted with virtual acoustic images generated by the acoustic camera imaging simulator we developed (Kwak et al. (2015)). First, we tried the 3D reconstruction of some simple objects, as in Fig. 5, which performed successfully. Then, we applied our framework to a complex environment. We also applied real experimental data to our framework to verify its effectiveness.

4.1 Simulation

Figure 6a shows the complex simulation environment we designed, which was generated by structure from motion (SfM). The size of the simulation environment is 1750 (W) × 1500 (D) × 650 (H) mm, including six artificial objects fixed on boards. The simulated acoustic camera viewpoints (i.e., 6-DOF poses) are represented as yellow boxes in Fig. 6a, where blue, green and red axes represent the x, y and z axes, respectively.

Camera motion may affect the result of 3D reconstruction (Huang and Kaess (2015))(Aykin and Negahdaripour (2017)). The elevation angle has an uncertainty of 14 deg, so we have to narrow down the occupied points using intersection: arcs with uncertainty from different observations should intersect rather than coincide or be parallel. In other words, translation along the z-axis and rotations in pitch (i.e., y-axis rotation) and roll (i.e., x-axis rotation) are relatively effective. This is not a necessary condition, but if one observation merely increases the occupied probability of voxels that are supposed to be free, we need more observations to decrease the probability of these voxels to get a correct final result. For a single motion, roll rotation is the most effective way to lead to intersection of the arcs with uncertainty. The camera motion is generated as follows (a code sketch is given at the end of this subsection):

• Move the acoustic camera around the environment to be inspected and stop at an arbitrary appropriate position.
• Make a roll rotation (i.e., x-axis rotation), storing images and poses.
• Move the acoustic camera again to another arbitrary appropriate position.

Some of the corresponding simulated images are illustrated in Fig. 6b. Note that we treated these images as real image data from multiple acoustic camera viewpoints in our simulation experiment.

The 3D reconstruction process was performed using the simulation data mentioned above. As shown in Fig. 7, we reconstructed the environment successfully. In Fig. 7b, the dense 3D reconstruction result and the model used as ground truth are rendered simultaneously. From this result, we confirmed the effectiveness of our framework. As for computing cost, each acoustic image is processed in about 2 s using an Intel Core i7 vPro. 8,960,000 points are generated from one image, and after applying the voxel grid filter, about 1,000,000 points are left for updating the occupancy map. The resolution of the voxels is 20 mm. For one roll rotation, an image is taken every 5 deg, so 72 images are used as input data from one position. However, it is possible to take fewer images from one position. The computing time shows that this algorithm can generate the map online.
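
For illustration, the roll-sweep viewpoint generation described above can be sketched as follows; the function name, position, and angles are our own placeholders, not values from the paper.

```python
import numpy as np

def roll_sweep_viewpoints(position, pitch, yaw, step_deg=5.0):
    """Generate 6-DOF camera poses (x, y, z, roll, pitch, yaw) by sweeping
    the roll angle through a full rotation at one fixed position."""
    x, y, z = position
    rolls = np.deg2rad(np.arange(0.0, 360.0, step_deg))
    return [(x, y, z, roll, pitch, yaw) for roll in rolls]

poses = roll_sweep_viewpoints((1.0, 0.5, 0.8), pitch=0.0, yaw=np.deg2rad(30))
print(len(poses))  # 72 images per position, as in the paper
```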



Fig. 8. Real experiment using ARIS EXPLORER 3000: (a) the experiment environment, where the acoustic camera ARIS EXPLORER 3000 is mounted on a pan-tilt module fixed to a steel bar, and (b) examples of the captured real images.

4.2 Real Data

We also applied real experimental data to our framework using the ARIS EXPLORER 3000, a state-of-the-art acoustic camera. In order to verify the validity of our method, we took 27 acoustic images in a real underwater environment as shown in Fig. 8. Triangle and square objects were fixed on a wooden board and submerged in turbid water. The acoustic camera was mounted on a pan-tilt camera mounting module which can perform roll and pitch rotations. In this experiment, the position of the acoustic camera was fixed and the pitch angle was kept at about 30 deg.

To inspect these objects, a roll rotation from −60 deg to 60 deg was carried out once. It is worth mentioning that the images were binarized by the conventional Otsu's method (Otsu (1979)). The dense 3D reconstruction result is shown in Fig. 9. We can recognize the objects and the environment successfully. More viewpoints are necessary for a better result, which is one of our future works.
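
Otsu binarization is available off the shelf, e.g., in OpenCV; the snippet below is our example, not the authors' code, and the image path is a placeholder.

```python
import cv2

# Load a grayscale acoustic image (path is a placeholder)
img = cv2.imread('acoustic_frame.png', cv2.IMREAD_GRAYSCALE)
# Otsu's method picks the threshold automatically from the histogram
thresh, binary = cv2.threshold(img, 0, 255,
                               cv2.THRESH_BINARY + cv2.THRESH_OTSU)
print('Otsu threshold:', thresh)
```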



Fig. 9. Dense 3D reconstruction result with real acoustic images captured by roll rotation at one position.

5. CONCLUSION

In this paper, we applied a probabilistic method to realize 3D reconstruction from acoustic images with unknown elevation angles. A new inverse sensor model was designed to apply acoustic image data to occupancy mapping theory. Simulations were implemented and real experimental data were also used to confirm the feasibility of our algorithm. Our proposed framework can reconstruct objects of any shape without relying on features.

Future work related to this paper will involve evaluating our results quantitatively and mounting the acoustic camera on an underwater robot to carry out real experiments with more viewpoints. Because it is difficult to acquire the true robot pose, a localization method may be necessary to get a precise result; thus, a method to realize simultaneous localization and mapping (SLAM) will be implemented. Furthermore, after 3D reconstruction, 3D recognition is also very important for automatic inspection of the underwater environment. Finally, further verification is necessary to find out whether our algorithm works on a real sea bed with non-artificial objects.

ACKNOWLEDGEMENTS

The authors would like to thank S. Fuchiyama, A. Ueyama, K. Okada and their colleagues at KYOKUTO Inc. for full access to their facilities for the real experiment. We would also like to thank Y. Yamamura, F. Maeda, and S. Imanaga at TOYO Corp., who helped in this research project over a day for the equipment setup and the acquisition of the sonar data.


REFERENCES

Aykin, M.D. and Negahdaripour, S. (2017). Three-dimensional target reconstruction from multiple 2-D forward-scan sonar views by space carving. IEEE Journal of Oceanic Engineering, 42(3), 574–589.
Belcher, E., Hanot, W., and Burch, J. (2002). Dual-frequency identification sonar (DIDSON). Proceedings of the 2002 IEEE International Symposium on Underwater Technology (UT), 187–192.
Elfes, A. (1989). Using occupancy grids for mobile robot perception and navigation. Computer, 22(6), 46–57.
Fairfield, N., Kantor, G.A., and Wettergreen, D. (2005). Three dimensional evidence grids for SLAM in complex underwater environments. Proceedings of the 14th International Symposium of Unmanned Untethered Submersible Technology (UUST).
Guerneve, T. and Petillot, Y. (2015). Underwater 3D reconstruction using BlueView imaging sonar. Proceedings of the 2015 MTS/IEEE Conference on OCEANS-Genova, 1–7.
Hornung, A., Wurm, K.M., Bennewitz, M., Stachniss, C., and Burgard, W. (2013). OctoMap: An efficient probabilistic 3D mapping framework based on octrees. Autonomous Robots, 34(3), 189–206.
Huang, T.A. and Kaess, M. (2015). Towards acoustic structure from motion for imaging sonar. Proceedings of the 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 758–765.
Klingensmith, M., Hermann, M., and Srinivasa, S. (2014). Noisy data via voxel depth carving. Proceedings of the 2014 International Symposium on Experimental Robotics (ISER).
Kwak, S., Ji, Y., Yamashita, A., and Asama, H. (2015). Development of acoustic camera-imaging simulator based on novel model. Proceedings of the 2015 IEEE 15th International Conference on Environment and Electrical Engineering (EEEIC), 1719–1724.
Kwon, S., Park, J., and Kim, J. (2017). 3D reconstruction of underwater objects using a wide-beam imaging sonar. Proceedings of the 2017 IEEE International Symposium on Underwater Technology (UT), 1–4.
Mai, N.T., Woo, H., Ji, Y., Tamura, Y., Yamashita, A., and Asama, H. (2017a). 3-D reconstruction of line features using multi-view acoustic images in underwater environment. Proceedings of the 2017 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI), 312–317.
Mai, N.T., Woo, H., Ji, Y., Tamura, Y., Yamashita, A., and Asama, H. (2017b). 3-D reconstruction of underwater object based on extended Kalman filter by using acoustic camera images. Preprints of the 20th World Congress of the International Federation of Automatic Control (IFAC 2017), 1066–1072.
Otsu, N. (1979). A threshold selection method from gray-level histograms. IEEE Transactions on Systems, Man and Cybernetics, 9(1), 62–66.
Rusu, R.B. and Cousins, S. (2011). 3D is here: Point Cloud Library (PCL). Proceedings of the 2011 IEEE International Conference on Robotics and Automation (ICRA), 1–4.
Sun, N., Shim, T., and Cao, M. (2008). 3D reconstruction of seafloor from sonar images based on the multi-sensor method. Proceedings of the 2008 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI), 573–577.
Teixeira, P.V., Kaess, M., Hover, F.S., and Leonard, J.J. (2016). Underwater inspection using sonar-based volumetric submaps. Proceedings of the 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 4288–4295.