Augmented reality navigation for liver resection with a stereoscopic laparoscope


Huoling Luo a,b,1, Dalong Yin c,d,1, Shugeng Zhang c,d, Deqiang Xiao a,b, Baochun He a, Fanzheng Meng c, Yanfang Zhang e, Wei Cai f, Shenghao He a, Wenyu Zhang f, Qingmao Hu a,b, Hongrui Guo c, Shuhang Liang c, Shuo Zhou c, Shuxun Liu c, Linmao Sun c, Xiao Guo c, Chihua Fang f, Lianxin Liu c,d,∗, Fucang Jia a,b,∗

a Research Lab for Medical Imaging and Digital Surgery, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
b Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen, China
c Department of Hepatobiliary Surgery, First Affiliated Hospital of Harbin Medical University, Harbin, China
d Department of Hepatobiliary Surgery, Shengli Hospital Affiliated to University of Science and Technology of China, Hefei, China
e Department of Interventional Radiology, Shenzhen People's Hospital, Shenzhen, China
f Department of Hepatobiliary Surgery, Zhujiang Hospital, Southern Medical University, Guangzhou, China

∗ Corresponding authors. E-mail addresses: [email protected] (L. Liu), [email protected] (F. Jia).
1 These authors contributed equally.

Article history: Received 10 January 2019; Revised 14 August 2019; Accepted 27 September 2019.

Keywords: Augmented reality; Laparoscopic surgery; Liver resection; Surgical navigation

Abstract

Objective: Understanding the three-dimensional (3D) spatial position and orientation of vessels and tumor(s) is vital in laparoscopic liver resection procedures. Augmented reality (AR) techniques can help surgeons see the patient's internal anatomy in conjunction with laparoscopic video images.

Method: In this paper, we present an AR-assisted navigation system for liver resection based on a rigid stereoscopic laparoscope. The stereo image pairs from the laparoscope are used by an unsupervised convolutional neural network (CNN) framework to estimate depth and generate an intraoperative 3D liver surface. Meanwhile, 3D models of the patient's surgical field are segmented from preoperative CT images using the V-Net architecture for volumetric image data in an end-to-end predictive style. A globally optimal iterative closest point (Go-ICP) algorithm is adopted to register the pre- and intraoperative models into a unified coordinate space; then, the preoperative 3D models are superimposed on the live laparoscopic images to provide the surgeon with detailed information about the subsurface of the patient's anatomy, including tumors, their resection margins and vessels.

Results: The proposed navigation system is tested on four laboratory ex vivo porcine livers and five operating theatre in vivo porcine experiments to validate its accuracy. The ex vivo and in vivo reprojection errors (RPE) are 6.04 ± 1.85 mm and 8.73 ± 2.43 mm, respectively.

Conclusion and Significance: Both the qualitative and quantitative results indicate that our AR-assisted navigation system shows promise and has the potential to be highly useful in clinical practice.

1. Introduction

Laparoscopic liver resection is a promising type of minimally invasive surgery (MIS) that provides significant clinical benefits, including decreased postprocedure complications, reduced blood loss, shorter recovery time, less scarring and less tissue trauma [1]. Over the past few decades, interventional endoscopy [2] has been widely applied to organs within the body cavity in procedures such as laparoscopic gastrectomy, cholecystectomy, pancreatectomy, and partial nephrectomy [3-6]. However, in contrast to traditional open surgery, in laparoscopic surgery, surgeons must operate surgical instruments through small access ports in the abdominal wall within the limited working space created by pneumoperitoneum while watching the laparoscopic view on a monitor. The lack of tactile feedback and depth perception, the limited space for manipulation, and poor viewing angles make this type of procedure challenging to perform and cause a steep learning curve for surgeons. Recently, the advent of 3D laparoscopy has offered surgeons depth perception and is becoming popular in the operating room (OR).

To alleviate the drawbacks of laparoscopic surgery, augmented reality (AR)-assisted navigation systems, including video-based, projection-based, and see-through AR visualization methods [7], have been introduced to enhance the surgical experience.


Fig. 1. The main components of the AR-assisted laparoscopic navigation system.

Among these display methods, video-based AR is widely accepted by surgeons because it does not require additional hardware devices, such as a head-mounted device (HMD), that might burden surgeons during the operation. Video-based AR-assisted navigation systems fuse pre- or intraoperative medical image information with the live laparoscopic video to expand the surgeons' field of vision, allowing them to see critical internal structures located below the surface and achieve safer and more effective surgical outcomes. Although AR navigation has been applied to several surgical sites [8,9], it is still challenging to develop a comprehensive visualization system that accurately localizes the tumor position and determines the resection margin on the hepatic surface during laparoscopic liver resection operations. This difficulty is due to the complexity of intraoperative dynamics, such as deformations of the involved soft-tissue organs caused by pneumoperitoneum, respiration, and surgical manipulation, all of which compromise the accuracy of AR navigation systems.

From the standpoint of developing an AR-assisted navigation system, segmentation from preoperative computed tomography (CT) images, stereo surface reconstruction, hand-eye calibration, and registration are the main components that affect system performance (Fig. 1). Registration, which is responsible for computing the transformations between the pre- and intraoperative models, is a crucial component of a successful AR navigation system as well as the key process for boosting system accuracy and precision [9,10]. Nevertheless, because registration requires both pre- and intraoperative models as its inputs, successful and precise reconstruction of the 3D models derived from preoperative CT images of the liver and of the intraoperative surgical-field surface are essential steps in developing an AR navigation system.

The goals of this study were to develop an AR-assisted navigation system for laparoscopic liver resection surgery featuring efficient and intuitive visualization based on an automatic registration method, and to explore the validity and accuracy of the developed system through ex vivo and in vivo experiments.

The main contributions of this paper are as follows. (1) A CNN-based automatic algorithm was adopted to segment the liver model from preoperative CT images, and an unsupervised CNN framework was introduced to perform depth estimation when reconstructing the intraoperative 3D model for registration; both were integrated into the Medical Imaging Interaction Toolkit (MITK) [11] and can be computed rapidly during the procedure. (2) The system provides an intuitive visualization showing the tumor, its resection margin, vessels, etc., as demonstrated in the experimental results, and achieves a real-time or quasi-real-time refresh rate during AR navigation. (3) Extensive experiments have been conducted to explore the validity and accuracy of the developed AR system.

The rest of this paper is organized as follows. Section 2 presents related work, while Section 3 presents an overview of the proposed AR-assisted navigation system, including detailed information concerning the main components and algorithms. Section 4 reports the settings of the ex vivo and in vivo experiments. The experimental results and the discussion are presented in Sections 5 and 6, respectively.

2. Related works

AR-assisted navigation systems are used in a variety of surgeries and involve various registration methods, surface reconstruction algorithms, and visualization technologies. Research groups worldwide have paid considerable attention to this area, and numerous methods have been proposed to meet its unresolved challenges. Wu et al. [12] presented an AR-assisted system for spinal surgery that used a camera-projector system to superimpose the preoperative models on the surfaces of patients. Wen et al. [13] proposed an AR-based surgeon-robot cooperative system for transcutaneous ablation therapy. Similarly, Wen et al. adopted the projector-based AR visualization method proposed by Wu et al. to provide surgeons with AR information.


Hayashi et al. [14] developed a laparoscopic gastrectomy navigation system that adopted anatomical landmarks on the external body surface, including the xiphoid process and the umbilicus, and capitalized on the corresponding fiducials in CT images for registration; virtual laparoscopic views synchronized with the laparoscope were presented on a separate monitor for visualization. Nevertheless, anatomical landmarks located on the abdominal surface are difficult to transfer to laparoscopic liver navigation because the distribution of the landmarks may change after pneumoperitoneum, which would increase the fiducial registration error. Moreover, surgeons frequently need to switch between different views with this style of visualization. Yasuda et al. [15] introduced an AR system that utilized a tablet PC as a live image-capture device and superimposed preoperative models onto the surgical field after registration; however, this type of AR system may be constrained to open surgery. Thompson et al. [16] described an AR navigation system and an application for laparoscopic liver resection in which registration was performed between the surface patches reconstructed from a stereo laparoscope and preoperative anatomical surface models obtained from a commercial modelling service.

Due to the importance of intraoperative models and registration for AR navigation systems, intraoperative imaging modalities [17] and nonrigid registration methods have been adopted to improve accuracy and precision. Kang et al. [18] introduced a stereoscopic AR system for laparoscopic surgery that combined live laparoscopic ultrasound (LUS) and stereoscopic video. Tsutsumi et al. [4] described a real-time AR navigation system based on open magnetic resonance imaging (MRI) and applied it to laparoscopic cholecystectomy. Liu et al. [19] developed a laparoscopic AR system that integrated a stereoscope, laparoscopic ultrasound, and an electromagnetic tracking device. Kong et al. [20] used cone-beam computed tomography (CBCT) or CT to drive a nonrigid registration process for a virtual biomechanical model in soft-tissue surgery. Mountney et al. [21] registered the preoperative to the intraoperative CBCT coordinate system using a nonrigid biomechanically driven method and then registered the laparoscopic coordinates to the CBCT space via landmarks in fluoroscopic images. Kong et al. [20] adopted biomechanical finite-element models (FEM) to register a virtual model to laparoscopic images, in which shape changes were identified by tracking fiducials with an optical tracking system. Although these systems can potentially improve registration accuracy, the introduction of fiducials makes this type of navigation system significantly disruptive to the surgical workflow. Recently, Zhang et al. [56] proposed a markerless deformable registration method that first employs semiglobal block matching (SGBM) to reconstruct the renal surface and stitch the point clouds, then uses the ICP algorithm to obtain a coarse registration, and finally applies the coherent point drift (CPD) algorithm to achieve a fine registration. Although this approach does not rely on any external tracker and can handle deformation, it is currently difficult for it to achieve a real-time refresh rate during AR navigation.
In addition, because stereo laparoscopes have already been introduced into laparoscopic surgery, they can be used to provide 3D surface information of the surgical field. Stoyanov et al. [22] presented a semidense reconstruction approach for robotic-assisted surgery that first finds a seed set of candidate feature matches and then applies a region growing method to propagate disparity information around the seeds and reconstruct a semidense surface. Penza et al. [23] proposed two methods that follow the traditional sum of absolute differences (SAD)-based and census transform approaches and then refine the disparity image using superpixel segmentation. In addition to stereo reconstruction, a monocular laparoscope can also provide depth information. Chen et al. [24] proposed an ORB-SLAM-based algorithm for 3D surface reconstruction that employed the moving least squares (MLS) algorithm for smoothing and Poisson surface reconstruction for point cloud processing. A comprehensive review [25] of 3D surface reconstruction in laparoscopic surgery focused on methods other than large-scale applications of deep learning. Recently, depth estimation using a CNN [26] has achieved promising results in the computer vision community and has been extended to the robotic surgery field [27] for depth estimation and 3D surface reconstruction.

Our goal in this study is to provide an efficient and intuitively visualized AR-assisted navigation system for laparoscopic liver resection surgery. The laparoscopic stereo image pairs are used for depth estimation with a CNN framework to generate the intraoperative 3D liver surface, while the patient's liver model is quickly segmented from preoperative CT images using a pretrained V-Net architecture. Point clouds are generated from the pre- and intraoperative model surfaces and input into the globally optimal iterative closest point (Go-ICP) algorithm to obtain the physical-to-image spatial transformation. Then, tumors and their resection margins, blood vessels, etc. are superimposed on the live laparoscopic images to provide surgeons with detailed information about the patient's subsurface anatomy. To explore the validity and accuracy of the developed system, we conducted extensive ex vivo and in vivo experiments.

3. Overview of the AR-assisted navigation system

The proposed AR-assisted navigation system was implemented with C++ and Python on the Windows 10 OS. We adopted several open-source toolkits, including the Insight Segmentation and Registration Toolkit (ITK) [28], the Visualization Toolkit (VTK) [29], the Open Source Computer Vision Library (OpenCV) [30], TensorFlow [31] and MITK. MITK provides an image-guided therapy (IGT) module [32] that gives ready access to positional tracker devices. The navigation system consists of five modules: hand-eye calibration, preoperative image segmentation, intraoperative liver surface reconstruction, image-to-patient registration, and AR navigation.

Fig. 2 shows a pictorial representation of the entities involved in the AR-assisted navigation system. A rigid stereo laparoscope is used to capture live images for 3D liver surface reconstruction and AR display. A positional tracker is used to obtain the position and orientation of the stereo laparoscope, and a stereo display monitor with circularly polarized 3D glasses is used for visualization with depth perception. The key to successfully implementing an AR surgical navigation system is to find the spatial transformations between the various physical entities in the surgical scene, as shown in Fig. 1.

Fig. 2. Pictorial representation of the entities involved in an AR-assisted navigation system.


Five coordinate systems are involved in the AR navigation system: (1) the positional tracker coordinate system, C_T; (2) the reflective passive marker (RPM) coordinate system, C_R; (3) the preoperative CT image coordinate system, C_I; (4) the intraoperative stereo laparoscope coordinate system (3D), C_L; and (5) the laparoscopic image coordinate system (2D), C_P. More precisely, the positional tracker obtains the pose of the laparoscope by tracking the RPM fixed to the distal end of the laparoscope. Intraoperative models reconstructed from stereo laparoscope images define the intraoperative coordinate system and are registered with the models derived from the preoperative CT images to obtain the transformation between the pre- and intraoperative models, which reflects the image-to-patient spatial relationship. To generate the AR visualization, the models are transformed into the coordinate system of the stereo laparoscope and then projected into the coordinate system of the laparoscopic image. Considering the transformations and relationships described in Fig. 1, the product of the transformations around a closed loop is the identity matrix, and the following equation is formulated:

^{eye}T_{hand} × ^{hand}T_{tracker} × ^{tracker}T_{model} × ^{model}T_{eye} = I,   (1)

where the superscripts and subscripts denote the direction of a particular transformation, i.e., for each term, the transformation direction starts at the bottom-right subscript and ends at the upper-left superscript. Here, ^{eye}T_{hand}, ^{hand}T_{tracker}, ^{tracker}T_{model}, and ^{model}T_{eye} are the rigid transformation matrices from the RPM to the laparoscopic coordinate system, from the positional tracker coordinate system to the RPM, from the preoperative coordinate system to the positional tracker coordinate system, and from the laparoscopic coordinate system to the preoperative coordinate system, respectively. The matrices are represented as 4 × 4 homogeneous matrices in which the upper-left 3 × 3 submatrix denotes the rotation and the upper-right 3 × 1 submatrix is the translation vector.

As Eq. (1) and Fig. 1 show, the matrix ^{hand}T_{tracker} is provided by the positional tracker in real time. The calculation of the transformation matrix from the RPM to the laparoscopic camera, ^{eye}T_{hand}, is a well-known problem known as "hand-eye calibration", a term that stems from the robotics literature. For laparoscopic surgical navigation systems, the "hand" denotes the RPM, while the "eye" denotes the laparoscopic camera. The matrix ^{tracker}T_{model} can be obtained by registration. In other words, a point P_I defined in the CT image space is transformed to the laparoscopic coordinate system C_L (denote the transformed point as P_L) according to the following equations:

^{eye}T_{model} = ^{eye}T_{hand} × ^{hand}T_{tracker} × ^{tracker}T_{model},   (2)

P_L = ^{eye}T_{model} × P_I.   (3)
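To make the transformation chain of Eqs. (1)-(3) concrete, the following Python sketch composes the 4 × 4 homogeneous matrices and maps a CT-space point into the laparoscope frame. All numeric values are placeholders for illustration only; in the real system ^{hand}T_{tracker} is streamed by the Polaris tracker, ^{eye}T_{hand} comes from hand-eye calibration, and ^{tracker}T_{model} from image-to-patient registration.

    import numpy as np

    def make_T(R, t):
        """Build a 4 x 4 homogeneous transform from a 3 x 3 rotation and a translation."""
        T = np.eye(4)
        T[:3, :3] = R
        T[:3, 3] = t
        return T

    # Placeholder matrices standing in for the calibrated/estimated transforms.
    eye_T_hand = make_T(np.eye(3), [1.0, 0.0, 150.0])           # hand-eye calibration (Section 3.1)
    hand_T_tracker = make_T(np.eye(3), [-20.0, 5.0, -800.0])    # streamed by the positional tracker
    tracker_T_model = make_T(np.eye(3), [300.0, -40.0, 900.0])  # image-to-patient registration (Section 3.4)

    # Eq. (2): chain the transforms to map CT image space into the laparoscope frame.
    eye_T_model = eye_T_hand @ hand_T_tracker @ tracker_T_model

    # Eq. (3): transform a point defined in CT image space (homogeneous coordinates, mm).
    P_I = np.array([12.5, -34.0, 102.0, 1.0])
    P_L = eye_T_model @ P_I
    print("Point in laparoscope frame (mm):", P_L[:3])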

After the point defined in the laparoscopic coordinate system has been calculated, it can be readily projected to the laparoscopic 2D image coordinate system with the help of the extrinsic and intrinsic camera parameters to generate the virtual views, which are overlaid on the real surgical scene to form the AR visualization for surgeons.

3.1. Hand-eye calibration

The hand-eye calibration matrix ^{eye}T_{hand} is vital in an AR laparoscopic navigation system, as shown in Fig. 1. The stereo laparoscopic camera is calibrated to obtain the intrinsic and extrinsic camera parameters via the widely used checkerboard method [33] from OpenCV prior to hand-eye calibration. To obtain the hand-eye transformation, we adopted the progressive strategy based on an invariant dot previously proposed by our group [34], which was inspired by the work of Thompson et al. [35].
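As a concrete illustration of the checkerboard-based camera calibration that precedes hand-eye calibration, the sketch below runs OpenCV's standard stereo calibration on saved image pairs. The board dimensions, square size and file paths are assumptions for this example, and the authors' invariant-dot hand-eye method [34] itself is not reproduced here.

    import glob
    import cv2
    import numpy as np

    # Checkerboard geometry (assumed values, not reported in the paper).
    pattern_size = (9, 6)                  # inner corners per row and column
    square_size_mm = 5.0

    # 3D corner template in the checkerboard frame (Z = 0 plane), in millimetres.
    objp = np.zeros((pattern_size[0] * pattern_size[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:pattern_size[0], 0:pattern_size[1]].T.reshape(-1, 2)
    objp *= square_size_mm

    obj_pts, left_pts, right_pts = [], [], []
    for lf, rf in zip(sorted(glob.glob("calib/left_*.png")), sorted(glob.glob("calib/right_*.png"))):
        left = cv2.imread(lf, cv2.IMREAD_GRAYSCALE)
        right = cv2.imread(rf, cv2.IMREAD_GRAYSCALE)
        ok_l, corners_l = cv2.findChessboardCorners(left, pattern_size)
        ok_r, corners_r = cv2.findChessboardCorners(right, pattern_size)
        if ok_l and ok_r:
            obj_pts.append(objp)
            left_pts.append(corners_l)
            right_pts.append(corners_r)

    image_size = left.shape[::-1]          # (width, height) of the last image read
    # Intrinsics of each camera, then the fixed rotation/translation between them.
    _, K_l, d_l, _, _ = cv2.calibrateCamera(obj_pts, left_pts, image_size, None, None)
    _, K_r, d_r, _, _ = cv2.calibrateCamera(obj_pts, right_pts, image_size, None, None)
    _, K_l, d_l, K_r, d_r, R, T, E, F = cv2.stereoCalibrate(
        obj_pts, left_pts, right_pts, K_l, d_l, K_r, d_r, image_size,
        flags=cv2.CALIB_FIX_INTRINSIC)
    baseline_mm = float(np.linalg.norm(T))  # the B used later in z = f * B / d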

Because the RPM is fixed on the distal end of the laparoscope with a custom-made fixture (Fig. 4(D)), the hand-eye calibration matrix can be calculated once after the clinical sterile procedure and saved as a file that can be loaded during surgical navigation.

3.2. Liver segmentation

We employed the V-Net [36] fully convolutional architecture to segment the liver from preoperative CT images. The V-Net model was trained on our own collected dataset of 120 patients' liver CTs and then fine-tuned on a total of 15 porcine liver datasets that were scanned under 13 mmHg pneumoperitoneum pressure prior to the in vivo experiments. The ground truth of the training datasets was annotated by an experienced expert. Before training, air is segmented using an automatic region growing method, and the body mask is extracted by detecting the maximum connected component of the non-air regions. Finally, the gray values of the CT images are normalized to the range [0, 1], and all the training images are resampled to the same size (64 × 64 × 64) for input to V-Net.

Because the segmentation program must be deployed on a laptop computer with only a CPU, and because we only need to segment the liver surface in contact with pneumoperitoneum for registration (a task that is relatively easy, since the intensity of air differs markedly from that of tissue), we finally used an input size of 64 × 64 × 64 and the Caffe library to implement V-Net on a Windows CPU. The network parameters were all set following the original V-Net, with a batch size of four and 10,000 iterations for fast training. The initial learning rate was lr = 0.0001, decayed by a factor of 0.1 every 5000 iterations. We used the stochastic gradient descent optimizer and the Dice loss during training.

Because the downsampled resolution resulted in a very coarse segmentation, the surface in contact with pneumoperitoneum was not accurate when extracted directly from the segmentation result (Fig. 12(A) and (B)). Thus, we improved the result by searching from the initial surface pixels along the Y-axis direction until the pixel intensity value dropped below zero (see the blue arrows in Fig. 12(B)). Using this method, the extracted surface is not affected by the coarse input resolution of the segmentation, as shown in Fig. 12(C).
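A minimal sketch of this surface-refinement step is given below. It assumes the CT volume is indexed as (x, y, z) in Hounsfield units, that the coarse V-Net mask has been resampled to the CT grid, and that the pneumoperitoneum gas lies toward smaller y indices; these axis and direction conventions are assumptions for illustration, not details taken from the paper.

    import numpy as np

    def refine_anterior_surface(ct_hu, coarse_mask, step=-1):
        """Refine the liver surface in contact with pneumoperitoneum.

        ct_hu       : 3D CT volume in Hounsfield units, indexed as (x, y, z).
        coarse_mask : binary liver mask from the low-resolution segmentation,
                      already resampled to the CT grid.
        step        : search direction along the y axis (-1 assumes the gas
                      lies toward smaller y indices).
        Returns a list of (x, y, z) voxel indices on the refined surface.
        """
        refined = []
        xs, ys, zs = np.nonzero(coarse_mask)
        # For every (x, z) column, start from the most anterior coarse-mask voxel.
        for x, z in {(x, z) for x, z in zip(xs, zs)}:
            column_ys = ys[(xs == x) & (zs == z)]
            y = column_ys.min() if step < 0 else column_ys.max()
            # March along y until the intensity first drops below 0 HU (gas).
            while 0 <= y + step < ct_hu.shape[1] and ct_hu[x, y + step, z] >= 0:
                y += step
            refined.append((x, y, z))
        return refined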
3.3. Liver surface reconstruction

Stereo vision-based reconstruction methods need to find corresponding features in a stereo image pair to estimate disparity. However, feature-based approaches often result in a sparse or semidense reconstructed surface, and their accuracy is not ideal, especially for textureless areas, areas with repeated textures, and specular highlights, all of which are common in laparoscopic liver surgery. Thus, it is challenging to produce a dense reconstruction result. Recently, this problem has been alleviated by CNN-based methods, which have already achieved promising results. Given a rectified stereo image pair, the disparity of a particular pixel (x, y) in the reference image (left image) is the offset d of its location at (x - d, y) in the opposite image (right image). After a disparity map has been computed, the depth z can be easily calculated by z = fB/d, where f and B are the camera focal length and the stereo camera's baseline, respectively; both of these parameters can be measured through stereo camera calibration.

Inspired by the work of [37], we resorted to an unsupervised stereo depth estimation method for liver surface reconstruction. The proposed CNN-based unsupervised learning model is illustrated in Fig. 3. The main idea in [37] is that a function that learns to reconstruct one image from the other can also provide the 3D shape information of the scene being imaged.
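To make the disparity-to-depth step explicit, the sketch below back-projects a predicted left-image disparity map into a 3D point cloud with the pinhole model (z = fB/d). The focal length, principal point and image size are placeholder values rather than the calibration of the laparoscopes used in this work; only the 4.4 mm baseline is taken from the hardware description in Section 4.1.

    import numpy as np

    def disparity_to_point_cloud(disparity, fx, fy, cx, cy, baseline_mm, min_disp=0.5):
        """Convert a left-image disparity map (pixels) into an N x 3 point cloud (mm)."""
        h, w = disparity.shape
        u, v = np.meshgrid(np.arange(w), np.arange(h))
        valid = disparity > min_disp               # discard tiny or invalid disparities
        z = fx * baseline_mm / disparity[valid]    # depth: z = f * B / d
        x = (u[valid] - cx) * z / fx               # back-project with the pinhole model
        y = (v[valid] - cy) * z / fy
        return np.stack([x, y, z], axis=1)

    # Placeholder intrinsics (illustrative only).
    fx = fy = 900.0          # focal length in pixels
    cx, cy = 640.0, 360.0    # principal point for a 1280 x 720 image
    baseline_mm = 4.4        # baseline of the custom stereo laparoscope (Section 4.1)

    disparity = np.full((720, 1280), 30.0, dtype=np.float32)   # stands in for the CNN output
    points = disparity_to_point_cloud(disparity, fx, fy, cx, cy, baseline_mm)
    print(points.shape)      # (N, 3) surface points used later for registration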


Fig. 3. CNN-based unsupervised learning model for liver surface reconstruction.

In contrast to the work in [37], which used only the left image to predict both the left and right disparity maps and sampled the counterpart image to reconstruct the left and right images, we fully use the stereo image information: the left image is set as the CNN input to predict the left disparity map, and the right image is used to infer the right disparity map through the same encoder-decoder network, an architecture similar to that of [37]. Bilinear sampling from the spatial transformer networks (STN) [38] is employed to reconstruct the left and right images. Skip connections from the activation blocks of the encoder are used in the decoder network to enable it to resolve higher-resolution details of the input image. Just as in [37], the training loss is based on three main terms: the appearance-matching loss C_ap, the disparity smoothness loss C_ds, and the left-right consistency loss C_lr. To make the loss function more robust, four output scales are used in the training model, and the total loss is the sum over the four scales: Loss = Σ_{s=1}^{4} C_s, with C_s = a(C_ap^l + C_ap^r) + b(C_ds^l + C_ds^r) + c(C_lr^l + C_lr^r), where a, b, and c are weights that control the contributions of the three loss terms. Readers are encouraged to consult [37] for more details on this loss function.

The CNN-based unsupervised learning model was initially trained on the da Vinci data from [27] and then tested on ex vivo and in vivo laparoscopic images. During the testing stage of our AR-assisted navigation system, only the left estimated disparity map is used for 3D surface reconstruction. However, if the pretrained CNN model is directly applied to predict the disparity image of our experimental laparoscopic images, the absolute scale of the 3D reconstruction will be inaccurate, because the cameras used to capture the training datasets may have intrinsic and extrinsic parameters different from those of the camera used in our experiment. We therefore follow the strategy proposed in [39] to adjust the disparity: D_in-vivo = (f_in-vivo / f_train) D_pre-trained, where D_pre-trained is the disparity image inferred directly from the pretrained model, and f_train and f_in-vivo are the focal lengths of the camera used for training and of the laparoscope used in our experiment, respectively.

3.4. Image-to-patient registration

The image-to-patient registration is employed to determine the transformation between the coordinate system of the preoperative model and that of the positional tracker, as shown in Fig. 1. The iterative closest point (ICP) algorithm is a widely used approach for registration in navigation systems. Thompson et al. [16] used the rigid ICP algorithm to perform the registration between the preoperative models derived from CT images and the surface patches reconstructed from intraoperative laparoscopic images. This approach is effective for constructing a wide area based on the hand-eye transformation and an optical positional tracking system. However, the conventional ICP algorithm is susceptible to local minima, and a high-quality initialization must be performed to guarantee its performance. Accomplishing this is a challenge for surgical navigation systems because the preoperative and intraoperative models are extracted at very different times and from different spaces.

In this paper, we extend the work of [16] to address this problem by exploiting the globally optimal ICP (Go-ICP) [40] algorithm in our AR-assisted navigation system. The Go-ICP approach is based on a branch-and-bound scheme that integrates local ICP to search the entire 3D motion space, and it can produce a stable, globally optimal registration result. In our AR-assisted navigation system, one point cloud is sampled from the surface of the preoperative model, and the other is reconstructed using the liver surface reconstruction method described in Section 3.3. Both point clouds are used as the inputs to the Go-ICP algorithm to obtain the final optimal registration transformation. To improve the registration precision, only the surface of the preoperative liver model in contact with pneumoperitoneum, which is the area visible from the laparoscopic camera position, is extracted and used during registration.
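The branch-and-bound Go-ICP algorithm [40] is used in the actual system. As a simplified stand-in that illustrates only the local surface-to-surface alignment, without the global search over SE(3), the sketch below registers the two point clouds with Open3D's point-to-point ICP; the file names, correspondence distance and initial guess are assumptions for the example.

    import numpy as np
    import open3d as o3d

    # Point cloud sampled from the preoperative liver surface in contact with
    # pneumoperitoneum, and the intraoperative surface from Section 3.3.
    preop = o3d.geometry.PointCloud()
    preop.points = o3d.utility.Vector3dVector(np.loadtxt("preop_surface_mm.txt"))
    intraop = o3d.geometry.PointCloud()
    intraop.points = o3d.utility.Vector3dVector(np.loadtxt("intraop_surface_mm.txt"))

    # Local ICP refinement; Go-ICP additionally wraps this local step in a
    # branch-and-bound search so that no manual initialization is required.
    init = np.eye(4)
    result = o3d.pipelines.registration.registration_icp(
        intraop, preop, 10.0, init,
        o3d.pipelines.registration.TransformationEstimationPointToPoint())

    # Rigid transform aligning the intraoperative surface to the preoperative model.
    print("Fitness:", result.fitness)
    print(result.transformation)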

4. Evaluation

We conducted two groups of experiments to evaluate the utility and accuracy of the proposed system: one used ex vivo data (four porcine livers), and the other used in vivo animal data (five live pigs).

4.1. Hardware

The system runs on a 64-bit Windows 10 Lenovo ThinkPad W540 laptop computer with an NVIDIA Quadro K1100M graphics card, a 2.7 GHz quad-core processor, and 32 GB of memory. A Polaris Spectra optical tracker (Northern Digital Inc., Waterloo, Ontario, Canada) is employed for positional tracking. An RPM is attached by an ad hoc 3D-printed fixture to the distal end of the rigid stereo laparoscope. The laptop display is extended to a stereo display monitor (LG, model 2792PB) for visualizing the AR scene with depth perception through a pair of circularly polarized 3D glasses.

Because no clinical stereo laparoscope was available in our laboratory, we assembled two ultra-mini CMOS color cameras (MISUMI) to create a rigid stereo laparoscope (Fig. 4) with a spatial resolution of 720p, a 70° field of view and six integrated light sources around each camera. The custom-made stereo laparoscope has a 10 mm rod diameter and a 4.4 mm baseline, making it very similar to the commercial Aesculap® EinsteinVision® 3D camera system. We used the latter for the in vivo experiments and used the custom-made rigid stereo laparoscope only in the ex vivo experiments. Because the EinsteinVision® 3D camera system has no software interface, the live images from the laparoscope were captured via an Epiphan AV.io HD™ (Epiphan Systems Inc., Ottawa, Canada) video grabber connecting the 3D camera system to the laptop computer.


Fig. 4. The custom-made rigid stereo laparoscope used for the laboratory ex vivo experiments. (A) The custom-made rigid stereo laparoscope; (B) the miniature camera with six integrated LED lights around it; (C) reflective passive markers fixed on the laparoscope; (D) the ad hoc fixture for the RPM.

4.2. Ex vivo experiments

In this experiment, we used four ex vivo porcine livers to evaluate the efficacy of our AR surgical navigation system. Each porcine liver was fixed in a customized plastic basin. Five to seven copper nails (see Fig. 5(B)) with a cap diameter of 2.5 mm were attached to the surface of the liver to act as markers for AR error evaluation. One artificial liver tumor was implanted in each liver for tumor resection. The tumor was simulated by injecting 5 ml of an agar solution created by mixing 3 g of agar with 100 ml of distilled water, heating it to boiling, and then cooling it to 40 °C before injection into the liver parenchyma.

Example CT scans of the artificial tumors are shown in Fig. 6. To prevent the surface of the porcine liver from darkening, the temperature and humidity had to be kept within a reasonable range, which we achieved by placing water and ice in the plastic basin and maintaining CT-room conditions throughout the experimental procedure. A CT scan was performed to obtain the preoperative CT image for the AR system. Hand-eye calibration, preoperative image segmentation, intraoperative surface reconstruction and registration were then performed according to the methods described in Section 3. After these steps were completed, the preoperative models of the ex vivo porcine liver, including the tumor and markers, were projected from the CT image space to the stereo laparoscopic image space.

To evaluate the accuracy and functionality of the proposed AR navigation system, a hepatobiliary surgeon performed the artificial tumor resection under the guidance of the AR system. Prior to the liver resection, the model of the copper nails was projected and superimposed onto the laparoscopic image, and the merged images were saved for post-experimental error evaluation.

4.3. In vivo experiments

The in vivo experiments were approved by the institutional review board for animal experiments. All the animal experiments were performed in a CT intervention room. Nine healthy domestic pigs (weighing from 30 kg to 49 kg) were used in these experiments; however, four of the pigs died during the experimental procedure due to excessive muscle relaxant injection, so only five pigs were successfully tested. Fig. 7 details the steps of the in vivo experiment. The pigs were anesthetized, fixed supine on the CT scan bed, endotracheally intubated under general anesthesia, and monitored by a professional veterinarian throughout the experiment. As in the ex vivo experiment, one artificial tumor was created in each animal by injecting the agar solution into the liver parenchyma percutaneously under the guidance of ultrasound imaging.

Fig. 5. Ex vivo experimental scene. (A) The proposed AR-assisted navigation system was tested with porcine livers; (B) several copper nails with a cap diameter of 2.5 mm and a length of 1 cm were employed to act as markers for error evaluation; (C) the porcine liver was fixed in a customized plastic basin; (D) artificial tumor after liver resection.


Fig. 6. Artificial tumors imaged in the CT scans, left: ex vivo data; right: in vivo data.

Fig. 7. Detailed information of all steps in the animal experiments. (A) A pig was anesthetized and supinely fixed on the CT scan bed; (B) a pig was endotracheally intubated under general anesthesia; (C) an artificial tumor was injected under the guidance of ultrasonic imaging; (D) a CT scan was performed to obtain the preoperative CT image; (E) establishment of the pneumoperitoneum; (F) copper nails were implanted under the laparoscopic image; (G) scan to obtain the intraoperative CT images; (H) models reconstructed from the CT images; (I) intraoperative liver surface reconstruction; (J) AR visualization.

The critical internal liver structures were avoided during this procedure. The artificial tumor was used for AR visualization and tumor resection. After the pneumoperitoneum was established, five copper nails were attached to the surface of the liver for error evaluation. To improve the AR display in this experiment, a CT contrast medium (meglumine diatrizoate) was administered by intravenous injection, and images were acquired at the arterial and venous phases. In order to achieve an effect similar to deformable registration, the preoperative models for the in vivo experiments were derived from CT scans acquired after pneumoperitoneum. After the experiments were completed, the pigs were euthanized.

The data acquisition procedure was performed after the AR navigation system had been configured. The marker model was projected to the laparoscopic camera space and superimposed on the laparoscopic image, and the AR visualization results were saved for error evaluation after the completion of the animal experiment. The tumor, along with its resection margin, vessels, etc., was superimposed on the laparoscopic image to provide the AR view for the surgeon conducting the tumor resection.


Table 1
Comparison of different surface reconstruction methods.

                  Heart 1                                       Heart 2
Methods           RMSE (mm)     MAE (mm)      Match (%)         RMSE (mm)     MAE (mm)      Match (%)
Wang [42]         N/A           2.16 ± 0.65   97.25 ± 1.13      N/A           2.14 ± 0.83   99.96 ± 0.11
Stoyanov [22]     3.88 ± 0.87   2.36 ± 0.92   78.64 ± 2.00      4.85 ± 1.82   3.20 ± 1.15   80.64 ± 1.87
Hosni [43]        8.24 ± 0.92   4.87 ± 0.87   95.96 ± 1.57      7.73 ± 1.56   5.37 ± 1.53   88.92 ± 2.38
Chang [44]        1.85 ± 0.82   1.24 ± 0.89   100               2.66 ± 1.47   1.47 ± 1.23   100
Penza 1 [23]      2.95          N/A           57.50             1.66          N/A           51.90
Penza 2 [23]      3.38          N/A           55.40             1.70          N/A           44.70
Godard [37]       2.20 ± 0.67   1.76 ± 0.59   100               2.36 ± 0.45   1.56 ± 0.57   100
Ours              1.77 ± 0.51   1.41 ± 0.42   100               2.27 ± 0.39   1.43 ± 0.47   100

Fig. 8. The qualitative results of the proposed surface reconstruction method applied to cardiac datasets from the Hamlyn Center. (A) input images excluding background; (B) the predicted disparity images; (C) reconstructed 3D points; (D) the error map of reconstructed points.

5. Results

5.1. Liver surface reconstruction

To verify the effectiveness and accuracy of the employed liver surface reconstruction approach, we validated it on two public cardiac datasets [22,41] with associated ground truth. The stereo image pairs were rectified using the accompanying camera parameters before surface reconstruction, and the dark background was removed with an intensity threshold to exclude its impact on the reconstruction result. The CNN-based unsupervised learning model was trained on the da Vinci data from [27] and fine-tuned on the Heart 1 and Heart 2 datasets separately (i.e., through cross-training and testing on these two datasets). Table 1 lists the statistics of the different algorithms with respect to the mean absolute error (MAE), root mean square error (RMSE) and percentage of matched pixels compared with the ground truth. Fig. 8 shows the qualitative results of the proposed method applied to the same datasets.

5.2. Ex vivo and in vivo experiments

We adopted the reprojection error (RPE) [45] as a quantitative measurement for the presented system. The RPE is the average Euclidean distance, measured in the 3D space of the laparoscope, between a projected 2D image point and the corresponding true position. The left camera was chosen as the reference frame, and the projected 2D image points were transformed to 3D space via triangulation.
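As an illustration of how the RPE can be computed, the sketch below triangulates the manually picked real fiducial pixels and the corresponding AR-projected (virtual) fiducial pixels with OpenCV and reports the 3D Euclidean distances. The projection matrices and pixel coordinates are placeholders, not measurements from the experiments.

    import cv2
    import numpy as np

    def reprojection_error_mm(P_left, P_right, real_px, virtual_px):
        """Triangulate matched pixel pairs and return per-fiducial 3D distances (mm).

        P_left, P_right : 3 x 4 projection matrices of the rectified stereo pair.
        real_px, virtual_px : dicts with 'left'/'right' 2 x N arrays of pixel
        coordinates for the true fiducials and the AR-projected fiducials.
        """
        real_h = cv2.triangulatePoints(P_left, P_right, real_px["left"], real_px["right"])
        virt_h = cv2.triangulatePoints(P_left, P_right, virtual_px["left"], virtual_px["right"])
        real_3d = (real_h[:3] / real_h[3]).T       # convert from homogeneous coordinates
        virt_3d = (virt_h[:3] / virt_h[3]).T
        return np.linalg.norm(real_3d - virt_3d, axis=1)

    # Placeholder rectified projection matrices (4.4 mm baseline) and two fiducials.
    K = np.array([[900.0, 0.0, 640.0], [0.0, 900.0, 360.0], [0.0, 0.0, 1.0]])
    P_left = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P_right = K @ np.hstack([np.eye(3), np.array([[-4.4], [0.0], [0.0]])])

    real_px = {"left": np.array([[300.0, 500.0], [240.0, 260.0]]),
               "right": np.array([[280.0, 478.0], [240.0, 260.0]])}
    virtual_px = {"left": np.array([[304.0, 506.0], [242.0, 261.0]]),
                  "right": np.array([[283.0, 483.0], [242.0, 261.0]])}

    errors = reprojection_error_mm(P_left, P_right, real_px, virtual_px)
    print("RPE per fiducial (mm):", errors, "mean:", errors.mean())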

5.2.1. Ex vivo results

The AR system projected the model of the fiducials and superimposed it on the laparoscopic images. Because calculating the RPE requires the pixel coordinates in the 2D images for triangulation, one student was trained to pick the projected and real fiducial pixel coordinates from the laparoscopic images. The mean RPE for the ex vivo experiments was 6.04 ± 1.85 mm. Table 2 lists the mean and standard deviation of the RPE for every ex vivo liver experiment, and the corresponding error distribution is shown as a box plot in Fig. 9.

Table 2
RPE (mm) of the fiducials for the AR navigation system in the ex vivo experiments.

Data      Mean ± SD     Min    Max
Liver 1   6.06 ± 2.16   3.01   8.53
Liver 2   5.76 ± 1.75   2.38   8.10
Liver 3   6.02 ± 1.26   4.51   8.09
Liver 4   6.32 ± 2.43   2.71   10.45
All       6.04 ± 1.85   2.38   10.45

The configuration time includes the segmentation of the preoperative models, the liver surface reconstruction and the image-to-patient registration; the stereo camera calibration and hand-eye calibration are excluded from it. The segmentation of the ex vivo preoperative liver, markers and tumor is readily performed using the MITK threshold utility tool within 5 min. The main contributors to the configuration time are the liver surface reconstruction and the registration. Table 3 lists the configuration times for the different procedures in the ex vivo liver experiments. The mean reconstruction time is 140 ± 17 s over all ex vivo cases. In the ex vivo experiments, we resampled the intraoperative models as point clouds with a fixed number of 100,000 points, and the MSE threshold was set to 0.0005. The mean registration time is 32 ± 10 s.

Table 3
Configuration time for the ex vivo experiments.

Data        Reconstruction time (s)   Registration time (s)
Liver 1     155                       22
Liver 2     148                       32
Liver 3     141                       27
Liver 4     116                       46
Mean ± SD   140 ± 17                  32 ± 10

Fig. 9. Box-plot representation of the reprojection error in the ex vivo experiments.

Fig. 10 demonstrates the qualitative results of an ex vivo experiment with the presented AR navigation system. The initial positions of the pre- and intraoperative liver surface models differ substantially; however, the Go-ICP approach can register these two models with reasonable accuracy. After the registration is complete, the tumor with its resection margin can be superimposed on the laparoscopic image to guide the surgeon performing the liver resection. The frame rate of the current version of our AR navigation system is approximately 10 to 12 FPS without GPU acceleration.

5.2.2. In vivo results

Five healthy domestic pigs were used to validate the accuracy of the presented system. The preoperative liver model was segmented using the method introduced in Section 3.2. The segmentation results were assessed by leave-one-out cross-validation, which yielded an average Dice coefficient of 0.907 for the in vivo porcine liver under 13 mmHg pneumoperitoneum pressure. Fig. 11 shows the liver segmentation result for the porcine CT images. Because the vessels, artificial tumor, and markers have very different intensities in the DICOM images, they can be segmented using the MITK manual interactive tools.

The preoperative models derived from the CT images are depicted in Fig. 13. The total time cost for preparing the preoperative models, including the tumor, markers, and vessels, is about 10 min per dataset; the mean time cost for automatic liver segmentation is 134.6 ± 20 s. Five copper nails were attached to the surface of the in vivo porcine livers to act as fiducials for error evaluation after the pneumoperitoneum was established. The RPE of these markers is 8.73 ± 2.43 mm, as shown in Table 4.

Table 4
RPE (mm) of the fiducials for the AR navigation system in the in vivo experiments.

Data    Mean ± SD     Min    Max
Pig 1   8.78 ± 1.47   6.77   10.43
Pig 2   9.80 ± 1.10   8.39   10.94
Pig 3   7.53 ± 3.67   4.34   11.25
Pig 4   9.01 ± 2.92   6.60   13.67
Pig 5   8.45 ± 2.97   4.36   10.76
All     8.73 ± 2.43   4.34   13.67

Fig. 14 shows the corresponding error distribution as a box plot for the in vivo experiments. The qualitative results of the in vivo experiment are shown in Fig. 15. The resections in both the ex vivo and in vivo experiments resulted in a negative resection margin. Table 5 lists a comparison of different AR surgical navigation systems in terms of the registration method, surgical site, experiment type, modality of pre- and intraoperative imaging, and validation metric.

6. Discussion

During partial liver resection, the surgeon tries to preserve as much healthy liver parenchyma as possible while ensuring that all the tumor tissue is removed. In recent years, biophotonics-based AR methods [8] have been developed to address this issue. A fluorescent dye such as indocyanine green (ICG) [46] or fluorescein is injected into the patient prior to surgery; this type of agent glows in a different spectrum after laser excitation at a specific wavelength, and the emitted light can be captured by a camera equipped with an appropriate filter and used to guide the surgeon in distinguishing normal from abnormal tissue. However, the optimal dose and timing of ICG injection before hepatobiliary surgery have not been well studied [47]. Furthermore, the penetration depth of ICG fluorescence is shallow, and such equipment is expensive and difficult for many hospitals to purchase.

The proposed AR-assisted navigation system achieves similar functionality based on the preoperative segmented models during liver resection. Utilizing the CNN approaches, the pre- and intraoperative models can be derived in a reasonable time, as shown in Table 3; the models are then registered to transform them into a unified coordinate space. The system presents an intuitive AR navigation visualization by superimposing the tumor and its resection margin in different colors, as shown in Figs. 10 and 15. Moreover, other preoperative models, including the portal vein, hepatic vein, and gall bladder, can also be selected and overlaid on the live laparoscopic image to provide detailed information about the subsurface patient anatomy in the surgical field.

Fig. 10. (A) The initial positions of the pre- and intraoperative liver surface models; (B) results of Go-ICP registration; (C) AR overlay for the fiducials; (D) tumor with resection margin superimposed on the laparoscopic image.


Fig. 11. Liver segmentation results for porcine CT images based on the method presented in this paper. The first row shows the results for non-pneumoperitoneum CT images, and the second row shows the results for CT images under 13 mm Hg pneumoperitoneum pressure. The blue contour denotes the ground truth, while the red color indicates the output of the method.

Fig. 12. Illustration of the method for extracting the liver surface in contact with pneumoperitoneum: (A) segmentation result at a resolution of 64 × 64 × 64, (B) the liver surface in contact with pneumoperitoneum according to the segmentation result (red line), and (C) the final improved surface, obtained by searching from the coarse surface along the Y-axis direction until the first pixel with an intensity value lower than zero.

Fig. 13. Segmentation result of the proposed CNN-based liver segmentation method. The yellow contour indicates the output of the CNN-based segmentation method. The red contour denotes the region used for registration with the intraoperative model. (A) Axial view; (B) sagittal view; (C) coronal view; (D) 3D view of the preoperative models.

From the viewpoint of developing an AR laparoscopic navigation system, five vital modules are required: hand-eye calibration, preoperative image segmentation, intraoperative liver surface reconstruction, image-to-patient registration, and AR navigation visualization. As Eq. (1) shows, the overall error of the AR-assisted navigation system is influenced by the errors of these modules. To achieve a more accurate system, the error of each module should be constrained within a reasonable range. For hand-eye calibration, after the reflective passive markers were fixed on the distal end of the rigid laparoscope via a custom-made fixture, the transformation could be determined after the clinical sterile procedure.

The accuracy and feasibility of this approach were demonstrated in our previous work [34]. Thanks to the V-Net deep learning architecture, the preoperative image segmentation can be completed quickly and achieves good segmentation results, as shown in Fig. 11, with an average Dice coefficient of 0.907 as assessed by leave-one-out cross-validation. If we consider only the contour in contact with pneumoperitoneum, which was used for registration, the preoperative liver segmentation is sufficiently precise for AR navigation. As Table 5 shows, a considerable number of AR navigation systems obtain the preoperative models from commercial services. To verify the effectiveness and accuracy of the surface reconstruction approach employed in this paper, we used two public cardiac datasets [22,41] for validation, evaluating the MAE, RMSE and the percentage of matched pixels.


Table 5
Comparison of different AR surgical navigation methods.

System           Reg. method   Reg. style   Surgical site   Validation          Modality       Model        Metric     Error
Yasuda [15]      F             R            HBP             Clinical            CT             Commercial   FRE        ∼7.4 mm
Thompson [45]    NF            R            Liver           In vivo             CT             Commercial   RPE        ∼12.0 mm
Ieiri [50]       F             R            Spleen          Clinical            CT             Commercial   FRE/TRE    ∼8.5/∼5.0 mm
Hayashi [14]     F             R            Gastric         Clinical            CT             Manually     FRE        ∼14.0 mm
Kang [18]        /             R            Liver           Phantom/In vivo     CT/LUS         /            TRE        ∼3.0 mm
Kong [20]        F             R            Kidney          Ex vivo/In vivo     CT             /            TRE        ∼1.0 mm
Mahmoud [51]     F             R            Abdominal       Volunteers/Pigs     CT/MRI         /            FRE/TRE    ∼5.0/∼5.0 mm
Mountney [21]    NF            NR           Liver           Ex vivo/In vivo     CT/CBCT        /            TRE/SRE    ∼2.4 mm
Thompson [16]    NF            R            Liver           Phantom/In vivo     CT             Commercial   RPE        ∼10.0 mm
Tsutsumi [4]     F             R            Abdominal       Clinical            Open MRI       /            FRE        ∼7.0 mm
Wild [49]        F             R            Liver/Kidney    Ex vivo             Fluorescence   /            FVE/TVE    ∼10/∼15 px
Ours             NF            R            Liver           Ex vivo/In vivo     CT             Automatic    RPE        ∼8.7 mm

F = fiducial-based registration; NF = non-fiducial-based registration; R = rigid registration; NR = nonrigid registration; HBP = hepatobiliary and pancreatic; LUS = laparoscopic ultrasound.

Fig. 14. Box-plot representation of the reprojection error during the in vivo experiments.

Fig. 15. The qualitative result of the AR projection during an in vivo experiment, including the markers for error evaluation, the liver venous system, artery, artificial tumor and tumor resection margin.

The results in Table 1 show that the unsupervised CNN framework is a viable option for intraoperative surface reconstruction in the AR navigation system. As previously mentioned, image-to-patient registration is a critical step in successfully implementing an AR navigation system. Considering the complexity and difficulty of implementing deformable registration, many groups have resorted to fiducial-based methods (Table 5) and use a rigid registration approach to map the models from the preoperative space to the laparoscopic image coordinate system.

All the AR navigation systems listed in Table 5 have been applied to abdominal organs, including the hepatobiliary organs, pancreas, spleen, stomach, and kidney, which are prone to deformation. Kong et al. [20] adopted finite element modeling to update the models constructed from CT scans before superimposing them on laparoscopic images; this method dramatically improved the overall error of their AR navigation system. Other approaches compensate for the deformation caused by pneumoperitoneum or movement with an intraoperative imaging device. For instance, Kang et al. [18] developed an AR system that combined live laparoscopic ultrasound with stereoscopic video, Tsutsumi et al. [4] introduced open magnetic resonance imaging to capture intraoperative images, and Mountney et al. [21] used cone-beam CT and fluoroscopy as a bridge for pre- and intraoperative registration; moreover, Mountney et al. [21] conducted biomechanically driven nonrigid registration to improve the accuracy. However, most of the existing AR navigation systems adopted rigid-body registration in view of the reliability of the results, the difficulty of implementation, and the real-time requirements of a navigation system.

Furthermore, evaluating registration error is an important and challenging issue in AR navigation systems. A variety of evaluation metrics have been proposed to address this issue, such as the fiducial registration error (FRE) [48], target registration error (TRE) [48], surface registration error (SRE) [21], fiducial visualization error (FVE) [49], target visualization error (TVE) [49], and reprojection error (RPE) [45]; however, the names and definitions of these metrics are not uniform. To some extent, they are easy to confuse and make it difficult to select a metric for comparing different algorithms. Nevertheless, FRE and FVE are used for fiducial-based registration methods, SRE is non-fiducial-based, and TRE, TVE, and RPE can be applied to either type. In particular, virtual projected models can be computed for laparoscopic images and used in AR-assisted navigation systems, but pixel errors are difficult to interpret: they increase when the camera is close to the target and decrease when the camera is farther from the target. Recently, Thompson et al. [45] presented an intuitive method for error evaluation in the clinical theater that shows the errors to the surgeon during use, and their experimental results showed that the RPE is a good predictor of the TRE. Based on their work, we adopted the RPE as the evaluation metric for our AR-assisted navigation system. Because triangulation accuracy is sensitive to the distance to the camera [16], we maintained a near-constant distance between the liver surface and the laparoscopic camera in the ex vivo and in vivo experiments.
laparoscopic camera in the ex vivo and in vivo experiments. As listed in Tables 2 and 4, the mean RPEs were 6.04 ± 1.85 mm (2.38 mm to 10.45 mm) and 8.73 ± 2.43 mm (4.34 mm to 13.67 mm) for the ex vivo and in vivo experiments, respectively. Both of the RPE means were below 10 mm, which agrees with the conditions reported in [52] for clinical acceptance. Based on the same metric, Thompson et al. [45] reported that the SmartLiver system achieved an accuracy of approximately 12 mm for in vivo experiments (Table 5), and Hayashi et al. [53] presented a surgical navigation system for laparoscopic gastrectomy that achieved an average TRE of 12.6 mm in four blood vessels. Zhang et al. [56] employed CPD algorithm to perform deformable refined registration after ICP coarse registration, the promising results were achieved on their laparoscopy partial nephrectomy phantom experiment, and showed the application in renal retrospective clinical data. Unfortunately, none of quantitative in vivo experiment results were reported excepting registration time. Moreover, the boundary conditions of deformed kidney are easy to find compared with liver as the deformation of kidney is smaller than liver, the largest organ of abdomen cavity. From the aspect of reliable and real-time requirement of AR-assisted navigation system, we employed GoICP as the registration algorithm. However, the preoperative models we used in the in vivo experiments were derived from CT images scanned after establishing pneumoperitoneum, which can be regarded as adopting the intraoperative CT image. Liver mobilization and deformation caused by breathing, heartbeat and surgical manipulation are other factors likely to affect the accuracy of an AR navigation system. Deformation correction [54] and breathing compensation [55] are the corresponding useful tools to improve performance. Given these factors, we believe that the accuracy of the AR systems reported in [45] and [53] would be comparable to ours if intraoperative images were used, and the results of the in vivo intraoperative experiments would be improved, achieving a performance closer to that of the ex vivo experiments if further deformation correction and breathing compensation were employed. Partial-to-whole image registration is the other registration problem in AR navigation and means that only a small surface region will be reconstructed when the laparoscope is close to the target. This makes it difficult to achieve good results during registration to the entire preoperative liver surface. We followed the approach in [16] to combine the surface patches into a relatively large patch before registration. However, the time-consumption of surface reconstruction is proportional to the number of reconstructed patches and the accuracy of the combined surface is affected by liver movement. For the preoperative model, only the surface of the liver in contact with pneumoperitoneum was extracted for registration, which was the area visible from the laparoscopic camera position, too. Based on this approach, the pre- and intra-operative models can be constrained to the same region, which alleviates the partial-to-whole registration problem to some extent. Although our AR navigation system achieves good results, it still has some limitations. A short time lag is an important aspect of a real-time AR navigation system. The frame rate of our current version AR system is approximately 10 to 12 FPS without GPU acceleration. 
Although our AR navigation system achieves good results, it still has some limitations. A short time lag is an important aspect of a real-time AR navigation system. The frame rate of the current version of our AR system is approximately 10 to 12 FPS without GPU acceleration. This slow rate may confuse the surgeon when moving the laparoscope during an operation. However, we believe the frame rate could be improved, and a real-time refresh rate achieved, if a GPU were used. Another limitation is that the preoperative patient models are not deformed according to the pneumoperitoneum pressure, which also affects the accuracy of the system. How to present the system error to surgeons in real time is a further limitation of the proposed AR system. In addition, the input resolution of the liver segmentation network is 64 × 64 × 64, which limits the liver segmentation accuracy. The segmentation results could be further improved by increasing the input resolution and the scale of the network. We therefore conducted an additional retrospective experiment with an input size of 128 × 128 × 128 and a 3D U-Net model, which uses more features than the original V-Net, on an NVIDIA® Tesla V100 GPU with 16 GB of memory; it achieved an average Dice value of 0.93 under a leave-one-out strategy. We plan to conduct further retrospective experiments based on the more accurate segmentation results produced by such larger and deeper networks. As Fig. 1 shows, the preoperative models are transformed from the CT image space into the positional tracker space, i.e., by trackerTmodel, and then into the laparoscopic image space with the help of the RPM (handTtracker and eyeThand). The line-of-sight requirement of the positional tracker is a practical obstacle to deploying the AR navigation system in an operating room; however, the tracker could be eliminated if the intraoperative liver surface were reconstructed and registered to the preoperative models in real time. In the future, we will focus on these considerations to achieve further improvements.
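To make the chain of coordinate transforms above concrete, the sketch below is illustrative only: the matrices are placeholders rather than values from our system, and the association of each transform with its source (registration, tracked marker, hand-eye calibration) is indicative.

```python
# Minimal sketch of the transform chain described above; identity matrices are
# placeholders, not values from the actual system.
import numpy as np

tracker_T_model = np.eye(4)   # model (CT) space -> tracker space, e.g., from surface registration
hand_T_tracker  = np.eye(4)   # tracker space -> "hand" (marker) space, e.g., from the tracked RPM
eye_T_hand      = np.eye(4)   # "hand" space -> camera ("eye") space, e.g., from hand-eye calibration

# Composition: a preoperative model point is mapped into camera space, after which
# the camera intrinsics project it onto the laparoscopic image for the overlay.
eye_T_model = eye_T_hand @ hand_T_tracker @ tracker_T_model

p_model  = np.array([10.0, -5.0, 30.0, 1.0])   # hypothetical model point (homogeneous coordinates)
p_camera = eye_T_model @ p_model
```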

7. Conclusions

This paper presents an AR-assisted navigation system for liver resection based on a rigid stereoscopic laparoscope. The developed AR system provides intuitive visualization for navigation by superimposing preoperative models, including tumors with their resection margins and vessels, on the laparoscopic images, allowing surgeons to perceive the subsurface anatomy of the patient during surgery. Both laboratory ex vivo porcine liver experiments and in vivo porcine experiments were conducted to validate the effectiveness and accuracy of the AR-assisted navigation system. The qualitative and quantitative results are encouraging and show that the system has the potential to be highly useful in clinical practice.

Ethical approval

The in vivo experiment was approved by the institutional review board for animal experimental procedures.

Declaration of competing interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported in part by the National Key Research and Development Program (2016YFC0106500/2/3 and 2017YFC0110903), the NSFC-Union Program (U1613221), the Shenzhen Key Basic Science Program (JCYJ20170413162213765 and JCYJ20180507182437217), the Shenzhen Key Laboratory Program (ZDSYS201707271637577), and a grant from the State Key Laboratory of Robotics, Shenyang Institute of Automation, CAS (2019O14). The authors thank the radiologist, anesthetist, and US imaging physicians at the First Affiliated Hospital, Harbin Medical University for their help with the animal experiments.

References

[1] S. Nicolau, L. Soler, D. Mutter, J. Marescaux, Augmented reality in laparoscopic surgical oncology, Surg. Oncol. 20 (3) (2011) 189–201.
[2] X. Luo, K. Mori, T.M. Peters, Advanced endoscopic navigation: surgical big data, methodology, and applications, Annu. Rev. Biomed. Eng. 20 (1) (2018) 221–251.
[3] T. Son, W.J. Hyung, Laparoscopic gastric cancer surgery: current evidence and future perspectives, World J. Gastroenterol. 22 (2) (2016) 727–735.
[4] N. Tsutsumi, M. Tomikawa, M. Uemura, T. Akahoshi, Y. Nagao, K. Konishi, S. Ieiri, J. Hong, Y. Maehara, M. Hashizume, Image-guided laparoscopic surgery in an open mri operating theater, Surg. Endosc. 27 (6) (2013) 2178–2184.
[5] P. Pessaux, M. Diana, L. Soler, T. Piardi, D. Mutter, J. Marescaux, Robotic duodenopancreatectomy assisted with augmented reality and real-time fluorescence guidance, Surg. Endosc. 28 (8) (2014) 2493–2498.

[6] P. Chauvet, T. Collins, C. Debize, L. Novais-Gameiro, B. Pereira, A. Bartoli, M. Canis, N. Bourdel, Augmented reality in a tumor resection model, Surg. Endosc. 32 (3) (2018) 1192–1201.
[7] R. Tang, L.F. Ma, Z.X. Rong, M.D. Li, J.P. Zeng, X.D. Wang, H.E. Liao, J.H. Dong, Augmented reality technology for preoperative planning and intraoperative navigation during hepatobiliary surgery: a review of current methods, Hepatobiliary Pancreat. Dis. Int. 17 (2) (2018) 101–112.
[8] S. Bernhardt, S.A. Nicolau, L. Soler, C. Doignon, The status of augmented reality in laparoscopic surgery as of 2016, Med. Image Anal. 37 (2017) 66–90.
[9] Z. Yaniv, C.A. Linte, Applications of augmented reality in the operating room, Fundam. Wearable Comput. Augment. Reality (2016) 485–518.
[10] J.M. Fitzpatrick, D.L. Hill, C.R. Maurer, Image registration, Handb. Med. Imaging 2 (2000) 447–513.
[11] I. Wolf, M. Vetter, I. Wegner, T. Böttger, M. Nolden, M. Schöbinger, M. Hastenteufel, T. Kunert, H.P. Meinzer, The medical imaging interaction toolkit, Med. Image Anal. 9 (6) (2005) 594–604.
[12] J.R. Wu, M.L. Wang, K.C. Liu, M.H. Hu, P.Y. Lee, Real-time advanced spinal surgery via visible patient model and augmented reality system, Comput. Methods Programs Biomed. 113 (3) (2014) 869–881.
[13] R. Wen, W.L. Tay, B.P. Nguyen, C.B. Chng, C.K. Chui, Hand gesture guided robot-assisted surgery based on a direct augmented reality interface, Comput. Methods Programs Biomed. 116 (2) (2014) 68–80.
[14] Y. Hayashi, K. Misawa, M. Oda, D.J. Hawkes, K. Mori, Clinical application of a surgical navigation system based on virtual laparoscopy in laparoscopic gastrectomy for gastric cancer, Int. J. Comput. Assist. Radiol. Surg. 11 (5) (2016) 827–836.
[15] J. Yasuda, T. Okamoto, S. Onda, Y. Futagawa, K. Yanaga, N. Suzuki, A. Hattor, Novel navigation system by augmented reality technology using a tablet pc for hepatobiliary and pancreatic surgery, Int. J. Med. Robot. Comput. Assist. Surg. 14 (5) (2018) e1921.
[16] S. Thompson, J. Totz, Y. Song, S. Johnsen, D. Stoyanov, S. Ourselin, K. Gurusamy, C. Schneider, B. Davidson, D. Hawkes, M.J. Clarkson, Accuracy validation of an image guided laparoscopy system for liver resection, Medical Imaging 2015: Image-Guided Procedures, Robotic Interventions, and Modeling, 9415, 2015.
[17] H.G. Kenngott, M. Wagner, M. Gondan, F. Nickel, M. Nolden, A. Fetzer, J. Weitz, L. Fischer, S. Speidel, H.P. Meinzer, D. Böckler, M.W. Büchler, B.P. Müller-Stich, Real-time image guidance in laparoscopic liver surgery: first clinical experience with a guidance system based on intraoperative ct imaging, Surg. Endosc. 28 (3) (2014) 933–940.
[18] X. Kang, M. Azizian, E. Wilson, K. Wu, A.D. Martin, T.D. Kane, C.A. Peters, K. Cleary, R. Shekhar, Stereoscopic augmented reality for laparoscopic surgery, Surg. Endosc. 28 (7) (2014) 2227–2235.
[19] X. Liu, S. Kang, W. Plishker, G. Zaki, T.D. Kane, R. Shekhar, Laparoscopic stereoscopic augmented reality: toward a clinically viable electromagnetic tracking solution, J. Med. Imaging 3 (4) (2016) 045001.
[20] S.H. Kong, N. Haouchine, R. Soares, A. Klymchenko, B. Andreiuk, B. Marques, G. Shabat, T. Piechaud, M. Diana, S. Cotin, J. Marescaux, Robust augmented reality registration method for localization of solid organs tumors using ct-derived virtual biomechanical model and fluorescent fiducials, Surg. Endosc. 31 (7) (2017) 2863–2871.
[21] P. Mountney, J. Fallert, S. Nicolau, L. Soler, P.W. Mewes, An augmented reality framework for soft tissue surgery, in: International Conference on Medical Image Computing and Computer-Assisted Intervention, 2014, pp. 423–431.
[22] D. Stoyanov, M.V. Scarzanella, P. Pratt, G.Z. Yang, Real-time stereo reconstruction in robotically assisted minimally invasive surgery, in: International Conference on Medical Image Computing and Computer-Assisted Intervention, 2010, pp. 275–282.
[23] V. Penza, J. Ortiz, L.S. Mattos, A. Forgione, E.D. Momi, Dense soft tissue 3d reconstruction refined with superpixel segmentation for robotic abdominal surgery, Int. J. Comput. Assist. Radiol. Surg. 11 (2) (2016) 197–206.
[24] L. Chen, W. Tang, N.W. John, T.R. Wan, J.J. Zhang, Slam-based dense surface reconstruction in monocular minimally invasive surgery and its application to augmented reality, Comput. Methods Programs Biomed. 158 (2018) 135–146.
[25] L. Maier-Hein, P. Mountney, A. Bartoli, H. Elhawary, D. Elson, A. Groch, A. Kolb, M. Rodrigues, J. Sorger, S. Speidel, D. Stoyanov, Optical techniques for 3d surface reconstruction in computer-assisted laparoscopic surgery, Med. Image Anal. 17 (8) (2013) 974–996.
[26] R. Garg, V.K. BG, G. Carneiro, I. Reid, Unsupervised cnn for single view depth estimation: geometry to the rescue, in: European Conference on Computer Vision, 2016, pp. 740–756.
[27] M. Ye, E. Johns, A. Handa, L. Zhang, P. Pratt, G.Z. Yang, Self-supervised siamese learning on stereo image pairs for depth estimation in robotic surgery, arXiv:1705.08260 (2017).
[28] W. Schroeder, L. Ng, J. Cates, The ITK Software Guide, The Insight Consortium, 2003.
[29] W.J. Schroeder, B. Lorensen, K. Martin, The Visualization Toolkit: an Object-Oriented Approach to 3D Graphics, Kitware, 2004.
[30] G. Bradski, A. Kaehler, Learning OpenCV: Computer Vision with the OpenCV Library, O'Reilly Media, Inc., 2008.
[31] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, et al., Tensorflow: a system for large-scale machine learning, Sympos. Oper. Syst. Des. Implement. 16 (2016) 265–283.
[32] A.M. Franz, A. Seitel, M. Servatius, C. Zollner, I. Gergel, I. Wegner, J. Neuhaus, S. Zelzer, M. Nolden, J. Gaa, et al., Simplified development of image-guided therapy software with mitk-igt, in: Medical Imaging 2012: Image-Guided Procedures, Robotic Interventions, and Modeling, 8316, 2012, p. 83162J.
[33] Z. Zhang, A flexible new technique for camera calibration, IEEE Trans. Pattern Anal. Mach. Intell. 22 (11) (2000) 1330–1334.
[34] J. Shao, H. Luo, D. Xiao, Q. Hu, F. Jia, Progressive hand-eye calibration for laparoscopic surgery navigation, Comput. Assist. Robot. Endosc. Clin. Image-Based Proced. (2017) 42–49.
[35] S. Thompson, D. Stoyanov, C. Schneider, K. Gurusamy, S. Ourselin, B. Davidson, D. Hawkes, M.J. Clarkson, Hand-eye calibration for rigid laparoscopes using an invariant point, Int. J. Comput. Assist. Radiol. Surg. (2016) 1–10.
[36] F. Milletari, N. Navab, S.A. Ahmadi, V-net: fully convolutional neural networks for volumetric medical image segmentation, in: 2016 Fourth IEEE International Conference on 3D Vision, 2016, pp. 565–571.
[37] C. Godard, O.M. Aodha, G.J. Brostow, Unsupervised monocular depth estimation with left-right consistency, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2, 2017, pp. 270–279.
[38] M. Jaderberg, K. Simonyan, A. Zisserman, Spatial transformer networks, Adv. Neural Inf. Process. Syst. (2015) 2017–2025.
[39] K. Tateno, F. Tombari, I. Laina, N. Navab, Cnn-slam: real-time dense monocular slam with learned depth prediction, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2, 2017.
[40] J. Yang, H. Li, D. Campbell, Y. Jia, Go-icp: a globally optimal solution to 3d icp point-set registration, IEEE Trans. Pattern Anal. Mach. Intell. 38 (11) (2016) 2241–2254.
[41] P. Pratt, D. Stoyanov, M. Visentini-Scarzanella, G.Z. Yang, Dynamic guidance for robotic surgery using image-constrained biomechanical models, in: International Conference on Medical Image Computing and Computer-Assisted Intervention, 2010, pp. 77–85.
[42] C. Wang, F.A. Cheikh, M. Kaaniche, O.J. Elle, Liver surface reconstruction for image guided surgery, Medical Imaging 2018: Image-Guided Procedures, Robotic Interventions, and Modeling, 10576, 2018.
[43] A. Hosni, C. Rhemann, M. Bleyer, C. Rother, M. Gelautz, Fast cost-volume filtering for visual correspondence and beyond, IEEE Trans. Pattern Anal. Mach. Intell. 35 (2) (2013) 504–511.
[44] P.L. Chang, D. Stoyanov, A.J. Davison, P. Edwards, Real-time dense stereo reconstruction using convex optimisation with a cost-volume for image-guided robotic surgery, in: International Conference on Medical Image Computing and Computer-Assisted Intervention, 8149, 2013, pp. 42–49.
[45] S. Thompson, C. Schneider, M. Bosi, K. Gurusamy, S. Ourselin, B. Davidson, D. Hawkes, M.J. Clarkson, In vivo estimation of target registration errors during augmented reality laparoscopic surgery, Int. J. Comput. Assist. Radiol. Surg. 13 (6) (2018) 865–874.
[46] T. Ishizawa, N. Fukushima, J. Shibahara, K. Masuda, S. Tamura, T. Aoki, K. Hasegawa, Y. Beck, M. Fukayama, N. Kokudo, Real-time identification of liver cancers by using indocyanine green fluorescent imaging, Cancer 115 (11) (2009) 2491–2504.
[47] M.S. Alfano, S. Molfino, S. Benedicenti, B. Molteni, P. Porsio, E. Arici, F. Gheza, M. Botticini, N. Portolani, G.L. Baiocchi, Intraoperative icg-based imaging of liver neoplasms: a simple yet powerful tool. preliminary results, Surg. Endosc. 33 (1) (2019) 126–134.
[48] J.M. Fitzpatrick, J.B. West, The distribution of target registration error in rigid-body point-based registration, IEEE Trans. Med. Imaging 20 (9) (2001) 917–927.
[49] E. Wild, D. Teber, D. Schmid, T. Simpfendorfer, M. Müller, A.C. Baranski, H. Kenngott, K. Kopka, L. Maier-Hein, Robust augmented reality guidance with fluorescent markers in laparoscopic surgery, Int. J. Comput. Assist. Radiol. Surg. 11 (6) (2016) 899–907.
[50] S. Ieiri, M. Uemura, K. Konishi, R. Souzaki, Y. Nagao, N. Tsutsumi, T. Akahoshi, K. Ohuchida, T. Ohdaira, M. Tomikawa, K. Tanoue, M. Hashizume, T. Taguchi, Augmented reality navigation system for laparoscopic splenectomy in children based on preoperative ct image using optical tracking device, Pediatr. Surg. Int. 28 (4) (2012) 341–346.
[51] N. Mahmoud, Ó.G. Grasa, S.A. Nicolau, C. Doignon, L. Soler, J. Marescaux, J.M.M. Montiel, On-patient see-through augmented reality based on visual slam, Int. J. Comput. Assist. Radiol. Surg. 12 (1) (2017) 1–11.
[52] D.M. Cash, M.I. Miga, T.K. Sinha, R.L. Galloway, W.C. Chapman, Compensating for intraoperative soft-tissue deformations using incomplete surface data and finite elements, IEEE Trans. Med. Imaging 24 (11) (2005) 1479–1491.
[53] Y. Hayashi, K. Misawa, D.J. Hawkes, K. Mori, Progressive internal landmark registration for surgical navigation in laparoscopic gastrectomy for gastric cancer, Int. J. Comput. Assist. Radiol. Surg. 11 (5) (2016) 837–845.
[54] J.S. Heiselman, L.W. Clements, J.A. Collins, J.A. Weis, A.L. Simpson, S.K. Geevarghese, T.P. Kingham, W.R. Jarnagin, M.I. Miga, Characterization and correction of intraoperative soft tissue deformation in image-guided laparoscopic liver surgery, J. Med. Imaging 5 (2) (2017) 021203.
[55] J. Ramalhinho, M. Robu, S. Thompson, P. Edwards, C. Schneider, K. Gurusamy, D. Hawkes, B. Davidson, D. Barratt, M.J. Clarkson, Breathing motion compensated registration of laparoscopic liver ultrasound to ct, Medical Imaging 2017: Image-Guided Procedures, Robotic Interventions, and Modeling, 10135, 2017.
[56] X. Zhang, J. Wang, T. Wang, X. Ji, Y. Shen, Z. Sun, X. Zhang, A markerless automatic deformable registration framework for augmented reality navigation of laparoscopy partial nephrectomy, Int. J. Comput. Assist. Radiol. Surg. 14 (8) (2019) 1285–1294.
