Multi-view Convolutional Neural Network for lung nodule false positive reduction


Journal Pre-proof

Salsabil Amin El-Regaily, Mohammed Abdel Megeed Salem, Mohamed Hassan Abdel Aziz, Mohamed Ismail Roushdy

PII: S0957-4174(19)30734-1
DOI: https://doi.org/10.1016/j.eswa.2019.113017
Reference: ESWA 113017

To appear in: Expert Systems With Applications

Received date: 19 April 2019
Revised date: 24 September 2019
Accepted date: 10 October 2019

Please cite this article as: Salsabil Amin El-Regaily , Mohammed Abdel Megeed Salem , Mohamed Hassan Abdel Aziz , Mohamed Ismail Roushdy , Multi-view Convolutional Neural Network for Lung Nodule False Positive Reduction, Expert Systems With Applications (2019), doi: https://doi.org/10.1016/j.eswa.2019.113017

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain. © 2019 Published by Elsevier Ltd.

Highlights

• The Computer Aided Detection System detects nodules attached to vessels / lung wall
• Specially designed Multi-view Convolutional Neural Network perfectly suits the inputs
• Exploits 3D features of the nodules in multiple 2D views without 3D complexity
• System is automated, produces consistent outputs with high sensitivity and accuracy


Multi-view Convolutional Neural Network for Lung Nodule False Positive Reduction

Salsabil Amin El-Regaily (a), Mohammed Abdel Megeed Salem (b), Mohamed Hassan Abdel Aziz (c), Mohamed Ismail Roushdy (d)

(a) Corresponding author, Basic Science Department at the Faculty of Computer and Information Sciences, Ain Shams University, Cairo, Egypt. Emails: [email protected], [email protected] Tel: +2 01002474505

(b) Scientific Computing Department at the Faculty of Computer and Information Sciences, Ain Shams University, Cairo, Egypt. German University in Cairo (GUC), Faculty of Media Engineering & Technology, Cairo, Egypt. Emails: [email protected], [email protected]

(c) Basic Science Department at the Faculty of Computer and Information Sciences, Ain Shams University, Cairo, Egypt. Email: [email protected]

(d) Faculty of Computers and Information Technology, Future University in Egypt. On leave, Computer Science Department at the Faculty of Computer and Information Sciences, Ain Shams University, Cairo, Egypt. Email: [email protected]

Abstract

Background and objective: Computer-Aided Detection (CAD) systems save radiologists time and provide a second opinion in detecting lung cancer by performing automated analysis of the scans. False positive reduction is one of the most crucial components of these systems, which play a major role in early diagnosis and treatment. The objective of this paper is to handle this problem efficiently by detecting nodules and separating them from a large number of false positive candidates.

Methods: The proposed algorithm segments lungs and nodules through a combination of 2D and 3D region growing, thresholding and morphological operations. Vessels and most of the internal lung structure have a tubular shape that differs from the compact rounded shape of nodules; they are therefore eliminated by building and thresholding a 3D depth map to produce the initial candidates. To reduce the number of false positives, a rule-based classifier is used to eliminate the obvious non-nodules, followed by a multi-view convolutional neural network. The convolutional network is built specifically to handle the provided inputs and is customized to provide the best possible outputs without the extra computational complexity that a 3D network requires. 650 cases from the LIDC dataset are used to train and test the network. For each candidate, the axial, coronal and sagittal views are extracted and fed to the three network streams.

Results: The proposed algorithm achieved a high detection sensitivity of 85.256%, a specificity of 90.658% and an accuracy of 89.895%. Experimental results indicate that the proposed algorithm outperforms most of the other algorithms in terms of accuracy and sensitivity. The proposed solution achieves a good tradeoff between efficiency and effectiveness and saves much computation time.

Conclusion: The work shows that the proposed multi-view 2D network is a simple yet effective algorithm for the false positive reduction problem. It can detect nodules that are isolated, linked to a vessel or attached to the lung wall. The network can be improved to detect ground glass nodules in the future.

Keywords: Lung nodules; Computer Aided Detection; Convolutional Neural Networks; Computed Tomography

1. Introduction

Lung cancer is one of the main causes of death in many countries around the world (Jemal et al., 2009). It is caused by the uncontrolled, irregular growth of cells in lung tissue. Detecting abnormalities in the lung tissue at an early phase improves treatment efficiency and gives the patient a better chance of survival (El-Baz & Suri, 2011). Lung nodules are spherical abnormal tissue with a diameter of up to approximately 30mm (Austin, Müller, & Friedman, 1996). Computed tomography (CT) is one of the most accurate methods for detecting lung nodules. The differences in density between normal and diseased tissue are more apparent on CT scans. CT also makes it possible to visualize small or low-opacity nodules that are hardly seen with other conventional medical imaging techniques (Way, Chan, Hadjiiski, & Sahiner, 2010). However, relying on CT scans alone to make an accurate diagnosis can be very challenging (Shah et al., 2005).
A single CT scan for a patient may contain 150-500 slices that must be checked by a radiologist, which is a time-consuming process. As a result, many Computer-Aided Detection (CAD) methods offer a second opinion to radiologists and help them make a more accurate diagnosis. CAD systems can locate subtle details that could be very important and may be missed by human experts (Sluimer, Schilham, Prokop, & Van Ginneken, 2006). A typical CAD system consists of five main modules: image acquisition, preprocessing, lung segmentation, nodule detection and false positive (FP) reduction. Using an efficient classifier at the FP reduction step is crucial to CAD performance, because it reduces the large number of FPs, which include lung vessels and nodule-like internal lung structures.

Recently, Convolutional Neural Networks (CNNs) have been shown to outperform many classification algorithms in the medical field, owing to the rapid growth in annotated data through publicly available datasets and to the use of Graphics Processing Units (GPUs), which provide vast acceleration compared to a regular CPU. CNNs can extract highly discriminative features at multiple levels of abstraction from the training data, without the time-consuming handcrafting of features. They are inspired by biological processes in which the connections between neurons resemble the organization of the cat visual cortex. Each individual neuron responds to stimuli in its receptive field, which is a specific region of the visual field. Neighboring fields overlap partially to cover the whole visual field.

This paper proposes a CAD system with a multi-view CNN that is built from scratch specifically to classify lung nodules from CT scans. Parameters of the network are optimized through experiments on a large LIDC dataset. The CAD can detect nodules that are attached to a vessel or to the lung wall. It exploits 3D features of the nodules without the complications of 3D processing.
The paper is organized as follows: The second section reviews previous related work on CAD systems that use CNNs as classifiers. The third section describes the details and features of the data used in this work. The methodology is explained in detail in the fourth section, including the proposed CAD system and the proposed multi-view CNN along with the parameter selection process. Finally, results and discussion are presented, followed by the final conclusion.

2. Related Work

CNNs have proven to be powerful tools for a broad range of computer vision tasks. This section provides an overview of some of the state-of-the-art CNNs used for detection and classification of lung nodules in CAD systems. There exist a number of review papers on the usage of CNNs in medical imaging such as (Greenspan, H., Van Ginneken, B., & Summers, 2016) and (Litjens et al., 2017). An earlier review from our group was presented in (El-Regaily, Salem, Aziz, & Roushdy, 2017b).

(Lu et al., 2017) applied CNN to two CAD systems for thorax-abdominal lymph node detection and interstitial lung disease classification. A modified U-Net trained on LUNA 16 labeled data was created by (Chon & Lu, 2017), then the output was fed into a 3D vanilla CNN and a GoogleNet-based 3D CNN. (Shen et al., 2017) presented a multi-crop convolutional neural network that was able to automatically extract salient nodule information by employing a novel multi-crop pooling strategy that crops different regions from convolutional feature maps and applies max-pooling a varying number of times. (W. Li, Cao, Zhao, & Wang, 2016) designed a CNN that has the advantage of auto-learning and generalization. The system can recognize three types of nodules: solid, semisolid and ground-glass opacity (GGO). A multi-view CNN was built by (Arindra et al., 2016) that was fed a set of 2D patches from different orientations. A similar approach of multi-view input streams was adopted by (Ciompi et al., 2017), but the input patches were also built in a multi-scale manner to provide more training data. The CNN could work on multiple triplets of 2D views of a nodule at many scales. (Albelwi & Mahmood, 2017) introduced a new optimization objective function that gathered information and error rates by utilizing deconvolutional networks.

A 3D CNN was designed by (Hamidian, Sahiner, Petrick, & Pezeshk, 2017) to detect lung nodules automatically, by extracting 3D volumes of interest from the LIDC dataset. The 3D CNN could produce the score map for the whole volume in a single pass, instead of processing every 2D slice of the volume. (Huang, X., Shan, J., & Vaidya, 2017) proposed a CAD system with a 3D CNN where the nodule candidates were generated using a local geometric model-based filter. Data augmentation was used to increase the number of training examples. (Ahn, 2017) extended SqueezeNet to 3D to improve the speed and accuracy of the diagnosis process. The squeezed CNN had 1x1 filters instead of 3x3, which leads to a much smaller number of parameters.
A 3D CNN with multi-scale prediction was used to detect lung nodules by (Y. Gu et al., 2018). The algorithm used 10-fold cross-validation on data from the LUNA16 database. Many promising algorithms were developed by the Kaggle competition participants, as in (Verleysen et al., 2017) and (Hammack & Wit, 2017). (Rodrigues et al., 2018) presented a structural co-occurrence matrix-based approach to extract nodule features and classify nodules into malignant or benign and also into their malignancy levels. The classification stage used a multilayer perceptron, a support vector machine and k-nearest neighbors. In the work done by (X. Liu, Hou, Qin, & Hao, 2018), a multi-view multi-scale CNN was constructed. Images were sampled at different scales and high frequency contents were used to sort the different views depending on their importance for each scale. (K. Liu & Kang, 2017) explored the classification of lung nodules using 3D multi-view CNNs with both chain architecture and directed acyclic graph architecture, including 3D Inception and 3D Inception-ResNet. All networks employed the multi-view-one-network strategy. They conducted a binary classification (benign and malignant) and a ternary classification (benign, primary malignant and metastatic malignant) on CT.

There has also been considerable work on detecting lung diseases in chest radiographs using Neural Networks (NN) and CNNs. (Ke et al., 2019) proposed a neuro-heuristic method to detect degenerated lung tissues in x-ray images. The images were evaluated using NN and the possibilities of respiratory diseases were detected. The heuristic method identified the degenerated tissues in the x-ray image based on a fitness function that depended on the measures of color, similarity, and convergence. (Połap, Woźniak, Damaševičius, & Wei, 2019) developed an application that extended search space only with necessary elements using bio-inspired evaluation of pixels.
An aggregated image of lungs segmented from the body was composed, which became the search space for the algorithm. Decision models simulated human perception over the tissues during the medical examination. A method was proposed by (C. Li, Zhu, Wu, & Wang, 2018) to reduce false positives of lung nodules in chest radiographs using an ensemble of CNNs. After sharpening the images and creating patches, the inputs were fed to three different CNNs with different depths. The outputs were fused at the end using an AND operator. (Woźniak et al., 2018) proposed a method to detect small nodules by calculating the variance image of the chest X-ray and finding the locations of the local maxima. A probabilistic NN was used as a classifier to reduce the number of false positives.

Many of the previously mentioned algorithms used complicated and very deep CNNs or pre-trained CNNs that are not built specifically for the lung nodule classification problem. Also, 3D CNNs may achieve good results, but they need extra computation and much longer running times, which can reach many hours on advanced hardware before producing a final result. Our contribution can be summarized as follows:

- We built a simple yet efficient CNN architecture from scratch that is trained to suit the simple two-class problem of nodule detection. Using a 2D 3-view CNN greatly simplifies the training process and shortens the training time without sacrificing the classification accuracy. The CAD including the proposed CNN in the classification step reaches high sensitivity and accuracy that outperforms many of the state-of-the-art algorithms and is comparable to the others. It achieves a good tradeoff between efficiency and effectiveness. The results show a high potential of the proposed method for further research and development.


- The proposed CAD is able to detect nodules that are attached to the chest wall or to blood vessels, which are usually missed in the segmentation stage in other CAD systems. This is accomplished by using the rolling ball algorithm in the lung restoration step, and the thresholded depth map in the tubular structure elimination step.

3. Data

To conduct this research, develop the CAD system and train both classifiers (the rule-based classifier and the multi-view CNN), a large dataset is downloaded from the Lung Image Database Consortium (LIDC-IDRI) data collection. LIDC is provided by the National Cancer Institute (NCI) and is considered one of the largest public databases available (Armato et al., 2011; McNitt-Gray et al., 2007; Reeves et al., 2007). The dataset includes 1,018 cases with slice thickness that varies from 0.6mm to 5.0mm. LIDC is fully annotated by four different radiologists (McNitt-Gray et al., 2007). The observations are done in two consecutive phases: first the blinded phase, then the un-blinded one where each observer could see the results of the other three observers to reach a final opinion. The target of this process was to identify as completely as possible all lung nodules in each CT scan without requiring forced consensus. Annotations for each scan, stored in XML format, show the location of the nodules along with their characteristics such as subtlety, solidity, spiculation, lobulation and sphericity.

A total of 750 cases are downloaded for this study. The grouping of boundaries into unique ROIs to create the ground truth data is performed by the MAX software and the LIDC MATLAB toolbox provided by (Lampert, Stumpf, & Gançarski, 2016). Nodules smaller than 3mm are excluded, as there is not enough information about their degree of malignancy and they are neglected by the provided toolbox. The list of cases with small nodules or non-nodules is provided in (Biancardi, 2011).
This leaves a total of 650 scans including 829 nodules, with various imaging qualities and from different patients. The nodules vary in size between 3mm and 30mm (with volume ranging from 7.73 to 167789.82 mm^3). Cases are divided into approximately 80% training data and 20% testing data.

The LIDC toolbox is used to generate the ground truth data given the XML annotations of the LIDC data as an input. The output of the toolbox includes a separate folder for each slice containing the individual masks provided by the four radiologists. To produce the final nodule masks, the individual masks are combined to make Probability Maps (PMaps), where the pixels are summed and normalized to show the probability that each pixel belongs to the ground truth nodule according to the radiologists' opinions. If the pixel probability is greater than 0.75, the pixel is counted as part of the nodule mask. After creating the 3D PMaps, three different views are extracted from each nodule representing the axial, coronal and sagittal views, and saved in different folders. The other six basic views are also extracted for further experiments, as will be shown later.

To avoid overfitting and enrich the training data with false samples, false positives are added to the training data by running the CAD system without the last false positive reduction stage and saving only the 3D structures that are not actual nodules, taking into consideration the ground truth data provided by the LIDC XML files in each scan folder. The final data contains 829 true positives and 1200 false positives in nine different views, along with the original 3D views available for the 3D feature extraction process.

4. Methods

4.1 The proposed CAD system

This section presents the proposed method for lung segmentation and nodule detection (see a detailed explanation in our previous work (El-Regaily, Salem, Aziz, & Roushdy, 2017a)).
The CAD system consists of five main processing modules as in figure 1: image acquisition, preprocessing, lung segmentation, nodule detection and false positive reduction. Images are acquired from the online LIDC database as shown in the previous section. Preprocessing is implemented using contrast stretching and enhancement, as in figure 2b, to reduce noise and artifacts in the provided scans and to maximize the contrast of the slices, which facilitates nodule detection. The contrast is enhanced by first finding the lower and upper intensity limits; the values of the input images are then mapped to new values in the enhanced images, between the pre-specified limits.

Lung segmentation is performed using a combination of region growing, thresholding and morphological operations. First, the thorax is extracted as in figure 2c using a technique similar to (Sousa, Silva, de Paiva, & Nunes, 2010), but instead of a 2D region growing, a 3D version is adopted with a single seed point at the top corner of the volume, which saves much computation time. Then, the lung parenchyma is extracted as shown in figure 2d, using a 3D region growing with a seed point identified as the first non-zero voxel on the diagonal line of the first slice in the scan.


Finally, to reconstruct the lungs, fill the holes and preserve the nodules attached to the lung wall, a rolling-ball algorithm (Gurcan et al., 2002) is applied using a circular structuring element with a radius of 14 pixels. The optimum radius is found through experiments on the given dataset. Then, a 3D flood-fill algorithm is used to fill the 3D holes, followed by morphological opening with the same structuring element to restore the original lung size. As a final touch to the lung reconstruction, a logical OR is applied with the lung mask from the lung extraction step, to restore the sharp lung edges that were lost through the morphological operations. The output of lung reconstruction is shown in figure 2e.
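The contrast-stretching step of the preprocessing stage described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the use of the 1st and 99th percentiles as the lower and upper intensity limits, and the [0, 1] output range are all assumptions made for illustration, since the text only states that limits are found and values are mapped between pre-specified limits.

```python
def stretch_contrast(slice_2d, low_pct=1.0, high_pct=99.0, out_min=0.0, out_max=1.0):
    """Linearly map intensities between two percentile limits to [out_min, out_max].

    slice_2d is a list of rows (lists of numbers). The percentile limits are
    an assumption; the paper does not state how its limits are chosen.
    """
    flat = sorted(v for row in slice_2d for v in row)
    n = len(flat)
    # Lower and upper intensity limits, taken at the requested percentiles.
    lo = flat[int(round((low_pct / 100.0) * (n - 1)))]
    hi = flat[int(round((high_pct / 100.0) * (n - 1)))]
    if hi == lo:
        # Constant image: nothing to stretch.
        return [[out_min for _ in row] for row in slice_2d]
    span = out_max - out_min
    # Clip each value to [lo, hi], then rescale linearly to the output range.
    return [
        [out_min + (min(max(v, lo), hi) - lo) / (hi - lo) * span for v in row]
        for row in slice_2d
    ]


if __name__ == "__main__":
    toy_slice = [[0, 50], [100, 200]]
    print(stretch_contrast(toy_slice, low_pct=0, high_pct=100))
```

Values below the lower limit or above the upper limit saturate at the ends of the output range, which is what increases the contrast of the remaining intensities.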

Figure 1: Diagram of the proposed CAD system

The nodule detection step locates all suspicious areas with high detection sensitivity. Nodule detection is done using 2D thresholding that identifies the internal structure of the lung slice by slice as explained by (Sousa et al., 2010), followed by 3D region growing that segments each separate structure as in figure 2 f and g. Each 3D structure is then subjected to tubular structure elimination to provide nodule candidates. A depth map is created using the Euclidean distance transform, where a number is assigned to each voxel representing the sum of distances between these voxels and the closest nonzero voxels (El-Regaily et al., 2017a). Then, a threshold is applied to eliminate most of the tubular structures, leaving a list of nodule candidates that includes many false positives, as shown in figure 2h.

In the false positive reduction stage, some of the basic nodule features are extracted from the training data to set thresholds for a simple rule-based classifier that quickly eliminates all the obvious non-nodules and reduces the workload forwarded to the CNN classifier. The rules are applied using a selected set of features that have proven efficient for excluding non-nodule candidates, such as major axis length, minor axis length, area (of the largest slice), volume and spherical disproportion (El-Regaily et al., 2017a). Results of the nodule detection and initial classification stages on the testing data include a sensitivity of 77.77%, a specificity of 69.5% and an accuracy of 70.53%. The final classification is done using multi-view CNNs, as will be explained later in detail.

4.2 The proposed multi-view CNN structure

Since most of the pre-trained networks, especially the 3D networks, are very large and need high CPU capabilities and GPU memory, a 2D multi-view network designed from scratch was the optimum solution.
Multi-view 2D networks can take advantage of the 3D features of the segmented nodule volumes without the extra complications of a 3D network. Creating a new network is a harder task, as one has to determine the network configuration, obtain a large dataset of labeled input images and keep experimenting and training while changing different variables and parameters until the optimum architecture is reached. On the other hand, it gives control over the network and provides the most suitable configuration for the dataset at hand, which could produce impressive results. Also, the classification problem in our case includes only two classes, so there is no need for a huge pre-trained network that is configured to classify tens of classes in different kinds of color images and would require high computational capabilities and long running times. The proposed network, as shown in figure 5, consists of three separate streams. Each stream consists of three convolutional layers, three max-pool layers, a fully connected layer, a softmax layer and a classification layer. There are 16 5x5 kernels in the first convolutional layer, and 32 and 64 3x3 kernels in the last two layers, respectively.
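To make the size of one stream concrete, the following sketch traces feature-map sizes and counts trainable weights for the layer configuration just described. Several details are assumptions, not statements from the paper: a single-channel (grayscale) input, "same" padding with stride 1 in the convolutions, and 2x2 pooling that halves each spatial dimension. The 64x64 input size is the one given in Section 4.3. Batch normalization parameters are left out of the count.

```python
def conv_params(kernel, c_in, c_out):
    """Weights plus biases of a 2D convolution with square kernels."""
    return kernel * kernel * c_in * c_out + c_out


def stream_summary(input_size=64, in_channels=1,
                   convs=((5, 16), (3, 32), (3, 64)), n_classes=2):
    """Return per-block (name, output size, channels, parameters) and the total.

    Assumes 'same' padding and stride 1 for convolutions, so only the
    2x2 max-pooling after each convolution changes the spatial size.
    """
    size, channels, total = input_size, in_channels, 0
    rows = []
    for kernel, n_kernels in convs:
        params = conv_params(kernel, channels, n_kernels)
        total += params
        channels = n_kernels
        size //= 2  # 2x2 max-pooling halves the feature map
        rows.append((f"conv {kernel}x{kernel}, {n_kernels} kernels + pool",
                     size, channels, params))
    fc_inputs = size * size * channels
    fc_params = fc_inputs * n_classes + n_classes
    total += fc_params
    rows.append(("fully connected", 1, n_classes, fc_params))
    return rows, total


if __name__ == "__main__":
    rows, total = stream_summary()
    for name, size, channels, params in rows:
        print(f"{name:35s} out={size}x{size}x{channels:<3d} params={params}")
    print("total trainable parameters per stream:", total)
```

Under these assumptions each stream has a few tens of thousands of weights, which is consistent with the text's point that a small purpose-built 2D network is far lighter than large pre-trained or 3D networks.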

Figure 2: Proposed CAD sequence results (a) Original slice (b) Result of the preprocessing step with higher contrast (c) Thorax extraction results with the external areas removed (d) Lung extraction output after the removal of the thoracic wall with holes and obvious errors (e) Reconstructed lungs using morphological processing (f) Internal structure as appears in a 2D slice after thresholding (g) Result of 3D Region Growing (h) Result of tubular structure elimination with most of the vessels removed. The actual nodule is colored in orange in the bottom right corner of the image

Finding the appropriate parameter combination for a certain dataset is very difficult, as it is not well understood how the parameters function together to affect the accuracy of the results (L. Li, Jamieson, & Desalvo, 2016). There is no mathematical model that helps in calculating the optimum parameters, so the selection must be an iterative process that cycles through experiments and decision making many times. It helps to take advantage of previous work and reference other network architectures to benefit from what researchers have proven successful (“Convolutional Neural Networks in MATLAB,” 2018). To select the proper architecture, we first fixed all the parameters at default values based on previous experience and experimented on only one parameter at a time. Once the optimum value of this parameter is obtained, it replaces its original value and the selection process moves on to the next parameter. Initial default values are 2 convolutional layers, 32 3x3 kernels, 20 training epochs, a learning rate of 10^-3 and only one stream for one input view. In the next few subsections, we show the parameter selection process in detail. The parameters are updated in this order: number of layers, number and size of kernels, number of views and fusion method.

4.2.1 Number of layers

Recent research in (He & Sun, 2016; Srivastava, Greff, & J Schmidhuber, 2015) showed that increasing the depth affects the performance, as proven by the experiments in (He & Jian, 2015). Also, increasing the number of layers makes the network more difficult to optimize and more vulnerable to overfitting (J. Gu et al., 2018). Most of the recent work that builds a new CNN from scratch favors minimizing the number of convolutional and max-pooling layers.
The authors of (Ciompi et al., 2017) proposed a network architecture that consists of four convolutional layers, those of (Arindra et al., 2016; Hamidian et al., 2017; Huang, X., Shan, J., & Vaidya, 2017; Shen, Zhou, B, Dong, & Yang, 2016; Sun et al., 2016) used three convolutional layers, while (W. Li et al., 2016) used only two layers. Adding more layers may be useful in some cases, where there are many classes to be classified, different kinds of inputs to be fed to the network or larger inputs. But in a simple two-class classification problem like lung nodule detection, a simple architecture is preferable, in order not to lose the details of the inputs through a large number of layers and to minimize the time and computational resources required to solve the problem.

Extensive experiments are done to test the effect of changing the number of convolutional layers on the accuracy, sensitivity, and specificity using training data. 500 cases are used for determining the optimum number of layers. The parameters are set to their initial default values as derived from previous work. The network is run 20 different times for each number of layers, as the weights are initialized randomly each time, and then the number that achieves the highest average sensitivity and accuracy is chosen. The final decision was to use three convolutional layers in the proposed network according to the results shown in figure 5 in section 5. Detailed results are shown in the ablation study in table 1.

Table 1: An ablation study of different numbers of convolutional layers, evaluated on the 500 training cases from the LIDC dataset. The chosen number is marked.

Number of conv layers    Highest average sensitivity (%)    Highest average accuracy (%)
2                        95.03                              88.22
3 (chosen)               97.50                              88.67
4                        93.75                              82.89
5                        95.27                              81.56
6                        94.67                              76.89
7                        95.15                              81.56
8                        96.20                              78.44
9                        97.21                              78.89
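The selection procedure described in Section 4.2 (fix all parameters at defaults, vary one at a time, repeat each setting several times because of random initialization, keep the best average, then move on) can be sketched as follows. This is only an illustration: `train_and_score` is a hypothetical callback standing in for a full training run, and reducing sensitivity and accuracy to one scalar score is an assumption, since the paper does not say how the two are combined.

```python
from statistics import mean


def select_one_at_a_time(defaults, search_space, train_and_score, runs=20):
    """Greedy one-parameter-at-a-time search with repeated runs per setting.

    defaults:        dict of initial parameter values
    search_space:    dict mapping parameter name -> candidate values,
                     iterated in insertion order (the tuning order)
    train_and_score: hypothetical callable (config, seed) -> float score
    runs:            repetitions per setting (20 in the paper)
    """
    config = dict(defaults)
    for name, candidates in search_space.items():
        best_value, best_score = config[name], float("-inf")
        for value in candidates:
            trial = {**config, name: value}
            # Average over several runs to smooth out random initialization.
            avg = mean([train_and_score(trial, seed) for seed in range(runs)])
            if avg > best_score:
                best_value, best_score = value, avg
        # The best value replaces the default before tuning the next parameter.
        config[name] = best_value
    return config
```

A call would pass the default configuration, the candidate values for each parameter in tuning order, and a function that trains the network once and returns its validation score. Because each parameter is tuned with the others held fixed, the search is cheap but can miss combinations that only work well together.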

4.2.2 Number and size of kernels

Recent work such as (Arindra et al., 2016; Ciompi et al., 2017; Hamidian et al., 2017; Huang, X., Shan, J., & Vaidya, 2017; W. Li et al., 2016; Shen et al., 2016; Sun et al., 2016) uses small kernel sizes, from 3x3 up to 7x7. Using smaller kernels forces the network to focus on the local features of the nodules rather than the global features of the background. Combinations of small kernel sizes achieved the highest accuracy and sensitivity in extensive experiments on the training data, as shown in table 2 in section 5. We chose to use 16 5x5 kernels for the first convolutional layer, and 32 and 64 3x3 kernels for the last two layers, respectively. The number of kernels should be increased as we go deeper into the network, to capture more features and details (LeCun, Bengio, & Hinton, 2015).

4.2.3 Number of views

One of the most critical parameters to be optimized is the number of input views, which determines how many network streams contribute to the final output. The purpose is to take advantage of the 3D properties obtained through 3D segmentation of the nodule candidates, while reducing the computational complexity of the algorithm by taking multiple views of each input candidate instead of the whole volume. There are 9 possible views of a cubic volume (Arindra et al., 2016). The three basic views are the axial, coronal and sagittal views, and the others represent the different planes between the sides of a 3D box. See an example of the different views in figure 3.

4.2.4 Output fusion

The outputs of the three network streams are fused using a logical OR operation, just after the fully connected and softmax layers. Letting each stream reach its own decision and fusing the outputs with an OR operation afterward increases the chance of including more nodules in the fused output, especially when more views are added.
However, this method also increases the number of false positives and decreases the specificity and accuracy of the outputs. This is a risk worth taking: extra false positives can be handled later by human experts, whereas missing an actual nodule is the more serious error.
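The OR fusion described above can be sketched in a few lines. The function names are illustrative, and the 0.5 cut-off on the nodule-class softmax probability is an assumption; for a two-class softmax it corresponds to picking the more probable class.

```python
def stream_decision(p_nodule, threshold=0.5):
    """Decision of one stream from its softmax probability for the nodule class."""
    return p_nodule >= threshold


def fuse_or(stream_probs, threshold=0.5):
    """Logical OR fusion: a candidate is a nodule if any stream says so."""
    return any(stream_decision(p, threshold) for p in stream_probs)


if __name__ == "__main__":
    # Hypothetical softmax outputs for the axial, coronal and sagittal streams.
    print(fuse_or([0.10, 0.20, 0.90]))  # one stream fires, so the fused label is nodule
    print(fuse_or([0.10, 0.20, 0.30]))  # no stream fires, so the fused label is non-nodule
```

Adding a stream can only turn more fused decisions positive, never fewer, which is why sensitivity rises and specificity falls as views are added.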


After choosing the parameters of a single CNN stream for one input view (number of layers, and number and size of kernels) and choosing the fusion method, the number of views is analyzed through extensive experiments on the selected LIDC training dataset. For each number of views (from one to nine), the network is trained twenty different times, and the network with the highest sensitivity on the validation data is chosen, as weights are initialized randomly and give different results for each run. As the number of views increases, sensitivity increases, but at the same time specificity and accuracy decrease, due to the increasing number of false positives included during the fusion process. Views are tested in the order shown in figure 3, with one view added at a time. For example, if the number of views is one, the axial view is used alone; if it is four, the axial, coronal and sagittal views are used along with view 4. We obtained a three-view network with a high sensitivity of 97.50%, a reasonable specificity of 83.79% and a total accuracy of 88.67%. This sensitivity could not be matched except with a nine-view network, which had very low specificity and accuracy and a high running time. The views used are the basic axial, coronal and sagittal views, which contain the most information about a 3D nodule. See table 3 in section 5 for the results.
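For reference, the three figures of merit quoted throughout this section follow the standard confusion-matrix definitions, which the short sketch below computes. The counts in the example are made up for illustration and are not results from the paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and accuracy from confusion-matrix counts.

    tp/fn: nodules classified as nodule / non-nodule
    tn/fp: non-nodules classified as non-nodule / nodule
    """
    sensitivity = tp / (tp + fn)          # fraction of true nodules detected
    specificity = tn / (tn + fp)          # fraction of non-nodules rejected
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the calculation.
    sens, spec, acc = classification_metrics(tp=90, fp=20, tn=80, fn=10)
    print(f"sensitivity={sens:.2%} specificity={spec:.2%} accuracy={acc:.2%}")
```

Since OR fusion can only add positives, it raises `tp` and `fp` together, which explains the trade-off between sensitivity and specificity reported above.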

Figure 3: The different possible views for a 3D nodule. Upper image represents the 3D view of the nodule.

4.3 Network training and provided tools

The final proposed network is shown in figure 4. It consists of three separate 2D network streams. The inputs are of size 64x64 pixels, with the nodule centered in the patch. Each stream consists of three convolutional layers and three max-pooling layers, followed by one fully connected layer, a softmax layer and a classification layer at the end. Each convolutional layer is followed by batch normalization and ReLU layers. Batch normalization layers normalize the activations and gradients propagating through the network by subtracting the mini-batch mean and dividing by the mini-batch standard deviation, which speeds up network training and reduces sensitivity to the initial parameters. ReLU is the most common activation function after convolutional layers; it sets any value less than zero to zero. All max-pooling layers use filters of size 2x2, which reduce the size of their input by half. The fully connected layer has 2 output units for the 2 desired classes (nodule / non-nodule).

The network training and testing are done using the multi-view dataset that was extracted from the LIDC database as mentioned earlier. The training data (500 cases + 1200 false positives) is divided into 70% for training and 30% for validation. A validation dataset is a sample of data held back from training that is used to estimate model skill while tuning the model's hyper-parameters; the model is evaluated on this data but never learns from it. It differs from the test dataset, which is also held back from training but is used to give an unbiased estimate of the skill of the final tuned model when comparing or selecting between final models.
For this work, the holdout validation method is used. It is the simplest kind of validation: the data is divided into two sets, training and validation, to check how well the model performs on unseen data. The holdout method is usually used with larger datasets instead of k-fold cross-validation, which requires running the training algorithm k independent times for each number of views and therefore a considerable amount of time and computational resources (Jeff Shneider, n.d.).
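The 70% / 30% holdout split can be sketched as follows. This is a generic illustration under stated assumptions: the candidate identifiers, the fixed seed, and the function name `holdout_split` are hypothetical, and splitting is done per candidate so that all views of one candidate stay in the same subset.

```python
import random
from typing import List, Sequence, Tuple


def holdout_split(
    samples: Sequence, train_fraction: float = 0.7, seed: int = 0
) -> Tuple[List, List]:
    """Shuffle the sample indices once and split into (training, validation)."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    cut = int(round(train_fraction * len(indices)))
    train = [samples[i] for i in indices[:cut]]
    val = [samples[i] for i in indices[cut:]]
    return train, val


# Hypothetical identifiers; each one stands for all views of one candidate.
candidates = [f"cand_{i:04d}" for i in range(1000)]
train, val = holdout_split(candidates)
print(len(train), len(val))  # 700 300
```

Unlike k-fold cross-validation, this split is computed once, so each configuration is trained a single time per run rather than k times.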

The training algorithm used to update the weights is stochastic gradient descent with a mini-batch size of 128. The choice of mini-batch size depends on the speed and memory of the GPU used, and it is preferably a power of 2 so that the GPU can distribute the workload efficiently. Larger batches give more accurate gradient estimates because of the reduced variance. For a relatively shallow and simple model like the proposed one, a batch size of 128 is reasonable.
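The mini-batch update described above can be illustrated with a small sketch. It assumes plain stochastic gradient descent without momentum (the source does not specify the variant), and the helper names `minibatches` and `sgd_step` are hypothetical.

```python
from typing import Iterator, List, Sequence


def minibatches(n_samples: int, batch_size: int = 128) -> Iterator[range]:
    """Yield index ranges that cover already-shuffled data in order."""
    for start in range(0, n_samples, batch_size):
        yield range(start, min(start + batch_size, n_samples))


def sgd_step(
    weights: List[float], grads: Sequence[Sequence[float]], lr: float = 1e-4
) -> List[float]:
    """One SGD update using the gradient averaged over a mini-batch."""
    n = len(grads)
    mean_grad = [sum(g[j] for g in grads) / n for j in range(len(weights))]
    return [w - lr * g for w, g in zip(weights, mean_grad)]


# With a hypothetical 1000 training samples, one epoch has 8 mini-batches.
sizes = [len(batch) for batch in minibatches(1000)]
print(len(sizes), sizes[-1])  # 8 104
```

Averaging the per-sample gradients over 128 samples is what reduces the variance of each update relative to single-sample updates.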

Figure 4: The proposed multi-view CNN. The upper stream is for the axial-view inputs, the middle stream is for the coronal-view inputs and the lower stream is for the sagittal-view inputs. Outputs are fused together at the end to produce the final decision.

The maximum number of epochs is set to 30 and the initial learning rate is 10^-4. Data shuffling is done only once, at the beginning, so that all network streams are fed different views of the same input. The multi-view network achieved a sensitivity of 97.5%, a specificity of 83.8% and a total accuracy of 88.67% on the validation data. To confirm the network performance, further testing was done using the 150 LIDC test cases that were reserved for the CAD testing and not used in training. Similar results were obtained, with sensitivity 96%, specificity 87.3% and accuracy 91%.

The network is designed and trained using the MatConvNet MATLAB toolbox. The toolbox provides the basic building blocks of a 2D CNN, such as convolutional, pooling and classification layers (Vedaldi & May, 2015), in a simple and flexible manner, so that they can be combined and extended to create CNN architectures. It supports computations on the GPU and requires a compatible C++ compiler.

4.4 Integration of the multi-view CNN with the proposed CAD system

After training and testing, the proposed multi-view CNN is saved to be integrated with the CAD system as the final classifier. The false-positive reduction stage, as shown earlier, includes a rule-based classifier that quickly eliminates the obvious non-nodule candidates. The 3D output candidates of the rule-based classifier are fed to the trained multi-view CNN to be classified. For each candidate, the axial, coronal and sagittal views are extracted to produce the desired inputs, each of size 64x64 pixels. The outputs of all network streams are fused using a logical OR operation to produce the final classification labels for the CAD system.

5. Results and comparisons

Experiments are done on the 500 training cases to determine the optimum set of parameters for the CNN. Figure 5, Table 2 and Table 3 show the network training results as explained in section 4.2 for choosing the number of convolutional layers, the size of filters and the number of views, respectively.

Figure 5: The effect of changing the number of convolutional layers on the average accuracy, sensitivity, and specificity. [Chart: x-axis "No. of Conv Layers" (3, 4, 5); y-axis ticks from 75 to 100; series: Accuracy, Sensitivity, Specificity.]

Table 2: The effect of changing the size of filters in each convolutional layer on the accuracy, sensitivity, and specificity. Each row represents one combination of the filter sizes in all layers.

1st layer filter size   2nd layer filter size   3rd layer filter size   Accuracy (%)   Sensitivity (%)   Specificity (%)
9x9                     7x7                     7x7                     85.1           84.8              85.3
9x9                     7x7                     5x5                     86.8           88.8              85.7
9x9                     5x5                     5x5                     85.3           93.3              81.5
7x7                     5x5                     5x5                     87.1           89.3              87.7
7x7                     5x5                     3x3                     87.5           89.5              86.3
7x7                     3x3                     3x3                     87.3           90.5              84.5
5x5                     5x5                     3x3                     85.1           94.5              86.2
5x5                     3x3                     3x3                     88.6           97.5              83.8
3x3                     3x3                     3x3                     84.4           88.9              81.1

Table 3: Sensitivity, specificity, accuracy and running time for different numbers of views in the proposed multi-view CNN.

No. of views   Highest sensitivity   Specificity   Accuracy   Running time (sec)
1              91.53%                94.25%        92.44%     101.24
2              95.03%                83.64%        88.22%     173.19
3              97.50%                83.79%        88.67%     274.42
4              93.75%                75.91%        82.89%     368.33
5              95.27%                73.31%        81.56%     501.14
6              94.67%                66.19%        76.89%     577.84
7              95.15%                76.14%        81.56%     682.82
8              96.20%                68.84%        78.44%     814.26
9              97.21%                66.79%        78.89%     874.64
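The selection rule applied to Table 3 (keep the configuration with the highest validation sensitivity) can be reproduced directly from the tabulated values. The sketch below only re-reads the numbers of Table 3; the variable names are illustrative.

```python
# (views, sensitivity %, specificity %, accuracy %, running time in s), from Table 3.
TABLE3 = [
    (1, 91.53, 94.25, 92.44, 101.24),
    (2, 95.03, 83.64, 88.22, 173.19),
    (3, 97.50, 83.79, 88.67, 274.42),
    (4, 93.75, 75.91, 82.89, 368.33),
    (5, 95.27, 73.31, 81.56, 501.14),
    (6, 94.67, 66.19, 76.89, 577.84),
    (7, 95.15, 76.14, 81.56, 682.82),
    (8, 96.20, 68.84, 78.44, 814.26),
    (9, 97.21, 66.79, 78.89, 874.64),
]

# Pick the row with the highest sensitivity.
best = max(TABLE3, key=lambda row: row[1])
nine = TABLE3[-1]

print(best[0])            # 3
print(nine[4] / best[4])  # running-time ratio of nine views to three views
```

The nine-view network comes closest in sensitivity but takes roughly three times as long to run, with markedly lower specificity and accuracy.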

A total of 150 LIDC cases are used to test the proposed CAD system. For each case, the numbers of true positives, true negatives, false positives, and false negatives are calculated from the output labels and compared with the ground truth labels to estimate the performance of the complete CAD system. The complete CAD system with the proposed CNN improved considerably on the CAD system with the rule-based classifier only: sensitivity increased from 77.77% to 85.265%, specificity from 69.5% to 90.658%, and total accuracy from 70.53% to 89.895%. See the confusion table in table 4, where N is the number of testing samples. The Area Under the ROC Curve (AUC) measured for the proposed CNN is 0.9485, where ROC stands for Receiver Operating Characteristic and plots sensitivity against 1 - specificity.

Running time varies from one case to another, depending on the number of slices in each scan, the sensitivity of the algorithm to small candidates and segmentation noise, which is tuned during the nodule detection step, and the noise removal threshold that is set during the structure extraction phase. Time ranges from 194 seconds to 1600 seconds, with an average of 759.59 seconds (about 12.66 minutes), on an Intel Core i7 CPU with 8 GB RAM. The tool used is MATLAB R2017b with a CUDA-enabled GeForce GT 650M GPU with compute capability 3.0.

The proposed CAD system can handle cases in which the nodule is attached to the chest wall or connected to the blood vessel tree, as a result of using the rolling ball algorithm and the Euclidean depth map. The work is fully automated, requires no external intervention, and produces consistent results for the same inputs. The system achieves performance comparable to other similar CNN-based approaches, as shown in table 5.
However, the CAD system cannot detect Ground Glass Nodules (GGN), which have an intensity similar to that of the lung parenchyma. The algorithm also cannot detect small nodules of less than 3 mm, as they are not provided by the LIDC toolbox (Lampert et al., 2016) as ground truth data to be used in the CNN training process.
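For readers unfamiliar with the AUC metric reported above, the sketch below computes it as the fraction of (nodule, non-nodule) pairs in which the nodule receives the higher score, which is equivalent to the area under the ROC curve. The scores and labels are hypothetical, and how continuous scores were obtained from the fused network output is not stated in the source.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical nodule probabilities and ground-truth labels (1 = nodule).
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
print(auc(scores, labels))  # 8 of 9 pairs are ranked correctly
```

An AUC of 1.0 means every nodule outranks every non-nodule, while 0.5 corresponds to random ranking.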

Table 4: Confusion table for the CNN results (N = 406)

                     Actual Positive   Actual Negative
Predicted Positive   TP = 141          FP = 23
Predicted Negative   FN = 25           TN = 217
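The standard metrics follow directly from the four counts in Table 4. The sketch below applies the usual definitions to those counts. Pooled over all 406 samples they give about 84.9% sensitivity, 90.4% specificity and 88.2% accuracy, close to but not identical to the system-level figures quoted in the text; one possible explanation (an assumption, not stated in the source) is that the reported values are aggregated per case.

```python
def confusion_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity, specificity and accuracy from confusion-table counts."""
    return {
        "sensitivity": tp / (tp + fn),               # true positive rate
        "specificity": tn / (tn + fp),               # true negative rate
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Counts taken from Table 4.
metrics = confusion_metrics(tp=141, fp=23, fn=25, tn=217)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
# sensitivity: 84.94%
# specificity: 90.42%
# accuracy: 88.18%
```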

Note that some algorithms in table 5 may show better sensitivity or accuracy, such as the multi-view one-network approaches in (Arindra et al., 2016), (K. Liu & Kang, 2017) and (Anirudh, Thiagarajan, Bremer, & Kim, 2016). However, these works did not build a complete CAD system or perform image segmentation, both of which affect the overall performance and the time consumed by a CAD system. They used the nodule centers and ground truth data provided by the LIDC Nodule report (http://www.via.cornell.edu/lidc). Their results should therefore be compared with the test results of the proposed CNN, which outperforms most of them, with sensitivity 96%, specificity 87.3% and accuracy 91% on the unseen testing data.

Table 5: Comparison with other algorithms Ref

3D Network

(Shen et al., 2016)

No

Architecture 3 conv layers

Parameters to be optimized

Dataset info

Performance

1-Number of hidden neurons

LIDC dataset Training: 2272 nodules

Acc.: 70.69 % AUC: 0.63

2- Number of training epochs

(Shen et al., 2017)

Yes

3 conv layers 3 multi-crop pooling layers

1- Number of kernels 2- Position of the multi-crop pooling layer 3- Dataset sample

2 conv layers (W. Li et al., 2016)

(Arindra et al., 2016)

(Ciompi et al., 2017)

(Hamidian et al., 2017)

(K. Liu & Kang, 2017)

(Huang, X., Shan, J., & Vaidya, 2017)

(Anirudh, Thiagarajan, Bremer, & Kim, 2016)

The proposed algorithm

1- Map size 2- Momentum 3- Learning rate 4- Test scheme

No

LIDC dataset 888 scans Inputs are 64x64 multiple views of the nodule.

3 views 9 streams, 3 for each scale 4 conv layers each

1- Number of scales and augmentation angle 2-Learning rate 3- Batch size 1- Kernel size 2- Stride 3- Feature Volumes 1- Dropout rate 2- Learning rate 3- Batch size 4- Momentum

Training: Italian MLD trial (934 cases) Testing: Danish DLCST trial (468 cases) Inputs: triplets of 64x64 2D patches of different scales. LIDC dataset. Training: 509 scans Testing: 25 cases. Inputs: 3D patches LIDC dataset 96 scans with diagnostic information Used augmentation to enrich the data

1- Effect of principal direction 2- Effect of dense evaluation 1- Nodule volume 2- Kernel size

99 scans, with extra augmented data Inputs: 3D 32x32x32 cubes

Sens.: 90% FPs/scan: 5

SPIE-LUNGSx dataset of weakly labeled data Training: 20 cases Testing: 47 cases Inputs: 25x25x7 and 41x41x7 LIDC dataset Training: 500 cases: (720 nodules + 1115 generated FPs)

Sens.: 80% FPs/scan: 10

Yes Multi-view one-network with: * Chain architecture, * DAG architecture (with inception module) 3 conv layers

Yes

Yes

No

Sens: 87.1% FPs/scan: 4.62

1- Number of layers 2- Kernel size 3- Number of views 4- fusion method

3 conv layers

Yes

Acc.: 87.14% Sens: 0.77 Spec: 0.93 AUC: 0.93

9 views 3 conv layers each No

No

Testing: 115 cases Inputs: 64x64x64 fed as consecutive 2D patches LIDC dataset All 1010 scans along with their annotated centers (825 nodules for training, 257 nodules for validation and testing) LIDC dataset 40,772 nodules 21,720 non-nodules Training: 85% of the data Inputs: 64x64

2 streams for 2 scales Each 3 conv layers 3 streams 3 conv layers each

1- Number of layers 2- Number and size of kernels

-Without small nodules: Sens.: 85.4% FPs/scan: 1 -With small nodules: Sens.: 78.2% FPs/scan: 1 Acc.: 72.9%

Sens.: 80% FPs/scan: 22.4

For 3D MVCNN with DAG architecture: Sens.: 95.68% Spec.:94.51% Error rate: 4.59

Sens: 85.256% Spec.: 90.658% Acc.: 89.895% AUC: 0.94

3- Number of views 4-Fusion method

Testing: 150 cases Inputs: 3 64x64 views of the nodule

6. Discussion and conclusions

In this work, a complete CAD system for lung nodule detection using a multi-view CNN is proposed. 650 LIDC scans are used for training and testing. The CAD system combines established algorithms such as thresholding, region growing, and a rule-based classifier with more recent techniques such as the distance transform map and convolutional neural networks. The proposed CNN is a multi-view network that uses only three views of the segmented 3D nodule candidate, which offers a good trade-off between output quality and computational complexity. Extensive experiments are done to test the effect of changing the different parameters of the CNN and to choose their optimum values. The proposed CAD system achieved promising results, comparable to those of other algorithms using CNNs with different architectures and datasets.

The proposed work shows that multi-view 2D networks are simple but efficient in handling 3D data. It indicates the potential of using CNNs for the classification stage in any CAD system, instead of extracting engineered features to be used in complicated classifiers. The results suggest that building a custom network for the task at hand may outperform many algorithms that use complex pre-trained networks trained on a totally different type of input.

For future work, we would like to train and test the system on multiple datasets other than the LIDC, include a dedicated GGN database, and optimize the software of the current CAD system to provide faster and better performance. K-fold cross-validation may be used during the CNN training to provide more accurate and generalizable results.

Acknowledgments

The authors would like to thank Misr Radiology Center for providing initial guidance to this work. This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

References

Ahn, B. (2017). The compact 3D convolutional neural network for medical images. Cs231n.Stanford.Edu. Retrieved from http://cs231n.stanford.edu/reports/2017/pdfs/23.pdf

Albelwi, S., & Mahmood, A. (2017). A Framework for Designing the Architectures of Deep Convolutional Neural Networks. Entropy, 19(6), 242. https://doi.org/10.3390/e19060242

Anirudh, R., Thiagarajan, J. J., Bremer, T., & Kim, H. (2016). Lung Nodule Detection Using 3d Convolutional Neural Networks Trained on Weakly Labeled Data - Semantic Scholar. Medical Imaging, 9785. Retrieved from https://www.semanticscholar.org/paper/LungNodule-Detection-Using-3d-Convolutional-Anirudh-Thiagarajan/d255bd8a937a161727b47ec450cbf35a04378c88

Arindra, A., Setio, A., Ciompi, F., Litjens, G., Gerke, P., Jacobs, C., … Ginneken, B. Van. (2016). Pulmonary Nodule Detection in CT Images: False Positive Reduction Using Multi-View Convolutional Networks. IEEE Transactions on Medical Imaging, 35(5), 1160–1169.

Armato, S. G., McLennan, G., Hawkins, D., Bidaut, L., McNitt-Gray, M. F., Meyer, C. R., … Clarke, L. P. (2011). The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI): a completed reference database of lung nodules on CT scans. Medical Physics, 38(February), 915–931. https://doi.org/10.1118/1.3528204

Austin, J., Müller, N., & Friedman, P. (1996). Glossary of terms for CT of the lungs: recommendations of the Nomenclature Committee of the Fleischner Society. Radiology, (200), 327–331. Retrieved from http://pubs.rsna.org/doi/pdf/10.1148/radiology.200.2.8685321

Biancardi, A. M. (2011). LIDC Size Report Notes - Computer Vision and Image Analysis Group. Retrieved April 19, 2019, from http://www.via.cornell.edu/lidc/notes3.2.html

Chon, A., & Lu, P. (2017). Deep Convolutional Neural Networks for Lung Cancer Detection. Stanford University.

Ciompi, F., Chung, K., Riel, S. J. Van, Adiyoso, A. A., Gerke, P. K., Jacobs, C., … Marchian, A. (2017). Towards automatic pulmonary nodule management in lung cancer screening with deep learning. Scientific Reports, 7(46479).

Convolutional Neural Networks in MATLAB. (2018). Retrieved from https://www.mathworks.com/solutions/deep-learning/convolutionalneural-network.html

El-Baz, A., & Suri, J. S. (2011). Lung Imaging and Computer Aided Diagnosis. CRC Press. https://doi.org/10.1201/b11106

El-Regaily, S. A., Salem, M. A. M., Aziz, M. H. A., & Roushdy, M. I. (2017a). Lung Nodule Segmentation and Detection in Computed Tomography. 2017 Eighth International Conference on Intelligent Computing and Information Systems (ICICIS), 72–78. https://doi.org/10.1109/INTELCIS.2017.8260029

El-Regaily, S. A., Salem, M. A. M., Aziz, M. H. A., & Roushdy, M. I. (2017b). Survey of Computer Aided Detection Systems for Lung Cancer in Computed Tomography. Current Medical Imaging Reviews, 14(1), 3–18. https://doi.org/10.2174/1573405613666170602123329

Greenspan, H., Van Ginneken, B., & Summers, R. M. (2016). Guest Editorial Deep Learning in Medical Imaging: Overview and Future Promise of an Exciting New Technique. IEEE Transactions on Medical Imaging, 35(5), 1153–1159. https://doi.org/10.1109/TMI.2016.2553401

Gu, J., Wang, Z., Kuen, J., Ma, L., Shahroudy, A., & Shuai, B. (2018). Recent Advances in Convolutional Neural Networks. Pattern Recognition, 77, 354–377.

Gu, Y., Lu, X., Yang, L., Zhang, B., Yu, D., Zhao, Y., … Zhou, T. (2018). Automatic lung nodule detection using a 3D deep convolutional neural network combined with a multi-scale prediction strategy in chest CTs. Computers in Biology and Medicine, 103, 220–231. https://doi.org/10.1016/j.compbiomed.2018.10.011

Gurcan, M. N., Sahiner, B., Petrick, N., Chan, H., Kazerooni, E. A., Cascade, P. N., & Hadjiiski, L. (2002). Lung nodule detection on thoracic computed tomography images: Preliminary evaluation of a computer-aided diagnosis system. Medical Physics, 29(11), 2552–2558. https://doi.org/10.1118/1.1515762

Hamidian, S., Sahiner, B., Petrick, N., & Pezeshk, A. (2017). 3D Convolutional Neural Network for Automatic Detection of Lung Nodules in Chest CT. Medical Imaging 2017: Computer-Aided Diagnosis, 10134. https://doi.org/10.1117/12.2255795.3D

Hammack, D., & Wit, J. de. (2017). Kaggle Competition, Data Science Bowl 2017, Predicting Lung Cancer: 2nd Place Solution Write-up. Retrieved from http://blog.kaggle.com/2017/06/29/2017-data-science-bowl-predicting-lung-cancer-2nd-place-solution-write-up-danielhammack-and-julian-de-wit/

He, K., & Jian, S. (2015). Convolutional Neural Networks at Constrained Time Cost. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 5353–5360.

He, K., & Sun, J. (2016). Deep Residual Learning for Image Recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 770–778.

Huang, X., Shan, J., & Vaidya, V. (2017). Lung nodule detection in ct using 3d convolutional neural networks. IEEE 14th International Symposium on Biomedical Imaging (ISBI 2017), 379–383.

Jeff Shneider. (n.d.). Cross Validation. Retrieved March 28, 2019, from https://www.cs.cmu.edu/~schneide/tut5/node42.html

Jemal, A., Siegel, R., Ward, E., Hao, Y., Xu, J., & Thun, M. J. (2009). Cancer statistics, 2009. CA: A Cancer Journal for Clinicians, 59(4), 225–249. https://doi.org/10.3322/caac.20006

Ke, Q., Zhang, J., Wei, W., Połap, D., Woźniak, M., Kośmider, L., & Damaševĭcius, R. (2019). A neuro-heuristic approach for recognition of lung diseases from X-ray images. Expert Systems with Applications, 126, 218–232. https://doi.org/10.1016/j.eswa.2019.01.060

Lampert, T. A., Stumpf, A., & Gançarski, P. (2016). An Empirical Study Into Annotator Agreement, Ground Truth Estimation, and Algorithm Evaluation. IEEE Transactions on Image Processing, 25(6), 2557–2572. https://doi.org/10.1109/TIP.2016.2544703

LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444. https://doi.org/10.1038/nature14539

Li, C., Zhu, G., Wu, X., & Wang, Y. (2018). False-Positive Reduction on Lung Nodules Detection in Chest Radiographs by Ensemble of Convolutional Neural Networks. IEEE Access, 6, 16060–16067. https://doi.org/10.1109/ACCESS.2018.2817023

Li, L., Jamieson, K., & Desalvo, G. (2016). Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization. ArXiv Preprint ArXiv:1603.06560.

Li, W., Cao, P., Zhao, D., & Wang, J. (2016). Pulmonary Nodule Classification with Deep Convolutional Neural Networks on Computed Tomography Images. Computational and Mathematical Methods in Medicine 2016.

Litjens, G., Kooi, T., Bejnordi, B. E., Setio, A. A. A., Ciompi, F., Ghafoorian, M., … Sánchez, C. I. (2017). A Survey on Deep Learning in Medical Image Analysis. Retrieved from http://arxiv.org/abs/1702.05747

Liu, K., & Kang, G. (2017). Multiview convolutional neural networks for lung nodule classification. International Journal of Imaging Systems and Technology, 27(1), 12–22. https://doi.org/10.1002/ima.22206

Liu, X., Hou, F., Qin, H., & Hao, A. (2018). Multi-view multi-scale CNNs for lung nodule type classification from CT images. Pattern Recognition, 77, 262–275. https://doi.org/10.1016/j.patcog.2017.12.022

Lu, L., Member, S., Xu, Z., Nogues, I., Yao, J., & Mollura, D. (2017). Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning. IEEE Transactions on Medical Imaging, 35(5), 1285–1298. https://doi.org/10.1109/TMI.2016.2528162.Deep

McNitt-Gray, M. F., Armato, S. G., Meyer, C. R., Reeves, A. P., McLennan, G., Pais, R. C., … Clarke, L. P. (2007). The Lung Image Database Consortium (LIDC) Data Collection Process for Nodule Detection and Annotation. Academic Radiology, 14(Lidc), 1464–1474. https://doi.org/10.1016/j.acra.2007.07.021

Połap, D., Woźniak, M., Damaševičius, R., & Wei, W. (2019). Chest radiographs segmentation by the use of nature-inspired algorithm for lung disease detection. Proceedings of the 2018 IEEE Symposium Series on Computational Intelligence, SSCI 2018, (0080), 2298–2303. https://doi.org/10.1109/SSCI.2018.8628869

Reeves, A. P., Biancardi, A. M., Apanasovich, T. V., Meyer, C. R., MacMahon, H., van Beek, E. J. R., … Clarke, L. P. (2007). The Lung Image Database Consortium (LIDC). A Comparison of Different Size Metrics for Pulmonary Nodule Measurements. Academic Radiology, 14(Lidc), 1475–1485. https://doi.org/10.1016/j.acra.2007.09.005

Rodrigues, M. B., Da Nobrega, R. V. M., Alves, S. S. A., Filho, P. P. R., Duarte, J. B. F., Sangaiah, A. K., & De Albuquerque, V. H. C. (2018). Health of Things Algorithms for Malignancy Level Classification of Lung Nodules. IEEE Access, 6, 18592–18601. https://doi.org/10.1109/ACCESS.2018.2817614

Shah, S. K., McNitt-Gray, M. F., De Zoysa, K. R., Sayre, J. W., Kim, H. J., Batra, P., … Aberle, D. R. (2005). Solitary pulmonary nodule diagnosis on CT: results of an observer study. Academic Radiology, 12(4), 496–501. https://doi.org/10.1016/j.acra.2004.12.017

Shen, W., Zhou, M., B, F. Y., Dong, D., & Yang, C. (2016). Learning from Experts: Developing Transferable Deep Features for Patient-Level Lung Cancer Prediction. International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer. Cham., 124–131. https://doi.org/10.1007/978-3-319-46723-8

Shen, W., Zhou, M., Yang, F., Yu, D., Dong, D., Yang, C., … Tian, J. (2017). Multi-crop Convolutional Neural Networks for lung nodule malignancy suspiciousness classification. Pattern Recognition, 61, 663–673. https://doi.org/10.1016/j.patcog.2016.05.029

Sluimer, I., Schilham, A., Prokop, M., & Van Ginneken, B. (2006). Computer analysis of computed tomography scans of the lung: A survey. IEEE Transactions on Medical Imaging, 25(4), 385–405. https://doi.org/10.1109/TMI.2005.862753

Sousa, J. R. F. D. S., Silva, A. C., de Paiva, A. C., & Nunes, R. A. (2010). Methodology for automatic detection of lung nodules in computerized tomography images. Computer Methods and Programs in Biomedicine, 98, 1–14. https://doi.org/10.1016/j.cmpb.2009.07.006

Srivastava, R., Greff, K., & J Schmidhuber. (2015). Training very deep networks. Advances in Neural Information Processing Systems, 2377–2385. Retrieved from http://papers.nips.cc/paper/5850-training-very-deep-networks

Sun, W., Zheng, B., Qian, W., Imaging, M., Paso, E., States, U., … Biomedical, S. (2016). Computer aided lung cancer diagnosis with deep learning algorithms. Medical Imaging 2016: Computer-Aided Diagnosis, 9785, 97850Z-97850Z. https://doi.org/10.1117/12.2216307

Vedaldi, A., & May, C. V. (2015). MatConvNet Convolutional Neural Networks for MATLAB. Proceedings of the 23rd ACM International Conference on Multimedia. ACM, 689–692.

Verleysen, A., Vansteenkiste, E., Godin, F., Korshunova, I., Degrave, J., Pigou, L., & Freiberger, M. (2017). Kaggle Competition, Data Science Bowl 2017, Predicting Lung Cancer: Solution Write-up. Retrieved from http://blog.kaggle.com/2017/05/16/data-science-bowl-2017predicting-lung-cancer-solution-write-up-team-deep-breath/

Way, T., Chan, H., Hadjiiski, L., & Sahiner, B. (2010). Computer-Aided Diagnosis of Lung Nodules on CT Scans: ROC Study of Its Effect on Radiologists’ Performance. Academic Radiology, (3), 323–332. Retrieved from http://www.sciencedirect.com/science/article/pii/S1076633209005881

Woźniak, M., Połap, D., Capizzi, G., Sciuto, G. Lo, Kośmider, L., & Frankiewicz, K. (2018). Small lung nodules detection based on local variance analysis and probabilistic neural network. Computer Methods and Programs in Biomedicine, 161, 173–180. https://doi.org/10.1016/j.cmpb.2018.04.025

Credit Author Statement

Manuscript title: Multi-view Convolutional Neural Network for Lung Nodule False Positive Reduction

Salsabil Amin El-Regaily: Conceptualization, Methodology, Software, Validation, Investigation, Resources, Writing – Original Draft, Visualization

Mohammed Abdel-Megeed Salem: Conceptualization, Validation, Resources, Writing – Review & Editing, Supervision

Mohamed Hassan Abdel Aziz: Resources, Supervision

Mohamed Ismail Roushdy: Supervision, Project Administration
