
Highlights

• A comprehensive investigation was carried out on DNA damage classification of comets processed in the form of gray-scale images using convolutional neural networks (CNN).
• Results are compared, under the same conditions, with previous studies that applied different methods, including varying image processing techniques and classical machine learning algorithms.
• The accuracy rate of the proposed method on classification of DNA damage was 96.1%.
• This is the first study that applies a CNN to the problem; the success of the proposed method does not depend on the parameters used in the feature extraction phase of image processing, such as the threshold value.
• The proposed method is more robust than other methods in the literature on DNA damage classification.

CLASSIFICATION OF DNA DAMAGES ON SEGMENTED COMET ASSAY IMAGES USING CONVOLUTIONAL NEURAL NETWORK

Ümit ATİLA 1, Yusuf Yargı BAYDİLLİ 1, Eftal ŞEHİRLİ 2,*, Muhammed Kamil TURAN 3

1 Department of Computer Engineering, Faculty of Engineering, Karabuk University, Karabuk, Turkey
2 Department of Medical Engineering, Faculty of Engineering, Karabuk University, Karabuk, Turkey
3 Department of Medical Biology, Faculty of Medicine, Karabuk University, Karabuk, Turkey

* Corresponding author email: [email protected], Phone: +90 505 503 35 50

ABSTRACT

Background and Objective: Identification and quantification of DNA damage is a very significant subject in the biomedical research area that still needs more robust and effective methods. One of the cheapest, easiest to use and most successful methods for DNA damage analysis is the comet assay. In this study, the performance of a Convolutional Neural Network on quantification of DNA damage from comet assay images was examined and compared to other methods in the literature.

Methods: 796 single comet grayscale images of fixed resolution, labeled by an expert and classified into 4 classes each having approximately 200 samples, namely G0 (healthy), G1 (poorly defective), G2 (defective) and G3 (very defective), were utilized. 120 samples were used as the test dataset and the rest were used in a data augmentation process to achieve better performance in training of the Convolutional Neural Network. The augmented data, a total of 9,995 images belonging to the four classes, were used as the network training data set.

Results: The proposed model, which does not depend on the pre-processing parameters of image processing for DNA damage classification, was able to classify comet images into the 4 classes with an overall accuracy rate of 96.1%.

Conclusions: This paper primarily focuses on the features and usage of a Convolutional Neural Network as a novel method to classify comet objects on segmented comet assay images.

Keywords: comet assay, DNA damage, convolutional neural network, deep learning

1. INTRODUCTION

DNA damage is one of the significant research topics in the medical and biomedical area. Human cells may be damaged unintentionally by many internal and external factors. The comet assay, or single cell gel electrophoresis, is a cheap, easy to use, widely accepted and successful method to analyze DNA damage on comet objects. The scope of the comet assay method is quite large; its areas of usage include fundamental research in DNA damage and repair, human biomonitoring and molecular epidemiology, diagnosis of genetic disorders, monitoring environmental contamination with genotoxins and testing novel chemicals for genotoxicity [1].

Automated analysis of DNA damage on comet assay images is one of the less frequently studied topics in the biomedical area. Computer-aided analysis is helpful and crucial, since there are several critical points that may lead to wrong interpretations and measurements, such as the variability of images in terms of color or gray level, the morphology of comet objects, the high number of comet objects on images, the severity of the DNA damage, the elimination of artefacts, blurry objects created by environmental factors, and overlapping objects created by DNA.

Recently, digital image processing and machine learning have become among the most popular topics in computer science. There are many image processing algorithms for detecting objects, correcting images, extracting information from images, classification, filtering, segmentation and performing analysis on images. After applying image processing algorithms to input images, the output of such systems is generally a digital image, graphics or numerical results.

It is possible to observe studies that apply the above-mentioned techniques to analyze DNA damage. Sreelatha et al. presented a fully automatic method to detect comet objects on very noisy silver stained comet assay images. They developed a computer program that measures parameters such as comet length, tail length, head diameter, percentage of DNA in tail, percentage of DNA in head and tail moment; sensitivity was calculated in their study as 89.30% [2]. Later, they proposed an improved algorithm for automatic detection of true comets, applying homomorphic filtering, shading correction, Otsu's thresholding method, morphological thickening and morphological filtering to detect comets; sensitivity was calculated as 93.17% [3]. Konca et al. presented a computer program called CASP for detection of comets; CASP calculates the head radius, head area, head center, tail area and tail length parameters [4]. Gonzalez et al. presented a computer program called CellProfiler to detect, quantify and export information about comet assay images in an automatic manner. They used a mixture of the expectation-maximization algorithm and a Gaussian method to detect comets; they did not explain their method in detail and presented only performance comparisons between the CellProfiler and CASP programs [5]. In one of the studies in the literature that uses a classical machine learning algorithm, Sreelatha et al. presented a computer program called CometQ for the detection and quantification of DNA damage on comet assay images. In the classifier stage, a Support Vector Machine (SVM) mainly classifies the type of images as silver stained or fluorescent stained; in addition, the SVM also classifies the damage type of the comet cells. In the comet segmentation stage, four different segmentation methods were developed separately for four different classes according to the type of images and the type of comet cells. In the comet partitioning stage, actual comet objects were partitioned into head, halo and tail clusters with the help of morphological operations. Finally, in the comet quantification stage, comet parameters were calculated by taking the head and tail regions into consideration. 600 silver stained images and 56 fluorescent stained images were used in their study; positive predictive value was calculated as 90.26% and sensitivity as 93.34% [6]. Lee et al. developed a method to classify DNA damage patterns on comet assay images, including detection, adjustment and analysis; in the analysis stage, comet parameters were calculated. The average classification accuracy was 86.80% for 20 test data sets including more than 300 images [7].

As can be seen, DNA damage analysis is an actively studied research area. However, it still needs more effort to develop robust and effective methods. In the literature, no study has been observed that utilizes Deep Learning (DL) algorithms for automated detection of DNA damage on comet assay images. Most studies concentrate on dividing the comet structure into a head part and a tail part and calculating comet parameters like head intensity, head area, head length, tail intensity, tail area, tail length and tail moment. Besides, one of the greatest advantages of DL in contrast to classical machine learning is that the raw data itself is taken as input; thus, DL allows an end-to-end learning process. In classical machine learning methods, raw data is passed through pre-processing steps and transformed into vectors or matrices, which are then used as input for the learning algorithm. Errors that may occur during vectorization, and specific features that cannot be extracted, adversely affect classification success [8]. With DL, it is possible to overcome this problem.

The Convolutional Neural Network (CNN) is one of the DL methods used to solve computer vision and analysis problems. This method, based on ideas offered by Fukushima [9] in the 1980s, made progress with the handwritten recognition task performed by LeCun [10] in 1998. However, the actual breakthrough of this method came after 2012, when the AlexNET [11] architecture achieved great success in the ImageNet Challenge. From that date onwards, it became possible to build deeper and more complex models that can respond to both microscopic and macroscopic problems, thanks to ever-evolving computing power [12].

The main contribution of this study is to bring the usage of the CNN method to the forefront for segmented comet assay images. The study quantifies and classifies segmented comet assay images into four different damage levels with a CNN as a novel method. The study is organized as follows: the first section describes the protocol and materials used for the laboratory experiments; the second section explains the data acquisition process and the proposed CNN model; the third section presents the results; the fourth section discusses the obtained results; and the fifth section concludes.

2. MATERIAL AND METHODS

52 preparates were prepared using human lymphocyte cells to perform the comet assay experiment in this study. 10 preparates were damaged by injecting hydrogen peroxide (H2O2), while the other 42 preparates were left undamaged. Eventually, 205 single comet images obtained from the 52 preparates were utilized. All images were stored in the comet assay database.

2.1. Protocol of Comet Assay Experiment

According to the protocol of the comet assay experiment, lymphocyte cells are isolated from a serum-free culture medium. Subsequently, normal melting point agarose and low melting point agarose are prepared to spread the cells on dry slides. The slides are then put in lysis solution and unwinding solution, in that order. Next, electrophoresis is performed to move the cells from cathode to anode; the cells move in the horizontal direction, since the slides are aligned with the borders of the electrophoresis tank. This process forms the tail of damaged comets, while healthy comets move without creating a tail. As soon as the electrophoresis process finishes, the slides are stained with ethidium bromide (EtBr) in order to make the cells clearly visible under a microscope. Before monitoring the cells, the slides are dried at room temperature [13]–[15].

2.2. Features of Imaging System

The preparates are monitored under a green filter with a fluorescent light microscope; an Olympus CX31 with fluorescent attachment was utilized to obtain the images. The imaging is arranged so that the preparates appear in the red channel of the radiometric range against a dark background.

2.3. Validation

796 individual comet images obtained from different patients were saved in grayscale form at a fixed resolution. Subsequently, the images were labeled by an expert and classified into four classes: G0 (healthy), G1 (poorly defective), G2 (defective) and G3 (very defective). In order to perform a decent classification with machine learning methods, the data distribution of the classes must be close to each other; otherwise, a phenomenon called an "imbalanced data set" occurs. Even if a high success rate is achieved at the end of classification, that ratio only reflects the accuracy of the class with the high data count [16].

One of the solutions applied to overcome this problem, which is frequently encountered in medical image analysis, is under-sampling. Since the images obtained from the patients contained a high number of G0 samples compared to the other three classes, 205 of them were randomly chosen, in parallel with the data counts of the other classes; G1, G2 and G3 contained 191, 202 and 198 samples, respectively. Eventually, 85% of the data set belonging to the different classes was separated as the training set and 15% as the test set.

2.4. Data Augmentation

One of the problems faced when working with deep learning methods is their dependency on data size: the greater the amount of data used, the greater the success that can be achieved [17]. The data augmentation method, which is often used in deep learning, comes in handy to solve this issue [18]. In this method, copies of the images in different variations are created, ensuring that the model is able to extract more features from all images. The number of samples was therefore increased through seven different methods (rotation, shear, width shift, height shift, zoom, horizontal flipping and vertical flipping) to increase the success rate of the model; a sketch of this configuration is given after Table 1. The images of one sample of each class before and after the data augmentation procedure are shown in Fig. 1.

Fig. 1. The images before and after the data augmentation procedure (panels: a) G0, b) G1, c) G2, d) G3).

At the end of the process, a total of 9,995 images belonging to the four classes were used in training of the network, while the test data consists of 120 images. The properties of the generated data set are given in Table 1.

Table 1. The properties of the generated data set.

Property             G0      G1      G2      G3      Total
Number of samples    205     191     202     198     796
Training set         173     166     175     162     676
Data augmentation    2,496   2,499   2,500   2,500   9,995
Test set             32      25      27      36      120

All images are grayscale and share the same resolution.
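The paper does not include code, but the seven augmentation methods listed above, together with the divide-by-255 normalization described in the next paragraph, could be configured with Keras' ImageDataGenerator roughly as in the following sketch. The numeric ranges, the 80x80 image size and the array names are illustrative assumptions, not the authors' values:

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# The seven augmentation methods named in the text; ranges are assumptions.
augmenter = ImageDataGenerator(
    rotation_range=90,         # Rotation
    shear_range=0.2,           # Shear
    width_shift_range=0.1,     # Width Shift
    height_shift_range=0.1,    # Height Shift
    zoom_range=0.2,            # Zoom
    horizontal_flip=True,      # Horizontal Flipping
    vertical_flip=True,        # Vertical Flipping
    rescale=1.0 / 255,         # the divide-by-255 normalization (see below)
)

# Hypothetical stand-ins for the 676 original training crops and their
# one-hot labels (the 80x80 size is inferred later, not stated here).
x = np.random.rand(676, 80, 80, 1).astype("float32")
y = np.eye(4)[np.random.randint(0, 4, size=676)]
batches = augmenter.flow(x, y, batch_size=128, shuffle=True)
```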

An image, which is perceived as vision by the cortex, is seen by a computer as a matrix containing pixel values varying between 0 and 255 for each color channel. In order to ensure that the network's parameters converge faster and are not affected by differences in pixel value scale, the images were normalized by dividing each pixel value by 255 [19].

2.5. Convolutional Neural Networks

As mentioned before, images are matrices that contain pixel values. It is therefore possible to extract the features of these matrices by means of convolutional filters; for example, the edges of an object in an image can be detected using an edge detection filter, as in the toy sketch below. A CNN thus aims to perform classification by learning features extracted through such filters via artificial neural networks (ANN) and querying whether other images have the same properties.
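As a toy illustration of such a filter (not part of the paper), the following sketch convolves a synthetic image with a hand-designed vertical-edge kernel; a CNN learns kernels of this kind automatically during training:

```python
import numpy as np
from scipy.signal import convolve2d

# A simple vertical-edge kernel (Sobel-like); hand-designed here,
# whereas a CNN would learn such weights from data.
kernel = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]])

image = np.zeros((8, 8))
image[:, 4:] = 1.0                      # toy image: dark left, bright right
edges = convolve2d(image, kernel, mode="same")
print(np.abs(edges).max(axis=0))        # strong response at the boundary
```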

A CNN architecture consists of three parts: convolution layers, pooling and classification. The convolution layer performs feature extraction through filter matrices and presents these inferences in a hierarchical fashion: the basic features of an image are extracted in the first layers, and these features are then used to reveal more complex features in subsequent layers. In addition, two properties distinguish CNNs from classical ANNs: the filters learn the best parameters for classifying the whole image (weight sharing), and the architecture remains unaffected by anomalies found on different regions of the image (local connectivity). In this way, each neuron is responsible only for the receptive field that affects it. Another advantage provided by convolutional filters is that they greatly reduce the number of parameters that must be learned [20]. Further, the pooling layer, which is used at later stages, ensures the invariance of the representations [21].

The classification part of a CNN works like a Multi-Layer Perceptron (MLP). The neurons in the last convolutional layer are flattened into a vector, and classification is then performed by adding hidden layers with as many neurons as desired. At the end of the process, if the network learns to classify images correctly through the parameters of the filters, the same filters can be used to extract features from a test image in order to make a comparison. Thus, if the convolution layers encounter any previously observed features, the corresponding neurons will be activated; in the MLP layer, classification is performed by neurons having similar output values.

2.6. Proposed Architecture

Hyper-parameter optimization is another issue that needs to be dealt with when constructing deep learning architectures [22]. Hyper-parameters are attributes determined at the beginning that usually remain unchanged during the training process, such as the number of convolution filters, the number of convolution layers and the activation functions [23]–[25]. Working with the right hyper-parameters is crucial, because they directly affect the success of the model [26]. For example, if the network is too deep, the training duration increases; if the network is too shallow, it may fail to learn the training data [27]. Moreover, the sigmoid activation function causes the vanishing gradient problem [28]. In creating the architecture, the aim was to propose an effective model considering the problems mentioned above. The pruning technique was used for hyper-parameter optimization in this manner: this method initiates the training process with a deep architecture and tries to improve learning by decreasing the training parameters (layers) [29]. The proposed architecture is shown in Fig. 2.

Because each pixel of the images carries meaningful information, the images were translated in all directions during the data augmentation process. The padding value was set to "same" in all convolutional operations of the network, with the aim of extracting the information that lies in the pixels on the edges. In the first convolution layer, the single-channel comet images were scanned with 6 different kernels; here, in order to obtain a larger receptive field and halve the number of parameters at the next layer, the kernel size was set to 7 and the stride to 2. Although pooling layers are very useful in reducing the number of parameters, they were used relatively rarely in this architecture, since they can cause the loss of some important features [30]. Next, the second layer was scanned with 16 convolution filters. After the max-pool operation, the 16 resulting feature maps were scanned with 36 kernels, and the resulting feature maps were scanned with 36 kernels again. This stacked small-kernel convolution approach is used in some state-of-the-art models like AlexNET and VGG: while it increases the number of obtained features by increasing the receptive field and the depth of the architecture, it prevents the number of trainable parameters from growing [11], [31], [32]. In the model, the dual convolution layers were constructed with this idea in mind.

Fig. 2. Proposed CNN model.

After the second max-pool operation, the 36 feature maps were finally scanned with 64 kernels. In the classification section, the 6,400 neurons obtained from the last convolution layer were flattened and fully connected to 64 neurons. An output layer with four neurons was attached to the end of the model, one for each of the four damage classes. During the learning process, the batch normalization method [33], which adds noise by normalizing the activations of the layers, and the dropout method [34], which randomly deactivates neurons in the network, were used in order to increase learning speed and avoid overfitting. Although the ReLU activation function, f(x) = max(0, x), may help prevent the vanishing gradient problem, during back-propagation it causes some neurons to die after a while, because turning all negative values into zero makes them generate the same output continuously [35]. Therefore, the PReLU [36] activation function, f(x) = max(0, x) + a min(0, x) with a learnable coefficient a, was preferred in both the convolution and classification layers. On the other hand, Adaptive Momentum Estimation (ADAM) [37] was used as the optimization function; it tries to prevent the local minimum phenomenon by performing adaptive learning based on the network parameters during the update process. Table 2 summarizes the structure of the proposed network; a code sketch of the full layer sequence follows the table.

Table 2. The structure of the proposed network.

Layer  Type                          Filters  Kernel, Stride  Padding  Activation
0      Input                         -        -               -        -
1      Convolution + Batch norm.     6        7x7, 2          same     PReLU
2      Convolution + Batch norm.     16       -               same     PReLU
3      Max-pool                      -        -               -        -
4      Convolution + Batch norm.     36       -               same     PReLU
5      Convolution + Batch norm.     36       -               same     PReLU
6      Max-pool                      -        -               -        -
7      Convolution + Batch norm.     64       -               same     PReLU
8      FC (flatten, 6,400 neurons)   -        -               -        PReLU
9      FC (64 neurons)               -        -               -        PReLU
10     Output (4 neurons)            -        -               -        Softmax
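To make the layer sequence concrete, below is a minimal Keras sketch of an architecture matching the description above. Only the filter counts, the 7x7/stride-2 first layer, "same" padding, PReLU, ADAM and the softmax output are stated in the paper; the 80x80 input size is an inference (6,400 flattened neurons = 64 x 10 x 10 after three halvings), and the 3x3 kernels in the later layers and the dropout rate are assumptions:

```python
from tensorflow.keras import layers, models

def conv_bn(x, filters, kernel, stride=1):
    # Convolution + batch normalization + PReLU, as in Table 2.
    x = layers.Conv2D(filters, kernel, strides=stride, padding="same")(x)
    x = layers.BatchNormalization()(x)
    return layers.PReLU(shared_axes=[1, 2])(x)

inputs = layers.Input(shape=(80, 80, 1))      # 80x80 input is an inference
x = conv_bn(inputs, 6, 7, stride=2)           # layer 1: 6 kernels, 7x7, stride 2
x = conv_bn(x, 16, 3)                         # layer 2 (3x3 kernel assumed)
x = layers.MaxPooling2D()(x)                  # layer 3
x = conv_bn(x, 36, 3)                         # layers 4-5: stacked convolutions
x = conv_bn(x, 36, 3)
x = layers.MaxPooling2D()(x)                  # layer 6
x = conv_bn(x, 64, 3)                         # layer 7: 64 kernels
x = layers.Flatten()(x)                       # 64 * 10 * 10 = 6,400 neurons
x = layers.Dense(64)(x)                       # layer 9: FC with 64 neurons
x = layers.PReLU()(x)
x = layers.Dropout(0.5)(x)                    # dropout rate is an assumption
outputs = layers.Dense(4, activation="softmax")(x)  # four damage classes

model = models.Model(inputs, outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```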

The target vectors, which declare the correct class of each image, were created by the one-hot-encoding method. This vector is an array containing as many elements as there are classes (four in this problem), taking the value 1 for the correct class and 0 for the others. During the feed-forwarding process, the probabilities of the predicted outputs and the error rates relative to the target vectors were calculated using the softmax function presented in Eq. 1 and the categorical cross-entropy function presented in Eq. 2 [20]:

p_i = e^(y_i) / sum_{j=1..C} e^(y_j)    (1)

where p_i is the computed probability for the feed-forwarded image, y is the vector containing the outputs that the network generates for each class, and C is the class count;

E = - sum_{i=1..C} t_i log(p_i)    (2)

where t is the target vector that represents the correct classes of the images.
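As a quick numeric illustration of Eqs. 1 and 2 (the output values below are made up, not taken from the paper), the two functions can be computed for a single four-class example as follows:

```python
import numpy as np

y = np.array([2.0, 0.5, -1.0, 0.1])   # hypothetical network outputs (C = 4)
t = np.array([1.0, 0.0, 0.0, 0.0])    # one-hot target: correct class is G0

p = np.exp(y) / np.sum(np.exp(y))     # Eq. 1: softmax probabilities
loss = -np.sum(t * np.log(p))         # Eq. 2: categorical cross-entropy

print(p.round(3), round(float(loss), 3))  # ~[0.703 0.157 0.035 0.105] 0.352
```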

Besides, one of the hyper-parameters used in training a CNN is the batch size; this value determines how many images are handled by the architecture in each forward/backward iteration [38]. The batch size was set to 128 in order to increase training speed. In addition, the k-fold cross-validation method was used: by choosing k = 10, the model performed learning with 90% of the training data and validation with the rest. By shuffling the training data, the model was able to use different images in each epoch; this was aimed at preventing overfitting on the training data and obtaining accurate results for all images [39]. In brief, the built model had 8,995 images (in sets of 128) passing through the feed-forward process in each epoch. The total error for each set was calculated and back-propagation [40] was performed via the optimization function to update the weights. Following the updates for all images, the trained network was verified on the remaining 1,000 images to measure the success rate. The model, which contains 594,398 parameters (594,082 trainable + 316 non-trainable batch norm), was trained for 100 epochs.
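A minimal sketch of this training setup, reusing the hypothetical `model` from the architecture sketch above (the array names, shapes and random seed are assumptions; the 10-fold split yields roughly the 8,995/1,000 train/validation partition described in the text):

```python
import numpy as np
from sklearn.model_selection import KFold

# Hypothetical stand-ins for the 9,995 augmented images and one-hot labels.
x_train = np.random.rand(9995, 80, 80, 1).astype("float32")
y_train = np.eye(4)[np.random.randint(0, 4, size=9995)]

# Shuffled 10-fold split: ~90% training, ~10% validation per fold.
kfold = KFold(n_splits=10, shuffle=True, random_state=42)
train_idx, val_idx = next(iter(kfold.split(x_train)))
model.fit(x_train[train_idx], y_train[train_idx],
          batch_size=128, epochs=100,
          validation_data=(x_train[val_idx], y_train[val_idx]))
```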

3. RESULTS

In spite of attempts to prevent over-fitting by various normalization and regularization techniques, it was observed that the network converged to between 85% and 90% classification accuracy and began to memorize the training data. The highest classification accuracy, 0.8990, was observed at the 84th epoch. The graph showing all the training and validation success rates that the network achieved during the process is given in Fig. 3.

Fig. 3. Training and validation success rates of the network.

Another opportunity provided by CNNs is that, since the results produced by the model exist as matrices in the layers, the layers can be visualized to show how the training has evolved over time [21]. Fig. 4 shows the features extracted from the convolution and activation layers for a G1-class image, colored with the jet color palette.

Fig. 4. Visualization of convolution and activation layers.

As can be seen, in the initial convolution and activation maps, the model was able to detect the Region of Interest (ROI) of the related comet image and extract the basic properties of the image. In the following layers, it focused on more complex features using the information coming from the first layers.
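One hedged way to produce a figure like Fig. 4 in Keras is to build a sub-model that exposes an intermediate activation and plot its feature maps with the jet colormap. The `model` here is the hypothetical network from the earlier sketch, and the comet image is a random stand-in for a real G1 sample:

```python
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras import models

# Hypothetical 80x80 grayscale comet image, already normalized to [0, 1].
comet_image = np.random.rand(80, 80).astype("float32")

# Sub-model exposing the first PReLU activation (6 feature maps at 40x40);
# the layer index matches the earlier sketch, not the authors' code.
probe = models.Model(model.input, model.layers[3].output)
maps = probe.predict(comet_image[None, :, :, None])  # shape (1, 40, 40, 6)

fig, axes = plt.subplots(1, maps.shape[-1], figsize=(12, 2))
for i, ax in enumerate(axes):
    ax.imshow(maps[0, :, :, i], cmap="jet")  # jet palette, as in Fig. 4
    ax.axis("off")
plt.show()
```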


After the training process, when the test data were classified with the model that reached the best validation rate on the training data, the true classification rate (predictive value) was measured as 92.5%. The confusion matrix, which contains the number of correctly and incorrectly classified images for each class, can be seen in Fig. 5. From the confusion matrix, it is also possible to calculate statistical metrics such as sensitivity, specificity, precision and accuracy [41]. The classification report of the model is shown in Table 3; a sketch of how such a report can be computed follows the table.

Fig. 5. Confusion matrices with and without normalization.

Table 3. Classification report.

Study                            Class    # of Samples  Sens. (%)  Spec. (%)  Prec. (%)  Acc. (%)
Ours                             G0       205           100.00     97.53      94.12      98.23
                                 G1       191           84.00      97.83      91.30      94.87
                                 G2       202           92.59      95.56      86.21      94.87
                                 G3       198           91.67      98.73      97.06      96.52
                                 Overall  796           92.50      97.37      92.50      96.10
Ganapathy, S., et al. [6]        Overall  656           93.34      N/A        90.26      N/A
Turan, M.K. and E. Sehirli [15]  Overall  412           99.03      99.67      99.03      99.51
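Metrics of this kind can be reproduced from predicted and true labels with scikit-learn, which the paper cites for model evaluation [41]. A small sketch with hypothetical label arrays standing in for the 120 test results (including roughly the 9 misclassifications reported below):

```python
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

# Hypothetical stand-ins for the test labels and the model's predictions.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 4, size=120)
y_pred = y_true.copy()
flip = rng.choice(120, size=9, replace=False)       # ~9 misclassifications
y_pred[flip] = rng.integers(0, 4, size=9)

print(confusion_matrix(y_true, y_pred))
print(classification_report(y_true, y_pred,
                            target_names=["G0", "G1", "G2", "G3"], digits=4))
```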

As can be seen in the table, this study outperformed [6] in terms of precision, but fell behind [15]. However, it should be noted that this study does not depend on parametric values, whereas the other two studies have accuracy values that vary according to pre-processing parameters of feature extraction, such as the threshold value. When the other methods encounter new images, it is possible that the chosen segmentation methods fail; this study, by contrast, offers a robust model that can respond to all images from the same source. On the other hand, if the number of samples belonging to the G1 and G2 classes, which negatively affect the sensitivity and specificity values, increases, the success rate of the deep learning model will tend to increase, while the success rates of the other, parameter-dependent studies are expected to decrease further when they encounter more samples. The 9 misclassified comet images out of 120 adversely affect the performance of this study; the true and predicted classes of these samples are shown in Table 4.

Table 4. Misclassified comet images.

[Table 4 arranges the nine misclassified test images in a grid of true labels (rows G0–G3) against predicted labels (columns G0–G3); the image cells cannot be reproduced in text.]

When the misclassified samples were investigated in detail, it was observed that the images are fairly close to their predicted classes, so close that even an expert could fail.

4. DISCUSSION

In this study, the obtained comet images were classified into four classes with 96.1% accuracy. During the training and test process, it was observed that the data characteristics were the most important factor affecting the success of classification. Although the success rate of classification could be increased by data augmentation techniques to a certain extent, the rate can ascend only about one point and then stops, since the augmented data are copies of the original data. Images with more specific features that provide discrimination among classes should be used in order to increase the success rate of classification [42], [43].

One of the most important results observed in this study is that G0 and G3 images are easy to distinguish from each other, but images belonging to the G1 and G2 classes are more difficult to separate. It was observed that similarities among the images of neighboring classes directly affect the success rate and caused the lowest classification performance, especially for G1-class comets. In this respect, it is essential to have more specific data for training and testing in order to increase the success rate of classification.

Beyond the above-mentioned suggestions, reducing the number of classes can be tried as one of the methods that may help increase the success rate. A 3-class (healthy - G0, defective - G1&G2, very defective - G3) or binary (healthy - G0, defective - G1&G2&G3) categorization performed in place of the 4-class categorization may increase the diversity of the data. In addition, since such a classification allows estimation according to the complement of the classes, it is foreseen that the success rate of the classification can be carried to higher points.

As known, if the amount of data is insufficient, deep learning methods such as transfer learning and fine tuning [44] are used in order to make it possible to classify a data set with a high success rate. In these methods, models previously trained on different tasks (AlexNET, VGG etc.), which have the ability to extract the hierarchical properties of a different data set, are used as feature extractors and classifiers for one's own data set. However, such architectures are mostly configured for 3-channel data.

Fig. 6. Transfer learning vs. fine tuning.

At this part of the study, in order to evaluate TL and FT performance, we copied the single-channel data to the other two channels and thus generated 3-channel inputs (the same data in each channel) [44]. Next, we performed the tests using three pre-trained models: VGG-19 [32], ResNet-50 [45] and NASNetMobile [46]. We used the ImageNET [11] data set to initialize the weights of the architectures. Two training strategies were employed while performing the tests (a sketch of the TL setup follows Table 5):

• Transfer Learning (TL): In this technique, a pre-trained model was used as a feature extractor and the bottleneck features of the data set were obtained. Then, the final convolution layers were flattened and linked to 3 fully-connected layers containing 512, 64 and 4 neurons, respectively.

• Fine Tuning (FT): In this technique, the weights with which the model extracts the generic features of the data are frozen, while the upper layers that focus on more specific characteristics of the data are included in the training process. In this way, overfitting is avoided by adapting the network to the new samples. Again, the outputs of the last convolution layers were flattened and connected to 512, 64 and 4 neurons, respectively.

During the tests, the 10-fold cross-validation method was used and the models were trained for 100 epochs. The last 5 layers of VGG-19, the last 20 of ResNet-50 and the last 20 of NASNetMobile were fine tuned. The obtained results are shown in Table 5.

Table 5. Transfer learning/fine tuning results.

Model          Method        Sensitivity (%)  Specificity (%)  Precision (%)  Accuracy (%)
VGG19          TL            85.00            94.44            85.00          91.89
VGG19          FT            86.67            95.12            86.67          92.86
ResNet50       TL            22.50            46.55            22.50          36.73
ResNet50       FT            84.17            94.10            84.17          91.40
NASNetMobile   TL            39.17            65.89            39.17          56.29
NASNetMobile   FT            40.83            67.43            40.83          57.99
CNN            from scratch  92.50            97.37            92.50          96.10
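A minimal sketch of the TL setup as described above (grayscale replicated to three channels, a frozen VGG-19 base, and a 512/64/4 head); the input size and array names are assumptions carried over from the earlier sketches:

```python
import numpy as np
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG19

# Hypothetical grayscale comets; replicate the single channel three times.
x_gray = np.random.rand(796, 80, 80, 1).astype("float32")
x_rgb = np.repeat(x_gray, 3, axis=-1)

base = VGG19(weights="imagenet", include_top=False, input_shape=(80, 80, 3))
base.trainable = False  # TL: freeze all pre-trained convolutional weights

head = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(512, activation="relu"),
    layers.Dense(64, activation="relu"),
    layers.Dense(4, activation="softmax"),  # four damage classes
])
head.compile(optimizer="adam", loss="categorical_crossentropy",
             metrics=["accuracy"])
```

For the FT variant, one would instead set `base.trainable = True` and re-freeze all but the last few layers before compiling.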

As can be seen in the table, the models over-fitted the training data. Even though the fine tuning method reached higher rates, the results stayed behind those of the CNN trained from scratch. This situation can be explained by the "negative transfer" phenomenon, which is frequently encountered in transfer learning and domain adaptation [47]. Although TL and FT methods can achieve high success in many tasks, the lack of similarity between ImageNET and the comet samples had a negative effect on the results. In the deep learning community, there are some accepted rules of thumb for deciding how to proceed with domain adaptation/transfer learning: if there is little similarity between the data set used for the initial weights and the data set to be adapted, the FT method is recommended to avoid overfitting; however, if the data set is large, training from scratch instead of TL/FT is foreseen to yield better results [48]. Therefore, since we generated a large data set after data augmentation, we were able to show that an end-to-end process with a CNN achieves much better results.

5. CONCLUSION

DNA is crucial for human beings, as for all living beings, and the consequences of DNA damage can be fateful. Through analyses carried out on DNA, it is possible to implement the right treatments by obtaining important information about patients. In this respect, it is very significant to analyze comet assay images, a method used in DNA damage analysis, and to determine the degree of damage with high success. Although successful modeling can be achieved by extracting meaningful information through image processing techniques and machine learning algorithms, the success rates of these methods depend on choosing the correct parameters, and a wrong choice decreases the success rate of the model. Because images can be used directly as input data, deep learning emerges as a method unaffected by the above-mentioned issues: since it does not need pre-processing, it is not affected by errors in that stage, and robust modeling is possible if there are enough data.

In this study, 796 comet images were divided into two groups (85% training, 15% test) and classified by the CNN method. To obtain more learning from each sample, the training data was expanded by the data augmentation method. At the end of this study, the accuracy of the model was measured as 96.1% on test data not previously seen by the model. Although the results obtained in this study are behind the previous study that used the Decision Tree method [15], it should be noted that the results are independent of parameters and of a trial/error process. By adding more unique samples to the training data, it is predicted that higher success rates can be reached compared to the other study.

CONFLICT OF INTEREST

None.

6. ACKNOWLEDGMENT

This work was supported by the Research Fund of Karabük University, Project Number: KBÜ-BAP-16/2-DR-102. Our thanks to Karabük University for providing the opportunity and working cooperatively during the comet assay experiment.

7. REFERENCES

[1] A. R. Collins, "The comet assay for DNA damage and repair: principles, applications, and limitations," Molecular Biotechnology, vol. 26, no. 3, pp. 249–261, Mar. 2004.
[2] G. Sreelatha, P. Rashmi, P. S. Sathidevi, M. Aparna, P. Chand, and R. P. Rajkumar, "Automatic detection of comets in silver stained comet assay images for DNA damage analysis," in 2014 IEEE International Conference on Signal Processing, Communications and Computing (ICSPCC), Guilin, China, 2014, pp. 533–538.
[3] G. Sreelatha, A. Muraleedharan, P. Chand, R. P. Rajkumar, and P. S. Sathidevi, "An improved automatic detection of true comets for DNA damage analysis," Procedia Computer Science, vol. 46, pp. 135–142, Jan. 2015.
[4] K. Końca et al., "A cross-platform public domain PC image-analysis program for the comet assay," Mutation Research/Genetic Toxicology and Environmental Mutagenesis, vol. 534, no. 1–2, pp. 15–20, Jan. 2003.
[5] J. E. González, I. Romero, J. F. Barquinero, and O. García, "Automatic analysis of silver-stained comets by CellProfiler software," Mutation Research/Genetic Toxicology and Environmental Mutagenesis, vol. 748, no. 1, pp. 60–64, Oct. 2012.
[6] S. Ganapathy, A. Muraleedharan, P. S. Sathidevi, P. Chand, and R. P. Rajkumar, "CometQ: An automated tool for the detection and quantification of DNA damage using comet assay image analysis," Computer Methods and Programs in Biomedicine, vol. 133, pp. 143–154, Sep. 2016.
[7] T. Lee et al., "Robust classification of DNA damage patterns in single cell gel electrophoresis," in 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Osaka, Japan, 2013, pp. 3666–3669.
[8] K. Suzuki, "Overview of deep learning in medical imaging," Radiological Physics and Technology, vol. 10, no. 3, pp. 257–273, Sep. 2017.
[9] K. Fukushima, "Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position," Biological Cybernetics, vol. 36, no. 4, pp. 193–202, Apr. 1980.
[10] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, Nov. 1998.
[11] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Advances in Neural Information Processing Systems 26, Stateline, NV, US, 2013, pp. 1097–1105.
[12] L. Nanni, S. Brahnam, S. Ghidoni, and A. Lumini, "Bioimage classification with handcrafted and learned features," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 16, no. 3, pp. 874–885, May 2019.
[13] N. P. Singh, M. T. McCoy, R. R. Tice, and E. L. Schneider, "A simple technique for quantitation of low levels of DNA damage in individual cells," Experimental Cell Research, vol. 175, no. 1, pp. 184–191, Mar. 1988.
[14] A. R. Collins, "The comet assay: Principles, applications, and limitations," Methods in Molecular Biology, vol. 203, pp. 163–177, 2002.
[15] M. K. Turan and E. Sehirli, "A novel method to identify and grade DNA damage on comet images," Computer Methods and Programs in Biomedicine, vol. 147, pp. 19–27, Aug. 2017.
[16] H. He and E. A. Garcia, "Learning from imbalanced data," IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 9, pp. 1263–1284, Sep. 2009.
[17] D. Shen, G. Wu, and H.-I. Suk, "Deep learning in medical image analysis," Annual Review of Biomedical Engineering, vol. 19, no. 1, pp. 221–248, Jun. 2017.
[18] J.-G. Lee et al., "Deep learning in medical imaging: General overview," Korean Journal of Radiology, vol. 18, no. 4, pp. 570–584, 2017.
[19] F. Chollet, "Building Powerful Image Classification Models Using Very Little Data," 2018. [Online]. Available: https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html. [Accessed: 28-Aug-2018].
[20] S. K. Zhou, H. Greenspan, and D. Shen, Deep Learning for Medical Image Analysis, 1st ed. London; San Diego: Academic Press, 2017.
[21] H. Lee, R. Grosse, R. Ranganath, and A. Y. Ng, "Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations," in 26th Annual International Conference on Machine Learning (ICML '09), Montreal, Quebec, Canada, 2009, pp. 1–8.
[22] J. Bergstra and Y. Bengio, "Random search for hyper-parameter optimization," Journal of Machine Learning Research, vol. 13, pp. 281–305, 2012.
[23] Y. Bengio, "Hyper-parameters for deep learning," presented at the 31st International Conference on Machine Learning, Beijing, China, 2014.
[24] T. Gao, "Hyperparameter optimization for deep learning," presented at the UNC CS Deep Learning Journal Club, University of North Carolina at Chapel Hill, NC, US, 2016.
[25] M. Hushchyn, "Model selection & hyperparameters tuning," presented at the Cyclotron Seminar ALICE/NA61/SPbSU, Saint Petersburg State University, Saint Petersburg, Russia, 2017.
[26] X. Glorot and Y. Bengio, "Understanding the difficulty of training deep feedforward neural networks," in Thirteenth International Conference on Artificial Intelligence and Statistics, Chia Laguna Resort, Sardinia, Italy, 2010, pp. 249–256.
[27] A. Ng, "Basic Recipe for Machine Learning - Practical aspects of Deep Learning," Coursera, 2018. [Online]. Available: https://www.coursera.org/lecture/deep-neural-network/basic-recipe-for-machine-learning-ZBkx4. [Accessed: 31-Aug-2018].
[28] V. Nair and G. E. Hinton, "Rectified linear units improve restricted Boltzmann machines," in 27th International Conference on Machine Learning (ICML '10), Haifa, Israel, 2010, pp. 807–814.
[29] Y. LeCun, J. S. Denker, S. Solla, R. E. Howard, and L. D. Jackel, "Optimal brain damage," in Advances in Neural Information Processing Systems 2, Denver, CO, US, 1989.
[30] S. Sabour, N. Frosst, and G. E. Hinton, "Dynamic routing between capsules," arXiv:1710.09829 [cs], Oct. 2017.
[31] M. Lin, Q. Chen, and S. Yan, "Network in network," arXiv:1312.4400 [cs], Dec. 2013.
[32] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv:1409.1556 [cs], Sep. 2014.
[33] S. Ioffe and C. Szegedy, "Batch normalization: Accelerating deep network training by reducing internal covariate shift," in 32nd International Conference on Machine Learning, Lille, France, 2015, vol. 37, pp. 448–456.
[34] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: A simple way to prevent neural networks from overfitting," Journal of Machine Learning Research, vol. 15, pp. 1929–1958, 2014.
[35] D.-A. Clevert, T. Unterthiner, and S. Hochreiter, "Fast and accurate deep network learning by exponential linear units (ELUs)," arXiv:1511.07289 [cs], Nov. 2015.
[36] K. He, X. Zhang, S. Ren, and J. Sun, "Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification," in 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 2015, pp. 1026–1034.
[37] D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," in 3rd International Conference on Learning Representations, San Diego, US, 2014.
[38] N. S. Keskar, D. Mudigere, J. Nocedal, M. Smelyanskiy, and P. T. P. Tang, "On large-batch training for deep learning: Generalization gap and sharp minima," in 5th International Conference on Learning Representations, Toulon, France, 2016.
[39] P. Refaeilzadeh, L. Tang, and H. Liu, "Cross-validation," in Encyclopedia of Database Systems, Boston, MA, US: Springer, 2009, pp. 532–538.
[40] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," Nature, vol. 323, no. 6088, pp. 533–536, Oct. 1986.
[41] scikit-learn, "Model evaluation: quantifying the quality of predictions — scikit-learn 0.19.2 documentation," 2018. [Online]. Available: http://scikit-learn.org/stable/modules/model_evaluation.html. [Accessed: 03-Sep-2018].
[42] L. Perez and J. Wang, "The effectiveness of data augmentation in image classification using deep learning," arXiv:1712.04621 [cs], Dec. 2017.
[43] S. C. Wong, A. Gatt, V. Stamatescu, and M. D. McDonnell, "Understanding data augmentation for classification: When to warp?," in 2016 International Conference on Digital Image Computing: Techniques and Applications (DICTA), Gold Coast, QLD, Australia, 2016, pp. 1–6.
[44] N. Tajbakhsh et al., "Convolutional neural networks for medical image analysis: Full training or fine tuning?," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1299–1312, May 2016.
[45] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016), Las Vegas, NV, US, 2016, pp. 770–778.
[46] B. Zoph, V. Vasudevan, J. Shlens, and Q. V. Le, "Learning transferable architectures for scalable image recognition," in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2018), Salt Lake City, UT, US, 2018, pp. 8697–8710.
[47] K. Weiss, T. M. Khoshgoftaar, and D. Wang, "A survey of transfer learning," Journal of Big Data, vol. 3, no. 1, p. 9, May 2016.
[48] A. Karpathy, "Transfer Learning," 2018. [Online]. Available: http://cs231n.github.io/transfer-learning/. [Accessed: 22-Nov-2018].
