computer methods and programs in biomedicine 133 (2016) 143–154
journal homepage: www.intl.elsevierhealth.com/journals/cmpb
CometQ: An automated tool for the detection and quantification of DNA damage using comet assay image analysis

Sreelatha Ganapathy a,*, Aparna Muraleedharan b, Puthumangalathu Savithri Sathidevi a, Parkash Chand c, Ravi Philip Rajkumar d

a Department of Electronics & Communication Engineering, National Institute of Technology Calicut, India
b Department of Anatomy, Pondicherry Institute of Medical Sciences (PIMS), Kalapet Puducherry, India
c Department of Anatomy, Jawaharlal Institute of Postgraduate Medical Education & Research (JIPMER), Puducherry, India
d Department of Psychiatry, Jawaharlal Institute of Postgraduate Medical Education & Research (JIPMER), Puducherry, India
ARTICLE INFO

Article history:
Received 8 February 2016
Received in revised form 17 April 2016
Accepted 31 May 2016

Keywords:
Comet assay image analysis
CometQ
Comet segmentation
Comet quantification
Image enhancement

ABSTRACT

Background and objective: DNA damage analysis plays an important role in determining the approaches for treatment and prevention of various diseases like cancer, schizophrenia and other heritable diseases. Comet assay is a sensitive and versatile method for DNA damage analysis. The main objective of this work is to implement a fully automated tool for the detection and quantification of DNA damage by analysing comet assay images.

Methods: The comet assay image analysis consists of four stages: (1) classifier, (2) comet segmentation, (3) comet partitioning and (4) comet quantification. Main features of the proposed software are the design and development of four comet segmentation methods, and the automatic routing of the input comet assay image to the most suitable one among these methods depending on the type of the image (silver stained or fluorescent stained) as well as the level of DNA damage (heavily damaged or lightly/moderately damaged). A classifier stage, based on support vector machine (SVM), is designed and implemented at the front end to categorise the input image into one of the above four groups to ensure proper routing. Comet segmentation is followed by comet partitioning, which is implemented using a novel technique coined as modified fuzzy clustering. Comet parameters are calculated in the comet quantification stage and are saved in an Excel file.

Results: Our dataset consists of 600 silver stained images obtained from 40 schizophrenia patients with different levels of severity, admitted to a tertiary hospital in South India, and 56 fluorescent stained images obtained from different internet sources. The performance of "CometQ", the proposed standalone application for automated analysis of comet assay images, is evaluated by a clinical expert and is also compared with that of a recent related software, OpenComet. CometQ gave 90.26% positive predictive value (PPV) and 93.34% sensitivity, which are much higher than those of OpenComet, especially in the case of silver stained images. The results are validated using confusion matrix and Jaccard index (JI). Comet assay images obtained after DNA damage repair by incubation in the nutrient medium were also analysed, and CometQ showed a significant change in all the comet parameters in most of the cases.

Conclusions: Results show that CometQ is an accurate and efficient tool with good sensitivity and PPV for DNA damage analysis using comet assay images.

© 2016 Elsevier Ireland Ltd. All rights reserved.

* Corresponding author. Department of Electronics & Communication Engineering, National Institute of Technology Calicut, India. Tel.: +91 9496291852. E-mail address: [email protected] (S. Ganapathy).
http://dx.doi.org/10.1016/j.cmpb.2016.05.020
0169-2607/© 2016 Elsevier Ireland Ltd. All rights reserved.
1. Introduction

DNA damage analysis has great significance in the field of medical research. Human cells are constantly attacked by harmful agents generated by both exogenous and endogenous processes, which leads to DNA damage. The majority of this damage is induced by reactive oxygen species (ROS) and reactive nitrogen species (RNS). Such oxidative DNA damage is a critical risk factor for cancer, ageing, neurodegenerative diseases, heart diseases, Parkinson's disease, schizophrenia and Alzheimer's disease [1–5]. DNA damage analysis provides important information regarding early biological effects of hazardous chemicals and relevant information about the different stages of diseases. This information can be used by clinicians and pathologists for planning treatment and determining the best course of intervention. Hence, an accurate, fast and sensitive method for the analysis of DNA damage is in high demand [6].

Comet assay, or single-cell gel electrophoresis (SCGE), is a widely accepted method for DNA damage analysis [7,8]. The comet assay is well known for its simplicity, low cost, robust statistical analysis, requirement of a small number of cells per sample and high sensitivity for detecting low levels of DNA damage. It finds extensive use in testing new chemicals for genotoxicity, monitoring environmental contamination with genotoxins, human biomonitoring and molecular epidemiology, diagnosis of genetic disorders and fundamental research in DNA damage and repair.

The comet assay procedure starts with separating the lymphocytes from blood samples. The cells are embedded in agarose gel and subjected to lysis. The slides are then incubated in alkaline electrophoresis buffer for alkaline unwinding, followed by electrophoresis. During electrophoresis, negatively charged DNA particles move towards the anode and form a comet-like structure whose shape depends on the level of DNA damage. After neutralisation, slides are stained with a suitable DNA-binding dye for proper visualisation of DNA through a fluorescent or optical microscope. Different variations of this procedure are available to detect different types of DNA damage [9].

DNA damage is quantified by evaluating various parameters [10] of the individual comets from the comet assay images. Comets can be scored either manually or with software tools. In the manual method, comets are scored by measuring the length of a comet tail using a photomicrograph or a graticule, which is laborious and gives only limited information [11]. Visual scoring is another manual approach. Both methods are time consuming and require skilled operators. Nevertheless, visual scoring is used as a benchmark to validate the results obtained through automated algorithms.

Semi-automated [12–15] and fully automated [16–23] tools are available for comet scoring. CASP [14,15] and Comet Score are two semi-automated tools freely available on the Internet. With these tools too, the need for experts at different stages makes the technique user dependent, tedious and time consuming. Most of the automated tools are integrated with an in-house microscope set-up, and hence are not freely accessible. They are very expensive and the source code cannot be modified. Moreover, most of the algorithms, except CellProfiler and OpenComet, are developed for fluorescent stained images.

This paper presents CometQ, an automated tool for the detection and quantification of DNA damage by analysing comet assay images. The system consists of four main stages: (1) classifier, (2) comet segmentation, (3) comet partitioning and (4) comet quantification. CometQ is a robust algorithm which automatically identifies the type of input image (fluorescent or silver stained) as well as the level of DNA damage (heavily damaged or lightly/moderately damaged) present in the image, and directs it to the appropriate segmentation method for the identification of actual comets. A classifier stage, based on SVM, is designed and implemented at the front end to categorise the input image into one of the above four groups to ensure proper routing. Comet segmentation is followed by comet partitioning, which is implemented using a novel technique coined as modified fuzzy clustering. Comet parameters are calculated in the comet quantification stage and are saved in an Excel file. CometQ is an open source standalone application developed using MATLAB. It is validated using silver stained images from 40 schizophrenia patients with different severity levels and 56 fluorescent stained images collected from Internet sources. Patients with schizophrenia have greater DNA damage than the normal population due to various mechanisms involving the redox status in these patients [24,25], and hence are chosen as the main cases in this study. Even though this package is specifically developed for clinical applications, it can be used in other research applications where DNA damage analysis is performed using comet assay images.

The rest of the paper is organised as follows. Section 2 discusses the methods used for the implementation of CometQ, and describes the CometQ user interface, the comet assay procedure and the algorithm developed for comet assay image analysis. Section 3 elaborates the performance of CometQ. Section 4 highlights the salient features of the proposed method and compares it with the most recent related work. The paper concludes with a brief summary in Section 5.
2. Methods

2.1. CometQ user interface

Fig. 1 shows the user interface for CometQ. The user first selects the images for analysis (any number of images from a folder). Common image file formats such as BMP, TIFF and JPG are supported. Next, the user selects an output directory and a file name for storing the comet parameters in MS-Excel format. To keep the output robust, the user can also select the objective magnification. An error message pops up if any of the input fields is left unselected. Once the required inputs are selected, the analysis is started by pressing the "Run" button. The output image files and the Excel sheets are automatically saved in the selected output folder.
Fig. 1 – User interface of CometQ.

2.2. Comet assay procedure

After obtaining written and informed consent, blood samples were drawn from a peripheral vein of confirmed cases of schizophrenia, aged between 18 and 65 years and admitted to the psychiatry ward of a tertiary care hospital in South India, and lymphocytes were separated by centrifugation. The first step in the comet assay procedure is the preparation of homogeneous slides to keep the noise level as low as possible. A three-layer slide is used in this work. Initially, the slide was coated with 0.75% Normal Melting Point Agarose (NMPA) and dried at room temperature. After solidification, lymphocytes suspended in 0.5% Low Melting Point Agarose (LMPA) were uniformly spread over the NMPA coat and allowed to solidify at 4 °C in a refrigerator. After the lymphocyte-LMPA cell layer had formed, 0.5% LMPA was added to the agarose gel mixture and the slide was placed in a refrigerator for 10 to 15 min for solidification. To examine DNA repair, the remaining cells were incubated for 2 h in a nutrient medium containing Roswell Park Memorial Institute (RPMI) 1640 Medium with L-glutamine (Sigma), fetal bovine serum suitable for cell culture F2442 (Sigma), and penicillin and streptomycin (Sigma). The suspension was again centrifuged and the cell pellet was subjected to the conventional comet assay.

In the next step, lysis, the slides were placed in a precooled lysis solution for a minimum of one hour at 4 °C. The slides were then placed inside a horizontal submarine gel electrophoresis system. To unwind the DNA strands and expose the negatively charged alkali labile sites, the slides were left in the alkaline buffer for 30 min; this process is called alkali unwinding. The cells were then electrophoresed under alkaline conditions at 25 V and 300 mA (0.74 V/cm) for 30 min. After electrophoresis, the alkali in the gel was neutralised by rinsing the slides with neutralisation buffer. The comets were then fixed and stained using silver nitrate. The silver stained slides were viewed under a bright field light microscope (Olympus BX43, 20x magnification) and the images were captured using a CCD camera.
2.3. Algorithm for comet assay image analysis

The algorithm used in CometQ is depicted in Fig. 2. It has four major blocks: (1) a classifier, which sorts the images into four categories; (2) a comet segmentation block, which segments the actual comets for parameter quantification from the input image using the most suitable of four segmentation methods, selected according to the type of image and the type of comets present in it; (3) a comet partitioning block, where each comet is partitioned into head and tail regions; and (4) a comet quantification block, where comet parameters are calculated and saved in an Excel file. Each block is described in the following subsections.
2.3.1. Classifier

CometQ is capable of analysing both fluorescent and silver stained images. Initially, a classifier (Classifier 1 in Fig. 2a) separates the images into two classes: silver stained and fluorescent stained. Classifiers 2 and 3 then divide each class into two further groups: images having lightly or moderately damaged cells and images having heavily damaged cells. The final output of the classifier stage therefore has four groups: groups 1 and 2 correspond to silver stained images having lightly or moderately damaged cells and heavily damaged cells, respectively, and groups 3 and 4 to the corresponding fluorescent stained images. The performance of a comet segmentation algorithm can vary with the type of image and with the type of comets (depending on the level of DNA damage) present in it. Therefore, the best comet segmentation algorithm can be selected based on the group to which an image belongs.

All three classifiers are based on SVM. Classifier 1 identifies the type of image. The two types of images are linearly separable in terms of mean intensity and standard deviation, and hence these are selected as features for this classifier. Based on these features, an SVM is trained to classify each image as either silver stained or fluorescent stained. Classifiers 2 and 3 further classify the silver stained and fluorescent stained images, respectively, into two groups based on the types of comets. The features considered for these classifiers are mean intensity, standard deviation, entropy, energy and mean eccentricity of comets. Different combinations of these features were tested and the best feature was found to be the mean eccentricity of comets. With this feature, the two SVMs are trained to classify images of each type into one of two groups: images having lightly/moderately damaged cells or images having heavily damaged cells. Hence, the final output of the classifier stage has four classes (Class 1: silver stained images with lightly or moderately damaged cells; Class 2: silver stained images with heavily damaged cells; Class 3: fluorescent stained images with lightly or moderately damaged cells; and Class 4: fluorescent stained images with heavily damaged cells).

Fig. 2 – Algorithm of CometQ: (a) the major steps involved in the comet assay image analysis; (b) the detailed steps of the four methods in the comet segmentation algorithm.
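The two-level routing described in this subsection can be sketched as follows. This is only an illustration under stated assumptions, not the CometQ implementation: the label strings, the `route_image` function and the threshold-based stand-in classifiers (with invented cut-off values) are hypothetical, whereas in CometQ each stage is a trained SVM.

```python
# Sketch of the classifier cascade: Classifier 1 picks the stain type, then
# Classifier 2 (silver) or Classifier 3 (fluorescent) picks the damage level.
# All names, labels and thresholds below are hypothetical placeholders.

CLASS_OF = {
    ("silver", "light_moderate"): 1,
    ("silver", "heavy"): 2,
    ("fluorescent", "light_moderate"): 3,
    ("fluorescent", "heavy"): 4,
}


def route_image(features, classifier1, classifier2, classifier3):
    """Return the class (1-4) that decides which segmentation method is used."""
    stain = classifier1(features)
    damage_classifier = classifier2 if stain == "silver" else classifier3
    damage = damage_classifier(features)
    return CLASS_OF[(stain, damage)]


# Stand-in decision rules (NOT trained SVMs); the cut-off values are invented.
def stain_rule(features):
    return "silver" if features["mean_intensity"] > 0.5 else "fluorescent"


def damage_rule(features):
    return "heavy" if features["mean_eccentricity"] > 0.8 else "light_moderate"


example = {"mean_intensity": 0.8, "std_intensity": 0.1, "mean_eccentricity": 0.9}
print(route_image(example, stain_rule, damage_rule, damage_rule))  # prints 2
```

Keeping the class lookup separate from the classifiers means the stand-in rules could be swapped for trained models without touching the routing logic.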
2.3.2. Comet segmentation

In comet segmentation, the actual comets are segmented from the image. Once the images are classified, the algorithm automatically directs each image to the most suitable comet segmentation method. The four segmentation methods, one for each class of images, are described in the following paragraphs. The major steps involved in these methods are shading removal, image enhancement, thresholding, noise removal and detection of actual comets. During the study, it was observed that, for different types of images, certain combinations of enhancement and thresholding methods give better segmentation results.
2.3.2.1. Method 1. Method 1 is suitable for silver stained images with lightly or moderately damaged cells; its block-level representation is shown in Fig. 2b. The images under consideration are microscopic images and may therefore be affected by shading, which should be removed for accurate comet parameter estimation. Gray-scale morphological bottom-hat transformation with a disk-shaped structuring element of radius 200 pixels is used for shading removal in all four methods. In silver stained images, the comet tails may merge with the background. Homomorphic filtering helps to simultaneously compress the intensity range and enhance the contrast of the image [26]. Otsu's thresholding method [27] is then used for binarisation. The thresholded image may contain noise in addition to comets. The background noise in silver stained images is much higher than in fluorescent stained images, so extra steps are required to remove it. In the noise removal stage, simple morphological operations are used to smooth the contours and eliminate noise. The first is a closing operation, which smooths the contours, eliminates small holes and fills small gaps in the object boundaries. A filling operation then fills possible holes in heavily damaged cells. Finally, an opening operation eliminates noise in the background. A disk of radius 4 pixels is used as the structuring element for these morphological operations. A morphological thickening operation is also incorporated to ensure proper selection of comets without tail loss. Even after noise removal, the resultant image may contain boundary objects, overlapping comets and artefacts, which must be eliminated before comet scoring. Comets touching the image boundaries are detected and eliminated using connected component analysis. Overlapping comets are removed by a thresholding operation. Artefacts are removed either by profile analysis or by an SVM classifier: the SVM classifier [28] is used for images captured at 20x magnification, and profile analysis [23] for other images.

2.3.2.2. Method 2. The most appropriate comet segmentation method for silver stained images with heavily damaged cells is Method 2 (see Fig. 2b). It differs from Method 1 in the image enhancement and thresholding techniques. The tail region is much smaller in Class 1 images than in Class 2 images. The background substitution method [23] is used here for image enhancement. The combination of two contrast enhancement stages together with a Gaussian filter clearly demarcates the comets from the background. This method is essential for such images to minimise tail loss. The distance regularised level set evolution (DRLSE) method [28] is used for thresholding. This is a more general approach, and its advantage is that the noise removal step can be eliminated. DRLSE is an iterative process in which the evolution of the surface slows down when it reaches the object boundary in the image. With the background substitution method, DRLSE is able to stop at 710 iterations irrespective of the size of the image. Artefact removal is done in the same way as in Method 1.
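Otsu's method, used for binarisation in Method 1, chooses the grey level that maximises the between-class variance of the image histogram. The following is a minimal pure-Python sketch of that standard technique, not the CometQ code; the function name `otsu_threshold` and the 256-bin histogram input are assumptions made for illustration.

```python
def otsu_threshold(hist):
    """Return the grey level t that maximises the between-class variance.

    hist[i] is the number of pixels with grey level i. Pixels with level <= t
    form one class and pixels with level > t form the other.
    """
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0        # number of pixels in the lower class
    sum0 = 0.0    # sum of grey levels in the lower class
    for t, count in enumerate(hist):
        w0 += count
        sum0 += t * count
        if w0 == 0:
            continue          # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break             # upper class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mu0 - mu1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


# Toy bimodal histogram: 50 pixels at level 10 and 50 pixels at level 200.
toy_hist = [0] * 256
toy_hist[10] = 50
toy_hist[200] = 50
print(otsu_threshold(toy_hist))  # prints 10: everything above level 10 is the other class
```

Which side of the threshold counts as comet depends on the staining (dark comets on a bright field versus bright comets on a dark field), so the sketch only returns the threshold and leaves the polarity to the caller.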
2.3.2.3. Method 3. Comets in fluorescent stained images with lightly or moderately damaged cells are well segmented by Method 3 (see Fig. 2b). A combination of the background substitution method for image enhancement and Otsu's thresholding method is found to be highly suitable for this type of image. The noise removal block is the same as in Method 1. Profile analysis [23] is used for artefact removal.

2.3.2.4. Method 4. For fluorescent stained images with heavily damaged cells, the best comet segmentation is obtained with Method 4, whose block-level representation is shown in Fig. 2b. Here, the modified adaptive gamma correction with weighting distribution (MAGCWD) [29] method developed by the authors is used for enhancement. This enhancement method is very effective in segmenting the comets with minimum tail loss. MAGCWD is formulated as in Eq. (1):

$$T(l) = l_{\max}\left(\frac{l}{l_{\max}}\right)^{(1.5 - \mathrm{cdf}(l))\,p} \qquad (1)$$

where l is the intensity level, l_max is the maximum intensity level of the input image, cdf is the cumulative distribution function with weighting distribution, and p is an approximation parameter which is calculated automatically from the probability of zero intensity. The intensity l of each pixel in the input image is transformed to T(l) by Eq. (1). This transformation makes dark pixels darker and light pixels lighter, which helps to demarcate the comets clearly from the background with minimum tail loss.

For heavily damaged cells with long tails, the intensity variation is large compared with moderately damaged cells. Hence, the fuzzy c-means (FCM) algorithm with 3 clusters is found to produce better thresholding and segmentation for Class 4 images. Noise removal is the same as in Method 1, and profile analysis is used for artefact removal.
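The effect of the Eq. (1) transformation on dark and bright pixels can be illustrated with a short numerical sketch. Several things here are assumptions rather than details from the paper: the exponent is read as the product (1.5 - cdf(l)) * p, the function name `magcwd_transform` is hypothetical, and the weighted cdf values and p are supplied by hand instead of being computed from an image.

```python
def magcwd_transform(l, l_max, cdf_l, p):
    """Map intensity l to T(l) following Eq. (1).

    Assumption: the exponent is the product (1.5 - cdf(l)) * p. cdf_l is the
    (weighted) cumulative distribution value at level l, supplied by the caller.
    """
    gamma = (1.5 - cdf_l) * p
    return l_max * (l / l_max) ** gamma


l_max = 255
p = 1.0  # illustrative value only; the paper derives p from the image

# A dark pixel sits low in the distribution (small cdf): exponent > 1, so it gets darker.
dark_out = magcwd_transform(64, l_max, cdf_l=0.2, p=p)

# A bright pixel sits high in the distribution (cdf near 1): exponent < 1, so it gets brighter.
bright_out = magcwd_transform(200, l_max, cdf_l=0.9, p=p)

print(dark_out < 64, bright_out > 200)  # prints True True
```

With these hand-picked values the exponent is 1.3 for the dark pixel and 0.6 for the bright one, which reproduces the behaviour described in the text: dark pixels are pushed darker and bright pixels brighter.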
2.3.3. Comet partitioning

Once comet segmentation is over, the resultant images contain only actual comets. From such an image, individual comets are cropped out and their orientation is corrected. The next step is to partition each comet into head, halo and tail regions. The FCM algorithm is adopted for this, since a clear boundary cannot be defined for these regions. To set the number of clusters in the FCM algorithm, 300 comets with different levels of DNA damage were analysed, and it was observed that the head region is identified more precisely with four clusters. Hence, four clusters are defined: head, halo, tail and background [23]. After fuzzy clustering, defuzzification is done with the "maximum method", where each pixel is assigned to the cluster for which it has the maximum membership value. This approach gives good results for moderately damaged cells, but for heavily damaged cells the tail intensity may be higher and the algorithm assigns these pixels to the head/halo region, which degrades the results. Therefore, an additional processing step is applied to the FCM clustered image IL to rectify this problem. The objective of this step is to reassign the scattered head/halo pixels to the tail region and define new clusters for these regions. A new clustering and partitioning algorithm [23] proposed by the authors is employed here. A brief explanation of the algorithm is given in the following paragraphs.
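The "maximum method" of defuzzification mentioned above amounts to an arg-max over each pixel's cluster memberships. The sketch below is illustrative only: the function name `defuzzify_max`, the flat list-of-lists layout and the ordering of the four clusters are assumptions, and the membership values are made up rather than produced by an actual FCM run.

```python
# Hypothetical cluster ordering, chosen for this example only.
CLUSTERS = ("head", "halo", "tail", "background")


def defuzzify_max(memberships):
    """Assign each pixel to the cluster with the largest membership value.

    memberships: one list per pixel, holding one membership value per cluster.
    Returns a list of cluster indices, one per pixel.
    """
    labels = []
    for pixel_memberships in memberships:
        best = max(range(len(pixel_memberships)), key=pixel_memberships.__getitem__)
        labels.append(best)
    return labels


# Made-up memberships for three pixels (each row sums to 1).
fuzzy = [
    [0.70, 0.20, 0.05, 0.05],   # mostly head
    [0.10, 0.15, 0.65, 0.10],   # mostly tail
    [0.05, 0.05, 0.10, 0.80],   # mostly background
]
labels = defuzzify_max(fuzzy)
print([CLUSTERS[i] for i in labels])  # prints ['head', 'tail', 'background']
```

Because the assignment looks only at each pixel's own memberships, a bright tail pixel in a heavily damaged comet can still be labelled head or halo, which is the failure case the additional clustering step is meant to correct.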
2.3.3.1. Head clustering. From the labelled image IL, the head region alone is extracted, and morphological opening and filtering operations are applied to eliminate small areas. If multiple heads are present, the leftmost object is assigned as the final head, HF.

2.3.3.2. Halo clustering. From the labelled image, the head and halo regions together are extracted and a closing operation is applied to remove small areas. The resultant image is Ha. If multiple objects are present, the halo region to be retained is the one whose centroid most closely matches the centroid of the final head identified in the previous stage. The final halo region, HaF, is extracted by subtracting HF from this resultant image, Ha.

2.3.3.3. Tail clustering. From the labelled image, the head, halo and tail regions together are extracted and a filling operation is carried out to eliminate small holes in the image. The new tail region, TF, is then obtained by subtracting Ha from this resultant image. The next step is to combine the extracted clusters to form a new labelled image, ILnew.
2.3.4. Comet quantification

In this stage, the halo region is assigned to the tail region, which gives a more accurate value of DNA (%) in tail for heavily damaged cells. In undamaged cells (control cells), the clusters look like concentric circles around the head region, so the area will be approximately double the expected value. DNA (%) in tail would therefore be reported as very high, contrary to the expected low value. To address this problem, ILnew is partitioned into head and tail regions. The head centroid and head radius are found from the head cluster HF. The radius is added to the head centre in the x direction to obtain the boundary point of the head region, and a vertical partition is made at this pixel coordinate. All pixels lying to the left of the partition are considered as head and those to the right as tail. Comet parameters are calculated based on these head and tail regions.

2.4. Validation

The comet segmentation results are validated using a confusion matrix and the Jaccard index (JI). The confusion matrix indicates how effectively the proposed method detects the actual comets, and JI indicates how well the comets are segmented from the background. The true comets, manually marked by the clinical expert, are used as ground truth data.

2.4.1. Confusion matrix

The total number of comets accepted by the expert is taken as the true number of comets (TNC). A comet is considered to be correctly detected if it is also marked by the clinical expert. In addition to true positive (TP, the number of correctly detected comets), false negative (FN, the number of comets not detected) and false positive (FP, the number of falsely detected comets), the other parameters calculated are TP(%), FN(%), FP(%), PPV and sensitivity. TP(%), FN(%) and FP(%) are calculated as percentages of TNC [23].

2.4.2. Jaccard index

This test measures the similarity between the manually marked comets in the reference image, IRef, and the segmented comets in the test image, ITest. JI is defined as in Eq. (2) and its value lies between zero and one. A JI of one indicates an exact match between the two sets and zero indicates two disjoint sets.

$$JI(I_{Ref}, I_{Test}) = \frac{|I_{Ref} \cap I_{Test}|}{|I_{Ref} \cup I_{Test}|} \qquad (2)$$

Fig. 3 – Interpretation of box-and-whisker plot.
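The validation measures can be written out as a short sketch. PPV and sensitivity are computed with their standard definitions, TP/(TP + FP) and TP/(TP + FN), and TNC is taken as TP + FN; the function names are hypothetical. The counts in the example (TP = 42, FN = 1, FP = 7) are not stated in the paper: they are inferred here from the percentages reported for Case 1 of the proposed method (TNC = 43) in Table 1.

```python
def detection_metrics(tp, fn, fp):
    """Detection measures from counts; percentages are relative to TNC = TP + FN."""
    tnc = tp + fn
    return {
        "TP (%)": 100.0 * tp / tnc,
        "FN (%)": 100.0 * fn / tnc,
        "FP (%)": 100.0 * fp / tnc,
        "PPV": 100.0 * tp / (tp + fp),
        "Sensitivity": 100.0 * tp / (tp + fn),
    }


def jaccard_index(ref_pixels, test_pixels):
    """Jaccard index of two sets of (row, col) pixel coordinates, as in Eq. (2)."""
    union = ref_pixels | test_pixels
    if not union:
        return 1.0  # convention chosen for this sketch: two empty masks match
    return len(ref_pixels & test_pixels) / len(union)


# Counts inferred from the Case 1 percentages of the proposed method (TNC = 43).
case1 = detection_metrics(tp=42, fn=1, fp=7)
print({name: round(value, 2) for name, value in case1.items()})

# Two tiny masks sharing 2 of 4 distinct pixels.
ref = {(0, 0), (0, 1), (1, 1)}
test = {(0, 1), (1, 1), (2, 2)}
print(jaccard_index(ref, test))  # prints 0.5
```

Note that FP(%) is expressed relative to TNC rather than to the number of detections, so it can exceed 100% when a method reports many more false comets than there are true ones.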
2.5. Interpretation of a box-and-whisker plot

A box-and-whisker plot shows the distribution of a set of data. The plot divides the data into four parts using the median and quartiles. Fig. 3 shows the box-and-whisker diagram with all the important measurements. The plot consists of two boxes and two whiskers. The maximum and minimum values of the data are shown at the ends of the whiskers. The start of the green box is the lower quartile (Q1), or 25th percentile. The end of the purple box is the upper quartile (Q3), or 75th percentile. The line joining the green and purple boxes is the median (Q2). The dot indicates the mean value of the dataset. The range of the data is the difference between the maximum and minimum values in the dataset. The interquartile range (IQR) is defined as Q3 - Q1. The green box indicates the lower quartile range (LQR), Q2 - Q1, and the purple box indicates the upper quartile range (UQR), Q3 - Q2.
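The quantities read off such a plot can be computed with Python's standard `statistics` module, as in the sketch below. The function name `box_plot_summary` is hypothetical, and the "inclusive" quartile convention is an assumption: several quartile conventions exist and the paper does not say which one its plots use.

```python
import statistics


def box_plot_summary(data):
    """Return the measurements shown in a box-and-whisker plot."""
    # Quartile convention is an assumption; "inclusive" treats data as the full population.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "min": min(data),
        "Q1": q1,
        "median": q2,
        "Q3": q3,
        "max": max(data),
        "mean": statistics.mean(data),
        "range": max(data) - min(data),
        "IQR": q3 - q1,   # interquartile range
        "LQR": q2 - q1,   # lower quartile range (green box)
        "UQR": q3 - q2,   # upper quartile range (purple box)
    }


summary = box_plot_summary([1, 2, 3, 4, 5])
print(summary["Q1"], summary["median"], summary["Q3"], summary["IQR"])  # prints 2.0 3.0 4.0 2.0
```

By construction LQR + UQR equals IQR, so the two coloured boxes together span the middle half of the data.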
3. Results

3.1. Classifier

From our dataset, one hundred images, comprising both fluorescent and silver stained images, are selected for training. The classifier is tested with the remaining data, and an accuracy of 100% is obtained for Classifier 1. Classifier 2 is trained with silver stained images alone and Classifier 3 with fluorescent stained images alone. For selecting the best model, Classifiers 2 and 3 are trained using both a linear kernel and a radial basis function (RBF) kernel with different combinations of features. Fig. 4 shows the receiver operating characteristic (ROC) plots of Classifiers 2 and 3 with different numbers of features. From this figure, it is clear that maximum accuracy is obtained for those SVMs (linear and RBF kernels) that are trained with the mean eccentricity of comets. Hence, a linear kernel with eccentricity as the feature is selected for the SVM model. An accuracy of 100% is obtained with both classifiers. 10-fold cross-validation is used for SVM model selection.

Fig. 4 – ROC plot of Classifiers 2 and 3 with different combinations of kernel functions and features: (a) ROC plot of Classifier 2 with linear kernel and different combinations of features; (b) ROC plot of Classifier 2 with RBF kernel and different combinations of features; (c) ROC plot of Classifier 3 with linear kernel and different combinations of features; and (d) ROC plot of Classifier 3 with RBF kernel and different combinations of features. Legend: F stands for feature and the number (e.g. 5 in F5) indicates the number of features used. F5 (all five features: mean, standard deviation, energy, entropy and mean eccentricity of comets), F4 (all the features except mean eccentricity), F2 (standard deviation and eccentricity of comets), and F1 (mean eccentricity of comets).
3.2. Comet segmentation

During the study, it was observed that, for different types of images, certain combinations of enhancement and thresholding methods give better segmentation results. This motivated us to design a scheme for choosing the best combination based on the type of image and the level of DNA damage. The three enhancement methods used are the background substitution method (B), the homomorphic filtering method (H) and the MAGCWD method (M). The three thresholding methods used are Otsu's method (O), the distance regularised level set evolution method (D) and FCM based thresholding (F). The two methods used for artefact removal are profile analysis (P) and the SVM based method (S). Different combinations of these methods are tested and the most suitable combination is identified for each type of image, based on the performance indices TP (%) and Jaccard index. 100 silver stained images and 56 fluorescent stained images are considered for evaluating the performance indices. Fig. 5 shows the performance of different combinations of methods for each class of images. It is evident from Fig. 5 that the best segmentation performance, in terms of both TP (%) and JI, for classes 1, 2, 3 and 4 is obtained with HOS, BDS, BOP and MFP, respectively. Hence, the proposed scheme is designed to switch automatically to one of these segmentation methods based on the type of input image.

The overall performance of the proposed method is compared with that of the OpenComet software in Table 1, based on the indices TP (%), FN (%), FP (%), PPV and sensitivity. Due to space constraints, only the parameters obtained for 56 fluorescent stained images and 100 silver stained images (5 schizophrenia cases) with different levels of DNA damage are shown in this table. The mean TP (%), PPV, sensitivity, FN (%) and FP (%) obtained for the five schizophrenia cases are 96.13%, 91.61%, 96.13%, 3.87% and 9.03%, respectively. High values of TP (%), PPV and sensitivity and low values of FN (%) and FP (%) indicate better performance. The mean values of TP (%), PPV, sensitivity, FN (%) and FP (%) calculated over all the data under study are 93.34%, 90.26%, 93.34%, 6.66% and 11.82%, respectively.
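The five-case means quoted above can be reproduced from the proposed-method (PM) columns of Table 1. The sketch below simply averages the tabulated values for Cases 1 to 5; the variable names are illustrative, and sensitivity is omitted because its column is identical to TP (%).

```python
# PM values for Cases 1-5, copied from Table 1.
pm_columns = {
    "TP (%)": [97.67, 99.04, 98.68, 94.74, 90.54],
    "PPV": [85.71, 89.57, 94.94, 97.30, 90.54],
    "FN (%)": [2.33, 0.96, 1.32, 5.26, 9.46],
    "FP (%)": [16.28, 11.54, 5.26, 2.63, 9.46],
}

# Arithmetic mean of each column, rounded to two decimals as in the text.
means = {name: round(sum(values) / len(values), 2) for name, values in pm_columns.items()}
print(means)  # prints {'TP (%)': 96.13, 'PPV': 91.61, 'FN (%)': 3.87, 'FP (%)': 9.03}
```

These are unweighted means over cases, so each case counts equally regardless of how many comets it contains.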
150
computer methods and programs in biomedicine 133 (2016) 143–154
Fig. 5 – Performance comparison of different methods in terms of TP (%) and JI for each class of images: (a), (b), (c) and (d) shows the comparison of class 1, class 2, class 3 and class 4 type images, respectively. The different methods tried with class 1 and 2 images are OC (OpenComet [22]) HOS, BOS, BDS, MOS, MFS. The different methods tried with class 3 and 4 images are OC (OpenComet [22]) HOP, BOP, BDP, MOP, MFP. (H—Homomorphic method, B—Background substitution method, M—Modified AGCWD method, O—Otsu’s method, D—DRLSE method, F—FCM algorithm, P—Profile analysis and S—SVM classifier). The green and orange box indicates the lower quartile range and, purple and blue box indicates the upper quartile range of TP (%) and JI, respectively.
Table 1 – Performance analysis of the proposed method (PM) with that of OpenComet (OC) software for silver stained and fluorescent stained images (Fl images).

Cases      TNC   TP (%)          FN (%)          FP (%)          PPV             Sensitivity
                 OC      PM      OC      PM      OC      PM      OC      PM      OC      PM
Case 1     43    18.6    97.67   81.4    2.33    558.14  16.28   32.00   85.71   18.6    97.67
Case 2     104   92.31   99.04   7.69    0.96    98.08   11.54   48.50   89.57   92.31   99.04
Case 3     76    44.74   98.68   55.26   1.32    223.68  5.26    16.70   94.94   44.74   98.68
Case 4     38    78.95   94.74   21.05   5.26    157.89  2.63    33.30   97.30   78.95   94.74
Case 5     74    85.14   90.54   14.86   9.46    151.35  9.46    36.00   90.54   85.14   90.54
Fl images  172   63.37   83.20   36.63   16.80   3.49    3.26    94.78   96.20   63.37   83.20

3.3. Comet partitioning and quantification
The comet partitioning and quantification methods explained in our previous work [23] are adopted here. The percentage of DNA in the tail is used for comet quantification. Forty cases of clinically confirmed schizophrenia with different levels of clinical severity have been analysed. Fig. 6 shows the box-and-whisker plot of DNA (%) in tail determined using OpenComet software v1.3 [22] and the proposed software CometQ for five cases. They are listed in increasing order of percentage DNA in tail. The score obtained by visual analysis is marked on the x axis. In visual scoring, comets are classified into five groups according to the degree of damage (0 = no damage, 1 = minimal damage, 2 = mild damage, 3 = moderate damage and 4 = severe damage). Here, 50 cells per slide are scored. Each of the fifty comets is assigned a number from zero to four according to its level of DNA damage, and the total is taken as the score value. Hence, the final score lies between 0 and 200 (dimensionless) and is used for statistical and comparative analysis. From Fig. 6(a) and (b), it is evident that the proposed method shows a trend in DNA (%) in tail that is consistent with the conventional visual scoring method. Comet assay images obtained after DNA damage repair by incubation in the nutrient medium are also analysed, and CometQ showed a significant change in DNA (%) in tail (t-test, p < 0.0001) for most of the cases [30]. Fig. 7 is the box-and-whisker plot showing the DNA (%) in tail determined using CometQ for six cases before and after DNA damage repair. The cases are placed in increasing order of DNA damage and, for visual clarity, the DNA (%) in tail values obtained before and after incubation are placed side by side for each case. It can be observed that a significant reduction in DNA (%) in tail is obtained after DNA damage repair.
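The visual scoring rule described above (50 comets per slide, each graded 0 to 4, summed to a score between 0 and 200) amounts to the following arithmetic. This is only an illustrative sketch; the function name `visual_score` and the example grades are made up for demonstration and are not data from the study.

```python
def visual_score(grades, cells_per_slide=50):
    """Sum the per-comet damage grades (0 to 4) for one slide.

    Raises ValueError if the slide does not have the expected number
    of scored cells or if a grade lies outside 0..4.
    """
    if len(grades) != cells_per_slide:
        raise ValueError(f"expected {cells_per_slide} grades, got {len(grades)}")
    if any(grade not in (0, 1, 2, 3, 4) for grade in grades):
        raise ValueError("each grade must be an integer from 0 to 4")
    return sum(grades)


# Hypothetical slide: ten comets in each of the five damage groups.
example_grades = [0] * 10 + [1] * 10 + [2] * 10 + [3] * 10 + [4] * 10
print(visual_score(example_grades))  # 100
```

A slide with no damaged cells scores 0 and a slide with only severely damaged cells scores 200, which gives the 0 to 200 range stated in the text.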
Fig. 6 – Comparison of DNA (%) in tail using box-and-whisker plot for five cases: (a) DNA (%) obtained using OpenComet (v1.3); (b) DNA (%) obtained using CometQ. The x axis indicates the manual score obtained for each case in the increasing order of DNA damage. The green box indicates the lower quartile range and purple box indicates the upper quartile range.
4. Discussion
CometQ, a fully automated and efficient method for DNA damage analysis using comet assay images, is capable of analysing both silver stained and fluorescent stained images. The proposed software automatically selects the most suitable comet segmentation method depending on the type of input image and the level of DNA damage. After comet segmentation, the individual comets are partitioned into head, halo, tail and background using FCM. The output of FCM is then modified with clustering and partitioning algorithms to reduce it to head and tail regions. In the comet quantification stage, all the comet parameters are calculated and stored in an Excel file. The DNA (%) in tail is used for quantitative analysis of DNA damage. The performance of the proposed method is compared with that of OpenComet [22]. In OpenComet, silver stained images have to be inverted before analysis, whereas CometQ is capable of analysing these images directly. OpenComet is good at selecting comets with less tail loss, and its comet segmentation is mainly based on shape attributes. As a result, some noisy structures in silver stained images whose shapes resemble comets are also detected as actual comets. The profile analysis used by
Fig. 7 – DNA repair capacity measured using CometQ in schizophrenia patients: The x axis indicates the six schizophrenia cases before and after DNA damage repair. In C18(B/A), Cn stands for the individual case, “B” stands for before DNA repair and “A” stands for after DNA repair. The y axis indicates the percentage of DNA in tail. The green and orange boxes indicate the lower quartile range, and the purple and blue boxes indicate the upper quartile range.
OpenComet for finding the head region in certain comets is found to give inaccurate results, as is evident from the box-and-whisker plot. Therefore, to compare the proposed method and OpenComet, the objects wrongly detected by OpenComet are first deleted manually and the program is updated to recalculate the DNA (%) in tail. The performance of the proposed method is found to be better than that of OpenComet in terms of most of the performance indices (refer to Table 1 and Fig. 6). CometQ is capable of detecting ghost cells in both fluorescent stained and silver stained images. With the proposed new partitioning algorithm, the small head region of the ghost cells was successfully identified by CometQ in the case of silver stained images (refer to Fig. 8 (a) and (b)). However, it was not successful in some cases of fluorescent stained images having cells with a very faint and small head (refer to Fig. 8 (d)). This is because the head intensity is so low that the head is assigned to the halo region during fuzzy clustering. Even though the head detection is wrong, CometQ is successful in segmenting the comet together with its head region. For the same cell, OpenComet could not segment the comet with its head region (refer to Fig. 9). Fig. 8 illustrates some typical ghost cells present in fluorescent and silver stained images. The first and second cells in Fig. 8 are silver stained ghost cells, whereas the third and fourth cells are fluorescent stained
Fig. 8 – Typical examples of some ghost cells and their comet partitioning results: The first row of the figure illustrates four ghost cells. Among these, the first two cells are silver stained ghost cells and the last two are fluorescent stained ghost cells (gray scale images are shown). The second row shows the fuzzy c-means (FCM) clustered output. The third row shows the modified FCM output, and the last row shows the final result of the comet partitioning algorithm.
Fig. 9 – (a) A typical example of a fluorescent stained ghost cell, (b) the result of comet segmentation using OpenComet software, and (c) the result of the proposed method. This ghost cell is the same as the last cell in Fig. 8. The proposed method is successful in segmenting the comet with its head region.
ghost cells (gray scale versions of fluorescent stained images are shown in Fig. 8 (c) and (d)). Partitioning and quantification are successful for all the cells except the last cell shown in Fig. 8. One limitation of the proposed method is the computation time required for image analysis. The average time required to analyse an image is 12 s using the proposed method, whereas it is 2 s using OpenComet. Testing was done on an Intel Core i5-3210M CPU @ 2.50 GHz with 8 GB RAM. However, computational complexity is not an issue here, as the application does not demand real-time processing. Accuracy is more important as far as DNA damage detection and quantification are concerned.
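The fuzzy clustering step referred to in this discussion, in which each comet is first split into four intensity groups (head, halo, tail and background), can be sketched with a generic fuzzy c-means routine on scalar pixel intensities. This is a textbook FCM under stated assumptions, not the authors' implementation: the function names, the fuzzifier m = 2, the evenly spaced initial centres and the toy intensity values are all illustrative choices, and the features actually used by CometQ are not specified in this section.

```python
def memberships_for(values, centers, m=2.0):
    """Fuzzy membership of each value in each cluster (every row sums to 1)."""
    power = -2.0 / (m - 1.0)
    table = []
    for x in values:
        dists = [abs(x - c) for c in centers]
        if min(dists) == 0.0:
            # Value sits exactly on one or more centres: share membership among them.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            table.append([h / sum(hits) for h in hits])
        else:
            inv = [d ** power for d in dists]
            table.append([v / sum(inv) for v in inv])
    return table


def fuzzy_c_means_1d(values, n_clusters=4, m=2.0, max_iter=200, tol=1e-6):
    """Generic fuzzy c-means on scalar values; returns (centers, memberships)."""
    lo, hi = min(values), max(values)
    # Deterministic start: centres evenly spaced across the value range.
    centers = [lo + (hi - lo) * (k + 0.5) / n_clusters for k in range(n_clusters)]
    for _ in range(max_iter):
        u = memberships_for(values, centers, m)
        new_centers = []
        for k in range(n_clusters):
            weights = [row[k] ** m for row in u]
            weighted_sum = sum(w * x for w, x in zip(weights, values))
            new_centers.append(weighted_sum / sum(weights))
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships_for(values, centers, m)


# Toy intensities forming four well-separated groups (illustrative only).
pixels = [0, 1, 2, 50, 51, 52, 100, 101, 102, 200, 201, 202]
centers, u = fuzzy_c_means_1d(pixels)
labels = [row.index(max(row)) for row in u]  # hard label = most likely cluster
print([round(c, 1) for c in centers], labels)
```

Taking the cluster with the highest membership as a hard label mirrors the failure case described above: a very faint head has its largest membership in a neighbouring cluster and is therefore labelled as halo.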
5. Conclusion
A fully automated tool for DNA damage analysis for clinical applications, using both fluorescent stained and silver stained comet assay images, is developed and implemented in this work. The proposed software uses three SVM-based classifiers to categorize the input images into four classes: silver stained images with lightly or moderately damaged cells, silver stained images with heavily damaged cells, fluorescent stained images with lightly or moderately damaged cells and fluorescent stained images with heavily damaged cells. The algorithm automatically switches to the most suitable segmentation method based on the class of the image, which leads to better DNA damage quantification. The performance of the proposed method has been assessed by a clinical expert and found to be very useful for the analysis of DNA damage and repair in clinical research. The proposed method can also be used in other applications where DNA damage analysis is carried out using comet assay images, such as toxicology, pharmacogenomics, oncology, human epidemiology and biomonitoring.
REFERENCES
[1] E. Kadioglu, S. Sardas, S. Aslan, E. Isik, A.E. Karakaya, Detection of oxidative DNA damage in lymphocytes of patients with Alzheimer’s disease, Biomarkers 9 (2) (2004) 203–209.
[2] N. Kopjar, V. Garaj-Vrhovac, I. Milas, Assessment of chemotherapy-induced DNA damage in peripheral blood leukocytes of cancer patients using the alkaline comet assay, Teratog. Carcinog. Mutagen. 22 (1) (2002) 13–30.
[3] D. Psimadas, N. Messini-Nikolaki, M. Zafiropoulou, A. Fortos, S. Tsilimigaki, S.M. Piperakis, DNA damage and repair efficiency in lymphocytes from schizophrenic patients, Cancer Lett. 204 (1) (2004) 33–40.
[4] A.R. Collins, K. Rašlová, M. Somorovská, H. Petrovská, A. Ondrušová, B. Vohnout, et al., DNA damage in diabetes: correlation with a clinical marker, Free Radic. Biol. Med. 25 (3) (1998) 373–377.
[5] A.C. Andreazza, B. Noronha Frey, B. Erdtmann, M. Salvador, F. Rombaldi, A. Santin, et al., DNA damage in bipolar disorder, Psychiatry Res. 153 (1) (2007) 27–32.
[6] S.P. Jackson, J. Bartek, The DNA-damage response in human biology and disease, Nature 461 (7267) (2009) 1071–1078.
[7] O. Ostling, K. Johanson, Microelectrophoretic study of radiation-induced DNA damages in individual mammalian cells, Biochem. Biophys. Res. Commun. 123 (1) (1984) 291–298.
[8] N.P. Singh, M.T. McCoy, R.R. Tice, E.L. Schneider, A simple technique for quantitation of low levels of DNA damage in individual cells, Exp. Cell Res. 175 (1) (1988) 184–191.
[9] W. Liao, M.A. McNutt, W.-G. Zhu, The comet assay: a sensitive method for detecting DNA damage in individual cells, Methods 48 (1) (2009) 46–53.
[10] T. Kumaravel, B. Vilhar, S.P. Faux, A.N. Jha, Comet assay measurements: a perspective, Cell Biol. Toxicol. 25 (1) (2009) 53–64.
[11] A.R. Collins, A.A. Oscoz, G. Brunborg, I. Gaivão, L. Giovannelli, M. Kruszewski, et al., The comet assay: topical issues, Mutagenesis 23 (3) (2008) 143–151.
[12] J.-F. Rivest, M. Tang, J. McLean, F. Johnson, Automated measurements of tails in the single cell gel electrophoresis assay, in: Instrumentation and Measurement Technology Conference, 1996. IMTC-96. Conference Proceedings. ‘Quality Measurements: The Indispensable Bridge between Theory and Reality’, IEEE, Vol. 1, IEEE, 1996, pp. 111–114.
[13] C. Harrison, J. Pearson, R. Bilton, D. Burton, J. Roberts, The comet moment ratio and other parameters obtained by applying image processing techniques and feature extraction to the SCGE assay, in: Computer-Based Medical Systems, 1998. Proceedings. 11th IEEE Symposium on, IEEE, 1998, pp. 234–239.
[14] C. Helma, M. Uhl, A public domain image-analysis program for the single-cell gel-electrophoresis (comet) assay, Mutat. Res. 466 (1) (2000) 9–15.
[15] K. Końca, A. Lankoff, A. Banasik, H. Lisowska, T. Kuszewski, S. Góźdź, et al., A cross-platform public domain pc image-analysis program for the comet assay, Mutat. Res. 534 (1) (2003) 15–20.
[16] W. Böcker, W. Rolf, T. Bauch, W.-U. Müller, C. Streffer, Automated comet assay analysis, Cytometry 35 (2) (1999) 134–144.
[17] W. Frieauff, A. Hartmann, W. Suter, Automatic analysis of slides processed in the comet assay, Mutagenesis 16 (2) (2001) 133–137.
[18] G. Dehon, P. Bogaerts, P. Duez, L. Catoire, J. Dubois, Curve fitting of combined comet intensity profiles: a new global concept to quantify DNA damage by the comet assay, Chemometr. Intell. Lab. Syst. 73 (2) (2004) 235–243.
[19] G. Dehon, L. Catoire, P. Duez, P. Bogaerts, J. Dubois, Validation of an automatic comet assay analysis system integrating the curve fitting of combined comet intensity profiles, Mutat. Res. 650 (2) (2008) 87–95.
[20] M. Sansone, O. Zeni, G. Esposito, Automated segmentation of comet assay images using Gaussian filtering and fuzzy clustering, Med. Biol. Eng. Comput. 50 (5) (2012) 523–532.
[21] J. González, I. Romero, J. Barquinero, O. García, Automatic analysis of silver-stained comets by cellprofiler software, Mutat. Res. 748 (1) (2012) 60–64.
[22] B.M. Gyori, G. Venkatachalam, P. Thiagarajan, D. Hsu, M.-V. Clement, Opencomet: an automated tool for comet assay image analysis, Redox Biol. 2 (2014) 457–465.
[23] G. Sreelatha, A. Muraleedharan, P. Chand, R.P. Rajkumar, P.S. Sathidevi, Quantification of DNA damage by the analysis of silver stained comet assay images, IRBM 36 (5) (2015) 306–314 http://dx.doi.org/10.1016/j.irbm.2015.09.006.
[24] A. Jorgensen, K. Broedbaek, A. Fink-Jensen, U. Knorr, M.G. Soendergaard, T. Henriksen, et al., Increased systemic oxidatively generated DNA and RNA damage in schizophrenia, Psychiatry Res. 209 (3) (2013) 417–423 http://dx.doi.org/10.1016/j.psychres.2013.01.033.
[25] G. Dadheech, S. Mishra, S. Gautam, P. Sharma, Evaluation of antioxidant deficit in schizophrenia, Indian J. Psychiatry 50 (1) (2008) 16.
[26] G. Sreelatha, A. Muraleedharan, P. Chand, R.P. Rajkumar, P.S. Sathidevi, An improved automatic detection of true comets for DNA damage analysis, Procedia Comput. Sci. 46 (2015) 135–142.
[27] N. Otsu, A threshold selection method from gray-level histograms, IEEE Trans. Syst. Man Cybern. 9 (1) (1979) 62–66.
[28] G. Sreelatha, K.S. Manoj, P.S. Sathidevi, A level set approach combined with support vector machine for comet detection from comet assay images, J. Med. Imaging Health Inform. (2016) Accepted for publication.
[29] S.-C. Huang, F.-C. Cheng, Y.-S. Chiu, Efficient contrast enhancement using adaptive gamma correction with weighting distribution, IEEE Trans. Image Process. 22 (3) (2013) 1032–1041.
[30] A. Muraleedharan, V. Menon, R.P. Rajkumar, P. Chand, Assessment of DNA damage and repair efficiency in drug naïve schizophrenia using comet assay, J. Psychiatr. Res. 68 (2015) 47–53.