A shape-independent algorithm for fully-automated gridding of cDNA microarray images


ARTICLE IN PRESS

JID: CAEE

[m3Gsc;June 27, 2017;11:34]

Computers and Electrical Engineering 000 (2017) 1–16

Contents lists available at ScienceDirect

Computers and Electrical Engineering journal homepage: www.elsevier.com/locate/compeleceng

Hamidreza Saberkari∗, Mousa Shamsi, Habib Badri Ghavifekr

Department of Electrical Engineering, Sahand University of Technology, Tabriz, Iran

Article info

Article history: Received 4 November 2016; Revised 13 June 2017; Accepted 13 June 2017; Available online xxx

Keywords: Microarray; Gridding; Otsu thresholding; De-noising; Image processing

Abstract

In this paper, a fully-automated microarray gridding algorithm is presented. The algorithm comprises block finding in an image using a variable-length Blackman window, image contrast enhancement based on the Otsu thresholding approach, and identification of image objects, including spots and artifacts, through the 8-connected labeling method. Furthermore, a shape-independent algorithm based on the area of pixels in each object is proposed for noise elimination. The final gridding is performed using a new method based on the constructed spot matrix, and a refinement procedure is applied to minimize probable grid-line errors. The performance of the proposed algorithm is evaluated on five public datasets: the Swiss Institute of Bioinformatics (SIB), Joe DeRisi's individual tiff files (DeRisi), University of California, San Francisco (UCSF), Gene Expression Omnibus (GEO), and Stanford Microarray Database (SMD). The results show that, in comparison with other state-of-the-art methods, the proposed algorithm reaches a higher level of accuracy and is more stable against typical problems in microarray images such as noise, artifacts, and irregular spot shapes. © 2017 Elsevier Ltd. All rights reserved.

1. Introduction

Microarray technology allows simultaneous monitoring of the expression levels of thousands of genes in a single hybridization experiment [1]. Microarrays are widely used in biological research, such as sequencing, evaluating genetic mechanisms in living cells, comparing normal and cancerous tissues, fundamental studies on gene expression, and pharmaceutical and clinical investigations [2]. A microarray comprises thousands of spots, each of which holds an identified deoxyribonucleic acid (DNA) strand known as the DNA probe [3]. Two types of microarrays have the largest number of biological applications: arrays based on complementary DNA (cDNA) and oligonucleotide arrays, which are called oligo arrays. Each gene in a cDNA microarray is represented by a long strand (between 2000 and 5000 bps). A cDNA experiment involves two different samples, namely the test and reference samples, which are mixed together on one array. The test and reference samples are labeled with red and green fluorescent dyes (Cy5 and Cy3, respectively), which emit at different wavelengths. If a cDNA sample includes a strand complementary to the DNA probe, it binds to the corresponding spot. The cDNA samples that have found their complementary probes are hybridized on the array, while the non-hybridized ones are washed away. Next, the array is scanned by a laser to determine which sample has bound to a specific spot. When a hybridized microarray is scanned by red and

Reviews processed and recommended for publication to the Editor-in-Chief by Area Editor Dr. E. Cabal-Yepez. Corresponding author. E-mail addresses: [email protected] (H. Saberkari), [email protected] (M. Shamsi), [email protected] (H.B. Ghavifekr).

http://dx.doi.org/10.1016/j.compeleceng.2017.06.018 0045-7906/© 2017 Elsevier Ltd. All rights reserved.

Please cite this article as: H. Saberkari et al., A shape-independent algorithm for fully-automated gridding of cDNA microarray images, Computers and Electrical Engineering (2017), http://dx.doi.org/10.1016/j.compeleceng.2017.06.018


Fig. 1. The procedure of microarray image (MAI) analysis.

green wavelengths, two images are obtained. The fluorescence intensity ratio at each spot indicates the relative abundance of the DNA strand in the two mixed cDNA samples on the spot. A gene expression study is carried out by examining the ratio of the gene expression levels in the two images, namely Cy3 and Cy5. The expression level of a gene corresponds to the number of RNA copies in a cell and is directly related to the recombinant protein [4]. In microarray experiments, spot intensity represents either the impact of the hybridization process or the fluorescence emission on the slide surface. Therefore, background intensity correction plays a vital role in the estimation of spot intensities. Image processing is the first step in microarray data analysis; it is used to determine spot values and local background intensities. In an ideal microarray, spots are arranged within blocks in regular horizontal and vertical patterns. Some characteristics of an ideal microarray are as follows [5]:

- All sub-grids have identical dimensions.
- Spaces between sub-grids are equal.
- All spots have the same size and shape.
- There is no dust or contamination on the slide.
- Background intensity is low and uniformly distributed.

Generally, microarray image (MAI) analysis consists of three main steps (Fig. 1):

(i) Gridding [6]: The first step designates the positions of the sub-grids in the microarray and of the spots in the sub-grids. To do so, parameters such as the spacing between the rows and columns of the sub-grids, the distance between the rows and columns of the spots, and the average diameter of the spots should be estimated.
(ii) Segmentation [7]: This step classifies the pixels into foreground and background classes.
(iii) Intensity extraction [8]: In this step, the foreground and background intensities are extracted from their pixels. These are used to determine the expression levels of the corresponding genes.

Although gridding is the first step in MAI analysis, its result significantly influences the entire process. Various factors have negative impacts on gridding; these include inappropriate position of the sub-grid, microarray curvature, inappropriate distance between spots, misalignment between the red and green channels, array rotation in the image, and deviations in the symmetry owing to artifacts generated during the array scanning process [9]. In addition, many noise sources, such as Gaussian noise, can reduce the quality of the gridded image.

Gridding approaches are generally classified into two groups: holistic and spot-by-spot methods. In holistic methods, positions are assigned to each spot by considering some efficient paths in the image and drawing horizontal and vertical lines. Although this approach has low computational complexity, it is more suitable for ideal MAIs. Spot-by-spot gridding methods determine the position of each spot individually and then allocate a cell to each spot, even to non-hybridized spots (empty or dark spots). Spot-by-spot gridding suffers from a high level of computational complexity in high-density MAIs, as only a single spot can be detected in each iteration.
The methods based on holistic gridding can be divided into three sub-categories: manual [10], semi-automated [11], and fully-automated methods [12]. In manual gridding methods, such as in the ScanAlyze software [10], all parameters are set manually by the user. The major drawback of this approach is that it is time-consuming; thus, it is not applicable to high-density MAIs. Semi-automated microarray gridding approaches require some level of human intervention to locate the exact spot centers. Consequently, these methods are not suitable for high-throughput MAI processing. Some commercial packages for semi-automated gridding are ImaGene [13], Dapple [14], UCSF Spot [12], MAGIC [15], and Spot Finder [16]. In contrast, fully-automated methods exploit image processing techniques to compute important parameters automatically, such as the spot diameter, the spacing between spots, and the distance between sub-grids. Many automated gridding methods are based on the horizontal and vertical projections of the entire image, that is, the summation of the intensities along rows and columns. The main problem of such intensity projection is that it requires a large number of spots with high intensities. Thus, it is not stable against misalignment of the grids. To overcome this limitation, the projected signals are first smoothed with morphological or smoothing filters. In [17], a gridding method based on the morphological operators


Table 1. Datasets used to evaluate the performance of the proposed gridding algorithm [24].

| | SIB | DeRisi | UCSF | GEO | SMD |
|---|---|---|---|---|---|
| Data set name | Swiss Institute of Bioinformatics | Joe DeRisi individual | University of California, San Francisco | Gene Expression Omnibus | Stanford Microarray Database |
| Image format | Tiff | Tiff | Tiff | Tiff | Tiff |
| # of images | 14 | 14 | 2 | 13 | 22 |
| # of sub-grids | 56 | 56 | 72 | 464 | 528 |
| Spot layout | from 5×7 to 7×7 | 40×40 | 14×15 | 18×24 to 13×14 | 14×18 to 44×44 |
| Spot resolution | 18×18 | 8×8 | 8×8 | 12×12 | from 18×18 to 8×8 |
| Image resolution | 1000×1000 | 1024×1024 | 1512×1488 | from 1942×1802 to 2200×5997 | from 1910×5550 to 1024×1024 |

was proposed in which the blocks are well separated from each other. Nevertheless, this method suffers from two issues: (i) fluorescent noise can easily affect its performance, which leads to lower image quality; and (ii) the threshold level is taken as the average of the filtered signal, which is not an optimal value. A gridding technique based on the genetic algorithm (GA) was proposed in [18]. In this method, parallel, equally spaced line segments are determined, and then a refinement procedure is applied to optimize the position of the borders between adjacent spots. Maximizing the margins between rows and columns by using support vector machines (SVMs) is another approach, proposed in [19], for gridding MAIs. Initially, a set of grid-lines is placed on the image to isolate each pair of consecutive rows and columns. Then, the optimal position of the lines is determined by maximizing the margins between the rows and columns with a linear maximum-margin classifier.

Spot-by-spot gridding methods comprise three stages [20]: (i) Determining all the hybridized spots in the sub-grid, such that all objects with a high intensity level are identified using a simple histogram-based segmentation technique [21]. (ii) Estimating the position of spots with low-level hybridization from the geometry of the high-intensity spots. In this step, the estimated spot positions are improved by applying the Radon transform [22]. (iii) Designating an area around each spot for final gridding. A method for spot addressing in hexagonally structured MAIs based on the spot-by-spot procedure was presented in [23]. This method used the growing concentric hexagon (GCH) algorithm to detect the non-hybridized spots. Its drawback is its high sensitivity to the starting points. For a higher number of starting points, gridding accuracy decreases owing to incorrectly detected objects, and thus the algorithm fails to optimize the positions of non-hybridized spots. In [24], the image objects were determined using template matching. This process assumes that the selected template and the spots are approximately equal in size. However, spot sizes in different datasets may vary due to different levels of hybridization. Hence, this method has limited generality for microarray gridding.

The main purpose of this paper is to introduce a fully-automated microarray gridding algorithm. First, block finding in the MAI is conducted using a variable-length Blackman window. Then, image contrast enhancement is achieved with an algorithm based on the Otsu thresholding approach, which replaces the sub-grid pixels above the optimal threshold level. In the third step, all image objects, including spots and artifacts, are identified by the 8-connected labeling method. Next, noise elimination is carried out through the proposed shape-independent algorithm based on the area of pixels in each object. Because noise objects are typically smaller or larger than spots, they distort the average object size in the first quartile, the fourth quartile, or both, once the objects are sorted by size; which quartile is affected depends on the size of the noise in each sub-grid. Finally, gridding is performed with a new method based on the spot matrix, and a refinement procedure is applied to reduce the probability of wrong grid-lines.

The rest of the paper is organized as follows: in Section 2, the data and the block diagram of the proposed algorithm are described. Section 3 presents the designed user-friendly software package and compares the gridding accuracy of the proposed algorithm with that of state-of-the-art methods. Finally, the conclusion is provided in Section 4.

2. Materials and methods

We used five microarray datasets to evaluate the performance of the proposed gridding algorithm. Microarray blocks were stored in the tagged image file format (TIFF), and each pixel in the array has a 16-bit intensity value. Specifications of the MAIs are as follows and are summarized in Table 1 [24].

The first dataset consists of 14 MAIs chosen from the Swiss Institute of Bioinformatics (SIB) (http://www.isrec.isb-sib.ch/). These images are named Def and have the experiment IDs of 661, 662, 663, 664, 667, and 667 in channels Cy3 and Cy5. Each image includes four blocks, and each block has a variable number of spots ranging from 35 to 49.

The second dataset is selected from the Joe DeRisi individual dataset (http://www.bio.davidson.edu/projects/magic/magic.html, DeRisi). Each image has four blocks, and each block contains 1600 spots. Images corresponding to channels Cy3 and Cy5 have been labeled with the experiment IDs of 1302, 1303, 1309, 1310, 1311, 1312, and 1313.

The third dataset includes two real images with 36 blocks from the University of California, San Francisco (UCSF, http://cancer.ucsf.edu/research/cores/array), and each block has 210 spots.


Fig. 2. Block diagram of the proposed gridding algorithm.

The fourth dataset includes a set of images from the Gene Expression Omnibus (GEO) dataset (http://www.ncbi.nlm.nih.gov/geo). As many as 13 images corresponding to Channels 1 and 2 of a microarray experiment have been labeled as GSM15898, GSM16101, GSM16389, GSM16391 and GSM17137, GSM17163, GSM17186, GSM17190, and GSM17192 respectively.

The fifth dataset contains 14 MAIs from the public Stanford Microarray Database (SMD, http://smd.stanford.edu) that correspond to Channels 1 and 2 of a microarray experiment; they are labeled as 20,385, 20,387, 20,391, 20,392, and 20,395. These pairs are the results of four experiments in a study on global transcriptional factors for a hormone treatment of Arabidopsis thaliana (http://www.arabidopsis.org, Microarray Experiment Category: Hormone treatment, Experiment name: Transcriptional profiling of WT, axr3-1, and arx3-1R4).

Fig. 2 illustrates the block diagram of the proposed gridding algorithm, which comprises five stages. The first step (i.e. block finding) is based on a holistic approach in which the blocks are separated by projecting the image. The next four steps are: image contrast enhancement, which improves the quality of each sub-grid; hybridized spot detection, which identifies all objects in the image; non-hybridized spot detection, which estimates the position of each dark or empty spot; and gridding, which allocates a cell around each detected spot. These steps are discussed in more detail in the following sections.

2.1. Step 1: block finding (Global gridding)

The first step, known as block finding, locates each sub-grid in the MAI. In this stage, global parameters of the sub-grids, such as the sub-grid width and height and the spacing between two adjacent grids, should be determined. We propose a new block finding technique based on the projection of horizontal and vertical profiles to detect the position of the sub-grids. First, 1-D projection signals in both the horizontal and vertical directions of the image are calculated.


Fig. 3. Block finding procedure: (a) 1-D horizontal projection signal (b) 1-D vertical projection signal (c) Cropped horizontal projection, (d) Cropped vertical projection, (e) Horizontally averaged signal, and (f) Vertically averaged signal. Local maximums are marked as red points in Fig. 3 (e) and (f). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)


Fig. 4. The proposed block finding algorithm applied to the microarray sample image with the ID of 20,395 from the SMD dataset. Valleys in the projection signals are the sub-grid separating lines and denoted by red points. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
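The processing chain behind the panels of Fig. 3 (projection, cropping, windowed averaging, and peak picking) can be illustrated with a short Python sketch. This is only a sketch under stated assumptions, not the authors' MATLAB implementation: the image is assumed to be a list of rows indexed as `img[y][x]`, the window average is assumed to be a Blackman-weighted mean, and all function names are hypothetical. The cropping level 2R and the one-third window length follow the description in this section.

```python
import math

def projections(img):
    """Eq. (1): H(y) sums each row over x, V(x) sums each column over y."""
    h, w = len(img), len(img[0])
    H = [sum(img[y][x] for x in range(w)) for y in range(h)]
    V = [sum(img[y][x] for y in range(h)) for x in range(w)]
    return H, V

def clip_at_twice_min(signal):
    """Crop the projection above 2R, where R is its absolute minimum."""
    r = min(signal)
    return [min(s, 2 * r) for s in signal]

def blackman(n):
    """Standard Blackman window of length n (n >= 3)."""
    return [0.42 - 0.5 * math.cos(2 * math.pi * k / (n - 1))
            + 0.08 * math.cos(4 * math.pi * k / (n - 1)) for k in range(n)]

def sliding_blackman_mean(signal, n=None):
    """Blackman-weighted mean at every window position.

    Assumption: n defaults to one-third of the signal length (at least 3).
    """
    if n is None:
        n = max(3, len(signal) // 3)
    win = blackman(n)
    total = sum(win)
    return [sum(wk * sk for wk, sk in zip(win, signal[i:i + n])) / total
            for i in range(len(signal) - n + 1)]

def local_maxima(signal):
    """Indices of interior samples that rise above their left neighbour
    and do not fall below their right neighbour."""
    return [i for i in range(1, len(signal) - 1)
            if signal[i - 1] < signal[i] >= signal[i + 1]]
```

A typical call chain would be `H, V = projections(img)`, followed by `local_maxima(sliding_blackman_mean(clip_at_twice_min(H)))` and the same for `V`. Note that the returned indices refer to window positions, so they are offset from the original signal by roughly half the window length.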

The projections are computed using Eq. (1) and illustrated in Fig. 3(a) and (b) respectively:

$$
H(y) = \sum_{x=0}^{w-1} f(x, y), \qquad V(x) = \sum_{y=0}^{h-1} f(x, y) \tag{1}
$$

where $f = \{a_{xy}\}$ ($x \in [0, w-1]$ and $y \in [0, h-1]$) denotes the pixel intensity of the MAI, and w and h are the number of rows and columns respectively. Split points of the sub-grids can be recognized by considering the valleys in the projected signals. To obtain the coordinates of these points, we calculate the absolute minimum of the signal (R = min [projected signal]). Then, the signal is cropped above 2R, as depicted in Fig. 3(c) and (d). Afterward, a Blackman window with a size equal to one-third of the length of the projection signal slides along the signal, and its average is computed in each sub-window. By calculating the local maxima of the resulting signals (Fig. 3(e) and (f)), split points are identified and global gridding is conducted. Fig. 4 shows the result of the block finding step for a sample corresponding to Channel 2 of the MAI with the ID of 20,395 from the SMD dataset.

2.2. Step 2: contrast enhancement

In the contrast enhancement step, the 16-bit MAI (f) is converted into an 8-bit image (g) by g = f/256 for simplicity. Our proposed method for image contrast enhancement is based on a proper selection of the threshold level and on replacing the pixel intensities in the image g by the upper bound. Since selecting an appropriate threshold level plays an important role in image enhancement, we use the Otsu thresholding method. Otsu thresholding segments the image into two distinct classes based on the selection of the optimal threshold level (t) [25]. By the optimal threshold level we mean the value of t for which the intensities within each of the two classes are as uniform as possible. To find this value of t, the variance of the intensity distribution within the two classes should be minimized. By applying the Otsu thresholding method, the optimal value of t is calculated.
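As an illustration of the threshold selection just described, the following Python sketch implements the textbook form of Otsu's method on a flat list of 8-bit intensities. It is a minimal sketch, not the authors' code: it maximizes the between-class variance, which is equivalent to minimizing the within-class variance, and the function names `to_8bit` and `otsu_threshold` are hypothetical. The convention that class 0 holds values up to and including t is also an assumption.

```python
def to_8bit(pixels16):
    """Convert 16-bit intensities to 8-bit by integer division (g = f / 256)."""
    return [p // 256 for p in pixels16]

def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes the between-class variance.

    Class 0 contains intensities <= t, class 1 contains intensities > t.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * c for i, c in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # pixel count of class 0
    sum0 = 0  # intensity sum of class 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2  # proportional to between-class variance
        if between > best_var:
            best_var, best_t = between, t
    return best_t
```

For a sub-grid stored as a list of rows, the pixels would first be flattened, for example with `[p for row in g for p in row]`, before calling `otsu_threshold`.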
Next, to further increase the differentiation between the spots and background intensities, all pixels with an intensity higher than t are multiplied by a coefficient of the threshold level. In this regard, we define:

$$
p(x, y) =
\begin{cases}
T_{bt}, & g(x, y) \ge t \\
g(x, y), & \text{otherwise}
\end{cases} \tag{2}
$$

where $T_{bt} = C \times g(x, y) \times (t + 1)$ is the selected upper bound of contrast enhancement and the coefficient C is set to 5.5. According to Eq. (2), the intensity of a pixel in a sub-grid is replaced by $T_{bt}$ if it reaches or exceeds the optimal threshold t. Fig. 5(a) depicts a sub-grid sample corresponding to Channel 2 of the microarray 20,395 from the SMD dataset. This sub-grid not only contains many dark spots, but its background noise level is high as well. The contrast enhancement result for this sub-grid is shown in Fig. 5(b). After enhancing the contrast by utilizing


Fig. 5. The results of the proposed image contrast enhancement for the microarray sample image with the ID of 20,395 from the SMD dataset (a) Original sub-grid, (b) Enhanced sub-grid, (c) Binarized sub-grid before the application of the proposed contrast enhancement, and (d) Binarized sub-grid after the application of the proposed contrast enhancement.

the proposed approach, the Otsu method is employed once more in order to binarize the image and isolate the sharpest edges. Fig. 5(c) and (d) show the binarized results before and after the application of the proposed contrast enhancement respectively.

2.3. Step 3: finding the image objects by using the 8-connected labeling

The binarized sub-grid illustrated in Fig. 5(d) contains different noise components that cause severe problems for the gridding of the image. Precise identification of these noisy components in a MAI is usually the first step toward improving the gridding accuracy. Some noise sources in microarrays are tailed trails, irregular spot morphology, high background intensity, spot overlaps, misalignment between channels, bubbles, and spot contamination. Furthermore, the existence of artifacts creates some disconnected clusters in the MAI. To cluster the image into connected components and find the objects (i.e. spots, noise, and artifacts), the 8-connected labeling procedure [25] is used. The main idea of this approach is to assign a unique label to the pixels belonging to each object in the image. Accordingly, this approach assigns a specific label to each detected object and determines object properties such as area and width, as well as the number of objects in the image.

2.4. Step 4: noise cancelling using a shape-independent algorithm based on the area of objects

MAIs can be categorized into two groups: noiseless and noisy. Noisy images show a high ratio of artifacts (noise) to spots. Our classification of noisy images is based on the size of the noise components relative to the spots. In this regard, MAIs are classified into four groups: 1) MAIs in which the size of the noise is lower than that of the spots (i.e. arrays with salt and pepper noise); 2) MAIs in which the size of the noise is higher than that of the spots (i.e. arrays with tailed trails); 3) MAIs containing both noise models; and 4) noiseless MAIs.
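Both the labeling of Section 2.3 and the size-based classification introduced above rely on the pixel count of every connected object. A minimal Python sketch of 8-connected labeling is given here for illustration. It assumes the binarized sub-grid is a list of rows of 0/1 values and uses a breadth-first flood fill, which is one common way to label components and not necessarily the procedure of [25]; the function name is hypothetical.

```python
from collections import deque

def label_8_connected(binary):
    """Label foreground pixels so that horizontal, vertical, and diagonal
    neighbours share a label. Returns (label image, {label: area in pixels})."""
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    areas = {}
    next_label = 0
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or labels[y][x]:
                continue  # background, or already part of an object
            next_label += 1
            labels[y][x] = next_label
            queue = deque([(y, x)])
            area = 0
            while queue:
                cy, cx = queue.popleft()
                area += 1
                # Visit all 8 neighbours of the current pixel.
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
            areas[next_label] = area
    return labels, areas
```

The `areas` dictionary provides the per-object pixel counts that the quartile-based rules of this section operate on.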
To classify the MAIs from this point of view, the number of pixels in each object is first calculated. Afterward, the appropriate range, which is considered as the spot size, is determined. We have established two auxiliary parameters, Ratio M and Ratio L, to find this range. To this end, the pixel counts of all objects are sorted in ascending order, and the first and third quartiles ($Q_1$, $Q_3$) are calculated. Ratio M is defined as the ratio of the average of the intermediate data (i.e. the data between the first and third quartiles) to the average of the last 25 percent of the data. Similarly, Ratio L is the ratio of the average of the first 25 percent of the data to the average of the intermediate data. By these definitions, we have:

$$
\text{Ratio M} = \frac{\operatorname{mean}(Q_1 : Q_3)}{\operatorname{mean}(Q_3 : \text{end})} \tag{3}
$$

$$
\text{Ratio L} = \frac{\operatorname{mean}(1 : Q_1)}{\operatorname{mean}(Q_1 : Q_3)} \tag{4}
$$

Rule 1. Let us suppose that Ratio M < 50%. This indicates that the last 25 percent of the data is mostly noise and should be eliminated. In this case, the range R for a spot can be determined as below:

$$
R = \{\text{Area of Objects} < Q_3\} \tag{5}
$$

Rule 2. If Ratio L < 50%, the first 25 percent of the data is mainly noise. Under this condition, the range R for a spot is given by Eq. (6).

$$
R = \{\text{Area of Objects} > Q_1\} \tag{6}
$$

Rule 3. If Ratio M < 50% and Ratio L < 50%, both the first and last 25 percent of the data are predominantly noise. In this case, the range R for a spot is the intersection of the two ranges above, as shown in the following:

$$
R = \{Q_1 < \text{Area of Objects} < Q_3\} \tag{7}
$$


Fig. 6. (a) The results of the proposed de-noising technique and (b) Identified spot centers for the microarray sample image with the ID of 20,395 from the SMD dataset.

Rule 4. If Ratio M > 50% and Ratio L > 50%, the MAI is considered noiseless. However, treating every object in the image as a spot might overlook possible noise. Therefore, the desired range for the spot is bounded within the first and fourth quartiles, as shown in Eq. (8).

$$
R = [R_{\min}, R_{\max}] \tag{8}
$$

where $R_{\min}$ and $R_{\max}$ are set as below:

$$
R_{\min} = \operatorname{median}(1 : Q_1), \qquad R_{\max} = \operatorname{median}(Q_3 : \text{end}) \tag{9}
$$
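Rules 1 to 4 can be summarized in a short Python sketch. It is an illustrative reading of Eqs. (3) to (9), not the authors' implementation. Two assumptions are made and marked in the code: the quartiles are taken by position in the sorted list of object areas, and the values used for $Q_1$ and $Q_3$ are the largest area of the first quarter and the smallest area of the last quarter. The function name `spot_area_range` is hypothetical.

```python
from statistics import mean, median

def spot_area_range(areas, limit=0.5):
    """Return (ratio_m, ratio_l, kept_areas) following Rules 1-4."""
    s = sorted(areas)
    n = len(s)
    if n < 4:
        raise ValueError("need at least four objects to form quartiles")
    i1, i3 = n // 4, (3 * n) // 4               # assumption: positional quartiles
    low, mid, high = s[:i1], s[i1:i3], s[i3:]
    q1, q3 = low[-1], high[0]                   # assumption: quartile boundary values

    ratio_m = mean(mid) / mean(high)            # Eq. (3)
    ratio_l = mean(low) / mean(mid)             # Eq. (4)

    if ratio_m < limit and ratio_l < limit:     # Rule 3, Eq. (7)
        kept = [a for a in s if q1 < a < q3]
    elif ratio_m < limit:                       # Rule 1, Eq. (5)
        kept = [a for a in s if a < q3]
    elif ratio_l < limit:                       # Rule 2, Eq. (6)
        kept = [a for a in s if a > q1]
    else:                                       # Rule 4, Eqs. (8)-(9)
        r_min, r_max = median(low), median(high)
        kept = [a for a in s if r_min <= a <= r_max]
    return ratio_m, ratio_l, kept
```

Because only object areas enter the decision, the filter does not depend on whether spots are circular, which is the sense in which the approach is shape-independent.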

According to the previously mentioned rules, a MAI is recognized as a noisy image if at least one of the two ratios (Ratio M and Ratio L) is lower than 50%. To distinguish between high and low levels of noise in a noisy MAI, the following convention is adopted:

- A microarray sub-grid is classified into the high-level noise group if either one of the two ratios (Ratio M and Ratio L) is less than one-third of 50% or both ratios are less than 50%. Ratio M < (1/3) × 50% indicates that the sub-grid includes noise objects of large size. On the other hand, Ratio L < (1/3) × 50% means that the sub-grid contains spots with a low level of hybridization or of small size.
- If Ratio M > (1/3) × 50% and Ratio L > (1/3) × 50%, the sub-grid is assigned to the low-level noise group.

Fig. 6(a) illustrates the results of the proposed de-noising algorithm applied to the sub-grid taken from the image of Fig. 4. For this sub-grid, Ratio M and Ratio L are 48.4% and 16.5% respectively, which means that this sub-grid belongs to the high-level noise group. After the proposed de-noising algorithm is applied, a higher percentage of the remaining objects in a MAI are spots. In the next step, the spot centers and the distance between two consecutive rows or columns should be calculated. The center of each spot can be determined as follows:

$$
X_C = \frac{\sum_{i \in D} x_i I_i}{\sum_{i \in D} I_i}, \qquad
Y_C = \frac{\sum_{i \in D} y_i I_i}{\sum_{i \in D} I_i} \tag{10}
$$

where $X_C$ and $Y_C$ are the coordinates of the mass center; $x_i$ and $y_i$ are the coordinates of the pixels in each spot ($i \in D$); $I_i$ is the intensity of the ith pixel; and D is the set of all pixels in each spot. In addition, the distance between two consecutive rows ($D_Y$) and columns ($D_X$) can be calculated as follows:

$$
D_Y = \bar{Y}_i - \bar{Y}_{i+1}, \qquad D_X = \bar{X}_i - \bar{X}_{i+1} \tag{11}
$$

where $\bar{Y}_i$ and $\bar{Y}_{i+1}$ are the averages of the $Y_C$ values in a row and in the adjacent row respectively. Similarly, $\bar{X}_i$ and $\bar{X}_{i+1}$ are the averages of the $X_C$ values in a column and in the adjacent column respectively. To determine the objects belonging to a specific row and column, the position of each object (i.e. $X_C$ and $Y_C$) and of its adjacent objects should be considered. If the distance between two $Y_C$ values is at most half of the average object width, both values are taken as belonging to the same row. Accordingly, the first $Y_C$ whose distance exceeds half of the average object width is recognized as belonging to the next row. Finally, the Y and X averages are calculated for each row and column respectively. Fig. 6(b) depicts the identified spot centers in the sub-grid taken from the image of Fig. 4.
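The mass center of Eq. (10) and the grouping of centers into rows or columns can be sketched in Python as follows. This is a hedged illustration rather than the authors' code: each spot is assumed to be given as a list of `(x, y, intensity)` tuples, a jump larger than half the mean object width between consecutive sorted coordinates is assumed to start a new row or column, and both function names are hypothetical.

```python
def mass_center(pixels):
    """Intensity-weighted centre (X_C, Y_C) of one spot, Eq. (10).

    pixels: list of (x, y, intensity) tuples belonging to the spot.
    """
    total = sum(i for _, _, i in pixels)
    xc = sum(x * i for x, _, i in pixels) / total
    yc = sum(y * i for _, y, i in pixels) / total
    return xc, yc

def group_coordinates(coords, mean_width):
    """Group 1-D centre coordinates into rows (or columns).

    Assumption: a coordinate joins the current group when it lies within
    half the mean object width of the previous coordinate; otherwise it
    opens the next group. Returns the mean coordinate of each group.
    """
    groups = []
    for c in sorted(coords):
        if groups and c - groups[-1][-1] <= mean_width / 2:
            groups[-1].append(c)
        else:
            groups.append([c])
    return [sum(g) / len(g) for g in groups]
```

Calling `group_coordinates` on all $Y_C$ values gives one averaged Y per row, and calling it on all $X_C$ values gives one averaged X per column; differences of consecutive outputs correspond to the spacings of Eq. (11).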


Fig. 7. The constructed microarray spots matrix for the final gridding.

2.5. Step 5: gridding

If the spots of a MAI are regarded as the elements of an m × n matrix, the coordinates of a particular spot are determined by crossing the X of its column with the Y of its row. For instance, as represented in Fig. 7, the coordinates of the third spot in the first row are ($X_3$, $Y_1$). The mass centers of non-hybridized spots can be obtained similarly. Note that it is not mandatory to know the positions of all spots for final gridding. In order to yield the gridded image, the positions of all $X_i$, $i \in [1, 2, \ldots, n]$ and $Y_j$, $j \in [1, 2, \ldots, m]$ in each sub-grid are identified. Afterward, a segmenting line is placed midway between each two consecutive X values (denoted by $d_X/2$ in Fig. 7) and similarly midway between each two consecutive Y values (denoted by $d_Y/2$ in Fig. 7).
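The spot matrix and the placement of the segmenting lines can be sketched in a few lines of Python. The sketch assumes that the averaged column coordinates `xs` and row coordinates `ys` are already available; the function names are hypothetical, and the outer border of the sub-grid is deliberately left out because the text does not specify how it is drawn.

```python
def spot_matrix(xs, ys):
    """m x n matrix of spot centres: entry [j][i] is (X_{i+1}, Y_{j+1})."""
    return [[(x, y) for x in xs] for y in ys]

def grid_lines(centers):
    """Segmenting lines placed midway between consecutive centres."""
    c = sorted(centers)
    return [(a + b) / 2 for a, b in zip(c, c[1:])]
```

For example, with `xs = [10, 20, 30]` and `ys = [5, 15]`, the third spot in the first row is `spot_matrix(xs, ys)[0][2]`, that is `(30, 5)`, and the vertical lines returned by `grid_lines(xs)` lie at 15 and 25.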

2.6. Refinement procedure The grid-lines in a sub-grid of a MAI are obtained by carrying out Steps 1 to 4 of the proposed algorithm. Furthermore, noise and artifacts are removed thanks to the application of the proposed de-noising approach. However, in the presence of a noise with the size of the spot, the algorithm may recognize its coordinates as an extra row or column. Although the position of this noise is random, the line differences between the noise and the next (previous) two grid-lines in the rows (columns) are less than or equal to half of the normal distance between two consecutive rows or columns. In addition, the algorithm may identify no objects in a row or column because of either a low-level hybridization or object removal in a row or column during the de-noising stage. In these situations, the difference between the two lines is higher than the normal distance between two rows or columns. The distance between two grid-lines and the average distance are computed as below:

$$d_i = Y_{i+1} - Y_i, \quad i \in [1, q], \qquad \bar{d} = \frac{1}{q} \sum_{i=1}^{q} d_i \tag{12}$$

We can define the distance error of two grid-lines as follows:





$$\mathrm{der}_i = d_i - \bar{d} \tag{13}$$

The gridding process will face an error if one of the following conditions occurs:

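$$
\begin{aligned}
\text{(I)}\quad & \mathrm{der}_i \approx \tfrac{1}{2}\,(2k+1)\,\bar{d}, && k = 0, 1, \ldots \\
\text{(II)}\quad & \mathrm{der}_i \approx k\,\bar{d}, && k = 2, 3, \ldots
\end{aligned}
\tag{14}
$$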

In order to solve the aforementioned issues, we propose the following rules:

Rule 5. Suppose that $\mathrm{der}_i \approx \tfrac{1}{2} \times (2k+1) \times \bar{d}$, $k = 0, 1, \ldots$. This indicates that one of the two centers in $d_i$ represents noise. Whichever of these centers leads to a wrong gridding with respect to the column distances is eliminated. If both centers lead to a wrong gridding in only one direction (horizontal or vertical distance), $Y_{i+1}$ is removed.

Rule 6. If $\mathrm{der}_i \approx k\,\bar{d}$, $k = 2, 3, \ldots$, the algorithm has failed to recognize one or more rows. In this condition, the distance between the two consecutive rows (columns) ($Y_i$, $Y_{i+1}$ or $X_i$, $X_{i+1}$) is divided into k equal segments, which recovers the empty spot rows.
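The following Python sketch illustrates the quantities of Eqs. (12) and (13) and the idea behind Rule 6. It rests on several assumptions that are not taken from the paper: the normal spacing is estimated with the median (so that one oversized gap does not distort it), a gap is treated as spanning k spacings when its ratio to the normal spacing lies within a hypothetical tolerance `tol` of an integer k ≥ 2, and all function names and coordinates are illustrative. Rule 5 (removal of noise-induced lines) is not covered.

```python
from statistics import mean, median
from typing import List, Tuple


def spacing_errors(centers: List[float]) -> Tuple[List[float], float, List[float]]:
    """Eqs. (12)-(13): spacings d_i, their mean, and der_i = d_i - mean."""
    d = [b - a for a, b in zip(centers, centers[1:])]
    d_bar = mean(d)
    der = [di - d_bar for di in d]
    return d, d_bar, der


def fill_missing_lines(centers: List[float], tol: float = 0.25) -> List[float]:
    """Rule 6 idea: split a gap of about k normal spacings into k equal segments.

    Assumes `centers` is sorted in ascending order with distinct values.
    The median-based spacing estimate and `tol` are assumptions of this sketch.
    """
    if len(centers) < 3:
        return list(centers)
    d = [b - a for a, b in zip(centers, centers[1:])]
    normal = median(d)
    refined = [centers[0]]
    for start, gap in zip(centers, d):
        ratio = gap / normal
        k = round(ratio)
        if k >= 2 and abs(ratio - k) <= tol:
            step = gap / k
            # Insert the k - 1 positions that were missed inside this gap.
            refined.extend(start + step * j for j in range(1, k))
        refined.append(start + gap)
    return refined


# Hypothetical row centers Y_i in which the row near 50 was not detected.
y_centers = [10.0, 30.0, 70.0, 90.0, 110.0]
print(spacing_errors(y_centers)[2])   # [-5.0, 15.0, -5.0, -5.0]
print(fill_missing_lines(y_centers))  # [10.0, 30.0, 50.0, 70.0, 90.0, 110.0]
```

In this example the mean spacing of Eq. (12) is 25 pixels because the double-width gap is included in the average, which is why the sketch relies on the median when deciding how many segments a gap should be split into.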


Fig. 8. The final gridded image (a) before and (b) after the refinement procedure for the microarray sample image with the ID of 20,395 from the SMD dataset.

Fig. 9. A perspective of the designed user-friendly software package for gridding the MAIs.

Fig. 8(a) and (b) illustrate the final gridding results before and after employing the refinement procedure, respectively, for the microarray sub-grid introduced in Fig. 4. The major difference between these two figures is that, before refinement, the gridding process fails to detect the second row from the top of the test sub-grid. This issue is resolved by the proposed refinement procedure, as denoted by the dashed lines in Fig. 8(b).

3. Results and discussion

Fig. 9 shows the user-friendly software package for gridding MAIs, which has been designed and implemented in the MATLAB environment. Some specifications of this package are summarized below:

- Loading different MAIs (both real and simulated images)
- Demonstrating the details of each MAI individually, such as the image format, image size, number of image bits, and the number of pixels in each row and column
- Global gridding of the MAI and displaying the results separately
- Local gridding of each sub-grid by utilizing the proposed spot-by-spot algorithm
- Calculating the number of rows and columns in each sub-grid after local gridding
- Computing the number of hybridized and non-hybridized spots in the sub-grid
- Computing the area of the cell containing both hybridized and non-hybridized spots
- Specifying the parameters Ratio M and Ratio L
- Determining the noise level of each sub-grid in the MAI (high- or low-level noise) based on the parameters Ratio M and Ratio L
- Displaying the gridding results for each sub-grid of the MAI
- Determining the values of $R_{min}$ and $R_{max}$ separately for each sub-grid of the MAI

Fig. 10. The ratio of artifact (noise) to spot in the test sub-grids of the (a) 20,387, (b) 20,392, and (c) 20,395 microarrays from the SMD dataset.

Performance evaluation of the proposed gridding algorithm upon noisy microarray sub-grids: To evaluate the performance of the proposed gridding algorithm, it was applied to the microarrays 20,387, 20,392, and 20,395 from the SMD dataset. As already explained, noisy sub-grids are those in which at least one of the parameters Ratio M or Ratio L is less than 50%. Fig. 10 shows Ratio M and Ratio L for each test sub-grid of these MAIs. As can be seen, Ratio L is below 50% in all sub-grids. This means that (i) all three microarrays contain noise and (ii) the dominant noise in these sub-grids consists of either components smaller than the spots or spots with low-level hybridization. Based on Fig. 10(b), big noise components are absent from the microarray 20,392, except in three of its sub-grids, while most of the big noise components appear in the microarray 20,387 (Fig. 10(a)). Since Ratio L in this microarray is less than 50%, the third rule of the de-noising algorithm applies. According to the definitions of high- and low-noise microarrays, the largest and smallest numbers of high-noise sub-grids belong to the microarrays 20,387 (28 high-noise sub-grids) and 20,392 (19 high-noise sub-grids) respectively, as can be seen in Figs. 10(a) to (c).
Gridding results for the microarrays 20,387, 20,392, and 20,395 are shown in Fig. 11. The values of Ratio M (Ratio L) for these sub-grids are 38.67 (23.62), 75.31 (13.41), and 37.67 (13.21) respectively. In addition, to demonstrate the generality of the proposed algorithm, we applied it to four other datasets (SIB, DeRisi, UCSF, and GEO), as shown in Fig. 12. The values of Ratio M (Ratio L) for these sub-grids are 37.77 (31.04), 49.67 (59.16), 76.11 (57.51), and 49.64 (59.48) respectively. The proposed algorithm shows appropriate gridding performance for all images, even for those with various spot shapes and sizes. The main advantage of this algorithm is its fully automatic operation: no user intervention is needed to adjust the gridding parameters.
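As a small illustration of the 50% criterion, the Python sketch below flags a sub-grid as noisy when Ratio M or Ratio L falls below the threshold, using the value pairs quoted in the text for the sub-grids of Figs. 11 and 12. The function name `is_noisy` is hypothetical, and the further split into high- and low-level noise is not reproduced because it depends on definitions given earlier in the paper.

```python
def is_noisy(ratio_m: float, ratio_l: float, threshold: float = 50.0) -> bool:
    """A sub-grid counts as noisy if Ratio M or Ratio L is below the threshold (%)."""
    return ratio_m < threshold or ratio_l < threshold


# (Ratio M, Ratio L) pairs quoted in the text for the displayed sub-grids.
sub_grids = {
    "SMD 20,387": (38.67, 23.62),
    "SMD 20,392": (75.31, 13.41),
    "SMD 20,395": (37.67, 13.21),
    "SIB": (37.77, 31.04),
    "DeRisi": (49.67, 59.16),
    "UCSF": (76.11, 57.51),
    "GEO": (49.64, 59.48),
}

for name, (ratio_m, ratio_l) in sub_grids.items():
    print(f"{name}: noisy={is_noisy(ratio_m, ratio_l)}")
```

Under this criterion only the UCSF sub-grid is not flagged, since both of its ratios exceed 50%.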


Fig. 11. The results of the gridding algorithm in the (a) 20,387, (b) 20,392, and (c) 20,395 microarrays from the SMD dataset.

Fig. 12. The results of the gridding algorithm in the (a) SIB, (b) DeRisi, (c) UCSF, and (d) GEO datasets.


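Table 2. The gridding accuracy in the sub-grids with high- and low-level noises for different methods (SMD dataset).

| Microarray | Method | Low | High | Mean ($d_i$) | SD ($d_i$) |
|---|---|---|---|---|---|
| 20387 | Otsu method | 8/20 | 7/28 | 2.52 | 1.71 |
| 20387 | Otsu + morphology | 15/20 | 18/28 | 1.91 | 1.58 |
| 20387 | Correlation-based method | 5/20 | 5/28 | 2.90 | 2.01 |
| 20387 | PDE-based method | 17/20 | 19/28 | 1.86 | 1.81 |
| 20387 | Proposed | 20/20 | 21/28 | 1.84 | 1.10 |
| 20392 | Otsu method | 19/29 | 12/19 | 2.00 | 1.77 |
| 20392 | Otsu + morphology | 25/29 | 15/19 | 1.78 | 1.52 |
| 20392 | Correlation-based method | 19/29 | 11/19 | 2.55 | 2.61 |
| 20392 | PDE-based method | 24/29 | 17/19 | 1.85 | 1.81 |
| 20392 | Proposed | 25/29 | 18/19 | 1.73 | 1.76 |
| 20395 | Otsu method | 22/25 | 16/23 | 2.51 | 2.47 |
| 20395 | Otsu + morphology | 22/25 | 19/23 | 1.61 | 1.30 |
| 20395 | Correlation-based method | 15/25 | 10/23 | 2.60 | 2.30 |
| 20395 | PDE-based method | 21/25 | 18/23 | 1.75 | 1.23 |
| 20395 | Proposed | 23/25 | 18/23 | 1.63 | 1.14 |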

Table 3. Selecting the optimal range of the proposed noise cancellation algorithm (SMD dataset).

| Residue | 20387, low noise | 20387, high noise | 20392, low noise | 20392, high noise | 20395, low noise | 20395, high noise |
|---|---|---|---|---|---|---|
| 0%_Residue | 20/20 | 21/28 | 25/29 | 18/19 | 23/25 | 18/23 |
| 25%_Residue | 17/20 | 21/28 | 25/29 | 18/19 | 23/25 | 17/23 |
| 50%_Residue | 15/20 | 20/28 | 25/29 | 15/19 | 23/25 | 14/23 |
| 75%_Residue | 15/20 | 20/28 | 21/29 | 16/19 | 23/25 | 14/23 |
| 100%_Residue | 10/20 | 18/28 | 19/29 | 14/19 | 21/25 | 13/23 |
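The quantities behind Table 2, namely the distance $d_i$ between spot and cell centers and the Rule 7 criterion for a correctly gridded spot, can be sketched in Python as follows. The criterion is implemented exactly as worded in the text ($d_i$ no larger than the cell width minus the spot radius). The function names, coordinates, cell width, and radius are hypothetical, and the use of the sample standard deviation is an assumption, since the paper does not state which estimator was used.

```python
import math
from statistics import mean, stdev
from typing import Tuple

Point = Tuple[float, float]


def center_distance(spot: Point, cell: Point) -> float:
    """Euclidean distance d_i between a spot center and its cell center."""
    return math.hypot(spot[0] - cell[0], spot[1] - cell[1])


def is_correctly_gridded(spot: Point, cell: Point,
                         cell_width: float, spot_radius: float) -> bool:
    """Rule 7 as worded in the text: d_i <= cell width - spot radius."""
    return center_distance(spot, cell) <= cell_width - spot_radius


# Hypothetical spot and cell centers (pixels) for three spots of one row.
spots = [(10.0, 10.0), (31.0, 9.0), (52.0, 13.0)]
cells = [(10.0, 10.0), (30.0, 10.0), (50.0, 10.0)]
cell_width, spot_radius = 20.0, 8.0  # hypothetical values

distances = [center_distance(s, c) for s, c in zip(spots, cells)]
correct = sum(is_correctly_gridded(s, c, cell_width, spot_radius)
              for s, c in zip(spots, cells))

print("mean d_i:", mean(distances))
print("SD d_i:", stdev(distances))  # sample SD (assumption)
print("gridding accuracy:", correct / len(spots))
```

As the text notes, the criterion is meant for spots with a uniform circular shape, so the single `spot_radius` parameter reflects that restriction.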

Quantitative evaluation of the gridding accuracy for microarrays in the SMD dataset: The gridding accuracy is defined as the ratio of the number of correctly gridded spots to the number of all spots [20]. We have specified the following rule to identify the number of correctly gridded spots:

Rule 7. If the Euclidean distance between the centers of the spot and the cell ($d_i$) is less than or equal to the difference between the cell width and the radius of the spot, we can claim that the spot is completely located within the cell. Note that this rule holds for spots with a uniform circular shape.

Table 2 gives the mean and standard deviation (SD) of the Euclidean distance between the centers of the spot and the cell ($d_i$) for the SMD dataset. It also reports the number of correctly gridded sub-grids (i.e. 100% gridding accuracy) for the proposed algorithm and other state-of-the-art approaches, i.e. the Otsu, Otsu+morphology [26], correlation-based [27], and partial differential equation (PDE)-based [28] methods. As can be seen, our algorithm has the highest or joint-highest number of correctly gridded sub-grids for both high and low levels of noise in almost every case; the exception is the high-noise group of the microarray 20,395, where Otsu+morphology correctly grids 19 of 23 sub-grids against 18 of 23. Furthermore, the mean of $d_i$ for the proposed algorithm is generally lower than for the other methods. For instance, the mean $d_i$ in the microarray 20,387 is 1.84 for the proposed algorithm, compared with 2.52, 1.91, 2.90, and 1.86 for the Otsu, Otsu+morphology, correlation-based, and PDE-based methods respectively. Table 2 also reveals that sub-grids with low-level noise have a higher gridding accuracy than high-noise sub-grids.

Selecting the optimal range of noise cancellation in the SMD dataset: In the fourth step of the discussed gridding algorithm, the range of noise cancellation is calculated for different MAIs. In order to determine the optimal range of noise elimination, the percentage of the noise range that is close to Q1 or Q3 is determined.
Next, the number of correctly gridded sub-grids is calculated, as shown in Table 3. In this table, 0%_Residue indicates complete removal of the specified range in the fourth stage of the noise-cancelling algorithm, whereas 100%_Residue represents the performance of the gridding algorithm when Rules 1–4 are ignored. As can be seen, most of the sub-grids are correctly gridded by selecting 0%_Residue, which means that the defined rules improve the gridding performance. For instance, for the low-noise sub-grids of the microarray 20,387, a gridding accuracy of 50% (10 of 20) is achieved with 100%_Residue, while it reaches 100% (20 of 20) with 0%_Residue. This improvement can also be observed in the sub-grids with high-level noise.

Comparing the proposed algorithm with other methods applied to the SIB, DeRisi, UCSF and GEO datasets: In order to investigate the performance of the proposed algorithm in gridding MAIs of different qualities, it was applied to four other datasets. The gridding accuracies of the proposed algorithm and of the other methods (Otsu, Otsu+morphology, correlation-based, and PDE-based) are given in Table 4. In the SMD dataset, which has the highest level of noise, the gridding accuracy of the proposed algorithm is higher by factors of 1.13, 1.05, 1.18, and 1.03 than those of the Otsu, Otsu+morphology, correlation-based, and PDE-based methods respectively. A comparable advantage is obtained for the SIB, DeRisi, and GEO datasets, while all methods reach 100% on the UCSF dataset. Note that the correlation- and PDE-based methods are mostly suitable for ideal, noiseless MAIs because they depend directly on projection; hence, their gridding performance degrades in noisy MAIs. The proposed algorithm, however, is robust against high-noise sub-grids because it is independent of the shapes of spots. Furthermore, unlike the correlation- and PDE-based methods, the proposed algorithm includes one extra phase (the refinement procedure) to remove wrong grid-lines and/or to draw those grid-lines that are missing from the final gridding for any reason (e.g. object removal in the de-noising process or undefined spots owing to low-level hybridization in the binary version of a sub-grid). Fig. 13 depicts Ratio M and Ratio L versus the number of test sub-grids for the SIB, DeRisi, UCSF, and GEO datasets. According to these diagrams, most of the sub-grids in these four datasets are categorized in the low-noise group. It should be mentioned that the numbers of accessible sub-grids in the SIB, DeRisi, UCSF, and GEO datasets are 20, 10, 10, and 26 respectively.

Table 4. Comparing the gridding accuracies of different methods for the five datasets.

| Method | SIB | DeRisi | UCSF | GEO | SMD |
|---|---|---|---|---|---|
| Otsu method | 94.6% | 99.2% | 100% | 89.7% | 84.4% |
| Otsu + morphology | 97.3% | 99.7% | 100% | 97.2% | 91.2% |
| Correlation-based method | 91.0% | 88.2% | 100% | 86.4% | 81.0% |
| PDE-based method | 97.9% | 99.8% | 100% | 98.0% | 93.0% |
| Proposed | 98.5% | 100% | 100% | 99.5% | 95.5% |

Fig. 13. The ratio of artifact (noise) to spot in the test sub-grids for the (a) SIB, (b) DeRisi, (c) UCSF, and (d) GEO datasets.

Comparing the computational complexity of the proposed algorithm with other methods: We evaluated the computational complexity of the proposed algorithm and the other methods for a MAI of M × N pixels. The complexity of the Otsu thresholding method is $O(t_s N^2)$, where $t_s$ denotes the threshold level size. For the correlation-based method, the computational complexity is $O(M + N)$. The complexities of the Otsu+morphology and PDE-based methods are $O(M + N) + O(t_s N^2)$ and $O(2M \times N + p(M + N))$ respectively, where $S_e$ and $p$ are the size of the structuring element and the number of iterations required for the evolution of profiles respectively [28]. In the proposed algorithm, the Otsu thresholding method, with complexity $O(t_s N^2)$, is utilized for contrast enhancement and for creating the binary version of the enhanced sub-grid. All pixels of the image are compared to the threshold level in both the contrast enhancement and the binarization processes, giving a complexity of $2(M \times N)$. Furthermore, we assume that the number and sum of object pixels in an image are of the order of α and β respectively. Averaging and subtracting are conducted in rows (columns) and between two rows (columns), which are of the order of $O(\alpha)$. Moreover, the number of refinement operations is assumed to be $p$. Note that α, β, and $p$ are always lower than M × N. Thus, the overall computational complexity of the proposed algorithm is approximately $O(M \times N) + O(t_s N^2)$. Table 5 summarizes the average execution time of the proposed algorithm in comparison with the other methods on the five datasets.
All the steps of the proposed algorithm were performed on a computer with a 1.6 GHz processor (Intel® Pentium® M) and 4 GB of RAM. The computational cost differs across the five datasets owing to their different resolutions and numbers of MAI spots. The dimensions of sub-grids in the five datasets are roughly 360×253 pixels (SIB), 453×451 pixels (DeRisi), 243×228 pixels (UCSF), 449×377 pixels (GEO), and 461×443 pixels (SMD). Since the proposed algorithm combines holistic and spot-by-spot approaches, its running time is longer than that of the other methods, except for the PDE-based one. This difference is nearly twice the Otsu approach for the DeRisi dataset. All the methods run faster on SIB than on DeRisi because each sub-grid of these two datasets has 35–49 and 1600 spots respectively.

Table 5. The processing time (in seconds) of gridding for different methods.

| Method | SIB | DeRisi | UCSF | GEO | SMD |
|---|---|---|---|---|---|
| Otsu method | 0.032 | 0.05 | 0.025 | 0.041 | 0.047 |
| Otsu + morphology | 0.074 | 0.095 | 0.057 | 0.085 | 0.097 |
| Correlation-based method | 0.027 | 0.034 | 0.025 | 0.044 | 0.045 |
| PDE-based method | 0.085 | 1.09 | 0.067 | 1.01 | 1.03 |
| Proposed | 0.081 | 1.03 | 0.063 | 0.096 | 1.01 |

4. Conclusion

The shape-independent gridding algorithm implemented in this study successfully gridded different kinds of MAIs. This was achieved by introducing rules that determine the intervals used to check for the presence or absence of noise components in the MAIs. The proposed algorithm was tested on five datasets by dividing their MAIs into high- and low-noise sub-grids based on the defined parameters Ratio M and Ratio L. The results revealed that the SMD dataset has the highest level of noise, while the UCSF and DeRisi datasets have the lowest. Thanks to the proposed de-noising algorithm, the gridding accuracy on the SMD dataset increased by 11.1, 4.3, 14.5, and 2.5 percentage points compared to the conventional Otsu, Otsu+morphology, correlation-based, and PDE-based approaches respectively, even in the presence of various noise types. One of the key advantages of the proposed algorithm is that it does not take into account the shape of objects in a MAI. Instead, the area of each object, including noise components and spots, is considered, since the area of spots usually lies within a certain range, as stated in Rules 1–4. Another advantage is its noise resistance; thus, it can be used to analyze MAIs with different spot shapes, different levels of hybridization, and even missing spots. The proposed algorithm can also be applied to other types of images that share the characteristics of MAIs. More precisely, the objects to be gridded in an image should be arranged in a rectangular layout of hypothetical rows and columns.
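As a quick arithmetic check, the SMD gains quoted in this paper can be recomputed from the SMD accuracies listed in Table 4 (84.4%, 91.2%, 81.0%, 93.0%, and 95.5%). The Python sketch below, whose helper name `gains` is hypothetical, shows that the 11.1, 4.3, 14.5, and 2.5 figures are differences in percentage points, while the factors 1.13, 1.05, 1.18, and 1.03 are the corresponding accuracy ratios.

```python
from typing import Tuple

# SMD gridding accuracies (%) as listed in Table 4.
smd_accuracy = {
    "Otsu": 84.4,
    "Otsu+morphology": 91.2,
    "Correlation-based": 81.0,
    "PDE-based": 93.0,
}
proposed = 95.5


def gains(baseline: float, new: float) -> Tuple[float, float]:
    """Return (difference in percentage points, ratio new/baseline), rounded."""
    return round(new - baseline, 1), round(new / baseline, 2)


for method, accuracy in smd_accuracy.items():
    diff, factor = gains(accuracy, proposed)
    print(f"{method}: +{diff} points, factor {factor}")
# Otsu: +11.1 points, factor 1.13
# Otsu+morphology: +4.3 points, factor 1.05
# Correlation-based: +14.5 points, factor 1.18
# PDE-based: +2.5 points, factor 1.03
```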
The running time of the proposed algorithm can still be reduced, so future work will focus on optimizing the algorithm for shorter running times, which would be useful for GPU-based computing structures in MAI analysis platforms.

References

[1] Wee A, Liew C, Yah H, Yang M. Pattern recognition techniques for the emerging field of bioinformatics: a review. Pattern Recognit 2005;38:2055–73.
[2] Charpe AM. DNA Microarray: advances in biotechnology. India: Springer; 2014. p. 71–104.
[3] Lu Y, Han J. Cancer classification using gene expression data. Inf Syst Data Manage Bioinform 2003;28:243–68.
[4] Scott CP, VanWye J, McDonald MD, Crawford DL. Technical analysis of cDNA microarrays. PLoS ONE 2009;4:e4486.
[5] Wong TT, Hsu CH. Two-stage classification methods for microarray data. Expert Syst Appl 2008;34:375–83.
[6] Alhadidi B, Fakhouri HN, Al Mousa OS. cDNA microarray genome image processing using fixed spot position. Am J Appl Sci 2006;3:1730–4.
[7] Uslan V, Bucak IO. Microarray image segmentation using clustering methods. Math Comput Appl 2010;15:240–7.
[8] Yang YH, Buckley MJ, Dudoit S, Speed TP. Comparison of methods for image analysis on cDNA microarray data. J Comp Graph Stat 2002;11:108–36.
[9] Bowtell D, Sambrook J. DNA microarrays: a molecular cloning manual. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press; 2003.
[10] Eisen MB. ScanAlyse. Accessed: http://rana.Stanford.EDU/software/.
[11] Fielden MR, Halgren RG, Dere E, Zacharewski TR. GP3: GenePix post-processing program for automated analysis of raw microarray data. Bioinformatics 2002;18:771–3.
[12] Jain AN, Tokuyasu TA, Snijders AM, Segraves R, Albertson DG, Pinkel D. Fully automated quantification of microarray image data. Genome Res 2002;12:325–32.
[13] Médigue C, Rose M, Viari A, Danchin A. Detecting and analyzing DNA sequencing errors: toward a higher quality of the Bacillus subtilis genome sequence. Genome Res 1999;9:1116–27.
[14] Brueck C, Sunny S, Collins J. Oligonucleotide array CGH analysis of a robust whole genome amplification method. Biotechniques 2007;42:230–3.
[15] Heyer LJ, Moskowitz DZ, Abele JA, Karnik P, Choi D, Campbell AM, et al. MAGIC tool: integrated microarray data analysis. Bioinformatics 2005;21:2114–15.
[16] Saeed AI, Bhagabati NK, Braisted JC, Liang W, Sharov V, Howe EA, et al. TM4 microarray software suite. Methods Enzymol 2006;41:134–93.
[17] Bengtsson A, Bengtsson H. Microarray image analysis: background estimation using quantile and morphological filters. BMC Bioinform 2006;7:96.
[18] Zacharia E, Maroulis D. An original genetic approach to the fully automatic gridding of microarray images. IEEE Trans Med Imag 2007;27:805–13.
[19] Bariamis D, Iakovidis DK, Maroulis D. M3G: maximum margin microarray gridding. BMC Bioinform 2010;11:49.
[20] Giannakeas N, Kalatzis F, Tsipouras MG, Fotiadis DI. A generalized methodology for the gridding of microarray images with rectangular or hexagonal grid. Sig Image Video P 2016;10:719–28.
[21] Jung HY, Cho HG. An automatic block and spot indexing with k-nearest neighbors graph for microarray image analysis. Bioinformatics 2002;18:S141–51.
[22] Giannakeas N, Kalatzis F, Tsipouras MG, Fotiadis DI. Spot addressing for microarray images structured in hexagonal grid. Comput Methods Programs Biomed 2012;106:1–13.
[23] Steinfath M, Wruck W, Seidel H, Radelof LH, O’Brien J. Automated image analysis for array hybridization experiments. Bioinformatics 2001;17:634–41.
[24] Shao G, Li T, Zuo W, Wu S, Liu T. A combinational clustering based method for cDNA microarray image segmentation. PLoS ONE 2015;10(10):e0133025.
[25] Gonzalez RC, Woods R. Digital image processing. 3rd ed. Pearson; 2007.
[26] Shao GF, Yang F, Zhang Q, Zhou QF, Luo LK. Using the maximum between-class variance for automatic gridding of cDNA microarray images. IEEE ACM Trans Comput Biol Bioinf 2013;10:181–92.
[27] Helmy AK, GhS El-taweel. Regular gridding and segmentation for microarray images. Adv Electr Comp Eng 2013;39:2173–82.
[28] Belean B, Terebes R, Bot A. Low-complexity PDE-based approach for automatic microarray image processing. Med Biol Eng Comput 2015;53:99–110.


Hamidreza Saberkari received his B.Sc. in Electrical Engineering from the University of Guilan, Rasht, Iran, in 2011. In 2013, he received his M.Sc. in Communication Engineering from Sahand University of Technology, Tabriz, Iran. He is currently a PhD candidate in Electrical Engineering at Sahand University of Technology, Tabriz, Iran. His research interests include biomedical image processing, bioinformatics, and pattern recognition.

Mousa Shamsi received his PhD in Electrical Engineering from the University of Tehran in December 2008. Since April 2013, he has been an associate professor at the Faculty of Electrical Engineering, Sahand University of Technology, Tabriz, Iran. His research interests include medical image and signal processing, genomic signal processing, pattern recognition, adaptive networks, and facial surgical planning.

Habib Badri Ghavifekr received his PhD in Electrical Engineering from the Technical University of Berlin. Since 2013, he has been an associate professor at the Faculty of Electrical Engineering, Sahand University of Technology, Tabriz, Iran. His research interests include microsystem technologies, microelectronic packaging, MEMS, and electronic measurement systems for industrial applications.
