Pattern Recognition Letters 42 (2014) 1–10
Using scale space filtering to make thinning algorithms robust against noise in sketch images

Houssem Chatbri (a), Keisuke Kameyama (b)

a Graduate School of Systems and Information Engineering, Department of Computer Science, University of Tsukuba, Japan
b Faculty of Engineering, Information and Systems, University of Tsukuba, Japan
Article history: Received 15 August 2013; Available online 31 January 2014
Keywords: Thinning algorithm; Robustness against noise; Scale space filtering; Sketch image preprocessing
Abstract: We apply scale space filtering to the thinning of binary sketch images by introducing a framework that makes thinning algorithms robust against noise. Our framework derives multiple representations of an input image at multiple filtering scales. Then, the filtering scale that gives the best trade-off between noise removal and shape distortion is selected. The scale selection uses a performance measure that detects extra artifacts (redundant branches and lines) caused by noise, as well as shape distortions introduced by heavy filtering. In other words, our contribution is an adaptive preprocessing step, in which various thinning algorithms can be used, and whose task is to automatically estimate the optimal amount of filtering needed to deliver a neat thinning result. Experiments using five state-of-the-art thinning algorithms as the framework's thinning stage show that robustness against various types of noise, mainly contour noise, scratch, and dithers, was achieved. In addition, application of the framework to sketch matching shows its usefulness as a preprocessing and normalization step that improves matching performance. © 2014 Elsevier B.V. All rights reserved.
1. Introduction

Thinning algorithms are classic in digital image processing and are used to extract a pattern's skeleton, that is, a thin or nearly thin representation of the pattern [1]. Thinning algorithms have been used in many applications such as OCR [2,3], document image analysis [4,5], fingerprint identification [6–8], biometric authentication using retinal images [9,10], signature verification [11–13], sketch matching and sketch-based image retrieval [14–17], etc. In OCR, thinning is used as a normalization step to ensure invariance to pen thickness and handwriting styles. In fingerprint identification, thinning is an essential preprocessing step before feature extraction. In authentication systems using retinal images, thinning is applied in order to produce a one-pixel-wide vascular tree of blood vessels, whose geometrical and topological properties are used in the identification process. In document analysis, signature verification and sketch-based image retrieval, thinning is used as a preprocessing and normalization step. A thinning algorithm is considered desirable if it meets the following properties [1,18]:
- produce a thin or nearly thin skeleton;
- preserve the connectivity of the original pattern, which means that connected parts in the original pattern should stay connected in the skeleton;
- preserve the visual topology of the original pattern, which means that although the skeleton is a compact representation of the original pattern, it should deliver the same visual information;
- be robust against noise.

Although many thinning algorithms have been presented, the criteria above are rarely met altogether. Usually, choosing a particular thinning algorithm is an application-dependent issue motivated by the ability of the algorithm to fulfill a particular criterion among those mentioned above. For instance, producing one-pixel-wide skeletons is a priority in some applications [19], whereas it is sacrificed in other applications in order to preserve the topology of the pattern [18]. While many thinning algorithms have satisfying performances regarding connectivity and topology preservation, most of them are sensitive to noise [11,20]. Using a state-of-the-art thinning algorithm [21] to thin the noisy patterns in Fig. 3(a) gave the results in Fig. 3(b). Such noisy skeletons are a challenge for applications such as OCR [22], Content-Based Image Retrieval [19], and fingerprint recognition [7].
Fig. 1. Using scale space filtering in ATF: Images in the first row are results of different amount of filtering. Images in the second row are binarized images corresponding to the ones of the first row. Images in the third row are skeletons corresponding to binary images of the second row. ATF should be able to select the amount of filtering producing the best trade-off between noise removal and shape alterations, as shown in the blue box. (For interpretation of the references to colour in this figure caption, the reader is referred to the web version of this article.)
In this work, we introduce a framework for making thinning algorithms robust against noise, using scale space filtering. The inputs of our method are hand-drawn binary images that are scanned or created using computer graphics software or a tablet. We refer to this class of images as sketch images. Our method produces multiple representations of an input image at multiple filtering scales. Then, the filtering scale that gives the best trade-off between noise removal and shape distortion is selected. For optimal scale selection, we introduce a performance measure that detects irrelevant artifacts caused by noise and shape distortions introduced by high amounts of filtering. The selection is automatic and adapted to each input image. Hence, we call our method the Adaptive Thinning Framework (ATF). ATF is aimed at the typical types of noise found in sketch images: contour noise, scratch and dithers. The remainder of this paper is organized as follows: Section 2 is a state-of-the-art review. ATF is explained in detail in Section 3, and evaluated in Section 4. Section 5 concludes the paper.
2. Related work

Thinning algorithms received considerable attention with the emergence of pattern recognition applications that require transforming an image into a compact form that preserves geometrical and topological properties. Thinning algorithms for binary images can be categorized into sequential algorithms, parallel algorithms, and medial axis algorithms [1]. Sequential algorithms proceed by iteratively deleting contour points in a predetermined order, and hence are non-isotropic. Parallel algorithms deal with this limitation by basing point deletion only on the result of the previous iteration. Medial axis algorithms produce a medial or central line of the pattern by line-following methods or the distance transform. The parallel paradigm received more attention than the other two categories, and many parallel algorithms have satisfying performances in preserving connectivity and topology [1]. In a typical parallel thinning algorithm, the image pixels are checked from top-left to bottom-right, and those satisfying certain conditions or matching specific templates are flagged. Then, when all pixels have been checked, the flagged ones are removed. We refer the reader to [1] for a comprehensive survey on thinning algorithms.

Most algorithms require a binary image as input [1,18–26]. In the case of grayscale images, a binarization step is needed before applying the algorithm. Alternatively, several thinning algorithms that work directly on grayscale images without binarization have been introduced [27–31]. Despite the vast repository of thinning algorithms, sensitivity to noise remains an open issue [11,20]. Some attempts at making thinning algorithms robust against noise have been presented. In [23], Chen and Yu presented an entropy-based method for thinning noisy images inspired by the human vision process. Their method computes the circular range containing the maximal information for each pixel. A symmetry score of the pixel distribution in the circular range is then computed. The symmetry information is treated as a grayscale image which is then thinned to obtain the skeleton. Chen and Yu's method has been reported to maintain good performances under reasonable noise levels [24]. However, it becomes vulnerable once the amount of noise is significant [25], and it is time-consuming [26].

Hoffman and Wong presented an algorithm for thinning both binary and grayscale images using scale space filtering [27]. Their algorithm generates filtered versions of an image, then extracts the skeleton by finding the peak, ridge and saddle points in the grayscale filtered image. The extracted pixels are called "the Most Prominent Ridge-Line pixels (MPRL)" and are used to form the skeleton. An MPRL pixel in the image scale space pyramid is a pixel such that all ridge-line pixels in its sub-pyramid have greater second derivatives. The quality of the skeleton with regard to noise depends strongly on a parameter that determines the minimum amount of smoothing to be applied to the image. This parameter has to be set manually by the user. In a recent work [28], Cai introduced an algorithm for thinning noisy patterns based on Oriented Gaussian Filters. The author dealt with contour noise present in handwriting and fingerprints, caused by pen perturbations and the scanning of documents and images.
Fig. 2. Image datasets.
Cai's algorithm proceeds by classifying pixels into edges, valleys and ridges using Oriented Gaussian Filters, and then removing noise pixels by trimming negative parts of ridge energy images. Finally, skeletons are extracted based on smooth intensity surfaces of ridge energy images and principal directions. Cai's algorithm is designed to work specifically on images of handwriting and fingerprints, and the oriented scales of the Gaussian filters are predetermined in a specific way for each class of images.

The main concern with thinning algorithms is their sensitivity to noise. So far, approaches for making thinning algorithms robust against noise are domain-specific and semi-automatic. In this work, we introduce a framework that aims at making thinning algorithms robust against three types of noise which commonly appear in sketch images: contour noise, scratch and dither noise (Fig. 3(a)). Unlike other algorithms that serve the same purpose, the proposed framework automatically selects the amount of filtering needed adaptively to each image, and any thinning algorithm can be used in the framework's thinning stage.
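As a concrete illustration of the flag-then-delete parallel scheme described above, the following is a minimal Python/NumPy sketch of Zhang and Suen's two-subiteration rules [39]; it is an illustrative reimplementation from the published rules, not the authors' code.

```python
import numpy as np

def zhang_suen_thinning(image):
    """Parallel thinning: flag removable contour pixels in each subiteration, then delete them."""
    img = (image > 0).astype(np.uint8)  # 1 = foreground, 0 = background
    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            flagged = []
            for i, j in zip(*np.nonzero(img)):
                if i < 1 or j < 1 or i > img.shape[0] - 2 or j > img.shape[1] - 2:
                    continue
                # neighbours P2..P9, clockwise starting from the pixel above (i-1, j)
                p = [img[i-1, j], img[i-1, j+1], img[i, j+1], img[i+1, j+1],
                     img[i+1, j], img[i+1, j-1], img[i, j-1], img[i-1, j-1]]
                b = sum(p)                                                     # foreground neighbours
                a = sum(p[k] == 0 and p[(k + 1) % 8] == 1 for k in range(8))   # 0 -> 1 transitions
                if not (2 <= b <= 6 and a == 1):
                    continue
                if step == 0 and p[0] * p[2] * p[4] == 0 and p[2] * p[4] * p[6] == 0:
                    flagged.append((i, j))   # south-east boundary / north-west corner points
                elif step == 1 and p[0] * p[2] * p[6] == 0 and p[0] * p[4] * p[6] == 0:
                    flagged.append((i, j))   # north-west boundary / south-east corner points
            for i, j in flagged:
                img[i, j] = 0
            changed = changed or bool(flagged)
    return img
```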
3. The proposed approach

3.1. Overview

The Adaptive Thinning Framework (ATF) uses scale space filtering to derive multiple representations of an input image at multiple scales of filtering. Then, the filtering scale that gives the best trade-off between noise removal and shape distortion is automatically selected. The optimal scale selection is done using a measure that expresses the amount of noise and shape distortion. In Section 3.2, we clarify the theoretical considerations for using scale space filtering. Then, we highlight the key operations of our framework (Sections 3.3 and 3.4). Finally, the detailed procedure is explained (Section 3.5).

3.2. Scale space filtering

Scale space filtering, introduced by Witkin [29], is a technique to derive representations of the image through multiple filtering scales, allowing a set of descriptions of the data that range from fine (capturing local and detailed information) to coarse (capturing only the overall structure). Formally, for a given image f : ℕ² → ℕ, the scale space representation L : ℕ² × ℝ+ → ℕ is defined as follows [30]:

L(·,·; 0) = f

and

L(·,·; σ) = φ(·,·; σ) * f

where * is the convolution operation, φ(·,·; σ) is the filtering kernel and σ controls the scale. A larger σ generates a coarser scale,
describing the overall structure of the image, while a smaller σ corresponds to finer scales containing details of the data. Scale space filtering paves the way to studying how the image changes as the scale is varied. Applications of this include selecting the scale at which the image representation is the most appropriate for a particular task [31], detecting local image features that are stable across multi-scale representations [32], or studying the evolution of the image within multiple scales [33]. In this work, the goal is to generate several filtered images and then select the scale that gives the best thinning result. Fig. 1 illustrates this idea: as the scale increases, the image's noisy contours become softer, gaps caused by scratch become filled, and artifacts caused by dithers become connected and form flat regions, which removes the extra branches after thinning. When the scale becomes large, topology distortion is introduced. The challenge is then to reach a trade-off between noise reduction and shape distortion. As for the filter, we use the Gaussian filter, whose point spread function is defined as:
g(x, y; σ) = (1 / (2πσ²)) e^{−(x² + y²) / (2σ²)}

where σ is the smoothing parameter that controls the scale, and x and y are pixel coordinates.
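As a small illustration of this step, the sketch below generates a Gaussian scale space for a binary sketch image with SciPy; it is a minimal sketch of the filtering stage only, and the scale list used here is an arbitrary example, not the schedule prescribed by ATF.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_scale_space(image, sigmas):
    """Return the filtered versions L(., .; sigma) of a binary image (foreground = 1)."""
    f = (image > 0).astype(float)
    return {sigma: gaussian_filter(f, sigma=sigma) for sigma in sigmas}

# Example: coarse-to-fine representations of a random test pattern.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    test = (rng.random((64, 64)) > 0.9).astype(np.uint8)
    pyramid = gaussian_scale_space(test, sigmas=[1, 2, 4, 8])
    for sigma, L in pyramid.items():
        print(f"sigma={sigma}: filtered range [{L.min():.3f}, {L.max():.3f}]")
```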
3.3. Sensitivity measure

The sensitivity measure is the performance measure used to select the best thinned image from the scale space. It should reflect the sought trade-off between noise removal and shape distortion. In [34], Zhou et al. introduced an objective measure for skeleton noise estimation by calculating the number of transitions from black to white pixels in the neighborhood of each skeleton pixel (Eq. (3)), using the assumption that this number tends to be large in noisy skeletons and relatively smaller in neat skeletons. However, there are two issues with this measure:

- It does not consider isolated black pixels that are caused by noise.
- It has not been designed to express shape distortions caused by filtering. Therefore, without modification, it cannot be used within a scale space based framework.

We modify Zhou et al.'s measure in order to incorporate isolated black pixels and shape distortions. Our measure is as follows:

S_m(I_th) = (1/n) Σ_{i=1}^{N} Σ_{j=1}^{M} S_1(i, j)    (1)

where:

S_1(i, j) = 1, if T_BW(i, j) > 2  OR  T_BW(i, j) = 0  OR  [(I_th(i, j) = 1) AND (I(i, j) = 0)];
S_1(i, j) = 0, otherwise.    (2)

Here, n is the number of foreground pixels in I_th, and M and N are the dimensions of the image I_th. T_BW(i, j), in the first condition, returns the number of transitions from 1 (black foreground pixel) to 0 (white background pixel) in the neighborhood of pixel (i, j) (the neighboring pixels are ordered starting from the middle top pixel and going around the 8-neighborhood). T_BW(i, j) is defined as follows:

T_BW(i, j) = Σ_{k=1}^{8} transition(P_k)    (3)

transition(P_k) = 1, if (P_k = 1) AND (P_{(k+1) mod 8} = 0); 0, otherwise.

The modifications are the second and third conditions in Eq. (2). The second condition, T_BW(i, j) = 0, penalizes isolated black pixels, which usually come from noise. The third condition, [(I_th(i, j) = 1) AND (I(i, j) = 0)], penalizes black pixels of the thinned image I_th that come from shape distortions caused by the filtering and that do not exist in the original image I.

3.4. Optimal scale selection

The framework generates a number of filtered images depending on the width w of the image. Therefore, the filtering scale σ takes discrete values in the interval [σ_0, 2w + 1], where σ_0 is the initial scale value and 2w + 1 is the maximum scale value. First, the scale σ_min corresponding to the smallest value of the sensitivity measure S_m is identified within this interval. Then, in order to find the optimal scale σ_best, regression by a quadratic function is used: the sensitivity values S_m sampled at σ_min, σ_min − Δσ and σ_min + Δσ are used to estimate a quadratic regression function, and σ_best corresponds to the zero-crossing of the quadratic function's derivative.
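To make Eqs. (1)–(3) and the scale selection concrete, the following is a minimal Python/NumPy sketch under the assumptions stated in the comments (in particular, the sum in Eq. (1) is evaluated at the skeleton's foreground pixels, following Zhou et al.'s per-skeleton-pixel formulation); the function names are illustrative, not the authors' code.

```python
import numpy as np

def sensitivity_measure(thinned, original):
    """Eqs. (1)-(3): S_m(I_th), with 1 = black foreground pixel and 0 = white background."""
    # neighbours ordered from the middle-top pixel around the 8-neighbourhood
    offsets = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]
    penalties, n = 0, 0
    for i, j in zip(*np.nonzero(thinned)):
        if i < 1 or j < 1 or i > thinned.shape[0] - 2 or j > thinned.shape[1] - 2:
            continue
        n += 1
        p = [thinned[i + di, j + dj] for di, dj in offsets]
        t_bw = sum(p[k] == 1 and p[(k + 1) % 8] == 0 for k in range(8))   # Eq. (3)
        if t_bw > 2 or t_bw == 0 or original[i, j] == 0:                  # Eq. (2)
            penalties += 1
    return penalties / max(n, 1)                                          # Eq. (1)

def select_best_scale(sigmas, scores):
    """Quadratic regression around the sampled minimum of S_m (Section 3.4)."""
    if len(sigmas) < 3:
        return sigmas[int(np.argmin(scores))]
    k = int(np.argmin(scores))
    k = min(max(k, 1), len(sigmas) - 2)          # keep sigma_min - delta and sigma_min + delta in range
    a, b, _ = np.polyfit(sigmas[k - 1:k + 2], scores[k - 1:k + 2], 2)
    if a <= 0:                                   # degenerate fit: fall back to the sampled minimum
        return sigmas[k]
    return -b / (2.0 * a)                        # zero-crossing of the derivative of a*s^2 + b*s + c
```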
3.5. Detailed procedure

ATF operates as follows: the width w of the input image is estimated by counting the boundary-point removal operations needed until no more foreground points exist. Then, during each framework iteration, the input image is first filtered using a Gaussian filter of scale σ_i, which produces a grayscale image I_G(i). Next, I_G(i) is binarized using a binarization algorithm [35] to produce an image I_B(i), and a plugged-in thinning algorithm is used to produce a thinned image I_th(i). Then, the sensitivity measure S_m is calculated for I_th(i). Afterwards, the Gaussian scale σ_i is incremented and the whole process is repeated until σ_i = 2w + 1. Finally, the best scale σ_best is estimated using a quadratic regression that takes the scale σ_min corresponding to the smallest value of S_m, the previous scale σ_min − Δσ and the next scale σ_min + Δσ. The output of the framework, I_th(best), is the skeleton generated using σ_best.

Any thinning method can be used during the thinning step. We only assume that a thinning algorithm takes a binary or grayscale image as input, and produces a binary image of the skeleton. In the case of a thinning algorithm for grayscale images, the binarization step is omitted.

As a whole, ATF is similar to Hoffman and Wong's method [27] in that both use scale space filtering. However, ATF automatically estimates the optimal filtering scale, in contrast to Hoffman and Wong's method, which depends on a static parameter. In addition, ATF allows any thinning algorithm to be used during the thinning stage, while in Hoffman and Wong's method the thinning mechanism is hard-coded.
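The per-iteration pipeline can be sketched as follows, using common SciPy and scikit-image building blocks as stand-ins (Otsu thresholding for the binarization of [35], and skimage's thinning as one possible plugged-in algorithm); this is an assumed outline of the procedure above, not the authors' implementation, and the scoring function is passed in as a callable (for example, the sensitivity_measure sketched earlier).

```python
from scipy.ndimage import binary_erosion, gaussian_filter
from skimage.filters import threshold_otsu
from skimage.morphology import thin

def stroke_width(binary):
    """Estimate w by peeling boundary layers until no foreground points remain."""
    img, w = binary.copy(), 0
    while img.any():
        img = binary_erosion(img)
        w += 1
    return max(w, 1)

def atf_scale_sweep(binary, score_fn, thin_fn=thin, sigma0=1.0, delta=1.0):
    """Filter, binarize, thin and score the input at each scale sigma_i in [sigma0, 2w + 1]."""
    original = binary.astype(bool)
    w = stroke_width(original)
    sweep = []
    sigma = sigma0
    while sigma <= 2 * w + 1:
        filtered = gaussian_filter(original.astype(float), sigma=sigma)        # I_G(i)
        t = threshold_otsu(filtered) if filtered.min() < filtered.max() else 0.5
        binarized = filtered > t                                               # I_B(i)
        skeleton = thin_fn(binarized)                                          # I_th(i)
        sweep.append((sigma, skeleton, score_fn(skeleton, original)))          # S_m for this scale
        sigma += delta
    return sweep   # sigma_best is then chosen by quadratic regression over the scores
```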
4. Experiments

We evaluate ATF to examine the following:

- ATF's performance in the presence of noise and the degree of shape distortion that it introduces (Section 4.2).
- ATF's usefulness in sketch-based image retrieval (Section 4.3).

4.1. Image datasets

We begin by introducing the image datasets used for the experiments.

4.1.1. Dataset 1
This dataset contains 136 images of scanned hand-drawn sketches and drawings generated by computer graphics software, with various stroke thicknesses and noise (Fig. 2(a)). The dataset images have various dimensions ranging from 22 × 29 to 654 × 636 pixels. Sketch thickness ranged from 4 to 42 pixels.

4.1.2. Dataset 2
This dataset of scanned images contains 1431 images of handwritten alphabets, digits, mathematical symbols and expressions. The number of classes is 105 and the number of images per class ranges between 8 and 16. The images were collected by asking 8 subjects to imitate a sample of each class. Scale variance was introduced by asking the subjects to draw the same image at different scales (Fig. 2(b)).

4.1.3. Dataset 3
This dataset was generated by using a thinning algorithm [36] to produce skeletons of Dataset 2 images (Fig. 2(c)).

4.1.4. Dataset 4
This dataset was generated by using ATF to produce skeletons of Dataset 2 images (Fig. 2(d)).

4.1.5. Dataset 5
We use the "Shape data for the MPEG-7 core experiment CE-Shape-1" dataset [37], which contains 1402 images of various patterns such as animals, insects, mechanical tools, solid shapes, etc. (Fig. 2(e)). The images of this dataset are much thicker than the images of Dataset 1. Although these images are not generally considered typical sketch images, they are used here to evaluate ATF in the case where the thickness of the input images is large.

Fig. 3. Results of Experiment 1.

4.2. Experiment 1

This experiment aims to evaluate ATF's robustness against noise and the degree of topology distortion it introduces. For this purpose, the sensitivity measure S_m (Section 3.3) was used to measure the effect of noise, and two measures for the probability of topology preservation, T_1 and T_2, were used to estimate shape distortions. For the sake of simplicity, we refer to T_1 and T_2 as topology preservation measures. Images of Dataset 1 were used.

Evaluation measures. The topology preservation measures are defined as follows:

Topology preservation measure 1 [38]:

T_1 = Area[I_MD] / Area[I]    (0 < T_1 ≤ 1)

Here, I_MD is the image formed by the maximal discs that fit the original image along the skeleton (we refer the reader to [38] for a detailed explanation), and I is the original image. Values near 1 express topology preservation, while values near 0 express significant distortion.

Topology preservation measure 2 [21]:

T_2 = 1 − |1/2 − N_th / N_c|

Here, N_th is the number of foreground pixels in the skeleton, and N_c is the number of contour pixels in the original image. This measure considers that, for an image with a relatively smooth contour, the contour of the original image has nearly twice as many pixels as the skeleton, based on the assumption that the skeleton should be near the ideal medial axis and one pixel wide. The measure is normalized so that values near 1 express topology-preserving thinning, and values near 0 express significant distortion.
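For reference, the two measures can be sketched as below, assuming the formulas as reconstructed above and using a Euclidean distance transform to obtain the maximal-disc radii for I_MD; this is an illustrative approximation, not the evaluation code used in the paper.

```python
import numpy as np
from scipy.ndimage import binary_erosion, distance_transform_edt

def topology_measures(skeleton, original):
    """Approximate T1 (maximal-disc reconstruction) and T2 (skeleton vs. contour pixel counts)."""
    orig = original.astype(bool)
    skel = skeleton.astype(bool)
    # T1: rebuild the pattern from maximal discs centred on skeleton pixels, with radii
    # taken from the distance transform of the original pattern, then compare areas.
    radii = distance_transform_edt(orig)
    h, w = orig.shape
    yy, xx = np.mgrid[0:h, 0:w]
    reconstructed = np.zeros_like(orig)
    for i, j in zip(*np.nonzero(skel)):
        reconstructed |= (yy - i) ** 2 + (xx - j) ** 2 <= radii[i, j] ** 2
    t1 = reconstructed.sum() / max(orig.sum(), 1)
    # T2: an ideal one-pixel-wide skeleton has roughly half as many pixels as the contour.
    contour = orig & ~binary_erosion(orig)
    t2 = 1.0 - abs(0.5 - skel.sum() / max(contour.sum(), 1))
    return t1, t2
```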
Thinning algorithms. Experiments were held using five thinning algorithms:

- Zhang and Suen's thinning algorithm [39]: performs thinning of binary images by repeating two sub-iterations: one deletes the south-east boundary points and the north-west corner points, while the other deletes the north-west boundary points and the south-east corner points. Point deletion is done according to a specific set of rules. The two sub-iterations are repeated until no more points satisfy the deletion rules.
- Huang et al.'s thinning algorithm [21]: uses a set of templates to decide whether a contour point should be deleted in a binary image. Iterations of contour-point deletion are repeated until no more points match the deletion templates.
- Zhang et al.'s thinning algorithm [36]: works similarly to Zhang and Suen's algorithm [39], introducing an additional point deletion rule to improve connectivity preservation.
- Chatbri and Kameyama's thinning algorithm [40]: performs thinning of binary images in two stages: during the first stage, contour pixels are removed iteratively using template matching until reaching a thinned image containing 1- and 2-pixel-wide strokes. Then, during the second stage, pixels from the 2-pixel-wide strokes are removed to finally generate a 1-pixel-wide skeleton.
- Weiss's thinning algorithm [41]: Weiss describes a thinning methodology for grayscale images by threshold superposition: the grayscale image is decomposed into constituent binary images by thresholding. Then, the binary images are thinned. Finally, the skeleton is reconstituted by summing the thinned binary images.

Result and evaluation. The table in Fig. 3(d) shows the average values of the performance measures for the five algorithms used
directly, and when plugged into ATF. The values show that the thinning algorithms always had a better sensitivity measure when plugged into ATF than when applied directly. The two topology preservation measures decreased for all algorithms when using ATF; this is explained by the noise removal caused by the Gaussian filtering, which the measures register as a topology distortion. Fig. 3(c) shows examples of the experimental results.

Figs. 4 and 5 show a visualization of ATF applied to the images in Figs. 4(a) and 5(a). The image in Fig. 4(a) combines contour and scratch noise caused by digitization and the imperfection of digital sketching pens, but the image can still be considered relatively neat. The framework's output was the image in Fig. 4(h), requiring a Gaussian filtering of σ_best = 6 (Fig. 4(b)) and keeping high values of the topology measures T_1 and T_2 (Figs. 4(c) and 4(d)). As σ increases, distortions in topology are
Fig. 4. Visualization of ATF applied on image (a) (the algorithm described in [45] was used for thinning). Plots (b)–(d) show the performance measures changing as σ varies. Images (e)–(n) show thinning results after applying a Gaussian filter of scale σ. In this case, ATF produced the skeleton (h), requiring a Gaussian filtering of σ_best = 6.
Fig. 5. Visualization of ATF applied on image (a) (the algorithm described in [45] was used for thinning). Plots (b)–(d) show the performance measures changing as σ varies. Images (e)–(o) show thinning results after applying a Gaussian filter of scale σ. In this case, ATF produced the skeleton (o), requiring a Gaussian filtering of σ_best = 21.
introduced (Figs. 4(i)–4(n)), increasing the sensitivity measure S_m and decreasing the topology measures T_1 and T_2.

The image in Fig. 5(a) includes dither noise. In this case, ATF needed a Gaussian filtering of σ_best = 21 for the sensitivity measure S_m to reach a minimum (Fig. 5(b)), corresponding to the skeleton in Fig. 5(o). Meanwhile, larger filtering scales caused the topology measures T_1 and T_2 to decrease significantly (Fig. 5(c) and (d)).

The best filtering scale σ_best gives the minimum value of the sensitivity measure S_m because the filtering leads to the following:

- Decreasing noise by softening the noisy contours.
- Filling the gaps in scratch areas.
- Connecting black pixels caused by dither to form connected regions that, once thinned, give lower sensitivity measures.

When the noise is reduced, the skeleton contains fewer artifacts (redundant branches and lines caused by noise), and hence the sensitivity measure S_m will be lower. As for the scratch noise, the goal is to obtain a skeleton that contains single lines instead of double lines and bump-like structures in the scratch areas (Fig. 3(c)). This would increase the sensitivity measure S_m, since the medial lines are artifacts that do not exist in the original image. However, when the scratch noise is not abundant, artifact medial lines will not affect significantly
the overall sensitivity measure Sm , and ATF’s output will be the skeleton with the best beautification effect.
4.3. Experiment 2

In this experiment, the goal is to evaluate the use of ATF as a preprocessing step for sketch-based image retrieval.

Evaluation procedure. The performances of sketch-based image retrieval methods when using a conventional thinning algorithm [36] and when using ATF are compared. Three state-of-the-art methods are used: Angular Partitioning (AP) [14], Edge Relational Histogram (ERH) [42,43], and Shape Context (SC) [44]. Evaluation is done using precision-recall graphs, which are obtained by varying the number of recalled images, and the Area Under the Curve (AUC) measure. Images of Dataset 3 (Section 4.1.3) and Dataset 4 (Section 4.1.4) were used in this experiment.

Result and evaluation. Fig. 6 shows the precision-recall graphs for the retrieval methods. The result shows that the
use of ATF improves retrieval performance compared to conventional thinning. This improvement is due to the ability of ATF to produce neat skeletons (table in Fig. 6(d)): when using ATF instead of a conventional thinning algorithm, the dissimilarity measure for AP [14] decreased by a factor of 3, the similarity measure for ERH [42,43] increased by a factor of 4.6, and the dissimilarity measure for SC [44] decreased by a factor of 59.25. This result encourages using ATF in similar applications where thinning is required, such as OCR, document analysis, signature recognition, etc.

4.4. Experiment 3

In this experiment, we evaluate ATF's performance in thinning patterns with large stroke thickness.

Evaluation procedure. We compare the performance of Zhang et al.'s thinning algorithm [36] when used directly and when plugged into ATF, using the same metrics as in Experiment 1.
Fig. 6. Effect of using ATF as a preprocessing and normalization step on sketch-based image retrieval: (a) AUC(AP+ATF)/AUC(AP) = 1.13. (b) AUC(ERH+ATF)/AUC(ERH) = 1.28. (c) AUC(SC+ATF)/AUC(SC) = 1.03. (d) Improvement on the similarity measure introduced by ATF when matching sketches with different amounts of noise.
Fig. 7. Results of Experiment 3.
Result and evaluation. The table in Fig. 7(b) shows that, similarly to Experiment 1, ATF improves the performance of thinning in the presence of noise without harming topology preservation. Skeletons produced using ATF are neat compared to the results of Zhang et al.'s algorithm (Fig. 7(a)). However, since the stroke thickness here is significantly large, most of the visual information is lost, and the skeletons do not reflect the original patterns. This is best illustrated by the images of the bat and the bell (Fig. 7(a)).

5. Conclusion

In this paper, we introduced a framework for making thinning algorithms robust against noise in sketch images. Our framework uses scale space filtering to generate multiple representations of an input image at multiple scales. Then, the filtering scale that gives the best trade-off between noise removal and shape distortion is selected. The framework estimates the optimal filtering scale automatically and adaptively to the input image. In addition, any thinning algorithm can be used during the framework's thinning stage. Experimental results using five state-of-the-art thinning algorithms showed that our framework is robust against the typical types of noise that exist in sketch images, mainly contour noise, scratch and dithers. In addition, application of the framework in sketch matching showed its usefulness as a preprocessing and normalization step that improves matching performance.

Acknowledgement

The authors would like to thank Prof. Mark E. Hoffman for his kind and prompt assistance with his paper, and the anonymous reviewers for their constructive criticism and useful comments.
References [1] L. Lam, S. Lee, C. Suen, Thinning methodologies-a comprehensive survey, IEEE Trans. Pattern Anal. Mach. Intell. 14 (9) (1992) 869–885. [2] F. Stentiford, R. Mortimer, Some new heuristics for thinning binary handprinted characters for OCR, IEEE Trans. Syst. Man Cybern. 13 (1) (1983) 81–84. [3] L. Lam, C.Y. Suen, An evaluation of parallel thinning algorithms for character recognition, IEEE Trans. Pattern Anal. Mach. Intell. 17 (9) (1995) 914–919. [4] K.-C. Fan, J.-M. Lu, L.-S. Wang, H.-Y. Liao, Extraction of characters from form documents by feature point clustering, Pattern Recognit. Lett. 16 (9) (1995) 963–970. [5] S.J.F. Guimarães, M. Couprie, A. de Albuquerque Araújo, N. Jerônimo Leite, Video segmentation based on 2D image analysis, Pattern Recognit. Lett. 24 (7) (2003) 947–957. [6] G.L. Marcialis, F. Roli, Fingerprint verification by fusion of optical and capacitive sensors, Pattern Recognit. Lett. 25 (11) (2004) 1315–1322. [7] Y. He, J. Tian, X. Luo, T. Zhang, Image enhancement and minutiae matching in fingerprint verification, Pattern Recognit. Lett. 24 (9) (2003) 1349–1360. [8] M. Fons, F. Fons, E. Cantó, Fingerprint image processing acceleration through run-time reconfigurable hardware, IEEE Trans. Circuits Syst. Express Briefs 57 (12) (2010) 991–995. [9] T. Chanwimaluang, G. Fan, An efficient algorithm for extraction of anatomical structures in retinal images, International Conference on Image Processing (ICIP), vol. 1, IEEE, 2003, pp. 1193–1196. [10] M.M. Fraz, P. Remagnino, A. Hoppe, B. Uyyanonvara, A.R. Rudnicka, C.G. Owen, S.A. Barman, Blood vessel segmentation methodologies in retinal images – a survey, Comput. methods programs biomed. 108 (1) (2012) 407–433. [11] G. Zhu, Y. Zheng, D. Doermann, S. Jaeger, Signature detection and matching for document image retrieval, IEEE Trans. Pattern Anal. Mach. Intell. 31 (11) (2009) 2015–2031. [12] H. Lv, W. Wang, C. Wang, Q. Zhuo, Off-line chinese signature verification based on support vector machines, Pattern Recognit. Lett. 26 (15) (2005) 2390– 2399. [13] R. Kumar, J. Sharma, B. Chanda, Writer-independent off-line signature verification using surroundedness feature, Pattern Recognit. Lett. 33 (3) (2012) 301–308. [14] A. Chalechale, G. Naghdy, A. Mertins, Sketch-based image matching using angular partitioning, IEEE Trans. Syst. Man Cybern. 35 (1) (2005) 28–41. [15] W. Leung, T. Chen, Trademark retrieval using contour–skeleton stroke classification, International Conference on Multimedia and Expo (ICME), vol. 2, IEEE, 2002, pp. 517–520.
[16] T. Kato, T. Kurita, N. Otsu, K. Hirata, A sketch retrieval method for full color image database-query by visual example, in: IAPR International Conference on Pattern Recognition, Conference A: Computer Vision and Applications, vol. I, IEEE, 1992, pp. 530–533. [17] K. Bozas, E. Izquierdo, Large scale sketch based image retrieval using patch hashing, Adv. Visual Comput. 7431 (2012) 210–219. [18] C. Arcelli, Pattern thinning by contour tracing, Comput. Graphics Image Process. 17 (2) (1981) 130–144. [19] H. Chatbri, K. Kameyama, Sketch-based image retrieval by shape points description in support regions, in: International Conference on Systems, Signals and Image Processing (IWSSIP), 2013, pp. 19–22. [20] H. Chatbri, K. Kameyama, Towards making thinning algorithms robust against noise in sketch images, in: International Conference on Pattern Recognition (ICPR), 2012, pp. 3030–3033. [21] L. Huang, G. Wan, C. Liu, An improved parallel thinning algorithm, in: International Conference on Document Analysis and Recognition (ICDAR), IEEE, 2003, pp. 780–783. [22] L. Lam, C.Y. Suen, Evaluation of thinning algorithms from an OCR viewpoint, in: International Conference on Document Analysis and Recognition (ICDAR), IEEE, 1993, pp. 287–290. [23] Y. Chen, Y. Yu, Thinning approach for noisy digital patterns, Pattern Recognit. 29 (11) (1996) 1847–1862. [24] R. Singh, V. Cherkassky, N. Papanikolopoulos, Determining the skeletal description of sparse shapes, in: International Symposium on Computational Intelligence in Robotics and Automation (CIRA), IEEE, 1997, pp. 368–373. [25] R. Palenichka, M. Zaremba, Multi-scale model-based skeletonization of object shapes using self-organizing maps, International Conference on Pattern Recognition (ICPR), vol. 1, IEEE, 2002, pp. 143–146. [26] Y. Chen, Hidden deletable pixel detection using vector analysis in parallel thinning to obtain bias-reduced skeletons, Comput. Vision Image Underst. 71 (3) (1998) 294–311. [27] M.E. Hoffman, E.K. Wong, Scale-space approach to image thinning using the most prominent ridge line in the image pyramid data structure, in: Photonics West’98 Electronic Imaging, International Society for Optics and Photonics, 1998, pp. 242–252. [28] J. Cai, Robust filtering-based thinning algorithm for pattern recognition, Comput. J. 55 (7) (2012) 887–896. [29] A. Witkin, Scale-space filtering: a new approach to multi-scale description, International Conference on Acoustics, Speech, and Signal Processing (ICASSP), vol. 9, IEEE, 1984, pp. 150–153.
[30] T. Lindeberg, Scale-space theory: a basic tool for analyzing structures at different scales, J. appl. stat. 21 (1–2) (1994) 225–270. [31] T. Sezgin, R. Davis, Scale-space based feature point detection for digital ink, in: ACM SIGGRAPH 2006 Courses, ACM, 2006, pp. 29–35. [32] D. Lowe, Distinctive image features from scale-invariant keypoints, IJCV 60 (2) (2004) 91–110. [33] F. Mokhtarian, S. Abbasi, J. Kittler, Efficient and robust retrieval by shape content through curvature scale space, Ser. Softw. Eng. Knowl. Eng. 8 (1997) 51–58. [34] R. Zhou, C. Quek, G. Ng, A novel single-pass thinning algorithm and an effective set of performance criteria, Pattern Recognit. Lett. 16 (12) (1995) 1267–1275. [35] N. Otsu, A threshold selection method from gray-level histograms, Automatica 11 (285–296) (1975) 23–27. [36] F. Zhang, Y. Wang, C. Gao, S. Si, J. Xu, An improved parallel thinning algorithm with two subiterations, Optoelectron. Lett. 4 (1) (2008) 69–71. [37] Shape data for the mpeg-7 core experiment ce-shape-1 (2013).
[38] B. Jang, R. Chin, One-pass parallel thinning: analysis, properties, and quantitative evaluation, IEEE Trans. Pattern Anal. Mach. Intell. 14 (11) (1992) 1129–1140. [39] T. Zhang, C. Suen, A fast parallel algorithm for thinning digital patterns, Commun. ACM 27 (3) (1984) 236–239. [40] H. Chatbri, K. Kameyama, An adaptive thinning algorithm for sketch images based on Gaussian scale space, IEICE Technical Report, Image Engineering 111 (442) (2012) 33–38. [41] J.M. Weiss, Grayscale thinning, in: Computers and Their Applications, 2002, pp. 86–89. [42] Y. Kumagai, T. Arikawa, G. Ohashi, Query-by-sketch image retrieval using edge relation histogram, in: MVA2011 IAPR Conference on Machine Vision Applications, 2011, pp. 83–86. [43] Y. Kumagai, G. Ohashi, Query-by-sketch image retrieval using edge relation histogram, IEICE E96-D (2) (2013) 340–348. [44] S. Belongie, J. Malik, J. Puzicha, Shape matching and object recognition using shape contexts, IEEE Trans. Pattern Anal. Mach. Intell. 24 (4) (2002) 509–522. [45] Z. Guo, R. Hall, Parallel thinning with two-subiteration algorithms, Commun. ACM 32 (3) (1989) 359–373.