Accepted Manuscript
Structure Preserving Binary Image Morphing using Delaunay Triangulation
Abbas Cheddad
PII: S0167-8655(16)30335-X
DOI: 10.1016/j.patrec.2016.11.010
Reference: PATREC 6676
To appear in:
Pattern Recognition Letters
Received date: 27 May 2016
Revised date: 14 September 2016
Accepted date: 20 November 2016
Please cite this article as: Abbas Cheddad, Structure Preserving Binary Image Morphing using Delaunay Triangulation, Pattern Recognition Letters (2016), doi: 10.1016/j.patrec.2016.11.010
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Highlights
Multiple dilation rapidly deforms object structure using mathematical morphology.
Preserving object structure is of paramount importance in pattern recognition.
A new geometric-based mechanism for binary image dilation is proposed.
Our method exploits Delaunay triangulation, a versatile geometric structure.
Our method shows high performance when applied to handwritten digit classification.
Abbas Cheddad
Blekinge Institute of Technology (BTH), Karlskrona, SE-371 79, Sweden
Abstract
Mathematical morphology has been of great significance to several scientific fields. Dilation, one of its fundamental operations, has relied heavily on common methods that are based on set theory and that use structuring elements of specific shapes to morph binary blobs. We hypothesised that performing morphological dilation while exploiting the geometric relationship between dot patterns offers some advantages. We chose the Delaunay triangulation to examine the feasibility of this hypothesis because of its favourable geometric properties. We compared the proposed algorithm to existing methods, and it became apparent that Delaunay-based dilation has the potential to emerge as a powerful tool for preserving object structure and mitigating the influence of noise. Additionally, the proposed method no longer requires a structuring element to be defined, and the dilation adapts to the topology of the dot patterns. We assessed the property of object structure preservation using common measurement metrics. We also demonstrated this property through handwritten digit classification using HOG descriptors extracted from images dilated by the different approaches and classified with Support Vector Machines. The confusion matrix shows that our algorithm has the best accuracy estimate in 80% of the cases. In both experiments, our approach shows a consistently improved performance over the other methods, which supports the suitability of the proposed method.
recognition to medical imaging and modelling of organism growth, to industrial product inspection, etc., thus capturing the interest of both the research and the industrial worlds. Shape descriptors are most often associated with statistical and directional metrics that act on a binary image of a given object. Before such analysis, non-binary images are normally pre-processed to segment the region of interest, although some related methods work directly on the grayscale intensity channel, such as those that use energy minimisation algorithms (e.g., active contours). The latter type is not the focus of this paper. A known discipline that relates directly to binary shape processing is mathematical morphology, whose fundamental operations are expressed using set theory. Its non-linear processing and its direct relation to shape descriptors set it apart from the convolution operations deployed in signal processing. As far as image processing is concerned, mathematical morphology offers a powerful and unified approach to tackle a number of problems [1]. Morphological operators are among the first image-based operators. Dilation is one of the two primitive operators in mathematical morphology; the other is its dual with respect to set complementation, known formally as erosion. Dilation, which is the focus of this paper, is defined as the morphological transformation that combines two sets using vector addition of set elements, as Haralick et al. put it [2]. Historically, there were other definitions, and they all boil down to how dilation operates. For instance, spatial shifting of an image to yield dilation is in fact closely related to the concept of Minkowski addition. Dilation is implemented using binary kernels termed structuring elements (SE). They can be of any shape: square, disk, diamond, or any other arbitrary shape.
Dilation provides isotropic expansion of binary shapes, and it has recently been extended to grayscale objects. It can also bridge gaps in broken segments or smooth out shapes. Cuisenaire [3] defined a morphological operation as a variant of distance transformation algorithms using ball-shaped SEs that are locally adaptable. We shed more light on the modern notion of the distance transform in connection with dilation in section 2. Shih [4] described the sweep morphology operation, which is rendered adaptive by changing the rotation angles and scaling factors of the SEs, which are defined with respect to the boundary of the curve of a given object (Ch. 11, p.344). In recent years, development around the topic of mathematical morphology has migrated to grayscale images and 3D objects. Thus, all of the fundamental morphological operations on binary images have been successfully extended to the grayscale spatial plane [5][6][7][8], and to homography, 3D mesh and hyperplane projective spaces [9][10][11]. Additionally, a current trend is the adaptation of classical morphological tools from signal processing to graph structures (e.g., 3D projections); a survey paper on graph-based morphology is available in [12]. Such research avenues are beyond the scope of this letter; therefore, for the sake of conciseness and clarity, we limit our discussion to binary image dilation.
Structure Preserving Binary Image Morphing using Delaunay Triangulation
Keywords: Binary Image; Delaunay Triangulation; Dilation; Mathematical Morphology; Set Theory; Structuring Element; Distance Transform; Pattern Recognition.
1. Introduction and Background
Shape analysis is of paramount importance in different computational fields, spanning from computer vision and robotics to optical character recognition (OCR) and pattern
Corresponding author. Tel.: +46 4 55-385863.
1.1. Example Applications
Binary Image Dilation
Stahlberg and Vogel [13] applied a dilation operation with an ellipse-shaped 3x3 kernel to increase the thickness of handwritten text lines. Analogously, Fouladi and Araabi [14] took the image of the main body of a glyph (an elemental symbol) and performed multiple dilation operations to thicken the body structure.
shows that morphology is among those techniques that represent shapes as a non-linear transformation. Our ultimate objective in this work, illustrated as a dashed line, is to adapt the Delaunay triangulation topology to perform basic mathematical morphology (i.e., dilation in this case).
Voronoi Diagram / Delaunay Triangulation
These techniques are not scarce in the literature either. For instance, Cheddad et al. [22][23] extracted the outer vertices of the Delaunay triangulation (DT) to segment grayscale images by processing only a small set of no more than 255 points (i.e., equivalent to the total possible number of image histogram bins), which makes it a very fast segmentation algorithm. Xiao and Yan [24] and Xie and Lam [25] applied DT to model human faces in images for biometric applications (e.g., face recognition, simulation of facial expression, etc.). The aforementioned applications are but a few examples extracted from a larger pool of morphology applications, which we hope will rekindle the interest in revisiting this central and vital field.
In order to make this paper self-contained, we briefly review the widely used dilation methods. A morphing procedure usually consists of two steps. The first step is to choose a proper flat structuring element (kernel) with a specified neighbourhood; the latter is a square matrix containing 1's and 0's (the pattern formed by the ones and zeros dictates the shape of the structuring element). The second step is to apply a morphological filtering method with the chosen matrix to achieve a morphological effect (probing and expanding the shapes contained in an input binary set). We can state that symbolically as follows. Let $I$ and $S$ be sets in the 2D space $\mathbb{N}^2$ that correspond to the binary image and the structuring element, respectively, and let $i = \{i_1, i_2, \ldots, i_n\}$ and $s = \{s_1, s_2, \ldots, s_n\}$ be the elements in $I$ and $S$, respectively. The dilation of $I$ by $S$, denoted by $I \oplus S$, is defined as:
$$I \oplus S = \{\, d \in \mathbb{N}^2 \mid d = i + s,\; i \in I,\; s \in S \,\} \tag{1}$$
Shaus et al. [15] dilated a bounding octagon in order to account for certain inaccuracies in facsimiles (black and white images of ancient inscriptions) of Hebrew texts of the Iron Age (First Temple period). Zelenika et al. [16] dilated binary images of different Sobel filters by a rectangular structuring element. Desai [17] analysed the basic dilated shape of characters to build the feature set for a support vector machine classifier. Jamal et al. [18] used morphological reconstruction by dilation, based on an estimated baseband from shape analysis, for the segmentation of Arabic handwritten texts. Khayyat et al. [19] proposed the detection of handwritten text lines by applying an adaptive mask to morphological dilation. In [20], binary handwriting samples were morphologically dilated using a 7-pixel-wide diamond-shaped structuring element. The authors claimed that this step enhances the template matching outcome due to the resulting different degrees of word shape “fuzziness”. On another frontier, Han et al. [21] demonstrated an eye detection system using morphological operations, which they called morphology-based eye-analogue segmentation. Its aim is to reduce the interference of the background. Basically, they performed a closing operation, a type of morphology that uses dilation followed by erosion, and clipped different portions to find the candidate eye-analogue pixels.
Fig. 1. Expanded taxonomy model of shape representation domain techniques and the unique connection this work makes (dotted arrow).
2. Common Dilation Methods
1.2. Research Statement and Contribution
The outcome of dilation depends on the structuring element used and the number of iterations, as well as on the characteristics of the shape being processed. If a mild effect is desired in each dilation pass, one can use a small SE, which translates to 5 neighbours for a flat disk SE and to 4 neighbours for a flat square SE. Repeated passes quickly evolve into an unrecognisable thickened shape, especially for small binary objects. Shih [4] stated that “traditional mathematical morphology uses a designed structuring element that does not allow geometric variations during the operation to probe an image.” (Ch. 11, p.341). Although [3] and [4] have proposed adaptive dilation processes, both still utilise some sort of structuring element. Therefore, it is the intent of this paper to propose an approach that addresses this void. Thus, the contribution of this paper is our attempt to revive the area of mathematical morphology by suggesting that, in addition to set theory, the pure geometric association of dot patterns in the Euclidean space also be considered. Namely, we propose the use of the DT for binary image morphing. To the best of our knowledge, such a concept has not been proposed before, which reinforces its novelty. Our results reaffirm the usefulness of this notion and confirm the stability and the mild thickening effect that help preserve shape properties, which is inevitably a good property for many applications, including pattern recognition algorithms. Fig. 1
Eq. (1) can be rewritten as the aggregated union of translations $S_i$ (the structuring element centred at every pixel location $i$ in the foreground):
$$I \oplus S = S \oplus I = \bigcup_{i \in I} S_i \tag{2}$$
It is possible to achieve dilation by reflecting $S$ (a.k.a. the symmetric of $S$). Let $\hat{S}$ denote the reflected $S$; then the above equation becomes:
$$I \oplus S = \{\, n \in \mathbb{N}^2 \mid \hat{S}_n \cap I \neq \emptyset \,\} \tag{3}$$
There are dozens of standardised structuring elements, and a limitless number of them can be set arbitrarily. The structuring elements we use in this study are a square structure SE1 = {(0,0),(0,1),(1,0),(1,1)} and a disk structure SE2 = {(0,1),(1,0),(1,1),(1,2),(2,1)}, which are depicted graphically in Fig. 2.
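As an illustration only, Eq. (1) and Eq. (2) can be sketched in a few lines of Python that treat the image and the SE as sets of (row, column) coordinates. The function names and the helper `centred`, which shifts an SE so that a chosen origin maps to (0, 0), are assumptions made for this sketch and are not taken from the original implementation.

```python
# Binary dilation as vector addition of set elements (Eq. 1) and as a
# union of translated structuring elements (Eq. 2).
# Pixels are (row, col) tuples; all names here are illustrative.

SE1 = {(0, 0), (0, 1), (1, 0), (1, 1)}            # square SE from the text
SE2 = {(0, 1), (1, 0), (1, 1), (1, 2), (2, 1)}    # disk SE from the text


def dilate(image, se):
    """Eq. 1: every vector sum i + s with i in image and s in se."""
    return {(i[0] + s[0], i[1] + s[1]) for i in image for s in se}


def dilate_by_union(image, se):
    """Eq. 2: union of copies of se translated to each foreground pixel."""
    result = set()
    for (r, c) in image:
        result |= {(r + dr, c + dc) for (dr, dc) in se}
    return result


def centred(se, origin):
    """Hypothetical helper: shift se so that `origin` becomes (0, 0)."""
    return {(r - origin[0], c - origin[1]) for (r, c) in se}


blob = {(5, 5)}  # a single "on" pixel
print(sorted(dilate(blob, SE1)))
print(sorted(dilate(blob, centred(SE2, (1, 1)))))
```

Both functions return the same set for any input, which is the equivalence stated by Eq. (2). With the coordinates exactly as listed in the text, the result is also translated by the SE offsets, which is why the sketch re-centres the disk SE before use.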
Fig. 2. The two morphological structuring elements used in this study: a square SE containing 4 neighbours (left) and a disk SE containing 5 neighbours (right).
A modern method is the adaptation of the Euclidean distance map (EDM), also called Euclidean distance transform, to mathematical morphology. EDM is a fundamental geometrical
Fig. 5. Delaunay triangulation of the seven dot patterns shown as black lines.
Each triangle in the DT shown in Fig. 5 is created in the following manner. Once the VD for a set of points is constructed, it is a simple matter to produce the DT by connecting any two sites whose Voronoi polygons share an edge. More specifically, let P be a circle-free set. Three points p, q and r of P define a Delaunay triangle if there is no further point of P in the interior of the circle circumscribed to the triangle pqr; the centre of that circle lies on a Voronoi vertex. Fig. 6 depicts this notion.
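The empty-circumcircle condition described above can be checked directly. The following is a minimal sketch, assuming points are (x, y) tuples; the function names and the tolerance `eps` are illustrative choices and not part of the paper.

```python
def circumcircle(p, q, r):
    """Return (centre, squared radius) of the circle through p, q and r."""
    ax, ay = p
    bx, by = q
    cx, cy = r
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if d == 0:
        raise ValueError("collinear points have no circumcircle")
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return (ux, uy), (ax - ux) ** 2 + (ay - uy) ** 2


def is_delaunay(p, q, r, points, eps=1e-9):
    """True if no other point lies strictly inside the circumcircle of pqr."""
    (ux, uy), r2 = circumcircle(p, q, r)
    return all((x - ux) ** 2 + (y - uy) ** 2 >= r2 - eps
               for (x, y) in points if (x, y) not in (p, q, r))
```

The centre returned by `circumcircle` is the point that would coincide with a Voronoi vertex when the triangle is a Delaunay triangle.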
operator with great applicability in a variety of computer vision applications, and it is also related to many other important entities such as the VD [26], which we discuss in this paper, morphological dilation [27] and graphs [28]. The distance transform associates each point p of a set P with the distance from itself to the closest point in the complementary set of P [29]. The computation of morphological filters using the EDM is highlighted in [26, p.2:6] and [3]. Thresholding the EDM at level 1.1 generates the smallest possible dilation, as shown in Fig. 3.
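A minimal sketch of dilation by thresholding the distance map is given below. It assumes a brute-force distance computation on a small image stored as a list of lists of 0/1 values with at least one foreground pixel; practical implementations use much faster distance transform algorithms, and the function names are illustrative.

```python
import math


def distance_to_foreground(image):
    """EDM of the inverted image: for each pixel, the Euclidean distance
    to the nearest foreground (1) pixel; it is 0 on the foreground."""
    rows, cols = len(image), len(image[0])
    fg = [(r, c) for r in range(rows) for c in range(cols) if image[r][c]]
    return [[min(math.hypot(r - fr, c - fc) for fr, fc in fg)
             for c in range(cols)] for r in range(rows)]


def edm_dilate(image, level=1.1):
    """Keep every pixel whose distance to the foreground is <= level."""
    edm = distance_to_foreground(image)
    return [[1 if d <= level else 0 for d in row] for row in edm]
```

At level 1.1 only the four edge-connected neighbours (distance 1) are added, since diagonal neighbours lie at a distance of about 1.41; this is why that threshold yields the smallest possible dilation on a square grid.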
Fig. 3. Isotropic dilation using the Euclidean distance transform. (a) Original binary image, “cat32.pgm”, from the Binary Shape Databases, Brown University. (b) Dilation with the distance thresholded at level 1.1. (c) Colour-enhanced EDM of the inverted image in (a).
3. Delaunay Triangulation based Binary Image Morphing (DTBIM)
The proposed approach is ultimately based on the DT technique, which is the dual of the Voronoi Diagram (VD), a versatile geometric structure [30][31][24]. We opt to address the issue of binary morphology geometrically and not spatially. In other words, we treat shapes as dense dot patterns, which DT is meant to handle. A VD is essentially a set of geometric elements derived from the distance relationship of geometric objects, and it has some useful properties for engineering applications. A VD can be generated using different algorithms; perhaps the best-known implementations are Fortune's algorithm and the Quickhull algorithm. Given a set of 2D points, the Voronoi region for a point $p_i$ is defined as the set of all the points that are closer to $p_i$ than to any other point; see Fig. 4. The Euclidean distance is normally the distance metric of choice for VD construction. It is defined, between two points $p$ and $q$ with $(x, y)$ coordinates in the Euclidean space, as:
$$d(p, q) = \sqrt{(p_x - q_x)^2 + (p_y - q_y)^2} \tag{4}$$
Fig. 4. Voronoi diagram of seven dot patterns shown as thick lines in blue colour.
The intersections of the Voronoi regions for the set of points construct the Voronoi diagram. More formally, let $K = \{p_1, \ldots, p_n\}$ be a set of $n$ distinct points in the 2D plane ($\mathbb{R}^2$). The Voronoi cell $V(p_i)$ of a point $p_i \in K$ is defined as:
$$V(p_i) := \{\, q \in \mathbb{R}^2 : d(p_i, q) \le d(p_j, q),\; \forall\, i \ne j \,\} \tag{5}$$
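To make the Voronoi cell definition concrete, the sketch below labels every pixel of a small grid with the index of its nearest site, using the Euclidean distance of Eq. (4). It is a brute-force illustration under the assumption that sites are (x, y) tuples; it is not Fortune's algorithm, and ties are resolved in favour of the lowest site index.

```python
import math


def euclidean(p, q):
    """Eq. 4: Euclidean distance between points p and q."""
    return math.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)


def voronoi_labels(sites, width, height):
    """Label each grid point (x, y) with the index of its nearest site,
    i.e. the discrete Voronoi cell it belongs to (Eq. 5)."""
    labels = []
    for y in range(height):
        row = []
        for x in range(width):
            dists = [euclidean(site, (x, y)) for site in sites]
            row.append(dists.index(min(dists)))  # ties: lowest index wins
        labels.append(row)
    return labels
```

Grid points that are equidistant from two sites lie on the shared boundary of the two cells; the sketch assigns them to one cell only for display purposes.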
The dual tessellation of the VD is known as the Delaunay triangulation (DT), as shown in Fig. 5. The duality comes about in the following way: vertices in the Voronoi diagram correspond to faces in the DT, while Voronoi cells correspond to vertices of the DT.
Fig. 6. Delaunay construction. The triangle pqr defines a Delaunay triangle where the centre (C) of the circle circumscribed to pqr is a Voronoi vertex.
The key point in the preceding discussion and the above synthesised figures is that the geometry of the sites (dot patterns) dictates the construction of the DT via its dual topology, the VD. When it comes to digital images, blob pixels correspond to those dot patterns; however, the point coordinates (x, y) are discrete and are described by their two-dimensional locations on a square grid.
3.1. DTBIM: A constrained DT
As mentioned earlier, DTBIM tries to provide an alternative to the existing dilation methods (using set theory or the distance transform). One of the main advantages of DTBIM is its small incremental expansion, which is, as we will see later, unreachable by the smallest SE used in the common methods. Thus, shape characteristics may survive longer chains of dilations before the shape is totally deformed. Such a feature can benefit the performance of several algorithms for further processing (e.g., word/line segmentation in scanned text documents, text and shape based pattern recognition, etc.). The other characteristic of DTBIM is that it obviates the need for a specific SE. Its morphology is inherited from its inner triangle structure which, unlike an SE, adapts to the geometry formed by the object pixels in the Euclidean space. Moreover, DTBIM tends to be less prone to additive noise. Inner solid patches in a binary image are of no interest to DTBIM (i.e., no expansion occurs there). Therefore, we pre-process the input binary image by extracting its contour, denoted here by $\Gamma$, with a predefined inward thickness level as follows (detailed steps are furnished in Algorithms 1 and 2):
$$\Gamma = \{\, 0 < B^t \le \tau \,\}, \qquad B^t = \mathrm{Dist}(B^c) \tag{6}$$
where $B^c$ is the image complement of the original binary image $B$ (in which the foreground is black), and the contour is obtained from its distance-transformed image ($B^t$). The variable $\tau$ controls the inward thickness of the contour and is chosen to generate a contour that is sufficient to construct the DT ($\tau \ge 3$, with the contour contained in $B$).
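The contour extraction step can be sketched as follows. This is a minimal brute-force illustration, assuming the image is a list of lists in which 1 marks the foreground and that at least one background pixel exists; the function name and the default `tau=3` are assumptions made for the sketch.

```python
import math


def inward_contour(image, tau=3):
    """Keep foreground pixels whose Euclidean distance to the nearest
    background pixel lies in (0, tau]; deeper interior pixels are dropped."""
    rows, cols = len(image), len(image[0])
    bg = [(r, c) for r in range(rows) for c in range(cols) if not image[r][c]]
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if image[r][c]:
                d = min(math.hypot(r - br, c - bc) for br, bc in bg)
                if 0 < d <= tau:
                    out[r][c] = 1
    return out
```

For a solid 7x7 square, `tau=1` keeps only the one-pixel border, whereas `tau=3` keeps everything except the central pixel, which illustrates how the thickness parameter controls how much of the solid interior is excluded from the triangulation.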
Algorithm 1: Outer contour extraction (Γ)
input: Binary image (B)
output: Contour (Γ)
  Take the image complement, B^c
  Generate the distance map, B^t
  if (B^t > 0 && B^t <= τ) then
    set B^t(x, y) to 1
  else
    set B^t(x, y) to 0
  endif
  Γ = B^t
  return Γ
Fig. 7: DT triangulations of three randomly allocated pixels in $N_8(p)$ (3x3 block matrix). The numerical values correspond to $(A(DT), a, b, c)$,
the 8-neighbours ($x \pm 1, y \pm 1$) and the 24-neighbours ($x \pm 3, y \pm 3$) of a given pixel $p(x, y)$, respectively. And let $A(DT)$ be the Heron's area of the DT triangle $T_{DT}$ formed by any three “on” pixels in the adjacent neighbourhood $N_8(p)$. Then the
a slightly larger expansion.
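$$DT = \begin{cases} DT, & \text{iff } A(DT) \le \delta \,\wedge\, T_{DT}^{a,b,c} \le n\sqrt{2} \\ \text{discarded}, & \text{otherwise} \end{cases} \tag{7}$$
where the area threshold, denoted here by $\delta$, is set to 2 units, since any triangle formed within the block $N_8(p)$ must have an area less than or equal to 2 (see Fig. 7), and $n = 4$ penalises the side lengths $(a, b, c)$ of the triangle ($T_{DT}$), preventing them from overstretching to blocks beyond $N_{24}(p)$. The final phase is to apply a median filter to smooth the DT boundary: $DTBIM = \mathrm{med}(DT(\Gamma) \cup B)$, where $DT$ is the output of Eq. 7, $\Gamma$ is the extracted contour, and $\mathrm{med}$ is the 3x3 median filter used to smooth edges.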
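The twofold constraint of Eq. 7 can be expressed as a small predicate on a triangle's vertices. The sketch below assumes vertices are (x, y) pixel coordinates; the function names, the parameter name `max_area` for the area threshold, and the numerical tolerance `tol` are illustrative choices.

```python
import math


def side_lengths(p, q, r):
    """Lengths (a, b, c) of the three sides of triangle pqr."""
    return (math.hypot(p[0] - q[0], p[1] - q[1]),
            math.hypot(q[0] - r[0], q[1] - r[1]),
            math.hypot(r[0] - p[0], r[1] - p[1]))


def heron_area(a, b, c):
    """Heron's formula; the product is clamped at 0 against rounding."""
    s = (a + b + c) / 2.0
    return math.sqrt(max(s * (s - a) * (s - b) * (s - c), 0.0))


def is_valid_triangle(p, q, r, max_area=2.0, n=4, tol=1e-9):
    """Eq. 7: keep the triangle only if its area is at most max_area and
    every side is at most n * sqrt(2)."""
    a, b, c = side_lengths(p, q, r)
    return (heron_area(a, b, c) <= max_area + tol
            and max(a, b, c) <= n * math.sqrt(2) + tol)
```

With n = 4 the side bound equals the corner-to-corner diagonal of a 5x5 pixel block, so a thin triangle such as (0, 0), (6, 0), (3, 0.5) is rejected on side length even though its area is only 1.5.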
Algorithm 2: The constrained DT (DTBIM)
input: + Outer contour extraction (Γ)
       + The binary image (B)
output: Constrained DT (DTBIM)
  Take the image complement of B, B^c
  Perform exclusive disjunction with B^c
  Construct DT, DT(Γ)
  // Filter out invalid triangles
  for each triangle T_DT in DT do
    if (Area of T_DT <= area threshold of Eq. 7 && length of T_DT side-lengths <= n√2) then
      Keep T_DT (a valid triangle)
    else
      Delete T_DT
    endif
  endfor
  // Geometry to spatial domain pixel interpolation
  for each valid triangle T_DT in DT do
    set the interpolated pixels of T_DT to 1, DT_s(Γ)
  endfor
  // Perform median filter on the union set
  DTBIM = med(DT_s(Γ))
  return DTBIM
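The last two stages of Algorithm 2, interpolating a valid triangle onto the pixel grid and applying the 3x3 median filter, can be sketched as below. This is an illustration under stated assumptions rather than the author's implementation: vertices are integer (x, y) pixel coordinates, the grid is indexed as `grid[y][x]`, pixels on the triangle boundary are included, and the median filter pads the image with zeros.

```python
def fill_triangle(grid, p, q, r):
    """Set to 1 every grid pixel lying inside or on triangle pqr."""
    def edge(a, b, c):
        # Signed area test: which side of segment ab the point c lies on.
        return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

    xs, ys = (p[0], q[0], r[0]), (p[1], q[1], r[1])
    for y in range(min(ys), max(ys) + 1):
        for x in range(min(xs), max(xs) + 1):
            w = (edge(p, q, (x, y)), edge(q, r, (x, y)), edge(r, p, (x, y)))
            if all(v >= 0 for v in w) or all(v <= 0 for v in w):
                grid[y][x] = 1
    return grid


def median3x3(grid):
    """3x3 median of a binary image with zero padding: a pixel becomes 1
    when at least 5 of the 9 values in its window are 1."""
    rows, cols = len(grid), len(grid[0])
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            ones = sum(grid[j][i]
                       for j in range(max(y - 1, 0), min(y + 2, rows))
                       for i in range(max(x - 1, 0), min(x + 2, cols)))
            out[y][x] = 1 if ones >= 5 else 0
    return out
```

In a full pipeline, `fill_triangle` would be called once per triangle that passes the area and side-length test, and `median3x3` would be applied to the union of the filled triangles and the original image.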
respectively. As can be seen, in all cases we have $0.5 \le A(DT) \le 2$. Hence the triangulation is adaptive to the arrangement of pixels. Triangles are quantised into the Cartesian coordinate system to yield the sought expansion. In the case of $N_{24}(p)$, the notion is the same except that it generates
3.2. DTBIM: Results and Discussion
constraint that we impose on DT is twofold.
In order to tailor DT to fit into the context of image morphology, we need to control its behaviour. Let $N_8(p)$ and $N_{24}(p)$ denote
Our experimental data set comprises binary images obtained from the web as well as from the Binary Shape Databases¹ available from the Computer Vision centre (LEMS) at Brown University. To quantify the performance of the proposed algorithm, we use some standard quality metrics. In all the metrics listed below, the comparison is made between the original binary image and the morphed image of each method (after each dilation pass). That gives a hint at the degree of shape deformation. We also used another public database of handwritten digits, constructed by the Computer Vision Group at the University of São Paulo, to test the efficiency of DTBIM; see the end of this section.
Structural Similarity Index (SSIM)
The SSIM originally computes three terms: the luminance term, the contrast term and the structural term [32]. The former two are discarded as they are irrelevant to binary images, and we keep the last term as it captures the structure, which is central to this work. The structure comparison function is formulated as follows:
$$s(I, \hat{I}) = \frac{\sigma_{I\hat{I}} + \lambda}{\sigma_I \sigma_{\hat{I}} + \lambda} \tag{8}$$
where $\lambda$ is a regularisation constant for the structural term, $\sigma_I$ and $\sigma_{\hat{I}}$ are the standard deviations, and $\sigma_{I\hat{I}}$ is the cross-covariance of the two binary images under scrutiny: $I$ (the original binary image) and $\hat{I}$ (the estimate of $I$ after applying dilation).
Peak Signal-to-Noise Ratio (PSNR)
There has been an intense debate around the superiority of SSIM over PSNR, but the latter is still a recognised standard for measuring the quality of images after distortion (e.g., JPEG compression). Which one is more efficient in our case is a moot point that is out of the scope of this paper.
2D Correlation Coefficient
¹ http://vision.lems.brown.edu/content/available-software-and-databases, accessed on 2016-05-19.
Fig. 9. A binary turbinate image and its 4th dilation pass using the three algorithms.
The correlation coefficient is another popular metric that, like SSIM, has lower and upper bounds and is easy to interpret. The preceding discussion was at an abstract level and, although it deals with dilation, erosion is possible by operating on the complement image, $I^c$. An advantage of DTBIM is its resilience to noise. Gaussian white noise ($\mu = 0$, $\sigma^2 = 0.01$) was added to the original binary image, and we subsequently applied a single dilation pass using the different methods. The dilation results shown in Fig. 8 and Fig. 9 suggest that DTBIM is less prone to additive noise and natural noise, respectively. A more exhaustive experiment using the public binary shape database (see the link in section 3.2) reinforces our claim (Fig. 10). One recognisable limitation of DTBIM, however, is that it cannot dilate collinear points (non-convex surfaces). Thus, the horizontal line shown in the supplementary material, Fig. S1(b) DTBIM, eventually disappears.
Fig. 8. Robustness against additive white Gaussian noise. This experiment is based on a single dilation pass.
Fig. S1 (in the supplementary document) sheds light on some illustrative examples, and Fig. S2 is an enlarged version of the example shown in Fig. S1(b) but with only 5 dilation passes. At the first iteration, each binary image is compared to itself so as to force all the curves to depart from the same perfect-match point. The PSNR is the odd metric in this respect, since it has no upper limit and identical input images result in a division by zero, thus returning an infinite value, which is omitted from the display. By examining the test results, we can confirm our hypothesis that DTBIM morphs objects slowly compared to dilation using the two common SEs (shown in section 2). To support this claim, we extracted the absolute dilated area, which is computed by applying the logical operation $\mathrm{XOR}(I, \hat{I})$ and then measuring the area of the result (see supplementary material, Fig. S1(e)).
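For reference, the four measurements used in this section can be sketched in plain Python for small binary images stored as lists of lists. The function names, the default value of the regularisation constant `lam`, and the use of sample (n - 1) normalisation are assumptions of this sketch; the paper does not state them.

```python
import math


def _flatten(img):
    """Flatten a 2D list of 0/1 values into a list of floats."""
    return [float(v) for row in img for v in row]


def _moments(x, y):
    """Standard deviations of x and y and their cross-covariance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((u - mx) ** 2 for u in x) / (n - 1)
    syy = sum((v - my) ** 2 for v in y) / (n - 1)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y)) / (n - 1)
    return math.sqrt(sxx), math.sqrt(syy), sxy


def structure_term(img, est, lam=1e-6):
    """Structural term of SSIM (Eq. 8)."""
    sx, sy, sxy = _moments(_flatten(img), _flatten(est))
    return (sxy + lam) / (sx * sy + lam)


def psnr(img, est, peak=1.0):
    """PSNR in dB; infinite when the two images are identical."""
    x, y = _flatten(img), _flatten(est)
    mse = sum((u - v) ** 2 for u, v in zip(x, y)) / len(x)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


def corr2(img, est):
    """2D correlation coefficient (undefined for a constant image)."""
    sx, sy, sxy = _moments(_flatten(img), _flatten(est))
    return sxy / (sx * sy)


def dilated_area(img, est):
    """Number of pixels that differ, i.e. the area of XOR(img, est)."""
    return sum(1 for u, v in zip(_flatten(img), _flatten(est)) if u != v)
```

Comparing an image with itself gives a structural term and a correlation of 1, an infinite PSNR and a dilated area of 0, which matches the common starting point of the curves described above.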
Fig. 10. Robustness against additive white Gaussian noise on 300 randomly selected images from the public binary shape database available from the Brown University (see section 3.2 and the URL link therein).
Time Complexity
Dilation operations have a time complexity of $O(n\,p\,q)$ for n image pixels and rectangular structuring elements of size p by q [20]. The time complexity of DTBIM is overwhelmingly dominated by the underlying DT construction. The DT uses Fortune's algorithm, based on the sweepline Voronoi code by Steven Fortune [33], which runs in linearithmic time, that is, $O(n \log n)$. On average, $T(\mathrm{DTBIM})$ is of the order of $O(2.5\,(n \log n))$, which remains $O(n \log n)$.
DTBIM efficiency: application to handwritten digit classification
As we have seen in the introductory section, there are myriad ways to dilate a binary image; however, the chances of preserving the object structure plummet with each dilation pass until the object becomes unrecognisable. Given the imposed page limit, this section delves only briefly into testing our proposed method for digit classification. Bloch et al. [34] stated that two elements of a lattice are considered indistinguishable if their images by a morphological operator are identical, a notion that fits our problem at hand very well. For this purpose, we used the
acknowledge the support of SONY Mobile Communications AB, Lund, Sweden, and ArkivDigital®, Sweden.
References
1. R.C. Gonzalez and R.E. Woods, Digital Image Processing. Pearson Education Inc., 2008. 3rd edition, Ch.9 p. 628.
2. R.M. Haralick, S.R. Sternberg, and X. Zhuang, “Image Analysis Using Mathematical Morphology,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-9, no. 4, pp. 532-550, 1987.
3. O. Cuisenaire, “Locally adaptable mathematical morphology using distance transformations,” Pattern Recognition, vol. 39, no. 3, pp. 405–416, 2006.
4. F.Y. Shih, Image Processing and Mathematical Morphology: Fundamentals and Applications, CRC Press, 2009.
5. L. Najman and H. Talbot, “Mathematical Morphology,” Wiley-ISTE, 1st edition, 24 Jan. 2013.
6. O. Marques, “Morphological Image Processing,” in Practical Image and Video Processing Using MATLAB®, John Wiley & Sons, Inc., Hoboken, NJ, USA, 2011.
7. E.R. Urbach and M.H.F. Wilkinson, “Efficient 2-D Grayscale Morphological Transformations With Arbitrary Flat Structuring Elements,” IEEE Transactions on Image Processing, vol. 17, no. 1, January 2008, pp. 1-8.
8. J. Guan, T. Zhang, X. Wang, J. Mei, “New class of Grayscale Morphological Filter to enhance infrared building target,” IEEE Aerospace and Electronic Systems Magazine, vol. 27, issue 6, 2012, pp. 5-10.
9. M. Fauvel, Y. Tarabalka, J.A. Benediktsson, J. Chanussot and J.C. Tilton, “Advances in Spectral-Spatial Classification of Hyperspectral Images,” Proceedings of the IEEE, vol. 101, issue 3, March 2013, pp. 652-675.
10. B. Burgeth and A. Kleefeld, “Morphology for Color Images via Loewner Order for Matrix Fields,” Proceedings of the 11th International Symposium on Mathematical Morphology, ISMM 2013, Uppsala, Sweden, May 27-29, 2013.
11. M. Moreaud and F. Itthirad, “Fast Algorithm for Dilation and Erosion using Arbitrary Flat Structuring Element,” International Conference on Multimedia Computing and Systems (ICMCS), 14-16 April 2014, pp. 289-294.
12. L. Najman and J. Cousty, “A graph-based mathematical morphology reader,” Pattern Recognition Letters, 47 (2014) 3–17.
13. F. Stahlberg and S. Vogel, “The QCRI Recognition System for Handwritten Arabic,” in Lecture Notes in Computer Science, vol. 9280, pp. 276-286, Springer Switzerland, 2015.
14. K. Fouladi and B.N. Araabi, “Toward automatic development of handwritten personal Farsi/Arabic OpenType® fonts,” International Journal on Document Analysis and Recognition (IJDAR), vol. 18, no. 3, pp. 249-262. Springer-Verlag Berlin Heidelberg, 2015.
15. A. Shaus, E. Turkel, E. Piasetzky, “Binarization of First Temple Period Inscriptions – Performance of Existing Algorithms and a New Registration Based Scheme,” International Conference on Frontiers in Handwriting Recognition, pp. 645-650. Bari, Italy, 18-20 Sept. 2012.
16. D. Zelenika, J. Povh, B. Ženko, “Text Detection in Document Images by Machine Learning Algorithms,” Proceedings of the 9th International Conference on Computer Recognition Systems CORES 2015, vol. 403, pp. 169-179. Advances in Intelligent Systems and Computing. Springer, 2015.
17. A.A. Desai, “Support vector machine for identification of handwritten Gujarati alphabets using hybrid
publicly available database [35]. Images were dilated using DTBIM and the distance transform. The features we extracted were some statistics (i.e., the entropy metric and the ratio of median to standard deviation) from the Histogram of Oriented Gradients (HOG) descriptors [36] of each dilated image. A HOG cell size of 4-by-4 was used to encode shape information. A multiclass Support Vector Machine (SVM) classifier was then trained on the features extracted from pre-labelled digit instances. To quantify the classifier's accuracy, we evaluated the digit classifier models using the confusion matrix. As can be seen, the classification error rate improves when features are extracted from digits that have been dilated using DTBIM as compared to the distance transform based dilation method; see Fig. 11.
Fig.11. Confusion matrices showing the SVM classification rate in percentage form. The columns of the confusion matrix denote the predicted labels, while the rows depict the known labels. 10 dilation iterations were used for each method. As can be observed, all digits are correctly classified using our method (a), while for the method in (b) digit 5 was misclassified as 3 and digit 2 was equally classified as either 2 or 3. Note that only two derived statistical features were used, training with a more representative feature set is likely to produce a better classifier but that is outside of the scope of this short letter. (c) Shows that (a) outperformed (b) in 80% of the cases, shown diagonally in bold.
4. Conclusion
In this paper we introduced a novel notion, which we termed Delaunay Triangulation Binary Image Morphing (DTBIM). It is an attempt to move the morphological dilation operation, which has traditionally operated in set theory, to a different relationship in the geometric plane. Needless to say, DTBIM does not require any structuring element to be supplied. Several synthetic and real examples have been included in this paper in order to help the assimilation of the presented concept. It is important to stress that, in some scenarios, it is advantageous to use the elegant common dilation methods; however, DTBIM provides yet another alternative, a lightweight dilation mechanism that can also link broken segments. Another merit of DTBIM is its ability to mitigate the influence of noise. The overall rationale behind this notion is to provide a different perspective on dilation. To the best of our knowledge, this perspective has not been introduced before. The experiments herein support the usefulness of DTBIM in a broad range of topics in computer vision, and they could inspire further advanced complementary scientific work in the field.
Acknowledgment
This work is part of the research project “Scalable resource efficient systems for big data analytics” funded by the Knowledge Foundation (grant: 20140032) in Sweden. We also
Graphical abstract