Pattern Recognition, Vol. 24, No. 12, pp. 1211-1221, 1991
0031-3203/91 Pergamon Press plc (C) 1991 Pattern Recognition Society
Printed in Great Britain
A KNOWLEDGE-BASED THINNING ALGORITHM

BEI LI and CHING Y. SUEN

Centre for Pattern Recognition and Machine Intelligence, Concordia University, 1455 de Maisonneuve Blvd. West, Montreal, Quebec H3G 1M8, Canada
(Received 11 February 1991; in revised form 20 May 1991; received for publication 6 June 1991)

Abstract--One common defect of thinning algorithms is deformation at crossing points. To solve this problem, a new thinning method, called the knowledge-based thinning algorithm (KBTA), is proposed. It first represents a binary pattern by coded run lengths of the horizontal line segments. Then the relationship between line segments is described quantitatively by another new algorithm which makes use of both forward and backward derivatives. It afterwards identifies the regions where branches of the pattern meet, then extracts their shape features and thins all of them. Knowing the identities of these regions, perfect skeletons can be obtained. Other regions are thinned by an existing algorithm which is based on contour generation. Experiments with a wide variety of binary patterns show that this new technique generates better skeletons than several other well-known algorithms.

Thinning    Skeletonization    Knowledge-based thinning    Preprocessing

1. INTRODUCTION
Thinning is a very important preprocessing step for the analysis and recognition of different types of images. It has been widely used in such areas as OCR, chromosome and fingerprint recognition.(1-3) The purpose of thinning is to reduce the width of a line pattern to just a single pixel, both to compress the data and to facilitate the extraction of distinctive features from the digitized pattern.(4-9) Many thinning algorithms exist and they can be classified into two general types: sequential and parallel algorithms.(10,11) A parallel algorithm uses only the result from the previous iteration to decide whether to remove a pixel in the current iteration. It is suitable for implementation on parallel hardware such as an array processor. A sequential algorithm makes use of both the result from the previous pass and the results obtained so far in the current pass to process the current pixel. So parallel and sequential algorithms work differently. Most applications make use of one of these two types of algorithms to thin various shapes. The result is that some skeletons are good in some cases, but can be poor in others. It is very difficult, if not impossible, to develop a thinning algorithm which can produce satisfactory results for all varieties of pattern shapes.

Actually, thinning is a simple task for human beings. They can thin patterns with a wide variety of shapes without any difficulty. It appears that they first catch a global view of the shapes, then apply different algorithms to thin different shapes or different parts of the same pattern. As a result, the skeletons produced by humans are usually considered as reference skeletons,(12) which have always been superior to those obtained from thinning algorithms. This fact suggests that such human knowledge can be very helpful to the thinning process.

One common defect of thinning algorithms is deformation at crossing points, as illustrated in Fig. 1. This paper aims at solving this problem by incorporating human knowledge. The approach is to apply shape knowledge to thin the crossing regions and to merge these results with those obtained by another algorithm which thins the remaining regions. In this new method, a digitized image is first segmented into different parts which correspond to the different shapes that have been identified. The method has been implemented in PASCAL. The results of experiments on a large variety of digitized patterns show that, compared with several other thinning algorithms, the KBTA generates superior skeletons, although a longer processing time is required.

This paper is organized in the following way. Section 2 introduces the proposed knowledge-based thinning algorithm. Section 3 describes the implementation of the KBTA. Section 4 contains a comparison of the results obtained from other thinning algorithms, including contour generation,(13) the SPTA,(4) Suzuki's algorithm(14) and the KBTA. Concluding remarks are given in Section 5.
2. KNOWLEDGE-BASED THINNING ALGORITHM
2.1. Representation of the binary image of an object

In pattern recognition, the binary image of an object is usually represented by pixels. But here, the binary image of an object is represented by horizontal line segments which are coded by run length.(15)
Fig. 1. The problem of skeleton deformation at crossing points of existing algorithms: (a) Kwok's;(13) (b) SPTA.(4)
The kth line segment (i, b_k, e_k) is defined by

b_k = j, when f(i, j-1) = 0 and f(i, j) = 1;
e_k = j, when f(i, j) = 1 and f(i, j+1) = 0;
f(i, j) = 1, for b_k <= j <= e_k,
where f is the binary image of the object, i is the row number, j is the column number, and b_k and e_k are, respectively, the positions of the beginning and end points of the kth line segment. Using a set of line segments represented by the triple (i, b_k, e_k), we can determine the position of each line segment and the relationship between two vertically adjacent lines, i.e. the line above and the line below. Figure 2 shows a character "e". The total number of line segments is 46. Its run length codes are shown in Table 1.
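As a concrete illustration of this representation, the following sketch (in Python rather than the authors' PASCAL implementation) scans a binary image row by row and emits the (i, b_k, e_k) triples defined above. The image format (a list of rows of 0/1 values) and the function name are assumptions made for this example only.

def run_length_segments(image):
    """Return the horizontal line segments of a binary image as
    (i, b_k, e_k) triples, one per run of 1-pixels, scanned top to bottom."""
    segments = []
    for i, row in enumerate(image):
        j = 0
        while j < len(row):
            if row[j] == 1:
                b = j                        # f(i, j-1) = 0 and f(i, j) = 1
                while j + 1 < len(row) and row[j + 1] == 1:
                    j += 1
                segments.append((i, b, j))   # f(i, j) = 1 and f(i, j+1) = 0
            j += 1
    return segments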
2.2. Derivative algorithm
Fig. 2. A character "e".
Table 1. Codes generated from character "e"

No.  i   b   e        No.  i   b   e
 1   15  30  34       24   31  18  25
 2   16  28  39       25   32  18  25
 3   17  26  41       26   33  18  25
 4   18  24  42       27   34  18  25
 5   19  24  28       28   35  18  25
 6   19  36  43       29   36  18  25
 7   20  22  26       30   37  18  26
 8   20  36  44       31   38  18  27
 9   21  22  25       32   38  44  46
10   21  38  45       33   39  19  27
11   22  20  25       34   39  43  45
12   22  38  45       35   40  19  28
13   23  20  25       36   40  42  45
14   23  38  45       37   41  20  29
15   24  20  26       38   41  42  44
16   24  38  46       39   42  20  31
17   25  19  27       40   42  40  44
18   25  38  46       41   43  20  43
19   26  19  46       42   44  22  42
20   27  18  46       43   45  22  41
21   28  18  46       44   46  24  40
22   29  18  25       45   47  26  39
23   30  18  25       46   48  28  37
Fig. 3. Forward derivative.

Fig. 4. Backward derivative.
Next, in order to describe quantitatively the relationship between the position of the current line segment and its adjacent neighbours, a forward derivative and a backward derivative algorithm have been developed. The forward derivatives of a line segment are defined by

FD[k, 1] = b_{k+n} - b_k
FD[k, 2] = e_{k+n} - e_k,

where b_k and e_k are, respectively, the positions of the beginning and end points of the kth line segment; b_{k+n} and e_{k+n} are, respectively, the positions of the beginning and end points of the (k+n)th line segment, which is the nearest line segment below and connected to the kth line, and n is a positive integer. If there is no line segment below connecting with the kth line, FD[k, 1] is given a special value (emp). If two or more line segments join the same connected line below, they are all processed with respect to that same line segment. For example, in Fig. 3, lines A and B are both connected to line C, hence their forward derivatives are computed with respect to line C. Suppose line C is the kth line segment with an adjacent line segment D below it, i.e. the (k+1)th line segment; then n = 1. Suppose line A is the kth line segment and C is the adjacent line segment below it, i.e. the (k+2)th line segment; then n = 2.

Similarly, the backward derivatives are defined as follows:

BD[k, 1] = b_{k-n} - b_k
BD[k, 2] = e_{k-n} - e_k,

where b_{k-n} and e_{k-n} are, respectively, the positions of the beginning and end points of the segment nearest to and immediately above the kth line. If there is no line above connecting with the kth line, BD[k, 1] is also given the special value (emp). If two line segments are connected to the same line above them, they are likewise processed with respect to that same line segment. For example, lines A and B in Fig. 4 are both connected to line C above, so their backward derivatives are computed with respect to line C. Suppose line C is the kth line segment and the adjacent segment above C is line D, i.e. the (k-1)th line segment; then n = 1. Suppose line A is the kth line segment and the adjacent segment above A is line C, i.e. the (k-2)th line; then n = 2.
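The following sketch (again in Python, not the authors' implementation) shows one way the forward and backward derivatives could be computed from the run-length segments of the previous sketch. The connectivity test and the choice of the nearest connected segment are assumptions made for this illustration; the paper does not spell out these details.

EMP = None   # stands for the special value "emp" (no connected neighbour)

def connected(seg_a, seg_b):
    """Segments on adjacent rows are taken to be connected when their column
    ranges overlap or touch diagonally (8-connectivity) -- an assumption."""
    (ra, ba, ea), (rb, bb, eb) = seg_a, seg_b
    return abs(ra - rb) == 1 and ba <= eb + 1 and bb <= ea + 1

def forward_derivatives(segments):
    """FD[k] = (b_{k+n} - b_k, e_{k+n} - e_k) for the nearest connected
    segment below the kth one, or (EMP, EMP) if there is none."""
    fd = []
    for k, seg in enumerate(segments):
        below = [s for s in segments[k + 1:] if connected(seg, s)]
        if below:
            _, bn, en = below[0]             # nearest connected segment below
            fd.append((bn - seg[1], en - seg[2]))
        else:
            fd.append((EMP, EMP))
    return fd

def backward_derivatives(segments):
    """BD[k] = (b_{k-n} - b_k, e_{k-n} - e_k) for the nearest connected
    segment above the kth one, or (EMP, EMP) if there is none."""
    bd = []
    for k, seg in enumerate(segments):
        above = [s for s in segments[:k] if connected(seg, s)]
        if above:
            _, bn, en = above[-1]            # nearest connected segment above
            bd.append((bn - seg[1], en - seg[2]))
        else:
            bd.append((EMP, EMP))
    return bd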
To illustrate the way these derivatives work, all the forward and backward derivatives (FDs and BDs) for the character "e" in Fig. 2 are given in Table 2. The line segment derivatives can be used to determine the position difference between two adjacent lines. If the absolute values of the derivatives are very small, there is little change in the direction of the contour. Otherwise, it is likely that a cross region has been encountered. Based on this fact, one can determine where the contour changes and a new cross region may occur. Some shape information about a cross region can be further derived by examining the area around the place where at least one of the absolute values of the derivatives is large enough, so that the region can be identified and segmented from the object. The same operation is applied to the backward derivatives.

2.3. Segmentation using derivatives

The extraction and segmentation of cross regions are based on the values of FD[k, 1], BD[k, 1], FD[k, 2] and BD[k, 2]. When the absolute value of one of them becomes large, a search is performed on a classification tree (Fig. 5). Half of the stroke width, which is computed statistically from the object, is used as the threshold T. BD[k, 1] = emp and FD[k, 1] = emp represent, respectively, no line above the current line and no line below it. The k'th line is the last line of one kind of pattern after the kth line (which has changed state), so k' > k. The k''th line is some line before the kth line which has changed state; it is also the last line of one kind of pattern, so k > k''. For example, in Fig. 6, if FD[k, 1] of the current kth line segment is a large negative integer, it indicates a changing state to the left of the line segment, so the region below may be one of the shapes ⊥, ⌟, +, etc. Meanwhile, if FD[k, 2] of the line segment is a large positive integer, there is a changing state to the right of the line segment as well, so this line is probably the starting line of a +, ⊓ or ⊥ region.
Table 2. Forward and backward derivatives of "e" (the values of fd[k, 1], fd[k, 2], bd[k, 1] and bd[k, 2] for each of the 46 line segments of Fig. 2).

If we keep on searching and find that the k'th line segment has a large negative integer BD[k', 1] and another large positive integer BD[k', 2] in the connected line below k, this region can be considered as a + pattern. Figure 6 shows the portion of an object which contains this type of region. The FD[k, 1] and FD[k, 2] of the kth line are -5 and 6, respectively, and the BD[k', 1] and BD[k', 2] of the k'th line are -5 and 5, so this region can be extracted and considered as a + pattern. In the algorithm we also have to consider the effect of noise, so the conditions may be relaxed when warranted. Other patterns can be extracted in a similar way. Using the values of the FDs and BDs, one can identify the shape of a region correctly. Based on the above, the cross regions can be classified into a number of categories, or patterns.
Some basic patterns extracted by the above procedures include the corner shapes in their four orientations, the T-junction shapes in their four orientations (⊥, ⊣, ⊤, ⊢), the cross + and the ⊓ shape.
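The sketch below illustrates a single branch of the classification tree of Fig. 5: detecting a candidate + region from the derivative lists produced by the earlier sketches, using T (half the stroke width) as the threshold. This is only an illustration of the idea; the full tree, the relaxed noise conditions and the other pattern categories are not reproduced here.

def find_plus_regions(fd, bd, T):
    """Return (k, k') pairs of candidate '+' cross regions, where line k
    widens suddenly on both sides and a later line k' narrows back."""
    regions = []
    for k, (fd1, fd2) in enumerate(fd):
        if fd1 is None or fd2 is None:
            continue
        if fd1 < -T and fd2 > T:             # changing state on both sides below k
            for kp in range(k + 1, len(bd)):
                bd1, bd2 = bd[kp]
                if bd1 is None or bd2 is None:
                    continue
                if bd1 < -T and bd2 > T:     # matching change above the k'th line
                    regions.append((k, kp))
                    break
    return regions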
Fig. 6. Part of an object.
Fig. 5. A block diagram of the classification tree.
2.4. Knowledge-based thinning algorithm

After the identification of the various cross regions, the shapes of the different regions are known. Based on human knowledge, we know what the perfect skeletons of these regions should be, and we try to thin them as humans do. Hence, for a corner region as shown in Fig. 7, we refer to the upper contour C1, the left contour C2 and the region bounds L1 and L2 to thin this region. First, we determine the centre point p. According to C2 and L1 we extract a vertical skeleton. According to C1 and L2 we extract a horizontal skeleton. In this way, we can also extract skeletons for other regions which have the same shape but different orientations, like L, ⌟ and ⌐.

For the crossing region bounded by dashed lines in Fig. 8, we first locate p, the centre point of the region. Then we determine the vertical skeleton in the region according to the left contour C1 and L3, and the horizontal skeleton according to L1 and L2. Similarly, by rotating the pattern in Fig. 8 by n × 90° (n = 1, 2, 3), shapes like ⊥, ⊣ and ⊤ can be obtained and thinned. The final skeletons based on human knowledge remain on the central axes. In this way a better skeleton can be obtained because the shapes are already known. Similarly, skeletons of the other regular shapes, +, etc., can be obtained.
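A heavily hedged sketch of the knowledge-based thinning of a corner region such as the one in Fig. 7 is given below. The paper states only that a vertical and a horizontal medial axis are extracted and joined at the centre point p; the parametrisation of the region by the row span of its horizontal stroke, the column span of its vertical stroke and its outer bounds, as well as all function and variable names, are assumptions made for this illustration.

def thin_corner(r1, r2, c1, c2, bottom_row, right_col):
    """Skeleton pixels of a top-left corner region: the horizontal stroke
    occupies rows r1..r2, the vertical stroke occupies columns c1..c2, and
    the region extends down to bottom_row and right to right_col."""
    pr = (r1 + r2) // 2        # centre row: midline of the horizontal stroke
    pc = (c1 + c2) // 2        # centre column: midline of the vertical stroke

    vertical = [(r, pc) for r in range(pr, bottom_row + 1)]    # down from p
    horizontal = [(pr, c) for c in range(pc, right_col + 1)]   # right from p
    return set(vertical) | set(horizontal)   # the centre point p is shared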
Fig. 7. Corner region.
Fig. 8. ⊢ cross region.
Using the shape features detected by the forward and backward derivatives, perfect skeletons of these cross regions (the four corner orientations, the four T-junction orientations, the cross + and the ⊓ shape) are extracted. Such regions occur frequently in Chinese and alphanumeric characters. Other regions are thinned by an existing algorithm which is based on contour generation.

3. IMPLEMENTATION OF KBTA
3.1. A thinning algorithm by contour generation

In the implementation of the KBTA, a thinning algorithm by contour generation is employed for those regions which are not thinned by the KBTA. A brief description of this algorithm is included below; details can be found in reference (13). A given binary image is first represented by chain codes.(16) A chain code is generated for every closed contour of the object, and the direction of the chain is counter-clockwise for the exterior contour of an object and clockwise for the inner contour of a hole. The contour is considered as a sequence of edge points p_0, p_1, ..., p_n, where p_0 = p_n. The sequence of pixels is represented by the Freeman code, which is a sequence of directions dir_0, dir_1, ..., dir_{n-1}, each pointing to the next pixel in the sequence. For 8-connected contours, dir_i is in the range 0 to 7, representing the eight directions shown in Fig. 9. As this thinning algorithm is iterative, after plotting the first contour the algorithm goes through a number of iterations. In every iteration, the contour describing the edge of the object is traced; meanwhile the edge points are examined against a set of criteria to decide whether each edge point should be removed or not. The iteration terminates for a particular contour when there are no more unsafe points in that contour. When the operation completes, the skeleton remains. A chain code describing the skeleton is also available.
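As a small illustration of the chain-code representation used by the contour-generation algorithm, the sketch below converts an already traced 8-connected contour into Freeman direction codes. The contour points are assumed to be given; the direction numbering follows the usual Freeman convention and is intended to correspond to Fig. 9.

# direction code -> (dx, dy), with y increasing downwards; numbering follows
# the usual Freeman convention (0 = east, counter-clockwise), as in Fig. 9
FREEMAN = {0: (1, 0), 1: (1, -1), 2: (0, -1), 3: (-1, -1),
           4: (-1, 0), 5: (-1, 1), 6: (0, 1), 7: (1, 1)}
OFFSET_TO_CODE = {v: k for k, v in FREEMAN.items()}

def chain_code(contour):
    """Encode a closed 8-connected contour [(x0, y0), ..., (xn, yn)], with
    the first and last points equal, as a list of Freeman direction codes."""
    return [OFFSET_TO_CODE[(x1 - x0, y1 - y0)]
            for (x0, y0), (x1, y1) in zip(contour, contour[1:])]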
3.2. The new concept of merging thinning algorithms

By classifying the cross regions into two types, we can consider the KBTA and the contour algorithm as complementary in nature. The KBTA can process simple cross regions such as the corner shapes, the T-junction shapes, + and ⊓, while the contour algorithm can thin the more complicated cross regions. The results obtained
by Kwok(13) have shown us that the skeletons at complicated cross regions are mostly better than those at simple cross regions, while the KBTA can produce better skeletons at the simple cross regions. Hence we try to combine them to remove their respective deficiencies. Another reason for choosing the contour algorithm is that it was one of the fastest among the thinning algorithms tested in reference (13). So we choose the contour algorithm to obtain the preliminary skeleton. Then we analyse the shapes of the original object, thin the cross regions by knowledge, and replace the original skeletons in such regions. Details are described below.
3.3. Implementation of the knowledge-based thinning algorithm

For thinning an object, the following steps are performed.

Step (1). Remove noise in the input image: the smoothing algorithm essentially consists of moving a 3 × 3 window across the binary image and comparing the state of the central element of this window with its 8-neighbors to decide whether this state (pixel value) should be retained or changed (a sketch of this step is given after the list of steps). Smoothing helps to avoid spurious tails and miscellaneous distortions. This method removes both pepper and salt noise.

Step (2). Generate a preliminary skeleton by the thinning algorithm based on contour generation.

Step (3). Code the binary image by run length and compute the derivatives for every line segment in the original object.

Step (4). Obtain the derivatives of the various line segments and look for those which change state: when a line segment has changed state with respect to its adjacent line segment, a search on the classification tree is executed to see whether the region matches one of the known patterns; if so, the shape feature of that region of the object is extracted. Based on the shape feature of the region, e.g. a "T" junction or a corner, the thinning procedure presented in Section 2 is executed to produce a skeleton of that particular shape.

Step (5). Replace regions: skeletons produced by the KBTA replace those obtained by contour generation in the corresponding regions. Other parts of the skeleton of the object are not replaced.

Step (6). Postprocessing: this is the final step, in which some unnecessary points are deleted. The skeleton is kept at unit width.
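The sketch below illustrates the kind of 3 × 3 smoothing described in Step (1): an isolated foreground pixel (salt) or an isolated background pixel inside a stroke (pepper) is flipped to agree with its 8-neighbors. The exact rule used by the authors may differ; this is only one plausible reading of the description.

def smooth(image):
    """Remove isolated salt-and-pepper pixels from a 0/1 image given as a
    list of rows; border pixels are left unchanged for simplicity."""
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            neighbours = sum(image[i + di][j + dj]
                             for di in (-1, 0, 1) for dj in (-1, 0, 1)
                             if (di, dj) != (0, 0))
            if image[i][j] == 1 and neighbours == 0:    # isolated 1 ("salt")
                out[i][j] = 0
            elif image[i][j] == 0 and neighbours == 8:  # isolated 0 ("pepper")
                out[i][j] = 1
    return out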
Fig. 9. The 8-neighbors of a pixel p and the Freeman code.

4. COMPARISON OF RESULTS
Several experiments were conducted to compare the proposed algorithm with Kwok's algorithm,(13) the SPTA(4) and Suzuki's algorithm.(14) In our experiments, about 200 Chinese characters and 70 alphanumeric characters and special symbols were tested. Some results produced by these methods are illustrated in Fig. 10.
Fig. 10. Comparison of some results produced by: (a) KBTA; (b) Kwok's algorithm; (c) SPTA; (d) Suzuki's algorithm, for the following patterns: (i) e; (ii) 2; (iii)-(v) some Chinese characters.
Since the KBTA uses knowledge to thin some cross regions, its results are far better than the others in such regions. With this new approach, the corner shapes, the "T" junction and its shapes in other orientations can be kept in the skeletons without deformation. Such an approach will greatly facilitate the recognition process because it supplies useful information about corners, forks, etc. However, it requires more processing time than Kwok's algorithm. Our experiments were performed on a MicroVAX II. All programs are written in PASCAL.
5. CONCLUDING REMARKS
A new knowledge-based thinning algorithm (KBTA) has been presented in this paper. After merging it with another thinning algorithm, the combined algorithm produces a connected skeleton of unit width along the medial axis of the given binary image. The proposed algorithm has been tested on a relatively large database. The skeletonized results of the KBTA are superior to those produced by Kwok's algorithm, the SPTA and Suzuki's algorithm,
especially at the cross regions. They also preserve the original topology of the object very well.
Acknowledgements--The authors would like to thank Dr P. C. K. Kwok of the University of Calgary and A. Arumugam for providing their algorithms to our research team, Dr L. Lam for her constructive comments, and both the Natural Sciences and Engineering Research Council of Canada and the Ministry of Education of Quebec for their financial support.

REFERENCES
1. C. Y. Suen, M. Berthold and S. Mori, Automatic recognition of handprinted characters, Proc. IEEE 68, 297-307 (1980).
2. C. J. Hilditch, Linear skeletons from square cupboards, Machine Intelligence, B. Meltzer and D. Michie, eds, Vol. 4, pp. 325-347. American Elsevier, New York (1968).
3. B. Moayer and K. S. Fu, A syntactic approach to fingerprint pattern recognition, Pattern Recognition 7, 1-23 (1975).
4. N. J. Naccache and R. Shinghal, SPTA: a proposed algorithm for thinning binary patterns, IEEE Trans. Syst. Man Cybern. 14, 409-418 (1984).
5. T. Y. Zhang and C. Y. Suen, A fast parallel algorithm for thinning digital patterns, Communs ACM 27, 236-239 (1984).
6. W. Xu and C. Wang, CGT: a fast thinning algorithm implemented on a sequential computer, IEEE Trans. Syst. Man Cybern. 17, 847-851 (1987).
7. Y. S. Chen and W. H. Hsu, A modified fast parallel algorithm for thinning digital patterns, Pattern Recognition Lett. 7, 99-106 (1987).
8. C. Arcelli and G. S. D. Baja, A width-independent fast thinning algorithm, IEEE Trans. Pattern Anal. Mach. Intell. 463-474 (1985).
9. T. Pavlidis, A thinning algorithm for discrete binary images, Comput. Graphics Image Process. 13, 142-157 (1980).
10. A. Rosenfeld and J. L. Pfaltz, Sequential operations in digital picture processing, J. ACM 13, 471-494 (1966).
11. R. Stefanelli and A. Rosenfeld, Some parallel thinning algorithms for digital pictures, J. ACM 18, 255-264 (1971).
12. R. Plamondon and C. Y. Suen, Thinning of digitized characters from subjective experiments: a proposal for a systematic evaluation protocol of algorithms, Computer Vision and Shape Recognition, pp. 261-272. World Scientific, Singapore (1989).
13. P. C. K. Kwok, A thinning algorithm by contour generation, Communs ACM 31, 1314-1324 (1988).
14. S. Suzuki and K. Abe, Binary picture thinning by an iterative parallel two-subcycle operation, Pattern Recognition 20, 297-307 (1987).
15. Minjin Wu and Bei Li, PLS recognition method on handprinted Chinese characters, Invited Paper, Proc. 1989 Int. Symp. Chinese Text Processing, Boca Raton, FL, pp. 5-23-5-24, March (1989).
16. H. Freeman, On the encoding of arbitrary geometric configurations, IEEE Trans. Electronic Comput. 10, 260-268 (1961).
About the Author--BEI LI received her B.Sc. and M.Sc.(Eng.) degrees from East China Normal University, Shanghai, China, in 1983 and 1986, respectively. From 1986 to 1989 she was a lecturer in the Department of Information Technology, East China Normal University. Since 1989 she has been a research assistant in the Center for Pattern Recognition and Machine Intelligence of Concordia University, Canada. She has published a number of papers and her fields of interest include character recognition, image processing, pattern recognition and artificial intelligence.
About the Author--CHING Y. SUEN received his M.Sc.(Eng.) degree from the University of Hong Kong and a Ph.D. degree from the University of British Columbia, Canada. In 1972, he joined the Department of Computer Science of Concordia University, Montreal, Canada, where he became Professor in 1979 and served as Chairman from 1980 to 1984. Presently, he is the Director of CENPARMI, the new Center for Pattern Recognition and Machine Intelligence of Concordia. During the past 15 years, he was also appointed to visiting positions in several institutions in different countries. Prof. Suen is the author or editor of several books on subjects ranging from Computer Vision and Shape Recognition and Frontiers in Handwriting Recognition to Computational Analysis of Mandarin and Chinese. His latest book is entitled Operational Expert System Applications in Canada, now in press with Pergamon Press. Dr Suen is the author of many papers and his current interests include pattern recognition and machine intelligence, expert systems, optical character recognition and document processing, and computational linguistics. An active member of several professional societies and a Fellow of the IEEE, Dr Suen is an Associate Editor of several journals related to his areas of interest. He is the Past President of the Canadian Image Processing and Pattern Recognition Society, Governor of the International Association for Pattern Recognition, and President of the Chinese Language Computer Society.