Pattern Recognition, Vol. 27, No. 2, pp. 221-232, 1994. Copyright © 1994 Pattern Recognition Society. Printed in Great Britain. All rights reserved. 0031-3203/94 $6.00+.00. Elsevier Science Ltd.
PERFORMANCE ANALYSIS OF AN OCR SYSTEM VIA AN ARTIFICIAL HANDWRITTEN CHINESE CHARACTER GENERATOR

CHENG-HUANG TUNG, YUAN-JONG CHEN and HSI-JIAN LEE†

Department of Computer Science and Information Engineering, National Chiao Tung University, Hsinchu, Taiwan 30050, R.O.C.

† Author to whom correspondence should be addressed.
(Received 13 January 1993; in revised form 13 August 1993; received for publication 23 August 1993)

Abstract--A handwritten Chinese character generator that generates handwritten Chinese characters with different variations is proposed and used to evaluate the performance of an OCR system. The character generator first generates the radicals of a character stroke by stroke and then combines the generated radicals to form a line-vector character. Next, the line-vector character is thickened to obtain a character image. Characters generated by this method have great variance in shape but still satisfy the structural constraints. The generated characters are then used to perform two types of evaluations. First, the stability of stroke extractors in terms of stroke number is evaluated. Two stroke extractors, one based on thinning and the other on vectorization, are analyzed. The vectorization-based stroke extractor we propose operates directly on the run length codes of line segments. From experimental results, we find that the thinning-based method is more time-consuming but more stable than the vectorization-based method. Second, the peak performance of a matching module is evaluated and the recognition error caused by stroke extractors is identified.

Handwritten Chinese character generator    B-spline functions    Radical    Line-vector characters    Stroke extractor
1. INTRODUCTION

Typically, a large number of character images are required to test the performance of a character recognition system. Collecting a large database of character images is time-consuming. Instead of using a large testing database to evaluate the performance of character recognition systems, it would be more convenient to use an artificial character generator that can generate character images with a wide range of natural variations.
Several researchers have already worked on artificial character generation. Sinha and Karnick(1) used PLANG (Picture LANGuage)-based specifications to describe a class of pictures by a single prototype specification and to generate variants as instances of this specification. Govindan and Shivaprasad(2) proposed prototype specifications based on end point coordinates, the nature of segments, and connectivity to generate characters. These two character generators define how each stroke in a character can be perturbed locally. Leung(3) proposed a distortion model for Chinese character generation, in which a standard character image is distorted by random functions. Ishii(4) perturbed each stroke of a character; when the distortion of a stroke is large, however, the structure of the character may be destroyed. The above papers do not consider whether perturbed strokes might form incorrect structures, thereby creating unnatural characters. When Chinese characters consisting of many strokes
are generated, the probability of generating unnatural characters with incorrect structures increases. In this paper, we propose a system for generating natural-looking handwritten Chinese characters.
The functions of the handwritten Chinese character generator proposed here can be divided into two stages: line-vector character generation and thickening. In the first stage, each radical of a character is generated stroke by stroke. The strokes generated must satisfy the relative stroke relations defined in the reference radical. The radicals generated are then combined into a line-vector character such that their positions satisfy the relative radical relations specified in the reference character. The scheme used in generating a radical and combining the generated radicals is a generate-and-test procedure,(5) which first generates possible candidates and then checks the candidates to see whether they satisfy the given constraints. In the second stage, a thickening procedure is applied to thicken each stroke in a line-vector character and add noise to it. A character image is thus generated from the line-vector character.
In the second part of this paper, generated characters are used to evaluate two kinds of image operations. First, the stability of different stroke extractors is evaluated. Two stroke extractors, one based on thinning and the other on vectorization, are taken into consideration. The former has been fully developed in previous work,(6,7) and the latter is developed in this paper from a one-pass vectorizer(8,9) proposed by Pavlidis. The one-pass vectorizer operates at high speed but produces unstable line segments. Here we propose a two-pass vectorizer to improve stability and construct
a stroke extractor based on vectorization. We then compare the performance of our stroke extractor with that of a thinning-based stroke extractor. Second, we evaluate the performance of a matching module and identify the recognition error caused by stroke extractors. Line-vector characters are taken as the ideal input of a matching module, since these characters contain no feature-extraction error. The recognition rate of a matching module for the ideal input is thus assumed to be the peak performance of the matching module. When the results of a stroke extractor are used by the matching module, the change in recognition rate due to the stroke extractor can be obtained. If the change is small, the stroke extractor is taken to be suitable for the matching module.
2. LINE-VECTOR CHARACTER GENERATION

2.1. Database for modeling Chinese characters

Before generating handwritten Chinese characters, we must create a database that stores all of the modeling characters. A Chinese character is composed of radicals, each of which contains several strokes. The spatial relations between radicals are stored in a radical-relation table. The relations between the strokes of a radical are stored in a stroke-relation table. The features of a stroke are the start point, length, and angle of the stroke. Figure 1 shows the data structure of a modeling character in the database. We can obtain the features of all strokes directly from on-line information.
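To make the database layout of Fig. 1 concrete, the following sketch shows one possible in-memory representation of a modeling character. It is only an illustration under our own naming (Stroke, Radical, and ModelingCharacter are not identifiers from the paper); the relation codes are the ones defined in Section 2.1 below.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Stroke:
    start: Tuple[float, float]   # start point (x, y), taken from on-line information
    length: float
    angle: float                 # in degrees


@dataclass
class Radical:
    strokes: List[Stroke]
    # stroke_relations[(i, j)] is one of "U", "D", "L", "R", "X", "C"
    stroke_relations: Dict[Tuple[int, int], str] = field(default_factory=dict)


@dataclass
class ModelingCharacter:
    radicals: List[Radical]
    # radical_relations[(i, j)] is one of "L", "R", "U", "D"
    radical_relations: Dict[Tuple[int, int], str] = field(default_factory=dict)
```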
Fig. 1. The data structure of an on-line modeling Chinese character.
Fig. 2. (a) Three radicals for a Chinese character. (b) The radical-relation table for the character.
Fig. 3. (a) The strokes in a radical. (b) The stroke-relation table.
To represent the relations between the radicals in a Chinese character, we define a function rad(i, j), which denotes the relative position of radical i with respect to radical j. There are four possible position values for rad(i, j): "LEFT", "RIGHT", "UP", and "DOWN", which we abbreviate as "L", "R", "U", and "D", respectively. There are three radicals in the Chinese character in Fig. 2. Radical 1 is at the left of radical 2, so rad(1, 2) = "L". A radical-relation table is created to record the relations between the three radicals.
Next, we define six relations between the individual strokes of each radical. Let the function stroke(i, j) denote the relative position of stroke i with respect to stroke j. There are six possible values for stroke(i, j): "UP", "DOWN", "LEFT", "RIGHT", "CROSS", and "CORNER", which we abbreviate as "U", "D", "L", "R", "X", and "C", respectively. An example of the stroke relations in a radical is shown in Fig. 3. Figure 3(a) shows the six strokes in the radical and Fig. 3(b) depicts the stroke-relation table.
Now we can establish the data structures for the modeling characters. To generate a distorted character from a modeling character, each distorted radical is generated under the constraint of its own stroke-relation table, and then the distorted radicals are combined under the constraint of the radical-relation table of the character.

2.2. Radical generation

When we write a radical in a Chinese character, we construct the structure of the strokes in the radical carefully. In Fig. 4(a), the radical is written well and looks natural. In Fig. 4(b), the position of one of the horizontal strokes has been shifted too far to the left, and the structure of the radical is unnatural. The structure of a radical must be maintained when the radical is distorted. Ishii(4) perturbed each stroke in the character but did not check the perturbed structure. We adopt a generate-and-test method: if a generated stroke induces an ill-structured radical, the stroke is regenerated.
In radical generation, all strokes of a radical are generated by the generate-and-test method. The stroke relations in the generated radical must be the same as
Fig. 4. (a) A natural-looking radical. (b) An unnatural-looking radical containing an abnormal horizontal stroke.
those in the modeling radical. The features of each reference stroke in a modeling character are taken as the mean values of normal distributions of feature values. The features of each generated stroke are samples of these normal distributions. Since the values of the stroke features vary in individual samples, characters with different styles will be produced.
We now illustrate how to generate the features of a stroke according to a reference stroke in the modeling radical. Let (x_ref^L, y_ref^L) be the end point of the last reference stroke, let the rectangle bounding the reference strokes that have been generated have width W_ref and height H_ref, and let the start point of the new reference stroke be (x_ref^S, y_ref^S). Similarly, let (x_gen^L, y_gen^L) be the end point of the last generated stroke, and let the rectangle bounding the generated strokes have width W_gen and height H_gen. Figure 5(a) shows a modeling radical. The line vector (x_ref^S - x_ref^L, y_ref^S - y_ref^L), normalized by the widths and heights of the two bounding rectangles, determines the expected coordinates of the start point of the newly generated stroke:

(x_gen^S, y_gen^S) = ( x_gen^L + (x_ref^S - x_ref^L) × W_gen / W_ref ,  y_gen^L + (y_ref^S - y_ref^L) × H_gen / H_ref ),

as shown in Fig. 5(b). Next, we use a pair of random variables (X, Y) to represent the position of the start point of the generated stroke, where X = N(x_gen^S, σ_x²) and Y = N(y_gen^S, σ_y²), and N(μ, σ²) denotes a normal distribution with mean μ and variance σ².
Fig. 5. The mean position (x_gen^S, y_gen^S) for the start point of a generated stroke. (a) The modeling radical, where (x_ref^S, y_ref^S) is the start point of the next reference stroke. (b) (x_gen^S, y_gen^S) is the mean of the start point of the next generated stroke.
Fig. 6. Radical combination. (a) The radicals and their positions in a modeling character. (b) The character formed by the generated radicals. (c) The generated character after the radicals are aligned.
Fig. 7. The generated line-vector characters of a character. (a) The modeling line-vector character. (b) Four line-vector characters distorted from the modeling character.

Similarly, if the angle of the stroke in the modeling radical is a and the length of the stroke is l, then the angle and length of the corresponding generated stroke are samples of A = N(a, σ_a²) and L = N(l, σ_l²), respectively. The variance of each normal distribution, which is assigned by the designer, is used to
produce variation from the mean. For stroke-relation testing, the stroke relation stroke(i, j) between the generated strokes i and j is compared with that in the modeling radical. If the newly generated stroke is rejected owing to an illegal stroke relation, the stroke is regenerated.
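The sketch below illustrates the generate-and-test loop described above, assuming the start-point, angle, and length distributions of this subsection. The function names, the dictionary layout, and the toy relation test are our own illustrative choices, not part of the paper.

```python
import random


def sample_stroke(ref, last_ref_end, last_gen_end, ref_box, gen_box,
                  sigma_pos=2.0, sigma_ang=5.0, sigma_len=3.0):
    """Sample the start point, angle, and length of a generated stroke around the
    corresponding reference stroke (equations of Section 2.2).
    ref = {'start': (x, y), 'angle': a, 'length': l}; boxes are (width, height)."""
    xs_mean = last_gen_end[0] + (ref["start"][0] - last_ref_end[0]) * gen_box[0] / ref_box[0]
    ys_mean = last_gen_end[1] + (ref["start"][1] - last_ref_end[1]) * gen_box[1] / ref_box[1]
    return {
        "start": (random.gauss(xs_mean, sigma_pos), random.gauss(ys_mean, sigma_pos)),
        "angle": random.gauss(ref["angle"], sigma_ang),
        "length": random.gauss(ref["length"], sigma_len),
    }


def generate_and_test(ref, last_ref_end, last_gen_end, ref_box, gen_box,
                      relation_ok, max_tries=100):
    """Regenerate the stroke until its relations to the previously generated
    strokes are legal (relation_ok is the caller's stroke-relation test)."""
    for _ in range(max_tries):
        cand = sample_stroke(ref, last_ref_end, last_gen_end, ref_box, gen_box)
        if relation_ok(cand):
            return cand
    raise RuntimeError("could not satisfy the stroke-relation constraints")


# Toy usage: accept any stroke whose start point stays inside a 64 x 64 box.
ref_stroke = {"start": (10, 20), "angle": 0.0, "length": 30.0}
ok = lambda s: 0 <= s["start"][0] < 64 and 0 <= s["start"][1] < 64
stroke = generate_and_test(ref_stroke, (8, 18), (9, 17), (40, 40), (36, 44), ok)
```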
2.3. Radical combination
All of the radicals in a character can be generated by the above generate-and-test method. After radical generation, the generated radicals are combined according to the specifications in the radical-relation table.
First, we use a bounding rectangle to represent the position of a radical. A rectangle can be described by three features: width, height, and start position. We first change the width and height of each rectangle according to normal distribution models, and then generate the start position of the rectangle according to another distribution model. Let W_gen,i and H_gen,i be the width and height of the rectangle of the ith generated radical. The width and height of the ith re-sized radical are samples of

W_new,i = N(W_gen,i, σ_W²)  and  H_new,i = N(H_gen,i, σ_H²).

Let (X_ref,i, Y_ref,i), W_ref,i, and H_ref,i denote the start position, width, and height of the ith modeling radical. Then (X_new,i, Y_new,i) is the start position of the ith generated radical, where

X_new,i = N( X_new,i-1 + (X_ref,i - X_ref,i-1) × W_new,i-1 / W_ref,i-1 , σ_X² ),
Y_new,i = N( Y_new,i-1 + (Y_ref,i - Y_ref,i-1) × H_new,i-1 / H_ref,i-1 , σ_Y² ).
When the start position and the size of a radical are generated, we check the relations between the radical and the other radicals generated previously. If the relations are violated, we generate the start position and the size of the radical again. If the attributes of a radical are generated many times and are always unacceptable, we will regenerate the radical in a backtracking manner. Sometimes the generated radicals satisfy the radical
relations but do not seem natural enough in terms of style. In accordance with standard Chinese handwriting customs, if rad(i, j) is "UP" or "DOWN", we align the horizontal centers of radical i and radical j; if rad(i, j) is "LEFT" or "RIGHT", we align their vertical centers. Figure 6 shows an example of radical combination. In Fig. 6(b), radicals 2 and 3 are combined and aligned, and the combined block must then be aligned with radical 1. The aligned character is shown in Fig. 6(c). Other examples of generated line-vector characters are shown in Fig. 7. Figure 7(a) shows the line-vector character without distortion, and Fig. 7(b) shows line-vector characters distorted from the modeling character.
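A minimal sketch of the radical-combination step, assuming the size and start-position distributions of Section 2.3 and the center-alignment rule just described; the helper names and the rectangle representation are illustrative only.

```python
import random


def place_radical(ref_start, gen_size, prev_ref_start, prev_new_start,
                  prev_ref_size, prev_new_size, sigma_size=2.0, sigma_pos=2.0):
    """Sample the size and start position of the i-th generated radical
    (Section 2.3). ref_start = (X_ref,i, Y_ref,i); sizes are (width, height)."""
    w_new = random.gauss(gen_size[0], sigma_size)
    h_new = random.gauss(gen_size[1], sigma_size)
    x_mean = prev_new_start[0] + (ref_start[0] - prev_ref_start[0]) * prev_new_size[0] / prev_ref_size[0]
    y_mean = prev_new_start[1] + (ref_start[1] - prev_ref_start[1]) * prev_new_size[1] / prev_ref_size[1]
    start = (random.gauss(x_mean, sigma_pos), random.gauss(y_mean, sigma_pos))
    return start, (w_new, h_new)


def align(rect_i, rect_j, relation):
    """Align radical centers after combination: horizontal centers for "U"/"D",
    vertical centers for "L"/"R". A rect is ((x, y), (w, h))."""
    (xi, yi), (wi, hi) = rect_i
    (xj, yj), (wj, hj) = rect_j
    if relation in ("U", "D"):          # align horizontal centers
        xi = xj + (wj - wi) / 2.0
    elif relation in ("L", "R"):        # align vertical centers
        yi = yj + (hj - hi) / 2.0
    return (xi, yi), (wi, hi)
```

In a full generator, place_radical would be retried (and, after repeated failure, earlier radicals regenerated by backtracking) until the radical-relation table is satisfied, exactly as described for strokes in Section 2.2.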
3. THICKENING LINE-VECTOR CHARACTERS
We now propose a method for generating the character images used for stroke extraction once line-vector characters have been generated. For visual effect, we use B-spline functions to change the straight lines in line-vector characters into curves. Next, we thicken the strokes and add noise on the boundaries of the thickened strokes.
B-spline functions are piecewise polynomial functions that can provide local approximations of curves using a small number of parameters. We use B-spline functions to bend straight lines to make them look like strokes written by a human hand. Let t be a curve parameter and let x(t) and y(t) denote the given curve coordinates. The B-spline representation is written as

X(t) = Σ_{i=0}^{n} P_i B_{i,k}(t),   X(t) ≜ [x(t), y(t)]^T,   P_i ≜ [p_{1i}, p_{2i}]^T,

where the P_i are called the control points and B_{i,k}(t), i = 0, 1, ..., n, k = 1, 2, ..., are called the normalized B-splines of order k. The B-splines of order k can be
Fig. 8. (a) The control points are sampled from a straight line. (b) The control points are moved to one side of the straight line according to parameters w and h, and the curve is calculated from these control points. (c) The curve changes as the values of w and h change.

Fig. 9. An example of corner smoothing. (a) The control points are obtained by sampling the lines forming a corner. (b) The control points are moved to create a smoothed corner.

Fig. 10. Examples of generated handwritten Chinese characters.
generated via the following recursive formulas:

B_{i,k}(t) = (t - t_i) B_{i,k-1}(t) / (t_{i+k-1} - t_i) + (t_{i+k} - t) B_{i+1,k-1}(t) / (t_{i+k} - t_{i+1}),   k = 2, 3, ...,

B_{i,1}(t) = 1 if t_i ≤ t < t_{i+1}, and 0 otherwise.

In this system, nonperiodic blending functions for k = 3 are adopted.(10) We sample the points on the straight stroke as control points, and then move them to one side of the stroke. Thus, we can create an arc from these control points. An example is shown in Fig. 8. Figure 8(a) shows the control points sampled from a straight line. The control points are moved to one side according to parameters w and h. Figures 8(b) and (c) show two curves that result from using different values of w and h. The value of h controls the curvature of the curve, and the parameter w determines which point has the largest curvature. The values of h and w are varied according to probability models. If two strokes must be connected, we can smooth the corner formed by the two strokes by not choosing the connecting point as a control point. Figure 9 shows an example of corner smoothing.
The curve generated by the B-spline functions is formed by several line segments. Each short line segment is thickened directly. For visual effect, we also add noise on the boundary of the thickened curve. Examples of 64 generated character images are shown in Fig. 10.
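As an illustration of the stroke-bending step, the sketch below evaluates an order-3 (k = 3) B-spline with a nonperiodic knot vector using the recursion above, after moving the sampled control points to one side of the stroke. The bump profile used to displace the control points (driven by parameters w and h) is our own assumption, since the paper does not give the exact displacement rule.

```python
import numpy as np


def bspline_basis(i, k, t, knots):
    """Normalized B-spline B_{i,k}(t) via the recursion given in the text."""
    if k == 1:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    left = right = 0.0
    d1 = knots[i + k - 1] - knots[i]
    if d1 > 0:
        left = (t - knots[i]) / d1 * bspline_basis(i, k - 1, t, knots)
    d2 = knots[i + k] - knots[i + 1]
    if d2 > 0:
        right = (knots[i + k] - t) / d2 * bspline_basis(i + 1, k - 1, t, knots)
    return left + right


def bend_stroke(p0, p1, n_ctrl=5, w=0.5, h=6.0, k=3, samples=30):
    """Sample control points on the straight stroke p0-p1, push them sideways,
    and return points on the resulting order-k B-spline curve."""
    p0, p1 = np.asarray(p0, float), np.asarray(p1, float)
    ts = np.linspace(0.0, 1.0, n_ctrl)
    ctrl = np.outer(1 - ts, p0) + np.outer(ts, p1)
    d = p1 - p0
    normal = np.array([-d[1], d[0]]) / np.linalg.norm(d)
    # assumed bump profile: h controls the curvature, w locates the peak
    offsets = h * np.exp(-((ts - w) ** 2) / 0.1)
    ctrl += offsets[:, None] * normal
    # nonperiodic (open-uniform) knot vector for order k
    n = n_ctrl - 1
    knots = np.concatenate([np.zeros(k), np.arange(1, n - k + 2), np.full(k, n - k + 2)])
    lo, hi = knots[k - 1], knots[n + 1]
    curve = []
    for t in np.linspace(lo, hi, samples, endpoint=False):
        curve.append(sum(ctrl[i] * bspline_basis(i, k, t, knots) for i in range(n_ctrl)))
    return np.array(curve)


curve = bend_stroke((0, 0), (40, 0))   # a bent horizontal stroke from (0,0) to (40,0)
```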
4. PERFORMANCE EVALUATION OF AN OCR SYSTEM
In this section, we introduce a method of measuring the performance of an OCR system via our handwritten Chinese character generator. There are 250 modeling characters in our system, and 100 patterns per modeling character are generated. We focus on evaluating OCR systems that use strokes as matching features. The functions of this kind of OCR system can be divided into two stages: stroke extraction and matching. Below, we first discuss the stability of stroke extractors; then we discuss the recognition error of a matching module induced by the stroke extractor.

4.1. Two stroke extractors for analysis

Before presenting our performance analysis, we introduce two stroke extractors, one of which works on the basis of thinning and the other on the basis of vectorization.
The thinning-based stroke extractor has three main stages: thinning, line approximation, and line-segment merging. We use the thinning method presented by Chen and Hsu,(7) which was modified from Zhang and Suen's method.(6) Chen and Hsu's method uses a look-up table to check whether or not a pixel can be removed; the look-up table represents all 256 possible combinations of bit patterns in a 3 × 3 window. The methods for line approximation and line-segment merging are from Lee and Chen.(11) The thinning-based stroke extractor is a pixel-based method: if we have a character of size N × N, then obtaining the strokes has a complexity of at least N².
The vectorization-based stroke extractor operates directly on the run-length codes of a binary image. The complexity of this stroke extractor, as we will explain below, is approximately of the order of the number of runs plus the cost of run-length encoding. Because the extractor operates on groups of pixels rather than on individual pixels, it speeds up the processing. Line adjacency graphs (LAGs) are the basic data structures used in the vectorization method;(8) the nodes of a LAG correspond to line segments in a pixel image. Continuous segments in a LAG are grouped into a path node. Thus, a LAG can be transformed into a new graph whose nodes are merged from many segments in the original LAG. The new graph is called a "compressed LAG", abbreviated as "c-LAG". The vectorization algorithm proposed by Pavlidis(9) uses c-LAGs to extract the strokes of a character. From a c-LAG, the strokes that approximate the runs in the path nodes of the c-LAG are obtained. There are some drawbacks to this method when it is applied to Chinese character images.
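The following sketch shows the run-length encoding on which the vectorization-based extractor operates, and the links between runs of neighboring rows that form the edges of a LAG. It is only an illustration of these two ideas and does not reproduce Pavlidis's c-LAG construction or the path-node merging.

```python
def run_length_encode(image):
    """Encode a binary image (list of 0/1 rows) as horizontal runs (row, x_start, x_end)."""
    runs = []
    for y, row in enumerate(image):
        x = 0
        while x < len(row):
            if row[x]:
                x0 = x
                while x < len(row) and row[x]:
                    x += 1
                runs.append((y, x0, x - 1))
            else:
                x += 1
    return runs


def lag_edges(runs):
    """Link runs on adjacent rows that overlap horizontally: these links are the
    edges of the line adjacency graph (LAG) used by the vectorizer."""
    edges = []
    for a, (ya, xa0, xa1) in enumerate(runs):
        for b, (yb, xb0, xb1) in enumerate(runs):
            if yb == ya + 1 and xa0 <= xb1 and xb0 <= xa1:
                edges.append((a, b))
    return edges


image = [[0, 1, 1, 1, 0],
         [0, 0, 1, 0, 0],
         [0, 0, 1, 0, 0]]
runs = run_length_encode(image)          # [(0, 1, 3), (1, 2, 2), (2, 2, 2)]
edges = lag_edges(runs)                  # [(0, 1), (1, 2)]
```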
Fig. 11. Each of the two horizontal strokes will result in three line vectors.
When a horizontal line image is scanned horizontally, the widths of the segments are quite unstable, and we may obtain several line segments for the horizontal line after applying the one-pass vectorization method. These line segments are generally quite unstable. For example, in Figs 11(a) and (b), the run lengths of the two horizontal line images change rapidly; each of the two horizontal line images is divided into three regions, resulting in three line vectors.
To make the generated line segments more stable, we develop a stroke extractor based on two-pass vectorization. During the first pass, when we obtain a line vector, we check whether the angle between the line vector and the vertical direction is less than 45°. If it is, the line vector is accepted and the part of the image corresponding to the line vector is set to the background of the character image. During the second pass, the residual image is scanned vertically to construct another c-LAG, and the remaining line segments are detected. Figures 12 and 13 show two examples of the two-pass vectorization method. Figures 12(a) and 13(a) are the source images of input characters. Figures 12(b) and 13(b) show the vectors that incline vertically after the first pass. Figures 12(c) and 13(c) give the residual images after the vertically inclined lines are removed. Figures 12(d) and 13(d) present the line vectors after the second-pass vectorization. Figure 14 shows other results of the one-pass and two-pass vectorization methods. We find that the results of the two-pass vectorization method are more similar to line-vector characters than those of the one-pass vectorization method are.
Our experimental results show that the vectorization-based stroke extractor is much faster than the thinning-based stroke extractor. On a Sun workstation, the average execution time for thinning was 0.947 s, that for line approximation was 0.149 s, and that for merging was 0.42 s. For the stroke extractor based on two-pass vectorization, the average execution time for vectorization was 0.235 s and that for merging was 0.38 s.
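A minimal sketch of the first-pass filtering step of the two-pass vectorizer described above: line vectors within 45° of the vertical are kept and their pixels are erased before the second, vertical scan. The line vectors themselves are assumed to come from a one-pass vectorizer, which is not reproduced here; the data layout is illustrative.

```python
import math


def keep_near_vertical(vectors, image):
    """First pass: keep line vectors whose angle to the vertical is below 45 degrees
    and erase their pixels from the image. Each vector is ((x0, y0), (x1, y1), pixels),
    where pixels lists the (x, y) positions the vector covers."""
    kept, residual = [], [row[:] for row in image]
    for (x0, y0), (x1, y1), pixels in vectors:
        angle_to_vertical = math.degrees(math.atan2(abs(x1 - x0), abs(y1 - y0)))
        if angle_to_vertical < 45.0:
            kept.append(((x0, y0), (x1, y1)))
            for x, y in pixels:              # set covered pixels to the background
                residual[y][x] = 0
    return kept, residual


# Toy usage: one nearly vertical vector covering a one-pixel-wide column.
image = [[1 if x == 2 else 0 for x in range(5)] for _ in range(6)]
vectors = [((2, 0), (2, 5), [(2, y) for y in range(6)])]
kept, residual = keep_near_vertical(vectors, image)
# The residual image would then be scanned vertically in the second pass.
```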
Fig. 12. Result 1 of the two-pass vectorizer. (a) Source image. (b) The line segments after the first pass. (c) The residual image after the first pass. (d) The line vectors after the second-pass vectorization.

Fig. 13. Result 2 of the two-pass vectorizer. (a) Source image. (b) The segments after the first pass. (c) The residual image after the first pass. (d) The line vectors after the second-pass vectorization.

To evaluate the performance of the two stroke extractors, we make a few modifications to the thickening procedure. When a stroke in the line-vector character is thickened, each pixel is labeled with the index number of the stroke. Figure 15(a) gives an example: the image is generated from stroke 1 and stroke 2, and each pixel is labeled as stroke 1 or stroke 2, depending on the stroke to which the pixel belongs. Figure 15(b) shows that two of the four extracted line segments come from reference line vector 1, and the others come from reference line vector 2. Figure 16 shows the procedure for using the character generator: C_l is the line-vector Chinese character we generate, C_i is the Chinese character image obtained by applying the thickening procedure to C_l, and C_t and C_v denote the line segments of the character obtained by stroke extraction on C_i via thinning and via vectorization, respectively.

4.2. Stability analysis of stroke extractors
A stroke extractor that extracts strokes from character images may cause errors in the extracted line segments. We classify these errors into three cases: (1) a stroke in the line-vector character may be broken into more than one line segment or merged with another stroke; (2) the length of a stroke in the line-vector character may be changed; (3) the angle of a stroke in the line-vector character may be changed. Case 1 is the main problem for a stroke extractor, case 2 depends mainly on case 1, and case 3 does not occur frequently. Accordingly, we analyze the stability of the number of strokes after stroke extraction. Each stroke number labels only one stroke in C_l, but there may
Fig. 14. Results of the one-pass and two-pass vectorization methods.

Fig. 15. (a) The strokes that are thickened. Each pixel is labeled with a stroke number. (b) The four segments with different stroke numbers after being processed by a stroke extractor.

Fig. 16. The process of using the character generator.
be several strokes in C_t or C_v. To evaluate the stroke number error caused by a stroke extractor, we sum the differences between the number of corresponding line segments in C_t or C_v and the number of strokes in C_l, and then normalize the value. The error criterion functions are represented by
Σ_{i=1}^{N} |1 - T_i| / N   and   Σ_{i=1}^{N} |1 - V_i| / N,

where T_i and V_i are the numbers of line segments with stroke number i in C_t and C_v, respectively, and N is the number of strokes in C_l. Figure 17 shows the accumulative distribution of stroke number error, where the X axis represents the stroke number error, the Y axis represents the accumulative number of characters, and the point (x, y) indicates that there are y characters whose errors are less than or equal to x. From this figure, we find that stroke extraction via thinning is more stable than that via vectorization. The distribution of stroke number error can be taken as an indicator of the stability of a stroke extractor. From our analysis, we find that the performance of the thinning-based stroke extractor is more stable than that of the vectorization-based stroke extractor.
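A small sketch of the stroke number error criterion above, assuming each extracted segment already carries the stroke label assigned during thickening; the function name and label encoding are illustrative.

```python
def stroke_number_error(ref_stroke_count, extracted_labels):
    """Stroke number error of Section 4.2: extracted_labels[j] is the stroke index
    (1..N of the line-vector character C_l) attached to each extracted segment."""
    N = ref_stroke_count
    counts = [0] * (N + 1)
    for label in extracted_labels:
        counts[label] += 1
    return sum(abs(1 - counts[i]) for i in range(1, N + 1)) / N


# Toy usage: a 3-stroke character whose stroke 2 was broken into two segments.
print(stroke_number_error(3, [1, 2, 2, 3]))   # (|1-1| + |1-2| + |1-1|) / 3 = 0.333...
```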
Fig. 17. The accumulative distribution of stroke number error.
Fig. 18. Different recognition rates of the local matching algorithm caused by various inputs.
4.3. Error analysis for a recognition system

Another application of the character generator introduced here is in measuring the recognition error caused by the stroke extractor. Suppose that a matching module in an OCR system uses the strokes extracted by our stroke extractors. Since no feature-extraction error is involved, the line-vector character is used as the perfect input of the matching module, and the recognition rate using line-vector characters as input is taken as the peak recognition rate of the matching module. When the line vectors extracted by a stroke extractor are used as the input of the matching module instead, the recognition rate decreases, because the input of the matching module contains errors caused by the stroke extractor. The matching module used in our experiments is a local matching method,(12) which uses the strokes of an input character as features. Figure 18 shows that the feature-extraction errors caused by the thinning-based and vectorization-based stroke extractors reduce the recognition rate by about 7.1% and 13.4%, respectively, in the local matching algorithm. The peak recognition rate of the local matching module, which does not include feature-extraction error, is 92.7% for the 250-character database. We find that the thinning-based stroke extractor is more suitable for the local matching module than the vectorization-based stroke extractor is.

5. CONCLUSIONS

In this paper, we have presented an artificial handwritten Chinese character generator that generates images of characters. The character generator consists of two components: a line-vector character generator and a thickening procedure. The former is used to generate artificial line-vector characters; the latter is used to thicken these line-vector characters stroke by stroke and add noise on the boundary of each thickened line. In addition, a criterion function was used to compare the strokes produced by two stroke extractors, one using thinning and the other vectorization, with the line-vector character. The error values obtained from the criterion function were used to measure the stability of the stroke extractors. According to these error values, stroke extraction via thinning is more stable than that via vectorization. For an OCR system, it is not easy to determine which problems decrease the recognition rate. This paper presents an approach for determining whether the recognition error is mainly due to the stroke extractor or to the matching module.
There are three possible improvements that could be made to our character generator:
(1) The variance parameters used in the character generator affect the result of the generator; different character patterns can be generated depending on the values of these parameters. We could collect many line-vector characters from actual handwritten characters as training data to obtain the variance parameters used in the line-vector character generator.
(2) We could simulate the effects of using different kinds of pens and writing on different kinds of paper by changing the widths of the line segments. This simulation could then be used as a more reliable thickening model to produce more realistic input for the stroke extractor.
(3) We could further study the validation of generated characters. A database of actual human handwritten character images is needed. A Turing test(5) would be appropriate for determining the similarity between generated characters and human handwritten characters. If the results of the Turing test were satisfactory, the generated characters could be taken to be similar to human handwritten characters.
REFERENCES
1. R. M. K. Sinha and H. C. Karnick, PLANG based specification of pattern with variations for pictorial databases, Comput. Vision Graphics Image Process. 43, 98-110 (1988).
2. V. K. Govindan and A. P. Shivaprasad, Artificial database for character recognition research, Pattern Recognition Lett. 12, 645-648 (1991).
3. C. H. Leung, Y. S. Cheung and K. P. Chan, A distortion model for Chinese character generation, IEEE Proc. Int. Conf. Cybernetics and Society, Tucson, Arizona, U.S.A., pp. 38-41 (1985).
4. K. Ishii, Generation of distorted characters and its applications, Syst. Comput. Controls 14(6), 19-27 (1983).
5. E. Rich and K. Knight, Artificial Intelligence, 2nd Edn. McGraw-Hill, New York (1991).
6. T. Y. Zhang and C. Y. Suen, A fast parallel algorithm for thinning digital patterns, Commun. ACM 27, 236-239 (1984).
7. Y. S. Chen and W. H. Hsu, A modified fast parallel algorithm for thinning digital patterns, Pattern Recognition Lett. 7, 99-106 (1988).
8. T. Pavlidis, A hybrid vectorization algorithm, Proc. 7th Int. Conf. Pattern Recognition, Montreal, Canada, pp. 490-492 (1984).
9. T. Pavlidis, A vectorizer and feature extractor for document recognition, Comput. Vision Graphics Image Process. 35, 111-127 (1986).
10. A. K. Jain, Fundamentals of Digital Image Processing. Prentice-Hall, Englewood Cliffs, New Jersey (1989).
11. H. J. Lee and B. Chen, Recognition of handwritten Chinese characters via short line segments, Pattern Recognition 25, 543-552 (1992).
12. N. C. Wang, A handwritten Chinese text recognition system with a contextual postprocessing module, Master's thesis, Institute of Computer Science and Information Engineering, National Chiao Tung University (1991).
About the Author--CHENG-HUANG TUNG was born in Tainan city, Taiwan, Republic of China, on 28 May
1967. He received a B.S. degree in computer science and information engineering from the National Chiao Tung University, Hsinchu, Taiwan, in 1989. He is now a Ph.D. candidate in the Institute of Computer Science and Information Engineering, National Chiao Tung University, Taiwan. His research interests are in the areas of pattern recognition, artificial intelligence, and graphics.
About the Author--YUAN-JONG CHEN was born in Tao-Yuan county, Taiwan, Republic of China, on 8
August 1968. He received a B.S. degree in computer science from the Soochow University, Taipei, Taiwan, in 1990, and received an M.S. degree in computer engineering from the National Chiao Tung University, Hsinchu, Taiwan, in 1992. He serves in the army now. His research interests are in the areas of pattern recognition and image processing.
About the Author--HSI-JIAN LEE received B.S., M.S., and Ph.D. degrees in computer engineering from the National Chiao Tung University, Hsinchu, Taiwan, in 1976, 1980, and 1984, respectively. From 1981 to 1984 he was a Lecturer at the Department of Computer Engineering, National Chiao Tung University, and from 1984 to 1989 an associate professor at the same department. Since August 1989 he has been with the National Chiao Tung University as a professor. He is at present the chairman of the Department of Computer Science and Information Engineering, National Chiao Tung University. His current research interests include image processing, computer vision, pattern recognition, artificial intelligence, and natural language processing. He is a member of Phi Tau Phi, the Association for Computational Linguistics, and the Chinese Language Computer Society.