Stroke segmentation by Bernstein-Bezier curve fitting


0031-3203/90 $3.00 + .00 Pergamon Press plc © 1990 Pattern Recognition Society

Pattern Recognition, Vol. 23, No. 5, pp. 475-484, 1990 Printed in Great Britain

CHIA-WEI LIAO and JUN S. HUANG*
Computer Vision Laboratory, Institute of Information Science, Academia Sinica, Taipei, Taiwan, Republic of China

(Received 9 February 1989; in revised form 21 June 1989; received for publication 17 July 1989)

Abstract--Stroke segmentation is essential to the recognition of handwritten Chinese characters. Here some new and reliable techniques are proposed. First, a thinning process is applied to each character to obtain the skeleton; then a maximum circle technique is used to remove the spurious branches and merge the split 4-fork points. All possible pairs of stroke segments connected at the same fork point are considered, and a Bernstein-Bezier curve is fitted to each pair to smooth the data and find its trend. From the result of this curve fitting we can decide which pair belongs to the same stroke. Finally the inflection points with very small radius of curvature are found, and the stroke segmentation is carried out based on these inflection points. Experiments on some Chinese characters show that the proposed techniques are reliable and time saving compared with the direction method.

Bernstein-Bezier curve   Curve fitting   Inflection point   Radius of curvature   Stroke segmentation

1. INTRODUCTION

Handwritten Chinese characters are very complicated and in general vary widely, hence the classical approaches to their recognition are rather limited and are unable to reach a high recognition rate.(1) Among the popular approaches a promising one is stroke analysis, since intuitively Chinese characters are likely to consist of relatively few elementary strokes,(2) and this is particularly true when we imagine how they are written. As pointed out in reference (1), stroke analysis approaches can be divided into three groups. The first is the local or semi-local level, the second is the perceptual level and the third is the knowledge level. The method proposed here belongs to the perceptual level; it differs from the work of Kobayashi et al.(3) in that they consider straight line segments and use the inner product of two line segments as a measure for segmenting a stroke, whereas we use the Bernstein-Bezier curve(4) to smooth the skeleton of a stroke and use the curvature of the fitted curve to segment a stroke.

Stroke segmentation has been studied by many authors.(2-10) Some of them use the structural information obtained after thinning, and the others use the directional information of each point and the line template, which is too time consuming. In reference (9) the authors use the Hough transform to extract the linear strokes obtained after thinning. Although thinning in general produces distorted skeletons,(11-13) this approach to stroke segmentation is still pursued by many authors because of its simplicity and its ability to be implemented in parallel hardware. Here we adopt this approach and devise a maximum circle technique to correct the distorted skeleton obtained after thinning (e.g. to find and delete the spurious branches and merge two 3-fork points into a 4-fork point). Finally we consider all the possible pairs of the stroke segments connected at the same fork point, then use the properties of the Bernstein-Bezier curve to decide which pair belongs to the same stroke. Experiments show that the new method is reliable and time saving compared with the directional method.(9)

* To whom correspondence should be addressed.

2. STROKE SEGMENTATION

In general, the local properties (or features) of a character are not very stable when the character is thinned, so we usually cannot get the correct fork points of a character after thinning. For example, a 4-fork point in the original character is often split into two 3-fork points, and a 5-fork point into three 3-fork points, as shown in Fig. 1. The global features, however, remain almost the same. So if we want correct and stable features such as fork points and strokes, the only way is to analyse a character from a global view. Here we devise a new method, based on the idea of curve fitting, to extract strokes from a character; this method looks at characters from the global view. The new method consists of the following four parts:

Part 1--thinning. The selection of the thinning algorithm does not matter. Here we use the thinning algorithm developed by Zhang and Suen.(11)

Part 2--modifying the result of thinning. Whatever thinning algorithm is used, distortions are always introduced by thinning. Here we use the maximum circle technique and some heuristics to modify the thinning result so that the skeleton represents the original shape.

Part 3--extracting strokes from thinned characters. The strokes extracted in this part may contain inflection points, and they terminate only at the end or fork points of the thinned character processed by Part 1 and Part 2. We call this kind of stroke a 1st-stage stroke.

Part 4--refining strokes obtained from Part 3. We split every 1st-stage stroke at its inflection points. For example, a 1st-stage stroke containing one inflection point may be divided into two smooth and flat strokes. We call this kind of stroke a 2nd-stage stroke.

We shall discuss these four parts in detail in the following paragraphs.

Fig. 1. (a) A "4-fork" point is split into two "3-fork" points after thinning. (b) A "5-fork" point is split into three "3-fork" points after thinning.

Part 1

The differences among various thinning algorithms do not seriously influence the result of stroke extraction, so we do not care very much about which algorithm is chosen. Here we adopt the algorithm developed by Zhang and Suen(11) (and later revised by Chen and Hsu(12)) because this algorithm is simple, fast and can be implemented in parallel. After some experiments we discovered a bug in that algorithm. When the thinning process terminates it is possible for this algorithm to generate the result shown in Fig. 2(a). The fork type and the fork location are then misjudged when the thinned character is traced, and all the subsequent stroke extraction fails because of this misjudgement. So every time the thinning process completes we rescan the result to check whether this kind of bug exists. If it exists, we correct it by deleting a certain point; the corrected result is shown in Fig. 2(c).

Fig. 2. A "3-fork" point in the original character may be converted into the form shown in (a), and it can be corrected in the way depicted in (b), where the marked pixel should be deleted; (c) is the result after correction.

Part 2

Fork points are not stable under almost every kind of thinning algorithm. For example, a 4-fork point is usually split into two 3-fork points after thinning, as shown in Fig. 1. Thinning also produces spurious forks at the end of a stroke when the original unthinned stroke is swollen at the ends, as shown in Fig. 3. Here we present a reliable method for merging these fork points; this method can also delete spurious forks at the end of a stroke. First we define a cluster to be the class of fork points that should be merged together. We then get the center of a cluster by averaging the fork points in the cluster and use this center point to represent the cluster. The algorithm for finding a cluster is as follows:

Step 1--let S be the original unthinned character and T be the thinning result.

Step 2--for every fork point f (we regard an end point as a 1-fork point), no matter what kind of fork it is, figure out the radius of the largest circle within S that is centered at f. If the radius is less than one (when the stroke is very thin), we set the radius of this circle to 1. Let $R_f$ represent the radius of the fork f. An example is shown in Fig. 4(a).

Step 3--for every pair of forks $f_1$ and $f_2$ calculate the distance $d_{f_1,f_2}$ between them. If $d_{f_1,f_2}$ is less than or equal to $R_{f_1} + R_{f_2}$, we say $f_1$ and $f_2$ are connected; that is, the two largest circles of $f_1$ and $f_2$ intersect each other, as shown in Fig. 4(b).

Step 4--if $f_1$ and $f_2$ are connected and $f_2$ and $f_3$ are also connected, we say $f_1$, $f_2$ and $f_3$ are all connected with one another. Using this criterion we can finally partition all the fork points into several clusters, and a fork point connects only with those fork points in the same cluster (an example is shown in Fig. 4(b)).

Step 5--erase all pixels within each circle of each fork point except the fork point itself, as shown in Fig. 4(c). For every cluster K, average the positions of the fork points in K to get the center $C_k$; $C_k$ is used to represent this cluster. Now we reconnect each stroke with the center point $C_k$, and a new thinned character is formed as shown in Fig. 4(d).

Step 6--we can determine what kind of fork a cluster is by the following program:

    if this cluster contains only one fork point, which is a 1-fork
    then
        counter = 1;
    else
    {
        counter = 2;
        for every fork f in this cluster do
        {
            if f is 1-fork
                counter = counter - 1;
            else
                counter = counter + KIND(f) - 2;
        }
    }
    end.

The result is that this cluster is a counter-fork point. Here KIND is the function returning the kind of fork. For example, KIND(5-fork) = 5 and KIND(3-fork) = 3. An example is depicted in Fig. 4(d).

Fig. 3.

Fig. 4(a) The original character S and the thinned character T.

Fig. 4(b) F3 forms a cluster itself. F7 forms a cluster itself. F2 is connected with F1, so they form a cluster. F4 is connected with F5 and F6, so F4, F5 and F6 form a cluster.

Fig. 4(c) All fork points and strokes are disconnected.

Fig. 4(d) C3 is equal to F3. C3 is an end point (1-FORK point). C2 is equal to F7. C2 is an end point (1-FORK point). C1 is equal to the average of F2 and F1. Since 2 + (3 - 2) + (3 - 2) = 4, C1 is a 4-FORK point. C4 is equal to the average of F4, F5 and F6. Since 2 + (3 - 2) - 1 - 1 = 1, C4 is an end point (1-FORK point).

Every spurious fork point f usually has one short stroke $S_1$ or two short strokes $S_1$ and $S_2$ connected to it (see F4 in Fig. 4(b)). We consider the case of two short strokes as an example. Let $E_1$ and $E_2$ be the other end points of $S_1$ and $S_2$, respectively. $E_1$ and $E_2$ are always close to f. Since f, $E_1$ and $E_2$ are very close to one another, the circles (defined in Step 2) of f, $E_1$ and $E_2$ overlap one another; that is, f, $E_1$ and $E_2$ belong to the same cluster. By Step 6 we have counter = 1 for the cluster containing f, so the cluster is an end point (a 1-fork point means an end point). In this way we replace this kind of spurious fork point with a correct end point. An example is depicted at point C4 of Fig. 4(d).

The reason why we set the radius of every circle to be greater than or equal to 1 is that when the stroke is very thin and a 4-fork point is split into two 3-fork points, these two 3-fork points will not be merged together if their radii are zero, as shown in Fig. 5.
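The clustering and fork-counting rules of Steps 3-6 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names (`cluster_forks`, `cluster_center`, `cluster_kind`), the union-find bookkeeping and the sample coordinates are all hypothetical, and the radii $R_f$ are assumed to have been computed already as in Step 2.

```python
import math


def cluster_forks(points, radii):
    """Group fork points whose largest circles touch or overlap (Steps 3-4).

    points: list of (x, y) fork/end point positions.
    radii:  list of R_f values, one per point (already floored at 1).
    Returns a sorted list of clusters, each a list of point indices.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            # Step 3: connected when d(f1, f2) <= R_f1 + R_f2.
            if math.dist(points[i], points[j]) <= radii[i] + radii[j]:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(points)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


def cluster_center(points, members):
    """Step 5: the cluster center is the average of its fork points."""
    xs = [points[i][0] for i in members]
    ys = [points[i][1] for i in members]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def cluster_kind(kinds):
    """Step 6: fork kind of a cluster, given KIND(f) for each member f."""
    if kinds == [1]:
        return 1
    counter = 2
    for kind in kinds:
        counter += -1 if kind == 1 else kind - 2
    return counter


# Hypothetical data: two 3-forks that should merge into a 4-fork, a 3-fork
# with two nearby end points (a spurious fork), and an isolated end point.
points = [(10, 10), (13, 10), (40, 40), (42, 41), (41, 42), (80, 5)]
kinds = [3, 3, 3, 1, 1, 1]
radii = [2, 2, 2, 1, 1, 1]
clusters = cluster_forks(points, radii)
cluster_kinds = [cluster_kind([kinds[i] for i in c]) for c in clusters]
```

With this sample data the first cluster plays the role of C1 in Fig. 4(d) and the second the role of C4: two merged 3-forks give a 4-fork, and a 3-fork merged with two end points gives an end point.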

Fig. 5. (a) Original unthinned character. If the strokes are very thin, the thinning result will be almost the same as the original character. (b) Thinning result. F1 and F2 will never be merged because the largest circles within the original unthinned character have zero radii.
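The radius $R_f$ of Step 2, including the floor at 1 motivated by Fig. 5, might be computed on a binary image as in the sketch below. The paper does not say how the largest circle is discretized, so the pixel-disc test used here, the function name `max_circle_radius` and the row-major 0/1 image layout are assumptions made for illustration.

```python
def max_circle_radius(image, fx, fy):
    """Radius of the largest pixel disc centered at (fx, fy) inside the stroke.

    image: list of rows of 0/1 values (1 = stroke pixel of the unthinned
           character S); image[y][x] is the pixel at column x, row y.
    The result is never smaller than 1, as required by Step 2.
    """
    height, width = len(image), len(image[0])

    def disc_inside(r):
        # True when every pixel within Euclidean distance r of the center
        # lies inside the image and belongs to the stroke.
        for y in range(fy - r, fy + r + 1):
            for x in range(fx - r, fx + r + 1):
                if (x - fx) ** 2 + (y - fy) ** 2 <= r * r:
                    if not (0 <= y < height and 0 <= x < width):
                        return False
                    if not image[y][x]:
                        return False
        return True

    r = 0
    while disc_inside(r + 1):
        r += 1
    return max(r, 1)  # thin strokes get radius 1 instead of 0


# A 5 x 5 blob inside a 7 x 7 image, and a one-pixel-wide horizontal line.
blob = [[1 if 1 <= x <= 5 and 1 <= y <= 5 else 0 for x in range(7)]
        for y in range(7)]
thin_line = [[1 if y == 3 else 0 for x in range(7)] for y in range(7)]
```

For the thin line the disc of radius 1 already leaves the stroke, so the raw radius would be 0; the floor at 1 is what lets two neighbouring 3-fork points on such a stroke still be merged.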


Part 3

In this part we extract strokes from the character processed by Part 2. At this stage all the fork and end points have been properly handled, and the structure of the character is much more reliable than that of the character processed only by Part 1. Here we define a vertex to be a fork or an end point, and an edge to be a line (or curve) connecting two adjacent vertices. Let $E_1, E_2, \ldots, E_m$ be the edges that are connected to the same fork point, which is an m-fork point. First, we want to know which pairs of edges connected at this fork point belong to the same stroke. If this is done for all the fork points except end points, we get edge pairs that belong to the same stroke, and we also get isolate strokes that contain only one edge. Every edge pair forms a class. If two classes contain an edge in common, we say these two classes overlap each other. Secondly, we combine all the classes that overlap one another to form new classes. This combining operation terminates when none of the classes overlap one another. The edges in the same class constitute a stroke. The strokes extracted here may contain turning (or inflection) points, and they terminate only at end or fork points of the original thinned character.

At the beginning of this paper we stated that to extract strokes from a character correctly we must observe the character from the global view. Thinning always makes strokes zigzag in the vicinity of the fork points, and every time we reinput the same character we get different zigzag lines near the same fork points after thinning. So we can hardly find the correct and stable directions or other attributes of the strokes near fork points by observing just the local areas around fork points. The method presented here finds the appropriate pairs of edges, belonging to the same stroke and connected at the same fork point, from the global structure of the character.
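The combining operation described above (merge classes that share an edge until no two classes overlap) amounts to finding connected components over the edge pairs. A minimal Python sketch follows; the function name `group_edges_into_strokes` and the union-find approach are illustrative choices rather than the paper's own code, and the edge labels are taken from the example of Fig. 9.

```python
def group_edges_into_strokes(edges, edge_pairs):
    """Merge overlapping edge-pair classes into strokes.

    edges:      all edge labels of the thinned character.
    edge_pairs: pairs of edges judged to belong to the same stroke.
    Returns a sorted list of strokes, each a sorted list of edge labels.
    Edges that appear in no pair come out as single-edge (isolate) strokes.
    """
    parent = {e: e for e in edges}

    def find(e):
        # Union-find root lookup with path halving.
        while parent[e] != e:
            parent[e] = parent[parent[e]]
            e = parent[e]
        return e

    for a, b in edge_pairs:
        parent[find(a)] = find(b)  # classes sharing an edge end up merged

    strokes = {}
    for e in edges:
        strokes.setdefault(find(e), []).append(e)
    return sorted(sorted(s) for s in strokes.values())


# Edge pairs P1..P4 from the example of Fig. 9; E7 and E8 are isolate strokes.
edges = ["E1", "E2", "E3", "E4", "E5", "E6", "E7", "E8", "E9"]
pairs = [("E1", "E4"), ("E4", "E9"), ("E2", "E3"), ("E5", "E6")]
strokes = group_edges_into_strokes(edges, pairs)
```

Because P1 and P2 share E4, they collapse into the single compound stroke (E1, E4, E9), which matches P5 in Fig. 9.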
We describe this method in the following steps:

Step 1--for every n-fork point f, n > 2, find all possible combinations of edge pairs connected at f. There are $\binom{n}{2}$ combinations for f. Merging every possible edge pair, we get $h = \binom{n}{2}$ new lines (curves) $l_1, l_2, \ldots, l_h$, as shown in Fig. 6.

Fig. 6. $\binom{4}{2} = (4 \cdot 3)/(1 \cdot 2) = 6$. L1 = (E1,E2), L2 = (E1,E3), L3 = (E1,E4), L4 = (E2,E3), L5 = (E2,E4), L6 = (E3,E4).

Step 2--for every line $l_i$ figure out the Bernstein-Bezier curve(4) that best fits $l_i$ and has the same start and end points as $l_i$. Let these curves be $BB_i$, $1 \le i \le h$. Every time we reinput the same character the trend of every stroke is almost the same, and we can learn the trend of $l_i$ from $BB_i$.

Step 2.1. Let $P(0), P(1), \ldots, P(k)$ be the points of a line $l_i$, where $k + 1$ is the total number of points in line $l_i$, $P(0)$ is the start point and $P(k)$ the end point. The Bernstein-Bezier polynomial is

$$B(t) = \sum_{i=0}^{m} \binom{m}{i} P_i \, t^i (1 - t)^{m-i} \quad \text{for } 0 \le t \le 1,$$

where $P_0$ and $P_m$ are end points and the points $P_1, P_2, \ldots, P_{m-1}$ are control points.
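To make Step 2 concrete, the following Python sketch evaluates the Bernstein-Bezier polynomial and fits the interior control points to a sampled line by least squares, with the two end points held fixed. It is a sketch under stated assumptions: the paper only says that the least square error method is used, so the normal-equation formulation, the small Gauss-Jordan solver and the names `bernstein`, `bezier_point`, `solve` and `fit_bezier` are illustrative choices, and the degree m is passed in by the caller.

```python
from math import comb


def bernstein(m, i, t):
    """Bernstein basis polynomial C(m, i) * t^i * (1 - t)^(m - i)."""
    return comb(m, i) * t ** i * (1 - t) ** (m - i)


def bezier_point(ctrl, t):
    """Evaluate B(t) for control points ctrl = [P_0, ..., P_m]."""
    m = len(ctrl) - 1
    return tuple(sum(bernstein(m, i, t) * p[d] for i, p in enumerate(ctrl))
                 for d in (0, 1))


def solve(a, rhs):
    """Solve a * x = rhs by Gauss-Jordan elimination with partial pivoting.

    rhs holds one row per equation and one column per right-hand side.
    """
    n = len(a)
    aug = [list(a[r]) + list(rhs[r]) for r in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [tuple(v / aug[r][r] for v in aug[r][n:]) for r in range(n)]


def fit_bezier(points, m):
    """Least-squares fit of a degree-m curve to P(0..k), end points fixed.

    Minimizes sum_j |B(j/k) - P(j)|^2 over the interior control points,
    treating the X and Y components independently.
    """
    k = len(points) - 1
    if m < 2:
        return [tuple(points[0]), tuple(points[-1])]
    size = m - 1
    a = [[0.0] * size for _ in range(size)]
    rhs = [[0.0, 0.0] for _ in range(size)]
    for j, p in enumerate(points):
        basis = [bernstein(m, i, j / k) for i in range(m + 1)]
        # Remove the contribution of the two fixed end points.
        resid = [p[d] - basis[0] * points[0][d] - basis[m] * points[-1][d]
                 for d in (0, 1)]
        for r in range(size):
            for c in range(size):
                a[r][c] += basis[r + 1] * basis[c + 1]
            for d in (0, 1):
                rhs[r][d] += basis[r + 1] * resid[d]
    return [tuple(points[0])] + solve(a, rhs) + [tuple(points[-1])]
```

As a quick check, sampling a known curve and fitting a curve of the same degree should recover the original control points up to rounding error.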

Here we set m to be [k/c], where c is a constant greater than 1. In our experiment c is [(width of character + length of character)/10]. Since the best-fit Bernstein-Bezier curve B(t) has the same end points as $l_i$, $P_0 = P(0) = B(0)$ and $P_m = P(k) = B(1)$. What we want is that $B(j/k)$ should be as close to $P(j)$ as possible for $0 \le j \le k$. Since every point has X and Y coordinates, we first find the X component of every control point of the best-fit Bernstein-Bezier curve, then we deal with the Y component in the same manner.

Step 2.2. Let

$$B_x(t) = \sum_{i=0}^{m} \binom{m}{i} t^i (1 - t)^{m-i} P_{xi}, \quad 0 \le t \le 1,$$

so that $B_x(0) = P_x(0) = P_{x0}$ and $B_x(1) = P_x(k) = P_{xm}$. Here $B_x(t)$, $P_{xi}$, $P_x(0)$ and $P_x(k)$ mean the X components of $B(t)$, $P_i$, $P(0)$ and $P(k)$, respectively. By the least square error method we find the X components $(P_{x1}, P_{x2}, \ldots, P_{x(m-1)})$ of the control points of the Bernstein-Bezier curve that best fits $l_i$. Let

$$\mathrm{error}(P_{x1}, P_{x2}, \ldots, P_{x(m-1)}) = \sum_{i=0}^{k} \left( B_x(i/k) - P_x(i) \right)^2.$$

Now we differentiate $\mathrm{error}(P_{x1}, P_{x2}, \ldots, P_{x(m-1)})$ with respect to $P_{x1}, P_{x2}, \ldots, P_{x(m-1)}$ and set the results to zero, i.e. $\partial(\mathrm{error})/\partial P_{x1} = 0$, $\partial(\mathrm{error})/\partial P_{x2} = 0$, ..., $\partial(\mathrm{error})/\partial P_{x(m-1)} = 0$. Since there are $(m - 1)$ variables and $(m - 1)$ linear equations, we can get $P_{x1}, P_{x2}, \ldots, P_{x(m-1)}$ easily. The Y components $(P_{y1}, P_{y2}, \ldots, P_{y(m-1)})$ of the control points are calculated in the same manner. Thus we get all the control points of the Bernstein-Bezier curve that best fits $l_i$.

Step 2.3. Use these control points to reconstruct the Bernstein-Bezier curve $BB_i$.

Step 3--because every curve $BB_i$ can be represented by parametric equations with the parameter ranging from 0 to 1 (that is, it is continuous, smooth and free of zigzags), we can get mathematical attributes of every point on $BB_i$ such as gradient, curvature, etc. We can say that these attributes are obtained from a global view, since every point on $BB_i$ is influenced by all points on $l_i$, not just by a local interval. A local zigzag in $l_i$ always corresponds to a smooth curve in $BB_i$ (see the curve in the vicinity of the fork point shown in Fig. 7). For every curve $BB_i$ find the point $P_i$ on $BB_i(t)$, $0 \le t \le 1$, that is closest to the fork point f, and then figure out the radius of curvature $r_i$ of $BB_i$ at the point $P_i$. Let $P_i$ be $(X_i, Y_i)$; then

$$r_i = \left| \frac{(X_i'^2 + Y_i'^2)^{1.5}}{X_i' Y_i'' - Y_i' X_i''} \right|,$$

where $'$ and $''$ mean the first and the second derivatives with respect to t.

Step 4--after carefully observing many Chinese characters we make the three assumptions stated below:

(1) By experience we know that the strokes in a character are very smooth everywhere except at inflection points. It hardly ever happens that a fork point in a stroke is also an inflection point. So we assume that a stroke should be as smooth as possible in the vicinity of the fork point.

(2) A 3-fork point is composed of one stroke going through it and another stroke terminating at it.

(3) An n-fork point with n > 3 is composed of two strokes going through it and the others terminating at it.

According to the first assumption we set the criterion that the line $l_i$ with the greatest radius $r_i$ should be chosen as the first edge pair ($l_i$ is formed by a pair of edges), and that the line $l_j$ with the greatest $r_j$ among the lines which share no common edge with $l_i$ should be chosen as the second edge pair. The calculation of $l_j$ is required only when f is an n-fork point with n > 3. In this way we can figure out the stroke pair(s) of f. An example is depicted in Fig. 8. The main idea of this step is that every stroke is very smooth at the fork point, and thus we regard the edge pair whose best-fit curve has the greatest radius of curvature in the vicinity of the fork point as part of a stroke.

Step 5--use the algorithm stated above to find all the edge pairs of all fork points except the end points. An edge not appearing in any edge pair forms a stroke by itself, and we call this kind of stroke an isolate stroke. We say that two edge pairs are connected if they have an edge in common. Thus we can use a connected component algorithm to partition all the edge pairs into several mutually exclusive classes. The edge pairs in the same class form a stroke, and we call this kind of stroke a compound stroke because it is formed by more than one edge.

Step 6--store both the isolate and the compound strokes; these strokes are what we call 1st-stage strokes. The strokes extracted here may contain inflection points and terminate only at end or fork points of the original thinned character. An example is given in Fig. 9.

Fig. 7.

Fig. 8. P is the closest point to F. L1 = (E1,E2), L2 = (E1,E3), L3 = (E2,E3). L1 obviously has the greatest radius of curvature at P, so we consider that E1 and E2 belong to one stroke and E3 belongs to another.

Part 4

The 1st-stage strokes may not meet our demands: we may want every stroke to contain no inflection points and to terminate at end, fork or inflection points of the original character. In that case we refine the 1st-stage strokes to get the 2nd-stage strokes. Every 1st-stage stroke is checked by the following method to see whether it can be divided into two or more 2nd-stage strokes.

Step 1--suppose the current 1st-stage stroke S contains n + 1 points. Let $S(0), S(1), \ldots, S(n)$ be the points of S in sequence.

Step 2--use the method stated in Part 3 to generate the Bernstein-Bezier curve that best fits S. Let $BS(t)$, $0 \le t \le 1$, be this curve. The shapes of the curve $BS(t)$ and the stroke S are very much alike, so $BS(t)$ can be used instead of S in some cases, for example when smoothness is required. After many experiments, we found that $BS(i/n)$ is always very close to $S(i)$ for all integers $0 \le i \le n$, so an inflection point with parameter $t = i/n$ on the curve BS corresponds to an inflection point on the stroke S with index i.

Step 3--a curve is said to be flat at some point if the radius of curvature at that point is greater than some threshold $t_r$. We define an inflection curve portion to be a part of the curve that is flat in the vicinities of its two end points and for which the angle between the two gradients at these two end points is greater than some threshold $t_a$. Trace $BS(i/n)$, for $0 \le i \le n$, to find all the inflection curve portions. For every inflection curve portion, find the point $r_j$ in it with the minimum radius of curvature; we say that $r_j$ is an inflection point. Record the parameter value $r_{jp}$ of the inflection point $r_j$. Since $BS(i/n)$ is almost equal to $S(i)$ and $BS(r_{jp})$ is an inflection point, $S([n \cdot r_{jp}])$ is an inflection point on the stroke S. In this way we get all the inflection points on the stroke S. According to our experience $t_r$ is 15 and $t_a$ is 39° for a character size of 120 x 120.

Step 4--split the current stroke S at the inflection points, and thus generate some smaller strokes which are very flat and smooth. We call these strokes 2nd-stage strokes. An example is shown in Fig. 10.

Every 1st-stage stroke refined by the above steps generates at least one 2nd-stage stroke, and we consider the features of the 2nd-stage strokes to be more useful and easier to extract than those of the 1st-stage strokes. For example, the direction and length of 2nd-stage strokes are more easily defined than those of 1st-stage strokes. When smoothness is required, we can also use the curve fitting method stated above to replace a 2nd-stage stroke by its best-fit curve, which is smoother.

Fig. 9. (a) Isolate strokes: E7, E8. Edge pairs: P1(E1,E4), P2(E4,E9), P3(E2,E3), P4(E5,E6). E4 exists in both P1 and P2, so P1 and P2 are connected. Merging P1 and P2 gives P5(E1,E4,E9). Compound strokes: P3, P4 and P5. (b) There is an inflection point in E9, so P5 contains an inflection point.

Fig. 10. (a) "1-STAGE" stroke. (b) Bernstein-Bezier curve. (c) The "1-STAGE" stroke in (a) is split into two "2-STAGE" strokes at inflection point P.
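The radius of curvature used in Step 3 of Part 3, and the inflection-point search of Part 4, can be sketched in Python as below. Several things here are assumptions rather than the authors' code: the derivatives are obtained from the standard hodograph property of Bezier curves, an "inflection curve portion" is interpreted as a maximal run of non-flat samples bounded on both sides by flat samples, and the names `radius_of_curvature` and `find_inflection_indices` are hypothetical. The default thresholds are the values the paper reports for 120 x 120 characters.

```python
import math
from math import comb


def _bezier(ctrl, t):
    """Evaluate a Bezier curve (or its hodograph) at parameter t."""
    m = len(ctrl) - 1
    return tuple(sum(comb(m, i) * t ** i * (1 - t) ** (m - i) * p[d]
                     for i, p in enumerate(ctrl)) for d in (0, 1))


def _diff(ctrl):
    """Control points of the derivative curve: m * (P_{i+1} - P_i)."""
    m = len(ctrl) - 1
    return [(m * (q[0] - p[0]), m * (q[1] - p[1]))
            for p, q in zip(ctrl, ctrl[1:])]


def radius_of_curvature(ctrl, t):
    """r = |(X'^2 + Y'^2)^1.5 / (X'Y'' - Y'X'')|; infinite where straight."""
    d1 = _diff(ctrl)
    x1, y1 = _bezier(d1, t)
    x2, y2 = _bezier(_diff(d1), t)
    cross = x1 * y2 - y1 * x2
    if cross == 0:
        return math.inf
    return (x1 * x1 + y1 * y1) ** 1.5 / abs(cross)


def find_inflection_indices(ctrl, n, t_r=15.0, t_a_deg=39.0):
    """Indices i (parameter i/n) of inflection points on the fitted curve."""
    radii = [radius_of_curvature(ctrl, i / n) for i in range(n + 1)]
    tangents = [_bezier(_diff(ctrl), i / n) for i in range(n + 1)]
    found = []
    i = 1
    while i < n:
        if radii[i] > t_r:          # flat sample, keep tracing
            i += 1
            continue
        start = i
        while i < n and radii[i] <= t_r:
            i += 1
        end = i - 1                 # non-flat run is start..end
        if radii[start - 1] > t_r and radii[end + 1] > t_r:
            a, b = tangents[start - 1], tangents[end + 1]
            angle = abs(math.degrees(math.atan2(a[0] * b[1] - a[1] * b[0],
                                                a[0] * b[0] + a[1] * b[1])))
            if angle > t_a_deg:
                found.append(min(range(start, end + 1),
                                 key=radii.__getitem__))
    return found


# Hypothetical control points: a straight stroke and a stroke with one corner.
straight = [(0, 0), (10, 0), (20, 0), (30, 0)]
corner = [(0, 0), (50, 0), (100, 0), (100, 50), (100, 100)]
```

For the `corner` example the curve turns through a right angle around t = 0.5, where its radius of curvature is about 47; with a flatness threshold of 60 (chosen for this toy scale) the search reports the sample at t = 0.5, and the stroke would be split there as in Step 4.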

3. EXPERIMENTAL RESULTS

It is thinning that sometimes makes the stroke extraction unreliable, and the algorithm discussed in Part 2 can usually correct most of the distortions caused by thinning. We also find that the thicker the strokes in a character are, the higher the possibility of distortion after thinning. So if the strokes of a character are very thin compared with the character, we can extract strokes more stably and correctly. Most strokes of a machine printed character are much thicker than those of the same character written with a ball pen. Thus thinning a handwritten character generates less distortion, and better results can be reached. So in our algorithm stroke extraction for machine printed characters is not obviously easier than that for handwritten characters. We take two machine printed characters as examples (Fig. 11(a) and Fig. 14(a)), and the results of the stroke extraction are shown in Figs 11-16. From the experimental results, we can easily see that some first stage strokes, with one or more inflection points, are divided into several flat line segments after they are processed by the four steps described in Part 4, and there are no obvious inflection points in the second stage strokes.

Fig. 11. (a) A Chinese character. (b) Result after thinning. (c) Erase the pixels near every fork point. (d) Reconnect each stroke to the fork point. (e) Use Bernstein-Bezier curve to represent (a).

Fig. 12. The stroke extraction of the Chinese character (shown in Fig. 11) in the first stage.

Fig. 13. The stroke extraction of the Chinese character (shown in Fig. 11) in the second stage.

Fig. 14. (a) A Chinese character. (b) Result after thinning. (c) Erase the pixels near every fork point. (d) Reconnect each stroke to the fork point. (e) Use Bernstein-Bezier curve to represent (a).

Fig. 15. The stroke extraction of the Chinese character (shown in Fig. 14) in the first stage.

Fig. 16. The stroke extraction of the Chinese character (shown in Fig. 14) in the second stage.

The locations of the control points of a Bernstein-Bezier curve represent the shape of the original stroke approximately, so we can store the control points of every 1st-stage stroke and use them as a feature of a character.

In Step 3 of Part 2, $d_{f_1,f_2}$, $R_{f_1}$ and $R_{f_2}$ are used to check whether $f_1$ and $f_2$ are connected. We propose another way to check the connection between any two fork points:

Step 1--let m be the midpoint of $f_1$ and $f_2$, that is, $m = (f_1 + f_2)/2$.

Step 2--find the largest circle C centered at m within the original stroke S.

Step 3--if both $f_1$ and $f_2$ are within C then $f_1$ and $f_2$ are connected.

An example is depicted in Fig. 17. This method may be more accurate, but it consumes more computational time because for every pair of fork points we must figure out the largest circle centered at their midpoint. Thus we did not use this method in our experiment.

Fig. 17. (a) m is the mid-point of f1 and f2. Because both f1 and f2 are in the largest circle centered at m, f1 and f2 should be connected. (b) m is the mid-point of f1 and f2. Because neither f1 nor f2 is in the largest circle centered at m, f1 and f2 are not connected.

There are three thresholds in this stroke extraction algorithm, and the way in which these thresholds are defined is discussed in the following.


The first one is the ratio of the number of points in the original curve to the number of control points used to represent this curve. There is an example of this ratio in Step 2.1 of Part 3, and we obtained this value by experience. The second and third are $t_a$ and $t_r$, defined in Step 3 of Part 4. $t_a$ and $t_r$ should be updated when the style of the input character changes; that is, a learning process is necessary. For example, when we input another person's handwriting that we have not dealt with before, we should first input several inflection points and straight line patterns of this person's handwriting to estimate $t_a$ and $t_r$ correctly.

4. SUMMARY

The stroke is a very important feature of every character, so extracting a character's strokes correctly is very helpful to the recognition of the character. In this paper, two main techniques are developed. The first is the maximum circle technique, which is used to merge the split fork points and delete the spurious strokes resulting from thinning. The second is the curve fitting method, which is used to smooth the zigzag curves near the fork points, so that the curvatures at points near the fork points can be computed easily. From these curvatures the strokes in the input character can be found. Both techniques proved to be reliable in our experiments.

REFERENCES

1. S. Mori, K. Yamamoto and M. Yasuda, Research on machine recognition of handprinted characters, IEEE Trans. Pattern Anal. Mach. Intell. PAMI 6, 386-405 (1984).
2. M. Yoshida and M. Eden, Handwritten Chinese character recognition by A-b-S method, Proc. 1st Int. Joint Conf. Pattern Recognition, pp. 197-204 (1973).
3. K. Kobayashi, F. Yada, K. Banno, K. Yamamoto and H. Nambu, Recognition of handprinted Chinese characters by stroke matching method, Trans. IECE Japan PRL, 81-33 (1981).
4. D. Faux and M. J. Pratt, Computational Geometry for Design and Manufacture. Ellis Horwood, New York (1979).
5. W. W. Stallings, Recognition of printed Chinese characters by automatic pattern analysis, Comp. Graph. Image Process. 1, 44-65 (1972).
6. N. Okabe, N. Yoshimura, Y. Miyake and M. Ichikawa, A feature extraction method using extended distance function and linear filter for handprinted characters, J. IECE Japan J59-D, 858-865 (1976).
7. N. Babaguchi, Y. Kitamura, M. Shiono, H. Sanada and Y. Tezuka, A method of direction segment extraction from character pattern without thinning process, J. IECE Japan J65-D, 874-881 (1982).
8. S. Mori and T. Sakakura, Line filtering method and its application to stroke segmentation of Kanji characters, Trans. IECE Japan PRL, 83-12 (1983).
9. F. H. Cheng and W. H. Hsu, Three stroke extraction methods for recognition of handwritten Chinese characters, Proc. Int. Conf. Chinese Computing, Singapore, pp. 191-195 (1986).
10. Y. S. Chen and W. H. Hsu, An interpretive method of line continuation in human visual perception, Pattern Recognition 22, 617-637 (1989).
11. T. Y. Zhang and C. Y. Suen, A fast parallel algorithm for thinning digital patterns, Communs. ACM 27, 236-239 (1984).
12. Y. S. Chen and W. H. Hsu, A new parallel thinning algorithm for binary image, Proc. National Computer Symp., Taiwan, R.O.C., pp. 295-299 (1985).
13. Y. K. Chu and C. Y. Suen, An alternate smoothing and stripping algorithm for thinning digital binary patterns, Signal Process. 11, 207-222 (1986).

About the Author--CHIA-WEI LIAO was born in Taipei, Taiwan, Republic of China, on 31 July 1963. He received his B.S. and M.S. degrees in computer science from the National Chiao Tung University, Taiwan, in 1985 and the National Tsing Hua University, Taiwan, in 1987, respectively. He is now an assistant research fellow in the Institute of Computer Science, Academia Sinica, Nankang, Taipei, Taiwan, R.O.C. His current research interests include computer vision, graphics, image processing and pattern recognition.

About the Author--JUN S. HUANG was born in Taiwan in 1945, and received a B.S. degree in Mathematics from the National Taiwan University in 1969. After one year of graduate study in Computers at the Institute of Electronics, National Chiao-Tung University, he went to the University of Florida and received a Ph.D. degree in Statistics in 1977. Then he joined the Department of Applied Mathematics, National Chiao-Tung University, as an Associate Professor, and later in 1981 he joined the Institute of Information Science, Academia Sinica, to develop the pattern recognition project. Jun S. Huang has published a number of papers in the fields of probability, statistics, image processing, pattern recognition and computer vision. He is now a Research Fellow in the Institute of Information Science, Academia Sinica, and is the head of the Computer Vision Laboratory.