Pattern Recognition, Vol. 29, No. 1, pp. 97-119, 1996. Copyright © 1996 Pattern Recognition Society. Published by Elsevier Science Ltd. Printed in Great Britain. All rights reserved. 0031-3203/96 $15.00+.00
A NOVEL PARALLEL APPROACH TO CHARACTER RECOGNITION AND ITS VLSI IMPLEMENTATION

H. D. CHENG† and D. C. XIA‡

Department of Computer Science, Utah State University, Logan, UT 84322, U.S.A.

† Author to whom correspondence should be addressed.
‡ Now with Ameritech Library Services, Provo, UT 84604-5630, U.S.A.
(Received 21 June 1994; in revised form 5 April 1995; accepted for publication 18 April 1995)

Abstract--Character recognition has become more important over the years in areas of document processing, language translation, electronic publication, office automation, etc. Good feature extraction is critical to the success of character classification, especially for very large character sets. In this paper, a parallel VH2D (Vertical-Horizontal-2-Diagonal) method and its VLSI implementation are proposed. In the VH2D method, projections of character images are made from four directions: vertical, horizontal and the two diagonals (45° and 135°), which produce four subfeature vectors for each character. The four subfeature vectors are transformed according to the central point of the character before they are combined into a complete feature vector for the given character. In this research, for the experiment, the character dictionary consists of 3000 feature vectors of the character set. The experimental results indicate that all the input characters in the dictionary are correctly classified and all the characters outside the dictionary are rejected. The proposed approach contains extensive pipelining and parallelism. The time complexity of the proposed algorithm is O(N), instead of O(N²) when a uniprocessor is used, where N is the dimension of the digitized image of the input character. A study of a simple VLSI architecture composed of four linear arrays of processing elements (PEs) for the proposed VH2D approach is also presented.

VH2D projection; Feature extraction; Pattern recognition; VLSI algorithm and architecture; Pipelining; Parallelism
1. INTRODUCTION

Character recognition has many applications, such as document processing, language translation, electronic publication, office automation, man-machine communication, information retrieval, etc. Therefore, character recognition is one of the important and active research areas related to these topics. (1-3) Tang et al. presented an approach known as the concentration-contour approach, where a diagonal-diagonal regional projection transformation (DDRPT) was used to convert a compound pattern into an integral object. (4) A VLSI architecture has been designed to implement that approach with a time complexity of O(N), versus O(N²) when a uniprocessor is used. Akiyama and Hagita (5) proposed a system for automatically reading either Japanese or English documents that have complex layout structures including graphics. They attempted to define intrinsic features for document layout structure recognition by which document components in images can be extracted. It also described multi-font character recognition based on feature vector matching. Lua and Gan proposed a word-oriented recognizer using the Interactive Activation and Competition model (IAC model) to recognize Chinese characters. (6) The method is based on the evidence that Chinese characters do not occur in isolation but are constituents of words. So, high-level linguistic knowledge about word formation, the varying frequency of usage of words and the strong word-context effects in resolving ambiguities in the recognition of Chinese characters are extensively utilized in the optical character recognition (OCR) system. It has the advantages of noise tolerance and self-learning capability. However, it does not emphasize the discriminative power of the features used, as traditional character-oriented approaches do. Chen and Lieh proposed a method in which strokes are taken as features and the relations between them are also used to help recognition; random graphs and relaxation techniques are used to recognize handwritten characters. (7) Liao and Huang introduced a transformation-invariant matching method that can extract radicals from an input Chinese character. (8) The matching algorithm has some merits: it is invariant to transformations (scaling, rotation and translation) and invulnerable to the inherent defects of thinning algorithms. Cheng et al. proposed a set of algorithms dealing with parallel image transformations. (9) These algorithms perform the mapping and filling at the same time while respecting the connectivity of the original image, resulting in more consistent and accurate transformations. The implementation using a VLSI architecture makes the time complexity only O(N), compared with O(N²) using a uniprocessor. Taxt et al. proposed statistical
methods for recognizing isolated handwritten or printed symbols directly from raster images of documents such as technical drawings and maps. (10) By using these methods, the information loss and increased computation time caused by the thinning and vectorization of the symbol candidates are reduced. A nonlinear normalization method was proposed by Yamada et al., (11) where a resampling is carried out so as to equate the product of a local line density and a variable sampling pitch. The method was applied to handprinted Kanji character recognition. Tavakoli discussed three primary processes utilized in most pattern recognition systems. (12) He proposed a unified approach to character recognition in which the generation of the known base and the identification process are independent of the known base, thus allowing for better recognition. Also, three major approaches for the three processes: statistical, syntactical and neural network approaches, along with a concept of fuzziness and learning, are described and evaluated. Jaravine presented the usage of a Syntactic Neural Network (SNN) in a written-text reading system. (13) It demonstrated that hybrid models can integrate three basic properties: (1) the basic principles from syntactic models, (2) parallel processing on networks with complex neural units and (3) the possibility of efficient incremental learning and recognition of patterns. Baker and McCartor compared the classification performance of two popular ANN (Artificial Neural Network) algorithms: Back Propagation and Learning Vector Quantization. (14) The difference in recognition accuracy is not significant enough to make a convincing argument that one algorithm is better than the other. Houle and Eom presented a system for recognizing degraded machine-printed characters. (15) The system relies on a priori knowledge primitives. A set of "clean" characters from multiple fonts was used to emphasize that a set of clean characters can be used to build an inference engine to recognize noisy characters. Cheng et al. proposed a Modified Hough Transform (MHT) method for the recognition of handwritten Chinese characters. (16) The authors pointed out that feature extraction is the most important part of pattern recognition because features are the main key to recognizing an unknown object. Features extracted for automatic character recognition can be either global or local. The method helps to overcome the problems of noise sensitivity in the local-feature approach and of time consumption in the global-feature approach. Kahan et al. described a system that recognizes printed text of various fonts and sizes for the Roman alphabet. (17) The system combines several techniques in order to improve the overall recognition rate. Lee developed a fuzzy tree classifier which is used in the recognition of noisy Hangul characters. (18) The tree classifier can make a global decision much more quickly than a single-stage classifier. Pavlidis et al. proposed a method for the computation of geometrical features, such as strokes, directly from the gray-scale image. (19) It is used for the recognition of poorly printed
text. Another related work by the same group proposed a method for the detection of curved and straight segments from gray-scale topography; it attempts to overcome the ambiguity problems resulting from polygonal approximations. (20) Shan (31) proposed a raster-scan-based segmentation algorithm and a chain-code-encoding-based graph traversal algorithm for feature extraction in an OCR system for printed text recognition. A parallel RISC VLSI architecture was also implemented.

However, many of the current approaches dealing with a large-set pattern classification problem have the following major drawbacks:

(a) The physical center (or central point) of an object has to be found before the object can be projected.
(b) Practically, to centralize the object is to find its central coordinates and then shift it to the center of the image. Some algorithms using an N × N processing-element array can perform some types of projection operations in O(N), but they assume that the image is already centralized when it is input, which is not true in most cases. The centralization process needs O(N²) time using an ordinary algorithm.
(c) For an N × N pattern, a VLSI array of N × N PEs is used for the projections.
(d) In order to carry out the projections, an N × N PE array is needed to store the image of the object.

Our research work will focus on Chinese character recognition, which is more challenging than some other types of character recognition. First, it has the problem of a large set of characters: finding good features which can effectively classify each pattern in the set is very difficult. Second, Chinese characters have a more complex structure than alphabetic characters, and there are a large number of mutually similar character groups which are hard to classify without the aid of well-selected features. Figure 1 gives some examples. The issue becomes more complicated if characters appearing in different sizes, fonts and orientations are considered. Third, since the character set is large, high recognition speed is demanded for real-time processing applications. Therefore, exploring a more robust and efficient approach which combines high parallelism and VLSI techniques becomes essential to character recognition tasks. Although our emphasis is on Chinese character recognition, our proposed algorithm and architecture can also be used for the character recognition of other languages.

In this work, we have proposed a novel VH2D (Vertical-Horizontal-2-Diagonal) approach in which separate projections are made over each character from four directions: vertical, horizontal and two diagonals (Fig. 2). For a randomly projected image of a character, the central point of the character is calculated and used to adjust the four subfeature vectors obtained from the four projections. The four adjusted subfeature vectors are combined into a complete feature vector of that particular character. All feature vectors of the data set are used to build a data dictionary which serves as the
Fig. 1. Some pairs (or groups) of Chinese characters which have similar structures (the English glosses of the examples include: person, in; big, husband, sky, measure; field; eight; from, first; land, soldier; state, king, host, live, go; no, end).
Fig. 2. VH2D projection and its corresponding subfeatures.
base for character recognition. During the process of character recognition, the character in the dictionary whose feature vector is closest to that of the unknown input character will be selected as the candidate. The VH2D approach will be discussed in the next section. Section 3 studies the VLSI architecture for the proposed algorithm. Section 4 verifies the timing and time complexity of the VLSI implementation. The experiments and results are described in Section 5. Finally, conclusions are given in Section 6.

2. THE VH2D APPROACH
In this section, we will propose a novel parallel VH2D approach to solve the problem of pattern recognition with a large set of characters. This approach
consists of four tasks: (1) feature extraction; (2) feature adjustment; (3) dictionary building; (4) pattern matching.

First, finding good features of patterns is the most crucial step in pattern recognition problems, especially in the classification of a large pattern set. The VH2D approach extracts feature vectors from the projections of each character's binary image in four directions: the vertical, horizontal and two diagonal (45° and 135°) directions, respectively. Secondly, since a randomly scanned character image does not always have the character located in the center of its image, feature adjustment is essential: it ensures that every imaged character is effectively located in the center of its image, reducing the possibility of misclassifying patterns due to differences between images of uncentralized patterns. The four subfeature vectors from the four directions are adjusted according to the character's central point and then combined to form a complete feature vector for each individual character. Thirdly, when all the feature vectors of the character set have been obtained, a dictionary of feature vectors is built to be used as the base for pattern matching. Finally, when an unknown character is input, its feature vector is obtained in the same way as in the second task. Then the unknown character's feature vector is matched against the feature vectors of the characters in the dictionary. The unknown character is recognized as the candidate with the minimum difference between the feature vectors, provided that the minimum difference is within a reasonable threshold. Otherwise, the unknown character is rejected, which means it is not a character in the dictionary.

2.1. Feature extraction
Finding good features which can effectively classify a large set of characters is critical to the success of solving such problems. Many feature extraction approaches have been studied and successfully applied to character recognition. (1, 32-34) Although, given the large size of the Chinese character set, the structures of characters can vary greatly, there are still many pairs (or groups) of characters which are very difficult to distinguish because they have such similar structures. Figure 1 gives some examples of such characters. From Fig. 1, we can see that these pairs (or groups) of characters look so similar that image projections from one direction cannot reflect the characteristics (or features) of their structures well. For example, the vertical projections of the characters "from" and "first", or "no" and "end", are almost the same. Similarly, the horizontal projections of the characters "person" and "in" are close to each other. Therefore, if we only use projections from one direction, a high misclassification rate is likely. Based on this understanding, we developed the VH2D feature extraction method. In our experiments, all of the examples in Fig. 1 can be correctly classified. The details will be discussed later.

Figure 2 gives an example of the VH2D projections and their corresponding subfeature vectors for the Chinese character "big". Figure 3 is a simplified architecture. The details are discussed below. For clarity of description, we introduce the following definitions.

Definition 2.1. A binary image of a character is denoted as follows:

I_{ij} = \begin{cases} 1 & \text{if pixel } (i, j) \text{ is black} \\ 0 & \text{otherwise,} \end{cases} \qquad (1)

where i, j = 1, 2, ..., N (N is the size of the image).

Definition 2.2. If the image size of a character is N × N, then the central point of the image I_ij is I_o([N/2], [N/2]). In the rest of the paper, we assume N is even.
Fig. 3. Four linear arrays of PEs for the VH2D approach.
Definition 2.3. For a character C which has a binary image I_ij, its central point is denoted as I_c(x, y), where:

x = \mathrm{int}\Big(\sum_{k=1}^{N}\sum_{j=1}^{N} I_{kj}\, j \Big/ \sum_{k=1}^{N}\sum_{j=1}^{N} I_{kj}\Big) \qquad (2)

y = \mathrm{int}\Big(\sum_{k=1}^{N}\sum_{i=1}^{N} I_{ik}\, i \Big/ \sum_{k=1}^{N}\sum_{i=1}^{N} I_{ik}\Big) \qquad (3)

and i, j, k = 1, 2, ..., N (N is the size of the image).

Definition 2.4. To move a character C into the center of its image is to move the character to the position of the image in which I_c overlaps I_o.

Definition 2.5. For an image of a character, let I_o = (N/2, N/2) and I_c = (x, y), and let Δx = x − N/2 and Δy = y − N/2. To have the character located in the center of its image is to transform the image according to the following relations:

If Δx > 0, the image shifts left; if Δx < 0, the image shifts right; if Δx = 0, the image does not shift. \qquad (4)

If Δy > 0, the image shifts up; if Δy < 0, the image shifts down; if Δy = 0, the image does not shift. \qquad (5)

The distances of shifting are |Δx| and |Δy|, respectively.

According to Definitions 2.1-2.5, each input character can be located in the center of its N × N image: first find I_c, then obtain Δx and Δy, and finally transform the image according to Δx and Δy. As mentioned, this is the traditional way to solve the image centralization problem. Image projections are made by adding the pixel values of the image I_ij along certain directions, such as the vertical, horizontal or diagonal directions. As the distributions of pixels differ, the image projections change significantly, even for the same character. Therefore, locating characters at the same positions in their images is very important for generating the image projections from which the feature vectors of the characters are extracted. Using a sequential algorithm, this process is of O(N²) time complexity. However, using a parallel algorithm and a VLSI parallel processing architecture, the time complexity can be reduced to O(N), where N is the size of the image. The details will be discussed in Sections 3 and 4.

Definition 2.6. For a binary image I_ij of a character C (i, j = 1, ..., N), the image projection in the vertical direction (v) is denoted as P_v = [P_{v1}, ..., P_{vj}, ..., P_{vN}], where:

P_{vj} = \sum_{i=1}^{N} I_{ij}. \qquad (6)

Definition 2.7. For a binary image I_ij of a character C (i, j = 1, ..., N), the image projection in the horizontal direction (h) is denoted as P_h = [P_{h1}, ..., P_{hi}, ..., P_{hN}], where:

P_{hi} = \sum_{j=1}^{N} I_{ij}. \qquad (7)

Definition 2.8. For a binary image I_ij of a character C (i, j = 1, ..., N), the image projection in the 45° diagonal direction (d1) is denoted as P_{d1} = [P_{d1,1}, ..., P_{d1,i}, ..., P_{d1,2N−1}], where:

P_{d1,i} = \begin{cases} \sum_{k=1}^{i} I_{lk}, \; l = k + N − i, & \text{if } 1 \le i \le N \\ \sum_{l=1}^{2N−i} I_{lk}, \; k = l + i − N, & \text{if } N + 1 \le i \le 2N − 1. \end{cases} \qquad (8)

Definition 2.9. For a binary image I_ij of a character C (i, j = 1, ..., N), the image projection in the 135° diagonal direction (d2) is denoted as P_{d2} = [P_{d2,1}, ..., P_{d2,i}, ..., P_{d2,2N−1}], where:

P_{d2,i} = \begin{cases} \sum_{l=1}^{i} I_{lk}, \; k = i − l + 1, & \text{if } 1 \le i \le N \\ \sum_{l=i−N+1}^{N} I_{lk}, \; k = i − l + 1, & \text{if } N + 1 \le i \le 2N − 1. \end{cases} \qquad (9)
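To make Definitions 2.3-2.9 concrete, here is a minimal Python/NumPy sketch of the central point, the traditional centralization shift, and the four VH2D projections, following the equations above directly. The function names, the 0-based array indexing and the use of NumPy are our illustration, not part of the original design; a non-empty binary image is assumed.

```python
import numpy as np

def central_point(img):
    # Central point I_c(x, y), equations (2) and (3); x is the column
    # coordinate and y the row coordinate (both 1-based, as in the paper).
    total = int(img.sum())
    cols = np.arange(1, img.shape[1] + 1)        # j = 1..N
    rows = np.arange(1, img.shape[0] + 1)        # i = 1..N
    x = int((img.sum(axis=0) * cols).sum()) // total
    y = int((img.sum(axis=1) * rows).sum()) // total
    return x, y

def centralize(img):
    # The traditional O(N^2) centralization of Definitions 2.4-2.5 and
    # equation (12): shift the image by (Delta_x, Delta_y), zero-filling.
    n = img.shape[0]
    x, y = central_point(img)
    dx, dy = x - n // 2, y - n // 2              # equations (4) and (5)
    out = np.zeros_like(img)
    for i in range(n):
        for j in range(n):
            si, sj = i + dy, j + dx              # I'_ij = I_{i+dy, j+dx}
            if 0 <= si < n and 0 <= sj < n:
                out[i, j] = img[si, sj]
    return out

def vh2d_projections(img):
    # The four projections of equations (6)-(9). For the diagonals, bin i
    # of P_d1 collects pixels with l - k = N - i, and bin i of P_d2 those
    # with l + k = i + 1 (1-based), matching the printed index conditions.
    n = img.shape[0]
    p_v = img.sum(axis=0)                        # P_vj = sum_i I_ij
    p_h = img.sum(axis=1)                        # P_hi = sum_j I_ij
    p_d1 = np.array([img.trace(offset=i - n) for i in range(1, 2 * n)])
    p_d2 = np.array([np.fliplr(img).trace(offset=n - i)
                     for i in range(1, 2 * n)])
    return p_v, p_h, p_d1, p_d2
```

All four vectors together have 6N − 2 components, which is the feature-vector length used throughout the paper.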
Image projections in the four directions (v, h, d1 and d2) can be used as four subfeature vectors of a character. We now define the subfeature vectors of a character.

Definition 2.10. For an image of a character I_ij, its four image projections (P_v, P_h, P_d1 and P_d2) in equations (6)-(9) are defined as the four subfeature vectors of the character, F_x, F_y, F_45° and F_135°, where:

F_x = P_v,  F_y = P_h,  F_45° = P_d1,  F_135° = P_d2. \qquad (10)

The four subfeature vectors F_x, F_y, F_45° and F_135° are combined into the feature vector of the character.

Definition 2.11. For an image of a character I_ij, its feature vector is defined as:

F = (F_x, F_y, F_45°, F_135°). \qquad (11)

By using the above definitions and formulas, we can extract the feature vector F for a given character C with the following algorithm.
Algorithm 2.1.
Step 1. Find the central point I_c(x, y) of the character C.
Step 2. Transform the image I_ij according to Δx and Δy defined in equations (4) and (5) to obtain the centralized image I′_ij, where:

I′_{ij} = \begin{cases} 1 & \text{if } I_{i+Δy,\, j+Δx} = 1 \\ 0 & \text{otherwise.} \end{cases} \qquad (12)

Step 3. Find the four subfeature vectors of the centralized image I′_ij: F′_x, F′_y, F′_45° and F′_135°.
Step 4. Combine the four subfeature vectors into a complete feature vector F′.

2.2. Feature adjustment

Algorithm 2.1 is a sequential algorithm in which none of the first three steps can start before its previous step has finished. In other words, image centralization has to be carried out before the feature extraction of a character can start. By using pipelining and a parallel VLSI architecture (see Sections 3 and 4), both Step 1 and Step 3 are of O(N) time complexity, but Step 2 is still O(N²). If we can eliminate Step 2, the total time complexity of Algorithm 2.1 becomes O(N) rather than O(N²). To eliminate Step 2, we need to find a relationship among the character's central point and the image projections before and after image centralization.
Theorem 2.1. The VH2D projections of a centralized image I′_ij can be obtained by adjusting the VH2D projections of its uncentralized image I_ij according to the central point of the character, I_c(x, y).

Proof. Assume that the central point of the character C is I_c(x, y); let (P_v, P_h, P_d1, P_d2) be the four projections of the uncentralized image I_ij, and (P′_v, P′_h, P′_d1, P′_d2) the four projections of the centralized image I′_ij. From the way I_c(x, y) is calculated [equations (2) and (3)] and the way the image is centralized [equations (4), (5) and (12)], the numbers of "1"s in I_ij and I′_ij are not changed. Therefore, the values of the elements in P_v should be the same as those in P′_v, and likewise for P_h and P′_h, P_d1 and P′_d1, and P_d2 and P′_d2. The only difference is that the positions of the corresponding elements in (P_v, P_h, P_d1, P_d2) and (P′_v, P′_h, P′_d1, P′_d2) differ. Therefore, we can adjust the positions of the elements in (P_v, P_h, P_d1, P_d2) to make them equal to those in (P′_v, P′_h, P′_d1, P′_d2). Obviously, the adjustments of the projections in the vertical and horizontal directions depend on Δx and Δy, respectively. Hence, we have:
Fig. 4. Feature vector adjustment for 45° based on the character's central point P(x, y).
(1) For the vertical direction:

P′_{vj} = \begin{cases} P_{v,j+Δx} & \text{if } 1 \le j + Δx \le N \\ 0 & \text{otherwise.} \end{cases} \qquad (13)

(2) For the horizontal direction, for the same reason as in (1):

P′_{hi} = \begin{cases} P_{h,i+Δy} & \text{if } 1 \le i + Δy \le N \\ 0 & \text{otherwise.} \end{cases} \qquad (14)

The adjustments of the projections in the 45° and 135° diagonal directions are based on the following proofs.

(3) For the 45° diagonal direction. We use a geometric method to prove this. In Fig. 4, the central point of the character is P(x, y) and the center of the image is I_o. The adjustment is carried out according to the effect that P(x, y) will presumably be moved to I_o. The projection through P(x, y) is f_P, and the projection through the image's central point is f_N.

(a) If P(x, y) is within ΔABD, we denote the distance in position between f_P and f_N as |f_P − f_N| and the distance of UD as |UD|, since the image pixels at points U and D will be projected through P(x, y) and I_o, respectively, producing f_P and f_N. Here, we consider the distance as a count of processing elements (PEs). Therefore:

|f_P − f_N| = |UD| = N − (|ZW| + |WU|) = N − (x + |WP|) = N − (x + y). \qquad (15)

(b) If P(x, y) is within ΔBCD, then:

|f_P − f_N| = |VD| = |TD| − |TV| = y − |TP| = y − (N − x) = (x + y) − N. \qquad (16)

We can combine equations (15) and (16) into (17) as:

|f_P − f_N| = |N − (x + y)|. \qquad (17)

Let Δ45° = N − (x + y); we can obtain:

P′_{d1,i} = \begin{cases} P_{d1,i+Δ45°} & \text{if } 1 \le i + Δ45° \le 2N − 1 \\ 0 & \text{otherwise.} \end{cases} \qquad (18)
(4) For the 135° diagonal direction. We use a similar method to prove this. In Fig. 5, the central point of the character is P(x, y), I_o is the center of the image, the projection through P(x, y) is f_P, and the projection through the image's central point is f_N. For the same reason, we have:

(a) If P(x, y) is within ΔABC, we denote the distance in position between f_P and f_N as |f_P − f_N| and the distance of AV as |AV|; then:

|f_P − f_N| = |AV| = |AT| − |VT| = y − |PT| = y − x. \qquad (19)

(b) If P(x, y) is within ΔADC, then:

|f_P − f_N| = |AU| = |AW| − |UW| = x − |PW| = x − y. \qquad (20)

We can combine equations (19) and (20) into (21) as:

|f_P − f_N| = |x − y|. \qquad (21)

Let Δ135° = x − y; we can obtain:

P′_{d2,i} = \begin{cases} P_{d2,i+Δ135°} & \text{if } 1 \le i + Δ135° \le 2N − 1 \\ 0 & \text{otherwise.} \end{cases} \qquad (22)

According to equations (13), (14), (18) and (22), the proof of Theorem 2.1 is completed. □

Fig. 5. Feature vector adjustment for 135° based on the character's central point P(x, y).

Corollary 2.1. The four VH2D subfeature vectors of a centralized image I′_ij can be obtained by adjusting the four VH2D subfeature vectors of its uncentralized image I_ij according to the central point of the character, I_c(x, y).

Proof. From Definition 2.10 and Theorem 2.1, Corollary 2.1 is easily proved. □

Corollary 2.2. The VH2D feature vector of a centralized image I′_ij can be obtained by combining the four VH2D subfeature vectors of its uncentralized image I_ij after they have been adjusted according to the central point of the character, I_c(x, y).

Proof. From Definition 2.11 and Corollary 2.1, Corollary 2.2 is easily proved. □

By Corollary 2.2, we can now eliminate Step 2 in Algorithm 2.1 and propose the following parallel algorithm to extract the feature vector of a character C.

Algorithm 2.2.
Step 1. Find the central point I_c(x, y) of the character C and calculate Δx, Δy, Δ45° and Δ135°.
Step 2. Find the four subfeature vectors of the uncentralized image I_ij: F_x, F_y, F_45° and F_135°.
Step 3. Adjust F_x, F_y, F_45° and F_135° into F′_x, F′_y, F′_45° and F′_135° according to Δx, Δy, Δ45° and Δ135°, respectively.
Step 4. Combine the four adjusted subfeature vectors F′_x, F′_y, F′_45° and F′_135° into a complete feature vector F′.

Compared with Algorithm 2.1, Algorithm 2.2 has three advantages:
(a) The step of image transformation (Step 2 in Algorithm 2.1) is eliminated.
(b) Step 1 and Step 2 can be processed in parallel.
(c) The time complexity of Algorithm 2.2 is O(N) when a parallel algorithm is used.

2.3. Building dictionary

The dictionary is composed of all the feature vectors of the character set. If the size of the character set is M, then the dictionary is simply a linear array D[M]. Each element of the array, D[i] (i = 1, ..., M), is basically a tuple which represents a character:

D[i] = (D[i].I, D[i].C, D[i].F). \qquad (23)

Here, D[i].I is the index of D[i], D[i].C is the code of D[i], and D[i].F is the feature vector of D[i]. D[i].I is obtained by summing all the vector elements of the four subfeature vectors of D[i]. It is represented in the following way:

D[i].I = \sum_{j=1}^{N} D[i].F_{xj} + \sum_{j=1}^{N} D[i].F_{yj} + \sum_{j=1}^{2N−1} D[i].F_{45°j} + \sum_{j=1}^{2N−1} D[i].F_{135°j}, \qquad (24)

where D[i].F_{xj}, D[i].F_{yj}, D[i].F_{45°j} and D[i].F_{135°j} are the jth vector elements of D[i].F_x, D[i].F_y, D[i].F_45° and D[i].F_135°, the four subfeature vectors of D[i], and N is the image size. By using the combined feature vector F, the above D[i].I can be represented in a shorter form:

D[i].I = \sum_{j=1}^{6N−2} D[i].F_j, \qquad (25)

where D[i].F_j is the jth vector element of D[i].F. D[i].C is a pre-assigned integer code which represents the ith character in the dictionary. D[i].F is obtained by using Algorithm 2.2.
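As an illustration of Algorithm 2.2 and equation (25), the sketch below adjusts the projections of the uncentralized image by the printed offsets Δx, Δy, Δ45° and Δ135° and sums the combined vector into a dictionary index. It reuses `central_point` and `vh2d_projections` from the earlier sketch; the helper `shifted` and all names are illustrative assumptions, not the authors' code.

```python
import numpy as np

def shifted(p, delta):
    # Position adjustment of a projection vector, equations (13)-(22):
    # p'_i = p_{i+delta} where the source index stays in range, else 0.
    out = np.zeros_like(p)
    for i in range(len(p)):
        if 0 <= i + delta < len(p):
            out[i] = p[i + delta]
    return out

def extract_feature(img):
    # Algorithm 2.2: project the *uncentralized* image, then adjust each
    # subfeature vector; no O(N^2) image transformation is performed.
    n = img.shape[0]
    x, y = central_point(img)                     # Step 1
    p_v, p_h, p_d1, p_d2 = vh2d_projections(img)  # Step 2
    f_x = shifted(p_v, x - n // 2)                # Delta_x,   eq. (13)
    f_y = shifted(p_h, y - n // 2)                # Delta_y,   eq. (14)
    f_45 = shifted(p_d1, n - (x + y))             # Delta_45,  eq. (18)
    f_135 = shifted(p_d2, x - y)                  # Delta_135, eq. (22)
    return np.concatenate([f_x, f_y, f_45, f_135])  # Step 4: F', 6N - 2 long

def index_of(feature):
    # Dictionary index D[i].I of equation (25): sum of all 6N - 2 elements.
    return int(feature.sum())
```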
The dictionary is organized so that it is sorted in increasing order of index value; entries with similar indices are located consecutively. The time complexity of equation (25) is O(N), so the time complexity of obtaining the indices for the character set is O(M × N). Sorting the dictionary could be O(N²), but these two steps are carried out only once. Therefore, the time complexity of building the dictionary is basically O(M × N + N²); when M ≫ N, this becomes O(M × N).

Due to the discrete nature of the digitized image, differences can be generated between two different scans of the same image. Therefore, in order to build the dictionary on a statistical basis, we use a set of image instances of a character C to find their individual feature vectors, and then use the mean of the feature vectors of these instances as the feature of the character C in the dictionary. In this way, we expect the dictionary to be more representative and the reliability of classification to be improved significantly.

There are two ways to obtain image instances for each character in the dictionary. First, for each character, we can obtain image instances (or samples) from a number of different scans, which means each image generated by one scan is an image sample. This is the natural way to obtain image samples; however, it takes a lot of work to obtain them manually. Due to our limited research time and our emphasis on verifying the effectiveness of the proposed approach, we take the following alternative method to generate these image samples. For each character in the dictionary, we scan it once and obtain its original image. Then we use the rotation algorithm (9) to generate all the image samples for each original image. For a given image I and rotation angle θ (0 < θ < 2π), I is first rotated by θ into I(θ), then I(θ) is rotated backwards by −θ into I(θ, −θ), which is an image sample of the original image I. Due to the discrete nature of the digitized images of the characters, the feature vectors before and after the rotations (θ and −θ) are different; therefore, they can be considered as different inputs. With a set of different values of θ, we can obtain a series of image samples for the original image. As described in reference (9), the transformation algorithms can perform the mapping and filling at the same time while preserving the connectivities of the original image. They are also VLSI algorithms and can be implemented in parallel by using VLSI architectures.

2.4. Pattern matching

When the dictionary of feature vectors has been built, we are able to perform pattern matching, i.e. to classify an unknown input character. For a test sample, its feature vector is obtained by Algorithm 2.2 and its index is calculated using equation (25). Then we find the position of the element whose index is nearest to the index of the test sample in the dictionary array, which is sorted according to the indices of the characters in the dictionary set. We then match the feature vector of the test sample with those of the elements within a certain boundary around that position. We call this process local pattern matching. The reason we can carry out local pattern matching instead of global pattern matching (i.e. searching through the whole dictionary array) is the fact that characters' indices reflect the structures and stroke numbers of characters to a certain degree, which means two samples of the same character are much more likely to have indices with relatively close values. Therefore, we can search the dictionary within a range of positions whose indices are relatively close to that of the test sample or unknown character. The proper parameters of such a range can be obtained through experiments.

In the pattern matching process, we find the element which has the minimum difference value (MDV) of feature vectors between the test sample and all the characters within the corresponding boundary. The MDV is then checked to see if it is within a properly selected threshold (T). If it is not greater than T, the character with the MDV is selected as the candidate; otherwise, the test sample is rejected.

For the purpose of testing the proposed algorithm, every character in the dictionary and in the test sample set is assigned a unique character code; the same characters in the dictionary and in the test sample set have the same character codes. After the candidate has been selected by the pattern matching algorithm, the character code of the candidate is compared with the code of the test sample. If the two codes are the same, the test sample is correctly recognized; otherwise, it is wrongly recognized. If the test sample is rejected and its character code is found not to belong to the codes of the dictionary set, then it is correctly rejected; otherwise, the rejection is wrong. The accuracies of sample recognition and rejection are calculated as the numbers of correctly recognized and correctly rejected characters over the total numbers of sample recognitions and rejections, respectively. The pattern matching process can be summarized in the following algorithm.

Algorithm 2.3.
Step 1. Extract the feature vector of a test sample S by using Algorithm 2.2, obtaining a tuple (S.I, S.C, S.F), where S.I, S.C and S.F are the index, code and feature vector of the character S.
Step 2. Locate the position i in D[i] (i = 1, ..., M) at which D[i].I is the nearest to S.I.
Step 3. Find:

MDV = \min_k (S.F − D[k].F), \qquad (26)

where

S.F − D[k].F = \sum_{j=1}^{6N−2} |S.F_j − D[k].F_j|. \qquad (27)
Here, 1 ≤ k ≤ M, and k meets the following relation:

D[i].I − LOBOUND ≤ D[k].I ≤ D[i].I + UPBOUND, \qquad (28)

in which LOBOUND and UPBOUND are two integer constants which will be specified properly by experiments.
Step 4. If MDV ≤ T (T is the threshold of the feature differences), then go to Step 5; otherwise go to Step 6.
Step 5. Compare D[k].C with S.C; if D[k].C = S.C, then the test sample is correctly recognized, otherwise it is wrongly recognized. Go to Step 7.
Step 6. If the character code of the test sample does not belong to the codes of the dictionary set, then it is correctly rejected; otherwise, the rejection is wrong.
Step 7. Stop.

The local search examines only O(K × M) dictionary entries (where K × M ≥ N and 0 < K ≪ 1); this depends on the size of the character set and the values of the specified searching range. When the accuracies of sample testing are very high, we can use the dictionary to classify unknown characters. If the accuracies of sample testing are Ar (recognition) and Aj (rejection), then we can say that an unknown character will be correctly recognized or rejected by the proposed approach with an accuracy of Ar or Aj. To classify an unknown character, we use the following algorithm.

Algorithm 2.4.
Step 1. Extract the feature vector of an unknown character U by using Algorithm 2.2, obtaining a tuple (U.I, U.C, U.F), where U.I and U.F are the index and feature vector of the character U, while the code U.C of the character is unknown.
Step 2. Locate the position i in D[i] (i = 1, ..., M) at which D[i].I is the nearest to U.I.
Step 3. Find:

MDV = \min_k (U.F − D[k].F), \qquad (29)

where:

U.F − D[k].F = \sum_{j=1}^{6N−2} |U.F_j − D[k].F_j|. \qquad (30)

Here, k satisfies the same conditions as in equation (28).
Step 4. If MDV ≤ T (T is the threshold of the feature differences), then go to Step 5; otherwise the unknown character is rejected.
Step 5. Make U.C = D[k].C; the unknown character is classified as the character with the code D[k].C.
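The local matching of Algorithms 2.3 and 2.4 can be sketched as follows; `lobound`, `upbound` and `threshold` stand for the experimentally tuned LOBOUND, UPBOUND and T, and the dictionary is assumed to be a list of (index, code, feature) tuples already sorted by index. This is an illustrative sequential version, not the VLSI implementation of Section 3.

```python
import bisect
import numpy as np

def classify(feature, dictionary, lobound, upbound, threshold):
    # dictionary: [(index, code, feature_vector), ...], sorted by index.
    # Returns the matched code, or None when the input is rejected.
    s_index = int(feature.sum())                  # equation (25)
    keys = [entry[0] for entry in dictionary]
    pos = min(bisect.bisect_left(keys, s_index), len(keys) - 1)  # Step 2
    # Step 3: search only entries satisfying equation (28):
    # D[pos].I - LOBOUND <= D[k].I <= D[pos].I + UPBOUND.
    lo = bisect.bisect_left(keys, keys[pos] - lobound)
    hi = bisect.bisect_right(keys, keys[pos] + upbound)
    mdv, best_code = None, None
    for k in range(lo, hi):
        diff = int(np.abs(feature - dictionary[k][2]).sum())  # eq. (27)/(30)
        if mdv is None or diff < mdv:
            mdv, best_code = diff, dictionary[k][1]
    # Step 4: accept the candidate only if the MDV is within the threshold.
    return best_code if mdv is not None and mdv <= threshold else None
```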
3. VLSI ARCHITECTURE FOR THE PROPOSED APPROACH
In the last decade, VLSI technology has undergone great progress, which has not only successfully reduced the size, cost and delay time of hardware devices but has also opened a new direction for the implementation of parallel algorithms. (25) Numerous areas of research and application, especially pattern recognition, image processing and artificial intelligence, have been explored. Much research work has addressed the implementation of pattern recognition and image processing algorithms, which are particularly time-consuming and memory-demanding; some image transformation algorithms have similar features. (9, 21, 25, 26) More researchers have become interested in designing VLSI algorithms and VLSI architectures for pattern recognition and image processing problems. From the parallelism point of view, the following hierarchy exists: VLSI algorithms ⊂ parallel algorithms ⊂ algorithms.

The transformation algorithm in reference (26) can be implemented by using a VLSI array, but it has the following problems: the computation time is quite expensive compared with the proposed algorithm; the error rate is high, especially when handling long, narrow objects; and it needs some "global" information to find the boundary of the pattern. (9) The algorithms in references (26, 28, 29) are parallelizable; however, their accuracies are not high enough. The algorithm in reference (4) assumes that all input images of characters have already been centralized, which is not true in most cases. The centralization process is time-consuming (O(N²)) when a sequential algorithm is used. For an N × N pattern image, a VLSI array of N × N PEs is used for performing the projections and storing the image of the pattern, which is time-consuming and memory-demanding.

The VH2D approach has the following features: (1) the image is not assumed to be centralized; image centralization is taken care of in the image projections without actually transforming images; (2) the image projections and image centralization can be processed in parallel; (3) it is not only a parallel algorithm, but also a VLSI algorithm with low time complexity and high accuracy; (4) the VLSI architecture and implementation for the proposed algorithms are simple, and the communication and computations among PEs are regular and simple.

Here, we propose a VLSI architecture to perform image centralization and feature extraction (Algorithm 2.2). The architecture consists of four linear arrays of processing elements (PEs). Among them, two N-element linear arrays of PEs are used for the projections in the vertical and horizontal directions, while the other two (2N − 1)-element linear arrays of PEs are for the two diagonal directions (45° and 135°), respectively.
3.1. The array of PEs for the horizontal direction

3.1.1. The dataflow and control flow. Figure 6(a) represents the image matrix I_ij (i, j = 1, ..., N) of a character. Figure 6(b) is an N-element linear array of PEs for projections in the horizontal direction. The dataflow moves as follows:

(1) For a single row i in the matrix, I_i1, ..., I_iN move into the ith element of the array, PE_i, in a pipelining way, which means that in each time unit there is one pixel moving into the PE.
(2) All rows of data move into their corresponding PEs in parallel, which means I_1j, ..., I_ij, ..., I_Nj (j = 1, ..., N) move into PE_1, ..., PE_i, ..., PE_N in a parallel manner.
(3) Within the array of PEs, the dataflow goes from PE_1 to PE_N in a pipelining way.

The control flow is arranged so that the controlled data can move in two directions; this is required by the adjustment of the subfeature vectors in the algorithm.

Here, we need to explain the structure of the storage elements for the character image I_ij (i, j = 1, ..., N).
Fig. 6. (a) A character's image matrix I_ij (i, j = 1, ..., N); (b) an N-element linear array of PEs for the horizontal direction (Y-axis); (c) the internal structure of PE_i (row i, for i = 2, ..., N − 1); (d) the internal structure of the row PE_N; (e) the notation of (a)-(d): A_i1, A_i2, A_N1 and A_N2 are accumulators; SR_i and SR_N are shift registers; C is an adding-"1" counter; M is a multiplier; D is a one-tu delay; D_x is a divider computing x = int((Σ_k Σ_j I_kj · j)/(Σ_k Σ_j I_kj)); S_x is a subtracter computing Δx = x − N/2; SC_x is a shift control register: if Δx > 0 (< 0, = 0), then SR_i (i = 1, ..., N) shifts left (shifts right, does not shift), the shift distance being |Δx|; tu denotes time unit(s); D_y: see Fig. 7(d); S_45° and S_135°: see Figs 8(e) and 9(e).
All part (a)s of Figs 6-9 are just for clarity of description. The actual structure of the storage elements for the image I_ij is a matrix of storage nodes. Each node contains two shift registers which hold the same content I_ij: I^a_ij and I^b_ij (see Fig. 10). In Fig. 6(a), I_ij represents I^a_ij, while in Fig. 7(a) it represents I^b_ij. When Algorithm 2.2 starts, the dataflows of I^a_ij and I^b_ij move simultaneously towards the PE arrays for the horizontal and vertical directions, respectively. Additionally, the corresponding dataflow moves towards the PE arrays for the (45° and 135°) diagonal directions through the connection groups A and B in Figs 6, 8 and 9, respectively.
3.1.2. The internal structure of PEs. Figure 6(c) gives the internal structure of PE_i (i = 1, ..., N). P_hi in equation (7) is obtained in accumulator A_i1 and then stored in SR_i. C is a counter which adds one to itself each time unit and is initialized to "1". M is a multiplier. The shift registers of each PE can move their contents in both directions. Figure 6(d) gives the internal structure of PE_N. P_hN is obtained in accumulator A_N1 and then stored in SR_N. The output of A_N1 is used for calculating both x and y in equations (2) and (3) (the central point of the character). The output of A_N2 is used for calculating x, which is obtained as an output of the divider D_x. S_x is a subtracter calculating the distance and direction Δx in equation (4) for the adjustment of the subfeature vector F_x. SC_x is a shift control register which performs the transformation of the subfeature vector F_x. Figure 6(e) is the notation of (a)-(d).
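The timing behaviour of this array can be illustrated by a small software simulation. Assuming, as the paper does in Section 4, that every transfer, addition or multiplication takes one time unit (tu), the sketch below streams one pixel per tu into each row PE, ripples the partial sums to PE_N, and then spends one tu each on D_x, S_x and SC_x, reaching Δx after 2N + 3 tu. It is a behavioural sketch of Fig. 6 under these assumptions, not a hardware model.

```python
import numpy as np

def simulate_horizontal_array(img):
    n = img.shape[0]
    a1 = np.zeros(n, dtype=int)        # A_i1: row sums (P_hi, eq. (7))
    a2 = np.zeros(n, dtype=int)        # A_i2: column-weighted row sums
    tu = 0
    for t in range(1, n + 1):          # tu 1..N: one pixel per PE per tu
        a1 += img[:, t - 1]
        a2 += img[:, t - 1] * t        # counter C supplies the weight j = t
        tu += 1
    total, weighted = 0, 0
    for i in range(n):                 # tu N+1..2N: partial sums ripple
        total += a1[i]                 # into A_N1 of PE_N
        weighted += a2[i]              # into A_N2 of PE_N
        tu += 1
    x = weighted // total              # divider D_x, eq. (2)
    dx = x - n // 2                    # subtracter S_x, eq. (4)
    tu += 3                            # D_x, S_x and SC_x: one tu each
    return a1, x, dx, tu               # tu == 2N + 3

img = np.array([[0, 1, 0, 0],
                [1, 1, 1, 1],
                [0, 1, 0, 0],
                [0, 1, 0, 0]])
p_h, x, dx, tu = simulate_horizontal_array(img)
print(p_h, x, dx, tu)                  # [1 4 1 1] 2 0 11  (11 = 2N + 3)
```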
Fig. 7. (a) A character's image matrix I_ij (i, j = 1, ..., N); (b) an N-element linear array of PEs for the vertical direction (X-axis); (c) the internal structure of PE_j (column j, for j = 2, ..., N − 1); (d) the internal structure of the column PE_N; (e) the notation of (a)-(d): A_j1, A_j2, A_N1 and A_N2 are accumulators; SR_j and SR_N are shift registers; C is an adding-"1" counter; M is a multiplier; D_y is a divider computing y = int((Σ_k Σ_i I_ik · i)/(Σ_k Σ_i I_ik)); S_y is a subtracter computing Δy = y − N/2; SC_y is a shift control register: if Δy > 0 (< 0, = 0), then SR_j (j = 1, ..., N) shifts left (shifts right, does not shift), the shift distance being |Δy|; tu denotes time unit(s); S_45° and S_135°: see Figs 8(e) and 9(e).

3.2. The array of PEs for the vertical direction

3.2.1. The dataflow and control flow. Figure 7(a) represents the image matrix I_ij (i, j = 1, ..., N) of a character. Figure 7(b) is an N-element linear array of PEs for projections in the vertical direction. The dataflow moves as follows:

(1) For a single column j in the matrix, I_1j, ..., I_Nj move into the jth element of the array, PE_j, in a pipelining way, which means that in each time unit there is one pixel moving into the PE.
(2) All columns of data move into their corresponding PEs in parallel, which means I_i1, ..., I_ij, ..., I_iN (i = 1, ..., N) move into PE_1, ..., PE_j, ..., PE_N in parallel.
(3) Within the array of PEs, the dataflow goes from PE_1 to PE_N in a pipelining way.

The control flow is the same as in Fig. 6.
3.2.2. The internal structure of PEs. Figure 7(c) gives the internal structure of PE_j (j = 1, ..., N). Its structure is similar to that of PE_i in Fig. 6(c). However, A_j1 of each PE_j need not be connected, because the output of A_N1 in Fig. 6(d) can be used again in calculating y. P_vj in equation (6) is obtained in accumulator A_j1 and then stored in SR_j. The inputs and outputs of A_j1 and A_j2 are also shown in Fig. 7(c). The shift registers of the PEs are connected to each other, and their contents can move in both directions. Figure 7(d) gives the internal structure of PE_N; it has the same components as Fig. 7(c) plus some additional ones. P_vN is obtained in accumulator A_N1 and then stored in SR_N. The output of A_N2 is used for calculating y, which is obtained as an output of the divider D_y. S_y is a subtracter calculating the distance and direction Δy in equation (5) for the adjustment of the subfeature vector F_y. SC_y is a shift control register which performs the transformation of the subfeature vector F_y. Figure 7(e) is the notation of (a)-(d).
3.3. The array of PEs for the 45° diagonal direction

3.3.1. The dataflow and control flow. Figure 8(a) represents the image matrix I_ij (i, j = 1, ..., N) of a character. Figure 8(b) is a (2N − 1)-element linear array of PEs for projections in the 45° diagonal direction. The dataflow moves as follows:

(1) For a single row i in the matrix, I_i1, ..., I_iN move into the (2N − i)th element of the PE array in a pipelining way, which means that in each time unit there is one pixel moving into the PE.
(2) All rows of data move into their corresponding PEs in parallel, which means I_1j, ..., I_ij, ..., I_Nj (j = 1, ..., N) move into PE_{2N−1}, ..., PE_{2N−i}, ..., PE_N in parallel.
(3) Within the array of PEs, the dataflow goes from PE_{2N−1} to PE_1 in a pipelining way.

Here, the dataflow in (3) can start just one time unit after the dataflow in (2) starts. In this scheme of dataflow, pipelining and parallelism are combined interactively. The control flow in the PE array is the same as in Fig. 6.

Fig. 8. (a) A character's image matrix I_ij (i, j = 1, ..., N); (b) a (2N − 1)-element linear array of PEs for the 45° diagonal direction; (c) the internal structure of PE_{N+i} (for i = 0, ..., N − 1); (d) the internal structure of PE_{N−i} (for i = 1, ..., N − 2); (e) the internal structure of PE_1; (f) the notation of (a)-(e): A_{N+i} is an accumulator; SR_{N+i} and SR_{N−i} are shift registers; S_45° is a subtracter computing Δ45° = N − (x + y); SC_45° is a shift control register: if Δ45° > 0 (< 0, = 0), then SR_i (i = 1, ..., 2N − 1) shifts right (shifts left, does not shift), the shift distance being |Δ45°|; tu denotes time unit(s).

3.3.2. The internal structure of PEs. Figure 8(c) gives the internal structure of PE_{N+i} (i = 0, ..., N − 1). Each PE_{N+i} has one accumulator and one shift register. P_{d1,i} (i = N, ..., 2N − 1) in equation (8) are obtained through the accumulators A_{N+i} and stored in the shift registers SR_{N+i}. The input and output of A_{N+i} are also shown in Fig. 8(c). The outputs of SR_{N+i} move to SR_{N−i} in Fig. 8(d). The shift registers of the PEs are connected to each other, and their contents can move in both directions. Figure 8(d) gives the internal structure of PE_{N−i} (i = 1, ..., N − 2). P_{d1,i} (i = 1, ..., N − 1) in equation (8) are obtained through accumulator A_N of PE_N, then
moved from SR_N of PE_N, and finally stored in the shift registers SR_{N−i}. Figure 8(e) gives the internal structure of PE_1. Δ45° in equation (18) is obtained from the subtracter S_45° and output to the shift control register SC_45°, which controls the transformation of the subfeature vector F_45°. Figure 8(f) is the notation of (a)-(e).

3.4. The array of PEs for the 135° diagonal direction

3.4.1. The dataflow and control flow. Figure 9(a) represents the image matrix I_ij (i, j = 1, ..., N) of a character. Figure 9(b) is a (2N − 1)-element linear array of PEs for projections in the 135° diagonal direction. The dataflow moves as follows:

(1) For a single row i in the matrix, I_i1, ..., I_iN move into the (N + i − 1)th element of the PE array in a pipelining way, which means that in each time unit there is one pixel moving into the PE.
(2) All rows of data move into their corresponding PEs in parallel, which means I_1j, ..., I_ij, ..., I_Nj (j = 1, ..., N) move into PE_N, ..., PE_{N+i−1}, ..., PE_{2N−1} in parallel.
(3) Within the array of PEs, the dataflow goes from PE_{2N−1} to PE_1 in a pipelining way.

Here, the dataflow in (3) can also start just one time unit after the dataflow in (2) starts. In this scheme of dataflow, pipelining and parallelism are also combined interactively, as in Fig. 8. The control flow in the PE array is the same as in Fig. 6.

3.4.2. The internal structure of PEs. Figure 9(c) gives the internal structure of PE_{N+i} (i = 0, ..., N − 1). Each PE_{N+i} has one accumulator and one shift register. P_{d2,i} (i = N, ..., 2N − 1) in equation (9) are obtained through the accumulators A_{N+i} and stored in the shift registers SR_{N+i}. The input and output of A_{N+i} are also shown in Fig. 9(c). The outputs of SR_{N+i} move to SR_{N−i} in Fig. 9(d). The shift registers of the PEs are connected to each other, and their contents can move in both directions. Figure 9(d) gives the internal structure of PE_{N−i} (i = 1, ..., N − 2). P_{d2,i} (i = 1, ..., N − 1) in equation (9) are obtained through accumulator A_N of PE_N, then moved from SR_N of PE_N, and finally stored in the shift registers SR_{N−i}. Figure 9(e) gives the internal structure of PE_1. Δ135° in equation (22) is obtained from the subtracter S_135° and output to the shift control register SC_135°, which controls the transformation of the subfeature vector F_135°. Figure 9(f) is the notation of (a)-(e).

Fig. 9. (a) A character's image matrix I_ij (i, j = 1, ..., N); (b) a (2N − 1)-element linear array of PEs for the 135° diagonal direction; (c) the internal structure of PE_{N+i} (for i = 0, ..., N − 1); (d) the internal structure of PE_{N−i} (for i = 1, ..., N − 2); (e) the internal structure of PE_1; (f) the notation of (a)-(e): A_{N+i} is an accumulator; SR_{N+i} and SR_{N−i} are shift registers; S_135° is a subtracter computing Δ135° = x − y; SC_135° is a shift control register: if Δ135° > 0 (< 0, = 0), then SR_i (i = 1, ..., 2N − 1) shifts left (shifts right, does not shift), the shift distance being |Δ135°|; tu denotes time unit(s).

3.5. The array of PEs for pattern matching

Similar to the VLSI architectures for feature extraction and feature adjustment, we can also implement a VLSI architecture for the pattern matching process proposed in the VH2D approach, shown in Fig. 11.
3.5.1. The dataflow. Figure 11(a) represents the storage matrix of the feature vectors of the dictionary. Suppose the dictionary size is M; then the matrix is composed of √M × √M shift registers. Here, we assume √M is an integer, or take the minimum integer greater than it. Figure 11(b) is a √M-element linear array of PEs for the pattern matching process. The dataflow moves as follows:

(1) The feature vector of the unknown character, F_u, is input to PE_i (i = 1, ..., √M) in parallel.
(2) For a single row i in the matrix, F_i1, ..., F_{i√M} move into PE_i in a pipelining way, which means that in each time unit there is one feature vector F_ij (j = 1, ..., √M) moving into the corresponding PE.
(3) All rows of data move into their corresponding PEs in parallel, which means F_1j, ..., F_ij, ..., F_{√M,j} (j = 1, ..., √M) move into PE_1, ..., PE_i, ..., PE_{√M} in parallel.
(4) Within the array of PEs, the dataflow goes from PE_{√M} to PE_1 in a pipelining way.

3.5.2. The internal structure of PEs. Figure 11(c) gives the internal structure of PE_i (i = 1, ..., √M). It has two comparators, CP_i1 and CP_i2, and one shift register, SR_i. CP_i1 calculates |F_u − F_ij| (1 ≤ j ≤ √M). Each of its outputs is sent to CP_i2, which is used to find min |F_u − F_ij| (1 ≤ j ≤ √M). Each time, the output of CP_i2 is stored in SR_i and used as a feedback input to
Fig. 10. The internal structure of a storage element for the image I_ij (i, j = 1, ..., N), where I^a_ij and I^b_ij are the two copies of the element I_ij, connected to the neighbouring elements I_{i−1,j}, I_{i+1,j}, I_{i,j−1} and I_{i,j+1}.
CP_i2. After each SR_i has held min |F_u − F_ij| (1 ≤ j ≤ √M), CP_i2 takes the contents of SR_{i+1} of PE_{i+1} [min |F_u − F_kj| (1 ≤ j ≤ √M, i + 1 ≤ k ≤ √M)] and SR_i as two inputs to produce min |F_u − F_kj| (1 ≤ j ≤ √M, i ≤ k ≤ √M), which is stored in SR_i again. Two control signals are used to control the starting times of the calculation of |F_u − F_ij| (1 ≤ j ≤ √M) in CP_i1 and of the combination of the minima in CP_i2.

Fig. 11. VLSI architecture for the process of pattern matching in the VH2D approach. (a) The matrix of feature vectors of the dictionary, F_ij (1 ≤ i, j ≤ √M); (b) the array of PEs; (c) the internal structure of PE_i; (d) the notation of (a)-(c): CP_i1 is a comparator obtaining |F_u − F_ij| (1 ≤ j ≤ √M); CP_i2 is a comparator obtaining the running minimum; SR_i is a shift register; tu denotes time unit(s).

4. TIMING AND TIME COMPLEXITY OF THE VLSI IMPLEMENTATION FOR THE PROPOSED APPROACH

In Section 2, we have proposed the VH2D approach and algorithms to classify unknown characters in a large character set. In Section 3, we have also proposed a VLSI architecture to implement the proposed approach, in which image centralization, the image projections and the adjustment of the subfeature vectors can be processed in parallel. Also, we will discuss the time complexity of the pattern matching process. In this section, we will verify that these assertions about time complexity and timing relationships are correct.
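As a software illustration of the Fig. 11 dataflow whose timing is analysed below, the matching array can be viewed as a two-phase minimum reduction: each of the √M PEs first scans its own row of dictionary feature vectors, and the row minima then ripple from PE_√M down to PE_1, one PE per time unit, so the global MDV appears at PE_1 after O(√M) further tu. The code below is an illustrative serialization of that dataflow under our own naming.

```python
import numpy as np

def parallel_match(f_u, fmatrix):
    # fmatrix: sqrt(M) x sqrt(M) grid of dictionary feature vectors,
    # shape (s, s, d); f_u: the unknown character's feature vector (d,).
    s = fmatrix.shape[0]
    # Phase 1 (about sqrt(M) + 1 tu in the array): every PE_i computes the
    # minimum of |F_u - F_ij| over its own row (CP_i1 feeding CP_i2).
    row_min = np.abs(fmatrix - f_u).sum(axis=-1).min(axis=1)
    # Phase 2 (another sqrt(M) tu): SR contents ripple from PE_{i+1} to
    # PE_i; each PE keeps the smaller of its value and the incoming one.
    acc = row_min[s - 1]
    for i in range(s - 2, -1, -1):
        acc = min(acc, row_min[i])
    return acc            # the MDV, available at PE_1
```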
For the Algorithm 2.3, Step 1 is in the time complexity of O(N). Step 2 can be carried out in O(loo~ M). In Step 3, assume CP~ and CPz take one time unit for each comparison, then considering in the case of searching through all the dictionary, it takes x / M + 1
4.1. The time complexity of Algorithm 2.2 and Algorithm 2.3
time units to obtain m i n l F , - Fol (1 < j < V/M) and another x / M time units to obtain minlF~-F~jl
Step 1 of Algorithm 2.2 is to find the central point lc(x, y) of the character c. For row i [see Fig. 6{c)], the dataflow I , , . . . ,I~s in the storage matrix moves into Air in a pipelining way. For clarity and simplicity, we assume that any single operation of data transferring, adding or multiplying can be finished in one time unit (tu). Then, after N time units, the accumulator A~t has ~ = tlv. After N + 1 time units, the accumulator A~ has ~/= ~I~j.j. For all rows, the dataflows are moved into each correspondent PE in parallel. After N time units, the dataflows will take another N time units to move from A ~ and A~z of PE~ to As~ and Asz of P E n. So, totally, it takes 2N time units to obtain N ~ =N ~Y.j: ~I~ and Y.~= t ~ = ~l,~*j [see Fig. 6(d)]. D~, S~ and SC~ take one time unit each to carry out their operations. Hence, after 2N + 3 time units, A~ in equation (4) has been obtained. Therefore, to find x and A~is in O(N) time complexity. Since, to find yean be carried out in parallel with to find x, and operations in a horizontal direction are similar to a vertical direction, to find y and A, is also in O(N) time complexity. Therefore, Step 1 of Algorithm 2.2 is of the order O(N). Step 2 of Algorithm 2.2 is to find four subfeature vectors of the character. From Figs 6(c) and (d), F~ can be obtained in N time units. Similarly, from Figs 7(c) and (d), F~ can also he obtained in N time units. For a 45 ° diagonal direction, see in Fig. 8, all rows of data in the storage element matrix move into each correspondent PE in parallel. For PEN+i, before N time units, two streams of data from two directions IN_iS N--i-1 N-1 and Y~= ~ ~,~=i+t.k=~+il~kareaddedmAt~+iandthe ~'~N-i~'~N results/..~=~,~=i+t,k=~+i I ,k are output to A s + i - t of PEN + ~ - 1. By the same time, all the results are stored in SRN +v In the array of the PEs, starting from the 2nd time unit after rows of data start moving towards PE array, the contents of S R ~ N _ , . . . , S R s of P E 2 N _ , . . . , PE N will take N - 1 time units to move into SRN,..., SR~ of PEN,..., PE x. After N time units since the dataflows start moving into the PE array, all the SRN+I of the PEN+ ~will hold ~_.N-i~'~N I = l / . k = i + Lk =t + dlk (all vector elements for the upleft triangle matrix). All of the SR N_ i of the PE N_ i will hold N N-i ~l=i+ ~.~=~+iY~= ~I,~ (all vector elements for the lowright triangle matrix). Therefore, the time complexity of finding F4so is in O(N). Similarly, finding F~a~o is also in O(N). Hence, Step 2 is in the time complexity of O(N). Steps 3 and 4 are just to adjust F~, F,, F4so and F~3~o and combine them into F'. Therefore, the time complexity of Algorithm 2.2 is O(N), where N is the size of the image.
For Algorithm 2.3, Step 1 is of time complexity O(N), and Step 2 can be carried out in O(log M) time. In Step 3, assume that CP_i1 and CP_i2 each take one time unit per comparison; then, considering the case of searching through the whole dictionary, it takes √M + 1 time units to obtain min |F_u - F_ij| (1 ≤ j ≤ √M) within each row of the dictionary matrix, and another √M time units to obtain min |F_u - F_ij| (1 ≤ i ≤ √M, 1 ≤ j ≤ √M) over the whole dictionary. Hence, the pattern-matching process on the PE array takes O(√M) time.
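As a sequential point of comparison for this bound, here is a sketch of the matching step (ours, not the authors' implementation; the function name, the use of an L1 distance, and the rejection rule are our assumptions based on the description above and on the experimental threshold of 130 quoted later in the experiments). A uniprocessor spends O(M) vector comparisons here, where the PE array needs only O(√M) time units.

import numpy as np

def match_character(f_u, dictionary, threshold=130.0):
    """Sequential sketch of Algorithm 2.3's matching step.

    dictionary has shape (sqrt_M, sqrt_M, D): a sqrt(M) x sqrt(M) grid of
    D-dimensional feature vectors, as in Fig. 11(a). Returns the grid index
    (i, j) of the closest feature vector, or None if even the best match
    differs by more than the threshold (character rejected as outside the
    dictionary).
    """
    dists = np.abs(dictionary - f_u).sum(axis=2)   # |F_u - F_ij| for all i, j
    i, j = np.unravel_index(np.argmin(dists), dists.shape)
    if dists[i, j] > threshold:
        return None                                 # reject unknown character
    return int(i), int(j)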
until PE_1, respectively. So, after N time units, all PEs will hold all the vector elements of F_135° described in equation (9). In Fig. 9(e), A_135° is obtained through S_135° and stored in SC_135° after 2N + 3 time units. Before that, after N time units, the F_135° of the uncentralized image of the character has already been obtained and stored in SR_i (i = 1, ..., 2N - 1) in Fig. 9. Therefore, after 2N + 3 time units, when the shift control pulse starts, F_135° is adjusted into F'_135° and stored in SR_i.

The timing relationship of the pattern-matching process can be described as follows. Assume that the feature vector F_u of the unknown character has been obtained. At the first time unit, F_u is input to PE_i (i = 1, ..., √M) in parallel. The dictionary vectors F_ij (j = 1, ..., √M) move into PE_i in a pipelined way, each row of the dictionary matrix feeding its own PE in parallel. In CP_i1, after each |F_u - F_ij| (1 ≤ j ≤ √M) is computed, control signal 1 is given and the output of CP_i1 is sent to CP_i2; CP_i2 takes another time unit to obtain the smaller of the feedback from SR_i and the current output of CP_i1. So, it takes √M + 1 time units to obtain min |F_u - F_ij| (1 ≤ j ≤ √M).
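A toy model of one matching PE may help fix this timing argument. This is our illustrative sketch, not the authors' hardware description: the row of dictionary vectors streams through CP_i1 one vector per time unit while CP_i2 keeps the running minimum via SR_i, which is where the √M + 1 figure comes from.

def pe_row_min(f_u, row_vectors):
    """Model of one matching PE_i. The vectors F_i1, ..., F_i(sqrt M) stream
    through in a pipelined way; CP_i1 computes |F_u - F_ij| (one time unit
    each) and CP_i2 keeps the smaller of that value and the feedback from
    SR_i (one further time unit), so the row minimum min_j |F_u - F_ij| is
    ready after sqrt(M) + 1 time units."""
    sr_i = float("inf")                  # initial content of SR_i
    for f_ij in row_vectors:             # one dictionary vector per time unit
        cp1 = sum(abs(a - b) for a, b in zip(f_u, f_ij))   # CP_i1
        sr_i = min(sr_i, cp1)                              # CP_i2, via SR_i
    return sr_i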
5. EXPERIMENTAL RESULTS

… described in Section 2.3. All images of characters in TIFF files are transformed into PBM binary files using file-format conversion utilities under the UNIX system. The dictionary of feature vectors for the character set is built in such a way that each element of a character's feature vector is the mean of the corresponding elements of the feature vectors of its 10 image instances. For the purpose of sample testing, each character in the dictionary is assigned a code which uniquely represents it.

In selecting the sample set for testing, in order to make it representative of the different structures and strokes of the characters, we randomly selected 300 characters from the dictionary. The results of the sample testing show that the accuracies of recognition and rejection are extremely high: all input characters having counterparts in the dictionary are correctly classified, and all input characters without counterparts in the dictionary are rejected. The programs were written in C, and all experiments were conducted on a DEC 5000 workstation running under the UNIX system. Table 1 gives the experimental results of recognition and rejection using the selected test samples. In the experiments, the test samples do not overlap with the training samples.

Figure 12 shows the images of some groups of Chinese characters which are difficult to classify using existing methods; the proposed algorithm, however, classifies them correctly. Let us use an example to show how the proposed algorithm distinguishes the characters listed in Fig. 12. One of the groups in Fig. 12 contains the character "person" (code 10) and the character "in" (code 11). From Table 2, the four subfeatures of "person", represented by its four local indices, are 93, 202, 64 and 34, respectively; those of "in" are 115, 113, 75 and 44. These values reflect only the general differences between their feature vectors. From Table 3, the differences in local indices (X, Y, 45 and 135°), which are the summations of the differences of the elements of the four subfeature vectors, respectively, between "person" and "in" are 63, 63, 19 and 18, which total 163. This value is greater than the experimental threshold value (130), so the input patterns "person" and "in" can be correctly classified. Other examples can be explained in a similar way. Therefore, our method can effectively and correctly classify characters which are very difficult to distinguish using the existing approaches.
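As a quick check of the arithmetic in this example (our code; the values are taken directly from Table 3 and the threshold from the text above):

# Differences in local indices between "person" (code 10) and "in" (code 11)
# in the X, Y, 45-degree and 135-degree directions, from Table 3.
local_index_differences = [63, 63, 19, 18]
total_difference = sum(local_index_differences)   # 163, the total index difference
THRESHOLD = 130                                   # experimental threshold from the text

# 163 > 130, so "person" and "in" are far enough apart in feature space
# to be classified correctly rather than confused with each other.
assert total_difference == 163
assert total_difference > THRESHOLD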
Table 1. Experimental results of recognition and rejection using character samples both from and not from the dictionary

Test number | Characters in the dictionary | Samples taken from the dictionary | Samples not taken from the dictionary | Recognition rate (%) | Rejection rate (%)
Test 1 | 3000 | 300 | 0 | 100 | --
Test 2 | 3000 | 300 | 100 | 100 | 100
[Fig. 12 images: scanned Chinese characters glossed in English as person, in, eight, field, from, king, host, no, first, wood, state, live, big, end, go, husband, village, sky, measure, official, wait.]
Fig. 12. Some groups of Chinese characters which have similar structures. These characters are obtained directly from scanning and are used to generate their feature vectors.

Table 2. List of total indices and local indices of all the characters given in Fig. 12. Here, the total indices are the summations of the feature vectors, and the local indices are the summations of the four subfeature vectors, respectively
Character group | Character in English | Character code | Total index | Local index: X | Y | 45° | 135°
1 | eight    | 9    | 317 | 104 | 108 |  67 |  38
1 | person   | 10   | 292 |  93 | 202 |  64 |  34
1 | in       | 11   | 347 | 115 | 113 |  75 |  44
2 | soldier  | 28   | 214 |  69 |  70 |  36 |  39
2 | land     | 29   | 203 |  68 |  66 |  37 |  32
3 | end      | 218  | 445 | 143 | 145 |  82 |  75
3 | no       | 219  | 399 | 127 | 128 |  75 |  69
4 | field    | 259  | 581 | 191 | 190 | 100 | 100
4 | from     | 260  | 600 | 196 | 195 | 106 | 103
4 | state    | 261  | 513 | 171 | 170 |  87 |  85
4 | first    | 262  | 523 | 172 | 171 |  89 |  91
5 | king     | 95   | 431 | 141 | 141 |  77 |  72
5 | host     | 325  | 497 | 164 | 163 |  84 |  86
6 | big      | 31   | 354 | 113 | 114 |  65 |  62
6 | measure  | 32   | 289 |  89 |  90 |  53 |  57
6 | sky      | 85   | 446 | 144 | 145 |  76 |  81
6 | husband  | 86   | 490 | 162 | 157 |  82 |  89
7 | go       | 1282 | 589 | 192 | 195 |  99 | 103
7 | live     | 832  | 533 | 175 | 175 |  93 |  90
8 | wait     | 2312 | 815 | 264 | 265 | 146 | 140
8 | official | 1813 | 659 | 213 | 214 | 114 | 118
9 | wood     | 661  | 658 | 215 | 216 | 112 | 115
9 | village  | 662  | 614 | 200 | 203 | 107 | 104
Table 3. List of all the differences in total indices and local indices for all the character groups given in Fig. 12. Here, the differences in total indices are the summations of the differences in local indices in the four directions; the differences in local indices are the summations of the differences of each vector element of the four subfeature vectors, respectively

Character group | Character code A | Character code B | Total index difference | Local index difference: X | Y | 45° | 135°
1 | 9    | 10   | 149 |  62 |  53 | 23 | 11
1 | 9    | 11   | 152 |  73 |  44 | 24 | 11
1 | 10   | 11   | 163 |  63 |  63 | 19 | 18
2 | 28   | 29   | 190 |  50 |  86 | 22 | 32
3 | 218  | 219  | 263 |  83 | 110 | 33 | 37
4 | 259  | 260  | 211 |  80 |  80 | 11 | 40
4 | 259  | 261  | 155 |  71 |  38 | 18 | 28
4 | 259  | 262  | 217 |  93 |  69 | 36 | 19
4 | 260  | 261  | 146 |  43 |  52 | 27 | 24
4 | 260  | 262  | 234 | 111 |  53 | 41 | 29
4 | 261  | 262  | 212 | 106 |  55 | 34 | 17
5 | 95   | 325  | 364 | 134 | 130 | 68 | 32
6 | 31   | 32   | 178 |  85 |  31 | 32 | 30
6 | 31   | 85   | 214 | 114 |  50 | 30 | 20
6 | 31   | 86   | 150 |  46 |  56 | 28 | 20
6 | 32   | 85   | 190 |  79 |  51 | 30 | 30
6 | 32   | 86   | 170 |  65 |  63 | 20 | 22
6 | 85   | 86   | 192 | 102 |  40 | 32 | 18
7 | 832  | 1282 | 199 |  68 |  83 | 20 | 28
8 | 1813 | 2312 | 236 | 121 |  54 | 31 | 30
9 | 661  | 662  | 271 |  78 | 122 | 37 | 34
Table 2 lists all the total indices and local indices, which represent the summations of the four subfeature vectors, for all the characters in Fig. 12. Table 3 lists all the differences in feature vectors and subfeature vectors among the character groups given in Fig. 12.

6. CONCLUSIONS

Character recognition with a large character set is a difficult problem in image processing and pattern recognition. In this work, we have proposed the VH2D approach and its algorithms to solve the problem of character recognition with a large set of Chinese characters. The proposed approach and algorithms have several advantages over the existing approaches: (1) Good features of the characters are extracted, which is essential to the extremely high classification rate. (2) The problem of image centralization is handled without an actual image transformation, so a large amount of computation time is avoided; and since the image itself need not be kept during the whole classification process, the demand on main memory is low. (3) Finding the central point of a character and finding its feature vector can be carried out in parallel, and the four subfeature vectors of the character are also processed in parallel; the proposed algorithms are thus well suited to parallel processing and VLSI implementation. (4) The VLSI architectures and implementations for the proposed algorithms are very simple: only four linear PE arrays are used instead of an N x N PE array. (5) The time complexity is O(N) for the proposed VLSI architectures instead of the O(N^2) required when a uniprocessor is used.

Since a character can be treated as an arbitrary object, the proposed VH2D approach is not limited to Chinese character recognition; it can be used to classify the characters of other languages, symbols and other objects. Further, owing to the characteristics of the VH2D approach, the proposed approach and its VLSI implementation can be extended to real-time applications in image processing and pattern recognition.
REFERENCES
1. S. Mori, C. Y. Suen and K. Yamamoto, Historical review of OCR research and development, Proc. IEEE, 1029-1057 (July 1992).
2. C. Y. Suen, Character recognition by computer and applications, in Handbook of Pattern Recognition and Image Processing, T. Y. Young and K. S. Fu, eds, pp. 569-586. Academic Press, New York (1986).
3. T. Wakahara et al., On-line handwriting recognition, Proc. IEEE, 1181-1193 (July 1992).
4. Y. Y. Tang et al., VLSI architecture for parallel concentration-contour approach, in Proc. 11th ICPR Int. Conf. Pattern Recognition, The Hague, The Netherlands, August 30-September 3 (1992).
5. T. Akiyama and N. Hagita, Automated entry system for printed documents, Pattern Recognition 23(11), 1141-1154 (1990).
6. K. T. Lua and K. W. Gau, Recognizing Chinese characters through interactive activation and competition, Pattern Recognition 23(12), 1311-1321 (1990).
7. L. H. Chen and J. R. Lieh, Handwritten character recognition using a 2-layer random graph model by relaxation matching, Pattern Recognition 23(11), 1189-1205 (1990).
8. C. W. Liao and J. S. Huang, A transformation invariant matching algorithm for handwritten Chinese character recognition, Pattern Recognition 23(11), 1167-1188 (1990).
9. H. D. Cheng et al., Parallel image transformation and its VLSI implementation, Pattern Recognition 23(10), 1113-1129 (1990).
10. T. Taxt et al., Recognition of handwritten symbols, Pattern Recognition 23(11), 1155-1166 (1990).
11. H. Yamada et al., A nonlinear normalization method for handprinted Kanji character recognition--line density equalization, Pattern Recognition 23(9), 1023-1029 (1990).
12. N. Tavakoli, Character recognition: a unified approach, Proc. SPIE, Mach. Vis. Appl. Character Recognition Indust. Inspect. 1661, 158-167 (1992).
13. V. A. Jaravine, Syntactic neural network for character recognition, Proc. SPIE 1661, 215-223 (1992).
14. T. Baker and H. McCartor, A comparison of neural network classifiers for optical character recognition, Proc. SPIE 1661, 191-202 (1992).
15. G. Houle and K. B. Eom, Use of a priori knowledge for character recognition, Proc. SPIE 1661, 146-156 (1992).
16. F. H. Cheng et al., Recognition of handwritten Chinese characters by modified Hough transform techniques, IEEE Trans. PAMI 11, 429-439 (1989).
17. S. Kahan et al., On the recognition of printed characters of any font and size, IEEE Trans. PAMI 9, 274-288 (1987).
18. S. W. Lee, Noisy Hangul character recognition with fuzzy tree classifier, Proc. SPIE 1661, 127-136 (1992).
19. T. Pavlidis et al., Recognition of poorly printed text by direct extraction of features from gray scale, Proc. SPIE 1661, 118-126 (1992).
20. Y. Y. Tang et al., Transformation-ring-projection (TRP) algorithm and its VLSI implementation, Int. J. Pattern Recognition Artif. Intell. 5(1 and 2), 25-56 (1991).
21. H. D. Cheng and K. S. Fu, VLSI architectures for string matching and pattern matching, Pattern Recognition 20(1), 125-141 (1987).
22. A. Taza and C. Y. Suen, Discrimination of planar shapes using shape matrices, IEEE Trans. Syst. Man Cybern. 19(5), 1281-1289 (1989).
23. S. W. Lee et al., Translation-, rotation- and scale-invariant recognition of hand-drawn symbols in schematic diagrams, Int. J. Pattern Recognition Artif. Intell. 4(1), 1-25 (1991).
24. J. Kittler, Feature selection and extraction, in Handbook of Pattern Recognition and Image Processing, T. Y. Young and K. S. Fu, eds, pp. 282-310. Academic Press, New York (1986).
25. H. D. Cheng et al., VLSI architecture for digital picture comparison, IEEE Trans. Circuits Syst., Special Issue on VLSI Implementation for Digital Image and Video Processing Applications 36(10), 1326-1335 (1989).
26. S. Y. Lee et al., Parallel image normalization on a mesh connected array processor, Pattern Recognition 20, 115-124 (1987).
27. K. S. Fu, ed., VLSI for Pattern Recognition and Image Processing. Springer-Verlag, Berlin, Heidelberg (1984).
28. R. H. Wu and H. Stark, Rotation and scale invariant recognition of images, Proc. 8th Int. Conf. Pattern Recognition, pp. 92-94, Paris, France (October 1986).
29. K. Mersereau and G. M. Morris, Scale, rotation, and shift invariant image recognition, Appl. Optics 25, 2338-2342 (July 1986).
30. L. Wang and T. Pavlidis, Detection of curved and straight segments from gray scale topography, in Character Recognition Technologies, D. P. D'Amato, ed., Vol. 1906, pp. 10-20 (1993).
31. T. J. Shan, A novel structure recognition based on OCR system and its parallel VLSI architecture, in Character Recognition Technologies, D. P. D'Amato, ed., Vol. 1906, pp. 172-183 (1993).
32. C. Y. Suen et al., Automatic recognition of handprinted characters--the state of the art, Proc. IEEE 68(4), 469-487 (1980).
33. V. K. Govindan and A. P. Shivaprasad, Character recognition--a review, Pattern Recognition 23(7), 671-683 (1990).
34. T. H. Hildebrandt and W. Liu, Optical recognition of handwritten Chinese characters: advances since 1980, Pattern Recognition 26(2), 205-225 (1993).
About the Author--HENG-DA CHENG received his Ph.D. degree in Electrical Engineering from Purdue University, West Lafayette, IN, in 1985. He is now an Associate Professor in the Department of Computer Science and an Adjunct Associate Professor in the Department of Electrical Engineering, Utah State University, Logan, Utah. Dr Cheng has published numerous technical papers and is the co-editor of the book Pattern Recognition: Algorithms, Architectures and Applications (World Scientific Publishing Co., 1991). His research interests include parallel processing, parallel algorithms, artificial intelligence, image processing, pattern recognition, computer vision, fuzzy logic, genetic algorithms, neural networks and VLSI architectures. Dr Cheng was Program Co-Chairman of Vision Interface '90 and Session Chair and Member of the Best Paper Award Evaluation Committee of the International Joint Conference on Information Sciences, 1994. He is a Senior Member of the IEEE, a Member of the Association for Computing Machinery, an Associate Editor of Pattern Recognition and an Associate Editor of Information Sciences.

About the Author--DAVID C. XIA received the B.S. in Computer Science from Shanghai University of Science and Technology, China, in 1984 and the M.S. in Computer Science from Utah State University, Logan, Utah, in 1993. His research interests include pattern recognition, parallel processing, VLSI design and high-level synthesis, and database systems. He has been with Ameritech Library Services since 1994.