SHARP: a shape recognition system and its parallel
implementation C P Ravikumar* and Rajender Sethi t
A parallel algorithm for shape recognition is presented along with its implementation on a distributed memory muhiprocessor. Shape recognition is one of the fundamental problems of computer vision. We consider a shape to be composed of a set of small straight line segments tangential to the object. The recognition problem is to determine whether the test image contains a specified reference shape or not. The straight line Hough transform (SLHT) has been used to detect reference shapes. A signaturebased parallel algorithm called SHARP is developed for shape recognition using SLHT on a distributed memory multiprocessor system. In the SHARP algorithm, the (0, r) space is divided among processors. The SHARP algorithm has been implemented on a Meiko transputer with 32 nodes. We analyse the performance of the parallel algorithm using both theoretical and experimental techniques.
Keywords: shape recognition, parallel algorithm, Hough transform
for detection of straight lines in binary images 2'4's. A straight line in the two-dimensional plane can be represented by an angle 0 of its normal and its distance r from the origin. The equation of the straight line is given by: r = xcos0 + ysin0 The value of 0 is restricted to the interval [0, x] and r is restricted to the interval [--n*(cos 45 + sin 45), n*(cos 45 + sin 45)], where n * n is the size of the image. The (0, r) space is discretized and represented by a two-dimensional accumulator array. Each accumulator cell counts the number of edge points (xi, Yi) mapped to it. Thus, information about line segments is obtained by applying a threshold value to the accumulator array. A limitation of the Hough transform is that it does not preserve information about which pixel belongs to a particular line, i.e. the connectivity of the edge pixels is not taken into account. In this paper we use a modified transformation called the
Shape recognition is one of the fundamental problems of computer vision. In many practical applications it is necessary to recognize shapes of various types from pictures. Given a reference shape, the recognition problem is to determine if the test image contains the reference shape or not. In this paper, we consider the problem of shape recognition in binary images.
Shape representation An arbitrary shape can be considered to be composed of small tangential straight line segments as shown in Figure 1. Hough transformation (HT) 1 3 is a well known method *Electrical Engineering Department, Indian Institute of Technology, New Delhi 110016, India fNational Informatics Centre, Labour Information Systems, Division 111, SS Bhavan, Rafi Marg, New Delhi 11000], India Paper received: 15 December 199 ~
Figure 1 An arbitrary shapeand its represemation
0141-9331/95/$09.50 '!~ 1995 Elsevier Science B.V. All rights reserved Microprocessors and Microsystems Volume 19 Number 3 April 1995
131
SHARP: a shape recognition system: C P Ravikumar and R Sethi straight line Hough transform (SLHT) 5'~', where the continuity of the edge pixels is checked; unrelated pixels are not mixed together. The line segments are collected in a twodimensional array called slht. A threshold is applied to the length of line segments detected in each slht cell, rather than to the number of edge points in the accumulator. In SLHT, the (0, r) space is regarded as a two-dimensional Boolean array, say A, such that A[O,][r/1 will have a value of 1 if the corresponding slht cell contains one or more detected line segments; otherwise the entry will have a value 07. Each line segment in the input shape (see Figure 1 for an example) maps to a point (0, r). Alternatively, each point in (0, r) space corresponds to a line that is tangential to the curve. Thus the SLHT of an arbitrary shape is the set of lines that are tangential to it 5'6. A shape signature, called the scalable translation invariant rotation-to-shifting (STIRS)7, is obtained from the SLHT by computing the distances between pairs of parallel tangents to the shape.
stored in the database. Each processor computes the partial SLHT of the test image, a partial signature of the test image, and matches the partial signature with the reference signature to find a partial score. These partial scores are combined to draw a conclusion about the detection and orientation of the reference object. The only overhead of the parallel algorithm is the summation of partial scores, thereby resulting in a processor utilization that is close to unity. The details of the SHARP algorithm are given below.
Shape matching
SLHT OF A CURVE
In recognition of arbitrary shapes, we are concerned with the detection of the object as well as estimation of the angle of rotation of the object in the test image. To recognize the test object, we need to match the signature of the test object with the signature of the reference object. The score at every angle of orientation is computed for all points at which the reference signature matches with the test signature. The detection of a peak at some angle [~ in the score indicates the orientation of the reference shape by angle fi in the test image. If no significant peak is detected, we conclude that the test shape does not contain an instance of the reference object.
A closed curve can be considered ~' 5, 7 to be composed of small, tangential, straight line segments. A curved object can be described by an SLHT. The SLHT of an arbitrary shape, say C, can be described as a set of tangents, i.e. as a multiple-valued function r(fl) such that for a given 0i, say:
Organization The concept of signature-based shape recognition using the SLHT is explained in the following section. The parallel shape recognition algorithm (SHARP) developed for the signature-based technique is described in the section following. Experimental results of the parallel implementation are then discussed, followed by conclusions.
r(O) = {rlthe l i n e / : r = x* cos 0 + y* sin 0 is tangent to C} We now consider the effect of rotation on the SLHT of a straight line, say8 / - (01, rl). Consider rotating the image by an angle q~ in a counter-clockwise direction. Then the point (01, r~) will be mapped to (02, r2) such that: ()2 - (01 + qS) mod 180
Computation requirements Shape recognition is a computation-intensive task, especially because the reference shape may appear in many orientations in the test image. If we imagine all possible 360 orientations then reference features have to be matched a corresponding number of times in order to locate the shape. This motivates us to consider parallel solutions for recognizing shapes in real-time applications, e.g. robotics.
Parallel implementation In this paper, we present SHARP (SHApe Recognition in Parallel) which has been designed to obtain linear speedups. The algorithm has been implemented on a distributedmemory multiprocessor system. Our algorithm divides the 0 space nearly equally among the available number of processors p. Given a binary test image and a database of reference shapes, our algorithm considers each reference shape in succession and matches it with the test image. The signatures for the reference shapes are precomputed and
132
and r 2 - rr r2=
if 0 + # 5 < 180 r~
if0+~b>180
SLHT representation If we discretize the (0, r) space into m~ and mr levels respectively, the SLHT of a curve is represented by a Boolean array A[O...m~][O...mr], where A[OiJ[r i] will have a value 1 if the corresponding slht cell contains one or more detected line segments at (0i, rj); otherwise it will have a value 0.
STIRS signature of the curve The distances between pairs of parallel tangential lines to C represent the signature of the curve C. The resultant transform (signature) has the following properties: • It is invariant to the translation of the shape. • Rotation of the shape corresponds to a circular shift of its signature in the (0, r) space.
Microprocessors and Microsystems Volume 19 Number 3 April 1995
SHARP: a shape recognition system: C P Ravikumar and R Sethi •
If the shape is scaled by a factor 5, then the signature is also scaled by the same factor.
Due to the above reasons, the signature is termed as a scalable translation-invariant, rotation-to-shifting (STIRS) signature 7. The STIRS signature is also represented by a Boolean array, say D[0..,m,]10...m~] such that D[Oi][r i] = 1 if the distance between a selected pair of points in 0, column is equal to rj; otherwise D[Oi][O] = 0.
PARALLEL ALGORITHM FOR SHAPE RECOGNITION Parallel computing in shape recognition The following three steps are involved in detecting a reference shape in a test image: (a) Computation of the SLHT for the test image. (b) Computation of the STIRS signature of the test image. (c) Matching the test signature with that of the reference shape. The process of shape amount of CPU time, processing is desirable recognizer in a real-time
recognition demands a significant and speedup through parallel if we intend to use the shape application.
Computational model We assume a distributed-memory, multiple instruction multiple data (MIMD) computational model '~. This is because the machine available to us for experimentation, the Meiko transputer system ~°, is a distributed memory multiprocessor.
Parallelization scheme First consider step (a), namely, the computation of the SLHT of a test image. A naive parallelization scheme is to divide the image row-wise or column-wise into as many partitions as the number of processors p. Each processor can compute the slht array independently on the partitioned image. However, this scheme calls for a globally accessible slht array, which cannot be supported in our computational model. Alternatively, a separate copy of the slht array can be maintained in the local memory of each processor, and these can be finally merged. This solution is also expensive, since it calls for an excessive amount of interprocessor communication. Furthermore, the memory requirement at each computing node is increased significantly, since the slht array is m, x m~ in dimension. In the SHARP algorithm, we consider a different form of data parallelism. The slht array is divided over the 0 space into p partitions, each consisting of m~/p angles. Processor i computes the signature for the entire image for angles in the range Ai = [i*mJp, ( i + ] ) m ~ / p - 1 ] . We refer to this as the partial signature. Processor i also applies the matching
algorithm to the test image and the reference image for the angles in the range Ai for all m, orientations of the reference shape. The score resulting from the partial match is termed the partial score, and consists of an array of m~ real values. The merging step now involw~s the computation of the total score by adding the respective elements of the partial score available with processors. The merging step can be carried out using a binary-tree reduction procedure whose complexity is only O(m,~ + log 2 p). This is further explained in the next section. Li et al. ~ have proposed a scheme for speeding up the computation of the SLHT using a systolic array architecture. Their scheme is restricted to the detection of straight-line segments in the test image. Further, the technique { uses fine-grain parallelism. The systolic array solution leads to a special-purpose machine for shape recognition. The emphasis in this paper is on using coarse-grain parallelism in the shape recognition algorithm; this leads to a multiprocessor-based parallel algorithm which can be implemented on a variety of commercially available MIMD machines such as the Intel iPSC, Intel Touchstone-Delta, BBN Butterfly, NCUBE, PARAM and so on. Furthermore, with the popularity of massively parallel computers, it is expected that future generations of parallel computers will employ over 4096 processors working in a distributedmemory environment. These machines use advanced switching techniques such as wormhole routing to cut down the cost of interprocessor communication. Our parallel algorithm is ideally suited for such platforms. A limitation of our parallel algorithm is that it requires that a local copy of the test image and the reference signature be available with each processor. The space requirement of the algorithm is highest during the computation of the SLHT; in addition to an n × n image, a reference signature of dimension m~ × mr, there is also the need for storing information regarding line segments which we need to keep track of the connectivity of pixels. If there are k bright pixels in the image, the worst-case memory requirement for the connectivity information alone is no more than k*mJp*mr. In our implementation, we have used the Meiko transputer where each of the 32 processors has 4 MB of local memory. We have tested our algorithm on as many as three reference shapes being matched with a test image of size 256 × 256. With a 5 resolution or lower, we did not encounter any memory problems.
SHARP algorithm We now describe SHARP, a parallel algorithm which is based on the ideas developed in the previous section. The algorithm given in Figure 2 is executed by each of the p processors. A list of reference objects is specified to the algorithm; the processors handle one reference shape at a time. A synchronization step is involved before the processors can proceed to the next reference shape. A separate copy of the test image and the reference signature are loaded into the local memory of each processor. Ignoring file I/O time, the SHARP algorithm can be expected to achieve a linear speedup because of the negligible over-
Microprocessors and Microsystems Volume 19 Number 3 April 1995
133
SHARP: a shape recognition system: C P Ravikumar and R Sethi head of the procedures participate in add and synchronize. The procedures to compute the partial SLHT and partial signature, matching, and addition of partial scores are described below.
Parallel algorithm
for SLHT
Each processor applies the partial_slht procedure on the test shape for bright pixels. To compute the SLHT, the 0 space is divided into p regions numbered 0, 1, 2 .... , p - 1. The region i corresponds to 0 values from i*3~m,/p to ( i + 1)*~mo/p. The processor i is assigned to region i and computes the partial SLHT denoted by slhl[t] [r], 0 <~ t <~ m~/p for the 0 values meant for region i. The algorithm to compute the SLHT is given in Figure 3. The complexity of applying this procedure is O ( n 2 * l * m J p ) for an n*n binary image, where / is the average number of line segments detected in each slht cell. Parallel algorithm
for computing
the s i g n a t u r e
The computation of the signature can proceed independently in each processor after the processor has computed the partial SLHT of the test shape. Processor i computes an
accumulator A i after applying a line-length threshold value on its slht cells. The signature array D ~ is computed for all possible pairs of points in the accumulator A i. The algorithm to compute the partial test signatures is given in Figure 4. The time complexity of computing the Boolean accumulator and the signature are O(mr*l*m~/p) and O(m2*m~/p) respectively. Thus the time complexity for computation of the test signature is O(m2*m~/p).
Parallel algorithm
for matching
Since each processor i contains the test signature for the appropriate range of 0, the algorithm shown in Figure 5 matches a region of the test signature with the corresponding region of the reference signature. The reference signature is rotated and matched m~ times (wrap around) in steps of ,~j. Each processor maintains its own copy of the matching score for all the orientations. The time complexity of computing the partial match is O(mr*rnJp) for each orientation of the reference shape. Therefore, the overall complexity of the matching step is O ( m r * m 2 j p ) . procedure partialsignature (i)
/* i is the processor id */ begin for E) = 0 t o m e / p - 1 do begin f o r r = O t o m , - I do for each line in slhf[®][r] do
procedure SHARP (p. i. test_shape, reference_shapes) /* p is the number of processors and
if length of line > threshold then
i is the processor id */
Ai[®][r] = 1
begin forr = 0tom,-
read the test shape into pixel array;
if Ai[E)][r] = I then
compute partial_slht; compute partial_signature; for each reference_shape do begin read reference signature; perform p a r t i a l m a t c h ;
participate in add;
1 do
forr I = r + ltom,-
I do
if Ai[$)][rl] = 1 then Di[E)][G-r] = I end end
(* Add partial scores *)
ifi = p- 1then
Figure 4 Computingthe signature. Thresholdis the line length threshold
find peak in matching score; synchronize; end
procedure partial_match(i)
end
/* i is the processor id */ begin
Figure 2 The SHARPalgorithm
for 01 = 0 t o m e - 1 do begin
match = approx = miss = 0 for E)2 = 0 t o m e / p - 1 do
procedure partial_slht (i)
/* i is the processor id */
begin
begin
t = (6) I + ®2) m o d m e
e ~ , = i * Se*me/p e ~ , -- if+ 1)*Se*me/p - 1
forr = 0tom,-
match = match + 1 else if D ' t [ e 2 ] [ r ± 1] = I then
approx = approx + 1
t = (e - e',,J /5e r = x *cose
else miss = miss + I
+ y *sine
end
update/append slhf[t][r].lines
scord[E) 7] = match + approx/2- miss
end end
end end
Figure 3 pixel is an n x n array containing the test shape, slhf contains
line segments
134
I do
if D,[t][r] = I then if LY~[®2][r] = I then
forx = O t o n - l d o for y = O t o n - l d o if p i x e l [ x ] [ y ] = 1 then for e = eJ,~. to ei,~,~ step 6e do begin
Figure 5
The array
sol"
Microprocessors and Microsystems Volume 19 Number 3 April 1995
.~core i
contains the matching score for the /th proces-
SHARP: a shape recognition system: C P Ravikumar and R Sethi
Parallel a l g o r i t h m for a d d i n g
scores
The scores computed at the end of partialmatch are added to find the total score. Since the addition operator is associative, a binary-tree technique 9 has been applied to add the matching scores in Iog2p communication steps. The final matching score will be available with the processor labelled (p 1). The detection of peaks in the total score is carried out by processor p - 1. The algorithm to compute the total score is given in Figure 6. The computations involved in this procedure have O(m~*log~ p) complexity. The communication overhead involved in this step is indicated by t(:,,rnm, which is the interprocessor communication time for transferring m0 real values. Finally, finding a peak in the matching score takes m~ operations in processor p - 1. Thus the overall time complexity of the procedure is O(m~l*log~ p ÷ m~ + t~.......), i.e. O(m~*log~ p + t~. . . . ). Speedup
of t h e S H A R P a l g o r i t h m
The total parallel time taken by the SHARP algorithm, TSHARP, is given by Equation (1). TSHA~
O n2
+ O(m~log2p) + t~....... The time taken by the sequential algorithm for shape recognition, Ts~, is given by Equation (2).
TsE~2-- O(n 2Im~) + O(m~ m~) + O(Nm~m~ 2)
(2)
Thus the speedup of the SHARP algorithm is given by
Speedup-
TSE(,) TsuA,~ ~ p
Since the overhead O(m~*log~p)+ t(-...... << TSHARP, the speedup expected is p. EXPERIMENTAL
RESULTS
algorithm was tested using test reference shapes $1, $2 and $3 (Figure 7) and test images T1, T2, T3 (Figure 8). The test object T1 consists of multiple shapes, i.e. instances of shapes $1 and $2. Test shapes T2 and T3 are rotated instances of shapes $2 and $3 respectively. The reference signatures were computed using the following parameters: 1. 2. 3. 4. 5. 6.
Shape size, n - : 256 0~<0~<~-363 ~ r ~< 363 ~r~ = 5" i.e. mc~ - 37 ~r = 1 i.e. m~ -- 727 Line length threshold = 2.0
SHARP implementation
The SHARP program along with a copy of the test shape is loaded in all the processors running in parallel. The discretized [0...m~] space is divided equally into a number of regions. Each region is associated with a processor, i.e. the /th processor performs the computation meant for the/th region. The important data structures used in the SHARP algorithm are: • the pixel array to store the test image; • the slht ~array to store line segments (s, e, /) obtained on applying the Hough transform for i* (~mo/p <~ 0 <~ (i + 1) *3~m~mjp (region i for the /th processor), where s, e and / are the starting point, end point and length of the line segment respectively; • the accumulator array A i obtained after line-length thresholding for the i-th region; • the test signature array, namely D' or D~, computed from accumulator array W, corresponding to the /th region; • the reference shape signature array Dr; • the scord array to store the matching score for the /th processor.
The SHARP algorithm has been implemented in C language on a Meiko transputer ~° of 32 nodes with a SunSPARC system as the host under the Unix environment. The core of the parallel algorithm requires about 700 lines of code. The entire software, including relevant I/O operations and options for generating and transforming reference images, etc., uses about 2000 lines of C code. The parallel procedure
participate
$1
Figure 7
in add(i)
$2
$3
Reference shapes
/* i is the processor id */
begin fork = O t o / o g # ) - 1 d o if i__ 2 ' - 1then if (i m o d 2 k+l = 2 ' - 1) tl}en send S c o r # to processor i + 2 k else if (i m o d 2 *+~ = 2 *+t - 1) then begin receive S c o r e from processor i - 2 k update local S c o r # end T1
end
Figure 6
Procedure to add partial scores available in each processor
Figure 8
T2
T3
Test shapes
Microprocessors and Microsystems Volume 19 Number 3 April 1995
135
SHARP: a shape recognition system: C P Ravikumar and R Sethi SloeecluD v s
The result of matching the reference shape $1 with test image T1 is shown in Figure 9a. As can be seen, a peak is obtained at 0 ~', indicating that the test object T1 contains an instance of the reference shape $1. Figure 9b shows the result of matching $2 and T2. It is clear that a peak with a score of 99.37 exists at 0 = 9 0 , indicating that the test shape contains a 90" rotated instance of the reference shape $2. When we apply the matching algorithm to two shapes that indeed do not match, the resulting score array does not show positive values. We have observed that proper thresholding is important to find out the exact orientation of the reference object in the test shape. On the other hand, if the orientation information is not crucial, the recognition algorithm is quite robust to changes in the threshold value.
I 2
+
I 4
t
I
II
15
]2
l U m e r o f p,,~,.,,~lCrO T:B v s 52 ¢ T i vw 5;I
Computational results Cornmunicet. i o n T[rne([in SecsD vs In Figure 10a, the speedup of the SHARP algorithm is plotted as a function of the number of processors p. It can be seen that the speedup obtained grows more or less linearly up to p = 16, beyond which there is a drop in speedup. This can be attributed to the domination of the overhead of
W ~ t ~ r Of pfe
4
Matching Scores Tq v l
i
$1 ~ 52
1(]0 2
WI. gOg?-
t dOGSa
94(lab
!!"lgD-
'1
+
N ~ 12 vn E~
~r ir m
m ¢
T3 vm E ]
Figure 10 a, Variation of speedupwith number of processors;b, effect of communication with number of processors
6O It? ~e
B~
lid|
u
o
ul
i
u
2s
u
n
i u l l
SD
u
i
;
uu
i
u
~S
u
i
,u
u
dOD
i
;
12S
u
i i i
i
1S0
u
r
ul
n
,iris
Thets
a Metching
Scores
1 2 vm B 2
mM-
N-
interprocessor communication. We also notice that the speedup obtained when matching the test image T2 with the reference shape $2 is larger when compared to the case of T3 and $3. This is because the image T2 is more complex than T3, and computation of the SLHT is more time consuming. As a result, the computation time is relatively larger in comparison to the communication overhead. The interprocessor communication time (in seconds) is plotted in Figure 10b. In the present implementation, it is clear that p = 16 delivers maximum speedup for the same problem size. It was observed that speedup increases if the size of problem is increased by decreasing 6+j for higher angular resolution.
E-
Issues in parallel i m p l e m e n t a t i o n 74
0
l U , l l , l l U , , l l l U l , l n l l l l n l l l U U l l l U r l 25 50 75 tla 125 15D 17S
b Figure 9 a, Matching score for $1 vs T1; b, scoreof matching $2 with T2
136
The chief factors which affect the speedup of the parallel algorithm are interprocessor communication time, load balancing among processors, and the problem size. These are addressed below, along with the experimental results.
Microprocessors and Microsystems Volume 19 Number 3 April 1995
SHARP: a shape recognition system: C P Ravikumar and R Sethi
Communication
Problem size
The CS tools software supported by Meiko t° provides a way for configuring the processor interconnection structure by programming the communication links of the transputers. Since the SHARP algorithm requires a binary tree kind of interprocessor communication for summing the partial scores, the experiments have also been carried out by defining communication links appropriately. We also used two other interconnection schemes, namely:
Since the SHARP algorithm divides the () space to achieve parallelism, it can be expected that the speedup is sensitive to variation of m~. Two problem sizes were chosen to study the effect of increasing the angular resolution: (i) ~ = 5 i.e. m~ = 37 and (ii) ~ = 2 ~ i.e. m~ = 91. The results of this experiment are shown in Figure 13. It is clear that as the problem size is increased, there is increase in speedup as well. This is because (i) the problem has been distributed more equally with m~ = 91 than the first option, giving better results, (ii) more computations are involved in comparison with the communication time. Thus p must be chosen appropriately depending on the size of problem.
• •
Default, i.e. link allocation is performed automatically by the Meiko operating system Forced, i.e. proximity of the processors is specified.
The results of these are shown in Figure 11.
CONCLUSIONS
Load balancing As we have seen in the previous section, the SHARP algorithm partitions the 0 space equally into p regions. When m~ is not a multiple of p, the partition is imbalanced. Two different schemes were tested for handling this problem: (i) The remainder of the angles was distributed equally among the low-indexed processors. (ii) The remainder was distributed among high-indexed processors. The results of the experiments are shown in Figure 12. Scheme (i) gives better results, since the critical path in our algorithm includes the computations on processor p - 1. By loading the high-index processors more than the lowindex processors, we increase the length of the critical path. It is clear from our performance measurements that there is no effect on speedup when the number of processors is small, up to four. After this, the effect becomes significant. The job is distributed almost equally among the processors when there are a small number of them. When the number of processors is increased, the job will not be distributed equally; e.g. if nb -- 37 is divided among 32 processors, it would imply that five processors will carry out twice the computations than others, thus affecting the overall speedup.
In this paper, we have described a parallel algorithm for the shape recognition problem which arises in a number of applications in robotics and machine vision. The parallel algorithm uses coarse-grained parallelism and is suitable for implementation on multiprocessors; we have discussed an implementation on a transputer-based multiprocessor. Effect. 11 10 9 8 G 7 D 6 5 G 4 LO 3 2 1 0
I
1
Effect
11
I
I
I
I
2 4 8 16 Number o f p r o c e s s o r s -=-Type I --~-Type II
32
Figure 12 Effectof load balancing with numberof processors
Commun i cat. i o n E t l " e c t T2 v s 52
0
I
of Load Balancing T2 v s S2
o f P r o b I em S [ z e T3 v £ $3
12
1
2 4 8 18 Number of proce£gors -e-De-rau i t. --~-I=or-cecl -.~Bln&ry Tree
32
0I 1 +Problem
Figure 11 Cornmunicationeffect with number of processors
2 4 8 16 Number o f p r o c e g ~ o r - s Size I -~-Problem Size
32 II
Figure 13 Effectof problem size on speedupwith number of processors
Microprocessors and Microsystems Volume 19 Number 3 April 1995
137
S H A R P : a s h a p e r e c o g n i t i o n system: C P R a v i k u m a r and R Sethi
The shape recognition algorithm uses a signature-based technique; the signature of the test image is computed and matched with the precomputed signature of the reference shape. The matching procedure considers any possible rotation that the reference shape may have undergone in the test image. The signatures are based on the straight-line Hough transform and are robust for changes in the orientation, translation and scaling of the reference shape. The data parallelism in the signature-based algorithm has been fully utilized in developing the parallel algorithm. In many image processing algorithms, the image can be partitioned into a number of regions and algorithms can be applied on these regions concurrently. In order to parallelize shape recognition algorithm, the 0 space, rather than the image space, has been partitioned across processors. The performance analysis of our parallel algorithm has been presented using both theoretical and implementation techniques. Implementation issues such as load balancing, memory requirement of processing nodes, and interprocessor communication have been discussed for effective use of processors.
6
7
8 9 10 11
mentation of the Hough transformation for straight line detection' Part. Recogn. Vol 22 No 6 (1989) pp697 706 Casasent, D and Krishanapuram, R 'Curved object location by Hough transformations and inversions' Patt. Recogn. Vol 2(1 No 2 (1987) pp 181-188 Pao, D C W, Li, H F and layakumar, R 'Shapes recognition using the straight line Hough transform: theory and generalization' IEEE Trans. Patt. Anal. Machine Intell. Vol 14 No 11 (19921 pp 1076-1089 Mckenzie, D S and Protheroe, S R 'Curve description using the inverse Hough transform' Part. Recogn. Vol 23 No 3 and 4 (19901 pp283 290 Hwang, K and Briggs, F A Computer Architecture and Parallel Processing, McGraw-Hill, New York (1985) 'Communicating sequential tools' in Meiko User's Manual Meiko Limited (1989) Ravikumar, C P and Gandhi, R K 'Parallel algorithms for edge detection in intensity images' Technical Report, Department of Electrical Engi neering, liT Delhi (1992)
REFERENCES 1 Ballard, D H and Brown, C M Computer Vision, Prentice-Hall, Englewood Cliffs, N] (19821 2 Davies, E R Machine Vision: Theory, Algorithms, Practicalities Academic Press, London (1990) 3 Duda, R O and Hart, P E 'Use of the Hough transformation to detect lines and curves in pictures' Comm. ACM Vol 15 No 1 (1972) pp 11-15 4 IIlingworth, J and Kittler, J 'A survey of the Hough transform' Comp. Vis. Graph. Image Proc. Vol 44 No 1 (1988) pp 87-116 5 Li, H F, Pao, D and Jayakumar, R 'Improvements and systolic imple-
138
M i c r o p r o c e s s o r s and M i c r o s y s t e m s V o l u m e 19 N u m b e r 3 April 1995
C P Ravikumar obtained his Bachelor's degree in Electronics from Bangalore Universily (1983), Master's degree in Computer Science from the Indian Institute of Science (1987), and Ph.D. in Computer Engineering from the University of Southern California (1991). Since 1991, he has been in the Electrical Engineering Department of liT Delhi as an Assistant Professor. His research interests include parallel processing, CAD for VESI, VLSI design, and heuristic search.
R Sethi obtained an M.Tech. degree in Computer Applications from the Department of Mathematics, liT Delhi in 1993. He is currently a senior programmer at the National Informatics Center, New Delhi, India. His areas of interest are parallel processing, image processing, and databases.