Pyramidal thinning algorithm for SIMD parallel machines


Pattern Recognition, Vol. 28, No. 12, pp. 1993-2000, 1995. Elsevier Science Ltd. Copyright © 1995 Pattern Recognition Society. Printed in Great Britain. All rights reserved. 0031-3203/95 $9.50 + .00



0031-3203(95)00037-2

PYRAMIDAL THINNING ALGORITHM FOR SIMD PARALLEL MACHINES

STÉPHANE UBÉDA
TSI-CNRS URA n°842, Université Jean-Monnet, 23 Rue Docteur Michelon, 42023 Saint-Etienne, France

(Received 8 July 1994; in revised form 22 February 1995; received for publication 17 March 1995)

Abstract - We propose a parallel thinning algorithm for binary pictures. Given an N × N binary image including an object, our algorithm computes the skeleton of the object in O(N²) operations, using a pyramidal decomposition of the picture. The behaviour of this algorithm is studied on a family of digitizations of the same object at different levels of resolution. On the Exclusive Read Exclusive Write (EREW) Parallel Random Access Machine (PRAM), our algorithm runs in O(log N) time using O(N²/log N) processors and is work-optimal. The same result is obtained on high-connectivity distributed memory SIMD machines such as the hypercube and the pyramid. We describe the basic operator, the pyramidal algorithm and some experimental results on the SIMD MasPar parallel machine.

Keywords: Thinning; Pyramidal algorithm; Parallel complexity

1. INTRODUCTION

The object of this paper is the thinning of an object in a binary image. This transformation is a derivative of the medial axis, described by Blum for the continuous plane.(1) It can be defined with the help of the grass fire concept, where the object is seen as a meadow. A fire is lit along its contour such that all fire fronts invade the object at the same speed. The medial axis of the object is the set of points reached by more than one fire front at the same time. Giving a simple and rigorous definition of a skeleton, so as to design a perfect thinning process, is a challenging problem.(2) Extracting a skeleton from an N × N binary picture consists of removing, at each iteration, all the contour points except those belonging to the "axis" of objects, and proceeding until no more change occurs. In recent years, there have been many developments on thinning. Iterative algorithms(3) (scanning the image until no change occurs) and direct algorithms (using a fixed number of image scans) have been studied.(4) Gray-scale image thinning,(5) as well as parameter extraction from the skeleton,(6) have been investigated. Some important surveys of thinning methodologies can be found.(7,8)

Iterative thinning algorithms fall into two classes according to the way contour points are detected. Area-based algorithms check each pixel at every iteration [i.e. O(N²) accesses to the image]. Contour-based algorithms instead perform, at each iteration, a contour tracing(9) and check the entire contour [i.e. O(N) operations]. The number of iterations a thinning algorithm has to perform is proportional to the maximum thickness of the object, which is a function of the resolution of the digitization. The effective complexities of these two thinning approaches are respectively O(N³) and O(N²). However, area-based algorithms use parallel operations, while contour-based algorithms are made of strongly sequential operations. The goal of this paper is to design a new thinning algorithm preserving the regularity of area-based algorithms, while having no higher time complexity than contour-based algorithms. We also give the parallel complexity of our new algorithm for SIMD abstract machines.

1.1. PRAM and distributed memory models

From a theoretical point of view, many abstract models of parallel machines exist in the literature, and the PRAM is by far the most preferred model for describing parallel algorithms.(10) A PRAM is viewed as a collection of processors working synchronously on a single instruction flow that comes from a control unit. Processors have random access to a shared memory, through which inter-processor communication is implemented. The number of processors p is usually defined as a function of the input size of the problem, say n. The time complexity, say O(f(n)), of a parallel algorithm is measured in terms of the number f(n) of parallel instructions performed. PRAM algorithms are said to be work-efficient (respectively, work-optimal) if p·f(n) = O(t), where t is the time complexity of the best known sequential algorithm (respectively, the sequential lower bound) for the problem. Since processors are allowed to randomly access the shared memory, conflicts can occur while reading from or writing into memory positions. Different protocols exist for ruling memory access, and each one is the basis of a specific PRAM model. Throughout this paper we shall use the weakest PRAM model, i.e. the EREW PRAM, where concurrent accesses to the same memory position are disallowed in both read and write modes.


Because of technological constraints, concurrent access to all memory positions still seems difficult to support in practice when the number of processors becomes very large. Several distributed memory parallel computers (DMPC for short) have been conceived and produced, each DMPC being characterized by its interconnection topology. Parallel algorithms for DMPC cannot be well evaluated using the PRAM model, and SIMD distributed memory models of parallel computers have been introduced.(11) Here we are interested in the SIMD hypercube and the SIMD pyramid. A hypercube is composed of N = 2^d processing elements connected in a hypercube topology. A (two-dimensional) pyramid consists of layers of meshes with additional links between the layers: the base is an n' × n' mesh (n'² processors), the next layer a mesh of n'²/4 processors, then a layer of n'²/16, and so on up to the apex (size 1), involving (4n'² − 1)/3 processing elements.(12) The first interconnection network is used because it is one of the most widely used networks, and the second is introduced because it is well fitted to our new thinning algorithm.

1.2. New results

After a short survey, in Section 2, of existing parallel thinning algorithms, in Section 3 we present a new parallel thinning operator that works on a 2 × 2 block of pixels. In Section 3.1, we show that the application of this operator preserves the connectivity of both the object and the background. In Section 3.2, we design a new parallel thinning algorithm preserving regularity while reducing complexity through a pyramidal decomposition of the picture; the resulting skeletons and execution times for both the pyramidal algorithm and a standard neighbouring algorithm are presented in Section 3.3. In Section 4 we analyse the complexity according to abstract models of parallel machines. We close the paper with some concluding remarks and directions for further research.

2. PARALLEL THINNING ALGORITHMS

A thinning algorithm takes as input an N × N binary image which contains one object (i.e. a single connected component of black pixels), and produces an N × N image containing the skeleton of the original object. It is known that the skeleton produced by a thinning algorithm should satisfy a number of conditions(13) (listed below). Preserving these properties while using parallel processing is a challenging problem. In the following sections we review first the characterization of the skeleton and second the design of parallel thinning algorithms.

2.1. Characterization of a skeleton

A digital skeleton is characterized by its topological properties.(2) In this section we simply enumerate those properties.

Be homotopic. It is a basic requirement that any thinning algorithm preserve the connectivity of both the object and the background.

Have branches representative of elongations. Skeleton branches must reflect the elongations of the original object. No formal definition of elongation could be found; it is usually expressed through the end points of the object, where an end point is a pixel having a single neighbour in the object.

Have some noise immunity. Since elongations are not formally defined, this condition can hardly avoid distortions of the skeleton: undesirable branches appear due to noise on the contour.

Be 1-pixel thick. A skeleton is a set of curves. In order to use these curves to their best, they must be as thin as possible.

Be isotropic. A shape descriptor has to be invariant with respect to the position of the object in the image, which means that thinning must be invariant under translation and rotation. Notice that this condition cannot fully hold in the digital plane, since rotations are no longer invariant transformations there.

2.2. Parallel area-based algorithms

With the development of graphic workstations, the time needed to process larger and larger pictures is often prohibitive. Parallelism makes it possible to decrease processing time, and both parallel area-based and parallel contour-based algorithms have been proposed.(7) This section describes four methods to design parallel area-based thinning algorithms. First of all, a parallel thinning operator must be defined. To be a parallel operator, the survival condition of a pixel at iteration k must be computed only from the values of its neighbouring pixels at iteration k − 1. The survival condition is usually computed with respect to the neighbours at distance 1 in the eight main directions of the picture (the eight-neighbourhood). Such an operator is called a 3 × 3 operator, after the size of the scanning window (or mask).
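As a small illustration of this locality, end points (object pixels with exactly one object neighbour) can be detected for all pixels independently, using only the values from the previous iteration. A minimal Python sketch (the helper name is ours, not from the paper):

```python
def end_points(img):
    """Return the end points of a binary image: object pixels having
    exactly one object pixel among their eight neighbours. Each test
    uses only the previous iteration's values, so all pixels could be
    examined in parallel (here simulated sequentially)."""
    h, w = len(img), len(img[0])

    def n8(y, x):
        # number of object pixels in the eight-neighbourhood
        return sum(img[y + dy][x + dx]
                   for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                   if (dy, dx) != (0, 0)
                   and 0 <= y + dy < h and 0 <= x + dx < w)

    return [(y, x) for y in range(h) for x in range(w)
            if img[y][x] and n8(y, x) == 1]
```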
Unfortunately, fully parallel 3 × 3 thinning operators have difficulties in preserving the connectivity of an image. The pixels A, B and C in Fig. 1 have congruent eight-neighbourhoods, so a parallel 3 × 3 operator cannot distinguish pixel A from pixels B and C. Suppose a skeleton pixel is defined only as a pixel needed for connectivity preservation; the deletion of such pixels in Fig. 1 by a parallel thinning operator splits the original object into non-connected sets. A first solution is to extend the window dimension, using a 4 × 4 thinning window. A second solution is to partially serialize the algorithm by breaking each iteration into distinct subiterations, each using a different operator or working on a distinct subfield of the original picture.


Fig. 1. Possible loss of connectivity with a parallel 3 x 3 operator.

V = 11, E = 12, F = 1, G = V + F − E = 0

One object (one 8-connected component of object pixels) and one hole (one 4-connected component of background pixels).

Fig. 2. Computation of the Euler number.
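The computation illustrated in Fig. 2 is easy to reproduce: G = V + F − E, where V counts object pixels, E counts horizontally or vertically adjacent pairs of object pixels, and F counts complete 2 × 2 object blocks. A small Python sketch (our own helper, taking lists of 0/1 rows):

```python
def euler_number(img):
    """Euler number G = V + F - E of a binary image given as 0/1 rows:
    V object pixels, E horizontally/vertically adjacent object pairs,
    F complete 2x2 object blocks."""
    h, w = len(img), len(img[0])
    V = sum(img[y][x] for y in range(h) for x in range(w))
    E = (sum(img[y][x] and img[y][x + 1]            # horizontal pairs
             for y in range(h) for x in range(w - 1))
         + sum(img[y][x] and img[y + 1][x]          # vertical pairs
               for y in range(h - 1) for x in range(w)))
    F = sum(img[y][x] and img[y][x + 1] and img[y + 1][x] and img[y + 1][x + 1]
            for y in range(h - 1) for x in range(w - 1))
    return V + F - E
```

For a ring of eight pixels (one object, one hole) this gives G = 0, matching the object/hole count of Fig. 2.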

To define (sub)operators, a common solution is to introduce a directional bias, for example, by favoring north over south and west over east. Distortion is minimized by introducing subiterations that differ only in the directions of the bias. A systematic approach exists to generate all the compatible couples of suboperators.(14)
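To make the subiteration idea concrete, here is the classic two-subiteration scheme of Zhang and Suen (a different published algorithm, not the operator of this paper and not in its reference list): both passes share the neighbour-count and transition-count tests and differ only in which compass directions are favoured.

```python
def zhang_suen(img):
    """Two-subiteration thinning (Zhang & Suen). img is a list of 0/1
    rows with a zero border; the two subiterations differ only in their
    directional bias."""
    img = [row[:] for row in img]
    h, w = len(img), len(img[0])

    def step(first):
        deleted = []
        for y in range(1, h - 1):
            for x in range(1, w - 1):
                if not img[y][x]:
                    continue
                # p2..p9: the eight neighbours, clockwise from north
                p = [img[y - 1][x], img[y - 1][x + 1], img[y][x + 1],
                     img[y + 1][x + 1], img[y + 1][x], img[y + 1][x - 1],
                     img[y][x - 1], img[y - 1][x - 1]]
                b = sum(p)                                 # object neighbours
                a = sum(p[i] == 0 and p[(i + 1) % 8] == 1  # 0->1 transitions
                        for i in range(8))
                if first:   # bias favouring one side of the contour
                    ok = p[0] * p[2] * p[4] == 0 and p[2] * p[4] * p[6] == 0
                else:       # mirrored bias for the second subiteration
                    ok = p[0] * p[2] * p[6] == 0 and p[0] * p[4] * p[6] == 0
                if 2 <= b <= 6 and a == 1 and ok:
                    deleted.append((y, x))
        for y, x in deleted:
            img[y][x] = 0
        return bool(deleted)

    # run both subiterations each pass until neither deletes a pixel
    while step(True) | step(False):
        pass
    return img
```

The `|` (rather than `or`) ensures both subiterations run on every pass; distortion stays small because the two biases are mirror images of each other.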

The third method is to apply the same operator on two different subfields, partitioning the original image like a chessboard.(14) Olszewski suggests a solution using four subfields.(15) The last parallelization technique is based on a recoding of the image.(16) During a preprocessing stage, each pixel of the picture is labeled with its shortest distance to the background of the image. The thinning operator maintains this information and avoids loss of connectivity (without introducing subiterations).

3. PYRAMIDAL THINNING ALGORITHM

An iterative algorithm is considered to be pyramidal if it reduces the amount of data considered by a constant factor at each iteration. We are interested in a thinning algorithm taking as input an N × N binary image at a given iteration and producing an N/2 × N/2 image as a result. In the first subsection we define our new thinning operator. In the second subsection we illustrate how it can be used to build a pyramidal algorithm, and the last subsection presents some experimental results.

3.1. Basic operator

A thinning operator computes, for each object pixel of the image at iteration k, a Boolean value which decides whether the considered pixel will be removed in the picture at iteration k + 1. The main requirement of thinning is homotopy preservation. Olszewski has proposed an elegant solution to the homotopy requirement based on the Euler number.(15) The Euler number of a picture is the number of object components minus the number of holes (i.e. background-connected components which are not connected to the border of the image). This global parameter of the picture can be obtained by counting local configurations. In fact, the Euler number G of an image is G = V + F − E, where V is the number of object pixels, E the number of horizontally or vertically adjacent pairs of object pixels, and F the number of complete 2 × 2 blocks within the object (see Fig. 2). An operator which does not modify this parameter (i.e. ΔG = 0) is a good candidate for thinning. The thinning operator defined by Olszewski takes the value of a pixel and its eight neighbours in the picture at iteration k and verifies whether the removal of this pixel modifies G. We use this idea but extend the size of the operator: our new operator takes as input a 2 × 2 block of pixels from the image and computes the thinning condition of the full block. The operator takes as input a block and the 12 pixels adjacent to some pixel of the block, and computes the variation of the Euler number if the block is removed. A block can be deleted if its removal does not change the Euler number of the picture. We consider the variation ΔG of the Euler number with four-connected background pixels. We have ΔG = ΔV + ΔF − ΔE, where ΔV (respectively, ΔF and ΔE) is the number of background pixels (respectively, the number of 2 × 2 blocks of background pixels and the number of horizontally or vertically adjacent pairs of background pixels) added to the image by the removal of the block. The computation of ΔG is split into two parts: ΔG = ΔG1 + ΔG2. ΔG1 corresponds to the effect


Fig. 3. ΔG1 for the five possible block configurations (ΔG1 = −1 for the diagonal pair of object pixels, ΔG1 = 0 for the four other configurations).

Fig. 4. Three examples of the computation of ΔG.

of the block's removal as if it were isolated in the image. Notice that ΔG1 = 0, except when the block corresponds to a pair of diagonal object pixels (see Fig. 3). ΔG2 takes care of the interaction between the removed block and its neighbourhood. We decompose ΔG2 into ΔV2, ΔF2 and ΔE2: ΔG2 = ΔV2 + ΔF2 − ΔE2 (see Fig. 4 for some cases).

• ΔV2 = 0: no pixel outside the block is removed.
• ΔF2 is the sum of ΔF2.1, the number of horizontal (or vertical) pairs of adjacent background pixels in the neighbourhood, excluding corners, which are four-adjacent to at least one object pixel of the block (at most four such pairs exist, one along each side of the 2 × 2 block), and ΔF2.2, the number of corner groups composed of three background pixels in a right angle adjacent to an object pixel in the corresponding corner of the block (at most four such corner groups exist).
• ΔE2 is the number of background pixels in the neighbourhood which are four-adjacent to an object pixel of the block.

Let us show how this operator can be used to design a pyramidal thinning algorithm.

3.2. Pyramidal decomposition

If pixel coordinates take values in 0 … N − 1, every considered 2 × 2 block has its top left pixel with both coordinates even. The union of these (disjoint) blocks is the original image. Fig. 3 shows the five possible block configurations.
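Up to the block's internal pattern, ΔG1 can be tabulated directly. A small Python sketch, with our own encoding of a block as a 4-tuple (top-left, top-right, bottom-left, bottom-right) and the values read off Fig. 3:

```python
def delta_g1(block):
    """Delta-G1 for an isolated 2x2 block, per Fig. 3: -1 when the block
    is a pair of diagonal object pixels, 0 for every other configuration.
    block = (top_left, top_right, bottom_left, bottom_right) in {0, 1}."""
    return -1 if block in ((1, 0, 0, 1), (0, 1, 1, 0)) else 0

# Up to rotation and reflection, the non-empty patterns fall into the five
# configurations of Fig. 3: single pixel, adjacent pair, diagonal pair,
# three pixels, and the full block.
```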

An N × N binary picture may be considered as an N/2 × N/2 array of blocks. Each block can take a value in [0, 15], where 0 corresponds to the "empty" block of four background pixels. Our block thinning operator removes all contour blocks of the object except those belonging to an elongation. All blocks valued in [1, 14] are contour blocks, and some 15-valued blocks may be contour blocks as well. Suppose all contour blocks are removed at the first iteration; the result is then an N/2 × N/2 binary picture, enabling a further reduction. We iterate until some block valued in [1, 14] remains in the picture (Fig. 5). Suppose this occurs at iteration k: we have an N/2^(k+1) × N/2^(k+1) array I_k constituted of blocks. Let Q denote the set of blocks of I_k whose values belong to [1, 14]. To force a binary image (i.e. to reduce the set Q), any block of Q may be replaced by a 15-valued block as long as the topological conditions are preserved. If Q reduces to the empty set, the iteration can continue. Suppose that at iteration k no more reduction is possible, leaving an N/2^k × N/2^k binary image (an N/2^(k+1) × N/2^(k+1) block image). Some final thinning can still be performed to improve the thickness of the resulting skeleton. The sequence of pictures of decreasing size obtained during the first k iterations can be viewed as a pyramidal data structure. The number of processing elements of this truncated pyramidal structure is bounded by 4N²/3, and fewer than log N iterations are required to process it. Let N* be the width of the image at the lowest resolution reached. We want to suggest that N* is


N × N binary picture → N/2 × N/2 [0, 15] image → N/2 × N/2 binary picture (one iteration)

Fig. 5. An iteration of the pyramidal algorithm.
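One iteration of Fig. 5 is straightforward to sketch: pack the binary picture into block values, apply the block operator (elided here), then map the surviving blocks back to a half-size binary picture. A Python sketch with a deliberately simplified mapping (keeping only full, 15-valued blocks; the real algorithm also promotes some [1, 14]-valued blocks to 15 when topology allows):

```python
def pack_blocks(img):
    """Pack an N x N binary image (N even) into an N/2 x N/2 array of
    block values in [0, 15]; bit order: top-left, top-right,
    bottom-left, bottom-right."""
    n = len(img)
    return [[img[2 * y][2 * x]
             | (img[2 * y][2 * x + 1] << 1)
             | (img[2 * y + 1][2 * x] << 2)
             | (img[2 * y + 1][2 * x + 1] << 3)
             for x in range(n // 2)] for y in range(n // 2)]

def blocks_to_binary(blocks):
    """Simplified reduction rule: a block survives as an object pixel of
    the half-size image only if it is the full block (value 15)."""
    return [[1 if v == 15 else 0 for v in row] for row in blocks]
```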

Fig. 6. The "strait", a local configuration prohibiting scale reduction.

shape dependent, but independent of the initial resolution. Figure 6 presents a critical situation which may prevent further scale reduction. This situation is characterized by the "strait" present in the continuous picture. The ratio of the geometrical dimension of this strait to the average thickness (or width) of the object is directly related to N*. This magnitude, a dimensionless number, is a characteristic feature of the object. Suppose that the initial resolution N produces a situation where the original strait is represented by four non-adjacent pixels belonging to two adjacent blocks, as in Fig. 6. These two blocks cannot be removed, since they represent extremities of the object, nor can they be replaced by full object-pixel blocks. In this case, N* = N. Now suppose that the initial resolution is such that N is twice N*; then one step of scale reduction is possible before a similar situation is reached. More generally, if N = 2^k N* then k steps of scale reduction are possible, which shows that N* is independent of the initial resolution. As a result, the number of iterations is proportional to log N. The post-processing part, where thinning is performed at the final scale, requires another O((N*)³)

operations, which is independent of N.

3.3. Experimental results

We now compare our pyramidal algorithm and Olszewski's algorithm on a SIMD machine, namely the MasPar MP-1 (1,024 processors connected as a two-dimensional mesh). Figure 7 presents both skeletons for a 128 × 128 Chinese ideogram. Both skeletons are representative of the shape of the original object, and both preserve the extremities of the elongated parts of the original shape. The two skeletons differ in the preservation of symmetries. Olszewski's skeleton is closer to the medial axis of the original object, while pyramidal skeletons are made of short broken lines. This is due to the projection of the skeleton obtained at the highest level of reduction back into the original picture: a single pixel of the skeleton at the highest reduction level becomes a short line in the original image, so the resulting skeleton cannot be as close to the medial axis as the Olszewski skeleton. However, for most pattern analysis post-processing, the obtained pyramidal skeletons are meaningful enough. Both algorithms seem to be immune to noise, as shown by our example set of varied sizes and shapes. To sum up, we can state that


Fig. 7. Resulting skeletons: area-based algorithm and pyramidal algorithm.

pyramidal algorithms create skeletons as good as those of the Olszewski algorithm, except for a slight loss of symmetries. Efficiency is compared according to two criteria: the number of iterations and the actual execution time. The number of iterations is a standard efficiency measure of a thinning algorithm because it is machine-independent and implementation-independent. Figure 8 shows that the number of iterations of the pyramidal algorithm is a logarithmic function of the dimension of the input image. This is due to the pyramidal data reduction and is detailed in the next section. Figure 9 shows the execution times of our algorithm compared with those of the Olszewski algorithm.

Fig. 8. Number of iterations for each algorithm as a function of log N.

4. COMPLEXITY EVALUATION

Two main families of thinning algorithms exist: area-based methods, which apply a thinning operator all over the picture at each iteration, and contour-based methods, which apply a thinning operator only to the contour at each iteration. The complexity evaluation of thinning algorithms consists of evaluating the number of times the thinning operator is applied. The number of iterations needed to obtain a skeleton can be expressed as a function of the image size. At each iteration, all contour pixels are erased or labeled "skeleton" by application of the thinning operator. The number of such shrinking steps needed until the object is completely removed or labeled is proportional to the thickness of the object, and thickness is a function of the digitization rate. Thus, the number of iterations of a thinning algorithm is O(N), where N is the width of the image. Area-based algorithms perform N² applications of the operator per iteration, for a complexity of O(N³). Contour-based algorithms operate on the current contour at each iteration, invading the object until all object pixels have been processed. The complexity of such algorithms is equal to the area of the object, i.e. O(N²).

Our new thinning algorithm is similar to area-based algorithms, since the operator is applied all over the picture at each iteration. However, at each iteration, the size of the picture is reduced by a factor of 2, and such a reduction can occur until only one pixel remains. O(log N) iterations and O(N²) operations are necessary to complete the process. This


Fig. 9. Execution times for various dimensions of a Chinese ideogram.

sequential complexity is the same as for the contour-based method. However, the pyramidal algorithm is made of parallel applications of a local operator. Therefore, optimal parallel complexity can be expected.

4.1. On the (EREW) PRAM model

Let us consider the three presented algorithms as implemented on an EREW PRAM machine. Each iteration of an area-based thinning algorithm can be done by N² processors in constant time; thus, a parallel area-based algorithm can produce a skeleton in O(N) time using N² processors. There exists a parallel contour-based algorithm which produces a skeleton in O(N) time using O(N) processors.(17) In the pyramidal thinning algorithm, each iteration may be split into two independent parts made of parallel operators: application of the thinning operator, and reduction of the scale. Taking N²/log N processors, the first iteration may be achieved in O(log N) time, the second in O(log N/4), and the i-th iteration in O(log N/4^i). Supposing the pyramidal decomposition of the image is carried out until a single pixel remains, this is done in

Σ_{i=0}^{log N} O(log N / 4^i) = O(log N).

We conclude that a pyramidal thinning algorithm may compute the skeleton of an N-sized object in O(log N) time with N²/log N processors on an (EREW) PRAM.

4.2. On the hypercube model

Contour-based thinning algorithms cannot be directly used on a distributed architecture.(17) Using a grid of N² processors, contour-based algorithms and parallel area-based algorithms become almost identical: they both perform in O(N) time.(18) As for area-based methods, pyramidal algorithms are made of local operations. However, as suggested by the section title, a mesh interconnection network is not sufficient to exhibit maximal parallelism. We consider a hypercube network as well as its embedded two-dimensional mesh.(19) The number of processors is fixed at N²/log N. The algorithm runs as described for the PRAM until there remains a single pixel in each processor; this occurs just before a data reduction stage, at the end of the (log₄ log N)-th iteration. After the subsequent reduction stage, only a quarter of the processors are in charge of a pixel. To apply the thinning operator, the neighbours of this pixel are needed, and these neighbours lie in processors at distance 2 in the mesh. At each further iteration this distance increases by a factor of 2, preventing the computation from terminating in O(log N) time. One solution is to concentrate the image in the upper left quarter of the mesh; this can be done in two steps: first the concentration is made within every column of the mesh, and then similarly within the rows (Fig. 10). Let us consider a mesh embedded in a hypercube where columns and rows appear as subcubes.(19) Each step of the concentration operation can be done in O(log M) if M is the size of a row or a column.(20) The upper left quarter of the mesh is then also a subcube of the hypercube. The initial size of a row (or column) is less than N, and it decreases by a factor of 2 at each iteration, so the sum of the concentration operations over the entire algorithm is O(log N). The algorithm can thus be performed in O(log N) time by combining the two concentration steps in each iteration after the (log₄ log N)-th.
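The two-step concentration can be sketched on a plain array (a hedged sequential stand-in for the subcube routing; the function name is ours). After a reduction stage, live data sit at even rows and even columns; concentrating within the columns and then within the rows gathers them into the upper left quarter:

```python
def concentrate(grid):
    """Two-step concentration: data held at even rows/columns of an n x n
    grid are gathered into the upper left n/2 x n/2 quarter, first moving
    within every column, then within every row."""
    n = len(grid)
    # step 1: in every column, move entries from even rows to rows 0..n/2-1
    tmp = [[0] * n for _ in range(n)]
    for x in range(n):
        for y in range(0, n, 2):
            tmp[y // 2][x] = grid[y][x]
    # step 2: in every row, move entries from even columns to columns 0..n/2-1
    out = [[0] * n for _ in range(n)]
    for y in range(n):
        for x in range(0, n, 2):
            out[y][x // 2] = tmp[y][x]
    return out
```

On the hypercube, each of the two gather steps is a data concentration within a subcube, hence the O(log M) cost per step cited above.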

5. CONCLUSIONS

We present a new thinning algorithm for parallel machines whose optimality is based on theoretical considerations and justified by practical applications. This algorithm produces skeletons in accordance with standard topological criteria. However, the main progress over previously published algorithms concerns the parallel complexity. Like the best contour-based algorithms, it has a sequential cost of O(N²), but it uses a fully parallel


Fig. 10. Concentration algorithm in two steps.

operator like the best area-based algorithms. Area-based methods make O(N) iterations, while our new algorithm makes O(log N) iterations thanks to its pyramidal structure. We prove that this pyramidal thinning algorithm is work-optimal on an EREW PRAM as well as on a hypercube SIMD machine. Most skeletons produced by the new method have branches that lie near, rather than exactly on, the medial axis of the shape, which introduces a loss of accuracy. However, the obtained skeleton is meaningful enough for most pattern recognition purposes; moreover, this apparent drawback can be reduced by improving the projection technique. In fact, the method produces a skeleton at multiple levels of resolution, a feature which is becoming common in modern pattern recognition applications. The experimental results show good speed for all but very small data; for large data structures it is faster than the specific assembler-code subroutine of our target machine.

REFERENCES

1. H. Blum, A transformation for extracting new descriptors of shape, in Symp. on Models for the Perception of Speech and Visual Form. MIT Press, Cambridge, Massachusetts (1964).
2. C. Ronse, A topological characterization of thinning, Theoret. Comput. Sci. 43, 31-41 (1986).
3. Z. Guo and R. Hall, Fast fully parallel thinning algorithms, Comput. Vision Graphics Image Process.: Image Understanding 55(3), 317-328 (1992).
4. P. B. Gibbons and W. Niblack, A width-independent parallel thinning algorithm, in 11th Int. Conf. on Pattern Recognition (The Hague), pp. 708-711. IEEE Press (1992).
5. S. S. Yu and W. H. Tsai, A new thinning algorithm for gray-scale images by the relaxation technique, Pattern Recognition 23(10), 1067-1076 (1990).
6. G. Sanniti di Baja, O(n) computation of projections and moments from the labeled skeleton, Comput. Vision Graphics Image Process. 49, 369-378 (1990).
7. L. Lam, S. W. Lee and C. Y. Suen, Thinning methodologies - a comprehensive survey, IEEE Trans. Pattern Anal. Mach. Intell. 14(9), 869-885 (1992).
8. Y. Y. Zhang and P. S. P. Wang, Analysis of thinning algorithms, in Int. Conf. on Pattern Recognition (The Hague), pp. 763-766. IEEE Press (1992).
9. A. M. Vossepoel, J. P. Buys and G. Koelewijn, Skeletons from chain-coded contours, in 10th Int. Conf. on Pattern Recognition (Atlantic City), pp. 70-73. IEEE Press (1990).
10. R. Karp and V. Ramachandran, A survey of parallel algorithms for shared-memory machines, Technical Report UCB/CSD 88/408, University of California, Computer Science Division (1988).
11. M. Cosnard and A. Ferreira, On the real power of loosely coupled parallel architectures, Parallel Process. Lett. 1(2), 103-111 (1991).
12. Q. F. Stout, An algorithmic comparison of meshes and pyramids, pp. 107-120. Academic Press, New York (1986).
13. T. Pavlidis, A thinning algorithm for discrete binary images, Comput. Graphics Image Process. 20, 142-157 (1980).
14. Z. Guo and R. W. Hall, Parallel thinning with two-subiteration algorithms, Comm. ACM 32(3), 359-373 (1989).
15. C. Neusius and J. Olszewski, An efficient distributed thinning algorithm, Parallel Comput. 18, 47-55 (1992).
16. A. Favre and H. J. Keller, A parallel syntactic thinning by recoding of binary pictures, Comput. Vision Graphics Image Process. 23, 99-112 (1983).
17. A. Ferreira and S. Ubéda, Ultra-fast contour tracking with application to thinning, Technical Report 87, LITH-EPFL (1993).
18. S. Ubéda, Algorithmes d'amincissement d'images sur machines parallèles, PhD thesis, Ecole Normale Supérieure de Lyon (1993).
19. J. Brandenburg and D. Scott, Minimal mesh embeddings in binary hypercubes, IEEE Trans. Comput. 37(10), 1284-1285 (1988).
20. D. Nassimi and S. Sahni, Data broadcasting in SIMD computers, IEEE Trans. Comput. 30(2), 101-107 (1981).

About the Author - STÉPHANE UBÉDA received the PhD degree in computer science from the Ecole Normale Supérieure de Lyon. Since September 1992, he has been a member of the Laboratoire de l'Informatique du Parallélisme of the ENS, Lyon. His research has been in the area of parallelism and algorithmics with applications to image processing. During 1992-1993 he was an assistant professor at the Swiss Federal Institute of Technology, Lausanne, in the Theoretical Computer Science Laboratory (LITH). He is currently an assistant professor at the Université Jean Monnet, Saint-Etienne, France, in the Signal Processing and Instrumentation Laboratory (TSI CNRS URA 842). His main scientific interests are parallel algorithms and complexity for both image processing and combinatorial optimization.