A novel line scan clustering algorithm for identifying connected components in digital images


Image and Vision Computing 21 (2003) 459–472 www.elsevier.com/locate/imavis

Yang Yang a, David Zhang b,*

a Agilent Technologies, Singapore Vision Operation Private Ltd, 1 Yishun Avenue 7, Singapore 768923
b Centre for Multimedia Signal Processing, Department of Computing, Hong Kong Polytechnic University, Kowloon, Hong Kong

Received 27 October 2000; received in revised form 15 January 2003; accepted 21 January 2003

Abstract

In this paper we present the Line-Scan Clustering (LSC) algorithm, a novel one-pass algorithm for labeling arbitrarily connected components. Currently available connected-components labeling approaches can label only 4- or 8-connected components. We overcome this limitation by introducing the new notion of n-ED-neighbors. By fully considering the particular properties of a connected component in an image and employing two data structures, the LSC algorithm is made highly efficient. On top of this, it has three more favorable features. First, it can be processed block by block, which makes it suitable for parallel processing and improves speed when multiple processors are used. Second, its applicability extends from binary images to working directly on gray images, saving the time otherwise spent on image binarization. Moreover, the LSC algorithm provides a more convenient way to employ the labeling result in later processing stages. Finally, we compare LSC with a recently published efficient connected-components labeling algorithm, demonstrating that LSC is faster. © 2003 Elsevier Science B.V. All rights reserved.

Keywords: Image processing; Connected components labeling; Connectivity; One-pass; Parallel processing

1. Introduction

Connected component labeling is one of the most common operations in virtually all image processing applications. The points in a connected component form a candidate region for representing an object. In machine vision most objects have surfaces, and points belonging to a surface project to spatially close points. The notion of "spatially close" is captured by connected components in digital images. Yet connected component algorithms often form a bottleneck in a binary vision system [7].

So far, several algorithms to label connected components in a binary image have been proposed [1-5,8-10,14]. The classical sequential connected components labeling algorithm dates back to the early days of computer vision and image processing [8,9]. The algorithm relies on two subsequent raster scans of a binary image. In the first scan a temporary label is assigned to each foreground pixel, based on the values of its already-visited neighbors (in the sense of 4-connectivity or 8-connectivity). When a foreground pixel whose foreground neighbors carry different labels is found, the labels associated with the pixels in the neighborhood are registered as equivalent. The second scan replaces each temporary label by the identifier of its corresponding equivalence class. Improvements on labeling speed have been reported in various papers [1,2,4,10,14]. For instance, Stefano and Bulgarelli [11] proposed a comparatively more efficient and much simpler algorithm that improves the efficiency of the labeling process. All these algorithms have in common that they need two passes and that they can deal only with 4- or 8-connected components in a binary image. From our real-world experience, objects in an image can be 12-connected or more. Thus, a solution for labeling arbitrarily connected components is yet to be developed.

* Corresponding author. Tel.: +852-27667271; fax: +852-27740842. E-mail address: [email protected] (D. Zhang).

In this paper, we propose a Line-Scan Clustering (LSC) algorithm to label connected components, and we derive a block-by-block LSC (BLSC) algorithm from it. The contributions of our paper

0262-8856/03/$ - see front matter © 2003 Elsevier Science B.V. All rights reserved. doi:10.1016/S0262-8856(03)00015-5


are six-fold. First, we show why and how the LSC algorithm can find arbitrarily connected components by introducing a new notion of n-ED-neighbors. Second, the LSC algorithm scans the image only once, unlike currently available connected components labeling algorithms, which scan the image twice. Third, the LSC algorithm can be processed block by block, yielding our BLSC algorithm, which is suitable for parallel processing; BLSC can improve the speed when multiple processors are used. Fourth, since conventional algorithms can deal only with binary images, they must binarize an image before labeling it. In view of this, our algorithm is designed to work directly on gray images rather than merely on binary images, so the time spent on image binarization is saved. Fifth, we show that with our algorithms it is much easier and more convenient to compute characteristics (such as size, position and bounding rectangle) of the connected components; in many applications it is desirable to compute these characteristics while performing the labeling process. Finally, we present the speed of our algorithms by means of many experiments on natural images and comparisons with a recently published efficient connected components algorithm [11], showing that our algorithms are much faster.

This paper is organized as follows: Sections 2 and 3 discuss the basic ideas of our LSC algorithm and explain the implementation of the algorithm, respectively. Then we describe a block-by-block LSC algorithm, BLSC, derived from the LSC algorithm in Section 4. In Section 5, we compare the LSC algorithm with a recently published connected components labeling algorithm and show comparison results from experiments; we also show some applications in that section. Conclusions are drawn in Section 6.

2. Definitions and properties

For explanation's sake, we first focus on how our algorithm works with binary images. In later sections we elaborate why and how the very same algorithm can work directly on gray images.

First of all, we introduce the notion of n-ED-neighbors by means of the Euclidean distance.

Definition (n-ED-neighbors). The n-ED-neighbors of a pixel P(x, y) are the pixels in the set S = {Q(i, j) | d(P, Q) ≤ n}, where d(P, Q) is the Euclidean distance between P and Q.

For example, the normal 4-neighbors correspond to 1-ED-neighbors, since the Euclidean distance between a pixel and any one of its 4-neighbors is equal to one. Similarly, the normal 8-neighbors correspond to √2-ED-neighbors. If a pixel is an n-ED-neighbor of another pixel, the two pixels are called n-ED-adjacent. Our algorithm aims to label n-ED-connected components in an image. In this paper, x represents the row coordinate and y the column coordinate in P(x, y).

We observe some particular properties of a connected component and take them into consideration in designing the algorithm. Fig. 1 shows a simulated binary image where a T indicates a target (foreground) point.

Property 1. In one row, continuous neighboring target points must belong to the same class. These points should be considered as one unit in clustering. We define such a unit as a base class. In other words, a base class C is a sequence of T-pixels (p0, p1, ..., pm), where all pi have the same row coordinate and pi is a neighbor of p(i-1) for i = 1, ..., m. For example, the first five T points in the first row of Fig. 1 constitute a base class. Thus, a base class can be described as a horizontal line with a start point S(i, js) and a terminal point E(i, je), where i, js and je indicate the row coordinate, the column coordinate of the start point, and the column coordinate of the terminal point of this base class, respectively. We consider a base class as the minimum unit in labeling.

A base class C is represented by a data structure, BASECLASS. BASECLASS is a 9-tuple consisting of the fields beginX, beginY, endY, label, count, tag, parent, leftson and rightsib. C→beginX is the row coordinate of C; C→beginY and C→endY are the column coordinates of the start and terminal points of C, respectively; C→label records the class label of C; C→count counts the number of points in C; and C→tag indicates whether node C is the root of a tree (if C is a root, C→tag is set to t, otherwise f). C→parent, C→leftson and C→rightsib indicate the parent, left son and right sibling of C, respectively.

For example, in Fig. 1, after the first row is scanned, the first base class, corresponding to the five connected T points, is formed and its information is saved in a BASECLASS-type pointer ptrBase1. The records in ptrBase1 are: ptrBase1→label = 0; ptrBase1→count = 5; ptrBase1→beginX = 1; ptrBase1→beginY = 3; ptrBase1→endY = 7; ptrBase1→tag = t; ptrBase1→leftson = null; ptrBase1→rightsib = null; ptrBase1→parent = null.
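As an illustrative sketch (not the authors' implementation), the BASECLASS 9-tuple can be rendered as a small Python class. The field names follow the paper; the constructor and its defaults are our own:

```python
class BaseClass:
    """One horizontal run of target pixels: the paper's 9-field BASECLASS."""
    def __init__(self, beginX, beginY, endY, label):
        self.beginX = beginX               # row coordinate of the run
        self.beginY = beginY               # column of the start point S
        self.endY = endY                   # column of the terminal point E
        self.label = label                 # class label
        self.count = endY - beginY + 1     # number of points in the run
        self.tag = 't'                     # 't' while this node is a class-tree root
        self.parent = None                 # class-tree links
        self.leftson = None
        self.rightsib = None

# The first base class of Fig. 1: five target points in row 1, columns 3-7.
ptrBase1 = BaseClass(beginX=1, beginY=3, endY=7, label=0)
```

A freshly created base class is its own class-tree root (tag = 't', no parent), exactly as the ptrBase1 records above show.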

Fig. 1. Base class. A base class can be described as a horizontal line with start point S(i, js) and terminal point E(i, je).

There is one base class in the second row and two base classes in the third row.

By means of the data structure BASECLASS, a class is related to a tree, which we call a class tree. A class tree may consist of more than one node, and each node is a base class.

Property 2. When a base class C(S(i, js), E(i, je)) is obtained, we determine whether it can merge with previous classes. Assuming n-ED-connectivity is considered, only those previous classes containing at least one point whose column coordinate falls into the interval [j1, j2], where j1 = js - n and j2 = je + n, need to be considered. We call this the inter-column limitation; the reason is simple, and we give its explanation in Appendix A. According to Property 2, when checking whether a base class can merge with previous classes, we do not check any previous class that does not satisfy the inter-column limitation.

Property 3. Furthermore, we do not check any previous class whose row coordinate is more than n less than the row coordinate of the current base class, because in that case the inter-row distance between this previous class and the current base class is larger than n, so the Euclidean distance between them is surely larger than n. We call this the inter-row limitation.

Property 4. Moreover, when checking whether a base class C(S(i, js), E(i, je)) can merge with previous classes, not all previous classes satisfying the inter-column limitation must be checked. Instead, for each column j ∈ [j1, j2] (j1 = js - n, j2 = je + n), only the previous class that most recently visited the jth column needs to be checked. In other words, there may be more than one previous class satisfying the inter-column limitation, but only the one that most recently visited the jth column needs checking. We call this the latest-visiting limitation. The reason is that the inter-row distance between the base class C and the previous class that most recently visited the jth column is the minimum, so the Euclidean distance between them is the minimum.

As an example (see Fig. 2), assume class c is the current base class, and classes a and b are two previous (already scanned) classes. We start by examining the second column to determine whether class c can merge with any previous class. Here classes a and b both satisfy the inter-column limitation. Do we then need to check both? No: according to Property 4, class b most recently visited the second column, hence only class b will be checked.

In order to save the information of the most recently visited class for any column, we introduce a very simple

Fig. 2. Latest-visiting limitation in class merging.

but important data structure, ColumnArray. There are two fields in ColumnArray: classLabel and rowCoordinate. We define a ColumnArray-type array vol, as large as the width of the image. For vol[j], classLabel records the label of the class that most recently visited the jth column, while rowCoordinate records the row coordinate of the base class that most recently visited the jth column. Whenever a new base class is created, vol is updated, and only those columns visited by this base class are updated. In this manner, vol[j] always keeps the information of the base class that most recently visited the jth column.

Take Fig. 2 as an example and assume we are searching for 1-ED-connected components. The rowCoordinate fields of vol[1-6] are initialized to -1000. When row 2 is scanned, base class a is created and its class label is set to 0. Since a visits column 2 only, we update only vol[2], which becomes (0, 2), where 0 is the classLabel and 2 the rowCoordinate. When row 4 is scanned, base class b is created with class label 1. Now b visits column 2, so vol[2] is updated to (1, 4). Although both classes a and b visit column 2, vol[2] keeps only the record of class b. Therefore, later in the labeling process, if we check column 2, a will not be considered; in this manner we avoid considering unnecessary classes.

Continuing the scan at row 5, a new base class c is created. We now need to consider whether c can be merged with the two previous classes. According to the inter-column limitation of Property 2, the previous classes that visit columns 1-6 must be considered. Since vol[1-6] records the information of the previous classes that most recently visited columns 1-6, we need to check vol[1-6]. From the rowCoordinate of vol[1] and vol[3-6] we know that they contain no valid records, meaning no class ever visited columns 1 or 3-6 before c. Hence we only need to check vol[2]: its classLabel is 1 and its rowCoordinate is 4. So next we examine whether previous class 1 and base class c can be merged by calculating the distance between them.
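The vol bookkeeping above can be sketched in a few lines of Python. This replays the Fig. 2 scenario; the dict representation of ColumnArray and the sentinel -1000 follow the text, while the helper name is ours:

```python
# vol[j] keeps (classLabel, rowCoordinate) of the base class that most
# recently visited column j; rowCoordinate starts at the sentinel -1000.
width = 6
vol = [{'classLabel': -1, 'rowCoordinate': -1000} for _ in range(width + 1)]

def update_vol(vol, label, row, beginY, endY):
    """Record a newly created base class in every column it visits."""
    for j in range(beginY, endY + 1):
        vol[j]['classLabel'] = label
        vol[j]['rowCoordinate'] = row

# Fig. 2: class a (label 0) visits column 2 in row 2; later, class b
# (label 1) visits column 2 in row 4 and overwrites a's record.
update_vol(vol, label=0, row=2, beginY=2, endY=2)
update_vol(vol, label=1, row=4, beginY=2, endY=2)
```

After the two updates, vol[2] holds only class b's record (1, 4), so class a is never considered when column 2 is checked later.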


Property 5. When computing the distance between the base class C(S(i, js), E(i, je)) and a previous class, even if the previous class satisfies both the inter-column limitation and the latest-visiting limitation, not all points in that previous class need be considered in the distance calculation. We only need to check the points of the previous class that fall into the set X = {(xi, xj) | xj ∈ [j1, j2], j1 = js - n, j2 = je + n, and xi is the row coordinate of the base class that most recently visited column xj}. This is because the inter-column distance between the base class C and any point of the previous class not in X is larger than n; the explanation is similar to that of Property 2.

Notice that rowCoordinate in vol is very important. Take Fig. 2 as an example. Without rowCoordinate we would only know which class most recently visited column 2; we would not know how to calculate the distance between the current base class c and the previous class saved in vol[2] (class b in this example). With rowCoordinate we know the row coordinate of the base class that most recently visited column 2, which is 4 in this case. Since we are checking column 2, the column coordinate is 2. Hence we know class b has a point (4, 2) in column 2, and we can calculate the distance between this point and the current base class to decide whether c can merge with class b.

Property 6. Since a base class C can be described as a horizontal line, computing the distance between a point and a base class C can be simplified to a Euclidean distance between two points by virtue of the following definition.

Definition. The distance d between a point P(xi, xj) and a base class C(S(i, js), E(i, je)) can be expressed as:

d = ‖S − P‖ = √((js − xj)² + (i − xi)²),  if xj < js;
d = ‖E − P‖ = √((je − xj)² + (i − xi)²),  if xj > je;        (1)
d = |i − xi|,  otherwise.

3. The LSC algorithm

In this section, we describe how to make full use of the above properties in designing the algorithm and further explain the contributions these properties make to it.

3.1. Data structure

The two basic data structures described in Section 2, BASECLASS and ColumnArray, are used. Clusters is a BASECLASS-type pointer array used for saving the memory address of each class. For example, Clusters[k] saves the memory address of class k; therefore, by tracing Clusters[k], we can visit all points belonging to class k. A ColumnArray-type array vol is used for recording the class that most recently visited each column: vol[j] tells us that the base class which most recently visited column j is vol[j].classLabel and that the row coordinate of this base class is vol[j].rowCoordinate.

3.2. Properties implementation

To clearly illustrate our algorithm, the flowchart of the LSC algorithm is shown in Fig. 3. Some words in some blocks of the flowchart are underlined, beneath which are smaller words: the underlined words give a brief description of that step, and the smaller words are either its detailed description or its related implementation code. By following the flowchart, the algorithm can easily be implemented.

Fig. 3. Flowchart of LSC algorithm.
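The piecewise distance of Eq. (1) translates directly into code. The following sketch (our rendering, not the authors' code) also replays the paper's worked example, where base class c occupies row 2, columns 8-9:

```python
import math

def distance_to_base_class(xi, xj, i, js, je):
    """Eq. (1): distance d between point P(xi, xj) and
    base class C(S(i, js), E(i, je))."""
    if xj < js:
        return math.hypot(js - xj, i - xi)   # ||S - P||
    if xj > je:
        return math.hypot(je - xj, i - xi)   # ||E - P||
    return abs(i - xi)                       # P's column lies within the run

# Worked example from the text: base class c is row 2, columns 8-9.
d1 = distance_to_base_class(1, 6, 2, 8, 9)   # point (1,6) gives sqrt(5)
d2 = distance_to_base_class(1, 7, 2, 8, 9)   # point (1,7) gives sqrt(2)
```

With n = 2, d1 = √5 > n (no merge) while d2 = √2 < n (merge), matching the walkthrough in the text.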


We can consider the LSC algorithm a four-step process. First, it scans the image in raster order; once it finds a target point, it starts to generate a base class, curBaseClass. Second, it determines whether curBaseClass can merge with previous classes by means of Properties 2-4; if it can, class merging is done according to two merging rules illustrated later. Third, it records the base class information in vol after determining which class the current base class belongs to. Fourth, if curBaseClass cannot merge with any previous class, it is recorded in the pointer array Clusters as a new class. The final results can then be traced out by visiting Clusters.

The first step relates to Property 1. When a target point is scanned, the LSC algorithm does not consider it independently, but continues scanning until a base class is constituted. This is done by a function whose flowchart is drawn in Fig. 4(a). Take Fig. 5 as an example. When point (1,1) is found to be a target point, the LSC algorithm continues to scan row 1 until a background point or the end of the row is encountered; in this example, when the scan reaches point (1,4), a base class a is generated.

The second step is the key step of the algorithm: we determine whether the current base class can merge with any previous class. Properties 2-4 are considered here. In step 1 we generated a base class; now we calculate j1 and j2 for it. According to Property 2, we only need to check the previous classes that most recently visited columns j1 to j2. The previous classes satisfying both the inter-column limitation and the latest-visiting limitation are recorded in vol, so we check only the points (vol[j].rowCoordinate, j) to decide whether the class vol[j].classLabel can be merged with the current base class.

Fig. 4. Flowchart of two functions in the LSC algorithm. (a) Base class generation; (b) Distance calculation between a base class and a point in a previous class.
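The base-class generation step of Fig. 4(a) amounts to scanning rightward until the run of target pixels ends. A minimal sketch, using 1-indexed columns to match the paper's examples (the row layout below follows the Fig. 5 walkthrough, with class a in columns 1-3 and class b in columns 6-9):

```python
def generate_base_class(row_pixels, start_col, target=1):
    """Fig. 4(a): from a target pixel, keep scanning right until a
    background pixel or the end of the row; return (beginY, endY)."""
    j = start_col
    while j < len(row_pixels) and row_pixels[j] == target:
        j += 1
    return start_col, j - 1

# Row 1 of the Fig. 5 example; index 0 is unused so columns are 1-indexed.
row1 = [None, 1, 1, 1, 0, 0, 1, 1, 1, 1]
```

Starting from column 1 the scan stops at the background point in column 4, yielding the run (1, 3); starting from column 6 it runs to the end of the row, yielding (6, 9).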


Fig. 5. LSC example.

According to Property 3, we do not check any previous class that does not satisfy the inter-row limitation (the NO branch of vol[k].rowCoordinate < i - n in the LSC flowchart does this). Moreover, we only check the previous classes allowed by Property 4, and we only check the points satisfying Property 5.

In this step, there are two cases to consider when merging two classes. Case 1: the distance between the current base class and a previous class is less than n. Case 2: the distance between a previous class that contains the current base class and another previous class is less than n. The distance between a point in a previous class and the base class is calculated according to Property 6; its flowchart is shown in Fig. 4(b).

We have a function FIND for identifying the root of a class tree (refer to Fig. 6(b)). In both cases, for a class with more than one node, we use FIND to locate the root node of the class tree that the class relates to. The label of this root node represents the final class label of the class. Therefore, within one class tree there may be nodes whose labels differ; however, their root node is the same, so their final class labels are the same. In class merging, we should make the class most likely to be visited later the parent of the merged class, so that FIND can locate the root in the fewest steps in later processing. Corresponding to the two cases, we have the following two rules for class merging (refer to Fig. 6(a)).

Rule 1: Make the current base class the parent of the merged class, and assign the previous class's label to the current base class. The label of the previous class is found by FIND.

Rule 2: Make the previous class that contains the current base class the parent of the merged class.

During class merging, if a class a is merged into another class b, i.e. b becomes the parent of a, we set a→tag = f, which means a is no longer the root of a class tree. In order to avoid repeated merging, we check whether the two classes have the same label before merging them (by curLabel = prevClassK→label in the LSC flowchart).

In the third step, the contents of vol[j1-j2] are updated with the information of the base class that most recently visited column j (j ∈ [j1, j2]). vol records the information of


Fig. 7. Labeling process on the LSC example. (a) Records update in each base class; (b) records update in vol along with the generation of base classes a, b and c.

Fig. 6. Flowchart of another two functions in the LSC algorithm. (a) Class merging; (b) FIND.

which base class most recently visited each column and the row coordinate of that base class, so that this information can be used in step 2. Take Property 3 as an example and assume we want to check whether any previous class that most recently visited column j can merge with the current base class. We know the class that most recently visited column j from vol[j].classLabel. However, this class may contain more than one base class. Do we need to check each point in each base class? Definitely not: vol[j].rowCoordinate always tells us the row coordinate of the base class that most recently visited column j, so we only need to check the point (vol[j].rowCoordinate, j).

The final step records the current base class in Clusters[newLabel] if it cannot merge with any previous class (this is done by simply assigning Clusters[newLabel] = curBaseClass in the LSC flowchart).

Let us go back to the example of Fig. 5 and assume 2-ED-connectivity, namely n = 2. After scanning to point (1,4) in row 1, a base class a is formed; it is the current base class. Since there is no previous class before it, no distance calculation or class merging is involved. We simply update vol[1-3] to (0, 1) and record a in Clusters[0] as a new class. In Fig. 7(a), the second row lists the results saved in Clusters[0] after class a is generated. In this figure, the first and second columns tell us which row and column of the image is scanned. If vol appears in the row

field, it means the update was caused by checking the information saved in vol[col], where col is the column coordinate in the col field of the figure. The third column conveys which base class is updated, and columns 4-12 give the contents of each field of the related base class. Column 13 indicates the update of newLabel, while the last column indicates the class label of the current base class or class. The updates of vol after generating each base class are shown in Fig. 7(b). From this figure you can see that after a is generated, the contents of vol for columns 1, 2 and 3 are updated to (0, 1), meaning that the base class that most recently visited columns 1, 2 and 3 is class 0 and its row coordinate is 1. We emphasize that once a base class is generated, its tag field is set to t; its tag is updated to f once it is merged into another class, indicating that it is no longer a root node of a class tree.

Continuing to scan the image, we generate a new base class b, which contains the points in columns 6-9 of row 1. For b, j1 = b→beginY - n = 4 and j2 = b→endY + n = 11. Although there is a previous class with label 0, there are no records in vol[4-11], so neither distance calculation nor class merging is involved. vol[6-9] is updated and Clusters[1] links to b. The third rows of Fig. 7(a) and (b) list these updates.

Continuing to scan row 2, a new base class c is formed; its records are listed in the fourth row of Fig. 7(a). Since c→beginY = 8 and c→endY = 9, by means of Properties 2-5 the points (vol[6-11].rowCoordinate, 6-11) are checked one by one to determine whether c can be merged with the previous classes vol[6-11].classLabel. The first point checked is (vol[6].rowCoordinate, 6), namely (1,6). The distance between this point and the current base class c is calculated by Eq. (1); the result is √((6 − 8)² + (2 − 1)²) = √5. Since √5 is larger than n, no merging is done. Next the point (vol[7].rowCoordinate, 7), namely (1,7), is checked and


the distance d between it and c is √2. Since d < n, class c is merged with the class vol[7].classLabel, namely class 1 (that is, b). According to merging rule 1, c is the current base class, so it becomes the parent of b. The updates to each class are reflected in the fifth and sixth rows of Fig. 7(a): Clusters[1] now links to c, while b (which originally linked to Clusters[1]) becomes the leftson of c and b→tag is updated to f. c→count is updated to the sum of c→count and b→count. The points (vol[8-9].rowCoordinate, 8-9) are not checked because vol[8-9].classLabel is the same as the label of class c, and since there are no records in vol[10-11], the points (vol[10-11].rowCoordinate, 10-11) are not checked either. vol[8-9] is updated due to c; these updates are listed in the fourth row of Fig. 7(b), whose summary row lists all the records in vol so far.

Continuing to scan row 3, a new base class e is formed; its contents are listed in row 7 of Fig. 7(a). The points (vol[1-10].rowCoordinate, 1-10) are checked one by one to determine whether e can be merged with the previous classes labeled vol[1-10].classLabel. After the point (1,3) is checked, e can be merged with a. According to merging rule 1, e becomes the parent of a and a becomes the leftson of e; a→tag is set to f and e→count is accumulated. e and a are updated, and their records are listed in the eighth and ninth rows of Fig. 7(a). When the point (1,6) is checked, e can be merged with class 1. Notice that e has already merged with a, so e is no longer a single base class; Rule 2 should be used for merging class e and class 1. According to this rule, e is made the parent, a is the leftson of e, c becomes the rightsib of a, c→tag is set to f, and e→count is accumulated. Rows 10-12 of Fig. 7(a) show all the updates. The points (vol[7-9].rowCoordinate, 7-9) are not checked, since the label of the root of class vol[7-9].classLabel equals the label of the current class (they have the same parent, with label 0). The final records of Clusters and vol are given in Fig. 8.

To better understand the algorithm, we should have a clear picture of the relationship between a base class and a class. In the LSC algorithm, a class is represented by a base class linked with other base classes. During labeling, we generate base classes, and these base classes constitute different classes based on their connectivity. Although a base class is a single node when first generated, it may become the parent of other base classes. Once a base class becomes a parent, or when it cannot be merged with any other class, it represents a class as well as a base class. In the above example, base class a is also a class while it cannot merge with any other class. As for base class e, when generated it is just a single base class; however, in the later process it merges with classes a, b and c and becomes their parent. It is then a base class representing the points in row 3, and it is also a class, since it is the parent of a, b and c and we can find a, b and c through it.
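The whole walkthrough can be condensed into a compact, simplified reimplementation. This sketch is our own: it labels horizontal runs one pass over the image and merges them with plain union-find (with FIND-style root chasing) instead of the paper's class trees and vol optimization, but it realizes the same n-ED connectivity via the run distance of Property 6:

```python
import math

def run_distance(r1, r2):
    """Minimum Euclidean distance between two horizontal runs
    (row, colStart, colEnd), per Property 6 / Eq. (1)."""
    col_gap = max(r2[1] - r1[2], r1[1] - r2[2], 0)
    return math.hypot(col_gap, r1[0] - r2[0])

def label_runs(image, n, target=1):
    """One-pass n-ED labeling over runs; returns the number of classes."""
    parent = {}
    def find(k):                       # FIND: chase parent links to the root
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k
    runs = []                          # (row, colStart, colEnd, runId)
    for i, row in enumerate(image):
        j = 0
        while j < len(row):
            if row[j] != target:
                j += 1
                continue
            js = j
            while j < len(row) and row[j] == target:   # grow the base class
                j += 1
            rid = len(parent)
            parent[rid] = rid
            cur = (i, js, j - 1)
            # merge with any earlier run within Euclidean distance n
            # (the inter-row limitation prunes runs more than n rows back)
            for (pi, pjs, pje, pid) in runs:
                if i - pi <= n and run_distance(cur, (pi, pjs, pje)) <= n:
                    parent[find(pid)] = find(rid)
            runs.append((i, js, j - 1, rid))
    return len({find(rid) for (_, _, _, rid) in runs})

# The Fig. 5 layout (rows 0-indexed here): runs a, b, c, e as in the text.
img = [
    [0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0],   # a: cols 1-3;  b: cols 6-9
    [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0],   # c: cols 8-9
    [0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0],   # e: cols 1-8
]
```

With n = 2 all four runs collapse into a single class, as in the paper's example; with n = 1 (ordinary 4-connectivity) run a stays separate, giving two classes.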


Fig. 8. Final labeling results of the LSC example. (a) Final records in base classes; (b) final records in vol.

Eventually, how do we know how many classes there are in the image? We know it by checking the tag field of each class recorded by Clusters½k: Once a class is merged into another class, its tag field will be set to f so only the class whose tag ¼ t is a root of a class tree. The class with tag ¼ f is a child of another class (not an independent class), which can be traced out by its parent: In this example, Clusters½1:tag ¼ f so it is not an independent class. In fact it links to Clusters½0: Clusters½0:tag ¼ t indicates there is only one class, class 0, in this image. After labeling, if we want to visit all the points in class 0, we can trace Clusters½0: Clusters½0 has all the information of the target points in this image. From the above description, it is demonstrated that LSC is a one-pass algorithm for labeling arbitrarily connected components. Besides, we can also easily find the following two good features of the LSC algorithm. 3.3. Easy for postprocessing As explained above, all the points in each class can be traced out from the root of the class tree. Therefore, when we need to do some processing on one class, only the points in this class will be checked. We do not need to check the background point or points that belong to any other classes. For example, in one of our applications, we only want to know the class with the maximum number of points. It is very easy for us to do so—we can simply compare the count field of the root of each class tree. Another application for us is to calculate the bounding rectangle of each class. It can be easily known from the LSC labeling results by comparing coordinates of each point in each class tree. 3.4. Labeling gray image directly Another good feature of the LSC algorithm is that the input image does not have to be a binary image, which is needed by traditional approaches. The input image can just be a gray image. 
For example, suppose we want to label all the points whose gray level is smaller than a value gLevel. All we need to do in LSC is change every statement if img(i,j) = T into if img(i,j) < gLevel. For one of our applications, we need to label points whose gray level is 0


or bigger than a value gLevel; here we modify every statement if img(i,j) = T into if (img(i,j) = 0 or img(i,j) > gLevel). This approach is not only simple, it also saves the time otherwise spent on image binarization.
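To illustrate Sections 3.3 and 3.4, here is a minimal sketch of the class-tree records and the predicate-based foreground test. The Cluster class, the helper names, and the gLevel default are our own illustrative assumptions; they are not the paper's code.

```python
# Sketch of the class-tree bookkeeping described above. Field names
# (tag, parent, count) follow the paper's description; everything else
# is an illustrative assumption.

class Cluster:
    def __init__(self, label):
        self.label = label
        self.tag = True        # True ('t') = root of a class tree
        self.parent = None     # set when this class is merged into another
        self.count = 0         # number of points in this class tree
        self.points = []       # (row, col) coordinates of the points

def find_root(c):
    """Follow parent links to the root of a class tree (the FIND operation)."""
    while c.parent is not None:
        c = c.parent
    return c

def merge(child, root):
    """Merge class `child` into class `root` (both are assumed to be roots)."""
    child.tag = False
    child.parent = root
    root.count += child.count

def largest_class(clusters):
    """Postprocessing example: the class with the maximum point count."""
    roots = [c for c in clusters if c.tag]
    return max(roots, key=lambda c: c.count)

# Gray-image labeling: the foreground test is just a predicate on the
# pixel value instead of the binary test img(i,j) = T.
def is_target(value, g_level=128):
    return value == 0 or value > g_level
```

A usage example: after merging class 1 into class 0, only class 0 remains a root, and `largest_class` returns it directly without scanning the image.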

4. Block-by-block LSC

Another useful feature of LSC is that it is suitable for part-by-part processing. We can divide the image into several vertical parts, perform LSC on each individual part at the same time, and then merge the results of all the parts in a final stage. This gives our Block-by-block LSC (BLSC, for short) algorithm. Fig. 9 shows the flowchart of the BLSC algorithm.

The main idea of BLSC is to label each part with the LSC algorithm while recording, in two ColumnArray-type arrays volUp and volDown, the row coordinates and labels of the classes that most recently visited each column of the first n rows and the last row of each part. We then merge the labeling results of neighboring parts by means of the information saved in volDown of one part and volUp of its adjacent lower part. In the following discussion we assume 1-ED-connectivity, namely n = 1 (corresponding to the traditional four-connectivity approach).

In the LSC algorithm, there is already an array that records the class information that most recently visited each column. After LSC labels the whole image, this array holds exactly the row coordinates and labels of the base classes that most recently visited each column of the last row. So we only need to add one more array to LSC to record the class information for the first row. This gives the modified LSC algorithm, singleLsc. The flowchart of the singleLsc algorithm is shown in Fig. 10.

singleLsc can be considered a five-step process. In the initialization stage, newLabel is set to startLabel, which depends on partNo so as to guarantee that no overlapping labels occur when labeling different parts. This is necessary for the later merging stage. Step 0 of singleLsc labels the first row and saves its class information into volUp. In order not to change volUp, we copy its contents into volDown; volDown then continues to be updated exactly as vol is in LSC.
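As a minimal sketch, each volUp/volDown entry needs only the row coordinate and class label of the class that most recently visited that column. The record layout (rowCoordinate, classLabel), the INVALID sentinel, and all names here are our own assumptions based on the description.

```python
# Sketch of the per-column bookkeeping described above. The record layout
# (rowCoordinate, classLabel) and all names are illustrative assumptions.

INVALID = (-1, -1)   # marks a column that no class has visited yet

def init_column_array(width):
    """One (rowCoordinate, classLabel) record per image column."""
    return [INVALID] * width

def visit(vol, col, row, label):
    """Record that class `label` most recently visited column `col` at `row`."""
    vol[col] = (row, label)

def copy_first_row(vol_up):
    """Step 0 of singleLsc: preserve volUp by copying it into volDown,
    which is then updated like vol in plain LSC."""
    return list(vol_up)
```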
In this manner, after labeling, volUp and volDown keep the first-row and last-row information of a single part, respectively. Steps 1, 2 and 3 in singleLsc are the same as those in LSC, but here, since we consider the n = 1 case, j1 and j2 are set to the start column and end column of the current base class. No distance calculation is needed in step 2, which makes it quite simple compared to LSC, which caters for arbitrary connectivity. In step 4, the new class is saved in Clusters[newLabel − startLabel] instead of Clusters[newLabel].

BLSC can be considered a three-step process. Let us assume we divide the image into BLKS parts. We use a two-

Fig. 9. Flowchart of the BLSC algorithm.

dimensional BASECLASS-pointer-type array subClusters[0 – (BLKS − 1)][ ] to save the labeling results of each part, and two-dimensional arrays volUp[0 – (BLKS − 1)][ ] and volDown[0 – (BLKS − 1)][ ] to save the row coordinates and labels of the classes that most recently visited each column of the first row and last row of each part.

Step 1 in BLSC labels each part by singleLsc. After step 1, the labeling results for part partNo are saved in subClusters[partNo][ ], and the class information that most recently visited each column of its first row and last row is saved in volUp[partNo][ ] and volDown[partNo][ ], respectively.

Step 2 consolidates the labeling results of each individual part except part 0. To guarantee that no overlapping labels occur across parts, the start label of each part except part 0 is not 0. When we put the labels of the individual parts together, they will not be continuous, which brings difficulty in class


Fig. 10. Flowchart of the singleLsc algorithm.

merging. For example, assume maxLabel = 100. For part 0, suppose the class labels run from 0 to 5; for part 1, the class labels start from 100. If we put the labeling results of parts 0 and 1 together, the class labels jump from 0–5 to 100. After step 1, we already know how many labels are used in each individual part. Therefore, in step 2, we adjust the labels in the different parts to make them continuous when put together. This is done by subtracting a number extraLabel from the labels of the classes in each part; extraLabel is equal to the difference between the startLabel of a part and the number of labels used by all its preceding parts (refer to Fig. 9). Then all the labeling results of each part are linked to a pointer array Clusters (its type is the same as in LSC).

Step 3 merges the labeling results of adjacent parts sequentially. The flowchart of the two-part merging routine in the case n = 1 is shown in Fig. 11. To merge the two parts partNo − 1 and partNo, we use two variables, lastLabelU and lastLabelD, to keep track of the labels of the two most recently merged classes. They are initially set to two invalid values. We then search for valid records in volUp[partNo]. Once a valid record for a certain column j is found, we check whether the class saved

Fig. 11. Flowchart of two-part merging in the BLSC algorithm.

in volUp[partNo][j] and the class saved in volDown[partNo − 1][j] have already been merged. This is done by testing whether volUp[partNo][j].classLabel = lastLabelU and volDown[partNo − 1][j].classLabel = lastLabelD. (We found this particularly useful because, in many cases, a class in the last row of part partNo − 1 and a class in the first row of part partNo belong to the same class, with many connected points extending horizontally. Once we merge two classes, we do not want to check the points belonging to these two merged classes again.) If they are not merged yet, we will


Fig. 12. BLSC example.

continue to check whether they can be merged. Since we are considering the n = 1 case, and we are checking the same column in the last row of part partNo − 1 and the first row of part partNo, we only need to compare their row difference by subtracting the rowCoordinate values saved in volUp[partNo][j] and volDown[partNo − 1][j]. If this difference is exactly 1 then we can merge them, which in this case simply means linking one class into the other. Notice that for volUp[partNo][j], the rowCoordinate is either the coordinate of the first row of part partNo or invalid. However, for volDown[partNo − 1][j], the rowCoordinate may not be the last-row coordinate of part partNo − 1, because volDown records the overall results of the whole part.

Let us take Fig. 12 as an example and assume BLKS = 2, n = 1 and maxLabel = 10. The height of each part is 2.

Step 1: label parts 0 and 1 by singleLsc. Let us first process part 0. In row 0 there are no target points, so there are no valid records in volUp[0][ ]. After scanning row 1, three base classes are formed. The labeling results in subClusters[0] are listed in Fig. 13(a). The contents of volDown[0] are updated along with the generation of each base class; the results are shown in Fig. 13(b). The final results of volUp[0][ ] and volDown[0][ ] are listed in Fig. 13(c).

Labeling part 1 by singleLsc: the startLabel is 10. After scanning its first row (row 2 in the full image), three classes are formed. The labeling results in subClusters[1] and volUp[1] are listed in rows 2–4 of Fig. 14(a) and (b), respectively. The contents of volUp[1] are copied to volDown[1] (refer to Fig. 14(c)).
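The boundary-merge test described above (row difference of exactly 1, with the lastLabelU/lastLabelD short-circuit) can be sketched as follows. The record layout (rowCoordinate, classLabel), the INVALID sentinel, and all names are our own assumptions; `find` and `union` stand for the class-tree FIND and link operations.

```python
# Sketch of the two-part merging loop for n = 1 (Fig. 11).

INVALID = (-1, -1)

def merge_parts(vol_up, vol_down, clusters, find, union):
    """Merge the classes of part partNo (vol_up = its first-row records)
    with those of part partNo - 1 (vol_down = its last-visit records)."""
    last_u = last_d = None   # labels of the two most recently merged classes
    for j, rec_up in enumerate(vol_up):
        rec_down = vol_down[j]
        if rec_up == INVALID or rec_down == INVALID:
            continue
        row_u, label_u = rec_up
        row_d, label_d = rec_down
        if label_u == last_u and label_d == last_d:
            continue                   # this pair of classes was merged already
        if row_u - row_d == 1:         # vertically adjacent across the boundary
            root_u = find(clusters[label_u])
            root_d = find(clusters[label_d])
            if root_u is not root_d:
                # link part partNo-1's class tree into part partNo's class,
                # as in the worked example (Clusters[0] merged into Clusters[3])
                union(root_d, root_u)
            last_u, last_d = label_u, label_d
```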

Fig. 13. Part 0 labeling results on the BLSC example. (a) Records of base classes; (b) records update of volDown[0] along with the generation of each base class; (c) final records in volUp[0] and volDown[0].

We continue by scanning the second row (row 3 in the full image). A base class c7 is generated. We then check whether c7 can be merged with any previous class by means of volDown[1] (refer to Fig. 14(c)). First check volDown[1][3], whose record is (10, 2), while c7 is in row 3. So c7 and class (10 − startLabel), namely class 0, should be merged. Class 0 can be obtained from subClusters[1][0], and c4 = subClusters[1][0], so c4 is merged into c7. After volDown[1][6] and volDown[1][9] are checked, c5 and c6 are merged into c7 as well. The results are reflected in Fig. 14(a). After these merging operations, columns 3–9 of volDown[1] are updated to (10, 3), as shown in Fig. 14(d). Part 1 has now been labeled; its overall results are listed in Fig. 14(e) and (f).

Step 2: make the class labels in parts 0 and 1 continuous. The labels for part 0 start from 0, so no operation on part 0 is needed. After step 1, we know that 3 labels are used in part 0, so the extraLabel for part 1 will be partNo × maxLabel −

Fig. 14. Part 1 labeling results on the BLSC example. (a) Records update in base classes along with the generation of each base class and class merging; (b) records update of volUp[1]; (c) records copied from volUp[1] to volDown[1]; (d) records update of volDown[1]; (e) final records in base classes; (f) final records in volUp[1] and volDown[1].


beforeLabel[1] = 10 − 3 = 7. Therefore all the labels of classes in part 1 should be updated by subtracting 7, and the classLabel field in volUp[1] and volDown[1] needs to be updated in the same way. The changes to the classes of part 1 are listed in Fig. 15(a), and the updates to volUp[1] and volDown[1] are listed in Fig. 15(c). Then we link the records in both subClusters[0] and subClusters[1] to Clusters. The contents of Clusters are listed in Fig. 15(b).

Step 3: merge the labeling results of the two parts according to the merging principle (refer to Fig. 11). We check valid records in volUp[1] from left to right. The first valid record (refer to Fig. 15(c)) is (3, 2), in volUp[1][1]. We then check whether volDown[0][1] has a valid record; it does, (0, 1) (refer to Fig. 13(c)). The row difference between these two classes is 2 − 1 = 1, so they should be merged. Before merging, we double-check whether these two classes 0 and 3 already have the same class label at their root nodes, by comparing FIND(Clusters[0]) and FIND(Clusters[3]). In this case, they belong to different classes, so we merge class Clusters[0] into class Clusters[3], which leads to c1 being merged into c7. The updates in each class are shown in Fig. 16(a). Now lastLabelU is updated to 3 and lastLabelD is updated to 0 to record the class labels of the two most recently merged classes.

Continuing with volUp[1][2], it has a valid record (3, 2) and volDown[0][2] has a valid record (0, 1). However, volUp[1][2].classLabel = lastLabelU and volDown[0][2].classLabel = lastLabelD, which means these two classes have already been merged, so we take no action. For the same reason, volUp[1][3] is skipped. volUp[1][4–5] have no valid records, so we continue with volUp[1][6]. There are valid records in both volUp[1][6] ((4, 2)) and volDown[0][6] ((1, 1)). By examining these records we know that the class saved in Clusters[1] (c2) should merge into the class saved in Clusters[4]


Fig. 16. Parts merging results on the BLSC example. (a) Records update due to c1 merging into c7 after checking volUp[1][1]; (b) records update due to c2 merging into c7 after checking volUp[1][6]; (c) records update due to c3 merging into c7 after checking volUp[1][9]; (d) final records in all base classes.

(the parent of Clusters[4] is c7; refer to Fig. 16(b)), namely c2 is merged into c7. Finally, c3 is merged into c7 after volUp[1][9] is checked (the records update is shown in Fig. 16(c)). The final records in all base classes are shown in Fig. 16(d). From this figure, we can see that only Clusters[3]→tag = t, so there is only one class in this image, and the total number of points in this class is Clusters[3]→count = 21.

We have just demonstrated how BLSC works in the case n = 1. In the case n = √2, it is easy to modify it by setting j1 = curBaseClass→beginY − 1 and j2 = curBaseClass→endY + 1 in singleLsc. Also, in the parts-merging process, for any record in volUp[partNo][j], we should check the records in volDown[partNo − 1][j − 1], volDown[partNo − 1][j] and volDown[partNo − 1][j + 1]. However, in the case n ≥ 2, we need to save the information of the first n rows in singleLsc, and the parts merging becomes too complicated to discuss in this paper. For the traditional 4- or 8-connected components labeling, BLSC will improve the clustering speed when multiple processors are used.
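Putting the three steps together, a BLSC driver can be sketched as follows. Here `single_lsc` is a stub standing in for the real routine (it merely reports whether a part contains any target point and assigns one class per part), MAX_LABEL plays the role of maxLabel, and the thread pool is one possible realization of the parallel step; none of this is the paper's actual code.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_LABEL = 100   # assumed per-part label budget (maxLabel in the text)

def single_lsc(part_no, part_rows):
    """Stub for singleLsc: returns (labels, labels_used) for one part.
    The real routine also returns the volUp/volDown records for step 3."""
    used = 1 if any(any(row) for row in part_rows) else 0
    start_label = part_no * MAX_LABEL    # guarantees no overlapping labels
    return [start_label] * used, used

def blsc(image, blks):
    h = len(image)
    parts = [image[p * h // blks:(p + 1) * h // blks] for p in range(blks)]
    # Step 1: label every part, in parallel when processors allow.
    with ThreadPoolExecutor(max_workers=blks) as pool:
        results = list(pool.map(single_lsc, range(blks), parts))
    # Step 2: make labels continuous by subtracting extraLabel, i.e.
    # partNo * MAX_LABEL minus the labels used by all preceding parts
    # (in the worked example: extraLabel = 1 * 10 - 3 = 7 for part 1).
    clusters, used_before = [], 0
    for part_no, (labels, used) in enumerate(results):
        extra = part_no * MAX_LABEL - used_before
        clusters.extend(label - extra for label in labels)
        used_before += used
    # Step 3 (omitted here): merge adjacent parts via volUp/volDown records.
    return clusters
```

With two parts and one class in each, the relabeled classes come out continuous, as 0 and 1, mirroring the re-adjustment of step 2 above.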

5. Comparison, analysis and applications

Fig. 15. Results of BLSC step 2 on the BLSC example. (a) Label re-adjustment of the classes in part 1; (b) records in parts 0 and 1 linked to Clusters; (c) label re-adjustment of volUp[1] and volDown[1] of part 1.

To compare our algorithms with a recently published efficient connected components labeling algorithm [11] (let us call it the SB algorithm; its extended version is considered here), we set n = 1 for our algorithms, since their paper gives code for the traditional 4-connectivity case. All the images were tested on a Pentium III PC (550 MHz CPU).


The images we tested are natural images downloaded from Volume 3 (Miscellaneous) of the USC (University of Southern California) image database, http://sipi.usc.edu/services/database/database.cgi?volume=misc. There are 44 images in total at this site, some of them color images. We first saved the color images as 256-level gray images using a public image tool, Paint Shop Pro. Since the SB algorithm can only work on binary images, we transformed these gray images into binary images using Otsu's threshold selection method [6] before labeling them. Because we can label either the points whose gray levels are ≥ the threshold or the points whose gray levels are < the threshold, we can in effect label 88 images.

First we labeled the points whose gray level is ≥ the threshold; the results for each algorithm are shown in Fig. 17(a). In this figure, column 1 gives the names of the tested images, following those on the above web site. Column 2 gives the size of the tested images. Column 3 shows the threshold used to binarize each image, and column 4 lists the time of image binarization (not including the time for threshold determination). Column 5 shows the time used by the SB algorithm to label the whole image, while column 6 is the sum of the time used by the SB algorithm and the time used by image binarization (we call it SB1 time in the figure), since the SB algorithm needs a binary image while LSC does not. Columns 7–10 display the time used by the LSC algorithm and the BLSC algorithm (simulating the use of k processors) to label the whole image. The last five columns show the improvement of the LSC algorithm over the SB algorithm; of the LSC algorithm over the SB algorithm plus image binarization; of the BLSC (k = 2) algorithm over the LSC algorithm; of the BLSC (k = 3) algorithm over the LSC algorithm; and of the BLSC (k = 4) algorithm over the LSC algorithm, respectively.
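The binarization step uses Otsu's method [6], which picks the threshold maximizing the between-class variance of the two pixel populations. A minimal pure-Python sketch follows; this is an illustration of the method, not the implementation used in the experiments.

```python
# Sketch of Otsu's threshold selection: return the t maximizing the
# between-class variance of the populations {v < t} and {v >= t}.
def otsu_threshold(pixels, levels=256):
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1
    total = len(pixels)
    total_sum = sum(v * h for v, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = sum0 = 0                      # weight and gray-level sum of class 0
    for t in range(1, levels):
        w0 += hist[t - 1]
        sum0 += (t - 1) * hist[t - 1]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0 = sum0 / w0                # mean of class {v < t}
        mu1 = (total_sum - sum0) / w1  # mean of class {v >= t}
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```

On a clearly bimodal histogram, the returned threshold separates the two modes, which is exactly what the labeling experiments need before running the SB algorithm.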
Note that the improvement ratio of algorithm 1 over algorithm 2 is (T2 − T1)/T2 × 100%, where T1 and T2 denote the running times of algorithms 1 and 2, respectively. To calculate the running time of the BLSC algorithm, we simulate the use of k processors: we record the maximum time t1 used to label any individual part and the time t2 used for parts merging, and take t = t1 + t2 as the final BLSC running time. In these figures, some cells of the comparison between the BLSC and LSC algorithms are empty; this is because the LSC time is too short, so the comparison between LSC and BLSC would be inaccurate due to time-measurement error. The maximum, minimum and average improvements of the LSC algorithm over the SB algorithm and SB1, and of the BLSC algorithm over the LSC algorithm, corresponding to Fig. 17(a), are listed in Fig. 17(b). Fig. 17(c) shows results similar to Fig. 17(b), but this time the points whose gray levels are < the threshold are labeled. From Fig. 17(b) and (c), we can see that the improvement of the LSC algorithm over the SB

algorithm varies from 22.3 to 91.7%, with an average of about 67%. Taking the time used for image binarization into account, the minimum, maximum and average improvements of the LSC algorithm over the SB algorithm are 28, 92.4 and 70%, respectively, while the average improvements of the BLSC algorithm in the case

Fig. 17. Comparison between the SB, LSC and BLSC algorithms. (a) Detailed results of the different algorithms labeling white points in 44 images; (b) summary of the improvement of the LSC algorithm over the SB algorithm and of the BLSC algorithm over the LSC algorithm, from (a); (c) summary of the improvement of the LSC algorithm over the SB algorithm and of the BLSC algorithm over the LSC algorithm when labeling black points in the same 44 images.


Fig. 19. An example of 2-ED-connected components.

of 2, 3 and 4 processors over the LSC algorithm are about 31, 39 and 45%, respectively. The smallest improvement of the LSC algorithm over the SB algorithm occurs when labeling the white points of image 7.1.07.bmp, which is shown in Fig. 18.

From the above algorithm description, we can see that the time complexity of the LSC algorithm is determined mainly by the cost of the class merging operations, which in turn is determined by the number of base classes, the number of points in each base class, and n, the ED-connectivity. Let L be the number of base classes and assume the largest base class, cMax, contains M points. In the worst case, the number of class merging operations related to cMax is (M + 2n), so the overall number of class merging operations in the worst case is L × (M + 2n). According to Refs. [12,13], any set of m class merging operations can be performed in time O(m α(m)), and in most practical cases α(m) ≤ 5. Hence the worst-case time complexity of the LSC algorithm is O(L × (M + 2n) α(L × (M + 2n))), which can be approximated by O(5L × (M + 2n)). Generally, O(5L × (M + 2n)) is much less than O(N), where N is the number of pixels in the image. The storage complexity of the LSC algorithm is determined by the number of base classes, L, and is O(L), since we allocate one memory record for each base class. The traditional connected components labeling algorithm uses an extra image of size (16 × ImageSize) to save the labeling result; for large images, this labeled image will be very large.

Finally, we mention that our algorithms can label arbitrarily connected components, which is very useful in some applications. For example, in Fig. 19, all the patterns are 2-ED-connected. The LSC algorithm can easily label them as 9 objects, whereas the traditional connected components labeling algorithms obtain 144 objects. Another application of the LSC algorithm is to label n-ED-connected components in the case n > 2, which can be found in the paper [15].

Fig. 18. Binary image of 7.1.07.bmp.

6. Conclusions

We have proposed an efficient one-pass LSC algorithm and its derived BLSC algorithm to label arbitrarily connected components in a gray image. The BLSC algorithm is suitable for parallel processing, improving the labeling speed when multiple processors are used. These algorithms merge classes whenever an equivalence between two classes is noted. Compared with a recently published connected components labeling algorithm, the time complexity of our algorithms is greatly improved: the minimum, maximum and average improvements of the LSC algorithm are 22.3, 91.7 and 67%, respectively, and when the time spent on image binarization is included, they become 28, 92.4 and 70%, respectively. With 2, 3 and 4 processors, BLSC is able to further improve the labeling speed over the LSC algorithm by about 31, 39 and 45% on average, respectively. Moreover, the LSC and BLSC algorithms provide a more convenient way to employ the labeling results for postprocessing. Experimental results have demonstrated the effectiveness of our algorithms.

Appendix A. Proof of Property 2

When a base class C(S(i, js), E(i, je)) is obtained, we need to determine whether it can be merged with previous classes. If n-ED-connectivity is assumed, only those previous classes that contain at least one point whose column coordinate falls into the interval [j1, j2], where j1 = js − n and j2 = je + n, need to be considered. We call this


restriction the inter-column limitation. The reason can be explained as follows: if no such point exists in a previous class, then for any point p(x, y) in that class, its column coordinate y satisfies

y < j1 = js − n  or  y > j2 = je + n.   (A1)

The Euclidean distance d between p(x, y) and any point q(i, j) in the base class is

d = ‖p − q‖ = √((j − y)² + (i − x)²).   (A2)

Since q(i, j) belongs to the base class C(S(i, js), E(i, je)), j satisfies

js ≤ j ≤ je.   (A3)

From Eqs. (A1) and (A3), we can derive |j − y| > n, and from Eq. (A2) it follows that d > n. This means the distance between a base class and a previous class that does not satisfy the inter-column limitation is greater than n. Hence, when checking whether a base class can merge with previous classes, we do not need to check any previous class that fails the inter-column limitation.
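As a numerical sanity check of Property 2 (the function names are ours), a point whose column violates the inter-column limitation is indeed farther than n from every point of the base class:

```python
import math

def euclid(p, q):
    """Euclidean distance between two (row, col) points, as in Eq. (A2)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def satisfies_limitation(y, js, je, n):
    """Inter-column limitation: column y must fall inside [js - n, je + n]."""
    return js - n <= y <= je + n

def min_dist_to_base(p, i, js, je):
    """Minimum distance from point p to the base class C(S(i, js), E(i, je)),
    whose points are (i, j) for js <= j <= je."""
    return min(euclid(p, (i, j)) for j in range(js, je + 1))
```

For example, with n = 1 and a base class on row 3 spanning columns 2–5, a previous-class point in column 0 fails the limitation, and its distance to every base-class point exceeds 1, so the merge check can safely skip that class. The converse does not hold: a point may satisfy the limitation and still be farther than n away, which is why the limitation is only a necessary condition used for pruning.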

References

[1] R. Gonzalez, R. Woods, Digital Image Processing, Addison-Wesley, Reading, MA, 1992, pp. 42–45.
[2] R. Haralick, L. Shapiro, Computer and Robot Vision, vol. 1, Addison-Wesley, Reading, MA, 1992, pp. 33–37.
[3] R. Klette, P. Zamperoni, Handbook of Image Processing Operators, Wiley, New York, 1996, pp. 314–319.
[4] M.B. Dillencourt, H. Samet, M. Tamminen, A general approach to connected component labeling for arbitrary image representations, Journal of the ACM 39 (2) (1992) 253–280.
[5] R. Lumia, L. Shapiro, O. Zuniga, A new connected components algorithm for virtual memory computers, Computer Vision, Graphics and Image Processing 22 (1983) 287–300.
[6] N. Otsu, A threshold selection method from gray-level histograms, IEEE Transactions on Systems, Man and Cybernetics 9 (1979) 62–66.
[7] R. Jain, R. Kasturi, B.G. Schunck, Machine Vision, McGraw-Hill, 1995, pp. 44–45.
[8] A. Rosenfeld, A.C. Kak, Digital Picture Processing, vol. 2, Academic Press, New York, 1982, pp. 241–242.
[9] A. Rosenfeld, J. Pfaltz, Sequential operations in digital picture processing, Journal of the ACM 13 (4) (1966) 471–494.
[10] H. Samet, M. Tamminen, An improved approach to connected component labeling of images, Proceedings of CVPR'86 (1986) 312–318.
[11] L. Di Stefano, A. Bulgarelli, A simple and efficient connected components labeling algorithm, International Conference on Image Analysis and Processing (1999) 322–327.
[12] R. Tarjan, Efficiency of a good but not linear set union algorithm, Journal of the ACM 22 (2) (1975) 215–225.
[13] R. Tarjan, J. van Leeuwen, Worst-case analysis of set union algorithms, Journal of the ACM 31 (2) (1984) 245–281.
[14] X.D. Yang, An improved algorithm for labeling connected components in a binary image, CVIP (1992).
[15] Y. Yang, C.S. Chua, Y.K. Ho, Real-time detecting and labelling of human body, Vision Interface 2000 (2000) 187–193.