A new algorithmic framework for basic problems on binary images

T. Asano (JAIST, Japan), L. Buzer (Université Paris-Est, LABINFO-IGM, UMR CNRS 8049, France; Department of Computer Science, ESIEE, France), S. Bereg (Department of Computer Science, University of Texas at Dallas, USA)
Article history: Received 26 September 2014; Received in revised form 14 February 2016; Accepted 29 February 2016.

Keywords: Algorithm; Binary image; Computational complexity; Connected component; Connected components labeling; Connectivity; Constant work space algorithm; Grid graph; Small work space algorithm
Abstract: This paper presents a new algorithmic framework for some basic problems on binary images. Algorithms for binary images, such as extracting the connected component containing a query pixel or connected components labeling, play basic roles in image processing. Such algorithms usually use linear work space for an efficient implementation. In this paper we propose algorithms for several basic problems on binary images which are efficient in both time and space, building on space-efficient algorithms for grid graphs. More exactly, some of them run in O(n log n) time using O(1) work space, and the others run in O(n) or O(n log n) time using O(√n) work space, for a binary image of n pixels stored in a read-only array.
1. Introduction

The demand for embedded software is growing fast with the increased use of intelligent hardware such as scanners and digital cameras, and with the availability of other smart and portable equipment. One of the most important differences between ordinary software running on computers and embedded software comes from the constraints on the size of local memory. For example, to design an intelligent scanner, a number of algorithms have to be embedded in the scanner. In most cases the size of the pictures keeps increasing, while the amount of work space available to such software is severely limited. Thus, algorithms which require a restricted amount of work space and still run reasonably fast are desired.

In this paper we propose several space-efficient algorithms designed for some fundamental image processing tasks, which essentially look for connected components of a grid graph. For example, one task is to find the number of connected components or the largest component of the grid graph. All of the tasks that we consider have straightforward solutions if sufficient memory (typically, of size proportional to the size of the image) is available (see, for example, [13,17]). Solving the same tasks with restricted memory, without severely compromising the running time, is more of a challenge.
Corresponding author: T. Asano.
E-mail addresses: [email protected] (T. Asano), [email protected] (L. Buzer), [email protected] (S. Bereg).
doi: http://dx.doi.org/10.1016/j.dam.2016.02.025
Fig. 1. Input color picture (a) and the largest component of the binary image defined by a predicate f on color values (b).
1.1. Computational model with limited work space

We measure the space efficiency of an algorithm by the amount of work space used. Such space typically takes the form of pointers and counters (whose number of bits is at most the logarithm of the image size). Our objective is to formulate efficient algorithms that use only O(1) or O(√n) such pointers and counters. Throughout the paper we assume that an input binary image consists of O(√n) rows and O(√n) columns. We also assume a read-only array keeping the input binary image, with random access to pixels.

1.2. Why a read-only array?

Why do we assume a read-only array for an input binary image? Algorithms using read-only access allow parallel processing on the same image data. Another reason is to exclude the possibility of a sophisticated mechanism that embeds some information in the input array; such techniques are used in implicit and succinct data structures (see e.g. [15]). It is known that predicates can be used to transform images and save memory. For example, suppose that we are interested in extracting an object region in a given color image C of n pixels and we know some color information characterizing the object. Then, using this information we can define a predicate f which determines in constant time whether a pixel value (color) belongs to the object. The pair (C, f) defines a binary image. In our setting, we can expect that the largest connected component in the binary image corresponds to the object region. To save memory space, we do not create a binary image from the input image, since any pixel value can be computed in constant time. This is equivalent to assuming that the binary image is stored in a read-only array. Fig. 1 shows such an example. The target object in Fig. 1(a) is a bridge, which is characterized by color intervals, say Red [80, 255], Green [0, 80], and Blue [0, 255]. Fig. 1(b) shows the largest component.
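As a small illustration of this setting, the following Python sketch treats a read-only color image together with a color predicate as a virtual binary image. The color intervals are the illustrative ones used for Fig. 1; the class and function names are our own, not taken from the paper.

# A minimal sketch of a predicate-defined binary image (C, f).
def bridge_predicate(rgb):
    r, g, b = rgb
    return 80 <= r <= 255 and 0 <= g <= 80 and 0 <= b <= 255

class VirtualBinaryImage:
    """Pair (C, f): a read-only color image C and a predicate f.
    Pixel values of the binary image are computed on demand in O(1) time,
    so no writable copy of the image is ever created."""
    def __init__(self, color_image, predicate):
        self.color = color_image      # read-only: we only index into it
        self.f = predicate

    def pixel(self, i, j):
        return 1 if self.f(self.color[i][j]) else 0

# usage: a tiny 2 x 3 color image
C = [[(200, 40, 10), (10, 200, 10), (90, 70, 255)],
     [(255, 0, 0),   (0, 0, 0),     (100, 60, 30)]]
B = VirtualBinaryImage(C, bridge_predicate)
print([[B.pixel(i, j) for j in range(3)] for i in range(2)])   # [[1, 0, 1], [1, 0, 1]]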
1.3. Basic tasks and known results

Five basic tasks for a binary image are considered in this paper:

CCC—Connected Components Counting: Count the number of connected components in the image.
MERR—Minimum Enclosing Rectangles Reporting: Report the minimum enclosing (axis-parallel) rectangle of every connected component.
LCCR—Largest Connected Component Reporting: Report the locations of all the pixels of a largest connected component. The order of those pixels is arbitrary.
BILCC—Binary Image defined by Largest Connected Component: Output, in raster order, the binary image defined by the largest connected component of the given binary image. More exactly, output the binary image in which a pixel is white if and only if it is originally white and also belongs to the largest connected component of the given image.
CCL—Connected Components Labeling: Assign integral labels to all pixels so that every black pixel has label 0 and any two white pixels have the same positive integral label if and only if they belong to the same connected component.

LCCR and BILCC are similar to each other, but their computational complexities may differ when the work space is limited. BILCC seems to be harder than LCCR because of its additional constraint on the output order (as suggested by Table 1), but it is not known whether this is true. Among the five basic tasks, Connected Components Labeling has been studied most extensively, and a number of algorithms have been proposed under several different computational models, including pipelining and parallel computation [1,4,7,8,10–12,16–20]. A nice survey [13] on this topic is also available. Since a binary image of n pixels may have O(n) connected components, the whole label matrix takes O(n log n) bits in total. If O(n) work space is available, it is not hard to design a linear-time algorithm for connected components labeling.
Fig. 2. Labeling a binary image. (a) Input binary image, and (b) resulting labels of different colors. Note that black pixels are colored white in (b).
Table 1
Time complexities of algorithms for the five basic problems, by the amount of work space allowed. CCC (Connected Components Counting), MERR (Minimum Enclosing Rectangles Reporting), LCCR (Largest Connected Component Reporting), BILCC (Binary Image defined by Largest Connected Component), and CCL (Connected Components Labeling).

Problem   O(n) space   O(1) space     O(√n) space
CCC       O(n)         O(n log n)     O(n)
MERR      O(n)         O(n log n)     O(n)
LCCR      O(n)         O(n log n)     O(n log n)
BILCC     O(n)         O(n²)          O(n log n)
CCL       O(n)         O(n² log n)    O(n log n)
Fig. 2 shows an example of a binary image. The right figure is the result of labeling using different colors for different components. Pixels in a connected component have the same label, while any two pixels from different components have different labels. Colors are used as labels in the figure, but in general the labels are integers 1, 2, . . . , c for c connected components.

We can solve all of the other four problems once we have a correct labeling. The number of connected components is given by the number of distinct labels. Minimum enclosing rectangles can be computed by scanning white pixels while maintaining the minimum and maximum x and y coordinates of the pixels in each connected component. The largest connected component is found by evaluating the area (the number of pixels) of each connected component. Thus, all four problems can be solved in O(n) time if O(n) work space is allowed (a small sketch of these reductions is given at the end of this section). The main question in this paper is how fast we can solve those problems using less work space, O(1) or O(√n).

1.4. Results obtained

Our results are summarized in Table 1.

1.5. Novelty of our approach

Designing algorithms using limited work space has been studied for many years under the name of ''log-space algorithms'' in complexity theory. The main concern for log-space algorithms has been the theoretical interest in the class of problems which can be solved in polynomial time using only constant work space for an input of size n, in addition to a read-only array of size O(n) keeping the n input data. However, restricting the work space to O(1) may be too severe for practical applications. It is quite natural to use some constant number of rows and columns as work space when working on an image of n pixels, which amounts to O(√n) for an image of size O(√n) × O(√n).

From a theoretical point of view, it is rather easy to design a linear-time algorithm for connected components labeling if we are allowed to use linear work space; a folklore algorithm using wave propagation is one such example. In this paper we propose a completely different framework for the problem. Our main purpose here is to reduce the amount of work space, assuming that the input binary image is given in a read-only array. Our algorithm outputs the label matrix row by row in raster order using just two rows of the matrix (and thus O(√n) work space for an image of n pixels) and runs in O(n log n) time. The running time depends on the complexity of the input image; if it is simple enough, the algorithm runs in linear time.
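To make the reductions described in Section 1.3 concrete, the following minimal Python sketch (all names are ours, not the paper's) derives the answers for CCC, MERR and LCCR from a precomputed label matrix in one scan.

# Deriving CCC, MERR and LCCR answers from a label matrix.
# lab[i][j] = 0 for black pixels, k >= 1 for pixels of component k.
def stats_from_labels(lab):
    rect = {}   # label -> [min_x, min_y, max_x, max_y]
    area = {}   # label -> number of pixels
    for i, row in enumerate(lab):
        for j, k in enumerate(row):
            if k == 0:
                continue
            area[k] = area.get(k, 0) + 1
            if k not in rect:
                rect[k] = [j, i, j, i]
            else:
                r = rect[k]
                r[0] = min(r[0], j); r[1] = min(r[1], i)
                r[2] = max(r[2], j); r[3] = max(r[3], i)
    count = len(area)                                    # CCC
    largest = max(area, key=area.get) if area else None  # label of a largest component (LCCR)
    return count, rect, largest

lab = [[1, 1, 0, 2],
       [0, 0, 2, 2],
       [3, 0, 0, 2]]
print(stats_from_labels(lab))
# (3, {1: [0, 0, 1, 0], 2: [2, 0, 3, 2], 3: [0, 2, 0, 2]}, 2)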
Fig. 3. A model of an image. Each pixel is represented by a small square.
Fig. 4. Sequential indices of pixels in the raster order (a) and edge indices (b). A binary image of size 7 × 9 is embedded in a larger image of size 9 × 11 by filling black pixels in the margin.
2. Preliminary: basic assumptions and definitions

For simplicity of presentation, we assume that all the border pixels of the image are black. A binary image of n pixels of size O(√n) × O(√n) is given in a read-only array. Pixels are sequentially indexed from 0 to n − 1 in the raster order. Denoting the width and height by w and h, respectively, the pixels in the array are indexed from 0 to wh − 1 = n − 1. A pixel is considered as a square with four sides (see Fig. 3). Square sides become edges if their adjacent pixels have different values. The top and the right sides (edges) of a pixel of index k are indexed 2k and 2k + 1, respectively. Thus, edges are also uniquely indexed using O(log n) bits. Fig. 4 shows how pixels and edges are indexed. All pixels are indexed, but some edges are not; for example, edges on the left and bottom sides of the image are not indexed. Those boundary edges are never accessed, so this causes no problem.

Here we summarize basic definitions and terminology (see Fig. 5).

Raster order: From left to right and from bottom to top.
Pixel color: Each pixel in a binary image has one of two distinct values, 0 and 1. A pixel of value 1 (resp. value 0) is referred to as a white (resp. black) pixel.
Connectivity: Two white pixels are 4-connected (resp. 8-connected) if they are adjacent horizontally or vertically (resp., one of them lies in the 3 × 3 neighborhood of the other). We also say that two white pixels are 4-connected (resp. 8-connected) if there is a path of white pixels such that any two consecutive pixels in the path are 4-connected (resp. 8-connected).
Connected component: A maximal subset of white pixels such that any two of them are connected.
Hole: We could define connectivity of black pixels as well. Then, a hole is a connected component of black pixels contained in a white connected component.
Connectivity for holes: Use alternate connectivities for white pixels and black ones. If the connectivity for white pixels is 4-connectivity, then that for black ones must be 8-connectivity.
Island: A connected component of white pixels contained in a hole.
Fig. 5. An example of a binary image consisting of three connected components. The largest component contains two holes and the right hole contains an island. Each of five boundaries has a unique canonical edge indicated by an arrow. (a) Lowest runs are indicated by shading. (b) A result of labeling the image using different colors.
Fig. 6. How to find the next edge (function NextEdge()) on a boundary depending on (a) 4-connectivity and (b) 8-connectivity.
Boundary: A sequence of edges between black and white pixels.
Orientation of boundary edges: Edges on a boundary are oriented so that white pixels always lie to the right.
External boundary: The boundary of a connected component of white pixels surrounded by black pixels; it is always clockwise.
Internal boundary: The boundary of a connected component of black pixels surrounded by white pixels; it is counter-clockwise.
Raster order of pixels: A pixel p1 precedes another pixel p2 in the raster order if p1 is lower than p2, or if they lie in the same row and p1 lies to the left of p2.
Canonical pixel: The pixel in a connected component C that is smallest in raster order among all the pixels in C.
Raster order of edges: A vertical edge e1 is smaller than another vertical edge e2 if e1 is lower than e2, or if they lie in the same row but e1 lies to the left of e2. We are not interested in ordering horizontal edges.
Canonical edge [14]: The vertical edge of a boundary (external or internal) that is smallest in raster order. By our convention on orientation, canonical edges of external (resp. internal) boundaries are oriented upward (resp. downward).
NextEdge(e): Given any edge on a boundary, we can compute the next edge on the boundary in constant time (see Fig. 6). So, we assume a function NextEdge(e) that computes the next edge. The function depends on which connectivity is used to define a connected component of white pixels, but it is an easy exercise to design such a function, and thus it is omitted here (see also Fig. 6(b)).
PrevEdge(e): A function that computes the previous edge of e on a boundary.
Run: A maximal sequence of white pixels in a row.
Lowest run: A run which is not adjacent to any run in the row below it.
Labeling: Assigning positive integral labels to pixels so that two pixels have the same label if and only if they belong to the same connected component.
Canonical labeling: A labeling of a binary image by sequential integral numbers in the raster order of their canonical edges.

3. Important previous results

3.1. Canonical edge

A connected region in a binary image refers to a maximal set of white pixels in which any two of them are connected by a 4-connected path of white pixels. It may contain holes, islands, and further holes of islands.
Fig. 7. Geometric model of a binary image. There are three connected components C1 , C2 and C3 . C2 contains four holes H1 , . . . , H4 , and H1 includes an island C3 . Edges are directed so that white pixels lie to their right. Canonical edges are indicated by arrows.
Fig. 8. Local condition for an edge e to be a canonical edge (a) of an external boundary and (b) of an internal boundary.
Thus, the boundary of a connected component is defined by a single external boundary and possibly more than one internal boundary. In this paper we regard each pixel as a small square with four sides; two adjacent pixels share a side. A side between two pixels of different values is called an edge. A boundary is then a sequence of such edges, horizontal or vertical. Any boundary has a unique vertical edge that is lowest (and leftmost if there are ties). This uniquely defined edge is referred to as the canonical edge of the boundary. Fig. 7 shows how canonical edges are defined on a simple binary image when each boundary is oriented so that white pixels always lie to the right.

3.2. Bidirectional search

It is known that we can enumerate all the canonical edges in O(n log n) time using the algorithm [6,9] originally designed for traversing a planar map without using any mark bits. By the definition of a canonical edge, an edge e is canonical if and only if there is no other boundary edge e' on the same boundary which is lexicographically smaller than e, that is, e' < e. The condition can be tested by following the boundary until we find a smaller edge, but this may take long. The trick for acceleration is to search in the two opposite directions simultaneously (bidirectional search) [6]. Since we can distinguish canonical edges on external boundaries from those on internal ones using only local information around them (see Fig. 7), we can count the number of connected components by counting the canonical edges on external boundaries. Thus, the task CCC can be done in O(n log n) time using constant work space [2,3].

Algorithm 1: Connected Components Counting—CCC
  c = 0. // counter for the number of connected components
  for each vertical edge e in raster order
    if LocalCondition(e) and IsCanonical(e) then increment the counter c.
  Report the counter value c as the number of connected components.

In the algorithm the function LocalCondition(e) returns True if and only if the three pixels associated with e are 0, 1, and 0 as shown in Fig. 8(a). The function IsCanonical(e) returns True if and only if e is a canonical edge, that is, there is no vertical edge e' with e' < e on the boundary containing e. To implement the function IsCanonical(e) we have to follow the boundary from e to see whether there is any vertical edge e' on the boundary such that e' < e. The idea of bidirectional search is to move two pointers along the same boundary in opposite directions starting from e. The search ends when one of the pointers reaches a vertical edge e' with e' ≤ e. If e' = e, then e is canonical, since we now know there is no vertical edge e' < e; otherwise, e is not canonical. Thus, we can implement the function as follows:

Boolean IsCanonical(e){ // Is the edge e canonical?
  ef = NextEdge(e). // the next edge of e on the boundary
  eb = PrevEdge(e). // the previous edge of e on the boundary
  while(ef ≠ e and eb ≠ e) do{
    if (IsVertical(ef) and ef < e) then return false.
    if (IsVertical(eb) and eb < e) then return false.
    ef = NextEdge(ef). // forward search
    eb = PrevEdge(eb). // backward search
  }
  return true
}

This function cannot be evaluated in constant time, but the total time needed to evaluate it for every edge is bounded by O(n log n) thanks to bidirectional search.
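The following minimal Python sketch illustrates the bidirectional test on an abstract cyclic sequence; the boundary is modeled simply as a list of comparable edge keys (so NextEdge and PrevEdge become index arithmetic). This abstraction is ours, not the paper's image-based implementation.

# Bidirectional canonicity test on an abstract cycle of comparable keys.
# keys[i] stands for the raster-order key of the i-th edge on a boundary;
# horizontal edges could be modeled by giving them an "infinite" key.
def is_canonical(keys, i):
    """Return True iff keys[i] is the smallest key on the cycle,
    walking forward and backward from i in lock step."""
    n = len(keys)
    f, b = (i + 1) % n, (i - 1) % n           # forward and backward pointers
    while f != i and b != i:
        if keys[f] < keys[i] or keys[b] < keys[i]:
            return False                      # found a smaller edge: not canonical
        f, b = (f + 1) % n, (b - 1) % n
    return True                               # wrapped around without finding a smaller key

boundary = [7, 3, 9, 2, 8, 5]                 # 2 is the unique minimum
print([is_canonical(boundary, i) for i in range(len(boundary))])
# [False, False, False, True, False, False]

In the image setting the two pointers advance with NextEdge() and PrevEdge(), and only vertical edges take part in the comparison; the O(n log n) bound over all starting edges follows from the analysis of Bose and Morin [6].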
Fig. 9. Component pixel traversal: (a) the first two rows, and (b) the entire traversal.
4. Constant-work-space algorithms

4.1. Solving MERR

We now know that the first problem, CCC, of counting the number of connected components is easily solved using only constant work space. It is rather straightforward to extend the algorithm to the task MERR. Whenever we find a canonical edge of an external boundary, we follow the boundary. Then, we can compute the minimum enclosing rectangle of the corresponding component in time linear in the boundary length by maintaining the smallest and largest x and y coordinates of the edges on the boundary. Thus, MERR can be solved in O(n log n) time using O(1) space.

4.2. Solving LCCR

It is not easy to extend the previous result so that it reports component sizes; the difficulty comes from the existence of holes. If a component has no hole, it suffices to follow its boundary to compute its area. Fortunately, it is also known that the problem can be solved in O(n log n) time with O(1) space by applying the algorithm due to Bose and Morin [6], which is described below. The algorithm of Bose and Morin [6], which works on a graph, can be modified to deal with a binary image in which the graph structure is given only implicitly. We start from the canonical edge of the external boundary of a connected component C and follow the boundaries associated with C. At each upward vertical edge e, we successively traverse the white pixels to the right of e until we reach a boundary edge f. If f is a canonical edge, then we move to f and make f the current edge. Otherwise we first check whether the next edge of e on the boundary is canonical or not. If it is the canonical edge of the external boundary (by further checking whether it is upward), then we are done. If it is the canonical edge of a hole, then we walk to the west until we encounter a boundary edge e' and let e = e'. Fig. 9 illustrates how the algorithm traverses the pixels of a connected component: Fig. 9(a) shows the first two rows, and the entire traversal is given in (b). A more formal description of the algorithm follows.

Algorithm 3: Component Pixel Traversal
  // Visiting every pixel of a connected component starting from an edge es.
  e = es. // es is the starting edge
  repeat{
    if e's right pixel is white then {
      Report consecutive white pixels until we reach a boundary edge f.
      if IsCanonical(f) then e = f.
    }
    else if IsCanonical(e) then {
      Traverse the white pixels on the left until we reach an edge f.
      e = f.
    }
    while (not IsVertical(e)) e = NextEdge(e).
  } until (e = es)

We avoid recursion by using standard techniques for non-recursive tree traversals; otherwise, the approach would need work space depending on the depth of recursion.

Theorem 4.1. Let I be a binary image. When the canonical edge of the external boundary of a connected component C is known, we can report all pixels of C in O(aC + bC log bC) time using O(1) work space, where aC denotes the area of C and bC denotes the number of edges defining the boundaries of C.
Proof. In the algorithm each pixel of the component is visited at most twice, once from left to right and once from right to left, and thus the part visiting pixels takes O(aC) time. The test IsCanonical() is implemented using bidirectional search, so the total time for the IsCanonical() tests is O(bC log bC), since every boundary edge is tested at most twice. Thus, the total time is O(aC + bC log bC).

Using the results listed above, we have the following theorem.

Theorem 4.2. Given a binary image of n pixels, we can solve LCCR in O(n log n) time using O(1) work space.

Proof. Given a binary image, we examine every boundary edge to decide whether it is the canonical edge of an external boundary; this is done in O(n log n) time in total. A largest connected component can then be computed in two phases. In the first phase we detect all canonical edges of external boundaries. For each such canonical edge e we count the number of pixels of the component associated with e by applying Algorithm 3 for component pixel traversal, and we maintain a canonical edge having the largest count. In the second phase we apply Algorithm 3 again with the canonical edge obtained in the first phase.

4.3. Solving BILCC

There are two problems associated with the largest connected component. One is LCCR, to report the locations of all pixels in the component, and the other is BILCC, to output the binary image defined by the component. We have seen that LCCR can be solved in O(n log n) time using O(1) space, but BILCC seems to be harder due to the constraint on its output order. It is rather easy to design an O(n²)-time algorithm using the linear-time algorithm by Malgouyres and More [14] for determining whether two pixels belong to the same connected component. In the first phase we find the canonical edge e of the largest connected component and its associated white pixel p. Then, for each pixel q in a second raster scan we check whether it is a white pixel that belongs to the same component as p; we output 1 if this is the case and 0 otherwise.

Lemma 4.3. Given a binary image of n pixels, we can solve the problem BILCC, that is, we can output the binary image defined by the largest connected component, in O(n²) time using O(1) work space.

4.4. Solving CCL

The last problem, CCL (Connected Components Labeling), is harder, but it can be solved in a similar way. Recall that each connected component has a unique external boundary, so components can be ordered by the raster order of their corresponding canonical edges. More precisely, we perform a raster scan. At each white pixel p, we first find the canonical edge e(p) of the connected component containing p, which is done in O(n) time. Then, we do a raster scan to enumerate all the canonical edges e1, e2, . . . , em of external boundaries that strictly precede e(p) in the raster order. This part is done in O(n log n) time since it is just the same as counting the number of connected components. The label at the pixel p is then m + 1. Thus the overall time complexity is O(n² log n).

Lemma 4.4. Given a binary image of n pixels, we can output the connected components labeling matrix of the image in O(n² log n) time using O(1) work space.
5. Basic algorithm for connected components labeling—CCL

We start with a basic algorithm for CCL (Connected Components Labeling) which runs in O(n) time and O(n) work space for a binary image of n pixels [5,17]. Although it is already known that CCL can be solved in linear time, this algorithm is the basis of our algorithm using O(√n) work space.

Suppose we are given a binary image such as the one shown in Fig. 10. We first partition the image into runs; a run is a maximal sequence of white pixels in a row. In this example we have runs r1 through r50. Then, we build a graph called the run adjacency graph. Its vertices are the runs, and two vertices are joined by an edge if their corresponding runs are connected (in the sense of 4-connectivity or 8-connectivity). The graph shown in Fig. 10 is defined by 4-connectivity.

Algorithm 2: Connected Components Labeling—CCL
  Input: a binary image of n pixels.
  Partition the given image into runs r1, r2, . . . , rm.
  Build a graph G: the vertices of G are the runs; an edge of G joins two runs containing at least one pair of connected pixels, one from each run.
  Partition G into connected components by depth-first search in O(n) space.
  For each pixel p in raster order:
    if p is black then output 0,
    else output the component index of the run containing p.
Fig. 10. An example of a binary image and its associated run adjacency graph.
Fig. 11. Decomposition of run adjacency graph into connected graph components.
Lemma 5.1. The run adjacency graph of a binary image of n pixels has O(n) vertices and edges.

Proof. The number of runs is obviously O(n). Edges are defined by vertical adjacency between white pixels, and thus the total number of edges is also bounded by O(n).

By Lemma 5.1, a linear-time depth-first search decomposes the run adjacency graph into connected graph components using O(n) work space. For example, the run adjacency graph in Fig. 10 is decomposed as shown in Fig. 11. Once we have assigned correct labels to the white pixels of the given binary image, we can output a label matrix: for each white pixel we first compute its run index and then obtain the index of the connected graph component to which it belongs, which takes constant time per pixel. Thus, the algorithm runs in linear time. Using the label matrix, we can also compute the areas of the connected components.
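A compact, runnable sketch of this folklore algorithm is given below (Python; a simplification assuming 4-connectivity and top-to-bottom row storage, with names of our own choosing).

# Connected components labeling via runs and a run adjacency graph (4-connectivity).
# img is a list of rows of 0/1 values; returns a label matrix with labels 1..c.
def runs_of_row(row):
    """Maximal runs of 1-pixels as (start, end) column intervals, end inclusive."""
    runs, j, w = [], 0, len(row)
    while j < w:
        if row[j] == 1:
            s = j
            while j < w and row[j] == 1:
                j += 1
            runs.append((s, j - 1))
        else:
            j += 1
    return runs

def label_image(img):
    rows = [runs_of_row(r) for r in img]
    ids, base = [], 0                          # global run ids per row
    for rr in rows:
        ids.append(list(range(base, base + len(rr))))
        base += len(rr)
    adj = [[] for _ in range(base)]            # run adjacency graph
    for i in range(1, len(rows)):
        # all-pairs check for brevity; merging the sorted run lists makes this linear
        for a, (s1, t1) in zip(ids[i - 1], rows[i - 1]):
            for b, (s2, t2) in zip(ids[i], rows[i]):
                if max(s1, s2) <= min(t1, t2):  # runs share a column: 4-connected
                    adj[a].append(b)
                    adj[b].append(a)
    label, c = [0] * base, 0                   # component label per run
    for v in range(base):
        if label[v] == 0:
            c += 1
            stack, label[v] = [v], c           # iterative depth-first search
            while stack:
                u = stack.pop()
                for x in adj[u]:
                    if label[x] == 0:
                        label[x] = c
                        stack.append(x)
    out = [[0] * len(img[0]) for _ in img]     # expand run labels to pixels
    for i, rr in enumerate(rows):
        for rid, (s, t) in zip(ids[i], rr):
            for j in range(s, t + 1):
                out[i][j] = label[rid]
    return out

img = [[1, 1, 0, 1],
       [0, 1, 0, 1],
       [1, 0, 0, 1]]
for row in label_image(img):
    print(row)
# [1, 1, 0, 2]
# [0, 1, 0, 2]
# [3, 0, 0, 2]

The all-pairs adjacency test is written for brevity; replacing it with the merge of the two sorted run lists by right endpoints, as described in Section 6, keeps the whole algorithm linear in n.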
6. O(√n)-space algorithms for CCC and MERR

From now on, we consider algorithms using O(√n) work space to solve the two problems CCC and MERR. In Section 5 we gave a linear-space algorithm for connected components labeling using a run adjacency graph. In our algorithm we read an input image row by row in the raster manner and convert each row into a sequence of runs.
More exactly, a run r is a maximal sequence of white pixels in a row; it is associated with an interval I(r) = {s, s + 1, . . . , t} when it starts at the sth column and ends at the tth column. So, a white run (s, t) in the ith row means that white pixels continue from the sth column to the tth column of that row and the (s − 1)st and (t + 1)st pixels of the row are black (if they exist). A white run r1 ''intersects'' another white run r2 if they are in consecutive rows and their associated intervals I(r1) and I(r2) have non-empty intersection.

The first phase of our algorithm partitions each row into a sequence of runs. Secondly, we construct a graph called the run adjacency graph, whose vertices are the runs and whose edges join intersecting runs in consecutive rows. The third phase partitions the graph into connected components {C1, C2, . . . , Ck} by a depth-first search and assigns the integral label i to each run in the component Ci. To distinguish connected components of a graph from those of a binary image, we call the former graph components and the latter image components; there is a one-to-one correspondence between them. The last phase converts the run representation into a labeling matrix using the label of each run.

The algorithm described above is almost the same as the old one by Rosenfeld and Pfaltz [18], which consists of two phases: a horizontal scan to partition each row into runs, and a vertical scan to merge vertically adjacent runs using a union-find data structure. The differences are that (1) we use a horizontal scan twice (instead of a horizontal scan followed by a vertical scan) and (2) we use a depth-first search for computing graph components after building the run adjacency graph (instead of a union-find tree data structure). Since depth-first search runs in linear time, the whole algorithm runs in linear time. This is folklore knowledge, although such a formal statement is rather rare in the literature.

In this paper we are interested in space-efficient algorithms, especially ones using O(√n) space. Due to the space constraint we cannot build the whole run adjacency graph. A key idea is to use the O(√n) work space to keep the sets of runs of two consecutive rows. The algorithm is as follows. Assuming that the previous row has already been scanned and all runs in that row have been stored in an array A[.], we scan the current row and store all its runs in another array B[.]. After putting all the runs in B[.] into a set V of vertices, we successively find intersecting runs from the two arrays A[.] and B[.], which define graph edges. This is done in linear time by taking runs from the two arrays A[.] and B[.] in increasing order of the right endpoints of their intervals (see the sketch below). When the current row has been processed, we exchange the roles of A[.] and B[.] and proceed to the next row. Once we have enumerated all the runs and their adjacencies to define a run adjacency graph G, we partition G into graph components {C1, C2, . . . , Ck}; each run in a graph component Ci gets the label i. Using this information it is straightforward to output the label matrix in the raster order in linear time.
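A minimal Python sketch of this merge step (our own function name, assuming each run is a (start, end) column interval and each row's runs are listed left to right):

# Report all intersecting pairs between the runs of two consecutive rows
# in time linear in the number of runs (both lists are sorted by column).
def intersecting_pairs(prev_runs, cur_runs):
    pairs, a, b = [], 0, 0
    while a < len(prev_runs) and b < len(cur_runs):
        s1, t1 = prev_runs[a]
        s2, t2 = cur_runs[b]
        if max(s1, s2) <= min(t1, t2):     # intervals overlap: runs are 4-adjacent
            pairs.append((a, b))
        if t1 < t2:                        # advance the run with the smaller right endpoint
            a += 1
        else:
            b += 1
    return pairs

print(intersecting_pairs([(0, 2), (5, 6)], [(2, 3), (4, 8)]))   # [(0, 0), (1, 1)]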
Provisional label: an idea to save space. Unfortunately, there are O(n) runs in a binary image and thus O(n) vertices in its associated run adjacency graph. There are two ideas for reducing the work space.

The first idea is to introduce a notion of provisional labels, which are labels used temporarily in the algorithm and may be different from the final labels to be reported. More importantly, we can use the same provisional label for two graph components if they are ''clearly'' separated. We read an input binary image in the raster order, row by row. We read the first row and put provisional labels on the runs in that row. Then, in the second row, we construct a modified run adjacency graph for the set of runs in the two rows and partition it into connected components (graph components). In this way we know how the runs in the first two rows are connected, and thus we can put provisional labels on the runs in the second row. In general, assuming that the runs in the previous row have been labeled, we put provisional labels on the runs in the current row by examining connectivities among those runs. We call a run colored if it is labeled (for example, the runs in the previous row); otherwise the run is called white. The graph components can be found by applying depth-first search, which runs in linear time. The resulting graph components are classified as follows (see Fig. 12):
(1) a starting component consists of a single white run in the current row,
(2) a terminating component consists of colored runs in the previous row only, and
(3) an extending component includes both colored and white runs of the two rows.

We process them as follows (see Figs. 12 and 13).

1. Starting component. If a graph component C is a singleton consisting of a single white run in the current row, then we create a new image component, which may be merged with another existing image component later in the scan. We create a new image component, increment the value of L, and assign that value to the new image component. We maintain the size of each image component in an array size[.], which is initialized with the run length.

2. Terminating component. By definition, a terminating component C does not contain any run of the current row. It may have two or more colored runs, which must be connected in the underlying graph; therefore, all the colored runs in the component must have a single common provisional label. Since no colored run in the component intersects any white run of the current row, it does not extend to the current row. That is, the corresponding image component terminates in the previous row. This event is called the termination of the image component.

3. Extending component. If a component C has at least one colored run and at least one white run, then the corresponding image component may extend to the next row. If k is the smallest label among the colored runs in the graph component, then all the pixels associated with the labels and runs in the graph component should be labeled k in the next row. At the same time, we update the size of the merged graph component C, labeled k, to the sum of the sizes of all associated image components.
Fig. 12. A set of runs in two consecutive rows (rows i − 1 and i). Runs r1, . . . , r5 in the previous row (row i − 1) have been labeled by connectivities established in the already scanned part. The runs r7 and r9 in the current row intersect no run in the previous row, and hence they may create new components. On the other hand, the component associated with the runs r3 and r4 does not extend to the current row, and hence the component terminates here. The corresponding run adjacency graph is given below in the figure.
Fig. 13. Three kinds of components associated with the current row (row i), (a) starting, (b) terminating, and (c) extending components.
6.1. O(√n)-space algorithm for CCC

We present an algorithm that computes not only the number of components but also reports, for each component, its canonical edge and its size. This additional information will be useful for solving LCCR later. An algorithm for CCC alone can be obtained by simplifying Algorithm 4 (removing the arrays CA[..] and CB[..]).

Algorithm 4: Counting the number of connected components.
  Input: A binary image of n pixels.
  Output: The number of connected components in the input image.
  A[..] = ∅; // an array to store the runs of the previous row
  CA[..] = ∅; // an array to store information about previous components
  c = 0; // a counter for the components
  for each row{
    Create the run adjacency graph GL(VL, EL):
      Process the current row and convert it into runs of white pixels;
      Let B be the resulting set of runs;
      Define a vertex set VL by the provisional labels in A and the runs in B;
      EL = ∅;
      for each pair of mutually intersecting runs (ri, rj), ri ∈ A, rj ∈ B
        EL = EL ∪ {(ri, rj)};
    Partition GL into graph components {C1, C2, . . . , Cm};
    Find the type of each graph component and order the components so that the terminating components follow the other components;
    for each i ∈ {1, . . . , m}{
      if Ci is terminating then { // report
        c = c + 1;
        Output ''c-th component:'' and the information about it stored in CA[j], where j is the label of a run in Ci;
      }
      if Ci is starting then { // add a new component
        Let rj be the run in Ci;
        Store in CB[i] the canonical edge of Ci (using the first pixel of rj) and the area of Ci (the number of pixels of rj);
      }
      if Ci is extending then { // merge old components and new runs
        Store in CB[i] the canonical edge of Ci, that is, the smallest (in raster order) edge of all its previous components, and the area of Ci, that is, the sum of the sizes of its old components and its new runs;
      }
    }
    for each rj ∈ B, set the label of rj to i if rj ∈ Ci;
    A = B; CA = CB; // prepare for the next row
  }
  Report the value of the counter c;

Theorem 6.1. Given a binary image of n pixels, we can report the number of connected components and the size of each component in O(n) time using O(√n) work space.
Proof. Let I be a binary image with h rows and w columns (both assumed to be O(√n)), and let Ii be the partial image of I consisting of its first i rows. Let Ri be the set of all runs in row i. A labeling of Ri is a set of labeled runs of Ri. A labeling of Ri is said to be correct if any two labeled runs in Ri have the same label if and only if they belong to the same connected component of Ii. To prove the theorem it suffices to show that Algorithm 4 maintains a correct labeling. We prove that, if A is a correct labeling of Ri−1, then the algorithm computes in B a correct labeling of Ri. Suppose that two runs ra and rb from B belong to the same connected component of Ii. Then, there exists a path connecting a pixel of ra to a pixel of rb. This path must contain a subpath connecting two pixels of two runs ra′ and rb′ from A, and this subpath lies completely in Ii−1. Therefore, ra′ and rb′ must have the same label, and Algorithm 4 labels ra and rb with the same label. Similarly, we can prove that, if two runs ra and rb from B have the same label, then they belong to the same connected component of Ii.
6.2. O(√n)-space algorithm for MERR

It is rather straightforward to modify Algorithm 4 for CCC so that it also reports the minimum enclosing rectangle of every connected component using O(√n) work space. When the first run of a starting component is found, we set the rectangle to the one spanned by that run. When two components are merged, we update the rectangle to the smallest rectangle containing the corresponding rectangles. When a component terminates, we report its rectangle. This algorithm reports the minimum enclosing rectangle of every connected component in O(n) time and O(√n) work space.
6.3. O(√n)-space algorithm for LCCR

We already have an O(n log n)-time and O(1)-space algorithm for the problem LCCR (Largest Connected Component Reporting). Can we solve it in linear time using O(√n) work space? Unfortunately, this seems very hard, but a small improvement is possible. More precisely, we can design an algorithm which runs in O(n + m log m) time using O(√n) work space, where m is the size of a largest connected component.

We modify the algorithm that reports the smallest enclosing rectangle of each connected component. By the definition of the canonical edge of the external boundary of a connected component C, it is located at the leftmost lowest corner of C. We modify the algorithm so that it maintains, for each label, (1) the leftmost lowest edge and (2) the number of pixels. Then, in one scan over the input image we obtain the leftmost lowest edge e of a largest component in O(n) time. Finally, we apply the O(1)-space algorithm for component pixel traversal, which runs in O(m log m) time for a connected component of m pixels.

Theorem 6.2. Given a binary image of n pixels, we can report all pixels of a largest connected component of size m in O(n + m log m) time and O(√n) work space.
7. O(√n)-space algorithms for CCL and BILCC

7.1. O(n)-space and O(n)-time algorithm

To explain our basic idea, we first present a linear-time algorithm using linear work space. We use two linear-size arrays, one for the label matrix and the other for keeping labels on boundary edges. We scan the given binary image in the raster order. Whenever we encounter an edge which has not been labeled yet, we create a new label and follow the boundary from the edge while writing the label into the array of edge labels. If two white pixels are consecutive in a row, we simply propagate the label from left to right. This simple algorithm works for a binary image without holes. To deal with holes, we also follow a boundary whenever we have a transition from white to black (a transition from '1' to '0'). If the edge associated with the transition has not been labeled, we first obtain a label from the pixel to the left of the edge (it must have been labeled already) and then follow the boundary while writing the label into the edge array.

By the definition of canonical edges, for any boundary (external or internal) the first edge we encounter in the scan must be its canonical edge (see Fig. 14). If we keep a label for each component, such an edge can be detected when we encounter an edge which has not been labeled yet. Note that the canonical edge of the external boundary of a connected component is detected earlier than the canonical edges of the internal boundaries of that component. So, if we create a label when we find the canonical edge of an external boundary, put the new label on the edges of that boundary, and propagate the label to the right, the label reaches the canonical edges of the hole(s) in the component before we detect those edges by '10' transitions.

Algorithm 5: Computing the label matrix using O(n) space.
  Input: A binary image B[i], i = 0, . . . , wh − 1.
  Output: The label matrix after connected components labeling.
  Work space: lab[0..wh − 1], edge[0..2wh − 1].
  L = 0; // the number of labels used so far
  Initialize the arrays lab[.] and edge[.];
  for i = 1 to h − 1 {
    for j = 1 to w − 1 { // pixels in the raster order
      p = i × w + j; // pixel at (i, j)
      es = 2p − 1; // vertical edge between (i, j − 1) and (i, j)
      switch on the value of B[p − 1]B[p] {
        case '00': // black pixel p
          lab[p] = 0; break;
        case '11': // next pixel of the run
          lab[p] = lab[p − 1]; break;
        case '01': { // boundary pixel p
          if edge[es] = 0 then { // new component
            L = L + 1; // new label
            e = es; // e: current edge
            repeat { // traverse the boundary of component L
              edge[e] = L; e = NextEdge(e);
            } until (e = es);
          }
          lab[p] = edge[es]; // label of the (new or old) component on edge es
          break;
        }
        otherwise: { // case '10', pixel p − 1 is on a boundary
          lab[p] = 0;
          if edge[es] = 0 then { // unlabeled boundary
            e = es; // start edge es and current edge e
            repeat { // traverse the unlabeled boundary
              edge[e] = lab[p − 1]; e = NextEdge(e);
            } until (e = es);
          }
        }
      }
    }
  }
  Output the current content of the array lab[0..wh − 1];
Fig. 14. Lowest runs and runs adjacent to holes.
So, Algorithm 5 follows a boundary whenever it finds a run adjacent to a canonical edge. Since we examine pixels in the raster order, we encounter the canonical edges of external boundaries in the raster order. By definition, the canonical edge of an external boundary is smaller in the raster order than any canonical edge of an internal boundary of a hole in the same component. Thus, the labeling obtained by Algorithm 5 is canonical, and whenever we find a run adjacent to the canonical edge of a hole, the correct label has already been written in the array edge[.]. The running time of Algorithm 5 is O(n), since we follow each boundary only once and the remaining operations obviously take linear time.

Lemma 7.1. Given a binary image of n pixels, Algorithm 5 computes its correct canonical labeling in O(n) time and O(n) work space.
7.2. O(√n)-space and O(n√n)-time algorithm

We have shown that there is a folklore algorithm (Algorithm 2) which solves Connected Components Labeling in O(n) time using O(n) work space. Our main concern is to reduce the space complexity while suppressing the increase in time complexity. In Algorithm 5 we used two arrays of linear size. It is easy to reduce the array for the label matrix to one of size O(√n): Algorithm 5 outputs the label matrix row by row, so we do not need to keep the whole array; one or two rows are enough. How about the array storing the labels of edges? The labels stored on edges are needed to propagate labels along (external and internal) boundaries. Is it possible to get the label information without using this array? Yes, it is. Our idea is very simple. Whenever we encounter a boundary edge, we follow the boundary while checking whether the pixels along the boundary are labeled or not. Since we examine pixels in the raster order, if we start from any edge which is not the canonical edge of an external boundary, we must reach a boundary edge which is smaller in the raster order than the starting edge. In our algorithm we create a new label when we find the canonical edge of the external boundary of a new component. If the current boundary edge e is not canonical, then we follow the boundary and get a label, which is the correct label of the pixel adjacent to e. We can determine whether an edge e is canonical or not just by following the boundary from e and checking whether we see any edge e' < e.
Algorithm 6: Computing the label matrix using O(√n) space.
  Input: A binary image B[i], i = 0, . . . , wh − 1.
  Array: lab[0..1][0..w − 1]. // label array for two rows
  L = 0; // the number of labels used so far
  Initialize the array lab[0..1][0..w − 1];
  a = 0; b = 1;
  for i = 1 to h − 1 { // the labels of the previous row are stored in lab[b][.]
    for j = 1 to w − 1 { // pixels in the raster order
      p = i × w + j; // pixel at (i, j)
      es = 2p − 1; // vertical edge between (i, j − 1) and (i, j)
      switch on the value of B[p − 1]B[p] {
        case '00': // empty pixel p
          lab[a][j] = 0; break;
        case '11': // next pixel of the run
          lab[a][j] = lab[a][j − 1]; break;
        case '01': // boundary pixel p
          e = es; // e: current edge
          repeat { // traverse the boundary from es
            e = NextEdge(e);
          } until (e = es, or e is adjacent to a pixel with a positive label k in row i (in lab[a][.]) or in row i − 1 (in lab[b][.]));
          if e = es then { // new component
            L = L + 1; k = L; // new label
          }
          lab[a][j] = k; break;
        otherwise: // case '10'
          lab[a][j] = 0;
      }
    }
    Output the content of the array lab[a][0..w − 1];
    Exchange the roles of a and b (a = 1 − a, b = 1 − b);
  }
Fig. 15. Worst case examples: (a) an image for which Algorithm 6 takes Ω(n√n) time, and (b) an image for which Algorithm 7 takes Ω(n log n) time.

Lemma 7.2. Given a binary image of n pixels, Algorithm 6 computes its correct canonical labeling in O(n√n) time using O(√n) work space. There is an example for which the algorithm takes Ω(n√n) time.

Proof. We follow boundaries whenever we encounter an edge associated with a transition '01'. There may be O(√n) such edges in a row and each boundary may consist of O(n) edges. However, we never follow the same boundary more than once, since we can stop whenever we come to an edge which is smaller in the raster order than the starting edge. Thus, the total length of the boundaries we follow in a row is bounded by O(n). Since there are O(√n) rows, the total running time is bounded by O(n√n). The correctness of the algorithm can be proved by induction.
There is an example for which the algorithm takes Ω(n√n) time. See Fig. 15(a) for a worst case example. The binary image contains only one connected component, which has O(√n) lowest runs, depicted by bold lines. Whenever we encounter the left edge of such a lowest run we follow the boundary in clockwise order. For the lowest run starting at x = k, the length of our traversal is O(k²) by the construction. Thus, it takes time O(Σ_{k=1}^{O(√n)} k²) = O(n√n).
7.3. O(√n)-space and O(n log n)-time algorithm

It is surprisingly easy to accelerate Algorithm 6 while keeping the size of the work space. Consider the worst case example shown in Fig. 15(a). It takes much time because we follow the boundary in the clockwise direction. If we follow it counterclockwise instead, then we immediately find a lower edge, and the total traversal time along that boundary is just linear. This suggests a bidirectional search for a lower boundary edge. Bose and Morin [6] achieved O(n log n) time using a bidirectional traversal of a planar subdivision. More exactly, at each boundary edge we examine the boundary in the two opposite directions using the two functions NextEdge() and PrevEdge(), which find the next and previous edges on the boundary. If the starting edge e is not canonical, then one of the pointers reaches an edge which is smaller in the raster order than e, and that edge must be adjacent to a labeled pixel. Otherwise the two pointers meet without finding such a pixel. Bose and Morin [6] showed that the bidirectional search runs in O(n log n) time without using any extra array. The algorithm is described as follows.

Algorithm 7: Computing the label matrix using bidirectional search.
  Input: A binary image B[i], i = 0, . . . , wh − 1.
  Array: lab[0..1][0..w − 1]. // label array for two rows
  L = 0; // the number of labels used so far
  Initialize the array lab[0..1][0..w − 1];
  a = 0; b = 1;
  for i = 1 to h − 1 { // the labels of the previous row are stored in lab[b][.]
    for j = 1 to w − 1 { // pixels in the raster order
      p = i × w + j; // pixel at (i, j)
      es = 2p − 1; // vertical edge between (i, j − 1) and (i, j)
      switch on the value of B[p − 1]B[p] {
        case '00': // empty pixel p
          lab[a][j] = 0; break;
        case '11': // next pixel of the run
          lab[a][j] = lab[a][j − 1]; break;
        case '01': // boundary pixel p
          ef = eb = es; // two pointers moving in opposite directions
          d = 0; // search direction: 0 for forward
          repeat { // traverse the boundary in the two directions alternately
            if d = 0 then ef = NextEdge(ef) else eb = PrevEdge(eb);
            d = 1 − d; // alternate the direction
          } until (ef = eb, or either ef or eb is adjacent to a pixel with a positive label k in row i or row i − 1);
          if ef = eb then { // new component
            L = L + 1; k = L; // new label
          }
          lab[a][j] = k; break;
        otherwise: // case '10'
          lab[a][j] = 0;
      }
    }
    Output the content of the array lab[a][0..w − 1];
    Exchange the roles of a and b (a = 1 − a, b = 1 − b);
  }
Theorem 7.3. Given a binary image of n pixels, Algorithm 7 computes the canonical labeling in O(n log n) time using O(√n) work space. There is an example for which the algorithm takes Ω(n log n) time.

Fig. 15(b) shows a worst case example for which the algorithm needs Ω(n log n) time. It consists of many vertical extensions. A horizontal interval [a, b] is recursively divided into two parts [a, m] and [m, b] by a center vertical extension at m. The length of the vertical extension at m is shorter by a constant than the ones at a and b. Thus, we have extensions of O(log n) different lengths. The length of the traversal using bidirectional search is bounded below by Ω(n/2^{k−1}) for the lowest run associated with a vertical extension of the kth longest length, and there are 2^{k−2} such extensions. Thus, the total length of our traversal is Ω(n log n).
7.4. O(√n)-space algorithm for BILCC

We have shown that Algorithm 7 solves the problem CCL, Connected Components Labeling, in O(n log n) time using O(√n) space for a binary image of n pixels. We can combine Algorithm 7 with the algorithm given earlier for finding the canonical edge of the largest connected component. Then we know the label k of the largest connected component, and it is easy to output the binary image defined by that component: for each pixel in raster order we output 1 if its label is k and 0 otherwise.

Theorem 7.4. Given a binary image of n pixels, we can solve the problems CCL and BILCC in O(n log n) time and O(√n) work space.

8. Conclusions

In this paper we have proposed a new algorithmic framework for several basic problems in image processing, especially on binary images. Assuming that the input image is stored in a read-only array, the proposed algorithms are efficient both in time and space. In addition to constant-work-space algorithms, we have presented algorithms requiring only O(√n) work space for a binary image of n pixels. Counting the number of connected components is rather easy; in fact we gave a linear-time algorithm when O(√n) work space is available. However, it remains open whether it is possible to report all pixels of a largest connected component in linear time.

In this paper we implicitly assumed 4-connectivity for white pixels, that is, two white pixels are directly 4-connected if they are horizontally or vertically adjacent. It is rather easy to adapt our algorithms to 8-connectivity: it suffices to modify the definition of intersection between two runs in consecutive rows and the functions that determine the next and previous edges on a boundary. For 4-connectivity a run [s1, t1] intersects another run [s2, t2] if their associated intervals have non-empty intersection, that is, [s1, t1] ∩ [s2, t2] ≠ ∅. For 8-connectivity it is the case if [s1, t1] ∩ [s2 − 1, t2 + 1] ≠ ∅ or [s1 − 1, t1 + 1] ∩ [s2, t2] ≠ ∅. A small sketch of this test is given below.
Acknowledgments

Part of this research by T. Asano was partially supported by the Ministry of Education, Science, Sports and Culture (Grant Nos. 24106004 and 23300001), Grant-in-Aid for Scientific Research on Priority Areas and Scientific Research (B). The authors would like to thank David Kirkpatrick for his valuable suggestions.

References

[1] H.M. Alnuweiri, V.K. Prasanna, Parallel architectures and algorithms for image component labeling, IEEE Trans. Pattern Anal. Mach. Intell. 14 (10) (1992) 1024–1034.
[2] T. Asano, In-place algorithm for erasing a connected component in a binary image, Theory Comput. Syst. 50 (1) (2012) 111–123.
[3] T. Asano, S. Bereg, D. Kirkpatrick, Finding nearest larger neighbors: A case study in algorithm design and analysis, in: S. Albers, H. Alt, S. Naeher (Eds.), Efficient Algorithms, Lecture Notes in Computer Science, Springer, 2009, pp. 249–260.
[4] P. Bhattacharya, Connected component labeling for binary images on a reconfigurable mesh architecture, J. Syst. Archit. 42 (4) (1996) 309–313.
[5] A. Bishnu, B.B. Bhattacharya, M.K. Kundu, C.A. Murthy, T. Acharya, A pipeline architecture for computing the Euler number of a binary image, J. Syst. Archit. 51 (2005) 470–487.
[6] P. Bose, P. Morin, An improved algorithm for subdivision traversal without extra storage, Internat. J. Comput. Geom. Appl. 12 (4) (2002) 297–308.
[7] F. Chang, C.J. Chen, C.J. Lu, A linear-time component-labeling algorithm using contour tracing technique, Comput. Vis. Image Understand. 93 (2004) 206–220.
[8] A. Choudhary, R. Thakur, Connected component labeling on coarse grain parallel computers: An experimental study, J. Parallel Distrib. Comput. 20 (1994) 78–83.
[9] M. de Berg, M. van Kreveld, R. van Oostrum, M. Overmars, Simple traversal of a subdivision without extra storage, Int. J. Geogr. Inf. Sci. 11 (1997) 359–373.
[10] M.B. Dillencourt, H. Samet, M. Tamminen, A general approach to connected-component labeling for arbitrary image representations, J. ACM 39 (2) (1992) 253–280.
[11] T. Goto, Y. Ohta, M. Yoshida, Y. Shirai, High speed algorithm for component labeling, Trans. IEICE J72-D-II (2) (1989) 247–255 (in Japanese).
[12] L. He, Y. Chao, K. Suzuki, A run-based two-scan labeling algorithm, IEEE Trans. Image Process. 17 (5) (2008) 749–756.
[13] R. Klette, A. Rosenfeld, Digital Geometry: Geometric Methods for Digital Picture Analysis, Elsevier, 2004.
[14] R. Malgouyres, M. More, On the computational complexity of reachability in 2D binary images and some basic problems of 2D digital topology, Theoret. Comput. Sci. 283 (2002) 67–108.
[15] J.I. Munro, H. Suwanda, Implicit data structures for fast search and update, J. Comput. System Sci. 21 (2) (1980) 236–250.
[16] A. Rosenfeld, Connectivity in digital pictures, J. ACM 17 (1) (1970) 146–160.
[17] A. Rosenfeld, A.C. Kak, Digital Picture Processing, Vol. 2, second ed., Academic Press, San Diego, CA, 1982.
[18] A. Rosenfeld, J.L. Pfaltz, Sequential operations in digital picture processing, J. ACM 13 (4) (1966) 471–494.
[19] H. Samet, Connected component labeling using quadtrees, J. ACM 28 (3) (1981) 487–501.
[20] K. Suzuki, I. Horiba, N. Sugie, Linear-time connected-component labeling based on sequential local operations, Comput. Vis. Image Understand. 89 (2003) 1–23.