A sparse structure for fast circle detection

Pattern Recognition 97 (2020) 107022 · https://doi.org/10.1016/j.patcog.2019.107022

Yuanqi Su∗, Xiaoning Zhang, Bonan Cuan, Yuehu Liu, Zehao Wang
School of Electronic and Information Engineering, Xi'an Jiaotong University, China

∗ Corresponding author at: Department of Computer Science and Technology, School of Electronic and Information Engineering, Xi'an Jiaotong University, China. E-mail addresses: [email protected] (Y. Su), [email protected] (X. Zhang), [email protected] (B. Cuan), [email protected] (Y. Liu), [email protected] (Z. Wang).

Article history: Received 29 December 2018; Revised 11 July 2019; Accepted 25 August 2019; Available online 26 August 2019.

Keywords: Circle detection; Hough transform; Voting; Sparse structure; Oriented chamfer distance.

Abstract

In this paper, we present a circle detector that achieves state-of-the-art performance on almost every type of image. The detector represents each circle instance by a set of equally distributed arcs and searches for the same number of edge points to cover these arcs. The new formulation leads to voting in a minimizing/maximizing manner, which differs from the typical accumulative manner adopted by the Hough transform. From the formulation, circle detection is decomposed into a radius-dependent and a radius-independent part. The calculation of the independent part is computationally expensive but shared by different radii. This decomposition removes the redundant computation in handling multiple radii and therefore speeds up the detection process. Motivated by the sparse nature of the independent part, we design a sparse structure for its batch computation, which is fulfilled in just one sweep of the edge points. A circle detector based on this sparse structure is then proposed that achieves time complexity comparable to the Hough-transform algorithm using a 2D accumulator array. For testing, we created an information-rich dataset with images coming from multiple sources. It contains five categories and covers a wide spectrum of images, ranging from true-color images to binary ones. The experimental results demonstrate that the proposed approach outperforms the solutions based on accumulative voting.

1. Introduction

Being one of the basic geometric elements, the circle appears very often in daily life, and circle detection is an ingredient of many visual tasks, such as ball detection [1], cell segmentation [2], iris localization [3–5], traffic sign detection [6,7], and so on. It has been widely studied and is a critical issue in computer vision and pattern recognition. Though the circle is simple in its geometric structure, circle detection is still challenging. A circle is usually parameterized by

$$(x - o_x)^2 + (y - o_y)^2 = r^2 \tag{1}$$

where (x, y) is an edge point, o = (o_x, o_y) the circle center, and r the radius. The combination (o, r) gives a circle instance, and all possible combinations make up the parameter space. Given an image, circle detection is to extract a set of instances from the parameter space. Over the past four decades, many solutions have been proposed for the problem. However, these solutions fail frequently when cluttered backgrounds, occlusion, and inevitable imaging


defects and deformation are present. Current solutions are still far from satisfactory, especially for real images. This is also verified by the results of a contest held by the International Association for Pattern Recognition (IAPR). Since 1995, the IAPR International Workshop on Graphics Recognition (GREC) has regularly held graphics recognition contests to evaluate state-of-the-art methods. One of its challenges is dedicated to arc and circle segmentation, and its recent report [8] demonstrates that there is a lot of room for improvement in circle detection.

The classical solutions for circle detection are based on the Hough transform (HT) [9,10] and the Randomized Hough Transform (RHT) [11]. They verify a circle instance by counting the number of edge points falling on its boundary. The accumulative way is easily influenced by noise, especially when a cluttered background is present. In this paper, we discard the classical accumulative way and turn to a minimizing/maximizing manner. Our method represents a circle as a set of disjoint segments and searches for edge points to cover these segments. The formulation is then implemented via a 3D voting, fulfilled by letting the edge points vote for combinations of segment number and spatial location. To speed up this voting, we explore the sparsity behind the process and design a sparse structure to aid its calculation. The main contributions of the paper include:

1. A novel formulation: The new formulation extends the idea proposed in [12] that searches for a pre-determined number of


maximally compatible edge points to cover the circle boundary. From the formulation, we decompose circle detection into a radius-dependent and a radius-independent part, and meanwhile prove the sparse characteristics of the radius-independent part.

2. A sparse structure: The independent part is computationally expensive but can be shared by the radius-dependent calculations for different radii. Besides, the outputs of the independent part are statistically sparse. We explore this sparsity and design a sparse structure for the calculation of the independent part.

3. A 3D voting scheme: Calculation of the radius-independent part is implemented via a 3D voting with the sparse structure. It works by letting the edge points vote for a subset of the combinations of segment number and spatial location. With the sparse structure, the voting can be updated very quickly.

4. A comprehensive dataset: To test the algorithm, we construct a dataset composed of 5 categories that present considerable diversity in imaging conditions, backgrounds, occlusion, deformation, and distribution of circle radii and centers.

In the following sections, we explain these points in detail. In Section 2, algorithms proposed for circle detection are reviewed from the perspective of candidate generation and verification. After that, our formulation for circle detection is explained in Section 3, where the implementation of the proposed circle detection is decomposed into the calculation of the radius-dependent and -independent parts. In Section 4, we describe the sparse structure in detail and give the algorithm for the radius-independent part with the structure. In Section 5, we report the experimental evaluations and discuss the time complexity of the proposed algorithm. The conclusions and some further avenues are summarized in the last section.

2. Related work

The research on circle detection can be traced back to the Hough transform (HT) [9]. A recent survey of HT [13] summarizes its history for circle detection. HT for circles lets each edge point vote for a cone of (o, r) defined by Eq. (1), with voting values accumulated in an array. According to the accumulated values, some entries of the array are then picked out as candidates. Based on HT, many algorithms have been proposed for improving the detection accuracy, reducing the computational complexity, or saving storage space. In a survey paper by Yuen et al. [14], some early variants based on HT were reviewed and experimentally compared. One of these was implemented as a standard module of the Open Source Computer Vision library (OpenCV) [15].

2.1. Hough transform and its variants

To speed up voting, Kimme et al. [16] adopted the gradient orientation to restrict the voting range. In their modification, each edge point votes for the circle centers along its gradient direction. This voting way is the circle case of the generalized Hough transform (GHT) [10]. For multiple radii, GHT deals with them one by one and uses a 3D accumulator array to record the voting values. The technique for handling multiple radii is the major burden on computational efficiency. Some proposed to consider a range of circle radii in just one round of voting [14,17,18]. These methods can save storage space and avoid handling radii one by one. One of the methods [19] selected for comparison belongs to this type. It is implemented by first localizing the circle center and then resolving the radius. In detection, each point votes for a line segment determined by the range of radii.
Algorithms with the modification usually use a two- instead of three-dimensional accumulator array and thus reduce the memory requirement.

However, the votes for line segments introduce aliasing artifacts in the accumulator array [18,20,21]. To isolate the effects of different radii, some methods [18,20,21] use phase to represent the radius and vote with complex values. Each localized maximum of the accumulator array contains both a magnitude and a phase, the latter of which is used for identifying the radius. Using phase-coded values for voting can reduce the aliasing but cannot eliminate it [21]; some degradation of the detection accuracy remains. Thus, how to speed up the voting while keeping the accuracy is still challenging. Guil et al. [22] proposed a divide-and-conquer way for the purpose. They introduced a hierarchical structure that iteratively splits the parameter space according to the predefined rules in [23] and searches for the optimal instances in a coarse-to-fine manner. Their heuristic way of splitting achieves performance comparable to HT but is hard to implement. Djekoune et al. [4] chose to discretize the voting direction of each point and gave an incremental algorithm to update the accumulator. Their divide-and-incremental way is suitable for parallel implementation.

2.2. Randomized Hough transform and its variants

The Randomized Hough Transform (RHT) [11] generates candidates by random sampling, which replaces the enumeration of edge points in HT. In its original form, RHT for circle detection randomly samples three edge points, determines a circle instance, and then modifies the corresponding entry of the accumulator array. As the number of sampling tries increases, the accumulator array generated by RHT approaches the accumulator coming from HT. RHT needs sufficient sampling tries to guarantee its recall performance. To reduce these tries and improve the detection accuracy, different sampling strategies [24–28] have been proposed. Chen et al. [26] sampled four points instead of three and checked their compatibility as a circle. If the compatibility check succeeds, a candidate is generated and subsequently verified with an evidence-collection procedure similar to geometric hashing (GH) [29]. Their method does not need an accumulator array and thus saves storage space. Adopting a similar way, Chiu et al. [27] proposed a sampling-and-searching strategy. Their method works by sampling two points and finding the third one from the circles constrained by the first two. The subsequent generation and checking of candidates share the same process as [26]. Ioannou et al. [25] designed their sampling rule based on the assumption that edge points of a connected component usually come from the same object. Their algorithm traverses the pairs of edge points from the same connected component and casts votes for the centers. These sampling strategies neglect the orientations. Chung et al. [30] used the gradient orientation to check the sampled points and reduce the sampling tries. Zhang et al. [31] reduced the sampling tries by sampling isosceles triangles and letting each one vote for a range of circle centers; through clustering the voted centers, they found a set of candidates. Marco et al. [32] introduced the isophote curvature [33] into the four-point sampling method [26] to reduce the computational expense. In their method, the sampled four points should have similar isophote curvature, which is the reciprocal of the subtended radius. Compared with HT, RHT has shown success in improving computational efficiency but often fails when significant noise is present.
If edge points from the background overwhelm those from the foreground, the random sampling cannot hit effective proposals and thus easily misses the correct objects. Voting can yield higher detection accuracy but at the sacrifice of computational efficiency; the randomized way does the opposite. To amend the weaknesses of both ways, a type of method was proposed that also works by enumerating geometric


primitives in the image domain. These methods do not let each edge point vote for the candidates but explore the geometric properties of the shape primitives to reduce the computational expense.

2.3. Exploring the geometric properties

Ho et al. [34] proposed to explore the geometric symmetry of circles. Their method works by localizing the symmetric axes. For the purpose, they scanned the image vertically and horizontally, calculated the midpoints of each pair of edge points in the same scanning line, and fit lines through the midpoints as candidates for symmetric axes. By exploring the symmetry, they sped up the detection procedure; besides, their idea can be directly generalized to ellipse detection [6,34,35]. Yuan and Liu [36] explored the property of circle power. They transformed the edge map into several power histograms and localized the circle instances from these histograms. Hilaire and Tombre [37] took advantage of line segments. They proposed a method to vectorize the edge map and generate line drawings; with these drawings, they gave a unified way of detecting arcs and circles. Lamiroy et al. [38] also used line drawings but borrowed an idea from image tracking. They proposed a method that gradually deforms candidate circular arcs so as to precisely fit onto the image strokes. EDCircle [39] and its variant TDCircle [40] proposed a heuristic way to cluster line segments into circle proposals. They work with linked segments [41–43] and generate candidates by clustering adjacent segments. Lu et al. [44] refined the line segments by adding the geometric constraint of arcs and worked with arc-support line segments. Using paired arc-support line segments, they generated a set of circle candidates that were subsequently verified with the accumulative way of HT. These heuristic ways greatly shrink the number of candidates and lead to impressive computational efficiency, but their dependence on line segments also brings some risk. Experiments have verified this point and showed that the detection accuracy of these algorithms drops quickly as image quality degrades. There are some other methods not covered above. Schuster et al. [45] formulated the problem of finding circles as an estimation based on weighted least squared error. Frosio and Borghese [46] cast the detection as a non-linear fitting problem, minimized through the Nelder-Mead simplex algorithm [47]. Ayala-Ramirez et al. [48] supplied another choice, placing the task in a genetic algorithm. These methods use an optimization framework to generate candidates and meanwhile verify them.

2.4. Candidate verification

For candidate verification, most circle detection methods adopt the accumulative way. In the original version of HT [9], a candidate is verified by counting the number of edge points falling on its boundary. Counting the edge points easily accumulates noise, which introduces false positives. Later, GHT [10] prunes those edge points whose gradient orientations do not coincide with the circle instance. Compared with HT, the constraint posed by GHT is too strict: in practice, due to blurring and discretization, an edge point from a circle cannot guarantee that its gradient orientation perfectly coincides with that required by the circle. Thus this way will miss some correct candidates and lower the recall of the detection algorithm. To circumvent the problem, the probabilistic Hough transform (PHT) [49] still uses the accumulative way but relaxes the voting range. The cast votes are weighted according to the spatial and orientation differences w.r.t. some circle candidates.

Different from the accumulative manner, there is another type of voting scheme that works in a minimizing/maximizing way [50–54]. The scheme is closely related to HT/GHT but has a subtle difference: it assumes that an object is represented by a set of parts, and during its voting procedure it searches for at most one instance for each of the parts. The voting scheme has shown its value in many recent solutions for object detection [50–54], but it has not yet been applied to circle detection. Here, we use it for circle detection and represent a circle instance by a fixed number of parts. In Section 3, we give a formal formulation, where some other strategies are integrated to suppress noise and deal with occlusion.

3. Formulation and optimization

Given a circle instance (o, r), we propose to partition it into a set of equally distributed segments, as shown in Fig. 1. Considering its symmetry, the count of the partitioned segments is always an even number 2K, where K is an integer. For the kth segment, its position and normal direction are represented by Eq. (2):

$$\theta_k = \mathrm{mod}\!\left(\frac{k\pi}{K},\, \pi\right), \qquad c_k(o, r) = o + r\,p_k \tag{2}$$

where k ∈ {0, 1, ..., 2K − 1}. The vector p_k records the direction from the circle center to the segment's location and equals (cos(kπ/K), sin(kπ/K)). With the definition, it is easy to check that the normal direction θ_k of a segment depends only on the segment number and is independent of the instance parameters (o, r), while the position of a segment is a function of both the segment number and the instance parameters. Instead of counting the edge points along the circle boundary, our formulation works by finding edge points to cover these segments; thus, each segment needs only one edge point, as shown in Fig. 2. For the kth segment S_k(o, r), our detection searches for an optimal edge point minimizing the difference in both orientation and spatial location, as shown in Eq. (3).

$$D_k(o, r) = \min_e \left[ \rho(f_e, \theta_k) + d(x_e, c_k(o, r)) \right] \tag{3}$$

Here, an edge point is represented by (x_e, f_e), with x_e its location and f_e its gradient orientation. The function ρ measures the orientation difference between f_e and the normal direction of the segment, and d is the spatial deviation. The achieved minimum is the matching extent for the kth segment.
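To make the formulation concrete, the following C++ sketch (our illustration, not the authors' code) computes the segment positions and normals of Eq. (2) and evaluates the per-segment matching extent D_k of Eq. (3) by brute force; the angular term anticipates the definition of ρ in Eq. (11) below, and all names are our own.

```cpp
// Illustrative sketch of Eqs. (2)-(3); EdgePoint and the helpers are assumptions.
#include <algorithm>
#include <cmath>
#include <vector>

struct EdgePoint { double x, y, f; };  // location (x, y), gradient orientation f in [0, pi)

const double PI = 3.14159265358979323846;

// Eq. (2): normal direction theta_k and position c_k(o, r) of the kth segment.
void segment(int k, int K, double ox, double oy, double r,
             double& theta_k, double& cx, double& cy) {
    double a = k * PI / K;           // direction angle of p_k
    theta_k = std::fmod(a, PI);      // normal direction, independent of (o, r)
    cx = ox + r * std::cos(a);       // c_k = o + r * p_k
    cy = oy + r * std::sin(a);
}

// Eq. (3): minimum combined orientation/spatial deviation over all edge points.
double D_k(const std::vector<EdgePoint>& edges, int k, int K,
           double ox, double oy, double r, double gamma) {
    double theta_k, cx, cy;
    segment(k, K, ox, oy, r, theta_k, cx, cy);
    double best = 1.0;               // the upper bound D_k <= 1 used in Section 3.1
    for (const EdgePoint& e : edges) {
        double da = std::fabs(e.f - theta_k);
        da = std::min(da, PI - da);  // angular difference |.|_a between orientations
        double rho = K / PI * da;    // Eq. (11)
        double dx = e.x - cx, dy = e.y - cy;
        double d  = (dx * dx + dy * dy) / (gamma * gamma);  // Eq. (4)
        best = std::min(best, rho + d);
    }
    return best;
}
```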

Fig. 1. Discretization of the circle instance. Given an instance (o, r), we partition it into a set of equally distributed segments {S_k(o, r)}. The location of a segment is denoted by c_k. The vector p_k gives the direction from the circle center o to the segment's location.

Fig. 2. Illustration of our voting model. Our formulation works by finding edge points to cover the segments of a circle instance. In (a), there are three points that can match the marked segment. The segment needs only one (solid line) and neglects the other two (dashed lines). The selected edge point (solid line) minimizes the matching deviation to the specific segment. In (b), the Hough transform accumulates the edge points falling on the circle instance.

After summing the values over k, we get the matching extent Σ_k D_k(o, r) for the whole instance. Next, we give the detailed form of the spatial-location difference and deduce our decomposition.

3.1. Radius-dependent and -independent decomposition

The normalized squared distance between the position of a segment and an edge point is used for measuring their spatial deviation, as shown in Eq. (4).

$$d(x_e, c_k(o, r)) = \frac{1}{\gamma^2} \| x_e - o - r\,p_k \|^2 \tag{4}$$

where γ is a parameter controlling the influence of the spatial deviation. After substituting Eq. (4) into the definition of D_k(o, r), we reach the following minimization problem:

$$D_k(o, r) = \min_e \left[ \rho(f_e, \theta_k) + \frac{1}{\gamma^2} \| x_e - o - r\,p_k \|^2 \right] \tag{5}$$

To suppress noise, an upper bound is introduced, requiring that D_k(o, r) ≤ 1. Intuitively, if either the orientation or the spatial position of an edge point deviates far from the kth segment, the edge point cannot match it. As a consequence, the upper bound defines a range within which edge points are involved in the calculation of D_k(o, r). The matching extent D_k(o, r) is a function defined on a three-dimensional space. Fortunately, the definition of the spatial deviation helps us separate the influence of the circle center and the radius. For the purpose, we introduce a function in Eq. (6) that excludes the influence of the circle radius:

$$M_k(o) = \min_e \left[ \rho(f_e, \theta_k) + \frac{1}{\gamma^2} \| x_e - o \|^2 \right] \tag{6}$$

With the new definition, the matching extent D_k(o, r) is a displaced version of M_k(o), as shown in Eq. (7):

$$D_k(o, r) \approx M_k(\lfloor o + r\,p_k \rceil) \tag{7}$$

where ⌊·⌉ is the rounding operation and nearest-neighbor interpolation is used. The approximation in the above equation is due to discretization: the circle center o takes values on the image grid, but the displaced location o + r p_k is usually off the grid because of the non-integer offset r p_k. With the approximation, we decompose the calculation of D_k into a radius-dependent and a radius-independent part. The computation of M_k(o) is independent of the radius, while the calculation of M_k(⌊o + r p_k⌉) depends on it. Different radii share the same set {M_k}; that is to say, the calculation of the radius-independent part is shared by the radius-dependent calculations for all radii.
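The decomposition reads directly as code: the maps M_k are computed once per image, and D_k(o, r) for any radius reduces to a rounded, displaced lookup. The sketch below is our illustration of Eq. (7), with a dense map standing in for M_k purely for clarity; Section 4 replaces this dense storage with the sparse structure.

```cpp
// Illustrative sketch of the radius-dependent/-independent split of Eq. (7).
// Map2D is an assumed dense stand-in; the paper's sparse structure replaces it.
#include <cmath>
#include <vector>

struct Map2D {
    int rows, cols;
    std::vector<double> v;  // entries default to the upper bound 1
    Map2D(int r, int c) : rows(r), cols(c), v(static_cast<size_t>(r) * c, 1.0) {}
    double at(int y, int x) const {
        if (y < 0 || y >= rows || x < 0 || x >= cols) return 1.0;  // off-grid: no match
        return v[static_cast<size_t>(y) * cols + x];
    }
};

// Eq. (7): D_k(o, r) ~ M_k(round(o + r * p_k)); the same M_k serves every radius.
double D_k_lookup(const Map2D& Mk, double ox, double oy,
                  double r, double pkx, double pky) {
    int x = static_cast<int>(std::lround(ox + r * pkx));
    int y = static_cast<int>(std::lround(oy + r * pky));
    return Mk.at(y, x);
}
```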

3.2. Connection with the oriented Chamfer distance

Without the orientation deviation ρ, the computation of the radius-independent part M_k(o) corresponds to the Chamfer distance [55]. The Chamfer distance takes two sets of points, a template T and a test set S, and calculates a metric by

$$cd(y, S) = \min_{x \in S} \| y - x \| \tag{8}$$

where y is a point from the template T. Given T = {o} and S = {x_e}, optimizing M_k without ρ amounts to calculating the Chamfer distance between the set of image grid points and the set of edge points. With the orientation deviation ρ, M_k(o) can be viewed as the oriented Chamfer distance [52,56], which is usually solved via the generalized distance transform of sampled functions [57]. Shotton et al. [52] modified the Chamfer distance by incorporating the orientation and gave the oriented Chamfer distance as follows:

$$ocd(y, S) = \min_{x \in S} \left[ \| y - x \| + | f_y - f_x |_a \right] \tag{9}$$

where f_x gives the edge orientation of the point x and |·|_a is the angular difference. The oriented Chamfer distance considers both the orientation and the spatial deviation, which makes it similar to our formulation for M_k(o). The similarity lets the fast schemes proposed for the oriented Chamfer distance adapt to our case. To optimize (9), Shotton et al. proposed a two-stage way that used the distance transform of sampled functions [57] to find the nearest point x* for each y and then computed the orientation difference according to this nearest x*. Ming-Yu et al. [56] discretized the orientation and optimized the same objective function (9) with the aid of dynamic programming. However, a subtle difference between our formulation and the oriented Chamfer distance restricts us from choosing those solutions: in our formulation, θ_k changes with the segment number, so each segment needs a separate optimization process because the orientation difference ρ(f_e, θ_k) varies. A large K makes the fast algorithms for the oriented Chamfer distance unsuitable for our case. Instead, we explore the characteristics implied in our formulation and propose a sparse structure for a fast solution that calculates {M_k(o)} for all segments in just one sweep of the edge points.

4. Sparse structure for radius-independent part

4.1. Whence sparsity

A fast algorithm for M_k(o) plays a central role in our circle detection. The direct solution is to traverse the edge points falling in the range defined by Eq. (10) and find the one achieving the minimum.


Fig. 3. Illustration of the votes for segment numbers. In (a), there are two segment numbers involved in a point’s votes. In (b), the gradient orientation of an edge point coincides with the normal direction of a segment. Then there is only one segment involved in the point’s votes.

$$\rho(f_e, \theta_k) + \frac{1}{\gamma^2} \| x_e - o \|^2 < 1 \tag{10}$$

From the side of the edges, the procedure can be explained in another way. Each edge point casts weighted votes into a 3D space composed of the image grid o and the segment number k. For an entry (o, k) in the 3D space, several edge points may be involved in its voting; only the one producing the minimum is accepted. With this explanation, the calculation of M_k can be implemented via a 3D voting in a minimizing manner. Due to the upper bound in Eq. (10), the voting range of each edge point is limited, making M_k sparse for each k. Next, we deduce this sparsity from the perspective of the edge points. First, the upper bound restricts each point's votes for the segment numbers. This can be verified from the definition of ρ(f_e, θ_k), which measures the angular distance between the gradient orientation f_e of an edge point and the normal direction of a segment:

$$\rho(f_e, \theta_k) = \frac{K}{\pi} \, | f_e - \theta_k |_a \tag{11}$$

In Eq. (11), f_e is normalized to [0, π), |f_e − θ_k|_a gives the angular difference, and the normalization factor corresponds to π/K, the difference between two adjacent segments. The definition makes adjacent segments differ by 1 in the orientation difference, separating their interference. Here, we only consider the segments in the range [0, K − 1]: on a circle, the normal direction θ_k is equal to θ_{k+K}, which makes the map M_k the same as M_{k+K} and reduces the count of maps from 2K to K. Since the normal directions of these segments are evenly distributed on [0, π), there are at most two segment numbers involved in each point's votes, as shown in Fig. 3. Given the gradient orientation f_e, the two numbers are calculated by the following equation:

$$k_1 = \left\lfloor \frac{f_e}{\pi} K \right\rfloor, \qquad k_2 = \mathrm{mod}(k_1 + 1,\, K) \tag{12}$$

The corresponding orientation differences are then computed by

$$\rho(f_e, \theta_{k_1}) = \frac{f_e}{\pi} K - \left\lfloor \frac{f_e}{\pi} K \right\rfloor, \qquad \rho(f_e, \theta_{k_2}) = 1 - \rho(f_e, \theta_{k_1}) \tag{13}$$
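In code, the vote targets of a single edge point follow directly from Eqs. (12)–(13). The sketch below is our illustration and assumes f_e has already been normalized to [0, π).

```cpp
// Illustrative sketch of Eqs. (12)-(13): the two candidate segment numbers and
// their orientation differences for an edge point with gradient orientation fe.
#include <cmath>

void vote_targets(double fe, int K,
                  int& k1, int& k2, double& rho1, double& rho2) {
    const double PI = 3.14159265358979323846;
    double t = fe / PI * K;        // fractional segment index in [0, K)
    k1 = static_cast<int>(t) % K;  // Eq. (12): k1 = floor(fe / pi * K)
    k2 = (k1 + 1) % K;             // Eq. (12): k2 = mod(k1 + 1, K)
    rho1 = t - std::floor(t);      // Eq. (13): rho(fe, theta_k1)
    rho2 = 1.0 - rho1;             // Eq. (13): rho(fe, theta_k2)
}
```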

Besides the segment numbers, the spatial locations for which an edge point can vote are also bounded, satisfying the constraints in Eq. (14), where the first inequality bounds the votes for k_1 and the second those for k_2:

$$\frac{1}{\gamma^2}\|x_e - o\|^2 < 1 - \rho(f_e, \theta_{k_1}), \qquad \frac{1}{\gamma^2}\|x_e - o\|^2 < \rho(f_e, \theta_{k_1}) \tag{14}$$

For k_1, the grid locations satisfying the above constraint fall into a circular region centered on x_e with radius γ√(1 − ρ(f_e, θ_{k_1})). The count of image grid points inside this region approximates the area of the circle, that is, π(1 − ρ(f_e, θ_{k_1}))γ²; the count for k_2 is approximately πρ(f_e, θ_{k_1})γ². Thus, the total number of entries that a point can influence is around πγ². Let N_e be the count of the edge points; then the votes cast by all edge points amount to approximately πγ²N_e. If we assume that each vote generates a new entry in {M_k}, the upper bound for the touched entries in all {M_k} approximates πγ²N_e. Supposing that the gradient orientations of the edge points are evenly distributed on [0, π), the number of voted entries in each M_k is roughly (πγ²/K)N_e, since around 1/K of the votes are dedicated to the kth normal direction. Typically, the edge points are relatively sparse compared with the grid locations of a given image; besides, γ and K are chosen such that πγ²/K is relatively small. All these make each M_k sparse. Here we assume evenly distributed gradient orientations, so the cast votes for M_k are sparse in the statistical view. However, if we directly record these votes in {M_k(o)}, the majority of their sites will hold the value 1 instead of 0: the sparse votes of the edge points have not led to sparse maps {M_k(o)}. According to the definition of M_k(o), if a combination (o, k) has not been touched by any edge point, its matching extent is 1. To obtain the sparse property, we use 1 − M_k(o) as the voting weight. The modification brings in a voting map V_k(o), defined in Eq. (15).

$$V_k(o) = 1 - M_k(o) = \max_e \ \max\!\left(1 - \left[\rho(f_e, \theta_k) + \frac{1}{\gamma^2}\|x_e - o\|^2\right],\ 0\right) \tag{15}$$

where the inner max is due to the constraint in Eq. (10). For an entry (o, k) outside any edge point's voting range, the assigned value is 0. Since the votes for M_k are sparse, the nonzero locations of V_k are also sparse in the statistical view, making V_k a sparse map for each k. In Fig. 4, we show some voting maps whose sparsity is obvious. The input image and edge map for these voting maps are shown in Figs. 1 and 2, respectively; K equals 150 and γ is set to 3. According to the above analysis, the upper bound for the nonzero sites of each voting map approximates (9π/150)N_e ≈ 0.18N_e. Due to the large K, only a limited number of edge points are involved in the calculation of each voting map. In fact, the voting maps are even sparser than the analysis suggests: edge points are usually connected, so their voting regions overlap with each other, which significantly reduces the number of nonzero sites in {V_k}.

4.2. The sparse structure for the radius-independent part

In practice, each voting map V_k is of the same size as the input image. Given an M × N image, the space for storing {V_k} is M × N × K. It is often infeasible to allocate such a space for large K; for example, the K in our experiments is around 150. When K = 150, there


Fig. 4. Illustration of the voting maps. For these voting maps, the number of segments is set to 150 and γ equals 3. The input image and edge map for them are shown in Fig. 1 and Fig. 2, respectively. Given that most regions of each map hold the value 0 (see the color bar), the sparsity of these voting maps is obvious.

Fig. 5. Illustration of the sparse structure. The sparse structure utilizes a 2D matrix L of size N × K. Each of its entries L(n, k) points to the tail of a linked list. The linked list is composed of a set of triples {(row, val, prev)} storing the nonzero sites of the nth column of V_k.

are in total 150 voting maps. That is a huge demand for storage, and an obviously wasteful one, since each V_k is statistically sparse. Thus, exploring the sparsity underlying V_k is absolutely necessary. For the purpose, we define a sparse structure that makes use of the fact that all V_k are of the same size and sparse. Our sparse structure utilizes a 2D matrix L of size N × K. Each of its entries L(n, k) points to the tail of a linked list. The linked list is composed of a set of triples {(row, val, prev)} storing the nonzero sites of the nth column of V_k, as shown in Fig. 5. The triple (row, val, prev) records the value val of the specific entry whose row and column indices are row and n, respectively. The prev of the triple points to the entry in the column whose row number directly precedes the current one. When a linked list is traversed from L(n, k), the row number row decreases with the traversal order. Based on this representation, a batch algorithm for {V_k} is presented.
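A minimal C++ rendering of this structure might look as follows; the type and member names are ours, chosen to mirror the triple (row, val, prev) and the tail matrix L of the text, and are not the authors' implementation.

```cpp
// Illustrative sketch of the sparse structure of Section 4.2.
#include <cstddef>
#include <vector>

// Triple (row, val, prev): one nonzero site of the nth column of Vk.
struct Triple {
    int     row;   // row index of the nonzero entry
    double  val;   // stored voting value Vk(row, n)
    Triple* prev;  // triple in the same column with the next smaller row index
};

// L(n, k) points to the tail (largest row) of the list for column n of Vk.
struct SparseVotingMaps {
    int n_cols, n_segments;      // N and K
    std::vector<Triple*> tails;  // flattened N x K matrix of list tails
    SparseVotingMaps(int N, int K)
        : n_cols(N), n_segments(K),
          tails(static_cast<std::size_t>(N) * K, nullptr) {}
    Triple*& at(int n, int k) {
        return tails[static_cast<std::size_t>(n) * n_segments + k];
    }
};
```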

4.3. Batch calculation with the sparse representation

The algorithm summarized in Algorithm 1 is designed from the perspective of voting. It traverses the edge points, figures out each one's voting range, and updates the values of the corresponding entries of {V_k}. One sweep of the edge points finishes the calculation and updates all the involved entries of {V_k}. During the procedure, the sparse structure helps us concentrate on the nonzero entries of {V_k} and leaves the others untouched. We scan the edge points row by row. For a specific point (x_e, f_e), we first determine its votes for the segment numbers and then for the spatial locations. Each point can vote for at most two segment numbers. For each segment, the spatial voting range is a circular region; the region is broken into columns, which are updated one by one. Next, the time complexity for manipulating one single column is analyzed, followed by a discussion of the time complexity of the whole voting procedure.

Algorithm 1. Batch calculation of {V_k} with the sparse structure.

Batch-Calculate(L)
 1  L(n, k) = NIL for all n, k
 2  repeat
 3      scan edge points {(x_e, f_e)} row by row
 4      calculate k_1 and k_2 by Eq. (12)
 5      for k in {k_1, k_2}
 6          σ_x = ⌊γ√(1 − ρ(f_e, θ_k))⌋
 7          for n = x_e − σ_x to x_e + σ_x
 8              Update-col(x_e, f_e, n, k, L)
 9  until end of the edge map

Update-col(x_e, f_e, n, k, L)
 1  σ_y = ⌊√(γ²[1 − ρ(f_e, θ_k)] − (n − x_e)²)⌋
 2  m = y_e + σ_y
 3  p = NIL
 4  c = L(n, k)
 5  while c ≠ NIL and c.row > m
 6      p = c, c = c.prev
 7  for m = y_e + σ_y downto y_e − σ_y
 8      v = 1 − ρ(f_e, θ_k) − [(m − y_e)² + (n − x_e)²]/γ²
 9      if c ≠ NIL and c.row = m
10          if c.val < v
11              c.val = v
12      else
13          c = new(m, v, c)
14          if p = NIL
15              L(n, k) = c
16          else
17              p.prev = c
18      p = c, c = c.prev
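For readers who prefer compilable code, here is our C++ transcription of Update-col, reusing the Triple and SparseVotingMaps types sketched in Section 4.2 above; it is an illustration of the pseudocode under those assumptions, not the authors' implementation.

```cpp
// Illustrative C++ transcription of Update-col; (xe, ye) is the edge point,
// rho_k its orientation difference to segment k, and n the column updated.
#include <cmath>

void update_col(int xe, int ye, double rho_k, double gamma,
                int n, int k, SparseVotingMaps& L) {
    double g2 = gamma * gamma;
    double span = g2 * (1.0 - rho_k) - double((n - xe) * (n - xe));
    if (span < 0.0) return;                 // column lies outside the voting disc
    int sigma_y = static_cast<int>(std::sqrt(span));

    Triple* p = nullptr;
    Triple* c = L.at(n, k);
    // Seek the first triple with row <= ye + sigma_y (rows decrease along prev).
    while (c != nullptr && c->row > ye + sigma_y) { p = c; c = c->prev; }

    for (int m = ye + sigma_y; m >= ye - sigma_y; --m) {
        double v = 1.0 - rho_k
                 - double((m - ye) * (m - ye) + (n - xe) * (n - xe)) / g2;
        if (c != nullptr && c->row == m) {  // entry exists: keep the larger vote
            if (c->val < v) c->val = v;
        } else {                            // entry absent: insert a new triple
            Triple* t = new Triple{m, v, c};
            if (p == nullptr) L.at(n, k) = t; else p->prev = t;
            c = t;
        }
        p = c; c = c->prev;                 // advance to the next smaller row
    }
}
```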

Lemma 1. Given the location of the current edge point x_e = (x_e, y_e), the maximum row number in all the linked lists is less than or equal to y_e + γ. Here, x_e corresponds to the column index and y_e to the row index.

For any combination (o, k) in the voting range of an edge point x = (x, y), it is easily verified that |y − o_y| ≤ γ, where o_y is the y-coordinate of o; violating the inequality makes the spatial difference grow beyond the upper bound, which excludes (o, k) from the voting range. This specifies the maximum row number in the voting range of an edge point: the row number cannot exceed y + γ. The presented algorithm scans edge points in a row-wise manner, so the current point x_e = (x_e, y_e) possesses the maximum y-coordinate among all the traversed edge points. Accordingly, the maximum row number that the scanned edge points can vote for is less than or equal to y_e + γ, which leads to the lemma.

Lemma 2. The while iteration in Update-col stops within γ − σ_y + 1 steps.

Each column in the voting range is processed with decreasing row numbers. We denote the index of the first encountered row by y_e + σ_y, where 0 ≤ σ_y ≤ γ, and let (n, k) be the combination of the involved column and segment. The while iteration in Update-col seeks a position in the linked list to deal with the first row. In the algorithm, NIL indicates the end of a linked list; for convenience, we use head for the first triple in a linked list. If the head of the list is NIL or the row number of the head is less than y_e + σ_y, the first row has not yet appeared in the linked list, so the while procedure stops in just one step, yielding a position for inserting the first row. When the row number of the head equals y_e + σ_y, the head corresponds to the first row, which is also reached in one step. When the row number of the head is greater than y_e + σ_y, by Lemma 1 its value is less than or equal to y_e + γ. Since every iteration of the while loop decreases the row number by at least 1, the maximum count of iterations before stopping is the difference between the row number of the head and y_e + σ_y. Consequently, the maximum count of iterations is less than or equal to γ − σ_y + 1.

Theorem 1. The procedure Update-col works correctly for updating each column, and its time complexity is O(γ).

In Update-col, m = y_e + σ_y defines the first row of the voting range in the current column; it gives the maximum row number to be updated. The while loop seeks a start position in the linked list, and the subsequent for loop does the updating. We assume that the linked list was created correctly before the update of the current column. Correctness of the procedure can then be verified by tracing p and c, whose values determine the subsequent operations. In the while loop, if the head of the linked list is NIL or its row number is less than or equal to m, then p is NIL and c equals the head. When the row number of the head is greater than m, then p is the triple whose row number is the least among all triples with row numbers greater than m, and c is its previous triple; thus c is either NIL or a triple whose row number is less than or equal to m. In the for loop, we first calculate the voting value for V_k(m, n), where m and n are the row and column indices, respectively. Then the existence of the triple for the entry is checked: if it exists, we update the value of the triple; otherwise a new one is generated and inserted into the list. The time complexity is determined by the while and for loops together. According to Lemma 2, the while loop stops within γ − σ_y + 1 steps; the for loop repeats up to 2σ_y + 1 times. Both loops finish within γ + σ_y + 2 steps, and since σ_y ≤ γ, this upper bound is less than or equal to 2γ + 2. Within each loop, the called operations manipulate the linked list, and the complexity of each operation is O(1). Thus, the complexity of the whole procedure is O(γ).

Theorem 2. If the edge points are scanned in a row-wise manner, then each point's vote can be finished in O(γ²).

The theorem can be verified from the time complexity for updating a column and the maximum number of involved columns. The maximum count of columns is 2γ + 1 for an edge point; thus, each point's vote can be finished in O(γ²). Let N_e denote the count of the edge points; then the time complexity for the votes of all edge points is O(N_e γ²).

4.4. Circle detection algorithm

With the calculated voting maps, the algorithm for circle detection is summarized in Algorithm 2. It works by calculating the voting map for each instance as in Eq. (16), which corresponds to the radius-dependent calculation for different radii.

Algorithm 2. Circle detection.

Circle-Detection(L, {r}, K, th)
 1  V(m, n, r) = 0 for all m, n, r
 2  initialize the hash table H
 3  for r in {r}
 4      for n = 0 to N − 1
 5          for k = 0 to K − 1
 6              [d_x, d_y] = round(r p_k)
 7              c = L(n, k)
 8              while c ≠ NIL, with m = c.row
 9                  V(m + d_y, n + d_x, r) = V(m + d_y, n + d_x, r) + c.val
10                  if V(m + d_y, n + d_x, r) > th
11                      update (m + d_y, n + d_x, r) in H
12                  V(m − d_y, n − d_x, r) = V(m − d_y, n − d_x, r) + c.val
13                  if V(m − d_y, n − d_x, r) > th
14                      update (m − d_y, n − d_x, r) in H
15                  c = c.prev

$$V(o, r) = \sum_{k=1}^{2K} V_k(o + r\,p_k) = \sum_{k=1}^{K} \left[ V_k(o + r\,p_k) + V_k(o - r\,p_k) \right] \tag{16}$$

where V(o, r) is the voting value for the instance (o, r). Due to symmetry, we only calculate K voting maps {V_k} and have V_k = V_{k+K}; besides, the definition of p_k gives p_{k+K} = −p_k. In the algorithm, a hash table H is used for recording the potential candidates. Different from accumulative voting, our way bounds the voting value from above, with a maximum equal to the number of segments, i.e., 2K. Thus, an adaptive threshold th can be used to control the number of accepted candidates. Here, we choose th = 0.05K; experiments show that this value properly balances the recall and the detection speed. When an entry of the voting map is updated, we compare it with the threshold: if the voting value is greater than the threshold, we update the instance in the hash table H. If it has already been added to the table, we update its voting value; otherwise, we generate a new key in H and record its voting value.
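The following sketch shows the core accumulation loop of Algorithm 2 in C++, again reusing the earlier Triple/SparseVotingMaps types; for brevity, a dense per-radius vote map stands in for the hash table H, which is our simplification rather than the paper's choice.

```cpp
// Illustrative sketch of the radius-dependent accumulation of Eq. (16) /
// Algorithm 2; a dense per-radius vote map replaces the hash table H here.
#include <cmath>
#include <vector>

void accumulate_radius(SparseVotingMaps& L, int rows, int cols,
                       double r, int K, std::vector<double>& V) {
    const double PI = 3.14159265358979323846;
    auto bump = [&](int y, int x, double val) {
        if (y >= 0 && y < rows && x >= 0 && x < cols)
            V[static_cast<std::size_t>(y) * cols + x] += val;  // vote for center (x, y)
    };
    for (int n = 0; n < cols; ++n)
        for (int k = 0; k < K; ++k) {
            int dx = static_cast<int>(std::lround(r * std::cos(k * PI / K)));
            int dy = static_cast<int>(std::lround(r * std::sin(k * PI / K)));
            for (Triple* c = L.at(n, k); c != nullptr; c = c->prev) {
                bump(c->row + dy, n + dx, c->val);  // vote via +r * p_k
                bump(c->row - dy, n - dx, c->val);  // vote via -r * p_k (p_{k+K} = -p_k)
            }
        }
}
```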


Fig. 6. The voting maps generated by the circle detection algorithm in Algorithm 2. From left to right, (a), (b), and (c) correspond to the voting maps for radii 59, 68, and 77, respectively.

As shown in Fig. 6, only a few candidates meet the thresholding constraint and are put into the table, owing to the sparsity of the voting maps {V_k}. The thresholding operation saves time in the subsequent post-processing. The computational complexity of the circle detection algorithm summarized in Algorithm 2 is determined by both the number of nonzero elements of the voting maps {V_k} and the number of discretized radii. The number of nonzero elements in the voting maps is proportional to the voting complexity, i.e., O(N_e γ²). Let ΔR and Δr denote the radius range and the discretization step; then the number of discretized radii is

$$N_r = \left\lfloor \frac{\Delta R}{\Delta r} \right\rfloor + 1 \tag{17}$$

In practice, we always let γ = max(0.8Δr, 2). Thus, the complexity of our circle detection is O(N_r N_e γ²). After substituting the representations of N_r and γ, we get the complexity of our algorithm, which is O(ΔR Δr N_e). Thus, to speed up the detection process, we need to control the discretization step Δr.

5. Experimental evaluation

5.1. Dataset

For testing, we created an information-rich dataset with images coming from multiple sources (available at https://github.com/cartoonxjtu/CircleDetectionDatasets). The dataset contains five categories which cover a wide spectrum of images. These categories are 'Bubble', 'Cell', 'Seal', 'Sign' and 'Coin'; each is named after the objects presented in its images. The five categories are collected from different application scenarios, and circle detection plays a key role in all of these tasks. They present significant variability in terms of imaging conditions, image types, distribution of circle centers and radii, deformation, and occlusion. 'Bubble' and 'Cell' are composed of microscopic images collected from chemical and blood tests, respectively. Images in 'Seal' are scanned documents containing circular seals, and those in 'Sign' are natural images collected for testing detection algorithms designed for traffic signs. The last one, 'Coin', includes images taken under a controlled lighting environment. As for color type, our dataset covers binary images ('Seal'), gray-scale images ('Bubble', 'Cell' and 'Coin'), and RGB true-color images ('Sign'). The detailed information for the five categories is summarized in Table 1. To show their considerable diversity, some sample images are given in Fig. 7. Next, we give a detailed description of these categories.

Table 1. Summary of the dataset (#Images and #Instances are the numbers of images and instances contained in each category.)

Category  #Images  #Instances  Color  Source
Bubble    32       666         Gray   Microscopic image
Cell      45       2911        Gray   Microscopic image
Sign      117      246         RGB    Natural image
Seal      14       94          BW     Scanned document
Coin      17       100         Gray   Photographs taken under artificial lighting

Bubble: Images in this category come from a chemical experiment that involves tracing quickly moving bubbles. Each image in the category is a photo montage generated from a sequence of pictures taken by a high-speed microscopic camera. Due to the imaging conditions, most of its 666 instances suffer from deformation. Part of their boundaries are missing, and the rest are heavily blurred, making it hard for edge detection to pick the bubbles out of the background. Besides, the reflection of light casts a lot of bright spots on the bubbles, bringing in false positives. These challenging factors prevent the detection of bubbles from achieving high accuracy.

Cell: This category possesses 45 images collected from the blood tests of several patients. Some of the blood cells lack pits, and our circle detection is used to count the number of cells without a pit. All images in the category are taken by an electron microscope and rescaled to 512 × 512. Though the image count of the category ranks second, it has the most instances, up to 2911. The mutual occlusion among these instances is a challenge that cannot be neglected. Besides, mature red blood cells (RBCs) often have a pit in the center; the contour of the pit and the boundary of the whole cell compose a ring. For the ground truth, we labeled each cell as one instance and neglected its pit. Thus, the detection algorithm has to deal with the concentric circles present in almost every cell.

Seal: All 14 documents in the category are scanned at 300 DPI and then rescaled with the longer edge equal to 1024. Detecting these circular seals is a necessary step towards understanding the image contents, which leads to a further compact representation. However, the cluttered backgrounds composed of characters and other shapes increase the difficulty of accurate detection.

Sign: Its 117 images come from three sources. Some were collected from Flickr [58], some were photographed by ourselves with a hand-held camera, and the rest were shot by cameras fixed in an unmanned car. In this category, we try to localize the circular traffic signs in natural environments, which can be further used for automatic navigation.


Fig. 7. Sample images from our dataset. From top to bottom, the rows correspond to the categories 'Bubble', 'Coin', 'Cell', 'Seal' and 'Sign', respectively.

Due to the shooting positions and angles, circular traffic signs in the test images have considerable deformations caused by out-of-plane rotation, and the radii of all 246 instances scatter over a very wide range, from 3 pixels to 120 pixels. Besides, as can be seen in Fig. 7, the existence of concentric circles is inevitable. All these factors, along with the cluttered backgrounds, make accurate detection very hard.

Coin: There are 17 gray images in this category, with instances amounting to 100. All are photographed under artificial lighting in our laboratory and rescaled to 512 × 384. Within each image, the instances have around the same size. Circle detection is involved for counting the coins. Occlusion and cluttered backgrounds are the main challenges of the category.

DataG: Besides our 5 categories, we use a public dataset from https://github.com/duguqiankun/circleDetection for testing. The authors of the set have not supplied a name for it; for convenience, we denote it by 'DataG' (Dataset from Github). In 'DataG', there are 75 gray images with 152 instances in total. The dataset is closest to our 'Sign', except that images in 'Sign' are color and those in 'DataG' are gray. Among all 6 test sets, 'DataG' has the largest searching range: its images require the algorithms to search for circles with radii ranging from 0 to half of their longer edges. As shown in Fig. 8, the existence of blurred boundaries and concentric circles also makes accurate detection on this test set very hard.

To summarize, deformation, partially missing boundaries, cluttered backgrounds, and concentric circles are the four main challenges in all these datasets.

5.2. Performance evaluation

In this section, the proposed algorithm is compared against a variety of detectors in terms of both accuracy and speed. The involved algorithms include the Hough transform implemented in OpenCV (HT) [15], the modified Hough transform by Tao Peng (TP) [19], the randomized circle detection (RCD) [26], the algorithm based on maximum likelihood estimation (AMLE) [59], the circle detection with arc-support line segments (CDAL) [44], and the algorithm with false detection control (ED) [39], along with its open-source implementation (RPCD) [60] and improved version TDCircle (TD) [40]. We follow the tips in [26] to re-implement RCD, except that the circle candidates are proposed within the connected components; in the checking stage, the gradient orientation [30] is used to suppress noise. As for AMLE [59], we add an auxiliary step that counts the number of edge points along each candidate and discards those with small values; this strategy suppresses false positives. Following the tips in [19], a strategy is introduced for resolving the radius. The same strategy is also used in our implementation of AMLE for acquiring a more accurate estimation of the radius.


Fig. 8. Sample images from ‘DataG’.

As for TD, we used the executable package supplied by its authors [39]. Since they have not made their sources public, we tuned the available parameters in the package to get the best performance on all the test sets. The authors of CDAL supplied a routine, written in C, for using their algorithm. As for RPCD and ED, we use the open-source code (the source code for EDCircle comes from https://github.com/CihanTopal/ED_Lib). All the algorithms are implemented in C or C++, and their parameters are tuned for better performance to make the comparisons fair: for each involved algorithm, we tried a set of parameter settings and selected the one achieving the best performance. In our experiments, TP uses the edge points obtained by thresholding the gradient magnitudes, with the threshold value tuned for the best detection performance. ED and its variants RPCD, TD and CDAL use edge linking [41–43], which groups the edge points into small segments; RPCD and CDAL filter the line segments using the geometric constraint of arcs. All the other involved algorithms use the Canny edge detector [61], whose high and low threshold values are determined adaptively according to the maximum gradient magnitude of the given image. To make the comparison fair, we also keep the parameters of Canny the same across the experiments. All algorithms use the same searching range of radii as ours, with a radius step of one pixel; for those without a searching range, we tune the radius-related parameters according to the given range for better performance. All experiments are conducted on a desktop with an Intel Core i5-4590 3.3 GHz CPU and 8 GB memory. The detection results and the time used are reported for the analysis. As for ours, we discretize the searching range of the radius to save computational time: given a searching range, we uniformly sample N_r values for the radius. As pointed out in the previous section, concentric circles are a problem for accurate detection; thus it is necessary to eliminate unnecessary concentric candidates. This is an extra step adopted by all the algorithms except ED, TD and CDAL. Two circles that are concentric should share exactly the same center and meet the condition that one contains the other. Here, we use a relaxation that requires the bounding box of one circle to enclose that of the other. Let A = (x_a, y_a, r_a) and B = (x_b, y_b, r_b) denote two candidates. Without loss of generality, we assume that r_a > r_b and only consider circles of similar scale, requiring that r_b > 0.6 r_a. To be concentric, the bounding box of A should contain that of B; if the condition is met, the maximum deviation between their centers is √2 |r_a − r_b|. We then define the concentric degree as the normalized distance between the two centers:

$$\text{Concentric} = \frac{\sqrt{(x_a - x_b)^2 + (y_a - y_b)^2}}{\sqrt{2}\,| r_a - r_b |} \tag{18}$$

where the numerator measures the Euclidean distance between (x_a, y_a) and (x_b, y_b). The threshold on the concentric degree is set to 0.5 in all the experiments. Resolving concentric candidates removes a number of false positives, making all algorithms more precise.
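A direct implementation of the test is tiny; the sketch below (ours) assumes the caller already enforces r_a > r_b, r_b > 0.6 r_a, and the bounding-box containment described above.

```cpp
// Illustrative sketch of the concentric-degree test of Eq. (18).
#include <cmath>

double concentric_degree(double xa, double ya, double ra,
                         double xb, double yb, double rb) {
    double center_dist = std::hypot(xa - xb, ya - yb);
    return center_dist / (std::sqrt(2.0) * std::fabs(ra - rb));
}
// A candidate pair with concentric_degree(...) < 0.5 is treated as concentric,
// and the weaker of the two candidates is discarded.
```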

For the proposed algorithm, we select K = 130 and K = 170 for comparison and summarize the precision vs. recall (PR) curves in Fig. 9. The proposed algorithm achieves state-of-the-art performance on all of our 5 categories and 'DataG', and it outperforms the others on the F-measure. Our Canny-plus-voting scheme with the sparse structure performs well and steadily on almost all types of images; the default parameters of the Canny detector in OpenCV are fine for all image types involved in the experiments. Next, the performance of these algorithms is observed and analyzed from two angles: the generation of candidates and their verification. For RCD, candidates are generated through randomized sampling and verified in a modified accumulative manner. For the former, cluttered backgrounds prevent RCD from generating effective proposals: when edge points from the background dominate the edge map, the random selection has a higher probability of hitting background points. Without proper generation, even a perfect verification cannot deliver high detection accuracy. Thus, RCD's performance degenerates on the natural images ('Sign') and the images with poor quality ('Bubble' and 'Cell'). As for AMLE, cluttered backgrounds influence the verification of potential candidates: the complex values cast by circular objects are mixed with those from the backgrounds, and the aliasing makes the verification fail to pick up correct candidates. The same situation exists for Hough and TP. As shown in Fig. 9, compared with the accumulative way, the proposed method succeeds in suppressing the noise caused by cluttered backgrounds, leading to fewer false positives. The voting scheme used by both Hough transforms accumulates the noise; thus a uniform threshold value may discard circles with small radii and introduce false positives from cluttered backgrounds. The performance of EDCircle and its variants RPCD, TD and CDAL degenerates with decreasing image quality [39,40]. The degeneration of all these methods is closely related to the edge linking algorithm [41–43]: they rely on the linked segments to discover potential candidates. When objects are small or image quality is degraded, object boundaries are easily blurred; blurred boundaries make accurate boundary detection very hard and subsequently influence the edge linking stage. Their detection accuracies on 'Bubble' and 'Cell' verify the point, as can also be seen in Fig. 10, where some detection results are shown.

Besides, the average running times are reported in Table 2. The time for each algorithm is measured under the same parameter settings used for generating the PR curves in Fig. 9. We did not incorporate TDCircle into this comparison because the running time from its executable file is not accurate, and using it would be unfair. From Table 2, we can see that the selection of the segment number K


Fig. 9. The precision vs. recall (PR) curves for our 5 categories and 'DataG'. Numbers in brackets give the F-measure for each algorithm.

has little effect on the running time of the proposed algorithm, because K is not involved in the algorithm's time complexity. Second, the running time of the proposed method is closely related to the number of edge points, as can be verified by the running time on 'Seal'. If we assume that images of the same size generate approximately the same number of edge points, then a larger image will output more edge points; this is the strategy adopted by the Canny edge detector in OpenCV [15]. Thus, the image size determines the running speed. Among our 5 categories and 'DataG', images in 'Seal' have the largest size, which means they output the most edge points and increase the running time of each algorithm. This can be verified for all methods except AMLE; these methods work either with the edge points or with line segments derived from edge points. As for AMLE, the situation is different: its reported running time for 'DataG' is the largest. AMLE uses convolution to find the potential candidates, and the searching range of the radii determines the size of the convolution filters; a larger range needs a correspondingly larger filter, which increases the running time of the algorithm. 'DataG' has the largest searching range: its images require the algorithms to search for circles with radii ranging from 0 to half of their longer edges. Thus, the running time of AMLE on 'DataG' increases to over 100 s per image.


Fig. 10. Sample detection results. Images in the last two columns come from 'DataG'. The green circles are true positives, the red ones with dashed bounding boxes are false positives, and the blue ones with solid bounding boxes correspond to missed instances.

Table 2
The average time (s). Each number is reported under the parameter settings for the best detection performance.

                 Bubble    Cell     Coin     Seal     Sign     DataG
AMLE              0.69     5.18     8.94    11.50    13.84    163.69
TP                0.34     2.76     1.34    10.47     5.78      6.64
Hough             0.11     0.19     0.16     0.34     0.05      0.21
RCD               0.50     1.92     1.17     8.22     1.91      2.55
RPCD              0.09     0.13     0.12     0.29     0.14      0.09
ED                0.26     0.25     0.25     0.31     0.26      0.20
CDAL              0.10     0.40     0.13     0.23     0.16      0.23
Ours (K = 130)    0.14     0.33     0.29     0.72     0.34      0.35
Ours (K = 170)    0.14     0.39     0.26     0.58     0.36      0.35

RPCD, ED and CDAL generate candidates by combining line segments, so their running time depends little on the search range of the radii; it is instead tied to the number of edge points, which determines the number of line segments. Compared with ED, its variants RPCD and CDAL refine the generation of line segments: they prune a subset of segments using the geometric constraints of a circle and thereby improve the running efficiency. These methods also save candidate-generation time because they do not need to enumerate the radii; accordingly, the fluctuation of the running times of ED, RPCD and CDAL across the 6 datasets is smaller than that of the other algorithms.

Hough, TP and the proposed method must enumerate the radii to form candidates, and this enumeration potentially increases their complexity. In Table 3, we report the shares of the radius-dependent and radius-independent parts in the total execution time (a control-flow sketch of this split follows Table 3). Owing to the enumeration of radii, the calculation of the radius-dependent part takes much more time than that of the radius-independent part, which is consistent with our complexity analysis.

Among all six test sets, RPCD ranks first on four, the Hough transform implemented in OpenCV takes the first position on 'Sign', and CDAL is the best on 'Seal'. The OpenCV implementation of Hough performs similarly to CDAL and RPCD, and the running time of the proposed algorithm is close to that of Hough and RPCD on 'Bubble' and 'Coin'. Overall, our algorithm performs steadily and achieves higher accuracy at the cost of a slight increase in running time.


Table 3
The distribution of the running time (%). 'Independent' and 'Dependent' correspond to the calculation of the radius-independent and radius-dependent parts, respectively; 'Others' includes the preprocessing of the input image and post-processing steps such as the resolving of concentric circles. The shares for K = 130 and K = 170 are reported, separated by /.

              Bubble       Cell         Coin         Seal         Sign         DataG
Independent   10.9/11.0     7.6/ 7.7     4.1/ 4.1     6.2/ 6.1     7.9/ 7.8     5.7/ 5.5
Dependent     55.5/57.2    66.1/68.0    74.6/75.1    70.7/70.5    71.7/71.9    75.8/76.1
Others        33.6/31.8    26.3/24.3    21.3/20.8    23.1/23.5    20.4/20.3    18.5/18.4
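The radius-dependent and -independent split measured in Table 3 corresponds to the control flow sketched below. This is only an illustration of the shared-versus-per-radius structure, written here as a toy gradient-voting accumulator rather than the paper's min/max voting; the function and its inputs are our own illustrative construction.

```python
import numpy as np

def toy_detect(points, gradients, radii, shape):
    """points: (N, 2) edge coordinates; gradients: (N, 2) gradient vectors;
    radii: discretized radii; shape: accumulator (rows, cols)."""
    # Radius-independent part: computed once and shared by every radius.
    # (In the paper this is where the sparse structure is built in a
    # single sweep of the edge points; here we only normalize gradients.)
    g = gradients / np.linalg.norm(gradients, axis=1, keepdims=True)

    votes = {}
    # Radius-dependent part: repeated for each discretized radius, which
    # is why it dominates the total running time reported in Table 3.
    for r in radii:
        centers = np.rint(points - r * g).astype(int)  # candidate centers p - r*g
        acc = np.zeros(shape, dtype=int)
        ok = ((centers[:, 0] >= 0) & (centers[:, 0] < shape[0])
              & (centers[:, 1] >= 0) & (centers[:, 1] < shape[1]))
        np.add.at(acc, (centers[ok, 0], centers[ok, 1]), 1)
        votes[r] = acc
    return votes
```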

Next, we conduct comprehensive tests to investigate the influence of the key parameters of the proposed algorithm.

5.3. Choice of the key parameters

Among the parameters of the proposed algorithm, γ is set according to the discretization step Δr of the radius, and the threshold th is set w.r.t. the segment number K; hence Δr and K are our most important parameters. We run experiments on a set of their combinations to examine their impact on detection accuracy, with the results summarized in Table 4. In the summary, K is uniformly sampled from the range [50, 210] with an interval of 40, the values of Δr are chosen according to the number of discretized radii Nr, and the area under the PR curve (AUC) is used for comparison. We first investigate the influence of the segment number K, which determines the cover density of a circle instance.
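The paper does not state how the area under the PR curve is integrated; a common convention, assumed here, is the trapezoidal rule over the PR points sorted by recall:

```python
import numpy as np

def pr_auc(recalls, precisions):
    """Area under a PR curve by the trapezoidal rule (one common
    convention; the paper does not specify its integration scheme)."""
    order = np.argsort(recalls)
    r = np.asarray(recalls, dtype=float)[order]
    p = np.asarray(precisions, dtype=float)[order]
    return np.trapz(p, r)
```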

Table 4
AUC for combinations of different segment numbers K and radius steps Δr. The value in brackets gives the number of radii Nr that generates the corresponding radius step Δr; each column corresponds to a segment number K. The threshold is th = 0.05K for all categories.

Bubble
Δr (Nr)      K=50     K=90     K=130    K=170    K=210
1.12 (17)    77.14%   79.35%   79.27%   73.38%   57.52%
1.38 (14)    77.28%   80.26%   79.45%   72.37%   55.76%
1.80 (11)    77.71%   80.47%   79.92%   72.83%   55.65%
2.57 (8)     77.42%   79.54%   78.71%   70.85%   51.97%
4.50 (5)     71.56%   72.95%   73.17%   62.95%   42.72%

Cell
Δr (Nr)      K=50     K=90     K=130    K=170    K=210
1.45 (23)    92.09%   93.45%   93.70%   93.72%   93.75%
1.78 (19)    92.55%   93.57%   94.04%   94.10%   94.14%
2.29 (15)    91.69%   93.17%   93.33%   93.68%   93.74%
3.20 (11)    92.90%   93.98%   94.25%   94.22%   94.28%
5.33 (7)     91.25%   92.52%   92.99%   92.74%   92.56%

Coin
Δr (Nr)      K=50     K=90     K=130    K=170    K=210
1.85 (28)    98.67%   98.73%   98.86%   98.92%   98.90%
2.27 (23)    98.59%   98.73%   98.78%   98.85%   98.73%
2.94 (18)    97.20%   97.44%   98.83%   98.87%   98.83%
4.17 (13)    98.43%   97.41%   98.71%   98.69%   98.59%
7.14 (8)     92.67%   95.98%   98.17%   98.45%   98.50%

Seal
Δr (Nr)      K=50     K=90     K=130    K=170    K=210
1.30 (28)    98.42%   98.20%   98.14%   74.45%    0.00%
1.59 (23)    97.94%   98.29%   98.40%   75.50%    0.00%
2.06 (18)    98.59%   98.56%   98.60%   74.43%    0.00%
2.92 (13)    97.47%   97.56%   97.71%   59.50%    0.00%
5.00 (8)     95.68%   94.71%   92.82%   41.48%    0.00%

Sign
Δr (Nr)      K=50     K=90     K=130    K=170    K=210
2.56 (40)    88.23%   92.93%   91.58%   91.33%   92.14%
3.12 (33)    89.31%   88.37%   89.14%   90.04%   93.52%
4.00 (26)    91.64%   91.53%   92.00%   90.84%   91.89%
5.56 (19)    92.17%   91.12%   93.37%   93.53%   92.59%
9.09 (12)    92.65%   92.05%   91.96%   92.29%   89.72%

In digital images, we assume that the number of edge points along a circle approximates its perimeter, i.e., 2πr. Thus, if K is greater than 2πr, the number of segments exceeds the count of edge points along the circle, and in a perfect, noise-free case only 2πr of the K segments can be covered. In that regime the purpose of our formulation becomes similar to the Hough transform: it counts the number of edge points along the circle boundary. When K is less than 2πr, our formulation clearly departs from the Hough transform.

In the summary we use a unified threshold th = 0.05K to pick out potential candidates. The analysis above shows that, as K increases, the threshold th = 0.05K will at some point exceed 2πr for a given radius r; the algorithm then misses every circle whose radius is at most that r. Increasing K therefore eliminates more false positives at the cost of discarding more true positives, and circles with small radii in particular are likely to be neglected when K is large. In the 'Seal' and 'Bubble' categories, the threshold th is the main reason for the drop in AUC: for 'Seal' with K = 210 the detector misses everything, and for 'Bubble' it loses most of the instances, because the chosen K exceeds 2π times the upper bound of the radius. Generally speaking, to take care of circles with small radii, the threshold th should be chosen according to both the segment number K and the radius range (the threshold rules are collected in the short helper at the end of this subsection).

The situation for the other 3 categories is different: Table 4 shows that our method performs steadily across different K and Δr. Take 'Cell' for example: the best performance for every K occurs at Δr = 3.2, and the minimum and maximum AUC over K, attained at K = 50 and K = 210, are 92.90% and 94.28%, which are very close to each other.

We therefore suggest modifying the threshold to th = min(0.05K, 7), where 7 bounds the threshold from above. The new threshold gives the results in Table 5; since the threshold is unchanged for K = 50, 90 and 130, those columns are omitted for clarity. Except for the category 'Cell', the performance of the other four categories stays the same or improves over the previous threshold, and for 'Seal' the new threshold eliminates the case that missed all instances.

Table 5
AUC for combinations of different segment numbers K and radius steps Δr under the modified threshold th = min(0.05K, 7). The value in brackets gives the number of radii Nr that generates the corresponding radius step Δr; each column corresponds to a segment number K.

Bubble
Δr (Nr)      K=170    K=210
1.12 (17)    73.38%   57.52%
1.38 (14)    72.37%   55.76%
1.80 (11)    72.83%   55.65%
2.57 (8)     70.85%   51.97%
4.50 (5)     62.95%   42.72%

Cell
Δr (Nr)      K=170    K=210
1.45 (23)    93.10%   92.54%
1.78 (19)    93.55%   93.07%
2.29 (15)    92.53%   91.81%
3.20 (11)    92.64%   91.80%
5.33 (7)     84.42%   81.83%

Coin
Δr (Nr)      K=170    K=210
1.85 (28)    98.92%   98.90%
2.27 (23)    98.85%   98.73%
2.94 (18)    98.87%   98.83%
4.17 (13)    98.69%   98.59%
7.14 (8)     98.45%   98.50%

Seal
Δr (Nr)      K=170    K=210
1.30 (28)    98.11%   98.12%
1.59 (23)    98.33%   98.40%
2.06 (18)    98.42%   98.41%
2.92 (13)    97.44%   97.50%
5.00 (8)     93.62%   94.35%

Sign
Δr (Nr)      K=170    K=210
2.56 (40)    91.88%   92.57%
3.12 (33)    91.17%   93.76%
4.00 (26)    91.27%   92.23%
5.56 (19)    93.94%   92.90%
9.09 (12)    92.57%   91.31%

The discretization step Δr affects the final performance in two opposing ways. On the one hand, it determines γ: increasing Δr introduces a larger spatial blur into the voting map, which enhances the algorithm's ability to handle deformation. On the other hand, a larger Δr increases the risk of missing a circle whose radius falls between two adjacent discretized radii and degrades localization accuracy. The 'Coin' category exhibits little deformation, so a small step Δr helps cover all possible radii and improves localization; for 'Sign' the situation is the opposite, as considerable deformation occurs in that category and a large Δr helps the method accommodate it. From the perspective of recall and accurate localization, the presented method thus favors a smaller Δr; from the perspective of deformation, it prefers a larger one. According to the experimental results, choosing Δr between 2 and 4 is appropriate, and if we can tolerate a slight risk of performance degradation we can directly fix the step around 3.
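Both threshold rules, the original th = 0.05K and the modified th = min(0.05K, 7), together with the 2πr coverage bound discussed above, are simple enough to collect in one short helper (the function names are ours, not the paper's):

```python
import math

def coverage_threshold(K, cap=None):
    """th = 0.05*K, optionally bounded from above, e.g. min(0.05*K, 7)."""
    th = 0.05 * K
    return th if cap is None else min(th, cap)

def is_detectable(radius, K, th):
    """A circle of this radius contributes roughly 2*pi*r edge points, so it
    can cover at most min(2*pi*r, K) of the K segments; it can only pass the
    threshold when that coverage reaches th."""
    return min(2 * math.pi * radius, K) >= th
```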

The results in Table 4 verify this point.

By connecting the spatial blur parameter γ with the discretization step Δr, and the threshold th with K, we reduce the tuned parameters to two. If we can further tolerate a slight risk of performance degradation, we can fix the discretization step near 2 or 3; this relaxation reduces the tuned parameters to one and increases the usability of our algorithm.

6. Conclusion

In this paper, we have proposed a sparse structure for fast circle detection. Our Canny plus 3D voting with the sparse structure performs well and steadily on almost all types of images, and the default parameters of Canny work fine for all images involved in our experiments. The experimental results show that the proposed method accurately and reliably localizes circles among clutter, and, compared with EDCircle and its variants, it is less affected by degraded image quality. Besides, we have reduced the number of parameters of our algorithm, making it easy to use. Though the proposed method is inferior to EDCircle and its variants in running time, it improves the accuracy at the cost of a slight decrease in detection speed. To summarize, the proposed algorithm is a good choice for application scenarios requiring high accuracy.

There are several avenues for further research. At the current stage, for a fast implementation, our approach uses a unified K for all radii; next, we want to introduce an adaptive way of choosing K that depends on the circle radius, in order to better handle circles with small radii. As for running speed, the time complexity of the proposed algorithm is O((R/Δr)·Ne), which depends linearly on the number of edge points Ne; reducing this number would speed up the method. For that purpose, we could adopt shape primitives that are more compact than edge points, such as line segments, which could be the basis of a future modification.

Acknowledgment

The authors would like to thank the NSFC for support under grant numbers 61973245, 91120009, 61328303 and 61305051.

Supplementary material

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.patcog.2019.107022.

References

[1] T. D'Orazio, C. Guaragnella, M. Leo, A. Distante, A new algorithm for ball recognition using circle hough transform and neural classifier, Pattern Recognit. 37 (3) (2004) 393–408.
[2] E. Cuevas, D. Oliva, M. Díaz, D. Zaldívar, M. Pérez-Cisneros, G. Pajares, White blood cell segmentation by circle detection using electromagnetism-like optimization, Comput. Math. Methods Med. 2013 (2013) 395071.
[3] Y.E. Soelistio, E. Postma, A. Maes, Circle-based eye center localization (CECL), in: 2015 14th IAPR International Conference on Machine Vision Applications (MVA), 2015, pp. 349–352.
[4] A.O. Djekoune, K. Messaoudi, K. Amara, Incremental circle hough transform: an improved method for circle detection, Optik 133 (2017) 17–31.
[5] K.W. Bowyer, K. Hollingsworth, P.J. Flynn, Image understanding for iris biometrics: a survey, Comput. Vis. Image Underst. 110 (2) (2008) 281–307.
[6] N. Barnes, A. Zelinsky, Real-time radial symmetry for speed sign detection, in: IEEE Intelligent Vehicles Symposium, 2004, pp. 566–571, doi:10.1109/IVS.2004.1336446.
[7] S.K. Berkaya, H. Gunduz, O. Ozsen, C. Akinlar, S. Gunal, On circular traffic sign detection and recognition, Expert Syst. Appl. 48 (2016) 67–75.
[8] E.H. Barney Smith, B. Lamiroy, Circle detection performance evaluation revisited, Springer International Publishing, Cham, pp. 3–18.
[9] R.O. Duda, P.E. Hart, Use of the hough transformation to detect lines and curves in pictures, Commun. ACM 15 (1) (1972) 11–15.
[10] D. Ballard, Generalizing the hough transform to detect arbitrary shapes, Pattern Recognit. 13 (2) (1981) 111–122.
[11] L. Xu, E. Oja, P. Kultanen, A new curve detection method: randomized hough transform (RHT), Pattern Recognit. Lett. 11 (5) (1990) 331–338.
[12] Y. Su, Y. Liu, X. Huang, Circle detection based on voting for maximum compatibility, IEICE Trans. Inf. Syst. 95 (6) (2012) 1636–1645.
[13] P. Mukhopadhyay, B.B. Chaudhuri, A survey of hough transform, Pattern Recognit. 48 (3) (2015) 993–1010.
[14] H. Yuen, J. Princen, J. Illingworth, J. Kittler, Comparative study of hough transform methods for circle finding, Image Vis. Comput. 8 (1) (1990) 71–77.
[15] G. Bradski, The OpenCV Library, 2000.
[16] C. Kimme, D. Ballard, J. Sklansky, Finding circles by an array of accumulators, Commun. ACM 18 (1975) 120–122.
[17] R.K. Yip, P.K. Tam, D.N. Leung, Modification of hough transform for circles and ellipses detection using a 2-dimensional array, Pattern Recognit. 25 (9) (1992) 1007–1022.
[18] D. Kerbyson, T. Atherton, Circle detection using hough transform filters, in: Fifth International Conference on Image Processing and its Applications, 1995, pp. 370–374.
[19] T. Peng, http://www.mathworks.com/matlabcentral/fileexchange/9168-detect-circles-with-various-radii-in-grayscale-image-via-hough-transform, 2010.
[20] T. Atherton, D. Kerbyson, Using phase to represent radius in the coherent circle hough transform, in: IEE Colloquium on Hough Transforms, 1993, pp. 5/1–5/4.
[21] T. Atherton, D. Kerbyson, Size invariant circle detection, Image Vis. Comput. 17 (11) (1999) 795–803.
[22] N. Guil, E.L. Zapata, Lower order circle and ellipse hough transform, Pattern Recognit. 30 (10) (1997) 1729–1744.
[23] H. Li, M.A. Lavin, R.J.L. Master, Fast hough transform: a hierarchical approach, Comput. Vis. Graph. Image Process. 36 (2–3) (1986) 139–161.
[24] V.F. Leavers, The dynamic generalized hough transform: its relationship to the probabilistic hough transforms and an application to the concurrent detection of circles and ellipses, Comput. Vis. Graph. Image Process. 56 (3) (1992) 381–398.
[25] D. Ioannou, W. Huda, A.F. Laine, Circle recognition through a 2d hough transform and radius histogramming, Image Vis. Comput. 17 (1) (1999) 15–26.
[26] T.-C. Chen, K.-L. Chung, An efficient randomized algorithm for detecting circles, Comput. Vis. Image Underst. 83 (2) (2001) 172–191.
[27] S.-H. Chiu, J.-J. Liaw, An effective voting method for circle detection, Pattern Recognit. Lett. 26 (2) (2005) 121–133.
[28] W. Lu, J. Tan, Detection of incomplete ellipse in images with strong noise by iterative randomized hough transform (IRHT), Pattern Recognit. 41 (4) (2008) 1268–1279.
[29] H.J. Wolfson, I. Rigoutsos, Geometric hashing: an overview, IEEE Comput. Sci. Eng. 4 (4) (1997) 10–21, doi:10.1109/99.641604.
[30] K.-L. Chung, Y.-H. Huang, S.-M. Shen, A.S. Krylov, D.V. Yurin, E.V. Semeikina, Efficient sampling strategy and refinement strategy for randomized circle detection, Pattern Recognit. 45 (1) (2012) 252–263.
[31] H. Zhang, K. Wiklund, M. Andersson, A fast and robust circle detection method using isosceles triangles sampling, Pattern Recognit. 54 (2016) 218–228.
[32] T.D. Marco, D. Cazzato, M. Leo, C. Distante, Randomized circle detection with isophotes curvature analysis, Pattern Recognit. 48 (2) (2015) 411–421.
[33] M. Van Ginkel, J. Van de Weijer, L. Van Vliet, P. Verbeek, Curvature estimation from orientation fields, in: Proceedings of the Scandinavian Conference on Image Analysis, vol. 2, 1999, pp. 545–552.

[34] C.-T. Ho, L.-H. Chen, A fast ellipse/circle detector using geometric symmetry, Pattern Recognit. 28 (1) (1995) 117–124.
[35] Y. Lei, K.C. Wong, Ellipse detection based on symmetry, Pattern Recognit. Lett. 20 (1) (1999) 41–47.
[36] B. Yuan, M. Liu, Power histogram for circle detection on images, Pattern Recognit. 48 (10) (2015) 3268–3280.
[37] X. Hilaire, K. Tombre, Robust and accurate vectorization of line drawings, IEEE Trans. Pattern Anal. Mach. Intell. 28 (6) (2006) 890–904.
[38] B. Lamiroy, L. Fritz, O. Gaucher, Robust circle detection, in: Ninth International Conference on Document Analysis and Recognition, vol. 1, 2007, pp. 526–530, doi:10.1109/ICDAR.2007.4378765.
[39] C. Akinlar, C. Topal, EDCircles: a real-time circle detector with a false detection control, Pattern Recognit. 46 (3) (2013) 725–740.
[40] G. Ciplak, A quantitative comparison of state of the art circle detection algorithms, Master's thesis, Anadolu Universitesi, 2016.
[41] C. Akinlar, C. Topal, EDLines: a real-time line segment detector with a false detection control, Pattern Recognit. Lett. 32 (13) (2011) 1633–1642.
[42] C. Topal, C. Akinlar, Edge drawing: a combined real-time edge and segment detector, J. Vis. Commun. Image Represent. 23 (6) (2012) 862–872.
[43] C. Akinlar, C. Topal, EDPF: a real-time parameter-free edge segment detector with a false detection control, Int. J. Pattern Recognit. Artif. Intell. 26 (01) (2012) 1255002.
[44] C. Lu, S. Xia, W. Huang, M. Shao, Y. Fu, Circle detection by arc-support line segments, in: 2017 IEEE International Conference on Image Processing, IEEE, 2017, pp. 76–80.
[45] G. Schuster, A. Katsaggelos, Robust circle detection using a weighted MSE estimator, in: 2004 International Conference on Image Processing (ICIP '04), vol. 3, 2004, pp. 2111–2114.
[46] I. Frosio, N.A. Borghese, Real-time accurate circle fitting with occlusions, Pattern Recognit. 41 (3) (2008) 1041–1055.
[47] J.A. Nelder, R. Mead, A simplex method for function minimization, Comput. J. 7 (4) (1965) 308–313.
[48] V. Ayala-Ramirez, C.H. Garcia-Capulin, A. Perez-Garcia, R.E. Sanchez-Yanez, Circle detection on images using genetic algorithms, Pattern Recognit. Lett. 27 (6) (2006) 652–657.
[49] R.S. Stephens, Probabilistic approach to the hough transform, Image Vis. Comput. 9 (1991) 66–71.
[50] B. Leibe, A. Leonardis, B. Schiele, Combined object categorization and segmentation with an implicit shape model, in: ECCV Workshop on Statistical Learning in Computer Vision, 2004, pp. 17–32.
[51] B. Leibe, A. Leonardis, B. Schiele, An implicit shape model for combined object categorization and segmentation, in: Toward Category-Level Object Recognition, Springer, 2006, pp. 508–524.
[52] J. Shotton, A. Blake, R. Cipolla, Multiscale categorical object recognition using contour fragments, IEEE Trans. Pattern Anal. Mach. Intell. 30 (7) (2008) 1270–1281.
[53] A. Opelt, A. Pinz, A. Zisserman, A boundary-fragment-model for object detection, in: European Conference on Computer Vision, Springer Berlin/Heidelberg, 2006, pp. 575–588.


[54] V. Ferrari, F. Jurie, C. Schmid, From images to shape models for object detection, Int. J. Comput. Vis. 87 (3) (2010) 284–303.
[55] G. Borgefors, Hierarchical chamfer matching: a parametric edge matching algorithm, IEEE Trans. Pattern Anal. Mach. Intell. 10 (1988) 849–865.
[56] M.-Y. Liu, O. Tuzel, A. Veeraraghavan, R. Chellappa, Fast directional chamfer matching, in: 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010, pp. 1696–1703.
[57] P.F. Felzenszwalb, D.P. Huttenlocher, Distance transforms of sampled functions, Theory Comput. 8 (1) (2012) 415–428.
[58] www.flickr.com.
[59] E. Zelniker, I. Clarkson, Maximum-likelihood estimation of circle parameters via convolution, IEEE Trans. Image Process. 15 (4) (2006) 865–876.
[60] Y. Du, J. Chen, Robust parameter-free coin detector, https://github.com/answeror/rpcd, 2012.
[61] J. Canny, A computational approach to edge detection, IEEE Trans. Pattern Anal. Mach. Intell. 8 (6) (1986) 679–698.

Yuanqi Su received his B.E. degree in mechanical and automatic engineering and his M.E. degree in pattern recognition and intelligent systems from Xi'an Jiaotong University, Xi'an, China, in 2003 and 2006, respectively, and his Ph.D. degree in pattern recognition and intelligent systems from Xi'an Jiaotong University in 2012. He now works in the Department of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, China. His research interests include image processing, computer vision and computer graphics.

Xiaoning Zhang received her B.E. degree from Zhengzhou University, Zhengzhou, China, in 2016. She is currently pursuing master's studies in the Department of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, China. Her research interests include computer vision and computer graphics.

Bonan Cuan received his B.E. and M.E. degrees from Xi'an Jiaotong University, China, in 2012 and 2015, respectively. He is currently pursuing doctoral studies at INSA Lyon, CNRS, LIRIS, Lyon, France. His research interests include computer vision and computer graphics.

Yuehu Liu received his B.E. and M.E. degrees from Xi'an Jiaotong University, China, in 1984 and 1989, respectively, and his Ph.D. degree in electrical engineering from Keio University, Japan, in 2000. He is now a professor at Xi'an Jiaotong University, Xi'an, China. His research focuses on computer vision, computer graphics and digital content analysis. He is a member of the IEEE and IEICE.

Zehao Wang received his B.E. degree from the China University of Petroleum, Qingdao, China, in 2017. He is currently pursuing master's studies in the Department of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, China. His research interests include computer vision and computer graphics.