Pattern Recognition Letters 129 (2020) 232–239
Segmentation strategies for the alpha-tree data structure
Georgios K. Ouzounis
Kauno kolegija/University of Applied Sciences, Faculty of Technology, Pramones prospektas 20, Kaunas 50468, Lithuania
Article info
Article history: Received 1 June 2019; Revised 19 November 2019; Accepted 20 November 2019; Available online 21 November 2019
Keywords: Segmentation, Clustering, Alpha-tree
Abstract
The alpha-tree is a versatile algorithm for color image segmentation that employs attribute constraints to control the partitioning of the alpha-connected image space. Attribute constraints are enforced using a maximization strategy that returns the set of the largest connected components complying with these constraints, assuming no prior violations from nested sub-components. This article presents two new strategies that extend the way this set is defined. These are the non-target clustering and attribute maximization strategies, which give access to segments that could not be defined previously. Collectively they enable the handling of texture-rich regions that cannot be clustered into meaningful segments, and compute the unsupervised segmentation of images by seeking extreme attribute values.
1. Introduction
The alpha-tree is a hierarchical partition representation data structure [14,15] introduced in the context of supervised color image segmentation. It is a versatile algorithm that organizes the image information content using the construct of alpha-connectivity [21,24], generating a hierarchy of nested α-connected components. This hierarchy enables a memory- and CPU-efficient computation of segmentation operators based on the concept of constraint connectivity [21], which were previously computed sequentially. The alpha-tree was originally conceived to address segmentation challenges in very large remote sensing images [7,12,15] and was later extended to provide automatically labeled training samples for deep learning algorithms [1,13]. Compared to other tree-based image representation data structures [6,8,17], the alpha-tree is distinguished by its ability to encode content from images or volume sets consisting of any number of channels (gray-scale, RGB, multi-spectral or hyper-spectral) while maintaining a fine representation granularity; the upper limit on the number of connected components represented by an alpha-tree is twice the number of input image pixels. The alpha-tree uses pixel dissimilarity metrics to organize pixels into connected components and to order the latter. This is an alternative to marginal or vectorial processing methods, which require the definition of a total order (or preorder) relation on the set of raw, multi-value components [2,3]. Attribute filters [5] implemented on component-trees for multi-valued images have been presented in [10]. Constraint-connectivity operators implemented
https://doi.org/10.1016/j.patrec.2019.11.027
0167-8655/© 2019 Elsevier B.V. All rights reserved.
on modified component trees of the inverse of the image edge graphs (min-trees) lead to a different data structure in [11,26]. Segmentation on the alpha-tree structure is driven by a maximization strategy, referred to as the Constraint-Connected Component Processor or CCCP [15], that seeks the largest α-connected component along each root-path (tree traversal path from a leaf node to the root) that satisfies a set of attribute constraints together with all its nested sub-components. Connected component attributes include size, shape descriptors, intensity or color-space features, etc. They are evaluated against scalar thresholds or vectors of threshold values using logical predicates [20] that return true upon compliance and false otherwise. CCCP using the ω-range attribute was demonstrated in [21,22]. Its performance was further improved by incorporating concepts such as the α-connectivity strength and other transition-zone resolution methods [23,25].
Although CCCP is very efficient for the kind of challenges it was designed for, it has a number of limitations that prevent it from generalizing further. The first concerns objects or desirable segments containing highly textured image regions embedded in non-textured immediate surroundings. Establishing connections between constituent α-connected components separated by large dissimilarity values requires tolerating high α thresholds, which bears the risk of leakage, i.e. establishing connections with other non-desirable regions/α-connected components. The second concerns accessing connected components in the same root-path past the first attribute constraint violation. There exist applications, as will be shown later in this article, that benefit from the extraction of multiple α-connected components along each root-path, and this is not supported by CCCP. In response to these challenges, two new segmentation strategies are presented in this article: the non-target clustering and
the attribute maximization. Pseudo-code and real image demos are given for each one. The demos do not aim at solving specific problems and no comparisons to other methods are given. Instead, the aim is to introduce a wider range of segmentation operators and thus to enrich the range of applications that can be addressed by the alpha-tree algorithm. The structure of the article is as follows: Section 2 gives a brief overview of the concept of α-connectivity and of the alpha-tree data structure. Section 3 reviews the original attribute-constraint segmentation strategy for alpha-trees and presents the two new ones, each demonstrated on separate exercises. Section 4 discusses the results, and the article closes with a summary of findings and conclusions in Section 5.
2. Background theory
The alpha-tree [14,15] is a spatially rooted dendrogram representing a hierarchical organization of the image information content in which pixels are grouped into α-connected components [21] or α-CCs; α is a threshold on some dissimilarity metric between adjacent elements of the image definition domain. This section discusses the notions of pixel dissimilarity and α-connectivity, how the latter is used to define partitions, and how stacking α-partitions together under certain conditions leads to the notion of the alpha-tree.
2.1. Alpha connectivity
Let a path π(x ⇝ y) between any two endpoints x, y ∈ E be a chain of pairwise-adjacent pixels of the image definition domain E, indexed from 0 to N:
$$\pi(x \leadsto y) \equiv \langle x = x_0, x_1, \ldots, x_N = y \rangle. \tag{1}$$
Definition 1. Given the set Π ≠ ∅ of all possible paths between any two elements x, y ∈ E in a total of C channels, and a dissimilarity measure d_c between any two adjacent points within a neighborhood kernel K of the same channel c ∈ C, i.e. x_i, x_{i+1} ∈ K, the global dissimilarity measure between x and y is the ultrametric functional d̂:
$$\hat{d}(x, y)_{x,y \in E} = \bigwedge_{\pi \in \Pi} \left\{ \bigvee_{i \in [0, N_{\pi}-1],\; c \in C} d_c(x_i, x_{i+1}) \;\middle|\; x_i, x_{i+1} \in \pi \right\}. \tag{2}$$
In words, the dissimilarity measure d̂ between any two path-connected elements of E is the infimum among the set of values, each corresponding to the maximal dissimilarity between pairwise adjacent elements along each path in Π and in each channel c. K is often a square kernel of size 3 × 3 centered on x_i. It defines the set of pixels that are adjacent to x_i subject to the 4- or 8-way grid connectivity. Relaxing the size/shape constraint for K leads to generalized notions of α-connectivity such as the α equivalents of clustering/contraction-based [4,19] or hybrid (mask-based) [16] second-generation connections. The global dissimilarity metric of Def. 1 has been used to define α-connected components [21], also known as quasi-flat zones [9].
Definition 2. An α-CC marked by a point x is a set of maximal extent consisting of the union of x with all points y such that for each one there exists a path from x to y in which adjacent elements have a dissimilarity less than or equal to α, i.e. for which the global dissimilarity d̂ is less than or equal to α:
$$\alpha\text{-CC}_x = \{x\} \cup \{\, y \mid \hat{d}(x, y) \le \alpha \,\}. \tag{3}$$
The α-CCs are equivalence classes on the image definition domain; consequently, the set of α-CCs for all x ∈ E defines a partition of E, i.e. α-CCs are both collectively exhaustive and mutually exclusive in E [15]. Given a point x and an α-dissimilarity range A = [0, 1, ..., α_max], for any two α values in A:
$$\alpha_i\text{-CC}_x \subseteq \alpha_j\text{-CC}_x, \quad \forall\, \alpha_i \le \alpha_j, \text{ and } \alpha_i, \alpha_j \in A, \tag{4}$$
i.e. increasing the threshold α results in larger α-CCs and thus in coarser partitions of E.
2.2. The alpha-tree data structure
Let P_A be the set of all α-partitions of E, ∀ α ∈ A such that |A| > 1. Given a point x ∈ E marking a cell of a partition P_{α_j} ∈ P_A with α_j ∈ A, then for any other α_i ≤ α_j:
$$\forall\, x \in E, \quad \alpha_i\text{-CC}_x \subseteq \alpha_j\text{-CC}_x \;\Rightarrow\; P_{\alpha_i} \sqsubseteq P_{\alpha_j}. \tag{5}$$
The symbol ⊑ denotes an order relation with respect to α ∈ A. The family of all ordered partitions of E for the entire α-dissimilarity range defines a partition pyramid:
Definition 3. A partition pyramid of E for |A| > 1 is a mapping Ψ_A : E → {E}^A given by:
$$\Psi_A = \{ P_{\alpha=0}, P_{\alpha=1}, \ldots, P_{\alpha=\alpha_{max}} \} \;\mid\; P_{\alpha} \sqsubseteq P_{\alpha'}, \quad \forall\, \alpha < \alpha' \text{ with } \alpha, \alpha' \in A. \tag{6}$$
A pyramid level α ∈ A is a partition P_α of E. As a consequence of (5), the base of the pyramid corresponds to the finest and the tip to the coarsest α-connected partition of E, i.e. to the set of components at α = 0, also known as flat zones [18], and to the single α_max-CC associated with the image definition domain, respectively. Partition pyramids often carry redundancies: cells can propagate through consecutive pyramid levels unchanged, thus adding overhead to the underlying data structure. To counter this, an index mapping of α-CCs is introduced that leads to a hierarchical partition representation structure configured with strict inclusion. Consider a variable j ∈ J_α, with J_α ⊆ Z being an index set employed to address the α-CCs of P_α. Given a point x ∈ E, there exists exactly one j ∈ J_α such that x ∈ α-CC_j.
Definition 4. Let Ψ_A be an α-connected partition pyramid of an image I, defined for a dissimilarity measure range A. An α-connected partition hierarchy H_A is a family of ordered mappings H^A_α : J_α → K_α with K_α ⊆ J_α, given by:
$$H_A = \{ H^A_{\alpha=0}, H^A_{\alpha=1}, \ldots, H^A_{\alpha_{max}} \} \;\mid\; H^A_{\alpha} \prec H^A_{\alpha'}, \quad \forall\, \alpha < \alpha' \text{ with } \alpha, \alpha' \in A, \tag{7}$$
and
$$H^A_{\alpha} = \{\, \alpha\text{-CC}_j \mid (\alpha\text{-CC}_j \in P_{\alpha}) \wedge (\alpha\text{-CC}_j \notin P_{\alpha-1}) \,\}, \quad \forall\, \alpha > 0,\; \forall\, j \in J_{\alpha}, \text{ and } \forall\, \alpha \in A. \tag{8}$$
An example of an α-connected partition pyramid and hierarchy for a simple 9-pixel image chip is shown in Fig. 1. The top row shows the original tile at the left with the pixel intensities imprinted. The red bonds between pixels indicate α-linkage relations established between pixels as the dissimilarity threshold α increases from 0 to 2 (left to right). For α = 2 all pixels are linked into a single α-CC that coincides with the image definition domain. Note that not all pixels are directly linked with all their neighbors. The middle row shows the partition pyramid corresponding to the same image tile. Note that at level α = 1 the three right-most nodes are replicas of the same ones at level α = 0. This is the type of redundancy that the partition hierarchy, or alpha-tree in this case, eliminates. The respective alpha-tree is shown in the bottom row. The leaves of the tree are strictly defined as the nodes at level 0, and the root as the single node furthest from the leaves.
Fig. 2. CCCP segmentation using the omega-range attribute.
Fig. 1. Example of a partition pyramid and hierarchy. Top row: a 9-pixel tile with their intensities printed. Red bonds indicate α -linkage for different values of the dissimilarity threshold α . Middle and bottom rows: the α -connected partition pyramid and hierarchy of the input tile respectively.
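The construction of Definitions 2 to 4 can be illustrated with a minimal Python sketch. It assumes a single-channel image, 4-way adjacency and the absolute intensity difference as dissimilarity; the 3 × 3 tile and the function names `alpha_partition` and `partition_hierarchy` are hypothetical and do not reproduce the tile of Fig. 1 or the implementation of [15].

```python
from itertools import product


def alpha_partition(img, alpha):
    """Return the alpha-connected components of a 2-D grey-scale image.

    Assumption: 4-adjacent pixels are linked when their absolute
    intensity difference is <= alpha. The result is a set of frozensets
    of (row, col) pixels, i.e. one partition of the image domain.
    """
    rows, cols = len(img), len(img[0])
    parent = {p: p for p in product(range(rows), range(cols))}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    for r, c in product(range(rows), range(cols)):
        for dr, dc in ((0, 1), (1, 0)):  # right and bottom neighbours
            q = (r + dr, c + dc)
            if q in parent and abs(img[r][c] - img[q[0]][q[1]]) <= alpha:
                parent[find(q)] = find((r, c))

    cells = {}
    for p in parent:
        cells.setdefault(find(p), set()).add(p)
    return {frozenset(cell) for cell in cells.values()}


def partition_hierarchy(img, alpha_max):
    """Build the pyramid, then keep per level only the components that
    are absent from the level below (the idea behind Eq. (8))."""
    pyramid = [alpha_partition(img, a) for a in range(alpha_max + 1)]
    hierarchy = [pyramid[0]]
    for a in range(1, alpha_max + 1):
        hierarchy.append(pyramid[a] - pyramid[a - 1])
    return pyramid, hierarchy


# Hypothetical 3x3 tile, chosen only for illustration.
tile = [[1, 1, 4],
        [1, 2, 4],
        [6, 6, 4]]
pyramid, hierarchy = partition_hierarchy(tile, alpha_max=2)
print([len(level) for level in pyramid])    # [4, 3, 1]
print([len(level) for level in hierarchy])  # [4, 1, 1]
```

In this toy case two cells of the level α = 1 are replicas of cells at α = 0, so the hierarchy stores 6 nodes where the pyramid stores 8.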
3. Segmentation strategies
This section discusses segmentation strategies for the alpha-tree algorithm, starting with a review of the original from [15] and followed by two new ones. Image segmentation aims at organizing the image information content in disjoint segments, each of which coincides with an identifiable region or object in the original image. Attribute-constraint segmentation [15] is a framework devised for the alpha-tree algorithm that extends the original work on intensity-based constraint connectivity operators [21] to structural attributes and beyond. Attribute-constraint segmentation is formalized through logical predicates [20]. Consider first-order logic as a symbolic formal system and let θ be a logical predicate defined as a function that returns true when the associated argument is satisfied and false otherwise. Logical predicates are typically decreasing, i.e. for any X, Y ⊆ E:
$$\theta(X) = \text{true} \;\Rightarrow\; \theta(Y) = \text{true}, \quad \forall\, Y \subseteq X. \tag{9}$$
Given an operator attr() that returns the value of a selected attribute of an α-CC, the logical predicate can be expressed as:
$$\theta^{\tau}(\alpha\text{-CC}) = \begin{cases} \text{true} & \text{if } attr(\alpha\text{-CC}) \le \tau \\ \text{false} & \text{otherwise,} \end{cases} \tag{10}$$
in which τ is an attribute threshold. α-CCs are usually evaluated using multiple attributes; each one requires its own threshold and predicate. Assuming |N| ≥ 1 attributes, let Θ be a family of decreasing predicates θ_n, with n ∈ N and τ_n ∈ T. The first predicate, i.e. for n = 1, typically evaluates a constraint on the α value. In this case, setting τ_1 = α_max allows the remaining predicates to be evaluated over the full extent of the alpha-tree.
The structured application of logical predicates on α-CCs along the alpha-tree is described as a strategy for returning attribute-constraint-connected components or ac³s. The latter are partition cells or segments and can be produced with a number of different methods, some of which are presented next.
3.1. Constrained connected component processor
The constrained connected component processor or CCCP is the original segmentation strategy introduced in [15]. CCCP starts at each leaf of the tree and, for each root-path (leaf-to-root tree traversal), seeks the furthest α-CC that satisfies all given predicates together with all its descendants. If an α-CC is reached that violates one or more of the predicates, all remaining nodes along that root-path are invalidated, i.e. assigned the background label.
Definition 5. Let Θ be a family of N decreasing predicates. An attribute-constraint-connected component assuming the CCCP segmentation strategy, ac³_x ⊆ E containing x ∈ E, is defined as the largest α-CC (and thus of the highest α value), marked by a point x, that satisfies all predicates, as do all its nested descendants (children tree nodes):
$$\mathrm{ac}^3_x = \left\{ \alpha_i\text{-CC}_x \;\middle|\; \begin{array}{l} \theta_n^{\tau}(\alpha_i\text{-CC}_x) = \text{true} \quad \forall\, \theta_n^{\tau} \in \Theta : n \in N \wedge \tau_n \in T \\ \text{and} \\ \theta_n^{\tau}(\alpha_k\text{-CC}_y) = \text{true} \quad \forall\, (k \le i \wedge y \in \alpha_i\text{-CC}_x) \end{array} \right\}. \tag{11}$$
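Definition 5 can be sketched on a toy tree as follows. This is a minimal Python illustration under stated assumptions: the tree is given as three hypothetical parallel lists (`parent`, `level`, `attr`), the root is its own parent, a single attribute is tested with the predicate of Eq. (10), and the constraint on α is folded into the same test rather than handled in a separate invalidation loop.

```python
BACKGROUND = -1  # hypothetical background label


def cccp(parent, level, attr, tau_alpha, tau_attr):
    """Return one output label per node following the CCCP strategy.

    parent[i]: index of the parent of node i (the root is its own parent)
    level[i] : alpha level of node i (leaves sit at level 0)
    attr[i]  : attribute value of node i, tested as attr <= tau_attr
    """
    n = len(parent)
    label = list(range(n))  # output label = node index
    by_level = sorted(range(n), key=lambda i: level[i])

    # Forward pass (leaves to root): a failing or already invalidated
    # node invalidates itself and its parent, so that a violation
    # climbs along the root-path.
    for i in by_level:
        ok = level[i] <= tau_alpha and attr[i] <= tau_attr
        if not ok or label[i] == BACKGROUND:
            label[i] = BACKGROUND
            label[parent[i]] = BACKGROUND

    # Backward pass (root to leaves): surviving labels are propagated
    # to the descendants.
    for i in reversed(by_level):
        p = parent[i]
        if p != i and label[p] != BACKGROUND:
            label[i] = label[p]
    return label


# Toy tree: leaves 0-3, node 4 = {0, 1}, node 5 = {2, 3}, root 6.
parent = [4, 4, 5, 5, 6, 6, 6]
level = [0, 0, 0, 0, 1, 2, 3]
omega = [0, 0, 0, 0, 1, 5, 9]  # e.g. an intensity range per node
labels = cccp(parent, level, omega, tau_alpha=3, tau_attr=2)
print(labels)  # [4, 4, 2, 3, 4, -1, -1]
```

Node 4 is the largest compliant component on the root-paths of leaves 0 and 1, whereas node 5 violates the attribute threshold, so leaves 2 and 3 remain separate segments and the root is invalidated.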
Computing the CCCP segmentation requires two passes through the alpha-tree; see Algorithm 1. The first detects the furthest α-CCs that comply with Eq. (11) and invalidates all others down to the root. The second, from the root of the tree to the leaves, restores the output labels and propagates them to the leaves.
Algorithm 1: Constraint-connected component processor.
    Input: at, τα, [τattr]    /* inputs: alpha-tree, alpha and attribute thresholds */
    at.resetOutputLabels()    // output label = node index
    maxAlpha = at.getMaxLevel()    // get the max tree level
    if maxAlpha < τα then
        τα = maxAlpha
    end
    /* forward pass - until alpha threshold */
    for level = 0 to τα do
        maxnodes = at.getNodesAtLevel(level)
        for node = 0 to maxnodes do
            decision = true
            for all attributes do
                attr = computeNodeAttribute(node)
                decision = decision AND computeLogicalPredicate(attr, τattr)
            end
            if decision == false OR at.newLabel[node] == backgroundLabel then
                parent = at.getNodeParent(node)
                at.newLabel[node] = backgroundLabel
                at.newLabel[parent] = backgroundLabel
            end
        end
    end
    /* forward pass - invalidate the remaining nodes */
    for level = τα to maxAlpha do
        maxnodes = at.getNodesAtLevel(level)
        for node = 0 to maxnodes do
            at.newLabel[node] = backgroundLabel
        end
    end
    /* backward pass - propagate decision labels to leaves */
    for level = maxAlpha to 0 do
        maxnodes = at.getNodesAtLevel(level)
        for node = 0 to maxnodes do
            parent = at.getNodeParent(node)
            if at.newLabel[parent] != backgroundLabel then
                at.newLabel[node] = at.newLabel[parent]
            end
        end
    end
Fig. 2 shows examples of CCCP segmentation. The images in both rows show the original photos on the left and the results
of the segmentation using constraints on the α and ω ranges on the right. Constraints on the α range limit the tolerated dissimilarity between adjacent elements of the same connected component. Constraints on the ω range [21] limit the dissimilarity range tolerated for the entire connected component; in the case of intensity dissimilarity, for example, that is the difference between the maximum and minimum intensities of the pixels of the same connected component. Alpha threshold values greater than the omega threshold value are not meaningful, since the dissimilarity between adjacent elements cannot exceed the connected component dissimilarity range ω. The settings are (τ(α), τ(ω)) = (150, 150) and (100, 100) respectively.
3.2. Non-target clustering
Algorithm 2: Non-target clustering.
    Input: at, τα, [τattr]    /* inputs: alpha-tree, alpha and attribute thresholds */
    at.resetOutputLabels()    // output label = node index
    maxAlpha = at.getMaxLevel()    // get the max tree level
    if maxAlpha < τα then
        τα = maxAlpha
    /* forward pass - until alpha threshold */
    for level = 0 to τα do
        maxnodes = at.getNodesAtLevel(level)
        for node = 0 to maxnodes do
            decision = true
            for all attributes do
                attr = computeNodeAttribute(node)
                decision = decision AND computeLogicalPredicate(attr, τattr)
            if decision then
                parent = at.getNodeParent(node)
                at.newLabel[parent] = backgroundLabel
            else
                at.newLabel[node] = backgroundLabel
    /* forward pass - invalidate the remaining nodes */
    for level = τα to maxAlpha do
        maxnodes = at.getNodesAtLevel(level)
        for node = 0 to maxnodes do
            at.newLabel[node] = backgroundLabel
    /* backward pass - propagate decision labels to leaves */
    for level = maxAlpha to 0 do
        maxnodes = at.getNodesAtLevel(level)
        for node = 0 to maxnodes do
            parent = at.getNodeParent(node)
            if at.newLabel[parent] != backgroundLabel then
                at.newLabel[node] = at.newLabel[parent]
The non-target clustering or NTC is a segmentation strategy that aims at objects/structures that are texture-rich compared to their
immediate background. In any such case, linking the constituent connected components into one that coincides with an object or region of interest requires a high value of α, which often causes leakage to the local background. Instead, NTC clusters large and relatively homogeneous image regions and filters out all other connected components. This can be used to create a binary image in which the foreground corresponds to all non-target clusters and the background to the residual, i.e. the set of holes created after the filter application [5]. Each hole can be labeled and treated as an object in itself, thus giving access to objects that cannot be defined using the CCCP strategy.
Definition 6. Let Θ be a family of N decreasing predicates. An attribute-constraint-connected component assuming the NTC segmentation strategy, ac³_x ⊆ E containing x ∈ E, is defined as the smallest α-CC (and thus of the lowest α value), marked by a point x, that satisfies all predicates given that none of its nested descendants does:
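$$\mathrm{ac}^3_x = \left\{ \alpha_i\text{-CC}_x \;\middle|\; \begin{array}{l} \theta_n^{\tau}(\alpha_i\text{-CC}_x) = \text{true} \quad \forall\, \theta_n^{\tau} \in \Theta : n \in N \wedge \tau_n \in T \\ \text{and} \\ \theta_n^{\tau}(\alpha_k\text{-CC}_y) = \text{false} \quad \forall\, (k < i \wedge y \in \alpha_i\text{-CC}_x \wedge n > 1) \end{array} \right\}. \tag{12}$$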
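Definition 6 can be sketched in Python on a toy tree. The sketch assumes the same hypothetical list-based tree layout as before (`parent`, `level`, with the root as its own parent), a single area attribute, and a predicate that holds when the area reaches `tau_area`; these choices are illustrative and are not taken from the implementation of [15].

```python
BACKGROUND = -1  # hypothetical background label


def ntc(parent, level, area, tau_alpha, tau_area):
    """Return one output label per node following the NTC strategy."""
    n = len(parent)
    label = list(range(n))  # output label = node index
    by_level = sorted(range(n), key=lambda i: level[i])

    # Forward pass (leaves to root).
    for i in by_level:
        if level[i] > tau_alpha:
            label[i] = BACKGROUND  # beyond the alpha threshold
        elif area[i] >= tau_area:
            # The node is large enough: its parent can no longer be
            # the smallest compliant component on this root-path.
            if parent[i] != i:
                label[parent[i]] = BACKGROUND
        else:
            label[i] = BACKGROUND  # too small: left as a hole

    # Backward pass (root to leaves): propagate the surviving labels.
    for i in reversed(by_level):
        p = parent[i]
        if p != i and label[p] != BACKGROUND:
            label[i] = label[p]
    return label


# Toy tree: leaf 0 is a large flat zone, leaves 1-2 are small textured
# components that only merge at alpha = 40, leaves 3-4 merge at alpha = 2.
parent = [7, 6, 6, 5, 5, 7, 7, 7]
level = [0, 0, 0, 0, 0, 2, 40, 60]
area = [120, 1, 1, 30, 90, 120, 2, 242]
labels = ntc(parent, level, area, tau_alpha=10, tau_area=100)
print(labels)  # [0, -1, -1, 5, 5, 5, -1, -1]
```

Leaves 1 and 2 play the role of the targets: they never reach the area threshold below τ(α) and are therefore left as background holes between the two clustered regions (labels 0 and 5).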
The logical predicate on α, with n = 1, is always true by default for all α-CCs that are subsets of ac³_x, i.e. with index k < i, since α is increasing; hence the restriction n > 1 in the second condition of Eq. (12). The computation of the NTC segmentation is shown in Algorithm 2.
The example in Fig. 3 shows an application of NTC segmentation in skin mole detection. NTC was configured with τ(α) = 10 and τ(area) = 100, i.e. set to cluster together all connected components that grow beyond 100 pixels in size at a dissimilarity level no higher than 10. This clustered the subject's torso in a single connected component, and the four background segments into four respective components, Fig. 3(b). All moles, which cluster into larger components only at much higher values of α, were retained as small, pixel-sized components that failed the size predicate up until τ(α) and were thus assigned the background label. After post-processing the result in (b) and assigning a unique label to each disjoint connected component, moles can be easily differentiated from the torso contour and other large segments on the basis of compactness and size. Isolating moles allows for counting them and measuring their size and shape, Fig. 3(c).
Fig. 3. NTC segmentation and attribute filtering for skin mole detection.
3.3. Attribute maximization
The attribute maximization or ATM, in contrast to the previous two, is an unsupervised image segmentation strategy that pursues extreme values of a selected attribute (attribute maximization or minimization) along each leaf-path, i.e. the tree traversal from the root node to each leaf node. The process may optionally be constrained by attribute filters [5] that limit the range of connected components to be considered; in this case it becomes a semi-supervised strategy. The ATM strategy accesses the root node and traverses towards the leaves. Each node visited is initially set to an accept state and then subjected to a number of optional filters. This may be useful for reducing or eliminating bias from very large connected components at high α values, as will be discussed next.
Algorithm 3: Attribute maximization.
    Input: at, τα, [τattr]    /* inputs: alpha-tree, alpha and attribute thresholds */
    at.resetOutputLabels()    // output label = node index
    maxAlpha = at.getMaxLevel()    // get the max tree level
    if maxAlpha < τα then
        τα = maxAlpha
    /* root to leaves pass: propagate background label */
    for level = maxAlpha to τα do
        maxnodes = at.getNodesAtLevel(level)
        for node = 0 to maxnodes do
            parent = at.getNodeParent(node)
            at.newLabel[node] = at.newLabel[parent]
    /* root to leaves pass: attribute maximization, all node members LeafPathAttribute are pre-initialized to 0 */
    for level = τα to 0 do
        maxnodes = at.getNodesAtLevel(level)
        for node = 0 to maxnodes do
            decision = true    // node accept decision
            parent = at.getNodeParent(node)
            // leaf-path extreme value
            path_attr = at.getLeafPathAttribute(parent)
            for all filter attributes do
                attr = computeNodeAttribute(node)
                decision = decision AND computeLogicalPredicate(attr, τattr)
            attr = computeATMnodeAttribute(node)
            if attr ≤ path_attr then
                decision = false
            if decision then
                at.newLabel[parent] = backgroundLabel
                at.setLeafPathAttribute(node, attr)
            else
                at.newLabel[node] = at.newLabel[parent]
                at.setLeafPathAttribute(node, path_attr)
If the node fails one or more of the predicates associated with the filters, its state is changed to reject. If it is not rejected, a further test evaluates whether the selected ATM attribute is maximized or minimized, as appropriate, in the given leaf-path. In the case of maximization, the test checks whether the current node has a higher attribute value than the highest registered so far in the given leaf-path. If it does, its value updates
the leaf-path extreme value and its output label is set to its original node identifier value. Otherwise it is set to a reject state and its new label becomes that of the previous winner node in that leaf-path. Consider the example in which the maximization attribute is the scale-invariant measure of compactness given by the inverse of the normalized moment of inertia [27]:
$$compactness(\alpha\text{-CC}) = \frac{area(\alpha\text{-CC})^2}{2\pi \sum_{p \in \alpha\text{-CC}} (p - \bar{p})^2}. \tag{13}$$
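Eq. (13) can be evaluated directly on a set of pixel coordinates, as in the following minimal Python sketch. It assumes that p̄ is the centroid of the component and that (p − p̄)² is the squared Euclidean distance; returning infinity for a component whose pixels coincide with the centroid (a single pixel) is a convention of this sketch only, and the function name is hypothetical.

```python
import math


def compactness(pixels):
    """Compactness of Eq. (13) for an iterable of (row, col) pixels."""
    pixels = list(pixels)
    area = len(pixels)
    cy = sum(y for y, _ in pixels) / area  # centroid row
    cx = sum(x for _, x in pixels) / area  # centroid column
    inertia = sum((y - cy) ** 2 + (x - cx) ** 2 for y, x in pixels)
    if inertia == 0:
        return math.inf  # assumption: single-pixel components
    return area ** 2 / (2 * math.pi * inertia)


square = [(y, x) for y in range(2) for x in range(2)]  # 2 x 2 block
line = [(0, x) for x in range(4)]                      # 1 x 4 strip
print(round(compactness(square), 4))  # 1.2732
print(round(compactness(line), 4))    # 0.5093
```

Both shapes have the same area, yet the elongated strip scores well below the square, which is the behaviour the ATM strategy exploits when compactness is maximized.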
Starting from the root node, i.e. the image definition domain, it is quite likely to obtain a very compact connected component that has no descendants of higher compactness. The effect of this would be to propagate the background label all the way to the leaves of the tree. The latter may be pixel-sized components and more compact than the background, but every other component in between the tree extremes will be discarded. This is referred to as bias, and to counter it additional attribute filters [5] can be considered depending on the maximization/minimization attribute. Size constraints are often used to limit the number of components.
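The root-to-leaves traversal and the bias described above can be sketched in Python on a toy tree. The sketch assumes a hypothetical list-based tree (`parent`, `level`, with the root as its own parent), a size filter as the only optional filter, and precomputed attribute values. Unlike Algorithm 3 it does not relabel the parent of a winner; it only records, for every node, the most recent winner on its leaf-path.

```python
BACKGROUND = -1  # hypothetical background label


def atm(parent, level, attr, size, tau_alpha, size_min, size_max):
    """Label each node with the last winner found on its leaf-path."""
    n = len(parent)
    label = [BACKGROUND] * n
    path_attr = [0.0] * n  # leaf-path extreme value, initialised to 0

    # Root to leaves: parents sit at higher levels than their children.
    for i in sorted(range(n), key=lambda k: level[k], reverse=True):
        p = parent[i]
        is_root = p == i
        best = 0.0 if is_root else path_attr[p]
        inherited = BACKGROUND if is_root else label[p]
        accepted = level[i] <= tau_alpha and size_min <= size[i] <= size_max
        if accepted and attr[i] > best:
            label[i], path_attr[i] = i, attr[i]        # new winner
        else:
            label[i], path_attr[i] = inherited, best   # keep previous winner
    return label


# Toy tree: leaves 0-3, node 4 = {0, 1}, node 5 = {2, 3}, root 6.
parent = [4, 4, 5, 5, 6, 6, 6]
level = [0, 0, 0, 0, 5, 8, 20]
size = [10, 10, 4, 40, 20, 44, 64]
compact = [0.6, 0.95, 0.99, 0.8, 0.7, 0.5, 0.9]

filtered = atm(parent, level, compact, size, tau_alpha=10,
               size_min=5, size_max=50)
biased = atm(parent, level, compact, size, tau_alpha=20,
             size_min=5, size_max=100)
print(filtered)  # [4, 1, 5, 3, 4, 5, -1]
print(biased)    # [6, 1, 6, 6, 6, 6, 6]
```

In the second call the filters admit the compact root, which then wins on almost every leaf-path; this is the bias that the α and size constraints of the first call remove.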
Definition 7. Let Θ be a family of N decreasing predicates designated for filtering and attr() be a function that returns the value of the component attribute selected for maximization along its leaf-path. An attribute-constraint-connected component assuming the ATM segmentation strategy, ac³_x ⊆ E containing x ∈ E, is defined as the smallest α-CC (and thus of the lowest α value), marked by a point x, that satisfies all filter predicates in Θ and for which the result of attr() is the highest among all its ancestor nodes:
$$\mathrm{ac}^3_x = \left\{ \alpha_i\text{-CC}_x \;\middle|\; \begin{array}{l} \theta_n^{\tau}(\alpha_i\text{-CC}_x) = \text{true} \quad \forall\, \theta_n^{\tau} \in \Theta : n \in N \wedge \tau_n \in T \\ \text{and} \\ attr(\alpha_i\text{-CC}_x) \ge attr(\alpha_k\text{-CC}_y) \quad \forall\, (k \ge i \wedge x \in \alpha_k\text{-CC}_y) \end{array} \right\}. \tag{14}$$
The computation of the ATM segmentation in the case of maximization is shown in Algorithm 3. The example in Fig. 4 shows the ATM segmentation of a highly textured brain aneurysm from CT data, image (a). The image dimensions are 630 × 630 pixels. Image (b) shows the aneurysm extracted using the ATM strategy. The ATM was configured to pursue maximal values of compactness, Eq. (13). The connected components of interest were limited with τ(α) = 100 and with τ(size_min) = 5000 and τ(size_max) = 10000. Image (c) shows the manually extracted reference object. A zoomed-in view of the end result after dilation with a kernel of size 7x7 is shown in (d).
Fig. 4. ATM segmentation and attribute filtering of a brain aneurysm.
The second example, in Fig. 5, uses the ATM segmentation to generate a partition of the image space free of supervision. It aims to form segments that coincide with building footprints in the original satellite image. The alpha-tree CCCP method was previously employed for supervised building footprint extraction in [7]. Image (a) shows a residential area in Istanbul, Turkey under heavy JPEG compression. It was purposely selected because red roofs pose a number of challenges in remote sensing image analysis. Image (b) shows the unsupervised organization of pixels into connected components, with most buildings identified by a single segment. The ATM, as in the previous example, is configured to pursue maximal values of the connected components' compactness attribute. The connected components of interest were limited with τ(α) = 100 and with τ(size_min) = 100 and τ(size_max) = 3000.
Fig. 5. ATM segmentation for building detection from satellite imagery.
4. Discussion of results
The CCCP is pre-existing and no further analysis is given. To evaluate the NTC segmentation, a dermatologist was asked to study image (c) of Fig. 3 and count the number of clearly identifiable skin moles and those relevant for frequent monitoring. The total count of skin moles is 78, out of which 15 were considered subject to annual inspection. According to the expert's opinion, 7 out of the 15 must be checked systematically. The skin moles are of different sizes and the very small ones were ignored. The NTC method counted a total of 77 mole candidates with true positives = 74, false positives = 3, false negatives = 4 and true negatives = 1 (the background). Summarizing the findings, the method demonstrated a precision of 96.1% and a recall of 94.8%.
To evaluate the result of the aneurysm segmentation in Fig. 4, the manually segmented target seen in image (c) was compared against progressively dilated versions of the actual segmentation result. The IoU was reported for each case. Table 1 summarizes the findings. The kernel of size 7x7 pixels gave the best approximation of the ground truth with an IoU of 90.12%.
Table 1. Brain segmentation using the ATM strategy; validation results.
Kernel    0x0       3x3       5x5       7x7       9x9
IoU       74.18%    84.80%    89.42%    90.12%    88.58%
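The measures used in this section can be sketched in a few lines of Python. The sketch assumes that segments are represented as sets of (row, col) pixels and that dilation uses a square k × k kernel; the function names and the toy squares are hypothetical and only illustrate why the IoU first rises and then falls as the kernel grows.

```python
def dilate(pixels, k):
    """Dilate a set of (row, col) pixels with a k x k square kernel (k odd)."""
    r = k // 2
    return {(y + dy, x + dx)
            for (y, x) in pixels
            for dy in range(-r, r + 1)
            for dx in range(-r, r + 1)}


def iou(a, b):
    """Intersection over union of two pixel sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 score from detection counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy case: a 6 x 6 reference square and a 4 x 4 segment inside it.
reference = {(y, x) for y in range(6) for x in range(6)}
segment = {(y, x) for y in range(1, 5) for x in range(1, 5)}
scores = [iou(reference, segment),
          iou(reference, dilate(segment, 3)),
          iou(reference, dilate(segment, 5))]
print([round(s, 4) for s in scores])  # [0.4444, 1.0, 0.5625]
```

With the counts reported above for Fig. 3 (TP = 74, FP = 3, FN = 4), `precision_recall_f1(74, 3, 4)` gives a precision of 74/77 ≈ 0.961 and a recall of 74/78 ≈ 0.949, in line with the reported figures up to the last digit.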
To evaluate the quality of the ATM segmentation in building footprint detection (Fig. 5), a method was devised to select all segmentation CCs that coincide with an object in the reference set and, for each reference object, to retain only the overlapping segmentation CC with the highest intersection-over-union (IoU) measure. For any given IoU threshold, the method counts the segmentation CCs that satisfy it as true positives and considers the background as the only true negative. Every segmentation CC that fails the IoU threshold is counted as a false negative. False positives cannot be counted because all CCs not overlapping with reference objects were previously rejected.
Fig. 6 shows an example using a set of 125 building footprints manually delineated as reference data in (a). The results of the validation method are shown in (b). White corresponds to true positives, red to reference data CCs not overlapping with the selected segmentation CC, and green to all selected CCs that fail the IoU criterion either due to limited extent within the reference CC or due to excessive leakage. The result is produced for an IoU threshold of 50%. Results for different values of the IoU threshold are reported in Table 2. Note that precision is 100% in all cases since the false positives are 0. All experimental material is available at https://github.com/georgiosouzounis/publication_resources.
Fig. 6. ATM validation. The reference building footprints for a subset of the input image in (a); the validation results color coded in (b).
Table 2. Building detection using the ATM strategy; validation results (building footprint detection evaluation results).
IoU threshold    Recall    F1 score
0.1              99.2%     99.6%
0.2              96.8%     98.2%
0.3              92.0%     95.8%
0.4              87.2%     93.1%
0.5              78.4%     87.9%
0.6              72.0%     83.7%
0.7              64.0%     78.0%
0.8              36.0%     52.9%
0.9              4.8%      9.1%
5. Conclusions
This article presented two new segmentation strategies in an effort to extend the capabilities of the alpha-tree algorithm. The new strategies enable the clustering of background components into large homogeneous regions in cases of complex targets, and allow for automatic image segmentation based on a search for extreme values of connected component attributes. Both give access to image regions that were previously inaccessible. The NTC strategy is sensitive to homogeneous image regions and will provide meaningless results if none is present. Each textured region standing out from its immediate surroundings will be considered as a flat zone of the root node α-CC. It is best suited for anomaly detection in image analysis. The ATM strategy is sensitive to structures best described by the maximization/minimization attribute. All others, like the long building in Fig. 6(b) in the case of compactness maximization, will be poorly represented. It runs free of attribute thresholds. In future work the aim is to present a new strategy for instance segmentation with clustering criteria that selectively ignore dissimilarity barriers among the constituent α-connected components of the desired segments (clustering-based second-generation α-connectivity). This hybrid strategy is work in progress.
Declaration of Competing Interest
The author certifies that they have no affiliations with or involvement in any organization or entity with any financial interest (such as honoraria; educational grants; participation in speakers' bureaus; membership, employment, consultancies, stock ownership, or other equity interest; and expert testimony or patent-licensing arrangements), or non-financial interest (such as personal or professional relationships, affiliations, knowledge or beliefs) in the subject matter or materials discussed in this manuscript.
References

[1] N. Adelborgh, G.K. Ouzounis, K. Stamatiou, Unsupervised object detection on remote sensing imagery using hierarchical image representations and deep learning, in: Proceedings of the 2017 conference on Big Data from Space, Toulouse France, 2017, doi:10.2760/383579. 255–228
[2] J. Angulo, Morphological colour operators in totally ordered lattices based on distances: Application to image filtering, enhancement and analysis, Computer Vision and Image Understanding 107 (1) (2007) 56–73, doi:10.1016/j.cviu.2006.11.008. Special issue on color image processing
[3] V. Barnett, The ordering of multivariate data, J. R. Stat. Soc. Series A (General) 139 (3) (1976) 318–355.
[4] U. Braga-Neto, J. Goutsias, Connectivity on complete lattices: new results, Comput. Vision Image Underst. 85 (1) (2002) 22–53, doi:10.1006/cviu.2002.0961.
[5] E.J. Breen, R. Jones, Attribute openings, thinnings, and granulometries, Comput. Vis. Image Underst. 64 (3) (1996) 377–389, doi:10.1006/cviu.1996.0066.
[6] E. Carlinet, T. Géraud, A comparative review of component tree computation algorithms, IEEE Trans. Image Process. 23 (9) (2014) 3885–3895, doi:10.1109/TIP.2014.2336551.
[7] D. Ehrlich, T. Kemper, X. Blaes, P. Soille, Extracting building stock information from optical satellite imagery for mapping earthquake exposure and its vulnerability, Natural Hazards 68 (1) (2013) 79–95, doi:10.1007/s11069-012-0482-0.
[8] R. Jones, Connected filtering and segmentation using component trees, Comput. Vision Image Underst. 75 (3) (1999) 215–228, doi:10.1006/cviu.1999.0777.
[9] F. Meyer, P. Maragos, Nonlinear scale-space representation with morphological levelings, J. Visual Commun. Image Represent. 11 (3) (2000) 245–265, doi:10.1006/jvci.1999.0447.
[10] B. Naegel, N. Passat, Component-trees and multi-value images: a comparative study, in: M.H.F. Wilkinson, J.B.T.M. Roerdink (Eds.), Mathematical Morphology and Its Application to Signal and Image Processing, Springer Berlin Heidelberg, Berlin, Heidelberg, 2009, pp. 261–271.
[11] L. Najman, On the equivalence between hierarchical segmentations and ultrametric watersheds, J. Math. Imag. Vision 40 (3) (2011) 231–247, doi:10.1007/s10851-011-0259-1.
[12] G. Ouzounis, Automatic extraction of built-up footprints from high resolution overhead imagery through manipulation of alpha-tree data structures, US Patent 8682079 (2014).
[13] G. Ouzounis, N. Adelborgh, K. Stamatiou, Shape-based segmentation using hierarchical image representations for automatic training data generation and search space specification for machine learning algorithms, US Patent 10372984 (2019).
[14] G.K. Ouzounis, P. Soille, Pattern spectra from partition pyramids and hierarchies, in: P. Soille, M. Pesaresi, G.K. Ouzounis (Eds.), Mathematical Morphology and its Applications to Image and Signal Processing, (ISMM) 2011; Proc. 10th Int. Symposium, Lecture Notes in Computer Science, 6671, Springer Berlin/Heidelberg, 2011, pp. 108–119.
[15] G.K. Ouzounis, P. Soille, The Alpha-Tree algorithm, JRC Technical Reports, European Commission, Joint Research Centre, Institute for the Protection and Security of the Citizen, 2012, doi:10.2788/48773.
[16] G.K. Ouzounis, M.H.F. Wilkinson, Mask-based second-generation connectivity and attribute filters, IEEE Trans. Pattern Anal. Mach. Intell. 29 (2007) 990–1004.
[17] P. Salembier, A. Oliveras, L. Garrido, Antiextensive connected operators for image and sequence processing, IEEE Trans. Image Process. 7 (1998) 555–570.
[18] P. Salembier, J. Serra, Flat zones filtering, connected operators, and filters by reconstruction, IEEE Trans. Image Process. 4 (1995) 1153–1160.
[19] J. Serra, Connectivity on complete lattices, J. Math. Imag. Vision 9 (3) (1998) 231–251, doi:10.1023/A:1008324520475.
[20] P. Soille, On genuine connectivity relations based on logical predicates, in: Proc. of 14th Int. Conf. Image Analysis Processing, Modena, Italy, 2007, pp. 487–492, doi:10.1109/ICIAP.2007.4362825.
[21] P. Soille, Constrained connectivity for hierarchical image decomposition and simplification, IEEE Trans. Pattern Anal. Mach. Intell. 30 (7) (2008) 1132–1145, doi:10.1109/TPAMI.2007.70817.
[22] P. Soille, Constrained connectivity for the processing of very high resolution satellite images, Int. J. Remote Sens. 31 (22) (2010) 5879–5893, doi:10.1080/01431161.2010.512622.
[23] P. Soille, Preventing chaining through transitions while favouring it within homogeneous regions, in: P. Soille, M. Pesaresi, G. Ouzounis (Eds.), Proc. of ISMM 2011, Lecture Notes in Computer Science, 6671, Springer-Verlag, 2011, pp. 96–107, doi:10.1007/978-3-642-21569-8_9.
[24] P. Soille, J. Grazzini, Advances in constrained connectivity, in: DGCI’08: Proc. 14th IAPR international conference on Discrete geometry for computer imagery, Springer-Verlag, Berlin, Heidelberg, 2008, pp. 423–433.
[25] P. Soille, J. Grazzini, Constrained connectivity and transition regions, in: M.H.F. Wilkinson, J.B.T.M. Roerdink (Eds.), Mathematical Morphology and Its Application to Signal and Image Processing, Proceedings of, Lecture Notes in Computer Science, 5720, Springer, 2009, pp. 59–69, doi:10.1007/978-3-642-03613-2_6.
[26] P. Soille, L. Najman, On morphological hierarchical representations for image processing and spatial data clustering, in: U. Köthe, A. Montanvert, P. Soille (Eds.), Applications of Discrete Geometry and Mathematical Morphology, Springer Berlin Heidelberg, Berlin, Heidelberg, 2012, pp. 43–67.
[27] E.R. Urbach, J.B.T.M. Roerdink, M.H.F. Wilkinson, Connected shape-size pattern spectra for rotation and scale-invariant classification of gray-scale images, IEEE Trans. Pattern Anal. Mach. Intell. 29 (2) (2007) 272–285, doi:10.1109/TPAMI.2007.28.