COMPUTER GRAPItICS AND IMAGE PROCESSING
11, 88-99 (1979)
NOTE Constructing Trees for Region Description* DAVID L. 5IILGRAM~
Computer Scietwe Cet~ter, Ul~iversity of 3laryla~d, College Park, Maryla~d 207~2 Received December 14, 1978; accepted February 12, 1979 A thresholded (binary) image can be represented as a region tree in which each node corresponds to a component of l's (object) or O's (background). If regions O and B share a border, then one encloses the other. This enclosure relation defines the tree. The chain code of the component and a description based on statistical features are stored at each node. The region tree is built in a single pass over the image. 1. INTRODUCTION The algorithms to be described here are basically well known. However, the versions given here are especially simple and well-structured. I n addition, some of the algorithms, particularly those t h a t construct the c o n t a i n m e n t tree and the border chain codes, are not easily found in the literature. An effort has been made to design the basic algorithmic steps in a m a n n e r applicable to hardware implementation. We assume t h a t we are given an image and a threshold. The border of the image consists of below-threshold points. For each connected component, we wish to extract and store three essentially different kinds of information. The first consists of descriptive features over the region such as mean and s t a n d a r d deviation of gray level, area, perimeter, moments, etc. The second t y p e of inform a t i o n is a chain encoding of the outer b o u n d a r y of the region. Note t h a t while a region m a y have m a n y holes (each corresponding to an inner b o u n d a r y ) it has exactly one outer boundary. The third t y p e of information is stored inherently b y the data structure and corresponds to the c o n t a i n m e n t relations a m o n g regions. I t is convenient to store the complete set of connected components, including components of background, rather t h a n simply the above-threshold regions. * The support of the U. S. Army Night Vision Laboratory under Contract DAAG53-76C-0138 (ARPA Order 3206) is gratefully acknowledged. t Present address : Lockheed Corp., Palo Alto, Calif. 88 0146-664X/79/090088-12502.00/0 Copyright @ 1979 by Academic Press, Inc. All rights of reproduction in any form reserved.
REGION TREE CONSTRUCTION
89
As an example of the desired structure, consider the following region m a p :
The data structure describing the regions of this map is:
Each node eontains the feature data and the b o u n d a r y encoding of the n a m e d region. Note t h a t if the root is considered to be at level 0 then nodes at odd levels correspond to above-threshold regions while nodes at even levels ( > O) correspond to holes. In addition to producing a data structure the algorithm can produce a labeled image in which each connected c o m p o n e n t is assigned a unique label. The ability to know precisely which pixels belong to the same component is very useful for display as well as analysis. Labeled regions thought to be of particular interest can be highlighted or outlined, while regions of no interest (e.g., clutter, noise, accidents) can be suppressed in the display. However, while the data structure can be produced in a single pass over the original image, it requires two passes to produce the labeled region map : t h e first, to partially label the image and create a label equivalence table; the second, to read both the partially labeled region map and the label equivalence table and to produce the final region map. The connectivity decisions which make up the core of the algorithm use a 2 X 2 processing window. However, a 3 X 3 window is used since it also allows statistics based on border points to be computed. The processing window moves across the image in raster fasbion so t h a t every point of the image is in the center of the 3 X 3 window exactly once. I t is convenient to embed the image in a row of background levels which are not processed directly. This allows us to define the algorithm without considering b o u n d a r y conditions. We represent the pixels in a 3 X 3 window as
abc dxe fgh. Thus three rows of the image are maintained at all times (although, as we mentioned, only two rows are strictly necessary). In addition, various auxiliary data structures are needed.
90
DAVID L. MILGRAM
T w o algorithms will be presented. The first describes region t r a c k i n g ; the second constructs b o u n d a r y chMn codes. I t is s t r a i g h t f o r w a r d to i n t e g r a t e the two algorithms into a single pass. 2. REGION TRACKING This algorithm constructs for each connected c o m p o n e n t a descriptive feature vector. Although we do not specify the features, it is assumed t h a t t h e y are all extractable f r o m a raster scan using a 3 X 3 processing window. Additional storage is available to hold the feature values for the components. W h e n a new region is encountered during a raster scan, it is assigned a vector of registers to store its feature values. As the region is being tracked on the s a m e row or continued oil the next row, values continue to be ,meumulated into its feature vector. I n order to specify the correspondence between a region and its register vector, a label is created and assigned to each point of the region which has already been visited. The label will identify the a p p r o p r i a t e register vector, usually b y some indexing scheme. Region points found to be adjacent to already labeled region points inherit t h a t label value and contribute statistically to its register vector. Often a region encountered for what is t h o u g h t to be the first time m a y at a later row prove to be connected to a previously encountered region. Such regions are called subeomponents. I n a s m u c h as feature values were being m M n t a i n e d separately for each subeomponent, it becomes necessary to combine the feature values i m m e d i a t e l y or at least to create a flag t h a t signifies t h a t the values for the two s u b e o m p o n e n t s will eventuMly be combined. I f the latter course is chosen, a table is constructed indicating which labeled regions overlap. The label equivalence table can be stored either as a bit m a t r i x or as a list. I n this algorithm we t a k e the i m m e d i a t e approach and combine feature vectors for equivMent regions as soon as the equivMenee is discovered. The fact of r e c o m b i n a t i o n imposes eonstrMnts on the types of features which can be e v a l u a t e d during the single pass. Thus while the area of a c o m p o n e n t is the sum of areas of its subcomponents, it is not possible to c o m p u t e the m a x i m u m row extent of a component from the run widths of its subeomponents. This feature, of course, could be c o m p u t e d on a second pass (i.e., over the labeled region m a p ) . Since region labels p r o p a g a t e from point to point, we m u s t m a i n t M n the labels of the current line and the previous line as well as their g r a y levels. Actually, only the labels of those points in the preceding row t h a t are neighbors of unexamined points in the current row, and the labels of those examined points in the current row, need be stored. Thus the a m o u n t of storage necessary for labeled points can be reduced to a single row. J u s t before a new row is read in to be processed, the current row becomes the previous row. R a t h e r t h a n copy the d a t a f r o m one d a t a area to another, one ordinarily cycles an indirect pointer array. However, in hardware, copying the data m a y be more efficient t h a n cycling a pointer since the copying can be done in parallel.
REGION TREE CONSTRUCTION
91
The label assigned to a component should designate whetber the component is above or below threshold. If the background is not partitioned into regions (i.e., is ignored) b y the Mgorithm then the data structure becomes simply a list of above-threshold regions. This is suitable for m a n y applications, e.g., infrared target cuing. For the sake of generality, we describe below how to determine the contMnment relation which defines the tree structure. I t is evident t h a t if two components of a binary image are adjacent t h e n one encloses the other. However, if more t h a n one object-background transition has been detected, one cannot know which encloses which from strictly local information at the time of initial label assignment. The determining condition is "which region terminates first?" The region ternfinating first is enclosed b y the adjaeent region. Thus whenever a region terminates, the data structure is u p d a t e d to reflect the containment relation. When a region is initiated it is entered onto an "active" l i s t - - t h e list of unterminated regions. At the end of each row, the active list is compared with the list of component labels of the current label row. Any active component whose label does not appear in the current label row must have terminated. Additionally, when overlapping regions are combined, the discarded label must be deleted from the active list. For simplicity, the algorithm will use 0 to indicate a point below threshold and 1 to indicate a point above. For example, e~b = oo~indicates t h a t only d and x are above threshold. We are also assuming t h a t regions of above threshold points are 4-connected (i.e., based on adjacency of 4-neighbors). To maintain consistency when tracking the background, we require t h a t background regions be defined b y 8-connectivity. Other implementations of this algorithm operate b y maintaining lists of endpoints of above threshold runs. This can increase the efficiency of the algorithm when computing descriptive features, since for features not dependent on the gray level, one can compute the total contribution to the feature accumulator based on the run rather t h a n at each individual point of the run. This can result in significant savings. The above algorithm can be modified to take advantage of this saving.
Algorithm for Producing the Labeled Region Tree "INITIALIZE" R E G I O N _ C O U N T +-- 1 R E G I O N _ N O T _ N E W +--- F A L S E N E W _ L A B E L _ N E E D E D +- F A L S E P L A C E B A C K G R O U N D L A B E L ON A C T I V E L I S T DO F O R X = 1 , . . . , W I D T H " L a b e l the outer b o u n d a r y as b a c k g r o u n d " C U R R E N T _ L A B E L S ( X ) +-- ( R E G I O N L C O U N T , B A C K G R O U N D }
OD "PROCESS EACH NEW IMAGE RECORD" DO F O R E A C H I M A G E R E C O R D DO F O R X = 1 , . . . , W I D T H "Cycle the label buffers" P R E V I O U S _ L A B E L S (X) ~-- C U R R E N T _ L A B E L S (X)
OD
92
DAVID L. MILGRAM READ AN IMAGE RECORD AND CYCLE THE IMAGE ROW BUFFERS DO FOR X = 1 , . . . , W I D T H "Threshold and label each pixel" CALL D E T E R M I N E _ L A B E L (X,LABEL, NEW_LABEL-NEEDED,REGION_NOT_NEW) C U R R E N T LABELS (X) ~-- LABEL IF NEW_LABEL_NEEDED T H E N " W e seem to be entering a new region" PLACE LABEL ON ACTIVE LIST N E W _ L A B E L _ N E E D E D (-- FALSE FI I F REGION_NOT_NEW T H E N "Merge the two equivalent region descriptions" CALL EQUIV (PREVIOUS_LABELS (X), CURRENT_LABELS (X -- 1)) REGION_NOT_NEW ¢-- FALSE FI CALL DOSTATS(X) "Accumulate the feature data for X " OD DO FOR EACH LABEL ON T H E ACTIVE LIST BUT NOT IN CURRENT_LABELS "Delete any completed region from the active list" F I N D LEAST X SO THAT PREVIOUS_LABELS (X) = LABEL PARENT_LABEL ~- CURRENT_LABELS (X) D E L E T E LABEL FROM ACTIVE LIST. PLACE ON COMPLETED LIST. A P P E N D LABEL TO THE C O N T A I N M E N T LIST OF PARENT_LABEL OD
OD "At this point the completed region list has the required tree structure" " No w compute the desired features from the accumulated feature data at each node" RETURN END
PROCEDURE D E T E R M I N E _ L A B E L (POSN,LABEL, NEW, EQ) "Assign a label to the lower right position of a 2 X 2 neighborhood" COMPUTE C = VECTOR OF T H R E S H O L D DECISIONS FOR A 2 X 2 NEIGHBORHOOD AT POSN. C A S E C OF 10, 1)0
012,
10 10,
"Label the point, with the upper-left label" LABEL ~-- PREVIOUS_LABELS (POSN-1)
11 01,
~11
REGION TREE CONSTRUCTION DO
" L a b e l the point with the upper-right label" L A B E L (-- P R E V I O U S _ L A B E L S (POSN)
11 00, DO
93
:: 11, " L a b e l the point with the lower-left label" L A B E L ~-- C U R R E N T _ L A B E L S ( P O S N - 1 )
01
11 DO
" E q u i v a l e n c e the lower-left and upper-right labels" L A B E L *-- P R E V I O U S _ L A B E L S (POSN) E Q ~-- T R U E
DO
" C r e a t e a new b a c k g r o u n d label" REGIO~COUNT *-- R E G I O ~ C O U N T q- 1 L A B E L *-- ( R E G I O N _ C O U N T , B A C K G R O U N D } NEW *- TRUE
DO
" C r e a t e a new foreground label" R E G I O N _ C O U N T ¢-- R E G I O N - C O U N T -~- 1 L A B E L ¢-- ( R E G I O N _ C O U N T , F O R E G R O U N D } N E W ~-- T R U E
DO
" T h e r e is b a c k g r o u n d equivalence and a new foreground label" R E G I O N _ C O U N T *-- R E G I O N C O U N T q- 1 LABEL *- (REGIONTCOUNT,FOREGROUND} N E W *-- T R U E EQ ~- T R U E
ESAC RE T URN END PROCEDURE EQUIV(LABEL1, LABEL2) " T w o region labels identify the same region"
94
D A V I D L. M I L G R A M
IF
LABEL1 ¢ LABEL2 THEN "Merge the two region descriptions" COMBINE THE ACCUMULATED STATISTICS OF LABEL2 W I T H LABEL1 A P P E N D THE C O N T A I N M E N T LIST OF LABEL2 to LABEL1 D E L E T E LABEL2 FROM T H E ACTIVE LIST DO FOR X = 1 , . . . , W I D T H I F PREVIOUS_LABELS(X) = LABEL2 THEN PREVIOUS_LABELS(X) ~ LABEL1 I F CURRENT_LABELS(X) = LABEL2 THEN CURRENT_LABELS (X) ~-- LABEL1 DO
FI FI
F[ RE T URN END 3. C O N S T R U C T I N G B O U N D A R Y C H A I N S
This section describes that portion of the overall algorithm for region description which creates and stores coded representations of boundaries. As was said before, each region has exactly one outer boundary. The encoding of the boundary uses a more primitive code than the Freeman code. However, it is trivial to convert the chain code given here. The boundary chain of a region is defined here to follow the cracks between two regions using four directions:
I 2 + 0
.
We illustrate the boundary encoding for a simple region:
t-t Note that this region does not have a hole and that if one tracks the boundary by keeping the right hand on the object and taking right turns when possible then the encoding consists of a single trace. On the other hand, consider:
t2- -
REGION TREE CONSTRUCTION
95
Following the previous tracing rule, one discovers two disjoint b o u n d a r y encodings. The encoding we have described defines 4-connected foreground regions. 8-connected region boundaries are defined b y switching the roles of O's and l's. The reason for choosing a crack-following scheme r a t h e r t h a n a typical eightdirection code is to ensure t h a t no two region boundaries intersect. T h e encodings themselves are s o m e w h a t less economical, and p r e s u m a b l y the algorithm could be modified suitably to a c c o m m o d a t e a more efficient encoding. As before, the input image is processed in a raster scan based on the same 2 X 2 neighborhood window (within the 3 X 3 processing window). Since m a n y objects m a y intersect the row currently being processed, a list of chain segments, called the active chain list, is maintained. Chain links arising out of a 2 X 2 neighborhood are called "ehainlets." Chainlets k n o w n to extend existing (active) chains are concatenated to those chains. Thus the active list is a list of strings. C o n c a t e n a t i o n is indicated b y an ampersand. W h e n a chain is closed, i.e., forms a cycle, t h e n we k n o w t h a t the object surrounded b y the chain is terminated. Alternatively, if a chain has not been extended in the current row t h e n it m u s t h a v e t e r m i n a t e d in the previous row. This test for t e r m i n a t i o n can be synchronized with the terminatiort test for regions so t h a t the tree node corresponding to a region and its b o u n d a r y can be closed at t h a t time. W h e n a ehainlet is discovered to be b o t h the head of one active chain and the tail of another, we concatenate the two active chains into a single c h a i n - - o r , if the two chains are the same already, t e r m i n a t e the chain. T h e overall structure of this chain algorithm thus is quite similar to the region tracking algorithm a n d b o t h can be incorporated within the same overall framework. Consider the 2 X 2 neighborhood ~ ~ for the point x. Let the image coordinates of x be (i, j). T h e ehainlet etx ~ b is denoted (i q- 1, j ~ i, j)', similarly, ,~ ~-bgives" rise to (i, j --* i, j -t- 1). An open chain can be represented b y a head and a tail, each specified b y a pair of image coordinates and a sequence of moves from head to tail based on the previous four-direction code. I n a closed chain, the h e a d and tail are the same coordinate pair. T h e point chosen to "anchor" a closed ehairt can be a n y point t h r o u g h which the chain passes. A good choice for anchor point m i g h t be the r i g h t m o s t point on the b o t t o m m o s t row, since this is the last point seen b y the algorithm. W h e n a chain is terminated, we associate the chain encoding with the tree node corresponding to the labeled region enclosed b y the chain. One cannot determine at the time of t e r m i n a t i o n which active chain will u l t i m a t e l y contain the t e r m i n a t e d chain. I t is certainly one of the active chains (if we include the outer b o u n d a r y of the image as a vacuous chain), b u t as the figure below indicates there is not enough information to determine which.
or
". - .
"-"22".' /
/'
96
DAVID L. MILGRAM
It is therefore desirable that boundary encoding be embedded within the region tracking algorithm so that '~ complete tree description can be produced. The algorithm as presented is linear in the number of points. However, for the configurations {o °°, ~} H nothing need be done. Thus the algorithm is actually linear in the number of boundary points, which may be a significantly smaller set of points. The operations FIND_HEAD_AT and FIND_TAIL_AT can be coded as binary searches on the column index, or even more efficiently within a hash table. The four-direction encoding is easily converted to a Freeman chain encoding. We describe the translation for the boundary chain of a object. Boundaries of holes are treated similarly. Each segment (codon) of the four-direction encoding is replaced by at most one Freeman codon by considering that codon and its forward neighbor. The table below shows the translation matrix. The symbol 0 indicates the null code; while A indicates that this combination cannot occur in the four-direction encoding. next codon
4
¢
current codon +
-~
~
A
A
~/
+-
A
Algorithm to Construct Boundary Chains
"INITIALIZE" SET T H E ACTIVE CHAIN LIST TO E M P T Y "PROCESS EACH IMAGE R E C O R D " DO FOR EACH IMAGE RECORD, I = 1 , . . . , L E N G T H READ AN IMAGE RECORD AND CYCLE T H E I N P U T BUFFERS DO FOR J = 1 , . . . , W I D T H CALL PROCESS_CHAINLET (1, J) OD OD STOP END PROCEDURE PROCESS_CHAINLET(I, J) LET C BE THE 2 X 2 BINARY RESULT OF T H R E S H O L D I N G da xb LET COLOR BE THE RESULT OF T H R E S H O L D I N G X LET COLORA BE THE RESULT OF T H R E S H O L D I N G A C A S E C OF 0000, 11 11
REGION TREE CONSTRUCTION
00,
11
01,
10
10•, ?~,
DO
"Nothing"
DO
CALL CREATE(I,J, COLOR)
DO
CALL C R E A T E ( I , J , COLOR) CALL M E R G E (I,J,COLORA)
DO
CALL M E R G E ( I , J , C O L O R A )
97
01 11
10
oo
10,
11,
010, 11 00,
DO
CALL A F F I X ( I , J , COLOR)
DO
CALL A F F I X ( I , J , C O L O R ) CALL APPEND (I,J, COLOR)
01 11 01, 01 DO
CALL APPEND(I,J,COLOR)
ESAC RETURN END
PROCEDURE MERGE ([,J, COLOR) CALL FIND_TAIL_AT (I,J, CHAIN1) CALL FIND_HEAD_AT (/,J, CHAIN2) IF K = L T H E N CALL COMPLETE_CHAIN (CHAIN1,COLOR) D E L E T E CHAIN1 FROM ACTIVE LIST TAIL (CHAIN1) (--- TAIL (CHAIN2) ELSE CHAIN (CHAIN1) ¢- CHAIN (CHAIN1) & CHAIN (CHAIN2) D E L E T E CHAIN2 FROM ACTIVE CHAIN LIST FI RETURN
98
DAVID L. MILGRAM
P R O C E D U R E A F F I X (I,J, COLOR) CALL FIND_HEAD_AT(I,J, CHAIN1) I F COLOR = B A C K G R O U N D T H E N H E A D ( C H A I N 1 ) ~-- (I + 1,J) C H A I N ( C H A I N 1 ) ~--"~" & C H A I N ( C H A I N 1 ) ELSE H E A D (CHAIN1) ~- (I,J + 1) C H A I N ( C H A I N 1 ) ~-- "--~" & C H A I N ( C H A I N 1 ) FI R E T URN P R O C E D U R E A P P E N D (I,J, COLOR) CALL F I N D _ T A I L _ A T (I,J, C H A I N 1 ) IF COLOR = B A C K G R O U N D T H E N T A I L ( C H A I N ) ~-- (I, J + 1) C H A I N ( C H A I N 1 ) ~-- C H A I N ( C H A I N 1 ) & " . - " ELSE T A I L ( C H A I N 1 ) .-- (I + 1, J ) C H A I N (CHAIN1) ~-- C H A I N (CHAIN1) & "t,, FI R E T URN P R O C E D U R E F I N D _ T A I L _ A T (I,J, CH) " N o t e : This routine is guaranteed to succeed" F I N D C H SO T H A T T A I L ( C H ) -- (I, J ) RETURN P R O C E D U R E F I N D _ H E A D _ A T ( I , J, CH) " N o t e : This routine is guaranteed to succeed" F I N D C H SO T H A T H E A D ( C H ) = (I, J) RETURN P R O C E D U R E C O M P L E T E _ C H A I N ( C H , COLOR) " N o t e : Head (CH) is the lower right corner of the completed chain: Color is the color of the top right point of the 2 X 2 neighborhood" IF COLOR = BACKGROUND T H E N "CH is the b o u n d a r y of a hole" ELSE "CH bounds a foreground region" FI A S S I G N C H A I N D E S C R I P T I O N TO R E G I O N D E S C R I P T I O N F O R T H E REGION RETURN END 4. CONCLUSION The algorithms presented here allow the user to construct a tree description of the regions contained in a thresholded image. The arc relation defining the tree is one of containment while the information stored at each node consists of a statistical description of the region and a chain encoding of its boundary.
REGION TREE CONSTRUCTION
99
Algorithms for constructing region descriptions in the course of a r o w - b y - r o w series are relatively well k n o w n (e.g., see ~1~ and its bibliography), b u t it is often difficult to find clear s t a t e m e n t s of these algorithms in the literature. I t is hoped t h a t this note will help to remedy this deficiency. REFERENCE 1. A. K. Agrawala and A. V. Kulkarni, A sequential approach to the extraction of shape features, Computer Graphics and Image Processing, 1977, 538-557.