Average efficiency of data structures for binary image processing

Average efficiency of data structures for binary image processing

Information Processing Letters 26 (198?/88) 89-93 North-Holland 19 October 1987 AVERAGE EFFICIENCY OF DATA STR~JCTURES FOR BINARY IMAGE PROCESSING ...

430KB Sizes 0 Downloads 68 Views

Information Processing Letters 26 (198?/88) 89-93 North-Holland

19 October 1987

AVERAGE EFFICIENCY OF DATA STR~JCTURES

FOR BINARY IMAGE PROCESSING

C. MATHIEU, Claude PUECH and Hossein YAHIA DPpartement de MathPmatiques et d’lnformatique, 75230 Paris C&de-x05, France

Laboratoire d’lnformatique de 1’Ecole Normale Supt%ure,

45 Rue d’Ulm,

Ccmmunicated by W.L. Van der Poe1 Received 12 February 198’

Keywords: Image processing, quadtree, chain code, run-length encoding, average memory occupancy

1. How to store a binary image Several structures exist for representing binary images: let us suppose that the image is drawn on a 2h X 2h array made of black-or-white pixels. Each connected region can be represented by a starting point and a chain code that describes its boundary in terms of a sequence of directions chosen from E (East), N (North), W (West) or S (South), each letter of the sequence corresponding to an elementary step in the associated direction. Thus, the boundary of image I (see Fig. 1) is encoded by E4S4E2NESWS2W6N6. The image can also be stored as a linked list of the runs of pixels of the same colour met when reading the array row by row, each run being given by its global length. But, the data structure we are most concerned with here-is the quadtree representation of an image, which is based on a recursive subdivision of the space. Starting from a 2h x 2h grid, we subdivide it into four 2h-’ X 2h-1 grids (‘quadrants’), each of which is in turn subdivided into four subquadrants, and so on until a quadrant of uniform colour is reached. Thus, we obtain a tree (of height d h) where each interior node has outdegree four (see Fig. 2 for an example). In recent years, it has aroused much interest (Samet, in particular, has studied it at length- see [ 1,4]), and yields a model of random images that allows to compute its average performances.

Fig. 1. c1020-0190/87/$3.50 0 1987, Elsevier Science Publishers B.V. (North-Holland)

89

INFORMATION

Volume 26, Number 2

PROCESSING LETTERS

19 October 1987

Fig. 2.

2. Average amount of storage required by each structure

Let Qh be the set of all quadtrees of height < h. We define a probability distribution on Qh in the following way: consider a sequence (pi) of reals between zero and i ( &, = i). A random tree of Q,l is built by using a branching process, such that the quantity 2pi is the probability for a node at level h - i of being external (and so, at level h, all nodes are external with probability one, as they should). Once a node is known to be external, it is assumed to be white or black with equal probability. Thus, the probability of tree t of Qh is given by the following formula; ift=o,

(Ph ah(t) =

if t = 0,

ph

In the case when, for arbitrary i, pi is equal to a constant p, the following can be interpreted for the associated images: when p is near zero, the random images thus obtained look like clouds of points and, when p increases, the images are more and more structured. Choosing pi equal to 1/2(i + 1) amounts to having a random model of images such that, in the quadtree representation, the external node containing a fixed pixel has got equal probability of being of any size. This is equivalent to Samet’s definition of a random image: a node is equally likely to appear in any position and level in the tree. We now turn our attention to the amount of storage needed for each of the data structures. 2.1. Proposition. The average number of runs in Run-Length-Encoding of two-dimensional binary images is equal to: Case pi = p: [4(a _ 2p)]

-+ 2h-‘(1

- W/(1

- 4p).

Case pi = 1/2(i + 1): h2h-1/(h + 1) + 4h/(h + I). General case:

2.hhi12i(1 - 2Ph) (1 ’

i=o 90



2Ph_,+ 1)(3ph-i - &) t 4h(l - 2Ph) ’ * ’ U - 2PI J’

Volume 26, Number 2

2.2. Propsition.

INFORMATION

PROCESSING LETTERS

The average number of nodes in the quadtree representation

19 October 1987

of binary images is equal to:

Case pi = p:

l/(SP - 3) + [4(1-

2p)lh4F;

-

-:’

.

Case pi = 1/2(i + 1): f(4h+‘-

4

1) - -$[(3h h+l

- 1)4h + 11.

General case:

1 +

i

4j(l

- 2ph)(l

-

2ph_1)

l

l

l

(l -

2Ph-j+l).

j=l

See [2,5] for fttrther results.

As for the chain code representation, since its length is proportional to the number of concerns of the boundary (number of changes of direction), its mean is easily seen to be equal to the average number of external nodes of the corresponding quadtrees, that is, 1 + (i)“- *, where N is the total number of nodes of the quadtree. The accuracy of the model of randomness may be tested through statistics on images, by computing, for example, the mean perimeter of the images and comparing it with its theoretical value. The perimeter algorithm itself may be found in [4]. 2.3. Proposition. The average perimeter [per] of a binary image is e@ai io: Case pi = 0:

[Per] = 2h+1. Case Pi=+:

[per]=ih+2-i(f)h-$h

(h-,

+oo).

[per] = (6p - 1)/(4p - 1) + E -

[2(1-

Case pi=p+O,

$:

2p)lh.

Case pi = 1/2(i + 1):

[per] = 2 +

2h”-(h+2) h+l

- 2”+‘/(h

+ 1)

(h + + 00).

Proof. Let L(t) be the set of leaves of the tree t encoding the image. For each direction D taken in E, N, W, S, let Ly (t) be the set of all black leaves of t whose neighbour in direction D is also black and bigger than themselves, and L!(t) be the set of all black leaves of t whose neighbour in direction D is also black and of equal size (i.e., at the same level as in t). 4/2’ is the perimeter of a leaf at level i. The perimeter of the 91

INFORMATION

Voiume 26, Number 2

19 October 1987

PROCESSING LE-ITERS

whole image is then given by the following formula: per(t) =

x

per(n) - 2

nf%t)

C

-2

C

iper

2

C

+per(n) - 2

aper

-

C

x

iper

-

x

fper(n)

nE Ly(t)

nE L!(t)

nELy(t)

C

fper(n)

nE L:(t)

C

-

nEL?!(t) -

+per(n) -

nELLE_(t)

fper(n).

nE LS,(t)

nELS=(t)

Taking expectations, and allowing for the fact that all directions are equivalent, we have [per] =

c

-4

per(n)

c

TIhl(t)

tEQlh]

- 8 c

nEL(t)

c

Tlhl(t) (

tEQjh’

~[~lo(t

‘EQ[ttl

$per(n)

c nE LE,(t)

c nE Lb(t)

and splitting according to the levels of the leaves yields h 4 [per] = C ,i(fi - 2f F,i- fE,i), i-0

where fi (respectively f ~,i, f =E ,i) is the average cardinality of L(t) (respectively LF (t), Lk (t)) at level i. If p, (i) (respectively p=(i)) denotes the probability that the eastern neighbour of a leaf at level i is bigger than (respectively is of the same size as) itself, w’: clearly have f E = ap, (i)fi, and f E = &J= (i)fi. An easy calculation shows that fi = 2 X 4’ph_i(l - 2p h-i+*). . (1 - 2Ph). Computing p=(i) can be done by considering level i - j of the youngest ancestor common to a leaf and to its eastern neighbour; searching for a leaf neighbour is done by going up the branch of the leaf and asking at each level whether the node is an eastern son of its father (in which case we have to go further up the branch) or whether it is a western son (in which case we have found an ancestor common to the leaf and its neighbour, and can now proceed downwards in a symmetric way). Thus, at each level examined, the probability that we have found the ancestor is f. The neighbour we search is at level i if and only i’call the intermediate nodes in the downwards path are internal, until at level i an external node is reached. This justifies the following result: l

i-l p=(i)

= ,_o C

&(l

- 2Ph-j-l)

l

ma t1 - 2Ph-i+1)2Ph-i*

(1)

.

In a similar fashion we obtain i-l

p,(i)=

c j=O

&

‘2

(1-2Ph-j-l)..*(1-2Ph-k+l)2Ph-k.

(2)

k=j+l

Putting all this together finally yields the desired result on the mean perimeter.

References [II CR. Dyer, A. Rosenfeld and H. Samet, Region representation: Boundary codes from quadtrees, Comm. ACM 23 (3) (1980) 171-179. 92

0

[2] C. Puech and H. Yahia, Quadtrees, octrees, hyperoctrees: A unified analytical approach to tree data structures used in graphics, geometric modeling and image processing, ACM Symp. on Computational Geometry, Baltimore, MD (1985) 272-280.

Volume 26, Number 2

INFORMATION

PROCESSING LETTERS

[3] H. Samet, Computing perimeters of images represented by quadtrees, IEEE Trans. Pattern Analysis & Machine IntelIigence PAMI- (6) (1981) 683-68?. [4] H. Samet, The quadtree and related hierarchical data structures, ACM Comput. Surveys 16 (2) (1984) 187-260.

19 October 19b7

[S] H. Bahia, AnaIyse des Structures de Don&es Arborescentes Representant des Images, These de Troisieme Cycie, Univ. d’Orsay, December 1986.

93