Image-data compression using edge-optimizing algorithm for WFA inference


Information Processing & Management, Vol. 30, No. 6, pp. 829-838, 1994
Pergamon. Copyright © 1994 Elsevier Science Ltd. Printed in Great Britain. All rights reserved.
0306-4573/94 $6.00 + .00
0306-4573(94)00019-0

IMAGE-DATA COMPRESSION USING EDGE-OPTIMIZING ALGORITHM FOR WFA INFERENCE

KAREL CULIK II
Department of Computer Science, University of South Carolina, Columbia, SC 29208, U.S.A.

and

JARKKO KARI
Academy of Finland, and Mathematics Department, University of Turku, 20500 Turku, Finland

Abstract -- Weighted finite automata (WFA) define real functions, in particular, grayness functions of graytone images. Earlier, the authors gave an automatic encoding (inference) algorithm that converts an arbitrary function (graytone image) into a WFA that can (approximately) regenerate it. The WFA obtained by this algorithm had an (almost) minimal number of states, but a relatively large number of edges. Here we give an inference algorithm that produces a WFA with a not necessarily minimal number of states, but with a relatively small number of edges. Then we discuss image-data compression results based on the new inference algorithm alone and in combination with wavelets. It is a simpler and more efficient method than the other known fractal compression methods. It produces better results than wavelets alone.

1. INTRODUCTION

Weighted finite automata (WFA) were introduced in [5]. They compute real functions of n variables, more precisely functions [0,1]^n → ℝ. For n = 2 such a function can be interpreted as the grayscale function of an image. In [4] we gave an algorithm that, for a function (image) given in table (pixel) form, finds a WFA with a small number of states that approximates the given function. When this algorithm was used for image-data compression, the ratio of compression and the quality of the regenerated image were less than satisfactory, since the automata produced by the algorithm had a relatively small number of states but a large number of edges. Next [3] we combined WFA with Daubechies' wavelets; we first expressed the wavelet transform as a WFA and then simplified it by our algorithm. This approach produced better results than wavelets alone, but only marginally so. Here we develop a new inference algorithm for WFA that optimizes the number of edges. Using this algorithm alone (without wavelets) we have obtained results comparable to or better than wavelets alone. However, the best results are obtained by first computing the wavelet coefficients, representing them as an image (the Mallat form), and then applying our new inference algorithm to this image, approximating it by a WFA. The decoding is the reverse of this process, using first the decoding algorithm for WFA and then the decoding algorithm for wavelets.

2. IMAGES AND WEIGHTED FINITE AUTOMATA

By a finite-resolution image we mean a digitized grayscale picture that consists of 2^m by 2^m pixels (typically 7 ≤ m ≤ 11), each of which takes a real value (in practice digitized to a value between 0 and 2^k − 1, typically k = 8). By a multi-resolution image we mean a collection of compatible 2^n by 2^n resolution images for n = 0, 1, …. We will assign to each pixel at 2^n by 2^n resolution a word of length n over the alphabet Σ = {0,1,2,3}.


Each letter of Σ refers to one quadrant of the unit square, as shown in Fig. 1. We assign ε as the address of the root of the quadtree representing an image. Each letter of Σ is the address of a child of the root. Every word w in Σ* of length k is then an address of a unique node of the quadtree at depth k. The children of this node have addresses w0, w1, w2, and w3. Therefore, in our formalism a multi-resolution image is a real function on Σ*. The compatibility of the different resolutions is formalized by requiring that f: Σ* → ℝ is an average-preserving function. A function f: Σ* → ℝ is average-preserving (ap) if

f(w) = (1/4) [f(w0) + f(w1) + f(w2) + f(w3)]    (1)
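As a quick illustration of (1), the following Python sketch (our own helper names subsquare and f, not from the paper) extracts the subsquare addressed by a word over Σ from a pixel array, using the quadrant addressing of Fig. 1, and checks that pixel averages are average-preserving:

```python
import numpy as np

def subsquare(img, w):
    """Subsquare of a 2^n x 2^n image addressed by the word w over {0,1,2,3}
    (0 bottom-left, 1 top-left, 2 bottom-right, 3 top-right; array row 0 is
    the top of the picture)."""
    for a in w:
        h = img.shape[0] // 2
        rows = slice(0, h) if a in "13" else slice(h, None)   # top vs. bottom half
        cols = slice(0, h) if a in "01" else slice(h, None)   # left vs. right half
        img = img[rows, cols]
    return img

def f(img, w):
    """Multiresolution function: average grayness of the subsquare w."""
    return subsquare(img, w).mean()

img = np.random.rand(8, 8)                     # a 2^3 x 2^3 "image"
lhs = f(img, "3")
rhs = sum(f(img, "3" + a) for a in "0123") / 4
assert abs(lhs - rhs) < 1e-12                  # property (1)
```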

for each w ∈ Σ*. An ap-function f is represented by an infinite labeled quadtree. The root is labeled by f(ε), its children from left to right by f(0), f(1), f(2), f(3), etc. Intuitively, f(w) is the average grayness of the subsquare w for a given graytone image.

We consider the set of functions f: Σ* → ℝ as a vector space. The operations of sum and multiplication by a real number are defined in the natural way:

(f_1 + f_2)(w) = f_1(w) + f_2(w),  for any f_1, f_2: Σ* → ℝ and w ∈ Σ*,

(cf)(w) = c·f(w),  for any function f: Σ* → ℝ and real number c.

The set of ap-functions forms a linear subspace, because any linear combination of ap-functions is average-preserving. The sum of two ap-functions represents the image obtained by summing up the graynesses of the two images, and multiplication by a real number corresponds to a change of contrast.

By an infinite-resolution image we mean a local-grayness function g: [0,1]^2 → ℝ. For every integrable local-grayness function g we can find the corresponding multi-resolution function f: Σ* → ℝ by computing f(w) as the integral of g over the square with address w, divided by (1/4)^|w|, the size of the square, for each w ∈ Σ*. Conversely, for a point p ∈ [0,1]^2, g(p) is the limit of the values of the pixels containing p, if such a limit exists. Thus, not every multi-resolution image can be converted into an infinite-resolution image. For a color image in rgb (red-green-blue) representation, we need three graytone images, one for each basic color.

We can view the addresses of the nodes of the quadtree as the addresses of the corresponding subsquares of the unit square [0,1]^2. For example, the whole [0,1]^2 has the address ε and the black square in Fig. 3 has the address 320. All subsquares with their addresses for resolution 4 × 4 are shown in Fig. 2.

The distance between two functions f and g (the error of approximating f by g) is usually measured by

‖f − g‖_p = [ ∫₀¹ ∫₀¹ |f(x,y) − g(x,y)|^p dx dy ]^(1/p).    (2)

In practice, one desires a metric that parallels human perception (i.e., the image differences that seem larger to the human eye are mathematically large and those perceived as insignificant are mathematically small). The usual choice is (2) with p = 2, which we adopt. Therefore, we will consider the average square error also in the case of finite-resolution images.

Fig. 1. The addresses of quadrants:

1 3
0 2

Fig. 2. The addresses of subsquares of resolution 4 × 4:

11 13 31 33
10 12 30 32
01 03 21 23
00 02 20 22

A weighted finite automaton (WFA) A is specified by

1. Q, a finite set of states;
2. Σ, a finite alphabet (here we use the alphabet Σ = {0,1,2,3});
3. W_a: Q × Q → ℝ, for each a ∈ Σ, the weights of the edges labeled by a;
4. I: Q → (−∞, ∞), the initial distribution;
5. F: Q → (−∞, ∞), the final distribution.

We say that (p, a, q) ∈ Q × Σ × Q is an edge (transition) of A if W_a(p, q) ≠ 0. This edge has label a and weight W_a(p, q). For |Q| = n we will usually view W_a as an n × n matrix of reals and I, F as real vectors of size n. A WFA defines a multiresolution image f_A by

f_A(a_1 a_2 … a_k) = I W_{a_1} W_{a_2} … W_{a_k} F

for each k ≥ 0 and a_1 a_2 … a_k ∈ Σ*.

EXAMPLE 1. A WFA can be specified as a diagram with n nodes {1, …, n}. There is an edge from node i to node j with label a ∈ Σ and weight r ≠ 0 iff (W_a)_{ij} = r. The initial and final distribution values are shown inside the nodes, as illustrated in Fig. 4, where I = (1, 0), F = (1/2, 1), and (rows separated by semicolons)

W_0 = [1/2 0; 0 1],  W_1 = [1/2 1/4; 0 1],  W_2 = [1/2 1/4; 0 1],  and  W_3 = [1/2 1/2; 0 1].

Fig. 3. The subsquare specified by the string 320.

Fig. 4. WFA A defining the linear grayness function f_A. (State 1 carries a loop with labels 0,1,2,3 and weight 1/2, edges to state 2 with labels 1,2 and weight 1/4 and with label 3 and weight 1/2; state 2 carries a loop with labels 0,1,2,3 and weight 1.)

From the diagram the multiresolution image can be read as follows: the weight of a path in the diagram is obtained by multiplying together the weights of all transitions on the path, the initial distribution value of the first node, and the final distribution value of the last node of the path. Then f_A(w) is the sum of the weights of all paths whose labels form the word w. For example, f_A(03) = I W_0 W_3 F = 3/8, or alternatively, f_A(03) is the sum of the weights of the three paths labeled by 03: 1/8 + 1/4 + 0 = 3/8. The image f_A for resolutions 2 × 2, 4 × 4, and 256 × 256 is shown in Fig. 5.

If

(W_0 + W_1 + W_2 + W_3) F = 4F,    (3)

then f_A(w0) + f_A(w1) + f_A(w2) + f_A(w3) = 4 f_A(w) for all w ∈ Σ*. In other words, if (3) holds, then the multiresolution image f_A is average-preserving. In this case we also call the WFA A average-preserving (ap-WFA). Note that (3) states that 4 is an eigenvalue of W_0 + W_1 + W_2 + W_3 and F is a corresponding eigenvector. All WFA considered here will be average-preserving. In the special case F = (1, 1, …, 1) considered in [4], eqn (3) reduces to the requirement that for each state the sum of the weights of all outgoing edges is 4.

The matrices W_a, a ∈ Σ, and the final distribution F define a multiresolution image ψ_i for every state i ∈ Q by

ψ_i(a_1 a_2 … a_k) = (W_{a_1} W_{a_2} … W_{a_k} F)_i.

Equivalently, for every i ∈ Q, a ∈ Σ, and w ∈ Σ* we have

ψ_i(aw) = Σ_{j=1}^{n} (W_a)_{ij} ψ_j(w).

Fig. 5. The image f_A in resolutions 2 × 2, 4 × 4, and 256 × 256.
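To make the definition of f_A concrete, here is a small numpy sketch (ours, not from the paper) that evaluates f_A(a_1 … a_k) = I W_{a_1} … W_{a_k} F for the automaton of Example 1; it reproduces the value f_A(03) = 3/8 computed above.

```python
import numpy as np

# WFA of Example 1: I, F and the four transition matrices W_a.
I = np.array([1.0, 0.0])
F = np.array([0.5, 1.0])
W = {0: np.array([[0.5, 0.0 ], [0.0, 1.0]]),
     1: np.array([[0.5, 0.25], [0.0, 1.0]]),
     2: np.array([[0.5, 0.25], [0.0, 1.0]]),
     3: np.array([[0.5, 0.5 ], [0.0, 1.0]])}

def f_A(word):
    """f_A(a_1 ... a_k) = I W_{a_1} ... W_{a_k} F."""
    v = I
    for a in word:
        v = v @ W[int(a)]
    return float(v @ F)

print(f_A("03"))   # 0.375, i.e. 3/8 as computed in the text
```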


We call ψ_i the image of state i. It is average-preserving if the WFA A is. The final distribution value F_i = ψ_i(ε) is the average grayness of the image ψ_i. The transition matrices W_a, a ∈ Σ, specify how the four quadrants of each image ψ_i are expressed as linear combinations of ψ_1, ψ_2, …, ψ_n; specifically, the image in quadrant a of ψ_i is expressed as (W_a)_{i1} ψ_1 + (W_a)_{i2} ψ_2 + … + (W_a)_{in} ψ_n. The initial distribution I specifies how the multiresolution image f, computed by the WFA A, is expressed as a linear combination of the images ψ_1, ψ_2, …, ψ_n; clearly,

f_A = I_1 ψ_1 + I_2 ψ_2 + … + I_n ψ_n.

For an arbitrary multiresolution image f over Σ and word u ∈ Σ*, f_u denotes the multiresolution image

f_u(w) = f(uw),  for every w ∈ Σ*.

f_u is the image defined by f inside the subsquare with address u.

EXAMPLE 2. We will show intuitively how we can infer a WFA that generates a given image. The method follows the inference algorithm described in [5]. Consider the multiresolution image f shown in Fig. 6. We explain the construction of the WFA A over Σ = {0,1,2,3} defining f, shown in Fig. 7. (A is drawn as a labeled and weighted directed graph. The nodes of the graph represent the states. The edges represent non-zero elements of the transition matrices: if (W_a)_{ij} = r ≠ 0, there is an edge in the graph from node i to node j with label a and weight r. Weights are shown in parentheses. To simplify the graph, multiple edges with the same weight but several labels are drawn as a single edge.) As we know, each state i of the automaton defines an image ψ_i, which in Fig. 7 is shown for each state inside the box representing that state. In A, or in any WFA produced by the inference algorithm of [5], the images ψ_i of all states belong to {f_u | u ∈ Σ*}; that is, they are subimages of f inside some subsquares of the unit square. We denote by q_w the state with image f_w. Clockwise, starting from the top left, WFA A has the states q_ε, q_1, q_10, q_100, q_00, and q_0.

We start by creating the state q_ε that represents f_ε. Now we have to "process" q_ε by expressing all quadrants of f_ε as linear combinations of existing states or as new states. We have f_2 = f_ε and f_3 = 0, and we create two new states q_0 and q_1 representing f_0 and f_1, respectively.

Fig. 6. The diminishing triangles.


Fig. 7. A WFA generating the diminishing triangles.

Thus, we draw the loop at q_ε with label 2 and weight 1, and edges from q_ε to q_0 and q_1 with weight 1 and labels 0 and 1, respectively. Next, we process q_1; the image f_1 is again self-similar, namely f_12 = f_11 = f_1. Hence, we draw the loop at q_1 with labels 1 and 2 and weight 1. Quadrant 3 of f_1 is empty, and quadrant 0 contains a new image, a black triangle f_10. Next we process q_0, which yields the first nontrivial linear combination, namely f_01 = (1/2)f_0 + (1/2)f_10. Hence, there are two edges, each with label 1 and weight 1/2, starting at q_0: one loops back to q_0 and the other goes to q_10. We repeat this process for each newly created state and each a ∈ Σ. If it terminates, that is, if all created states have been processed so that each quadrant of every state is expressed as a linear combination of the states (implemented by the edges), then we have constructed a WFA that perfectly represents the given image. The initial distribution of A at state q_ε is 1 and at all other states is 0. The final distribution at each state is the average intensity of the image of that state. Note that WFA A is average-preserving.
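The recursion ψ_i(aw) = Σ_j (W_a)_{ij} ψ_j(w) also gives a direct way to regenerate the 2^k × 2^k pixel image of a WFA, working bottom-up from the level-0 images ψ_i(ε) = F_i. A minimal numpy sketch (our own decoder, assuming the quadrant layout of Fig. 1):

```python
import numpy as np

def decode(I, F, W, k):
    """Regenerate the 2^k x 2^k image of a WFA from the quadrant recursion
    psi_i(aw) = sum_j (W_a)_{ij} psi_j(w), starting from psi_i(eps) = F_i."""
    n = len(F)
    psi = np.asarray(F, dtype=float).reshape(n, 1, 1)      # level-0 state images
    for _ in range(k):
        s = psi.shape[1]
        nxt = np.zeros((n, 2 * s, 2 * s))
        # pixel slots of quadrants: 0 bottom-left, 1 top-left, 2 bottom-right, 3 top-right
        slots = {0: (slice(s, None), slice(0, s)), 1: (slice(0, s), slice(0, s)),
                 2: (slice(s, None), slice(s, None)), 3: (slice(0, s), slice(s, None))}
        for a, (rows, cols) in slots.items():
            # quadrant a of every state image is a linear combination of state images
            nxt[:, rows, cols] = np.tensordot(W[a], psi, axes=1)
        psi = nxt
    return np.tensordot(I, psi, axes=1)                    # f_A = sum_i I_i psi_i
```

With I, F, W as in the sketch after Fig. 5, decode(I, F, W, 8) should regenerate the 256 × 256 linear grayness image of Fig. 5.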

3. A RECURSIVE INFERENCE ALGORITHM FOR WFA

In [5] we gave an inference algorithm that produces a WFA defining a given multiresolution function f. The WFA was guaranteed to have the minimal number of states. However, it was not necessarily optimal as far as the number of edges was concerned; even though the dimensions of the transition matrices were relatively small, the matrices themselves were typically full. In the following we describe a recursive inference algorithm intended for finite-resolution images. It produces a WFA with a possibly non-minimal number of states, but with sparse transition matrices.

In a practical situation we are given an image (e.g., a graytone or color photograph) with a certain finite resolution. In terms of a quadtree we are given all the values at one level, say level k. By computing for each parent the average value of


the labels of all its children, for all the nodes at the higher levels, we get the labels everywhere above the given resolution, and leave "don't cares" below it. By assigning a different state to each node at level k and above, we trivially get a (too large) ap-WFA that (perfectly) defines the given image. The practical problem, therefore, is not whether it is possible to encode a given image, but whether we can get a good trade-off between the size of the automaton and the quality of the regenerated approximation of the given image. This trade-off, of course, depends on the "nonrandomness" or "regularity" of the image. It is a well-known fact in the descriptive (Kolmogorov) complexity of strings [1] that most strings are algorithmically random and cannot be encoded (compressed) by a shorter program. However, the "interesting" strings are not random and can possibly be compressed. The same holds for images.

Our algorithm needs to compute the distance d_k of two multiresolution functions at level k. This distance could be any additive function

d_k: ℝ^(Σ^k) × ℝ^(Σ^k) → ℝ,

where additive means that there exists d: ℝ^2 → ℝ such that

d_k(f, g) = Σ_{w ∈ Σ^k} d(f(w), g(w))

for every f, g ∈ ℝ^(Σ^k) and k = 0, 1, …. Therefore, the distance between two images at level k is the sum of the distances between their corresponding quadrants at level k − 1. In our implementation, d_k is the square of the L₂-metric:

d_k(f, g) = Σ_{w ∈ Σ^k} [f(w) − g(w)]^2.
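A minimal sketch of this distance for level-k images stored as 2^k × 2^k numpy arrays (our own helper, not from the paper); the assert checks the quadrant-by-quadrant additivity just stated.

```python
import numpy as np

def d(f, g):
    """Squared L2 distance: the sum over all pixels w of [f(w) - g(w)]^2."""
    return float(((f - g) ** 2).sum())

f, g = np.random.rand(8, 8), np.random.rand(8, 8)
h = f.shape[0] // 2
quads = [(slice(h, None), slice(0, h)), (slice(0, h), slice(0, h)),
         (slice(h, None), slice(h, None)), (slice(0, h), slice(h, None))]
# additivity: the level-k distance is the sum of the level-(k-1) quadrant distances
assert abs(d(f, g) - sum(d(f[q], g[q]) for q in quads)) < 1e-9
```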

Recursive inference algorithm

The algorithm produces a small WFA A such that the kth levels of the input f and of the function computed by A are close to each other. More precisely, we want the value of

d_k(f, f_A) + G · size(A)

to be as small as possible, where size(A) denotes the storage space required to store the WFA A, and G ∈ ℝ is a parameter. The parameter G controls the quality of the approximation: with large values of G, a small automaton with a poor approximation of f is produced; when G is made smaller, the approximation improves while the size of the automaton increases.

Table 1 contains an outline of the recursive inference algorithm. The global variable n indicates the number of states so far, and ψ_i denotes the multiresolution function of state i, 1 ≤ i ≤ n. A call make_wfa(i, k, max) to the recursive function tries to approximate the multiresolution function ψ_i at level k as well as possible by adding new transitions and (possibly) new states to the automaton. The value of cost = d_k(ψ_i, ψ_i′) + G·s is minimized, where ψ_i′ is the obtained approximation of ψ_i and s is the increase in the size of the automaton caused by the new edges and states. If cost > max, the value ∞ is returned; otherwise cost is returned. Initially, before calling the recursive function for the first time, one sets n ← 1 and ψ_1 ← f, where f is the function that needs to be approximated at level k. Then one calls make_wfa(1, k, ∞).

make_wfa(i, k, max) tries to approximate the functions (ψ_i)_a, a ∈ Σ (the four quadrants of the image ψ_i when Σ = {0,1,2,3}), in two different ways: by a linear combination of the functions of existing states (step 1 in Table 1), and by adding a new state and recursively calling make_wfa to approximate (ψ_i)_a (steps 2 and 3). Whichever alternative yields the better result is chosen (steps 4 and 5).


Table 1. Outline of the recursive inference algorithm for WFA; see the text for details. moke_wfa(i,k,maz) : If maz < 0 then return( cost +-- 0; If k=O cost + do(f, 0); else do the steps l-5 with 11,= (+i). 1. Find T~,Q,...

for &~a E C:

tn such that the value of Costi

cdk-l(~~rl!bl

+...+r,%+n)+G’s

is small, where s denotes the increase in the size of the automaton caused by adding edges from state i to states j with non-zero weights rj and label a, and dr;_l denotes the distance between two multiresolution images at level k - 1.

2. no - n, n + n + 1, qO,,+ + and add an edge from state i to the new state n with label a and weight 1. Let s denote the increase in the size of the automaton caused by the new

state and edge. 3. cost2 - s + make_wfo(n,k-l,min{maz-coat,costf}-s); 4. If co&

5 costf

then

5. If cost1 no+l,... on step j=

cost + cost + co@

< cost2 then cost I (added during 2. Set n - no 1,2 , . . . n with weights

If coat 5 ma2 return(cort)

+- cost + co&l, remove all outgoing transitions from states the recursive call), as well as the transition from state i added and add the transitions from state i with label a to states r1 whenever rj # 0.

else return(W);
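To make the outline concrete, here is a compact Python reading of Table 1 (our sketch, not the authors' implementation). States are kept as pixel arrays; a least-squares fit plays the role of step 1, the size terms are simplified to G per stored edge or state, and the base case returns 0, assuming the final distribution carries the exact level-0 average. The names states, edges, quadrant, at_level and these accounting details are our assumptions.

```python
import numpy as np

G = 4.0        # quality parameter; larger G -> smaller automaton, coarser image

states = []    # states[i]: image of state i as a 2^m x 2^m array (m = creation level)
edges = []     # edges[i][a]: list of (target state j, weight) for label a

def quadrant(img, a):
    """Quadrant a of an image (0 bottom-left, 1 top-left, 2 bottom-right, 3 top-right)."""
    h = img.shape[0] // 2
    rows = slice(0, h) if a in (1, 3) else slice(h, None)
    cols = slice(0, h) if a in (0, 1) else slice(h, None)
    return img[rows, cols]

def at_level(img, k):
    """The image at resolution 2^k x 2^k: block-average down, or replicate up
    (below its stored resolution an image is treated as constant)."""
    m = img.shape[0]
    if m >= 2 ** k:
        f = m // 2 ** k
        return img.reshape(2 ** k, f, 2 ** k, f).mean(axis=(1, 3))
    return np.kron(img, np.ones((2 ** k // m, 2 ** k // m)))

def make_wfa(i, k, max_cost):
    """Approximate state i at level k; return cost = error + G * (storage added)."""
    if max_cost < 0:
        return np.inf
    cost = 0.0
    if k == 0:
        return cost                    # level 0 is carried by the final distribution
    for a in range(4):
        psi = quadrant(states[i], a)                        # target image at level k-1
        basis = np.array([at_level(s, k - 1).ravel() for s in states])
        r, *_ = np.linalg.lstsq(basis.T, psi.ravel(), rcond=None)     # step 1
        r[np.abs(r) < 1e-6] = 0.0                           # keep the matrices sparse
        err = float(((basis.T @ r - psi.ravel()) ** 2).sum())
        cost1 = err + G * np.count_nonzero(r)
        n0 = len(states)                                    # step 2: tentative new state
        states.append(psi); edges.append({})
        cost2 = G + make_wfa(n0, k - 1, min(max_cost - cost, cost1) - G)   # step 3
        if cost2 <= cost1:                                  # step 4: keep the new state
            edges[i][a] = [(n0, 1.0)]
            cost += cost2
        else:                                               # step 5: keep the combination
            del states[n0:]; del edges[n0:]
            edges[i][a] = [(j, w) for j, w in enumerate(r) if w != 0.0]
            cost += cost1
    return cost if cost <= max_cost else np.inf

# usage: for a 2^K x 2^K array img,
#   states.append(img); edges.append({}); make_wfa(0, K, np.inf)
# afterwards I = (1, 0, ..., 0) and F_i = states[i].mean().
```

A real encoder would additionally choose the weights r_j with sparsity in mind rather than by plain least squares, and store the weights and matrix structure compactly, as described below.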

The initial distribution of the WFA produced by the algorithm is I_1 = 1 and I_i = 0 for all i ≥ 2, and the final distribution is F_i = ψ_i(ε) for all i ≥ 1.

The compression achieved by the recursive algorithm can be further improved by introducing an initial basis. Before calling make_wfa for the first time, set n ← N, with fixed images ψ_1, ψ_2, …, ψ_N. The functions in the basis do not even need to be defined by a WFA. The choice of the functions can of course depend on the type of images one wants to compress. Our initial basis resembles the code book in vector quantization [8], which can be viewed as a very restricted version of our method.

Another modification of the algorithm that yields good results is to combine the WFA with a wavelet transformation (see [7]): instead of applying the inference algorithm directly to the original image, one first performs a wavelet transformation on the image and writes the wavelet coefficients in the Mallat form [7]. The Mallat form can be understood as an image itself, to which our inference algorithm can be applied. In this case the WFA can be understood as a fancy way of quantizing the wavelet coefficients. Decoding is of course done in the opposite order: first the decoding of the WFA, and then the inverse wavelet transformation. Because the wavelets we use are orthonormal, the L₂ error introduced on the Mallat form by the WFA is equal to the error caused to the original image.

Let us describe shortly how a WFA is stored efficiently to get a good final compression. There are two types of edges in the automaton: edges created on step 2 when a new state is added, and edges created on step 1, which express the linear combinations. The former form a tree, and they can be stored trivially using four bits per state, where each bit indicates, for one label, which of the two alternatives was chosen on steps 4-5. Even fewer bits suffice if a coding with variable-length codewords is used. For example, most states are typically leaves of the tree; that is, the alternative with linear combinations was chosen for all quadrants, and therefore it is wise to use a one-bit codeword instead of four bits for leaves. For the second type of edges, both the weight and the endpoints need to be stored. According to our experiments, the weights are normally distributed; using this fact, an effective encoding with variable-length codewords has been devised. Storing the endpoints of the edges is equivalent to storing four sparse binary matrices. This has been done using run-length coding.
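For the wavelet variant, the pipeline can be sketched with the third-party PyWavelets package (our choice of tool, not the authors'; the W6 wavelet, with six filter coefficients, corresponds to pywt's "db3"). The WFA approximation step is elided:

```python
import numpy as np
import pywt  # PyWavelets

img = np.random.rand(256, 256)              # stand-in for a 2^8 x 2^8 graytone image

# forward transform; write the coefficients as one image in the Mallat form
coeffs = pywt.wavedec2(img, "db3", level=5)
mallat, slices = pywt.coeffs_to_array(coeffs)

# ... here `mallat` would be approximated by the WFA inference algorithm ...

# decoding in the opposite order: WFA decoding first, then the inverse transform
rec = pywt.waverec2(pywt.array_to_coeffs(mallat, slices, output_format="wavedec2"),
                    "db3")
assert np.allclose(rec, img)                # lossless when the WFA step is skipped
```

Because the Daubechies wavelets are orthogonal, an L₂ error introduced on `mallat` carries over unchanged to `rec`, which is why the WFA cost function can be applied directly to the Mallat-form image.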

Fig. 8. 160 times compressed Lenna.

If an initial basis is used, some states in the initial basis tend to be used more frequently than others. In particular, according to our experiments, if the completely black square is one of the images in the basis, it is used more frequently than other states. In such cases we further improved the encoding of the matrices.

The recursive inference algorithm alone seems to be competitive with any other method for image compression, especially at high compression rates. Its combination with wavelets seems to be the best high-compression method available. It not only gives a very good ratio between the compression rate and the quality of the regenerated images, but it is also relatively simple and time-efficient compared to the other fractal methods. Figure 8 shows examples of 160 times compressed images of Lenna (resolution 512 × 512): clockwise starting at the top left, the original image, the regenerated image using WFA only, the combination of Daubechies' W6 wavelets with WFA, and the W6 wavelets only. The qualities of the images as signal-to-noise ratios are 25.38 dB for WFA, 25.99 dB for the combination of WFA and W6, and 24.91 dB for W6 alone.

Acknowledgement: Research was supported by the National Science Foundation under Grant No. CCR-9202396.

REFERENCES

[1] G.J. Chaitin, Algorithmic information theory, IBM Journal of Research and Development, 21, 350-359 (1977).
[2] K. Culik II and S. Dube, Rational and affine expressions for image description, Discrete Applied Mathematics, 41, 85-120 (1993).
[3] K. Culik II, S. Dube, and P. Rajcani, Efficient compression of wavelet coefficients for smooth and fractal-like data, Proceedings of the Data Compression Conference (DCC'93), Snowbird, Utah, 234-243, ed. J.A. Storer and M. Cohn, IEEE Computer Society Press (1993).
[4] K. Culik II and J. Karhumäki, Automata computing real functions, SIAM J. on Computing, to appear. Also: Tech. Report TR 9105, University of South Carolina, Columbia (1991).
[5] K. Culik II and J. Kari, Image compression using weighted finite automata, Computers & Graphics, 17, 305-313 (1993).
[6] K. Culik II and J. Kari, Image compression using weighted finite automata, Proc. Mathematical Foundations of Computer Science 1993, Lecture Notes in Computer Science, 711, 392-402 (1993).
[7] R.A. DeVore, B. Jawerth, and B.J. Lucier, Image compression through wavelet transform coding, IEEE Transactions on Information Theory, 38, 719-746 (1992).
[8] M. Rabbani and P.W. Jones, Digital Image Compression Techniques, Tutorial Texts in Optical Engineering TT7, SPIE Optical Engineering Press (1991).
[9] G. Strang, Wavelets and dilation equations: A brief introduction, SIAM Review, 31, 614-627 (1989).