COMPUTER VISION, GRAPHICS, AND IMAGE PROCESSING 23, 366-375 (1983)
NOTE

Picture Information Measures for Similarity Retrieval

SHI-KUO CHANG
Information Systems Laboratory, Illinois Institute of Technology, Chicago, Illinois 60616

AND

CHUNG-CHUN YANG
Systems Research Branch, Aerospace Systems Division, Naval Research Laboratory, Washington, D.C. 20375
Received June 14, 1982; revised July 30, 1982

A family of picture information measures based upon the concept of the minimum number of gray level changes to convert a picture into one with a desired histogram is presented. This family of picture information measures is found to satisfy a number of axioms and useful inequalities. A natural extension to a Lorenz information measure is discussed. Based upon these information measures, a structural information measure can also be defined for logical pictures. Applications to similarity retrieval in a pictorial information system are then suggested.

1. INTRODUCTION

A two-dimensional picture function f is a mapping, f: N × N → {0, 1, ..., L - 1}, where N represents the set of natural numbers {1, ..., N}, and {0, 1, ..., L - 1} is the pixel value set or gray level set. f(x, y) then represents the pixel value at (x, y). Given a picture function f and a point set S, f/S denotes the restricted picture function which is defined only for points in S, i.e., f/S(x, y) = f(x, y) if (x, y) is in S. f/S is called the support picture function or support picture for S.

A measure for the amount of information contained in a picture f is based upon the minimum number of pixel gray level changes to convert a picture into one with constant gray level. Let h: {0, 1, ..., L - 1} → N represent the histogram of f, where h(i) is the number of pixels with gray level i. We define the pictorial information measure PIM(f) as follows:

$$\mathrm{PIM}(f) = \sum_{i=0}^{L-1} h(i) - \max_{i} h(i). \tag{1}$$
We note that PIM(f) = 0 if and only if f is a constant picture (i.e., f(x, y) = constant for all (x, y) in N × N). On the other hand, PIM(f) is maximum if and only if f has a uniform histogram (i.e., h(i) = constant, 0 ≤ i ≤ L - 1). Let the total number of pixels in f be N(f). f has a uniform histogram if and only if PIM(f) = N(f)(L - 1)/L. In other words, PIM(f) is minimal when f is least informative, and maximal when f is most informative.

Suppose a picture point set S is divided into two disjoint subsets S_1 and S_2. We then have

$$\mathrm{PIM}(f/S) \ge \mathrm{PIM}(f/S_1) + \mathrm{PIM}(f/S_2). \tag{2}$$

0734-189X/83 $3.00. Copyright © 1983 by Academic Press, Inc. All rights of reproduction in any form reserved.

Therefore, if we use disjoint sets S_i to cover a picture f, then the sum of PIM(f/S_i) is always less than or equal to PIM(f). We can also define a normalized picture information measure as NPIM(f) = PIM(f)/N(f). In picture encoding [1, 2] we can use PIM or NPIM to decide whether a picture f should be decomposed. For example, if NPIM(f) is less than a threshold, then f need not be further decomposed. On the other hand, if NPIM(f) is close to maximum, and for every subpicture f/S, NPIM(f/S) is close to maximum, then the picture f is almost random and also need not be further decomposed.

If we define p_i as h(i)/N(f), then we have

$$\mathrm{NPIM}(f) = 1 - \max_{i} p_i. \tag{3}$$

Furthermore, if we define w_i as N(f/S_i)/N(f), then we can prove

$$\mathrm{NPIM}(f/S) \ge w_1 \times \mathrm{NPIM}(f/S_1) + w_2 \times \mathrm{NPIM}(f/S_2). \tag{4}$$

We can define a more general measure, PIM_k, as follows:

$$\mathrm{PIM}_k(f) = \sum_{i=0}^{L-1} h(i) - \sum_{h(i)\ \text{is one of the}\ k\ \text{largest}\ h(i)\text{'s}} h(i) \tag{5}$$

and NPIM_k is accordingly defined as

$$\mathrm{NPIM}_k(f) = 1 - \sum_{p_i\ \text{is one of the}\ k\ \text{largest}\ p_i\text{'s}} p_i. \tag{6}$$
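As an illustration only, the measures PIM, NPIM, PIM_k, and NPIM_k defined above can be computed directly from a histogram. The following Python sketch is not part of the original note: the function names `histogram`, `pim_k`, and `npim_k` and the toy pixel lists are assumptions introduced here. It also checks inequalities (2) and (4) on one small example, with S taken to be the whole toy picture so that N(f/S) = N(f).

```python
from collections import Counter


def histogram(pixels, levels):
    """Return [h(0), ..., h(levels - 1)] for a flat sequence of gray levels."""
    counts = Counter(pixels)
    return [counts.get(i, 0) for i in range(levels)]


def pim_k(hist, k=1):
    """Minimum number of pixel changes leaving at most k gray levels (k = 1 gives PIM)."""
    return sum(hist) - sum(sorted(hist, reverse=True)[:k])


def npim_k(hist, k=1):
    """Normalized measure PIM_k / N(f); an empty picture is assigned 0 here."""
    n = sum(hist)
    return pim_k(hist, k) / n if n else 0.0


# Hypothetical 4-level picture whose point set S is split into S1 and S2.
LEVELS = 4
pixels_s1 = [0, 0, 0, 1]
pixels_s2 = [2, 2, 3, 3]

h_s1 = histogram(pixels_s1, LEVELS)
h_s2 = histogram(pixels_s2, LEVELS)
h_s = histogram(pixels_s1 + pixels_s2, LEVELS)

# Inequality (2): PIM(f/S) >= PIM(f/S1) + PIM(f/S2).
pim_holds = pim_k(h_s) >= pim_k(h_s1) + pim_k(h_s2)

# Inequality (4), with weights w_i = N(f/S_i) / N(f/S).
w1 = sum(h_s1) / sum(h_s)
w2 = sum(h_s2) / sum(h_s)
npim_holds = npim_k(h_s) >= w1 * npim_k(h_s1) + w2 * npim_k(h_s2)

print(pim_k(h_s), npim_k(h_s), pim_k(h_s, k=2), pim_holds, npim_holds)
```

Working from the histogram rather than from the pixel array keeps the cost of each measure at one sort over L bins, which is the "easy to compute" property the measures are meant to have.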
We can also prove that PIM_k satisfies inequality (2), and NPIM_k satisfies inequality (4). The picture information measures introduced above can be used to select subpictures in picture covering [2-4].

In what follows, we demonstrate that these information measures satisfy a number of axioms and useful inequalities (Section 2) and generalize the concept to a Lorenz information measure (Section 3). Based upon these information measures, a structural information measure can also be defined for logical pictures (Section 4). Applications to similarity retrieval are suggested (Section 5).

2. A FAMILY OF INFORMATION MEASURES
The picture information measures NPIM and NPIM_k introduced above belong to a family of information measures. We will use p to denote a probability vector (p_1, ..., p_n), where the p_i's are nonnegative real numbers. Let p, r, and q be three probability vectors. We write p = r + q if p_i = r_i + q_i, 1 ≤ i ≤ n. Let w_1 be the sum of the r_i's, and w_2 be the sum of the q_i's. We call IM(p) an information measure if the following inequality holds:

$$\mathrm{IM}(p) \ge w_1 \mathrm{IM}(r) + w_2 \mathrm{IM}(q). \tag{7}$$
THEOREM. Let IM(p) = F(1, 0, ..., 0) - F(p). If the function F satisfies the following:

(B1) F(p) is continuous in p_i for all i,
(B2) F(p) is symmetric in p_i for all i,
(B3) F(0, ..., 0) = 0,
(B4) F is convex, i.e., F(w_1 × r + w_2 × q) ≤ w_1 F(r) + w_2 F(q),

then IM(p) is an information measure. Moreover, the minimum of IM(p) is at IM((1, 0, ..., 0)) = 0, and the maximum of IM(p) is at IM((1/n, ..., 1/n)).

We give the following examples of information measures:
$$F(p) = \max_{i} p_i \tag{8}$$

is a convex symmetric function, and IM(p) in this case is the NPIM defined above.

$$F(p) = \sum_{p_i\ \text{is one of the}\ k\ \text{largest}\ p_i\text{'s}} p_i \tag{9}$$

is a convex symmetric function, and IM(p) in this case is the NPIM_k defined above.

$$F(p) = \sum_{i=1}^{n} p_i^{\alpha} \tag{10}$$

is convex symmetric if α > 1. Therefore, we can use

$$\mathrm{IM}(p) = 1 - \sum_{i=1}^{n} p_i^{\alpha} \tag{11}$$

as an information measure. Intuitively, this information measure behaves similarly to NPIM when α is large, and behaves similarly to NPIM_k when α is closer to 1.

$$F(p) = \sum_{i=1}^{n} p_i \log p_i \tag{12}$$

is convex symmetric, and F(1, 0, ..., 0) = 0. Therefore, IM(p) in this case is the entropy function H(p), the standard measure for information.

Comparing the pictorial information measure with the standard entropy function, we note that in axiomatic information theory, H(p) is shown to satisfy the following three axioms:

(C1) H(p) is continuous in p_i for all i,
(C2) H(p) is symmetric in p_i for all i,
(C3) If p_n = q_1 + q_2, then

$$H(p_1, p_2, \ldots, p_{n-1}, q_1, q_2) = H(p_1, \ldots, p_{n-1}, p_n) + p_n H(q_1/p_n, q_2/p_n).$$
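The construction IM(p) = F(1, 0, ..., 0) - F(p) of the theorem can be tried numerically for the four choices of F in Eqs. (8), (9), (10), and (12). The Python sketch below is an illustration added here, not code from the original note; the function names and the vectors `r`, `q` are hypothetical. It assumes that r and q are probability vectors and that p is their mixture p = w_1 r + w_2 q with w_1 + w_2 = 1, which is the setting of (B4), and prints both sides of inequality (7) for each measure.

```python
import math


def f_max(p):
    """F of Eq. (8): the largest component."""
    return max(p)


def f_top_k(p, k):
    """F of Eq. (9): the sum of the k largest components."""
    return sum(sorted(p, reverse=True)[:k])


def f_power(p, alpha):
    """F of Eq. (10): the sum of p_i ** alpha (convex for alpha > 1)."""
    return sum(x ** alpha for x in p)


def f_plogp(p):
    """F of Eq. (12): the sum of p_i log p_i, using the convention 0 log 0 = 0."""
    return sum(x * math.log(x) for x in p if x > 0)


def information_measure(F, p):
    """IM(p) = F(1, 0, ..., 0) - F(p), as in the theorem."""
    unit = [1.0] + [0.0] * (len(p) - 1)
    return F(unit) - F(p)


# p = w1 * r + w2 * q for two probability vectors r and q (hypothetical values).
r = [1.0, 0.0, 0.0]
q = [0.0, 0.6, 0.4]
w1, w2 = 0.5, 0.5
p = [w1 * a + w2 * b for a, b in zip(r, q)]

measures = {
    "NPIM": f_max,
    "NPIM_2": lambda v: f_top_k(v, 2),
    "power, alpha = 2": lambda v: f_power(v, 2.0),
    "entropy": f_plogp,
}

for name, F in measures.items():
    lhs = information_measure(F, p)
    rhs = w1 * information_measure(F, r) + w2 * information_measure(F, q)
    print(f"{name}: IM(p) = {lhs:.4f}, weighted sum = {rhs:.4f}")
```

Passing F as a function keeps the four measures interchangeable, which mirrors the point of the theorem: any F satisfying (B1)-(B4) can be substituted.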
The minimum of H(p) is also at H((1, 0, ..., 0)) = 0, and the maximum of H(p) is at H((1/n, ..., 1/n)). (C3) is not satisfied by NPIM or NPIM_k, and this axiom is replaced by (B3) and (B4) given above. If the function F is as given in Eq. (10), we can show that

In general, if F(p) is the summation of f(p_i), where f is a continuous convex function in p_i, then F is a symmetric convex function, and IM(p) thus defined is an information measure. More details and proofs of theorems can be found in Ref. [5].

3. LORENZ INFORMATION MEASURE
The reasons for using NPIM or NPIM_k as information measures, instead of using the usual entropy function, are (1) they have an intuitively meaningful interpretation with respect to pictures, (2) they are easy to compute, and (3) they represent a family of picture information measures, so that for a given application, a desirable one can be selected by adjusting the various thresholds and constraints. Furthermore, the picture information measure described above has a natural extension to a Lorenz information measure, as will be discussed in this section.

Let NPIM_k denote the normalized picture information measure where PIM_k is the minimum number of pixel gray level changes to convert a picture to k gray levels. Suppose the p_i's are ordered such that

$$p_0 \le p_1 \le \cdots \le p_{L-1}.$$

It can be seen that NPIM_k(f) is 1 minus the sum of the last k terms in the sequence p_0, p_1, ..., p_{L-1}, which is equal to the sum of the first L - k terms in the sequence. In particular, NPIM(f) or NPIM_1(f) is the sum of the first L - 1 terms in the sequence p_0, p_1, ..., p_{L-1}. The following inequalities hold:

$$0 = \mathrm{NPIM}_L(f) \le \mathrm{NPIM}_{L-1}(f) \le \cdots \le \mathrm{NPIM}_1(f) \le \mathrm{NPIM}_0(f) = 1.$$

If we define s_k to be NPIM_{L-k}(f), we have

$$s_L = 1, \qquad s_k = \sum_{i=0}^{k-1} p_i.$$
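The partial sums s_k are simple to tabulate. The following Python sketch is an added illustration, not part of the original note; the names `lorenz_points` and `npim_from_curve` and the example histogram are hypothetical. It sorts the p_i's in increasing order, accumulates them, and returns the pairs (k/L, s_k) for k = 0, ..., L, using exact fractions to avoid rounding.

```python
from fractions import Fraction


def lorenz_points(hist):
    """Return the points (k/L, s_k), k = 0, ..., L, for a histogram."""
    n = sum(hist)
    probs = sorted(Fraction(h, n) for h in hist)  # p_0 <= ... <= p_{L-1}
    levels = len(probs)
    points = [(Fraction(0), Fraction(0))]
    s = Fraction(0)
    for k, p in enumerate(probs, start=1):
        s += p
        points.append((Fraction(k, levels), s))
    return points


def npim_from_curve(points, k):
    """Read NPIM_k off the list of points as s_{L-k}."""
    levels = len(points) - 1
    return points[levels - k][1]


hist = [3, 1, 2, 2]  # hypothetical 4-level histogram with N(f) = 8
curve = lorenz_points(hist)
for x, s in curve:
    print(f"({float(x):.2f}, {float(s):.3f})")
print("NPIM_1 =", npim_from_curve(curve, 1))
```

Because the p_i's are sorted in increasing order, successive slopes of the resulting polyline never decrease, which is why the curve is convex and lies on or below the diagonal.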
By plotting the points (k/L, s_k), k = 0, 1, ..., L, we obtain a piecewise linear curve, as illustrated in Fig. 1, which is called the Lorenz curve [6]. We note that this curve represents the information content of a picture. To find out the value of NPIM_k, we simply check the point ((L - k)/L, s_{L-k}) on the Lorenz curve. If the gray levels of the pixels are uniformly distributed, then the curve becomes a straight line from (0, 0) to (1, 1). Otherwise, the curve will be a convex piecewise linear curve under this straight line. In Fig. 1, curve C_f for picture f is always above curve C_g for picture g. That is to say, NPIM_k(f) ≥ NPIM_k(g) for every k, or picture f is more informative than picture g. Another way of describing this relationship is that picture g is less complex than picture f. In other words, “as the bow is bent, concentration increases,” and the corresponding picture is less complex and consequently less informative. Using the concept of majorants, if C_f > C_g, then f is more informative than g.

[FIGURE 1: Lorenz curves C_f and C_g for pictures f and g, from (0, 0) to (1, 1)]

The Lorenz curve derived from the picture information measures NPIM_k(f) is called the Lorenz information curve. It can be seen that once the histogram h is given, the Lorenz information curve is completely specified. Conversely, if the Lorenz information curve is given, we know the histogram in its permutation equivalent class.

The Lorenz information measure LIM(p_1, ..., p_n) is defined to be the area under the Lorenz information curve. Clearly, 0 ≤ LIM(p_1, ..., p_n) ≤ 0.5. For any probability vector (p_1, ..., p_n), LIM(p_1, ..., p_n) can be computed by first ordering the p_i's, then calculating the area under the piecewise linear curve. Since LIM(p_1, ..., p_n) can be expressed as the sum of f(p_i), and f(p_i) is a continuous convex function in p_i, LIM(p_1, ..., p_n) is also an information measure. Intuitively, the Lorenz information measure is the weighted sum of the NPIM_k's, so that LIM can be regarded as a global measure of information content.

4. STRUCTURED INFORMATION MEASURE
Since the Lorenz information curve is always normalized, we can plot such curves for pictures having different sizes and gray level sets and compare them. As illustrated in Fig. 2, two Lorenz information curves may intersect at points A_0 = (0, 0), A_1, A_2, ..., A_p = (1, 1). We can define a similarity measure between f and g, d(f, g), as the summation of the polygonal areas enclosed by the two curves C_f and C_g. Clearly, 0 ≤ d(f, g) ≤ 0.5. If d(f, g) is below a preset threshold t, the two pictures can be considered as informationally similar in the sense of having similar Lorenz information curves.

[FIGURE 2: two Lorenz information curves intersecting at A_0 = (0, 0), ..., A_p = (1, 1)]

It should also be noted that this approach can be generalized to handle not only physical pictures defined by a picture function f, but also logical pictures consisting of logical objects and relational objects [1]. Suppose there are N objects in the logical picture, and these objects are classified into L different types: T_1, T_2, ..., T_L. We can define h: {1, 2, ..., L} → N to be the logical histogram, where h(i) is the number of objects having type T_i. The Lorenz information curve for logical pictures can then be computed.

As an example, suppose the picture object set is V = {v_1(A), v_2(A), v_3(B), v_4(B), v_5(B)}. The relational object set is R = {r_1(X, v_1, v_2), r_2(Y, v_1, v_2), r_3(X, v_?, v_?), r_4(Y, v_?, v_5), r_5(X, v_2, v_?)}. The picture objects and relational objects are illustrated in Fig. 3. Objects v_1 and v_2 are of the same type A, and v_3, v_4, v_5 of the same type B. Relations r_1, r_3, r_5 are of the same type X, and r_2, r_4 of the same type Y.

We can define the structural information measure SIM as the weighted sum of three parts:

(S1) Object information measure OIM = IM(a_1, ..., a_L), where a_i is the probability of occurrence for objects of type T_i.
(S2) Intraset information measure IIM(i) = IM(b_i1, ..., b_ik), where b_ij is the probability of occurrence for relations of type R_j in the object set of type T_i.
(S3) Interset information measure TIM(i, j) = IM(c_ij1, ..., c_ijk), where c_ijm is the probability of occurrence for relations of type R_m between object sets T_i and T_j.

The structural information measure SIM is defined to be

$$\mathrm{SIM} = w_0\,\mathrm{OIM} + \sum_{i=1}^{L} w_i\,\mathrm{IIM}(i) + \sum_{i=1}^{L} \sum_{j=1}^{L} u_{ij}\,\mathrm{TIM}(i, j) \tag{13}$$

where w_0, w_i, u_ij are nonnegative weights. For example, if there are L types of objects, we can set w_0 to 1, w_i to 1/L, and u_ij to 2/(L × (L - 1)).
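Returning to the similarity measure d(f, g) defined at the start of this section, the enclosed area can be computed exactly because both curves are piecewise linear. The Python sketch below is an added illustration and not the authors' implementation; the names `lorenz_curve`, `evaluate`, and `lorenz_distance`, the example histograms, and the threshold value are all hypothetical. It merges the breakpoints of the two curves, so that pictures with different numbers of gray levels can be compared, and on each subinterval integrates the absolute difference, splitting the interval into two triangles when the curves cross inside it.

```python
from bisect import bisect_right
from fractions import Fraction


def lorenz_curve(hist):
    """Points (k/L, s_k), k = 0, ..., L, of the Lorenz information curve."""
    n = sum(hist)
    probs = sorted(Fraction(h, n) for h in hist)
    levels = len(probs)
    points, s = [(Fraction(0), Fraction(0))], Fraction(0)
    for k, p in enumerate(probs, start=1):
        s += p
        points.append((Fraction(k, levels), s))
    return points


def evaluate(curve, x):
    """Linearly interpolate a curve, given as points sorted by x, at x in [0, 1]."""
    xs = [px for px, _ in curve]
    i = min(bisect_right(xs, x), len(curve) - 1)
    (x0, y0), (x1, y1) = curve[i - 1], curve[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def lorenz_distance(hist_f, hist_g):
    """Total area enclosed between the two Lorenz information curves."""
    cf, cg = lorenz_curve(hist_f), lorenz_curve(hist_g)
    xs = sorted({x for x, _ in cf} | {x for x, _ in cg})
    area = Fraction(0)
    for a, b in zip(xs, xs[1:]):
        d0 = evaluate(cf, a) - evaluate(cg, a)
        d1 = evaluate(cf, b) - evaluate(cg, b)
        if d0 * d1 >= 0:
            # No crossing inside (a, b): a trapezoid (or triangle).
            area += (b - a) * (abs(d0) + abs(d1)) / 2
        else:
            # The curves cross once inside (a, b): two triangles.
            area += (b - a) * (d0 * d0 + d1 * d1) / (2 * (abs(d0) + abs(d1)))
    return area


# Hypothetical histograms with different numbers of gray levels.
d = lorenz_distance([1, 4], [1, 1, 4])
threshold = Fraction(1, 20)  # assumed value of the preset threshold t
print(float(d), d < threshold)
```

Exact rational arithmetic is used only to make the small example reproducible; floating point would serve equally well for real histograms.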
FIG. 3. Logical picture with objects and relations.

FIG. 4. SEASAT images: (a) image #1; (b) image #2.

In (S1), (S2), and (S3), we can use any information measure IM. If we use LIM as the information measure, we have OIM = LIM(0.4, 0.6) = 0.45, IIM(A) = LIM(0.5, 0.5) = 0.5, IIM(B) = LIM(0) = 0, and TIM(A, B) = LIM(1/3, 2/3) = 0.4167. Therefore, SIM = 0.45 + 0.5 × 0.5 + 0.5 × 0 + 1 × 0.4167 = 1.1167. If the summation of the w_i's is 1, and the summation of the u_ij's is 1, it is clear that 0 ≤ SIM ≤ 1.5.
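The figures in this worked example can be reproduced by taking LIM to be the trapezoidal area under the Lorenz information curve. The Python sketch below is an added check, not code from the original note; the function name `lim` and the variable names are hypothetical. The probability vectors are the ones quoted in the example, and the weights are w_0 = 1, w_i = 1/L, and u_ij = 2/(L × (L - 1)) with L = 2, with the single interset pair (A, B) counted once, as in the sum written out in the text.

```python
from fractions import Fraction as Fr


def lim(probs):
    """Area under the Lorenz information curve of a probability vector."""
    ordered = sorted(probs)
    n = len(ordered)
    area = Fr(0)
    s_prev = Fr(0)
    for p in ordered:
        s = s_prev + p
        area += (s_prev + s) / (2 * n)  # trapezoid of width 1/n
        s_prev = s
    return area


# Probability vectors as quoted in the worked example.
oim = lim([Fr(2, 5), Fr(3, 5)])        # objects: 2 of type A, 3 of type B
iim = {"A": lim([Fr(1, 2), Fr(1, 2)]),
       "B": lim([Fr(0)])}               # the text writes LIM(0) for set B
tim_ab = lim([Fr(1, 3), Fr(2, 3)])     # relations between sets A and B

L = 2
w0, wi, u = Fr(1), Fr(1, L), Fr(2, L * (L - 1))
sim = w0 * oim + wi * (iim["A"] + iim["B"]) + u * tim_ab

print(float(oim), float(iim["A"]), float(iim["B"]), round(float(tim_ab), 4))
print(round(float(sim), 4))
```

In exact arithmetic the four terms are 9/20, 1/2, 0, and 5/12, and SIM is 67/60, which rounds to the 1.1167 quoted above.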
Since the structured information measure is defined with respect to a given object set V and relational object set R, we should write SIM(V, R) to indicate SIM is always calculated for given V and R. Therefore, SIM can be calculated for any subpicture of a logical picture, or any subpicture of a physical picture, by changing V and R.

5. APPLICATION TO SIMILARITY RETRIEVAL
To define similar pictures, we can use a combination of the following criteria: (1) their physical (and/or logical) histograms are similar, (2) their Lorenz information curves are similar, (3) their Lorenz information measures are similar, and (4) their structured information measures are similar.

The picture information measure introduced above is based upon the minimal number of gray level changes to convert a picture to one having a desirable histogram. In the case of PIM_1, the desirable histogram consists of a peak at a single (arbitrary) gray level. For PIM_k, the desirable histogram consists of k peaks at k (arbitrary) gray levels. For similarity measurement, the exact shape of the desirable histogram can be specified, and the algorithms for optimal histogram matching for both the L_1 norm [7] and the L_n norm [8] have been developed. As suggested in [8], we can also use the computed minimal number of mismatches to measure the similarity between two pictures or subpictures.

Figure 4(a) illustrates a SEASAT image (Image #1) of the Los Angeles area, and Fig. 4(b) another SEASAT image (Image #2) of a certain coastal area. Both images were quantized into 128 × 128 pixels of 64 gray levels. Figures 5(a) and (b) illustrate the Lorenz information curves for SEASAT Images #1 and #2, respectively. We can apply picture decomposition techniques (see [1, 2]) to decompose these two pictures into pages of size 10 by 10 and construct picture trees, using the Primitive algorithm, the PIM-guided algorithm, and the LIM-guided algorithm. The experimental results are summarized below.

SEASAT Image #1

Threshold   Number of   Number of   Number of   Algorithm
Value       Nodes       Blocks      Pages       Type
0.0         52          31          135         Primitive
0.0         51          30          136         PIM
0.0         49          29          134         LIM
0.01        52          30          130         PIM
0.01        47          28          129         LIM
0.02        59          34          116         PIM
0.02        57          31          114         LIM
0.03        55          28          107         PIM
0.03        69          33          106         LIM
0.04        66          30          90          PIM
0.04        71          34          85          LIM
0.05        67          29          55          PIM
0.05        71          33          72          LIM
0.10        16          3           3           PIM
0.10        15          3           3           LIM
SEASAT Image #2

Threshold   Number of   Number of   Number of   Algorithm
Value       Nodes       Blocks      Pages       Type
0.0         49          28          69          Primitive
0.0         50          28          69          PIM
0.0         50          30          70          LIM
0.01        50          28          69          PIM
0.01        54          32          71          LIM
0.02        44          23          55          PIM
0.02        46          24          54          LIM
0.03        43          21          47          PIM
0.03        43          21          45          LIM
0.04        41          17          28          PIM
0.04        41          17          24          LIM
0.05        35          10          12          PIM
0.05        42          14          15          LIM
0.10        15          2           3           PIM
0.10        5           0           0           LIM
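A quick tally of the two result tables is sketched below in Python. It is an added illustration, not part of the original note: the tuples are transcribed by hand from the tables, and it assumes that the column with the largest values in each table is the number of pages and that fewer pages is better. The name `lim_wins` is hypothetical.

```python
# (threshold, pages with PIM-guided, pages with LIM-guided), transcribed
# from the two tables under the assumption stated in the lead-in.
image1 = [
    (0.0, 136, 134), (0.01, 130, 129), (0.02, 116, 114), (0.03, 107, 106),
    (0.04, 90, 85), (0.05, 55, 72), (0.10, 3, 3),
]
image2 = [
    (0.0, 69, 70), (0.01, 69, 71), (0.02, 55, 54), (0.03, 47, 45),
    (0.04, 28, 24), (0.05, 12, 15), (0.10, 3, 0),
]


def lim_wins(rows):
    """Count the threshold settings where LIM-guided uses strictly fewer pages."""
    return sum(1 for _, pim_pages, lim_pages in rows if lim_pages < pim_pages)


print(lim_wins(image1), "of", len(image1))
print(lim_wins(image2), "of", len(image2))
```

Under this reading of the columns, the tally agrees with the counts reported in the discussion of these experiments.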
In the above experiments, each picture is divided and subdivided by finding “subtractors,” which are those areas with NPIM less than or equal to a threshold value. The subtractor areas are discarded, and the remaining areas are decomposed into rectangular blocks. The decomposition algorithms are based upon a heuristic to minimize the expected number of pages (Primitive algorithm), or the minimization of the sum of PIM for the decomposed areas (PIM-guided algorithm), or the minimization of the sum of LIM for the decomposed areas (LIM-guided algorithm). In the seven experiments for SEASAT Image #1, in five cases the LIM-guided algorithm was strictly better than the PIM-guided algorithm. In the seven experiments for SEASAT Image #2, in four cases the LIM-guided algorithm performed better. Therefore, the LIM-guided algorithm to construct picture trees usually gives good results.

FIG. 5. Lorenz information curves for image #1 (a) and image #2 (b).

Once the picture tree is constructed, we can retrieve a subpicture by searching the picture tree, and finding similar pictures according to the similarity criteria suggested above. The structured information measure introduced above allows for the consideration of both the physical picture and the logical picture. The subpictures retrieved according to the similarity criteria can then be processed to determine whether they satisfy the detailed picture query [2]. Therefore, inexact similarity retrieval techniques can be combined with exact query processing techniques, thus providing a flexible retrieval system to the user.

ACKNOWLEDGMENTS
This research was supported by the National Science Foundation under Grant ECS-8005953 and by the Naval Research Laboratory under Contract N00014-82-C2156.

REFERENCES

1. S. K. Chang, A methodology for picture indexing and encoding, in Picture Engineering (T. Kunii and K. S. Fu, Eds.), pp. 33-53, Springer-Verlag, 1982.
2. S. K. Chang and C. C. Yang, Picture encoding techniques for a pictorial database, Technical Report, Naval Research Laboratory, 1982.
3. J. Reuss, S. K. Chang, and B. H. McCormick, Picture paging for efficient image processing, in Pictorial Information Systems (S. K. Chang and K. S. Fu, Eds.), pp. 228-256, Springer-Verlag, 1980.
4. S. H. Liu and S. K. Chang, Picture covering by 2-D AH encoding, Proceedings, IEEE Workshop on Computer Architectures for Pattern Analysis and Image Database Management, Hot Springs, Virginia, November 11-13, 1981.
5. H. Silver and S. K. Chang, Picture information measures, Technical Report, Information Systems Laboratory, Illinois Institute of Technology, August 1982.
6. A. W. Marshall and I. Olkin, Inequalities: Theory of Majorization and Its Applications, Academic Press, New York, 1979.
7. S. K. Chang and Y. Wong, Optimal histogram matching by monotone gray level transformation, Commun. ACM 22, 835-840, 1978.
8. S. K. Chang and Y. Wong, Ln norm optimal histogram matching and application to similarity retrieval, Computer Graphics Image Processing 13, 361-371, 1980.
9. R. C. Gonzalez et al., A measure of scene content, CH1318-5/78/0000-0385$00.75, IEEE 1978, pp. 385-389.