Pattern Recognition Letters 8 (1988) 15-20, North-Holland
July 1988

Random field identification from a sample: experimental results

Stanley M. DUNN*, Richard L. KEIZER
Department of Electrical and Computer Engineering, Rutgers University, Piscataway, NJ 08854, USA

Azriel ROSENFELD
Center for Automation Research, University of Maryland, College Park, MD 20742, USA

Received 11 December 1987

Abstract: In this report, we consider the problem of identifying a random field belonging to a given class, given a sample generated by that random field. We take the field to be from one of two special classes: stationary fields of independent samples and fields that are simple stationary Markov chains. Interval estimators for the parameters of the field are derived from the joint frequencies of occurrence of elements of the sample. We use Monte Carlo simulations to evaluate the performance of these estimators and to investigate the tightness of some theoretical bounds for their confidence levels. We also demonstrate how these methods can be applied to the problem of texture classification or segmentation, and present examples of textures distinguishable using these methods but not distinguishable to the eye.

Key words: Texture classification, random fields, Markov chains.
1. Introduction

In this report, we consider the problem of identifying a random field belonging to a given class, given a sample generated by that random field. Our work is based on theoretical results obtained by Prof. M. Rosenblatt-Roth in [1] and [2]. Section 2 treats the case of random fields consisting of independent random variables, and summarizes results obtained in [1]. Section 3 is a summary of results obtained in [2] for the case of simple stationary Markov chains. Sections 4 and 5 describe the experiments performed to determine how well the bounds in Sections 2 and 3 predict the bounds obtained in practice. Section 6 briefly discusses applications to texture classification and image segmentation.

*The support of the first author by the National Bureau of Standards under Grant 60NANB4D0053 while at the University of Maryland, and by a Henry Rutgers Research Fellowship, is gratefully acknowledged.
2. Theoretical results: the independent case

Let C_s denote a sequence of s consecutive independent trials with possible outcomes A_i (1 ≤ i ≤ n) and associated probabilities p_i (1 ≤ i ≤ n) that add to 1. Let m_i be the number of times the outcome A_i appears in C_s (1 ≤ i ≤ n). Let us denote by u(ε) the solution of the equation

√(2/π) ∫₀^{u(ε)} e^{−x²/2} dx = (1 − ε)^{1/n}.   (1)

Given ε > 0, δ > 0, s > 0, we say that Condition A holds if 4δ²s > u²(ε).

0167-8655/88/$3.50 © 1988, Elsevier Science Publishers B.V. (North-Holland)

We will denote by W{S} the confidence of statement S. In [1], Rosenblatt-Roth proved

Theorem 1. Let the arbitrary sequence C_s be generated by an independent identically distributed sequence of trials with unknown probabilities p_i (1 ≤ i ≤ n), and let Condition A be satisfied. Then

W{ m_i/s − δ < p_i < m_i/s + δ, 1 ≤ i ≤ n } > 1 − ε.   (2)
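Assuming u(ε) in (1) is the standard normal quantile solving √(2/π) ∫₀^u e^{−x²/2} dx = (1 − ε)^{1/n} (the left-hand side equals 2Φ(u) − 1 for the standard normal CDF Φ), Condition A turned into an equality gives the half-width δ = u(ε)/(2√s) of the intervals in (2). A minimal sketch in Python; the function names are ours, not from [1]:

```python
from statistics import NormalDist

def u_of_eps(eps: float, n: int) -> float:
    """Solve sqrt(2/pi) * integral_0^u exp(-x^2/2) dx = (1 - eps)^(1/n).

    The left-hand side equals 2*Phi(u) - 1 for the standard normal
    CDF Phi, so u is an ordinary normal quantile.
    """
    target = (1.0 - eps) ** (1.0 / n)
    return NormalDist().inv_cdf((1.0 + target) / 2.0)

def half_width(eps: float, n: int, s: int) -> float:
    """Smallest delta satisfying Condition A, 4*delta^2*s > u^2(eps):
    the half-width of the confidence interval m_i/s +/- delta in (2)."""
    return u_of_eps(eps, n) / (2.0 * s ** 0.5)

# n = 2 outcomes, a 16 x 16 sample (s = 256), 95% confidence:
print(round(half_width(0.05, 2, 256), 2))  # 0.07
```

The last line reproduces the interval size δ = 0.07 quoted in Section 6 for samples of 256 pixels, which supports this reading of (1).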
3. Theoretical results: the simple Markov case

Consider a simple stationary Markov chain of discrete random variables, or trials, with possible outcomes A_i (1 ≤ i ≤ n), with initial probabilities p_i (1 ≤ i ≤ n), and with transition probabilities p_ik (1 ≤ i, k ≤ n). Since the chain is assumed to be stationary, the probability of occurrence of a sequence C_s does not depend on the moment when the trials begin. If we take into consideration the Markovian dependence of the trials, this probability can be written as

P(C_s) = p_{k₁} ∏_{t=2}^{s} p_{k_{t−1} k_t}.

Let us denote by m_il the number of consecutive pairs of the form (k_{t−1}, k_t) in which k_{t−1} = i, k_t = l.

We shall assume that the simple, stationary Markov chains being studied obey the law of large numbers, i.e., for an arbitrary δ > 0

lim_{s→∞} Pr{ |m_i/s − p_i| > δ } = 0   (1 ≤ i ≤ n).

Let z(ε) be defined as the solution of the equation

√(2/π) ∫₀^{z(ε)} e^{−x²/2} dx = (1 − ε)^{1/n²}.

Condition B is said to hold if

4δ² · min{m_i, 1 ≤ i ≤ n} > z²(ε),   (3)

where ε > 0, δ > 0, m_i > 0 (1 ≤ i ≤ n), and s > 1.

For given values of s and δ, let F_{s,δ} be the set of all sequences C_s such that

|m_il − s p_i p_il| < sδ

for all pairs (i, l) (1 ≤ i, l ≤ n). For given values of s and δ, let G′_{s,δ} be the set of all sequences C_s such that

|m_i − s p_i| < (δ/2)s   and   |m_il − m_i p_il| < (δ/2)s,

and let G_{s,δ} be the set of all sequences C_s such that |m_il − m_i p_il| < δ m_i for all pairs (i, l) (1 ≤ i, l ≤ n). The relationship among these subsets is given by the following lemma from [2].

Lemma 1. G_{s,δ/2} ⊆ G′_{s,δ} ⊆ F_{s,δ}.

In [2] Rosenblatt-Roth proved

Theorem 2. Let C_s be a sample sequence from a stationary Markov chain with unknown probabilities p_i, p_il (1 ≤ i, l ≤ n), and suppose that p_il > 0 (1 ≤ i, l ≤ n) so that the ergodic theorem holds, i.e., if p_il^(r) (1 ≤ i, l ≤ n) are the transition probabilities for r ≥ 1 steps, then

lim_{r→∞} p_il^(r) = p_l   (1 ≤ i, l ≤ n).

Then, if Condition B holds, we have

W{ m_il/m_i − δ < p_il < m_il/m_i + δ, 1 ≤ i, l ≤ n } > 1 − ε.   (4)
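Reading the defining equation for z(ε) the same way as (1), as a normal quantile with exponent 1/n² (there are n² transition probabilities), the interval estimators m_il/m_i ± δ of Theorem 2 can be sketched as follows. This is our illustration under that assumption, not code from [2]:

```python
from collections import Counter
from statistics import NormalDist

def z_of_eps(eps: float, n: int) -> float:
    # Assumed form: sqrt(2/pi) * integral_0^z exp(-x^2/2) dx = (1 - eps)^(1/n^2),
    # i.e. z is a standard normal quantile.
    target = (1.0 - eps) ** (1.0 / n ** 2)
    return NormalDist().inv_cdf((1.0 + target) / 2.0)

def transition_intervals(seq, n, eps):
    """Intervals m_il/m_i +/- delta for the transition probabilities p_il,
    with delta chosen to make Condition B, 4*delta^2*min(m_i) > z^2(eps),
    an equality.  Outcomes in seq are assumed coded as 0..n-1."""
    m = Counter(seq[:-1])                # m_i: trials that have a successor
    m_pair = Counter(zip(seq, seq[1:]))  # m_il: consecutive pairs (i, l)
    delta = z_of_eps(eps, n) / (2.0 * min(m[i] for i in range(n)) ** 0.5)
    return {(i, l): (m_pair[i, l] / m[i] - delta, m_pair[i, l] / m[i] + delta)
            for i in range(n) for l in range(n)}
```

Each row of interval centers m_il/m_i sums to 1 by construction, since the pairs starting at state i are exactly the m_i trials counted in the denominator.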
Theorem 2 gives confidence intervals for the transition probabilities. In order to get confidence intervals for the initial probabilities we state and prove the following corollary.

Corollary 1. Suppose C_s is a sample sequence from
Figure 1. Independent n = 2. (Experimental ε vs. theoretical ε for sequence lengths 8, 16, 32, and 64.)
a stationary Markov chain which satisfies the assumptions of Theorem 2. Then

W{ m_i/s − δ < p_i < m_i/s + δ, 1 ≤ i ≤ n } > 1 − ε.   (5)

Proof. By Theorem 2 it follows with confidence greater than 1 − ε that C_s ∈ G_{s,δ}. But according to Lemma 1, G_{s,δ} is a subset of G′_{s,2δ}, from which it follows with confidence greater than 1 − ε that C_s ∈ G′_{s,2δ}. The result now follows immediately from the definition of G′_{s,2δ}.

Figure 2. Independent n = 4. (Experimental ε vs. theoretical ε for sequence lengths 8, 16, 32, 64, and 128.)
Figure 3. Independent n = 8. (Experimental ε vs. theoretical ε for sequence lengths 8, 16, 32, 64, and 128.)
4. Experimental results: the independent case
This section describes experiments performed in connection with the analysis in Section 2. We can empirically investigate the tightness of the confidence intervals of Theorem 1 as follows. We first fix s and n and choose an ε corresponding to a desired confidence level 1 − ε. Next we compute the confidence interval size δ as a function of ε using (1), converted into an equation. For given s, n, and p we generate sample sequences and see if the sample frequencies m_i/s are within δ of p_i. For each p_i we compute the fraction of sample sequences whose sample frequencies lie outside of that interval. We then find the maximum of these fractions to obtain an experimental lower bound for ε which may be compared to the theoretical ε. Figures 1-3 are graphs of experimental ε vs. theoretical ε for the cases n = 2, 4, and 8. We see that the experimental ε values are consistently smaller than the theoretical values.

Figure 4. Markov n = 2. (Experimental ε vs. theoretical ε for sequence lengths 8, 16, 32, 64, and 128.)
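The procedure above can be sketched as follows; this is our reconstruction, again assuming the normal-quantile reading of (1):

```python
import random
from statistics import NormalDist

def experimental_eps(p, s, eps, trials=500, seed=0):
    """Experimental epsilon as in Section 4: the largest, over outcomes i,
    fraction of sample sequences whose frequency m_i/s falls outside
    p_i +/- delta, with delta from Condition A turned into an equality."""
    n = len(p)
    target = (1.0 - eps) ** (1.0 / n)
    u = NormalDist().inv_cdf((1.0 + target) / 2.0)
    delta = u / (2.0 * s ** 0.5)
    rng = random.Random(seed)
    outside = [0] * n
    for _ in range(trials):
        counts = [0] * n
        for _ in range(s):              # draw one sequence of s trials
            x = rng.random()
            acc = 0.0
            for i, p_i in enumerate(p):
                acc += p_i
                if x < acc:
                    counts[i] += 1
                    break
        for i in range(n):
            if abs(counts[i] / s - p[i]) >= delta:
                outside[i] += 1
    return max(outside) / trials
```

Plotting this value against the theoretical ε for several sequence lengths reproduces the comparison shown in Figures 1-3.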
5. Experimental results: the simple Markov case

This section describes experiments performed in connection with Section 3. As in Section 4, for each sample sequence we compute a δ from (3), converted into an equation, and check to see whether each p_il lies in the interval

[ m_il/m_i − δ, m_il/m_i + δ ].

We generate a large number of sample sequences and obtain an experimental ε as the fraction of sample sequences whose sample frequencies lie outside of that interval. Figures 4 and 5 are graphs of experimental ε vs. theoretical ε for the cases n = 2 and n = 4. Again, the experimental values are consistently smaller.
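The Markov-case experiment admits the same kind of sketch; the normal-quantile form of z(ε) and the chain-generation details are our assumptions:

```python
import random
from statistics import NormalDist

def markov_experimental_eps(P, s, eps, trials=300, seed=1):
    """Experimental epsilon as in Section 5: the largest, over pairs (i, l),
    fraction of sample chains for which P[i][l] lies outside
    m_il/m_i +/- delta, with delta from Condition B as an equality."""
    n = len(P)
    target = (1.0 - eps) ** (1.0 / n ** 2)
    z = NormalDist().inv_cdf((1.0 + target) / 2.0)
    rng = random.Random(seed)
    outside = [[0] * n for _ in range(n)]
    for _ in range(trials):
        state = rng.randrange(n)
        m = [0] * n                          # m_i
        m_pair = [[0] * n for _ in range(n)]  # m_il
        for _ in range(s - 1):
            nxt = rng.choices(range(n), weights=P[state])[0]
            m[state] += 1
            m_pair[state][nxt] += 1
            state = nxt
        if min(m) == 0:
            continue  # Condition B requires every m_i > 0
        delta = z / (2.0 * min(m) ** 0.5)
        for i in range(n):
            for l in range(n):
                if abs(m_pair[i][l] / m[i] - P[i][l]) >= delta:
                    outside[i][l] += 1
    return max(max(row) for row in outside) / trials
```

Sweeping the theoretical ε and the chain length reproduces the comparison shown in Figures 4 and 5.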
6. Applications to texture classification and image segmentation

Textures in digital images are often modeled by discrete random fields having various types of dependency properties [3]. The use of such models provides theoretical foundations for the important problems of texture classification and of segmenting an image into uniformly textured regions [4]. In texture classification problems we are given a texture sample, in the form of a digital image, and we have to decide to which of a given set of classes it belongs. Typically, this is done by computing a standard set of features of the sample, usually based on second-order gray level statistics or on first-order statistics of various local property values [4]. The resulting feature vector is then compared with standard feature vectors representing the classes, and the sample is classified as belonging to that class whose feature vector is closest to the sample's vector. This approach typically yields about 90% correct classifications using samples of size 64 by 64 (i.e., 4096 pixels). We have found that better results can be obtained using a model-based approach. If the classes can be modeled by random fields of a given type, we can apply our methods to decide which of a set
Figure 5. Markov n = 4. (Experimental ε vs. theoretical ε for sequence lengths 8, 16, 32, and 128.)
of random fields is most likely to have generated the sample, i.e., to classify the sample. For the simple case where the random fields consist of independent random variables, we have found that classification with 95% confidence can be achieved using samples of size only 16 by 16, i.e., 256 pixels.

Similar remarks apply to the problem of texture-based image segmentation. This typically involves computing texture features for pieces of an image, and merging two pieces if their feature vectors are similar, or deciding that there is a boundary between the two pieces if their vectors differ. When relatively large image pieces are used (e.g. 64 by 64), this results in blocky segmentation. Thus it is very advantageous to use an approach that allows the use of smaller pieces.

For simplicity, let us consider the case of a random field made up of i.i.d. random variables taking on only two possible values. Using (1), for a sample size of 256 and a confidence level of 0.95, the confidence interval size δ turns out to be 0.07. This implies that if the values of p for two random fields differ by 0.07, a sample of size 256 taken from one of them will not be identified as coming from the other, with confidence 0.95. On the other hand, two textures composed of i.i.d. random variables whose values of p differ by 0.07 are usually not distinguishable by eye, even using much larger sample sizes. This is illustrated in Figure 6, which shows two synthetic textures (in the left and right halves of the image) having values of p equal to 0.465 and 0.535, respectively.

Figure 6. Two i.i.d. textures (left and right halves) that are indistinguishable using samples of size 256 (16 × 16) with 95% confidence.

7. Conclusion

Rosenblatt-Roth has described techniques for identifying a random field belonging to a given class, given a sample of the random field. He has considered the two special cases of random fields composed of independent random variables and of simple stationary Markov chains, and has determined confidence bounds for identification in each of these cases. In this paper we have presented Monte Carlo simulations that allow us to evaluate the tightness of these bounds. In all cases we found that the experimental bounds (ε's) were half of the theoretical bounds or smaller, which means that the confidence interval sizes can be made smaller while still achieving the desired confidence levels. Our interest in this problem is to develop efficient texture classification and segmentation algorithms. Our results show that we can achieve good classifications with small sample sizes.
References

[1] Millu Rosenblatt-Roth (1985). Random field identification from a sample: I. The independent case. TR-1583, Center for Automation Research, University of Maryland, College Park, November 1985.
[2] Millu Rosenblatt-Roth (1986). Random field identification from a sample: II. The simple Markov case. TR-1599, Center for Automation Research, University of Maryland, College Park, January 1986.
[3] A. Rosenfeld, ed. (1981). Image Modelling. Academic Press, New York.
[4] A. Rosenfeld and A.C. Kak (1982). Digital Picture Processing (second edition). Academic Press, New York, Sections 10.1.3, 10.2.3, and 12.1.5.