COMPUTER GRAPHICS AND IMAGE PROCESSING 9, 234-245 (1979)

Stochastic Tree Grammar Inference for Texture Synthesis and Discrimination1
S. Y. LU2

Department of Electrical and Computer Engineering, Syracuse University, Syracuse, New York 13210

AND

K. S. FU

School of Electrical Engineering, Purdue University, West Lafayette, Indiana 47907
Received December 15, 1977

In a previous paper (S. Y. Lu and K. S. Fu, Computer Graphics and Image Processing 8, 1978), we have proposed a syntactic model for texture analysis. A texture pattern is divided into fixed-size windows. Windowed patterns are described by using tree representations and characterized by tree grammars. This paper introduces a texture grammar inference procedure which employs a clustering algorithm and a stochastic regular grammar inference procedure.

1. INTRODUCTION
In [1], we have proposed a syntactic model for texture analysis. In the model, a texture pattern is divided into fixed-size windows. A windowed pattern is represented by a tree of some chosen structure. Each tree node corresponds to a single pixel or a small homogeneous area of the windowed pattern. The (average) gray level of the small area determines the node label. A tree is then constructed such that the union of all the nodes covers the entire area of the windowed pattern. Consequently, we have a set of trees of the same structure obtained from the training patterns. Tree grammars are then constructed to characterize windowed patterns of the same class. These grammars can be used for texture synthesis as well as texture discrimination. The advantage of the proposed model is its computational simplicity. The decomposition of a pattern into fixed-size windows and the use of a fixed tree structure for representation make the analysis procedure and its implementation very easy. The independence between windowed patterns and between branches

1 This work was supported by the AFOSR Grant 74-2661.
2 S. Y. Lu was with School of Electrical Engineering, Purdue University, West Lafayette, Indiana 47907.

0146-664X/79/030234-12$02.00/0
Copyright © 1979 by Academic Press, Inc. All rights of reproduction in any form reserved.
in a tree also makes the model suitable for analysis on a parallel processor. However, the method is very sensitive to local noise and structural distortion such as shift, rotation, and fluctuation. We have proposed the use of stochastic tree grammars and higher-level placement rules to model local noise and structural distortion respectively, and the use of the nearest-neighbor recognition rule and error-correcting parsers for the classification of noisy and distorted texture patterns. The proposed model has been tested for texture synthesis and discrimination using selected patterns in Brodatz' book [2]. In [1], all the experiments use heuristically constructed grammars. In this paper, we apply a stochastic regular grammar inference procedure to texture grammar inference. The inferred texture grammars are then used to synthesize Brodatz' texture patterns. The definitions of probabilistic word functions, stochastic grammars and stochastic languages given in [3] are briefly reviewed.

DEFINITION 1. A probabilistic word function is a mapping

    f(v): Σ* → R

such that, for every v ∈ Σ*,

    0 ≤ f(v) ≤ 1,    Σ_{v∈Σ*} f(v) = 1,
where Σ represents a finite set of terminal symbols, Σ* is the closure of Σ, and R is the set of real numbers.

DEFINITION 2. A language L ⊆ Σ* is called a stochastic language if there is a probabilistic word function f(v) defined on L.

DEFINITION 3. A stochastic finite-state grammar is a 5-tuple G = (N, Σ, P, δ, S) where N = {A1, A2, ..., An} is a finite set of nonterminals, Σ is a finite set of terminal symbols, S is a start symbol, P is a finite set of production rules, and δ is a mapping δ: P → R such that, for each production of the form Ai → α_ji, there is a probability p_ji associated with it. A stochastic grammar is proper if

    Σ_{j=1..ni} p_ji = 1,    i = 1, ..., n,

where ni is the number of productions with premise Ai.
DEFINITION 4. The grammatical word function generated by G is defined as

    f(v) = Σ_{n=1..N(v)} Π_{k=1..K(v,n)} p_{k,n}(v),

where N(v) is the number of distinct leftmost derivations of v, K(v, n) is the number of steps in the nth derivation, and p_{k,n}(v) is the probability of the production used at the kth step of the nth derivation.

2. THE REPRESENTATION OF A TEXTURE PATTERN

FIG. 1. A digitized binary picture of pattern D22 from Brodatz's book.
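Definitions 1-4 can be made concrete with a small computational sketch (our own illustration, not taken from the paper): a proper stochastic finite-state grammar over {0, 1} whose grammatical word function f(v) sums the probabilities of all leftmost derivations of v. The two-nonterminal grammar and its probabilities below are invented for the example.

```python
from itertools import product

# Illustrative proper stochastic finite-state grammar (our assumption):
# each production is (terminal, next nonterminal or None) with a probability,
# and the probabilities leaving each nonterminal sum to 1.
RULES = {
    "S": [(("1", "A"), 0.7), (("0", "A"), 0.3)],
    "A": [(("1", None), 0.5), (("0", None), 0.5)],
}

def f(v, nt="S"):
    """Grammatical word function: total probability of deriving v from nt."""
    if not v:
        return 0.0
    total = 0.0
    for (sym, nxt), p in RULES[nt]:
        if sym != v[0]:
            continue
        if nxt is None:
            total += p if len(v) == 1 else 0.0  # derivation must end here
        else:
            total += p * f(v[1:], nxt)          # continue the derivation
    return total

# Because the grammar is proper, f is a probabilistic word function:
# the probabilities over all generated strings sum to 1.
mass = sum(f("".join(w)) for n in (1, 2, 3) for w in product("01", repeat=n))
```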
The extraction of a tree representation in syntactic texture analysis is illustrated by an example. Figure 1 is a digitized binary picture of pattern D22 in Brodatz’s book. Let the gray level of a single pixel be a primitive, and let the window size be 8 by 8 pixels. Figure 2 shows three windowed patterns. The tree representations of these patterns are given in Fig. 3(a), (b), and (c) respectively, where node
label 0 corresponds to a white cell, and node label 1 corresponds to a black cell. The starting point of the tree construction is at the middle of the leftmost column of a window. A tree is developed from top to bottom while a window is scanned from left to right. Each pair of radiate branches corresponds to a column of a window: the left branch corresponds to the lower half of the column and the right branch corresponds to the upper half. The label "$" is a relational symbol indicating the concatenation between left and right branches and the continuity in a window.
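The window-to-tree extraction just described can be sketched as follows. The exact ordering conventions (rows indexed top to bottom, each branch read outward from the middle of its column) are our assumptions, since the paper fixes them only by example.

```python
# A sketch (not the authors' code) of the tree representation of a window:
# one "$" trunk node per column, a left branch for the lower half of the
# column and a right branch for the upper half.

def window_to_tree(win):
    """win: n x n list of 0/1 rows. Returns one ('$', left, right) trunk
    node per column, scanning the window from left to right."""
    n = len(win)
    trunk = []
    for c in range(len(win[0])):
        col = [win[r][c] for r in range(n)]
        lower = col[n // 2:]          # lower half of the column -> left branch
        upper = col[:n // 2][::-1]    # upper half, read upward -> right branch
        trunk.append(("$", lower, upper))
    return trunk

win = [[(r + c) % 2 for c in range(8)] for r in range(8)]  # toy 8x8 pattern
tree = window_to_tree(win)
```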
FIG. 3. Tree representations of the three windowed patterns W1, W2, and W3.
Windowed patterns of the same class are usually not identical. However, the repeatedness of similar windowed patterns can be detected by using a clustering procedure [7, 8]. Let a training pattern be an l × l array of pixels of the same texture, and let the window size be n × n. Windowed patterns are first grouped into a number of clusters based on a similarity measure. Assume that there are p clusters resulting from the clustering procedure. A premise is assigned to each cluster. Thus, the windowed patterns are represented by p premises. Consequently, the training pattern is described by a matrix of m × m premises, where m = l/n, with p different premises. The characteristics of a texture, such as regularity, repeatedness of subpatterns, etc., can be extracted from the matrix. Such characteristics can further be formalized by using placement rules for windowed patterns [1].

After a clustering procedure is applied to the windowed patterns, each cluster is a finite set of trees having the same structure. A stochastic tree grammar is then inferred for each cluster. The idea of using a grammar to characterize a sample set is to form a compact representation of the set or to predict other sentences which are of the same nature as the set. The use of a stochastic grammar further models the probabilistic distribution of a sample set by assigning probabilities to production rules in the grammar. For a given cluster, a stochastic grammar is inferred such that the generated grammatical word function will be very close to the probabilistic word function defined on the members of the cluster. The start symbol of the grammar is the premise of the cluster. The final texture grammar is the union of the placement rules and all the stochastic tree grammars.

TABLE 1. The probabilistic sets represented by L1, R1, L2, ..., L8, and R8.

Consider the pattern D22 shown in Fig. 1. The entire training sample contains 400 windows. The window size is 8 by 8 pixels.
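The premise-matrix construction above can be sketched as a short routine. The nearest-centroid-style `assign` callback is an illustrative stand-in for the clustering procedures of [7, 8]; the majority-pixel rule in the toy example is our own invention.

```python
# A minimal sketch: cut an l x l training pattern into non-overlapping
# n x n windows, map each window to a cluster premise, and collect the
# premises into an m x m matrix with m = l / n.

def premise_matrix(pattern, n, assign):
    """pattern: l x l list of pixel rows; assign: window -> premise label."""
    l = len(pattern)
    m = l // n
    return [
        [assign([row[j * n:(j + 1) * n] for row in pattern[i * n:(i + 1) * n]])
         for j in range(m)]
        for i in range(m)
    ]

# Toy example: 4 x 4 pattern, 2 x 2 windows, premise = majority pixel value.
pattern = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
majority = lambda w: "A" if sum(sum(r) for r in w) >= 2 else "B"
M = premise_matrix(pattern, 2, majority)
```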
Let the similarity between two windowed patterns be measured in terms of a distance or weighted distance between their tree representations. The distance between two trees of the same structure is defined as the least number of substitution transformations required to obtain one from the other [6]. Then the distances between W1 and W2, W1 and W3, W2 and W3 in Fig. 3 are 6, 6, and 10 respectively. Using a sentence-to-sentence clustering procedure [8], the 400 training windowed patterns are grouped into a number of clusters.
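For trees of identical structure, the substitution distance just defined reduces to counting the node positions whose labels differ, which can be sketched directly. The (label, children) tuple encoding is our own.

```python
# A sketch of the tree distance: for two trees sharing the same structure,
# the least number of substitution transformations is the number of
# positions at which the node labels disagree.

def tree_distance(t1, t2):
    label1, kids1 = t1
    label2, kids2 = t2
    assert len(kids1) == len(kids2), "trees must share the same structure"
    return (label1 != label2) + sum(
        tree_distance(a, b) for a, b in zip(kids1, kids2))

a = ("$", [("1", []), ("0", [("1", [])])])
b = ("$", [("0", []), ("0", [("1", [])])])
d = tree_distance(a, b)  # exactly one label differs
```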
FIG. 4. Grammar GN.

In this inference problem, the sample set is a finite set of trees with the structure representing windowed patterns of some resemblance (i.e., in the same cluster). The tree structure is predetermined and fixed. Instead of using a tree grammar inference procedure, we shall consider the problem to be a two-dimensional finite-state string grammar inference problem: the growth of a tree trunk is along the first dimension, and branches radiate along the second dimension. The three trees W1, W2, and W3 shown in Fig. 3 are used to illustrate the inference procedure. Assume that they are in the same cluster. A finite-state symbol corresponds to four vertical pixels in the window. A canonical definite finite-state grammar [3] for W1 consists of production rules of the following form:
    A1 → $ A2   (branches L11 and R11)
    A2 → $ A3   (branches L12 and R12)
    ...
L11 and R11, L12 and R12, ..., L18 and R18 represent the first, the second, ..., and the eighth pair of branches of tree W1. Similar sets of production rules can also be obtained for W2 and W3. The canonical definite grammar is then simplified by overlapping the set of production rules constructed for a tree with all the other trees. This overlapping is under the assumption that all the trees in the same cluster are similar. We further rewrite the simplified grammar in the form of a tree grammar GA = (V, Σ, P, A1).

FIG. 5. The inferred stochastic finite-state grammars; L(GL3) = {(1-1-1-1, 2/3), (0-1-1-1, 1/3)}.
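The canonical definite grammar for one tree can be sketched as a list of trunk productions, one per column pair, naming the branch premises. The tuple encoding (premise, "$", left branch, right branch, successor) and the convention that the last trunk production has no successor are our own assumptions.

```python
# A sketch of the canonical definite finite-state productions for a tree
# with eight pairs of branches, as in the display above.

def canonical_productions(tree_id, num_pairs=8):
    rules = []
    for i in range(1, num_pairs + 1):
        lhs = f"A{i}"
        nxt = f"A{i + 1}" if i < num_pairs else None  # last column: no successor
        rules.append((lhs, "$", f"L{tree_id}{i}", f"R{tree_id}{i}", nxt))
    return rules

P1 = canonical_productions(1)  # productions for tree W1
```

Overlapping the production sets of W1, W2, and W3 then amounts to merging rules that differ only in their branch premises.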
FIG. 6. A synthesized pattern D22.
In GA, L1 is the premise representing the left branches of the first pairs in all trees in a cluster. Similar definitions are used for R1, L2, R2, etc. For a cluster consisting of trees W1, W2, and W3, the sets of sentences which L1, R1, L2, ..., L8, and R8 represent are given in Table 1. In this table, elements in a set are pairs of the form (v, p) where p = f(v) and f is the probabilistic word function defined on the set. A stochastic finite-state grammar can be inferred for each probabilistic set. For any probabilistic set, the underlying characteristic grammar is a finite-state grammar, GN, that generates {0, 1}^4, i.e., L(GN) = {0, 1}^4. The transition diagram of GN is shown in Fig. 4. The inference of a stochastic finite-state grammar from a probabilistic set follows the procedure described in [9], which is also given in the Appendix. The inferred grammars GL1, GL3, and GL4 for the probabilistic sets {(1-1-1-1, 1)}, {(1-1-1-1, 2/3), (0-1-1-1, 1/3)}, and {(1-1-0-0, ·), (1-1-1-0, ·)} are given in Fig. 5. The final stochastic tree grammar is the union of grammar GA and stochastic grammars GL1, GR1, ..., GL8, and GR8. During this final procedure, identical production rules are merged.
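The probability-assignment step can be sketched as follows, under our reading of [9]: since GN generates exactly {0, 1}^4, each position in a four-symbol string corresponds to one state of GN, and each production's probability is the total word-function mass of the sample strings that use it, normalized per state. The state names N1-N4 are assumptions for illustration.

```python
from collections import defaultdict

# A sketch of inferring a stochastic finite-state grammar from a
# probabilistic set over the characteristic grammar GN (L(GN) = {0,1}^4).

def infer_probabilities(prob_set):
    """prob_set: dict mapping four-symbol strings like '0-1-1-1' to f(v).
    Returns production probabilities keyed by (state, symbol)."""
    mass = defaultdict(float)        # (state, symbol) -> accumulated f(v)
    state_mass = defaultdict(float)  # state -> total mass through that state
    for word, p in prob_set.items():
        for k, sym in enumerate(word.split("-")):
            mass[(f"N{k + 1}", sym)] += p
            state_mass[f"N{k + 1}"] += p
    return {rule: m / state_mass[rule[0]] for rule, m in mass.items()}

# The probabilistic set of GL3 from Fig. 5:
GL3 = infer_probabilities({"1-1-1-1": 2 / 3, "0-1-1-1": 1 / 3})
```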
When two probabilistic sets have word functions similar to each other, the two sets can be merged and characterized by the same stochastic production rules. Thus, the final grammar can be further minimized.
A synthesized texture pattern is shown in Fig. 6. The pattern is generated from the premise matrix and the inferred stochastic tree grammars of pattern D22. Using the same procedure, the synthesized result for pattern D68 is shown in Fig. 7. A digitized binary picture of D68 in Brodatz' book is given in Fig. 8.
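The synthesis loop can be sketched as follows: each premise in the m × m matrix selects a set of stochastic grammars, a window is sampled from them, and the windows are tiled back into an image. Sampling whole half-column strings directly from the probabilistic sets, rather than walking the production rules of the inferred grammars, is a simplification we assume for brevity.

```python
import random

# A sketch of sampling one windowed pattern: one weighted draw per
# half-column from that column's probabilistic set.

def sample_window(column_sets, rng):
    """column_sets: list of dicts (one per half-column) mapping strings like
    '1-0-1-0' to probabilities. Returns the window as a list of columns."""
    cols = []
    for dist in column_sets:
        words = list(dist)
        weights = [dist[w] for w in words]
        word = rng.choices(words, weights)[0]       # weighted random draw
        cols.append([int(s) for s in word.split("-")])
    return cols

rng = random.Random(0)
sets = [{"1-1-1-1": 1.0}, {"0-1-1-1": 0.5, "1-1-1-1": 0.5}]
w = sample_window(sets, rng)
```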
FIG. 8. A digitized binary picture of a sample of pattern D68 from Brodatz's book.
Windowed patterns are first clustered into groups. Stochastic finite-state grammars that describe the growth of the tree trunk and the development of radiate branches are inferred for each cluster. The inferred grammars have been tested for texture synthesis. They could be used for discrimination as well. Windowed patterns are parsed with respect to the error-correcting tree automaton constructed from the inferred grammars. The membership of a windowed pattern is then determined based on the maximum-likelihood criterion [6].

APPENDIX
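The maximum-likelihood discrimination rule just mentioned can be sketched in a few lines. The per-class scoring functions below abstract away the error-correcting parsing of [6]; the class names and scores are invented for illustration.

```python
# A sketch of maximum-likelihood classification: score a windowed pattern
# with every class's likelihood function and pick the highest.

def classify(window, class_scores):
    """class_scores: dict mapping class name -> function(window) -> likelihood."""
    return max(class_scores, key=lambda c: class_scores[c](window))

# Hypothetical likelihoods standing in for parses against two texture classes.
scores = {
    "D22": lambda w: 0.8 if w.startswith("1") else 0.1,
    "D68": lambda w: 0.6 if w.startswith("0") else 0.2,
}
label = classify("1-1-0-0", scores)
```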
Given a sample set X = {x1, x2, ..., xN} and a characteristic grammar G = (V, Σ, P, S), assume that X ⊆ L(G). Let p_ij be the production probability associated with the production
REFERENCES

1. S. Y. Lu and K. S. Fu, A syntactic approach to texture analysis, Computer Graphics and Image Processing 7, 1978, 303-330.
2. P. Brodatz, Textures, Dover Publications, New York, 1966.
3. K. S. Fu and T. L. Booth, Grammatical inference: Introduction and survey, Parts I and II, IEEE Trans. Syst. Man Cybernet. SMC-5, January and July 1975.
4. R. M. Haralick, K. Shanmugam, and I. Dinstein, Textural features for image classification, IEEE Trans. Syst. Man Cybernet. SMC-3, November 1973.
5. J. S. Weszka, C. R. Dyer, and A. Rosenfeld, A comparative study of texture measures for terrain classification, IEEE Trans. Syst. Man Cybernet. SMC-6, April 1976.
6. S. Y. Lu and K. S. Fu, Structure-preserved error-correcting tree automata for syntactic pattern recognition, in Proceedings of the 1978 IEEE Conference on Decision and Control, Clearwater Beach, FL, December 1978.
7. K. S. Fu and S. Y. Lu, A clustering procedure for syntactic patterns, IEEE Trans. Syst. Man Cybernet. SMC-7, October 1977.
8. S. Y. Lu and K. S. Fu, A sentence-to-sentence clustering procedure for pattern analysis, in Proceedings of the First International Computer Software and Applications Conference (COMPSAC), Chicago, Illinois, November 8-11, 1977.
9. K. S. Fu, Stochastic languages for picture analysis, Computer Graphics and Image Processing 2, 1973, 433-453.