S. Y. Lu, Department of Electrical and Computer Engineering, Syracuse University, Syracuse, New York 13210

K. S. Fu, School of Electrical Engineering, Purdue University, West Lafayette, Indiana 47907

Received December 15, 1977

In a previous paper (S. Y. Lu and K. S. Fu, Computer Graphics and Image Processing 8, 1978), we have proposed a syntactic model for texture analysis. A texture pattern is divided into fixed-size windows. Windowed patterns are described by using tree representations and characterized by tree grammars. This paper introduces a texture grammar inference procedure which employs a clustering algorithm and a stochastic regular grammar inference procedure.

1. INTRODUCTION

In [1], we have proposed a syntactic model for texture analysis. In the model, a texture pattern is divided into fixed-size windows. A windowed pattern is represented by a tree of some chosen structure. Each tree node corresponds to a single pixel or a small homogeneous area of the windowed pattern. The (average) gray level of the small area determines the node label. A tree is then constructed such that the union of all the nodes covers the entire area of the windowed pattern. Consequently, we have a set of trees of the same structure obtained from the training patterns. Tree grammars are then constructed to characterize windowed patterns of the same class. These grammars can be used for texture synthesis as well as texture discrimination. The advantage of the proposed model is its computational simplicity. The decomposition of a pattern into fixed-size windows and the use of a fixed tree structure for representation make the analysis procedure and its implementation very easy. The independence between windowed patterns and between branches makes the model suitable for analysis on a parallel processor. However, the method is very sensitive to local noise and structural distortion such as shift, rotation, and fluctuation. We have proposed the use of stochastic tree grammars and higher level placement rules to model local noise and structural distortion respectively, and the use of nearest-neighbor recognition rule and error-correcting parsers for the classification of noisy and distorted texture patterns. The proposed model has been tested for texture synthesis and discrimination using selected patterns in Brodatz' book [2].


In [1], all the experiments use heuristically constructed grammars. In this paper, we apply a stochastic regular grammar inference procedure to texture grammar inference. The inferred texture grammars are then used to synthesize Brodatz' texture patterns. The definitions of probabilistic word functions, stochastic grammars, and stochastic languages given in [3] are briefly reviewed.

where Σ represents a finite set of terminal symbols, Σ* is the closure of Σ, and R is the set of real numbers.

DEFINITION 2. A language L ⊂ Σ* is called a stochastic language if there is a probabilistic word function f(v) defined on L.

DEFINITION 3. A stochastic finite-state grammar is a 5-tuple G = (N, Σ, P, δ, S), where N = {A1, A2, ..., An} is a finite set of nonterminals, Σ is a finite set of terminal symbols, S is the start symbol, P is a finite set of production rules, and δ is a mapping δ : P → R such that each production with premise Ai has a probability pij associated with it, j = 1, ..., ni, where ni is the number of productions with premise Ai. A stochastic grammar is proper if

$$\sum_{j=1}^{n_i} p_{ij} = 1 \quad \text{for every } A_i \in N.$$

FIG. 1. A digitized binary picture of pattern D22 from Brodatz's book.

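The probabilistic word function of a stochastic grammar is

$$f(v) = \sum_{n=1}^{N(v)} \prod_{k=1}^{K(v,n)} p_{k,n}(v),$$

where N(v) is the number of distinct leftmost derivations of v, K(v, n) is the number of steps in the nth derivation, and p_{k,n}(v) is the probability of the production used at the kth step of the nth derivation.

2. THE INFERENCE PROCEDURE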




The extraction of a tree representation in syntactic texture analysis is illustrated by an example. Figure 1 is a digitized binary picture of pattern D22 in Brodatz's book. Let the gray level of a single pixel be a primitive, and let the window size be 8 by 8 pixels. Figure 2 shows three windowed patterns. The tree representations of these patterns are given in Fig. 3(a), (b), and (c) respectively, where node label 0 corresponds to a white cell, and node label 1 corresponds to a black cell. The starting point of the tree construction is at the middle of the leftmost column of a window. A tree is developed from top to bottom while a window is scanned from left to right. Each pair of radiating branches corresponds to a column of a window: the left branch corresponds to the lower half of the column and the right branch corresponds to the upper half. The label "$" is a relational symbol indicating the concatenation between left and right branches and the continuity in a window.
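The column-by-column construction described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `window_to_branches`, the list-of-rows window layout, and the choice to read each branch outward from the middle of the column are not taken from the paper, which specifies only that the left branch covers the lower half of a column and the right branch the upper half.

```python
from typing import List, Tuple

# A window is a list of rows (top to bottom); 0 = white cell, 1 = black cell.
Window = List[List[int]]


def window_to_branches(window: Window) -> List[Tuple[str, str]]:
    """Return one (left, right) branch pair per column, scanning left to right.

    left  = lower half of the column, right = upper half of the column.
    Assumption (not from the paper): both branches are read outward from
    the middle of the column, where the trunk is assumed to run.
    """
    n = len(window)
    if n % 2 != 0 or any(len(row) != n for row in window):
        raise ValueError("expected a square window with an even side length")
    half = n // 2
    branches = []
    for col in range(n):
        column = [window[row][col] for row in range(n)]
        upper = column[:half][::-1]  # rows half-1, ..., 0 (middle to top)
        lower = column[half:]        # rows half, ..., n-1 (middle to bottom)
        left = "-".join(str(p) for p in lower)
        right = "-".join(str(p) for p in upper)
        branches.append((left, right))
    return branches


# Hypothetical 8 x 8 window: black upper half, white lower half.
example = [[1] * 8 for _ in range(4)] + [[0] * 8 for _ in range(4)]
pairs = window_to_branches(example)
```

For an 8 by 8 window this yields eight pairs of four-symbol branch strings, written in the same hyphenated style (for example 1-1-1-1) that the text uses for branch sentences.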



FIG. 3. Tree representations of the three windowed patterns W1, W2, and W3.

Windowed patterns of the same class are usually not identical. However, the repetition of similar windowed patterns can be detected by using a clustering procedure [7, 8]. Let a training pattern be an l × l array of pixels of the same texture, and let the window size be n × n. Windowed patterns are first grouped into a number of clusters based on a similarity measure. Assume that p clusters result from the clustering procedure. A premise is assigned to each cluster. Thus, the windowed patterns are represented by p premises. Consequently, the training pattern is described by an m × m matrix of premises, where m = l/n, with p different premises. The characteristics of a texture, such as regularity, repetition of subpatterns, etc., can be extracted from the matrix. Such characteristics can further be formalized by using placement rules for windowed patterns [1].

After a clustering procedure is applied to the windowed patterns, each cluster is a finite set of trees having the same structure. A stochastic tree grammar is then inferred for each cluster. The idea of using a grammar to characterize a sample set is to form a compact representation of the set or to predict other sentences which are of the same nature as the set. The use of a stochastic grammar further models the probabilistic distribution of a sample set by assigning probabilities to the production rules.


Consider the pattern D22 shown in Fig. 1. The entire training sample contains 400 windows. The window size is 8 by 8 pixels.
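As a rough illustration of the bookkeeping involved, the sketch below splits a square pattern into n × n windows and builds the m × m premise matrix with m = l/n. The side length of 160 pixels is an inference, not a figure from the paper: it follows from 400 windows of 8 by 8 pixels only if the training pattern is square. The function names and the `label_of` callback, which stands in for the cluster assignment, are hypothetical.

```python
from typing import Callable, List

Image = List[List[int]]


def split_into_windows(pattern: Image, n: int) -> List[List[Image]]:
    """Split an l x l pattern into an m x m grid of n x n windows, m = l / n."""
    l = len(pattern)
    if any(len(row) != l for row in pattern) or l % n != 0:
        raise ValueError("pattern must be square with side divisible by n")
    m = l // n
    grid = []
    for i in range(m):
        grid_row = []
        for j in range(m):
            rows = pattern[i * n:(i + 1) * n]
            grid_row.append([r[j * n:(j + 1) * n] for r in rows])
        grid.append(grid_row)
    return grid


def premise_matrix(grid: List[List[Image]],
                   label_of: Callable[[Image], str]) -> List[List[str]]:
    """Replace every window by the premise (cluster label) assigned to it."""
    return [[label_of(w) for w in grid_row] for grid_row in grid]


# Hypothetical 160 x 160 test pattern: a checkerboard of 8 x 8 blocks.
pattern = [[(r // 8 + c // 8) % 2 for c in range(160)] for r in range(160)]
grid = split_into_windows(pattern, 8)
# Stand-in for clustering: label a window by its top-left pixel.
matrix = premise_matrix(grid, lambda w: "P1" if w[0][0] else "P0")
```

With these assumptions the grid has m = 20 rows and columns, that is, 400 windows, and the premise matrix here alternates between two premises.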

Let the similarity between two windowed patterns be measured in terms of a distance or weighted distance between their tree representations. The distance between two trees of the same structure is defined as the least number of substitution transformations required to obtain one from the other [5]. Then the distances between W1 and W2, W1 and W3, and W2 and W3 in Fig. 3 are 6, 6, and 10 respectively. Using a sentence-to-sentence clustering procedure [8], the 400 training windowed patterns are grouped into X clusters.
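For trees of identical structure, the least number of substitutions is simply the number of node positions whose labels differ. The Python sketch below computes that distance and uses it in a simple threshold-based grouping. The grouping routine is a stand-in chosen for brevity, not the clustering procedure of [8], and the names `tree_distance`, `leader_clustering`, and the `threshold` parameter are assumptions.

```python
from typing import List, Sequence


def tree_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of label substitutions needed to turn tree a into tree b.

    Both trees are given as node labels listed in the same fixed order,
    which is possible because all trees share one structure.
    """
    if len(a) != len(b):
        raise ValueError("trees must have the same structure")
    return sum(1 for x, y in zip(a, b) if x != y)


def leader_clustering(trees: List[Sequence[int]], threshold: int) -> List[int]:
    """Assign each tree to the first cluster whose representative is close.

    Assumption: a tree joins a cluster when its distance to the cluster's
    first member is at most `threshold`; otherwise it starts a new cluster.
    """
    leaders: List[int] = []  # index of the first member of each cluster
    labels: List[int] = []
    for i, tree in enumerate(trees):
        for cluster, leader in enumerate(leaders):
            if tree_distance(tree, trees[leader]) <= threshold:
                labels.append(cluster)
                break
        else:
            leaders.append(i)
            labels.append(len(leaders) - 1)
    return labels


# Hypothetical four-node trees, for illustration only.
toy_trees = [(0, 0, 0, 0), (0, 0, 0, 1), (1, 1, 1, 1), (1, 1, 1, 0)]
toy_labels = leader_clustering(toy_trees, threshold=1)
```

A weighted distance, which the text also allows, would replace the unit cost per differing node by a position-dependent weight.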

In this inference problem, the sample set is a finite set of trees with the structure representing windowed patterns of some resemblance (i.e., in the same cluster). The tree structure is predetermined and fixed. Instead of using a tree grammar inference procedure, we shall consider the problem to be a two-dimensional finite-state string grammar inference problem: the growth of a tree trunk is along the first dimension, and branches radiate along the second dimension. The three trees W1, W2, and W3 shown in Fig. 3 are used to illustrate the inference procedure. Assume that they are in the same cluster. A finite-state














A canonical definite finite-state grammar [3] for W1 consists of production rules of the following form:

A1 → $(L11, A2, R11)


L11 and R11, L12 and R12, ..., L18 and R18 represent the first, the second, ..., and the eighth pair of branches of tree W1. Similar sets of production rules can also be obtained for W2 and W3. The canonical definite grammar is then simplified by overlapping the set of production rules constructed for a tree with those of all the other trees. The overlapping is under the assumption that all the trees in the same cluster are similar. We further rewrite the simplified grammar in the form of a tree grammar: GA = (N, Σ, P, A1), where




FIG. 6. A synthesized texture pattern.



In GA, L1 is the premise representing the left branches of the first pairs in all trees in a cluster. Similar definitions are used for R1, L2, R2, etc. For a cluster consisting of trees W1, W2, and W3, the sets of sentences which L1, R1, L2, ..., L8, and R8 represent are given in Table 1. In this table, elements in a set are pairs of the form (v, p) where p = f(v) and f is the probabilistic word function defined on the set. A stochastic finite-state grammar can be inferred for each probabilistic set. For any probabilistic set, the underlying characteristic grammar is a finite-state grammar, GN, that generates {0, 1}^4, i.e., L(GN) = {0, 1}^4. The transition diagram of GN is shown in Fig. 4. The inference of a stochastic finite-state grammar from a probabilistic set follows the procedure described in [9], which is also given in the Appendix. The inferred grammars for the probabilistic sets {(1-1-1-1, 1)}, {(1-1-1-1, Q), (0-1-1-1, $)}, and {(1-1-0-0, p), (1-1-1-0, i)} are given in Fig. 5. The final stochastic tree grammar is the union of grammar GA and stochastic grammars GL1, GR1, ..., GL8, and GR8. During this final procedure, identical

When two probabilistic sets have word functions similar to each other, the two sets can be merged and characterized by the same stochastic production rule. Thus, the final grammar can be further minimized.
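To make the notion of a probabilistic set and its stochastic grammar concrete, the sketch below builds the pairs (v, f(v)) from the branch strings of one cluster and then estimates production probabilities by relative frequency. It rests on assumptions that should be read as such: f(v) is taken to be the relative frequency of v in the cluster, the grammar is a prefix-tree grammar with one nonterminal per observed prefix, and the frequency estimate is a stand-in rather than the procedure of [9]. The sample strings are illustrative and are not the entries of Table 1.

```python
from collections import Counter, defaultdict
from fractions import Fraction
from typing import Dict, List, Tuple

Production = Tuple[Tuple[str, ...], str]  # (prefix acting as premise, next symbol)


def probabilistic_set(sentences: List[str]) -> Dict[str, Fraction]:
    """Pairs (v, f(v)) with f(v) the relative frequency of v (an assumption)."""
    counts = Counter(sentences)
    return {v: Fraction(c, len(sentences)) for v, c in counts.items()}


def estimate_productions(prob_set: Dict[str, Fraction]) -> Dict[Production, Fraction]:
    """Estimate production probabilities for a prefix-tree grammar.

    The probability of emitting symbol s after prefix u is the probability
    mass of sentences starting with u+s divided by the mass starting with u.
    """
    mass: Dict[Tuple[str, ...], Fraction] = defaultdict(Fraction)
    for v, p in prob_set.items():
        symbols = tuple(v.split("-"))
        for k in range(len(symbols) + 1):
            mass[symbols[:k]] += p
    productions = {}
    for prefix, p in mass.items():
        if prefix:
            parent = prefix[:-1]
            productions[(parent, prefix[-1])] = p / mass[parent]
    return productions


def word_probability(productions: Dict[Production, Fraction], v: str) -> Fraction:
    """f(v) as the product of the production probabilities along v."""
    symbols = tuple(v.split("-"))
    prob = Fraction(1)
    for k, s in enumerate(symbols):
        prob *= productions.get((symbols[:k], s), Fraction(0))
    return prob


# Illustrative branch strings from a hypothetical cluster of three trees.
sample = ["1-1-1-1", "1-1-1-1", "0-1-1-1"]
pset = probabilistic_set(sample)
prods = estimate_productions(pset)
```

Because each sentence has a single derivation in this grammar, the product of its production probabilities reproduces f(v) exactly, and the probabilities of the productions sharing a premise sum to one, so the estimated grammar is proper. Two probabilistic sets could then be compared by their word functions before deciding whether to merge them.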

A synthesized texture pattern is shown in Fig. 6. The pattern is generated by the premise matrix and the inferred stochastic tree grammars of pattern D22. Using the same procedure, the synthesized result for pattern D68 is shown in Fig. 7. A digitized binary picture of D68 in Brodatz' book is given in Fig. 8.
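The synthesis step can be sketched as follows: every entry of the premise matrix selects a cluster model, each branch of the window is drawn from that model's probabilistic set, and the windows are tiled into one image. The data layout (`ClusterModel` as a list of left and right branch distributions per column), the function names, the seed parameter, and the outward-from-the-middle pixel order are assumptions made for this sketch. Branches are sampled independently, in line with the independence between branches noted in the Introduction.

```python
import random
from typing import Dict, List, Tuple

BranchDist = Dict[str, float]                       # branch string -> probability
ClusterModel = List[Tuple[BranchDist, BranchDist]]  # (left, right) per column


def sample_branch(dist: BranchDist, rng: random.Random) -> List[int]:
    """Draw one branch string according to its probabilities."""
    strings = list(dist)
    weights = [dist[s] for s in strings]
    chosen = rng.choices(strings, weights=weights, k=1)[0]
    return [int(symbol) for symbol in chosen.split("-")]


def synthesize_window(model: ClusterModel, rng: random.Random) -> List[List[int]]:
    """Generate an n x n window, one column per (left, right) branch pair."""
    n = len(model)
    half = n // 2
    window = [[0] * n for _ in range(n)]
    for col, (left, right) in enumerate(model):
        lower = sample_branch(left, rng)   # left branch = lower half
        upper = sample_branch(right, rng)  # right branch = upper half
        for k in range(half):
            window[half + k][col] = lower[k]      # middle row downward
            window[half - 1 - k][col] = upper[k]  # middle row upward
    return window


def synthesize_texture(premises: List[List[str]],
                       models: Dict[str, ClusterModel],
                       seed: int = 0) -> List[List[int]]:
    """Tile windows generated for every entry of the premise matrix."""
    rng = random.Random(seed)
    rows: List[List[int]] = []
    for premise_row in premises:
        windows = [synthesize_window(models[p], rng) for p in premise_row]
        for r in range(len(windows[0])):
            rows.append([pixel for w in windows for pixel in w[r]])
    return rows


# Hypothetical 2 x 2 windows with deterministic branches, for illustration.
toy_models = {
    "A": [({"1": 1.0}, {"0": 1.0})] * 2,  # lower row black, upper row white
    "B": [({"0": 1.0}, {"1": 1.0})] * 2,  # lower row white, upper row black
}
toy_texture = synthesize_texture([["A", "B"]], toy_models)
```

For the patterns discussed in the text, each model would hold eight column entries with four-symbol branch strings, and the premise matrix would come from the clustering step.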



FIG. 8. A digitized binary picture of pattern D68 from Brodatz's book.



Windowed patterns are first clustered into groups. Stochastic finite-state grammars that describe the growth of the tree trunk and the development of radiating branches are inferred for each cluster. The inferred grammars have been tested for texture synthesis. They could be used for discrimination as well. Windowed patterns are parsed with respect to the error-correcting tree automaton constructed from the inferred grammars. The membership of a windowed pattern is then determined based on the maximum-likelihood criterion [6].

APPENDIX

Given a sample set X = {x1, x2, ..., xn} and a characteristic grammar G = (V, Σ, P, S), assume that X ⊆ L(G). Let pij be the production probability associated with the production


