Optics Communications 88 (1992) 485-493, North-Holland
Optimization of cascaded threshold logic filters for grayscale image processing

Akira Asano, Kazuyoshi Itoh and Yoshiki Ichioka
Department of Applied Physics, Osaka University, Yamadaoka 2-1, Suita, Osaka 565, Japan

Received 19 August 1991; revised manuscript received 20 October 1991
The threshold logic filter is a broad class of nonlinear filters. A design method for cascaded threshold logic filters is proposed, based on the learning rule for the multilayer neural network. Results of optimizing cascaded threshold logic filters, including the weighted median filter and the rank-order based nonlinear differential operator (RONDO), are presented for specific applications.
1. Introduction
Recently, nonlinear filters have attracted more attention than linear filters [1]. The median filter and its generalizations [2-7] form one of the most popular families of nonlinear filters. They have a remarkable characteristic: they remove impulsive noise effectively while preserving edges in images. However, these filters are considered difficult to design, since there are no general methods for their analysis and design. We recently proposed a new method [8,9] for the optimization of the weighted median filter (WMF) [10]. In this method, the operation of the WMF is reduced to threshold logic in the binary domain. Using the close relationship between the threshold logic operation and the function of an artificial neuron, we optimized the filter with the learning algorithm for feed-forward neural networks. The learning method has great advantages: it is very simple and needs no structural analysis or mathematical description of the behavior of the filter. Design by learning is therefore well suited to nonlinear filters that are difficult to analyze. In this paper, we extend our method to cascades of general threshold logic filters for grayscale image processing. We show an optimization method for cascaded threshold logic filters, based on the learning algorithm for the multilayer feed-forward
neural network model. In particular, we show the excellent performance of the trained cascade of the rank-order based nonlinear differential operator (RONDO) [11] and the WMF. The learning procedure requires only a number of examples of input and output image pairs. Some design methods for nonlinear filters have been proposed in the literature [12-14]; these methods, however, seem too complicated for optimizing cascaded filters. Section 2 draws attention to the relationship between the threshold logic filter and the feed-forward neural network. Section 3 describes the optimization of RONDO and of the cascade of RONDO and the WMF. Section 4 presents experimental results and discusses "contradiction" in the training set; we show that the ability of the trained filter tends to degrade as the training set contains more contradictions. Section 5 concludes our work. Since the aim of this paper is to propose a general design method for the threshold logic filter family, we do not compare its performance or computational complexity with previously proposed filtering methods; see refs. [15,16] for more information about other filtering methods.
2. Threshold logic filter and neural network
To deal with grayscale inputs and produce grayscale outputs by using the threshold logic filter, the threshold decomposition architecture [17,18] is used.
This architecture divides the filtering operation on gray-level images into three parts: (i) A grayscale image of n levels is decomposed into a set of (n-1) binary images; each pixel of the k-th binary image is 0 if the corresponding pixel value is less than k, and 1 otherwise. (ii) The binary operation of the threshold logic filter is applied to each binary image. (iii) All the resultant binary images are summed up pixel by pixel.

The grayscale filter is thus defined by the multiple binary operations linked by the threshold decomposition, and we explain the optimization in terms of the binary operation. The binary operation consists of a weighted summation of the pixel values within the window followed by a thresholding operation. Let T(q) be the weight for the relative pixel position q in the window, and let the corrupted and filtered images be denoted V1(.) and V2(.), respectively. Using these notations, the operation of the threshold logic filter at a pixel position Q is formulated as

V2(Q) = H( Σ_{q∈Ω} T(q) V1(Q+q) - C ),   (1)

where H is a threshold function, Ω is the filter window extent, and C is the threshold level. The binary version of the median filter is a typical example of the threshold logic filter. In this case, T(q) = 1 for all q ∈ Ω,

C = (1/2) Σ_{q∈Ω} T(q),   (2)

and

H(x) = 1, if x ≥ 0,
     = 0, if x < 0.   (3)

In the case of the weighted median filter, T(q) is an arbitrary positive integer. We recently proposed a new differential operator called the rank-order based nonlinear differential operator (RONDO). In this operator, we set C = 0 and

H(x) = 1,  if x ≥ β,
     = 0,  if -β < x < β,
     = -1, if x ≤ -β,   (4)

where β is a positive constant. Here we extend the definition of T(q) as T(q) = 0 if q ∉ Ω, and we define P = Q + q. Then we get from eq. (1) that

V2(Q) = H( Σ_P T(P-Q) V1(P) - C ).   (5)

The operation shown in eq. (5) can be considered as the state change of an artificial neuron V2(Q) determined by the states of the neurons V1(P) and the interconnections T(P-Q). This operation is equivalent to the double-layer feed-forward neural network model: the first layer expresses the input binary image, and the second layer expresses the filtered image. In eq. (5), T(P-Q) and C are determined only by the relative position between P and Q, which means that the network has shift-invariant interconnections. We can construct a multilayer network by connecting the output of one double-layer network to the input of another. Training the multilayer network then makes the combination of filters optimal for the given set of input and output images.
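To make eq. (1) and the threshold decomposition concrete, the following sketch applies the binary threshold logic operation to every binary slice of a grayscale image and sums the results. This is our own illustrative numpy code, not the authors' implementation; the zero-padded border handling and the function names are assumptions.

```python
import numpy as np

def threshold_logic_filter(binary_img, T, C):
    """Eq. (1): V2(Q) = H(sum_q T(q) V1(Q+q) - C), with H(x) = 1 for x >= 0, else 0.
    binary_img is a 2-D array of 0/1 values, T the weight window, C the threshold level.
    The border is handled by zero padding, which the paper does not specify."""
    kh, kw = T.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(binary_img, ((ph, ph), (pw, pw)))
    out = np.zeros_like(binary_img)
    for i in range(binary_img.shape[0]):
        for j in range(binary_img.shape[1]):
            window = padded[i:i + kh, j:j + kw]
            out[i, j] = 1 if np.sum(T * window) - C >= 0 else 0
    return out

def grayscale_threshold_logic_filter(gray_img, T, C, n_levels=256):
    """Threshold decomposition architecture: (i) decompose the n-level image into
    n-1 binary slices, (ii) filter every slice with the same binary operation,
    (iii) sum the filtered slices pixel by pixel."""
    result = np.zeros(gray_img.shape, dtype=int)
    for k in range(1, n_levels):
        slice_k = (gray_img >= k).astype(int)   # 0 where the pixel value is below k, 1 otherwise
        result += threshold_logic_filter(slice_k, T, C)
    return result

# Example: the 3 x 3 median filter, i.e. T(q) = 1 for all q and C equal to half the total weight.
T = np.ones((3, 3), dtype=int)
C = T.sum() / 2.0
```

For a 256-level image this direct decomposition loops over 255 slices; the sketch is meant only to mirror the architecture, not to be efficient.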
3. Optimization method

The multilayer neural network that contains the cascade of operations expressed by eq. (5) can be trained by the well-known algorithm called error back propagation (EBP). We adapt the EBP algorithm to our network model by modifying the threshold functions, since the EBP method requires the threshold function to be differentiable. We introduce the sigmoid function for this modification. There are, however, specific problems in the modification for each of the following cases: (i) the WMF, (ii) RONDO and (iii) the cascade of RONDO and the WMF.

(i) WMF. In this case, we directly replace the threshold function of eq. (3) with the sigmoid function
H(x) = 1 / (1 + exp(-λx)),   (6)
where λ is a positive value. When λ → ∞, eq. (6) is equivalent to eq. (3).
(ii) RONDO. The threshold function for RONDO is a combination of two simple step functions. To make this function differentiable, we introduce a combined sigmoid function,

H(x) = 1 / (1 + exp[-λ(x-β)]) + 1 / (1 + exp[-λ(x+β)]) - 1.   (7)
When λ → ∞, eq. (7) is equivalent to eq. (4).

(iii) Cascade of RONDO and WMF. In the optimization of the cascade, the second stage of the multilayer network is for the optimization of the WMF. The filter expressed by this stage essentially requires binary (0 or 1) inputs. Thus the first stage must output the absolute value of the output of RONDO. For this purpose, we modify eq. (7) as follows:

H(x) = (1 / [f(β) - f(-β)]) { [f(x-β) - f(x+β)] + [f(β) - f(-β)] },   (8)

where

f(x) = 1 / (1 + exp(-λx)).   (9)

This function is differentiable and its value lies in the range [0, 1). When λ → ∞, eq. (8) is equivalent to the absolute value of eq. (7). Illustrations of eqs. (6)-(8) are shown in fig. 1.
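For reference, the three modified threshold functions of eqs. (6)-(9) can be written out as follows. This is a sketch we add for illustration; the parameter names lam and beta and the numerical check are our own, not taken from the paper.

```python
import numpy as np

def f(x, lam):
    """Eq. (9): the basic sigmoid with steepness parameter lambda."""
    return 1.0 / (1.0 + np.exp(-lam * x))

def H_wmf(x, lam):
    """Eq. (6): differentiable replacement of the step function (3) for the WMF stage."""
    return f(x, lam)

def H_rondo(x, lam, beta):
    """Eq. (7): combined sigmoid approximating the three-level function (4);
    tends to +1 / 0 / -1 for x >> beta, |x| << beta, x << -beta as lambda grows."""
    return 1.0 / (1.0 + np.exp(-lam * (x - beta))) \
         + 1.0 / (1.0 + np.exp(-lam * (x + beta))) - 1.0

def H_cascade_first_stage(x, lam, beta):
    """Eq. (8): rescaled so the output stays in [0, 1), i.e. the absolute value of
    eq. (7) in the large-lambda limit, as required by the binary-input WMF stage."""
    norm = f(beta, lam) - f(-beta, lam)
    return ((f(x - beta, lam) - f(x + beta, lam)) + norm) / norm

# Rough check of the limits quoted in the text (large lambda behaves like the hard thresholds):
x = np.array([-2.0, 0.0, 2.0])
print(H_rondo(x, lam=50.0, beta=1.0))                 # approximately [-1, 0, 1]
print(H_cascade_first_stage(x, lam=50.0, beta=1.0))   # approximately [ 1, 0, 1]
```

As λ grows, these smooth functions approach the hard thresholds of eqs. (3) and (4), which is what permits training with EBP while recovering the original filter behavior afterwards.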
4. Experimental results

In our experiments, we train the network with a set of binary images. If we assume salt-and-pepper noise, a network (i.e. filter) trained on binary images with this type of noise is also appropriate for grayscale image processing through the threshold decomposition architecture; see fig. 2. The threshold logic filter is defined by binary operations, and each typical combination of pixels in a window of the grayscale step edge model of fig. 2a appears somewhere in the binary image of fig. 2b. Thus a network trained with the binary image (b) is sufficiently trained for the grayscale model (a). We use the binary image shown in fig. 3 for the training stage. Since it has a longer total edge length than fig. 2b, the network has more opportunities to learn edge detection in the presence of noise. The first experiment examines the effectiveness of removing white-level noise. The corrupted image is shown in fig. 4.
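As an illustration of how such a binary training pair can be generated, the sketch below builds a stripe-like ideal image and a copy corrupted by white-level noise with probability 15%, following the description of figs. 3 and 4. The image size, stripe width, and helper names are our own assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_training_pair(height=64, width=64, stripe_width=8, noise_prob=0.15):
    """Ideal output: a binary image of vertical stripes (cf. fig. 3).
    Input: the same image with each pixel replaced by the white level (1)
    with probability 15%, as in the white-level-noise experiment (cf. fig. 4)."""
    cols = np.arange(width)
    ideal = ((cols // stripe_width) % 2).astype(int)   # alternating vertical stripes
    ideal = np.tile(ideal, (height, 1))
    noisy = ideal.copy()
    noise_mask = rng.random(ideal.shape) < noise_prob
    noisy[noise_mask] = 1                              # white-level (maximum-level) noise
    return noisy, ideal

noisy, ideal = make_training_pair()
```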
Fig. 1. Illustrations of the functions (a) eq. (6), (b) eq. (7) and (c) eq. (8).

Fig. 2. Step edge models. (a) Illustration of grayscale image and (b) binary image.
Fig. 3. Ideal output for the learning stage.

Because of the edge effect, fig. 3 is smaller than fig. 4. Each pixel is replaced by a white-level (i.e. maximum-level) pixel with a probability of 15%. The learning algorithm trains the network to yield fig. 3 from fig. 4. The network has a perceptual area of 7 × 7 pixels for the interconnections between the input and intermediate layers, and 3 × 3 pixels between the intermediate and output layers.
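To make the structure concrete, here is a minimal numpy/scipy sketch of the forward pass of such a cascade: a RONDO-like first stage with a 7 × 7 window and C = 0, smoothed by eq. (8), followed by a WMF-like second stage with a 3 × 3 window, smoothed by eq. (6). The values of λ and β, the weight initialization, and the zero-padded border are our own assumptions; this is illustrative code, not the authors' implementation.

```python
import numpy as np
from scipy.ndimage import correlate

def sigmoid(x, lam):
    return 1.0 / (1.0 + np.exp(-lam * x))

def soft_abs_threshold(x, lam, beta):
    # Eq. (8): first-stage output kept in [0, 1) so it can feed the WMF stage.
    norm = sigmoid(beta, lam) - sigmoid(-beta, lam)
    return ((sigmoid(x - beta, lam) - sigmoid(x + beta, lam)) + norm) / norm

def forward(binary_img, T1, T2, C2, lam=10.0, beta=1.0):
    """Forward pass of the cascade: RONDO-like first stage (7x7 window, C = 0,
    smoothed by eq. (8)) followed by a WMF-like second stage (3x3 window,
    smoothed by eq. (6)). T1 and T2 are the trainable weight windows."""
    s1 = correlate(binary_img.astype(float), T1, mode='constant')   # C = 0 for RONDO
    v1 = soft_abs_threshold(s1, lam, beta)
    s2 = correlate(v1, T2, mode='constant') - C2
    return sigmoid(s2, lam)                                          # eq. (6)

# Illustrative shapes and initialization only; the paper does not give these details.
rng = np.random.default_rng(1)
T1 = rng.normal(scale=0.1, size=(7, 7))
T2 = np.ones((3, 3))
noisy_input = (rng.random((64, 64)) < 0.5).astype(float)             # stand-in binary image
output = forward(noisy_input, T1, T2, C2=T2.sum() / 2.0)
```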
Fig. 5. Illustrations of the resultant weight coefficients using the images corrupted by the white-level noise. (a) RONDO and (b) WMF.
Fig. 4. Noisy image for the learning stage, corrupted by the white-level noise of probability 15%.
The resultant weight coefficients after training are illustrated in figs. 5a and 5b. Each circle represents the weight coefficient at one position, and its value is indicated by the diameter of the circle; white circles mean positive coefficients and gray circles mean negative ones. One can see that the set of filters has been optimized to detect vertical edges. We apply the trained network to the corrupted image shown in fig. 6. The result of filtering by the trained cascade of RONDO and the WMF is compared with that of an untrained cascade using the fixed window coefficients illustrated in fig. 7; the coefficients of fig. 7a are for the RONDO part and those of fig. 7b for the WMF part. The illustrations show that this fixed set of weight coefficients is likewise intended for detecting vertical edges. Figs. 8a and 8b show the experimental results of these filters.
Fig. 6. Corrupted image for testing the filters. The statistics of the noise are the same as in fig. 4.

Fig. 7. Fixed windows for the cascade of (a) RONDO and (b) WMF.

These results clearly show the excellence of the trained filter. Table 1 gives measures of the average errors between the resultant images and the ideal output. The tabulated figures are the ratios of the number of erroneous pixels to the number of all pixels in the image.
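As a reference, the tabulated measure can be computed as in the following sketch (our own illustrative code, not the authors'; filter_fn and corrupt_fn are hypothetical stand-ins for a trained filter and the noise-corruption procedure):

```python
import numpy as np

def error_ratio(filtered, ideal):
    """Fraction of erroneous pixels: pixels differing from the ideal output
    divided by the total number of pixels, expressed in percent."""
    return 100.0 * np.mean(filtered != ideal)

def average_error(filter_fn, corrupt_fn, ideal, n_trials=10, seed=0):
    """Average and standard deviation over several corrupted images with
    independent noise realizations of the same statistics (ten in the paper)."""
    rng = np.random.default_rng(seed)
    errors = [error_ratio(filter_fn(corrupt_fn(ideal, rng)), ideal)
              for _ in range(n_trials)]
    return np.mean(errors), np.std(errors)

# Example with white-level corruption of probability 15% and a dummy identity "filter":
corrupt = lambda img, rng: np.where(rng.random(img.shape) < 0.15, 1, img)
ideal = np.zeros((64, 64), dtype=int)
print(average_error(lambda img: img, corrupt, ideal))   # roughly (15.0, small std)
```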
Table 1
Average errors of the filtered images corrupted by the white-level noise. The filter is trained with the white-level noise.

Filter                        Average error (%)   Standard deviation (%)
Trained filter                0.212               0.164
Cascade with fixed windows    1.80                0.149
We calculated these values experimentally by preparing ten corrupted images with different realizations of noise of the same statistics. The table also shows the distinct advantage of the trained filter. We trained another cascaded filter using salt-and-pepper noise with probability 15%. The resultant weight coefficients are illustrated in fig. 9. We also applied this filter to images corrupted by similar noise; the results are shown in fig. 10 and table 2. In the case of salt-and-pepper noise, too, the learning method is effective. We then applied the filter trained with the white-level noise to images corrupted by the salt-and-pepper noise. The results are shown in fig. 11 and table 3. We can see from these results that the performance of the filter trained with the white-level noise is better than that of the filter trained with the salt-and-pepper noise itself. This behavior may be explained by the problem of "contradiction". The binary filter can be regarded as separating the pixel patterns in its window into two output groups, 0 and 1.
Fig. 8. Results from the image corrupted by the white-level noise. (a) The trained filter and (b) the cascade of the filters with fixed windows.
Table 2
Average errors of the filtered images corrupted by the salt-and-pepper noise. The filter is trained with the salt-and-pepper noise.

Filter                        Average error (%)   Standard deviation (%)
Trained filter                0.254               0.0879
Cascade with fixed windows    1.80                0.286
Fig. 9. Illustrations of the resultant weight coefficients using the images corrupted by the salt-and-pepper noise. (a) RONDO and (b) WMF.

Fig. 10. Results from the image corrupted by the salt-and-pepper noise. (a) The trained filter using the salt-and-pepper noise and (b) the cascade of the filters with fixed windows.

Fig. 11. Results from the image corrupted by the salt-and-pepper noise.
Table 3
Average errors of the filtered images corrupted by the salt-and-pepper noise. The filter is trained with the white-level noise.

Filter            Average error (%)   Standard deviation (%)
Trained filter    0.0928              0.00490
Suppose that the inputs of a training set contain two pixel patterns that cannot be separated into different groups by the shift-invariant model. If these patterns are desired to be separated into different output groups, the training set contains a "contradiction". It seems that the training set with the salt-and-pepper noise has more contradictory input-output pairs than the set with the white-level noise, and that these contradictory pairs hinder the progress of the training. We have carried out an experiment to examine this hypothesis. We trained two networks separately, using input pictures corrupted by the salt-and-pepper noise and by the white-level noise, respectively, and measured the residual errors during each learning process. The results are illustrated in fig. 12. The learning process with the salt-and-pepper noise clearly leaves larger residual errors. Thus the filter trained with the white-level noise performs better.
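This notion can be made concrete with a small check (our own sketch with an assumed 3 × 3 window, not code from the paper): a window pattern that is required to map to both 0 and 1 at different positions of a training pair cannot be satisfied by any shift-invariant binary filter.

```python
import numpy as np

def count_contradictions(noisy, ideal, window=(3, 3)):
    """Count window patterns in the training input that are required to map to
    both 0 and 1 in the desired output; a shift-invariant filter cannot do both."""
    wh, ww = window
    targets = {}   # window pattern (as bytes) -> set of desired outputs
    H, W = noisy.shape
    for i in range(H - wh + 1):
        for j in range(W - ww + 1):
            patch = noisy[i:i + wh, j:j + ww].tobytes()
            desired = int(ideal[i + wh // 2, j + ww // 2])
            targets.setdefault(patch, set()).add(desired)
    return sum(1 for outs in targets.values() if len(outs) > 1)
```

Counting such patterns for the two noise types would give a rough indication of which training set is harder to fit.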
Fig. 13. Original grayscale image.
We also applied the cascaded filter trained with the white-level noise to grayscale images. The original image of 256 × 256 pixels with 8 bits per pixel is shown in fig. 13, and the image corrupted by the white-level noise of probability 15% is shown in fig. 14. This image is so heavily corrupted that the original image can hardly be recognized. The results of filtering fig. 14 by the trained filter and by the cascade of RONDO and the WMF with the fixed window coefficients of fig. 7 are shown in fig. 15. The image corrupted by the salt-and-pepper noise is shown in fig. 16, and the filtered images of fig. 16 obtained by the trained and untrained cascades are shown in fig. 17. These results show that the trained filter outputs strictly vertical edges only, which means that the learning algorithm optimizes the filter not only for the noise but also for the edge orientation.
Fig. 12. Residual errors versus the number of iterations (×100) for learning with the salt-and-pepper noise and with the white-level noise. The noise probability is (a) 15% and (b) 30%.
Fig. 14. Corrupted grayscale image by the white-level noise.
Fig. 16. Corrupted grayscale image by the salt-and-pepper noise.
5. Conclusions

We have proposed an optimization method for the threshold logic filter. This method utilizes the relationship between the threshold logic filter and the feed-forward neural network, so that the filters can be optimized by
the learning algorithm for the networks. The learning process is simple and easy. We have examined experimentally the performance of the cascade of RONDO and the WMF trained by our method, and have shown the excellent performance of the trained cascaded filters from the statistical point of view.
Fig. 15. Results from fig. 14. (a) The trained filter using the white-level noise and (b) the cascade of the filters with fixed windows.
Fig. 17. Results from fig. 16. (a) The trained filter using the white-level noise and (b) the cascade of the filters with fixed windows.
We have measured the residual errors during the learning process, and have found that the filter trained with the white-level noise has better performance than that trained with the salt-and-pepper noise, even for test images corrupted by the salt-and-pepper noise. To improve the efficiency of our method, we have to study the training sets further. In our experiments, we prepared a training set of stripe-like images, and the results show that the filter is trained to detect strictly vertical edges. If some tolerance of edge orientation is needed in the output images, training sets of edge images at several angles have to be prepared. How to determine the training set systematically is a problem for future work.
References

[1] V. Kim and L.P. Yaroslavsky, Comput. Vision Graphics Image Process. 35 (1986) 234.
[2] T.S. Huang, ed., Topics in Applied Physics: Two-Dimensional Digital Signal Processing II (Springer, Berlin, 1981).
[3] G.R. Arce and R.E. Foster, IEEE Trans. Acoust. Speech Signal Process. 37 (1989) 83.
[4] R. Wichman, J.T. Astola, P.J. Heinonen and Y.A. Neuvo, IEEE Trans. Acoust. Speech Signal Process. 38 (1990) 2108.
[5] A.C. Bovik, T.S. Huang and D.C. Munson, Jr., IEEE Trans. Acoust. Speech Signal Process. ASSP-31 (1983) 1342.
[6] K. Itoh, Y. Ichioka and T. Minami, Appl. Optics 27 (1988) 3445.
[7] A. Asano, K. Itoh and Y. Ichioka, Pattern Recognition 23 (1990) 1059.
[8] A. Asano, W. Zhang, K. Itoh and Y. Ichioka, Pattern Recognit. Lett. 11 (1990) 557.
[9] A. Asano, K. Itoh and Y. Ichioka, Optics Lett. 16 (1991) 168.
[10] D.R.K. Brownrigg, Commun. ACM 27 (1984) 807.
[11] A. Asano, K. Itoh and Y. Ichioka, Jpn. J. Appl. Phys. Part 2 29 (1990) 1270.
[12] J.-H. Lin and E.J. Coyle, IEEE Trans. Acoust. Speech Signal Process. 38 (1990) 663.
[13] J.-H. Lin, T.M. Sellke and E.J. Coyle, IEEE Trans. Acoust. Speech Signal Process. 38 (1990) 938.
[14] M. Gabbouj and E.J. Coyle, IEEE Trans. Acoust. Speech Signal Process. 38 (1990) 955.
[15] L.P. Yaroslavsky, Digital Picture Processing: An Introduction (Springer, Berlin, 1984).
[16] E.J. Coyle, J.-H. Lin and M. Gabbouj, IEEE Trans. Acoust. Speech Signal Process. 37 (1989) 2037.
[17] J.P. Fitch, E.J. Coyle and N.C. Gallagher, Jr., IEEE Trans. Acoust. Speech Signal Process. ASSP-32 (1984) 1183.
[18] P.D. Wendt, E.J. Coyle and N.C. Gallagher, Jr., IEEE Trans. Acoust. Speech Signal Process. ASSP-34 (1986) 898.