Adaptive neuro-fuzzy inference systems based approach to nonlinear noise cancellation for images


Fuzzy Sets and Systems 158 (2007) 1036 – 1063 www.elsevier.com/locate/fss

Hao Qin∗, Simon X. Yang
Advanced Robotics and Intelligent Systems (ARIS) Lab, School of Engineering, University of Guelph, Guelph, Ont., Canada N1G 2W1
Received 10 April 2005; received in revised form 25 January 2006; accepted 27 October 2006; available online 24 January 2007

Abstract

The adaptive neuro-fuzzy inference system (ANFIS) is an effective tool for inducing rules from observations, e.g. in pattern recognition. In this paper, we extend nonlinear noise cancellation using ANFIS from 1-D signals to their 2-D counterparts, images. First, restoration of an image contaminated with Gaussian noise is investigated for nonlinear passage dynamics of order 2. We examine eight types of membership function (MF): bell, triangular, Gaussian, two-sided Gaussian, pi-shaped, product of two sigmoidal MFs, difference of two sigmoidal MFs, and trapezoidal. In addition, other parameters, such as the number of training epochs, the number of MFs per input, the optimization method, the type of output MF, and the over-fitting problem, are investigated. For comparison with noise cancellation using ANFIS, we simulate 22 conventional filtering techniques: spatial filters, the optimal Wiener filter, frequency-domain filters, wavelet and wavelet-packet filters, 2-D adaptive filters, etc. In terms of mean square error (MSE), the quality of image restoration using the proposed ANFIS noise cancellation (with the pi-shaped MF) is at least 75 times better for Gaussian noise than that obtained with any of these conventional filtering techniques.

© 2007 Published by Elsevier B.V.

Keywords: Noise cancellation; Adaptive neuro-fuzzy inference system; Image degradation; Fuzzy reasoning; Image restoration

1. Introduction

Images are often corrupted by noise due to a noisy channel or a faulty image acquisition device. Much research has been done on removing noise; the aim is to suppress the noise while preserving the integrity of the edges and details of images. For example, mean filtering [6] is a linear technique for removing random Gaussian noise. However, while it removes the noise, it also removes too much detail and edge sharpness. Median filtering [6] is a nonlinear technique known for its effectiveness in removing noise, especially impulse noise, while preserving edge sharpness [1,15,17]. Even though median filters can suppress and remove noise from contaminated images, they introduce too much signal distortion, and features such as sharp edges are lost. Also, the performance advantages of these conventional approaches can only be achieved when the occurrence probability of noise is small; due to their lack of adaptability, variations of the median filter cannot perform well when this probability exceeds 0.2. Other conventional filtering methods all face the same limitation as the median filter.

∗ Corresponding author. Tel.: +1 519 824 4120 x 52437.
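The trade-off between mean and median filtering described above can be illustrated with a small 1-D sketch (the signal values and window width are arbitrary choices for illustration):

```python
import statistics

# Compare mean vs median filtering on a toy 1-D signal containing a
# single impulse (signal and window width are illustrative).
signal = [10, 10, 10, 255, 10, 10, 10]   # impulse at index 3

def filter_signal(s, width, reducer):
    half = width // 2
    out = []
    for i in range(len(s)):
        window = s[max(0, i - half): i + half + 1]
        out.append(reducer(window))
    return out

mean_out = filter_signal(signal, 3, lambda w: sum(w) / len(w))
median_out = filter_signal(signal, 3, statistics.median)
# The median filter removes the impulse completely, while the mean
# filter smears it across neighbouring samples.
```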

E-mail addresses: [email protected] (H. Qin), [email protected] (S.X. Yang).
0165-0114/$ - see front matter © 2007 Published by Elsevier B.V.
doi:10.1016/j.fss.2006.10.028


With the application and development of soft computing, more and more fuzzy and neural-network methods have been developed to solve practical problems. Recently, adaptive systems based on neural networks [7,11] or fuzzy theory [2,14,18] with data-driven adjustable parameters have emerged as attractive alternatives. Methods such as the noise adaptive soft-switching median filter [3], optimal detail-restoring stack filters [27], hybrid FIR weighted order statistics filters [26], and adaptive average iterative filters [24] have been developed. Forero-Vargas and Delgado-Rangel [4] designed several fuzzy filter methods of wide applicability to image filtering and presented a survey of different design techniques for fuzzy filters. These fuzzy filters are the multipass fuzzy filter, fuzzy multilevel median filter, histogram adaptive filter, fuzzy vector rank filter, fuzzy vector rational median filter, and fuzzy credibility color filter. They also evaluated filter performance using criteria such as mean average error, mean square error, normalized mean square error, signal-to-noise error ratio, and mean chromaticity error. However, the performance of some fuzzy filters suffers when types of noise other than impulse noise are present. For noise reduction in images, Kwan [12] presented seven fuzzy filters: the Gaussian fuzzy filter with median center, the symmetrical triangular fuzzy filter with median center, the asymmetrical triangular fuzzy filter with median center, the Gaussian fuzzy filter with moving average center, the symmetrical triangular fuzzy filter with moving average center, the asymmetrical triangular fuzzy filter with moving average center, and the decreasing weight fuzzy filter with moving average center. Every fuzzy filter applies a weighted membership function (MF) to a subimage window of the image to determine the center pixel.
It is difficult to choose the window width, because a small window width is appropriate for a low level of noise and a large one for a high level of noise. Kalaykov and Tolt [10] proposed a filter for reducing mixed noise in images. The filter evaluates fuzzy similarities between pixels in a local subimage processing window, and the algorithm is suitable for high-speed hardware implementation. The filter includes two tunable parameters: one associated with the similarity between the central pixel and a template pixel, and another associated with the similarity between two pixels within the template. It is robust against changes in noise distribution. The fuzzy similarity based filter is less efficient for a high intensity of salt-and-pepper noise, because the intensity change is assigned a large weight in the defuzzification, implying an inaccurate central-pixel update. Fotopoulos et al. [5] described the design and evaluation of three fuzzy color filters based on a specific local image model: the multichannel fuzzy filter (MFF), a multichannel filter based on local histogram values (HMF), and a fuzzy color filter using relative entropy (RELF). They introduced three different statistical indices as fuzzy variables, used to detect flat regions, edges and ramps; these three types of area should be handled in different ways. The fuzzy variables are extracted from an estimate of the local distribution: the summation of the Parzen estimator potential functions, the maximum/minimum value of the local distribution, and the relative entropy serve as the indices that determine the type of local area. The results of RELF are degraded a little by the presence of impulse noise, and MFF is less efficient than the others due to the inaccurate features employed.
The proposed noise cancellation in degraded 2-D signals with an adaptive neuro-fuzzy inference system (ANFIS) is inspired by Widrow and Glover's [25] adaptive noise cancellation model for 1-D signals, in which the objective is to filter out noise by identifying a nonlinear model between a measurable noise source and the corresponding unmeasurable interference, removing noise in 1-D signals with Jang's ANFIS. In ANFIS, neural networks contribute their structures, with abundant theorems and efficient numerical training algorithms, while fuzzy systems can directly encode structured knowledge. The combination of fuzzy systems and neural networks generates an adaptive system in which neural networks are embedded in an overall fuzzy architecture, generating and refining fuzzy rules from training data. Because ANFIS has been successfully applied in 1-D signal processing, we decided to investigate techniques like ANFIS for nonlinear noise cancellation in images.

Many real-world cases also fit the model of nonlinear noise cancellation for images with ANFIS. Assume that there is a picture in a box covered by an unknown thick membrane. After a strong light (the source noise) shines on it for a long time, the picture fades. To restore the quality of the picture, we can scan it into a computer and measure the source light (noise). However, we cannot measure the distorted noise after it has passed through the unknown thick membrane for the unknown shining time. Yet the faded picture (corrupted image) can be measured. Therefore, this situation can be handled by the noise cancellation model with ANFIS that we present later in this paper.


[Figure: the corruption model. A measurable source noise v1(i) passes through membranes, mirrors or the atmosphere to produce an unmeasurable distorted noise v0(i), which is added to the unknown image signal s(i) to give the measurable corrupted signal x(i) = s(i) + v0(i).]

Fig. 1. Corruption model from the real world used in this paper.

In another example, a picture is placed on a table. When a person opens a window, the light shines on the picture through the reflection in the glass and the picture eventually fades. We can measure the source light (noise) and the faded picture, but we cannot measure the reflected light (distorted noise) because we cannot measure the original angle. We can also use the proposed model to restore the faded picture in this scenario. Further, a satellite takes pictures of the earth through the atmosphere from outer space. It is easy to measure the source light (noise) because we can build the optical instrument into the satellite. However, we cannot measure the distorted light (noise) after the light passes through the atmosphere, because some of the light is refracted when it meets water droplets in the atmosphere. We can also record and measure the corrupted images. Thus, we can use the noise cancellation model with ANFIS to restore the images taken by the satellite.

The proposed method explores the use of adaptive network-based fuzzy inference systems for noise cancellation. The main application of the work is noise removal from corrupted 2-D signals. The model of corruption we summarize from the above examples is shown in Fig. 1. Here an information signal s(i) is unmeasurable and a noise source signal v1(i) is measurable. The noise source goes through membranes, mirrors or the atmosphere to generate a distorted noise v0(i). It is then added to the information signal s(i) to compose an output signal x(i) which is measurable. We obtain the following equation:

x(i) = s(i) + v0(i). (1)
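The corruption model of Fig. 1 and Eq. (1) can be sketched as follows; the particular nonlinear passage f and the signal sizes are made-up choices for illustration:

```python
import numpy as np

# Sketch of the corruption model in Fig. 1.  The nonlinear passage
# below is a hypothetical stand-in for membranes/mirrors/atmosphere.
rng = np.random.default_rng(0)

s = rng.uniform(0.0, 1.0, size=1000)          # unknown image signal s(i)
v1 = rng.normal(0.0, 0.1, size=1000)          # measurable source noise v1(i)

def passage(v):
    """A made-up nonlinear, memory-one passage f(v1(i), v1(i-1))."""
    v_prev = np.concatenate(([0.0], v[:-1]))  # v1(i-1), zero-padded at i=1
    return np.sin(v) + 0.5 * v * v_prev

v0 = passage(v1)                              # unmeasurable distorted noise v0(i)
x = s + v0                                    # measurable corrupted signal, Eq. (1)
```

Only v1 and x would be observable in practice; s and v0 are kept here so the model can be inspected.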

Our goal is to recover the information signal s(i) from the compound output signal x(i), which consists of the information signal s(i) plus v0(i), a distorted and delayed version of v1(i), using the proposed ANFIS. In the next section, the proposed method for image denoising with the ANFIS algorithm is presented. Simulation results of denoising an image with ANFIS are described in Section 3, where we also discuss the other parameters of ANFIS and compare its results with conventional filtering techniques. Summaries and analyses are given in Section 4. Finally, a conclusion is presented in Section 5.

2. The proposed method for image denoising with ANFIS

In this section, we derive the 2-D counterparts from 1-D signals, and then we use them in image processing applications.

2.1. Transformation and nonlinear passive dynamics

As we already know, a 2-D image is described as a matrix, which can also be uniquely expressed as a 1-D sequence by connecting the rows (or columns), as shown in Fig. 2. Assume a 2-D image matrix has M rows and N columns. A pixel S(j, k) can be uniquely represented by a point in its corresponding 1-D sequence vector s as

s(i) = S(j, k), (2)


[Figure: an M × N image matrix whose pixels x(1), x(2), x(3), …, x(N) (first row), x(N+1), x(N+2), x(N+3), …, x(2N) (second row), …, x((M−1)N+1), …, x(MN) (last row) are concatenated row by row into a single 1-D sequence.]

Fig. 2. A 1-D sequence vector after a 2-D image matrix is connected by rows (or columns).

[Figure: noise-cancellation architecture. The measurable source noise v1(i) passes through an unknown nonlinear function f(.) to give the distorted noise v0(i), which is added to the information signal s(i) to form x(i) = s(i) + v0(i). The ANFIS output y(i), driven by v1(i), is subtracted from x(i) to give the error e(i) = s(i) + v0(i) − y(i).]

Fig. 3. The architecture of ANFIS for noise cancellation.

where j = 1, 2, . . . , M, k = 1, 2, . . . , N; S is the original image matrix; s is the 1-D transformed sequence vector. The index number i is obtained by

i = (j − 1)N + k, (3)

where i = 1, 2, . . . , MN.

2.2. Nonlinear adaptive noise cancellation with ANFIS

A typical application of an adaptive filter is as an adaptive noise canceller. This technique offers good filtering in a wide variety of applications. Adaptive noise cancellation is used when the signal is very weak or cannot be measured in noise fields. It filters the auxiliary or reference signals from one or several sensors and subtracts them from the original signals. As a result, the original noise is attenuated or eliminated by cancellation. The concept of linear adaptive noise cancellation can be extended to the nonlinear domain by using nonlinear adaptive systems. ANFIS [9] can be used to identify unknown nonlinear passage dynamics that transform a noise source into an interference component in a detected signal. In some cases, the proposed nonlinear adaptive cancellation technique is more suitable for filtering certain noise components than noise cancellation techniques based on frequency-selective filtering. The schematic diagram of adaptive noise cancellation with ANFIS is shown in Fig. 3. Here an information signal s(i) is unmeasurable and a noise source signal v1(i) is measurable. The noise source goes through an unknown nonlinear function to generate a distorted noise v0(i) which is unmeasurable. It is then added to
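The row-major transformation of Section 2.1 can be sketched as follows; the small matrix and its dimensions are illustrative:

```python
import numpy as np

# Row-major 2-D -> 1-D transformation: pixel S(j, k) (1-based indices)
# maps to s(i) with i = (j - 1) * N + k, so that S(1, 1) -> s(1) and
# S(M, N) -> s(MN), consistent with Fig. 2.
def flatten_index(j, k, N):
    return (j - 1) * N + k

M, N = 3, 4
S = np.arange(1, M * N + 1).reshape(M, N)   # toy 3x4 "image"
s = S.reshape(-1)                           # NumPy's C order is row-major

# The formula reproduces NumPy's flattening (1-based vs 0-based indices).
for j in range(1, M + 1):
    for k in range(1, N + 1):
        assert s[flatten_index(j, k, N) - 1] == S[j - 1, k - 1]
```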


an information signal s(i) to compose an output signal x(i) which is measurable. Our goal is to recover the information signal s(i) from the compound output signal x(i), which consists of the information signal s(i) plus v0(i), a distorted and delayed version of v1(i). The detected output signal is presented as

x(i) = s(i) + v0(i) = s(i) + f(v1(i), v1(i − 1), . . .), (4)

where the function f(.) denotes the nonlinear function that the noise source signal v1 goes through. If we knew the function f(.) exactly, we could easily retrieve the original information signal by computing v0(i) from v1(i) and subtracting it from x(i) directly. However, f(.) is usually unknown in advance and may change with time. The spectrum of v0(i) may overlap that of s(i) to some extent, so the result is poor when a common frequency-domain filter is adopted. The noise signal v1(i) and the distorted noise signal v0(i) are unrelated to the information signal s(i); however, v0(i) cannot be measured directly because it is part of the overall measurable signal x(i). The detected signal x(i) can be used as the expected output for ANFIS training only if the information signal s(i) is not correlated with the noise signal v1(i), as shown in Fig. 3. Adaptive noise cancellation has two inputs: the detected output signal x(i), also called the original input, and the noise source signal, also called the reference input v1(i). The reference input v1(i) is correlated with the distorted noise signal v0(i) and not correlated with the information signal s(i). ANFIS uses the error signal e to adjust its weights W so that the ANFIS output, denoted y(i), approximates the distorted noise signal v0(i) in the measured signal x(i); then e(i) = x(i) − y(i) approaches the information signal s(i). Because s(i) is not correlated with v0(i) and v1(i),

e(i) = x(i) − y(i) = s(i) + v0(i) − y(i). (5)

We square the above equation and obtain

e(i)² = (s(i) + v0(i) − fˆ(v1(i), v1(i − 1), v1(i − 2), . . .))², (6)

where fˆ is the function implemented by ANFIS. Since s(i) is not related to v1(i) and its previous values, ANFIS has no way to remove the component contributed by s(i); the information signal s(i) acts as an uncorrelated “noise’’ component in the data-fitting process, and ANFIS can do no better than its steady-state value with respect to it. What ANFIS can do is minimize the error component (v0(i) − y(i)), that is, (v0(i) − fˆ(v1(i), v1(i − 1), v1(i − 2), . . .))². Eq. (6) can be expanded to

e(i)² = (s(i))² + (v0(i) − y(i))² + 2s(i)v0(i) − 2s(i)y(i). (7)

Taking expectations of both sides of Eq. (7) yields

E[e²] = E[s²] + E[(v0 − y)²] + 2E[sv0] − 2E[sy]. (8)

Because s(i) is not correlated with v0(i), v1(i) and y(i), E[sv0] and E[sy] are equal to zero, so

E[e²] = E[s²] + E[(v0 − y)²]. (9)

Here E[s²] does not change when ANFIS adjusts its MFs to minimize E[e²], because it does not depend on the weights W. So minimizing E[e²] is equivalent to minimizing the second term, E[(v0 − y)²], in Eq. (9), such that the ANFIS function fˆ(.) approaches the passage dynamics f(.) as closely as possible:

E[e²]min ⟺ E[(v0 − y)²]min. (10)
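The orthogonality argument behind Eqs. (7)-(9) can be checked numerically; the signals and the imperfect estimate y below are synthetic choices for illustration:

```python
import numpy as np

# Numerical check of Eq. (9): when s is uncorrelated with the residual
# (v0 - y), E[e^2] = E[s^2] + E[(v0 - y)^2].  All signals are synthetic.
rng = np.random.default_rng(1)
n = 200_000

s = rng.normal(0.0, 1.0, n)   # information signal, independent of the noise path
v1 = rng.normal(0.0, 1.0, n)  # source noise
v0 = np.tanh(v1)              # distorted noise through a made-up nonlinearity
y = 0.8 * np.tanh(v1)         # an imperfect filter estimate of v0

e = s + v0 - y                # Eq. (5)
lhs = np.mean(e ** 2)
rhs = np.mean(s ** 2) + np.mean((v0 - y) ** 2)
# lhs and rhs agree up to the sample estimate of the vanishing cross terms.
```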

To simplify the discussion, we assume that: (1) the information signal s(i) is a zero signal for all i; and (2) only the consequent parameters of ANFIS are updated, using the least-squares method, while the premise parameters are held fixed. Item 1 means that we can obtain ideal training data affected only by measurement noise. Item 2 means that only linear parameters are used in ANFIS. Even with ideal training data, ANFIS with adjustable parameters will still produce a fitting error e(i), equal to the difference between the expected output and the ANFIS output; this error is caused by measurement noise and/or modelling errors. If the error e(i) has zero expectation, the consequent parameters obtained by the least-squares method in ANFIS are unbiased, which is the characteristic of the linear least-squares estimator (LSE).


The previous assumptions can be relaxed. Item 1 states that s(i) is a zero signal, which is unrealistic in the real world. Because s(i) is an additive component, we can still obtain unbiased consequent parameters by LSE as long as s(i) has zero expectation. Item 2 requires ANFIS to update only its consequent parameters. In the simulations we use the proposed hybrid learning rule (the backpropagation learning rule combined with LSE) to update both the premise (nonlinear) and consequent (linear) parameters, and we also compare the pure backpropagation learning rule with the hybrid learning rule. Even when the information signal s(i) does not have zero expectation and the consequent parameters obtained by the least-squares method are therefore biased, the system retains the capacity to reduce modelling errors further, because ANFIS is a nonlinear model and the Gauss–Markov theorem no longer holds. Our experimental data also support this result.

If a linear filter replaces the ANFIS block in Fig. 3, the scheme reduces to the adaptive linear noise cancellation proposed by Widrow and Glover [25]. By replacing the linear filter with a nonlinear ANFIS filter, we can handle a wide range of nonlinear passage dynamics. Before we attempt simulations, we emphasize the conditions under which adaptive noise cancellation is valid:
• The information signal s(i) should be unrelated to the available noise signal v1(i).
• The order of the passage dynamics can be obtained from experiment and observation, so that the number of inputs to the ANFIS filter is determined.

2.3. Algorithm of ANFIS for nonlinear passive dynamics of order 2

The key characteristic of neural networks is their capacity for adaptive learning. Using this adaptive learning to analyze and model systems yields the technique of adaptive neural networks, a very effective tool for building a model of a fuzzy system. The most important aspect of the adaptive network-based fuzzy inference system (ANFIS) [8,13,21,22] is that the system is modelled from data: the MFs and fuzzy rules in ANFIS are obtained by learning from a large amount of known data rather than by arbitrary selection based on experience or intuition. This is very important when working with systems whose characteristics are unknown or complex. The shapes of the MFs depend on parameters, and varying these parameters changes their shapes and some important properties.

2.3.1. Overview of ANFIS

The design of ANFIS does not depend on models of the objects; it depends considerably on the experience and knowledge of experts and operators. The construction of ANFIS is well suited to expressing qualitative or fuzzy experience and knowledge, represented by if-then fuzzy rules. If the available experience is limited, the design cannot be expected to produce a good control effect. Adaptation is a good solution to this problem. However, adaptive methods bring great difficulty to the design and construction of the system: they require specialized, profound theories, and the applicability of different adaptive theories and methods is narrow. Thus the design and realization of ANFIS are difficult. On the other hand, the development of fuzzy sets and neural networks (NN) has hastened the progress of intelligent control. These are two different areas, and the differences between them in basic theory are great, but both belong to artificial intelligence, and theory and practice show that they can be combined.

An adaptive network-based fuzzy inference system is also called an adaptive neuro-fuzzy inference system (ANFIS). Because fuzzy reasoning has no adaptive learning capability, its application is greatly limited. Neural networks, in turn, cannot express fuzzy language and, like a black box, lack transparency; they cannot express a reasoning function the way our brains do. ANFIS combines the two organically: it not only develops the advantages of both but also compensates for the shortcomings of each. The FIS has the obvious disadvantage of lacking an effective learning mechanism, and the outstanding property of ANFIS is that it compensates for this with the learning mechanism of NN.


[Figure: five-layer ANFIS network. The inputs v1(i) and v1(i−1) feed Layer 1 (membership functions A1, A2 and B1, B2), Layer 2 (Π nodes computing the firing strengths wn), Layer 3 (N nodes computing the normalized strengths w̄n), Layer 4 (adaptive rule nodes 1-4), and Layer 5 (a Σ node producing the output y(i)).]

Fig. 4. The architecture of ANFIS for a two-input first-order Sugeno fuzzy model with two MFs and four rules.

Mamdani FIS and Sugeno FIS have their advantages and disadvantages. Because the form of its rules fits the habits of human thought and language expression, a Mamdani FIS can easily express human knowledge. However, it involves complex calculation and does not lend itself to mathematical analysis. On the other hand, simple calculation is the strong point of the Sugeno FIS, which lends itself to mathematical analysis. It can easily be combined with proportional, integral, derivative (PID) control, optimization and adaptive methods. With a Sugeno FIS we can obtain controllers with optimal and adaptive ability, or tools for fuzzy modelling. Owing to these properties, it is used to construct ANFIS with NN, which has adaptive learning abilities. The combination of fuzzy logic and NN is an important research area of intelligent computing. ANFIS, which combines fuzzy logic and NN, has the advantage that fuzzy sets easily express human knowledge while NN provides distributed message storage and learning. It provides an effective tool for modelling and control in complex systems.

2.3.2. Algorithm of ANFIS for nonlinear passive dynamics of order 2

In theory, the architectures of ANFIS for the standard Mamdani model and Sugeno model have well-established design methods. Corresponding to the first-order Sugeno model, Jang et al. [9] present an ANFIS similar to a first-order Sugeno FIS. The architecture of ANFIS with two inputs is shown in Fig. 4. Each node within a given layer has the same function. For nonlinear passive dynamics of order 2, the FIS has two inputs v1(i) and v1(i − 1) and one output y. For a first-order Sugeno FIS [20,23,19], a common rule set with four fuzzy if-then rules is as follows:

Rule 1: If v1(i) is A1 and v1(i − 1) is B1, then y1 = p1 v1(i) + q1 v1(i − 1) + r1.
Rule 2: If v1(i) is A1 and v1(i − 1) is B2, then y2 = p2 v1(i) + q2 v1(i − 1) + r2.
Rule 3: If v1(i) is A2 and v1(i − 1) is B1, then y3 = p3 v1(i) + q3 v1(i − 1) + r3.
Rule 4: If v1(i) is A2 and v1(i − 1) is B2, then y4 = p4 v1(i) + q4 v1(i − 1) + r4,

where v1(i) and v1(i − 1) are the node inputs and i is the length of the sequence vector; A1, A2, B1 and B2 are linguistic variables (such as “big’’ or “small’’) associated with the nodes; p1, q1 and r1 are the parameters of the first node, p2, q2 and r2 of the second node, p3, q3 and r3 of the third node, and p4, q4 and r4 of the fourth node. This architecture matches the nonlinear passive dynamics of order 2 with two MFs for each input for noise cancellation in the experiment described in the next section.

First layer: Node n in this layer is denoted as a square node (the parameters in this layer are adjustable). Let O1,n denote the output of the nth node in layer 1:

O1,n = μAs1(v1(i)), s1 = 1, 2, n = 1, 2, (11)
O1,n = μBs2(v1(i − 1)), s2 = 1, 2, n = 3, 4, (12)

where s1 denotes the s1th MF of the input v1(i) and s2 the s2th MF of the input v1(i − 1). O1,n is the membership grade of a fuzzy set A (= A1, A2, B1 or B2); it specifies the degree to which the given input v1(i) (or v1(i − 1)) satisfies the quantifier A. Here the MF for A can be any appropriately parameterized MF, such as the product of two sigmoidal functions:

μA(v1) = 1 / ((1 + e^(−a1(v1−c1)))(1 + e^(−a2(v1−c2)))), (13)

where the parameters a1, a2, c1 and c2 determine the shapes of the two sigmoidal functions. Parameters in this layer are referred to as premise parameters.

Second layer: The nodes in this layer are labelled Π in Fig. 4. The outputs are the products of the inputs:

O2,n = wn = μAs1(v1(i)) μBs2(v1(i − 1)), s1 = 1, 2, s2 = 1, 2, n = 1, 2, 3, 4. (14)

The output of each node n represents the firing strength of a rule. In general, any other T-norm operator that performs fuzzy AND can be used as the node function in this layer.

Third layer: The nodes in this layer are labelled N. The nth node calculates the ratio of the nth rule's firing strength to the sum of all rules' firing strengths:

O3,n = w̄n = wn / (w1 + w2 + w3 + w4), n = 1, 2, 3, 4. (15)

The outputs of this layer are called normalized firing strengths.

Fourth layer: Each node in this layer is an adaptive node. The outputs are

O4,n = w̄n yn = w̄n (pn v1(i) + qn v1(i − 1) + rn), n = 1, 2, 3, 4, (16)

where pn, qn, rn are parameters of the nodes. Parameters in this layer are called consequent parameters.

Fifth layer: The single node in this layer is a fixed node labelled Σ. Its output is the summation of all incoming signals:

y = O5,1 = Σn w̄n yn = (Σn wn yn) / (Σn wn), n = 1, . . . , 4. (17)
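To make the five layers concrete, the following sketch computes one forward pass of the four-rule network of Eqs. (11)-(17), using the product-of-two-sigmoids MF of Eq. (13); all parameter values are illustrative, not trained:

```python
import numpy as np

# One forward pass of the two-input, four-rule first-order Sugeno ANFIS
# of Fig. 4.  All premise and consequent parameter values are made up.
def psigmf(v, a1, c1, a2, c2):
    # Eq. (13): product of two sigmoidal functions.
    return 1.0 / ((1.0 + np.exp(-a1 * (v - c1))) * (1.0 + np.exp(-a2 * (v - c2))))

def anfis_forward(u1, u2, p, q, r):
    # Layer 1: membership grades, two MFs per input (Eqs. (11)-(12)).
    A = [psigmf(u1, 5.0, -0.5, -5.0, 0.0), psigmf(u1, 5.0, 0.0, -5.0, 0.5)]
    B = [psigmf(u2, 5.0, -0.5, -5.0, 0.0), psigmf(u2, 5.0, 0.0, -5.0, 0.5)]
    # Layer 2: firing strengths via the product T-norm (Eq. (14)).
    w = np.array([A[0] * B[0], A[0] * B[1], A[1] * B[0], A[1] * B[1]])
    # Layer 3: normalized firing strengths (Eq. (15)).
    wbar = w / w.sum()
    # Layers 4-5: weighted rule outputs and their sum (Eqs. (16)-(17)).
    return float(np.sum(wbar * (p * u1 + q * u2 + r)))

p = np.array([1.0, 0.5, -0.5, 2.0])   # hypothetical consequent parameters
q = np.array([0.0, 1.0, 1.0, -1.0])
r = np.array([0.1, 0.0, 0.2, 0.3])
y = anfis_forward(0.2, -0.1, p, q, r)
```

Because the normalized strengths sum to one, the output is a convex combination of the four rule outputs.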

Let q be a parameter of the fuzzy set A or B from one of the MFs in Subsection 2.6.1. We use the backpropagation method [16] to determine the change in the parameters of each MF. The error E is the sum of squared differences between the target and the actual output. Using the chain rule, the change in the parameter q for the input v1(i) is

Δq(v1(i)) = −η ∂E/∂q
= −η (∂E/∂y)(∂y/∂w̄n)(∂w̄n/∂wn)(∂wn/∂μAs1(v1(i)))(∂μAs1(v1(i))/∂q)
= η · (x − y) · yn · (w̄n(1 − w̄n)/wn) · (wn/μAs1(v1(i))) · ∂μAs1(v1(i))/∂q
= (η · yn · w̄n · (x − y) · (1 − w̄n) / μAs1(v1(i))) · ∂μAs1(v1(i))/∂q, s1 = 1, 2, n = 1, 2, (18)

where η is the learning rate. Similarly, the change in the parameter q for the input v1(i − 1) is

Δq(v1(i − 1)) = (η · yn · w̄n · (x − y) · (1 − w̄n) / μBs2(v1(i − 1))) · ∂μBs2(v1(i − 1))/∂q, s2 = 1, 2, n = 3, 4, (19)

where q stands for the parameters a1, a2, c1 and c2 which determine the shapes of the two sigmoidal functions. The total change in the parameter q for the inputs v1(i) and v1(i − 1) is

Δq = Δq(v1(i)) + Δq(v1(i − 1))
= (η · yn · w̄n · (x − y) · (1 − w̄n) / μAs1(v1(i))) · ∂μAs1(v1(i))/∂q
+ (η · yn · w̄n · (x − y) · (1 − w̄n) / μBs2(v1(i − 1))) · ∂μBs2(v1(i − 1))/∂q, s1 = 1, 2, s2 = 1, 2, n = 1, 2, 3, 4. (20)
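A minimal sketch of one premise-parameter update in the spirit of Eqs. (18)-(20), using a finite-difference gradient in place of the analytic chain rule; the toy one-rule network, training pair, and learning rate are all illustrative assumptions:

```python
import numpy as np

# Toy one-rule "network" with a single Gaussian premise parameter c:
# y = mu(v) * v.  One gradient-descent step on E = (x - y)^2, with a
# central finite difference standing in for the analytic derivative.
def gauss(v, c, sigma):
    return np.exp(-0.5 * ((v - c) / sigma) ** 2)

def output(c, v):
    return gauss(v, c, 1.0) * v

v, x = 0.8, 0.5             # input and target training pair (made up)
c, eta, h = 0.0, 0.1, 1e-6  # premise parameter, learning rate, step size

err = lambda c_: (x - output(c_, v)) ** 2
grad = (err(c + h) - err(c - h)) / (2 * h)   # finite-difference dE/dc
c_new = c - eta * grad                       # gradient-descent update
```

One step along the negative gradient reduces the squared error, which is the only property the backpropagation pass relies on.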


Table 1
Two passes in the hybrid learning procedure for ANFIS

                        Forward pass            Backward pass
Premise parameters      Fixed                   Gradient descent
Consequent parameters   Least-squares method    Fixed
Signals                 Node outputs            Error signals

2.3.3. Hybrid learning algorithm

Because the basic learning algorithm presented above, the backpropagation method, is based on the gradient method, which is notorious for its slow convergence and tendency to become trapped in local minima, a hybrid learning algorithm is introduced to speed up the learning process substantially. Let V be a matrix that contains one row for each pattern of the training set. For a nonlinear passive dynamic of order 2, V can be defined as

V = (v1(i), v1(i − 1)). (21)

Let Y be the vector of target output values from the training data and let

A = (p1, q1, r1, . . . , p4, q4, r4)ᵀ (22)

be the vector of all the consequent parameters of all the rules for a nonlinear passive dynamic of order 2. The consequent parameters are determined by

VA = Y. (23)

A least-squares estimate (LSE) of A, denoted A∗, is sought to minimize the squared error ‖VA − Y‖². The best-known formula for A∗ uses the pseudo-inverse of V:

A∗ = (VᵀV)⁻¹VᵀY, (24)

where Vᵀ is the transpose of V, and (VᵀV)⁻¹Vᵀ is the pseudo-inverse of V if VᵀV is non-singular. Computing this matrix inverse directly is expensive, and it becomes ill-defined if VᵀV is singular. The sequential formulas of LSE are more efficient and are used for training ANFIS. Let v(i)ᵀ be the ith row vector of matrix V and let y(i) be the ith element of vector Y. Then A can be calculated iteratively using the sequential formulas:

A(i + 1) = A(i) + S(i + 1) · v(i + 1) · (y(i + 1) − vᵀ(i + 1) · A(i)), (25)

S(i + 1) = S(i) − (S(i) · v(i + 1) · vᵀ(i + 1) · S(i)) / (1 + vᵀ(i + 1) · S(i) · v(i + 1)), i = 0, 1, . . . , MN − 1, (26)

where S(i) is often called the covariance matrix. The modifications of the premise parameters are determined by the backpropagation method in Eq. (20) for a nonlinear passive dynamic of order 2. The gradient method (also called the backpropagation method) is thus combined with the least-squares method to update the parameters of the MFs in an adaptive inference system. Each epoch of the hybrid learning algorithm includes a forward pass and a backward pass. In the forward pass, the input data and functional signals go forward to calculate each node output, and the functional signals continue forward until the error measure is calculated. In the backward pass, the error rates propagate from the output end toward the input end, and the parameters are updated by the gradient method. Table 1 summarizes the activities in each pass.

2.3.4. ANFIS design

ANFIS uses the mature parameter-learning algorithms of neural networks: the back-propagation (BP) algorithm, or BP combined with the least-squares method (LSM). It adjusts the shape parameters of the MFs in the FIS by learning from a set of given input and output data.
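The sequential LSE of Eqs. (25) and (26) can be sketched and checked against the batch pseudo-inverse solution of Eq. (24); the synthetic data and the large initial covariance are illustrative choices:

```python
import numpy as np

# Sequential least-squares estimation (Eqs. (25)-(26)), checked against
# the batch pseudo-inverse solution (Eq. (24)).  Data are synthetic.
rng = np.random.default_rng(2)
V = rng.normal(size=(50, 3))                # one row per training pattern
A_true = np.array([1.5, -2.0, 0.5])
Y = V @ A_true                              # noiseless targets for the check

A = np.zeros(3)
S = 1e6 * np.eye(3)                         # large initial "covariance"
for i in range(V.shape[0]):
    v = V[i]                                # v(i+1)
    # Eq. (26): covariance update.
    S = S - np.outer(S @ v, v @ S) / (1.0 + v @ S @ v)
    # Eq. (25): parameter update using the new S.
    A = A + S @ v * (Y[i] - v @ A)

A_batch = np.linalg.pinv(V) @ Y             # Eq. (24)
```

After one pass over the data, the sequential estimate matches the batch solution up to the small regularization introduced by the finite initial S.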

H. Qin, S.X. Yang / Fuzzy Sets and Systems 158 (2007) 1036 – 1063


2.3.4.1. Learning and reasoning based on the model of ANFIS
The ANFIS algorithm is conceptually simple. It provides a learning method that extracts fuzzy rules from data for fuzzy modelling, much like the learning algorithms of neural networks. It can effectively compute the best parameters for the MFs, and it designs a Sugeno FIS that best approximates the expected or actual relations between inputs and outputs. ANFIS is a data-driven modelling method, and how well the resulting FIS model fits the data is the measure of its quality.

2.3.4.2. Adjustment of structure and parameters for the FIS
The structure in Fig. 4 resembles that of a neural network. It first maps the inputs through the input MFs and their parameters, and then maps the data from the input space to the output space through the output MFs and their parameters. The parameters that determine the shapes of the MFs are adjusted through a learning procedure guided by a gradient vector, which measures how well, for a given parameter set, the FIS fits the input and output data. Once this gradient is computed, the system adjusts the parameters by an optimization algorithm to reduce the error between the model and the expected system (this error is usually defined as the squared difference between the outputs and the targets). The ANFIS functions can estimate the MF parameters either by the BP algorithm alone or by combining LSM estimation with the BP algorithm.

2.3.4.3. Validity of the training data and the resulting model
The ANFIS modelling procedure is similar to system identification. First, ANFIS assumes a parameterized model structure that connects the input variables, the MFs of the input variables, the fuzzy rules, the output variables, and the MFs of the output variables. Then a group of input and output data pairs is obtained and assembled into training data in the format required by the ANFIS algorithm.
The parameterized FIS model is then trained by the ANFIS functions, which adjust the MF parameters according to an error criterion so that the model progressively approximates the given training data. This modelling method usually gives good results if the data reflect the characteristics of the model well. In applications, however, this is not always the case: the training data may fail to represent all the characteristics of the system, for example when they contain noisy signals. Measuring the validity of the training procedure and of the resulting model is therefore a very important step. Usually not all of the available data are used to train the system in ANFIS modelling, because the computation of ANFIS grows faster than the amount of training data; using vast amounts of training data greatly increases the computation and consumes time. More importantly, the learning of a neural network does not always converge in the optimal direction, and in some cases the training results become poorer as the training data increase. In the worst case, the system increases its structural complexity to reflect the properties of a few data points with large noise; this outcome must be avoided when the ANFIS method is used. The data are therefore separated into three groups: the training data set, which trains the model; the checking data set, which checks the model during the training procedure; and the testing data set, which tests the resulting model.

2.3.4.4. Measuring the resulting model with the checking and testing data sets
The choice of training data can introduce unreliable factors. Besides preprocessing the obtained data, it is therefore important to check the model both during the training procedure and after the final model is obtained.
In general, the resulting model is checked with input and output data that were not used to train the system, to see whether the trained model can match and predict those data; this is usually done with the so-called testing data set. In the ANFIS functions, a further so-called checking data set is used to control the training procedure and to detect overfitting of the resulting model. If the training data are limited relative to the number of model parameters, the model can fit the training data well after some epochs of training; if training continues, however, the model develops the characteristics of overfitting and deviates from the expected results. For example, if a neural network is trained to fit a second-order curve from the given data but the training yields a sixth-order curve, the errors on the training data decrease, yet the system does not work well on the testing data. The checking data are fed into the ANFIS function simultaneously with the training data. When the training data are sparse, the system may converge in several directions; the function will choose the suitable results according to the rule


resulting in the fewest errors on the checking data. The checking data do not directly participate in training the parameters as the training data do; their values serve only for judgement and selection. This is a typical problem in adaptive techniques: it is important to select not only an expected model that represents the needs of the system but also effective checking data that differ from the training data (to avoid useless checking). For systems with plentiful, reliable data representing the properties of the model, it is easy to choose sufficient training data. However, if the chosen training data cannot express the characteristics of the model, or contain a lot of noise, checking data are necessary because they prevent the structure from overfitting to the characteristics of the noise. The checking data differ from the testing data: the testing data are used for calculation and comparison only after the final model is obtained, whereas the checking data are used and evaluated simultaneously with the training data. If the system encounters a mismatch (for example, the training data set is too small), the errors on the training data become smaller and smaller while the errors on the checking data become larger and larger once training continues beyond some point. When this happens, we know the system parameters do not match the training data, i.e. the model is overfitting.

2.4. The transformation from the 1-D sequence vector back to the 2-D image matrix
After the restoration of images corrupted by the noise of the nonlinear passage dynamics with ANFIS is finished, we need to recover the image from the 1-D sequence vector back to the 2-D image matrix. Dividing both sides of Eq. (3) by N, we obtain j*, the row index, and k*, the column index, from i*, the 1-D sequence index: j* is the integer part of i*/N and k* the remainder of i*/N,

j* + k*/N = i*/N,    (27)

where j* = 1, 2, ..., M, k* = 1, 2, ..., N, and i* = 1, 2, ..., MN. Finally, we get the recovered image matrix from the 1-D transformed sequence vector denoised by ANFIS:

Ŝ(j*, k*) = ŝ(i*),    (28)

where Ŝ is the recovered image matrix and ŝ is the 1-D transformed sequence vector denoised by ANFIS.

2.5. Restoration of images corrupted by the noise of the nonlinear passage dynamics
In this paper, we discuss the restoration of an image corrupted by noise in nonlinear passage dynamics of order 2.

2.5.1. Restoration of images corrupted by the noise of the nonlinear passage dynamics of order 2
We use ANFIS to restore an image corrupted by the noise of a nonlinear passage dynamics of order 2. In the experiment, the unknown nonlinear passage dynamics is assumed to be defined as

v0(i) = f(v1(i), v1(i−1)) = 4 sin(v1(i)) v1(i−1) / (1 + (v1(i−1))^2),    (29)

where v1(i) is a noise source, v1(i−1) is the one-step delay of the noise source, and v0(i) is the resultant of the nonlinear passage dynamics f(.) acting on v1(i) and v1(i−1), with i running from 1 to the number of pixels in the image.

2.5.2. Modifications for 2-D signal denoising with the ANFIS algorithm
For the operations between random Gaussian noise and 2-D signal pixels, each pixel value must be converted from an unsigned 8-bit integer to double precision. After removing the noise part detected by ANFIS from each pixel value, the pixel values are transformed back to unsigned 8-bit integers for display.
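The index mapping of Eqs. (27)-(28), the passage dynamics of Eq. (29), and the dtype conversions of Section 2.5.2 can be sketched together. This is an illustrative sketch with our own function names (and 0-based indices, unlike the paper's 1-based Eq. (27)), not the authors' implementation:

```python
import numpy as np

def image_to_sequence(S):
    """Flatten an M x N image into a 1-D double-precision sequence, row by row."""
    return S.astype(np.float64).ravel()

def sequence_to_image(s_hat, M, N):
    """Invert the flattening (Eqs. (27)-(28)): 0-based index i* maps to
    row j* = i* // N and column k* = i* % N."""
    return s_hat.reshape(M, N)

def passage_dynamics(v1):
    """Distorted noise v0 of Eq. (29); v1 is the measurable source noise.
    The delayed sample v1(i-1) is taken as 0 for the first sample."""
    v1_prev = np.concatenate(([0.0], v1[:-1]))
    return 4.0 * np.sin(v1) * v1_prev / (1.0 + v1_prev ** 2)

rng = np.random.default_rng(0)
S = rng.integers(0, 256, size=(4, 6)).astype(np.uint8)   # toy 4 x 6 image
s = image_to_sequence(S)                                 # uint8 -> double
v1 = rng.standard_normal(s.size)                         # Gaussian source noise
x = s + passage_dynamics(v1)                             # corrupted 1-D signal
# back to uint8 for display, as in Section 2.5.2:
S_back = np.clip(sequence_to_image(x, 4, 6), 0, 255).astype(np.uint8)
```

Since |b / (1 + b^2)| ≤ 1/2, the distorted noise of Eq. (29) is bounded by |v0| ≤ 2 regardless of the source noise.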


2.6. Factors discussed in the proposed method
Two aspects will be investigated in this section: the various types of MFs, and the other parameters of each MF.

2.6.1. Various types of MFs
Eight types of MF will be inspected: bell MF, triangular MF, Gaussian MF, two-sided Gaussian MF, pi-shaped MF, product of two sigmoidal MFs, difference between two sigmoidal MFs, and trapezoidal MF.

The triangular MF is expressed as

f(x, a, b, c) = { 0, x ≤ a;  (x − a)/(b − a), a ≤ x ≤ b;  (c − x)/(c − b), b ≤ x ≤ c;  0, c ≤ x },    (30)

where the parameters a, b and c decide the shape of the triangular MF. The trapezoidal MF is expressed as

f(x, a, b, c, d) = { 0, x ≤ a;  (x − a)/(b − a), a ≤ x ≤ b;  1, b ≤ x ≤ c;  (d − x)/(d − c), c ≤ x ≤ d;  0, d ≤ x },    (31)

where the shape of the trapezoidal MF is decided by the parameters a, b, c and d. The Gaussian MF is expressed as

f(x, c, σ) = e^{−(x − c)^2 / (2σ^2)},    (32)

where the parameters c and σ decide the shape of the Gaussian MF. The two-sided Gaussian MF is expressed as

f(x, c1, c2, σ1, σ2) = { e^{−(x − c1)^2 / (2σ1^2)}, x ≤ c1;  e^{−(x − c2)^2 / (2σ2^2)}, x ≥ c2 },    (33)

where the shape of the two-sided Gaussian MF is decided by the parameters σ1, c1 and σ2, c2, which correspond to the widths and centers of the left and right half Gaussian functions. The (generalized) bell MF is expressed as

f(x, a, b, c) = 1 / (1 + |(x − c)/a|^{2b}),    (34)

where the parameters a, b and c decide the shape of the bell MF. The product of two sigmoidal MFs is expressed as

f(x, a1, c1, a2, c2) = 1 / [(1 + e^{−a1(x − c1)})(1 + e^{−a2(x − c2)})],    (35)

where the parameters a1, c1 and a2, c2 decide the shapes of the two sigmoidal MFs. The difference between two sigmoidal MFs is expressed as

f(x, a1, c1, a2, c2) = 1/(1 + e^{−a1(x − c1)}) − 1/(1 + e^{−a2(x − c2)}),    (36)

where the parameters a1, c1 and a2, c2 decide the shapes of the two sigmoidal MFs. The pi-shaped MF is the product of the Z-shaped and S-shaped functions.
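A few of the MFs above can be implemented directly. The following is a sketch with our own function names, mirroring Eqs. (30), (32), (34) and (36):

```python
import numpy as np

def trimf(x, a, b, c):
    """Triangular MF, Eq. (30): rises on [a, b], falls on [b, c], 0 outside."""
    return np.clip(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0, 1.0)

def gaussmf(x, c, sigma):
    """Gaussian MF, Eq. (32), with center c and width sigma."""
    return np.exp(-(x - c) ** 2 / (2.0 * sigma ** 2))

def gbellmf(x, a, b, c):
    """Generalized bell MF, Eq. (34), with width a, slope b, center c."""
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))

def dsigmf(x, a1, c1, a2, c2):
    """Difference of two sigmoids, Eq. (36)."""
    sig = lambda a, c: 1.0 / (1.0 + np.exp(-a * (x - c)))
    return sig(a1, c1) - sig(a2, c2)

# Evaluate one of them over the input range used in the figures:
x = np.linspace(-4.0, 4.0, 9)
mu_tri = trimf(x, -2.0, 0.0, 2.0)
```

All four return membership grades in [0, 1] and equal (or approach) 1 at their centers, which is the property the training procedure preserves while reshaping them.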



Fig. 5. The architecture of ANFIS for two-input first order Sugeno fuzzy model with three MFs and nine rules (redrawn from Jang et al. [9]).

2.6.2. Parameters in each MF
We will also investigate other parameters of each MF, such as the number of training epochs, the number of MFs for each input, the optimization method, the type of output MFs, and the training and checking data. For the training epochs, training lengths from 20 to 400 epochs will be investigated for their restoration effect; the aim is to find the optimal point or range while also saving time. We inspect the number of MFs for each input to see how many MFs per input match the structure of the data, because this number reflects the complexity of the ANFIS parameterization. If the number chosen is too small, it cannot reflect the complex structure of the data; if it is too large, it produces redundancy for the structure of the data. We usually survey the range from 2 to 6 MFs for each input. Note that the structure of ANFIS is not unique; in our experiment, the number of MFs assigned to each input variable is selected empirically, that is, by trial and error. The architecture of ANFIS for a two-input first-order Sugeno fuzzy model with three MFs and nine rules is shown in Fig. 5. In addition, the optimization method will be investigated: we compare the combination of least-squares and backpropagation gradient descent methods with the backpropagation method alone. After that, the type of output MFs will be investigated; for a Sugeno fuzzy inference system the output MFs can be "constant" or "linear". Finally, we divide the overall data into two parts: the training data, used to train the fuzzy system, and the checking data, used to check whether overfitting exists in the fuzzy system. The training procedure consists of the following steps: (i) Propagate all patterns from the training data and determine the consequent parameters by iterative LSE. The antecedent parameters remain fixed.
(ii) Propagate all patterns again and update the antecedent parameters by backpropagation. The consequent parameters remain fixed.


(iii) If the error is reduced in four consecutive steps, increase the learning rate by 10%. If the error undergoes consecutive combinations of increase and decrease, decrease the learning rate by 10%.
(iv) Stop if the error is small enough; otherwise continue with step (i).

3. Simulation results
In this section, we present the results of 2-D signal denoising using ANFIS for nonlinear passage dynamics of order 2 in Gaussian noise. In the experiment, we generate and add Gaussian noise of different intensities by choosing different variances and standard deviations of a normal distribution. We divide the noise into three categories according to the signal-to-noise ratio (SNR): low noise, medium noise, and high noise.

Table 2
The SNRs and the MSE of the image corrupted by Gaussian noise of different intensities

Intensity of noise    SNR        MSE
Low                   12.7131    175.3
Medium                3.1707     1577.9
High                  −1.2663    4383.1
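The excerpt does not spell out the SNR definition behind Table 2. Assuming the usual ratio of signal power to noise power in decibels, the two quality metrics can be computed as follows (a sketch; the toy numbers are ours):

```python
import numpy as np

def snr_db(s, noisy):
    """SNR in dB between the clean signal s and its corrupted version:
    10 log10( sum(s^2) / sum((noisy - s)^2) ), a common convention
    assumed here, since the paper does not state its exact formula."""
    noise = noisy - s
    return 10.0 * np.log10(np.sum(s ** 2) / np.sum(noise ** 2))

def mse(a, b):
    """Mean square error between two images/signals (as double precision)."""
    return np.mean((np.asarray(a, dtype=np.float64)
                    - np.asarray(b, dtype=np.float64)) ** 2)

s = np.array([10.0, 10.0, 10.0, 10.0])       # toy clean signal
x = s + np.array([1.0, -1.0, 1.0, -1.0])     # unit-power additive noise
```

With this toy data the signal power is 100 times the noise power, i.e. 20 dB, and the MSE is 1.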


Fig. 6. Spectral density distributions corrupted by high Gaussian noise: (a) PSD of v1(i); (b) PSD of v0(i); (c) PSD of s(i); (d) PSD of x(i).




Fig. 7. The characteristics of the ANFIS function f̂.

Fig. 8. The signals being used in ANFIS. (a) The measurable source noise v1 (i); (b) The distorted noise v0 (i); (c) The estimated distorted noise y(i) by ANFIS; (d) The error between the estimated distorted noise y(i) by ANFIS and the distorted noise v0 (i).

We also save these noises for comparison. If the noises were randomly regenerated in each ANFIS program, their means, variances and standard deviations would differ slightly from run to run. Therefore, to ensure comparability, it is better to save the noises rather than regenerate them randomly every time in the different ANFIS programs.
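An equivalent way to guarantee identical noise across programs, noted here as our own suggestion rather than the authors' procedure, is to fix the seed of the random generator (the saved-file approach works just as well, e.g. with numpy's save/load):

```python
import numpy as np

# Fixing the generator seed reproduces the identical noise sequence in every
# run/program, serving the same purpose as saving the noise to a file:
v1_a = np.random.default_rng(42).standard_normal(100)   # mean 0, std 1
v1_b = np.random.default_rng(42).standard_normal(100)   # "regenerated" later
```

Both draws are sample-for-sample identical, so every filtering method sees exactly the same corruption.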



Fig. 9. The changes in the bell MFs before and after training: (a) initial MFs for v1(i); (b) initial MFs for v1(i−1); (c) final MFs for v1(i); (d) final MFs for v1(i−1).

We list the SNRs and the MSEs of the image corrupted by Gaussian noise of different intensities in Table 2. The measurable noise is Gaussian (normally distributed) with zero mean, unit variance and unit standard deviation. For comparison, all the different MFs use the same source noise v1(i). We first investigate the behavior of these signals in the frequency domain. Fig. 6(a)-(d) display the spectral density distributions of v1(i), v0(i), s(i) and x(i), respectively. The spectra of the information signal s(i) and the distorted noise v0(i) clearly overlap over a large frequency range, which makes it impossible to remove v0(i) from x(i) with common frequency-domain filtering methods. Fig. 7 shows the ANFIS surface f̂(.) after 20 epochs of batch learning. The distorted noise v0(i), caused by the source noise v1(i) in Fig. 8(a) and produced by the nonlinear dynamics of Eq. (29), is shown in Fig. 8(b); only the high-noise case is shown to represent the different noise intensities. The estimated distorted signal y(i) is shown in Fig. 8(c), and the error between the ANFIS estimate y(i) and the distorted noise v0(i) in Fig. 8(d). In Fig. 8(d) the image is almost pure black; since the pixel value of black in an 8-bit grey image is 0, this error is very small. The changes in the two bell MFs, the two triangular MFs, the two trapezoidal MFs, and the two pi-shaped MFs before and after training under high noise are shown in Figs. 9-12, respectively. The images corrupted by low, medium and high Gaussian noise (Fig. 13(b), (d) and (f)) and the results of removing the noise with ANFIS (Fig. 13(c), (e) and (g)) are shown in Fig. 13. The restoration is remarkably good compared with the original image in Fig. 13(a), regardless of how heavily the image is corrupted.
All MSE results processed with different MFs are listed in descending order in Table 3 for comparison purposes.



Fig. 10. The changes in the triangular MFs before and after training: (a) initial MFs for v1(i); (b) initial MFs for v1(i−1); (c) final MFs for v1(i); (d) final MFs for v1(i−1).

Now the other parameters of ANFIS are investigated. First we set the number of training epochs to 50, 100, 200 and 400 separately. Fig. 14(a)-(d) shows the RMSE for the high-noise case, and the step-size curves for removing high Gaussian noise with the different numbers of training epochs are shown in Fig. 15(a)-(d). We also list the MSE between the original image and the restored image for the various numbers of training epochs in Table 4. Then we discuss the effect of the number of bell MFs per input on the restoration of the image corrupted by high Gaussian noise, taking 3, 4, 5 and 6 bell MFs per input separately. The 4 and 5 MFs for each input before and after training, which reflect the changes in the premise (nonlinear) parameters, are shown in Figs. 16 and 17, and the MSEs of the noise-filtered image for the different numbers of MFs per input are listed in Table 5. We also compare the combination of least-squares and backpropagation gradient descent methods for training the MF parameters with the backpropagation method alone; the MSEs of the noise-filtered images for both methods are shown in Table 6. From this table, we find that the hybrid learning algorithm removes noise much better than the backpropagation method. In addition, we select either linear or constant output MFs to approximate the output data; the MSEs of the noise-filtered images using the linear output MF and the constant output MF are shown in Table 7. In this table, we find that the linear output MF removes noise much better than the constant output MF. Finally, we split the whole data set into two halves: the training data and the checking (validation) data. The checking data serve to detect overfitting of the training data set; overfitting is detected when the checking error increases while the training error decreases. We show the RMSE curve in Fig.
18 when the image is corrupted by high noise. We also show the changes in the bell MFs of the training data and checking data before and after training in Fig. 19.
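The overfitting signature just described (checking error rising while training error still falls) can be detected mechanically. A sketch with synthetic error curves follows; the function name and numbers are ours:

```python
import numpy as np

def overfit_epoch(train_rmse, check_rmse):
    """Return the first epoch at which the checking error rises while the
    training error still falls (the overfitting signature), or None if,
    as in the experiments reported here, no such epoch exists."""
    for t in range(1, len(train_rmse)):
        if train_rmse[t] < train_rmse[t - 1] and check_rmse[t] > check_rmse[t - 1]:
            return t
    return None

train = np.array([0.30, 0.25, 0.22, 0.20, 0.19])   # keeps decreasing
check = np.array([0.31, 0.27, 0.24, 0.26, 0.29])   # starts rising at epoch 3
```

On these synthetic curves the divergence begins at epoch 3; applied to curves like those in Fig. 18, where both errors decrease together, the function reports no overfitting.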



Fig. 11. The changes in the trapezoidal MFs before and after training: (a) initial MFs for v1(i); (b) initial MFs for v1(i−1); (c) final MFs for v1(i); (d) final MFs for v1(i−1).

3.1. The comparisons between ANFIS and some conventional filtering systems
Fig. 20 shows image noise cancellation using ANFIS and typical conventional filtering systems: the median filter, the Wiener filter and the Butterworth lowpass filter. We use the same deviations of low, medium and high Gaussian noise for the conventional filtering techniques and for the nonlinear noise cancellation with ANFIS. As Table 8 shows, all the conventional filtering techniques give much worse results for denoising the images corrupted by the different intensities of noise than does ANFIS, regardless of the kind of filtering technique: spatial filters, optimal Wiener filter, frequency-domain filters, wavelet, wavelet packet or 2-D adaptive filters.

4. Summaries and analyses
From the above experiments, we can draw the following summaries and analyses:
• In removing Gaussian noise, the abilities of the 8 MFs to restore images differ considerably (Table 3). The pi-shaped MFs, two-sided Gaussian MFs, product of two sigmoidal MFs, and difference between two sigmoidal MFs all show excellent capability for filtering the noise, whatever its intensity. The triangular MF is the least useful for removing Gaussian noise because its derivative is not continuous, which makes approximating nonlinear systems more difficult than with the pi-shaped, two-sided Gaussian, or product-of-sigmoids MFs, whose derivatives are continuous.
• As the number of training epochs increases, the noise-cancellation result improves (Table 4). After 100 epochs the RMSE curve tends to stabilize: it flattens (it is not as steep as before 100 epochs), so the rate of improvement degrades while the training takes longer. For time considerations,



Fig. 12. The changes in the pi-shaped MFs before and after training: (a) initial MFs for v1(i); (b) initial MFs for v1(i−1); (c) final MFs for v1(i); (d) final MFs for v1(i−1).

choosing a training epoch count between 50 and 100 is better than choosing one beyond this range, because after 100 training epochs the parameters of the MFs are stable and can hardly be tuned further.
• The combination of least-squares and backpropagation gradient descent methods performs much better than the backpropagation method alone (Table 6). The hybrid method can guarantee that the mean square error always decreases, because the changes in the MF parameters follow the negative gradient of the mean square error; the backpropagation method alone does not tune the MF parameters along this negative gradient and therefore cannot guarantee that the error always decreases.
• The linear output MF removes Gaussian noise much better than the constant output MF (Table 7), because the linear output MF approximates nonlinear curves much better. With the constant output MF, a typical fuzzy rule has the form: if x is A and y is B then z = k, where A and B are fuzzy sets and k is a crisp constant (no fuzzification). If each rule's output is a constant, this kind of Sugeno FIS is essentially a Mamdani FIS; the only differences are that each output MF of a zero-order Sugeno FIS is a singleton set, and that the implication and aggregation algorithms in the fuzzy reasoning procedure are fixed. The implication algorithm in the Sugeno FIS adopts a simple product operation (here product and minimum give the same conclusion, because the MF values of singleton sets equal 1), and the aggregation algorithm adopts a simple summation of these singleton sets (the max, sum, and probor operations give the same result).
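The contrast between constant (zero-order) and linear (first-order) output MFs can be illustrated by the weighted-average defuzzification they share. This is a sketch with our own names, not the paper's code:

```python
import numpy as np

def sugeno_output(weights, consequents, x=None, y=None):
    """Weighted-average defuzzification of a two-input Sugeno FIS.

    weights     : firing strengths w_i of the rules.
    consequents : zero order  -> constants k_i (1-D array);
                  first order -> rows (p_i, q_i, r_i) giving
                                 z_i = p_i x + q_i y + r_i (2-D array).
    """
    w = np.asarray(weights, dtype=float)
    c = np.asarray(consequents, dtype=float)
    if c.ndim == 1:                       # zero order: fixed singletons
        z = c
    else:                                 # first order: singletons that move
        z = c[:, 0] * x + c[:, 1] * y + c[:, 2]   # linearly with the inputs
    return np.sum(w * z) / np.sum(w)

w = [0.25, 0.75]
z0 = sugeno_output(w, [1.0, 3.0])                            # zero order
z1 = sugeno_output(w, [[1, 0, 0], [0, 1, 0]], x=2.0, y=4.0)  # first order
```

In the zero-order case each rule always contributes the same singleton; in the first-order case each rule's singleton position depends on the inputs, which is what gives the linear output MF its extra approximating power.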


Fig. 13. The image corrupted by different intensity noises and the results of removing noise with ANFIS of bell MFs. (a) The original image; (b) the image with low noise; (c) the restoration from low noise; (d) the image with medium noise; (e) the restoration from medium noise; (f) the image with high noise; (g) the restoration from high noise.

More commonly in applications, fuzzy rules with the linear output MF have the following format: if x is A and y is B then z = px + qy + r,


Table 3
MSE between the original 2-D signal and the restored 2-D signal with all types of MFs

Name of MFs                        MSE (Low)    MSE (Medium)    MSE (High)
Corrupted image (no filtering)     175.3222     1577.9000       4383.1000
Triangular                         61.8565      548.5101        1500.9000
Gaussian                           3.9071       29.1639         79.6379
Trapezoidal                        4.7063       21.7074         56.3219
Bell                               2.0298       11.3250         29.9477
Product of two sigmoidal           1.4214       3.8072          8.7097
Difference between two sigmoidal   1.4214       3.8072          8.7097
Two-sided Gaussian                 1.1739       3.5559          8.8405
Pi-shaped                          1.1888       2.0820          4.0841


Fig. 14. The RMSE of ANFIS for the different training epoch numbers. (a) 50 epochs; (b) 100 epochs; (c) 200 epochs; (d) 400 epochs.



Fig. 15. The change in the step-size curve during training: (a) 50 epochs; (b) 100 epochs; (c) 200 epochs; (d) 400 epochs.

Table 4
MSE between the original 2-D signal and the restored 2-D signal with the bell MFs for different numbers of training epochs

Epoch number    20         50         100       200       400
MSE             29.9477    13.1895    7.7370    6.9795    6.9171

where p, q and r are crisp constants. This can be regarded as an extension of the constant output MF in which every rule defines the position of a dynamically moving singleton: the output singleton moves its position in the output space linearly according to the inputs.
• The step size of the training data for all MFs in Fig. 15 is related to the RMSE in Fig. 14: if the RMSE decreases steadily, the step size increases, whereas if the RMSE oscillates, the step size decreases. This behavior is determined by the ANFIS algorithm: if the error is reduced in four consecutive steps, the learning rate increases by 10%; if the error undergoes consecutive combinations of increase and decrease, the learning rate decreases by 10%.
• We did not find overfitting in ANFIS for noise cancellation at any noise intensity in the nonlinear passage dynamics of order 2. In Fig. 18, the error of the checking data always decreases when the error of the training



Fig. 16. The changes in 4 bell MFs for each input before and after training: (a) initial MFs for v1(i); (b) initial MFs for v1(i−1); (c) final MFs for v1(i); (d) final MFs for v1(i−1).

data decreases. The changes in the MF parameters are the same for the training data and for the checking data in Fig. 19.
• The choice of the number of MFs for each input reflects the complexity of the ANFIS parameterization (Table 5). For the nonlinear passage dynamics of order 2 corrupted by Gaussian noise, 2 or 3 MFs per input are not enough to reflect the complex structure of the data, so the MSE of the filtered 2-D signal is large. When we choose 6 MFs per input, redundancy is produced for the structure of the data and the MSE slightly increases. Choosing 4 or 5 MFs per input reflects the complex structure of the data properly, as their MSEs are small. It is important that the structure of the system matches the data. To build a model with ANFIS, the chosen data must express all the properties of the system, and the chosen structure should have enough parameters to reflect all the characteristics; however, the number of parameters should also be restricted. It is by no means true that the more complex the structure, the better the result. The structure must be decided from experience, or by varying it and observing the effect in the specific application.
• We use an improved version of the periodogram, Welch's method, to calculate the power spectral density (PSD) distributions of the signals s(i), v1(i), v0(i) and x(i). We choose a segment length of 256, calculate the discrete Fourier transforms of the four signals, and multiply by their complex conjugates to obtain the PSDs. To display more detail, only frequencies in [0, 2500] are shown in the figures. The power spectral density distribution of x = s + v0 decreases at high frequency under Gaussian noise in Fig. 6, because Gaussian noise (normally distributed random noise) can be positive or negative.
For low Gaussian noise, the noise v0, whose magnitude is of order 10^{-1}, is much smaller than s, whose magnitude is of order 10, so x ≈ s. For high Gaussian noise, if v0 is negative, x = s − |v0| is also a small number.
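The PSD computation described above can be sketched as a simplified averaged periodogram, i.e. Welch's method without the windowing and overlap of the full method. The function name and test signal are ours:

```python
import numpy as np

def welch_psd(x, seg_len=256):
    """Averaged periodogram over non-overlapping segments: split the signal
    into length-256 segments, take the DFT of each, multiply by the complex
    conjugate, and average over the segments (a simplified Welch estimate,
    without windowing or overlap)."""
    n_seg = len(x) // seg_len
    segs = x[:n_seg * seg_len].reshape(n_seg, seg_len)
    X = np.fft.rfft(segs, axis=1)                      # DFT of each segment
    return np.mean((X * np.conj(X)).real, axis=0) / seg_len

t = np.arange(1024)
x = np.cos(2.0 * np.pi * 8.0 * t / 256.0)   # exactly 8 cycles per segment
psd = welch_psd(x, 256)                     # peak expected at bin 8
```

Because the test tone completes an integer number of cycles per segment, the estimate concentrates all its power in a single frequency bin, which is a convenient sanity check for the implementation.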



Fig. 17. The changes in 5 bell MFs for each input before and after training.

Table 5
MSE of restored 2-D signals filtered with different numbers of bell MFs for each input

Number of MFs for each input    2          3           4         5         6
MSE                             29.9477    116.3464    4.0627    5.3917    7.2124
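As a small illustration of how Table 5 could drive the choice of structure, one can compute the MSE of each restored signal and keep the MF count with the smallest error. The `mse` helper below is an illustrative sketch; the dictionary values are copied from Table 5.

```python
import numpy as np

def mse(restored, original):
    """Mean square error between a restored and a reference 2-D signal."""
    restored = np.asarray(restored, dtype=float)
    original = np.asarray(original, dtype=float)
    return float(np.mean((restored - original) ** 2))

# MSEs from Table 5, keyed by the number of bell MFs per input.
mse_by_n_mfs = {2: 29.9477, 3: 116.3464, 4: 4.0627, 5: 5.3917, 6: 7.2124}
best_n = min(mse_by_n_mfs, key=mse_by_n_mfs.get)  # the MF count with the smallest MSE
```

With these values, the selection picks 4 MFs per input, matching the discussion above.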

Table 6
MSE of the contaminated 2-D signal restored using the backpropagation method or the hybrid learning algorithm

Optimization method          Low noise    Medium noise    High noise
Backpropagation method       50.0287      431.4803        1168.2000
Hybrid learning algorithm    2.0298       11.3250         29.9477
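The advantage of the hybrid algorithm in Table 6 comes from the fact that, with the premise (MF) parameters held fixed, the consequent parameters of a first-order Sugeno model enter linearly, so they can be identified in a single least-squares pass; only the premise parameters need gradient descent. A minimal sketch of that least-squares step, with hypothetical firing strengths and noiseless data:

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples, n_rules = 200, 4
X = rng.standard_normal((n_samples, 2))     # inputs, e.g. v(i) and v(i-1)
w = rng.random((n_samples, n_rules))
w /= w.sum(axis=1, keepdims=True)           # normalized rule firing strengths

# Each rule output is p*x1 + q*x2 + r; stacking the weighted regressors gives a
# design matrix that is linear in all consequent parameters (p, q, r per rule).
A = np.hstack([w * X[:, :1], w * X[:, 1:2], w])
theta_true = rng.standard_normal(3 * n_rules)
y = A @ theta_true                          # noiseless targets for the sketch

theta, *_ = np.linalg.lstsq(A, y, rcond=None)  # one-shot least-squares identification
```

Because the targets here are noiseless and the design matrix has full column rank, the least-squares solution recovers the consequent parameters exactly, which is why the hybrid method converges so much faster than pure backpropagation.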

Table 7
MSE of the corrupted 2-D signal restored by the linear and the constant output MF

Type of output MF     Low noise    Medium noise    High noise
Constant output MF    14.7049      130.0012        360.6174
Linear output MF      2.0298       11.3250         29.9477
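The two rows of Table 7 correspond to zero-order (constant) and first-order (linear) Sugeno consequents. A toy illustration of the difference in expressiveness, with arbitrary rule parameters:

```python
def constant_consequent(r):
    """Zero-order Sugeno rule: output is a constant, independent of the inputs."""
    return lambda x1, x2: r

def linear_consequent(p, q, r):
    """First-order Sugeno rule: output is an affine function of the inputs."""
    return lambda x1, x2: p * x1 + q * x2 + r

f0 = constant_consequent(0.5)          # one free parameter per rule
f1 = linear_consequent(1.0, -2.0, 0.5) # three free parameters per rule
```

The extra per-rule parameters let the linear consequent track local input-output slopes, which is consistent with its lower MSE at every noise level in Table 7.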

[Figure: training RMSE and checking RMSE, both in roughly [0.224, 0.233], plotted against training epochs 0–20.]

Fig. 18. The changes in RMSEs of the training data and the checking data.

[Figure: six panels of membership grade versus input over [−4, 4]: (a) initial MFs for v(i); (b) initial MFs for v(i−1); (c) final MFs with training data for v(i); (d) final MFs with training data for v(i−1); (e) final MFs with checking data for v(i); (f) final MFs with checking data for v(i−1).]

Fig. 19. The changes in the MFs of the training data and checking data before and after training.

Fig. 20. The images filtered using bell MFs in ANFIS corrupted by (a) low, (e) medium, and (i) high Gaussian noise; using the median filter corrupted by (b) low, (f) medium, and (j) high Gaussian noise; using the Butterworth lowpass filter corrupted by (c) low, (g) medium, and (k) high Gaussian noise; and using the Wiener filter corrupted by (d) low, (h) medium, and (l) high Gaussian noise.

Table 8
MSE of the noisy image contaminated with different intensities of noise, filtered by the proposed noise cancellation method with ANFIS and by the conventional techniques; the variances of the low, medium and high Gaussian noise are 175.3, 1577.9 and 4383.1

Name of filter                      Low         Medium       High
Pi-shaped MF                        1.1888      2.0820       4.0841
Arithmetic mean filter              250.9947    403.2217     708.2673
Geometric mean filter               252.0248    1356.0000    3251.6000
Harmonic mean filter                275.9819    730.7467     1310.7000
Contraharmonic mean filter          225.1598    433.5892     903.4906
Median filter                       226.1985    561.0192     1211.0000
Alpha-trimmed mean filter           207.8163    475.0034     951.5577
SD-ROM filter                       276.5266    608.8525     1203.8000
Max filter                          528.3509    702.4051     1190.9000
Min filter                          756.7153    808.9610     1199.2000
Midpoint filter                     349.2780    440.8521     671.4911
Frequency transformation method     212.5613    440.0593     825.2965
Frequency sampling method           240.8762    479.4706     890.5405
Windowing method                    197.4664    423.3473     816.0861
Gaussian lowpass filter             186.3891    363.2081     681.3593
Butterworth lowpass filter          129.0895    385.5263     827.8351
Wiener lowpass filter               89.8830     352.7439     801.3708
Wavelet with soft threshold         146.7896    1191.5000    3027.8000
Wavelet with hard threshold         169.5155    1470.5000    3559.4000
Wavelet packet by soft threshold    211.6517    539.3796     1092.0000
Wavelet packet by hard threshold    211.6517    539.8328     1093.7000
2-D LMS filter                      505.5324    611.1128     824.2988
2-D FEDS filter                     312.9463    517.0347     731.7431
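A few of the conventional baselines in Table 8 are easy to reproduce in outline with SciPy. The test image, noise seed, and window sizes below are illustrative assumptions, so the resulting MSEs will not match the table; only the qualitative ordering (filtered better than unfiltered) is expected.

```python
import numpy as np
from scipy.ndimage import median_filter
from scipy.signal import wiener

rng = np.random.default_rng(2)
clean = np.tile(np.linspace(0.0, 255.0, 64), (64, 1))         # smooth hypothetical test image
noisy = clean + rng.normal(0.0, np.sqrt(175.3), clean.shape)  # low Gaussian noise, variance 175.3

def mse(a, b):
    """Mean square error between two images of the same shape."""
    return float(np.mean((a - b) ** 2))

results = {
    "no filter": mse(noisy, clean),
    "median 3x3": mse(median_filter(noisy, size=3), clean),
    "wiener 3x3": mse(wiener(noisy, (3, 3)), clean),
}
```

On this smooth image both baselines cut the MSE well below that of the unfiltered input, mirroring the ordering (but not the magnitudes) in Table 8.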

5. Conclusion

In this paper, we extended adaptive noise cancellation from 1-D signals to 2-D signals, i.e. images. We used ANFIS to restore images contaminated with Gaussian noise. The nonlinear passive dynamics of order 2 was discussed because this case is typical in corrupted images. We did not discuss high-order components (higher than order 3) because the increasing delay time dramatically attenuates their contribution to the overall noise. Eight different MFs were used to remove noise and restore the image, and the other parameters of ANFIS were also examined. From removing Gaussian noise using ANFIS in the nonlinear passive dynamics of order 2, we conclude the following:

• The MFs differ in their ability to remove Gaussian noise.
• As the number of training epochs increases, the noise cancellation result improves.
• Using the combination of least-squares estimation and backpropagation gradient descent to remove noise is more efficient than using the backpropagation method alone.
• The linear output MF removes more Gaussian noise than the constant output MF.
• The step size of the training data for all MFs in ANFIS is related to the RMSE: if the RMSE decreases, the step size is increased; if the RMSE oscillates, the step size is decreased.
• We did not find overfitting in ANFIS for noise cancellation, regardless of the noise intensity in the nonlinear passive dynamics of order 2.
• The number of MFs chosen for each input reflects the complexity of ANFIS. It is important that the structure of the system matches the data.

Because ANFIS combines the simplifying power of fuzzy reasoning with the self-learning ability of neural networks, it has a strong capability for eliminating pseudo signals (noise). The images processed by ANFIS in the above figures are clearer than those processed by any of the traditional filters in this paper.
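The step-size rule summarized in the conclusions can be stated as a small heuristic. The growth and decay factors below are illustrative choices, not values from the paper:

```python
def adapt_step_size(step, rmse_history, grow=1.1, shrink=0.9):
    """Heuristic step-size update: grow the step while the RMSE keeps
    decreasing, shrink it when the RMSE starts to oscillate, and otherwise
    leave it unchanged."""
    if len(rmse_history) < 3:
        return step
    a, b, c = rmse_history[-3:]
    if a > b > c:                  # RMSE consistently decreasing
        return step * grow
    if (b - a) * (c - b) < 0:      # successive changes differ in sign: oscillation
        return step * shrink
    return step
```

Applied once per epoch, this reproduces the qualitative behavior described above: the step size ramps up during steady progress and backs off when training becomes unstable.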
When the data in Table 8 are compared with Table 3, the precision of restoration with the proposed noise cancellation using ANFIS (Pi-shaped MF) is at least 75 times higher for Gaussian noise, in terms of MSE, than with any of the conventional filtering techniques. Our experiments show that ANFIS can be successfully applied to image noise cancellation.
