Optimization and Application of Real-Valued Negative Selection Algorithm

Optimization and Application of Real-Valued Negative Selection Algorithm

Available online at www.sciencedirect.com Available online at www.sciencedirect.com Procedia Engineering Procedia Engineering 00 (2011)23000–000 Pro...

684KB Sizes 3 Downloads 110 Views

Available online at www.sciencedirect.com Available online at www.sciencedirect.com

Procedia Engineering

Procedia Engineering 00 (2011)23000–000 Procedia Engineering (2011) 241 – 246 www.elsevier.com/locate/procedia

2011 International Conference on Power Electronics and Engineering Application (PEEA 2011)

Optimization and Application of Real-valued Negative Selection Algorithm Xu Aiqiang, Liu Yong, Zhao Xiuli, Yang Chunying, Li Tingjun Naval Aeronautical and Astronautical University,Yantai 264001,China

Abstract Concerned with the problem of lacking fault samples of complex equipments, it studies the principle and application of negative selection algorithm of artificial immune system. The detectors generation mechanism of real-valued negative selection algorithm is introduced. Intuitively, to maximize the covering produced by a set of detectors, it is necessary to reduce their overlapping and not covering the self set. This paper presents an optimization strategy base on re-heating simulated annealing algorithm to modify the position of detectors, not changing their number. This method can improve the covering effect of non-self space. The triangle training data are used to demonstrate the properties of optimized VRNS. Detection rate is improved and false alarming rate is decreased. It is used for fault detection in analog circuit; result demonstrates that the proposed algorithm is better than artificial neural network in fault detection of this circuit.

© 2011 Published by Elsevier Ltd. Selection and/or peer-review under responsibility of [name organizer] Keywords:negative selection algorithm; re-heating simulated annealing; optimization; fault detection

1. Introduction The successful application of artificial intelligence has solved some problems which cannot be solved by traditional fault diagnosis technology. It brings new vitality to the development of fault diagnosis [1]. For most equipments, it is easy to obtain lots of data reflecting their normal operation condition but hard to obtain diagnosis data. So some intelligent diagnosis methods relying on a lot of samples to training and learning cannot implement well. This has influenced further development of fault diagnosis research. AIS emerged in the 1990s as a new branch in Computational Intelligence. The field of AIS is becoming more popular and AIS-based works spanning from theoretical modeling and simulation to wide variety of applications. Negative selection, immune network model and clonal selection are still the most

1877-7058 © 2011 Published by Elsevier Ltd. doi:10.1016/j.proeng.2011.11.2496

2242

Xu Aiqiang et al. / Procedia Engineering 23000–000 (2011) 241 – 246 Author name / Procedia Engineering 00 (2011)

discussed models [2]-[4]. They are used in pattern recognition, fault detection, computer security, and a variety of other applications researchers are exploring in the field of science and engineering[5]-[8]. 2. Real-valued Negative Selection Algorithm The main task of the immune system is to defend the body against disease caused by pathogens. To detect and eliminate pathogens, the immune system contains certain types of white blood cells called lymphocytes, which can recognize pathogenic patterns, called antigens. To avoid a misclassification of self proteins by lymphocytes, the immune system eliminates self reactive lymphocytes in a censoring process called negative selection [2]. The negative selection algorithm is one of the most widely used techniques in the field of AIS. The components in negative selection (NS) algorithms to be discussed in this paper include data space representation, detector representation, matching rule, detector generation/elimination mechanism. Data representation is a fundamental difference between various models of negative selection algorithms. It limits the possible matching rules, the detector generation mechanism, and the detection process. Theoretically, binary representation subsumes any other representations because any data are eventually implemented as binary bits in a computer. But it has some limitations. Real-valued vector representation is thus another important type of representation. It represents each data item as a vector of real numbers. The matching rules and the measure of difference or similarity should be based on the numeric elements of the vector. The real-valued negative selection (RNS) algorithm was proposed by Gonzalez et al[9]. A detector (antibody) is defined by a n-dimensional vector that corresponds to the center and by a real value that represents its radius; therefore, a detector can be seen as a hypersphere in. The real-valued representation involves the data to be normalized in the range [0,1] first before any further processing. The matching rule is usually expressed as the Euclidean distance. 3. Optimization strategy of Detectors Generation Mechanism This technique is used to re-distribute the randomly generated detectors to achieve optimal distribution. The problem of finding a good distribution of the detectors can be better stated as an optimization problem: Maximize: V ( D) = Volume { x ∈ U | ∃di ∈ D, x − di ≤ ri } (1) restricted to:

{s ∈ S | ∃di ∈ D, s − di i = 1, 2," , num

≤ ri } = ∅ (2)

Where D is the set of detectors, num is the number of detectors ri is the detector radius, S is input self set. The original function to optimize corresponds to the volume covered by the detector set (Equation(1)); however, to calculate it can be very costly. Intuitively, to maximize the covering produced by a set of detectors, it is necessary to reduce their overlapping. So the purpose is to minimize both the overlapping between the detectors and self-overlap while not changing the detectors number. The following equation defines an approximate measure of the overlapping of the detector set D :

3243

XuAuthor Aiqiang et al. / Procedia Engineering (2011) 241 – 246 name / Procedia Engineering 0023 (2011) 000–000

⎛ ⎜ 2 ⎜ − di − d j Overlapping ( D) = ∑ exp ⎜ 2 i≠ j ⎜ ⎛ ri + rj ⎞ ⎜ ⎜ 2 ⎟ ⎠ ⎝ ⎝ i, j = 1, 2," num

⎞ ⎟ ⎟ ⎟, ⎟ ⎟ ⎠

(3)

The self-overlap is described by ⎛ ⎜ 2 − di − s ⎜ SelfCovering ( D) = ∑ ∑ exp ⎜ r +r 2 s∈S di ∈D ⎜ ⎛⎜ i s ⎞⎟ ⎜ ⎝⎝ 2 ⎠ i = 1, 2," num

⎞ ⎟ ⎟, ⎟ ⎟ ⎟ ⎠

(4)

Where rs is the self variability threshold. Using Equation (3) and (4), the function without restrictions to be optimized here is defined as C ( D) = Overlapping ( D) + γ ⋅ SelfCovering ( D) (5)

Where γ is the self covering penalization coefficient for violating the noself-covering restriction. To test the optimization strategy described in the previous section, experiments were carried out using 2-dimensional synthetic data. A fixed number(1000) of random points from the self region are used as the self sample to generate the detector set. We assume that the training self points are distributed randomly over the self space of certain shape. When the initial detectors set is generated, maximum number of detectors is 200, estimated coverage is 90%, maximum proportion of self area allowed is 15%. Considering the performance and complexity, parameters are tried with many experiments, they are: self radius rs = 0.01 , maximum number of iterations for first and second annealing is 150, minimum accepted transitions ηmin = 0.3 , temperature decay rate α = 0.95 , c1 = 15 , c 2 = 25 , penalization coefficient γ = 5 . 100 suggested method reference method

90 80

Temperature

70 60 50 40 30

← first annealing

20 10 0

Fig. 1. Comparison of cooling method

0

50

100

150 Iteration number

← annealing 200

250

300

4244

Xu name Aiqiang et al. / Procedia Engineering 23000–000 (2011) 241 – 246 Author / Procedia Engineering 00 (2011)

Fig.2 shows the coverage achieved by detector set generated by different algorithm, using triangle 2 training data. The abscissas and ordinates denote respectively position of the unit square [ 0,1] . These red points represent self samples and blue circles represent detectors in the figure. Simulation results show the covering effect of detectors. (a) The initial detectors generated randomly could cover the nonself region well. The holes are easier to be covered by the detector’s characteristic of variable-size, however the overlapping between detectors and self set is relatively large. Therefore, some local normal sample points are also covered by detectors. (b) Number of detectors is not changed but overlapping of detectors is decreased, especially overlapping between detector and self set. Comparing (a) and (b), it is easy to see the effect of optimization on the results. It has improved the coverage of nonself region and reduced overlapping of self set. 1

1

0.9

0.9

0.8

0.8

0.7

0.7

0.6

0.6

0.5

0.5

0.4

0.4

0.3

0.3

0.2

0.2

0.1

0.1

0 0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0

0

(a) initial

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

(b) optimized

Fig.2. Covering effect of triangle training data

Fig.3 shows the result of anomaly detection experiment. The test set is 1000 random distributed points over the entire space, including normal points (green points in the figure) and anomaly points (red points in the figure). We use the same control parameters. 1

1

0.9

0.9

0.8

0.8

0.7

0.7

0.6

0.6

0.5

0.5

0.4

0.4

0.3

0.3

0.2

0.2

0.1

0.1

0 0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

(a) initial

1

0

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

(b) optimized

Fig.3. Detection result of triangle training data

Table 1 highlights the difference between initial and optimized algorithm. Less number of detectors can detect large number of anomaly points. The actual detection of VRNS generation mechanism is

5245

XuAuthor Aiqiang et al. / Procedia Engineering (2011) 241 – 246 name / Procedia Engineering 0023 (2011) 000–000

improved than target coverage estimation (90%) of the initial detectors set. The detection of optimization is improved a little (about 1%), while false alarm rate is decreased significantly (about 10.96%). Obviously, the overall performance of optimized algorithm has been improved. To study the effect of the important control parameters, experiments were also carried out by the pentagram shape training data. Fig.4 shows some results of covering effect for target coverage from 90% through 99%. The mean of 50 repeated tests is considered as the result of experiments. Table 1. Analisis of Detection Result Number of

Training

Number of

Data

Normal Points

Anormal Points

Triangle

73

927

Number of

Generation

Detection

False Alarm

Detectors

Mechanism

Rate

Rate

initial

98.06%

10.96%

optimized

99.14%

0

77

1

1

0.9

0.9

0.8

0.8

0.7

0.7

0.6

0.6

0.5

0.5

0.4

0.4

0.3

0.3

0.2

0.2

0.1

0.1

0 0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

(a) expected coverage of 90%

0

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

(b) expected coverage of 99%

Fig.4. Detection result of triangle training data

3.1. Fault feature extraction of analog circuit

The model of this circuit is constructed by Pspice, where the tolerance range of resistors is 10% and 15% of capacitors. We find that changes can obviously affect the output response waveform by sensitivity analysis. Therefore, we select the faults of these six components as fault patterns to be considered. Table 2 shows the components parameters. Here only soft fault may happen in one component once, so there are 12 kinds of fault patterns. Fault feature can be shown in frequency response curve. This paper takes the AC signal analysis with input signal of AC 5V, 100Hz~1MHz. Firstly, 50 monte carlo analyses were taken to the circuit in normal pattern, and 50 normal samples were gotten. Then 30 samples of the normal samples were selected randomly as the detector set training samples of VRNS. Subsequently, 20 monte carlo analyses were taken in each fault pattern to get 240 fault samples. We take all fault samples and other 20 normal

6246

Xu Aiqiang et al. / Procedia Engineering 23000–000 (2011) 241 – 246 Author name / Procedia Engineering 00 (2011)

samples together as testing samples. The Sample frequency points are 3KHz, 7KHz, 10KHz, 15KHz and 20KHz. The voltage values of these frequencies constitute the 5- -dimensional eigenvectors of different patterns. The normalized eigenvectors are sample data needed for VRNS. 3.2. Results of fault detection

.The detection results tell us that the general detection rate based on VRNS is 95% while ANN is 84.5%. It is shown that soft fault of this circuit can be detected well by VRNS, better than ANN method. 4. Conclusion

For reducing the detectable black hole, re-heating simulated annealing algorithm is proposed to optimize the detectors distribution never changing number of detectors. Simulation shows that the proposed algorithm improves the coverage of the nonself space. Experiment results have demonstrated that the proposed algorithm is able to detect faults in the analog circuit. References [[1] M. Hajiaghajani, H. A. Toliyat, and I. M. S. Panahi, “Advanced Fault Diagnosis of a DC Motor,” IEEE Trans. Energy Conversion, Vol. 19, no. 1, 2004, pp. 60-65. [2] S. Forrest, A.S. Perelson, L. Allen, and R. Cherukuri, “Self-Nonself Discrimination in a Computer,” Proceedings of the 1994 IEEE Symposium on Research in Security and Privacy, Los Alamitos, CA: IEEE Computer Society Press, 1994, pp. 202-212. [3] L.N. de Castro and F.J.V. Zuben, “aiNet: An artificial immune network for data analysis,” Data Mining: A Heuristic Approach, H.A. Abbass, R.A. Sarker, and C.S. Newton, Eds. Idea Group Publishing, USA, 2001, pp. 231–259. [4] De Castro L N, Von Zuben F J, “Learning and optimization using the clonal selection principle,” IEEE Transactions on Evolutionary Computation, vol. 6, no. 3, 2002, pp. 239-251. [5] J. H. Carter, “The Immune System as a model for Pattern Recognition and classification,” Journal of the American Medical Informatics Association, vol. 7, no. 3, 2000, pp. 28-41. [6] D. Dasgupta, K. Krishnakumar, D. Wong and M. Berry, “Negative Selection Algorithm for Aircraft Fault Detection,” Proceedings of the 3rd International Conference on Artificial Immune Systems, Italy, 2004, pp. 13-16. [7] Jorge Amaral, Jose Amaral and Ricardo Tanscheit. An Immune Fault Detection System for Analog Circuits with Automatic Detector Generation. Published in the proceedings of IEEE World Congress on Computational Intelligence in Congress on Evolutionary Computation, Vancouver, Canada, 2006, pp. 16-21.