A novel intelligent protection system for power transformers considering possible electrical faults, inrush current, CT saturation and over-excitation


Electrical Power and Energy Systems 64 (2015) 1129–1140


Mohammad Yazdani-Asrami (a,*), Mehran Taghipour-Gorjikolaie (a), S. Mohammad Razavi (b), S. Asghar Gholamian (c)

(a) Young Researchers and Elite Club, Sari Branch, Islamic Azad University, Sari, Iran
(b) Faculty of Electrical and Computer Engineering, University of Birjand, Birjand, Iran
(c) Faculty of Electrical and Computer Engineering, Babol University of Technology, Babol, Iran

Article info

Article history:
Received 27 July 2013
Received in revised form 5 August 2014
Accepted 12 August 2014

Keywords: Magnetizing inrush current; Internal fault; CT saturation; Over-excitation; Bayesian classifier; Improved gravitational search algorithm

Abstract: Many electrical events can easily damage equipment in power systems. Such events or faults could be stopped at an incipient stage, but weaknesses in protection systems allow them to grow and spread, imposing considerable problems and costs on utilities. Power transformers are among the most vital pieces of equipment in electrical networks and industry. Although many protection systems have been implemented to prevent dangerous electrical faults, most of them suffer from problems such as wasted time, heavy computational burden, and slow response. In addition, whenever the patterns of fault signals are similar, discriminating them from one another is difficult. Magnetizing inrush current, internal faults, CT saturation, and over-excitation are common electrical events in power transformers that are often hard to separate. In this paper, all considerations for designing a complete protection system for power transformers are taken into account, and an intelligent method based on Artificial Neural Networks (ANNs) is designed. The proposed protection system includes two major sections. In the first section, a Bayesian Classifier (BC), which works on Bayes' rule and uses the knowledge of the training data directly, detects an internal fault and discriminates it from the three other mentioned conditions and the normal condition. If the event is not an internal fault, the second stage of the intelligent system makes a decision: an ANN trained by swarm-based algorithms, namely the Improved Gravitational Search Algorithm (IGSA) or Particle Swarm Optimization (PSO), discriminates magnetizing inrush current, current transformer (CT) saturation, and over-excitation. The obtained results show that the proposed system can easily and precisely follow electrical faults in a power transformer and detect them at an incipient stage. Such a quick and accurate response helps to save considerable energy, cost, and time.
© 2014 Elsevier Ltd. All rights reserved.

Introduction

Power transformers are the most important and essential components in electrical power systems. The reliability of a transformer depends upon adequate design, correct operation, proper maintenance, and the application and installation of protective equipment. A reasonable design includes adequate insulation of windings, laminations, and core bolts, and the ability of conductors to withstand short-circuit stresses and other transients. As a result, the operation of transformer protection must be reliable, sensitive,

* Corresponding author. E-mail address: [email protected] (M. Yazdani-Asrami).
http://dx.doi.org/10.1016/j.ijepes.2014.08.008
0142-0615/© 2014 Elsevier Ltd. All rights reserved.

accurate and fast. Fault statistics show that about 12% of total power system faults are due to power transformer failures. The future development of modern power systems is reflected in transformer design [1,2]. Magnetizing inrush current and various internal and external fault currents in power transformers are among the oscillatory disturbances that initiate high currents with many destructive features. One of the main concerns in power transformer protection is therefore to accurately distinguish between different transient disturbances. Many techniques have been proposed to classify the various currents flowing through a power transformer in order to develop accurate, efficient, and reliable protection. Accurate classification of the currents in a power transformer is also considered the key requirement for preventing mal-operation of the protective


equipment under different non-fault conditions, including magnetizing inrush current, through-fault current, CT saturation, tap-changer operation, ratio mismatch, and currents in the grounding impedance. This requirement has become a main challenge for power transformer protection because of the need for accurate, fast, and reliable differentiation between internal fault and magnetizing inrush currents [3–6]. Differential protection is considered the main protection of the power transformer, and discrimination between internal faults and transformer inrush current must be rapid and accurate. The inrush current contains all harmonic orders, but in practice only the second harmonic is used: the harmonic-restraint methods work on the assumption that the magnetizing inrush current contains a high level of second-harmonic content [7–9]. However, the second-harmonic component may also be generated during internal faults in the power transformer. To avoid the dependency of the tripping operation on such a threshold, many novel methods have been proposed. Some of these approaches include wavelet-transform-based techniques [10–12], instantaneous-leakage-inductance-based techniques [13,14], an induced-voltage-based method [15], a correlation-transform-based technique [16], the similarity degree between voltage and current [17], fuzzy-logic-based approaches [18,19], a combination of wavelet transform and fuzzy logic [20], Support Vector Machines [12,21], Hidden Markov Models [22], Gaussian Mixture Models [23], and a space-vector method [24]. In this paper, a novel approach is used to discriminate transient conditions in power transformers. As can be seen in the rest of the paper, the BC, which is used to detect internal faults and discriminate them from the other conditions, is introduced to the power-protection research area for the first time.
Also, using population-based methods such as the Improved Gravitational Search Algorithm (IGSA) and other optimization methods, an ANN has been trained and used as a fault-detection system that discriminates four transient conditions. Indeed, the BC first detects internal faults; if the input signal is not an internal fault, the trained ANN decides among magnetizing inrush current, CT saturation, over-excitation, and the normal condition. The proposed method shows promising and accurate results. As shown in the following sections, the BC is an excellent choice for detecting internal faults with low computational cost, and the IGSA gives very good results in training the ANN for the discrimination task.

Overview of several transient conditions in power transformers

Principles of magnetizing inrush current

When an unloaded power transformer is switched onto a power supply, the initial magnetizing current is generally much larger than the magnetizing current at steady state and often much larger than the rated current of the transformer. This phenomenon is known as magnetizing inrush current. This current causes maloperation of protection relays and fuses, mechanical damage to windings from magnetic forces, and power quality problems in electrical systems [25,26]. The phenomenon of transient inrush current can be explained as follows. Ignoring the winding resistance, the relationship between the voltage E_m sin(ωt) and the flux φ(t) is given by Eq. (1):

E_m \sin(\omega t) = N_1 \, \frac{d\phi(t)}{dt}   (1)

\phi(t) = -\frac{E_m}{\omega N_1} \cos(\omega t) + \phi_r   (2)

where E_m / (\omega N_1) = \phi_m, and N_1 and \phi_r are the number of turns in the primary winding and the remnant flux, respectively [26,27].
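The flux build-up of Eq. (2) can be sketched numerically. The snippet below is a minimal illustration, assuming energization at a voltage zero-crossing and illustrative per-unit values for φ_m and φ_r (not data from the paper); it shows that the flux peaks near φ_m + φ_r half a cycle after switching, which is what drives the core into saturation.

```python
import math

# Transient flux of Eq. (2): phi(t) = -phi_m*cos(w*t) + phi_r.
# phi_m and phi_r below are illustrative per-unit assumptions.
def transient_flux(t, phi_m=1.0, phi_r=0.8, f=50.0):
    w = 2 * math.pi * f
    return -phi_m * math.cos(w * t) + phi_r

# Half a cycle after switching (w*t = pi) the flux peaks near phi_m + phi_r,
# far above the steady-state amplitude phi_m.
peak = transient_flux(0.01)  # t = 10 ms at 50 Hz
```

With φ_m = 1.0 pu and φ_r = 0.8 pu the peak is about 1.8 pu, the well-known flux-doubling effect that saturates the core.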

During the period of transient inrush current, the transformer core normally enters a state of saturation. In this core-saturated state, the permeability approaches the absolute permeability μ0, and the magnetizing inductance is therefore reduced. The simplified equation used in industry to calculate the peak value of the first cycle of inrush current is [28]:

I_{pk} = \frac{\sqrt{2}\, U}{\sqrt{(\omega L)^2 + R^2}} \cdot \frac{2B_N + B_R - B_S}{B_N}   (3)
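Eq. (3) is simple enough to evaluate directly. In this sketch, U, L, R and the flux densities B_N, B_R, B_S follow the definitions given after the equation; the numeric values are illustrative assumptions, not parameters of a transformer studied in the paper.

```python
import math

# First-cycle inrush peak per Eq. (3); all parameter values used here
# are illustrative assumptions.
def inrush_peak(U, L, R, B_N, B_R, B_S, f=50.0):
    w = 2 * math.pi * f
    scale = math.sqrt(2) * U / math.sqrt((w * L) ** 2 + R ** 2)
    return scale * (2 * B_N + B_R - B_S) / B_N

ipk = inrush_peak(U=11e3, L=0.05, R=0.5, B_N=1.7, B_R=1.3, B_S=2.0)
```

Note that a high remnant flux density B_R raises the peak, which is why the worst-case inrush follows de-energization at an unfavorable point on the voltage wave.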

where U, L and R are the applied voltage, the inductance of the transformer, and the DC resistance of the transformer windings, respectively, and B_R, B_S and B_N are the remnant, saturation, and normal rated flux densities of the core, respectively. The current increases quickly due to the decrease in inductance. The exciting current of a transformer in steady state is typically less than 2% of the rated current, but the inrush current may be as high as 10 times the rated current or more [25–27].

Principles of CT saturation

In order to protect power systems, it is necessary to know the amount of current flowing in the transmission line. Since such a high current cannot be used directly by measurement and protection equipment, and since isolation of this equipment is also an important issue, the current must be reduced before being used by measuring and protective devices. This is done by the current transformer (CT). For small primary currents, and consequently small secondary currents, a small voltage Es is induced in the secondary of the transformer. The flux generated by the transformer equals Es/(4.44FN), and the magnetizing current is correspondingly small; the secondary current is then the primary current divided by the turns ratio N. If the primary current increases, the secondary current first increases by the same ratio, causing an increase in the induced secondary voltage. This increase in secondary voltage requires an increase in the flux generated by the CT, and while the flux is increasing the transformer draws a higher magnetizing current. Beyond the knee point of the B-H curve, however, an increase in flux causes a disproportionate and severe growth in magnetizing current, due to the non-linear nature of the magnetizing curve of the CT. It is worth noting that the magnetizing current is not completely sinusoidal but has a considerable peak value.
As the primary current keeps increasing, a point is reached at which a very high magnetizing current is needed and almost all of the transformed current is used for magnetization. This means that only a very small current is fed to the load, and in this case the CT is said to be fully saturated [29].

Principle of over-excitation condition

Transformer differential relays may operate unnecessarily in generating plants when a unit-connected generator is separated while exporting VARs. A higher-than-nominal "volts per hertz" condition can be caused by the sudden voltage rise impressed on the unit transformer windings from the loss of VAR load. When the primary winding of a transformer is over-excited and driven into saturation, more power appears to be flowing into the primary of the transformer than is flowing out of the secondary. A differential relay, with its inputs supplied by properly selected CTs to accommodate ratio and phase shift, will perceive this as a current differential between the primary and secondary windings and, therefore, will operate. This would be an undesirable


operation, as no internal fault would exist, with the current imbalance being created by the over-excitation condition.

Principle of internal fault

In the condition monitoring of power transformers, a fault is defined as any unpredictable and unplanned event inside or outside the transformer. In this situation, the transformer has to be tripped and repaired before re-connection. A short circuit in the transformer windings is one of the most dangerous events and is called an "internal fault". Whenever the windings of a transformer carry the severe current caused by a short circuit, they suffer high mechanical stress produced by the magnetic forces of the short-circuit current. This stress can cause serious physical deformation of the windings and consequently faulty transformer operation. The most important internal faults of transformers are turn-to-turn faults, winding-to-winding faults, winding-to-earth faults, terminal-to-terminal faults, terminal-to-earth faults, and winding insulation faults. Internal faults can arise for many reasons, the most important being mechanical damage and defects in the original structure of the winding, over-voltages and transient currents, careless transportation of transformers, insulation problems, aging, etc. Since internal faults have a destructive effect on power transformer operation, they should be detected quickly and the transformer should be tripped. Therefore, the detection of internal faults and their discrimination from other transient events, such as magnetizing inrush current, CT saturation, and over-excitation, is vital.

Overview of Bayesian classifier

One of the most well-known probabilistic classifiers is the BC, which is based on Bayes' theorem [30]. According to Bayes' rule, presented in Eq. (4), this theorem consists of some parameters that are determined in the training process.

posterior = \frac{prior \times likelihood}{evidence}   (4)

where, if "d" denotes data and "h" denotes a hypothesis, then:

p(h|d) = \frac{p(h)\, p(d|h)}{p(d)}   (5)

where the prior probability p(h) in Eq. (5) is the probability available before observing the identity of the data, and the likelihood p(d|h) is the probability of the data if the hypothesis h is true; note also that p(h|d)p(d) = p(d|h)p(h). Using the sum rule, the denominator in Bayes' theorem can be expressed in terms of the quantities in the numerator, as follows:

p(d) = \sum_h p(d|h)\, p(h)   (6)

This quantity is the so-called "marginal probability". The goal of Eq. (5) is the posterior probability, that is, the probability after having seen the data d. Using the concept and basis of the above equations, a BC can be created. For this purpose, the class targets/labels, the mean, and the variance (for one-dimensional applications) or covariance (for more than one dimension) of each class should be determined. Gaussian distributions are used to model the classes. Eqs. (7) and (8) present the class-conditional probability of a given data point for the one-dimensional case and for more than one dimension, respectively:

N(x|\mu, \sigma^2) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left(-\frac{1}{2\sigma^2}(x - \mu)^2\right)   (7)

N(x|\mu, \Sigma) = \frac{1}{(2\pi)^{D/2}\, |\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x - \mu)^T \Sigma^{-1} (x - \mu)\right)   (8)

where the D-dimensional vector μ is the so-called "mean", σ² is the "variance", the D × D matrix Σ is the so-called "covariance", and |Σ| denotes the determinant of Σ. To illustrate the operation of the BC, consider the following example. Suppose a two-class problem is available, with the normal (Gaussian) distribution of each class shown in Fig. 1; these curves are plotted using the mean and variance of each class. When the purpose is to determine the corresponding class of an input test datum Xin, its class can be found simply by evaluating the class distributions at that point. According to Fig. 1, fC2(Xin) is larger than fC1(Xin), so the input test datum is assigned to class C2.

Fig. 1. A two-class problem with the Gaussian distributions of classes C1 and C2.

Overview of artificial neural network

Basic concepts of ANN

ANNs are the most common classifiers for data and pattern recognition in engineering problems. These classifiers are designed on the model of the human neural network; in other words, their action is similar to that of the human neural network. "The human brain is a physical symbol system that can be fully understood by computer simulation of its processes," said Herbert Simon (1918–2001). Many topologies and structures of ANN have been presented in the pattern recognition field. The Feed-Forward Back Propagation Neural Network (FFBPNN), the Cascade Forward Back Propagation Neural Network (CFBPNN), and the Radial Basis Function Neural Network (so-called RBF) are the most well-known and powerful classifiers used for classification problems.

Feed-forward back propagation neural network

As can be seen in Fig. 2, a simple model of an ANN is created by connecting neurons. The relative position of the cells in the network (the number of neurons, the number of layers, and all connections between cells) is the network topology. Indeed, the topology is the hardware connection scheme of the neurons which, together with the corresponding software (namely, the mathematical procedure for information flow and weight calculation), determines the kind of neural network application. In this topology, there is one input layer that accepts information, some hidden layers that take information from the previous layer, and finally one output layer that accepts the results of the calculations and presents the output. Each cell is connected to all cells of the next layer. Self-connections, connections to a previous layer, and jumps over a layer are forbidden. This is called "feed forward" because information always flows from input to output. About 90–95% of applications use this topology. At first, the synaptic weights are random


values that will be corrected during a special iterative training method [31].

Fig. 2. Feed-Forward Neural Network (FFNN).

As can be seen in Fig. 3, for each neuron the weighted outputs of the previous layer's neurons are added together, and then, by passing through an activation function, the output value of that neuron is obtained. So, the equation for one neuron in an ANN can be expressed as follows:

y_k = \varphi_k\left(\sum_{i=1}^{m} w_i x_i\right)   (9)

where x_i is the input data to the ANN or the output of the previous layer, and the w_i are the synaptic weights between the ith neuron of the previous layer and the kth neuron of the next layer. φ_k is the activation function, which can take different forms; several conventional activation functions are shown in Fig. 4.

Fig. 3. A sample of a neuron in an ANN.
Fig. 4. Several conventional and commonly used activation functions.

Cascade-forward back propagation neural networks

Cascade-forward networks are similar to feed-forward networks, but they include connections from the input to all layers with corresponding weights; namely, for each layer, in addition to the connection from the previous layer, one connection is fed into the layer directly from the input. Fig. 5 shows the typical structure of CFBP networks. In this figure, Win2HL is the weight vector from the input to the hidden layer, Win2OL is the weight vector from the input to the output layer, and WHL2OL is the weight vector from the hidden layer to the output layer. Also, bHL and bOL are bias vectors.

Radial basis function neural networks

In common network structures such as the Multi-Layer Perceptron (MLP), the neurons of the hidden layers have non-linear functions such as the sigmoid or hyperbolic tangent, whereas the hidden-layer neurons of RBF networks use a Gaussian non-linear function. The RBF network has one input layer, one hidden layer, and one output layer. The neurons of the hidden layer are multi-dimensional units

whose dimension is equal to the number of inputs. The training procedure of an RBF network includes two stages, unsupervised and supervised. First, using one of the clustering methods, the parameters of the Gaussian functions of the hidden layer are adjusted; then, the connection weights between the hidden layer and the output layer are corrected using a supervised method [32]. The behavior of an RBF network resembles that of real biological networks. The hidden layer consists of relatively sensitive, non-linear units, and in most applications the output layer consists of linear units. Each neuron in the hidden layer produces a larger output the closer the input vector of the network is to the center of that neuron's non-linear function; increasing the distance between the input vector and the center reduces the neuron's output. A simple structure of an RBF network is shown in Fig. 6. RBF networks belong to the kernel classifiers. The non-linear functions in the hidden layer are called "basis functions" and are chosen in Gaussian form.

Overview of swarm-based algorithms

Nowadays, different algorithms are available to solve optimization problems. Evolutionary algorithms, such as Genetic Algorithms (GA), and swarm-based algorithms, such as Ant Colony Optimization (ACO), Particle Swarm Optimization (PSO), and the Gravitational Search Algorithm (GSA) [33], can be good choices for such problems. On the whole, swarm-based algorithms are simpler and more understandable than evolutionary ones. Swarm intelligence is used to find the best, optimal solution. In this paper, two of these algorithms have been used to train the ANN.

Overview of particle swarm optimization

PSO simulates the behavior of bird flocking. In PSO, each single solution is a "bird" in the search space, here called a "particle".
For each particle a fitness value is calculated by the fitness function to be optimized, and each particle has a velocity that directs its flight. The particles are "flown" through the problem space by following the current optimum particles. PSO is initialized with a group of random particles (solutions) and then searches for optima by updating generations. In each iteration, each particle is updated by following two "best" values. The first is the best solution (fitness) the particle has achieved so far; this value is stored and called "pbest". The other "best" value tracked by PSO is the best value obtained so far by any particle in the population; this global best is called "gbest". After finding the two best values, the particle updates its velocity and position according to Eqs. (10) and (11):


Fig. 5. Typical structure of CFBP networks.

Fig. 6. A simple structure of RBF network.

V_{new} = W \cdot V_{old} + C_1 \cdot r \cdot (P_{pb} - X_{cs}) + C_2 \cdot r \cdot (P_{gb} - X_{cs})   (10)

X_{new} = X_{old} + V_{new}   (11)

where W is the inertia weight, V_{new} is the particle velocity, X_{cs} is the current position (solution) of each particle, P_{pb} and P_{gb} are pbest and gbest, r is a random number in (0, 1), and C_1, C_2 are learning factors. Particle velocities in each dimension are clamped to a maximum velocity V_{max}, a parameter specified by the user: if the sum of accelerations would cause the velocity in a dimension to exceed V_{max}, the velocity in that dimension is limited to V_{max}. In Fig. 7, the typical movement of one particle in the solution space is shown.
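A minimal sketch of the update of Eqs. (10) and (11) follows; the values of W, C1, C2 and Vmax here are illustrative choices, not the settings tuned in this work.

```python
import random

# One PSO update per Eqs. (10)-(11), with per-dimension velocity clamping.
def pso_step(x, v, pbest, gbest, W=0.7, C1=2.0, C2=2.0, vmax=4.0):
    r1, r2 = random.random(), random.random()
    v_new = [W * vi + C1 * r1 * (pb - xi) + C2 * r2 * (gb - xi)
             for xi, vi, pb, gb in zip(x, v, pbest, gbest)]
    v_new = [max(-vmax, min(vmax, vi)) for vi in v_new]  # clamp to +/- Vmax
    x_new = [xi + vi for xi, vi in zip(x, v_new)]
    return x_new, v_new

x, v = pso_step([0.0, 0.0], [0.1, -0.1], pbest=[1.0, 1.0], gbest=[2.0, 2.0])
```

Iterating this step, with pbest and gbest refreshed from the fitness function after each move, is the whole of the PSO search loop.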

Overview of improved gravitational search algorithm

GSA is a swarm-based heuristic search algorithm based on the law of gravity, in which each particle attracts the other particles with a gravitational force. In the GSA, agents are considered as objects whose performance is expressed by their masses, determined using a fitness function. The position of each object corresponds to a solution of the problem. Through the gravity force all these objects attract each other, and a heavy mass has a large effective intensity of attraction, as illustrated in Fig. 8. Owing to the forces acting on an agent from the other agents, an agent can observe the space around itself, and the gravitational force acts as an information-transferring tool. As a result, agents tend to move toward the objects with heavier masses. Heavier masses, which are more efficient agents and correspond to good solutions, move more slowly than lighter ones; this guarantees the exploitation step of the search algorithm. The heaviest mass represents an optimum solution in the search space [33–35]. In some problems, such as the one presented in this paper, GSA cannot give acceptable results and is trapped in local optima because it has no memory; therefore, an improved version of GSA has been used. The IGSA formulas are based on GSA, but the PSO concept has been used in the velocity-update step; namely, using PSO, a memory term has been added to GSA.

Fig. 7. Typical movement of one particle in the solution space.
Fig. 8. All forces acting from other masses on mass M4.

Formulation of improved gravitational search algorithm

In the IGSA, each agent has three major attributes, namely a new position, an old or previous position, and an inertial mass. The position of the agent corresponds to a solution of the problem, and its gravitational and inertial masses are determined using a fitness function. In an n-dimensional search space, the position of the ith agent is represented by the vector X_i = (x_i^1, ..., x_i^n). In the initialization process, the positions of a set of agents are created randomly.

Calculation of the total force in different directions

As can be seen in Fig. 8, the force acting on the ith agent from the jth agent at iteration t is defined as follows:

F_{ij}^d(t) = G(t)\, \frac{M_i(t)\, M_j(t)}{R_{ij}(t) + \epsilon}\, \left(x_j^d(t) - x_i^d(t)\right)   (12)

R_{ij}(t) = \| X_i(t), X_j(t) \|_2   (13)

where M_i(t) and M_j(t) are the masses of agents i and j at iteration t, respectively, ε is a small constant, and R_{ij}(t) is the Euclidean distance between the two agents i and j. The coefficient G(t) is the gravitational constant at iteration t; it is initialized at the beginning and reduced with the iteration steps to control the search accuracy, as in Eq. (14):

G(t) = G_0 \cdot e^{-\alpha t / t_{max}}   (14)

where t_{max} is the maximum iteration number, G_0 is the initial value of G, and α is a positive factor. To give a stochastic characteristic to the search algorithm, the total force acting on the ith agent in dimension d is calculated as follows:

F_i^d(t) = \sum_{j=1,\, j \neq i}^{N} rand_j\, F_{ij}^d(t)   (15)

where rand_j is a random number in the interval [0, 1].

Updating of agents' positions

Each agent has a velocity and an acceleration, denoted v_i(t) and a_i(t), respectively. The acceleration of any agent at iteration t is equal to the total force acting on the agent divided by its mass, as follows:

a_i^d(t) = \frac{F_i^d(t)}{M_i(t)}   (16)

Using the acceleration, the velocity of any agent is modified in the IGSA according to the following equation:

v_i^d(t+1) = rand \cdot v_i^d(t) + a_i^d(t) + rand \cdot C_1 \cdot (x_{pbest} - x(t)) + rand \cdot C_2 \cdot (x_{gbest} - x(t))   (17)

where rand is a uniform random number between 0 and 1, x_{pbest} is the best personal position of each agent so far, and x_{gbest} is the best global position among all agents so far. According to Fig. 9, each agent moves from its current position to the next one with the velocity modified by Eq. (17), using the following equation:

x_i^d(t+1) = x_i^d(t) + v_i^d(t+1)   (18)
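One IGSA iteration, covering Eqs. (12)-(18), can be sketched for a one-dimensional minimization problem as follows. The population size, G0, α, C1 and C2 are illustrative assumptions, not the paper's tuned settings, and the mass computation uses a common fitness-based normalization.

```python
import math
import random

# One IGSA iteration on a 1-D minimization problem, following Eqs. (12)-(18).
def igsa_step(X, V, fit, pbest, gbest, t, t_max,
              G0=100.0, alpha=20.0, C1=0.5, C2=1.5, eps=1e-12):
    N = len(X)
    G = G0 * math.exp(-alpha * t / t_max)                    # Eq. (14)
    worst, best = max(fit), min(fit)
    m = [(worst - f) / (worst - best + eps) for f in fit]    # raw masses
    M = [mi / (sum(m) + eps) for mi in m]                    # normalized masses
    X_new, V_new = [], []
    for i in range(N):
        # total randomly weighted force on agent i, Eqs. (12), (13), (15)
        F = sum(random.random() * G * M[i] * M[j]
                * (X[j] - X[i]) / (abs(X[j] - X[i]) + eps)
                for j in range(N) if j != i)
        a = F / (M[i] + eps)                                 # Eq. (16)
        v = (random.random() * V[i] + a                      # Eq. (17):
             + random.random() * C1 * (pbest[i] - X[i])      # PSO-style
             + random.random() * C2 * (gbest - X[i]))        # memory terms
        V_new.append(v)
        X_new.append(X[i] + v)                               # Eq. (18)
    return X_new, V_new

# minimize f(x) = x**2 with three agents
X, V = [3.0, -2.0, 0.5], [0.0, 0.0, 0.0]
fit = [x * x for x in X]
X, V = igsa_step(X, V, fit, pbest=list(X), gbest=0.5, t=1, t_max=50)
```

The two rand·C terms are the memory that plain GSA lacks; setting C1 = C2 = 0 recovers the original GSA update.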

Training the ANN with IGSA

The main goal of training is to update the weights and biases and finally to reduce the error at the output. In this paper, the Mean Squared Error (MSE) performance function, a measure of the difference between the actual output of the ANN and the desired output, has been used as the function to be minimized; its value is driven toward zero using IGSA. When the MSE reaches zero, the actual output is the same as the target output, and the ANN can be considered well trained. To apply IGSA to train the ANN, the following steps should be carried out.

Fig. 9. Movement of M4, according to update process in IGSA.

Step (1). Determining initial parameters

In this step, the structure of the ANN and the initial parameters and controlling values of IGSA are set. First, the number of layers, the number of neurons in each layer, and the topology of the ANN are determined. After that, the initial parameters of IGSA are determined: the number of masses/agents and the number of problem/solution dimensions, which depends on the number of synaptic weights of the FFNN. The number of weights, in other words the number of solution dimensions, can be calculated by Eq. (19):

[(Number\ of\ Input\ Data) \times h + b_1] + [h \times b + b_2] + [b \times c + b_3] = w\ (Number\ of\ Neural\ Network\ weights)   (19)

where h, b and c are the numbers of neurons in the first, second, and third layers, respectively, and b_1, b_2 and b_3 are the bias terms, one per neuron. Then the values of the controlling parameters G_0 and α, the velocity parameters C_1 and C_2, and the number of iterations are determined. Finally, the initial locations of the masses in the solution space are set randomly, each mass having w dimensions.

Step (2). Applying the optimizing algorithm (IGSA)

In this step, the IGSA algorithm is run to train the FFNN. It should be noted that swarm-intelligence-based algorithms are usually used to optimize (minimize or maximize) non-smooth or non-linear mathematical functions, and the FFNN can be considered a non-smooth mathematical function that can be optimized by swarm-based algorithms. To use IGSA, this function must be called: in a given sub-function, the structure of the desired FFNN is designed. This work consists of several steps. First, the input data, the target matrix, and the matrix of weights that expresses the initial positions of the masses are loaded. The dimension of the weight matrix is 1 × w, as follows:

W_1 = [w_1 \; w_2 \; \cdots \; w_w]   (20)

where w_1 is the position of the first mass in the solution space, which expresses all the weights of the FFNN for the first mass. After dividing this matrix into sub-matrices that express the weights of each layer and its biases, the structure of the FFNN is built as follows:

S_1 = W_h \cdot [ID] + b_1   (21)

S_2 = softmax(S_1)   (22)


Fig. 10. Graphical Flowchart of proposed method.

Table 1
Result of determining internal fault data's condition using BC.

Characteristic of train data | Condition (by BC) | Value of BC for internal fault condition | Value of BC for other conditions
16 MVA, 110/33 kV | Internal fault | 2.96 | 0
25 MVA, 110/33 kV | Internal fault | 2.05 | 0
5 MVA, 110/33 kV | Internal fault | 7.51e-01 | 0
3 MVA, 110/33 kV | Internal fault | 6.75e-01 | 0
2 MVA, 110/33 kV | Internal fault | 8.93e-01 | 1.12e-145
16 MVA, 110/11 kV | Internal fault | 9.92e-01 | 1.54e-103
25 MVA, 110/11 kV | Internal fault | 1.35 | 3.03e-44
5 MVA, 110/11 kV | Internal fault | 9.33e-01 | 0
3 MVA, 110/11 kV | Internal fault | 9.17e-01 | 2.99e-44
2 MVA, 110/11 kV | Internal fault | 1.14 | 1.33e-103
16 MVA, 66/33 kV | Internal fault | 1.61 | 1.30e-103
25 MVA, 66/33 kV | Internal fault | 1.71 | 1.82e-102
5 MVA, 66/33 kV | Internal fault | 1.83 | 6.59e-102
3 MVA, 66/33 kV | Internal fault | 8.16e-01 | 3.75e-101
16 MVA, 110/11 kV | CT saturation | 0 | 1.35e-52
16 MVA, 110/33 kV | Over excitation | 0 | 3.72e-42
5 MVA, 110/11 kV | Inrush current | 0 | 2.15e-45
3 MVA, 110/33 kV | Normal condition | 0 | 4.67e-42
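The decision rule behind Table 1 can be sketched for the one-dimensional Gaussian case of Eq. (7). The class statistics below are illustrative assumptions, not the feature statistics actually extracted from the transformer signals; the point is only that the class with the largest prior-weighted likelihood wins, exactly as in the Fig. 1 example.

```python
import math

# 1-D Gaussian Bayesian classifier, Eqs. (4)-(7). The class models
# (mean, variance, prior) below are illustrative assumptions.
def gaussian_pdf(x, mu, var):
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def classify(x, classes):
    # The evidence p(d) is common to all classes, so the argmax of the
    # posterior reduces to the argmax of prior * likelihood.
    return max(classes, key=lambda c: classes[c][2]
               * gaussian_pdf(x, classes[c][0], classes[c][1]))

classes = {"internal fault": (5.0, 1.0, 0.5), "inrush": (1.0, 0.5, 0.5)}
label = classify(4.2, classes)  # assigned to "internal fault"
```

Because only training-set means and (co)variances are needed, this is the low-computational-cost property of the BC that the paper exploits for the first stage of the protection system.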

Table 2
Binary code of conditions.

Condition | Normal current (1) | Inrush current (2) | Over excitation (3) | CT saturation current (4)
Binary code | 0001 | 0010 | 0100 | 1000

Z1 = Wb · S2 + b2                                         (23)

Z2 = softmax(Z1)                                          (24)

O1 = Wc · Z2 + b3                                         (25)

O2 = softmax(O1)                                          (26)

where ID is the input data; Wh, Wb and Wc are the weights between the input and the first layer, between the first and second layers, and between the second and third layers, respectively; and b1, b2 and b3 are the biases of the first, second and third layers, respectively. Finally, O2 is the actual output of the network, and the MSE between the actual output and the target output is calculated. This value is the fitness of the FFNN for IGSA, and the proposed algorithm changes the weights in subsequent iterations until it becomes minimal. The MSE function is given in Eq. (27):

MSE = [ Σ_{M,N} (target output − actual output)² ] / (M N)        (27)

Step (3). Updating IGSA parameters
In this step, the parameters of IGSA are updated: first the acceleration, and then the velocity and position of the masses. As mentioned before, the new position of a mass expresses a new set of weights for the FFNN.

Step (4). Inspection of the stop criterion
Two criteria stop the algorithm: one is the number of iterations and the other is the MSE value; whenever the MSE reaches zero, the actual output and the target output are identical.

Step (5). End
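As an illustration only (not the authors' code), the flattened weight vector of Eq. (20) and the forward pass and fitness of Eqs. (21)–(27) can be sketched in Python. The 16-16-4 layer sizes with 16 inputs, the weight ordering inside the vector, and the exact softmax form are assumptions:

```python
import numpy as np

def softmax(x):
    # column-wise softmax, used as the activation in Eqs. (22), (24), (26)
    e = np.exp(x - x.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

def unflatten(w_vec, sizes):
    """Split one 1-by-w mass position (Eq. (20)) into per-layer (W, b) pairs."""
    layers, i = [], 0
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        W = w_vec[i:i + n_out * n_in].reshape(n_out, n_in); i += n_out * n_in
        b = w_vec[i:i + n_out].reshape(n_out, 1);           i += n_out
        layers.append((W, b))
    return layers

def mse_fitness(w_vec, ID, target, sizes=(16, 16, 16, 4)):
    """Fitness of one mass: forward pass (Eqs. (21)-(26)) then MSE (Eq. (27))."""
    A = ID                                   # ID: (n_inputs, n_samples)
    for W, b in unflatten(w_vec, sizes):
        A = softmax(W @ A + b)
    return float(np.mean((target - A) ** 2))
```

IGSA (or PSO) calls `mse_fitness` once per mass per iteration and moves the masses so that this value decreases.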


Table 3. Results of using training data and also test data in the test step (condensed). Train transformers: 16, 25, 5, 3 and 2 MVA at 110/33 kV; 16, 25, 5, 3 and 2 MVA at 110/11 kV; 16, 25 and 5 MVA at 66/33 kV. Tested transformer: 3 MVA, 66/33 kV. For every transformer, the actual outputs for conditions 1–4 exactly matched the target binary codes of Table 2 (condition 1 → 0001, condition 2 → 0010, condition 3 → 0100, condition 4 → 1000), i.e. 1.0000 in the correct bit position and 0.0000 elsewhere, with an error of 0.0000 in every entry.

Application of the proposed method for condition monitoring

After extracting discrete samples from the simulation (data presented in [36]), the data are classified into two major classes: the first belongs to the internal-fault class and the second to the other conditions. The mean and covariance of the training data of the two classes are calculated, and the BC is then created and used to discriminate internal faults from the other conditions. When the BC determines that the input data is not an internal fault, the ANN trained by IGSA assigns its class (condition). Fig. 10 shows the flowchart of the proposed method. As can be seen in Fig. 10, binary codes are used so that the ANN can discriminate the four conditions: "0001" for the normal condition, "0010" for magnetizing inrush current, "0100" for over-excitation and "1000" for CT saturation.
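The two-stage decision logic of Fig. 10 can be sketched as follows; `bc_score` and `ann_predict` are hypothetical stand-ins for the trained Bayesian classifier and the IGSA-trained network, not functions from the paper:

```python
# Binary codes from Table 2 / Fig. 10
CODES = {"0001": "normal condition",
         "0010": "magnetizing inrush current",
         "0100": "over-excitation",
         "1000": "CT saturation"}

def classify(sample, bc_score, ann_predict):
    """Stage 1: the BC tests for internal fault; stage 2: the ANN resolves the rest.

    bc_score(sample)    -> (density under the internal-fault class,
                            density under the other-conditions class)
    ann_predict(sample) -> a 4-bit string such as "0010"
    """
    p_fault, p_other = bc_score(sample)
    if p_fault > p_other:            # Bayes decision rule
        return "internal fault"
    return CODES[ann_predict(sample)]
```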

Table 4. Sensitivity analysis on the effect of controlling-parameter variations.

Sensitivity analysis for | Controlling parameter value | MSE of trained FFNN
a       | 5              | 2.21e-34
a       | 10             | 1.09e-18
a       | 15             | 0.0073
a       | 20             | 0.045
G0      | 100            | 1.77e-09
G0      | 150            | 3.94e-07
G0      | 200            | 2.02e-07
G0      | 250            | 1.09e-18
C1 < C2 | C1 = 1, C2 = 3 | 0.0199
C1 = C2 | C1 = 2, C2 = 2 | 1.09e-18
C1 > C2 | C1 = 3, C2 = 1 | 7.28e-22

(The per-condition network outputs are omitted here; they are exact binary codes for the well-tuned settings and degrade visibly for a = 15, a = 20 and C1 < C2.)

Table 5. Sensitivity analysis on the FFNN's structure.

Structure of FFNN (number of synaptic weights) | MSE of trained FFNN
16-8-4  (444) | 0.0089
16-16-4 (612) | 1.09e-18
16-24-4 (948) | 7.80e-11

(The per-condition outputs are omitted; only the 16-8-4 structure produced visibly imperfect outputs.)

Results and discussion

As mentioned in the previous section, the input data is first fed into the BC to determine whether it represents an internal fault or one of the other conditions. Results show that the BC can map the knowledge of internal faults into the classification space and easily discriminate them from the other conditions. Although the ANN is one of the most powerful classifiers for classification problems, when a simpler classifier is available, using the quicker and simpler option with less computational burden is reasonable. Therefore, in this work the BC, which directly uses the knowledge of the data, is preferred for the first stage of the intelligent classifier and protection system. Whenever a sample

1138

M. Yazdani-Asrami et al. / Electrical Power and Energy Systems 64 (2015) 1129–1140

Table 6. Effect of the output activation function on the obtained results.

Activation function | MSE of trained FFNN
purelin (linear)    | 1.88e-01
softmax             | 1.09e-18
logsig              | 2.67e-02

(The per-condition outputs are omitted; purelin produced nearly constant outputs for all conditions, logsig near-binary outputs, and softmax exact binary codes.)

of the other conditions is fed into the system, the BC determines that it belongs to the other classes, and the ANN then determines the input data's condition, which can be the normal condition, magnetizing inrush current, over-excitation or CT saturation.

Using BC

In this section, the "cross-validation" (leave-one-out) technique is used to show the ability of the BC to detect internal faults. In each run, one data sample is put away and the BC is trained with the remaining data; namely, the covariance and mean of each class are calculated from the remaining data. The held-out sample is then tested to see whether the trained classifier is capable of determining its condition. As can be seen in Table 1, the BC determines the condition of all internal-fault data. For example, for the second sample, the value of the normal distribution for the internal-fault condition is 2.05 and for the other conditions it is zero; because 2.05 is larger than zero, the classifier exactly determines this input's condition. The classifier also determines that the last four samples in Table 1 are not internal faults.
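The leave-one-out procedure described above can be sketched with a two-class Gaussian Bayesian classifier. The diagonal-covariance simplification and the feature layout are assumptions for illustration, not necessarily the authors' exact formulation:

```python
import numpy as np

def gaussian_density(x, mean, var):
    # product of independent 1-D normal densities (diagonal covariance)
    return np.prod(np.exp(-(x - mean) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var))

def loo_bayes(X, y):
    """Leave-one-out test of a two-class Gaussian Bayesian classifier.

    X: (n, d) feature matrix; y: (n,) labels in {0, 1}.
    Returns the number of correctly classified held-out samples.
    """
    correct = 0
    for i in range(len(X)):
        keep = np.arange(len(X)) != i            # put one sample away
        scores = []
        for c in (0, 1):                         # mean/variance from the rest
            Xc = X[keep][y[keep] == c]
            scores.append(gaussian_density(X[i], Xc.mean(0), Xc.var(0) + 1e-9))
        correct += int(np.argmax(scores) == y[i])
    return correct
```

For well-separated classes, as in Table 1, every held-out sample scores far higher under its own class density.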

        [  0   0   0  13 ]
(28) =  [  0   0  13   0 ]
        [  0  13   0   0 ]
        [ 13   0   0   0 ]

        [ 0  0  0  1 ]
(29) =  [ 0  0  1  0 ]
        [ 0  1  0  0 ]
        [ 1  0  0  0 ]

where Matrices (28) and (29) are the confusion matrices of the training and test data, respectively. This kind of matrix shows how many data are correctly classified and how many are misclassified. In Matrix (28), rows are the desired outputs and columns are the actual outputs, ordered as the bits of the binary codes; the single non-zero entry of each row therefore lies in the column of that condition's code. The first row expresses the normal condition, and it can be seen that all 13 normal-condition data are correctly classified.

Sensitivity analysis

As mentioned before, IGSA has some parameters that directly affect the results; choosing optimal and effective values for them therefore leads to proper results.
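Confusion matrices such as (28) and (29) can be computed directly from the desired and actual class labels; a minimal sketch:

```python
import numpy as np

def confusion_matrix(desired, actual, n_classes=4):
    """Rows: desired class; columns: actual (predicted) class."""
    C = np.zeros((n_classes, n_classes), dtype=int)
    for d, a in zip(desired, actual):
        C[d, a] += 1
    return C
```

With all 13 samples of each of the four classes correctly classified, every off-diagonal entry is zero and each row sums to 13, as in Matrix (28).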

Using trained ANN

When the input data is not classified as an internal fault, an ANN trained by IGSA or PSO is used to classify it. In this section, the FFNN has been trained by IGSA and by PSO; PSO, a well-known swarm-based optimization algorithm, serves as the comparison for the proposed algorithm. The following results show that IGSA is not only quicker than PSO but also more accurate. To improve classification quality, binary codes are used to express the classes, as in Table 2. According to the results presented in Table 3, IGSA discriminates the four conditions in the best way; the confusion matrices of the training and test data are presented in Eqs. (28) and (29). To train the FFNN, IGSA has some parameters that must be adjusted: G0, a, C1 and C2. In this training, G0 = 250, a = 5, C1 = 3 and C2 = 1; the number of masses is 20; and a "16-16-4" structure is used for the FFNN, i.e. two hidden layers with 16 neurons each and an output layer with 4 neurons (cf. Tables 5 and 7). The activation functions in all layers are "softmax". These parameters and this structure are the optima found by the sensitivity analysis presented in Section 'Sensitivity analysis'.
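Steps (2)–(4) of the training procedure can be sketched as a generic swarm training loop in which each agent is one flattened weight vector and the fitness is the MSE of Eq. (27). The sketch below uses plain PSO-style velocity updates (the comparison algorithm), not the paper's IGSA mass/acceleration formulas, and its hyperparameters are illustrative:

```python
import numpy as np

def pso_train(fitness, dim, n_agents=20, iters=200, c1=1.5, c2=1.5, w=0.7):
    """Swarm-based training sketch: minimize fitness over R^dim."""
    rng = np.random.default_rng(0)
    X = rng.uniform(-1, 1, (n_agents, dim))    # positions = candidate weights
    V = np.zeros_like(X)
    pbest, pbest_f = X.copy(), np.array([fitness(x) for x in X])
    gbest = pbest[pbest_f.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random(X.shape), rng.random(X.shape)
        V = w * V + c1 * r1 * (pbest - X) + c2 * r2 * (gbest - X)
        X = X + V
        f = np.array([fitness(x) for x in X])
        better = f < pbest_f                   # update personal bests
        pbest[better], pbest_f[better] = X[better], f[better]
        gbest = pbest[pbest_f.argmin()].copy()
        if pbest_f.min() == 0.0:               # second stop criterion (Step 4)
            break
    return gbest, pbest_f.min()
```

In the paper's setting, `fitness` would be the MSE of the FFNN evaluated with the agent's weight vector.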

Effect of controlling parameters on the results. To evaluate the variations of the parameters, one parameter is changed while the others remain constant. When the variation of a is evaluated, G0 = 250, C1 = 2, C2 = 2 and the number of masses is 20. For evaluating G0, a = 10, C1 = 2, C2 = 2 and the number of masses is 20. To evaluate the effect of variations of C1 and C2, G0 = 250, a = 10 and the number of masses is 20. The results presented in Table 4 show that a small value of a gives better results, because a small a means faster movement of the agents, which leads to reaching the global best more quickly. Larger values of G0 have the same effect on the movements and results. In evaluating the Cs, the results are better when C1 is larger than C2 (C1 = 3, C2 = 1 in Table 4), meaning that the coefficient of the acceleration term has more influence than that of the global best.

Effect of FFNN's structure on the results. As mentioned in Section 'Overview of artificial neural network', the FFNN can accept different structures, i.e. different numbers of neurons in each layer. It is worth noting that not all structures give the same, or the best, result; finding the best structure therefore helps improve the correct-classification rate. According to the analysis in Table 5, the best structure has 16 neurons in the first hidden layer, 16 neurons in the second hidden layer and 4 neurons in the output layer. Using this structure, the weight vector has 612 dimensions.
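The dimensionality of the flattened weight vector can be checked by counting weights and biases layer by layer. This assumes 16 network inputs and reads the Table 5 labels as hidden1-hidden2-output; per Table 5, the 612-dimensional vector corresponds to the 16-16-4 structure:

```python
def n_parameters(inputs, layers):
    # each layer contributes fan_in * fan_out weights plus fan_out biases
    total, fan_in = 0, inputs
    for fan_out in layers:
        total += fan_in * fan_out + fan_out
        fan_in = fan_out
    return total

print(n_parameters(16, [16, 16, 4]))  # 16-16-4 structure -> 612
print(n_parameters(16, [16, 8, 4]))   # 16-8-4 structure  -> 444 (Table 5)
```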

Table 7. Obtained results using the ANN trained by IGSA and by PSO (ANN structure 16-16-4, condensed).

Algorithm | Actual output for conditions 1–4 | MSE value
IGSA | exact binary codes (1.0000 in the correct bit position, 0.0000 elsewhere) | 1.09e-18
PSO  | near-binary outputs (e.g. 0.9966, 0.9976, 0.9987 for the "1" bits, residuals of a few thousandths elsewhere) | 0.0064

Table 9. Obtained results using RBFNN.

Operating condition | Actual output (RBF)            | MSE value
1                   | 0.0001  0.0000  0.0002  0.9999 | 2.8570e-023
2                   | 0.0856  0.1986  0.7224  0.0067 |
3                   | 0.0616  0.5217  0.0662  0.4828 |
4                   | 1.0000  0.0000  0.0000  0.0000 |

Fig. 11. A sample of convergence of MSE values using IGSA.

Fig. 12. A sample of convergence of MSE values using PSO.

Effect of the output activation function on the results. The activation function plays an important role in the ANN's result; a proper activation function leads to the proper, desired output. For example, when outputs greater than unity are needed, the "hardlim" activation function is clearly not a good choice, and when the nature of the problem is non-linear, the "purelin" activation function leads to unsuitable results. Experience also shows that the output activation function has the most important role in the results. Therefore, to choose the best one, several activation functions compatible with this work are evaluated, as shown in Table 6.

Comparison between the proposed method and other conventional and traditional methods

The problem studied in this paper can be categorized as a classification problem. Many kinds of classifiers have been invented and presented so far; ANNs are among the most popular, modern and powerful methods in this field. Different topologies of ANN are available, such as the Feed-Forward Back-Propagation Neural Network (FFBPNN), the Cascade-Forward Back-Propagation Neural Network (CFBPNN) and the Radial Basis Function Neural Network (RBFNN). In this paper, to confirm the ability of the proposed method, its results are compared with those of the mentioned topologies. The results are presented in Tables 7–9, which show the results of the trained ANN, of FFBPNN and CFBPNN, and of RBFNN, respectively. The comparison is carried out for all networks under the same conditions, and it shows that the proposed method is an excellent classifier for discriminating these four conditions. The convergence of the MSE values using IGSA and PSO is shown in Figs. 11 and 12, respectively. As mentioned in Table 2, when, for example, magnetizing inrush current occurs, the proposed intelligent protection system outputs (0010), i.e. (0.000 0.000 1.000 0.000); the MSE value, which is the criterion of similarity between the actual output of the intelligent system and the target output, should be zero or close to zero. Table 7 shows the results of the ANN trained by IGSA compared with the ANN trained by PSO. It is worth noting that the two optimization algorithms are used in the same situation (same population, same iterations and same stop criteria). The results show that IGSA-ANN detects all conditions exactly, while PSO-ANN classifies almost as well as IGSA-ANN but with a small error. Under the same situation and structure, FFBP and CFBP give very weak results: as can be seen in Table 8, both have MSE values of about 0.3, which is high, and they also have problems in detecting the conditions. For example, for the second condition (namely, inrush current)

Table 8. Obtained results using FFBPNN and CFBPNN (ANN structure 16-16-4).

Network | Condition | Actual output                  | MSE value
CFBP    | 1         | 0.5006  0.5319  0.9390  0.5286 | 0.3427
CFBP    | 2         | 0.5006  0.9913  0.5067  0.5014 |
CFBP    | 3         | 0.5009  0.6317  0.8117  0.5558 |
CFBP    | 4         | 0.5645  0.9185  0.5111  0.5059 |
FFBP    | 1         | 0.5396  0.5417  0.8647  0.5540 | 0.3382
FFBP    | 2         | 0.5400  0.5457  0.8586  0.5558 |
FFBP    | 3         | 0.5415  0.5408  0.8637  0.5540 |
FFBP    | 4         | 0.5419  0.5451  0.8550  0.5579 |


FFBP approximately detects this condition but with a noticeable error (0.5400 0.5457 0.8586 0.5558), while CFBP not only makes large errors in determining this condition but determines it wrongly (0.5006 0.9913 0.5067 0.5014). RBF is one of the most famous ANN classifiers and uses knowledge mapping in its structure. As can be seen in Table 9, its MSE value looks excellent, but the accuracy of its outputs is low: since the RBF output takes both negative and positive values and the MSE function sums them, such an MSE value can still be obtained. According to the discussed results, none of FFBP, CFBP and RBF is a reliable and accurate classification tool for this complicated problem; moreover, IGSA trains the ANN more accurately than PSO. As a result, this training method, namely using optimization algorithms, yields more reliable and accurate protection systems than traditional training methods (such as back-propagation) and traditional ANN structures (such as radial basis functions).

Conclusion

A novel intelligent method has been proposed to discriminate between five different transients of power transformers, i.e. magnetizing inrush current, over-excitation, CT saturation, the normal condition and internal faults. In this work, ANN models with three different architectures (FFBPNN, CFBPNN and RBFNN) have also been developed for power transformer protection and compared with the proposed scheme. The problem has been broken down into the tasks of feature extraction, strategy selection and performance evaluation. The k-means clustering approach has been applied to the dataset to reduce the training computation burden and improve trainability. The IGSA and PSO training algorithms were applied in the same conditions and their results compared. The back-propagation algorithm minimizes an average sum-squared error term by gradient descent in the error space, but it may get stuck in, or oscillate around, a local minimum.
The effectiveness of that method is therefore not satisfactory. In addition, the convergence of the back-propagation algorithm during training is sensitive to the initial values of the weights; if they are not properly selected, the optimization is trapped in a local minimum or maximum. Training with IGSA or PSO overcomes the above-mentioned drawbacks of BPN and gives improved results; IGSA in particular shows more efficient results with a quicker response. The proposed scheme shows satisfactory robustness, as it classifies the testing dataset without any misclassification, i.e. a classification accuracy as high as 100% is achieved.

References

[1] Moravej Z, Ashkezari JD, Pazoki M. An effective combined method for symmetrical faults identification during power swing. Int J Electr Power Energy Syst 2015;64:24–34. [2] Bejmert D, Rebizant W, Schiel L. Transformer differential protection with fuzzy logic based inrush stabilization. Int J Electr Power Energy Syst 2014;63:51–63. [3] Rasoulpoor M, Banejad M. A correlation based method for discrimination between inrush and short circuit currents in differential protection of power transformer using Discrete Wavelet Transform: theory, simulation and experimental validation. Int J Electr Power Energy Syst 2013;51:168–77. [4] Sidhu TS, Burnworth J, Darlington A, Kasztenny B, Liao Y, McLaren PG, et al. Bibliography of relay literature, 2006 IEEE Committee Report. IEEE Trans Power Deliv 2008;23(4):1864–75. [5] Saleh SA, Scaplen B, Rahman MA. A new implementation method of wavelet-packet-transform differential protection for power transformers. IEEE Trans Ind Appl 2011;47(2):1003–12.

[6] Saleh SA, Rahman MA. Testing of a wavelet-packet-transform-based differential protection for resistance-grounded three-phase transformers. IEEE Trans Ind Appl 2010;46(3):1109–17. [7] Moravej Z, Vishwakarma DN, Singh SP. Digital filtering algorithms for differential relaying of power transformer: an overview. Electr Mach Power Syst 2000;28(6):485–500. [8] Saleh SA, Rahman MA. Modeling and protection of three-phase power transformer using wavelet packet transform. IEEE Trans Power Deliv 2005;20(2):1273–82. [9] Eldin AAH, Refaey MA. A novel algorithm for discrimination between inrush current and internal faults in power transformer differential protection based on discrete wavelet transform. Electr Power Syst Res 2011;81(1):19–24. [10] Faiz J, Lotfi-Fard S. A novel wavelet based algorithm for discrimination of internal faults from magnetizing inrush currents in power transformers. IEEE Trans Power Deliv 2006;21:1989–96. [11] Seddighi AR, Haghifam MR. Detection of inrush current in distribution transformer using wavelet transform. Int J Electr Power Energy Syst 2005;27(5–6):361–70. [12] Moravej Z. Power transformer protection using support vector machine network. In: Conference on power and energy systems, Iran; 2008. [13] Jing M, Zengping W. A novel algorithm for discrimination between inrush currents and internal faults based on equivalent instantaneous leakage inductance. In: IEEE power engineering society general meeting; 2007. p. 1–8. [14] Baoming G, Almeida AT, Qionglin Z, Xiangheng W. An equivalent instantaneous inductance-based technique for discrimination between inrush current and internal faults in power transformers. IEEE Trans Power Deliv 2005;20:2473–82. [15] Kang YC, Lee BE, Kang SH. Transformer protection relay base on induced voltages. Int J Electr Power Energy Syst 2007;29:281–9. [16] Zhang H, Wen JF, Liu P, Malik OP. Discrimination between fault and magnetizing inrush current in transformers using short time correlation transform. 
Int J Electr Power Energy Syst 2002;24(7):557–62. [17] Guocai S, Dachuan Y. Identifying internal faults of transformers through the similarity degree between voltage and current. In: IEEE power engineering society winter meeting; 2000. p. 1868–72. [18] Shin M, Park Ch, Kim J. Fuzzy logic-based relaying for large power transformer protection. IEEE Trans Power Deliv 2003;18(3):718–24. [19] Naresh R, Sharma V, Vashisth M. An integrated neural fuzzy approach for fault diagnosis of transformers. IEEE Trans Power Deliv 2008;23(4):2017–24. [20] Sayed E, Eldin MT. A novel approach for classifying transient phenomena in power transformers. Int J Emerg Electr Power Syst 2004;1(2):50–60. [21] Jazebi S, Vahidi B, Jannati M. A novel application of wavelet based SVM to transient phenomena identification of power transformers. Energy Convers Manage 2011;52:1354–63. [22] Jazebi S, Vahidi B, Hosseinian SH. A novel discriminative approach based on hidden markov models and wavelet transform to transformer protection. Int Trans Soc Model Simul 2010;86(2):93–107. [23] Jazebi S, Vahidi B, Hosseinian SH, Faiz J. Magnetizing inrush current identification using wavelet based Gaussian mixture models. Simul Model Pract Theory 2009;17(6):991–1010. [24] Diaz G, Arboleya P, Gomez-Aleixandre J. A new transformer differential protection approach on the basis of space-vectors examination. Electr Eng 2005;87:129–35. [25] Feyzi MR, Sharifian MBB. Investigation on the factors affecting inrush current of transformers based on finite element modeling. In: IEEE 5th international power electronic and motion control conference, vol. 1; 2006. p. 1–5. [26] Yazdani-Asrami M Ebadi, Ahmadi-Kordkheili AR, Taghipour M. Effect of null wire on the peak value of inrush current in three-phase transformers bank. Int Rev Model Simul (IREMOS) 2010;3(2):140–5. [27] Chen Sh, Lin R, Cheng Ch. Magnetizing inrush model of transformers based on structure parameters. IEEE Trans Power Deliv 2005;20(3):1947–54. 
[28] Girgis RS, teNyenhuis EG. Characteristics of inrush current of present designs of power transformers. In: IEEE power engineering society meeting; 2007. p. 1–6. [29] El-Naggar KM, Gilany MI. A discrete dynamic filter for detecting and compensating CT saturation. Electr Power Syst Res 2007;77(5–6):527–33. [30] Bishop CM. Pattern recognition and machine learning. Springer; 2006. [31] Galushkin AI. Neural networks theory. Springer; 2007. [32] Howlett RJ, Jain LC. Radial basis function networks 2. Springer; 2001. [33] Rashedi E, Nezamabadi-pour H, Saryazdi S. GSA: a gravitational search algorithm. Inform Sci 2009;179:2232–48. [34] Sarafrazi S, Nezamabadi-pour H, Saryazdi S. Disruption: a new operator in gravitational search algorithm. Scientia Iranica 2011;18(3):539–48. [35] Shaw B, Mukherjee V, Ghoshal SP. A novel opposition-based gravitational search algorithm for combined economic and emission dispatch problems of power systems. Int J Electr Power Energy Syst 2012;35(1):21–33. [36] Geethanjali M, Slochanal SMR, Bhavani R. PSO trained ANN-based differential protection scheme for power transformers. Neurocomputing 2008;71:904–18.