Statistical decision-tree based fault classification scheme for protection of power transmission lines


Electrical Power and Energy Systems 36 (2012) 1–12


J. Upendar *, C.P. Gupta, G.K. Singh
Department of Electrical Engineering, Indian Institute of Technology, Roorkee, 247 667 Uttarakhand, India


Article history:
Received 19 February 2009
Received in revised form 9 August 2011
Accepted 12 August 2011
Available online 21 December 2011

Keywords: Fault classification; Wavelets; Classification and Regression Tree method (CART); Transmission lines; Artificial neural network (ANN)

Abstract

This paper presents a statistical algorithm for the classification of faults on power transmission lines. The proposed algorithm is based upon the wavelet transform of the three phase currents measured at the sending end of a line and the Classification and Regression Tree (CART) method, a commonly available statistical method. The wavelet transform of the current signals provides hidden information about a fault situation as input to the CART algorithm, which is used to classify the different types of faults. The proposed technique is simulated using MATLAB/SIMULINK software and is tested on data created through fault analysis of a 400 kV sample transmission line, considering wide variations in the operating conditions. The classification results are also compared with the results obtained using a back-propagation neural network. © 2011 Elsevier Ltd. All rights reserved.

1. Introduction

Transmission line protective relaying is an important aspect of reliable power system operation. Faults do occur in power system networks and are most frequent in transmission and distribution systems. Following the occurrence of a power system fault, the maintenance crew must find and fix the problem to restore service as quickly as possible. Fast and accurate restoration of service reduces the loss of revenue and the outage time. These challenges reinforce the need to examine the merits of the different fault classification methods and the various protection reinforcement schemes available to system planners, so as to achieve the highest possible incremental reliability and improvement in the system protection scheme under a variety of fault conditions. Over the last two decades, owing to technological progress in computers and electronics, power systems have been equipped with digital relays, which offer a number of advantages over electromechanical relays. Event recorders used at remote terminal units transmit the data to the control center through supervisory control and data acquisition (SCADA) systems. For complex fault or malfunction scenarios, identification of the fault type and of malfunctioning devices may require extensive knowledge about the

* Corresponding author. E-mail addresses: [email protected] (J. Upendar), [email protected] (C.P. Gupta), [email protected] (G.K. Singh). Tel.: +91 1332 285594 (Off) (C.P. Gupta); Tel.: +91 1332 285070 (Off), fax: +91 1332 273560 (G.K. Singh).
© 2011 Elsevier Ltd. All rights reserved. doi:10.1016/j.ijepes.2011.08.005

power system and its protective devices. Traditionally, fault diagnosis is performed off line by experienced engineers, but the software tools now emerging for fault classification may provide a more effective and flexible solution. To improve the accuracy and speed of fault classification, the information is stored in a database, and intelligent systems in a control center can access the database to diagnose the fault type for further analysis. In the past, several attempts have been made at fault classification using traveling wave, neural network and fuzzy logic based approaches. However, traveling wave methods [1] require a high sampling rate and have problems in distinguishing between waves reflected from the fault and from the remote end of the line. Artificial neural network (ANN) based fault classification techniques were reported in [1–4]. Although ANN-based approaches are quite successful in determining the correct fault type, their main disadvantage is that they require considerable training effort for good performance, especially under a wide variation of operating conditions (such as system loading level, fault resistance and fault inception instant). Another disadvantage is that training may end up in a local minimum, i.e. some contingencies may not converge to the desired value; when learning gets stuck in a local minimum, performance suffers. Fuzzy logic has also been applied to fault classification in relaying [5,6]. The benefit of fuzzy logic is that its knowledge representation is explicit, using simple "IF-THEN" relations. But logic-based expert systems suffer from a combinatorial explosion problem [6] when applied to a large system. Again, the accuracy of fuzzy logic based schemes cannot be guaranteed for wide variations in the system conditions.

In modern power systems, digital relays and fault recorders are installed at prime locations to monitor and record important information regarding power quality disturbances. Recently, wavelet based signal processing techniques have emerged as a powerful tool for feature extraction of power quality disturbances [7], data compression [8] and fault classification [9–11]. A pattern-recognition technique based on the wavelet transform has been found to be an effective tool for monitoring and analyzing power system disturbances, including power quality assessment [12] and system protection against faults. A technique based on comparison of the currents in the corresponding phases of two lines, to detect faults and discriminate between faulty and healthy phases, was proposed in [13]. Wavelet transform-based fault classification using the voltage and current signals of a simple transmission line is reported in [14]. More recently, the CART methodology [15,16] has caught the interest of a wide community of applied mathematicians and digital signal/image processing engineers. Built around ideas of recursive partitioning, it develops, based on an analysis of noisy data, a piecewise-constant reconstruction, where the pieces are terminal nodes of a data-driven recursive partition. Since its inception, the CART methodology of tree-structured adaptive nonparametric regression has been widely used in statistical data analysis. For example, in health risk analysis [15], it was used to classify heart attack patients into two groups: those who will survive 30 days (low risk) and those who will not (high risk). After examining 19 variables, including age and blood pressure, a classification tree was formed using CART, and it was observed that 89% of low-risk and 75% of high-risk patients were correctly classified. CART has also been used in pollution monitoring, to study the relationship between air pollution concentration and housing values [15].
In [17], CART was used for deciding the value of a house depending upon various variables (such as crime rate, tax rate, air pollution, number of rooms, distance to employment sectors and accessibility to highways) and also for investigating the risk of bankruptcy in the banking sector. It was also used in household food-insecurity analysis [18], for identifying indicators of vulnerability to famine and chronic food insecurity from information collected in household surveys in Bangladesh and Ethiopia, which indicate the households most likely to be food insecure. The information used as input consists of calories available per person per day for various households; the households were then separated into two groups, food insecure and food secure. CART has also been widely used for data pruning in image coding [19–22]. This paper presents an application of a new statistical decision-tree based fault classification technique using CART for transmission lines. In much of the previous research, the fault classification is implemented with the help of threshold values. But because of the non-linear nature of these threshold values under various operating conditions, it is difficult to determine them. In the proposed fault classification algorithm, the advantage of CART is that it is nonparametric in nature. CART does not require any variables to be selected in advance: the algorithm will itself identify the most significant variables and eliminate non-significant ones. CART results are invariant to monotone transformations of its independent variables; changing one or several variables to their logarithm or square root will not change the structure of the tree. Based on its own analysis, it will analyze a large amount of data within a short period.
The proposed method is capable of providing a reliable and fast estimation of the fault type on the basis of measurement of the three phase currents using the wavelet transform (WT). The method classifies whether a normal state, a single-line-to-ground, double-line-to-ground, phase-to-phase or three-phase fault has occurred. The proposed algorithm is tested on a 400 kV two-terminal transmission line

simulated using MATLAB/SIMULINK®. The performance of the proposed technique is analyzed by comparing the fault classification results with the Back-Propagation Neural Network (BPNN) method on the same test data, considering a wide variation in system conditions. The fault signals in each case are decomposed to several scales using wavelet transforms, and certain selected features of the wavelet-transformed signals are used as input for the training process of the proposed statistical algorithm.

2. Wavelet analysis

Wavelet analysis [23] is a mathematical technique for signal processing and is inherently suited to non-stationary and non-periodic wide-band signals. It helps in achieving localization in both time and frequency. Wavelet analysis employs an appropriate wavelet function called the "mother wavelet" and performs analysis using shifted and dilated versions of this wavelet. The continuous wavelet transform (CWT) of a continuous signal x(t) is defined as

CWT(a, b) = \int_{-\infty}^{\infty} x(t)\, \Psi_{a,b}(t)\, dt    (1)

where \Psi(t) is the mother wavelet and the other wavelets \Psi_{a,b}(t) = \frac{1}{\sqrt{a}}\, \Psi\left(\frac{t - b}{a}\right) are its dilated and translated versions, the constants a and b being the dilation (scale) and translation (time-shift) parameters, respectively. The CWT at different scales and locations provides variable time–frequency information about the signal. The digitally-implementable counterpart of the CWT, known as the discrete wavelet transform (DWT), is the one used for the proposed fault classification. The DWT of a signal x(t) is defined as

DWT(x; m, n) = \frac{1}{\sqrt{a_0^m}} \sum_{k} x(k)\, \Psi\left(\frac{k - n b_0 a_0^m}{a_0^m}\right)    (2)

where a = a_0^m and b = n b_0 a_0^m, the fixed constants a_0 and b_0 generally being taken as a_0 = 2 and b_0 = 1, and k, m and n are integer variables. The actual implementation of the DWT is done by multi-resolution analysis (MRA) [24]. The original signal is analyzed at different frequency bands with different resolutions: the signal is decomposed into a smooth approximation version and a detail version; the approximation is further decomposed into an approximation and a detail, and the process is repeated. This decomposition of the original signal is obtained through successive high-pass and low-pass filtering of the signal. The successive stages of decomposition are known as levels. The MRA details at various levels contain the features for the detection and classification of faults.
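The MRA cascade described above can be sketched in a few lines. The following is a minimal pure-Python sketch of the Db1 (Haar) filter bank, not the authors' MATLAB/SIMULINK implementation; the example signal is arbitrary. Each level halves the length, so a 512-sample window supports a full 9-level decomposition.

```python
from math import sqrt

def haar_step(x):
    """One level of Db1 (Haar) analysis: split a signal into an
    approximation (low-pass) half and a detail (high-pass) half."""
    approx = [(x[i] + x[i + 1]) / sqrt(2) for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / sqrt(2) for i in range(0, len(x) - 1, 2)]
    return approx, detail

def mra(x, levels):
    """Multi-resolution analysis: repeatedly decompose the approximation,
    collecting the detail coefficients of each level."""
    details = []
    approx = list(x)
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)
    return approx, details

# A 512-sample window supports a full 9-level decomposition (2^9 = 512).
signal = [float(i % 8) for i in range(512)]
approx, details = mra(signal, 9)
assert [len(d) for d in details] == [256, 128, 64, 32, 16, 8, 4, 2, 1]
```

The 7th-level detail coefficients used later for the fault indices correspond to `details[6]` here (four coefficients for a 512-sample window).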

3. Classification and regression tree method

Classification and Regression Tree (CART) [15] is a classification method which uses historical data to construct decision trees. The aim of a classification and regression tree is to partition the input data in a tree-structured fashion and to construct an efficient algorithm which provides a piecewise-constant estimator f or a classifier u by fitting to the data in each cell of the partition. The algorithm is based on binary tree-structured partitions and on a penalized criterion that permits the selection of some "good" tree-structured estimators from a huge collection of trees. In practice, it yields easy-to-interpret and easy-to-compute estimators. More precisely, given a training sample of observations, the CART algorithm consists of constructing a large tree from the observations by minimizing some impurity function at each step and then pruning the constructed tree to obtain a finite sequence of nested trees, by means of a penalized criterion whose penalty term is proportional to the number of leaves. Decision trees are then used

to classify new data. In order to use CART, it is necessary to know the number of classes a priori. For building decision trees, CART uses a so-called learning sample – a set of historical data with pre-assigned classes for all observations. Decision trees are represented by a set of questions which split the learning sample into smaller and smaller parts. CART asks only yes/no questions. The CART algorithm searches over all possible variables and all possible values in order to find the best split – the question that splits the data into two parts with maximum homogeneity. The process is then repeated for each of the resulting data fragments. The CART method is robust to outliers: usually, the splitting algorithm will isolate outliers in an individual node or nodes [17–19,25]. CART is nonparametric, which implies that the method does not require specification of any functional form, and it does not require variables to be selected in advance: the algorithm will itself identify the most significant variables and eliminate non-significant ones. CART results are invariant to monotone transformations of its independent variables; changing one or several variables to their logarithm or square root will not change the structure of the tree – only the splitting values (but not the variables) in the questions will be different. The main idea is that the learning sample is consistently replenished with new observations, which means that the CART tree has an important ability to adjust to the current situation. The CART methodology consists of three parts:

A. Construction of the maximum tree.
B. Choice of the right tree size.
C. Classification of new data using the constructed tree.

3.1. Construction of the maximum tree

Building the maximum tree implies splitting the learning sample up to the last observations, i.e. until terminal nodes contain observations of only one class. The construction of classification trees is explained as follows.

3.1.1. Classification tree
Each observation of the learning sample is classified only when its class is known. Generally, the classes of the learning samples are provided by the user, or they can be decided using some rules. In this paper, the AG, BG, CG, ABG, BCG, CAG, AB, BC, CA and ABC faults are used as the 10 classes (K) to construct the classification tree. Let t_p be a parent node and t_L, t_R the left and right child nodes of t_p, respectively. Consider the learning sample with variable matrix X containing M variables x_j and N observations, and let the class vector Y consist of N observations with a total of K classes. The classification tree is built in accordance with a splitting rule – the rule that splits the learning sample into smaller parts, dividing the data each time into two parts with maximum homogeneity. Using the definition given in [14], the measure of impurity i(t) at node t [21] is given by

i(t) = -\sum_{j=1}^{k} p(w_j \mid t) \log p(w_j \mid t)    (3)

where p(w_j|t) is the proportion of patterns x_j allocated to class w_j at node t. Each non-terminal node is further divided into two nodes t_L and t_R, as shown in Fig. 1, where x_j^R represents the best splitting value of variable x_j, and P_L, P_R are the corresponding proportions (probabilities) of entities sent to the new nodes.

Fig. 1. Splitting algorithm of CART.

The best division is the one that maximizes the difference \Delta i(t), given by

\Delta i(t) = i(t_p) - P_L\, i(t_L) - P_R\, i(t_R)    (4)

Each time, the entities are divided into two child nodes with maximum homogeneity, as decided by the impurity function i(t). Since the impurity of the parent node t_p is constant for any of the possible splits x_j \le x_j^R, j = 1, \dots, M, maximizing the homogeneity of the left and right child nodes is equivalent to maximizing the change of the impurity function \Delta i(t):

\arg\max_{x_j \le x_j^R,\; j = 1, \dots, M} \left[ i(t_p) - P_L\, i(t_L) - P_R\, i(t_R) \right]    (5)

This yields the best split condition x_j \le x_j^R by searching all possible values of the variables so as to maximize the change of the impurity measure \Delta i(t). In this way, the decision tree splits into sub-trees until no significant decrease in the measure of impurity is possible. Several impurity functions exist, of which the most widely used is the Gini splitting rule.

3.1.2. Gini splitting rule
The impurity function i(t) used in the Gini splitting rule (or Gini index) is given by

i(t) = \sum_{k \ne l} p(k \mid t)\, p(l \mid t)    (6)

where k, l = 1, \dots, K are class indices and p(k|t) is the conditional probability of class k given node t. Applying the Gini impurity function (6) to the maximization problem (5), the change of the impurity measure \Delta i(t) becomes

\Delta i(t) = -\sum_{k=1}^{K} p^2(k \mid t_p) + P_L \sum_{k=1}^{K} p^2(k \mid t_L) + P_R \sum_{k=1}^{K} p^2(k \mid t_R)    (7)

Therefore, the Gini algorithm solves the following problem:

\arg\max_{x_j \le x_j^R,\; j = 1, \dots, M} \left[ -\sum_{k=1}^{K} p^2(k \mid t_p) + P_L \sum_{k=1}^{K} p^2(k \mid t_L) + P_R \sum_{k=1}^{K} p^2(k \mid t_R) \right]    (8)
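The split search of Eqs. (5)–(8) can be sketched as follows. This is an illustrative Python fragment with made-up data, using the equivalent form i(t) = 1 - \sum_k p^2(k|t) of the Gini index of Eq. (6).

```python
from collections import Counter

def gini(labels):
    """Gini impurity i(t) = 1 - sum_k p(k|t)^2, equivalent to Eq. (6)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y):
    """Search all variables j and all observed thresholds for the split
    x_j <= threshold that maximizes the impurity decrease of Eq. (5)."""
    parent = gini(y)
    n = len(y)
    best = (None, None, 0.0)            # (variable index, threshold, gain)
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            left = [y[i] for i in range(n) if X[i][j] <= threshold]
            right = [y[i] for i in range(n) if X[i][j] > threshold]
            if not left or not right:
                continue
            p_l, p_r = len(left) / n, len(right) / n
            gain = parent - p_l * gini(left) - p_r * gini(right)
            if gain > best[2]:
                best = (j, threshold, gain)
    return best

# Two made-up features; only the second one separates the classes.
X = [[1.0, 0.2], [2.0, 0.3], [1.5, 0.9], [2.5, 0.8]]
y = ["AG", "AG", "BG", "BG"]
j, threshold, gain = best_split(X, y)
assert j == 1   # the separating feature is found
```

Since i(t_p) is constant over the candidate splits, maximizing the gain here is the same maximization as Eq. (8).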

The Gini algorithm will search the learning sample for the largest class and isolate it from the rest of the data. Gini works well for noisy data.

3.2. Choice of the right-size tree

The decision tree grows into sub-trees, and the maximum tree can exceed hundreds of levels, depending on the complexity of the data. Therefore, the maximum tree has to be optimized before being used for the classification of new data. By choosing the right size of tree, i.e. cutting off insignificant nodes and sub-trees, it is possible to optimize the tree size. The two popular tree-pruning methods for optimizing the tree size are: (1) optimization of the number of points in each node and (2) cross-validation.

3.2.1. Optimization by minimum number of points
The splitting is stopped if the number of observations in a node is less than a pre-defined required number Nmin. If Nmin

parameter is larger, the tree growth is smaller. This approach is easy to implement and very fast, but it requires calibration of the parameter Nmin. Generally, Nmin is taken as 10% of the total learning sample. While defining the size of the tree, there is a trade-off between the measure of tree impurity and the complexity of the tree, which is given by the total number of terminal nodes |T̃| in the tree T. For the maximum tree, the impurity measure will be minimum and nearly equal to zero, but the number of terminal nodes |T̃| will be maximum.

3.2.2. Cross-validation
Cross-validation relies on the optimal proportion between the complexity of the tree and the misclassification error. As the tree size increases, the misclassification error decreases, and for the maximum tree the error becomes zero. But on independent data, an overly complex decision tree will show poor performance. The performance of a decision tree on independent data is called the predictive power of the tree. Therefore, it is important to find the optimal proportion between complexity and misclassification error. This can be obtained through the cost-complexity function given below.

R_\alpha(T) = R(T) + \alpha |T̃| \to \min_T    (9)

where R(T) is the misclassification error of the tree T and \alpha|T̃| is the complexity measure, which depends on |T̃|, the total number of terminal nodes in the tree. The \alpha-parameter is found through a sequence of learning samples: a part of the learning sample is used to build the tree, and the other part of the data is taken as a testing sample. The process is repeated several times for randomly selected learning and testing samples.

3.3. Classification of new data

After constructing the classification tree, it can easily be applied to classify new test data. For each new test sample, the method returns a class or some response value. Through the set of conditions present in the tree, each new sample is classified into one of the terminal nodes of the tree; the class with the most observations in that node is called the dominating class. A more detailed explanation of building the CART is given in this section. Initially, all the training samples are placed at the root node, which is impure or heterogeneous in nature. The main aim is to form a rule that breaks these samples into groups that are internally more homogeneous than the root node. The following procedure is applied when splitting the samples from the root node [18]:

(1) Starting with the first variable, CART splits the variable at all of its possible split points with yes/no response conditions.
(2) It then evaluates the reduction in impurity by applying a goodness-of-split criterion to each split point. This works as follows: suppose a new dependent variable takes the value 1 (if, say, there is a fault in phase A) or 2 (if there is no fault in phase A), and the probability distributions of these values are p(1|t) and p(2|t) at the corresponding node t, which decide the measure of heterogeneity or impurity i(t) of the node.
It selects the best split as the one that maximizes the reduction in the degree of heterogeneity i(t), where i(t) = N(p(1|t), p(2|t)).
(3) Steps 1 and 2 are applied to each of the remaining variables at the root node. Then, according to the reduction in impurity obtained at each split, all the best splits on each variable are ranked, and the variable and split point giving the greatest impurity reduction at the root (parent) node are selected.

(4) CART assigns classes to these nodes according to a condition that minimizes the misclassification error cost. Algorithms or user-defined values can be incorporated into the splitting rule to minimize the misclassification error cost; alternatively, the analyst can use the default category, assuming that all misclassifications are equally costly.
(5) Steps 1–4 are applied repeatedly to each non-terminal child node at each successive stage.
(6) The final, large tree is obtained by continuing the splitting process until every sample is classified into one of the terminal nodes. Obviously, such a tree will have a large number of terminal nodes that are either pure or very small in content.
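Once grown, classifying a new sample (Section 3.3) reduces to answering the tree's yes/no questions down to a terminal node. A minimal sketch follows; the split variables are the paper's indices, but the threshold values and leaf classes below are invented for illustration (the real ones come out of training).

```python
# Hypothetical tree: (variable, threshold, left_subtree, right_subtree);
# leaves are the dominating class of the corresponding terminal node.
tree = ("Qc", 7000.0,
        ("Qa", 8200.0, "AG", "ABG"),    # reached when Qc <= 7000.0
        ("Qb", 8800.0, "BG", "BCG"))    # reached when Qc >  7000.0

def classify(node, sample):
    """Follow the yes/no questions (x_j <= threshold?) from the root
    down to a terminal node, whose class is the answer."""
    while isinstance(node, tuple):
        var, threshold, left, right = node
        node = left if sample[var] <= threshold else right
    return node

sample = {"Qa": 9100.0, "Qb": 5200.0, "Qc": 6400.0}
assert classify(tree, sample) == "ABG"   # Qc <= 7000.0, then Qa > 8200.0
```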

4. Back propagation neural network

A neural network is a set of interconnected simple processing elements called neurons, where each connection has an associated weight. The neurons are usually organized into a series of layers; a neural network typically consists of three or more layers. The input layer passes the data into the network. The data from the input layer arrive at the intermediate, or hidden, layer through the associated connection weights. The hidden neurons take in the weighted inputs and calculate the outputs through their transfer functions. Their outputs are fed in turn to the next hidden layer (if there is more than one) or to the output layer. The output layer then generates the results representing the mapping from the given input data. There is no clear rule to determine the number of neurons in each hidden layer; this is generally done by educated trial and error. Once the network structure has been determined, the network connection weights are learned from the training data. The most popular learning algorithm is back-propagation, which was derived by generalizing the Widrow–Hoff learning rule to multilayer networks and non-linear differentiable transfer functions. Input vectors and corresponding target vectors are used to train a network until it can approximate a function, associate input vectors with specific output vectors, or classify input vectors in an appropriate way as defined in this study. The weights are initialized to small random numbers; then the inputs are propagated forward by activating the transfer functions in the neurons and calculating the outputs of each layer in turn. Afterwards, the error between the actual network output and the desired response is propagated backwards to update the weights, in order to minimize the network prediction error.
Back-propagation iteratively processes the training samples through forward propagation of the inputs and backward propagation of the error until the specified accuracy or other terminating conditions are satisfied. When a neural network is used as a classification technique, its operation involves two steps: learning and recall. In the learning phase, all the network weights are adjusted to adapt to the patterns of the training data. In the recall phase, the trained network produces responses to the test data based on the learned network parameters. Back-propagation artificial neural networks (BP-ANN) are highly effective for pattern recognition. The sizes of the input and output layers are determined by the nature of the application, while the number of hidden layers and neurons may vary; too many hidden-layer neurons may result in divergence, so the number of hidden-layer nodes is best determined by trial and error based on overall system performance. In this study, the network used the scaled conjugate gradient back-propagation training algorithm, which adjusts the weight-update step at each iteration. The network stops learning when the Mean Square Error

(MSE) or the number of iterations reaches a predetermined target value. There are several training algorithms for feed-forward networks. All of them use the gradient of the performance function to determine how to adjust the weights so as to minimize the performance function. The gradient is determined using a technique called back-propagation, which involves performing computations backwards through the network. The simplest implementation of back-propagation learning updates the network weights and biases in the direction in which the performance function decreases most rapidly. An iteration of this algorithm can be written as

X_{k+1} = X_k - \alpha_k g_k    (10)

where X_k is a vector of the current weights and biases, g_k is the current gradient, and \alpha_k is the learning rate.
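Eq. (10) is plain gradient descent. A minimal sketch on a toy quadratic loss follows; the loss function and the fixed learning rate are illustrative, not the paper's scaled-conjugate-gradient training of an actual network.

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Iterate X_{k+1} = X_k - a_k * g_k (Eq. 10) with a fixed rate a_k."""
    x = list(x0)
    for _ in range(steps):
        g = grad(x)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x

# Minimize f(w) = (w0 - 3)^2 + (w1 + 1)^2; its gradient is (2(w0-3), 2(w1+1)).
grad = lambda w: [2 * (w[0] - 3), 2 * (w[1] + 1)]
w = gradient_descent(grad, [0.0, 0.0])
assert abs(w[0] - 3) < 1e-6 and abs(w[1] + 1) < 1e-6   # converges to (3, -1)
```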

Fig. 3. Overview of the proposed fault classification scheme. (Flow: get the current signals Ia, Ib, Ic from the record → detection module: calculate the 7th-level detail coefficients of Ia, Ib, Ic and find the indices Qa, Qb, Qc → if Qa + Qb + Qc ≤ TH, report no fault and avoid the record transfer; otherwise → classification module: calculate the indices Sa, Sb, Sc and |Sa + Sb + Sc|, analyze Qa, Qb, Qc and |Sa + Sb + Sc| using the CART or BPNN algorithm → report the fault type and allow the record transfer.)

5. Power system model and details

A sample 3-phase power system network, as shown in Fig. 2, was built to test the performance of the proposed scheme. The prototype system consists of a 3-phase power supply and a transmission line, represented by lumped parameters, connecting a load. Faults were created at different places on the transmission line [26].

Fig. 2. Sample power system network (generator – current recorder – 300 km line with fault – load).

5.1. Generator
Voltage rating: 400 kV, 50 Hz. Total impedance of generator and transformer together: (0.2 + j4.49) Ω. X/R ratio: 22.45.

5.2. Transmission line
Positive- and negative-sequence resistance per unit length: 0.02336 Ω/km. Zero-sequence resistance per unit length: 0.38848 Ω/km. Positive- and negative-sequence inductance per unit length: 0.95106 mH/km. Zero-sequence inductance per unit length: 3.25083 mH/km. Positive- and negative-sequence capacitance per unit length: 12.37 nF/km. Zero-sequence capacitance per unit length: 8.45 nF/km.

5.3. Load
Load impedance: (720 + j1.11) Ω, corresponding to a load of 200 MVA at 0.9 pf.

6. Proposed fault classification method

The proposed algorithm involves both detection and classification of faults, as shown in Fig. 3. The first step of the detection module is to get the line current samples Ia, Ib, Ic from the current recorder, which is located near the sending end. The high-frequency noise signals are filtered and

corresponding 7th-level detail wavelet coefficients are calculated, which represent the 2nd and 3rd order harmonic components in the fault current signals. During a fault, the 7th-level detail (high-pass) coefficients, covering the 99–199 Hz frequency band, have higher magnitudes due to the presence of 2nd and 3rd order harmonic content in the faulted line currents. From previous studies, it is found that the Daubechies mother wavelet has a good capability to capture the time of transient occurrence and to extract frequency features during power system faults and disturbances. In the proposed algorithm, the "Db1" mother wavelet is used to get the DWT coefficients for classification of the different types of faults. Using MRA, the line currents Ia, Ib, Ic are decomposed into 9 levels and the corresponding approximation and detail coefficients are calculated for each signal. With the help of these coefficients, the indices Qa, Qb and Qc are calculated, where Qa, Qb and Qc are the sums of the absolute values of the 7th-level detail coefficients of the line currents Ia, Ib and Ic, respectively. Fault detection is carried out by analyzing the indices Qa, Qb and Qc. In case of no fault, (Qa + Qb + Qc) is less than the threshold value, and no data is transferred to the classification module for further analysis. Otherwise, in case of a fault, the sum (Qa + Qb + Qc) is greater than the threshold value, and the detail coefficients of the current signals, including Qa, Qb and Qc, are transferred to the classification module to find the type of fault. In the classification module, using the already determined detail coefficients of the line currents, the indices Sa, Sb, Sc and |(Sa + Sb + Sc)| are also calculated, where Sa, Sb and Sc are the sums of the 7th-level detail coefficients of the line currents Ia, Ib and Ic, respectively.
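The index computation and the detection check can be sketched as follows. The coefficient values and the threshold TH below are invented for illustration; the paper does not state a numeric threshold.

```python
def indices(detail7):
    """Per-phase indices from the 7th-level detail coefficients:
    Q = sum of absolute values, S = plain (signed) sum."""
    q = sum(abs(c) for c in detail7)
    s = sum(detail7)
    return q, s

# Hypothetical 7th-level detail coefficients for the three phases;
# the faulted phase (a) carries much larger transients.
d7 = {"a": [120.0, -95.5, 80.2, -60.1],
      "b": [2.1, -1.8, 1.5, -1.2],
      "c": [1.9, -2.2, 1.4, -1.0]}
q, s = {}, {}
for phase, coeffs in d7.items():
    q[phase], s[phase] = indices(coeffs)

TH = 50.0                               # hypothetical detection threshold
fault_detected = (q["a"] + q["b"] + q["c"]) > TH
aS = abs(s["a"] + s["b"] + s["c"])      # classification feature |Sa+Sb+Sc|
assert fault_detected                    # phase-a energy dominates here
```

Only when `fault_detected` is true would the indices Qa, Qb, Qc and aS be passed on to the CART (or BPNN) classifier.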
The indices Qa, Qb, Qc and |(Sa + Sb + Sc)| (= aS) are then given as input to the CART tree, which analyzes the type of fault. The variation of the indices (Qa + Qb + Qc), Qa, Qb, Qc and |(Sa + Sb + Sc)| during an LG fault at different locations on the transmission line and various inception angles, with 0 Ω fault resistance, is depicted in Fig. 4a–e. In order to classify the faults, the CART tree must be trained with the training samples and the CART algorithm before the proposed method is deployed. A learning database with a variety of faulted samples is used to improve the CART generalization capability. The different rules

Fig. 4. Variation of (a) (Qa + Qb + Qc), (b) Qa, (c) Qb, (d) Qc and (e) |(Sa + Sb + Sc)| for different values of distances and inception angles during the LG (A-G) fault.

evaluated using the training samples are explained in the next section.

7. Results and discussion

As mentioned in the previous section, each observation (learning sample) given as input to CART contains the indices Qa, Qb, Qc and |(Sa + Sb + Sc)|. The proposed technique is tested using simulated data obtained with MATLAB/SIMULINK®. The data sets are created by considering different operating conditions, i.e. different

values of inception angles ranging between 0° and 360°, different values of fault resistances between 0 and 200 O and different fault distances from 0 to 300 km as follows: (a) Fault type: AG, BG, CG, ABG, BCG, CAG, AB, BC, CA, ABC. (b) Fault locations: Training: 12, 30, 48, 66, . . . , 282 km (in steps of 18 km). Testing: 0.25, 5, 9.75, . . . , 300 km (in steps of 4.75 km). (c) Fault inception angle: Training: 0°, 20°, 40°, 60°, ... , 340° (in steps of 20°).

7

J. Upendar et al. / Electrical Power and Energy Systems 36 (2012) 1–12

Fig. 5. The wavelet decomposition of the input current signal.

Table 1 Statistical decision tree based rules generated and its evaluation with the use of training data. Rule

Description

1 2 3 4 5 6 7 8 9 10 11 12 13

(Qc < 7001.53) (Qc > 7001.53) (Qc < 7001.53) (Qc < 7001.53) (Qc < 7001.53) (Qc > 7001.53) (Qc > 7001.53) (Qc > 7001.53) (Qc > 7001.53) (Qc > 7001.53) (Qc > 7001.53) (Qc > 7001.53) (Qc > 7001.53)

& & & & & & & & & & & & &

(Qa < 8241.23) (aS < 2.351e6) (Qa > 8241.23) & (aS < 0.8238) (Qa > 8241.23) & (aS > 0.8238) & (Qb < 7992.64) (Qa > 8241.23) & (aS > 0.8238) & (Qb > 7992.64) (aS > 2.351e6) & (Qb > 8815.69) & (aS > 4.296) (aS > 2.351e6) & (Qb < 8815.69) & (aS < 0.4715) & (Qb < 7122.81) (aS > 2.351e6) & (Qb < 8815.69) & (aS > 0.4715) & (Qa < 7520.79) (aS > 2.351e6) & (Qb < 8815.69) & (aS > 0.472) & (Qa > 7520.9) (aS > 2.351e6) & (Qb > 8815.69) & (aS < 4.296) & (Qa < 6919.3) (aS > 2.351e6) & (Qb > 8815.69) & (aS < 4.296) & (Qa > 6919.3) (aS > 2.351e6) & (Qb < 8815.69) & (aS < 0.472) & (Qb > 7122.81) & (Qb < 7199.5) (aS > 2.351e6) & (Qb < 8815.69) & (aS < 0.472) & (Qb > 7122.81) & (Qb > 7199.5)

Testing: 0°, 4°, 8°, 12° . . . , 359° (in steps of 4°). (d) Fault resistance (O): Training: 0, 12, 24, . . . , 200 O, (in steps of 12 O). Testing: 0, 10, 20, . . . , 200 O (in steps of 20 O). In building/training phase of the CART tree, 16  18  17 = 4896 observations are made for each fault type to form the rules, resulting into a training data set of 4896  10 = 48960 learning samples for all ten types of faults. Then, to test the proposed algorithm, 64  90  21 = 120,960 observations for each fault type were simulated. As a result, for testing purpose, total observation of 120,960  10 = 1,209,600 were simulated for all types of faults. The current signals obtained during various observations are decomposed in 9-levels using MRA. As for N-level decomposition, 2N samples are required. So in this work, two full cycles (0–720°) of current signals at a sampling period of Ts = 7.828  105 and frequency of 50 Hz are selected to get the (0.04/Ts)=512 samples between (0–720°). The first signal containing 512 (12.77 kHz) samples are passed through HPF and LPF, and corresponding approximate coefficients and detailed coefficients are recorded. As given in the Fig. 5, the 7th-level coefficients of HPF contains the frequency range between 99 and 199 Hz and represents the 2nd and 3rd harmonic frequency content of the faulted signals. During training, the DWT coefficients so obtained for all 48,960 observations, there is a class indicating the type of fault i.e. AG, BG, CG, ABG, BCG, CAG, AB, BC, CA, and ABC faults. Therefore, 48,960 observations with 10 overlapping classes are fed to CART. It then divides up the sample according to a ‘‘splitting rule’’ and a ‘‘goodness of split criteria’’. In order to capture the data structure, splitting algorithm will generate many splits (nodes) at the border. In the end, CART will grow into a huge tree where almost each observation at the border will be in a separate node. CART algorithm tries to split all observations of learning sample. 
CART first makes more important splits and then, CART tries to capture the overlapping structure. CART isolates the similar observations from the rest at each split.

Size

Class

2 2 3 4 4 4 5 5 5 5 5 6 6

BG ABC AB AG ABG BCG CAG CG CAG BC BCG CA CAG
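The training parameter grid described in this section can be enumerated with a few nested loops; a sketch with the ranges taken from the text, where each tuple stands in for a full simulated observation:

```python
def training_grid():
    """Enumerate the training operating conditions: 16 fault locations x
    18 inception angles x 17 fault resistances = 4896 observations
    per fault type."""
    locations = range(12, 283, 18)    # 12, 30, ..., 282 km
    angles = range(0, 341, 20)        # 0, 20, ..., 340 degrees
    resistances = range(0, 193, 12)   # 0, 12, ..., 192 ohm
    return [(d, a, r) for d in locations for a in angles for r in resistances]

fault_types = ['AG', 'BG', 'CG', 'ABG', 'BCG', 'CAG', 'AB', 'BC', 'CA', 'ABC']
```

Running the simulation over this grid for all ten fault types yields the 48,960 learning samples.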

Using the CART algorithm, the training samples generate a tree, which can be represented by the 13 rules presented in Table 1. These rules explain how the input variables of CART are distributed for the different types of fault. Some of the rules can easily be identified by experts, but obtaining all the rules needed for accurate classification is a difficult task. Columns 2 and 3 give the conditions of each rule and the number of conditions used to classify the respective fault. The decision tree formed from the rules generated by the CART algorithm on the training samples is shown in Fig. 6. A wide variety of classification options is available for classifying different faults, and no single classification solution will always perform best.
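As an illustration, the branch of Table 1 with Qc < 7001.53 (rules 1, 3, 4 and 5) can be applied directly as nested comparisons; a sketch with the thresholds copied from the table (helper name illustrative):

```python
def classify_low_qc(qa, qb, a_s):
    """Classify a sample already known to satisfy Qc < 7001.53,
    following rules 1 and 3-5 of Table 1 (a_s = |Sa + Sb + Sc|)."""
    if qa < 8241.23:
        return 'BG'                              # rule 1
    if a_s < 0.8238:
        return 'AB'                              # rule 3
    return 'AG' if qb < 7992.64 else 'ABG'       # rules 4 and 5
```

The full tree of Fig. 6 is just this pattern extended to all 13 rules.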

Fig. 6. Decision-tree formed using training samples.


Table 2
Test results for the proposed fault classification scheme using test samples.

Fault  Test samples recognized as                                                                                Success (%)
       AG       BG       CG       ABG      BCG      CAG      AB       BC       CA       ABC
AG     120,917  43       0        0        0        0        0        0        0        0        99.96
BG     0        120,960  0        0        0        0        0        0        0        0        100.00
CG     0        0        120,957  0        0        3        0        0        0        0        99.99
ABG    0        0        0        120,959  0        0        1        0        0        0        99.99
BCG    0        0        0        0        120,960  0        0        0        0        0        100.00
CAG    0        0        0        0        0        120,949  0        0        11       0        99.99
AB     0        0        0        0        0        0        120,960  0        0        0        100.00
BC     0        0        0        0        0        0        0        120,960  0        0        100.00
CA     0        0        0        0        0        294      0        0        120,666  0        99.75
ABC    0        0        0        0        0        0        0        0        0        120,960  100.00
Total                                                                         (1,209,248/1,209,600) × 100 =      99.97
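The per-class success percentages of such a confusion matrix are simply the diagonal entries over the row totals; a quick sketch:

```python
def success_percentages(confusion):
    """confusion[i][j] = test samples of true class i recognized as class j;
    returns the per-class success percentage (diagonal over row sum)."""
    return [round(100.0 * row[i] / sum(row), 2) for i, row in enumerate(confusion)]
```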

It is observed that all the learning samples were classified into the different fault classes while training the CART tree. The symbols x1, x2, x3 and x4 in Fig. 6 represent the indices |(Sa + Sb + Sc)|, Qa, Qb and Qc, respectively. Each non-terminal node is split into two child nodes: the right-side node represents the success condition of the rule at the parent node, whereas the left-side node represents its failure condition. During testing, the test samples are recognized using the rules extracted from the decision tree generated by the CART algorithm. Classification results for the testing samples using the proposed method are given in Table 2. The diagonal values in Table 2 denote the fault cases recognized correctly, and the off-diagonal values represent the misclassified cases. The proposed algorithm provides nearly 100% accuracy for the BG, BCG, AB, BC, ABC, CG, ABG and CAG test observations. The lowest accuracy, 99.75%, is obtained for the CA fault observations; some of these were classified as CAG faults and the remainder as no-fault. The per-class accuracy thus lies between 99.75% and 100%, while the overall accuracy is 99.97%. It is concluded that the decision tree correctly classifies all the fault samples with a very high degree of accuracy.

8. Comparison of results with BPNN

In order to classify the faults using the BPNN method, the same data sets created earlier for the different operating conditions are used in both the learning and the testing phase. A three-layer network is developed as shown in Fig. 7. The input layer contains 4 neurons, corresponding to the four variables Qa, Qb, Qc and |(Sa + Sb + Sc)|; the single hidden layer is composed of 10 neurons; and the output layer contains 4 neurons representing the three possibly faulted phases and ground, which together encode the type of fault. An input vector is propagated forward through the network to compute the output vector, which is compared with the desired output to determine the errors. The errors are then propagated back through the network from the output layer to the input layer, and the process is repeated until the error is minimized. In this investigation, the Mean Square Error (MSE) goal was set to 0.01 and the maximum number of training epochs to 400. Bias weights and momentum factors are used to obtain optimum results in minimum time. The error goal (MSE = 0.01) was reached after 22 epochs, as shown in Fig. 8.

Fig. 7. The architecture of back propagation neural network.

Fig. 8. Error convergence of BPNN during training process.

When designing a neural network, one crucial and difficult point is determining the number of neurons in the hidden layer, which is responsible for the internal representation of the data and the information transformation between the input and output layers. If there are too few hidden neurons, the network may not have sufficient degrees of freedom to form a representation; if too many are defined, the network may become overtrained. An optimum number of hidden neurons is therefore required. To achieve good performance, the BPNN was evaluated for various numbers of hidden neurons, and a topology with 10 hidden neurons showed the best performance for this problem.

After the training process, the algorithm was used to classify the test data. As an example, the output-layer results of the BPNN for a phase-phase-ground (ABG) fault at different locations, fault inception angles and fault resistances are presented in Fig. 9. The x-axis in Fig. 9 represents the input parameters of the BPNN input layer for all cases belonging to the training data set, and the y-axis represents the BPNN index of the corresponding output neuron. The BPNN index is '1' when the corresponding phase (or ground) is involved in the fault and '0' under healthy conditions. The index values of the 1st, 2nd and 4th output neurons are '1', indicating the involvement of phases A and B and of ground, as shown in Fig. 9a, b and d; the index of the remaining 3rd output neuron is '0', indicating the healthy condition of phase C. In total, 1,209,600 (= 120,960 × 10) fault cases in the transmission line were simulated, and 99.88% success was achieved in classifying all types of faults. To achieve good accuracy with the BPNN, it is necessary to evaluate a larger number of fault cases at different system conditions. The performance of the BPNN classifier for all fault types is summarized in Table 3; the accuracy of the wavelet-based BPNN algorithm is also satisfactory.

Table 3
Test results for the proposed fault classification scheme.

        Proposed decision tree method        Back propagation ANN
Fault   Success     Fail    %Success         Success     Failure   %Success
AG      120,917     43      99.96            120,921     39        99.97
BG      120,960     0       100.00           120,892     68        99.94
CG      120,957     3       99.99            120,810     150       99.88
ABG     120,959     1       99.99            120,959     1         99.99
BCG     120,960     0       100.00           120,960     0         100.00
CAG     120,949     11      99.99            120,782     178       99.85
AB      120,960     0       100.00           120,677     283       99.77
BC      120,960     0       100.00           120,960     0         100.00
CA      120,666     294     99.75            120,232     728       99.40
ABC     120,960     0       100.00           120,960     0         100.00
Total   1,209,248   352     99.97            1,208,153   1447      99.88

One of the strongest advantages of applying CART to create classification rules is that a large array of potentially useful data can be offered for analysis, with CART automatically selecting which inputs are useful and which are not. This selection process distinguishes CART from other classification methods such as expert systems, fuzzy logic and neural networks. With expert systems, a priori knowledge is necessary to select the data. With neural networks, only the useful inputs will effectively be used, but the selection may be hidden from the analyst, hindering interpretation of the results and application to other classification problems. The percentage failure in classifying the testing data set using the CART and BPNN methods is shown in Fig. 10. The results show that the CART method classifies the fault type more accurately than BPNN under all operating conditions. Moreover, the CART method identifies the testing patterns almost immediately.
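A minimal sketch of the training loop described in Section 8, in plain Python rather than the paper's MATLAB toolbox. Assumptions: sigmoid activations throughout, plain gradient descent on the MSE (the momentum term mentioned in the text is omitted), and a toy data set instead of the wavelet indices. The `decode_fault` helper applies the 0/1 output coding of Fig. 9:

```python
import math, random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def forward(x, w1, w2):
    """One pass input -> hidden -> output; each weight row ends with a bias."""
    xb = list(x) + [1.0]
    h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w1]
    hb = h + [1.0]
    y = [sigmoid(sum(w * v for w, v in zip(row, hb))) for row in w2]
    return h, y

def train(samples, n_hidden=10, lr=0.5, max_epochs=400, goal_mse=0.01):
    """Back-propagation on the squared error until the MSE goal or the
    epoch limit is reached; samples = [(input_vec, target_vec), ...]."""
    random.seed(1)
    n_in, n_out = len(samples[0][0]), len(samples[0][1])
    w1 = [[random.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w2 = [[random.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)] for _ in range(n_out)]
    mse = float('inf')
    for epoch in range(1, max_epochs + 1):
        sse = 0.0
        for x, t in samples:
            h, y = forward(x, w1, w2)
            xb, hb = list(x) + [1.0], h + [1.0]
            # error terms at the output and hidden layers (sigmoid derivative)
            dy = [(yk - tk) * yk * (1 - yk) for yk, tk in zip(y, t)]
            dh = [hj * (1 - hj) * sum(w2[k][j] * dy[k] for k in range(n_out))
                  for j, hj in enumerate(h)]
            for k in range(n_out):
                for j in range(n_hidden + 1):
                    w2[k][j] -= lr * dy[k] * hb[j]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    w1[j][i] -= lr * dh[j] * xb[i]
            sse += sum((yk - tk) ** 2 for yk, tk in zip(y, t))
        mse = sse / (len(samples) * n_out)
        if mse <= goal_mse:
            break
    return w1, w2, mse, epoch

def decode_fault(outputs, threshold=0.5):
    """Map the 4 output activations (phases A, B, C and ground) to a label."""
    return ''.join(name for name, o in zip('ABCG', outputs) if o > threshold)
```

For example, output activations of roughly (1, 1, 0, 1) decode to "ABG", matching the coding described for Fig. 9.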

Fig. 10. Comparison of % failure in fault classification by applying CART and BPNN methods.

9. Inverse interpolation based fault location algorithm

After classifying the type of fault using the method explained in the previous section, both the training and testing data sets created

Fig. 9. Variation of BPNN indexes of output layer during the LLG (AB-G) fault for all testing samples observed at different distances, fault inception angles and fault resistances.


using the |S|, Qa, Qb and Qc parameters in Sections 6 and 7 are used for the detection of the fault location. Using the data set of the corresponding fault type, the fault location is found by applying the inverse interpolation method. The parameters |S|, Qa, Qb and Qc are generally non-linear in nature under the various operating conditions, as shown in Fig. 4. In view of this non-linear nature of the input parameters, the inverse interpolation technique (see Appendix A) can easily be applied to detect the fault location. Using the fault data thus generated, the fault location is carried out by means of a MATLAB program.
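Because the index curves of Fig. 4 are non-monotonic, inverting them must return every candidate location rather than a single value, and the candidates obtained from the four indices are then matched by minimizing an error measure. A sketch under two assumptions: the index curves are treated as piecewise linear, and the final location is taken as the mean of the minimum-error combination (the paper does not spell out this last step). Function names are illustrative:

```python
import itertools, math

def inverse_interpolate(xs, ys, y_target):
    """Return every x at which the piecewise-linear curve through the
    tabulated (xs, ys) points attains y_target; unlike ordinary
    interpolation this also works for non-monotonic curves."""
    hits = []
    for i in range(len(xs) - 1):
        y0, y1 = ys[i], ys[i + 1]
        if y0 == y1:
            continue                       # flat segment: no unique crossing
        t = (y_target - y0) / (y1 - y0)    # fractional position in the segment
        if 0.0 <= t <= 1.0:
            hits.append(xs[i] + t * (xs[i + 1] - xs[i]))
    return hits

def best_location(candidates):
    """candidates: dict of candidate-distance lists for 'Qa', 'Qb', 'Qc', 'S'.
    Pick the combination minimizing the pairwise error e and return
    (e, estimated distance)."""
    best = None
    for di, dj, dk, dh in itertools.product(candidates['Qa'], candidates['Qb'],
                                            candidates['Qc'], candidates['S']):
        e = math.sqrt((di - dj)**2 + (dj - dk)**2 + (dk - dh)**2 + (dh - di)**2)
        if best is None or e < best[0]:
            best = (e, (di + dj + dk + dh) / 4.0)
    return best
```

In the ideal case all four indices agree on the same distance and the error e is zero.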

9.1. Fault location algorithm

Initially, at a given inception angle, the possible values of fault location that match the measured value of Qa are calculated from the data set [Qa] of the corresponding fault type using inverse interpolation. Denote these tabulated values dQa,i, where i = 1 to m. Similarly, the possible values dQb,j, dQc,k and d|S|,h are calculated from the data sets [Qb], [Qc] and [|S|] using inverse interpolation, where j = 1 to n, k = 1 to p and h = 1 to q. The error

e = √[(di − dj)² + (dj − dk)² + (dk − dh)² + (dh − di)²]

is calculated for all values of i, j, k and h, i.e. for all (m × n × p × q) possible combinations at each inception angle. The above procedure is repeated for all inception angles of the given data set. Finally, the combination of dQa, dQb, dQc and d|S| that yields the minimum error is taken as the fault location; in the ideal case the error is zero, with dQa = dQb = dQc = d|S|.

Fig. 11a and b show the actual and estimated fault locations for AG and ABG faults under various operating conditions, for fault inception angles from 0° to 360° with the fault resistance increased stepwise from 0 to 200 Ω (only the two parameters inception angle and estimated distance can be represented in the two-dimensional plots of Fig. 11a and b). The error in the estimated fault location with respect to the total line length is shown in Fig. 11c and d. The figure shows that the proposed method presents low errors, generally less than 1.5%. It is observed that the proposed method performs adequately in all studied cases for the various fault types. Some representative fault location results are presented in Table 4, which shows the good accuracy of the results for the different fault types.

Table 4
Estimated fault location of different fault types based on the inverse interpolation method.

Fault type  Rf (Ω)  Inception angle  Fault location (km)              Error ((d − D)/L) × 100
                                     Actual (d)   Estimated (D)
AG          0       5                24.2         24.202              0.00067
AG          100     149              128.7        128.66              0.0133
AG          200     293              233.25       233.15              0.0333
ABG         30      48.2             55.55        55.456              0.0313
ABG         70      105.8            97.35        99.704              0.1180
ABG         130     192.2            160.05       160.23              0.0600
AB          10      19.45            34.65        34.274              0.1253
AB          80      120.2            107.8        107.83              0.0100
AB          160     236.45           191.4        191.23              0.0567
ABC         20      33.8             45.1         44.99               0.0367
ABC         60      91.4             86.9         88.02               0.37333
ABC         170     249.8            201.85       200.13              0.5733

Fig. 11. Actual and estimated fault location with simulated data during (a) AG and (b) ABG fault. Error in fault location with simulated data during (c) AG and (d) ABG fault.

10. Conclusion

A new method is proposed for fault classification based on the statistical Classification and Regression Tree (CART) method. The proposed algorithm uses the digitized samples of the current signals: the DWT is used to extract features of the faulted signals, and the statistical CART is used to classify the type of fault. The test results show that the method based on the proposed statistical algorithm is able to classify faults with very high precision under various fault conditions. It can be concluded that the proposed fault classification technique is simple and can achieve very high accuracy. CART easily handles both numerical and categorical variables, and among its other advantages is its robustness to outliers: the splitting algorithm will usually isolate outliers in individual nodes. An important practical property of CART is that the structure of its classification or regression trees is invariant with respect to monotonic transformations of the independent variables; one can replace any variable with its logarithm or square root, and the structure of the tree will not change. Supervised fault classification with CART analysis is an effective and easily implemented means of creating a rule-based classification when expert knowledge is insufficient. Applied in appropriate circumstances, it provides an alternative tool for fault classification.

Fig. A2. Pure sine wave with an original data set of 30 points.

Appendix A

A.1. Interpolation

Interpolation is the computation of points or values between ones that are known or tabulated, using the surrounding points or values. It allows unknown values to be predicted: interpolation takes a series of (x, y) points and generates estimated values of y at new x points, and is used when the function that generated the original (x, y) points is unknown. Interpolation is related to, but distinct from, fitting a function to a series of points; in particular, an interpolated function passes through all the original points, while a fitted function may not.

There are various methods for performing interpolation. Linear interpolation is the simplest form: between two points (x1, y1) and (x2, y2) with x1 < x2, a linear polynomial of the form

f(x) = a2·x + a1

is interpolated. Consider the points (2, 1) and (5, 3); these points, along with the interpolated linear polynomial, are graphed in Fig. A1. To solve for the interpolated polynomial, the constants a2 and a1 must be determined. If the linear polynomial is to pass through (2, 1) and (5, 3), it must satisfy

f(2) = 2·a2 + a1 = 1,
f(5) = 5·a2 + a1 = 3.

Solving, the interpolated polynomial becomes f(x) = 2x/3 − 1/3.

Fig. A1. A linear polynomial interpolated between the two points (2, 1) and (5, 3).

Inverse interpolation: suppose the values of a function have already been tabulated, and it is required to find the value of the argument that corresponds to some given value of the function, intermediate between two of the tabulated values. The process by which this value (which may be called the ''anti-function'') is obtained is known as inverse interpolation; formulae for determining the anti-function may be obtained by reverting any of the formulae of interpolation.

Fig. A3. Pure sine wave with 4000 points using inverse interpolation.
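The worked example above, through the points (2, 1) and (5, 3), can be checked numerically; a minimal sketch:

```python
def fit_linear(p1, p2):
    """Solve f(x) = a2*x + a1 through two given points."""
    (x1, y1), (x2, y2) = p1, p2
    a2 = (y2 - y1) / (x2 - x1)   # slope
    a1 = y1 - a2 * x1            # intercept
    return a2, a1

a2, a1 = fit_linear((2, 1), (5, 3))
f = lambda x: a2 * x + a1        # recovers f(x) = 2x/3 - 1/3
```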

Inverse interpolation takes a series of (x, y) points and generates estimated values of x at new y points, and is used when the function that generated the original (x, y) points is unknown. A simple application of inverse interpolation is illustrated by Figs. A2 and A3. In Fig. A2, one full cycle of a 50 Hz sine wave is drawn between 0 and 20 ms with a data set of 30 points. Fig. A3 is obtained by applying inverse interpolation to the original data set: the different values of time are determined for the 4000 values lying between −1 and +1 at intervals of 0.001. The normal interpolation method cannot be used to determine the possible values of time on the x-axis for a given value between −1 and +1 on the y-axis, because it is applicable only to monotonic curves. The main advantage of the inverse interpolation method is that accurate x-axis values can be obtained even when the waveform is non-monotonic in nature.