Short Paper
Probabilistic modelling of distribution harmonics: data collection and analysis
Z A Yamayee
Gonzaga University, Spokane, WA, USA
W E Kazibwe
Clarkson University, Potsdam, NY, USA
Numerous nonlinear loads on the distribution system distort currents and voltages. The exact penetration, time of operation, etc. of these loads are non-deterministic. Therefore, harmonics at a distribution node (branch) should be viewed as a stochastic process. This paper presents a technique, based on statistical tests, which can be used to derive a probabilistic model for harmonics. This technique can be used to assess impacts of harmonics. An example demonstrates the application of the proposed method. The methodology consists of the following two tests: (1) a non-parametric run test to establish stationarity of a data set, and (2) a chi-squared goodness-of-fit test, another non-parametric test, to establish normality of the data set.
Keywords: distribution systems, power system element modelling, data gathering and analysis
I. Introduction
Numerous nonlinear loads are connected to a distribution feeder, either directly or through a distribution transformer. These nonlinear loads and transformer saturation distort currents and voltages on the system. The exact penetration and time of operation of these harmonic-producing elements are non-deterministic. The specific type of harmonic information needed can be identified from the undesirable effects of harmonics. Some of these effects are (Reference 1):
1 Increased power loss, overheating of transformers, etc. These losses can be calculated from the knowledge of magnitudes of harmonic currents (voltages).
2 Capacitor bank failure. The presence of harmonics at a capacitor bank can increase dielectric losses and can cause resonance conditions and over-voltages. To assess these phenomena we need both the magnitude and phase angle of each harmonic.
3 Telephone interference (TI). In assessing TI we need the magnitude of each harmonic.
Therefore, in order to assess the impacts of harmonics we need the magnitude and phase angle of each harmonic. Reference 2 proposes a list of information to be collected.
This paper presents two non-parametric tests which are used to analyse the data collected. These tests determine stochastic properties of the collected data. The first test, called the 'Run Test', determines the stationarity of the data. The Run Test is based on the hypothesis that the data are independent sample values of a random variable. If this hypothesis is true, the variations in the sequence of sample values will be random and display no trend. The second test, the chi-squared goodness-of-fit test, determines the normality of the data. This is also a hypothesis test.
The paper is organized as follows. Representation of a harmonic as a random variable is presented first, followed by measurements and processing of a distorted wave on the distribution system. The method of deriving the probabilistic model from the collected data is presented next, followed by a sample problem. Conclusions and a list of references close the paper.
II. Harmonic as a stochastic process
Harmonic contents of voltage and current waves at a particular location on a distribution system vary continuously as a function of time. These variations are due to changes in load and the stochastic behaviour of power system equipment. Therefore, a particular harmonic h changes continuously over time during a day.
Received: 21 August 1985, Revised: 13 January 1987
Vol 9 No 3 July 1987
0142-0615/87/030189-04 © 1987 Butterworth & Co (Publishers) Ltd
Generally, characteristics of the system load and devices do not change drastically from one day to the next. Hence, harmonic h can be viewed as a stochastic process. The data recorded for one day represent a sample record of this stochastic process.
To obtain an ensemble of this stochastic process, we need to collect the time history of harmonics over many weekdays. However, if this process is stationary, statistical modelling can be performed using a single day's record. From this single-day record we can derive a stochastic model for the process. For example, let us say we collect data for the 10:00-11:00 time interval on a weekday. We apply a stationarity test to these data. If the recorded data are stationary, we need only these 10:00-11:00 data to derive a probabilistic model. This model can be used to represent a typical weekday 10:00-11:00 period.
Other time intervals of interest are the peak and off-peak hours of a day. Due to hourly load variations, it is not expected that harmonics at all hours of a day have identical properties. However, it may be that the peak hours of a day have similar properties, and the off-peak hours similar properties. If we can establish this hypothesis, two probability functions, one peak and one off-peak, may be sufficient to model harmonics on a weekday. A final issue of interest is the correlation between peak and off-peak harmonics. It may be that harmonics at these hours are related to each other by a constant. This constant could be the ratio of loads at peak and off-peak hours.
III. Measurement and processing of a sample distorted wave
Measurement is performed by recording the electrical signals, i.e. voltages and currents, on a magnetic tape. As many as seven signals may be recorded simultaneously. These include three phase voltages, three phase currents, and the neutral current.
Data on tape are generally sampled at a rate of 1920 Hz. This sampling rate implies 32 samples per cycle of 60 Hz (32 samples per 16.67 ms, or about one sample every 0.5 ms). This sampling rate is appropriate for capturing the most troublesome harmonic contents of a wave. A spectrum analyser using the Fourier Transform (FT) or Fast FT (FFT) converts these samples into their spectral components. The next step is to calculate phasors of each harmonic h. The 60 Hz component of one of the waveforms, e.g. phase A voltage, is taken as reference. The root-mean-square (RMS) value and phase angle of each harmonic are calculated. Data are continuously sampled to get as many readings as possible. The output of this procedure is a chronological cycle-by-cycle record of phasors, RMS value and phase angle, of each harmonic h of each waveform. These data can be used to perform the statistical tests needed to derive probabilistic models of harmonics at the location where data were collected.
IV. Statistical tests of recorded data
The first test applied to the chronological data is to determine daily stationarity. This test determines whether a harmonic record for a day is sufficient to derive a probabilistic model. If the answer is yes, then each hour, e.g. 10:00-11:00, of a day can be represented by a probability density function (PDF). Furthermore, if the data satisfy the normality test, the mean and standard deviation for the PDF can be calculated from these data. We will explain the procedure by way of an example. Let us assume that the voltage V(t), between 10:00 and 11:00 on Monday morning, is recorded. These data are analysed by a spectrum analyser, and harmonic h is singled out. Assume, further, that $V_1, \ldots, V_N$ represent phasors for harmonic h voltage. Stationarity of these data can be tested as follows.
Test the magnitudes and phase angles of $V_1, \ldots, V_N$ for the presence of obvious trends or variations other than those due to expected sampling variations. If trends do exist, the process is non-stationary. Hence, ensembles (data from several days) should be used to derive statistical models. If there are no trends, we proceed.
We next perform a rigorous non-parametric test. The reason for the use of a non-parametric test is that such tests do not require knowledge of the sampling distribution of data parameters. The non-parametric test we propose is called the Run Test (Reference 3). This test can be applied to our problem as follows. We explain the process for voltage magnitude only. It is hypothesized that $V_1, \ldots, V_N$ are each independent sample values of a random variable V. If this hypothesis is true, the variations in the sequence of sample values are random and display no trends. Hence, the number of runs in the chronological sequence relative to the mean,
$\bar{V} = \frac{1}{N} \sum_{i=1}^{N} V_i$,
will be as expected for a sequence of independent random observations. We classify each observation into one of the following mutually exclusive categories. These categories may be identified simply by (+), if greater than the mean; (-), if smaller than the mean; and (b), if equal to the mean. The sequence of plus-minus observations might be as follows:
[Illustrative sequence of (+), (-) and (b) symbols, with consecutive runs numbered 1 to 13]
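The classification step can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `classify` and `count_runs` and the eight sample magnitudes are hypothetical and do not come from the paper. A run is counted here as a maximal block of identical consecutive symbols.

```python
def classify(values, mean):
    """Label each observation '+', '-' or 'b' relative to the mean."""
    symbols = []
    for v in values:
        if v > mean:
            symbols.append("+")
        elif v < mean:
            symbols.append("-")
        else:
            symbols.append("b")
    return symbols


def count_runs(symbols):
    """Count maximal blocks of identical consecutive symbols."""
    runs = 0
    previous = None
    for s in symbols:
        if s != previous:
            runs += 1
            previous = s
    return runs


# Hypothetical magnitudes, chosen only to exercise the two functions.
sample = [4, 6, 5, 3, 7, 7, 2, 6]
sample_mean = sum(sample) / len(sample)

symbols = classify(sample, sample_mean)
r = count_runs(symbols)
print("".join(symbols), r)
```

The run count `r` is the quantity that is compared with the tabulated limits in the test.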
A run is defined as a sequence of identical observations that is followed and preceded by a different observation or no observation at all. In the above example there are r = 13 runs in the sequence of N = 20 observations. The sampling distribution for the number of runs in the sequence is given in Reference 3. The expected number of runs, r, for various numbers of observations, N, is presented in Table 1. The confidence level is represented by $(1 - \alpha)$. For example, for $\alpha = 0.05$, we are 95% sure that the results of the test are accurate. For the above example, we check the independence hypothesis by comparing the observed runs, r = 13, to the interval between $r_{20,1-0.05/2}$ and $r_{20,0.05/2}$ calculated from Table 1:
$r_{20,1-0.05/2} = r_{20,0.975} = 6, \qquad r_{20,0.05/2} = r_{20,0.025} = 15$
Since r = 13 falls between 6 and 15, the hypothesis is accepted. That is, there is no reason to question that the observations are independent. Therefore, the process is stationary.
Table 1. Run distribution
N     0.975   0.95   0.05   0.025
10    2       3      8      9
12    3       3      10     10
14    3       4      11     12
16    4       5      12     13
18    5       6      13     14
20    6       6      15     15
22    7       7      16     16
24    7       8      17     18
26    8       9      18     19
28    9       10     19     20
30    10      11     20     21
32    11      11     22     22
36    12      13     24     25
40    14      15     26     27
50    18      19     32     33
60    22      24     37     39
70    27      28     43     44
80    31      33     48     50
90    36      37     54     55
100   40      42     59     61
110   45      46     65     66
120   49      51     70     72
130   54      56     75     77
140   58      60     81     83
150   63      65     86     88
160   68      70     91     93
170   72      74     97     99
180   77      79     102    104
190   82      84     107    109
200   86      88     113    115
After the process passes the stationarity test, the next task is to determine if a normal distribution represents the probability distribution for the voltage V(t) between 10:00 and 11:00 of a weekday. The normality test we propose is called the chi-squared goodness-of-fit test (Reference 3). The procedure involves the use of a statistic with an approximate chi-squared distribution as a measure of the discrepancy between the observed PDF and the normal density function. A hypothesis of equivalence is then tested by studying the sampling distribution of this statistic.
Let us explain the procedure using our $V_1, \ldots, V_N$ data. Let the N observations be grouped into K intervals of equal width. These intervals are called 'class intervals'. The number of observations falling within the ith class interval is called the observed frequency in the ith class, and will be denoted by $f_i$. The number of observations which would be expected to fall within the ith class interval if the true PDF of V were a normal density function is called the expected frequency in the ith class, and is denoted by $F_i$. To measure the total discrepancy for all class intervals, the squares of the discrepancies in each interval, normalized by the expected frequencies, are summed to obtain the sample statistic
$X^2 = \sum_{i=1}^{K} \frac{(f_i - F_i)^2}{F_i}$    (1)
The distribution for $X^2$ is approximately the same as for $\chi^2$. The number of degrees of freedom, n, is equal to K minus the number of different independent linear restrictions imposed on the observations. There is one such restriction due to the fact that the frequency in the last class interval is determined once the frequencies in the first K - 1 class intervals are known. For the normal distribution test two additional constraints are involved, because the mean and variance of the normal distribution must be computed from the observed data. Therefore, the number of degrees of freedom for $X^2$ is n = K - 3. Using equation (1), $X^2$ is computed. If $X^2 \le \chi^2_{n;0.05}$, the hypothesis is accepted. Values of $\chi^2_{n;0.05}$ for different n are given in Table 2. If the sample value $X^2$ is greater than $\chi^2_{n;0.05}$, the hypothesis is rejected at the $\alpha = 0.05$ level of significance.
Table 2. Chi-squared distribution
n     $\chi^2_{n;0.05}$     n     $\chi^2_{n;0.05}$
1     7.88                  21    41.40
2     10.60                 22    42.80
3     12.84                 23    44.18
4     14.86                 24    45.56
5     16.75                 25    46.93
6     18.55                 26    48.29
7     20.28                 27    49.64
8     21.96                 28    50.99
9     23.59                 29    52.34
10    25.19                 30    53.67
11    26.76                 40    66.77
12    28.30                 60    91.95
13    29.82                 120   163.65
14    31.32
15    32.80
16    34.27
17    35.72
18    37.16
19    38.58
20    40.00
Electrical Power & Energy Systems
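The goodness-of-fit calculation of equation (1) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the frequencies are the ones listed in Table 4 of the sample problem in Section V, and skipping a class whose expected frequency is zero is an assumption that mirrors the 'Undefined' entry in that table. The critical value 12.84 is the Table 2 entry for n = 3 as printed; standard chi-squared tables give 7.81 for n = 3 at the 0.05 level, and the decision is the same with either value.

```python
def chi_squared_statistic(observed, expected):
    """Sum (f_i - F_i)^2 / F_i over classes with non-zero expected frequency.

    Returns the statistic and the indices of the classes that were skipped
    because F_i = 0 makes the term undefined (an assumption, see lead-in).
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    total = 0.0
    skipped = []
    for i, (f, F) in enumerate(zip(observed, expected)):
        if F == 0:
            skipped.append(i)
            continue
        total += (f - F) ** 2 / F
    return total, skipped


def degrees_of_freedom(num_classes, estimated_parameters=2):
    """n = K - 1 - (number of parameters estimated from the data)."""
    return num_classes - 1 - estimated_parameters


# Frequencies as listed in Table 4 of the sample problem (Section V).
observed = [1, 0, 1, 7, 17, 21]   # f_i
expected = [0, 1, 2, 6, 16, 21]   # F_i
critical_value = 12.84            # Table 2 entry for n = 3, as printed

x2, skipped = chi_squared_statistic(observed, expected)
n = degrees_of_freedom(len(observed))
accepted = x2 <= critical_value
print(f"X^2 = {x2:.2f}, n = {n}, accepted = {accepted}, skipped = {skipped}")
```

With two parameters (mean and variance) estimated from the data, `degrees_of_freedom` reduces to n = K - 3, as in the text.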
V. Sample example
A set of 47 chronological seventh-harmonic voltage magnitudes is given in Table 3. Stationarity and normality tests are applied to this data set.
Table 3. Harmonic data for sample problem (harmonic number 7 - 420 Hz)
Data no.  Magnitude    Data no.  Magnitude
1         163          26        172
2         159          27        161
3         166          28        162
4         168          29        162
5         162          30        171
6         162          31        172
7         154          32        171
8         156          33        170
9         159          34        153
10        158          35        174
11        152          36        130
12        150          37        168
13        157          38        166
14        161          39        162
15        168          40        171
16        160          41        168
17        159          42        168
18        157          43        163
19        159          44        168
20        168          45        164
21        168          46        167
22        162          47        162
23        167
24        172
25        171
V.1 Stationarity test
1 Looking at the data, there are no apparent trends.
2 We perform the Run Test around the mean (average) of the data. The average is 163. To apply the Run Test, all observations with a value greater than 163 are identified by a (+), all values less than 163 by (-), and all values equal to 163 by (b). The result is:
b - ++ ---------- + ---- ++ - ++++ --- ++++ - + - ++ - +++ b +++ -
(N = 47, average = 163)
Hence, there are r = 20 runs in the sequence of 47 observations. For these data to be stationary, we need to show that the hypothesis that these observations are independent is true. For this hypothesis to be true, r should fall in the following range:
$r_{47,1-\alpha/2} < r \le r_{47,\alpha/2}$
For $\alpha = 0.05$, $r_{47,0.975} = 17$ and $r_{47,0.025} = 31$. Since r = 20 falls between 17 and 31, harmonic number 7 (420 Hz) represents a stationary process. Hence, measured data from one data set can be used to derive a statistical model for this harmonic.
V.2 Normality test
The next question is to determine if the data represent a normal distribution. To perform this test, we apply the chi-squared goodness-of-fit approach. We used six class intervals, i.e. K = 6. Results are summarized in Table 4. For this test n = K - 3 = 3. From Table 2, $\chi^2_{n;0.05} = \chi^2_{3;0.05} = 12.84$. Since $X^2 = 1.73 < \chi^2_{3;0.05}$, the normality assumption for this harmonic is acceptable. The mean and standard deviation are 163 volts and 7.6 volts, respectively.
Table 4. Calculations for normality test
Interval number (range of values)   $f_i$   $F_i$*   $(f_i - F_i)^2 / F_i$
1 (130-137)                         1       0        Undefined
2 (137-141)                         0       1        1
3 (141-151)                         1       2        0.5
4 (151-158)                         7       6        0.1666
5 (158-165)                         17      16       0.0625
6 (165-174)                         21      21       0
N = 47, mean = 163, standard deviation = 7.6, $X^2 = \sum_{i=1}^{K} \frac{(f_i - F_i)^2}{F_i} = 1.73$
* Used Math Tables (16) and the example data to calculate entries in this column
VI. Conclusions
In this paper, we have presented statistical techniques that can be used to derive probabilistic models of existing harmonics on distribution systems. These techniques are: (1) the Run Test, used to establish stationarity of recorded data, and (2) the chi-squared goodness-of-fit test, used to establish normality of the data. We applied these tests to a sample record of voltage magnitudes of the seventh harmonic.
We applied the Run Test to check stationarity of the harmonic. The results indicate that this harmonic is a stationary process. Hence, to model this harmonic stochastically, we need only measure data for a typical weekday. Then, we checked the data for normality, using the chi-squared goodness-of-fit test. The test showed that the assumption of a normal distribution is a valid one. This is not unexpected. Load on a power system is usually normally distributed. Since the major sources of harmonics on a distribution feeder are feeder loads, a normal distribution is a logical one.
Normality of the harmonic makes stochastic modelling very convenient. We need only the mean ($\mu$) and standard deviation ($\sigma$) to have a complete probabilistic model. For the data used in our sample problem, the mean is 163 and the standard deviation 7.6.
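The figures quoted for the sample problem can be cross-checked against the Table 3 data. The sketch below is illustrative only and rests on two assumptions that are not stated in the paper: the standard deviation is computed in its population form (dividing by N), and the run limits for N = 47 are obtained by linear interpolation between the N = 40 and N = 50 rows of Table 1. Both choices happen to reproduce the quoted values. The helper name `interpolate` is hypothetical.

```python
import math
from itertools import groupby

# Seventh-harmonic voltage magnitudes, data no. 1 to 47 from Table 3.
magnitudes = [
    163, 159, 166, 168, 162, 162, 154, 156, 159, 158, 152, 150, 157,
    161, 168, 160, 159, 157, 159, 168, 168, 162, 167, 172, 171, 172,
    161, 162, 162, 171, 172, 171, 170, 153, 174, 130, 168, 166, 162,
    171, 168, 168, 163, 168, 164, 167, 162,
]

n_obs = len(magnitudes)
mean = sum(magnitudes) / n_obs
# Population form (divide by N); an assumption, see the lead-in.
std_dev = math.sqrt(sum((v - mean) ** 2 for v in magnitudes) / n_obs)

# The paper classifies against the rounded average, 163.
reference = round(mean)
symbols = ["+" if v > reference else "-" if v < reference else "b"
           for v in magnitudes]
# Each group produced by groupby is one run of identical symbols.
runs = sum(1 for _ in groupby(symbols))


def interpolate(n, n_lo, r_lo, n_hi, r_hi):
    """Linear interpolation between two Table 1 rows, rounded to an integer."""
    return round(r_lo + (n - n_lo) * (r_hi - r_lo) / (n_hi - n_lo))


# Table 1 rows for N = 40 and N = 50 (columns 0.975 and 0.025).
lower = interpolate(n_obs, 40, 14, 50, 18)
upper = interpolate(n_obs, 40, 27, 50, 33)
stationary = lower < runs <= upper

print(f"N = {n_obs}, mean = {mean:.1f}, std = {std_dev:.1f}")
print(f"runs = {runs}, limits = {lower}..{upper}, stationary = {stationary}")
```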
VII. References
1 'Power System Harmonics' IEEE PES Tutorial Course Text No 84 EH0221-2-PWR (1984)
2 Orr, J A and Cyganski, D 'Data collection and statistical analysis techniques for power system harmonics' Proc. Int. Conf. Harmonics in Power Systems, Worcester Polytechnic Institute (October 1984) pp 73-77
3 Bendat, J S and Piersol, A G Random Data: Analysis and Measurement Procedures John Wiley & Sons, Inc. (1971)