Applied Energy 135 (2014) 382–390
Applied Energy journal homepage: www.elsevier.com/locate/apenergy
Characterizing probability density distributions for household electricity load profiles from high-resolution electricity use data
Joakim Munkhammar a,⇑, Jesper Rydén b, Joakim Widén a
a Built Environment Energy Systems Group, Department of Engineering Sciences, Uppsala University, SE-751 21 Uppsala, Sweden
b Department of Mathematics, Uppsala University, SE-751 06 Uppsala, Sweden
Highlights
– A probability distribution model of household electricity use is presented.
– The distributions are fitted with high-resolution data on household electricity use.
– Both Weibull and Log-Normal distributions are used.
– The aggregate distribution of multiple uncorrelated households approaches a Gaussian.
Article info
Article history: Received 30 October 2013 Received in revised form 22 August 2014 Accepted 24 August 2014
Keywords: Household electricity use Stochastic modeling Probability density distributions Weibull distribution
Abstract
This paper presents a high-resolution bottom-up model of electricity use in an average household based on fit to probability distributions of a comprehensive high-resolution household electricity use data set for detached houses in Sweden. The distributions used in this paper are the Weibull distribution and the Log-Normal distribution. These fitted distributions are analyzed in terms of relative variation estimates of electricity use and standard deviation. It is concluded that the distributions have a reasonable overall goodness of fit both in terms of electricity use and standard deviation. A Kolmogorov–Smirnov test of goodness of fit is also provided. In addition to this, the model is extended to multiple households via convolution of individual electricity use profiles. With the use of the central limit theorem this is analytically extended to the general case of a large number of households. Finally a brief comparison with other models of probability distributions is made along with a discussion regarding the model and its applicability.
© 2014 Elsevier Ltd. All rights reserved.
1. Introduction
Quantifying residential electricity use is valuable for such diverse purposes as devising demand-side management strategies for increased energy efficiency in buildings, integrating distributed renewable energy supply in the built environment and designing electricity distribution grids for urban or rural communities [26,28,29]. However, household electricity use with high time resolution is complex to quantify. Not only are there seasonal and diurnal variations in electricity use from, for example, heating, lighting, cooking and dishwashing, but the load is also highly stochastic [15]. Based on monitoring data from households it is possible to study electricity use by devising "bottom-up" models based on assumptions or data for activity patterns, appliance use and appliances [4,18,20,21,27,29–33]. Of these, [4,18,27,30,33] make use of detailed information on household appliances and occupancy to model electricity use of any number of households, while [20,21,31,32] use data on occupancy and appliances to construct stochastic models [29]. Here bottom-up means calculating the electricity use of individual households and possibly scaling up the results to estimate regional or national electricity use [15,26]. Conversely, a "top-down" approach typically considers the residential sector only as an energy sink, with no resolution of individual households [15,26]. Top-down models need only aggregate data, while bottom-up models need more detailed data at the household level [26]. In both top-down and bottom-up approaches the modeling of electricity use may be either deterministic or stochastic. In both the deterministic and the stochastic approaches, measured or estimated data are needed to set up the models, along with assumptions regarding model configurations [32]. Stochastic models can be based on methods such as Markov chains and probability distributions [26,31].
⇑ Corresponding author. Tel.: +46 704464271. E-mail addresses: [email protected] (J. Munkhammar), [email protected] (J. Rydén), [email protected] (J. Widén).
http://dx.doi.org/10.1016/j.apenergy.2014.08.093
0306-2619/© 2014 Elsevier Ltd. All rights reserved.
J. Munkhammar et al. / Applied Energy 135 (2014) 382–390
The problem of bottom-up quantification of individual household electricity use can be condensed to quantifying three factors: (a) the set of appliances in the household, (b) the electricity use of the appliances and (c) the use patterns of the appliances [31]. The stochastic nature of household electricity use mostly stems from (c), that is, mainly from human behavior. The estimation of probability distributions, or probability density functions (PDFs), for describing electricity use in households has been developed for example for demand forecasting [7]. Such models have also been used for modeling loads in distribution networks [24]. There are several studies, at various levels of resolution and detail, dealing with different distributions for household load profiles such as Normal, Log-Normal, Gamma, Gumbel, Inverse-Normal, Beta, Exponential, Rayleigh and Weibull [8–11,13,16,23,27]. A conclusion which can be drawn from the literature is that there is generally no unique or canonical distribution type suitable for modeling household electricity use [8,24]. However, there are benefits in not using, for example, the normal distribution, since it extends to negative power use values whereas the Weibull distribution and the Log-Normal distribution do not [5]. PDF models of electricity use are also frequently used to generate time series by means of Monte Carlo simulations [5,6]. However, the stochastic information and the applicability of PDF models are perhaps most useful in the form of the analytic PDFs, where for example calculations for applications might be done analytically [3]. Due to the variability between different regions and countries, and the lack of proper measurement data for the electricity use at many of these locations, there is a need for stochastic models which have high resolution and are based on high-resolution national or regional data [3, p.185].
Access to high-quality data is crucial for the modeling approach discussed here. If the bottom-up modeling is supposed to aggregate the instantaneous power demand of individual buildings, data from a large set of representative households are needed, with a high sub-hourly time resolution. These data could be available from national monitoring campaigns, or possibly from high-resolution metering of customers by distribution system operators (DSOs). Normally, though, data with this level of detail are not generally available. Therefore, an important aim for electricity use modeling is to find representative models of typical end-users that can be applied for various purposes. The aim of this paper is to develop a PDF-model by estimating parameters for Weibull distributions and Log-Normal distributions of average household electricity use from a unique high-resolution monitoring campaign of household electricity use in 400 detached houses in Sweden provided in [34]. The PDF model for an average household electricity use is then extended to the scenario of multiple households via convolution of distributions from the sum of stochastic variables for N households. The distributions are analyzed in comparison with data in terms of relative variation of electricity use and standard deviation. A Kolmogorov–Smirnov test of goodness of fit is also provided. As a first study of its kind on Swedish data it will provide information on the fit to distribution for Swedish electricity use from detached houses and have applications for grid power calculations and coincidence estimates between load and intermittent power sources. In Section 2 the model and the statistical tools are described. In Section 3 the data which was used to fit the distributions of the model is presented. In Section 4 the statistical properties and simulations of the model are given. In Section 5 the results are discussed.
2. Methodology
The model in this paper is defined by a set of PDFs which have been fitted to a recent large data set of household electricity use for a representative sample of detached houses in Sweden [34]. The investigation in this paper was restricted to two typical PDFs which can be used for modeling household electricity use and which have appeared in the previous literature, but for different data resolutions and regions [5,6,11,23]: the Weibull distribution and the Log-Normal distribution. The technical part of the parameter estimation was made with software developed in Matlab. In the following subsections we describe the mathematical details of the distributions, the statistical analysis tools and the data which was used to fit the distributions.
2.1. Model framework
The model in this paper is based on two assumptions: (I) At any time, the magnitude of power demand for a household is a random outcome. (II) The probabilities for all possible magnitudes of electricity use can be approximated by a continuous PDF. Assumption (I) captures the stochastic aspect of household electricity use. Together with assumption (II), that the probabilities of household electricity use are approximately described by continuous PDFs, it directs the model developed in this paper: the fitting of PDFs to data. The model is developed according to the flow chart in Fig. 1. The basic assumptions of the model are illustrated in Fig. 2: a fictitious data set represented by a histogram and a fictitious PDF fitted to the data for one time interval. Upon inspection the histograms of the data sets appeared to be similar to Log-Normal and Weibull distributions, which were chosen since they appeared to capture the essential random features of the data sets. In order to include most of the diurnal and seasonal variation while keeping the number of distributions in the model within a reasonable limit, both the Weibull and the Log-Normal distributions have been estimated for the following categories:
– Each hour of the day (24).
– Each day of the week (7).
– Each month of the year (12).
This means that there are $24 \times 7 \times 12 \times 2 = 4032$ distributions fitted from the data set, with 8064 parameters in total. It should be noted that the data set had 10-min resolution. The parameters of the distributions are fitted with the aid of maximum likelihood estimates for each of the Weibull and Log-Normal PDFs, a method described in Section 2.4. The PDFs are estimated for one individual average household, but via convolution it is possible to define PDFs for any number of households, which is described in Section 2.3.
Fig. 1. A flow chart of the process for developing the model in this paper.
Fig. 2. Illustration of fictitious data in a histogram along with a fictitious distribution which is fitted to the data.
2.2. Probability distributions
The main components of the model are the probability distributions, which are reviewed in this section. The PDF of a Weibull random variable W is defined by (see e.g. [22,14, p.61]):
$$ f_W(x; k, \lambda) = \begin{cases} \dfrac{k}{\lambda} \left( \dfrac{x}{\lambda} \right)^{k-1} e^{-(x/\lambda)^k}, & x \ge 0, \\ 0, & x < 0, \end{cases} \tag{1} $$
where $k > 0$ is the shape parameter and $\lambda > 0$ is the scale parameter. The integrated version of (1), that is the cumulative distribution function (CDF) of a Weibull random variable W, is
$$ F_W(x; k, \lambda) = \begin{cases} 1 - e^{-(x/\lambda)^k}, & x \ge 0, \\ 0, & x < 0. \end{cases} \tag{2} $$
For the special case of $k = 1$ the Weibull distribution is equivalent to the exponential distribution and for $k = 2$ it is equivalent to the Rayleigh distribution. The distribution has the following mean value ($\mu$):
$$ \mu = \lambda \, \Gamma(1 + 1/k), \tag{3} $$
and the variance ($\sigma^2$):
$$ \sigma^2 = \lambda^2 \, \Gamma(1 + 2/k) - \mu^2, \tag{4} $$
where $\Gamma(x)$ is the Gamma function. We also have the PDF of a Log-Normal random variable L [22]:
$$ f_L(x; \mu, \sigma) = \frac{1}{x \sigma \sqrt{2\pi}} \, e^{-\frac{(\ln x - \mu)^2}{2\sigma^2}}, \qquad x > 0, \tag{5} $$
where $\mu$ is the mean value and $\sigma^2$ is the variance of $\ln L$ (that is, of the logarithm of the variable, not of the variable itself). The integrated version of (5), the CDF of a Log-Normal random variable L, is:
$$ F_L(x; \mu, \sigma) = \frac{1}{2} \left[ 1 + \operatorname{erf}\!\left( \frac{\ln(x) - \mu}{\sigma \sqrt{2}} \right) \right], \tag{6} $$
where $\operatorname{erf}(x)$ is the error function defined by [1, p.297]:
$$ \operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2} \, dt. \tag{7} $$
2.3. Convolution and the central limit theorem
The PDFs of power use described in the previous section are initially only defined for one average household. In order to estimate the PDF of power use for any given number of households, a convolution of N PDFs has to be performed. If we assume that the power use in a household at a given time T is given by a certain distribution, the average $S_N$ of the stochastic variables $X_1, X_2, \ldots, X_N$ can be expressed as:
$$ S_N = \frac{1}{N} \sum_{i=1}^{N} X_i. \tag{8} $$
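As a minimal sketch of how the definitions in Eqs. (1)–(8) can be checked numerically, the following Python functions use only the standard library. The function names and the example parameter values (k = 1.5, λ = 600 W, μ = 6.0, σ = 0.8) are illustrative assumptions and are not fitted values from the data set.

```python
import math


def weibull_pdf(x, k, lam):
    """Eq. (1): Weibull PDF with shape k > 0 and scale lam > 0."""
    if x < 0:
        return 0.0
    if x == 0:
        # The limit of Eq. (1) at the origin depends on the shape parameter.
        return k / lam if k == 1 else (0.0 if k > 1 else math.inf)
    return (k / lam) * (x / lam) ** (k - 1) * math.exp(-((x / lam) ** k))


def weibull_cdf(x, k, lam):
    """Eq. (2): Weibull CDF."""
    return 0.0 if x < 0 else 1.0 - math.exp(-((x / lam) ** k))


def weibull_mean(k, lam):
    """Eq. (3): mean of the Weibull distribution."""
    return lam * math.gamma(1.0 + 1.0 / k)


def weibull_var(k, lam):
    """Eq. (4): variance of the Weibull distribution."""
    return lam ** 2 * math.gamma(1.0 + 2.0 / k) - weibull_mean(k, lam) ** 2


def lognormal_pdf(x, mu, sigma):
    """Eq. (5): Log-Normal PDF; mu and sigma refer to ln(x)."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) ** 2 / (2.0 * sigma ** 2)
    return math.exp(-z) / (x * sigma * math.sqrt(2.0 * math.pi))


def lognormal_cdf(x, mu, sigma):
    """Eq. (6): Log-Normal CDF via the error function of Eq. (7)."""
    if x <= 0:
        return 0.0
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))


def household_average(draws):
    """Eq. (8): average S_N of N simultaneous household loads."""
    return sum(draws) / len(draws)


# Hypothetical parameter values, chosen only for illustration.
k, lam = 1.5, 600.0
print("Weibull mean (W):", weibull_mean(k, lam))
print("Weibull std (W):", math.sqrt(weibull_var(k, lam)))
print("Log-Normal median (W):", math.exp(6.0))
```

The special cases mentioned in Section 2.2 give a quick sanity check: for k = 1 the mean equals λ and the variance equals λ², as for an exponential distribution.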
Finding an explicit mathematical expression for the distribution of $S_N$ requires a convolution of distributions, which in general is difficult or even impossible. In our particular cases we assume that all stochastic variables are either Weibull distributed or Log-Normal distributed, so that the distribution of their sum is the N-fold convolution of the same PDF. Unfortunately, there seem to be no known analytic expressions for the convolution of N stochastic variables of either Weibull or Log-Normal distributions, even if such expressions exist for approximate distributions [2,17]. Even without analytic expressions for the PDF of the sum of these stochastic variables, it is always possible to simulate the average of the stochastic variables and analyze the outcome; a study of this is presented in Section 4. Moreover, even though it might not be possible to obtain analytic expressions for the convolution of N PDFs, the central limit theorem gives limiting distributions for large N [22, p.91]. This theorem states that for N independent stochastic variables $X_1, X_2, \ldots, X_N$ from the same distribution $F(X)$ with expectation value $E[X] = \mu$ and variance $V[X] = \sigma^2$, the mean $S_N$ is, as N grows large, approximately normally distributed with expectation value $E[S_N] = \mu$ and variance $V[S_N] = \sigma^2 / N$:
$$ f_N(x; \mu, \sigma/\sqrt{N}) = \frac{\sqrt{N}}{\sigma \sqrt{2\pi}} \, e^{-\frac{N (x - \mu)^2}{2\sigma^2}}. \tag{9} $$
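As a rough worked illustration of Eq. (9), assume purely hypothetical single-household values $\mu = 500$ W and $\sigma = 400$ W (these numbers are not taken from the data set). For the average over $N = 30$ independent households the approximation gives

$$ E[S_{30}] = 500\ \mathrm{W}, \qquad \sqrt{V[S_{30}]} = \frac{400\ \mathrm{W}}{\sqrt{30}} \approx 73\ \mathrm{W}, $$

so the relative spread $\sigma/\mu$ drops from 0.80 for one household to about 0.15 for the average of thirty.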
The normal approximation (9) allows for a simpler PDF description of large numbers of households. A hint of this will be illustrated in Section 4. Note that as the number of households increases to very high numbers the randomness of the average vanishes, as expected, and the average value over the households will be very close to $\mu$. It should be noted that the electricity use of a few households might not be accurately represented by a normal distribution like (9), since the central limit theorem only guarantees the approximation for large N [6].
2.4. Parameter estimation and goodness of fit
The estimation of parameters for the distributions was made with maximum likelihood estimation [19]. In such an estimation, given data set values $X = \{x_1, x_2, \ldots, x_N\}$ and an assumed PDF $f$ with unknown parameters $\mu, \sigma$, the likelihood function $L_f$ is (see e.g. [22, p.82]):
$$ L_f(X; \mu, \sigma) = f(x_1; \mu, \sigma) \, f(x_2; \mu, \sigma) \cdots f(x_N; \mu, \sigma) = \prod_{i=1}^{N} f(x_i; \mu, \sigma). \tag{10} $$
The parameters $\mu$ and $\sigma$ are then determined by maximizing $L_f$, see [22, p.80] for more information. To estimate the goodness of fit between original data sets and proposed distributions, several tests exist in the literature. A commonly used test is the two-sample Kolmogorov–Smirnov test, which was used in this article and implemented in a Matlab script. This test is based on the maximum deviation between two distribution functions, and tables with critical values (or their implementations in statistical software) can be used to deduce statistical significance. A test quantity D can be formulated as
$$ D = \max_x \left| F_1(x) - F_2(x) \right|, $$
where $F_1(x)$ and $F_2(x)$ are empirical distribution functions. Although more specialised tests have been developed, in particular for the Weibull distribution used here, the Kolmogorov–Smirnov test remains widely used [12,14]. For the normal distribution, better tests exist [25]. In our study, one sample was the original electricity use data, the other a large data set simulated from either of the fitted distributions. A p-value was returned from the computations, and those with a value lower than 5% were coded as 1 and those higher than 5% as 0. A measure was defined as the ratio of passes relative to the total number of tests, as was done in [6]. Results for this are shown in Section 4.
3. Data set
The data which was used to fit the distributions in this paper was thoroughly analyzed in [34]; however, that investigation did not include the development of any stochastic model based on the data. The data was obtained from 400 households in Sweden: 40 of these households were measured for approximately one year and 360 households for approximately one month. Measurements were made on most of the appliances in each household (including heating when electric) at 10-min resolution. Half of the data set comprised data from detached houses, the other half data from apartments. In this study data for detached houses was used. This data set consisted of 200 households, of which 20 households were measured for a whole year and the rest for approximately one month. The data set was set up in columns of household Id, appliance Id and electricity use for each time-step. This data set was also used in a recent study regarding consumer flexibility and a solar home management system for photovoltaic power production [33]. For more detailed information, such as regarding residents, appliance specifications and electricity use, see [34]. For this model certain assumptions regarding the data set were made. Since the set of appliances was unique for each household and the primary goal of this study was to investigate the household electricity use of a typical average household, the electricity use from all appliances for each time-step was summed up for each household. Furthermore, all electricity use data was divided into vectors for each hour, for each day of the week and for each month, as described in the list in Section 2.1. Each such category contains data from several different households, and it can be assumed that the data set is representative of the electricity use of detached houses in Sweden. The distributions were fitted with the entire data set instead of, for example, dividing the data set into training data and test data. This division can be found in machine learning theory, often applied if there is a classification issue, while the objective of this study was to fit a distribution to data.
4. Results
For each category of electricity use data, a fit to distribution for both a Weibull distribution and a Log-Normal distribution was made. For the process of estimating parameters in the proposed distributions, maximum likelihood estimation was performed using Matlab scripts. The resulting distribution varies depending on which category of data is used: which time of day, which day and which season. This variation can be seen in the three examples of histogram data along with the corresponding fitted PDFs shown in Fig. 3. In that figure there are indications that the goodness of fit for each of the distributions compared with the histograms from the corresponding data sets varies between the three examples. As a more general way of illustrating the magnitude of goodness of fit we have constructed surface plots to illustrate the relative variation between the modeled electricity use and the electricity use from the data set. In Fig. 4 the relative variation between mean electricity use from the data set and the mean electricity use from the model is shown for each hour and each month on a Tuesday. The left-hand side (A) plot represents the relative variation between the electricity use of the data and its corresponding Weibull distribution. In that plot it is possible to observe the small deviations which occur around midnight and during non-summer months. The right-hand side (B) plot represents the ratio of the mean electricity use from the data set and the mean electricity
use from the Log-Normal distribution. In similarity with the Weibull distribution in plot (A) there seem to be only small deviations, except around midnight in winter months. The reason for this might be that the electricity use from high power heating during those months is less Log-Normal distributed than during other times. An observation of the histograms from those times also hinted that the distributions were flatter compared with times with better fit. This might indicate that electricity use from heating might not necessarily be optimally modeled with a Log-Normal distribution, at least during times of intense use. In that figure the Weibull distribution seems to have a better fit than the Log-Normal distribution in terms of average electricity use.
Fig. 3. Three examples of histograms of electricity use data along with both a Weibull PDF and a Log-Normal PDF. (A) is for the hour 23:00–24:00 on a Sunday in December, (B) is for the hour 20:00–21:00 on a Monday in April and (C) is for the hour 12:00–13:00 on a Saturday in September. [Figure not reproduced; horizontal axis: Power (W); legend: Data, Weibull PDF, Log-Normal PDF.]
Fig. 4. These plots show the average electricity use from the data for each month and hour divided with the average electricity use for the corresponding day and month provided by the distributions. (A) is for the Weibull distribution, (B) the Log-Normal distribution. The simulation is for each hour and month for a Tuesday, the day was picked by random in order to limit the number of plots. Plots which compare weekdays and weekend days are given in Figs. 6–9. [Figure not reproduced; axes: Time, Months, μData/μModel.]
As a complement to the power-use-ratio surface plots in Fig. 4 we provide surface plots in Fig. 5 presenting the relative variation measure of the standard deviation of electricity use for each hour and each month from the data divided by the standard deviation of electricity use from the distributions for the same categories. In this figure there appears to be a seasonal variation for both the Weibull distribution and the Log-Normal distribution. During summer time both the Weibull distributions (A) and Log-Normal distributions (B) have lower standard deviation than the data, which is shown via a higher ratio. During spring the standard deviation of the Log-Normal distribution is higher for the data, whereas the data to Weibull distribution ratio is kept close to one. A daily variation pattern in standard deviation ratio is also present, in particular during winter midnight where in particular the Log-Normal distribution deviates. In similarity with the deviations between mean electricity use from the model and electricity use from the data in Fig. 4, this might be an artefact of the stochastic patterns of heating during winter time. The variations between PDFs for each hour during different days can also be directly illustrated by using surface plots. In Fig. 6 we have four plots of PDFs for December: (A) Weibull PDFs for Monday, (B) Log-Normal PDFs for Monday, (C) Weibull PDFs for Saturday, (D) Log-Normal PDFs for Saturday. Fig. 7 shows the same for April, Fig. 8 for summer and Fig. 9 for fall. From the PDF plots for a specific day for each season in Figs. 6–9 it is possible to conclude that there appears to be:
– A difference between Weibull PDFs and Log-Normal PDFs fitted to the same set of data.
– A seasonal difference, weekday difference and difference over time of day between the PDFs.
– A seasonal difference, weekday difference and difference over time of day between the PDFs and the data set.
Fig. 5. These plots show the standard deviation of electricity use from the data for each month and hour divided with the standard deviation of electricity use for the corresponding day and month provided by the distributions. (A) is for the Weibull distribution, (B) the Log-Normal distribution. The simulation is for each hour and month for a Tuesday, the day was picked by random in order to limit the number of plots. Plots which compare weekdays and weekend days are given in Figs. 6–9. [Figure not reproduced; axes: Time, Months, σData/σModel.]
Fig. 6. These plots show PDFs of electricity use in time of day and power for December. (A) Weibull PDFs over Monday, (B) Log-Normal PDFs for Monday, (C) Weibull PDFs over Saturday, and (D) Log-Normal PDFs for Saturday. [Figure not reproduced; axes: Time, Power (W).]
Fig. 7. These plots show probability density distributions of electricity use in time of day and power for April. (A) Weibull PDFs over Monday, (B) Log-Normal PDFs for Monday, (C) Weibull PDFs over Saturday, and (D) Log-Normal PDFs for Saturday. [Figure not reproduced; axes: Time, Power (W).]
Fig. 8. These plots show probability density distributions of electricity use in time of day and power for July. (A) Weibull PDFs over Monday, (B) Log-Normal PDFs for Monday, (C) Weibull PDFs over Saturday, and (D) Log-Normal PDFs for Saturday. [Figure not reproduced; axes: Time, Power (W).]
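The first observation in the list above, that Weibull and Log-Normal PDFs fitted to the same data differ, can be explored with a small maximum likelihood sketch in the spirit of Eq. (10). The paper's own estimation was done in Matlab; the Python code below is only an assumed stand-in, using a closed-form estimate for the Log-Normal parameters and a bisection search for the Weibull shape. The synthetic sample and all parameter values are hypothetical.

```python
import math
import random


def fit_lognormal(xs):
    """Closed-form MLE for Eq. (5): mean and standard deviation of ln(x)."""
    logs = [math.log(x) for x in xs]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / len(logs))
    return mu, sigma


def fit_weibull(xs, k_lo=0.05, k_hi=50.0, tol=1e-9):
    """MLE for Eq. (1): bisection on the profile score equation for k."""
    logs = [math.log(x) for x in xs]
    n = len(logs)
    mean_log = sum(logs) / n
    max_log = max(logs)

    def weights(k):
        # x_i^k rescaled by max(x)^k, so no weight exceeds 1 (avoids overflow).
        return [math.exp(k * (v - max_log)) for v in logs]

    def score(k):
        w = weights(k)
        return sum(wi * v for wi, v in zip(w, logs)) / sum(w) - 1.0 / k - mean_log

    lo, hi = k_lo, k_hi  # assumption: the root lies inside this bracket
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    k = 0.5 * (lo + hi)
    lam = math.exp(max_log + math.log(sum(weights(k)) / n) / k)
    return k, lam


def weibull_loglik(xs, k, lam):
    """Logarithm of Eq. (10) for the Weibull PDF."""
    return sum(math.log(k / lam) + (k - 1) * math.log(x / lam) - (x / lam) ** k
               for x in xs)


def lognormal_loglik(xs, mu, sigma):
    """Logarithm of Eq. (10) for the Log-Normal PDF."""
    c = math.log(sigma * math.sqrt(2.0 * math.pi))
    return sum(-math.log(x) - c - (math.log(x) - mu) ** 2 / (2.0 * sigma ** 2)
               for x in xs)


# Synthetic stand-in for one category of power data (W); not the real data set.
rng = random.Random(0)
sample = [rng.weibullvariate(600.0, 1.5) for _ in range(5000)]  # (scale, shape)

k_hat, lam_hat = fit_weibull(sample)
mu_hat, sigma_hat = fit_lognormal(sample)
ll_weibull = weibull_loglik(sample, k_hat, lam_hat)
ll_lognormal = lognormal_loglik(sample, mu_hat, sigma_hat)
print("Weibull fit:", k_hat, lam_hat, "log-likelihood:", ll_weibull)
print("Log-Normal fit:", mu_hat, sigma_hat, "log-likelihood:", ll_lognormal)
```

Comparing the two log-likelihoods on the same sample is one simple way to see which family describes a given category better. It complements, but does not replace, the Kolmogorov–Smirnov test used in the paper.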
Also when interpreting Figs. 6–9 a general rule of thumb seems to be that the lower the peak magnitude of the PDFs, the higher the average electricity use. This difference is most noticeable between winter in Fig. 6 and summer in Fig. 8. The PDFs appear to be flatter during winter, which is reasonably related to the higher electricity use from lighting and heating during winter compared with, for example, summer. Heating in particular is a high-power load which adds power use to the high-power end of the distribution, which makes it flatter.
Fig. 9. These plots show probability density distributions of electricity use in time of day and power for October. (A) Weibull PDFs over Monday, (B) Log-Normal PDFs for Monday, (C) Weibull PDFs over Saturday, and (D) Log-Normal PDFs for Saturday. [Figure not reproduced; axes: Time, Power (W).]
As a test of goodness of fit we performed a Kolmogorov–Smirnov (K–S) test for each distribution for both distribution types for all categories. Table 1 shows the ratio of passed to total number of distributions. The results indicate that the Weibull distributions overall could be considered a better fit than the Log-Normal distributions. This was also hinted at in the relative-variation plots in Figs. 4 and 5. In Section 4.2 we will discuss these results in relation to other investigations which have previously been carried out in the literature.
Table 1
Kolmogorov–Smirnov test percentage of pass.
Distribution    KS pass (%)
Weibull         74
Log-Normal      67
4.1. Multiple households
The model that was analyzed in the previous section was designed for a single average household. In order to estimate the average electricity use from N different households, based on the same distribution, it is necessary to compute the average $S_N$ of N stochastic variables, see Section 2.3. Such a process constitutes a convolution of PDFs, but there does not seem to exist any analytic expression for this in the literature for either Weibull or Log-Normal PDFs [17]. However, it is possible to use Monte Carlo simulations for each stochastic variable to obtain a PDF representing the mean $S_N$ of the stochastic variables. In order to illustrate the convolution of the PDFs for a few households we have carried out Monte Carlo simulations for a number of stochastic variables and the result is given in Fig. 10. That figure indicates that as the number of stochastic variables increases, both the convoluted Weibull PDFs and the convoluted Log-Normal PDFs converge to Normal distributions with the same expected mean value as the single average household distributions and with a variance reduced according to Eq. (9). This was expected for large N according to the central limit theorem, see Section 2.3.
Fig. 10. In Plot (A) Weibull PDFs for one, three and thirty households are shown along with a Normal distribution with setup according to the average output from thirty households. Plot (B) is similar to Plot (A), but for the Log-Normal distribution. This particular example is of PDFs for a Sunday in December between 17:00 and 18:00. In Plot (A) the Normal distribution is barely visible since it approximates the 30-household convoluted curve to a high degree of accuracy. [Figure not reproduced; axes: Power (W), Probability density; legend: 1 household, 3 households, 30 households, Normal distribution.]
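The Monte Carlo procedure described above can be sketched as follows. This is an assumed minimal illustration rather than the authors' Matlab implementation: it draws independent Weibull loads with hypothetical parameters (k = 1.5, λ = 600 W), averages them over N households as in Eq. (8), and compares the simulated spread with the σ/√N predicted by Eq. (9).

```python
import math
import random
import statistics


def simulate_household_average(n_households, n_draws, k, lam, rng):
    """Return n_draws realisations of S_N, Eq. (8), for i.i.d. Weibull loads."""
    return [
        # random.weibullvariate takes (scale, shape) in that order.
        sum(rng.weibullvariate(lam, k) for _ in range(n_households)) / n_households
        for _ in range(n_draws)
    ]


def weibull_moments(k, lam):
    """Mean and standard deviation from Eqs. (3) and (4)."""
    mu = lam * math.gamma(1.0 + 1.0 / k)
    sigma = math.sqrt(lam ** 2 * math.gamma(1.0 + 2.0 / k) - mu ** 2)
    return mu, sigma


# Hypothetical single-household parameters, for illustration only.
k, lam = 1.5, 600.0
mu, sigma = weibull_moments(k, lam)
rng = random.Random(1)

for n in (1, 3, 30):
    draws = simulate_household_average(n, 5000, k, lam, rng)
    print(
        f"N={n:2d}  simulated mean={statistics.fmean(draws):7.1f} W  "
        f"simulated std={statistics.pstdev(draws):6.1f} W  "
        f"CLT std={sigma / math.sqrt(n):6.1f} W"
    )
```

A histogram of the simulated averages for each N would correspond to the curves in Fig. 10. For small N the simulated distribution remains skewed, which is why the Normal approximation is only recommended for large N.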
4.2. Comparison with other results
Several types of distributions for describing household electricity use have been investigated in the literature, and generally there seems to be no consensus on a canonical type of distribution for describing household electricity use [24]. In similarity with this paper there are investigations involving Weibull distributions [5,6,11] and Log-Normal distributions [5,6,23]. Each paper in the literature has its own level of detail and resolution in the modeling, and the data sets vary in size, detail and place of measurement. For example, in [11] data for Irish households were used and in [23] data for Finnish households were used. In contrast to previous studies, the model in this paper is developed using a unique comprehensive high-resolution data set for household electricity use from detached houses in Sweden [34]. In terms of goodness of fit, the ratio of passes to number of distributions from the Kolmogorov–Smirnov test turned out to be 74% for the Weibull distributions and 67% for the Log-Normal distributions. One study which used the same type of measure, but for aggregated sets of data, reported a 30–40% pass ratio for the Weibull distribution and an 80% pass ratio for the Log-Normal distribution [6]. That study, in similarity with the investigation regarding aggregates of households in this paper, also hinted at an increase in goodness of fit for the normal distribution as the number of households increases.
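The pass-ratio measure quoted above can be sketched in a few lines. The code below is an assumed illustration, not the authors' Matlab script. It computes the two-sample statistic D from Section 2.4 and uses a common asymptotic approximation for the p-value. It counts a category as a pass when the p-value is at least 5%, which is our reading of the coding described in Section 2.4. The function names and the toy categories are hypothetical.

```python
import math


def ks_statistic(a, b):
    """Two-sample statistic D = max_x |F1(x) - F2(x)| of the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step both empirical CDFs past every observation equal to x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_pvalue(d, n, m, terms=100):
    """Asymptotic p-value from the Kolmogorov distribution (approximation)."""
    n_eff = n * m / (n + m)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.2:
        return 1.0  # the series equals 1 to high accuracy for small arguments
    s = sum((-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
            for j in range(1, terms + 1))
    return max(0.0, min(1.0, 2.0 * s))


def pass_ratio(pairs, alpha=0.05):
    """Fraction of (data, simulated) sample pairs whose p-value is >= alpha."""
    passes = sum(
        1 for data, sim in pairs
        if ks_pvalue(ks_statistic(data, sim), len(data), len(sim)) >= alpha
    )
    return passes / len(pairs)


# Two toy categories: identical samples (pass) and disjoint samples (fail).
toy_pairs = [
    ([1.0, 2.0, 3.0] * 50, [1.0, 2.0, 3.0] * 50),
    (list(range(100)), list(range(1000, 1100))),
]
print("pass ratio:", pass_ratio(toy_pairs))
```

In an application along the lines of the paper, each pair would hold the measured data of one hour, weekday and month category together with a large sample simulated from the fitted distribution.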
5. Discussion and conclusions
Household electricity use varies over season and day and is largely dependent on human behavior. This makes the theoretical quantification of electricity use a complex problem. By fitting probability distributions to data it is possible to capture a comprehensive amount of information from a data set while keeping the number of model parameters to a reasonable minimum. Such an approach captures the essential expectation values while also including information regarding the stochastic dynamics of the data. However, since it is a model based on delimiting assumptions and on limited information, it will be limited in accuracy and predictability. The statistical investigations carried out in this paper showed that there were variations between data and distribution output in terms of mean values and standard deviation over time of day, day of week and season. However, in most cases the deviations between model output and data were relatively small, and presumably negligible for most applications of the model. The deviation was largest during winter nights, which might be due to the stochastic patterns of electric heating. This might indicate that a PDF approach to modeling heat load might require a different kind of PDF than used here, or that heat load might not be accurately modeled using a PDF approach at all. The overall goodness of fit of the distributions was to some extent quantified with Kolmogorov–Smirnov tests in terms of pass ratio for both types of distributions, indicating that the Weibull distribution had the best fit. The Monte Carlo methods for calculating the PDFs for N households provided a hint of the asymptotic Normal distribution behavior of both distributions for large N, which was expected from the central limit theorem. This property might prove useful for future studies involving this model, in particular for aggregate scenarios.
Extended investigations of this model, and generalizations thereof, could include detailed surface plots of electricity use and standard deviation for all weekdays. A more comprehensive goodness-of-fit investigation, including a more detailed study using Kolmogorov–Smirnov tests, could also be of interest. In addition, an extended version of the model, for example with grouping of appliances or with different resolution and detail, could have a different range of applicability than the model presented in this paper. Approximate analytic expressions for the convolution of the electricity use distributions for any number of households could prove useful for investigations of small numbers of households. The model could also be used to examine traditional simple estimates for sizing electricity grids. Estimating distributions for other types of electricity demand, such as apartments and offices, is a further possible extension. Since this model is based on high-resolution data for Sweden, it can be applied to a variety of power system applications, both in general and for Swedish conditions specifically. In particular, energy efficiency via demand-side management strategies, estimation of load-production coincidence with distributed intermittent power sources, and load flow calculations in grids could perhaps prove to be the most useful applications of this model.

Acknowledgement

This work has been carried out under the auspices of The Energy Systems Programme, which is primarily financed by the Swedish Energy Agency.

References

[1] Abramowitz M, Stegun IA. Handbook of mathematical functions with formulas, graphs, and mathematical tables. New York: Dover; 1965.
[2] Beaulieu NC, Abu-Dayya AA, McLane PJ. Estimating the distribution of a sum of independent log-normal random variables. IEEE Trans Commun 1995;73:2869–73.
[3] Bollen MH, Hassan F. Integration of distributed generation in the power system. IEEE Press series on power engineering, John-Wiley & Sons Inc, Hoboken, New Jersey; 2011.
[4] Capasso A, Grattieri W, Lamedica R, Prudenzi A. A bottom-up approach to residential load modeling. IEEE Trans Power Syst 1994;9:957–64.
[5] Carpaneto E, Chicco G. Probability distributions of the aggregated residential load. In: 9th International conference on probabilistic methods applied to power systems KTH, Stockholm, Sweden; June 11–15, 2006.
[6] Carpaneto E, Chicco G. Probabilistic characterisation of the aggregated residential load patterns. IEEE Gener, Transm Distrib, IET 2008;2:373–82.
[7] Charytoniuk W, Kotas P. Demand forecasting in power distribution systems using nonparametric probability density estimation. IEEE Trans Power Syst 1999;14:1200–6.
[8] Gonzales-Longatt F, Rueda J, Erlich I, Villa W, Bogdanov D. Mean variance mapping optimization for the identification of Gaussian mixture model: test case, intelligent systems (IS). In: 6th IEEE international conference; 2012. p. 158–63.
[9] Herman R, Kritzinger JJ. The statistical description of grouped domestic electrical load currents. Electr Power Syst Res 1993;27:43–8.
[10] Heunis SW, Herman R. A probabilistic model for residential consumer loads. IEEE Trans Power Syst 2002;17:621–5.
[11] Irwin GW, Monteith W, Beattie WC. Statistical electricity demand modelling from consumer billing data. IEEE Gener, Transm Distrib, IEE Proc C 1986;133:328–35.
[12] Lai C-D, Chen K-H, Wang R-T. Weibull distributions and their applications. In: Pham H, editor. Springer handbook of engineering statistics. Springer-Verlag; 2006.
[13] Lin H-H, Chen K-H, Wang R-T. A multivariate exponential shared-load model. IEEE Trans Reliab 1993;42:165–71.
[14] Mann NR, Fertig KW. A goodness-of-fit test for the two parameter vs. three-parameter Weibull; confidence bounds for threshold. Technometrics 1975;17:237–45.
[15] McLoughlin F, Duffy A, Conlon M. Characterising domestic electricity consumption patterns by dwelling and occupant socioeconomic variables: an Irish case study. Energy Build 2012;48:240–8.
[16] McQueen DHO, Hyland PR, Watson SR. Monte Carlo simulation of residential electricity demand for forecasting maximum demand on distribution networks. IEEE Trans Power Syst 2004;19:1685–9.
[17] Nadarajah S. A review of results on sums of random variables. Acta Appl Math 2008;103:131–40.
[18] Paatero JV, Lund PD. A model for generating household load profiles. Int J Energy Res 2006;30:273–90.
[19] Pawitan Y. In all likelihood: statistical modelling and inference using likelihood. Oxford University Press; 2001.
[20] Richardson I, Thomson M, Infield D. A high-resolution domestic building occupancy model for energy demand simulations. Int J Energy Res 2008;40:1560–6.
[21] Richardson I, Thomson M, Infield D, Delahunty A. Domestic lighting: a high-resolution energy demand model. Energy Build 2009;41:781–9.
[22] Rychlik I, Rydén J. Probability and risk analysis. An introduction for engineers. Springer-Verlag; 2006.
[23] Seppälä A. Statistical distribution of customer load profiles. Energy Manage Power Deliv 1995;2:696–701.
[24] Singh R, Pal BC, Jabr RA. Statistical representation of distribution system loads using Gaussian mixture model. IEEE Trans Power Syst 2010;25:29–37.
[25] Steinskog DJ, Tjøstheim DB, Kvamstø NG. A cautionary note on the use of the Kolmogorov–Smirnov test for normality. Mon Weather Rev 2007;135:1151–7.
[26] Swan LG, Ugursal VI. Modeling of end-use energy consumption in the residential sector: a review of modeling techniques. Renew Sustain Energy Rev 2009;13:1819–35.
[27] Walker CF, Pokoski JL. Residential load shape modeling based on customer behavior. IEEE Trans Power Apparatus Syst 1985;104:1703–11.
[28] Walla T, Widén J, Johansson J, Bergerland C. Determining and increasing the hosting capacity for photovoltaics in Swedish distribution grids. In: Proceedings of 27th EU-PVSEC, Frankfurt; 2012.
[29] Widén J, Molin A, Ellegård K. Models of domestic occupancy, activities and energy use based on time-use data: deterministic and stochastic approaches with application to various building-related simulations. J Build Perform Simul 2012;5:27–44.
[30] Widén J, Lundh M, Vassileva I, Dahlquist E, Ellegård K, Wäckelgård E. Constructing load profiles for household electricity and hot water from time-use data – modelling approach and validation. Energy Build 2009;41:753–68.
[31] Widén J, Nilsson AM, Wäckelgård E. A combined Markov-chain and bottom-up approach to modelling of domestic lighting demand. Energy Build 2009;41:1001–12.
[32] Widén J, Wäckelgård E. A high-resolution stochastic model of domestic activity patterns and electricity demand. Appl Energy 2010;87:1880–92.
[33] Widén J, Munkhammar J. Evaluating the benefits of a solar home energy management system: impacts on photovoltaic power production value and grid interaction. In: Proceedings of ECEEE summer study; 2013.
[34] Zimmermann JP. End-use metering campaign in 400 households in Sweden: assessment of the potential electricity savings. Report Swedish Energy Agency; 2009.