Data fusion analysis applied to different climate change models: An application to the energy consumptions of a building office



Energy & Buildings 196 (2019) 240–254

Contents lists available at ScienceDirect

Energy & Buildings journal homepage: www.elsevier.com/locate/enbuild

Francesco Guarino∗, Daniele Croce, Ilenia Tinnirello, Maurizio Cellura

Department of Engineering, University of Palermo, Viale delle Scienze, Building 9, 90128 Palermo, Italy

Article info

Article history: Received 9 January 2019; Revised 5 April 2019; Accepted 3 May 2019; Available online 13 May 2019

Keywords: Climate change; Building simulation; Heating and cooling; Data fusion; IPCC; Regression; Elastic net

Abstract

The paper aims to model the effects of climate change on heating and cooling in the building sector, using the forecast data made available by the Intergovernmental Panel on Climate Change. Data from several different climate models are fused with regard to mean air temperature, wind speed and horizontal solar radiation. Data from several climate models were analysed, ranging from January 2006 to December 2100. Rather than considering each model in isolation, we propose a data fusion approach that provides a robust combined model for morphing an existing weather data file. The final aim is to simulate the future energy use for heating and cooling of a reference building as a consequence of the expected climate changes. We compare results, in terms of robustness to overfitting, for two different fusion methodologies, based on the comparison of errors either on pointwise historical data or on the prediction models that can be obtained from each climate simulator and from the actual ERA-Interim data set. Finally, we map the new aggregated data into a prediction trace of heating and cooling energy requirements. The expected energy demand is in the range of that provided by single climate models, with a variability that reaches up to 10% of the overall energy requirements. The proposed approach is an advancement, as it achieves better fits with existing re-analysis data than the output data of specific global circulation models. Thus a more reliable estimation of energy use for heating and cooling can be achieved. © 2019 Elsevier B.V. All rights reserved.

1. Introduction

Climate change has become, in recent years, a serious concern with very wide implications, ranging from social issues to security, biodiversity protection and health hazards. The last years have been the hottest on record throughout the world, marking a worldwide phenomenon that will continue to grow if left unchecked. Although the moral imperative of acting against climate change is clear, the path towards the much needed decarbonisation of the economy and energy sector is anything but smooth and clear.

In this general context, the building sector is one of the key contributors to climate change. In 2010 buildings accounted for 32% of total global final energy use, 19% of energy-related GHG emissions (including electricity-related) and approximately one-third of black carbon emissions. This energy use and the related emissions may double or potentially even triple by mid-century for several reasons: a very important trend is the increased access for billions of people in developing countries to adequate housing, electricity, and improved cooking facilities. The ways in which these energy-related needs will be provided will significantly determine trends in building energy use and related emissions. Moreover, population growth, migration to urban areas, increasing levels of wealth and lifestyle changes can potentially contribute to significant increases in building energy use. The substantial rate of new construction taking place in developing countries, mostly based on “international” designs (thus with poor bioclimatic inspiration), is also potentially a very significant risk from a mitigation perspective.

Although the future perspectives described so far are not encouraging, final energy use may stay constant or even decline by mid-century, compared to today’s levels, if today’s cost-effective best practices and technologies are broadly diffused, thus pointing towards the decarbonisation of the economy and energy sector. Due to the very long lifespans of buildings, one of the most significant reasons why this might not happen is the carbon “lock-in” risk, which points to the urgency of ambitious and immediate measures. This means that, even if we start building with state-of-the-art techniques and systems today, the weight of the choices ‘locked in’ in the past decades will still have a substantial impact on the current energy use.

Corresponding author. E-mail address: [email protected] (F. Guarino).

https://doi.org/10.1016/j.enbuild.2019.05.002 0378-7788/© 2019 Elsevier B.V. All rights reserved.

F. Guarino, D. Croce and I. Tinnirello et al. / Energy & Buildings 196 (2019) 240–254

This effect is quantified in [1], which states that “if the most ambitious of currently planned policies are implemented”, approximately 80% of 2005 energy use in buildings globally will be ‘locked in’ by 2050 for decades, compared to a scenario where today’s best practice buildings become the standard in new building construction and existing building retrofit. As a result, the urgent adoption of state-of-the-art performance standards, in both new and retrofit buildings, is fundamental to avoid locking in carbon-intensive options for several decades [2,3]. Besides the carbon ‘lock-in’ effect, the spiralling negative effect caused by the long life cycle of buildings is exacerbated by the fact that high-performance buildings built today, which are already subject to overheating due to high airtightness and low transmittance values, will be highly impacted by climate change. Thus the buildings that do not contribute to the “lock-in” effect of high carbon emissions will still contribute indirectly to an increase in energy use if the improvement in energy performance is not widespread, and thus if climate change is not halted. From this perspective, it is paramount to estimate how much climate change could impact high-performance buildings (i.e., buildings built today, compliant with the Energy Performance of Buildings Directive in the EU and with national regulations) under different scenarios or, in other words, under different levels of decarbonisation of the energy sector [4], based on different assumptions. Over the last two decades the IPCC has released several classes of emission scenarios that differ in several factors, such as expected population growth, economic development or the development of new technologies.
These sets evolved up to 2007, when four emission scenarios were defined (RCP2.6, RCP4.5, RCP6.0 and RCP8.5), called Representative Concentration Pathways (RCPs), which were the data source for the latest IPCC assessment report (AR5). Compared to the previous ones, the RCP scenarios consider larger amounts of data and more factors relevant to climate change, such as socio-economic aspects, emerging technologies and land cover. Each RCP scenario represents a rough estimate of the radiative forcing (defined as the additional power taken up by the Earth as a system due to the enhanced greenhouse effect) that the Earth will undergo under the climate change scenarios. More precisely, it can be defined as the change in the net radiative flux through the atmosphere due to a change in an external driver of climate change, such as, for example, a change in the concentration of carbon dioxide [5,6]. Table 1 presents the four RCP scenarios reported in [7]. The scenarios are: a low-carbon scenario, two intermediate scenarios and one with very high greenhouse gas emissions. In more detail, RCP2.6 is based on the assumption that global warming would be kept below 2 °C above pre-industrial temperatures, with efforts to limit this increase to 1.5 °C, through a nearly full decarbonisation of the economy, while RCP8.5 is a business-as-usual scenario, with no policy changes to reduce emissions (CO2 is threefold higher in 2100 compared to today) [7]. The aforementioned scenarios do not report climate change predictions; they instead portray the causes of climate change, as


caused by anthropogenic factors, through realistic pathways and assumptions, to be used as a baseline for climate change modelling. Emission scenarios are usually input to General Circulation Models (GCMs) to obtain predictions of the future climate. GCMs are essentially mathematical models of the general circulation of a planetary atmosphere, which describe the most important components, processes and interactions in the climate system. GCMs predict climate at a relatively coarse spatial and temporal resolution. Since assessing the impact of climate change on building energy performance requires local weather data at higher temporal resolution, the global circulation model outputs have to be ‘downscaled’, i.e., climate change information has to be generated at spatial and temporal scales finer than those provided by the GCMs [8]. Numerical models representing physical processes in the atmosphere, ocean, cryosphere and land surface are the most advanced tools currently available for simulating the response of the global climate system to increasing greenhouse gas concentrations. While simpler models have also been used to provide globally or regionally averaged estimates of the climate response, only GCMs have the potential to provide the geographically and physically consistent estimates of regional climate change that are required in impact analysis. GCMs depict the climate using a three-dimensional grid over the globe, typically having a horizontal resolution of between 250 and 600 km, 10 to 20 vertical layers in the atmosphere and sometimes as many as 30 layers in the oceans. Moreover, many physical processes, such as those related to clouds, occur at smaller scales and cannot be properly modelled. Instead, their known properties must be averaged over the larger scale in a technique known as parameterization. This is one source of uncertainty in GCM-based simulations of future climate.
Others relate to the simulation of various feedback mechanisms in models concerning, for example, water vapour and warming, clouds and radiation, ocean circulation and ice and snow albedo. For this reason, GCMs may simulate quite different responses to the same forcing, simply because of the way certain processes and feedbacks are modelled [9]. To adapt GCM outputs and assess the impact of climate change on building performance, two different approaches are usually found in the state of the art [10]: statistical methods and building simulation approaches. Statistical analyses model the interactions between the local meteorological variables and the building energy demand based on available historical data. An example is the prediction of building energy consumption using the degree-days approach [11–14]. It is essentially a single-measure steady-state approach aimed at quantifying building energy use for heating and cooling. This approach is commonly used by the building industry to relate the trends of building energy consumption to local climate conditions in the fastest and simplest way. It is not particularly effective in the context of energy use prediction via detailed dynamic building simulation,

Table 1. Overview of RCP scenarios [7].

| Scenario | Radiative forcing | CO2eq concentration in 2100 | Increase in global mean temperature at 2100 relative to the 1986–2005 period |
| --- | --- | --- | --- |
| RCP8.5 | Rising radiative forcing pathway leading to 8.5 W/m2 by 2100 | 1370 ppm | 2.6 °C–4.8 °C |
| RCP6.0 | Stabilization without overshoot pathway to 6.0 W/m2 after 2100 | 850 ppm | 1.4 °C–3.1 °C |
| RCP4.5 | Stabilization without overshoot pathway to 4.5 W/m2 after 2100 | 650 ppm | 1.1 °C–2.6 °C |
| RCP2.6 | Peak in radiative forcing at ∼3 W/m2 around 2050 and then decline to 2.6 W/m2 by 2100 | 400 ppm | 0.3 °C–1.7 °C |


as the availability of hourly future weather data is a prerequisite and a key point for predicting energy demand under climate change with building simulation tools [15]. In this context, the creation of future weather files is usually approached in two ways [16]: by combining climate projections with a weather generator to create typical future weather years, or by a mathematical transformation (morphing) of the time series of existing current weather files using climate change forecasts [17–19]. The first category is the most represented approach towards integrating climate change considerations within the actual practice of climate change analysis. Among these studies, the input data files differ in several features. Either GCMs or Regional Climate Models (RCMs) are generally used. GCMs are the most common modelling tools to study and understand climate variability and to project future climate change. They are usually validated against re-analyses of monitored atmospheric data such as [20], and are usually run with different Representative Concentration Pathways [21] for the concentration of CO2 in the atmosphere as input. Outputs are generally aggregated at a monthly level. They usually have low spatial resolution (in the range of 100–300 km) but cover basically the whole surface of the planet, and their development is generally coordinated at the international level through the efforts of the Intergovernmental Panel on Climate Change (IPCC). However, since the local microclimate can have a significant impact on building heating and cooling energy requirements, as discussed in [22] and [23], and since GCMs cannot usually resolve local climatic issues, GCMs are usually integrated with other approaches able to do so.
RCMs are instead general-purpose climate models, also defined as limited-area models; they are based on very specific locations and areas and are calculated by imposing energy/mass balances on a much denser grid of points, with higher spatial resolution, within a GCM. They are numerical models that require explicitly specified boundary conditions from a GCM or from re-analysis datasets [24]. However, data from GCMs and RCMs cannot be used directly in building simulation, as they usually come with low time and space resolutions and need to be appropriately downscaled to the application at hand. The available downscaling techniques include statistical techniques (e.g., interpolation of the main climate-related variables), stochastic techniques (where models derive variables stochastically from a few independent weather variables), and the “morphing” method, which applies the monthly data from a GCM or RCM to pre-existing hourly weather data files through operations of “shift”, “stretch”, or a combination of “shift” and “stretch” [17]. Some examples of these methodologies are briefly discussed in the following. In [25] Farah et al. present a methodology aimed at integrating climate change features into historical weather data, to make suitable climate change weather data available for building performance simulation. The approach decomposes the hourly dry-bulb temperature into three different time series components through Fourier series, to simplify the integration of three climate change features. The methodology modifies the maximum and minimum monthly averages, the number of days with maximum temperature above a specified threshold and the number of consecutive occurrences of days with heat waves. The modification of the weather data file for building simulation is based on specific reports and projections for climate change within selected Australian cities.
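The “shift” and “stretch” operations of the morphing method mentioned above can be illustrated with a short sketch. It follows one formulation that is common in the morphing literature; the exact variant used in [17] may differ, and the function names and the sample values are assumptions made only for illustration.

```python
from statistics import mean


def morph_shift(hourly, delta):
    """Shift: add the projected monthly change to every hourly value."""
    return [x + delta for x in hourly]


def morph_stretch(hourly, alpha):
    """Stretch: scale every hourly value by a projected monthly factor."""
    return [alpha * x for x in hourly]


def morph_shift_stretch(hourly, delta, alpha):
    """Shift and stretch: move the monthly mean by delta and widen (or narrow)
    the deviations around that mean by the fraction alpha."""
    monthly_mean = mean(hourly)
    return [x + delta + alpha * (x - monthly_mean) for x in hourly]


# Hypothetical hourly dry-bulb temperatures (degrees C) for one month.
present = [10.0, 12.0, 14.0]
future = morph_shift_stretch(present, delta=2.0, alpha=0.5)
```

In this formulation the shifted-and-stretched series has a mean equal to the present mean plus `delta`, while its spread around the mean grows by the factor `1 + alpha`.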
Significant shifts between heating and cooling needs are expected for the case study in Australia, with heating becoming less impactful and cooling increasing significantly. The approach is however data and application specific. Among the studies based on the morphing method, several focus on case-specific geographical issues, such as calculating the variations in heating and cooling in a specific time frame under specific climates. This is the case of Flores-Larsen et al. [26], who use the morphing method and data from the HADCM3 global circulation model to evaluate the impact of climate change on the energy performance of residential buildings in Argentina and to analyse whether current bioclimatic design solutions might need to be re-imagined in the future due to climate change. Results highlight a reduction in heating of up to 7.8 kWh/m2, which is balanced by an increase in cooling of roughly the same magnitude. Several studies approach regional analysis in a similar way. The work by Moazami et al. [27] provides an overview of the major approaches to create future weather data sets based on the statistical and dynamical downscaling of climate models. Through the dynamic downscaling of weather data from different global circulation models, several weather data sets for Geneva were synthesized and applied to the energy simulation of 16 ASHRAE standard reference buildings, in order to create a virtual neighbourhood. The study concludes that it is particularly important to take into consideration both typical and extreme conditions to provide solid results and figures: across all the simulations and case studies proposed, peak cooling demand can be as much as 28.5% higher due to extreme events compared to average data. Among the studies using RCMs as the data source, Nik [28] proposes the development of several climate scenarios: one typical and two extreme weather datasets.
The methodology proposed aims at decreasing the number of weather datasets to be used without losing quality and detail, and is based on the simulation of an office building in Geneva and of the residential building stock in Stockholm. Several climate datasets are generated from different RCMs and GCMs, which in the end allow distributions of heating and cooling demand to be recreated that are very similar to those achieved with the original weather dataset. In [29] an overview of the various methodologies currently available for the generation of future weather files for building simulation software is provided, and the following weather generators are identified: COPSE [30] and PROMETHEUS [31]. These tools are based on data from the most recent set of climate change projections in the UK, developed in 2009 by the UK Department of Environment (UKCP09) [32]. Eames et al. [31] make use of the UKCP09 probabilistic tool, developing a sensitivity analysis by sequencing 3000 example years in an attempt to study the full range of probabilistic response and risk, and comparing it to the single probabilistic reference years provided in the standard approach. The results highlighted that the complexity of such an approach and of such a wide sensitivity analysis is not justified by a significant increase in the precision of the results, and that the reference years reach enough detail to guarantee good agreement with monitored data. In this framework, the paper proposes the analysis of different GCM-based forecasts of climate change. These simulated data are compared with historical climate databases, with the aim of generating a new ‘data-fusion’ dataset through the combination of different IPCC models, so that the generated model is similar to the real available data and, simultaneously, more robust to the overfitting problem typical of LSE (Least Square Error) methods.
In particular, we compute the average trend of each dataset by removing the seasonal and noise residual components, and use the Elastic-Net regression on the (approximated) linear trend obtained on the historical data. This process is based on an abstract representation of the data (the average trend) and avoids the risk of overfitting on the single points of the dataset. We then use the obtained weights on the RCP4.5 and RCP8.5 scenarios, achieving a unified model for future


Fig. 1. Organization of the database.

predictions. The new dataset is then compared to the existing ones in terms of predicted variations of heating and cooling, in order to understand clearly what level of precision is attained when choosing one specific model or another to assess climate change effects on the building sector. The new datasets are used in a simplified building case study simulated in both a residential and a non-residential configuration, compliant with the existing regulations on building energy performance in Italy.

2. Methodology

2.1. Dataset description

The Fifth Phase of the Coupled Model Intercomparison Project (CMIP5), a project coordinated by the World Climate Research Program (WCRP), manages a dataset including many climate models [33]. The project is a global reference for climate researchers and aims to study and verify the knowledge on the climate and its variations. With input from various research groups, CMIP sheds light on the phenomena that regulate climate and its changes. CMIP5 defines a wide range of experiments including climate control, paleoclimatic, sensitivity, hindcast and predictive experiments, as well as providing the opportunity for researchers to define their own historical experiments. The experiments are divided into two types: long-term (century time scale) and near-term (10–30 years), the latter also called decennial prediction experiments. The datasets are available from the World Data Center Climate (WDCC) database managed by CMIP5, which organizes access to climate data according to the flowchart shown in Fig. 1. The available datasets are the output of 58 different climate models (General Circulation Models, GCMs), developed by 23 different research centres. The datasets have different spatial resolutions (latitude/longitude degrees) and might not be available for each RCP scenario or for each climatic parameter. These choices depend essentially on the research centres that initiated the simulations.
Moreover, the database includes different types of simulations, such as:
• piControl (Pre-industrial Control): pre-industrial conditions, including prescribed atmospheric concentrations or non-evolving emissions of gases or aerosols, as well as unperturbed land use.
• amip (Atmospheric Model Intercomparison Project): imposes sea surface temperatures (SSTs) and sea ice.
• historical: conditions consistent with observations from 1850 to 2005, which may include atmospheric composition due to both anthropogenic and volcanic influences, solar forcing, as well as land use.
• rcp: future projections (2006–2100) forced by RCP2.6, 4.5, 6.0, and 8.5. These projections are driven by specific atmospheric concentrations, known as Representative Concentration Pathways (RCPs), resulting approximately in radiative forcings of 2.6, 4.5, 6.0 and 8.5 W/m2 at year 2100, respectively.
• esm (Earth system model): includes climate-carbon cycle feedbacks.
• sst (SST): imposes sea surface temperatures.
• palaeo: paleoclimate modelling, consistent with PMIP (Paleo Model Intercomparison Project) specifications.
• aqua: imposes SSTs on a planet without continents, consistent with the CFMIP (Cloud Feedback Model Intercomparison Project) specifications.

Finally, each dataset contains data in NetCDF format (Network Common Data Form, .nc), organized according to the syntax A_B_C_D_E_F.nc, where each letter corresponds to:
• A: climate variable,1 such as, for example: cloud area fraction (clt); relative humidity (hurs); air pressure at sea level (psl); surface downwelling shortwave flux in air (rsds); wind speed (sfcWind); air temperature (tas); monthly mean of the daily-maximum near-surface air temperature (tasmax); monthly mean of the daily-minimum near-surface air temperature (tasmin);
• B: time step at which data is provided (monthly, daily, 6 h);
• C: climate model used to generate the output considered;
• D: type of simulation;
• E: boundary conditions used by the climate model to generate the output;
• F: time frame of the data contained within the file.

1 The complete list is available at the link "https://www.dkrz.de/up/services/data-management/projects-and-cooperations/ipcc-data/cmip5-variables"

In our analysis we focused on long-term experiments and, in particular, we considered:
• simulations from January 1850 to December 2005 (“Historical”) within the historical category;
• future RCP simulations (from January 2006 to December 2300; the period we are most interested in is 2006–2100) in accordance with the four RCP emission scenarios (RCP2.6, RCP4.5, RCP6.0 and RCP8.5) developed by the IPCC (Intergovernmental Panel on Climate Change).

Historical simulations cover much of the industrial period and are guided by the changes observed in the atmospheric composition (of both anthropogenic and natural origin) and by the time-evolving land cover. For future projections we instead considered the RCP 4.5 and RCP 8.5 scenarios, which represent the "core set" of the long-term experiments of CMIP5. In particular, RCP 8.5 refers to a scenario with high emissions, while RCP 4.5 is representative of a "midrange mitigation emission" scenario, where greenhouse gas emissions peak around 2040 and then decline. In the RCP 8.5 scenario, for example, the radiative forcing increases throughout the twenty-first century to reach the level of 8.5 W/m2 at the end of the century, while in the RCP 4.5 scenario a level of 4.5 W/m2 is reached at the end of the century. In particular we used the datasets described in Table 2, in a specific location (latitude 1.25°, longitude 1.875°) for the following parameters:


Table 2. Description of the datasets used and simulations.

| Research centre | Acronym | Model(s) | Simulations |
| --- | --- | --- | --- |
| Commonwealth Scientific and Industrial Research Organization/Bureau of Meteorology, Australia | CSIRO-BOM | ACCESS1.0, ACCESS1.3 | historical, rcp 4.5, rcp 8.5 |
| National Institute of Meteorological Research, Korea Meteorological Administration, South Korea | NIMR, KMA | HadGEM2-AO | historical, rcp 4.5, rcp 8.5 |
| Met Office Hadley Centre, UK | MOHC | HadGEM2-CC | historical, rcp 4.5, rcp 8.5 |

Fig. 2. Dataset download procedure for the ERA-Interim database [20].

• tas (Near-Surface Air Temperature [K])
• sfcWind (wind speed [m/s])
• rsds (Surface Downwelling Shortwave Radiation [W/m2]).

These simulations were then compared with those provided by the ERA-Interim dataset, which was taken as the reference for combining the models into a single model that represents more consistently the trend of the ERA-Interim database, chosen as the basis of comparison for the future datasets. ERA-Interim [20] is a project of re-analysis of global observed data, funded in 2000 by the European Union under the V framework program (Energy, Environment and Sustainable Development). In particular, it is possible to download climatic data with a spatial resolution ranging from 0.125°/0.125° to 3°/3° (latitude, longitude) and for the period between January 1979 and September 2017 (continuously updated). The ERA-Interim datasets can be accessed interactively via the web (downloading the data in netCDF or GRIB format), as well as via MARS, a data extraction and archiving system. Fig. 2 shows the organization of the ERA-Interim database. We combine the IPCC models so that the trend of the generated model is similar to the ERA-Interim average trend. We first remove the seasonal component from the time series, then we approximate the average trend with a robust linear fit (using the RANSAC algorithm), and finally we apply the Elastic-Net regression to the linear trends obtained from the decomposition of the different “historical” models. We thus obtain the weights to be used for the data fusion of the models' datasets (now using the original, non-decomposed traces), achieving a unified model which is closer to the ERA-Interim dataset. Since the ERA-Interim dataset is obviously not available for the RCP4.5 and RCP8.5 scenarios, we use the coefficients obtained on the historical datasets also for the data fusion of the RCP models, achieving a similar unified model for future predictions.
The data fusion process is shown in Fig. 3 and will be described in more detail in the ‘Models data fusion’ section.

2.2. Models data fusion

2.2.1. Time series decomposition

The tas, sfcWind and rsds climate parameters are characterized by a strong seasonal component which deeply influences the time series. This is clearly visible in Fig. 4, where a detail of the ERA-Interim dataset is shown. Clearly, the same seasonal component is also present in the other climatic models of interest. Thus, to combine the parameters of interest provided by the IPCC into one single model, we first decompose the climatic parameters of interest. Indeed, we split the time series into three additive

components: average trend, seasonality and residual. The decomposition is obtained using the statsmodels Python module [34] and is based on a convolution filter and a moving average. As we will show, the trend can be approximated through a robust regression algorithm, namely RANSAC, with a polynomial of first order, i.e., an interpolating line, while the seasonality, which is the annual frequency component of the time series, can be parameterized through the main frequency (sinusoidal) component of the Discrete Fourier Transform (DFT). Finally, the residuals are generally characterized by a Gaussian probability distribution. The RANSAC (RANdom SAmple Consensus) algorithm, used to approximate the average trend, is an iterative algorithm for the robust estimation of parameters. RANSAC fits a regression model to a subset of the data, called “inliers”, removing the impact of the outliers. We can summarize the RANSAC algorithm in the following way.
(1) Select a random number of inlier samples and train the model.
(2) Test all other data points against the fitted model and add those points that fall within a specific tolerance with respect to the inlier values.
(3) Refit the model using all inlier values.
(4) Estimate the error of the model fitted on the inlier values.
(5) Stop the algorithm if the performance reaches a certain user-defined threshold or if a fixed number of iterations has been reached, otherwise return to step 1.
In our experiments we set the maximum number of iterations of the RANSAC regression to 100, the minimum number of randomly chosen samples to 150, and the threshold used to select the samples as "inliers" to the MAD (Median Absolute Deviation) of the trend data. The seasonal component instead consists of a 12-month cyclic pattern which represents the average seasonal variations seen throughout the entire dataset.
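The RANSAC steps listed above can be sketched in a few lines of dependency-free Python. This is only an illustrative sketch, not the implementation used for the experiments: the function names, the toy data and the simplified stopping rule (a fixed number of iterations, scoring each candidate by the size of its consensus set) are assumptions made here.

```python
import random
from statistics import median


def mad(values):
    """Median Absolute Deviation, used here as the inlier tolerance."""
    center = median(values)
    return median(abs(v - center) for v in values)


def fit_line(points):
    """Ordinary least-squares line through (x, y) points: (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def ransac_line(points, min_samples, threshold, max_iter=100, seed=0):
    """Robust line fit: keep the largest consensus set, then refit on it."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(max_iter):
        # Step 1: fit a candidate model on a random subset.
        slope, intercept = fit_line(rng.sample(points, min_samples))
        # Step 2: collect every point within the tolerance of the candidate.
        inliers = [(x, y) for x, y in points
                   if abs(y - (slope * x + intercept)) <= threshold]
        # Step 4 (simplified): score the candidate by its consensus size.
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    if len(best_inliers) < 2:
        raise ValueError("no consensus set found")
    # Step 3: refit the model on all inliers of the best candidate.
    return fit_line(best_inliers), best_inliers


# Toy linear trend y = 2x + 1 with two gross outliers (hypothetical data).
points = [(float(x), 2.0 * x + 1.0) for x in range(20)]
points[5] = (5.0, 100.0)
points[12] = (12.0, -50.0)
threshold = mad([y for _, y in points])
(slope, intercept), inliers = ransac_line(points, min_samples=2, threshold=threshold)
```

On this toy series the two outliers fall outside the MAD tolerance, so the final line is refitted on the clean points only.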
As we will show, this seasonal component can be straightforwardly approximated with the first harmonic component of the FFT, corresponding to a 12-month periodic sinusoid. Finally, the residual distribution obtained after removing the robust linear trend and the seasonality is similar to a Gaussian distribution, with mean close to zero.

2.2.2. Linear regression

To achieve the data fusion and combine the parameters into one single model, we use a multiple linear regression. Linear regression is a machine learning technique that “weights” each dataset according to the climatic input data without introducing any


Fig. 3. Model decomposition and regression process used for data fusion of the different climatic models.

Fig. 4. Details of the ERA time series for the 3 parameters of interest.

additional hypothesis. A strength of this approach is the ease of extending it to include new input data. In more detail, in a linear regression the output (dependent variable) is a weighted sum of the inputs. The equation can be written as follows:

$$\hat{y}(w, X) = w_0 + w_1 x_1 + \ldots + w_p x_p \qquad (1)$$

where the vector $w = (w_1, \ldots, w_p)$ contains the coefficients (weights) of the inputs $X = (x_1, \ldots, x_p)$, $w_0$ is the intercept and $\hat{y}$ is the desired output. To calculate the regression weights, it is important to choose a quantitative measure of the difference between the observed output $y$ and the one produced by the linear combination $\hat{y}$, which is then minimized. A simple solution is to use the LSE method, which minimizes the MSE (Mean Square Error), formally defined as $\min_w \|Xw - y\|_2^2$. However, since the MSE is not dimensionless, a standardized version of the MSE is often preferred, namely the coefficient of determination $R^2$, which is related to the MSE as $R^2 = 1 - \mathrm{MSE}/\mathrm{Var}(y)$. The value of this coefficient is between 0 and 1 for the training dataset but can become negative for the test set. The coefficient of determination is equal to one when the model fits the data perfectly, i.e., when the MSE is zero.
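The relation between the MSE and the coefficient of determination can be checked with a few lines of code. The sketch below is only illustrative; the function names `mse` and `r2_score` are chosen here for convenience and the variance is the population variance of the observed values.

```python
def mse(y_true, y_pred):
    """Mean square error between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination, R2 = 1 - MSE / Var(y)."""
    mean = sum(y_true) / len(y_true)
    var = sum((t - mean) ** 2 for t in y_true) / len(y_true)
    return 1.0 - mse(y_true, y_pred) / var
```

A perfect prediction gives R2 = 1, predicting the mean of the observations everywhere gives R2 = 0, and a prediction worse than the mean gives a negative value, which is why the coefficient can become negative on unseen data.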

This approach, although valid in many cases, is based on the assumption that the different inputs of the model are independent. However, when the inputs are correlated and the columns of the X matrix have a quasi-linear dependence (as in our case), the method becomes highly sensitive to the random fluctuations of the response, producing a large variance that makes the values of the coefficients w unreliable. To overcome this problem, we used Elastic-Net [35], a more sophisticated linear regression model trained with two regularization parameters, α and ρ. Its objective function to be minimized can be expressed mathematically in the following form:

$$\min_w \; \frac{1}{2 n_{\mathrm{samples}}} \|Xw - y\|_2^2 + \alpha\rho \|w\|_1 + \frac{\alpha(1-\rho)}{2} \|w\|_2^2 \qquad (2)$$

The combined action of α and ρ keeps the amplitude of the coefficients low (in particular through the term $\frac{\alpha(1-\rho)}{2}\|w\|_2^2$) and at the same time tries to guarantee a certain "sparsity" of the coefficients, preferring, where possible, to set some coefficients to 0 (through the term $\alpha\rho\|w\|_1$). An appropriate choice of these parameters helps greatly in combating overfitting, an unwanted phenomenon that appears when the model has an excessive number of parameters compared to the observed data. In overfitting, models can adapt perfectly to the input data but are unable to generalize what has been learned during training to new, unseen data.
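To make the role of the two penalty terms in Eq. (2) concrete, the following sketch minimizes that objective with a plain cyclic coordinate descent. It is a didactic stand-in, not the scikit-learn solver used in the paper: the names `soft_threshold` and `elastic_net` are hypothetical, no intercept is fitted, and a fixed number of sweeps replaces a convergence test.

```python
def soft_threshold(z, gamma):
    """Shrink z towards zero by gamma (effect of the L1 term)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def elastic_net(X, y, alpha, rho, n_iter=200):
    """Coordinate descent for the objective of Eq. (2), without intercept."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    # Mean squared value of each column, reused in every update.
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of column j with the residual that excludes column j.
            z = sum(
                X[i][j] * (y[i] - sum(X[i][k] * w[k] for k in range(p) if k != j))
                for i in range(n)
            ) / n
            # The L1 term thresholds the update, the L2 term shrinks it.
            w[j] = soft_threshold(z, alpha * rho) / (col_sq[j] + alpha * (1 - rho))
    return w
```

With α = 0 the update reduces to ordinary least squares; a large α with ρ = 1 drives all weights to exactly zero, which is the sparsity effect described above.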


2.2.3. Model training
The input dataset X is a matrix with a number of rows equal to the number of samples of the parameter under investigation and a number of columns equal to the number of climate models to be combined. The column vector y is the response, in our case the values provided by the ERA-Interim database, considered as the ground truth reference. The aim is to find an optimal combination of the climate models and to generate a single meta-model that incorporates them. The first step is to divide the matrix X and the respective values of the response y into two parts: X_train, X_test, y_train, y_test. The split is done randomly: we reserve a portion of the available data to verify (test), at the end of the training, the generalization capacity of the model obtained, in other words its ability to predict correctly also on data different from those used for training. We chose to use 80% of the data for training and the remaining 20% for the test. The second step is the "model selection", in which we identify the appropriate values for the parameters α and ρ. This is a very delicate phase because a wrong choice of parameters can lead to a model that is too simple, thus underfitting the training data, or, vice versa, to an excessively complex model that overfits the training data. We thus compare different parameter settings in order to obtain the optimal parameters, i.e., those that guarantee the best prediction accuracy on the test data. However, reusing the same test dataset several times during model selection would bias the results, leading to a possible overfit. To overcome this problem we used the "k-fold cross-validation" technique, which consists of randomly subdividing the training dataset into k parts: k−1 parts are used for model training and the remaining part is used for testing. This procedure is repeated k times in order to obtain k different model parameters and performance estimates.
Then we derive the average performance of the models based on the different independent subdivisions, to obtain a performance estimate that is less sensitive to the partitioning of the training data. The parameter k is commonly set to a value between 5 and 10. The setting of the parameters and the choice of the model are based precisely on the maximization of the cross-validation score. In particular, we considered a set of reasonable values for α and ρ and we evaluated the cross-validation scores for all the possible combinations of parameters to identify those which provided the best scores. Once we obtained the optimal parameters, the third step is to proceed with the Elastic-Net training on the whole training set. Then, we test the model on the data that were not examined during the training phase (test set), in order to obtain an unbiased evaluation of the performance of the algorithm. In this last phase the accuracy of the prediction (measured through the coefficient R2) must be verified: on the test set the accuracy should not be significantly lower than the one obtained on the training set, otherwise the regressor is overfitting. Once we obtained the optimal coefficients for the “historical” dataset, we then use them also to predict the future parameters, thus obtaining also a data fusion of the RCP models.

2.2.4. Generating synthetic data traces
To verify the robustness of our approach, we use the model decomposition to generate several new traces which have the same properties as the original dataset. For each model, we take the average linear trend and seasonality and sum them to new residuals generated according to a Gaussian distribution with the same variance as the previous residuals. We then compare our regression-based data fusion with a classic LSE method which minimizes the error on a point-by-point basis. As we will show, our regression is more robust to overfitting while having MSE and R2 coefficient very close to the LSE results.
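The generation of a synthetic trace described in Section 2.2.4 can be sketched as the sum of a linear trend, a 12-month sinusoid and fresh Gaussian residuals. The function name `synthetic_trace` and its parameters (in particular the per-month `slope` and the `phase` of the sinusoid) are assumptions introduced for illustration, not quantities reported by the authors.

```python
import math
import random


def synthetic_trace(intercept, slope, amplitude, phase, resid_std,
                    n_months, seed=None):
    """Monthly trace = linear trend + annual sinusoid + Gaussian residuals."""
    rng = random.Random(seed)
    trace = []
    for t in range(n_months):
        trend = intercept + slope * t
        season = amplitude * math.cos(2 * math.pi * t / 12 + phase)
        # New residuals with the same standard deviation as the original ones.
        trace.append(trend + season + rng.gauss(0.0, resid_std))
    return trace
```

Calling the function repeatedly with different seeds yields traces that share trend and seasonality but differ in their residuals, which is the property exploited to test how stable the fusion weights are.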

2.3. Building weather data generation
The results achieved in the previous step were used to develop weather data files for the simulation of future energy performance in a non-steady-state simulation environment. Both the original climate data and the data-fusion results were used to generate weather data files through the application of the morphing method. This is a mathematical procedure that uses future monthly data to generate hourly weather data for building energy simulation. Every climate variable ($x_0$) of the existing weather data is modified by one of the following operations:
• shift;
• linear stretch;
• combination of a shift and a stretch.

In detail, a shift is mostly applied if an absolute monthly variation of the mean is available from the climate model. For example, the future hourly atmospheric pressure ($p$) could be calculated directly from the present hourly value of the atmospheric pressure ($p_0$) and from the monthly increment in atmospheric pressure ($\Delta p_m$), as in the following equation:

$$p = p_0 + \Delta p_m \qquad (3)$$

where the subscript “0” refers to the current weather data files, “m” refers to monthly data, and terms without subscripts are future data. A stretch is mostly useful if the climate change forecasts are available as a fractional monthly change. For example, for the global horizontal radiation ($r$), an increase in the monthly average solar shortwave flux received at the surface ($\Delta r_m$) is obtained. A scaling factor for the month m ($\alpha_{rm}$) is calculated from the absolute variation ($\Delta r_m$) and the monthly mean ($\bar{r}_{0m}$) of the baseline climate as in the following equation:

$$\alpha_{rm} = 1 + \frac{\Delta r_m}{\bar{r}_{0m}} \qquad (4)$$

This scaling factor is then applied to all hourly values of month m in the time series using the following equation:

$$r = \alpha_{rm} \cdot r_0 \qquad (5)$$

where $r_0$ is the current hourly global horizontal radiation and $r$ is the future hourly global horizontal radiation. A simultaneous shift and stretch is used for climatic variables such as dry-bulb temperature, to reflect changes in both the daily mean and the maximum and minimum daily values. As an example, for the dry-bulb temperature the following outputs are calculated from the climate change scenario: the monthly variation of the daily mean temperature ($\Delta t_m$), of the daily maximum temperature ($\Delta t_{max,m}$) and of the daily minimum temperature ($\Delta t_{min,m}$). Using $\Delta t_{max,m}$ and $\Delta t_{min,m}$, the scaling factor for the dry-bulb temperature ($\alpha_{tm}$) is calculated through Eq. (6) using monthly mean values from both the current and future data:

$$\alpha_{tm} = \frac{\Delta t_{max,m} - \Delta t_{min,m}}{\bar{t}_{0max,m} - \bar{t}_{0min,m}} \qquad (6)$$

where $\bar{t}_{0max,m}$ and $\bar{t}_{0min,m}$ are the monthly mean of the current daily maximum temperature and the monthly mean of the current daily minimum temperature, respectively. Afterwards, the future hourly dry-bulb temperature is calculated through the following equation:

$$t = t_0 + \Delta t_m + \alpha_{tm} \cdot (t_0 - \bar{t}_{0,m}) \qquad (7)$$

where $t_0$ is the present hourly dry-bulb temperature and $\bar{t}_{0,m}$ is the monthly mean temperature of the current climate for the month m.
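The three morphing operations of Eqs. (3)–(7) can be written as short functions acting on the hourly values of a single month. This is a sketch under stated assumptions: the function names are hypothetical, the monthly changes are passed in as plain numbers, and the monthly mean in the combined operation is computed from the hourly values supplied.

```python
def morph_shift(x0, delta_m):
    """Eq. (3): add the monthly absolute change to every hourly value."""
    return [x + delta_m for x in x0]


def morph_stretch(x0, delta_m, mean0_m):
    """Eqs. (4)-(5): scale hourly values by 1 + delta_m / baseline monthly mean."""
    alpha = 1.0 + delta_m / mean0_m
    return [alpha * x for x in x0]


def morph_shift_stretch(t0, dt_mean, dt_max, dt_min, mean0_max, mean0_min):
    """Eqs. (6)-(7): shift by the mean change and stretch around the monthly mean."""
    mean0 = sum(t0) / len(t0)
    alpha = (dt_max - dt_min) / (mean0_max - mean0_min)
    return [t + dt_mean + alpha * (t - mean0) for t in t0]
```

In the combined operation the monthly mean of the morphed series increases exactly by the mean change, while the spread around the mean grows or shrinks according to the scaling factor.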


The analysis is performed for the 2090 future IPCC data, based on two different scenarios, RCP 4.5 and 8.5.

3. Results

3.1. Data decomposition
We first analyse the ERA-Interim dataset, breaking the time series into the three additive components (average trend, seasonality and residuals) and approximating the trend with the robust linear fit obtained through the RANSAC algorithm. Both the trend time series and the robust linear model are shown in Fig. 5. The seasonal component consists of the average variations in a 12-month cyclic pattern which repeats itself throughout the entire dataset. For the ERA Interim dataset, Fig. 6 shows two cycles (i.e., two years) of the seasonal time series, while Fig. 7 shows the magnitude of the corresponding Fast Fourier Transform (FFT). From both figures, it is clear that the seasonality can be straightforwardly approximated with the first harmonic component of the FFT, corresponding to a 12-month periodic sinusoid, as shown in Fig. 6. Finally, the residual distribution obtained after removing the robust linear trend and the seasonal components can be approximated with a Gaussian distribution, with zero average and MAD standard deviation (STD) of 0.60, 0.33 and 8.06 for temperature, wind speed and radiation, respectively. This is clearly visible in Fig. 8. The same analysis and decomposition has been carried out for the four models considered, over the same time span as the ERA Interim dataset. For the sake of brevity, we omit the figures and limit the description to the tabular results. Table 3 summarizes the linear trend average and increase (slope), the amplitude of the seasonal component and the MAD STD of the residuals, while, using the same metrics, Table 4 shows the difference between the four models and ERA Interim in terms of relative error. From the tables, the HadGEM2-CC model is in most cases the closest to ERA, except for a few parameters, mostly regarding the solar radiation, where HadGEM2-AO and ACCESS 1.0 have a lower relative error than HadGEM2-CC.

3.2. Data fusion
To integrate the different climate models we use the Elastic-Net regressor. The obtained additive coefficients allow us to combine linearly the four models of interest and obtain a unified model, spanning the same time period, for each climatic parameter

Fig. 5. Average trend of the ERA dataset and its robust linear fit obtained through the RANSAC algorithm.

Fig. 6. Seasonal component of the ERA dataset and its sinusoidal approximation.

Fig. 7. Fast Fourier Transform (FFT) of the seasonal component of the ERA time series.
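As an illustration of the first-harmonic approximation of the seasonal component discussed in Section 3.1, the sketch below extracts the amplitude and phase of the 12-month component with a single DFT coefficient and rebuilds the sinusoid. The function names are hypothetical, and the series length is assumed to be a whole number of years so that the 12-month period falls exactly on a DFT bin.

```python
import cmath
import math


def first_harmonic(seasonal, period=12):
    """Amplitude and phase of the component with the given period (in samples)."""
    n = len(seasonal)
    coeff = sum(s * cmath.exp(-2j * math.pi * t / period)
                for t, s in enumerate(seasonal))
    amplitude = 2.0 * abs(coeff) / n
    phase = cmath.phase(coeff)
    return amplitude, phase


def sinusoid(amplitude, phase, n, period=12):
    """Rebuild the sinusoidal approximation over n samples."""
    return [amplitude * math.cos(2 * math.pi * t / period + phase)
            for t in range(n)]
```

For a seasonal series that is already close to a sinusoid, the rebuilt curve overlaps the original one, which is the behaviour reported for the ERA seasonal component.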


Fig. 8. Residual distribution of the ERA time series and its Gaussian approximation.

Table 3
Linear trend average and increase, seasonal amplitude and MAD standard deviation of the residuals for the 1979–2005 period (absolute values).

Temperature [K]

| Model | Linear trend average | Linear trend increase | Seasonal amplitude | Residuals MAD_STD |
|---|---|---|---|---|
| ERA Interim | 291.59 | 0.28 | 5.30 | 0.60 |
| ACCESS1.0 | 292.27 | 1.33 | 5.18 | 0.39 |
| ACCESS1.3 | 291.49 | 0.64 | 5.70 | 0.44 |
| HadGEM2-AO | 291.19 | 0.46 | 5.93 | 0.42 |
| HadGEM2-CC | 290.51 | 0.29 | 6.13 | 0.48 |

Wind speed [m/s]

| Model | Linear trend average | Linear trend increase | Seasonal amplitude | Residuals MAD_STD |
|---|---|---|---|---|
| ERA Interim | 6.00 | 0.18 | 1.49 | 0.33 |
| ACCESS1.0 | 4.90 | 0.10 | 1.76 | 0.30 |
| ACCESS1.3 | 4.92 | 0.34 | 1.88 | 0.41 |
| HadGEM2-AO | 5.57 | −0.47 | 1.44 | 0.41 |
| HadGEM2-CC | 5.76 | 0.24 | 1.52 | 0.37 |

Global solar radiation [W/m2]

| Model | Linear trend average | Linear trend increase | Seasonal amplitude | Residuals MAD_STD |
|---|---|---|---|---|
| ERA Interim | 204.66 | 1.44 | 105.19 | 8.06 |
| ACCESS1.0 | 201.51 | 5.89 | 107.71 | 4.66 |
| ACCESS1.3 | 195.51 | 10.28 | 106.09 | 5.40 |
| HadGEM2-AO | 199.87 | 3.40 | 105.84 | 4.93 |
| HadGEM2-CC | 198.36 | 5.90 | 106.41 | 5.52 |

Table 4
Relative error for the four models of interest compared to the ERA Interim dataset (in %).

Temperature [K]

| Model | Linear trend average | Linear trend increase | Seasonal amplitude | Residuals MAD_STD |
|---|---|---|---|---|
| ACCESS1.0 | 0.68 | 375.00 | −2.26 | −35.00 |
| ACCESS1.3 | −0.10 | 128.57 | 7.55 | −26.67 |
| HadGEM2-AO | −0.40 | 64.29 | 11.89 | −30.00 |
| HadGEM2-CC | −1.08 | 3.57 | 15.66 | −20.00 |

Wind speed [m/s]

| Model | Linear trend average | Linear trend increase | Seasonal amplitude | Residuals MAD_STD |
|---|---|---|---|---|
| ACCESS1.0 | −18.33 | −44.44 | 18.12 | −9.09 |
| ACCESS1.3 | −18.00 | 88.89 | 26.17 | 24.24 |
| HadGEM2-AO | −7.17 | −361.11 | −3.36 | 24.24 |
| HadGEM2-CC | −4.00 | 33.33 | 2.01 | 12.12 |

Global solar radiation [W/m2]

| Model | Linear trend average | Linear trend increase | Seasonal amplitude | Residuals MAD_STD |
|---|---|---|---|---|
| ACCESS1.0 | −1.54 | 309.03 | 2.40 | −42.18 |
| ACCESS1.3 | −4.47 | 613.89 | 0.86 | −33.00 |
| HadGEM2-AO | −2.34 | 136.11 | 0.62 | −38.83 |
| HadGEM2-CC | −3.08 | 309.72 | 1.16 | −31.51 |

Table 5
Weight coefficients obtained through the Elastic-Net.

| Parameter | w0 | w1 | w2 | w3 |
|---|---|---|---|---|
| Temperature | 0.386 | 0.371 | 0.182 | 0.061 |
| Radiation | 0.422 | 0.139 | 0.408 | 0.055 |
| Wind Speed | 0.153 | 0.477 | 0.122 | 0.173 |

Table 6
Regularization parameters obtained for the three parameters of interest.

| Parameter | α | ρ |
|---|---|---|
| Temperature | 10 | 0.001 |
| Radiation | 10 | 0.5 |
| Wind Speed | 0.001 | 0.75 |

considered. However, for the combination coefficients we based the regression on the robust linear trend rather than on the point-wise dataset, which reduces the risk of overfitting. The coefficients have been obtained using 309 months from the "Historical" dataset that overlap with the ERA Interim period, for the climate parameters tas, sfcWind and rsds, in the area of Palermo, Italy (Latitude 38.75°, Longitude 13.125°). Once we obtained the combination coefficients, we used them to obtain a unified future forecast model based on the RCP 4.5 and RCP 8.5 scenarios. Referring to the objective function of Elastic-Net introduced previously, we have obtained the weights $w^* = (w_0, w_1, w_2, w_3)$ with input X corresponding to the linear trend matrix of the models (respectively ACCESS1.0, ACCESS1.3, HadGEM2-AO and HadGEM2-CC) and y the corresponding values of ERA-Interim. The regularization parameters α and ρ have been identified through cross-validation with grid search using the ElasticNetCV estimator of the scikit-learn [36] Python library for machine learning. Tables 5 and 6 show the coefficients and the regularization parameters obtained. Finally, Table 7 shows the MSE and R2 coefficient for the resulting combined model and, for comparison, also for the four single models compared to the ERA-Interim dataset. The regression yields a lower MSE and an equal or higher R2 coefficient compared to the individual climatic models.
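Once the weights are known, the fusion itself is a weighted sum of the model series. The sketch below applies the temperature weights of Table 5 to the 1979–2005 linear trend averages of Table 3, purely as a worked example. The function name `fuse` is hypothetical, and the absence of an additive intercept is an assumption, since the text does not state whether one is included.

```python
def fuse(models, weights):
    """Weighted sum of equally long model series, one weight per model."""
    return [sum(w * series[t] for w, series in zip(weights, models))
            for t in range(len(models[0]))]


# Temperature weights from Table 5, in the order
# ACCESS1.0, ACCESS1.3, HadGEM2-AO, HadGEM2-CC (no intercept assumed).
weights_tas = [0.386, 0.371, 0.182, 0.061]

# Illustrative one-sample input: linear trend averages from Table 3 [K].
trend_means = [[292.27], [291.49], [291.19], [290.51]]

fused = fuse(trend_means, weights_tas)
print(round(fused[0], 2))  # about 291.68 K
```

The fused value of about 291.68 K can be compared with the ERA Interim trend average of 291.59 K in Table 3. In practice each inner list would hold the full monthly series of one model rather than a single value.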

To obtain the future predictions, since we have no ground truth model, we applied the same coefficients of Table 5 to combine the RCP models together. We perform the same decomposition (trend, seasonality and residuals) as for the “historical” dataset and, since the ERA dataset is not available, we can only compare the Regression results against the other models. For example, Fig. 9 shows for Temperature, Wind Speed and Radiation the average trend of the obtained Regression dataset compared to its ACCESS1.0 equivalents for RCP4.5 and RCP8.5, together with their robust linear fit obtained through the RANSAC algorithm. From the figure, it is easy to see that the time series resulting from the data fusion Regression can be quite different from the traces of the single models. Similar results have been obtained for the other RCP models and parameters. Finally, Tables 10 and 11 report the decomposition parameters (namely, linear trend average and increase, seasonal amplitude and MAD standard deviation) for all the individual models and the data fusion Regression obtained for RCP4.5 and RCP8.5.

3.3. Comparison with LSE and synthetic traces
We now compare our regression-based data fusion with the LSE method. Since the LSE aims at minimizing the error on a point-by-point basis, the overfitting on a specific dataset might impact


Table 7
MSE and R2 coefficient of the regression model obtained with Elastic-Net and of the four individual models compared to the ERA Interim dataset.

| Model | Temperature [K] MSE | Temperature [K] R2 | Wind speed [m/s] MSE | Wind speed [m/s] R2 | Global solar radiation [W/m2] MSE | Global solar radiation [W/m2] R2 |
|---|---|---|---|---|---|---|
| Regression | 2.17 | 0.91 | 0.72 | 0.62 | 262.78 | 0.96 |
| ACCESS1.0 | 3.14 | 0.87 | 1.05 | 0.44 | 310.68 | 0.96 |
| ACCESS1.3 | 2.67 | 0.89 | 1.47 | 0.21 | 411.42 | 0.94 |
| HadGEM2-AO | 2.32 | 0.90 | 1.56 | 0.17 | 367.39 | 0.95 |
| HadGEM2-CC | 3.20 | 0.87 | 1.89 | −0.01 | 359.08 | 0.95 |

Fig. 9. Average trend of the obtained Regression dataset and its robust linear fit compared to ACCESS1.0 for Temperature, Wind Speed and Radiation with RCP4.5 and RCP8.5.

Fig. 10. Comparison between the Regression and the LSE trend results on the historical datasets.

deeply the model obtained. For example, Fig. 10 compares the Regression and the LSE methods on the historical models. From the figure, it is clear that the results of the two techniques can be very different. Although the LSE method minimizes the MSE (and thus maximizes the R2 coefficient) on this specific dataset, for this same reason it is also more prone to overfitting. Instead, the proposed regression is more general because it is based on the calculation of the robust linear trend rather than on a point-wise error minimization. Despite being point-wise sub-optimal, the regression method has MSE and R2 coefficient still very close to those of the LSE, as reported in Table 8.

To verify the robustness of our approach against the risk of overfitting, we generated new synthetic traces with the same statistical properties (Tables 9–11), i.e., the same linear trend and seasonality but with new residuals generated using the same Gaussian distribution as the original dataset. We then repeat the Elastic-Net regression and the LSE method and re-compute the weights for the data fusion. Since the new synthetic trace has the same properties as the original one, we expect the new coefficients to be similar to the previous ones. However, as reported in Table 9 for the tas parameter on the historical dataset, the LSE coefficients are quite different because of the overfitting on the original traces, while the

Table 8
MSE and R2 coefficient of the regression data fusion and of the LSE based on the historical models.

| Model | Temperature [K] MSE | Temperature [K] R2 | Wind speed [m/s] MSE | Wind speed [m/s] R2 | Global solar radiation [W/m2] MSE | Global solar radiation [W/m2] R2 |
|---|---|---|---|---|---|---|
| Regression | 2.17 | 0.91 | 0.72 | 0.62 | 262.78 | 0.96 |
| LSE | 1.76 | 0.93 | 0.65 | 0.65 | 249.45 | 0.96 |

Table 9
Weight coefficients obtained through the Elastic-Net regression and LSE for the tas historical dataset.

| Dataset | w0 LSE | w0 Regression | w1 LSE | w1 Regression | w2 LSE | w2 Regression | w3 LSE | w3 Regression |
|---|---|---|---|---|---|---|---|---|
| Original | −0.049 | 0.386 | 0.081 | 0.371 | 0.478 | 0.182 | 0.492 | 0.061 |
| Synthetic | 0.195 | 0.383 | −0.169 | 0.371 | 0.188 | 0.184 | 0.788 | 0.061 |

Table 10
Linear trend average and increase, seasonal amplitude and MAD standard deviation of the residuals for RCP4.5 (absolute values).

Temperature [K]

| Model | Linear trend average | Linear trend increase | Seasonal amplitude | Residuals MAD_STD |
|---|---|---|---|---|
| ACCESS1.0 | 293.86 | 1.70 | 5.04 | 0.47 |
| ACCESS1.3 | 293.45 | 2.62 | 5.53 | 0.45 |
| HadGEM2-AO | 292.91 | 1.95 | 5.97 | 0.48 |
| HadGEM2-CC | 292.23 | 2.34 | 6.09 | 0.46 |
| Regression | 293.31 | 2.08 | 5.19 | 0.28 |

Wind speed [m/s]

| Model | Linear trend average | Linear trend increase | Seasonal amplitude | Residuals MAD_STD |
|---|---|---|---|---|
| ACCESS1.0 | 5.66 | 0.07 | 1.95 | 0.34 |
| ACCESS1.3 | 5.49 | −0.15 | 1.88 | 0.43 |
| HadGEM2-AO | 6.50 | −0.28 | 1.47 | 0.40 |
| HadGEM2-CC | 6.57 | −0.19 | 1.60 | 0.41 |
| Regression | 5.80 | −0.15 | 1.65 | 0.23 |

Radiation [W/m2]

| Model | Linear trend average | Linear trend increase | Seasonal amplitude | Residuals MAD_STD |
|---|---|---|---|---|
| ACCESS1.0 | 207.26 | 1.48 | 120.44 | 5.17 |
| ACCESS1.3 | 205.69 | 7.84 | 116.99 | 4.28 |
| HadGEM2-AO | 206.44 | 7.38 | 121.20 | 4.90 |
| HadGEM2-CC | 204.76 | 5.94 | 119.33 | 4.99 |
| Regression | 211.53 | 5.70 | 123.05 | 3.01 |

Table 11
Linear trend average and increase, seasonal amplitude and MAD standard deviation of the residuals for RCP8.5 (absolute values).

Temperature [K]

| Model | Linear trend average | Linear trend increase | Seasonal amplitude | Residuals MAD_STD |
|---|---|---|---|---|
| ACCESS1.0 | 294.66 | 5.04 | 5.02 | 0.43 |
| ACCESS1.3 | 294.20 | 4.71 | 5.49 | 0.41 |
| HadGEM2-AO | 293.35 | 4.28 | 6.00 | 0.50 |
| HadGEM2-CC | 293.33 | 5.04 | 6.08 | 0.48 |
| Regression | 294.08 | 4.78 | 5.16 | 0.27 |

Wind speed [m/s]

| Model | Linear trend average | Linear trend increase | Seasonal amplitude | Residuals MAD_STD |
|---|---|---|---|---|
| ACCESS1.0 | 5.70 | −0.53 | 1.94 | 0.34 |
| ACCESS1.3 | 5.56 | −0.43 | 1.85 | 0.40 |
| HadGEM2-AO | 6.56 | −0.44 | 1.49 | 0.37 |
| HadGEM2-CC | 6.66 | −0.32 | 1.54 | 0.41 |
| Regression | 5.86 | −0.36 | 1.62 | 0.21 |

Radiation [W/m2]

| Model | Linear trend average | Linear trend increase | Seasonal amplitude | Residuals MAD_STD |
|---|---|---|---|---|
| ACCESS1.0 | 207.57 | 8.22 | 119.40 | 5.11 |
| ACCESS1.3 | 207.15 | 7.38 | 117.37 | 4.63 |
| HadGEM2-AO | 206.33 | 3.37 | 120.76 | 4.99 |
| HadGEM2-CC | 205.33 | 5.82 | 120.02 | 4.58 |
| Regression | 212.32 | 6.40 | 122.53 | 3.20 |

Table 12
Main building features.

| Building features | Value |
|---|---|
| Heated floor area [m2] | 81 |
| Volume [m3] | 218.7 |
| Roof U value [W/(m2 K)] | 0.32 |
| Floor U value [W/(m2 K)] | 0.40 |
| Wall U value [W/(m2 K)] | 0.42 |
| Windows U value [W/(m2 K)] | 3 |
| WWR south façade | 20% |
| WWR west/east façade | 15% |
| WWR north façade | 10% |

Fig. 11. The building model used as case study.

regression based on the linear trend approximation is more robust and the coefficients are almost unchanged using the new traces. Similar considerations can be made also for the other parameters of interest and the RCP models although we omit the results for the sake of brevity.

3.4. Building case study and simulation
The assessment of the energy requirements in terms of heating and cooling energy is based on a simple building energy model, described in detail in [37]. Fig. 11 shows a screenshot of the geometry of the building modelled in the EnergyPlus environment. It is a low-rise building with a total heated area of 81 m2.

The building is an office occupied from Monday to Friday from 9:00 a.m. until 6:00 p.m., with the exception of a break from 1:00 p.m. to 2:00 p.m. Table 12 reports the main building features. In particular, the window-to-wall ratio (WWR) of the south façade is about 20%, with nearly 5 m2 of glazed area. U values are compliant with the limits of the Italian regulations for new constructions. Thermal internal loads are due to lighting and office equipment. Lighting power is 5 W/m2, controlled by an illuminance-based dimming system (500 lux setpoint) activated by the presence of people. Other electrical loads are included, with an overall 1530 Wp installed. The existing building was simulated under the previously discussed assumptions for the existing situation and for different future scenarios: RCP 4.5 and 8.5, each with all the climate data models mentioned in the previous sections as well as with the new regression model.

Table 13
Heating and cooling energy use, current time weather data [kWh/m2].

| | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec | Tot |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Cooling | 0 | 0 | 0 | 0.09 | 0.82 | 3.73 | 8.24 | 9.59 | 5.09 | 2.03 | 0.05 | 0 | 29.64 |
| Heating | 4.11 | 4.21 | 2.06 | 0.52 | 0 | 0 | 0 | 0 | 0 | 0 | 0.28 | 2.03 | 13.21 |

Table 14
Heating and cooling energy use, RCP 4.5.

Heating [kWh/m2]

| Model | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec | Year |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ACCESS 1.0 | 1.99 | 2.62 | 0.81 | 0.11 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.57 | 6.10 |
| ACCESS 1.3 | 1.24 | 2.29 | 0.42 | 0.02 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.19 | 4.16 |
| HadGem2-AO | 1.81 | 2.35 | 0.70 | 0.08 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.02 | 0.59 | 5.57 |
| HadGem2-CC | 1.16 | 2.03 | 0.74 | 0.10 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.42 | 4.44 |
| Data fusion model | 1.52 | 2.33 | 0.66 | 0.07 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.42 | 5.00 |

Cooling [kWh/m2]

| Model | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec | Year |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ACCESS 1.0 | 0.00 | 0.00 | 0.00 | 0.15 | 2.25 | 6.87 | 12.02 | 14.02 | 8.08 | 5.32 | 0.19 | 0.00 | 48.91 |
| ACCESS 1.3 | 0.00 | 0.00 | 0.00 | 0.31 | 3.08 | 7.48 | 12.48 | 14.72 | 8.73 | 6.58 | 0.31 | 0.00 | 53.68 |
| HadGem2-AO | 0.00 | 0.00 | 0.00 | 0.17 | 2.71 | 7.39 | 12.21 | 14.06 | 8.81 | 5.52 | 0.13 | 0.00 | 50.99 |
| HadGem2-CC | 0.00 | 0.00 | 0.00 | 0.17 | 2.57 | 7.40 | 12.26 | 13.97 | 8.60 | 6.04 | 0.21 | 0.00 | 51.21 |
| Data fusion model | 0.00 | 0.00 | 0.00 | 0.18 | 2.62 | 7.21 | 12.24 | 14.19 | 8.57 | 5.81 | 0.22 | 0.00 | 51.03 |

Table 15
Heating and cooling energy use, RCP 8.5.

Heating [kWh/m2]

| Model | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec | Year |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ACCESS 1.0 | 0.46 | 0.73 | 0.07 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.02 | 1.28 |
| ACCESS 1.3 | 0.24 | 0.83 | 0.09 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 1.17 |
| HadGem2-AO | 0.61 | 1.10 | 0.17 | 0.01 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.06 | 1.95 |
| HadGem2-CC | 0.21 | 0.88 | 0.10 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 1.20 |
| Data fusion model | 0.36 | 0.89 | 0.10 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.02 | 1.37 |

Cooling [kWh/m2]

| Model | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec | Year |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ACCESS 1.0 | 0.00 | 0.00 | 0.05 | 0.76 | 5.06 | 9.54 | 15.43 | 17.36 | 11.31 | 8.43 | 1.32 | 0.00 | 69.25 |
| ACCESS 1.3 | 0.00 | 0.00 | 0.04 | 0.82 | 4.96 | 9.53 | 15.21 | 17.36 | 11.15 | 8.81 | 1.51 | 0.04 | 69.44 |
| HadGem2-AO | 0.00 | 0.00 | 0.00 | 0.53 | 4.71 | 9.31 | 15.04 | 17.11 | 11.14 | 7.82 | 0.72 | 0.00 | 66.38 |
| HadGem2-CC | 0.00 | 0.00 | 0.02 | 0.64 | 5.13 | 9.78 | 15.33 | 17.70 | 12.06 | 9.40 | 1.43 | 0.01 | 71.51 |
| Data fusion model | 0.00 | 0.00 | 0.00 | 0.68 | 4.96 | 9.53 | 15.23 | 17.37 | 11.39 | 8.59 | 1.26 | 0.00 | 69.02 |

Table 13 shows the results of the preliminary analysis, i.e., the energy simulation of the existing case study with currently available IWEC weather datasets. The features of the test building model are compliant with the regulation limits for new buildings in Italy or, in other words, are typical of newly built ‘high performance’ buildings. In this light, the high relevance of cooling in the total air conditioning energy use can be easily explained. Tables 14 and 15 report instead the monthly energy use for heating and cooling in the future scenarios, respectively for RCP4.5 and RCP8.5. Although in the current time frame heating accounts for roughly 30% of the total energy use for air conditioning, this is no longer the case for the future scenarios, which show results coherent with the decarbonisation assumptions behind the scenarios. Cooling increases by roughly 70% in the RCP 4.5 future scenarios whereas heating is reduced to about a third; in the RCP 8.5 scenario heating is close to zero while cooling rises to about 2.3 times the current value. The distribution of cooling energy use among the five datasets proposed clearly indicates some differences between models: ACCESS 1.3 proposes the most extreme weather and ACCESS 1.0 the most temperate one in the case of RCP 4.5. This is not the case

for the 8.5 scenario, where a more uniform distribution of results in the hottest part of the year is accompanied by more divergent results between September and October, with the highest values for the HadGem2-CC model and the lowest for the HadGem2-AO one. In the month of October alone, the difference between these two models in terms of cooling energy consumption is close to 8%. The data of the fusion regression model always yield intermediate results among the models investigated. Also, since the most relevant contribution in the scenarios modelled is cooling, Fig. 12 summarizes the cooling energy use throughout all scenarios investigated in this paper. Results can vary by as much as ±10%, merely by using one data source or another.

4. Sensitivity analysis
The sensitivity of the results to some parameters and issues will now be briefly shown. The first aspect to be considered is the impact of a potential Urban Heat Island (UHI) on the results. UHI refers to a metropolitan area that is significantly warmer than its surrounding rural area due to human activities [38]. Although this phenomenon might be seen as somewhat offsetting heating requirements in colder climates, it can become a significant issue especially in hot climates, as the heat effect during summer would increase the air-conditioning requirements. This is specifically the case under the climate change scenarios previously discussed and shown in Fig. 13. In the literature, a wide variability of potential temperature variation has been quantified [39–41], depending on the features of the local urbanization and of the natural environment. In particular, in [42], the impact of UHI on the yearly average air temperature is quantified, in several cases with a climate similar to that of the case study proposed in this paper, as variable between 1 K and 3–4 K.
It is therefore possible to achieve a moderate insight into the impact of this phenomenon on energy uses for cooling within future climates, by performing a simple operation of “stretch” of the available weather data, as already described for


Fig. 12. Recap of cooling energy use among all scenarios.

Fig. 13. Air conditioning requirements for cooling under different severity of heat waves.

the morphing method in the previous paragraph, up to 3 K, thus imagining a linear increase in temperature throughout the year. A new set of simulations was carried out for both RCP 4.5 and 8.5, for the data fusion dataset only, as an example. Under the hypotheses briefly discussed above, the results reveal a large increase in cooling compared to the data fusion results, reaching 77 kWh/(m2 K) in the RCP 4.5 case and 96.37 kWh/(m2 K) in the RCP 8.5 one, with a linear increase between all the scenarios shown in Fig. 13 of up to around 40%. The sensitivity of the results to the building features and occupancy was also investigated. Fig. 14 reports the variation of cooling requirements in a building with the same geometrical features and roughly 30% lower glazed area, but used for residential purposes, thus occupied from around 18:00 until 8:30 on weekdays and subject to a more variable schedule during the weekend. Internal loads and lighting profiles of use are taken from [43] and together are never higher than 25 W/m2, while HVAC profiles closely follow the occupancy levels. The residential case also shows little resilience to the climate change scenarios, with increases in cooling that are higher by 30% if RCP 8.5 is compared to the current weather datasets.

Fig. 14. Cooling requirements for a residential building.

5. Discussion
The paper has proposed a methodology to quantify the future energy consumption for heating and cooling, which is expected to increase very significantly even in the ‘mitigation’ scenario implementing solid decarbonisation pathways within the energy sector.


The proposed methodology and the results presented raise the following points for discussion:
• As the lifetime of buildings is typically in the range of 50–100 years, climate change will inevitably affect the operation of buildings that are being constructed or are already built today. Being able to describe future conditions with reliable data is a design necessity for improving resilience to climate change, in terms of both prediction and building design variation;
• the results presented in the paper clearly show a wide variability among all models, reinforcing the need for detailed validation and for alternative modelling solutions such as the one proposed in this paper. A 10% variation in cooling energy requirements at the design level would have relevant impacts in terms of costs, decarbonisation potential and environmental impacts of the energy sector. Choosing the right model, or developing a new one, is paramount for obtaining robust data;
• the methodology produced climatic data that ‘fit’ the available historical simulations better, based on the regression of the four datasets chosen. The strength of the idea is the methodology itself, which could easily be applied to any set of climatic data, provided that a robust set of historical/monitored data is available as a basis for comparison; additionally, we have shown that the regression method based on the average trend is much more robust and less prone to overfitting errors than classic LSE;
• the approach proposed represents an advancement over existing statistical downscaling methodologies, as it achieves better fits with existing re-analysis data than the output data of specific GCMs. A more reliable estimation of energy use for heating and cooling therefore results from using a weather dataset that is more appropriate for the local climate;
• the use of a limited pool of models in the study aims at achieving more comparable results. This removes one layer of uncertainty from the results, as all IPCC climate data used in this paper share the same spatial grid, so no further manipulation of the data is needed to obtain comparable inputs;
• the approach proposed is a site-specific application, based on the acquisition and analysis of data for the city of Palermo (Italy), on the island of Sicily. The paper therefore does not aim to attribute more or less reliability to any specific model, but rather reinforces the need for a careful assessment of the available climate change models for any specific location when developing specific building designs;
• the building used as a case study is representative of a range of buildings characterized by an increased risk of overheating caused by glazed surfaces and high airtightness. The issue, investigated by the International Energy Agency Energy in Buildings and Communities Annex 62, “Ventilative Cooling”, is currently felt among practitioners and academics not only in traditionally cooling-dominated regions such as southern Europe, but is becoming increasingly frequent in colder areas of the world as well.
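The regression-based fusion discussed in the points above can be illustrated with a minimal sketch. It is not the procedure used in the paper: it fits plain least-squares weights (optionally ridge-regularised, as a simpler stand-in for the elastic net) on synthetic monthly temperatures, and all names, biases and noise levels are hypothetical. Because least squares can always reproduce any single model by setting its weight to one, the in-sample error of the fused series cannot exceed that of the best single model; this is exactly why a held-out period is needed to judge robustness to overfitting.

```python
import math
import random


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix [a | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_fusion_weights(model_series, reference, ridge=0.0):
    """One weight per model, minimising the squared error against the reference."""
    k = len(model_series)
    gram = [[sum(p * q for p, q in zip(model_series[i], model_series[j]))
             for j in range(k)] for i in range(k)]
    for i in range(k):
        gram[i][i] += ridge  # ridge = 0.0 gives ordinary least squares
    rhs = [sum(p * y for p, y in zip(model_series[i], reference)) for i in range(k)]
    return solve(gram, rhs)


def fuse(model_series, weights):
    """Weighted combination of the model series, time step by time step."""
    return [sum(w * s[t] for w, s in zip(weights, model_series))
            for t in range(len(model_series[0]))]


def rmse(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Synthetic stand-ins (assumptions): a seasonal "reanalysis" temperature and
# four "climate models" that reproduce it with a bias and random noise.
rng = random.Random(0)
reference = [18.0 + 7.0 * math.sin(2 * math.pi * m / 12) for m in range(120)]
biases = [1.5, -2.0, 0.5, 3.0]
models = [[r + bias + rng.gauss(0.0, 1.0) for r in reference] for bias in biases]

split = 84  # fit on the first 84 months, check on the remaining 36
train_models = [s[:split] for s in models]
valid_models = [s[split:] for s in models]
train_ref, valid_ref = reference[:split], reference[split:]

weights = fit_fusion_weights(train_models, train_ref)
fused_train_error = rmse(fuse(train_models, weights), train_ref)
single_train_errors = [rmse(s, train_ref) for s in train_models]
fused_valid_error = rmse(fuse(valid_models, weights), valid_ref)
single_valid_errors = [rmse(s, valid_ref) for s in valid_models]

print("weights:", [round(w, 3) for w in weights])
print("train RMSE fused / best single:", fused_train_error, min(single_train_errors))
print("valid RMSE fused / best single:", fused_valid_error, min(single_valid_errors))
```

Only the comparison on the held-out months says anything about overfitting; the training-period comparison is favourable to the fused series by construction.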

6. Conclusions

The transition towards a low-carbon energy system is quickly becoming an important target of scientific efforts and research. This will entail the profound transformation of all main sectors responsible for emissions (power generation, industry, transport, buildings and agriculture) through steeply reducing carbon intensity according to their technological and economic potential.


Both the construction of new buildings and the retrofit of the existing stock need tools that treat climate change as one of the variables to be considered in the design phase. This applies in particular to high performance buildings, which are expected to see a considerable increase in cooling requirements over the next decades in the EU in comparison to standard constructions. In this framework, the paper proposes a methodology to generate robust projection data starting from the existing IPCC climate forecasts. The idea is to perform a regression of the existing datasets based on a validation procedure against the existing historical reanalysis data from ERA-INTERIM. The new projected weather data fit the available ERA datasets better than the original models and, importantly, the method is generic enough to ‘fuse’ different data with a reduced risk of overfitting, as demonstrated with similar synthetic datasets. The methodology thus provides a framework that can be included in the building design process to address the need to counter climate change. Furthermore, the generation of building simulation weather files and the building simulation itself allowed the quantification of potentially significant differences between the outputs of different models, strengthening the need for a more detailed approach to climate change modelling. The proposed regression-based method is a step forward in this direction. The innovation of the approach lies in creating a hybrid dataset through statistical re-analysis of existing General Circulation Model outputs that fits the existing ERA-INTERIM reanalysis database better, thus creating a more reliable data source for assessing climate change induced modifications of building energy performance.
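As an illustration of how projected monthly changes can be turned into a future weather file, the following minimal sketch applies two common morphing operations to baseline hourly values: an additive shift and a multiplicative stretch. Which operations and which variables were actually used is not stated in this section, so the function names, the choice of variables and all numerical values below are hypothetical.

```python
def morph_shift(values, months, monthly_delta):
    """Additive morphing: add the projected monthly change to each hourly value."""
    return [v + monthly_delta[m] for v, m in zip(values, months)]


def morph_stretch(values, months, monthly_scale):
    """Multiplicative morphing: scale each hourly value by the monthly factor."""
    return [v * monthly_scale[m] for v, m in zip(values, months)]


# Hypothetical baseline hours: two in January (month 1) and two in July (month 7).
months = [1, 1, 7, 7]
baseline_temperature = [10.0, 12.0, 26.0, 30.0]   # degrees Celsius
baseline_solar = [0.0, 250.0, 0.0, 800.0]         # W/m2 on the horizontal plane

# Hypothetical monthly changes derived from a fused climate projection.
temperature_delta = {1: 1.0, 7: 2.5}   # absolute change, degrees Celsius
solar_scale = {1: 1.0, 7: 1.05}        # fractional change expressed as a factor

future_temperature = morph_shift(baseline_temperature, months, temperature_delta)
future_solar = morph_stretch(baseline_solar, months, solar_scale)

print(future_temperature)
print(future_solar)
```

A shift preserves the hour-to-hour variability of the baseline file, while a stretch keeps zero values (such as night-time solar radiation) at zero.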
The strength of the methodology lies, however, in its wide applicability: although the paper includes only a case-specific development, the methodological framework can be re-developed for any other geographical site. The main limitations are connected to the coarse geographical resolution of the original GCMs, which cannot fully capture the local micro- and mesoclimatic aspects that influence building energy performance in a harmonized and fully integrated way. Nevertheless, the methodology could also be applied to more specific and local datasets, such as those derived from regional climate models.

Conflict of interest

None.


