A spatially-explicit methodological framework based on neural networks to assess the effect of urban form on energy demand



Applied Energy 202 (2017) 386–398

Contents lists available at ScienceDirect

Applied Energy journal homepage: www.elsevier.com/locate/apenergy

A spatially-explicit methodological framework based on neural networks to assess the effect of urban form on energy demand

Mafalda C. Silva a,b,⇑, Isabel M. Horta a, Vítor Leal a, Vítor Oliveira a,c

a Faculdade de Engenharia da Universidade do Porto, Porto, Portugal
b INEGI - Instituto de Ciência e Inovação em Engenharia Mecânica e Engenharia Industrial, Porto, Portugal
c CITTA - Centro de Investigação do Território, Transportes e Ambiente, Porto, Portugal

Highlights

• A methodology for characterizing the link between urban form and energy is proposed.
• The methodology combines neural networks with a spatial analysis.
• For the city of Porto, urban form explains about 78% of the variation of energy use.
• The most relevant features are the number of floors, mix of uses and floor area.
• The methodology may be useful for assessing the energy impact of new urban projects.

Article info

Article history: Received 24 February 2017; received in revised form 17 April 2017; accepted 13 May 2017

Keywords: Urban form; Energy demand; Model; Artificial neural networks; GIS

Abstract

Urban form is an important driver of energy demand and therefore of GHG emissions in urban areas. Yet, research on urban form and energy remains sectorial and has not been able to deliver a full understanding of the impact of the physical structure of cities upon their energy demand. The most common approaches rely on engineering models for buildings and statistical models for transport. This study aims at contributing to the characterization of the link between urban form and energy, jointly considering three distinct energy uses: ambient heating and cooling in buildings, and travel. A high-resolution methodology is proposed. It applies GIS to provide the analysis with a spatially-explicit character, and neural networks to model energy demand based on a set of relevant urban form indicators. The results confirm that the effect of urban form indicators on the overall energy needs is far from negligible. In particular, the number of floors, the diversity of activities within walking reach, the floor area and the subdivision of blocks showed a significant impact on the overall energy demand of the case study analyzed.

© 2017 Elsevier Ltd. All rights reserved.

⇑ Corresponding author at: INEGI - Instituto de Ciência e Inovação em Engenharia Mecânica e Engenharia Industrial, Faculdade de Engenharia da Universidade do Porto, Rua Roberto Frias, 400, 4200-465 Porto, Portugal. E-mail address: [email protected] (M.C. Silva).
http://dx.doi.org/10.1016/j.apenergy.2017.05.113
0306-2619/© 2017 Elsevier Ltd. All rights reserved.

1. Introduction

Energy demand and GHG emissions associated with cities result from a chain of factors and drivers. The link between urban form and energy has been explored for almost three decades. In their seminal study, Newman and Kenworthy [1] identified an inverse relationship between density and gasoline consumption. More recently, urban form has also been claimed to influence energy demand in buildings [2]. Along with economic geography, socio-demographic factors, and technology, urban form is considered a driver of high importance in mature cities [3]. The impact of urban form on energy demand lies in its role in determining the energy needs of two key urban sectors: buildings (notably ambient heating and cooling) and transport.

The ability to quantify the energy impact of different urban configurations is of major importance to inform the urban planning decisions recurrently taken in cities. An early-stage assessment of urban projects or plans could return important energy savings over a time horizon that is likely to span decades or centuries. Although the relationship between urban form and energy has been investigated for a few decades at different scales and in different geographical contexts, its effect on the energy flows of cities is not yet fully understood or quantified [3–5], whether considering the effect of urban form in relation to other drivers of energy demand, or the relative importance of different urban form attributes.

Several difficulties may be identified, as summarized in Silva et al. [6]. For instance, some physical features of the urban environment (e.g. housing type) may be correlated with socio-economic factors [7]. Although there is a large number of urban form variables, comprehensive studies are still lacking. An exception is the meta-analysis by Ewing and Cervero [8], although it focuses on travel outcomes only. Also, the degree of interaction amongst the different urban form variables remains unknown. Ewing and Cervero [9] recall that the influence attributed to density may derive from other factors of urban form with which density is often associated. Finally, the existing research is predominantly sectorial, although it is acknowledged that buildings and mobility should be considered together [10].

The existing modeling approaches depend on the scope and purpose of the analysis. Some studies include urban form as one parameter amongst many others (usually defined by one or a couple of variables), since they are not focused on exploring its specific contribution. Where the goal is to infer the impact of features of urban form on energy demand, the existing research presents some bottlenecks. The first is an incomplete characterization of the overall effect of urban form, as few indicators are considered simultaneously [11,12]. The second limitation arises from focusing on buildings and mobility separately. This leads to neglecting existing energy trade-offs between the two sectors (e.g. density affects both thermal needs in buildings and mobility). In addition, spatially-explicit approaches are still lacking, although they may be the key to uncovering the effect of urban form on energy demand [13].

This paper addresses these issues by proposing a spatially-explicit methodological framework that considers a comprehensive set of urban form attributes with energy relevance, belonging to both the building and the transport sectors. Section 2 presents an overview of existing approaches, both explanatory and predictive, i.e. those aiming at determining the influence of urban form variables on energy demand and those aiming at estimating energy demand from indicators of urban form.
Section 3 proposes an innovative spatially-explicit high-resolution methodology, grounded on the application of neural networks to predict the energy requirements of an urban area, given a set of urban form attributes. Section 4 illustrates the applicability of the methodology, using the city of Porto in Portugal as a case study. It discusses the results and the relevance of the methodology as a decision-support tool. Section 5 concludes and presents future research directions.

2. Modeling energy in the built environment: A review

Urban energy models can be classified depending on the detail of the analysis (top-down vs. bottom-up, i.e. use of aggregated data downwards vs. use of disaggregated data upwards) or depending on the modeling approach, e.g. econometric, engineering-economy (or end-use), hybrid models, scenario approaches, process models, input-output, and artificial neural networks [14]. Keirstead et al. [15] identify six categories of urban energy models: technology design, building design, urban climate, systems design, policy assessment, and land use and transportation (LUT) modeling. The authors point to urban complexity as a constraint, while arguing for integrated modeling approaches with policy relevance. Anderson et al. [16] present the most common methods for urban analysis, ranging from spatial analysis to material accounting and simulation models. Zhao and Magoulès [17] consider three categories of models based on the nature of the analytical method: (1) engineering models, (2) statistical models, and (3) data mining models. This study follows this classification, and considers the two most important energy-using sectors in cities (buildings and transport).

2.1. Engineering models

Engineering models are typically more complex and detailed than the remaining categories, and are also called simulation models. In buildings, engineering models are mostly applied at the scale of a single building or building subset. Larger scales may be covered if building types are previously defined (e.g. [18]). These models are grounded on physical principles for estimating the energy balance (typically thermal) of the object under analysis [17].

Building simulation models are well-established methods for determining energy demand at the building scale. A comprehensive catalog may be found at the BEST Directory [19]. As the relevance of these models relies on the accuracy of input data, several studies focus on their calibration (e.g. Sun and Reddy [20], [21]). Data inputs usually include weather data (typical meteorological year), construction characteristics (e.g. materials, fenestration), occupancy and respective activities, as well as a description of the thermal systems. Aspects of urban form are marginally captured, usually incorporated at the stage of describing building geometry, with a few studies considering the building's surroundings. Although building simulation models may include features of urban form, the effect of morphological parameters or the impact of spatial planning options on urban energy demand is seldom the focus of the analysis.

In order to explore the effect of building geometry, Pessenlehner and Mahdavi [22] analyze fifty-four building morphological variations. The thermal behavior was predicted by the NODEM software, in terms of heating load and overheating index. Linear regressions were used to relate these indexes to a compactness indicator. The study found that compactness is significantly associated with the heating load and that its use in building energy standards could be relevant. With a stronger emphasis on morphological criteria, Ratti et al. [23] analyze the effect of urban texture on building energy consumption, through the use of the Lighting and Thermal (LT) model.
The LT does not require as much detail as a full dynamic simulation model, but it still requires circa 30 parameters as inputs. The variables of urban texture considered are: (i) passive/non-passive ratio, (ii) orientation, (iii) urban horizon angle (UHA), and (iv) obstruction sky view (OSV). A variation of at least 10% in building energy consumption for three case studies (Toulouse, Berlin and London) was attributed to texture variables. It was concluded that the passive to non-passive ratio is more important in explaining energy variations than the surface-to-volume ratio. With a similar approach, Zhao [24] found that the dispersion degree of buildings is the variable with the highest influence on energy use.

Hargreaves et al. [25] developed a modeling framework combining socioeconomic projections and a representation of the variability of land use patterns. The authors used a 'tiles' method for converting densities into a representation of the built stock, assuming a gamma distribution, while energy data was estimated by the 'Domestic Energy and Carbon Model'. The analysis intended to assess the suitability of different options for retrofitting and decentralized supply.

An additional line of research explores the impact of urban form (mostly vegetation and urban features such as density) on the urban heat island (UHI) effect. This effect is characterized by higher average urban temperatures in relation to the rural hinterlands [26], largely attributed to the urbanized land cover. Although the features of urban form may refer to both the building and transport sectors, the energy implications of changing temperatures are mainly felt through varying thermal loads in buildings. Under this scope, Oke [27] simulated night cooling rates in rural and urban settings, and found that canyon geometry (captured through the sky view factor, SVF) is a relevant factor in determining the existence of heat islands. In line with this, Hu et al. [28] combine parametric modeling and optimization, allocating a given amount of floor area to different urban configurations. The UHI effect is also analyzed based on the SVF. Wong et al. [29] assessed the influence of buildings' surroundings to predict air temperature variation and analyze how it affects building energy consumption. The authors found that urban form variables can have an impact of up to 0.9–1.2 °C on the air temperature. In a different way, Akbari et al. [30] use a climate model to predict the climate response to modifications of urban surface albedo.

With regard to transport, simulation studies generally predict travel outcomes based on assumed relationships between urban form and travel [31]. Land-use transport (LUT) models have been developed since the 1960s, relying on land use components for predicting travel demand. For a more detailed review on land use, transport and energy modeling, see Ghauche [32]. A review of transport models and the evolution of LUT may also be found in Sivakumar [33]. As in the case of buildings, engineering transport models are not deeply concerned with exploring the role of urban form in energy demand. The following examples capture a few aspects of urban form for estimating travel patterns. The MIT model iTEAM (Integrated Transport and Energy Activity-based Model) aims at predicting transport energy consumption. Urban form is captured through the location of households and firms and the existing transport infrastructure [32]. A different approach by Carty and Ahern [34] applies a cellular automata model based on spatial dynamic systems, MOLAND (Monitoring Land Use/Cover Dynamics). MOLAND uses digital maps to estimate future energy demand for transport through the likely development of land use. This has the advantage of being a spatially-explicit approach. Carty and Ahern [34,35] examine different scenarios of urban development in Greater Dublin, considering the implementation of a new transport plan, and estimate the resulting energy demand.

2.2. Statistical models

The body of research on urban form and energy has its roots in empirical statistical analysis focused on travel.
The seminal study by Newman and Kenworthy [1] correlated urban indicators and gasoline consumption in 10 U.S. cities, and found a variation of up to 40%. The authors concluded that the density of urban activities is one of the most important factors influencing energy consumption. Although this study has been criticized on the grounds that the effect of density is overrated (it neglects the influence of other urban attributes), density has since become one of the most prominent factors of urban form advocated to influence urban energy demand (e.g. [36–38]). Regression analysis is applied at different scales and in different contexts (e.g. [7,39,40]). Cervero [41] applies a stepwise regression to investigate how jobs-housing imbalances affect mobility patterns in 18 suburban employment centers. Frank and Pivo [42] consider both density and mixed land uses for all transport modes, through the application of different statistical techniques, such as linear correlation, multivariate regression models, stepwise analysis and cross-tabulation. Urban form and mode choice were significantly related, evidencing nonlinear relationships. Although this study reinforced the relationship between urban form and mode choice, the analysis could still benefit from considering a wider set of factors. Cervero and Kockelman [43] use a factor analysis to examine the influence of the 3D's (density, diversity and design), and Cervero [44] uses a binomial logit model weighting the influence of the 3D's against factors related to the generalized cost and socioeconomic attributes of travelers. The effect of the built environment was estimated by building one 'basic model' (traditional expression of utility in mode choice) and one 'expanded model' (including the built environment). The urban environment added significant explanatory power to the model (consistent with the 1997 research), revealing that the intensity and mix of land use significantly influence travel patterns.
The influence of urban design is usually more modest. Soltani and Allan [45] analyze the effect of micro-scale urban attributes on travel. This is done at the neighborhood scale, through a multinomial logit model considering urban form indicators, along with socio-demographics, to predict modal choice. Finally, Fang et al. [46] investigate the relationship between macro features of urban form and aggregated CO2 emissions, resorting to a panel data model for the period 1990–2010. The authors concluded that the presence of green areas, urban compactness and complexity significantly influence CO2 emissions.

Structural equation models (SEM) are more flexible than conventional statistical techniques. Golob [47] reviews the application of SEM to travel behavior research, claiming that SEM have substantial potential in activity-based modeling. Cervero and Murakami [48] apply SEM to investigate the effect of the built environment on vehicle miles traveled (VMT). More recently, Lee and Lee [49] apply multilevel SEM to assess the influence of urban form on GHG emissions in the U.S., from both the transport and the residential sectors.

In the case of buildings, Zhao and Magoulès [17] claim that statistical models serve three key purposes: predicting energy use from simplified variables, predicting useful energy indexes, and estimating building parameters related to energy use. Hsu [50] discusses the strengths and weaknesses of simulation models compared with statistical ones. The author advocates that statistical models are useful for overcoming shortcomings such as model complexity. Kazanasmaz et al. [51] consider urban form and design to some extent in determining the energy performance of residential buildings in a Turkish city, by investigating its relation with architectural features. Such features included: zoning status, orientation, floor counts, area/volume ratio, construction year, and floor and window area. The analysis includes techniques such as ANOVA, t-test and regression analysis.
From the ANOVA and the t-test, it was concluded that energy use was dependent on five out of eight architectural indicators. Engvall et al. [52] apply a hierarchical cluster analysis, followed by a regression analysis, to look for dependencies between selected variables and residential energy consumption. Building design is to some extent captured by the 'building period' and 'type of building' indicators. The effect of urban form on the UHI has also been investigated using regression analysis [53]. Here, the variables refer to both the building (e.g. year of construction or number of bedrooms) and the transport sectors (e.g. street intersections).

2.3. Data mining

Data mining deals with complex datasets, and is often an extension of statistical methods: “Data mining is a step in the KDD (Knowledge Discovery in Database)¹ process that consists of applying data analysis and discovery algorithms that produce a particular enumeration of patterns (or models) over the data.” [55]. Liao et al. [56] reviewed a decade of data mining techniques and applications from 2000 to 2011, claiming that data mining techniques have originated a branch called artificial intelligence. Chen et al. [57] classify data mining in two main categories: (i) statistical models and (ii) artificial intelligence methods, or machine learning. The latter are grounded on a 'learning' process from a training dataset, from which the 'machine' is able to identify and generalize patterns. Methods can be included in two main categories: supervised and unsupervised learning. The former uses input data that is linked to known output data. The latter has no correct answers or a clear measure of success. Data mining has been considered increasingly attractive, due to the growing availability of computational power, with applications in a variety of fields.

¹ “Knowledge discovery in databases (KDD) is a multidisciplinary research field for nontrivial extraction of implicit, previously unknown, and potentially useful knowledge from data” [54].
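To make the supervised-learning idea above more concrete, the following is a minimal sketch of a one-hidden-layer neural network trained by stochastic gradient descent, written in plain Python. It is an illustration only and not the model used in this study: the two normalized inputs (a floors indicator and a mixed-use indicator), the synthetic target rule, the network size and the learning rate are all assumptions made up for the example.

```python
import math
import random


def make_mlp(n_in, hidden, seed=0):
    """Create weights for a small network: tanh hidden units, linear output."""
    rng = random.Random(seed)
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)],
        "b1": [0.0] * hidden,
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(hidden)],
        "b2": 0.0,
    }


def forward(net, x):
    """Return the hidden activations and the scalar prediction for input x."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(net["w1"], net["b1"])]
    y = sum(w * hi for w, hi in zip(net["w2"], h)) + net["b2"]
    return h, y


def train(net, samples, targets, epochs=2000, lr=0.05):
    """Supervised training: each input is paired with a known target output."""
    for _ in range(epochs):
        for x, t in zip(samples, targets):
            h, y = forward(net, x)
            err = y - t  # gradient of 0.5 * (y - t)**2 with respect to y
            for j in range(len(h)):
                # Backpropagate through the tanh unit before updating w2[j].
                grad_pre = err * net["w2"][j] * (1.0 - h[j] ** 2)
                net["w2"][j] -= lr * err * h[j]
                for i in range(len(x)):
                    net["w1"][j][i] -= lr * grad_pre * x[i]
                net["b1"][j] -= lr * grad_pre
            net["b2"] -= lr * err


def mse(net, samples, targets):
    """Mean squared error of the network on the given data."""
    return sum((forward(net, x)[1] - t) ** 2
               for x, t in zip(samples, targets)) / len(samples)


# Synthetic toy data (hypothetical): [normalized floors, normalized mixed-use index].
samples = [[0.1, 0.2], [0.3, 0.8], [0.5, 0.5], [0.7, 0.1], [0.9, 0.9], [0.2, 0.6]]
# Made-up rule standing in for an energy-demand response; not taken from the paper.
targets = [0.8 - 0.5 * f + 0.2 * m for f, m in samples]

net = make_mlp(n_in=2, hidden=4)
untrained_mse = mse(net, samples, targets)
train(net, samples, targets)
trained_mse = mse(net, samples, targets)
print(f"MSE before training: {untrained_mse:.4f}, after: {trained_mse:.4f}")
```

In a real application the data would be split into training and validation sets, and a dedicated library would normally replace this hand-written training loop.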


Table 1
Existing modeling approaches considering urban form and energy demand.

Engineering models: Building simulation; Land-use & transport model; Urban climate simulation

Data mining: Support vector regression/machines (SVR/SVM); Artificial Neural Networks (ANN); Classification and regression trees (CT/RT); Chi-squared automatic interaction detection (CHAID); General linear models (GLM); Genetic programming (GP)

Statistical models: Linear correlation; Regression analysis; Stepwise regression analysis; Logit models; ANOVA; t-test; Factor analysis; Cross-tabulation; Panel data; Structural equation models

Chou and Bui [58] applied and compared various data mining techniques (support vector regression (SVR), ANN, classification and regression tree, chi-squared automatic interaction detector, and general linear regression) to assess the energy performance of twelve building types. The best performing methods were combined into ensemble models.² Input data consisted of physical and design aspects of buildings. The ensemble model (SVR + ANN) performed better for predicting the cooling load, and the SVR returned better results for predicting the heating load.

Neural networks are the most popular data mining technique [56]. Karatasou et al. [59] combine ANN with other statistical processes for building energy prediction. The authors use datasets with environmental variables from two different buildings. Kalogirou [60] explores ANN for addressing building design issues, enabling stakeholders to get a quick insight into the effect of a certain change on building performance. Additional applications of neural networks to energy in buildings can be found in Issa et al. [61], Kalogirou [62], and Yalcintas and Akkurt [63]. Alternatively, support vector machines (SVM) have been used for building load estimation [64], producing better results than ANN and genetic programming. Zhao and Magoulès [17] review methods for predicting building energy consumption. They concluded that artificial neural networks (ANN) and SVM can provide highly accurate predictions in nonlinear problems, in contrast to statistical methods. They highlight the suitability of artificial intelligence methods for building energy prediction.

With regard to transport, Dougherty [65] reviewed applications of ANN, which include parameter estimation, vehicle detection/classification, traffic pattern analysis, traffic forecasting, transport policy and economics, and traffic control. The application of machine learning to energy in transport is not as frequent as in the case of buildings.
A couple of exceptions are Geem [66], who uses ANN to estimate energy demand in South Korea from socio-economic variables and transport-related indicators at a macro level, and Murat and Ceylan [67], who use ANN in a similar approach in Turkey. Urban form is neglected in both cases.

Data mining has also been applied to urban form, although without a focus on energy so far. Sokmenoglu et al. [68] use data mining for identifying patterns in the distribution of urban attributes (e.g. predicting the use of a first floor, given the use of the ground floor and a density index). Gil et al. [69] use various features of the urban environment to classify streets and neighborhoods into different typologies.

2 Ensemble models use a combination of models aiming to improve the performance of the resulting model.
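The ensemble idea in footnote 2 can be illustrated with a small sketch that averages the predictions of two regressors. The two "models" below are hypothetical stand-ins (simple hand-written functions, not SVR or ANN), and the data are made up; the point is only to show how errors of opposite sign partly cancel when predictions are averaged.

```python
def model_a(x):
    """Hypothetical regressor that tends to overestimate."""
    return 2.0 * x + 0.3


def model_b(x):
    """Hypothetical regressor that tends to underestimate."""
    return 2.0 * x - 0.2


def ensemble(models):
    """Return a predictor that averages the outputs of the given models."""
    def predict(x):
        return sum(m(x) for m in models) / len(models)
    return predict


def mean_abs_error(model, xs, ys):
    """Mean absolute error of a predictor over paired inputs and targets."""
    return sum(abs(model(x) - y) for x, y in zip(xs, ys)) / len(xs)


xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0 * x for x in xs]  # made-up "true" values for the illustration

combined = ensemble([model_a, model_b])
errors = {
    "model_a": mean_abs_error(model_a, xs, ys),
    "model_b": mean_abs_error(model_b, xs, ys),
    "ensemble": mean_abs_error(combined, xs, ys),
}
print(errors)
```

Averaging helps here because the two stand-in models err in opposite directions; when the errors of the combined models are strongly correlated, an ensemble offers little or no gain.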

2.4. Synthesis

There is an array of energy modeling approaches that include morphological parameters, although the majority are not focused on exploring the contribution of urban form to the overall energy balance. Existing models vary in scope, structure, complexity, level of detail and data requirements. The way urban form is included in such models is not consistent, but the studies primarily concerned with exploring the effect of urban form typically have a statistical nature (Table 1).

Traditionally, energy in buildings and in transport is analyzed separately. Research on buildings predominantly uses engineering models, which are complex and data-intensive. In this case, the physical relationships are better known. The effect of urban form is often marginally captured through building geometry, with a few studies considering building surroundings. In the case of transport, there is a tradition of exploring the effect of form attributes on energy, predominantly resorting to statistical techniques. Here, the interactions amongst different variables are quite complex and are not fully understood.

Data mining has been increasingly adopted, with proven results in a diversity of fields. Research on urban energy is concentrated on predicting energy use in buildings, for which data mining seems to be a feasible alternative to engineering models. Applications in transport are scarcer. Some techniques are suggested to work as design tools that can inform decision-making. The potential of data mining to uncover the relationship between urban form and energy has not been explored yet. Its application to this field could help to generate deeper knowledge on a debate that has lasted over two decades, significantly contributing to inform urban policymaking.

3. Proposal of a methodological framework for modeling the relationship between urban form and energy

The methodological framework proposed in this section aims at addressing a gap in the literature by extensively describing the urban environment, and modeling a set of urban form variables from the built environment and transport networks. Different modeling techniques will be compared to infer the magnitude of the influence of urban form on energy demand (for heating and cooling in buildings, and travel). The energy uses that are not expected to be significantly affected by urban form will not be considered (e.g. domestic hot water or electrical appliances).

The methodological framework is spatially-explicit and encompasses six steps. The first consists of the definition of the indicators of urban form with energy relevance and the respective metrics. This step constitutes the pillar on which the following steps are grounded. The remaining steps have an operational nature (Fig. 1) and will be detailed in the following subsections.

Fig. 1. Operational processes within the methodological framework.

The selection of the indicators is made concerning two major areas: the built environment and urban networks. The built environment is related to the properties resulting from the physical arrangement of the built structures, which directly influences the arrangement of unbuilt areas; this is why green areas are included in this category. Urban networks are related to the characteristics of transport infrastructure, independently of the transport mode.

Density is likely the most prominent and frequently cited urban attribute acknowledged to influence energy demand in cities [1,42,48,70,71]. It is covered through three indicators aiming at addressing different configurations of density [72]. While density refers to land-use intensity, granularity provides a picture of the size of key urban elements (blocks and plots) and the way they are subdivided (e.g. [43,73]). Diversity is dependent on the urban functions and on the proportion of residential and nonresidential uses [43,42]. Green areas are advocated to affect the urban microclimate [74] and avoid the urban heat island effect [53]. Compactness, passive areas and shading are lower-scale attributes (relating to a building or building section) affecting the thermal balance of buildings, by determining heat losses or gains [75,76,23]. Orientation influences solar gains, and consequently space heating requirements [77,76,23].

As for urban networks, connectivity, defined as the degree to which two points communicate with each other, influences travel patterns by determining trip distances [78] and walkability [8].
Accessibility relates to the ease of reaching desired destinations or opportunities, and is measured through three different indicators: DivAct counts the number of activity types reachable, out of a set of activity types most relevant for travel demand, for different transport modes [79]; Reach measures the built volume reachable within a distance, as a proxy for urban destinations [80]; Proximity to public transport is a proxy for public transport accessibility [8]. Centrality refers to the distance to the Central Business District (CBD). Greater distances to the CBD are associated with an increased use of energy for commuting [81–83], especially in monocentric cities. Finally, design includes small-scale attributes of urban form. While research has not explored lower-scale attributes as much as macro ones, they are claimed to influence travel patterns [43,84,85] and building energy needs [86].

Table 2 presents the selection of urban form indicators and metrics with energy relevance (all variables are quantitative). It draws on a systematization of existing literature on urban form and energy [87] and aims at describing the urban environment in an extensive way.

3.1. Database preparation

A structured database is the first requirement for the subsequent steps. The database preparation entails two different processes: (i) data collection and (ii) assembling and structuring the database. Here, the desired level of detail should be defined. All indicators should be measured for a previously defined urban spatial unit, corresponding to relatively homogeneous areas in terms of urban form and structure (e.g. an urban block). The database is likely to include data from different sources, which have to be harmonized beforehand and tailored for the spatial analysis.

The data needed include spatial data and statistical data (for urban form variables) and energy data (response variables). Typical spatial data include buildings' footprints, land uses, and the street network. Statistical data should make it possible to compute the indicators from Table 2. Regarding energy demand, two types of data may feed the model: 'real' data, from official sources at a disaggregated level (e.g. surveys, metering, or energy certification schemes); and 'artificial' data (from specific models), when the former is not available. Despite being noisier, official data is considered more reliable. Also, using energy needs is deemed preferable to using real consumption patterns, because the latter depend on a larger set of factors [88]. In line with this, data from energy certification of buildings refers to thermal energy needs and is a major source of information.

The database preparation per se consists of merging the different data sources, and ensuring that all data has a spatial location and a proper ID. In this way, it can be imported into the geographic information system (GIS) and later exported into spreadsheet format to be modeled.

3.2. Spatial analysis

The spatial analysis consists of measuring the indicators of urban form that will enter the model as predictors. This step involves some degree of expertise in GIS, and entails some preliminary operations before the measurement of the indicators. These include actions such as clipping, deleting, merging, and joining features and tables. When operating on spatial data, some mismatches may occur, depending on data resolution and quality. Corrections may be necessary in order to remove conflicts between databases (e.g. road axes without contiguity, or buildings located on top of a road lane), as well as to delete irrelevant data features that may introduce noise (e.g. removing building annexes from the building cartography).

The spatial analysis per se consists of measuring, in the GIS, the urban form indicators previously selected. This is performed using physical attributes such as location, areas, volumes, and distances, which are intrinsic to spatial data, or using the statistical data that has been imported into the GIS for each of the spatial units, each corresponding to an encoded layer file. Although the results have a graphic visualization, the outputs of this step are also numeric and will serve as inputs for the model. This is the main advantage of a spatially-explicit method. For each indicator, a map can be produced, classified into an appropriate range of intervals for visual analysis.


Table 2 Selected factors of urban form with energy relevance.

| Focus area | Factor | Indicator | Relevance scale | Metric | Sources |
|---|---|---|---|---|---|
| Built environment | Density | Ground space index | Bl, N, C | Building coverage area/total ground area | [72] |
| | Density | Floor space index | Bl, N, C | Gross floor area/total ground area | [72] |
| | Density | Building height | Bu, Bl, N, C | Average number of building floors | [72] |
| | Granularity | Subdivision indicator | Bl, N, C | # of parcels/total ground area | [73] |
| | Granularity | Block size | Bl, N, C | Total area (m²) | [104] |
| | Diversity | Mixed-use Index | Bl, N, C | Residential gross floor area/total GFA | [105] |
| | Distribution of green areas | Spatial distribution of green areas | Bl, N, C | $A_i/S_i$, where $A_i$ is the area (m²) of green areas and $S_i$ is the total block area | (adapted from [73]) |
| | Compactness | Surface-to-volume ratio | Bu, Bl, N, C | $STV = S/V$ | [23] |
| | Passivity | Passive/Non-passive ratio | Bu, Bl, N, C | Passive areas/Non-passive areas | [23] |
| | Shading | Urban Horizon Angle | Bu, S, Bl, N, C | $\tan(UHA) = H/W$, where $H$ is building height and $W$ is street width | [75] |
| | Orientation | % Favorable orientation | Bu, Bl, N, C | % of façade length facing S/E/W | (authors) |
| Urban networks | Connectivity | Beta index | S, Bl, N, C | $\beta = e/n$ $(0 \le \beta \le (n-1)/2)$, where $\beta$ is the average number of edges ($e$) per node ($n$) in a given street network | [106] |
| | Accessibility | DivAct | Bl, N, C | $\mathrm{DivAct} = \sum_y (Act_y \cdot f_y) / \sum_y f_y$, where $y$ is the activity type, $Act_y$ a value representing the existence or not of the activity type $y$ inside accessibility boundaries ($Act_y \in \{0; 1\}$) and $f_y$ the potential use frequency of the activity type | [79] |
| | Accessibility | Reach | Bu, S, Bl, N, C | $\mathrm{Reach}[i] = \sum_{j \in G - \{i\};\ d[i,j] \le r} W[j]$, where $d[i,j]$ is the shortest path distance between nodes $i$ and $j$ in $G$, and $W[j]$ is the weight of a destination node $j$ | [80] |
| | Accessibility | Public transport route density | Bl, N, C | # PT stops/total block area | [8] |
| | Centrality | Distance to CBD | Bl, N, C | Average shortest path (m) between each building in a tract and the CBD gravity center | [8] |
| | Design | Sidewalk/street ratio | S, Bl, N, C | Average sidewalk area/total block area | [43] |
| | Design | Proportion of street length with provision of trees | S, Bl, N, C | Street length with trees/total street length | (adapted from [43]) |
| | Design | Proportion of street length with provision of cycling lanes | S, Bl, N, C | Street length with cycling lanes/total street length | (adapted from [43]) |
| | Design | Parking lots per spatial unit | S, Bl, N, C | # parking places/total block area | [43] |

Bu = Building; S = Street; Bl = Block; N = Neighborhood; C = City.
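Several of the metrics in Table 2 reduce to simple ratios or sums once the GIS has produced the underlying areas, counts and distances. The following is a minimal sketch of a few of them, written directly from the expressions in the table; the function names, argument names and sample values are assumptions made for illustration, and in practice these quantities would be computed inside the GIS.

```python
import math

def ground_space_index(footprint_area, ground_area):
    """GSI: building coverage area / total ground area."""
    return footprint_area / ground_area

def floor_space_index(gross_floor_area, ground_area):
    """FSI: gross floor area / total ground area."""
    return gross_floor_area / ground_area

def beta_index(n_edges, n_nodes):
    """Beta index: average number of edges per node in a street network."""
    return n_edges / n_nodes

def urban_horizon_angle(height, street_width):
    """UHA in degrees, from tan(UHA) = H / W."""
    return math.degrees(math.atan2(height, street_width))

def reach(origin, nodes, distance, weight, radius):
    """Sum of the weights of all nodes j != origin with d[origin, j] <= radius."""
    return sum(
        weight[j]
        for j in nodes
        if j != origin and distance[(origin, j)] <= radius
    )

# Hypothetical block: 10,000 m2 of ground, 3,500 m2 footprint, 9,300 m2 GFA
gsi = ground_space_index(3500.0, 10000.0)
fsi = floor_space_index(9300.0, 10000.0)

# Hypothetical network distances (m) and built volumes (m3) used as weights
distance = {("a", "b"): 300.0, ("a", "c"): 700.0}
weight = {"b": 1000.0, "c": 2000.0}
reach_a = reach("a", ["a", "b", "c"], distance, weight, radius=500.0)
```

Here only node `b` lies within the 500 m radius of `a`, so `reach_a` counts the weight of `b` alone.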

3.3. Extraction of metrics

After measuring the indicators in the GIS, the data should be gathered and exported into a format suitable for modeling. This is done through the ID code assigned to the area units during the spatial analysis. A suitable export format is, for instance, an MS Office Excel™ spreadsheet. Although most GIS software cannot export directly to spreadsheet format, the export can be done in two phases: first, the data is exported into a database management table that is compatible with MS Excel™, and then it is saved as a spreadsheet file. These files are also compatible with most modeling software (e.g. SPSS, R, MATLAB, ...). At this point, ‘atypical’ spatial units are removed from the subsequent steps of the analysis, although they can still be considered when measuring urban form indicators. These are blocks exclusively occupied by green or public spaces, as well as blocks exclusively allocated to services or commercial uses, such as hospitals or faculties. The energy behavior in these areas would not be comparable to that of the remaining parts of the city, as it is determined by a different set of factors. Removing these areas allows the model to predict more accurately the energy demand in residential or mixed-use urban blocks.

3.4. Modeling

The fifth step corresponds to specifying and running the model. This research specifically aims at obtaining quantitative knowledge on the influence of urban form on energy demand (for specific relevant uses). The model is formulated considering energy demand for heating, cooling and travel as a function of a set of urban form variables, as follows:

\[ (EB_h, EB_c, ET) = f(UF_i) \tag{1} \]

where EBh are final energy needs for heating in buildings, EBc are final energy needs for cooling in buildings, and ET are travel needs that may be translated into vehicle kilometer or final energy for travel. UFi are the variables of urban form considered. This can be designated as a function fitting prediction problem. Artificial Neural Networks (ANN) arise as a suitable technique to address it. An ANN uses principles believed to be used in the human brain [89], and usually consists of a layer of input neurons, a layer of output neurons and one or more hidden layers. ANN relate inputs and outputs through a learning process, where an algorithm is able to map the data patterns from the inputs towards the target outputs. ANN are claimed to be cost-effective tools for analyzing nonlinear problems, enabling to deal with complexity with relatively little computational and time effort. Despite their proven usefulness and accuracy in solving complex problems and dealing with noisy data, they have been criticized for working as a ‘black box’, i.e. the user is provided with little information on the nature of relationships between the response variables and predictors [90]. The definition of the net topology, i.e. the number of layers and neurons, is an important task. Too many layers and neurons may overfit the net, while too few can reduce the network’s ability to map the target outputs – leading to greater errors [91]. Kalogirou


[60] advocates that the number of neurons in the hidden layer should be approximately the average of the inputs and outputs. However, this is also dependent on the number of training cases. It is not unusual to find in the literature that the selection of the net architecture has been done by testing different combinations in an iterative process [92]. The backpropagation ANN with a single hidden layer is claimed to be a universal function approximator [90]. These nets ground on three stages: a feedforward of the input training pattern, the computation and propagation of the error, and the adjustment of the weights. The Levenberg-Marquardt algorithm is often referred to be simple and robust for approximating a function. The application of neural networks is becoming easier with the use of suitable software for processing and analyzing large amounts of data. The Neural Network Toolbox for MATLAB used in this study is the most complete software package to date [54]. 3.5. Knowledge discovery The last stage of the methodology is to convert the results of the modeling phase into knowledge. The application of data mining to the urban form/energy model is expected to retrieve results in the form of weights, assigning the relative influence of the urban features on the energy demand for thermal needs in buildings and for travel. The final model works as a context-specific model that enables to predict the energy behavior of a given urban setting, by using a set of characteristics of the urban environment. The model can be used for performing a sensitivity analysis to account for the effect of changes in the variables of urban form on the energy performance of the city. This type of model has a huge potential as a decision-support tool. It uses simple metrics computed from data that is generally available from local authorities, and has the ability of dealing with large datasets to find intricate patterns in spatial data. 
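The sensitivity analysis mentioned in Section 3.5 can be illustrated with a one-at-a-time perturbation: each urban form variable is increased by a fixed fraction while the others are held at their baseline, and the relative change in predicted energy demand is recorded. The sketch below assumes a generic `predict` callable; `toy_model` and its coefficients are arbitrary placeholders for illustration and are not the trained network or results of this study.

```python
def one_at_a_time_sensitivity(predict, baseline, delta=0.10):
    """Relative change in the prediction when each input rises by `delta`.

    `predict` maps a dict of input values to a single number;
    `baseline` is the dict of reference input values.
    """
    base_output = predict(baseline)
    effects = {}
    for name, value in baseline.items():
        perturbed = dict(baseline)
        perturbed[name] = value * (1.0 + delta)
        effects[name] = (predict(perturbed) - base_output) / base_output
    return effects

def toy_model(x):
    # Hypothetical stand-in with arbitrary coefficients, NOT the paper's model
    return 200.0 - 20.0 * x["F"] + 50.0 * x["STV"]

baseline = {"F": 2.69, "STV": 0.26}
effects = one_at_a_time_sensitivity(toy_model, baseline)
```

With a trained model in place of `toy_model`, the same loop would give a first indication of which planning variables move the predicted demand the most, although it ignores interactions between variables.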
In order to translate the results produced into practical knowledge, it should be interpreted in the light of urban planning and development policies in order to anticipate how desired improvements could be achieved, i.e. which particular measures could be suitable for attaining more efficient urban development paths. To provide further insights on the applicability of this methodology, the following section presents its application to the city of Porto in Portugal. 4. Application of the methodology to the city of Porto 4.1. Study area The city of Porto is located in northern Portugal and is the second most important city following the capital, Lisbon. During the 70 s and the 80 s, the city experienced a tertiarisation process, and a trend of population decline up to current days (mainly felt in the city center). The masterplan of Porto, the so-called Plano Diretor Municipal (PDM), considers ten types of land uses (e.g. historical areas, consolidated areas, areas in the process of consolidation, detached housing areas, and areas of isolated buildings). It entails a detailed description of the land uses at the block level, often including different types for a single block. The land uses follow a set of morphological criteria, enabling to assume that a given land use type roughly corresponds to an urban tissue. Considering energy use, some key documents have been produced with the aim of characterizing the existing patterns, as part of the municipality’s commitment as a signatory of the Covenant of Mayors. The energy matrix of Porto describes the energy use considering its sources, vectors, sectors and end uses. Buildings stand out as the largest primary energy users (24% for households and

29% in services), while transports account for roughly 37%. In residential buildings, ambient heating amounts to 20% of final energy demand, whereas cooling represents 1% of final energy. With regard to transports, about 55% of final energy is used by light duty-vehicles; from which roughly 74% correspond to passenger car and 26% to freight. Public transport accounts for 7% of final energy in the transport sector [93]. 4.2. Data and sample For the application of the methodology to the city of Porto, the data collected included spatial data, statistical data and energy data. Spatial data on key urban elements for Porto was made available by the municipal authority, consisting of the buildings’ footprints, the municipal master plan, and the street network. The spatial units for the analysis were obtained in vector format from the National Statistics Office (INE – Instituto Nacional de Estatística), with 1831 units being considered. These features are encoded with an ID field. The spatial units correspond to the lowest level of disaggregation of statistical data with a relatively homogeneous physical structure, the so-called subsecções estatísticas, roughly corresponding to an urban block. They enable to assume that the morphologic characteristics are constant for each unit. As a consequence, it is expected that the existing variations in the urban form indicators, lead to a certain degree of variation in the corresponding energy demand. Additional statistical data included building’s heights, functions, age and population. The variables of urban form were calculated according to Table 2. With regard to energy, the national building certification system is a major source of information on building energy needs. From an original sample of nearly 18.000 certificates made available by the National Energy Agency (ADENE – Agência para a Energia), about 12.500, with valid entries and properly georeferenced, were used. 
This database includes the certificates of the existing buildings since the enforcement of the certification scheme in the national law (2006) until the year of collection (2015). The energy requirements given by the certificates do not refer to a specific year, but instead to nominal energy needs (for heating and cooling) under specific conditions (e.g. a temperature set point) for typical yearly local climatic conditions (e.g. average HDD). In order to match the data to the spatial units under analysis, the average energy demand for heating and cooling was considered. High-resolution data on mobility is quite difficult to collect. A suitable data format is an OD matrix, which was obtained from a traffic assignment model (private vehicles) calibrated for the case study through traffic counts (originally dating to 2005, and periodically updated ever since). It results from a cooperation between the University of Porto and the local municipal authority for the characterization of the mobility patterns in the city, used in official publications [94]. The OD matrix refers to the morning peak hour and has 89 traffic zones. These were downscaled into the spatial units considered, through a population ratio-based normalization. Finally, the annual average daily traffic (AADT) flows were estimated. Table 3 presents the summary statistics of the independent and dependent variables considered (see Eq. (1)). Both the variables of urban form and of energy demand evidence a large variability, given the figures of the standard deviation. Energy for travel (ET) shows the largest variability. Some spatial units had missing values for the energy needs in buildings, whereas transports were better covered. 4.3. Methods and accuracy metrics Before applying the ANN, a couple of statistical methods were tested: a Multiple Linear Regression (MLR) and a Structural


Table 3 Descriptive statistics of the variables considered.

| Code | Variable description | N | Minimum | Maximum | Mean | Std. deviation |
|---|---|---|---|---|---|---|
| EBh | Final energy needs for heating (kWh/m²/yr) | 1307 | 16.8 | 354.4 | 130.3 | 53.3 |
| EBc | Final energy needs for cooling (kWh/m²/yr) | 1307 | 0 | 9.8 | 2.8 | 1.4 |
| ET | Final energy needs for travel (kWh/m²/yr) | 1831 | 0 | 990.3 | 25.7 | 37.7 |
| GSI | Ground Space Index | 1831 | 0 | 4.29 | 0.35 | 0.28 |
| FSI | Floor Space Index | 1831 | 0 | 10.70 | 0.93 | 0.87 |
| F | Number of Floors | 1831 | 0 | 9.00 | 2.69 | 1.16 |
| SS_Area | Size of block | 1831 | 1.60 × 10^2 | 3.70 × 10^5 | 1.90 × 10^4 | 2.69 × 10^4 |
| SubDiv | Subdivision Indicator | 1831 | 0 | 3.12 × 10^-2 | 1.23 × 10^-3 | 1.70 × 10^-3 |
| MXI | Mixed-use Index | 1831 | 0 | 1.00 | 0.91 | 0.20 |
| AiSi | Spatial distribution of green areas | 1831 | 0 | 0.99 | 0.03 | 0.12 |
| STV | Surface-to-Volume Ratio | 1831 | 0 | 0.66 | 0.26 | 0.10 |
| Connect | Beta index | 1831 | 0 | 4.75 | 3.12 | 0.53 |
| DivAct_PT | DivAct Public transport | 1831 | 0 | 1.00 | 0.92 | 0.13 |
| DivAct_Ped | DivAct Pedestrian | 1831 | 0 | 1.00 | 0.80 | 0.15 |
| Reach | Reach (within 500 m radius) | 1831 | 0 | 2.80 × 10^6 | 7.32 × 10^5 | 4.96 × 10^5 |
| PT_RD | Public transport route density | 1831 | 0 | 1.54 × 10^-3 | 6.50 × 10^-5 | 1.29 × 10^-4 |
| Centrality | Linear distance to CBD | 1831 | 2.10 × 10^2 | 7.87 × 10^3 | 3.72 × 10^3 | 1.81 × 10^3 |
| Tree | Proportion of street length with provision of trees | 1831 | 0 | 1.00 | 0.18 | 0.29 |
| SW | Sidewalk area/block area | 1831 | 0 | 0.70 | 0.04 | 0.00 |

Equation Model (SEM). The preliminary statistical analysis aimed at providing a comparison to the ANN results and additional sensitivity on the relative explanatory power and significance of the predictors. Nevertheless, ANN seem to be an adequate technique for this problem, able to address nonlinear relationships and adding a prediction character that SEM cannot offer. The different modeling techniques were compared using selected fit measures (Table 4) and the determination coefficient. In (2), (3) and (6), ŷ_i is the predicted value of y and y_i is the real value of y. In (4) and (5), χ² is the chi-square and df are the degrees of freedom. Model is the model being evaluated and null refers to the null model, reflecting the null hypothesis which considers that the dependent variables are not related with the explanatory variables.

4.3.1. Multiple linear regression (MLR)

A Multiple Linear Regression (MLR) for each dependent variable in relation to urban form variables was applied, including the forward stepwise procedure. This consists of adding independent variables, step by step, in order to improve the quality of the model. This was performed in IBM® SPSS® Statistics 23. Three regression models were tested:

1. Bh = f (GSI, FSI, F, SS_Area, SubDiv, AiSi, STV, and Tree)
2. Bc = f (GSI, FSI, F, SS_Area, SubDiv, AiSi, STV, and Tree)
3. ET = f (GSI, FSI, F, SS_Area, SubDiv, MXI, AiSi, Tree, Connect, DivAct_PT, DivAct_Ped, Reach, PT_RD, Centrality, SW)

The ET model includes more variables than Bh and Bc, because the number of attributes of urban form influencing mobility identified in the literature is higher. In MLR, a coefficient of determination (R²) determined by the least squares criterion makes it possible to evaluate the quality of the model. In addition, the mean absolute percentage error, MAPE (2), and the root mean square percentage error, RMSPE (3), were used to verify model accuracy (Table 4). For further considerations on prediction errors please see Lan et al. [95].

4.3.2. Structural equation models (SEM)

Structural Equation models are based on linear methods, but are more flexible. However, they require the user to specify and map the relationships between the variables. Amos™ 23 was used for building the SEM. Two versions were considered:

1. A model using all the variables considered: (Bh, Bc, ET) = f (UF1, UF2, UF3, UFx).
2. A model considering covariates, using the stepwise approach such that the variables included in the model were: (Bh, Bc, ET) = f (GSI, FSI, F, SS_Area, SubDiv, STV, DivAct_Ped).

In the case of the SEM, the two accuracy measures are the Root Mean Square Error of Approximation (RMSEA) and the Comparative Fit Index (CFI) (Table 4). These are common metrics for evaluating the fit of SEM; however, they tend to penalize model complexity, favoring models with fewer parameters. RMSEA (4) is an absolute fit index based on the chi-square, with values closer to 0 being better. CFI (5) compares the developed model with the null model. CFI is normalized and ranges between 0 and 1, with higher values indicating a better fit.

4.3.3. Artificial neural networks (ANN)

To the best of our knowledge, machine learning has not been applied to investigate the effect of urban form on energy demand. Here, ANN will be applied to assess the influence of a

Table 4 Fit measures for evaluation of the different modeling techniques.

| Fit measure | Expression | Eq. | Modeling technique |
|---|---|---|---|
| Mean Absolute Percentage Error (MAPE) | $\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n} \left\lvert \frac{y_i - \hat{y}_i}{y_i} \right\rvert \times 100$ | (2) | MLR, ANN |
| Root Mean Square Percentage Error (RMSPE) | $\mathrm{RMSPE} = 100 \times \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left( \frac{y_i - \hat{y}_i}{y_i} \right)^2}$ | (3) | MLR, ANN |
| Root Mean Square Error of Approximation (RMSEA) | $\mathrm{RMSEA} = \sqrt{\max\left[ \frac{(\chi^2/df) - 1}{n - 1},\ 0 \right]}$ | (4) | SEM |
| Comparative Fit Index (CFI) | $\mathrm{CFI} = 1 - \frac{\max\left[ (\chi^2_{model} - df_{model}),\ 0 \right]}{\max\left[ (\chi^2_{null} - df_{null}),\ (\chi^2_{model} - df_{model}),\ 0 \right]}$ | (5) | SEM |
| Mean Squared Error (MSE) | $\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2$ | (6) | MLR, ANN |
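For readers who want to reproduce the fit measures of Table 4, the sketch below implements expressions (2) to (6) in plain Python, following their standard definitions. The function and argument names are illustrative assumptions. Note that the percentage errors divide by the observed value, so observations with y_i = 0 would need separate handling, which is not addressed here.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, expression (2)."""
    n = len(actual)
    return 100.0 / n * sum(abs((y - p) / y) for y, p in zip(actual, predicted))

def rmspe(actual, predicted):
    """Root mean square percentage error, expression (3)."""
    n = len(actual)
    squared = sum(((y - p) / y) ** 2 for y, p in zip(actual, predicted))
    return 100.0 * math.sqrt(squared / n)

def mse(actual, predicted):
    """Mean squared error, expression (6)."""
    n = len(actual)
    return sum((y - p) ** 2 for y, p in zip(actual, predicted)) / n

def rmsea(chi2, df, n):
    """Root mean square error of approximation, expression (4)."""
    return math.sqrt(max((chi2 / df - 1.0) / (n - 1), 0.0))

def cfi(chi2_model, df_model, chi2_null, df_null):
    """Comparative fit index, expression (5)."""
    numerator = max(chi2_model - df_model, 0.0)
    denominator = max(chi2_null - df_null, chi2_model - df_model, 0.0)
    # If both models fit perfectly the ratio is undefined; report a perfect fit.
    return 1.0 - numerator / denominator if denominator > 0 else 1.0

# Hypothetical values for illustration only
actual = [100.0, 200.0]
predicted = [110.0, 180.0]
errors = (mape(actual, predicted), rmspe(actual, predicted), mse(actual, predicted))
```

Because MAPE and RMSPE are relative to the observed value, small observed values inflate them sharply, which is worth keeping in mind when comparing end-uses with very different magnitudes.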


comprehensive set of variables of urban form on energy demand, given a large dataset. If the relationship between urban form and energy demand is found to be significant, such a model may make it possible to predict the energy impact of future urban planning choices. The ANN model includes urban form variables as inputs (predictors), while the outputs are energy demand for heating (Eh) and cooling (Ec) in buildings, and mobility (ET), according to (1). The data was normalized to the range between −1 and 1, in order to meet the requirements of the transfer function applied. The data sample for training, validating and testing the model was randomly split in proportions of 80%, 10% and 10%, respectively. The training set is used for computing the gradient and updating the weights and biases of the net. The validation set is targeted at avoiding overfitting: the validation errors are monitored during the training process, in order to keep them at a minimum (before they begin to rise again due to overfitting). Finally, the testing dataset works as a reference to be compared with the remaining models. If the error of the test phase reaches a minimum at a very different iteration, it may indicate a poor division of the data. In order to facilitate comparisons between different networks, a script was developed to run a pre-specified number of network trainings (e.g. 100, 200, 1000, ...), where the weights are randomly initialized, and the net with the lowest mean squared error, MSE (6), is selected. Different network architectures were tested, starting with a default number of 10 hidden neurons and gradually increasing it up to 15 (a larger number of neurons increases the model complexity and makes the interpretation of the network weights more difficult). Three algorithms suitable to this problem were also tested: the Levenberg-Marquardt (LM) backpropagation algorithm, the Scaled Conjugate Gradient (SCG) algorithm, and the BFGS quasi-Newton backpropagation.
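The data handling around the network training (scaling to the range between -1 and 1, a random 80/10/10 split, and repeated trainings with random initial weights from which the lowest-MSE net is kept) can be sketched independently of any neural network library. The helper names below are hypothetical, and `train_once` stands for whatever routine trains one network and returns its MSE; this is not the script used in the study, which was written for MATLAB.

```python
import random

def scale_to_unit_range(values):
    """Min-max scale a list of numbers to the interval [-1, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: map to the midpoint
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]

def split_indices(n, fractions=(0.8, 0.1, 0.1), seed=0):
    """Randomly partition range(n) into training, validation and test lists."""
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    n_train = int(round(fractions[0] * n))
    n_val = int(round(fractions[1] * n))
    return (
        indices[:n_train],
        indices[n_train:n_train + n_val],
        indices[n_train + n_val:],
    )

def best_of_restarts(train_once, n_restarts, seed=0):
    """Call train_once(rng) n_restarts times and keep the lowest-MSE result.

    train_once must return a (mse, model) pair; rng supplies the random
    initial weights for that run.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(n_restarts):
        candidate = train_once(rng)
        if best is None or candidate[0] < best[0]:
            best = candidate
    return best

scaled = scale_to_unit_range([0.0, 5.0, 10.0])
train_idx, val_idx, test_idx = split_indices(1000)
```

Fixing the seed makes a given split reproducible, which helps when comparing architectures on identical training, validation and test sets.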
While LM is advocated as a fast-converging algorithm that usually yields lower MSE values, it requires higher storage capacity. SCG is able to deal with a large number of weights, and BFGS is similar to LM but with lower storage requirements [96]. A net with 15 neurons in the hidden layer and a Levenberg-Marquardt (LM) training algorithm delivered the best results (Fig. 2). The results obtained are evaluated with key criteria similar to those used for the MLR. A coefficient of determination (R²) allows the explanatory power of the model to be inferred, while prediction errors allow its prediction power to be evaluated. The MAPE and the RMSPE were calculated using the testing dataset, for each of the dependent variables (Eh, Ec, ET). The mean squared error, MSE (6), was determined for the whole model (i.e. Eh, Ec, and ET modelled together). Several methods can be used to understand the degree of contribution of each predictor to explaining the variability observed in the dependent variables (for discussions on the topic see, for instance, Gevrey et al. [97] [98] [99]). Olden and Jackson [100] refer to three common methods for estimating the importance of predictors in ANN: the Neural Interpretation Diagram (NID), Garson’s algorithm, and sensitivity analysis. Following Fischer’s (2015) recent findings, Garson’s algorithm [101] was applied.

4.4. Results and discussion

The results obtained for the different modeling techniques are summarized in Table 5. Considering the MLR, the model on heating needs evidenced the strongest relationship, with an R² of 0.218. Although this may not be a strong correlation, for a linear model with several independent variables, and acknowledging that urban form is not the only factor determining energy demand, it is quite a significant result. If 22% of the variability of the energy demand for heating (Eh) is a result of urban form, the ability to better control it through more efficient planning policies could certainly lead to

Fig. 2. Architecture of the ANN model.

meaningful energy savings. Conversely, the cooling model (Ec) evidenced the weakest relationship and the worst accuracy results (RMSPE). This could be expected, as cooling needs may be more dependent on non-structural solutions (e.g. shading devices). The R2 for the travel model is relatively lower. Nevertheless, the standardized coefficients of the individual variables are in line with the literature, around 0.20 [8]. The stepwise MLR presented similar results to the ones from the simple MLR, with fewer variables. This suggests that only a few predictors evidence a significant explanatory power when using a linear approach. SEM was explored as a more flexible approach than MLR, enabling to model Eh, Ec and ET at the same time. All the individual determination coefficients increased significantly, especially for the response variables performing worse (Ec and ET). It is not evident whether SEM 2, presents a better fit than SEM 1 (Table 5). The different R2 obtained in SEM1 and SEM2 also indicate that, despite a more conspicuous influence of macro variables of urban form, especially those related to density (in line with the literature), there may still be an important contribution of other factors for determining the overall energy demand. In both cases, the model accuracy has room for improvement. This indicates that SEM may not be flexible enough to fully address the relationship between urban form and energy demand. The pioneer application of ANN to model thermal energy needs in buildings and mobility altogether retrieved promising results. Table 5 presents the accuracy metrics for the best ANN model (15 hidden neurons, LM training algorithm). The model selected had similar determination coefficients for the three modeling phases (training, validation and testing), and the minimization of the validation and testing errors was achieved at close iterations.


Table 5 Determination coefficients and accuracy metrics obtained for each model.

| Model | Metric | Output: Eh | Output: Ec | Output: ET | Outputs: Eh, Ec, ET |
|---|---|---|---|---|---|
| MLR 1 | R² | 0.218 | 0.025 | 0.099 | – |
| | MAPE | 32.7% | 66.5% | 70.6% | – |
| | RMSPE | 64.7% | 892.6% | 86.8% | – |
| | Input var. included | GSI, FSI*, F*, SS_Area, SubDiv, AiSi, STV*, Tree | GSI, FSI*, F*, SS_Area, SubDiv, AiSi, STV, Tree | GSI*, FSI*, F*, SS_Area*, SubDiv, MXI*, AiSi, Connect, DivAct_PT, DivAct_Ped*, Reach, PT_RD, Centrality, Tree, SW | – |
| MLR 2 (stepwise) | R² | 0.214 | 0.016 | 0.097 | – |
| | MAPE | 32.7% | 67.0% | 69.8% | – |
| | RMSPE | 66.1% | 934.2% | 86.6% | – |
| | Input var. included | F, FSI, STV, SS_Area | GSI, SubDiv | DivAct_Ped, MXI, SS_Area, GSI, F, FSI | – |
| SEM 1 | R² | 0.25 | 0.16 | 0.18 | – |
| | RMSEA | – | – | – | 0.211 |
| | CFI | – | – | – | 0.042 |
| | Input var. included | | | | GSI, FSI, F, SS_Area, SubDiv, MXI, AiSi, STV, Connect, DivAct_PT, DivAct_Ped, Reach, PT_RD, Centrality, Tree, SW |
| SEM 2 (stepwise) | R² | 0.20 | 0.04 | 0.08 | – |
| | RMSEA | – | – | – | 0.233 |
| | CFI | – | – | – | 0.704 |
| | Input var. included | | | | GSI, FSI, F, SS_Area, SubDiv, STV, DivAct_Ped |
| ANN (15 neurons + LM algorithm) | R² | 0.48 | 0.12 | 0.37 | 0.78 |
| | MAPE | 29.1% | 47.3% | 43.3% | 40.2% |
| | RMSPE | 44.9% | 79.1% | 62.9% | 63.8% |
| | MSE | – | – | – | 0.038 |
| | Input var. included | | | | GSI, FSI, F, SS_Area, SubDiv, MXI, AiSi, STV, Connect, DivAct_PT, DivAct_Ped, Reach, PT_RD, Centrality, Tree, SW |

* Significant at probability level of 0.05.

This avoids overfitting and ensures a good split of the data. The normality of the errors was confirmed. The ANN results corroborate the existence of significant relationships between urban form and energy needs (with higher R2 than those from previous methods), whereas the individual prediction errors highlight the suitability of ANN in relation to the remaining techniques. It shows that a model relying only on variables of urban form enables to estimate with accuracy the energy demand of relevant urban end-uses, at a high resolution level. The R2 of the global model confirms that the combined effect of urban form may be quite large (in line with [8]). In this case, it explains about 78% of the overall energy needs for buildings and transports. There are additional drivers of energy demand (e.g. behavioral and economic factors), which may explain the remaining variability. The goal was to investigate the influence of urban form on energy demand, quantifying its impact and describing the urban environment as extensively as possible, which was not expected to explain the full variability associated with energy demand. Note that heating and cooling data refers to energy needs. Considering the real consumption would likely find weaker relationships, as this is more dependent on technologic and behavioral factors. The results suggest that energy needs for cooling are those least influenced by urban form, for the latitude explored. Urban form primarily influences ambient heating (about 48%) and mobility (about 36%), which strongly supports the adoption of urban planning and urban form-oriented policies targeted at energy conservation. In order to further explore the contribution of each variable of

urban form on energy demand, the Garson’s algorithm was applied [101,102]. The method consists of splitting the hidden-output connection weights for each neuron in the hidden layer into components linked to each input. The contribution of each input for each output is computed through the normalized product of the input-hidden and the hidden-output connection matrixes (the full procedure is illustrated in Olden and Jackson [100]). The relative weight of each variable in the ANN model is presented in Fig. 3. Individual weights range from 4% to 9%. The most important variables are in line with the ones identified in the stepwise MLR. These are macro variables related to density (such as GSI, FSI, F) and granularity (Subdiv and SS_Area), along with building compactness (STV) and the pedestrian accessibility (Divact_Ped). However, in the ANN model, the MXI loses importance in the variable rank, while Centrality emerges with a high explanatory power. In the case of Porto, the city center represents an important hub of urban activities, and this evidences that energy demand is significantly influenced by the distance to CBD. This is in line with previous findings (e.g. [103]). It is worth noting that the remaining variables account for about 40% of the total variable weight. While the ANN was able to find significant relationships between these predictors and the response variables, the MLR and SEM performed better without them. In the case of Porto, some spatial trends may explain the results. Central urban blocks evidence lower needs for motorized travel. The city center is well served by public transports and is characterized by narrow, mixed-use and pedestrian-friendly streets, with small blocks and high granularity. Conversely, one of the areas


Fig. 3. Weight of each predictor in the ANN model (application of Garson’s algorithm).
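To make the weight-partitioning idea behind Fig. 3 concrete, the sketch below implements one common formulation of Garson's algorithm for a network with a single hidden layer and a single output: each hidden neuron's output weight is shared among the inputs in proportion to the absolute input-hidden weights, and the shares are summed and normalized. Variants of the algorithm exist, so this should be read as an assumption about the formulation rather than as the exact procedure used in the study; the weight values are hypothetical.

```python
def garson_importance(w_input_hidden, w_hidden_output):
    """Relative importance of each input (one common form of Garson's algorithm).

    w_input_hidden[i][j] is the weight from input i to hidden neuron j;
    w_hidden_output[j] is the weight from hidden neuron j to the output.
    Returns a list of importances that sum to 1.
    """
    n_inputs = len(w_input_hidden)
    n_hidden = len(w_hidden_output)
    shares = [0.0] * n_inputs
    for j in range(n_hidden):
        column_total = sum(abs(w_input_hidden[i][j]) for i in range(n_inputs))
        if column_total == 0.0:
            continue  # this hidden neuron receives no signal from any input
        for i in range(n_inputs):
            # Share of hidden neuron j's output weight attributed to input i
            share = abs(w_input_hidden[i][j]) / column_total
            shares[i] += share * abs(w_hidden_output[j])
    total = sum(shares)
    return [s / total for s in shares]

# Hypothetical weights: 2 inputs, 2 hidden neurons, 1 output
w_ih = [[1.0, -2.0],
        [3.0, 2.0]]
w_ho = [0.5, -1.0]
importance = garson_importance(w_ih, w_ho)
```

For a network with several outputs, such as the three energy end-uses modelled here, the function would be applied once per output using that output's hidden-output weights. Because absolute values are used, the result indicates the magnitude of each input's contribution but not its direction.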

with highest energy demand for mobility corresponds to the surroundings of the academic campus, located in an outer part of the city, characterized by large blocks, individual buildings, and monofunctional areas. In buildings, as expected, heating and cooling typically have contrary performances. Again, the city center presents a distinct behavior from the remaining areas, with higher needs for cooling, but lower ones for heating. Here, is where building compactness is greater, which seems to be the most relevant indicator for determining buildings’ energy balance. Regarding the replicability of this approach, it is possible that the effect of urban form in other urban areas may differ, for instance, due to a more prominent influence of other drivers of energy demand (e.g. behavioral or economic). Using a model based only on variables of urban form may not predict energy demand with similar accuracy in different urban contexts. To complement this study, it would be interesting to test additional case studies to assess whether the weights of the predictors determined for Porto are extendable to other cities. In case they aren’t, the methodological framework developed can still be used to determine the right weights for different cities, and therefore to build a functional model in other geographies. The methodology advances the state of the art in many ways. First, it models the two most important sectors using energy in cities together, considering a comprehensive set of indicators (while the existing research is dominantly sectorial). Second, it applies for the first time a machine learning technique to explore the link between urban form and energy demand, accounting in an explicit and integrated way for the existing trade-offs between the different urban sectors and energy end-uses. Third, it combines the advantages of a spatially-explicit analysis with the modeling accuracy of neural networks. This innovative application has produced encouraging results. 
It found relationships of larger magnitude than conventional techniques, and significantly increased the prediction accuracy for the three relevant energy end-uses. This approach makes it possible to quantify the overall influence of urban form on energy demand, as well as to characterize the individual effect of each urban attribute. It works as a case-specific tool for assessing and guiding future development pathways, with relevance for early-stage decision support in urban and energy planning. It has important implications for identifying where action would have the greatest benefits, contributing to the establishment of (re)development priorities.

The methodology can be used to assess how different urban development options are likely to affect energy demand. Potential applications include a quantified comparison of development alternatives such as ‘enabling future development to occur in district A’, ‘enabling future development to occur in district B’ or ‘random development throughout the city’. In addition, by considering different attributes, it indicates whether and where development should favor residential or non-residential areas, or specific development configurations such as increasing building heights instead of building on new ground area.

5. Conclusions

This paper presents a new methodological framework for exploring the combined influence of urban form on the energy needs of the building and transport sectors. The proposed methodology combines the advantages of a spatially explicit, high-resolution analysis with the analytical power and accuracy of neural networks. The ability to map the model predictors and outputs makes it possible to link the data to the territory while capturing spatial trends. ANN proved to be a suitable modeling tool, delivering better results than conventional statistical techniques: it identified relationships of higher magnitude, with higher accuracy. The application of machine learning to the urban form-energy nexus may constitute a promising field of research at a time when the worldwide urbanization rate is increasing and urban development choices may dictate energy demand for centuries.

The model works as a case-specific tool that predicts the energy behavior of a given urban setting from a comprehensive set of urban form attributes. Its application to a testbed city confirmed the existence of significant relationships between urban form and energy demand, and shed some light on the effect of each individual variable. This may contribute to a better understanding of the effect of the physical configuration of cities on energy demand, a field of research spanning almost three decades. For the city of Porto, the most relevant variables of urban form are related to density and land uses (the number of floors, pedestrian accessibility and floor area). However, the results also show that micro variables should not be neglected.
In this case, increasing densities through higher building heights and promoting the diversity of activities within walking reach are development priorities. In addition, planning for a higher granularity within urban blocks, as well as increasing residential densities closer to the city center, may return significant results. This study constitutes a groundbreaking application of a machine learning technique to characterize the effect of urban form on energy demand in a comprehensive way. The promising results obtained encourage its application to other geographies.

Further case studies could reveal patterns in the influence of urban form by city type (for instance, depending on city size or economic structure). The methodological framework holds significant potential for assessing and guiding future development pathways. It enables the identification of where action would have the greatest benefits, contributing to the establishment of (re)development priorities. The methodology may represent a useful decision-support tool for local authorities in the design of more efficient urban planning and development policies.

Acknowledgements

The first author gratefully thanks FCT – Fundação para a Ciência e a Tecnologia for the financial support of her PhD studies (SFRH/BD/52305/2013).