Walk this way! An IoT-based urban routing system for smart cities



Computer Networks 162 (2019) 106857

Contents lists available at ScienceDirect

Computer Networks journal homepage: www.elsevier.com/locate/comnet

Andrea Pimpinella, Alessandro E.C. Redondi∗, Matteo Cesana

Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Piazza Leonardo da Vinci 32, Milano 20134, Italy

Article info

Article history: Received 30 November 2018; Revised 20 April 2019; Accepted 8 July 2019; Available online 12 July 2019

Keywords: Internet of things; Smart cities; Urban routing; Spatial interpolation; Temporal forecasting

Abstract

Future smart cities are expected to radically change the way people live, interact and move in urban environments. This will be possible thanks to the massive amount of data generated by ubiquitously deployed sensor devices through the Internet of Things paradigm. Solutions able to improve the quality of urban mobility for citizens are of particular interest: they are a key objective for many municipal administrations as well as one of the priority themes of the European Commission. In this context, this work proposes an advanced smart urban routing service named SURF, specifically designed for pedestrians and cyclists moving inside a city. The system retrieves the best route between a source location and a destination according to a user-defined objective function (e.g., selecting the route with the best air quality or with the lowest average temperature). This is possible through the interaction with a federation of IoT testbeds deployed worldwide. This paper describes the implementation and the evaluation of the proposed system, focusing on both the backend (data retrieval, spatio-temporal data interpolation and forecasting operations) and the front-end (graphical user interface). We assess the performance of several spatial interpolation and temporal prediction models to understand their relationship with the particular sensor measurements (air pollution, temperature, sound pressure level, etc.). We show through experiments that, as far as spatial interpolation is concerned, Universal Kriging generally performs well across all sensor measurements and can be selected as a generic interpolation strategy. As for temporal prediction, experiments highlight a tradeoff between model accuracy and look-ahead capability; short- and mid-term prediction methods show satisfactory performance across all sensor measurements. Finally, subjective and objective experiments demonstrate the positive impact of IoT-based solutions for smart routing on urban citizens.

© 2019 Elsevier B.V. All rights reserved.

∗ Corresponding author. E-mail addresses: [email protected] (A. Pimpinella), [email protected] (A.E.C. Redondi), [email protected] (M. Cesana). https://doi.org/10.1016/j.comnet.2019.07.013 1389-1286/© 2019 Elsevier B.V. All rights reserved.

1. Introduction

The Internet of Things (IoT) paradigm is populating the world with an increasing number of devices able to sense the environment and communicate measurements remotely. Such devices, which are able to work even in harsh environmental conditions, can be leveraged to support value-added services and brand new business ecosystems. Among other scenarios, cities and towns are strongly impacted by the IoT, as the penetration of widespread sensing and communication technologies is fostering the creation of advanced and better services for all the stakeholders of the urban landscape (municipalities, citizens, industries, etc.). Several authors discuss the challenges related to the integration of smart systems and IoT devices within a city-wide scenario. These mainly concern sensor deployment and real-time Big Data analytics, including data generation, collection, aggregation and preprocessing [1–3].

Major issues which can be approached and solved by leveraging an IoT ecosystem in the field of smart cities include (but are not limited to) traffic/mobility management, security/emergency management and pollution control. To give some examples, the authors in [4] propose an intelligent transport system framework to efficiently manage the distribution of vehicles and avoid traffic congestion, leveraging real-time traffic data generated by wireless sensor networks. In [5], the authors leverage data from sensors integrated into common cellular devices to detect emergency or danger situations, ensuring full sensing coverage for city-wide environments. Also, the authors in [6] propose a method to estimate the amount of water needed for the irrigation of urban lawns leveraging RGB sensors, with the aim of reducing water waste.

In this context, urban transportation will change radically in the upcoming years. While recent regulations in EU countries are pushing hard towards cleaner mobility alternatives, several EU car manufacturers are issuing statements about stopping the production of fossil fuel-based cars. Also, aside from electric cars and public


A. Pimpinella, A.E.C. Redondi and M. Cesana / Computer Networks 162 (2019) 106857

transport, zero-emission mobility solutions including bike sharing platforms and walking buses are popping up at an extremely high pace in many cities around the world. However, although such initiatives are increasingly frequent, urban navigation still relies on traditional routing systems, developed primarily for cars. Some navigation services (e.g., Google Maps) do provide options for pedestrians and cyclists, but the routes they compute are generally based only on path length and average duration, sometimes also considering morphological features such as altitude. Moreover, path quality indicators like pollution or sound pressure level are usually not taken into account.

In this work we show how to augment the capabilities of an urban routing system by leveraging an IoT system able to collect capillary city-wide data. To this end, we exploit the platform developed within the EU H2020 FIESTA-IoT project [7–11], which provides simple semantic-based APIs to access multiple federated testbeds of sensor networks located in several parts of the world. Leveraging the data collected by such testbeds, we propose SURF (Smart Urban Routing for FIESTA-IoT), a routing system that computes the best route between a source/destination pair according to a generic, user-defined objective function. The objective function encompasses several metrics, including the standard journey travelling time, the carbon monoxide (CO) and ozone (O3) concentration, the sound pressure level, the relative humidity and the air temperature or solar exposure along the path. Thus, SURF combines geographical information obtained from publicly available navigation services (e.g., Google Maps or OpenStreetMap) with data retrieved by sensors, made available in the locations of interest by the FIESTA-IoT platform in a completely agnostic fashion.
Note that the choice of the specific objective metric depends on the sensor resources available in a particular geographical area, which are conveniently discovered through the FIESTA-IoT platform. As an example, SURF allows users to select the least polluted route in geographical areas where air quality sensors are installed, or the quietest route where environmental sound pressure level sensors are installed. To describe the proposed navigation engine in a nutshell, on-field sensor readings are processed by two core components: (i) an interpolation processing block, which increases the spatial granularity of sensor measurements, and (ii) a temporal prediction model based on neural networks. The former allows a user to obtain a valid route even when measurements are sparse in space, while the latter allows for in-the-future route computation. We believe SURF gives users several additional degrees of freedom in selecting the best path to reach their final destination, augmenting the performance of traditional routing systems.

This work is organized as follows: Section 2 reviews the relevant literature on urban routing systems; Section 3 describes SURF's architecture, also providing a general introduction to the FIESTA-IoT platform. Experiments to optimise and evaluate SURF, leveraging data coming from the city of Santander [12], are described in Section 4. Finally, Section 5 summarises conclusions and future work.

2. Related works

Traffic management is one of the most sensitive issues at urban level. To this end, several research efforts are in place to study smart navigation and routing systems which leverage the digitalisation and sensorisation of cities to design and promote better and more sustainable mobility. Broadly speaking, the works in this field can be categorised on the basis of the final goal of the routing decision problem (optimise traffic distribution, reduce journey time, reduce carbon footprint, etc. for either vehicular, pedestrian or bike mobility scenarios) and the type of data used to support such a decision problem (geographical information, traffic congestion, etc.).

As regards vehicular environments, data coming from Telco operators are leveraged in [13] to obtain real-time traffic congestion monitoring, which is then fed back into the navigation system. The similar goal of estimating the current traffic conditions is pursued in [14,15], which do not rely upon Telco data but rather use taxis as moving probes of the current traffic conditions. Distributed crowd-sensed data can also be leveraged to get more capillary information on the status of traffic and avoid congestion or traffic jams [16]. As for commercial urban navigation systems, the baseline nowadays includes systems like Google Maps and Waze, which return the shortest path between a source and a destination, also taking into account the current traffic situation inferred from data coming from user devices. Such systems mainly focus on finding routes with minimal journey time, neglecting other route quality measures. For this reason, novel route planning algorithms are constantly proposed to determine quicker and more fuel-efficient routes, reducing emissions or smoothing the deceleration/acceleration of vehicles [17].

As regards the mobility of pedestrians and cyclists, many studies have shown that a proper selection of routes through the urban area may significantly increase the quality of commuting. Recent works [18,19] show that proper route selection can decrease air pollution exposure by up to 67%, and that travelling outside rush hours reduces the exposure by between 10% and 30%. In [20], where the authors propose a web-based route planning service for cyclists in Montreal, Canada, it is shown that an alternative route with lower traffic pollution exposure than the shortest one was available in almost 60% of the cases, with a correspondingly small increase (often less than 1 km) in the overall path length. These results indicate that a route planner that selects the best route through the city using indicators other than journey time, such as air quality, might be a good service for the public. Following this line, the work in [21] proposes Cyclevancouver, a cycling route planner for Metro Vancouver (the metropolitan area of Vancouver, Canada). The planner uses geographical information (e.g. distance, elevation gain, etc.) as well as environmental data (e.g. route safety, air pollution, etc.) to propose optimised cycling routes through the region based on the user's preferences. For calculating air pollution concentrations in the city, the system uses an offline pollution map obtained with a land use regression model rather than relying on real-time data retrieved by sensors deployed in the city. A similar offline pollution map is leveraged in [20]. Lastly, a similar route planning service for the city of San Francisco, USA [22] allows users to plan the most bike-friendly route in the city, taking into account the city elevation (a functionality which was recently added to Google Maps as well).

Compared to the aforementioned works, the system proposed in this paper relies on neither external nor crowd-sensed data to design the routes, but rather on actual sensor readings (or future predictions). Such readings are instantaneously available from the federated testbeds provided by the FIESTA-IoT project. As a consequence, the routing design can be adapted in a flexible way to the needs of the end users and to the current availability of data in the targeted geographical area. This makes SURF a Smart City compliant service, which is not properly the case for the cited works in the field of pedestrian and cyclist mobility. Moreover, differently from the previously mentioned works, we compare the performance of different spatial interpolation and temporal forecasting tools with respect to different sensor quality kinds.
In this way, we are able to investigate whether there is a best-performing match between interpolation or forecasting method and sensor type. To the best of our knowledge, this analysis is not present in the literature.

3. System implementation

This section describes the individual components of the SURF architecture. SURF consists of a web service that is able to compute a route between a starting point and a destination according to a user-defined metric. The architecture of the system, depicted in Fig. 1, is composed of two main parts: the SURF Frontend System (SFS) and the SURF Backend System (SBS). Section 3.1 describes the SFS, a web page that allows the user to interact with the system. It sends requests to and gets responses from the SBS, where the core functionalities of the system are implemented. The SBS, described in Section 3.2, is in charge of retrieving data from the FIESTA-IoT platform and processing it according to the user's requests. The SBS also maintains a local MongoDB database where data from FIESTA-IoT are stored for several purposes (caching, adjusting prediction models, etc.).

Fig. 1. SURF system architecture.

3.1. SURF Frontend System (SFS)

The SFS provides a web-based graphical interface for interacting with the system, and has been designed to be as user-friendly as possible. As illustrated in Fig. 2, a user can select the addresses of their starting point and final destination using the left-side menu. The menu also shows an option to compute the route Now or at a particular time in the future. This choice affects how the SBS processes the sensor data: if Now is selected, the SBS uses real-time measurements from the sensors to compute the optimal route. Otherwise, the SBS runs a machine learning forecasting tool to predict future values of the sensor data.

More details on this process are given in Section 3.2.2 (In-the-future route computation). Finally, in the bottom part of the menu, the user can select the preferred route optimisation metric by using two dropdown lists. The button Find Route in the bottom right corner triggers the SBS to compute the route according to the input from the user. The leftmost dropdown list allows the user to choose between Maximising and Minimising a particular route metric, which is selected in the rightmost list. The content of the latter depends on the sensor resources found in the area of interest.

3.1.1. Route definition and visualisation

Upon insertion of valid locations in the From and To input forms, the SFS transmits their content to the SBS, which replies with a list of sensor resources found in the area of interest. This operation is based on a resource discovery performed by the SBS on the FIESTA-IoT platform, which is detailed in Section 3.2.1. When the discovery process is completed, the retrieved sensors are displayed on the map using pre-defined icons. Concurrently, the dropdown list for route optimisation is populated with the particular sensor types that have been found, so that the user can select which type of sensor should be used to compute the route. As an example, in Fig. 2 the user decides to find the route for which temperature is minimised. Once these options are set, the user clicks on the Find Route button, which triggers the execution of the backend functions for route computation, detailed in Section 3.2.2. Instead of returning a single route, the system returns the best and second-best alternatives. This is done to emphasise the different characteristics of the two routes, in terms of distance to be covered and, more importantly, in terms of the metric chosen for route optimisation. As one can see in Fig. 2, two routes are returned: for each one, a small information dialog is shown, reporting the total distance to be covered, the estimated duration and the average value of the sensor measurements selected for route optimisation. If no sensors are found in the area of interest, a small dialog informs the user that no sensors are present, and the SFS displays routes computed only according to a shortest-path criterion.

Fig. 2. SURF web page. The example shows the computation of a route in Santander minimizing air temperature. Data is provided by the SmartSantander testbed through the FIESTA-IoT platform.

3.2. SURF Backend System (SBS)

The SBS is the core component of the SURF system and controls all interactions with the users as well as with the FIESTA-IoT platform. Its main tasks are: (i) resource discovery from FIESTA-IoT according to the user's input, (ii) data processing for the computation of the best routes and (iii) construction of prediction models for computing routes in the future. The SBS is implemented on an Ubuntu 16.04 virtual environment with an Intel Xeon CPU E5-2676 v3 @ 2.40 GHz and 1 GB RAM.

3.2.1. Resource discovery

When a user selects the locations of source and destination, as well as the time for which to compute the route (i.e., either Now or a future time), the geographical coordinates of the two locations are transferred to the backend. Here, the testbed-agnostic access enabled by FIESTA-IoT is used to retrieve data from all sensors deployed in the area of interest. The latter is defined as a rectangular geographical window delimited by the coordinates input by the user. According to the specifications of the ontology defined by the FIESTA-IoT project, the SBS creates a SPARQL query which is submitted to the FIESTA-IoT framework for data retrieval. Once the SPARQL query is processed by FIESTA-IoT, the SBS communicates back to the SFS the location of all sensors found in the area of interest, together with their type. Due to the sensor-agnostic nature of FIESTA-IoT, sensors which are not useful for urban routing may be present. As an example, at the time of writing, sensors measuring Voltage, Electric Current, Active Power and Water Temperature were present in some indoor environments and may therefore be returned by the system.
This process has two important drawbacks: (i) from a usability point of view, the user may be confused by an unjustifiably large number of options to choose from and (ii) from a computational point of view, data that will never be used by the final user is still retrieved from the FIESTA-IoT platform. For these reasons, the SPARQL query transmitted by the SBS contains filter statements to discover only sensors which are believed to be useful for the final urban routing application, such as: Air Temperature, Relative Humidity, Illuminance, Chemical Agent Atmospheric Concentration CO/NO2/O3 and Environmental Sound Pressure Level. After the sensors are discovered, the SFS can proceed to show them on the map. At this point, the SBS waits for the user to select which sensor type and which type of optimisation (minimum or maximum) it should use for computing the best route.
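The discovery logic described above (rectangular area of interest plus a filter on useful sensor kinds) can be sketched in a few lines. This is only an illustration under stated assumptions: the real SBS expresses both constraints inside a SPARQL query to FIESTA-IoT, whereas the `Sensor` record, the `USEFUL_KINDS` labels and the `discover` helper below are hypothetical names that apply the same two filters to an in-memory list.

```python
from dataclasses import dataclass

# Sensor kinds considered useful for urban routing. The labels follow the
# wording of the text; the identifiers used by the FIESTA-IoT ontology may differ.
USEFUL_KINDS = {
    "Air Temperature", "Relative Humidity", "Illuminance",
    "CO", "NO2", "O3", "Sound Pressure Level",
}


@dataclass(frozen=True)
class Sensor:  # hypothetical record, not a FIESTA-IoT type
    sensor_id: str
    kind: str
    lat: float
    lon: float


def area_of_interest(src, dst):
    """Rectangular window (south, west, north, east) delimited by two (lat, lon) points."""
    (lat1, lon1), (lat2, lon2) = src, dst
    return (min(lat1, lat2), min(lon1, lon2), max(lat1, lat2), max(lon1, lon2))


def discover(sensors, src, dst, kinds=USEFUL_KINDS):
    """Keep only sensors of a useful kind that fall inside the area of interest."""
    south, west, north, east = area_of_interest(src, dst)
    return [
        s for s in sensors
        if s.kind in kinds and south <= s.lat <= north and west <= s.lon <= east
    ]
```

With such a filter, a Voltage sensor located inside the window is dropped just like an Air Temperature sensor located outside it, which mirrors the two drawbacks the filter statements are meant to avoid.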

3.2.2. Route computation

Once the user has selected their starting point and final destination, as well as the quantity to optimise the route for, the SBS proceeds to compute the best routes available. The entire process can be divided into three main steps:

Preprocessing. First, a set of routes tailored to pedestrians and cyclists is computed relying on external navigation services. We rely on the Google Maps Directions API and the OpenStreetMap Routing Machine project¹ to obtain this first set of routes. In detail, we ask the APIs to return all routes from the starting point to the final destination according to the walking and bicycling transportation modes. This yields a number of routes (generally from 3 to 5) that already satisfy desirable criteria. In particular, such precomputed routes avoid main roads and are characterised by a reasonably short travelling time. Each route is returned as an ordered set of $N$ road segments $S = \{s_1, \ldots, s_n, \ldots, s_N\}$, each one characterised by a length $l_n$ and an average journey duration $t_n$.

¹ http://map.project-osrm.org


Fig. 3. IDW interpolation process.

3.2.2.1. Data retrieval and spatial interpolation. Upon selection of a particular sensor data kind to be used for optimising the route, the SBS extracts from the query result all sensors belonging to that specific data kind. However, the spatial granularity of such sensors may not be enough to directly compute the best routes according to the user criteria. Therefore, before any other operation is performed, the SBS performs spatial interpolation of the data in the area of interest. In detail, referring to Fig. 3(a), each route segment in $S$ is covered by a rectangular bounding box. The bounding box is spatially sampled with a uniform grid of points at positions $p_i$. A circular search area of radius $r$ is then superimposed on each bounding box and all sensors falling within that area are considered as input to the spatial interpolation process. Note that the extension of the search area depends on the sensor quality kind which is interpolated and thus has to be set according to the application scenario; we comment on this point in Section 4.2. Let $x_j$ be the positions of the considered sensors and $v_j$ their values. The goal of the spatial interpolation process is to compute the unknown sensor values $v_i$ at the positions $p_i$ in the bounding box superimposed on each route segment. We consider different options for the spatial interpolation algorithm:

1. Inverse Distance Weighting (IDW): This is one of the simplest algorithms for performing spatial interpolation. The unknown sensor values $v_i$ are computed according to the following weighted sum:

$$ v_i = \sum_{j} w_{i,j}\, v_j \tag{1} $$

where the weights $w_{i,j}$ are computed according to the inverse of the Euclidean distance $d_{i,j} = \|p_i - x_j\|$, and normalised to unity as follows:

$$ w_{i,j} = \frac{d_{i,j}^{-1}}{\sum_{j:\, d_{i,j} \le r} d_{i,j}^{-1}}, \quad \text{for } d_{i,j} \le r \tag{2} $$

$$ w_{i,j} = 0, \quad \text{otherwise} \tag{3} $$

Note that all the sensors at a distance greater than $r$ from the unknown sensor value location (i.e., outside the circular search area) are not included in the interpolation process. Fig. 3(b) illustrates the IDW operations.

2. Gaussian Radial Basis Function (G-RBF): The unknown sensor values $v_i$ are approximated as a weighted sum of radial basis functions $\phi(d_{i,j})$, that is:

$$ v_i = \sum_{j} w_j\, \phi(d_{i,j}). \tag{4} $$

Several choices are available for the function $\phi$, the Gaussian kernel being the most popular, that is:



$$ \phi(d_{i,j}) = \exp\left( -\frac{d_{i,j}^{2}}{2\sigma_r^{2}} \right). \tag{5} $$

The weights $w_j$ are obtained starting from the known sensor values $v_j$ and solving the corresponding linear system obtained from (4) with least squares. The parameter $\sigma_r$ is set to match the extension of the circular search area of radius $r$. Also in this case, all sensors at a distance greater than $r$ from position $p_i$ do not influence the interpolation output.

3. Universal Kriging (UK): Kriging is a family of interpolation algorithms which work by treating the quantity to interpolate as a geostatistical variable $Z(x)$ expressed as the sum of a deterministic trend $\mu(x)$ and a random, autocorrelated error $\delta(x)$. The trend can either be a simple constant (Ordinary Kriging), that is $\mu(x) = \mu$ for all locations $x$, or a more complex function of the spatial coordinates (Universal Kriging). In a nutshell, Kriging interpolation at an unknown point $x_i$ is defined as:

$$ Z(x_i) = \sum_{j} \lambda_j\, Z(x_j), \tag{6} $$

where the weights $\lambda_j$ are computed starting from the variogram of $Z(x)$. The variogram is a function that measures the correlation between two values of a spatial process based only on the distance between the corresponding locations. Referring to such distance as the lag, the variogram at lag $h$ is defined as the average squared difference between data values at distance $h$. Assuming that close locations are characterised by similar process realisations, the weights $\lambda_j$ are drawn from the variogram in such a way that (i) data points closer to the unknown point $x_i$ influence the interpolation more and (ii) the higher the spatial correlation between $x_i$ and a given data point, the higher the weight of that data point. This means that, according to the variogram, the Kriging interpolation algorithm automatically tunes the search area by setting to zero the weights of those data points which turn out to be spatially uncorrelated with $x_i$ (i.e., which are either too far from the unknown point or show low spatial correlation). Finally, note that since the variogram depends (i) on the considered spatial process and (ii) on the specific targeted geographical area, the values of the weights $\lambda_j$ share the same dependencies. We comment on the performance of the different interpolation techniques in Section 4.2.
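The first two interpolation options can be illustrated with a short NumPy sketch. It is not the SBS implementation: it assumes planar coordinates (e.g., metres) rather than latitude/longitude, it takes $\sigma_r$ equal to the value passed as `sigma` (the text only says it is matched to $r$), the G-RBF function omits the cut-off at distance $r$ for brevity, and the function names are hypothetical. Kriging is left out because it additionally requires fitting a variogram.

```python
import numpy as np


def sample_bounding_box(p_start, p_end, n=5):
    """Uniform n x n grid of points p_i over the rectangle spanned by two corner points."""
    xs = np.linspace(min(p_start[0], p_end[0]), max(p_start[0], p_end[0]), n)
    ys = np.linspace(min(p_start[1], p_end[1]), max(p_start[1], p_end[1]), n)
    gx, gy = np.meshgrid(xs, ys)
    return np.column_stack([gx.ravel(), gy.ravel()])


def _distances(points, sensors_xy):
    """Euclidean distances d_ij between points p_i and sensor positions x_j."""
    points = np.asarray(points, dtype=float)
    sensors_xy = np.asarray(sensors_xy, dtype=float)
    return np.linalg.norm(points[:, None, :] - sensors_xy[None, :, :], axis=2)


def idw(points, sensors_xy, values, r):
    """Eqs. (1)-(3): inverse-distance weights, zero for sensors farther than r.

    Points with no sensor inside the search radius get NaN.
    """
    d = _distances(points, sensors_xy)
    inv = np.where(d <= r, 1.0 / np.maximum(d, 1e-12), 0.0)
    total = inv.sum(axis=1)
    est = np.full(len(d), np.nan)
    ok = total > 0
    est[ok] = (inv[ok] @ np.asarray(values, dtype=float)) / total[ok]
    return est


def gaussian_rbf(points, sensors_xy, values, sigma):
    """Eqs. (4)-(5): Gaussian RBF weights fitted on the sensors by least squares."""
    def phi(d):
        return np.exp(-d ** 2 / (2.0 * sigma ** 2))

    kernel = phi(_distances(sensors_xy, sensors_xy))
    weights, *_ = np.linalg.lstsq(kernel, np.asarray(values, dtype=float), rcond=None)
    return phi(_distances(points, sensors_xy)) @ weights
```

In this sketch a grid point midway between two sensors receives the plain average of their values under IDW, while the G-RBF estimate reproduces the sensor values when evaluated at the sensor positions themselves.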


Postprocessing. After spatial interpolation over each one of the $N$ road segments of each precomputed path, the SBS computes the best routes according to the user criteria and the "direction" of the optimisation function (i.e., max or min). First, each segment $s_n$ is given a cost $c_n$ obtained by averaging together all interpolated data points $v_i$ belonging to the corresponding bounding box. Each route is then given a final score $J$ computed as a time-weighted average of the corresponding segment costs, that is:

$$ J = \frac{\sum_{n=1}^{N} c_n\, t_n}{\sum_{n=1}^{N} t_n}. \tag{7} $$

The precomputed routes are then sorted according to the resulting value of $J$ in ascending or descending order, according to the selected optimisation criterion (min or max). Finally, the two best routes are returned to the user and shown on the graphical user interface. The corresponding values of $J$ are also displayed in a small clickable label associated with each route (see Fig. 2).
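The postprocessing step amounts to a time-weighted average followed by a sort. The sketch below is a minimal illustration of it; the function names and the representation of a route as a list of (cost, duration) pairs are assumptions made for the example.

```python
def route_score(segments):
    """Time-weighted average of segment costs: J = sum(c_n * t_n) / sum(t_n).

    `segments` is a list of (cost, duration) pairs, one per road segment.
    """
    total_time = sum(t for _, t in segments)
    return sum(c * t for c, t in segments) / total_time


def best_routes(routes, minimise=True, k=2):
    """Return the k best (name, J) pairs among the precomputed routes.

    `routes` maps a route name to its list of (cost, duration) segments.
    """
    scored = sorted(
        ((route_score(segs), name) for name, segs in routes.items()),
        reverse=not minimise,
    )
    return [(name, score) for score, name in scored[:k]]
```

Weighting by duration rather than by segment count means that a long, hot stretch of road dominates the score of a route even if the remaining segments are short and cool.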

In-the-future route computation. As illustrated in Fig. 2, a user may choose to get directions for Now or for a particular time in the future. The latter way of computing routes is based on a set of prediction models, independent for each sensor kind. These models are trained, evaluated and stored locally on the SBS, leveraging historical data queried from the FIESTA-IoT platform. The entire process is executed as follows:

1. The SBS runs a periodic daily process which downloads all data available from the testbeds connected to FIESTA-IoT for the past 24 h. The data retrieved from the platform is saved as JSON documents into a MongoDB database running on the SBS. The database works as a circular buffer, that is, old data is overwritten with new measurements after a period of time equal to the maximum look-ahead capability of the prediction models.

2. The buffered data is used to train different prediction models, which are needed to estimate the value of a particular sensor upon a user request for a route in the future. The SBS selects the model that obtains the highest performance and uses it whenever a prediction on a particular sensor is required. We consider different prediction models characterised by different complexity and look-ahead capabilities. As regards complexity, we focus on Baseline models (obtained by simple data copy or regression on the past values) and Neural Network (NN) models, where historical data is fed into a neural network to produce the forecasted value. As regards the look-ahead capabilities, we consider three cases: (i) 1 h look ahead, (ii) 1 day look ahead and (iii) 1 week look ahead. Let $v_i(t)$ be the value to be predicted for the $i$th sensor at a particular hourly-sampled time $t$. The models considered are as follows:

(a) Baseline Prediction Models (BL):

(1) 1 Hour Before (BL-1HB):

$$ v_i(t) = v_i(t-1) \tag{8} $$

(2) 1 Day Before (BL-1DB):

$$ v_i(t) = v_i(t-24) \tag{9} $$

(3) 1 Week Before (BL-1WB):

$$ v_i(t) = v_i(t - 7 \cdot 24) \tag{10} $$

(4) 1 Day Average (BL-1DA):

$$ v_i(t) = q_1 v_i(t-24) \tag{11} $$

(5) 2 Days Average (BL-2DA):

$$ v_i(t) = q_1 v_i(t-24) + q_2 v_i(t-48) \tag{12} $$

(6) 3 Days Average (BL-3DA):

$$ v_i(t) = q_1 v_i(t-24) + q_2 v_i(t-48) + q_3 v_i(t-72) \tag{13} $$

(b) Neural Networks (NN):

(1) 1 Week Before (NN-1WB):

$$ v_i(t) = f\big(v_i(t - 7 \cdot 24)\big) \tag{14} $$

(2) 1 Day Before (NN-1DB):

$$ v_i(t) = f\big(v_i(t-24)\big) \tag{15} $$

(3) 1 Day Average (NN-1DA):

$$ v_i(t) = f\big(v_i(t-1), \ldots, v_i(t-k), \ldots, v_i(t-24)\big) \tag{16} $$

(4) 2 Days Average (NN-2DA):

$$ v_i(t) = f\big(v_i(t-1), \ldots, v_i(t-k), \ldots, v_i(t-48)\big) \tag{17} $$

(5) 3 Days Average (NN-3DA):

$$ v_i(t) = f\big(v_i(t-1), \ldots, v_i(t-k), \ldots, v_i(t-72)\big) \tag{18} $$

For the Baseline models, the weights $q_1$, $q_2$, $q_3$ are estimated by linear regression. For the NN models, we use neural networks with a single hidden layer and a single output layer, while the number of neurones in the hidden layer is chosen according to a cross-validation strategy (see Section 4 for details). Note that models BL-1HB, NN-1DA, NN-2DA and NN-3DA require the knowledge of the past hour sample $v(t-1)$ and are therefore 1 h look-ahead models. Models BL-1DB, BL-1DA, BL-2DA, BL-3DA and NN-1DB are 1 day look-ahead models. Finally, models BL-1WB and NN-1WB are 1 week look-ahead models.

3. Finally, when a user selects a particular hour for computing a route in the future, the SBS first checks in the database for which sensors data has been downloaded and a model is available. The SBS picks the best available model, depending on the sensor kind and on the point in the future selected by the user, and computes predictions for all sensors participating in the route computation. Basically, this step is the equivalent of what is explained in Section 3.2.2: the difference is that instead of querying the FIESTA-IoT platform, the system queries the local MongoDB database.

4. System evaluation

This section comments on the experiments performed to optimise and evaluate the proposed system. In particular, we perform a detailed comparison of the different spatial interpolation and temporal forecasting methods in Sections 4.2 and 4.3, respectively. Section 4.4 comments on the impact of the proposed system in terms of the quality of the resulting routes compared to the ones computed by traditional routing systems, while Section 4.5 reports the results of a survey conducted on several users of the platform. As a reminder, we report in Table 1 the models we evaluated for spatial interpolation and temporal forecasting; for the latter, we also specify the corresponding look-ahead capabilities.
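The Baseline predictors of Section 3.2.2 (In-the-future route computation) are simple enough to sketch directly. The code below is an illustration under stated assumptions, not the SBS code: the series is assumed to be hourly and gap-free, the regression weights are fitted with ordinary least squares via NumPy, and `usable_models` encodes one possible reading of the look-ahead classification, namely that a model can serve any request whose horizon does not exceed its look-ahead. All function names are hypothetical.

```python
import numpy as np

# Look-ahead (in hours) of each model, as classified in the text.
LOOK_AHEAD_HOURS = {
    "BL-1HB": 1, "NN-1DA": 1, "NN-2DA": 1, "NN-3DA": 1,
    "BL-1DB": 24, "BL-1DA": 24, "BL-2DA": 24, "BL-3DA": 24, "NN-1DB": 24,
    "BL-1WB": 168, "NN-1WB": 168,
}


def usable_models(horizon_hours):
    """Models whose look-ahead covers the requested horizon (assumed selection rule)."""
    return sorted(m for m, h in LOOK_AHEAD_HOURS.items() if h >= horizon_hours)


def bl_copy(series, t, lag):
    """Copy models BL-1HB / BL-1DB / BL-1WB: v(t) = v(t - lag), lag in hours."""
    return series[t - lag]


def fit_bl_average(series, days):
    """Least-squares weights q_1..q_days for v(t) = sum_d q_d * v(t - 24 d)."""
    series = np.asarray(series, dtype=float)
    lags = [24 * d for d in range(1, days + 1)]
    start = lags[-1]  # first index for which every lagged sample exists
    X = np.column_stack([series[start - lag: len(series) - lag] for lag in lags])
    y = series[start:]
    q, *_ = np.linalg.lstsq(X, y, rcond=None)
    return q


def predict_bl_average(series, t, q):
    """Apply fitted weights: BL-1DA, BL-2DA or BL-3DA depending on len(q)."""
    return sum(q_d * series[t - 24 * (d + 1)] for d, q_d in enumerate(q))
```

On a series that repeats exactly every 24 h, the fitted weights sum to one and the prediction coincides with the true value, which is a convenient sanity check before comparing these baselines against the NN models.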

4.1. Available data FIESTA-IoT platform provides data coming from different sensors testbeds of different cardinality and with different temporal granularity characteristics. To harmonise the dataset, we first reduce the time granularity of all sensor measurements to hourly samples, performing a hourly median sampling of the extracted time series. We observe that two types of issues may be present

A. Pimpinella, A.E.C. Redondi and M. Cesana / Computer Networks 162 (2019) 106857


Table 1 Evaluated Spatial Interpolation (left) and Temporal Prediction (right) algorithms.

Table 2 Available sensors.

| Sensor Kind | N. Sensors | Spatial Interpolation | Temporal Forecasting |
|---|---|---|---|
| Temperature | 171 | Yes | Yes |
| CO | 65 | Yes | Yes |
| O3 | 50 | Yes | Yes |
| Sound Pressure | 22 | Yes | No |
| Rel. Humidity | 14 | No | Yes |

Table 3 Datasets for spatial interpolation.

| Sensor Kind | Max N. Sensors for selected T | Useful Sensors Set Size |
|---|---|---|
| Temperature | 116 | 90 |
| CO | 65 | 38 |
| O3 | 50 | 17 |
| Sound Pressure | 22 | 18 |

in the data: first, most sensors are coupled with their geographical coordinates, but this information is not available for, e.g., relative humidity sensors; for such quality kinds spatial interpolation is not possible. Secondly, we note that for some quality kinds (e.g., ambient sound pressure) the time series are non-continuous and interrupted by frequent sensor-related issues. Therefore, we excluded from the temporal forecasting experiments all those sensor quality kinds whose data showed a time continuity smaller than one month. Table 2 summarises the number of sensors per quality kind and the corresponding availability for either geospatial interpolation or temporal forecasting analysis.

4.2. Spatial data interpolation

The different interpolation techniques introduced in Section 3.2.2 (Data retrieval and spatial interpolation) have been compared to evaluate their performance. The main goal of such experiments is to assess whether there exists a clearly winning interpolation method, outperforming all others regardless of the type of measurements to which it is applied, or whether each data type requires a specific spatial interpolation methodology. As a reminder, for this set of experiments we consider (i) Air Temperature, (ii) CO Concentration, (iii) O3 Concentration and (iv) Environmental Noise Pressure sensors. Tests have been undertaken focusing, for each quality kind, on the hourly timestamp T for which the highest number of sensor measurements was available with respect to the overall number of sensors of that kind. Moreover, we performed a geographical windowing of the whole selected area. This is done to reproduce working conditions similar to those SURF faces after the creation of bounding boxes around each route segment, as described in Section 3.2.2.
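The two data preparation steps mentioned so far, the hourly median sampling of Section 4.1 and the choice of the hourly timestamp T with the largest number of reporting sensors, can be illustrated as follows. This is a minimal sketch under stated assumptions: readings are assumed to arrive as `(sensor_id, timestamp, value)` tuples, and both function names are hypothetical rather than part of SURF or FIESTA-IoT.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median


def hourly_median(readings):
    """Reduce raw readings to one median sample per sensor and per hour.

    readings: iterable of (sensor_id, datetime, value) tuples (assumed layout).
    Returns a dict mapping (sensor_id, hour_start) to the median value.
    """
    buckets = defaultdict(list)
    for sensor_id, ts, value in readings:
        hour_start = ts.replace(minute=0, second=0, microsecond=0)
        buckets[(sensor_id, hour_start)].append(value)
    return {key: median(values) for key, values in buckets.items()}


def best_timestamp(hourly):
    """Return the hour with the most distinct reporting sensors, and that count."""
    sensors_per_hour = defaultdict(set)
    for sensor_id, hour_start in hourly:
        sensors_per_hour[hour_start].add(sensor_id)
    best = max(sensors_per_hour, key=lambda h: len(sensors_per_hour[h]))
    return best, len(sensors_per_hour[best])


if __name__ == "__main__":
    raw = [
        ("s1", datetime(2018, 5, 1, 10, 5), 21.0),
        ("s1", datetime(2018, 5, 1, 10, 40), 23.0),
        ("s2", datetime(2018, 5, 1, 10, 15), 19.5),
        ("s1", datetime(2018, 5, 1, 11, 20), 24.0),
    ]
    hourly = hourly_median(raw)
    print(best_timestamp(hourly))
```

Ties between hours with the same number of sensors are resolved here by taking the first one encountered; the paper does not state how such ties were handled.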
Note that, across the four different sensor kinds selected for spatial interpolation experiments, we kept constant the ratio between the surface of the geographical window and the number of sensors falling within it, so as to get comparable scenarios. These operations further reduced the sets of sensors useful for spatial interpolation, as shown in Table 3. Lastly, we divided the obtained sensor sets into training and test sets with splitting ratios of 0.8 and 0.2, respectively. This means that the considered interpolation models were first trained with the sensors of the training set. Considering the IDW and G-RBF interpolation strategies, during the training phase we selected the best value of the radius r of the circular search area (see Section 3.2.2, Data retrieval and spatial interpolation) through a k-fold cross-validation strategy with k = 10. Note that while r is a direct input parameter for IDW, in the case of G-RBF the corresponding parameter which controls the extension of the circular area is σ_r, which we tuned accordingly. Table 4 summarises the radius of the circular search area for the considered sensor types.

Table 4 Radius of circular search area for the selected sensor quality kinds.

| Sensor Kind | Radius [m] |
|---|---|
| Temperature | 400 |
| CO | 25 |
| O3 | 25 |
| Sound Pressure | 50 |

Results show that the best extension of the circular search area depends on the considered spatial process. For Air Temperature sensors, the search radius is 16 times larger than the one selected for average CO and O3 concentrations, and 8 times larger than the one selected for environmental Noise Pressure sensors. On the one hand, this depends on the fact that both Pollution Gases and environmental Noise Pressure are strictly local spatial processes: e.g., the surroundings of city centre roads are much noisier than those in residential areas, and CO/O3 average exposure is generally much higher in heavily congested roads than in less travelled ones. This translates into lower spatial de-correlation distances, i.e., sensors further than 50 m apart show very low spatial correlation values. On the other hand, this is not the case for Air Temperature, as the corresponding spatial process realisations vary at city-wide levels and thus have larger de-correlation distances. Once we completed the cross-validation and training phase, we performed prediction in the locations of test set sensors. To assess the performance of each model we computed the Root Mean Squared Error (RMSE) between real measurements of test set sensors and predicted values.
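To make the evaluation procedure concrete, the following sketch shows an inverse distance weighting (IDW) predictor restricted to a circular search area of radius r, together with the RMSE used as the error metric. It is a simplified illustration, not the formulation of Section 3.2.2: it assumes planar coordinates in metres, an inverse-distance exponent of 2, and hypothetical function names.

```python
import math


def idw_predict(target, sensors, radius, power=2.0):
    """Inverse distance weighting over the sensors within `radius` of `target`.

    target:  (x, y) position in metres (planar coordinates are an assumption).
    sensors: list of ((x, y), value) pairs from the training set.
    Returns None when no sensor falls inside the circular search area.
    """
    num = 0.0
    den = 0.0
    for (x, y), value in sensors:
        dist = math.hypot(x - target[0], y - target[1])
        if dist > radius:
            continue  # outside the circular search area
        if dist == 0.0:
            return value  # a sensor sits exactly on the target location
        weight = 1.0 / dist ** power
        num += weight * value
        den += weight
    return num / den if den > 0.0 else None


def rmse(predicted, actual):
    """Root Mean Squared Error between two equally long sequences."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))
```

In an experiment such as the one described above, `idw_predict` would be called at the location of each test set sensor using only training set sensors, and `rmse` would compare the resulting predictions with the held-out measurements. The radius would be chosen by repeating this over candidate values within the cross-validation folds.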
In other words, (i) we first predicted the measurement values in the locations of the test set sensors as if there were actually no sensor resources in those locations, and then (ii) we computed the RMSE between such predictions and the real sensor values (which were actually available, as those locations were effectively equipped with sensor resources). Fig. 4 shows the values of RMSE we obtained from the aforementioned analysis. As one can see, different sensor kinds require different algorithms. Temperature data can be interpolated quite well through the simple IDW algorithm, whose RMSE turns out to be comparable with (but not higher than) those of G-RBF and UK. This is not the case for sound pressure level, where UK outperforms the other two algorithms, in both cases more than halving the error. Similar reasoning can be done for CO and O3: for the former, the three algorithms give comparable RMSEs (with slightly better error for G-RBF), while for the latter UK gives more than 12% lower error with respect to the other two algorithms. To answer our original question, it appears that different data types do call for different interpolation methods in order to obtain the best results. However, our experiments show that Universal Kriging is generally able to obtain satisfactory results across all data types and can therefore be selected as a generic interpolation strategy.

Fig. 4. Performance of IDW, G-RBF and UK interpolation techniques for Air Temperature, Chemical Agent Atmospheric Concentration (CO, O3) and environmental Sound Pressure Level (SPL) sensors.

4.3. Temporal data forecasting

Similarly to what was done for spatial interpolation, we compared the temporal prediction models introduced before to evaluate their performance. The main goal here is the same as in the previous section, namely assessing whether we are able to find a clearly winning prediction model which outperforms all the others regardless of the sensor type. For each sensor type we considered 10 sensors and we performed forecasting according to the following process. Each sensor's time series is split into a training set of three weeks and a testing set of one week. Then, weights q1, q2, q3 (defined for predictors BL-1DA, BL-2DA and BL-3DA) have been estimated through linear regression using the training set. Moreover, the number of hidden neurones (i.e., the size of

Fig. 5. Performance of the proposed temporal prediction models for Air Temperature, Chemical Agent Atmospheric Concentration (CO, O3) and Relative Humidity sensors.


Fig. 6. Comparison between paths suggested by Google Maps and SURF in terms of average travel time (left) and average environmental Sound Pressure Level (SPL, right). Path generated with a distance d between source and destination equal to 2 km (top), 4 km (middle), 6 km (bottom).

the hidden layer) for each NN predictor has been determined after a cross-validation analysis on the training set (with a split of 2 weeks for the sub-training set and 1 week for the validation set). Note that both the weights for baseline predictors and the hidden layer sizes for neural networks change according to the sensor type and the specific predictor. Fig. 5 reports averaged RMSEs obtained from testing the above described models on the four selected sensor quality kinds. As expected, there is a tradeoff between model precision and look-ahead capabilities: the simplest model (BL-1HB) is the one showing the best results overall, but it can be used only for 1-hour ahead predictions. Models forecasting further in the future are associated with a higher RMSE. Overall, our results suggest that for short-term (1 h) and mid-term (1 day) predictions baseline methods yield satisfactory results across all tested sensor types. Conversely, neural networks are better suited than baseline methods for longer look-ahead periods (weekly predictions), as can be seen by comparing the black bars of Fig. 5. In general, our


Fig. 7. Comparison between paths suggested by Google Maps and SURF in terms of average travel time (left) and average exposure to CO (right). Path generated with a distance d between source and destination equal to 2 km (top), 4 km (middle), 6 km (bottom).

results suggest that a single forecasting scheme can be adopted regardless of the particular sensor type, greatly simplifying the design of the overall system.

4.4. Impact

In this section we evaluate the impact of the proposed SURF by assessing the quality of the routes planned by SURF with respect to the routes planned by Google Maps (GM) and OSRM. To this end, we randomly generated 20 source-destination pairs at a given distance d with d ∈ {2 km, 4 km, 6 km}. For each source-destination pair, we then found the best path provided by GM / OSRM and the best path derived by SURF when considering the average CO exposure and average environmental sound pressure level as path quality metrics, according to the process described in Section 3.2. We finally compared the average journey time and the average exposure to CO and sound pressure level characterising the paths given by GM and SURF, respectively. The aforementioned process is repeated in five different moments of the day: 8-11am, 11am-2pm, 2pm-5pm, 5-8pm, 8pm-11pm. Moreover, we compared the average processing times of the considered navigation services to provide the best route, in the cases of a route request for Now or for a given time in the future. Fig. 6 reports the average travel time and the average sound pressure level along the path for the three different distances between source and destination. In general, the paths designed by SURF feature lower exposure to sound pressure level than those suggested by GM at the expense of slightly higher journey time. On average, during busy hours, SURF routes are characterised by 10% lower sound pressure level than GM planned paths, with a corresponding average increase of the journey time by 10%. Also, Fig. 6 (b), (d), (f) show that the gain of SURF-planned paths in terms of lower sound pressure level is higher during busy hours (i.e., 8am-11am and 5pm-8pm) than in other periods of the day, and expectedly increases for longer paths in the same period. A similar behaviour can be noted also when SURF plans the best routes according to the CO exposure. Fig. 7 shows the average travel time and the average concentration of CO along the paths planned by GM and SURF. As one can see, SURF trades off the travel time with the healthiness of the path, proposing routes which on average require 10% more time to be travelled but are 25% less exposed to CO than GM's routes (during rush hours). To conclude, Table 5 shows the processing times of GM / OSRM and SURF navigation services to provide the best path in terms of the corresponding objective metric. As one can note, the processing time overhead of SURF with respect to traditional navigation services is limited to 30 ms in the case of a route request for Now and to 38 ms for In-the-future routes, which will hardly be perceived by users.

Table 5 Google API and SURF processing times for planning routes of length 2, 4 and 6 km. All values are processing times in ms.

| Navigation Service | Now, 2 km | Now, 4 km | Now, 6 km | In-the-future, 2 km | In-the-future, 4 km | In-the-future, 6 km |
|---|---|---|---|---|---|---|
| Google API | 427 | 444 | 523 | 456 | 469 | 525 |
| SURF (CO) | 450 | 474 | 499 | 494 | 504 | 573 |
| SURF (Sound) | 452 | 473 | 501 | 495 | 503 | 566 |

4.5. Subjective

We deployed the experiment on a publicly available web server and asked twenty-five people to test the application. A survey was then conducted in order to retrieve subjective measures of how both the experiment and the FIESTA-IoT platform were perceived. The survey contains several questions, and each user has the possibility to answer each question with a number from 1 (very low) to 5 (very high).

Table 6 Survey questions and results obtained.

| Question | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| How would you rate the website GUI? | 0 (0%) | 0 (0%) | 4 (16%) | 16 (64%) | 5 (20%) |
| How would you rate the usability of the website? | 0 (0%) | 1 (4%) | 3 (12%) | 16 (64%) | 5 (20%) |
| How would you rate the performance of the website when Now is selected? | 1 (4%) | 6 (24%) | 8 (32%) | 7 (28%) | 3 (12%) |
| How would you rate the performance of the website when Today At is selected? | 0 (0%) | 1 (4%) | 4 (16%) | 13 (52%) | 7 (28%) |
| Would you use such a service, if available in your city? | 2 (8%) | 1 (4%) | 2 (8%) | 10 (40%) | 10 (40%) |
| Do you think other kinds of sensors should be made available? | 2 (8%) | 10 (40%) | 8 (32%) | 3 (12%) | 2 (8%) |
| Would you install an App on your device to provide measurements for improving the service? | 1 (4%) | 4 (16%) | 8 (32%) | 11 (44%) | 1 (4%) |

Observing the results illustrated in Table 6, several considerations can be made. Focusing on the first two questions, which are

more system-related, we can see that more than 95% of the users rated the appearance and the usability of the website in a positive way. In both cases, 84% of the users rated these two features of the website as either high or very high. This validates the overall design of the service and the performance of data retrieval from the FIESTA-IoT platform. The last three questions of the survey can be used to evaluate how well the users perceived the application, and indirectly give insights into the quality of both the proposed system itself and the FIESTA-IoT platform. We can see that 80% of the users gave a positive answer when asked if they would use the service, if available in their city. This means that having other city-wide or outdoor testbeds connected to FIESTA-IoT would greatly improve how well the project is perceived, extending the coverage of the final applications. At the same time, only 20% of the users think that more sensor types should be added to the experiment. Again, this means that the set of sensors already available through FIESTA-IoT is enough for providing a good urban navigation service. Finally, 48% of the users gave a positive answer when asked about the possibility of installing an App on their smartphones to increase the number of sensor data available in the experiment, while only 20% of them gave a negative answer. Again, this means that providing a service such as urban routing to the users can foster a crowdsourcing environment in which users inject measurements into the system themselves, and such measurements can be later used for other purposes.

5. Conclusions

We have detailed the implementation of SURF, a smart urban routing system tailored for pedestrians and cyclists built over the FIESTA-IoT platform for federated testbeds access.
The system is able to cope with both spatial and temporal lack of data: the former is addressed using spatial interpolation, while the latter is addressed through prediction models, both tailored to each sensor quality kind. Objective and subjective experiments confirm the positive impact this kind of service has on the general public. A video of the system is available at https://tinyurl.com/surf-fiestaiot. Future research directions include, among others, the use of an ad-hoc, local navigation service rather than relying upon external services such as Google Maps Directions API (or similar) for computing the initial set of routes. Also, new temporal forecasting tools such as Long Short Term Memory (LSTM) recurrent neural networks could be explored, with the aim of improving prediction performance for longer look-ahead capabilities.

Declaration of competing interest

We wish to confirm that there are no known conflicts of interest associated with this publication and there has been no significant financial support for this work that could have influenced its outcome.


Acknowledgment

This work is supported by the 3rd open call of the EU H2020 FIESTA-IoT project (contract number: 643943).

Andrea Pimpinella received his Master degree in Telecommunications Engineering in April 2018 from Politecnico di Milano and is currently attending Ph.D. at the Department of Electronics and Information of the same institute. His research activities are focused on data analysis for performance optimization in Wireless Networks.

Alessandro E.C. Redondi is currently Assistant Professor with the Dipartimento di Elettronica, Informazione e Bioingegneria of the Politecnico di Milano, Italy. He received the MS in Computer Engineering in July 2009 and the Ph.D. in Information Engineering in February 2014, both from Politecnico di Milano. From September 2012 to April 2013 Alessandro was a visiting student at the EEE Department of the University College of London (UCL). His research activities are focused on the design and optimization of IoT systems and on network data analytics.

Matteo Cesana is currently an Associate Professor with the Dipartimento di Elettronica, Informazione e Bioingegneria of the Politecnico di Milano, Italy. He received his MS degree in Telecommunications Engineering and his Ph.D. degree in Information Engineering from Politecnico di Milano in July 2000 and in September 2004, respectively. From September 2002 to March 2003 he was a visiting researcher at the Computer Science Department of the University of California in Los Angeles (UCLA). His research activities are in the field of design, optimization and performance evaluation of wireless networks with a specific focus on communication technologies for the Internet of Things and Future Generation Cellular Networks. Dr. Cesana is an Associate Editor of the Ad Hoc Networks Journal (Elsevier).