Science of the Total Environment 482-483 (2014) 440–451
Science of the Total Environment journal homepage: www.elsevier.com/locate/scitotenv
A tool for urban soundscape evaluation applying Support Vector Machines for developing a soundscape classification model
Antonio J. Torija a,⁎, Diego P. Ruiz b, Ángel F. Ramos-Ridao c
a ISVR, University of Southampton, Highfield Campus, SO17 1BJ Southampton, UK
b Department of Applied Physics, University of Granada, Avda. Fuentenueva s/n, 18071 Granada, Spain
c Department of Civil Engineering, University of Granada, Avda. Fuentenueva s/n, 18071 Granada, Spain
HIGHLIGHTS
• Support Vector Machine algorithms are used to develop an urban soundscape classification model.
• SVM and SMO-based algorithms are implemented as a part of a comprehensive soundscape evaluation tool.
• These models allow the acoustical and (indirect) perceptual assessments of soundscapes with high classification performance.
• A new methodology for soundscape evaluation is proposed, using SVM algorithms implemented within this framework.
• Experimental data show that the SMO model outperforms the SVM model in classifying the urban soundscapes considered.
ARTICLE INFO
Article history:
Received 26 March 2013
Received in revised form 26 July 2013
Accepted 27 July 2013
Available online 2 September 2013
Editor: Pavlos Kassomenos
Keywords: Soundscape classifier; Classification model; Acoustical assessment; Soundscape evaluation; Support Vector Machines; Sequential Minimal Optimization
ABSTRACT
To ensure appropriate soundscape management in urban environments, urban-planning authorities need a range of tools that enable this task to be performed. An essential step in managing urban areas from a sound standpoint is the evaluation of the soundscape in the area concerned. It has been widely acknowledged that a subjective and acoustical categorization of a soundscape is the first step in evaluating it, and that it provides a basis for designing or adapting the soundscape to match people's expectations. Accordingly, this work proposes a model for the automatic classification of urban soundscapes based on underlying acoustical and perceptual criteria, to be used as a tool for a comprehensive urban soundscape evaluation. Because of the great complexity associated with the problem, two machine learning techniques, Support Vector Machines (SVM) and Support Vector Machines trained with Sequential Minimal Optimization (SMO), are implemented in developing the classification model. The results indicate that the SMO model outperforms the SVM model in the specific task of soundscape classification. With the implementation of the SMO algorithm, the classification model achieves an outstanding performance (91.3% of instances correctly classified).
© 2013 Elsevier B.V. All rights reserved.
⁎ Corresponding author. Tel.: +44 23 8059 3752. E-mail address: [email protected] (A.J. Torija).
0048-9697/$ – see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.scitotenv.2013.07.108

1. Introduction
It is widely acknowledged that noise has a major negative impact on the quality of life in cities (Montalvao Guedes et al., 2011; Kim et al., 2012). In fact, there is increasing evidence that, under steady-state conditions, environmental-noise exposure is associated with serious psychological, physiological, and social effects (Paunovic et al., 2009; Laszlo et al., 2012). Despite this, urban areas are still characterized by serious sound degradation (Torija et al., 2012). An urban area has a variety of different sound environments, which are dominated by sounds related not only to traffic, leisure, or industry, but also to human or natural sounds (e.g. water and birds). However, today, urban environments are highly impacted by road traffic, which masks and degrades different soundscapes (Montalvao Guedes et al., 2011; Torija and Ruiz, 2012). Moreover, the impact of road traffic on soundscapes in urban environments varies with its intensity and composition. For instance, Powered Two Wheelers (PTW) have been found to be one of the most annoying environmental noise sources, so an increase in the number of these vehicles in traffic can lead to greater degradation of the sound environment (Paviotti and Vogiatzis, 2012).
Sound in outdoor environments has traditionally been considered in negative terms as both intrusive and undesirable (Jennings and Cain, 2013). However, sound may provide positive effects, such as enhancing a person's mood, triggering a pleasant memory of a prior experience, or encouraging a person to relax and recover (Payne, 2013). Thus, the soundscape framework proposes a positive approach, which aims not only to reduce sound exposure but also to preserve, conserve, or even encourage certain sounds that may be of great interest to the population.
To improve urban soundscapes, sound criteria should be incorporated in effective urban planning (Torija et al., 2012). In this sense, planners, architects, or engineers need tools that enable them to make decisions in line with the design and management of sound spaces (Jennings and Cain, 2013). In managing urban sound environments, an evaluation phase should be the first step. This evaluation would focus on the main acoustical characteristics of the soundscape and would also ascertain how the soundscape is perceived by the population exposed to it (Torija et al., 2013). A method for evaluating soundscapes should consider an acoustical categorization (Rychtáriková and Vermeir, 2013). However, because it involves human perception, soundscape evaluation should not be restricted to acoustical determinations (Zannin et al., 2003); the human element needs to be included (Dubois et al., 2006; Maris et al., 2007a,b).
For all the foregoing, this study develops and tests an automatic tool to classify urban soundscapes. The development of this tool is based on the results previously reported by Torija et al. (2013), who proposed a methodology for categorizing and differentiating urban soundscapes using acoustical descriptors and semantic-differential (SD) attributes; those results are briefly outlined in this paper for completeness. In the present study, a classification model is constructed using: (i) Support Vector Machines (SVM) and (ii) Support Vector Machines trained with the Sequential Minimal Optimization algorithm (SMO). This model seeks to allow an automatic classification of urban soundscapes on the basis of acoustical as well as perceptual criteria. The underlying hypothesis is that the developed classification model allows acoustical and (indirect) perceptual assessments of soundscapes, which will enable a proper evaluation of a given urban soundscape.
It should be noted that this research is aimed at developing a statistical model for classifying urban sound environments into one of the categories of soundscapes previously identified by Torija et al. (2013). Thus, Section 2 presents a brief introduction to SVM and SMO algorithms. In Section 3, the methodology is described for the collection of the acoustical data, the establishment of input variables used for model implementation, and the evaluation of model performance. Finally, in Section 4, the experimental results are presented and discussed.
2. Support Vector Machines for developing classification models
The development of models for environmental problems is becoming more relevant for environmental engineers and scientists. Machine learning methods are extensively applied in environmental modeling (Li et al., 2011), owing to their robustness and ability to solve complex non-linear problems. One of the most widely used machine learning methods for approaching non-linear environmental problems is SVM. This method has been successfully applied to a wide range of real problems, including document classification (Fu and Lee, 2012), bioinformatics (Noble, 2004), financial applications (Ince and Trafalis, 2002), and environmental modeling (Lu and Wang, 2005; Solomatine and Ostfeld, 2008; Yang et al., 2012). The SVM method is a popular and promising tool for data classification (Chen and Lin, 2006), which provides several advantages: (i) better generalization performance compared with many other machine learning methods (Shao et al., 2012), owing to the adoption of the Structural Risk Minimization (SRM) principle, which minimizes the upper bound of the generalization error (Vapnik, 1998; Cristianini and Shawe-Taylor, 2000; Deng et al., 2012). With the implementation of SRM, SVM models can avoid problems such as over-fitting and local minima (typical drawbacks of conventional neural network models) (Lu and Wang, 2005); (ii) SVM models contain few free parameters to be estimated (Lu and Wang, 2005); and (iii) SVM models have proved highly expandable and robust (Lu and Wang, 2005).
2.1. Brief introduction to SVM techniques
An SVM is a machine learning technique that performs non-probabilistic, non-linear classification by using kernel functions. Its basic idea is to construct a hyper-plane or a set of hyper-planes in a high- or infinite-dimensional space to achieve the largest separation between different classes (Steinwart and Christmann, 2008). Among all possible hyper-planes, the one with the maximum margin between classes (optimal hyper-plane) is selected (Chen and Lin, 2006). The SVM method maps the original data x into a feature space F with higher dimensionality via a non-linear mapping function ϕ (Vapnik, 1995). With SVM used as the classifier, the different classes of data are separated by hyper-planes defined by the decision function in Eq. (1):

\[ f(x_i) = \operatorname{sgn}\left(\omega^{T}\phi(x_i) + b\right) \tag{1} \]

where f(x_i) is the prediction function, sgn(·) is the sign function, ω^T is the transpose of the normal vector ω, ϕ(·) is the mapping function, and b is a scalar. During the training stage of SVM the optimal hyper-planes are sought. This training phase involves the solution of a quadratic programming problem (QPP). However, because the original SVM (Vapnik, 1995, 1998) was developed for binary classification, various reformulations of the SVM algorithm have been proposed to deal with more than two classes (Bolbol et al., 2012). Thus, Crammer and Singer (2000) introduced a reformulation of the support vector quadratic problem in order to extend binary SVM into a multi-class SVM. The proposed algorithm is presented in Eqs. (2), (3) and (4).

Minimise
\[ t(\{\omega_n\}, \varepsilon) = \frac{1}{2}\sum_{n=1}^{k}\|\omega_n\|^{2} + \frac{C}{m}\sum_{i=1}^{m}\varepsilon_i \tag{2} \]
Subject to:
\[ \langle \phi(x_i), \omega_{y_i} \rangle - \langle \phi(x_i), \omega_n \rangle \geq b_i^{n} - \varepsilon_i \quad (i = 1, \ldots, m) \tag{3} \]
for which the decision function is:
\[ \operatorname*{argmax}_{n=1,\ldots,k} \langle \phi(x_i), \omega_n \rangle \tag{4} \]
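The multi-class decision rule of Eq. (4) can be illustrated with a minimal sketch. It assumes, purely for illustration, that ϕ is the identity map (a linear kernel) and that the per-class weight vectors are already known; the function name `decide` and the weight values in `W` are hypothetical and do not come from the models built in this study.

```python
from typing import Sequence


def decide(x: Sequence[float], weights: Sequence[Sequence[float]]) -> int:
    """Return the class index n that maximises <phi(x), w_n>.

    Assumption: phi is the identity map, so the inner product is a plain
    dot product. A kernel-based model would evaluate kernels instead.
    """
    scores = [sum(w_j * x_j for w_j, x_j in zip(w, x)) for w in weights]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical weight vectors for three classes in a 2-D feature space.
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]

# Scores are 0.2, 0.9 and -1.1, so the second class (index 1) is selected.
label = decide([0.2, 0.9], W)
```

Each class contributes one score, and the predicted class is simply the one with the largest score, so no pairwise voting between binary classifiers is needed.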
In Eqs. (2)–(4), ε is the tolerance of the termination criterion, m is the number of training patterns (called support vectors) and C is the cost parameter. The cost parameter C controls how heavily misclassifications are penalized: high C values force the SVM to create a prediction function complex enough to misclassify as few training points as possible, while a lower C value leads to a simpler prediction function (Bolbol et al., 2012). For the application of this method, the optimal values of the C parameter and of the tolerance parameter used for checking the stop criterion were sought and chosen.
2.2. Brief introduction to SMO
As stated above, the training of SVM requires the solution of a QPP. Because of this, SVM is characterized by high computational complexity, which restricts its applicability. Therefore, several improved algorithms have been proposed (Shao et al., 2012). One of these improved algorithms is SMO (Platt, 1999). By training SVM with the SMO algorithm, a quick solution can be found, without using any extra matrix storage and without implementing numerical QPP optimization processes (Platt, 1999). The SMO algorithm breaks the classification problem into a series of the smallest possible sub-problems, which are then solved analytically, with Osuna's theorem ensuring convergence. The SMO algorithm approaches the classification problem by finding two Lagrange multipliers, which are optimized with respect to each other, and by analytically computing the optimal step for the two Lagrange multipliers (Flake and Lawrence, 2001). The SMO algorithm actually has two components: an analytic method for solving
the two Lagrange multipliers and a heuristic for choosing which multipliers to optimize (Platt, 1999). For implementing the SMO model, the stop criterion of Shevade et al. (2000) is commonly used. For this implementation, besides the C and tolerance parameters, the optimal value of the epsilon parameter for round-off error (∈) was sought and chosen.
2.3. Kernel functions used
SVM uses an implicit mapping of the input data into a high-dimensional feature space, defined by a kernel function (Bolbol et al., 2012). The SVM solution is calculated as a weighted sum of kernel-function outputs, where the kernel function can be an inner product, Gaussian basis function, polynomial, or any other function that obeys Mercer's condition (Flake and Lawrence, 2001). In this study, two kernel functions are used: (i) Gaussian Radial Basis Function (RBF) [Eq. (5)] and (ii) Pearson VII function-based universal kernel (PUK) [Eq. (6)]. Both kernel functions have been widely used in classification problems (Üstün et al., 2006; Bolbol et al., 2012). The SVM model is implemented with the RBF as the kernel function, whereas the SMO model uses PUK as the kernel function.

\[ k(x, x') = \exp\left(-\gamma \|x - x'\|^{2}\right) \tag{5} \]

\[ K(x_i, x_j) = \frac{1}{\left[1 + \left(\dfrac{2\sqrt{\|x_i - x_j\|^{2}}\,\sqrt{2^{(1/\omega)} - 1}}{\sigma}\right)^{2}\right]^{\omega}} \tag{6} \]
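As a minimal sketch of how the two kernels in Eqs. (5) and (6) could be evaluated for a pair of descriptor vectors, the following uses only the Python standard library. The function names and the parameter values in the example calls are illustrative assumptions; they are not the tuned values used in this study.

```python
import math
from typing import Sequence


def squared_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Squared Euclidean distance ||x - y||^2."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def rbf_kernel(x: Sequence[float], y: Sequence[float], gamma: float) -> float:
    """Gaussian RBF kernel, Eq. (5)."""
    return math.exp(-gamma * squared_distance(x, y))


def puk_kernel(x: Sequence[float], y: Sequence[float],
               omega: float, sigma: float) -> float:
    """Pearson VII universal kernel (PUK), Eq. (6)."""
    distance = math.sqrt(squared_distance(x, y))
    scaled = 2.0 * distance * math.sqrt(2.0 ** (1.0 / omega) - 1.0) / sigma
    return 1.0 / (1.0 + scaled ** 2) ** omega


# Hypothetical descriptor vectors and parameter values, for illustration only.
a, b = [0.0], [1.0]
k_rbf = rbf_kernel(a, b, gamma=1.0)             # exp(-1)
k_puk = puk_kernel(a, b, omega=1.0, sigma=1.0)  # 1 / (1 + 2^2) = 0.2
```

Both kernels return 1 for identical inputs and decay towards 0 as the distance grows; in PUK, σ sets the width of the peak and ω the shape of its tails.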
For the implementation of the PUK kernel in the SMO model, the optimal values of the ω and σ parameters were sought and chosen while, for implementing RBF with SVM, the optimal value of the γ parameter was sought and selected.

3. Methodology
3.1. Study area
The study area is the city of Granada, which is located in southeastern Spain (Fig. 1). It has a total area of 88.02 km² and a population of 239,154 inhabitants (population density = 2717 inhabitants/km²). The city is administratively divided into eight districts that have different characteristics in terms of urban morphology, architectural design, main economic activity and social class. The main economic activity in the city is linked to services (commerce and tourism). Regarding land use, most of the urban area is devoted to residential use. Important areas are also devoted to tertiary-sector activities and to health and educational centers. The area devoted to industrial use is practically insignificant. Moreover, it should be noted that the city has a significant number of parks and gardens with strong historic and popular associations. As for the transport sector, there is a heavy flow of road vehicles between the city and nearby towns, which involves high levels of traffic in the city, with the consequent problems of traffic jams and congestion of the road transport system (Torija et al., 2012). Hence, it can be established that road traffic is the main source of environmental noise.
Thus, the city of Granada is characterized by great spatial heterogeneity in urban morphology, architectural design and human activity, which is also typical of medium-sized cities in Southern Europe. This heterogeneity is observed in the 8 districts of the city. The “Beiro”, “Centro” and “Ronda” districts contain the main urban roads as well as the economic center of the city. In these districts, which can be considered as the downtown, important areas are devoted to uses related to the tertiary sector. In addition, the “Chana” and “Zaidín” districts are popular residential areas characterized by the coexistence of low-medium intensities of road traffic and traditional human activities, such as small shops and leisure places. The “Norte” and “Genil” districts are newly built residential areas, which are characterized by the presence of wide avenues and recreational areas (such as parks and gardens). Finally, the “Albaycin” district, which was declared a world heritage site in 1984, is a historic area of the city. This area retains the narrow winding streets of its Medieval Moorish past. Moreover, there is vibrant tourist activity across its multiple historical sites and small gardens and squares.
As a consequence of all this heterogeneity, the sound conditions in each of these districts are quite different. This sound heterogeneity can be observed in Table 1, which shows the estimated number of people exposed to different bands of Lden in each of the 8 districts of the city of Granada. This information was obtained from the Strategic Noise Map of the city of Granada (SICA, 2007). As can be observed in Table 1, popular residential areas, with several types of soundscapes composed of sounds coming from road traffic mixed with those of human occupancy, birds and fountains, show the lowest impact on the population. In addition, downtown districts show a high percentage of population exposed to Lden levels between 60 and 65 dBA, and several types of soundscapes coexist. On the other hand, in the “Albaycin” district, where different types of soundscapes are present, one dominated by sounds coming from road traffic and another by sounds from human occupancy and tourism, the noise impact is greater, with half of the population exposed to Lden levels over 65 dBA. The same occurs in districts whose soundscapes are dominated by sounds coming mainly from road traffic (Ronda district). As can be observed, noise exposure is an important problem, with a large number of people exposed to high noise levels, although it is comparable to that of other similar cities in the area.
3.2. Data collection
In Torija et al. (2013), on the basis of acoustical and perceptual criteria, 15 different typologies of soundscapes were identified. To address this task, an extensive data collection campaign was performed, which included locations in all the districts of the city, with different land use, urban morphology, geometry, typology (road, square, park), human activity, social class and predominant sound source. In each of the selected locations both subjective and acoustical evaluations were conducted. On the one hand, for the acoustical evaluation, a set of descriptors was used, covering the equivalent sound-pressure level, maximum–minimum sound-pressure level, impulsiveness of the sound-pressure level, sound-pressure level time course, and spectral composition. On the other hand, the subjective evaluation was performed using a questionnaire with 15 semantic-differential scales: (i) quiet–loud, (ii) pleasant–unpleasant, (iii) not annoying–very annoying, (iv) relaxing–irritating, (v) quiet–disturbing, (vi) bearable–unbearable, (vii) little attending–very attending, (viii) organized–disorganized, (ix) far–nearby, (x) continuous–discontinuous, (xi) smooth–rough, (xii) distinct–hubbub, (xiii) varied–monotonous, (xiv) predictable–chaotic, and (xv) calming–agitating. Finally, with all that perceptual and acoustical information a hierarchical cluster analysis was conducted to classify soundscapes into different typologies.
The design of the measurement campaign for collecting the acoustical data to develop the model for automatic classification of soundscapes was based both on the results found in Torija et al. (2013) and on previous knowledge and experience of the different sound environments present in the city, acquired by this research group during the development of the strategic noise map of the city of Granada. Firstly, a study was conducted in order to select, among the typologies identified in Torija et al.
(2013), the most predominant types of soundscapes in the city of Granada. After this study, 10 types of soundscapes were chosen to conduct the acoustical measurements. A description of the selected categories of soundscapes can be found in Table 2, which can be considered as a standard approach in medium-sized cities in Southern Europe.
Fig. 1. Orthoimage of the city of Granada.
Secondly, in order to make a selection of the sampling points appropriate to the objective of this work, locations with the same characteristics of urban morphology, geometry, typology, land use, and predominant sound source as those included in the different typologies of soundscapes found in Torija et al. (2013) were sampled. Thus, for instance, the sampled locations belonging to soundscape category SC1 had the same characteristics as the locations included in the reference soundscape category 1. The criteria adopted when selecting the different measuring points were: (i) that the sample of
the different types of soundscapes considered should be representative, and (ii) that the proportions of the selected categories of soundscapes in the sample should approximate those in the city of Granada. In addition, it should be noted that a given measurement point was ultimately associated with a specific category of soundscape based on the experience and criteria adopted by the measurement team. Overall, 469 measurement points were chosen.

Table 1
Estimated number of people (hundreds) exposed to different ranges of Lden in each of the 8 districts of the city of Granada (Strategic Noise Map of the city of Granada (SICA, 2007)).

City district   Lden range (dBA)
                55–59      60–64      65–69      70–74     >75
Albaycin        24 (29)    17 (20)    26 (31)    15 (18)   1 (1)
Beiro           54 (26)    82 (39)    52 (24)    22 (10)   2 (1)
Centro          63 (26)    93 (38)    62 (26)    20 (8)    4 (2)
Chana           54 (27)    77 (39)    54 (28)    12 (6)    0 (0)
Genil           87 (40)    64 (29)    62 (28)    7 (3)     0 (0)
Norte           92 (38)    94 (39)    48 (20)    6 (2)     0 (0)
Ronda           80 (19)    149 (35)   107 (26)   75 (18)   9 (2)
Zaidín          135 (36)   143 (39)   70 (19)    21 (6)    1 (0)

In brackets is shown the percentage of people (with respect to the total population of the district) exposed to different bands of Lden.

As for the measurement procedure, the criterion adopted was, at each selected point, to take the acoustical measurement when the conditions of the sound environment were close to the description of the soundscape category sampled. Once the period of time in which to conduct the data collection at each selected point had been set, an analysis was performed to determine the measurement time necessary to obtain a representative sample of the sound evaluated. This analysis was based on the previous knowledge acquired by this research group through the acoustical characterization of the city of Granada, obtained from both the Noise Map developed according to Directive 2002/49/EC, and the dynamics of the sound sources present in the sound environments, categorized by the stabilization time of the sound-pressure level (Torija et al., 2011) in such environments. Taking into account the continuous noise monitoring data (over three years in selected locations) and long-term measurements at selected points, short time periods in which the soundscapes are stationary in time were selected, thus applying a time window over the recorded data. This time window was fixed at 3 min, and the sound in this time window was taken as a representative sample of the soundscape. The data collection campaign was carried out on different types of days (i.e. workdays and weekends) and in different daily periods (from 8:00 h to 23:00 h).
For these measurements, a type-1 sound-level meter (2260 Observer/Investigator model with sound basic analysis program BZ7219), with tripod and wind screen, was used. Before the measurement, the sound-level meter was calibrated using a 4231 Brüel & Kjaer calibrator. Measurements followed the ISO 1996-2:2007 guidelines, meaning that the
sound-level meter was placed at a height of 1.5 m and 2 m away from the nearest vertical surface.
Finally, it should be noted that each type of soundscape considered in this work was compared for acoustical similarity with its corresponding soundscape category (Torija et al., 2013). For this analysis, the squared Euclidean distance between the centroid of each of the 10 types of soundscapes measured and the centroid of its corresponding category was calculated. This analysis aimed to assess whether the sound characteristics of each of the 10 types of soundscape considered here were similar to the sound characteristics of its corresponding soundscape category. This aspect is of great importance: to develop a model for soundscape classification based on acoustical and perceptual criteria, the correspondence must be ensured between the types of soundscape measured and the categories of soundscape identified as distinct in the city of Granada on the basis of acoustical and perceptual descriptors (Torija et al., 2013), hereafter called reference soundscape categories. The underlying hypothesis is that, under similar characteristics of urban morphology, geometry, typology, land use, and predominant sound sources, if the acoustical similarity between a given type of soundscape measured and its corresponding reference soundscape category is proved, then the average perceptual pattern of this type of soundscape measured would be similar to that of its corresponding reference soundscape category.
3.3. Input variables for soundscape classification model
A set of acoustical descriptors was found to have the greatest impact on the differentiation of the typologies of soundscapes characteristic of the city of Granada (see Torija et al., 2013 for a detailed analysis). These descriptors are listed in the next paragraph.
Therefore, these acoustical descriptors were used as input variables in order to build the model for classification of measured locations into the 10 types of soundscapes considered in this work (Table 2). From each of the three-minute measurements conducted at the locations chosen, a total of 14 acoustical descriptors were calculated: (i) A-weighted energy-equivalent sound-pressure level (LAeq), characterizing the overall sound-level exposure; (ii) A-weighted sound-pressure level exceeded by 1% of the measurement data (LA1), for describing intrusive sound levels; (iii) minimum A-weighted sound-pressure level with an impulse response (LAImin), for describing minimum sound-pressure-level values; (iv) crest factor (CF) (Torija et al., 2011), which is used for the characterization of impulsiveness; (v) sound level at the 25 Hz (L25), 31.5 Hz (L31.5), and 125 Hz (L125) 1/3-octave bands, for characterizing low-frequency content; (vi) sound level at the 500 Hz (L500), 630 Hz (L630), and 800 Hz (L800) 1/3-octave bands, to describe medium-frequency content; and (vii) sound level at the 5 kHz (L5000), 10 kHz (L10000), 16 kHz (L16000), and 20 kHz (L20000) 1/3-octave bands, to describe high-frequency content.

Table 2
Main sound features and description of the 10 types of urban soundscapes considered in this work, with their corresponding reference soundscape category.

SC1 (reference soundscape category 1)
  Main sound features: Soundscape dominated by road traffic noise.
  Description: Main avenues with high road-traffic intensity. The presence of road traffic is very high and constant. Wide avenues with 3 or more lanes of traffic.
SC2 (reference soundscape category 7)
  Main sound features: Soundscape dominated by road traffic noise and sirens.
  Description: Main avenues with high road-traffic intensity and large number of vehicles with sirens as well. The presence of road traffic is very high and constant. Wide avenues with 3 or more lanes of traffic.
SC3 (reference soundscape category 6)
  Main sound features: Sounds from road traffic mix with those of human occupancy and birds.
  Description: Streets with medium road traffic intensity. Intermittent presence of road traffic. Streets with up to 2 lanes of traffic. Presence of vegetation (trees), birds and human occupancy.
SC4 (reference soundscape category 8)
  Main sound features: Sound from human occupancy and birds predominate over road traffic sounds.
  Description: Streets with low road traffic intensity in school, residential, and tourist areas. Streets with up to 2 lanes of traffic. Presence of vegetation (trees), birds and human occupancy.
SC5 (reference soundscape category 4)
  Main sound features: Soundscape dominated by road traffic noise.
  Description: Open spaces (free field) in the vicinity of motorways.
SC6 (reference soundscape category 14)
  Main sound features: Road traffic noise, plus sounds from fountains.
  Description: Wide urban squares affected by road traffic, but with the presence of fountains.
SC7 (reference soundscape category 9)
  Main sound features: Human-occupancy sounds.
  Description: Pedestrian areas with many persons passing by. Commercial streets and squares with the predominant presence of human-occupancy sounds. Narrow streets and small size squares.
SC8 (reference soundscape category 11)
  Main sound features: Birds sounds, human-occupancy sounds, plus road traffic noise.
  Description: Large city parks (open areas) situated close to a main avenue with large road traffic intensity.
SC9 (reference soundscape category 13)
  Main sound features: Birds sounds, human-occupancy sounds.
  Description: Small city parks and gardens isolated from road traffic.
SC10 (reference soundscape category 2)
  Main sound features: Fountains sounds, birds sounds.
  Description: Quiet open areas, isolated from sounds of human activities, and highly dominated by the presence of fountains. They are locations characterized by the massive presence of vegetation (birds in the background) and with fountains.

3.4. Performance evaluation
For evaluating the performance of the models built, as well as for assessing possible differences in results between the two approaches implemented in this work (SVM and SMO), four statistical parameters were used: True Positive (TP) rate, False Negative (FN) rate, F-measure and the Area Under the Curve (AUC) in a Receiver Operating Characteristic (ROC) analysis. The TP rate is the proportion of examples which were classified as class x, among all examples which truly have class x, i.e. how much of the class was captured.
\[ \text{TP rate} = \frac{tp}{tp + fn} \tag{7} \]

where tp is true positive (correct result) and fn is false negative (missing result). The FN rate is the proportion of examples which were not classified as class x, among all examples which truly have class x.

\[ \text{FN rate} = \frac{fn}{tp + fn} \tag{8} \]

where fn is false negative and tp is true positive. The F-measure is a combined measure for recall (equivalent to TP rate) and precision, which can be described as:

\[ \text{F-Measure} = 2\,\frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{9} \]

The precision is the proportion of the examples which truly have class x among all those which were classified as class x.

\[ \text{Precision} = \frac{tp}{tp + fp} \tag{10} \]

where tp is true positive and fp is false positive (unexpected result). The ROC curve is a graphic representation of the TP rate vs. (1 − Specificity). Specificity can be described as:

\[ \text{Specificity} = \frac{tn}{tn + fp} \tag{11} \]
where tn is true negative (correct absence of result) and fp is false positive. These statistical parameters are used for comparing the performance of the SVM and SMO models in classifying measured locations into the 10 types of soundscape considered. They were chosen for checking both overall and per-category performance differences between the models built. Moreover, model performance was evaluated using the standard 10-fold cross-validation technique.
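The per-class parameters of Eqs. (7)–(11) can be sketched as follows, treating each soundscape category one-vs-rest. This is a minimal illustration, not the evaluation code used in the study: the function names, the toy labels, and the choice of weighting each class by its number of true instances when averaging are all assumptions.

```python
from typing import Dict, Sequence


def class_metrics(actual: Sequence[str], predicted: Sequence[str],
                  label: str) -> Dict[str, float]:
    """One-vs-rest metrics of Eqs. (7)-(11) for a single class label."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == label and p == label for a, p in pairs)
    fn = sum(a == label and p != label for a, p in pairs)
    fp = sum(a != label and p == label for a, p in pairs)
    tn = sum(a != label and p != label for a, p in pairs)

    def ratio(num: float, den: float) -> float:
        # Return 0.0 instead of failing when a class never occurs.
        return num / den if den else 0.0

    recall = ratio(tp, tp + fn)
    precision = ratio(tp, tp + fp)
    return {
        "tp_rate": recall,                                   # Eq. (7)
        "fn_rate": ratio(fn, tp + fn),                       # Eq. (8)
        "f_measure": ratio(2 * precision * recall,
                           precision + recall),              # Eq. (9)
        "precision": precision,                              # Eq. (10)
        "specificity": ratio(tn, tn + fp),                   # Eq. (11)
    }


def weighted_average(actual: Sequence[str], predicted: Sequence[str],
                     metric: str) -> float:
    """Average a metric over classes, weighted by true class frequency."""
    truth = list(actual)
    return sum(truth.count(label) * class_metrics(truth, predicted, label)[metric]
               for label in set(truth)) / len(truth)


# Toy labels, for illustration only (not data from this study).
actual = ["SC1", "SC1", "SC1", "SC7", "SC7", "SC9"]
predicted = ["SC1", "SC1", "SC7", "SC7", "SC7", "SC9"]
sc1 = class_metrics(actual, predicted, "SC1")
overall_tp_rate = weighted_average(actual, predicted, "tp_rate")
```

With this weighting, the averaged TP rate equals the overall fraction of correctly classified instances (5 of the 6 toy examples here), which is why a weighted TP rate can be read directly as a percentage of instances correctly classified.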
4. Experimental results and discussion
4.1. Correspondence analysis between measured and reference categories of soundscapes
Before implementing SVM and SMO algorithms to develop a soundscape-classification model, as stated in Section 3.2, an analysis was made to ensure the acoustical correspondence between each of the types of soundscapes considered in this work and its corresponding reference soundscape category (Torija et al., 2013). This correspondence analysis is an essential step in developing the classification model, since such a correspondence is necessary in order to establish a classification model which includes acoustical and perceptual criteria. Table 3 shows the squared Euclidean distance between the centroid of each type of soundscape considered here and the centroid of each reference soundscape category. As expected, the largest distances are found between those types of soundscapes where the main sound source is road traffic and those where human and nature (birds, water) sounds predominate. Moreover, some of the types of soundscapes with predominance of road traffic have relatively similar values. This is the case of soundscape type SC3, which has relatively similar distance values with reference soundscape categories 1 and 6, and soundscape type SC4, which has relatively similar distance values with reference soundscape categories 6 and 8. Nevertheless, Table 3 shows that all the types of soundscapes considered in this work had the smallest squared Euclidean distance with their corresponding reference soundscape categories. These results imply the acoustical correspondence between measured and reference soundscape categories and, therefore, establish that the locations measured in a given category of soundscape will be properly classified on the basis of acoustical and perceptual criteria.
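The correspondence check described above can be sketched in a few lines: compute the centroid of the measured descriptor vectors for one soundscape type, then find the reference category whose centroid lies at the smallest squared Euclidean distance. The function names and all numeric values below are hypothetical (two descriptors only, instead of the 14 used in the study) and serve only to illustrate the computation.

```python
from typing import Dict, List, Sequence


def centroid(points: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise mean of a set of descriptor vectors."""
    count = len(points)
    return [sum(column) / count for column in zip(*points)]


def squared_euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_reference(measured: Sequence[Sequence[float]],
                      references: Dict[str, Sequence[float]]) -> str:
    """Key of the reference centroid closest to the measured centroid."""
    center = centroid(measured)
    return min(references, key=lambda key: squared_euclidean(center, references[key]))


# Hypothetical values: two descriptors per location (e.g. LAeq and CF).
measured_sc1 = [[72.0, 3.0], [74.0, 5.0]]            # centroid is [73.0, 4.0]
reference_centroids = {"1": [72.0, 4.0], "9": [60.0, 8.0]}
closest = nearest_reference(measured_sc1, reference_centroids)
distance_to_1 = squared_euclidean(centroid(measured_sc1), reference_centroids["1"])
```

In this toy case the measured type lies closest to reference category "1", which is the kind of agreement the analysis looks for across all 10 soundscape types.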
4.2. Model for soundscape classification
The parameters used to build both the SVM and SMO classification models are shown in Table 4. After a prior analysis, it was found that the RBF kernel for the SVM model and the PUK kernel for the SMO model gave the best performance in this classification problem. The outcomes of the two machine learning techniques used in this work (Table 5) indicate that both the SVM and SMO models offered high performance in the classification of soundscapes: the weighted average TP rate was 0.885 and 0.913 for the SVM and SMO models, respectively. However, an analysis of the results per soundscape category shows that the SVM model does not provide a good enough result in classifying some categories (SC2, SC5, and SC10). Indeed, for TP rate, the SMO model outperforms the SVM model in 7 out of the 10 soundscape categories, whereas the SVM model achieves the better result only in category SC6. With regard to the F-measure, the SMO model gives higher values than the SVM model in all categories except SC6, and the weighted average value of the SMO model (0.911) is also greater than that of the SVM model (0.882). As for the AUC, the SMO model not only offers higher values than the SVM model but also exceeds 0.92 in all soundscape categories. The weighted average value of AUC is 0.929 and 0.973 for the SVM and SMO models, respectively. In a complementary way, the FN rate for each soundscape category is plotted in Fig. 2(a) (SVM model) and (b) (SMO model). An examination of both figures reveals that the FN rate of the SVM model is greater than that of the SMO model in almost all the soundscape categories. Besides, the SVM model makes classification mistakes that can be critical from a sound standpoint. Thus, the SVM model classifies as SC7 (human-sound predominance) soundscapes belonging to category SC1
A.J. Torija et al. / Science of the Total Environment 482-483 (2014) 440–451
Table 3
Squared Euclidean distance between the centroid of each of the considered soundscape categories (rows) and the centroid of each of the reference soundscape categories (columns).

Soundscape    Reference soundscape category
category      1        7        6        8        4        14       9        11       13       2
SC1           0.276*   0.877    0.567    1.077    1.399    0.903    1.341    1.408    1.788    1.380
SC2           0.613    0.352*   0.848    1.174    1.613    1.201    1.528    1.651    1.979    1.639
SC3           0.404    1.004    0.318*   0.844    1.142    0.703    1.126    1.054    1.479    1.238
SC4           0.787    1.317    0.424    0.401*   0.698    0.501    0.656    0.603    0.916    0.991
SC5           0.963    1.482    0.630    0.684    0.580*   0.621    0.739    0.667    0.941    1.147
SC6           0.684    1.274    0.523    0.671    1.034    0.445*   0.852    1.047    1.226    0.698
SC7           0.778    1.358    0.603    0.731    0.774    0.615    0.540*   0.871    0.954    0.738
SC8           1.191    1.648    0.800    0.739    0.802    0.836    0.901    0.267*   0.766    1.213
SC9           1.433    1.870    1.054    0.720    0.880    0.910    0.765    0.652    0.483*   0.951
SC10          1.424    1.881    1.116    0.788    1.072    0.888    0.810    0.988    0.704    0.579*

Values marked with an asterisk (bold in the original) are the squared Euclidean distances between each considered soundscape category and its corresponding reference soundscape category.
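The statement that every type of soundscape is closest to its own reference category can be checked directly from the values of Table 3. The short sketch below does so and also reports the margin to the second-closest reference category. It assumes that each value column of the table corresponds to one reference category (1, 7, 6, 8, 4, 14, 9, 11, 13, 2) and that the pairing between measured types and reference categories is the one given in the header of Table 6; the variable and function names are illustrative.

```python
measured = ["SC1", "SC2", "SC3", "SC4", "SC5", "SC6", "SC7", "SC8", "SC9", "SC10"]

# Pairing of measured type and reference category, as listed in Table 6.
expected_reference = dict(zip(measured, [1, 7, 6, 8, 4, 14, 9, 11, 13, 2]))

# One list per reference category; entries follow the order SC1..SC10 (Table 3).
distance_by_reference = {
    1:  [0.276, 0.613, 0.404, 0.787, 0.963, 0.684, 0.778, 1.191, 1.433, 1.424],
    7:  [0.877, 0.352, 1.004, 1.317, 1.482, 1.274, 1.358, 1.648, 1.870, 1.881],
    6:  [0.567, 0.848, 0.318, 0.424, 0.630, 0.523, 0.603, 0.800, 1.054, 1.116],
    8:  [1.077, 1.174, 0.844, 0.401, 0.684, 0.671, 0.731, 0.739, 0.720, 0.788],
    4:  [1.399, 1.613, 1.142, 0.698, 0.580, 1.034, 0.774, 0.802, 0.880, 1.072],
    14: [0.903, 1.201, 0.703, 0.501, 0.621, 0.445, 0.615, 0.836, 0.910, 0.888],
    9:  [1.341, 1.528, 1.126, 0.656, 0.739, 0.852, 0.540, 0.901, 0.765, 0.810],
    11: [1.408, 1.651, 1.054, 0.603, 0.667, 1.047, 0.871, 0.267, 0.652, 0.988],
    13: [1.788, 1.979, 1.479, 0.916, 0.941, 1.226, 0.954, 0.766, 0.483, 0.704],
    2:  [1.380, 1.639, 1.238, 0.991, 1.147, 0.698, 0.738, 1.213, 0.951, 0.579],
}


def nearest_reference(sc):
    """Return (closest reference category, margin to the second closest)."""
    row = measured.index(sc)
    ranked = sorted((column[row], ref) for ref, column in distance_by_reference.items())
    (best, ref), (second, _) = ranked[0], ranked[1]
    return ref, round(second - best, 3)


results = {sc: nearest_reference(sc) for sc in measured}
all_match = all(results[sc][0] == expected_reference[sc] for sc in measured)
```

Under these assumptions `all_match` is true, and the smallest margins belong to the road-traffic types SC3 and SC4, in line with the near-ties discussed in Section 4.1.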
(road-traffic predominance) and also classifies as SC1 the soundscapes belonging to category SC10 (quiet areas dominated by fountains). All these results suggest that the SMO model outperforms the SVM model in classifying the urban soundscapes considered in this work. This finding is consistent with the literature on other problems, where the SMO algorithm has been shown to be a powerful and efficient classification technique (Nahar et al., 2012). Moreover, if this model is implemented as an automatic classifier of soundscapes in urban environments, the SMO algorithm has an advantage over the SVM, since SVM training involves a great computational complexity that restricts its application to large-scale problems, while SMO achieves faster training of the SVM (Platt, 1999).
4.3. Analysis of misclassified instances
An analysis was made of the instances incorrectly classified by the classification model developed (the SMO model). This analysis was aimed at understanding, on the basis of their values for the 14 acoustical descriptors used as input variables, why the SMO classification model misclassifies such instances. Thus, Fig. 3(a–j) plots the normalized difference (for each descriptor) between the instances misclassified as a given type of soundscape and the average value of the soundscape category to which they belong. From this analysis, some important aspects should be highlighted:
(i) First, in Fig. 3(a) and (c), two instances belonging to soundscape category SC1, with smaller values in overall descriptors (LAeq, LA1 and LAImin) and in medium-high frequencies, are classified as category SC3, whereas the opposite occurs with three instances belonging to category SC3, which are classified as SC1 due to their higher values in such descriptors. Thus, the classification model appears to identify the variations in such descriptors (especially overall descriptors and L125 Hz–L10 kHz) as indicators of road-traffic intensity.
(ii) Second, in Fig. 3(b) and (d), the model appears to identify the variations in LAeq, LA1, medium frequencies and the CF parameter as indicators of the presence of vehicles with sirens. Thus, because of great differences in such descriptors, two instances belonging to SC1 and SC4 are classified as SC2.
(iii) Third, in Fig. 3(c), (d) and (h), the classification model appears to use variations in medium frequencies (L500 Hz, L630 Hz and L800 Hz) as indicators of the presence of human sounds. This leads the model to classify four instances belonging to categories SC3, SC4, and SC8 as SC7.
(iv) Fourth, in Fig. 3(f), (g) and (h), low-frequency variations seem to be used by the model to detect the presence or absence of road traffic.
(v) Fifth, in Fig. 3(d) and (h), it is observed that the model uses decreases in low-medium frequencies and increases in high frequencies to classify instances belonging to soundscape categories SC4 and SC8 as category SC9. Such information could be used by the classification model to detect the presence of natural sounds and the absence of road traffic.
(vi) Finally, Fig. 3(g) reflects that the high-frequency content appears to lead the classification model to detect the presence of fountains.
4.4. Evaluation of urban soundscapes
Table 4
Value of each parameter used for the implementation of the SVM and SMO algorithms.

Parameters             SVM       SMO
C                      10.0      8.0
ε                      –         1.0E−12
Tolerance criterion    1.0E−3    1.0E−3
Kernel type            RBF       PUK
γ                      10.0      –
ω                      –         0.4
σ                      –         1.0
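To make the kernel parameters of Table 4 concrete, the sketch below evaluates the two kernel functions for a pair of input vectors. It assumes the common form of the RBF kernel, exp(−γ‖x − y‖²), and the Pearson VII universal kernel (PUK) as formulated by Üstün et al. (2006), with γ, ω and σ taken from Table 4. The function names are illustrative, and the exact parameterization used by the software employed in this work may differ.

```python
import math


def squared_distance(x, y):
    """Squared Euclidean distance between two input vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=10.0):
    """RBF kernel, assuming the form exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * squared_distance(x, y))


def puk_kernel(x, y, omega=0.4, sigma=1.0):
    """Pearson VII universal kernel (PUK).

    sigma controls the half-width of the peak and omega its tailing.
    """
    distance = math.sqrt(squared_distance(x, y))
    scaled = 2.0 * distance * math.sqrt(2.0 ** (1.0 / omega) - 1.0) / sigma
    return 1.0 / (1.0 + scaled ** 2) ** omega


# Identical inputs give the maximum similarity of 1 for both kernels.
same_rbf = rbf_kernel([0.2, 0.7], [0.2, 0.7])
same_puk = puk_kernel([0.2, 0.7], [0.2, 0.7])

# At a distance of sigma / 2 the PUK value drops to one half of its maximum.
half_puk = puk_kernel([0.0], [0.5])
```

With ω = 0.4 the PUK function decays much more slowly with distance than the RBF function with γ = 10.0, which gives it heavier tails for the same pair of inputs.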
This work is based on the hypothesis that, given a set of categories of urban soundscapes which were classified and categorized using acoustical as well as perceptual descriptors, the development of a proper model that classifies a given location measured in an urban setting into one of those specific soundscape categories could allow the evaluation of the soundscape of this location according to acoustical and (indirectly) perceptual criteria. On the basis of this hypothesis, the acoustical correspondence between the 10 types of soundscapes considered in this work and their corresponding reference soundscape categories was confirmed (Section 4.1). Moreover, it has been demonstrated that, with the 14 acoustical descriptors used as input variables, the implementation of a powerful machine learning method such as SMO achieves high performance in classifying the measured records into their appropriate soundscape categories (Section 4.2). Thus, in this work, it is proposed that for evaluating the soundscape of a given urban location a number of steps should be taken:
Table 5
Weighted average and per soundscape category value of the parameters TP rate, F-measure, and AUC, with the use of the SVM and SMO algorithms.

Soundscape            SVM                             SMO
categories (target)   TP rate   F-measure   AUC       TP rate   F-measure   AUC
SC1                   0.987     0.958       0.976     0.987     0.977       0.991
SC2                   0.571     0.727       0.786     0.714     0.769       0.994
SC3                   0.936     0.915       0.951     0.945     0.928       0.968
SC4                   0.741     0.754       0.855     0.845     0.831       0.954
SC5                   0.636     0.778       0.818     0.727     0.842       0.984
SC6                   0.833     0.909       0.917     0.667     0.727       0.988
SC7                   0.804     0.796       0.889     0.804     0.837       0.925
SC8                   0.853     0.866       0.922     0.912     0.899       0.993
SC9                   0.931     0.931       0.963     0.966     0.949       0.997
SC10                  0.583     0.700       0.791     0.750     0.818       0.934
Weighted average      0.885     0.882       0.929     0.913     0.911       0.973
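The per-category comparison discussed in Section 4.2 can be reproduced from the values of Table 5. The sketch below counts, for each parameter, the categories in which each model obtains the higher value. The data are transcribed from Table 5; the dictionary layout and the function name are illustrative.

```python
categories = ["SC1", "SC2", "SC3", "SC4", "SC5", "SC6", "SC7", "SC8", "SC9", "SC10"]

# Per-category values from Table 5 (weighted averages omitted).
svm = {
    "tp_rate":   [0.987, 0.571, 0.936, 0.741, 0.636, 0.833, 0.804, 0.853, 0.931, 0.583],
    "f_measure": [0.958, 0.727, 0.915, 0.754, 0.778, 0.909, 0.796, 0.866, 0.931, 0.700],
    "auc":       [0.976, 0.786, 0.951, 0.855, 0.818, 0.917, 0.889, 0.922, 0.963, 0.791],
}
smo = {
    "tp_rate":   [0.987, 0.714, 0.945, 0.845, 0.727, 0.667, 0.804, 0.912, 0.966, 0.750],
    "f_measure": [0.977, 0.769, 0.928, 0.831, 0.842, 0.727, 0.837, 0.899, 0.949, 0.818],
    "auc":       [0.991, 0.994, 0.968, 0.954, 0.984, 0.988, 0.925, 0.993, 0.997, 0.934],
}


def tally(metric):
    """Split the categories into SMO wins, SVM wins and ties for one parameter."""
    pairs = list(zip(categories, svm[metric], smo[metric]))
    return {
        "smo_wins": [c for c, a, b in pairs if b > a],
        "svm_wins": [c for c, a, b in pairs if a > b],
        "ties":     [c for c, a, b in pairs if a == b],
    }


tp = tally("tp_rate")
f1 = tally("f_measure")
auc = tally("auc")
```

For TP rate this gives seven categories for the SMO model, one (SC6) for the SVM model and two ties (SC1 and SC7); for AUC the SMO model is higher in all ten categories.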
Fig. 2. FN rate for all the soundscape categories considered, with implementation of SVM model (a) and SMO model (b).
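The TP rate, FN rate and F-measure used in Table 5 and Fig. 2 can all be derived from a multiclass confusion matrix. The following minimal sketch shows one way of doing so, including the support-weighted average. The 3-class matrix is hypothetical and the function names are illustrative; it is not the confusion matrix obtained in this work.

```python
def per_class_metrics(matrix):
    """Per-class metrics from a square confusion matrix.

    matrix[i][j] is the number of instances of actual class i predicted as class j.
    """
    n = len(matrix)
    metrics = []
    for k in range(n):
        tp = matrix[k][k]
        fn = sum(matrix[k]) - tp                         # class k predicted as another class
        fp = sum(matrix[r][k] for r in range(n)) - tp    # other classes predicted as k
        support = tp + fn
        tp_rate = tp / support if support else 0.0
        fn_rate = fn / support if support else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        denom = precision + tp_rate
        f_measure = 2 * precision * tp_rate / denom if denom else 0.0
        metrics.append({"tp_rate": tp_rate, "fn_rate": fn_rate,
                        "f_measure": f_measure, "support": support})
    return metrics


def weighted_average(metrics, key):
    """Average of a metric weighted by the number of instances in each class."""
    total = sum(m["support"] for m in metrics)
    return sum(m[key] * m["support"] for m in metrics) / total


# Hypothetical confusion matrix with 10 instances per class.
confusion = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 1, 9],
]
metrics = per_class_metrics(confusion)
overall_tp_rate = weighted_average(metrics, "tp_rate")
```

Note that the support-weighted average of the TP rate equals the fraction of instances correctly classified, which is why a weighted average TP rate of 0.913 corresponds to 91.3% of instances correctly classified.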
(i) a sound-measurement campaign in the area under study;
(ii) the calculation of the 14 acoustical descriptors mentioned in Section 3.2;
(iii) the application of the developed SMO-based classification model, which classifies the urban location into its corresponding category of soundscape; and
(iv) once the urban location has been classified into a given soundscape category, an acoustical evaluation on the basis of the average value of all the acoustical descriptors used for that specific category.
Fig. 4(a–n) shows the average value of the 14 acoustical descriptors in each soundscape category. In addition, a perceptual evaluation can be approached from the outcomes shown in Table 6, where the average values of the 15 SD scales for each of the reference soundscape categories (Torija et al., 2013) are presented.
Once these acoustical and perceptual evaluations are conducted, relevant information can be obtained in order to make a proper diagnosis of the urban location and, if deviations between expected and estimated values are found, to implement appropriate corrective measures.
5. Conclusions
For a proper assessment of a soundscape, it is essential to include not only acoustical but also perceptual factors. For that reason, in this study a model is developed and tested for classifying a given urban sound space into its corresponding soundscape category, a category that has been previously defined according to acoustical and perceptual
Fig. 3. Normalized difference between the instances incorrectly classified as a given category of soundscape and the average value of its corresponding category for the 14 acoustical descriptors. In brackets the number of instances classified into a wrong category.
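A possible way of computing the quantity plotted in Fig. 3 is sketched below. The text does not state the normalization formula, so the sketch assumes a relative difference, (value − category average) / category average, for each descriptor. The descriptor values, the subset of descriptors and the function name are hypothetical.

```python
def normalized_difference(instance, category_average):
    """Relative difference of each descriptor with respect to the category average.

    The normalization by the category average is an assumption of this sketch.
    """
    return {
        name: (instance[name] - category_average[name]) / category_average[name]
        for name in category_average
    }


# Hypothetical values in dB for two of the 14 descriptors (not measured data).
category_average = {"LAeq": 70.0, "L500Hz": 60.0}
misclassified_instance = {"LAeq": 63.0, "L500Hz": 66.0}

difference = normalized_difference(misclassified_instance, category_average)
```

Negative values indicate descriptors for which the misclassified instance lies below the average of its true category, and positive values those for which it lies above.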
Fig. 4. Average value of each of the 14 acoustical descriptors measured in each category of urban soundscapes.
Table 6
Average value of the 15 semantic-differential scales for each of the reference soundscape categories (Torija et al., 2013).

Semantic-differential scales                1 (SC1)   7 (SC2)   6 (SC3)   8 (SC4)   4 (SC5)   14 (SC6)   9 (SC7)   11 (SC8)   13 (SC9)   2 (SC10)
Quiet (0)–loud (10)                         7.81      8.13      5.84      4.79      7.45      3.67       6.10      4.77       3.72       2.27
Pleasant (0)–unpleasant (10)                7.80      9.57      5.94      5.39      7.09      2.85       5.69      5.03       2.15       1.03
No annoying (0)–very annoying (10)          7.33      8.03      5.68      4.59      7.04      1.93       5.09      4.72       2.29       0.64
Relaxing (0)–irritant (10)                  7.38      9.30      6.69      5.96      6.34      2.74       6.10      4.91       2.56       0.70
Quiet (0)–disturbing (10)                   7.34      8.07      6.09      4.76      6.41      2.38       6.00      5.13       2.07       0.71
Bearable (0)–unbearable (10)                6.84      8.73      4.88      3.53      6.40      2.30       3.82      4.42       1.58       0.66
Little attending (0)–very attending (10)    7.36      8.67      6.11      4.51      7.97      2.54       5.81      5.03       5.00       8.34
Organized (0)–disorganized (10)             6.71      7.23      6.90      4.99      5.84      3.15       7.15      4.98       2.36       0.73
Far (0)–nearby (10)                         7.88      9.03      6.56      3.71      8.56      5.54       6.99      4.75       4.41       8.00
Continuous (0)–discontinuous (10)           3.75      2.63      6.22      6.95      1.50      1.92       1.71      3.75       3.00       0.87
Smooth (0)–rough (10)                       7.60      8.13      6.55      5.63      6.92      3.15       4.33      4.61       2.79       1.14
Distinct (0)–hubbub (10)                    5.25      4.97      4.54      4.79      3.18      2.49       8.62      5.72       4.03       0.82
Varied (0)–monotonous (10)                  4.30      5.40      4.25      3.82      8.64      4.85       4.58      4.25       2.98       1.36
Predictable (0)–chaotic (10)                6.45      8.73      4.36      5.32      1.49      2.24       6.17      4.38       3.71       1.31
Calming (0)–agitating (10)                  7.48      8.73      6.61      4.75      7.53      2.93       6.50      4.85       2.90       1.34

The type of soundscape measured in this work is shown in brackets.
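The perceptual part of the evaluation described in Section 4.4 amounts to looking up the semantic-differential profile of the category returned by the classifier. The sketch below illustrates this lookup with the first two scales of Table 6. It assumes that each value column of the table corresponds to one soundscape category; the scale keys and the function names are illustrative.

```python
# First two semantic-differential scales of Table 6 (0-10 scales).
SCALES = ("quiet_loud", "pleasant_unpleasant")

PERCEPTUAL_MEANS = {
    "SC1": (7.81, 7.80),
    "SC2": (8.13, 9.57),
    "SC3": (5.84, 5.94),
    "SC4": (4.79, 5.39),
    "SC5": (7.45, 7.09),
    "SC6": (3.67, 2.85),
    "SC7": (6.10, 5.69),
    "SC8": (4.77, 5.03),
    "SC9": (3.72, 2.15),
    "SC10": (2.27, 1.03),
}


def perceptual_profile(category):
    """Average perceptual ratings associated with a classified category."""
    return dict(zip(SCALES, PERCEPTUAL_MEANS[category]))


def rank_by(scale):
    """Categories ordered from the lowest to the highest average rating."""
    index = SCALES.index(scale)
    return sorted(PERCEPTUAL_MEANS, key=lambda c: PERCEPTUAL_MEANS[c][index])


# Example: a location that the classifier assigns to category SC10.
profile = perceptual_profile("SC10")
pleasantness_ranking = rank_by("pleasant_unpleasant")
```

In this subset, the fountain-dominated category SC10 is rated as the most pleasant and SC2 as the most unpleasant, which is the kind of indirect perceptual information the classification provides for a measured location.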
criteria. In light of the results presented in this work, the developed model successfully performs this soundscape classification task, so it can be a powerful tool to automatically classify urban sound spaces into previously defined soundscape categories. For the development of this model, a set of 14 acoustical descriptors was used, which have been identified as the parameters with the highest influence on the differentiation of the soundscape typologies considered here. The classification model was developed by implementing two machine learning techniques, SVM (SVM model) and SVM trained with the SMO algorithm (SMO model). According to the results, the SMO-based model outperformed the SVM-based one. The implementation of the SMO algorithm not only achieved high overall performance (only 8.7% of instances misclassified) but also showed outstanding performance for all the considered soundscape categories. Based on this performance, the authors propose an SMO classification model as a good tool for evaluating soundscapes on the basis of acoustical and (indirectly) perceptual criteria. With this approach, both acoustical and perceptual evaluations can be approached on the basis of the average value of the 14 acoustical descriptors used and the average value of the 15 SD scales. Therefore, after this evaluation, relevant information could be gained to apply to the appropriate management of urban soundscapes in an integrated approach. This classification is made on the basis of an unsupervised categorization, so it can be implemented in an automatic procedure with the selected input data as shown in this paper. Finally, although the classification model has been developed for a number of spatially distributed sampling points, this methodology could be easily extended to continuous monitoring systems, and thus technicians could perform a continuous monitoring of the perceptual and acoustical characteristics in a selected location.
Acknowledgments
This work was funded by the University of Malaga and the European Commission under the Agreement Grant no. 246550 of the seventh Framework Programme for R&D of the EU, granted within the People Programme, “Co-funding of Regional, National and International Programmes” (COFUND). Moreover, this work is also supported by the “Ministerio de Economía y Competitividad” of Spain under the project TEC2012-38883-C02-02.
References
Bolbol A, Chen T, Tsapakis I, Haworth J. Inferring hybrid transportation modes from sparse GPS data using a moving window SVM classification. Comput Environ Urban Syst 2012;36:526–37.
Chen YW, Lin CJ. Combining SVMs with various feature selection strategies. In: Guyon I, Gunn S, Nikravesh M, Zadeh L, editors. Feature extraction, foundations and applications. London, UK: Springer; 2006. p. 315–23.
Crammer K, Singer Y. On the learnability and design of output codes for multiclass problems. Proceedings of the Thirteenth Annual Conference on Computational Learning Theory, 35–46, COLT ’00, San Francisco, CA, USA; 2000.
Cristianini N, Shawe-Taylor J. An introduction to support vector machines: and other kernel-based learning methods. Cambridge, New York: Cambridge University Press; 2000.
Deng NY, Tian YJ, Zhang CH. Support vector machines: theory, algorithms and extensions. CRC Press; 2012.
Dubois D, Guastavino C, Raimbault M. A cognitive approach to soundscapes: using verbal data to access auditory categories. Acta Acust United Acust 2006;92:865–74.
Flake GW, Lawrence S. Efficient SVM regression training with SMO. Mach Learn 2001;46:271–90.
Fu JH, Lee SL. A multi-class SVM classification system based on learning methods from indistinguishable Chinese official documents. Expert Syst Appl 2012;39:3127–34.
Ince H, Trafalis TB. Support vector machine for regression and applications to financial forecasting. International Joint Conference on Neural Networks. Como, Italy: IEEE-INNS-ENNS; 2002.
Jennings P, Cain R. A framework for improving urban soundscapes. Appl Acoust 2013;74:293–9.
Kim K-H, Ho DX, Brown RJC, Oh J-M, Park CG, Ryu IC. Some insights into the relationship between urban air pollution and noise levels. Sci Total Environ 2012;424:271–9.
Laszlo HE, McRobie ES, Stansfeld SA, Hansell AL. Annoyance and other reaction measures to changes in noise exposure - a review. Sci Total Environ 2012;435–436:551–62.
Li J, Heap AD, Potter A, Daniell JJ. Application of machine learning methods to spatial interpolation of environmental variables. Environ Model Softw 2011;26(12):1647–59.
Lu W-Z, Wang W-J. Potential assessment of the “support vector machine” method in forecasting ambient air pollutant trends. Chemosphere 2005;59:693–701.
Maris E, Stalen PJ, Vermunt R, Steensma H. Noise within the social context: annoyance reduction through fair procedures. J Acoust Soc Am 2007a;121:2000–10.
Maris E, Stalen PJ, Vermunt R, Steensma H. Evaluating noise in social context: the effect of procedural unfairness on noise annoyance judgments. J Acoust Soc Am 2007b;122:3483–94.
Montalvao Guedes AC, Bertoli SR, Zannin PHT. Influence of urban shapes on environmental noise: a case study in Aracaju - Brazil. Sci Total Environ 2011;412–413:66–76.
Nahar J, Imam T, Tickle KS, Shawkat Ali ABM, Phoebe Chen Y-P. Computational intelligence for microarray data and biomedical image analysis for the early diagnosis of breast cancer. Expert Syst Appl 2012;39(16):12371–7.
Noble WS. Kernel methods in computational biology. Support vector machine applications in computational biology. Cambridge: MIT Press; 2004. p. 71–92.
Paunovic K, Jakovljevic B, Belojevic G. Predictors of noise annoyance in noisy and quiet urban streets. Sci Total Environ 2009;407:3707–11.
Paviotti M, Vogiatzis K. On the outdoor annoyance from scooter and motorbike noise in the urban environment. Sci Total Environ 2012;430:223–30.
Payne SR. The production of a perceived restorativeness soundscape scale. Appl Acoust 2013;74:255–63.
Platt J. Fast training of support vector machines using sequential minimal optimization. In: Scholkopf B, Burges CJC, Smola AJ, editors. Advances in kernel methods: support vector learning. Cambridge: MIT Press; 1999. p. 185–208.
Rychtáriková M, Vermeir G. Soundscape categorization on the basis of objective acoustical parameters. Appl Acoust 2013;74:240–7.
Shao Y-H, Deng N-Y, Yang Z-M. Least squares recursive projection twin support vector machine for classification. Pattern Recogn 2012;45:2299–307.
Shevade SK, Keerthi SS, Bhattacharyya C, Murthy KRK. Improvements to the SMO algorithm for SVM regression. IEEE Trans Neural Netw 2000;11:1188–93.
SICA. Noise Pollution Information System. http://sicaweb.cedex.es/mapas-consulta-fase1.php, 2007.
Solomatine DP, Ostfeld A. Data-driven modeling: some past experiences and new approaches. J Hydroinformatics 2008;10:3–22.
Steinwart I, Christmann A. Support vector machines. New York: Springer-Verlag; 2008.
Torija AJ, Ruiz DP. Using recorded sound spectra profile as input data for real-time short-term urban road-traffic-flow estimation. Sci Total Environ 2012;435–436:270–9.
Torija AJ, Ruiz DP, Ramos-Ridao A. Required stabilization time, short-term variability and impulsiveness of the sound pressure level to characterize the temporal composition of urban soundscapes. Appl Acoust 2011;72(2–3):89–99.
Torija AJ, Ruiz DP, Alba-Fernandez V, Ramos-Ridao A. Noticed sound events management as a tool for inclusion in the action plans against noise in medium-sized cities. Landsc Urban Plan 2012;104:148–56.
Torija AJ, Ruiz DP, Ramos-Ridao AF. Application of a methodology for categorizing and differentiating urban soundscapes using acoustical descriptors and semantic-differential attributes. J Acoust Soc Am 2013;134:791–802.
Üstün B, Melssen WJ, Buydens LMC. Facilitating the application of support vector regression by using a universal Pearson VII function based kernel. Chemom Intell Lab Syst 2006;81:29–40.
Vapnik V. The nature of statistical learning theory. New York: Springer-Verlag; 1995.
Vapnik VN. Statistical learning theory. New York: Wiley; 1998.
Yang Q, Shao J, Scholz M, Boehm C, Plant C. Multi-label classification models for sustainable flood retention basins. Environ Model Softw 2012;32:27–36.
Zannin PHT, Calixto A, Diniz FB, Ferreira JAC. A survey of urban noise annoyance in a large Brazilian city: the importance of a subjective analysis in conjunction with an objective analysis. Environ Impact Assess Rev 2003;23:245–55.