Development of enhanced emission factor equations for paved and unpaved roads using artificial neural network



Transportation Research Part D 69 (2019) 196–208

Contents lists available at ScienceDirect

Transportation Research Part D journal homepage: www.elsevier.com/locate/trd

Tse-Huai Liu, Yoojung Yoon




Civil & Environmental Engineering, West Virginia University, Morgantown, WV 26506, USA

ARTICLE INFO

Keywords: Environmental impact; Particulate matter; Emission factor; Paved road; Unpaved road; Artificial neural network (ANN)

ABSTRACT

Road dust is a primary source of particulate matter (PM) that has a significant impact on human health and air quality. To develop PM control strategies efficiently, it is critical to improve the ability to estimate the emission levels of PM resuspended from paved and unpaved roads. The U.S. Environmental Protection Agency (EPA) has developed emission factor equations to quantify the magnitude of PM for paved and unpaved roads based on multiple linear regression (MLR) models. However, MLR models are not well suited to capturing the complex and non-linear mechanisms of PM emissions, which limits the accuracy of the MLR-based PM prediction models. This paper presents a method to improve the quality of the existing EPA emission factor equations for paved and unpaved roads by employing an artificial neural network (ANN). The presented method consists of the following steps: data processing, ANN model training, and validation of the method through data testing. For a fair comparison, the data used in the case study were retrieved from the database used by the EPA to generate the PM10 emission factor equations for paved and unpaved roads. The presented method was evaluated by its performance in terms of the coefficient of determination (R²) and the root mean square error (RMSE). The empirical findings of the case study verified that the presented method using the ANN model is capable of improving the quality of the EPA emission factor equations, resulting in higher R² and lower RMSE values for both paved and unpaved roads. The expected significance of this paper is that the presented method improves the ability to develop more reliable emission factor equations for predicting PM levels, which can help agencies establish enhanced PM control strategies.

Corresponding author. E-mail addresses: [email protected] (T.-H. Liu), [email protected] (Y. Yoon).

https://doi.org/10.1016/j.trd.2019.01.033

1361-9209/ © 2019 Elsevier Ltd. All rights reserved.

1. Introduction

Particulate matter (PM), a complex mixture of extremely small particles and liquid droplets, is responsible for widespread air pollution according to the World Health Organization (WHO) (2013) and has been an international concern as an air pollutant since the 1990s (Pope et al., 1992; Dockery, 2001). The rapid progression of industrialization and energy utilization is causing serious PM pollution in many cities worldwide. The possible sources of PM can be natural (e.g., wildfires, volcanic activity, and sea sand) or anthropogenic (e.g., road traffic, industrial activity, and bonfires). PM pollution is differentiated as PM10 and PM2.5 based on particle size: PM10 refers to particles with an aerodynamic diameter of 10 μm or less, and PM2.5 to particles with a diameter of up to 2.5 μm. PM10 and PM2.5 have been shown to have a significant impact on human health, the environment, and transportation safety (Pope, 1995; Pope et al., 2002; Molina, 2004). Schwartz et al. (1993) found that a highly significant relationship exists between certain concentrations of PM10 and asthma hospital admissions. Pope (1995) showed that eight deaths in 100,000 people per year were associated with each 1 μg/m³ increase in the concentration of fine PM from 1982 to 1989 in all 50 U.S. states. As the wind carries PM over long distances, the chemical composition of PM may damage sensitive forests and farm crops or change the pH balance of lakes and rivers (Philpott, 2015). The stomatal openings in plants can be clogged by PM, leading to failures in the photosynthesis process. PM pollution is also a major cause of decreased visibility, called haze pollution (U.S. Environmental Protection Agency (EPA), 2004b). According to a recent study by Sager (2016), reduced visibility increases the possibility of traffic accidents by impairing the performance of vehicle drivers.

The WHO air-quality guidelines (AQGs) provide information to policy-makers about the daily and annual mean concentration limits of key air pollutants to protect public health (WHO, 2006). For example, the daily and annual mean concentration limits are 50 μg/m³ and 20 μg/m³ for PM10, and 25 μg/m³ and 10 μg/m³ for PM2.5, respectively. PM emission estimates are therefore important for developing emission reduction targets and establishing emission control strategies.

EPA has identified four different estimation methods (direct measurement, material balance, emission factor, and engineering judgment) and compared them in terms of the cost and reliability of their estimates (EPA, 1995). Fig. 1 indicates the relationship between cost and reliability for each method. In Fig. 1, the letters from A to E (excellent to poor) represent the quality of the outcomes generated. In general, the direct measurement method is divided into two approaches: (1) continuous emission monitoring (CEM) and (2) stack testing. The parametric source tests and single source tests belong to the stack test approach.

Fig. 1. Comparison of PM Emission Estimate Methods (EPA, 1995).
The direct measurement methods generally yield the most reliable estimation results but involve much higher agency costs for building and maintaining facilities. The material balance and emission factor methods (e.g., source category emissions model, state/industry factors, and emission factors (AP-42)) generate very similar estimates across a wide range of reliability from A to E, while the estimated cost of the material balance method is higher than that of all three emission factor methods. The engineering judgment method has the most attractive estimation cost of all the methods; however, it provides emission estimates of very low quality. Therefore, when both the cost and the reliability of the emission estimation methods are considered, the emission factor method is both satisfactory and appropriate. EPA (1995) has published a document entitled “Compilation of Air Pollutant Emission Factors” (also called AP-42), which contains around 9000 emission factors for stationary sources (e.g., factories, refineries, boilers, power plants, and other fixed emitters of air pollutants) (Placet, 2000). An emission factor is defined as a representative value that is used to quantify the amount of air pollutants released into the atmosphere during certain activities (EPA, 1995). An emission factor is usually expressed as the weight of the pollutant per unit weight, volume, distance, or duration of an activity (e.g., pounds of PM10 emitted per vehicle mile traveled). Among the publications that provide information on emission factors, AP-42 is the primary guidance document and source of rated emission factors used by public agencies (EPA, 2001). Some emission factors in AP-42 are developed through a multiple linear regression (MLR) model using emission measurement data collected by direct measurement methods (Lashgari and Kecojevic, 2015).
The MLR model generally attempts to find a linear relationship between dependent and independent variables to predict outcomes. However, the formation of PM pollution is a very complex and non-linear phenomenon due to physical and chemical processes that occur in the atmosphere, which implies that conventional statistical methods based on a linear relationship between dependent and independent variables are limited in how accurately they can predict the PM concentration or amount. According to several studies, road dust is an important source of PM and has a significant impact on air quality (Harrison and Jones, 1995; Chow et al., 1996). Watson and Chow (2000) identified unpaved and paved roads as the major sources of PM10 and PM2.5.

Therefore, the objective of this paper is to present a method to improve the quality of the EPA emission factor equations for paved and unpaved roads through the application of an artificial neural network (ANN) model, which can overcome the limitations of conventional statistical models based on linear relationships. The application of the presented method takes the following steps: collect the PM data used for the EPA PM10 emission factor equations for paved and unpaved roads; organize the collected data through outlier handling, data normalization, and data classification; design a basic platform of the ANN model to analyze the processed data; determine the structure and regularization of the ANN; and demonstrate the effectiveness of the emission factor equations obtained from the application. For a fair comparison of the results of the presented method with the EPA equations, the data used by the EPA were reused. The comparison results, which demonstrate that the presented method improves the quality of the EPA emission factor equations for paved and unpaved roads, are presented.

2. EPA emission factor equations for paved and unpaved roads

2.1. Paved road

The paved road emission factor equations were first developed by the EPA in 1985 and were divided into two categories: industrial paved roads and public paved roads (EPA, 1985). In 2011, the EPA updated the initial emission factor equations by considering silt loading, mean vehicle weight, and mean vehicle speed as the potential correction factors for an emission factor model (EPA, 2011). The emission factor equation was developed using stepwise multiple linear regression, and the EPA then tested the three correction factors using the regression function in the analysis tool of Excel.
As a result, the empirical expression to quantify PM emissions for paved roads, shown in Eq. (1), includes the silt loading and mean vehicle weight factors, whereas the mean vehicle speed factor was not statistically significant. In Eq. (1), k is the particle size multiplier, sL is the road surface silt loading (g/m²), and W is the average weight of the vehicles (tons). Table 1 includes the particle size multipliers associated with the different particle size ranges (PM10 and PM2.5) and units (grams per vehicle kilometer traveled (g/VKT), grams per vehicle mile traveled (g/VMT), and pounds per vehicle mile traveled (lb/VMT)). EPA validated Eq. (1) with a coefficient of determination (R²) of 0.72. R² is a common indicator of how close the data are to the fitted regression line, which implies that the EPA emission factor equation for paved roads has 72% predictive power to estimate the dependent variable for the given independent variables.

$$E = k\,(sL)^{0.91}\,(W)^{1.02} \tag{1}$$
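As a worked illustration of Eq. (1), the following minimal Python sketch evaluates the paved-road emission factor with the particle size multipliers listed in Table 1. The names `K_PAVED` and `paved_emission_factor` are illustrative choices made here and do not come from the EPA documentation.

```python
# Particle size multipliers k from Table 1 (EPA, 2011), keyed by (size range, unit).
# The dictionary and function names are illustrative assumptions.
K_PAVED = {
    ("PM2.5", "g/VKT"): 0.15,
    ("PM2.5", "g/VMT"): 0.25,
    ("PM2.5", "lb/VMT"): 0.00054,
    ("PM10", "g/VKT"): 0.62,
    ("PM10", "g/VMT"): 1.0,
    ("PM10", "lb/VMT"): 0.0022,
}


def paved_emission_factor(silt_loading, mean_weight, size="PM10", unit="g/VMT"):
    """Evaluate Eq. (1): E = k * sL**0.91 * W**1.02.

    silt_loading is sL in g/m^2 and mean_weight is W in tons.
    """
    if silt_loading < 0 or mean_weight < 0:
        raise ValueError("silt loading and vehicle weight must be non-negative")
    k = K_PAVED[(size, unit)]
    return k * silt_loading ** 0.91 * mean_weight ** 1.02


if __name__ == "__main__":
    # Hypothetical road: sL = 0.5 g/m^2, W = 10 tons, PM10 in g/VMT.
    print(paved_emission_factor(0.5, 10.0))
```

Because both exponents are close to one, the estimate scales almost proportionally with silt loading and with vehicle weight.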

2.2. Unpaved road

In 1979, EPA first published the unpaved road emission factor equation, which included variables such as surface silt content, average vehicle speed, mean vehicle weight, average number of wheels, and particle size multiplier. EPA (1998) then updated the initial equation as a result of a stepwise regression analysis, retaining three significant correction factors: surface silt content (s), mean vehicle weight (W), and surface material moisture content (M), as shown in Eq. (2). The updated emission factor equation takes a multiplicative form that was linearized by log-transforming the log-normally distributed data of the correction factors. However, the performance of the model is very low, with an R² of 0.35, compared to that for paved roads. The values of the particle size multiplier (k) and the empirical constants (a, b, and c) used in Eq. (2) are presented in Table 2. Note that the unit of the emission factor in Eq. (2) is lb/VMT.

$$E = \frac{k\,(s/12)^{a}\,(W/3)^{b}}{(M/0.2)^{c}} \tag{2}$$
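Eq. (2) can be evaluated in the same way. The sketch below assumes that the moisture term (M/0.2)^c sits in the denominator, consistent with the three correction factors named in the text, and uses the constants from Table 2. The names `UNPAVED_CONSTANTS` and `unpaved_emission_factor` are illustrative.

```python
# k, a, b, c from Table 2 (EPA, 1998); names are illustrative assumptions.
UNPAVED_CONSTANTS = {
    "PM2.5": {"k": 0.38, "a": 0.8, "b": 0.4, "c": 0.3},
    "PM10": {"k": 2.6, "a": 0.8, "b": 0.4, "c": 0.3},
}


def unpaved_emission_factor(silt_pct, mean_weight, moisture_pct, size="PM10"):
    """Evaluate Eq. (2) in lb/VMT.

    silt_pct is s (%), mean_weight is W (tons), moisture_pct is M (%).
    Assumes the form E = k (s/12)^a (W/3)^b / (M/0.2)^c.
    """
    if moisture_pct <= 0:
        # The moisture term is in the denominator, so M = 0 is undefined.
        raise ValueError("moisture content must be positive")
    if silt_pct < 0 or mean_weight < 0:
        raise ValueError("silt content and vehicle weight must be non-negative")
    c = UNPAVED_CONSTANTS[size]
    numerator = c["k"] * (silt_pct / 12.0) ** c["a"] * (mean_weight / 3.0) ** c["b"]
    return numerator / (moisture_pct / 0.2) ** c["c"]


if __name__ == "__main__":
    # Hypothetical road: s = 8 %, W = 50 tons, M = 1.5 %.
    print(unpaved_emission_factor(8.0, 50.0, 1.5))
```

At the reference values s = 12, W = 3, and M = 0.2 every ratio equals one, so the function returns k itself, which is a convenient sanity check.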

3. Principles and functions of artificial neural network

To analyze the emission factors for paved and unpaved roads, this paper considers the ANN model, which is suitable for defining a non-linear, complex relationship between input and output variables through the analysis of historical data (Chiarazzo et al., 2014). The ANN model employs an iterative machine learning process for the data analysis and thereby provides robust solutions. The following subsections discuss the principles and functions that are essential to consider for the ANN model: fundamental structure, training algorithms, regularization techniques, and activation function.

Table 1
Particle Size Multipliers for Paved Roads (EPA, 2011).

Size Range    Particle Size Multiplier (k)
              g/VKT    g/VMT    lb/VMT
PM2.5         0.15     0.25     0.00054
PM10          0.62     1        0.0022


Table 2
Particle Size Multiplier and Empirical Constants for Unpaved Roads (EPA, 1998).

Constant    PM2.5    PM10
k           0.38     2.6
a           0.8      0.8
b           0.4      0.4
c           0.3      0.3

3.1. Fundamental structure

The structure of an ANN model is built from several highly interconnected processing and computing elements (also called neurons) that work in parallel to solve specific problems. The ANN model arranges the neurons in three types of layers (input, hidden, and output). The neurons in a preceding layer feed their outputs, multiplied by weights, to the neurons in the succeeding layer. While a typical ANN model consists of one input layer, one hidden layer, and one output layer, the model may include any number of hidden layers. In general, the more hidden layers and neurons there are in an ANN model, the greater its learning capacity. However, this higher learning capacity can increase the processing time as well as the risk of overfitting (Panchal et al., 2011). Overfitting happens when the network tends to memorize the training cases instead of generalizing the underlying patterns.

3.2. Training algorithms

The best-known training algorithm for the feedforward neural network, which is the most common type of ANN model, is back-propagation because of its low computational complexity (Kathirvalavakumar and Subavathi, 2012). The feedforward back-propagation neural network sends a one-directional signal (feedforward) from the input neurons to the neurons in the next layers, and then propagates the errors between the output value (estimated value) and the target value (observed value) backward to adjust the weights of the neuron connections. The feedforward back-propagation neural network proceeds in three steps: (1) the neurons deliver the signals forward to the output layer; (2) the error value is found by comparing the output value with the target value; and (3) the error is passed backward to each neuron to adjust the weights until the errors stop improving.

3.3. Regularization techniques

Regularization techniques have been developed to reduce the noise and overfitting problems in ANN models.
Overfitting occurs when the model captures the detail and noise in the training data, which degrades its performance on the testing data. In the ANN training process, regularization techniques are used with a back-propagation algorithm to seek a smaller error (Kayri, 2016). There are three common regularization techniques: the Levenberg-Marquardt, conjugate-gradient, and Bayesian regularization methods. The Levenberg-Marquardt method, also known as the damped least-squares method, is a virtual standard in non-linear optimization and significantly outperforms the conjugate-gradient method for medium-sized problems (Roweis, 1996). This method is an iterative process that locates a local minimum of a multivariate function expressed as the sum of squares of several non-linear functions (Lourakis and Argyros, 2005). The conjugate-gradient method adjusts the weights along directions derived from the negative of the gradient and needs only a little more storage than simpler gradient-based methods, which makes it suitable for neural networks with a large number of weights (Sharma and Venugopalan, 2014). The Bayesian regularization method minimizes a combination of squared errors and weights, relying on a probabilistic interpretation to choose the set of weights that minimizes the estimation error and efficiently avoids overfitting (Kayri, 2016). The major advantage of Bayesian regularization is that there is no need to divide a dataset into training and testing datasets, so it is applicable when little data are available (Burden and Winkler, 2008; Ticknor, 2013).

3.4. Activation function

The purpose of the activation function is to add non-linearity to the ANN model so that the model can accommodate more complex problems; it is essential for the ANN model to learn non-linear problems. There are three common activation functions: logistic, hyperbolic tangent, and linear.
The logistic activation function generates output values between 0 and 1 along an S-shaped (sigmoid) curve. The hyperbolic tangent activation function generates values between −1 and 1, also along an S-shaped curve. The two curves are very similar, but the hyperbolic tangent curve is a little steeper because of its wider output range. The linear activation function returns its input, which is the weighted sum from the preceding neurons, unchanged (a straight line). Generally, the logistic and hyperbolic tangent activation functions are used for a hidden layer, while the linear activation function is suitable for an output layer whose output values are not constrained to any bounds. Between the two non-linear activation functions, the convergence behavior of the logistic activation function shows a more non-linear pattern, as the hyperbolic tangent activation function is almost linear at low absolute values of the input variables.
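The three-step feedforward back-propagation cycle of Section 3.2 and the activation functions of Section 3.4 can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the configuration used in the paper: it uses plain gradient descent rather than the Levenberg-Marquardt or Bayesian regularization methods, and the helper names (`logistic`, `init_params`, `forward`, `train_step`), the learning rate, and the initial weight scale are all hypothetical.

```python
import numpy as np


def logistic(x):
    """Logistic (sigmoid) activation with outputs in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def init_params(n_in, n_hidden, n_out, rng):
    """Random initial weights; bias terms start at zero (an assumption)."""
    return (rng.normal(0.0, 0.5, (n_in, n_hidden)), np.zeros(n_hidden),
            rng.normal(0.0, 0.5, (n_hidden, n_out)), np.zeros(n_out))


def forward(x, w_h, b_h, w_o, b_o):
    """One tanh hidden layer followed by a linear output layer."""
    hidden = np.tanh(x @ w_h + b_h)
    return hidden @ w_o + b_o


def train_step(x, y, params, lr=0.05):
    """One plain gradient-descent update on half the mean squared error."""
    w_h, b_h, w_o, b_o = params
    n = x.shape[0]
    hidden = np.tanh(x @ w_h + b_h)      # (1) feed the signal forward
    err = (hidden @ w_o + b_o) - y       # (2) compare estimate and target
    # (3) propagate the error backward to obtain the weight gradients
    grad_w_o = hidden.T @ err / n
    grad_b_o = err.mean(axis=0)
    d_hidden = (err @ w_o.T) * (1.0 - hidden ** 2)   # derivative of tanh
    grad_w_h = x.T @ d_hidden / n
    grad_b_h = d_hidden.mean(axis=0)
    updated = (w_h - lr * grad_w_h, b_h - lr * grad_b_h,
               w_o - lr * grad_w_o, b_o - lr * grad_b_o)
    return updated, float(np.mean(err ** 2))   # loss before the update


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.uniform(-1.0, 1.0, size=(40, 2))   # synthetic inputs
    y = x[:, :1] ** 2 + x[:, 1:]               # synthetic non-linear target
    params = init_params(2, 3, 1, rng)
    for epoch in range(500):
        params, loss = train_step(x, y, params)
    print("final training MSE:", loss)
```

The hidden layer here uses the hyperbolic tangent; swapping in `logistic` would require replacing the derivative term `1 - hidden**2` with `hidden * (1 - hidden)`.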


Fig. 2. The Flowchart of the Presented Method; v – input variable, m – maximum number of hidden neurons, h – hidden neuron, t – number of ANN runs, R – coefficient of determination, and C – combination of input variables.

4. Emission factor development method

The method presented in this paper to develop emission factors for paved and unpaved roads includes three major modules, as shown in Fig. 2: data processing, ANN model training, and model validation.

4.1. Module-1: data processing

The data processing module consists of four steps: organizing the raw data to detect outliers, normalizing the values of the organized raw data, splitting the data into training and testing datasets, and classifying input variables. The data organization step removes outliers from the raw data because outliers in the training dataset degrade the quality of an ANN model by decreasing its accuracy (Khamis et al., 2005). If the training dataset contains a sufficient number of data points, the neural network can learn enough to tolerate a reasonable number of outliers through the fault tolerance of the model, which is its ability to continue operating despite faults in its input data. When the number of training data points is not adequate, the neural network cannot learn sufficiently, so the accuracy of the model becomes more sensitive to outliers. Several techniques, such as the box plot, Q-Q plot, and z-score, are available to detect outliers. However, even when outliers are detected by statistical principles, they should not be excluded from a raw dataset without apparent physical reasons such as a damaged sample, a computational error, or abnormal measuring conditions (American Society for Testing and Materials, 2002). Data normalization, one of the most common data processing techniques, is effective in reducing the adverse impacts of outliers on model development in cases where the outliers cannot be clearly detected or explained (Patel and Mehta, 2011).
The normalization of the organized raw data is also needed to increase the speed of convergence, decrease the bias between independent variables, and minimize the error of the prediction results (Jayalakshmi and Santhakumaran, 2011; Atomi, 2012; Baghirli, 2015). Min-max normalization is a widely used technique that rescales data on different scales to a range of 0 to 1 through the linear transformation shown in Eq. (3), where $R_{min}$ and $R_{max}$ are the minimum and maximum values of the organized raw dataset, respectively, $X_i$ is the value of data point i before normalization, and $X_i'$ is the normalized value of $X_i$.

$$X_i' = \frac{X_i - R_{min}}{R_{max} - R_{min}} \tag{3}$$

However, min-max normalization has a potential limitation: values observed in the future that fall outside the range of $R_{min}$ to $R_{max}$ cannot provide any information to the ANN model because they do not fit the training process. Therefore, this paper also considers a Z-score normalization technique to increase the applicability of the proposed method. Z-score normalization uses the mean and standard deviation of the organized raw dataset to scale the input data to a mean of zero and a variance of one, so it has no range problem and can reduce the effects of outliers in the data. The formula of Z-score normalization is shown in Eq. (4), where $\mu$ and $\sigma$ are the mean and standard deviation of the data points in the organized raw dataset, respectively.

$$X_i' = \frac{X_i - \mu}{\sigma} \tag{4}$$
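The two normalization schemes in Eqs. (3) and (4) can be sketched with NumPy as below. The function names are illustrative, and the use of the population standard deviation (NumPy's default, `ddof=0`) is an assumption, since the text does not state which estimator is used. Passing the training-set statistics explicitly shows the range limitation of min-max normalization for new observations.

```python
import numpy as np


def min_max_normalize(x, r_min=None, r_max=None):
    """Eq. (3): rescale to [0, 1] using the dataset minimum and maximum."""
    x = np.asarray(x, dtype=float)
    r_min = x.min() if r_min is None else r_min
    r_max = x.max() if r_max is None else r_max
    if r_max == r_min:
        raise ValueError("R_max and R_min must differ")
    return (x - r_min) / (r_max - r_min)


def z_score_normalize(x, mu=None, sigma=None):
    """Eq. (4): centre on the mean and scale by the standard deviation.

    Uses the population standard deviation (ddof=0), which is an assumption.
    """
    x = np.asarray(x, dtype=float)
    mu = x.mean() if mu is None else mu
    sigma = x.std() if sigma is None else sigma
    if sigma == 0:
        raise ValueError("standard deviation must be non-zero")
    return (x - mu) / sigma


def z_score_denormalize(z, mu, sigma):
    """Invert Eq. (4) to recover values in the original units."""
    return np.asarray(z, dtype=float) * sigma + mu


if __name__ == "__main__":
    train = np.array([0.0, 5.0, 10.0])           # hypothetical training values
    print(min_max_normalize(train))
    # A new observation above the training maximum falls outside [0, 1].
    print(min_max_normalize([20.0], r_min=0.0, r_max=10.0))
    print(z_score_normalize(train))
```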

As the third step, the organized and normalized data points are divided into a training dataset to develop an ANN model and a testing dataset to validate the reliability of the model. It is critical that the training dataset includes an adequate number of data points for the model to learn sufficiently. Data type classification, the last step, is designed to identify the input variables of an ANN model. In general, a dataset includes various input variables, and it is difficult to select in advance those that are statistically significant for model development. For the data type classification, the ANN model considers all possible combinations of input variables and then identifies the one combination that provides the best performance through a series of model training runs. For example, when a dataset involves three input variables (e.g., x, y, and z), the ANN model considers a total of seven combinations (xyz, xy, xz, yz, x, y, and z); that is, the ANN logic creates seven individual models, one per combination, and each model is then tested with different numbers of hidden neurons.

4.2. Module-2: ANN model training

The ANN model training module finds the optimal combination of input variables and number of hidden neurons, and assigns the weights and biases to the ANN model. As shown in Fig. 2, this module takes the following steps: setting the maximum number of hidden neurons; training each number of hidden neurons t times for each combination of input variables and recording a test R value for each training run; calculating the average of the test R values for each number of hidden neurons; determining the optimal combination of input variables and number of hidden neurons with the highest average test R value; and assigning the weights and biases to the model. R is the coefficient of correlation computed by Eq. (5), where n is the number of samples, $y_i$ are the n observed output values, $\hat{y}_i$ are the n output values estimated by the ANN model, $\bar{y}$ is the mean of the n observed output values, and $\bar{\hat{y}}$ is the mean of the n estimated output values. The values of R range from −1 to +1, indicating the strength of the relationship between the dependent and independent variables. A negative correlation means that one variable increases while the other decreases, and vice versa, whereas a positive correlation implies that the pair of variables moves in the same direction (both increase or both decrease). An R value close to −1 or +1 indicates a very strong negative or positive relationship.

$$R = \frac{\sum_{i=1}^{n}(y_i-\bar{y})(\hat{y}_i-\bar{\hat{y}})}{\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^{2}\,\sum_{i=1}^{n}(\hat{y}_i-\bar{\hat{y}})^{2}}} \tag{5}$$
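The correlation in Eq. (5), the enumeration of input-variable combinations from Module-1, and the selection by highest average test R can be sketched in plain Python as follows. The callable `train_and_test` is a hypothetical placeholder for one ANN training run that returns a test R value; it is not part of the paper or of any library, and the other function names are likewise illustrative.

```python
import itertools
import math
import statistics


def pearson_r(observed, estimated):
    """Eq. (5): correlation between observed and estimated outputs.

    Raises ZeroDivisionError if either series has zero variance.
    """
    y_bar = statistics.fmean(observed)
    e_bar = statistics.fmean(estimated)
    num = sum((y - y_bar) * (e - e_bar) for y, e in zip(observed, estimated))
    den = math.sqrt(sum((y - y_bar) ** 2 for y in observed)
                    * sum((e - e_bar) ** 2 for e in estimated))
    return num / den


def input_combinations(variables):
    """All non-empty subsets of the input variables (2**m - 1 in total)."""
    return [combo
            for size in range(len(variables), 0, -1)
            for combo in itertools.combinations(variables, size)]


def select_best(variables, max_hidden, runs, train_and_test):
    """Return (average test R, combination, hidden neurons) with the highest
    average test R over `runs` trainings per configuration.

    train_and_test(combo, hidden) is a hypothetical callable that trains one
    model and returns its test R value.
    """
    best = (-math.inf, None, None)
    for combo in input_combinations(variables):
        for hidden in range(1, max_hidden + 1):
            avg_r = statistics.fmean(
                train_and_test(combo, hidden) for _ in range(runs))
            if avg_r > best[0]:
                best = (avg_r, combo, hidden)
    return best


if __name__ == "__main__":
    print(input_combinations("xyz"))
    print(len(input_combinations(range(6))))
    print(pearson_r([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))
```

With three variables the enumeration yields the seven combinations listed in the text, and with six variables it yields 63.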

The number of neurons in the hidden layer is determined based on a rule of thumb, considering the complexity of the problem or the number of input and output variables. That is, the number of hidden neurons could be increased for more complex problems. A number of hidden neurons can be chosen at random between one and the number of input variables (= {v1, v2, ..., vm}). Alternatively, the ANN model uses the trial-and-error method, which examines each number of hidden neurons from one to the maximum number of hidden neurons, max(h). Finding the optimal one out of all possible combinations of input variables ($= \sum_{v=1}^{m} {}_{m}C_{v}$) and the optimal number of hidden neurons involves three sub-steps:

- First, each combination of the input variables is trained t times for each possible number of hidden neurons, from one to max(h). The test R value for each training run is recorded, so there are t different Test R values for each number of hidden neurons, and each combination provides a total of max(h) × t Test R values.
- Second, the average of the Test R values is calculated for each number of hidden neurons for each combination, so there is a total of max(h) average Test R values for each combination.
- Third, the second step is repeated for all other combinations of the input variables. The average Test R values are then compared to identify the highest one, which determines the optimal combination of input variables and the optimal number of hidden neurons for the ANN model.

Once the optimal combination of input variables and number of hidden neurons are chosen, the ANN model needs to assign weights and include bias neurons for testing in the last module. An ANN model usually adds one additional neuron for bias to the input and hidden layers in order to increase the flexibility of the model (De Veaux and Ungar, 2012).
In general, the initial value of a bias neuron is assumed to be 1, but bias values can change to other values in a way similar to the intercept in a regression equation. For example, if the intercept (i.e., the bias value) is large, then the coefficients (i.e., the weights) can take smaller values. The connections between any two neurons in different layers are assigned weights. All the initial weights between layers are randomly selected from a given range for a neural network. Fig. 3 is presented as an example, showing two input neurons, two hidden neurons, and one output neuron, to provide a better understanding of the weights (e.g., w) and bias neurons (e.g., Ib and Hb) in the ANN model.

Fig. 3. Weights and Bias Neurons in ANN Model.

4.3. Module-3: ANN model validation

The ANN model validation module has three steps: generating normalized output estimates, de-normalizing the output estimates, and evaluating the ANN model. As the first step, the normalized data points in the testing dataset are fed into the ANN model identified in Module-2. The model then provides normalized output estimates, which are de-normalized to generate the root mean square error (RMSE) between the output estimates and the observed data in the testing dataset. The performance of the ANN model is evaluated by comparing its RMSE values with those of the EPA emission factor equations for the paved and unpaved roads. RMSE is a widely used measure of the fitness of a model that evaluates the differences between estimated values and observed values. Values closer to 0 indicate that a model is more reliable for prediction. RMSE is expressed in Eq. (6).

$$RMSE = \sqrt{\frac{\sum_{i=1}^{n}(\hat{y}_i - y_i)^{2}}{n}} \tag{6}$$
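The validation steps above (de-normalize the estimates, then compute the RMSE of Eq. (6) in the original units) can be sketched as follows. The function names are illustrative, and the de-normalization shown is simply the inverse of the min-max transformation in Eq. (3).

```python
import math


def denormalize_min_max(values, r_min, r_max):
    """Invert Eq. (3): map normalized values back to the original units."""
    return [v * (r_max - r_min) + r_min for v in values]


def rmse(observed, estimated):
    """Eq. (6): root mean square error between observed and estimated values."""
    if len(observed) != len(estimated) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((e - y) ** 2 for y, e in zip(observed, estimated))
    return math.sqrt(squared / len(observed))


if __name__ == "__main__":
    # Hypothetical normalized model outputs and observed values in raw units.
    normalized_estimates = [0.0, 0.5, 1.0]
    observed = [10.0, 16.0, 19.0]
    estimates = denormalize_min_max(normalized_estimates, r_min=10.0, r_max=20.0)
    print(rmse(observed, estimates))
```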

5. Application of the presented method

5.1. Data collection and organization

The data for the model application were mainly retrieved from the EPA paved road report (2011) and unpaved road report (1998), and some of the data were retrieved from the EPA website (https://www3.epa.gov/ttn/chief/old/ap42/ch13/). The report for paved roads includes a total of 84 data points for three input variables: silt loading (SL) present on the road surface, mean vehicle weight (MVW), and average vehicle speed (AVS). The report for unpaved roads includes a total of 92 data points for six input variables: silt content of the road surface material, MVW, surface moisture content, AVS, number of vehicle passes (NoVP), and mean number of vehicle wheels (MNoVW). The observed PM10 data are expressed in grams per vehicle mile traveled (g/VMT) for the paved roads and pounds per vehicle mile traveled (lb/VMT) for the unpaved roads. The statistical properties of the original paved and unpaved road data (minimum, maximum, mean, and standard deviation) are presented in Tables 3 and 4. The observed raw data were normalized by the min-max and Z-score normalization methods using these statistical properties. As the EPA reports from which the data were retrieved give no explanation of physical reasons for possible outliers, the application of the presented method assumed that there are no noticeable outliers in the dataset. However, the application used normalization and ten repeated ANN training runs with different training datasets to minimize the adverse impact of possible outliers. The normalized data points in each of the two datasets for the paved and unpaved roads were divided into training and testing datasets for the analysis in the ANN model, such that the variance of neither the training nor the testing performance is too high. Shahin et al.
(2004) suggested assigning 70–90% of the data to training an ANN model and the remaining 30–10% to testing. On the other hand, the MATLAB NNTOOL, which was used for the ANN model in this paper, assigns 85% of the data points to the training dataset by default and the remainder to the testing dataset. Therefore, the development of the emission factor equations for the paved and unpaved roads applied two conditions: 1) a random percentage between 75% and 85% of the data points for training, with the remainder for testing, and 2) an equal number of training data points for both paved and unpaved roads to avoid a possible argument about the reliability of the ANN model. As a result, Table 5 shows the numbers and percentages of the data points determined for the model training and testing. The three input variables for the paved roads and the six input variables for the unpaved roads stated in the EPA reports yield 7 and 63 possible non-empty combinations of input variables (2^3 − 1 and 2^6 − 1), respectively.

Table 3. Statistical Properties of the Paved Road Data.

| Label | Unit | Sample | Min. | Max. | Mean | Std. |
|---|---|---|---|---|---|---|
| SL | g/m² | 84 | 0.0127 | 287 | 10.08 | 39.93 |
| MVW | Tons | 84 | 2 | 41 | 16.37 | 16.98 |
| AVS | mph | 84 | 1 | 55 | 21.08 | 16.7 |
| PM10 | g/VMT | 84 | 0.084 | 1750 | 60.8 | 211.95 |

Table 4. Statistical Properties of the Unpaved Road Data.

| Label | Unit | Sample | Min. | Max. | Mean | Std. |
|---|---|---|---|---|---|---|
| Silt | % | 92 | 1.82 | 35.1 | 8.31 | 5.31 |
| MVW | Tons | 92 | 1.8 | 286 | 56.05 | 72.34 |
| Moisture | % | 92 | 0 | 8.5 | 1.73 | 2.04 |
| AVS | mph | 92 | 8.8 | 43.1 | 24.85 | 8.57 |
| NoVP | # | 92 | 4.63 | 330 | 66.13 | 73.82 |
| MNoVW | # | 92 | 4 | 10 | 5.26 | 1.6 |
| PM10 | lb/VMT | 92 | 0.09 | 15.6 | 3.42 | 3.43 |

5.2. ANN model configuration

The MATLAB NNTOOL was used to develop the proposed ANN model considering the following configuration elements: training algorithm, training regularization method, and activation function. Table 6 summarizes the methods selected for these configuration elements through the literature review in the previous section. The structure of the presented ANN model for each of the paved and unpaved roads consists of one input layer, one hidden layer, and one output layer. The input layers for the paved and unpaved roads include three and six neurons for the input variables, respectively, each with one additional bias neuron.

Although it is acceptable to include more than one hidden layer in an ANN model, the model application for both cases in this paper considered one hidden layer for the following reasons: (1) too many hidden layers may result in a significant deterioration of the ANN model learning (Macukow, 2016); (2) there is no conclusive evidence that the performance of the ANN model would improve with multiple hidden layers (Ahmed, 2005); and (3) many past studies stated that one hidden layer is sufficient to solve the majority of complex non-linear problems (Karsoliya, 2012). A common practice for determining the optimal number of neurons in the hidden layer is the trial-and-error method (Mohd-Safar et al., 2018).
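As a rough illustration of such a trial-and-error search, the sketch below scores each candidate hidden-layer size several times and keeps the size with the highest average score. It is only a sketch under stated assumptions: `select_hidden_size`, `train_and_score`, and `toy_score` are hypothetical names, and `toy_score` is a made-up stand-in for training the network in the MATLAB NNTOOL and reading off the test correlation, so it does not reproduce the paper's results.

```python
import random
import statistics

def select_hidden_size(candidates, n_repeats, train_and_score):
    """Return (best_size, mean_scores) for a trial-and-error search.

    train_and_score(n_hidden, run) must train one model and return its test score.
    """
    mean_scores = {}
    for n_hidden in candidates:
        scores = [train_and_score(n_hidden, run) for run in range(n_repeats)]
        mean_scores[n_hidden] = statistics.mean(scores)
    best_size = max(mean_scores, key=mean_scores.get)
    return best_size, mean_scores

def toy_score(n_hidden, run):
    """Hypothetical stand-in for training an ANN and returning its Test R."""
    rng = random.Random(1000 * n_hidden + run)   # reproducible noise per run
    trend = 0.9 - ((n_hidden - 8) / 20.0) ** 2   # made-up curve peaking at 8
    return trend + rng.uniform(-0.001, 0.001)

# Candidate sizes and repetition count are free parameters of the search.
best_size, mean_scores = select_hidden_size(range(1, 21), 10, toy_score)
print(best_size)
```

Averaging over repeated runs is the design choice that matters here: a single training run depends on the random data split and initial weights, so comparing averages gives a steadier basis for choosing the size.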
According to a rule of thumb, the optimal number of hidden neurons could be twice the number of input neurons plus the number of output neurons (Hecht-Nielsen, 1989; Ahmad et al., 2017). This implies that the optimal number of hidden neurons for the unpaved roads might be 13 (i.e., 6 input neurons × 2 + 1 output neuron). Therefore, to determine the optimal number of hidden neurons, numbers of hidden neurons ranging from 1 to 20 were considered, taking into account the computing time needed to test each number of hidden neurons.

5.3. Optimal combination of the input variables and number of the hidden neurons

The ANN model was trained 10 times on the normalized training data points for each number of hidden neurons from 1 to 20, which produced 10 Test R values for each number of hidden neurons. The Test R values for each number of hidden neurons were averaged, and the averages were compared to identify the highest average Test R, which indicates the optimal number of hidden neurons. This training process was repeated for all possible combinations of the input variables for each of the paved and unpaved roads. The highest average Test R values of all combinations are illustrated in Figs. 4 through 6. In Fig. 4, the input variables a, b, and c represent silt loading, mean vehicle weight, and mean vehicle speed, respectively. In Figs. 5 and 6, the input variables 1, 2, 3, 4, 5, and 6 represent silt content, mean vehicle weight, moisture content, mean vehicle speed, number of vehicle passes, and mean number of wheels, respectively. It should be noted that each of the paved and unpaved roads shows two different training results based on the data normalized by the min-max and Z-score normalization methods. The training results clearly show the optimal combinations of the input variables and numbers of hidden neurons for the paved and unpaved roads, which are emphasized in Figs.
4 through 6 and summarized in Table 7. The weights of the input variables, hidden neurons, and bias neurons are presented in Tables 8 through 11.

Table 5. The Numbers and Percentages of the Training and Testing Data.

| Pavement Type | Training Data | Testing Data | Percentage (Training/Total) |
|---|---|---|---|
| Paved | 70 | 14 | 83.3% |
| Unpaved | 70 | 22 | 76.1% |

Table 6. Methods Selected for the Configuration Elements.

| Element | Selected Method |
|---|---|
| Training Algorithm | Back-propagation |
| Training Regularization | Bayesian regularization |
| Activation Function (Hidden Layer) | Logistic activation function |
| Activation Function (Output Layer) | Linear activation function |

Fig. 4. Number of Hidden Neurons at the Highest Average Test R for each Combination (Paved Roads); a – silt loading, b – mean vehicle weight, and c – mean vehicle speed.

Fig. 5. Number of Hidden Neurons at the Highest Average Test R for each Combination, Min-Max Normalization (Unpaved Roads); 1 – silt content, 2 – mean vehicle weight, 3 – moisture content, 4 – mean vehicle speed, 5 – number of vehicle passes, and 6 – mean number of wheels.

5.4. Results of model performance testing

The performance of the ANN model was evaluated by comparing its RMSE values with those obtained using the EPA emission factor equations. R2 based on the Test R was also used to evaluate the model performance, as the Test R indicates the predictive power of a model. For the paved roads, the ANN model with the min-max and Z-score normalization methods and the EPA paved road emission factor equation yielded RMSE values of 201.98, 200.60, and 446.83, respectively, and R2 values of 0.90, 0.89, and 0.72, respectively. This implies that the presented ANN model is capable of improving the quality of the EPA emission factor equation for the paved roads, with reductions of 54.8% and 55.1% in the RMSE values and increases of 25.0% and 23.6% in the R2 values. For the unpaved roads, the RMSE values were 2.22, 2.23, and 2.98, respectively, and the R2 values were 0.72, 0.79, and 0.35, respectively. The RMSE values for the unpaved roads are much smaller than those for the paved roads because of the different units of PM10 for the paved roads (i.e., g/VMT) and unpaved roads (i.e., lb/VMT). The ANN model improved on the EPA emission factor equation for the unpaved roads with reductions of 25.5% and 25.2% in the RMSE values and increases of 105.7% and 125.7% in the R2 values. It should be noted that the R2 values of the EPA emission factor equations for the paved and unpaved roads were retrieved from the relevant EPA reports (EPA, 1998; 2011).

The reliability of the performance results from the ANN model was also tested by examining the probabilities that the model


Fig. 6. Number of Hidden Neurons at the Highest Average Test R for each Combination, Z-Score Normalization (Unpaved Roads); 1 – silt content, 2 – mean vehicle weight, 3 – moisture content, 4 – mean vehicle speed, 5 – number of vehicle passes, and 6 – mean number of wheels.

Table 7. The Optimal Combination of Input Variables and Number of Hidden Neurons.

| Normalization Method | Paved Roads: Optimal Combination | Paved Roads: Optimal No. of Hidden Neurons | Unpaved Roads: Optimal Combination | Unpaved Roads: Optimal No. of Hidden Neurons |
|---|---|---|---|---|
| Min-Max | SL, MVW, and AVS | 14 | Silt, MVW, Moisture, AVS, and MNoVW | 16 |
| Z-score | SL, MVW, and AVS | 15 | Silt, MVW, Moisture, AVS, and MNoVW | 16 |
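Table 7 reports the best of all non-empty subsets of the input variables, which Section 5.1 counts as 7 for the paved roads and 63 for the unpaved roads. As a small check on those counts, the sketch below enumerates the subsets with the standard library. The function name `input_combinations` is an illustrative assumption, and the variable labels are the abbreviations used in Tables 3 and 4.

```python
from itertools import combinations

def input_combinations(variables):
    """Return every non-empty subset of the input variables, smallest first."""
    subsets = []
    for size in range(1, len(variables) + 1):
        subsets.extend(combinations(variables, size))
    return subsets

paved_inputs = ["SL", "MVW", "AVS"]
unpaved_inputs = ["Silt", "MVW", "Moisture", "AVS", "NoVP", "MNoVW"]

paved_subsets = input_combinations(paved_inputs)
unpaved_subsets = input_combinations(unpaved_inputs)

# Counts equal 2**3 - 1 and 2**6 - 1, one candidate model input set per subset.
print(len(paved_subsets), len(unpaved_subsets))
```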

Table 8. Weights of the ANN Model for the Paved Roads (Min-Max).

| To Hidden Layer: Input1 | To Hidden Layer: Input2 | To Hidden Layer: Input3 | To Hidden Layer: Bias | To Output Layer: Hidden Neuron | To Output Layer: Bias |
|---|---|---|---|---|---|
| 0.0350 | 0.0022 | −0.0433 | 0.0023 | 0.0900 | 0.1133 |
| 0.0499 | 0.0031 | −0.0613 | 0.0031 | 0.1133 | |
| −0.5201 | 1.3647 | 0.9376 | −0.0411 | −1.3033 | |
| 0.0418 | 0.0026 | −0.0515 | 0.0027 | 0.1006 | |
| 0.0410 | 0.0026 | −0.0506 | 0.0026 | 0.0994 | |
| 0.6745 | 0.2473 | 1.4117 | 0.0693 | 1.1357 | |
| −1.3430 | −0.8967 | 0.1209 | 0.1597 | −1.2358 | |
| 1.0409 | −1.0926 | 0.2677 | −0.3691 | −1.0428 | |
| 0.0481 | 0.0030 | −0.0591 | 0.0030 | 0.1105 | |
| 0.0454 | 0.0029 | −0.0559 | 0.0029 | 0.1062 | |
| 0.0354 | 0.0022 | −0.0438 | 0.0023 | 0.0906 | |
| 0.0485 | 0.0031 | −0.0597 | 0.0030 | 0.1112 | |
| 0.0344 | 0.0022 | −0.0425 | 0.0023 | 0.0890 | |
| 0.0479 | 0.0030 | −0.0589 | 0.0030 | 0.1102 | |
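The weights in Table 8 define the fitted emission factor equation: one set of input weights and a bias per hidden neuron feeds the logistic hidden layer, and one weight per hidden neuron plus a bias feeds the linear output neuron (Table 6). A minimal sketch of that computation follows. It assumes min-max scaling of the inputs to [0, 1], which the paper does not state; the function names and the two-neuron example weights are illustrative assumptions and are not the fitted values. Only the minimum and maximum statistics are taken from Table 3.

```python
import math

def min_max(value, lo, hi):
    """Scale a raw value to [0, 1] (assumed target range)."""
    return (value - lo) / (hi - lo)

def logistic(z):
    """Logistic activation used in the hidden layer."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One-hidden-layer network: logistic hidden neurons, linear output."""
    hidden = [
        logistic(bias + sum(w * x for w, x in zip(weights, inputs)))
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return output_bias + sum(v * h for v, h in zip(output_weights, hidden))

# (min, max) per paved-road input, from Table 3: SL, MVW, AVS.
PAVED_RANGES = [(0.0127, 287.0), (2.0, 41.0), (1.0, 55.0)]

# Hypothetical two-neuron network, NOT the fitted weights of Table 8.
hidden_weights = [[0.5, -0.2, 0.1], [-0.3, 0.4, 0.2]]
hidden_biases = [0.0, 0.1]
output_weights = [1.0, -1.0]
output_bias = 0.05

raw = [10.0, 16.0, 21.0]  # made-up point: SL in g/m^2, MVW in tons, AVS in mph
scaled = [min_max(x, lo, hi) for x, (lo, hi) in zip(raw, PAVED_RANGES)]
prediction = forward(scaled, hidden_weights, hidden_biases, output_weights, output_bias)
print(prediction)  # output is on the normalized PM10 scale, not in g/VMT
```

Substituting the weight and bias values of a fitted model for the hypothetical ones gives a prediction on the normalized PM10 scale, which would still need to be mapped back to g/VMT with the inverse of whatever normalization was used in training.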

outperforms the EPA emission factor equations, because the average RMSE and R2 values are subject to variability. The reliability test was performed only for the RMSE values, as the two statistical measures (i.e., RMSE and R2) lead to the same interpretation in model validation. A total of 100 RMSE values was therefore generated for each of the paved and unpaved roads, and the frequencies and the probability distribution of the values were identified as shown in Fig. 7. The normality of the probability distributions for the paved and unpaved roads was examined with the Shapiro-Wilk test (W = 0.62 and p-value = 1.124e-14 for the paved roads; W = 0.84 and p-value = 5.98e-09 for the unpaved roads). Because small values of the W statistic and small p-values are evidence against normality, the normal distributions fitted in Fig. 7, and the Z-score probabilities derived from them, should be read as approximations. The Z-score of the EPA RMSE value for the paved roads is 4.93, which implies that the ANN model is almost 100% likely to provide more accurate PM10 estimates than the EPA emission factor equation. On the other hand, the Z-score of the EPA RMSE value for the unpaved roads is 1.33, at which the probability is 90.8%. Thus, the ANN model shows high reliability in estimating PM10 for unpaved roads


Table 9. Weights of the ANN Model for the Paved Roads (Z-score).

| To Hidden Layer: Input1 | To Hidden Layer: Input2 | To Hidden Layer: Input3 | To Hidden Layer: Bias | To Output Layer: Hidden Neuron | To Output Layer: Bias |
|---|---|---|---|---|---|
| −0.0308 | −0.0330 | 0.0438 | 0.0017 | −0.0845 | −0.0749 |
| −0.0308 | −0.0330 | 0.0438 | 0.0017 | −0.0845 | |
| −0.9041 | 0.9853 | −0.2613 | 0.3611 | 0.9885 | |
| −0.0308 | −0.0330 | 0.0438 | 0.0017 | −0.0845 | |
| 0.4890 | −1.4700 | −1.0332 | −0.0018 | 1.2486 | |
| −0.0307 | −0.0330 | 0.0437 | 0.0017 | −0.0844 | |
| −0.0308 | −0.0330 | 0.0438 | 0.0017 | −0.0845 | |
| −0.7116 | −0.3434 | −1.4099 | −0.0737 | −1.1508 | |
| −1.1558 | −0.8739 | 0.0749 | 0.2210 | −1.1989 | |
| −0.0307 | −0.0330 | 0.0437 | 0.0017 | −0.0844 | |
| −0.0243 | −0.0262 | 0.0347 | 0.0013 | −0.0713 | |
| −0.0313 | −0.0336 | 0.0445 | 0.0017 | −0.0855 | |
| −0.0308 | −0.0330 | 0.0438 | 0.0017 | −0.0845 | |
| −0.0308 | −0.0330 | 0.0438 | 0.0017 | −0.0845 | |
| −0.0307 | −0.0330 | 0.0437 | 0.0017 | −0.0843 | |

Table 10. Weights of the ANN Model for the Unpaved Roads (Min-Max).

| To Hidden Layer: Input 1 | To Hidden Layer: Input 2 | To Hidden Layer: Input 3 | To Hidden Layer: Input 4 | To Hidden Layer: Input 5 | To Hidden Layer: Bias | To Output Layer: Hidden Neurons | To Output Layer: Bias |
|---|---|---|---|---|---|---|---|
| −0.5278 | −0.0657 | −0.9393 | −0.0574 | −0.5772 | 0.3307 | 0.7279 | −0.0161 |
| 0.1712 | −0.1568 | −0.3706 | −0.1764 | 0.0019 | −0.1203 | −0.5389 | |
| 0.1712 | −0.1568 | −0.3705 | −0.1764 | 0.0019 | −0.1203 | −0.5388 | |
| 0.1712 | −0.1568 | −0.3706 | −0.1764 | 0.0019 | −0.1203 | −0.5389 | |
| 0.1712 | −0.1568 | −0.3706 | −0.1764 | 0.0019 | −0.1203 | −0.5389 | |
| −0.1554 | 0.1528 | 0.3414 | 0.1697 | 0.0016 | 0.0983 | 0.4894 | |
| 0.1712 | −0.1568 | −0.3706 | −0.1764 | 0.0019 | −0.1203 | −0.5389 | |
| 0.1712 | −0.1568 | −0.3706 | −0.1764 | 0.0019 | −0.1203 | −0.5389 | |
| −0.1554 | 0.1528 | 0.3414 | 0.1697 | 0.0016 | 0.0983 | 0.4895 | |
| 0.1712 | −0.1568 | −0.3706 | −0.1764 | 0.0019 | −0.1203 | −0.5389 | |
| 0.1712 | −0.1568 | −0.3706 | −0.1764 | 0.0019 | −0.1203 | −0.5389 | |
| 0.1712 | −0.1568 | −0.3706 | −0.1764 | 0.0019 | −0.1203 | −0.5389 | |
| 0.1712 | −0.1568 | −0.3706 | −0.1764 | 0.0019 | −0.1203 | −0.5389 | |
| 0.1712 | −0.1568 | −0.3706 | −0.1764 | 0.0019 | −0.1203 | −0.5389 | |
| 0.4359 | −1.1423 | −1.5775 | −2.2301 | 1.0186 | −0.4618 | 1.9228 | |
| 2.9865 | 2.2770 | −1.0050 | 2.0216 | −1.6566 | −0.8312 | 3.3296 | |

Table 11. Weights of the ANN Model for the Unpaved Roads (Z-score).

| To Hidden Layer: Input 1 | To Hidden Layer: Input 2 | To Hidden Layer: Input 3 | To Hidden Layer: Input 4 | To Hidden Layer: Input 5 | To Hidden Layer: Bias | To Output Layer: Hidden Neurons | To Output Layer: Bias |
|---|---|---|---|---|---|---|---|
| −0.1359 | 0.1750 | 0.3061 | 0.1819 | −0.1574 | 0.0646 | 0.4783 | −0.0511 |
| −0.1379 | 0.1763 | 0.3092 | 0.1828 | −0.1589 | 0.0666 | 0.4834 | |
| −0.1379 | 0.1763 | 0.3091 | 0.1828 | −0.1588 | 0.0666 | 0.4833 | |
| −0.1377 | 0.1762 | 0.3088 | 0.1827 | −0.1587 | 0.0664 | 0.4828 | |
| −0.5380 | −1.3476 | 1.1010 | −1.2798 | −0.9599 | 0.4582 | −1.6390 | |
| −0.1382 | 0.1765 | 0.3096 | 0.1829 | −0.1591 | 0.0669 | 0.4841 | |
| −2.1376 | −2.1258 | 0.6827 | −1.7991 | 1.7618 | 0.9345 | −2.8413 | |
| −0.1382 | 0.1765 | 0.3096 | 0.1829 | −0.1591 | 0.0669 | 0.4841 | |
| −0.1377 | 0.1762 | 0.3088 | 0.1827 | −0.1587 | 0.0664 | 0.4828 | |
| −0.1375 | 0.1760 | 0.3085 | 0.1826 | −0.1586 | 0.0662 | 0.4823 | |
| −0.1380 | 0.1764 | 0.3093 | 0.1828 | −0.1589 | 0.0667 | 0.4836 | |
| 0.1804 | −0.2039 | −0.3737 | −0.1984 | 0.1883 | −0.1103 | −0.6172 | |
| −0.1380 | 0.1763 | 0.3092 | 0.1828 | −0.1589 | 0.0667 | 0.4835 | |
| 0.0303 | −2.0029 | −1.0552 | −2.5062 | 1.1190 | −1.0560 | 2.3512 | |
| 0.1795 | −0.2034 | −0.3723 | −0.1984 | 0.1878 | −0.1092 | −0.6151 | |
| −0.1124 | 0.1594 | 0.2699 | 0.1691 | −0.1395 | 0.0434 | 0.4176 | |


Fig. 7. Frequencies and Normal Probability Distributions of the RMSE Values for the Paved (left) and Unpaved Roads (right).
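The probabilities quoted in Section 5.4 for the two Z-scores follow from the standard normal cumulative distribution function. As a quick check, and assuming as Fig. 7 does that the 100 ANN RMSE values are normally distributed, the sketch below evaluates that function with the standard library. The helper name `normal_cdf` is an illustrative assumption.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Z-scores of the EPA RMSE values relative to the ANN RMSE distributions (Section 5.4).
z_paved = 4.93
z_unpaved = 1.33

# Probability that an ANN RMSE value falls below the EPA RMSE value.
p_paved = normal_cdf(z_paved)
p_unpaved = normal_cdf(z_unpaved)

print(f"unpaved: {p_unpaved:.3f}")
print(f"paved shortfall from 1: {1.0 - p_paved:.1e}")
```

The unpaved-road value rounds to 0.908, matching the 90.8% reported in the text, and the paved-road value differs from 1 only in the seventh decimal place.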

although there is a slight chance that the EPA emission factor equation works better.

6. Conclusions

This paper presented an ANN model-based method to efficiently represent the complex, non-linear properties of the PM emissions from paved and unpaved roads. The presented method was evaluated through a case study using the data from which the EPA emission factors were developed. The application results demonstrated the effectiveness of the method through improved RMSE and R2 values compared with the EPA emission factors. This implies that the presented method contributes to enhancing the quality of the emission factor equations for paved and unpaved roads. Therefore, it is expected that the presented method can help agencies establish reliable PM control strategies for paved and unpaved roads through more predictable PM emission levels that account for the affecting factors. The application results also indicate that the presented method adds to the EPA's knowledge base by using an ANN to increase the accuracy of the emission factors. Finally, the selection process used in this paper to identify the optimal combination of the input variables has the potential to be applied in other fields (e.g., financial analysis, electrical engineering, architecture design, and wastewater treatment) that require a selection process for ANN models.

However, the presented method is subject to limitations that should be investigated in future research. First, the application results were based on a dataset that did not include any meteorological data, because the method was presented to demonstrate its ability to enhance the quality of the EPA emission factors for paved and unpaved roads. Meteorological conditions are important impact factors for PM because rainfall, temperature, and wind speed and direction disperse and dilute the PM in the atmosphere.
Second, the random selection process of the MATLAB NNTOOL used to train the ANN model might introduce a selection bias when it randomly sets aside 15% of the training dataset as testing data to assess the predictive power. That is, during the random selection process, each data point in the training dataset has an equal chance of being selected as a test data point. This implies that the same data points might be selected as test data in multiple training runs, which then affects the average value of Test R. Future work should consider ways to eliminate or reduce this selection bias. Third, this paper identified the optimal number of hidden neurons considering a maximum of 20 neurons, as there is no quantifiable answer for the optimal number of hidden neurons for any particular application, only general rules of thumb. Therefore, it would be worthwhile in future work to consider a higher maximum number of hidden neurons for possibly better prediction ability, as long as the computing time remains practical.

Appendix A. Supplementary material

Supplementary data to this article can be found online at https://doi.org/10.1016/j.trd.2019.01.033.

References

American Society for Testing and Materials, 2002. Standard practice for dealing with outlying observations. ASTM Standard E178-02, ASTM, West Conshohocken, PA.
Ahmed, F.E., 2005. Artificial neural networks for diagnosis and survival prediction in colon cancer. Mol. Cancer 4, 29.
Ahmad, M.W., Mourshed, M., Rezgui, Y., 2017. Trees vs neurons: Comparison between random forest and ANN for high-resolution prediction of building energy consumption. Energy Build. 147, 77–89. https://doi.org/10.1016/j.enbuild.2017.04.038.
Atomi, W.H., 2012. The effect of data preprocessing on the performance of artificial neural networks techniques for classification problems. Thesis Under Publication. University Tun Hussein Onn, Malaysia.
Baghirli, O., 2015. Comparison of Lavenberg-Marquardt, Scaled Conjugate Gradient And Bayesian Regularization Backpropagation Algorithms for Multistep Ahead Wind Speed Forecasting Using Multilayer Perceptron Feedforward Neural Network. Dissertation, no. June, p. Uppsala University.
Burden, F., Winkler, D., 2008. Bayesian Regularization of Neural Networks. Methods in Molecular Biology™ Artificial Neural Networks, pp. 23–42. https://doi.org/10.1007/978-1-60327-101-1_3.
Chow, J.C., Watson, J.G., Lowenthal, D.H., Countess, R.J., 1996. Sources and chemistry of PM10 aerosol in Santa Barbara County, CA. Atmos. Environ. 30 (9), 1489–1499. https://doi.org/10.1016/1352-2310(95)00363-0.
Chiarazzo, V., Caggiani, L., Marinelli, M., Ottomanelli, M., 2014. A neural network based model for real estate price estimation considering environmental quality of property location. Transport. Res. Proc. 3, 810–817. https://doi.org/10.1016/j.trpro.2014.10.067.


Dockery, D.W., 2001. Epidemiologic evidence of cardiovascular effects of particulate air pollution. Environ. Health Perspect. 109 (S4), 483–486. https://doi.org/10.1289/ehp.01109s4483.
De Veaux, R.D., Ungar, L.H.A., 2012. Brief introduction to artificial neural networks. Artif. Adapt. Syst. Med. New Theor. Models New Appl. 5–11. https://doi.org/10.2174/978160805042010901010005.
Harrison, R.M., Jones, M., 1995. The chemical composition of airborne particles in the UK atmosphere. Sci. Total Environ. 168 (3), 195–214.
Hecht-Nielsen, R., 1989. Theory of the back propagation neural network. Int. Joint Conf. Neural Netw. – IJCNN 1, 593–605. https://doi.org/10.1109/IJCNN.1989.118638.
Jayalakshmi, T., Santhakumaran, A., 2011. Statistical normalization and back propagation for classification. Int. J. Comput. Theory Eng. 89–93. https://doi.org/10.7763/ijcte.2011.v3.288.
Kathirvalavakumar, T., Subavathi, S.J., 2012. Modified backpropagation algorithm with adaptive learning rate based on differential errors and differential functional constraints. In: International Conference on Pattern Recognition, Informatics and Medical Engineering (PRIME-2012). https://doi.org/10.1109/icprime.2012.6208288.
Karsoliya, S., 2012. Approximating number of hidden layer neurons in multiple hidden layer BPNN architecture. Int. J. Eng. Trends Technol. 3 (6), 714–717.
Kayri, M., 2016. Predictive abilities of Bayesian regularization and Levenberg–Marquardt algorithms in artificial neural networks: a comparative empirical study on social data. Math. Comput. Appl. 21 (2), 20. https://doi.org/10.3390/mca21020020.
Khamis, A., Ismail, Z., Haron, K., Tarmizi Mohammed, A., 2005. The effects of outliers data on neural network performance. J. Appl. Sci. 5 (8), 1394–1398.
Lashgari, A., Kecojevic, V., 2015. Comparative analysis of dust emission of digging and loading equipment in surface coal mining. Int. J. Min., Reclam. Environ. 30 (3), 181–196. https://doi.org/10.1080/17480930.2015.1028516.
Lourakis, M., Argyros, A., 2005. Is Levenberg-Marquardt the most efficient optimization algorithm for implementing bundle adjustment? In: Tenth IEEE International Conference on Computer Vision (ICCV05) Volume 1. https://doi.org/10.1109/iccv.2005.128.
Macukow, B., 2016. Neural Networks – State of Art, Brief History, Basic Models and Architecture. Computer Information Systems and Industrial Management, Lecture Notes in Computer Science, pp. 3–14. https://doi.org/10.1007/978-3-319-45378-1_1.
Mohd-Safar, N.Z., Ndzi, D., Sanders, D., Noor, H.M., Kamarudin, L.M., 2018. Integration of fuzzy c-means and artificial neural network for short-term localized rainfall forecasting in tropical climate. In: Bi, Y., Kapoor, S., Bhatia, R. (Eds.), Studies in Computational Intelligence, vol. 751. Springer, pp. 325–348.
Molina, M.J., Molina, L.T., 2004. Megacities and atmospheric pollution. J. Air Waste Manage. Assoc. 54 (6), 644–680.
Panchal, G., Ganatra, A., Shah, P., Panchal, D., 2011. Determination of over-learning and over-fitting problem in back propagation neural network. Int. J. Soft Comput. 2 (2), 40–51. https://doi.org/10.5121/ijsc.2011.2204.
Patel, V.R., Mehta, R.G., 2011. Impact of outlier removal and normalization approach in modified k-means clustering algorithm. IJCSI Int. J. Comput. Sci. Issues 8 (5), 2.
Philpott, D., 2015. Critical Government Documents on the Environment. Bernan Press, Lanham.
Pope, C.A., Schwartz, J., Ransom, M.R., 1992. Daily mortality and PM10 Pollution in Utah Valley. Arch. Environ. Health: An Int. J. 47 (3), 211–217.
Pope, C.A., 1995. 213 Particulate pollution and health. Epidemiology 6 (2), S42.
Pope 3rd, C.A., Burnett, R.T., Thun, M.J., Calle, E.E., Krewski, D., Ito, K., Thurston, G.D., 2002. Lung cancer, cardiopulmonary mortality, and long-term exposure to fine particulate air pollution. J. Am. Med. Assoc. 287 (9), 1132–1141.
Roweis, S., 1996. Levenberg-Marquardt Optimization. University of Toronto, Toronto, ON, Canada.
Sager, L., 2016. Estimating the effect of air pollution on road safety using atmospheric temperature inversions. Working Paper No. 251, Grantham Research Institute on Climate Change and the Environment, London, UK.
Schwartz, J., Slater, D., Larson, T.V., Pierson, W.E., Koenig, J.Q., 1993. Particulate air pollution and hospital emergency room visits for asthma in Seattle. Am. Rev. Respir. Dis. 147, 826–831.
Shahin, M.A., Maier, H.R., Jaksa, M.B., 2004. Data division for developing neural networks applied to geotechnical engineering. J. Comput. Civil Eng. 18 (2), 105–114. https://doi.org/10.1061/(asce)0887-3801(2004)18:2(105).
Sharma, B., Venugopalan, P.K., 2014. Comparison of neural network training functions for hematoma classification in brain CT images. IOSR J. Comput. Eng. 16 (1), 31–35. https://doi.org/10.9790/0661-16123135.
Ticknor, J.L., 2013. A Bayesian regularized artificial neural network for stock market forecasting. Exp. Syst. Appl. 40 (14), 5501–5506. https://doi.org/10.1016/j.eswa.2013.04.013.
U.S. Environmental Protection Agency, 1985. Compilation of Air Pollutant Emission Factors, Vol. I: Stationary Point and Area Sources, AP-42, 4th ed., GPO No. 055000-00251-7. EPA, Research Triangle Park, NC.
U.S. Environmental Protection Agency, 1995. Introduction to AP 42, Volume I, Fifth Edition. Available at https://www3.epa.gov/ttn/chief/ap42/c00s00.pdf.
U.S. Environmental Protection Agency, 1998. Emission factor Documentation for AP-42, Section 13.2.2 Unpaved Road. Available at https://www3.epa.gov/ttn/chief/ap42/ch13/bgdocs/b13s02-2.pdf.
U.S. Environmental Protection Agency, 2001. Emergency Planning and Community Right-To-Know Act-section 313: guidance for reporting toxic chemicals: mercury and mercury compound category. United States Environmental Protection Agency, Office of Environmental Information, Washington, D.C.
U.S. Environmental Protection Agency, 2004b. The Particle Pollution Report: Current Understanding of Air Quality and Emissions through 2003. Report no. EPA/454-R-04-002, Emissions, Monitoring, and Analysis Division, Office of Air Quality Planning and Standards, Research Triangle Park, NC 27711. Available at https://www.epa.gov/sites/production/files/2017-11/documents/pp_report_2003.pdf.
U.S. Environmental Protection Agency, 2011. Emission factor Documentation for AP-42, Section 13.2.1 Paved Road. Available at https://www3.epa.gov/ttn/chief/ap42/ch13/bgdocs/b13s0201.pdf.
Watson, J.G., Chow, J.C., 2000. Reconciling Urban Fugitive Dust Emissions Inventory and Ambient Source Contribution Estimates: Summary of Current Knowledge and Needed Research. DRI, Document No. 6110.4F.
World Health Organization, 2006. WHO air quality guidelines for particulate matter, ozone, nitrogen dioxide and sulfur dioxide-Global update 2005. Regional Office for Europe, Copenhagen, Denmark.
World Health Organization, 2013. Health effects of particulate matter: Policy implications for countries in eastern Europe, Caucasus and central Asia. Regional Office for Europe, Copenhagen, Denmark.
