Atmospheric Environment 60 (2012) 142–152
Atmospheric Environment journal homepage: www.elsevier.com/locate/atmosenv
Identifying controlling factors of ground-level ozone levels over southwestern Taiwan using a decision tree

Hone-Jay Chu (a), Chuan-Yao Lin (b), Churn-Jung Liau (c), Yi-Ming Kuo (d,*)

(a) Department of Geomatics, National Cheng-Kung University, Tainan, Taiwan
(b) Research Center for Environmental Changes, Academia Sinica, Taipei 115, Taiwan
(c) Institute of Information Science, Academia Sinica, Taipei 115, Taiwan
(d) Department of Design for Sustainable Environment, Ming Dao University, 369 Wen-Hua Rd., Peetow, Chang-Hua 52345, Taiwan
Highlights

- We investigate the temporal variation of O3 using a decision tree approach.
- Temperature, wind speed, VOCs, and NOx mainly affect ozone variations.
- Spatial ozone patterns are caused by sea-land breeze and pollutant sources.
Article info

Article history: Received 9 November 2011; received in revised form 7 June 2012; accepted 8 June 2012.

Keywords: Decision tree; Ozone; Volatile organic compounds; Nitrogen oxides; Meteorological conditions

Abstract

Kaohsiung City and the suburban region of southwestern Taiwan have suffered from severe air pollution since becoming the largest center of heavy industry in Taiwan. The complex process of ozone (O3) formation from its precursor compounds (volatile organic compounds (VOCs) and nitrogen oxides (NOx)), together with meteorological conditions, makes controlling ozone difficult. A decision tree is especially appropriate for analyzing time series data that contain ozone levels along with meteorological and explanatory variables for ozone formation. Results show that dominant variables such as temperature, wind speed, VOCs, and NOx play vital roles in describing ozone variations among observations. That temperature and wind speed are highly correlated with ozone levels indicates that these meteorological conditions largely affect ozone variability. The results also demonstrate that the spatial heterogeneity of ozone patterns between coastal and inland areas is caused by the sea-land breeze and pollutant sources during high ozone episodes over southwestern Taiwan. This study used a decision tree to obtain quantitative insight into the spatial distributions of precursor compound emissions and the effects of meteorological conditions on ozone levels, which is useful for refining monitoring plans and developing management strategies. © 2012 Elsevier Ltd. All rights reserved.
1. Introduction

Tropospheric ozone (O3) is a critical component of photochemical smog that may cause considerable human health hazards (Mudway and Kelly, 2000; Sillman, 2003; Geddes et al., 2009). Nevertheless, because ozone is a secondary pollutant, its atmospheric levels are often difficult to control. Nitrogen oxides (NOx) and volatile organic compounds (VOCs) originate from various sources and can exhibit a nonlinear effect on local ground-level ozone production (Geddes et al., 2009). Large amounts of ozone are produced in urban and downwind rural areas of Taiwan.
* Corresponding author. Tel.: +886 4 8876660x8616. E-mail address: [email protected] (Y.-M. Kuo).
1352-2310/$ – see front matter © 2012 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.atmosenv.2012.06.032
The Kao-Ping area (including Kaohsiung City and Pingtung County) in southern Taiwan is densely occupied by highly polluting industrial facilities such as petrochemical, steel, and power plants. Chang et al. (2005) used hourly ozone concentration as an index of photochemical activity to distinguish traffic conditions. Photochemical activity is an essential factor in ozone formation, and local weather conditions constantly influence air pollution levels in urban areas. Analyzing air pollution and meteorological data facilitates understanding of air pollution mechanisms. Data mining techniques have been used to describe ambient pollution concentrations based on driving factors such as meteorological conditions. Recently, numerous studies have investigated the applicability and potential of several data mining algorithms to model air quality (Kurt and Oktay, 2010; Chen et al., 2010; Jirava et al., 2010; Tzima
et al., 2011). Kurt and Oktay (2010) used neural networks to forecast air pollution indicator levels in Istanbul, Turkey. Tzima et al. (2011) applied a series of models, including a neural network, logistic regression, and an evolutionary rule-induction algorithm, to estimate PM10 (particulate matter with a diameter of less than 10 μm) and O3 concentration levels in the metropolitan area of Thessaloniki, Greece. Chen et al. (2010) conducted cluster analysis followed by meteorological and interspecies correlation to explore features of acidic and basic air pollutants around a high-tech industrial park in central Taiwan. Jirava et al. (2010) applied rough set approaches to air quality assessment in the town of Pardubice, Czech Republic.

The decision tree is an essential data mining technique. Numerous studies have used decision trees to analyze environmental problems (Nerini et al., 2000; Gardner and Dorling, 2000; Kuebler et al., 2002; Bruno et al., 2004; Slini et al., 2006; Hu et al., 2008). Nerini et al. (2000) used a decision tree to forecast daily dissolved oxygen rates in a lagoon along the French Mediterranean coast. Hu et al. (2008) investigated the effect of temperature and air pollutants on total mortality during the summer in Sydney, Australia, revealing that maximal temperature and sulfur dioxide on the same day had significant interaction effects on total mortality. Slini et al. (2006) applied a decision tree to estimate PM10 concentrations for the years 1994–2000 in the municipality of Thessaloniki, Greece. The decision tree is an alternative or complementary non-parametric approach that can accommodate such complex interactions because it avoids any assumption of linear relationships among the variables or of homoscedasticity (Hu et al., 2008). Through successive binary splits, decision tree analysis segments the data into homogeneous subgroups that are well suited to exploring and modeling (Breiman et al., 1984).
Gardner and Dorling (2000) compared linear regression, regression tree, and neural network models of hourly surface ozone concentrations. Although neural network models more accurately capture the underlying relationship between the meteorological variables and hourly ozone concentrations, the regression tree model is among the decision
tree models that are considered to be more readily and physically interpretable. Developing models of hourly surface ozone is difficult because of the nonlinearities and complex interactions between explanatory variables. Such complexity arises because pollution episodes involve atmospheric dynamics, nonlinear chemistry, and various emission sources (Kuebler et al., 2002).

This study proposes two decision tree models for ozone management under scenarios representing certain driving factors and air quality conditions. A decision tree provides information about the index of hourly ozone levels under specific NOx, VOC, and meteorological conditions based on observations. The aims of this study were to investigate the temporal variation of O3 from multiple observations using a decision tree approach and to discover the implications for the visibility of high-pollution episodes in 2009. This study sought to determine the major factors of ozone levels within the Kao-Ping area in Taiwan. A case study approach was used to represent the characteristics of NOx, VOCs, and meteorological conditions that lead to high ozone episodes over the study area.

2. Materials and methods

2.1. Study area and data

Kaohsiung City, encompassing an area of 2950 km² in southern Taiwan, is heavily industrialized and densely populated, with approximately 2.77 million inhabitants, 1,012,000 registered motorcycles, and 377,300 cars. Fig. 1 shows the numerous manufacturing-intensive industrial parks located in a coastal area of approximately 35 km² (Bureau of the Ministry of Economic Affairs, 2011). These industrial parks have caused the coastal area of Kaohsiung City to have the poorest air quality in Taiwan. Therefore, the Taiwanese EPA (Environmental Protection Administration) established general air quality monitoring (GAQM) stations in 1993 to monitor the long-term trends of air pollutants and assess the effectiveness of control strategies. This study collected data from
Fig. 1. Study area and location of air quality monitoring stations in southwestern Taiwan.
Fig. 2. Temporal variation of (a) NOx concentration [ppb], (b) temperature [°C], (c) wind speed [m s⁻¹], and (d) relative humidity [%] during April 29–May 12 (AVE: average values).
the following five GAQM stations: Chen-jen (CJ), Hsiao-kang (HK), Da-liau (DL), Ping-tung (PT), and Mei-nung (MN) (Fig. 1). In addition, to understand O3 problems and determine the mechanisms involved in O3 formation, monitoring O3 and O3 precursors such as VOCs and NOx is necessary. Therefore, the Taiwanese EPA in 2005 established two photochemical assessment
monitoring (PAM) stations in Kaohsiung according to the U.S. EPA methodology to provide accurate, representative, and long-term VOC data. Both PAM stations were deliberately placed to be representative of the northern (Chiau-tou) and southern (Hsiao-kang) areas of Kaohsiung City (Fig. 1). The intensity of UV radiation is also a critical indicator of ozone formation. Unfortunately, hourly
Fig. 3. Temporal variation of volatile organic compounds (VOCs) at the HK and CT stations, including (a) alkanes (AA), (b) alkenes (AE), and (c) aromatic hydrocarbons (AH) [ppb], during April 29–May 12 (AVE: average VOC concentration in the observations).
radiation data were not available in the study area. Therefore, data on ozone precursors (VOCs and NOx) and meteorological variables (temperature, wind speed, and relative humidity) were employed in this study.

2.2. Sample analysis

Hourly time series of O3, NOx, VOCs, and meteorological conditions (temperature, wind speed, and relative humidity) monitored at five GAQM stations and two PAM stations in Kaohsiung were downloaded from the Taiwanese EPA website (http://taqm.epa.gov.tw/taqm/zh-tw/default.aspx). Sampling heights varied from 4 m to 12 m. The 14-day (336 h) monitoring period with the highest average ozone concentrations in 2009 was chosen to represent the high ozone episode examined in this study. This period ran from April 29 to May 12, and hourly data were obtained for each station. Analyses of NOx, O3, and VOCs satisfy the requirements of the US EPA as equivalent or reference methods. NOx and O3 are measured using automated continuous chemiluminescence (ML9841B, Eco-Tech, Canada) and UV absorption (ML 9810B, Eco-Tech) analyzers, respectively. Ambient VOC samples were collected in 6-L stainless steel canisters (Scientific Instrumentation Specialists, Moscow, ID). Approximately fifty species were analyzed and classified according to molecular structure under the categories of alkanes (AA), alkenes (AE), and aromatic hydrocarbons (AH). An independent quality assurance program, quality control procedures, regular maintenance of instruments, and a performance audit are also undertaken by the Taiwanese EPA to ensure data quality.
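The episode selection described above can be illustrated with a short sketch. It assumes, as one possible reading of the text, that the episode is the contiguous 336 h window with the highest mean hourly O3. The function name `best_window` and the gap-free input list are illustrative assumptions and are not part of the original analysis.

```python
from typing import List, Tuple


def best_window(hourly_o3: List[float], window_hours: int = 336) -> Tuple[int, float]:
    """Return (start_index, mean) of the contiguous window with the highest mean O3.

    Assumption: hourly_o3 is a gap-free hourly series; missing values would
    need to be handled before calling this function.
    """
    if window_hours <= 0 or window_hours > len(hourly_o3):
        raise ValueError("window must fit inside the series")

    window_sum = sum(hourly_o3[:window_hours])
    best_start, best_sum = 0, window_sum
    for start in range(1, len(hourly_o3) - window_hours + 1):
        # Slide the window by one hour: add the newest value, drop the oldest.
        window_sum += hourly_o3[start + window_hours - 1] - hourly_o3[start - 1]
        if window_sum > best_sum:
            best_start, best_sum = start, window_sum
    return best_start, best_sum / window_hours
```

For a full year of hourly data, the returned start index would mark the first hour of the candidate episode; ties are resolved in favor of the earliest window.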
Table 1. Descriptive statistics of hourly O3, NOx, VOCs, and meteorological data of five air quality monitoring stations during April 29–May 12, 2009 (AA: alkanes, AE: alkenes, AH: aromatic hydrocarbons).

| Descriptive statistics (a) | O3 [ppb] | NOx [ppb] | Temperature [°C] | Wind speed [m s⁻¹] | Relative humidity [%] | AA [ppb] | AE [ppb] | AH [ppb] |
|---|---|---|---|---|---|---|---|---|
| Mean | 50.5 | 17.4 | 26.0 | 1.9 | 65.4 | 45.2 | 10.2 | 27.6 |
| SD | 21.9 | 4.3 | 2.5 | 0.9 | 11.3 | 20.9 | 6.0 | 13.5 |
| Q25 | 30.5 | 14.6 | 24.1 | 1.1 | 56.3 | 31.0 | 6.8 | 18.8 |
| Q75 | 67.6 | 18.9 | 28.0 | 2.6 | 74.9 | 53.9 | 11.3 | 31.5 |
| Min | 12.5 | 8.0 | 19.9 | 0.6 | 35.3 | 16.5 | 2.7 | 9.0 |
| Max | 107.8 | 36.3 | 32.4 | 3.9 | 83.6 | 174.7 | 46.0 | 112.9 |

(a) SD: standard deviation; Q25: the first quartile; Q75: the third quartile; Min: minimum; Max: maximum.
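The statistics reported in Table 1 can be computed for any one hourly series along the following lines. This is a minimal sketch using only the Python standard library; the function name `describe`, the use of the sample standard deviation, and the "inclusive" quartile convention are assumptions, since the paper does not state which conventions were applied.

```python
import statistics
from typing import Dict, Sequence


def describe(values: Sequence[float]) -> Dict[str, float]:
    """Mean, SD, quartiles, minimum, and maximum of one hourly series."""
    # Assumption: quartiles follow the "inclusive" interpolation method.
    q25, _median, q75 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.fmean(values),
        "sd": statistics.stdev(values),  # assumption: sample SD (n - 1)
        "q25": q25,
        "q75": q75,
        "min": min(values),
        "max": max(values),
    }
```

Applying `describe` to each column (O3, NOx, temperature, and so on) would produce one column of a table laid out like Table 1; small differences from published values can arise from the quartile convention alone.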
2.3. Decision tree models

The decision tree models used in the current study were primarily based on the classification and regression tree (CART) methodology defined by Breiman et al. (1984). CART is a binary tree algorithm that splits a data set into two groups based on the value of the predictor that maximizes the dissimilarity between the groups. The decision tree grows by successive division until further division produces no significant increase in the homogeneity of any node (Breiman et al., 1984; Atkins et al., 2007). Eventually, a node sub-divides no further and becomes a terminal node. CART produces a tree structure using a decision at each node based on an inequality, Xi < c, where Xi corresponds to independent (explanatory) variable i and c is a constant value within the range of Xi values. Observations that satisfy this condition are sent to the left node while the others are sent to the right node (Eisenberg and McKone, 1998). The entire classification problem is structured as a sequence of simple tree-based questions, and the answers to these questions trace a path down the tree (Speybroeck et al., 2004). In this study, hourly time, NOx, VOCs, temperature, and wind speed were selected as the independent variables (Xi), whereas the O3 level was the dependent variable in the CART methodology. Various levels of air pollution are defined for air quality assessment (Jirava et al., 2010). O3 values in the ranges of 0–33, 33–65, and over 65 [ppb h⁻¹] were defined as Levels 1, 2, and 3, respectively. Two decision tree models were defined in this study:

(1) Regional model

$$O_3(i) = f(\mathrm{Hr}, \mathrm{Temp}, \mathrm{WS}, \mathrm{NO}_x, \mathrm{AA}, \mathrm{AE}, \mathrm{AH}) \qquad (1)$$

where O3(i) is the hourly concentration of ozone in observation i [ppb]; Hr is the hour of the day from 0 to 23 [hour]; Temp is the average temperature [°C]; WS is the average wind speed in the study area [m s⁻¹]; and NOx is the average hourly concentration of nitrogen oxides [ppb]. Fig. 2 shows the temporal patterns of NOx and the meteorological data, including temperature, wind speed, and relative humidity, in the study area. Because temperature is highly negatively correlated with relative humidity (correlation coefficient = −0.916), relative humidity is omitted from the models (Fig. 2). VOCs are represented by three major components: the average cumulative concentrations of alkanes (AA), alkenes (AE), and aromatic hydrocarbons (AH) in the PAM observations [ppb]. Fig. 3 shows the temporal patterns of VOCs in the study area.

(2) Local model

$$O_3(i) = f(\mathrm{Hr}, \mathrm{Temp}(i), \mathrm{WS}(i), \mathrm{NO}_x(i), \mathrm{AA}(i), \mathrm{AE}(i), \mathrm{AH}(i)) \qquad (2)$$

where Temp(i) is the local temperature [°C], WS(i) is the local wind speed [m s⁻¹], and NOx(i) is the local hourly concentration of nitrogen oxides [ppb] in observation i. Because only two PAM stations were in the study area, we assumed an average value of VOCs to represent the local values of VOCs. The 336 h of monitoring data from the episode period were used to build the decision trees. In the models, an impure node must contain 10 or more observations to be split, the minimum number of observations per tree leaf is one, and every observation has equal weight. A 10-fold cross validation was used to estimate the test error of the models.

3. Results and discussion

3.1. Time series analysis

Table 1 lists descriptive statistics of the hourly averages of O3, NOx, VOCs, and meteorological data at five air quality monitoring stations during April 29–May 12, 2009. In the study area, the hourly averages of O3, NOx, temperature, wind speed, and relative humidity in the observations are in the ranges of 12.5–107.8 [ppb], 8.0–36.3 [ppb], 19.9–32.4 [°C], 0.6–3.9 [m s⁻¹], and 35.3–83.6 [%], respectively. Moreover, wind speed varies periodically and is
Fig. 4. Temporal variation of O3 concentration [ppb] at the CJ, PT, MN, DL, and HK stations during April 29–May 12 (AVE: average O3 concentration in five observations).
Table 2. Rules of the regional model in the multiple observations (CJ: Chen-jen, PT: Ping-tung, MN: Mei-nung, DL: Da-liau, and HK: Hsiao-kang).

| Station | O3 Level 1 (a) | O3 Level 2 | O3 Level 3 |
|---|---|---|---|
| CJ | WS < 2.3 m s⁻¹ & Temp < 23 °C, or WS < 2.3 m s⁻¹ & Temp > 23 °C & Hr < 9:00 am | WS < 2.3 m s⁻¹ & Temp > 23 °C & Hr > 9:00 am, or WS > 2.3 m s⁻¹ & NOx < 12.5 ppb | WS > 2.3 m s⁻¹ & NOx > 12.5 ppb |
| PT | WS < 2.4 m s⁻¹ & Temp < 24.4 °C | WS < 2.4 m s⁻¹ & Temp > 24.4 °C | WS > 2.4 m s⁻¹ |
| MN | Hr < 9:00 am | Hr > 9:00 am & WS < 2.3 m s⁻¹ | Hr > 9:00 am & WS > 2.3 m s⁻¹ |
| DL | Temp < 24.9 °C | Temp > 24.9 °C & WS < 2.3 m s⁻¹ | Temp > 24.9 °C & WS > 2.3 m s⁻¹ |
| HK | Hr < 9:00 am, or Hr > 9:00 am & Temp < 28.4 °C & AH > 49.6 ppb | Hr > 9:00 am & Temp < 28.4 °C & AH < 49.6 ppb | Hr > 9:00 am & Temp > 28.4 °C |

(a) O3 values in the ranges of 0–33, 33–65, and over 65 [ppb h⁻¹] are defined as Levels 1, 2, and 3, respectively.
at a maximum during the day and at a minimum during the night because of the sea-land breeze. The average concentrations of AA, AE, and AH in the observations are in the ranges of 13.1–251.2 [ppb], 3.4–106.2 [ppb], and 10.4–164.8 [ppb], respectively (Fig. 3). On the episode day (May 11, 2009), an ozone peak was observed at 14:00 LST (Local Standard Time) (Fig. 4). Concurrently, the air temperature and wind speed were as high as 32.04 °C and 3.61 m s⁻¹. Comparing the ozone time series at the five stations, the highest ozone level was at the DL station, followed by the levels at the PT and MN stations (Fig. 4). However, the timing of peak O3 concentrations varied slightly among the stations (Fig. 4). The ozone peak in the coastal areas (HK and CJ stations) occurred earlier than that in the inland areas (PT and MN stations). Transport of air pollution resulted in the spatial heterogeneity of air quality (Lin et al., 2007).

3.2. Regional and local models

Tables 2 and 3 list the rules of the decision tree models (CART) for ozone concentration levels in the regional and local models (Figs. 5 and 6), respectively. The decision tree clearly represents the relationship between ozone levels and driving factors. For example, at the DL station, the ozone level belongs to Level 1 when temperature is below 24.9 °C and to Level 2 or 3 when temperature is above 24.9 °C in the regional model (Fig. 5(d)). According to this rule, temperature is the dominant driving factor influencing the ozone level at the station. Air pollution varies with space and time and is transported by advection, diffusion, and reaction (Huang and Hsu, 2004; Tong et al., 2006; Baik et al., 2007). The decision tree can capture the spatial heterogeneity of air quality (i.e., the ozone levels observed at PT and MN versus those at HK and CJ) associated with the representative driving factors. In the local model, the crucial factors are VOCs, NOx, and meteorological conditions at HK, CJ, and DL. However, the PT and MN stations have similar CART rules for meteorological conditions in the regional model (Table 2 and Fig. 5(b) and (c)). The rules at these two stations are dominated by hourly time, average temperature, and wind speed. The PT and MN stations are located in inland cities, and air pollution was transported there by the sea breeze during the daytime (Liu et al., 1994; Lin et al., 2007).

During winter and spring, the northeasterly wind prevails over Taiwan. Wind speeds are stronger on the windward side of coastal areas in the north and center of Taiwan. However, the northeasterly wind is weaker over southwest Taiwan as a result of the blocking effect of high mountains (Lin et al., 2005). Local circulation dominates the wind pattern over southwest Taiwan. Air pollutants can easily be trapped over the southwest coastal area after a sea breeze develops. Ozone and its precursors readily accumulate and are transported to inland areas by the sea breeze. Therefore, stations PT and MN have a similar rule associated with the sea breeze and meteorological conditions. Fig. 7 shows the wind field and ozone concentration in the area on 11 May 2009. A northerly wind prevailed at 09:00 LST, and the sea breeze had developed by 10:00 LST. By 14:00 LST, ozone concentrations had increased and the area of high ozone extended downwind toward the inland area. These processes may play crucial roles in ozone formation over inland rural areas during high ozone episodes (Lin et al., 2007).

Moreover, the decision trees imply a similar rule for the coastal stations (HK and CJ). The ozone level observed at the HK and CJ stations depends strongly on VOCs, NOx, and temperature. When the temperature is greater than 25.6 °C, the ozone is at Level 2 or 3. However, when the temperature is less than 25.6 °C, the ozone is at Level 1 or 2 (Table 3, Fig. 6(a) and (e)).
The rules detect various ozone levels for which temperature is an effective surrogate.
Table 3. Rules of the local model in the multiple observations (CJ: Chen-jen, PT: Ping-tung, MN: Mei-nung, DL: Da-liau, and HK: Hsiao-kang).

| Station | O3 Level 1 (a) | O3 Level 2 | O3 Level 3 |
|---|---|---|---|
| CJ | Temp < 25.6 °C & NOx > 17.7 ppb | 25.6 < Temp < 27.9 °C, or Temp < 25.6 °C & NOx < 17.7 ppb | Temp > 27.9 °C |
| PT | Temp < 25.0 °C | 25.0 < Temp < 29.0 °C, or Temp > 29.0 °C & Hr > 5:00 pm | Temp > 29.0 °C & Hr < 5:00 pm |
| MN | Hr < 9:00 am | Hr > 7:00 pm | Hr > 9:00 am & Hr < 7:00 pm |
| DL | Hr < 9:00 am, or Hr > 9:00 am & WS < 2.0 m s⁻¹ & NOx > 22.4 ppb | Hr > 9:00 am & WS < 2.0 m s⁻¹ & NOx < 22.4 ppb | Hr > 9:00 am & WS > 2.0 m s⁻¹ |
| HK | Temp < 25.8 °C, or Temp > 25.8 °C & WS < 2.9 m s⁻¹ & AH > 48.8 ppb | Temp > 25.8 °C & WS < 2.9 m s⁻¹ & AH < 48.8 ppb | Temp > 25.8 °C & WS > 2.9 m s⁻¹ |

NA: not available.
(a) O3 values in the ranges of 0–33, 33–65, and over 65 [ppb h⁻¹] are defined as Levels 1, 2, and 3, respectively.
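Each column of the rule table corresponds to a path through a tree, so the rules can be read as nested conditions. As an illustrative sketch, the HK rules of the local model (Table 3, Fig. 6(e)) may be written as follows. The function name `hk_local_level` is hypothetical, and the treatment of values exactly equal to a threshold follows the CART convention Xi < c described in Section 2.3, which the table itself does not specify.

```python
def hk_local_level(temp_c: float, ws_ms: float, ah_ppb: float) -> int:
    """Ozone level (1, 2, or 3) at HK according to the local-model rules.

    temp_c: local temperature [deg C]
    ws_ms:  local wind speed [m/s]
    ah_ppb: aromatic hydrocarbons [ppb]
    """
    if temp_c < 25.8:
        return 1          # cool conditions: Level 1 regardless of WS and AH
    if ws_ms >= 2.9:
        return 3          # warm and windy: Level 3
    # Warm with weak wind: the AH threshold separates Level 2 from Level 1.
    return 2 if ah_ppb < 48.8 else 1
```

For instance, a warm hour (30 °C) with weak wind (2.0 m s⁻¹) and AH of 30 ppb follows the path Temp ≥ 25.8, WS < 2.9, AH < 48.8 and ends in Level 2.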
Fig. 5. Regional model for ozone levels in the observations (a) CJ, (b) PT, (c) MN, (d) DL, (e) HK (Terminals 1, 2 and 3 represent Levels 1, 2 and 3 of hourly ozone concentrations between 0 and 33, 33–65, and over 65 [ppb h⁻¹]). The split rules of each tree are listed in Table 2.
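The way a single node of such a tree chooses its threshold can be sketched in a few lines. The example below assumes that the "dissimilarity" maximized by CART is the decrease in Gini impurity, which is the common default for classification trees but is not named in the text; the function names and the handling of values exactly at 33 or 65 ppb are likewise assumptions made for illustration.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple


def ozone_level(o3_ppb: float) -> int:
    """Map an hourly O3 value to Level 1, 2, or 3 (0-33, 33-65, over 65)."""
    if o3_ppb <= 33:   # assumption: exactly 33 counts as Level 1
        return 1
    if o3_ppb <= 65:   # "over 65" is Level 3, so exactly 65 stays in Level 2
        return 2
    return 3


def gini(labels: Sequence[int]) -> float:
    """Gini impurity of a set of class labels (0.0 for an empty or pure set)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(x: Sequence[float], labels: Sequence[int]) -> Optional[Tuple[float, float]]:
    """Return (c, impurity decrease) for the best rule x < c, or None if x is constant."""
    parent = gini(labels)
    n = len(labels)
    best = None
    values = sorted(set(x))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for low, high in zip(values, values[1:]):
        c = (low + high) / 2
        left = [lab for v, lab in zip(x, labels) if v < c]
        right = [lab for v, lab in zip(x, labels) if v >= c]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        decrease = parent - weighted
        if best is None or decrease > best[1]:
            best = (c, decrease)
    return best
```

A full tree would repeat this search over every explanatory variable at every node and recurse on the two resulting groups until the stopping conditions are met.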
Fig. 6. Local model for ozone levels in the observations (a) CJ, (b) PT, (c) MN, (d) DL, (e) HK (Terminals 1, 2 and 3 represent Levels 1, 2 and 3 of hourly ozone concentrations between 0 and 33, 33–65, and over 65 [ppb h⁻¹]). The split rules of each tree are listed in Table 3.
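The 10-fold cross validation used to estimate the test error of the trees can be outlined as below. This is a sketch under stated assumptions: the paper does not say how observations were assigned to folds, so interleaved folds are used here for simplicity, and `fit_majority` is only a hypothetical stand-in classifier so that the example is self-contained; in practice the fitted CART model would take its place.

```python
from collections import Counter
from typing import Callable, Sequence


def k_fold_error(
    xs: Sequence,
    ys: Sequence[int],
    fit: Callable[[list, list], Callable],
    k: int = 10,
) -> float:
    """Misclassification rate estimated by k-fold cross validation.

    fit(train_x, train_y) must return a predict(x) function.
    Assumption: fold f holds observations f, f + k, f + 2k, ...
    """
    n = len(xs)
    if n < k:
        raise ValueError("need at least k observations")
    errors = 0
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        predict = fit(train_x, train_y)
        # Count held-out observations whose predicted level is wrong.
        errors += sum(1 for i in test_idx if predict(xs[i]) != ys[i])
    return errors / n


def fit_majority(train_x: list, train_y: list) -> Callable:
    """Placeholder model: always predicts the most frequent training level."""
    majority = Counter(train_y).most_common(1)[0][0]
    return lambda x: majority
```

The classification accuracy is then one minus the returned error. For strongly autocorrelated hourly data, contiguous folds may give a more conservative estimate than the interleaved folds assumed here.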
3.3. Dominant driving factors for ozone levels

The decision tree model is readily interpretable and supports physical explanations of air pollution (Gardner and Dorling, 2000). Table 4 shows the union of dominant variables in the regional and local models. Results show that temperature is a common factor in both models. In the regional model, natural forces such as average wind speed and average temperature are the dominant factors. However, both natural and manmade sources (that is, VOCs) are dominant in the local model (in Table 4, average wind speed and temperature are natural factors, whereas most sources of alkanes, alkenes, and aromatic hydrocarbons are manmade). Pollutant transport by wind depends on wind speed and direction. In the local model, local wind speed varies among stations, and linking the effect of local wind speed to the ozone level is difficult. However, the average wind speed in the regional model aggregates the local wind speeds and shows the influence of wind over the study area. The ozone level is highly related to hourly time, temperature, and average wind speed at the PT and MN stations, but it depends strongly on temperature, NOx, and VOCs (for example, AH) at the HK and CJ stations (Table 4). Following the onshore flows, sea breezes can carry ozone and its precursors to the PT and MN stations in inland rural areas. A high temperature indicates the possible importance of photochemistry. Photochemical conditions favoring ozone production, such as increasing temperature and light, also enhance biogenic emission rates (Geddes et al., 2009). Results show that the ozone levels at the HK and CJ stations are highly related to VOCs and NOx, which are responsible for the temporal changes of ozone in these areas (Shiu et al., 2007). Industrial plants and commercial traffic activities in the coastal area of Kaohsiung City have caused this area to have
Fig. 7. The monitored wind field and ozone concentration over the observed stations at (a) 09:00, (b) 10:00, (c) 12:00, and (d) 14:00 LST on 11 May 2009. The ozone concentration [ppb] is represented by the color scale.
Table 4. The sets of dominant variables in the multiple observations (CJ: Chen-jen, PT: Ping-tung, MN: Mei-nung, DL: Da-liau, and HK: Hsiao-kang).

| Model type | CJ | PT | MN | DL | HK |
|---|---|---|---|---|---|
| Regional model | Temp, Hr, WS, NOx | Temp, WS | Hr, WS | Temp, WS | Temp, Hr, AH |
| Local model | Temp, NOx | Temp, Hr | Hr | Temp, NOx, Hr | Temp, WS, AH |
| Union of two models | Temp, Hr, WS, NOx | Temp, Hr, WS | Hr, WS | Temp, Hr, WS, NOx | Temp, Hr, WS, AH |
Table 5. Misclassification errors in 10-fold cross validation in the models (CJ: Chen-jen, PT: Ping-tung, MN: Mei-nung, DL: Da-liau, and HK: Hsiao-kang).

| Model type | CJ | PT | MN | DL | HK | Average error |
|---|---|---|---|---|---|---|
| Regional model | 25% | 26% | 19% | 18% | 24% | 22% |
| Local model | 26% | 28% | 24% | 17% | 22% | 23% |
poor air quality. The ozone levels at the HK and CJ stations are strongly affected by industrial sources and traffic emissions. However, the ozone concentration cycles are associated with the sea-land breeze in this study area. The primary pollutants emitted by anthropogenic activities (such as industry and traffic) concentrated near the coast are transported inland (to PT and MN) by the sea breeze and are subjected to photochemical transformations. The primary pollutants are subsequently lifted thermally and transported back toward the sea by the return flows (Andronopoulos et al., 2000). Our results agree with previous studies (National Research Council (NRC), 1991; Lin et al., 2004) showing that dominant factors including precursor emissions, atmospheric chemistry, and meteorological conditions influence the ground-level ozone concentration. The complex process of ozone formation is related to VOC and NOx emissions, accompanied by meteorological conditions (Chou et al., 2006).

Table 5 shows the misclassification errors of these models. The average ozone level classification accuracy reached 78% and 77% in the regional and local models, respectively, based on a 10-fold cross validation. The validation results illustrate that the decision tree provides an effective fit between driving factors and the ozone level for this case study. Therefore, the decision tree can be considered an appropriate alternative management model when ozone concentrations vary in space and time and exhibit nonlinear relationships with driving factors. The decision tree methodology assesses the relationship between ozone levels and driving factors and distinguishes the effects of driving factors on ozone levels for pollution control. When developing the model and a control strategy, it is desirable to assess how controls work under a comprehensive set of meteorological conditions leading to high pollutant levels (Kuebler et al., 2002).

4. Conclusion

The complex process of ozone formation is related to emissions of precursor compounds (volatile organic compounds (VOCs) and nitrogen oxides (NOx)), accompanied by meteorological conditions. This study used a decision tree to map the temporal characteristics of ozone data in Taiwan during an ozone episode in 2009. The decision tree identifies the spatial variability and heterogeneity of ozone concentration in various observations. The derived rules exhibit the spatial heterogeneity of air quality in the Kao-Ping area of Taiwan. Results indicate two distinct types of transport formulations for the inland area (for example, the PT and MN stations) and the coastal area (for example, the HK and CJ stations). Ozone levels are highly related to hourly time, temperature and average wind speed
in the inland area and depend strongly on meteorological conditions and the precursor compounds in the coastal area. This study offers an alternative approach to exploring the temporal patterns of air pollution and provides urban managers with information to understand the ozone distribution more clearly when engaging in environmental planning and health policy design. Future studies will consider more data instances and features, such as radiation data, for modeling an ozone-level decision tree.

Acknowledgments

The authors would like to thank the editors, anonymous reviewers, and helpers. We are grateful to the Taiwan Environmental Protection Administration (TWEPA) for providing the monitoring data used in this study.

References

Andronopoulos, S., Passamichali, A., Gounaris, N., Bartzis, J.G., 2000. Evolution and transport of pollutants over a Mediterranean coastal area: the influence of biogenic volatile organic compound emissions on ozone concentrations. Journal of Applied Meteorology 39 (4), 526–545.
Atkins, J.P., Burdon, D., Allen, J.H., 2007. An application of contingent valuation and decision tree analysis to water quality improvements. Marine Pollution Bulletin 55 (10–12), 591–602.
Baik, J.J., Kang, Y.S., Kim, J.J., 2007. Modeling reactive pollutant dispersion in an urban street canyon. Atmospheric Environment 41 (5), 934–949.
Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J., 1984. Classification and Regression Trees. Wadsworth International Group, Belmont, California, 133–189.
Bruno, F., Cocchi, D., Trivisano, C., 2004. Forecasting daily high ozone concentrations by classification trees. Environmetrics 15 (2), 141–153.
Chang, C.C., Chen, T.Y., Lin, C.Y., Yuan, C.S., Liu, S.C., 2005. Effects of reactive hydrocarbons on ozone formation in southern Taiwan. Atmospheric Environment 39, 2867–2878.
Chen, H.W., Tsai, C.T., She, C.W., Lin, Y.C., Chiang, C.F., 2010.
Exploring the background features of acidic and basic air pollutants around an industrial complex using data mining approach. Chemosphere 81 (10), 1358–1367.
Chou, C.C.K., Liu, S.C., Lin, C.Y., Shiu, C.J., Chang, K.H., 2006. The trend of surface ozone in Taipei, Taiwan, and its causes: implications for ozone control strategies. Atmospheric Environment 40 (21), 3898–3908.
Eisenberg, J.N.S., McKone, T.E., 1998. Decision tree method for the classification of chemical pollutants: incorporation of across-chemical variability and within-chemical uncertainty. Environmental Science & Technology 32 (21), 3396–3404.
Gardner, M.W., Dorling, S.R., 2000. Statistical surface ozone models: an improved methodology to account for non-linear behaviour. Atmospheric Environment 34 (1), 21–34.
Geddes, J.A., Murphy, J.G., Wang, D.K., 2009. Long term changes in nitrogen oxides and volatile organic compounds in Toronto and the challenges facing local ozone control. Atmospheric Environment 43 (21), 3407–3415.
Hu, W.B., Mengersen, K., McMichael, A., Tong, S.L., 2008. Temperature, air pollution and total mortality during summers in Sydney, 1994–2004. International Journal of Biometeorology 52 (7), 689–696.
Huang, H.C., Hsu, N.J., 2004. Modeling transport effects on ground-level ozone using a non-stationary space-time model. Environmetrics 15 (3), 251–268.
Jirava, P., Krupka, J., Kasparova, M., 2010. Application of rough sets theory in air quality assessment. In: RSKT’10 Proceedings of the 5th International Conference on Rough Set and Knowledge Technology.
Kuebler, J., Russell, A.G., Hakami, A., Clappier, A., van den Bergh, H., 2002. Episode selection for ozone modelling and control strategies analysis on the Swiss Plateau. Atmospheric Environment 36 (17), 2817–2830.
Kurt, A., Oktay, A.B., 2010. Forecasting air pollutant indicator levels with geographic models 3 days in advance using neural networks. Expert Systems with Applications 37 (12), 7986–7992.
Liu, C.M., Huang, C.Y., Shieh, S.L., Wu, C.C., 1994. Important meteorological parameters for ozone episodes experienced in the Taipei basin. Atmospheric Environment 28 (1), 159–173.
Lin, C.H., Wu, Y.L., Lai, C.H., Lin, P.H., Lai, H.C., Lin, P.L., 2004. Experimental investigation of ozone accumulation overnight during a wintertime ozone episode in south Taiwan. Atmospheric Environment 38 (26), 4267–4278.
Lin, C.Y., Liu, S.C., Chou, C.C.K., Huang, S.J., Liu, C.M., Kuo, C.H., Young, C.Y., 2005. Long-range transport of aerosols and their impact on the air quality of Taiwan. Atmospheric Environment 39, 6066–6076.
Lin, C.Y., Wang, Z., Chou, C.C.K., Chang, C.C., Liu, S.C., 2007. A numerical study of an autumn high ozone episode over southwestern Taiwan. Atmospheric Environment 41 (17), 3684–3701.
Mudway, I.S., Kelly, F.J., 2000. Ozone and the lung: a sensitive issue. Molecular Aspects of Medicine 21, 1–48.
Nerini, D., Durbec, J.P., Mante, C., 2000. Analysis of oxygen rate time series in a strongly polluted lagoon using a regression tree method. Ecological Modelling 133 (1–2), 95–105.
National Research Council (NRC), 1991. Rethinking the Ozone Problem in Urban and Regional Air Pollution. National Academy Press, Washington, District of Columbia.
Shiu, C.J., Liu, S.C., Chang, C.C., et al., 2007. Photochemical production of ozone and control strategy for Southern Taiwan. Atmospheric Environment 41 (40), 9324–9340.
Sillman, S., 2003. In: Holland, H.D. (Ed.), Tropospheric Ozone and Photochemical Smog, pp. 407–432.
Slini, T., Kaprara, A., Karatzas, K., Moussiopoulos, N., 2006. PM10 forecasting for Thessaloniki, Greece. Environmental Modelling & Software 21 (4), 559–565.
Speybroeck, N., Berkvens, D., Mfoukou-Ntsakala, A., Aerts, M., Hens, N., Van Huylenbroeck, G., Thys, E., 2004. Classification trees versus multinomial models in the analysis of urban farming systems in Central Africa. Agricultural Systems 80 (2), 133–149.
Tong, D.Q., Muller, N.Z., Mauzerall, D.L., Mendelsohn, R.O., 2006. Integrated assessment of the spatial variability of ozone impacts from emissions of nitrogen oxides. Environmental Science & Technology 40 (5), 1395–1400.
Tzima, F.A., Mitkas, P.A., Voukantsis, D., Karatzas, K., 2011. Sparse episode identification in environmental datasets: the case of air quality assessment. Expert Systems with Applications 38 (5), 5019–5027.