Parameterization of secondary organic aerosol mass fractions from smog chamber data

Parameterization of secondary organic aerosol mass fractions from smog chamber data

ARTICLE IN PRESS Atmospheric Environment 42 (2008) 2276–2299 www.elsevier.com/locate/atmosenv Parameterization of secondary organic aerosol mass fra...

2MB Sizes 0 Downloads 21 Views

ARTICLE IN PRESS

Atmospheric Environment 42 (2008) 2276–2299 www.elsevier.com/locate/atmosenv

Parameterization of secondary organic aerosol mass fractions from smog chamber data Charles O. Staniera,, Neil Donahueb, Spyros N. Pandisb,c a

Department of Chemical and Biochemical Engineering, University of Iowa, 4122 Seamans Center, Iowa City, IA 52242, USA b Center for Atmospheric Particle Studies, Carnegie Mellon University, Pittsburgh, PA 15213, USA c Department of Chemical Engineering, University of Patras, Patras 26504, Greece Received 27 June 2007; received in revised form 11 December 2007; accepted 11 December 2007

Abstract A framework is presented and evaluated for the parameterization of smog chamber results for use in atmospheric chemical transport models (CTMs). The parameterization uses an absorptive partitioning model to describe formation of secondary organic aerosol (SOA). The key points of the framework are (1) the ability to fit results from several types of chamber experiments; (2) the use of a basis set of surrogate compounds characterized by fixed effective saturation concentrations instead of the more commonly used variable saturation concentrations; (3) calculation of uncertainties of the estimated SOA aerosol mass fractions (AMF) outside of the fitted experimental range; and (4) determination of the best effective enthalpy to reproduced observed temperature sensitivity. The features of this data analysis and fitting framework are demonstrated using simulated data, and actual measurements from a-pinene ozonolysis experiments. Representation of SOA formation using as many as 8 surrogate compounds with fixed effective saturation concentrations is shown to be feasible and has advantages over simpler parameterizations. r 2007 Elsevier Ltd. All rights reserved. Keywords: Secondary organic aerosol; Modeling; Parameter estimation; a-pinene

1. Introduction Driven by the need to simulate secondary organic aerosol (SOA) formation for chemical transport models (CTMs), the results of ‘‘smog’’ chamber studies of SOA formation have been extensively parameterized. Early smog chamber experiments were translated into aerosol mass fractions (AMF) (yields) for models by assuming a constant aerosol Corresponding author. Tel.: +1 319 335 1399; fax: +1 319 335 1415. E-mail address: [email protected] (C.O. Stanier).

1352-2310/$ - see front matter r 2007 Elsevier Ltd. All rights reserved. doi:10.1016/j.atmosenv.2007.12.042

mass fraction for each precursor (Hatakeyama et al., 1989) or assuming independent condensation of each SOA component when its concentration exceeded its saturation value (Pandis et al., 1992). With additional chamber experiments and theoretical investigation, it was demonstrated that SOA AMF of the various organic precursors are not constant, but depended on aerosol concentration in a manner consistent with equilibrium gas–particle partitioning (Odum et al., 1996; Pankow, 1994a, b). With the completion of more chamber studies and the wider acceptance of absorptive partitioning theory for organic aerosol formation, a database of

ARTICLE IN PRESS C.O. Stanier et al. / Atmospheric Environment 42 (2008) 2276–2299

empirical concentration-dependent AMF was developed, and implemented in CTMs (Griffin et al., 1999; Lurmann et al., 1997; Odum et al., 1997; Strader et al., 1999). In other words, measured AMF are used to generate parameters related to formation and thermodynamic partitioning of semivolatile SOA components. These parameters are used in the CTMs to describe what happens when an SOA precursor (e.g. a-pinene) is oxidized. The parameterization quantifies the yield of SOA, and (if semi-volatile SOA is treated) the partitioning of the semi-volatiles between the gas and organic aerosol phases. Many experimental and modeling SOA studies have been reviewed (Kanakidou et al., 2005; Seinfeld and Pankow, 2003). Prior to about 2004, the main type of chamber SOA formation experiment gave 1 aerosol mass fraction (aerosol formed/hydrocarbon reacted) after several hours of oxidation and aerosol formation. These ‘‘final aerosol mass fraction’’ experiments were limited in time resolution, usually by the time required to measure the precursor concentration by GC (aerosol concentration could be monitored at minute time scales using scanning mobility particle sizers). While real-time monitoring of VOC levels during SOA formation was available using long path FTIR, this required high concentrations of reactants. Now that proton transfer reaction mass spectrometry (PTRMS) is in use in several chambers, dynamic concentrations can be measured generating dozens of data points for each chamber experiment (Lee et al., 2006; Ng et al., 2006; Presto et al., 2005b). The applicability of dynamic AMF can be evaluated by comparison with fixed AMF (Pathak et al., 2007b). Other experiments or sequences of experiments have been conducted to establish the relationship between temperature and SOA formation. These include series of experiments with similar hydrocarbon concentrations, but different temperatures (Pathak et al., 2007b; Takekawa et al., 2003) as well as formation at a fixed temperature followed by heating or cooling of the chamber inside a temperature-controlled room or enclosure (Stanier et al., 2007). This work attempts to synthesize all of these types of data to inform the SOA formation parameterization. Smog chamber experiments done at 1 temperature give no information about the temperature dependence of SOA AMF. Product yields may vary as a function of temperature and the vapor pressures of individual products will certainly vary

2277

with temperature. The former case (temperaturedependent product distributions) is likely, but not quantified at this time. In the latter case, lower temperatures should favor higher AMFs. Temperature dependence is typically included in CTMs by calculating temperature-dependent saturation concentrations (or partitioning coefficients) according to the Clausius–Clapeyron equation (Pankow, 1994a, b), using enthalpy of evaporation DH for representative semi-volatile species. Typical values fall between 46 and 92 kJ/mol for carboxylic acid products of a-pinene (Bilde and Pandis, 2001; Hallquist et al., 1999; Kamens et al., 1999). This leads to a strong link between temperature dependence of SOA concentrations and the DH values used. Measurements of the temperature dependence of SOA AMFs are now becoming available. One goal of this work is to present a method for including the temperature dependence results into parameter fitting, and thus reduce CTM uncertainty due to lack of knowledge of the appropriate DH (Pun et al., 2003; Tsigaridis and Kanakidou, 2003). Two other methods for the determination of partitioning coefficients and AMFs should be noted. The first is the direct measurement of gas and particle-phase concentrations through filter/ denuder GC/MS techniques (Kamens and Jaoui, 2001; Yu et al., 1999). The other is a combined kinetic and thermodynamic approach where the time series of aerosol mass and individual products is used to constrain a combination of rate constants, stoichiometric yields, and gas–particle partitioning parameters (Kamens and Jaoui, 2001). To date, these methods have been useful at elucidating gasphase kinetic mechanisms and product distributions, but have not been used to inform CTM modeling of gas–aerosol partitioning. As emphasized in recent overview articles on SOA formation, representation of the full complexity of SOA formation and aging will require a combination of thermodynamic approaches (as emphasized in this work) as well as kinetic descriptions of SOA processes (Donahue et al., 2006; Kroll et al., 2007). Fig. 1 illustrates many of the concepts to be explored in this paper. The figure was created by simulating the oxidation of a VOC producing 2 semi-volatile compounds, with temperature independent stoichiometric yields. The absorptive partitioning equations were solved to determine the fraction of product in the gas and aerosol phases. The aerosol phase portion contributes to the aerosol

ARTICLE IN PRESS 2278

C.O. Stanier et al. / Atmospheric Environment 42 (2008) 2276–2299

Fig. 1. Conceptual diagram showing the AMF surface for a hypothetical set of semi-volatile oxidation products. Low temperatures and high existing aerosol loadings favor condensed phase partitioning (high AMF). Lines labeled DROG show the expected temperature dependent AMF and aerosol concentrations for oxidation of fixed amounts of precursors under conditions of no preexisting aerosol. The 1 and 10 ppb line terminate (AMF goes to zero) at 6 and 27 1C; product concentrations do not exceed saturation above these temperatures. Rectangles drawn over the surface show that experiments and the atmosphere can be considered with respect to their ranges of aerosol concentrations and temperatures.

mass fraction shown as the z-axis. The goal of this work is to determine how to parameterize the AMF surface from smog chamber experimental data, including traditional final AMF, fixed-temperature experiments, variable temperature experiments, and dynamic chamber results. Also noted in Fig. 1 is the possibility that experimental data is not available across the atmospherically relevant concentration and temperature range. Uncertainty bounds are critical for meaningful extension of the experimental results to CTM applications. The uncertainty bounds need to be valid for interpolation within the experimental data, and as much as possible, for extrapolation outside of the experimental regime. The goal of this work is to refine the estimation of parameters from chamber experiments using absorption partitioning theory. In Section 2.1, absorptive partitioning is reviewed. In Section 2.2, the parameterization algorithm is presented. Section 2.3 reviews the types of chamber data that are available for fitting. Section 2.4 defines several terms needed for discussion of the fitting and describes 4 sets of pseudodata used in testing the fitting algorithm. Sections 3.1–3.4 describe the application of the

fitting procedure to the pseudodata sets, while Section 3.5 applies it to a combined a-pinene ozonolysis dataset including multiple chambers and multiple authors. This work does not consider the parameterization of aging, heterogeneous reactions, or slow oligomerization. It is limited to ‘‘prompt’’ aerosol mass formation from chamber experiments, which capture chemistry and partitioning occurring over approximately a 10–300 min timescale. The choice to leave these more recently discovered processes out of this work is not due to a belief that they are negligible or that they do not need to be eventually incorporated into CTMs. However, the choice is justifiable on at least 2 grounds: (1) a sound procedure for parameterization of prompt SOA formation and its uncertainty is a necessary building block for more advanced schemes; and (2) widely used CTMs (e.g. CMAQ, CAMx, and GEOSCHEM) share the absorptive partitioning framework, and thus can be modified without changing their basic structure for increased accuracy and for prediction of uncertainties on SOA formation due to limitations in the underlying chamber data.

ARTICLE IN PRESS C.O. Stanier et al. / Atmospheric Environment 42 (2008) 2276–2299

is (Donahue et al., 2006): X X ctot;i ctot;i xi ¼ . cOA ¼ 1 þ ðci =cOA Þ

2. Methods 2.1. Theoretical basis for semi-empirical fitting equations Absorptive SOA partitioning is used throughout this work, with the following nomenclature. The effective saturation concentration ci* is defined through the expression: yi ci  ci;gas ,

(1)

where ci,gas is the gas-phase mass concentration of species i and yi is its mass fraction in the absorbing aerosol phase (yi ¼ ci,aer/cOA). The variables ci,aer and cOA are the aerosol mass concentration of species i, and the total absorbing organic aerosol concentration, respectively. This is equivalent to relating effective saturation concentration to pure component vapor pressure using: M OA gi poL;i ci ¼ gi coi M OA , ¼ Mi RT

(2)

where Mi is the molecular weight of species i, MOA is the average molecular weight of the absorbing aerosol phase, gi is the activity coefficient, coi is the saturation concentration and poL;i is the subcooled liquid vapor pressure of i. Eq. (1) avoids the need for estimating the ratios of product to precursor molecular weights, and the ratios of molecular weights between individual semi-volatile species and the average organic aerosol. The activity coefficient is included in the effective saturation concentration, and thus the mixture is treated as pseudoideal. It should be noted that for atmospherically relevant semi-volatile compounds, the activity coefficient and partitioning behavior is influenced by the absorbing organic material (Liang et al., 1997). Thus parameters determined by the method in this work will be less accurate when the absorbing organic matter is different from that used in the experiments used to generate the dataset. This limitation needs to be addressed in future work. The mass fraction of species i partitioned to the aerosol phase, xi, for a single semi-volatile compound in absorption partitioning is xi 

caer;i 1 , ¼ ctot;i 1 þ ðci =cOA Þ

2279

(3)

where ctot,i is the total concentration (gas+aerosol) of species i. Therefore, for a mixture of semi-volatile compounds, the total organic aerosol concentration

(4)

In Eqs. (1)–(4), we have not made any distinction as to the source of organic material (either ctot,i or cOA)—it can be either preexisting in the atmosphere, or it can be generated from reactions or from combustion emissions. Smog chambers are a special case, where typically all the organic mass (for all species) is generated from oxidation of the parent reactive organic gas (ROG). Some chamber experiments are conducted with preexisting semi-volatiles in the gas or aerosol phase, but they are the minority and would require minor changes to the fitting equations and parameterization algorithm. In cases where ctot,i is solely the result of a reaction of a precursor gas: ctot;i ¼ ai;molar DROG M i =M ROG ¼ ai;mass DROG: (5) In this work, all ai values are mass-based yields. Expressing Eqs. (3)–(5) for a smog chamber experiment with no preexisting organic aerosol (such that DcOA ¼ cOA): X DcOA ai x ¼ , (6) DROG 1 þ ðci =cOA Þ where x is the overall aerosol mass fraction. Eq. (6) suggests that the aerosol mass fraction can exceed unity, owing to increase in molecular weight upon reaction. The partitioning Eqs. (1)–(4) are applicable to all atmospheric absorptive partitioning calculations, while the smog chamber Eqs. (5)–(6) should be reserved for cases where the assumption of zero preexisting absorbing aerosol is met. The term aerosol mass fraction (x) is used throughout instead of the term yield (yieldDcOA/ DROG) common in SOA literature. This is done to reserve the term yield for product yields (ai) and avoid confusion between product yields ai and aerosol mass fraction (AMF or x). For comparison to previous works, the substitution yieldDcOA/ DROG ¼ x can usually be made. As pointed out by Presto and Donahue (2006) there is uncertainty on the density of organic aerosols. Aerosol mass concentration is often measured by combining the measured aerosol number distribution by a scanning mobility particle sizer with the assumption of sphericity and an assumed density (often unity). Thus, application of Eq. (6) to both atmospheric prediction and to experimental data reduction

ARTICLE IN PRESS C.O. Stanier et al. / Atmospheric Environment 42 (2008) 2276–2299

2280

requires careful treatment of density. The relevant density correction equations are in Appendix A. All of the experimental data used in this work was reported using assumed aerosol density of 1 g cm3. The organic fraction of atmospheric aerosols is typically assumed to have a density of around 1.2 g cm3 (Turpin and Lim, 2001). Measurement of aerosol density is increasingly available in recent field and laboratory studies of SOA using a combination of SMPS and aerosol mass spectrometer measurements (Cross et al., 2007; DeCarlo et al., 2004). The partition coefficient Kom,i (units mg1 m3) can be used interchangeably with the effective saturation concentration, with the equation for interconversion as K om;i 

caer;i 1 ¼ . cgas;i cOA ci

(7)

A typical procedure (Cocker et al., 2001; Griffin et al., 1999; Odum et al., 1996) for data reduction of chamber experiments is to use Eq. (6) to regress smog chamber data (pairs of cOA and x) to fit either 2 parameters (e.g., a single ai and saturation concentration ci ) or 4 parameters (e.g., a pair of ai values with corresponding saturation concentrations). When AMF are needed at different temperatures, the Clausius–Clapeyron equation is used to adjust

the effective saturation concentrations    T ref DH i 1 1 exp  , ci ðTÞ ¼ ci ðT ref Þ T R T ref T

(8)

where Tref is a reference temperature where a reference effective saturation is defined, and DHi is the enthalpy of evaporation. There is considerable uncertainty regarding the value of DH to use in this equation (Bian and Bowman, 2002; Stanier et al., 2007; Strader et al., 1999; Tsigaridis and Kanakidou, 2003). One approach (which usually gives too strong of a temperature response) is to use the few known DH values of identified SOA components. 2.2. Parameterization algorithm The parameterization strategy is to determine values of ci , ai, and DHi so that AMF predicted by Eqs. (6) and (8) match experimental values as closely as possible. This is done by nonlinear leastsquares regression. The actual regression is straightforward, but there are several challenging issues related to the parameterization: (1) selecting the number of basis set compounds (n) to use, (2) deciding whether to fix the c* values or attempt to optimize them in the regression, (3) estimating the temperature sensitivity, and (4) estimating confidence intervals (CI) across the range of cOA and temperature values anticipated in the CTM

Fig. 2. Flowchart of parameter fitting algorithm.

ARTICLE IN PRESS C.O. Stanier et al. / Atmospheric Environment 42 (2008) 2276–2299

application. An overview of the procedure is shown in Fig. 2. The fitting procedure does not consider variation in product yields (ai) due to reaction temperature, or other factors such as NOx/VOC ratio (Ng et al., 2007; Nojgaard et al., 2006; Presto et al., 2005b; Song et al., 2005), relative humidity, or UV light intensity (Presto et al., 2005a). However, different fits can be done at different NOx/VOC ratios (Pathak et al., 2007a). In this work, we have taken the approach of using fixed c* values only, rather than trying to optimize them simultaneously with yields ai. The reasons for this are (1) covariance between yields and c* values makes the regression difficult, (2) the uncertainty method relies on fixed c* values, and (3) the use of fixed c* ‘‘volatility basis vectors’’ may lead to reduction in the number of semi-volatile surrogate species necessary in a CTM, as the products of different precursors can be assigned to the same c* bins. 2.2.1. Selection of extreme values of c* Equations for selecting extreme values of c* are derived in Appendix B. They depend on the accuracy goals, organic aerosol concentrations, and temperatures of the expected CTM application. For c* values referenced to 298 K (25 1C) the equations are:    DH 1 1  cmin;298 K  d exp  , (9) R T max 298    5 DH 1 1   , cmax;298 K  cOA;max exp f R T min 298 (10) where d is the allowable error (mg m3) in the semivolatile partitioning calculation under clean conditions, Tmin and Tmax are the temperature limits of CTM modeling application, and f is the maximum fractional error in the parameterization at high

2281

concentration (e.g. 0.1 for 710% allowable organic aerosol concentration). Some values from Eqs. (9) and (10) are given in Table 1 for DH of 100 kJ mol1. For the a-pinene cases examined in Section 3, it turns out that this choice is overly conservative, and a value of 33 kJ mol1 will suffice. 2.2.2. Selection of number of c* values A priori specification of the number of c* values (including c*min and c*max) required for a suitable fit and for confidence interval characterization is difficult. A practical approach is to space the c* values on a lognormal basis (e.g. 0.01, 0.1, 1, y 103, 104, 105 mg m3) and thus the number of elements in the basis vector is determined by the spacing between c*min and c*max. The regression calculation is quick and can be repeated easily with a different basis, so trial and error selection of the c* basis set is feasible. From doing the fits for this work, we recommend:   n  a log10 cmax =cmin þ 1, (11) where a is from 0.5 to 1.0. There exist cases where the experimental data can be fit with a small basis set (e.g. n ¼ 1 or 2). This will be true especially when the data have a small dynamic range in terms of cOA values. When Eq. (11) gives a large recommended basis set, but 1 or 2 well selected c* values are sufficient to fit the experimental data, the additional parameters are still necessary for calculating uncertainty when the parameterization is extrapolated outside the range of the underlying experimental data. Although the current work is limited to fitting AMF (and not a full mass balance for the SOA formation chemical reaction) the use of a suitably large basis set may have further utility in tracking the mass balance, secondary reactions, aging, and atmospheric fate of biogenic and anthropogenic

Table 1 Minimum and maximum c* values for some scenarios Scenario

General Remote with high altitude Polluted urban Proposed decadal range

d (mg m3)

f

TminTmax (1C)

c*min (25 1C) (mg m3)

c*max (25 1C) (mg m3)

Recommended n

60 5

0.1 0.05

0.05 0.05

0–40 30 to 40

1  103 7  104

240,000 4.5  106

5–10 6–11

120 25

0.5 0.07

0.05 0.05

0–40 0–40

0.07 0.01

480,000 100,000

4–8 4–8

cOA,max (mg m3)

ARTICLE IN PRESS C.O. Stanier et al. / Atmospheric Environment 42 (2008) 2276–2299

2282

ROG emissions (Donahue et al., 2006; Robinson et al., 2007). 2.2.3. Regression The objective function minimized during regression is X h  i 2 OF ¼ x  x^ ai ; DH; ci ðT ref Þ; T ref ; cOA ; T , i¼1...m

(12) where x are the measured AMF (m data points) and x^ are the modeled AMF calculated using Eqs. (6) and (8). The modeled AMF are calculated using fixed parameters (Tref), and experimental variables cOA and T. The adjustable parameters are ai, a single DH (instead of n DHi values). Fixed saturation concentrations ci* are used in this work, but they can also be adjustable parameters if n is small. The fact that the AMF is linear in the adjustable parameters ai (Eq. (6)) greatly speeds up the calculations. At any step in the regression (and for any value of DH), the best set of ai can be computed by solving the following equation subject to yield non-negativity constraints aiX0: 2 1  1  1 3 c1 c2 cn 1 þ 1 þ    1 þ cOA;1 cOA;1 cOA;1 7 6 7 6 7 6 .. .. 7 6 . . 7 6 7 6   1 1 1 5 4    c1 c2 cn 1 þ cOA;m 1 þ cOA;m ... 1 þ cOA;m 2 3 2 3 x1 a1 6 7 6 7 6 a2 7 6 x2 7 6 7 6 7 6 7 6 7 6 .. 7 6 .. 7 6 . 7 ¼ 6 . 7, (13) 6 7 6 7 6 . 7 6 . 7 6 . 7 6 . 7 6 . 7 6 . 7 4 5 4 5 an xm where cOA,i and xi are the organic aerosol concentration and measured AMF corresponding to the ith experimental data point. Therefore, nonlinear minimization is performed for only DH. In this work, the Levenberg–Marquardt method implemented in MATLAB’s lsqnonlin function (MathWorks, 2002) is used for the nonlinear minimization and the MATLAB function lsqnonneg is used for the nonnegative least-squares problem. The regression calculation is not sensitive to initial guesses and takes a few seconds for fixed ci* values, with n ¼ 8 and hundreds of datapoints. The goodness-of-fit

Table 2 Error formulas used for goodness of fit Name and symbol

Formula

Mean squared error (errmse) vs. data

2 1 X xi  x^ i m

Average bias vs. data (errmean)

1 X xi  x^ m

Mean absolute fractional error vs. data (errfrac)

1 X



xi  x^ =xi m

will be characterized using the metrics shown in Table 2. Initial CI are calculated using the asymptotic confidence interval method (Seber and Wild, 2003) as implemented in MATLAB’s nlpredci function. The uncertainty bounds are calculated without using the constraints that there are physical upper and lower limits to AMF. Therefore, far enough from the data, the inherent multicollinearity in the basis set manifests itself as unreasonably high and low values of the CI. These are truncated using the logic that AMF cannot decrease with increasing cOA (at constant temperature), that AMF cannot increase with decreasing cOA (at constant temperature), and that AMFs have both lower (zero) and upper (1.5) limits. This approach often leads to wide nonphysical CI when extrapolating outside the experimental range. Therefore, a second method of calculating CI is employed: repeated random (Monte Carlo) sampling of selected ai values. The basic idea is that the experimental data may poorly constrain certain yields associated with low volatility (low c*) and high volatility (high c*) elements in the basis set. Therefore, the yields for these compounds are selected randomly while the remaining ai’s are fit to the experimental data. The algorithm for the Monte-Carlo CI is included in Appendix C. Random sampling is done in large (1000) simultaneous trials with either 1, 2 or 3 random ai’s. Random sampling is stopped when change in the CI is smaller than a preset tolerance. This is the slowest part of the overall parameterization code. One thousand trials require 1 s using MATLAB’s lsqnonneg routine on 2.1 MHz PC with 2 GB RAM (this is while estimating CI with 300 points in the cOA vs. T space). The trials need to be repeated from 2 to 20 times for each value of DH explored, so exploration of the CI using 10 values of

ARTICLE IN PRESS C.O. Stanier et al. / Atmospheric Environment 42 (2008) 2276–2299

2283

Table 3 List of data sources for a-pinene SOA AMF Study

Experiments used for fitting in this work

a-pinene concentration range (ppb)

Temperature range (1C)

Griffin et al. (1999) Cocker et al. (2001)

Outdoor (dark) chamber reaction with O3 in excess, OH scavenger, inorganic seed, and dry conditions Indoor chamber reaction with O3 in excess or limiting, OH scavenger, RHo20%; some seeded experiments and some unseeded Indoor chamber reaction with O3 in excess, OH scavenger. Seeded and unseeded experiments at many temperature/ concentration combinations Indoor chamber reaction with O3 in excess, low NOx. Experiments 6/14/05 and 6/28/05

15–65

32–37

6

23–163

28–30

15

4–50

0–40

31

1.5–138

22

Pathak et al. (2007a) Presto and Donahue (2006)

DH can take 1–2 minutes. The combination of these techniques for different cases will be discussed in subsequent sections. 2.3. Description of data used in fitting Data used in this work to parameterize SOA AMF fall into 3 categories. The first category is the traditional fixed-temperature smog chamber AMF experiment. These types of experiments are referred to as static, final AMF, or traditional smog chamber experiments. A second category of data available for fitting are from chamber experiments, where VOC concentration can be monitored rapidly using PTRMS and aerosol concentrations are measured at minute time scale by SMPS (Ng et al., 2006; Presto and Donahue, 2006; Presto et al., 2005b). Therefore, a time series of cOA, DROG, AMF, and temperature is generated. The dynamic AMFs can be used for fitting in cases, such as a-pinene ozonolysis, where there is a single rate-limiting step to aerosol formation. A third category of data available for fitting are from chamber experiments that include a change in temperature following aerosol formation. This is done for the purpose of assessing the temperature sensitivity of aerosol partitioning. Rather than comparing 2 experiments at different temperatures (each with independent uncertainties that complicate attributing changes in AMF to DT), the ramp or step change in temperature allows direct observation of the change in partitioning. The technique is called temperature-ramped equilibrium volatility analysis (TREVA). Results

No. experiments (for this analysis)

2

are available only for the a-pinene–ozone and b-pinene–ozone systems (Pathak et al., 2007a, b; Stanier et al., 2007). Data from all 3 of these experiment types are used in section 3.5 to parameterize AMF and uncertainty for a-pinene ozonolysis. The specific datasets are listed and referenced in Table 3. Individual datapoints used in fitting are listed in a supplemental data section. 2.4. Nomenclature and description of synthetic data used in fitting Synthetic data, also called pseudodata, is useful for demonstrating that the fitting procedures work and assessing specific strengths and weaknesses of the parameterization strategy. Values of ai, ci , and DHi are used to simulate AMF experiments, generating synthetic (pseudo) data as datapoints with values of cOA, x, and T. These can be simulated with random error and bias if desired. We will use the following terminology:



Underlying model: The ai, ci , and DHi used to generate synthetic data. Ideal AMF: x calculated from the underlying model. Synthetic data or pseudodata: Data (x, temperature, and cOA) used in regression. ^ Predictions or regression predictions: AMF ðxÞ calculated according to the regression output. For pseudodata cases, regression predictions are compared to synthetic data, where agreement is expected, and to ideal AMFs, where agreement is not necessarily expected.

ARTICLE IN PRESS C.O. Stanier et al. / Atmospheric Environment 42 (2008) 2276–2299

2284

Table 4 Description of synthetic data experiments Chemical and physical characteristics of underlying SOA model Description

Components 1–5

2 SOA products

14 SOA products







Pseudodata sets generated from each underlying model

6–10

11–14

Log10 c*

0.699; 2.90

a  100 DH (kJ mol1)

10; 15 80; 60

Log10 c*

1.75; 1.54; 0.436; 0.218; +0.60

0.702; 1.04; 1.56; 1.84; 2.05

2.58; 3.02; 3.24; 3.83

a  100 DH (kJ mol1)

0.12; 0.06; 0.6; 3; 1.8 120; 110; 90; 80; 60

4.2; 1.2; 0.9; 2.4; 0.6 45; 72; 45; 83; 100

9.0; 1.2; 0.6; 2.4 40; 35; 40; 50

Pseudodata A, B1, B2 (see text)

Regression model: The structure (Eq. (6)) and  parameters for x^ ai ; DH; ci ðT ref Þ; T ref ; cOA ; T . The adjustable parameters are ai (n values) and DH. Tref is a fixed parameter (set to 298 K for this work). cOA and T are the independent experimental variables. Basis set or basis vector: The fixed values of ci used in a regression model. Parameterization applicability domain (also model domain): The expected range of cOA and temperatures where the parameterization will be employed. Experimental domain: Range of cOA and temperatures in the experimental or synthetic data. Temperature sensitivity: The change in aerosol concentration with temperature due to semivolatile partitioning, expressed as q ln ðcOA Þ=qT with units of K1. For SOA from a-pinene ozonolysis, this is expected to be 0.004 to 0.036 K1 (Stanier et al., 2007). For consistency, all temperature sensitivity values in this work are calculated at 40 mg m3 and 20 1C.

The ideal models and synthetic data experiments used in this work are described below. A 2-product underlying model is used to generate sets A, B1, and B2. Parameters for this 2-product underlying model can be found in Table 4. In pseudodata set A, the sequence of hypothetical experiments includes 19 chamber final AMF experiments, with experimental cOA values from 7–525 mg m3 and temperatures from 26 to 37 1C. Experiments are assumed to have very low independent random error in determination of DROG such that the coefficient of variation (CV ¼ s/mean)

Pseudodata C

on DROG from repeated smog chamber experiments would be 0.007. In pseudodata set B1, 21 final AMF experiments are simulated and they are assumed to come from 3 separate series of chamber experiments. The first series of experiments is 10 experiments, all at 26 1C. They are assumed to have random error such that CV ¼ 0.02, but they are also assumed to have a series-specific bias of +10% in measured AMF (e.g. true AMF of 0.2 would be sampled as 0.22). The second series of experiments is 6 experiments, all at 37 1C. They are assumed to have random error such that CV ¼ 0.02, but they are also assumed to have a series-specific bias of 10% in measured AMF. A third series of experiments is 5 experiments at the same fixed DROG, but each at a different temperature, from 17 to 37 1C. They are assumed to be free of systematic error, but have random error such that CV ¼ 0.02. Systematic differences in the results of smog chamber AMFs are to be expected due to causes such as differences in GC calibrations, errors in wall loss corrections, and differences in SMPS and CPC calibrations. In pseudodata set B2, the pseudodata experiments from B1 are supplemented with 5 temperature ramp experiments, where the AMF is measured at 5 different temperatures as the chamber is heated or cooled. The data is assumed to be noisy, and to have correlated errors. Specifically, each of the 5 experiments is assumed to have a bias shared by all 5 datapoints, associated with the determination of the DROG for that experiment. This is generated randomly such that the CV ¼ 0.06. The details of the hypothetical experiments are that each experiment gives 5 AMF datapoints at temperatures ranging from 22 to 39 1C. Each datapoint was

ARTICLE IN PRESS C.O. Stanier et al. / Atmospheric Environment 42 (2008) 2276–2299

generated with an additional independent error, corresponding to a CV ¼ 0.03. In pseudodata set C, a much more complex physical mixture is assumed, with 14 products with c* values ranging from 102 to 104 mg m3. No single component was dominant, but the most prominent 3 had yields (ai) of 0.09, 0.042, and 0.03 at c* values of 100.2, 100.7 and 102.6 mg m3, respectively. A full list is given in Table 4. The pseudodata comprising dataset C are sampled in a very similar sequence of experiments as in set A. Nineteen simulated experiments, with low random error (CV ¼ 0.007), no systematic error, spanning min/max cOA values of 7 and 525 mg m3, and including temperatures from 26 to 37 1C. In all cases (A, B1, B2, and C) the product yields are assumed to be temperature independent. 3. Results and discussion 3.1. Ability to fit synthetic data Following the flowchart in Fig. 2, there are 3 separate steps before any calculations begin. The first requirement is experimental data. For this demonstration, pseudodata set A is used as described in Section 2.4. The second requirement is a basis vector of c* values to be used in the fit; the basis vector needs to be consistent with the CTM model domain, as described in Section 2.2. For this demonstration, we assume the model domain is: 0–40 1C; cOA,max of 25 mg m3; acceptable error (d) of 0.07 mg m3 in SOA partitioning under very clean conditions; and acceptable error of 1.25 mg m3 (fcOA,max with f ¼ 0.05) in SOA partitioning under polluted conditions. This set of values requires c*min of 0.01 mg m3 and c*max of 105 mg m3. Eq. (11) puts n at 4–8. The choice of 8 corresponds to a decadal spacing (0.01, 0.1, 1, 10, 100, 103, 104, 105 mg m3). The third requirement is to form vectors to represent the model domain as a p  j grid of cOA and temperature values. In this work, we use 20 values of cOA (p ¼ 20) and repeated all calculations at 5 different temperatures (0, 10, 20, 30 and 40 1C). Therefore, an output of the algorithm will be AMF estimates and CI at the p  j points. Finally, a vector of DH values is required for probing the uncertainty limits. In this work, we test values ranging from 10 to 120 kJ mol1 in 5–15 kJ mol1 increments. With those 3 items set, the optimization and the determination of the CI can proceed. The result for

2285

the basis c* vector (0.01, 0.1, 1, 10, 100, 103, 104, and 105 mg m3) is shown in Fig. 3. The regression lines (thick solid lines) go through the data very well (the mean absolute fractional error is less than 1%). The regression prediction (thick lines) and the ideal AMFs (dotted lines) do not agree completely because (a) the structure of the regression model (choices of fixed c* and DH values) do not match the underlying model exactly; and (b) there is some simulated experimental error in the pseudodata. An examination of the regression residuals does not show curvilinearity, correlation between AMF and residuals, or nonnormality. The uncertainty bounds (gray) are narrowest where there is the most data (30 1C) and become wider away from the data. The underlying model included 2 SOA species with DH values of 80 and 60 kJ/mol. The regression estimated a DH value of 77 kJ mol1. The fitted a values were (0, 0, 0.027, 0.097, 0, 0.069, 0.25, 0.25). The ideal AMF (narrow dotted lines) are encompassed within the CI for nearly the entire parameterization applicability domain (88 percent of the ideal AMF fall within the CI). The uncertainty bounds are output not as a parameterized function, but rather as an array with a value for the upper and lower CI at each of the p  j aerosol–temperature combinations. The CI are small in parts of the model domain, reflecting the high precision in the pseudodata. The CI at low concentration (low COA) can be readily understood. The high limit corresponds to the lowest AMF arising from an essentially nonvolatile product; the upper confidence interval is thus a horizontal line below the lowest concentration datapoint. The low limit, on the other hand, corresponds to the lowest AMF arising from material with a saturation concentration close to the observed concentration. In Fig. 3, and in all subsequent figures of the same style, the experimental data are shifted to the nearest temperature line using a preset temperature sensitivity of 0.015 K1. For fitting, the data is fit at its actual temperature—the shift is only so that the data can be summarized in one AMF vs. cOA plot. 3.2. Exploring goodness-of-fit vs. basis set selection To explore the relationship between the choice of the basis set and the quality of the parameterization, we now repeat the fit of pseudodata set A with different basis vectors, varying the number of elements in the basis vector, their spacing, and the

ARTICLE IN PRESS 2286

C.O. Stanier et al. / Atmospheric Environment 42 (2008) 2276–2299

Fig. 3. Regression result against pseudodata set A. The regression predictions (at a range of temperatures) are the solid black lines, while the narrower dotted lines are the ideal AMFs. Pseudodata are white squares. Data points have been adjusted to the nearest line using a preset temperature sensitivity of 0.015 K1. Gray bands show 95% confidence intervals.

extreme values of the basis set (c*min and c*max). The pseudodata are unchanged; only the basis vectors are changed. The choices are then compared on the basis of goodness-of-fit, fraction of data within CI, and confidence interval width. The results are shown in Figs. 4 and 5. Five different fits with basis vector length 2 (n ¼ 2 fits) were done. These are denoted in Fig. 4 as entries 2a–2e and the c* values for these were (in mg m3) 2a (5/800), 2b (10/1000), 2c (0.1/100), 2d (10/100), 2e (1/10). Four different n ¼ 3 fits were included in the comparison (3a–3d), as well as 2 n ¼ 4 fits (4a, 4b) and the c* values for these were 3a (103/1/ 1000), 3b (0.01/1/100), 3c (1/100/104), 3d (1/10/ 1000), 4a (0.1/1/100/1000), and 4b (0.1/1/100/104). Single fits were done at n ¼ 5, 6, 7, 8, and 9, respectively. The n ¼ 5–7 cases had lognormal spacing between 0.01 and 1000 mg m3. The n ¼ 8 fit is repeated from Section 3.1. The n ¼ 9 fit has lognormal spacing from 0.01 to c*max to 105 mg m3. Fig. 4 has performance statistics for all 16 of these cases. Fig. 5 shows AMF vs. cOA plots for 4 of the 16 cases. On the y-axis of Fig. 4a is a goodness-of-fit of the regression prediction relative to the synthetic data

(black bars) and to the ideal AMF (white bars). The plotted error is the mean error fraction, errfrac (see Table 2). For example, a prediction of 0.18 AMF versus a data point of 0.20 is expressed as a 10% error in Fig. 4a. Fig. 4b graphs the percentage of the ideal AMF across the entire domain that fall within the CI. Fig. 4c shows the width of the CI at 4 specific points in the system. One point (30 1C, 26 mg m3, triangle symbols) is right in the heart of the pseudodata. One would expect the narrowest CI for that portion of the regression. Another, (30 1C, 7 mg m3, circle symbols) is within the data, but just barely. Two points represent the extrapolation in temperature (10 1C, 26 mg m3) and in temperature and concentration (10 1C, 7 mg m3). Several conclusions can be drawn from Figs. 4 and 5. First, low n (2–3 components in the basis vector) fits are variable in goodness-of-fit relative to the data. As shown in Fig. 4a (fits 2a–3d), some basis sets lead to good fits, while other fail badly. On the other hand, using an n ¼ 2 basis vector for estimating CI on the regression gives poor results regardless of the c* values. The fraction of ideal model points falling in the CI (Fig. 4b, fits 2a–2e) is fairly low, while the size of the CI is highly

ARTICLE IN PRESS C.O. Stanier et al. / Atmospheric Environment 42 (2008) 2276–2299

2287

Fig. 4. Performance characteristics for fits of pseudodata set A using 16 different basis vectors. See text in Section 3.2. Four of the 16 cases are graphed in detail in Fig. 5. All fits had a single variable DH, from 2–9 fixed ci* values, and 2–9 fitted ai values. Mean absolute relative error is plotted in panel A. The percentage of the pseudodata falling within the 95% confidence intervals is shown in panel B. The width of the confidence intervals at specific temperature and organic aerosol concentrations is shown in panel C.

dependent on the goodness-of-fit, varying from unreasonably narrow (2a) to very wide (2c). The variability in the regression results is reinforced by the n ¼ 2 and 3 panels of Fig. 5; the goodness-of-fit and the width of the CI show extreme variability, depending on the exact choice. In case 2a, the c* values selected correspond exactly to the values in the underlying model. Therefore, the data are fit very well, giving the lowest value of the objective function of all the cases in Figs. 4 and 5. However, the excellent fit coupled with the fact that there are no poorly constrained regression parameters gives very narrow CI. The second point to draw from Figs. 4 and 5 is that the regression results improve in quality and consistency when a basis vector with nX5 and appropriate c*min and c*max are used. The exact choice of individual c* values becomes much less important as n increases to values of 5 and higher.

In all the nX5 cases, the errors relative to the data and pseudodata are small. The fraction of ideal points within the CI are reasonable, and the uncertainty in the prediction is smallest in the heart of the data, and grows larger as the degree of extrapolation increases. AMF vs. cOA plots can be seen with uncertainty bounds for the n ¼ 6 case (Fig. 5) and the n ¼ 8 case (Fig. 3). It should be noted that both n and c*min/c*max need to be appropriately selected for good fits, and no specific value of n can guarantee good results. An example of an n ¼ 5 fit with unacceptable CI can be found in Section 3.5.1. 3.3. Temperature sensitivity and DH In some fitting applications the data will constrain the temperature sensitivity of aerosol concentrations. In other words, in a plot such as Fig. 3,

ARTICLE IN PRESS 2288

C.O. Stanier et al. / Atmospheric Environment 42 (2008) 2276–2299

Fig. 5. Some of the cases from Section 3.2 shown in detail. The difference between the four cases is the selection of the basis vector. The pseudodata (white squares) is the same in each case. Each plot has shows separate regression lines (thick solid black lines), ideal AMFs (thinner dotted lines), and confidence intervals (gray shading) at five different temperatures. As in Fig. 3, temperatures are (in order of decreasing AMFs) 0, 10, 20, 30 and 40 1C. See Fig. 4 for error statistics for these cases.

the AMF will be equally well predicted at a range of different temperatures. Pseudodata set A is, for the most part, able to constrain the change in AMF with temperature. For example, the n ¼ 7 regression recovers a temperature sensitivity of 0.022 K1. The temperature sensitivity of the underlying model at the same point is 0.021 K1. The reason for this good agreement is that the pseudodata includes data from a range of temperatures, there is no correlation between error and temperature, and random errors are small. The fitted DH value (77 kJ mol1) is in agreement with the underlying model (60 and 80 kJ mol1), as explained in Section 3.1. Furthermore, in pseudodata set A, the DH value is tightly constrained, with a standard error of 3 kJ mol1 and visible deterioration of the goodness-of-fit when DH is reduced from 77 to 60.

In still other scenarios the data will not constrain temperature sensitivity. This can occur when all data were sampled within a narrow temperature range. In that case, the CI will be narrow at temperatures with data, and increasingly wide at other temperatures. Or the cause can be excessive random error in the data masking any temperature signal. This should lead to wide CI at all temperatures. Another possibility is that correlations between error and temperature lead to an biased temperature sensitivity. To illustrate the case of incorrect recovered temperature sensitivity, a pseudodata set B1 was generated (see Section 2.4 for complete description). In pseudodata B1, experimental bias is correlated with temperature. The regression result is shown in Fig. 6a. As expected, the regression predictions are

ARTICLE IN PRESS C.O. Stanier et al. / Atmospheric Environment 42 (2008) 2276–2299

incorrect in terms of temperature sensitivity; the recovered DH value is too low (40 kJ mol1 compared to ideal values of 60–80 kJ mol1) and the temperature sensitivity of the regression prediction is 0.019 K1 compared to the ideal value of 0.021 K1. 3.3.1. Using ‘‘temperature ramp’’ chamber experiments One method of overcoming poorly constrained temperature sensitivity is through repeated smog chamber experiments at different temperatures (Pathak et al., 2007b). However, the number of experiments may need to be fairly large (415) to overcome random error between chamber experiments. Another type of experiment design is a temperature ramp design, where a temperaturecontrolled smog chamber is heated or cooled quickly (minutes) after initial SOA formation (Stanier et al., 2007). This has the advantage of minimizing the effect of experiment-to-experiment noise in measuring the temperature sensitivity. This type of experiment was included in pseudodata set B2. Set B2 is identical to B1 except that 5 temperature ramp experiments, each generating 5 datapoints, are added to the 19 experiments already in set B1. Fig. 6b shows the regression result when set B2 is fit. The temperature ramp experiments, although noisy compared to the original data, meet

2289

the objective of improving the estimation of the temperature sensitivity. The regression shown in Fig. 6b has a temperature sensitivity of 0.022 K1 and a regressed DH value of 67 kJ mol1. Similar results to Fig. 6 were obtained by reversing the biases in the 29 and 37 1C data series. In the above paragraphs and in Fig. 6, the lower temperature data was biased high while the higher temperature data was biased low, leading to an apparent low temperature sensitivity. If the biases are reversed, the apparent temperature sensitivity becomes large compared to the ideal temperature sensitivity of the underlying model. For example, with the true temperature dependence the same as above (0.021 K1), the initial regression’s value is 0.034 K1. Once the temperature ramp data is added, the sensitivity goes to 0.024 K1. Volatility TDMA experiments (An et al., 2007; Offenberg et al., 2006; Philippin et al., 2004) may also be suitable for this type of analysis, as long as equilibrium is achieved in the flow setup. 3.3.2. User specified limits on temperature sensitivity If the CTM modeling results are sensitive to temperature effects on partitioning, then realistic temperature sensitivity and CI are highly desirable. If no experimental data are available as a constraint, then suggested limits are (based on a-pinene SOA) q ln(cOA)/qT of 0 to 0.04 K1 (Stanier et al., 2007).

Fig. 6. Temperature sensitivity of regression prediction. Each plot has regression lines (thick solid black lines) and ideal AMFs (thinner dotted lines) at five different temperatures (in order of decreasing AMFs) 0, 10, 20, 30 and 40 1C). In figure (a), the pseudodata contain correlated errors in AMF as a function of temperature, reducing the temperature sensitivity in the regression prediction. The ideal AMFs have a high temperature sensitivity (wide spacing between ideal AMFs at different temperatures). In figure (b) the pseudodata have been augmented with temperature ramp data, bringing the fitted and ideal temperature sensitivities into closer agreement.

ARTICLE IN PRESS 2290

C.O. Stanier et al. / Atmospheric Environment 42 (2008) 2276–2299

Bracketing temperature sensitivity is preferred over the simpler approach of just limiting the DH value used Eq. (6) to a preset range. The reason for this is that the value of DH required for a certain temperature sensitivity depends on the basis vector selection and aerosol concentrations. 3.4. Pseudodata set C—complex mixture The underlying model for sets A and B was relatively simple, with only 2 SOA products. Actual organic aerosols, even for single precursors, are much more complex. In case C, the SOA mixture is assumed to have 14 components with c* values ranging from 102 to 104 mg m3. No changes to the algorithm or procedures are required. The result of the fitting algorithm is shown in Fig. 7. A basis vector of 8 lognormally space c* values (from 0.01 to 105 mg m3) was used to perform the regression. The fit results (error relative to data, relative to underlying model, temperature sensitivity, and CI) are quite good. The mean absolute fractional error for prediction of the datapoints is less than 0.02. All of the ideal model values are within the CI. The underlying model temperature sensitivity is 0.026 K1 and the regression temperature sensitivity is 0.022 K1 (with a fitted DH of 56 kJ mol1). An examination of the regression residuals does not show curvilinearity or nonnormality of the residuals, but there is some tendency for increasing absolute value of residuals as the cOA increases. This is not surprising since the artificial noise added to the pseudodata was added as a percentage of the measured AMF (rather than an absolute value). One-to-one mapping of the ideal to the fitted yields (a values) is not expected at the individual basis vector level. Nevertheless, broad agreement is seen when ideal and fitted yields are compared. When mapped to the nearest c* value, the true yields would be: [0.002 0 0.04 0.07 0.04 0.11 0.02 0]. The corresponding fitted values are: [0 0 0.09 0.03 0.09 0.14 0 0]. Grouping into 4 bins instead of 8, the comparison is closer: true yields at [0.002 0.11 0.15 0.02] and fitted yields at [0 0.12 0.14 0]. 3.5. Fitting a-pinene ozonolysis data We now demonstrate the fitting algorithm on real data. The fitting is very similar to the pseudodata cases presented above. The main differences are that the dataset is larger, that the data somewhat noisier,

and that new types of possible input data are included (final AMF experiments, dynamic AMF experiments using PTRMS, and temperature ramp experiments). 3.5.1. Demonstration algorithm on large a-pinene ozonolysis dataset The data used for fitting is summarized in Table 3, and individual datapoints are listed in supplemental data. For the model applicability domain the ‘‘general’’ model case from Table 1 was selected (0.1–60 mg m3 with a temperature range from 0 to 40 1C). While Table 1 (based on Eqs. (9)–(11)) recommends a c*min and c*max of 1  103 and 2.4  105, respectively, these are calculated based on a single component with DH of 100 kJ mol1, and the real aerosol does not have nearly that temperature sensitivity. Therefore, a narrower range might be employed. This necessary range was tested by fitting with basis vectors of length 9 (103–105), length 7 (102–104), and length 5 (l01–103). The fits and CI were nearly identical for the n ¼ 7 and 9 cases. However, the CI change somewhat in the n ¼ 5 case. Specifically, the CI narrowed at low and high concentrations from the removal of the 0.01 and 10,000 mg m3 entries in the basis set. Therefore, all of the fits in Section 3.5 utilize an n ¼ 7 basis set with decadal c* values from 102 to 104 mg m3. The fitting result for the n ¼ 7 case is shown in Fig. 8. The mean absolute fractional error is 0.08. In other words, the average error on an AMF of 0.2 is 70.016. From the plot, it can be seen that the source of this fitting error is the variability in the data itself. There is obvious variability in the experiments, and some bias between the different series of experiments. The yield values (for c* ¼ 102–104 mg m3) are [0 0 0.072 0.075 0.082 0.29 0.29], where the 0.29 values at a6 and a7 are user-selected upper limits (the data is not sufficient to constrain the most volatile component yields). Fitted DH is 35 kJ mol1 and temperature sensitivity was 0.02 K1. The largest CI occur under 2 conditions: at less than 1 mg m3, and also at temperatures below 20 1C. The cause of both of these areas of large uncertainty is a relative scarcity of data. An examination of residuals shows that residuals are correlated with the source of the data (e.g. lead investigator and chamber), and have heavier tails than a normal distribution. Finally, there is some tendency for increasing absolute value of the residual with cOA, reflecting the fact that uncertainty in chamber experiments consists of

ARTICLE IN PRESS C.O. Stanier et al. / Atmospheric Environment 42 (2008) 2276–2299

2291

Fig. 7. Fitting of pseudodata set C (complex mixture). Regression lines (thick solid black lines) have been fit to go through pseudodata (white squares) and the 95% confidence intervals on regression are shown by gray shading. Ideal AMFs are also shown as thin dotted lines.

errors that increase with the signal (e.g. with cOA). Repeating the analysis of this paper with a more refined regression process to account for these would lead to shifts in the best-fit lines and CI, but the qualitative features of the fitting would remain, such as the range and number of c* values needed and the role of repeated random sampling of a values for CI. 3.5.2. Removing correlated data in highly timeresolved data The use of PTRMS to track the reacting VOC concentration, and the use of SMPS or aerosol mass spectrometers can generate multiple data points per experiment. The 2 PTRMS experiments fit in this section have 38, and 89 datapoints, respectively. Because datapoints are weighted equally in the fitting process, it is important to remove non-

independent data points that may occur in the PTRMS-based experiments. In other words, a method is required to reduce the number of data points that are not independent from one another. The Durbin–Watson test, a test for serial correlation in the residuals, was selected for this work. The test was performed by first fitting a basis vector of length 7 to the data for the PTRMS experiment in question only. The residuals from this fit are used in the Durbin–Watson test (Hamilton, 1992). In one of the 2 cases, the 38 residuals were determined to not be correlated. In the other, the Durbin–Watson test did indicate excessive correlation. This was addressed by averaging successive data points (starting with those contributing most to the Durbin–Watson score) until either the Durbin–Watson test statistic was raised above the upper critical value, or the number of data points was reduced to 15. The main

ARTICLE IN PRESS 2292

C.O. Stanier et al. / Atmospheric Environment 42 (2008) 2276–2299

Fig. 8. Fit of a wide variety of a-pinene ozonolysis data. Regression lines (solid black lines) are shown for 5 different temperatures and have been calculated by fitting data (symbols) using an n ¼ 7 basis set with c* ¼ [102 101 100 101 102 103 104] mg m3. Confidence intervals on regression are shown by gray shading. See text (Section 3.5) for discussion and Table 3 for list of fitted data. The temperature sensitivity of the regression at 40 mg m3 and 20 1C is 0.020 K1 and the fitted DH was 33 kJ mol1.

effect of the reduction in the number of datapoints is to increase the CI. Leaving in all the PTRMS data leads to unreasonably small CI, especially at high yield where the datapoints are repeated. Binning the PTRMS data according to AMF would have a similar outcome as using the Durbin–Watson test. 3.5.3. Response of confidence intervals to key data Fig. 9 shows the result when only a subset of data are fit. The data that is used is that of Cocker et al. (2001). This data were selected because all of the samples are at 2971 1C. Therefore, not surprisingly, extrapolation to different temperatures gives significant error. The best-fit DH and temperature sensitivity for this dataset are 10 kJ mol1 and 0.010 K1. The uncertainty bounds are determined by the range of allowed DH values in the Monte Carlo sensitivity study. Those shown are for limits of 5 and 120 kJ mol1. A wider allowed range would give even wider CI for the temperatures other than 2971 1C. For example, setting DH to 120 kJ mol1 gives a temperature sensitivity of in this case at 20 1C and 40 mg m3 of 0.046 K1. Two conclusions can be drawn from Fig. 9. First, the confidence interval algorithms used here

respond in the correct qualitative fashion as key data are added/subtracted to the dataset. Second, extending the temperature range and/or temperature sensitivity of AMF measurements is just as necessary for reduction in overall uncertainty as is extending the concentration range. Figs. 8 and 9 are directly comparable and show the effect of the increase in the range of data on the uncertainty bounds. 3.6. Discussion of application to chemical transport modeling The paper and algorithm have focused on leaving the basis set selection flexible, to be varied as needed according to the demands of the expected model application (e.g. Tmin, Tmax, cOA,min, cOA,max, and allowable errors in partitioning). If followed literally, that idea could lead to a proliferation of data reductions, each tailored to a specific model domain; that is not a direction advocated by the authors. Rather, the key concepts from this work that should translate to improved modeling are (1) mismatches in the fitting basis vector and model organic aerosol concentrations and temperatures

ARTICLE IN PRESS C.O. Stanier et al. / Atmospheric Environment 42 (2008) 2276–2299

2293

Fig. 9. Fit of subset of a-pinene ozonolysis data selected to demonstrate the possibility of wide confidence intervals for temperature extrapolation. Regression lines (solid black lines) are shown for 5 different temperatures and have been calculated by fitting data (white squares). Confidence intervals on regression are shown by gray shading. Panel (f) is included to show that the regression lines are almost temperature independent.

can have undesirable consequences; (2) realistic CI on aerosol mass fraction are possible; (3) multiple experiment types can be integrated to give improved parameterization; (4) comparisons of AMF (measurements, predictions, and CI) and temperature sensitivity are much more informative than comparisons of regression parameters (ai, ci , and DH); and (5) high n fixed c* basis vector fits are feasible and useful. A practical method of achieving the improved modeling results may be for wider adoption of a decadal basis set including ‘‘nonvolatile’’ (c* o0.1 mg m3), ‘‘semi-volatile’’ (SVOC; 0.1 mg m3 oc*o1000 mg m3), and ‘‘intermediate-volatility’’ (IVOC; 1000 mg m3oc*o100,000 mg m3) organic compounds. In other words, the n ¼ 8 basis set with c* from 102 to 105 mg m3. In this work, the n ¼ 8 set proves more than suitable for fitting the existing apinene ozonolysis data, and for spanning simulated SOA partitioning data across realistic ranges of temperature and organic aerosol concentration. For modeling applications where a reduced n is desirable (e.g. a global model), the n ¼ 8 regression results can be translated to a reduced basis set, to

polynomial fits, or to lookup tables. For example, n ¼ 4 may be an attractive treatment for online partitioning calculations in a large-scale CTM. With n ¼ 4, the choice of c*min and c*max values will be important in determining the performance of the partitioning fit, and use of Eqs. (9) and (10) is recommended for basis vector selection. Fig. 4 showed that n ¼ 4 basis vectors were not acceptable, but the specific n ¼ 4 basis sets were not optimized for best performance in Section 3.2. The basis vectors of [0.1 1 10 100], [1 10 100 103], and [0.1 2.2 46 1000] should be considered if an n ¼ 4 representation is desired. It should be emphasized that an n ¼ 4 basis set may be rich enough for fitting the experimental data and for the CTM partitioning calculations, but a basis vector with n44 may be needed for offline quantification of uncertainty limits during the data reduction step. 4. Summary and conclusion A method for fitting of smog chamber data for CTM applications was presented. The method uses a volatility basis set of compounds with fitted AMF

ARTICLE IN PRESS 2294

C.O. Stanier et al. / Atmospheric Environment 42 (2008) 2276–2299

and user-specified temperature-dependent saturation concentrations. A single effective enthalpy is fitted to match experimental and modeled temperature sensitivity of AMF. Important aspects of the method include: selection of minimum and maximum c* values, selection of the number of basis set compounds, and estimation of uncertainty bounds for the regression prediction. Main conclusions include:













When fitting parameters to experimental smog chamber data, the range of effective saturation concentration (c*) values used must be consistent with the expected temperature and organic aerosol concentration range in the final model application. Equations for c*min and c*max are presented in Section 2.2. The length of the basis vector (number of c* values used, n) can influence both the quality of the data fitting, and the ability to compute CI. Careful selection of c* values is required with np4 to insure a good fit and suitable CI. With 5 or more lognormally spaced c* values, the best-fit line and CI become insensitive to the exact c* basis vector used. It should be noted that the range of c*min and c*max influence the required value of n. A method for determination of CI is demonstrated. Although parts of the absorption SOA parameterization are nonlinear and/or implicit, the calculation of AMF is linear in the stoichiometric coefficients ai. Therefore, fitted ai values can be calculated using nonnegative linear least squares, increasing computational efficiency. In addition to traditional final AMF smog chamber experiments, newer dynamic aerosol formation experiments (e.g. PTRMS) and experiments with varied temperatures can be useful in constraining AMF parameterizations. Highly time-resolved data, such as a 1-min time series of AMF vs. DROG from a PTRMS and SMPS, may require averaging to reduce serial correlation between datapoints. Errors in apparent temperature sensitivity can be caused by correlated errors in measurements and temperature. This can be partially corrected with temperature ramp experiments that isolate the temperature sensitivity. Temperature-dependence of semi-volatile aerosols is best dealt with by matching experimental and modeled temperature sensitivity (q ln c/qT).





For the a-pinene ozonolysis, the fitted temperature sensitivity at 40 mg m3 and 20 1C is 0.02 K1 and the regressed DH is 33 kJ mol1. Regressed DH values depend on the basis vector and thus do not necessarily match the pure component enthalpies. Temperature sensitivity should not depend on the basis vector; therefore, comparisons should be made of temperature sensitivity and not of regressed DH. In order to reduce uncertainty in CTM modeling, additional experimental data is needed at both lower concentrations and at a range of temperatures. Although a large number value of n (6 or more) may be required for an initial characterization of CI, this large number of parameters does not necessarily have to be used in the CTM. In other words, the items of value to the CTM modeler are the aerosol mass fraction and uncertainty bounds themselves, not the regression parameters. While each CTM modeling application may have a different optimal basis set, a practical method of achieving improved CTM modeling results may be for wider adoption of a basis set including ‘‘nonvolatile’’ (c*o0.1 mg m3), ‘‘semivolatile’’ (SVOC; 0.1 mg m3oc*o1000 mg m3), and ‘‘intermediate-volatility’’ (IVOC; 1000mg m3 oC*o100,000 mg m3) organic compounds. This is the n ¼ 8 basis set with c* from 102 to 105 mg m3. The parameters can be reduced to a smaller set as needed for specific CTM applications as needed.

Acknowledgements Although the research described in this manuscript has been funded wholly or in part by the United States Environmental Protection Agency through Grant RD-83108101, it has not been subjected to the Agency’s required peer and policy review and therefore does not necessarily reflect the views of the Agency and no official endorsement should be inferred. The detailed comments of the anonymous reviewers are greatly appreciated. Appendix A. AMF (yield) equations with density and preexisting organic aerosol considerations Eq. (6) is the commonly encountered fitting equation. Here, we show that Eq. (6) is a special case of a more general set of mass balances for the

ARTICLE IN PRESS C.O. Stanier et al. / Atmospheric Environment 42 (2008) 2276–2299

semi-volatile aerosol system. The equations in this appendix are necessary when accounting for experimental and simulation cases where the organic aerosol density is not assumed to be unity, and for cases with preexisting organic mass. Starting from the following definition: ci;gas 

yi ci

caer;i  ¼ c cOA i

(A.1)

the following expression can be derived by mass balance: caer;i 1 . ¼ ctot;i 1 þ ðci =cOA Þ

equations for cOA and AMF in the case of preexisting aerosol and assumed unit density. cOA ¼

X

xi ci;tot ¼

X cPtot;i þ Dctot;i 1 þ ðci =cOA Þ

(A.6)

In the case of no preexisting aerosol: cOA;U r X ai ¼ xU ¼ U  DROG r 1 þ ðci rU =cOA;U rÞ X a0i  0  ¼ 1 þ ci =cOA;U

(A.2)

and

This can be used with the following definitions of AMF and reaction mass yield ai.



xi 

2295

r r X a0i xU ¼ 0 rU rU 1 þ ci =cOA;U

(A.7)

(A.8)

and Dctot;i  ai DROG; AMF ¼ x 

(A.3)

X DcOA ai ¼ . DROG 1 þ ðci =cOA Þ

cOA ¼ DROG (A.4)

Using the above equations, we can write expressions for prediction of organic aerosol mass (cOA), species-specific partitioning values ei, and smog chamber AMF as function of parameters such as ai, DROG, and ci*. To keep the expressions general, we assume the density of the organic aerosol is unknown, and with preexisting organic aerosol. We denote the values of variables corresponding to the real organic aerosol density as in the above equations (no subscript). We denote values of variables under the case of assumed unit organic aerosol density with a subscript U for physical variables such as x and cOA. Preexisting aerosol is denoted by the superscript P. Expression for cOA, cOA,U, x and xU are then: X X ctot;i cOA ¼ cOA;U r=rU ¼ xi ci;tot ¼ 1 þ ðci =cOA Þ X X c c   tot;i ¼  tot;i  ¼ 0 1 þ ci rU =cOA;U r 1 þ ci =cOA;U X c  0 tot;i , ¼ (A.5) 1 þ ci r=cOA rU 0

where the definitions ci ¼ ci rU =r and a0i ¼ ai rU =r are used. The 4 alternate versions of the denominator in Eq. (A.5) can be made in all the cases below, and the different expressions are not necessarily repeated thoughout the remainder of Appendix A. Separating the organic aerosol (cOA) and the total concentration of species i into preexisting and reaction-generated parts, gives the

r X a0i . 0 rU 1 þ ci r=cOA rU

(A.9)

In this work, all aerosol concentrations and AMFs are calculated using unit density conversions from SMPS aerosol volume, so the fitting really 0 recovers the normalized values ci ¼ ci rU =r and a0i ¼ ai rU =r according to Eq. (A.7). To make concentration predictions for organic aerosols with densities other than 1.0 requires the use of Eq. (A.9). At low AMF, the terms cancel and correct AMF and cOA predictions will be made by regressed values of a, regardless of the density assumption made. At high aerosol loadings, AMF and cOA concentration predictions will be low for actual aerosol specific gravities 41. In the event that individual surrogate compound densities are known, the overall mixture density r can be calculated from the component densities using P X 1 x ctot;i yi r¼ ¼P i , (A.10) ri xi ctot;i =ri where yi are AMF and xi are defined by Eq. (A.2). Appendix B. Derivation of formulae for c*min and c*max For any expected parameterization applicability domain (defined by cOA,min, cOA,max, Tmin, and Tmax) the extreme values required of the effective saturation concentration (c*min and c*max) are calculated as follows. For a single condensing component in the presence of cNV,min (nonvolatile,

ARTICLE IN PRESS C.O. Stanier et al. / Atmospheric Environment 42 (2008) 2276–2299

2296

organic, and solution forming) preexisting aerosol, the mass balance for the semi-volatile species is caer;SV ¼ ctot;SV  cgas;SV ¼ ctot;SV  ySV cSV ,

(B.1)

where cgas,SV and caer,SV refer to the gas- and particle-phase concentrations of the semi-volatile in mg m3. Adding in the preexisting nonvolatile organic, the total organic aerosol concentration is cOA ¼ caer;SV þ cNV;min ¼ ctot;SV  ySV cSV þ cNV;min . (B.2) To determine c*min, the lowest c* necessary to achieve the required modeling accuracy goal d, we construct a scenario where a truly nonvolatile species is represented in the model as having a finite saturation concentration c*min, therefore creating an error. The true aerosol concentration in this case will be: cOA;true ¼ ctot;SV þ cNV;min

(B.3)

while the error in the modeled aerosol concentration will be: cerror ¼ ySV cmin pd

(B.4)

to keep this error less than a user-selected goal d: cmin p

d . ySV

(B.5)

If the organic aerosol consists only of the nonvolatile and semi-volatile fractions, then we can eliminate ySV to give:   cNV;min cmin pd 1 þ . (B.6) ctot;SV For a conservative estimate of c*min, we assume ctot,SVbcNV,min, to keep the right-hand side as small as possible. Furthermore, the calculation is assumed to apply at the highest temperature applied in the model (Tmax)    T max DH 1 1 exp  cmin ðT ref Þ ¼ d . R T max T ref T ref (B.7) Putting in 100 kJ/mol, neglecting Tmax/Tref, and with temperature in Kelvin:   12; 000  cmin;298 K  d exp  40:27 (B.8) T max with d of 1 mg m3 and Tmax at 40 1C, c*min is 0.14 mg m3. To derive c*max we consider a case with nonvolatile solution-forming aerosol (secondary or

primary) cNV,max and 1 condensing species with saturation concentration c*max. The goal is then to select c*max sufficiently high so that material in this bin is not in the aerosol phase. Donahue et al. (2006) suggest 1000cNV,max, which ensures that less than about 0.1% of material in the c*max bin partitions to the aerosol phase. A slightly more flexible approach is to include the CTM modeler’s tolerance for error in semi-volatile partitioning under high concentrations and low temperature. The mass balance for the semi-volatile species is given by Eq. (B.1). A relative accuracy goal, denoted by f is defined. At the high end of the expected concentration range in the model, the accuracy goal (mg m3) is 7fcOA,max (e.g., f ¼ 0.1 for 710% accuracy) caer;SV pf ðcNV;max þ caer;SV Þ.

(B.9)

If we assume that caer,SV is a small fraction of cNV,max then: ctot;SV pfcNV;max (B.10) caer;SV ¼ 1 þ cmax =cNV;max which can be solved for c*max yielding: ctot;SV  cNV;max . cmax X f

(B.11)

The ctot,SV term is difficult to estimate. It represents the total pool of semi-volatile compounds that need to be modeled as in the gas phase. One reasonable assumption is to assume for any given location, that gas-phase semi-volatile concentrations are proportional to organic aerosol (or vice versa), such that ctot,SV ¼ kcNV,max. This simplifies Eq. (B.11) to:   k 1 . (B.12) cmax XcNV;high f If we further assume that k is at least unity (the mass of semi-volatiles is at least the organic aerosol concentration) and the desired fractional error is small, then: k cmax X cNV;high , f

(B.13)

where k is an unknown ratio of the semi-volatile gases to the organic aerosol. Taking values from Mexico City, we find a total VOC concentration of 2000 mg m3 (Edgerton et al., 1999) and an organic aerosol concentration of 20 mg m3 (Salcedo et al., 2006). If 5% of the VOC is semivolatile, then k ¼ 5 for Mexico City. Applying a temperature correction similar to the one for c*min (but in the

ARTICLE IN PRESS C.O. Stanier et al. / Atmospheric Environment 42 (2008) 2276–2299

other direction) k T min cmax ðT ref ÞX cNV;max f T ref





 DH 1 1 exp  . R T min T ref (B.14)

Using Tref of 25 1C, DH of 100 kJ mol1, k ¼ 5, and neglecting Tmin/Tref, Eq. (B.13) reduces to:   5 12; 000  cmax;298 K  cNV;max exp  40:27 . (B.15) f T min If the fractional accuracy goal is 10%, cNV,max is 100 mg m3, and Tmin is 0 1C then c*max(298 K)E 200,000 mg m3, or 200 mg m3. Thus the total multiplier to cNV,max in this case is 2000, comparable to the multiplier of 1000 in Donahue et al. (2006). Appendix C. Algorithm for the Monte-Carlo confidence intervals 1. Create a vector cOA,model (length p20) of aerosol concentrations, logarithmically spaced from cOA,min to cOA,max (the maximum and minimum expected aerosol concentrations in the modeling application). 2. Create a vector (length j) of temperatures from Tmin to Tmax. In this work, j is selected for spacing between elements of 10 1C. 3. Let the experimental yields be a vector n, and the fitted predictions at the experimental values _ of cOA and T are the vector n d, where the values d are from the CI of the asymptotic method (e.g., MATLAB nlpredci function). All _ these vectors (n, n d, and d) will have length of m corresponding to number of experiments. 4. The experimental data are shifted by d such that: nhigh ¼ n+d and nlow ¼ nd. 5. A reference error value, corresponding to how well the fit goes through the data, is noted: X MSEref ¼ ðxi  x^ i Þ2 =m. 6. Initialize j upper CI vectors (each of length p), one for each temperature. Set each element to zero. 7. Initialize j lower confidence interval vectors (each of length p), one for each temperature. Set each element to a large value (e.g. 999). 8. Create a vector (length k) of DH values. In this work, k was usually selected for a separation of 5 kJ mol1 and a range from 10 to 120 kJ mol1. The outer loop (steps 9–16) is repeated 4k times.

2297

For each value of DH, the 4 runs are for upper CI (allowing random selection of low volatility yields); lower CI (allowing random selection of low volatility yields); upper CI (allowing random selection of high volatility yields); and lower CI (allowing random selection of high volatility yields). Outer Loop 9. Since for any iteration DH is fixed, the left-hand matrix of Eq. (13) (C) is calculated and stored. This speeds up the subsequent calculation of yields given any vector a. 10. The n yields a are divided into 2 groups, those that will be selected randomly, and those that will be fit. If the iteration involves randomized low volatility yields, then some number of the lowest volatility species (e.g. a1 and a2) are selected randomly. If the iteration involves randomized high volatility yields, then some of the highest volatility species (e.g., an1 and an) are chosen randomly. In this work, the number of low and high volatility yields to be randomized were selected by the user on a case by case basis, with the goal of only selecting yields that are poorly constrained by the experimental data. The typical n ¼ 7 case required randomization of a1, a2 and a6, a7. Inspection of the error covariance matrix and the standard errors on the original fitted yields will help in selecting which yields to randomize. In the demonstration case of Fig. 3 (Section 3.1), the a’s with the 5 largest standard errors were selected, or elements 1, 2 (uncertain low volatility compounds) and 6, 7 and 8 (uncertain high volatility compounds). The range of experimental data can serve as a guide for which terms are poorly constrained. If a given c* value is outside the range of experimental data (in this case from 7 to 525 mg m3), then the corresponding a value should probably be randomized. Erring on the conservative side, and trying to select a values randomly that are in fact well constrained by the data does not effect the quality of the result. Rather it adds run time as calculations are performed that are of no subsequent use in determining the CIs. 11. Let the number of random a values be q and the number to be fit equals nq. Let the fixed yields be denoted as aq and the variable yields as anq. Also let the C matrix (from Eq. (13)) be split into an m  q matrix Cq corresponding to the columns of the fixed yields, and a matrix Cnq

ARTICLE IN PRESS 2298

C.O. Stanier et al. / Atmospheric Environment 42 (2008) 2276–2299

[size m  (nq)] corresponding to the columns of the adjustable yields. Inner Loop 12. 1000 random combinations of the random a values are chosen. 13. For each of the 1000 cases, the remaining nq yields are calculated by solving a modified version of Eq. (13). The adjustable yields anq are determined by solving Cnq anq ¼ n  Cq aq with anq X0, where x* is xhigh when calculating an upper CI and x* is xlow when calculating a lower CI. 14. For each of the 1000 cases, the mean squared error 2 P is calculated MSE ¼ xi  x^ i ðaq ; anq Þ =m:. If MSEpMSE_ref then it is a suitably good fit. Fitted values n d are calculated at all j  p combinations of temperature and organic aerosol concentration. For upper CI, CIhigh ¼ _ max(CIhigh, n d),_ and for the lower CI, CIlow ¼ min(CIlow, n d) where the comparison is on an element by element basis. 15. If the CI moved more than a present tolerance, repeat the inner loop by going to step 12. 16. Loop back to step 9 until all 4k cases are completed. 17. For each temperature of interest, upper and lower confidence interval vectors are now calculated using the Monte-Carlo method. They should be inspected graphically and compared to the limits from the asymptotic method. Appendix D. Supplementary materials The online version of this article contains additional supplementary data. Please visit doi:10.1016/ j.atmosenv.2007.12.042. References An, W.J., Pathak, R.K., Lee, B.H., Pandis, S.N., 2007. Aerosol volatility measurement using an improved thermodenuder: application to secondary organic aerosol. Journal of Aerosol Science 38 (3), 305–314. Bian, F., Bowman, F.M., 2002. Theoretical method for lumping multicomponent secondary organic aerosol mixtures. Environmental Science and Technology 36 (11), 2491–2497. Bilde, M., Pandis, S.N., 2001. Evaporation rates and vapor pressures of individual aerosol species formed in the atmospheric oxidation of alpha- and beta-pinene. Environmental Science and Technology 35 (16), 3344–3349. Cocker, D.R., Clegg, S.L., Flagan, R.C., Seinfeld, J.H., 2001. The effect of water on gas-particle partitioning of secondary organic aerosol. Part I: alpha-pinene/ozone system. Atmospheric Environment 35 (35), 6049–6072.

Cross, E.S., et al., 2007. Laboratory and ambient particle density determinations using light scattering in conjunction with aerosol mass spectrometry. Aerosol Science and Technology 41 (4), 343–359. DeCarlo, P.F., Slowik, J.G., Worsnop, D.R., Davidovits, P., Jimenez, J.L., 2004. Particle morphology and density characterization by combined mobility and aerodynamic diameter measurements. Part 1: theory. Aerosol Science and Technology 38 (12), 1185–1205. Donahue, N.M., Robinson, A.L., Stanier, C.O., Pandis, S.N., 2006. Coupled partitioning, dilution, and chemical aging of semivolatile organics. Environmental Science and Technology 40 (8), 2635–2643. Edgerton, E.S., et al., 1999. Particulate air pollution in Mexico city: a collaborative research project. Journal of the Air & Waste Management Association 49 (10), 1221. Griffin, R.J., Cocker, D.R., Flagan, R.C., Seinfeld, J.H., 1999. Organic aerosol formation from the oxidation of biogenic hydrocarbons. Journal of Geophysical Research-Atmospheres 104 (D3), 3555–3567. Hallquist, M., Wangberg, I., Ljungstrom, E., Barnes, I., Becker, K.H., 1999. Aerosol and product yields from NO3 radicalinitiated oxidation of selected monoterpenes. Environmental Science and Technology 33 (4), 553–559. Hamilton, L.C., 1992. Regression with graphics. Wadsworth, Inc., Belmont, CA. Hatakeyama, S., Izumi, K., Fukuyama, T., Akimoto, H., 1989. Reactions of ozone with alpha-pinene and beta-pinene in airyields of gaseous and particulate products. Journal of Geophysical Research-Atmospheres 94 (D10), 13013–13024. Kamens, R.M., Jaoui, M., 2001. Modeling aerosol formation from alpha-pinene plus NOx in the presence of natural sunlight using gas-phase kinetics and gas-particle partitioning theory. Environmental Science and Technology 35 (7), 1394–1405. Kamens, R.M., Jang, M., Chien, C.J., Leach, K., 1999. Aerosol formation from the reaction of a-pinene and ozone using a gas-phase kinetics-aerosol partitioning model. Environmental Science and Technology 35, 1394–1405. Kanakidou, M., et al., 2005. Organic aerosol and global climate modelling: a review. Atmospheric Chemistry and Physics 5, 1053–1123. Kroll, J.H., Chan, A.W.H., Nga, N.L., Flagan, R.C., Seinfeld, J.H., 2007. Reactions of semivolatile organics and their effect on secondary organic aerosol formation. Environmental Science and Technology 41 (10), 3545–3550. Lee, A., et al., 2006. Gas-phase products and secondary aerosol yields from the photooxidation of 16 different terpenes. Journal of Geophysical Research-Atmospheres 111 (D17). Liang, C.K., Pankow, J.F., Odum, J.R., Seinfeld, J.H., 1997. Gas/particle partitioning of semivolatile organic compounds to model inorganic, organic, and ambient smog aerosols. Environmental Science and Technology 31 (11), 3086–3092. Lurmann, F.W., et al., 1997. Modelling urban and regional aerosols.2. Application to California’s south coast air basin. Atmospheric Environment 31 (17), 2695–2715. MathWorks, 2002. MATLAB. Ng, N.L., et al., 2006. Contribution of first- versus secondgeneration products to secondary organic aerosols formed in the oxidation of biogenic hydrocarbons. Environmental Science and Technology 40 (7), 2283–2297.

ARTICLE IN PRESS C.O. Stanier et al. / Atmospheric Environment 42 (2008) 2276–2299 Ng, N.L., et al., 2007. Secondary organic aerosol formation from m-xylene, toluene, and benzene. Atmospheric Chemistry and Physics 7 (14), 3909–3922. Nojgaard, J.K., Bilde, M., Stenby, C., Nielsen, O.J., Wolkoff, P., 2006. The effect of nitrogen dioxide on particle formation during ozonolysis of two abundant monoterpenes indoors. Atmospheric Environment 40 (6), 1030–1042. Odum, J.R., et al., 1996. Gas/particle partitioning and secondary organic aerosol yields. Environmental Science and Technology 30 (8), 2580–2585. Odum, J.R., Jungkamp, T.P.W., Griffin, R.J., Flagan, R.C., Seinfeld, J.H., 1997. The atmospheric aerosol-forming potential of whole gasoline vapor. Science 276 (5309), 96–99. Offenberg, J.H., Kleindienst, T.E., Jaoui, M., Lewandowski, M., Edney, E.O., 2006. Thermal properties of secondary organic aerosols. Geophysical Research Letters 33, L03816. Pandis, S.N., Harley, R.A., Cass, G.R., Seinfeld, J.H., 1992. Secondary organic aerosol formation and transport. Atmospheric Environment Part A-General Topics 26 (13), 2269–2282. Pankow, J.F., 1994a. An absorption-model of gas–particle partitioning of organic-compounds in the atmosphere. Atmospheric Environment 28 (2), 185–188. Pankow, J.F., 1994b. An absorption-model of the gas aerosol partitioning involved in the formation of secondary organic aerosol. Atmospheric Environment 28 (2), 189–193. Pathak, R.K., et al., 2007a. Ozonolysis of a-pinene: parameterization of secondary organic aerosol mass fraction. Atmospheric Chemistry and Physics 7 (14), 3811–3821. Pathak, R.K., Stanier, C.O., Donahue, N.M., Pandis, S.N., 2007b. Ozonolysis of alpha-pinene at atmospherically relevant concentrations: temperature dependence of aerosol mass fractions (yields). Journal of Geophysical Research-Atmospheres 112 (D03201) (doi:10.1029/2006JD007436). Philippin, S., Wiedensohler, A., Stratmann, F., 2004. Measurements of non-volatile fractions of pollution aerosols with an eight-tube volatility tandem differential mobility analyzer (VTDMA-8). Journal of Aerosol Science 35 (2), 185–203. Presto, A.A., Donahue, N.M., 2006. Investigation of alphapinene plus ozone secondary organic aerosol formation at low total aerosol mass. Environmental Science and Technology 40 (11), 3536–3543. Presto, A.A., Hartz, K.E.H., Donahue, N.M., 2005a. Secondary organic aerosol production from terpene ozonolysis. 1. Effect of UV radiation. Environmental Science and Technology 39 (18), 7036–7045.

2299

Presto, A.A., Hartz, K.E.H., Donahue, N.M., 2005b. Secondary organic aerosol production from terpene ozonolysis. 2. Effect of NOx concentration. Environmental Science and Technology 39 (18), 7046–7054. Pun, B.K., et al., 2003. Uncertainties in modeling secondary organic aerosols: three-dimensional modeling studies in Nashville/Western Tennessee. Environmental Science and Technology 37 (16), 3647–3661. Robinson, A.L., et al., 2007. Rethinking organic aerosols: semivolatile emissions and photochemical aging. Science 315 (5816), 1259–1262. Salcedo, D., et al., 2006. Characterization of ambient aerosols in Mexico City during the MCMA-2003 campaign with aerosol mass spectrometry: results from the CENICA Supersite. Atmospheric Chemistry and Physics 6, 925–946. Seber, G.A.F., Wild, C.J., 2003. Nonlinear Regression. Wiley, New York. Seinfeld, J.H., Pankow, J.F., 2003. Organic atmospheric particulate material. Annual Review of Physical Chemistry 54, 121–140. Song, C., Na, K.S., Cocker, D.R., 2005. Impact of the hydrocarbon to NOx ratio on secondary organic aerosol formation. Environmental Science and Technology 39 (9), 3143–3149. Stanier, C.O., Pathak, R.K., Pandis, S.N., 2007. Measurements of the volatility of aerosols from alpha-pinene ozonolysis. Environmental Science and Technology 41 (8), 2756–2763. Strader, R., Lurmann, F., Pandis, S.N., 1999. Evaluation of secondary organic aerosol formation in winter. Atmospheric Environment 33 (29), 4849–4863. Takekawa, H., Minoura, H., Yamazaki, S., 2003. Temperature dependence of secondary organic aerosol formation by photooxidation of hydrocarbons. Atmospheric Environment 37 (24), 3413–3424. Tsigaridis, K., Kanakidou, M., 2003. Global modelling of secondary organic aerosol in the troposphere: a sensitivity analysis. Atmospheric Chemistry and Physics 3, 1849–1869. Turpin, B.J., Lim, H.J., 2001. Species contributions to PM2.5 mass concentrations: revisiting common assumptions for estimating organic mass. Aerosol Science and Technology 35 (1), 602–610. Yu, J.Z., Cocker, D.R., Griffin, R.J., Flagan, R.C., Seinfeld, J.H., 1999. Gas-phase ozone oxidation of monoterpenes: gaseous and particulate products. Journal of Atmospheric Chemistry 34 (2), 207–258.