Journal of Colloid and Interface Science 238, 167–176 (2001) doi:10.1006/jcis.2001.7496, available online at http://www.idealibrary.com on
Adsorption-Energy Distribution of Heterogeneous Surfaces Predicted by Projections onto Convex Sets

Thomas Hocker, Grigoriy L. Aranovich, and Marc D. Donohue¹

Department of Chemical Engineering, The Johns Hopkins University, Baltimore, Maryland 21218

Received October 17, 2000; accepted February 22, 2001
The method of projections onto convex sets (POCS) is used to calculate the adsorption-energy distribution function from the adsorption integral (using a modified Langmuir local isotherm) for energetically heterogeneous surfaces. The POCS method, originally developed in the 1960s, has been applied successfully for many years to estimation problems, mainly in the fields of image processing, signal recovery, and optics. It allows one to incorporate into an iteration scheme available information about the experimental data and the measurement error as well as a priori constraints (such as nonnegativity) based on physical reasoning. It is important to note that the POCS method does not lead to a unique “optimum” solution. Rather, a feasible solution that is consistent with all imposed constraints is found within a “solution space.” The “size” of this solution space depends on how large the measurement errors are; it also depends on the accuracy of the error statistics and the number and significance of the a priori constraints used. In several examples, the POCS method is used to recover energy distributions from simulated adsorption data containing normally distributed errors. The excellent recoveries obtained demonstrate the value of the POCS method as a robust and reliable tool for adsorption-integral inversions. © 2001 Academic Press

Key Words: POCS; adsorption energy distribution; heterogeneous surface; discrete ill-posed problem.
¹ To whom correspondence should be addressed. E-mail: [email protected].

1. INTRODUCTION

For all real sorbents, the energy change associated with the adsorption of a molecule (from a gas or liquid bulk phase) at a solid surface varies with location; i.e., the surface is energetically heterogeneous. For characterizing particular surfaces, it is therefore of great importance to determine the distribution of those adsorption energies from experimental adsorption data. A rigorous approach would require information about the arrangement of sorbent molecules on the surface (see, for example, (1) for a lattice model taking into account molecular-scale heterogeneities), information that is hardly available for real sorbents. A common “work-around” is available for patchwise heterogeneous surfaces, which consist of energetically homogeneous “islands” that are assumed to be large on a molecular scale. This allows one to apply a simple model for homogeneous adsorption to each island. An expression for the overall adsorption then follows by integration over the product of the local isotherm, Γ(T, ρ, ε, . . .), and an appropriate weighting function, f(ε) (2, 3):

    Γ̄(T, ρ) = ∫_{ε_min}^{ε_max} Γ(T, ρ, ε, . . .) f(ε) dε.   [1]

Since experimentally obtained adsorption data contain unavoidable measurement errors, n_ex, Γ̄(T, ρ) in Eq. [1] needs to be replaced by Γ_ex(T, ρ) − n_ex, which leads to

    Γ_ex(T, ρ) = ∫_{ε_min}^{ε_max} Γ(T, ρ, ε, . . .) f(ε) dε + n_ex.   [2]

For given experimental adsorption data, Γ_ex(T, ρ), given information about the experimental error, n_ex, and an assumed local adsorption model, Γ(T, ρ, ε, . . .), the goal is now to solve Eq. [2] for the energy distribution function, f(ε). To proceed further, consider the discretized version of Eq. [2],

    Γ_ex(T, ρ_i) = Σ_{j=1}^{M} w_j Γ(T, ρ_i, ε_j, . . .) f(ε_j) + n_qd + n_ex,   [3]

where {w_j} are the weights of the chosen quadrature rule and n_qd is the corresponding quadrature error (4). Furthermore, for equally spaced abscissas, ε_j ≡ ε_min + (j − 1)Δε, where Δε ≡ (ε_max − ε_min)/(M − 1). Introducing the simplified notation

    b_i ≡ Γ_ex(T, ρ_i),   A_ij ≡ w_j Γ(T, ρ_i, ε_j, . . .),   f_j ≡ f(ε_j),   n_i ≡ n_qd + n_ex,   [4]

Eq. [3], written in matrix notation, becomes

    b = Af + n.   [5]
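To make the discretization concrete, the sketch below (our illustration, not code from the paper; all names and parameter values are hypothetical) assembles the kernel matrix A of Eq. [4] from a Langmuir-type local isotherm (cf. Eq. [32]) with trapezoidal quadrature weights, simulates noisy data b = Af + n, and exhibits the ill-posedness discussed below via the singular values of A:

```python
import numpy as np

# Energy grid: M equally spaced abscissas on [eps_min, eps_max] (units of kT).
M, eps_min, eps_max = 50, 2.0, 12.0
eps = np.linspace(eps_min, eps_max, M)

# Trapezoidal quadrature weights w_j for the energy integral in Eq. [1].
d_eps = eps[1] - eps[0]
w = np.full(M, d_eps)
w[0] = w[-1] = 0.5 * d_eps

# Bulk densities rho_i at which the isotherm is "measured".
N = 40
rho = np.logspace(-8, -1, N)

# Langmuir-type local isotherm Gamma(rho_i, eps_j) (cf. Eq. [32], kT = 1).
K = np.exp(eps)
Gamma = rho[:, None] * (K / (1.0 + rho[:, None] * (K - 1.0)) - 1.0)

# Kernel matrix A_ij = w_j * Gamma(rho_i, eps_j) and data b = A f + n (Eq. [5]).
A = w * Gamma
f_true = np.exp(-(eps - 5.0)**2 / 0.5) + 0.5 * np.exp(-(eps - 9.0)**2 / 0.5)
rng = np.random.default_rng(0)
b = A @ f_true + rng.normal(0.0, 0.01 * np.abs(A @ f_true))  # ~1% Gaussian error

# Ill-posedness: the singular values of A decay over many orders of magnitude.
s = np.linalg.svd(A, compute_uv=False)
print("condition number of A:", s[0] / s[-1])
```

Running such a sketch makes the difficulty tangible: the condition number of A is enormous, so naive inversion amplifies the data noise without bound.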
Despite its apparent simplicity, solving Eq. [5] for f (almost always) represents a highly “ill-posed” problem for which it is
hopeless to find an exact solution. Indeed, a singular value decomposition of A (see (4) for details) usually reveals singular values gradually decaying to zero and a large condition number, i.e., a large ratio between the largest and smallest nonzero singular values (5). Consequently, there is an infinite number of (physical and unphysical) solutions f̂ which all lead to values b̂ practically indistinguishable from the original data (6, 7).

2. CONVENTIONAL REGULARIZATION
To find a useful and stable f̂, a common strategy is to incorporate a priori knowledge about the expected solution (5). A widely used method is constrained least-squares minimization² (CLSM, also known as Phillips–Twomey or Tikhonov–Miller regularization (4)), which postulates the minimization problem

    ‖r‖² + λ²‖Bf̂‖² → min.   [6]

Here r ≡ Af̂ − b is the residual vector, ‖·‖ denotes the Euclidean norm, λ is an adjustable regularization parameter (or Lagrange multiplier), and B is a matrix operator designed to reduce the ill-posed behavior. Often, B is the identity matrix, which yields the “lowest-energy” solution, i.e., the one with the smallest ‖f̂‖; this leads to “zeroth-order” regularization, which is equivalent to singular value decomposition (4). Another common choice for B is a second-order differential operator imposing a smoothness condition on f̂ (4, 6, 7). For a given value of λ > 0, Eq. [6] then has the unique solution

    f̂ = (AᵀA + λ²BᵀB)⁻¹Aᵀb,   [7]
where the matrix inversion now can be carried out by standard techniques such as LU decomposition (4). The “art” of CLSM is to find a value of λ that regularizes just enough to produce a stable solution, without increasing ‖r‖² so much that the results become physically meaningless. Methods for finding λ range from fairly objective (for example, incorporating statistical knowledge of the measurement errors into an optimization problem for λ) to entirely subjective (for example, visual comparison of results for several λ values). Also, combining Eq. [7] with further a priori constraints (such as nonnegativity) is not a straightforward matter and usually leads to problems much more computationally demanding than solving Eq. [7] (8, 9). Generally, CLSM lacks the flexibility to incorporate all available information into the solution process in a straightforward manner.

3. PROJECTIONS ONTO CONVEX SETS
The method of projections onto convex sets (POCS) provides a very flexible framework for “narrowing down” the infinite solution space associated with ill-posed problems: an acceptable or feasible solution is found by incorporating all the information available about a certain system.

² See (8) for a review of the use of regularization methods for calculating adsorption-energy distributions.
Stark and Yang (10) note that the POCS method was developed in the 1960s by Gubin et al. (11), Bregman (12), Halperin (13), and Opial (14). It was further extended (mainly in the field of image restoration) in the late 1970s and early 1980s by Youla (15), Youla and Webb (16), and Levi and Stark (17) and has since experienced tremendous growth. (For further reading, an excellent textbook on vector space projections is provided by Stark and Yang (10); applications to problems in image recovery are given in (18); projection methods in the broader context of set-theoretic estimation are discussed by Combettes (19).) The POCS method allows one to incorporate into an iteration scheme available information about the experimental data and the measurement error together with a priori constraints (such as nonnegativity) based on physical reasoning. As Combettes points out (19): “Each piece of information is associated with a set in the solution space and the intersection of these sets, the feasibility set, represents the acceptable solutions . . . If some of the solutions are not acceptable, then it must be the case that the formulation fails to include some important constraint that has not been identified.” The algorithm is guaranteed to find a solution that is consistent with all imposed constraints as long as they can be represented by projections onto closed and convex³ sets in a Hilbert space⁴, H. However, even for nonconvex or inconsistent sets, projection methods have been shown to give meaningful results (10, 19). The fact that the POCS method does not lead to a unique “optimum” solution has been subject to some criticism. However, following again Combettes (19): “From a practical standpoint, because of the uncertainty that surrounds the specifications of most estimation problems, providing a region of acceptability for a solution rather than a single point seems more realistic.”

3.1. POCS Theory

As stated above, the POCS method allows one to incorporate an arbitrary number of constraints into an iteration scheme. This is done by projecting the solution vector⁵ x (at each iteration step k) onto all the sets associated with the chosen constraints. Consider now C1, C2, . . . , Cm, closed and convex sets in a Hilbert space H. For a nonempty intersection C0 = ∩_{i=1}^{m} Ci (i.e., for consistent constraints) the iteration scheme⁶

    x_{k+1} = (P1 P2 · · · Pm) x_k   [8]
³ A set C is said to be convex when for all vectors x1, x2 ∈ C, the linear combination x3 = αx1 + (1 − α)x2 for 0 ≤ α ≤ 1 is also a member of the set; i.e., x3 ∈ C.
⁴ A Hilbert space is an inner product space which is complete with respect to the norm induced by the inner product. An example of a Hilbert space is the Euclidean space Rⁿ with the inner product defined as ⟨x, y⟩ = Σ_i x_i y_i and the induced norm ‖x‖ = (Σ_i x_i²)^{1/2} for all x ∈ Rⁿ (10).
⁵ The simplified notation x instead of f̂ is used throughout the theory section.
⁶ Here the unrelaxed sequential version of the POCS algorithm is shown. Note that a relaxed version of Eq. [8] as well as parallel projection algorithms also exist; using those alternative algorithms can significantly improve the convergence speed (see (10, 19) for details).
converges to a point on the boundary of C0 (10). An example of a system with three constraints is shown schematically in Fig. 1. Each projection operator, Pi, i = 1, . . . , m, projects an arbitrary x ∈ H onto the closest point y on the boundary of the corresponding set Ci. Consequently, Pi is found by evaluating

    ‖x − Pi x‖ = min_{y ∈ Ci} ‖x − y‖.   [9]

FIG. 1. Example of three closed convex sets with a nonempty intersection, C0. The unrelaxed successive POCS algorithm, Eq. [8], converges to a feasible solution on the boundary of C0.

Among the most important constraints are those of the linear and soft-linear type. Linear sets are defined as

    Cli = {y ∈ Rⁿ : ⟨y, a⟩ = δ_i},   [10]

where δ_i represents some scalar property of the system, such as a data point. The associated projection operator, Pli, is given by (see the Appendix for details)

    Pli x = x                                 for x ∈ Cli,
    Pli x = x + [(δ_i − ⟨x, a⟩)/‖a‖²] a       for x ∉ Cli.   [11]

A famous example of a sequential projection method based on linear constraints is Kaczmarz’s algorithm, first published in 1937 (19). Figure 2 illustrates how it can be used to iteratively solve two coupled linear equations. As can be seen, the algorithm finds the solution by alternating projections onto the two linear equations x2 = x1/4 and x2 = 2 − x1. In agreement with the “shortest-distance” requirement, Eq. [9], the projections onto the respective sets are normal. Note also that the convergence speed strongly depends on the angle between the straight lines. In fact, in the limit where the straight lines become perpendicular to each other, the algorithm converges to the exact solution in a single iteration step.⁷

FIG. 2. Sequential projection algorithm for solving two coupled linear equations originally proposed by Kaczmarz in 1937 (19).

⁷ For some problems it is practical to first orthogonalize the set of linear constraints, resulting in a system that converges in a single iteration step. This can be done, for example, by Gram–Schmidt orthogonalization (for a numerically stable version see (25)).

A soft-linear constraint must be used when some scalar system property is only known approximately. For example, soft-linear constraints are used to incorporate statistical information calculated from single data-error samples, such as the mean and the variance. There, upper and lower bounds are specified for the expected values, expressing the level of confidence in the error statistics (20). Soft-linear sets are defined as

    Csl = {y ∈ Rⁿ : δ_i^l ≤ ⟨y, a⟩ ≤ δ_i^u},   [12]

where, instead of an exact system property δ_i, the upper and lower bounds δ_i^u and δ_i^l are used. The projection operator associated with Csl is given by

    Psl x = x                                   for x ∈ Csl,
    Psl x = x + [(δ_i^l − ⟨x, a⟩)/‖a‖²] a       for ⟨x, a⟩ < δ_i^l,
    Psl x = x + [(δ_i^u − ⟨x, a⟩)/‖a‖²] a       for ⟨x, a⟩ > δ_i^u.   [13]

Note that the derivation of Psl is similar to the derivation of Pli shown in the Appendix. For further details, see (10).

3.2. POCS Application

The method of projections onto convex sets is now applied to the problem of finding a meaningful adsorption-energy distribution function from given adsorption data; i.e., the goal is to find a feasible solution for the energy distribution vector f̂ in Eq. [5]. This is accomplished by incorporating all available information about the experimental data as well as physically sound a priori constraints. For example, inspection of Eq. [5] suggests that the properties of the residual vector r ≡ Af̂ − b should match all
known properties of the measurement errors, n.⁸ Trussell and Civanlar showed that distorted X-ray fluorescence signals can be restored when bounds are placed on the residual norm, on the residual mean, and on outliers of the residual (20). Their approach will be adopted in the following derivation of the various residual bounds.

3.2.1. Bounded Residual Norm

Due to the uncertainty involved in specifying residual bounds from single data-error samples, confidence intervals must be used. The upper and lower bounds on the residual norm, δv^u and δv^l respectively, can be estimated from the norm of a single data-error sample, ‖n‖, with δv^l < ‖n‖ < δv^u. Since for zero mean ‖n‖² is the sample variance, i.e., s² = ‖n‖², it has a (nonsymmetric) χ² distribution.⁹ The upper and lower bounds on the residual norm are then estimated from (21)

    δv^{u/l} = sqrt[ (N − 1)‖n‖² / χ²_{[(1∓α)/2](N−1)} ],   [14]

where N is the sample size and χ² is the distribution for (N − 1) degrees of freedom and a confidence limit specified by α ∈ (0, 1). Note that α expresses the user’s “belief” in the accuracy of the derived error statistics (for example, α = 0.95 is the 95% confidence limit). With δv^u and δv^l specified, the set enforcing upper and lower bounds on the residual norm is defined as

    Cv = {f̂ ∈ R^M : δv^l ≤ ‖Af̂ − b‖ ≤ δv^u}.   [15]

The associated projection operator, Pv, is derived from evaluating Eq. [9] (see (20) for details) and can be written as (22)

    Pv f̂ = f̂                                  for f̂ ∈ Cv,
    Pv f̂ = g_γ(f̂) with ‖r_γ‖ = δv^l           for ‖r‖ < δv^l,
    Pv f̂ = g_γ(f̂) with ‖r_γ‖ = δv^u           for ‖r‖ > δv^u,   [16]

where

    g_γ ≡ (I + γAᵀA)⁻¹(γAᵀb + f̂),  γ > 0,   [17]

and ‖r_γ‖ ≡ ‖Ag_γ − b‖. Consequently, for each projection of f̂ ∉ Cv, one must find γ iteratively by solving Eq. [17] until the condition ‖r_γ‖ = δv^{u/l} is met. Fortunately, ‖r_γ‖² is a monotonically decreasing function of γ for γ > 0, so the computation can be performed by a Newton–Raphson scheme or by neural nets (22).

3.2.2. Bounded Residual Mean

The upper and lower bounds on the residual mean, δm^u and δm^l respectively, can be estimated from the sum of the errors of a single data-error sample, Σ_i n_i, with δm^l < Σ_i n_i < δm^u. Since the mean of an error sample, Σ_i n_i / N, has a Student t distribution,¹⁰ the bounds on the residual sum, Σ_i r_i, are given by (21)

    δm^{u/l} = Σ_i n_i ± √N ‖n‖ t_{[(1+α)/2](N−1)},   [18]

where t is the Student t distribution for (N − 1) degrees of freedom and α ∈ (0, 1) specifies the confidence limits. Note that the set enforcing upper and lower bounds on Σ_i r_i = Σ_i (b_i − [Af̂]_i) is defined as

    Cm = {f̂ ∈ R^M : δm^l ≤ Σ_i (b_i − [Af̂]_i) ≤ δm^u}.   [19]

The set Cm is of the soft-linear type, which has the general form shown in Eq. [12] and the projection operator given by Eq. [13]. The relationship between Eqs. [19] and [12] becomes obvious when Σ_i (b_i − [Af̂]_i) is replaced by Σ_i b_i − ⟨f̂, a*⟩, where a* is a vector whose elements are given by the sums over the column elements of A; i.e.,

    a* ≡ {Σ_i [A]_{i1}, Σ_i [A]_{i2}, . . . , Σ_i [A]_{iM}}ᵀ.   [20]

The operator projecting onto Cm then can be derived from Eq. [13] as

    Pm f̂ = f̂                                        for f̂ ∈ Cm,
    Pm f̂ = f̂ + [(Σ_i r_i − δm^l)/‖a*‖²] a*          for Σ_i r_i < δm^l,
    Pm f̂ = f̂ + [(Σ_i r_i − δm^u)/‖a*‖²] a*          for Σ_i r_i > δm^u.   [21]

3.2.3. Bounded Residual Outliers

Following Trussell and Civanlar (20): “The individual elements of the residual should have the same probability distribution as the noise . . . Thus, it is possible to check for values in the residual which deviate an unlikely amount from the mean.” Here we use the “three-sigma limit” to impose bounds on individual residual elements, which can be written as

    δo^{u/l} = δm^{u/l}/N ± 3δv^{u/l}.   [22]

Here δv^{u/l} is given by Eq. [14] and δm^{u/l} by Eq. [18]. The set enforcing upper and lower bounds on each r_i = b_i − [Af̂]_i, i = 1, . . . , N, is defined as

    Co = {f̂ ∈ R^M : δo^l ≤ b_i − [Af̂]_i ≤ δo^u}.   [23]
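As a worked illustration of Eqs. [14], [18], and [22], the sketch below (our own, not from the paper; parameter values are hypothetical) computes all three pairs of residual bounds from a single simulated error sample, using the `scipy.stats` percent-point functions for the χ² and Student-t quantiles:

```python
import numpy as np
from scipy.stats import chi2, t

rng = np.random.default_rng(1)
N, sigma, alpha = 40, 0.01, 0.95          # sample size, noise level, confidence
n = rng.normal(0.0, sigma, N)             # a single data-error sample
norm_n = np.linalg.norm(n)

# Eq. [14]: chi-squared bounds on the residual norm (the upper bound uses the
# lower quantile (1 - alpha)/2, and vice versa).
delta_v_u = np.sqrt((N - 1) * norm_n**2 / chi2.ppf((1 - alpha) / 2, N - 1))
delta_v_l = np.sqrt((N - 1) * norm_n**2 / chi2.ppf((1 + alpha) / 2, N - 1))

# Eq. [18]: Student-t bounds on the residual sum.
t_val = t.ppf((1 + alpha) / 2, N - 1)
delta_m_u = n.sum() + np.sqrt(N) * norm_n * t_val
delta_m_l = n.sum() - np.sqrt(N) * norm_n * t_val

# Eq. [22]: "three-sigma" bounds on the individual residual elements.
delta_o_u = delta_m_u / N + 3.0 * delta_v_u
delta_o_l = delta_m_l / N - 3.0 * delta_v_l

print(delta_v_l < norm_n < delta_v_u)     # the sample norm lies inside its bounds
```

A large α (here 0.95) gives wide, conservative bounds; as discussed in Section 4, an α close to zero expresses high confidence in the error statistics and tightens the bounds.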
⁸ Strictly, n represents the sum of the quadrature and measurement errors, n_qd + n_ex. In practice, however, n_ex ≫ n_qd, so that n ≈ n_ex holds.
⁹ χ² can be approximated by the normal distribution for large sample sizes. Values for χ² can be obtained from statistics tables.
¹⁰ Similar to the χ² distribution, the Student t distribution also can be approximated by the normal distribution for large sample sizes.
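Because ‖r_γ‖ is monotonically decreasing in γ, the projection Pv of Eqs. [16] and [17] can also be computed by simple bracketing and bisection rather than Newton–Raphson. The following sketch (ours, not the authors' implementation; names are hypothetical) handles the common case ‖r‖ > δv^u:

```python
import numpy as np

def project_residual_norm(f, A, b, delta_u, iters=60):
    """Upper-bound part of P_v, Eq. [16]: if ||A f - b|| > delta_u, return
    g_gamma of Eq. [17] with gamma chosen such that ||r_gamma|| = delta_u."""
    if np.linalg.norm(A @ f - b) <= delta_u:
        return f                                   # already satisfies the bound
    AtA, Atb = A.T @ A, A.T @ b
    I = np.eye(A.shape[1])

    def res_norm(gamma):                           # ||r_gamma|| = ||A g_gamma - b||
        g = np.linalg.solve(I + gamma * AtA, gamma * Atb + f)
        return np.linalg.norm(A @ g - b)

    lo, hi = 0.0, 1.0
    while res_norm(hi) > delta_u and hi < 1e12:    # ||r_gamma|| decreases with gamma
        hi *= 10.0
    for _ in range(iters):                         # bisection on the monotone curve
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if res_norm(mid) > delta_u else (lo, mid)
    gamma = 0.5 * (lo + hi)
    return np.linalg.solve(I + gamma * AtA, gamma * Atb + f)
```

The bracketing cap guards against a bound that is unreachable for γ > 0 (i.e., when δv^u is below the least-squares residual); the lower-bound case ‖r‖ < δv^l would require moving in the opposite direction and is omitted from this sketch.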
As for Cm, the set Co is also of the soft-linear type. This becomes obvious when b_i − [Af̂]_i in Eq. [23] is rewritten as b_i − ⟨f̂, a_i⟩, where a_i is a vector representing the ith row of A; i.e.,

    a_i ≡ {A_{i1}, A_{i2}, . . . , A_{iM}}ᵀ.   [24]

The operator projecting onto Co can be derived from Eq. [13] as

    Po f̂ = f̂                                     for f̂ ∈ Co,
    Po f̂ = f̂ + [(r_i − δo^l)/‖a_i‖²] a_i         for r_i < δo^l,
    Po f̂ = f̂ + [(r_i − δo^u)/‖a_i‖²] a_i         for r_i > δo^u.   [25]

3.2.4. A Priori Constraints

In the previous sections the constraint sets Cv, Cm, and Co were derived based on information about the experimental data as well as the data error. However, in order to find a feasible solution from the smallest possible solution space, additional a priori constraints need to be included.

Nonnegativity. Trussell and Civanlar found that nonnegativity imposes a strong constraint on the recovery of distorted X-ray fluorescence signals (20). In general, nonnegativity is a very powerful constraint when the expected distribution function has many values close to zero, as is clearly the case for typical adsorption-energy distributions. The corresponding set onto which the solution vector is projected is defined as

    Cn = {f̂ ∈ R^M : f̂_j ≥ 0, j = 1, . . . , M},   [26]

and the corresponding projection operator simply becomes

    [Pn f̂]_j = f̂_j   for f̂_j ≥ 0,
    [Pn f̂]_j = 0     for f̂_j < 0,   j = 1, . . . , M.   [27]

Constant area. If the integral over the distribution function, δ_a = ∫_{ε_min}^{ε_max} f(ε) dε, is known, this too can be formulated as a projection onto a convex set. As in Section 1, consider the discrete integral formulation

    δ_a = Σ_{j=1}^{M} w_j f(ε_j) = ⟨w, f̂⟩,   [28]

which leads to the (linear) constant-area set

    Ca = {f̂ ∈ R^M : ⟨w, f̂⟩ = δ_a}.   [29]

The corresponding projection operator follows from Eq. [11] as

    Pa f̂ = f̂                                   for f̂ ∈ Ca,
    Pa f̂ = f̂ + [(δ_a − ⟨w, f̂⟩)/‖w‖²] w         for f̂ ∉ Ca.   [30]

Of course, if δ_a is only known approximately, upper and lower bounds must be specified and the soft-linear formulation, Eq. [12], needs to be applied.
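The remaining projections are essentially one-liners. Below is a sketch (our illustration; the helper names are hypothetical) of the nonnegativity projection Eq. [27], the constant-area projection Eq. [30], and a generic soft-linear projection Eq. [13], which yields Pm (Eq. [21]) and Po (Eq. [25]) once the residual bounds are converted into bounds on the corresponding inner product ⟨f̂, a⟩:

```python
import numpy as np

def project_nonneg(f):
    """P_n, Eq. [27]: clip negative components of f to zero."""
    return np.maximum(f, 0.0)

def project_area(f, w, delta_a):
    """P_a, Eq. [30]: project onto the linear set <w, f> = delta_a."""
    return f + (delta_a - w @ f) / (w @ w) * w

def project_soft_linear(f, a, delta_l, delta_u):
    """Generic soft-linear projection, Eq. [13], onto
    {f : delta_l <= <f, a> <= delta_u}."""
    s = a @ f
    if s < delta_l:
        return f + (delta_l - s) / (a @ a) * a
    if s > delta_u:
        return f + (delta_u - s) / (a @ a) * a
    return f

# Composing projections sequentially gives one sweep of Eq. [8]
# (cf. the combination used later in Eq. [33]):
f = np.array([-1.0, 2.0, 3.0])
w = np.ones(3)
f = project_area(project_nonneg(f), w, 6.0)
print(f)                                  # the area <w, f> is now exactly 6
```

Note that each projection is the shortest-distance move of Eq. [9]; composing them sequentially, rather than enforcing all sets at once, is exactly what makes the POCS iteration so easy to extend with additional constraints.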
Smoothness. Another common (though physically less rigorous) a priori constraint is to impose a certain smoothness on f̂. Within the framework of POCS, smoothness conditions based on the first and second derivatives of f̂ have been used to estimate probability density functions (pdf’s) (23). However, it should be noted that the smoothness condition used within the POCS framework differs from the one used in Tikhonov regularization methods. While the latter minimizes a certain derivative of f̂ (subject to additional constraints), the former constrains a certain derivative of f̂ to specified upper and lower bounds using the linear set formulation, Eq. [10]. Consequently, smoothness constraints used within the POCS framework can contribute only when the upper and lower bounds for the derivatives can be estimated with sufficient accuracy, as is the case for estimating pdf’s from data histograms (23). However, it is unlikely that this is also possible for estimating adsorption-energy distributions.

4. RESULTS
The POCS method, as outlined in Section 3, is now applied to the adsorption integral, Eq. [5], using a modified Langmuir local isotherm. The latter follows from the regular solution monolayer adsorption equation (24),

    Γ = ρ_b { 1 / [ ρ_b + (1 − ρ_b) exp( (ε_mm(ρ_b − 4Γ) − ε)/kT ) ] − 1 },   [31]

where ε_mm denotes fluid–fluid and ε fluid–surface interaction energies. Note that for small to moderate pressures, ρ_b ≈ P/P_sat, and ε_mm, ε > 0, corresponding to attractive interactions. Much of the information about adsorption-energy distributions is obtained at very low pressures, where fluid–fluid interactions, ε_mm, can be neglected. In this limit, Eq. [31] simplifies to

    Γ = ρ_b { exp(ε/kT) / [ 1 + ρ_b (exp(ε/kT) − 1) ] − 1 }.   [32]
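The local isotherm Eq. [32] and its Langmuir limit are straightforward to evaluate numerically. A brief sketch (ours, not from the paper, with ε measured in units of kT so that kT = 1):

```python
import numpy as np

def gamma_local(rho_b, eps):
    """Modified Langmuir local isotherm, Eq. [32]; eps in units of kT."""
    K = np.exp(eps)
    return rho_b * (K / (1.0 + rho_b * (K - 1.0)) - 1.0)

def gamma_langmuir(rho_b, eps):
    """Langmuir limit of Eq. [32], valid when exp(eps/kT) >> 1."""
    K = np.exp(eps)
    return rho_b * (K / (1.0 + rho_b * K) - 1.0)

# For strongly attractive sites the two expressions nearly coincide:
print(gamma_local(1e-3, 10.0), gamma_langmuir(1e-3, 10.0))
```

For ε = 10 kT the two values agree to better than 0.1%, which is why the Langmuir model is an adequate kernel at low pressure and strong fluid–surface attraction.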
Note that the Langmuir model is related to Eq. [32] through the further assumption exp(ε/kT) ≫ 1. To analyze the performance of the POCS algorithm, Eq. [8], simulated adsorption data have been obtained from Eq. [2] using Eq. [32] as the local isotherm. Simulated measurement errors have been added by generating normally distributed random numbers based on the Box–Muller method (4). The specified maximum data errors were identified with the “three-sigma limit” of each data point. Figure 3 shows POCS results in which the predictions from the following combinations of constraints are compared to each other: set 1 = Cv, Cm, Co; set 2 = Cv, Cm, Co, Cn; and set 3 = Cv, Cm, Co, Cn, Ca. Set 1 shows typical oscillations similar to those obtained from conventional regularization methods. Those oscillations often make it impossible to judge whether small peaks are real or not. This undesirable behavior is a consequence of the missing nonnegativity constraint, which is crucial for the recovery of distribution functions with many
values close to zero.¹¹ As expected, set 3, which incorporates the most constraints, gives the best results. However, the predictions from set 2 are very close to those obtained from set 3, suggesting that the constant-area constraint, Ca, has little effect. This is confirmed by the fact that the area under the curve predicted from set 2 is in error by only approximately 2%.

FIG. 3. Energy distribution functions, f(ε), predicted from simulated adsorption data with 3% Gaussian error at 280 K. POCS predictions are shown for three different combinations of constraints: set 1 = Cv, Cm, Co; set 2 = Cv, Cm, Co, Cn; set 3 = Cv, Cm, Co, Cn, Ca.

Since the area under the distribution function, and therefore δ_a, is generally not known, all further calculations were performed with the algorithm

    f̂_{k+1} = (Pn Po Pm Pv) f̂_k,   [33]

with f̂_{k=1} = {0, . . . , 0}ᵀ as the initial guess.

FIG. 4. Adsorption data at T = 78 K with 5% error and the corresponding energy distribution function. The isotherm represented by the solid line in (a) has been calculated from the predicted distribution function shown in (b).

Once a distribution function has been obtained from given adsorption data, the question arises how the adsorption isotherm calculated from the predicted distribution compares with the original data. This point is illustrated in Fig. 4, where the distribution in Fig. 4b, predicted from adsorption data at T = 78 K with 5% error, leads to the adsorption isotherm represented by the solid line in Fig. 4a. As expected, the predicted isotherm provides a “best fit” to the scattered original data. Figure 5a shows simulated adsorption data at T = 78 K with a normally distributed maximum data error of 2%. Also shown, in Fig. 5b, are the predictions of Eq. [33] after i = 1 and i = 20 iterations. Surprisingly, even after the first iteration step, the POCS algorithm correctly predicts the number and location of the given peaks. Note also that the area under the predicted distribution function deviates from the exact value by only about 0.5%. However, there is very little improvement when going to 20 iterations, for which Eq. [33] has almost converged. This can be seen from Figs. 6a and 6b, where the variation of the residual norm, ‖r‖, and the residual sum, Σ_i r_i, with the number of iterations is shown. The dashed horizontal lines represent the residual bounds (or confidence limits) obtained from an error analysis for α = 0.02; i.e., the error statistics were assumed to

¹¹ However, when the assumed local isotherm, Γ(T, ρ, ε, . . .), contradicts the real adsorption behavior, the application of the nonnegativity constraint might lead to additional artificial positive peaks in the distribution function.
FIG. 5. (a) Adsorption isotherm at T = 78 K with 2% data error; (b) predicted energy distribution functions, f (ε), after i = 1 and 20 iterations.
FIG. 6. Variation with the number of iterations of (a) the residual norm, ‖r‖, and (b) the residual sum, Σ_i r_i. The dashed horizontal lines show the confidence limits obtained from error statistics for α = 0.02.

FIG. 8. Energy distribution functions, f(ε), predicted from simulated adsorption data with 1 and 5% Gaussian errors at 78 K. The POCS predictions are based on Eq. [33].
be very accurate. One sees that the residual norm, as well as the residual sum, reaches its bound asymptotically. From a practical standpoint, the results shown in Figs. 5 and 6 suggest that satisfying recoveries often can be obtained long before convergence (in the mathematical sense) is reached. In Fig. 7, predictions of Eq. [33] are presented for simulated adsorption data at T = 78 K with 0 and 5% Gaussian error. The given distribution is perfectly recovered when there is no
FIG. 7. Energy distribution functions, f(ε), predicted from simulated adsorption data at T = 78 K with 0 and 5% Gaussian errors. 100 data points were used from within the density range 10⁻¹⁶–10⁻⁴. The POCS predictions are based on Eq. [33].
measurement error involved and even for 5% error the number and location of peaks are predicted correctly. However, when the given peaks become smaller, only “good enough” data (i.e., only small errors in the measurements) ensure a sufficient resolution. This is illustrated in Fig. 8, where for 5% error the two small peaks are no longer resolved, while for 1% error there still is excellent agreement with the given distribution. Figures 9 and 10 illustrate the importance of the availability of data points at the right “location”; i.e., for a full recovery of f , it is crucial to have a sufficient number of data points carrying information about all the important features of the underlying distribution. In this example, there are adsorption data missing at very low densities, as seen from Fig. 9. Consequently, the highest energy peak of the given f shown in Fig. 10 cannot be recovered. Instead, the POCS algorithm predicts a constant nonzero “background term.” The temperature effect on the accuracy of the recovered distribution is discussed in Figs. 11 and 12. Figure 11 shows adsorption data at 78, 120, and 150 K with 2% Gaussian error, and in Fig. 12 the resulting distribution functions based on the
FIG. 9. Adsorption isotherm with 0% Gaussian error at 78 K calculated from the given energy distribution shown in Fig. 10. Since the adsorption data at very low densities are missing, the highest energy peak of the distribution function cannot be recovered.
FIG. 10. Energy distribution function, f(ε), predicted (using the constraint sets Cv, Cm, Co, Cn) from simulated adsorption data with 0% Gaussian error at 78 K. The POCS predictions are based on the incomplete adsorption isotherm shown in Fig. 9. Consequently, the highest energy peak cannot be recovered.
FIG. 11. Adsorption isotherms with 2% Gaussian error at 78, 120, and 150 K calculated from the given energy distribution shown in Fig. 12.
FIG. 13. Energy distribution functions, f(ε), predicted from simulated adsorption data with 2% Gaussian error at 78 K. POCS predictions are shown for two different values, α1 = 0.5 and α2 = 0.99999. Note that α ∈ [0, 1] expresses the belief in the correctness of the error statistics.
POCS algorithm are provided. As expected, there is less agreement between the given and predicted distributions at higher temperatures, indicating the influence of entropy. Since entropy always scales with temperature, higher temperatures increase the entropy effect (relative to the energy effect) on the adsorption behavior. In fact, when T → ∞ (i.e., when energy effects are unimportant), the density of molecules in the adsorbed space will equal the bulk density, making it impossible to obtain any information about surface heterogeneity. It is important to note that both the accuracy of the adsorption data and the precise determination of the statistics of the measurement errors, which impose the upper and lower bounds on the residual, are crucial for a successful recovery of f. The narrower those bounds are, the more accurate the recovered distribution will be. Narrow bounds correspond to small confidence limits, i.e., values of α ∈ [0, 1) close to zero. However, choosing a small α value for error statistics that are in fact not very accurate can lead to artificial peaks in the predicted distribution. Conversely, choosing a large confidence limit for accurate error statistics means that not all the available information is used optimally. This point is illustrated in Fig. 13, where POCS predictions are shown for the confidence limits α1 = 0.5 and α2 = 0.99999. Both results are based on the same accurate error statistics, so that a small α value is “allowed.” It is obvious that the smaller value of α leads to a much more accurate distribution. Since the choice of a realistic α value is left to the user’s belief in the error statistics, this uncertainty is a weakness of the method. However, as pointed out by Combettes (19), experience helps! This is another aspect showing how useful it is to “experiment” with simulated adsorption data before doing the actual experiments.

FIG. 12. Energy distribution functions, f(ε), predicted from simulated adsorption data with 2% Gaussian error at 78, 120, and 150 K. The results are based on the adsorption isotherms shown in Fig. 11. The POCS predictions are based on Eq. [33].

5. CONCLUSIONS
The method of projections onto convex sets has been used to calculate adsorption-energy distributions for heterogeneous surfaces from the adsorption integral. The following types of system information were incorporated into an iteration scheme:
• experimental data,
• statistical properties of the measurement errors, and
• a priori constraints.

This information was formulated (in a straightforward manner) as projections onto closed, convex sets, for which the iteration is guaranteed to converge to a solution that satisfies all imposed constraints. In several examples, the POCS method has been used to recover energy distributions from simulated adsorption data containing normally distributed errors of different magnitudes. Excellent recoveries were obtained for adsorption data containing errors of up to 5%, assuming that accurate statistics of the measurement errors were available. The number, location, and accuracy of data points play a crucial role in the recovery of a meaningful energy distribution. Due to its efficiency and simplicity, the POCS method should be a valuable tool for “planning” experiments, especially when the range of expected adsorption energies is known approximately. Then, hypothetical distributions of various complexity can be generated and criteria (such as feasible parameter spaces) can be determined for a particular problem, leading to a successful distribution function recovery. One of the most important features of the POCS method is its ability to incorporate an unlimited number of constraints, assuming that they can be formulated as (closed convex) sets. Hence, we believe this method provides room for further improvements. The goal is to find a combination of sets that uses all available system information in an optimum way.

APPENDIX: LINEAR AND SOFT-LINEAR CONSTRAINTS
For the reader’s convenience, some properties of linear and soft-linear constraints are reproduced here from Stark and Yang (10). First, consider the linear set $C_{l,i}$ as given by Eq. [10].

Convexity. Let $y_1, y_2 \in C_{l,i}$ and $y_3 = \alpha y_1 + (1-\alpha)y_2$ for $\alpha \in [0, 1]$. Then

$$\langle y_3, a \rangle = \langle \alpha y_1 + (1-\alpha)y_2, a \rangle = \alpha \langle y_1, a \rangle + (1-\alpha)\langle y_2, a \rangle = \alpha \delta_i + (1-\alpha)\delta_i = \delta_i.$$

Hence, $y_3 \in C_{l,i}$ and $C_{l,i}$ is convex.

Projection. The projection of $x \in \mathcal{H}$ onto the linear set $C_{l,i}$ is given by Eq. [11].

Derivation. One wishes to project an arbitrary $x \in \mathcal{H}$ onto the closest point $y$ on the boundary of $C_{l,i}$, where the unknown $y \in C_{l,i}$ is found by minimizing $\|y - x\|$.¹² Let $a_0 = a/\|a\|$.
¹² For any closed convex set $C$ in a Hilbert space $\mathcal{H}$, one can show that for each $x \in \mathcal{H}$ there exists a unique $x^* \in C$ that is closest to $x$ (10).
Then each vector $x \in \mathcal{H}$ has the orthogonal decomposition

$$x = \langle x, a_0 \rangle a_0 + d, \qquad [A.1]$$

where $d = x - \langle x, a_0 \rangle a_0$. Clearly $\langle d, a_0 \rangle = 0$, so $d$ is orthogonal to $a_0$. Since each $y \in C_{l,i}$ satisfies $\langle y, a_0 \rangle = \delta_i/\|a\|$, one can write $y$, in accordance with Eq. [A.1], as

$$y = \frac{\delta_i}{\|a\|} a_0 + e, \qquad [A.2]$$

where $e$ is orthogonal to $a_0$. Combining Eqs. [A.1] and [A.2] gives

$$\|x - y\|^2 = \left\| \langle x, a_0 \rangle a_0 + d - \frac{\delta_i}{\|a\|} a_0 - e \right\|^2 = \left\| \left( \langle x, a_0 \rangle - \frac{\delta_i}{\|a\|} \right) a_0 + (d - e) \right\|^2 = \left\| \left( \langle x, a_0 \rangle - \frac{\delta_i}{\|a\|} \right) a_0 \right\|^2 + \|d - e\|^2, \qquad [A.3]$$

because $\langle d - e, a_0 \rangle = 0$. Clearly, $\|y - x\|$ is minimized when $e = d$, which yields

$$y = \frac{\delta_i}{\|a\|} a_0 + d = \frac{\delta_i}{\|a\|} a_0 + x - \langle x, a_0 \rangle a_0 = x + \frac{\delta_i - \langle x, a \rangle}{\|a\|^2}\, a. \qquad [A.4]$$
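As a numerical illustration (not part of the original derivation), the projection formula of Eq. [A.4] and an alternating POCS iteration with a nonnegativity constraint can be sketched in Python with NumPy; the vectors, the value of $\delta_i$, and the tolerances below are arbitrary choices for demonstration:

```python
import numpy as np

def project_linear(x, a, delta):
    """Project x onto the hyperplane {y : <y, a> = delta}, Eq. [A.4]."""
    return x + (delta - x @ a) / (a @ a) * a

def project_nonneg(x):
    """Project x onto the cone of nonnegative vectors {y : y >= 0}."""
    return np.maximum(x, 0.0)

# Arbitrary example data in R^5 (mixed-sign constraint vector).
a = np.array([1.0, -0.5, 2.0, 0.3, -1.0])
delta = 2.0
x = np.array([0.5, -1.0, 0.2, 0.0, 1.5])

# The projection lands exactly on the hyperplane and is idempotent.
y = project_linear(x, a, delta)
assert abs(y @ a - delta) < 1e-10
assert np.allclose(project_linear(y, a, delta), y)

# Alternating projections: the iterate approaches a point satisfying
# both the linear equality and the nonnegativity constraint.
z = x.copy()
for _ in range(1000):
    z = project_nonneg(project_linear(z, a, delta))
assert z.min() >= 0.0
assert abs(z @ a - delta) < 1e-4
```

Because both sets are closed and convex with a nonempty intersection, the alternating iteration is guaranteed to converge to a feasible point, which is the essential mechanism exploited by the POCS method in the main text.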
Equation [A.4] gives the projection of $x$ onto $C_{l,i}$, in agreement with Eq. [11]. In practice, the scalar $\delta_i$ on the right-hand side of Eq. [10] often represents a property of the measurement error that is not known exactly. For such cases, it is necessary to use soft-linear constraints as defined in Eq. [12]. Note that $C_{sl}$ is both closed and convex and has an associated projection operator given by Eq. [13]. The derivation of $P_{sl}$ is similar to the derivation of $P_{l,i}$ shown above; see (10) for details.

ACKNOWLEDGMENTS

The authors acknowledge support by the Division of Chemical Sciences of the Office of Basic Energy Sciences, U.S. Department of Energy, under Contract DE-FG02-87ER13777. T.H. thanks Dr. Jerry Prince from the Electrical and Computer Engineering Department for pointing out the usefulness of projection methods, and Dr. Michael Karweit from Chemical Engineering for advice with previous work on this project.
REFERENCES

1. Aranovich, G. L., and Donohue, M. D., J. Chem. Phys. 104, 3851 (1996).
2. Rudzinski, W., and Everett, D. H., “Adsorption of Gases on Heterogeneous Surfaces.” Academic Press, London, 1992.
3. Jaroniec, M., and Madey, R., “Physical Adsorption on Heterogeneous Solids.” Elsevier, Amsterdam, 1988.
4. Press, W. H., Teukolsky, S. A., Vetterling, W. T., and Flannery, B. P., “Numerical Recipes in Fortran 77: The Art of Scientific Computing,” 2nd ed. Cambridge Univ. Press, Cambridge, UK, 1992.
5. Hansen, P. C., Numer. Algorithms 6, 1 (1994).
6. Phillips, D. L., J. Assoc. Comput. Mach. 9, 84 (1962).
7. Twomey, S., “Introduction to the Mathematics of Inversion in Remote Sensing and Indirect Measurements.” Elsevier, Amsterdam, 1977.
8. von Szombathely, M., Bräuer, P., and Jaroniec, M., J. Comput. Chem. 13, 17 (1992).
9. Bhatia, S. K., Chem. Eng. Sci. 53, 3239 (1998).
10. Stark, H., and Yang, Y., “Vector Space Projections.” Wiley, New York, 1998.
11. Gubin, L. G., Polyak, B. T., and Raik, E. V., USSR Comput. Math. Math. Phys. 7, 1 (1967).
12. Bregman, L. M., Dokl. Akad. Nauk SSSR 162, 487 (1965).
13. Halperin, I., Acta Sci. Math. 23, 96 (1960).
14. Opial, Z., Bull. Am. Math. Soc. 73, 591 (1967).
15. Youla, D. C., IEEE Trans. Circuits Syst. 25, 694 (1978).
16. Youla, D. C., and Webb, H., IEEE Trans. Med. Imaging 1, 81 (1982).
17. Levi, A., and Stark, H., J. Opt. Soc. Am. 73, 810 (1983).
18. Stark, H., Ed., “Image Recovery: Theory and Application.” Academic Press, London, 1987.
19. Combettes, P. L., Proc. IEEE 81, 182 (1993).
20. Trussell, J. H., and Civanlar, M. R., IEEE Trans. Acoust. Speech Signal Process. 32, 201 (1984).
21. Ostle, B., and Mensing, R. W., “Statistics in Research,” 3rd ed. Iowa State Univ. Press, Ames, 1975.
22. Stark, H., and Olsen, E. T., J. Opt. Soc. Am. A 9, 1914 (1992).
23. Yang, Y., and Stark, H., in “Signal Processing Methods for Audio, Images, and Telecommunications” (P. M. Clarkson and H. Stark, Eds.), Chap. 10. Academic Press, London, 1995.
24. Hocker, T., Aranovich, G. L., and Donohue, M. D., J. Chem. Phys. 111, 1240 (1999).
25. Horn, R. A., and Johnson, C. R., “Matrix Analysis.” Cambridge Univ. Press, Cambridge, UK, 1990.