Journal of Colloid and Interface Science 258 (2003) 383–389 www.elsevier.com/locate/jcis
Measurement of emulsion droplet sizes using PFG NMR and regularization methods

K.G. Hollingsworth and M.L. Johns *

Department of Chemical Engineering, University of Cambridge, Pembroke Street, Cambridge CB2 3RA, UK

Received 7 May 2002; accepted 5 November 2002
Abstract

The droplet size distributions of emulsions have been measured using pulsed field gradient (PFG) nuclear magnetic resonance (NMR) for many years. This technique finds particular application with emulsions that are concentrated and/or opaque, since such emulsion systems are difficult to characterize by other methods. Most studies employing PFG techniques assume a lognormal form when extracting the droplet size distribution from the experimental data. It is clearly desirable to retrieve a droplet size distribution from the experimental data without assuming such a functional form. This is achieved for the first time using regularization techniques. Regularization based on the distribution area and on its second derivative is compared and assessed along with the following techniques for selecting the optimal regularization parameter: the L-curve method, generalized cross validation (GCV), and the discrepancy principle. Regularization is applied to both simulated data sets and experimental data. It is found that when the experimental error can be estimated accurately, the discrepancy principle with area regularization is the best approach. When the error is not known, the GCV method, with second derivative regularization and allowing only nonnegative values, is most effective.
© 2003 Elsevier Science (USA). All rights reserved.

Keywords: Emulsions; Droplet size distributions; Regularization; NMR; PFG
1. Introduction

Emulsions are an important class of both industrial products and intermediates, being found in foodstuffs, pharmaceuticals, cosmetics, paints, and explosives, as well as in the crude oil emulsions produced during oil recovery [1]. Measuring the size distribution of droplets in an emulsion is an important characterization tool, as the size of the droplets influences the viscosity and stability of the emulsion, important characteristics for its processing and functionality. Techniques such as confocal microscopy, electron microscopy, and light scattering are routinely used to size emulsion droplets [2]; however, each of these methods has its own limitations. Confocal microscopy usually requires the addition of a fluorescent contrast agent to one of the phases of the emulsion and is unable to resolve beyond the surface layer when the refractive indices of the continuous and discrete phases differ significantly. Likewise, light scattering can only be employed on dilute emulsions where there is a high transmittance. Electron microscopy is highly invasive, requiring elaborate and destructive preparation of the sample [3].

Pulsed field gradient (PFG) NMR offers a unique selection of advantages over other emulsion-sizing methods. In particular, it can routinely deal with both opaque and concentrated emulsions. The measurement of emulsion droplet sizes using PFG is based on the restriction of the self-diffusion of the molecules of the discrete phase (i.e., the droplets) in an emulsion by the boundary between the phases (i.e., the interior droplet surfaces). Various investigators have derived mathematical expressions to describe this restricted self-diffusion and its effect on the PFG signal as a function of droplet size (e.g., [4–6]). This PFG emulsion droplet sizing technique has been used by several authors, beginning with Packer and Rees in 1972 [7]. It has been used to investigate water-in-crude-oil emulsions [8], which are important in the pipeline transport of recovered crude oils, and the creaming processes present in oil-in-water emulsions [9]. Very concentrated emulsion systems have been shown to produce diffraction patterns in the acquired PFG signal, and these patterns can be used to size the compartment boundaries [10].

* Corresponding author. E-mail address: [email protected] (M.L. Johns).
0021-9797/03/$ – see front matter 2003 Elsevier Science (USA). All rights reserved. doi:10.1016/S0021-9797(02)00131-5
The PFG method has also been used to probe droplet sizes within food materials such as mayonnaise, margarines, and cheeses [11–13]. One key problem in processing the experimental data is the mathematical difficulty of retrieving the droplet size distribution from the PFG experimental data. This has generally been overcome by assuming a lognormal form for the droplet size distribution. For many homogeneously mixed emulsion systems with a low-viscosity continuous phase, this is a satisfactory solution [14]. Retrieving the droplet size distribution without assuming its functional form is the focus of the work presented here. The technique of regularization is employed to achieve this, and various methods of implementing regularization are compared.
2. Methodology

2.1. Determining emulsion droplet size distributions using PFG-NMR

In NMR, by applying a magnetic field gradient pulse to a sample, molecules are spatially encoded with a phase shift. A second magnetic field gradient pulse is applied at a later time, which reverses this phase shift. If none of the molecules have moved between the gradient pulses, the overall residual phase shift will be zero and the NMR signal will be completely refocused. Due to Brownian motion, however, the molecules will have moved and the refocused signal will be attenuated. The signal attenuation thus gives a measure of the movement of the molecules. This technique is known as PFG-NMR and is described in more detail in Callaghan [15].

Within the discrete droplets of an oil-in-water emulsion, the oil molecules will undergo self-diffusion, but since the oil molecules cannot diffuse beyond the droplet surface, the molecules are only able to diffuse a fixed distance from their original positions. For a molecule diffusing inside a spherical cavity, Murday and Cotts [16] calculated the NMR signal attenuation expression

\ln R(\Delta, \delta, a) = -2\gamma^2 g^2 \sum_{m=1}^{\infty} \frac{1}{\alpha_m^2 (\alpha_m^2 a^2 - 2)} \left[ \frac{2\delta}{\alpha_m^2 D} - \frac{2 + e^{-\alpha_m^2 D(\Delta - \delta)} - 2e^{-\alpha_m^2 D\delta} - 2e^{-\alpha_m^2 D\Delta} + e^{-\alpha_m^2 D(\Delta + \delta)}}{(\alpha_m^2 D)^2} \right], (1)

where R is the signal attenuation, a is the droplet radius, δ is the duration of the pulsed field gradient, Δ is the diffusion or observation time, defined as the time between the leading edges of the two pulsed field gradients, D is the unrestricted self-diffusivity of the discrete phase liquid, g is the applied pulsed field gradient (T m−1), and γ is the gyromagnetic ratio of the proton (2.675 × 10^8 s−1 T−1). α_m is given by the positive roots of the following expression, where J_n is an nth-order Bessel function:

J_{3/2}(\alpha a) = \alpha a J_{5/2}(\alpha a). (2)
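As an illustration of Eqs. (1) and (2), the attenuation kernel can be evaluated numerically. The sketch below is our own (the original work used MATLAB, not this code); it exploits the closed forms of the half-integer Bessel functions, so that Eq. (2) reduces to a scalar root-finding problem, and it truncates the series after 20 terms. All parameter values are illustrative only.

```python
import math

def root_fn(y):
    # Eq. (2) in terms of y = alpha * a, using the closed forms
    #   J_{3/2}(y) = sqrt(2/(pi*y)) (sin y / y - cos y)
    #   J_{5/2}(y) = sqrt(2/(pi*y)) ((3/y^2 - 1) sin y - (3/y) cos y);
    # dropping the common prefactor, J_{3/2}(y) - y J_{5/2}(y) = 0 becomes:
    return y * math.sin(y) + 2.0 * math.cos(y) - 2.0 * math.sin(y) / y

def bessel_roots(n=20):
    """First n positive roots y_m = alpha_m * a of Eq. (2), by bisection."""
    roots, y, step = [], 0.5, 0.05
    while len(roots) < n:
        if root_fn(y) * root_fn(y + step) < 0.0:
            lo, hi = y, y + step
            for _ in range(60):               # bisection refinement
                mid = 0.5 * (lo + hi)
                if root_fn(lo) * root_fn(mid) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
        y += step
    return roots

def attenuation(a, g, delta, Delta, D, gamma=2.675e8):
    """R from Eq. (1), series truncated after 20 terms (SI units)."""
    total = 0.0
    for y in bessel_roots(20):
        alpha2 = (y / a) ** 2                 # alpha_m^2
        pre = 1.0 / (alpha2 * (alpha2 * a * a - 2.0))
        num = (2.0 + math.exp(-alpha2 * D * (Delta - delta))
               - 2.0 * math.exp(-alpha2 * D * delta)
               - 2.0 * math.exp(-alpha2 * D * Delta)
               + math.exp(-alpha2 * D * (Delta + delta)))
        total += pre * (2.0 * delta / (alpha2 * D) - num / (alpha2 * D) ** 2)
    return math.exp(-2.0 * gamma ** 2 * g ** 2 * total)
```

The first root of Eq. (2), y_1 ≈ 2.0816, recovers the familiar first zero of the derivative of the spherical Bessel function j_1, as expected for a reflecting spherical boundary.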
Packer and Rees [7] extended this result to an emulsion with a probability distribution of radii, P(a). Since the magnitude of the NMR signal received is proportional to the volume of material, a volume average of the detected attenuation must be taken,

b(g) = \frac{\int_0^\infty a^3 P(a) R(g,a)\, da}{\int_0^\infty a^3 P(a)\, da}, (3)

where b(g) is the signal attenuation corresponding to a pulsed field gradient of magnitude g. Since b(g) is not known as an analytical function, but rather as a discrete set of measurements contaminated by experimental error, it is not possible to extract P(a) directly from Eq. (3). The probability distribution, P(a), has usually been taken to be a lognormal distribution, given by

P(a) = \frac{1}{2a\sigma (2\pi)^{1/2}} \exp\left[-\frac{(\ln 2a - \ln d)^2}{2\sigma^2}\right], (4)

where σ is the shape parameter (variance) and d is the size parameter (median diameter). Some workers have approached this problem by taking a lognormal distribution as a basis function and allowing a limited number of perturbations from that function [17–19]. This approach has some success with distributions that do not vary significantly from the basis function, but it still does not provide a truly nonparametric method.

2.2. Regularization

In this work Eq. (3) is solved for P(a) by employing regularization techniques, which require no assumptions about the functional form of P(a). Equation (3) is treated as a Fredholm integral equation of the first kind. This class of equations is well known in the mathematics of inverse problems, finding application in scientific areas as diverse as the determination of pore sizes from adsorption isotherms, aerosol measurements, and astrophysics [20–22]. Recognizing that the denominator of Eq. (3) is constant for a given emulsion, and hence neglecting it, the integral in the numerator is discretized and rewritten as

b_i = \sum_{j=1}^{m} R(g_i, a_j) P(a_j) \delta_j, (5)
where b_i is the ith attenuation reading corresponding to a gradient field strength g_i and P(a_j) represents the probability density of the solution at the jth quadrature interval (P(a) is in discretized form, i.e., divided into m subintervals). The matrix R is a transfer matrix and represents the value of the attenuation for discrete values of g and a, multiplied by a^3, as calculated using Eq. (1). δ_j represents the weighting of the quadrature intervals;
however, if they are equally spaced this factor can be discarded. Equation (5) can be rewritten in matrix form as

b = RP. (6)
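As a minimal sketch of the discretization in Eqs. (5) and (6) (our own illustration, not the paper's implementation), the transfer matrix with the a^3 volume weighting folded in can be assembled as follows. The kernel here is a simple stand-in with qualitatively similar behaviour to Eq. (1); the radius and gradient grids mirror those used later in the paper.

```python
import math

def attenuation(a, g):
    # Placeholder kernel: decays with gradient strength and droplet
    # size, as Eq. (1) does qualitatively. The real kernel is Eq. (1).
    return math.exp(-(g * a * 1e5) ** 2)

radii = [0.5e-6 * (j + 1) for j in range(32)]       # 0.5-16 um grid, in m
gradients = [0.02 * (i + 1) for i in range(32)]     # gradient strengths, T/m

# Transfer matrix R: rows index gradient strengths, columns droplet
# radii; the a^3 volume weighting is folded into each entry.
R = [[(a ** 3) * attenuation(a, g) for a in radii] for g in gradients]

def forward(R, P):
    """b = R P for a discretised size distribution P (Eq. (6))."""
    return [sum(Rij * Pj for Rij, Pj in zip(row, P)) for row in R]
```

For any nonnegative P, the resulting b decays with increasing gradient strength, as the measured attenuation data do.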
This is now a linear algebra problem over a finite domain in which we are required to solve for the vector P. R is generally very ill-conditioned (many of its rows are nearly linearly dependent) and thus direct inversion of Eq. (6) is not possible. Normally such a linear system would instead be solved by finding P such that the following quantity, H, is minimized:

H = \min \|RP - b\|^2. (7)
This does not generally lead to meaningful solutions, as noise in b results in large fluctuations in the solution P. This is a general problem with Fredholm equations of the first kind. As Eq. (7) is minimized, the solution P becomes more and more oscillatory and no useful information is produced. Most physical functions are not oscillatory, and regularization uses this fact to penalize solutions that are highly oscillatory, ideally extracting the maximum amount of information about the solution while admitting the minimum amount of noise [23]. Regularization is implemented by imposing a penalty function on Eq. (7), which penalizes "nonsmoothness" in the solution, P. Equation (7) is thus modified to

H = \min\left( \|RP - b\|^2 + \lambda^2 \|LP\|^2 \right). (8)

The first term (\|RP - b\|^2) is known as the residual norm and represents how close P is to being a true solution of the system; the second term (\|LP\|^2) is the penalty function and indicates how smooth the solution is. λ is the regularization parameter and controls the degree of smoothness of the solution, P. L is the operator representing the choice of smoothness criterion. A low value of the regularization parameter, λ, places more importance on finding an exact solution of Eq. (6), and a high value places more importance on the smoothness of the solution.
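A compact way to see how Eq. (8) is solved in practice is through its normal equations, (R^T R + λ^2 L^T L)P = R^T b. The sketch below is our own pure-Python illustration on a deliberately tiny problem, using the second-difference operator for L; the paper itself used the MATLAB optimization toolbox, with a nonnegativity constraint that is omitted here.

```python
def solve(A, y):
    """Gaussian elimination with partial pivoting for A x = y."""
    n = len(A)
    M = [row[:] + [yi] for row, yi in zip(A, y)]      # augmented matrix
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= f * M[k][c]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        x[k] = (M[k][n] - sum(M[k][c] * x[c] for c in range(k + 1, n))) / M[k][k]
    return x

def tikhonov(R, b, lam):
    """Minimise ||R P - b||^2 + lam^2 ||L P||^2 via the normal equations."""
    n = len(R[0])
    # Second-difference operator L (the smoothness criterion (ii)).
    L = [[0.0] * n for _ in range(n - 2)]
    for i in range(n - 2):
        L[i][i], L[i][i + 1], L[i][i + 2] = 1.0, -2.0, 1.0
    # A = R^T R + lam^2 L^T L,  rhs = R^T b
    A = [[sum(R[k][i] * R[k][j] for k in range(len(R)))
          + lam ** 2 * sum(L[k][i] * L[k][j] for k in range(n - 2))
          for j in range(n)] for i in range(n)]
    rhs = [sum(R[k][i] * b[k] for k in range(len(R))) for i in range(n)]
    return solve(A, rhs)
```

With λ = 0 the data are reproduced exactly; as λ grows, oscillatory components of the solution are progressively suppressed, which is precisely the trade-off Eq. (8) encodes.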
2.3. Implementation of the regularization method

Two options for the smoothness criterion, L, were considered:

(i) Summing the area of the curve defined by P.
(ii) Calculating the second derivative of P using finite differences and then summing the squares of these second derivatives.

Choosing the area smoothness criterion (i) will favor solutions, P, with minimum area, which discourages an oscillatory form. The second derivative criterion (ii) provides a measure of how fast the gradient of P is changing and hence finds the least oscillatory solution consistent with the regularization parameter, λ.

As λ is increased, it is reasonably obvious to the eye where the solution is optimized, i.e., where the transition from underregularized to overregularized occurs. However, an automated method of selecting λ is clearly desirable. Consequently, the following methods of choosing the optimal regularization parameter, λ, are considered.

2.3.1. The L-curve [23]

Equation (8) is optimized for a range of values of λ. An L-curve is then formed by plotting the penalty function (\|LP\|^2) against the residual norm (\|RP - b\|^2) for the range of λ considered. The L-curve method derives its name from a characteristic L-shaped corner in this plot. An example is shown in Fig. 1a. To the left of this corner (i.e., as λ decreases), the penalty function increases rapidly, indicating that the solution is becoming oscillatory. To the right of the corner (as λ increases), the residual norm increases rapidly, indicating that these points are increasingly poor solutions of Eq. (8). The point of maximum curvature (i.e., the L-shaped corner) is taken to give the optimal regularization parameter [23].
Fig. 1. (a) A typical L-curve and (b) a typical plot of the GCV score against the regularization parameter for real experimental data.
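Locating the corner automatically amounts to finding the point of maximum curvature of the L-curve in log-log coordinates. The three-point discrete curvature estimate below is our own simple stand-in for the spline-based approach of Hansen and O'Leary [23]:

```python
import math

def corner_index(res_norms, pen_norms):
    """Index of the max-curvature (corner) point of the L-curve."""
    x = [math.log(r) for r in res_norms]     # log residual norms
    y = [math.log(p) for p in pen_norms]     # log penalty norms
    best, best_k = -1.0, 1
    for k in range(1, len(x) - 1):
        # Curvature proxy through three consecutive points: the cross
        # product of the two chords, normalised by the side lengths.
        dx1, dy1 = x[k] - x[k - 1], y[k] - y[k - 1]
        dx2, dy2 = x[k + 1] - x[k], y[k + 1] - y[k]
        cross = abs(dx1 * dy2 - dy1 * dx2)
        denom = (math.hypot(dx1, dy1) * math.hypot(dx2, dy2)
                 * math.hypot(x[k + 1] - x[k - 1], y[k + 1] - y[k - 1]))
        if denom > 0 and cross / denom > best:
            best, best_k = cross / denom, k
    return best_k
```

On an idealised L-shape (a vertical branch meeting a horizontal branch), the routine picks out the junction point.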
2.3.2. Generalized cross validation (GCV) [24]

The basic principle of the GCV method is to omit one of the data points and then determine the regularization parameter, λ, which predicts the missing data point with greatest accuracy. This is done for every data point in the set, b, and a generalized cross-validation score can be evaluated as a function of the regularization parameter [25]. An example is presented in Fig. 1b. The optimal regularization parameter is given by the minimum of this function. It is possible to approximate a GCV function allowing only nonnegative values in the solution, P [24].

2.3.3. The discrepancy principle [26]

The L-curve and GCV methods require no knowledge of the random error (noise) in the experimental data. A third method of picking the regularization parameter is the discrepancy principle, which does require an estimate of this error. The discrepancy principle demands that the problem be solved so that the residual norm (\|RP - b\|^2) matches the error in the measurements. The value of λ which achieves this is selected as the optimal value. Clearly this method demands a good estimate of the experimental error.

The droplet size distribution, P, is not only expected to be smooth and continuous but must also be positive at all points. This provides a constraint on the optimization of Eq. (8). A suitable method of implementing such a nonnegativity constraint is described in [25]. The optimization toolbox of MATLAB is used to solve the resulting optimization problems. This was implemented for the two methods of describing the penalty function, L (area and second derivative). Each of these was in turn evaluated using the three methods of selecting the optimum value of λ (L-curve, GCV, and the discrepancy principle).
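The discrepancy principle reduces to a one-dimensional root-finding problem, because the Tikhonov residual norm increases monotonically with λ. The sketch below is our own illustration in SVD coordinates, where the residual has a closed form; the singular values and data coefficients are hypothetical, not taken from the paper.

```python
import math

def residual_norm(lam, s, beta):
    # In SVD coordinates the Tikhonov residual is
    #   ||r(lam)||^2 = sum_i (lam^2 / (s_i^2 + lam^2))^2 beta_i^2,
    # with s_i the singular values and beta_i the data coefficients.
    return math.sqrt(sum((lam ** 2 / (si ** 2 + lam ** 2)) ** 2 * bi ** 2
                         for si, bi in zip(s, beta)))

def discrepancy_lambda(s, beta, noise, lo=1e-8, hi=1e8):
    """Bisection (on a log scale) for residual_norm(lam) = noise."""
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if residual_norm(mid, s, beta) < noise:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

# Hypothetical ill-conditioned spectrum and data coefficients:
s = [1.0, 0.5, 0.1, 0.01, 0.001]
beta = [1.0, 0.8, 0.5, 0.3, 0.2]
lam = discrepancy_lambda(s, beta, noise=0.05)
```

The returned λ makes the residual norm equal the assumed noise level, which is exactly the condition the discrepancy principle imposes.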
The resultant six implementations of regularization were evaluated and compared based on their application to both model droplet size distributions (contaminated with noise) and experimental data for a toluene-in-water emulsion.
Fig. 2. Retrieval of a unimodal function (Case A) using the discrepancy principle with area and second derivative penalty functions. The original function is shown for comparison.
Fig. 3. Retrieval of a bimodal function (Case B) using the discrepancy principle with area and second derivative penalty functions. The original function is shown for comparison.
2.4. Model droplet size distributions

The emulsion droplet size distributions recorded in the literature have been mostly unimodal or bimodal in form. Hence the following two model droplet size distributions were generated:

Case A: A unimodal lognormal distribution with σ = 0.3 and d = 10 µm (shown in Fig. 2).
Case B: A bimodal distribution, created by summing two lognormal distributions, σ1 = 0.3, d1 = 10 µm and σ2 = 0.2, d2 = 26 µm (shown in Fig. 3).

These distributions, P, were then multiplied by a transfer matrix R corresponding to an experiment with δ = 5 ms, Δ = 200 ms, and g = 2–80 G cm−1 (32 gradient steps). This produces "perfect" attenuation data. Noise was then added to these simulated attenuation data using normally distributed random numbers with a variance of 1 × 10−2. In this work, quadrature intervals of 0.5 µm were used, with a range from 0.5 to 16 µm (giving 32 intervals).

2.5. Experimental description

An oil-in-water emulsion, consisting of 48.5 wt% toluene, 3 wt% Triton X-100, and 48.5 wt% water, was produced by first dissolving the surfactant in the water and then shearing the toluene into the surfactant solution for several minutes. This emulsion was introduced into a glass tube inside a 7.4-T Bruker DMX 300 magnet. PFG measurements were performed using a stimulated echo diffusion sequence, as shown in Fig. 4, at 32 different gradient strengths, ranging from 2 to 80 G cm−1, using a pulsed gradient duration (δ) of 5 ms and a diffusion time (Δ) of 200 ms. The spectrum of the emulsion has three peaks: an aromatic peak, a water peak, and a methyl peak. The attenuation of the aromatic peak was followed in these experiments, as it is due solely to the oil phase of the emulsion. Unlike the simulated data, there is no a priori knowledge of the error in the experimen-
Fig. 4. The pulsed gradient stimulated echo sequence (PGSTE).

Table 1
Optimum regularization parameters for a unimodal distribution (Case A)

Run no.   L-curve corner   No. of maxima   GCV            No. of maxima   Discrepancy principle   No. of maxima
1         2.438            3               0.284          4               18.01                   1
2         0.667            7               6.154 × 10−6   7               18.01                   1
3         1.930            3               9.483 × 10−6   7               13.03                   1
4         1.276            3               5.385          1               5.385                   1
5         1.285            3               5.292          1               13.02                   1
Note. Area regularization was employed.
tal measurements, as required by the discrepancy principle. One method of estimating this error is to perform a similar diffusion experiment on pure water, for which a Stejskal–Tanner fit for unrestricted diffusion can be performed [15]:

R = \exp\left[-D\gamma^2 g^2 \delta^2 (\Delta - \delta/3)\right]. (9)

The residual error from this exponential fit can be used as an estimate of the error required by the discrepancy principle.
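Because Eq. (9) is linear in x = γ²g²δ²(Δ − δ/3) after taking logarithms, this calibration is a one-parameter linear least squares fit, and the residual scatter supplies the error estimate. The sketch below is our own illustration on synthetic, noise-free data; the parameter values are typical but assumed.

```python
import math

GAMMA = 2.675e8                 # 1H gyromagnetic ratio, s^-1 T^-1
D_TRUE = 2.3e-9                 # assumed self-diffusivity of water, m^2 s^-1
delta, Delta = 5e-3, 200e-3     # gradient duration and diffusion time, s

gs = [0.02 * (i + 1) for i in range(16)]    # gradient strengths, T/m
xs = [GAMMA ** 2 * g ** 2 * delta ** 2 * (Delta - delta / 3) for g in gs]
lnR = [-D_TRUE * x for x in xs]             # synthetic ln R, noise-free

# Least squares slope through the origin: slope = sum(x*y) / sum(x*x).
slope = sum(x * y for x, y in zip(xs, lnR)) / sum(x * x for x in xs)
D_fit = -slope

# RMS residual of the fit -> error estimate for the discrepancy principle.
residuals = [y - slope * x for x, y in zip(xs, lnR)]
error = math.sqrt(sum(r ** 2 for r in residuals) / len(residuals))
```

With real (noisy) water data, `error` would be nonzero and would play the role of the measurement-error estimate used in Section 3.2.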
3. Results and discussion

Results are now presented for the regularization of both the simulated model data and the real experimental data. A comparison of the two variants of the penalty function, based on minimizing the area and the second derivative of P, respectively, is included. A subsequent comparison of the three methods of selecting the regularization parameter (L-curve, GCV, and the discrepancy principle) is performed, and the best implementations of regularization, using both the area and the second derivative for the penalty function, are then identified.

3.1. Extracting droplet size distributions for the model distributions

The simulation of the model attenuation data, b, for the two test distributions was repeated five times, each with its own unique addition of noise. The predictions of the three methods of selecting λ for regularization based on the area criterion are presented in Table 1 for Case A. The optimum value of λ is shown along with a classification of fit, defined as the number of significant maxima in the solution function, P. Clearly, for a good fit there should be only a single peak; a higher number indicates an oscillatory solution.
It can be seen that the discrepancy principle is consistently successful, the GCV method is occasionally successful, and the L-curve method is rarely successful. Similar results were found for the model bimodal distribution (Case B). The L-curve method performs poorly for all implementations considered; similarly poor solutions are produced when the second derivative criterion is used as the penalty function. In all implementations we considered, the L-curve method leads to underregularized solutions. The L-curve method is clearly not a robust technique for automatic selection of λ in this particular implementation of regularization.

The discrepancy principle consistently produced the best estimate of the original function for both Cases A and B when regularization was used with the area criterion as the penalty function. In Figs. 2 and 3 the predictions, using area regularization, of the droplet size distributions corresponding to Cases A and B are presented along with the original functions. The agreement with the original distributions is very good. Also included in Figs. 2 and 3 are the predictions of regularization using the second derivative as the penalty function and the discrepancy principle. Again, good agreement with the original functions is produced. The performance of the area and second derivative implementations of regularization is similar.

The discrepancy principle performs best for both implementations. It does, however, require some estimate of the error of the experimental measurement, information that might not always be accessible. Of the other regularization variants assessed, that using the second derivative as the penalty function, with the GCV method to select λ and nonnegativity constraints, consistently performed best and produced results comparable to the discrepancy principle; the results tended to be slightly underregularized but still described the original forms well.
The prediction of this implementation of the GCV method for Case A is presented in Fig. 5, along with
Table 2
Regularization parameters found for a real experimental dataset

Method                   Regularization parameter   No. of maxima
L-curve corner           3.35                       4
GCV                      2.03 × 10−3                7
Discrepancy principle    101.5                      1
Note. Area regularization was employed.
Fig. 5. Droplet size distribution resulting from using the GCV method with a second derivative regularization allowing only nonnegative values (Case A).
Fig. 7. Comparison between area regularization and second derivative regularization for a toluene-in-water emulsion. The regularization parameter has been picked using the discrepancy principle and the solution has been restricted to nonnegative values.
the original function. Good agreement is evident; similar results were produced for Case B. A direct comparison of the GCV method and the discrepancy principle is presented in Figs. 6a and 6b for Cases A and B, respectively. The GCV method is dependent on the spread of errors in the attenuation measurements. With a small number of measurements, the errors can seem to be correlated, and the GCV method will tend to predict a slightly underregularized solution. The optimal selection of experimental sampling points for the GCV method is the focus of ongoing research.

Fig. 6. A comparison between picking the optimum regularization parameter with the discrepancy principle and the GCV method using a second derivative penalty function with nonnegativity for (a) Case A and (b) Case B.

3.2. Extracting droplet size distributions from the experimental data
With respect to the experimental data, the results for the L-curve, GCV, and discrepancy principle methods are presented in Table 2, where area has been used as the penalty function. The discrepancy principle produces only a single peak; this would be expected for the emulsion, as it was homogeneously and thoroughly sheared [14]. In previous work [27], the droplet size distributions for emulsion systems formed in an identical manner contained only a single peak; this was verified using confocal microscopy. With respect to the discrepancy principle, the error in the attenuation data was estimated to be 0.05, as determined using a self-diffusion measurement on water and the same range of echo attenuation as measured for the emulsion. The error estimate is clearly quite good, as the most sensible prediction of the droplet size distribution is produced. Thus, in agreement with the predictions of applying area regularization to the model functions, the discrepancy principle consistently performs best in describing the experimental emulsion droplet size distributions.

Figure 7 compares the predictions of regularization of the acquired experimental data using the area and the second derivative penalty functions. The regularization parameter was picked using the discrepancy principle. Although
the two distributions indicate a similar average size for the droplets, their shapes are slightly different. The area regularization minimizes the area and hence yields a somewhat narrower distribution than the second derivative regularization, which seeks to minimize the second derivative (the "wiggliness" of the solution). Figure 8 directly compares the discrepancy principle and GCV for a second derivative regularization allowing only nonnegative values. It can be seen that the two approaches give similar results, with the GCV result being slightly underregularized compared to the discrepancy principle solution.

Fig. 8. Comparison between solutions for experimental data using the discrepancy principle and GCV with second derivative regularization. The solution is restricted to nonnegative values.

4. Conclusions

The most fundamental choice in applying regularization to emulsion droplet sizing is the method for picking the optimum regularization parameter. The L-curve is not a robust method in this particular application. The decision therefore becomes whether to use the GCV method, which requires no estimate of the experimental error, or to make an estimate of the experimental error and use the discrepancy principle. If the error in the NMR attenuation measurement can be established accurately, then the discrepancy principle provides the most reliable regularization technique. In this work we were able to produce a reliable estimate of the error by effectively calibrating the system using water self-diffusion. Where this error cannot be estimated, the generalized cross validation (GCV) method using second derivative regularization and nonnegativity constraints is the most reliable.

Acknowledgments

K.G.H. gratefully acknowledges the financial support of the Engineering and Physical Sciences Research Council and St John's College, Cambridge.

References
[1] P. Becher (Ed.), Emulsions, Latices and Dispersions, Dekker, New York, 1978.
[2] E. Dickinson (Ed.), Colloids in Food, Applied Science Publishers, 1983.
[3] E. Dickinson, D.J. McClements, Advances in Food Colloids, Blackie, Glasgow, 1996.
[4] C.H. Neumann, J. Chem. Phys. 60 (1974) 4508.
[5] P. Linse, O.J. Soderman, J. Magn. Reson. A 116 (1995) 71.
[6] B. Balinov, B. Jonsson, P. Linse, O. Soderman, J. Magn. Reson. A 104 (1993) 17.
[7] K.J. Packer, C. Rees, J. Colloid Interface Sci. 40 (1972) 206.
[8] B. Balinov, O. Urdahl, O. Soderman, J. Sjoblom, Colloids Surf. A 82 (1994) 173.
[9] P.J. McDonald, E. Ciampi, J.L. Keddie, M. Heidenreich, R. Kimmich, Phys. Rev. E 59 (1999) 874.
[10] B. Hakansson, P. Pons, O. Soderman, Magn. Reson. Imaging 16 (1998) 643.
[11] G.J.W. Goudappel, J.P.M. van Duynhoven, M.M.W. Mooren, J. Colloid Interface Sci. 239 (2001) 535.
[12] J.C. Van den Enden, D. Waddington, H. Van Aalst, C.G. Van Kralingen, K.J. Packer, J. Colloid Interface Sci. 140 (1990) 105.
[13] P.T. Callaghan, K.W. Jolley, R.S. Humphrey, J. Colloid Interface Sci. 93 (1983) 521.
[14] E.S. Rajagopal, Kolloid Z. 162 (1959) 85.
[15] P.T. Callaghan, Principles of Nuclear Magnetic Resonance Microscopy, Clarendon, Oxford, 1991.
[16] J.S. Murday, R.M. Cotts, J. Chem. Phys. 48 (11) (1968) 4938.
[17] L. Ambrosone, A. Ceglie, G. Colafemmina, G. Palazzo, J. Chem. Phys. 107 (24) (1997) 10756.
[18] L. Ambrosone, A. Ceglie, G. Colafemmina, G. Palazzo, J. Chem. Phys. 110 (2) (1999) 797.
[19] L. Ambrosone, A. Ceglie, G. Colafemmina, G. Palazzo, J. Phys. Chem. B 104 (2000) 786.
[20] G.M. Davies, N.A. Seaton, V.S. Vassiliadis, Langmuir 15 (1999) 8235.
[21] J.J. Lloyd, C.J. Taylor, R.S. Lawson, R.A. Shields, J. Aerosol Sci. 28 (1997) 1251.
[22] W.M. Smart, Stellar Dynamics, Cambridge Univ. Press, Cambridge, UK, 1938.
[23] P.C. Hansen, D.P. O'Leary, SIAM J. Sci. Comput. 14 (1993) 1487.
[24] G. Wahba, Statist. Decision Theory Relat. Top. III 2 (1982) 383.
[25] J.D. Wilson, J. Mater. Sci. 27 (1992) 3911.
[26] V.A. Morozov, Methods for Solving Incorrectly Posed Problems, Springer-Verlag, New York, 1984.
[27] M.L. Johns, L.F. Gladden, J. Magn. Reson. 154 (1) (2002) 142.