JOURNAL OF COLLOID SCIENCE 19, 549-559 (1964)
MATHEMATICAL AND GRAPHICAL INTERPRETATION OF THE LOG-NORMAL LAW FOR PARTICLE SIZE DISTRIBUTION ANALYSIS

John Elvans Smith¹ and Myra Lee Jordan²

U. S. Naval Research Laboratory, Washington 25, D. C.

Received February 10, 1965

ABSTRACT

The log-normal law serves as an excellent mathematical model for particle size distribution analysis to the extent that the various mathematical terms are properly interpreted. In particular, the probability density function (or frequency function) is shown to be more than an abstract mathematical variable. Particular attention is given to (a) the proper methods of analyzing log-normal distributions, quite different from normal distributions; (b) the physical significance of the mathematical variables and how they relate to experimentally observed data; (c) data-gathering; and (d) the correct modes of portraying and analyzing experimental data graphically.

LIST OF SYMBOLS

$\delta$ = arithmetic mean particle diameter (log-normal distribution).
M = geometric mean particle diameter.
n = number of particles within size interval $\Delta x$.
$\sigma$ = arithmetic standard deviation.
$\sigma_g$ = geometric standard deviation.
$\phi$ = per cent of particles less than a stated size.
x = particle diameter.
$\bar{x}$ = arithmetic mean particle diameter (normal distribution).
$x_m$ = particle diameter of most frequent occurrence.
y'', y', y = probability density (frequency function).

INTRODUCTION
In spite of the many years over which the size-frequency distribution of finely divided particulate matter has been investigated (1, 2), the recent literature continues to show that proper data analysis and interpretation are frequently misunderstood. Several comprehensive reference works are available (3-7), each devoting sections to the mathematical description of dispersed systems, the gathering of data, and the subsequent analysis of the data. But each of these works exhibits one or more important errors or omissions in its treatment of these topics. Additionally, there is some lack of agreement among these and other authors regarding the proper method of analyzing the data and the interpretation of the mathematical model describing dispersed systems. The result is that much of the information being published is not as clearly presented as it could have been, or is erroneous.

There is hardly a single discipline in the physical sciences which is not in some way concerned with dispersed systems and their associated effects. We may include, to mention but a few, such diverse areas as atomic bomb fallout and bomb debris, photographic emulsions, industrial dusts, smoke screens, filter penetrations and retentions, crystallization phenomena, condensation nuclei and raindrops, pharmacological aerosols, pigments, and light-scattering phenomena caused by dispersed systems.

¹ Present address: Southern Colorado State College, Pueblo, Colorado.
² Present address: University of California, Berkeley, California.

MATHEMATICAL MODEL
The usual purpose of investigating dispersed particulate systems is to determine one or more parameters which describe or pertain to the system, such as the arithmetic or geometric mean particle size, skewness and dispersion in the distribution of sizes, or the size distribution itself. A mathematical model which is found useful is the well-known Gaussian distribution law, one of the most important laws of nature. The Gaussian or normal distribution law is given by

$$ y'' = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x-\bar{x})^2}{2\sigma^2}\right], \qquad [1] $$
where y'' is variously referred to in the literature as the probability density, probability, frequency function, or frequency; x is the particle diameter; $\bar{x}$ the arithmetic mean particle size; and $\sigma$ the standard deviation.

It has been found, however, that relatively few dispersions are described by Eq. [1] per se. It is recognized that the vast majority of dispersed systems, such as those produced by the milling and grinding of minerals, tend to follow instead a logarithmic variant of Eq. [1], as given by

$$ y' = \frac{1}{\ln\sigma_g\,\sqrt{2\pi}} \exp\left[-\frac{(\ln x-\ln M)^2}{2\ln^2\sigma_g}\right], \qquad [2] $$
where y' is once again the probability density, x is the particle diameter, $\sigma_g$ the geometric standard deviation, and M the geometric mean particle size by count. For log-normal distributions, the geometric mean diameter M is the same as the number median diameter, which is that particle size above or below which half the total number of particles is found. The logarithms in Eq. [2] are to the base e. As used herein, "ln" specifies logarithms to the base e, and "log" is reserved for the base 10.
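As a minimal numerical sketch of Eq. [2] (an illustration, not part of the original analysis), the Python fragment below evaluates y' and integrates it over ln x. The function names, the midpoint-rule integrator, and the integration limits are assumptions made for this example; the values of M and $\sigma_g$ are those of the hypothetical system of Table I.

```python
import math

def y_prime(x, M, sigma_g):
    """Eq. [2]: probability density per unit ln x at diameter x."""
    s = math.log(sigma_g)
    z = (math.log(x) - math.log(M)) / s
    return math.exp(-0.5 * z * z) / (s * math.sqrt(2.0 * math.pi))

def area_over_ln_x(M, sigma_g, x_lo, x_hi, steps=20000):
    """Midpoint-rule integral of y' d(ln x) between diameters x_lo and x_hi."""
    a, b = math.log(x_lo), math.log(x_hi)
    h = (b - a) / steps
    return h * sum(y_prime(math.exp(a + (k + 0.5) * h), M, sigma_g)
                   for k in range(steps))

M, SIGMA_G = 5.01, 2.071  # parameters of the hypothetical system of Table I

# Integration limits are arbitrary but wide enough to cover the whole curve.
total = area_over_ln_x(M, SIGMA_G, 1e-4, 1e4)
below_median = area_over_ln_x(M, SIGMA_G, 1e-4, M)
print(total, below_median)
```

Under these assumptions the total area should come out very close to 1 and the area below M very close to 0.5, which is the sense in which M is the number median diameter.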
Equation [2] has been subjected to misinterpretation in two important ways. First of all, some authors have not specified that logarithms to the base e are intended. Other writers, in their preference for the base 10, have simply taken Eq. [2] and substituted log for ln, thereby implying that base 10 is intended. However, set to the base 10, the rigorously correct form is

$$ y' = \frac{1}{2.303\,\log\sigma_g\,\sqrt{2\pi}} \exp\left[-\frac{(\log x-\log M)^2}{2\log^2\sigma_g}\right]. \qquad [3] $$
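The equivalence of Eqs. [2] and [3] can be checked numerically. The following Python sketch is illustrative only: the function names, the `keep_factor` switch, and the sample diameters are assumptions introduced here. It compares the base-e and base-10 forms and shows the effect of dropping the 2.303 factor from the coefficient.

```python
import math

LN10 = math.log(10.0)  # 2.302585..., rounded to 2.303 in the text

def y_prime_base_e(x, M, sigma_g):
    """Eq. [2], natural logarithms."""
    s = math.log(sigma_g)
    expo = -(math.log(x) - math.log(M)) ** 2 / (2.0 * s * s)
    return math.exp(expo) / (s * math.sqrt(2.0 * math.pi))

def y_prime_base_10(x, M, sigma_g, keep_factor=True):
    """Eq. [3], common logarithms; keep_factor=False omits the 2.303."""
    s = math.log10(sigma_g)
    factor = LN10 if keep_factor else 1.0
    expo = -(math.log10(x) - math.log10(M)) ** 2 / (2.0 * s * s)
    return math.exp(expo) / (factor * s * math.sqrt(2.0 * math.pi))

M, SIGMA_G = 5.01, 2.071
sample_diameters = [0.5, 2.0, 5.01, 12.0, 40.0]  # arbitrary test points

# Largest relative difference between the two correctly written forms.
max_rel_diff = max(
    abs(y_prime_base_e(x, M, SIGMA_G) - y_prime_base_10(x, M, SIGMA_G))
    / y_prime_base_e(x, M, SIGMA_G)
    for x in sample_diameters
)

# Ratio obtained when the conversion factor is left out of the coefficient.
inflation = y_prime_base_10(M, M, SIGMA_G, keep_factor=False) / y_prime_base_e(M, M, SIGMA_G)
print(max_rel_diff, inflation)
```

The two forms agree to rounding error, while the form without the conversion factor is larger by the constant ratio ln 10, independent of x.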
In the exponential term, the conversion factors of 2.303 will cancel and thus do not appear. To omit the conversion factor in the denominator of the coefficient, as has often been done, does not negate the validity of the equation so long as y' is correctly interpreted; but giving y' this additional weight by a factor of 2.303 obscures its physical relationship to the experimentally observed data.

The second and more serious difficulty, which has led to considerable confusion, rests in the proper interpretation of the quantity y' and its physical significance. It has frequently been stated that y' is the frequency with which a particle of diameter x occurs, but the precise meaning of the term "frequency" has gone undefined. Some writers set y' equal to $n_i/\sum_i n_i$, where $n_i$ is the number of particles of size $x_i$, and $\sum_i n_i$ is the total number of particles observed. Such statements are seriously misleading. For purposes of discussion it is convenient to rewrite Eq. [3] in a form explicitly involving $\sum n$,

$$ y = \frac{\sum n}{2.303\,\log\sigma_g\,\sqrt{2\pi}} \exp\left[-\frac{(\log x-\log M)^2}{2\log^2\sigma_g}\right]. \qquad [4] $$
The previously noted incorrect interpretation that y' equals $n_i/\sum_i n_i$ would then imply that y is the number n of particles possessing diameter x. This definition of y is faulty in two respects. First, it should be obvious that the number of actual particles possessing the diameter of, let us say, exactly 5.000... microns (µ) will be exceedingly few, if any at all. No matter how large $\sum n$ may be, this number of particles is distributed over a continuum of infinitely many possible diameters x, and the actual number of particles of any one exact diameter will perforce be minute or zero. The only meaningful way to speak of an actual number of particles is to consider a particle density clustered about some diameter x, that is, the number n of particles occurring within some size interval $\Delta x$, expressed as a linear density $n/\Delta x$. This quantity, however, is still not the one represented by the elusive y. In fact, the physical significance of y is that it is numerically equal to $nx/\Delta x$, that is, a quantity obtained by weighting this particle density by the mid-interval size x. This vital point has been greatly neglected in the literature.
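A small worked check of this statement, offered as a sketch rather than as the authors' procedure: the Python fragment below takes one size class of Table I (145 particles counted between 4.8 and 6.0 µ, out of 1200 in all) and compares $n/\Delta x$, $nx/\Delta x$, and y from Eq. [4]. The function and variable names are assumptions introduced for the illustration.

```python
import math

SUM_N = 1200               # total number of particles in Table I
M, SIGMA_G = 5.01, 2.071   # parameters of the hypothetical system

def y_eq4(x, total, M, sigma_g):
    """Eq. [4]: y at diameter x, written with base-10 logarithms."""
    s = math.log10(sigma_g)
    coeff = total / (2.303 * s * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-(math.log10(x) - math.log10(M)) ** 2 / (2.0 * s * s))

# One size class of Table I: 145 particles between 4.8 and 6.0 microns.
n, x_lo, x_hi = 145, 4.8, 6.0
x_mid = 0.5 * (x_lo + x_hi)   # mid-interval size x
dx = x_hi - x_lo              # interval width

density = n / dx              # linear particle density n/dx
weighted = n * x_mid / dx     # size-weighted density nx/dx
y_model = y_eq4(x_mid, SUM_N, M, SIGMA_G)
print(density, weighted, y_model)
```

For this class the size-weighted density and the model value y agree to within a fraction of a per cent, whereas the unweighted density $n/\Delta x$ is smaller by the factor x.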
The other misconception regarding the interpretation of y rests in the fact that Eq. [4] does not, as some writers have indicated, describe the distribution of the particle sizes themselves, but rather describes the distribution of the logarithms (base e) of particle sizes.³ Therefore, by the same reasoning set forth above, the interval under consideration must be a logarithmic interval $\Delta(\ln x)$. The quantity y is, then, equal to $n/\Delta(\ln x)$, where n is the number of particles whose diameters have their logarithms lying in the interval $\Delta(\ln x)$. However, because of the mathematical relationship $\Delta(\ln x) \approx \Delta x/x$ (over sufficiently small intervals), the numerical value of y is very nearly equal to $nx/\Delta x$, where n is the number of particles whose diameters lie in the interval $\Delta x$, the mid-point of which is x. A plot of y against ln x has the form of an ordinary normal (Gaussian) distribution; that is, if x is log-normally distributed, ln x is normally distributed.

Equation [4] thus provides a rigid mathematical model for many dispersed systems. In order to determine the two parameters M and $\sigma_g$ which uniquely define a given system, Eq. [4] must be analyzed further. Such an analysis shows that the data derived from a given dispersed system can be analyzed either mathematically or graphically. Following through, then, with Eq. [4], one can show that these two parameters are determined by the expressions

$$ \log M = \frac{\sum_i n_i \log x_i}{\sum_i n_i} \qquad [5] $$

and

$$ \log\sigma_g = \sqrt{\frac{\sum_i n_i\,(\log x_i-\log M)^2}{\sum_i n_i}}. \qquad [6] $$
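Equations [5] and [6] can be applied directly to grouped counts. The Python sketch below does so for the mid-sizes and counts of Table I. It is an illustration under stated assumptions: the names `TABLE_I` and `geometric_parameters` are introduced here, and the open class above 25 µ is represented by x = 25 µ, a choice suggested by the tabulated nx of about 450 for that class but not stated in the text.

```python
import math

# (mid-size x in microns, count n) from columns 2 and 3 of Table I.
# Assumption: the open ">25" class is assigned x = 25.
TABLE_I = [
    (0.25, 0), (0.85, 28), (1.5, 67), (2.1, 92), (2.7, 102), (3.4, 134),
    (4.3, 149), (5.4, 145), (6.6, 111), (7.8, 84), (9.2, 81), (10.8, 56),
    (12.3, 35), (14.0, 35), (16.0, 23), (18.1, 17), (20.6, 14), (23.5, 9),
    (25.0, 18),
]

def geometric_parameters(groups):
    """Return (M, sigma_g) from grouped counts using Eqs. [5] and [6]."""
    total = sum(n for _, n in groups)
    log_m = sum(n * math.log10(x) for x, n in groups) / total            # Eq. [5]
    variance = sum(n * (math.log10(x) - log_m) ** 2 for x, n in groups) / total
    return 10.0 ** log_m, 10.0 ** math.sqrt(variance)                     # Eq. [6]

M_est, sigma_g_est = geometric_parameters(TABLE_I)
print(M_est, sigma_g_est)
```

With these inputs the estimates fall close to, but not exactly on, the known values M = 5.01 µ and $\sigma_g$ = 2.071, as expected for counts gathered over finite intervals.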
The numerical values for M and $\sigma_g$ are correctly determined by these equations irrespective of the logarithmic base employed, since all conversion constants will cancel. Equation [6], which has been reproduced at least once in the literature without the $n_i$ term or summation symbols $\sum$, describes the skewness and dispersion of the curve. Another parameter occasionally desired is the arithmetic mean diameter $\delta$, defined as

$$ \delta = \frac{\sum_i n_i x_i}{\sum_i n_i} \qquad [7] $$

and related to the basic parameters M and $\sigma_g$ by

$$ \log\delta = \log M + 1.1513\,\log^2\sigma_g. \qquad [7a] $$

³ The equation for the frequency distribution of x itself is given by

$$ \frac{n}{\Delta x} = \frac{\sum n}{2.303\,x\,\log\sigma_g\,\sqrt{2\pi}} \exp\left[-\frac{(\log x-\log M)^2}{2\log^2\sigma_g}\right]. $$
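Equations [7] and [7a] can be compared numerically. The sketch below is illustrative: the function names are assumptions, and the column totals ($\sum nx$ = 7756.1, $\sum n$ = 1200) are taken from Table I.

```python
import math

def arithmetic_mean_from_parameters(M, sigma_g):
    """Eq. [7a]: log(delta) = log M + 1.1513 log^2(sigma_g), base-10 logs."""
    return 10.0 ** (math.log10(M) + 1.1513 * math.log10(sigma_g) ** 2)

def arithmetic_mean_from_counts(sum_nx, sum_n):
    """Eq. [7] applied to the column totals of grouped data."""
    return sum_nx / sum_n

# Model value from the known parameters of the hypothetical system.
delta_model = arithmetic_mean_from_parameters(5.01, 2.071)

# Value from the totals of columns 4 and 3 of Table I.
delta_counts = arithmetic_mean_from_counts(7756.1, 1200)

# Same relation written with natural logarithms, as a cross-check.
delta_natural = 5.01 * math.exp(0.5 * math.log(2.071) ** 2)
print(delta_model, delta_counts, delta_natural)
```

The model value reproduces the stated $\delta$ of about 6.53 µ, while the grouped counts give a slightly lower figure of about 6.46 µ.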
One useful but seldom encountered equation is the one describing the particle size $x_m$ about which will be clustered those particles with the greatest frequency of occurrence (8) (or, more precisely, for which the particle density $n/\Delta x$ is a maximum):

$$ x_m = M \exp\left[-\ln^2\sigma_g\right], \qquad [8] $$

or

$$ \log x_m = \log M - 2.303\,\log^2\sigma_g. \qquad [8a] $$
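For readers who wish to see where Eq. [8] comes from, one short route is sketched here in LaTeX. It assumes that the particle density $n/\Delta x$ is proportional to $x^{-1}$ times the Gaussian factor in ln x (the form given in footnote 3), and the symbol f(x) is introduced only for this derivation.

```latex
% Particle density as a function of x, up to a constant factor:
f(x) \propto \frac{1}{x}\,
      \exp\!\left[-\frac{(\ln x-\ln M)^2}{2\ln^2\sigma_g}\right]

% Setting the derivative of ln f(x) to zero locates the maximum:
\frac{d}{dx}\ln f(x)
  = -\frac{1}{x} - \frac{\ln x-\ln M}{x\,\ln^2\sigma_g} = 0
\quad\Longrightarrow\quad
\ln x_m = \ln M - \ln^2\sigma_g

% Hence Eq. [8], and Eq. [8a] after dividing by 2.303 with ln u = 2.303 log u:
x_m = M\exp\!\left[-\ln^2\sigma_g\right],
\qquad
\log x_m = \log M - 2.303\,\log^2\sigma_g

% Numerical check with M = 5.01 and sigma_g = 2.071:
\ln^2\sigma_g \approx 0.530,
\qquad
x_m \approx 5.01\,e^{-0.530} \approx 2.95
```

The numerical value agrees with the $x_m$ of 2.95 µ quoted for the hypothetical system of Table I.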
Equations [4] through [6] constitute the framework for all interpretations and analyses of those dispersed systems which follow a logarithmic Gaussian distribution. In actual practice, however, the determining of M and $\sigma_g$ can be done by simple graphical methods, thus obviating the necessity of tedious and time-consuming calculations. But for precise analysis, the equations are occasionally utilized, and their value should not be overlooked. A tabular summary of the equations needed for the determination of other parameters such as surface area and volume, either on a count or weight basis, is found in Drinker and Hatch (3).

DATA GATHERING AND GRAPHICAL ANALYSIS
A common method of gathering data involves some form of observation and measurement on individual particles until several hundreds or thousands have been observed. The number of particles n falling within a given size interval $\Delta x$ is noted, the mid-point of the interval being x. It is not generally appreciated, however, that in the accumulation of data, the counts should be tabulated over equal size intervals where possible. The shape of a plot of n vs. x can take on a variety of different forms, depending upon the inequality between different size intervals, with none of these curves accurately portraying a true size-frequency curve.

The illustration of this point can best be accomplished by referring to a hypothetical dispersed system which has a true log-normal distribution. In the first three columns of Table I are tabulated the essential data of such a system. (That it is a true log-normal distribution is shown by Fig. 3, wherein the per cent of all particles less than a stated size is plotted against the stated size on logarithmic probability paper. A straight line results, a fact initially illustrated by Hazen (9). Further reference will be made to Fig. 3 later on.) It is sufficient for our purposes to plot the data directly rather than converting values of n to frequencies through division by $\sum n$; the curves are identical in shape in either case. The solid curve of Fig. 1 is a plot of n vs. the mid-interval diameter x, and it can be seen that the shape of the curve
TABLE I
Data for a Log-Normal Distribution of M = 5.01 µ and σ_g = 2.071 µ

(1)          (2)        (3)    (4)      (5)      (6)      (7)      (8)       (9)          (10)         (11)
Size range   Mid size   n      nx       n/Δx     nx/Δx    y        n log x   Cumulative   Cumulative   % Frequency
(microns)    x (µ)                                                           total Σn     per cent φ   n/Σn

0-0.5        0.25       0      0        0        0        0.1      0         0            0            0
0.5-1.2      0.85       28     23.8     40.0     34.0     33.8     -2.0      28           2.33         2.33
1.2-1.8      1.5        67     100.5    111.7    167.5    166.7    11.8      95           7.92         5.58
1.8-2.4      2.1        92     193.2    153.3    322.0    322.0    29.6      187          15.58        7.67
2.4-3.0      2.7        102    275.4    170.0    459.0    458.1    44.0      289          24.08        8.50
3.0-3.8      3.4        134    455.6    167.5    569.5    570.4    71.2      423          35.25        11.17
3.8-4.8      4.3        149    640.7    149.0    640.7    643.0    94.4      572          47.67        12.42
4.8-6.0      5.4        145    783.0    120.8    652.5    654.0    106.2     717          59.75        12.08
6.0-7.2      6.6        111    732.6    92.5     610.5    612.1    91.0      828          69.00        9.25
7.2-8.4      7.8        84     655.2    70.0     546.0    546.7    74.9      912          76.00        7.00
8.4-10.0     9.2        81     745.2    50.6     465.8    464.2    78.1      993          82.75        6.75
10.0-11.6    10.8       56     604.8    35.0     378.0    377.0    57.9      1049         87.42        4.67
11.6-13.0    12.3       35     430.5    25.0     307.5    307.5    38.1      1084         90.33        2.92
13.0-15.0    14.0       35     490.0    17.5     245.0    243.1    40.1      1119         93.25        2.92
15.0-17.0    16.0       23     368.0    11.5     184.0    184.5    27.7      1142         95.17        1.92
17.0-19.2    18.1       17     307.7    7.7      139.9    138.8    21.4      1159         96.58        1.42
19.2-22.0    20.6       14     288.4    5.0      103.0    99.9     18.4      1173         97.75        1.17
22.0-25.0    23.5       9      211.5    3.0      70.5     69.2     12.3      1182         98.50        0.75
>25          -          18     ~450     -        -        -        25        1200         100.00       1.50

Total        -          1200   7756.1   1230.1   5895.4   5891.1   840       -            -            100.02
is certainly no indication of the fact that it represents a true log-normal distribution. The parameters of the system are M = 5.01 µ, $\sigma_g$ = 2.071 µ, $\delta$ = 6.53 µ, and $x_m$ = 2.95 µ.

What is really meant by a size or size-frequency distribution is a knowledge of the particle density, that is, the extent to which particles are grouped by count about some given diameter. The most convenient expression for particle density is particles per unit size interval, $n/\Delta x$. By the simple expedient of division by $\Delta x$ (column 5 in Table I), thereby normalizing all n values to an interval of 1 micron, the solid irregular curve is transformed into the smooth dotted curve of Fig. 1.⁴ Such a procedure closely approximates the number of particles which would have been observed if equal size intervals of 1.0 µ were originally taken, and simultaneously yields a curve more truly representative of the system.

⁴ The equation of this curve is given in footnote 3.

An alternative method for obtaining a smooth curve is available and
[FIG. 1. Particle count and particle density vs. size. Abscissa: particle size x (microns).]
offers the advantage that one may normalize to any interval $\Delta x$ rather than being restricted to the experimental values. One first graphically integrates the data by making a plot of cumulative per cent $\phi$ (column 10 of Table I) vs. the upper limit of the size interval. Such a plot tends to smooth out any experimental errors and fluctuations. The size-frequency curve is then obtained through graphical differentiation of the curve over equal size intervals by plotting $\Delta\phi/\Delta x$ vs. the mid-value of $\Delta x$. This procedure applied to the data of Table I would yield a curve identical in shape to the dotted curve of Fig. 1.

It was stated earlier that the physical significance of y is that it numerically equals $nx/\Delta x$. Values of $nx/\Delta x$ appear in column 6 of Table I; column 7 gives the values for y calculated from Eq. [4]. Each $nx/\Delta x$ value is in very close agreement with its corresponding y value. The reason they are not identical is that the equation yields values based upon a perfectly smooth curve representing an idealized model, whereas our calculations are based upon nonideal data: we are obliged to make our observations over finite intervals $\Delta x$, counting particles only to the nearest whole number. For the same reason, the calculation of M, $\sigma_g$, $\delta$, and $x_m$ from Eqs. [5] through [8], utilizing the data of Table I, yields values almost, but not quite, identical to the true known values.

In the case of true log-normal dispersions as represented by the dotted curve of Fig. 1, some authors have stated that such curves can be converted into symmetrical (Gaussian) curves which have a peak at the number median diameter M simply by substituting the logarithm of the size for
[FIG. 2. Particle density and size weighted particle density vs. size. Abscissa: particle size x (microns), logarithmic scale.]
the size itself in the graph. However, if one does this, as shown by the solid curve of Fig. 2, it is apparent that the curve can reach its peak value at no place other than previously, i.e., at that particle diameter of greatest frequency of occurrence $x_m$. It is true that the curve becomes symmetrical, but symmetrical about that particle size of greatest frequency $x_m$ and not the number median diameter M. If, however, we take each point on this curve and weight it according to the corresponding particle size x, the resulting values of $nx/\Delta x$ yield the dotted curve of Fig. 2, which is not only symmetrical but also reaches its peak at the number median diameter M of 5.01 µ. Since $nx/\Delta x$ and y are numerically equal, the dotted curve is effectively a plot of y vs. log x. The latter curve graphically illustrates that the particles are symmetrically grouped in some coherent manner with respect to the logarithms of their diameters, the axis of symmetry being the logarithm of the number geometric mean diameter M.

The dotted curve of Fig. 2 thus illustrates a graphical method for determining M. In actual practice, this type of plot is seldom utilized. A more expedient method is to make use of logarithmic probability paper, plotting the per cent $\phi$ of all particles less than a stated size vs. the logarithm of the stated size as shown in Fig. 3. It is important to note that this cumulative per cent $\phi$ should be plotted against the upper limit of the interval $\Delta x$, not the mid-value as some authors have directed. It can readily be seen that if n particles appear in the interval $\Delta x$, these and all smaller particles are, by definition, smaller than the upper limit of $\Delta x$, rather than its mid-value. From a log-probability plot we can at once determine the two parameters
[FIG. 3. Cumulative per cent $\phi$ of particles less than stated size vs. the stated size. Plot annotation: M = 5.01 µ, $\sigma_g$ = 2.071 µ. Probability axis: cumulative per cent $\phi$ of particles less than stated size.]
M and $\sigma_g$, which rigidly define the system. By the definition of M, its value is found at the 50 % mark. And it can be shown that $\sigma_g$ is given by

$$ \sigma_g = \frac{84.13\,\%\ \text{size}}{50\,\%\ \text{size}} = \frac{50\,\%\ \text{size}}{15.87\,\%\ \text{size}}. \qquad [9] $$
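Equation [9] can be illustrated numerically. In the Python sketch below, which is an assumption-laden illustration rather than the graphical procedure itself, the size at a given cumulative per cent is computed for an ideal log-normal system as M times $\sigma_g$ raised to the standard normal deviate; the function `size_at_percent` is a name introduced here, and `statistics.NormalDist` from the Python standard library supplies the inverse normal distribution.

```python
from statistics import NormalDist

M, SIGMA_G = 5.01, 2.071  # parameters of the hypothetical system of Table I

def size_at_percent(percent, M, sigma_g):
    """Diameter below which `percent` of the particles lie (ideal log-normal)."""
    z = NormalDist().inv_cdf(percent / 100.0)  # standard normal deviate
    return M * sigma_g ** z

x_50 = size_at_percent(50.0, M, SIGMA_G)
x_84 = size_at_percent(84.13, M, SIGMA_G)
x_16 = size_at_percent(15.87, M, SIGMA_G)

# Eq. [9]: both ratios estimate the geometric standard deviation.
ratio_upper = x_84 / x_50
ratio_lower = x_50 / x_16
print(x_50, ratio_upper, ratio_lower)
```

Both ratios return $\sigma_g$ to within the rounding of the percentages 84.13 and 15.87, and the 50 % size returns M.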
It is not uncommon to find experimental data yielding points on a log-probability plot which are scattered, the apparent degree of scattering being greater at the extremities. In establishing the best straight line, Kottler (10) has pointed out that preference should be given to those points lying closest to the mid-value of 50 cumulative per cent. The distance by which points are displaced from a straight line becomes increasingly significant as one considers points progressively closer to the 50 % mark. This is due to the distortion of the probability axis. For this reason, some investigators ignore points beyond certain limits, such as Drinker and Hatch (3), who generally fit the best straight line to those points within the 20 % to 80 % marks.
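A numerical counterpart of the straight-line fit on log-probability paper is sketched below. This is not the method of the text, which fits the line by eye: an ordinary least-squares line is swapped in here as an assumption, the name `fit_log_probability` is hypothetical, and only points between 20 % and 80 % are retained, following the practice attributed to Drinker and Hatch. The data are the upper size limits and cumulative per cents of Table I.

```python
import math
from statistics import NormalDist

# (upper limit of size interval in microns, cumulative per cent) from Table I
CUMULATIVE = [
    (0.5, 0.0), (1.2, 2.33), (1.8, 7.92), (2.4, 15.58), (3.0, 24.08),
    (3.8, 35.25), (4.8, 47.67), (6.0, 59.75), (7.2, 69.00), (8.4, 76.00),
    (10.0, 82.75), (11.6, 87.42), (13.0, 90.33), (15.0, 93.25),
    (17.0, 95.17), (19.2, 96.58), (22.0, 97.75), (25.0, 98.50),
]

def fit_log_probability(points, lo=20.0, hi=80.0):
    """Least-squares line of log10(size) against the normal deviate.

    Returns (M, sigma_g): the intercept is log M (50 % mark) and the
    slope is log sigma_g (one standard deviate along the probability axis).
    """
    kept = [(NormalDist().inv_cdf(p / 100.0), math.log10(x))
            for x, p in points if lo <= p <= hi]
    count = len(kept)
    z_bar = sum(z for z, _ in kept) / count
    l_bar = sum(l for _, l in kept) / count
    slope = (sum((z - z_bar) * (l - l_bar) for z, l in kept)
             / sum((z - z_bar) ** 2 for z, _ in kept))
    intercept = l_bar - slope * z_bar
    return 10.0 ** intercept, 10.0 ** slope

M_fit, sigma_g_fit = fit_log_probability(CUMULATIVE)
print(M_fit, sigma_g_fit)
```

Because the tabulated system is a true log-normal distribution, the fitted values land close to M = 5.01 µ and $\sigma_g$ = 2.071; with scattered experimental points the choice of limits would matter more.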
A true log-normal dispersion will always be represented by a straight line on log-probability paper. This does not mean, however, that the converse is always true. A true log-normal system will be continuously variable in size, with the result that no matter what particle size within the effective distribution range one may consider, a finite number of particles will be grouped around it. This fact places upon the investigator the responsibility of sampling over small enough intervals $\Delta x$ to assure that a continuous spectrum of particle sizes is present, at the same time not making the intervals so small that the time required to accumulate a significant number of counts is excessive.

The use of logarithmic probability paper thus becomes an expedient means of determining M and $\sigma_g$ where the data plot as a straight line. By a series of subtractions of one value of $\phi$ from another at equal intervals of 1 µ particle size, the size-frequency distribution curve can be obtained. Where the experimental points are too scattered for a good straight line fit to be visually ascertained, a mathematical determination using the above equation may be more accurate. Kottler (11) presents an algebraic method of curve fitting for precise analysis. Aitchison and Brown (8) present a comprehensive discussion of the log-normal law, including a comparison of several alternative methods of mathematical data analysis and a treatment of the special problems arising when physical limitations prevent measurements over part of the range of particle sizes.

SUMMARY

A logarithmic variant of the normal Gaussian distribution law serves as a mathematical model for many dispersed systems. Misinterpretation of this model has often led to the erroneous interpretation of data and reporting of experimental results. It must constantly be borne in mind that the model is not the dispersed system itself.
The mathematical equations deal with probability dynamics; as such, they can at best only closely approximate a description of a given dispersed system, never rigorously define it in its entirety. Such approximations are entirely adequate for all practical purposes. Equation [4] is a common form of the log-normal distribution law. In the form shown, the variable y finds physical significance in that it numerically equals $nx/\Delta x$. The two parameters M and $\sigma_g$ uniquely define any given system. In handling data, the most meaningful way of speaking of particle size distributions is to consider particle density $n/\Delta x$, where n is the number of particles grouped within a small size interval $\Delta x$, the mid-value of the interval being x. Particle counts ideally should be gathered over uniform size intervals when one wishes to obtain a graphical portrayal of the size-frequency distribution as illustrated by the dotted curve of Fig. 1. If this
is not possible, each n value should be normalized to a consistent size interval through division by $\Delta x$. The intervals chosen for counting should be small enough to insure that a continuum of particle sizes is present. For each $\Delta x$ interval considered, a sufficient number of counts should be gathered to insure statistical significance. This consideration must often be waived at the extreme ends of the distribution, where the small relative number of particles would require excessive counting times.

The most expedient method of analyzing the data is to plot the per cent of all particles less than a stated size vs. the stated size on logarithmic probability paper, each point being plotted against the upper limit of the corresponding size range. For such plots, it is not necessary to acquire data over equal size intervals or otherwise normalize them unless there is some doubt as to the continuity of size distributions. The geometric mean particle size (number median diameter) M is immediately found at the 50 cumulative per cent intercept. The geometric standard deviation $\sigma_g$ is readily found from Eq. [9].

REFERENCES

1. DRINKER, P., "The size-frequency and identification of certain phagocytosed dusts," J. Ind. Hyg. 7, 305 (1925).
2. HATCH, T., AND CHOATE, S. P., "Statistical description of the size properties of non-uniform particulate substances," J. Franklin Inst. 207, 369 (1929).
3. DRINKER, P., AND HATCH, T., "Industrial Dust," 2nd ed. McGraw-Hill, New York, 1954.
4. GREEN, H. L., AND LANE, W. R., "Particulate Clouds: Dusts, Smokes and Mists." Van Nostrand, New York, 1957.
5. ORR, C., JR., AND DALLAVALLE, J. M., "Fine Particle Measurement." Macmillan, New York, 1959.
6. "Symposium on Particle Size Measurement," Am. Soc. Testing Mater. Spec. Tech. Publ. No. 234, 1959.
7. IRANI, R. R., AND CALLIS, C. F., "Particle Size: Measurement, Interpretation, and Application." Wiley, New York, 1963.
8. AITCHISON, J., AND BROWN, J. A. C., "The Lognormal Distribution." Cambridge University Press, 1957.
9. HAZEN, A., "Storage to be provided in impounding reservoirs," Trans. Am. Soc. Civil Engrs. 77, 1539 (1914).
10. KOTTLER, F., "The distribution of particle sizes," J. Franklin Inst. 250, 339, 419 (1950).
11. KOTTLER, F., "The goodness of fit and the distribution of particle sizes," J. Franklin Inst. 251, 499, 617 (1951).