Scripta Materialia 52 (2005) 1281–1285 www.actamat-journals.com
The application of information entropy to the estimation of three-dimensional grain or particle size distributions from materialographic sections

R.J. McAfee, I. Nettleship*

Department of Materials Science and Engineering, University of Pittsburgh, 848 Benedum Hall, Pittsburgh, PA 15261, USA

Received 24 January 2005; received in revised form 18 February 2005; accepted 22 February 2005
Available online 24 March 2005
Abstract

Information entropy was applied to size class selection for the unfolding of grain size distributions and the results were compared with traditional forward and inverse methods using arbitrarily selected size classes. This comparison shows that information entropy provides a better representation of the data.
© 2005 Acta Materialia Inc. Published by Elsevier Ltd. All rights reserved.

Keywords: Microstructure; Particle size
1. Introduction

The estimation of three-dimensional grain or particle size distributions from materialographic sections is usually solved numerically for a population using discrete grain size intervals. The suitability of such a representation will obviously depend on the way the data are partitioned into classes. The numerical techniques involve the calculation of a coefficient matrix, P, based on the geometrical probability and statistics of the intersection between a sampling plane and microstructural features. This matrix maps the number of particles or grains per unit volume, N_V, in a particular size class to the number of sections per unit area, N_A, in a particular section size class [1]:

N_A = P N_V    (1)

* Corresponding author. E-mail address: [email protected] (I. Nettleship).
The techniques can be divided into two main categories known as forward and inverse methods. The forward methods [2–4] iteratively adjust a theoretical distribution in grain size until the calculated distribution in section size is in good agreement (usually by χ² statistics) with the measured distribution of section size. The inverse method obtains a theoretical distribution directly from the measured section data by inverting the probability matrix in Eq. (1), and does not require the arbitrary choice of a theoretical size distribution. Typically, in both the forward and inverse methods, the distributions are divided into classes of equal increments in diameter or area. The arbitrary selection of class increments can result in poor agreement between measured and calculated section distributions for the forward methods, especially in the small size classes. In the case of the inverse method, the choice of classes can result in negative frequencies in the grain size distribution, which are often overcome by reducing the number of classes. These problems are typically attributed to an accumulation of error in the matrix operations. However, they may be due primarily to the inefficient use of the available measured section data.
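As an illustration, the inverse method built on Eq. (1) can be sketched for the classical case of spherical grains. The choice of class boundaries, the assignment of each sphere class to its upper boundary diameter, and the function names below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def saltykov_matrix(bounds):
    """Coefficient matrix P mapping N_V (grains per unit volume) to
    N_A (sections per unit area) for spherical grains, using class
    boundaries `bounds` (length k+1, ascending diameters). Each sphere
    class is represented by its upper boundary diameter."""
    k = len(bounds) - 1
    D = bounds[1:]                      # representative sphere diameter per class
    P = np.zeros((k, k))
    for j in range(k):                  # sphere size class j
        for i in range(j + 1):          # section size class i <= j
            lo, hi = bounds[i], min(bounds[i + 1], D[j])
            # (probability that a random plane section of a sphere of
            # diameter D_j has diameter in [lo, hi]) times the mean
            # caliper diameter D_j; the factors of D_j cancel
            P[i, j] = np.sqrt(D[j]**2 - lo**2) - np.sqrt(D[j]**2 - hi**2)
    return P

def unfold(N_A, bounds):
    """Inverse method: solve N_A = P N_V directly for N_V."""
    P = saltykov_matrix(np.asarray(bounds, dtype=float))
    return np.linalg.solve(P, np.asarray(N_A, dtype=float))
```

Because P is upper triangular with a positive diagonal, the system is always solvable; the negative frequencies discussed below arise from measurement noise in N_A, not from singularity of P.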
doi:10.1016/j.scriptamat.2005.02.022
To overcome the problem of negative frequencies and effectively increase the resolution (number of size classes) of the inverse method, smoothing of the distribution of grain sections using an assumed distribution type (Weibull, log-normal, etc.) has been employed [5]. The present study was designed to demonstrate the use of information entropy to select appropriate class boundaries and improve the resolution of the unfolding technique. This new method allows for the optimum choice of interval boundaries and the calculation of the coefficient matrix, which can then be used in the inverse or the forward method. Furthermore, relative entropy can be used as a tool to discriminate between different ways of defining class boundaries.

2. The analysis of the unfolding method using information entropy

Information entropy (S_I) is a concept developed in information theory to ensure that a collection of discrete data points is properly represented when it is partitioned into classes. It can be defined as follows [6]:

S_I = -\sum_{i=1}^{k} f_i \ln(f_i)    (2)
where f_i is the interval relative frequency, which is also the probability that a data point will be in interval i, and k is the total number of intervals. Information entropy is therefore a measure of the difference in information content of each interval. The maximum entropy (S_I^{Max}) [6] of a data set is obtained for a set of intervals in which a data point has an equal probability of residing in any of the intervals (f_i = f_j for all i and j), and is given by:

S_I^{Max} = -\sum_{i=1}^{k} \frac{1}{k} \ln\left(\frac{1}{k}\right) = \ln(k)    (3)

The maximum entropy can therefore be used to quantify the ability of any interval scheme to suitably represent the information, and can be employed to avoid a situation in which intervals have very few or very many data points relative to the average. To compare distributions with different numbers of classes, the relative information entropy (RIE, \bar{S}_I) is used. RIE is the entropy of a particular interval system normalized by the maximum entropy (the logarithm of the number of classes, Eq. (3)):

\bar{S}_I = -\sum_{i=1}^{k} f_i \frac{\ln(f_i)}{\ln(k)}    (4)
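Eqs. (2)–(4) can be sketched directly in code; the function name and the convention 0 ln 0 = 0 for empty intervals are assumptions for this illustration:

```python
import numpy as np

def relative_information_entropy(data, bounds):
    """Relative information entropy (Eq. (4)) of `data` partitioned
    by the class boundaries `bounds`: S_I / ln(k)."""
    counts, _ = np.histogram(data, bins=bounds)
    f = counts / counts.sum()             # interval relative frequencies f_i
    f = f[f > 0]                          # 0 ln 0 -> 0 by convention
    S_I = -np.sum(f * np.log(f))          # Eq. (2)
    return S_I / np.log(len(bounds) - 1)  # normalize by S_I^Max = ln k, Eq. (3)
```

A binning whose intervals hold equal fractions of the data (quantile boundaries) attains the maximum, RIE = 1; strongly unequal intervals give values well below 1.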
RIE has been used in previous work to compare five different class systems used in grain size analysis by image analysis of grains based on their maximum projected area without unfolding [7].
Rather than consider the distributions of section and grain size as isolated systems, in this study the entropy of the combined systems was used, including the distribution by number for cross-sections (NSD) and the grain size distributions by number (NPD) and by volume (VPD). This was done because of the effect of frequency weighting on the choice of intervals; the combined system allows the best choice of intervals independent of which frequency weighting is used. The RIE for the cross-section distribution (\bar{S}_I^{NSD}) is calculated using Eq. (4) with f_i equal to the fraction of cross-sections that fall within the range of size class i. Similarly, f_i equals the number fraction and volume fraction of grains in the size range of class i for NPD and VPD, respectively. The average RIE (\bar{S}_I^{Avg}) is defined as the arithmetic mean for the three distributions:

\bar{S}_I^{Avg} = \frac{1}{3}\left(\bar{S}_I^{NSD} + \bar{S}_I^{NPD} + \bar{S}_I^{VPD}\right)    (5)
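The average of Eq. (5) can be sketched as follows. For illustration the grain-side fractions are computed from a list of per-grain equivalent diameters; in the paper they would come from the unfolded class frequencies, and the function name is an assumption:

```python
import numpy as np

def average_rie(section_sizes, grain_diams, bounds):
    """Average RIE (Eq. (5)) over the cross-section distribution (NSD),
    the grain distribution by number (NPD) and by volume (VPD),
    for one common set of class boundaries."""
    def rie(frequencies):
        f = frequencies / frequencies.sum()
        f = f[f > 0]
        return -np.sum(f * np.log(f)) / np.log(len(bounds) - 1)

    grain_diams = np.asarray(grain_diams, dtype=float)
    nsd, _ = np.histogram(section_sizes, bins=bounds)   # sections per class
    npd, _ = np.histogram(grain_diams, bins=bounds)     # grains per class
    vpd, _ = np.histogram(grain_diams, bins=bounds,
                          weights=np.pi * grain_diams**3 / 6)  # volume weighting
    return (rie(nsd) + rie(npd) + rie(vpd)) / 3
```

The volume weighting skews the VPD toward the large size classes, which is why, as noted above, no single set of boundaries can bring all three RIE values to 100% simultaneously.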
The maximum value of an individual RIE is 100%. However, it is not possible to achieve the maximum for the average over all systems, due to the weighting of the distribution by volume.

2.1. Method employed to adjust class boundaries in the inverse method for high RIE

The goal of this study was to make the best possible use of the available data in order to provide the best estimate of the grain size distribution. To achieve this goal, an inverse method was developed which allows the class boundaries in the size distributions to be adjusted in a manner that maximizes the entropy of the system. This requires an iterative technique in which the average relative entropy, \bar{S}_I^{Avg}, is recalculated after each adjustment of the class boundaries. An inverse method was selected because the theoretical grain size distribution is calculated directly from the experimental data, the distribution in grain section sizes. This eliminates the need to iterate over both the class boundaries and the theoretical distribution of grain size. The final results would be equivalent using either a forward or an inverse method when maximum entropy is achieved. In effect, this method allows the distribution in section size to determine the position of the class boundaries.

2.2. Methods of initialization and adjustment of boundaries for the inverse method

Various methods of initializing and adjusting the positions of the class boundaries, as well as altering the number of classes in the inverse method, can be used to identify the best solution using the entropy method. The primary initialization method employed in this work began with class boundaries of equal increments
in diameter, as in the standard inverse methods. If the desired number of classes could not be obtained due to the occurrence of negative frequencies, the maximum number of classes before negative frequencies occur can be used in the initialization; class boundaries can then be inserted between the equal increment boundaries, with adjustment of the boundaries, until the desired number of boundaries is achieved. Software was developed so that the boundaries could be adjusted by repeating a cycle of sequentially adjusting boundaries and accepting adjustments that result in an increase in entropy. Various strategies were used to obtain solutions in the shortest amount of computation time. One variation of this method involves repeatedly changing the direction of the sequence of class adjustments, beginning with the smallest grain size class and then switching to begin the sequence with the class for the largest grains. Other variations include randomly selecting the direction of the sequence or randomly selecting class boundaries for adjustment. All of these methods can be implemented rather quickly; in fact, they may require only slightly more code than a more traditional forward or inverse method, and they involve relatively short computation times.

3. Procedures

Samples of porous alumina ceramic were prepared, sectioned, polished and etched before observation by SEM [8]. The alumina grain boundaries were reconstructed for at least 2000 grains using image analysis. Then the distribution of equivalent sphere sizes was unfolded with the inverse method using three different methods to choose the classes. These included both equal diameter (linear) and equal area (geometric) intervals, as well as intervals chosen using information entropy. While many other representative grain shapes have been explored, the grain shape assumption is the same in each distribution calculated in this study and did not affect the comparisons. The method of selecting an appropriate shape is given elsewhere [9]. Finally, the results of the inverse unfolding method using information entropy were compared with the results from the forward method using equal diameter intervals. In each comparison the relative information entropy was calculated to compare the distributions in terms of their ability to represent the data.

4. Results and discussion

Fig. 1 shows a comparison of cumulative distributions for the unfolded results from the alumina sample using the inverse method. Negative frequencies occurred
Fig. 1. A comparison of the grain size distributions achieved with the inverse method using different types of class intervals. Distributions are shown for equal diameter increments (7 classes), equal area increments (3 classes) and information entropy derived classes (25 and 95 classes).
in the unfolded data when the number of classes was greater than 7 for the equal diameter intervals and 3 for the equal area intervals. It is clear that the occurrence of negative frequencies when using more classes severely reduced the ability of the distributions to represent the data. The \bar{S}_I^{Avg} for the equal diameter increments with 7 classes was 55%, and for the 3 equal area classes the \bar{S}_I^{Avg} was slightly improved, at 61%. In both cases the low number of intervals for grain sizes below 1 μm, where the cumulative frequency changes from 0% to over 80%, is thought to lead to overestimation of the cumulative frequency in this size range. The inverse information entropy method was also applied to the same section size data. For initialization, equal diameter increments were used, followed by iterative maximization of \bar{S}_I^{Avg} for the distribution. The number of classes was then increased and the entropy maximization iterations were begun again for comparison. Fig. 1 shows that the maximum number of classes achieved for this sample by this method was 95, without smoothing the distribution in grain section size. Also shown in Fig. 1 is a distribution obtained by information entropy using 25 classes. The distributions determined by the information entropy techniques are very similar and predict significantly lower frequencies in the grain size distribution below 1 μm than the 7 equal diameter or 3 equal area classes. The average relative information entropies for the inverse information entropy distributions were much higher: 25 intervals gave a \bar{S}_I^{Avg} of 97%, while that of the distribution with 95 intervals was lower at 94%. The distribution obtained using 95 class intervals was used only to illustrate the ability of the technique to increase resolution.
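The iterative boundary-adjustment cycle described in Section 2.2 can be sketched as follows, simplified here to maximize the RIE of a single distribution rather than the full average \bar{S}_I^{Avg} over the three weightings; the step size, sweep limit and function names are illustrative assumptions:

```python
import numpy as np

def maximize_rie(data, bounds, step=0.02, sweeps=50):
    """Greedy boundary adjustment: sweep over the interior class
    boundaries, nudging each up or down by `step`, and keep any move
    that raises the relative information entropy of the binning."""
    def rie(b):
        counts, _ = np.histogram(data, bins=b)
        f = counts / counts.sum()
        f = f[f > 0]
        return -np.sum(f * np.log(f)) / np.log(len(b) - 1)

    bounds = np.asarray(bounds, dtype=float)
    best = rie(bounds)
    for _ in range(sweeps):
        improved = False
        for i in range(1, len(bounds) - 1):       # interior boundaries only
            for d in (+step, -step):
                trial = bounds.copy()
                trial[i] += d
                if not (trial[i - 1] < trial[i] < trial[i + 1]):
                    continue                       # keep boundaries ordered
                s = rie(trial)
                if s > best:
                    bounds, best = trial, s
                    improved = True
        if not improved:
            break                                  # converged: no move helps
    return bounds, best
```

Since only entropy-increasing moves are accepted, the returned RIE is never lower than that of the initial equal-increment boundaries, mirroring the initialization strategy used in the paper.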
Fig. 2. The cumulative frequency undersize distributions for alumina sintered at 1350 °C for 5.1 h using an increasing number of classes (15, 20, 25, 35 and 95) by the entropy-based method and a spherical shape assumption.
The ability of the information entropy method to represent the data more consistently is reflected in the insensitivity of the distribution to the number of intervals chosen for the analysis. Fig. 2 compares the distributions for the information entropy based method with 15, 20, 25, 35 and 95 intervals. A previous study tested different distribution functions in terms of their ability to characterize the distributions in the unfolded data from this sample [8]; the log-normal distribution function was found to be the most appropriate. As the number of intervals increased, the median grain size increased slightly from 0.52 μm to 0.54 μm and the standard deviation was almost unchanged, increasing from 1.33 for 15 intervals to 1.36 for 95 intervals. While the larger numbers of intervals give slightly lower cumulative frequencies above 0.5 μm, it is clear that the distribution characteristics are not significantly affected by the number of intervals. It is therefore possible to conclude that the information entropy method for choosing class boundaries makes better use of the available data than the arbitrary class selection methods. While the problems associated with the occurrence of negative frequencies can be avoided by using the forward method, there is still no guarantee that the original choice of intervals is the most appropriate representation of the data. Equal diameter or equal area intervals are also commonly used for the forward method. Fig. 3 shows a comparison of the unfolded distributions, using 25 classes, for the alumina data between the forward method and the inverse method using information entropy to choose the class boundaries. The forward method using equal diameter intervals is used because its \bar{S}_I^{Avg} was higher than that for equal area intervals.
Note that the information entropy method places a larger number of classes in the size range where the rate of change of frequency is highest resulting in a better representation of the data. The advantage of the information entropy method is reflected in the values
Fig. 3. The cumulative frequency undersize distribution for the forward method with 25 equal diameter classes (linear) and the distribution calculated using the inverse method with 25 information entropy classes.
of \bar{S}_I^{Avg} for the distributions: 84% for the forward method and 95% for the inverse method with information entropy.
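The median and geometric standard deviation quoted above for the log-normal characterization can be estimated from a binned size distribution as in this sketch; the moment-based estimate in ln(d), the class-midpoint representation and the function name are illustrative assumptions, not necessarily the fitting procedure used in the original study:

```python
import numpy as np

def lognormal_summary(class_mids, frequencies):
    """Median and geometric standard deviation of a binned size
    distribution under a log-normal assumption: mean and standard
    deviation of ln(d) weighted by class frequency."""
    f = np.asarray(frequencies, dtype=float)
    f = f / f.sum()
    logs = np.log(np.asarray(class_mids, dtype=float))
    mu = np.sum(f * logs)                          # mean of ln(d)
    sigma = np.sqrt(np.sum(f * (logs - mu)**2))    # std dev of ln(d)
    return np.exp(mu), np.exp(sigma)               # median, geometric SD
```

Because both statistics are ratios of log-moments, they depend only weakly on the number of classes, consistent with the insensitivity reported for the entropy-based binnings.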
5. Conclusions

The comparisons presented in this work have illustrated the effect of the selection of class intervals on the estimated grain size distributions obtained with forward and inverse methods based on geometric probability and statistics. The use of information entropy to set size class boundaries improves the representation of the data, and may also increase the resolution of the technique by dramatically increasing the number of classes that can be used with the inverse methods without producing negative frequencies. This improves the inverse method without artificially smoothing the original distribution in grain sections. The average relative information entropy, as defined here, can be used as a measure of the ability of a chosen distribution to represent the data, irrespective of the method used to estimate the grain size distribution.
Acknowledgement

The authors gratefully acknowledge the support of the National Science Foundation under grant DMII 9800430.
References

[1] Saltykov SA. Stereometric metallurgy. English translation by Technical Documents Liaison Office, MCLTD, Wright-Patterson Air Force Base, Ohio; 1961.
[2] Schwartz DM. J Microsc 1972;96:25.
[3] Wasen J, Warren R. Mater Sci Technol 1989;5:222.
[4] Bucki JJ, Kurzydlowski KJ. Mater Charact 1992;29:365.
[5] Fang Z, Patterson BR, Turner ME Jr. Mater Charact 1992;31:177.
[6] Shannon CE, Weaver W. The mathematical theory of communication. University of Illinois Press; 1963.
[7] Full WE, Erlich R, Kennedy S. Morphological analysis. In: Particle characterization in technology, vol. II. Boca Raton, FL: CRC Press; 1984. p. 136.
[8] Nettleship I, McAfee RJ, Slaughter WS. J Am Ceram Soc 2002;85:1954.
[9] McAfee RJ, Nettleship I. Acta Mater 2003;51(15):4603.