Computerized Medical Imaging and Graphics, Vol. 15, No. 1, pp. 3-9, January-February 1991
Printed in the USA. All rights reserved.
0895-6111/91 $3.00 + .00
Copyright © 1991 Pergamon Press plc
CLASSIFICATION OF ULTRASONIC IMAGE TEXTURE BY STATISTICAL DISCRIMINANT ANALYSIS AND NEURAL NETWORKS

John S. DaPonte
Computer Science Department, Southern Connecticut State University, New Haven, CT 06515

and

Porter Sherman
Department of Computer Science and Engineering, University of Bridgeport, Bridgeport, CT 06601
(Received 1 March 1990; Revised 25 June 1990)
Abstract-In this paper the abilities of two common statistical discriminant analysis procedures are compared with those of two commercial neural network software packages. The major objective of this study was to determine which of the procedures could best discriminate between normal and abnormal ultrasonic liver textures. The same set of features was input into both statistical discriminant analysis procedures and both neural network models. Preliminary results have found the Restricted Coulomb Energy (RCE) neural network model to have a testing accuracy of 90.6%, which is approximately 10 percentage points better than any of the other techniques investigated.

Key Words: Ultrasound, Image texture, Linear discriminant analysis, Neural networks, Nearest neighbor analysis

INTRODUCTION

In the past, ultrasonic image texture has been classified by statistical discriminant analysis (1, 2). This earlier work has typically consisted of calculating a large number of quantitative descriptors, identifying the descriptors which can best distinguish between normal and abnormal liver texture, and applying statistical discriminant analysis to the selected features. In some cases, statistical discriminant analysis is based on assuming that the features are from a multinormal distribution, while in other cases no assumption is made about the distribution of the data being analyzed. In recent years neural networks have emerged as an alternative method to statistical discriminant analysis for pattern detection (3).

The objective of this study was to apply different types of computer controlled classification algorithms to ultrasonic image texture for the purpose of distinguishing between normal livers and livers with diffuse disease. In addition to applying the more traditional statistical discriminant analysis we have also used neural networks. Results obtained from linear discriminant analysis, a parametric procedure, and nearest neighbor discriminant analysis, a nonparametric procedure, are compared to results obtained from two different commercial neural network software packages. The two neural network software packages used were Nestor Network Development System and NeuralWare Professional II.

The Nestor system uses Restricted Coulomb Energy (RCE), which consistently achieved a classification accuracy of 100% on a training set of 96 cases and achieved a classification accuracy of 90.6% on a testing set of 32 cases not included in the training set. For the NeuralWare package the counterpropagation neural network model was used, and the best results yielded an 89.5% classification accuracy for the training set and 81.2% accuracy for the test set. The classification accuracy from the counterpropagation network was equal to the best statistical discriminant results, while the Nestor RCE network performed approximately 10 percentage points better than all other procedures.
THEORY

Three categories of quantitative variables have been used as features in this study. They consist of descriptors associated with gray level histograms (4), co-occurrence matrices (5, 6) and image gradients (1). Variables associated with gray level histograms, sometimes called first order statistics, are used to describe the shape and location of the histogram. The mean gray level intensity, which is a measure of the histogram’s central tendency, provides an indication of image brightness. Other first order statistics include the standard deviation, which is a measure of histogram dispersion, the skewness, which determines if the histogram is skewed to the right or left of center, and the kurtosis, which measures how peaked a histogram is.
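As an illustration only, the four first order statistics named above can be computed from the gray levels of a region of interest as in the following sketch. The function name `first_order_stats`, the use of population moments (dividing by the number of pixels), and the sample values are assumptions made for this example and are not taken from the programs used in the study.

```python
import math


def first_order_stats(pixels):
    """Return mean, standard deviation, skewness, and kurtosis of gray levels."""
    n = len(pixels)
    mean = sum(pixels) / n
    # Central moments of order 2, 3, and 4 (population form, dividing by n).
    m2 = sum((p - mean) ** 2 for p in pixels) / n
    m3 = sum((p - mean) ** 3 for p in pixels) / n
    m4 = sum((p - mean) ** 4 for p in pixels) / n
    std = math.sqrt(m2)
    # Skewness and kurtosis are undefined for a constant region; report 0.0 there.
    skewness = m3 / std ** 3 if std > 0 else 0.0
    kurtosis = m4 / m2 ** 2 if m2 > 0 else 0.0
    return {"mean": mean, "std": std, "skewness": skewness, "kurtosis": kurtosis}


# Hypothetical flattened gray levels; a real ROI in this study has 64 x 64 pixels.
roi = [10, 10, 12, 14, 14]
stats = first_order_stats(roi)
print(stats)
```

A symmetric histogram such as the one in this toy example has a skewness of zero; the kurtosis here is the plain fourth standardized moment rather than the excess kurtosis.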
Descriptors computed from co-occurrence matrices are sometimes referred to as second order statistics. Of particular importance in this study were the maximum probability, contrast, inverse element difference, entropy, uniformity, and correlation. The maximum probability is the largest value found in the normalized co-occurrence matrix. Both the contrast and the inverse element difference are measures of how diagonalized the co-occurrence matrix is. When the co-occurrence matrix contains many values near the diagonal, the contrast is small but the inverse element difference is large. The entropy and uniformity, sometimes called homogeneity, are measures of randomness within a co-occurrence matrix. The co-occurrence matrix for a homogeneous image will have a small number of cells with large values, while the co-occurrence matrix for a nonhomogeneous image will have a large number of small entries. The uniformity is larger in a homogeneous image than in a nonhomogeneous image. The opposite is true of the entropy, which is generally inversely related to the uniformity. The correlation provides a correlation coefficient of the rows and columns of a normalized co-occurrence matrix.

Parameters associated with image gradient analysis describe local differences in gray level intensity. The parameters used in this study are based on the absolute value of the gradient calculated from pixels immediately adjacent to the pixel of interest. In particular, the mean gradient absolute value and the standard deviation of gradient absolute value have been used as discriminators in this study.

The Restricted Coulomb Energy (RCE) system used by Nestor is a multistage hierarchical architecture that uses a large number of interconnected processing elements. It is composed of a three layer, feed forward network that acts like a “High Storage Density” model of a collection of electrical charges in pattern space (7).
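The second order statistics and gradient descriptors described earlier in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the pixel offset (one pixel to the right), the form `1 / (1 + (i - j)^2)` for the inverse element difference, the base-2 logarithm in the entropy, the forward-difference gradient `|gx| + |gy|`, and all function names are choices made for the example.

```python
import math


def cooccurrence(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix for pixel pairs offset by (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """Six second order descriptors of a normalized co-occurrence matrix p."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mu_i = sum(i * v for i, j, v in cells)
    mu_j = sum(j * v for i, j, v in cells)
    var_i = sum((i - mu_i) ** 2 * v for i, j, v in cells)
    var_j = sum((j - mu_j) ** 2 * v for i, j, v in cells)
    cov = sum((i - mu_i) * (j - mu_j) * v for i, j, v in cells)
    denom = math.sqrt(var_i * var_j)
    return {
        "max_probability": max(v for _, _, v in cells),
        # Small when the mass lies near the diagonal.
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        # Large when the mass lies near the diagonal (assumed form).
        "inverse_element_difference": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
        "uniformity": sum(v * v for _, _, v in cells),
        "correlation": cov / denom if denom > 0 else 0.0,
    }


def gradient_stats(image):
    """Mean and standard deviation of |gx| + |gy| from adjacent pixels."""
    mags = []
    for r in range(len(image) - 1):
        for c in range(len(image[0]) - 1):
            gx = image[r][c + 1] - image[r][c]
            gy = image[r + 1][c] - image[r][c]
            mags.append(abs(gx) + abs(gy))
    mean = sum(mags) / len(mags)
    std = math.sqrt(sum((m - mean) ** 2 for m in mags) / len(mags))
    return mean, std


# Hypothetical 4 x 4 image with four gray levels.
example = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 2, 2, 2], [2, 2, 3, 3]]
features = texture_features(cooccurrence(example, levels=4))
print(features)
print(gradient_stats(example))
```

For a perfectly homogeneous image the sketch returns a uniformity of one and an entropy of zero, which matches the inverse relationship between the two descriptors noted in the text.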
The pattern recognition ability of the RCE system is a function of “influence regions” that are mapped into the network during training (i.e., as the system trains on patterns, it adjusts groups of processing elements to represent regions of influence specific to each presented pattern). When the trained system is presented with test patterns, the RCE system will try to relate the test patterns to the previously learned “influence regions.” If the RCE system is not “certain” about a specific pattern, it has the ability to make an “uncertain” decision as to the identity of the pattern. Although the inner workings of the RCE system are hidden in the application software, the user has the ability to “tweak” the system (e.g., enlarging or shrinking the influence regions, making the decision criteria more liberal, etc.). The user is also able to view every aspect of the training scenario through a complete diagnostics package.
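Because the inner workings of the Nestor software are not exposed, the following is only a simplified sketch of the general idea of influence regions, not the Nestor implementation. It assumes spherical regions with a Euclidean radius, a shrink-on-conflict training rule, and an “uncertain” response when a pattern falls in no region or in regions of more than one class. The class name `RCESketch` and its parameters are hypothetical.

```python
import math


class RCESketch:
    """Toy prototype classifier with adjustable influence regions."""

    def __init__(self, max_radius=5.0, min_radius=1e-3):
        self.max_radius = max_radius
        self.min_radius = min_radius
        self.prototypes = []  # each entry is [center, radius, label]

    def _covering(self, x):
        # Prototypes whose influence region contains x.
        return [p for p in self.prototypes if math.dist(p[0], x) < p[1]]

    def fit(self, patterns, labels, max_epochs=20):
        for _ in range(max_epochs):
            changed = False
            for x, y in zip(patterns, labels):
                hits = self._covering(x)
                # Shrink any wrong-class region so it no longer contains x.
                for p in hits:
                    if p[2] != y:
                        p[1] = max(self.min_radius, math.dist(p[0], x))
                        changed = True
                # Commit a new prototype if no same-class region covers x.
                if not any(p[2] == y for p in hits):
                    rivals = [math.dist(p[0], x) for p in self.prototypes if p[2] != y]
                    radius = min([self.max_radius] + rivals)
                    self.prototypes.append([tuple(x), max(self.min_radius, radius), y])
                    changed = True
            if not changed:
                break
        return self

    def classify(self, x):
        classes = {p[2] for p in self._covering(x)}
        return classes.pop() if len(classes) == 1 else "uncertain"


# Hypothetical two-feature training patterns.
patterns = [(0, 0), (1, 0), (4, 0), (5, 0)]
labels = ["normal", "normal", "abnormal", "abnormal"]
net = RCESketch(max_radius=3.0).fit(patterns, labels)
print(net.classify((0.5, 0)), net.classify((2, 0)), net.classify((10, 0)))
```

In this sketch, enlarging or shrinking `max_radius` plays the role of the user adjusting the influence regions, and a point covered by regions of both classes or by none is reported as “uncertain”.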
The NeuralWare software package offers eleven neural network models and the capability to design custom networks. The counterpropagation network was chosen because it is simple and it can be used for rapid prototyping (8) where fast training time is desired (as was our case). The counterpropagation network combines two separate neural network paradigms into one system. It is a two layer, bidirectional mapping network. It acts as an adaptive pattern classifier using a control strategy that functions like a “look-up table.” The first layer of the network is self-organizing, using the Kohonen learning technique. This layer uses competitive learning (i.e., the processing element that is closest to the input pattern is the only one strengthened). The end result of this type of learning is that a group of processing elements is trained to respond to a specific input pattern. The second layer uses the Grossberg Outstar strategy. This layer essentially takes the results of the first layer and classifies them. The user has complete control over the design and operation of this network. Complete diagnostics are provided for analysis of both training and testing.

Statistical discriminant analysis has been accomplished using both linear discriminant analysis and the nearest neighbor rule. Linear discriminant analysis is a parametric technique, meaning that it assumes that the features being analyzed follow a multinormal distribution (9). The nearest neighbor rule is a nonparametric technique which makes no assumptions about the distribution of the data being analyzed (10). In linear discriminant analysis a vector containing the means of all the variables is computed for each category studied in the training set. Then, as cases in the test set are processed, they are classified into the category that yields the largest posterior probability (11).
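The linear discriminant rule just described can be illustrated with a small sketch. It is restricted to two features so that the pooled covariance matrix can be inverted in closed form, and it assumes a common covariance matrix for all categories and priors estimated from the training proportions. The function names and the data are hypothetical and do not reproduce the SAS procedure used in this study.

```python
import math


def fit_lda(samples, labels):
    """Estimate class means, priors, and the inverse pooled 2x2 covariance."""
    classes = sorted(set(labels))
    n = len(samples)
    means, priors = {}, {}
    pooled = [[0.0, 0.0], [0.0, 0.0]]
    for c in classes:
        rows = [s for s, y in zip(samples, labels) if y == c]
        mean = [sum(r[k] for r in rows) / len(rows) for k in range(2)]
        means[c], priors[c] = mean, len(rows) / n
        # Accumulate the within-class scatter around this class mean.
        for r in rows:
            d = [r[0] - mean[0], r[1] - mean[1]]
            for a in range(2):
                for b in range(2):
                    pooled[a][b] += d[a] * d[b]
    dof = n - len(classes)
    pooled = [[v / dof for v in row] for row in pooled]
    det = pooled[0][0] * pooled[1][1] - pooled[0][1] * pooled[1][0]
    inv = [[pooled[1][1] / det, -pooled[0][1] / det],
           [-pooled[1][0] / det, pooled[0][0] / det]]
    return means, priors, inv


def classify_lda(x, model):
    """Pick the class with the largest linear discriminant score.

    Under the equal-covariance multinormal assumption, the largest score
    corresponds to the largest posterior probability.
    """
    means, priors, inv = model

    def score(c):
        m = means[c]
        w = [inv[0][0] * m[0] + inv[0][1] * m[1],
             inv[1][0] * m[0] + inv[1][1] * m[1]]
        linear = x[0] * w[0] + x[1] * w[1]
        offset = 0.5 * (m[0] * w[0] + m[1] * w[1])
        return linear - offset + math.log(priors[c])

    return max(means, key=score)


# Hypothetical two-feature training data.
train_x = [(1, 1), (2, 1), (1, 2), (2, 2), (5, 5), (6, 5), (5, 6), (6, 6)]
train_y = ["normal"] * 4 + ["abnormal"] * 4
model = fit_lda(train_x, train_y)
print(classify_lda((2, 2.5), model), classify_lda((5, 4), model))
```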
The nearest neighbor rule computes a Euclidean distance function and classifies a case in the test set into the category that minimizes the distance function (12). The reference data and associated categories used to classify the test data are derived from the training set.
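A minimal sketch of the nearest neighbor rule is given below, generalized to a majority vote among the k closest training cases. The function name `knn_classify` and the data are hypothetical, and the sketch is not the SAS procedure used in this study.

```python
import math
from collections import Counter


def knn_classify(x, train_x, train_y, k=1):
    """Assign x to the majority category among its k nearest training cases."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], x))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training cases; the abnormal case at (1, 1) lies inside the normal cluster.
train_x = [(0, 0), (1, 0), (0, 1), (1, 1), (5, 5), (5, 6), (6, 5)]
train_y = ["normal", "normal", "normal", "abnormal", "abnormal", "abnormal", "abnormal"]

print(knn_classify((1.2, 1.2), train_x, train_y, k=1))
print(knn_classify((1.2, 1.2), train_x, train_y, k=3))
```

With a single neighbor the stray abnormal case decides the outcome, whereas with three neighbors it is outvoted. With two categories, an odd k avoids tied votes.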
METHODS

The images processed in this study were collected from the video output of an Acuson 128 real-time ultrasonic scanner. The video display adhered to the RS-170 standard, and an Imaging Technology PC Vision Plus frame grabber housed in an IBM AT compatible was used to digitize the images. After reviewing each image, a 64 by 64 pixel region of interest (ROI) was chosen as close to the center as possible while avoiding major blood vessels. Once the ROIs were selected, the results were stored on hard disk using Imaging Technology’s Image Action Plus software package. These disk files were later processed by several customized computer programs for the purpose of calculating the various quantitative parameters described in the theory section. These computed descriptors were then used as input features for both statistical discriminant analysis, using the SAS statistical analysis package on a VAX mainframe, and the two separate neural network software packages. The features were transferred from the AT environment to the VAX by the Kermit software package. A functional block diagram of the processing techniques is provided in Fig. 1.

Fig. 1. Block diagram of a typical frame grabber/image processor which is configured as an add on board.

The Nestor Network Development System (NDS) was set up on a PC-AT. A total of 7 separate feature encoding scenarios were coded in C Language. Each scenario was composed of subsets of the 21 features and ranged from a minimum of 2 subsets to a maximum of 5 subsets. Each individual subset was tested on different architectures in the NDS system. We found that a good initial measure of how well a subset was working in the NDS could be determined by evaluating each subset on a “standard evaluation NDS architecture” that we derived. The evaluation architecture was designed to increase the resolution of the RCE ANN while reducing the minimum influence. This was done in four steps, and the architecture was trained and tested on the actual feature data. Training was performed on 96 out of 128 ROIs and testing was performed on the remaining 32 ROIs. These were the same 96/32 ROIs that the statistical analysis trained and tested on. The results of the evaluation architecture were used to see how the RCE ANN responded to each subset. This allowed us to identify an efficient combination of subsets that could be built into a specific NDS architecture.

The counterpropagation Artificial Neural Network (CP-ANN) was also set up on a PC-AT using NeuralWare Neural Networks Professional II. The network was designed with 2 layers. The training and testing were performed in the same fashion as for the RCE, using the same 96 ROIs to train with and the same 32 ROIs to test with. Table 1 summarizes the percent of agreement between the supplied diagnosis and the counterpropagation model for both the training set and test set for various numbers of processing elements. The number of processing elements was varied in increments of 14 and the best
Table 1. Percent agreement for the training set and test set for various processing elements used in the counterpropagation neural network.

Processing Elements    Training Set % of Agreement    Test Set % of Agreement
14                     63.6                           56.0
28                     85.4                           84.4
42                     89.5                           81.2
56                     88.0                           81.2
combination was found to be 42 elements. This is based on choosing the combination that yielded the best percentage of agreement for the training set and maintained a good percentage of agreement for the test set.
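For readers unfamiliar with the counterpropagation model whose number of processing elements is varied in Table 1, the two layers described in the theory section can be sketched as follows. This is a simplified illustration, not the NeuralWare implementation: the learning rates, the initialization of the Kohonen weights from evenly spaced training patterns, the number of epochs, and the function names are all assumptions.

```python
import math


def train_counterprop(patterns, targets, n_units, epochs=50, alpha=0.3, beta=0.1):
    """Train a Kohonen layer followed by a Grossberg outstar layer."""
    # Assumed initialization: copy evenly spaced training patterns.
    kohonen = [list(patterns[(i * len(patterns)) // n_units]) for i in range(n_units)]
    outstar = [[0.0] * len(targets[0]) for _ in range(n_units)]
    for _ in range(epochs):
        for x, t in zip(patterns, targets):
            win = min(range(n_units), key=lambda u: math.dist(kohonen[u], x))
            # Competitive learning: only the winning unit moves toward the input.
            kohonen[win] = [w + alpha * (xi - w) for w, xi in zip(kohonen[win], x)]
            # Outstar learning: the winner's output weights move toward the target.
            outstar[win] = [v + beta * (ti - v) for v, ti in zip(outstar[win], t)]
    return kohonen, outstar


def predict_counterprop(x, kohonen, outstar):
    """Look up the output vector stored for the unit closest to x."""
    win = min(range(len(kohonen)), key=lambda u: math.dist(kohonen[u], x))
    return outstar[win]


# Hypothetical patterns; targets are one-hot vectors (normal, abnormal).
patterns = [(0, 0), (0, 1), (5, 5), (5, 6)]
targets = [(1, 0), (1, 0), (0, 1), (0, 1)]
kohonen, outstar = train_counterprop(patterns, targets, n_units=2)
print(predict_counterprop((0.2, 0.5), kohonen, outstar))
print(predict_counterprop((5, 5.5), kohonen, outstar))
```

The prediction step is the “look-up table” behavior mentioned in the theory section: the nearest Kohonen unit is found and its stored output vector is returned.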
RESULTS

A total of 128 ROIs were investigated by taking 8 nonoverlapping ROIs for each of 16 patients. Each ROI included a representative sample of liver parenchyma selected using a repeated measures approach. A centroid hierarchical cluster analysis was run on all 128 ROIs using the 21 quantitative variables and, while some of the ROIs for the same patients tended to cluster together, there was not a significant amount of clustering. This suggests that ROIs from the same patients are not highly correlated. A total of 80 ROIs were taken from normal patients and 48 ROIs were taken from patients with diffuse liver disease. The training set contained a total of 96 ROIs, with 60 ROIs coming from normal subjects and 36 ROIs coming from abnormals. The test set contained 20 normal ROIs and 12 abnormal ROIs. Thus, the training set and the test set included a mixture of both categories of data. Figure 2 contains examples of two typical images used in this study, including a typical normal liver and a liver exhibiting fatty infiltration. Note that the region of interest is depicted on each image as a rectangular area chosen as close to the lateral center as possible, including liver parenchyma while avoiding major anatomical structures.

Fig. 2. Typical ultrasonic images used in this study. (A) Typical normal liver. (B) A liver with fatty infiltration. Please note the 64 by 64 region of interest displayed on each image.

A total of 21 quantitative descriptors were investigated. In some of the analyses all of the variables were included in the model, while in other parts of the analysis only those descriptors that had the ability to best distinguish between normal and abnormal ROIs were used. Thirteen variables were computed from gray level histograms, six variables were computed from co-occurrence matrices and two variables were computed from image gradients. The SAS procedures used were PROC STEP DISC, PROC DISCRIM, and PROC NEIGHBOR. PROC STEP DISC was used to identify those descriptors that could best discriminate between the normal and abnormal ROIs. When stepwise discriminant analysis was applied to all 128 ROIs, 7 quantitative descriptors were identified. These variables were the inverse element difference, the minimum intensity, the smoothness, the contrast, the skewness, the entropy and the correlation. Linear discriminant analysis is implemented in SAS by using PROC DISCRIM. When this procedure was applied to the test set with a model consisting of all 21 quantitative descriptors, the computed classification agreed with the supplied diagnosis 75% of the time. This was improved to 78.125% by only including the
seven quantitative descriptors identified by PROC STEP DISC as good discriminators.

Nearest neighbor discriminant analysis is implemented in SAS by PROC NEIGHBOR. When this procedure was applied to the test data with a model consisting of all variables and only considering the closest neighbor, the computed classification agreed with the supplied diagnosis 75% of the time. The level of agreement dropped to 60% when the model only contained the seven variables identified by PROC STEP DISC. Thus, limiting the model to seven variables increased the level of agreement for linear discriminant analysis but decreased the level of agreement for nearest neighbor discriminant analysis. This discrepancy may result from the fact that both PROC STEP DISC and PROC DISCRIM are parametric procedures which are based on the assumption that the data being analyzed are normally distributed, while PROC NEIGHBOR is a nonparametric procedure and makes no assumptions about the distribution of the data being analyzed. The results of the nearest neighbor discriminant analysis were improved to 81% agreement by keeping all twenty-one quantitative variables in the model and increasing the number of nearest neighbors from one to three. However, when the number of neighbors considered was increased from three to five, no additional improvement was realized in agreement between computed classification and supplied diagnosis.

The best result that was obtained from the Nestor RCE-ANN was a training accuracy of 100% on 96 training cases and a testing accuracy of 90.6% on 32 cases not included in the training set. The best RCE architecture had one “uncertain-correct” in the conservative mode (which is the most stringent mode of the NDS classification parameters). When the mode was changed to liberal, the one “uncertain-correct” classification was shifted to “identified-correct.” This scenario also used the “nearest neighbor” internal parameter that Nestor provides.
The “nearest neighbor” parameter instructs the RCE-ANN to perform a “nearest neighbor” classification for any “uncertain” input pattern (i.e., one that falls outside of a class region as defined by the trained RCE-ANN). It should be noted that the RCE-ANN consistently trained with an accuracy of 100% and would train 96 cases within 30 seconds when running on a PC-AT clone rated at the Norton Utilities computing index of 11.7 relative to an IBM/XT.

The best results obtained from the NeuralWare counterpropagation network were a training accuracy of 89.5% when trained on 96 cases and a testing accuracy of 81.2% when tested on 32 cases not included in the training cases. The counterpropagation network essentially performs a “nearest neighbor” classification (13), and the closeness of the counterpropagation network’s results to the statistical “nearest neighbor” results might be related to this. The counterpropagation network took ten minutes to converge during the training of the 96 cases when running on the same PC-AT clone that was used for the Nestor (NDS).

In order to assess the statistical significance of differences found among the various pattern recognition models used in this study, McNemar’s test was applied to the results. This is a nonparametric statistical procedure that compares the results of two different diagnostic tests on the same ROIs. Thus McNemar’s test was used to compare the Nestor (RCE-ANN) results with the nearest neighbor analysis and linear discriminant analysis. The goal of this test is to estimate how likely it is that differences between these techniques are attributable to real differences in discriminatory ability rather than random error.

Table 2. Contingency table of McNemar’s test.

                     METHOD 1
METHOD 2         Correct        Incorrect
Correct          A              B
Incorrect        C              D

McNemar’s test statistic is computed from a 2 by 2 contingency table (Table 2) in which the classification results of the two techniques are presented in the cells of the table. In this contingency table, cell A is the number of cases that both Method 1 and Method 2 classified correctly, cell B is the number of cases that Method 2 classified correctly but Method 1 classified incorrectly, cell C is the number of cases that Method 1 classified correctly but Method 2 classified incorrectly, and cell D is the number of cases that both Method 1 and Method 2 classified incorrectly. Then McNemar’s test statistic can be defined from the contingency table as (14):
$$\text{McNemar's Test} = \frac{\left(|B - C| - 1\right)^{2}}{B + C}$$
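As a check on the arithmetic, the statistic can be evaluated directly from the discordant cells B and C. The sketch below reads the formula as the continuity-corrected form (|B - C| - 1)^2 / (B + C), an interpretation that is consistent with the values of 1.3333 and 1.125 reported in this section. The discordant counts are those of Tables 3 and 4, and the function names are hypothetical. The tail probability is computed exactly from the complementary error function rather than by interpolating in printed tables, which is an assumption of this example and not the procedure used in the study.

```python
import math


def mcnemar_statistic(b, c):
    """Continuity-corrected McNemar statistic from the two discordant cells."""
    return (abs(b - c) - 1) ** 2 / (b + c)


def chi_square_p_value_1dof(statistic):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Discordant counts: Nestor vs. nearest neighbor (K = 3), and Nestor vs. LDA.
nn_stat = mcnemar_statistic(0, 3)
lda_stat = mcnemar_statistic(2, 6)
print(nn_stat, chi_square_p_value_1dof(nn_stat))
print(lda_stat, chi_square_p_value_1dof(lda_stat))
```

The exact tail probabilities come out at roughly 0.25 and 0.29. Values obtained by linear interpolation in a printed chi-square table can differ somewhat from these, because the tail probability is not linear between tabulated points.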
The contingency table comparing the discriminatory ability of Nestor (RCE-ANN) with the nearest neighbor analysis is provided in Table 3, and the results of comparing Nestor (RCE-ANN) with linear discriminant analysis are depicted in Table 4. An estimate of how likely differences in these techniques are due to random error is obtained by applying linear interpolation to standard chi-square statistics tables with 1 degree of freedom. When comparing Nestor with the nearest
Table 3. 2 by 2 contingency table comparing Nestor (RCE-ANN) with nearest neighbor analysis for the three nearest neighbors (K = 3).

NEAREST NEIGHBOR ANALYSIS        NESTOR (RCE-ANN) CLASSIFICATION
(K = 3) CLASSIFICATION           Correct     Incorrect     Totals
Correct                          26          0             26
Incorrect                        3           3             6
Totals                           29          3             32
neighbor analysis, McNemar’s test statistic was 1.3333, which corresponded to a chi-square probability of .2489155. When comparing Nestor with linear discriminant analysis, the results were 1.125 and .3070277, respectively. These results from McNemar’s test indicate that there is approximately a 25% chance that differences between Nestor and the nearest neighbor analysis are due to random error rather than a true difference in being able to distinguish between normal and abnormal liver tissue. For linear discriminant analysis there is approximately a 31% chance that differences might be attributed to random error. This suggests that differences between Nestor and the nearest neighbor analysis are slightly less likely to be attributed to random error than differences found between Nestor and linear discriminant analysis for this study.

SUMMARY

These preliminary results suggest that neural networks may be capable of more accurately classifying ultrasonic image texture than traditional statistical pattern recognition methods. In the future, additional studies need to be completed on a larger group of patients and other types of liver disease need to be investigated. We have found
Table 4. 2 by 2 contingency table comparing Nestor (RCE-ANN) with linear discriminant analysis.

LINEAR DISCRIMINANT              NESTOR (RCE-ANN) CLASSIFICATION
ANALYSIS CLASSIFICATION          Correct     Incorrect     Totals
Correct                          23          2             25
Incorrect                        6           1             7
Totals                           29          3             32
the commercial neural network software more convenient to use in a microcomputer environment than the statistical software package. Future efforts will include the development of a real-time image acquisition, feature extraction, and pattern recognition system in a microcomputer environment.
Acknowledgements-This research was supported by a grant from the Collaborative High Technology Program from the Department of Higher Education for the State of Connecticut. The authors would like to acknowledge the assistance of Drs. Joel Gelber and Martin Fox in obtaining the ultrasonic images from the University of Connecticut. A special thanks to Mr. Joseph Vitale for his many useful suggestions on statistically analyzing the results.
REFERENCES

1. Reath, U.; Schlaps, D.; Limberg, B.; et al. Diagnostic accuracy of computerized B-scan texture analysis and conventional ultrasonography in diffuse parenchymal and malignant liver disease. J. Clin. Ultrasound 13:87-99; 1985.
2. Lerski, R.A.; et al. Discriminant analysis of ultrasonic texture data in diffuse alcoholic liver disease. Ultrasonic Imaging 3:164-172; 1981.
3. Gorman, R.P.; Sejnowski, T.J. Analysis of hidden units in a layered network trained to classify sonar targets. Neural Networks 1:75-89; 1988.
4. Itoh, K.; et al. Acoustic intensity histogram pattern diagnosis of liver diseases. J. Clin. Ultrasound 13:449-456; 1985.
5. Gonzalez, R.C.; Wintz, P. Digital image processing. Reading, MA: Addison-Wesley; 1987.
6. Nichols, D.; et al. Tissue characterization from ultrasound B-scan data. Ultrasound in Medicine and Biology 12:135-143; 1986.
7. Reilly, D.; Scofield, C.; Gouin, P.; Rimey, R.; Collins, E.; Ghosh, S. An application of a multiple neural network learning system to industrial part inspection. ISA/88, Houston, TX, pp. 1-14; 1988.
8. Wasserman, P. Neural computing theory and practice. New York: Van Nostrand Reinhold; 1989.
9. Cooley, W.W.; Lohnes, P.R. Multivariate data analysis. New York: John Wiley and Sons Publishing; 1971.
10. Noether, G. Statistics-A non-parametric approach, 2nd edition. Boston, MA: Houghton-Mifflin Publishing; 1974.
11. Anderson, T.W. Introduction to multivariate statistical analysis. New York: John Wiley & Sons Publishing; 1958.
12. Tou, J.T.; Gonzalez, R.C. Pattern recognition principles. Reading, MA: Addison-Wesley Publishing; 1974.
13. NeuralWare Professional II User’s Manual. Pittsburgh, PA: NeuralWare Inc.; 1988, pp. 466-479.
14. Glantz, S.A. Primer of biostatistics, 2nd edition. New York: McGraw-Hill; 1987.
About the Author-JOHN S. DAPONTE received a B.E. from the State University of New York at Stony Brook in 1970, an M.S. from the Rochester Institute of Technology in 1973, and a Ph.D. from The University of Connecticut in 1988. Dr. DaPonte is currently a professor and chairperson of the Computer Science Department at Southern Connecticut State University in New Haven, Connecticut. His research interests include medical image processing, pattern recognition, and statistical analysis. Dr. DaPonte is a member of the Association for Computing Machinery and the Institute of Electrical and Electronics Engineers. He has authored several scientific publications on medical imaging.

About the Author-PORTER D. SHERMAN received the B.S. degree in Computer Engineering from the University of Bridgeport in 1980 and received the M.S. degree in Computer Science from Polytechnic University of New York in 1983, where he is presently finishing his Ph.D. From 1977 to 1981, Professor Sherman worked as a microprocessor design engineer at Summagraphics Corporation. In 1981 he joined the Department of Computer Science at The University of Bridgeport, where he is presently an Assistant Professor. Professor Sherman also is one of the principal investigators for a State of Connecticut High Technology Grant, performing research in the area of applying neural networks to biomedical imaging. Professor Sherman has made presentations to various conferences on the subject of neural network applications. He also is the principal author of a book on how to design rule-based expert systems, published in 1990 by Prentice Hall. Professor Sherman is a member of IEEE, ACM, AAAI, Upsilon Pi Epsilon, and Eta Kappa Nu.