Expert Systems with Applications 36 (2009) 11528–11535
Contents lists available at ScienceDirect
Expert Systems with Applications journal homepage: www.elsevier.com/locate/eswa
An intelligent system for sorting pistachio nut varieties
Mahmoud Omid a,*, Asghar Mahmoudi b, Mohammad H. Omid c
a Faculty of Biosystems Engineering, University of Tehran, Karaj, Iran
b Faculty of Agriculture, University of Tabriz, Tabriz, Iran
c Faculty of Soil and Water Engineering, University of Tehran, Karaj, Iran
Article info
Keywords: Pistachio nut; Classification; Sorting; Impact acoustic; Neural network; Principal component analysis; Fast Fourier Transform
Abstract
An intelligent pistachio nut sorting system combining acoustic emission analysis, Principal Component Analysis (PCA) and a Multilayer Feedforward Neural Network (MFNN) classifier was developed and tested. To evaluate the performance of the system, 3200 pistachio nuts from four native Iranian pistachio nut varieties were used. Each variety consisted of 400 split-shell and 400 closed-shell nuts. The nuts were randomly selected and slid down a chute, inclined 60° above the horizontal, to impact a steel plate, and the acoustic signals from the impact were recorded. Sound signals in the time-domain were saved for subsequent analysis. The method is based on feature generation by Fast Fourier Transform (FFT), feature reduction by PCA and classification by MFNN. Features such as amplitude, phase and power spectrum of the sound signals are computed via a 1024-point FFT. By using PCA, more than 98% reduction in the dimension of the feature vector is achieved. To find the optimal MFNN classifier, various topologies, each having a different number of neurons in the hidden layer, were designed and evaluated. The best MFNN model had a 40–12–4 structure, that is, a network with 40 neurons at its input, 12 neurons in a single hidden layer and 4 neurons (pistachio varieties) in the output layer. The selection of the optimal model was based on the examination of mean square error, correlation coefficient and correct separation rate (CSR). The CSR, or total weighted average in system accuracy, for the 40–12–4 structure was 97.5%, that is, only 2.5% of nuts were misclassified. © 2009 Elsevier Ltd. All rights reserved.
1. Introduction
Pistachio (Pistacia vera L.) is a dry-climate deciduous tree, producing nuts in clusters. More than sixty pistachio varieties are cultivated in different regions of Iran. Based on FAO statistics, Iran produced about 275,000 Mt of pistachio nuts in 2003, which represents approximately 54.7% of the world’s pistachio production. Iran exported 184,946 Mt of its pistachio nuts in that year, and the total export revenue from pistachio nuts was about 679,940,000 US$ (faostat.fao.org). Therefore, the pistachio nut has great economic value for Iran. Pistachios are served principally as salted nuts. A large percentage of pistachios are marketed in the shell for snack food. Closed-shell, filled nuts are used for processing. Mixing of pistachio nuts of different varieties and quality often occurs as a result of mixed plantations, manual harvesting, transportation or handling, storage, etc. (Brosnan & Sun, 2002). In order to give consumers a more uniform product, the inspection and classification of mixed nuts into lots of uniform shape and size is desirable. Visual inspection is usually performed by human operators and its output is
* Corresponding author. E-mail addresses:
[email protected] (M. Omid),
[email protected] (A. Mahmoudi),
[email protected] (M.H. Omid). 0957-4174/$ - see front matter © 2009 Elsevier Ltd. All rights reserved. doi:10.1016/j.eswa.2009.03.040
then affected by several factors, such as age of operators, their concentration and motivation, fatigue and visual acuity, room conditions (lighting, heating, ventilation, noise, and so on); for these reasons, automated systems are especially welcome. When pistachios arrive at the processing plant, the following procedures are conducted (Nakhaeinejad, 1998): (a) dehulling, to separate the soft hull from nuts; (b) trash and blank separation, to remove blank pistachios and trashes such as small branches, remaining shells and leaves; (c) unpeeled pistachios separation, to remove unpeeled and unripe nuts; (d) washing, which involves spraying water at high pressures on the pistachios to clean the nuts; (e) drying, to decrease moisture content of pistachios from 30–40% to the appropriate level; (f) split nuts separation, to separate split nuts from non-split ones; (g) salting; (h) roasting; and (i) packaging. Razavi et al. investigated comprehensively the physical properties of Iranian pistachio nut and its kernel for five pistachio varieties as a function of moisture content (Razavi, Emadzadeh, Rafe, & Mohammad-Amini, 2007a; Razavi, Mohammad-Amini, Rafe, & Emadzadeh, 2007c; Razavi, Rafe, & Akbari, 2007d; Razavi, Rafe, Mohammadi-Moghaddam, & Mohammad-Amini, 2007b). These authors examined geometrical, gravimetrical and frictional properties as well as terminal velocity of five Iranian commercial varieties of pistachio nut and its kernel (namely; Akbari, Badami,
Kalle-Ghuchi, Momtaz and O’hadi) in this series of papers. These characteristics are necessary for designing equipment and machines for transporting, dehulling, sorting, handling, processing and drying, and may be useful to enhance the classification accuracy of pistachio nuts with other methods, e.g. those based on morphological or image processing techniques. Pearson et al. evaluated the physical properties of pistachio nuts as affected by moisture content and variety (Pearson, Slaughter, & Studer, 1994). However, they could not find a significant correlation among the physical properties of nuts from which to devise sorting criteria for removing early split nuts from the crop, although the properties showed that early split nuts were significantly smaller in length, width, height, mass and volume than normal nuts. In recent years, computer-generated neural classifiers that are intended to mimic human decision making for product quality have been studied intensively. Combined with high-technology handling systems, consistency is the most important advantage these artificial classifiers provide in the classification of agricultural products (Kavdir & Guyer, 2008). Combined image analysis and neural classification has been used for lentil, apple and sweet onion (Shahin & Symons, 2001a; Shahin, Tollner, & McClendon, 2001b; Shahin, Tollner, Gitaitis, Sumner, & Maw, 2002). The online lentil color classification using a flatbed scanner with a neural classifier, developed in Shahin and Symons (2001a), achieved an overall accuracy of more than 90%. Various techniques, including optical, mechanical, electrical and acoustical ones, have also been used for classification and/or sorting of pistachio nuts. Machine vision was introduced for detection of stained and early split pistachio nuts (Pearson, 1996).
Later, the feasibility of an automated food inspection system for pistachio defects detection based on X-ray imaging and statistical characterization was demonstrated (Pearson, Doster, & Michailides, 2001). Ghazanfari et al. utilized Fourier descriptors and gray level histogram features of 2D images to classify pistachio nuts into one of three USDA size grades or as having closed-shells (Ghazanfari, Irudayaraj, Kusalik, & Romaniuk, 1997a; Ghazanfari, Kusalik, & Irudayaraj, 1997b). Impact acoustic emission was used as the basis for a device that separates pistachio nuts with closed-shells from those with split-shells (Cetin, Pearson, & Tewfik, 2004a; Cetin, Pearson, & Tewfik, 2004b; Pearson, 2001). The sorting system included a microphone, DSP hardware, material handling equipment and an air-reject mechanism. The same impact acoustics based system was later extended to separate cracked hazelnuts shells from undamaged ones (Kalkan et al., 2006), underdeveloped ones from full hazelnuts (Onaran, Pearson, Yardimci, & Cetin, 2006) and wheat inspection for detection of IDK (insect damaged kernel) from undamaged kernels (Pearson, Cetin, Tewfik, & Haff, 2007). Although the mechanical structure was similar, the authors reported that the signal features used for pistachio classification did not work well in wheat inspection. The results obtained by these works (Kalkan et al., 2006; Onaran et al., 2006; Pearson et al., 2007) emphasized the importance of signal processing methods of the impact acoustic signal to achieve higher accuracies in food inspection. A multi-structure neural network (MSNN) classifier was proposed and applied to classify pistachio nuts (Ghazanfari, Irudayaraj, & Kusalik, 1996). The performance of MSNN classifier was compared with that of a Multilayer Feedforward Neural Network (MFNN) classifier. The average accuracy of the MSNN classifier was 95.9%, an increase of over 8.9% of the performance of the MFNN, for the four commercial varieties of nuts tested. 
In another study, Fourier descriptors and the projected area of the individual nuts were extracted from their 2D images and used as recognition features to classify pistachio nuts into four grades (Ghazanfari et al., 1997b). The Fisher criterion in conjunction with a Gaussian classification method was used for feature selection. The results of this feature selection indicated that seven harmonics were sufficient for this classification task. The selected Fourier descriptors and the area of each nut were subsequently used as inputs to two classification schemes: a hybrid decision-tree classifier and an ANN. The average classification accuracy obtained for the decision-tree classifier was 87.1%, whereas the ANN resulted in an average classification accuracy of 94.8%. In pistachio processing plants, image-based sorting devices using visible light have largely been replaced with X-ray or NIR devices, and the commercially available image-based sorters (Pearson et al., 2001) are in fact no longer in production. Casasent et al. obtained promising results by X-ray imaging and neural net processing to classify pistachio nuts (Casasent, Sipe, Schatzki, Keagy, & Lee, 1998). X-ray image histogram features and their spatial derivatives were used for detection of insect-infested nuts. The objective of this study was to develop an impact acoustics based expert system for classification of different varieties of pistachio nuts. Principal Component Analysis (PCA) and an Artificial Neural Network (ANN) were then used to combine the optimum feature parameters and classify mixed nuts into lots of uniform shape, size and variety.
2. Description, requirements and design of sorter
Four major Iranian varieties of fresh/raw pistachio nuts, namely Kaleh–Ghouchi (Ka), Akbari (Ak), Badami (Ba) and Ahmad–Agaee (Ah), were selected in this study (Fig. 1). A total of 3200 nuts were used. Each class consisted of 800 nuts (one half split-shell and the other half closed-shell), which were randomly selected and whose impacts were recorded. A schematic of the experimental apparatus for singulating pistachio nuts, dropping them onto the impact plate and collecting the acoustic emissions from the impact is shown in Fig. 2.
The impact plate is a polished block of stainless steel whose mass is much greater than that of the nuts, in order to minimize vibrations from the plate interfering with acoustic emissions from the nuts (Amoodeh, Khoshtaghaza, & Minaei, 2006). The prototype included a chute, inclined 60° above the horizontal, down which the nuts slid before being projected onto the impact plate. The acoustic emissions from the nuts were picked up by a highly directional Panasonic Electret capsule microphone (VM-034CY model), which is sensitive to frequencies up to 100 kHz. The microphone was installed inside an isolated acoustic chamber to eliminate environmental noise effects (Amoodeh et al., 2006). Detected sound signals were sent to a PC based data acquisition system. Signals were digitized at a sampling frequency of 44.1 kHz, with 16 bit resolution. To further eliminate environmental noise effects, the data acquisition was triggered using a piezoelectric sensor mounted on the plate. By designing and programming the necessary electronic circuitry (main board, microcontroller, potentiometer, analogue comparator, etc.), the microcontroller automatically issued a command to the PC indicating a nut impact. In this way, only the signals initiated by the impact of nuts onto the plate triggered the microcontroller, and the environmental noise did not interfere with the actual signal emitted by the nuts. Sound signals were saved by using
Fig. 1. Typical images of four varieties of pistachio nuts: Kaleh–Ghouchi (Ka), Akbari (Ak), Badami (Ba) and Ahmad–Agaee (Ah).
Fig. 2. Schematic of pistachio classifier based on acoustic emissions.
Matlab data acquisition toolbox for subsequent analysis (MathWorks, MATLAB User’s Guide, & The MathWork, 2007). Since the maximum frequency (sampling rate) of the sound card was 44.1 kHz, data acquisition continued for 5.67 ms after triggering. This produced about 250 samples, i.e., 44.1 kHz × 5.67 ms, from the impact of each nut in the time-domain. In Fig. 3a, we show the peak values of different nut varieties (shown in Fig. 1) recorded with the data acquisition system. It is seen that the signal amplitudes are similar and by themselves are not very useful for the separation of different nut varieties. The microphone signal magnitude of open-shell nuts can be expected to be higher. The frequency of sounds emanating from closed-shell and open-shell nuts was slightly different. The physical properties data showed some variations among the varieties. However, discriminant analysis based on the physical properties of these varieties, using published data (Razavi et al., 2007a, 2007b, 2007c, 2007d), showed that the physical properties measured on individual nut varieties (dimensions, mass, thickness, volume, sphericity, shell ratio, terminal velocity, etc.) cannot achieve an accurate segregation between classes. That is to say, these properties, used alone or in combination, had correlations too low to devise sorting criteria for classifying the varieties. To train the system to differentiate nut classes, traditional procedures for system modeling become unmanageable or virtually insoluble because the basic system can no longer be adequately linearly approximated, and a large number of measured variables (i.e., features) have to be considered simultaneously. A close examination of signals initiated by the impact of nuts onto the plate, an example of which is shown in Fig.
3a, revealed that transient features in the signal could carry significant information for discrimination among classes, as extreme amplitudes of the signals are quite variable. Such features are sometimes omitted due to their low energy or improper analysis that ignores temporal information. Therefore, it is important to apply signal processing methods (Appendix A) and consider both time and frequency-domain properties of the signal, in order to generate rich set of features that could successfully discriminate between the nut varieties. Impact sounds of pistachio nuts were analyzed, and feature parameters describing time and frequency-domain characteristics of the acoustic signals were extracted and combined into feature vectors. Thus, the extracted features used to distinguish pistachio nut classes relied
heavily on the time and frequency-domain characteristics of the acoustic signals. A large number of potential features were extracted from the microphone signals, off-line. Based on the above considerations, the proposed scheme for the classification of nut varieties is depicted in Fig. 4. The individual system states are represented at the input of the stage for knowledge based interpretation by a class statement based on available expert knowledge. This expert knowledge is then fed to the system in the training phase. The classification is performed by the MFNN. The knowledge obtained during the training phase is not stored as equations or in a knowledge base, but is distributed throughout the network in the form of connection weights between neurons. During the training process, the classifier is given specimen signals; it then sets its weight and bias coefficients so that it is able to reproduce the classification results as adequately as possible in the recall (test) phase.
2.1. Feature generation by FFT
Initially, signal analysis procedures from time-domain (e.g. peak values) to frequency-domain are carried out to generate useful features. A 1024-point Discrete Fourier Transform (DFT) is computed from each discrete time signal using an FFT. A brief description of the DFT, its computation via FFT, and its implementation in Matlab code is provided in Appendix A. It must be pointed out that the FFT is not a different transform from the DFT, but rather just a means of computing the DFT with a considerable reduction in the number of calculations required. The computed 1024-point FFT covers the impact sound of pistachio nuts starting at about 80 data points before the signal maximum slope, which corresponds to the moment of impact of the nut. Amplitude, phase and Power Spectral Density (PSD) of sound signals were calculated, respectively, by Eqs. (A4), (A5) and (A6) given in Appendix A. In Fig.
3b, an example of the computed FFT amplitude (top), phase angle (middle) and PSD (bottom) for the nut classes shown in Fig. 1 is presented. The FFT analysis produced 1024 sample data for each nut. Due to the even (odd) symmetry of the PSD (phase), these features are halved. Also, since the FFT amplitude information is included in the PSD, it is not considered further. Therefore, altogether we started off with 1274 features (250 data samples of signal peak values in the time-domain, 512 sample points for phase angle and 512 sample points for PSD) from the impact of each pistachio nut. The amount of data gathered in this way is too extensive (input matrix: 1274 × 3200, output matrix: 4 × 3200) and needs to be compressed. For a real-time application such as sorting, the compression is necessary because the system must be easily trainable and automated.

Fig. 3. Typical impact sound signals and spectra of pistachio nut varieties (Ka, Ak, Ba, Ah): (a) time-domain response (amplitude in mV versus time in ms) showing signal differences among varieties; (b) frequency-domain responses versus frequency (0–21,533 Hz): FFT amplitude (top), FFT phase angle (middle) and power spectrum density in dB/Hz (bottom).

2.2. Feature extraction by PCA
PCA is one of the most widely used data-driven techniques (Duda, Hart, & Stork, 2000). From the point of view of data, PCA is an optimal dimensionality reduction in terms of capturing the variance of the data, and it accounts for correlations among variables (Jolliffe, 1986). From the modeling point of view, PCA transforms correlated variables into uncorrelated ones and determines the linear combinations with high and low variability. These uncorrelated variables are termed the principal components (PCs) of each data vector of the data set, and each succeeding PC accounts for as much of the remaining variability in the data as possible. When the data are highly collinear (redundant), the first few PCs explain most of the variability in the data and are retained. PCA is thus a very efficient data compression technique. Before the original data are transformed into a lower dimensional space, they are mean centered because only the variability of the data is of interest. According to the time-domain and frequency-domain analysis of sound signals, three sets of features could be compressed by applying PCA: (i) signal peak values (A) in the time-domain, and (ii) phase (Ph) and (iii) PSD in the frequency-domain. Using the Matlab program provided in Appendix A, PCs for different combinations of A, Ph and PSD, ranging in number from 6 to 185, were obtained (Fig. 5). It is seen from this figure that up to 98% of the total variation in the input data set can be expressed by retaining just 19 PCs, i.e., 6 A, 6 Ph and 7 PSD components. For higher retained variances, e.g. 99.9%, the size of the input vector increases considerably, which is not desirable from the MFNN modeling point of view. Therefore, the selection criteria for the number of features in the input layer and the number of hidden layers should be based on our interest in minimizing the computational load and attaining the desired accuracy.
Ultimately, the combination of 40 PCs, namely 24 A (0.1%), 6 Ph (2%) and 10 PSD (1%), is selected for the optimum MFNN configuration (Table 1 and Fig. 5). Using these PCs, a 98.26% reduction in features is achieved, i.e., the input matrix becomes 40 × 3200 instead of 1274 × 3200. In Matlab (R2007a version), the PCA is achieved through the processpca function (see Appendix A).

2.3. Design of Multilayer Feedforward Neural Network classifier
Here the MFNN modeling was performed on the reduced feature vector extracted from PCA.
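As a rough illustration of how the reduced feature vector of Sections 2.1 and 2.2 might be assembled, the following Python/NumPy sketch generates the three feature sets for each impact and compresses each set with PCA. It is not the Matlab code of Appendix A. The function names, the zero-padding of the 250 recorded samples to 1024 points, the PSD scaling and the synthetic test signals are all assumptions made for the example; the variance threshold imitates our understanding of how processpca discards components that contribute less than a given fraction of the total variance.

```python
import numpy as np

FS = 44_100      # sampling rate stated in the text (Hz)
N_FFT = 1024     # FFT length stated in the text
N_TIME = 250     # time-domain samples kept per impact


def impact_features(signal):
    """Return (time samples, phase half-spectrum, PSD half-spectrum) for one impact."""
    x = np.asarray(signal, dtype=float)[:N_TIME]
    spectrum = np.fft.fft(x, n=N_FFT)          # assumption: zero-pad 250 samples to 1024 points
    half = spectrum[: N_FFT // 2]              # symmetry: keep 512 of the 1024 bins
    phase = np.angle(half, deg=True)           # phase angle in degrees
    psd = np.abs(half) ** 2 / (FS * N_FFT)     # assumption: periodogram-style scaling
    return x, phase, psd


def pca_reduce(features, min_variance_fraction):
    """Project rows (samples x variables) onto the principal components whose
    share of the total variance is at least `min_variance_fraction`."""
    centered = features - features.mean(axis=0)            # mean-centre each variable
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    variance_share = s**2 / np.sum(s**2)
    keep = variance_share >= min_variance_fraction
    return centered @ vt[keep].T


# Synthetic stand-ins for recorded impacts (damped sinusoids plus noise), not real data.
rng = np.random.default_rng(0)
t = np.arange(N_TIME) / FS
signals = [
    np.exp(-2000 * t) * np.sin(2 * np.pi * rng.uniform(2e3, 8e3) * t)
    + 0.01 * rng.standard_normal(N_TIME)
    for _ in range(60)
]

# One matrix per feature set: amplitude (A), phase (Ph) and PSD.
blocks = [np.vstack(col) for col in zip(*(impact_features(s) for s in signals))]

# Thresholds of the selected configuration: A 0.1%, Ph 2%, PSD 1% omitted variance.
reduced = np.hstack([pca_reduce(b, thr) for b, thr in zip(blocks, (0.001, 0.02, 0.01))])
print([b.shape for b in blocks], reduced.shape)
```

With real recordings in place of the synthetic signals, the three blocks would hold 250, 512 and 512 columns per nut (1274 in total), and the number of retained columns would depend on the thresholds in the same way as the entries of Table 1. Note that this sketch stores one nut per row, whereas the text writes the matrices with one nut per column.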
The first step in developing the MFNN classifier is the definition of the network architecture, which is determined by the basic processing elements (neurons) and by the way in which they are interconnected (layers). The multilayer perceptron (MLP) is one of the most widely implemented neural network topologies for classification tasks (Haykin, 1999). The classical MLP architecture was used for the models developed in this study. Each neuron has a weighted connection to every neuron in the next layer, and each performs a summation of its inputs, passing the result through a non-linear sigmoid transfer function, f(x) = tanh(x), at the hidden and output layers. In general, the value at the output unit is always the same for a given set of input values; the output can therefore be seen as a function of the input values. The MLP is normally trained with the error back propagation (BP) algorithm (Rumelhart, Hinton, & Williams, 1986), a general method for iteratively solving for weights and biases. In feedforward networks, error minimization can be obtained by a number of procedures including Gradient Descent (GD), Levenberg–Marquardt (LM) and Conjugate Gradient (CG). The standard BP uses the GD technique, which is very stable when a small learning rate is used but has slow convergence properties. Several methods for speeding up the BP algorithm have been used, including GD with momentum (GDM) and a variable learning rate. In this paper, the GDM learning rule is used; it improves on the straight GD rule in that a momentum term helps to avoid local minima, speed up learning and stabilize convergence. Momentum makes the current weight change depend on the previous weight change as well as on the current error, which encourages weight changes to continue in the same direction.

Fig. 4. BPNN-based scheme for pistachio nut variety classification: detected signal (sound amplitude versus time), feature generation (peak values, 250 samples; FFT phase, 512 points; power spectrum, 512 points), feature compression (Principal Component Analysis), input vector, artificial neural network, pistachio variety (Ah, Ak, Ba, Ka).

The MLP is trained with error correction learning (supervised), which means that the desired response for the system must be known a priori. In order to minimize the training time, only one hidden layer is considered. The transfer function is linear at the input layer and a non-linear hyperbolic tangent at every other layer. In the nth training iteration, the weights are updated according to (1):

$$ w_{ji}^{(n)} = w_{ji}^{(n-1)} + \Delta w_{ji}^{(n)} \qquad (1) $$

and the weight adjustment is given by (2):

$$ \Delta w_{ji}^{(n)} = \eta\,\delta_j^{(n)}\,o_i^{(n)} + \alpha\,\Delta w_{ji}^{(n-1)} \qquad (2) $$

where w_ji is the weight between the jth node of the upper layer and the ith node of the lower layer, δ_j is the error signal of the jth node, o_i is the output value of the ith node of the previous layer, and η and α are the learning rate and the momentum term, respectively. The terms Δw_ji^(n) in (1) and (2) are in fact the gradient vector associated with the weights. The gradient vector is the set of derivatives of the output error with respect to all weights.

2.4. Technical details
The data set of 3200 pistachio nuts is split into three parts: 70% of the data (2240) for training, 15% for cross-validation (CV),
Fig. 5. Relation among the number of selected PCs and components variances (number of features retained for amplitude, PSD and phase angle versus percentage of omitted variance: 0.02, 0.01, 0.005 and 0.001).
and the remaining 480 data points (15%) for testing of the MFNN models. After adequate training, the network weights are adapted and employed for validation in order to determine the overall performance of the MFNN model. NeuroSolutions 5.0 is used for the design and testing of the MFNN models (NeuroSolutions for Excel, 2005). In developing the MFNN models, the tangent sigmoid is used for both the hidden and the output layer transfer functions. Values of 0.1 and 0.7 are used for the learning rate (η) and the momentum term (α), respectively. In Matlab (R2007a version), data
Table 1
Results of selecting various features as input vector to BPNN.

No. of PC (a)   Features (b)                        Topology    MSE
6               A 0.02                              6–12–4      0.1803
7               PSD 0.02                            7–12–4      0.1950
6               Ph 0.02                             6–12–4      0.2438
13              A 0.02 + PSD 0.02                   13–12–4     0.1334
12              A 0.02 + Ph 0.02                    12–12–4     0.1654
13              PSD 0.02 + Ph 0.02                  13–12–4     0.1690
19              A 0.02 + PSD 0.02 + Ph 0.02         19–12–4     0.1373
8               A 0.01                              8–12–4      0.1272
10              PSD 0.01                            10–12–4     0.1477
11              Ph 0.01                             11–12–4     0.1676
18              A 0.01 + PSD 0.01                   18–12–4     0.1044
19              A 0.01 + Ph 0.01                    19–12–4     0.1337
21              PSD 0.01 + Ph 0.01                  21–12–4     0.1176
29              A 0.01 + PSD 0.01 + Ph 0.01         29–12–4     0.0930
13              A 0.005                             13–12–4     0.0752
19              PSD 0.005                           19–12–4     0.0718
25              Ph 0.005                            25–12–4     0.1183
32              A 0.005 + PSD 0.005                 32–12–4     0.0444
38              A 0.005 + Ph 0.005                  38–12–4     0.0779
44              PSD 0.005 + Ph 0.005                44–12–4     0.0762
57              A 0.005 + PSD 0.005 + Ph 0.005      57–12–4     0.0433
24              A 0.001                             24–12–4     0.0306
55              PSD 0.001                           55–12–4     0.0498
106             Ph 0.001                            106–12–4    0.1002
79              A 0.001 + PSD 0.001                 79–12–4     0.0230
130             A 0.001 + Ph 0.001                  130–12–4    0.0473
161             PSD 0.001 + Ph 0.001                161–12–4    0.0661
185             A 0.001 + PSD 0.001 + Ph 0.001      185–12–4    0.0355
32              A 0.0005                            32–12–4     0.0277
84              A 0.0004                            84–12–4     0.0190
31              A 0.001 + PSD 0.02                  31–12–4     0.0272
30              A 0.001 + Ph 0.02                   30–12–4     0.0284
37              A 0.001 + PSD 0.02 + Ph 0.02        37–12–4     0.0315
34              A 0.001 + PSD 0.01                  34–12–4     0.0323
35              A 0.001 + Ph 0.01                   35–12–4     0.0325
40              A 0.001 + PSD 0.01 + Ph 0.02        40–12–4     0.0206
45              A 0.001 + PSD 0.01 + Ph 0.01        45–12–4     0.0225

(a) Total number of principal components (PCs) selected.
(b) A, Ph and PSD are amplitude, phase and power spectrum, respectively. Numbers next to them indicate the percentages of omitted variances.
standardization to the range (−1, 1) is achieved through the mapminmax function (see Appendix A). In order to develop statistically sound MFNN models, the networks were trained three times and the average values were recorded for each parameter. All simulations were performed with the same architecture, i.e., a three-layer MFNN with the GDM learning rule (Eq. (2)) and the tanh transfer function for all of the neurons in the hidden and the output layers (Fig. 4). The number of PCs in the input vector and the number of neurons in the hidden layer are important for the overall performance and generalization ability of the MFNN. As a rule of thumb, too few neurons result in high training and generalization errors, whereas too many result in low training error but high generalization error due to overfitting or overtraining (Haykin, 1999). Here, the number of neurons in the hidden layer (Nh) is determined by testing different values of Nh, from 2 to 25. A plot of the MSE on the CV set for different numbers of epochs (from 100 to 250) is provided in Fig. 6. The optimal number of neurons for the hidden layer was found from a close examination of the curves of MSE vs. Nh. It can be seen from Fig. 6 that the network with Nh = 12 is more stable and has the least standard deviation. Hence, for a fixed number of PCs, the MLP network performed best when Nh = 12. Having fixed Nh, we can now design
Fig. 8. Learning curves (MSE in % versus number of epochs, for the training and cross-validation sets) with GDM algorithm after 800 epochs.
various topologies, each having a different number of PCs in the input layer, in order to get the best set of PC combinations for our classifier. To find the minimum number of PCs needed to achieve the best accuracy, forty different PC combinations were tested (see Table 1 and Fig. 5). It is seen from the results given in Table 1 that the best selection for the input vector is the combination of 24 peak-value, 6 phase-angle and 10 PSD features. These PCs represent 98.26% of the variances. The optimal configuration is shown in Fig. 7. The convergence of the MSE of the optimal network during training and cross-validation is shown in Fig. 8.
3. Results and discussion
The performance of different MFNN models was compared based on mean square error (MSE), correlation coefficient (r) and correct separation rate (CSR). The expression used to calculate the MSE is given by NeuroSolutions for Excel (2005):

$$ \mathrm{MSE} = \frac{1}{mn} \sum_{j=1}^{m} \sum_{i=0}^{n} \left( d_{ij} - o_{ij} \right)^2 \qquad (3) $$
where m is the number of output neurons (four in this study), n is the number of exemplars in the data set (2240), and d_ij and o_ij are the desired output and the network output for the ith exemplar at the jth neuron, respectively. A summary of our findings is shown in Tables 1–3. Among the different configurations examined in Table 1, the 40–12–4 configuration, shown in Fig. 7, exhibited the highest accuracy and the least error on the CV data set (MSE = 0.0206). This optimum MFNN has 40 features as input vector, 12 neurons in its hidden layer and 4 neurons as output vector, one for each class of pistachio nut varieties. The performance of this network is shown in Fig. 8. After evaluating the optimized configuration with the test set, MSE values of 0.0141, 0.0270, 0.0193 and 0.0123 were obtained for the Ka, Ak, Ba and Ah pistachio nut varieties, respectively. The corresponding r-values were 0.97, 0.93, 0.95 and 0.97, respectively (Table 2). The correct separation rates (CSR) were calculated from the confusion matrix given in Table 3. The CSR percentages for the Ka, Ak, Ba and Ah pistachio nut varieties are 96.97, 97.64, 96.36 and 99.10, respectively. Calibration results show that the optimum configuration produces a CSR greater than 96.0% for all classes of pistachio nut
Fig. 6. MSE on CV data set versus the number of neurons (PEs) in the hidden layer at various epochs (100, 150, 200 and 250 epochs).
Fig. 7. The topology of optimal ANN: input layer with 40 nodes (amplitude, nodes 1–24; phase, nodes 25–30; power spectrum, nodes 31–40), hidden layer with 12 nodes, and output layer with 4 nodes coding the varieties Ah [0001], Ak [0010], Ba [0100] and Ka [1000].

Table 2
Performance of optimal BPNN structure.

Performance   Ka      Ba      Ak      Ah
MSE           0.014   0.019   0.027   0.012
r             0.97    0.95    0.93    0.97
CSR (%)       96.97   96.36   97.64   99.10
Table 3
Confusion matrix showing the number of correctly classified pistachio nuts with 40–12–4 structure.

Desired            Predicted classification
classification     Ka     Ba     Ak     Ah
Ka                 128    1      0      1
Ba                 1      106    3      0
Ak                 0      3      124    0
Ah                 3      0      0      110
varieties (Table 3). The total weighted average in system accuracy is 97.5%, that is, only 2.5% of nuts are misclassified. In Cetin et al. (2004a) the average accuracy for all three size categories and the mixed set was 96.8% for closed-shell and 98.9% for split-shell nuts. Using the same four data sets, the discriminant analysis routine described in Pearson (2001) classified the split-shell nuts with an average accuracy of 96.8% and the closed-shell nuts with an average accuracy of 98.8%. To compare our results with these previous results (Cetin et al., 2004a, 2004b; Pearson, 2001), we created a new classifier, a (40:12:2)-MLP, whose output layer consisted of two classes: split-shell and closed-shell nuts. Among the 480 nuts set aside for testing the BPNN model, 235 were split-shell and the remaining 245 were closed-shell. After training and testing the network, we obtained CSR percentages of 92.34 and 91.84 for split-shell and closed-shell nuts, respectively. The corresponding MSE values were 0.069 and 0.067 and the r-values were 0.85 and 0.86, respectively. Comparing the results of this 40–12–2 configuration with those given in Tables 2 and 3 for the 40–12–4 configuration and with the results in Cetin et al. (2004a), it is observed that the r-values are smaller and the MSE is larger, but the CSR is still higher than 90%. One possible explanation for the higher error rate with this configuration is the use of a mixed set in the output layer. In the 40–12–4 configuration and in Cetin et al. (2004a) each class consisted of nuts of the same variety, i.e. with the same physical characteristics, whereas in the 40–12–2 configuration nuts of mixed varieties were grouped as either split-shell or closed-shell, a major difference in physical properties.
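The MSE values compared above follow Eq. (3). As a minimal sketch, not the authors' NeuroSolutions or Matlab implementation, the calculation can be written in plain Python. The function name `mean_squared_error` and the three example exemplars are hypothetical; the one-hot target coding follows the output coding shown in Fig. 7, and the sum is taken over exactly n exemplars.

```python
def mean_squared_error(desired, outputs):
    """MSE in the sense of Eq. (3): squared errors averaged over
    n exemplars and m output neurons."""
    n = len(desired)       # number of exemplars
    m = len(desired[0])    # number of output neurons
    total = 0.0
    for d_row, o_row in zip(desired, outputs):
        for d, o in zip(d_row, o_row):
            total += (d - o) ** 2
    return total / (m * n)


# Hypothetical data: three exemplars with one-hot targets (coding of Fig. 7)
desired = [
    [1, 0, 0, 0],  # Ka
    [0, 1, 0, 0],  # Ba
    [0, 0, 0, 1],  # Ah
]
# Hypothetical network outputs for the same three exemplars
outputs = [
    [0.9, 0.1, 0.0, 0.0],
    [0.2, 0.7, 0.1, 0.0],
    [0.0, 0.0, 0.1, 0.9],
]

mse = mean_squared_error(desired, outputs)
print(round(mse, 4))
```

The per-variety MSE values in Table 2 would correspond to applying the same average to one output neuron at a time, which is an interpretation rather than something the text states explicitly.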
Therefore, the method developed in this study appears to offer classification accuracy similar to that of the discriminant analysis method (Cetin et al., 2004a, 2004b; Pearson, 2001) and has the potential to be implemented in a real-time system for sorting split-shell and closed-shell pistachio nuts of the same variety. The primary advantage of the method developed here is that it is much more easily trained and automated and, since it is based on methods used to distinguish characteristics of speech, it is perhaps more adaptable to other applications.
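The accuracy figures quoted in this section can be checked against the counts of Table 3. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes that each row of the table holds the test nuts of one class with the correctly classified count on the diagonal.

```python
# Test-set counts from Table 3. Reading each row as one class is an
# assumption about the table layout.
classes = ["Ka", "Ba", "Ak", "Ah"]
confusion = [
    [128, 1, 0, 3],
    [1, 106, 3, 0],
    [0, 3, 124, 0],
    [1, 0, 0, 110],
]


def correct_separation_rates(matrix):
    """Percentage of each row's nuts that lie on the diagonal."""
    return [100.0 * row[i] / sum(row) for i, row in enumerate(matrix)]


def overall_accuracy(matrix):
    """Percentage of all nuts on the diagonal; this equals the average of
    the per-class rates weighted by the row totals."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100.0 * correct / total


csr = correct_separation_rates(confusion)
for name, rate in zip(classes, csr):
    print(f"{name}: {rate:.2f}%")
print(f"overall: {overall_accuracy(confusion):.1f}%")
```

Under this reading the per-class rates are 128/132, 106/110, 124/127 and 110/111, and the overall figure is 468/480, which matches the CSR values and the 97.5% total reported in the text.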
4. Concluding remarks

In this study, an intelligent system based on combined acoustic detection and a neural network was developed for classifying pistachio nut varieties. The method is based on data reduction by PCA and classification using back-propagation neural networks (BPNN). By using PCA, a reduction of more than 98% in the dimension of the feature vector is achieved. The total weighted average in system accuracy was 97.5%, that is, only 2.5% of pistachio nuts were misclassified. The procedure outlined here works on the basis of impact sound differences and is therefore not restricted to a particular application. Furthermore, because the method is non-destructive, the developed system does not damage the split-shell nuts; it therefore does not cause rejection by the consumer and can boost exports. The proposed approach is also useful within other models that improve the performance of neural classifiers. The reasons why these models improve a raw neural network can be clearly understood using this approach. While the field of ANN knowledge extraction continues to attract considerable interest, it is anticipated that the current approach will initiate further research and make ANNs more useful to the agribusiness industry.

Appendix A

Suppose we just have a signal, such as the impact sound signals of pistachio nuts used in Section 2, for which there is no formula. How then would one compute the spectrum? For example, how was a spectrogram such as the one shown in Fig. 3b computed? The Fourier Transform (FT) provides the means of transforming a signal defined in the time-domain into one defined in the frequency-domain. When a function is evaluated by numerical procedures, it is always necessary to sample it in some fashion. This means that, in order to evaluate an FT with digital operations, the time and frequency functions must both be sampled. Thus the Discrete Fourier Transform (DFT) is of primary interest. The DFT allows the computation of spectra from discrete time data such as those shown in Fig. 3a. Let x(n) represent the discrete time signal, and let X(k) represent the discrete frequency transform function. The DFT is given by
\[
X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-jk2\pi n/N}, \qquad k = 0, 1, \ldots, N-1 \tag{A1}
\]
The inverse transform will be defined as
\[
x(n) = \frac{1}{N} \sum_{k=0}^{N-1} X(k)\, e^{jk2\pi n/N}, \qquad n = 0, 1, \ldots, N-1 \tag{A2}
\]
where X(k) represents the Fourier coefficients of x(n), and the integer N is the number of time (or frequency) samples. Basically, the computational problem for the DFT is to compute the sequence {X(k)} of N complex valued numbers given another sequence of data {x(n)} of length N, according to Eq. (A1). In general, the transform into the frequency domain will be a complex valued function, that is, with magnitude and phase. Real valued series can be represented by setting the imaginary part to 0. The Fast Fourier Transform (FFT) is simply a class of special algorithms which implement the DFT with considerable savings in computational time.
\[
X(k) = \mathrm{FFT}\{x(n)\} \;\Leftrightarrow\; x(n) = \mathrm{FFT}^{-1}\{X(k)\} \tag{A3}
\]
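To make the transform pair concrete, the sketch below evaluates the forward and inverse DFT directly from their defining sums in Python, using the standard convention of a negative exponent in the forward transform. It is an illustration under stated assumptions, not the routine used in the study: the function names `dft` and `idft` and the 8-sample signal are hypothetical, and the direct sums cost on the order of N² operations, which is what an FFT avoids.

```python
import cmath


def dft(x):
    """Direct evaluation of Eq. (A1):
    X(k) = sum over n of x(n) * exp(-j*2*pi*k*n/N)."""
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]


def idft(X):
    """Direct evaluation of Eq. (A2):
    x(n) = (1/N) * sum over k of X(k) * exp(+j*2*pi*k*n/N)."""
    N = len(X)
    return [
        sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
        for n in range(N)
    ]


# Hypothetical real signal: two cycles of a sine wave over 8 samples
signal = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]

spectrum = dft(signal)      # complex values: magnitude and phase per bin
recovered = idft(spectrum)  # round trip back to the time domain

print([round(abs(v), 6) for v in spectrum])
print([round(v.real, 6) for v in recovered])
```

For this signal the energy is concentrated in bins k = 2 and k = 6, the mirror pair expected for a real-valued input, and the inverse transform returns the original samples up to rounding error.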
Based on time- and frequency-domain analyses of the sound signals of pistachio nuts, three sets of features were generated for PCA. These features were obtained from PCs of signal peak values in the time-domain (A), PCs of signal phase in the frequency-domain (Ph), and PCs of power spectral density in the frequency-domain (PSD). While it is possible to develop FFT algorithms that work with any number of points, maximum computational efficiency is obtained by constraining the number of time points to be an integer power of two, e.g. 1024 or 2048. For this purpose, the DFT of the sound signal, X(k), is first obtained using Eqs. (A1) and (A3) by taking a 1024-point FFT of the sound signal x(n). The magnitude, phase angle and PSD (a measure of the power at various frequencies) are then calculated from the signal spectrum as:
\[
\text{Magnitude} = |X(k)| = \left|\mathrm{FFT}\{x(n), 1024\}\right| \tag{A4}
\]
\[
\text{Phase} = \angle X(k) \tag{A5}
\]
\[
\mathrm{PSD} = \frac{X(k)\, X^{*}(k)}{1024} \tag{A6}
\]
where * denotes complex conjugation. In Matlab, the function fft() computes the FFT of the signal. Eqs. (A4)–(A6) were evaluated with the following Matlab code:

    Amp = fft(A,1024);            % 1024-point FFT of the time-domain signal A
    Ph = angle(Amp);              % phase angle of the spectrum, Eq. (A5)
    PSD = Amp.*conj(Amp)/1024;    % power spectral density, Eq. (A6)

In Matlab (version R2007a), standardization and PCA are achieved through the mapminmax and processpca functions, respectively:

    [pnPeak, pp1] = mapminmax(A);                  % normalize time-domain peak values
    [ptrans1, ps1] = processpca(pnPeak, 0.001);    % PCs of peak values
    [pnPhase, pp2] = mapminmax(Ph);                % normalize phase features
    [ptrans2, ps2] = processpca(pnPhase, 0.02);    % PCs of phase
    [pnPsd, pp3] = mapminmax(PSD);                 % normalize PSD features
    [ptrans3, ps3] = processpca(pnPsd, 0.01);      % PCs of PSD

Using similar code, PCs for different combinations of amplitude (A), phase (Ph) and power spectral density (PSD) features, ranging in number from 6 to 185, can easily be generated. The results are shown in Fig. 5.
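For readers without Matlab, the feature computation of Eqs. (A4)–(A6) can be sketched in Python using only the standard library. This is an illustrative sketch, not the code used in the study: the recursive radix-2 routine `fft`, the helper `spectral_features` and the cosine test signal are all hypothetical, and the zero-padding or truncation to 1024 samples imitates the documented behaviour of Matlab's fft(A,1024).

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    N = len(x)
    if N == 1:
        return [complex(x[0])]
    even = fft(x[0::2])  # transform of even-indexed samples
    odd = fft(x[1::2])   # transform of odd-indexed samples
    out = [0j] * N
    for k in range(N // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / N) * odd[k]
        out[k] = even[k] + twiddle
        out[k + N // 2] = even[k] - twiddle
    return out


def spectral_features(signal, n_fft=1024):
    """Magnitude (A4), phase (A5) and PSD (A6) of an n_fft-point FFT."""
    # Truncate or zero-pad the signal to exactly n_fft samples
    padded = list(signal[:n_fft]) + [0.0] * max(0, n_fft - len(signal))
    X = fft(padded)
    magnitude = [abs(v) for v in X]
    phase = [cmath.phase(v) for v in X]
    psd = [(v * v.conjugate()).real / n_fft for v in X]
    return magnitude, phase, psd


# Hypothetical test signal: a cosine with exactly 50 cycles in 1024 samples
signal = [math.cos(2 * math.pi * 50 * n / 1024) for n in range(1024)]
magnitude, phase, psd = spectral_features(signal)

# Locate the strongest bin in the first half of the spectrum
peak_bin = max(range(512), key=lambda k: magnitude[k])
print(peak_bin, round(psd[peak_bin], 3))
```

Restricting the length to a power of two is what lets the routine split the problem in half at every level, which is the efficiency argument made above for choosing 1024 points.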
References

Amoodeh, M. T., Khoshtaghaza, M. H., & Minaei, S. (2006). Acoustic on-line grain moisture meter. Computers and Electronics in Agriculture, 52, 71–78.
Brosnan, T., & Sun, D. W. (2002). Inspection and grading of agricultural and food products by computer vision systems – A review. Computers and Electronics in Agriculture, 36, 193–213.
Casasent, D. A., Sipe, M. A., Schatzki, T. F., Keagy, P. W., & Lee, L. C. (1998). Neural net classification of X-ray pistachio nut data. Lebensmittel Wissenschaft und Technologie, 31(2), 122–128.
Cetin, A. E., Pearson, T. C., & Tewfik, A. H. (2004a). Classification of closed and open shell pistachio nuts using voice recognition technology. Transactions of ASAE, 47, 659–664.
Cetin, A. E., Pearson, T. C., & Tewfik, A. H. (2004b). Classification of closed and open shell pistachio nuts using principal component analysis of impact acoustics. In Proceedings of IEEE international conference acoustics, speech and signal processing (ICASSP'04), vol. 5, V677-80.
Duda, R. O., Hart, P. E., & Stork, D. G. (2000). Pattern classification (2nd ed.). Wiley Interscience.
Ghazanfari, A., Irudayaraj, J., & Kusalik, A. (1996). Grading pistachio nuts using a neural network approach. Transactions of ASAE, 39(6), 2319–2324.
Ghazanfari, A., Irudayaraj, J., Kusalik, A., & Romaniuk, M. (1997a). Machine vision grading of pistachio nuts using Fourier descriptors. Journal of Agricultural Engineering Research, 68(3), 247–252.
Ghazanfari, A., Kusalik, A., & Irudayaraj, J. (1997b). Application of a multi-structure neural network to sorting pistachio nuts. International Journal of Neural System, 8(1), 55–61.
Haykin, S. (1999). Neural networks: A comprehensive foundation. Prentice-Hall.
Jolliffe, I. T. (1986). Principal component analysis. Springer-Verlag.
Kalkan, H., & Yardimci, Y. (2006). Classification of hazelnuts by impact acoustics. In Proceedings 16th IEEE signal processing society workshop on MLSP, pp. 325–330.
Kavdir, I., & Guyer, D. E. (2008). Evaluation of different pattern recognition techniques for apple sorting. Biosystems Engineering, 99, 211–219.
MathWorks, MATLAB User's Guide, The MathWorks, Inc., 2007.
Nakhaeinejad, M. (1998). Pistachio hulling and processing in Iran. Kerman, Iran: Momtazan Industrial Co.
NeuroSolutions for Excel, NeuroDimension, Inc., 2005.
Onaran, I., Pearson, T. C., Yardimci, Y., & Cetin, A. E. (2006). Detection of underdeveloped hazelnuts from fully developed nuts by impact acoustics. Transactions of ASAE, 49(6), 1971–1976.
Pearson, T. C. (1996). Machine vision system for automated detection of stained pistachio nuts. Lebensmittel Wissenschaft Technologie, 29(3), 203–209.
Pearson, T. C. (2001). Detection of pistachio nuts with closed shells using impact acoustics. Applied Engineering in Agriculture, 17(2), 249–253.
Pearson, T. C., Cetin, A. E., Tewfik, A. H., & Haff, R. P. (2007). Feasibility of impact-acoustic emissions for detection of damaged wheat kernels. Digital Signal Processing, 17, 617–633.
Pearson, T. C., Doster, M., & Michailides, T. J. (2001). Automated detection of pistachio defects by machine vision. Applied Engineering in Agriculture, 17(5), 729–732.
Pearson, T. C., Slaughter, D. C., & Studer, H. E. (1994). Physical properties of pistachio nuts. Transactions of ASAE, 37(3), 913–918.
Razavi, M. A., Emadzadeh, B., Rafe, A., & Mohammad-Amini, A. (2007a). The physical properties of pistachio nut and its kernel as a function of moisture content and variety – Part I. Geometrical properties. Journal of Food Engineering, 81(1), 209–217.
Razavi, M. A., Rafe, A., Mohammadi-Moghaddam, T., & Mohammad-Amini, A. (2007b). Physical properties of pistachio nut and its kernel as a function of moisture content and variety – Part II. Gravimetrical properties. Journal of Food Engineering, 81(1), 218–225.
Razavi, M. A., Mohammad-Amini, A., Rafe, A., & Emadzadeh, B. (2007c). Physical properties of pistachio nut and its kernel as a function of moisture content and variety – Part III. Frictional properties. Journal of Food Engineering, 81(1), 226–235.
Razavi, M. A., Rafe, A., & Akbari, R. (2007d). Terminal velocity of pistachio nut and its kernel as affected by moisture content and variety. African Journal of Agricultural Research, 2(12), 663–666.
Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning internal representations by backpropagation errors. Nature, 322, 533–536.
Shahin, M. A., & Symons, S. J. (2001a). A machine vision system for grading lentils. Canadian Biosystem Engineering, 43, 7.7–7.14.
Shahin, M. A., Tollner, E. W., & McClendon, R. W. (2001b). Artificial intelligence classifiers for sorting apples based on watercore. Journal of Agricultural Engineering Research, 79(3), 265–274.
Shahin, M. A., Tollner, E. W., Gitaitis, R. D., Sumner, D. R., & Maw, B. W. (2002). Classification of sweet onions based on internal defects using image processing and neural network techniques. Transactions of ASAE, 45(5), 1613–1618.