Computers and Electronics in Agriculture 119 (2015) 40–50
Original papers
A computer vision based technique for identification of acrylamide in potato chips
Malay Kishore Dutta a,⇑, Anushikha Singh a, Sabari Ghosal b
a Department of Electronics & Communication Engineering, Amity University, Noida, India
b Amity Institute of Biotechnology, Amity University, Noida, India
Article info
Article history: Received 18 July 2015; received in revised form 12 October 2015; accepted 13 October 2015
Keywords: Toxic substances; Acrylamide; Image analysis; Feature extraction from images; Classification; Support vector machine
Abstract
Acrylamide is a well-known neurotoxin commonly found in fried and baked food items such as potato chips, cookies, biscuits and French fries. Identification of such toxic chemicals in fried food is of great importance. Conventional methods of acrylamide identification in food items are time consuming, expensive and may need specialized manpower. The proposed work presents a computer vision based non-destructive method to identify the presence of acrylamide in potato chips. The method is based on analysis and classification of the discriminatory features of the image in the spatial domain. The potato chips are automatically segmented from the image, and statistical and texture features are then extracted from the segmented image in the spatial domain. These features are analyzed for identification of acrylamide content using a support vector machine (SVM) classifier. The discriminatory variation in the features of the image is related to the presence of acrylamide using image processing techniques. The experimental results show accuracy of over 94% and sensitivity of 96%, indicating that this method could be explored for viable commercial use. © 2015 Elsevier B.V. All rights reserved.
⇑ Corresponding author. E-mail addresses: [email protected] (M.K. Dutta), anushikha4june@gmail.com (A. Singh), [email protected] (S. Ghosal).
http://dx.doi.org/10.1016/j.compag.2015.10.007
0168-1699/© 2015 Elsevier B.V. All rights reserved.

1. Introduction
Fried and baked potato-based food items such as French fries and potato chips are common in all parts of the world. Toxic substances are formed in starchy food like potato when it is fried at high temperature, and acrylamide is one such carcinogenic substance. Mottram et al. (2002) showed that acrylamide is formed in starchy food like potato during heating because of the Maillard reaction between amino acids and reducing sugars. They reported that asparagine was the main cause of acrylamide formation in potatoes and cereals. Many other studies (Pedreschi et al., 2010; Rosen and Hellenäs, 2002) have been carried out in this direction, and their alarming reports indicate that this carcinogenic substance is present in fried potato products and that the problem needs to be addressed.
Conventional methods based on chemical analysis of toxic substances like acrylamide in food items may be time consuming, expensive and involve specialized manpower. Other methods, such as spectroscopic methods, are destructive: the food sample under test is destroyed. Hence there is a need for fast and inexpensive methods that can be applied to food samples for accurate results in real time. Image processing based methods suit these requirements and have been investigated quite extensively in recent times. Image processing can be considered a convenient non-destructive alternative for quick identification of toxic substances in food items, batch after batch, during commercial production.
With regard to image analysis for identification of toxic substances, some related work is summarized here. Gokmen et al. (2007) estimated the concentration of acrylamide in potato chips and French fries using an image processing based method. The method was based on color analysis of the food image to measure the acrylamide content, and the reported results are encouraging. He et al. (2013) also studied the color of food images to identify the presence of acrylamide in fried potato chips; in that work a linear regression equation was established from the color information of the potato chips for acrylamide identification. Pedreschi et al. (2006) proposed a computer vision based method to measure the color and shape of heterogeneous food material. They also studied (Pedreschi et al., 2005) the kinetics of color changes during frying of blanched and unblanched potato chips, and developed a relationship between the color change in the food image and the formation of acrylamide in the food sample. They used potato chip images for experimental purposes, and the reported experimental
results indicate that this image processing method successfully identified acrylamide in potato chips. Pedreschi et al. (2007) proposed a computer vision based method for quality evaluation of potato chips for acrylamide identification. In that work discrete color categories were obtained from all the possible combinations of gray levels for the segmented region, and the results were encouraging. These encouraging results have created immense interest in this area, which needs to be explored further to find suitable cost-effective and efficient methods.
This work proposes a non-destructive computer vision based method to identify the presence of acrylamide in fried potato chips. The main contribution of this paper is an efficient and accurate non-destructive image processing based method for identification of acrylamide in potato chips using a support vector machine (SVM) classifier. For accurate discriminatory feature extraction, the area of the potato chips is segmented from the image and features are extracted from this segmented image only. To improve the accuracy and efficiency of acrylamide identification, statistical and texture features are extracted from the segmented potato chip images and compressed using principal component analysis (PCA). To improve the performance of the classifier, only the discriminatory features that remain after feature normalization and feature reduction are subjected to classification. The proposed method achieves accuracy over 94% and sensitivity of 96% when the SVM is applied with a polynomial kernel of order 3 and an RBF kernel with gamma 3. The proposed image processing prototype is non-destructive and is suitable for real-time acrylamide identification in fried/baked potato chips.
The rest of the paper is organized as follows. Section 2 describes the materials and methods followed in this work. Section 3 describes the measurement of acrylamide using LC–MS analysis. Section 4 covers the computer vision based analysis of potato chip images. Section 5 describes the feature analysis of potato chip images for acrylamide identification. Section 6 presents the proposed methodology, including ROI segmentation, feature extraction, feature normalization, feature reduction and acrylamide identification using classification. Section 7 reports the experimental results obtained using the proposed method. Section 8 highlights the final remarks on the experimental results. Finally, Section 9 concludes the paper.
2. Materials and methods
2.1. Materials
Potatoes of the variety Kufri Anand and vegetable oil (canola or sunflower oil) were the raw materials used for sample preparation. Potatoes stored at 4 °C and 90% relative humidity were thoroughly washed in water and gently peeled before cutting. Potato slices of thickness 2.0 mm were cut using a potato chip slicer machine.
2.2. Pre-treatments and sample preparation
Potato slices were rinsed immediately after cutting for 1 min in distilled water to remove any excess starch adhering to the surface prior to frying. The slices were cooked in corn, canola and/or sunflower oil to make crunchy potato chips using an electrical fryer at different frying conditions (temperature 120 °C to 180 °C) (Pedreschi et al., 2006) and were then drained after frying.
2.3. Image acquisition of potato chips for image processing
Images of potato chips were captured using a self-developed image acquisition system. Samples were illuminated using four white fluorescent lamps (length of 2 feet) and four 25 W CFLs. The four lamps and four CFLs were arranged as a square 35 cm above the sample and at an angle of 45° to the sample plane to give a uniform light intensity over the food sample. Images were captured using a color digital camera located vertically above the sample at a distance of approximately 25 cm. The 8 megapixel digital camera with auto focus provides images in JPEG format. The angle between the camera lens axis and the lighting sources was around 45°. The sample illuminators and camera were inside a box whose internal walls were painted white to avoid light and reflection from the room. Images were captured at the maximum resolution of the camera (3104 × 1746 pixels), which was connected to the USB port of a computer with an Intel Core i3 processor. Images were stored on the computer directly via the USB port in JPEG format. Fig. 1 shows the image acquisition setup used to capture images for image processing.
Fig. 1. Image acquisition setup.
2.4. Image analysis for acrylamide identification
In the proposed work, identification of acrylamide in potato chips was based on image analysis of the potato chips. The input sample image was preprocessed and the region of interest was segmented out from the input image. Statistical and texture features were considered to explore the possibility of discriminating between normal images and acrylamide-containing images. A dimension reduction technique was employed for feature reduction to reduce the time complexity and improve the performance. A supervised classifier was used to classify normal potato chip images and acrylamide-containing potato chip images using these image features.
3. Measurement of acrylamide using LC–MS analysis
3.1. Sample preparation
Samples were prepared by the procedure described by Gokmen et al. (2007). In brief, 1 g of finely ground potato chips obtained after various degrees of frying under controlled conditions was suspended in 5 mL of methanol and 13C3 labeled acrylamide
(1000 ng/g) in a 10 mL glass centrifuge tube. The mixture, after homogenization for 2 min, was centrifuged at 5000 rpm for 10 min. The clear supernatant was treated with Carrez I and Carrez II solutions (25 µL each) to precipitate the co-extractives, and centrifugation was performed at 5000 rpm for 5 min. Quantitatively, 1 mL of the supernatant was concentrated to ca. 50 µL followed by immediate reconstitution to a total volume of 1 mL with water. For SPE clean-up, a Waters HLB cartridge was preconditioned with 1 mL of methanol and 1 mL of water. Subsequently, 1 mL of the extract was passed through the preconditioned cartridge using a syringe. The first 500 µL of the eluent was discarded and the forthcoming drops were collected and passed through a syringe filter (0.45 µm). Twenty µL of the final test solution was injected into the LC column for LC–MS analysis.
3.2. LC–MS analysis
LC–MS analysis of acrylamide in food samples was performed following Gokmen et al. (2007). An Agilent 1100 HPLC system consisting of a binary pump, an autosampler and a temperature-controlled column oven, coupled to a detector (Agilent 1100 MS) equipped with an atmospheric pressure chemical ionization interface, was used for the analysis. The analytical separation was carried out on a Waters C18 column (250 × 4.6 mm, 5 µm) using an isocratic mixture of 0.01 mM acetic acid in 0.2% aqueous solution of formic acid at a flow rate of 0.6 mL/min at 25 °C. The interface parameters were: drying gas (N2, 100 psig), flow rate 4 L/min, nebulizer pressure 60 psig, drying gas temperature of 325 °C, vaporizer temperature of 425 °C, capillary voltage of 4 kV, corona current of 4 µA, and fragmentor voltage of 55 eV. Ions monitored were m/z 72 and 55 for acrylamide and m/z 75 and 58 for 13C labeled acrylamide. All the samples whose images are used in this work were subjected to this LC–MS method to label them as normal or acrylamide content samples.
4. Computer vision based analysis of potato chips images
The color of potato chips is an important parameter to be controlled during processing, together with crispness, oil and acrylamide content (Pedreschi et al., 2007; Rosen and Hellenäs, 2002; Scanlon et al., 1994). Fried potato color is the result of the Maillard reaction, which depends on the content of reducing sugars, amino acids or proteins at the surface, and on the temperature and time of frying (Smith, 1975; Márquez and Añón, 1986). Among the different classes of physical properties of fried items, color is considered the most important visual attribute in the perception of product quality. Hence the color of a potato chip sample image can be a measure of the acrylamide content in the potato chips. Fig. 2(a) shows the color components present in the sample image of potato chips in which acrylamide is not present. Similarly, Fig. 2(b) shows the color components for the sample image in which acrylamide is present. It can be seen in Fig. 2 that the presence of acrylamide changes the color components of a food sample image relative to a normal food sample image.
The presence of a toxic substance like acrylamide can be evaluated by analyzing the color of the potato chip image (Pedreschi et al., 2006). In terms of image processing, a color image (RGB image) can be considered as the combination of three different channels: red, green and blue. Since these three primary colors make up an RGB image, the three channels are expected to show discriminatory variation in pixel intensity values between normal and acrylamide content sample images, and the results presented in Gokmen et al. (2007) are encouraging. Since a color RGB image is a 3-dimensional image, it may be difficult and more complex to process the RGB image directly for acrylamide identification. A 2-dimensional (2D) image will provide faster processing with less computational cost. The 3-dimensional (3D) RGB image can be converted into a 2D gray scale image using the following image processing operation:
$$\text{Gray scale image } I = \frac{R + G + B}{3} \tag{1}$$
where R, G and B represent the red, green and blue channels of the RGB image respectively. Fig. 3 shows the RGB image of potato chips, the red, green and blue channels of the RGB image and the gray scale image of the potato chips. Since this 2-dimensional gray scale image is the intensity average of all three channels of the RGB image, the discriminatory features of acrylamide are expected to be retained in it. Accordingly, if the gray scale image contains the discriminatory features, then processing this 2-D image for acrylamide identification will be computationally cheap, increasing the efficiency of the method. Fig. 4 shows the number of pixels at each gray level in the gray scale version of the normal image and the acrylamide content image respectively. The gray level intensity distribution indicates that normal and acrylamide sample images differ in the pattern of the distribution. Hence processing of the 2-dimensional gray image seems to be a feasible approach for acrylamide identification using image processing.
5. Feature analysis of potato chips image for identification of acrylamide
It was observed from the histogram distributions that gray scale images of potato chips differ in pixel intensity values between normal and acrylamide sample images. Based on these observations, gray scale images are used in this work to identify the presence of acrylamide in the samples. Since the number of pixels in the gray image is very high, depending on the size of the image, processing individual pixel values for acrylamide identification may be time consuming. Hence, for efficient computation, some statistical parameters may act as representatives of the image for analysis. Statistical features like mean, standard deviation and variance are calculated from the segmented image, which is the region of interest (ROI), in the spatial domain, and it is observed that these features discriminate between normal and acrylamide samples. Fig. 5 shows the plot of the mean for normal and acrylamide content samples. It can be clearly seen in the figure that the mean value shows discriminatory variation between normal sample images and acrylamide sample images. Similarly, Fig. 6 indicates a discriminatory pattern variation for the value of the variance. These observations motivate the exploration of various statistical and texture features from the gray images for identification of acrylamide in potato chip images.
6. The proposed method
The proposed work presents a non-destructive computer vision based method for acrylamide identification in fried/baked potato chips. Statistical and texture features from the segmented gray image of potato chips were used to discriminate between normal potato chip images and acrylamide-containing potato chip images. The accuracy of acrylamide identification may be improved if prominent distinct features are considered for analysis and classification. Significant features will be extracted if only the informative area of the image is considered for feature extraction. In this case the potato chip area needs to be segmented from the background image so that only informative features are considered and redundant information from the background is
Fig. 2. Color map for sample images. (a) Potato chips image in which acrylamide is not present (normal image). (b) Potato chips image in which acrylamide is present (acrylamide content image). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Fig. 3. Sample image of potato chips. (a) RGB image (color image). (b) Red channel of RGB image. (c) Green channel of RGB image. (d) Blue channel image of RGB image. (e) Gray scale image. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
removed. The actual area of the potato chip is the region of interest (ROI) in this case. In the proposed work acrylamide identification was done by a supervised classification technique using the more informative features from the segmented gray image of the potato chip. The accuracy of a classifier can be improved if more distinct features are used for classification, and the complexity of a classifier is closely related to the number of features used for classification. To increase the accuracy and reduce the complexity of the classifier, a dimension reduction algorithm was used which transforms the high dimensional feature vector into a compressed but representative feature vector. Finally this feature vector was used to identify the presence of acrylamide using a supervised classification technique. Fig. 7 shows the block diagram of the proposed method.
The proposed work to identify the presence of acrylamide in potato chips involves region of interest (ROI) segmentation, feature extraction from the ROI segmented image, feature normalization and feature reduction, and finally acrylamide identification using supervised classification.
6.1. ROI segmentation The area of potato chips is considered as ROI as this is the only region which contains the information required for acrylamide identification. Color image of potato chips is used as an input image for ROI segmentation. White background in this case is a convenient option for segmenting the ROI.
In the RGB representation a color image can be considered to be composed of three independent component images for the red, green and blue primary colors, but it may not be possible to obtain the color information and the intensity information directly from these components. Hence, to segment the area of the potato chips (ROI), the RGB image is converted into the HSI model, which separates the color components from the intensity in an image (Gonzalez and Woods, 2001). In the HSI model, hue (H) is a color attribute that describes the pure color (pure yellow, orange or red), whereas saturation (S) gives a measure of the degree to which a pure color is diluted by white light. Intensity (I), or brightness, is the intensity component of the color image. The white background in these images facilitates the application of this model. For ROI segmentation the "S" component is selected, as it shows the highest intensity difference between the area of the potato chips and the background, as shown in Fig. 8(b). It can be clearly seen in Fig. 8(b) that the background of the image is completely black in the "S" component, and hence the ROI can be segmented accurately using a thresholding operation (Gonzalez and Woods, 2001; Otsu, 1979). Noise present in the segmented ROI can be removed using morphological operations. The steps of the proposed method of ROI segmentation are given in Algorithm 1.
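The segmentation idea described in this paragraph (saturation channel, Otsu threshold, morphological clean-up) can be sketched in Python as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the use of NumPy and SciPy, the handling of pure black pixels, and the reading of a disk "of size 2" as a radius of 2 pixels are all assumptions.

```python
import numpy as np
from scipy import ndimage


def saturation_channel(rgb):
    """HSI saturation S = 1 - 3*min(R, G, B)/(R + G + B) for an H x W x 3 array."""
    rgb = rgb.astype(np.float64)
    total = rgb.sum(axis=2)
    safe_total = np.where(total == 0, 1.0, total)   # avoid dividing by zero
    sat = 1.0 - 3.0 * rgb.min(axis=2) / safe_total
    sat[total == 0] = 0.0                           # assumption: black counts as unsaturated
    return sat


def otsu_threshold(values, bins=256):
    """Otsu's method: pick the histogram bin that maximises between-class variance."""
    hist, edges = np.histogram(values, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    w0 = np.cumsum(hist)                            # pixels at or below each bin
    w1 = hist.sum() - w0                            # pixels above each bin
    cum_sum = np.cumsum(hist * centers)
    m0 = cum_sum / np.maximum(w0, 1)                # mean of the lower class
    m1 = (cum_sum[-1] - cum_sum) / np.maximum(w1, 1)  # mean of the upper class
    between = w0 * w1 * (m0 - m1) ** 2
    return centers[np.argmax(between)]


def segment_roi(rgb, disk_radius=2):
    """Return a boolean mask of the chip area (ROI) on a white background."""
    sat = saturation_channel(rgb)
    mask = sat > otsu_threshold(sat)                # binarise the S component
    mask = ndimage.binary_fill_holes(mask)          # fill small holes in the chip
    labels, count = ndimage.label(mask)             # connected objects
    if count == 0:
        return mask
    sizes = np.bincount(labels.ravel())[1:]         # object sizes, background skipped
    mask = labels == (np.argmax(sizes) + 1)         # keep only the largest object
    r = disk_radius
    yy, xx = np.ogrid[-r:r + 1, -r:r + 1]
    disk = (xx ** 2 + yy ** 2) <= r ** 2            # disk-shaped structuring element
    return ndimage.binary_dilation(mask, structure=disk)  # smooth the boundary
```

The returned mask can then be applied to the gray scale image so that only chip pixels contribute to the features.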
Algorithm 1. Region of interest (ROI) segmentation
Step 1: Read the input color image (RGB image).
Step 2: Convert the RGB image into HSI color space, where H, S and I represent the hue, saturation and intensity components of the HSI color image. If R, G and B are the red, green and blue components of the RGB image, then

$$H = \begin{cases} \beta & \text{if } B \le G \\ 360 - \beta & \text{if } B > G \end{cases} \tag{2}$$

where

$$\beta = \cos^{-1}\left\{ \frac{\tfrac{1}{2}\left[(R - G) + (R - B)\right]}{\left[(R - G)^2 + (R - B)(G - B)\right]^{1/2}} \right\} \tag{3}$$

$$S = 1 - \frac{3}{R + G + B}\,\min(R, G, B) \tag{4}$$

$$I = \frac{R + G + B}{3} \tag{5}$$

Step 3: Select the "S" component image to segment the ROI.
Step 4: Convert the S component image into a binary image by thresholding: if S(x, y) > s1 then s(x, y) = 1; else s(x, y) = 0, where S(x, y) is the pixel intensity of the "S" component image at location (x, y), s(x, y) is the resulting binary value, and s1 is the threshold parameter calculated automatically using the Otsu thresholding technique (Gonzalez and Woods, 2001). In this binary image, white pixels "1" represent the potato chip area (ROI) and the remaining black pixels "0" belong to the background.
Step 5: Apply the morphological "holes filling" operation to fill the small holes in the segmented potato chip area, so that the complete ROI is segmented.
Step 6: Select the object (ROI) having the largest area to remove noisy pixels.
Step 7: Smooth the boundary of the segmented ROI using the morphological "dilation" operation: I_ROI = I ⊕ S_d, where I_ROI is the segmented ROI, I is the object having the largest area and S_d is a disk-shaped structuring element of size "2".

Fig. 8 shows four sample RGB images of potato chips, their saturation components and the segmented ROI in the gray images. It can be clearly seen from Fig. 8 that the area of the potato chips (ROI) is segmented accurately using the proposed method.
6.2. Feature extraction from ROI segmented gray scale image
It was discussed in Section 5 that spatial domain features of the gray image show discriminatory pattern variation between normal and acrylamide samples of potato chips. To develop an accurate and efficient image processing based system for acrylamide identification, statistical and texture features are extracted from the segmented gray image. These features are mean, variance, standard deviation, contrast, correlation, energy, homogeneity and the features obtained from the gray-level run length (GLRL) matrix. A total of 12 features are obtained from the image in the spatial domain for further analysis. Brief definitions of these features are given below.
6.2.1. Statistical features: mean, variance, standard deviation
Statistical features of an image in the spatial domain measure the variation in the intensity values of the pixels. The mean is defined as the average intensity of the pixels in an image. The standard deviation is a measure of how spread out the intensity values of the pixels are, and the variance is defined as the average of the squared deviations from the mean. These statistical features are calculated for the segmented gray scale image I of size M × N using the following equations:

$$\text{Mean } (\sigma) = \frac{\sum_i \sum_j I(i, j)}{MN} \tag{6}$$

$$\text{Variance} = \frac{\sum_i \sum_j \{I(i, j) - \sigma\}^2}{MN} \tag{7}$$

$$\text{Standard deviation} = \sqrt{\frac{\sum_i \sum_j \{I(i, j) - \sigma\}^2}{MN}} \tag{8}$$
where M and N are the number of rows and columns in the image I.
6.2.2. Texture features: GLCM features and features obtained from the gray-level run length (GLRL) matrix
Texture descriptors provide measures of properties such as smoothness, coarseness and regularity, which indicate a mutual relationship among intensity values of neighboring pixels repeated over an area larger than the size of the relationship. Such properties can be used as features for classification and pattern recognition.
6.2.3. Gray-level co-occurrence matrix
A gray-level co-occurrence matrix (GLCM) depicts how often different combinations of pixel brightness values (gray levels) occur in an image. It is a second order measure because it measures the relationship between neighboring pixels. For an image of size M × N, a second-order statistical textural analysis is done by constructing the GLCM (Galloway, 1975) as
$$C_d(i, j) = \left|\{(p, q), (p + dx, q + dy) : I(p, q) = i,\ I(p + dx, q + dy) = j\}\right| \tag{9}$$
in which (p, q), (p + dx, q + dy) ∈ M × N, d = (dx, dy) and |·| denotes the cardinality of a set. Then, given a gray level i in an image, the probability that a pixel at a (dx, dy) distance away has gray level j is:
$$P_d(i, j) = \frac{C_d(i, j)}{\sum_i \sum_j C_d(i, j)} \tag{10}$$
Using (9) and (10), four features (energy, contrast, homogeneity and entropy) are calculated using the following set of equations:
$$\text{Energy } E = \sqrt{\sum_i \sum_j \left[P_d(i, j)\right]^2} \tag{11}$$

$$\text{Contrast } C = \sum_i \sum_j (i - j)^2\, P_d(i, j) \tag{12}$$

$$\text{Homogeneity } H = \sum_i \sum_j \frac{P_d(i, j)}{1 + (i - j)^2} \tag{13}$$

$$\text{Entropy } E = -\sum_i \sum_j P_d(i, j)\, \ln P_d(i, j) \tag{14}$$

Fig. 4. Histogram for sample image of potato chips (number of pixels versus gray levels). (a) Normal image. (b) Acrylamide image.
Fig. 5. Plot of mean value for normal and acrylamide samples (mean value versus samples).
Fig. 6. Plot of the value of variance for normal and acrylamide samples (variance versus samples).

6.2.4. Run length matrix
In the run-length matrix P_θ(i, j), each cell contains the number of runs in which gray level i appears j times in succession in the direction θ; the variable j is termed the run length. The resultant matrix characterizes the gray-level runs by the gray tone, length and direction of the run. As a common practice, run length matrices for θ equal to 0°, 45°, 90° and 135° are calculated to determine the following features (Galloway, 1975; Tan et al., 2010): short-run emphasis, long-run emphasis, gray-level non-uniformity, run-length non-uniformity and run percentage.

$$\text{Short Run Emphasis (SRE)}: \quad \frac{\sum_i \sum_j P_\theta(i, j) / j^2}{\sum_i \sum_j P_\theta(i, j)} \tag{15}$$

$$\text{Long Run Emphasis (LRE)}: \quad \frac{\sum_i \sum_j j^2\, P_\theta(i, j)}{\sum_i \sum_j P_\theta(i, j)} \tag{16}$$

$$\text{Gray level Non-uniformity (GLN)}: \quad \frac{\sum_i \left[\sum_j P_\theta(i, j)\right]^2}{\sum_i \sum_j P_\theta(i, j)} \tag{17}$$

$$\text{Run Length Non-uniformity (RLN)}: \quad \frac{\sum_j \left[\sum_i P_\theta(i, j)\right]^2}{\sum_i \sum_j P_\theta(i, j)} \tag{18}$$

$$\text{Run Percentage (RP)}: \quad \frac{\sum_i \sum_j P_\theta(i, j)}{A} \tag{19}$$
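As an illustration of the texture features in Eqs. (9)–(19), the following Python sketch builds a co-occurrence matrix and a horizontal run-length matrix with NumPy. It is a minimal sketch under stated assumptions, not the authors' code: the function names are hypothetical, the image is assumed to be already quantised to integer gray levels below `levels`, only θ = 0° and non-negative offsets are handled, and A in Eq. (19) is taken to be the number of pixels analysed.

```python
import numpy as np


def glcm(img, dx=1, dy=0, levels=8):
    """Normalised co-occurrence matrix P_d(i, j) for offset d = (dx, dy), dx, dy >= 0."""
    h, w = img.shape
    a = img[0:h - dy, 0:w - dx]                  # gray level at the reference pixel
    b = img[dy:h, dx:w]                          # gray level at the offset pixel
    counts = np.zeros((levels, levels))
    np.add.at(counts, (a.ravel(), b.ravel()), 1)  # Eq. (9): count co-occurrences
    return counts / counts.sum()                  # Eq. (10): normalise to probabilities


def glcm_features(p):
    """Energy, contrast, homogeneity and entropy, Eqs. (11)-(14)."""
    i, j = np.indices(p.shape)
    nz = p > 0                                    # skip empty cells, ln(0) is undefined
    return {
        "energy": np.sqrt(np.sum(p ** 2)),
        "contrast": np.sum((i - j) ** 2 * p),
        "homogeneity": np.sum(p / (1.0 + (i - j) ** 2)),
        "entropy": -np.sum(p[nz] * np.log(p[nz])),
    }


def run_length_matrix(img, levels=8):
    """Run-length matrix for theta = 0 degrees: rlm[i, j-1] = runs of level i, length j."""
    h, w = img.shape
    rlm = np.zeros((levels, w))
    for row in img:
        start = 0
        for k in range(1, w + 1):
            if k == w or row[k] != row[start]:    # the current run ends here
                rlm[row[start], k - start - 1] += 1
                start = k
    return rlm


def run_length_features(rlm, n_pixels):
    """SRE, LRE, GLN, RLN and RP, Eqs. (15)-(19); n_pixels plays the role of A."""
    j = np.arange(1, rlm.shape[1] + 1)            # possible run lengths
    total = rlm.sum()                             # total number of runs
    return {
        "SRE": np.sum(rlm / j ** 2) / total,
        "LRE": np.sum(rlm * j ** 2) / total,
        "GLN": np.sum(rlm.sum(axis=1) ** 2) / total,
        "RLN": np.sum(rlm.sum(axis=0) ** 2) / total,
        "RP": total / n_pixels,
    }
```

For the other directions (45°, 90°, 135°) the same run-counting loop could be applied to the columns or diagonals of the image.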
A total of 12 features are calculated from the segmented potato chip image in the spatial domain. Tables 1 and 2 show the spatial domain features for 6 normal sample images (no acrylamide present) and for 6 sample images in which acrylamide is present. Tables 1 and 2 indicate that these features discriminate between the two classes and can be used for acrylamide identification.
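The three statistical features among these 12 follow directly from the gray scale conversion and the definitions of mean, variance and standard deviation given earlier. A minimal NumPy sketch is shown below; the function names and the optional ROI mask argument are assumptions made for illustration.

```python
import numpy as np


def to_gray(rgb):
    """Gray scale image as the average of the R, G and B channels, Eq. (1)."""
    return rgb.astype(np.float64).mean(axis=2)


def statistical_features(gray, mask=None):
    """Mean, variance and standard deviation of the pixels, Eqs. (6)-(8).

    If a boolean ROI mask is given (an assumption of this sketch), only the
    masked pixels are used; otherwise all M x N pixels are used.
    """
    values = gray[mask] if mask is not None else gray.ravel()
    mean = values.mean()
    variance = np.mean((values - mean) ** 2)      # divides by the pixel count, as in Eq. (7)
    return {"mean": mean, "variance": variance, "std": np.sqrt(variance)}
```

Restricting the statistics to the ROI mask keeps white background pixels from dominating the mean and variance.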
[Block diagram: potato chips color (RGB) image → ROI segmentation (RGB to HSI conversion; selection of saturation (S) component; thresholding & morphological processing) → ROI segmented gray image → statistical & texture feature extraction → feature reduction → supervised classification → identification of presence of acrylamide]
Fig. 7. Block diagram indicating proposed image processing based acrylamide identification system.
6.3. Feature normalization and feature reduction
In the proposed work, acrylamide identification is done using supervised classification. To improve the performance of the classifier, the extracted image features are standardized. The "z-score normalization" technique (Dunham, 2002) is used to normalize the image features and improve the efficiency of classification. A feature vector (E) consisting of n features of an image is converted to zero mean and unit variance using the z-score normalization procedure. The normalized value of E_i, the ith column (feature) of the feature vector, is
$$\text{Normalized}(E_i) = \frac{E_i - \text{mean}}{\text{std}} \tag{20}$$
where normalized (Ei) is the normalized value of Ei. In Eq. (20) mean and std are the mean and standard deviation of feature vector E respectively.
$$\text{mean} = \frac{1}{n} \sum_{i=1}^{n} E_i \tag{21}$$

$$\text{std} = \sqrt{\frac{1}{n - 1} \sum_{i=1}^{n} (E_i - \text{mean})^2} \tag{22}$$
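The z-score procedure of Eqs. (20)–(22) can be sketched in a few lines of NumPy. The function name is hypothetical, and applying the normalization column by column to a samples × features matrix is an assumption of this sketch; for a one-dimensional vector it reduces to the equations as written.

```python
import numpy as np


def z_score(features, axis=0):
    """Zero-mean, unit-variance scaling following Eqs. (20)-(22).

    The standard deviation uses the (n - 1) denominator of Eq. (22), i.e. ddof=1.
    """
    features = np.asarray(features, dtype=np.float64)
    mean = features.mean(axis=axis, keepdims=True)          # Eq. (21)
    std = features.std(axis=axis, ddof=1, keepdims=True)    # Eq. (22)
    return (features - mean) / std                          # Eq. (20)
```

A constant feature would give a zero standard deviation, so such columns would need to be dropped or handled separately before calling this function.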
The spatial domain features of the image are normalized using this procedure and the normalized feature vector is created. The accuracy of a classifier can be improved if more distinct features are used for classification, while the complexity of a classifier is closely related to the number of features used and increases as more features are considered. Hence there is a need to reduce the features by selecting only the more discriminatory ones while rejecting the others. A dimension reduction algorithm transforms the high dimensional feature vector into a compressed but representative feature vector. On the basis of experiments it was found that principal component analysis (PCA) (Jolliffe, 2002; Abdi and Williams, 2010) provides the most discriminatory features for the feature vector used in this work. PCA is a dimension reduction method which uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables. These variables are called principal components (PCs). To reduce the dimension of the feature vector, the principal components of largest variance can be selected while rejecting those having smaller variance. The normalized feature vectors were compressed by PCA, and the two main principal components, which have the maximum discriminatory properties, are selected for further processing and classification to identify the presence of acrylamide in the image samples.
6.4. Acrylamide identification using supervised classification
To identify the presence of acrylamide in potato chips, the feature vectors compressed using PCA are used. To improve the performance of the classifier, only the two main principal components {pc1 pc2}, which are the most discriminatory, are used. This feature vector {pc1 pc2} can be regarded as the pattern of an image in a two-dimensional feature space; each sample image of potato chips is a point in this space. To identify acrylamide, the patterns must be categorized into classes so that patterns belonging to different classes are well separated. In the proposed work, two classes are considered:
Normal class: potato samples with no acrylamide or with acrylamide below the detection limit (HPLC).
Acrylamide class: potato samples with detectable acrylamide.
To identify acrylamide, a classifier must be built which can determine the class of a sample image. In this study a supervised classification technique is used. In supervised classification, the classifier is first trained with the training data set and its accuracy is later tested using the testing set. The training set consists of labeled data which train the classifier to decide the classes of the testing samples accurately. In this work there is no overlap between training data and test data.
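The compression of the normalized feature vectors to the two principal components {pc1 pc2} can be illustrated with a small NumPy sketch based on the singular value decomposition. The function name and the SVD route are assumptions; the text does not state how PCA was computed.

```python
import numpy as np


def pca_project(features, n_components=2):
    """Project a samples x features matrix onto its leading principal components.

    Returns the component scores and the variance captured by each component.
    """
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, singular, vt = np.linalg.svd(centered, full_matrices=False)
    explained = singular ** 2 / (len(features) - 1)
    scores = centered @ vt[:n_components].T
    return scores, explained[:n_components]
```

With a 150 × 12 matrix of normalized features, `pca_project(features)` would return a 150 × 2 array whose rows play the role of the {pc1 pc2} vectors; the two score columns are uncorrelated by construction.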
On the basis of the experiments performed, it was found that the support vector machine (SVM) classifier (Chang and Lin, 2001; Keerthi et al., 2001) provides the highest accuracy in classifying the PCA-compressed data set. The SVM classifier was trained with the given set of training data and later tested with the test samples. On the basis of experimental trials, the performance of the classifier was measured for different kernels, and the kernel that most accurately separates the data was selected.
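For orientation, the kernel families compared in the experiments (linear, quadratic, polynomial of order 3, and rbf) can be written out as plain functions. This is only a sketch under stated assumptions: the rbf form exp(-gamma * ||x - y||^2) and the polynomial offset `coef0 = 1` are common conventions rather than details given in the paper, and the two sample points are hypothetical {pc1 pc2} values.

```python
import math


def dot(x, y):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    return dot(x, y)


def polynomial_kernel(x, y, order=3, coef0=1.0):
    # coef0 is an assumed offset; order=2 gives the quadratic kernel.
    return (dot(x, y) + coef0) ** order


def rbf_kernel(x, y, gamma=3.0):
    # Assumed convention: exp(-gamma * squared Euclidean distance).
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Two hypothetical points in the {pc1, pc2} feature space.
a = (0.02, -0.01)
b = (-0.03, 0.04)

print("linear   ", linear_kernel(a, b))
print("quadratic", polynomial_kernel(a, b, order=2))
print("poly-3   ", polynomial_kernel(a, b, order=3))
print("rbf      ", rbf_kernel(a, b, gamma=3.0))
```

An SVM never needs the mapped features explicitly; it only evaluates such a kernel between pairs of samples, which is why changing the kernel changes the shape of the decision boundary without changing the input features.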
M.K. Dutta et al. / Computers and Electronics in Agriculture 119 (2015) 40–50
Fig. 8. (a) RGB image of potato chips. (b) "Saturation component" in HSI model. (c) ROI segmentation using Otsu thresholding & morphological processing. (d) Segmented ROI in gray image of potato chips.
7. Experimental results

In the proposed work, a database of 150 sample images of potato chips was used for experimentation. Of these 150 images, 75 are of normal potato chips and the remaining 75 are of potato chips in which acrylamide is present, as confirmed by LC–MS (Section 3.1). One hundred images are used to train the classifier and the remaining 50 images are used to test its accuracy. There is no overlap between the training data and the test data.
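A class-balanced, non-overlapping split of this kind can be sketched as below. The paper does not state how images were assigned to the two sets, so the random shuffling, the seed, and the function name `stratified_split` are assumptions made purely for illustration; only the counts (75 images per class, 100 for training, 50 for testing) come from the text.

```python
import random


def stratified_split(labels, n_train_per_class, seed=0):
    """Return disjoint train/test index lists with equal class counts in training."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == cls]
        rng.shuffle(idx)
        train.extend(idx[:n_train_per_class])
        test.extend(idx[n_train_per_class:])
    return sorted(train), sorted(test)


# 75 normal and 75 acrylamide images, as in the database described above.
labels = ["normal"] * 75 + ["acrylamide"] * 75
train_idx, test_idx = stratified_split(labels, n_train_per_class=50)

print(len(train_idx), len(test_idx))
```

Because every index lands in exactly one list, no test image can influence training, which is the property the no-overlap statement requires.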
The performance of a classifier is indicated by its sensitivity, specificity and accuracy. Sensitivity is the probability that a sample containing acrylamide is identified as an acrylamide sample, and specificity is the probability that a normal sample is classified as normal. Accuracy is the probability that a sample of either class is classified correctly. These parameters are defined as follows:
\[
\text{Sensitivity or True Positive Rate (TPR)} = \frac{TP}{TP + FN} \tag{23}
\]
Table 1 Spatial domain features from normal sample images (sample images which have no acrylamide).

| Feature | Sample 1 | Sample 2 | Sample 3 | Sample 4 | Sample 5 | Sample 6 |
| Mean | 30.0119 | 29.7349 | 30.10267 | 31.9666544 | 33.2624762 | 32.699705 |
| Std. deviation | 50.7040 | 50.9560 | 50.68343 | 51.4541368 | 51.5376430 | 51.686645 |
| Variance | 2570.90 | 2596.51 | 2568.811 | 2647.52820 | 2656.12864 | 2671.5093 |
| Contrast | 323,911 | 294,688 | 392800.2 | 315805.152 | 310937.078 | 329601.25 |
| Correlation | 0.00708 | 0.00053 | 0.001706 | 0.00661623 | 0.0074298 | 0.004340 |
| Energy | 1.9E06 | 2.0E06 | 1.8E06 | 1.5E06 | 1.4E06 | 2.0E06 |
| Homogeneity | 0.00856 | 0.00801 | 0.007820 | 0.00827153 | 0.00827749 | 0.0079037 |
| SRE | 0.44010 | 0.43782 | 0.446837 | 0.45142561 | 0.44553862 | 0.4408154 |
| LRE | 36.8331 | 36.9790 | 36.40242 | 36.1087607 | 36.4855282 | 36.787811 |
| GLN | 1841.58 | 1787.69 | 2117.980 | 2539.85660 | 2015.14513 | 1940.2141 |
| RLN | 4533.18 | 4391.02 | 5015.818 | 5412.11306 | 4914.43311 | 4579.3060 |
| RP | 0.68152 | 0.65931 | 0.756740 | 0.81832107 | 0.74096200 | 0.6887254 |

Short Run Emphasis (SRE), Long Run Emphasis (LRE), Gray-Level Non-uniformity (GLN), Run Length Non-uniformity (RLN), Run Percentage (RP).
Table 2 Spatial domain features from acrylamide sample images (sample images in which acrylamide is present).

| Feature | Sample 1 | Sample 2 | Sample 3 | Sample 4 | Sample 5 | Sample 6 |
| Mean | 21.359935 | 18.261401 | 18.1737582 | 16.9680194 | 17.1035867 | 15.743201 |
| Std. deviation | 34.585686 | 29.819127 | 29.9748933 | 28.1976476 | 28.2844151 | 27.157296 |
| Variance | 1196.1696 | 889.18038 | 898.494230 | 795.107331 | 800.008138 | 737.51875 |
| Contrast | 326429.47 | 331390.01 | 310049.371 | 313884.344 | 326750.163 | 291382.40 |
| Correlation | 0.003507 | 0.005963 | 0.0003544 | 0.00154340 | 0.0089304 | 0.006783 |
| Energy | 1.E06 | 1.E06 | 1.9E06 | 1.8E06 | 1.E06 | 2.8E06 |
| Homogeneity | 0.0074927 | 0.0085049 | 0.00849958 | 0.00856852 | 0.00809045 | 0.0082571 |
| SRE | 0.4434770 | 0.4463267 | 0.43816654 | 0.44858742 | 0.44252885 | 0.4169471 |
| LRE | 36.617471 | 36.435088 | 36.9573410 | 36.2904050 | 36.6781534 | 38.315384 |
| GLN | 1983.9826 | 1989.2243 | 1845.33202 | 1953.63311 | 1862.21109 | 1146.9927 |
| RLN | 4761.9965 | 4975.4531 | 4411.59976 | 5159.64215 | 4695.17555 | 3427.9846 |
| RP | 0.7172181 | 0.7504595 | 0.66253063 | 0.77910539 | 0.70680147 | 0.5078125 |

Short Run Emphasis (SRE), Long Run Emphasis (LRE), Gray-Level Non-uniformity (GLN), Run Length Non-uniformity (RLN), Run Percentage (RP).
Table 3 Performance of SVM classifier for acrylamide identification for different setting parameters. No. of training samples: 100 (normal: 50, acrylamide: 50). No. of testing samples: 50 (normal: 25, acrylamide: 25).

| Setting parameter (kernel function) for SVM classifier | Sensitivity (%) | Specificity (%) | Accuracy (%) |
| Linear kernel | 92.00 | 88.00 | 90.00 |
| rbf kernel with gamma 3 | 96.00 | 92.00 | 94.00 |
| Quadratic kernel | 92.00 | 96.00 | 94.00 |
| Polynomial kernel with order 3 | 96.00 | 92.00 | 94.00 |
\[
\text{Specificity or True Negative Rate (TNR)} = \frac{TN}{FP + TN} \tag{24}
\]

\[
\text{Accuracy} = \frac{TP + TN}{P + N} \tag{25}
\]
where P and N are the total numbers of acrylamide and normal samples, respectively, and
TP (True Positive): number of acrylamide samples classified correctly;
TN (True Negative): number of normal samples classified correctly;
FP (False Positive): number of normal samples identified as acrylamide;
FN (False Negative): number of acrylamide samples classified as normal.

The complexity and accuracy of the SVM classifier depend on the kernel function used to separate the data into different classes. The performance of the SVM classifier has been evaluated for different kernel functions, and the kernel function with the best accuracy is finally selected for classification. Table 3 presents the sensitivity, specificity and accuracy of the SVM classifier with different kernel functions. It can be seen from Table 3 that the accuracy of identification is 94% for the nonlinear kernels rbf (radial basis function) and polynomial, with 96% sensitivity and 92% specificity, which can be considered encouraging results. Of the 25 test samples containing acrylamide, all but one are correctly classified, and 23 of the 25 normal test samples are classified correctly, which indicates that the proposed method is efficient in identifying acrylamide content in potato samples. It can be seen from Table 4 that the performance of the SVM classifier improves as the number of training samples is increased. Fig. 9 shows the accuracy of classification as the number of training samples varies. The average computational time required for ROI segmentation and feature extraction is 1.5 s and 1.2 s per image, respectively. The computational time required to train the classifier
Table 4 Acrylamide identification using SVM classifier in terms of sensitivity, specificity and accuracy.

| Training samples (normal) | Training samples (acrylamide) | Testing samples (normal) | Testing samples (acrylamide) | Accurately classified testing samples (normal) | Accurately classified testing samples (acrylamide) | Sensitivity (%) | Specificity (%) | Accuracy (%) |
| 10 | 10 | 13 | 13 | 10 | 11 | 84.61 | 76.93 | 80.77 |
| 15 | 15 | 13 | 13 | 10 | 12 | 92.30 | 76.93 | 84.61 |
| 20 | 20 | 13 | 13 | 11 | 12 | 92.30 | 84.61 | 88.46 |
| 25 | 25 | 13 | 13 | 12 | 12 | 92.30 | 92.31 | 92.31 |
| 28 | 30 | 13 | 13 | 11 | 13 | 100 | 84.61 | 92.31 |
| 40 | 40 | 20 | 20 | 17 | 20 | 100 | 85.00 | 92.50 |
| 50 | 50 | 25 | 25 | 23 | 24 | 96.00 | 92.00 | 94.00 |
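The percentages in Table 4 follow directly from the counts of correctly classified test samples through Eqs. (23)–(25). The short sketch below recomputes them; the function name `classification_metrics` is hypothetical, and the acrylamide class is treated as the positive class, as in the definitions of TP and FN.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy (in %) from confusion counts."""
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    accuracy = 100.0 * (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# (test normal, test acrylamide, correct normal, correct acrylamide) per Table 4 row
rows = [
    (13, 13, 10, 11),
    (13, 13, 10, 12),
    (13, 13, 11, 12),
    (13, 13, 12, 12),
    (13, 13, 11, 13),
    (20, 20, 17, 20),
    (25, 25, 23, 24),
]

for n_norm, n_acr, ok_norm, ok_acr in rows:
    sens, spec, acc = classification_metrics(
        tp=ok_acr, fn=n_acr - ok_acr, tn=ok_norm, fp=n_norm - ok_norm
    )
    print(f"{sens:6.2f} {spec:6.2f} {acc:6.2f}")
```

For the final row (50 training samples per class), 24 of 25 acrylamide samples and 23 of 25 normal samples are correct, which gives 96% sensitivity, 92% specificity and 47/50 = 94% accuracy.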
Fig. 9. Accuracy of acrylamide identification with no. of training samples. (Plot: x-axis, Number of Training Samples, 20 to 100; y-axis, Accuracy, 80 to 95.)

Fig. 11. SVM classification with non linear kernel "Polynomial – order 3" for testing samples by considering spatial domain features (1 for normal class and 0 for acrylamide class). (Plot: x-axis, First Principal Component; y-axis, Second Principal Component; legend: 0 (training), 0 (classified), 1 (training), 1 (classified), Support Vectors.)
Table 5 compares the proposed methodology with some existing work on identifying acrylamide content in food items. It can be seen from the comparison that the proposed method is a novel approach to research in this area. While several of the existing works have not quantified and reported the accuracy of the model, the proposed method reports it. With an accuracy of 94%, the proposed method can be considered a significant contribution toward identification of acrylamide content in fried potato chips using image processing.
8. Discussion and final remarks on experimental results
Fig. 10. SVM classifier training for training samples with non linear kernel "Polynomial – order 3" by considering spatial domain features (1 for normal class and 0 for acrylamide content class). (Plot: x-axis, First Principal Component; y-axis, Second Principal Component; legend: 0, 1, Support Vectors.)
using 100 training samples is 2.8 s. Testing the 50 test samples together requires a total of 2.10 s. The image processing algorithms were implemented in MATLAB R2011b (MathWorks) on a 2.3 GHz CPU with 4 GB RAM and a 64-bit operating system. Fig. 10 shows the plot of the two main principal components for the training set using polynomial kernel based SVM classification. It can be seen from the figure that the polynomial kernel of order 3 enhances the separability of the features. Test images are then used as inputs, and the trained classifier classifies each input image. Fig. 11 shows the plot for classification of the test samples; it can be seen that, with few exceptions, the test samples are classified accurately.
i. The area of the potato chips is segmented automatically and accurately, so that only the prominent, informative features are extracted for identification of acrylamide content. Removing the background removes redundant information, which improves classifier efficiency.
ii. The statistical features extracted from the segmented gray image of the potato chips show distinct discriminatory variation in the presence of acrylamide. These discriminatory properties make the features an efficient basis for classifying acrylamide presence in potato chips.
iii. The use of PCA for feature compression and an SVM classifier for acrylamide identification gives the proposed method 94% accuracy with 96% sensitivity, which can be considered a significant result.

This work may be considered a significant novel contribution toward image processing based identification of acrylamide in potato chips, based on efficient feature extraction and classification. It may open a new dimension of research in this area.
Table 5 A comparative table of some existing related work with the proposed work.

| Reference | Objective of work | Methodology used | Observation parameters |
| Gokmen et al. (2007) | Computer vision based analysis of acrylamide concentrations of potato chips and french fries | Image processing using red, green & blue components of color image | +4 ± 14% mean difference between measured and predicted acrylamide concentrations/not reported |
| He et al. (2013) | Determination of acrylamide contents in fried potato chips using image processing | Image segmentation and color measurement: segmentation of potato chip area and acrylamide identification based on color measurement | 4.94% maximum relative error between the acrylamide contents calculated and standard chemical method |
| Pedreschi et al. (2006) | Development of a computer vision system to measure the color of potato chips | Image pre-processing (filtering, contrast adjustment), segmentation, RGB to Lab conversion | Chip color changes dE/not reported |
| Pedreschi et al. (2005) | Measure of color change and acrylamide concentration in fried potato slices | A first-order rate equation was used to model the kinetics of color change | Activation energy/not reported |
| Pedreschi et al. (2007) | Quality evaluation and control of potato chips and french fries | Image preprocessing and image segmentation, feature extraction and classification for acrylamide identification | 90% classification performance |
| Yorulmaz (2012) | Image processing methods for food inspection | Detection of fungal infection on popcorn kernel images (two methods): 1. Cepstrum based feature extraction and classification using SVM; 2. Covariance based methods | 95.5% success rate/accuracy in cookie detection |
| Sun (2011) | Hyperspectral imaging technology: a non-destructive tool for food quality and safety evaluation and inspection | Hyperspectral imaging techniques are used for color, shape, size and surface texture evaluation of food products & surface defect detection in food inspection | Not reported |
| Khan (2013) | Analysis of acrylamide in potato chips by SPE and GC–MS | Potato chips extraction using porous graphitic carbon for solid phase extraction (SPE) and acrylamide analysis using GC–MS on a polyethylene glycol phase GC column | Standard addition calibration curve/not reported |
| Proposed method | Computer vision based analysis of acrylamide concentrations in potato chips | Extraction of statistical & texture features from segmented gray image of potato chips followed by feature reduction and SVM classification | 94% accuracy in acrylamide detection in potato chips |

9. Conclusion

This paper proposes a non-destructive and efficient image processing based method to identify acrylamide content in potato chips. Statistical and texture features are extracted in the spatial domain from the segmented gray scale image of the potato chips. These features are normalized using the z-score normalization method and then compressed using Principal Component Analysis (PCA). The compressed data are classified using a supervised SVM classification technique. Experimental results indicate that the proposed method provides 94% accuracy with 96% sensitivity, which can be considered a significant contribution to image processing based acrylamide identification. Future work may explore other transform domains for feature extraction and other methods of classification to improve the accuracy and efficiency of acrylamide identification using image processing.

References

Abdi, H., Williams, L.J., 2010. Principal Component Analysis. Wiley Interdisciplinary Reviews: Computational Statistics.
Chang, C.-C., Lin, C.-J., 2001. LIBSVM: A library for support vector machines. (Online). Available: http://www.csie.ntu.edu.tw/~cjlin/libsvm (accessed: March 10, 2011).
Dunham, M.H., 2002. Data Mining: Introductory and Advanced Topics. Prentice Hall, NJ.
Galloway, M.M., 1975. Texture classification using gray level run length. Comput. Graph. Image Process. 4 (2), 172–179.
Gokmen, V., Senyuva, H.Z., Dulek, B., Cetin, A.E., 2007. Computer vision-based image analysis for the estimation of acrylamide concentrations of potato chips and french fries. Food Chem. 101, 791–798.
Gonzalez, R.C., Woods, R.E., 2001. Digital Image Processing. Prentice Hall, NJ.
He, Peng, Wan, Xiao-Qing, Zhou, Zhen, Wang, Cheng-Lin, 2013. Determination of acrylamide contents in fried potato chips based on colour measurement. Adv. Inform. Sci. Serv. Sci. (AISS) 5 (1), pp. 437-435.
Jolliffe, I.T., 2002. Principal Component Analysis. Series in Statistics. Springer Verlag.
Keerthi, S.S. et al., 2001. Improvements to Platt's SMO algorithm for SVM classifier design. Neural Comput. 13, 637–649.
Khan, Anila I., 2013. Analysis of Acrylamide in Potato Chips by SPE and GC-MS. Application Note 20734. Thermo Fisher Scientific, Runcorn.
Márquez, G., Añón, M.C., 1986. Influence of reducing sugars and amino acids in the color development of fried potatoes. J. Food Sci. 51, 157–160.
Mottram, D.S., Wedzicha, B.L., Dodson, A.T., 2002. Acrylamide is formed in the Maillard reaction. Nature 419 (6906), 448–449.
Otsu, N., 1979. A threshold selection method from gray-level histograms. IEEE Trans. Syst. Man Cybern. 9 (1), 62–66.
Pedreschi, Franco, Moyano, Pedro, Kaack, Karl, Granby, Kit, 2005. Color changes and acrylamide formation in fried potato slices. Food Res. Int. 38, 1–9.
Pedreschi, Franco, León, Jorge, Mery, Domingo, Moyano, P., 2006. Development of a computer vision system to measure the color of potato chips. Food Res. Int. J., 1092–1098.
Pedreschi, Franco, Mery, Domingo, Marique, Thierry, 2007. Quality Evaluation and Control of Potato Chips and French Fries. In: Computer Vision Technology for Food Quality Evaluation, pp. 549–570. ISBN: 978-0-12-373642-0.
Pedreschi, Franco, Granby, Kit, Risum, Jørgen, 2010. Acrylamide mitigation in potato chips by using NaCl. Food Bioprocess Technol. 3 (6), 917–921.
Rosen, J., Hellenäs, K.E., 2002. Analysis of acrylamide in cooked foods by liquid chromatography tandem mass spectrometry. Analyst 127 (7), 880–882.
Scanlon, M.G., Roller, R., Mazza, G., Pritchard, M.K., 1994. Computerized video image analysis to quantify colour of potato chips. Am. Potato J. 71, 717–733.
Smith, O., 1975. Potato chips. In: Talburt, W.F., Smith, O. (Eds.), Potato Processing. The Avi Publishing Company Inc, Westport, CT, pp. 305–402.
Sun, Da-Wen, 2011. Hyperspectral imaging technology: A non-destructive tool for food quality and safety evaluation and inspection. In: Proceedings of the 11th International Congress on Engineering and Food (ICEF 11), Paper No. MFS1281, 22–26 May 2011, Athens, Greece.
Tan, J.H., Ng, E.Y.K., Acharya, U.R., 2010. Study of normal ocular thermogram using textural parameters. Infrared Phys. Technol. 53 (2), 120–126.
Yorulmaz, Onur, 2012. Image processing methods for food inspection. MS Thesis.