Hyperspectral imaging of ribeye muscle on hanging beef carcasses for tenderness assessment


Computers and Electronics in Agriculture 116 (2015) 55–64

Contents lists available at ScienceDirect

Computers and Electronics in Agriculture journal homepage: www.elsevier.com/locate/compag

Govindarajan Konda Naganathan a, Kim Cluff a, Ashok Samal b, Chris R. Calkins c, David D. Jones a, Carol L. Lorenzen d, Jeyamkondan Subbiah a,e,*

a Biological Systems Engineering, University of Nebraska, Lincoln, NE 68583, United States
b Computer Science and Engineering, University of Nebraska, Lincoln, NE 68588, United States
c Animal Science, University of Nebraska, Lincoln, NE 68583-0908, United States
d Animal Science Research Center, University of Missouri, Columbia, MO 65211, United States
e Food Science and Technology, University of Nebraska, Lincoln, NE 68583, United States

Article info

Article history:
Received 19 August 2014
Received in revised form 7 June 2015
Accepted 10 June 2015
Available online 25 June 2015

Keywords: Beef grading; Tenderness forecasting; Fisher's linear discriminant modeling; Textural features; Principal component analysis

Abstract

A prototype hyperspectral image acquisition system (λ = 400–1000 nm) was developed to acquire images of exposed ribeye muscle on hanging beef carcasses in commercial beef packing or slaughter plants and to classify beef based on tenderness. Hyperspectral images (n = 338) of ribeye muscle on hanging beef carcasses at 2 days postmortem were acquired in two regional beef packing plants in the U.S. After image acquisition, a strip steak was cut from each carcass, vacuum packaged, aged for 14 days, and cooked, and slice shear force values were collected as a measure of tenderness. Different hyperspectral image features, namely descriptive statistical features, wavelet features, gray level co-occurrence matrix features, Gabor features, Laws' texture features, and local binary pattern features, were extracted after reducing the spectral dimension of the images using principal component analysis. The features extracted from the 2-day images were used to develop tenderness classification models for forecasting 14-day beef tenderness. Evaluation metrics such as tender certification accuracy, overall accuracy, and a custom-defined metric called the accuracy index were used to compare the tenderness classification models. Based on a third-party true validation with 174 samples, the model developed with the gray level co-occurrence matrix features outperformed the other models and achieved a tenderness certification accuracy of 87.6%, an overall accuracy of 59.2%, and an accuracy index of 62.9%. The prototype hyperspectral image acquisition system developed in this study shows promise in classifying beef based on tenderness.

© 2015 Elsevier B.V. All rights reserved.

1. Introduction

Beef tenderness is the second most important quality attribute of beef, preceded only by safety, based on consumer demand (NCBA, 2007). Recent studies reported that consumers can discern differences in tenderness while eating cooked beef and are willing to pay a premium of $1.14–$2.76 per pound for guaranteed tender beef (Gao and Schroeder, 2007; Loureiro and Umberger, 2004). The National Beef Assessment Plan III conducted by the National Cattlemen's Beef Association in 2007 concluded that certifying a beef product as tender has more value than producing a tender product (NCBA, 2007). Currently, no nondestructive technology is available for real-time beef tenderness assessment.

* Corresponding author at: 212 L.W. Chase Hall, University of Nebraska, Lincoln, NE 68583-0726, United States. Tel.: +1 402 4724944; fax: +1 402 4726338. E-mail address: [email protected] (J. Subbiah).
http://dx.doi.org/10.1016/j.compag.2015.06.006
0168-1699/© 2015 Elsevier B.V. All rights reserved.

Hence, development of such a technology is of prime importance to the beef industry to meet the ever-increasing consumer demand for guaranteed tender beef. Konda Naganathan et al. (2008a) used a spectrograph-based visible/near-infrared (VNIR) hyperspectral imaging (HSI) system (λ = 400–1000 nm) to predict tenderness of 14-day aged, cooked beef from hyperspectral images of fresh ribeye steaks acquired at 14 days postmortem. In another study, they used a spectrograph-based near-infrared (NIR) HSI system (λ = 900–1700 nm) to predict tenderness of 14-day aged, cooked beef from hyperspectral images of fresh ribeye steaks acquired at 3 days postmortem (Konda Naganathan et al., 2008b). Both studies used gray level textural co-occurrence matrix (GLCM) analysis to extract second-order statistical textural features from hyperspectral images of beef ribeye steaks. Canonical discriminant models were developed using the textural features to classify beef samples into three tenderness categories: tender, intermediate, and tough.


With a leave-one-out cross-validation procedure, the VNIR HSI system classified beef samples (n = 111) into three tenderness categories with 96.4% accuracy based on the current status of tenderness (14-day VNIR hyperspectral images predicted 14-day tenderness categories). All of the tough samples were correctly identified. The NIR HSI system classified beef samples (n = 314) into three tenderness categories with an overall accuracy of 77% in a tenderness forecasting scenario (3-day NIR hyperspectral image features were used to predict 14-day tenderness categories). When two tenderness categories were used (tender and tough, by pooling the intermediate samples into the tender category), the accuracy of the NIR HSI system increased to 94.6%. Both HSI systems mentioned above were laboratory-scale bench-top systems that required excision of a one-inch-thick ribeye steak from each carcass for image acquisition and tenderness assessment. This process takes time and degrades the value of primal cuts. To resolve these shortcomings and assess tenderness at the plant level, HSI systems should acquire images of the ribeye muscle on hanging beef carcasses (Fig. 1) and assess tenderness on-line. In addition, robust hyperspectral image analysis algorithms are needed to improve the accuracy and repeatability of tenderness assessment. Only a few image features, such as gray level co-occurrence matrix, wavelet, and Gabor features, have been used for beef and pork tenderness evaluation (Konda Naganathan et al., 2008a,b; Jackman et al., 2009a, 2010; Barbin et al., 2013). Additional feature sets need to be evaluated for their ability to discriminate tenderness.
The objectives of this study were to: (1) develop a prototype hyperspectral image acquisition system for acquiring hyperspectral images of the ribeye muscle on hanging beef carcasses in commercial beef packing plants, (2) implement and compare different textural features for their ability to forecast 14-day aged beef tenderness classes such as tender and tough, and (3) conduct a third-party true validation of the system for classifying beef carcasses based on their 14-day tenderness.

2. Materials and methods

2.1. Prototype hyperspectral image acquisition system

A prototype hyperspectral image acquisition system (Fig. 2) was developed for acquiring reflectance images of the ribeye muscle (longissimus thoracis et lumborum) on hanging beef carcasses in beef packing plants. The system consisted of an adjustable-height mobile console (Fig. 2a), a camera module (Fig. 2b), and a computer (2.83 GHz Intel quad-core processor and 8 GB of RAM).

Fig. 1. A hanging beef carcass showing the exposed ribeye muscle between the 12th and 13th rib and the available vertical space between the forequarter and hindquarter for placing an instrument to acquire an image of the ribeye muscle.

2.1.1. Camera module

The camera module (Fig. 2b) included a line-scan camera (Model: MV1-D1312-160-CL-12, Photonfocus, Lachen, Switzerland), a spectrograph (V10E, Specim Spectral Imaging Ltd., Oulu, Finland), a lens (XENOPLAN 1.4/17-0903, Schneider Optics, Kreuznach, Germany), and a mirror scanner assembly. The spectrograph provided spectral sensitivity in the visible-near-infrared (VNIR) region from 400 to 1000 nm with a spectral resolution of 2.8 nm, and had a 30 μm slit and a numerical aperture of F/2.4. The Photonfocus camera provided a resolution of 1312 × 1082 pixels with 12-bit grayscale. The camera had a 2-D CMOS image sensor array with a quantum efficiency greater than 50%. The lens had a focal length of 17 mm and a maximum aperture of 1.4. The system scans a single spatial line of a target object, and the spectrograph disperses light from each line element or pixel into a spectrum. Thus, each spectral image contained spatial pixels in one axis (1312 pixels) and spectral pixels in the other axis (1082 pixels). To obtain a three-dimensional (3-D) hyperspectral data cube, a mirror scanner was used to scan the ribeye muscle area.

2.1.2. Mirror scanner assembly

The mirror scanner assembly (Fig. 2b) was a part of the camera module and included a mirror, a white-painted dome, and locating plates. The mirror was positioned at a 45° angle so that the reflected signal from the beef ribeye would pass at a 90° angle onto the spectrograph. This design was employed so that the system would not be obstructed by the hindquarter of the carcass, which is directly above the exposed ribeye. The mirror was attached to a stepper motor to scan the ribeye muscle. A lighting dome was fabricated and attached to the mirror scanner assembly. Six 50-W tungsten halogen lamps were placed at the bottom of the dome, facing upwards. The dome was painted white (Munsell white reflectance coating, Edmund Optics, Barrington, NJ) to provide uniform diffuse lighting on the steak surface. The purpose of the locating plates (Fig. 2c) was to firmly position the camera module on the exposed portion of the carcass during image acquisition. This is critical because movement of the camera during image acquisition will blur and distort the image. The height of the dome was designed to provide a preset working distance between the ribeye muscle and the lens, and supplied a 13 cm × 18 cm field of view that covered the entire ribeye muscle area.

(a) Full view of the system. (b) Zoomed-in view of the camera module. (c) Schematic showing the use of the locating plates to align the camera module with a beef carcass.

Fig. 2. Schematic of the prototype hyperspectral image acquisition system (HSI). Parts: (1) camera module, (2) uninterrupted power supply, (3) computer, (4) monitor, (5) height-adjustable mobile console, (6) vertical tower, (7) cantilever, (8) tool balancer, (9) tension-adjustable retractable cable, (10) camera, (11) spectrograph, (12) lens, (13) mirror scanner assembly, (14) lighting dome, (15) locating plates, and (16) handle.

2.1.3. Mobile console

The computer and an uninterrupted power supply (UPS) unit were placed in the console (Fig. 2a), which was equipped with a vertical tower, a cantilever, and a tool balancer. The tool balancer had a retractable, tension-adjustable cable and made the camera module easy to handle. The cantilever with the tool balancer placed the camera module at approximately the same height as the exposed ribeye on a hanging carcass. In addition, the console was equipped with gas springs to raise it to a comfortable height for image acquisition and to lower it for compact transportation between the packing plants in a van.

2.2. Samples

Data were collected in two beef packing or slaughter plants over a period of three days. A total of 338 beef carcasses were imaged at 36–48 h postmortem in a cooler at a temperature of 1 °C and high relative humidity.
Beef carcasses representing "A" maturity and USDA quality grades of Prime, Choice, Select, and Standard were considered. Carcass characteristics such as maturity, marbling, ribeye area, and USDA quality and yield grades were assigned by a trained USDA AMS grader. The carcass population included steers and heifers. In order to allow sufficient time for blooming, the carcasses were imaged at least 30 min after ribbing. Both plants followed similar operating practices, such as the carcass hanging method and low-voltage electrical stimulation. The selected carcasses were railed off for image acquisition.

Fig. 3. Acquiring an image of a beef ribeye muscle using the prototype hyperspectral imaging system (HSI).
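As a minimal sketch of the band selection and spectral binning described in Section 2.3 (the calibration constants and channel range are from the paper; the function and variable names are illustrative), the channel-to-wavelength mapping can be written as:

```python
import numpy as np

def band_to_wavelength(bn):
    """Manufacturer's spectral calibration (Eq. (1)): channel number -> wavelength (nm)."""
    bn = np.asarray(bn, dtype=float)
    return 0.00005 * bn ** 2 + 0.7087 * bn + 302.22

# Channels 139-922 correspond to the usable 400-1000 nm range.
channels = np.arange(139, 923)              # 784 channels
wavelengths = band_to_wavelength(channels)  # roughly 401.7 nm ... 998.1 nm

# Binning the 784 channels by a factor of 4 yields the 196 spectral
# bands used in all subsequent processing.
binned = wavelengths.reshape(-1, 4).mean(axis=1)
assert binned.shape == (196,)
```

Applying the quadratic to channels 139 and 922 recovers approximately 402 nm and 998 nm, consistent with the stated 400–1000 nm usable range.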

2.3. Hyperspectral image acquisition

Fig. 3 shows how the HSI system was used to acquire the ribeye images on a hanging beef carcass at the 12th and 13th rib interface. An integration time of 32 ms, a frame rate of 30 frames per second, and a scan speed of 17.55 mm/s were used to achieve a square pixel. The stepper motor was programmed to have a step size of 0.52 mm and cover 18 cm (the length of the field of view). The step size and scan speed were fixed so that there were no overlapping frames during image acquisition. The ribeye muscle was scanned along its length. The camera had 1082 pixels along the spectral axis, which covered a spectral range from 320 to 1030 nm. However, the spectrograph was sensitive only in the range of 400–1000 nm, so acquiring beyond this range would not provide any useful data. Using the spectral calibration equation (Eq. (1)) provided by the spectrograph manufacturer, it was determined that the channels (or band numbers) from 139 to 922 correspond to the 400–1000 nm range. This created 784 spectral bands with a band interval of 0.8 nm, which were binned by a factor of 4 to match the spectral resolution of the spectrograph. This operation provided 196 spectral bands. The spectral calibration is given by:

λ = 0.00005 × (BN)² + 0.7087 × (BN) + 302.22    (1)

where λ is the wavelength (nm) and BN represents the band or channel number.

The camera module was placed over the exposed ribeye muscle and moved laterally so that the locating plates came into firm contact with the hanging carcass and the field of view covered the entire ribeye. This provided a steady setup that avoided shaking of the camera module during image acquisition. The camera was then triggered to acquire and save a hyperspectral image in 12 s. In addition to the beef images, dark and white reference images were acquired at approximately 45–60 min intervals to calibrate the beef images. The white images were acquired using a 99% Spectralon plate (Labsphere, North Sutton, NH), whereas the dark images were acquired by capping the camera lens. As the Spectralon plate (13 cm × 13 cm) did not cover the entire field of view (13 cm × 18 cm), two white reference images, one covering the bottom half of the field of view and the other covering the top half, were acquired and merged by averaging the overlapping regions to create a white reference image of size 13 cm × 18 cm.

2.4. Reference tenderness scores

An independent third-party lab measured slice shear force (SSF) values following the procedures described by Shackelford et al. (1999). After image acquisition, strip loin steaks were cut from the carcasses beginning at the 13th rib interface and moving posteriorly, vacuum packaged, and aged for 14 days postmortem under refrigerated conditions. The samples were never frozen because freezing and thawing may affect tenderness. Immediately after the aging period, the samples were cooked in an XLT oven (Model No. 1832-EL, BOFI, Inc., Wichita, KS). A 2.54 cm thick slice was then taken from the distal end of the steak parallel to the muscle fibers and sheared perpendicular to the muscle fibers using a universal testing machine (Model No. SSTM 500, United Calibration Corporation, Huntington Beach, CA). Samples with SSF values greater than 245.2 N were considered tough (Shackelford et al., 2005). The third-party lab sorted the samples by SSF value within each day and allotted every other sample to either a training or a validation dataset to ensure that the tenderness distribution within each dataset was similar. It released only the SSF


values pertaining to the training dataset for developing tenderness calibration equations. The SSF values pertaining to the validation dataset were sequestered and kept with the third-party lab for conducting a true validation of the developed prototype HSI system. The authors provided the classification results of the validation samples to the third-party lab to determine the accuracy measures.

2.5. Hyperspectral image processing

The hyperspectral image acquisition was done on-line, whereas the hyperspectral image analysis was conducted off-line. Image calibration and PCA were implemented in ENVI (ITTVIS, Boulder, CO). Textural feature extraction and tenderness classification model development were implemented in MATLAB (The MathWorks Inc., Natick, MA). The feature selection algorithm was implemented in SAS (SAS Institute Inc., Cary, NC).

2.5.1. Image calibration and region-of-interest (ROI) selection

Reflectance images were obtained by subtracting the dark image from the raw beef image and dividing by the dark-subtracted white image. This reflectance calibration procedure minimized variations due to illumination, sensor response, and environmental conditions in the hyperspectral beef images. After image calibration, a region of interest (ROI) of 64 × 128 pixels (33.3 × 66.6 mm) was selected manually within the ribeye muscle. All subsequent image processing steps were performed on these ROI hyperspectral images.

2.5.2. Mosaic principal component analysis

Hyperspectral images were acquired at very narrow wavelength intervals (2.8 nm) and had redundant or correlated information in adjacent bands or wavelength images. Principal component analysis (PCA) is the most common dimensionality reduction method to reduce the redundant (correlated) information in the hyperspectral beef image.
In this study, the hyperspectral ROI images within the training set (n = 164) were mosaicked together (Subbiah et al., 2014), and the mosaicked image was subjected to the PCA procedure. The size of each ROI in the hyperspectral image was 64 × 128 × 196 (number of pixels in the horizontal dimension × number of pixels in the vertical dimension × number of spectral bands). During the mosaic PCA implementation, the PCA was performed on a data set comprising spectra from every pixel of the 164 images (1,343,488 spectra of size 1 × 196). The output of this analysis included eigenvalues and loading vectors. Eigenvalues provide information about the percent variation explained by each principal component (PC). Because the first three PCs explained over 99% of the variation of the original image, the first three loading vectors were used to reconstruct three PC bands (Fig. 4) for each ROI hyperspectral image. Each loading vector was multiplied with the mean-centered spectral profile of each pixel in the ROI hyperspectral image to determine the corresponding pixel values (PC scores) of the PC band. The textural features were extracted from the PC bands.
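The preprocessing chain of Sections 2.5.1 and 2.5.2 — reflectance calibration followed by mosaic PCA — can be sketched as follows. This is an illustrative NumPy sketch, not the authors' ENVI/MATLAB implementation; all function names are assumptions:

```python
import numpy as np

def calibrate_reflectance(raw, dark, white, eps=1e-6):
    """Section 2.5.1: subtract the dark image and divide by the
    dark-subtracted white reference, band by band."""
    denom = white.astype(np.float64) - dark.astype(np.float64)
    return (raw.astype(np.float64) - dark) / np.maximum(denom, eps)

def mosaic_pca_loadings(roi_cubes, n_components=3):
    """Section 2.5.2: pool (mosaic) every pixel spectrum from the training
    ROIs and compute PCA loadings from the band covariance matrix."""
    spectra = np.vstack([c.reshape(-1, c.shape[-1]) for c in roi_cubes])
    mean = spectra.mean(axis=0)
    centered = spectra - mean
    cov = centered.T @ centered / (centered.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)       # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, order]               # loadings: (bands, n_components)

def pc_bands(cube, mean, loadings):
    """Project one ROI cube onto the loadings to obtain PC-score bands."""
    flat = cube.reshape(-1, cube.shape[-1]) - mean
    return (flat @ loadings).reshape(cube.shape[0], cube.shape[1], -1)
```

For the 164 training ROIs of 64 × 128 pixels and 196 bands, the pooled `spectra` array would hold the 1,343,488 spectra of length 196 described above.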

2.5.3. Feature extraction

Six different hyperspectral image feature sets were evaluated for beef tenderness assessment: (1) descriptive statistical features (DSF), (2) wavelet features (WF) (Subbiah, 2004), (3) gray level co-occurrence matrix features (GLCMF) (Konda Naganathan et al., 2008a,b; Li et al., 1999), (4) Gabor features (GF) (Manjunath and Ma, 1996), (5) Laws' texture features (LF) (Laws, 1980), and (6) local binary pattern features (LBPF) (Ojala et al., 1996). An additional analysis scenario, pooled features (PF), was created by pooling all the features. The descriptive and local binary pattern features were simple statistical features such as mean, standard deviation, second moment, entropy, skewness, and kurtosis. The descriptive features were computed from the PC bands, whereas the local binary pattern features were computed from a set of binary images derived from the PC bands by simple thresholding. The GLCM features included contrast, correlation, entropy, and homogeneity. Three wavelet energy features, two from the Level-1 and Level-2 detail images and one from the Level-2 approximation image, were computed for each PC band. Similarly, four Gabor energy features representing four different scales were calculated. A total of 14 Laws' energy features were calculated from the Laws images, which were obtained by convolving the Laws kernels with the PC bands. A detailed description of the algorithms and equations used to extract these features is given elsewhere (Konda Naganathan, 2011). The features were extracted from the first three PC bands and used to assess beef tenderness.

Fig. 4. First three principal component (PC) bands.
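As an illustration of the GLCM features named above (contrast, correlation, entropy, and homogeneity), a minimal NumPy sketch for a single pixel offset follows. The quantisation level, offset, and log base are illustrative choices, not the paper's exact settings:

```python
import numpy as np

def glcm_features(img, levels=16, dx=1, dy=0):
    """Gray level co-occurrence matrix features for one offset (dx, dy).
    `img` is a 2-D array (e.g. one PC band) quantised to `levels` gray levels."""
    norm = (img - img.min()) / (np.ptp(img) + 1e-12)
    q = np.minimum((norm * levels).astype(int), levels - 1)
    glcm = np.zeros((levels, levels))
    rows, cols = q.shape
    for r in range(rows - dy):
        for c in range(cols - dx):
            glcm[q[r, c], q[r + dy, c + dx]] += 1
    p = glcm / glcm.sum()                       # co-occurrence probabilities
    i, j = np.indices(p.shape)
    mu_i, mu_j = (i * p).sum(), (j * p).sum()
    sd_i = np.sqrt(((i - mu_i) ** 2 * p).sum())
    sd_j = np.sqrt(((j - mu_j) ** 2 * p).sum())
    contrast = ((i - j) ** 2 * p).sum()
    correlation = ((i - mu_i) * (j - mu_j) * p).sum() / (sd_i * sd_j + 1e-12)
    entropy = -(p[p > 0] * np.log2(p[p > 0])).sum()
    homogeneity = (p / (1 + np.abs(i - j))).sum()
    return contrast, correlation, entropy, homogeneity
```

In practice, libraries such as scikit-image provide equivalent GLCM routines; the loop above is written out only to make the second-order statistics explicit.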

2.6. Tenderness classification models and evaluation metrics

Various hyperspectral image feature sets were used to build Fisher's linear discriminant models to classify beef samples into two tenderness categories: tender or tough. Prior to developing the tenderness classification models, a stepwise feature selection algorithm (STEPDISC procedure in SAS) was used to identify the image features with the greatest tenderness discrimination power. A p-value of 0.15 was used to test the significance of a feature during both forward addition and backward elimination. Overall accuracy (Eq. (2)) is a traditional metric commonly used to evaluate classification models. Even though the overall accuracy is useful in many classification problems, it may not adequately explain the performance or usefulness of a beef tenderness classification model on imbalanced datasets such as the one collected in this study (more than 80% of the samples were tender). Therefore, other accuracy measures were considered to evaluate the models. Four additional accuracies, namely tender identification accuracy or tender sensitivity (Eq. (3)), tough identification accuracy or tough sensitivity (Eq. (4)), tender certification accuracy or tender specificity (Eq. (5)), and tough certification accuracy or tough specificity (Eq. (6)), were computed. In addition, a new metric called accuracy index (Eq. (7)) was defined by weighting the four accuracy measures (Eqs. (3)–(6)). In this paper, the accuracy index values were calculated using Eq. (8), which was obtained by substituting a = 1, b = 2, c = 2, and d = 1 in Eq. (7). The tender certification accuracy is the most useful measure for the beef industry in making tenderness marketing claims. Similarly, the tough identification accuracy is equally important because any misclassification of tough samples as tender affects the tenderness marketing claim. Therefore, these two accuracy measures were weighted twice as heavily as the other two. For comparing the tenderness classification models, the tender certification accuracy, overall accuracy, and accuracy index values were used.

Overall accuracy = (A + D) / (A + B + C + D) × 100    (2)

Tender identification accuracy or tender sensitivity = A / (A + B) × 100    (3)

Tough identification accuracy or tough sensitivity = D / (C + D) × 100    (4)

Tender certification accuracy or tender specificity = A / (A + C) × 100    (5)

Tough certification accuracy or tough specificity = D / (B + D) × 100    (6)

Accuracy index = [1 / (a + b + c + d)] × [a × A / (A + B) + b × D / (C + D) + c × A / (A + C) + d × D / (B + D)] × 100    (7)

Accuracy index = (1/6) × [A / (A + B) + 2 × D / (C + D) + 2 × A / (A + C) + D / (B + D)] × 100    (8)

where
A = number of true tender samples correctly predicted as tender.
B = number of true tender samples incorrectly predicted as tough.
C = number of true tough samples incorrectly predicted as tender.
D = number of true tough samples correctly predicted as tough.
A + B = total number of true tender samples; C + D = total number of true tough samples; A + C = total number of predicted tender samples; B + D = total number of predicted tough samples; A + D = total number of correctly predicted samples; A + B + C + D = total number of samples.

3. Results and discussion

3.1. Hyperspectral image acquisition

Hyperspectral image acquisition on hanging beef carcasses was successfully demonstrated in two different packing plants. On average, the system took about 12 s to acquire a hyperspectral image and 97 s to analyze an image and assign a tenderness score using a quad-core computer with 8 GB of memory. The image analysis time included the various analysis steps such as image calibration, ROI selection, feature extraction, and prediction of the tenderness category. The system did not require any additional carcass handling or ribbing procedures to acquire hyperspectral images of the beef ribeye muscle. Therefore, it can be implemented in beef packing plants for objective tenderness evaluation of beef carcasses without affecting the line speed. Assessment of tenderness at beef packing plants is efficient because there is only a small number of packing plants compared to the number of feedlots and retail stores. The cost of tenderness evaluation can also be reduced because there is no need to excise samples from each carcass, as with bench-top HSI systems. In addition, hyperspectral image acquisition on hanging beef carcasses is a technological advancement because of the sophisticated imaging hardware, large data size, and rapid image acquisition.

3.2. Samples

The USDA Choice and Select samples represented 45.7% (75 of 164) and 48.8% (80 of 164), respectively, of the training dataset. The percentages of tough samples in the USDA Choice and Select categories were 10.7% (8 of 75) and 26.3% (21 of 80), respectively. George et al. (1999) reported that the odds of obtaining a slightly tough or tougher rating for supermarket beef were 20% and 25% for USDA Choice and Select grade strip steaks, respectively. So, the odds of obtaining tough beef remained almost the same in the USDA Select grade and were reduced by half in the USDA Choice grade. The USDA Select grade samples had higher SSF values compared to those of the USDA Choice grade. Fig. 5 presents the SSF distribution of samples in the training set, where 81.1% (133 of 164) of the samples were in the tender category and the remaining 18.9% (31 of 164) were in the tough category based on the measured SSF values. Similarly, in the validation set, 82.8% (144 of 174) and 17.2% (30 of 174) of the samples were in the tender and tough categories, respectively.


Fig. 5. Distribution of the slice shear force (SSF) values of the ribeye steaks in the training dataset.

3.3. Assessment of the tenderness classification models

Table 1 and Figs. 6–8 summarize the tenderness classification results obtained with the various tenderness classification models in cross-validation and third-party true validation. In Table 1, column 2 gives the number of features used to build each tenderness classification model, and columns 3, 4, 5, and 6 give the prediction results, which can be substituted for A, B, C, and D, respectively, in Eqs. (2)–(8) to compute the different evaluation metrics. When comparing the models, the results obtained with the third-party validation were given more importance because they are a more stringent and accurate representation of the efficacy of the models than the cross-validation results. In addition, two further criteria, namely model compactness and robustness, were considered. Model compactness is based on the number of features used in the classification equation, whereas robustness is based on the performance of the model in cross-validation and true validation. A model with fewer features and similar performance in cross-validation and true validation is preferred.

Considering a robustness threshold of 5% (difference between the cross-validation and third-party true validation metrics), the WF, GLCMF, GF, and LF models had less than 5% difference in tender certification accuracy between the validation procedures. With respect to the overall accuracy, the GLCMF, GF, and LF models had less than 5% difference. Only the GLCMF and GF models passed the 5% threshold criterion on the accuracy index values. In other words, only two of the seven models, namely the GLCMF and GF models, performed very similarly in both the cross-validation and true validation procedures. Of these two models, the GLCMF model had a tender certification accuracy of 87.6%, an overall accuracy of 59.2%, and an accuracy index of 62.9% in the third-party true validation. The corresponding values for the GF model were 84.5%, 58%, and 57.1%, respectively. The GLCMF model used three features, whereas the GF model used four features. Considering all these measures, the GLCMF model was the best.

The performances of the models developed with image textural features (the WF, GLCMF, GF, and LF models) were better than that of the model developed with image descriptive features (the DSF model). This indicates that image textural features are better indicators of beef tenderness. These findings agree with results reported in the literature. In a study relating near-infrared hyperspectral image features to pork tenderness, the coefficient of determination (R²) increased from 0.63 to 0.75 with the addition of wavelet features (Barbin et al., 2013). Similarly, Liu et al. (2010) achieved overall classification accuracies of 84% and 72% with and without Gabor features, respectively, in classifying pork based on quality using hyperspectral images.

Table 1
Comparison of the beef tenderness prediction models developed using different hyperspectral image feature sets. Columns A–D follow the definitions under Eq. (8): A = true tender predicted as tender, B = true tender predicted as tough, C = true tough predicted as tender, D = true tough predicted as tough.

Leave-one-out cross-validation (total true tender samples = 133; total true tough samples = 31):

Model   No. of features   A     B     C     D
DSF     3                 91    42    10    21
WF      5                 92    41    11    20
GLCMF   3                 82    51    11    20
GF      4                 79    54    13    18
LF      3                 82    51    11    20
LBPF    6                 94    39    10    21
PF      8                 100   33    10    21

Third-party true validation (total true tender samples = 144; total true tough samples = 30):

Model   No. of features   A     B     C     D
DSF     3                 90    54    18    12
WF      5                 84    60    14    16
GLCMF   3                 85    59    12    18
GF      4                 87    57    16    14
LF      3                 89    55    16    14
LBPF    6                 83    61    17    13
PF      8                 94    50    17    13

DSF – descriptive statistical features; WF – wavelet features; GLCMF – gray level co-occurrence matrix features; GF – Gabor features; LF – Laws features; LBPF – local binary pattern features; PF – pooled features.

Fig. 6. Tender certification accuracy or tender specificity values of the tenderness prediction models developed using different hyperspectral image feature sets. DSF – descriptive statistical features; WF – wavelet features; GLCMF – gray level co-occurrence matrix features; GF – Gabor features; LF – Laws features; LBPF – local binary pattern features; PF – pooled features.

Fig. 7. Overall accuracy values of the tenderness prediction models developed using different hyperspectral image feature sets. DSF – descriptive statistical features; WF – wavelet features; GLCMF – gray level co-occurrence matrix features; GF – Gabor features; LF – Laws features; LBPF – local binary pattern features; PF – pooled features.
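The evaluation metrics of Eqs. (2)–(8) reduce to simple arithmetic on the confusion counts A–D. Using the GLCMF third-party validation counts reported in Table 1 (A = 85, B = 59, C = 12, D = 18) reproduces the 59.2% overall accuracy, 87.6% tender certification accuracy, and 62.9% accuracy index quoted above; the function name is illustrative:

```python
def tenderness_metrics(A, B, C, D, a=1, b=2, c=2, d=1):
    """A: tender->tender, B: tender->tough, C: tough->tender, D: tough->tough."""
    overall = (A + D) / (A + B + C + D) * 100          # Eq. (2)
    tender_sens = A / (A + B) * 100                    # Eq. (3)
    tough_sens = D / (C + D) * 100                     # Eq. (4)
    tender_spec = A / (A + C) * 100                    # Eq. (5)
    tough_spec = D / (B + D) * 100                     # Eq. (6)
    # Eq. (7); with a=1, b=2, c=2, d=1 this is Eq. (8).
    accuracy_index = (a * tender_sens + b * tough_sens
                      + c * tender_spec + d * tough_spec) / (a + b + c + d)
    return overall, tender_spec, accuracy_index

# GLCMF model, third-party true validation (Table 1).
overall, tender_cert, ai = tenderness_metrics(85, 59, 12, 18)
print(round(overall, 1), round(tender_cert, 1), round(ai, 1))  # 59.2 87.6 62.9
```

The twofold weights on tough sensitivity and tender specificity are what make the accuracy index penalize tough-as-tender misclassifications more heavily than the overall accuracy does.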

Of the three evaluation metrics, the accuracy index reflected the performance of the models more faithfully than the tender certification accuracy and the overall accuracy. Even though the overall accuracies of the GLCMF and LF models were the same, the GLCMF model had a higher accuracy index because it correctly identified more tough samples. Identification of tough samples is important in making tenderness marketing claims. As shown in Eq. (8), the accuracy index gives a twofold weighting to the tough identification accuracy and ranked the models appropriately. Therefore, the accuracy index is a better


Fig. 8. Accuracy index values of the tenderness prediction models developed using different hyperspectral image feature sets. DSF – descriptive statistical features; WF – wavelet features; GLCMF – gray level co-occurrence matrix features; GF – Gabor features; LF – Laws features; LBPF – local binary pattern features; PF – pooled features.

metric to evaluate the performance of a tenderness classification model than the overall accuracy.

3.4. Comparison of the prototype hyperspectral imaging system with the BeefCam and NIR spectroscopy

In 2009, a study funded by the National Cattlemen’s Beef Association evaluated three tenderness classification instruments head-to-head: the BeefCam, NIRS, and an acousto-optic tunable filter (AOTF)-based prototype on-line HSI system (Lorenzen et al., 2009). For comparison purposes, the accuracy index values of the BeefCam, the NIRS, and the AOTF HSI system were computed using Eq. (8). The BeefCam, NIRS, and prototype on-line AOTF HSI systems yielded accuracy index values of 53.9%, 58.3%, and 70%, respectively. The prototype HSI system presented in this study had an accuracy index of 62.9%; it therefore performed better than the BeefCam and NIRS systems. The AOTF and spectrograph HSI systems covered the same wavelength range (400–1000 nm); however, the AOTF system had a higher spatial resolution (0.16 mm) than the spectrograph system (0.52 mm). The slightly poorer performance of the spectrograph-based system compared to the AOTF system might be due to this lower spatial resolution. These results are in agreement with recent findings that high-resolution images can yield better beef tenderness discrimination than low-resolution images (Goldberg, 2011; Howard et al., 2010; Jackman et al., 2009b; Subbiah, 2004).

3.5. Advantages and limitations of the system

Major advantages of the system, in addition to hyperspectral image acquisition on hanging beef carcasses, include tenderness forecasting and the ability to predict USDA quality and yield grades.
Tenderness forecasting, i.e., the use of 2-day images to predict the 14-day tenderness class, is important to the beef industry for managing product flow: it enables proper labelling of products at the packing plant itself, supports tenderness marketing claims, and reduces the cost and time associated with aging beef to ensure tenderness. As this system provides high-quality images,

it is possible to determine the ribeye area, fat thickness, marbling, and USDA quality and yield grades, in addition to tenderness. This leads to an integrated system that can predict tenderness as well as USDA quality and yield grades. The hyperspectral image analysis routines need to be integrated with the image acquisition system to enable real-time tenderness sorting. It is also important to increase the speed of image acquisition and analysis. One way to achieve faster speeds is to identify, acquire, and analyze only key wavelength images, as opposed to full hyperspectral images, which are large because of the narrow wavelength interval. With sophisticated hardware and software, key wavelength images can be parsed from the camera and analyzed quickly.

4. Conclusions

The prototype hyperspectral image acquisition system developed in this study successfully demonstrated the feasibility of acquiring hyperspectral images of the ribeye muscle on hanging beef carcasses. The system was used to acquire images in two commercial beef packing plants. On average, the system took about 12 s for image acquisition. Results from a third-party true validation indicated that hyperspectral images acquired with this prototype system at 2-day postmortem can forecast 14-day beef tenderness with a tender certification accuracy of 87.6%. The model developed with the gray level co-occurrence matrix features (GLCMF) outperformed the other models. With only three features, the GLCMF model was also the most compact. It was robust as well, performing very similarly in both the cross-validation and true-validation procedures on all evaluation metrics: tender certification accuracy, overall accuracy, and the custom-defined accuracy index.

Acknowledgements

The authors thank the National Cattlemen’s Beef Association and the Nebraska Agricultural Experiment Station for funding this study.
The authors are also thankful to the beef industry for providing access to the plant


and the third-party lab for the slice shear force measurements of the beef samples.

References

Barbin, D.F., Valous, N.A., Sun, D., 2013. Tenderness prediction in porcine longissimus dorsi muscles using instrumental measurements along with NIR hyperspectral and computer vision imagery. Innovative Food Sci. Emerg. Technol. 20, 335–342.
Gao, Z., Schroeder, T.C., 2007. Effects of additional quality attributes on consumer willingness-to-pay for food labels. Presented at AAEA, WAEA, and CAES Joint Annual Meeting, Portland, Oregon.
George, M.H., Tatum, J.D., Belk, K.E., Smith, G.C., 1999. An audit of retail beef loin steak tenderness conducted in eight U.S. cities. J. Anim. Sci. 77, 1735–1741.
Goldberg, D., 2011. Imaging method for determining meat tenderness. U.S. Patent Number 0007151 A1.
Howard, S., Goldberg, D., Tatum, J., Woerner, D., Smith, G., Belk, K., 2010. High-resolution imaging and beef tenderness. National Cattlemen’s Beef Association Project Summary.
Jackman, P., Sun, D., Allen, P., 2009a. Comparison of various wavelet texture features to predict beef palatability. Meat Sci. 83 (1), 82–87.
Jackman, P., Sun, D., Allen, P., 2009b. Comparison of the predictive power of beef surface wavelet texture features at high and low magnification. Meat Sci. 82, 353–356.
Jackman, P., Sun, D., Allen, P., 2010. Prediction of beef palatability from colour, marbling and surface texture features of longissimus dorsi. J. Food Eng. 96 (1), 151–165.
Konda Naganathan, G., 2011. Development and evaluation of spectral imaging systems and algorithms for beef tenderness grading. Dissertation, University of Nebraska-Lincoln.
Konda Naganathan, G., Grimes, L.M., Subbiah, J., Calkins, C.R., Samal, A., Meyer, G.E., 2008a. Visible/near-infrared hyperspectral imaging for beef tenderness prediction. Comput. Electron. Agric. 64 (2), 225–233.
Konda Naganathan, G., Grimes, L.M., Subbiah, J., Calkins, C.R., Samal, A., Meyer, G.E., 2008b. Partial least squares analysis of near-infrared hyperspectral images for beef tenderness prediction. Sens. Instrum. Food Qual. Saf. 2 (3), 178–188.
Laws, K., 1980. Textured image segmentation. Dissertation, University of Southern California.
Li, J., Tan, J., Martz, F.A., Heymann, H., 1999. Image texture features as indicators of beef tenderness. Meat Sci. 53 (1), 17–22.
Liu, L., Ngadi, M.O., Prasher, S.O., Gariepy, C., 2010. Categorization of pork quality using Gabor filter-based hyperspectral imaging technology. J. Food Eng. 99, 284–293.
Lorenzen, C., Belk, K., Calkins, C., Miller, R., Morgan, B., O’Connor, M., et al., 2009. Validation of tenderness prediction instruments. National Cattlemen’s Beef Association Project Summary.
Loureiro, M., Umberger, W., 2004. A choice experiment model for beef attributes: what consumer preferences tell us. Presented at American Agricultural Economics Association Annual Meeting, Denver, Colorado.
Manjunath, B.S., Ma, W.Y., 1996. Texture features for browsing and retrieval of image data. IEEE Trans. Pattern Anal. Mach. Intell. 18 (8), 837–842.
NCBA, 2007. National beef instrument assessment plan (NBIAP) III meeting: the next five years. National Cattlemen’s Beef Association, Englewood, CO.
Ojala, T., Pietikäinen, M., Harwood, D., 1996. A comparative study of texture measures with classification based on feature distributions. Pattern Recogn. 29, 51–59.
Shackelford, S.D., Wheeler, T.L., Koohmaraie, M., 1999. Tenderness classification of beef: II. Design and analysis of a system to measure beef longissimus shear force under commercial processing conditions. J. Anim. Sci. 77 (6), 1474–1481.
Shackelford, S.D., Wheeler, T.L., Koohmaraie, M., 2005. On-line classification of US Select beef carcasses for longissimus tenderness using visible and near-infrared reflectance spectroscopy. Meat Sci. 69 (3), 409–415.
Subbiah, J., 2004. Nondestructive evaluation of beef palatability. Dissertation, Oklahoma State University.
Subbiah, J., Calkins, C.R., Samal, A., Konda Naganathan, G., 2014. System and method for analyzing properties of meat using multispectral imaging. U.S. Patent Number 8774469 B2.