Deep learning for noninvasive classification of clustered horticultural crops – A case for banana fruit tiers



Postharvest Biology and Technology 156 (2019) 110922

Contents lists available at ScienceDirect

Postharvest Biology and Technology journal homepage: www.elsevier.com/locate/postharvbio

Tuan-Tang Le a, Chyi-Yeu Lin a,b,c, Eduardo Jr Piedad d,⁎

a Department of Mechanical Engineering, National Taiwan University of Science and Technology, Taipei 106, Taiwan
b Taiwan Building Technology Center, National Taiwan University of Science and Technology, Taipei 106, Taiwan
c Center for Cyber-Physical System Innovation, National Taiwan University of Science and Technology, Taipei 106, Taiwan
d Department of Electrical Engineering, University of San Jose-Recoletos, Cebu City 6000, Philippines

ARTICLE INFO

Keywords: Banana; Deep learning; Fruit classification; Horticultural crop

ABSTRACT

In practice, some horticultural crops, such as banana tiers, lanzones and grapes, are classified as clusters rather than as individual fruit. Unlike the subjects of most classification studies, clustered crops are rarely studied because of their complex physical structure. A noninvasive deep learning classifier of clustered bananas that takes only a single image as its input feature has been developed as a pioneering deep learning study for clustered horticultural crops. Among recent deep learning developments, mask region-based convolutional neural networks, also known as Mask R-CNN, are distinctive in image recognition because they detect objects within an image while simultaneously generating segmentation masks. With Mask R-CNN, detecting the complex banana fruit within an image predicts the banana class and at the same time generates a mask separating the fruit from its background. A real dataset of banana tiers is used, and the developed model discriminates normal from abnormal tiers. The previous general machine learning study discriminated the reject class from the normal class with a classification accuracy of 79%, whereas our deep learning model obtained a better average accuracy of 92.5%. The previous average weighted accuracy of 94.2% also improved to 96.1% with only a single image feature instead of tedious multiple image and size features. With data augmentation, the model improved slightly, to 93.8% accuracy on the reject class and 96.5% overall accuracy. Having been successfully implemented for banana tiers, this deep learning classification can also serve as a basis for other clustered horticultural crops.

⁎ Corresponding author. E-mail address: [email protected] (E.J. Piedad).

https://doi.org/10.1016/j.postharvbio.2019.05.023
Received 2 February 2019; Received in revised form 23 May 2019; Accepted 24 May 2019; Available online 10 June 2019
0925-5214/ © 2019 Elsevier B.V. All rights reserved.

1. Introduction

Banana is considered the most important traded fruit in the global market in terms of volume (Castillo and Fuller, 2015), and it is one of the leading horticultural export products of the Philippines (Philippine Statistics Authority, 2017). However, postharvest loss of banana can be up to 35% due to diverse causes such as over-ripening, disease, harvest of immature fruit and mechanical damage (Mopera, 2016). Aquino-Nuevo and Apaga (2010) found that no quality and size standards were being strictly followed for sorting bananas and other harvested crops. Without this quality control, misclassification of bananas and other crops may add to postharvest losses and affect production earnings.

Most sorting studies deal with popular fruit such as apple and orange (Bargoti and Underwood, 2017), mango and lemon (Savakar, 2012), papaya (Santos Pereira et al., 2018), and almond (Teimouri et al., 2014). These fruit were classified individually using machine learning tools based on images or other features. Fruit such as bananas and grapes need to be sorted in tiers or bunches, and their clustered physical characteristics therefore tend to be more complex and more difficult to distinguish using general machine learning. Image background subtraction techniques may not work properly on clustered fruit, resulting in retention of the background. An imperfect fruit image, especially one with a complicated background, reduces the fruit classification performance of a trained deep convolutional neural network (Zhang et al., 2019).

Few studies deal with banana classification. A fruit classification that included yellow bananas, using biogeography-based optimization and a feed-forward neural network, was proposed by Zhang et al. (2016). Prediction of banana ripeness and shelf life was developed using clustering and statistical tools (Thor, 2017). A computer vision-based banana sorting system classified good and defective bananas (Dimililer and Olaniyi, 2015). Another similar image processing-based study (Surya Prabha and Satheesh Kumar, 2015) successfully differentiated under-mature bananas from mature and over-mature ones, but not the latter two from one another. All of these prior studies were based on individual fruit (commonly known as "fingers"), the classification of which is highly impractical. Bananas are sorted either by tiers (also called "hands") or by clusters (called "bunches").

In Piedad et al. (2018), a general machine learning study successfully classified banana tiers based on color and shape features using a random forest classifier. That study had three major drawbacks. Firstly, it is not a fully automatic process, because the shape feature, which is considered to be the best model feature, still needs to be measured manually. Secondly, the two features used in the model cannot sufficiently separate defective bananas from normal ones, giving at most 79% accuracy. If only the image feature were taken as the input of another model with similar parameters, the accuracy might be even lower. Finally, the study took images of six perspective sides of each banana tier, which can be considered impractical and expensive. Therefore, a more practical classification system is needed, one that takes only one image per banana tier without manually measured shape features but still gives better prediction accuracy for both the normal and defective classes. In addition, such a system should employ a deep learning tool rather than simple machine learning, which cannot easily sort the physical characteristics of clustered fruit. In this study, a classification system using deep learning that differentiates normal banana tiers from defective ones is developed.

The emerging application of deep learning has reached the field of agriculture. A recent survey in agriculture shows that applying deep learning provides better accuracy than conventional image processing and data analysis techniques (Kamilaris and Prenafeta-Boldú, 2018). Various noninvasive deep learning models in agriculture were developed using deep simulated learning (Rahnemoonfar and Sheppard, 2017), vein morphological patterns (Grinblat et al., 2016), pixel-level image fusion (Liu et al., 2018), hyperspectral data (Chen et al., 2014), and transfer learning (Mehdipour Ghazi et al., 2017). Some studies use deep learning for disease detection in plants (Mohanty et al., 2016) and fruit (Brahimi et al., 2017). Very few deep learning studies can be found on fruit classification and sorting, and no prior deep learning study has been performed to classify banana tiers.

The convolutional neural network (CNN) is widely used in object detection and recognition using deep learning. Recent variants include R-CNN (Girshick et al., 2014), Fast R-CNN (Girshick, 2015), Faster R-CNN (Ren et al., 2015) and the mask region-based CNN, also known as Mask R-CNN (He et al., 2017), which performed better and faster than the typical CNN in terms of object detection. Mask R-CNN, which predicts an object mask together with the existing branch for recognizing a bounding box, outperformed other variants when detecting multiple objects in an image on some image dataset benchmarks (He et al., 2017). The object mask was found to be critical for good results. With Mask R-CNN, it is possible to detect the banana tier within an image, predict the class, and at the same time generate a mask separating the complex fruit from its background. Since this tool can be applied to any complex-structured object, it should also be useful for any clustered fruit, not only banana tiers.

The objective of this study was to develop a deep learning model to classify banana tiers. The proposed model discriminates normal banana tiers from defective ones. The model also serves as a test case for banana tiers and other clustered crops that have no prior studies in the literature. In addition, this study employs a recent and powerful deep learning tool, Mask R-CNN, for object detection and classification.

2. Materials and methods

2.1. Deep learning model

Deep learning is a deep hierarchical extension of machine learning, a method of data analysis for automating inference based on historical data. A typical deep learning flowchart is shown in Fig. 1. Deep learning has been applied to fields such as image processing and natural language processing, where conventional machine learning is not sufficient. Like other learning methods, a deep learning model needs to be trained, validated and tested. In essence, it is a typical neural network algorithm but with a deeper hierarchical structure. One example is the deep neural network (DNN) model. Instead of a few hidden layers, a DNN usually has many, and the number can reach 100 depending on the input and the application. Adding more layers may cause numerical instability, so DNNs and other emerging deep learning models employ various techniques such as pooling, convolution, and optimizers. The convolutional neural network (CNN) and the recurrent neural network are just two early examples of the rapidly growing family of deep learning variants. Each variant has advantages and disadvantages that suit different kinds of applications.

Fig. 1. A typical deep learning pipeline with training, testing and evaluation phases.

2.2. Proposed classification approach

The Mask R-CNN model is chosen. It is first pre-trained on the COCO dataset, as in He et al. (2017), to obtain converged initial model parameters. After this pre-training, the model is trained on our banana image dataset. Fig. 2 shows the general implementation architecture of Mask R-CNN for banana classification. The first stage uses the region proposal network (RPN). In the second stage, the network outputs k binary masks for each region of interest (ROI), one for each of the k classes, together with the classification of the bounding box. In this research, k is equal to 2, as we only have the normal class and the reject class. The deep learning implementation of this model is performed in training and testing modes. The model first learns from the training dataset. After its parameters are tuned, it is evaluated in inference mode using a different testing dataset.

2.3. Deep learning implementation

The deep learning model proposed above is implemented in this section. The learning procedure is summarized in six general steps.

Step 0. Data preprocessing is performed. Binary images were created by manually annotating the original images using a Python script, as shown in Fig. 3. The annotations are then converted to JavaScript Object Notation (JSON) format, which keeps all the information about image and category identification, bounding box, area and image segmentation in image pixel coordinates.

Step 1. Image preprocessing is conducted first. Each original image is resized to a fixed image size by adding symmetric zero-valued (black) areas to the undersized dimension, which is known as the zero padding approach. In this research, a square image with a resolution of 640 × 640 is used as the fixed size. Since the banana images have a resolution of 640 × 480, they are converted to the fixed image size by appending two equal black areas at the top and bottom of the image, as shown in Fig. 4. Note that the output images of this step are used as input to the CNN, while the annotated images of step 0 are used only as reference in the later steps.

Step 2. Feature pyramid network (FPN) has bottom-up and top-down


Fig. 2. Mask R-CNN architecture for banana tier image classiﬁcation.

pathways with lateral connections. In this study, there are four levels in both the bottom-up and top-down pathways, producing four backbone feature maps each. A sample reject banana image from Fig. 4 is fed into the FPN. The first few layers of the four feature maps (C2 to C5) for the bottom-up pathway and another four feature maps (P2 to P5) for the top-down pathway are shown in Fig. 5. Another feature map (P6) is produced from the smallest backbone feature map (P5). Both bottom-up and top-down backbone feature maps show various interesting topographies of the reject banana tier.

The generated feature maps from the FPN are then fed into an RPN model, as shown in Fig. 6. This model runs a binary classifier on the input anchors over the image and returns the predicted background and foreground scores. It also generates the bounding box refinement values. The RPN is a small model inside the overall network, and it needs to be trained. A module, RPN_Target, provides the ground truth values for the training process. Based on the image anchors, the ground truth class identification, and the ground truth bounding boxes from the output of step 0, the module computes overlaps. Anchors with intersection over union (IoU) values greater than or equal to 0.7 are positive anchors, while negative anchors have IoU less than or equal to 0.3. Neutral anchors have IoU values between 0.3 and 0.7 and are not considered for training. Fig. 7 shows three illustrations of the anchor search. Based on these values, the system identifies positive anchors and the bounding box refinement values (deltas) needed to refine them to match their corresponding ground truth boxes, which serve as the ground truth output for optimizing the RPN model. Two losses are computed

Fig. 3. Image masking of input.

Fig. 4. An image processing of a sample reject banana image using zero padding approach.
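The zero padding of step 1 can be sketched in a few lines. This is a minimal illustration, not the authors' script: it assumes the image is held as a NumPy array of shape height × width × channels, and the function name `pad_to_square` and the dummy white image are hypothetical.

```python
import numpy as np

def pad_to_square(image, size=640):
    """Zero-pad an H x W x C image symmetrically to size x size."""
    height, width = image.shape[:2]
    if height > size or width > size:
        raise ValueError("image is larger than the target size")
    pad_h, pad_w = size - height, size - width
    top, left = pad_h // 2, pad_w // 2
    padding = ((top, pad_h - top), (left, pad_w - left), (0, 0))
    return np.pad(image, padding, mode="constant", constant_values=0)

# A dummy white 640 x 480 (width x height) RGB image stands in for a banana photo.
image = np.full((480, 640, 3), 255, dtype=np.uint8)
padded = pad_to_square(image)
print(padded.shape)  # (640, 640, 3)
```

For a 640 × 480 input, the 160 missing rows are split into two black bands of 80 rows each, at the top and bottom, and the width is left unchanged.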

Fig. 5. Four backbone feature maps of FPN bottom-up pathways and ﬁve backbone feature maps of FPN top-down pathways.

Fig. 6. An architecture of region proposal network (RPN).

Fig. 7. Three illustrations of anchor search: (a) a negative case with IoU ≤ 0.3, (b) a neutral case with 0.3 < IoU < 0.7, and (c) a positive case with IoU ≥ 0.7.
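The three anchor cases of Fig. 7 amount to a simple thresholding rule on each anchor's IoU with the ground truth box. The sketch below only illustrates that rule; the function name `label_anchor`, the numeric label encoding and the sample IoU values are assumptions made for the example.

```python
def label_anchor(iou, positive_threshold=0.7, negative_threshold=0.3):
    """Return 1 (positive), -1 (negative) or 0 (neutral, ignored in training)."""
    if iou >= positive_threshold:
        return 1
    if iou <= negative_threshold:
        return -1
    return 0

# Hypothetical best-overlap IoU values for five anchors.
anchor_ious = [0.05, 0.30, 0.45, 0.70, 0.92]
labels = [label_anchor(iou) for iou in anchor_ious]
print(labels)  # [-1, -1, 0, 1, 1]
```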


Fig. 8. Two RPN prediction images: (a) with refinement, showing only the top 100 anchors, and (b) with refinement and clipping, showing only the top 50 anchors.

Fig. 9. Three illustrations of intersection over union (IoU): (a) a bounding box proposal (in red, the middle box) and two ground truth bounding boxes (in black); (b) an overlap between the proposal and the first ground truth bounding box with IoU = 0.75; and (c) an overlap between the proposal and the second ground truth bounding box with IoU = 0.5.
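The IoU values quoted for Fig. 9 follow from the usual definition, intersection area divided by union area. The sketch below computes it for axis-aligned boxes and keeps the best-matching ground truth box for a proposal. The (y1, x1, y2, x2) box convention and the coordinates are assumptions chosen so that the two overlaps come out as 0.75 and 0.5; they are not taken from the figure.

```python
def box_area(box):
    """Area of a (y1, x1, y2, x2) box; empty boxes have area 0."""
    y1, x1, y2, x2 = box
    return max(0, y2 - y1) * max(0, x2 - x1)

def box_iou(box_a, box_b):
    """Intersection over union of two (y1, x1, y2, x2) boxes."""
    intersection = box_area((
        max(box_a[0], box_b[0]), max(box_a[1], box_b[1]),
        min(box_a[2], box_b[2]), min(box_a[3], box_b[3]),
    ))
    union = box_area(box_a) + box_area(box_b) - intersection
    return intersection / union if union > 0 else 0.0

# Hypothetical coordinates, not measured from Fig. 9.
proposal = (0, 0, 100, 100)
ground_truth_boxes = [(0, 0, 100, 75), (0, 50, 100, 100)]

overlaps = [box_iou(proposal, gt) for gt in ground_truth_boxes]
best = max(range(len(overlaps)), key=overlaps.__getitem__)
print(overlaps, best)  # [0.75, 0.5] 0
```

Keeping only the maximum overlap per proposal means this proposal would be matched to the first ground truth box.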

based on the prediction and ground truth values. The RPN bounding box loss measures the regression loss of the bounding box proposals, and the RPN class loss measures the classification loss of whether the proposals are normal or reject class. This step minimizes these loss values and improves the accuracy of the RPN model.

Step 3. The predictions of the previous step serve as the input of the proposal layer in this step. To improve computing performance and efficiency, a small subset of anchor indexes is retained by trimming to the top anchors, using the foreground score as the criterion, for later use. A small set of data, including scores, refinement deltas and anchors, is extracted from the dataset based on these indexes. At this point, the deltas are applied to the anchors to produce a set of refined anchors. Two RPN images, with refinement and with refinement plus clipping, are shown in Fig. 8.

Step 4. In this step, the successful proposals of step 3 are used to determine the final ROIs as well as the detection targets for an image. First, the overlap between the proposals and the ground truth bounding boxes is computed. Fig. 9 shows the case of overlaps between one anchor and two ground truth bounding boxes. In this case, the network keeps only the maximum overlap value for each anchor. Based on the generated overlap values, the highest IoU values are used to calculate the positive and negative indices, with a fixed ratio between them. Then, using these indices, positive and negative ROIs are determined. Positive ROIs are those with IoU larger than 0.5 with a ground truth bounding box, while negative ROIs have IoU values smaller than 0.5. The system assigns positive ROIs to ground truth boxes and masks before calculating the final bounding box refinement for positive ROIs and the mask targets. Fig. 10 shows a case with a positive anchor before and after refinement.

Fig. 10. Positive anchors before refinement (dotted red box) and after refinement (solid red box).

Step 5. With the ROIs and the feature pyramid network (FPN), two graphs are fed into the FPN heads. The FPN class graph is a computation graph of the FPN classifier and regressor heads that predicts the class and the bounding box refinement. Another computation graph, the FPN mask graph, is used to predict the masks. Finally, with the predictions and the corresponding targets, we can define the class loss, bounding box loss and mask loss. The network continues to optimize these loss values until it converges, based on a predefined number of epochs or error value. After training, we use the model in inference mode and check whether overfitting happens by comparing the inference loss values with the previous training loss values. Overfitting happens when loss variation is


Fig. 11. Tier-based banana dataset with reject bananas enclosed in a solid red box.

Fig. 12. Image transformations of the original banana image in the upper left using data augmentation.
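The kinds of transformation shown in Fig. 12 (resize, rotation, translation and their combinations) can be sketched with NumPy alone. This is a simplified illustration under stated assumptions, not the augmentation pipeline used in the study: rotations are restricted to multiples of 90 degrees, resizing uses nearest-neighbour sampling, and the shift offsets, function names and random test image are all hypothetical.

```python
import numpy as np

def translate(image, dy, dx):
    """Shift an H x W x C image by (dy, dx) pixels; exposed pixels become 0."""
    h, w = image.shape[:2]
    out = np.zeros_like(image)
    if abs(dy) >= h or abs(dx) >= w:
        return out  # shifted completely out of the frame
    out[max(0, dy):min(h, h + dy), max(0, dx):min(w, w + dx)] = \
        image[max(0, -dy):min(h, h - dy), max(0, -dx):min(w, w - dx)]
    return out

def resize_nearest(image, scale):
    """Nearest-neighbour resize by a scale factor."""
    h, w = image.shape[:2]
    new_h, new_w = max(1, round(h * scale)), max(1, round(w * scale))
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    return image[rows][:, cols]

def augment(image, shifts=((0, 0), (40, 0), (0, -40))):
    """Return rotated (multiples of 90 degrees) and translated copies of a square image."""
    variants = []
    for k in range(4):
        rotated = np.rot90(image, k=k, axes=(0, 1))
        for dy, dx in shifts:
            variants.append(translate(rotated, dy, dx))
    return variants

# A random 640 x 640 RGB array stands in for a zero-padded banana image.
image = np.random.default_rng(0).integers(0, 256, size=(640, 640, 3), dtype=np.uint8)
variants = augment(image)
print(len(variants), variants[0].shape)   # 12 (640, 640, 3)
print(resize_nearest(image, 0.5).shape)   # (320, 320, 3)
```

In a real pipeline the annotation masks and bounding boxes would have to be transformed together with each image so that the labels stay aligned.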

very high.

Table 1
Original deep learning dataset.

| Variable | Features | Parameters | Sample size |
|---|---|---|---|
| X | Image | Image file (jpg) | 194 |
| y | Class type | 0: Normal, 1: Abnormal | 194 |

2.4. Banana tier dataset

Classifying banana tiers instead of the widely studied banana fingers is considered practical in postharvest practice. A recently published dataset of the M. acuminata banana species in Piedad et al. (2018) has 194 banana tier samples, with six images of each sample, one from each of six perspective sides. For practical reasons, the effect of the camera distance to the banana tier sample during image capture was disregarded and assumed to be statistically insignificant (Piedad and Villeta, 2017). There are four banana classes, each with its own market: extra class is the export-quality fruit, class I is the high-value domestic fruit, class II is for local market consumption, and the reject class is still considered for local trade but usually at a comparatively lower price. In this study, only a single side image of each banana tier is used. The first three classes are grouped into the normal class, which has 139 samples. The dataset is thus divided into two parts, the normal and the abnormal (reject) classes, having 139 and 55 samples each,


Fig. 13. Tier-based banana dataset partition with ﬁve times cross-validation.
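The partition in Fig. 13 can be illustrated with a small stratified split that keeps the class proportions in both subsets. The sketch assumes a 70-30% split repeated with five different random seeds; the function name `stratified_split`, the seeds and the rounding rule (test size rounded down per class) are assumptions, chosen because they reproduce the 98/41 and 39/16 sample counts reported for the 139 normal and 55 reject tiers.

```python
import random

def stratified_split(labels, test_percent=30, seed=0):
    """Split sample indices into train/test lists, keeping class proportions."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        indices = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(indices)
        n_test = len(indices) * test_percent // 100  # rounded down per class
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return train, test

labels = [0] * 139 + [1] * 55  # 0: normal, 1: reject
for run in range(5):  # five repeated random partitions
    train, test = stratified_split(labels, seed=run)
    n_test_reject = sum(labels[i] for i in test)
    print(run, len(train), len(test), len(test) - n_test_reject, n_test_reject)
```

Each run yields 137 training and 57 test samples, with 41 normal and 16 reject tiers in the test set. The published partitions (Piedad et al., 2019) should be used to reproduce the reported results exactly.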

Fig. 14. The loss graph of the deep learning model without data augmentation after ﬁve simulations.

Fig. 15. The loss graph of the deep learning model with data augmentation after ﬁve simulations.

respectively. Fig. 11 shows the selected banana image dataset, with the reject bananas enclosed in a red box. The figure illustrates how difficult it is, even for a human, to classify bananas from a two-dimensional image alone, because of their complex and varying physical structures. Extracting color and shape features may not be easy for a noninvasive classification. Due to the limited dataset, another classification case using data augmentation is performed. Data augmentation helps improve the classification performance of deep learning models (Zhang et al., 2019). Image resizing, rotation, translation and combinations of these techniques

Table 2
Average precision (%) on the test dataset.

| Intersection-over-union (IoU) threshold | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 |
|---|---|---|---|---|---|
| 0.50 (50%) | 82.3 | 91.7 | 91.8 | 87.3 | 100.0 |
| 0.75 (75%) | 82.3 | 91.7 | 91.8 | 87.3 | 100.0 |


Table 4
Classification performance with data augmentation (classification accuracy as correct/total per run).

| Banana class | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Ave. (%) | Piedad et al. (2018) Ave. (%) |
|---|---|---|---|---|---|---|---|
| Normal | 40/41 | 40/41 | 41/41 | 38/41 | 41/41 | 97.56 | 98.67 |
| Reject | 14/16 | 14/16 | 15/16 | 16/16 | 16/16 | 93.75 | 79.00 |
| Weighted Ave. | 54/57 | 54/57 | 56/57 | 54/57 | 57/57 | 96.49 | 94.20 |
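The averages in Table 4 can be recomputed from the per-run counts with the accuracy definition of Eq. (1). The short check below is only an illustration; the function name `accuracy_percent` is hypothetical, and the counts are copied from the table.

```python
def accuracy_percent(correct, total):
    """Eq. (1): share of correctly classified samples, in percent."""
    return 100.0 * correct / total

# Correctly classified test samples per run, copied from Table 4
# (with data augmentation); each run tests 41 normal and 16 reject tiers.
normal_correct = [40, 40, 41, 38, 41]
reject_correct = [14, 14, 15, 16, 16]

normal_avg = accuracy_percent(sum(normal_correct), 41 * 5)
reject_avg = accuracy_percent(sum(reject_correct), 16 * 5)
weighted_avg = accuracy_percent(sum(normal_correct) + sum(reject_correct), 57 * 5)
print(round(normal_avg, 2), round(reject_avg, 2), round(weighted_avg, 2))
# 97.56 93.75 96.49
```

Because every run uses the same test set size, pooling the counts over the five runs gives the same result as averaging the five per-run percentages.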

increase the size of the dataset. A balanced data partition, with similar proportions of reject and normal bananas, is maintained to ensure a model similar to that of the previous case. The new dataset is about 20 times larger than the original training data, and the results with and without data augmentation are compared. Fig. 12 illustrates a sample image with its augmentations.

2.5. Performance evaluation

Table 1 presents the deep learning dataset, which consists of images and their class labels. The normal class has 139 samples, more than the 55 reject banana samples. For the data-augmented case, there are 1568 normal and 1053 reject banana images for training the model. Both cases are simulated with the same number of test samples and identical data partitions. A balanced 70-30% partition into training and testing datasets is used, as shown in Fig. 13. This means that 98 normal samples are used for training, while 41 samples are used for verifying the developed model. Similarly, 39 and 16 reject samples are used for training and testing, respectively. The validation partitions for the normal and reject banana classes, with all 194 real images, are published in Piedad et al. (2019). A similar partition is performed in the data-augmented case. Stratified k-fold random sampling with k = 5 is used for cross-validation, and the average performance is taken.

The performance evaluation has two parts: model evaluation and classification result. Loss function, precision accuracy and real-valued confidence score measures are usually taken to evaluate a model's image recognition precision, convergence and classification performance (Everingham et al., 2010). For the classification result, a general accuracy is computed during the testing phase using Eq. (1).

$$\text{Classification Accuracy} = \frac{\text{Number of correctly classified bananas}}{\text{Total number of bananas}} \times 100\,\% \qquad (1)$$

Each banana image has a resolution of 640 × 480. Each image contains only one banana tier, but the tier is not fixed in position or fitted to the image. Mask R-CNN detects this tier by predicting the banana class together with a bounding box and the mask of the object. In addition, a normalized confusion matrix further visualizes the classification performance. Finally, the whole process was repeated five times, and the average accuracy was taken. Python libraries from the Scikit-learn platform (Pedregosa et al., 2011) were used to implement the machine learning classifiers: artificial neural network, support vector machine, and random forest.

3. Results and discussion

The loss graphs in Figs. 14 and 15 present the computational performance of our Mask R-CNN model without and with data augmentation, respectively. No significant variations between the training and testing modes were detected after five averaged runs. The models converged at epoch = 10. Increasing the number of epochs further would introduce overfitting, in which the training loss deviates from the test loss. The difference between the training and testing losses is almost constant. Comparing the loss difference between the training and testing phases also shows that the model with data augmentation is more robust than the model without it. This may be due to data augmentation increasing the limited number of banana tiers used in training by about 20 times, from only 137 original training samples. A deep learning model usually has to be trained with a sufficiently large number of images to make it robust. Since generating more banana data is expensive, performance relies heavily on the number of images available. In this case, however, the developed model still performed sufficiently well even without data augmentation.

Table 2 shows the average precision on the test dataset. The typical intersection-over-union (IoU) threshold values of 50% and 75% give identical results over the five runs, which means that the model works the same with either value. In this study we use 50% as the IoU threshold value.

Fig. 16 shows a successful prediction and masking with a 100% confidence score for the normal class. Its reject class prediction has a confidence score lower than the IoU threshold value and thus was not proposed as a prediction. Fig. 17 shows another successful prediction and masking for the normal class, but with confusion between the reject and normal classes. However, the normal class still has a higher confidence score than the reject class, and therefore the normal class is preferred.

Fig. 16. A normal class prediction with 100% confidence score.

Fig. 17. A normal class prediction with confusion between normal (96.7% confidence) and reject (83.2% confidence) class.

Tables 3 and 4 show the classification performance of the developed deep learning model without and with data augmentation, respectively. Both models predict better than the previous results of Piedad et al. (2018), which employed general machine learning models. When discriminating the reject class from the normal class, the models have better average classification accuracies, 92.5% and 93.8% for the two respective cases, than the 79% of the prior study. Although the model underperforms the prior study in classifying the normal class, the difference is small and its accuracy is still greater than 97%. The developed models tend to trade some accuracy on the normal class for better performance on the reject class. Since the number of normal banana samples is almost three times that of the reject samples, the total weighted classification accuracy of the developed model, 96.1%, is a slight improvement over the 94.2% of the previous study. In the second case, applying data augmentation gives slightly better performance, at 96.5%. However, this model shows almost no obvious difference from the first case without data augmentation. This suggests either that the increased number of image samples from data augmentation is still insufficient to improve our model or that this technique does not improve our model at all. Techniques that increase the size of the real dataset, such as data augmentation, can be used. However, our deep learning model can still predict satisfactorily even without data augmentation (Fig. 18).

Table 3
Classification performance without data augmentation (classification accuracy as correct/total per run).

| Banana class | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Ave. (%) | Piedad et al. (2018) Ave. (%) |
|---|---|---|---|---|---|---|---|
| Normal | 39/41 | 40/41 | 41/41 | 39/41 | 41/41 | 97.56 | 98.67 |
| Reject | 14/16 | 15/16 | 15/16 | 15/16 | 15/16 | 92.50 | 79.00 |
| Weighted Ave. | 53/57 | 55/57 | 56/57 | 54/57 | 56/57 | 96.14 | 94.20 |

Fig. 18. Confusion matrix of the classification result from the model (a) without data augmentation and (b) with data augmentation.

4. Conclusion

A noninvasive classification of banana fruit tiers has been implemented using the deep learning model Mask R-CNN. Unlike the previous study that used the same real dataset, our model can readily discriminate the reject banana class. It also uses a single side image of the banana tier as its input feature instead of the original six images and the banana finger length. The developed model has an average weighted accuracy slightly better than that of the prior study. When data augmentation is applied to increase the dataset 20 times, Mask R-CNN still performs better than the prior study, but only slightly better than the case without data augmentation. With or without data augmentation, banana classification using Mask R-CNN is performed successfully. This serves as an initial step toward classifying complex and clustered horticultural crops using deep learning.

Acknowledgments

The authors wish to thank the Department of Science and Technology – Philippine Council for Industry, Energy and Emerging Technology Research and Development (DOST PCIEERD) for the training support on artificial intelligence. This work is financially supported by both the Taiwan Building Technology Center and the Center for Cyber-Physical System Innovation from the Featured Areas Research Center Program within the framework of the Higher Education Sprout Project by the Ministry of Education (MOE) in Taiwan.

References

Aquino-Nuevo, P., Apaga, A.R., 2010. Technology on reducing postharvest losses and maintaining quality of fruit and vegetables (Philippines). Proceedings of 2010 AARDO Workshop. pp. 154–167.
Bargoti, S., Underwood, J.P., 2017. Image segmentation for fruit detection and yield estimation in apple orchards. J. Field Robot. 34 (6), 1039–1060. https://doi.org/10.1002/rob.21699.
Brahimi, M., Boukhalfa, K., Moussaoui, A., 2017. Deep learning for tomato diseases: classification and symptoms visualization. Appl. Artif. Intell. 31 (4), 299–315. https://doi.org/10.1080/08839514.2017.1315516.
Castillo, C., Fuller, D., 2015. Bananas: the Spread of a Tropical Forest Fruit As an Agricultural Staple. Oxford University Press.
Chen, Y., Lin, Z., Zhao, X., Wang, G., Gu, Y., 2014. Deep learning-based classification of hyperspectral data. IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens. 7 (6), 2094–2107. Retrieved from https://www.scopus.com/inward/record.uri?eid=2-s2.0-84901570503&partnerID=40&md5=ed33bcf87e141bca804dbb2507284d6d.
Dimililer, K., Olaniyi, E.O., 2015. Intelligent sorting system based on computer vision for banana industry. Int. J. Sci. Eng. Res. 6 (9), 332–337.
Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A., 2010. The pascal visual object classes (voc) challenge. Int. J. Comput. Vis. 88 (2), 303–338.
Girshick, R., 2015. Fast R-CNN. The IEEE International Conference on Computer Vision (ICCV).
Girshick, R., Donahue, J., Darrell, T., Malik, J., 2014. Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 580–587.
Grinblat, G.L., Uzal, L.C., Larese, M.G., Granitto, P.M., 2016. Deep learning for plant identification using vein morphological patterns. Comput. Electron. Agric. 127, 418–424. https://doi.org/10.1016/j.compag.2016.07.003.
He, K., Gkioxari, G., Dollár, P., Girshick, R., 2017. Mask R-CNN. 2017 IEEE International Conference on Computer Vision (ICCV). pp. 2980–2988.
Kamilaris, A., Prenafeta-Boldú, F.X., 2018. Deep learning in agriculture: a survey. Comput. Electron. Agric. 147 (July 2017), 70–90. https://doi.org/10.1016/j.compag.2018.02.016.
Liu, Y., Chen, X., Wang, Z., Wang, Z.J., Ward, R.K., Wang, X., 2018. Deep learning for pixel-level image fusion: recent advances and future prospects. Inf. Fusion 42 (September 2017), 158–173. https://doi.org/10.1016/j.inffus.2017.10.007.
Mehdipour Ghazi, M., Yanikoglu, B., Aptoula, E., 2017. Plant identification using deep neural networks via optimization of transfer learning parameters. Neurocomputing 235 (April 2016), 228–235. https://doi.org/10.1016/j.neucom.2017.01.018.
Mohanty, S.P., Hughes, D.P., Salathé, M., 2016. Using deep learning for image-based plant disease detection. Front. Plant Sci. 7 (September), 1–10. https://doi.org/10.3389/fpls.2016.01419.
Mopera, L.E., 2016. Food loss in the food value chain: the philippine agriculture scenario. Journal of Development in Sustainable Agriculture 16, 8–16. https://doi.org/10.11178/jdsa.11.8.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., et al., 2011. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830.
Philippine Statistics Authority, 2017. Agricultural Foreign Trade Statistics of the Philippines: 2015. Retrieved from.
Technol. 145, 93–100. https://doi.org/10.1016/J.POSTHARVBIO.2018.06.004.
Piedad, E.J., Le, T.-T., Lin, C.-Y., 2019. Data for Deep Learning for Noninvasive Classification of Clustered Horticultural Crops – a Case for Banana Fruit Tiers. https://doi.org/10.17632/xpz3d7jhbp.1.
Rahnemoonfar, M., Sheppard, C., 2017. Deep count: fruit counting based on deep simulated learning. Sensors (Switzerland) 17 (4), 1–12. https://doi.org/10.3390/s17040905.
Ren, S., He, K., Girshick, R., Sun, J., 2015. Faster r-cnn: towards real-time object detection with region proposal networks. Adv. Neural Inf. Process. Syst. 91–99.
Santos Pereira, L.F., Barbon, S., Valous, N.A., Barbin, D.F., 2018. Predicting the ripening of papaya fruit with digital imaging and random forests. Comput. Electron. Agric. 145 (January), 76–82. https://doi.org/10.1016/j.compag.2017.12.029.
Savakar, D., 2012. Identification and classification of bulk fruits image using artificial neural networks. Int. J. Eng. Innov. Technol. 1 (3), 36–40.
Surya Prabha, D., Satheesh Kumar, J., 2015. Assessment of banana fruit maturity by image processing technique. J. Food Sci. Technol. 52 (3), 1316–1327. https://doi.org/10.1007/s13197-013-1188-3.
Teimouri, N., Omid, M., Mollazade, K., Rajabipour, A., 2014. A novel artificial neural networks assisted segmentation algorithm for discriminating almond nut and shell from background and shadow. Comput. Electron. Agric. 105, 34–43. https://doi.org/10.1016/j.compag.2014.04.008.
Thor, N., 2017. Applying machine learning clustering and classification to predict banana ripeness states and shelf life. Cloud Publ. Int. J. Adv. Food Sci. Technol. 2 (1), 20–25. Retrieved from http://scientific.cloud-journals.com/index.php/IJAFST/article/view/Sci-533.
Zhang, Y.D., Dong, Z., Chen, X., Jia, W., Du, S., Muhammad, K., Wang, S.H., 2019. Image based fruit category classification by 13-layer deep convolutional neural network and data augmentation. Multimed. Tools Appl. 78 (3), 3613–3632. https://doi.org/10.1007/s11042-017-5243-3.
Zhang, Y., Phillips, P., Wang, S., Ji, G., Yang, J., Wu, J., 2016. Fruit classification by biogeography-based optimization and feedforward neural network. Expert. Syst. 33 (3), 239–253. https://doi.org/10.1111/exsy.12146.
https://psa.gov.ph/content/agricultural-foreigntrade-statistics-philippines-2015. Piedad, E.J.D., Villeta, R.B., 2017. Displacement and illumination levels eﬀect on shortdistance measurement errors of using a camera. Recoletos Multidiscipl. Res. J. Piedad, E.J., Larada, J.I., Pojas, G.J., Ferrer, L.V.V., 2018. Postharvest classiﬁcation of banana (Musa acuminata) using tier-based machine learning. Postharvest Biol.
