Postharvest Biology and Technology 156 (2019) 110922
Postharvest Biology and Technology journal homepage: www.elsevier.com/locate/postharvbio
Deep learning for noninvasive classification of clustered horticultural crops – A case for banana fruit tiers
Tuan-Tang Le (a), Chyi-Yeu Lin (a,b,c), Eduardo Jr Piedad (d,⁎)
(a) Department of Mechanical Engineering, National Taiwan University of Science and Technology, Taipei 106, Taiwan
(b) Taiwan Building Technology Center, National Taiwan University of Science and Technology, Taipei 106, Taiwan
(c) Center for Cyber-Physical System Innovation, National Taiwan University of Science and Technology, Taipei 106, Taiwan
(d) Department of Electrical Engineering, University of San Jose-Recoletos, Cebu City 6000, Philippines
ARTICLE INFO
Keywords: Banana; Deep learning; Fruit classification; Horticultural crop
ABSTRACT
In practice, some horticultural crops such as banana tiers, lanzones and grapes are classified as clusters rather than as individual fruit. Clustered crops are rarely studied compared with individually classified crops because of their complex physical structure. A noninvasive deep learning classification of clustered banana, given only a single image feature, has been developed as a pioneering deep learning study for clustered horticultural crops. Among recent deep learning developments, mask region-based convolutional neural networks, also known as Mask R-CNN, show unique applications in image recognition by detecting objects within an image while simultaneously generating segmentation masks. With Mask R-CNN, detecting the complex banana fruit within an image predicts the banana class while at the same time generating a mask separating the fruit from its background. A real dataset of banana tiers is used, and the developed model discriminates normal from abnormal tiers. The previous general machine learning study discriminated the reject class from the normal class with a classification accuracy of 79%, whereas our deep learning model obtained a better average accuracy of 92.5%. The previous weighted average accuracy of 94.2% also improved to 96.1% with only a single image feature instead of tedious multiple image and size features. With data augmentation, the model improved slightly to 93.8% accuracy in classifying the reject class and 96.5% overall accuracy. Having been successfully implemented for banana tiers, this deep learning classification can also serve as a basis for other clustered horticultural crops.
1. Introduction
Banana is considered the most important traded fruit in the global market in terms of volume (Castillo and Fuller, 2015), and it is one of the leading horticultural export products of the Philippines (Philippine Statistics Authority, 2017). However, postharvest loss of banana can be up to 35% due to diverse reasons such as over-ripening, disease, harvest of immature fruit and mechanical damage (Mopera, 2016). Aquino-Nuevo and Apaga (2010) found that no quality and size standards were being strictly followed for sorting bananas and other harvested crops. Without this quality control, misclassification of bananas and other crops may add to postharvest losses and reduce production earnings.
Most sorting studies deal with popular fruit such as apple and orange (Bargoti and Underwood, 2017), mango and lemon (Savakar, 2012), papaya (Santos Pereira et al., 2018), and almond (Teimouri et al., 2014). These fruit were classified individually using machine learning tools based on images or other features. Fruit such as bananas and grapes need to be sorted in tiers or bunches, and their clustered physical characteristics tend to be more complex and more difficult to distinguish using general machine learning. Image background subtraction techniques may not work properly on clustered fruit, resulting in retention of the background. An imperfect fruit image, especially one with a complicated background, reduces the classification performance of a trained deep convolutional neural network (Zhang et al., 2019).
Few studies deal with banana classification. A fruit classification that included yellow bananas, using biogeography-based optimization and a feed-forward neural network, was proposed by Zhang et al. (2016). Prediction of banana ripeness and shelf life was developed using clustering and statistical tools (Thor, 2017). A computer-vision-based banana sorting system classified good and defective bananas (Dimililer and Olaniyi, 2015). Another similar image processing-based study (Surya Prabha and Satheesh Kumar, 2015) successfully differentiated under-mature from mature and over-mature bananas, but not the
⁎ Corresponding author. E-mail address: [email protected] (E.J. Piedad).
https://doi.org/10.1016/j.postharvbio.2019.05.023 Received 2 February 2019; Received in revised form 23 May 2019; Accepted 24 May 2019 Available online 10 June 2019 0925-5214/ © 2019 Elsevier B.V. All rights reserved.
latter two from each other. All of these prior studies were based on individual fruit (commonly known as “fingers”), the classification of which is highly impractical. Bananas are sorted either by tiers (also called “hands”) or by cluster (called “bunches”).
In Piedad et al. (2018), a general machine learning study successfully classified banana tiers based on color and shape features using a random forest classifier. There were three major drawbacks in that study. Firstly, it was not a fully automatic process because the shape feature, considered to be the best model feature, still had to be measured manually. Secondly, the two features used in that model could not sufficiently separate defective bananas from normal ones, giving at most 79% accuracy. If only the image feature were taken as the input of another model with similar parameters, the accuracy might be even lower. Finally, the study took images of six perspective sides of each banana tier, which can be considered impractical and expensive. Therefore, a more practical classification system is needed, one that takes only one image per banana tier without manually measured shape features but still gives better prediction accuracy for both normal and defective classes. In addition, such a system should employ deep learning rather than simple machine learning, which cannot easily sort the physical characteristics of clustered fruit. In this study, a classification system using deep learning that differentiates normal banana tiers from defective ones is developed.
The emerging application of deep learning has reached the field of agriculture. A recent survey in agriculture shows that applying deep learning provides better accuracy than conventional image processing and data analysis techniques (Kamilaris and Prenafeta-Boldú, 2018).
Various non-invasive deep learning models in agriculture have been developed using deep simulated learning (Rahnemoonfar and Sheppard, 2017), vein morphological patterns (Grinblat et al., 2016), pixel-level image fusion (Liu et al., 2018), hyperspectral data (Chen et al., 2014), and transfer learning (Mehdipour Ghazi et al., 2017). Some studies use deep learning for disease detection in plants (Mohanty et al., 2016) and fruit (Brahimi et al., 2017). Very few deep learning studies address fruit classification and sorting, and no prior deep learning study has classified banana tiers.
The convolutional neural network (CNN) is widely used in object detection and recognition using deep learning. Recent variants include R-CNN (Girshick et al., 2014), fast R-CNN (Girshick, 2015), faster R-CNN (Ren et al., 2015) and mask region-based CNN, also known as Mask R-CNN (He et al., 2017), which performed better and faster than the typical CNN in object detection. Mask R-CNN, which adds a branch for predicting an object mask in parallel with the existing branch for bounding box recognition, outperformed other variants when detecting multiple objects in an image on some image dataset benchmarks (He et al., 2017). The object mask was found to be critical for good results. With Mask R-CNN, it is possible to detect the banana tier within an image, predict its class, and at the same time generate a mask separating the complex fruit from its background. Since this tool can be applied to any complex-structured object, it should also be useful for any clustered fruit, not only banana tiers.
The objective of this study was to develop a deep learning model to classify banana tiers. The proposed model discriminates normal banana tiers from defective ones. The model also serves as a test case for banana tiers and other clustered crops that have no prior studies in the literature.
In addition, this study employs a recent and powerful deep learning tool, Mask R-CNN, for object detection and classification.
Fig. 1. A typical deep learning pipeline with training, testing and evaluation phases.
2. Materials and methods
2.1. Deep learning model
Deep learning is a deep hierarchical extension of machine learning, a method of data analysis for automating inference based on historical data. A typical deep learning flowchart is shown in Fig. 1. Deep learning has been applied to fields such as image processing and natural language processing, where conventional machine learning is not sufficient. A deep learning model also needs to be trained, validated and tested. In essence, it is a typical neural network algorithm with a deeper hierarchical structure. One example is the deep neural network (DNN) model. Instead of a few hidden layers, a DNN usually has many, which can reach up to 100 depending on the input and the application. Adding more layers may cause numerical instability, so DNNs and other emerging deep learning models employ various techniques such as pooling, convolution, and optimizers. The convolutional neural network (CNN) and the recurrent neural network are just two early examples of the rapidly growing number of deep learning variants. Each variant has advantages and disadvantages that suit different kinds of applications.
2.2. Proposed classification approach
The Mask R-CNN model is chosen. It is first pre-trained on the COCO dataset, as in He et al. (2017), to obtain initial converging model parameters. After this pre-training, the model is trained on our banana image dataset. Fig. 2 shows the general implementation architecture of Mask R-CNN for banana classification. The first stage uses the region proposal network (RPN). In the second stage, the network outputs k binary masks for each region of interest (ROI), one for each of the k classes, together with the class and bounding box predictions. In this research, k is equal to 2 as we only have the normal class and the reject class. The model is run in training and testing modes. It first learns from the training dataset. After its parameters are tuned, it is evaluated in inference mode using a separate testing dataset.
2.3. Deep learning implementation
The deep learning model proposed above is implemented in this section. The learning procedure is summarized in six general steps.
Step 0. Data preprocessing is performed. Binary images were created by manually annotating the original images using a Python script, as shown in Fig. 3. The annotations are then converted to JavaScript Object Notation (JSON) format, which keeps all the information about image and category identification, bounding box, area and image segmentation in image pixel coordinates.
Step 1. Image preprocessing is conducted. Each original image is resized to a fixed image size by adding symmetric zero-valued (black) areas along the undersized dimension (here, the height), which is known as the zero padding approach. In this research, a square image with a resolution of 640 × 640 is used as the fixed size. Since the banana images have a resolution of 640 × 480, they are converted to the fixed size by appending two equal black areas to the top and bottom of the image, as shown in Fig. 4. Note that the output images of this step are used for the CNN in general, while the annotated images of step 0 are used only as reference in the later steps.
Step 2. Feature pyramid network (FPN) has bottom-up and top-down
pathways with lateral connections. In this study, both the bottom-up and top-down pathways have four levels, producing four backbone feature maps each. A sample reject banana image in Fig. 4 is fed into the FPN. The first few layers of the four feature maps (C2 to C5) of the bottom-up pathway and of the four feature maps (P2 to P5) of the top-down pathway are shown in Fig. 5. Another feature map (P6) is produced from the smallest backbone feature map (P5). Both bottom-up and top-down backbone feature maps show various interesting topographies of the reject banana tier. The generated feature maps from the FPN are then fed into an RPN model, as shown in Fig. 6. This model runs a binary classifier on the input anchors over the image and returns the predicted background and foreground scores. It also generates the bounding box refinement values. The RPN is a small model inside the overall network, and it needs to be trained. A module, RPN_Target, provides the ground truth values for the training process. Based on the image anchors and on the ground truth class identification and ground truth bounding box output by step 0, the module computes overlaps. Anchors with intersection over union (IoU) values greater than or equal to 0.7 are positive anchors, while negative anchors have IoU less than or equal to 0.3. Neutral anchors, with IoU values between 0.3 and 0.7, are not considered for training. Fig. 7 shows three illustrations of the anchor search. Based on these values, the system identifies the positive anchors and computes the bounding box refinement values (deltas) needed to match them to their corresponding ground truth boxes; these serve as the ground truth output for optimizing the RPN model. Two losses are computed
Fig. 2. Mask R-CNN architecture for banana tier image classification.
Fig. 3. Image masking of input.
Fig. 4. An image processing of a sample reject banana image using zero padding approach.
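The zero padding of Step 1 can be sketched as follows. This is a minimal illustration using NumPy rather than the authors' own preprocessing code; the function name `zero_pad_to_square` and the synthetic white test image are assumptions made for the example.

```python
import numpy as np


def zero_pad_to_square(image: np.ndarray, size: int = 640) -> np.ndarray:
    """Pad an H x W x C image with black borders up to size x size.

    Hypothetical helper: the padding is split as evenly as possible
    between the two sides of each undersized dimension.
    """
    height, width = image.shape[:2]
    if height > size or width > size:
        raise ValueError("image is larger than the target size")
    pad_h, pad_w = size - height, size - width
    top, left = pad_h // 2, pad_w // 2
    bottom, right = pad_h - top, pad_w - left
    return np.pad(
        image,
        ((top, bottom), (left, right), (0, 0)),
        mode="constant",
        constant_values=0,
    )


# Synthetic stand-in for a 640 x 480 (width x height) banana image.
sample = np.full((480, 640, 3), 255, dtype=np.uint8)
padded = zero_pad_to_square(sample)
```

For a 640 × 480 input, the sketch adds two black bands of 80 rows each above and below the image and leaves the width unchanged.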
Fig. 5. Four backbone feature maps of FPN bottom-up pathways and five backbone feature maps of FPN top-down pathways.
Fig. 6. An architecture of region proposal network (RPN).
Fig. 7. Three illustrations of anchor search – (a) a negative case with IoU ≤ 0.3, (b) a neutral case with 0.3 < IoU < 0.7 and (c) a positive case with IoU ≥ 0.7.
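The three anchor cases can be expressed as a small labelling rule. The sketch below uses only the 0.7 and 0.3 thresholds stated in the text; the function name, the label encoding (1, -1, 0) and the sample IoU values are illustrative assumptions.

```python
def label_anchor(iou: float, pos_thresh: float = 0.7, neg_thresh: float = 0.3) -> int:
    """Return 1 (positive), -1 (negative) or 0 (neutral) for one anchor."""
    if iou >= pos_thresh:
        return 1
    if iou <= neg_thresh:
        return -1
    return 0


# Hypothetical best-IoU values of five anchors against the ground truth box.
anchor_ious = [0.05, 0.30, 0.45, 0.70, 0.92]
labels = [label_anchor(value) for value in anchor_ious]

# Neutral anchors are excluded from RPN training.
training_anchors = [i for i, label in enumerate(labels) if label != 0]
```

Here the third anchor (IoU = 0.45) is neutral and is dropped, while the boundary values 0.30 and 0.70 fall in the negative and positive groups respectively.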
Fig. 8. Two RPN prediction images - (a) with refinement showing only top 100 anchors, and (b) with refinement and clipping showing only top 50 anchors.
Fig. 9. Three illustrations of intersection over union (IoU) - (a) a bounding box proposal (in red, the middle box) and two ground truth bounding boxes (in black); (b) an overlap between the proposal and the first ground truth bounding box with IoU = 0.75; and (c) an overlap between the proposal and the second ground truth bounding box with IoU = 0.5.
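As a worked illustration of the overlap computation, the sketch below computes the IoU of one proposal against two ground truth boxes and keeps the maximum. The box coordinates are hypothetical, chosen only so that the overlaps come out as 0.75 and 0.5 like the values quoted in Fig. 9; they are not taken from the dataset.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def best_overlap(proposal: Box, gt_boxes: Sequence[Box]) -> Tuple[int, float]:
    """Index and IoU of the ground truth box that overlaps the proposal most."""
    overlaps = [iou(proposal, gt) for gt in gt_boxes]
    best = max(range(len(overlaps)), key=overlaps.__getitem__)
    return best, overlaps[best]


# Hypothetical coordinates, not measured from the paper's figures.
proposal = (0.0, 0.0, 20.0, 20.0)
gt_boxes = [(0.0, 0.0, 20.0, 15.0), (0.0, 0.0, 20.0, 10.0)]

overlaps = [iou(proposal, gt) for gt in gt_boxes]
best_index, best_iou = best_overlap(proposal, gt_boxes)
is_positive_roi = best_iou > 0.5  # ROI threshold stated in Step 4
```

Only the larger overlap is kept for the proposal, so it is matched to the first ground truth box and counts as a positive ROI under the 0.5 threshold.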
the top anchors using the foreground score as the criterion for later use. A small set of data, including scores, refinement deltas and anchors, is extracted from the dataset based on the indexes from the previous step. At this point, the deltas are applied to the anchors to produce a set of refined anchors. Two RPN images, after refinement and after refinement with clipping, are shown in Fig. 8.
Step 4. In this step, the successful proposals of step 3 are used to determine the final ROIs as well as the detection targets for an image. First, the overlap between the proposals and the ground truth bounding boxes is computed. Fig. 9 shows the case of overlaps between one anchor and two ground truth bounding boxes. In this case, the network keeps only the maximum overlap value for each anchor. Based on the generated overlap values, the highest IoU values are used to calculate the positive and negative indices, with a fixed ratio between them. Then, using these indices, positive and negative ROIs are determined. Positive ROIs are those with IoU larger than 0.5 with a ground truth box, while negative ROIs have IoU values smaller than 0.5. The system assigns positive ROIs to ground truth boxes and masks before calculating the final bounding box refinement and mask targets for the positive ROIs. Fig. 10 shows a case with a positive anchor before and after refinement.
Step 5. With the ROIs and the feature pyramid network (FPN), two graphs are fed into FPN heads. The FPN class graph is a computation graph of the FPN classifier and regressor heads, which predict the class and the bounding box refinement. Another computation graph, the FPN mask graph, is used to predict the masks. Finally, with the predictions and the corresponding targets, we can define the class loss, bounding box loss and mask loss. The network continues to optimize these loss values until it converges, based on a predefined number of epochs or error values. After the training, we run the model in inference mode and check whether overfitting happens by comparing the inference loss values with the previous training loss values. Overfitting happens when loss variation is
Fig. 10. Positive anchors before refinement (dotted red box) and after refinement (solid red box).
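The refinement and clipping of an anchor can be sketched as below. The paper does not state how the deltas are encoded, so the sketch assumes the parameterization common in R-CNN-style detectors: center shifts (dy, dx) scaled by the anchor height and width, and log-scale size changes (dh, dw). The box layout (y1, x1, y2, x2), the function names and the sample numbers are likewise assumptions.

```python
import math
from typing import Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (y1, x1, y2, x2)


def apply_deltas(box: Box, deltas: Tuple[float, float, float, float]) -> Box:
    """Shift and rescale a box; deltas are (dy, dx, log dh, log dw) by assumption."""
    y1, x1, y2, x2 = box
    dy, dx, dh, dw = deltas
    height, width = y2 - y1, x2 - x1
    center_y = y1 + 0.5 * height + dy * height
    center_x = x1 + 0.5 * width + dx * width
    height *= math.exp(dh)
    width *= math.exp(dw)
    new_y1 = center_y - 0.5 * height
    new_x1 = center_x - 0.5 * width
    return (new_y1, new_x1, new_y1 + height, new_x1 + width)


def clip_box(box: Box, image_height: float, image_width: float) -> Box:
    """Clamp a box to the image boundaries."""
    y1, x1, y2, x2 = box
    return (
        min(max(y1, 0.0), image_height),
        min(max(x1, 0.0), image_width),
        min(max(y2, 0.0), image_height),
        min(max(x2, 0.0), image_width),
    )


# Hypothetical anchor and deltas: shift down by 10% of the height and
# left by 5% of the width, keeping the size unchanged.
anchor = (100.0, 200.0, 300.0, 600.0)
refined = apply_deltas(anchor, (0.1, -0.05, 0.0, 0.0))
clipped = clip_box((-10.0, 600.0, 100.0, 700.0), 640.0, 640.0)
```

With zero size deltas the anchor keeps its 200 × 400 extent and is only translated; clipping then trims any part of a box that falls outside the 640 × 640 padded image.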
based on the prediction and ground truth values. The RPN bounding box loss measures the regression loss of the bounding box proposals, and the RPN class loss measures the classification loss of whether the proposals are normal or reject class. This step minimizes these loss values and improves the accuracy of the RPN model.
Step 3. The predictions of the previous step serve as the input of the proposal layer in this step. To improve computing performance and efficiency, a small subset of anchor indexes is retained by trimming to
Fig. 11. Tier-based banana dataset with reject bananas enclosed in a solid red box.
Fig. 12. Image transformations of the original banana image in the upper left using data augmentation.
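A minimal sketch of the three named transformations (resize, rotation, translation) is given below, using NumPy only. The paper does not list the angles, offsets or scale factors it used, so the quarter-turn rotation, nearest-neighbour resize and all parameter values here are assumptions for illustration.

```python
import numpy as np


def translate(image: np.ndarray, dy: int, dx: int) -> np.ndarray:
    """Shift an image by (dy, dx) pixels, filling uncovered pixels with zeros."""
    h, w = image.shape[:2]
    out = np.zeros_like(image)
    src_rows = slice(max(0, -dy), min(h, h - dy))
    dst_rows = slice(max(0, dy), min(h, h + dy))
    src_cols = slice(max(0, -dx), min(w, w - dx))
    dst_cols = slice(max(0, dx), min(w, w + dx))
    out[dst_rows, dst_cols] = image[src_rows, src_cols]
    return out


def resize_nearest(image: np.ndarray, new_h: int, new_w: int) -> np.ndarray:
    """Nearest-neighbour resize by index sampling."""
    h, w = image.shape[:2]
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    return image[rows][:, cols]


def rotate_quarter_turns(image: np.ndarray, turns: int) -> np.ndarray:
    """Rotate counter-clockwise by multiples of 90 degrees."""
    return np.rot90(image, turns)


# Small synthetic image standing in for a banana tier photograph.
image = np.arange(24).reshape(4, 6)
shifted = translate(image, 1, 0)
small = resize_nearest(image, 2, 3)
rotated = rotate_quarter_turns(image, 1)
combined = translate(rotate_quarter_turns(image, 1), 0, 1)
```

Each transformed copy inherits the class label of its source image; the bounding box and mask annotations would need the same transformation applied, which this sketch does not cover.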
published dataset of the M. acuminata banana species in Piedad et al. (2018) has 194 banana tier samples, each with six images taken from six perspective sides. For practical reasons, the impact of camera distance to the banana tier sample during image capture was disregarded and assumed to be statistically insignificant (Piedad and Villeta, 2017). There are four banana classes, each with its own market: extra class is the export-quality fruit, class I the high-value domestic fruit and class II the fruit for local market consumption, while the reject class is still considered for local trade but usually at a comparatively lower price. In this study, only a single side image of each banana tier is chosen, and the first three classes are grouped into the normal class, which has 139 samples. The dataset is thus subdivided into two parts, the normal and the abnormal (reject) classes, having 139 and 55 samples,
Table 1
Original Deep Learning Dataset.
Variable    Features      Parameters                 Sample Size
X           Image         Image File (jpg)           194
y           Class type    0: Normal, 1: Abnormal     194
very high.
2.4. Banana tier dataset
Classifying based on banana tiers instead of the widely studied banana fingers is considered practical in postharvest practice. A recently
Fig. 13. Tier-based banana dataset partition with five times cross-validation.
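The 70-30% stratified partition can be sketched in plain Python as follows. This is not the authors' partitioning code (their actual partitions are published with the dataset); the function name, the seeds and the choice to round the test share down are assumptions, made so that the per-class counts match those reported in the text.

```python
import random
from typing import List, Sequence, Tuple


def stratified_split(
    labels: Sequence[int], test_percent: int = 30, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Split sample indices per class so each class keeps the same proportion."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        indices = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(indices)
        n_test = len(indices) * test_percent // 100  # test share rounded down
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return train, test


# 139 normal (0) and 55 reject (1) samples, as in Table 1.
labels = [0] * 139 + [1] * 55

# Five repeated random partitions, one per run.
splits = [stratified_split(labels, seed=run) for run in range(5)]
train_idx, test_idx = splits[0]
test_counts = (
    sum(1 for i in test_idx if labels[i] == 0),
    sum(1 for i in test_idx if labels[i] == 1),
)
```

Under these assumptions each run holds out 41 normal and 16 reject samples and trains on the remaining 98 and 39.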
Fig. 14. The loss graph of the deep learning model without data augmentation after five simulations.
Fig. 15. The loss graph of the deep learning model with data augmentation after five simulations.
respectively. Fig. 11 shows the selected banana image dataset with the reject bananas enclosed in a red box. The figure shows how difficult it is, even for a human, to classify banana tiers from a two-dimensional image alone because of their complex and varying physical structures. Extracting color and shape features may not be easy for a non-invasive classification. Because the dataset is limited, another classification case using data augmentation is performed. Data augmentation helps improve the classification performance of deep learning models (Zhang et al., 2019). Image resize, rotation, translation and combinations of these techniques
Table 2
Average precision performance on the test dataset.
Intersection-over-union (IoU) threshold    Average Precision (%)
                                           Run 1    Run 2    Run 3    Run 4    Run 5
0.50 or 50 %                               82.3     91.7     91.8     87.3     100.0
0.75 or 75 %                               82.3     91.7     91.8     87.3     100.0
Table 4
Classification performance with data augmentation.
Banana Class     Classification Accuracy
                 Run 1    Run 2    Run 3    Run 4    Run 5    Ave.(%)    Piedad et al. (2018) Ave.(%)
Normal           40/41    40/41    41/41    38/41    41/41    97.56      98.67
Reject           14/16    14/16    15/16    16/16    16/16    93.75      79.00
Weighted Ave.    54/57    54/57    56/57    54/57    57/57    96.49      94.20
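The averages in Table 4 can be reproduced from the per-run counts by pooling correct predictions over the five runs. The sketch below is only an arithmetic check of the tabulated values; the variable and function names are illustrative.

```python
def accuracy_percent(correct: int, total: int) -> float:
    """Classification accuracy as a percentage."""
    return 100.0 * correct / total


# Correctly classified test samples per run, read from Table 4.
normal_correct = [40, 40, 41, 38, 41]
reject_correct = [14, 14, 15, 16, 16]
N_NORMAL, N_REJECT, N_RUNS = 41, 16, 5

normal_acc = accuracy_percent(sum(normal_correct), N_NORMAL * N_RUNS)
reject_acc = accuracy_percent(sum(reject_correct), N_REJECT * N_RUNS)
weighted_acc = accuracy_percent(
    sum(normal_correct) + sum(reject_correct), (N_NORMAL + N_REJECT) * N_RUNS
)

summary = (round(normal_acc, 2), round(reject_acc, 2), round(weighted_acc, 2))
```

The weighted average is dominated by the normal class because it contributes 41 of the 57 test samples in every run.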
a similar model to the previous case. The new dataset is about 20 times larger than the original training data, and the results with and without data augmentation are compared. Fig. 12 illustrates a sample image with its augmentations.
2.5. Performance evaluation
Table 1 presents the deep learning dataset, which consists of images and their class labels. The 139 normal class samples outnumber the 55 reject banana samples. For the data-augmented case, there are 1568 normal and 1053 reject banana images for training the model. Both cases are simulated with the same test dataset size and identical data partitions. A balanced 70-30% partition into training and testing datasets is used, as shown in Fig. 13. This means that 98 normal samples are used for training and 41 for verifying the developed model. Similarly, 39 and 16 reject samples are used for training and testing, respectively. The partitions for the normal and reject banana classes, with all 194 real images, are published in Piedad et al. (2019). A similar partition is performed in the data-augmented case. Stratified random sampling with k = 5 folds is used for cross-validation, and the average performance is taken.
The performance evaluation has two parts: model evaluation and classification result. The loss function, precision and real-valued confidence scores are usually used to evaluate a model's convergence, image recognition precision and classification performance (Everingham et al., 2010). For the classification result, a general accuracy is computed during the testing phase using Eq. (1).
Fig. 16. A normal class prediction with 100% confidence score.
$$\text{Classification Accuracy} = \frac{\text{Number of correctly classified banana}}{\text{Total number of banana}} \times 100\% \qquad (1)$$
Each banana image has a resolution of 640 × 480. Each image contains only one banana tier, although the tier is not fixed in position or fitted to the image frame. Mask R-CNN detects this tier by predicting the banana class together with a bounding box and the mask of the object. In addition, a normalized confusion matrix helps visualize the classification performance. Finally, the whole process was repeated five times, and the average accuracy was taken. Python libraries from the Scikit-learn platform (Pedregosa et al., 2011) were used to implement the machine learning classifiers: artificial neural network, support vector machine, and random forest.
Fig. 17. A normal class prediction with confusion between normal (96.7% confidence) and reject (83.2% confidence) class.
Table 3
Classification performance result without data augmentation.
Banana Class     Classification Accuracy
                 Run 1    Run 2    Run 3    Run 4    Run 5    Ave.(%)    Piedad et al. (2018) Ave.(%)
Normal           39/41    40/41    41/41    39/41    41/41    97.56      98.67
Reject           14/16    15/16    15/16    15/16    15/16    92.50      79.00
Weighted Ave.    53/57    55/57    56/57    54/57    56/57    96.14      94.20
3. Results and discussion
The loss graphs of Figs. 14 and 15 present the computational performance of our Mask R-CNN model without and with data augmentation, respectively. No significant variations between the training and testing performances were detected after averaging five runs. The models converged at epoch = 10. Further increasing the number of epochs would introduce overfitting, in which the training loss value deviates from the test loss value. The difference between the training and testing losses is almost constant. It is also evident that the model with data augmentation is more robust than the model without data augmentation, as seen by comparing the loss difference between their training and
increase the size of the dataset. A balanced data partition, with a similar proportion of reject and normal bananas, is maintained to ensure
Fig. 18. Confusion matrix of the classification result from the model (a) without data augmentation and (b) with data augmentation.
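A row-normalized confusion matrix of this kind can be sketched from pooled counts. The counts below are derived from the per-run results of Table 3 (without data augmentation) summed over five runs; whether Fig. 18 was produced from pooled or per-run counts is not stated, so the numbers here may not match the figure exactly.

```python
from typing import List


def normalize_rows(matrix: List[List[int]]) -> List[List[float]]:
    """Divide each row by its sum so that every true class sums to 1."""
    return [[count / sum(row) for count in row] for row in matrix]


# Rows: true class (normal, reject); columns: predicted class (normal, reject).
# Pooled over five runs from Table 3: 200 of 205 normal and 74 of 80 reject
# samples were classified correctly.
counts = [
    [200, 5],
    [6, 74],
]
normalized = normalize_rows(counts)
```

The diagonal of the normalized matrix gives the per-class accuracies, about 0.976 for the normal class and 0.925 for the reject class.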
testing phases. This may be due to data augmentation increasing the limited number of banana tier images used in training by about 20 times, from only 137 original training samples. A deep learning model usually has to be trained with a sufficiently large number of images to be robust. Since collecting more banana data is expensive, performance relies heavily on the number of images available. In this case, however, the developed model still performed adequately even without data augmentation.
Table 2 shows the average precision on the test dataset. The typical intersection-over-union (IoU) threshold values of 50% and 75% give identical results over the five runs, which means that the model works the same with either value. In this study we use 50% as the IoU threshold value. Fig. 16 shows a successful prediction and masking with a 100% confidence score for the normal class. Its reject class prediction has a confidence score below the threshold value, so it was not proposed as a prediction. Fig. 17 shows another successful prediction and masking for the normal class, but with confusion between the reject and normal classes. However, the normal class still has a greater confidence score than the reject class, and therefore the normal class is preferred.
Tables 3 and 4 show the classification performance of the developed deep learning model without and with data augmentation, respectively. Both models predict better than the previous results of Piedad et al. (2018), which employed general machine learning models. When discriminating the reject class from the normal one, the models have better average classification accuracies, 92.5% and 93.8% respectively, than the 79% of the prior study. Though the model underperforms the prior study in classifying the normal class, the difference is small and its accuracy is still greater than 97%. The developed models tend to trade some accuracy on the normal class for better performance on the reject class. Since normal banana samples are almost three times as numerous as reject samples, the total weighted classification accuracy of the developed model, 96.1%, is only slightly better than the 94.2% of the previous study. In the second case, applying data augmentation gives slightly better performance, 96.5%. However, this model performs with almost no obvious difference from the first case without data augmentation. This suggests that either the increased number of image samples from data augmentation is still insufficient to improve our model, or this technique does not improve our model at all. Techniques to increase the size of the real dataset, such as data augmentation, can be used. However, our deep learning model can still predict satisfactorily even without data augmentation (Fig. 18).
4. Conclusion
A noninvasive classification of banana fruit tiers has been implemented using the deep learning model Mask R-CNN. Unlike the previous study that used the same real dataset, our model can readily discriminate the reject banana class. It also uses a single side image of the banana tier as the input feature instead of the original six images and the banana finger length. The developed model has a weighted average accuracy slightly better than that of the prior study. When data augmentation is applied to increase the dataset 20 times, Mask R-CNN still performs better than the prior study, but only slightly better than the case without data augmentation. With or without data augmentation, banana classification using Mask R-CNN is successfully performed. This serves as an initial step for classifying complex and clustered horticultural crops using deep learning.
Acknowledgments
The authors wish to thank the Department of Science and Technology – Philippine Council for Industry, Energy and Emerging Technology Research and Development (DOST PCIEERD) for the training support on artificial intelligence. This work is financially supported by both Taiwan Building Technology Center and Center for Cyber-Physical System Innovation from the Featured Areas Research Center Program within the framework of the Higher Education Sprout Project by the Ministry of Education (MOE) in Taiwan.
References
Aquino-Nuevo, P., Apaga, A.R., 2010. Technology on reducing postharvest losses and maintaining quality of fruit and vegetables (Philippines). Proceedings of 2010 AARDO Workshop. pp. 154–167.
Bargoti, S., Underwood, J.P., 2017. Image segmentation for fruit detection and yield estimation in apple orchards. J. Field Robot. 34 (6), 1039–1060. https://doi.org/10.1002/rob.21699.
Brahimi, M., Boukhalfa, K., Moussaoui, A., 2017. Deep learning for tomato diseases: classification and symptoms visualization. Appl. Artif. Intell. 31 (4), 299–315. https://doi.org/10.1080/08839514.2017.1315516.
Castillo, C., Fuller, D., 2015. Bananas: the Spread of a Tropical Forest Fruit As an Agricultural Staple. Oxford University Press.
Chen, Y., Lin, Z., Zhao, X., Wang, G., Gu, Y., 2014. Deep learning-based classification of hyperspectral data. IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens. 7 (6), 2094–2107. Retrieved from https://www.scopus.com/inward/record.uri?eid=2-s2.0-84901570503&partnerID=40&md5=ed33bcf87e141bca804dbb2507284d6d.
Dimililer, K., Olaniyi, E.O., 2015. Intelligent sorting system based on computer vision for banana industry. Int. J. Sci. Eng. Res. 6 (9), 332–337.
Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A., 2010. The pascal visual object classes (voc) challenge. Int. J. Comput. Vis. 88 (2), 303–338.
Girshick, R., 2015. Fast R-CNN. The IEEE International Conference on Computer Vision (ICCV).
Girshick, R., Donahue, J., Darrell, T., Malik, J., 2014. Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 580–587.
Grinblat, G.L., Uzal, L.C., Larese, M.G., Granitto, P.M., 2016. Deep learning for plant identification using vein morphological patterns. Comput. Electron. Agric. 127, 418–424. https://doi.org/10.1016/j.compag.2016.07.003.
He, K., Gkioxari, G., Dollár, P., Girshick, R., 2017. Mask R-CNN. 2017 IEEE International Conference on Computer Vision (ICCV). pp. 2980–2988.
Kamilaris, A., Prenafeta-Boldú, F.X., 2018. Deep learning in agriculture: a survey. Comput. Electron. Agric. 147 (July 2017), 70–90. https://doi.org/10.1016/j.compag.2018.02.016.
Liu, Y., Chen, X., Wang, Z., Wang, Z.J., Ward, R.K., Wang, X., 2018. Deep learning for pixel-level image fusion: recent advances and future prospects. Inf. Fusion 42 (September 2017), 158–173. https://doi.org/10.1016/j.inffus.2017.10.007.
Mehdipour Ghazi, M., Yanikoglu, B., Aptoula, E., 2017. Plant identification using deep neural networks via optimization of transfer learning parameters. Neurocomputing 235 (April 2016), 228–235. https://doi.org/10.1016/j.neucom.2017.01.018.
Mohanty, S.P., Hughes, D.P., Salathé, M., 2016. Using deep learning for image-based plant disease detection. Front. Plant Sci. 7 (September), 1–10. https://doi.org/10.3389/fpls.2016.01419.
Mopera, L.E., 2016. Food loss in the food value chain: the Philippine agriculture scenario. Journal of Development in Sustainable Agriculture 16, 8–16. https://doi.org/10.11178/jdsa.11.8.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., et al., 2011. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830.
Philippine Statistics Authority, 2017. Agricultural Foreign Trade Statistics of the Philippines: 2015. Retrieved from https://psa.gov.ph/content/agricultural-foreigntrade-statistics-philippines-2015.
Piedad, E.J.D., Villeta, R.B., 2017. Displacement and illumination levels effect on short-distance measurement errors of using a camera. Recoletos Multidiscipl. Res. J.
Piedad, E.J., Larada, J.I., Pojas, G.J., Ferrer, L.V.V., 2018. Postharvest classification of banana (Musa acuminata) using tier-based machine learning. Postharvest Biol. Technol. 145, 93–100. https://doi.org/10.1016/J.POSTHARVBIO.2018.06.004.
Piedad, E.J., Le, T.-T., Lin, C.-Y., 2019. Data for Deep Learning for Noninvasive Classification of Clustered Horticultural Crops – a Case for Banana Fruit Tiers. https://doi.org/10.17632/xpz3d7jhbp.1.
Rahnemoonfar, M., Sheppard, C., 2017. Deep count: fruit counting based on deep simulated learning. Sensors (Switzerland) 17 (4), 1–12. https://doi.org/10.3390/s17040905.
Ren, S., He, K., Girshick, R., Sun, J., 2015. Faster R-CNN: towards real-time object detection with region proposal networks. Adv. Neural Inf. Process. Syst. 91–99.
Santos Pereira, L.F., Barbon, S., Valous, N.A., Barbin, D.F., 2018. Predicting the ripening of papaya fruit with digital imaging and random forests. Comput. Electron. Agric. 145 (January), 76–82. https://doi.org/10.1016/j.compag.2017.12.029.
Savakar, D., 2012. Identification and classification of bulk fruits image using artificial neural networks. Int. J. Eng. Innov. Technol. 1 (3), 36–40.
Surya Prabha, D., Satheesh Kumar, J., 2015. Assessment of banana fruit maturity by image processing technique. J. Food Sci. Technol. 52 (3), 1316–1327. https://doi.org/10.1007/s13197-013-1188-3.
Teimouri, N., Omid, M., Mollazade, K., Rajabipour, A., 2014. A novel artificial neural networks assisted segmentation algorithm for discriminating almond nut and shell from background and shadow. Comput. Electron. Agric. 105, 34–43. https://doi.org/10.1016/j.compag.2014.04.008.
Thor, N., 2017. Applying machine learning clustering and classification to predict banana ripeness states and shelf life. Cloud Publ. Int. J. Adv. Food Sci. Technol. 2 (1), 20–25. Retrieved from http://scientific.cloud-journals.com/index.php/IJAFST/article/view/Sci-533.
Zhang, Y.D., Dong, Z., Chen, X., Jia, W., Du, S., Muhammad, K., Wang, S.H., 2019. Image based fruit category classification by 13-layer deep convolutional neural network and data augmentation. Multimed. Tools Appl. 78 (3), 3613–3632. https://doi.org/10.1007/s11042-017-5243-3.
Zhang, Y., Phillips, P., Wang, S., Ji, G., Yang, J., Wu, J., 2016. Fruit classification by biogeography-based optimization and feedforward neural network. Expert. Syst. 33 (3), 239–253. https://doi.org/10.1111/exsy.12146.