Machine vision-based automatic disease symptom detection of onion downy mildew

Wan-Soo Kim, Dae-Hyun Lee⁎, Yong-Joo Kim
Department of Biosystems Machinery Engineering, Chungnam National University, Daejeon 34134, Republic of Korea
⁎ Corresponding author. E-mail address: [email protected] (D.-H. Lee).
https://doi.org/10.1016/j.compag.2019.105099
Received 1 July 2019; Received in revised form 14 October 2019; Accepted 6 November 2019
0168-1699/ © 2019 Elsevier B.V. All rights reserved.
ARTICLE INFO

Keywords: Crop disease; Onion downy mildew; Monitoring system; Deep learning; Weakly supervised learning

ABSTRACT
Effective crop management is a major issue in modern agriculture because the cultivation area per farmer is increasing consistently while the labor force declines with aging. Managing crop cultivation effectively requires automatic monitoring of farmland. This paper presents an image-based field monitoring system for automatic crop monitoring; the work consists of constructing a field monitoring system for periodic capture of onion field images, training a deep neural network model to detect the disease symptom, and evaluating the performance of the developed system. The field monitoring system is composed of a PTZ camera, a motor system, a wireless transceiver, and an image logging module. The deep learning model was trained with a weakly supervised learning method that can classify and localize objects using only image-level annotation, which is effective for recognizing crop disease symptoms that have ambiguous boundaries. The model was trained on onion images captured by the field monitoring system, and 6 classes including the disease symptom were classified. The detected disease symptom was localized against the background by thresholding the class activation map, and 60% of the maximum value of the class activation map was determined to be the optimal threshold for disease symptom localization. Identification performance was evaluated using the mAP metric at several IoU criteria. The mAP at an IoU criterion of 0.5, which requires more than 50% overlap, was the highest for all models, ranging from 74.1 to 87.2. The results show that the developed field monitoring system can automatically detect onion disease symptoms in real time.
1. Introduction

Onion is one of the major functional foods for health. The global onion cultivation area increased from 2 million ha in 2000 to 5.4 million ha in 2016, while production increased from 60.5 million tons to 89.8 million tons over the same period (Hanci, 2018). As the demand for onion increases, large-scale cultivation areas should be managed efficiently. However, the management of these areas has suffered a great setback because of crop diseases driven by global warming and because of aging-related reductions in the labor force. In particular, more than 50% of yield loss is caused by disease damage (Harvey et al., 2014); for this reason, onion growth and disease monitoring is important for preventing disease damage and minimizing yield loss (Lu et al., 2017). Most disease monitoring in onion cultivation has been conducted manually by visual observation; this approach is not only inefficient, because the area per worker is too large, but also error-prone, because it depends on the skill and health of the workers (Bock et al., 2010). Therefore, an image-based, automated disease observation technology is needed to replace the conventional
method (Lee et al., 2016). Recently, computer vision systems with image processing technology have developed rapidly (Camargo and Smith, 2009), and machine learning approaches have enabled this technology to automatically recognize various objects in a manner similar to the human eye. In particular, deep learning, described as hierarchical learning with deep neural network layers (LeCun et al., 2015), is among the most effective techniques and has shown rapid progress in various intelligence tasks such as visual recognition (Krizhevsky et al., 2012; Simonyan and Zisserman, 2015), image captioning (Zhu et al., 2018), multi-image cued story generation (Kim et al., 2018), medical applications (Kourou et al., 2015), autonomous driving (Chen et al., 2018), and other similarly complex analyses with big data. It has also been applied in agriculture for the identification of crop diseases. Crop disease identification with deep learning has several advantages: the disease symptom can be separated from complex image backgrounds through trainable feature learning, multiple instances can be identified simultaneously, and low-cost field systems with robust crop monitoring become possible.
In this context, various deep-learning techniques have been applied to automatic diagnosis. Lu et al. (2017) exploited deep multiple instance learning to detect wheat diseases using the in-field wheat disease dataset 2017 (WDD2017) collected with mobile cameras, and the results showed an accuracy greater than 95%. Ferentinos (2018) studied automatic recognition of plant diseases on leaves using various types of deep learning models and compared their accuracy; an openly available database was used to train the models, and the best model reached a success rate of approximately 99%. Mohanty et al. (2016) also conducted plant disease classification based on a deep convolutional neural network and reported that the trained model could identify 26 diseases. In addition, more than 40 studies employing deep learning have already been conducted (Kamilaris et al., 2017) for purposes such as insect detection in stored grain (Shen et al., 2018), identification of inter-line weeds in unmanned aerial vehicle (UAV) images (Bah et al., 2018), and fruit localization and counting (Chen et al., 2017; Rahnemoonfar and Sheppard, 2017; Sa et al., 2016). Furthermore, several big data analysis practices in agriculture and other fields have recently been attempted to improve performance or to develop new applications. In general, previous studies have shown high detection performance. However, in most of them, the image data were collected by photographing a specific area (the diseased region) or were taken from an open database focused on the target (Ferentinos, 2018). Real-time images captured in the field may contain objects other than the crop area, such as weeds (Dyrmann et al., 2017), lawn (Armstrong, 2017), or other parts of the cultivation area, so it is difficult to apply a model trained on intentionally composed images to a field monitoring system that continuously observes unspecified scenes. In addition, creating a training dataset is costly, because most crop diseases are hard to diagnose accurately unless a pathologist or crop cultivation expert annotates the presence and region of disease in the images. Although a weakly supervised learning approach that can classify and localize objects with image-level annotation has been used to diagnose crop disease (Lu et al., 2017), the diseased area was only approximately localized, without contour verification, because of the ambiguous boundary. For these reasons, deep learning-based applications for disease diagnosis have had trouble being deployed in the field for unmanned monitoring. To overcome this problem, a low-cost, annotation-efficient approach that can be trained using real-time images captured in the field is needed. Real-time monitoring of suspected symptoms can be evaluated onsite by image-based deep learning more easily than diagnosis and can be applied to unmanned alarm systems for agricultural disease forecasting; such a system can reduce the labor force and manage crops efficiently.
Conventionally, monitoring of suspected symptoms has been conducted by a farmer or a trained forecaster. This approach is insufficient and inaccurate for monitoring a wide area with the same labor force (Chou et al., 2019) because it depends on the person's ability (skill, fatigue, etc.) and on environmental conditions (weather, obstacles, etc.); this makes consistent and accurate monitoring difficult and raises concern about the spread of disease by people moving through the field. In this study, to reduce the farmers' burden of disease forecasting and to minimize yield loss from disease infection through early detection of suspected disease symptoms, an automatic, consistent, image-based disease monitoring system that can be used onsite was developed. The purpose of the monitoring system is to consistently collect image data of onion cultivation by machine vision, to identify the disease symptom, described as the infected portion of the crop, using a deep neural network model, and to evaluate the system performance. Our work has two advantages: (1) it is a fully automated system that includes large-scale image capturing in real time for unmanned onion field monitoring and disease warning, and (2) a weakly supervised learning approach with localization using an optimal threshold was used, which can better handle the ambiguous boundary between diseased and healthy crop areas in an onion field.
Fig. 1. Disease monitoring system for onion cultivation.
2. Materials and methods

2.1. Field monitoring system

The field monitoring system for monitoring onion growth and disease symptoms, shown in Fig. 1, was developed to capture onion field images periodically in real time. The system consists of a pan-tilt-zoom (PTZ) camera (HDWC-S322MIR, Honeywell, USA) capable of high-resolution scanning and a motor system with a linear guide that controls the vertical axis according to crop height. In addition, a limit sensor was installed to restrict vertical movement and prevent damage from loss of control, and the system was covered to protect it against weather conditions such as rain, snow, or dust. The PTZ camera measures 201.83 mm (D) × 370.78 mm (H) and weighs 4.9 kg, as shown in Table 1. To capture images of the overall field as well as the growth of individual onions, a lens with 32× optical zoom (up to 512× combined with digital zoom) at full HD (1920 × 1080) resolution was used. The acquired images were transmitted through wireless Ethernet communication to the embedded image logging module and stored sequentially in memory, as shown in Fig. 1(b).
Each image captured and stored in real time is a large file, so the images were sampled into smaller sizes to construct the dataset for training the deep learning model for disease symptom identification. The servo motor (CSMT-01B, OEMax, Korea) of the motor system was selected according to the weight of the image capture system and provides 100 W of power at a rotation speed of 3000 rpm. The motor system can move vertically over a range of 550 mm along the linear guide and communicates with the wireless transceiver using RS-232. The motor control program, developed in C++, controls the rotation speed and operation time of the motor using signals transmitted remotely through the wireless transceiver. In addition, the PTZ camera and the motor system can be controlled automatically according to a predetermined sequence table so that onion growth and disease symptoms are monitored in real time, as outlined in the sketch below.
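As a rough illustration only, the following Python sketch shows how such a sequence table could drive one capture cycle; the table entries and the camera, motor, and logger interfaces are hypothetical assumptions of this sketch, and the actual control program in this study was written in C++.

```python
import time

# Hypothetical sequence table: (pan deg, tilt deg, zoom step, lift height mm).
# The entries below are illustrative; the actual table used in the field is not published.
SEQUENCE_TABLE = [
    (0,  -10, 4, 0),     # wide view of the whole plot
    (30, -15, 8, 150),   # zoomed view of a sub-plot
    (60, -15, 8, 300),
]

CAPTURE_INTERVAL_S = 30 * 60   # images were collected every 30 min between 07:00 and 19:00

def run_monitoring_cycle(camera, motor, logger):
    """Step through the sequence table once and log one image per preset."""
    for pan, tilt, zoom, height in SEQUENCE_TABLE:
        motor.move_to(height)                        # vertical axis on the linear guide (RS-232)
        camera.goto(pan=pan, tilt=tilt, zoom=zoom)   # PTZ preset
        time.sleep(2.0)                              # allow the camera to settle and refocus
        logger.save(camera.capture())                # store the full HD frame in the logging module

# while True:
#     run_monitoring_cycle(camera, motor, logger)
#     time.sleep(CAPTURE_INTERVAL_S)
```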
Table 1. Specifications of the PTZ camera (HDWC-S322MIR) used in this study.

Image sensor: 1/2.8″ Sony Exmor CMOS
Zoom: optical 32×, digital 16×
Focus: auto
Resolution: Full HD (1920 × 1080)
Frame rate: 1080p at 60 fps
Scanning system: progressive scan
Panning: 360° range, 380°/s maximum speed
Tilting: 200° range, 380°/s maximum speed
Power consumption: DC 12 V ± 10%, 3.0 A (36 W)
Size: 201.83 (D) × 370.78 (H) mm
Weight: 4.9 kg
2.2. Image data description

Identification of crop disease serves three purposes of measurement: incidence, severity, and yield loss (Agrios, 2005). Incidence is the proportion of infected plants, and severity describes the area that shows infected phenotypes. Yield loss indicates the decrease in crop yield due to quality damage that makes harvesting difficult. Disease incidence is in some cases impossible to identify with the human eye because of a latent period or invisible symptoms, and yield loss cannot be estimated from images alone. Therefore, onion disease was detected through machine vision by automating and mechanizing the visual observation of symptoms that can be seen with the naked eye. Downy mildew was considered the most serious onion disease, as it is a soil-borne infectious disease that causes serious yield loss mainly through damage to the leaves; thus, its rapid discovery is needed (Araújo et al., 2017). Generally, the latency of downy mildew is approximately 9–16 days (Maude, 1990), and its spread can be prevented once visual symptoms appear, by exposure to high temperatures or use of fungicide (Whiteman and Beresford, 1998). Therefore, it is very important to detect the initial visual symptoms of the disease (Buloviene and Surviliene, 2006). The symptoms begin as small spots and spread throughout the leaf; diseased leaves bend heavily, become twisted, and then turn yellowish and dry out (Araújo et al., 2017; Thakur and Mathur, 2002), as shown in Fig. 2. In this study, the criteria for recognizing downy mildew symptoms were the color and shape of heavily bent, twisted, yellowish, and dried-out leaves, and the symptom areas were annotated visually. These are the earliest visible symptoms of downy mildew, and farmers determine visually whether onions are infected with downy mildew based on them; it is therefore meaningful to detect these symptoms in real time using a field monitoring system. However, the purpose of this study is to automatically capture images for monitoring large-scale cultivation areas and to detect symptoms early, not to diagnose the disease.
Thus, identification of diseases with similar symptoms is out of the scope of this study, and the only observed symptoms considered were those of downy mildew. Distinguishing similar symptoms should be performed after the symptom outbreak warning and will be examined in future studies. In general, researchers develop their own image datasets for their research (Kamilaris et al., 2017), and these images can be biased toward the detection targets, which makes them hard to apply to automatic field monitoring (Jiang and Nachum, 2019), because images collected automatically through a field monitoring system show the overall scene of crop cultivation rather than specific diseased areas in the farmland. Each image contains a variety of classes, and their portions, perspectives, and shapes are not intentional. Thus, the image dataset collected automatically by the field monitoring system was used to identify symptoms of onion downy mildew. The captured images contain many kinds of classes, but only 6 major classes were used in this study, as shown in Fig. 3. For symptom detection in the onion cultivation area, the various information included in the image should be classified and the disease symptom should be separated within the crop area. For this reason, the collected and sampled images were classified into crop area and obstacle/background; the crop area was classified into normal growth, disease symptom, and lawn, and the obstacle/background was classified into worker, sign, and ground.
2.3. Disease symptom identification

An artificial neural network (ANN) is a machine learning method that mimics the structure of the human brain and is effective for nonlinear problem solving (Tealab et al., 2017; Zhong et al., 2018). Recently, computing speed has improved dramatically, and deep neural networks can be trained rapidly on a GPU using the convolutional neural network (CNN) structure. The CNN structure is particularly effective for extracting spatial features of images, and the feature extraction itself is learned (LeCun et al., 1998). It also works well for identifying objects with uncertain traits, such as crop phenotypes that vary with growth stage and species (Ferentinos, 2018; Grinblat et al., 2016; Mohanty et al., 2016). Thus, in this study, CNN-based deep learning models were used to detect the symptoms of downy mildew, together with transfer learning, which rapidly adapts a pretrained model to the target dataset and is effective for smaller datasets such as expert-labeled images (Ramcharan et al., 2017). The vgg16 architecture (Simonyan and Zisserman, 2014) was used as the base model because it is relatively lightweight and capable of high image-recognition performance. Object detection requires not only classification but also localization of objects in the image; this normally requires annotating the class and location of each object (object-level annotation) in addition to the image-level label. Object-level annotation is expensive and laborious, and annotating object locations is difficult. Previous approaches to crop disease detection trained models on such annotated image data (Ferentinos, 2018; Rahnemoonfar and Sheppard, 2017); however, this is not only costly but also yields less accurate ground truth because of the ambiguous boundary between the disease symptom and normal growth in the crop. Therefore, to identify the onion disease symptom, a weakly supervised learning method was used that can classify and localize objects with only image-level annotation, i.e., information about the existence of objects in the image (Zhou et al., 2016). The weakly supervised model connects the averaged values of the final feature maps output by the last convolutional layer to the classifier through the global average pooling (GAP) of Eq. (1), so that the importance of each feature map can be trained. Object classification is conducted using the class score of Eq. (2) and the softmax classifier of Eq. (3). The region of the classified object can be visualized using the class activation map (CAM), which is calculated by summing the feature maps multiplied by their class weights, as shown in Eq. (4) (Zhou et al., 2016).
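As an illustration of the structure described above, the following is a minimal PyTorch sketch of a vgg16 convolutional base followed by GAP and a linear classifier, roughly corresponding to model A in Table 2; the class count comes from Section 2.2, while the class name, head layout, and everything else here are assumptions of this sketch, not the authors' released code.

```python
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 6  # normal growth, disease symptom, lawn, worker, sign, ground

class Vgg16Gap(nn.Module):
    """vgg16 convolutional base followed by global average pooling (GAP) and a linear
    classifier so that class activation maps can be derived (Zhou et al., 2016)."""
    def __init__(self, num_classes=NUM_CLASSES):
        super().__init__()
        vgg = models.vgg16(pretrained=True)      # transfer learning from ImageNet weights
        self.features = vgg.features             # 13 convolutional layers with max pooling
        self.gap = nn.AdaptiveAvgPool2d(1)       # global average pooling, Eq. (1)
        self.classifier = nn.Linear(512, num_classes)  # class scores, Eq. (2)

    def forward(self, x):
        fmap = self.features(x)                         # (B, 512, H', W') feature maps
        pooled = self.gap(fmap).view(fmap.size(0), -1)  # (B, 512)
        scores = self.classifier(pooled)                # (B, num_classes)
        return scores, fmap                             # fmap is kept for CAM computation
```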
Fig. 2. Representative images for symptoms of onion downy mildew.
Fig. 3. Sample images for the major classes of onion cultivation monitoring.

G_k = \frac{1}{N} \sum_{x,y} f_k(x, y)    (1)

where G_k is the global average pooling value of the kth feature map, f_k(x, y) is the pixel value at the xth row and yth column of the kth feature map, and N is the total number of pixels in the feature map.

S_c = \sum_k \omega_k^c \cdot G_k    (2)

where S_c is the score of the cth class and \omega_k^c is the weight of the kth feature map for the cth class.

P_c = \frac{\exp(S_c)}{\sum_c \exp(S_c)}    (3)

where P_c is the probability of the cth class.

M_c(x, y) = \sum_k \omega_k^c \cdot f_k(x, y)    (4)

where M_c(x, y) is the activation value at pixel (x, y) for the cth class.
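To make Eqs. (1)–(4) concrete, the snippet below sketches how a CAM could be computed from the feature maps and classifier weights; it assumes the model interface of the sketch in Section 2.3 and is not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def class_activation_map(model, image):
    """Compute the CAM of Eq. (4) for the most probable class of a single image.
    `model` is assumed to return (class scores, last conv feature maps);
    `image` is a (C, H, W) tensor."""
    model.eval()
    with torch.no_grad():
        scores, fmap = model(image.unsqueeze(0))       # fmap: (1, K, H', W')
        probs = F.softmax(scores, dim=1)               # Eq. (3)
        c = int(probs.argmax(dim=1))                   # predicted class index
        weights = model.classifier.weight[c]           # w_k^c, shape (K,)
        cam = torch.einsum('k,khw->hw', weights, fmap[0])   # Eq. (4)
        cam = F.interpolate(cam[None, None], size=image.shape[1:],
                            mode='bilinear', align_corners=False)[0, 0]
    return c, float(probs[0, c]), cam
```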
Since weakly supervised learning uses image-level labels, the annotation labor is reduced, but a method of determining the area of the detected object is needed. Most weakly supervised learning-based approaches use experimental binary thresholds and contour extraction for localization. In agriculture, the weakly supervised learning method was first proposed for a wheat disease diagnosis system (Lu et al., 2017), whose authors reported that 7 different classes, including healthy wheat, could be localized precisely for the corresponding disease region using the general localization method mentioned above. In this study, however, images were captured from a large-scale cultivation area, whereas Lu et al. (2017) collected images focused on the disease area; thus, the ambiguity between diseased and healthy areas is greater because of the lower resolution. Therefore, the minimum area with a high region match between the thresholded CAM and the ground truth was determined as the disease symptom area, as shown in Fig. 4. In detail, the disease symptom location is obtained by thresholding the pixels of the class activation map, and a maximum bounding box containing all of the pixels that pass the threshold is created. To determine the optimal threshold for localizing the crop disease area, the intersection over union (IoU) was evaluated as the threshold level was changed; the threshold level was set to 50, 60, 70, 80, and 90% of the maximum value of the CAM in the experiments. The IoU is a metric for evaluating the similarity between the predicted bounding box and the ground truth, as shown in Eq. (5).

\mathrm{IoU} = \frac{\text{Area of overlap}}{\text{Area of union}}    (5)

The ground truth of the disease symptom area was segmented for each image in the test set by onion pathology experts and expressed as a bounding box, namely the smallest rectangle containing the disease area segmented by the experts in the red, green, and blue (RGB) images of the test set. To evaluate the IoU by threshold, one-way ANOVA and the least significant difference (LSD) test, with the threshold value and the model type as factors, were conducted with SAS (version 9.1, SAS Institute, Cary, USA).
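The thresholding, bounding-box, and IoU steps described above can be sketched as follows, assuming the CAM is available as a NumPy array; the function names are illustrative.

```python
import numpy as np

def cam_to_bbox(cam, ratio=0.6):
    """Threshold the CAM at `ratio` of its maximum value (60% was selected in this study)
    and return the maximum bounding box (x0, y0, x1, y1) containing all passing pixels."""
    cam = np.asarray(cam)
    mask = cam >= ratio * cam.max()
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

def iou(box_a, box_b):
    """Intersection over union of two boxes, Eq. (5)."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    iw = max(0, min(ax1, bx1) - max(ax0, bx0))
    ih = max(0, min(ay1, by1) - max(ay0, by0))
    inter = iw * ih
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0
```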
2.4. Model training and performance test

Image data were collected from the onion cultivation test bed of the National Institute of Crop Science, Muan, South Jeolla Province. Onion was sown in early November, and images were taken with the field monitoring system from April 20 to May 20, 2017. The field monitoring system was installed at a distance of 10 m from the cultivation area, the minimum distance from which the entire test bed area could be covered, and images were collected every 30 min from 07:00 to 19:00. The soil temperature of the onion field was 25–28 °C during this period. Each captured raw image provides a large-scale, high-resolution view of the crop cultivation area, and smaller images divided from the raw image were used for training the model to reduce memory cost. The captured images were cropped into smaller images of 224 × 224 pixels to construct the dataset for model training (Krizhevsky et al., 2012). Image cropping was performed by extracting the central area of 3000 × 2000 pixels and then cropping 224 × 224 pixel windows, sliding 100 pixels to the right and downward from the top-left corner (McCool et al., 2017); a sketch of this procedure is given below. Training of the deep neural network (DNN) models for disease symptom identification was implemented in Python 3.6 and PyTorch 0.4, and the central processing unit (CPU) and GPU used for training were an i7-8700K and a Titan V, respectively. Based on the vgg16 architecture, four models (types A, B, C, and D) were constructed by modifying the network layers to compare the effects of the added layers, as shown in Table 2, and the performance of each model was compared. The learning rate was set to 0.005 considering the dataset size and mini-batch size, and performance was validated using the k-fold validation method to enhance the generalization performance of the models. To compare the performance of the models statistically, one-way ANOVA and the least significant difference (LSD) test, with model type as the factor, were conducted with SAS (version 9.1, SAS Institute, Cary, USA).
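A minimal sketch of the sliding-window cropping, assuming Pillow for image handling; the center-crop size, patch size, and stride follow the values reported above.

```python
from PIL import Image

def crop_patches(img_path, patch=224, stride=100, center_w=3000, center_h=2000):
    """Extract the central 3000 x 2000 region of a captured field image and slide a
    224 x 224 window over it with a 100-pixel stride, as described above."""
    img = Image.open(img_path)
    w, h = img.size
    left, top = (w - center_w) // 2, (h - center_h) // 2
    center = img.crop((left, top, left + center_w, top + center_h))
    patches = []
    for y in range(0, center_h - patch + 1, stride):
        for x in range(0, center_w - patch + 1, stride):
            patches.append(center.crop((x, y, x + patch, y + patch)))
    return patches
```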
Fig. 4. Process of model training and determination of the disease symptom area.
The mean average precision (mAP) was used as the metric for evaluating the identification performance for the onion disease symptom; it is commonly used to evaluate classification performance in object detection work (Shen et al., 2018; Ramcharan et al., 2017; Zhou et al., 2016). The mAP is the mean of the average precision (AP) of each class; AP is calculated from the recall (r) and precision (p) of Eqs. (6) and (7) and is expressed by Eq. (8) (Shen et al., 2018). A region is selected as a disease symptom when its IoU is higher than the criterion value. The performance evaluation was conducted at 5 criterion levels of IoU between 0.5 and 0.9. Because the criterion uses IoU, the higher the criterion value, the lower the probability that a region will be selected as the symptom region.
r = \frac{\text{number of correct detections}}{\text{total number of objects}}    (6)

p = \frac{\text{number of correct detections}}{\text{total number of detections}}    (7)

\mathrm{AP} = \int p \, dr    (8)
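To illustrate Eqs. (6)–(8), the sketch below computes AP for one class as the area under the precision–recall curve; the input format and the rectangle-rule integration are assumptions of this sketch, not the exact evaluation code used in the study.

```python
def average_precision(detections, num_objects, iou_criterion=0.5):
    """Approximate AP (Eq. 8) as the area under the precision-recall curve.
    `detections` is a list of (confidence, iou_with_ground_truth) tuples for one class;
    a detection is correct when its IoU exceeds the criterion (Eqs. 6 and 7)."""
    detections = sorted(detections, key=lambda d: d[0], reverse=True)
    correct, ap, prev_recall = 0, 0.0, 0.0
    for rank, (_, iou_value) in enumerate(detections, start=1):
        if iou_value >= iou_criterion:
            correct += 1
        recall = correct / num_objects       # Eq. (6)
        precision = correct / rank           # Eq. (7)
        ap += precision * (recall - prev_recall)   # rectangle rule for Eq. (8)
        prev_recall = recall
    return ap

# The mAP reported in Table 6 is the mean of AP over the six classes.
```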
Table 2. Modified models based on the vgg16 architecture for disease symptom identification.

Base configuration (all models), input: RGB (224 × 224 × 3)
  3 × 3 conv, 64 (×2); max pooling
  3 × 3 conv, 128 (×2); max pooling
  3 × 3 conv, 256 (×3); max pooling
  3 × 3 conv, 512 (×3); max pooling
  3 × 3 conv, 512 (×3)

Additional layers (all models end with global average pooling and a soft-max classifier)
  Model A (13 weight layers): –
  Model B (15 weight layers): FC 2048; FC 2048
  Model C (17 weight layers): max pooling; 3 × 3 conv, 512 (×4)
  Model D (19 weight layers): max pooling; 3 × 3 conv, 512 (×4); FC 2048; FC 2048

3. Results and discussion

3.1. Image data collection
The results of the field monitoring system development and captured sample images are shown in Fig. 5. The images collected by the system were transmitted consistently to the image logging module, and the captured large-scale cultivation images were cropped into 224 × 224 RGB images and stored. Images were taken automatically at regular intervals. Image collection was conducted under various zoom conditions to determine the zoom value suitable for disease monitoring; the zoom value that could capture the largest area in which the disease symptoms could still be identified was selected experimentally. The right side of Fig. 5 shows the disease symptoms and normal growth in the top and bottom images, respectively; the disease symptoms in the image can be discerned by the human eye. The number of images captured automatically was 584, each of 4000 × 3000 pixels with RGB channels, and each image was cropped to construct the dataset for model training.
Fig. 5. Field monitoring system (left) and sample images (right) for disease symptom (right-top) and normal growth (right-bottom).
A total of 12,813 cropped images were obtained; each image was annotated at the image level and assigned to the training, validation, or test dataset. The results of the data split are shown in Table 3. The ratio of the training, validation, and test datasets was set to 5:3:2, following general practice (Oliveira et al., 2018). The number of images per class was highest for the normal growth class at 4209 (33%) and lowest for the sign class at 789 (6%). The images in the dataset are real field data collected by automatic field monitoring and show no characteristics specific to the detection target. In addition, this demonstrates that a field dataset for deep learning can be generated automatically through the field monitoring system.
Table 3. Dataset composition of the training, validation, and test sets by class.

                          Training set (50%)   Validation set (30%)   Test set (20%)   Total
Crop area
  Normal growth           2104                 1263                   842              4209
  Disease symptom         1641                 985                    656              3282
  Lawn                    1093                 655                    437              2185
Obstacle or background
  Worker                  468                  281                    187              936
  Sign                    394                  237                    158              789
  Ground                  706                  424                    282              1412
Total                     6406                 3845                   2562             12,813
3.2. Results of the model training

Fig. 6 shows the loss curves by epoch, i.e., by the number of training iterations. Each graph shows the loss for the training set and the validation set during repeated learning. The loss curve for the training set decreases continuously with repetition and gradually converges to zero. The validation loss decreases and then increases gradually, with fluctuation, after a specific epoch, which shows that the model becomes overfitted to the training set beyond that point (the overfitting phenomenon).
Fig. 6. Loss curves for model training by epoch.
The stop condition of the model learning was determined considering both training accuracy and generalization performance. Thus, the epoch showing the minimum validation loss was selected as the stop condition to avoid the overfitting caused by too many iterations (Raskutti et al., 2014), and the weights at that epoch were used as the model parameters; a sketch of this early-stopping rule is given below. Model D, which has 19 network layers, had its lowest validation loss at about epoch 50 and shows relatively high performance compared with the other models. Table 4 shows the results of training the models; each value is the average over the k-fold configurations at the stop condition. The accuracies of the models were in the range of 85–90% for the training set and 87–89% for the validation set. The validation accuracy differed by less than approximately 5% from that of the wheat disease diagnosis study (Lu et al., 2017), although the models here were trained using real-time images that were not focused on the disease area. For model A, the validation accuracy was higher than the training accuracy, which may result from underfitting of the training set due to stopping the training too early. The test results show that accuracy was highest for model D at 87.9%, similar to model C. From these results, models C and D perform better than models A and B; the reason is that the convolution layers added in models C and D enhance the networks' feature extraction ability, making the models more suitable for object classification in agricultural images. There was no significant difference between models A and B, or between models C and D, which indicates that adding fully connected layers does not significantly change the performance of feature extraction for agricultural images. These results suggest that providing more capacity for image feature extraction is advantageous for classifying the patterns of onion disease symptoms.
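A minimal sketch of the stop-condition (early-stopping) rule described above; `train_one_epoch` and `validate` are placeholders for the actual training and validation passes, which are not published.

```python
import copy

def train_with_early_stopping(model, train_one_epoch, validate, max_epochs=60):
    """Keep the weights from the epoch with the lowest validation loss, which is how
    the stop condition was chosen here (e.g., around epoch 50 for model D)."""
    best_loss, best_state, best_epoch = float('inf'), None, 0
    for epoch in range(1, max_epochs + 1):
        train_one_epoch(model)          # one pass over the training set
        val_loss = validate(model)      # average loss on the validation set
        if val_loss < best_loss:
            best_loss, best_epoch = val_loss, epoch
            best_state = copy.deepcopy(model.state_dict())
    model.load_state_dict(best_state)
    return model, best_epoch
```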
Table 4. Results of training the models and comparison of accuracy (%).

Model    Epochs    Training         Validation       Test
A        7 ± 2     85.9 ± 1.92 a    87.4 ± 3.06 a    79.3 ± 3.21 b
B        20 ± 2    88.5 ± 2.33 a    88.2 ± 2.97 a    80.1 ± 2.53 b
C        26 ± 2    88.9 ± 2.16 a    88.4 ± 3.14 a    86.3 ± 2.71 a
D        49 ± 3    90.8 ± 1.98 a    89.1 ± 2.22 a    87.9 ± 3.05 a

(1) Values are average ± standard deviation. (2) Means with different superscripts (a, b) within each column are significantly different at p < 0.05 by the LSD multiple range test.
As shown in Fig. 7, CAMs are presented for the 6 major classes; each CAM was generated using the test set. A heat map, a graphical representation of data as colors, was used to visualize the feature activation: the closer the color is to red, the higher the activation of the class characteristic in that region, and the bluer the color, the lower the relevance of the region. It was possible to extract features for the characteristics of the 6 major classes and to distinguish the various objects, including the difference between normal growth and disease symptoms, in the onion cultivation field. Using these results, the crop area was separated from the images and the disease symptom region was localized precisely (Zhou et al., 2016).
Fig. 7. Representative recognition result of each class and visualization using the class activation map.
3.3. Disease symptom identification

Table 5 shows the regional detection performance for the disease symptom area by threshold level, evaluated for each model using the IoU; the IoU values are averages over repeated experiments. As the results show, the 60% threshold level gave the highest IoU in most cases: the average IoU was 0.76, 0.80, 0.75, 0.62, and 0.41 for threshold levels of 50, 60, 70, 80, and 90%, respectively. At the 60% threshold level, model D had the highest IoU at 0.84, and the other models had similar but lower IoU at a significance level of 5%. Comparing the IoU by threshold level and model, the IoU decreased sharply when the threshold level was higher than 60%, and the lowest IoU was observed at the 90% threshold level, which is the hardest level for the CAM pixels to pass; in particular, the lowest value of 0.27 occurred for model B. For models B and C, the maximum IoU occurred at the 70% threshold, but it was not significantly different from the IoU at the 60% threshold level.
Therefore, all models have their maximum IoU at the same threshold level, and the threshold level for disease symptom localization was concluded to be 60%. Additionally, a bounding box was generated from the CAM values with the 60% threshold level, and the generated bounding box was evaluated using the IoU metric. This is a significant result that can contribute to detecting disease symptom regions, which have ambiguous boundaries, in the agricultural field. In past studies, approximate bounding boxes for the disease were generated using contour detection without analyzing the disease symptom activation (Lu et al., 2017). The IoU of the bounding box created by the threshold analysis is more than 0.7, which is the IoU criterion used for classification in previous research (Shen et al., 2018), so it can be used for disease region localization. Fig. 8 shows representative results of regional detection of the disease symptom for model D. Fig. 8(a) shows the CAM visualized as a heat map, and Fig. 8(b) shows the bounding box determined by thresholding the CAM values with the selected 60% threshold level; the yellow bounding box is the disease area predicted by the developed system, and the green one is the ground truth labeled by experts. Although performance differs somewhat depending on the image capture conditions, in most cases the ground truth and the predicted region have similar locations with high overlap. The system aims to detect regions suspected of showing the disease symptom from RGB images, and the automatically detected disease symptom areas were ones visible to the human eye. Identification performance could be improved to the diagnosis level by collecting images of actually infected onions with high-reliability annotations of the disease class and its precise location.
Table 6 shows the results of the performance evaluation for disease symptom identification using the mAP metric at several IoU criteria. The IoU criterion indicates the minimum overlap required for selection as a disease symptom, so the higher the IoU criterion, the more difficult it becomes for a region to be selected as a disease symptom. The mAP at an IoU criterion of 0.5, which requires more than 50% overlap, was the highest for all models, ranging from 74.1 to 87.2. At an IoU criterion of 0.9, which requires more than 90% overlap, the mAP was the lowest, in the range of 27–30. As the IoU criterion increased from 0.5 to 0.6, the mAP decreased by 6–8%, but it decreased by more than 25% when the criterion increased from 0.6 to 0.7. Therefore, the range of 0.5–0.6 is considered suitable for selecting the IoU criterion to improve the reliability of the identification system for onion disease symptoms. Although the models were trained using real-time images captured automatically by the field monitoring system, the mAP reached 87.2 for model D, about 98–120% of that of previous research in the agricultural field (Ramcharan et al., 2017; Shen et al., 2018). Therefore, automatic monitoring and detection of disease symptoms using the developed system is possible.
Table 5. Regional detection performance for the disease symptom area by threshold level, using the IoU metric.

         Threshold level (ratio of the maximum value of the class activation map)
Model    50%                60%                70%                80%                90%
A        0.74 ± 0.019 Bb    0.78 ± 0.017 Ba    0.60 ± 0.015 Bc    0.58 ± 0.019 Cc    0.39 ± 0.018 Cd
B        0.72 ± 0.014 Cb    0.78 ± 0.019 Ba    0.79 ± 0.017 Aa    0.61 ± 0.022 Bc    0.27 ± 0.018 Cd
C        0.76 ± 0.020 Bb    0.79 ± 0.018 Ba    0.81 ± 0.020 Aa    0.57 ± 0.019 Cc    0.40 ± 0.014 Bd
D        0.80 ± 0.021 Ab    0.85 ± 0.022 Aa    0.80 ± 0.021 Ab    0.72 ± 0.016 Ac    0.58 ± 0.016 Ad
Average  0.76               0.80               0.75               0.62               0.41

(1) Values are average ± standard deviation. (2) Means with different superscripts (A, B, C, D) within each column are significantly different at p < 0.05 by the LSD multiple range test. (3) Means with different superscripts (a, b, c, d, e) within each row are significantly different at p < 0.05 by the LSD multiple range test.

Table 6. Identification performance for the disease symptom using the mAP (%) by IoU criterion.

         IoU criterion
Model    0.5     0.6     0.7     0.8     0.9
A        75.0    68.2    54.5    31.0    26.2
B        74.1    69.0    50.1    34.5    25.8
C        81.8    77.8    56.9    42.5    26.1
D        87.2    83.1    62.4    39.1    30.8
Fig. 8. Representative results of the class activation map (top) and onion disease symptom localization (bottom). Yellow: determined area; green: ground truth. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
4. Conclusions

This study was conducted to develop an onsite, real-time, automatic disease monitoring system for early detection of suspected disease symptoms. Our research focused on full automation of the process from real-time image capture through disease symptom detection for unmanned yield forecasting. Deep learning-based approaches for disease detection have been proposed, and their results show that diseases can be detected automatically with high accuracy. However, these studies focused on disease diagnosis in captured images, and their use for automatic disease monitoring is limited because the images were captured manually, which biases them toward the disease area, and because the target was a very specific, small area compared with the entire field. In addition, crop disease cannot be diagnosed by image-based area detection alone without expert confirmation; it is therefore more useful to provide a warning through large-scale automatic monitoring and identification of suspected symptoms. Our system can monitor a wide area and capture images periodically, and downy mildew in onions can be detected using a deep learning-based approach. The highest performance of the fully automated disease monitoring showed an mAP at an IoU criterion of 0.5 in the range of 74.1 to 87.2, which is similar to the performance of previous studies, although unspecific field images were used. We also exploited a weakly supervised learning method to train the system on the disease area, and localization was conducted with an optimal threshold selected through statistical analysis of the IoU metric as the threshold level changed. A threshold level of 60% was verified to give the highest IoU in most cases, which can contribute to determining the ambiguous boundary between diseased and healthy crops. The area match between the predicted and real values was higher than 70% using the selected threshold; therefore, this system can be used for crop disease region localization.
Some of the captured images could not be used to detect the disease area because of low illumination, and it was also difficult to distinguish similar early symptoms of downy mildew from those of other diseases. Although the purpose of this study is early warning, optimal zoom selection according to illuminance and the collection of various images that target disease-specific phenotypes are needed for practical unmanned monitoring. Additionally, early warning of symptoms is possible with the current system, but it is difficult to provide quantitative severity because of the non-diagnostic level of labelling and the wide field of view with low resolution. Thus, criteria or references for the warning level of the current images and more precise localization techniques are needed to support quantification of disease severity, which would allow an efficient strategy for crop disease management to be established. Overall, the developed system can be used onsite and can contribute to saving time and costs in disease forecasting through unmanned surveillance with reliable observation. In addition, the system is feasible for minimizing infection damage by detecting symptoms quickly and providing warnings about disease risk. Furthermore, it will be possible to improve diagnostic accuracy at the onsite level by enhancing model performance and using various imaging data, including the accurate location of the infected area; thus, the developed system can contribute to realizing unmanned agriculture.
Declaration of Competing Interest

The authors declare that there is no conflict of interest.
Acknowledgments

This work was supported by the research fund of Chungnam National University (2018-0608-01).

References

Agrios, G.N., 2005. Plant Pathology, fifth ed. Elsevier/Academic, Amsterdam.
Araújo, E.R., Alves, D.P., Knoth, J.R., 2017. Weather-based decision support reduces the fungicide spraying to control onion downy mildew. Crop Prot. 92, 89–92. https://doi.org/10.1016/j.cropro.2016.10.022.
Armstrong, C., 2017. Using Imagery of Lawns to Estimate Lawn Care Need. Tech. Discl. Commons.
Bah, M.D., Hafiane, A., Canals, R., 2018. Deep learning with unsupervised data labeling for weeds detection on UAV images. arXiv preprint.
Bock, C.H., Poole, G.H., Parker, P.E., Gottwald, T.R., 2010. Plant disease severity estimated visually, by digital photography and image analysis, and by hyperspectral imaging. CRC Crit. Rev. Plant Sci. 29, 59–107. https://doi.org/10.1080/07352681003617285.
Buloviene, V., Surviliene, E., 2006. Effect of environmental conditions and inocolum concentration on sporulation of Peronospora destructor. Agron. Res. 4, 147–150.
Camargo, A., Smith, J.S., 2009. An image-processing based algorithm to automatically identify plant disease visual symptoms. Biosyst. Eng. 102, 9–21. https://doi.org/10.1016/j.biosystemseng.2008.09.030.
Chen, S.W., Shivakumar, S.S., Dcunha, S., Das, J., Okon, E., Qu, C., Taylor, C.J., Kumar, V., 2017. Counting apples and oranges with deep learning: a data-driven approach. IEEE Robot. Autom. Lett. 2, 781–788. https://doi.org/10.1109/LRA.2017.2651944.
Chen, Y., Zhao, D., Lv, L., Zhang, Q., 2018. Multi-task learning for dangerous object detection in autonomous driving. Inf. Sci. (Ny) 432, 559–571. https://doi.org/10.1016/j.ins.2017.08.035.
Chou, W.C., Tsai, W.R., Chang, H.H., Lu, S.Y., Lin, K.F., Lin, P., 2019. Prioritization of pesticides in crops with a semi-quantitative risk ranking method for Taiwan postmarket monitoring program. J. Food Drug Anal. 27, 347–354. https://doi.org/10.1016/j.jfda.2018.06.009.
Dyrmann, M., Jørgensen, R.N., Midtiby, H.S., 2017. RoboWeedSupport - detection of weed locations in leaf occluded cereal crops using a fully convolutional neural network. Adv. Anim. Biosci. 8, 842–847. https://doi.org/10.1017/s2040470017000206.
Ferentinos, K.P., 2018. Deep learning models for plant disease detection and diagnosis. Comput. Electron. Agric. 145, 311–318. https://doi.org/10.1016/j.compag.2018.01.009.
Grinblat, G.L., Uzal, L.C., Larese, M.G., Granitto, P.M., 2016. Deep learning for plant identification using vein morphological patterns. Comput. Electron. Agric. 127, 418–424. https://doi.org/10.1016/j.compag.2016.07.003.
Hanci, F., 2018. A comprehensive overview of onion production: worldwide and Turkey. J. Agric. Vet. Sci. 11, 17–27. https://doi.org/10.9790/2380-1109011727.
Harvey, C.A., Rakotobe, Z.L., Rao, N.S., Dave, R., Razafimahatratra, H., Rabarijohn, R.H., Rajaofara, H., Mackinnon, J.L., 2014. Extreme vulnerability of smallholder farmers to agricultural risks and climate change in Madagascar. Philos. Trans. R. Soc. B 369, 1–23. https://doi.org/10.1098/rstb.2013.0089.
Jiang, H., Nachum, O., 2019. Identifying and Correcting Label Bias in Machine Learning. arXiv e-print.
Kamilaris, A., Kartakoullis, A., Prenafeta-Boldú, F.X., 2017. A review on the practice of big data analysis in agriculture. Comput. Electron. Agric. 143, 23–37. https://doi.org/10.1016/j.compag.2017.09.037.
Kim, T.H., Heo, M.O., Son, S.I., Park, K.W., Zhang, B.T., 2018. GLAC Net: GLocal Attention Cascading Networks for Multi-image Cued Story Generation. arXiv preprint.
Kourou, K., Exarchos, T.P., Exarchos, K.P., Karamouzis, M.V., Fotiadis, D.I., 2015. Machine learning applications in cancer prognosis and prediction. Comput. Struct.
Biotechnol. J. 13, 8–17. https://doi.org/10.1016/j.csbj.2014.11.005. Krizhevsky, A., Sutskever, I., Hinton, G.E., 2012. Imagenet classification with deep convolutional neural networks alex. Adv. Neural Inf. Process. Syst. 1–9. https://doi.org/ 10.1201/9781420010749. LeCun, Y., Bengio, Y., Hinton, G., 2015. Deep learning. Nature 521, 436–444. https://doi. org/10.1038/nature14539. LeCun, Y., Bottou, L., Bengio, Y., Haffner, P., 1998. Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278–2323. https://doi.org/10.1109/5. 726791. Lee, D.H., Kim, Y.J., Choi, C.H., Chung, S.O., Nam, Y.S., So, J.H., 2016. Evaluation of operator visibility in three different cabins type far-east combine harvesters. Int. J. Agric. Biol. Eng. 9, 33–44. https://doi.org/10.3965/j.ijabe.20160904.1850. Lu, J., Hu, J., Zhao, G., Mei, F., Zhang, C., 2017. An in-field automatic wheat disease diagnosis system. Comput. Electron. Agric. 142, 369–379. https://doi.org/10.1016/j. compag.2017.09.012. McCool, C., Perez, T., Upcroft, B., 2017. Mixtures of lightweight deep convolutional neural networks: applied to agricultural robotics. IEEE Robot. Autom. Lett. 2, 1344–1351. https://doi.org/10.1109/lra.2017.2667039. Maude, R.B., 1990. Leaf diseases of onions. In: Rabinowitch, H.D., Brewster, J.L. (Eds.), Onion and Allied Crops. CRC, Boca Raton, pp. 173–212. Mohanty, S.P., Hughes, D.P., Salathé, M., 2016. Using deep learning for image-based plant disease detection. Front. Plant Sci. 7. https://doi.org/10.3389/fpls.2016. 01419. Oliveira, I., Cunha, R.L.F., Silva, B., Netto, M.A.S., 2018. A scalable machine learning system for pre-season agriculture yield forecast. In: Proc. - IEEE 14th Int. Conf. eScience, pp. 423–430. Rahnemoonfar, M., Sheppard, C., 2017. Deep count: fruit counting based on deep simulated learning. Sensors 17, 1–12. https://doi.org/10.3390/s17040905. Ramcharan, A., Baranowski, K., McCloskey, P., Ahmed, B., Legg, J., Hughes, D., 2017. Using transfer learning for image-based cassava disease detection. Front. Plant Sci. 8, 1–7. https://doi.org/10.3389/fpls.2017.01852. Raskutti, G., Wainwright, M.J., Yu, B., 2014. Early stopping and non-parametric regression: an optimal data-dependent stopping rule. J. Mach. Learn. Res. 15, 335–366. Sa, I., Ge, Z., Dayoub, F., Upcroft, B., Perez, T., McCool, C., 2016. Deepfruits: a fruit detection system using deep neural networks. Sensors 16, 1–23. https://doi.org/10. 3390/s16081222. Shen, Y., Zhou, H., Li, J., Jian, F., Jayas, D.S., 2018. Detection of stored-grain insects using deep learning. Comput. Electron. Agric. 145, 319–325. https://doi.org/10. 1016/j.compag.2017.11.039. Simonyan, K., Zisserman, A., 2015. Very Deep Convolutional Networks for Large-Scale Image Recognition. Proc. Int. Conf. Learn. Represent., pp. 1–14. Simonyan, K., Zisserman, A., 2014. Two-stream convolutional networks for action recognition in videos karen. Adv. Neural Inf. Process. Syst. 568–576. https://doi.org/ 10.1016/0006-2952(83)90587-7. Tealab, A., Hefny, H., Badr, A., 2017. Forecasting of nonlinear time series using ANN. Futur. Comput. Informatics J. 2, 39–47. https://doi.org/10.1016/j.fcij.2017.05.001. Thakur, R.P., Mathur, K., 2002. Downy mildews of India. Crop Prot. 21, 333–345. https:// doi.org/10.1016/S0261-2194(01)00097-7. Whiteman, S.A., Beresford, R.M., 1998. Evaluation of an onion downy mildew disease forecaster in New Zealand. In: Proc. 51st New Zeal. Plant Prot. Conf., pp. 117–122. Zhong, Y., Ma, A., Ong, Y., Zhu, Z., Zhang, L., 2018. 
Computational intelligence in optical remote sensing image processing. Appl. Soft Comput. J. 64, 75–93. https://doi.org/ 10.1016/j.asoc.2017.11.045. Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., Torralba, A., 2016. Learning deep features for discriminative localization. In: Proc. IEEE Conf. Comput. Vis. Pattern Recognit., pp. 2921–2929. Zhu, X., Li, L., Liu, J., Li, Z., Peng, H., Niu, X., 2018. Image captioning with triple-attention and stack parallel LSTM. Neurocomputing 319, 55–65. https://doi.org/10. 1016/j.neucom.2018.08.069.