Computers and Electronics in Agriculture 100 (2014) 160–167
Estimating mango crop yield using image analysis using fruit at 'stone hardening' stage and night time imaging

A. Payne a, K. Walsh b,*, P. Subedi b, D. Jarvis a

a Central Queensland University, Centre for Intelligent and Networked Systems, Bruce Highway, Rockhampton, Queensland 4701, Australia
b Central Queensland University, Centre for Plant and Water Science, Bruce Highway, Rockhampton, Queensland 4701, Australia
Article history: Received 4 June 2013; Received in revised form 8 October 2013; Accepted 23 November 2013

Keywords: Applied image processing; Colour segmentation; Fruit; Automated counting; Texture segmentation
Abstract

This paper extends a previous study on the use of image analysis to automatically estimate mango crop yield (fruit on tree) (Payne et al., 2013). Images were acquired at night, using artificial lighting of fruit at an earlier stage of maturation ('stone hardening' stage) than for the previous study. Multiple image sets were collected during the 2011 and 2012 seasons. Despite altering the settings of the filters in the algorithm presented in the previous study (based on colour segmentation using RGB and YCbCr, and texture), the less mature fruit were poorly identified, due to a lower extent of red colouration of the skin. The algorithm was altered to reduce its dependence on colour features and to increase its use of texture filtering, hessian filtering in particular, to remove leaves, trunk and stems. Results on a calibration set of images (2011) were significantly improved, with 78.3% of fruit detected, an error rate of 10.6% and an R2 value (machine vision to manual count) of 0.63. Further application of the approach on validation sets from 2011 and 2012 had mixed results, with issues related to variation in foliage characteristics between sets. It is proposed the detection approaches within both of these algorithms be used as a 'toolkit' for a mango detection system, within an expert system that also uses user input to improve the accuracy of the system.

© 2013 Published by Elsevier B.V.
1. Introduction

The number of fruit on trees in a mango orchard is currently estimated via a manual count of a small number of trees to predict resource requirements for harvest, and to arrange marketing. Crop load is usually estimated at the stone hardening stage of fruit development, about six weeks prior to harvest, as at this stage the majority of fruit drop (fruit self-thinning) has occurred, and fruit numbers will remain relatively constant until harvest. However, fruit colouration (anthocyanin synthesis, chlorophyll breakdown) develops with further maturation on the tree. In a companion study (Payne et al., 2013) we report the development of an algorithm for identification of fruit within images of a mango canopy, based on the use of an RGB camera in overcast daylight conditions, and the use of filters on shape, texture and colour. However, that study dealt with fruit at close to harvest maturity, and consequently with more, and more consistent, 'blush' (red colouration) on the fruit than is found at the stone hardening stage. At stone hardening, fruit may be half green and half pale orange, or all green. It is likely that such image characteristics will require a segmentation approach that downplays reliance on a colour filter

* Corresponding author. Tel.: +61 7 49309707; fax: +61 7 49306536. E-mail address: [email protected] (K. Walsh).
http://dx.doi.org/10.1016/j.compag.2013.11.011
(e.g. the Normalised Difference Index used in Payne et al., 2013), and increases the weighting on other features, such as texture and edge detection. For example, the border limited mean filter (Image J Wiki, 2011) calculates the average gray scale value over n × m pixels, and has been used to filter for the consistent general colour within the mango fruit, rather than for a distinct pixel colour. A hessian filter allows for discrimination between blob-, plate- and line-like structures. This filter has been used in medical imaging where tube-like structures, e.g. veins, need to be identified (Foruzan et al., 2012). In the current application, it may serve to discriminate between line-like leaves and stems, and oval mangoes.

2. Materials and methods

2.1. Imaging hardware

Each image set was collected using the same camera but an improved mounting and lighting system over that used in the previous work (Payne et al., 2013). That work showed that diffuse light conditions were optimal for this computer vision application. With this in mind, we chose in the current work to image at night under artificial lighting, an approach that makes it possible to consistently recreate the diffuse conditions of daytime imaging. The camera, a Canon 50D SLR camera
with the standard kit 28–135 mm IS zoom lens, was mounted on a frame which held four 6 × 3 W LED spotlights (SCA P/L) at a distance of 1 m from the camera (positioned above, below, and to the left and right of the camera). The frame was mounted to the tray of a utility vehicle, and a sonar activated sensor was mounted at 0.6 m height on a bar extending 0.5 m towards the tree line, set to trigger off the passing of a tree trunk. The camera could also be triggered from the front seat of the vehicle without using the sonar. Fig. 1 shows the rig used in 2011 (left side, day) and 2012 (right side, night with lights illuminated). The 2012 rig included an additional four lights, but for consistency these were not used in the collection of images discussed in this paper.

Fig. 1. Lighting rig as used for image collection. Left image shows rig as used in 2011 (day image). Right image shows rig as used in 2012 (fully illuminated). Only the inner four lights were illuminated during imaging for sets discussed in this paper to maintain consistency between sets.

Images were stored in Canon RAW format in RGB colour at a resolution of 4752 × 3168 pixels. The vehicle was driven down the plantation rows such that the camera was positioned in the inter-row, approximately 2 m from the tree trunk. Each set of images was acquired in the evening (well after sunset) in a single session. Images were acquired with the camera facing the tree row, and aligned to the trunk of a given tree. The night conditions and directed lighting reduced image background features significantly (sample image presented as Fig. 2).

Fig. 2. Example image, acquired at night under artificial lighting.

In a preliminary trial, the effect of exposure setting was explored. The colour of fruit in images acquired using the auto-exposure setting was washed out, while a minus three stop setting created images with too great a contrast in lighting between background and foreground fruit. A setting of minus two stops was found to yield images with the best (subjective) quality, and this setting was used for all work in this paper. The 2012 imaging varied from 2011 in that greater care was taken in positioning the spotlights to achieve more even illumination of the canopy. Also, trees were imaged on both sides from two distances. Initially, the vehicle was driven at a greater distance from the tree line to ensure imaging of the whole canopy, so that counts of fruit in images could be related to in-field total tree counts. The vehicle was then driven 2 m from the tree to ensure consistency in distance with the 2011 sets for the purposes of automated image counts. These two sets are treated as one for the remainder of the paper, and used in the appropriate contexts.

2.2. Plant populations
The current study is based on two different plantations of mango (Mangifera indica). The first (2011) is in tropical north Queensland, and is the same orchard as used by Payne et al. (2013). The second orchard is in the Wide Bay region of southern Queensland, some 8° of latitude south of the first, in a subtropical region. Images of fruit were acquired at stone hardening stage in two seasons (2011, 2012), while the previous paper (Payne et al., 2013) was based on images of fruit near harvest in the 2010 season. The 2011 season produced a different pattern of fruit development from that of 2010, with fruit on tree displaying a wide range of size and colour. Indeed, a number of trees carried panicles ranging from flowering through to stone-hardening fruit. This variation is ascribed to a more variable and colder season in 2011, resulting in multiple flowering events on each tree. The 2012 season was similar to that of 2010.

2.3. Image sets

Four sets of images were collected (Table 1). During season 2011, a set of 100 mango images was collected from 50 trees as a calibration set. This set included images of both sides of each tree, and manual counts of all fruit on each tree were undertaken. The first 10 trees of the calibration set were re-photographed under the same lighting conditions but on a different night (from one side only). This set of images is referred to as validation set 1. On the same night that the calibration set images were acquired, images of a single side of an additional 74 trees were acquired. This set is referred to as validation set 2. In season 2012, images were acquired using the same equipment and under the same conditions as in 2011, of both sides of 21 trees located in a different orchard. These images form validation set 3. Manual counts of total tree fruit load were also made.

2.4. Manual total tree crop load counts

For the calibration set and validation set 3, the total number of mature fruit on each tree was manually counted in the field during daylight hours. The approach differed from the companion study in
that only fruit which was viable (that is, had reached stone hardening size and maturity characteristics appropriate for commercial harvest) was counted. This excluded smaller fruit from the later flowerings, and also fruit which had fallen or were clearly damaged, with significant yellowing, blackness or a shrivelled appearance. The calibration set of 50 trees carried 83.5 ± 36.9 (mean ± SD) fruit/tree. Validation set 3 (21 trees) carried 92.5 ± 50.4 (mean ± SD) fruit/tree.

Table 1
Characteristics of image sets. The 2011 sets contained a high number of split fruit in images and multiple flowering events on trees. Tree fruit load % refers to the % of trees carrying a high, moderate and low fruit load, respectively.

Set | Season | Lighting (night #) | Orchard | # Trees | # Images | Imaging extent | Fruit #/tree (mean ± SD) | Fruit #/image (mean ± SD) | Tree fruit load % (high, mod, low)
Payne et al. (2013) | 2010 | Day | A | 555 | 555 | Side A | – | 32.3 ± 14.3 | 15%, 70%, 15%
Calibration Set | 2011 | Night (2) | A | 50 | 100 | Sides A + B | 83.5 ± 36.9 | 31 ± 11.1 | 10%, 79%, 11%
Validation Set 1 | 2011 | Night (1) | A | 10 of 50 in cal. set | 10 | Side A | – | 41.9 ± 6.2 | 0%, 60%, 40%
Validation Set 2 | 2011 | Night (2) | A | 74 | 74 | Side A | – | 42.5 ± 13.3 | 3%, 63%, 34%
Validation Set 3 | 2012 | Night (3) | B | 21 | 42 | Sides A + B | 92.5 ± 50.4 | 19.5 ± 12.6 | 47%, 51%, 2%

2.5. Image counts

In this manuscript, the count of fruit in an image is termed "image load". For every image considered, the image load was manually counted using the Photoshop (Adobe P/L) count feature, using the same criteria as above. For the calibration set, Side A image load was 34.7 ± 9.7 (mean ± SD) fruit/image and Side B image load was 28.6 ± 11.6 (mean ± SD) fruit/image. The average image load, at 31 ± 11.1, was similar to that of the set used in the previous year, at 32.3 ± 14.3 fruit per image. Validation sets 1 and 2 both had a higher overall image count than either the 2010 set or the calibration set, with overall image counts of 41.9 ± 6.2 and 42.5 ± 13.3 fruit per image, respectively. Validation set 3 had a low load, with 19.5 ± 12.6 fruit per image. Image load for each of the sets was characterised based on the criteria set in the study of Payne et al. (2013) (categories based on the mean and SD of the 2010 population) (see Table 6). For the calibration set, 79% of images rated as a moderate load, 11% as a high load and 10% as a low load. Validation sets 1 and 2 skewed towards high load images, while validation set 3 had close to an even distribution between low and moderate, with few high load images.

3. Results

3.1. Total tree fruit load and manual image counts

For the calibration set, a manual count of fruit in images of each side of each tree was compared to the in-field count of total fruit on each of the 50 trees. An R2 value of 0.51 existed between the whole tree count and the sum of manual counts for Side A and B images for the 2011 calibration set (Table 2). In contrast, an R2 value of 0.92 existed between the whole tree count and the manual count of images from both sides for the 21 trees of validation set 3 (Table 3).

3.2. Automated count of fruit number in images

The calibration set was first processed using algorithm A, being that developed in Payne et al. (2013), modified in that the colour threshold was reduced from a Cr threshold of 150 to 130 to accommodate the greener fruit of the current set (results of all processing are summarised in Table 4). Error rates were significant (>30%), with false detections involving flower stems, branches and trunk,
and a large number of predominantly green fruit were not detected. The R2 of automated to manual count was 0. The algorithm was modified as follows (with reference to the original steps of the algorithm; see also Fig. 3):

Step 1: The Normalised Difference Index was removed from the algorithm. This step selected 'blush' areas of fruit over foliage. However, fruit at stone hardening stage (2011 and 2012 sets) were largely 'unblushed', i.e. green all over.

Step 2: This step was retained as in the original algorithm. The RGB photo was processed using a 3 × 3 variance filter, converted to gray scale and then filtered where 0 < pixel value ≤ 90. This step effectively removes pixels in regions with a large number of edges – such as within mango foliage or grass – and also pixels between which there was zero variance, such as sky.

Step 2A: An additional filtering step was added to the original algorithm: a border limited mean filter (Image J Wiki, 2011), which calculates the average gray scale value over n × m pixels. The filter was set to work with areas of 125 × 150 pixels, the approximate size of a clearly visible mango fragment, allowing the filter to select areas which matched the mean gray scale value of a mango. The results were then filtered for high values (≥170). This step identified clear blocks of mango fruit pixels.

Step 2B: An additional complex object detection step was added to the original algorithm. A hessian filter (largest eigenvalue of the Hessian tensor) (Meijering, 2013) with a smoothing scale of 10 was applied to the photo using an absolute eigenvalue comparison. The results were then filtered for low gray scale values (≤30). The largest eigenvalue was able to discriminate between line-like leaves and stems, and the oval mango shapes.

Step 3: The Cr layer filter was removed from the algorithm. Insufficient colour difference existed between fruit, foliage and trunk in the 2011 season trees.
Additionally, 2011 trees included significant areas of flowering stem, the colour of which overlaps with that of fruit. This step was replaced with the following two processes:

Step 3A: The Cb and Cr layers were averaged ((Cb + Cr)/2). The resultant image was filtered for high gray scale values (185–255). Remaining blobs were filtered on size (>50,000 pixels). This step identified the trunk when it was a large and clearly visible item in the image, and reduced false trunk detections significantly.

Step 3B: The Cb and Cr layers were subtracted (Cb − Cr). The resultant image was filtered for low values (≤110). This step removed a significant proportion of false mango fruit detections associated with the white sign used for tree identification. While white signs do not regularly appear in field practice, this step – or a similar colour filtering approach – may be useful to remove other man-made orchard features with a consistent colouring.

Step 4: This step remains as in the original algorithm. The Cb layer was filtered where Cb ≤ 100. This value was shown to consistently select overexposed and yellow leaves within the photo, as these features were identified as a source of error.
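The filter sequence of steps 2, 2A, 2B, 3A, 3B and 4 can be sketched in a few lines. The following is an illustrative sketch only, using NumPy/SciPy as a stand-in for the ImageJ-based implementation: the gray conversion, the BT.601 Cb/Cr formulas, the 8-bit rescaling of the hessian response and all function names are our own assumptions, and the blob-size test of step 3A is omitted.

```python
import numpy as np
from scipy import ndimage


def largest_hessian_eigenvalue(img, sigma):
    """Eigenvalue of the 2x2 Hessian with the largest absolute value.

    Line-like structures (leaves, stems) give a strong response, while
    the smooth interior of a fruit gives a weak one.
    """
    img = np.asarray(img, float)
    hxx = ndimage.gaussian_filter(img, sigma, order=(0, 2))
    hyy = ndimage.gaussian_filter(img, sigma, order=(2, 0))
    hxy = ndimage.gaussian_filter(img, sigma, order=(1, 1))
    root = np.sqrt((hxx - hyy) ** 2 + 4 * hxy ** 2)
    e1 = (hxx + hyy + root) / 2
    e2 = (hxx + hyy - root) / 2
    return np.where(np.abs(e1) >= np.abs(e2), e1, e2)


def candidate_masks(rgb):
    """Boolean masks for steps 2-4 of the modified algorithm (a sketch)."""
    rgb = np.asarray(rgb, float)
    gray = rgb.mean(axis=2)
    # Step 2: 3x3 variance filter; keep 0 < variance <= 90.
    mean = ndimage.uniform_filter(gray, 3)
    var = ndimage.uniform_filter(gray ** 2, 3) - mean ** 2
    m_var = (var > 0) & (var <= 90)
    # Step 2A: mean over ~one visible mango fragment (125 x 150 px); keep >= 170.
    m_blm = ndimage.uniform_filter(gray, size=(125, 150)) >= 170
    # Step 2B: absolute hessian response, rescaled to an 8-bit range; keep <= 30.
    h = np.abs(largest_hessian_eigenvalue(gray, 10.0))
    h8 = 255 * h / h.max() if h.max() > 0 else h
    m_hessian = h8 <= 30
    # Steps 3A, 3B and 4 use the Cb and Cr chroma layers (BT.601 assumed).
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    m_trunk = (cb + cr) / 2 >= 185   # step 3A (blob-size filter omitted here)
    m_sign = (cb - cr) <= 110        # step 3B: white-sign pixels
    m_cb = cb <= 100                 # step 4: overexposed and yellow leaves
    return m_var, m_blm, m_hessian, m_cb, m_trunk, m_sign
```

The first three masks mark likely fruit pixels, and the last three mark features to be excluded; step 5 below combines them.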
Table 2
Calibration Set – manual (in-field) tree counts versus manual image counts, involving images taken from two aspects of each of 50 trees. The coefficient of determination (R2) between manual image count and total tree count is presented. The final column is the number of fruit visible in the Side A and B images combined, expressed as a percentage of the manual tree count; the mean and standard deviation are taken over all trees in the set.

Statistic | Fruit # per tree | Image count Side A | Image count Side B | Image count Side A + B | Side A + B/fruit # per tree (%)
Mean | 83.5 | 34.7 | 28.6 | 63.3 | 84
SD | 36.9 | 9.7 | 11.6 | 17.9 | 23
R2 | – | 0.37 | 0.34 | 0.51 | –
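The derived columns of Table 2 can be reproduced as follows. The per-tree data here are synthetic stand-ins (the paper reports only set-level summaries), so only the form of the calculation is meaningful:

```python
import numpy as np

# Synthetic stand-ins for per-tree data: in-field total count and
# the count of fruit visible in the Side A + B images of each tree.
rng = np.random.default_rng(1)
tree = rng.integers(30, 160, size=50).astype(float)   # in-field tree count
side_ab = 0.75 * tree + rng.normal(0, 12, 50)         # fruit visible in images

# Final column of Table 2: image count as a percentage of the tree count,
# summarised by its mean and standard deviation over all trees.
pct = 100 * side_ab / tree
pct_mean, pct_sd = pct.mean(), pct.std(ddof=1)

# Coefficient of determination between image count and tree count.
r2 = np.corrcoef(side_ab, tree)[0, 1] ** 2
```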
Table 3
Validation Set 3 – manual (in-field) tree counts versus manual image counts, involving images taken from both aspects of each of 21 trees. The coefficient of determination (R2) between manual image count and total tree count is presented. As in Table 2, the final column is the number of fruit visible in the Side A and B images combined, expressed as a percentage of the manual tree count; the mean and standard deviation are taken over all trees in the set.

Statistic | Fruit # per tree | Image count Side A + B | Side A + B/fruit # per tree (%)
Mean | 92.5 | 52.4 | 59
SD | 50.4 | 30.5 | 16
R2 | – | 0.92 | –
Table 4
Summary of outcomes of processing runs. Algorithm A refers to that developed in Payne et al. (2013), while Algorithm B was developed in the current study. The linear regression statistics of machine count to manual count of fruit per image, the % of fruit correctly detected, and the % of false detections of fruit in the machine vision count are presented. RMSE refers to the root mean square error of residuals (machine – actual count per tree).

Image set | Fruit #/image (mean ± SD) | Algorithm | Algorithm parameters | Regression model | R2 | Bias adjusted RMSE | Correct (%) | False (%)
Payne et al. (2013) | 32.3 ± 14.3 | A | Cr = 150, Size = 1400 | y = 0.582x − 0.20 | 0.74 | 7.7 | 52 | 4.8
Calibration Set | 31.7 ± 11.1 | A | Cr = 130, Size = 1400 | y = 0.0843x + 46.188 | 0 | 15.9 | Poor | >30
Calibration Set | 31.7 ± 11.1 | B | Hessian = 30, Size = 1200, fast filter = 170 | y = 0.5662x + 9.4237 | 0.632 | 6.7 | 78.3 | 10.6
Validation Set 1 | 41.9 ± 9.6 | B | Hessian = 30, Size = 1200, fast filter = 170 | y = 0.6051x + 5.6457 | 0.78 | 4.6 | 66.1 | 10.7
Validation Set 2 | 42.5 ± 13.3 | B | Hessian = 30, Size = 1200, fast filter = 170 | y = 0.5911x + 5.3686 | 0.63 | 8.1 | 56 | 18.7
Validation Set 3 | 19.5 ± 12.6 | B | Hessian = 30, Size = 1200, fast filter = 170 | y = 0.6837x + 9.156 | 0.59 | 8.1 | 69.6 | 26
Fig. 3. Comparison of steps in Algorithm 1 and Algorithm 2.
Step 5: A binary image was generated by collating the results of the previous four steps as follows:

pixel_final image = pixel_variance AND pixel_blmean AND pixel_hessian AND NOT(pixel_Cb) AND NOT(pixel_Trunk) AND NOT(pixel_WhiteSign)
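Assuming the six masks above are available as boolean images, this mask combination, together with the particle count of step 6 below (size window 1200–50,000 pixels), can be sketched as follows, with scipy.ndimage.label standing in for the ImageJ particle counter (function and argument names are our own):

```python
import numpy as np
from scipy import ndimage


def count_fruit(m_var, m_blm, m_hessian, m_cb, m_trunk, m_sign,
                min_px=1200, max_px=50000):
    """Step 5 mask combination and step 6 particle count (a sketch)."""
    # Step 5: keep pixels flagged as fruit by all three positive masks
    # and by none of the three exclusion masks.
    fruit = m_var & m_blm & m_hessian & ~m_cb & ~m_trunk & ~m_sign
    # Step 6: count connected particles within the published size window.
    labels, n = ndimage.label(fruit)
    if n == 0:
        return 0
    sizes = ndimage.sum(fruit, labels, index=np.arange(1, n + 1))
    return int(((sizes >= min_px) & (sizes <= max_px)).sum())
```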
The result was a binary image, effectively masking pixels associated with mango fruit within the image.

Step 6: A count of the number of particles in the binary image was then performed. Particles counted were limited by both a lower and an upper limit on the number of pixels in the particle (1200–50,000).

Calibration set images were re-processed using the modified algorithm ('algorithm B'). Processing time was approximately 20 s per image using functional but un-optimised code and computing hardware. The fruit detection rate was high, at 78.3%. The error rate was 10.6% of all detections (with 2.4% associated with white signs, and 2.1% with detections of split and therefore unviable fruit). Fruit detected twice constituted a further 2.2% of errors. The error of detecting two fruit as one occurred in only 0.3% of error events (two cases). The R2 value of the linear correlation between automated and manual count increased to 0.63, and the bias adjusted root mean square error decreased to 6.7 (Fig. 4a). Using the same parameters for the algorithm, validation sets 1–3 were also processed using Algorithm B. Results are shown in Table 4, with a breakdown of errors from each processing set shown in Table 5, and representative graphs shown in Fig. 4. The slope of the regression between automated and manual count varied between 0.56 and 0.68 across the image sets, the coefficient of determination (R2) varied between 0.59 and 0.78, the bias adjusted RMSE varied between 4.6 and 8.1, and the percentage of fruit detected varied between 56% and 78%. There was a significant increase in the number of false detections of leaves in the validation sets. The rate of occurrence of this error in validation set 1 was similar to that in the calibration set; however, in validation set 2 the error increased to 18.7%, with
13.8% associated with leaves. In validation set 3, this error increased to 26%, 22.6% of which were leaves. Missed fruit per image was also analysed (Fig. 5). Significant outliers exist in both validation sets 2 and 3. In each case, the outlier was due to significant bunches of fruit hanging from the alternate side of the tree in dim light.

4. Discussion

4.1. Estimating total fruit load from 2D images

The machine count concept is based on the premise that effectively all fruit on a tree can be seen in images of the two sides of the tree, or at least that the number seen is proportional to the actual total. The result achieved with validation set 3 in 2012 (R2 = 0.92, using images of the two sides of the tree) is comparable to that achieved in Payne et al. (2013), consistent with the use of this imaging strategy as a means of estimating tree fruit load. The poor result for the 2011 calibration set (R2 = 0.51, using images of the two sides of the tree) can be ascribed to several issues: (a) tree growth since 2010 resulted in the limbs of some trees extending either vertically or horizontally outside the camera image; (b) insufficient lighting at the edges of the tree; (c) a wide range of fruit maturity on the 2011 trees, with variation in the manual assessment of which fruit should be included in the count as at or beyond stone hardening stage (count per tree SD of 36.9, compared to 14.9 in 2010). For the 2012 set, camera and light positions were adjusted to minimise the first two issues, and the third issue did not apply. The question remains whether the strength of the relationship between fruit in images of two sides of the tree and actual tree fruit load is sufficient to support the concept of using a machine vision count of fruit in such images in an estimate of orchard yield.
The machine vision approach allows all trees in the orchard to be assessed, while the current human-based assessment counts fruit on only a few trees per orchard, typically 0.5% of trees. Note that an appropriate sample size to estimate the mean tree fruit load to 95% confidence is 218 for the 2011 set (given a SD of 36.9; from Eq. (1), Payne et al., 2013). The current field practice of counting the load of approximately 10 trees is thus unlikely
Fig. 4. Automated count of fruit number made using algorithm 2 plotted against a manual count of fruit number in image for each of the image sets.
Table 5
Percentage of each type of error per set, for the false positive error types of white sign, leaves, tree trunk, a single fruit counted as two, split fruit, and split fruit detected as two fruit, and the false negative error type of two fruit detected as a single fruit.

Set | White sign (%) | Leaves (%) | Trunk (%) | 1 → 2 (%) | Split (%) | Split double (%) | 2 → 1 (%)
Payne et al. (2013) | N/A | 2.4 (inc. sky, ground) | – | 2.4 | N/A | N/A | 4.5
Calibration Set | 2.37 | 3.47 | 0.47 | 2.21 | 1.74 | 0.32 | 0.32
Validation Set 1 | 1.61 | 5.48 | 0.32 | 0.97 | 2.26 | 0.00 | 0.97
Validation Set 2 | 0.00 | 13.82 | 1.02 | 2.44 | 1.02 | 0.41 | 0.41
Validation Set 3 | 0.00 | 22.63 | 0.73 | 2.68 | 0.00 | 0.00 | 0.24
Table 6
Percentage of images belonging to the three fruit load categories (low, moderate, high) and the R2 of the correlation between automated and manual fruit count per image for each of the image sets.

Fruit load | Low % | Moderate % | High % | Low R2 | Moderate R2 | High R2 | Total R2
Payne et al. (2013) | 15.3 | 70.1 | 14.6 | 0.36 | 0.55 | 0.06 | 0.97
Calibration set | 10 | 79 | 11 | 0.7 | 0.38 | 0.02 | 0.63
Validation set 1 | 0 | 60 | 40 | 0 | 0.67 | 0.04 | 0.78
Validation set 2 | 3 | 63 | 34 | N/A | 0.39 | 0.46 | 0.63
Validation set 3 | 47 | 51 | 2 | 0.64 | 0.04 | N/A | 0.59
Fig. 5. Comparison of missed fruit versus total image fruit for each set.
to produce reliable harvest estimates. Yet current yield estimates are reported to be within 10–20% of the actual yields obtained at harvest (plantation managers, pers. comm.). Verification of this claim is required. Certainly the relationship of image count to total tree load will be affected by canopy shape, and a more two-dimensional canopy structure would suit the machine vision application, as indeed it would suit automated fruit harvest.

4.2. Lighting

Manual fruit picking under artificial lights is already performed in the evening in some Australian mango orchards. The artificial lighting required not only offers favourable conditions for viewing the fruit, but also allows the job to be performed during the cool evening period, rather than during the hot, humid conditions of daytime. A number of studies have noted that direct sunlight, with its attendant shadowing and variation in level, creates difficulties
for a machine vision count of fruit on tree (e.g. Stajnko et al., 2004; Payne et al., 2013). Further, it is not practical to limit the imaging of a large orchard to occur within overcast (diffuse lighting) conditions, as occurred in Payne et al. (2013). Imaging under artificial light by night offers the potential of controlled lighting conditions. For example, published results for kiwifruit identification indicate that night imaging is promising. The results described within this paper (up to 78.3% fruit detection with 10.6% errors; Table 4) are consistent with the suggestion that artificial lighting at night can provide consistent illumination without strong directional shadows. LED based artificial illumination may also allow for the use of wavelengths that accentuate the contrast of the object of interest against the background.

4.3. Fruit features

Algorithm A, developed on images from the 2010 season, failed spectacularly when used with images of the 2011 season. The reasons
166
A. Payne et al. / Computers and Electronics in Agriculture 100 (2014) 160–167
for this failure are explored below, providing justification for the approach employed in algorithm B. Fruit of the 2011 season was more varied in colour than the fruit imaged in 2010. This variation almost completely negated the use of colour as a useful image processing feature, as the 2011 season fruit colour overlapped strongly with foliage, trunk and stems. Decreasing the accepted area of colour resulted in increased false detections, given small areas of red colouration on leaves. The replacement of colour filters with the border limited mean enabled more subtle differences in areas of colour to be exploited. The feature of fruit surface texture was consistent between the two seasons. Continued use of the 3 × 3 variance filter enabled mangoes to be differentiated from surrounding foliage and tree (areas with significant edges). While fruit shape was consistent between seasons, the obscuration of fruit by other fruit, leaves and branches makes edge and shape detection an unreliable approach, and thus it was not used in our algorithm development. However, the use of a hessian filter provided the ability to detect areas with leaf and stem shapes, without selecting the inner area of a mango. This filter assisted in eliminating leaf and stem pixels, while retaining fruit pixels.

4.4. False detections

The false detection of fruit in background (non-canopy) areas of the image experienced with natural lighting (Payne et al., 2013) was largely removed by night imaging. Specific filters may be employed to deal with errors associated with a specific feature in the orchard. For example, a white sign was used for identification of each tree in the 2011 season, and a filter was added to Algorithm B to remove this area of the image. The majority of 'false positive' errors associated with the use of algorithm B in validation sets 2 (2011) and 3 (2012) were erroneous detections of leaves.
Such errors did not occur to any significant extent in the earlier work, nor in the calibration set and validation set 1 of this paper. This outcome is ascribed to the widening of the colour filters to include more of the green spectrum, thereby overlapping the colour space of the leaves, and to some difference in leaf colour between validation sets (validation set 2 was from a different area of the orchard to the calibration set and validation set 1, while validation set 3 was from a different season, a different orchard and a different region). The most difficult errors to remove were those associated with leaves, trunks and stems. However, the combination of shape, texture and size filters appears to have reduced trunk and stem errors to negligible levels, while leaf errors remain significant. The fruit detection rate of 78.3% and false detection rate of 10.6% (Table 4) for the calibration set compare favourably with the results of other authors in similar circumstances. Kurtulmus et al. (2011) achieved 75.3% detection with an error rate of 27.3% for green citrus fruit under natural daytime outdoor conditions. Using night imaging, a reported 90% detection rate with an error rate of 7–17% was achieved for gold variety kiwifruit, and a 60% detection rate with an error rate of 24–31% for green variety fruit. Orange coloured citrus fruit represent an application which offers greater colour distinction between fruit and foliage. Under daylight, Hannan et al. (2009) achieved 90% detection of orange fruit, with a 4% false detection rate. Stajnko et al. (2004) achieved detection of 93% of fruit, with a 4% error rate, again for mature oranges. However, while Validation Set 1 had an acceptable error rate (10.7%), Validation Sets 2 and 3 increased in errors significantly, to 18.7% and 26%, respectively. This effectively removed any opportunity for a sensible comparison of the regression R2 values for these sets.
4.5. Next steps

It is difficult to envisage that adjustment of the parameters of the algorithm presented in this study will result in any marked improvement in performance. We believe that further work should explore alternative image collection and processing approaches. For example, the approach used by Kurtulmus et al. (2011) could offer additional features to improve the results of the existing algorithm. They used an 'eigenfruit' classifier (over intensity and saturation) and a circular Gabor filter as input to a majority voting algorithm. Resultant blobs were then analysed to merge multiple detections. In particular, the training of the novel 'eigenfruit' approach allowed the identification and use of softer features of the fruit, reducing the impact of occlusion and lighting. It is expected that including an eigenfruit approach in our algorithm would assist in eliminating false positives significantly, allowing the widening of existing filters and hence the detection of a greater proportion of visible fruit. However, given the large variation in mango fruit size, colour, irregular shape and obscuration, the collection of a sufficient set of eigenfruits would be a non-trivial task. Additionally, the Gabor filter would need to be modified to cater for the irregular shape of mangoes, and may need greater specificity in order to differentiate between fruit and leaves. Linker et al. (2012) also offer a relevant approach. They identified seed areas similar to the blobs detected by our algorithm. However, they then expand, segment, combine and select particular areas using mathematical representations of the arc of the outside of the fruit. The approach was effective in the identification of multiple blobs representing a single fruit, although this type of error has largely been removed by Algorithm B (<1% in all sets).
The Linker et al. (2012) approach also assists where there is significant occlusion and shadowing of fruit, by expanding likely regions based on the presence of edges in addition to texture and colour. Again, the technique is worth exploring with mango fruit, albeit with issues similar to those of the Gabor filter above. An alternative collection technique to consider is stereo vision, such as that used by Yang et al. (2007) in preliminary work on robotic harvesting of tomato fruit. Stereo images are used for depth segmentation of target fruit clusters identified using colour. In our case, depth segmentation could be used to identify fruit on the facing side of the tree and to exclude fruit that is visible but belongs to the rear side. This could improve the relationship between the manual tree count and either the image or the automated count. Finally, perhaps the goal of identifying individual fruit is too ambitious. Reasonable correlation may exist between the number of fruit pixels (possibly increased beyond what Algorithm B of this paper identifies, by a region growing approach) and yield. Other authors have found success with this approach in simpler applications (e.g. blueberry; Zaman et al., 2008).
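The fruit-pixel alternative amounts to a simple least-squares calibration of fruit count (or yield) against segmented fruit-pixel totals per tree, assessed with the same R2 statistic reported in this study. The sketch below illustrates the mechanics; all per-tree numbers are hypothetical:

```python
import numpy as np

def fit_pixel_yield(fruit_pixels, fruit_counts):
    """Least-squares calibration of manual fruit count against segmented
    fruit-pixel totals per tree (all data here are hypothetical)."""
    slope, intercept = np.polyfit(fruit_pixels, fruit_counts, 1)
    return slope, intercept

def r_squared(y, y_pred):
    """Coefficient of determination, as used for machine vision vs
    manual count comparisons."""
    ss_res = np.sum((y - y_pred) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical per-tree data: segmented fruit-pixel counts and manual counts.
pixels = np.array([12000.0, 18500.0, 9800.0, 25100.0, 15400.0])
counts = np.array([41.0, 66.0, 35.0, 88.0, 54.0])
a, b = fit_pixel_yield(pixels, counts)
r2 = r_squared(counts, a * pixels + b)
```

Such a calibration sidesteps the blob-merging and occlusion problems of individual-fruit detection, at the cost of requiring per-orchard (and possibly per-season) refitting.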
5. Conclusions
Machine vision for orchard fruit load estimation, and for robotic harvesting, requires a relatively planar tree structure when based on imaging from each inter-row. Artificial lighting at night time affords more consistent lighting conditions than daylight for the application of fruit counting using the algorithm described in this paper. Lighting should be configured to offer a strong, even, but non-directional light across the entire tree. Consideration should be given to the use of ‘non-white’ light to accentuate contrast of the area of interest. It is also possible that the use of a three dimensional camera and depth segmentation would improve results by allowing rejection of ‘far’ fruit from the automated count. The algorithm was improved by introducing altered colour matching and hessian filtering. These steps should reduce the need
for perfect lighting conditions. The performance of this improved algorithm should be validated using images of trees from a range of growing conditions. In practice, an in-field system is likely to require a ‘toolkit’ of feature detection options, with the ability to vary filter settings so that the operator can optimise them for the condition of the orchard. This toolkit should consist of filtering (colour using RGB and YCbCr, border limited mean) and edge detection (3 × 3 variance filter). A hessian filter to detect tube-like structures should also be included, to enable detection and elimination of complex objects (e.g. leaves and stems). Additional steps should be investigated to use texture (possibly an ‘eigenfruit’ approach) and shape analysis (a Gabor filter, or the method of Linker et al., 2012). The choice of alternatives in such a toolkit could be supported by an expert system. User input identifying correct, missed and falsely detected mangoes could also be employed for two purposes: to support the user in performing a semi-automated crop yield count, and to build a substantial database of mango images for later use as eigenfruit.
References
Foruzan, Amir H., Zoroofi, Reza A., Sato, Yoshinobu, Hori, Masatoshi, 2012. A Hessian-based filter for vascular segmentation of noisy hepatic CT scans. Int. J. Comput. Assist. Radiol. Surg. 7 (2), 199–205.
Hannan, M.W., Burks, T.F., Bulanon, D.M., 2009. A machine vision algorithm combining adaptive segmentation and shape analysis for orange fruit detection. Agricultural Engineering International: CIGR Journal XI (Manuscript 1281). http://www.cigrjournal.org/index.php/Ejounral/article/view/1281.
Image J Wiki, 2011. Fast filters plugin. Accessed 28 April 2013. http://imagejdocu.tudor.lu/doku.php?id=plugin:filter:fast_filters:start.
Kurtulmus, F., Lee, W.S., Vardar, A., 2011. Green citrus detection using ‘eigenfruit’, color and circular Gabor texture features under natural outdoor conditions. Comput. Electron. Agr. 78 (2), 140–149.
Linker, R., Cohen, O., Naor, A., 2012. Determination of the number of green apples in RGB images recorded in orchards. Comput. Electron. Agr. 81, 45–57.
Meijering, E., 2013. FeatureJ hessian filter. Accessed 28 April 2013. http://www.imagescience.org/meijering/software/featurej/hessian.html.
Payne, A.B., Walsh, K.B., Subedi, P.P., Jarvis, D., 2013. Estimation of mango crop yield using image analysis – segmentation method. Comput. Electron. Agr. 91, 57–6.
Stajnko, D., Lakota, M., Hocevar, M., 2004. Estimation of number and diameter of apple fruits in an orchard during the growing season by thermal imaging. Comput. Electron. Agr. 42 (1), 31–34.
Yang, L., Dickinson, J., Wu, Q.M.J., Lang, S., 2007. A fruit recognition method for automatic harvesting. In: 14th International Conference on Mechatronics and Machine Vision in Practice (M2VIP 2007), 4–6 December 2007, pp. 152–157. doi: http://dx.doi.org/10.1109/MMVIP.2007.4430734.
Zaman, Q.U., Percival, D.C., Gordon, R.J., Schumann, A.W., 2008. Estimation of wild blueberry fruit yield using digital color photography. Acta Horticulturae 824, 57–65.