Influence of solar irradiance on hyperspectral imaging-based plant recognition for autonomous weed control


Biosystems Engineering 110 (2011) 330–339

Available online at www.sciencedirect.com

journal homepage: www.elsevier.com/locate/issn/15375110

Research Paper

Y. Zhang*, D.C. Slaughter

Department of Biological and Agricultural Engineering, University of California, Davis, One Shields Avenue, Davis, CA 95616, USA

Article info

Article history:
Received 14 April 2011
Received in revised form 6 September 2011
Accepted 7 September 2011
Published online 22 September 2011

Canopy reflectance in the visible and near infrared region (384–810 nm) was examined to discriminate between plant species, grown under various sunlight intensities, using ground-based hyperspectral imaging technology. Black nightshade and pigweed were grown with processing tomatoes under two levels of solar irradiance. The canonical Bayesian classifiers based on the full spectral range (400–795 nm) achieved an overall classification accuracy of 88.2% for the low solar irradiance treatment and 95.1% for the high solar irradiance treatment, using internal cross-validation analysis. The plant species exposed to higher solar irradiance were more easily distinguished in the feature space. The classifier trained with the plants grown in the low solar irradiance treatment was more robust to varying sunlight conditions, and correctly identified 88.7% of the plants for both sunlight intensity growing conditions. Global calibration achieved an optimum classification rate of 90% over the studied range of solar irradiance using 270 plants in the global domain as the training samples. It provided an alternative method to mitigate the effect of changing canopy optical properties on species discrimination and to improve classification robustness across variation in solar irradiance during plant establishment. © 2011 IAgrE. Published by Elsevier Ltd. All rights reserved.

* Corresponding author. Tel.: +1 530 754 9776. E-mail address: [email protected] (Y. Zhang).
1537-5110/$ – see front matter © 2011 IAgrE. Published by Elsevier Ltd. All rights reserved. doi:10.1016/j.biosystemseng.2011.09.006

1. Introduction

Weed infestations in fields typically have a heterogeneous spatial distribution and frequently occur in aggregated patches. In order to efficaciously control weeds at minimal environmental and economic costs, the development of site-specific weed management using precision weed control technology has been studied over the last decade (e.g., Åstrand & Baerveldt, 2002; Blasco, Aleixos, Roger, Rabatel, & Moltó, 2002; Lee, Slaughter, & Giles, 1999; Zhang, Staab, Slaughter, Giles, & Downey, 2009). The success of the commercial implementation of precision weed control has been greatly challenged by a lack of real-time weed detection methods that are robust to natural variability in environmental growing conditions common in agricultural fields (Slaughter, Giles, & Downey, 2008).

Prior studies on automatic weed detection and plant species identification on a field scale have primarily been based upon the biological morphology, visual texture and reflectance spectral features of the weed plants (Brown & Noble, 2005; Slaughter, Giles, & Downey, 2008). In comparison with morphology- and texture-based methods, spectral reflectance-based pattern recognition techniques are typically less computationally intensive and have superior potential to discriminate species when partial leaf occlusions occur in seedlines (Brown & Noble, 2005; Slaughter, Giles, & Downey, 2008).

With the rapid development of array sensors and digital cameras, hyperspectral imaging of plant foliage reflectance has recently been utilised to develop robotic machine vision systems for precision in-row weed control. One pioneering study using hyperspectral imaging was conducted by Borregaard, Nielsen, Norgaard, and Have (2000) to discriminate sugar beet and potato plants from three weed species in a laboratory setting. The crop-versus-weed classifiers achieved a classification accuracy of over 89% for the crop plants using two to three narrowband reflectance wavelengths selected from 660 to 1060 nm. This study provided an initial demonstration of the potential of hyperspectral imaging technology for crop-weed discrimination. Feyaerts and van Gool (2001) extended the work to real field conditions and developed a hyperspectral vision system which successfully distinguished 80% of sugar beets and 91% of weeds using six separating-feature wavelengths extracted from the 435–1000 nm spectral range. Slaughter, Giles, Fennimore, and Smith (2008) applied the line imaging spectroscopy technique to discriminate weeds in direct-seeded lettuce fields; 90% of the plant canopy was correctly identified by a 21-waveband multispectral classifier established on the reflectance spectra of 384–810 nm.

Although the feasibility of spectral reflectance in the visible and near-infrared (NIR) range for plant species discrimination has been demonstrated under laboratory and field conditions, prior research was conducted in a single growing season under relatively uniform sunlight conditions. On the large corporate farms common in California, USA, plants are exposed to a wide range of solar irradiance conditions due to the long growing season (in some locations crops are grown year-round). California processing tomatoes, for example, are planted over a 4-month period (late February to the end of May), during which the sunlight conditions change dramatically.
The average solar power in California’s Central Valley in March 2009 was reported to be 5.5–6.0 kW m⁻², while the intensity in May increased to 7.0–7.5 kW m⁻² (NREL, 2009). In addition, the solar irradiance conditions can vary regionally due to changes in latitude and cloud patterns. For example, a map of annual average solar power for the United States (Fig. 1) indicates that the direct solar irradiance of the western continental states is between 2.5 and 8.3 kW m⁻², while in many eastern US locations the irradiance values are well below 5.0 kW m⁻².

Fig. 1 – Static map showing the annual average direct normal solar source for the United States (adapted from NREL, 2009).

Changes in solar irradiance during growth affect the optical properties of vegetation canopy in both visible and NIR regions. Canopy reflectance in the visible region (400–700 nm) is directly related to spectral absorption of leaf pigments such as chlorophyll and anthocyanin (Guyot, 1990; Zwiggelaar, 1998). Chlorophyll formation is photo-stimulated in plants (Barton, 1930, Chapter 6). Ultraviolet, visible and far-red light can induce anthocyanin biosynthesis in leaf tissues, whereas absence of light can inhibit anthocyanin production (Chalker-Scott, 1999). Light also plays a strong formative role in the development of foliage physical properties and leaf anatomical structures (Barton, 1930, Chapter 6), which can affect canopy reflectance in the NIR region (700–1300 nm; Guyot, 1990; Zwiggelaar, 1998). Prolonged exposure to sunlight can stimulate long-term changes in plant physiology and morphology, which are of principal interest in this paper.

Some research has been conducted on the effects of variability in environmental conditions on the potential use of spectral reflectance for plant recognition. Henry, Shaw, Reddy, Bruce, and Tamhankar (2004) conducted a study using individual leaf-based reflectance in 350–2500 nm to discriminate soybeans from weeds over a range of soil moisture statuses. They achieved overall classification accuracies ranging from 86% for an unstressed control to 91% for high moisture stress treatments, and concluded that moisture stress improved the classification of plant species in their study.
In a study of the seasonal effects of growing temperature on discrimination between tomatoes and weeds, Zhang and Slaughter (2011) found that multispectral classifiers developed under single-season conditions successfully identified over 91% of the plant canopy; however, the performance was not stable and degraded by 2–31% in multi-seasonal applications. Previous studies on sunlight effects have concentrated on the operational effects of ambient illumination conditions on the performance of machine vision systems using leaf reflectance information for precision weed detection (e.g., Tian & Slaughter, 1998; Vrindts, De Baerdemaeker, & Ramon, 2002; Woebbecke, Meyer, Von Bargen, & Mortensen, 1995). To the authors’ knowledge, however, few studies have systematically examined the role of sunlight intensity on the optical properties of plant foliage and its associated potential for plant species recognition. In addition, Zwiggelaar (1998) commented that spectroscopy-based species classifiers for in-row weed detection have not been widely utilised, owing to evidence that the selection of feature wavelengths varied significantly with the crop/weed species involved, the local field conditions under which the spectral data were collected, and the seasons in which the plants were grown. Research is needed to examine the stability of plant species classification over varying sunlight conditions using canopy reflectance. A machine learning algorithm with the ability to adapt automatically to environmental changes may be beneficial for robust intra-row weed detection over a range of sunlight conditions.


Therefore, the objectives of this study were (1) to evaluate the effects of solar irradiance on canopy reflectance-based plant species identification using visible/NIR spectroscopy; (2) to examine the potential of hyperspectral imaging for plant species recognition over varying solar irradiance conditions and (3) to develop an automated machine learning algorithm that can alleviate sunlight effects on plant recognition and achieve robust weed detection under variability in solar irradiance during plant development.

2. Materials and methods

2.1. Plant cultivation and solar irradiance treatment

Tomatoes (Lycopersicon esculentum Mill., cultivar ‘Heinz 8892’) and two weed species, black nightshade (Solanum nigrum L.) and red-root pigweed (Amaranthus retroflexus L.), were germinated from seed in mid-April 2008 in the greenhouse facility at the University of California, Davis, CA, USA. The two weed species were selected because they are common pests in Californian processing tomato fields and are ranked as the most troublesome weeds to control (UC IPM, 2009). All species were sown in excess in 230-ml square grow pots filled with modified UC mix (UCD PGF, 2008). After germination, all plants were thinned to one plant per pot and moved to two outdoor experimental benches placed in parallel with an east-west orientation at a location under full sun exposure. To simulate the sunlight variation in California’s Central Valley for processing tomatoes during the typical growing season from early March to late May, two levels of solar irradiance, high (HSI) and low (LSI), were randomly assigned to the experimental benches in a completely randomised design. The HSI treatment was the direct solar irradiance of natural sunlight, typical of the valley in late April and early May. A shade frame (4.3 by 0.9 by 1.1 m) completely covered by a single layer of black knitted shade mesh fabric (made of high density polypropylene, commercially available for agricultural and gardening practices) was mounted over one bench to achieve the LSI treatment. The manufacturer’s nominal shade factor for the fabric was 30%, which was selected to simulate the average solar irradiance intensity of March in the valley (NREL, 2009). No shading occurred between the benches throughout the experiment period. The plants were randomly placed within each experimental unit.

The incident solar illuminance and colour temperature were measured at noon on a daily basis for both treatments with a digital light meter (Type 72-7250, TENMA, Springboro, OH, USA) and a digital chroma meter (Type IIIF, Konica Minolta, Tokyo, Japan), respectively. The average incident solar illuminance and colour temperature were 100.5 klux (STD 9.54 klux) and 5523 K (STD 90.5 K), respectively, for the HSI treatment and 59.5 klux (STD 6.54 klux) and 5419 K (STD 105.0 K) for the LSI treatment. A Tukey multiple comparison of the results showed that the differences in both illuminance and colour temperature between the two treatments were significant at the α = 0.05 level (the minimum significant distances for illuminance and colour temperature were 3.85 klux and 46.1 K, respectively). Scheduled fertigation (UCD PGF, 2008) was manually applied on a daily basis to all plants and none appeared to be subject to water or nutrient deficiency during the trial. Hyperspectral images were collected when the tomato seedlings reached the 2nd-true-leaf stage, and all border plants were excluded from this study. The number of plants for each species is reported in Table 1 with the corresponding spectral pixels in the resulting hyperspectral image database.
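The significance statement for the two treatments can be checked directly from the reported summary statistics: with only two treatments, a difference between means is significant under the Tukey comparison when it exceeds the minimum significant distance. The following minimal Python sketch only re-uses the numbers quoted in the text; the variable and function names are illustrative and are not taken from the study.

```python
# Treatment means and minimum significant distances (MSD) quoted in the text.
MEANS = {
    "illuminance_klux": {"HSI": 100.5, "LSI": 59.5},
    "colour_temperature_K": {"HSI": 5523.0, "LSI": 5419.0},
}
MSD = {"illuminance_klux": 3.85, "colour_temperature_K": 46.1}


def mean_difference(quantity):
    """Absolute difference between the HSI and LSI treatment means."""
    values = MEANS[quantity]
    return abs(values["HSI"] - values["LSI"])


def is_significant(quantity):
    """True when the difference between means exceeds the MSD."""
    return mean_difference(quantity) > MSD[quantity]


for quantity in MEANS:
    print(quantity, round(mean_difference(quantity), 2), is_significant(quantity))
```

Both differences are well above their minimum significant distances, which is consistent with the statement that the two treatments differed in illuminance and colour temperature.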

Table 1 – Distribution of plant species at each solar irradiance treatment in the hyperspectral image database. Values are the number of plants, with pixel counts in parentheses.

Solar irradiance treatment   Tomato            Black nightshade   Pigweed
LSI                          232 (825,081)     231 (969,630)      224 (1,070,067)
HSI                          221 (713,054)     222 (353,924)      233 (881,070)
Total                        453 (1,538,135)   453 (1,323,554)    457 (1,951,137)

2.2. Hyperspectral image acquisition and preprocessing

The canopy reflectance spectra were acquired using a hyperspectral imaging system (Zhang & Slaughter, 2011). The image detector consisted of a line-scanning spectrograph (nominal resolution 5 nm, ImSpector V8_4_102, Spectral Imaging Ltd., Oulu, Finland), a temperature-controlled, monochrome-area CCD camera (Photometrics CoolSNAPcf, Roper Scientific Photometrics, Tucson, AZ, USA), and an objective lens (12 mm F/1.2, C61215, Cosmicar/Pentax, Hoya Corp., Tokyo, Japan). The camera was mounted in an enclosed chamber with the height adjusted so that the field of view (FOV) was approximately 108 mm across the seedline, the region left unweeded by precision inter-row cultivation and where automatic in-row weed control would be implemented. The spectral range of the transmission grating was 384–810 nm (selected based on results from Slaughter, Lanini, & Giles, 2004). The images were binned into 330 (spatial) by 260 (spectral) pixels inside the camera before transfer in order to improve the signal-to-noise ratio and increase the data transfer rate from the camera to the computer. The resulting hyperspectral images had a 1.64 nm pixel spacing in the spectral axis and a 0.31 mm pixel spacing in the spatial axis across the seedline. Plants were transported on a custom-designed metal frame at a continuous speed of 32 mm s⁻¹ beneath the stationary camera to simulate the motion of a tractor-drawn system down the seedline. With the data acquisition rate set at approximately 7 frames s⁻¹, the camera’s FOV was about 2.5 mm wide in the travel direction (i.e. along the seedline).

The plant canopy was illuminated by two tungsten-halogen bulbs (Ushio EYF/FG, 12 Vdc, 75 W, SP12; Ushio America Inc., Cypress, CA, USA) collimated through a rectangular duct with a 15° incident angle. The light duct extended 36 mm in width and 51 mm in length beyond each side of the camera’s FOV in order to minimise illumination edge effects and provide a more uniform illumination level. A blue filter (KB-12, B+W, Jos. Schneider Optische Werke GmbH., Bad Kreuznach, Germany) was used to cover the lens in order to improve the uniformity of the light sensitivity across the spectrum. All images were corrected for camera dark signal and normalised with an optically-flat reference standard (Spectralon, Labsphere, Inc., North Sutton, NH, USA).

Before classification of the plant species, the modified red ratio vegetation index (MRVI; Biller, 1998) was utilised to segment plants from background using a threshold of 1.4. In order to minimise the spectral distortion due to minor foliage damage or soil deposited on the leaf surface, all hyperspectral images were preprocessed with a 15-pixel convolution window using a central moving average and a Savitzky-Golay quadratic smoothing operation (Savitzky & Golay, 1964; Steinier, Termonia, & Deltour, 1972) in the spatial and spectral dimensions, respectively. As the optical detectors had degraded sensitivity at the boundaries of the nominal spectral range, 15 nm of reflectance information was eliminated at both edges of the spectral boundaries. Therefore, the plant species identification was conducted based upon the reflectance in 400–795 nm. The outlier pixels were excluded using a clustering algorithm based upon Euclidean distance.
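The dark-signal correction, reference normalisation and spectral smoothing described in this section can be illustrated with a short sketch. It is written in plain Python under stated assumptions and is not the processing code used in the study: the function names are hypothetical, the handling of a zero reference span and of the spectrum edges is an assumption, and the Savitzky-Golay weights come from the standard closed-form expression for a quadratic fit over a symmetric window.

```python
def relative_reflectance(raw, dark, white):
    """Dark-correct a raw spectrum and normalise it by the white reference."""
    result = []
    for r, d, w in zip(raw, dark, white):
        span = w - d
        # Guard against a dead reference pixel (assumed handling).
        result.append((r - d) / span if span > 0 else 0.0)
    return result


def savgol_quadratic_weights(window=15):
    """Closed-form Savitzky-Golay smoothing weights for a quadratic fit."""
    m = window // 2
    norm = (2 * m + 3) * (2 * m + 1) * (2 * m - 1)
    centre = 3 * (3 * m * m + 3 * m - 1)
    return [(centre - 15 * i * i) / norm for i in range(-m, m + 1)]


def smooth_spectrum(values, window=15):
    """Smooth interior points; edge points are left unchanged (assumption)."""
    weights = savgol_quadratic_weights(window)
    m = window // 2
    smoothed = list(values)
    for k in range(m, len(values) - m):
        segment = values[k - m:k + m + 1]
        smoothed[k] = sum(w * v for w, v in zip(weights, segment))
    return smoothed


spectrum = relative_reflectance([110.0, 60.0], [10.0, 10.0], [210.0, 110.0])
print(spectrum)
```

The spatial moving average mentioned in the text corresponds to the same convolution with 15 equal weights of 1/15 in place of the Savitzky-Golay weights.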

2.3. Plant species classification

Multivariate Bayesian classifiers were constructed using the SAS Discrim procedure (SAS, 2007) to distinguish tomato from weed species across the range of solar irradiance conditions. The classifier feature space was created by two canonical variables (CVs) calculated for each species at each irradiance level using the Candisc procedure in SAS software (SAS, 2007). One advantage of canonical discriminant analysis (CDA) is that it reduces the dimensionality of the feature space by projecting the original hyperspectral data of 241 wavebands (in the range of 400–795 nm) onto a two-dimensional orthogonal feature space (the dimension being the number of classes minus one; Cruz-Castillo et al., 1994) that maximises between-class variance relative to within-class variance (Duda, Hart, & Stork, 2001). This approach also enables visual assessment of the classification relationships between species in a two-dimensional plane. Complete external validation was conducted to evaluate the performance (indicated by pixel-based classification/error rates, %) of the discrimination models across different sunlight levels. The canonical variables of the external validation data (e.g., HSI plants) were determined in the calibration feature space (e.g., LSI plants). For the single sunlight conditions, 20-fold cross-validation analysis was applied (leaving out 5% of the data, representing approximately 68 plants, at each step). To gain more visual insight into the model robustness to the various solar irradiance levels, the a posteriori probability contours of the three species were illustrated in the canonical feature space for each validation condition.
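To make the classification step concrete, the sketch below fits one bivariate Gaussian per species in a two-dimensional canonical space and applies Bayes' rule to obtain a posteriori probabilities. It is an illustration only and does not reproduce the SAS Discrim or Candisc procedures: the training points are invented, equal priors are assumed, a separate covariance matrix per species is assumed, and all names are hypothetical.

```python
import math

# Hypothetical canonical-variable (CV1, CV2) training points for each species.
TRAINING = {
    "tomato": [(-2.0, 0.3), (-1.6, -0.3), (-2.4, -0.2), (-2.0, 0.2)],
    "black_nightshade": [(0.2, 2.3), (0.6, 1.7), (-0.2, 1.8), (0.2, 2.2)],
    "pigweed": [(2.0, -0.7), (2.4, -1.3), (1.6, -1.2), (2.0, -0.8)],
}


def fit_gaussian(points):
    """Return the mean vector and the covariance terms (sxx, sxy, syy)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def log_density(x, mean, cov):
    """Log of the bivariate normal density at point x."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    mahalanobis = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return -0.5 * mahalanobis - 0.5 * math.log(det) - math.log(2.0 * math.pi)


def posteriors(x, models, priors):
    """A posteriori probability of each species at x via Bayes' rule."""
    logs = {k: math.log(priors[k]) + log_density(x, *models[k]) for k in models}
    top = max(logs.values())  # subtract the maximum for numerical stability
    weights = {k: math.exp(v - top) for k, v in logs.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


MODELS = {species: fit_gaussian(pts) for species, pts in TRAINING.items()}
PRIORS = {species: 1.0 / len(MODELS) for species in MODELS}  # equal priors assumed

example = posteriors((-1.8, 0.1), MODELS, PRIORS)
print(max(example, key=example.get))
```

A pixel is assigned to the species with the largest a posteriori probability, so the decision boundary between two species lies where their a posteriori probabilities are equal.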


To investigate the wavebands that are most responsible for species classification across the variation of sunlight conditions, the full spectral range was segmented into four regions, i.e., 400–515 nm, 515–610 nm, 610–680 nm, and 680–795 nm, which were characteristic of the major absorption peaks of chlorophyll in the blue portion (Chlorophyll Bl), anthocyanin, chlorophyll in the red portion (Chlorophyll Rd) and red edge to NIR, respectively (Zwiggelaar, 1998). The multivariate discrimination models were constructed for each solar irradiance level with the reflectance information of the above four spectral segments eliminated individually. The modified classifiers were externally validated on the sunlight condition that was not represented in the training data set, and 20-fold internal cross-validation was used when the models were applied to the same condition as in the calibration. The accuracy of the resulting classification was compared to that achieved by the full-range models.
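The leave-one-region-out band selection can be sketched as follows. The band grid is an assumption: a regular 1.64 nm spacing starting at 400 nm, which yields the 241 wavebands mentioned in the text, and each region is treated as a half-open interval so that a band on a shared boundary is counted once. The function names are illustrative.

```python
START_NM, STOP_NM, STEP_NM = 400.0, 795.0, 1.64

# Spectral regions from the text, treated here as half-open intervals [low, high).
REGIONS = {
    "Chlorophyll Bl": (400.0, 515.0),
    "Anthocyanin": (515.0, 610.0),
    "Chlorophyll Rd": (610.0, 680.0),
    "NIR": (680.0, 795.0),
}


def band_centres():
    """Assumed band centres: a regular grid starting at 400 nm."""
    count = round((STOP_NM - START_NM) / STEP_NM)
    return [START_NM + i * STEP_NM for i in range(count)]


def bands_in(region):
    """Band centres that fall inside the named region."""
    low, high = REGIONS[region]
    return [w for w in band_centres() if low <= w < high]


def bands_without(region):
    """Band centres that remain after the named region is eliminated."""
    low, high = REGIONS[region]
    return [w for w in band_centres() if not (low <= w < high)]


print(len(band_centres()))
for name in REGIONS:
    print(name, len(bands_without(name)))
```

Each reduced band list would then feed the same canonical analysis and Bayesian classification as the full-range model, so that any change in error rate can be attributed to the eliminated region.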

2.4. Global calibration

The concept of global calibration (Shenk, Workman, & Westerhaus, 2008) was applied to improve the model robustness to the variability in sunlight conditions and to assess its ability to optimise the discriminant boundaries for a broad range of conditions and reduce misclassification when samples were not represented in the calibration data. In this study, the domain of the global data set was defined as the original hyperspectral database including the canopy reflectance spectra of all three species collected for the two solar irradiance treatments. Five to fifty percent of the plants in the global database were randomly selected and used as the training set to calibrate the three-species multivariate Bayesian classifiers. An ordinary bootstrap estimation process (Duda et al., 2001) was employed and repeated 100 times independently to generate the calibration set and determine the Bayesian parameters (means and covariance matrices of species groups). The calibrated models were validated independently on the remaining 95–50% of the database. The overall performance of the global classifiers was evaluated on a pixel basis by averaging among all 100 repetitions for the global condition and the individual local sunlight conditions.
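The repeated calibration/validation split can be sketched as below. This is a simplified stand-in rather than the study's bootstrap procedure: it draws the training plants by simple random sampling without replacement from the pooled plant list (plant totals taken from Table 1), does not stratify by species or treatment, and uses hypothetical names and a fixed seed for repeatability.

```python
import random

# Total plants per species in the global database (Table 1).
PLANTS_PER_SPECIES = {"tomato": 453, "black_nightshade": 453, "pigweed": 457}


def plant_ids():
    """One (species, index) identifier per plant in the global database."""
    return [(s, i) for s, n in PLANTS_PER_SPECIES.items() for i in range(n)]


def split_once(ids, fraction, rng):
    """Randomly assign a fraction of plants to calibration, the rest to validation."""
    n_train = round(fraction * len(ids))
    shuffled = ids[:]
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


def repeated_splits(fraction, repetitions=100, seed=0):
    """Independent calibration/validation splits, one per repetition."""
    rng = random.Random(seed)
    ids = plant_ids()
    return [split_once(ids, fraction, rng) for _ in range(repetitions)]


splits = repeated_splits(0.20)
calibration, validation = splits[0]
print(len(splits), len(calibration), len(validation))
```

In each repetition the classifier would be fitted on the calibration plants and scored on the validation plants, and the pixel-based error rates would then be averaged over all repetitions.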

3. Results and discussion

3.1. Canopy reflectance for solar irradiance treatments

The average canopy reflectance spectra (relative reflectance converted from original hyperspectral data) over the 384–810 nm range are displayed in Fig. 2 for the tomato plants grown under the two sunlight intensity treatments. The average reflectance spectra for black nightshade and pigweed were similar to those for tomato. Varying sunlight intensity appeared to shift the canopy reflectance over the full range with slight changes in spectral character. Plant foliage reflectance in the visible domain (400–700 nm) displays the broadband absorption peaks of leaf pigments such as chlorophyll (435 nm, 480 nm, 650 nm and 670–680 nm) and anthocyanin (400–550 nm) (Zwiggelaar, 1998). For the plants grown under the low sunlight intensity, the average reflectance was slightly lower at 400–515 nm and 610–680 nm, and was higher at 515–610 nm. Examination of the Savitzky-Golay 2nd derivatives of the reflectance spectra (Savitzky & Golay, 1964; Steinier et al., 1972; data not shown) showed that these changes may have been due to differences in leaf pigment concentrations. In the NIR part of the reflectance spectra, the LSI plants showed slightly higher average reflectance between 730 and 810 nm. The Savitzky-Golay 2nd derivatives (data not shown) of the original reflectance spectra showed few substantial changes in this region, which suggests that physical changes may have occurred in the leaves, altering the refractive index and hence the reflectance. Fig. 2 also shows some spectral differences between 680 and 730 nm, which may be associated with red-edge shift, an indicator of changes in chlorophyll concentration (Collins, 1978), possibly caused by treatment effects on plant vitality (Boochs, Kupfer, Docktor, & Kuhbauch, 1990) as a result of environmental stress (Collins, Chang, Raines, Canney, & Ashley, 1983).

Fig. 2 – Average relative reflectance spectra measured for tomato seedlings grown under high and low sunlight intensity conditions.

3.2. Classification of plant species with varying sunlight intensities

The validation error rates of the three-species Bayesian classifiers are displayed in Table 2 for tomato, black nightshade and pigweed, together with the total error rates for three species and for crop vs. weed (errors between black nightshade and pigweed were ignored). The main diagonal groups in Table 2 (LSI calibration validated on LSI, and HSI calibration validated on HSI) are the internal cross-validation results for the two training sets. The remaining off-diagonal column groups show the classification error rates of the complete external validation for the two models. Within each column group, from left to right, the individual columns present the validation results of the classifiers trained on the corresponding sunlight condition data sets, using the canonical feature vectors based upon the full spectrum (400–795 nm) and the full spectral range with the chlorophyll blue (Bl) portion (400–515 nm), anthocyanin portion (515–610 nm), chlorophyll red (Rd) portion (610–680 nm), and NIR portion (680–795 nm) excluded, respectively.

Table 2 – Validation error rates (%, pixel-based) of canonical Bayesian models for plant recognition under two solar irradiance conditions using the full spectral range (Full) and spectral sections selectively excluding major absorption wavelengths characteristic for chlorophyll in blue portion (Chl Bl), anthocyanin (Anth.), chlorophyll in red portion (Chl Rd) and red-edge to NIR (NIR).

                                     Validation(a) on LSI                   Validation(a) on HSI
Calibration  Error rate (%) of       Full   Chl Bl  Anth.  Chl Rd  NIR      Full   Chl Bl  Anth.  Chl Rd  NIR
LSI          Tomato                  12.2   21.3    13.6   13.4    14.0     17.8   32.7    18.9   18.6    17.7
LSI          Black nightshade        17.5   21.6    18.5   19.0    23.7     9.1    13.7    9.9    9.2     16.1
LSI          Pigweed                 6.5    7.7     7.1    7.0     7.7      5.9    3.5     6.1    7.2     11.7
LSI          Total (three species)   11.8   16.8    12.8   12.8    14.7     10.8   16.0    11.5   11.8    14.7
LSI          Total (crop-weed)       11.5   16.2    12.5   12.6    14.1     10.3   15.0    10.9   11.2    13.4
HSI          Tomato                  3.4    8.0     4.0    4.1     7.3      5.5    16.4    6.1    6.0     7.0
HSI          Black nightshade        53.2   60.3    52.1   52.1    50.4     8.7    15.4    9.3    10.1    13.4
HSI          Pigweed                 12.9   13.5    16.5   13.4    18.5     2.9    2.3     3.2    3.2     3.3
HSI          Total (three species)   21.0   24.9    22.1   21.1    23.5     4.9    10.0    5.4    5.5     6.5
HSI          Total (crop-weed)       20.8   24.3    21.9   20.8    22.8     4.7    9.5     5.1    5.2     6.1

(a) Results for which the validation condition matches the calibration condition (LSI on LSI and HSI on HSI) were obtained by 20-fold cross-validation analysis, i.e. the three-species classifiers were tested in solar irradiance conditions represented in the training set.

The results of the internal cross-validation based on the spectral reflectance of the full range (left-most column in each of the main diagonal groups) show that the classifiers overall correctly identified 88.2% and 95.1% of the canopy pixels for the LSI and HSI treatments, respectively. In the LSI training set, 12.2% of tomato, 17.5% of black nightshade and 6.5% of pigweed pixels were misclassified, while 5.5% of tomato, 8.7% of black nightshade and 2.9% of pigweed were misidentified for the HSI treatment. According to Vargas, Fischer, Kempen, and Wright (1996), an automated weed control machine with an efficacy above 85% would have superior performance to the weed control typically achieved by many hand labour crews. These results demonstrate that the canopy reflectance collected using ground-based hyperspectral imaging technology has the potential of discriminating tomato seedlings from weeds, regardless of solar irradiance intensities.

The overall cross-validation performance of the HSI model was superior to that of the LSI model. This indicates that the plant species in this study are easier to distinguish from one another when exposed to more intense sunlight, as typically occurs later in the season, and may also have implications for regions receiving higher solar power. It was hypothesised that the spectral signatures induced by the HSI condition were species-dependent and made the species more distinct in the feature space. The misclassification rates of pigweed in the training set were about half or less of those of the other two species for both levels of solar irradiance intensity. In addition, the total error rates for crop vs. weed discrimination were not substantially lower than those for the three species, indicating that most classification errors occurred between tomato and black nightshade. These results are not surprising given that pigweed is taxonomically different from tomato and black nightshade, both of which are in the same family.

For all validation results across the two sunlight treatments, the total three-species error rates of the full-spectrum classifiers ranged from 4.9% to 21.0% (Table 2). The optimal classification accuracy occurred along the diagonal cells showing the cross-validation results when the models were applied to the same condition as in the training set. The effects of sunlight intensity on the discrimination performance of the Bayesian models were not symmetric.
In the HSI treatment, 10.8% of the plant pixels were misclassified using the LSI model, whereas 21.0% of the plant pixels in the LSI treatment were misidentified using the HSI model. In general, the model based on the LSI plant data was more robust over the variation in reflectance spectra due to changing sunlight intensities. The overall classification performance of the LSI model actually improved by about 1.0 percentage point (the three-species total error rate dropped from 11.8% to 10.8%) when applied to the HSI plant spectra. This is probably because the LSI model relied more on general between-species features in canopy reflectance that are presumably stable across sunlight intensity levels, rather than spectral signatures uniquely associated with high solar irradiance. An interaction was also observed between the plant species and the sunlight intensity. For tomatoes, the validation error rates of the HSI model were 3.4–5.5%, compared to 12.2–17.8% for the LSI model. The classification performance for tomatoes was both superior and more stable on average when the model was trained with the HSI plant spectra. For the two weed species, the opposite effect was observed. In black nightshade, the validation error rates increased from 9.1–17.5% when applying the LSI classifier to 8.7–53.2% when using the HSI classifier. Using the LSI classifier, 5.9–6.5% of the pigweed pixels were misclassified, whereas the error rate was 2.9–12.9% for the HSI model. The LSI classifier was able to provide a higher and relatively more robust classification accuracy across the variation in sunlight intensity.
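The asymmetry discussed above can be summarised with simple arithmetic on the full-spectrum, three-species total error rates in Table 2. The sketch below uses only those four values; the function names are illustrative, and the unweighted mean over the two validation conditions is an assumption about how a single figure per model might be formed.

```python
# Full-spectrum, three-species total error rates (%) from Table 2,
# keyed as (calibration condition, validation condition).
TOTAL_ERROR = {
    ("LSI", "LSI"): 11.8,
    ("LSI", "HSI"): 10.8,
    ("HSI", "LSI"): 21.0,
    ("HSI", "HSI"): 4.9,
}
CONDITIONS = ("LSI", "HSI")


def accuracy(calibration, validation):
    """Classification accuracy (%) as the complement of the total error rate."""
    return 100.0 - TOTAL_ERROR[(calibration, validation)]


def transfer_change(calibration):
    """Change in total error (percentage points) when a model is applied
    to the sunlight condition it was not calibrated on."""
    other = "HSI" if calibration == "LSI" else "LSI"
    return TOTAL_ERROR[(calibration, other)] - TOTAL_ERROR[(calibration, calibration)]


def mean_accuracy(calibration):
    """Unweighted mean accuracy (%) over both validation conditions."""
    return sum(accuracy(calibration, v) for v in CONDITIONS) / len(CONDITIONS)


for model in CONDITIONS:
    print(model, round(transfer_change(model), 1), round(mean_accuracy(model), 2))
```

Under this unweighted mean, the LSI model averages 88.7% across the two conditions, which coincides with the figure quoted in the abstract, while the HSI model loses 16.1 percentage points when moved to the LSI spectra.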


3.3. Contour maps of species population distribution with varying sunlight intensities The two-dimensional Gaussian distributions of the a posteriori (or posterior) probability densities were examined for the three species over varying sunlight intensities using the CDA technique. Fig. 3 displays the a posteriori distribution profiles of the HSI plants in a canonical feature space determined by the spectra data of the same sunlight condition. The hyperspectral image pixels closer to the peaks of the Gaussian profiles had higher a posteriori probabilities than the ones near the bottom. To better visualise the classification performances with respect to various sunlight intensities, the Gaussian profiles in Fig. 3 were projected onto the two-dimensional canonical feature space, producing population contour maps of all three species shown in Fig. 4. The contour lines are the connection of the points on the Gaussian profiles with equal a posteriori probabilities. Bayesian discriminant boundaries were determined where the a posteriori probabilities of each of the two species were equalised (Duda et al., 2001). Three boundary segments were yielded accordingly. The location of the contour centroids were determined by the mean of the canonical feature vectors, and the shapes of the contours were determined by the covariance matrices. Fig. 4 consists of four contour maps, each displaying the classification results of the LSI and HSI spectral data (shown in the columns) using the models trained on the two solar irradiance conditions respectively (shown in the rows). Along the main diagonal are the calibration performances of the LSI and HSI models. The off-diagonal plots show the complete external validation results when the species a posteriori probabilities of one sunlight condition (e.g., HSI) was determined by the model calibrated under the other sunlight condition (e.g., LSI). 
Fig. 3 – Two-dimensional Gaussian profiles of a posteriori probability densities of the HSI tomato (blue), black nightshade (green) and pigweed (red) in canonical feature space established on the HSI condition.

Fig. 4 – Canonical contour maps of validation a posteriori probabilities of three species (tomato: blue; black nightshade: green; pigweed: red) for two sunlight intensity treatments using the full spectral range of canopy reflectance data. Bayesian discriminant boundaries are depicted as solid black curves.

Bayesian decision boundaries of the LSI and HSI models are also depicted as solid black lines in Fig. 4. The canonical contour maps in Fig. 4 allow visual inspection of the validation results presented in the left-most column in each of the corresponding column groups in Table 2. The bottom left contour map, for example, shows that the population distributions of black nightshade and pigweed shifted substantially towards the tomato-weed boundaries when the HSI classifier was applied to the LSI spectra. These shifts yielded drastic increases in the misclassification rates of black nightshade (from 17.5% to 53.2%, Table 2) and pigweed (from 6.5% to 12.9%, Table 2) compared to the calibration performance of the LSI model (top left contour map in Fig. 4).

Generally, the two principal factors that may change the performance of the Bayesian classifiers are the perpendicular distance from the contour centroids (determined by the means of the feature vectors) to the corresponding discriminant boundaries and the shape of the population contours (determined by the covariance matrices). Fig. 4 indicates that the centroids shifted, and the contours rotated and changed shape, when the plant spectra of one sunlight condition were tested using the models optimised for the other condition. Fig. 4 also shows that the validation accuracy of the models was influenced predominantly by the movement of the population centroids rather than by the distortion or rotation of the contours. Upon validation, the centroids of the tomato and black nightshade contours moved considerably across the sunlight treatments, which resulted in significant changes in the distance of the centres from the tomato/nightshade boundaries. In contrast, the distances from the pigweed centroids to the pigweed-tomato/nightshade boundaries were relatively stable with changing sunlight conditions (except when the HSI model was applied to the LSI condition; also indicated in Table 2 with a pigweed error rate of 12.9%). This observation may explain the smaller variation in the pigweed error rates (2.9–6.5%, Table 2) across the studied solar irradiance range when compared to the other two species. It also confirms that sunlight intensity primarily affected discrimination between tomato and black nightshade. In particular, the stability of the black nightshade classification rates was heavily influenced by the variation in incident sunlight intensity (Table 2 and Fig. 4).
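The effect of a centroid moving towards a discriminant boundary can be sketched numerically. The example below is a deliberate simplification, not the classifier used in this study: it assumes a straight boundary and a population with unit-variance, isotropic Gaussian scatter, in which case the misclassified share equals the standard normal CDF evaluated at the signed centroid-to-boundary distance. The boundary and centroid coordinates are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def signed_distance(point, normal, offset):
    """Signed perpendicular distance from `point` to the line normal . x = offset."""
    return (normal[0] * point[0] + normal[1] * point[1] - offset) / math.hypot(*normal)

def misclassification_rate(centroid, normal, offset):
    """Share of a unit-variance isotropic Gaussian population lying on the
    positive (wrong) side of the line; the class's own region is the negative side."""
    return normal_cdf(signed_distance(centroid, normal, offset))

# Hypothetical boundary x = 1.5 in a two-dimensional canonical space.
BOUNDARY_NORMAL, BOUNDARY_OFFSET = (1.0, 0.0), 1.5
calibration_centroid = (0.0, 0.0)  # hypothetical centroid under the training condition
shifted_centroid = (1.0, 0.0)      # hypothetical centroid under the other condition

if __name__ == "__main__":
    for label, centroid in (("calibration", calibration_centroid),
                            ("shifted", shifted_centroid)):
        rate = misclassification_rate(centroid, BOUNDARY_NORMAL, BOUNDARY_OFFSET)
        print(f"{label}: {100 * rate:.1f}% misclassified")
```

Under these assumptions a shift of one unit towards the boundary raises the error several-fold, which matches the qualitative pattern read from Fig. 4: centroid movement, rather than contour shape, drives the change in error rate.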

3.4. Characteristic spectral regions for species classification

The validation error rates of the modified models, each of which had one of the four defined characteristic spectral segments (400–515 nm, 515–610 nm, 610–680 nm and 680–795 nm) removed, are also shown in Table 2. The results indicate that the original patterns of classification accuracy were retained in all validation results for the modified models. Excluding the four regions individually did not substantially improve the robustness of the classifier to the various sunlight conditions. Removing the chlorophyll red bands slightly alleviated the sunlight effects on crop vs. weed discrimination (the standard deviation across the two sunlight treatments decreased from 6.7% to 6.4%). However, this improvement was not significant and it came with some penalty in overall performance.


Table 3 – Validation error rates of canonical Bayesian classifier using global calibration with bootstrap estimation technique.

Validation error rate (a) (%, pixel-based) by number of plants in training set (%, plant-based)

Solar irradiance condition of global calibration data      5        10       20       30       40       50
LSI                                                        13.0 a   12.6 b   12.4 c   12.4 c   12.3 c   12.3 c
HSI                                                         8.0 a    7.7 b    7.6 bc   7.6 c    7.5 c    7.6 c
All solar irradiance treatments combined                   11.0 a   10.6 b   10.5 c   10.4 c   10.4 c   10.4 c

(a) Mean values averaged over the 100 bootstrap repetitions; values followed by the same letter within a row are not significantly different by the Tukey multiple comparison method at the 0.05 significance level; the minimum significant distance was about 0.10% for the global condition, 0.15% for the LSI and 0.13% for the HSI treatments.

The overall classification performance was degraded to different degrees by selectively eliminating the defined regions. For all applications, removing the chlorophyll blue portion or the red edge to NIR region incurred the largest penalties on classification accuracy. This finding is consistent with the conclusion of previous studies that the blue and NIR wavelengths are important for plant species identification using foliage reflectance acquired on the ground or at low altitude (Brown & Noble, 2005; Slaughter et al., 2004).

Although the importance of the chlorophyll blue bands and the NIR region to species discrimination was established by the overall three-species classification accuracy, the same was not true for individual species. For tomato and black nightshade, the chlorophyll blue portion was the most critical region: the classification error rates (averaged over all applications) increased by 9.9 percentage points (from 9.7% to 19.6%) and 5.7 percentage points (from 22.1% to 27.8%), respectively, when it was removed from the models. The NIR region contained the second most useful information, accounting for a performance degradation of 1.8 percentage points (from 9.7% to 11.5%) for tomato and 3.8 percentage points (from 22.1% to 25.9%) for nightshade. Removing the anthocyanin and chlorophyll red wavelength regions cost less than 1 percentage point of classification accuracy for tomato (error rate increased from 9.7% to 10.6%) and nightshade (error rate increased from 22.1% to 22.6%). For pigweed identification, the NIR bands played the most important role, with a 3.2 percentage point increase in error rate (from 7.1% to 10.3%) upon their removal. Excluding the chlorophyll blue region, on the other hand, slightly improved the average classification accuracy by 0.3 percentage points (error rate decreased from 7.1% to 6.8%). Therefore, the variation of canopy reflectance associated with the spectral signatures in the characteristic regions was dependent on species.
Excluding any of the defined spectral regions barely improved the robustness of the Bayesian classifiers with respect to various sunlight intensities.
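The region-removal procedure and the resulting penalties can be sketched as follows. The wavelength limits and the error rates are those quoted in the text; pairing the region names with the limits in the order listed, the 5 nm band spacing and the interval conventions are assumptions made for illustration only.

```python
FULL_RANGE = (400, 795)  # nm, full spectral range used by the classifiers

# Assumption: region names are paired with the wavelength limits in the order
# the text lists them; the text does not state this mapping explicitly.
REGIONS = {
    "chlorophyll_blue": (400, 515),
    "anthocyanin": (515, 610),
    "chlorophyll_red": (610, 680),
    "red_edge_nir": (680, 795),
}

def kept_band_indices(wavelengths, removed_region):
    """Indices of the bands that remain after one characteristic region is removed."""
    lo, hi = REGIONS[removed_region]

    def in_region(w):
        # Half-open interval [lo, hi), except that the last region also includes 795 nm.
        return lo <= w < hi or (hi == FULL_RANGE[1] and w == hi)

    return [i for i, w in enumerate(wavelengths) if not in_region(w)]

# Error rates (%) quoted in the text: (full-spectrum model, model with region removed).
REPORTED_ERROR = {
    ("tomato", "chlorophyll_blue"): (9.7, 19.6),
    ("tomato", "red_edge_nir"): (9.7, 11.5),
    ("black_nightshade", "chlorophyll_blue"): (22.1, 27.8),
    ("black_nightshade", "red_edge_nir"): (22.1, 25.9),
    ("pigweed", "chlorophyll_blue"): (7.1, 6.8),
    ("pigweed", "red_edge_nir"): (7.1, 10.3),
}

def error_penalty(species, region):
    """Change in error rate, in percentage points, caused by removing a region."""
    baseline, after_removal = REPORTED_ERROR[(species, region)]
    return round(after_removal - baseline, 1)

# Hypothetical band centres every 5 nm; the sensor's real band spacing is not given here.
wavelengths = list(range(400, 800, 5))

if __name__ == "__main__":
    for name in REGIONS:
        print(name, len(kept_band_indices(wavelengths, name)), "bands kept")
    for key in REPORTED_ERROR:
        print(key, error_penalty(*key), "percentage points")
```

A negative penalty, as for pigweed with the chlorophyll blue region removed, means the reduced model performed slightly better for that species.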

3.5. Global calibration

It was noted from Table 2 that the Bayesian discriminant models calibrated exclusively on a local condition (i.e., LSI or HSI) were not robust to varying solar irradiance intensity. Global calibration has the potential to improve classifier stability and offers a route to automated machine learning that achieves weed detection robust to varying sunlight intensities. The bootstrap means of the total validation error rates of global calibration are tabulated in Table 3 for the global condition and for each sunlight condition (while the bootstrap estimates were asymptotically unbiased, it is possible that the estimates reported here are slightly optimistic; Duda et al., 2001). The variation in the error rates across the range of solar irradiance was about 5%, independent of the size of the global calibration data set. Compared to the variation in the error rates obtained by the single-condition calibrations (from 4.9% to 21.0% in Table 2), global calibration greatly improved the stability of species classification over various sunlight intensities and provided a promising method to alleviate the sunlight effects on plant recognition. For all sampling rates from 5% to 50% tested for global calibration, the total error rates of the HSI data were consistently about 5% lower than those of the LSI spectra. This finding is in accordance with the cross-validation results in Table 2, showing that it was easier to distinguish tomato seedlings from the weeds when the solar irradiance was more intense.

The Tukey multiple comparison of the bootstrapping results showed that the performance of the global classifier was optimised (error rate of 10.5%) when it was trained with 20% of the global data set (representing 91 plants of each species) and could not be improved significantly (α = 0.05) by increasing the size of the training set. Training the model with 5% of the global set (representing 23 plants of each species) misrecognised 11.0% of the plant canopy overall, which was 0.5 percentage points higher than the optimal performance. This increase in classification error may cause yield loss, although a 5% sampling rate could reduce the cost of in-field data collection threefold compared to a 20% rate. Therefore, the cost savings need to be weighed against the potential yield penalty when applying a lower sampling rate for global calibration in agricultural practice.
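The bootstrap comparison of training-set sizes can be sketched as below. This is an illustrative stand-in rather than the procedure of this study: it uses synthetic one-dimensional scores for two hypothetical classes, a nearest-class-mean rule in place of the canonical Bayesian classifier, and validation on the samples left out of each bootstrap draw. All names and parameter values are assumptions.

```python
import random
import statistics

def make_population(rng, n_per_class=200):
    """Synthetic 1-D scores for two hypothetical classes (not data from the study)."""
    population = [(rng.gauss(0.0, 1.0), "crop") for _ in range(n_per_class)]
    population += [(rng.gauss(3.0, 1.0), "weed") for _ in range(n_per_class)]
    return population

def train_class_means(samples):
    """Fit a nearest-class-mean rule: one mean score per class label."""
    labels = sorted({label for _, label in samples})
    return {lab: statistics.fmean([x for x, l in samples if l == lab]) for lab in labels}

def error_rate(class_means, samples):
    """Fraction of samples whose nearest class mean is not their true class."""
    wrong = 0
    for x, label in samples:
        predicted = min(class_means, key=lambda name: abs(x - class_means[name]))
        wrong += predicted != label
    return wrong / len(samples)

def bootstrap_error(population, fraction, repetitions=100, seed=0):
    """Mean held-out error over bootstrap training sets of the given size fraction."""
    rng = random.Random(seed)
    n_train = max(2, round(fraction * len(population)))
    errors = []
    for _ in range(repetitions):
        # Draw the training set with replacement.
        drawn = [rng.randrange(len(population)) for _ in range(n_train)]
        train = [population[i] for i in drawn]
        if len({label for _, label in train}) < 2:
            continue  # skip a draw that misses a class entirely
        drawn_set = set(drawn)
        held_out = [s for i, s in enumerate(population) if i not in drawn_set]
        errors.append(error_rate(train_class_means(train), held_out))
    return statistics.fmean(errors)

if __name__ == "__main__":
    population = make_population(random.Random(1))
    for fraction in (0.05, 0.10, 0.20, 0.30, 0.40, 0.50):
        print(f"{fraction:.0%} training set: {100 * bootstrap_error(population, fraction):.1f}% error")
```

Comparing the mean errors across fractions, for example with a multiple comparison test over the per-repetition values, would show where additional training plants stop giving a meaningful gain.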

4. Conclusions

This study investigated the influence of varying solar irradiance during plant development on the performance of a hyperspectral imaging-based plant sensor for robust species identification in the context of autonomous intra-row weed control for processing tomatoes. The canopy reflectance of tomato and two weed species, black nightshade and pigweed, changed over the visible/NIR range of 384–810 nm with variation in solar irradiance in this study. Multivariate Bayesian classification models were developed based upon the reflectance information in 400–795 nm for the two solar irradiance treatments individually using canonical discriminant analysis. Cross-validation results of applying the models to the same sunlight conditions as represented in the training sets showed that, overall, the models correctly identified 88.2% and 95.1% of the plant canopy for the low and high solar irradiance treatments, respectively. Canopy reflectance collected in this study using ground-based hyperspectral imaging technology demonstrated the potential for distinguishing tomato seedlings from weeds over various solar irradiance intensities. The potential did not appear to be substantially diminished by changes in reflectance caused by varying incident sunlight intensities as long as the same training condition was applied. With an overall accuracy of 95.1%, compared to 88.2% for low solar irradiance, high solar irradiance made the plant species in this study more distinguishable in the feature space and improved species classification for same-sunlight-condition applications.

The classification performance was unstable, with overall accuracy ranging from 79% to 95.1%, when models were applied to the reflectance spectra of plants grown under sunlight conditions that deviated from that of the training data. The classification rate of the low solar irradiance model across all solar irradiance spectra was 88.7% with a standard deviation of 0.7%, while that of the high solar irradiance model was 87% with a standard deviation of 11.4%. In general, the model optimised for the low solar irradiance condition provided superior and more stable discrimination performance under changing solar irradiance. The model trained over the global sample domain, represented by the spectral data collected under both sunlight conditions used in this study, achieved overall classification rates of 87.5% for the low solar irradiance condition and 92.3% for the high solar irradiance condition.
Global calibration provided a potential technique to alleviate the sunlight effect on plant species discrimination and to improve classification robustness of the hyperspectral imaging-based plant sensor under varying solar irradiance growing conditions. Global calibration with a 20% sampling rate (representing 91 plants of each species) in this study was sufficient to produce an optimal performance of about 90% correctly classified plant foliage.
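The mean and standard deviation quoted for the high solar irradiance model can be checked with a short calculation. The sketch assumes that the two accuracies being summarised are the 95.1% (same-condition) and 79% (cross-condition) values quoted in the text, and that the reported figure is the sample standard deviation (n - 1 in the denominator); neither assumption is stated explicitly in the source.

```python
import statistics

# Assumed inputs: overall accuracies (%) of the HSI-trained model on HSI and LSI spectra.
hsi_model_accuracy = [95.1, 79.0]

mean_accuracy = statistics.mean(hsi_model_accuracy)
sample_sd = statistics.stdev(hsi_model_accuracy)  # sample standard deviation (n - 1)

if __name__ == "__main__":
    print(f"mean = {mean_accuracy:.2f}%, standard deviation = {sample_sd:.1f}%")
```

Under these assumptions the result agrees with the reported 87% and 11.4%, which suggests that each standard deviation in this section summarises just two accuracy values per model.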

Acknowledgements

Partial funding for this project was provided by the California Tomato Research Institute. The authors would also like to acknowledge the technical assistance provided by Burt Vannucci, Chris Gliever and Pamela Riley from the University of California, Davis.

References

Åstrand, B., & Baerveldt, A.-J. (2002). An agricultural mobile robot with vision-based perception for mechanical weed control. Autonomous Robots, 13(1), 21–35.
Barton, W. (1930). Recent advances in plant physiology. London, U.K.: J And A Churchill.
Biller, R. H. (1998). Reduced input of herbicides by use of optoelectronic sensors. Journal of Agricultural Engineering Research, 71(4), 357–362.
Blasco, J., Aleixos, N., Roger, J. M., Rabatel, G., & Moltó, E. (2002). Robotic weed control using machine vision. Biosystems Engineering, 83(2), 149–157.
Boochs, F., Kupfer, G., Docktor, K., & Kuhbauch, W. (1990). Shape of the red edge as vitality indicator for plants. International Journal of Remote Sensing, 11(10), 1741–1753.
Borregaard, T., Nielsen, H., Norgaard, L., & Have, H. (2000). Crop-weed discrimination by line imaging spectroscopy. Journal of Agricultural Engineering Research, 75(4), 389–400.
Brown, R. B., & Noble, S. D. (2005). Site-specific weed management: sensing requirements – what do we need to see? Weed Science, 53(2), 252–258.
Chalker-Scott, L. (1999). Environmental significance of anthocyanins in plant stress responses. Photochemistry and Photobiology, 70(1), 1–9.
Collins, W. (1978). Remote sensing of crop type and maturity. Photogrammetric Engineering & Remote Sensing, 44(1), 43–55.
Collins, W., Chang, S.-H., Raines, G., Canney, F., & Ashley, R. (1983). Airborne biogeophysical mapping of hidden mineral deposits. Economic Geology and the Bulletin of the Society of Economic Geologists, 78(4), 737–749.
Cruz-Castillo, J. G., Ganeshanandam, S., MacKay, B. R., Lawes, G. S., Lawoko, C. R. O., & Woolley, D. J. (1994). Applications of canonical discriminant analysis in horticultural research. Horticultural Science, 29(10), 1115–1119.
Duda, R. O., Hart, P. E., & Stork, D. G. (2001). Pattern classification (2nd ed.). New York, N.Y: John Wiley and Sons.
Feyaerts, F., & van Gool, L. (2001). Multi-spectral vision system for weed detection. Pattern Recognition Letters, 22(6–7), 667–674.
Guyot, G. (1990). Optical properties of vegetation canopies. In M. D. Steven, & J. A. Clark (Eds.), Applications of remote sensing in agriculture (pp. 19–44). London, U.K: Butterworths.
Henry, W. B., Shaw, D. R., Reddy, K. R., Bruce, L. M., & Tamhankar, H. D. (2004). Spectral reflectance curves to distinguish soybean from common cocklebur (Xanthium strumarium) and sicklepod (Cassia obtusifolia) grown with varying soil moisture. Weed Science, 52(5), 788–796.
Lee, W. S., Slaughter, D. C., & Giles, D. K. (1999). Robotic weed control system for tomatoes. Precision Agriculture, 1(1), 95–113.
NREL. (2009). United States Department of Energy National Renewable Energy Laboratory concentrating solar power radiation (10 km) – Static maps: 1998–2005. Available at http://www.nrel.gov/gis/solar.html Accessed 17.09.10.
SAS. (2007). SAS OnlineDoc, 9.2. Cary, NC: SAS Institute, Inc.
Savitzky, A., & Golay, M. J. E. (1964). Smoothing and differentiation of data by simplified least squares procedures. Analytical Chemistry, 36(8), 1627–1639.
Shenk, J. S., Workman, J. J., Jr., & Westerhaus, M. O. (2008). Application of NIR spectroscopy to agricultural products. In D. A. Burns, & E. W. Ciurczak (Eds.), Handbook of near-infrared analysis (pp. 347–386). Boca Raton, Fla.: CRC Press Taylor & Francis Group.
Slaughter, D. C., Giles, D. K., & Downey, D. (2008). Autonomous robotic weed control systems: a review. Computers and Electronics in Agriculture, 61(1), 63–78.
Slaughter, D. C., Giles, D. K., Fennimore, S. A., & Smith, R. F. (2008). Multispectral machine vision identification of lettuce and weed seedlings for automated weed control. Weed Technology, 22(2), 378–384.
Slaughter, D. C., Lanini, W. T., & Giles, D. K. (2004). Discriminating weeds from processing tomato plants using visible and near-infrared spectroscopy. Transactions of the ASAE, 47(6), 1907–1911.
Steinier, J., Termonia, Y., & Deltour, J. (1972). Comments on smoothing and differentiation of data by simplified least square procedure. Analytical Chemistry, 44(11), 1906–1909.
Tian, L. F., & Slaughter, D. C. (1998). Environmentally adaptive segmentation algorithm for outdoor image segmentation. Computers and Electronics in Agriculture, 21(3), 153–168.
UCD PGF. (2008). University of California at Davis plant growth facilities – Research greenhouse soil and fertilizer. Available at http://greenhouse.ucdavis.edu/research/materials/mediafert.html Accessed 12.04.08.
UC IPM. (2009). Pest management guidelines: tomatoes. UC statewide integrated pest management program. University of California Agriculture and Natural Resources. Available at http://www.ipm.ucdavis.edu/PMG/selectnewpesttomatoes.html Accessed 06.02.10.
Vargas, R., Fischer, W. B., Kempen, H. M., & Wright, S. D. (1996). Cotton weed management. In S. J. Hake, T. A. Kerby, & K. D. Hake (Eds.), Cotton production manual (pp. 187–202). Oakland, Cal: University of California, Division of Agriculture and Natural Resources.
Vrindts, E., Baerdemeaeker, J. D., & Ramon, H. (2002). Weed detection using canopy reflection. Precision Agriculture, 3(1), 63–80.
Woebbecke, D. M., Meyer, G. E., Von Bargen, K., & Mortensen, D. A. (1995). Color indices for weed identification under various soil, residue, and lighting conditions. Transactions of the ASAE, 38(1), 259–269.
Zhang, Y., & Slaughter, D. C. (2011). Hyperspectral species mapping for automatic weed control in tomato under thermal environmental stress. Computers and Electronics in Agriculture, 77(1), 95–104.
Zhang, Y., Staab, E. S., Slaughter, D. C., Giles, D. K., & Downey, D. (2009). Precision automated weed control using hyperspectral vision identification and heated oil. Paper No. 1009313 The 2009 ASABE Annual International Meeting.
Zwiggelaar, R. (1998). A review of spectral properties of plants and their potential use for crop/weed discrimination in row-crops. Crop Protection, 17(3), 189–206.