Computers and Electronics in Agriculture 57 (2007) 88–98
Crop identification with wavelet packet analysis and weighted Bayesian distance

Jui Jen Chou a,∗, Chun Ping Chen b, Joannie T. Yeh c

a Department of Bio-industrial Mechatronics Engineering, National Taiwan University, No. 1, Roosevelt Road, Section 4, Taipei, Taiwan
b Wintech Microelectronics Co., Taipei, Taiwan
c College of Medicine, University of Illinois, Chicago, IL, USA

∗ Corresponding author. Tel.: +886 2 23635375; fax: +886 2 23627620. E-mail address: [email protected] (J.J. Chou).
Received 14 November 2006; received in revised form 16 February 2007; accepted 16 February 2007
Abstract

This study proposes a novel approach for crop identification that uses the wavelet packet transform combined with a weighted Bayesian distance based on crop texture and leaf features. Automatic processing of agricultural produce requires accurate identification of crops so that plants can be targeted for treatment according to their needs. Wavelet analysis, which offers spatial/frequency localization, data compression, denoising, and data analysis/data mining, is a good candidate for identifying crops. If the energy of the wavelet packet coefficients is the sole identifying characteristic, however, results may vary significantly with factors such as weather, plant density, growth stage, and sunlight. To overcome these variations, the weighted Bayesian distance was introduced as the identification criterion, also referred to as the decision distance, where the weighting is based on the statistics of crop texture and leaf shape. By using the decision distance under the different weather conditions of three consecutive days of photography, crop identification achieves an accuracy of 94.63%.
© 2007 Elsevier B.V. All rights reserved.

Keywords: Crop; Identification; Wavelet analysis; Wavelet packet; Bayesian distance
1. Introduction

Automation is a key element for achieving competitiveness by effectively reducing labor costs and enhancing product quality. The identification of crops is critical to automating operations and processes in agriculture. However, operations and processes such as sprinkling, fertilization, weed control, pest prevention, post-harvest processing, and quality inspection vary with crop species and growth stage. Therefore, successful identification of crop species and growth stages is a prerequisite to automated operation and processing.

A key component of the technology that allows nondestructive identification or classification of the objects of interest is machine vision. In image processing, feature extraction based on texture is one of the most popular approaches. Ghate et al. (1993) employed machine vision systems to acquire the texture characteristics of peanuts to determine their maturity; Liao et al. (1993) developed a nondestructive machine vision inspection system for corn kernels; Heinemann
et al. (1994) utilized machine vision to automatically classify mushrooms; Tao et al. (1995) applied image processing techniques to differentiate potatoes and apples based on color. In all of the above machine vision studies, texture analysis proves to be a common and useful method for the classification or identification of crops.

Image analysis based on texture is generally categorized into three different methods: statistical, structural, and spectral. The statistical method exploits the local correlation of image pixels to extract texture features, e.g., spatial distributions of gray levels, co-occurrence probabilities, or run length statistics (Heinemann et al., 1994), and is common in applications such as navigation and satellite imaging. The structural method investigates the spatial organization or geometrical interrelationships between primitive elements in images. The spectral approach treats texture as a spatial frequency distribution and uses the Fourier transform, wavelet (packet) transform, or other filters (filter banks) to analyze the texture features over frequencies.

The wavelet transform is a powerful technique for analyzing information stored in images or signals (Kokare et al., 2005; Park et al., 2002). Unlike traditional texture analysis, wavelet-based texture analysis improves space and frequency localization; features of interest are easy to obtain at any local portion of an image or signal, which makes the technique convenient. Wavelet analysis has shown great applicability in many fields, including data approximation/compression, denoising, space/frequency analysis, pattern recognition, source identification, fingerprint analysis, and data mining.

This study developed a crop identification system that uses the wavelet packet transform and a weighted Bayesian distance to describe crop features and classify crops. The wavelet packet transform is more appropriate for analyzing meaningful middle and high frequency bands in an image (Chang and Kuo, 1993). While the wavelet transform is a pyramid-structured method that can only analyze low frequency bands, the wavelet packet transform (Coifman and Wickerhauser, 1992) analyzes the various frequency channels one step further through a tree-structured method. The energy coefficients derived from the wavelet packet transform describe the features of the crop texture and allow the weighted Bayesian distance to be calculated. To address environmental influences and intra-species variations, information from crop texture and leaf shape was used to calculate the weighted Bayesian distance as the crop identification criterion.

2. Materials and methods

The field crop images used in this research were obtained with a digital camera at a distance of one meter perpendicular to the crops. Fig. 1 shows the images of 10 different crops. The resolution of the images is 640 × 480 pixels. To explore the effect of weather on the identification results, pictures were captured on three consecutive days and numbered according to the time the images were taken. As shown in Table 1, Images 1-1 to 1-4 (shaded in gray) taken on the first day were used for training; Image 1-5 from the first day and 2-1 to 2-5 from the second day were treated as testing samples; and 3-1 to 3-4 from the third day were selected as validation samples. The process of crop identification is divided into two phases, as shown in Fig. 2: training and testing/validation. First, the captured color images were converted to grayscale to simplify the identification process.
Then, the wavelet packet transform was utilized to extract the features of the images. Fig. 3(a) shows the original crop image, and Fig. 3(b) shows the image after three levels of wavelet packet transform. To obtain enough training and testing samples, each 640 × 480-pixel image was dissected into 40 overlapping sub-images of 250 × 250 pixels. Hence, there are 1600 training samples and 4000 testing/validation samples in total.
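As a concrete illustration of this dissection step, the following Python sketch (using NumPy) splits one 480 × 640 gray-level image into 40 overlapping 250 × 250 sub-images. The paper does not specify the exact offsets of the overlapping windows, so a uniform 5 × 8 grid of top-left corners is assumed here; the function name and grid layout are illustrative only.

import numpy as np

def dissect(image, sub=250, grid=(5, 8)):
    # Split a 480x640 gray-level image into grid[0]*grid[1] overlapping
    # sub-images of size sub x sub (top-left corners on a uniform grid).
    rows, cols = image.shape
    r_starts = np.linspace(0, rows - sub, grid[0]).astype(int)
    c_starts = np.linspace(0, cols - sub, grid[1]).astype(int)
    return [image[r:r + sub, c:c + sub] for r in r_starts for c in c_starts]

# 40 sub-images per image; 4 training images x 10 crops -> 1600 training samples
subs = dissect(np.zeros((480, 640)))
print(len(subs))          # 40
print(subs[0].shape)      # (250, 250)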
Table 1 Naming of crop images
Fig. 1. (a–j) Images of 10 different crops.
2.1. Wavelet transform

For a signal f(x), the continuous wavelet transform (CWT) projects f(x) onto a set of basis functions derived from a mother wavelet ψ(x), as shown in Eq. (1):

\mathrm{CWT}_{\psi} f(a, b) = |a|^{-1/2} \int_{-\infty}^{\infty} f(x)\, \psi\!\left(\frac{x - b}{a}\right) dx    (1)
Fig. 2. Crop identification process.
where a, b ∈ R, a ≠ 0, and they represent the dilating and translating coefficients, respectively. The multiplication factor |a|^{-1/2} is for energy normalization, so that the transformed signal has the same energy at every scale. However, the CWT is redundant. Typically, a and b are sampled on a dyadic grid, i.e., a = 2^j and b = n·2^j, where j and n are integer parameters directly related to dilation and translation. The set of basis functions ψ_{j,n}(x) is derived from the dilation and translation of the mother wavelet ψ(x):

\psi_{j,n}(x) = 2^{-j/2}\, \psi(2^{-j} x - n)    (2)

\mathrm{DWT}_{\psi} f(j, n) = \int_{-\infty}^{\infty} f(x)\, \psi_{j,n}(x)\, dx    (3)
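For readers who want to experiment with the dyadic DWT of Eqs. (2) and (3), the short Python sketch below uses the PyWavelets library; the test signal and the 'db2' wavelet are arbitrary choices, meant only to show the coefficient structure produced by a three-level decomposition.

import numpy as np
import pywt

# Three-level dyadic DWT of a test signal (discrete form of Eqs. (2)-(3)).
x = np.sin(np.linspace(0, 8 * np.pi, 256))
coeffs = pywt.wavedec(x, "db2", level=3)   # [cA3, cD3, cD2, cD1]
for name, c in zip(["cA3", "cD3", "cD2", "cD1"], coeffs):
    print(name, len(c))                    # coefficient count roughly halves per level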
Owing to the orthonormal properties and an appropriate choice of a and b, the multiresolution analysis (MRA) algorithm provides a simple method for constructing the mother wavelet function. Let φ(x) be a scaling function that satisfies

\phi_{j} = \sum_{n} h_{s}(n)\, \phi_{j-1,n}    (4)
Therefore, a corresponding wavelet function from the MRA is

\psi_{j} = \sum_{n} h_{w}(n)\, \phi_{j-1,n}    (5)

where

h_{s}(n) = \langle \phi_{j}, \phi_{j-1,n} \rangle    (6)

h_{w}(n) = \langle \psi_{j}, \phi_{j-1,n} \rangle    (7)
Fig. 3. (a–c) Two-level wavelet packet transform of a crop image.
h_s(n) and h_w(n) correspond to the low-pass and high-pass filter coefficients, respectively, and are related by

h_{w}(n) = (-1)^{n} h_{s}(L - 1 - n)    (8)

where L is the length of the filter. Wavelet analysis convolves the signal Θ_{j−1}(n) with the low-pass and high-pass filters and down-samples the results, yielding the low-frequency approximation Θ_j(p) and the high-frequency detail Ω_j(p); the low-frequency signal Θ_j(p) is then regarded as a new signal and transformed further:

\Theta_{j}(p) = \sum_{n} h_{s}(2p - n)\, \Theta_{j-1}(n)    (9)

\Omega_{j}(p) = \sum_{n} h_{w}(2p - n)\, \Theta_{j-1}(n)    (10)
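A minimal Python sketch of Eqs. (8)–(10) is given below. The low-pass filter h_s is taken from PyWavelets ('db2' here is only an example); the high-pass filter is built by Eq. (8), and one analysis step is performed by convolution followed by down-sampling. Boundary handling is ignored, and different libraries may define the high-pass filter with the opposite overall sign or reversed ordering.

import numpy as np
import pywt

hs = np.array(pywt.Wavelet("db2").dec_lo)                      # low-pass h_s(n)
L = len(hs)
hw = np.array([(-1) ** n * hs[L - 1 - n] for n in range(L)])   # Eq. (8)

rng = np.random.default_rng(0)
theta_prev = rng.normal(size=128)             # signal Theta_{j-1}(n)
theta = np.convolve(theta_prev, hs)[::2]      # approximation Theta_j(p), Eq. (9)
omega = np.convolve(theta_prev, hw)[::2]      # detail Omega_j(p), Eq. (10)
print(theta.shape, omega.shape)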
Fig. 4. Wavelet analysis and reconstruction.
Fig. 5. Two-level wavelet analysis of an image.
The inverse wavelet transform reconstructs the signal Θ_{j−1}(n) by up-sampling, convolution, and summation. Fig. 4 illustrates the wavelet analysis and reconstruction procedure; reconstruction simply performs the analysis steps in reverse order:

\Theta_{j-1}(n) = \sum_{p} h_{s}(n - 2p)\, \Theta_{j}(p) + \sum_{p} h_{w}(n - 2p)\, \Omega_{j}(p)    (11)
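The analysis/reconstruction pair of Eqs. (9)–(11) can be checked with PyWavelets in a few lines; this is only a sanity-check sketch, not the processing chain used in the paper.

import numpy as np
import pywt

x = np.random.default_rng(1).normal(size=64)
cA, cD = pywt.dwt(x, "db2")          # analysis: Theta_j and Omega_j
x_rec = pywt.idwt(cA, cD, "db2")     # reconstruction, Eq. (11)
print(np.allclose(x, x_rec))         # True: perfect reconstruction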
For the wavelet analysis of a two-dimensional image, the original image first passes through the low-pass and high-pass filters along its rows and columns, resulting in four sub-images: low–low (Θ^A_{j−1}), low–high (Ω^H_{j−1}), high–low (Ω^V_{j−1}), and high–high (Ω^D_{j−1}). The sub-image of the low–low band (Θ^A_{j−1}) can be further decomposed to the next level by repeating the procedure. Fig. 5 illustrates a two-level wavelet analysis.

2.2. Wavelet packets

The wavelet transform applies the transform step only to the low-pass results. The wavelet packet transform, in contrast, applies the transform step to both the low-pass and the high-pass results. Fig. 6 shows an example of a two-level decomposition using the wavelet packet transform. It is especially appropriate for signals whose dominant frequency channels fall in the mid or high frequency bands. The wavelet packet transform can be viewed as a tree: the root of the tree is the original data set, and the next level of the tree is the result of applying one wavelet transform step to both the low-pass and high-pass filter outputs of the previous step.

2.3. Texture description with wavelet packet coefficients

Selecting dominant frequency channels lowers the amount of computation required for identification. For a j-level wavelet packet transform of an image of size N × N, the texture characteristics are described by the energy coefficients of the various frequency bands obtained after the transform:

\mathrm{Energy}_{j,i} = \frac{4^{j}}{N^{2}} \sum_{p=1}^{N^{2}/4^{j}} |C_{j,i}^{p}|^{2}    (12)
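The band energies of Eq. (12) can be computed, for example, with PyWavelets' two-dimensional wavelet packet transform, as sketched below. The wavelet 'db10' is assumed to correspond to the paper's 20-tap Daubechies filter, and selecting dominant bands by an energy threshold relative to the total band energy is our assumption, since the paper states only the −30 dB figure.

import numpy as np
import pywt

def wp_energy_features(gray_img, wavelet="db10", level=3, threshold_db=-30.0):
    # Energy of every wavelet packet band at the given level (Eq. (12)),
    # and the subset of dominant bands above the energy threshold.
    wp = pywt.WaveletPacket2D(data=gray_img, wavelet=wavelet,
                              mode="symmetric", maxlevel=level)
    n2 = float(gray_img.size)                     # N^2 in Eq. (12)
    energies = {}
    for node in wp.get_level(level):              # 4**level frequency bands
        c = np.asarray(node.data, dtype=float)
        energies[node.path] = (4 ** level / n2) * np.sum(c ** 2)
    total = sum(energies.values())
    dominant = {path: e for path, e in energies.items()
                if 10.0 * np.log10(e / total + 1e-12) > threshold_db}
    return energies, dominant

# Example on a random 250x250 "sub-image"
energies, dominant = wp_energy_features(np.random.default_rng(2).random((250, 250)))
print(len(energies), len(dominant))               # 64 bands, dominant subset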
Fig. 6. (a and b) Two-level transform of wavelet packets.
In Eq. (12), C^p_{j,i} represents the pth wavelet packet coefficient of the ith frequency band. Utilizing all of the dominant channels for identification, however, is not necessary. Lee and Pun (2000) conducted a three-level wavelet packet analysis (i = 1, ..., 64) of Brodatz images and obtained the best accuracy when only the 48 bands with the highest energy were used as identification features. Fig. 3(c) illustrates the selection of dominant characteristics: the numbers in each cell represent the energy values of each band, and the gray areas indicate the dominant channels, determined here as the six bands with the highest energy values. The dominant characteristics are therefore obtained by sorting the channels according to their energy.

2.4. Bayesian distances

With the dominant channel coefficients collected in the feature vector X, a Bayesian distance can be calculated from Eq. (13), provided the probability density function of each class is a normal distribution:

\mathrm{BD}_{k}(X) = \ln|\Sigma_{k}| + (X - M_{k})^{T} \Sigma_{k}^{-1} (X - M_{k})    (13)

where the mean vector and covariance matrix of class k are estimated from its training set ω_k of N_k samples:

M_{k} = E_{k}\{X\} \approx \frac{1}{N_{k}} \sum_{X \in \omega_{k}} X    (14)

\Sigma_{k} = E_{k}\{(X - M_{k})(X - M_{k})^{T}\} \approx \frac{1}{N_{k}} \sum_{X \in \omega_{k}} X X^{T} - M_{k} M_{k}^{T}    (15)
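A direct NumPy transcription of Eqs. (13)–(15) is sketched below; the function and variable names are our own, and no regularization of the covariance matrix is attempted, so the sketch assumes enough training samples per class for Σ_k to be invertible.

import numpy as np

def class_statistics(samples):
    # samples: (N_k, d) array, one feature vector X per training sub-image.
    # Returns the mean vector M_k (Eq. (14)) and covariance Sigma_k (Eq. (15)).
    M = samples.mean(axis=0)
    Sigma = samples.T @ samples / len(samples) - np.outer(M, M)
    return M, Sigma

def bayesian_distance(x, M, Sigma):
    # Bayesian distance of feature vector x to one class (Eq. (13)).
    diff = x - M
    _, logdet = np.linalg.slogdet(Sigma)
    return logdet + diff @ np.linalg.inv(Sigma) @ diff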
In Eqs. (13)–(15), k denotes the crop species. For a Bayesian classifier, the probability density function of each class must be known and must fit a normal distribution; in this study, the class-conditional probabilities of the crops were verified to fit a Gaussian distribution by normal probability plotting.

2.5. Weighted Bayesian distances

The geometric texture of crop images is widely used for crop classification (Haralick et al., 1973; Lee and Pun, 2000). This study uses run-length statistics computed from the crop image to obtain the weighted Bayesian distance. With the weighted Bayesian distance, the crop identification rate can be significantly enhanced. Fig. 7 illustrates the process of weighting and grading.
Fig. 7. Process of weighting and grading.
Fig. 8. Run length curve of a crop image.
To perform run-length weighting, the gray-level images are first converted into binary images by thresholding; the threshold value can be estimated from the histogram of the image. A run is a sequence of '1' (or '0') pixels in a row or column of the image. In this study, the run-length statistic is the frequency of the various run lengths in the rows and columns of an image. Fig. 8 illustrates the run-length curve with an example: (a) shows the binary image of a crop; (b) displays the run-length curve of the first row of the binary image. An image of size r × c therefore has r + c run-length curves, and the run-length statistic of the image is obtained by summing the r + c run-length curves together.

Run-length parameter estimation is needed to calculate the factors for weighted sorting, namely the mean run-length frequency μ_{L,k} and the standard deviation σ_{L,k}. For a class k with a training set of P samples, they are defined as

\mu_{L,k} = \frac{1}{P} \sum g_{L,k}    (16)

\sigma_{L,k} = \sqrt{\frac{\sum (g_{L,k} - \mu_{L,k})^{2}}{P - 1}}    (17)

where g_{L,k} represents the frequency of run length L for class k and the sums run over the P training samples. To measure the dissimilarity between an unknown crop and a training class k, a normalized frequency V_{L,k} is defined in Eq. (18), where \hat{g}_{L} is the frequency of run length L for the unknown crop. The degree of dissimilarity R_k between the unknown crop and class k is obtained by summing V_{L,k} over all run lengths, Eq. (19). Finally, the degrees of dissimilarity R_k are sorted from smallest to largest and each training class k is given the corresponding sorting grade W_k, where the set of sorting grades is {W_k} = {1, 2, ..., 10}; the smaller the sorting grade W_k, the higher the similarity.

V_{L,k} = \frac{|\hat{g}_{L} - \mu_{L,k}|}{\sigma_{L,k}}    (18)

R_{k} = \sum_{L} V_{L,k}    (19)

\{W_{k}\} = \mathrm{grade}\{\mathrm{sort}\{R_{k}\}\}    (20)
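The run-length statistics and the grading of Eqs. (16)–(20) could be implemented along the following lines. The restriction to run lengths between 40 and 65 pixels anticipates the range chosen in Section 3; the binarization step and the helper names are illustrative assumptions.

import numpy as np
from itertools import groupby

def run_length_histogram(binary_img, l_min=40, l_max=65):
    # Frequency of each run length over all rows and columns (r + c curves).
    hist = np.zeros(l_max - l_min + 1)
    for line in list(binary_img) + list(binary_img.T):
        for _, run in groupby(line):
            length = sum(1 for _ in run)
            if l_min <= length <= l_max:
                hist[length - l_min] += 1
    return hist

def class_run_length_stats(hists):
    # hists: (P, n_lengths) array of histograms of the P training samples.
    # Returns mu_{L,k} (Eq. (16)) and sigma_{L,k} (Eq. (17)) for class k.
    return hists.mean(axis=0), hists.std(axis=0, ddof=1)

def dissimilarity(g_unknown, mu_k, sigma_k):
    # R_k (Eqs. (18)-(19)): summed normalized deviation from class k.
    return np.sum(np.abs(g_unknown - mu_k) / sigma_k)

def sorting_grades(R):
    # W_k (Eq. (20)): rank the R_k of all classes, smallest R -> grade 1.
    order = np.argsort(R)
    W = np.empty(len(R), dtype=int)
    W[order] = np.arange(1, len(R) + 1)
    return W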
2.6. Decision distance for crop identification

The decision distance D_k, also referred to as the weighted Bayesian distance and used as the identification criterion for an unknown crop in this study, is calculated according to Eq. (21):

D_{k} = \mathrm{BD}_{k} \left(1 + \frac{W_{k}}{C}\right), \quad C: \text{constant}    (21)
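Combining the two criteria per Eq. (21) is then a one-line operation; the sketch below uses hypothetical Bayesian distances and sorting grades for three classes and assigns the unknown crop to the class with the smallest decision distance, with C = 4 as reported in Section 3.

import numpy as np

def decision_distance(BD, W, C=4.0):
    # Decision distance D_k = BD_k * (1 + W_k / C), Eq. (21).
    return np.asarray(BD, dtype=float) * (1.0 + np.asarray(W, dtype=float) / C)

BD = [12.4, 9.8, 15.1]        # hypothetical Bayesian distances (Eq. (13))
W = [2, 1, 3]                 # hypothetical sorting grades (Eq. (20))
D = decision_distance(BD, W)
print(D, int(np.argmin(D)))   # the unknown crop is assigned the class with the smallest D_k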
Table 2
Average number of elements in feature vector X for various crops

Crop no.   Scientific name               English name           Avg. no. of elements in X
1          Amaranthus mangostanus L.     Edible amaranth        15.85
2          Ipomoea aquatica Forsk.       Water spinach          15.37
3          Spinacia oleracea L.          Spinach                15.94
4          Ipomoea batatas L.            Sweet potato leaves    14.53
5          Artemisia dracunculus L.      Tarragon               13.49
6          Allium sativum L.             Garlic                 16.76
7          Brassica rapa L.              Chinese cabbage        15.12
8          Crassocephalum rabens         Nodding burnweed       17.18
9          Lactuca sativa L.             Garden lettuce         17.03
10         Vernonia cinerea L.           Lony ironweed          25.96
In Eq. (21), BD_k is the Bayesian distance, W_k is the sorting grade, and C is the weighting factor. The larger the value of C, the smaller the influence of the weighting and grading on the decision distance. The distance D_k indicates the degree of dissimilarity between the unknown crop and class k: the smaller the decision distance, the higher the similarity. The developed algorithm was written in Matlab and run on a desktop computer with a 2.4 GHz Pentium 4 processor and 512 MB of memory.

3. Results and discussion

Sixty-four frequency bands are generated by applying three levels of wavelet packet transform with the 20-tap Daubechies wavelet to each sub-image. To reduce the computational complexity, the dominant frequency bands were selected as those with energy coefficients higher than −30 dB. Table 2 shows the average number of elements in the feature vector X for the various crops. Since the averages differ among crops, a feature vector length of 17 was chosen as a compromise.

After performing the wavelet packet transform and calculating the Bayesian distance, an identification rate of 71.25% on average was obtained for unknown crop images (testing/validation images). Table 3 shows the identification rate for each class k. Crops with major differences in texture or more evenly distributed patterns, such as garlic, nodding burnweed, garden lettuce, and lony ironweed, have better identification results; among them, garlic and garden lettuce reach accuracy rates of 100%.

To improve the identification rate, a weighted Bayesian distance is introduced as the identification criterion, also referred to as the decision distance, where the weighting is based on the statistics of crop texture and leaf shape. Fig. 9 shows that as long as the run lengths under consideration are greater than 40 pixels, accuracy stays at a fairly high level (90%). Considering accuracy and computation cost, run lengths between 40 and 65 pixels were used. The mean μ_{L,k} and standard deviation σ_{L,k} should first be calculated from the training samples.

Table 3
Crop identification rate (%) with Bayesian distance

Crop    Day 1     Day 2     Day 3     Class k overall
1       90.00     77.00     66.25     74.00
2       52.50     65.00     45.63     50.00
3       90.00     53.00     70.00     63.50
4       57.50     37.00     27.50     35.25
5       65.00     56.00     73.13     63.75
6       100.00    100.00    100.00    100.00
7       67.50     13.50     27.50     24.50
8       97.50     99.00     96.88     98.00
9       100.00    99.00     95.00     97.50
10      100.00    100.00    100.00    100.00
Overall identification rate: 71.25%
Fig. 9. Effect of run length on accuracy.
Fig. 10. Effect of constant C on identification accuracy rate.

Table 4
Crop identification rate (%) with weighted Bayesian distance

Crop    Day 1     Day 2     Day 3     Class k overall
1       100.00    97.50     99.38     98.50
2       100.00    100.00    99.38     99.75
3       100.00    100.00    98.13     99.25
4       100.00    100.00    100.00    100.00
5       100.00    100.00    100.00    100.00
6       100.00    100.00    98.13     99.25
7       100.00    68.00     46.88     53.75
8       100.00    99.00     100.00    99.50
9       100.00    100.00    90.63     96.25
10      100.00    100.00    100.00    100.00
Overall identification rate: 94.63%
The weighted Bayesian distance can then be determined from Eqs. (18)–(21). The identification rate is significantly improved by incorporating the weighting into the Bayesian distance, which yields the decision distance D_k of Eq. (21). The weighting factor C in that equation determines the importance of the weighting: the greater the value of C, the less the weighting influences the decision distance. Fig. 10 depicts how the identification rate varies with the weighting factor C. The accuracy exceeds 90% when the value of C is less than 16, and Table 4 shows that the best identification rate of 94.63% is obtained at C = 4.

4. Conclusions

This study uses the statistics of run lengths from crop images and the coefficients derived from the wavelet packet transform to obtain a weighted Bayesian distance. With the weighted Bayesian distance, the crop identification rate is significantly enhanced. The wavelet packet transform is highly suitable for describing crop features. Sixty-four frequency channels are generated in this study by applying three levels of wavelet packet transform with the 20-tap Daubechies wavelet to the sub-images of each crop. To reduce the computational complexity, the dominant frequency bands are
chosen based on energy coefficients higher than −30 dB. Owing to the significant differences in texture among the various crops, the crops can be categorized using the Bayesian distance alone; however, the resulting accuracy (71.25%) is not satisfactory. By employing the weighted Bayesian distance proposed in this study, the experiments demonstrated a significant increase in the identification rate, up to 94.63%.

References

Chang, T., Kuo, C.-C.J., 1993. Texture analysis and classification with tree-structured wavelet transform. IEEE Trans. Image Process. 2 (4), 429–441.
Coifman, R.R., Wickerhauser, M.V., 1992. Entropy-based algorithms for best basis selection. IEEE Trans. Inf. Theory 38, 713–718.
Ghate, S.R., Evans, M.D., Kvien, C.K., Rucker, K.S., 1993. Maturity detection in peanuts (Arachis hypogaea L.) using machine vision. Trans. ASAE 36 (6), 1941–1947.
Haralick, R.M., Shanmugam, K., Dinstein, I., 1973. Textural features for image classification. IEEE Trans. Syst. Man Cybern. 3, 610–621.
Heinemann, P.H., Hughes, R., Morrow, C.T., Sommer III, H.J., Beelman, R.B., Wuest, P.J., 1994. Grading of mushrooms using a machine vision system. Trans. ASAE 37 (5), 1671–1677.
Kokare, M., Biswas, P.K., Chatterji, B.N., 2005. Texture image retrieval using new rotated complex wavelet filters. IEEE Trans. Syst. Man Cybern. B 35 (6).
Lee, M.C., Pun, C.M., 2000. Texture classification using dominant wavelet packet energy features. In: Proceedings of the Fourth IEEE Southwest Symposium on Image Analysis and Interpretation, pp. 301–304.
Liao, K., Paulsen, M.R., Reid, J.F., Ni, B.C., Bonifacio-Maghirang, E.P., 1993. Corn kernel breakage classification by machine vision using a neural network classifier. Trans. ASAE 36 (6), 1949–1953.
Park, B., Lawrence, K.C., Windham, W.R., Chen, Y.-R., Chao, K., 2002. Discriminant analysis of dual-wavelength spectral images for classifying poultry carcasses. Comput. Electron. Agric. 33, 219–231.
Tao, Y., Heinemann, P.H., Varghese, Z., Morrow, C.T., Sommer III, H.J., 1995. Machine vision for color inspection of potatoes and apples. Trans. ASAE 38 (5), 1555–1561.