Optik 123 (2012) 1562–1567
Composite wavelet based morphological correlation for continuous-scale-invariant pattern recognition
Qu Wang ∗, Li Chen, Jinyun Zhou, Qinghua Lin
School of Physics and Optoelectronic Engineering, Guangdong University of Technology, Guangzhou 510006, PR China
Article info
Article history: Received 4 April 2011; Accepted 20 August 2011
Keywords: Wavelet matched filter; Composite filter; Scale-invariant recognition

Abstract
A composite nonlinear correlation is proposed to perform invariant recognition for the input image with continuous-scale distortions. The proposed correlation can be considered as a summation of many composite wavelet matched filters. Every matched filter consists of a linear combination of binary slices that are generated from threshold decomposition of the training images, and an adaptive wavelet filter. The adaptive wavelet is optimized to produce the sparse image features. Computer simulations are carried out to prove the scale-invariance and noise robustness of the new scheme. © 2011 Elsevier GmbH. All rights reserved.
1. Introduction
A binary-decomposition-based nonlinear correlation, termed the morphological correlation (MC) [1–3], has been found to have higher discrimination capability (DC) and better robustness against some non-Gaussian noises than the conventional linear matched filter. The MC is optimal in the sense of the mean-absolute error (MAE) [1]. The MAE increases faster than the conventional mean-squared error (MSE) when the reference scene is displaced from the optimal matching position, which results in a sharper correlation peak. A joint transform correlation (JTC) setup can be employed to realize the MC optically because the MC involves a linear summation of many linear correlations (LC) between threshold-decomposed binary slices [2]. Two harmonic-based MC schemes, the rotation-invariant MC (RIMC) [4] and the morphological radial harmonic correlation (MRHC) [5], were also reported to achieve rotation and scale invariance of the correlation peaks and were implemented optically.

The wavelet transform (WT) has been attracting increasing attention in the optical pattern recognition community because of its attractive feature extraction, denoising and multi-resolution capabilities. Wavelet-matched filters and wavelet based JTCs have been reported to improve the correlation performance [6–8]. These schemes can generate the correlation of two wavelet transforms directly without a need for WT preprocessing. In previous work, we introduced the wavelet based morphological correlation (WBMC) by multiplying a wavelet intensity spectrum filter with the joint power spectrum (JPS) of the MC setup [9]. Our test results confirm that, if the dilation factor of the wavelet filter is appropriately selected, the WBMC provides higher discrimination, with a considerably sharper and more intense correlation signal, than the LC, the MC and the joint wavelet transform correlation (JWTC). Furthermore, the WBMC also shows better resistance to salt-and-pepper noise. In spite of this, the feature-based WBMC may be sensitive to image distortions.

The harmonic approach is a useful tool to obtain in-plane rotation, scale or projection invariance. A rotation-invariant WBMC has been built by combining the circular harmonic filter (CHF) technique with the WBMC in another work [10]. The Mellin harmonic filter is invariant to continuous-scale variation [11]. However, this filter is based on a single Mellin harmonic component, whereas the WBMC and the other WT based correlations are all based on the edge and corner features of the entire image. The synthetic discriminant function (SDF) is another solution for distortion-invariant recognition. An optical composite wavelet matched filter (CWMF) has been implemented to carry out distortion-invariant pattern recognition [12,13].

The composite WBMC (CWBMC) that we propose in this paper is invariant to continuous-scale variations of the input image. In the CWBMC, all training images are first threshold-decomposed into many binary slices according to gray level. In the Fourier domain, the composite filter is defined as a linear combination of the Fourier transforms of these training binary slices multiplied by an isotropic wavelet spectrum intensity filter. In order to ensure high discrimination capability, the wavelet is devised to generate sparser local features for a given image. We also investigate the robustness of the CWBMC against salt-and-pepper noise in comparison with the CWMF. Computer simulation results are presented to establish the effectiveness of the CWBMC.

∗ Corresponding author. Tel.: +86 020 87084387; fax: +86 020 87084387. E-mail address: [email protected] (Q. Wang).
0030-4026/$ – see front matter © 2011 Elsevier GmbH. All rights reserved. doi:10.1016/j.ijleo.2011.10.001
2. Morphological correlation and wavelet based morphological correlation
Let two discrete gray-level two-dimensional signals, g(x,y) and r(x,y), be the input and reference scenes, respectively. The MC between them has two equivalent definitions [1]:
$$\mathrm{MC}(x,y)=\sum_{h,l}\min\left[r(h,l),\,g(x+h,\,y+l)\right]=\sum_{q=1}^{Q-1} g_q(x,y)\otimes r_q(x,y), \tag{1}$$
where ⊗ indicates the linear correlation operator, and $g_q(x,y)$ and $r_q(x,y)$ are the binary threshold-decomposed slices defined below:
$$g_q(x,y)=\begin{cases}1, & g(x,y)\ge q\\ 0, & g(x,y)<q\end{cases}\qquad r_q(x,y)=\begin{cases}1, & r(x,y)\ge q\\ 0, & r(x,y)<q\end{cases}. \tag{2}$$
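The equivalence of the two definitions in Eq. (1) can be checked numerically. The following is a minimal sketch, assuming integer gray levels in the range 0 to Q−1, images of equal size, and circular (wrap-around) shifts in place of zero-padded ones; the function names `mc_direct` and `mc_by_slices` are illustrative and do not come from the cited works.

```python
import numpy as np

def mc_direct(g, r):
    """First form of Eq. (1): sum over (h, l) of min[r(h, l), g(x+h, y+l)].

    Shifts are circular, which is an assumption made to keep the sketch short.
    """
    out = np.zeros(g.shape, dtype=float)
    for x in range(g.shape[0]):
        for y in range(g.shape[1]):
            # shifted[h, l] == g[(x + h) mod rows, (y + l) mod cols]
            shifted = np.roll(g, shift=(-x, -y), axis=(0, 1))
            out[x, y] = np.minimum(r, shifted).sum()
    return out

def mc_by_slices(g, r, Q):
    """Second form of Eq. (1): sum over q of the linear correlation g_q (x) r_q."""
    out = np.zeros(g.shape, dtype=float)
    for q in range(1, Q):
        gq = (g >= q).astype(float)   # binary slice of the input, Eq. (2)
        rq = (r >= q).astype(float)   # binary slice of the reference, Eq. (2)
        # Circular cross-correlation through the FFT: FT{g_q} * conj(FT{r_q}).
        out += np.real(np.fft.ifft2(np.fft.fft2(gq) * np.conj(np.fft.fft2(rq))))
    return out

# Small random test images with Q = 8 gray levels (hypothetical data).
rng = np.random.default_rng(0)
g_demo = rng.integers(0, 8, size=(6, 6))
r_demo = rng.integers(0, 8, size=(6, 6))
same = np.allclose(mc_direct(g_demo, r_demo), mc_by_slices(g_demo, r_demo, Q=8))
```

The agreement follows from the identity min(a, b) = Σ_q 1[a ≥ q]·1[b ≥ q] for integer a and b, which is what allows the nonlinear MC to be computed as a sum of ordinary linear correlations.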
The second definition shows that the MC is the amplitude sum of all correlations between the threshold-decomposed slices of the input and reference images at each gray level. The MC has the following three noteworthy properties: (1) the MC is more robust when outlier noise corrupts the input scene; (2) the MC yields a sharper signal intensity than the LC does, assuming some local stationarity; and (3) the autocorrelation intensity of the MC is always higher than or equal to the cross-correlation intensity when the input scene is noise-free. However, some drawbacks still exist with the MC operation. The correlation peak produced by the MC is always weaker than that of the LC under the same conditions. When the energy of a false input object is considerably high, the difference between the correct correlation peak intensity and the false cross-correlation peak intensity becomes almost negligible. Moreover, the noise resistance of the MC is still confined to a small range of noise intensities.

The WT can be considered as a cross-correlation between the input signal f(x) and a set of elementary functions (daughter wavelets) derived from a mother wavelet h(x) by means of dilations and translations, as shown below [14]:
$$W_f(a,b)=\frac{1}{\sqrt{a}}\int_{-\infty}^{\infty} h^{*}\!\left(\frac{x-b}{a}\right) f(x)\,dx=\frac{1}{\sqrt{a}}\,h\!\left(\frac{b}{a}\right)\otimes f(b), \tag{3}$$
where * is the conjugate operator, and a and b are the dilation and translation parameters, respectively. The WT often serves as a reliable tool to localize the edge features of images, which play a crucial role in identifying a target and rejecting a non-target object. On the basis of this property, the WBMC was defined as below to improve the correlation performance of the conventional MC [9]:
$$\begin{aligned}E(x,y)&=\sum_{q=1}^{Q-1} W_{g_q}\otimes W_{r_q}\\&=\sum_{q=1}^{Q-1}\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\left[G_q(u,v)H^{*}(au,av)\right]\left[R_q(u,v)H^{*}(au,av)\right]^{*}\exp\left[j2\pi(xu+yv)\right]du\,dv\\&=\sum_{q=1}^{Q-1}\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}G_q(u,v)R_q^{*}(u,v)\left|H(au,av)\right|^{2}\exp\left[j2\pi(xu+yv)\right]du\,dv,\end{aligned} \tag{4}$$
where $W_{r_q}$ and $W_{g_q}$ are the wavelet transforms of the threshold binary slices $r_q(x,y)$ and $g_q(x,y)$, respectively, $R_q(u,v)$ and $G_q(u,v)$ denote the Fourier transforms of $r_q(x,y)$ and $g_q(x,y)$, and $H(au,av)$ is the Fourier transform of the dilated wavelet function. From this definition, one can easily see that the WBMC is actually the summation of many linear correlations between the WTs of the threshold slices $r_q(x,y)$ and $g_q(x,y)$. The term $R_q^{*}(u,v)\left|H(au,av)\right|^{2}$ may be seen as a wavelet-matched filter (WMF) as defined by D. Roberge et al. Fig. 1 shows an optoelectronic hybrid JTC setup of the WBMC, in which an intensity spectrum $\left|H(au,av)\right|^{2}$, displayed on a spatial light modulator (SLM2), is used to modulate the JPS of the binary slices $r_q(x,y)$ and $g_q(x,y)$ on another SLM (SLM1). Both the WT preprocessing of the binary slices and the final Fourier transform are performed in a single step. Enhanced local edge features lead to the high discriminability of the WBMC, but they also make it sensitive to shape distortion of the input. To solve this problem, the following section describes a wavelet-composite correlation based on the WBMC that tolerates an unknown continuous scale variation of the input while retaining good discrimination capability to reject false objects.

Fig. 1. Optoelectronic hybrid implementation setup for the WBMC and the CWBMC.

2.1. Composite wavelet based morphological correlation
A composite filter provides solutions for optical verification when there is a set of patterns to be recognized by a single filter. The original composite filter is a linear combination of the training images. Only the values at the centre of the correlations are controlled. To reduce the sensitivity of the WBMC to distortion, the CWBMC is devised as a linear combination of the WBMCs. Assume that we are given a set of training images $t_n(x,y)$ with n = 1, 2, ..., N, which are scaled up or down from the original reference image with different scale factors. As was done for the WBMC, all these training images are first threshold-decomposed into many binary slices $t_{nq}(x,y)$. For an input image g(x,y), the correlation output of the proposed CWBMC is written as:
$$E(x,y)=\sum_{q=1}^{Q-1} W_{g_q}\otimes\sum_{n=1}^{N}\alpha_{nq}W_{t_{nq}}, \tag{5}$$
where $W_{t_{nq}}$ is the WT of the binary slice $t_{nq}(x,y)$. In the frequency domain, the spectrum of the CWBMC is given by
$$I(u,v)=\sum_{q=1}^{Q-1}G_q(u,v)\left|H(au,av)\right|^{2}\sum_{n=1}^{N}\alpha_{nq}T_{nq}^{*}(u,v), \tag{6}$$
where $T_{nq}(u,v)$ is the Fourier transform of $t_{nq}(x,y)$. The term $\left|H(au,av)\right|^{2}\sum_{n=1}^{N}\alpha_{nq}T_{nq}(u,v)$ is a composite wavelet matched filter (CWMF) as proposed by Roberge and Sheng [12,13]. The coefficients $\alpha_{nq}$ are determined by a group of equations in the correlation plane as below:
$$\iint t_{mq}(x,y)\sum_{n=1}^{N}\alpha_{nq}\,\mathrm{FT}^{-1}\!\left[T_{nq}(u,v)\left|H(au,av)\right|^{2}\right]dx\,dy=c_{mq}, \tag{7}$$
where $\mathrm{FT}^{-1}$ denotes the inverse Fourier transform and $c_{mq}$ is the desired central correlation output for the input binary slice $t_{mq}$.
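For a fixed gray level q, Eq. (7) is a set of linear equations in the coefficients α_nq. The following is a minimal sketch of that system under several assumptions: a direct least-squares solve is used in place of the iterative gradient-descent method cited in the text, a separable Gaussian-derivative kernel stands in for the wavelet, centred binary squares of three sizes stand in for the scaled training slices, and all function names are hypothetical.

```python
import numpy as np

def wavelet_power_spectrum(shape, a):
    """|H(au, av)|^2 of a stand-in wavelet, sampled on the FFT grid.

    The kernel xy * exp(-(x^2 + y^2) / (2 a^2)) is an assumed example wavelet.
    """
    rows, cols = shape
    y = np.fft.fftfreq(rows) * rows      # pixel offsets in FFT (wrap-around) order
    x = np.fft.fftfreq(cols) * cols
    yy, xx = np.meshgrid(y, x, indexing="ij")
    h = (xx * yy / a**2) * np.exp(-(xx**2 + yy**2) / (2.0 * a**2))
    return np.abs(np.fft.fft2(h)) ** 2

def central_gram_matrix(slices, H2):
    """A[m, n] = sum over (x, y) of t_m(x, y) * FT^{-1}[T_n(u, v) |H|^2](x, y)."""
    filtered = [np.real(np.fft.ifft2(np.fft.fft2(t) * H2)) for t in slices]
    return np.array([[np.sum(tm * fn) for fn in filtered] for tm in slices])

def solve_alpha(slices, H2, c):
    """Solve sum_n alpha_n * A[m, n] = c_m, the discrete form of Eq. (7)."""
    A = central_gram_matrix(slices, H2)
    alpha, *_ = np.linalg.lstsq(A, np.asarray(c, dtype=float), rcond=None)
    return alpha, A

def centered_square(shape, side):
    """Binary square of the given side length, centred in the frame (toy slice)."""
    t = np.zeros(shape)
    r0 = (shape[0] - side) // 2
    c0 = (shape[1] - side) // 2
    t[r0:r0 + side, c0:c0 + side] = 1.0
    return t

# Three scaled toy slices at one gray level, each with desired central output 1.
slices = [centered_square((64, 64), s) for s in (12, 18, 24)]
H2 = wavelet_power_spectrum((64, 64), a=2.5)
alpha, A = solve_alpha(slices, H2, c=[1.0, 1.0, 1.0])
```

In this sketch the matrix A collects the central correlations between the wavelet-filtered slices, so its off-diagonal entries indicate how strongly the features of different scales overlap.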
The summation $\sum_{q=1}^{Q-1}c_{mq}$ is the desired central correlation output of the CWBMC. The solution for $\alpha_{nq}$ can be obtained with an iterative gradient-descent method [15,16]. As a CWMF, the impulse response of $\left|H(au,av)\right|^{2}\sum_{n=1}^{N}\alpha_{nq}T_{nq}(u,v)$ is a linear combination of the WTs of the training binary slices $t_{nq}$. One can consider it as a quasi-orthogonal combination in the sense that the cross correlations among its components are almost negligible, because the WTs produce sparse features of the training binary slices $t_{nq}$ and these features generally do not overlap. As a result, the convergence speed of the iterative algorithm for solving the coefficients $\alpha_{nq}$ increases greatly, more training images can be included in a single composite filter than in the conventional composite filter, and central correlation outputs without large sidelobes can be obtained.

However, the CWBMC is still vulnerable to those scale variations that are not considered in the training set. In practical applications, the proposed scheme should be invariant not only to the discrete scales of the training images but also to a continuous-scale variation of the input image. According to scale-space analysis, the scale a of the wavelet filter plays an important role in determining the behavior of the correlation output [17]. Sheng and Roberge [13] have found that, for the case of a one-dimensional binary image (a binary step function), when the scale a of the wavelet filter and the size increment $\Delta$ of the scaled training images satisfy the Sparrow criterion $a \ge \Delta/(2\sqrt{2})$ [18], the central correlation output of the CWMF can be considered constant under continuous scale variation of the input size. Because the CWBMC is a summation of many CWMFs (Eq. (6)), and the training slice $t_{nq}$ and the input slice $g_q$ are both binary images, the Sparrow criterion is also expected to be effective in reducing the sensitivity of the central correlation output of the CWBMC to continuous scale variation.

By extracting edge features of a two-dimensional (2-D) gray-level image, an isotropic WT, in general, can generate contours in the image. The impulse response of the conventional CWMF would consist of overlapped 2-D contours of the scaled training images.
These overlapped 2-D contours would constitute a thick contour or a completely blurred pattern, resulting in a poor discrimination capability. To avoid this problem, the wavelet should be designed to generate sparser features instead of continuous contours. The following 2-D separable first-order derivative of a Gaussian wavelet can be employed to detect junction corners of vertical and horizontal edges:
$$h_a(x,y)=\frac{1}{a^{2}}\,h\!\left(\frac{x}{a},\frac{y}{a}\right)=\frac{e}{2\pi a^{2}}\,\frac{xy}{a^{2}}\exp\left[-\frac{1}{2a^{2}}\left(x^{2}+y^{2}\right)\right]. \tag{8}$$
However, continuous contours for edges that are inclined with respect to the x and y axes still exist. To smooth away these continuous local maxima, Sheng combined two wavelets of the form defined in Eq. (8) to form a directionally selective adaptive wavelet [13], in which one wavelet is rotated by an angle $\theta$ with respect to the other:
$$\psi_a=\beta\,h_a+(1-\beta)\,h_a^{\theta}, \tag{9}$$
where the weight $\beta \le 1$ and $h_a^{\theta}$ is the wavelet rotated by $\theta$. As a linear combination of wavelets, $\psi_a$ is itself a wavelet. For a given image f(x,y), the coefficients $\beta$ and $\theta$ are chosen adaptively to minimize the variance of the WT outputs and to maximize the following criterion:
$$K(\beta,\theta)=\frac{1}{\iint W_f^{2}\,dx\,dy}, \tag{10}$$
which will concentrate most of the energy of the WT of f(x,y) in some sparse local zones, resulting in an intense correlation signal. Considering that the CWBMC is actually a summation of many CWMF operations, we still use this adaptive wavelet for the CWBMC.
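The construction of the directionally selective wavelet can be sketched on a sampled grid. The code below is an illustration only: it assumes the normalization e/(2πa²) for the Gaussian-derivative wavelet, uses the names `beta` and `theta` for the weight and rotation angle, and evaluates the rotated wavelet analytically in a rotated coordinate frame; the function names are hypothetical.

```python
import numpy as np

def gaussian_derivative_wavelet(size, a, theta=0.0):
    """Sample the separable Gaussian-derivative wavelet on a size x size grid.

    theta (radians) rotates the coordinate frame; theta = 0 gives h_a itself.
    `size` is assumed odd so that the grid is symmetric about the origin.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    norm = np.e / (2.0 * np.pi * a**2)            # assumed normalization constant
    return norm * (xr * yr / a**2) * np.exp(-(xr**2 + yr**2) / (2.0 * a**2))

def adaptive_wavelet(size, a, beta, theta):
    """Weighted sum of the wavelet and its rotated copy (weights beta and 1 - beta)."""
    h0 = gaussian_derivative_wavelet(size, a)
    h_theta = gaussian_derivative_wavelet(size, a, theta)
    return beta * h0 + (1.0 - beta) * h_theta

# Example values taken from the simulations reported for Su-27.
h0 = gaussian_derivative_wavelet(size=31, a=2.5)
psi = adaptive_wavelet(size=31, a=2.5, beta=0.45, theta=np.deg2rad(62.0))
```

Both sampled kernels sum to zero over the symmetric grid, which is consistent with the statement that the combination is itself a wavelet.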
Fig. 2. Test images: (a) Su-27 and (b) T-10.
The JTC setup shown in Fig. 1 can also be used to implement the CWBMC. First, the input binary slice $g_q$ and the composite reference $\sum_{n=1}^{N}t_{nq}$ are placed side by side on SLM1 while SLM2 is kept completely transparent. Then, by use of a Fourier transform lens, the JPS$_q$ with respect to gray level q is recorded by a CCD. The JPS$_q$ for all gray levels are obtained and added together. Next, the summation of these JPS$_q$ is transferred to SLM1, and the power spectrum of the adaptive wavelet is displayed on SLM2 to perform the wavelet filtering. The lens carries out the final Fourier transform. Because, for the same gray level q, the WTs of the training slices $t_{nq}$ with different scale factors yield non-overlapping contours, the cross correlations among them are close to zero. Thus no false alarms appear to disturb the correct correlation signals on the correlation plane.

2.2. Computer simulation and analysis
Computer simulation experiments are performed on the Matlab platform. We choose Su-27 and T-10, shown in Fig. 2, as training images. Each training image is scaled to 9 discrete scales: k = 0.6, 0.7, ..., 1.3, 1.4. The image with scale factor k = 1 has a size of approximately 60 pixels. The size increment is $\Delta = 6$. By use of the Sparrow criterion, we choose the wavelet scale a = 2.5, which satisfies $a \ge \Delta/(2\sqrt{2})$. For Su-27, the weight $\beta = 0.45$ and the rotation angle $\theta = 62^{\circ}$ are used as the optimal parameters to maximize the criterion K. For T-10, we choose $\beta = 0.35$ and $\theta = 65^{\circ}$. Fig. 3(a) shows the correlation peak intensity as a function of the input image scale factor k when Su-27 is used for the training images of the CWBMC. For the Su-27 with 0.6 ≤ k ≤ 1.4, the peak intensity varies between 1.0 and 0.95. If we change the size increment of the training images to $\Delta = 12$, the condition $a \ge \Delta/(2\sqrt{2})$ is no longer satisfied and the variation is between 1.0 and 0.8. The correlation intensities yielded by the conventional CWMF are given in Fig. 3(b) for comparison. For the input with 0.6 ≤ k ≤ 1.4, it varies between 1.0
and 0.88 for $\Delta = 6$, and between 1.0 and 0.66 for $\Delta = 12$. Obviously, the variation amplitude is much larger than that in the case of the CWBMC. Moreover, where the input scales k are not in the training set, the peak intensities of the CWMF decrease more rapidly than those of the CWBMC. Compared with the CWMF, the CWBMC shows more stable invariance to continuous scale distortions. Fig. 4 shows the variations of the correlation peaks for T-10, from which a similar conclusion is obtained.

In order to test the discrimination capability of the CWBMC, we choose Su-27 as the training image with $\Delta = 6$ to design the composite filter, and T-10 as the false input. From Fig. 5(a), one can see that the CWBMC successfully discriminates against the T-10 within the whole tested range. Fig. 5(b) shows that the conventional CWMF can also discriminate against the false object within most of the tested range, but somewhat worse than the CWBMC. For the CWBMC, it is the binary slices that are wavelet transformed, which generates contours or local maxima that are, in general, thinner or sparser than those from gray-level images. This leads to the better discrimination capability of the CWBMC compared with the CWMF. However, the correlation intensities of the CWBMC are weaker than those generated by the CWMF, because the CWBMC is based on the MC.

Fig. 6(a) gives the 3-D correlation output of the CWBMC. The CWBMC is designed for Su-27 with $\Delta = 6$. The three images of the Su-27 with scales of k = 0.8, 0.85 and 1.2 are detected. The Su-27 with the scale of 0.85 is not in the training set, but still produces a correlation peak with an intensity close to those of the images with k = 0.8 and 1.2. The correlation peak of the false object T-10 with k = 0.8 is much weaker and can easily be discriminated. Fig. 6(b) shows that the CWMF can also perform the discrimination task under the same conditions, but its sidelobe is wider and there is more correlation noise on the correlation plane.
The noise and sidelobes arise from the overlapped local features of the WTs of the training images. Different from the conventional CWMF, the impulse response of the CWBMC is a linear combination of the WTs of the training binary slices. The quasi-orthogonality of the linear combination is stronger than that in the CWMF, because the WTs of the binary slices can generate thinner contours or sparser features.
Fig. 3. Correlation peak intensities of (a) the CWBMC and (b) the CWMF for the image Su-27. (Plots of maximum correlation intensity versus scale factor k for Δ = 6 and Δ = 12.)
Fig. 4. Correlation peak intensities of (a) the CWBMC and (b) the CWMF for the image T-10. (Plots of maximum correlation intensity versus scale factor k for Δ = 6 and Δ = 12.)
As a result, the correlation peaks of the CWBMC are narrower than those of the CWMF, and little correlation noise exists on the correlation plane.

Finally, we use the peak-to-noise ratio (PNR) to test the noise robustness of the CWBMC against salt-and-pepper noise. The PNR is given as:
$$\mathrm{PNR}=\frac{I_{\max}}{\dfrac{1}{N}\displaystyle\sum_{(x,y)\in M}\left|z(x,y)\right|^{2}}, \tag{11}$$
where $I_{\max}$ is the correct correlation peak intensity, M represents the set of pixels whose intensity values are below half of $I_{\max}$, and N is the number of pixels in this set. In this test, a Su-27 with the scale of k = 1 is corrupted by different intensities of salt-and-pepper noise and used as the input image. The wavelet scale factor is a = 3 and the size increment of the training images is $\Delta = 6$. Fig. 7 shows the PNR of the CWMF and the CWBMC as a function of the intensity of the salt-and-pepper noise. The PNRs of the CWBMC are larger than those of the CWMF within almost the whole tested range. The morphological operation in the CWBMC plays an important role in suppressing the effect of noise. The 3-D correlation outputs in Fig. 8 further support this conclusion.

3. Conclusion
We have described a composite nonlinear approach to the recognition of an object whose size is not known exactly. The CWBMC is a combination of the WBMC and the SDF method. This composite correlation is a summation of many CWMFs. In these sub-CWMFs, threshold-decomposed binary slices of the training images with respect to gray level q are linearly combined to form composite reference slices. A directionally selective wavelet is used to generate the sparse features of these composite reference slices. Simulation tests have shown that the CWBMC is invariant to continuous-scale variations of the input. Moreover, compared with the CWMF, the CWBMC exhibits stronger robustness against
the salt-and-pepper noise. A JTC setup is available to implement the CWBMC optically.

Fig. 5. Correlation peak intensities of (a) the CWBMC and (b) the CWMF when composite filters are designed for Su-27, and input images are Su-27s and T-10s with different scales. (Plots of maximum correlation intensity versus scale factor k for Su-27 and T-10.)
Fig. 6. 3-D correlation results of (a) the CWBMC and (b) the CWMF.
Fig. 7. PNR of the CWBMC and the CWMF. (Plot of PNR versus intensity of salt-and-pepper noise.)
Fig. 8. 3-D correlation results of (a) the CWBMC and (b) the CWMF when input image Su-27 (k = 1) is corrupted by salt-and-pepper noise with the intensity of 0.2.

Acknowledgments
This research is financially supported by the general program of the National Natural Science Foundation of China (60977029) and the Foundation for Distinguished Young Talents in Higher Education of Guangdong, China (LYM10070).

References
[1] P. Maragos, Morphological correlation and mean absolute error, in: ICASSP89: 1989 International Conference on Acoustic, Speech and Signal Processing, vol. 3, Institute of Electrical and Electronics Engineers, New York, 1989, pp. 1568–1571.
[2] P. Garcia-Martinez, D. Mas, J. Garcia, C. Ferreira, Nonlinear morphological correlation: optoelectronic implementation, Appl. Opt. 37 (1998) 2112–2118.
[3] S. Zhang, M.A. Karim, Illumination-invariant pattern recognition with joint-transform-correlator-based morphological correlation, Appl. Opt. 38 (1999) 7228–7237.
[4] P. Garcia-Martinez, C. Ferreira, J. Garcia, H.H. Arsenault, Nonlinear rotation-invariant pattern recognition by use of the optical morphological correlation, Appl. Opt. 39 (2000) 776–781.
[5] J. Yao, S. Tan, Y. Liew, Morphological radial-harmonic correlation for shift- and scale-invariant pattern recognition, Opt. Eng. 41 (2002) 81–86.
[6] R. Tripathi, K. Singh, Pattern discrimination using a bank of wavelet filters in a joint transform correlator, Opt. Eng. 37 (1998) 532–538.
[7] M.S. Alam, D. Chain, Efficient multiple target recognition using a joint wavelet transform processor, Opt. Eng. 39 (2000) 1203–1210.
[8] J. Widjaja, Detection performance of wavelet-based joint transform correlation, Appl. Opt. 46 (2007) 8278–8283.
[9] Qu Wang, Li Chen, Liang Lei, Bo Wang, Wavelet-based morphological correlation, Opt. Commun. 283 (2010) 3937–3944.
[10] Qu Wang, Li Chen, Jinyun Zhou, Qinghua Lin, Modified wavelet based morphological correlation for rotation-invariant recognition, Optics & Laser Technology 43 (2011) 1504–1512.
[11] J. Rosen, J. Shamir, Scale invariant pattern recognition with logarithmic radial harmonic filters, Appl. Opt. 28 (1989) 240–244.
[12] D. Roberge, Y. Sheng, Optical composite wavelet matched filters, Opt. Eng. 33 (1994) 2290–2295.
[13] Y. Sheng, D. Roberge, Continuous-scale-invariant pattern recognition with adaptive-wavelet-matched filters, Appl. Opt. 38 (1999) 5541–5547.
[14] J.M. Combes, A. Grossmann, Ph. Tchamitchian, Wavelets: Time–Frequency Methods and Phase Space, 1st ed., Springer-Verlag, 1989.
[15] D.A. Jared, D.J. Ennis, Inclusion of filter modulation in synthetic discriminant function construction, Appl. Opt. 28 (1989) 232–239.
[16] D. Roberge, Y. Sheng, Optical implementation of real-time correlator with phase-only composite filters, Opt. Eng. 35 (1996) 2541–2547.
[17] T. Lindeberg, Scale-space for discrete signals, IEEE Trans. Pattern Anal. Mach. Intell. 12 (1990) 234–254.
[18] G.O. Reynolds, J.B. DeVelis, G.B. Parrent, B.J. Thompson, Physical Optics Notebook: Tutorials in Fourier Optics, Vol. PM01yHC of SPIE Tutorial Text Series, SPIE Press, Bellingham, Wash, 1989 (Chap. 6).