Optik 124 (2013) 1957–1960
Enhancement for road sign images and its performance evaluation

Xing Yang

State Key Laboratory of Pulsed Power Laser Technology, Key Laboratory of Infrared and Low Temperature Plasma of Anhui Province, Hefei 230037, Anhui Province, China
Article history: Received 6 January 2012; accepted 6 June 2012.

Keywords: Road sign recognition; image enhancement; ITS
Abstract

Recently, road sign recognition in natural scenes for intelligent transportation system (ITS) applications, such as license plate recognition, traffic sign recognition and road text recognition, has received much attention. To improve recognition performance under poor imaging conditions, e.g. complex lighting variations and pollution, an enhancement method for road sign images is proposed in this paper. It consists of two steps: contrast maximizing while graying and the mathematical morphological tophat–bothat transform. Performance evaluation results demonstrate that the proposed method performs well in purposiveness, real-time performance and empirical-parameter independence simultaneously, compared with several other enhancement methods.
1. Introduction

Road signs in natural scenes mainly include license plates, traffic signs and road text. Automatic recognition of these road signs plays an important role in ITS. For example, license plate recognition [1-5] is the key technology for automatic vehicle management, a basic task of ITS. Moreover, some newly developed ITS projects, such as intelligent vehicles, driverless systems and driving-assist systems, are closely related to traffic sign recognition [6-8] and road text recognition [9-12].

Poor imaging conditions in natural scenes, e.g. complex lighting variations and pollution, are likely to lower the contrast of the target region in road sign images, which seriously degrades the recognition rate. Thus, image enhancement is commonly applied to heighten the contrast of the target region and thereby improve recognition performance on low-quality images. Yet, existing enhancement methods for road sign images cannot achieve purposiveness, real-time performance and empirical-parameter independence simultaneously.

Basically, there are two categories of enhancement methods: global and local ones. Global methods, e.g. histogram equalization [2] and contrast stretching [12], have been shown to be efficient but not purposiveness-robust, since their performance relies on global contrast rather than the local contrast we are interested in (the contrast in the target region), although global contrast analysis saves much time. Conversely, local methods, e.g. local standard deviation [1], are purposiveness-robust but time-consuming, owing to their reliance on local contrast and the window scanning required for local contrast analysis. In addition, dependence on empirical parameters further weakens
the robustness of enhancement methods; examples include the grayscale range for contrast stretching, and the scanning window size and intensity threshold for local standard deviation.

To address these shortcomings, an enhancement method for road sign images is established, and its performance is evaluated in comparison with several other enhancement methods.

This paper is organized as follows. Contrast maximizing while graying is presented in Section 2. The mathematical morphological tophat–bothat transform is detailed in Section 3. Enhancement performance evaluation is presented in Section 4, and concluding remarks are given in Section 5.
2. Contrast maximizing while graying

Trichromatic imaging is the foundation of imaging device design in the visible range. Its principle can be summarized as follows: first, the visible spectrum is divided into red, green and blue wavebands according to the spectral quantization characteristics of the human eye [13]; second, light from an optical image is quantized on the three wavebands respectively; finally, the three quantization results are combined to describe the colors we see.

License plate, traffic sign and road text images can be regarded as cooperative targets with a color-discrete characteristic, as presented in Fig. 1. This characteristic can be described as follows: both the target and the local background have relatively uniform colors, and these two colors differ apparently from each other. The local background, as opposed to the global background of an image, refers to the part of the non-target area closely related to the target, e.g. the license plate background in Fig. 1(a). Hence, the color-discrete characteristic implies that the spectral reflection characteristics of target and local background differ obviously in the visible range. Taking the spectral reflection characteristics of the yellow-black
license plate in Fig. 2 as an example, we can find that the two reflection characteristics have minimal difference in the blue waveband but maximal difference in both the green and red ones.

Fig. 1. Road signs: (a) license plate, (b) traffic sign, (c) road text.

On the ground of trichromatic imaging, it is evident that the color characteristics of each pixel in an image are mainly determined by four key factors: the spectral reflection characteristic, the spectral response function of the imaging device, the channel gain of the sensor, and the photoelectric conversion coefficient (light transmission attenuation is disregarded). Among these factors, the channel gain and the photoelectric conversion coefficient are constants for a given sensor. Applying (1), we can define the quantization characteristic functions f_T(\lambda, \Delta\lambda) and f_L(\lambda, \Delta\lambda) for the reflection spectra of target and local background; these functions describe the color characteristics of pixels effectively:

f_T(\lambda, \Delta\lambda) = A \int_{(\lambda, \Delta\lambda)} R_T(\lambda)\, T_I(\lambda)\, d\lambda, \quad f_L(\lambda, \Delta\lambda) = A \int_{(\lambda, \Delta\lambda)} R_L(\lambda)\, T_I(\lambda)\, d\lambda.   (1)

Here, R_T(\lambda) and R_L(\lambda) are the spectral reflectivities of target and local background respectively, A is the product of channel gain and photoelectric conversion coefficient, and T_I(\lambda) is the spectral response function of the imaging device in the trichromatic waveband, where I = {R, G, B} and (\lambda, \Delta\lambda) refers to a primary color and its spectral range. Furthermore, the difference function of the quantization characteristic, D_I(\lambda, \Delta\lambda), is defined as

D_I(\lambda, \Delta\lambda) = f_T(\lambda, \Delta\lambda) - f_L(\lambda, \Delta\lambda) = A \int_{(\lambda, \Delta\lambda)} [R_T(\lambda) - R_L(\lambda)]\, T_I(\lambda)\, d\lambda.   (2)

Suppose that

D_max = \max(D_R(\lambda, \Delta\lambda), D_G(\lambda, \Delta\lambda), D_B(\lambda, \Delta\lambda)).   (3)

We can then infer that, for the RGB component corresponding to D_max, the difference of quantization characteristic between target and local background, i.e. the contrast in the corresponding monochromatic image, is most notable. Therefore, graying a color image by the RGB component I(i, j) corresponding to D_max, as in (4), maximizes the contrast between target and local background:

g(i, j) = I(i, j),   (4)

where g(i, j) denotes the grayscale of pixel (i, j) in the gray image. In application, RGB components are chosen as candidates if two of D_R(\lambda, \Delta\lambda), D_G(\lambda, \Delta\lambda), D_B(\lambda, \Delta\lambda) are close to each other and obviously greater than the third, or if all three are close to each other; the candidate of interest is then adopted. For example, R and G are the candidates for the yellow-black license plate in Fig. 2.

Fig. 2. Reflection characteristic of a yellow-black license plate.

3. Mathematical morphological tophat–bothat transform

To increase the contrast in the target region further, the grayscales of one part (target or local background) should be decreased while the grayscales of the other are increased. The mathematical morphological tophat–bothat transform on the gray image serves this aim well. The main idea of mathematical morphological image processing is to probe an image with a structural element so as to understand its structural characteristics part by part. Erosion and dilation are the most basic morphological operations, and the other operations can be built from them. Erosion and dilation for gray images are given by

I \ominus B = (f \ominus b)(i, j) = \min\{ f(i+x, j+y) - b(x, y) \mid (i+x, j+y) \in D_f,\ (x, y) \in D_b \}   (5)

and

I \oplus B = (f \oplus b)(i, j) = \max\{ f(i-x, j-y) + b(x, y) \mid (i-x, j-y) \in D_f,\ (x, y) \in D_b \}   (6)

respectively, where f(i, j) is the grayscale in image I, b(x, y) is the grayscale in the structural element image B, and D_f and D_b are their definition domains. Erosion shrinks the target along its edge and can isolate some connected regions; conversely, dilation enlarges the target along its edge and can merge some isolated regions. Erosion and dilation can be combined to form another two basic operations, opening and closing, denoted by

I \circ B = (I \ominus B) \oplus B   (7)

and

I \bullet B = (I \oplus B) \ominus B   (8)

respectively. As is well known, the open operation removes isolated dots, blurs and bulges, smoothing the region contour while leaving position and shape invariant; the close operation fills holes and cracks, also smoothing the region contour while leaving position and shape invariant. Since the tophat transform is the difference between the opened image and the original one, it indicates the peak grayscales of the original image. Similarly, the bothat transform presents the bottom grayscales of the original image, because it is the difference between the closed image and the original one. Therefore, the tophat–bothat transform in (9) decreases the grayscales of one part (target or local background) and increases the grayscales of the other, as expected:

I' = tophat(I) + I - bothat(I),   (9)

where tophat(I) = I \circ B - I and bothat(I) = I \bullet B - I, and I' is the enhanced image.
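As an illustration of the channel-selection graying of Section 2, the following Python sketch (NumPy assumed) estimates a per-channel difference between mean target and local-background values on a labeled sample image and then grays an image by the winning channel, following Eq. (4). This uses mean pixel values from hand-labeled masks as an empirical proxy for the spectral quantities D_I of Eq. (2); the function and variable names are illustrative, not taken from the paper.

import numpy as np

def channel_differences(image_bgr, target_mask, background_mask):
    # Mean |target - local background| difference per channel (B, G, R order).
    # image_bgr: HxWx3 uint8 color image; the masks are boolean HxW arrays
    # marking the target region and its local background (how the masks are
    # obtained is not prescribed by the paper).
    diffs = []
    for c in range(3):
        channel = image_bgr[:, :, c].astype(np.float64)
        d = abs(channel[target_mask].mean() - channel[background_mask].mean())
        diffs.append(d)
    return np.array(diffs)  # analogues of D_B, D_G, D_R in Eq. (2)

def gray_by_best_channel(image_bgr, diffs):
    # Gray the image with the single channel whose difference is largest,
    # i.e. the component corresponding to D_max in Eqs. (3) and (4).
    best = int(np.argmax(diffs))
    return image_bgr[:, :, best].copy()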
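For the tophat–bothat step of Section 3, a minimal sketch using OpenCV's grayscale morphology is given below. Note that OpenCV's MORPH_TOPHAT and MORPH_BLACKHAT follow the common conventions I - I∘B and I•B - I, whereas Eq. (9) above writes the tophat difference with the opposite sign; the 40 x 40 rectangular structural element matches the size quoted in Section 4, but the sign convention and element shape may need adjusting to reproduce the paper's results exactly.

import cv2
import numpy as np

def tophat_bothat_enhance(gray, kernel_size=(40, 40)):
    # Contrast enhancement combining top-hat and bottom-hat responses.
    # gray: single-channel uint8 image (e.g. output of gray_by_best_channel).
    # Uses the widely used form I + (I - opening(I)) - (closing(I) - I).
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, kernel_size)
    tophat = cv2.morphologyEx(gray, cv2.MORPH_TOPHAT, kernel)    # I - opening(I)
    bothat = cv2.morphologyEx(gray, cv2.MORPH_BLACKHAT, kernel)  # closing(I) - I
    enhanced = gray.astype(np.int16) + tophat.astype(np.int16) - bothat.astype(np.int16)
    return np.clip(enhanced, 0, 255).astype(np.uint8)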
4. Enhancement performance evaluation

Generally, two categories of images need to be enhanced: those in which the grayscales of the target region are higher, and those in which they are lower, as illustrated in Figs. 3 and 4(a) respectively. Fifty license plate images covering both categories were prepared as one data set; similarly, 50 traffic sign images were prepared as the other. These two data sets were used to compare histogram equalization, contrast stretching, local standard deviation, and the proposed method, implemented in Matlab 7.8.0 on a PC with a Pentium IV at 2.4 GHz and 1 GB RAM. Empirical parameters for the latter three methods must be set: the stretching parameter n for contrast stretching is set to 40, the size of the morphological structural element is set to 40 x 40, and the empirical parameters for local standard deviation are set according to [1].

Two contrasts should be as high as possible in the enhanced target region: one against the grayscale range within the target region, and the other against the whole grayscale range (0, 255). Accordingly, the relative contrast and the absolute contrast are defined as

C_r = \frac{|g_T - g_L|}{g_T + g_L}   (10)

and

C_a = \frac{|g_T - g_L|}{255},   (11)

where g_T and g_L are the mean grayscales of target pixels and local background pixels respectively. Since the two contrasts are equally important, the integrated contrast is defined as

C = \frac{C_r + C_a}{2}.   (12)

Therefore, the average integrated contrast C̄ is adopted to describe the enhancement performance of each method, and the average execution time t̄ describes its real-time performance. The time of conventional weighted average graying [3] is added to every method except the proposed one, since the latter combines graying and enhancement. Experimental results are shown in Table 1, and two groups of sample images are given in Figs. 3 and 4.

Table 1. Performance evaluation results.

      Histogram equalization   Contrast stretching   Local standard deviation   The proposed method
C̄     0.119                    0.102                 0.076                      0.245
t̄     39 ms                    43 ms                 1294 ms                    143 ms

Fig. 3. (a) Image with higher grayscales of the target region, (b) histogram equalization, (c) contrast stretching, (d) local standard deviation, (e) the proposed method.

Fig. 4. (a) Image with lower grayscales of the target region, (b) histogram equalization, (c) contrast stretching, (d) local standard deviation, (e) the proposed method.

The average integrated contrast of the proposed method, 0.245, is much higher than those of the other methods, so the target regions in its enhanced images, as illustrated in Figs. 3 and 4(e), are the most feature-salient. Although the average execution time of the proposed method is more than three times those of histogram equalization and contrast stretching, it is about one tenth of that of local standard deviation. In addition, the proposed method is more robust in empirical-parameter independence than contrast stretching and local standard deviation: with the same parameters, it achieves satisfactory results on both the license plate and the traffic sign images, whereas the other two methods do not, as shown in Figs. 3(c) and (d) and 4(c) and (d). Local standard deviation even produces an unacceptable result on the traffic sign image, which directly leads to its lowest average integrated contrast. Obviously, parameter resetting would be needed for these two methods. The experimental results and discussion above demonstrate that the proposed method performs well in purposiveness, real-time performance and empirical-parameter independence simultaneously.
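A small sketch of how the contrast measures (10)-(12) can be computed from a gray image and labeled target/background masks is given below (NumPy assumed; the masks and the function name are illustrative, not prescribed by the paper):

import numpy as np

def integrated_contrast(gray, target_mask, background_mask):
    # Relative, absolute and integrated contrast of Eqs. (10)-(12).
    # gray: single-channel image as a NumPy array; the boolean masks select
    # the target pixels and the local background pixels.
    g_t = float(gray[target_mask].mean())      # mean target grayscale
    g_l = float(gray[background_mask].mean())  # mean local background grayscale
    c_r = abs(g_t - g_l) / (g_t + g_l)  # Eq. (10), relative contrast
    c_a = abs(g_t - g_l) / 255.0        # Eq. (11), absolute contrast
    return (c_r + c_a) / 2.0            # Eq. (12), integrated contrast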
5. Conclusions
To address the shortcoming that existing enhancement methods for road sign images cannot achieve purposiveness, real-time performance and empirical-parameter independence simultaneously, a new method is proposed in this paper. It consists of two steps: contrast maximizing while graying and the mathematical morphological tophat–bothat transform. In the first step, the color-discrete characteristic of road sign images is used to maximize the contrast in the target region while graying. Then the mathematical morphological tophat–bothat transform is adopted to heighten the contrast further, since it can decrease the grayscales of one part (target or local background) and increase the grayscales of the other. Combining the advantages of these two steps, the proposed method achieves satisfactory evaluation results. In particular, robust purposiveness without heavy time consumption is its most notable characteristic, which underlines its significance for road sign recognition in natural scenes for ITS applications.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (60972093), the fund for outstanding young scholars of Anhui Province, China (10040606Y07), and the Beijing Jiaotong University Science and Technology Foundation (2010JBZ010).
References

[1] D.N. Zhen, Y.N. Zhao, J.X. Wang, An efficient method of license plate location, Pattern Recognit. Lett. 26 (2005) 2431–2438.
[2] M. Cinsdikici, A. Ugur, T. Tunali, Automatic number plate information extraction and recognition for intelligent transportation system, Imaging Sci. J. 55 (2007) 102–113.
[3] P.G. Hou, J. Zhao, M. Liu, A license plate locating method based on tophat–bothat changing and line scanning, J. Phys.: Conf. Ser. 48 (2006) 431–436.
[4] C.N.E. Anagnostopoulos, I.E. Anagnostopoulos, V. Loumos, E. Kayafas, A license plate-recognition algorithm for intelligent transportation system applications, IEEE Trans. Intell. Transp. Syst. 7 (2006) 377–392.
[5] H. Caner, H.S. Gecim, A.Z. Alkar, Efficient embedded neural-network-based license plate recognition system, IEEE Trans. Veh. Technol. 57 (2009) 2675–2683.
[6] X. Baró, S. Escalera, J. Vitrià, O. Pujol, P. Radeva, Traffic sign recognition using evolutionary adaboost detection and forest-ECOC classification, IEEE Trans. Intell. Transp. Syst. 10 (2009) 113–126.
[7] C.G. Keller, C. Sprunk, C. Bahlmann, J. Giebel, G. Baratoff, Real-time recognition of U.S. speed signs, in: Proc. IEEE Intelligent Vehicles Symposium, 2008, pp. 518–523.
[8] M. Garcia-Garrido, M. Sotelo, E. Martin-Gorostiza, Fast traffic sign detection and recognition under changing lighting conditions, in: Proc. IEEE ITSC, 2006, pp. 811–816.
[9] X.L. Chen, J. Yang, J. Zhang, A. Waibel, Automatic detection and recognition of signs from natural scenes, IEEE Trans. Image Process. 13 (2004) 87–99.
[10] N. Ezaki, M. Bulacu, L. Schomaker, Text detection from natural scene images: towards a system for visually impaired persons, in: Proc. IEEE 17th Int. Conf. Pattern Recognition, 2004, pp. 683–686.
[11] U. Bhattacharya, S.K. Parui, S. Mondal, Devanagari and Bangla text extraction from natural scene images, in: Proc. IEEE 10th Int. Conf. Document Analysis and Recognition, 2009, pp. 171–175.
[12] J.S. Kim, S.C. Park, S.H. Kim, Text locating from natural scene images using image intensities, in: Proc. Eighth Int. Conf. Document Analysis and Recognition, 2005, pp. 655–659.
[13] R.C. Gonzalez, R.E. Woods, Digital Image Processing, Prentice-Hall, Englewood Cliffs, NJ, 2001.