Neurocomputing 73 (2010) 2708–2717
Shadow compensation based on facial symmetry and image average for robust face recognition
Ping-Cheng Hsieh, Pi-Cheng Tung*
Department of Mechanical Engineering, National Central University, No. 300, Jhongda Road, Jhongli City, Taoyuan County 32001, Taiwan, ROC
Article history: Received 7 September 2009; received in revised form 26 April 2010; accepted 28 April 2010; available online 8 May 2010. Communicated by J. Yang.

Abstract
In this paper, we propose a novel shadow compensation approach for face recognition under varying illumination. The approach, based on facial symmetry and image average, overcomes the drawbacks of conventional algorithms such as histogram equalization (HE) and region-based histogram equalization (RHE). In particular, the proposed approach has the following advantages: (1) it is very simple, so it is easily implemented in a real-time face recognition system; (2) it is able to reinforce key facial features and to standardize the other parts of the face; and (3) it can be applied directly to a single face image without any prior information about the light source direction. The experimental results show that the proposed approach achieves excellent recognition performance. © 2010 Elsevier B.V. All rights reserved.
Keywords: Face recognition; Shadow compensation; Illumination variation; Facial symmetry; Image average; Histogram equalization (HE)
1. Introduction
Human face recognition, as one of the primary biometric technologies, has become increasingly important owing to growing demands on security, such as criminal identification, identity verification for credit cards or passports, access control, intelligent surveillance, and so on. However, since the intra-person variations caused by illumination change are often larger than the inter-person differences [1], illumination-robust face recognition remains a challenging problem [2,3]. Recently, numerous approaches have been proposed to deal with this problem. Broadly, they can be classified into three main categories [4–6]: preprocessing and normalization, invariant feature extraction, and face modeling. In the first category (preprocessing and normalization), image processing techniques are applied to alleviate the effect of uneven illumination on the face image before the feature extraction procedure. One of the best-known preprocessing methods is histogram equalization (HE), which flattens the distribution of luminance values in an image; in other words, HE can be regarded as optimizing the contrast of an image. In addition to HE, several modified HE techniques have
* Corresponding author. Tel.: +886 3 4267304; fax: +886 3 4254501. E-mail addresses: [email protected] (P.-C. Hsieh), [email protected] (P.-C. Tung). doi:10.1016/j.neucom.2010.04.015
also been introduced to cope with illumination variations, such as region-based histogram equalization (RHE) [7] and block-based histogram equalization (BHE) [8]. The main idea of RHE is to execute HE independently in pre-defined image regions partitioned from the original face image, so as to increase the local contrast and to reinforce local information such as textures and edges. Nevertheless, RHE introduces significant differences between the pixels at the edges of adjacent regions. To solve this problem, BHE carries out HE in pre-defined image blocks, each of which overlaps its neighbors by half. However, the computational requirement of BHE is much higher than that of RHE, and noise is also amplified by BHE. More recently, in order to deal with illumination variation in face images, Song et al. [9] proposed the mirror-image method, which is based on the assumption of facial symmetry. More specifically, this method first divides the original face image into a left face image and a right face image. In the second step, the luminance difference between the left and right face images is computed. When this luminance difference exceeds a pre-defined threshold value, the mirror image is produced by mirroring the brighter half face image. Nevertheless, since the mirror-image method only considers illumination variation caused by a left/right light source direction, it cannot handle illumination variation caused by other light source directions. For instance, when the light source is at an elevation angle of 80° relative to the frontal face, the luminance difference between the left and right face images is almost equal to
zero. In other words, the mirror-image method performs no further processing in this case. In the second category (invariant feature extraction), many algorithms have been presented to extract facial features that are invariant to illumination variations. As an example, Shashua and Riklin-Raviv [10] introduced the quotient image (QI) method, which is based on the ratio of the albedos of two objects. The quotient image is regarded as an illumination-invariant signature image, which can be used for face recognition under variable lighting. However, this method may fail to obtain the illumination-invariant feature when the input image contains a shadow. Therefore, several improved QI approaches based on the concept of the original QI method have been proposed, such as quotient image relighting (QIR) [7], the self quotient image (SQI) [11], and the total variation quotient image (TVQI) [12]. In addition, Gross and Brajovic [13] presented a new image preprocessing algorithm that first estimates the illumination field from a single brightness image and then compensates for it to largely recover the scene reflectance. Another well-known method is the local binary pattern (LBP), which has been used as an illumination-invariant feature. The limitation of LBP is its sensitivity to random and quantization noise in uniform and near-uniform image regions such as the forehead and cheeks. To counter this, Tan and Triggs [14] extended LBP to the local ternary pattern (LTP), which is more discriminant and less sensitive to noise in uniform regions. In the third category (face modeling), based on several face images of each person under different lighting conditions, a number of researchers have attempted to construct an appropriate face model that can render face images unaffected by illumination variations.
One of the predominant approaches is the illumination cone model [15], which can be approximated by a low-dimensional linear subspace whose basis vectors are estimated from a number of images under variable lighting conditions. In Ref. [16], the authors model illumination variations using a 9D linear subspace formed by nine harmonic images. Nevertheless, to obtain the harmonic images, the object's structure (or at least its albedo) must be known in advance. In order to overcome this problem, Lee et al. [17] utilize nine images captured under nine different lighting conditions to obtain the 9D linear subspace. However, one of the main shortcomings of the face modeling approaches is that several face images of each individual under different lighting conditions are needed during the modeling phase. Accordingly, it is very difficult to apply these approaches in real time. From the standpoint of applicability, the algorithms in the first category usually have the following properties: (1) simplicity, (2) general purpose, and (3) no modeling step or training images required. Hence, we present a novel shadow compensation approach that not only has the advantages of the first approach group but also overcomes the drawbacks of the previously mentioned algorithms in that category. This approach, called the mirror-region based histogram equalization and image average approach (MR-HEIA), first carries out HE in pre-defined mirror regions to enhance the local contrast. In contrast to RHE, each mirror region's partition direction is constrained to be parallel to the symmetry axis of the face. Therefore, the mirror version of the left face image can be produced by mirroring the enhanced right face image, and the average image is obtained by averaging the enhanced left face image with its mirror version. Finally, the compensated face image used for face recognition is acquired by mirroring the average image.
Through the above process, we can reinforce the facial features and standardize the other parts of the face. The experimental results show that the recognition performance of the face recognition system, whether using principal component analysis (PCA) [18] or two-dimensional PCA (2DPCA) [19,20] for feature extraction, is significantly improved by applying our approach.
The rest of this paper is organized as follows. Section 2 describes the proposed approach for shadow compensation of face image. Experimental results and discussion are presented in Section 3. Finally, conclusions are drawn in Section 4.
2. Proposed approach
In this section, we provide a detailed description of the MR-HEIA approach. First, in order to enhance the local contrast, MR-HEIA performs HE in pre-defined mirror regions, which are generated by dividing the original face image along the direction parallel to the symmetry axis of the face. More specifically, suppose that I denotes a face image of a person and K denotes the number of mirror regions in each half face image; that is, a face image has 2K mirror regions in total. Each mirror region in the left face image is denoted I_Li, i = 1, 2, …, K, where i indicates the position of the mirror region in the half face image and increases progressively from the border of the face image toward its center. Similarly, each mirror region in the right face image is denoted I_Ri, i = 1, 2, …, K. Fig. 1 shows an example of the mirror-region partition with K = 2. After the first step of MR-HEIA, we obtain 2K enhanced mirror regions, denoted Î_Li and Î_Ri in the left and right face images, respectively, i = 1, 2, …, K. Figs. 2(a) and (b) show some face images severely affected by uneven illumination and the corresponding enhanced face images using K = 2. As can be seen from Fig. 2(b), the facial features are reinforced, but a shortcoming similar to that of RHE still exists. In order to make up for this shortcoming while preserving the enhanced facial features, the image average concept proposed by Jenkins and Burton [21] is introduced into our approach. In Ref. [21], Jenkins and Burton showed that averaging different face images of the same person dilutes transients (e.g., facial expression, lighting condition, and age) while preserving the aspects of appearance that are consistent across images. Nevertheless, one of the main issues in face recognition lies in the difficulty of collecting samples [22]. In general, the number of samples per
Fig. 1. An example of mirror-region partition with K = 2.
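As a concrete illustration of this partition step, the following sketch (with hypothetical helper names; this is not code from the paper) splits an H × W grayscale image into the 2K mirror regions, indexed from the image border toward the symmetry axis:

```python
import numpy as np

def mirror_regions(img, K):
    """Split a face image into 2K vertical mirror regions.

    I_L[i] lies in the left half and I_R[i] in the right half; index i
    runs from the image border (i = 0 here) toward the symmetry axis.
    Assumes the symmetry axis is the vertical mid-line of the image.
    """
    H, W = img.shape
    half = W // 2
    left, right = img[:, :half], img[:, half:]
    # np.array_split tolerates widths that do not divide evenly by K
    I_L = np.array_split(left, K, axis=1)
    # right-half regions are indexed from the border inward, so reverse
    I_R = np.array_split(right, K, axis=1)[::-1]
    return I_L, I_R
```

For the 112 × 92 images used later in the paper, K = 2 yields four regions of width 23 each.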
person available for training is much fewer than ten, sometimes just one. Taking this situation into account, the mirror technique based on facial symmetry is used as the basis for performing image averaging. In other words, MR-HEIA still works in the case of one sample per person. The remaining steps of MR-HEIA are formulated as follows. Based on the mirror technique, the mirror version of each mirror region in the left face image is produced by mirroring the corresponding enhanced mirror region in the right face image, and is denoted Î_MLi, i = 1, 2, …, K. Afterwards, based on the image average concept, the average image can be obtained by averaging each enhanced mirror region in the left face image with its corresponding mirror version. However, the image average method cannot fully exert its ability to dilute variations, because each average is computed from only two mirror regions (i.e., the enhanced mirror region and the corresponding mirror version). Hence, we propose a modified image average method, more suitable for the present situation, described as follows. Within a local region of a face image, the gray-level intensities of the facial features (such as the eyebrows, eyes, and mouth) are usually lower than those of the other parts of the local region, whether or not the light source direction is frontal. Further, owing to the action of HE on the local region, the facial features become darker while the other parts become brighter. This means that, when computing the average image of the two mirror regions (i.e., the enhanced mirror region and the corresponding mirror version), the pixel-value sum of the non-facial parts at the same position usually exceeds 255. Consequently, taking this property into account, the modified average image is computed as

  I_MALi = (Î_Li + Î_MLi)/2   if Î_Li + Î_MLi < 255
  I_MALi = 255/2              if Î_Li + Î_MLi ≥ 255,      i = 1, 2, …, K      (1)

Through the above process, the facial features can be reinforced and the other parts of the face can be standardized. Then, the compensated face image used for face recognition is acquired by mirroring the modified average image. Fig. 2(c) shows some compensated face images obtained by performing MR-HEIA on the face images shown in Fig. 2(a). Finally, in the face recognition procedure, PCA and 2DPCA are utilized for feature extraction, and the nearest neighbor classifier based on the Euclidean distance is employed for classification. Fig. 3 shows the system block diagram of MR-HEIA for

Fig. 2. An example of performing MR-HEIA using K = 2: (a) original face images, (b) enhanced face images, and (c) compensated face images.
Fig. 3. System block diagram and corresponding illustration using K = 2.
understanding the overall procedure more clearly. Since we mainly focus on shadow compensation of the face image, the details of the face recognition procedures will not be discussed in this paper; see Refs. [18–20] for more details.
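The whole pipeline can be condensed into a short sketch. This is an illustrative reimplementation under our own naming, not the authors' code; it assumes an even image width, takes 255/2 as 127 for integer images, and uses the fact that HE commutes with horizontal flipping (flipping a region does not change its histogram), so the right half is flipped before per-region equalization, which yields the mirror versions Î_MLi directly:

```python
import numpy as np

def hist_eq(region):
    """Plain histogram equalization on an 8-bit grayscale region."""
    hist = np.bincount(region.ravel(), minlength=256)
    cdf = hist.cumsum().astype(np.float64)
    # map gray levels so the cumulative distribution becomes roughly uniform
    lut = np.round(255.0 * (cdf - cdf.min()) / (cdf.max() - cdf.min())).astype(np.uint8)
    return lut[region]

def mr_heia(img, K=2):
    """Sketch of the MR-HEIA pipeline on an H x W uint8 face image.

    Steps: (1) HE on each of the 2K mirror regions, (2) mirror the
    enhanced right-half regions onto the left, (3) modified average with
    the clamp of Eq. (1), (4) mirror the averaged half back to a full face.
    """
    H, W = img.shape
    half = W // 2
    left_regions = np.array_split(img[:, :half], K, axis=1)
    # flip the right half first so both splits share region widths and order
    mirrored_right = np.array_split(img[:, half:2 * half][:, ::-1], K, axis=1)
    averaged = []
    for I_L, I_MR in zip(left_regions, mirrored_right):
        e_L = hist_eq(I_L).astype(np.int32)
        e_ML = hist_eq(I_MR).astype(np.int32)
        s = e_L + e_ML
        # Eq. (1): average where the sum stays below 255, else clamp to 255/2
        averaged.append(np.where(s < 255, s // 2, 127).astype(np.uint8))
    left_half = np.concatenate(averaged, axis=1)
    # final mirror step: the compensated face is left-right symmetric
    return np.concatenate([left_half, left_half[:, ::-1]], axis=1)
```

By construction the output is left-right symmetric, which is exactly the property the final mirror step of MR-HEIA produces.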
3. Experimental results and discussion
In this section, we evaluate the performance of the proposed approach on two public face databases: the Yale B face database [15,23] and the Weizmann face database [24]. In the experiments, all images are manually cropped and aligned in compliance with Ref. [15]. Specifically, the distance between the eyes equals four sevenths of the cropped window width, and the face is centered along the vertical orientation so that the two imaginary horizontal lines passing through the eyes and mouth are equidistant from the center of the cropped window. Subsequently, all face images are resized to 112 × 92 pixels.
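The alignment rule can be made concrete with a small helper. The function and landmark names below are hypothetical (the paper only states the geometric constraints); the height is derived from the 112:92 target aspect ratio used in the experiments:

```python
def crop_window(left_eye, right_eye, mouth):
    """Derive a crop window from eye/mouth landmarks (hypothetical helper).

    Rules from the text: the inter-eye distance equals 4/7 of the window
    width, and the eye line and mouth line are equidistant from the
    window's vertical center. Landmarks are (x, y) pixel coordinates.
    """
    eye_dist = right_eye[0] - left_eye[0]
    width = round(eye_dist * 7 / 4)
    cx = (left_eye[0] + right_eye[0]) / 2
    # vertical center = midpoint between the eye line and the mouth line
    cy = (left_eye[1] + right_eye[1] + 2 * mouth[1]) / 4
    # height follows the 112:92 aspect ratio used in the experiments
    height = round(width * 112 / 92)
    x0 = round(cx - width / 2)
    y0 = round(cy - height / 2)
    return x0, y0, width, height
```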
3.1. Experiments on Yale B face database
The Yale B database contains images of 10 subjects in nine poses under 64 illumination conditions per pose. We used 450 face images of the 10 subjects (45 images per subject) with frontal pose and normal expression for the illumination-variant test. These face images were divided into four subsets based on the angle of the light source orientation. Subset 1 includes 70 images (7 per subject) captured with the angle between the light source orientation and the camera axis within 12°. Subsets 2 and 3 contain 120 images (12 per subject) each, captured under light source orientations within 25° and 50°, respectively. Subset 4 has 140 images (14 per subject) captured under light source orientations within 77°. Example images of these subsets are shown in Fig. 4. Following Ref. [7], the images in Subset 1 were selected as the training set, and each of the remaining subsets was used independently as a test set. Since the
Fig. 4. Example images of a subject in the Yale B database. The images are divided into four subsets according to the angle between the light source direction and the camera axis: (a) subset 1 (up to 12°); (b) subset 2 (up to 25°); (c) subset 3 (up to 50°); and (d) subset 4 (up to 77°).
recognition accuracy of a region-based approach may be affected by the size of the image region [25,26], several mirror-region sizes were used in our experiments: 112 × 46 (K = 1) and 112 × 23 (K = 2). It should be mentioned that, in the case of K = 3, the mirror regions at the border of the image (i.e., I_L1 and I_R1) were set to a size of 112 × 16 and the remaining mirror regions (i.e., I_L2, I_L3, I_R2, and I_R3) to a size of 112 × 15. Moreover, following Ref. [7], the four-region partition structure (shown in Fig. 5) was adopted when performing RHE. The comparisons of the different illumination preprocessing methods on top recognition rate and the corresponding dimension of the feature vector (for PCA) or feature matrix (for 2DPCA) are tabulated in Tables 1 and 2. Figs. 6 and 7 show the recognition rate versus the number of projection axes for Subset 4 with the PCA- and 2DPCA-based recognition systems, respectively. Here, PCA and 2DPCA used all projection axes and the first 30 projection axes, respectively, to achieve the maximal recognition accuracy. It can be seen from Table 1 that MR-HEIA achieves up to an 18.57% performance improvement over RHE for Subset 4 when using PCA as the feature extraction method. A similar result can be seen from Table 2, in which MR-HEIA achieves up to a 14.28% performance improvement over RHE for Subset 4 when using 2DPCA as a feature extraction
Table 2. The top recognition rate (TRR, %) comparisons of different illumination preprocessing methods and the corresponding dimensions (D) of the feature matrices when using 2DPCA as a feature extraction method.

Method           Subset 2          Subset 3          Subset 4
                 TRR     D         TRR     D         TRR     D
Non              97.5    112×9     86.67   112×30    24.29   112×26
HE               100     112×4     99.17   112×18    52.14   112×29
RHE              100     112×1     100     112×2     77.86   112×29
MR-HEIA (K=1)    100     112×1     100     112×8     89.29   112×10
MR-HEIA (K=2)    100     112×1     100     112×10    90      112×11
MR-HEIA (K=3)    100     112×1     100     112×12    92.14   112×14
R-HEIA (K=1)     100     112×1     100     112×15    83.57   112×30
R-HEIA (K=2)     100     112×1     100     112×10    85      112×25
R-HEIA (K=3)     100     112×1     100     112×7     90.71   112×30
Fig. 6. The recognition rates of HE, RHE, and the proposed method versus the number of projection axes using PCA as a feature extraction method.
Fig. 5. The four-region partition for RHE.

Table 1. The top recognition rate (TRR, %) comparisons of different illumination preprocessing methods and the corresponding dimensions (D) of the feature vectors when using PCA as a feature extraction method.

Method           Subset 2       Subset 3       Subset 4
                 TRR     D      TRR     D      TRR     D
Non              95      11     58.33   35     22.14   39
HE               100     8      92.5    33     40.71   11
RHE              100     6      100     11     67.14   35
MR-HEIA (K=1)    100     6      100     36     83.57   23
MR-HEIA (K=2)    100     6      99.17   20     85.71   48
MR-HEIA (K=3)    100     6      99.17   37     85.71   36
R-HEIA (K=1)     100     5      100     17     78.57   37
R-HEIA (K=2)     100     5      100     23     83.57   40
R-HEIA (K=3)     100     5      100     22     89.29   47

Fig. 7. The recognition rates of HE, RHE, and the proposed method versus the number of projection axes using 2DPCA as a feature extraction method.
method. Overall, MR-HEIA achieves a better recognition rate than the other traditional methods. Additionally, the results show that the recognition rate of MR-HEIA rises as K increases from 1 to 3. Now a question arises: how should a proper value of the parameter K be chosen? To answer this question, we re-executed the Subset 4 experiments on the Yale B database, varying K from 1 to 10 to observe the effect of K on face recognition. Fig. 8 shows the recognition rate of the proposed approach with varying K. In the case of PCA, the recognition rates are about 88% when K varies from 7 to 10. In the case of 2DPCA, the
recognition rates are identical (i.e., 91.43%) when K varies from 7 to 10. Consequently, the range from 7 to 10 can be considered a good basis for choosing the parameter K. Besides, the top recognition accuracy of 2DPCA is 92.14%, obtained with K = 3. Fig. 9 shows example images of the mirror-region partition with K = 3. In Fig. 9, we can see that the eyes and nose in each image happen to be divided into two nearly symmetric parts. Perhaps this partition structure is the main reason for the improved recognition accuracy of 2DPCA.
3.2. Experiments on Weizmann face database
In the Weizmann database, we selected 150 face images of 10 adults with frontal pose and normal expression for our experiments. This subset contains 15 different illumination conditions for each adult. Following Ref. [27], the face images with the frontal lighting condition were selected as the training set, and the remaining face images were used as the test set. Figs. 10(a) and (b) show the sets of face images of one adult used for training and testing, respectively. The experimental results are listed in Table 3. From the results, we can see that (1) MR-HEIA achieves up to a 19.29% performance improvement over HE with the PCA-based recognition system and up to an 8.57% improvement over HE with the 2DPCA-based recognition system; and (2) the performance difference between MR-HEIA and RHE is insignificant owing to the
Fig. 8. Recognition rate of the proposed approach for different values of K.
Fig. 9. Example images of mirror-region partition with K = 3.
Table 3. Experimental results on the Weizmann face database.

Recognition system   Method           Top recognition rate (%)   Dimension
PCA                  Non              77.86                      9
                     HE               80                         9
                     RHE              98.57                      9
                     MR-HEIA (K=1)    97.14                      9
                     MR-HEIA (K=2)    99.29                      9
                     MR-HEIA (K=3)    99.29                      9
2DPCA                Non              89.29                      112×19
                     HE               90                         112×4
                     RHE              99.29                      112×7
                     MR-HEIA (K=1)    97.86                      112×6
                     MR-HEIA (K=2)    98.57                      112×3
                     MR-HEIA (K=3)    98.57                      112×3

Fig. 10. Sample images of an adult in the Weizmann database used for (a) training and (b) testing.
light source directions changing gradually within a small range. These results are consistent with the experimental results on the Yale B database (i.e., the results of Subsets 2 and 3).
Fig. 11. An example of performing the RHE: (a) original face images and (b) processed face images.
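The boundary artifact that Fig. 11 illustrates is easy to reproduce numerically. The sketch below (toy data and our own helper names, not the paper's code) equalizes the two halves of a side-lit gradient independently, RHE-style, and measures the gray-level jump across the partition boundary before and after:

```python
import numpy as np

def hist_eq(region):
    """Plain histogram equalization on an 8-bit grayscale region."""
    hist = np.bincount(region.ravel(), minlength=256)
    cdf = hist.cumsum().astype(np.float64)
    lut = np.round(255.0 * (cdf - cdf.min()) / (cdf.max() - cdf.min())).astype(np.uint8)
    return lut[region]

# A toy side-lit "face": brightness ramps smoothly from left to right.
img = np.tile(np.linspace(20, 230, 92), (112, 1)).astype(np.uint8)

# RHE-style processing: equalize the two halves independently.
rhe = np.concatenate([hist_eq(img[:, :46]), hist_eq(img[:, 46:])], axis=1)

# The jump across the region boundary grows after independent equalization,
# because each half is stretched to the full [0, 255] range separately.
jump_before = int(np.abs(img[:, 46].astype(int) - img[:, 45].astype(int)).max())
jump_after = int(np.abs(rhe[:, 46].astype(int) - rhe[:, 45].astype(int)).max())
```

On this toy image the boundary jump grows from a few gray levels to well over two hundred, which is the seam visible in Fig. 11(b).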
3.3. Discussion
3.3.1. The mirror operation's advantage and limitation
From the complete experimental results, we can see that MR-HEIA evidently outperforms RHE when the illumination variation is large. A question arises: why does the recognition rate of RHE drop significantly under large lighting variations? According to the concept of RHE, under large illumination variations the difference between the pixels at the edges of adjacent regions becomes quite significant (as shown in Fig. 11). In contrast, in our approach, the enhanced mirror region and the corresponding mirror version are combined into one new compensated image, in which the facial features are reinforced and the other parts of the face are standardized. Accordingly, MR-HEIA overcomes this disadvantage of RHE. On the other hand, since the mirror-image method [9] only employs the mirror operation to produce the compensated face image (i.e., the brighter half face image is mirrored), it can only handle illumination variation caused by a left/right light source direction. In contrast, the proposed approach, based on HE and image averaging, extends the ability of the mirror-image method [9] to handle illumination variation caused by arbitrary light source directions. From this standpoint, MR-HEIA can be viewed as an extension of the mirror-image method [9]. Furthermore, based on the assumption of facial symmetry, it is not difficult to see that the last step of MR-HEIA can be omitted: the modified average image (i.e., the compensated half face image) can be used directly for face recognition. Carrying out the same experiments, we obtained identical results and thereby confirmed the feasibility of this simplified approach.
As far as computational speed and storage requirements are concerned, we believe that the simplified version of MR-HEIA is an efficient illumination compensation approach and is more suitable for a real-time face recognition system. However, the efficacy of the mirror operation in our method clearly depends on accurate face alignment. In other words, the mirror operation is sensitive to misalignment, and MR-HEIA is not workable in the case of misalignment. As already discussed in Section 2, in terms of the modified image average method, the mirror operation can actually be omitted, and Eq. (1) can be altered as follows:

  I_MALi = (Î_Li + Î_Li)/2   if Î_Li + Î_Li < 255
  I_MALi = 255/2             if Î_Li + Î_Li ≥ 255,      i = 1, 2, …, K      (2)

  I_MARi = (Î_Ri + Î_Ri)/2   if Î_Ri + Î_Ri < 255
  I_MARi = 255/2             if Î_Ri + Î_Ri ≥ 255,      i = 1, 2, …, K      (3)

Fig. 12. Example images of an individual from the Weizmann database.
where I_MALi and I_MARi denote the modified average images in the left half and the right half of the image, respectively. After this process, the compensated face image used for face recognition is obtained. For convenience, this simplified process is called the region-based histogram equalization and image average approach (R-HEIA); its experimental results on the Yale B database are listed in Tables 1 and 2.
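A minimal sketch of the clamping rule in Eqs. (2) and (3) follows (our own helper name; 255/2 is taken as 127 for integer images). With the mirror step omitted, each enhanced region is averaged with itself, so the rule reduces to keeping pixels whose doubled value stays below 255 and clamping the rest:

```python
import numpy as np

def r_heia_region(enhanced):
    """Modified average for one enhanced region, per Eqs. (2)-(3).

    Averaging a region with itself leaves it unchanged, so the rule
    reduces to a clamp: pixels whose doubled value reaches 255 are
    replaced by 255/2 (127 for integer images), others are kept.
    """
    e = enhanced.astype(np.int32)
    return np.where(e + e < 255, (e + e) // 2, 127).astype(np.uint8)
```

The same function applied to each Î_Li and Î_Ri yields the I_MALi and I_MARi halves of the compensated image.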
Table 4. The formative conditions of each subset.

                      Training set        Subset 1            Subset 2
Viewpoint             V1–V5               V1–V5               V1–V5
Facial expression     E1                  E1                  E2, E3
Lighting condition    I2                  I1, I3              I1, I3
Number of images      (5×1×1)×26 = 130    (5×1×2)×26 = 260    (5×2×2)×26 = 520
From the results we can see that (1) the simplified version of MR-HEIA (i.e., R-HEIA) still evidently outperforms the other traditional methods; and (2) in the Subset 4 experiment, for the same value of K, the recognition rate of R-HEIA is lower than that of MR-HEIA, except for the case of K = 3 with the PCA-based recognition system. Therefore, we conclude that the modified image average method effectively improves the performance of region-based histogram equalization, and that the mirror operation further increases the efficacy of the modified image average method.
3.3.2. The R-HEIA's robustness and practicality
In real-world applications, face images often exhibit lighting and pose variation at the same time. For this reason, the following experiment was conducted to verify the practicality of R-HEIA. First, 1170 images of 26 individuals (45 images per individual) were selected from the Weizmann database. This set covers 45 viewing conditions (5 viewpoints × 3 facial expressions × 3 lighting conditions). Example images of an individual are shown in Fig. 12. In the experiment, we chose 130 images of the 26 individuals with normal expression (E1) and center-light condition (I2) as the training set, consisting of 5 images per person from 5 different viewpoints (V1–V5). Afterwards, some of the remaining images were picked and divided into two subsets for a lighting-variant test and an expression-plus-lighting-variant test. Subset 1 includes 260 images of the 26 individuals captured under lighting conditions (I1 and I3) different from the training set. Subset 2 has 520 images of the 26 individuals with different expression and lighting conditions at the same time (i.e., I1E2, I1E3, I3E2, and I3E3). The formative conditions of these subsets are summarized in Table 4. As in the previous experiments, these images were also resized to 112 × 92 pixels.

Table 5. Experimental results of Subset 1 on the Weizmann face database.

Recognition system   Method             Top recognition rate (%)   Dimension
PCA                  Non                75.38                      102
                     HE                 71.15                      117
                     RHE                73.85                      110
                     R-HEIA (K=1)       89.62                      119
                     R-HEIA (K=1)+AF    89.62                      119
                     R-HEIA (K=2)       98.08                      78
                     R-HEIA (K=2)+AF    98.85                      94
                     R-HEIA (K=3)       99.23                      106
                     R-HEIA (K=3)+AF    99.23                      112
2DPCA                Non                95.38                      112×15
                     HE                 94.23                      112×20
                     RHE                93.85                      112×19
                     R-HEIA (K=1)       94.23                      112×19
                     R-HEIA (K=1)+AF    94.62                      112×20
                     R-HEIA (K=2)       98.85                      112×14
                     R-HEIA (K=2)+AF    98.85                      112×15
                     R-HEIA (K=3)       99.23                      112×12
                     R-HEIA (K=3)+AF    99.23                      112×13

Table 6. Experimental results of Subset 2 on the Weizmann face database.

Recognition system   Method             Top recognition rate (%)   Dimension
PCA                  Non                61.73                      118
                     HE                 59.62                      122
                     RHE                65                         125
                     R-HEIA (K=1)       78.08                      126
                     R-HEIA (K=1)+AF    78.08                      127
                     R-HEIA (K=2)       88.85                      126
                     R-HEIA (K=2)+AF    89.23                      127
                     R-HEIA (K=3)       90.77                      128
                     R-HEIA (K=3)+AF    90.96                      128
2DPCA                Non                85                         112×15
                     HE                 77.88                      112×20
                     RHE                83.27                      112×19
                     R-HEIA (K=1)       82.5                       112×18
                     R-HEIA (K=1)+AF    83.46                      112×18
                     R-HEIA (K=2)       90.19                      112×12
                     R-HEIA (K=2)+AF    90.58                      112×15
                     R-HEIA (K=3)       91.73                      112×19
                     R-HEIA (K=3)+AF    92.12                      112×19

Fig. 13. The recognition rates of HE, RHE, and the proposed method versus the number of projection axes using PCA as a feature extraction method.

Fig. 14. The recognition rates of HE, RHE, and the proposed method versus the number of projection axes using 2DPCA as a feature extraction method.
Fig. 15. Some processed images based on the Weizmann database: (a) original images, (b) images processed by HE, (c) images processed by RHE, and (d) images processed by R-HEIA (K = 3). Note that the notations in parentheses denote the lighting condition.
Tables 5 and 6 list the experimental results. Figs. 13 and 14 show the recognition rate versus the number of projection axes for Subset 1 with the PCA- and 2DPCA-based recognition systems, respectively. Here, PCA and 2DPCA used all projection axes and the first 20 projection axes, respectively, to achieve the maximal recognition accuracy. From the results we observe that (1) with the PCA-based recognition system, R-HEIA improves the performance from 75.38% to 99.23% in the lighting-variant test and from 61.73% to 90.77% in the expression-plus-lighting-variant test; and (2) with the 2DPCA-based recognition system, R-HEIA improves the performance by about 5% in both tests. Furthermore, the results reveal that HE and RHE are unable to improve the recognition performance here. Some images processed by the different methods are shown in Fig. 15. As can be seen from this figure, when pose variation occurs, HE and RHE make the side-lighting effect more pronounced when the test set (i.e., I1 and I3) is compared with the training set (i.e., I2). In contrast, R-HEIA, which applies HE and the modified image average to local regions of the image, still works well when lighting and pose variations occur simultaneously.
3.3.3. Comparisons with other well-known methods
In this section, the advantages of the proposed approach over other well-known preprocessing methods are further discussed. In Ref. [7], Shan et al. not only presented the RHE method but also introduced the quotient image relighting (QIR) method to deal with the illumination variation problem. For the Subset 4 experiment on the Yale B database, the recognition rate of QIR is 90.6% [7]. The recognition accuracy of our method (i.e., 92.14%) is higher than that of QIR, setting aside differences in test conditions such as image size and the types of distance measure, classifier, and recognition system. In the related literature, the recognition rates of some current preprocessing methods exceed 95% for the same experiment, but most of these methods are evaluated only on completely frontal face images with a standard pose. In other words, we do not know
whether these methods still work well in the case of pose variation. Moreover, some of these methods also adopt HE as a first step in order to enhance their own performance. Therefore, in contrast to these state-of-the-art algorithms, the proposed approach is relatively simple and is able to work in the case of pose variation. Finally, some questions still deserve further study. In Fig. 15, the processed images show that the differences between the pixels at the edges of adjacent regions cannot be entirely eliminated by R-HEIA. This is the main reason why the performance of R-HEIA is lower than that of MR-HEIA in the case of a frontal pose. To further reduce these remaining differences, we applied a spatial averaging filter (AF) of size 7 × 7 to the edges of the partition regions. The experimental results listed in Tables 5 and 6 reveal that the performance is only slightly improved (within 1%). Similar results were obtained using a Gaussian lowpass filter. Thus, how to remove these differences more effectively is one of our future works. In addition, the shape of the nose becomes rather vague after the modified image averaging process, since the gray-level intensity of the nose is similar to that of other non-facial features. This is another question that should be further explored.
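The seam-smoothing step described above can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation: the `smooth_seams` name, the 2 × 2 partition layout, and the band width (half the kernel size on each side of a boundary) are our assumptions; only the 7 × 7 box average over region edges comes from the text.

```python
import numpy as np

def smooth_seams(img, rows=2, cols=2, ksize=7):
    """Apply a ksize x ksize averaging (box) filter only within a narrow
    band around the partition-region boundaries, leaving the interiors
    of the regions untouched (illustrative sketch)."""
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    pad = ksize // 2
    # Edge-replicate padding so the box window is defined at the borders.
    padded = np.pad(img, pad, mode='edge')
    out = img.copy()

    def box(y, x):
        # Mean of the ksize x ksize window centered at original (y, x).
        return padded[y:y + ksize, x:x + ksize].mean()

    # Horizontal seams between row-partitions, vertical seams between
    # column-partitions; smooth a band of `pad` pixels on either side.
    seams_r = [i * h // rows for i in range(1, rows)]
    seams_c = [j * w // cols for j in range(1, cols)]
    for r in seams_r:
        for y in range(max(r - pad, 0), min(r + pad, h)):
            for x in range(w):
                out[y, x] = box(y, x)
    for c in seams_c:
        for x in range(max(c - pad, 0), min(c + pad, w)):
            for y in range(h):
                out[y, x] = box(y, x)
    return np.clip(out, 0, 255).astype(np.uint8)
```

Restricting the filter to the seam bands limits the blurring to exactly the blocking artifacts, which is consistent with the small (under 1%) accuracy change the paper reports.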
4. Conclusions In this paper, we have proposed a novel shadow compensation approach (i.e., MR-HEIA) for face recognition under varying illumination. This approach overcomes the shortcomings of traditional algorithms such as HE and RHE. Moreover, it has several advantages: (1) it is very simple, so it can easily be implemented in a real-time face recognition system; (2) it is able to reinforce key facial features and standardize other parts of the face; and (3) it can be applied directly to a single face image (i.e., it can work in the case of one sample per person). The experimental results show that the proposed approach achieves excellent recognition performance and confirm that it is an efficient shadow compensation approach for face recognition under varying illumination.
Acknowledgement

The authors gratefully acknowledge the support provided to this work by the National Science Council of Taiwan, under Grant no. NSC 97-2221-E-008-008-MY2.

References

[1] Y. Adini, Y. Moses, S. Ullman, Face recognition: the problem of compensating for changes in illumination direction, IEEE Trans. Pattern Anal. Mach. Intell. 19 (7) (1997) 721–731.
[2] W. Zhao, R. Chellappa, P.J. Phillips, A. Rosenfeld, Face recognition: a literature survey, ACM Comput. Surv. 35 (4) (2003) 399–458.
[3] A.F. Abate, M. Nappi, D. Riccio, G. Sabatino, 2D and 3D face recognition: a survey, Pattern Recognition Lett. 28 (2007) 1885–1906.
[4] W. Chen, M.J. Er, S. Wu, Illumination compensation and normalization for robust face recognition using discrete cosine transform in logarithm domain, IEEE Trans. Syst. Man Cybern. Part B: Cybern. 36 (2) (2006) 458–466.
[5] J. Ruiz-del-Solar, J. Quinteros, Illumination compensation and normalization in eigenspace-based face recognition: a comparative study of different pre-processing approaches, Pattern Recognition Lett. 29 (2008) 1966–1979.
[6] X. Zou, J. Kittler, K. Messer, Illumination invariant face recognition: a survey, in: Proceedings of the First IEEE International Conference on Biometrics: Theory, Applications, and Systems, 2007.
[7] S. Shan, W. Gao, B. Cao, D. Zhao, Illumination normalization for robust face recognition against varying lighting conditions, in: Proceedings of the IEEE International Workshop on Analysis and Modeling of Faces and Gestures, 2003, pp. 157–164.
[8] X. Xie, K.M. Lam, Face recognition under varying illumination based on a 2D face shape model, Pattern Recognition 38 (2005) 221–230.
[9] Y.J. Song, Y.G. Kim, U.D. Chang, H.B. Kwon, Face recognition robust to left/right shadows; facial symmetry, Pattern Recognition 39 (2006) 1542–1545.
[10] A. Shashua, T. Riklin-Raviv, The quotient image: class-based re-rendering and recognition with varying illuminations, IEEE Trans. Pattern Anal. Mach. Intell. 23 (2) (2001) 129–139.
[11] H. Wang, S.Z. Li, Y. Wang, Face recognition under varying lighting conditions using self quotient image, in: Proceedings of the Sixth IEEE International Conference on Automatic Face and Gesture Recognition, 2004, pp. 819–824.
[12] T. Chen, W. Yin, X.S. Zhou, D. Comaniciu, T.S. Huang, Illumination normalization for face recognition and uneven background correction using total variation based image models, in: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, 2005, pp. 532–539.
[13] R. Gross, V. Brajovic, An image preprocessing algorithm for illumination invariant face recognition, in: Proceedings of the Fourth International Conference on Audio- and Video-based Biometric Person Authentication, vol. 2688, 2003, pp. 10–18.
[14] X. Tan, B. Triggs, Enhanced local texture feature sets for face recognition under difficult lighting conditions, in: Proceedings of the 2007 IEEE International Workshop on Analysis and Modeling of Faces and Gestures, LNCS 4778, 2007, pp. 168–182.
[15] A.S. Georghiades, P.N. Belhumeur, D.J. Kriegman, From few to many: illumination cone models for face recognition under variable lighting and pose, IEEE Trans. Pattern Anal. Mach. Intell. 23 (6) (2001) 643–660.
[16] R. Basri, D.W. Jacobs, Lambertian reflectance and linear subspaces, IEEE Trans. Pattern Anal. Mach. Intell. 25 (2) (2003) 218–233.
[17] K.C. Lee, J. Ho, D. Kriegman, Nine points of light: acquiring subspaces for face recognition under variable lighting, in: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, 2001, pp. 519–526.
[18] M. Turk, A. Pentland, Eigenfaces for recognition, J. Cognitive Neurosci. 3 (1) (1991) 71–86.
[19] J. Yang, D. Zhang, A.F. Frangi, J.Y. Yang, Two-dimensional PCA: a new approach to appearance-based face representation and recognition, IEEE Trans. Pattern Anal. Mach. Intell. 26 (1) (2004) 131–137.
[20] Y. Xu, D. Zhang, J. Yang, J.Y. Yang, An approach for directly extracting features from matrix data and its application in face recognition, Neurocomputing 71 (10–12) (2008) 1857–1865.
[21] R. Jenkins, A.M. Burton, 100% accuracy in automatic face recognition, Science 319 (2008) 435.
[22] X. Tan, S. Chen, Z.H. Zhou, F. Zhang, Face recognition from a single image per person: a survey, Pattern Recognition 39 (2006) 1725–1745.
[23] Yale B face database (http://cvc.yale.edu/projects/yalefacesB/yalefacesB.html).
[24] Weizmann face database (http://www.faculty.idc.ac.il/moses/).
[25] R. Gottumukkal, V.K. Asari, An improved face recognition technique based on modular PCA approach, Pattern Recognition Lett. 25 (2004) 429–436.
[26] P.C. Hsieh, P.C. Tung, A novel hybrid approach based on sub-pattern technique and whitened PCA for face recognition, Pattern Recognition 42 (2009) 978–984.
[27] S. Bhavani, A. Thawani, V. Sridhar, K.R. Ramakrishnan, Illumination invariant face recognition for frontal faces using modified census transform, in: Proceedings of the IEEE Region 10 International Conference on TENCON, 2007.
Ping-Cheng Hsieh was born in Miaoli, Taiwan, in August 1981. He received his BS degree in Mechanical Engineering from Minghsin University of Science and Technology, Taiwan, in 2003, and his MS degree from the Department of Systems and Naval Mechatronic Engineering, National Cheng-Kung University, Taiwan, in 2005. He is currently a PhD student in the Department of Mechanical Engineering at National Central University, Taiwan. His current research interests are in the areas of face recognition, pattern recognition, image processing, and robot vision.
Pi-Cheng Tung received the PhD degree from Michigan State University in 1987. He is currently a Professor in the Department of Mechanical Engineering, National Central University, Taiwan. He has published over 70 papers in international journals. His current research interests include chaos, dynamics control, signal processing, and image processing. Dr. Tung currently serves as an Associate Editor of the Journal of Aeronautics, Astronautics and Aviation.