Computers and Electrical Engineering 79 (2019) 106451
Representation-based classification methods with enhanced linear reconstruction measures for face recognition

Jianping Gou a, Jun Song a, Weihua Ou b,∗, Shaoning Zeng c, Yunhao Yuan d, Lan Du e

a School of Computer Science and Communication Engineering and Jiangsu Key Laboratory of Security Tech. for Industrial Cyberspace, Jiangsu University, Zhenjiang, Jiangsu, 212013, China
b School of Big Data and Computer Science, Guizhou Normal University, Guiyang, Guizhou, 550025, China
c School of Information Science and Technology, Huizhou University, Huizhou, Guangdong, 516007, China
d School of Computer Science and Technology, Yangzhou University, Yangzhou, Jiangsu, 225127, China
e Faculty of Information Technology, Monash University, Clayton, VIC 3800, Australia
Article info
Article history: Received 26 April 2019; Revised 23 August 2019; Accepted 26 August 2019
Keywords: Collaborative representation Sparsity augmented collaborative representation Representation-based classification Linear reconstruction measurement
Abstract

Representation-based classification (RBC) methods have recently emerged as promising pattern recognition techniques for object recognition. The representation coefficients of RBC, viewed as a linear reconstruction measure (LRM), can be used effectively for classifying objects. In this article, we propose two enhanced linear reconstruction measure-based classification methods built on the sparsity-augmented collaborative representation-based classification method (SA-CRC). The first is the weighted enhancement linear reconstruction measure-based classification method (WELRMC), which introduces data localities into SA-CRC. The second is the two-phase weighted enhancement linear reconstruction measure-based classification method (TPWELRMC), which integrates both coarse and fine representations into SA-CRC. To demonstrate the effectiveness of the proposed methods, experiments are conducted on several public face databases in comparison with state-of-the-art representation-based classification methods. The experimental results show that the proposed methods significantly outperform the competing RBC methods. © 2019 Published by Elsevier Ltd.
1. Introduction

In pattern recognition, the collaborative representation-based classification (CRC) method with an l2-norm on the representation coefficients [1] and the sparse representation-based classification (SRC) method with an l1-norm instead [2] are two typical representation-based classification methods. Since both achieve promising classification performance, many variants of CRC and SRC have been widely applied to various classification tasks, such as face recognition [3,4] and image classification [5–7].
R This paper was submitted for special section SI-aicv3, but it must be included in the regular papers section of CAEE. Reviews processed and recommended for publication to the Editor-in-Chief by Associate Editor Dr. Huimin Lu.
∗ Corresponding author.
E-mail addresses: [email protected] (J. Gou), [email protected] (J. Song), [email protected] (W. Ou), [email protected] (S. Zeng), [email protected] (Y. Yuan), [email protected] (L. Du).
https://doi.org/10.1016/j.compeleceng.2019.106451 0045-7906/© 2019 Published by Elsevier Ltd.
The standard sparse representation-based classification (SRC) method was proposed in [2] and has been successfully applied to face recognition under different variations, such as lighting, occlusion and facial expression. In [2], the discriminative property of sparse representation was also analyzed in detail. Recently, the research progress in the theories and applications of sparse representation-based classification was reviewed in [8], and a survey of dictionary learning, a crucial part of sparse representation, was given in [3]. Cheng et al. analyzed the discriminative nature of sparse representation from the perspective of the similarity measure: the sparse representation coefficients can well reflect the similarities of data [9]. Using this similarity property, the sum of class-specific coefficients (SoC) was proposed as a classification decision rule in [10]. Moreover, since the sparse representation coefficients preserve the similarities of data, sparse graph construction has been widely used for dimensionality reduction [11]. These works show that sparse representation provides good pattern discrimination for classifying objects.

Collaborative representation is another typical representation used for pattern classification [1]. Zhang et al. argued that the working mechanism of sparse representation-based classification benefits from the collaboration of all the training samples rather than from the l1-norm sparsity constraint, and introduced the benchmark collaborative representation-based classification (CRC) method with an l2-norm on the representation coefficients [1]. In [12], nonnegative representation-based classification was proposed, which emphasizes that constraining the representation coefficients to be nonnegative can help pattern classification. Deng et al. [13] further analyzed the discriminative property of CRC and proposed superposed linear representation-based classification to reduce the negative impact of the representation coefficients from the wrong classes. From a probabilistic point of view, the discriminative nature of CRC was analyzed in [14]; the corresponding extension is the probabilistic collaborative representation-based classification method, which considers class-specific representations [14,15]. In [16], Xu et al. proposed the discriminative sparse representation-based classification method as a variant of CRC, which adopts a constraint term over pairs of class-specific representations.

As shown by the SRC and CRC methods, both sparse representation and collaborative representation are meaningful and useful for classifying objects, because the l1-norm of the coefficients promotes sparseness while the l2-norm of the coefficients is robust to outliers [17]. To enhance the pattern discrimination of SRC and CRC, data localities are often used as weights to constrain the representation coefficients [18,19]. Exploiting the efficiency of collaborative representation, Xu et al. proposed the two-phase test sample sparse representation method, which employs a coarse-to-fine collaborative representation to quickly improve the sparsity of the coefficients for good classification performance [20]. As argued in [9], the sparse representation coefficients can serve as a similarity measure. The linear reconstruction measure (LRM), which uses the representation coefficients under different norms as a similarity measure, was introduced in [21], together with the linear reconstruction measure-based nearest neighbor classification method. Using LRM and the coarse-to-fine representation, Gou et al. [22] proposed the two-phase linear reconstruction measure-based classification method (TPLRMC), using the l1-norm and the l2-norm of the coefficients, respectively. In [23], Akhtar et al. argued in detail that collaboration and sparsity can simultaneously make representation-based classification more effective, and proposed the sparsity augmented collaborative representation-based classification method (SA-CRC). Since sparsity from sparse representation yields more discrimination, while CRC often produces similar class-specific residuals across multiple classes and thus misclassification, SA-CRC combines sparse representation and collaborative representation into an augmented representation for favourable classification. In fact, SA-CRC is itself a linear reconstruction measure-based classification method.

As mentioned above, data localities make similar or close training samples contribute more to the representation and classification of each testing sample [18,19], and the coarse-to-fine representation makes the representation coefficients sparse by deleting dissimilar training samples in the coarse phase [20,22]. Inspired by both LRM and SA-CRC, we propose in this article two new enhanced linear reconstruction measure-based classification methods, extensions of SA-CRC that aim to further improve representation-based classification performance. The first, entitled the weighted enhancement linear reconstruction measure-based classification method (WELRMC), uses data localities as weights to simultaneously constrain the coefficients of the sparse and collaborative representations, producing more effective augmented representations. The second, the two-phase weighted enhancement linear reconstruction measure-based classification method (TPWELRMC), designs a coarse-to-fine augmented representation for classification by simultaneously weighting the coefficients of the sparse and collaborative representations.
To demonstrate the effectiveness of the proposed methods, we conduct experiments on five public face databases, comparing them with the related RBC methods. The experimental results show that the proposed WELRMC and TPWELRMC are promising classification methods with satisfactory performance. In summary, the main contributions of this article are as follows:
(a) The weighted sparsity augmented collaborative representation is designed, and its corresponding classification method, WELRMC, is proposed.
(b) The coarse-to-fine weighted sparsity augmented collaborative representation is designed, and its corresponding classification method, TPWELRMC, is proposed.
The rest of the article is organized as follows: Section 2 reviews the related works. Section 3 describes the proposed methods in detail. Section 4 reports the experimental results. Section 5 concludes this article.
2. Related works

In this section, the related works are briefly reviewed. We first summarize the common notation used in the following sections. Suppose the training set of $n$ training samples in an $m$-dimensional feature space is denoted as $X = [x_1, x_2, \ldots, x_n] \in \mathbb{R}^{m \times n}$. All the training samples belong to $M$ classes, denoted as the set $\{c_1, c_2, \ldots, c_M\}$, and the given testing sample is denoted as $y \in \mathbb{R}^m$.

2.1. LRM

The linear reconstruction measure (LRM) [21] can be regarded as a good similarity measure that is widely used to design representation-based classification methods [10,22,23]. Generally speaking, LRM uses the linear reconstruction coefficients to reflect the similarities of data. The benchmark model of LRM is defined as
$$\min_S \|y - XS\|_2^2, \tag{1}$$

where $S = [s_1, s_2, \ldots, s_n]^T$ is the vector of linear reconstruction coefficients and $s_i$ reflects the similarity between the training sample $x_i$ and $y$. The regularized LRM is defined as

$$\min_S \|y - XS\|_2^2 + \lambda \|S\|_p, \tag{2}$$
where $\lambda$ is the regularization parameter and $p$ is often set to 1 or 2.

2.2. SA-CRC

The sparsity augmented collaborative representation-based classification method (SA-CRC) is a linear reconstruction measure-based classifier that integrates the coefficients of sparse representation and collaborative representation [23]. As argued in SA-CRC, both collaboration from collaborative representation and sparseness from sparse representation can simultaneously strengthen the power of representation-based pattern discrimination. The SA-CRC method has three parts: sparse representation, collaborative representation and augmented representation. The first part is the collaborative (also called dense) representation, defined as
$$\check{S} = \arg\min_S \|y - XS\|_2^2 + \lambda \|S\|_2^2. \tag{3}$$

Then, the solution can be easily obtained as $\check{S} = (X^T X + \lambda I)^{-1} X^T y$, where $I$ is an identity matrix. The second part is the sparse representation, defined as

$$\hat{S} = \arg\min_S \|y - XS\|_2, \quad \text{s.t. } \|S\|_0 \le k, \tag{4}$$

where $\|\cdot\|_0$ denotes the $\ell_0$ pseudo-norm and $k$ is the sparsity threshold. In practice, the sparse coefficients $\hat{S}$ can be solved as

$$\hat{S} = \arg\min_S \|y - XS\|_2^2 + \lambda \|S\|_1. \tag{5}$$
The third part of SA-CRC is the augmented representation, whose coefficient vector $S^*$ is defined as

$$S^* = \frac{\hat{S} + \check{S}}{\|\hat{S} + \check{S}\|_2}. \tag{6}$$
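As a concrete illustration, the three parts of SA-CRC in Eqs. (3), (5) and (6) can be sketched in a few lines of NumPy. This is a minimal sketch under toy assumptions: the function name `sa_crc_coefficients` is illustrative, and plain ISTA stands in for whatever dedicated l1 solver is actually used for Eq. (5).

```python
import numpy as np

def sa_crc_coefficients(X, y, lam=0.01, n_iter=200):
    """Sketch of SA-CRC's augmented coefficients (Eqs. (3), (5), (6)).

    X: (m, n) matrix with training samples as columns; y: (m,) testing sample.
    """
    n = X.shape[1]
    # Collaborative (dense) part, Eq. (3): closed-form ridge solution.
    S_dense = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)
    # Sparse part, Eq. (5): plain ISTA iterations for the lasso problem.
    L = np.linalg.norm(X, 2) ** 2          # Lipschitz constant of the gradient
    S_sparse = np.zeros(n)
    for _ in range(n_iter):
        z = S_sparse - X.T @ (X @ S_sparse - y) / L
        S_sparse = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    # Augmentation, Eq. (6): sum the two vectors and normalise.
    S = S_dense + S_sparse
    return S / np.linalg.norm(S)
```

The classification decision then simply picks the class whose coefficients in the returned vector have the largest sum.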
After obtaining S∗ , the class label ly of the testing sample y is determined as the class with the minimal sum of coefficients among all the classes:
ly = argmax ci
δ ci ( S ∗ ).
(7)
where δ ci (S∗ ) ∈ Rn is a new vector whose nonzero entries are the entries in S∗ belonging to class ci . 2.3. TPLRMC The two-phase linear reconstruction measure-based classification (TPLRMC) method is to use the representation coefficients of both the coarse and the fine representations in deciding the class label of a testing sample [22]. Specifically speaking, it uses coefficients of the coarse representation to choose the nearest training samples in the first phase, and further adopts the chosen nearest training samples to finely represent the testing sample in the second phase. Finally, the classification decision is made by using the coefficients of the fine representation. In the first phase, a testing sample y is coarsely represented by all the training samples as y ≈ x1 α1 + x2 α2 , . . . , xn αn = X α , where α = [α1 , α2 , . . . , αn ]T . And the representation coefficient vector α is solved as follows:
α ∗ = argmin y − X α22 + λα p , α
(8)
Fig. 1. All the facial images of one individual chosen from each face database, as an example.
where the value of $p$ in the $l_p$-norm of the representation coefficients is always set to 1 or 2. After obtaining the optimized coefficient vector $\alpha^*$ in Eq. (8), the coefficient $\alpha_i^*$ is regarded as the contribution of each training sample $x_i$ to the representation of $y$. Then, using the coefficients as the similarity measure, the $K$ training samples corresponding to the top $K$ largest coefficients in $\alpha^*$ are chosen to finely represent $y$ again. The chosen $K$ training samples from $H$ classes are denoted as $\bar{X} = [\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_K]$. The set of the chosen $H$ classes is denoted as $\{\bar{c}_1, \bar{c}_2, \ldots, \bar{c}_H\}$, a subset of $\{c_1, c_2, \ldots, c_M\}$. In the second phase, $y$ is finely represented as $y \approx \beta_1\bar{x}_1 + \beta_2\bar{x}_2 + \cdots + \beta_K\bar{x}_K = \bar{X}\beta$, where $\beta = [\beta_1, \beta_2, \ldots, \beta_K]^T$ is the vector of the fine representation coefficients. The optimized $\beta$ is solved as

$$\beta^* = \arg\min_\beta \|y - \bar{X}\beta\|_2^2 + \lambda \|\beta\|_p, \tag{9}$$

where $\beta^* = [\beta_1^*, \beta_2^*, \ldots, \beta_K^*]^T$. Similarly, the representation coefficient $\beta_i^*$, as a similarity measure, can be viewed as the contribution of the training sample $\bar{x}_i$ to the fine representation of $y$. Using the fine representation coefficients in the second phase, the classification decision of TPLRMC is defined as

$$l_y = \arg\max_{\bar{c}_i} \operatorname{sum}\big(\delta_{\bar{c}_i}(\beta^*)\big), \tag{10}$$
where $\delta_{\bar{c}_i}(\beta^*) \in \mathbb{R}^K$ is a new vector whose nonzero entries are the entries in $\beta^*$ belonging to class $\bar{c}_i$. That is, the class label $l_y$ of the testing sample $y$ is determined as the class $\bar{c}_i$ with the maximal sum of coefficients among the chosen $H$ classes. It should be noted that the two phases of TPLRMC adopt the same norm of the coefficients, and TPLRMC with the l1-norm and the l2-norm of the representation coefficients is denoted as TPLRMC(p=1) and TPLRMC(p=2), respectively.

3. The proposed methods

In this section, we propose two representation-based classification methods based on LRM and the coarse-to-fine representation: the weighted enhancement linear reconstruction measure-based classification (WELRMC) method and the two-phase weighted enhancement linear reconstruction measure-based classification (TPWELRMC) method. Both can be seen as variants of SA-CRC.

3.1. WELRMC

It is shown in SA-CRC that using the representation coefficients of both the sparse representation and the collaborative representation can significantly improve the classification performance of RBC [23]. In fact, the augmented representation coefficients in SA-CRC are obtained by designing the new linear reconstruction measure in Eq. (6), and SA-CRC gains pattern discrimination power from augmented representation coefficients that combine the properties of sparsity and collaboration. The study in [24] shows that the localities of data are more crucial than sparsity, and that data localities can enhance sparsity but not vice versa. It has also been proven that sparsity can strengthen discrimination; thus, the localities of data can improve pattern discrimination. Recently, data localities have been used to extend many RBC methods for improved classification performance [18,19]. Considering the good properties of the augmented representation coefficients and the localities of data, we introduce the localities between the training
Fig. 2. The sum of the coefficients and the reconstruction residual per class via SA-CRC and WELRMC for a given testing sample on GT.
samples and each testing sample into the SA-CRC method, yielding the WELRMC method. In WELRMC, the local similarity distances between the training samples and each testing sample serve as weights that constrain both the sparse representation coefficients and the collaborative representation coefficients in the augmented representation. The proposed WELRMC method mainly contains the weighted sparse representation, the weighted collaborative representation and the augmented representation. In the weighted collaborative representation, the testing sample $y$ is collaboratively represented by all the training samples, and the collaborative representation coefficients are constrained by the localities of data. The model of the weighted collaborative representation is defined as

$$\check{A} = \arg\min_A \|y - XA\|_2^2 + \lambda \|WA\|_2^2, \tag{11}$$

where $\check{A}$ is the optimized vector of $A$ containing the collaborative representation coefficients and $W$ is the constraint weight matrix that reflects the localities of data. $W$ is defined as

$$W = \mathrm{diag}(\|y - x_1\|_2, \|y - x_2\|_2, \ldots, \|y - x_n\|_2), \tag{12}$$

where $\|y - x_i\|_2$ is the local similarity distance between the testing sample $y$ and the training sample $x_i$. With this weighted constraint, a similar training sample $x_i$ with a smaller distance $\|y - x_i\|_2$ contributes more to representing $y$; that is, $x_i$ receives a larger representation coefficient. The collaborative representation coefficient vector can be easily obtained in closed form as $\check{A} = (X^T X + \lambda W^T W)^{-1} X^T y$. In the weighted sparse representation, the testing sample $y$ is sparsely represented by all the training samples under the same weighted constraint. The weighted sparse model is defined as

$$\hat{A} = \arg\min_A \|y - XA\|_2^2 + \lambda \|WA\|_1, \tag{13}$$

where $\hat{A}$ is the optimized vector of $A$ containing the sparse representation coefficients and the constraint weight matrix $W$ is defined in Eq. (12).
Fig. 3. The sum of the coefficients and the reconstruction residual per class via SA-CRC and WELRMC for a given testing sample on IMM.
Using the optimized vectors of the weighted collaborative and sparse representation coefficients, the augmented representation coefficients are defined as

$$A^* = \frac{\hat{A} + \check{A}}{\|\hat{A} + \check{A}\|_2}. \tag{14}$$

Then, the augmented representation coefficients in $A^*$ are regarded as the linear reconstruction measure for the classification decision. The class label $l_y$ of the testing sample $y$ is determined as follows:

$$l_y = \arg\max_{c_i} \operatorname{sum}\big(\delta_{c_i}(A^*)\big), \tag{15}$$

where $\delta_{c_i}(A^*) \in \mathbb{R}^n$ is a new vector whose nonzero entries are the entries in $A^*$ belonging to class $c_i$. As discussed above, the steps of the proposed WELRMC method are summarized in Algorithm 1.

Algorithm 1 The WELRMC algorithm.
Require: Training sample set: $X \in \mathbb{R}^{m \times n}$; the given testing sample: $y \in \mathbb{R}^m$.
Ensure: The class label $l_y$ of $y$.
1: Compute the collaborative representation coefficients $\check{A}$ via Eqs. (11) and (12).
2: Compute the sparse representation coefficients $\hat{A}$ via Eqs. (12) and (13).
3: Calculate the augmented representation coefficients $A^*$ by Eq. (14).
4: Determine the class label $l_y$ of $y$ by computing the sums of the class-specific coefficients in $A^*$ with Eq. (15).
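The steps of Algorithm 1 can be sketched in NumPy as follows. This is a minimal illustrative sketch, not the authors' implementation: the function name `welrmc_classify` is hypothetical, and ISTA with per-entry thresholds stands in for whichever weighted l1 solver is actually used for Eq. (13).

```python
import numpy as np

def welrmc_classify(X, labels, y, lam=0.01, n_iter=200):
    """Sketch of Algorithm 1 (WELRMC).

    X: (m, n) training matrix (samples as columns), labels: (n,) class labels,
    y: (m,) testing sample. Returns the predicted class label.
    """
    n = X.shape[1]
    # Eq. (12): locality weights, one distance per training sample.
    w = np.linalg.norm(X - y[:, None], axis=0)
    W = np.diag(w)
    # Eq. (11): weighted collaborative coefficients, closed form.
    A_dense = np.linalg.solve(X.T @ X + lam * W.T @ W, X.T @ y)
    # Eq. (13): weighted sparse coefficients via ISTA (per-entry thresholds).
    L = np.linalg.norm(X, 2) ** 2
    A_sparse = np.zeros(n)
    for _ in range(n_iter):
        z = A_sparse - X.T @ (X @ A_sparse - y) / L
        A_sparse = np.sign(z) * np.maximum(np.abs(z) - lam * w / L, 0.0)
    # Eq. (14): augmented coefficients.
    A = A_dense + A_sparse
    A /= np.linalg.norm(A)
    # Eq. (15): the class with the largest sum of its coefficients wins.
    classes = np.unique(labels)
    sums = [A[labels == c].sum() for c in classes]
    return classes[int(np.argmax(sums))]
```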
Fig. 4. The sum of the coefficients and the reconstruction residual per class via SA-CRC and WELRMC for a given testing sample on ORL.
3.2. TPWELRMC

As discussed above, the augmented representation in SA-CRC enjoys the superiority of both collaboration and sparsity, the sparsity of the sparse representation strengthens pattern discrimination, and the localities of data further enhance sparsity; the proposed WELRMC inherits these properties. It has been proven that coarse-to-fine representation-based classification methods achieve more pattern discrimination [15,20,22], because the coarse-to-fine representation eliminates dissimilar training samples in the first phase and finely represents and correctly classifies each testing sample in the second phase. Building on the good properties of the augmented representation, the localities of data and the coarse-to-fine representation, we introduce the coarse-to-fine representation into WELRMC and propose the TPWELRMC method. TPWELRMC has two phases, each containing a weighted collaborative representation, a weighted sparse representation and an augmented representation. In the first phase of TPWELRMC, the given testing sample $y$ is coarsely represented by all the training samples as $y \approx B_1 x_1 + B_2 x_2 + \cdots + B_n x_n = XB$, where $B = [B_1, B_2, \ldots, B_n]^T$. Then, the weighted collaborative representation coefficients are computed as

$$\check{B} = \arg\min_B \|y - XB\|_2^2 + \lambda \|W_1 B\|_2^2, \tag{16}$$

where $\check{B}$ is the optimized vector of $B$ containing the $n$ collaborative representation coefficients corresponding to the $n$ training samples, and the constraint weight matrix $W_1$ reflects the localities of data and is defined as in Eq. (12). $\check{B}$ can be easily obtained as $\check{B} = (X^T X + \lambda W_1^T W_1)^{-1} X^T y$. Simultaneously, the weighted sparse representation using all the training samples is defined as

$$\hat{B} = \arg\min_B \|y - XB\|_2^2 + \lambda \|W_1 B\|_1, \tag{17}$$
Fig. 5. The sum of the coefficients and the reconstruction residual per class via SA-CRC and WELRMC for a given testing sample on Yale.
where $\hat{B}$ is the optimized vector of $B$ containing the $n$ sparse representation coefficients. Using the coefficients of both the collaborative and the sparse representations, the augmented representation coefficients are calculated as

$$B^* = \frac{\hat{B} + \check{B}}{\|\hat{B} + \check{B}\|_2}, \tag{18}$$

where $B^* = [B_1^*, B_2^*, \ldots, B_n^*]^T$. We use the augmented representation coefficient $B_i^*$ as a linear reconstruction measure to reflect the similarity between the training sample $x_i$ and the testing sample $y$. The $K$ nearest training samples corresponding to the $K$ largest augmented representation coefficients are selected as the representative samples to further finely represent and classify the testing sample $y$. The chosen nearest training samples are denoted as $\bar{X} = [\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_K]$. The chosen $K$ training samples come from $T$ classes, denoted as the set $\{c_1, c_2, \ldots, c_T\}$, a subset of $\{c_1, c_2, \ldots, c_M\}$. In the second phase of TPWELRMC, $y$ is approximately represented by the $K$ nearest training samples as $y \approx C_1\bar{x}_1 + C_2\bar{x}_2 + \cdots + C_K\bar{x}_K$, where $C = [C_1, C_2, \ldots, C_K]^T$ is the vector of the representation coefficients. To obtain the representation coefficients, the weighted collaborative representation using the chosen training samples is defined as
$$\check{C} = \arg\min_C \|y - \bar{X}C\|_2^2 + \lambda \|W_2 C\|_2^2, \tag{19}$$

where $\check{C}$ is the optimized vector of $C$ containing the $K$ collaborative representation coefficients. The weight matrix $W_2$ reflects the localities of data between the chosen training samples and the testing sample $y$ and is defined as

$$W_2 = \mathrm{diag}(\|y - \bar{x}_1\|_2, \|y - \bar{x}_2\|_2, \ldots, \|y - \bar{x}_K\|_2). \tag{20}$$

$\check{C}$ can also be solved in closed form as $\check{C} = (\bar{X}^T \bar{X} + \lambda W_2^T W_2)^{-1} \bar{X}^T y$. Meanwhile, the weighted sparse representation using the chosen training samples is defined as

$$\hat{C} = \arg\min_C \|y - \bar{X}C\|_2^2 + \lambda \|W_2 C\|_1, \tag{21}$$
where $\hat{C}$ is the optimized vector of $C$ containing the $K$ sparse representation coefficients. Using the coefficients of the collaborative and the sparse representations, the augmented representation coefficients in the second phase are calculated as

$$C^* = \frac{\hat{C} + \check{C}}{\|\hat{C} + \check{C}\|_2}, \tag{22}$$

where $C^* = [C_1^*, C_2^*, \ldots, C_K^*]^T$ is the vector of the augmented representation coefficients over the chosen nearest training samples. Through the coarse-to-fine augmented representation, we use the augmented representation coefficients as the linear reconstruction measure to make the classification decision. The testing sample $y$ is classified as

$$l_y = \arg\max_{c_i} \operatorname{sum}\big(\delta_{c_i}(C^*)\big), \tag{23}$$

where $\delta_{c_i}(C^*) \in \mathbb{R}^K$ is a new vector whose nonzero entries are the entries in $C^*$ belonging to class $c_i$. The class label $l_y$ is the class with the largest sum of the augmented representation coefficients. According to the details of the proposed TPWELRMC, its main steps are summarized in Algorithm 2.

Algorithm 2 The proposed TPWELRMC algorithm.
Require: Training sample set: $X \in \mathbb{R}^{m \times n}$; the given testing sample: $y \in \mathbb{R}^m$.
Ensure: The class label $l_y$ of $y$.
1: The coarse augmented representation.
   a) Solve the collaborative representation coefficients $\check{B}$ by Eq. (16).
   b) Solve the sparse representation coefficients $\hat{B}$ by Eq. (17).
   c) Compute the augmented representation coefficients $B^*$ by Eq. (18).
   d) Use the augmented representation to choose the $K$ nearest training samples $\bar{X} = [\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_K]$.
2: The fine augmented representation.
   a) Solve the collaborative representation coefficients $\check{C}$ by Eq. (19).
   b) Solve the sparse representation coefficients $\hat{C}$ by Eq. (21).
   c) Compute the augmented representation coefficients $C^*$ by Eq. (22).
3: The classification decision. Determine the class label $l_y$ of $y$ using the augmented representation coefficients $C^*$ by Eq. (23).
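The two phases of Algorithm 2 can be sketched as follows. This is an illustrative sketch only: the function names and the default `K` are hypothetical, both phases reuse one helper for the weighted augmented coefficients, and ISTA again stands in for the actual weighted l1 solver.

```python
import numpy as np

def _augmented_coeffs(X, y, lam=0.01, n_iter=200):
    """Weighted dense + sparse coefficients, normalised (Eqs. (16)-(18) and (19)-(22))."""
    n = X.shape[1]
    w = np.linalg.norm(X - y[:, None], axis=0)          # locality weights, Eq. (12)/(20)
    # Weighted collaborative part: (X^T X + lam W^T W)^{-1} X^T y with W = diag(w).
    dense = np.linalg.solve(X.T @ X + lam * np.diag(w ** 2), X.T @ y)
    # Weighted sparse part via ISTA with per-entry thresholds.
    L = np.linalg.norm(X, 2) ** 2
    sparse = np.zeros(n)
    for _ in range(n_iter):
        z = sparse - X.T @ (X @ sparse - y) / L
        sparse = np.sign(z) * np.maximum(np.abs(z) - lam * w / L, 0.0)
    out = dense + sparse
    return out / np.linalg.norm(out)                    # Eq. (18)/(22)

def tpwelrmc_classify(X, labels, y, K=5, lam=0.01):
    """Sketch of Algorithm 2 (TPWELRMC)."""
    # Phase 1: coarse augmented representation over all training samples.
    B = _augmented_coeffs(X, y, lam)
    keep = np.argsort(B)[::-1][:K]                      # K largest coefficients
    # Phase 2: fine augmented representation over the chosen K samples.
    C = _augmented_coeffs(X[:, keep], y, lam)
    kept_labels = labels[keep]
    # Eq. (23): class with the largest sum of its fine coefficients.
    classes = np.unique(kept_labels)
    sums = [C[kept_labels == c].sum() for c in classes]
    return classes[int(np.argmax(sums))]
```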
3.3. Differences between the proposed and related methods

The difference between WELRMC and SA-CRC lies largely in the use of the localities of data. Since data localities can improve the power of pattern discrimination, the proposed WELRMC introduces them into both the collaborative representation and the sparse representation used in SA-CRC; WELRMC can therefore be seen as a weighted extension of SA-CRC. TPWELRMC differs from TPLRMC in four aspects. Firstly, TPWELRMC represents each testing sample differently: TPLRMC uses either the sparse or the collaborative representation in each of its two phases, whereas both phases of TPWELRMC use the augmented representation that combines the sparse and the collaborative representations. Secondly, TPWELRMC chooses the nearest training samples in the first phase differently: TPLRMC relies on the coefficients of either the sparse or the collaborative representation, while TPWELRMC uses the augmented representation. Thirdly, TPWELRMC and TPLRMC use different rules for the final classification: the decision in TPLRMC is determined by the sums of the coefficients of the sparse or the collaborative representation, whereas TPWELRMC uses the sums of the coefficients of the augmented representation. Last but not least, the localities of data are considered in TPWELRMC but not in TPLRMC. Besides, TPWELRMC also extends WELRMC by further introducing the coarse-to-fine representation. In summary, WELRMC, TPWELRMC, SA-CRC and TPLRMC are all linear reconstruction measure-based classification methods for classifying objects such as faces.

4. Experiments

In this section, experiments are carried out to verify the effectiveness of the proposed WELRMC and TPWELRMC methods on five public face databases.
We compare the proposed methods with the state-of-the-art RBC methods, including CRC [1], SRC [2], SoC [10], SA-CRC [23] and TPLRMC [22]. In the experiments, the ranges of the regularization parameter are empirically set to {0.001, 0.005, 0.01, 0.05, 0.1} for SoC, SA-CRC and WELRMC, {0.001, 0.01, 0.1, 1, 10} for CRC, TPLRMC and TPWELRMC, and {0.1, 0.2, 0.3, 0.4} for SRC. The best classification results of each competing method over these pre-set ranges are reported on the five face databases.
Fig. 6. The sum of the coefficients and the reconstruction residual per class via SA-CRC and WELRMC for a given testing sample on LFW.
4.1. Databases

The face databases include the ORL, IMM, GT, Yale and LFW datasets. The ORL face database1 contains 400 image samples from 40 individuals, with 10 facial images per individual; the images mainly capture facial expression changes. The Yale face database2 contains 165 image samples from 15 individuals, each with 11 facial images taken under various expressions and illuminations. The IMM face database3 has 240 image samples from 40 individuals (7 female and 33 male), each with 6 facial images. The LFW (Labeled Faces in the Wild) face database4 has more than 13,000 image samples from 1680 individuals, each of whom has two or more images; all images were collected from websites to study face recognition under unconstrained conditions. In the experiments, we use a subset of LFW including 1251 facial images from 86 subjects, each of whom has about 11–20 images [7]. The GT (Georgia Tech) face database5 contains 750 image samples from 50 individuals, each with 15 facial images under different poses, expressions and illuminations. Fig. 1 shows all the images of one individual from each face database as an example. In our experiments, all facial images in the five face databases are cropped and resized to 32 × 32 pixels, and the grey-level values of the images are re-scaled to [0, 1]. On each face database, l image samples from each class are chosen as the training samples and the remaining ones are the testing samples.
1 http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html
2 http://cvc.yale.edu/projects/yalefaces/yalefaces.html
3 http://www.imm.dtu.dk/~aam/datasets/datasets.html
4 http://vis-www.cs.umass.edu/lfw/
5 http://www.anefian.com/research/face_reco.htm
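The per-database split described above (first l samples of each class for training, the rest for testing) can be sketched as follows; the function name is illustrative and the images are assumed already flattened to 32 × 32 = 1024-dimensional vectors scaled to [0, 1].

```python
import numpy as np

def first_l_split(images, labels, l):
    """Choose the first l samples per class for training, the rest for testing.

    images: (n, 1024) array of vectorised 32x32 images in [0, 1];
    labels: (n,) integer class labels. Returns (train_idx, test_idx).
    """
    train_idx, test_idx = [], []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)   # sample positions of class c, in order
        train_idx.extend(idx[:l])           # first l go to training
        test_idx.extend(idx[l:])            # remainder go to testing
    return np.array(train_idx), np.array(test_idx)
```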
Fig. 7. The classification accuracy rates of TPLRMC and TPWELRMC when varying the number of chosen training samples in the first phase on Yale, GT and LFW.
Fig. 8. The classification accuracy rates of TPLRMC and TPWELRMC with varying numbers of training samples chosen in the first phase on IMM and ORL.

4.2. Experiment 1

The first set of experiments analyzes the differences between SA-CRC and WELRMC, in order to verify whether the use of data localities in WELRMC yields an advantage in face recognition over SA-CRC. As discussed in Section 3, WELRMC is a weighted SA-CRC, because its augmented representation is obtained from the weighted collaborative representation and the weighted sparse representation, both of which are constrained by the localities of the data. The comparisons between SA-CRC and WELRMC are illustrated by the class-specific sums of augmented representation coefficients and the reconstruction residual per class in Figs. 2–6. Note that the class-specific reconstruction residuals in these figures are computed by ‖y − XD‖₂, where D is the vector of augmented representation coefficients computed by Eq. (6) in SA-CRC and by Eq. (14) in WELRMC. In the experiments, the first l image samples from each class are chosen as the training samples on each database and the remaining ones as the testing samples. The value of l is set to 9 on GT, 7 on ORL, 5 on Yale, 3 on IMM and 3 on LFW. In these five figures, for demonstration purposes, the testing samples are chosen from class 1 on GT, IMM and Yale, class 10 on ORL and class 6 on LFW, respectively. Given the testing samples, the sums of the augmented representation coefficients and the reconstruction residuals are plotted for all the classes. The green bars show that WELRMC correctly identifies the class labels of the testing samples, and the red bars show the mistakes made by SA-CRC. From Figs. 2–6 we can clearly see that the proposed WELRMC (i.e., the weighted SA-CRC) correctly classifies the given testing samples, while SA-CRC does not. Moreover, the differences among the class-specific sums of augmented representation coefficients in WELRMC are significantly greater than the corresponding ones in SA-CRC, and the differences among the class-specific reconstruction residuals in WELRMC are also much greater than the corresponding ones in SA-CRC. These experimental facts imply that the proposed WELRMC is more discriminative than SA-CRC. We can conclude that this improvement in the power of pattern discrimination comes from the localities of data that WELRMC introduces into SA-CRC. More importantly, we can also observe that the maximal class-specific sums of representation coefficients correspond to the minimal reconstruction residuals. This fact has also been proven for collaborative representation and sparse representation in [25].
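As a sketch, the class-specific coefficient sums and reconstruction residuals plotted in Figs. 2–6 can be computed from a coefficient vector as follows. The function and variable names here are hypothetical; the coefficient vector itself would come from SA-CRC or WELRMC.

```python
import numpy as np

def class_scores(X, y_train, query, coeffs):
    """Given a dictionary X (columns = training samples), training labels,
    a query sample, and one representation coefficient per training sample,
    return the class-specific sum of coefficients and the class-specific
    reconstruction residual ||query - X_c @ coeffs_c||_2."""
    sums, residuals = {}, {}
    for c in np.unique(y_train):
        mask = (y_train == c)
        sums[c] = coeffs[mask].sum()
        residuals[c] = np.linalg.norm(query - X[:, mask] @ coeffs[mask])
    return sums, residuals
```

The predicted label is then the class with the minimal residual (equivalently, in the figures, the class with the maximal coefficient sum).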
This fact also implies that the representation coefficients, used as the linear reconstruction measure, are an effective similarity measure [9,21]. Thus, the proposed WELRMC is more effective than SA-CRC, and the localities of data can enhance the power of pattern discrimination in classification tasks such as face recognition.

4.3. Experiment 2

Both TPWELRMC and TPLRMC use the coarse-to-fine representation. This set of experiments verifies the effectiveness of the proposed TPWELRMC in comparison with TPLRMC. The experiments are set up as follows: the first l image samples from each class are chosen as the training samples on each database, and the number of chosen training samples is set as l = 8, 9 on GT, l = 7, 8 on ORL, l = 5, 7 on Yale, l = 3, 4 on IMM and l = 3, 5 on LFW. It should be noted that the ratios r
Table 1
The maximal classification accuracy rates (%) of the competing methods on all the face databases.

Data  l  CRC    SRC    SoC    SA-CRC  WELRMC  TPLRMC(p=1)  TPLRMC(p=2)  TPWELRMC
IMM   2  58.13  61.75  60.62  60.62   61.88   61.04        61.25        62.29
      3  69.17  65.83  70.83  70.83   70.83   68.06        67.50        71.82
      4  71.25  76.50  71.25  75.49   76.03   69.57        68.83        79.32
      5  77.50  80.00  80.00  80.13   82.00   73.33        72.50        82.71
ORL   5  93.00  93.00  96.00  93.50   96.00   95.25        94.83        96.67
      6  96.67  94.38  96.67  95.83   97.35   96.25        95.83        97.71
      7  97.50  95.00  96.88  96.25   97.50   97.78        97.22        98.33
      8  98.75  98.75  98.75  98.75   99.13   98.75        99.17        99.58
Yale  3  79.17  74.17  80.83  80.83   81.82   84.00        84.50        84.67
      5  88.33  86.67  88.33  86.67   88.33   91.67        91.00        93.00
      7  93.11  92.56  93.44  93.47   94.43   92.24        92.05        95.12
      9  96.67  96.67  98.91  96.67   99.21   96.33        96.67        99.33
GT    6  63.78  59.33  68.22  63.11   77.56   66.00        64.04        75.56
      7  62.00  57.50  68.00  62.75   78.50   67.80        65.05        78.70
      8  63.14  60.29  74.00  66.00   81.41   71.31        68.17        81.54
      9  64.33  57.67  72.33  67.67   81.67   71.73        69.73        82.33
LFW   3  25.78  23.67  26.39  23.46   29.10   23.05        22.01        29.35
      5  32.03  26.43  30.09  24.60   39.09   36.87        35.79        39.46
      7  32.36  29.74  34.82  27.27   40.37   38.94        37.62        43.51
      9  38.37  34.17  40.46  32.70   49.48   42.17        41.06        49.71
of the chosen training samples K in the first phase to all the training samples n (i.e., r = K/n) are set in the range {0.1, 0.2, …, 0.9, 1} on the five face databases. To make the number of training samples chosen in the first phase an integer, we set K = ⌈nr⌉. The experimental results of TPWELRMC and TPLRMC with different numbers of chosen training samples in the first phase are shown in Figs. 7 and 8. We can observe that the proposed TPWELRMC consistently outperforms TPLRMC over the whole range of r values on four of the five databases. Moreover, TPWELRMC is more robust than TPLRMC, since its performance is much more stable over nearly all the settings. These experimental results demonstrate that the localities of data and the augmented representation can enhance the coarse-to-fine representation-based classification performance.

4.4. Experiment 3

The proposed WELRMC and TPWELRMC are further compared with the competing representation-based classification methods, including CRC [1], SRC [2], SoC [10], SA-CRC [23] and TPLRMC [22]. In the experiments, the training samples from each class are randomly chosen, and the remaining samples per class are used for testing on each database. The numbers of training samples are set as follows: l = 6, 7, 8, 9 on GT, l = 5, 6, 7, 8 on ORL, l = 3, 5, 7, 9 on Yale, l = 2, 3, 4, 5 on IMM and l = 3, 5, 7, 9 on LFW. The classification results of each method are averaged over five random divisions of each database. Note that the best performance of TPWELRMC and TPLRMC is obtained in the same way as in Figs. 7 and 8. The comparative classification results of the competing methods on all the face databases are reported in Table 1, where the classification accuracies of the proposed methods are denoted in boldface.
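The two-phase, coarse-to-fine scheme with ratio r can be sketched as follows. This is an illustrative stand-in, not the paper's exact method: it uses plain ridge-regularized (collaborative-representation-style) coefficients in both phases instead of the augmented representation of Eq. (14), and all function names are hypothetical.

```python
import numpy as np

def ridge_coeffs(D, y, lam=0.01):
    # collaborative-representation-style solution: (D^T D + lam*I)^{-1} D^T y
    n = D.shape[1]
    return np.linalg.solve(D.T @ D + lam * np.eye(n), D.T @ y)

def two_phase_classify(X, labels, query, r=0.5, lam=0.01):
    """Coarse-to-fine sketch. Phase 1: represent the query over all n
    training samples and keep the K = ceil(n*r) samples with the largest
    coefficient magnitudes. Phase 2: re-represent the query over the kept
    samples and pick the class with the smallest reconstruction residual."""
    n = X.shape[1]
    K = int(np.ceil(n * r))
    coarse = ridge_coeffs(X, query, lam)
    keep = np.argsort(np.abs(coarse))[-K:]   # phase 1: nearest samples by |coefficient|
    Xk, yk = X[:, keep], labels[keep]
    fine = ridge_coeffs(Xk, query, lam)      # phase 2: fine representation
    classes = np.unique(yk)
    resid = [np.linalg.norm(query - Xk[:, yk == c] @ fine[yk == c]) for c in classes]
    return classes[int(np.argmin(resid))]
```

Varying r between 0.1 and 1 as in Figs. 7 and 8 simply changes how many training samples survive the coarse first phase.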
It is clear that the classification accuracy of each method increases with the number of training samples on all the face databases. Both WELRMC and TPWELRMC almost always significantly outperform CRC, SRC, SoC, SA-CRC and TPLRMC. Moreover, TPWELRMC achieves the best classification performance among all the competing methods. In summary, the experiments above show that the proposed methods are more effective and robust than the competing RBC methods in terms of classification accuracy. Furthermore, the experiments also demonstrate that the coarse-to-fine representation, the augmented representation and the localities of data can strengthen the power of pattern discrimination for representation-based classification. Meanwhile, the experiments verify that the sparse or collaborative representation coefficients serve as a good similarity measure with strong discriminative power.

5. Conclusions

In representation-based classification, where each testing sample is represented as a weighted linear combination of a set of training samples, the localities of data and the coarse-to-fine representation play an important role in discriminating objects such as faces. The representation coefficients in the linear reconstruction are often used to measure how similar a given testing sample is to the training samples. Our two proposed linear reconstruction measure-based classification methods are designed to further improve representation-based classification performance. In the proposed WELRMC method, we incorporate the localities of data into SA-CRC and use the augmented representation coefficients as the linear reconstruction measure for the classifier design. In the proposed TPWELRMC, we further extend the SA-CRC method by
using the localities of data and the coarse-to-fine representation. The augmented representation coefficients, as a linear reconstruction measure, are used for choosing the nearest training samples in the first phase of TPWELRMC and for designing the classification decision in its second phase. To demonstrate the effectiveness of the proposed methods, we compare them with state-of-the-art RBC methods on five public face databases. The experimental results show that the proposed methods outperform their counterparts. In future work, since the augmented representation coefficients can be regarded as a linear reconstruction measure for face recognition, we will introduce the augmented representation or the coarse-to-fine augmented representation into other kinds of classifiers. Moreover, owing to the good pattern discrimination of the proposed methods, we will combine them with efficient deep learning models for practical problems.

Declaration of Competing Interest

All the authors declare that there are no conflicts of interest regarding the publication of this article.

Acknowledgment

This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 61976107, 61962010, 61502208, 61762021 and 61402122), the Natural Science Foundation of Jiangsu Province of China (Grant No. BK20150522), the International Postdoctoral Exchange Fellowship Program of the China Postdoctoral Council (No. 20180051), the Research Foundation for Talented Scholars of Jiangsu University (Grant No. 14JDG037), the China Postdoctoral Science Foundation (Grant No. 2015M570411), the Open Foundation of the Artificial Intelligence Key Laboratory of Sichuan Province (Grant No. 2017RYJ04), the Natural Science Foundation of Guizhou Province (Nos. [2017]1130 and [2017]5726-32), and the Excellent Young Scientific and Technological Talents of Guizhou ([2019]5670).

References

[1] Zhang L, Yang M, Feng X. Sparse representation or collaborative representation: which helps face recognition? In:
2011 International conference on computer vision. IEEE; 2011. p. 471–8. doi:10.1109/ICCV.2011.6126277.
[2] Wright J, Yang AY, Ganesh A, Sastry S, Ma Y. Robust face recognition via sparse representation. IEEE Trans Pattern Anal Mach Intell 2009;31(2):210–27. doi:10.1109/TPAMI.2008.79.
[3] Xu Y, Li Z, Yang J, Zhang D. A survey of dictionary learning algorithms for face recognition. IEEE Access 2017;5:8502–14. doi:10.1109/ACCESS.2017.2695239.
[4] Zeng SN, Gou J, Deng LM. An antinoise sparse representation method for robust face recognition via joint l1 and l2 regularization. Expert Syst Appl 2017;82:1–9. doi:10.1016/j.eswa.2017.04.001.
[5] Yang M, Zhang L, Feng X, Zhang D. Sparse representation based fisher discrimination dictionary learning for image classification. Int J Comput Vis 2014;109(3):209–32. doi:10.1007/s11263-014-0722-8.
[6] Zhang J, Tao D. FAMED-Net: a fast and accurate multi-scale end-to-end dehazing network. IEEE Trans Image Process 2019. doi:10.1109/TIP.2019.2922837.
[7] Gou J, Hou B, Ou WH, Mao QR, Yang HB, Liu Y. Several robust extensions of collaborative representation for image classification. Neurocomputing 2019;348:120–33. doi:10.1016/j.neucom.2018.06.089.
[8] Zhang Z, Xu Y, Yang J, Li XL, Zhang D. A survey of sparse representation: algorithms and applications. IEEE Access 2015;3:490–530. doi:10.1109/ACCESS.2015.2430359.
[9] Cheng H, Liu ZC, Hou L, Yang J. Sparsity-induced similarity measure and its applications. IEEE Trans Circuits Syst Video Technol 2016;26(4):613–26. doi:10.1109/TCSVT.2012.2225911.
[10] Li J, Lu CY. A new decision rule for sparse representation based classification for face recognition. Neurocomputing 2013;116:265–71. doi:10.1016/j.neucom.2012.04.034.
[11] Gou J, Yi Z, Zhang D, Zhan YZ, Shen XJ, Du L. Sparsity and geometry preserving graph embedding for dimensionality reduction. IEEE Access 2018;6:75748–66. doi:10.1109/ACCESS.2018.2884027.
[12] Xu J, An W, Zhang L, Zhang D.
Sparse, collaborative, or nonnegative representation: which helps pattern classification? Pattern Recognit 2019;88:679–88. doi:10.1016/j.patcog.2018.12.023.
[13] Deng WH, Hu JN, Guo J. Face recognition via collaborative representation: its discriminant nature and superposed representation. IEEE Trans Pattern Anal Mach Intell 2018;40(10):2513–21. doi:10.1109/TPAMI.2017.2757923.
[14] Cai S, Zhang L, Zuo W, Feng X. A probabilistic collaborative representation based approach for pattern classification. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). IEEE; 2016. p. 2950–9. doi:10.1109/CVPR.2016.322.
[15] Gou J, Wang L, Hou B, Lv J, Yuan Y, Mao Q. Two-phase probabilistic collaborative representation-based classification. Expert Syst Appl 2019;133:9–20. doi:10.1016/j.eswa.2019.05.009.
[16] Xu Y, Zhong ZF, Yang J, You J, Zhang D. A new discriminative sparse representation method for robust face recognition via l2 regularization. IEEE Trans Neural Netw Learn Syst 2017;28(10):2233–42. doi:10.1109/TNNLS.2016.2580572.
[17] Zhu Q, Zhang DQ, Sun H, Li ZM. Combining l1-norm and l2-norm based sparse representations for face recognition. Optik 2015;126(7–8):719–24. doi:10.1016/j.ijleo.2015.02.020.
[18] Lu CY, Min H, Gui J, Zhu L, Lei YK. Face recognition via weighted sparse representation. J Visual Commun Image Represent 2013;24(2):111–16. doi:10.1016/j.jvcir.2012.05.003.
[19] Timofte R, Van Gool L. Adaptive and weighted collaborative representations for image classification. Pattern Recognit Lett 2014;43:127–35. doi:10.1016/j.patrec.2013.08.010.
[20] Xu Y, Zhang D, Yang J, Yang JY. A two-phase test sample sparse representation method for use with face recognition. IEEE Trans Circuits Syst Video Technol 2011;21(9):1255–62. doi:10.1109/TCSVT.2011.2138790.
[21] Zhang J, Yang J. Linear reconstruction measure steered nearest neighbor classification framework. Pattern Recognit 2014;47(4):1709–20. doi:10.1016/j.patcog.2013.10.018.
[22] Gou J, Xu Y, Zhang D, Mao QR, Du L, Zhan YZ.
Two-phase linear reconstruction measure-based classification for face recognition. Inf Sci 2018;433:17–36. doi:10.1016/j.ins.2017.12.025.
[23] Akhtar N, Shafait F, Mian A. Efficient classification with sparsity augmented collaborative representation. Pattern Recognit 2017;65:136–45. doi:10.1016/j.patcog.2016.12.017.
[24] Yu K, Zhang T, Gong YH. Nonlinear learning using local coordinate coding. Adv Neural Inf Process Syst 2009:2223–31.
[25] Ma HX, Gou J, Wang XL, Ke J, Zeng SN. Sparse coefficient-based k-nearest neighbor classification. IEEE Access 2017;5:16618–34. doi:10.1109/ACCESS.2017.2739807.

Jianping Gou received the Ph.D. degree in computer science from the University of Electronic Science and Technology of China (UESTC), Chengdu, China, in 2012. He is currently an associate professor in the School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, Jiangsu, China. His current research interests include pattern recognition and machine learning.

Jun Song is currently pursuing the master's degree in computer science in the School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, Jiangsu, China. His research interests include image classification and machine learning.

Weihua Ou received the Ph.D. degree in information and communication engineering from the Huazhong University of Science and Technology (HUST), China, in 2014. He is currently an associate professor with the School of Big Data and Computer Science, Guizhou Normal University, Guiyang, China. His current research interests include sparse representation, multi-view learning, cross-modal retrieval, image processing, and computer vision.

Shaoning Zeng is currently pursuing the Ph.D. degree in computer science in the Department of Computer and Information Science, Faculty of Science and Technology, University of Macau. Since 2009, he has been a lecturer in the School of Information Science and Technology at Huizhou University, China. His research interests include computer vision, pattern recognition, machine learning and deep learning.

Yun-Hao Yuan received the Ph.D. degree in pattern recognition and intelligent systems from Nanjing University of Science and Technology (NUST), China, in 2013. He is currently an associate professor with the Department of Computer Science and Technology, College of Information Engineering, Yangzhou University.
His research interests include pattern recognition, machine learning, multimedia search, and information fusion.

Lan Du received the Ph.D. degree in computer science from ANU in 2012. He is currently a lecturer in data science with the Faculty of Information Technology, Monash University, Australia. Before joining Monash University, he was a postdoctoral research fellow in the Computational Linguistics Group, Macquarie University. His research interests include machine learning, natural language processing, text mining, and social network analysis.