Computers and Electrical Engineering 79 (2019) 106451
Representation-based classification methods with enhanced linear reconstruction measures for face recognition

Jianping Gou a, Jun Song a, Weihua Ou b,∗, Shaoning Zeng c, Yunhao Yuan d, Lan Du e

a School of Computer Science and Communication Engineering and Jiangsu Key Laboratory of Security Tech. for Industrial Cyberspace, Jiangsu University, Zhenjiang, Jiangsu, 212013, China
b School of Big Data and Computer Science, Guizhou Normal University, Guiyang, Guizhou, 550025, China
c School of Information Science and Technology, Huizhou University, Huizhou, Guangdong, 516007, China
d School of Computer Science and Technology, Yangzhou University, Yangzhou, Jiangsu, 225127, China
e Faculty of Information Technology, Monash University, Clayton, VIC 3800, Australia
Article info
Article history: Received 26 April 2019; Revised 23 August 2019; Accepted 26 August 2019
Keywords: Collaborative representation Sparsity augmented collaborative representation Representation-based classification Linear reconstruction measurement
Abstract

Representation-based classification (RBC) methods have recently emerged as promising pattern recognition techniques for object recognition. The representation coefficients of RBC, viewed as a linear reconstruction measure (LRM), can be used effectively for classifying objects. In this article, we propose two enhanced linear reconstruction measure-based classification methods built on the sparsity-augmented collaborative representation-based classification method (SA-CRC). The first is the weighted enhancement linear reconstruction measure-based classification method (WELRMC), which introduces data localities into SA-CRC. The second is the two-phase weighted enhancement linear reconstruction measure-based classification method (TPWELRMC), which integrates both coarse and fine representations into SA-CRC. To demonstrate the effectiveness of the proposed methods, experiments are conducted on several public face databases in comparison with state-of-the-art representation-based classification methods. The experimental results show that the proposed methods significantly outperform the competing RBC methods. © 2019 Published by Elsevier Ltd.
1. Introduction

In pattern recognition, the collaborative representation-based classification (CRC) method with an l2-norm on the representation coefficients [1] and the sparse representation-based classification (SRC) method with an l1-norm instead [2] are two typical representation-based classification methods. Since both achieve promising classification performance, many variants of CRC and SRC have been widely applied to various classification tasks, such as face recognition [3,4] and image classification [5–7].
R This paper was submitted for special section SI-aicv3, but it must be included in the regular papers section of CAEE. Reviews processed and recommended for publication to the Editor-in-Chief by Associate Editor Dr. Huimin Lu.
∗ Corresponding author.
E-mail addresses: [email protected] (J. Gou), [email protected] (J. Song), [email protected] (W. Ou), [email protected] (S. Zeng), [email protected] (Y. Yuan), [email protected] (L. Du).
https://doi.org/10.1016/j.compeleceng.2019.106451 0045-7906/© 2019 Published by Elsevier Ltd.
The standard sparse representation-based classification (SRC) method was proposed in [2] and has been successfully applied to face recognition under different variations, such as lighting, occlusion and facial expression. In [2], the discriminative property of sparse representation was also analyzed in detail. Recently, the research progress in the theories and applications of sparse representation-based classification was reviewed in [8], and a survey of dictionary learning, a crucial part of sparse representation, was given in [3]. Cheng et al. analyzed the discriminative nature of sparse representation from the perspective of the similarity measure: the sparse representation coefficients can well reflect the similarities of data [9]. Using this similarity property, the sum of class-specific coefficients (SoC) was proposed as a classification decision rule in [10]. Moreover, since the sparse representation coefficients preserve the similarities of data, sparse graph construction has been widely used for dimensionality reduction [11]. These works show that sparse representation provides good pattern discrimination for classifying objects.

Collaborative representation is another typical representation used for pattern classification [1]. Zhang et al. argued that the working mechanism of sparse representation-based classification benefits from the collaboration of all the training samples rather than from the l1-norm sparsity constraint, and introduced the benchmark collaborative representation-based classification (CRC) method with an l2-norm on the representation coefficients [1]. In [12], nonnegative representation-based classification was proposed, which emphasizes that constraining the representation coefficients to be nonnegative can help pattern classification. Deng et al. [13] further analyzed the discriminative property of CRC and proposed superposed linear representation-based classification to reduce the negative impact of the representation coefficients from the wrong classes. From a probabilistic point of view, the discriminative nature of CRC was analyzed in [14]; the corresponding extension is the probabilistic collaborative representation-based classification method, which considers class-specific representations [14,15]. In [16], Xu et al. proposed the discriminative sparse representation-based classification method as a variant of CRC, which adopts a constraint term over pairs of class-specific representations.

As shown by the SRC and CRC methods, both sparse representation and collaborative representation are meaningful and useful for classifying objects, because the l1-norm of the coefficients promotes sparseness while the l2-norm of the coefficients is robust to outliers [17]. To enhance the pattern discrimination of SRC and CRC, data localities are often used as weights to constrain the representation coefficients [18,19]. Exploiting the efficiency of collaborative representation, Xu et al. proposed the two-phase test sample sparse representation method, which employs a coarse-to-fine collaborative representation to quickly improve the sparsity of the coefficients for good classification performance [20]. As argued in [9], the sparse representation coefficients can serve as a similarity measure. The linear reconstruction measure (LRM), which uses the representation coefficients under different norms as a similarity measure, was introduced in [21], together with the linear reconstruction measure-based nearest neighbor classification method. Using LRM and the coarse-to-fine representation, Gou et al. [22] proposed the two-phase linear reconstruction measure-based classification method (TPLRMC), using the l1-norm and the l2-norm of the coefficients, respectively. In [23], Akhtar et al. argued in detail that collaboration and sparsity can simultaneously make representation-based classification more effective, and proposed the sparsity augmented collaborative representation-based classification method (SA-CRC). Since sparsity from sparse representation yields more discrimination, while CRC often produces similar class-specific residuals across multiple classes and thus misclassification, SA-CRC combines sparse representation and collaborative representation into an augmented representation for favourable classification. In fact, SA-CRC is itself a linear reconstruction measure-based classification method.

As mentioned above, data localities make similar or close training samples contribute more to the representation and classification of each testing sample [18,19], and the coarse-to-fine representation makes the representation coefficients sparse by deleting dissimilar training samples in the coarse phase [20,22]. Inspired by both LRM and SA-CRC, we propose in this article two new enhanced linear reconstruction measure-based classification methods, extensions of SA-CRC that aim to further improve representation-based classification performance. The first, entitled the weighted enhancement linear reconstruction measure-based classification method (WELRMC), uses data localities as weights to simultaneously constrain the coefficients of the sparse and collaborative representations, producing more effective augmented representations. The second, the two-phase weighted enhancement linear reconstruction measure-based classification method (TPWELRMC), designs a coarse-to-fine augmented representation for classification by simultaneously weighting the coefficients of the sparse and collaborative representations.
To demonstrate the effectiveness of the proposed methods, we conduct experiments on five public face databases, comparing them with the related RBC methods. The experimental results show that the proposed WELRMC and TPWELRMC are promising classification methods with satisfactory performance. In summary, the main contributions of this article are as follows:
(a) The weighted sparsity augmented collaborative representation is designed, and its corresponding classification method, WELRMC, is proposed.
(b) The coarse-to-fine weighted sparsity augmented collaborative representation is designed, and its corresponding classification method, TPWELRMC, is proposed.
The rest of the article is organized as follows: Section 2 reviews the related works. Section 3 describes the proposed methods in detail. Section 4 reports the experimental results. Section 5 concludes this article.
2. Related works

In this section, the related works are briefly reviewed. We first summarize the common notation used in the following sections. Suppose the training set of $n$ training samples in an $m$-dimensional feature space is denoted as $X = [x_1, x_2, \ldots, x_n] \in \mathbb{R}^{m \times n}$. All the training samples belong to $M$ classes, denoted as the set $\{c_1, c_2, \ldots, c_M\}$, and the given testing sample is denoted as $y \in \mathbb{R}^m$.

2.1. LRM

The linear reconstruction measure (LRM) [21] can be regarded as a good similarity measure that is widely used to design representation-based classification methods [10,22,23]. Generally speaking, LRM uses the linear reconstruction coefficients to reflect the similarities of data. The benchmark model of LRM is defined as
$$\min_S \|y - XS\|_2^2, \tag{1}$$

where $S = [s_1, s_2, \ldots, s_n]^T$ is the vector of linear reconstruction coefficients and $s_i$ reflects the similarity between the training sample $x_i$ and $y$. The regularized LRM is defined as

$$\min_S \|y - XS\|_2^2 + \lambda \|S\|_p, \tag{2}$$
where $\lambda$ is the regularization parameter and $p$ is often set to 1 or 2.

2.2. SA-CRC

The sparsity augmented collaborative representation-based classification method (SA-CRC) is a linear reconstruction measure-based classifier that integrates the coefficients of sparse representation and collaborative representation [23]. As argued in SA-CRC, both collaboration from collaborative representation and sparseness from sparse representation can simultaneously strengthen the power of representation-based pattern discrimination. The SA-CRC method has three parts: sparse representation, collaborative representation and augmented representation. The first part is the collaborative (also called dense) representation, defined as
$$\check{S} = \arg\min_S \|y - XS\|_2^2 + \lambda \|S\|_2^2. \tag{3}$$

Then, the solution can be easily obtained as $\check{S} = (X^T X + \lambda I)^{-1} X^T y$, where $I$ is an identity matrix. The second part is the sparse representation, defined as

$$\hat{S} = \arg\min_S \|y - XS\|_2, \quad \text{s.t. } \|S\|_0 \le k, \tag{4}$$

where $\|\cdot\|_0$ denotes the $\ell_0$ pseudo-norm and $k$ is the sparsity threshold. In practice, the sparse coefficients $\hat{S}$ can be solved as

$$\hat{S} = \arg\min_S \|y - XS\|_2^2 + \lambda \|S\|_1. \tag{5}$$
The third part of SA-CRC is the augmented representation, whose coefficient vector $S^*$ is defined as

$$S^* = \frac{\hat{S} + \check{S}}{\|\hat{S} + \check{S}\|_2}. \tag{6}$$
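As a concrete illustration, the three parts of SA-CRC in Eqs. (3), (5) and (6) can be sketched in a few lines of NumPy. This is a minimal sketch under toy assumptions: the function name `sa_crc_coefficients` is illustrative, and plain ISTA stands in for whatever dedicated l1 solver is actually used for Eq. (5).

```python
import numpy as np

def sa_crc_coefficients(X, y, lam=0.01, n_iter=200):
    """Sketch of SA-CRC's augmented coefficients (Eqs. (3), (5), (6)).

    X: (m, n) matrix with training samples as columns; y: (m,) testing sample.
    """
    n = X.shape[1]
    # Collaborative (dense) part, Eq. (3): closed-form ridge solution.
    S_dense = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)
    # Sparse part, Eq. (5): plain ISTA iterations for the lasso problem.
    L = np.linalg.norm(X, 2) ** 2          # Lipschitz constant of the gradient
    S_sparse = np.zeros(n)
    for _ in range(n_iter):
        z = S_sparse - X.T @ (X @ S_sparse - y) / L
        S_sparse = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    # Augmentation, Eq. (6): sum the two vectors and normalise.
    S = S_dense + S_sparse
    return S / np.linalg.norm(S)
```

The classification decision then simply picks the class whose coefficients in the returned vector have the largest sum.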
After obtaining S∗ , the class label ly of the testing sample y is determined as the class with the minimal sum of coefficients among all the classes:
ly = argmax ci
δ ci ( S ∗ ).
(7)
where δ ci (S∗ ) ∈ Rn is a new vector whose nonzero entries are the entries in S∗ belonging to class ci . 2.3. TPLRMC The two-phase linear reconstruction measure-based classification (TPLRMC) method is to use the representation coefficients of both the coarse and the fine representations in deciding the class label of a testing sample [22]. Specifically speaking, it uses coefficients of the coarse representation to choose the nearest training samples in the first phase, and further adopts the chosen nearest training samples to finely represent the testing sample in the second phase. Finally, the classification decision is made by using the coefficients of the fine representation. In the first phase, a testing sample y is coarsely represented by all the training samples as y ≈ x1 α1 + x2 α2 , . . . , xn αn = X α , where α = [α1 , α2 , . . . , αn ]T . And the representation coefficient vector α is solved as follows:
α ∗ = argmin y − X α22 + λα p , α
(8)
Fig. 1. All the facial images of one individual chosen from each face database, as an example.
where the value of $p$ in the $l_p$-norm of the representation coefficients is always set to 1 or 2. After obtaining the optimized coefficient vector $\alpha^*$ in Eq. (8), the coefficient $\alpha_i^*$ is regarded as the contribution of each training sample $x_i$ to the representation of $y$. Then, using the coefficients as the similarity measure, the $K$ training samples corresponding to the top $K$ largest coefficients in $\alpha^*$ are chosen to finely represent $y$ again. The chosen $K$ training samples from $H$ classes are denoted as $\bar{X} = [\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_K]$. The set of the chosen $H$ classes is denoted as $\{\bar{c}_1, \bar{c}_2, \ldots, \bar{c}_H\}$, a subset of $\{c_1, c_2, \ldots, c_M\}$. In the second phase, $y$ is finely represented as $y \approx \beta_1\bar{x}_1 + \beta_2\bar{x}_2 + \cdots + \beta_K\bar{x}_K = \bar{X}\beta$, where $\beta = [\beta_1, \beta_2, \ldots, \beta_K]^T$ is the vector of the fine representation coefficients. The optimized $\beta$ is solved as

$$\beta^* = \arg\min_\beta \|y - \bar{X}\beta\|_2^2 + \lambda \|\beta\|_p, \tag{9}$$

where $\beta^* = [\beta_1^*, \beta_2^*, \ldots, \beta_K^*]^T$. Similarly, the representation coefficient $\beta_i^*$, as a similarity measure, can be viewed as the contribution of the training sample $\bar{x}_i$ to the fine representation of $y$. Using the fine representation coefficients in the second phase, the classification decision of TPLRMC is defined as

$$l_y = \arg\max_{\bar{c}_i} \operatorname{sum}\big(\delta_{\bar{c}_i}(\beta^*)\big), \tag{10}$$
where $\delta_{\bar{c}_i}(\beta^*) \in \mathbb{R}^K$ is a new vector whose nonzero entries are the entries in $\beta^*$ belonging to class $\bar{c}_i$. That is, the class label $l_y$ of the testing sample $y$ is determined as the class $\bar{c}_i$ with the maximal sum of coefficients among the chosen $H$ classes. It should be noted that the two phases of TPLRMC adopt the same norm of the coefficients, and TPLRMC with the l1-norm and the l2-norm of the representation coefficients is denoted as TPLRMC(p=1) and TPLRMC(p=2), respectively.

3. The proposed methods

In this section, we propose two representation-based classification methods based on LRM and the coarse-to-fine representation: the weighted enhancement linear reconstruction measure-based classification (WELRMC) method and the two-phase weighted enhancement linear reconstruction measure-based classification (TPWELRMC) method. Both can be seen as variants of SA-CRC.

3.1. WELRMC

It is shown in SA-CRC that using the representation coefficients of both the sparse representation and the collaborative representation can significantly improve the classification performance of RBC [23]. In fact, the augmented representation coefficients in SA-CRC are obtained by designing the new linear reconstruction measure in Eq. (6), and SA-CRC gains pattern discrimination power from augmented representation coefficients that combine the properties of sparsity and collaboration. The study in [24] shows that the localities of data are more crucial than sparsity, and that data localities can enhance sparsity but not vice versa. It has also been proven that sparsity can strengthen discrimination; thus, the localities of data can improve pattern discrimination. Recently, data localities have been used to extend many RBC methods for improved classification performance [18,19]. Considering the good properties of the augmented representation coefficients and the localities of data, we introduce the localities between the training
Fig. 2. The sum of the coefficients and the reconstruction residual per class via SA-CRC and WELRMC for a given testing sample on GT.
samples and each testing sample into the SA-CRC method, yielding the WELRMC method. In WELRMC, the local similarity distances between the training samples and each testing sample serve as weights that constrain both the sparse representation coefficients and the collaborative representation coefficients in the augmented representation. The proposed WELRMC method mainly contains the weighted sparse representation, the weighted collaborative representation and the augmented representation. In the weighted collaborative representation, the testing sample $y$ is collaboratively represented by all the training samples, and the collaborative representation coefficients are constrained by the localities of data. The model of the weighted collaborative representation is defined as

$$\check{A} = \arg\min_A \|y - XA\|_2^2 + \lambda \|WA\|_2^2, \tag{11}$$

where $\check{A}$ is the optimized vector of $A$ containing the collaborative representation coefficients and $W$ is the constraint weight matrix that reflects the localities of data. $W$ is defined as

$$W = \mathrm{diag}(\|y - x_1\|_2, \|y - x_2\|_2, \ldots, \|y - x_n\|_2), \tag{12}$$

where $\|y - x_i\|_2$ is the local similarity distance between the testing sample $y$ and the training sample $x_i$. With this weighted constraint, a similar training sample $x_i$ with a smaller distance $\|y - x_i\|_2$ contributes more to representing $y$; that is, $x_i$ receives a larger representation coefficient. The collaborative representation coefficient vector can be easily obtained in closed form as $\check{A} = (X^T X + \lambda W^T W)^{-1} X^T y$. In the weighted sparse representation, the testing sample $y$ is sparsely represented by all the training samples under the same weighted constraint. The weighted sparse model is defined as

$$\hat{A} = \arg\min_A \|y - XA\|_2^2 + \lambda \|WA\|_1, \tag{13}$$

where $\hat{A}$ is the optimized vector of $A$ containing the sparse representation coefficients and the constraint weight matrix $W$ is defined in Eq. (12).
Fig. 3. The sum of the coefficients and the reconstruction residual per class via SA-CRC and WELRMC for a given testing sample on IMM.
Using the optimized vectors of the weighted collaborative and sparse representation coefficients, the augmented representation coefficients are defined as

$$A^* = \frac{\hat{A} + \check{A}}{\|\hat{A} + \check{A}\|_2}. \tag{14}$$

Then, the augmented representation coefficients in $A^*$ are regarded as the linear reconstruction measure for the classification decision. The class label $l_y$ of the testing sample $y$ is determined as follows:

$$l_y = \arg\max_{c_i} \operatorname{sum}\big(\delta_{c_i}(A^*)\big), \tag{15}$$

where $\delta_{c_i}(A^*) \in \mathbb{R}^n$ is a new vector whose nonzero entries are the entries in $A^*$ belonging to class $c_i$. As discussed above, the steps of the proposed WELRMC method are summarized in Algorithm 1.

Algorithm 1 The WELRMC algorithm.
Require: Training sample set: $X \in \mathbb{R}^{m \times n}$; the given testing sample: $y \in \mathbb{R}^m$.
Ensure: The class label $l_y$ of $y$.
1: Compute the collaborative representation coefficients $\check{A}$ via Eqs. (11) and (12).
2: Compute the sparse representation coefficients $\hat{A}$ via Eqs. (12) and (13).
3: Calculate the augmented representation coefficients $A^*$ by Eq. (14).
4: Determine the class label $l_y$ of $y$ by computing the sums of the class-specific coefficients in $A^*$ with Eq. (15).
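The steps of Algorithm 1 can be sketched in NumPy as follows. This is a minimal illustrative sketch, not the authors' implementation: the function name `welrmc_classify` is hypothetical, and ISTA with per-entry thresholds stands in for whichever weighted l1 solver is actually used for Eq. (13).

```python
import numpy as np

def welrmc_classify(X, labels, y, lam=0.01, n_iter=200):
    """Sketch of Algorithm 1 (WELRMC).

    X: (m, n) training matrix (samples as columns), labels: (n,) class labels,
    y: (m,) testing sample. Returns the predicted class label.
    """
    n = X.shape[1]
    # Eq. (12): locality weights, one distance per training sample.
    w = np.linalg.norm(X - y[:, None], axis=0)
    W = np.diag(w)
    # Eq. (11): weighted collaborative coefficients, closed form.
    A_dense = np.linalg.solve(X.T @ X + lam * W.T @ W, X.T @ y)
    # Eq. (13): weighted sparse coefficients via ISTA (per-entry thresholds).
    L = np.linalg.norm(X, 2) ** 2
    A_sparse = np.zeros(n)
    for _ in range(n_iter):
        z = A_sparse - X.T @ (X @ A_sparse - y) / L
        A_sparse = np.sign(z) * np.maximum(np.abs(z) - lam * w / L, 0.0)
    # Eq. (14): augmented coefficients.
    A = A_dense + A_sparse
    A /= np.linalg.norm(A)
    # Eq. (15): the class with the largest sum of its coefficients wins.
    classes = np.unique(labels)
    sums = [A[labels == c].sum() for c in classes]
    return classes[int(np.argmax(sums))]
```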
Fig. 4. The sum of the coefficients and the reconstruction residual per class via SA-CRC and WELRMC for a given testing sample on ORL.
3.2. TPWELRMC

As discussed above, the augmented representation in SA-CRC enjoys the superiority of both collaboration and sparsity, the sparsity of the sparse representation strengthens pattern discrimination, and the localities of data further enhance sparsity; the proposed WELRMC inherits these properties. It has been proven that coarse-to-fine representation-based classification methods achieve more pattern discrimination [15,20,22], because the coarse-to-fine representation eliminates dissimilar training samples in the first phase and finely represents and correctly classifies each testing sample in the second phase. Building on the good properties of the augmented representation, the localities of data and the coarse-to-fine representation, we introduce the coarse-to-fine representation into WELRMC and propose the TPWELRMC method. TPWELRMC has two phases, each containing a weighted collaborative representation, a weighted sparse representation and an augmented representation. In the first phase of TPWELRMC, the given testing sample $y$ is coarsely represented by all the training samples as $y \approx B_1 x_1 + B_2 x_2 + \cdots + B_n x_n = XB$, where $B = [B_1, B_2, \ldots, B_n]^T$. Then, the weighted collaborative representation coefficients are computed as

$$\check{B} = \arg\min_B \|y - XB\|_2^2 + \lambda \|W_1 B\|_2^2, \tag{16}$$

where $\check{B}$ is the optimized vector of $B$ containing the $n$ collaborative representation coefficients corresponding to the $n$ training samples, and the constraint weight matrix $W_1$ reflects the localities of data and is defined as in Eq. (12). $\check{B}$ can be easily obtained as $\check{B} = (X^T X + \lambda W_1^T W_1)^{-1} X^T y$. Simultaneously, the weighted sparse representation using all the training samples is defined as

$$\hat{B} = \arg\min_B \|y - XB\|_2^2 + \lambda \|W_1 B\|_1, \tag{17}$$
Fig. 5. The sum of the coefficients and the reconstruction residual per class via SA-CRC and WELRMC for a given testing sample on Yale.
where $\hat{B}$ is the optimized vector of $B$ containing the $n$ sparse representation coefficients. Using the coefficients of both the collaborative and the sparse representations, the augmented representation coefficients are calculated as

$$B^* = \frac{\hat{B} + \check{B}}{\|\hat{B} + \check{B}\|_2}, \tag{18}$$

where $B^* = [B_1^*, B_2^*, \ldots, B_n^*]^T$. We use the augmented representation coefficient $B_i^*$ as a linear reconstruction measure to reflect the similarity between the training sample $x_i$ and the testing sample $y$. The $K$ nearest training samples corresponding to the $K$ largest augmented representation coefficients are selected as the representative samples to further finely represent and classify the testing sample $y$. The chosen nearest training samples are denoted as $\bar{X} = [\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_K]$. The chosen $K$ training samples come from $T$ classes, denoted as the set $\{c_1, c_2, \ldots, c_T\}$, a subset of $\{c_1, c_2, \ldots, c_M\}$. In the second phase of TPWELRMC, $y$ is approximately represented by the $K$ nearest training samples as $y \approx C_1\bar{x}_1 + C_2\bar{x}_2 + \cdots + C_K\bar{x}_K$, where $C = [C_1, C_2, \ldots, C_K]^T$ is the vector of the representation coefficients. To obtain the representation coefficients, the weighted collaborative representation using the chosen training samples is defined as
$$\check{C} = \arg\min_C \|y - \bar{X}C\|_2^2 + \lambda \|W_2 C\|_2^2, \tag{19}$$

where $\check{C}$ is the optimized vector of $C$ containing the $K$ collaborative representation coefficients. The weight matrix $W_2$ reflects the localities of data between the chosen training samples and the testing sample $y$ and is defined as

$$W_2 = \mathrm{diag}(\|y - \bar{x}_1\|_2, \|y - \bar{x}_2\|_2, \ldots, \|y - \bar{x}_K\|_2). \tag{20}$$

$\check{C}$ can also be solved in closed form as $\check{C} = (\bar{X}^T \bar{X} + \lambda W_2^T W_2)^{-1} \bar{X}^T y$. Meanwhile, the weighted sparse representation using the chosen training samples is defined as

$$\hat{C} = \arg\min_C \|y - \bar{X}C\|_2^2 + \lambda \|W_2 C\|_1, \tag{21}$$
where $\hat{C}$ is the optimized vector of $C$ containing the $K$ sparse representation coefficients. Using the coefficients of the collaborative and the sparse representations, the augmented representation coefficients in the second phase are calculated as

$$C^* = \frac{\hat{C} + \check{C}}{\|\hat{C} + \check{C}\|_2}, \tag{22}$$

where $C^* = [C_1^*, C_2^*, \ldots, C_K^*]^T$ is the vector of the augmented representation coefficients over the chosen nearest training samples. Through the coarse-to-fine augmented representation, we use the augmented representation coefficients as the linear reconstruction measure to make the classification decision. The testing sample $y$ is classified as

$$l_y = \arg\max_{c_i} \operatorname{sum}\big(\delta_{c_i}(C^*)\big), \tag{23}$$

where $\delta_{c_i}(C^*) \in \mathbb{R}^K$ is a new vector whose nonzero entries are the entries in $C^*$ belonging to class $c_i$. The class label $l_y$ is the class with the largest sum of the augmented representation coefficients. According to the details of the proposed TPWELRMC, its main steps are summarized in Algorithm 2.

Algorithm 2 The proposed TPWELRMC algorithm.
Require: Training sample set: $X \in \mathbb{R}^{m \times n}$; the given testing sample: $y \in \mathbb{R}^m$.
Ensure: The class label $l_y$ of $y$.
1: The coarse augmented representation.
   a) Solve the collaborative representation coefficients $\check{B}$ by Eq. (16).
   b) Solve the sparse representation coefficients $\hat{B}$ by Eq. (17).
   c) Compute the augmented representation coefficients $B^*$ by Eq. (18).
   d) Use the augmented representation to choose the $K$ nearest training samples $\bar{X} = [\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_K]$.
2: The fine augmented representation.
   a) Solve the collaborative representation coefficients $\check{C}$ by Eq. (19).
   b) Solve the sparse representation coefficients $\hat{C}$ by Eq. (21).
   c) Compute the augmented representation coefficients $C^*$ by Eq. (22).
3: The classification decision. Determine the class label $l_y$ of $y$ using the augmented representation coefficients $C^*$ by Eq. (23).
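The two phases of Algorithm 2 can be sketched as follows. This is an illustrative sketch only: the function names and the default `K` are hypothetical, both phases reuse one helper for the weighted augmented coefficients, and ISTA again stands in for the actual weighted l1 solver.

```python
import numpy as np

def _augmented_coeffs(X, y, lam=0.01, n_iter=200):
    """Weighted dense + sparse coefficients, normalised (Eqs. (16)-(18) and (19)-(22))."""
    n = X.shape[1]
    w = np.linalg.norm(X - y[:, None], axis=0)          # locality weights, Eq. (12)/(20)
    # Weighted collaborative part: (X^T X + lam W^T W)^{-1} X^T y with W = diag(w).
    dense = np.linalg.solve(X.T @ X + lam * np.diag(w ** 2), X.T @ y)
    # Weighted sparse part via ISTA with per-entry thresholds.
    L = np.linalg.norm(X, 2) ** 2
    sparse = np.zeros(n)
    for _ in range(n_iter):
        z = sparse - X.T @ (X @ sparse - y) / L
        sparse = np.sign(z) * np.maximum(np.abs(z) - lam * w / L, 0.0)
    out = dense + sparse
    return out / np.linalg.norm(out)                    # Eq. (18)/(22)

def tpwelrmc_classify(X, labels, y, K=5, lam=0.01):
    """Sketch of Algorithm 2 (TPWELRMC)."""
    # Phase 1: coarse augmented representation over all training samples.
    B = _augmented_coeffs(X, y, lam)
    keep = np.argsort(B)[::-1][:K]                      # K largest coefficients
    # Phase 2: fine augmented representation over the chosen K samples.
    C = _augmented_coeffs(X[:, keep], y, lam)
    kept_labels = labels[keep]
    # Eq. (23): class with the largest sum of its fine coefficients.
    classes = np.unique(kept_labels)
    sums = [C[kept_labels == c].sum() for c in classes]
    return classes[int(np.argmax(sums))]
```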
3.3. Differences between the proposed and related methods

The difference between WELRMC and SA-CRC lies largely in the use of the localities of data. Since data localities can improve the power of pattern discrimination, the proposed WELRMC introduces them into both the collaborative representation and the sparse representation used in SA-CRC; WELRMC can therefore be seen as a weighted extension of SA-CRC. TPWELRMC differs from TPLRMC in four aspects. Firstly, TPWELRMC represents each testing sample differently: TPLRMC uses either the sparse or the collaborative representation in each of its two phases, whereas both phases of TPWELRMC use the augmented representation that combines the sparse and the collaborative representations. Secondly, TPWELRMC chooses the nearest training samples in the first phase differently: TPLRMC relies on the coefficients of either the sparse or the collaborative representation, while TPWELRMC uses the augmented representation. Thirdly, TPWELRMC and TPLRMC use different rules for the final classification: the decision in TPLRMC is determined by the sums of the coefficients of the sparse or the collaborative representation, whereas TPWELRMC uses the sums of the coefficients of the augmented representation. Last but not least, the localities of data are considered in TPWELRMC but not in TPLRMC. Besides, TPWELRMC also extends WELRMC by further introducing the coarse-to-fine representation. In summary, WELRMC, TPWELRMC, SA-CRC and TPLRMC are all linear reconstruction measure-based classification methods for classifying objects such as faces.

4. Experiments

In this section, experiments are carried out to verify the effectiveness of the proposed WELRMC and TPWELRMC methods on five public face databases.
We compare the proposed methods with the state-of-the-art RBC methods, including CRC [1], SRC [2], SoC [10], SA-CRC [23] and TPLRMC [22]. In the experiments, the ranges of the regularization parameter are empirically set to {0.001, 0.005, 0.01, 0.05, 0.1} for SoC, SA-CRC and WELRMC, {0.001, 0.01, 0.1, 1, 10} for CRC, TPLRMC and TPWELRMC, and {0.1, 0.2, 0.3, 0.4} for SRC. The best classification results of each competing method over these pre-set ranges are reported on the five face databases.
Fig. 6. The sum of the coefficients and the reconstruction residual per class via SA-CRC and WELRMC for a given testing sample on LFW.
4.1. Databases

The face databases include the ORL, IMM, GT, Yale and LFW datasets. The ORL face database1 contains 400 image samples from 40 individuals, with 10 facial images per individual; the images mainly capture facial expression changes. The Yale face database2 contains 165 image samples from 15 individuals, each with 11 facial images taken under various expressions and illuminations. The IMM face database3 has 240 image samples from 40 individuals (7 female and 33 male), each with 6 facial images. The LFW (Labeled Faces in the Wild) face database4 has more than 13,000 image samples from 1680 individuals, each of whom has two or more images; all images were collected from websites to study face recognition under unconstrained conditions. In the experiments, we use a subset of LFW including 1251 facial images from 86 subjects, each of whom has about 11–20 images [7]. The GT (Georgia Tech) face database5 contains 750 image samples from 50 individuals, each with 15 facial images under different poses, expressions and illuminations. Fig. 1 shows all the images of one individual from each face database as an example. In our experiments, all facial images in the five face databases are cropped and resized to 32 × 32 pixels, and the grey-level values of the images are re-scaled to [0, 1]. On each face database, l image samples from each class are chosen as the training samples and the remaining ones are the testing samples.
1 http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html
2 http://cvc.yale.edu/projects/yalefaces/yalefaces.html
3 http://www.imm.dtu.dk/~aam/datasets/datasets.html
4 http://vis-www.cs.umass.edu/lfw/
5 http://www.anefian.com/research/face_reco.htm
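The per-database split described above (first l samples of each class for training, the rest for testing) can be sketched as follows; the function name is illustrative and the images are assumed already flattened to 32 × 32 = 1024-dimensional vectors scaled to [0, 1].

```python
import numpy as np

def first_l_split(images, labels, l):
    """Choose the first l samples per class for training, the rest for testing.

    images: (n, 1024) array of vectorised 32x32 images in [0, 1];
    labels: (n,) integer class labels. Returns (train_idx, test_idx).
    """
    train_idx, test_idx = [], []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)   # sample positions of class c, in order
        train_idx.extend(idx[:l])           # first l go to training
        test_idx.extend(idx[l:])            # remainder go to testing
    return np.array(train_idx), np.array(test_idx)
```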
Fig. 7. The classification accuracy rates of TPLRMC and TPWELRMC when varying the number of chosen training samples in the first phase on Yale, GT and LFW.
Fig. 8. The classification accuracy rates of TPLRMC and TPWELRMC with varying numbers of training samples chosen in the first phase on IMM and ORL.

4.2. Experiment 1

The first set of experiments analyzes the differences between SA-CRC and WELRMC, in order to verify whether the use of data localities in WELRMC yields an advantage in face recognition over SA-CRC. As discussed in Section 3, WELRMC is a weighted SA-CRC, because its augmented representation is obtained from the weighted collaborative representation and the weighted sparse representation, both of which are constrained by the localities of the data. The comparisons between SA-CRC and WELRMC are illustrated by the class-specific sums of augmented representation coefficients and the reconstruction residual per class in Figs. 2–6. Note that the class-specific reconstruction residuals in these figures are computed by ‖y − XD‖₂, where D is the vector of augmented representation coefficients computed by Eq. (6) in SA-CRC and by Eq. (14) in WELRMC. In the experiments, the first l image samples from each class are chosen as the training samples on each database and the remaining ones as the testing samples. The value of l is set to 9 on GT, 7 on ORL, 5 on Yale, 3 on IMM and 3 on LFW. In these five figures, for demonstration purposes, the testing samples are chosen from class 1 on GT, IMM and Yale, class 10 on ORL and class 6 on LFW, respectively. Given the testing samples, the sums of the augmented representation coefficients and the reconstruction residuals are plotted for all the classes. The green bars show that WELRMC correctly identifies the class labels of the testing samples, and the red bars show the mistakes made by SA-CRC. From Figs. 2–6 we can clearly see that the proposed WELRMC (i.e., the weighted SA-CRC) correctly classifies the given testing samples, while SA-CRC does not. Moreover, the differences among the class-specific sums of augmented representation coefficients in WELRMC are significantly greater than the corresponding ones in SA-CRC, and the differences among the class-specific reconstruction residuals in WELRMC are also much greater than the corresponding ones in SA-CRC. These experimental facts imply that the proposed WELRMC is more discriminative than SA-CRC. We can conclude that this improvement in the power of pattern discrimination comes from the localities of data that WELRMC introduces into SA-CRC. More importantly, we can also observe that the maximal class-specific sums of representation coefficients correspond to the minimal reconstruction residuals. This fact has also been proven for collaborative representation and sparse representation in [25].
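As a sketch, the class-specific coefficient sums and reconstruction residuals plotted in Figs. 2–6 can be computed from a coefficient vector as follows. The function and variable names here are hypothetical; the coefficient vector itself would come from SA-CRC or WELRMC.

```python
import numpy as np

def class_scores(X, y_train, query, coeffs):
    """Given a dictionary X (columns = training samples), training labels,
    a query sample, and one representation coefficient per training sample,
    return the class-specific sum of coefficients and the class-specific
    reconstruction residual ||query - X_c @ coeffs_c||_2."""
    sums, residuals = {}, {}
    for c in np.unique(y_train):
        mask = (y_train == c)
        sums[c] = coeffs[mask].sum()
        residuals[c] = np.linalg.norm(query - X[:, mask] @ coeffs[mask])
    return sums, residuals
```

The predicted label is then the class with the minimal residual (equivalently, in the figures, the class with the maximal coefficient sum).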
This fact also implies that the representation coefficients, used as the linear reconstruction measure, are an effective similarity measure [9,21]. Thus, the proposed WELRMC is more effective than SA-CRC, and the localities of data can enhance the power of pattern discrimination in classification tasks such as face recognition.

4.3. Experiment 2

Both TPWELRMC and TPLRMC use the coarse-to-fine representation. This set of experiments verifies the effectiveness of the proposed TPWELRMC in comparison with TPLRMC. The experiments are set up as follows: the first l image samples from each class are chosen as the training samples on each database, and the number of chosen training samples is set as l = 8, 9 on GT, l = 7, 8 on ORL, l = 5, 7 on Yale, l = 3, 4 on IMM and l = 3, 5 on LFW. It should be noted that the ratios r
Table 1
The maximal classification accuracy rates (%) of the competing methods on all the face databases.

Data  l  CRC    SRC    SoC    SA-CRC  WELRMC  TPLRMC(p=1)  TPLRMC(p=2)  TPWELRMC
IMM   2  58.13  61.75  60.62  60.62   61.88   61.04        61.25        62.29
      3  69.17  65.83  70.83  70.83   70.83   68.06        67.50        71.82
      4  71.25  76.50  71.25  75.49   76.03   69.57        68.83        79.32
      5  77.50  80.00  80.00  80.13   82.00   73.33        72.50        82.71
ORL   5  93.00  93.00  96.00  93.50   96.00   95.25        94.83        96.67
      6  96.67  94.38  96.67  95.83   97.35   96.25        95.83        97.71
      7  97.50  95.00  96.88  96.25   97.50   97.78        97.22        98.33
      8  98.75  98.75  98.75  98.75   99.13   98.75        99.17        99.58
Yale  3  79.17  74.17  80.83  80.83   81.82   84.00        84.50        84.67
      5  88.33  86.67  88.33  86.67   88.33   91.67        91.00        93.00
      7  93.11  92.56  93.44  93.47   94.43   92.24        92.05        95.12
      9  96.67  96.67  98.91  96.67   99.21   96.33        96.67        99.33
GT    6  63.78  59.33  68.22  63.11   77.56   66.00        64.04        75.56
      7  62.00  57.50  68.00  62.75   78.50   67.80        65.05        78.70
      8  63.14  60.29  74.00  66.00   81.41   71.31        68.17        81.54
      9  64.33  57.67  72.33  67.67   81.67   71.73        69.73        82.33
LFW   3  25.78  23.67  26.39  23.46   29.10   23.05        22.01        29.35
      5  32.03  26.43  30.09  24.60   39.09   36.87        35.79        39.46
      7  32.36  29.74  34.82  27.27   40.37   38.94        37.62        43.51
      9  38.37  34.17  40.46  32.70   49.48   42.17        41.06        49.71
of the chosen training samples K in the first phase to all the training samples n (i.e., r = K/n) are set in the range {0.1, 0.2, …, 0.9, 1} on the five face databases. To make the number of training samples chosen in the first phase an integer, we set K = ⌈nr⌉. The experimental results of TPWELRMC and TPLRMC with different numbers of chosen training samples in the first phase are shown in Figs. 7 and 8. We can observe that the proposed TPWELRMC consistently outperforms TPLRMC over the whole range of r values on four of the five databases. Moreover, TPWELRMC is more robust than TPLRMC, since its performance is much more stable over nearly all the settings. These experimental results demonstrate that the localities of data and the augmented representation can enhance the coarse-to-fine representation-based classification performance.

4.4. Experiment 3

The proposed WELRMC and TPWELRMC are further compared with the competing representation-based classification methods, including CRC [1], SRC [2], SoC [10], SA-CRC [23] and TPLRMC [22]. In the experiments, the training samples from each class are randomly chosen, and the remaining samples per class are used for testing on each database. The numbers of training samples are set as follows: l = 6, 7, 8, 9 on GT, l = 5, 6, 7, 8 on ORL, l = 3, 5, 7, 9 on Yale, l = 2, 3, 4, 5 on IMM and l = 3, 5, 7, 9 on LFW. The classification results of each method are averaged over five random divisions of each database. Note that the best performance of TPWELRMC and TPLRMC is obtained in the same way as in Figs. 7 and 8. The comparative classification results of the competing methods on all the face databases are reported in Table 1, where the classification accuracies of the proposed methods are denoted in boldface.
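The two-phase, coarse-to-fine scheme with ratio r can be sketched as follows. This is an illustrative stand-in, not the paper's exact method: it uses plain ridge-regularized (collaborative-representation-style) coefficients in both phases instead of the augmented representation of Eq. (14), and all function names are hypothetical.

```python
import numpy as np

def ridge_coeffs(D, y, lam=0.01):
    # collaborative-representation-style solution: (D^T D + lam*I)^{-1} D^T y
    n = D.shape[1]
    return np.linalg.solve(D.T @ D + lam * np.eye(n), D.T @ y)

def two_phase_classify(X, labels, query, r=0.5, lam=0.01):
    """Coarse-to-fine sketch. Phase 1: represent the query over all n
    training samples and keep the K = ceil(n*r) samples with the largest
    coefficient magnitudes. Phase 2: re-represent the query over the kept
    samples and pick the class with the smallest reconstruction residual."""
    n = X.shape[1]
    K = int(np.ceil(n * r))
    coarse = ridge_coeffs(X, query, lam)
    keep = np.argsort(np.abs(coarse))[-K:]   # phase 1: nearest samples by |coefficient|
    Xk, yk = X[:, keep], labels[keep]
    fine = ridge_coeffs(Xk, query, lam)      # phase 2: fine representation
    classes = np.unique(yk)
    resid = [np.linalg.norm(query - Xk[:, yk == c] @ fine[yk == c]) for c in classes]
    return classes[int(np.argmin(resid))]
```

Varying r between 0.1 and 1 as in Figs. 7 and 8 simply changes how many training samples survive the coarse first phase.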
It is clear that the classification accuracy of each method increases with the number of training samples on all the face databases. Both WELRMC and TPWELRMC almost always significantly outperform CRC, SRC, SoC, SA-CRC and TPLRMC. Moreover, TPWELRMC achieves the best classification performance among all the competing methods. In summary, the experiments above show that the proposed methods are more effective and robust than the competing RBC methods in terms of classification accuracy. Furthermore, the experiments also demonstrate that the coarse-to-fine representation, the augmented representation and the localities of data can strengthen the power of pattern discrimination for representation-based classification. Meanwhile, the experiments verify that the sparse or collaborative representation coefficients serve as a good similarity measure with strong discriminative power.

5. Conclusions

In representation-based classification, where each testing sample is represented as a weighted linear combination of a set of training samples, the localities of data and the coarse-to-fine representation play an important role in discriminating objects such as faces. The representation coefficients in the linear reconstruction are often used to measure how similar a given testing sample is to the training samples. Our two proposed linear reconstruction measure-based classification methods are designed to further improve representation-based classification performance. In the proposed WELRMC method, we incorporate the localities of data into SA-CRC and use the augmented representation coefficients as the linear reconstruction measure for the classifier design. In the proposed TPWELRMC, we further extend the SA-CRC method by
using the localities of data and the coarse-to-fine representation. The augmented representation coefficients, as a linear reconstruction measure, are used for choosing the nearest training samples in the first phase of TPWELRMC and for designing the classification decision in its second phase. To demonstrate the effectiveness of the proposed methods, we compare them with state-of-the-art RBC methods on five public face databases. The experimental results show that the proposed methods outperform their counterparts. In future work, since the augmented representation coefficients can be regarded as a linear reconstruction measure for face recognition, we will introduce the augmented representation or the coarse-to-fine augmented representation into other kinds of classifiers. Moreover, owing to the good pattern discrimination of the proposed methods, we will combine them with efficient deep learning models for practical problems.

Declaration of Competing Interest

All the authors declare that there are no conflicts of interest regarding the publication of this article.

Acknowledgment

This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 61976107, 61962010, 61502208, 61762021 and 61402122), the Natural Science Foundation of Jiangsu Province of China (Grant No. BK20150522), the International Postdoctoral Exchange Fellowship Program of the China Postdoctoral Council (No. 20180051), the Research Foundation for Talented Scholars of Jiangsu University (Grant No. 14JDG037), the China Postdoctoral Science Foundation (Grant No. 2015M570411), the Open Foundation of the Artificial Intelligence Key Laboratory of Sichuan Province (Grant No. 2017RYJ04), the Natural Science Foundation of Guizhou Province (Nos. [2017]1130 and [2017]5726-32), and the Excellent Young Scientific and Technological Talents of Guizhou ([2019]5670).

References

[1] Zhang L, Yang M, Feng X. Sparse representation or collaborative representation: which helps face recognition? In:
2011 International conference on computer vision. IEEE; 2011. p. 471–8. doi:10.1109/ICCV.2011.6126277.
[2] Wright J, Yang AY, Ganesh A, Sastry S, Ma Y. Robust face recognition via sparse representation. IEEE Trans Pattern Anal Mach Intell 2009;31(2):210–27. doi:10.1109/TPAMI.2008.79.
[3] Xu Y, Li Z, Yang J, Zhang D. A survey of dictionary learning algorithms for face recognition. IEEE Access 2017;5:8502–14. doi:10.1109/ACCESS.2017.2695239.
[4] Zeng SN, Gou J, Deng LM. An antinoise sparse representation method for robust face recognition via joint l1 and l2 regularization. Expert Syst Appl 2017;82:1–9. doi:10.1016/j.eswa.2017.04.001.
[5] Yang M, Zhang L, Feng X, Zhang D. Sparse representation based fisher discrimination dictionary learning for image classification. Int J Comput Vis 2014;109(3):209–32. doi:10.1007/s11263-014-0722-8.
[6] Zhang J, Tao D. FAMED-Net: a fast and accurate multi-scale end-to-end dehazing network. IEEE Trans Image Process 2019. doi:10.1109/TIP.2019.2922837.
[7] Gou J, Hou B, Ou WH, Mao QR, Yang HB, Liu Y. Several robust extensions of collaborative representation for image classification. Neurocomputing 2019;348:120–33. doi:10.1016/j.neucom.2018.06.089.
[8] Zhang Z, Xu Y, Yang J, Li XL, Zhang D. A survey of sparse representation: algorithms and applications. IEEE Access 2015;3:490–530. doi:10.1109/ACCESS.2015.2430359.
[9] Cheng H, Liu ZC, Hou L, Yang J. Sparsity-induced similarity measure and its applications. IEEE Trans Circuits Syst Video Technol 2016;26(4):613–26. doi:10.1109/TCSVT.2012.2225911.
[10] Li J, Lu CY. A new decision rule for sparse representation based classification for face recognition. Neurocomputing 2013;116:265–71. doi:10.1016/j.neucom.2012.04.034.
[11] Gou J, Yi Z, Zhang D, Zhan YZ, Shen XJ, Du L. Sparsity and geometry preserving graph embedding for dimensionality reduction. IEEE Access 2018;6:75748–66. doi:10.1109/ACCESS.2018.2884027.
[12] Xu J, An W, Zhang L, Zhang D.
Sparse, collaborative, or nonnegative representation: which helps pattern classification? Pattern Recognit 2019;88:679–88. doi:10.1016/j.patcog.2018.12.023.
[13] Deng WH, Hu JN, Guo J. Face recognition via collaborative representation: its discriminant nature and superposed representation. IEEE Trans Pattern Anal Mach Intell 2018;40(10):2513–21. doi:10.1109/TPAMI.2017.2757923.
[14] Cai S, Zhang L, Zuo W, Feng X. A probabilistic collaborative representation based approach for pattern classification. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). IEEE; 2016. p. 2950–9. doi:10.1109/CVPR.2016.322.
[15] Gou J, Wang L, Hou B, Lv J, Yuan Y, Mao Q. Two-phase probabilistic collaborative representation-based classification. Expert Syst Appl 2019;133:9–20. doi:10.1016/j.eswa.2019.05.009.
[16] Xu Y, Zhong ZF, Yang J, You J, Zhang D. A new discriminative sparse representation method for robust face recognition via l2 regularization. IEEE Trans Neural Netw Learn Syst 2017;28(10):2233–42. doi:10.1109/TNNLS.2016.2580572.
[17] Zhu Q, Zhang DQ, Sun H, Li ZM. Combining l1-norm and l2-norm based sparse representations for face recognition. Optik 2015;126(7–8):719–24. doi:10.1016/j.ijleo.2015.02.020.
[18] Lu CY, Min H, Gui J, Zhu L, Lei YK. Face recognition via weighted sparse representation. J Visual Commun Image Represent 2013;24(2):111–16. doi:10.1016/j.jvcir.2012.05.003.
[19] Timofte R, Van Gool L. Adaptive and weighted collaborative representations for image classification. Pattern Recognit Lett 2014;43:127–35. doi:10.1016/j.patrec.2013.08.010.
[20] Xu Y, Zhang D, Yang J, Yang JY. A two-phase test sample sparse representation method for use with face recognition. IEEE Trans Circuits Syst Video Technol 2011;21(9):1255–62. doi:10.1109/TCSVT.2011.2138790.
[21] Zhang J, Yang J. Linear reconstruction measure steered nearest neighbor classification framework. Pattern Recognit 2014;47(4):1709–20. doi:10.1016/j.patcog.2013.10.018.
[22] Gou J, Xu Y, Zhang D, Mao QR, Du L, Zhan YZ.
Two-phase linear reconstruction measure-based classification for face recognition. Inf Sci 2018;433:17–36. doi:10.1016/j.ins.2017.12.025.
[23] Akhtar N, Shafait F, Mian A. Efficient classification with sparsity augmented collaborative representation. Pattern Recognit 2017;65:136–45. doi:10.1016/j.patcog.2016.12.017.
[24] Yu K, Zhang T, Gong YH. Nonlinear learning using local coordinate coding. Adv Neural Inf Process Syst 2009:2223–31.
[25] Ma HX, Gou J, Wang XL, Ke J, Zeng SN. Sparse coefficient-based k-nearest neighbor classification. IEEE Access 2017;5:16618–34. doi:10.1109/ACCESS.2017.2739807.

Jianping Gou received the Ph.D. degree in computer science from the University of Electronic Science and Technology of China (UESTC), Chengdu, China, in 2012. He is currently an associate professor in the School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, Jiangsu, China. His current research interests include pattern recognition and machine learning.

Jun Song is currently pursuing the master's degree in computer science in the School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, Jiangsu, China. His research interests include image classification and machine learning.

Weihua Ou received the Ph.D. degree in information and communication engineering from the Huazhong University of Science and Technology (HUST), China, in 2014. He is currently an associate professor with the School of Big Data and Computer Science, Guizhou Normal University, Guiyang, China. His current research interests include sparse representation, multi-view learning, cross-modal retrieval, image processing, and computer vision.

Shaoning Zeng is currently pursuing the Ph.D. degree in computer science in the Department of Computer and Information Science, Faculty of Science and Technology, University of Macau. Since 2009, he has been a lecturer in the School of Information Science and Technology at Huizhou University, China. His research interests include computer vision, pattern recognition, machine learning and deep learning.

Yun-Hao Yuan received the Ph.D. degree in pattern recognition and intelligent systems from Nanjing University of Science and Technology (NUST), China, in 2013. He is currently an associate professor with the Department of Computer Science and Technology, College of Information Engineering, Yangzhou University.
His research interests include pattern recognition, machine learning, multimedia search, and information fusion.

Lan Du received the Ph.D. degree in computer science from ANU in 2012. He is currently a lecturer in data science with the Faculty of Information Technology, Monash University, Australia. Before joining Monash University, he was a postdoctoral research fellow in the Computational Linguistics Group, Macquarie University. His research interests include machine learning, natural language processing, text mining, and social network analysis.