Optik - International Journal for Light and Electron Optics 199 (2019) 163368
Original research article
A robust geometric mean-based subspace discriminant analysis feature extraction approach for image set classification
Jianqiang Gao a,⁎, Li Li b
a School of Medical Information Engineering, Jining Medical University, Shandong 276826, China
b Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence, Jiangnan University, Wuxi 214122, China
ARTICLE INFO
Keywords: Image set classification; Feature extraction; Geometric mean vector; Subspace discriminant analysis; Dimension reduction
ABSTRACT
Discriminant analysis is an important research topic in image set classification because it can extract discriminative features. However, most existing discriminant analysis methods almost fail to extract useful features when only a small amount of valid discriminant information is available. The main weakness of most existing discriminant analysis models is that the class mean vector is constructed as the arithmetic average of the class samples, which is not sufficient to provide an accurate estimate of the class mean. In this paper, we propose a robust geometric mean-based subspace discriminant analysis feature extraction method for image set classification. The method combines the geometric mean vector, the subspace and the covariance matrix to jointly represent an image set, because they contain different discriminative information. These three representations lie in different spaces. To reduce the dissimilarity between the heterogeneous spaces, a robust geometric mean-based subspace discriminant analysis learning (GMSDA) framework is developed, which consists of three steps. First, a new geometric mean vector is used to construct the geometric between-class scatter matrix (S_gb) and the geometric within-class scatter matrix (S_gw), instead of the traditional mean vector used by many methods. Second, the geometric between-class scatter matrix is maximized to increase the difference between extracted features. Finally, the subspace between-class scatter (S_gb^S) is maximized and the subspace within-class scatter (S_gw^S) is minimized simultaneously in the subspace of the original space. Experiments on five datasets illustrate that the proposed GMSDA method works best in the small sample size situation when using maximum likelihood classification (MLC).
1. Introduction

Recently, image classification has been one of the most active research topics in many real-world applications such as access control, card identification, computer vision and cell classification [1–4]. Still image set classification uses the features of still images to identify their class index. A traditional classification system includes three steps: feature construction, dimension reduction and classification [5]. Generally speaking, feature construction can reveal the essential characteristics of things and achieve higher classification accuracy in image set classification tasks. Feature construction is used to find a transformation that maps the original data to a lower-dimensional space in which the essential discriminative information is largely preserved. Feature construction methods can be broadly divided into two groups: feature selection and feature extraction [6]. Feature selection methods generate no new features and only select a subset of the original features. Feature extraction mainly aims
⁎ Corresponding author.
E-mail addresses: [email protected], [email protected] (J. Gao), [email protected], [email protected] (L. Li).
https://doi.org/10.1016/j.ijleo.2019.163368 Received 13 March 2017; Received in revised form 20 August 2019; Accepted 4 September 2019 0030-4026/ © 2019 Elsevier GmbH. All rights reserved.
at reducing the dimensionality of the original data while keeping as much intrinsic information as possible, exploring the inherent low-dimensional structure, reducing the computational complexity and improving the performance of data analysis. In addition, since the performance of a classifier relies heavily on the quality of the features, the extraction of effective features is extremely critical in image set classification. Generally speaking, features selected by feature selection techniques maintain the physical meaning of the original data, while features obtained by feature extraction methods are more discriminative than those selected by feature selection techniques.

There are many feature extraction methods for image set classification, ranging from unsupervised to supervised techniques. Unsupervised methods do not require any prior knowledge, even though they are not directly aimed at optimizing the accuracy of a given classification task. That is to say, unsupervised methods mainly focus on another representation of the original data in a lower-dimensional subspace satisfying some criteria, and usually do not concern themselves with class discrimination. Principal component analysis (PCA) [7] is one of the best known unsupervised feature extraction methods. The purpose of PCA is to project the original data into a new subspace by minimizing the reconstruction error in the mean squared sense. The features extracted by PCA have the highest and the lowest contrast (or variance) in the first and in the last principal components, respectively. Therefore, the classification results obtained with PCA features are practically competitive. Supervised methods rely on the existence of labeled samples to infer class separability. Two widely used feature extraction methods are linear discriminant analysis (LDA) [8] and nonparametric weighted feature extraction (NWFE) [9], which utilize the mean vector and covariance matrix of each class. The performance of LDA is very poor when the classes have non-normal-like or multi-modal mixture distributions, and LDA can extract at most c − 1 features, which may not be sufficient to represent the essential information of the original data in image set classification. In addition, the performance of LDA is very poor when the within-class covariance is nearly singular, especially for high-dimensional data. These problems have been addressed by the NWFE method, which places different weights on every sample to calculate weighted means and then uses the distances between samples and their weighted means as their closeness to the boundary. Kernel principal component analysis (KPCA) [10] and generalized discriminant analysis (GDA) [11] are nonlinear versions of PCA and LDA obtained through the kernel trick (mapping the input data from the original space to a convenient feature space in which inner products can be computed by a kernel function). However, the computational burden of nonlinear representations is very heavy. Semi-supervised methods try to find a projection using a very limited number of labeled samples and a large number of unlabeled samples [12]. In order to preserve the properties of local neighborhoods of an image set, some local methods have been proposed, such as locality preserving projection (LPP) [13] and maximum margin projection (MMP) [14]. The MMP method uses class information and neighborhood information to construct a within-class graph and a between-class graph, respectively.
For two data points, if they share the same label or they are sufficiently close to each other, they are connected in the within-class graph; if they do not share the same label, they are connected in the between-class graph. Finally, the original data can be mapped to a new subspace by using a linear transformation matrix. In addition, there are methods that use a hybrid criterion combining two or more feature extraction methods to exploit their respective advantages, such as Gabor-based features [15–17], canonical correlation analysis (CCA) [18], multiple-rank supervised canonical correlation analysis (MSCCA) [19] and multi-model fusion metric learning (MMFML) [20]. Subsequently, many effective still image classification methods have been proposed, such as k-nearest neighbors (KNN) [21], support vector machines (SVM) [22], twin SVM (TSVM) [23] and twin multiple rank support matrix machines (TMRSMMs) [24]. Nowadays, a large number of videos are captured every day. Hence, video-based face recognition has recently attracted increasing interest in the pattern recognition community; for example, Ding and Tao [25] proposed a comprehensive convolutional neural network-based (CNN-based) framework to process video-based face images, and Masi et al. [26] proposed Pose-Aware Models (PAM) that process a face image using several pose-specific deep CNNs. In conventional still image classification, each training and testing example is a single still image; on the contrary, in image set classification, each training and testing example contains a set of images. Therefore, an image set can provide more information to describe objects, such as different illumination conditions of the objects. In image set classification, each image set includes a wide range of appearance variations, so how to represent the image set is a challenging problem. Image set classification thus recognizes an object or a person by measuring the similarities between the gallery sets and the probe sets. Many outstanding contributions have been made to image set classification; for example, image sets have been modeled as distribution functions [27], subspaces [28,29], manifolds [28,29], statistical features [30] and domains [31]. Apart from the above image set based methods, some single image-based methods have also been developed for image set classification, such as cooperative sparse representation [32] and feature space discriminant analysis (FSDA) [33]. The main idea of the FSDA method is to reduce the redundant spectral information in the extracted features by introducing the between-spectral scatter based on the class sample average vector. As mentioned above, the mean vector, the subspace and the covariance are effective image set representations; that is to say, they contain different discriminative information about an image set. Hence, in order to mine effective information from these image set representations, a robust geometric mean-based subspace discriminant analysis method, termed geometric mean-based subspace discriminant analysis learning (GMSDA), is proposed to jointly represent an image set. The proposed method uses a geometric mean vector to construct the geometric between-class scatter matrix (S_gb) and the geometric within-class scatter matrix (S_gw) instead of the conventional mean vector of state-of-the-art methods.
Subsequently, S_gb is maximized to increase the difference between extracted features and to obtain the feature subspace of the original image, and finally the subspace between-class scatter matrix (S_gb^S) is maximized and the subspace within-class scatter matrix (S_gw^S) is minimized simultaneously. Experimental results on face recognition (ORL, Yale, AR and FERET databases) and object recognition (COIL20 database) demonstrate that the proposed method achieves the highest recognition rate.

The rest of this paper is organized as follows. In Section 2, we give an overview of the proposed method. Section 3 presents the GMSDA framework and its relevant theory and algorithm for image set classification. Experiments and results analysis on various
kinds of databases are performed in Section 4. Finally, Section 5 provides the conclusion and future work.

2. The basic idea of GMSDA

Extracting discriminative features and building a powerful representation framework are very important for image set classification with small sample sizes. In our proposed GMSDA method, we further mine the information contained in the geometric mean vector, the subspace and the covariance. GMSDA has three main purposes: (i) the intrinsic geometric structure of the image set database can be revealed exactly, and the GMSDA algorithm can learn nonlinear correlation features with good discriminating power through MLC; (ii) the produced features are as different from each other as possible; (iii) the separability between classes is increased. The GMSDA method uses a geometric mean vector to construct the geometric between-class scatter matrix (S_gb) and the geometric within-class scatter matrix (S_gw) instead of the conventional mean vector used by many methods. Subsequently, S_gb is maximized to increase the difference between extracted features and to obtain the feature subspace of the original image, and finally the subspace between-class scatter matrix (S_gb^S) is maximized and the subspace within-class scatter matrix (S_gw^S) is minimized simultaneously.

Let $\{x_i : x_i \in \mathbb{R}^d\}_{i=1}^{N}$ be the training dataset with geometric mean m, belonging to c classes. Each class has a geometric mean $m_i$ of $n_i$ data points, where $N = \sum_{i=1}^{c} n_i$. For extracting p features from the d × 1 original feature vector x, a transformation matrix $A \in \mathbb{R}^{p \times d}$ is used. The extracted feature vector is $y_{p \times 1} = A_{p \times d}\, x_{d \times 1}$.

3. Geometric mean-based subspace discriminant analysis learning

3.1. Image set representation

Let X = [x_1, …, x_n] be an image set, where $x_i \in \mathbb{R}^d$ is the i-th image sample of the set, and take the image pixel values as the original features. The affine subspace is a joint representation of the image set using the geometric mean vector and the subspace; that is to say, the geometric mean vector provides complementary information to the subspace [5]. As is well known, the geometric mean vector and the covariance matrix are the first-order and second-order statistics of an image set, respectively, and the information carried by different-order statistics is different [5]. The geometric mean vector reflects the position of the samples in the Euclidean space, while the diagonal and off-diagonal elements of the covariance matrix reflect the variance and the correlation of the features, respectively. Therefore, these statistics provide complementary information to represent the image set.

Geometric mean: The geometric mean of n positive numbers $x_1, x_2, \ldots, x_n$ is defined as the positive n-th root of their product [34]:
$x_g = \sqrt[n]{x_1 \cdot x_2 \cdots x_n} = \left( \prod_{i=1}^{n} x_i \right)^{1/n}$    (1)
Generally speaking, the geometric mean is more resistant to outliers (or skewed data). Let Data = {3.3, 3, 10, 3.1, 1, 3.2, 3.4, 3.5}, which contains the outliers "1" and "10". The average value of Data is $x_m = (1 + 3 + 3.1 + 3.2 + 3.3 + 3.4 + 3.5 + 10)/8 = 3.8125$, while the geometric mean of Data is $x_g = (x_1 \cdot x_2 \cdots x_8)^{1/8} \approx 3.2245$. We observe that 3.2245 is closer to the central tendency of Data ($x_{central} = (3 + 3.1 + 3.2 + 3.3 + 3.4 + 3.5)/6 = 3.25$) than 3.8125. Hence, the geometric mean is more robust than the arithmetic average.
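As a quick sanity check of the numbers above (an illustration of ours, not part of the original derivation), the following short Python snippet reproduces both values:

```python
import numpy as np

data = np.array([3.3, 3, 10, 3.1, 1, 3.2, 3.4, 3.5])

arithmetic_mean = data.mean()                 # 3.8125, pulled toward the outliers 1 and 10
geometric_mean = np.exp(np.log(data).mean())  # ~3.2245, closer to the central tendency 3.25

print(arithmetic_mean, geometric_mean)
```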
Subspace: The subspace is usually obtained by PCA or singular value decomposition (SVD), which corresponds to the eigen-decomposition of the matrix $XX^T$ [5]:

$XX^T = U \Sigma U^T$    (2)
where $U = [u_1, \ldots, u_q]$ is the d × q orthogonal matrix containing the q largest eigenvectors of $XX^T$ and serves as the basis of the q-dimensional linear subspace span(U).

Covariance: The covariance matrix C of the image set, which represents the correlation of two individual features over the samples in the image set, is computed as follows [5]:
$C = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - x_g)(x_i - x_g)^T$    (3)
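For concreteness, a minimal NumPy sketch of the three representations of one image set follows. It assumes the set is stored as a d × n matrix X with strictly positive pixel values (so the element-wise geometric mean of Eq. (1) is defined); the function and variable names are ours, not the authors':

```python
import numpy as np

def image_set_representations(X, q):
    """X: d x n matrix whose columns are the n vectorized images of one set.
    Returns the element-wise geometric mean vector (Eq. (1)), a q-dimensional
    subspace basis (Eq. (2)) and the covariance matrix around x_g (Eq. (3))."""
    d, n = X.shape
    x_g = np.exp(np.log(X).mean(axis=1))             # geometric mean of each pixel across the set
    U, _, _ = np.linalg.svd(X, full_matrices=False)  # left singular vectors = eigenvectors of X X^T
    U_q = U[:, :q]                                   # basis of the q-dimensional subspace span(U_q)
    diff = X - x_g[:, None]
    C = diff @ diff.T / (n - 1)                      # covariance matrix of Eq. (3)
    return x_g, U_q, C

# toy usage: a set of 8 images of 20 x 20 pixels
X = np.random.rand(400, 8) + 0.1
x_g, U_q, C = image_set_representations(X, q=3)
```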
3.2. The implementation of GMSDA

To improve the image set classification accuracy, in its second step the proposed GMSDA method minimizes the within-class scatter matrix and maximizes the between-class scatter matrix simultaneously for class discrimination. In image set processing, the training sample matrix is denoted as follows:
$X_j = \begin{bmatrix} x_{11j} & x_{12j} & \cdots & x_{1cj} \\ x_{21j} & x_{22j} & \cdots & x_{2cj} \\ \vdots & \vdots & & \vdots \\ x_{d1j} & x_{d2j} & \cdots & x_{dcj} \end{bmatrix}, \quad (j = 1, 2, \ldots, n_k)$    (4)
where $x_{ikj}$ (i = 1, 2, …, d; k = 1, 2, …, c; j = 1, 2, …, n_k) is the j-th training sample of class k in the i-th dimension, and $n_k$ is the number of training samples of each class. The vector $h_{ij}$ is defined as follows:
$h_{ij} = [x_{i1j}, x_{i2j}, \ldots, x_{icj}]^T, \quad (i = 1, 2, \ldots, d;\ j = 1, 2, \ldots, n_k)$    (5)
Each $h_{ij}$ must contain a representative from each class, and the representative of each class is its geometric mean. When the training samples of the classes are used instead of their geometric means, the representative of each class is a training sample from that class. This means that all classes must be involved in composing $h_{ij}$. In this paper, the same number of training samples is used in all classes, i.e., $n_t = \min\{n_k\}_{k=1}^{c}$. Based on the above analysis, the geometric between-spectral scatter matrix ($S_{gb} \in \mathbb{R}^{d \times d}$) is calculated as follows:
$S_{gb} = \sum_{j=1}^{n_t} \sum_{i=1}^{d} (h_{ij} - \bar{h}_j)(h_{ij} - \bar{h}_j)^T$    (6)

$\bar{h}_j = \frac{1}{d} \sum_{i=1}^{d} h_{ij}$    (7)
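A hedged sketch of how the $h_{ij}$ vectors and $S_{gb}$ of Eqs. (5)–(7) could be computed is given below. It assumes the training data are arranged as a d × c × n_t array holding $x_{ikj}$ and follows the equations literally (with $h_{ij} \in \mathbb{R}^c$ as in Eq. (5), the matrix accumulated here is c × c); the names are ours:

```python
import numpy as np

def geometric_between_class_scatter(X):
    """X: array of shape (d, c, nt) with X[i, k, j] = x_{ikj} (class representatives,
    e.g. geometric class means).  Returns S_gb of Eq. (6)."""
    d, c, nt = X.shape
    S_gb = np.zeros((c, c))
    for j in range(nt):
        H = X[:, :, j]            # row i of H is h_ij^T = [x_{i1j}, ..., x_{icj}] (Eq. (5))
        h_bar = H.mean(axis=0)    # Eq. (7): average of the h_ij over the d dimensions
        D = H - h_bar             # row i is (h_ij - h_bar_j)^T
        S_gb += D.T @ D           # adds sum_i (h_ij - h_bar_j)(h_ij - h_bar_j)^T
    return S_gb
```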
The projection matrix $W \in \mathbb{R}^{c \times c}$ can be obtained by maximizing $\mathrm{tr}(W^T S_{gb} W)$. The original feature space of the image set is transformed into a new feature space by using the obtained projection matrix W, such that the raw features differ more from each other in it. Let us consider the SVD of $S_{gb} \in \mathbb{R}^{d \times d}$:

$S_{gb} = U_b \Sigma_b U_b^T$    (8)

Partitioning $U_b$ as $U_b = [U_{b1}\ U_{b2}]$, where $U_{b1}$ contains the first c columns, $U_{b2}$ contains the remaining d − c columns and c = rank($S_{gb}$), we have

$\mathrm{range}(S_{gb}) = \mathrm{span}(U_{b1})$    (9)

Hence, Eq. (8) can be written in the following form:

$S_{gb} = U_b \Sigma_b U_b^T = [U_{b1}\ U_{b2}] \begin{bmatrix} \Sigma_{b1} & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} U_{b1}^T \\ U_{b2}^T \end{bmatrix}$    (10)

where $U_{b1} \in \mathbb{R}^{d \times c}$ is column orthogonal and $\Sigma_{b1} \in \mathbb{R}^{c \times c}$ is a diagonal matrix whose diagonal components are non-increasing and positive. The ability of discrimination may be reduced because of the singularity of the matrix $S_{gb}$; hence, the zero eigenvalues and the corresponding eigenvectors should be discarded. In the following, Theorem 1 verifies that $U_{b1}$ is a solution of the problem $W = \arg\max_{W^T W = I} \mathrm{tr}(W^T S_{gb} W)$.

Theorem 1. Given any orthogonal matrix $G \in \mathbb{R}^{c \times c}$, $W = U_{b1} G$ is a solution of $W = \arg\max_{W^T W = I} \mathrm{tr}(W^T S_{gb} W)$.

Proof: Since $G^T G = G G^T = I_c$ and $U_{b1}^T U_{b1} = I_c$, we have
$(U_{b1} G)^T (U_{b1} G) = G^T U_{b1}^T U_{b1} G = I_c$

and

$\mathrm{tr}((U_{b1} G)^T S_{gb} (U_{b1} G)) = \mathrm{tr}(U_{b1}^T S_{gb} U_{b1} G G^T) = \mathrm{tr}(U_{b1}^T S_{gb} U_{b1}),$

which indicates that the conclusion is true. That is to say, the difference between features is increased in the transformed feature space by using the following transformation:
$(g_{ij})_{c \times 1} = W h_{ij}, \quad (i = 1, 2, \ldots, d;\ j = 1, 2, \ldots, n_t)$    (11)
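A sketch of this first projection step (Eqs. (8)–(11)) might look as follows. It keeps the eigenvectors of $S_{gb}$ associated with non-zero eigenvalues and, in this sketch, applies the projection as $W^T h_{ij}$ so that the shapes still conform once zero eigenvalues are discarded; the helper names are ours, not the authors':

```python
import numpy as np

def first_stage_projection(S_gb, H, tol=1e-10):
    """S_gb: geometric between-class scatter matrix (symmetric, positive semi-definite).
    H: array of shape (d, c, nt) whose slice H[i, :, j] is the vector h_ij of Eq. (5).
    Returns W (eigenvectors of S_gb with non-zero eigenvalues) and the transformed
    vectors g_ij."""
    eigvals, eigvecs = np.linalg.eigh(S_gb)
    keep = eigvals > tol * max(eigvals.max(), 1.0)   # drop (numerically) zero eigenvalues
    W = eigvecs[:, keep]                             # columns span range(S_gb), cf. Eq. (9)
    G = np.einsum('rc,icj->irj', W.T, H)             # g_ij = W^T h_ij for every i and j
    return W, G
```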
The sample matrix $X_j$ (j = 1, 2, …, n_t) in Eq. (4) is transformed into Eq. (12) by using the above transformation:
$R_j = \begin{bmatrix} r_{11j}^G & r_{12j}^G & \cdots & r_{1cj}^G \\ r_{21j}^G & r_{22j}^G & \cdots & r_{2cj}^G \\ \vdots & \vdots & & \vdots \\ r_{d1j}^G & r_{d2j}^G & \cdots & r_{dcj}^G \end{bmatrix}, \quad (j = 1, 2, \ldots, n_t)$    (12)
The vectors $g_{ij}$ and $R_{kj}$ are defined as follows:

$g_{ij} = [r_{i1j}^G, r_{i2j}^G, \ldots, r_{icj}^G]^T, \quad (i = 1, 2, \ldots, d;\ j = 1, 2, \ldots, n_t)$    (13)

$R_{kj} = [r_{1kj}^G, r_{2kj}^G, \ldots, r_{dkj}^G]^T, \quad (k = 1, 2, \ldots, c;\ j = 1, 2, \ldots, n_t)$    (14)
Therefore, in the second step of GMSDA, the scatter matrices $S_b$ and $S_w$ can be obtained in the transformed feature space as follows:

$S_b = \sum_{j=1}^{n_t} \sum_{k=1}^{c} (R_{kj} - \bar{R})(R_{kj} - \bar{R})^T$    (15)
Fig. 1. The first row: image examples of the ORL database; the second row: image examples of the Yale database; the third row: image examples of the AR database; the fourth row: image examples of the FERET database; the last row: image examples of the COIL20 database.

$S_w = \sum_{k=1}^{c} \sum_{j=1}^{n_t} \sum_{i=1}^{n_t} (R_{ki} - R_{kj})(R_{ki} - R_{kj})^T$    (16)

$\bar{R} = \frac{1}{c \times n_t} \sum_{j=1}^{n_t} \sum_{k=1}^{c} R_{kj}$    (17)
The matrix $S_w$ may be singular for high-dimensional, small-sample-size data sets, so the solution cannot be obtained directly. Hence, we use a regularization technique to deal with it as follows [33,36,37]:
$S_w = \frac{1}{2} S_w + \frac{1}{2} \mathrm{diag}(S_w)$    (18)
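A minimal NumPy/SciPy sketch of this second step (Eqs. (15)–(18), together with the eigenvalue problem of Eq. (19) stated next) is given below. It assumes the transformed samples $R_{kj}$ of Eq. (14) are stored as an array of shape (d, c, n_t); this is an illustration of ours, not the authors' implementation:

```python
import numpy as np
from scipy.linalg import eigh

def second_stage_projection(R, p):
    """R: array of shape (d, c, nt) with R[:, k, j] = R_kj (Eq. (14)).
    Returns the p x d projection matrix A built from the eigenvectors of S_w^{-1} S_b."""
    d, c, nt = R.shape
    R_bar = R.mean(axis=(1, 2), keepdims=True)        # Eq. (17): global mean of the R_kj
    D_b = (R - R_bar).reshape(d, -1)
    S_b = D_b @ D_b.T                                 # Eq. (15): between-class scatter
    S_w = np.zeros((d, d))
    for k in range(c):                                # Eq. (16): within-class scatter
        for j in range(nt):
            for i in range(nt):
                diff = R[:, k, i] - R[:, k, j]
                S_w += np.outer(diff, diff)
    S_w = 0.5 * S_w + 0.5 * np.diag(np.diag(S_w))     # Eq. (18): regularization
    eigvals, eigvecs = eigh(S_b, S_w)                 # Eq. (19): S_b a = lambda S_w a
    order = np.argsort(eigvals)[::-1][:p]             # p largest eigenvalues
    A = eigvecs[:, order].T                           # p x d, so that y = A @ x
    return A

# toy usage: d = 30 features, c = 5 classes, nt = 4 samples per class, p = 3 features
A = second_stage_projection(np.random.rand(30, 5, 4), p=3)
```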
The computation of the projection matrix A is usually transformed into the following eigenvalue problem:

$S_b A = \lambda S_w A$    (19)

Hence, A can be obtained by maximizing $\mathrm{tr}(S_w^{-1} S_b)$ according to Eq. (19) and the Fisher criterion [35]. For extracting p features from the d × 1 original feature vector x, the p eigenvectors of $S_w^{-1} S_b$ associated with its largest p eigenvalues compose the projection matrix A. Thus, we have $y_{p \times 1} = A_{p \times d}\, x_{d \times 1}$. The proposed GMSDA learning method can not only extract effective features from an image set, but can also learn nonlinear correlation features with good discriminating power through MLC.

4. Experiments and analysis

In the experiments, the performance of all tested methods was quantitatively compared using the average classification accuracy (AA), the average classification validity (AV) and the overall accuracy (OA). The classification process was carried out in MATLAB 7.11 on a computer equipped with an Intel Core i7 processor at 3.40 GHz. To evaluate the capability of the GMSDA method in image set classification, the proposed GMSDA method, together with the common methods PCA, KPCA, MMP, NWFE and FSDA, was tested on five datasets.
Fig. 2. Overall accuracy measures for ORL dataset using different feature extraction methods and various number of features.
The AA, AV and OA were used in this comparison [12,33]; they are defined as follows:
$AA = \frac{1}{C} \sum_{c=1}^{C} \mathrm{ACC}(c)$    (20)
where ACC(c) = nc/Nc denotes classification accuracy of each subject class, nc denotes the number of pixels that are correctly classified and Nc denotes the number of test pixels in that class.
$AV = \frac{1}{C} \sum_{c=1}^{C} \mathrm{VAL}(c)$    (21)
where VAL(c) = n_c/m_c denotes the classification validity of each subject class, and m_c denotes the number of all the pixels labeled as class c in the output class map.
$OA = \frac{n}{N}$    (22)
where n denotes the number of all the pixels that are correctly classified and N denotes the number of all the pixels in the test set.
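As a small illustration (ours, not from the paper), Eqs. (20)–(22) can be computed from predicted and true labels as follows:

```python
import numpy as np

def aa_av_oa(y_true, y_pred, num_classes):
    """Average accuracy (Eq. (20)), average validity (Eq. (21)) and overall accuracy (Eq. (22))."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    acc, val = [], []
    for c in range(num_classes):
        n_c = np.sum((y_true == c) & (y_pred == c))    # correctly classified samples of class c
        acc.append(n_c / max(np.sum(y_true == c), 1))  # ACC(c) = n_c / N_c
        val.append(n_c / max(np.sum(y_pred == c), 1))  # VAL(c) = n_c / m_c
    return np.mean(acc), np.mean(val), np.mean(y_true == y_pred)  # AA, AV, OA
```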
Fig. 3. Overall accuracy measures for Yale dataset using different feature extraction methods and various number of features.
4.1. Image set data description

The ORL face dataset, also called the AT&T face dataset, consists of 400 images from 40 individuals. Each individual has 10 grayscale images with different lighting, facial expressions and details. The size of each image is 112 × 92 pixels with 256 gray levels. The first row of Fig. 1 shows some sample images from this database. q random (q = 3, 4, 5, 6, 7) samples per class are chosen for training and the rest are treated as testing samples.

The Yale database contains 165 images from 15 persons, and each person has 11 images. These images vary in facial expression and lighting conditions. Each image has size 100 × 100. The second row of Fig. 1 shows some sample images from this database. q random (q = 3, 4, 5, 6, 7) samples per class are chosen for training and the rest are treated as testing samples.

The AR database contains over 4000 color images from 126 individuals (male: 70, female: 56), and the database characteristics vary in terms of lighting conditions, occlusions and facial expressions. The original resolution of these images is 165 × 120. In our experiments, we use a subset of the AR database in which each person has 14 different images with illumination changes and different expressions. Since the number of individuals and the total number of samples are much larger, experiments on this database are much more significant. The images are further down-sampled to 50 × 40 for computational reasons. Some sample images from the AR database are listed in the third row of Fig. 1. q random (q = 6, 7, 8, 9, 10) samples per class are chosen for training and the rest are treated as testing samples.

The FERET face image database displays diversity across gender, ethnicity, and age. Since images are acquired during different photo sessions with varying illumination conditions and facial expressions, the size of the face may vary. The FERET database contains 14051 gray face images from 1199 different subjects. In our experiments, we use a subset of the FERET database which contains 200 individuals with 7 images per person. All the images in the subset are resized to 80 × 80 pixels. q random (q = 3, 4, 5) samples per class are chosen for training and the rest are treated as testing samples. Some resized images from the FERET database are shown in the fourth row of Fig. 1.

The COIL20 object dataset contains 1440 images of 20 different objects, and each object has 72 images that are obtained by rotating the object through 360 degrees at intervals of 5 degrees.
Fig. 4. Overall accuracy measures for AR dataset using different feature extraction methods and various number of features.
q random (q = 10, 15, 20, 25, 30) samples per class are chosen for training, and the rest are treated as testing samples. The last row of Fig. 1 shows some sample images from this database.

4.2. Experiments and results

In this subsection, the features extracted by PCA, KPCA, MMP, NWFE, FSDA and the proposed GMSDA method are fed into a maximum likelihood classifier (MLC), and the classification results are compared on the above five datasets. In the MMP and NWFE methods, we used the Gaussian kernel function and the regularization parameter p = 0.5, respectively. In order to demonstrate the capability of all the above-mentioned methods, different numbers of samples per class are used for training the MLC. Because of the random selection of training samples, each experiment is repeated 10 times and the average results are reported. In each run, the number of features is varied from 1 to 14 for all methods. The overall classification accuracy measures for the ORL, Yale, AR, FERET and COIL20 datasets, for each approach and versus the number of features, are shown in Figs. 2, 3, 4, 5 and 6, respectively. In the five experiments, we evaluate the performance of the proposed GMSDA in comparison with PCA, KPCA, MMP, NWFE and FSDA in the small sample size situation on the above five datasets. From Figs. 2–6, the superiority of the proposed GMSDA method over the competing algorithms is apparent. As shown in Figs. 2–6, as the number of features increases, the overall accuracy first increases dramatically, then reaches its highest value and subsequently remains unchanged, especially on the ORL and COIL20 datasets. When the number of features is very small, it is difficult for any method to achieve good accuracy. As the number of training samples increases, all the algorithms show an upward tendency in recognition rate, and the recognition rates of the compared algorithms gradually approach those of our algorithm in most cases. However, our algorithm always possesses the highest recognition rates.
Fig. 5. Overall accuracy measures for FERET dataset using different feature extraction methods and various number of features.
In addition, we also noticed that the overall accuracy of the NWFE algorithm improves very slowly. In summary, the proposed GMSDA method not only obtains higher recognition rates, but also captures the discriminating structure of the samples well. All the experimental results support the observation that our method is effective and robust, especially when the number of extracted features is 14.

In order to further evaluate the performance of the proposed GMSDA method, the AA, AV and OA measures and the standard deviations of accuracy with different numbers of training samples (the number of extracted features is 14) are shown in Tables 1–15 for the ORL, Yale, AR, FERET and COIL20 datasets, respectively. Tables 1–15 report the classification accuracy of each method for different numbers of training samples. We can see that the performance of all the compared methods improves as the number of training samples increases. The classification accuracy of the proposed GMSDA method is higher than that of the other five methods. Among all methods, the NWFE method always has the lowest classification rates, especially on the AR and FERET datasets, which reflects how the supervised information is exploited during feature extraction. It is clear that on the FERET dataset the classification accuracy of the PCA, KPCA, MMP, NWFE and FSDA methods is very low, that is to say these methods almost fail, whereas our algorithm always exceeds them in AA, AV and OA. In addition, the AA and OA values are the same in Tables 1–15, which follows from the underlying relationship between Eqs. (20) and (22). KPCA, as a kernel method, cannot reach the best performance either. Actually, in our GMSDA method, we transform the feature space twice: first, the original feature space is transformed to a new feature subspace by a primary projection, in which the features have more discrimination; then, the capability of class discrimination is further improved via discriminant analysis learning. The geometric mean-based vector can not only reveal the intrinsic structure of an image set, but can also help learn nonlinear correlation features with good discriminating power through MLC. Hence, all the experimental results on the five datasets reveal that GMSDA is an effective feature extraction algorithm in image set classification tasks.
Fig. 6. Overall accuracy measures for COIL20 dataset using different feature extraction methods and various number of features.

Table 1. The AA (%) and standard deviation of accuracy on the ORL dataset.
Train = 3: PCA 85.5357 ± 2.2849; KPCA 86.0714 ± 2.1030; MMP 81.4643 ± 1.9244; NWFE 29.8929 ± 2.1005; FSDA 84.6786 ± 2.3129; GMSDA 88.7143 ± 2.4433
Train = 4: PCA 88.2917 ± 2.3964; KPCA 89.3333 ± 2.6468; MMP 82.9167 ± 2.1757; NWFE 33.4167 ± 2.4887; FSDA 87.6250 ± 2.5276; GMSDA 92.0833 ± 2.1658
Train = 5: PCA 92.5000 ± 2.6151; KPCA 92.5500 ± 2.3182; MMP 84.2500 ± 2.5886; NWFE 35.8000 ± 2.5891; FSDA 91.6500 ± 2.7143; GMSDA 94.9500 ± 2.0058
Train = 6: PCA 94.0625 ± 1.8171; KPCA 94.6250 ± 2.0760; MMP 86.5000 ± 1.9455; NWFE 40.3125 ± 2.5063; FSDA 93.6875 ± 1.9084; GMSDA 95.8125 ± 2.0862
Train = 7: PCA 95.5833 ± 2.4273; KPCA 95.5000 ± 2.4086; MMP 86.8333 ± 2.7013; NWFE 45.0833 ± 3.5193; FSDA 95.1667 ± 2.5982; GMSDA 97.2500 ± 1.9299
Table 2. The AV (%) and standard deviation of accuracy on the ORL dataset.
Train = 3: PCA 87.9229 ± 2.2596; KPCA 88.0558 ± 1.9994; MMP 83.4757 ± 1.9907; NWFE 40.9733 ± 3.7664; FSDA 87.5973 ± 2.2791; GMSDA 91.3985 ± 2.4473
Train = 4: PCA 89.9646 ± 2.4234; KPCA 90.7055 ± 2.6248; MMP 84.9933 ± 2.0344; NWFE 41.2412 ± 3.0088; FSDA 89.7805 ± 2.4724; GMSDA 93.4169 ± 2.4121
Train = 5: PCA 93.9950 ± 2.4792; KPCA 93.7037 ± 2.3700; MMP 85.9983 ± 2.6229; NWFE 43.0651 ± 3.2495; FSDA 93.3682 ± 2.5715; GMSDA 96.0071 ± 1.8346
Train = 6: PCA 95.4381 ± 2.0559; KPCA 95.9065 ± 2.1947; MMP 88.4292 ± 2.2228; NWFE 45.7825 ± 3.3820; FSDA 94.8744 ± 2.3708; GMSDA 96.8619 ± 1.9324
Train = 7: PCA 96.5417 ± 2.7680; KPCA 96.4958 ± 2.8408; MMP 88.4208 ± 2.8051; NWFE 48.0437 ± 4.5303; FSDA 95.9708 ± 2.9282; GMSDA 98.0417 ± 1.7979
Table 3. The OA (%) and standard deviation of accuracy on the ORL dataset.
Train = 3: PCA 85.5357 ± 2.2849; KPCA 86.0714 ± 2.1030; MMP 81.4643 ± 1.9244; NWFE 29.8929 ± 2.1005; FSDA 84.6786 ± 2.3129; GMSDA 88.7143 ± 2.4433
Train = 4: PCA 88.2917 ± 2.3964; KPCA 89.3333 ± 2.6468; MMP 82.9167 ± 2.1757; NWFE 33.4167 ± 2.4887; FSDA 87.6250 ± 2.5276; GMSDA 92.0833 ± 2.1658
Train = 5: PCA 92.5000 ± 2.6151; KPCA 92.5500 ± 2.3182; MMP 84.2500 ± 2.5886; NWFE 35.8000 ± 2.5891; FSDA 91.6500 ± 2.7143; GMSDA 94.9500 ± 2.0058
Train = 6: PCA 94.0625 ± 1.8171; KPCA 94.6250 ± 2.0760; MMP 86.5000 ± 1.9455; NWFE 40.3125 ± 2.5063; FSDA 93.6875 ± 1.9084; GMSDA 95.8125 ± 2.0862
Train = 7: PCA 95.5833 ± 2.4273; KPCA 95.5000 ± 2.4086; MMP 86.8333 ± 2.7013; NWFE 45.0833 ± 3.5193; FSDA 95.1667 ± 2.5982; GMSDA 97.2500 ± 1.9299
Table 4. The AA (%) and standard deviation of accuracy on the Yale dataset.
Train = 3: PCA 72.3333 ± 4.1780; KPCA 73.1667 ± 2.9751; MMP 72.6667 ± 3.0367; NWFE 31.0000 ± 4.0111; FSDA 70.2500 ± 3.7773; GMSDA 90.9167 ± 3.0961
Train = 4: PCA 74.1905 ± 2.6973; KPCA 74.7619 ± 2.4852; MMP 74.5714 ± 2.4955; NWFE 33.9048 ± 4.0459; FSDA 74.0000 ± 2.6019; GMSDA 94.0000 ± 3.6642
Train = 5: PCA 75.3333 ± 2.6450; KPCA 76.4444 ± 2.9505; MMP 74.8889 ± 3.2360; NWFE 35.0000 ± 5.1299; FSDA 74.2222 ± 2.8935; GMSDA 95.6667 ± 3.6903
Train = 6: PCA 74.2667 ± 5.0088; KPCA 75.4667 ± 4.5851; MMP 72.9333 ± 3.6667; NWFE 35.3333 ± 4.7692; FSDA 74.6667 ± 4.7799; GMSDA 98.0000 ± 3.5813
Train = 7: PCA 78.5000 ± 3.0542; KPCA 78.6667 ± 3.0208; MMP 76.8333 ± 3.6986; NWFE 38.3333 ± 5.4013; FSDA 77.8333 ± 3.3479; GMSDA 98.8333 ± 2.4779
Table 5. The AV (%) and standard deviation of accuracy on the Yale dataset.
Train = 3: PCA 78.6181 ± 4.3774; KPCA 79.1749 ± 3.4286; MMP 80.5857 ± 3.2928; NWFE 37.3097 ± 6.6679; FSDA 76.6429 ± 3.9063; GMSDA 93.1890 ± 3.1483
Train = 4: PCA 79.0965 ± 2.7492; KPCA 80.0112 ± 2.6762; MMP 80.4470 ± 2.6885; NWFE 38.2009 ± 5.1246; FSDA 79.1216 ± 2.8103; GMSDA 95.4297 ± 3.3138
Train = 5: PCA 80.6402 ± 3.0035; KPCA 80.3421 ± 2.9633; MMP 80.6743 ± 3.3761; NWFE 39.7289 ± 6.2853; FSDA 79.5542 ± 2.8826; GMSDA 96.5683 ± 3.3980
Train = 6: PCA 79.2272 ± 4.6032; KPCA 80.1421 ± 4.4415; MMP 78.4988 ± 3.3867; NWFE 41.4588 ± 5.6317; FSDA 79.6026 ± 5.0857; GMSDA 98.4063 ± 3.0936
Train = 7: PCA 82.6143 ± 3.0891; KPCA 82.7937 ± 3.2615; MMP 82.1656 ± 2.9399; NWFE 41.0586 ± 7.4921; FSDA 82.0984 ± 3.6130; GMSDA 99.0667 ± 2.2026
Table 6. The OA (%) and standard deviation of accuracy on the Yale dataset.
Train = 3: PCA 72.3333 ± 4.1780; KPCA 73.1667 ± 2.9751; MMP 72.6667 ± 3.0367; NWFE 31.0000 ± 4.0111; FSDA 70.2500 ± 3.7773; GMSDA 90.9167 ± 3.0961
Train = 4: PCA 74.1905 ± 2.6973; KPCA 74.7619 ± 2.4852; MMP 74.5714 ± 2.4955; NWFE 33.9048 ± 4.0459; FSDA 74.0000 ± 2.6019; GMSDA 94.0000 ± 3.6642
Train = 5: PCA 75.3333 ± 2.6450; KPCA 76.4444 ± 2.9505; MMP 74.8889 ± 3.2360; NWFE 35.0000 ± 5.1299; FSDA 74.2222 ± 2.8935; GMSDA 95.6667 ± 3.6903
Train = 6: PCA 74.2667 ± 5.0088; KPCA 75.4667 ± 4.5851; MMP 72.9333 ± 3.6667; NWFE 35.3333 ± 4.7692; FSDA 74.6667 ± 4.7799; GMSDA 98.0000 ± 3.5813
Train = 7: PCA 78.5000 ± 3.0542; KPCA 78.6667 ± 3.0208; MMP 76.8333 ± 3.6986; NWFE 38.3333 ± 5.4013; FSDA 77.8333 ± 3.3479; GMSDA 98.8333 ± 2.4779
Table 7. The AA (%) and standard deviation of accuracy on the AR dataset.
Train = 6: PCA 63.7292 ± 1.2049; KPCA 58.8750 ± 1.1031; MMP 76.8229 ± 0.9546; NWFE 13.2292 ± 0.8247; FSDA 63.2812 ± 1.2297; GMSDA 95.0938 ± 2.5024
Train = 7: PCA 67.5000 ± 1.4545; KPCA 62.6190 ± 1.5097; MMP 79.4405 ± 1.1762; NWFE 13.3452 ± 0.7510; FSDA 67.0714 ± 1.4983; GMSDA 95.8214 ± 2.4160
Train = 8: PCA 68.7361 ± 1.5012; KPCA 63.4583 ± 1.7261; MMP 81.0556 ± 1.5811; NWFE 13.5556 ± 0.8216; FSDA 68.3611 ± 1.5574; GMSDA 96.5278 ± 2.1542
Train = 9: PCA 70.9167 ± 1.6484; KPCA 66.2500 ± 1.4892; MMP 81.9833 ± 1.6650; NWFE 14.4833 ± 1.0721; FSDA 70.4000 ± 1.7432; GMSDA 96.9667 ± 1.8753
Train = 10: PCA 72.6250 ± 1.7367; KPCA 67.8542 ± 1.7962; MMP 83.6042 ± 1.4148; NWFE 15.2500 ± 0.8700; FSDA 72.6875 ± 1.6394; GMSDA 97.2917 ± 1.9150
Table 8. The AV (%) and standard deviation of accuracy on the AR dataset.
Train = 6: PCA 67.9371 ± 1.2044; KPCA 62.5508 ± 1.2689; MMP 79.9441 ± 0.8578; NWFE 14.9038 ± 1.0947; FSDA 67.5095 ± 1.2368; GMSDA 95.6804 ± 2.5216
Train = 7: PCA 71.0687 ± 1.3327; KPCA 65.8409 ± 1.3952; MMP 82.4627 ± 1.2269; NWFE 14.5639 ± 0.9558; FSDA 70.8981 ± 1.3442; GMSDA 96.4136 ± 2.4374
Train = 8: PCA 72.8894 ± 1.6340; KPCA 67.3076 ± 1.7931; MMP 83.8568 ± 1.3681; NWFE 15.1252 ± 0.9296; FSDA 72.6045 ± 1.6950; GMSDA 97.0825 ± 2.0879
Train = 9: PCA 75.3720 ± 1.6151; KPCA 70.0986 ± 1.6424; MMP 84.7930 ± 1.6622; NWFE 15.2490 ± 1.4294; FSDA 74.7099 ± 1.6937; GMSDA 97.4710 ± 1.8735
Train = 10: PCA 77.4583 ± 1.9570; KPCA 71.5847 ± 2.0989; MMP 86.0922 ± 1.4569; NWFE 16.2778 ± 1.2227; FSDA 77.4378 ± 1.9127; GMSDA 97.8444 ± 1.8975
Table 9. The OA (%) and standard deviation of accuracy on the AR dataset.
Train = 6: PCA 63.7292 ± 1.2049; KPCA 58.8750 ± 1.1031; MMP 76.8229 ± 0.9546; NWFE 13.2292 ± 0.8247; FSDA 63.2812 ± 1.2297; GMSDA 95.0938 ± 2.5024
Train = 7: PCA 67.5000 ± 1.4545; KPCA 62.6190 ± 1.5097; MMP 79.4405 ± 1.1762; NWFE 13.3452 ± 0.7510; FSDA 67.0714 ± 1.4983; GMSDA 95.8214 ± 2.4160
Train = 8: PCA 68.7361 ± 1.5012; KPCA 63.4583 ± 1.7261; MMP 81.0556 ± 1.5811; NWFE 13.5556 ± 0.8216; FSDA 68.3611 ± 1.5574; GMSDA 96.5278 ± 2.1542
Train = 9: PCA 70.9167 ± 1.6484; KPCA 66.2500 ± 1.4892; MMP 81.9833 ± 1.6650; NWFE 14.4833 ± 1.0721; FSDA 70.4000 ± 1.7432; GMSDA 96.9667 ± 1.8753
Train = 10: PCA 72.6250 ± 1.7367; KPCA 67.8542 ± 1.7962; MMP 83.6042 ± 1.4148; NWFE 15.2500 ± 0.8700; FSDA 72.6875 ± 1.6394; GMSDA 97.2917 ± 1.9150
Table 10. The AA (%) and standard deviation of accuracy on the FERET dataset.
Train = 3: PCA 27.2875 ± 1.1201; KPCA 25.6250 ± 0.9712; MMP 17.4125 ± 0.7693; NWFE 6.2250 ± 0.6602; FSDA 27.5875 ± 1.2986; GMSDA 77.3500 ± 1.3416
Train = 4: PCA 30.9500 ± 1.3561; KPCA 28.2333 ± 1.1883; MMP 20.1500 ± 1.0282; NWFE 4.9667 ± 0.3967; FSDA 31.0000 ± 1.5702; GMSDA 80.8833 ± 1.4779
Train = 5: PCA 34.5250 ± 1.5794; KPCA 30.6750 ± 1.2622; MMP 21.6750 ± 1.1929; NWFE 5.3250 ± 0.5474; FSDA 35.7500 ± 1.6377; GMSDA 82.7750 ± 1.7474
Table 11. The AV (%) and standard deviation of accuracy on the FERET dataset.
Train = 3: PCA 33.2121 ± 1.4925; KPCA 30.2874 ± 1.3175; MMP 21.0628 ± 1.1074; NWFE 8.2780 ± 0.7809; FSDA 33.7943 ± 1.6133; GMSDA 80.8458 ± 1.4752
Train = 4: PCA 35.6338 ± 1.5861; KPCA 31.8590 ± 1.5335; MMP 24.0608 ± 1.5032; NWFE 5.6068 ± 0.5531; FSDA 35.7926 ± 1.8342; GMSDA 83.7304 ± 1.4529
Train = 5: PCA 36.2025 ± 2.0657; KPCA 31.7025 ± 1.5642; MMP 22.4331 ± 1.4607; NWFE 5.3977 ± 0.6844; FSDA 37.2562 ± 2.0546; GMSDA 84.9468 ± 1.6949
Table 12. The OA (%) and standard deviation of accuracy on the FERET dataset.
Train = 3: PCA 27.2875 ± 1.1201; KPCA 25.6250 ± 0.9712; MMP 17.4125 ± 0.7693; NWFE 6.2250 ± 0.6602; FSDA 27.5875 ± 1.2986; GMSDA 77.3500 ± 1.3416
Train = 4: PCA 30.9500 ± 1.3561; KPCA 28.2333 ± 1.1883; MMP 20.1500 ± 1.0282; NWFE 4.9667 ± 0.3967; FSDA 31.0000 ± 1.5702; GMSDA 80.8833 ± 1.4779
Train = 5: PCA 34.5250 ± 1.5794; KPCA 30.6750 ± 1.2622; MMP 21.6750 ± 1.1929; NWFE 5.3250 ± 0.5474; FSDA 35.7500 ± 1.6377; GMSDA 82.7750 ± 1.7474
Table 13. The AA (%) and standard deviation of accuracy on the COIL20 dataset.
Train = 10: PCA 91.5242 ± 1.2863; KPCA 90.3790 ± 1.1960; MMP 88.5000 ± 0.9013; NWFE 72.9597 ± 1.7199; FSDA 90.7016 ± 1.4329; GMSDA 92.4839 ± 2.2927
Train = 15: PCA 95.0000 ± 1.0242; KPCA 93.6579 ± 1.1349; MMP 89.2018 ± 0.9561; NWFE 80.8596 ± 1.7133; FSDA 94.4474 ± 1.1299; GMSDA 94.9386 ± 2.1411
Train = 20: PCA 96.3750 ± 0.7875; KPCA 95.0096 ± 0.9478; MMP 89.7404 ± 0.7239; NWFE 85.9615 ± 1.5344; FSDA 95.7596 ± 1.0145; GMSDA 96.5577 ± 1.7568
Train = 25: PCA 97.6596 ± 0.8670; KPCA 96.6170 ± 0.6755; MMP 90.3617 ± 0.9507; NWFE 89.4255 ± 1.5007; FSDA 97.3511 ± 0.9515; GMSDA 97.9362 ± 1.2869
Train = 30: PCA 98.7262 ± 0.6492; KPCA 97.8333 ± 0.8088; MMP 90.8095 ± 0.7904; NWFE 91.2024 ± 1.2049; FSDA 98.4048 ± 0.8510; GMSDA 98.3333 ± 1.5561
Table 14. The AV (%) and standard deviation of accuracy on the COIL20 dataset.
Train = 10: PCA 91.9962 ± 1.2145; KPCA 90.5411 ± 1.1313; MMP 88.9455 ± 0.8939; NWFE 75.7252 ± 1.7227; FSDA 91.2285 ± 1.4997; GMSDA 93.3902 ± 2.0568
Train = 15: PCA 95.2601 ± 1.0452; KPCA 93.7797 ± 1.1330; MMP 89.8214 ± 0.9000; NWFE 82.0860 ± 1.5745; FSDA 94.8122 ± 1.2100; GMSDA 95.5765 ± 2.0396
Train = 20: PCA 96.6132 ± 0.7781; KPCA 95.2053 ± 0.9339; MMP 90.1894 ± 0.7863; NWFE 86.7283 ± 1.4426; FSDA 96.0585 ± 0.9910; GMSDA 96.8216 ± 1.5990
Train = 25: PCA 97.8259 ± 0.8078; KPCA 96.7905 ± 0.6466; MMP 90.9049 ± 0.8938; NWFE 89.9766 ± 1.4124; FSDA 97.5264 ± 0.9300; GMSDA 98.0896 ± 1.2326
Train = 30: PCA 98.7931 ± 0.6231; KPCA 97.9107 ± 0.8008; MMP 91.4647 ± 0.8762; NWFE 91.5726 ± 1.2773; FSDA 98.5012 ± 0.8312; GMSDA 98.4151 ± 1.5569
5. Conclusion

In this paper, a robust geometric mean-based subspace discriminant analysis feature extraction method is proposed for image set classification. Different from the traditional methods, the proposed GMSDA can well capture the discriminating structure of samples by using the geometric mean-based vector with class label information.
Table 15. The OA (%) and standard deviation of accuracy on the COIL20 dataset.
Train = 10: PCA 91.5242 ± 1.2863; KPCA 90.3790 ± 1.1960; MMP 88.5000 ± 0.9013; NWFE 72.9597 ± 1.7199; FSDA 90.7016 ± 1.4329; GMSDA 92.4839 ± 2.2927
Train = 15: PCA 95.0000 ± 1.0242; KPCA 93.6579 ± 1.1349; MMP 89.2018 ± 0.9561; NWFE 80.8596 ± 1.7133; FSDA 94.4474 ± 1.1299; GMSDA 94.9386 ± 2.1411
Train = 20: PCA 96.3750 ± 0.7875; KPCA 95.0096 ± 0.9478; MMP 89.7404 ± 0.7239; NWFE 85.9615 ± 1.5344; FSDA 95.7596 ± 1.0145; GMSDA 96.5577 ± 1.7568
Train = 25: PCA 97.6596 ± 0.8670; KPCA 96.6170 ± 0.6755; MMP 90.3617 ± 0.9507; NWFE 89.4255 ± 1.5007; FSDA 97.3511 ± 0.9515; GMSDA 97.9362 ± 1.2869
Train = 30: PCA 98.7262 ± 0.6492; KPCA 97.8333 ± 0.8088; MMP 90.8095 ± 0.7904; NWFE 91.2024 ± 1.2049; FSDA 98.4048 ± 0.8510; GMSDA 98.3333 ± 1.5561
The proposed GMSDA method needs to transform the original data to a new subspace, so GMSDA uses two projection matrices for feature extraction. The primary projection matrix is used to maximize the geometric between-class scatter matrix. The secondary projection matrix is used to maximize the separability between classes in the feature subspace obtained by the first step. To evaluate our algorithm, we design extensive experiments on the five image datasets and analyze the performance in terms of AA, AV and OA. All the experimental results demonstrate the superiority of our algorithm in real-world image classification tasks.

Acknowledgements

This work is partially supported by the Doctoral Research Foundation of Jining Medical University under Grant No. 2018JYQD03, a Project of Shandong Province Higher Educational Science and Technology Program under Grant No. J18KA217, the Graduate Innovation Foundation of Jiangsu Province under Grant No. KYLX16_0781, the 111 Project under Grant No. B12018, and PAPD of Jiangsu Higher Education Institutions, China.

References

[1] A. Wagner, J. Wright, A. Ganesh, et al., Toward a practical face recognition system: Robust alignment and illumination by sparse representation, IEEE Trans. Pattern Anal. Mach. Intell. 34 (2) (2012) 372–386, https://doi.org/10.1109/TPAMI.2011.112.
[2] J.Y. Choi, W. De Neve, K.N. Plataniotis, et al., Collaborative face recognition for improved face annotation in personal photo collections shared on online social networks, IEEE Trans. Multim. 13 (1) (2011) 14–28, https://doi.org/10.1109/tmm.2010.2087320.
[3] S. Zeng, Y. Xiong, Weighted average integration of sparse representation and collaborative representation for robust face recognition, Comput. Vis. Med. 2 (4) (2016) 357–365, https://doi.org/10.1007/s41095-016-0061-5.
[4] C.L. Chen, A. Mahjoubfar, L.C. Tai, et al., Deep learning in label-free cell classification, Sci. Rep. 6 (2016) 21471, https://doi.org/10.1038/srep21471.
[5] X. Gao, Q. Sun, H. Xu, et al., Multi-model fusion metric learning for image set classification, Knowl. Based Syst. 164 (2019) 253–264, https://doi.org/10.1016/j.knosys.2018.10.043.
[6] M. Imani, H. Ghassemian, Binary coding based feature extraction in remote sensing high dimensional data, Inf. Sci. 342 (2016) 191–208, https://doi.org/10.1016/j.ins.2016.01.032.
[7] M. Turk, A. Pentland, Eigenfaces for recognition, J. Cogn. Neurosci. 3 (1) (1991) 71–86, https://doi.org/10.1162/jocn.1991.3.1.71.
[8] P.N. Belhumeur, J.P. Hespanha, D.J. Kriegman, Eigenfaces vs. fisherfaces: recognition using class specific linear projection, IEEE Trans. Pattern Anal. Mach. Intell. (7) (1997) 711–720, https://doi.org/10.1109/34.598228.
[9] B.C. Kuo, D.A. Landgrebe, Nonparametric weighted feature extraction for classification, IEEE Trans. Geosci. Remote Sens. 42 (5) (2004) 1096–1105, https://doi.org/10.1109/TGRS.2004.825578.
[10] B. Scholkopf, A. Smola, K.R. Muller, Nonlinear component analysis as a kernel eigenvalue problem, Neural Comput. 10 (5) (1998) 1299–1319, https://doi.org/10.1162/089976698300017467.
[11] G. Baudat, F. Anouar, Generalized discriminant analysis using a kernel approach, Neural Comput. 12 (10) (2000) 2385–2404, https://doi.org/10.1162/089976600300014980.
[12] L. Li, H. Ge, J. Gao, et al., Hyperspectral image feature extraction using Maclaurin series function curve fitting, Neural Process. Lett. 49 (2019) 357–374, https://doi.org/10.1007/s11063-018-9825-5.
[13] G.F. Lu, Y. Wang, J. Zou, et al., Matrix exponential based discriminant locality preserving projections for feature extraction, Neural Netw. 97 (2018) 127–136, https://doi.org/10.1016/j.neunet.2017.09.014.
[14] B. Li, J. Du, X.P. Zhang, Feature extraction using maximum nonparametric margin projection, Neurocomputing 188 (2016) 225–232, https://doi.org/10.1016/j.neucom.2014.11.105.
[15] L. Shen, L. Bai, A review on Gabor wavelets for face recognition, Pattern Anal. Appl. 9 (2-3) (2006) 273–292, https://doi.org/10.1007/s10044-006-0033-y.
[16] S. Meshgini, A. Aghagolzadeh, H. Seyedarabi, Face recognition using Gabor-based direct linear discriminant analysis and support vector machine, Comput. Electr. Eng. 39 (3) (2013) 727–745, https://doi.org/10.1016/j.compeleceng.2012.12.011.
[17] L. Li, H. Ge, Y. Tong, et al., Face recognition using Gabor-based feature extraction and feature space transformation fusion method for single image per person problem, Neural Process. Lett. 47 (3) (2018) 1197–1217, https://doi.org/10.1007/s11063-017-9693-4.
[18] Q.S. Sun, S.G. Zeng, Y. Liu, et al., A new method of feature fusion and its application in image recognition, Pattern Recognit. 38 (12) (2005) 2437–2448, https://doi.org/10.1016/j.patcog.2004.12.013.
[19] X. Gao, Q. Sun, H. Xu, Multiple-rank supervised canonical correlation analysis for feature extraction, fusion and recognition, Expert Syst. Appl. 84 (2017) 171–185, https://doi.org/10.1016/j.eswa.2017.05.017.
[20] X. Gao, Q. Sun, H. Xu, et al., Multi-model fusion metric learning for image set classification, Knowl. Based Syst. 164 (2019) 253–264, https://doi.org/10.1016/j.knosys.2018.10.043.
[21] T. Hastie, R. Tibshirani, Discriminant adaptive nearest neighbor classification, IEEE Trans. Pattern Anal. Mach. Intell. 18 (6) (1996) 607–616, https://doi.org/10.1109/34.506411.
[22] V.N. Vapnik, The nature of statistical learning theory, IEEE Trans. Neural Netw. 8 (6) (1995) 1564, https://doi.org/10.1007/978-1-4757-2440-0.
[23] R. Khemchandani, S. Chandra, Twin support vector machines for pattern classification, IEEE Trans. Pattern Anal. Mach. Intell. 29 (5) (2007) 905–910, https://doi.org/10.1142/9789812813220_0009.
[24] X. Gao, L. Fan, H. Xu, A novel method for classification of matrix data using Twin Multiple Rank SMMs, Appl. Soft Comput. 48 (2016) 546–562, https://doi.org/10.1016/j.asoc.2016.07.003.
[25] C. Ding, D. Tao, Trunk-branch ensemble convolutional neural networks for video-based face recognition, IEEE Trans. Pattern Anal. Mach. Intell. 40 (4) (2018) 1002–1014, https://doi.org/10.1109/TPAMI.2017.2700390.
[26] I. Masi, F.J. Chang, J. Choi, et al., Learning pose-aware models for pose-invariant face recognition in the wild, IEEE Trans. Pattern Anal. Mach. Intell. 41 (2) (2019) 379–393, https://doi.org/10.1109/TPAMI.2018.2792452.
[27] A. Mahmood, A. Mian, R. Owens, Semi-supervised spectral clustering for image set classification, Proc. IEEE Conf. Comput. Vis. Pattern Recognit. 2014 (2014) 121–128, https://doi.org/10.1109/CVPR.23.
[28] H. Tan, Z. Ma, S. Zhang, et al., Grassmann manifold for nearest points image set classification, Pattern Recognit. Lett. 68 (2015) 190–196, https://doi.org/10.1016/j.patrec.2015.09.008.
[29] Z. Huang, R. Wang, S. Shan, et al., Projection metric learning on Grassmann manifold with application to video based face recognition, Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (2015) 140–149, https://doi.org/10.1109/CVPR.2015.7298609.
[30] J. Lu, G. Wang, P. Moulin, Image set classification using holistic multiple order statistics features and localized multi-kernel metric learning, Proc. IEEE Int. Conf. Comput. Vis. (2013) 329–336, https://doi.org/10.1109/ICCV.2013.48.
[31] Q.S. Zeng, J.H. Lai, C.D. Wang, Multi-local model image set matching based on domain description, Pattern Recognit. 47 (2) (2014) 694–704, https://doi.org/10.1016/j.patcog.2013.08.025.
[32] P. Zheng, Z.Q. Zhao, J. Gao, et al., Image set classification based on cooperative sparse representation, Pattern Recognit. 63 (2017) 206–217, https://doi.org/10.1016/j.patcog.2016.09.043.
[33] M. Imani, H. Ghassemian, Feature space discriminant analysis for hyperspectral data feature reduction, ISPRS J. Photogramm. Remote Sens. 102 (2015) 1–13, https://doi.org/10.1016/j.isprsjprs.2014.12.024.
[34] H.A. Tayali, S. Tolun, Dimension reduction in mean-variance portfolio optimization, Expert Syst. Appl. 92 (2018) 161–169, https://doi.org/10.1016/j.eswa.2017.09.009.
[35] L.F. Chen, H.Y.M. Liao, M.T. Ko, et al., A new LDA-based face recognition system which can solve the small sample size problem, Pattern Recognit. 33 (10) (2000) 1713–1726, https://doi.org/10.1016/s0031-3203(99)00139-9.
[36] N. Zheng, X. Guo, Y. Tie, et al., Incremental generalized multiple maximum scatter difference with applications to feature extraction, J. Vis. Commun. Image Representation 55 (2018) 67–79, https://doi.org/10.1016/j.jvcir.2018.04.012.
[37] S. Su, H. Ge, Y.H. Yuan, A label embedding kernel method for multi-view canonical correlation analysis, Multimedia Tools Appl. 76 (12) (2017) 13785–13803, https://doi.org/10.1007/s11042-016-3786-3.