Gait recognition using Pose Kinematics and Pose Energy Image




Signal Processing 92 (2012) 780–792



Aditi Roy (a), Shamik Sural (a), Jayanta Mukherjee (b)

(a) School of Information Technology, Indian Institute of Technology Kharagpur, India
(b) Department of CSE, Indian Institute of Technology Kharagpur, India


Abstract

Article history: Received 21 March 2011; Received in revised form 30 June 2011; Accepted 22 September 2011; Available online 29 September 2011.

Many of the existing gait recognition approaches represent a gait cycle using a single 2D image called the Gait Energy Image (GEI) or one of its variants. Since these methods suffer from a lack of dynamic information, we model a gait cycle as a chain of key poses and extract a novel feature called the Pose Energy Image (PEI). PEI is the average image of all the silhouettes in a key pose state of a gait cycle. By increasing the resolution of the gait representation, more detailed dynamic information can be captured. However, the processing speed and space requirements are higher for PEI than for the conventional GEI methods. To overcome this shortcoming, another novel feature named Pose Kinematics is introduced, which represents the percentage of time spent in each key pose state over a gait cycle. Although the Pose Kinematics based method is fast, its accuracy is not very high. A hierarchical method for combining these two features is, therefore, proposed: at first, Pose Kinematics is applied to select a set of most probable classes, and PEI is then used on these selected classes to get the final classification. Experimental results on the CMU MoBo and USF HumanID data sets show that the proposed approach outperforms existing approaches.

Keywords: Gait recognition; Pose Kinematics; Pose Energy Image; Dynamic programming; Gait Energy Image

1. Introduction

Over the last decade, gait recognition has become a promising research topic in biometrics, since it offers unique advantages, such as being non-contact and non-invasive, compared to traditional biometric features like face, iris, and fingerprint. Gait recognition refers to verifying and/or identifying persons by their walking style. Combined with other biometric based human recognition methods, gait recognition holds promise as a tool in visual surveillance, tracking, monitoring, forensics, etc., since it provides a reliable and efficient means of identity verification. Gait recognition approaches are mainly classified into three types [1], namely, model based approaches, appearance based approaches, and spatiotemporal approaches.


While model based methods are generally view and scale invariant, their use is still limited due to currently imperfect vision techniques (e.g., tracking and localizing the human body accurately in 2D or 3D space has long been a challenging and unsolved problem), the requirement of good quality silhouettes, and high computational cost. Most current approaches are appearance based [3-8]; they directly use the silhouettes of gait sequences for feature extraction without developing any model. These approaches are suitable for practical applications since they operate on binary silhouettes, which may be of low quality, and require no color or texture information. While appearance based approaches deal with spatial information alone, spatiotemporal approaches deal with information in both the spatial and temporal domains [11].

Among the appearance based approaches, temporal template based gait features have received significant attention due to their simple, robust representation and good recognition accuracy. These approaches compress


a gait cycle into one static image. The ability to reduce the effect of random noise in silhouettes by averaging makes the representation robust, and since human motion is represented by a single 2D image, storage space and computational cost are also reduced. Thus, the temporal template is considered an effective gait representation. Bobick and Davis [2] first proposed the Motion Energy Image (MEI) and Motion History Image (MHI) for robust static representation of human motion. Han and Bhanu [3] later extended this idea to gait representation and developed a new feature named the Gait Energy Image (GEI). GEI, obtained by averaging the silhouettes over a gait cycle, reflects the major shapes of the silhouettes and their changes over the cycle, and thus keeps both static and motion information. A higher intensity pixel in GEI indicates a higher frequency of occurrence of the corresponding body part at that position. For uncorrelated and identically distributed noise, GEI achieves encouraging performance in gait recognition. However, since GEI represents a gait sequence by a single image, it loses the intrinsic dynamic characteristics and detailed information of the gait pattern.

To capture the motion information more accurately, the Gait History Image (GHI) [4] was proposed, which preserves temporal information in addition to spatial information. Later, the Gait Moment Image (GMI) [5] was developed, which is the gait probability image at each key moment of all gait cycles: the gait images over all gait cycles corresponding to a key moment are averaged as the GEI of that key moment. The number of key moments is typically a few dozen, and it is not easy to select the key moments from gait cycles with different periods. In the search for better representation of the dynamic information, other methods were proposed by directly modifying the basic GEI, namely the Enhanced GEI (EGEI) [6], the Frame Difference Energy Image (FDEI) [7], and the Active Energy Image [8]; better performance was reported than with the conventional GEI method.

The above-mentioned methods remain constrained by factors like clothing and carried objects, and there is still scope for improvement. This motivated us to investigate how the dynamic information can be represented in a better way. To capture motion information at higher resolution, we introduce a novel gait representation called the Pose Energy Image (PEI). As a result of the increased resolution of the gait representation, recognition based on this feature alone becomes slow. To increase the processing speed, we propose another new feature named Pose Kinematics, which captures pure dynamic information without considering silhouette shape. This second feature is used to reduce the search space of the PEI based method by selecting the P most probable classes based on dynamics alone; the PEI based approach is then applied on these selected classes to get the end result.

The rest of the paper is organized as follows. Section 2 introduces the proposed approach, Section 3 presents and analyzes the experimental results, and Section 4 concludes the paper.

2. Proposed approach

The motivation behind this work is to better capture the temporal variation in gait feature representation.


Since the original GEI representation does not properly preserve intrinsic dynamic information, it is found to be less discriminative. Here we represent a gait cycle by a series of K key poses. A Pose Energy Image (PEI), as proposed in this paper, is the average image of all the silhouettes in a gait cycle that belong to a particular pose state; thus, for K key pose states, we get K PEIs. PEI is designed to increase the resolution of GEI and is hence able to capture detailed motion information. The intensity variation within a PEI is low, since each PEI represents centralized motion. The series of PEIs representing a gait cycle therefore captures how the shape changes in the different phases of the gait cycle and the amount of change in each phase; increasing the resolution helps to capture minute dynamic information. But this representation requires higher computational complexity and storage space than the conventional GEI, so processing becomes slower. To alleviate this problem, we introduce another novel feature, named Pose Kinematics (PK), which captures pure dynamic information. It is defined as the percentage of time spent in each of the K key pose states; this percentage differs between subjects, which determines their unique dynamic characteristics. Thus, the two features separately capture the two key components of gait, namely shape and dynamics. Pose Kinematics can be computed very fast, but the discriminative power of the dynamics based feature alone is not as high; shape plays an important role in gait feature representation, as reported in the literature [1]. Keeping this in mind, we combine Pose Kinematics and PEI in a way that achieves both high accuracy and high efficiency.

Algorithm 1 describes the overall method for gait recognition using Pose Kinematics and PEI. The proposed approach can be divided into two broad parts. The first is key pose estimation and classification of silhouettes into the key pose classes (step 1 in Algorithm 1). The second consists of gait feature vector computation using PK and PEI, feature space reduction, and classification (steps 2 and 3 in Algorithm 1).

The first part is described in detail in Sections 2.1 and 2.2; Fig. 1 shows the corresponding block diagram. In the training stage, the training silhouette sequence is first projected into the eigenspace, and K-means clustering is then applied on the transformed silhouette images to determine the key poses. During testing, the test silhouettes are first projected into the eigenspace to get the projected feature vectors, similarity values between each test silhouette and the key poses are determined, and a dynamic programming based longest path search is applied to compute the key pose label of each test silhouette.

The second part is covered in Sections 2.3-2.5; the flow chart of Fig. 2 shows the steps followed in this stage. From the labeled training sequence, we first compute the PK and PEI features, and a subspace method is then applied to the PEI feature vector to reduce the feature dimension. During the test phase, PK features are extracted from the labeled test silhouette sequence, and classification is performed based on these features alone only if the similarity value is high enough. Otherwise, PK is used to




Fig. 1. Block diagram of key pose estimation and silhouette classification into the estimated key pose classes.


Fig. 2. Flow chart of human recognition method using proposed PEI and PK features.

select a set of most probable classes. Then PEI features are extracted from the test silhouettes and dimension reduction is performed using the transformation matrix obtained from the training data. Finally, classification is done by comparing the test PEI feature vector with the PEI feature vectors of the selected classes.

Algorithm 1. Complete gait recognition algorithm.

Step 1: Key pose estimation and silhouette classification into key poses
  Input: Training silhouettes (I_1, ..., I_N), eigenvector count (K), key pose count (K), test silhouettes (F_1, ..., F_T)
  Output: Key poses (P_1, ..., P_K), silhouette class labels
  Step 1.1: Apply eigenspace projection to the input silhouettes (I_i, i ∈ N) to obtain the K EigenSilhouettes (u_i, i ∈ K)
  Step 1.2: Apply K-means clustering on the EigenSilhouettes to get the K key poses representing a gait cycle (P_1, ..., P_K)
  Step 1.3: Project the test silhouettes (F_t, t ∈ T) into the eigenspace
  Step 1.4: Compute the match scores between all projected test silhouettes and the key poses P_1, ..., P_K
  Step 1.5: Apply graph based path searching to find the key pose class label of each test silhouette

Step 2: Gait feature extraction and dimension reduction
  Input: Test silhouettes (F_1, ..., F_T) with pose class labels, gait cycle length (GC), key pose count (K)
  Output: Pose Kinematics (PK), projected PEI
  Step 2.1: Compute the ith (i ∈ K) element of PK as PK_i = (1/GC) \sum_{t=1}^{GC} 1 if F_t ∈ P_i
  Step 2.2: Compute the ith Pose Energy Image as PEI_i(x, y) = (1/(GC · PK_i)) \sum_{t=1}^{GC} F_t(x, y) if F_t ∈ P_i
  Step 2.3: Apply subspace algorithms to the PEIs to obtain the projected low dimensional PEI

Step 3: Human recognition
  Input: PK and projected PEI
  Output: Most probable subject class label
  Step 3.1: Apply the PK based nearest neighbor classifier to select the P highest ranked classes
  Step 3.2: Apply the PEI based classifier on the selected P subject classes and compute the most probable class

2.1. Key pose estimation

Before computing the proposed features, it is first necessary to define the key poses and classify the input silhouette sequence into these key pose classes. Since no standard way of determining the number of key poses and their characteristics is available, unsupervised learning, namely constrained K-means clustering, is applied to choose the key poses.


Instead of applying K-means clustering in the silhouette space directly, PCA is first applied to the silhouette image sequences. PCA computes a smaller set of orthogonal vectors that preserves the observed total variance better than the original feature space. K-means clustering is then applied on the PCA transformed feature vectors. These steps are described in detail in the next subsections.

2.1.1. Eigenspace projection

In this section we discuss step 1.1 of Algorithm 1 in detail. First, eigenspace projection is applied to find the principal components, or eigenvectors, of the silhouette image set, which are termed EigenSilhouettes due to their silhouette-like appearance. Since only a subset of these EigenSilhouettes is important in encoding the variation in silhouette images, we select the most significant K EigenSilhouettes. Let there be N training silhouette images I_1, I_2, ..., I_N of size S = W × H, where W is the width and H the height of a silhouette image. Each image I_i ∈ I is represented as a column vector Γ_i of size S × 1, where I denotes the training image set. The mean silhouette vector Ψ is computed as follows:

\Psi = \frac{1}{N} \sum_{i=1}^{N} \Gamma_i    (1)

We then find the covariance matrix C as follows:

C = \frac{1}{N} \sum_{n=1}^{N} (\Gamma_n - \Psi)(\Gamma_n - \Psi)^T = \frac{1}{N} \sum_{n=1}^{N} \Phi_n \Phi_n^T = A A^T    (2)

where Φ_n = Γ_n − Ψ and A = [Φ_1, Φ_2, ..., Φ_N]. Computing the eigenvectors u_i directly from C (of size S × S) is intractable for typical image sizes [18]. To handle this problem, we first compute the eigenvectors of the much smaller A^T A matrix of size N × N and take linear combinations of the mean-subtracted silhouette images Φ_i. Thus we get N eigenvectors v_i, i = 1, 2, ..., N, each of dimension N × 1. From matrix properties, the eigenvectors u_i, i = 1, 2, ..., N, of the covariance matrix C = A A^T are obtained as u_i = A v_i [18]. Since the dimension of A is S × N, the dimension of u_i is S × 1; each u_i is normalized such that ||u_i|| = 1. Thus, N eigenvectors of C are obtained. Since the number of eigenvectors is still large and not all of them carry significant information, we select the K most significant eigenvectors: the eigenvalues are sorted in decreasing order, and the K eigenvectors that account for more than 90% of the variance are selected. Once the eigenvectors are computed, we find the weight vectors, also known as silhouette space images, as follows:

\Omega_i = u^T \Phi_i, \quad i = 1, 2, \ldots, N    (3)

where u = [u_1, u_2, ..., u_K], K ≤ N, and the silhouette space image is Ω_i = [w_1, w_2, ..., w_K]^T. Each mean-subtracted silhouette Φ_i in the training set can now be represented as a linear combination of the EigenSilhouettes u_j:

\Phi_i = \sum_{j=1}^{K} w_j u_j    (4)
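To make the construction concrete, the following is a minimal NumPy sketch (ours, not the authors' code) of the EigenSilhouette computation via the small A^T A matrix described above; the function name and the 90% variance cut-off parameter are illustrative assumptions.

```python
import numpy as np

def eigen_silhouettes(images, var_ratio=0.90):
    """Sketch of Section 2.1.1: EigenSilhouettes via the small A^T A matrix.

    images: (N, S) array of N flattened silhouettes, S = W * H.
    Returns the mean silhouette psi (S,) and the K leading EigenSilhouettes
    u of shape (S, K) covering more than var_ratio of the total variance.
    """
    N = images.shape[0]
    psi = images.mean(axis=0)              # mean silhouette vector, Eq. (1)
    A = (images - psi).T                   # A = [Phi_1 ... Phi_N], shape (S, N)
    small = (A.T @ A) / N                  # N x N instead of the S x S matrix C
    lam, v = np.linalg.eigh(small)         # eigenvalues in ascending order
    order = np.argsort(lam)[::-1]
    lam, v = lam[order], v[:, order]
    u = A @ v                              # u_i = A v_i are eigenvectors of A A^T
    u /= np.linalg.norm(u, axis=0)         # normalize so that ||u_i|| = 1
    K = int(np.searchsorted(np.cumsum(lam) / lam.sum(), var_ratio)) + 1
    return psi, u[:, :K]

# Projection of a mean-subtracted silhouette phi = gamma - psi, Eq. (3):
# omega = u.T @ phi
```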

2.1.2. K-means clustering

In this stage we apply constrained K-means clustering to determine the key poses (each cluster represents a key pose class).


This section describes step 1.2 of Algorithm 1. The inherent sequential nature of the key poses in a gait cycle makes the clusters formed by K-means clustering temporally adjacent. Let the feature vectors of a gait cycle of the nth subject be Ω^n = {ω_1^n, ω_2^n, ..., ω_p^n}, where p is the number of silhouettes in a gait cycle. Initially, the clusters are formed by equally partitioning each gait cycle into K segments: the jth frame is assigned to cluster i = 1 + ⌊(j · K)/p⌋, i ∈ K, so that all the frames in the ith segment of each gait cycle of every subject are grouped under the ith cluster. Let the initial set of clusters be S^0 = {S_1^0, S_2^0, ..., S_K^0}, with corresponding centroids P^0 = {P_1^0, P_2^0, ..., P_K^0}, each of which represents a key pose.

Constrained K-means clustering is then applied to iteratively refine the clusters. The first constraint is that the only allowable moves for a frame are from the ith cluster to the (i−1)th, ith, or (i+1)th cluster. The second constraint is that, after performing cluster assignment under the first constraint, the transition order of the frames is checked; if it is not properly ordered, the offending frames are reassigned so that the cluster of the previous frame is lower than or equal to that of the current frame. Lastly, every cluster must retain at least one frame from each gait cycle of each subject. After initialization, the algorithm alternates between the following two steps.

Update step: calculate the centroid of each cluster:

P_i^{(t)} = \frac{1}{|S_i^{(t)}|} \sum_{\omega_j \in S_i^{(t)}} \omega_j    (5)

Assignment step: reassign each frame to the allowable cluster with the closest centroid:

S_i^{(t+1)} = \{ \omega_j : \|\omega_j - P_i^{(t)}\| \le \|\omega_j - P_l^{(t)}\| \text{ for } l = i-1 \text{ or } i+1 \}    (6)

The algorithm terminates when the assignments no longer change. To decide the optimum number of key poses representing one complete gait cycle, we consider the rate-distortion curve as used in [9], which plots the average distortion as a function of the number of clusters; an example is shown in Fig. 3. It can be observed from the plot that beyond 16 clusters the average distortion does not decrease significantly. Thus, we choose 16 clusters, which gives a set of 16 key poses. Fig. 4 shows the 16 key poses corresponding to the 16 cluster centroids P_1-P_16 over one gait cycle.
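A simplified sketch of the update and assignment steps under the first two constraints is given below; the cumulative-maximum reordering is our shorthand for the paper's frame reassignment rule, the function name is hypothetical, and the sketch assumes every cluster keeps at least one frame (which the paper enforces explicitly).

```python
import numpy as np

def constrained_kmeans(cycle, K=16, n_iter=100):
    """Sketch of Section 2.1.2 for one eigenspace-projected gait cycle.

    cycle: (p, d) array of the p projected silhouettes of a gait cycle.
    Returns labels in 0..K-1 and the K centroids (the key poses).
    Assumes no cluster ever becomes empty.
    """
    p = cycle.shape[0]
    labels = np.minimum((np.arange(p) * K) // p, K - 1)  # equal initial partition
    for _ in range(n_iter):
        # Update step, Eq. (5): centroid of every cluster
        cent = np.vstack([cycle[labels == k].mean(axis=0) for k in range(K)])
        new = labels.copy()
        for j in range(p):
            i = labels[j]
            cand = [k for k in (i - 1, i, i + 1) if 0 <= k < K]  # constraint 1
            dist = [np.linalg.norm(cycle[j] - cent[k]) for k in cand]
            new[j] = cand[int(np.argmin(dist))]          # assignment step, Eq. (6)
        new = np.maximum.accumulate(new)   # constraint 2: non-decreasing in time
        if np.array_equal(new, labels):
            break
        labels = new
    return labels, cent
```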

2.2. Silhouette classification

The next stage is silhouette classification into key poses. Given an input test sequence (F_1, ..., F_T), each silhouette is first linearly projected into the eigenspace to get its weight vector (step 1.3 in Algorithm 1). Let the mean weight vectors corresponding to the key poses be (P_1, P_2, ..., P_K). Given an unknown test silhouette F_i represented as a column vector Γ''_i, we first find the mean-subtracted vector Φ''_i = Γ''_i − Ψ. It is then projected into the eigenspace and the weight vector is determined



as follows:

\Omega''_i = u^T \Phi''_i    (7)

After the feature (weight) vector for the test silhouette is computed, the match scores of the probe silhouette against all the key pose weight vectors (P_1, ..., P_K) are determined (step 1.4 in Algorithm 1). For this we use a Euclidean distance based measure, MatchVal(Ω''_i, P_j) = 1 − D(P_j, Ω''_i), for j = 1, 2, ..., K. If there are K key poses and T frames in a sequence, a K × T array of match scores is obtained. From these match scores, an input silhouette could be classified into one of the key pose classes directly by taking the best matched key pose. But this simplified approach overlooks the temporal context of the key pose sequence: two consecutive frames may be classified into two temporally non-adjacent classes, which is clearly wrong. The problem may be caused by distorted silhouettes obtained from imperfect background segmentation. Even when perfectly clean silhouettes are available, incorrect classification can occur because different key poses may generate similar silhouettes (such as the left-foot-forward and right-foot-forward positions). So, to mitigate these unreliable observations and robustly classify an input sequence into a sequence of known key poses, we take advantage of the temporal constraints imposed by a state transition model.

2.2.1. State transition model

In our state transition model, one complete gait cycle is modeled as a chain of estimated key poses. Thus, if the number of key poses forming a gait cycle is K, the number of states in the transition diagram is also K. Fig. 5 shows an example state transition diagram having five states in a gait cycle, with one key pose associated with each state. Since in our case 16 key poses represent a gait cycle, the corresponding state transition model has 16 states, unlike the five-state example of Fig. 5. This state transition model provides contextual and temporal constraints, where the links specify the temporal order of key poses. We construct a directed graph from this state transition model, in which the vertices are the key pose states and the edges are the allowable state transitions. The key pose finding problem is then formulated as a most likely path finding problem in the directed graph, which is solved using dynamic programming.

Fig. 3. Rate-distortion plot (average distortion versus number of centroids).

Fig. 4. Reconstructed key poses obtained from K-means clustering in eigenspace (CMU MoBo database).

Fig. 5. Proposed state transition diagram considering five states (S1-S5) corresponding to five key poses (P1-P5), with self-transitions T11, T22, T33, T44, T55, forward transitions T12, T23, T34, T45, and the cycle-closing transition T51.

2.2.2. Graph-based path searching

Let the input sequence of frames be F = F_1, ..., F_T, and the possible set of states in the ith frame be S^i = {S_1^i, ..., S_K^i}, where the states S_1 to S_K represent the key poses P_1, P_2, ..., P_K. The set of vertices V of the graph G corresponds to the key pose states S^i for i = 1 to T. An edge e_{kp}^i : S_k^i -> S_p^{i+1} is added to the graph G only if the transition from S_k to S_p is allowed by the state transition diagram. The graph thus constructed is a directed acyclic graph. Fig. 6 shows an example of a graph constructed from the state transition model of Fig. 5 for five consecutive frames; for each frame there are five states (S1-S5).




Fig. 6. Directed acyclic graph constructed from the state transition diagram of Fig. 5 over five frames. For each frame there are five states (S1-S5). An edge between the nodes in frames F_i and F_{i+1} is added if that transition is allowed by the state transition diagram. The first value shown in each node represents MatchVal(), the second value represents PathVal(), and the third value represents PrevNode(). The bold edges show the longest path found by dynamic programming. The pose assignment obtained for each frame is S1-S2-S3-S4-S5.

An edge between the nodes in frames F_i and F_{i+1} is added if that transition is allowed by the state transition diagram in Fig. 5. The most likely sequence of key poses for a silhouette sequence is then the longest path (the path of maximum weight) among all admissible paths in this graph. Dynamic programming [12] is used to find the longest path; the complete procedure is described in Algorithm 2.

Algorithm 2. Silhouette classification algorithm.

Parameters:
  T = frame count in a sequence
  F = frame sequence F_1, F_2, ..., F_T
  G_F(V, E) = directed acyclic graph
  K = number of nodes per frame (same as the number of states in the state transition model)
  t_ij = jth state in frame F_i
  E(t_ij, t_kl) = edge joining node t_ij with node t_kl
  MatchVal(t_ij) = probability of frame F_i being in the jth state
  PathVal(t_ij) = weight of the best path up to F_i in state j
  PrevNode(t_ij) = state of the previous frame F_{i−1} that maximizes PathVal(t_ij)
Input: G_F(V, E)
Output: MaximumWeightedPath from F_1 to F_T; BestState_i, i ∈ T
Initialization:
  PathVal(t_1j) = MatchVal(t_1j); PathVal(t_ij) = 0 for all i > 1 and all j
  PrevNode(t_ij) = 0 for all i, j
Iteration: for each state j ∈ K of each frame F_i, i ∈ T, compute
  PathVal(t_ij) = max_l (PathVal(t_kl) + MatchVal(t_ij)) over all nodes l of the previous frame, k = i − 1
  PrevNode(t_ij) = argmax_l (PathVal(t_kl) + MatchVal(t_ij))
Termination:
  MaximumWeightedPath = max_j PathVal(t_Tj) over all j in frame F_T
  BestState_T = argmax_j PathVal(t_Tj)
Path backtracking:
  BestState_i = PrevNode(t_{(i+1)(BestState_{i+1})}), i = T−1, T−2, ..., 1

At each time step, we compute three values for each node: the match score MatchVal(t_ij) (the first value shown in each node of Fig. 6) between each state j in the graph and the input silhouette image of frame F_i; the best path score PathVal(t_ij) (the second value in each node of Fig. 6) along a path up to the current node t_ij; and the previous node along the longest path up to the current node, PrevNode(t_ij) (the third value in each node of Fig. 6). The match score MatchVal(Ω''_i, P_j), written MatchVal(t_ij) in Algorithm 2, represents the extent to which the silhouette of the current input frame F_i matches the key pose P_j corresponding to state j; the procedure for computing it is described in Section 2.2 (step 1.4 in Algorithm 1). During initialization, PathVal() is set to zero for all frames except the first, where it is set equal to MatchVal(); similarly, PrevNode() is initially set to zero for all frames. During the iteration, at every frame F_i, node t_ij examines all previous nodes that link to it in the graph and chooses the one with the maximum best path score; the path score of the current node t_ij is updated accordingly and the selected previous node is recorded. When the last frame is reached, the node with the maximum path score is selected as the pose class of the final frame, and backtracking is performed to recover the longest path (shown in bold in Fig. 6).



For a fully ergodic graph, the algorithmic complexity is O(K²T), where K is the number of states and T the number of frames. Here, since the average in-degree of each node is small, the overall complexity becomes O(KT). The output of this stage is a sequence of key pose labels representing the most probable class of each silhouette frame in the input sequence. An example silhouette sequence of a subject with its key pose labels is shown in Fig. 7.
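The path search of Algorithm 2 can be sketched compactly in NumPy as follows (hypothetical names; match_val would hold the step 1.4 scores, and allowed encodes the transition diagram of Section 2.2.1, which guarantees every state has at least one admissible predecessor):

```python
import numpy as np

def most_probable_path(match_val, allowed):
    """Sketch of Algorithm 2: longest path in the frame-by-state trellis.

    match_val: (T, K) array, match_val[i, j] = MatchVal of frame i in state j.
    allowed:   (K, K) boolean array, allowed[l, j] = True if state l may be
               followed by state j in the state transition model.
    Returns the best key pose label for every frame.
    """
    T, K = match_val.shape
    path_val = np.full((T, K), -np.inf)
    prev = np.zeros((T, K), dtype=int)
    path_val[0] = match_val[0]                       # initialization
    for i in range(1, T):                            # iteration
        for j in range(K):
            cand = np.flatnonzero(allowed[:, j])     # admissible predecessors
            best = cand[np.argmax(path_val[i - 1, cand])]
            prev[i, j] = best
            path_val[i, j] = path_val[i - 1, best] + match_val[i, j]
    states = np.empty(T, dtype=int)                  # termination + backtracking
    states[-1] = int(np.argmax(path_val[-1]))
    for i in range(T - 2, -1, -1):
        states[i] = prev[i + 1, states[i + 1]]
    return states
```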

2.3. Pose Kinematics and Pose Energy Image (PEI) computation

Once the key pose class of each frame in a silhouette sequence is obtained, we compute Pose Kinematics as a K-element vector, where K is the number of key poses. The ith element PK_i of the vector represents the fraction of time the ith pose P_i occurs in a complete gait cycle:

PK_i = \frac{1}{GC} \sum_{t=1}^{GC} 1 \quad \text{if } F_t \in P_i    (8)

where GC is the number of frames in the complete gait cycle(s) of a silhouette sequence, F_t is the tth frame in the sequence (moment of time), and P_i is the ith key pose. For example, to compute the first component of the PK feature vector from the silhouette sequence with key pose labels shown in Fig. 7, we first count the number of silhouettes that belong to key pose class 1. It is found to be 3 (frame numbers 12-14 in Fig. 7), and the length of the gait cycle is 36, so the first component is 3/36 = 0.0833. The other components of the PK feature vector are computed by the same procedure. The complete PK feature vector for the sequence shown in Fig. 7 is {0.0833, 0.0278, 0.0278, 0.0278, 0.1667, 0.0278, 0.0833, 0.0833, 0.0278, 0.0278, 0.0278, 0.0833, 0.1389, 0.0278, 0.0556, 0.0833}. Algorithm 3 shows the steps for computing the PK and PEI features on a silhouette sequence of one gait cycle length.

Algorithm 3. Feature computation algorithm.

Input:
  GC = gait cycle length
  K = number of key poses
  F = silhouette sequence F_1, F_2, ..., F_GC
  L = corresponding key pose labels L_1, L_2, ..., L_GC
Output: PK = Pose Kinematics; PEI = Pose Energy Images
Initialization: PK_k = 0, PEI_k = 0 for all k ∈ K
for i = 1 to GC do
  for k = 1 to K do
    if L_i == k then
      PK_k = PK_k + 1; PEI_k = PEI_k + F_i
    end if
  end for
end for
for k = 1 to K do
  PEI_k = PEI_k / PK_k; PK_k = PK_k / GC
end for

Fig. 7. Example silhouette sequence for a subject of one gait cycle length. The key pose class labels obtained from the dynamic programming based most probable path search are shown below each silhouette. The gait cycle starts at frame no. 1 (state S13, key pose 13) and ends at frame no. 36 (state S12, key pose 12); thus the gait cycle length is 36. The silhouette counts for key pose classes 1-16 are [3 1 1 1 6 1 3 3 1 1 1 3 5 1 2 3].
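Algorithm 3 translates almost directly into NumPy; the following sketch (hypothetical name, 0-based labels) reproduces, e.g., pk[0] = 3/36 ≈ 0.0833 for the sequence of Fig. 7. The PEI feature it accumulates is defined formally in the next paragraphs.

```python
import numpy as np

def pk_and_pei(silhouettes, labels, K=16):
    """Sketch of Algorithm 3 for one gait cycle.

    silhouettes: (GC, H, W) binary silhouette images.
    labels:      (GC,) key pose labels in 0..K-1 from the path search.
    Returns PK of shape (K,) and the K PEIs of shape (K, H, W).
    """
    GC = len(labels)
    counts = np.bincount(labels, minlength=K).astype(float)
    pei = np.zeros((K,) + silhouettes.shape[1:])
    for sil, k in zip(silhouettes, labels):
        pei[k] += sil                                 # accumulate per pose
    pei /= np.maximum(counts, 1.0)[:, None, None]     # average image, Eq. (9)
    pk = counts / GC                                  # time fractions, Eq. (8)
    return pk, pei
```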


The Pose Energy Image (PEI) is used to represent the shape property of the gait signature. One PEI is computed for each key pose; thus, instead of a single 2D image as in GEI, K PEIs are computed from one gait cycle of an image sequence. PEI is based on two basic assumptions: (i) the order of poses in human walking cycles is the same, and (ii) differences exist in the phase of poses within a walking cycle. Given the preprocessed binary gait silhouette image I_t(x, y) corresponding to frame F_t at time t in a sequence, the ith gray-level Pose Energy Image (PEI_i) is defined as follows:

PEI_i(x, y) = \frac{1}{GC \cdot PK_i} \sum_{t=1}^{GC} I_t(x, y) \quad \text{if } F_t \in P_i    (9)

where x and y are the 2D image coordinates. The detailed steps for computing PEI are shown in Algorithm 3. Fig. 8(a) shows the PEIs corresponding to the silhouette sequence of Fig. 7 for a subject with 16 key poses. For example, the first PEI is obtained by averaging silhouettes 12-14 of Fig. 7, which belong to key pose class 1; the other PEIs are obtained similarly. It can be observed that the intensity variation within each PEI is small, since it reflects unified motion. Thus, PEI reflects the major shapes of the silhouettes and their changes over a gait cycle. Fig. 8(b) shows the PEIs of another subject.

2.4. Feature dimension reduction

Since Pose Kinematics is a K-dimensional vector, where K is much smaller than one gait cycle length, dimension reduction is not required for it; the feature vectors of the probe sets are classified into one of the subject classes using the nearest neighbor criterion. However, for the PEI feature,


reduction of dimensionality is absolutely necessary to overcome the "curse of dimensionality". Many subspace algorithms have been applied to gait recognition in recent years, such as principal component analysis (PCA), linear discriminant analysis (LDA), PCA+LDA [3], discriminative common vectors (DCV) [6], and two-dimensional locality preserving projections (2DLPP) [8]. More advanced and complex algorithms achieve higher recognition accuracy, but at higher computational cost. Here we apply two classical linear transformations, PCA and LDA, which have lower computational cost. The projected low dimensional PEI feature vector of a test sequence is categorized using the nearest neighbor criterion, as done for Pose Kinematics. Even with this simple projection method, the proposed features achieve higher recognition accuracy than the other existing feature representation methods, establishing the inherent power of our gait representation. Next we describe the two subspace methods used for feature learning in detail.

2.4.1. Learning features using PCA

By applying PCA, we obtain several principal components that map the original PEI gait features from a high-dimensional measurement space to a low-dimensional eigenspace. Let each series of K PEIs obtained from a gait cycle of a subject be represented by a column vector x_i of size d = W × H × K, where W is the width and H the height of a PEI, and K the number of key pose classes. Thus, the d-dimensional training PEI data set of size M is x_1, x_2, ..., x_M. The average vector m and the covariance matrix S are computed as follows:

m = \frac{1}{M} \sum_{k=1}^{M} x_k    (10)

Fig. 8. PEI example images from the CMU MoBo database: (a) the first two rows show the 16 PEIs of a subject obtained from the silhouette sequence of Fig. 7; (b) the last two rows show the PEIs of another subject.

S = \frac{1}{M} \sum_{i=1}^{M} (x_i - m)(x_i - m)^T    (11)

Next, the eigen-equation S e_k = λ_k e_k is solved, and the eigenvectors [e_1, e_2, ..., e_{d'}] corresponding to the d' largest eigenvalues (λ_1 ≥ λ_2 ≥ ... ≥ λ_{d'}) are selected, with d' < d. The d'-dimensional feature vector y_k is then obtained from x_k as follows:

y_k = [e_1, e_2, \ldots, e_{d'}]^T (x_k - m) = T_{PCA} (x_k - m), \quad k = 1, \ldots, M    (12)

These reduced d'-dimensional feature vectors are used for recognition.
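A sketch of this PCA step is given below; it reaches the eigenvectors of Eq. (11) through the SVD of the centered data, which is mathematically equivalent to the eigen-decomposition described above but avoids forming the d × d matrix S explicitly. The names and the var_ratio/max_dim parameters are illustrative assumptions.

```python
import numpy as np

def pca_transform(X, var_ratio=0.90, max_dim=None):
    """Sketch of Section 2.4.1: PCA on stacked PEI vectors.

    X: (M, d) matrix of M training samples, each the concatenation of K PEIs.
    Returns the mean m (Eq. 10), T_PCA of shape (d', d), and the projected
    features y_k = T_PCA (x_k - m) of Eq. (12), stacked as rows.
    """
    m = X.mean(axis=0)
    Xc = X - m
    U, sv, Vt = np.linalg.svd(Xc, full_matrices=False)
    var = sv ** 2                                    # proportional to eigenvalues
    d_prime = int(np.searchsorted(np.cumsum(var) / var.sum(), var_ratio)) + 1
    if max_dim is not None:                          # e.g. keep d' < M - c for LDA
        d_prime = min(d_prime, max_dim)
    T_pca = Vt[:d_prime]                             # rows are e_1 ... e_d'
    return m, T_pca, Xc @ T_pca.T
```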

2.4.2. Learning features using LDA

The second subspace method used is LDA. However, instead of pure LDA, we use PCA+LDA to address singularity issues, which arise because the training data set size is smaller than the feature vector size. Consider the d-dimensional training PEI data set of size M, x_1, x_2, ..., x_M, as before, belonging to c classes. LDA computes the optimal discriminating space T by maximizing the ratio of the between-class scatter matrix S_B to the within-class scatter matrix S_W:

T = \arg\max_W \frac{|W^T S_B W|}{|W^T S_W W|}    (13)

S_W is defined as

S_W = \sum_{i=1}^{c} \sum_{x \in D_i} (x - m_i)(x - m_i)^T    (14)

where m_i = (1/n_i) \sum_{x \in D_i} x, D_i is the set of training PEIs belonging to the ith class, and n_i is the number of PEIs in D_i. The between-class scatter matrix S_B is defined as

S_B = \sum_{i=1}^{c} n_i (m_i - m)(m_i - m)^T    (15)

where m is obtained using Eq. (10). T is the set of eigenvectors corresponding to the largest eigenvalues γ_i in

S_B w_i = \gamma_i S_W w_i    (16)

However, the rank of S_W is at most (M − c), where M is the total training data set size and c is the total number of classes; S_W is non-singular only if its size is smaller than (M − c). But since the size of S_W is determined by the dimension d of the row-scanned PEI vector (of the order of 100,000), it is much larger than (M − c) (of the order of 1000). To solve this problem, PCA is applied first to reduce the dimension of the training PEIs, keeping no more than the largest (M − c) principal components so that S_W is non-singular. So, instead of applying LDA on the original PEI data set x_1, x_2, ..., x_M, we first apply PCA to get a set of M d'-dimensional principal component vectors y_1, y_2, ..., y_M using Eqs. (10)-(12), where d' is chosen such that d' < (M − c) and the retained total variance exceeds 90%. Then S_B and S_W are computed using Eqs. (14) and (15) (replacing x by y), and the eigenvectors are computed from Eq. (16). A maximum of (c − 1) eigenvectors {ν_1, ν_2, ..., ν_{c−1}} are obtained, which form the transformation matrix T_LDA. Thus, the (c − 1)-dimensional gait feature vector is obtained from the d'-dimensional principal component vector y_k as follows:

z_k = [\nu_1, \nu_2, \ldots, \nu_{c-1}]^T y_k = T_{LDA} y_k = T_{LDA} T_{PCA} (x_k - m), \quad k = 1, \ldots, M    (17)

The obtained feature vectors of dimension (c − 1) are used in the next stage for final recognition.
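Under the same assumptions, the PCA+LDA step might be sketched as follows; solving S_W^{-1} S_B is one standard route to the generalized eigenproblem of Eq. (16), and it presumes the PCA stage has already made S_W non-singular.

```python
import numpy as np

def lda_transform(Y, y_labels, c):
    """Sketch of Section 2.4.2: LDA on the PCA-reduced features.

    Y: (M, d') PCA-projected training features with d' < M - c;
    y_labels: (M,) class ids in 0..c-1.  Returns T_LDA of shape (c-1, d').
    """
    m = Y.mean(axis=0)
    d_prime = Y.shape[1]
    Sw = np.zeros((d_prime, d_prime))
    Sb = np.zeros((d_prime, d_prime))
    for i in range(c):
        Di = Y[y_labels == i]
        mi = Di.mean(axis=0)
        Sw += (Di - mi).T @ (Di - mi)                 # within-class scatter, Eq. (14)
        Sb += len(Di) * np.outer(mi - m, mi - m)      # between-class scatter, Eq. (15)
    gamma, W = np.linalg.eig(np.linalg.solve(Sw, Sb)) # Eq. (16)
    order = np.argsort(gamma.real)[::-1][:c - 1]      # keep at most c-1 directions
    return W[:, order].real.T                         # T_LDA

# Full projection of a raw PEI vector x, Eq. (17):
# z = T_lda @ (T_pca @ (x - m_pca))
```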

2.5. Human recognition by combining Pose Kinematics and PEI

Since Pose Kinematics is obtained by simply counting the number of frames belonging to each key pose state, it is quite fast, whereas PEI's discrimination power is much higher than that of Pose Kinematics. On the other hand, PEI requires more computation time and storage space than Pose Kinematics, which makes it slower. To combine the advantages of both representations, we propose a hierarchical scheme for final recognition: since Pose Kinematics provides a fast yet comparatively weaker classifier, we apply it in the first stage, and PEI based classification is done in the second stage.

Let the set of training PK feature vectors be denoted by {p}, the corresponding set of PEI feature vectors by {e}, and the transformation matrix computed from the training data by T. The class centers of the training data are m_i^p = (1/n_i) \sum_{p \in P_i} p and m_i^e = (1/n_i) \sum_{e \in E_i} e, i = 1, ..., c, where c is the number of classes (subjects) in the database, P_i is the set of PK feature vectors belonging to the ith class, E_i is the set of PEI feature vectors belonging to the ith class, and n_i is the number of feature vectors in P_i or E_i. Given the test silhouette sequence F, we follow the steps discussed in Sections 2.1-2.3 to compute the PK gait features P_F = {P_1, P_2, ..., P_J} and the sequence of PEI features E_F = {E_1, E_2, ..., E_J}, where J is the number of complete gait cycles extracted from the test sequence. The transformed feature vector set is then obtained as

\{\hat{E}_F\} : \hat{E}_j = T E_j, \quad j = 1, \ldots, J    (18)

At first, the PK based classifier is applied for recognition, and the distance between the test PK features and the training PK data of class i is defined as

D(P_F, P_i) = \frac{1}{J} \sum_{j=1}^{J} \|P_j - m_i^p\|, \quad i = 1, \ldots, c    (19)

Then the test sequence is classified as subject a, where

a = \arg\min_{i \in C} D(P_F, P_i)    (20)

provided min_{i=1,...,c} D(P_F, P_i) ≤ θ, where θ is a threshold determined experimentally. In that case the nearest distance of the Pose Kinematics based approach is low enough, so its output class label is considered correct and final, and PEI is not applied in the next stage. Otherwise, the S nearest neighbor classes, denoted P_1, ..., P_S with S ≪ c, are selected from the training sample space; S is chosen as the rank at which the average recognition accuracy over all possible probe and gallery sets exceeds 90%. Thus, we essentially restrict the number of classes to be searched using the PEI based feature. Then, on these top S subject


classes we apply the PEI feature based method to get the final classification result. For the classifier based on the PEI feature, we define

D(\hat{E}_F, E_i) = \frac{1}{J} \sum_{j=1}^{J} \|\hat{E}_j - m_i^e\|, \quad i \in S    (21)

Then the sequence is assigned to subject class b, where

b = \arg\min_{i \in S} D(\hat{E}_F, E_i)    (22)
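The two-stage decision rule of Eqs. (19)-(22) might be sketched as follows (illustrative names; theta and S are the experimentally chosen threshold and shortlist size):

```python
import numpy as np

def recognize(pk_test, pei_test, pk_centers, pei_centers, theta, S):
    """Sketch of Section 2.5: hierarchical combination of PK and PEI.

    pk_test:  (J, K) PK vectors of the J test gait cycles.
    pei_test: (J, d) projected PEI vectors of the same cycles.
    pk_centers, pei_centers: per-class training means, (c, K) and (c, d).
    Returns the predicted subject class.
    """
    # Stage 1: PK distance to every class, Eq. (19)
    d_pk = np.linalg.norm(pk_test[:, None, :] - pk_centers[None],
                          axis=2).mean(axis=0)
    a = int(np.argmin(d_pk))
    if d_pk[a] <= theta:        # confident PK match: label is final, Eq. (20)
        return a
    # Stage 2: PEI classifier on the S nearest PK classes, Eqs. (21)-(22)
    shortlist = np.argsort(d_pk)[:S]
    d_pei = np.linalg.norm(pei_test[:, None, :] - pei_centers[None, shortlist],
                           axis=2).mean(axis=0)
    return int(shortlist[np.argmin(d_pei)])
```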

This hierarchical method of feature combination achieves both fast computation and a high recognition rate.

3. Experimental results

We evaluated the proposed algorithm under varied and challenging conditions, such as variation in the size of the data set, walking surface, walking speed, carrying condition, shoe type, camera angle, clothing, and recording time. The gait databases used for the experiments were the CMU MoBo database [13] and the USF HumanID gait database [10]. The experiments were carried out on a 2 GHz Intel Core 2 Duo computer with 2 GB RAM, in a Matlab 7.8 environment.

3.1. CMU MoBo database

The CMU MoBo database [13] consists of indoor video sequences of 25 subjects walking on a treadmill. Videos were captured in different modes of walking, namely slow walk, fast walk, walking on an inclined plane, and walking with a ball in both hands. Each sequence is 11.33 s long, recorded at a frame rate of 30 frames per second. All silhouettes are vertically scaled, horizontally aligned, and rescaled to 132 × 192. In this paper, all four types of walking, i.e., 'slow walk' (S), 'fast walk' (F), 'ball walk' (B) and 'inclined walk' (I), are used for both gallery and probe. Given a probe sequence, the first step is to classify each silhouette into one of the key poses; this is done by following the steps described in Sections 2.1 and 2.2 (see also Fig. 1). Once the key pose labels are known for each


silhouette, the PK and PEI features are computed as described in Section 2.3. Dimension reduction is then done using either PCA or LDA, and finally classification is performed following the steps discussed in Section 2.5; these steps are shown in detail in Fig. 2.

Table 1 compares the recognition performance of our algorithm against five existing methods. The existing methods show high recognition rates when gallery and probe sets are either the same or have small shape variation (train with S and test with F, or train with F and test with S). For the other experiments, relatively low recognition rates are achieved, which indicates that those algorithms are not robust to appearance changes. Results of experiments that were not reported have been left blank in the table. In contrast, our algorithm shows the best classification accuracy across all types of gallery/probe combinations. The third-from-last column shows the recognition accuracy with the Pose Kinematics feature alone; as expected, the recognition rate is not very high. The second-from-last column shows the recognition accuracy with PEI followed by PCA alone, which is higher than any of the existing methods. The recognition accuracy after hierarchical combination of the two features is shown in the last column; in the first stage, the Pose Kinematics based method selects the top 30% of the gallery set, on which PEI is then applied. The combined method achieves slightly better overall accuracy than the PEI based method alone. This occurs in the following situation: say the test subject is actually subject A in the training data set. With PEI based recognition alone, it could be classified as subject B if both have a similar physical build. During Pose Kinematics based search space pruning in the combined method, however, subject B is not selected because his kinematics does not match that of the test subject; when PEI based recognition is then applied on the reduced search space, the test subject is classified correctly as subject A, since subject B is no longer among the candidates.

Table 1
Recognition results on the MoBo data set. S/F denotes Gallery S and Probe F; a dash indicates a result not reported for that method.

Experiment   CMU [14]   UMD [15]   MIT [16]   SSP [17]   FSVB [11]   Pose Kinematics   PEI    Combination method
S/S (%)      100        100        100        100        100         100               100    100
S/F (%)      76         80         64         54         82          32                100    100
S/B (%)      92         48         50         –          77          84                92     92
S/I (%)      –          –          –          –          –           68                60     60
F/S (%)      –          84         –          32         80          52                88     88
F/F (%)      –          100        –          100        100         92                100    100
F/B (%)      –          48         –          –          61          28                64     60
F/I (%)      –          –          –          –          –           64                72     72
B/S (%)      –          68         –          –          89          80                92     92
B/F (%)      –          48         –          –          73          36                84     84
B/B (%)      –          92         –          –          100         92                100    100
B/I (%)      –          –          –          –          –           60                60     76
I/S (%)      –          –          –          –          –           76                60     76
I/F (%)      –          –          –          –          –           48                80     80
I/B (%)      –          –          –          –          –           40                32     48
I/I (%)      –          –          –          –          –           100               100    100



Table 2 shows the average time requirement for classifying a subject using Pose Kinematics, PEI, or the combined method. The average accuracy in Table 2 is obtained by averaging the accuracies of all the experiments reported in Table 1. As already stated, the time requirement of Pose Kinematics is low; PEI, on the other hand, requires 83% more computation time than Pose Kinematics. After hierarchical combination of the two features, the time requirement is reduced by 18% compared to the PEI method alone. Although the saving seems small for one subject, when multiple subjects have to be recognized the overall saving is considerable; this is especially true for surveillance applications, where raw video durations run into hours. It can also be noted that, in spite of the time improvement, accuracy is not adversely affected. Thus the final result is not constrained by the restricted search space obtained from the Pose Kinematics feature.

Table 2
Time required for each feature and the combined method.

Approaches         Time (s)   Average accuracy (%)
Pose Kinematics    35.5       66.25
PEI                64.8       80.25
Combined method    54.03      82.75

3.2. USF HumanID database

We also conducted experiments on the USF HumanID outdoor gait database (Version 2.1) [10]. The database consists of 1870 sequences of 122 subjects. For each subject there are five covariates: view point (left/right), surface (grass/concrete), shoe (A/B), carrying condition (with/without briefcase), and recording time and clothing (May/November). Depending on these covariates, 12 experiments labeled A to L are designed for testing. Table 3 shows the comparative recognition results of our algorithm, combining the two proposed features, against the previously proposed GEI [3], EGEI [6] and AGEI [8] methods. GEI is obtained by averaging the silhouettes over a gait cycle. To compute EGEI, variation analysis is performed to find the dynamic region in GEI; based on this analysis, a dynamic weight mask is constructed to enhance the dynamic region and suppress the noise in the unimportant regions. In AGEI, active or dynamic regions are extracted in a different manner: from a gait silhouette sequence, the active regions are extracted by calculating the difference of two adjacent silhouette images, and the AGEI is then constructed by accumulating these active regions. Here we applied both PCA and PCA+LDA dimensionality reduction techniques for performance comparison.

Table 3
Recognition results on the USF gait data set (%).

Approaches        Proj.   A    B    C    D    E    F    G    H    I    J    K    L    Mean
GEI [3]           PCA     80   87   72   26   22   11   13   47   40   37   6    6    39.25
                  LDA     88   89   74   25   25   15   20   52   53   56   9    18   45.93
EGEI [6]          PCA     80   87   70   24   20   11   12   48   40   40   6    3    39.30
                  LDA     89   89   76   22   28   14   20   52   52   58   12   1    46.14
Active GEI [8]    PCA     83   91   76   17   22   7    7    75   67   51   3    3    45.02
                  LDA     89   93   82   22   26   13   14   83   71   61   9    9    51.22
Proposed method   PCA     81   93   76   47   31   18   24   61   53   38   6    9    47.73
                  LDA     85   94   78   49   33   22   26   71   69   47   12   12   53.11

It can be observed from the table that for probes H, I and J (carrying briefcase), our approach gives lower performance than the AGEI method. On the other hand, AGEI performs poorly on probes D, E, F and G, where the walking surface changes. The basic difference between these two covariates is that the first affects the appearance while the second affects the walking pattern. Our proposed PEI captures dynamics at a higher resolution than AGEI, so the performance of our method is less affected by surface change, whereas AGEI fails to capture dynamic variation when the walking surface changes (probe sets D-G). AGEI performs better in the carrying-briefcase situation (probe sets H-J) because, at the time of computing AGEI, stationary regions are deleted by taking the difference between two adjacent silhouette images in a sequence; since the briefcase region is stationary between two adjacent images, it is not considered in AGEI. Thus it suppresses appearance variation by concentrating only on active regions. However, the goodness of any approach is measured by its performance over all the probes. According to the weighted mean recognition results over all 12 probes, our PEI and Pose Kinematics based approach outperforms all of the existing gait feature representation methods.

To judge the ranking capability of the proposed approach, we plot the cumulative match characteristic (CMC) curves for the 12 probes in Fig. 9. Here PCA is used as the dimension reduction method. The curve corresponding to each experiment represents the variation in the probability of recognizing a subject as the rank increases from 1 to 20; the weighted mean accuracy is also shown on the same plot. It can be observed that the weighted mean accuracy almost saturates (at 75-85%) beyond a rank value of 12.




Fig. 9. Cumulative match characteristic curves of all the probe sets.

4. Conclusions

In this paper, two new gait representation approaches called Pose Kinematics and Pose Energy Image (PEI) are proposed to capture motion information in detail. Compared with other GEI based methods, such as conventional GEI [3], EGEI [6] and AGEI [8], PEI represents minute motion information by increasing the resolution of GEI. The second feature, Pose Kinematics, captures pure dynamic information without involving the shape of the silhouette. Since the discriminative power of pure dynamic information alone is not high enough, this feature is less robust; PEI, on the other hand, preserves detailed dynamic information as well as shape information, which makes it more discriminative. But since the PEI feature is slower to compute, Pose Kinematics is used in the first stage to select a set of most probable classes based on dynamic information alone, and the PEI based approach is then applied on these selected classes to get the final classification result. This hierarchical method of feature combination thus achieves both accuracy and efficiency. Experimental results demonstrate that the proposed approach performs better than the existing GEI based and non-GEI based approaches. Here we applied two classical techniques, PCA and LDA, for dimensionality reduction and discriminative feature extraction; in future work, we will attempt to apply more recently proposed robust dimensionality reduction methods to achieve higher accuracy.

Acknowledgment This work is partially supported by the project grant 1(23)/2006-ME TMD, Dt. 07/03/2007 sponsored by the Ministry of Communication and Information Technology, Government of India and also by Alexander von Humboldt Fellowship for Experienced Researchers.

References [1] N.V. Boulgouris, D. Hatzinakos, K.N. Plataniotis, Gait recognition: a challenging signal processing technology for biometrics identification, IEEE Signal Processing Magazine 22 (6) (2005) 78–90. [2] A.F. Bobick, J.W. Davis, The recognition of human movement using temporal templates, IEEE Transactions on Pattern Analysis and Machine Intelligence 23 (3) (2001) 257–267. [3] J. Han, B. Bhanu, Individual recognition using gait energy image, IEEE Transactions on Pattern Analysis and Machine Intelligence 28 (2) (2006) 316–322. [4] J. Liu, N. Zheng, Gait history image: a novel temporal template for gait recognition, in: Proceedings of IEEE Conference on Multimedia and Expo, 2007, pp. 663–666. [5] Q. Ma, S. Wang, D. Nie, J. Qiu, Recognizing humans based on gait moment image, in: Eighth ACIS International Conference on SNPD, 2007, pp. 606–610. [6] X. Yang, Y. Zhou, T. Zhang, G. Shu, J. Yang, Gait recognition based on dynamic region analysis, Signal Processing 88 (9) (2008) 2350–2356. [7] C. Chen, J. Liang, H. Zhao, H. Hu, J. Tian, Frame difference energy image for gait recognition with incomplete silhouettes, Pattern Recognition Letters 30 (11) (2009) 977–984. [8] E. Zhang, Y. Zhao, W. Xiong, Active energy image plus 2DLPP for gait recognition, Signal Processing 90 (7) (2010) 2295–2302. [9] A. Kale, A. Sundaresan, A.N. Rajagopalan, N.P. Cuntoor, A.K. Roy-Chowdhury, V. Kruger, R. Chellappa, Identification of humans using gait, IEEE Transactions on Image Processing 13 (2004) 1163–1173. [10] S. Sarkar, P.J. Phillips, Z. Liu, I. Robledo-Vega, P. Grother, K.W. Bowyer, The human ID gait challenge problem: data sets, performance, and analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence 27 (2) (2005) 162–177. [11] S. Lee, Y. Liu, R. Collins, Shape variation-based frieze pattern for robust gait recognition, in: Proceedings of IEEE Conference on CVPR, 2007, pp. 1–8. [12] L.R. Rabiner, A tutorial on hidden Markov models and selected applications in speech recognition, Proceedings of the IEEE, vol. 77(2), 1989, pp. 257–286.



[13] R. Gross, J. Shi, The CMU Motion of Body (MoBo) Database, Technical Report CMU-RI-TR-01-18, Robotics Institute, Carnegie Mellon University, 2001. [14] R. Collins, R. Gross, J. Shi, Silhouette-based human identification from body shape and gait, in: International Conference on Automatic Face and Gesture Recognition, 2002, pp. 351–356. [15] A. Veeraraghavan, A.R. Chowdhury, R. Chellappa, Role of shape and kinematics in human movement analysis, in: Proceedings of the IEEE Conference on CVPR, 2004.

[16] L. Lee, W. Grimson, Gait analysis for recognition and classification, in: Proceedings of the International Conference on Automatic Face and Gesture Recognition, 2002, pp. 155–162. [17] C. BenAbdelkader, R.G. Cutler, S. Davis, Gait recognition using image self-similarity, EURASIP Journal on Applied Signal Processing 2004 (4) (2004) 572–585. [18] M. Turk, A. Pentland, Eigenfaces for recognition, Journal of Cognitive Neuroscience 3 (1) (1991) 71–86.