Accepted Manuscript

Fusion of spatial-temporal and kinematic features for gait recognition with deterministic learning

Muqing Deng, Cong Wang, Fengjiang Cheng, Wei Zeng

PII: S0031-3203(17)30056-0
DOI: 10.1016/j.patcog.2017.02.014
Reference: PR 6050

To appear in: Pattern Recognition

Received date: 3 September 2016
Revised date: 10 January 2017
Accepted date: 8 February 2017

Please cite this article as: Muqing Deng, Cong Wang, Fengjiang Cheng, Wei Zeng, Fusion of spatial-temporal and kinematic features for gait recognition with deterministic learning, Pattern Recognition (2017), doi: 10.1016/j.patcog.2017.02.014

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Highlights

• We present a gait recognition method based on the fusion of different features.
• Spatial-temporal and kinematic features can be fused for human identification.
• We show good recognition performance on five widely used gait databases.
Fusion of spatial-temporal and kinematic features for gait recognition with deterministic learning
Muqing Deng^a, Cong Wang^a,*, Fengjiang Cheng^a, Wei Zeng^b

^a College of Automation, South China University of Technology, Guangzhou 510640, Guangdong, China
^b College of Mechanical & Electrical Engineering, Longyan University, Longyan 364000, Fujian, China
Abstract
For obtaining optimal performance, as many informative cues as possible should be involved in the gait recognition algorithm. This paper describes a gait recognition algorithm that combines spatial-temporal and kinematic gait features. For each walking sequence, the binary silhouettes are characterized by four time-varying spatial-temporal parameters: three lower-limb silhouette widths and the holistic silhouette area. Using a deterministic learning algorithm, spatial-temporal gait features are represented as the gait dynamics underlying the trajectories of the lower-limb silhouette widths and the holistic silhouette area, which implicitly reflect the temporal changes of silhouette shape. In addition, a model-based method is proposed to extract the joint-angle trajectories of the lower limbs. Kinematic gait features are represented as the gait dynamics underlying the trajectories of the joint angles, which capture the temporal changes of body structure and dynamics. Both spatial-temporal and kinematic cues can be used separately for gait recognition using the smallest-error principle. They are fused at the decision level using different combination rules to improve the gait recognition performance. The fusion of the two different kinds of features provides a comprehensive characterization of gait dynamics that is not sensitive to variations in walking conditions: the proposed method can still achieve superior performance when the test walking conditions differ from the corresponding training conditions. Experimental results show that encouraging recognition accuracy can be achieved on five public gait databases: CASIA-B, CASIA-C, TUM GAID, OU-ISIR, and USF HumanID.

* Corresponding author. Tel.: +86 20 87114615; fax: +86 20 87114612. E-mail address: [email protected] (C. Wang).

Preprint submitted to Pattern Recognition, February 10, 2017
Keywords: Gait recognition, Gait dynamics, Deterministic learning, Spatial-temporal features, Kinematic features
1. Introduction
Since the September 11th attacks, the demand for automatic human identification has been growing strongly, especially for noncontact human identification at a distance. In security-sensitive environments (e.g. railway stations, airports and banks), it is desirable to detect threats quickly, and biometrics is a suitable, powerful tool for reliable human identification [1].

1.1. Motivation of gait recognition

As a new behavioral biometric, gait recognition aims at identifying people by the way they walk. Compared with other widely used biometrics, the main characteristics of gait recognition lie in the following aspects:

1. Gait is unique. From a biomechanics perspective, gait is unique for each person if all the properties of body structure, the synchronized and integrated movements of body parts, and the interactions among them are considered. The potential of gait for automatic human identification is supported by a rich literature [2].

2. Gait is noncontact. First-generation biometrics, such as face, fingerprint and iris, are restricted to controlled environments and usually require physical touch or proximal sensing. In contrast, gait has the prominent advantages of being non-contact, non-invasive and unobtrusive. Gait can be collected covertly and does not require the subject's cooperation [3].

3. Gait can be collected at a distance. Biometrics such as fingerprint and iris usually require sensing the subject at close range, and at a distance they are no longer applicable. Fortunately, gait still works in this case, even in low-resolution imagery. This makes gait ideal for long-distance security and surveillance applications [4].

As stated above, gait has many advantages, making it very attractive for human identification at a distance and for applications in video surveillance.

1.2. Related work

Existing gait recognition methods mainly fall into two categories: model-based methods and silhouette-based methods [5].

Model-based methods model the human body and its motion from gait sequences. Kinematic characteristics of walking are then extracted from the model components and used as features for classification. Cunado et al.
[6] proposed an early gait-pendulum model and achieved model-based gait recognition. Nixon et al. [7] developed a stick model and calculated walking
kinematic characteristics without directly analyzing gait sequences. Mu and Wu [8] presented a five-link bipedal walking model. More recently, techniques based on activity-specific static body parameters [9] and on deterministic learning with a five-link model [10] were developed for model-based gait recognition.
Silhouette-based methods operate directly on the gait sequences without any specific model. Gait characteristics are implicitly reflected by the holistic appearance of the walking individual. Phillips et al. [11] used silhouette features to establish a baseline recognition algorithm. Han et al. [12] characterized the human gait pattern with the Gait Energy Image (GEI), obtained by averaging the silhouettes over one gait period. Alpha-GEI, an enhanced version of GEI, was proposed by Hofmann et al. [13] to mitigate nonrandom noise. Makihara et al. [14] extracted individuality-preserving silhouettes for gait recognition. Matovski et al. [15] improved the segmentation processing by using quality metrics for automatic gait recognition. More recently, techniques based on the gait entropy image (GEnI) [16] and the chrono gait image (CGI) [17] were developed for silhouette-based gait recognition.

The most commonly used gait features, according to human gait theory, can be roughly divided into two categories: spatial-temporal parameters and kinematic parameters [18]. Generally speaking, spatial-temporal parameters are intuitive gait features such as stride length, step length and silhouette width. Kinematic parameters are usually characterized by the joint angles between body segments and the joint motion over the gait cycle [18]. In [19], step length and speed were extracted as spatial-temporal parameters
to perform the gait recognition task. In [20], lower-limb angles were extracted as kinematic parameters. In [21], step length, cycle time, speed and angle-based kinematic parameters were combined as gait features. Vertical distance features (VDF) were developed by Ahmed et al. [22]. Chattopadhyay et al. [23] attempted to combine relative distance features and joint velocity to achieve better performance.

In our previous works [10, 24], the potential of the spatial-temporal and the kinematic parameters for gait recognition was investigated separately. In [10], the dynamics along the phase portrait of joint angles versus angular velocities were captured to achieve model-based gait recognition. In [24], the dynamics along the trajectories of silhouette width features were captured to achieve silhouette-based gait recognition. The experimental results indicated that, for the purpose of gait recognition, the amount of discriminability provided by the dynamics of the silhouette features is similar (or equivalent) to that provided by the dynamics of kinematic parameters such as joint angles and/or angular velocities [24]. However, the combined use of silhouette spatial-temporal features and kinematic parameters has not yet been investigated in our experiments.

For obtaining optimal performance, as many informative cues as possible should be involved in the gait recognition algorithm. Based on this assumption, in this paper we attempt to fuse two completely different sources of information, spatial-temporal and kinematic parameters, for human identification.
Fig. 1. Overall work flow of the proposed method.
1.3. Outline of the proposed method

The proposed method is schematically shown in Fig. 1. For each gait sequence, lower-limb silhouette widths and the holistic silhouette area are extracted as spatial-temporal parameters, and lower-limb joint angles are extracted as kinematic parameters. Spatial-temporal gait features are then calculated using the deterministic learning algorithm and represented as the dynamics along the trajectories of the lower-limb silhouette widths and the holistic silhouette area. Additionally, kinematic gait features are extracted and represented as the dynamics along the trajectories of the four lower-limb joint angles. These two kinds of gait features reflect the temporal change of body pose and walking motion between consecutive frames in two completely different aspects, while preserving the temporal dynamics information of human walking. Both spatial-temporal and kinematic information can be used independently for recognition using the smallest-error principle. They are also combined at the decision level for better recognition performance.
Fig. 2. Flowchart of the spatial-temporal feature extraction process.
2. Spatial-temporal feature extraction

As schematically shown in Fig. 2, spatial-temporal parameters are extracted from each gait sequence; spatial-temporal gait features are then calculated using the deterministic learning algorithm and represented as the dynamics along the trajectories of the spatial-temporal parameters. In the deterministic learning algorithm, identification of the nonlinear gait system dynamics is achieved through the following elements: (a) employment of localized radial basis function (RBF) networks; (b) satisfaction of a partial persistent excitation condition along the periodic or recurrent orbit; (c) exponential stability of the adaptive system; (d) locally-accurate neural network approximation of the unknown gait dynamics.

2.1. Silhouette extraction and representation

An important cue for determining the underlying gait information of the walking procedure is the temporal change of silhouette shape. Using the background subtraction method, silhouettes in each walking sequence are first extracted [25]. Then, holes are filled and noise is removed using the mathematical morphology method. Edge images are obtained by applying a Canny operator with hysteresis thresholding. A dilation and erosion procedure is adopted, and the body silhouette is finally determined. A bounding box is placed around the silhouette, and the silhouettes are resized to the same height. The whole silhouette extraction process is shown in Fig. 3.

Fig. 3. Illustration of silhouette extraction. (a) Background image. (b) Original image. (c) Segmented regions. (d) Smoothed segmented regions after morphological processing. (e) Silhouette with bounding box. (f) Silhouette contour.
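As a rough illustration, the extraction pipeline above can be sketched in plain NumPy. This is a minimal sketch under simplifying assumptions, not the authors' implementation: the Canny edge step is omitted, the morphological cleanup is reduced to a single 3×3 opening, and the threshold value is an arbitrary placeholder.

```python
import numpy as np

def _erode(img):
    """3x3 binary erosion (zero padding at the border)."""
    p = np.pad(img, 1)
    out = np.ones_like(img)
    for dy in range(3):
        for dx in range(3):
            out &= p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out

def _dilate(img):
    """3x3 binary dilation (zero padding at the border)."""
    p = np.pad(img, 1)
    out = np.zeros_like(img)
    for dy in range(3):
        for dx in range(3):
            out |= p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out

def extract_silhouette(frame, background, thresh=30, target_height=64):
    """Background subtraction -> binary silhouette -> 3x3 opening ->
    bounding box -> nearest-neighbor resize to a common height."""
    diff = np.abs(frame.astype(np.int32) - background.astype(np.int32))
    binary = (diff > thresh).astype(np.uint8)
    binary = _dilate(_erode(binary))          # remove isolated noise pixels
    ys, xs = np.nonzero(binary)
    crop = binary[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    h, w = crop.shape
    new_w = max(1, round(w * target_height / h))
    row_idx = np.arange(target_height) * h // target_height
    col_idx = np.arange(new_w) * w // new_w
    return crop[np.ix_(row_idx, col_idx)]     # height-normalized silhouette
```

In practice one would substitute a proper edge detector and tuned morphological structuring elements for the simple opening used here.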
The width of the silhouette, which has been shown to be a good silhouette representation [26, 24], is used in this study. Defined as the distance between the left and right extremities of the silhouette, width parameters implicitly capture structural as well as dynamical information of gait. We divide the gait silhouette into four equal regions: subregions 1, 2, 3 and 4, as shown in Fig. 4. Let (X, Y) be the set of pixel points in the silhouette image, where X denotes the row index and Y denotes
Fig. 4. Width parameters extraction.
the width along that row. H is the height of the silhouette. Obviously, the width can be calculated as the difference between the leftmost and rightmost boundary pixels in that row. Y_X^L and Y_X^R denote the Y-coordinates of the leftmost and rightmost pixel points in the Xth row, respectively. Among different width parameters, we empirically select the median width of the holistic silhouette (W_d^1) and the median widths of subregions 3 and 4 (W_d^2, W_d^3) as the spatial-temporal parameters for later analysis:

    W_d^1 = median(Y_X^R − Y_X^L)|_{X ∈ [0, H]}          (1)

    W_d^2 = median(Y_X^R − Y_X^L)|_{X ∈ [H/2, 3H/4]}     (2)

    W_d^3 = median(Y_X^R − Y_X^L)|_{X ∈ [3H/4, H]}       (3)
where d indexes the dth silhouette frame.

The silhouette area reflects the periodic nature of the spatial silhouette contours and is selected as one of the spatial-temporal parameters as well. It is minimal when the two feet are aligned together and maximal when they are the farthest apart. The silhouette area, denoted A_d, is calculated by counting the number of pixels in the silhouette, where d again denotes the dth frame.

Fig. 5. (a) Median width of the holistic silhouette; (b) median width of subregion 3; (c) median width of subregion 4; (d) silhouette area.

To comply with the silhouette representation requirements, we tried different schemes and finally selected the four most appropriate spatial-temporal parameters: the median width of the holistic silhouette W_d^1 reflects the holistic changes of silhouette shape; the median widths of the lower-limb regions W_d^2, W_d^3 give structural as well as dynamical information of gait; and the area A_d gives size information of the silhouette and reflects the periodicity of human gait. These four spatial-temporal parameters reflect the dynamics of the gait silhouette in different aspects. Fig. 5a-d show the curves of W_d^1, W_d^2, W_d^3 and A_d for one walking sequence.
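For concreteness, the four parameters of Eqs. (1)-(3) plus the area can be computed from one binary silhouette frame as follows. This is an illustrative sketch rather than the authors' code; it assumes a height-normalized binary silhouette as produced in the extraction step above.

```python
import numpy as np

def spatial_temporal_params(silhouette):
    """Four spatial-temporal parameters of one binary silhouette frame:
    Wd1 (median holistic width), Wd2 and Wd3 (median widths of lower-limb
    subregions 3 and 4), and Ad (silhouette area in pixels)."""
    H = silhouette.shape[0]
    widths = np.zeros(H)
    for x in range(H):
        cols = np.nonzero(silhouette[x])[0]
        if cols.size:
            # rightmost minus leftmost boundary pixel in row x
            widths[x] = cols[-1] - cols[0]
    wd1 = np.median(widths)                      # holistic: rows [0, H]
    wd2 = np.median(widths[H // 2: 3 * H // 4])  # subregion 3: [H/2, 3H/4]
    wd3 = np.median(widths[3 * H // 4:])         # subregion 4: [3H/4, H]
    area = int(silhouette.sum())                 # Ad: pixel count
    return wd1, wd2, wd3, area
```

Applied frame by frame, this yields the four time series whose trajectories are modeled in Section 2.2.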
2.2. Spatial-temporal signature acquisition

Our method uses these spatial-temporal parameters from the gait sequences to determine the underlying gait dynamics of the walking procedure as a spatial-temporal signature, which implicitly represents the temporal changes of silhouette shape.

The gait dynamics can be represented by the following equation:

    ẋ = F(x; p) + v(x; p),  x(t_0) = x_0          (4)

where x = [x_1, ..., x_4]^T ∈ R^4 is the state vector, representing the four spatial-temporal parameters; p is a constant system parameter vector; F(x; p) represents the gait dynamics; and v(x; p) represents the modeling uncertainty. φ(x; p) = F(x; p) + v(x; p) is defined as the general gait dynamics. The following summarizes the main steps in determining the gait dynamics φ(x; p) using the deterministic learning algorithm. Parameter adjustment is conducted empirically.

First, localized RBF neural networks are constructed:

    f_nn(Z) = W^T S(Z) = Σ_{i=1}^{N} w_i s_i(Z)          (5)

where Z is the input vector of spatial-temporal parameters, W represents the network weights, and s_i(·) is a radial basis function. The Gaussian function s_i(‖Z − ξ_i‖) = exp[−(Z − ξ_i)^T (Z − ξ_i)/η_i^2], i = 1, ..., N, is used in this paper, where the ξ_i (i = 1, ..., N) are distinct points in the state space. The network is constructed on a regular lattice, with its centers ξ_i evenly spaced on [−1, 1] × [−1, 1] × [−1, 1] × [−1, 1], node-to-node width η = 0.15, and number of nodes N = 83,521.

Second, a dynamical RBF model is used to model the gait dynamics:

    x̂̇_i = −a_i(x̂_i − x_i) + Ŵ_i^T S_i(x),  i = 1, ..., n          (6)
where x̂ = [x̂_1, ..., x̂_n]^T is the state vector of the model and x is the vector of input parameters. Ŵ_i^T S_i(x) is the localized RBF network approximating the unknown general gait dynamics, and Ŵ_i denotes the estimate of the optimal weights. The design constants are a_i = 0.5.

Third, the weights are updated by the following law:

    Ŵ̇_i = W̃̇_i = −Γ_i S_i(x) x̃_i − σ_i Γ_i Ŵ_i          (7)

where x̃_i = x̂_i − x_i, W̃_i = Ŵ_i − W_i^*, and W_i^* is the optimal constant weight vector. In this paper, Γ_i = diag{1.5, 1.5, 1.5, 1.5} and σ_i = 10 (i = 1, ..., 4). The derivative of the state estimation error x̃_i satisfies

    x̃̇_i = −a_i x̃_i + Ŵ_i^T S_i(x) − φ_i(x; p) = −a_i x̃_i + W̃_i^T S_i(x) − ε_i          (8)

Based on the convergence result for Ŵ_i [27], we can obtain a constant vector of neural weights according to W̄_i = mean_{t ∈ [t_a, t_b]} Ŵ_i(t), where t_b > t_a > 0 delimit a time segment after the transient process. Therefore, accurate modeling of the general gait system dynamics φ_i(x; p) is achieved along the parameter trajectory by using W̄_i:

    φ_i(x; p) = W̄_i^T S_i(x) + ε_{i2}          (9)

where ε_{i2} = O(ε_{i1}) is the practical approximation error. Hence, the dynamics φ_i(x; p) underlying the spatial-temporal parameters can be accurately modeled in a time-invariant manner via deterministic learning.

Compared with static feature methods, deterministic learning theory excels at capturing the dynamics information underlying the temporal features, from which more in-depth information can be discovered [28]. The spatial-temporal signature can be represented as a time-invariant matrix
Fig. 6. Flowchart of the kinematic feature extraction process.
[W̄_1, W̄_2, W̄_3, W̄_4]. In the training phase, we acquire the spatial-temporal signature matrices of different subjects to constitute a spatial-temporal template library.
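The signature-acquisition steps of Eqs. (5)-(9) can be sketched as follows for a single state dimension, using a simple Euler discretization. The gains follow the values quoted in the text, but everything else is an illustrative assumption: a deliberately coarse 2-D lattice of 5×5 RBF nodes stands in for the 83,521-node 4-D grid, and the step size and trajectory are toy choices.

```python
import numpy as np

def make_lattice(lo=-1.0, hi=1.0, nodes_per_dim=5, dim=2):
    """Regular lattice of RBF centers on [lo, hi]^dim (cf. Eq. (5))."""
    axes = [np.linspace(lo, hi, nodes_per_dim)] * dim
    grids = np.meshgrid(*axes, indexing="ij")
    return np.stack([g.ravel() for g in grids], axis=1)  # (N, dim)

def rbf(z, centers, eta=0.15):
    """Gaussian radial basis activations s_i(||z - xi_i||)."""
    d2 = ((z - centers) ** 2).sum(axis=1)
    return np.exp(-d2 / eta ** 2)

def identify(x_traj, centers, a=0.5, gamma=1.5, sigma=10.0, dt=0.01):
    """Euler-discretized estimator (Eq. (6)) and weight update (Eq. (7))
    along a measured trajectory x(t); returns the time-averaged weights
    W-bar of Eq. (9), taken over the second half of the run."""
    N = centers.shape[0]
    W = np.zeros(N)
    x_hat = x_traj[0, 0]          # track the first state component only
    history = []
    for t in range(len(x_traj) - 1):
        s = rbf(x_traj[t], centers)
        x_tilde = x_hat - x_traj[t, 0]
        x_hat += dt * (-a * x_tilde + W @ s)                 # Eq. (6)
        W += dt * (-gamma * s * x_tilde - sigma * gamma * W)  # Eq. (7)
        history.append(W.copy())
    history = np.array(history)
    return history[len(history) // 2:].mean(axis=0)           # W-bar
```

Only the weights of neurons whose centers lie near the input orbit receive meaningful updates, which is the locality property the paper relies on.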
3. Kinematic feature extraction

As schematically shown in Fig. 6, a five-link biped model is selected and the dynamics of the biped model are derived. Because the gait dynamics are dominated by the joint-angle term, four lower-limb joint angles are extracted and selected as kinematic parameters. Kinematic gait features are then calculated using the deterministic learning algorithm.

3.1. Model construction and representation

Similar to [8], the human walking model used in this paper is composed of rigid parts: the trunk, the pelvis, the thighs, the shanks and the torso. Each part is considered rigid, with movement only allowed at the joint positions, as shown in Fig. 7.

Fig. 7. A five-link biped model [8].

θ_i (i = 1, ..., 5) is the absolute angle between the ith link and the vertical direction. A gait cycle is usually composed of two phases: the single support phase (SSP) and the double support phase (DSP). Since the time period of the DSP is very short while the SSP lasts much longer, the dynamics of the biped model during the SSP are more suitable for representing gait dynamics than those during the DSP; the DSP can be considered a boundary state of the SSP [10]. The dynamics of the biped model during the SSP are then derived in the following Lagrangian equation:
    D(θ)θ̈ + H(θ)θ̇² + G(θ) = T_θ          (10)

where θ = [θ_1, θ_2, θ_3, θ_4, θ_5]^T; D(θ) is the 5×5 positive definite, symmetric inertia matrix; H(θ) is the 5×5 matrix of centrifugal and Coriolis terms; and G(θ), T_θ, θ, θ̇, θ̈ are the 5×1 vectors of gravity terms, generalized torques, and the generalized coordinates, velocities and accelerations, respectively (more details can be
found in [8]).

Let ω = θ̇ = [θ̇_1, θ̇_2, θ̇_3, θ̇_4, θ̇_5]^T; Eq. (10) can then be transformed into the following form:

    θ̇ = ω,  ω̇ = f(θ) + g(θ, ω)          (11)

where f(θ) = D(θ)^{-1}(T_θ − G(θ)) and g(θ, ω) = −D(θ)^{-1}H(θ)ω².
According to the physical parameters in the simulation of [8], we can obtain numerical simulation results for f(θ) and g(θ, ω). As shown in Fig. 8, ‖f(θ)‖ ≫ ‖g(θ, ω)‖. It is obvious that f(θ) dominates the dynamics of the biped model. Gait dynamics, therefore, can be approximately represented by the function f(θ) = D(θ)^{-1}(T_θ − G(θ)) along the phase portrait of θ. Hence, the joint angles are selected as the kinematic parameters, and the gait dynamics can be simplified to be related to the state variables of the joint angles.
Directly extracted from human gait sequences, joint angles reflect the kinematic characteristics of the walking manner. For the sake of reducing computational cost, we assume that the biped model walks with its torso maintained in an upright position, that is, θ_3 = 0, θ̇_3 = 0. Hence, the gait dynamics can be further simplified to be related to the four lower-limb joint angles θ_1, θ_2, θ_4, θ_5. These four joint angles can be obtained by using the body segment properties in [7, 10], and they reflect the gait dynamics in the kinematic aspect. Fig. 9 and Fig. 10 show examples of joint positioning and joint angle computation from image sequences.
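Once the 2-D joint positions have been located via the body segment proportions, the absolute link angles of Fig. 7 can be computed as the angle between each segment and the vertical. The following is a minimal sketch under an assumed image coordinate convention (rows increase downward); it is not the authors' exact procedure.

```python
import numpy as np

def segment_angle(p_upper, p_lower):
    """Absolute angle (radians) between the segment p_upper -> p_lower and
    the vertical direction, following the theta_i convention of Fig. 7.
    Points are (row, col) image coordinates with rows increasing downward."""
    d_row = p_lower[0] - p_upper[0]   # vertical drop along the segment
    d_col = p_lower[1] - p_upper[1]   # lateral offset of the segment
    return np.arctan2(d_col, d_row)

def lower_limb_angles(hip, knee, ankle):
    """Thigh angle (hip -> knee) and shank angle (knee -> ankle) of one leg."""
    return segment_angle(hip, knee), segment_angle(knee, ankle)
```

A vertical thigh gives angle 0, and positive angles correspond to forward lean of the segment, matching the sign convention of the five-link model.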
Fig. 8. Numerical simulation of f(θ) = [f_1, f_2, f_3, f_4, f_5]^T and g(θ, ω) = [g_1, g_2, g_3, g_4, g_5]^T.
Fig. 9. Example of joint positioning. (a) Binary silhouette; (b) edge image; (c) bounding box; (d) joint positioning, where "×" marks the joint positions.
Fig. 10. Joint angle computation from gait sequences. (a) Thigh angle computation; (b) knee angle computation.
3.2. Kinematic signature acquisition

Our method uses these four joint angles from the gait sequences to determine the underlying gait dynamics as a kinematic signature, which represents the temporal changes of body structure and dynamics.

The process of signature acquisition via deterministic learning is similar to the process described in Section 2.2 and is omitted here for conciseness. Deterministic learning theory is capable of capturing the dynamics information underlying the temporal kinematic parameters. Similarly, the kinematic signature can be represented as a time-invariant matrix [W̄_5, W̄_6, W̄_7, W̄_8]. The kinematic signature matrices acquired from the different subjects in the training phase constitute a kinematic template library for later recognition. Fig. 11 presents an example of neural network construction on a regular lattice, and Fig. 12 shows the convergence of the neural weights during kinematic signature acquisition.
Fig. 11. Schematic of the neural network computation: (a) construct neural networks on a regular lattice with node-to-node width η = 0.15; (b) input the kinematic parameter trajectories to the network.

Fig. 12. Partial parameter convergence of Ŵ_5 and Ŵ_6 during kinematic signature acquisition. Only the weights of neurons whose centers are close to the orbit are activated and updated; the weights of neurons whose centers are far away from the orbit are not activated and remain almost unchanged.
4. Recognition scheme and fusion rules

As a traditional pattern recognition problem, gait recognition in this paper is achieved by measuring the similarity between the training gait signature matrices W̄^training and the test signature matrices W̄^test. Here we use the smallest-error principle. The following summarizes the main steps in recognizing a test gait sequence using this principle.

First, a bank of M estimators is constructed for the trained sequences by using the learned knowledge obtained in the training phase:

    χ̄̇^k = −B(χ̄^k − x) + W̄_k^T S(x)          (12)

where k = 1, ..., M indexes the kth estimator, χ̄^k = [χ̄_1^k, ..., χ̄_4^k]^T is the state of the estimator, x is the state of the input test gait sequence, and B = diag[b_1, ..., b_n] is a diagonal matrix that is set the same for all estimators, i.e., B = diag[−25, −25, −25, −25].

Second, by comparing the test sequence with the set of M estimators, the recognition error systems are obtained as follows:

    χ̃̇_i^k = −b_i χ̃_i^k + W̄_i^{kT} S_i(x) − W̄_i^T S_i(x),  i = 1, ..., 4, k = 1, ..., M          (13)

where χ̃_i^k = χ̄_i^k − x_i is the state estimation (or synchronization) error, b_i is set to −25 in this paper, and W̄_i^T stands for the learned knowledge obtained from the test sequence.

Third, the average L1 norm of the error χ̃_i^k(t) is computed:

    ‖χ̃_i^k(t)‖_1 = (1/T_c) ∫_{t−T_c}^{t} |χ̃_i^k(τ)| dτ,  t ≥ T_c          (14)

where T_c = 1.2 s is the human gait cycle.

If there exists a finite time t_s, s ∈ {1, ..., M}, and some i ∈ {1, ..., n} such that ‖χ̃_i^s(t)‖_1 < ‖χ̃_i^k(t)‖_1 for all k ≠ s and all t > t_s, that is, the corresponding error ‖χ̃_i^s(t)‖_1 becomes the smallest among all the errors ‖χ̃_i^k(t)‖_1, then the appearing person is recognized as subject s.
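A discrete-time sketch of this estimator-bank scheme (Eqs. (12)-(14)) is given below. It assumes the RBF activations S(x(t)) have been precomputed for the test sequence, uses a stable Euler update with gain b = 25 (one reading of the sign convention above), and treats all array shapes and step sizes as illustrative assumptions rather than the authors' settings.

```python
import numpy as np

def recognize(x_traj, s_traj, W_bank, b=25.0, dt=0.01, cycle=1.2):
    """Smallest-error recognition over a bank of trained estimators.
    x_traj: (T, n) test states; s_traj: (T, N) RBF activations S(x(t));
    W_bank: (M, N, n) learned weight matrices, one per trained subject.
    Returns the index of the subject with the smallest average L1 error."""
    M = W_bank.shape[0]
    T, n = x_traj.shape
    chi = np.tile(x_traj[0], (M, 1))      # estimator states, shape (M, n)
    window = max(1, int(cycle / dt))      # samples in one gait cycle T_c
    errs = np.zeros((T - 1, M, n))
    for t in range(T - 1):
        pred = np.einsum("mij,i->mj", W_bank, s_traj[t])   # W_k^T S(x)
        chi += dt * (-b * (chi - x_traj[t]) + pred)        # Eq. (12)
        errs[t] = np.abs(chi - x_traj[t + 1])              # sync errors
    # average L1 error over the last gait cycle (Eq. (14)), summed over states
    avg = errs[-window:].mean(axis=0).sum(axis=1)
    return int(np.argmin(avg))            # smallest-error subject index
```

The estimator matching the true subject tracks the test trajectory with near-zero residual, while mismatched weights leave a persistent bias in the synchronization error.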
There is no doubt that more sophisticated classifiers could be used, but the primary interest in this paper is to evaluate the discriminative ability of the fusion of spatial-temporal and kinematic features. The recognition results (scores) obtained from each feature scheme may have different ranges or distributions and must therefore be transformed to a comparable range before fusion. The logistic function e^{α+βx}/(1 + e^{α+βx}) in [4] can be used at this preprocessing stage. In this paper, we investigate the rank-summation-based, score-summation-based, max, min, mean and product rules for classifier combination [29, 30]. If the input to the jth classifier (j = 1, ..., R) is x_j and the winning label is l, the aforementioned rules are given as follows:

The rank-summation-based rule: l = arg min_k Σ_{j=1}^{R} r(w_k, R_j), where r(w_k, R_j) is the rank of class w_k in the ranking R_j of the jth classifier.
The score-summation-based rule: l = arg min_k Σ_{j=1}^{R} s(w_k, S_j), where s(w_k, S_j) is the error score of class w_k from the jth classifier.
The max rule: l = arg max_k max_j p(w_k | x_j).
The min rule: l = arg max_k min_j p(w_k | x_j).
The mean rule: l = arg max_k (1/R) Σ_{j=1}^{R} p(w_k | x_j).
The product rule: l = arg max_k Π_{j=1}^{R} p(w_k | x_j).
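The normalization and combination rules above can be sketched compactly as follows (the sum and mean rules select the same label, so they share a branch). The posterior matrix layout and the rank computation are illustrative assumptions:

```python
import numpy as np

def logistic(x, alpha=0.0, beta=1.0):
    """Logistic score normalization e^(a+bx) / (1 + e^(a+bx))."""
    z = alpha + beta * np.asarray(x, dtype=float)
    return np.exp(z) / (1.0 + np.exp(z))

def fuse(posteriors, rule="sum"):
    """posteriors: (R, C) matrix with p[j, k] ~ p(w_k | x_j) for classifier j.
    Returns the winning class index under the chosen combination rule."""
    p = np.asarray(posteriors, dtype=float)
    if rule == "max":
        return int(np.argmax(p.max(axis=0)))
    if rule == "min":
        return int(np.argmax(p.min(axis=0)))
    if rule in ("mean", "sum"):
        return int(np.argmax(p.sum(axis=0)))
    if rule == "product":
        return int(np.argmax(p.prod(axis=0)))
    if rule == "rank":
        # per-classifier rank of each class (0 = best), summed across classifiers
        ranks = np.argsort(np.argsort(-p, axis=1), axis=1)
        return int(np.argmin(ranks.sum(axis=0)))
    raise ValueError(rule)
```

In the decision-level fusion here, each of the two feature channels contributes one row of normalized scores, i.e. R = 2.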
5. Experiments

In this section, five widely used gait databases are used to evaluate the performance of the proposed method: 1) CASIA gait database B; 2) CASIA gait database C; 3) the TUM GAID gait database; 4) the OU-ISIR treadmill gait database B; and 5) the USF HumanID database.

5.1. Experiments on CASIA-B gait database
This section reports experimental results on the CASIA-B database [31], which includes 124 different subjects (93 males and 31 females) with variations in walking status (normal, in a coat, or with a bag). There are 6 normal walking sequences, 2 sequences walking in a coat, and 2 sequences walking with a bag for each subject. All subjects walk along a straight line under 11 different view angles. Only sequences in the lateral view (view angle 90°) are used in this section (Fig. 13).
Fig. 13. Sample images in the CASIA-B gait database: (a) normal walking; (b) walking in a coat; (c) walking with a bag.
5.1.1. Recognition accuracy on CASIA-B gait database with no walking variations

In our experiments, we first extract spatial-temporal features as described in Section 2. Additionally, we perform the joint positioning and joint angle computation and extract kinematic features as described in Section 3. Two types of experiments are carried out on this database.

The first type is recognition with no walking variations; that is, both training and test patterns are under normal walking conditions. A total of
Fig. 14. Recognition performance on the CASIA-B gait database: (a) results using a single modality; (b) results using the rank-summation-based, score-summation-based, product, sum, max and min combination rules.
124 × 6 = 744 sequences/patterns are involved, and leave-one-out cross-validation is employed. That is, we leave one of the 744 patterns out, train on the remainder, and then verify the omitted element according to its similarities with respect to the remaining examples. The recognition performance is reported in terms of the correct classification rate (CCR) and cumulative match characteristics (CMC). We first use the spatial-temporal and kinematic features separately for recognition, and then evaluate the performance after fusing both kinds of features with the combination rules described in Section 4. Fig. 14 (a) and (b) show the recognition results (for ranks up to 5).

It can be seen that (1) the discriminability provided by the dynamics of the spatial-temporal features is similar or equivalent to the dis-
Table 1. Recognition performance (%) on the CASIA-B gait database under changes of carrying or clothing condition.

Probe-gallery                  nm-nm             bg-nm             cl-nm
                           rank=1  rank=5    rank=1  rank=5    rank=1  rank=5
Spatial-temporal features     93      97        90      97        86      95
Kinematic features            91      98        88      98        89      94
Fusion(Rank-summation)        94     100        93      98        89      92
Fusion(Score-summation)       96     100        92     100        93     100
Fusion(Sum)                   96     100        94     100        92     100
Fusion(Max)                   95     100        89     100        89      99
Fusion(Min)                   92      97        90     100        89      96
Fusion(Product)               92      98        83      90        82      95
criminability provided by the dynamics of the kinematic features, and (2) the results using feature fusion are better than those using any single modality. Another observation from the comparative results is that the sum rule outperforms the other rules for gait recognition, which is consistent with the findings in [32].
5.1.2. Recognition accuracy on CASIA-B gait database under changes of clothing and carrying condition

The second type of experiment is recognition with walking variations. We assign the normal walking (nm) sequences to the training set and the walking sequences with a bag (bg) or with changed clothes (cl) to the test set. In this setting, training and test sequences are under different walking conditions. The whole process is similar to Section 5.1.1 and is therefore omitted for conciseness. Table 1 shows the recognition results.

Our method is not sensitive to changes of clothing and carrying condition,
Table 2. Comparisons with other existing methods on the CASIA-B gait database (rank=1), with the CCR (%) obtained in lateral views.

Probe-gallery (%)          nm-nm  bg-bg  cl-cl  bg-nm  cl-nm  bg-cl   Avg
LF + AVG [33]                71     63     61     13     20     12     40
LF + DTW [33]                62     18      -     21     25     27     25
LF + oHMM [33]               64     32     21     20     23      9     28
GEI + PCA + LDA [34]         91      4      4     44     23     18     30
GPPE [35]                    93     62     55     56     22     18     51
GEnI [36]                    92     65     55     56     27     19     52
Spatial-temporal features    93     90     93     90     86     79     89
Kinematic features           91     90     86     88     89     86     88
Fusion(Rank-summation)       94     90     89     93     89     88     91
Fusion(Score-summation)      96     94     90     92     93     88     92
Fusion(Sum)                  96     94     95     94     92     92     94
Fusion(Max)                  95     92     90     89     89     87     90
Fusion(Min)                  92     89     89     90     89     75     87
Fusion(Product)              92     90     87     83     82     79     86
M
because the proposed method makes full use of multiple-aspect feature information to dispel influence caused by different walking conditions. Morever,
ED
the proposed method captures the gait dynamics underlying gait parameters via deterministic learning algorithm, reflecting the temporal dynamics
PT
information of human walking. This kind of temporal dynamics information does not rely on shallow shape information, therefore, is robust to changes
CE
of walking condition.
5.1.3. Comparisons with other existing methods on the CASIA-B gait database
We further compare the proposed method with other existing methods on the CASIA-B gait database under two walking condition variations: carrying a bag and wearing a coat. Table 2 shows the comparison results. Experimental results show that the proposed method outperforms the other
methods in rank 1 recognition rate for the conditions of carrying a bag and wearing a coat. The proposed method does not need to know the exact walking condition of each gait sequence and has a strong tolerance to variations in carrying or wearing status. The condition variations have less effect on the proposed method, but greatly degrade the recognition results of the other methods. Compared with carrying a bag, wearing a coat affects gait more seriously, with a greater drop in rank 1 recognition rate. Among the six combination rules, the sum rule performs best in average recognition rate. The fusion of different gait parameters contains more discriminant information than any single modality. On the basis of multi-feature fusion, deterministic learning theory contributes to extracting the in-depth dynamics underlying the shallow gait parameters to build a recognition system that is robust against different walking condition variations.
5.2. Experiments on CASIA-C gait database

This paper further reports experimental results on the CASIA-C gait database [37], which consists of 153 different subjects (130 males and 23 females) recorded under different walking conditions at night. These walking variations include walking speeds, carrying conditions and illumination conditions: normal walking (nm), slow walking (sw), fast walking (fw) and normal walking with a bag (bw). There are 4 normal walking sequences, 2 slow walking sequences, 2 fast walking sequences and 2 walking-with-a-bag sequences for each subject (Fig. 15). We assign three nm sequences to the training set, and the remaining sequences (the remaining nm sequence, 2 sw sequences, 2 fw sequences and 2 bw sequences) to the test set. A comparison with other classical methods, i.e., Gait Curves [38], Normalized Dual-Diagonal Projections (NDDP) [39], Orthogonal Diagonal Projections
Table 3. Comparisons with other existing methods on the CASIA-C database (rank=1), with the CCR obtained in lateral views.

| Probe-gallery | nm-nm | sw-nm | fw-nm | bw-nm | Avg |
|---|---|---|---|---|---|
| Gait Curves [38] | 91 | 65 | 70 | 26 | 63 |
| NDDP [39] | 98 | 84 | 84 | 16 | 71 |
| ODP [40] | 98 | 80 | 80 | 16 | 69 |
| WPSR [41] | 93 | 83 | 85 | 20 | 70 |
| HDP [42] | 98 | 84 | 88 | 36 | 77 |
| AEI [43] | 89 | 89 | 90 | 80 | 87 |
| Pseudoshape [44] | 98 | 91 | 94 | 25 | 77 |
| WBP [45] | 99 | 86 | 90 | 81 | 89 |
| RSM [46] | 100 | 100 | 100 | 96 | 99 |
| Fusion(Sum) | 100 | 100 | 99 | 96 | 99 |
| Fusion(Max) | 100 | 92 | 93 | 90 | 94 |
| Fusion(Min) | 98 | 96 | 89 | 79 | 91 |
| Fusion(Product) | 100 | 95 | 93 | 90 | 95 |
| Fusion(Rank-summation) | 99 | 91 | 93 | 84 | 92 |
| Fusion(Score-summation) | 100 | 100 | 94 | 89 | 96 |
| Kinematic features | 96 | 94 | 90 | 89 | 92 |
| Spatial-temporal features | 95 | 91 | 93 | 82 | 90 |
(ODP) [40], Wavelet Packet Silhouette Representation (WPSR) [41], Horizontal Direction Projection (HDP) [42], Active Energy Image (AEI) [43], Pseudoshape [44], Weighted Binary Pattern (WBP) [45] and the Random Subspace Method (RSM) [46] is given on the CASIA-C gait database. Experimental results are illustrated in Table 3. It is shown that the proposed method still achieves promising performance across different walking speeds in an outdoor environment at night.

5.3. Experiments on TUM GAID gait database
Fig. 15. Sample images in the CASIA-C gait database: (a) normal walking with a bag; (b) normal walking; (c) fast walking; (d) slow walking.

The TUM Gait from Audio, Image and Depth (GAID) database contains 305 different subjects in an outdoor scenario [47]. Two recording sessions with time variation (where clothing, lighting and other recording properties differ) were performed: the first session in January and the second in April. Hereinafter, four walking conditions are considered: normal walk (N), carrying a backpack (B), wearing coating shoes (S) and elapsed time (TN-TB-TS). Each subject has six normal walking sequences (N1-N6), two sequences carrying a bag (B1-B2) and two sequences wearing coating shoes (S1-S2). Additionally, 32 subjects were recorded in both sessions, so they have 10 additional sequences (TN1-TN6, TB1-TB2, TS1-TS2) (Fig. 16).

Fig. 16. Sample images in the TUM GAID gait database in two sessions.

This section reports experimental results on this subset of 32 subjects as a robustness test to the time variation. The detailed design of the experiments is outlined in Table 4. The training and test process is similar to that for the CASIA-B/C databases and is omitted here for conciseness. Experimental results are illustrated in Table 5. It is possible to recognize
Table 4. Experiments on the TUM GAID database for the robustness test to time variation.

| Experiment | Gallery set | Probe set | Gallery size | Probe size |
|---|---|---|---|---|
| N | N1,N2,N3,N4 | N5,N6 | 32 × 4 | 32 × 2 |
| B | N1,N2,N3,N4 | B1,B2 | 32 × 4 | 32 × 2 |
| S | N1,N2,N3,N4 | S1,S2 | 32 × 4 | 32 × 2 |
| TN | N1,N2,N3,N4 | TN5,TN6 | 32 × 4 | 32 × 2 |
| TB | N1,N2,N3,N4 | TB1,TB2 | 32 × 4 | 32 × 2 |
| TS | N1,N2,N3,N4 | TS1,TS2 | 32 × 4 | 32 × 2 |
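The gallery/probe assignments of Table 4 are simple enough to state directly; the structure below just restates them (sequence names as in the text, gallery always the four first-session normal-walk sequences):

```python
# Gallery/probe design from Table 4 (32 subjects recorded in both sessions).
gallery = ["N1", "N2", "N3", "N4"]          # gallery size: 32 x 4 sequences
experiments = {
    "N":  ["N5", "N6"],
    "B":  ["B1", "B2"],
    "S":  ["S1", "S2"],
    "TN": ["TN5", "TN6"],
    "TB": ["TB1", "TB2"],
    "TS": ["TS1", "TS2"],
}                                            # probe size per experiment: 32 x 2

for name, probe in experiments.items():
    print(f"{name}: gallery={gallery}, probe={probe}")
```

Keeping the gallery fixed across all six experiments isolates the effect of each probe-side variation (bag, shoes, elapsed time) on recognition accuracy.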
Table 5. Experimental results on the TUM GAID database for the robustness test to time variation.

| Experiments | N | B | S | TN | TB | TS |
|---|---|---|---|---|---|---|
| Kinematic features | 99 | 85 | 83 | 75 | 77 | 81 |
| Spatial-temporal features | 98 | 80 | 89 | 82 | 72 | 81 |
| Fusion(Max) | 100 | 86 | 90 | 85 | 77 | 80 |
| Fusion(Min) | 99 | 85 | 87 | 80 | 75 | 82 |
| Fusion(Sum) | 100 | 94 | 96 | 88 | 80 | 83 |
| Fusion(Product) | 100 | 87 | 89 | 83 | 80 | 80 |
| Fusion(Rank-summation) | 100 | 89 | 85 | 84 | 79 | 77 |
| Fusion(Score-summation) | 100 | 90 | 91 | 85 | 79 | 82 |
individuals by using the proposed method even when clothing, lighting and other recording properties are significantly different. Due to differences in gallery/probe size, to the best of the authors' knowledge, it is not possible to compare the proposed method directly with the experimental results available in the literature. Table 6 provides an indirect, rough comparison with other existing works; our method is robust to the time variation.
Table 6. Comparison with other existing methods on the TUM GAID database, with the CCR obtained from [48] in lateral views.

| Experiments | N | B | S | TN | TB | TS | Avg |
|---|---|---|---|---|---|---|---|
| GEI [47] | 99 | 27 | 53 | 44 | 6 | 9 | 56 |
| SVIM [49] | 98 | 64 | 92 | 66 | 31 | 50 | 81 |
| RSM [46] | 100 | 79 | 97 | 58 | 38 | 57 | 88 |
| DCS [48] | 100 | 99 | 99 | 78 | 62 | 55 | 96 |
| H2M [48] | 99 | 100 | 98 | 72 | 63 | 44 | 96 |
| Kinematic features | 99 | 85 | 83 | 75 | 77 | 81 | 83 |
| Spatial-temporal features | 98 | 80 | 89 | 82 | 72 | 81 | 84 |
| Fusion(Max) | 100 | 86 | 90 | 85 | 77 | 80 | 86 |
| Fusion(Min) | 99 | 85 | 87 | 80 | 75 | 82 | 85 |
| Fusion(Sum) | 100 | 94 | 96 | 88 | 80 | 83 | 90 |
| Fusion(Product) | 100 | 87 | 89 | 83 | 80 | 80 | 87 |
| Fusion(Rank-summation) | 100 | 89 | 85 | 84 | 79 | 77 | 86 |
| Fusion(Score-summation) | 100 | 90 | 91 | 85 | 79 | 82 | 88 |
5.4. Experiments on OU-ISIR gait database
In this section, experiments are carried out on the OU-ISIR Treadmill dataset A [50] to examine the robustness to speed variations. In these experiments, 34 subjects with speed variations from 4 km/h to 6 km/h are considered (Fig. 17). Hereinafter, the following nomenclature is employed to refer to each of the walking speeds: 4 km/h (Ts4), 5 km/h (Ts5) and 6 km/h (Ts6). We assign the Ts5 sequences to the gallery set, while Ts4, Ts5 and Ts6 are assigned to the probe set. From the results shown in Table 7, the proposed method avoids a large drop in recognition rate whether the speed difference between the gallery and probe sets is small or large.
30
CR IP T
ACCEPTED MANUSCRIPT
Fig. 17. Sample images in OU-ISIR gait database: (a) 4 km/h; (b) 5 km/h;
AN US
(c) 6 km/h.
Table 7. Comparison with other existing methods on the OU-ISIR database (Gallery set: Ts5), with the CCR obtained in lateral views.

| Probe set | Ts4 | Ts5 | Ts6 | Avg |
|---|---|---|---|---|
| PSA [51] | 35 | 47 | 47 | 43 |
| FD [52] | 77 | 85 | 91 | 84 |
| GEI [53] | 35 | 88 | 88 | 70 |
| AEI [43] | 35 | 85 | 71 | 64 |
| GPI [54] | 77 | 97 | 77 | 84 |
| Kinematic features | 89 | 92 | 85 | 89 |
| Spatial-temporal features | 88 | 94 | 87 | 90 |
| Fusion(Max) | 91 | 97 | 88 | 92 |
| Fusion(Min) | 89 | 96 | 80 | 88 |
| Fusion(Sum) | 96 | 100 | 98 | 98 |
| Fusion(Product) | 92 | 100 | 89 | 94 |
| Fusion(Rank-summation) | 96 | 98 | 87 | 94 |
| Fusion(Score-summation) | 98 | 100 | 92 | 97 |
5.5. Experiments on USF HumanID gait challenge database

The USF HumanID gait challenge database [25] comprises 1870 sequences of 122 subjects walking along an elliptical path, with variations in walking surface (grass (G)/concrete (C)), carrying status (carrying a briefcase (BF)/not carrying a briefcase (NB)), shoe type (A/B), viewpoint (right (R)/left (L)) and elapsed time (May (M)/November (N)) (Fig. 18). A total of 33 common subjects were recorded in both May and November for the time covariate.
Fig. 18. Sample images in the USF HumanID gait challenge database.
This section further reports experimental results on this challenge database as a robustness test to condition variations and silhouette quality. We assign the (G, A, R, NB, M/N) sequences of 122 subjects to the training set, and sequences of different walking conditions to the test set. The detailed design of the 12 experiments on the USF database is outlined in the first three rows of Table 8. As shown in Fig. 19, the complicated background and illumination variations bring challenges to silhouette quality and gait recognition.

From the comparative results with other existing methods in Table 8,
the following observations can be obtained: (1) The proposed method still achieves reliable performance even under complicated background and illumination variations. (2) The silhouette quality has a significant effect on the recognition performance, particularly for the kinematic features. Fortunately, this dilemma can be mitigated by the deterministic learning algorithm, which extracts the gait dynamics underlying the kinematic parameters as the kinematic features; that is, the kinematic features are represented as the change rate in the time-varying trajectories of the kinematic parameters. Therefore, the proposed kinematic features can avoid a large drop in recognition rate due to silhouette quality changes. (3) The proposed kinematic and spatial-temporal features are designed to complement each other in the gait recognition process, leading to superior performance in the combined use of kinematic and spatial-temporal features.
Fig. 19. Illustration of the silhouette sequences in: (a) the gallery set; (b) probe set A; (c) probe set B; (d) probe set C; (e) probe set D; (f) probe set E; (g) probe set F; (h) probe set G.
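Table 8 reports a weighted average identification rate (W-AvgI) [55]. Assuming W-AvgI is the probe-size-weighted mean of the per-probe-set CCRs (the sizes and the GFI [56] rank-1 row below are taken from Table 8), it can be reproduced as follows:

```python
# Probe-set sizes (A-L) and GFI [56] rank-1 CCRs from Table 8.
# Assumption: W-AvgI = probe-size-weighted mean of the per-set CCRs.
sizes = [122, 54, 54, 121, 60, 121, 60, 120, 60, 120, 33, 33]
gfi_rank1 = [89, 93, 70, 19, 23, 7, 8, 78, 67, 48, 3, 9]

w_avgi = sum(s * c for s, c in zip(sizes, gfi_rank1)) / sum(sizes)
print(round(w_avgi, 1))  # → 46.1, matching the W-AvgI entry for GFI
```

Weighting by probe size prevents the small time-covariate sets K and L (33 sequences each) from dominating the average.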
5.6. Computational complexity

The proposed algorithm was implemented on a PC with an Intel Core i7 CPU at 3.4 GHz and 8 GB of RAM. Spatial-temporal and kinematic gait parameters are extracted simultaneously from the same input walking sequence. To calculate the target spatial-temporal and kinematic features based on deterministic learning, we need to construct RBF neural networks and calculate
Table 8. Comparison with other existing methods on the USF HumanID database (Gallery set: (G, A, R, NB, M/N)). Here, keys for covariates: V, view; H, shoe; S, surface; B, briefcase; T, time; C, clothes. W-AvgI represents the weighted average identification rate [55].

| Probe set | A | B | C | D | E | F | G | H | I | J | K | L | W-AvgI |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Probe size | 122 | 54 | 54 | 121 | 60 | 121 | 60 | 120 | 60 | 120 | 33 | 33 | - |
| Covariate | V | H | VH | S | SH | SV | SHV | B | BH | BV | THC | STHC | - |
| CCR (rank=1) |  |  |  |  |  |  |  |  |  |  |  |  |  |
| GEI [12] | 90 | 91 | 81 | 56 | 64 | 25 | 36 | 64 | 60 | 60 | 6 | 15 | 57.7 |
| GFI [56] | 89 | 93 | 70 | 19 | 23 | 7 | 8 | 78 | 67 | 48 | 3 | 9 | 46.1 |
| STM-SPP [57] | 92 | 95 | 84 | 72 | 68 | 29 | 40 | 69 | 60 | 64 | 20 | 18 | 63.1 |
| STM-DM [58] | 93 | 96 | 86 | 70 | 69 | 39 | 37 | 78 | 71 | 66 | 27 | 22 | 66.7 |
| VI-MGR [55] | 95 | 96 | 86 | 54 | 57 | 34 | 36 | 91 | 90 | 78 | 31 | 28 | 68.1 |
| Kinematic features | 88 | 93 | 91 | 82 | 80 | 74 | 72 | 87 | 83 | 71 | 61 | 67 | 80.2 |
| Spatial-temporal features | 92 | 96 | 95 | 86 | 82 | 75 | 78 | 86 | 87 | 72 | 70 | 64 | 82.7 |
| Fusion(Sum) | 97 | 98 | 96 | 92 | 88 | 81 | 82 | 95 | 92 | 82 | 76 | 73 | 88.9 |
| CCR (rank=5) |  |  |  |  |  |  |  |  |  |  |  |  |  |
| GEI [12] | 94 | 94 | 93 | 78 | 81 | 56 | 53 | 90 | 83 | 82 | 27 | 21 | 76.2 |
| GFI [56] | 98 | 94 | 93 | 40 | 47 | 26 | 25 | 94 | 85 | 74 | 24 | 24 | 63.9 |
| STM-SPP [57] | 96 | 98 | 95 | 80 | 84 | 59 | 61 | 92 | 84 | 85 | 30 | 27 | 79.1 |
| STM-DM [58] | 97 | 98 | 96 | 82 | 83 | 61 | 60 | 95 | 89 | 83 | 39 | 28 | 80.4 |
| VI-MGR [55] | 100 | 98 | 96 | 80 | 79 | 66 | 65 | 97 | 95 | 89 | 50 | 48 | 83.8 |
| Kinematic features | 93 | 98 | 94 | 88 | 88 | 83 | 80 | 92 | 93 | 83 | 70 | 73 | 87.5 |
| Spatial-temporal features | 97 | 98 | 98 | 93 | 90 | 83 | 82 | 93 | 95 | 88 | 76 | 70 | 90.1 |
| Fusion(Sum) | 100 | 100 | 100 | 98 | 95 | 88 | 85 | 98 | 100 | 92 | 82 | 79 | 94.4 |
Fig. 20. Training phase implementation using Matlab and the GPU parallel processing platform.
the constant RBF matrices. We note that the complexity of parameter extraction is negligible compared with the computational load of the neural computation in the training and test phases. Fortunately, the computation can be accelerated considerably by implementation in Matlab using parallel processing platforms such as Graphics Processing Units (GPUs). The computation consists of two phases: an off-line training phase and an on-line test phase. In the off-line training phase (Fig. 20), the average training time is about 10 s for one spatial-temporal pattern and about 12 s for one kinematic pattern. In the on-line test phase (Fig. 21), predicting one unlabeled recording takes on average 0.9 s, 1.1 s and 0.7 s for spatial-temporal pattern recognition, kinematic pattern recognition and decision-level fusion, respectively. Table 9 shows an example of the time consumption of spatial-temporal pattern recognition on the CASIA-B gait database.
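The test-phase cost is dominated by evaluating a bank of constant-weight RBF networks, one per stored training pattern. The sketch below illustrates that shape of computation only; all dimensions, widths and weight values are illustrative assumptions, not the paper's actual trained models:

```python
import numpy as np

# Gaussian regressor vector S(z) over a fixed lattice of RBF centres
def S(z, centers, width=0.7):
    return np.exp(-np.sum((centers - z) ** 2, axis=1) / width ** 2)

rng = np.random.default_rng(0)
centers = rng.uniform(-1.0, 1.0, size=(64, 2))     # fixed RBF centre lattice
W_bank = [rng.normal(size=64) for _ in range(5)]    # one constant weight vector per stored pattern
trajectory = rng.uniform(-1.0, 1.0, size=(100, 2))  # test gait-parameter trajectory

# evaluating y_k = W_k^T S(z) along the trajectory for every stored pattern:
# the per-recording cost grows with the number of stored patterns, which is
# consistent with the roughly linear trend reported in Table 9
outputs = np.array([[w @ S(z, centers) for z in trajectory] for w in W_bank])
```

Because the weight vectors are constants after training, the whole bank reduces to dense matrix-vector products, which is why a GPU implementation helps.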
Fig. 21. Test phase of the spatial-temporal pattern using Matlab and the GPU parallel processing platform.
Table 9. Time consumption of the test phase under multiple patterns. Here, n represents the number of patterns; time represents the time consumption (s).

| n | time | n | time | n | time | n | time |
|---|---|---|---|---|---|---|---|
| 1 | 0.949 | 6 | 1.046 | 11 | 1.179 | 16 | 1.296 |
| 2 | 1.013 | 7 | 1.107 | 12 | 1.189 | 17 | 1.365 |
| 3 | 1.018 | 8 | 1.121 | 13 | 1.212 | 18 | 1.383 |
| 4 | 1.024 | 9 | 1.154 | 14 | 1.235 | 19 | 1.413 |
| 5 | 1.033 | 10 | 1.176 | 15 | 1.258 | 20 | 1.439 |
5.7. Discussions

From the results obtained above, the following observations can be made:

• The proposed method achieves superior performance compared with other existing gait recognition methods when the test walking conditions differ from the corresponding training conditions. The fusion of two different features provides a comprehensive characterization of gait dynamics that is not sensitive to clothing variation, carrying status variation, walking speed variation, illumination variation or time variation.
• The proposed method enhances the recognition accuracy over any single modality. The gait characteristics underlying a single aspect of features are limited and not comprehensive enough to develop an optimal gait recognition system. The proposed method fuses different aspects of features and extracts the dynamics underlying the different gait features. The results using feature fusion are better than those using any single modality.
• The proposed framework facilitates the application of multi-feature gait recognition in practice. Two different kinds of gait parameters can be extracted simultaneously from the same input walking sequence. In a real scenario where only limited walking sequences can be collected, the proposed method can still work by extracting as many informative cues as possible for optimal recognition performance. Moreover, the proposed framework can handle walking condition variations.
• The proposed model-based and silhouette-based features are designed to complement each other in the gait recognition process. Spatial-temporal features can work well even on silhouettes of poor quality, while kinematic features can provide more information on the temporal changes of body structure and dynamics. The combined use of the two features improves the recognition accuracy considerably.
• This paper aims to evaluate the discriminatory ability of the fusion of spatial-temporal and kinematic features; therefore, the factor of view angle is not discussed here. Future large-sample studies involving multiple view angles may help to further verify the combined use of feature fusion and deterministic learning.

6. Conclusion
The fusion of spatial-temporal and kinematic features is investigated in this paper for human gait recognition. The conclusions are as follows. Deterministic learning theory is used to extract the gait dynamics underlying the spatial-temporal and kinematic parameters. Spatial-temporal gait features can be represented as the gait dynamics underlying the trajectories of the spatial-temporal parameters, which implicitly reflect the temporal changes of silhouette shape. Kinematic gait features can be represented as the gait dynamics underlying the trajectories of the kinematic parameters, which represent the temporal changes of body structure and dynamics. They
are fused on the decision level using different combination rules to improve the gait recognition performance. The proposed method provides an efficient way toward optimal human recognition and is promising and reliable for individual recognition, even when walking conditions change. When compared with other existing methods on well-known public gait databases, encouraging recognition accuracy is achieved. Future work will focus on multi-modal fusion for gait recognition.

Acknowledgments
This work was supported by the National Science Fund for Distinguished Young Scholars (Grant No. 61225014) and by the National R&D Program for Major Research Instruments (Grant No. 61527811).

References
[1] L. Wang, W. Hu, T. Tan, Recent developments in human motion analysis, Pattern Recognition 36(3) (2003) 585–601.

[2] C. BenAbdelkader, R. Cutler, H. Nanda, L. Davis, Eigengait: motion-based recognition of people using image self-similarity, in: International Conference on Audio- and Video-Based Biometric Person Authentication, 2001, pp. 284–294.

[3] J. Zhang, J. Pu, C. Chen, R. Fleischer, Low-resolution gait recognition, IEEE Transactions on Systems, Man, and Cybernetics, Part B 40(4) (2010) 986–996.

[4] L. Wang, H. Ning, T. Tan, W. Hu, Fusion of static and dynamic body biometrics for gait recognition, IEEE Transactions on Circuits and Systems for Video Technology 14(2) (2004) 149–158.

[5] D. S. Matovski, M. S. Nixon, J. N. Carter, Gait recognition, in: Computer Vision, Springer, 2014, pp. 309–318.

[6] D. Cunado, M. S. Nixon, J. N. Carter, Using gait as a biometric, via phase-weighted magnitude spectra, in: International Conference on Audio- and Video-Based Biometric Person Authentication, 1997, pp. 93–102.

[7] J.-H. Yoo, M. S. Nixon, C. J. Harris, Extracting gait signatures based on anatomical knowledge, in: Proceedings of the BMVA Symposium on Advancing Biometric Technologies, 2002, pp. 40–48.

[8] X. Mu, Q. Wu, A complete dynamic model of five-link bipedal walking, in: Proceedings of the 2003 American Control Conference, 2003, pp. 4926–4931.

[9] A. F. Bobick, A. Y. Johnson, Gait recognition using static, activity-specific parameters, in: International Conference on Computer Vision and Pattern Recognition, 2001, pp. I-423.

[10] W. Zeng, C. Wang, Human gait recognition via deterministic learning, Neural Networks 35 (2012) 92–102.

[11] P. J. Phillips, S. Sarkar, I. Robledo, P. Grother, K. Bowyer, The gait identification challenge problem: data sets and baseline algorithm, in: Proceedings of the 16th International Conference on Pattern Recognition, 2002, pp. 385–388.

[12] J. Man, B. Bhanu, Individual recognition using gait energy image, IEEE Transactions on Pattern Analysis and Machine Intelligence 28(2) (2006) 316–322.

[13] M. Hofmann, S. M. Schmidt, A. N. Rajagopalan, G. Rigoll, The gait identification challenge problem: data sets and baseline algorithm, in: 5th IAPR International Conference on Biometrics (ICB), 2012, pp. 390–395.

[14] Y. Makihara, T. Tanoue, D. Muramatsu, Y. Yagi, S. Mori, Y. Utsumi, M. Iwamura, K. Kise, Individuality-preserving silhouette extraction for gait recognition, IPSJ Transactions on Computer Vision and Applications 7 (2015) 74–78.

[15] D. S. Matovski, M. Nixon, S. Mahmoodi, T. Mansfield, On including quality in applied automatic gait recognition, in: 21st International Conference on Pattern Recognition (ICPR), 2012, pp. 3272–3275.

[16] K. Bashir, T. Xiang, S. Gong, Gait recognition using gait entropy image, in: 3rd International Conference on Crime Detection and Prevention (ICDP 2009), 2009, pp. 1–6.

[17] C. Wang, J. Zhang, J. Pu, X. Yuan, L. Wang, Chrono-gait image: a novel temporal template for gait recognition, in: European Conference on Computer Vision, 2010, pp. 257–270.

[18] J.-H. Yoo, M. S. Nixon, Automated markerless analysis of human gait motion for recognition and classification, ETRI Journal 33(2) (2011) 259–266.

[19] J. Preis, M. Kessel, M. Werner, C. Linnhoff-Popien, Gait recognition with Kinect, in: 1st International Workshop on Kinect in Pervasive Computing, 2012, pp. P1–P4.

[20] A. Ball, D. Rye, F. Ramos, M. Velonaki, Unsupervised clustering of people from 'skeleton' data, in: Proceedings of the Seventh Annual ACM/IEEE International Conference on Human-Robot Interaction, 2012, pp. 225–226.

[21] V. O. Andersson, R. M. de Araújo, Person identification using anthropometric and gait data from Kinect sensor, in: AAAI, 2015, pp. 425–431.

[22] M. Ahmed, N. Al-Jawad, A. Sabir, Gait recognition based on Kinect sensor, in: SPIE Photonics Europe, 2014, p. 91390B.

[23] P. Chattopadhyay, S. Sural, J. Mukherjee, Frontal gait recognition from incomplete sequences using RGB-D camera, IEEE Transactions on Information Forensics and Security 9(11) (2014) 1843–1856.

[24] W. Zeng, C. Wang, F. Yang, Silhouette-based gait recognition via deterministic learning, Pattern Recognition 47(11) (2014) 3568–3584.

[25] S. Sarkar, P. J. Phillips, Z. Liu, I. R. Vega, P. Grother, K. W. Bowyer, The HumanID gait challenge problem: data sets, performance, and analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence 27(2) (2005) 162–177.

[26] M. Deng, C. Wang, Q. Chen, Human gait recognition based on deterministic learning through multiple views fusion, Pattern Recognition Letters 78 (2016) 56–63.

[27] C. Wang, D. J. Hill, Deterministic learning and rapid dynamical pattern recognition, IEEE Transactions on Neural Networks 18(3) (2007) 617–630.

[28] C. Wang, D. J. Hill, Deterministic Learning Theory for Identification, Recognition, and Control, CRC Press, 2009.

[29] B. Achermann, H. Bunke, Combination of classifiers on the decision level for face recognition, Citeseer, 1996.

[30] G. Shakhnarovich, T. Darrell, On probabilistic combination of face and gait cues for identification, in: Proceedings of the Fifth IEEE International Conference on Automatic Face and Gesture Recognition, 2002, pp. 169–174.

[31] S. Yu, D. Tan, T. Tan, A framework for evaluating the effect of view angle, clothing and carrying condition on gait recognition, in: Proceedings of the International Conference on Pattern Recognition, 2006, pp. 441–444.

[32] J. Kittler, M. Hatef, R. P. Duin, J. Matas, On combining classifiers, IEEE Transactions on Pattern Analysis and Machine Intelligence 20(3) (1998) 226–239.

[33] M. Hu, Y. Wang, Z. Zhang, D. Zhang, J. J. Little, Incremental learning for video-based gait recognition with LBP flow, IEEE Transactions on Cybernetics 43 (2013) 77–89.

[34] S. Sarkar, P. J. Phillips, Z. Liu, I. R. Vega, P. Grother, K. W. Bowyer, The HumanID gait challenge problem: data sets, performance, and analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence 27 (2005) 162–177.

[35] M. Jeevan, N. Jain, M. Hanmandlu, G. Chetty, Gait recognition based on gait pal and pal entropy image, in: IEEE International Conference on Image Processing, 2013, pp. 4195–4199.

[36] K. Bashir, T. Xiang, S. Gong, Gait recognition using gait entropy image, in: International Conference on Crime Detection and Prevention, 2009, pp. 1–6.

[37] D. Tan, K. Huang, S. Yu, T. Tan, Efficient night gait recognition based on template matching, in: 18th International Conference on Pattern Recognition, 2006, pp. 1000–1003.

[38] B. DeCann, A. Ross, Gait curves for human recognition, backpack detection, and silhouette correction in a nighttime environment, in: SPIE Defense, Security, and Sensing, 2010, p. 76670Q.

[39] D. Tan, S. Yu, K. Huang, T. Tan, Walker recognition without gait cycle estimation, in: International Conference on Biometrics, 2007, pp. 222–231.

[40] D. Tan, K. Huang, S. Yu, T. Tan, Orthogonal diagonal projections for gait recognition, in: IEEE International Conference on Image Processing, 2007, pp. 337–340.

[41] F. Dadashi, B. N. Araabi, H. Soltanian-Zadeh, Gait recognition using wavelet packet silhouette representation and transductive support vector machines, in: 2nd International Congress on Image and Signal Processing (CISP 2009), 2009, pp. 1–5.

[42] D. Tan, K. Huang, S. Yu, T. Tan, Uniprojective features for gait recognition, in: International Conference on Biometrics, 2007, pp. 673–682.

[43] E. Zhang, Y. Zhao, W. Xiong, Active energy image plus 2DLPP for gait recognition, Signal Processing 90(7) (2010) 2295–2302.

[44] D. Tan, K. Huang, S. Yu, T. Tan, Recognizing night walkers based on one pseudoshape representation of gait, in: IEEE Conference on Computer Vision and Pattern Recognition, 2007, pp. 1–8.

[45] W. Kusakunniran, Q. Wu, H. Li, J. Zhang, Automatic gait recognition using weighted binary pattern on video, in: Sixth IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS 2009), 2009, pp. 49–54.

[46] Y. Guan, C.-T. Li, A robust speed-invariant gait recognition system for walker and runner identification, in: International Conference on Biometrics (ICB), 2013, pp. 1–8.

[47] M. Hofmann, J. Geiger, S. Bachmann, B. Schuller, G. Rigoll, The TUM Gait from Audio, Image and Depth (GAID) database: multimodal recognition of subjects and traits, Journal of Visual Communication and Image Representation 25(1) (2014) 195–206.

[48] F. M. Castro, M. J. Marín-Jiménez, N. Guil, Multimodal features fusion for gait, gender and shoes recognition, Machine Vision and Applications (2016) 1–16.

[49] T. Whytock, A. Belyaev, N. M. Robertson, Dynamic distance-based shape features for gait recognition, Journal of Mathematical Imaging and Vision 50(3) (2014) 314–326.

[50] Y. Makihara, H. Mannami, A. Tsuji, M. A. Hossain, K. Sugiura, A. Mori, Y. Yagi, The OU-ISIR gait database comprising the treadmill dataset, IPSJ Transactions on Computer Vision and Applications 4 (2012) 53–62.

[51] L. Wang, T. Tan, W. Hu, H. Ning, Automatic gait recognition based on statistical shape analysis, IEEE Transactions on Image Processing 12(9) (2003) 1120–1131.

[52] C. P. Lee, A. W. Tan, S. C. Tan, Gait recognition via optimally interpolated deformable contours, Pattern Recognition Letters 34(6) (2013) 663–669.

[53] Z. Liu, S. Sarkar, Simplest representation yet for gait recognition: averaged silhouette, in: Proceedings of the 17th International Conference on Pattern Recognition, 2004, pp. 211–214.

[54] C. P. Lee, A. W. Tan, S. C. Tan, Gait probability image: an information-theoretic model of gait representation, Journal of Visual Communication and Image Representation 25(6) (2014) 1489–1492.

[55] S. D. Choudhury, T. Tjahjadi, Robust view-invariant multiscale gait recognition, Pattern Recognition 48(3) (2015) 798–811.

[56] T. H. W. Lam, K. H. Cheung, J. N. K. Liu, Gait flow image: a silhouette-based gait representation for human identification, Pattern Recognition 44(4) (2011) 973–987.

[57] S. Das Choudhury, T. Tjahjadi, Silhouette-based gait recognition using Procrustes shape analysis and elliptic Fourier descriptors, Pattern Recognition 45(9) (2012) 3414–3426.

[58] S. D. Choudhury, T. Tjahjadi, Gait recognition based on shape and motion analysis of silhouette contours, Computer Vision and Image Understanding 117(12) (2013) 1770–1785.
Muqing Deng is a Ph.D. candidate at the College of Automation, South China University of Technology, Guangzhou, China. His current research interests include dynamical pattern recognition, gait recognition and deterministic learning theory. E-mail: [email protected].

Cong Wang received the B.E. and M.E. degrees from Beijing University of Aeronautics and Astronautics, Beijing, China, in 1989 and 1997, respectively, and the Ph.D. degree from the National University of Singapore, Singapore, in 2002. Currently, he is a professor at the College of Automation Science and Engineering, South China University of Technology, Guangzhou, China. He has authored and co-authored over 60 papers in international journals and conferences, and is a co-author of the book Deterministic Learning Theory for Identification, Recognition and Control (Boca Raton, FL: CRC Press, 2009). His current research interests include dynamical pattern recognition, adaptive NN control/identification, deterministic learning theory, pattern-based intelligent control, oscillation fault diagnosis, and cognitive and brain sciences. E-mail: [email protected].

Wei Zeng received the M.E. degree from the Department of Automation, Xiamen University, Xiamen, China, in 2008, and the Ph.D. degree from the College of Automation Science and Engineering, South China University of Technology, Guangzhou, China, in 2012. His current research interests include dynamical pattern recognition, adaptive NN control/identification, and deterministic learning theory. E-mail: [email protected].

Fengjiang Cheng is an M.S. candidate at the College of Automation and Center for Control and Optimization, South China University of Technology, Guangzhou, China. His current research interests include gait recognition and engineering applications of deterministic learning theory. E-mail: [email protected].