ARTICLE IN PRESS
JID: FI
[m1+;January 16, 2020;22:58]
Available online at www.sciencedirect.com
Journal of the Franklin Institute xxx (xxxx) xxx www.elsevier.com/locate/jfranklin
Human gait recognition based on deterministic learning and knowledge fusion through multiple walking views

Muqing Deng a,∗, Tingchang Fan a, Jiuwen Cao a, Siu-Ying Fung b, Jing Zhang c

a Key Lab for IOT and Information Fusion Technology of Zhejiang, Hangzhou Dianzi University, Hangzhou, China
b Department of Orthopaedics and Traumatology, The University of Hong Kong, Hong Kong
c UBTECH Sydney Artificial Intelligence Centre and the School of Computer Science, Faculty of Engineering and Information Technologies, University of Sydney, Australia

Received 9 August 2019; received in revised form 10 November 2019; accepted 21 December 2019; available online xxx
Abstract

Deformation of gait silhouettes caused by different view angles heavily affects the performance of gait recognition. In this paper, a new method based on deterministic learning and knowledge fusion is proposed to eliminate the effect of view angle and achieve efficient view-invariant gait recognition. First, the binarized walking silhouettes are characterized by three kinds of time-varying width parameters. The nonlinear dynamics underlying different individuals' width parameters is effectively approximated by radial basis function (RBF) neural networks through the deterministic learning algorithm. The extracted gait dynamics captures the spatio-temporal characteristics of human walking, represents the dynamics of gait motion, and is shown to be insensitive to variance across view angles. The learned knowledge of gait dynamics is stored in constant RBF networks and used as the gait pattern. Second, to handle view changes, whether the variation is small or large, the learned knowledge of gait dynamics from different views is fused by constructing a deep convolutional and recurrent neural network (CRNN) model for the later human identification task. This knowledge fusion strategy takes advantage of the encoded local characteristics extracted by the CNN and the long-term dependencies captured by the RNN. Experimental results show that promising recognition accuracy can be achieved.
© 2019 The Franklin Institute. Published by Elsevier Ltd. All rights reserved.
∗ Corresponding author. E-mail addresses: [email protected], [email protected] (M. Deng).
https://doi.org/10.1016/j.jfranklin.2019.12.041 0016-0032/© 2019 The Franklin Institute. Published by Elsevier Ltd. All rights reserved.
Please cite this article as: M. Deng, T. Fan and J. Cao et al., Human gait recognition based on deterministic learning and knowledge fusion through multiple walking views, Journal of the Franklin Institute, https:// doi.org/ 10.1016/ j. jfranklin.2019.12.041
1. Introduction

Gait recognition, with the goal of automatically identifying individuals by the way they walk, has attracted intensive attention due to its prominent advantages for non-contact identification at a distance without the subject's cooperation [1]. Although much progress has been made on gait recognition within the past few decades, one of the inevitable problems in the practical deployment of these methods is robustness against view variation [2–4]. In realistic situations, people cannot be expected to always walk in a single direction, and the deformation of gait appearance due to changes of view or walking direction heavily affects recognition performance [5,6]. Attempts to resolve this dilemma have resulted in the development of view-invariant gait recognition methods [7], which we roughly divide into three categories.

The first category identifies view-invariant gait features. For instance, Zeng and Wang [2] used four kinds of view-invariant silhouette features to describe human gait, and the differences between the gait dynamics underlying the silhouette features were used to measure gait similarity. Jean et al. [8] proposed an approach to compute view-normalized body-part trajectories on potentially non-linear paths. Liu et al. [9] proposed a joint subspace learning method to obtain prototypes of different views; the coefficients of the linear combination of these prototypes in the corresponding views were extracted for feature representation. However, these methods could not cope with large variations in view angle. Goffredo et al. [5,10] therefore proposed a viewpoint-independent markerless gait analysis method, in which two consecutive stages, markerless lower-limb joint estimation and viewpoint rectification, work with an uncalibrated camera system and a wide range of walking directions.
However, it is hard to guarantee the accuracy of the lower-limb joint estimation.

The second category transforms gait features under various view angles onto a common view angle. Using nonlinear mapping and view transformation techniques, gait features from different views can be mapped into the same reference view before the similarity measure is computed; the multi-view gait recognition problem is thus reduced to the single-view problem. In this category, Hu et al. [11] proposed a view-invariant discriminative projection method that learns the low-dimensional geometry and finds the unitary linear projection. Kusakunniran et al. [12] developed a view transformation model (VTM) across various views; the trained VTMs normalize gait features from different views onto the same view before gait similarity is measured. An arbitrary VTM (AVTM) that accurately matches a pair of gait traits from an arbitrary view was recently proposed in [13] and further extended with a part-dependent view selection scheme. Although these methods are technically sound, they have limited practical utility, as they all depend on strong assumptions.

The third category synthesizes view angles based on a three-dimensional model. For example, Tang et al. [14] proposed 3D human pose estimation and shape deformation to reconstruct a parametric 3D body from 2D data. Zhao et al. [15] synthesized gait characteristics captured by multiple cameras and set up a 3D walking model; the lower-limb motion trajectories were then extracted from the 3D model for further matching and recognition. In [16], human walkers were identified from two-dimensional motion sequences in multiple viewpoints based on a three-dimensional linear model and the Bayesian rule. In [17], a 3D walking model was constructed by using multiple cameras, and an image-based rendering
Fig. 1. Flow chart of the proposed algorithm.
technique was employed based on the 3D model to automatically construct the proper view.

The abovementioned view-invariant methods can mitigate the problem to some extent, particularly when the view changes are small, but it remains challenging to handle the effect of view angle when the view changes are significant [18]. The proposed method therefore tackles this issue as the core research objective of this paper. In our previous works, a dynamical gait recognition scheme was developed and the gait dynamics underlying silhouette features was successfully extracted via deterministic learning theory [19–21]. In this paper, we extend the gait dynamics concept to the view-invariant recognition problem. First, the gait dynamics underlying three kinds of time-varying silhouette width parameters is effectively captured for each view angle by RBF neural networks through the deterministic learning algorithm. Width parameters have been shown to play a primary role in recent gait research [22–24]. This kind of gait dynamics reflects the temporal dynamics of human walking and is shown to be insensitive to variance across view angles. The learned knowledge of gait dynamics is stored in constant RBF networks and used as the gait pattern. Second, a knowledge fusion strategy is introduced, in which gait dynamics collected from different views are synthesized by a deep convolutional and recurrent neural network (CRNN) model. The convolutional neural network (CNN) layers [25] collect generic local characteristics of the extracted nonlinear gait dynamics under different view angles, and the recurrent neural network (RNN) layers model the long-term semantic contextual dependencies across the input multi-view gait dynamics sequence.
Our contributions are summarized below: (1) the gait dynamics functions are locally accurately approximated as view-insensitive features based on deterministic learning; (2) the proposed knowledge fusion framework synthesizes gait dynamics characteristics from various view angles and takes advantage of the encoded local characteristics extracted by the CNN and the long-term dependencies captured by the RNN; (3) rapid matching of a test gait pattern against the set of trained gait patterns is achieved without directly constructing a uniform trained bank of gait patterns under different view angles; (4) the proposed method performs well whether the view variation is small or large, as shown in the experiment section.

2. The proposed view-invariant gait recognition algorithm

2.1. Overview of the proposed algorithm

To give a good understanding of the proposed view-invariant gait recognition method, an overview of the gait recognition system is first given below. The overall flowchart is shown in Fig. 1 and can be broadly divided into the following stages.
Fig. 2. Illustration of width parameter extraction. (a) Original image. (b) Binary silhouette. (c) Width parameters extraction.
• Image preprocessing: Walking silhouettes and fundamental gait parameters are vital to periodic gait sequence analysis. We first extract the binary silhouette from a walking sequence and introduce three kinds of periodic silhouette width parameters based on the extracted binary silhouettes.
• Gait dynamics extraction: Given that dynamics is the essential characteristic of human walking, the deterministic learning based feature extraction method (DLM) is employed, and a novel gait dynamics (GD) feature, which captures the in-depth dynamics information underlying the temporal width parameters, is analyzed across different walking viewpoints.
• CRNN model: For gait dynamics under different view conditions, a deep convolutional and recurrent neural network (CRNN) model based on the GD features is constructed. Whether the view variation is small or large, the learned knowledge of gait dynamics from different views is fused for the later human identification task.
• Gait recognition: With the above GD feature and the machine learning based knowledge fusion scheme, human identification across different walking views is achieved.

2.2. Silhouette extraction and width parameters calculation

One of the crucial factors for a successful view-invariant gait recognition algorithm is seeking salient features that reflect the essential gait characteristics and are robust across view angles. In this section, we first introduce three kinds of periodic silhouette width parameters and then present the proposed gait dynamics learning method under different view angles. Gait patterns are represented as the nonlinear gait dynamics underlying the width parameters. Finally, the trained pattern bank consisting of gait dynamics under different view angles is derived. Walking silhouettes are first extracted using the method in [26].
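The silhouette cleaning that precedes parameter extraction (hole filling, morphological denoising, cropping and size normalization) can be sketched with standard tools. This is an illustrative sketch only, not the exact procedure of [20,26]; the output size and structuring element are assumptions:

```python
import numpy as np
from scipy import ndimage

def preprocess_silhouette(mask, out_h=64, out_w=64):
    """Clean a raw binary silhouette: fill holes, remove speckle noise via
    morphological opening, crop to the bounding box and normalize the size."""
    mask = ndimage.binary_fill_holes(mask)
    mask = ndimage.binary_opening(mask, structure=np.ones((3, 3)))
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:                      # empty frame: return a blank silhouette
        return np.zeros((out_h, out_w), dtype=np.uint8)
    crop = mask[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1].astype(float)
    zoom = (out_h / crop.shape[0], out_w / crop.shape[1])
    return (ndimage.zoom(crop, zoom, order=1) > 0.5).astype(np.uint8)
```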
Preprocessing steps, including hole filling, noise removal, edge extraction, dilation, erosion and size normalization, are then conducted using several well-established image processing methods [20]. Fig. 2(a) and (b) shows the binary silhouette extraction result for a walking image. Defined as the distance between the left and right extremities of the binary silhouette, width parameters implicitly capture structural as well as dynamical information of gait and are used in this study. As shown in Fig. 2(c), the extracted binary silhouette is divided into four
Fig. 3. (a) Median width of region 3; (b) Median width of region 4.
equal regions, from subregion 1 to subregion 4. Here, (X, Y) denotes the set of pixel points in the binary image, X the row index, Y the width along that row, and H the height of the silhouette. The width of a row is calculated as the difference between its leftmost and rightmost boundary pixels. Let Y_X^L and Y_X^R be the Y-coordinates of the leftmost and rightmost pixel points in the Xth row, respectively. To comply with the discriminability requirements, we have tested different schemes and finally picked the three most appropriate width parameters for the later gait dynamics extraction task: the median width of the lower-limb silhouette (W_d^1) and the median widths of subregions 3 and 4 (W_d^2, W_d^3):

W_d^1 = \mathrm{median}(Y_X^R - Y_X^L), \quad X \in [\tfrac{1}{2}H, H] \quad (1)

W_d^2 = \mathrm{median}(Y_X^R - Y_X^L), \quad X \in [\tfrac{1}{2}H, \tfrac{3}{4}H] \quad (2)

W_d^3 = \mathrm{median}(Y_X^R - Y_X^L), \quad X \in [\tfrac{3}{4}H, H] \quad (3)
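Eqs. (1)–(3) can be computed directly from a binarized, size-normalized silhouette; a minimal NumPy sketch, with the silhouette given as a 0/1 array, is:

```python
import numpy as np

def width_parameters(sil):
    """Compute the three median width parameters of Eqs. (1)-(3)
    from a binarized silhouette (H x W array of 0/1)."""
    H = sil.shape[0]
    widths = np.zeros(H)
    for x in range(H):
        cols = np.flatnonzero(sil[x])       # foreground pixels in row X
        if cols.size:
            widths[x] = cols[-1] - cols[0]  # Y_X^R - Y_X^L
    wd1 = np.median(widths[H // 2:])                # X in [H/2, H]
    wd2 = np.median(widths[H // 2: 3 * H // 4])     # subregion 3: [H/2, 3H/4]
    wd3 = np.median(widths[3 * H // 4:])            # subregion 4: [3H/4, H]
    return wd1, wd2, wd3
```

Tracking these three values over the frame index d yields the periodic width trajectories used below.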
where d indexes the dth silhouette frame. The median width of the holistic lower-limb silhouette W_d^1 reflects the holistic change of lower-limb shape as well as size information; the median widths of the lower-limb subregions W_d^2, W_d^3 reflect structural as well as dynamical information of gait. They reflect the dynamics of human walking from different aspects and exhibit the periodicity of human gait. Fig. 3(a) and (b) gives two examples of parameter trajectories of W_d^2, W_d^3 from the same walking sequence.

2.3. Gait dynamics learning mechanism based on width parameters

In this section, based on the deterministic learning algorithm, we propose a scheme for identifying the gait dynamics underlying the time-varying width trajectories under different view angles.
Fig. 4. Input the width signals into the RBF networks for deterministic learning calculation.
Consider a general nonlinear human gait dynamics system of the form

\dot{\theta} = F(\theta; p) + v(\theta; p), \quad \theta(t_0) = \theta_0 \quad (4)
where θ = [θ_1, θ_2, θ_3]^T ∈ R^3 is the state vector, representing the three lower-limb width parameters; p is the constant parameter vector; F(θ; p) and v(θ; p) represent the gait system dynamics and the modeling uncertainty term, respectively. We define φ(θ; p) = F(θ; p) + v(θ; p) as the general gait dynamics, and we attempt to extract and represent φ(θ; p) through the deterministic learning mechanism [27,28]. The following dynamical observer using the RBF network is employed to model (extract) the gait dynamics:

\dot{\hat{\theta}}_i = -a_i (\hat{\theta}_i - \theta_i) + \hat{W}_i^T S_i(\theta), \quad i = 1, 2, 3 \quad (5)
where θ̂ = [θ̂_1, θ̂_2, θ̂_3]^T is the state vector of the dynamical model and θ is the input signal. Ŵ_i^T S_i(θ) is the localized RBF network approximating the unknown general gait dynamics, with Ŵ_i the estimate of the optimal weights. The Gaussian function

s_i(\|Z - \xi_i\|) = \exp\!\left[\frac{-(Z - \xi_i)^T (Z - \xi_i)}{\eta_i^2}\right], \quad i = 1, \ldots, N

is adopted, where N is the number of neural network nodes. The nodes ξ_i (i = 1, ..., N) are evenly spaced on [−1.05, 1.05] × [−1.05, 1.05] × [−1.05, 1.05] with node width η = 0.15, and the design constants are a_i = 0.5. Fig. 4 shows the RBF node distribution for the deterministic learning calculation. Using the Lyapunov synthesis method, the neural weights are updated by the following law:

\dot{\hat{W}}_i = \dot{\tilde{W}}_i = -\Gamma_i S_i(\theta)\, \tilde{\theta}_i - \sigma_i \Gamma_i \hat{W}_i \quad (6)
where θ̃_i = θ̂_i − θ_i, W̃_i = Ŵ_i − W_i^∗, and W_i^∗ is the optimal constant weight vector. In this paper, Γ = diag{1.5, 1.5, 1.5} and σ_i = 10 (i = 1, ..., 3). Considering the adaptive system consisting of the nonlinear gait dynamics system (4), the dynamical RBF model (5) and the weight updating law (6), the derivative of the state estimation error θ̃_i satisfies

\dot{\tilde{\theta}}_i = -a_i \tilde{\theta}_i + \hat{W}_i^T S_i(\theta) - \phi_i(\theta; p) = -a_i \tilde{\theta}_i + \tilde{W}_i^T S_i(\theta) - \epsilon_i \quad (7)
where W̃_i = Ŵ_i − W_i^∗ and ε_i = φ_i(θ; p) − W_i^{∗T} S_i(θ). Eqs. (6) and (7) constitute the following adaptive system:

\begin{bmatrix} \dot{\tilde{\theta}}_i \\ \dot{\tilde{W}}_i \end{bmatrix} = \begin{bmatrix} -a_i & S_i(\theta)^T \\ -\Gamma_i S_i(\theta) & 0 \end{bmatrix} \begin{bmatrix} \tilde{\theta}_i \\ \tilde{W}_i \end{bmatrix} + \begin{bmatrix} -\epsilon_i \\ -\sigma_i \Gamma_i \hat{W}_i \end{bmatrix} \quad (8)
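For illustration, the observer (5) and update law (6) can be simulated in discrete time on a toy two-dimensional periodic system standing in for (4). The toy dynamics, the Euler step size and the much smaller σ-modification gain used here are assumptions made for this sketch (the paper uses a_i = 0.5, Γ = diag{1.5, 1.5, 1.5} and σ_i = 10 on the three width parameters):

```python
import numpy as np

def rbf(z, centers, eta=0.15):
    # Gaussian RBF regressor vector S(z); node width eta as in the paper
    return np.exp(-np.sum((centers - z) ** 2, axis=1) / eta ** 2)

# Lattice of RBF centers on [-1.05, 1.05]^2 (the paper uses a 3-D lattice)
g = np.linspace(-1.05, 1.05, 15)
centers = np.array([(x, y) for x in g for y in g])

a, gamma, sigma = 0.5, 1.5, 0.001   # small sigma chosen for this toy demo
dt, steps = 0.005, 20000
th_hat = np.zeros(2)                # observer state theta_hat
W = np.zeros((2, len(centers)))     # weight estimates, one row per state
W_sum = np.zeros_like(W); n_avg = 0

for k in range(steps):
    t = k * dt
    theta = np.array([0.8 * np.cos(t), -0.8 * np.sin(t)])  # measured periodic orbit
    S = rbf(theta, centers)
    e = th_hat - theta                                     # estimation error
    th_hat = th_hat + dt * (-a * e + W @ S)                # observer (5)
    W = W + dt * (np.outer(-gamma * e, S) - sigma * gamma * W)  # update law (6)
    if k >= steps // 2:                                    # average after transient
        W_sum += W; n_avg += 1

W_bar = W_sum / n_avg   # constant weights W_bar, cf. the averaging before Eq. (13)
```

Along the orbit, `W_bar @ rbf(theta, centers)` gives the time-invariant representation W̄^T S(θ) of the learned dynamics that is stored as the gait pattern.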
Combined with Theorem 2.4 derived in [27], the exponential stability of the equilibrium point (θ̃_i, W̃_i) = 0 of the nominal part of the adaptive system (8) follows when the persistent excitation (PE) condition on S_i(θ) is satisfied. Since it is not feasible in practice to require the input signal θ(t) to visit every center of the whole RBF network persistently (PE of S_i(θ)), we rewrite Eq. (7) in the following form along the trajectory Z(θ_0):

\dot{\tilde{\theta}}_i = -a_i \tilde{\theta}_i + \tilde{W}_{\zeta i}^T S_{\zeta i}(\theta) - \epsilon_{\zeta i} \quad (9)
where the subscripts (·)_{ζi} and (·)_{ζ̄i} denote terms related to the regions close to and away from the trajectory Z(θ_0), respectively. S_{ζi}(θ) and S_{ζ̄i}(θ) are subvectors of S_i(θ), and Ŵ_{ζi} and Ŵ_{ζ̄i} are the corresponding weight subvectors of Ŵ_i. ε'_{ζi} = ε_{ζi} + Ŵ_{ζ̄i}^T S_{ζ̄i}(θ) = O(ε_{ζi}) is the approximation error along the trajectory. The adaptive system (8) can now be rewritten as

\begin{bmatrix} \dot{\tilde{\theta}}_i \\ \dot{\tilde{W}}_{\zeta i} \end{bmatrix} = \begin{bmatrix} -a_i & S_{\zeta i}(\theta)^T \\ -\Gamma_{\zeta i} S_{\zeta i}(\theta) & 0 \end{bmatrix} \begin{bmatrix} \tilde{\theta}_i \\ \tilde{W}_{\zeta i} \end{bmatrix} + \begin{bmatrix} -\epsilon_{\zeta i} \\ -\sigma_i \Gamma_{\zeta i} \hat{W}_{\zeta i} \end{bmatrix} \quad (10)

and

\dot{\hat{W}}_{\bar{\zeta} i} = \dot{\tilde{W}}_{\bar{\zeta} i} = -\Gamma_{\bar{\zeta} i} S_{\bar{\zeta} i}(\theta)\, \tilde{\theta}_i - \sigma_i \Gamma_{\bar{\zeta} i} \hat{W}_{\bar{\zeta} i} \quad (11)
According to Theorem 2.4 in [27], the regression subvector S_{ζi}(θ) satisfies the PE condition along any periodic or recurrent trajectory. This leads to exponential stability of the origin (θ̃_i, W̃_{ζi}) = 0 of the nominal part of the new adaptive system (10) [29]. Based on the analysis in [27], the weight Ŵ_{ζi} converges to a small neighborhood of W^∗_{ζi} along the trajectory Z_ζ(θ_0), so that

\phi_i(\theta; p) = W_{\zeta i}^{*T} S_{\zeta i}(\theta) + \epsilon_{\zeta i} = \hat{W}_{\zeta i}^T S_{\zeta i}(\theta) - \tilde{W}_{\zeta i}^T S_{\zeta i}(\theta) + \epsilon_{\zeta i} = \hat{W}_{\zeta i}^T S_{\zeta i}(\theta) + \epsilon_{\zeta i 1} \quad (12)
where ε_{ζi1} = ε_{ζi} − W̃_{ζi}^T S_{ζi}(θ) = O(ε_{ζi}) = O(ε_i) is the practical approximation error, which is small. Based on the convergence of Ŵ_i, a constant neural weight vector is obtained as W̄_i = mean_{t∈[t_a, t_b]} Ŵ_i(t), where t_b > t_a > 0 delimit a time segment after the transient process. Accurate modeling of the general gait system dynamics φ_i(θ; p) is therefore achieved along the parameter trajectory by using W̄_i:

\phi_i(\theta; p) = \hat{W}_{\zeta i}^T S_{\zeta i}(\theta) + \epsilon_{\zeta i 1} = \bar{W}_{\zeta i}^T S_{\zeta i}(\theta) + \epsilon_{\zeta i 2} \quad (13)
where ε_{ζi2} = O(ε_{ζi1}) = O(ε_i) is the practical approximation error. Hence, the dynamics φ_i(θ; p) underlying the input signal can be accurately modeled in a time-invariant manner via deterministic learning. This gait dynamics information represents the temporal change of the width parameters between consecutive frames and the dynamical nature of human walking; therefore, the discriminability provided by the extracted gait dynamics is larger than that of the
Fig. 5. Gait dynamics for different individuals under the same walking condition. Left: 25-year-old male (red line), 24-year-old female (blue line); Right: 32-year-old male (red line), 30-year-old female (blue line). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Fig. 6. The gait dynamics of the same person under different view angles. (a) person 5: view 1 (red line), view 2 (blue line); (b) person 6: view 1 (red line), view 2 (blue line). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
original width parameters, even under different view angles. Fig. 5 shows the three-dimensional representation of the extracted gait dynamics for different individuals under the same walking condition, and Fig. 6 demonstrates the robustness of the gait dynamics features to various view angles.

2.4. Knowledge fusion for gait recognition

In this section, we present a gait recognition scheme based on the learned gait dynamics and knowledge fusion under different view angles. As observed in Section 2.3, the extracted gait dynamics is insensitive to variance across view angles. This section further introduces a knowledge fusion strategy using a deep CRNN model to synthesize gait characteristics collected from different view angles. The proposed CRNN network structure is shown in Fig. 7. We use three convolutional layers, the first with 32 kernels, the second with 32 kernels and the
Fig. 7. The CRNN model used for knowledge fusion of gait dynamics under different view angles.
Fig. 8. The PCRNN model used for knowledge fusion of gait dynamics under different view angles.
third with 64 kernels. The learnable kernel size in each layer is 3 × 3, and the ReLU activation function is used in each convolutional layer. Each convolutional layer is followed by a max-pooling layer with 2 × 2 windows and 2 × 2 stride. Hence, for each input sample, a feature map is obtained after the convolutional and max-pooling layers and is fed into a Long Short-Term Memory (LSTM) layer, which further learns the temporal relationship between feature maps. The output features of the LSTM layer are passed through a fully connected (FC) layer with 64 neurons, which learns global features. Finally, a softmax layer derives the probability distribution across the different classes. Within a residual block, a batch normalization (BN) layer and a dropout layer followed by a max-pooling layer are employed. BN normalizes each mini-batch throughout the network, reducing the internal covariate shift caused by progressive transforms, while dropout deactivates a fraction of neurons and prevents over-fitting during training. This paper further investigates the performance of a parallel CRNN model (PCRNN) in the recognition task. As shown in Fig. 8, the parallel architecture is divided into four blocks in order to maintain the spatial and temporal characteristics underlying the original gait dynamics signal. The outputs of the two parallel blocks are fused into one uniform feature vector, and a softmax operation after a fully connected layer yields a probability distribution over individuals. The CNN block includes three convolutional layers with 16, 32 and 64 filters, respectively. Max-pooling is used for signal representation: the first max-pooling layer uses a 2 × 2 window with 2 × 2 stride, and the upper max-pooling layer uses
Fig. 9. Sample images from 11 different views in the CASIA-B gait database.
a 4 × 4 region with 2 × 2 stride to extract more robust representations. In the RNN block, the input is first processed by a max-pooling layer to reduce the dimension, and the following LSTM layer is employed for feature extraction. We concatenate the outputs of the CNN block and the RNN block into one feature vector; after this fusion, the syncretic representation is fed into a fully connected layer and a softmax layer to perform classification.

3. Experiments

In this section, comprehensive experimental results on the CASIA-B database [30] and the CMU MoBo gait database [31] demonstrate the recognition accuracy and robustness of the proposed method. The CASIA-B gait database contains a large number of subjects and directly supports the study of multi-view gait recognition with large view variation; the CMU MoBo gait database has been widely used in a large number of existing works. All experiments are implemented in Matlab and tested on a laptop with an Intel Core i7 (3.5 GHz) CPU, 8 GB RAM and the 64-bit Windows 10 operating system.

3.1. Experiments on the CASIA-B gait database

The CASIA-B database contains gait sequences of 124 subjects captured from 11 view angles (namely 0°, 18°, 36°, 54°, 72°, 90°, 108°, 126°, 144°, 162° and 180°). At each view angle, each subject walks along a straight line at normal speed 6 times, with a bag 2 times and with a coat 2 times, yielding normal walking sequences (nm-01, ..., nm-06), bag-carrying sequences (bg-01, bg-02) and coat-wearing sequences (cl-01, cl-02). All gait sequences are collected at a far distance in a uniform, controlled environment. Fig. 9 gives several sample images from this database.

3.1.1. Recognition accuracy on the CASIA-B gait database with no knowledge fusion

In our experiments, as mentioned, width parameters are extracted and the gait dynamics is learned.
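As a concrete illustration of the knowledge-fusion network of Section 2.4, the CRNN can be sketched in PyTorch. The input-map size (32 × 32 per view), the treatment of the 11 view angles as a sequence, and the head sizes are assumptions for this sketch, not the authors' exact implementation:

```python
import torch
import torch.nn as nn

class CRNN(nn.Module):
    """CNN layers encode local structure of each view's gait-dynamics map;
    an LSTM fuses knowledge across the multi-view sequence (Section 2.4)."""
    def __init__(self, n_classes=124, in_hw=32):
        super().__init__()
        self.cnn = nn.Sequential(                       # 32, 32, 64 kernels, 3x3
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        feat = 64 * (in_hw // 8) ** 2                   # flattened CNN feature size
        self.lstm = nn.LSTM(feat, 64, batch_first=True)
        self.head = nn.Sequential(nn.Linear(64, 64), nn.ReLU(),
                                  nn.Linear(64, n_classes))

    def forward(self, x):                  # x: (batch, views, 1, H, W)
        b, v = x.shape[:2]
        f = self.cnn(x.flatten(0, 1))      # per-view convolutional features
        f = f.flatten(1).view(b, v, -1)    # sequence of view features
        out, _ = self.lstm(f)              # fuse across the view sequence
        return self.head(out[:, -1])       # class logits (softmax in the loss)

model = CRNN()
logits = model(torch.randn(2, 11, 1, 32, 32))   # 2 samples x 11 view angles
```

Training would minimize cross-entropy over the 124 subject classes, with the softmax supplied by the loss function.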
To eliminate the data difference between the different width parameters, all the
Fig. 10. Neural weights convergence in the deterministic learning algorithm.

Table 1
Gait recognition performance (%) on the CASIA-B gait database with no knowledge fusion (rows: probe view; columns: gallery view).

Probe\Gallery   0°   18°   36°   54°   72°   90°  108°  126°  144°  162°  180°
0°              87   83    86    87    87    88    90    87    83    83    86
18°             87   88    87    84    86    87    91    87    84    87    84
36°             86   90    87    83    81    91    90    83    81    86    87
54°             84   83    87    86    83    86    91    84    84    83    86
72°             81   88    83    86    84    93    88    78    86    84    83
90°             87   87    81    84    86    92    94    86    81    86    81
108°            86   83    86    87    84    91    90    81    81    84    89
126°            84   88    83    86    81    83    93    83    81    86    90
144°            87   81    84    84    86    93    90    88    87    83    87
162°            86   86    84    81    78    91    87    86    86    87    87
180°            81   84    84    83    81    91    87    87    81    83    91
width data are normalized. Fig. 10 shows an example of the convergence of the neural weights during the deterministic learning computation. Two types of experiments are carried out on this database. The first is gait recognition with no knowledge fusion: only single-view sequences are used as training patterns, and the robustness of the proposed gait dynamics features against view conditions is evaluated. A total of 121 experiments are carried out, one for each gallery-probe view configuration: (0°-0°), (0°-18°), (0°-36°), ..., (180°-180°). For each configuration, 124 subjects with 124 × 3 sequences are used as training patterns and 124 × 3 sequences are used as test patterns. We adopt the same protocol as [19] for evaluation, in which dynamics similarities between the test pattern and the trained patterns are used for gait recognition. The results are tabulated in Table 1. Two observations follow: (1) the feasibility of the extracted gait dynamics features underlying the width parameters as an identifier of individuals is confirmed; (2) the discriminability provided by the proposed dynamical features is similar across the different gallery-probe view configurations. The proposed method has little sensitivity to
Table 2
Experiments on the CASIA-B database for robustness against different view angles.

Experiment   Gallery set for fusion   Probe set   Gallery size    Probe size
A            0°-180° (all 11 views)   0°          124 × 3 × 11    124 × 3
B            0°-180°                  18°         124 × 3 × 11    124 × 3
C            0°-180°                  36°         124 × 3 × 11    124 × 3
D            0°-180°                  54°         124 × 3 × 11    124 × 3
E            0°-180°                  72°         124 × 3 × 11    124 × 3
F            0°-180°                  90°         124 × 3 × 11    124 × 3
G            0°-180°                  108°        124 × 3 × 11    124 × 3
H            0°-180°                  126°        124 × 3 × 11    124 × 3
I            0°-180°                  144°        124 × 3 × 11    124 × 3
J            0°-180°                  162°        124 × 3 × 11    124 × 3
K            0°-180°                  180°        124 × 3 × 11    124 × 3
Table 3
Gait recognition performance against different view angles on the CASIA-B gait database.

Experiment   A    B    C    D    E    F    G    H    I    J    K
CCR (%)      89   90   84   91   84   92   87   79   91   85   79

CCR: correct classification rate.
the effect of different view angles and avoids a great drop in recognition rate. This is because the proposed method captures gait dynamics features that preserve the temporal dynamics information of human walking and do not rely on shape information.

3.1.2. Recognition accuracy on the CASIA-B gait database with knowledge fusion

The second type of experiment is gait recognition with knowledge fusion; that is, multi-view sequences are used as training patterns and synthesized in the evaluation. The eleven experiments designed for this section are listed in Table 2. For each view angle, we assign three nm sequences to the training set for all 124 subjects, so there are 124 × 3 × 11 = 4092 patterns in the training dataset for the proposed knowledge fusion algorithm. The process of gait dynamics extraction and knowledge fusion is similar to Section 2 and is omitted here for conciseness. The recognition performance of the proposed method with the CRNN model is reported in Table 3. It should be noted that a poor algorithm can still achieve high recognition rates if the number of subjects is small. This section therefore evaluates the proposed algorithm under different numbers of subjects. Similar to [32], we randomly divide the 124 subjects into four subsets of 31 subjects each. For each subject count, we
Fig. 11. Average recognition accuracy under different numbers of subjects.

Table 4. Details of the adopted CRNN architectures in the experiments.

Model     Architecture details
CNN       Conv(16,3,3,1)-MaxPool(2,2)-BN-Conv(32,3,3,1)-MaxPool(2,2)-BN-Conv(64,3,3,1)-MaxPool(2,2)-BN-FC(32)-Drop(0.5)-FC(2)
RNN       LSTM(64)-FC(64)-Drop(0.5)-FC(2)
CRNN-a    Conv(16,3,3,1)-MaxPool(2,2)-BN-Conv(32,3,3,1)-MaxPool(4,4)-BN-Conv(64,3,3,1)-MaxPool(2,2)-BN-LSTM(64)-FC(32)-Drop(0.5)-FC(2)
CRNN-b    Conv(16,3,3,1)-MaxPool(2,2)-BN-Conv(32,3,3,1)-MaxPool(4,4)-BN-Conv(64,3,3,1)-MaxPool(2,2)-BN-GRU(64)-FC(32)-Drop(0.5)-FC(2)
PCRNN-a   Conv(16,3,3,1)-MaxPool(2,2)-BN-Conv(32,3,3,1)-MaxPool(4,4)-BN-Conv(64,3,3,1)-MaxPool(2,2)-BN; MaxPool(2,2)-LSTM(64); FC(32)-Drop(0.5)-FC(2)
PCRNN-b   Conv(16,3,3,1)-MaxPool(2,2)-BN-Conv(32,3,3,1)-MaxPool(4,4)-BN-Conv(64,3,3,1)-MaxPool(2,2)-BN; MaxPool(2,2)-GRU(64); FC(32)-Drop(0.5)-FC(2)
draw these four subsets randomly as the gallery set, and the recognition performance under each subject count is reported as the average rate over the subsets. Detailed experimental results are given in Fig. 11. The experimental settings and the training and testing procedure are similar to those of the preceding experiments and are omitted here for conciseness. Fig. 11 shows that the proposed method is insensitive to the number of subjects.

Deep learning techniques show powerful discriminability in pattern recognition but rely on a large gallery set. Fig. 12 therefore reports the recognition performance of the proposed method under different gallery sizes. Unlike conventional time-frequency features, the extracted gait dynamics features embed more distinctive dynamics information and represent the essential characteristics of human walking. The experimental results show that the proposed method avoids a sharp drop in recognition rate even when the gallery size is small.

We also investigate the performance of the proposed method under different deep learning architectures (models). As shown in Table 4, we provide a description of the adopted network
Fig. 12. Average recognition accuracy under different gallery sizes. Here n is the number of gait sequences per subject used in the training set.
Fig. 13. Results of different deep learning models: (a) Original CNN and RNN models; (b) CRNN-a model and CRNN-b model; (c) PCRNN-a model and PCRNN-b model.
architectures. Here, Conv(z,x,y,n) stands for a convolutional layer with z filters, where x and y are the width and height of the 2D filter window and n is the stride. MaxPool(x,y) stands for a max pooling layer with pool sizes x and y. BN stands for a batch normalization layer. FC(x) stands for a fully connected layer with x nodes. LSTM(x) stands for an LSTM layer whose output space has dimensionality x, and GRU(x) for a Gated Recurrent Unit layer whose output space has dimensionality x. Drop(x) stands for a dropout layer with dropout rate x. In all cases, training was run for 50 epochs. All activation functions were rectified linear units (ReLU), except for the softmax activation in the last layer. It can be observed from Fig. 13 that the proposed CRNN and PCRNN models outperform the original CNN and RNN models. In terms of the best recognition rate, the proposed PCRNN-a model is not inferior to any of the other models. In terms of the average recognition rate, the PCRNN-a model is more robust to view variations and avoids a dramatic degradation of the recognition rate.
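To make the layer notation concrete, the following sketch (our own illustrative helper, not the authors' code) traces how a hypothetical 64 × 64 input shrinks through the convolutional front end of CRNN-a; unpadded ("valid") convolutions and non-overlapping pooling are assumptions, since the paper does not specify padding:

```python
# Illustrative helper (not the authors' code): trace the spatial size of a
# feature map through the convolutional front end of CRNN-a from Table 4,
# assuming unpadded ("valid") convolutions and non-overlapping pooling.

def conv2d(h, w, k=3, s=1):
    """Output size of a valid 2-D convolution with a k x k kernel, stride s."""
    return (h - k) // s + 1, (w - k) // s + 1

def maxpool(h, w, p):
    """Output size of non-overlapping p x p max pooling."""
    return h // p, w // p

def crnn_a_front_end(h, w):
    # Conv(16,3,3,1)-MaxPool(2,2)-BN-Conv(32,3,3,1)-MaxPool(4,4)-BN-
    # Conv(64,3,3,1)-MaxPool(2,2)-BN  (BN layers do not change the shape)
    h, w = conv2d(h, w); h, w = maxpool(h, w, 2)
    h, w = conv2d(h, w); h, w = maxpool(h, w, 4)
    h, w = conv2d(h, w); h, w = maxpool(h, w, 2)
    return h, w

# A hypothetical 64 x 64 input shrinks to 2 x 2 before the
# LSTM(64)-FC(32)-Drop(0.5)-FC(2) tail.
print(crnn_a_front_end(64, 64))  # (2, 2)
```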
Fig. 14. Average recognition accuracy under different view fusion strategies.
Considering the demands of practical applications, this section evaluates the proposed method under different view fusion strategies, i.e., with different numbers of fused views. The detailed experiments designed for this section are shown in Fig. 14. For each view fusion strategy, we assign three nm sequences to the training set and three nm sequences to the testing set for all 124 subjects. The results indicate that, for optimal recognition performance, the algorithm should incorporate as many informative cues from different view angles as possible. The proposed method makes full use of multiple cameras to counteract the influence of body rotation, enabling robust gait recognition in real-world surveillance environments.

The proposed method is further compared with the existing view-invariant method of Ref. [2] on the CASIA-B gait database. Table 5 shows the detailed recognition performance comparison, from which the following observations can be made: (1) Although the method in Ref. [2] obtains high recognition rates for certain probe views, it suffers great drops in recognition rate for others. Compared with [2], this paper employs more advanced view fusion and knowledge fusion concepts to construct the deep learning framework, which proves effective for multi-view information fusion; multi-view information contains more discriminant features than single-view information. (2) The proposed method achieves reliable performance whether the view variation is small or large; this consistency facilitates the practical application of gait recognition.
(3) The proposed method introduces a knowledge fusion strategy instead of directly constructing a uniform trained bank of gait patterns under different view angles, which speeds up the matching of a test gait pattern against the set of trained gait patterns.
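The benefit of combining views in observations (1)-(3) can be illustrated with a deliberately naive late-fusion sketch; the paper fuses learned gait dynamics inside a CRNN, so the score averaging below and all its numbers are purely illustrative assumptions:

```python
# Deliberately naive late-fusion sketch (illustration only): the paper
# fuses learned gait dynamics inside a CRNN, but even simple averaging of
# per-view class scores shows why multiple views help. All numbers are
# made up for illustration.

def fuse_scores(per_view_scores):
    """Average class scores coming from several view angles."""
    n_views = len(per_view_scores)
    n_classes = len(per_view_scores[0])
    return [sum(view[c] for view in per_view_scores) / n_views
            for c in range(n_classes)]

scores_front = [0.2, 0.5, 0.3]   # ambiguous from a frontal view
scores_side  = [0.1, 0.8, 0.1]   # more confident from a side view
fused = fuse_scores([scores_front, scores_side])
print(fused.index(max(fused)))   # subject index 1 wins after fusion
```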
Table 5. Gait recognition performance comparison on the CASIA-B gait database.

Experiment:            A    B    C    D    E    F    G    H    I    J    K
Ref. [2] (%):          55   44   67   78   78   88   68   77   76   77   58
Proposed method (%):   83   86   88   91   92   95   93   90   87   86   85
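A quick arithmetic summary of Table 5 (the per-view CCRs are transcribed from the table; the means are our own summary, not reported in the paper):

```python
# Mean CCR over the eleven probe views in Table 5 (values from the table).
ref2     = [55, 44, 67, 78, 78, 88, 68, 77, 76, 77, 58]
proposed = [83, 86, 88, 91, 92, 95, 93, 90, 87, 86, 85]

mean_ref2 = sum(ref2) / len(ref2)          # about 69.6 %
mean_prop = sum(proposed) / len(proposed)  # about 88.7 %
print(round(mean_ref2, 1), round(mean_prop, 1))
```

Note also that the worst-case view for the proposed method (83%) stays well above the worst case of Ref. [2] (44%), which is the consistency claim made in observation (2).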
Table 6. Experiments on the CMU MoBo database for robustness against different view angles.

Experiment   Gallery set for fusion   Probe set   Gallery size   Probe size
A            NW+SW+S+SE+E             NW          25 × 3 × 5     25 × 3
B            NW+SW+S+SE+E             SW          25 × 3 × 5     25 × 3
C            NW+SW+S+SE+E             S           25 × 3 × 5     25 × 3
D            NW+SW+S+SE+E             SE          25 × 3 × 5     25 × 3
E            NW+SW+S+SE+E             E           25 × 3 × 5     25 × 3
Fig. 15. Sample images from 5 different views in the CMU MoBo gait database.
3.2. Experiments on the CMU MoBo gait database

We further evaluate robustness against view conditions on the CMU MoBo gait database, which comprises gait sequences from 25 subjects under six different views: east (E), southeast (SE), south (S), southwest (SW), northwest (NW) and north (N), with the walking subject facing south. The database also covers two walking speeds, slow walking (sw) and fast walking (fw). Fig. 15 shows several sample images from the CMU database. Only five of the six view directions are used in our experiments; the north view is omitted. To increase the gallery and probe sizes, each walking sequence in the database is divided into three subsequences. The experiments designed for this database are listed in Table 6. The deterministic learning computation and CRNN-based knowledge fusion proceed as in the CASIA-B examples and are omitted here for conciseness. Experimental results
Table 7. Gait recognition performance comparison on the CMU MoBo gait database.

Experiment:            A     B    C    D    E
Ref. [2] (%):          88    72   56   88   92
Proposed method (%):   100   92   92   96   100
Table 8. Eleven experiments on the CASIA-B gait database for robustness tests. The gait sequences of normal walking serve as the gallery set and walking sequences with different clothing types as the probe set.

Experiment   Gallery set for fusion (nm)   Probe set (cl)   Gallery size    Probe size
A1           0°-180° (all 11 views)        0°               124 × 3 × 11    124 × 2
B1           0°-180° (all 11 views)        18°              124 × 3 × 11    124 × 2
C1           0°-180° (all 11 views)        36°              124 × 3 × 11    124 × 2
D1           0°-180° (all 11 views)        54°              124 × 3 × 11    124 × 2
E1           0°-180° (all 11 views)        72°              124 × 3 × 11    124 × 2
F1           0°-180° (all 11 views)        90°              124 × 3 × 11    124 × 2
G1           0°-180° (all 11 views)        108°             124 × 3 × 11    124 × 2
H1           0°-180° (all 11 views)        126°             124 × 3 × 11    124 × 2
I1           0°-180° (all 11 views)        144°             124 × 3 × 11    124 × 2
J1           0°-180° (all 11 views)        162°             124 × 3 × 11    124 × 2
K1           0°-180° (all 11 views)        180°             124 × 3 × 11    124 × 2
and comparison are presented in Table 7. We report the recognition rates of the proposed method under the slow-walking condition in all experiments of this section. The results show that the proposed method achieves reliable performance across different walking views.

3.3. Discussion

Considering the demand for robust gait recognition, this section further tests the robustness of the proposed method against other walking variations. In real-world surveillance environments, multiple cameras are used simultaneously for reliable human identification, and gait characteristics extracted from multiple cameras are more comprehensive than those from a single camera for developing a robust recognition algorithm. Based on this premise, the proposed recognition scheme based on multi-view knowledge fusion can achieve reliable performance even under changes in clothing type, carrying condition and walking speed.

We first carry out experiments on the CASIA-B database with clothing-type and carrying-condition variations. In these experiments, the test gaits are walking while carrying a bag (bg) or wearing different clothes (cl), and the training gaits are normal walking (nm). The detailed experimental designs are outlined in Tables 8 and 9. The training and testing procedure is similar to that of the normal walking situation and is omitted here for conciseness. Experimental results are given in Tables 10 and 11. The proposed PCRNN-based knowledge fusion method can recognize individuals even when clothing types and carrying conditions differ significantly.
Table 9. Eleven experiments on the CASIA-B gait database for robustness tests. The gait sequences of normal walking serve as the gallery set and walking sequences with different carrying conditions as the probe set.

Experiment   Gallery set for fusion (nm)   Probe set (bg)   Gallery size    Probe size
A2           0°-180° (all 11 views)        0°               124 × 3 × 11    124 × 2
B2           0°-180° (all 11 views)        18°              124 × 3 × 11    124 × 2
C2           0°-180° (all 11 views)        36°              124 × 3 × 11    124 × 2
D2           0°-180° (all 11 views)        54°              124 × 3 × 11    124 × 2
E2           0°-180° (all 11 views)        72°              124 × 3 × 11    124 × 2
F2           0°-180° (all 11 views)        90°              124 × 3 × 11    124 × 2
G2           0°-180° (all 11 views)        108°             124 × 3 × 11    124 × 2
H2           0°-180° (all 11 views)        126°             124 × 3 × 11    124 × 2
I2           0°-180° (all 11 views)        144°             124 × 3 × 11    124 × 2
J2           0°-180° (all 11 views)        162°             124 × 3 × 11    124 × 2
K2           0°-180° (all 11 views)        180°             124 × 3 × 11    124 × 2
Table 10. Gait recognition performance on the CASIA-B gait database under different clothing types.

Experiment:   A1   B1   C1   D1   E1   F1   G1   H1   I1   J1   K1
CCR (%):      75   84   86   81   89   80   79   76   75   79   80
Table 11. Gait recognition performance on the CASIA-B gait database under different carrying conditions.

Experiment:   A2   B2   C2   D2   E2   F2   G2   H2   I2   J2   K2
CCR (%):      80   85   84   89   79   85   87   80   79   79   83
Table 12. Five experiments on the CMU MoBo gait database for robustness tests. The gait sequences of slow walking serve as the gallery set and fast walking as the probe set.

Experiment   Gallery set for fusion (slow walking)   Probe set (fast walking)   Gallery size   Probe size
A3           NW+SW+S+SE+E                            NW                         25 × 3 × 5     25 × 3
B3           NW+SW+S+SE+E                            SW                         25 × 3 × 5     25 × 3
C3           NW+SW+S+SE+E                            S                          25 × 3 × 5     25 × 3
D3           NW+SW+S+SE+E                            SE                         25 × 3 × 5     25 × 3
E3           NW+SW+S+SE+E                            E                          25 × 3 × 5     25 × 3
Table 13. Five experiments on the CMU MoBo gait database for robustness tests. The gait sequences of fast walking serve as the gallery set and slow walking as the probe set.

Experiment   Gallery set for fusion (fast walking)   Probe set (slow walking)   Gallery size   Probe size
F3           NW+SW+S+SE+E                            NW                         25 × 3 × 5     25 × 3
G3           NW+SW+S+SE+E                            SW                         25 × 3 × 5     25 × 3
H3           NW+SW+S+SE+E                            S                          25 × 3 × 5     25 × 3
I3           NW+SW+S+SE+E                            SE                         25 × 3 × 5     25 × 3
J3           NW+SW+S+SE+E                            E                          25 × 3 × 5     25 × 3
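The gallery and probe sizes in Tables 12 and 13 (25 × 3 × 5 and 25 × 3) rely on the splitting step from Section 3.2, where each CMU MoBo walking sequence is divided into three subsequences. A minimal sketch of that step (our own illustrative helper, not the authors' code; the 300-frame sequence length is an assumption):

```python
# Hypothetical sketch of the sequence-splitting step: each CMU MoBo
# walking sequence is divided into three subsequences to enlarge the
# gallery and probe sets. Frame counts here are illustrative only.

def split_sequence(frames, n_parts=3):
    """Split a list of frames into n_parts nearly equal subsequences."""
    k, r = divmod(len(frames), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        end = start + k + (1 if i < r else 0)  # spread any remainder
        parts.append(frames[start:end])
        start = end
    return parts

parts = split_sequence(list(range(300)))  # an assumed 300-frame sequence
print([len(p) for p in parts])            # [100, 100, 100]
```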
Table 14. Gait recognition performance on the CMU MoBo gait database under different walking speeds.

Experiment:   A3   B3   C3   D3   E3   F3   G3   H3   I3   J3
CCR (%):      96   93   88   88   92   87   82   79   77   92
We further carry out experiments on the CMU MoBo database with walking speed variations. The detailed experimental designs are outlined in Tables 12 and 13. The training and testing procedure is similar to that of the normal walking situation and is omitted here for conciseness. Experimental results are given in Table 14. The proposed method achieves very promising performance whether the walking speed is fast or slow.

4. Conclusions

A new gait recognition method that is robust against different view angles, based on gait dynamics and knowledge fusion, is proposed in this work. Its performance has been evaluated experimentally on the CASIA-B and CMU MoBo gait databases. To obtain view-invariant features for robust gait recognition, we extract the gait system dynamics underlying the width parameters using RBF neural networks through the deterministic learning algorithm. This gait dynamics information is shown to be insensitive to view variations. Gait dynamics collected from different views are synthesized by a deep convolutional and recurrent neural network model. Comprehensive performance evaluations demonstrate the promising performance of the proposed method, which achieves reliable results under different view angles and is suitable for real-time gait recognition in practice.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported by the NNSF of China under Grants 61803133 and 61806062, the National Program for Major Research Instruments under Grant 61527811, the Natural Science Foundation of Guangdong Province under Grant 2016A030313554, and the key research grant for national fitness from the General Administration of Sports of China under Grant 2015B043.

References

[1] J. Zhang, J. Pu, C. Chen, R. Fleischer, Low-resolution gait recognition, IEEE Trans. Syst. Man Cybern. Part B Cybern. 40 (4) (2010) 986.
[2] W. Zeng, C. Wang, View-invariant gait recognition via deterministic learning, in: Proceedings of the International Joint Conference on Neural Networks (IJCNN), 2014, pp. 3465–3472.
[3] N. Liu, J. Lu, G. Yang, Y.P. Tan, Robust gait recognition via discriminative set matching, J. Vis. Commun. Image Represent. 24 (4) (2013) 439–447.
[4] R. Cilla, M.A. Patricio, A. Berlanga, J.M. Molina, A probabilistic, discriminative and distributed system for the recognition of human actions from multiple views, Neurocomputing 75 (1) (2012) 78–87.
[5] G. Michela, B. Imed, J.N. Carter, M.S. Nixon, Self-calibrating view-invariant gait biometrics, IEEE Trans. Syst. Man Cybern. B Cybern. 40 (4) (2010) 997–1008.
[6] Z. Wu, Y. Huang, L. Wang, X. Wang, T. Tan, A comprehensive study on cross-view gait based human identification with deep CNNs, IEEE Trans. Pattern Anal. Mach. Intell. 39 (2) (2017) 209–226.
[7] Z. Wei, W. Cong, View-invariant gait recognition via deterministic learning, Neurocomputing 175 (1) (2015) 324–335.
[8] F. Jean, A.B. Albu, R. Bergevin, Towards view-invariant gait modeling: computing view-normalized body part trajectories, Pattern Recognit. 42 (11) (2009) 2936–2949.
[9] N. Liu, J. Lu, Y.P. Tan, Joint subspace learning for view-invariant gait recognition, IEEE Signal Process. Lett. 18 (7) (2011) 431–434.
[10] M. Goffredo, I. Bouchrika, J.N. Carter, M.S. Nixon, Performance analysis for automated gait extraction and recognition in multi-camera surveillance, Multimed. Tools Appl. 50 (1) (2010) 75–94.
[11] M. Hu, Y. Wang, Z. Zhang, J.J. Little, H. Di, View-invariant discriminative projection for multi-view gait-based human identification, IEEE Trans. Inf. Forensics Secur. 8 (12) (2013) 2034–2045.
[12] W. Kusakunniran, Q. Wu, J. Zhang, H. Li, Cross-view and multi-view gait recognitions based on view transformation model using multi-layer perceptron, Pattern Recognit. Lett. 29 (1) (2012) 882–889.
[13] D. Muramatsu, A. Shiraishi, Y. Makihara, M.Z. Uddin, Y. Yagi, Gait-based person recognition using arbitrary view transformation model, IEEE Trans. Image Process. 24 (1) (2015) 140–154.
[14] J. Tang, J. Luo, T. Tjahjadi, F. Guo, Robust arbitrary-view gait recognition based on 3D partial similarity matching, IEEE Trans. Image Process. 26 (1) (2016) 7–22.
[15] G. Zhao, G. Liu, L. Hua, M. Pietikainen, 3D gait recognition using multiple cameras, in: Proceedings of the International Conference on Automatic Face & Gesture Recognition, 2006.
[16] Z. Zhang, N.F. Troje, View-independent person identification from human gait, Neurocomputing 69 (1) (2005) 250–256.
[17] R. Bodor, A. Drenner, D. Fehr, O. Masoud, N. Papanikolopoulos, View-independent human motion classification using image-based reconstruction, Image Vis. Comput. 27 (8) (2009) 1194–1206.
[18] T. Connie, M.K. Goh, A.B. Teoh, A Grassmannian approach to address view change problem in gait recognition, IEEE Trans. Cybern. PP (99) (2017) 1–14.
[19] M. Deng, C. Wang, Q. Chen, Human gait recognition based on deterministic learning through multiple views fusion, Pattern Recognit. Lett. 78 (2016) 56–63.
[20] M. Deng, C. Wang, F. Cheng, W. Zeng, Fusion of spatial-temporal and kinematic features for gait recognition with deterministic learning, Pattern Recognit. 67 (C) (2017) 186–200.
[21] M. Deng, C. Wang, T. Zheng, Individual identification using a gait dynamics graph, Pattern Recognit. 83 (2018) 287–298.
[22] C.P. Lee, A.W.C. Tan, S.C. Tan, Gait recognition via optimally interpolated deformable contours, Pattern Recognit. Lett. 34 (6) (2013) 663–669.
[23] M. Hu, Y. Wang, Z. Zhang, D. Zhang, J.J. Little, Incremental learning for video-based gait recognition with LBP flow, IEEE Trans. Syst. Man Cybern. Part B Cybern. 43 (1) (2013) 77–89.
[24] S.D. Choudhury, T. Tjahjadi, Silhouette-based gait recognition using Procrustes shape analysis and elliptic Fourier descriptors, Pattern Recognit. 45 (9) (2012) 3414–3426.
[25] W. Jiang, Y. Yi, J. Mao, Z. Huang, X. Wei, CNN-RNN: a unified framework for multi-label image classification, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 2285–2294.
[26] J. Lu, E. Zhang, Gait recognition for human identification based on ICA and fuzzy SVM through multiple views fusion, Pattern Recognit. Lett. 28 (16) (2007) 2401–2411.
[27] C. Wang, Deterministic Learning Theory for Identification, Recognition, and Control, CRC Press, 2009.
[28] C. Wang, T. Chen, G. Chen, D.J. Hill, Deterministic learning of nonlinear dynamical systems, Int. J. Bifurc. Chaos 19 (4) (2009) 1307–1328.
[29] J.A. Farrell, Stability and approximator convergence in nonparametric nonlinear adaptive control, IEEE Trans. Neural Netw. 9 (5) (1998) 1008–1020.
[30] S. Yu, D. Tan, T. Tan, A framework for evaluating the effect of view angle, clothing and carrying condition on gait recognition, in: Proceedings of the Eighteenth International Conference on Pattern Recognition, 2006, pp. 441–444.
[31] R. Gross, J. Shi, The CMU Motion of Body (MoBo) Database, Technical Report CMU-RI-TR-01-18, Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, 2001.
[32] K. Yang, Y. Dou, S. Lv, F. Zhang, Q. Lv, Relative distance features for gait recognition with Kinect, J. Vis. Commun. Image Represent. 39 (2016) 209–217.