Accepted Manuscript
Person re-identification post-rank optimization via hypergraph-based learning
Saeed-Ur Rehman, Zonghai Chen, Mudassar Raza, Peng Wang, Qibin Zhang
PII: S0925-2312(18)30127-9
DOI: 10.1016/j.neucom.2018.01.086
Reference: NEUCOM 19288
To appear in: Neurocomputing
Received date: 20 December 2016
Revised date: 30 January 2018
Accepted date: 31 January 2018
Please cite this article as: Saeed-Ur Rehman, Zonghai Chen, Mudassar Raza, Peng Wang, Qibin Zhang, Person re-identification post-rank optimization via hypergraph-based learning, Neurocomputing (2018), doi: 10.1016/j.neucom.2018.01.086
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Highlights
A hypergraph is exploited to describe the higher-order relationships between person re-identification images. To the best of our knowledge, no existing work explores the features of hypergraphs for image re-ranking in person re-identification.
A new refinement algorithm is presented for rank list classification and filtering. This process is based on relative score estimation and rank categorization.
A hypergraph is constructed via discriminative information from different visual features and complex visual representations, whereas individual weight learning is performed using soft assignment and each hyperedge is given a unique weight.
Promising results are achieved and compared with available state-of-the-art approaches.
Person re-identification post-rank optimization via hypergraph-based learning
Saeed-Ur-Rehman, *Zonghai Chen, Mudassar Raza, Peng Wang, Qibin Zhang
Department of Automation, University of Science and Technology of China (USTC), Hefei, Anhui, PR. China
*Corresponding Author: Zonghai Chen (email: [email protected])
Abstract: In computer vision, person re-identification has recently received significant attention from researchers and is becoming an emerging research domain with various challenges. Specifically, re-ranking or post-rank optimization is a significant challenge. Existing re-identification methods perform well in certain particular scenarios, but their performance at rank-1 remains a major concern. Such methods cannot model the complex and higher-order relationships among the images. To address such issues, we present a hypergraph-based learning scheme that not only improves the rank-1 accuracy but also models the complex and higher-order relationships among the images. After obtaining the rank list using a baseline method, we apply a new refinement algorithm on it to classify ranks accordingly. Furthermore, to discover the relationship among samples, we utilize hypergraphs for re-rank learning. A soft assignment technique is used to perform weight learning of hyperedges. The proposed method achieves better ranking performance; consequently, the re-identification is improved. An extensive experimental analysis on challenging and publicly available datasets reveals that the proposed re-ranking scheme performs better than the existing methods.
Keywords: Hypergraph-based learning, post-rank optimization, person re-identification, rank classification
1. Introduction
In recent years, vision systems have been deployed in public places, such as airports, subway stations and shopping malls, for regular security monitoring. The surveillance cameras generate visual data that are passed to security agencies, law enforcement units, and various other military and civil divisions for investigation and forensic purposes. With these data, an analyst can easily monitor the activities and behaviors of individuals or groups of people at a specific time. Additionally, this visual information helps officers with surveillance, event prediction and timely alerts in various situations. Person re-identification is an important application for this technology, where images of people are matched across multiple cameras. Recently, person re-identification has emerged as an active field for researchers [1-3]. In such a re-identification system, a query image, also called a probe, is compared with database images, known as the gallery set. The matched images must be presented to users for prompt decisions, so the accuracy, or recognition rate, is very important.
In general, person re-identification is difficult to automate due to various challenges such as variations in lighting, background, and viewpoint that can degrade the performance of a re-identification system. Recently, a range of techniques [4-8] has been presented in the literature to improve the accuracy of person re-identification systems. The exact match and retrieval of a person's image from a large database is an intricate process in a person re-identification system, where the goal is to produce a rank list that contains the target images after matching with the gallery set. Existing methods primarily focus on two aspects: 1) generating robust feature representations or feature descriptors [9] and 2) learning an effective distance metric [10]. In most approaches, the matching score is computed between the query and every gallery image via extracted features, and then a rank list is generated. This type of pairwise similarity is unable to explore the complex and high-order relationships between the sample images. Therefore, it leads to suboptimal matching results, especially at rank-1. Although several efforts have been made to improve accuracy and performance, there is still room for further improvement [2]. Fig. 1 illustrates a general person re-identification system including a post-ranking module.
To boost the ranking results and to enhance the rank-1 accuracy of a re-identification system, several re-ranking systems, also called post-rank optimization approaches, have been investigated. In our study, both terms are used interchangeably. Soft biometrics [11], post-rank optimization [12], bidirectional re-ranking [13] and saliency re-ranking [14] have been used to improve rank list performance. Although various add-on techniques [11, 13, 15] are applied with the baseline methods [4, 5, 14] and different algorithms have been proposed for image re-ranking, existing results show that the rank-1 accuracy is still not guaranteed and the resultant techniques are disjoint from the initial matching results [2, 3]. Hence, extracting features from different feature spaces and exploring higher-order relationships between the samples can be a solution. Therefore, the focus of this study is post-rank optimization, or re-ranking, which is currently the least addressed aspect in the literature [2].
Fig. 1. A typical person re-identification system with a post-ranking module
In this work, we present a hypergraph-based learning method that solves the re-ranking problem for a person re-identification system. Hypergraphs have demonstrated outstanding performance in clustering, image retrieval and person re-identification [16-18]. Unlike a conventional graph, where an edge connects only two vertices and considers the pairwise relationship, a hypergraph explores the higher-order relationships and constructs a structure having more than two vertices connected with one edge. This structure also helps in the visualization and consideration of complex data [19].
The main contributions of this study are as follows:
1. A hypergraph is exploited to describe the higher-order relationship among person re-identification images. To the best of our knowledge, no existing work is found for image re-ranking in person re-identification that explores the features of a hypergraph.
2. A refinement algorithm is presented for classification and filtering of initial rank lists. The process is based on relative score estimation and rank categorization.
3. A hypergraph is constructed via discriminative information from different visual features, and hyperedge weight learning is performed using a soft assignment mechanism.
The rest of the paper is organized as follows: Section 2 lists the related work. Section 3 introduces the motivations and preliminaries of hypergraph-based learning. A detailed overview and other components of the proposed re-ranking system are given in Section 4. Section 5 explains the experimental configuration and the comparison with state-of-the-art methods, while Section 6 presents the paper's conclusions.
2. Related Work
Most existing approaches that focus on improving the matching percentage for person re-identification are categorized as either feature representative methods [9] or metric learning methods [20]. Feature representative methods are based on local and global features including color and texture. Color features are invariant to viewpoint and pose variations and are broadly applied in vision systems [21-23]. These use histogram-oriented color descriptors [23], scale-invariant feature transform (SIFT) based color descriptors [24], and Schmid filters [25]. Haar-like [26], Gabor wavelet [27] and local binary patterns (LBP) [28] are texture feature descriptors. Additionally, the gradient-based descriptors useful in a re-identification framework are SIFT [29], speeded up robust features (SURF) [30], the histogram of oriented gradients (HOG) [31] and the pyramid of the histogram of oriented gradients (PHOG) [32]. Recently, local maximal occurrence (LOMO) [3] and Hexagonal-SIFT [33] descriptors have been proposed for vision systems and have achieved robust results.
Metric learning methods use extracted features and calculate the optimal distance metric between images. This approach can enhance performance in classification, clustering and retrieval tasks [10] and is also applied in person re-identification by minimizing the distance between matched image pairs and maximizing it between mismatched image pairs [34]. Weinberger and Saul [20] present a large margin nearest neighbor (LMNN) method to find a reliable distance computation. Davis et al. [4] introduce an information theoretic metric learning (ITML) method for computing a Mahalanobis distance. Zheng et al. [35] provide a probabilistic relative distance comparison (PRDC) learning model, which maximizes the likelihood that a true match pair has a smaller distance than a false match pair. Kostinger et al. [5] propose a large-scale metric learning from equivalence constraints (KISSME) strategy to learn a Mahalanobis metric from similar and dissimilar pairs. Later, a regularized smooth KISSME [36] and an error-based KISSME [37] were put forward to boost the performance of the KISS metric learning method. Recently, a metric learning method named cross view quadratic discriminant analysis (XQDA) [3] was proposed and employed in combination with the local maximal occurrence descriptor. Although these methods attain better results, rank-1 accuracy needs further improvement. Recently, various methods [38-41] have been proposed that utilize deep learning and produce robust results. Re-ranking for person re-identification is an emerging field that has been addressed by few researchers [2].
Fig. 2. Block diagram of the proposed system
One approach is related to the selection and usage of different attributes [11] that the authors found useful in images. These attributes include gender, backpack, short hair, jeans and “carrying” (having any object in their hands, e.g., jackets/coats/uppers, pages or handbags) and are used to re-rank the results. Moreover, the authors exploit the idea of sliding windows, in which the initial results are split into non-overlapping sections called windows. Specifically, three images from the initial ranking results are placed in every window. Re-ranking is then performed using the aforementioned attributes. Finally, these windows are merged together to form a final window, also referred to as the final re-rank list. In this process, the windows move or slide together to produce the final re-rank results. However, the time required for the creation and merging of these windows makes it a slow process.
In another work, the rank list is optimized [12] and the final results are refined with user intervention. This intervention makes it a difficult and laborious process. To obtain better ranking results, a bi-directional re-ranking method is proposed where the gallery subjects are used as a probe [13]. In [14], a supplementary re-ranking step that takes advantage of salience patch matching is added to increase accuracy. The features selected in this method yielded only a slight improvement after re-ranking. Some researchers [15, 42] used content and context information for rank optimization, which assumes that the correct match is likely to be found in the first ranks. Consequently, these methods overlook cases in which the true match lies outside the first ranks.
In contrast to the abovementioned methods, our approach does not require a human in the loop and considers all those positions where the true match is present, even at the upper positions in the list. Moreover, to increase the ranking accuracy, we utilize hypergraph-based learning, which has been successfully applied in various computer vision applications. The proposed method not only enhances the accuracy but also retrieves relevant gallery images that are similar to the probe, whereas the images retrieved by another method [20] are quite dissimilar from the probe images.
In a recent work, Gala and Shah [2] present a survey on recent approaches associated with person re-identification, in which they note the requirement of re-ranking for person re-identification. In light of the literature review and to the best of our knowledge, no re-ranking technique for person re-identification exists that utilizes hypergraphs to address the post-rank accuracy of re-identification methods. The proposed technique not only handles this issue but also incorporates the discriminative information from different visual features into a hypergraph.
3. Motivations of Using Hypergraph
Existing methods construct a graph on the given data by representing sample images as vertices and similarities as edges [43, 44]. These methods, used in image search, were demonstrated to be effective in image re-ranking [45]. However, such graphs only consider the pairwise relationship between samples and ignore higher-order and complex relationships. This issue can be addressed efficiently by a generalization of graphs known as hypergraphs [16, 17]. In addition, hypergraphs can easily represent complex relational objects in many real-world problems. In these methods, a hyperedge connects multiple vertices and therefore explores the higher-order relationships between data samples. Hypergraph learning was initially applied to tasks such as clustering, classification, and embedding [17]. Subsequently, several computer vision applications, such as image retrieval [16], object recognition [19], and person re-identification [18], have also made use of hypergraph-based learning. For 3-D object retrieval [46], multiple hypergraphs are learned and fused together for superior performance. For hyperspectral image classification, a bi-layer hypergraph-based learning approach was presented [47] in which the first layer creates a simple graph using pairwise relationships, while the second layer constructs the hypergraph. Fig. 3 illustrates the difference between a simple graph and a hypergraph.
Fig. 3. Differentiating between a simple graph and a hypergraph: (a) A simple graph considering just pairwise relations among various vertices. (b) A hypergraph exploring higher-order relationships between multiple vertices by a single hyperedge drawn by connecting the nearest vertices
3.1 Basic notations used in hypergraph
In a traditional graph, the vertex set $V$ represents the data samples, while the edge set $E$ is used for pairwise dependencies between the vertices. Such a graph is represented as $G = (V, E)$ and it may be directed or undirected. As each edge connects only two nodes, these graphs cannot represent relations beyond pairwise ones. In contrast, in a hypergraph one hyperedge can connect more than two vertices simultaneously. It allows the discovery of pairwise as well as higher-order and multi-way relations. A hypergraph $G = (V, E, w)$ is formed using a vertex set $V = \{v_1, v_2, \dots, v_n\}$, a set of hyperedges $E = \{e_1, e_2, \dots, e_m\}$ and $w$ as the weights of the hyperedges. A hyperedge $e$ is incident with a vertex $v$ provided that $v \in e$, and the degree $d(v)$, defined as the sum of the weights of all hyperedges incident with the vertex $v$, is

$$
d(v) = \sum_{e \in E} w(e)\, h(v, e) \tag{1}
$$

where $w(e)$ is a positive number associated with each hyperedge. Later, these degree values are used to formulate the diagonal matrix $D_v$. The degree of a hyperedge is calculated as $\delta(e) = \sum_{v \in V} h(v, e)$. In our case, we assume that there are $k + 1$ vertices for each hyperedge, namely a center vertex and its $k$ nearest neighbors (Section 4.2). Therefore, each column of the incidence matrix has $k + 1$ nonzero elements, so that $\delta(e) = k + 1$ for the binary incidence matrix of Eq. (2). Let $|E|$ and $|V|$ express the cardinalities of the edge and vertex sets in the hypergraph, respectively. A hypergraph may then be described by an incidence matrix $H$ of order $|V| \times |E|$, such that every entry of $H$ is

$$
h(v, e) =
\begin{cases}
1, & v \in e \\
0, & \text{otherwise}
\end{cases}
\tag{2}
$$

In a hypergraph, the adjacency matrix $A$ of $G$ is denoted as $A = H W H^{T} - D_v$. Here, $W$ stands for a diagonal matrix having the hyperedge weights, $D_v$ is computed using Eq. (1), and $H^{T}$ is the transpose of $H$, whereas the vertices $v \in V$ are the sample images. The weight learning for an individual hyperedge is explained in Section 4.3. Generally, a hypergraph-based regularized framework [17] for optimization can be represented as

$$
\arg\min_{f} \left\{ \Omega(f) + \lambda R_{emp}(f) \right\} \tag{3}
$$

where $f$ is the similarity score vector of the gallery images, later used for the re-ranking function, $\Omega(f)$ denotes the regularization term on the hypergraph, $\lambda$ represents the regularization parameter (balance factor), and $R_{emp}(f)$ is the empirical loss term, which measures the difference between the initial label vector and the resulting scores. The regularization term can be defined as a generalization of the natural random walk on the hypergraph,

$$
\Omega(f) = \frac{1}{2} \sum_{e \in E} \sum_{u, v \in V} \frac{w(e)\, h(u, e)\, h(v, e)}{\delta(e)} \left( \frac{f(u)}{\sqrt{d(u)}} - \frac{f(v)}{\sqrt{d(v)}} \right)^{2} \tag{4}
$$

where $f(u)$ and $f(v)$ are the similarity scores, whereas $d(u)$ and $d(v)$ are the degrees of the vertices $u$ and $v$, respectively, in the hypergraph $G$.
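The notation of this subsection can be illustrated with a minimal Python sketch. The toy hypergraph (four vertices, two hyperedges), the weight values and the function names are assumptions made purely for illustration; they are not taken from the experiments. The sketch builds the binary incidence matrix of Eq. (2) and evaluates the vertex degree of Eq. (1) and the hyperedge degree $\delta(e)$ using plain lists.

```python
# Toy hypergraph (illustrative values only): 4 vertices, 2 hyperedges.
hyperedges = [{0, 1, 2}, {1, 2, 3}]   # vertex indices contained in each hyperedge
weights = [1.0, 0.5]                  # w(e): one positive weight per hyperedge
num_vertices = 4


def incidence_matrix(num_vertices, hyperedges):
    """Binary |V| x |E| incidence matrix: h(v, e) = 1 if v is in e, else 0."""
    return [[1.0 if v in e else 0.0 for e in hyperedges]
            for v in range(num_vertices)]


def vertex_degrees(H, weights):
    """d(v) = sum over hyperedges of w(e) * h(v, e)."""
    return [sum(w * h for w, h in zip(weights, row)) for row in H]


def hyperedge_degrees(H):
    """delta(e) = sum over vertices of h(v, e), i.e. the column sums of H."""
    return [sum(col) for col in zip(*H)]


H = incidence_matrix(num_vertices, hyperedges)
d_v = vertex_degrees(H, weights)
delta_e = hyperedge_degrees(H)
```

In this toy case every hyperedge holds three vertices, which corresponds to a center vertex with $k = 2$ neighbors.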
4. System Overview
Fig. 2 shows the block diagram of the proposed system. It has three major parts. In the first part, the initial rank list is produced by extending the method proposed in [3]. In the second part, a newly introduced refinement technique is applied to the initial lists to generate result lists that are more refined than the initial ones. In the third part, hypergraph-based learning is utilized for re-ranking and the final results are obtained. Details of the proposed system are as follows:
4.1 Rank list refinement
In person re-identification systems, the different matching algorithms [4, 6, 14, 20] return to the user the gallery images that match a query image. These images form a rank list, in which the images appear at different positions. The results produced by these strategies show that the true match of the query does not necessarily lie at the first position. Consequently, it is difficult for the user to manually scan the list for the true match, especially when there is more than one query image. Thus, a post-rank refinement procedure is required. An existing method [12] involves the user in the refinement process, which is tedious and time-consuming. A few other methods use content and context information for refinement [15, 42].
Alternatively, we introduce a two-fold refinement process in which the actual ranking positions are determined by using the calculations of a baseline method. Further classification is performed for various positions. We call this classification a post-rank refinement for categorization. This process helps us to improve the initial re-identification results and the overall accuracy of the system.
In the proposed refinement process, for a given probe image $p$, an initial rank list $R = \{r_1, r_2, \dots, r_n\}$ is obtained by exploiting the method proposed in [3], which is based on an initial score vector computation for ranking the gallery images $\{g_1, g_2, \dots, g_n\}$. Let $L = \{R_1, R_2, \dots, R_N\}$ contain the sub-lists for $N$ test images against the gallery, where each $R_i$ shows the positions of all the relevant retrieved images related to the individual probe. The basic assumption in our case is that, against multiple probes, if $R_i$ does not have a true match at the top of the returned candidates, then the refinement algorithm is employed for re-ordering. The purpose is to detect and filter those rank lists in which the true match already lies at the first position. This approach facilitates focusing on the remaining retrieved ranks in the lists and on re-ordering the positions in the initial retrievals.
In the refinement phase, the output of the baseline method is used. In a baseline model, if $P$ represents the probe images from Camera A to be matched with the gallery images $G$ from Camera B, we have to utilize the same probe and gallery sets for training our model to make it consistent with the baseline output. Otherwise, the results may differ from the initial rank lists and consequently make our model difficult to comprehend and evaluate. Therefore, we denote by $\{P\}$ and $\{G\}$ the sets that constitute the training data for our algorithm. For example, for the VIPeR dataset, the sizes of both $\{P\}$ and $\{G\}$ are set to 316. In this instance, it is assumed that the particular dataset is split according to the pre-defined protocols [3, 14, 33]. Moreover, the values for the splits can be altered according to the experimental requirements.
TABLE 1 The rank list refinement algorithm
Algorithm 1
Input: Initial rank lists $L$; probe list $p$ taken from $\{P\}$; gallery images $\{g_1, \dots, g_n\}$ taken from $\{G\}$
Output: Ranking database and all remaining rank lists
Step 1: Take $L$, $p$ and $\{g_1, \dots, g_n\}$
Step 2: Analyze the initial rank list $R_i$ of each probe
Step 3: Get the matching rank of each probe in $R_i$
Step 4: Apply refinement as
  For i = 1 to N
    check the relative position of the probe in each $R_i$
    if the true match lies at the first position, do
      store the list in the ranking database
    otherwise
      store the list with all other rank lists for re-ordering
  End
The proposed algorithm works in two steps. In the first step, the actual position of the probe images against the retrieved gallery images is computed based on the calculation of a relevance score. This step assigns each probe image a score related to its position in the initial ranking list. For this purpose, the probe image $p$ and the gallery images $\{g_1, \dots, g_n\}$ are exploited. In particular, this approach involves labeling the respective retrieved images with a ranking score according to their position in the retrieved list. Additionally, the true location of the probe is also determined in its corresponding retrieved rank list. After calculating this score, the particular labeled probe and its relative initial rank list from $L$ are further utilized in the second step. This estimation and assignment is helpful for processing and reducing the list size for re-rank learning, which ultimately minimizes the computation time of re-ordering. In the second step, all lists in which the true match lies at the first position are excluded from $L$ and stored in the ranking database. All other remaining lists with their respective probe images are considered for re-ordering. Consequently, a reduced set of rank lists is produced. The proposed algorithm is summarized in Table 1.
The refinement process is carried out before hypergraph learning; hence, it works in offline mode. As a result, this procedure has little effect on the overall computational complexity of the system. Another advantage of this strategy is that it operates on the baseline technique's pairwise similarity score matrix to determine the actual location of the probe.
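The two-step refinement can be sketched in a few lines of Python. This is only a simplified illustration under stated assumptions: the rank lists are given as ordered lists of gallery identifiers, the identity of the true match is known for each probe (as in the offline setting described above), and the function name `refine_rank_lists` and the toy identifiers are hypothetical.

```python
def refine_rank_lists(rank_lists, true_match):
    """Split initial rank lists into rank-1 lists and lists kept for re-ranking.

    rank_lists: dict mapping probe id -> gallery ids ordered best match first
    true_match: dict mapping probe id -> gallery id of the correct match
    """
    ranking_database = {}   # lists whose true match is already at position 1
    to_rerank = {}          # remaining lists, passed on to re-rank learning
    positions = {}          # 1-based position of the true match for each probe
    for probe_id, ranked in rank_lists.items():
        position = ranked.index(true_match[probe_id]) + 1
        positions[probe_id] = position
        if position == 1:
            ranking_database[probe_id] = ranked
        else:
            to_rerank[probe_id] = ranked
    return ranking_database, to_rerank, positions


# Toy example with three probes and three gallery images.
rank_lists = {
    "p1": ["g1", "g3", "g2"],
    "p2": ["g3", "g2", "g1"],
    "p3": ["g2", "g3", "g1"],
}
true_match = {"p1": "g1", "p2": "g2", "p3": "g3"}
database, remaining, positions = refine_rank_lists(rank_lists, true_match)
```

Only the lists in `remaining` would be handed to the re-rank learning stage, which is what reduces the amount of data to be re-ordered.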
4.2 Hypergraph learning for re-ranking
The objective of re-ranking is to produce a score list for a new ranking by using learning on hypergraphs. In the testing phase, the probe $p$ from $\{P\}$ is to be matched with the gallery $\{g_1, \dots, g_n\}$ to produce results. In a hypergraph construction, a hyperedge is formulated using the center vertex and the associated $k$ nearest vertices, e.g., as in Fig. 3. The transductive learning framework is exploited for hypergraph learning. The aim is to assign a similarity score to those images that are associated with highly weighted hyperedges. In addition, similar gallery images might rank top in the resultant list. Therefore, Eq. (3) is presented by the optimization function $\Phi(f)$ such that

$$
\arg\min_{f} \Phi(f) = \arg\min_{f} \left\{ \Omega(f) + \lambda R_{emp}(f) \right\} \tag{5}
$$

More precisely, the term $\Omega(f)$ is calculated by using Eq. (4), where vertices sharing the same hyperedges will get more similar similarity scores, while $R_{emp}(f)$ is defined as

$$
R_{emp}(f) = \| f - y \|^{2} = \sum_{u \in V} \left( f(u) - y(u) \right)^{2} \tag{6}
$$

where $y$ is a vector having binary labels, and $y(v) = 1$ if the vertex $v$ is set as a probe and $y(v) = 0$ otherwise. Hence, the value of the first element is set to 1 in the label vector and the remaining values are set to 0. In such cases, the term $\Omega(f)$ in Eq. (5) is calculated by expanding Eq. (4) as

$$
\begin{aligned}
\Omega(f) &= \frac{1}{2} \sum_{e \in E} \sum_{u, v \in V} \frac{w(e)\, h(u, e)\, h(v, e)}{\delta(e)} \left( \frac{f(u)}{\sqrt{d(u)}} - \frac{f(v)}{\sqrt{d(v)}} \right)^{2} \\
&= \sum_{e \in E} \sum_{u, v \in V} \frac{w(e)\, h(u, e)\, h(v, e)}{\delta(e)} \left( \frac{f^{2}(u)}{d(u)} - \frac{f(u)\, f(v)}{\sqrt{d(u)\, d(v)}} \right) \\
&= \sum_{u \in V} f^{2}(u) \sum_{e \in E} \frac{w(e)\, h(u, e)}{d(u)} \sum_{v \in V} \frac{h(v, e)}{\delta(e)} - \sum_{e \in E} \sum_{u, v \in V} \frac{f(u)\, h(u, e)\, w(e)\, h(v, e)\, f(v)}{\delta(e) \sqrt{d(u)\, d(v)}}
\end{aligned}
\tag{7}
$$

In Eq. (7), $\sum_{e \in E} \frac{w(e)\, h(u, e)}{d(u)} = 1$ and $\sum_{v \in V} \frac{h(v, e)}{\delta(e)} = 1$. Therefore, the solution can be obtained as

$$
\Omega(f) = \sum_{u \in V} f^{2}(u) - \sum_{e \in E} \sum_{u, v \in V} \frac{f(u)\, h(u, e)\, w(e)\, h(v, e)\, f(v)}{\delta(e) \sqrt{d(u)\, d(v)}} = f^{T} (I - \Theta) f = f^{T} \Delta f \tag{8}
$$

where $\Delta$ is a positive semi-definite matrix, the hypergraph Laplacian, $I$ represents the identity matrix and $\Theta$ can be calculated as

$$
\Theta = D_v^{-1/2}\, H\, W\, D_e^{-1}\, H^{T}\, D_v^{-1/2} \tag{9}
$$

where $D_v$, $W$ and $D_e$ are the diagonal matrices of vertex degrees, hyperedge weights and edge degrees, correspondingly. For computing $\Delta = I - \Theta$, $I$ is used as an identity matrix. By substituting the values of Eq. (6) and Eq. (8) in Eq. (5) we get

$$
\arg\min_{f} \Phi(f) = \arg\min_{f} \left\{ f^{T} \Delta f + \lambda \| f - y \|^{2} \right\} \tag{10}
$$

By differentiating Eq. (10) w.r.t. $f$, we have

$$
f = \frac{\lambda}{1 + \lambda} \left( I - \frac{1}{1 + \lambda} \Theta \right)^{-1} y \tag{11}
$$

For large values of $|V|$, the computation of the inverse is not feasible. Instead, the score vector $f$ can be calculated efficiently through iterated computation as $f^{(t+1)} = \frac{1}{1 + \lambda} \Theta f^{(t)} + \frac{\lambda}{1 + \lambda} y$. In this equation, the iteration number is denoted by $t$, and this iterated procedure is guaranteed to converge [48]. The hypergraph-based learning algorithm is presented in Table 2.

TABLE 2 Hypergraph learning for post-rank optimization
Algorithm 2
Input: Probe image $p$, gallery images $\{g_1, \dots, g_n\}$
Output: Final rank list
Hypergraph learning
1. Compute the similarity matrix $S$ (Section 4.3)
2. Construct a hypergraph for each vertex. Calculate the vertex degree and hyperedge degree matrices $D_v$ and $D_e$
3. Calculate the weight of the individual vertex for a hyperedge by searching the k-nearest neighbors
4. Using soft assignment, compute the incidence matrix $H$ using Eq. (12)
5. Compute the hypergraph Laplacian using $\Delta = I - \Theta$
6. Calculate the relevance score vector $f$, rank all the vertices and produce the results
7. Compare and merge with the results already stored in the ranking database
8. Produce the final results

After getting the relevance score vector $f$ from learning against the probe and gallery, a re-rank list is generated by sorting the scores. The next task is to compare it with the previously saved results in the ranking database, with the purpose being to incorporate the pre-stored rank-1 results with the newly generated re-rank lists and to avoid repeated results in the cumulative final lists.
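The score computation of this subsection can be sketched as follows. This is a minimal illustration, not the implementation used in the experiments: it uses plain Python lists on a toy hypergraph, assumes every vertex belongs to at least one hyperedge (so all vertex degrees are positive), and uses the iteration $f \leftarrow \frac{1}{1+\lambda}\Theta f + \frac{\lambda}{1+\lambda} y$, whose fixed point satisfies $(1+\lambda) f - \Theta f = \lambda y$. The function names and toy values are assumptions.

```python
import math


def theta_matrix(H, weights):
    """Theta = Dv^(-1/2) H W De^(-1) H^T Dv^(-1/2) for a |V| x |E| matrix H."""
    n, m = len(H), len(weights)
    d_v = [sum(weights[e] * H[v][e] for e in range(m)) for v in range(n)]
    delta = [sum(H[v][e] for v in range(n)) for e in range(m)]
    return [[sum(H[u][e] * weights[e] * H[v][e] / delta[e] for e in range(m))
             / math.sqrt(d_v[u] * d_v[v])
             for v in range(n)] for u in range(n)]


def rerank_scores(theta, y, lam=0.1, iterations=200):
    """Iterate f <- (Theta f + lam * y) / (1 + lam), starting from f = y."""
    f = list(y)
    for _ in range(iterations):
        f = [(sum(t * fv for t, fv in zip(row, f)) + lam * yu) / (1.0 + lam)
             for row, yu in zip(theta, y)]
    return f


# Toy example: vertex 0 is the probe, vertices 1-3 are gallery images.
H = [[1.0, 0.0], [1.0, 1.0], [1.0, 1.0], [0.0, 1.0]]
weights = [1.0, 0.5]
y = [1.0, 0.0, 0.0, 0.0]   # label vector: 1 for the probe, 0 elsewhere

theta = theta_matrix(H, weights)
f = rerank_scores(theta, y, lam=0.1)
# Gallery vertices sorted by decreasing relevance score.
ranking = sorted(range(1, len(f)), key=lambda v: f[v], reverse=True)
```

In the toy example, the gallery vertices that share a hyperedge with the probe receive higher scores than the vertex that is only indirectly connected to it, which is the behavior the re-ranking relies on.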
4.3 Weight learning of hyperedges
13
ACCEPTED MANUSCRIPT
Hypergraph Laplacian and transductive learning framework [49] can be leveraged to gain robust matching results. Moreover, we can gain the advantage of different discriminative features to form a hypergraph. In this study, we assigned different weights to different hyperedges such that an edge gains a higher weight if it is more discriminative and vice versa. The incidence matrix may be calculated using Eq. (2). However, this traditional
AN US
CR IP T
binary structure treats every edge equally. The relative distance between the edges is ignored, which degrades the
Fig. 4. Shows comparison between the incidence matrices. In (a) based on the binary assignment using the equation
(
)
{
is the incidence matrix calculated
. This matrix contains only two values and
ignores the other values between 0 and 1. Whereas in (b) represents the incidence matrix computed by ( ) using soft assignment method ( ) { . In this technique, the degree of membership of a vertex to is shown by a real value between 0 and 1. This soft assignment technique represents more discriminative relationships between hyperedges
M
a hyperedge
ED
performance of the hypergraph. Therefore, instead of using binary values, in our case, the incidence matrix is constructed using softly assigned values as used in [16]. Therefore,
where (
{
(
)
(12)
) denotes the similarity between two vertices, which can be calculated as
CE
where (
)
PT
(
) is the distance calculated between hyperedge center
AC
which is more effective than using the Euclidean distance.
and
(
)
(
(
)
),
using the Mahalanobis [10] distance,
is the average distance of all the images in a
hyperedge. Therefore, the weights are assigned as
∑
(
)
(13)
Assuming a similarity function exists between the images, hyperedges are constructed by taking each image as a centroid vertex together with its corresponding k-nearest neighbor images. In our case, the value of k is set to 2. In a particular hypergraph, an edge can connect multiple vertices. The ultimate goal is to find the best match, which is only possible by selecting the closest vertices in that hyperedge. Moreover, the weight of an edge is calculated between the two nearest vertices based on the similarity score. In addition, the calculation of this score requires only two images. Therefore, the motive for selecting the two nearest neighbors, i.e., k=2, lies in the calculation of this pairwise distance. Furthermore, this mechanism also reduces the complexity of the hypergraph and makes it more uniform. This is also illustrated in Fig. 3, where an edge contains exactly three vertices. Moreover, the formation of a hyperedge and the selection of the nearest neighbors are problem-specific. In Fig. 4, a comparison is given between the incidence matrix of a hypergraph using binary values and the incidence matrix of a hypergraph using the soft assignment technique. The soft assignment technique represents more discriminative relationships between hyperedges, whereas in the binary hypergraph structure the strength between the hyperedges is ignored by using only two values, i.e., 0 and 1.
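The soft assignment described in this subsection can be sketched as below. Several points are assumptions made only for illustration: the pairwise distances are given as a precomputed symmetric matrix (standing in for the Mahalanobis distances), the similarity is taken to be $\exp(-d / \bar{d})$ with $\bar{d}$ the mean distance from the center to its neighbors, and distances between distinct images are assumed to be positive. The function name `soft_incidence` and the toy distances are hypothetical.

```python
import math


def soft_incidence(dist, k=2):
    """Soft |V| x |E| incidence matrix with one hyperedge per center vertex.

    dist: symmetric matrix of pairwise distances between images.
    Hyperedge j contains the center vertex j and its k nearest neighbours.
    """
    n = len(dist)
    H = [[0.0] * n for _ in range(n)]
    weights = []
    for j in range(n):
        neighbours = sorted((v for v in range(n) if v != j),
                            key=lambda v: dist[j][v])[:k]
        members = [j] + neighbours
        mean_dist = sum(dist[j][v] for v in neighbours) / len(neighbours)
        for v in members:
            # Assumed similarity: exp(-d / mean distance), a value in (0, 1].
            H[v][j] = math.exp(-dist[j][v] / mean_dist)
        # Hyperedge weight: sum of the similarities of its member vertices.
        weights.append(sum(H[v][j] for v in members))
    return H, weights


# Toy distance matrix for four images (illustrative values only).
dist = [
    [0.0, 1.0, 2.0, 4.0],
    [1.0, 0.0, 1.0, 3.0],
    [2.0, 1.0, 0.0, 2.0],
    [4.0, 3.0, 2.0, 0.0],
]
H_soft, w_soft = soft_incidence(dist, k=2)
```

With k=2, every column of `H_soft` has exactly three nonzero entries (the center and its two nearest neighbors), and the entries take real values between 0 and 1 instead of the two values of the binary structure.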
5. Experiments and Results
This section presents the experimental results and a detailed analysis of the proposed method. First, datasets, feature extraction and evaluation protocols are given. Second, comparisons with contemporary approaches including direct matching and post-rank optimization are provided. The conducted experiments address the following questions:
1. How does the proposed work perform on the publicly available datasets?
2. Are the joint effects of the baseline and proposed technique valuable?
3. Does the proposed re-ranking technique perform better than other recent strategies, including both direct and re-ranking methods?
5.1 Datasets
VIPeR [50] is a publically available dataset having 632 pairs of person images. It is captured from two non-
overlapping cameras and each subject appears in each camera view. It is widely used and contains features such as illumination, occlusion, and pose variation. Therefore, it is ideal for assessing the performance of person reidentification algorithms. Different samples from both datasets are shown in Fig. 5(a) and Fig. 5(b).
The CUHK02 [51] dataset is also captured using two non-overlapping cameras, which record frontal and back views of the subjects. It contains 1871 challenging images with viewpoint, illumination, and occlusion variations. The motives for selecting these datasets are that (a) the pairwise samples are suitable for surveillance and exhibit various re-identification and real-world challenges and (b) the captured images are from non-overlapping cameras. Moreover, the datasets are widely used and publicly accessible for assessing re-identification approaches.
Fig. 5. Sample images from (a) the VIPeR and (b) the CUHK02 datasets. In each column, the images of the same person are taken from the two non-overlapping cameras of the respective dataset
5.2 Feature extraction and evaluation protocol
The feature extraction process comprises the following steps. First, all P person images are rescaled to 128×48. The block size is set to 8×16 for each image division, and each block overlaps its neighboring blocks by half of its size, i.e., 4×16. Second, HSV and Lab color values are utilized as the quantized mean, whereas for texture features 8-bit LBP values are taken from each block. Ultimately, the resultant feature vector is the concatenation of all the features extracted from each block. Moreover, the block size selected in our experiment is a common practice adopted by many methods. Fig. 6 demonstrates the image scale and its respective division into blocks.
In our trials, we follow a protocol similar to that given in [3, 14, 33]. In particular, both datasets are randomly divided into two equal groups. One group is used for training while the second is used for testing. To get a fair comparison of results, we run the tests ten times and present the comparison in the form of CMC curves of the
average matching rate at various ranks. For the VIPeR dataset, the value of p is selected as 316, while for the CUHK02 dataset p is chosen as 485. The parameters used in the experiment, such as λ in Eq. (10), are set to 0.1 and verified using cross-validation. Furthermore, the two matrices involved in this calculation can be stored and used efficiently in a compressed sparse matrix representation. Moreover, the size of the gallery is not too large. Therefore, the required matrix inverse is also feasible to compute.
Fig. 6. Feature extraction mechanism for an image taken from the VIPeR dataset, shown in Fig. 6(a). The sample image of size 128×48 is divided into blocks of size 8×16 as shown in Fig. 6(b), and an individual block of size 8×16 is cropped and zoomed as shown in Fig. 6(c)
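The block division described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the image is taken as 128 rows by 48 columns, a block as 8 rows by 16 columns, the half-size overlap is applied vertically (stride 4) with no horizontal overlap (stride 16), and a plain per-block mean stands in for the quantized color and LBP features. The function names `block_origins` and `block_means` are hypothetical.

```python
def block_origins(height=128, width=48, bh=8, bw=16, stride_y=4, stride_x=16):
    """Top-left (row, col) of every block that fits fully inside the image.

    ASSUMPTION: vertical stride is half the block height (4 rows), and
    blocks do not overlap horizontally (stride equals the block width).
    """
    return [(r, c)
            for r in range(0, height - bh + 1, stride_y)
            for c in range(0, width - bw + 1, stride_x)]


def block_means(image, bh=8, bw=16, stride_y=4, stride_x=16):
    """Concatenate the mean value of each block into one feature vector.

    `image` is a list of rows (one channel). The mean is a stand-in for the
    per-block color/texture descriptors described in the text.
    """
    height, width = len(image), len(image[0])
    features = []
    for r, c in block_origins(height, width, bh, bw, stride_y, stride_x):
        total = sum(image[y][x]
                    for y in range(r, r + bh)
                    for x in range(c, c + bw))
        features.append(total / (bh * bw))
    return features


if __name__ == "__main__":
    flat = [[7] * 48 for _ in range(128)]  # synthetic single-channel image
    print(len(block_origins()), block_means(flat)[:3])
```

Under these assumptions a 128×48 image yields 31 vertical positions times 3 horizontal positions; a different overlap convention would change the block count but not the overall procedure.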
The experiments are conducted on a PC with 6 GB of RAM and an Intel Core i7 processor, while the implementation is done on the MATLAB 2014b platform.
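The evaluation protocol of Section 5.2 (CMC curves averaged over repeated random splits) can be illustrated with the following Python sketch. It assumes a square probe-by-gallery distance matrix in which gallery item i is the true match of probe i, and it resolves ties optimistically by counting only strictly closer gallery items. The function names `cmc_curve` and `average_cmc` are hypothetical and are not taken from the original MATLAB implementation.

```python
def cmc_curve(dist):
    """Cumulative matching characteristic (in %) for one trial.

    dist[i][j] is the distance from probe i to gallery j, and gallery i is
    ASSUMED to be the true match of probe i. Entry r of the result is the
    percentage of probes whose true match appears within the top r+1 ranks.
    """
    p = len(dist)
    hits = [0] * p
    for i, row in enumerate(dist):
        # 0-based rank of the true match: gallery items strictly closer.
        rank = sum(1 for j, d in enumerate(row) if j != i and d < row[i])
        hits[rank] += 1
    curve, running = [], 0
    for h in hits:
        running += h
        curve.append(100.0 * running / p)
    return curve


def average_cmc(curves):
    """Average several CMC curves (e.g., ten random splits) rank by rank."""
    n = len(curves)
    return [sum(c[r] for c in curves) / n for r in range(len(curves[0]))]


if __name__ == "__main__":
    trial = [[0.1, 0.5, 0.9],
             [0.2, 0.3, 0.1],
             [0.4, 0.2, 0.9]]
    print(cmc_curve(trial))
```

In the setting of the paper, `dist` would be a p×p matrix with p=316 (VIPeR) or p=485 (CUHK02), and `average_cmc` would be applied to the ten curves obtained from the ten random splits.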
5.3 Evaluation with state-of-the-art post-ranking approaches
This section provides comparisons with the modern techniques that have used re-ranking and shows the outcomes of the presented technique along with four different baseline methods.
In Table 3, the computed results on the VIPeR dataset are given as recognition rate percentages at different ranks for those methods that have employed re-ranking as an additional process or have utilized the post-ranking process for optimization, i.e., RDs+Saliency Re-ranking [18], KISSME+SB [11], POP [12], Rank Optimization [52], Bidirectional re-ranking [15], and KCCA+DCIA [42]. This also illustrates the advantages of the proposed method w.r.t. other state-of-the-art methods. As these methods are evaluated on the VIPeR dataset, we provide a comparison on the same dataset. It should be noted that methods such as KISSME+SB and RDs+Saliency Re-ranking include an additional re-ranking step in their main or proposed baseline method and give only a small priority to their post-ranking step; therefore, these methods have low performance at rank-1, namely 19.3% and 33.29%, respectively. Methods such as POP [12], Rank Optimization [52], Bidirectional re-ranking [15], and DCIA [42] are dedicated post-ranking methods designed exclusively for re-ranking. Therefore, these methods have better performance at rank-1: 59.09% for POP, 34.97% for Rank Optimization, and 63.92% for DCIA.
TABLE 3 Top-ranked matching rate (%) comparison of the proposed method with the state-of-the-art post-ranking methods on the VIPeR dataset @ p=316. Best results are highlighted in boldface font.

Method                      Rank 1   Rank 10   Rank 20   Rank 50
RDs+Saliency Re-ranking     33.29    78.35     88.48     97.53
KISSME+SB                   19.30    63.30     76.60     90.60
POP                         59.05    63.10     70.01     -
Rank Optimization           34.97    72.03     80.21     -
Bidirectional re-ranking    22.21    67.11     89.32     95.40
KCCA+DCIA                   63.92    78.21     87.11     99.05
XQDA+Proposed               64.75    80.12     86.51     100
Although bi-directional re-ranking is designed for post-ranking, it still has only a 22.2% recognition rate at rank-1. One reason may be the selection of an inappropriate baseline method for evaluation.
The results in boldface show that XQDA+proposed outperforms all state-of-the-art techniques. This method achieves a recognition rate of 64.5% at rank-1. One reason is the selection of an appropriate baseline method at the time of the experiments. The second and most important reason is that the proposed method employs a novel refinement algorithm and a dedicated hypergraph-based learning framework, both of which have shown excellent performance in re-ranking in our experiments. More interestingly, DCIA, SB and bi-directional re-ranking used KISSME [5] as the baseline model to produce their results. Using KISSME in our proposed method, we achieve an improvement of almost 4% over DCIA, 17% over SB and 15% over bi-directional re-ranking. Hence, an average improvement of 12% over these three methods can be seen. Thus, the proposed post-ranking optimization method is more effective than existing strategies when the same baseline method is chosen for re-ranking.
To analyze the proposed method's performance with different metric-learning methods, the results regarding recognition rate at rank-1 are presented in Fig. 7 and Fig. 8. The figures show that various metric learning algorithms, when used without the proposed re-ranking method, have reduced recognition performance. However, by using the proposed hypergraph-based re-ranking method, better results are achieved.
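As a quick check, the 12% average quoted above for the KISSME-based comparison is simply the arithmetic mean of the three stated gains over DCIA, SB, and bi-directional re-ranking, taking the rounded values as given in the text:

```latex
\[
\frac{4\% + 17\% + 15\%}{3} = \frac{36\%}{3} = 12\%
\]
```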
In Fig. 7, we illustrate the performance of the proposed method with different baseline methods. Specifically, for the VIPeR dataset, the Euclidean and Mahalanobis distance methods have rank-1 recognition rates of 8% and 18%, respectively, before applying the proposed model. After its application, they achieve rank-1 recognition rates of 19% and 30%, showing improvements of 11% and 12%, respectively. Similarly, KISSME improves from 20% to 36%, gaining an increase of almost 16% at rank-1. The best results are shown by the XQDA+proposed method. Alone, LOMO+XQDA provides an average recognition rate of almost 42%, whereas LOMO+XQDA with the proposed method obtains a recognition rate of 64.54%, improving the results by approximately 22.57%. Although the results of all base models are enhanced, XQDA as a baseline model achieves a more significant improvement than the other methods.
Similarly, in Fig. 8, the proposed technique demonstrates a remarkable performance improvement over various baseline methods on the CUHK02 dataset. More precisely, it improves the baseline results from 24% to 40% for KISSME, and from 14% to 29% and 19% to 34% for the Euclidean and Mahalanobis distance models, respectively. The best results are shown by improving XQDA's performance from 52% to 66%. All of these results are at rank-1. For CUHK02, more performance improvement can be noticed. One reason for this improvement is that the baseline methods have shown good results. Moreover, this dataset is less challenging than the VIPeR dataset.
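The rank-1 gains quoted for Fig. 7 and Fig. 8 are absolute differences in recognition rate (percentage points). The short Python sketch below recomputes them from the before/after values stated in the text. The dictionary layout and the name `gains` are illustrative assumptions, and the XQDA baseline on VIPeR is entered as the rounded 42% quoted above, so the computed gain differs slightly from the 22.57% reported.

```python
# Rank-1 recognition rates (%) as quoted in the text: (before, after).
RANK1 = {
    "VIPeR": {
        "Euclidean": (8.0, 19.0),
        "Mahalanobis": (18.0, 30.0),
        "KISSME": (20.0, 36.0),
        "XQDA": (42.0, 64.54),  # baseline given only as "almost 42%"
    },
    "CUHK02": {
        "Euclidean": (14.0, 29.0),
        "Mahalanobis": (19.0, 34.0),
        "KISSME": (24.0, 40.0),
        "XQDA": (52.0, 66.0),
    },
}


def gains(table):
    """Absolute rank-1 gain (percentage points) per dataset and baseline."""
    return {
        dataset: {
            method: round(after - before, 2)
            for method, (before, after) in rows.items()
        }
        for dataset, rows in table.items()
    }


if __name__ == "__main__":
    for dataset, rows in gains(RANK1).items():
        print(dataset, rows)
```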
Concerning the evaluation of post-ranking methods on the CUHK02 dataset, the proposed method is compared against DCIA [42], RD [14], and Rank Optimization [52]. In particular, our method improves the baseline method [3] by almost 14% and the post-ranking DCIA technique by 4% on this dataset. Moreover, the proposed method improves the RD [14] results from 31.10% to 66%, gaining an increase of almost 35%, and improves the Rank Optimization [52] results from 36% to 66%, increasing the accuracy by 30%. The results are reported in Fig. 9. Therefore, the results demonstrate that the given approach performs significantly better than existing methods.
Fig. 7. Recognition rate (%) bars of baseline methods on the VIPeR dataset. Gray bars show results before and maroon bars show results after applying the proposed method to the baseline
Fig. 8. Recognition rate (%) bars of baseline methods on the CUHK02 dataset. Gray bars show results before and maroon bars show results after applying the proposed method to the baseline
Fig. 9. CMC curves comparing the performance of the proposed method with DCIA [42], RD [14], and Rank Optimization [52] on the CUHK02 dataset
5.4 Evaluation with state-of-the-art ranking approaches
To ensure a reasonable comparison, we evaluate the presented technique against the latest direct matching or ranking approaches that use the same datasets and evaluation protocols similar to those used in the original experiments. The results of these techniques are taken from the respective papers. Table 4 and Table 5 exhibit the recognition rate percentages of numerous methods on both datasets, while Fig. 10 provides CMC curves for each method on both datasets. Specifically, LOMO+XQDA [3] has shown remarkable performance on the VIPeR dataset, as in Table 4, while RD [14] has shown good results on the CUHK02 dataset, as in Table 5. Selection of the baseline method is very important, and it plays a vital role in the post-ranking framework.
TABLE 4 Recognition rate (%) of various state-of-the-art person re-identification or baseline methods on the VIPeR dataset @ p=316. LOMO-XQDA [3] shows the best performance and its results are highlighted in boldface font.

Methods      Rank 1   Rank 10   Rank 20   Rank 50
LOMO-XQDA    40.00    80.51     92.08     95.21
SalMatch     30.16    43.45     58.48     78.53
PRDC         15.66    53.66     70.09     90
KISSME       20.34    62.45     77.62     92.32
SDALF        13.01    53.22     71.05     90.41
L1-norm      34.23    57.35     73.47     88.76
TABLE 5 Recognition rate (%) of different contemporary person re-identification or baseline methods on the CUHK02 dataset @ p=485. RD [14] shows the best performance and its results are highlighted in boldface font.

Methods   Rank 1   Rank 10   Rank 20   Rank 50
RD        31.10    68.55     79.17     90.38
LMNN      13.45    42.25     54.11     73.29
ITML      15.98    45.60     59.81     76.61
SDALF     9.90     30.33     41.03     55.99
ROCCA     29.77    66.03     76.78     88.47
eSDC      20.01    40.21     50.55     70.21
Fig. 10. CMC curves comparing the performance of the proposed method with XQDA [3], RD [14], KCCA [53], LMNN [20], ITML [4], and KISSME [5] on VIPeR in Fig. 10(a), and with SalMatch [54], ITML [4], L1-norm [8], Midfilter [55], eSDC [51], CCA [56], and KCCA [53] on CUHK02 in Fig. 10(b)
Fig. 10(a) shows the performance of the proposed method on the VIPeR dataset in comparison with existing methods, namely, XQDA [3], RD [14], KCCA [53], ITML [4], KISSME [5], and LMNN [20]. The presented method outperforms the latest ranking techniques, especially from rank-1 to rank-10. In particular, we achieve 64.5% at rank-1 and 84% at rank-5, while no other method attains such a recognition rate at these ranks. Moreover, an increased performance gap can be seen at higher ranks as well. This is observed because our refinement algorithm efficiently handles the initial rank results of the baseline methods, and hypergraph-based learning further enhances the output.
In Fig. 10(b), comparisons are given with SalMatch [57], ITML [4], L1-norm [8], Midfilter [55], eSDC [51], CCA [56], and KCCA [53] on the CUHK02 dataset. KCCA has performed well on this dataset, having a recognition rate of 38% at rank-1, 65% at rank-5 and 72% at rank-10. Specifically, XQDA+proposed achieved a recognition rate of 66% at rank-1, 84% at rank-5 and 94% at rank-10, gaining an average increase of 22% in performance at these ranks and a 14% increase at rank-1. The results and comparisons show that the presented method gains an increase in correct recognition rate at all ranks relative to existing state-of-the-art methods.
From the above analysis, the importance of the proposed method is evident in two respects: (a) by using the proposed refinement technique, we can better classify the ranks by utilizing the correlation information from the initial rank list, and (b) applying hypergraph-based learning for post-rank optimization is more effective for discovering the relationships between images. Moreover, the presented framework is more robust than other post-rank optimization methods. Fig. 11 shows some retrieval examples before and after re-ranking is applied.
Fig. 11. Retrieval examples comparing results before and after re-ranking on the CUHK02 dataset. Probes are in the left column, highlighted with green boxes, and the top 10 rank results are shown on the right. The white arrows indicate results before, and the solid arrows indicate results after, the proposed re-ranking is applied. The true matches are highlighted by red boxes
6. Conclusions
A re-ranking framework has been presented for person re-identification that takes advantage of a proposed refinement algorithm and hypergraph-based learning. The refinement for classification regarding different ranks was performed using the correlated information gained by exploiting the baseline model, followed by exploring higher-order relationships among the images using hypergraphs. Extensive experiments and evaluations on public datasets revealed that the described re-ranking scheme is more robust and outperforms a wide range of other techniques by improving results at different ranks. At rank-1, we accomplished an average 22.57% improvement in accuracy on the VIPeR dataset and an average 14% improvement on the CUHK02 dataset. The empirical investigation on different baseline methods also revealed the effectiveness of the presented method. In this paper, we focused on re-ranking of individual images in pairwise cameras. Future work might comprise generalizing the proposed framework for multi-camera and real-time scenarios, for example, as in [33]. Furthermore, for large datasets such as CUHK03 [58] and Market-1501 [59], the time complexity issue can also be addressed in the future.
Acknowledgement
The authors would like to thank the National Natural Science Foundation of PR China (61375079), and The Chinese Academy of Science-The World Academy of Sciences (CAS-TWAS) President’s Fellowship.
References
[1] H. Yang, L. Shao, F. Zheng, L. Wang, Z. Song, Recent advances and trends in visual tracking: A review, Neurocomputing, 74 (2011) 3823-3831.
[2] A. Bedagkar-Gala, S.K. Shah, A survey of approaches and trends in person re-identification, Image and Vision Computing, 32 (2014) 270-286.
[3] S. Liao, Y. Hu, X. Zhu, S.Z. Li, Person re-identification by local maximal occurrence representation and metric learning, In IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015), pp. 2197-2206.
[4] J.V. Davis, B. Kulis, P. Jain, S. Sra, I.S. Dhillon, Information-theoretic metric learning, In ACM Conference on Machine Learning (2007), pp. 209-216.
[5] M. Köstinger, M. Hirzer, P. Wohlhart, P.M. Roth, H. Bischof, Large scale metric learning from equivalence constraints, In IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2012), pp. 2288-2295.
[6] B. Prosser, W.S. Zheng, S. Gong, T. Xiang, Person Re-Identification by Support Vector Ranking, In British Machine Vision Conference (BMVC 2010), pp. 21.01-21.11.
[7] X. Wu, A.G. Hauptmann, C.-W. Ngo, Measuring novelty and redundancy with multiple modalities in cross-lingual broadcast news, Computer vision and image understanding, 110 (2008) 418-431.
[8] W.-S. Zheng, S. Gong, T. Xiang, Reidentification by relative distance comparison, IEEE Trans. Pattern Anal. Mach. Intell., 35 (2013) 653-668.
[9] C. Liu, S. Gong, C.C. Loy, X. Lin, Person re-identification: What features are important?, In European Conference on Computer Vision (ECCV 2012), pp. 391-401.
[10] L. Yang, R. Jin, Distance metric learning: A comprehensive survey, Michigan State University, 2 (2006).
[11] L. An, X. Chen, M. Kafai, S. Yang, B. Bhanu, Improving person re-identification by soft biometrics based reranking, In IEEE Conference on Distributed Smart Cameras (ICDSC 2013), pp. 1-6.
[12] C. Liu, C. Loy, S. Gong, G. Wang, POP: Person re-identification post-rank optimisation, In IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2013), pp. 441-448.
[13] Q. Leng, R. Hu, C. Liang, Y. Wang, J. Chen, Bidirectional ranking for person re-identification, In IEEE Conference on Multimedia and Expo (ICME 2013), pp. 1-6.
[14] L. An, M. Kafai, S. Yang, B. Bhanu, Person Reidentification With Reference Descriptor, IEEE Trans. Circuits Syst. Video Technol., 26 (2016) 776-787.
[15] Q. Leng, R. Hu, C. Liang, Y. Wang, J. Chen, Person re-identification with content and context reranking, Multimedia Tools and Applications, 74 (2015) 6989-7014.
[16] Y. Huang, Q. Liu, S. Zhang, D.N. Metaxas, Image retrieval via probabilistic hypergraph ranking, In IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2010), pp. 3376-3383.
[17] D. Zhou, J. Huang, B. Schölkopf, Learning with hypergraphs: Clustering, classification, and embedding, Advances in Neural Information Processing Systems, (2006), pp. 1601-1608.
[18] L. An, X. Chen, S. Yang, Person re-identification via hypergraph-based matching, Neurocomputing, (2015).
[19] M. Hofmann, D. Wolf, G. Rigoll, Hypergraphs for joint multi-view reconstruction and multi-object tracking, In IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2013), pp. 3650-3657.
[20] K.Q. Weinberger, L.K. Saul, Distance metric learning for large margin nearest neighbor classification, The Journ. Mach. Learning Res., 10 (2009) 207-244.
[21] M. Guo, Y. Zhao, C. Zhang, Z. Chen, Fast object detection based on selective visual attention, Neurocomputing, 144 (2014) 184-197.
[22] H. Bao, M. Lin, Z. Chen, Robust visual tracking based on hierarchical appearance model, Neurocomputing, 221 (2017) 108-122.
[23] G. Doretto, T. Sebastian, P. Tu, J. Rittscher, Appearance-based person reidentification in camera networks: problem overview and current approaches, Journ. Ambient Intell. Humanized Comp., 2 (2011) 127-151.
[24] A.E. Abdel-Hakim, A.A. Farag, CSIFT: A SIFT descriptor with color invariant characteristics, In IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2006), pp. 1978-1983.
[25] K. Mikolajczyk, C. Schmid, An affine invariant interest point detector, In European Conference on Computer Vision (ECCV 2002), pp. 128-142.
[26] R. Lienhart, J. Maydt, An extended set of haar-like features for rapid object detection, In IEEE Conference on Image Processing (2002), pp. I-900-I-903 vol. 901.
[27] S. Arivazhagan, L. Ganesan, S.P. Priyal, Texture classification using Gabor wavelets based rotation invariant features, Pattern recognition letters, 27 (2006) 1976-1982.
[28] Y. Zhang, S. Li, Gabor-LBP based region covariance descriptor for person re-identification, In IEEE Conference on Image and Graphics (ICIG 2011), pp. 368-371.
[29] D.G. Lowe, Object recognition from local scale-invariant features, In IEEE Conference on Computer Vision (1999), pp. 1150-1157.
[30] H. Bay, A. Ess, T. Tuytelaars, L. Van Gool, Speeded-up robust features (SURF), Computer vision and image understanding, 110 (2008) 346-359.
[31] N. Dalal, B. Triggs, Histograms of oriented gradients for human detection, In IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2005), pp. 886-893.
[32] A. Dhall, A. Asthana, R. Goecke, T. Gedeon, Emotion recognition using PHOG and LPQ features, In IEEE Conference on Automatic Face & Gesture Recognition and Workshops (FG 2011), pp. 878-883.
[33] J.H. Shah, M. Lin, Z. Chen, Multi-camera handoff for person re-identification, Neurocomputing, 191 (2016) 238-248.
[34] T. Zhou, M. Qi, J. Jiang, X. Wang, S. Hao, Y. Jin, Person Re-identification based on nonlinear ranking with difference vectors, Information Sciences, 279 (2014) 604-614.
[35] W.-S. Zheng, S. Gong, T. Xiang, Person re-identification by probabilistic relative distance comparison, In IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2011), pp. 649-656.
[36] D. Tao, L. Jin, Y. Wang, Y. Yuan, X. Li, Person re-identification by regularized smoothing kiss metric learning, IEEE Trans. Circuits Syst. Video Technol., 23 (2013) 1675-1685.
[37] D. Tao, L. Jin, Y. Wang, X. Li, Person reidentification by minimum classification error-based KISS metric learning, IEEE Trans. Cybern., 45 (2015) 242-252.
[38] L. Ren, J. Lu, J. Feng, J. Zhou, Multi-modal uniform deep learning for RGB-D person re-identification, Pattern Recognition, 72 (2017) 446-457.
[39] J. Lin, L. Ren, J. Lu, J. Feng, J. Zhou, Consistent-aware deep learning for person re-identification in a camera network, In IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017).
[40] M. Raza, Z. Chen, S.-U. Rehman, P. Wang, P. Bao, Appearance based pedestrians’ head pose and body orientation estimation using deep learning, Neurocomputing, 272 (2018) 647-659.
[41] M. Raza, Z. Chen, S.U. Rehman, P. Wang, J.-k. Wang, Framework for estimating distance and dimension attributes of pedestrians in real-time environments using monocular camera, Neurocomputing, 275 (2018) 533-545.
[42] J. Garcia, N. Martinel, C. Micheloni, A. Gardel, Person re-identification ranking optimisation by discriminant context information analysis, In IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015), pp. 1305-1313.
[43] D. Conte, P. Foggia, C. Sansone, M. Vento, Thirty years of graph matching in pattern recognition, Int. Journ. Patt. Recog. Artifi. Intellig., 18 (2004) 265-298.
[44] G. Kurillo, Z. Li, R. Bajcsy, Wide-area external multi-camera calibration using vision graphs and virtual calibration object, In ACM/IEEE Conference on Distributed Smart Cameras (ICDSC 2008), pp. 1-9.
[45] M. Wang, H. Li, D. Tao, K. Lu, X. Wu, Multimodal graph-based reranking for web image search, IEEE Trans. Image Process., 21 (2012) 4649-4661.
[46] Y. Gao, M. Wang, D. Tao, R. Ji, Q. Dai, 3-d object retrieval and recognition with hypergraph analysis, IEEE Trans. Image Process., 21 (2012) 4290-4303.
[47] Y. Gao, R. Ji, P. Cui, Q. Dai, G. Hua, Hyperspectral image classification through bilayer graph-based learning, IEEE Trans. Image Process., 23 (2014) 2769-2778.
[48] L. Zhu, J. Shen, H. Jin, R. Zheng, L. Xie, Content-based visual landmark search via multimodal hypergraph learning, IEEE Trans. Cybern., 45 (2015) 2756-2769.
[49] D. Zhou, O. Bousquet, T.N. Lal, J. Weston, B. Schölkopf, Learning with local and global consistency, Advances in neural information processing systems, 16 (2004) 321-328.
[50] D. Gray, H. Tao, Viewpoint invariant pedestrian recognition with an ensemble of localized features, In European Conference on Computer Vision (ECCV 2008), pp. 262-275.
[51] R. Zhao, W. Ouyang, X. Wang, Unsupervised salience learning for person re-identification, In IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2013), pp. 3586-3593.
[52] M. Ye, J. Chen, Q. Leng, C. Liang, Z. Wang, K. Sun, Coupled-view based ranking optimization for person re-identification, In International Conference on MultiMedia Modeling (2015), pp. 105-117.
[53] G. Lisanti, I. Masi, A. Del Bimbo, Matching people across camera views using kernel canonical correlation analysis, In ACM Conference on Distributed Smart Cameras (2014), pp. 10.
[54] R. Zhao, W. Ouyang, X. Wang, Person re-identification by saliency learning, IEEE Trans. Pattern Anal. Mach. Intell., 39 (2017) 356-370.
[55] R. Zhao, W. Ouyang, X. Wang, Learning mid-level filters for person re-identification, In IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2014), pp. 144-151.
[56] D.R. Hardoon, S. Szedmak, J. Shawe-Taylor, Canonical correlation analysis: An overview with application to learning methods, Neural computation, 16 (2004) 2639-2664.
[57] R. Zhao, W. Ouyang, X. Wang, Person re-identification by saliency learning, IEEE Trans. Pattern Anal. Mach. Intell., 39 (2016) 356-370.
[58] W. Li, R. Zhao, T. Xiao, X. Wang, Deepreid: Deep filter pairing neural network for person re-identification, In IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2014), pp. 152-159.
[59] L. Zheng, L. Shen, L. Tian, S. Wang, J. Wang, Q. Tian, Scalable person re-identification: A benchmark, In IEEE International Conference on Computer Vision (ICCV 2015), pp. 1116-1124.
Authors Biography
Saeed Ur Rehman received his MS degree from Mohammad Ali Jinnah University, Islamabad, Pakistan. Currently he is a PhD scholar under the CAS-TWAS Fellowship at the University of Science and Technology of China (USTC), Hefei, Anhui, PR China. His research interests include computer vision, deep learning, and machine learning and its applications.
Zonghai Chen was born in Anhui, China, in December 1963. He obtained his Bachelor's degree from the Department of Management and Systems Science of the University of Science and Technology of China (USTC) in 1988. He has been a Professor at the Department of Automation, USTC, since 1998. Prof. Chen is also a recipient of special allowances from the State Council of PR China and a member of the Robotics Technical Committee of the International Federation of Automatic Control (IFAC). Prof. Chen's main research areas cover modeling and control of complex systems, control system engineering and intelligent information processing, and energy management technologies for electric vehicles and smart microgrids.
Mudassar Raza is a PhD scholar at the University of Science and Technology of China (USTC), China, under the CAS-TWAS fellowship. He has more than seven years of experience teaching undergraduate classes at COMSATS Institute of Information Technology, Pakistan. His interests include deep learning, pattern recognition, and parallel and distributed computing.
Peng Wang received his BS degree from the University of Science and Technology of China (USTC) in 2010. He continued as a PhD candidate at USTC from 2010 and received his PhD degree in 2015. He currently works as a postdoctoral researcher in the Automation Department. His current research is focused on uncertain information processing in mobile robot navigation, interval analysis, and deep/reinforcement learning.
Qibin Zhang received his BS degree from the University of Science and Technology of China (USTC) in 2012. He is now a PhD candidate at the Knowledge Representation and Intelligent Information Technology Laboratory in the Department of Automation, USTC. His research interests include mobile robot localization, SLAM and knowledge representation.