A new approach for face recognition by sketches in photos

Signal Processing 89 (2009) 1576–1588


Bing Xiao a, Xinbo Gao a,*, Dacheng Tao b, Xuelong Li c

a School of Electronic Engineering, Xidian University, Xi'an 710071, People's Republic of China
b School of Computer Engineering, Nanyang Technological University, 50 Nanyang Avenue, Blk N4, 639798, Singapore
c School of Computer Science and Information Systems, Birkbeck College, University of London, London WC1E 7HX, UK

Article info

Abstract

Article history: Received 28 October 2008; received in revised form 16 February 2009; accepted 17 February 2009; available online 28 February 2009.

Face recognition by sketches in photos remains a challenging task. Unlike the existing sketch–photo recognition methods, which convert a photo into a sketch and then perform sketch–photo recognition through sketch–sketch recognition, this paper is devoted to synthesizing a photo from the sketch, transforming sketch–photo recognition into photo–photo recognition to achieve better performance in mixed-modality pattern recognition. The contribution of this paper focuses on two aspects: (1) given that there are few research findings on sketch–photo recognition based on pseudo-photo synthesis, and that the existing methods require a large set of training samples, which is nearly impossible to obtain owing to the high cost of sketch acquisition, we make use of the embedded hidden Markov model (EHMM), which can learn the nonlinearity of a sketch–photo pair with fewer training samples, to produce pseudo-photos from sketches; and (2) photos and sketches are divided into patches and a pseudo-photo is generated by combining pseudo-photo patches, which makes the pseudo-photo more recognizable. Experimental results demonstrate that the newly proposed method is effective in identifying face sketches in a photo set. © 2009 Elsevier B.V. All rights reserved.

Keywords: Sketch–photo recognition; Complex sketch; Pseudo-photo; EHMM; Local strategy

1. Introduction

As one of the most promising personal identification methods, face recognition [1–3] has drawn increasing attention from researchers and plays an important role in many application areas, one of which is automatically retrieving a suspect's images from an existing photo database to identify the suspect, so as to prevent terrorist activities and ensure public security. Automated retrieval of suspects' images would not only save many person-hours but also reduce the effect of subjective assessment. In practice, we sometimes have no photos of suspects, only witnesses' verbal descriptions of them; simulated sketches are therefore generated through the cooperation of artists and witnesses instead of photos, and

* Corresponding author. Tel.: +86 29 88201838; fax: +86 29 88201620.

E-mail address: [email protected] (X. Gao). doi:10.1016/j.sigpro.2009.02.008

identity of suspects has to be determined by searching the existing photo database based on sketches. The problem of recognizing sketches in a photo database automatically and effectively has therefore attracted researchers. Since research on face recognition sprang up in the 1960s, many significant achievements have been made [1–3], most recently [4–9], but most of them are applicable only to face photos. Sketches and photos are generated and expressed by different mechanisms and therefore exhibit great geometrical deformation and large differences in texture. In other words, the sketch and photo of a person may be similar in geometry, but their textures are always very different, which renders the existing face recognition algorithms ineffective. Sketch–photo recognition has thus become a challenging research focus of face recognition and deserves further study. The key to sketch–photo recognition is transforming photos and sketches into the same modality to reduce the difference between them; face recognition


by sketches in pseudo-sketches, or by pseudo-photos in photos, is then performed using classical approaches [10,11]. The initial research is presented in [12,13], which relies heavily on human intervention. Robert et al. [14] were the first to propose a method for automatically matching sketches and photos. As shown in Fig. 1, the face sketches referred to in sketch–photo recognition to date comprise line-drawing sketches and complex sketches, which depict face information by lighting and shading in addition to lines. In line-drawing sketch–photo recognition, photos have to be converted into line-drawing sketches, with which the line-drawing sketch to be recognized is compared. For complex sketch–photo recognition, on the one hand, photos are converted into pseudo-sketches, based on which identification of the complex sketch is performed. Tang et al. have contributed greatly to converting photos into pseudo-sketches. They proposed a method based on principal components analysis (PCA) [15–17], and then introduced manifold learning into pseudo-sketch synthesis [18], in which the nonlinearity between photos and sketches is approximated by local linearity. Extending these ideas, Gao et al. proposed sketch synthesis algorithms [19–21] based on the embedded hidden Markov model (EHMM), so that the complex nonlinear relationship between photos and sketches is learnt exactly. Pseudo-sketch synthesis has thus developed from a linear method, through a nonlinear method approximated by local linearity, to a truly nonlinear method. On the other hand, since much information useful for recognition may be lost if all face photos are transformed into sketches, the complex sketch


is transformed into a pseudo-photo, which is recognized in photo databases. Facial pseudo-photo synthesis is achieved by photometrically standardizing sketch images [14,22], by a hybrid subspace method [23], and by Bayesian tensor inference [24]. The trend in the development of these methods is from pixel-based methods, through subspace methods, to statistical ones. Our aim is the recognition of complex sketches in a photo database; in the following, complex sketch is abbreviated to sketch. Research on sketch–photo recognition based on synthesizing pseudo-photos from the corresponding sketches is still at an elementary stage. The existing methods require a large set of training samples; however, the training set is usually restricted to a small size because of the high cost of sketch acquisition, which limits the application of such methods. Aiming at this problem, a novel face pseudo-photo synthesis algorithm based on machine learning is proposed, and recognition of the pseudo-photo in photos is performed, so as to implement sketch–photo recognition with a small set of training samples. The nonlinear relationship of each sketch–photo pair in the training set is learnt by a pair of EHMMs [25]. Given a face sketch, several intermediate pseudo-photos are generated from the selected trained EHMM pairs, following the idea of selective ensemble, and the expected pseudo-photo results from fusing these intermediate pseudo-photos. Each intermediate pseudo-photo is generated with a

[Fig. 1 flow chart: photos are either converted into facial line-drawing sketches, with which a line-drawing sketch is recognized, or converted into pseudo-sketches (linear method; nonlinear method approximated by local linearity; truly nonlinear method), in which complex sketches are recognized; alternatively, a complex sketch is converted into a pseudo-photo (method based on basic pixels; subspace method; statistical method), which is recognized in the photo set. In all cases photos and sketches are converted into the same modality before recognition.]
Fig. 1. Summary of existing sketch–photo recognition algorithms.


training sketch–photo pair and is, to a certain extent, a pseudo-photo for the given sketch; that is to say, one pair of training sketch and photo is sufficient for obtaining a pseudo-photo. Several training sketch–photo pairs are chosen to synthesize more intermediate pseudo-photos for the given sketch, so that they can be fused to generate a much better pseudo-photo. Accordingly, it is unnecessary to collect a great many training samples. In addition, compared with the whole face, local facial features provide more specific information, which favors the state estimation of EHMMs, so we add a local strategy to the synthesis of the pseudo-photo. Once the pseudo-photo is synthesized, eigenface recognition [10,26] is performed between the pseudo-photo and the photos. A set of experiments shows that the proposed method leads to a high recognition rate in sketch–photo recognition. The remainder of this paper is organized as follows. Section 2 gives an overview of the proposed algorithm. In Section 3, the idea of the EHMM is introduced briefly, and an EHMM pair is modeled for a pair of sketch and photo patches. The procedure of synthesizing and identifying pseudo-photos is described in detail in Section 4. The experimental results are presented in Section 5, and the final section gives the conclusions.

2. The overview of the proposed method

The aim of the proposed method is to identify a sketch in a photo set by synthesizing a pseudo-photo for the sketch and performing photo–photo recognition. The hidden Markov model (HMM) is a statistical method used to model signals for processing [27]. Because face images have rich two-dimensional spatial information, modeling a face image with the traditional one-dimensional HMM not only loses the spatial information


partially, but also increases computational complexity, a problem that persists in the two-dimensional HMM proposed by Othman et al. for modeling faces [28]. The EHMM [25] was then introduced to extract the main two-dimensional facial features with moderate computational complexity. Besides that, the EHMM can be used for face-to-face transformation [29], which inspires the following: the EHMMs of the sketch and photo of a person have the same state transitions, and the EHMMs of patches located at the same position in a sketch and its corresponding photo should have the same state transitions too. Given an EHMM of a sketch patch, different sketch patches lead to distinct state sequences. Based on the derived state sequences and the EHMM of a photo patch with the same state transitions as the given EHMM, pseudo-photo patches can be obtained, as presented in detail in Section 4. For these two reasons, the nonlinear relationship between a sketch patch and a photo patch is learnt by a joint-trained EHMM, which is decomposed into coupled EHMMs having the same state transition matrix and the same mixture weights but different GMMs in each pair of corresponding states. The pseudo-photo patch for the sketch to be recognized is generated with the derived EHMMs. On the other hand, the eigenface approach to face recognition extracts face image features based on PCA [30] to represent the image best in a feature space of reduced dimensionality. It surpasses other face recognition methods in speed and efficiency, and works especially well when the faces are captured in frontal view under similar lighting, conditions that are satisfied by the experimental data in this paper. The idea of the proposed method is given in Fig. 2.
After all the sketch–photo pairs in the training set and the testing sketch are evenly divided into N overlapping patches, each patch of the testing sketch is compared with the training sketch patches and K patches are chosen by the K-nearest neighbors algorithm. K corresponding

[Fig. 2 flow chart: the sketch to be recognized and the training sketch and photo sets are divided into patches; for each patch of the test sketch, similar training sketch patches are chosen by similarity computation and the corresponding training photo patches are selected; EHMM pairs are jointly trained for the chosen sketch and photo patches; the test patch is decoded and pseudo-photo patches are synthesized and composed into a pseudo-photo, which is recognized with the eigenface algorithm to give the recognition result.]
Fig. 2. The idea of the proposed algorithm.

photo patches are selected too. The nonlinear relationship of each chosen sketch–photo patch pair is modeled by an EHMM, which is decomposed into the EHMMs of the training sketch patch and photo patch. Then, the patch of the testing sketch is decoded according to the EHMMs of the K sketch patches, and the pseudo-photo patch is reconstructed from the result of decoding and the EHMMs of the K photo patches. The pseudo-photo patches are integrated into a pseudo-photo, which is identified in the photo set with the eigenface algorithm, so that sketch–photo recognition is achieved.
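The even division into N overlapping patches used throughout this pipeline can be sketched as follows; this is an illustrative implementation (function and variable names are ours, not the paper's), with the stride derived from the patch size B and the overlapping degree D:

```python
def divide_into_patches(image, B, D):
    """Divide a 2-D image (a list of pixel rows) into overlapping B x B
    patches. D is the overlapping degree, i.e. the fraction of B shared
    by neighbouring patches, so the stride between patch origins is
    B * (1 - D)."""
    stride = int(B * (1 - D))
    H, W = len(image), len(image[0])
    patches = []
    for y in range(0, H - B + 1, stride):
        for x in range(0, W - B + 1, stride):
            patches.append([row[x:x + B] for row in image[y:y + B]])
    return patches

# With the settings of Section 5 (64 x 64 images, B = 32, D = 0.75) the
# stride is 8, giving a 5 x 5 grid, i.e. N = 25 patches per face.
img = [[0] * 64 for _ in range(64)]
print(len(divide_into_patches(img, 32, 0.75)))  # 25
```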

3. Embedded HMM pair of sketch patch and photo patch

An EHMM consists of a series of super-states, each of which contains several embedded-states. As in the HMM, the states of the EHMM are non-observable and only a sequence of instances $O = \{o_1, o_2, \ldots, o_T\}$ generated by these states can be observed, where $T$ is the number of observation vectors. The model is denoted as
$$\lambda = (\Pi_s, A_s, \Lambda_e, N_s),$$
where $\Pi_s = \{\Pi_k\}$ includes the initial super-state distribution vectors and $\Pi_k$ is for the $k$-th super-state, $A_s = \{a_{kq}\}$ is the super-state transition probability matrix representing the probability of transiting from the $k$-th super-state to the $q$-th one, $\Lambda_e = \{\Lambda^{(1)}, \Lambda^{(2)}, \ldots, \Lambda^{(N_s)}\}$ denotes the embedded-states, and $N_s$ represents the number of super-states. The relationship of all parameters is illustrated in Fig. 3 for convenient description. The $k$-th super-state is a one-dimensional HMM represented by
$$\Lambda^{(k)} = \{\Pi_e^{(k)}, A_e^{(k)}, B_e^{(k)}, N_e^{(k)}\},$$
where $\Pi_e^{(k)} = \{\pi_i^{(k)}\}$ is the initial embedded-state distribution and $\pi_i^{(k)}$ is for the $i$-th embedded-state, $A_e^{(k)} = \{a_{ij}^{(k)}\}$ is the embedded-state transition matrix and $a_{ij}^{(k)}$ denotes the probability of transiting from the $i$-th embedded-state to the $j$-th one, $B_e^{(k)} = \{b_i^{(k)}(o_t)\}$ is the distribution probability of observations for each embedded-state, and $N_e^{(k)}$ is the number of embedded-states. In our method, each embedded-state is expressed by a Gaussian mixture model (GMM) [31] of the form
$$b_i^{(k)}(o_t) = \sum_{m=1}^{N_i^{(k)}} c_{im}^{(k)} N(o_t; \mu_{im}^{(k)}, \Sigma_{im}^{(k)}) = \sum_{m=1}^{N_i^{(k)}} \frac{c_{im}^{(k)}}{\sqrt{(2\pi)^D \lvert\Sigma_{im}^{(k)}\rvert}} \exp\Bigl(-\tfrac{1}{2}(o_t - \mu_{im}^{(k)})^{\mathrm T} (\Sigma_{im}^{(k)})^{-1} (o_t - \mu_{im}^{(k)})\Bigr), \quad (1)$$
where $D$ is the dimension of the observation vector, $N_i^{(k)}$ is the number of mixture components in the $i$-th embedded-state, and $c_{im}^{(k)}$, $\mu_{im}^{(k)}$, $\Sigma_{im}^{(k)}$ are the mixture weight, mean vector and covariance matrix of the $m$-th mixture component in the $i$-th embedded-state. Accordingly, $B_e^{(k)} = \{C_e^{(k)}, M_e^{(k)}, U_e^{(k)}, N_{mix}^{(k)}\}$, in which $C_e^{(k)} = \{c_{im}^{(k)}\}$, $M_e^{(k)} = \{\mu_{im}^{(k)}\}$, $U_e^{(k)} = \{\Sigma_{im}^{(k)}\}$, $N_{mix}^{(k)} = \{N_i^{(k)}\}$. In such cases, $\Lambda^{(k)}$ is also represented as $\Lambda^{(k)} = \{\Pi_e^{(k)}, A_e^{(k)}, C_e^{(k)}, M_e^{(k)}, U_e^{(k)}, N_{mix}^{(k)}, N_e^{(k)}\}$.
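Under the diagonal-covariance assumption adopted later in this section, the GMM emission probability of Eq. (1) can be sketched as follows (a minimal illustration; the function name and data layout are ours):

```python
import math

def gmm_emission(o, weights, means, variances):
    """Emission probability b(o) of one embedded-state, Eq. (1),
    assuming diagonal covariance matrices.

    o            : observation vector (length D)
    weights[m]   : mixture weight c_m
    means[m]     : mean vector mu_m (length D)
    variances[m] : per-dimension variances of the diagonal Sigma_m
    """
    total = 0.0
    for c, mu, var in zip(weights, means, variances):
        # log of the Gaussian normalisation constant 1/sqrt((2*pi)^D |Sigma|)
        log_norm = -0.5 * sum(math.log(2 * math.pi * v) for v in var)
        # Mahalanobis distance (o - mu)^T Sigma^{-1} (o - mu)
        maha = sum((x - m) ** 2 / v for x, m, v in zip(o, mu, var))
        total += c * math.exp(log_norm - 0.5 * maha)
    return total
```

For a single standard-normal component in one dimension, `gmm_emission([0.0], [1.0], [[0.0]], [[1.0]])` returns the density 1/sqrt(2*pi).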

Given a manually generated sketch patch s and its corresponding photo patch p with t pixels, the main steps of modeling EHMMs for them are listed in Table 1 and presented in detail below:

• Observation vector sets $O_s = \{o_s^i\}$ and $O_p = \{o_p^i\}$ for s and p are extracted, where $i = 1, 2, \ldots, t$. The observation vector at each pixel in s and p consists of the pixel gray value and the responses of the Gaussian, Laplacian, horizontal-derivative and vertical-derivative operators; the pixel gray value and the Gaussian operator capture the low-frequency information and average intensity, respectively, while the other three operators characterize the high-frequency information. The observation vectors of s and p are extracted and stored in the same order; therefore, the vectors of s and p with the same index correspond to each other.
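As a rough illustration of this per-pixel feature extraction, the sketch below uses common 3x3 versions of the named operators; the paper does not specify the exact kernels, so these particular kernels (and the replicate border handling) are assumptions:

```python
def conv_at(img, y, x, kernel):
    """Apply a 3x3 kernel at pixel (y, x) with replicate padding."""
    H, W = len(img), len(img[0])
    s = 0.0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            yy = min(max(y + dy, 0), H - 1)
            xx = min(max(x + dx, 0), W - 1)
            s += kernel[dy + 1][dx + 1] * img[yy][xx]
    return s

# Illustrative 3x3 operators (assumed, not taken from the paper).
GAUSS = [[1/16, 2/16, 1/16], [2/16, 4/16, 2/16], [1/16, 2/16, 1/16]]
LAPLACE = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]
DX = [[0, 0, 0], [-1, 0, 1], [0, 0, 0]]   # horizontal derivative
DY = [[0, -1, 0], [0, 0, 0], [0, 1, 0]]   # vertical derivative

def observation_vector(img, y, x):
    """5-D observation at a pixel: gray value plus four filter responses."""
    return [img[y][x],
            conv_at(img, y, x, GAUSS),
            conv_at(img, y, x, LAPLACE),
            conv_at(img, y, x, DX),
            conv_at(img, y, x, DY)]
```

On a constant image the derivative and Laplacian responses vanish, leaving only the gray value and its Gaussian-smoothed copy.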

Fig. 3. The relationship of EHMM parameters.


Table 1
Procedure of constructing EHMMs for sketch patch and photo patch. Given a sketch patch s and its corresponding photo patch p having t pixels each:
Step 1: Observation vector sets $O_s$ and $O_p$ for s and p are extracted and each pair of observation vectors for s and p is combined into one vector so as to form the combined observation vectors O.
Step 2: The numbers of super-states, of embedded-states in each super-state and of mixture components in each embedded-state are specified.
Step 3: The sequence O is segmented uniformly according to the number of states, and observation vectors within an embedded-state are clustered according to the number of mixture components.
Step 4: The joint-trained EHMM $\lambda = (\Pi, A, \Lambda, N)$ is derived from the combined observation vectors by the Baum–Welch algorithm.
Step 5: $\lambda_s = (\Pi_s, A_s, \Lambda_s, N_s)$ for sketch patch s and $\lambda_p = (\Pi_p, A_p, \Lambda_p, N_p)$ for photo patch p are obtained by decomposing the model $\lambda$.

• Observation vectors $o_s^i$ and $o_p^i$ for the $i$-th pixel in s and p are combined into a vector $o^i = [o_s^i, o_p^i]$, and this step is repeated for all pixels in s and p so as to form the combined observation vectors $O = \{o^i\}$, $i = 1, 2, \ldots, t$.
• We specify the numbers of super-states, of embedded-states in each super-state and of mixture components in each embedded-state. The observation sequence O is segmented uniformly according to the number of states, and the observation vectors within an embedded-state are clustered, the number of clusters being the same as that of mixture components in the embedded-state.
• The joint-trained EHMM $\lambda = (\Pi, A, \Lambda, N)$ is derived from the segmented observation sequence O with the help of the Baum–Welch algorithm [32,33], which is given in Table 2. The Baum–Welch algorithm follows the idea of the expectation–maximization (EM) algorithm [34,35]: with all parameters in $\lambda$ initialized according to the segmented observation sequence O, these parameters are modified iteratively until $P(O|\lambda)$, the likelihood of the observation vectors under the model $\lambda$, converges. As illustrated in Table 2, the EM algorithm iterates over two steps.

In the E-step, $P(O|\lambda)$ is evaluated with the forward algorithm [36]. For image patches, the observation sequence O is denoted as $O = \{O_y, 1 \le y \le Y\}$, where $O_y = \{o_{yx}, 1 \le x \le X\}$, and correspondingly the state sequence S is $S = \{S_y, 1 \le y \le Y\}$, where $S_y = \{s_{yx}, 1 \le x \le X\}$. $O_y$ and $S_y$ are the observation and state sequences of the $y$-th line, and $o_{yx}$ and $s_{yx}$ are the observation vector and state index of the $x$-th column in the $y$-th line. Because super-states model the information in the vertical direction and embedded-states model that in the horizontal direction, X and Y are the numbers of observation vectors in the horizontal and vertical directions. The forward and backward variables for the observation sequence $O_y$ are defined as $\alpha_{yx}^i(k) = P(o_{y1}, \ldots, o_{yx}, s_{yx} = \Lambda_k^{(i)} \mid S_y = \Lambda^{(i)}, \lambda)$ and $\beta_{yx}^i(k) = P(o_{y,x+1}, \ldots, o_{yX} \mid s_{yx} = \Lambda_k^{(i)}, S_y = \Lambda^{(i)}, \lambda)$, which are computed by the one-dimensional HMM forward–backward algorithm according to formulas (2) and (3):
$$\alpha_{y1}^i(k) = \pi_k^{(i)} b_k^{(i)}(o_{y1}) \quad \text{and} \quad \alpha_{y(x+1)}^i(k) = \Bigl[\sum_{l=1}^{N_e^{(i)}} \alpha_{yx}^i(l)\, a_{lk}^{(i)}\Bigr] b_k^{(i)}(o_{y(x+1)}), \quad (2)$$
$$\beta_{yX}^i(k) = 1 \quad \text{and} \quad \beta_{yx}^i(k) = \sum_{l=1}^{N_e^{(i)}} a_{kl}^{(i)}\, b_l^{(i)}(o_{y(x+1)})\, \beta_{y(x+1)}^i(l). \quad (3)$$
Based on these two variables, we can compute $P_y^i = P(O_y \mid S_y = \Lambda^i, \lambda) = \sum_{k=1}^{N_e^{(i)}} \alpha_{yx}^i(k)\, \beta_{yx}^i(k)$. The forward variable for the observation sequence $O_1, O_2, \ldots, O_y$ is defined as $\eta_y(i) = P(O_1, \ldots, O_y, S_y = \Lambda^{(i)} \mid \lambda)$. With these definitions in hand, the forward algorithm starts from the initialization $\eta_1(i) = \Pi_i P_1^i$, and $\eta_{y+1}(i) = \bigl[\sum_{j=1}^{N_s} \eta_y(j)\, a_{ji}\bigr] P_{y+1}^i$ is computed recursively until $\eta_Y(i)$ and $P(O|\lambda) = \sum_{i=1}^{N_s} \eta_Y(i)$ are obtained. Although the form of $P(O|\lambda)$ is too intricate to be given in closed form, it is derived iteratively from known quantities such as the GMMs, the state transitions, and so on. The state sequence and mixture index sequence corresponding to O are re-estimated with the doubly embedded Viterbi algorithm [37,38]. It starts with $\delta_1(i) = \Pi_i Q_1^i$, and $\delta_{y+1}(i) = \max_{j \in [1, N_s]} [\delta_y(j)\, a_{ji}]\, Q_{y+1}^i$ is computed recursively until $\max_{i \in [1, N_s]} \delta_Y(i)$ is obtained, where $Q_y^i = \max_{s_{y1}, s_{y2}, \ldots, s_{yX}} P(O_y, s_{y1}, s_{y2}, \ldots, s_{yX} \mid S_y = \Lambda^i, \lambda)$ is computed with the help of the one-dimensional HMM Viterbi algorithm: $\psi_{yx}^i(k)$ is initialized as $\psi_{y1}^i(k) = \pi_k^{(i)} b_k^{(i)}(o_{y1})$ and induced according to $\psi_{y(x+1)}^i(k) = \max_{1 \le l \le N_e^{(i)}} [\psi_{yx}^i(l)\, a_{lk}^{(i)}]\, b_k^{(i)}(o_{y(x+1)})$ until $\psi_{yX}^i(k)$ and $Q_y^i = \max_{1 \le k \le N_e^{(i)}} [\psi_{yX}^i(k)]$ are obtained. While $\max_{i \in [1, N_s]} \delta_Y(i)$ is computed, an array keeps track of the arguments that maximize $\delta_y(i)$ and $\psi_{yx}^i(k)$ in each iteration, so that the best state sequence $S_b$ and mixture index sequence $M_b$ are finally tracked back.

In the M-step, the observation sequence O is segmented with the re-estimated state and mixture index sequences $S_b$ and $M_b$, and then the joint-trained EHMM is updated according to the segmented observation sequence. The re-estimation of the EHMM parameters is performed with formulas (4)–(8):
$$\hat{\Pi}_i = \frac{P(S_1 = \Lambda^{(i)} \mid \lambda, O)}{\sum_{i=1}^{N_s} P(S_1 = \Lambda^{(i)} \mid \lambda, O)}, \quad (4)$$
$$\hat{a}_{ij} = \frac{\sum_{y=1}^{Y} P(S_{y-1} = \Lambda^{(i)}, S_y = \Lambda^{(j)} \mid \lambda, O)}{\sum_{y=1}^{Y} P(S_{y-1} = \Lambda^{(i)} \mid \lambda, O)}, \quad (5)$$
$$\hat{\pi}_j^{(i)} = \frac{\sum_{y=1}^{Y} P(s_{y1} = \Lambda_j^{(i)}, S_y = \Lambda^{(i)} \mid \lambda, O)}{\sum_{y=1}^{Y} P(S_y = \Lambda^{(i)} \mid \lambda, O)}, \quad (6)$$
$$\hat{a}_{jl}^{(i)} = \frac{\sum_{y=1}^{Y} \sum_{x=1}^{X} P(s_{y(x-1)} = \Lambda_j^{(i)}, s_{yx} = \Lambda_l^{(i)}, S_y = \Lambda^{(i)} \mid \lambda, O)}{\sum_{y=1}^{Y} \sum_{x=1}^{X} P(s_{y(x-1)} = \Lambda_j^{(i)}, S_y = \Lambda^{(i)} \mid \lambda, O)}, \quad (7)$$
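The one-dimensional forward–backward recursions of Eqs. (2) and (3) can be sketched as follows for a single line of observations; emission probabilities are assumed precomputed, and the names are ours:

```python
def forward_backward(pi, A, B):
    """One-dimensional forward-backward pass, Eqs. (2)-(3).

    pi[k]   : initial embedded-state probabilities
    A[l][k] : transition probability from state l to state k
    B[x][k] : emission probability of observation x under state k
    Returns (alpha, beta, P_y), where P_y is the line likelihood.
    """
    X, N = len(B), len(pi)
    alpha = [[0.0] * N for _ in range(X)]
    beta = [[0.0] * N for _ in range(X)]
    for k in range(N):
        alpha[0][k] = pi[k] * B[0][k]   # Eq. (2) initialization
        beta[X - 1][k] = 1.0            # Eq. (3) initialization
    for x in range(1, X):               # forward recursion
        for k in range(N):
            alpha[x][k] = sum(alpha[x - 1][l] * A[l][k] for l in range(N)) * B[x][k]
    for x in range(X - 2, -1, -1):      # backward recursion
        for k in range(N):
            beta[x][k] = sum(A[k][l] * B[x + 1][l] * beta[x + 1][l] for l in range(N))
    P_y = sum(alpha[X - 1][k] * beta[X - 1][k] for k in range(N))
    return alpha, beta, P_y
```

A useful invariant for checking an implementation: the sum of alpha[x][k] * beta[x][k] over k equals the same line likelihood P_y for every column x.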


Table 2
Baum–Welch algorithm for the joint-trained EHMM. Given initialized parameters of the joint-trained EHMM $\lambda$:
Do
E-step: Firstly, $P(O|\lambda)$ is evaluated with the forward algorithm:
Initialization: $\eta_1(i) = \Pi_i P_1^i$;
Recursion: $\eta_{y+1}(i) = \bigl[\sum_{j=1}^{N_s} \eta_y(j)\, a_{ji}\bigr] P_{y+1}^i$;
Termination: $\eta_Y(i)$ and $P(O|\lambda) = \sum_{i=1}^{N_s} \eta_Y(i)$;
where $\eta_y(i) = P(O_1, \ldots, O_y, S_y = \Lambda^{(i)} \mid \lambda)$ is the forward variable for the observation sequence $O_1, O_2, \ldots, O_y$, and $P_y^i = P(O_y \mid S_y = \Lambda^i, \lambda) = \sum_{k=1}^{N_e^{(i)}} \alpha_{yx}^i(k)\, \beta_{yx}^i(k)$ is computed with the help of the forward and backward variables $\alpha_{yx}^i(k) = P(o_{y1}, \ldots, o_{yx}, s_{yx} = \Lambda_k^{(i)} \mid S_y = \Lambda^{(i)}, \lambda)$ and $\beta_{yx}^i(k) = P(o_{y,x+1}, \ldots, o_{yX} \mid s_{yx} = \Lambda_k^{(i)}, S_y = \Lambda^{(i)}, \lambda)$ for the observation sequence $O_y$, obtained by the one-dimensional HMM forward–backward algorithm:
Initialization: $\alpha_{y1}^i(k) = \pi_k^{(i)} b_k^{(i)}(o_{y1})$ and $\beta_{yX}^i(k) = 1$;
Recursion: $\alpha_{y(x+1)}^i(k) = \bigl[\sum_{l=1}^{N_e^{(i)}} \alpha_{yx}^i(l)\, a_{lk}^{(i)}\bigr] b_k^{(i)}(o_{y(x+1)})$ and $\beta_{yx}^i(k) = \sum_{l=1}^{N_e^{(i)}} a_{kl}^{(i)}\, b_l^{(i)}(o_{y(x+1)})\, \beta_{y(x+1)}^i(l)$;
Termination: $P_y^i = P(O_y \mid S_y = \Lambda^i, \lambda) = \sum_{k=1}^{N_e^{(i)}} \alpha_{yx}^i(k)\, \beta_{yx}^i(k)$.
Secondly, the state sequence and mixture index sequence corresponding to O are re-estimated with the doubly embedded Viterbi algorithm:
Initialization: $\delta_1(i) = \Pi_i Q_1^i$;
Recursion: $\delta_{y+1}(i) = \max_{j \in [1, N_s]} [\delta_y(j)\, a_{ji}]\, Q_{y+1}^i$;
Termination: $\max_{i \in [1, N_s]} \delta_Y(i)$;
where $Q_y^i = \max_{s_{y1}, s_{y2}, \ldots, s_{yX}} P(O_y, s_{y1}, s_{y2}, \ldots, s_{yX} \mid S_y = \Lambda^i, \lambda)$ is computed by the one-dimensional HMM Viterbi algorithm:
Initialization: $\psi_{y1}^i(k) = \pi_k^{(i)} b_k^{(i)}(o_{y1})$;
Recursion: $\psi_{y(x+1)}^i(k) = \max_{1 \le l \le N_e^{(i)}} [\psi_{yx}^i(l)\, a_{lk}^{(i)}]\, b_k^{(i)}(o_{y(x+1)})$;
Termination: $\psi_{yX}^i(k)$ and $Q_y^i = \max_{1 \le k \le N_e^{(i)}} [\psi_{yX}^i(k)]$.
During the computation of $\max_{i \in [1, N_s]} \delta_Y(i)$, an array keeps track of the arguments that maximize $\delta_y(i)$ and $\psi_{yx}^i(k)$ in each iteration. The best state sequence $S_b$ and mixture index sequence $M_b$ are finally tracked back.
M-step: The observation sequence O is segmented with the re-estimated state and mixture index sequences $S_b$ and $M_b$. The EHMM $\lambda$ is updated and the re-estimated parameters are computed with the following formulas:
$$\hat{\Pi}_i = \frac{P(S_1 = \Lambda^{(i)} \mid \lambda, O)}{\sum_{i=1}^{N_s} P(S_1 = \Lambda^{(i)} \mid \lambda, O)}, \quad \hat{a}_{ij} = \frac{\sum_{y=1}^{Y} P(S_{y-1} = \Lambda^{(i)}, S_y = \Lambda^{(j)} \mid \lambda, O)}{\sum_{y=1}^{Y} P(S_{y-1} = \Lambda^{(i)} \mid \lambda, O)}, \quad \hat{\pi}_j^{(i)} = \frac{\sum_{y=1}^{Y} P(s_{y1} = \Lambda_j^{(i)}, S_y = \Lambda^{(i)} \mid \lambda, O)}{\sum_{y=1}^{Y} P(S_y = \Lambda^{(i)} \mid \lambda, O)},$$
$$\hat{a}_{jl}^{(i)} = \frac{\sum_{y=1}^{Y} \sum_{x=1}^{X} P(s_{y(x-1)} = \Lambda_j^{(i)}, s_{yx} = \Lambda_l^{(i)}, S_y = \Lambda^{(i)} \mid \lambda, O)}{\sum_{y=1}^{Y} \sum_{x=1}^{X} P(s_{y(x-1)} = \Lambda_j^{(i)}, S_y = \Lambda^{(i)} \mid \lambda, O)},$$
$$\hat{b}_j^{(i)}(k) = \frac{\sum_{y=1}^{Y} \sum_{x=1,\ o_{yx} = v_k}^{X} P(s_{yx} = \Lambda_j^{(i)}, S_y = \Lambda^{(i)} \mid \lambda, O)}{\sum_{y=1}^{Y} \sum_{x=1}^{X} P(s_{yx} = \Lambda_j^{(i)}, S_y = \Lambda^{(i)} \mid \lambda, O)},$$
where $V = \{v_1, v_2, \ldots, v_K\}$ holds the distinct observation vectors and K is their number.
While $P(O|\lambda)$ is not convergent.
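The inner one-dimensional Viterbi pass of the doubly embedded Viterbi algorithm in Table 2 can be sketched as follows (a minimal illustration with our own names; emissions are assumed precomputed):

```python
def viterbi_line(pi, A, B):
    """One-dimensional Viterbi pass for a single line of observations.

    pi[k]   : initial embedded-state probabilities
    A[l][k] : transition probability from state l to state k
    B[x][k] : emission probability of observation x under state k
    Returns (Q_y, path): the maximal joint probability over state
    sequences and the arg-max state sequence itself.
    """
    X, N = len(B), len(pi)
    psi = [[0.0] * N for _ in range(X)]
    back = [[0] * N for _ in range(X)]
    for k in range(N):
        psi[0][k] = pi[k] * B[0][k]
    for x in range(1, X):
        for k in range(N):
            # best predecessor state for (x, k)
            best_l = max(range(N), key=lambda l: psi[x - 1][l] * A[l][k])
            back[x][k] = best_l
            psi[x][k] = psi[x - 1][best_l] * A[best_l][k] * B[x][k]
    last = max(range(N), key=lambda k: psi[X - 1][k])
    Q_y = psi[X - 1][last]
    path = [last]
    for x in range(X - 1, 0, -1):       # track back through the array
        path.append(back[x][path[-1]])
    path.reverse()
    return Q_y, path
```

The `back` array plays the role of the bookkeeping array described in Table 2, from which the best state sequence is tracked back.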

$$\hat{b}_j^{(i)}(k) = \frac{\sum_{y=1}^{Y} \sum_{x=1,\ o_{yx} = v_k}^{X} P(s_{yx} = \Lambda_j^{(i)}, S_y = \Lambda^{(i)} \mid \lambda, O)}{\sum_{y=1}^{Y} \sum_{x=1}^{X} P(s_{yx} = \Lambda_j^{(i)}, S_y = \Lambda^{(i)} \mid \lambda, O)}, \quad (8)$$
where $V = \{v_1, v_2, \ldots, v_K\}$ holds the distinct observation vectors and K is their number. If $P(O|\lambda)$ has converged, the EM algorithm is completed; otherwise it returns to the E-step.

• The model $\lambda$ derived in the previous step is decomposed into $\lambda_s = (\Pi_s, A_s, \Lambda_s, N_s)$ for the sketch patch s and $\lambda_p = (\Pi_p, A_p, \Lambda_p, N_p)$ for the photo patch p. As already stated, $\lambda_s$ and $\lambda_p$ differ only in the GMMs of each pair of corresponding embedded-states, and the decomposition of $\lambda$ includes dividing the mean vector and covariance matrix of every


mixture component in each embedded-state. We suppose that $\Sigma_{iml}^{(k)}$ is a diagonal matrix, $l = 0, 1$, where 0 and 1 represent the sketch patch and the photo patch, respectively. Consequently, for the $m$-th mixture component in the $i$-th embedded-state and $k$-th super-state, the division is performed as
$$\mu_{im}^{(k)} = [\mu_{im0}^{(k)}, \mu_{im1}^{(k)}] \quad \text{and} \quad \Sigma_{im}^{(k)} = \begin{bmatrix} \Sigma_{im0}^{(k)} & 0 \\ 0 & \Sigma_{im1}^{(k)} \end{bmatrix}.$$
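Under this block-diagonal covariance assumption, splitting one joint mixture component into its sketch and photo parts is a simple slicing operation; a minimal sketch (names and the per-dimension variance layout are our assumptions):

```python
def decompose_component(mean, var, d_s, d_p):
    """Split one joint GMM component into its sketch and photo parts.

    mean : concatenated mean vector [mu_s | mu_p] of length d_s + d_p
    var  : per-dimension variances of the block-diagonal covariance,
           stored in the same concatenated order
    Returns ((mu_s, var_s), (mu_p, var_p)).
    """
    sketch = (mean[:d_s], var[:d_s])
    photo = (mean[d_s:d_s + d_p], var[d_s:d_s + d_p])
    return sketch, photo
```

Because the cross-covariance blocks are zero, no information is lost in the split: the sketch-side and photo-side Gaussians are exactly the marginals of the joint component.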

4. The synthesis and recognition of pseudo-photos

The flow chart of synthesizing pseudo-photos and performing recognition of pseudo-photos in photos is shown in Fig. 4. A training set with M pairs of sketches and photos $(S_i, P_i)$, $i = 1, 2, \ldots, M$, and a sketch S to be recognized are given. Firstly, they are evenly divided into N overlapping patches with patch size $B \times B$ and overlapped area size $B \times B \times D$, where D is the overlapping degree. Each patch $s_{trj}$ of the sketches in the training set is modeled with an EHMM $\lambda_{trj}$, where $j = 1, 2, \ldots, M \times N$. Secondly, for a patch $s_h$ of S, its corresponding pseudo-photo patch $p_h$ is obtained with the following steps:

• The similarity degree of patches $s_h$ and $s_{trj}$ is measured by computing the likelihood of the observations of $s_h$ under $\lambda_{trj}$ with the doubly embedded Viterbi algorithm, and the K sketch patches in the training set with the greatest similarity degrees $w_l$ are found and denoted $s_{chol}$, $l = 1, 2, \ldots, K$.
• The K photo patches corresponding to the selected sketch patches are selected too, forming pairs $(s_{chol}, p_{chol})$, $l = 1, 2, \ldots, K$.
• K EHMM pairs $(\lambda_{s\,chol}, \lambda_{p\,chol})$, $l = 1, 2, \ldots, K$, are trained as described in Section 3.
• $s_h$ is decoded with $\lambda_{s\,chol}$ to obtain the optimal decoded state sequence $Q = (q_1, q_2, \ldots, q_t)$ and mixture index sequence $M = (m_1, m_2, \ldots, m_t)$ based on the doubly embedded Viterbi algorithm. The intermediate pseudo-photo patch $pp_{seul}$ corresponding to $s_h$ is derived from the sequences Q and M using $\lambda_{p\,chol}$; the value of each pixel in $pp_{seul}$ equals the mean value of the mixture component corresponding to that pixel. This step is repeated for the K EHMM pairs, so that K intermediate pseudo-photo patches are generated.
• The K intermediate pseudo-photo patches $pp_{seul}$ are fused into the pseudo-photo patch $p_h$ according to the similarity degrees $w_l$, $l = 1, 2, \ldots, K$, that is,
$$p_h = \frac{\sum_l w_l \, pp_{seul}}{\sum_l w_l}.$$
• The above five steps are repeated for the next patch $s_{h+1}$ of S until all N patches are processed.

Thirdly, with the N pseudo-photo patches $p_h$, $h = 1, 2, \ldots, N$, in hand, the pseudo-photo P is derived by combining all the pseudo-photo patches and averaging the overlapped areas

between patches. Finally, when the pseudo-photo P related to the sketch S in question has been synthesized, we apply eigenface method for recognizing pseudo-photo P in the training photo set fP i g, i ¼ 1; 2; . . . ; M:

• Eigenfaces $W = \{w_k\}$ are derived from the training photo set, where $k = 1, 2, \ldots, K$ and K is the dimensionality of the eigenface space; K is set to $M - 1$ in the proposed method.
• Each photo $P_i$ is projected into the eigenface space W; that is, the weight vector $\psi_i$ is extracted in the space W, and the face photo $P_i$ can be reconstructed according to $P_i = \psi_i \cdot W$.
• Similarly, the pseudo-photo P is projected into the space W and the weight vector $\psi_s$ is extracted according to $P = \psi_s \cdot W$.
• The weights $\psi_s$ and $\psi_i$ are compared to search for
$$j = \arg\min_i \bigl(\lVert \psi_s - \psi_i \rVert\bigr), \quad i = 1, 2, \ldots, M,$$
and then the sketch S is identified as the $j$-th person in the photo set.

5. The experimental results and analysis

In this section, the effectiveness of the proposed sketch–photo recognition algorithm is evaluated on the task of identifying a sketch in a photo set. The proposed method is compared with the direct sketch–photo recognition method and with the approach based on pseudo-sketch synthesis [21]. In the direct sketch–photo recognition method, the testing sketch is identified against the training photos with the eigenface method. In the approach reported in [21], whose idea is illustrated in Fig. 5, pseudo-sketches are synthesized for the training photos and sketch–photo recognition is performed between the testing sketch and the pseudo-sketches with the eigenface method. When a testing sketch is to be recognized, it is converted into the corresponding pseudo-photo in the proposed method, whereas all the training photos are transformed into pseudo-sketches in the approach of Ref. [21]; the proposed method thus requires fewer conversions, so its complexity is lower. Of course, since selective ensemble and the local strategy are employed, the computational complexity of the synthesis itself is somewhat higher. In this paper, the leave-one-out strategy [39] is applied for pseudo-photo and pseudo-sketch generation in the proposed method and the method in [21], respectively. In other words, for the proposed method, a sketch is left out as the test sample while the other sketch–photo pairs in the database are the training samples, each time, until all sketches in the database have been transformed into pseudo-photos; for the method in [21], a photo is the test sample while the other photo–sketch pairs are used for training, until pseudo-sketches have been synthesized for all photos. In the recognition stage, the testing and training samples are indicated in Table 3. The generated pseudo-photos are identified against the photos in the proposed method and, in contrast, all sketches in the database are


[Fig. 4 flow chart: the test sketch and the sketch–photo pairs of the training set are partitioned into patches; for each patch of the test sketch, the similarity to the training sketch patches is computed and its K nearest neighbors are selected; the K chosen sketch patch–photo patch pairs are used to train K EHMM pairs; the test patch is decoded with the K sketch-patch EHMMs, K intermediate pseudo-photo patches are reconstructed with the photo-patch EHMMs and fused into a pseudo-photo patch; the patches are put together into the resulting pseudo-photo, which is projected into the eigenface subspace built by PCA from the training photos and matched by nearest neighbor of the coefficients to yield the identity.]
Fig. 4. The idea of synthesizing and recognizing pseudo-photo.

recognized among the pseudo-sketches derived with the method in [21]. For the direct sketch–photo recognition, the testing and training samples are the sketches and photos, respectively.

In the proposed method, several parameters have to be specified in advance: the EHMM parameters, namely the number of super-states Ns, the number of embedded states Ne^(k) in each super-state and the number of mixture components Ni^(k) in each embedded state, as shown in Fig. 3; the number of nearest neighbors K; the patch size B x B; and the overlapped area B x B x D in Fig. 4. The values of these parameters are given in Table 4. The EHMM has 3 super-states from top to bottom, 6 embedded states from left to right in each super-state and 12 mixture components in each embedded state. Photos and sketches are divided into


patches of 32 x 32 pixels, and the overlapping degree D is 75%. We search for K = 7 nearest neighbors for each testing sketch patch.

The experiments are performed on a database containing 56 color face sketch–photo pairs provided by the Multimedia Lab of the Chinese University of Hong Kong. All sketches in the database are drawn by artists, and all photos are frontal faces with neutral expression taken under normal lighting conditions. Each image is 155 x 200 pixels; some sketches and their corresponding photos are shown in the first and second rows of Fig. 6, respectively. In this paper, the experimental data are resized to 64 x 64 pixels. Pseudo-photos and pseudo-sketches are generated with the proposed method and with the method in [21], respectively; examples are shown in Fig. 7. The original sketches, the

original photos, the pseudo-photos and the pseudo-sketches are presented in rows (a)–(d). By visual inspection, the difference between the pseudo-photos and the photos in rows (c) and (b) is much smaller than that between the pseudo-sketches and the sketches in rows (d) and (a). Moreover, the synthesized pseudo-photos preserve more of the texture information that favors face recognition than the pseudo-sketches do. Because the human eye cannot distinguish among too many colors, color quantization [40] may further speed up the procedure in future work.

With the help of the universal image quality index (UIQI) [41], the quality of the pseudo-sketches and pseudo-photos is assessed quantitatively. For a reference image x and a testing image y, the UIQI of image y is computed as

[Fig. 5 appears here: a testing sketch is compared with the pseudo-sketches synthesized from the photos to obtain the recognition result.]

Fig. 5. Sketch–photo recognition method based on pseudo-sketch synthesis.

Table 3
Testing and training sets in the recognition stage for the three methods.

  Method                                       Testing set                      Training set
  The proposed method                          Pseudo-photos for all sketches   Photos
  The method in [21]                           Sketches                         Pseudo-sketches for all photos
  The direct sketch–photo recognition method   Sketches                         Photos
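The leave-one-out protocol used to generate the pseudo-images can be sketched as follows; `synthesize` is a hypothetical placeholder for the EHMM-based synthesis, not the authors' code.

```python
def leave_one_out_pseudo_images(pairs, synthesize):
    """Hold out each pair in turn and synthesize a pseudo-image for the
    held-out input from the remaining pairs (leave-one-out strategy [39]).

    pairs: list of (input_image, target_image) tuples.
    synthesize(input_image, training_pairs) -> pseudo-image.
    """
    pseudo = []
    for i, (inp, _) in enumerate(pairs):
        training = pairs[:i] + pairs[i + 1:]  # every pair except the held-out one
        pseudo.append(synthesize(inp, training))
    return pseudo

# Toy check: the synthesizer always sees one pair fewer than the database holds.
pairs = [("s1", "p1"), ("s2", "p2"), ("s3", "p3")]
sizes = leave_one_out_pseudo_images(pairs, lambda inp, tr: len(tr))
```

Under this protocol every sketch (or photo) in the database receives a pseudo-image synthesized without its own pair ever appearing in the training set.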

Table 4
The values of the parameters.

  Parameter   Ns   Ne^(k)   Ni^(k)   K   B (pixels)   D
  Value       3    6        12       7   32           75%
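With B = 32 and D = 75%, adjacent patches are shifted by B(1 − D) = 8 pixels. The overlapping partition can be sketched as follows; this is a minimal illustration under those assumptions, not the authors' implementation.

```python
def partition_into_patches(height, width, B=32, overlap=0.75):
    """Top-left corners of B-by-B patches whose neighbors overlap by `overlap`."""
    step = max(1, int(round(B * (1.0 - overlap))))  # 8 pixels for B=32, D=75%
    rows = list(range(0, height - B + 1, step))
    cols = list(range(0, width - B + 1, step))
    if rows[-1] != height - B:                      # make sure the image border
        rows.append(height - B)                     # is fully covered
    if cols[-1] != width - B:
        cols.append(width - B)
    return [(r, c) for r in rows for c in cols]

# A 64 x 64 face image yields a 5 x 5 grid of 32 x 32 patches.
corners = partition_into_patches(64, 64)
```

For the 64 x 64 images used here this gives 25 patches, each sharing 75% of its area with its horizontal and vertical neighbors.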

Q = \frac{4\,\sigma_{xy}\,\bar{x}\,\bar{y}}{(\sigma_x^2 + \sigma_y^2)\,[(\bar{x})^2 + (\bar{y})^2]},

where x̄ and ȳ are the mean values, σx² and σy² the variances, of images x and y, and σxy is the covariance of the two images. The greater the UIQI value, the better the quality of image y.

For the images in Fig. 7, the pseudo-sketches in row (d) are treated as testing images with the original sketches in row (a) as reference images, and the quality of the pseudo-photos in row (c) is evaluated with the original photos in row (b) as reference images. The resulting UIQI values are listed in Table 5. Every pseudo-sketch has a lower UIQI value than its corresponding pseudo-photo; that is, the quality of the pseudo-photos is higher than that of the pseudo-sketches. Furthermore, the average UIQI values over all pseudo-sketches and all pseudo-photos are 0.876 and 0.891, respectively. This demonstrates that the proposed method produces images of higher quality, which is important for the subsequent face recognition.

With the pseudo-sketches and pseudo-photos in hand, the face recognition rates of the three methods are computed; the results are shown in Table 6. In the first row, the sketch in question is retrieved in the training photo set directly, and the average recognition rate is 46.43%, less than a half. The second row corresponds to the pseudo-sketch synthesis based method [21], with an average recognition rate of 98.04%. The last row shows the result of the proposed method: the recognition rate reaches 100%, an improvement of 53.57 and 1.96 percentage points over the first two methods, respectively. The proposed

Fig. 6. Examples of color sketch–photo pairs provided by the Multimedia Lab of the Chinese University of Hong Kong.

[Fig. 7 appears here; its columns are labeled P1–P7.]

Fig. 7. Examples of resulting pseudo-photos for color images: (a) the sketches drawn by artists; (b) the corresponding photos; (c) the pseudo-photos generated by the proposed method; (d) the pseudo-sketches generated by the method in [21].

Table 5
The UIQI values of the color images.

                      P1      P2      P3      P4      P5      P6      P7      Average UIQI
  Pseudo-sketch (d)   0.897   0.891   0.907   0.867   0.869   0.874   0.881   0.876
  Pseudo-photo (c)    0.930   0.929   0.953   0.915   0.874   0.908   0.910   0.891
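The UIQI formula above can be computed directly from its definition. The sketch below is a plain global-statistics illustration; the original index [41] is usually evaluated over sliding windows and averaged, so this stands in for the idea rather than the exact procedure.

```python
def uiqi(x, y):
    """Universal image quality index of test image y against reference x.
    x, y: equal-length sequences of pixel values (a flattened image or window)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / (n - 1)                    # variance of x
    vy = sum((b - my) ** 2 for b in y) / (n - 1)                    # variance of y
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)  # covariance
    return 4 * cov * mx * my / ((vx + vy) * (mx ** 2 + my ** 2))
```

A perfect reproduction gives Q = 1; lower values indicate loss of correlation, luminance or contrast, which matches the ordering of the pseudo-sketch and pseudo-photo rows in Table 5.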

Table 6
The comparison of different face recognition methods on the color database.

  Method                              Average recognition rate (%)
  Direct sketch–photo recognition     46.43
  Sketch–pseudo-sketch recognition    98.04
  Pseudo-photo–photo recognition      100
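In all three rows of Table 6, the final matching step is a nearest-neighbor search on eigenface coefficients. Assuming the images have already been projected onto the eigenfaces subspace, and assuming Euclidean distance, the matching can be sketched as follows (a simplified stand-in, not the paper's implementation):

```python
def nearest_neighbor_identity(probe_coef, gallery_coefs):
    """Return the index of the gallery face whose eigenface coefficients
    are closest (in squared Euclidean distance) to the probe's."""
    def dist2(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return min(range(len(gallery_coefs)),
               key=lambda i: dist2(probe_coef, gallery_coefs[i]))

# Toy example: three gallery faces described by 2-D eigenface coefficients.
gallery = [(0.0, 0.0), (5.0, 1.0), (1.0, 4.0)]
match = nearest_neighbor_identity((4.6, 1.2), gallery)
```

The three compared methods differ only in what plays the roles of probe and gallery (sketch vs. photo, pseudo-sketch, or pseudo-photo), as summarized in Table 3.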

method achieves the highest recognition rate among the three, and is in particular far superior to direct sketch–photo recognition.

Experiments are also conducted on a database of 21 gray-level sketch–photo pairs, likewise provided by the Multimedia Lab of the Chinese University of Hong Kong; each image is 64 x 64 pixels, and some examples are shown in Fig. 8. Six examples of synthesized pseudo-photos are shown in Fig. 9, where rows (a)–(d) contain the sketches drawn by artists, the original photos, the pseudo-photos generated by the proposed method and the pseudo-sketches generated by the method in [21], respectively. The recognition results of the three methods are given in Table 7. Direct sketch–photo recognition yields an average recognition rate of 17.6%. The average recognition rate of the pseudo-sketch synthesis based method [21] is 88.2%. The proposed method improves on the first method by 70.6 percentage points and equals the second method.

Although the proposed method has almost the same ability as the method in [21] to identify testing sketches among photos, it is superior in the quality of the derived images, as shown in Table 8. Each pseudo-photo generated by the proposed method has better quality than the corresponding pseudo-sketch in Fig. 9, and the same holds for the average quality over all pseudo-photos and pseudo-sketches. The proposed method therefore has greater potential to achieve better recognition performance.


Fig. 8. Instances of gray sketch–photo pairs.

[Fig. 9 appears here; its columns are labeled P1–P6.]

Fig. 9. Examples of resulting pseudo-photos for gray-level images: (a) the sketches drawn by artists; (b) the original photos; (c) the pseudo-photos generated by the proposed method; (d) the pseudo-sketches generated by the method in [21].

Table 7
The comparison of different face recognition methods on the gray-level database.

  Method                              Average recognition rate (%)
  Direct sketch–photo recognition     17.6
  Sketch–pseudo-sketch recognition    88.2
  Pseudo-photo–photo recognition      88.2
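The rates in Tables 6 and 7 are the percentage of test images whose nearest match carries the correct identity; as a minimal sketch of that bookkeeping (not the authors' evaluation code):

```python
def recognition_rate(predicted, truth):
    """Percentage of probes whose predicted identity matches the true one."""
    correct = sum(p == t for p, t in zip(predicted, truth))
    return 100.0 * correct / len(truth)

rate = recognition_rate([0, 1, 2, 3], [0, 1, 2, 9])  # 3 of 4 probes correct
```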

6. Conclusion

This paper presents a novel sketch–photo recognition method. The sketch to be recognized is converted into a pseudo-photo by combining EHMM and selective ensemble with a local strategy: the nonlinear relationship between sketch patches and photo patches is modeled with EHMMs, several of which are selected to reconstruct pseudo-photo patches, and these patches are assembled into the expected pseudo-photo. Sketch–photo recognition is thereby transformed into recognizing the pseudo-photo in the photo set with the eigenface method. Experiments show that this method outperforms both direct sketch–photo recognition and the pseudo-sketch synthesis based algorithm.

Sketches depend heavily on the artists: sketches drawn by different artists may vary considerably even when they correspond to the same photo, and sketch quality affects sketch–photo recognition to some extent. Enhancing the quality of sketches is therefore required in further research.


Table 8
The UIQI values of the gray-level images.

                      P1      P2      P3      P4      P5      P6      Average UIQI
  Pseudo-sketch (d)   0.712   0.755   0.649   0.644   0.629   0.592   0.687
  Pseudo-photo (c)    0.798   0.793   0.666   0.646   0.639   0.662   0.711

Acknowledgements

The authors are grateful for the helpful comments and suggestions from the anonymous reviewers. Thanks are due to the Multimedia Lab of the Chinese University of Hong Kong for providing the face photo–sketch image database. This research was supported by the National Science Foundation of China (60771068, 60702061, 60832005), the Open-End Fund of the National Laboratory of Pattern Recognition in China and the National Laboratory of Automatic Target Recognition, Shenzhen University, China, and the Program for Changjiang Scholars and Innovative Research Team in University of China (IRT0645).

References

[1] R. Chellappa, C. Wilson, S. Sirohey, Human and machine recognition of faces: a survey, Proc. IEEE 83 (5) (1995) 705–741.
[2] W. Zhao, R. Chellappa, A. Rosenfeld, P. Phillips, Face recognition: a literature survey, ACM Comput. Surv. 35 (4) (2003) 399–458.
[3] P. Phillips, P. Flynn, T. Scruggs, K. Bowyer, J. Chang, K. Hoffman, J. Marques, J. Min, W. Worek, Overview of the face recognition grand challenge, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, San Diego, CA, USA, 20–25 June 2005, pp. 947–954.
[4] Y. Pang, D. Tao, Y. Yuan, X. Li, Binary two-dimensional PCA, IEEE Trans. Syst. Man Cybern. B Cybern. 38 (4) (2008) 1176–1180.
[5] D. Tao, M. Song, X. Li, J. Shen, J. Sun, X. Wu, C. Faloutsos, S. Maybank, Bayesian tensor approach for 3-D face modeling, IEEE Trans. Circuits Syst. Video Technol. 18 (10) (2008) 1397–1410.
[6] Y. Pang, Y. Yuan, X. Li, Gabor-based region covariance matrices for face recognition, IEEE Trans. Circuits Syst. Video Technol. 18 (7) (2008) 989–993.
[7] D. Tao, X. Li, X. Wu, S. Maybank, General tensor discriminant analysis and Gabor features for gait recognition, IEEE Trans. Pattern Anal. Mach. Intell. 29 (10) (2007) 1700–1715.
[8] D. Tao, X. Li, X. Wu, S. Maybank, Geometric mean for subspace selection, IEEE Trans. Pattern Anal. Mach. Intell. 31 (2) (2009) 260–274.
[9] T. Zhang, D. Tao, J. Yang, Discriminative locality alignment, in: Proceedings of European Conference on Computer Vision, Marseille, France, 12–18 October 2008, pp. 725–738.
[10] M. Turk, A. Pentland, Face recognition using eigenfaces, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Maui, Hawaii, USA, 3–6 June 1991, pp. 586–591.
[11] M. Bartlett, J. Movellan, T. Sejnowski, Face recognition by independent component analysis, IEEE Trans. Neural Networks 13 (6) (2002) 1450–1464.
[12] A. Narasimhalu, CAFFIR: an image based CBR/IR application, in: Proceedings of AAAI Spring Symposium, 1993.
[13] J. Shepherd, An interactive computer system for retrieving faces, in: M. Jeeves, F. Newcombe, A. Youngs (Eds.), Aspects of Face Processing, Selected Papers, Martinus Nijhoff Publishers, Dordrecht, 1986, pp. 398–409.
[14] R.G. Uhl Jr., N. da Vitoria Lobo, A framework for recognizing a facial image from a police sketch, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA, 18–20 June 1996, pp. 586–593.
[15] X. Tang, X. Wang, Face photo recognition using sketch, in: Proceedings of IEEE International Conference on Image Processing, Rochester, New York, USA, 22–25 September 2002, pp. 257–260.
[16] X. Tang, X. Wang, Face sketch synthesis and recognition, in: Proceedings of IEEE International Conference on Computer Vision, Nice, France, 14–17 October 2003, pp. 687–694.
[17] X. Tang, X. Wang, Face sketch recognition, IEEE Trans. Circuits Syst. Video Technol. 14 (1) (2004) 50–57.
[18] Q. Liu, X. Tang, H. Jin, H. Lu, S. Ma, A nonlinear approach for face sketch synthesis and recognition, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, San Diego, CA, USA, 20–25 June 2005, pp. 1005–1010.
[19] X. Gao, J. Zhong, C. Tian, Face sketch synthesis algorithm based on machine learning, IEEE Trans. Circuits Syst. Video Technol. 18 (4) (2008) 487–496.
[20] J. Zhong, X. Gao, C. Tian, Face sketch synthesis using E-HMM and selective ensemble, in: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Honolulu, Hawaii, USA, 15–20 April 2007, pp. 485–488.
[21] X. Gao, J. Zhong, D. Tao, X. Li, Local face sketch synthesis learning, Neurocomputing 71 (10–12) (2008) 1921–1930.
[22] R.G. Uhl Jr., N. da Vitoria Lobo, Y.H. Kwon, Recognizing a facial image from a police sketch, in: Proceedings of Second IEEE Workshop on Applications of Computer Vision, Sarasota, Florida, USA, 5–7 December 1994, pp. 129–137.
[23] Y. Li, M. Savvides, V. Bhagavatula, Illumination tolerant face recognition using a novel face from sketch synthesis approach and advanced correlation filters, in: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Toulouse, France, 15–19 May 2006, pp. 357–360.
[24] W. Liu, X. Tang, J. Liu, Bayesian tensor inference for sketch-based facial photo hallucination, in: Proceedings of International Joint Conference on Artificial Intelligence, Hyderabad, India, 6–12 January 2007, pp. 2141–2146.
[25] A. Nefian, M. Hayes, Face recognition using an embedded HMM, in: Proceedings of International Conference on Audio- and Video-based Biometric Person Authentication, Washington, DC, USA, 22–23 March 1999, pp. 19–24.
[26] M. Turk, A. Pentland, Eigenfaces for recognition, J. Cognitive Neurosci. 3 (1) (1991) 71–86.
[27] F. Samaria, Face recognition using hidden Markov models, Ph.D. Thesis, Engineering Department, University of Cambridge, 1994.
[28] H. Othman, T. Aboulnasr, A separable low complexity 2D HMM with application to face recognition, IEEE Trans. Pattern Anal. Mach. Intell. 25 (10) (2003) 1229–1238.
[29] T. Nagai, T. Nguyen, Appearance model based face-to-face transform, in: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Quebec, Canada, 17–21 May 2004, pp. 749–752.
[30] M. Kirby, L. Sirovich, Application of the Karhunen–Loeve procedure for the characterization of human faces, IEEE Trans. Pattern Anal. Mach. Intell. 12 (1) (1990) 103–108.
[31] G. McLachlan, D. Peel, Finite Mixture Models, Wiley, New York, 2000.
[32] L. Baum, T. Petrie, G. Soules, N. Weiss, A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains, Ann. Math. Statist. 41 (1) (1970) 164–171.
[33] A.V. Nefian, M.H. Hayes III, Maximum likelihood training of the embedded HMM for face detection and recognition, in: Proceedings of IEEE International Conference on Image Processing, Vancouver, BC, Canada, 10–13 September 2000, pp. 33–36.
[34] A. Dempster, N. Laird, D. Rubin, Maximum likelihood from incomplete data via the EM algorithm, J. R. Statist. Soc. B 39 (1) (1977) 1–38.
[35] J. Bilmes, A gentle tutorial of the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models, Technical Report ICSI TR-97-021, International Computer Science Institute, University of California, Berkeley, USA, April 1998.
[36] A. Nefian, A hidden Markov model-based approach for face detection and recognition, Ph.D. Thesis, Georgia Institute of Technology, 1999.


[37] A. Viterbi, Error bounds for convolutional codes and an asymptotically optimum decoding algorithm, IEEE Trans. Inf. Theory 13 (2) (1967) 260–269.
[38] S. Kuo, O. Agazzi, Keyword spotting in poorly printed documents using pseudo 2-D hidden Markov models, IEEE Trans. Pattern Anal. Mach. Intell. 16 (8) (1994) 842–848.
[39] M. Stone, Cross-validatory choice and assessment of statistical predictions, J. R. Statist. Soc. B 36 (2) (1974) 111–147.
[40] X. Li, T. Yuan, N. Yu, Y. Yuan, Adaptive color quantization based on perceptive edge protection, Pattern Recognition Lett. 24 (16) (2003) 3165–3176.
[41] Z. Wang, A. Bovik, A universal image quality index, IEEE Signal Process. Lett. 9 (3) (2002) 81–84.