Combining sparse representation and local rank constraint for single image super resolution


Weiguo Gong*, Lunting Hu, Jinming Li, Weihong Li


Key Lab of Optoelectronic Technology & Systems of Education Ministry, Chongqing University, Chongqing 400044, China

Article history: Received 24 March 2015; Revised 28 May 2015; Accepted 3 July 2015; Available online xxx.

Keywords: Local rank constraint information; Nonlocal and global optimization; Single image super resolution; Sparse representation.

Abstract: Sparse representation based reconstruction methods are efficient for single image super resolution. They generally consist of a code stage and a linear combination stage. However, the simple linear combination does not consider the edge constraint information of the image, and hence the classical sparse representation based methods reconstruct images with unwanted edge artifacts and unsharp edges. In this paper, considering that the local rank can extract better edge information than other edge operators, we propose a new single image super resolution method that combines sparse representation with local rank constraint information. In our method, we first learn the local rank of the HR image via the traditional sparse representation model, and then use it as an edge constraint that restricts the image edges during the linear combination stage to reconstruct the HR image. Furthermore, we propose a nonlocal and global optimization model to further improve the HR image quality. Extensive experimental results validate that, compared with many state-of-the-art methods, the proposed method obtains fewer edge artifacts and sharper edges. © 2015 Published by Elsevier Inc.


1. Introduction


High resolution (HR) images are needed in many practical applications, such as medical image analysis, computer vision and remote sensing. The direct way to obtain HR images is to increase the number of pixels per unit area or to reduce the pixel size through sensor manufacturing techniques [23]. However, these approaches are constrained by the physical limitations of imaging systems [16]. To overcome these limitations, various single image super resolution (SISR) methods have been proposed to obtain an HR image from its low resolution (LR) observation. The classical SISR methods can be divided into three categories: the interpolation based methods [13] and [18], the reconstruction based methods [3] and [11], and the example based methods [12] and [6]. Although the interpolation based methods are easy to perform, the reconstructed HR images tend to be blurry, with jagged artifacts and ringing. The reconstruction based methods can introduce prior knowledge into the reconstruction process, but the results may be over-smoothed or lack important details when the magnification factor is large or the prior fails to model the visual complexity of the real image. In this paper, we focus on the example based methods, because they retain a stronger SISR capability as the magnification factor becomes larger and are able to obtain an HR image that fuses the high-frequency information of all the example HR and LR images. The example based methods [12] and [6], which assume that the high-frequency details lost in the LR image can be recovered by learning the relationship between a set of LR example patches and their corresponding HR patches, have become an active area




Corresponding author. Tel.: +86 23 65112779; fax: +86 23 65112779. E-mail address: [email protected] (W. Gong).



of research. Freeman et al. [12] proposed the example based method, which obtains the HR image by employing pairs of LR and HR patches directly in a Markov network. Chang et al. [6] further proposed a neighbor embedding based method under the assumption that the LR image and its corresponding HR version share similar local geometry. To improve the neighbor embedding based method, other neighbor embedding based methods have been proposed in [4] and [5]. Nevertheless, the effectiveness of these methods mainly depends on a large supporting image database [31]. Recently, to alleviate this weakness, Yang et al. [31] proposed the sparse representation based SISR method, which generally consists of the code stage and the linear combination stage. In that work, a joint dictionary training framework was first proposed for training coupled HR and LR dictionaries. Under this framework, Zeyde et al. [32] introduced the sparse-land model into sparse representation for better SISR results. In order to preserve the differences between image patch contents, Yang et al. [27] and [29] employed clustering to learn multiple dictionaries in the code stage. Besides the sparse prior [27,29,31,32] and [19], other image priors, such as structural similarity [22] and [28] and nonlocal self-similarity [8-10,17,30] and [34], have been studied in the code stage. To improve the stability of the recovery results, [15] and [24] proposed coefficient mapping methods, and [20,26] and [35] attempted to capture more image information, such as edges and texture, in the code stage. A drawback of the above methods is that they fail to consider the edge constraint information of the image in the linear combination stage, so the reconstructed image has unwanted edge artifacts and unsharp edges. Recently, the local rank transform, a useful tool for describing data distribution characteristics, has been used in statistical analysis [7], object detection [14], image denoising [1] and stereo matching [2]. In this paper, considering that the local rank can extract better edge information and is less sensitive to noise than other edge operators [21], we develop a local rank edge constraint and introduce it into the linear combination stage for the SISR task. By combining this edge constraint with sparse representation, we propose a new SISR method that reconstructs the HR image with fewer edge artifacts and sharper edges. First, the local rank edge information of the HR image is learned by the sparse representation model, and then it is used as an edge constraint that restricts the image edges during the linear combination stage. Second, we propose a nonlocal and global optimization model to further improve the image quality. The contributions of this paper can be summarized as follows: 1) we classify the training image patches into different patterns to learn the local rank of the HR image; 2) we constrain the local rank of the HR image patch to be as close as possible to the reconstructed local rank through an energy minimization model used to reconstruct the HR image; 3) we propose a new weight calculation method for the non-local self-similarity term in the nonlocal and global optimization model.


The rest of this paper is organized as follows. In Section 2, we give a brief overview of the sparse representation based single image super resolution and local rank transform. The proposed method is presented in Section 3. Experimental results are provided in Section 4. Section 5 concludes this paper.


2. Brief overview of sparse representation based SISR and local rank transform


2.1. Sparse representation for SISR


The LR image can be seen as a blurred and down-sampled version of the HR image. This observation model can be formulated as follows:

$$ Y = SHX + E, \tag{1} $$

where $Y$ represents the LR image, $S$ is the down-sampling operator, $H$ is the blurring filter, $E$ is the noise and $X$ is the HR image. Since there are many HR images satisfying the reconstruction constraint for a given LR image, the process of recovering $X$ from $Y$ is ill-posed. An effective way to deal with this problem is sparse representation, which has become an important tool for single image super resolution. The SISR problem via sparse representation consists of the code stage and the linear combination stage. The code stage is formulated as follows:

$$ \min_{\alpha} \ \|\tilde{y} - \tilde{D}\alpha\|_2^2 + \lambda\|\alpha\|_1, \tag{2} $$

where $\tilde{y} = \begin{bmatrix} Fy \\ \beta w \end{bmatrix}$ and $\tilde{D} = \begin{bmatrix} FD_l \\ \beta PD_h \end{bmatrix}$, D_h and D_l are the HR dictionary and the LR dictionary respectively, F is a feature extraction operator, y is the input LR image patch, P extracts the region of overlap between the current target patch and the previously reconstructed image, α is the sparse coefficient matrix and w contains the values of the previously reconstructed HR image on the overlap. The parameter β controls the tradeoff between matching the LR input and finding an HR patch that is compatible with its neighbors. In the linear combination stage, the ith image patch vector $x_i \in \mathbb{R}^n$, extracted from the HR image X, can be represented as a sparse linear combination over the HR dictionary $D_h \in \mathbb{R}^{n \times K}$:

$$ x_i \approx D_h\alpha_i, \quad \alpha_i \in \mathbb{R}^K, \ \|\alpha_i\|_0 \ll K, \tag{3} $$

where α_i is the sparse coefficient vector computed from the input LR image patch and the LR dictionary.
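For readers who want to experiment with this two-stage pipeline, the following minimal sketch codes an LR feature patch with orthogonal matching pursuit and then applies the linear combination of Eq. (3). The OMP routine, the random toy dictionaries and the sparsity level are illustrative assumptions, not the exact solver or data used in the paper.

```python
import numpy as np

def omp(D, y, n_nonzero=3):
    """Greedy orthogonal matching pursuit: sparse-code y over dictionary D."""
    residual, support = y.copy(), []
    alpha = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        # pick the atom most correlated with the current residual
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k not in support:
            support.append(k)
        # least-squares fit on the selected atoms
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    alpha[support] = coef
    return alpha

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m, K = 49, 25, 128                # HR patch dim, LR feature dim, atoms
    D_h = rng.standard_normal((n, K))    # coupled HR dictionary (toy)
    D_l = rng.standard_normal((m, K))    # coupled LR dictionary (toy)
    D_l /= np.linalg.norm(D_l, axis=0)   # unit-norm atoms for the code stage
    y = rng.standard_normal(m)           # LR feature patch
    alpha = omp(D_l, y, n_nonzero=3)     # code stage, cf. Eq. (2) without the overlap term
    x_hat = D_h @ alpha                  # linear combination stage, Eq. (3)
    print(x_hat.shape)                   # (49,)
```

The essential point of the coupled-dictionary idea is that the sparse code is computed on the LR side and reused unchanged on the HR side.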


2.2. Local rank transform


Let x be an element of a set S; the rank of x with respect to S is defined as the number of elements less than x, lr(x) = r(x; S). The local rank transform of S is LRT(S) = {r(x; N(x)) | x ∈ S}, where N(x) is the neighborhood of x.


The definition of the local rank of a set also applies to an image. The local rank of an image is defined as follows: let I be an image and I(x, y) a pixel of I; in the rank window centered at I(x, y), the local rank of I(x, y) counts the number of pixels whose values are less than I(x, y) within the rank window [2] and [7]. This definition can be formulated as:

$$ LRT(I(x, y)) = N_w - \sum_{i,j} C\big(I(x, y) - I(i, j)\big), \tag{4} $$

where

$$ C(\bullet) = \begin{cases} 1, & I(x, y) - I(i, j) > 0 \\ 0, & I(x, y) - I(i, j) \leq 0 \end{cases} \tag{5} $$

N_w is the total number of pixels in the rank window and (i, j) are the pixel coordinates within the rank window. In Eq. (5), the value of C(•) depends on the values of I(x, y) and I(i, j). In a noisy environment, if the value of I(i, j) becomes larger than I(x, y), C(•) tends to be 0; on the contrary, if the value of I(i, j) becomes smaller than I(x, y), C(•) tends to be 1. Therefore, in order to make this transform robust in a noisy environment, the δ-rank of I(x, y) is defined as follows: in the rank window centered at I(x, y), the δ-rank of I(x, y) counts the number of pixels whose values are less than I(x, y) by at least δ [21]:

$$ LRT_{\delta}(I(x, y)) = N_w - \sum_{i,j} C\big(I(x, y) - I(i, j)\big), \tag{6} $$

where

$$ C(\bullet) = \begin{cases} 1, & I(x, y) - I(i, j) > \delta \\ 0, & I(x, y) - I(i, j) \leq \delta \end{cases} \tag{7} $$

It has been demonstrated that the value of LRT_δ(I) is large at sharp edges when δ is positive (i.e. LRT_δ(I)) and large around sharp edges when δ is negative (i.e. LRT_{-δ}(I)). In order to illustrate that the local rank can effectively be used to extract image edges, the local rank of the LR image, the noisy LR image and the HR image are shown in Fig. 1. We can observe that the edges are highlighted in Fig. 1(b), (e), (h) and the smooth regions are highlighted in Fig. 1(c), (f), (i). Furthermore, when noise (zero-mean Gaussian noise with σ = 4) is added to the LR image, the local rank of the image still maintains the edge structure. As shown in these figures, the image dimension is unchanged and the transformed images contain only edge information after the local rank transform is performed. That is to say, the local rank edge information of the HR image can be regarded as a special image that can be learned from training sets by learning approaches. In this paper, we use sparse representation to learn the local rank edge information of the HR image. Thus, theoretically speaking, the learned local rank can be used as an edge constraint to restrict the edges while reconstructing the HR image.


Fig. 1. Local rank transform on HR and LR images of “Butterfly”. (a)–(c) The LR image and the corresponding LRTδ images with δ = 50 and δ = −50 respectively. (d)–(f) The noisy LR image (with zero mean and σ = 4 variance Gaussian noise) and the corresponding LRTδ images with δ = 50 and δ = −50 respectively. (g)–(i) The HR image and the corresponding LRTδ images with δ = 50 and δ = −50 respectively.
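To make the transform concrete, the following Python sketch implements the δ-rank of Eqs. (6) and (7) with a square rank window. The function name, the use of NumPy and the reflective border handling are our own illustrative choices; the paper does not specify these implementation details.

```python
import numpy as np

def local_rank_transform(image, delta=50, window=7):
    """delta-rank of Eq. (6): N_w minus the count of neighbours that the
    centre pixel exceeds by more than delta, per Eq. (7)."""
    half = window // 2
    img = image.astype(np.float64)
    # Reflective padding so border pixels also see a full rank window
    # (a border convention assumed here, not fixed by the paper).
    padded = np.pad(img, half, mode="reflect")
    n_w = float(window * window)
    count = np.zeros_like(img)
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            shifted = padded[half + dy: half + dy + img.shape[0],
                             half + dx: half + dx + img.shape[1]]
            # C(.) of Eq. (7): 1 when I(x,y) - I(i,j) > delta, else 0
            count += (img - shifted > delta)
    return n_w - count  # Eq. (6)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    toy = rng.integers(0, 256, size=(32, 32)).astype(np.float64)
    print(local_rank_transform(toy, delta=50, window=7).shape)  # (32, 32)
```

A negative delta (e.g. -50) gives the LRT_{-δ} image used throughout the paper.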


3. Proposed method


The sparse representation based SISR methods consist of the code stage and the linear combination stage. However, the simple linear combination fails to consider the edge constraint information of the image, and hence the classical sparse representation based SISR methods reconstruct the HR image with unwanted edge artifacts and unsharp edges. According to Section 2.2, and considering that the local rank of an image can extract better edge information, we propose a new SISR method that reconstructs the HR image by combining sparse representation with local rank edge constraint information. In our method, the local rank edge constraint information is first learned by sparse representation and then directly used to restrict edges during the linear combination stage. The proposed method consists of four parts: the first part is the pre-processing of the training images by the local rank transform; the second part is the multiple dictionaries learning; the third part is the initial HR image reconstruction under the local rank edge constraint; and the final part is the nonlocal and global optimization model that improves the image quality. In the first part, the training image patches are pre-processed by the local rank transform and then classified into two patterns, so that the LR image patches to be reconstructed can be treated differently, adapting to the properties of different image contents. In the multiple dictionaries learning part, the multiple dictionaries of these two patterns are learned from the training patches. In the initial HR image reconstruction part, a suitable pattern is selected for each input LR image patch in order to reconstruct the local rank of the HR image patch; the initial HR image is then reconstructed by constraining the local rank of the HR image patch to be as close as possible to the obtained local rank. In the final part, we propose a nonlocal and global optimization model that makes the initial HR image satisfy the image observation model and further improves the image quality.


3.1. Training image pre-processing by local rank transform


Because the local rank information is unknown before the HR image is reconstructed, we need to construct effective local rank training sets for learning the local rank of the HR image. In this paper, the training sets for learning the dictionaries, which are used to reconstruct the HR image and its local rank, contain three parts: the HR image patches, the LR image patches and the local rank image patches. The HR image patches are directly extracted from the HR images. We use the first-order and second-order derivatives of the image as the features of the LR image patches. By performing the local rank transform on the HR image, the HR image is transformed into two kinds of images: the LRT_δ image and the LRT_{-δ} image. Hence, the local rank patches of the HR training images consist of the LRT_δ image patches and the LRT_{-δ} image patches, which are extracted from the LRT_δ image and the LRT_{-δ} image respectively. The training sets are described in detail as follows. First, Y_l = {Y_l^i}_{i=1}^n denotes the LR patches and X_h = {X_h^i}_{i=1}^n denotes the HR patches, where Y_l^i is the ith M × 1 LR feature patch, X_h^i is the ith N × 1 HR patch and n is the total number of patches. Second, X_δ = {X_δ^i}_{i=1}^n denotes the LRT_δ patches, with X_δ^i the ith N × 1 LRT_δ patch, and X_{-δ} = {X_{-δ}^i}_{i=1}^n denotes the LRT_{-δ} patches, with X_{-δ}^i the ith N × 1 LRT_{-δ} patch. Finally, the training patch database is denoted by P = {X_h^i, Y_l^i, X_δ^i, X_{-δ}^i}_{i=1}^n. According to the definition of the local rank, many LRT_δ patches contain only zero elements; these patches not only cost a lot of training time but also affect the accuracy of the dictionaries during the dictionary learning process. In order to reduce the size of the training set and improve the accuracy of the dictionaries, we classify the training patches into two patterns (P_δ and P_{-δ}) so that the LR image patches can be treated differently to adapt to the properties of different image contents. That is to say, if the patch X_δ^i is not zero, we classify the corresponding patches X_h^i, Y_l^i, X_δ^i and X_{-δ}^i into pattern P_δ.

Fig. 2. Some training images.


Fig. 3. The test images. From left to right and top to bottom: butterfly, boats, flower, leaf, leaves, plants, raccoon, building, starfish, house, lena, peppers, window, parrot, girl, bike.


If the patch X_δ^i is zero, we classify the corresponding patches X_h^i, Y_l^i and X_{-δ}^i into pattern P_{-δ}; the patch X_δ^i is not included in pattern P_{-δ} because it contains only zero elements. As a result, we obtain the two patterns P_δ = {X_h^i, Y_l^i, X_δ^i, X_{-δ}^i}_{i=1}^{n_1} and P_{-δ} = {X_h^i, Y_l^i, X_{-δ}^i}_{i=1}^{n_2}, with n_1 + n_2 = n.


3.2. Multiple dictionaries learning


In the code stage, since a learned dictionary is more effective than a pre-constructed one, we consider how to learn the dictionaries. For the pattern P_δ, we need to learn four dictionaries. Since the patches X_h = {X_h^i}_{i=1}^{n_1}, X_δ = {X_δ^i}_{i=1}^{n_1} and X_{-δ} = {X_{-δ}^i}_{i=1}^{n_1} share the same sparse representation coefficients with the LR patches Y_l = {Y_l^i}_{i=1}^{n_1} during the learning procedure, by using the patches of P_δ selected in Section 3.1, the multiple dictionaries learning model can be written as follows:

$$ \{D_h, D_\delta, D_{-\delta}, D_l\} = \arg\min_{\{D_h, D_\delta, D_{-\delta}, D_l, \alpha\}} \Big\{ \|X_h - D_h\alpha\|_F^2 + \|X_\delta - D_\delta\alpha\|_F^2 + \|X_{-\delta} - D_{-\delta}\alpha\|_F^2 + \|Y_l - D_l\alpha\|_F^2 \Big\} \quad \text{s.t.} \ \|\alpha\|_0 \leq \tau_1, \tag{8} $$

where D_h is the HR dictionary, D_δ is the LRT_δ dictionary, D_{-δ} is the LRT_{-δ} dictionary, D_l is the LR dictionary, α = (α_1, α_2, ..., α_{n_1}) is the sparse representation coefficient matrix and τ_1 is the parameter that controls the sparsity level. In this model, the four terms measure how well the sparse reconstructions approximate the real patches, and the constraint forces the image patches to have sparse representation coefficients over the dictionaries.



Fig. 4. Reconstruction results by different values of δ on “Bike”. (a) δ = 10 (PSNR = 23.35, SSIM = 0.7166). (b) δ = 20 (PSNR = 23.53, SSIM = 0.7297). (c) δ = 30 (PSNR = 23.69, SSIM = 0.7386). (d) δ = 40 (PSNR = 23.80, SSIM = 0.7441). (e) δ = 50 (PSNR = 23.81, SSIM = 0.7448). (f) δ = 60 (PSNR = 23.79, SSIM = 0.7440).



Fig. 5. The experimental results on different neighborhood size. (a) 3 × 3 (PSNR = 22.74, SSIM = 0.6770). (b) 5 × 5 (PSNR = 23.73, SSIM = 0.7414). (c) 7 × 7 (PSNR = 23.81, SSIM = 0.7448). (d) 9 × 9 (PSNR = 23.68, SSIM = 0.7341). (e) Original image.



Fig. 6. Comparison of super resolution results on “Window” image by different choices of γ. (a) LR image. (b) SCSR [31] (PSNR = 28.08, SSIM = 0.8000). (c) NOLRT_SR (PSNR = 28.10, SSIM = 0.8003). (d) PLRT_SR (PSNR = 28.19, SSIM = 0.8065). (e) NLRT_SR (PSNR = 28.17, SSIM = 0.8037). (f) NPLRT_SR (PSNR = 28.27, SSIM = 0.8095). (g) Original image.


According to the joint dictionary learning model [31], the dictionary learning model (8) can be reformulated as:

$$ \{D_h, D_\delta, D_{-\delta}, D_l\} = \arg\min_{\{D_h, D_\delta, D_{-\delta}, D_l\}} \big\{ \|X_P - D_P\alpha\|_2^2 + \lambda_1\|\alpha\|_1 \big\}, \tag{9} $$

where λ_1 balances the sparsity, $X_P = [\tfrac{1}{\sqrt{N_h}}X_h, \tfrac{1}{\sqrt{N_h}}X_\delta, \tfrac{1}{\sqrt{N_h}}X_{-\delta}, \tfrac{1}{\sqrt{N_l}}Y_l]^T$, N_h and N_l are the dimensions of the HR image patch and the LR image patch in vector form respectively, and $D_P = [\tfrac{1}{\sqrt{N_h}}D_h, \tfrac{1}{\sqrt{N_h}}D_\delta, \tfrac{1}{\sqrt{N_h}}D_{-\delta}, \tfrac{1}{\sqrt{N_l}}D_l]^T$.
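A compact way to realize this joint learning in practice is to stack the normalized HR, LRT_δ, LRT_{-δ} and LR feature patches and feed them to an ℓ1 dictionary learning solver. The sketch below uses scikit-learn's MiniBatchDictionaryLearning as one such solver; this choice, the helper name and the 512-atom default are our own assumptions, since the paper does not prescribe a specific optimizer.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

def learn_joint_dictionaries(Xh, Xd, Xnd, Yl, n_atoms=512, lam=0.1):
    """Joint dictionary learning in the spirit of Eq. (9).

    Xh, Xd, Xnd, Yl: (n_patches, N_h or N_l) arrays of HR, LRT_delta,
    LRT_-delta and LR feature patches that share one sparse code."""
    nh, nl = Xh.shape[1], Yl.shape[1]
    # Stack the per-patch feature blocks, normalised by patch dimension,
    # as in the X_P construction of Eq. (9).
    Xp = np.hstack([Xh / np.sqrt(nh), Xd / np.sqrt(nh),
                    Xnd / np.sqrt(nh), Yl / np.sqrt(nl)])
    learner = MiniBatchDictionaryLearning(n_components=n_atoms, alpha=lam,
                                          random_state=0)
    Dp = learner.fit(Xp).components_             # (n_atoms, 3*nh + nl)
    # Split the coupled dictionary back into its four blocks and undo scaling.
    Dh  = Dp[:, :nh] * np.sqrt(nh)
    Dd  = Dp[:, nh:2 * nh] * np.sqrt(nh)
    Dnd = Dp[:, 2 * nh:3 * nh] * np.sqrt(nh)
    Dl  = Dp[:, 3 * nh:] * np.sqrt(nl)
    return Dh.T, Dd.T, Dnd.T, Dl.T               # columns are atoms
```

Any other sparse dictionary learning routine (e.g. K-SVD) could be dropped in; the essential point is that the four dictionaries are learned from a single shared code.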

On the other hand, since the patches X_h = {X_h^i}_{i=1}^{n_2} and X_{-δ} = {X_{-δ}^i}_{i=1}^{n_2} share the same sparse representation coefficients with the LR patches Y_l = {Y_l^i}_{i=1}^{n_2} during the learning procedure, we can obtain the following multiple dictionaries learning model for the pattern P_{-δ}:

$$ \{D_h, D_{-\delta}, D_l\} = \arg\min_{\{D_h, D_{-\delta}, D_l, \alpha\}} \Big\{ \|X_h - D_h\alpha\|_F^2 + \|X_{-\delta} - D_{-\delta}\alpha\|_F^2 + \|Y_l - D_l\alpha\|_F^2 \Big\} \quad \text{s.t.} \ \|\alpha\|_0 \leq \tau_2, \tag{10} $$

where D_h is the HR dictionary, D_{-δ} is the LRT_{-δ} dictionary, D_l is the LR dictionary, α = (α_1, α_2, ..., α_{n_2}) is the sparse representation coefficient matrix and τ_2 is the parameter that controls the sparsity level.



Fig. 7. Comparison of super resolution results on “Leaves” image by different methods. (a) LR image. (b) BI (PSNR = 23.31, SSIM = 0.7998). (c) NE [6] (PSNR = 24.07, SSIM = 0.8385). (d) SCSR [31] (PSNR = 25.64, SSIM = 0.8818). (e) Zeyde’s [32] (PSNR = 25.55, SSIM = 0.8757). (f) NLM_SKR [33] (PSNR = 26.27, SSIM = 0.8989). (g) NARM [8] (PSNR = 24.11, SSIM = 0.8331). (h) The proposed (PSNR = 26.44, SSIM = 0.9018). (i) Original image.


This dictionary learning model can also be reformulated as:

$$ \{D_h, D_{-\delta}, D_l\} = \arg\min_{\{D_h, D_{-\delta}, D_l\}} \big\{ \|X_N - D_N\alpha\|_2^2 + \lambda_2\|\alpha\|_1 \big\}, \tag{11} $$

where $X_N = [\tfrac{1}{\sqrt{N_h}}X_h, \tfrac{1}{\sqrt{N_h}}X_{-\delta}, \tfrac{1}{\sqrt{N_l}}Y_l]^T$, $D_N = [\tfrac{1}{\sqrt{N_h}}D_h, \tfrac{1}{\sqrt{N_h}}D_{-\delta}, \tfrac{1}{\sqrt{N_l}}D_l]^T$ and λ_2 balances the sparsity.


3.3. Local rank constraint information for HR image reconstruction


For every input LR image patch, the key issue is to identify the pattern that represents the properties of the LR image contents. If the test LR image patch Y_l^i belongs to the pattern P_δ, we obtain the sparse representation coefficient of the LR image patch by formula (12):

$$ \alpha_i^{*} = \arg\min_{\alpha_i} \|\alpha_i\|_0 \quad \text{s.t.} \quad \|Y_l^i - D_l\alpha_i\|_F^2 \leq \varepsilon_1, \tag{12} $$

where D_l is the LR dictionary of pattern P_δ, α_i^* is the sparse representation coefficient of Y_l^i and ε_1 is the admissible error. Once the sparse representation coefficient α_i^* has been determined by (12), the target HR image patch can be reconstructed by formula (13):

$$ \tilde{X}_h^i \approx D_h\alpha_i^{*}, \tag{13} $$

where D_h is the HR dictionary of pattern P_δ and X̃_h^i is the initial target HR image patch.



Fig. 8. Comparison of super resolution results on “Boats” image (with σ = 4) by different methods. (a) LR image. (b) BI (PSNR = 27.73, SSIM = 0.7746). (c) NE [6] (PSNR = 27.66, SSIM = 0.7654). (d) SCSR [31] (PSNR = 28.74, SSIM = 0.7889). (e) Zeyde’s [32] (PSNR = 28.87, SSIM = 0.7994). (f) NLM_SKR [33] (PSNR = 28.81, SSIM = 0.7966). (g) The proposed (PSNR = 29.10, SSIM = 0.8012). (h) Original image.


However, this simple linear combination fails to consider the edge constraint information of the image, and hence the reconstructed HR image has unwanted edge artifacts and unsharp edges. According to the learned dictionaries D_δ and D_{-δ}, and the fact that the local rank patch and the LR patch share the same sparse representation coefficient, the local ranks LRT_δ and LRT_{-δ} of the HR image patch can be reconstructed as R_δ = D_δ α_i^* and R_{-δ} = D_{-δ} α_i^*, where D_δ and D_{-δ} are the LRT_δ and LRT_{-δ} dictionaries of pattern P_δ respectively, and R_δ and R_{-δ} are the reconstructed local ranks of the HR image patch. In order to restrict the image edges by the local rank constraint, we constrain the local rank of the HR image patch to be as close as possible to the reconstructed local ranks R_δ and R_{-δ}. Thus we obtain the following energy minimization model to restrict the image edges:





$$ \min_{\tilde{X}_h^i} \Big\{ E_2\big(LRT_\delta(\tilde{X}_h^i) - R_\delta\big) + E_3\big(LRT_{-\delta}(\tilde{X}_h^i) - R_{-\delta}\big) \Big\}. \tag{14} $$


In order to restrict the edges during the linear combination stage, we combine the linear combination formula (13) and the edge constraint (14) to reconstruct the initial HR image patch. Thus the initial HR image patch can be reconstructed by:

$$ \min_{\tilde{X}_h^i} \Big\{ \|\tilde{X}_h^i - D_h\alpha_i^{*}\|_2^2 + \gamma_2\|LRT_\delta(\tilde{X}_h^i) - R_\delta\|_2^2 + \gamma_3\|LRT_{-\delta}(\tilde{X}_h^i) - R_{-\delta}\|_2^2 \Big\}, \tag{15} $$

where γ_2 and γ_3 control the LRT_δ and LRT_{-δ} terms respectively. For an LR test image patch Y_l^i that belongs to the pattern P_{-δ}, we use the LR dictionary D_l of pattern P_{-δ} and formula (12) to obtain the sparse representation coefficient α_i^*. It is then reasonable to constrain the LRT_{-δ} of the HR image patch to be as close as possible to



Fig. 9. Comparison of super resolution results on “Boats” image (with σ = 6) by different methods. (a) LR image. (b) BI (PSNR = 27.56, SSIM = 0.7342). (c) NE [6] (PSNR = 27.38, SSIM = 0.7111). (d) SCSR [31] (PSNR = 28.19, SSIM = 0.7301). (e) Zeyde’s [32] (PSNR = 28.44, SSIM = 0.7482). (f) NLM_SKR [33] (PSNR = 28.33, SSIM = 0.7398). (g) The proposed (PSNR = 28.58, SSIM = 0.7442). (h) Original image.


the reconstructed local rank. The energy minimization model can be written as:







$$ \min_{\tilde{X}_h^i} \Big\{ E_1\big(LRT_{-\delta}(\tilde{X}_h^i) - R_{-\delta}\big) \Big\}, \tag{16} $$


where R−δ = D−δ αi∗ , D−δ is the LRT−δ dictionary of pattern P−δ , R−δ is the reconstructed local rank of the HR image patch. Similar to formula (15), we can get formula (17) for reconstructing the initial HR image patch:

$$ \min_{\tilde{X}_h^i} \Big\{ \|\tilde{X}_h^i - D_h\alpha_i^{*}\|_2^2 + \gamma_1\|LRT_{-\delta}(\tilde{X}_h^i) - R_{-\delta}\|_2^2 \Big\}, \tag{17} $$

where D_h is the HR dictionary of pattern P_{-δ} and γ_1 controls the LRT_{-δ} term. The constraint terms of the reconstruction models (15) and (17) differ because we first select the suitable pattern for every input LR image patch: if the patch belongs to the pattern P_δ, both the LRT_δ and the LRT_{-δ} edge constraint information are reconstructed and used to restrict the edges simultaneously, while only the LRT_{-δ} edge constraint is reconstructed if the patch belongs to the pattern P_{-δ}. By fusing all the initial HR image patches reconstructed by (15) and (17), the estimated initial HR image X_0 is obtained.
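Because LRT_δ is a counting operator, models (15) and (17) are not smooth in X̃_h^i, and the paper does not spell out its solver. The sketch below therefore only shows how the local-rank-constrained energy of model (15) can be evaluated for a candidate patch (for instance inside a derivative-free or grid search); the helper names, the 7×7 window and the reflective border are our own assumptions.

```python
import numpy as np

def lrt_patch(patch2d, delta, window=7):
    """delta-rank of Eq. (6) applied to a single 2-D patch."""
    half, (h, w) = window // 2, patch2d.shape
    padded = np.pad(patch2d, half, mode="reflect")
    out = np.full((h, w), float(window * window))
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            shifted = padded[half + dy: half + dy + h, half + dx: half + dx + w]
            out -= (patch2d - shifted > delta)
    return out

def energy_model_15(x_cand, Dh_alpha, R_pos, R_neg, delta=50,
                    gamma2=0.02, gamma3=0.08):
    """Objective of model (15) for a candidate HR patch x_cand (2-D array).

    Dh_alpha: the linear-combination estimate D_h alpha_i* reshaped to 2-D.
    R_pos, R_neg: reconstructed local ranks R_delta = D_delta alpha_i* and
    R_-delta = D_-delta alpha_i*, reshaped to the same patch grid."""
    data = np.sum((x_cand - Dh_alpha) ** 2)
    e_pos = np.sum((lrt_patch(x_cand, +delta) - R_pos) ** 2)
    e_neg = np.sum((lrt_patch(x_cand, -delta) - R_neg) ** 2)
    return data + gamma2 * e_pos + gamma3 * e_neg
```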



Fig. 10. Comparison of super resolution results on “Boats” image (with σ = 8) by different methods. (a) LR image. (b) BI (PSNR = 27.34, SSIM = 0.6885). (c) NE [6] (PSNR = 27.00, SSIM = 0.6653). (d) SCSR [31] (PSNR = 27.54, SSIM = 0.6688). (e) Zeyde’s [32] (PSNR = 27.91, SSIM = 0.6926). (f) NLM_SKR [33] (PSNR = 27.74, SSIM = 0.6816). (g) The proposed (PSNR = 28.15, SSIM = 0.7039). (h) Original image.


3.4. Nonlocal and global optimization


In order to make the HR image X0 satisfy the image observation model, papers [31] and [19] applied the global reconstruction constraint on the estimated HR image X0 . This process can be represented by the following global constraint model:

$$ X^{*} = \arg\min_{X} \big\{ \|SHX - Y\|_2^2 + \mu\|X - X_0\|_2^2 \big\}. \tag{18} $$

However, this model fails to reconstruct clear edges because it ignores the self-similarity of the patches within the same scale and across different scales. In order to reconstruct the HR image with clear edges, we introduce the non-local self-similarity constraint information [25] and [33] into patch aggregation to better reconstruct the HR image. The non-local self-similarity constraint information assumes that small patches in the image tend to redundantly repeat themselves many times within the same scale and across different scales. This constraint is given by:

$$ N(X^i) = \frac{1}{Z(i)}\sum_{j} w_{ij}\, N(X^j), \tag{19} $$

where N(X^i) is the image patch at the ith position of the image, N(X^j) are the similar image patches in the search window, w_{ij} is the similarity weight and Z(i) is the normalization factor. In this formula, w_{ij} and Z(i) are calculated respectively by

$$ w_{ij} = \frac{1}{Z(i)}\exp\left(-\frac{\|N(X^i) - N(X^j)\|_2^2}{h}\right) \tag{20} $$

and

$$ Z(i) = \sum_{j}\exp\left(-\frac{\|N(X^i) - N(X^j)\|_2^2}{h}\right), \tag{21} $$

where h is the attenuation parameter. This original non-local self-similarity constraint obtains the weight by simply measuring patch similarities with the Euclidean norm. However, this measurement is less sensitive to patches with fine details and edges [36], so the weights show no obvious difference between different image contents. In order to overcome this drawback, we propose a new weight for the non-local self-similarity constraint, which depends not only on the image patches but also on the local rank of the image patches. The proposed weight is written as

$$ w_{ij} = \frac{1}{Z(i)}\exp\left(-\frac{\|N(X^i) - N(X^j)\|_2^2}{h_1}\right)\cdot \exp\left(-\frac{\|LRT_\delta(X^i) - LRT_\delta(X^j)\|_2^2}{h_2}\right), \tag{22} $$

where

$$ Z(i) = \sum_{j}\exp\left(-\frac{\|N(X^i) - N(X^j)\|_2^2}{h_1}\right)\cdot \exp\left(-\frac{\|LRT_\delta(X^i) - LRT_\delta(X^j)\|_2^2}{h_2}\right). \tag{23} $$

Finally, by introducing the improved non-local self-similarity constraint into model (18), we obtain the following nonlocal and global optimization model:

$$ X^{*} = \arg\min_{X} \big\{ \|SHX - Y\|_2^2 + \mu\|X - X_0\|_2^2 + \beta\|X - WX\|_2^2 \big\}, \tag{24} $$


where W is the matrix of all the weights, and μ and β are the regularization parameters. The entire SISR process is summarized in Algorithm 1.

Algorithm 1. SISR reconstruction algorithm for an LR test image.
Input: patterns P_δ and P_{-δ}, the test LR image Y, the dictionaries of P_δ, the dictionaries of P_{-δ}, magnification factor s = 3, γ_1 = γ_2 = 0.02, γ_3 = 0.08, μ = 0.08, β = 0.05, h_1 = h_2 = 100.
1. For each patch Y_l^i in the LR image Y, perform the following steps:
   1.1. Adaptively select the corresponding pattern P_δ or P_{-δ}, and choose the corresponding dictionaries;
   1.2. Compute the optimal solution α_i^* via formula (12);
   1.3. Generate the initial HR image patch by model (15) or (17).
2. Obtain the initial HR image X_0 by fusing all the initial HR patches.
3. Estimate the final HR image X by model (24).
Output: the final HR image X.
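As an illustration of the weighting used in step 3, the sketch below computes the local-rank-aware nonlocal weights of Eqs. (22) and (23) for one reference patch against a set of candidate patches. The array layout and the assumption that the LRT_δ patches have been precomputed are our own choices, not details fixed by the paper.

```python
import numpy as np

def local_rank_weights(ref_patch, ref_lrt, cand_patches, cand_lrts,
                       h1=100.0, h2=100.0):
    """Weights w_ij of Eq. (22) for one reference patch.

    ref_patch, ref_lrt: flattened intensity patch N(X^i) and its LRT_delta patch.
    cand_patches, cand_lrts: (n_candidates, patch_len) arrays holding the
    candidate patches N(X^j) in the search window and their LRT_delta patches."""
    d_int = np.sum((cand_patches - ref_patch) ** 2, axis=1)  # ||N(X^i)-N(X^j)||^2
    d_lrt = np.sum((cand_lrts - ref_lrt) ** 2, axis=1)       # ||LRT(X^i)-LRT(X^j)||^2
    raw = np.exp(-d_int / h1) * np.exp(-d_lrt / h2)          # Eq. (22) numerator
    z = raw.sum()                                            # Eq. (23)
    return raw / z if z > 0 else raw

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = local_rank_weights(rng.random(49), rng.random(49),
                           rng.random((20, 49)), rng.random((20, 49)))
    print(round(float(w.sum()), 6))  # ~1.0 after normalisation
```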


4. Experimental results and analysis


4.1. Experimental settings


All experiments are conducted in the Matlab 2010a environment on a desktop PC with a 2.80 GHz dual-core CPU and 2.9 GB memory. The training images are selected from the software package of [31]. We choose 70 HR images to form the training set; the image database contains different types of content, such as flowers, architecture, animals, cars and plants. Some training images are shown in Fig. 2. The number of training patches used for learning the dictionaries is 100,000, and the dictionary size is always set to 512 as a compromise between computation and image quality. In order to get a better result and to balance LRT_δ and LRT_{-δ}, we choose the patch size as 7×7 with an overlap of six pixels between adjacent patches. In order to assess the quality of the SISR reconstruction objectively, we use two indices, the peak signal-to-noise ratio (PSNR) and the structural similarity (SSIM). The test images are shown in Fig. 3, including butterfly, boats, flower, leaf, leaves, plants, raccoon, building, starfish, house, lena, peppers, window, parrot, girl and bike. All the test images are first blurred by a 5×5 Gaussian filter with standard deviation 1 and



Table 1. PSNR and SSIM results of reconstructed images for different values of δ (each entry is PSNR / SSIM).

δ      | Boats          | Building       | Bike           | Leaf           | Window         | Parrot         | Avg.
δ = 10 | 28.37 / 0.8279 | 23.99 / 0.7097 | 23.35 / 0.7166 | 37.59 / 0.8888 | 27.80 / 0.7897 | 28.08 / 0.8639 | 28.19 / 0.7994
δ = 20 | 28.75 / 0.8374 | 24.08 / 0.7194 | 23.53 / 0.7297 | 37.76 / 0.8910 | 28.01 / 0.7993 | 28.38 / 0.8698 | 28.41 / 0.8077
δ = 30 | 28.85 / 0.8404 | 24.24 / 0.7273 | 23.69 / 0.7386 | 37.82 / 0.8910 | 28.20 / 0.8055 | 28.54 / 0.8723 | 28.56 / 0.8125
δ = 40 | 28.90 / 0.8402 | 24.22 / 0.7291 | 23.80 / 0.7441 | 37.82 / 0.8903 | 28.27 / 0.8086 | 28.58 / 0.8725 | 28.59 / 0.8141
δ = 50 | 28.90 / 0.8419 | 24.30 / 0.7315 | 23.81 / 0.7448 | 37.93 / 0.8919 | 28.28 / 0.8095 | 28.62 / 0.8728 | 28.64 / 0.8154
δ = 60 | 28.83 / 0.8400 | 24.16 / 0.7284 | 23.79 / 0.7440 | 37.88 / 0.8910 | 28.27 / 0.8086 | 28.58 / 0.8724 | 28.58 / 0.8141

Table 2. PSNR and SSIM results of reconstructed images for different neighborhood sizes (each entry is PSNR / SSIM).

Image    | 3×3            | 5×5            | 7×7            | 9×9
Boats    | 27.59 / 0.6570 | 28.94 / 0.8429 | 28.90 / 0.8419 | 28.88 / 0.8416
Building | 23.52 / 0.6712 | 24.25 / 0.7289 | 24.30 / 0.7315 | 24.21 / 0.7222
Bike     | 22.74 / 0.6770 | 23.73 / 0.7414 | 23.81 / 0.7448 | 23.68 / 0.7341
Leaf     | 36.77 / 0.8677 | 37.90 / 0.8918 | 37.93 / 0.8919 | 37.91 / 0.8919
Window   | 27.21 / 0.7581 | 28.17 / 0.8051 | 28.28 / 0.8095 | 28.19 / 0.8040
Parrot   | 27.35 / 0.8332 | 28.51 / 0.8722 | 28.62 / 0.8728 | 28.52 / 0.8706
Avg.     | 27.53 / 0.7440 | 28.58 / 0.8137 | 28.64 / 0.8154 | 28.56 / 0.8107


then downsampled by a decimation factor of 3 to produce the corresponding LR images. In addition, additive Gaussian noise is further added to the LR images for the noisy experiments. From Section 4.2 to Section 4.5, all test methods are applied on the luminance channel of each image; in Section 4.6, the methods are applied on all image channels to test the effectiveness of the proposed method.
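To reproduce this degradation pipeline and the PSNR measurement, a minimal sketch is given below. The Gaussian kernel construction, the decimation by simple sub-sampling and the helper names are our own assumptions about details the text leaves open.

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(size=5, sigma=1.0):
    """size x size Gaussian kernel with the given standard deviation."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return k / k.sum()

def degrade(hr, factor=3, noise_sigma=0.0, seed=0):
    """Blur with a 5x5 Gaussian (std 1), decimate by `factor`, optionally add noise."""
    blurred = convolve2d(hr, gaussian_kernel(5, 1.0), mode="same", boundary="symm")
    lr = blurred[::factor, ::factor]
    if noise_sigma > 0:
        lr = lr + np.random.default_rng(seed).normal(0, noise_sigma, lr.shape)
    return lr

def psnr(ref, est, peak=255.0):
    """Peak signal-to-noise ratio between a reference and an estimate."""
    mse = np.mean((ref.astype(np.float64) - est.astype(np.float64)) ** 2)
    return 10.0 * np.log10(peak**2 / mse)
```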


4.2. Effects of threshold of δ and neighborhood size


The first key point is the threshold δ. According to Section 2.2, an appropriate threshold δ not only yields accurate edge information but also allows a high quality HR image to be reconstructed. To evaluate the effect of δ, test experiments are conducted on six LR images with different values of δ. The parameters γ_1 and γ_2 are experimentally set to 0.02 and γ_3 is set to 0.08. In order to evaluate the results in terms of visual quality and PSNR and SSIM values, we only use the reconstruction models (15) and (17) to obtain the HR images. The results are shown in Table 1 and Fig. 4. From Table 1, we can see that the PSNR tends to increase as δ varies from 10 to 50 and reaches its highest value at δ = 50; when δ exceeds 50, the PSNR begins to decrease. The same tendency can be seen in SSIM. In Fig. 4, we can see that the edges of the reconstructed images become sharper as δ grows, while for δ equal to 40, 50 and 60 there are no obvious changes in the edges. Therefore, considering both the objective and the subjective results, we adopt δ = 50. In order to test the effect of the neighborhood size of the local rank, we conduct experiments with different neighborhood sizes, namely 3×3, 5×5, 7×7 and 9×9. The threshold δ is set to 50, we again only use the reconstruction models (15) and (17), and the parameters γ_1 and γ_2 are set to 0.02 with γ_3 set to 0.08. The experimental results are shown in Table 2 and Fig. 5. When the neighborhood size is 7×7, our method obtains the highest PSNR and SSIM values


Table 3. PSNR and SSIM results of reconstructed images for different selections of γ (each entry is PSNR / SSIM).

Image    | SCSR           | NOLRT_SR       | PLRT_SR        | NLRT_SR        | NPLRT_SR
Boats    | 28.66 / 0.8351 | 28.69 / 0.8357 | 28.85 / 0.8406 | 28.81 / 0.8395 | 28.90 / 0.8419
Building | 24.09 / 0.7217 | 24.10 / 0.7226 | 24.24 / 0.7287 | 24.22 / 0.7230 | 24.30 / 0.7315
Bike     | 23.68 / 0.7366 | 23.67 / 0.7360 | 23.75 / 0.7419 | 23.71 / 0.7378 | 23.81 / 0.7448
Leaf     | 37.72 / 0.8902 | 37.73 / 0.8901 | 37.93 / 0.8919 | 37.91 / 0.8919 | 37.93 / 0.8920
Window   | 28.08 / 0.8000 | 28.10 / 0.8003 | 28.19 / 0.8065 | 28.17 / 0.8037 | 28.27 / 0.8095
Parrot   | 28.48 / 0.8733 | 28.49 / 0.8730 | 28.55 / 0.8722 | 28.46 / 0.8712 | 28.62 / 0.8728
Avg.     | 28.45 / 0.8095 | 28.46 / 0.8096 | 28.85 / 0.8406 | 28.81 / 0.8395 | 28.90 / 0.8419


in Table 2, together with the visually best image with sharp edges. As a result, the neighborhood size of the local rank is chosen as 7×7 in the following experiments.


4.3. Effectiveness of proposed local rank edge constraint information



In order to evaluate whether the proposed LRT_δ and LRT_{-δ} edge constraint information can efficiently remove edge artifacts and sharpen edges compared with the traditional sparse representation based SISR (SCSR) method [31], we first design test experiments with different choices of the parameters γ_1, γ_2 and γ_3. According to Section 3, the reconstruction models (15) and (17) reduce to the classical reconstruction models when γ_1, γ_2 and γ_3 are zero (named NOLRT_SR). We can study the effectiveness of the introduced LRT_δ edge constraint information by setting γ_1 and γ_3 to zero with γ_2 = 0.02 (named PLRT_SR). By setting γ_2 to zero with γ_1 = 0.02 and γ_3 = 0.08 (named NLRT_SR), we can study the effectiveness of the introduced LRT_{-δ} edge constraint information. Furthermore, by setting γ_1 = 0.02, γ_2 = 0.02 and γ_3 = 0.08, the effectiveness of both kinds of information can be studied (named NPLRT_SR). Table 3 reports the PSNR and SSIM results and Fig. 6 presents the visual results. From Table 3, we can see that the results of the proposed NPLRT_SR not only outperform PLRT_SR and NLRT_SR but also outperform SCSR. Moreover, even without the local rank edge constraint, the reconstruction method NOLRT_SR still gives better results than SCSR; the main reason is that each LR image patch is reconstructed with a different dictionary. In order to exhibit more details, we show a dominant region of each super-resolved image in Fig. 6. We can see that the NPLRT_SR method provides fewer edge artifacts and sharper edges. This means that although NOLRT_SR, PLRT_SR and NLRT_SR can reconstruct good HR images, they are inferior to NPLRT_SR, while NOLRT_SR, PLRT_SR, NLRT_SR and NPLRT_SR all obtain finer results than SCSR. The experimental results therefore verify that the proposed local rank edge constraint based SISR method can effectively improve the quality of the reconstructed HR image while removing edge artifacts and sharpening edges.


4.4. Single image super resolution results on noiseless images


We first qualitatively and quantitatively compare our method with several well-known SISR methods on noiseless images: bicubic interpolation (BI), neighbor embedding (NE) [6], SCSR [31], Zeyde's method [32], the nonlocal autoregressive modelling based method (NARM) [8] and the non-local means and steering kernel regression based method (NLM_SKR) [33]. In our method, the parameters h_1 and h_2 of the nonlocal regularization term are both set to 100, the search window is 21×21, the magnification factor is s = 3 and γ_1 = γ_2 = 0.02, γ_3 = 0.08. The PSNR and SSIM results are listed in Table 4. From Table 4, we can observe that the BI method and the NARM method always give the lowest performance and are worse than the NE based method. Compared with the NE method, both SCSR and Zeyde's method achieve better results, mainly because they learn more image information for reconstructing the HR image. In Table 4, the NLM_SKR method obtains better results than the other compared methods (i.e. BI, NE, SCSR, Zeyde's and NARM), while our proposed method performs best among all the above methods and greatly improves the PSNR and SSIM values. These results demonstrate that, by using the proposed local rank constraint, our method can reconstruct HR images with good objective quality. To demonstrate the visual quality of the different methods, the reconstructed images and enlarged local regions are presented in Fig. 7. We can see that the BI method misses some important details and blurs the edges. Although the NE method


231 232 233 234 235 236 237 238 239 240 241 242 243 244 245

249 250 251 252 253 254 255 256 257 258 259 260 261

Please cite this article as: W. Gong et al., Combining sparse representation and local rank constraint for single image super resolution, Information Sciences (2015), http://dx.doi.org/10.1016/j.ins.2015.07.004

ARTICLE IN PRESS

JID: INS

[m3Gsc;July 13, 2015;17:22]

W. Gong et al. / Information Sciences xxx (2015) xxx–xxx

15

Table 4 PSNR and SSIM results of reconstructed images by different methods. Images

Measures

Methods Bicubic

NE

SCSR

Zeyde’s

NARM

NLM_SKR

Proposed

Boats

PSNR SSIM

27.87 0.8142

27.89 0.8124

29.25 0.8542

29.24 0.8530

27.53 0.7824

29.26 0.8575

29.45 0.8606

Starfish

PSNR SSIM

26.53 0.7729

26.81 0.7823

27.82 0.8237

27.74 0.8182

26.04 0.7311

28.49 0.8399

28.49 0.8402

House

PSNR SSIM

24.21 0.7531

24.25 0.7575

24.85 0.7881

24.76 0.7849

24.36 0.7508

24.78 0.7881

25.14 0.7950

Lena

PSNR SSIM

29.48 0.7857

30.09 0.7937

30.88 0.8201

30.76 0.8171

29.38 0.7615

31.31 0.8299

31.36 0.8306

Window

PSNR SSIM

27.15 0.7644

27.51 0.7776

28.48 0.8174

28.35 0.8120

26.69 0.7303

28.73 0.8255

28.78 0.8279

Parrot

PSNR SSIM

27.45 0.8528

27.91 0.8596

29.11 0.8826

28.99 0.8790

27.57 0.8337

29.63 0.8867

29.53 0.8868

Butterfly

PSNR SSIM

23.85 0.7985

24.96 0.8415

26.31 0.8748

26.06 0.8651

24.79 0.8379

27.04 0.8950

27.14 0.8934

Flower

PSNR SSIM

34.21 0.9037

34.42 0.8987

36.24 0.9181

36.35 0.9188

32.41 0.8609

36.48 0.9215

36.64 0.9221

Plants

PSNR SSIM

30.75 0.8463

31.28 0.8343

32.58 0.8547

32.32 0.8536

30.36 0.8039

32.81 0.8587

32.99 0.8609

Leaf

PSNR SSIM

36.88 0.8842

37.07 0.8793

38.04 0.8940

38.18 0.8942

35.31 0.8494

38.25 0.8970

38.40 0.8981

Girl

PSNR SSIM

33.01 0.7606

33.17 0.7593

33.90 0.7836

33.89 0.7814

30.48 0.5955

33.88 0.7846

34.03 0.7855

Bike

PSNR SSIM

22.63 0.6746

23.16 0.7082

23.92 0.7561

23.78 0.7461

22.74 0.6564

24.06 0.7672

24.28 0.7721

Raccoon

PSNR SSIM

28.37 0.7001

28.42 0.7033

28.91 0.7402

28.92 0.7351

27.59 0.6251

29.02 0.7388

29.16 0.7418

Leaves

PSNR SSIM

23.31 0.7998

24.07 0.8385

25.64 0.8818

25.55 0.8757

24.11 0.8331

26.27 0.8989

26.44 0.9018

Building

PSNR SSIM

23.53 0.6842

23.72 0.7053

24.36 0.7445

24.33 0.7370

23.66 0.6722

24.79 0.7587

24.82 0.7611

Peppers

PSNR SSIM

29.43 0.8482

30.44 0.8423

31.91 0.8616

31.75 0.8667

29.60 0.8254

31.78 0.8670

32.23 0.8691

Avg.

PSNR SSIM

28.04 0.7902

28.45 0.7996

29.51 0.8310

29.44 0.8274

27.66 0.7594

29.79 0.8385

29.93 0.8404

270

can produce some high-frequency details, it also produces ringing effects along the edge regions. Moreover, the reconstructed image appears not to be natural. When the input image is not or less relevant to the training database, the watercolor-like results are produced and some regions of the image are out of shape. The method of SCSR recovers more details, but it produces some jaggy and ringing artifacts along with edges or details in the reconstructed image. Although Zeyde’s method is capable of both suppressing jagged artifacts and sharping the edges, it generates obvious ringing effects and leads to smooth result. NARM method reconstructs the smooth image with unsharp edges. As shown in Fig. 7, we can see that although the proposed method and the NLM_SKR method cannot only reconstruct fine details but also preserve correct edges, the proposed method can reconstruct sharper edges than NLM_SKR method. So the experimental results demonstrate the effectiveness of the proposed method.

271

4.5. Single image super resolution results on noisy images

262 263 264 265 266 267 268 269

272 273 274 275 276 277 278

In order to test the robustness of the proposed method, we add some Gaussian noise with zero mean and three variance,σ = 4, 6, 8, on randomly selected five test images. The parameters are γ1 = γ2 = 0.02,γ3 = 0.08 and h1 = h2 = 100. We compare our method with BI, NE, SCSR, Zeyde’s and NLM_SKR. The objective results are listed in Table 5, and the reconstructed HR images are shown in Figs. 8–10. From Table 5, we can see that the PSNR and SSIM values of all the methods tend to decrease when the noisy variance becomes larger. However, compared with other method, our method still achieves the best results by fixing the noisy level. From Figs. 8– 10, we also can see that our method is able to reconstruct the correct HR images with sharp edges and less edge artifacts. These results demonstrate the robustness of the proposed method against noise. Please cite this article as: W. Gong et al., Combining sparse representation and local rank constraint for single image super resolution, Information Sciences (2015), http://dx.doi.org/10.1016/j.ins.2015.07.004

ARTICLE IN PRESS

JID: INS 16

[m3Gsc;July 13, 2015;17:22]

W. Gong et al. / Information Sciences xxx (2015) xxx–xxx Table 5 PSNR and SSIM results of reconstructed images by different methods contaminated on different noise levels. Noise variance

σ =4

σ =6

σ =8

Methods

Measures

Images Boats

Butterfly

Lena

Leaf

Window

Avg.

Bicubic

PSNR SSIM

27.73 0.7746

23.79 0.7715

29.27 0.7455

35.84 0.8229

27.03 0.7341

28.73 0.7697

NE

PSNR SSIM

27.66 0.7654

24.87 0.8095

29.73 0.7462

35.55 0.8073

27.31 0.7416

29.02 0.7740

SCSR

PSNR SSIM

28.74 0.7889

26.03 0.8287

30.12 0.7513

35.18 0.7926

28.04 0.7661

29.62 0.7855

Zeyde’s

PSNR SSIM

28.87 0.7994

28.54 0.8144

30.30 0.7642

35.79 0.8114

28.03 0.7710

30.31 0.7921

NLM_SKR

PSNR SSIM

28.81 0.7966

26.86 0.8477

30.68 0.7677

35.67 0.8075

28.35 0.7769

30.07 0.7992

Proposed

PSNR SSIM

29.10 0.8012

28.63 0.8495

30.80 0.7777

35.84 0.8141

28.45 0.7784

30.56 0.8042

Bicubic

PSNR SSIM

27.56 0.7342

23.72 0.7440

29.02 0.7042

34.81 0.7620

26.88 0.7026

28.40 0.7294

NE

PSNR SSIM

27.38 0.7111

24.81 0.7736

29.18 0.6915

33.79 0.7229

26.92 0.6936

28.42 0.7185

SCSR

PSNR SSIM

28.19 0.7301

25.70 0.7866

29.35 0.6900

33.16 0.7030

27.55 0.7190

28.79 0.7257

Zeyde’s

PSNR SSIM

28.44 0.7482

28.14 0.7569

29.71 0.7115

33.97 0.7338

27.67 0.7306

29.59 0.7362

NLM_SKR

PSNR SSIM

28.33 0.7398

26.46 0.8027

29.91 0.7077

33.72 0.7257

27.85 0.7292

29.25 0.7410

Proposed

PSNR SSIM

28.58 0.7442

28.29 0.8106

30.11 0.7163

33.99 0.7301

27.99 0.7323

29.79 0.7467

Bicubic

PSNR SSIM

27.34 0.6885

23.62 0.7129

28.69 0.6571

33.69 0.6948

26.68 0.6667

28.00 0.6840

NE

PSNR SSIM

27.00 0.6653

24.49 0.7381

28.73 0.6423

32.75 0.6584

26.68 0.6605

27.93 0.6729

SCSR

PSNR SSIM

27.54 0.6688

25.29 0.7426

28.47 0.6261

31.35 0.6149

26.96 0.6695

27.92 0.6644

Zeyde’s

PSNR SSIM

27.91 0.6926

27.64 0.6959

29.00 0.6547

32.28 0.6535

27.21 0.6867

28.81 0.6767

NLM_SKR

PSNR SSIM

27.74 0.6816

25.95 0.7565

28.99 0.6459

31.99 0.6421

27.26 0.6815

28.38 0.6815

Proposed

PSNR SSIM

28.15 0.7039

27.92 0.7729

29.13 0.6540

32.62 0.6815

27.31 0.6835

29.03 0.6992

279

4.6. Single image super resolution results on all image channels

280

287

In order to verify the performance when all image channels are processed, the RGB LR image is first transformed into the YCbCr color space, and the three channels are then reconstructed by the proposed method and the compared methods (BI, NE, SCSR, Zeyde's, NLM_SKR) respectively. In these experiments, the test images are contaminated by zero-mean Gaussian noise with σ = 6, and the parameters are chosen as γ_1 = γ_2 = 0.02, γ_3 = 0.08 and h_1 = h_2 = 100. The reconstructed images are converted to grayscale to compute PSNR and SSIM, rather than computing them for every channel. The PSNR and SSIM values are listed in Table 6 and the visual results are shown in Fig. 11. Comparing Table 6 with Table 5, we find that the single-channel results are better, and as shown in Fig. 11, the single-channel result has sharper edges and fewer artifacts.

288

5. Conclusions

289

The classical sparse representation based SISR methods produce edge artifacts and unsharp edges because they fail to consider edge constraint information in the linear combination stage. To overcome this defect, we proposed a new SISR method that combines sparse representation with local rank edge constraint information. In our method, the local rank information of the HR image is learned by sparse representation and then used as an edge constraint that restricts the edges while reconstructing the initial HR image. Moreover, we proposed an improved nonlocal and global optimization model that makes the initial HR image satisfy the image observation model and further improves the image quality.

281 282 283 284 285 286

290 291 292 293 294

Please cite this article as: W. Gong et al., Combining sparse representation and local rank constraint for single image super resolution, Information Sciences (2015), http://dx.doi.org/10.1016/j.ins.2015.07.004

ARTICLE IN PRESS

JID: INS

[m3Gsc;July 13, 2015;17:22]

W. Gong et al. / Information Sciences xxx (2015) xxx–xxx

17

Table 6 PSNR and SSIM results of reconstructed images by different methods on all image channels with σ = 6. Methods

Measures

Images Boats

Butterfly

Lena

Leaf

Window

Avg.

Bicubic

PSNR SSIM

26.24 0.7836

22.40 0.7880

27.70 0.7577

32.61 0.8347

25.56 0.7312

26.93 0.7790

NE

PSNR SSIM

26.04 0.7638

23.34 0.8097

27.84 0.7457

32.62 0.8038

25.73 0.7288

27.17 0.7704

SCSR

PSNR SSIM

26.87 0.7618

24.38 0.8104

28.02 0.7266

31.83 0.7465

26.23 0.7358

27.46 0.7562

Zeyde’s

PSNR SSIM

27.12 0.7866

24.30 0.8239

28.29 0.7522

32.64 0.7877

26.34 0.7512

27.73 0.7803

NLM_SKR

PSNR SSIM

27.01 0.7738

24.54 0.8267

28.29 0.7460

32.38 0.7745

26.52 0.7466

27.74 0.7735

Proposed

PSNR SSIM

27.13 0.7871

24.43 0.8285

28.44 0.7546

32.79 0.7954

26.57 0.7712

27.87 0.7874

(a)

(b)

(c)

(d)

Fig. 11. The reconstruction results on a single channel and on three channels with σ = 6 Gaussian noise. (a) LR image. (b) Single channel result (PSNR = 33.99, SSIM = 0.7301). (c) Three channels result (PSNR = 32.79, SSIM = 0.7954). (d) Original image.

296

Experiments on a relatively large number of images demonstrated the effectiveness of the proposed method in reconstructing HR images.

297

Author contributions

298

301

This manuscript was performed in collaboration between the authors. Weiguo Gong is the corresponding author of this research work. Lunting Hu proposed the local rank constraint, and applied it to SISR. Weiguo Gong improved the reconstruction model. Jinming Li proposed the nonlocal and global optimization model. Weihong Li was involved in the writing and argumentation of the manuscript. All authors discussed and approved the final manuscript.

302

Acknowledgments

303

This work was supported by Key Projects of the National Science and Technology Program, China (Grant no. 2013GS500303), the Key Science and Technology Projects of CSTC, China (Grant nos. CSTC2012GG-YYJSB40001, CSTC2013-JCSF40009), the Application Development Program of CSTC (cstc2013yykfC60006) and the National Natural Science Foundation of China (Grant no. 61105093). The authors would like to thank the editors and reviewers for their valuable comments and suggestions.

295

299 300

304 305 306 307



References


[1] E. Abreu, S. Mitra, A signal-dependent rank ordered mean (SD-ROM) filter - a new approach for removal of impulses from highly corrupted images, in: Proceedings of the International Conference on Acoustics, Speech, & Signal Processing (ICASSP), 1995, pp. 2371-2374.
[2] J. Banks, M. Bennamoun, Reliability analysis of the rank transform for stereo matching, IEEE Trans. Syst. Man Cybernet. Part B 31 (6) (2001) 870-880.
[3] T.F. Chan, N. Ng, A. Yau, A. Yip, Super-resolution image reconstruction using fast inpainting algorithms, Appl. Comput. Harmon. Anal. 23 (1) (2007) 3-24.
[4] T. Chan, J. Zhang, An improved super-resolution with manifold learning and histogram matching, in: Proceedings of the IAPR International Conference on Biometrics, 2005, pp. 756-762.
[5] T.M. Chan, J. Zhang, J. Pu, H. Huang, Neighbor embedding based super-resolution algorithm through edge detection and feature selection, Pattern Recognit. Lett. 30 (5) (2009) 494-502.
[6] H. Chang, D.Y. Yeung, Y. Xiong, Super-resolution through neighbor embedding, in: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004, pp. 275-282.
[7] W. Conover, R. Iman, Rank transformations as a bridge between parametric and nonparametric statistics, Am. Stat. 35 (3) (1981) 124-129.
[8] W. Dong, L. Zhang, R. Lukac, G. Shi, Sparse representation based image interpolation with nonlocal autoregressive modelling, IEEE Trans. Image Process. 22 (4) (2013) 1382-1394.
[9] W. Dong, L. Zhang, G. Shi, X. Li, Nonlocally centralized sparse representation for image restoration, IEEE Trans. Image Process. 22 (4) (2013) 1620-1630.
[10] W. Dong, L. Zhang, G. Shi, X. Wu, Image deblurring and super-resolution by adaptive sparse domain selection and adaptive regularization, IEEE Trans. Image Process. 20 (7) (2011) 1838-1857.
[11] R. Fattal, Image upsampling via imposed edge statistics, ACM Trans. Graph. 26 (3) (2007) 95-102.
[12] W.T. Freeman, T.R. Jones, E.C. Pasztor, Example-based super resolution, IEEE Comput. Graph. Appl. 22 (2) (2002) 56-65.
[13] H. Hou, H.C. Andrews, Cubic splines for image interpolation and digital filtering, IEEE Trans. Acoust. Speech Signal Process. 26 (6) (1978) 508-517.
[14] M. Hradis, A. Herout, P. Zemcik, Local rank patterns - novel features for rapid object detection, Comput. Vis. Graph. (2009) 239-248.
[15] K. Jia, X. Wang, X. Tang, Image transformation based on learning dictionaries across image spaces, IEEE Trans. Pattern Anal. Mach. Intell. 35 (2) (2013) 367-380.
[16] K. Kim, Y. Kwon, Single-image super-resolution using sparse regression and natural image prior, IEEE Trans. Pattern Anal. Mach. Intell. 32 (6) (2010) 1127-1133.
[17] J. Li, W. Gong, W. Li, Dual-sparsity regularized sparse representation for single image super-resolution, Inform. Sci. 298 (2015) 257-273.
[18] X. Li, M.T. Orchard, New edge-directed interpolation, IEEE Trans. Image Process. 10 (10) (2001) 1521-1527.
[19] W. Liu, S. Li, Sparse representation with morphologic regularizations for single image super-resolution, Signal Process. 98 (2014) 410-422.
[20] J. Lu, Y. Sun, Context-aware single image super-resolution using sparse representation and cross-scale similarity, Signal Process. Image Commun. 32 (2015) 40-53.
[21] J. Mukherjee, Local rank transform: properties and applications, Pattern Recogn. Lett. 32 (7) (2011) 1001-1008.
[22] Z. Pan, J. Yu, H. Huang, S. Hu, A. Zhang, H. Ma, W. Sun, Super-resolution based on compressive sensing and structural self-similarity for remote sensing images, IEEE Trans. Geosci. Remote Sens. 51 (9) (2013) 4864-4876.
[23] S. Park, M. Park, M. Kang, Super-resolution image reconstruction: a technical overview, IEEE Signal Process. Mag. 20 (3) (2003) 21-36.
[24] T. Peleg, M. Elad, A statistical prediction model based on sparse representations for single image super-resolution, IEEE Trans. Image Process. 23 (6) (2014) 2569-2582.
[25] M. Protter, M. Elad, H. Takeda, P. Milanfar, Generalizing the nonlocal-means to super-resolution reconstruction, IEEE Trans. Image Process. 18 (1) (2009) 36-51.
[26] M. Sajjad, I. Mehmood, S. Baik, Image super-resolution using sparse coding over redundant dictionary based on effective image representations, J. Vis. Commun. Image Represent. 26 (2014) 50-65.
[27] S. Yang, Z. Liu, M. Wang, F. Sun, L. Jiao, Multitask dictionary learning and sparse representation based single-image super-resolution reconstruction, Neurocomputing 74 (17) (2011) 3193-3203.
[28] S. Yang, Y. Sun, Y. Chen, L. Jiao, Structural similarity regularized and sparse coding based super-resolution for medical images, Biomed. Signal Process. 7 (6) (2012) 579-590.
[29] S. Yang, M. Wang, Y. Chen, Y. Sun, Single-image super-resolution reconstruction via learned geometric dictionaries and clustered sparse coding, IEEE Trans. Image Process. 21 (9) (2012) 4016-4028.
[30] S. Yang, M. Wang, Y. Sun, F. Sun, L. Jiao, Compressive sampling based single-image super-resolution reconstruction by dual-sparsity and non-local similarity regularizer, Pattern Recogn. Lett. 33 (9) (2012) 1049-1059.
[31] J. Yang, J. Wright, T. Huang, Y. Ma, Image super-resolution via sparse representation, IEEE Trans. Image Process. 19 (11) (2010) 2861-2873.
[32] R. Zeyde, M. Elad, M. Protter, On single image scale-up using sparse-representations, in: Proceedings of the 7th International Conference on Curves and Surfaces, 2012, pp. 711-730.
[33] K. Zhang, X. Gao, D. Tao, X. Li, Single image super-resolution with non-local means and steering kernel regression, IEEE Trans. Image Process. 21 (11) (2012) 4544-4556.
[34] J. Zhang, D. Zhao, W. Gao, Group-based sparse representation for image restoration, IEEE Trans. Image Process. 23 (8) (2014) 3336-3351.
[35] F. Zhou, T. Yuan, W. Yang, Q. Liao, Single-image super-resolution based on compact KPCA coding and kernel regression, IEEE Signal Process. Lett. 22 (3) (2015) 336-340.
[36] M. Zontak, M. Irani, Internal statistics of a single natural image, in: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2011, pp. 977-984.

Weiguo Gong received his doctoral degree in computer science from the Tokyo Institute of Technology, Japan, in March 1996 as a Japanese Government scholarship recipient. From April 1996 to March 2002, he served as a researcher and then senior researcher at NEC Labs, Japan. He is now a professor at Chongqing University, China. He has published over 130 research papers in international journals and conferences and two books as an author or co-author. His current research interests are in the areas of pattern recognition and image processing.


Lunting Hu is an M.S. candidate in the College of Opto-Electronic Engineering, Chongqing University. His research interests are in image processing.


Jinming Li is a Ph.D. candidate in the College of Opto-Electronic Engineering, Chongqing University. His current research interests are in the areas of image processing.


Weihong Li received her doctoral degree from Chongqing University in 2006. She is now an associate professor at Chongqing University. Her current research interests are in the areas of pattern recognition and image processing.
