Combining sparse representation and local rank constraint for single image super resolution

Weiguo Gong∗, Lunting Hu, Jinming Li, Weihong Li
Key Lab of Optoelectronic Technology & Systems of Education Ministry, Chongqing University, Chongqing 400044, China
Article info

Article history: Received 24 March 2015; Revised 28 May 2015; Accepted 3 July 2015; Available online xxx.

Keywords: Local rank constraint information; Nonlocal and global optimization; Single image super resolution; Sparse representation.
Abstract

Sparse representation based reconstruction methods are efficient for single image super resolution. They generally consist of the code stage and the linear combination stage. However, the simple linear combination does not consider the edge constraint information of the image, and hence the classical sparse representation based methods reconstruct images with unwanted edge artifacts and unsharp edges. In this paper, considering that the local rank is able to extract better edge information than other edge operators, we propose a new single image super resolution method that combines sparse representation and local rank constraint information. In our method, we first learn the local rank of the HR image via the traditional sparse representation model, and then use it as an edge constraint to restrict the image edges during the linear combination stage that reconstructs the HR image. Furthermore, we propose a nonlocal and global optimization model to further improve the HR image quality. Extensive experimental results validate that, compared with many state-of-the-art methods, the proposed method obtains fewer edge artifacts and sharper edges.

© 2015 Published by Elsevier Inc.
1. Introduction
High resolution (HR) images are needed in many practical applications, such as medical image analysis, computer vision and remote sensing. The direct way to obtain HR images is to increase the number of pixels per unit area or to reduce the pixel size through sensor manufacturing techniques [23]. However, these approaches are constrained by the physical limitations of imaging systems [16]. To overcome these limitations, various single image super resolution (SISR) methods have been proposed to obtain an HR image from its low resolution (LR) observation. The classical SISR methods can be mainly divided into three categories: interpolation based methods [13,18], reconstruction based methods [3,11] and example based methods [12,6]. Although the interpolation based methods are easy to perform, the reconstructed HR images tend to be blurry with jagged artifacts and ringing. The reconstruction based methods can introduce prior knowledge into the reconstruction process, but the HR results may be over-smoothed or lack important details when the magnification factor is large or the prior fails to model the visual complexity of real images. In this paper, we focus on the example based methods, because they remain effective as the magnification factor becomes larger and are able to obtain an HR image that fuses high-frequency information from the example HR and LR images. The example based methods [12,6], which assume that the high-frequency details lost in the LR image can be recovered by learning the relationship between a set of LR example patches and their corresponding HR patches, have become an active area
∗ Corresponding author. Tel.: +86 23 65112779; fax: +86 23 65112779. E-mail address: [email protected] (W. Gong).
of research. Freeman et al. [12] proposed the example based method, which obtains the HR image by employing pairs of LR and HR patches directly in a Markov network. Chang et al. [6] further proposed a neighbor embedding based method under the assumption that the LR image and its corresponding HR version have similar local geometry. To improve the neighbor embedding based method, other neighbor embedding based methods have been proposed in [4] and [5]. Nevertheless, the effectiveness of these methods mainly depends on a large supporting image database [31]. Recently, to alleviate this weakness, Yang et al. [31] proposed the sparse representation based SISR method, which generally consists of the code stage and the linear combination stage. In that work, a joint dictionary training framework was first proposed for training the coupled HR and LR dictionaries. Under this framework, Zeyde et al. [32] introduced the sparse-land model into sparse representation for better SISR results. In order to preserve the differences between image patch contents, Yang et al. [27,29] employed clustering to learn multiple dictionaries in the code stage. In addition to the sparse prior [19,27,29,31,32], other image priors, such as structural similarity [22,28] and nonlocal self-similarity [8–10,17,30,34], have been studied in the code stage. In order to improve the stability of the recovery results, [15] and [24] proposed coefficient mapping methods, and [20,26,35] attempted to capture more image information, such as edges and texture, in the code stage. A drawback of the above mentioned methods is that they fail to consider the edge constraint information of the image in the linear combination stage. Thus the reconstructed image has unwanted edge artifacts and unsharp edges. Recently, the local rank transform, a useful tool for describing data distribution characteristics, has been used in statistical analysis [7], object detection [14], image denoising [1] and stereo matching [2]. In this paper, considering that the local rank can extract better edge information and is less sensitive to noise than other edge operators [21], we develop a local rank edge constraint and introduce it into the linear combination stage for the SISR task. By combining this edge constraint with sparse representation, we propose a new SISR method which reconstructs the HR image with fewer edge artifacts and sharper edges. First, the local rank edge information of the HR image is learned by the sparse representation model, and then it is used as an edge constraint to restrict the image edges during the linear combination stage. Second, we propose a nonlocal and global optimization model to further improve the image quality. The contributions of this paper can be summarized as follows:

1) We classify the training image patches into different patterns to learn the local rank of the HR image.
2) We constrain the local rank of the HR image patch to be as close as possible to the reconstructed local rank via an energy minimization model when reconstructing the HR image.
3) We propose a new weight calculation method for the non-local self-similarity term in the nonlocal and global optimization model.
The rest of this paper is organized as follows. In Section 2, we give a brief overview of the sparse representation based single image super resolution and local rank transform. The proposed method is presented in Section 3. Experimental results are provided in Section 4. Section 5 concludes this paper.
2. Brief overview of sparse representation based SISR and local rank transform
2.1. Sparse representation for SISR
The LR image can be seen as a blurred and down-sampled version of the HR image. This observation model can be formulated as follows:
Y = SHX + E,    (1)

where Y represents the LR image, S is the down-sampling operator, H is the blurring filter, E is the noise and X is the HR image. Since there are many HR images satisfying the reconstruction constraint for a given LR image, the process of recovering X from Y is ill-posed. An effective way to deal with this problem is sparse representation. Sparse representation has become an important tool for single image super resolution. The SISR problem via sparse representation consists of the code stage and the linear combination stage. The code stage is formulated as follows:

\min_{\alpha} \|\tilde{y} - \tilde{D}\alpha\|_2^2 + \lambda\|\alpha\|_1,    (2)

where \tilde{y} = [Fy; \beta w] and \tilde{D} = [FD_l; \beta PD_h], D_h and D_l are the HR dictionary and the LR dictionary respectively, F is a feature extraction operator, y is the input LR image patch, P extracts the region of overlap between the current target patch and the previously reconstructed image patch, α is the sparse coefficient matrix and w contains the values of the previously reconstructed HR image on the overlap. The parameter β controls the tradeoff between matching the LR input and finding an HR patch that is compatible with its neighbors. In the linear combination stage, the ith image patch vector x_i ∈ R^n, which is extracted from the HR image X, can be represented as a sparse linear combination over the HR dictionary D_h ∈ R^{n×K}:

x_i \approx D_h\alpha_i, \quad \alpha_i \in R^K, \quad \|\alpha_i\|_0 \ll K,    (3)

where α_i is the sparse coefficient vector, which is computed from the input LR image patch and the LR dictionary.
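To make the two stages concrete, the following sketch codes one LR feature patch over the LR dictionary and then applies the HR dictionary, in the spirit of formulas (2) and (3). It is an illustrative Python/NumPy example with random stand-in dictionaries and a simple ISTA solver for the l1 code stage (the overlap term P is omitted); it is not the authors' implementation.

```python
import numpy as np

def ista_sparse_code(D, y, lam=0.1, n_iter=200):
    """Minimize 0.5*||y - D a||_2^2 + lam*||a||_1 with ISTA (illustrative stand-in for Eq. (2))."""
    L = np.linalg.norm(D, 2) ** 2              # Lipschitz constant of the smooth part
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - y)               # gradient of the quadratic term
        z = a - grad / L
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)   # soft thresholding
    return a

rng = np.random.default_rng(0)
n_lr, n_hr, K = 36, 81, 512                    # 6x6 LR feature patch, 9x9 HR patch, 512 atoms
D_l = rng.standard_normal((n_lr, K))           # LR (feature) dictionary, random stand-in
D_h = rng.standard_normal((n_hr, K))           # HR dictionary, random stand-in
y = rng.standard_normal(n_lr)                  # feature vector of one LR patch

alpha = ista_sparse_code(D_l, y)               # code stage
x_hr = D_h @ alpha                             # linear combination stage, Eq. (3)
print(x_hr.shape, int(np.count_nonzero(np.abs(alpha) > 1e-6)))
```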
2.2. Local rank transform
Let x be an element of a set S; the rank of x with respect to S is defined as the number of elements of S that are less than x: lr(x) = r(x; S). The local rank transform of S is LRT(S) = {r(x; N(x)) | x ∈ S}, where N(x) is the neighborhood of x.
The local rank definition for a set is also applicable to an image. The local rank of an image is defined as follows: let I be an image and I(x, y) a pixel of I; in the rank window centered at I(x, y), the local rank of I(x, y) counts the number of pixels in the rank window whose values are less than I(x, y) [2,7]. This definition can be formulated as:
LRT(I(x, y)) = N_w - \sum_{i,j} C(I(x, y) - I(i, j)),    (4)

where

C(\cdot) = \begin{cases} 1, & I(x, y) - I(i, j) > 0 \\ 0, & I(x, y) - I(i, j) \le 0 \end{cases}    (5)
N_w is the total number of pixels in the rank window and (i, j) are the coordinates within the rank window. In Eq. (5), the value of C(·) depends on the values of I(x, y) and I(i, j). In a noisy environment, if the value of I(i, j) becomes larger than I(x, y), C(·) tends to be 0; on the contrary, if the value of I(i, j) becomes smaller than I(x, y), C(·) tends to be 1. Therefore, in order to make this transform suitable for a noisy environment, the δ-rank of I(x, y) is defined as follows: in the rank window centered at I(x, y), the δ-rank of I(x, y) counts the number of pixels whose values are less than I(x, y) by at least δ [21]:
LRT_{\delta}(I(x, y)) = N_w - \sum_{i,j} C(I(x, y) - I(i, j)),    (6)

where

C(\cdot) = \begin{cases} 1, & I(x, y) - I(i, j) > \delta \\ 0, & I(x, y) - I(i, j) \le \delta \end{cases}    (7)
It has been demonstrated that the value of LRT_δ(I) is larger at sharp edges when δ is positive (i.e., LRT_δ(I)), and larger around sharp edges when δ is negative (i.e., LRT_{-δ}(I)). In order to illustrate that the local rank can effectively extract the edges of an image, the local ranks of the LR image, the noisy LR image and the HR image are shown in Fig. 1. We can observe that the edges are highlighted in Fig. 1(b), (e), (h) and the smooth regions are highlighted in Fig. 1(c), (f), (i). Furthermore, when noise (zero-mean Gaussian noise with σ = 4) is added to the LR image, the local rank of the image still maintains its shape. As shown in these figures, the image dimensions are unchanged and the transformed images contain only edge information after the local rank transform is applied. That is to say, the local rank edge information of the HR image can be regarded as a special image and can be learned from training sets by learning approaches. In this paper, we use sparse representation to learn the local rank edge information of the HR image. Thus, theoretically speaking, the learned local rank can be used as an edge constraint to restrict the edges while reconstructing the HR image.
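Since the δ-rank of Eqs. (6) and (7) is a thresholded counting operation over a sliding rank window, it can be computed directly. The following Python/NumPy sketch implements Eq. (6) literally for a w × w rank window; the reflective border handling and the toy step-edge image are our own choices, as the paper does not specify them.

```python
import numpy as np

def local_rank_transform(img, delta=50, win=7):
    """delta-rank of Eq. (6): for each pixel, N_w minus the number of window pixels
    with I(x, y) - I(i, j) > delta (the indicator C of Eq. (7))."""
    img = img.astype(np.float64)
    r = win // 2
    padded = np.pad(img, r, mode="reflect")    # border handling: an assumption
    h, w = img.shape
    n_w = win * win
    out = np.empty((h, w), dtype=np.int32)
    for y in range(h):
        for x in range(w):
            window = padded[y:y + win, x:x + win]
            count = np.count_nonzero(img[y, x] - window > delta)
            out[y, x] = n_w - count
    return out

# toy usage: a step edge; the transform changes sharply across the edge
toy = np.zeros((8, 8))
toy[:, 4:] = 200.0
print(local_rank_transform(toy, delta=50, win=7))
```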
Fig. 1. Local rank transform on HR and LR images of “Butterfly”. (a)–(c) The LR image and the corresponding LRTδ images with δ = 50 and δ = −50 respectively. (d)–(f) The noisy LR image (with zero mean and σ = 4 variance Gaussian noise) and the corresponding LRTδ images with δ = 50 and δ = −50 respectively. (g)–(i) The HR image and the corresponding LRTδ images with δ = 50 and δ = −50 respectively.
3. Proposed method
The sparse representation based SISR methods consist of the code stage and the linear combination stage. However, the simple linear combination fails to consider the edge constraint information of the image; hence, the classical sparse representation based SISR methods reconstruct the HR image with unwanted edge artifacts and unsharp edges. According to Section 2.2, and considering that the local rank of an image can extract better edge information, we propose a new SISR method that reconstructs the HR image by combining sparse representation and local rank edge constraint information. In our method, the local rank edge constraint information is first learned by sparse representation and then directly used to restrict the edges during the linear combination stage. The proposed method consists of four parts: the first part is training image pre-processing by the local rank transform; the second part is multiple dictionaries learning; the third part is the initial HR image reconstruction with the local rank edge constraint; and the final part is the nonlocal and global optimization model that improves the image quality. In the first part, the training image patches are pre-processed by the local rank transform and then classified into two patterns, so that the LR image patches to be reconstructed can be handled differently to adapt to the properties of different image contents. In the multiple dictionaries learning part, the multiple dictionaries of these two patterns are learned from the training patches. In the initial HR image reconstruction part, for each input LR image patch the suitable pattern is selected to reconstruct the local rank of the HR image patch; then the initial HR image is reconstructed by constraining the local rank of the HR image patch to be as close as possible to the obtained local rank. In the final part, we propose a nonlocal and global optimization model that makes the initial HR image satisfy the image observation model and further improves the image quality.
3.1. Training image pre-processing by local rank transform
Since the local rank information is unknown before reconstructing the HR image, we need to construct effective local rank training sets for learning the local rank of the HR image. In this paper, the training sets for learning the dictionaries, which are used to reconstruct the HR image and its local rank, contain three parts: the HR image patches, the LR image patches and the local rank image patches. The HR image patches are directly extracted from the HR images. We use the first-order and second-order derivatives of the image as the LR image patch features. By performing the local rank transform on the HR image, the HR image is transformed into two kinds of images: the LRT_δ image and the LRT_{-δ} image. Hence, the local ranks of the HR training image patches consist of the LRT_δ image patches and the LRT_{-δ} image patches, which are extracted from the LRT_δ image and the LRT_{-δ} image respectively. The detailed description of the training sets is as follows. First, Y_l = {Y_l^i}_{i=1}^n denotes the LR patches and X_h = {X_h^i}_{i=1}^n denotes the HR patches, where Y_l^i denotes the ith M × 1 LR feature patch, X_h^i denotes the ith N × 1 HR patch and n is the total number of patches. Second, X_δ = {X_δ^i}_{i=1}^n denotes the LRT_δ patches, where X_δ^i denotes the ith N × 1 LRT_δ patch, and X_{-δ} = {X_{-δ}^i}_{i=1}^n denotes the LRT_{-δ} patches, where X_{-δ}^i denotes the ith N × 1 LRT_{-δ} patch. Finally, the training image patch database is denoted by P = {X_h^i, Y_l^i, X_δ^i, X_{-δ}^i}_{i=1}^n. According to the definition of the local rank, there are many LRT_δ patches that contain only zero elements. These patches not only cost a lot of training time but also affect the accuracy of the dictionaries during the dictionary learning process. In order to reduce the size of the training set and improve the accuracy of the dictionaries, we classify the training patches into two patterns (P_δ and P_{-δ}), so that the LR image patches can be handled differently to adapt to the properties of different image contents. That is to say, if patch X_δ^i is not zero, we classify the corresponding patches X_h^i, Y_l^i, X_δ^i and X_{-δ}^i into pattern P_δ.
Fig. 2. Some training images.
Fig. 3. The test images. From left to right and top to bottom: butterfly, boats, flower, leaf, leaves, plants, raccoon, building, starfish, house, lena, peppers, window, parrot, girl, bike.
If patch X_δ^i is zero, we classify the corresponding patches X_h^i, Y_l^i and X_{-δ}^i into pattern P_{-δ}. The patch X_δ^i is not included in pattern P_{-δ} because it contains only zero elements. As a result, we get two patterns P_δ = {X_h^i, Y_l^i, X_δ^i, X_{-δ}^i}_{i=1}^{n_1} and P_{-δ} = {X_h^i, Y_l^i, X_{-δ}^i}_{i=1}^{n_2} with n_1 + n_2 = n.
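A minimal sketch of this grouping step, assuming the patches have already been extracted and vectorized as columns; the function and variable names are ours, not from the paper.

```python
import numpy as np

def split_patterns(Xh, Yl, Xd, Xmd):
    """Split training tuples into pattern P_delta (the LRT_delta patch is non-zero)
    and pattern P_minus_delta (the LRT_delta patch is all zero).
    Column i of each array holds the patch X_h^i, Y_l^i, X_delta^i or X_{-delta}^i."""
    nonzero = np.any(Xd != 0, axis=0)                  # True where the LRT_delta patch has edges
    P_delta = (Xh[:, nonzero], Yl[:, nonzero], Xd[:, nonzero], Xmd[:, nonzero])
    P_minus_delta = (Xh[:, ~nonzero], Yl[:, ~nonzero], Xmd[:, ~nonzero])   # X_delta is dropped
    return P_delta, P_minus_delta

# toy usage with random patches; in practice the columns come from the training images
rng = np.random.default_rng(1)
Xh = rng.standard_normal((49, 10))
Yl = rng.standard_normal((36, 10))
Xd = rng.integers(0, 3, size=(49, 10)).astype(float)
Xd[:, :4] = 0.0                                        # pretend the first four patches are flat
Xmd = rng.integers(0, 3, size=(49, 10)).astype(float)
P_delta, P_minus_delta = split_patterns(Xh, Yl, Xd, Xmd)
print(P_delta[0].shape[1], P_minus_delta[0].shape[1])  # n1 + n2 = n
```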
3.2. Multiple dictionaries learning
In the code stage, since a learned dictionary is more effective than pre-constructed dictionaries, we consider how to learn the dictionaries. For the pattern P_δ, we need to learn four dictionaries. Since the patches X_h = {X_h^i}_{i=1}^{n_1}, X_δ = {X_δ^i}_{i=1}^{n_1} and X_{-δ} = {X_{-δ}^i}_{i=1}^{n_1} share the same sparse representation coefficients with the LR patches Y_l = {Y_l^i}_{i=1}^{n_1} during the learning procedure, by using the selected patches of P_δ from Section 3.1 the multiple dictionaries learning model can be mathematically written as follows:
\{D_h, D_{\delta}, D_{-\delta}, D_l\} = \arg\min_{\{D_h, D_{\delta}, D_{-\delta}, D_l, \alpha\}} \|X_h - D_h\alpha\|_F^2 + \|X_{\delta} - D_{\delta}\alpha\|_F^2 + \|X_{-\delta} - D_{-\delta}\alpha\|_F^2 + \|Y_l - D_l\alpha\|_F^2 \quad \text{s.t.} \quad \|\alpha\|_0 \le \tau_1,    (8)
where D_h is the HR dictionary, D_δ is the LRT_δ dictionary, D_{-δ} is the LRT_{-δ} dictionary, D_l is the LR dictionary, α = (α_1, α_2, ..., α_{n_1}) is the sparse representation coefficient matrix and τ_1 is the parameter that controls the sparsity level. In this model, the four terms measure how well the sparse reconstructions match the real patches, and the constraint term enforces sparse representation coefficients over the dictionaries.
Fig. 4. Reconstruction results by different values of δ on “Bike”. (a) δ = 10 (PSNR = 23.35, SSIM = 0.7166). (b) δ = 20 (PSNR = 23.53, SSIM = 0.7297). (c) δ = 30 (PSNR = 23.69, SSIM = 0.7386). (d) δ = 40 (PSNR = 23.80, SSIM = 0.7441). (e) δ = 50 (PSNR = 23.81, SSIM = 0.7448). (f) δ = 60 (PSNR = 23.79, SSIM = 0.7440).
Fig. 5. The experimental results on different neighborhood size. (a) 3 × 3 (PSNR = 22.74, SSIM = 0.6770). (b) 5 × 5 (PSNR = 23.73, SSIM = 0.7414). (c) 7 × 7 (PSNR = 23.81, SSIM = 0.7448). (d) 9 × 9 (PSNR = 23.68, SSIM = 0.7341). (e) Original image.
Fig. 6. Comparison of super resolution results on the "Window" image for different choices of γ. (a) LR image. (b) SCSR [31] (PSNR = 28.08, SSIM = 0.8000). (c) NOLRT_SR (PSNR = 28.10, SSIM = 0.8003). (d) PLRT_SR (PSNR = 28.19, SSIM = 0.8065). (e) NLRT_SR (PSNR = 28.17, SSIM = 0.8037). (f) NPLRT_SR (PSNR = 28.27, SSIM = 0.8095). (g) Original image.
According to the joint dictionary learning model [31], in this paper the dictionary learning model (8) can be reformulated as:

\{D_h, D_{\delta}, D_{-\delta}, D_l\} = \arg\min_{\{D_h, D_{\delta}, D_{-\delta}, D_l\}} \|X_P - D_P\alpha\|_2^2 + \lambda_1\|\alpha\|_1,    (9)

where λ_1 balances the sparsity, X_P = [X_h/\sqrt{N_h}, X_{\delta}/\sqrt{N_h}, X_{-\delta}/\sqrt{N_h}, Y_l/\sqrt{N_l}]^T, N_h and N_l are the dimensions of the HR image patch and the LR image patch in vector form respectively, and D_P = [D_h/\sqrt{N_h}, D_{\delta}/\sqrt{N_h}, D_{-\delta}/\sqrt{N_h}, D_l/\sqrt{N_l}]^T.

On the other hand, since the patches X_h = {X_h^i}_{i=1}^{n_2} and X_{-δ} = {X_{-δ}^i}_{i=1}^{n_2} share the same sparse representation coefficients with the low resolution patches Y_l = {Y_l^i}_{i=1}^{n_2} during the learning procedure, we can obtain the following multiple dictionaries learning model:

\{D_h, D_{-\delta}, D_l\} = \arg\min_{\{D_h, D_{-\delta}, D_l, \alpha\}} \|X_h - D_h\alpha\|_F^2 + \|X_{-\delta} - D_{-\delta}\alpha\|_F^2 + \|Y_l - D_l\alpha\|_F^2 \quad \text{s.t.} \quad \|\alpha\|_0 \le \tau_2,    (10)

where D_h is the HR dictionary, D_{-δ} is the LRT_{-δ} dictionary, D_l is the LR dictionary, α = (α_1, α_2, ..., α_{n_2}) is the sparse representation coefficient matrix and τ_2 is the parameter that controls the sparsity level. This dictionary learning model can also be reformulated as:
Fig. 7. Comparison of super resolution results on “Leaves” image by different methods. (a) LR image. (b) BI (PSNR = 23.31, SSIM = 0.7998). (c) NE [6] (PSNR = 24.07, SSIM = 0.8385). (d) SCSR [31] (PSNR = 25.64, SSIM = 0.8818). (e) Zeyde’s [32] (PSNR = 25.55, SSIM = 0.8757). (f) NLM_SKR [33] (PSNR = 26.27, SSIM = 0.8989). (g) NARM [8] (PSNR = 24.11, SSIM = 0.8331). (h) The proposed (PSNR = 26.44, SSIM = 0.9018). (i) Original image.
\{D_h, D_{-\delta}, D_l\} = \arg\min_{\{D_h, D_{-\delta}, D_l\}} \|X_N - D_N\alpha\|_2^2 + \lambda_2\|\alpha\|_1,    (11)

where X_N = [X_h/\sqrt{N_h}, X_{-\delta}/\sqrt{N_h}, Y_l/\sqrt{N_l}]^T, D_N = [D_h/\sqrt{N_h}, D_{-\delta}/\sqrt{N_h}, D_l/\sqrt{N_l}]^T and λ_2 balances the sparsity.
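Once the patch matrices and dictionaries are stacked with the 1/sqrt(N) weights, models (9) and (11) take the form of an ordinary sparse coding problem with a single stacked dictionary. The sketch below is a toy alternating-minimization stand-in for that joint training (ISTA for the codes, a least-squares update with column normalization for the dictionary); it is not the authors' solver, and the patch data are random placeholders.

```python
import numpy as np

def soft(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def joint_dictionary_learning(Xp, K=64, lam=0.1, n_outer=20, n_inner=30, seed=0):
    """Toy solver for min_{D_P, A} ||X_P - D_P A||_F^2 + lam * ||A||_1 (cf. Eq. (9))."""
    rng = np.random.default_rng(seed)
    d, n = Xp.shape
    Dp = rng.standard_normal((d, K))
    Dp /= np.linalg.norm(Dp, axis=0, keepdims=True)
    A = np.zeros((K, n))
    for _ in range(n_outer):
        L = np.linalg.norm(Dp, 2) ** 2
        for _ in range(n_inner):                                    # sparse coding step (ISTA)
            A = soft(A - Dp.T @ (Dp @ A - Xp) / L, lam / L)
        Dp = Xp @ A.T @ np.linalg.pinv(A @ A.T + 1e-8 * np.eye(K))  # dictionary update
        Dp /= np.maximum(np.linalg.norm(Dp, axis=0, keepdims=True), 1e-12)
    return Dp, A

# stack HR, LRT_delta, LRT_-delta and LR training patches with the 1/sqrt(N) weights of Eq. (9)
Nh, Nl, n = 81, 36, 500
rng = np.random.default_rng(2)
Xh, Xd, Xmd = (rng.standard_normal((Nh, n)) for _ in range(3))
Yl = rng.standard_normal((Nl, n))
Xp = np.vstack([Xh / np.sqrt(Nh), Xd / np.sqrt(Nh), Xmd / np.sqrt(Nh), Yl / np.sqrt(Nl)])
Dp, A = joint_dictionary_learning(Xp)
Dh, Dd, Dmd, Dl = np.split(Dp, [Nh, 2 * Nh, 3 * Nh], axis=0)        # recover the sub-dictionaries
print(Dh.shape, Dd.shape, Dmd.shape, Dl.shape)
```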
3.3. Local rank constraint information for HR image reconstruction
For every input LR image patch, the key issue is to identify a pattern which can represent the properties of the LR image contents. If the test LR image patch Yli belongs to the pattern Pδ , we can get the sparse representation coefficient of the LR image patch by formula (12):
\alpha_i^* = \arg\min_{\alpha_i} \|\alpha_i\|_0 \quad \text{s.t.} \quad \|Y_l^i - D_l\alpha_i\|_F^2 \le \varepsilon_1,    (12)

where D_l is the LR dictionary of pattern P_δ, α_i^* is the sparse representation coefficient of Y_l^i and ε_1 is the admissible error. Once the sparse representation coefficient α_i^* is determined by (12), the target HR image patch can be reconstructed by formula (13):

\tilde{X}_h^i \approx D_h\alpha_i^*,    (13)

where D_h is the HR dictionary of pattern P_δ and \tilde{X}_h^i is the initial target HR image patch.
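Formula (12) is an l0-constrained coding problem with an error tolerance, for which greedy orthogonal matching pursuit (OMP) is a standard approximation; the paper does not state which solver is used, so the following OMP sketch is only one plausible reading, with a random stand-in dictionary.

```python
import numpy as np

def omp(D, y, eps=1e-6, max_atoms=20):
    """Greedy approximation of Eq. (12): add atoms until ||y - D a||_2^2 <= eps."""
    a = np.zeros(D.shape[1])
    support, residual = [], y.copy()
    coef = np.zeros(0)
    while residual @ residual > eps and len(support) < max_atoms:
        k = int(np.argmax(np.abs(D.T @ residual)))                 # most correlated atom
        if k in support:
            break
        support.append(k)
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)   # refit on the current support
        residual = y - D[:, support] @ coef
    a[support] = coef
    return a

# toy usage: code one LR patch over the LR dictionary of the selected pattern
rng = np.random.default_rng(3)
D_l = rng.standard_normal((36, 512))
D_l /= np.linalg.norm(D_l, axis=0)
y_patch = D_l[:, [5, 40, 300]] @ np.array([1.0, -0.5, 2.0])        # a 3-sparse ground truth
alpha_star = omp(D_l, y_patch, eps=1e-10)
print(np.nonzero(alpha_star)[0])
```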
Fig. 8. Comparison of super resolution results on the "Boats" image (with σ = 4) by different methods. (a) LR image. (b) BI (PSNR = 27.73, SSIM = 0.7746). (c) NE [6] (PSNR = 27.66, SSIM = 0.7654). (d) SCSR [31] (PSNR = 28.74, SSIM = 0.7889). (e) Zeyde's [32] (PSNR = 28.87, SSIM = 0.7994). (f) NLM_SKR [33] (PSNR = 28.81, SSIM = 0.7966). (g) The proposed (PSNR = 29.10, SSIM = 0.8012). (h) Original image.
However, this simple linear combination fails to consider the edge constraint information of the image, and hence the reconstructed HR image has unwanted edge artifacts and unsharp edges. According to the learned dictionaries D_δ and D_{-δ}, and the fact that the local rank patch and the LR patch share the same sparse representation coefficient, the local ranks LRT_δ and LRT_{-δ} of the HR image patch can be reconstructed by R_δ = D_δ α_i^* and R_{-δ} = D_{-δ} α_i^*, where D_δ and D_{-δ} are the LRT_δ and LRT_{-δ} dictionaries of pattern P_δ respectively, and R_δ and R_{-δ} are the reconstructed local ranks of the HR image patch. In order to restrict the image edges by the local rank constraint, we constrain the local rank of the HR image patch to be as close as possible to the reconstructed local ranks R_δ and R_{-δ}. Thus we obtain the following energy minimization model to restrict the image edges:
\min_{\tilde{X}_h^i} E_2\big(LRT_{\delta}(\tilde{X}_h^i) - R_{\delta}\big) + E_3\big(LRT_{-\delta}(\tilde{X}_h^i) - R_{-\delta}\big).    (14)
In order to restrict the edges during the linear combination stage, we combine the linear combination formula (13) and the edge constraint (14) to reconstruct the initial HR image patch. Thus the initial HR image patch can be reconstructed by:
\min_{\tilde{X}_h^i} \|\tilde{X}_h^i - D_h\alpha_i^*\|_2^2 + \gamma_2\|LRT_{\delta}(\tilde{X}_h^i) - R_{\delta}\|_2^2 + \gamma_3\|LRT_{-\delta}(\tilde{X}_h^i) - R_{-\delta}\|_2^2,    (15)
where γ_2 and γ_3 control the LRT_δ and LRT_{-δ} terms respectively. For an LR test image patch Y_l^i which belongs to the pattern P_{-δ}, we use the LR dictionary D_l of pattern P_{-δ} and formula (12) to get the sparse representation coefficient α_i^*. It is reasonable to constrain the LRT_{-δ} of the HR image patch to be as close as possible to
Fig. 9. Comparison of super resolution results on the "Boats" image (with σ = 6) by different methods. (a) LR image. (b) BI (PSNR = 27.56, SSIM = 0.7342). (c) NE [6] (PSNR = 27.38, SSIM = 0.7111). (d) SCSR [31] (PSNR = 28.19, SSIM = 0.7301). (e) Zeyde's [32] (PSNR = 28.44, SSIM = 0.7482). (f) NLM_SKR [33] (PSNR = 28.33, SSIM = 0.7398). (g) The proposed (PSNR = 28.58, SSIM = 0.7442). (h) Original image.
the reconstructed local rank. The energy minimization model can be written as:
\min_{\tilde{X}_h^i} E_1\big(LRT_{-\delta}(\tilde{X}_h^i) - R_{-\delta}\big),    (16)
where R−δ = D−δ αi∗ , D−δ is the LRT−δ dictionary of pattern P−δ , R−δ is the reconstructed local rank of the HR image patch. Similar to formula (15), we can get formula (17) for reconstructing the initial HR image patch:
\min_{\tilde{X}_h^i} \|\tilde{X}_h^i - D_h\alpha_i^*\|_2^2 + \gamma_1\|LRT_{-\delta}(\tilde{X}_h^i) - R_{-\delta}\|_2^2,    (17)
where D_h is the HR dictionary of pattern P_{-δ} and γ_1 controls the LRT_{-δ} term. The constraint terms in the reconstruction models (15) and (17) differ because we first select the suitable pattern for every input LR image patch. If the patch belongs to the pattern P_δ, the LRT_δ and LRT_{-δ} edge constraint information are both reconstructed and used to restrict the edges simultaneously, while only the LRT_{-δ} edge constraint is reconstructed to restrict the edges if the patch belongs to the pattern P_{-δ}. By fusing all the initial HR image patches reconstructed by (15) and (17), the estimated initial HR image X_0 is obtained.
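Since LRT_δ is a non-differentiable counting operator, the paper does not detail how (15) and (17) are minimized. The sketch below therefore only makes the data flow concrete: it forms the linear-combination estimate of Eq. (13), the reconstructed local ranks R_δ and R_{-δ}, and evaluates the objective of model (15) for a candidate patch. The patch-level local rank here uses the whole patch as the rank window, and all dictionaries and codes are random stand-ins.

```python
import numpy as np

def lrt_patch(patch, delta):
    """delta-rank of a square patch, taking the whole patch as the rank window (cf. Eq. (6))."""
    flat = patch.ravel()
    counts = np.array([np.count_nonzero(v - flat > delta) for v in flat])
    return patch.size - counts

def objective_eq15(x, Dh, alpha, R_d, R_md, gamma2=0.02, gamma3=0.08, delta=50):
    """Value of model (15) for a candidate HR patch x given as a square array."""
    fidelity = np.sum((x.ravel() - Dh @ alpha) ** 2)
    lr_pos = np.sum((lrt_patch(x, delta) - R_d) ** 2)
    lr_neg = np.sum((lrt_patch(x, -delta) - R_md) ** 2)
    return fidelity + gamma2 * lr_pos + gamma3 * lr_neg

# stand-in data: a 7x7 HR patch, random dictionaries of pattern P_delta and a sparse code alpha*
rng = np.random.default_rng(4)
Dh = rng.standard_normal((49, 512))
Dd = rng.standard_normal((49, 512))
Dmd = rng.standard_normal((49, 512))
alpha = np.zeros(512)
alpha[[3, 99, 200]] = [1.0, -0.7, 0.4]
R_d, R_md = Dd @ alpha, Dmd @ alpha            # reconstructed local ranks R_delta and R_-delta
x0 = (Dh @ alpha).reshape(7, 7)                # linear-combination estimate, Eq. (13)
print(objective_eq15(x0, Dh, alpha, R_d, R_md))
```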
Fig. 10. Comparison of super resolution results on the "Boats" image (with σ = 8) by different methods. (a) LR image. (b) BI (PSNR = 27.34, SSIM = 0.6885). (c) NE [6] (PSNR = 27.00, SSIM = 0.6653). (d) SCSR [31] (PSNR = 27.54, SSIM = 0.6688). (e) Zeyde's [32] (PSNR = 27.91, SSIM = 0.6926). (f) NLM_SKR [33] (PSNR = 27.74, SSIM = 0.6816). (g) The proposed (PSNR = 28.15, SSIM = 0.7039). (h) Original image.
3.4. Nonlocal and global optimization
In order to make the HR image X0 satisfy the image observation model, papers [31] and [19] applied the global reconstruction constraint on the estimated HR image X0 . This process can be represented by the following global constraint model:
X^* = \arg\min_X \|SHX - Y\|_2^2 + \mu\|X - X_0\|_2^2.    (18)
However, this model fails to reconstruct clear edges because it ignores the self-similarity of the patches within the same scale and across different scales. In order to reconstruct the HR image with clear edges, we introduce the non-local self-similarity constraint information [25] and [33] into patch aggregation to better reconstruct the HR image. The non-local self-similarity constraint information assumes that small patches in the image tend to redundantly repeat themselves many times within the same scale and across different scales. This constraint is given by:
N(X^i) = \frac{1}{Z(i)}\sum_j w_{ij} N(X^j),    (19)

where N(X^i) is the image patch at the ith position of the image, N(X^j) are the similar image patches in the search window, w_{ij} is the similarity weight and Z(i) is the normalization parameter. In this formula, w_{ij} and Z(i) are calculated respectively by:

w_{ij} = \exp\left(-\frac{\|N(X^i) - N(X^j)\|_2^2}{h}\right),    (20)

and

Z(i) = \sum_j \exp\left(-\frac{\|N(X^i) - N(X^j)\|_2^2}{h}\right),    (21)

where h is the attenuator. The original non-local self-similarity constraint obtains the weight by simply measuring patch similarities with the Euclidean norm. However, this measurement is less sensitive to patches with fine details and edges [36], which makes the weights show no obvious difference between different image contents. In order to overcome this drawback, we propose a new weight for the non-local self-similarity constraint, which depends not only on the image patches but also on the local rank of the image patches. The proposed weight of the non-local self-similarity constraint is written as:

w_{ij} = \exp\left(-\frac{\|N(X^i) - N(X^j)\|_2^2}{h_1}\right) \cdot \exp\left(-\frac{\|LRT_{\delta}(X^i) - LRT_{\delta}(X^j)\|_2^2}{h_2}\right),    (22)

where

Z(i) = \sum_j \exp\left(-\frac{\|N(X^i) - N(X^j)\|_2^2}{h_1}\right) \cdot \exp\left(-\frac{\|LRT_{\delta}(X^i) - LRT_{\delta}(X^j)\|_2^2}{h_2}\right).    (23)

Finally, by introducing the improved non-local self-similarity constraint into model (18), we obtain the following nonlocal and global optimization model:

X^* = \arg\min_X \|SHX - Y\|_2^2 + \mu\|X - X_0\|_2^2 + \beta\|X - WX\|_2^2,    (24)

where W is the matrix of all the weights, and μ and β are the regularization parameters. The entire SISR process is summarized as Algorithm 1.

Algorithm 1. SISR reconstruction algorithm for an LR test image
Input: patterns P_δ and P_{-δ}, the test LR image Y, the dictionaries of P_δ, the dictionaries of P_{-δ}, magnification factor s = 3, γ_1 = γ_2 = 0.02, γ_3 = 0.08, μ = 0.08, β = 0.05, h_1 = h_2 = 100.
1. For each patch Y_l^i in the LR image Y, perform the following steps:
   1.1. Adaptively select the corresponding pattern P_δ or P_{-δ}, and choose the corresponding dictionaries;
   1.2. Compute the optimal solution α_i^* via formula (12);
   1.3. Generate the initial HR image patch by model (15) or (17).
2. Obtain the initial HR image X_0 by fusing all the initial HR patches.
3. Estimate the final HR image X by model (24).
Output: the final HR image X.
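A small sketch of the aggregation weights: for one reference patch it evaluates the kernels of Eq. (22) over a search window, combining the intensity patch distance with the LRT_δ patch distance, together with the normalizer Z(i) of Eq. (23). It is an illustrative reading with a synthetic image and a random stand-in LRT_δ map, not the authors' implementation.

```python
import numpy as np

def proposed_weights(img, lrt, i, j, patch=7, search=21, h1=100.0, h2=100.0):
    """Unnormalized weights of Eq. (22) and Z(i) of Eq. (23) for the patch centred at (i, j)."""
    r, s = patch // 2, search // 2
    ref_im = img[i - r:i + r + 1, j - r:j + r + 1]
    ref_lr = lrt[i - r:i + r + 1, j - r:j + r + 1]
    coords, kernels = [], []
    for y in range(i - s, i + s + 1):
        for x in range(j - s, j + s + 1):
            if (y, x) == (i, j):
                continue
            cand_im = img[y - r:y + r + 1, x - r:x + r + 1]
            cand_lr = lrt[y - r:y + r + 1, x - r:x + r + 1]
            d_im = np.sum((ref_im - cand_im) ** 2)                 # intensity patch distance
            d_lr = np.sum((ref_lr - cand_lr) ** 2)                 # LRT_delta patch distance
            kernels.append(np.exp(-d_im / h1) * np.exp(-d_lr / h2))
            coords.append((y, x))
    z = float(np.sum(kernels))                                     # Eq. (23)
    return coords, np.array(kernels), z

# toy usage; lrt would normally be the LRT_delta map of the current HR estimate
rng = np.random.default_rng(5)
img = rng.uniform(0, 255, size=(64, 64))
lrt = rng.integers(0, 49, size=(64, 64)).astype(float)
coords, w, z = proposed_weights(img, lrt, 32, 32)
print(len(w), (w / z).sum())    # normalized weights, as used in the aggregation of Eq. (19)
```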
4. Experimental results and analysis
4.1. Experimental settings
All experiments are conducted in the Matlab 2010a environment on a desktop PC with a 2.80 GHz dual-core CPU and 2.9 GB memory. The training images are selected from the software package of [31]. We choose 70 HR images to form the training set. The image database contains different types of content, such as flowers, architecture, animals, cars and plants. Some training images are shown in Fig. 2. The number of training patches used for learning the dictionaries is 100,000. The dictionary size is always set to 512 as a compromise between computation and image quality. In order to get a better result and to balance LRT_δ and LRT_{-δ}, we choose the patch size as 7×7 with an overlap of six pixels between adjacent patches. In order to assess the quality of the SISR reconstruction objectively, we utilize two indices, i.e., the peak signal-to-noise ratio (PSNR) and the structural similarity (SSIM). The test images are shown in Fig. 3, including butterfly, boats, flower, leaf, leaves, plants, raccoon, building, starfish, house, lena, peppers, window, parrot, girl and bike. All the test images are first blurred by a 5×5 Gaussian filter with standard deviation 1 and then downsampled by a decimation factor of 3 to produce the corresponding LR images. In addition, additive Gaussian noise is further added to the LR images for the noisy experiments.
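The degradation protocol above (5×5 Gaussian blur with standard deviation 1, decimation by a factor of 3, optional additive Gaussian noise) is easy to reproduce; the snippet below is a simple SciPy re-implementation of that protocol, with the noise level treated as a standard deviation, and is not the authors' original Matlab script.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def make_lr_image(hr, scale=3, blur_sigma=1.0, noise_sigma=0.0, seed=0):
    """Blur with a 5x5 Gaussian (sigma = 1, truncate = 2), decimate, optionally add noise."""
    blurred = gaussian_filter(hr.astype(np.float64), sigma=blur_sigma, truncate=2.0)
    lr = blurred[::scale, ::scale]                               # decimation by the factor s
    if noise_sigma > 0:
        lr = lr + np.random.default_rng(seed).normal(0.0, noise_sigma, lr.shape)
    return np.clip(lr, 0.0, 255.0)

hr = np.tile(np.linspace(0.0, 255.0, 96), (96, 1))               # synthetic HR test image
lr_clean = make_lr_image(hr)                                     # noiseless protocol
lr_noisy = make_lr_image(hr, noise_sigma=4.0)                    # the sigma = 4 experiment
print(hr.shape, lr_clean.shape, lr_noisy.shape)
```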
Table 1. PSNR and SSIM results of reconstructed images by different values of δ.

Value of δ   Measure   Boats    Building   Bike     Leaf     Window   Parrot   Avg.
δ = 10       PSNR      28.37    23.99      23.35    37.59    27.80    28.08    28.19
             SSIM      0.8279   0.7097     0.7166   0.8888   0.7897   0.8639   0.7994
δ = 20       PSNR      28.75    24.08      23.53    37.76    28.01    28.38    28.41
             SSIM      0.8374   0.7194     0.7297   0.8910   0.7993   0.8698   0.8077
δ = 30       PSNR      28.85    24.24      23.69    37.82    28.20    28.54    28.56
             SSIM      0.8404   0.7273     0.7386   0.8910   0.8055   0.8723   0.8125
δ = 40       PSNR      28.90    24.22      23.80    37.82    28.27    28.58    28.59
             SSIM      0.8402   0.7291     0.7441   0.8903   0.8086   0.8725   0.8141
δ = 50       PSNR      28.90    24.30      23.81    37.93    28.28    28.62    28.64
             SSIM      0.8419   0.7315     0.7448   0.8919   0.8095   0.8728   0.8154
δ = 60       PSNR      28.83    24.16      23.79    37.88    28.27    28.58    28.58
             SSIM      0.8400   0.7284     0.7440   0.8910   0.8086   0.8724   0.8141
Table 2. PSNR and SSIM results of reconstructed images by different values of the neighborhood size.

Images     Measure   3×3      5×5      7×7      9×9
Boats      PSNR      27.59    28.94    28.90    28.88
           SSIM      0.6570   0.8429   0.8419   0.8416
Building   PSNR      23.52    24.25    24.30    24.21
           SSIM      0.6712   0.7289   0.7315   0.7222
Bike       PSNR      22.74    23.73    23.81    23.68
           SSIM      0.6770   0.7414   0.7448   0.7341
Leaf       PSNR      36.77    37.90    37.93    37.91
           SSIM      0.8677   0.8918   0.8919   0.8919
Window     PSNR      27.21    28.17    28.28    28.19
           SSIM      0.7581   0.8051   0.8095   0.8040
Parrot     PSNR      27.35    28.51    28.62    28.52
           SSIM      0.8332   0.8722   0.8728   0.8706
Avg.       PSNR      27.53    28.58    28.64    28.56
           SSIM      0.7440   0.8137   0.8154   0.8107
From Section 4.2 to Section 4.5, all test methods are applied on the luminance channel of each image. In Section 4.6, the methods are applied on all image channels to test the effectiveness of the proposed method.
4.2. Effects of threshold of δ and neighborhood size
The first key point is the threshold δ. According to Section 2.2, an appropriate threshold δ can not only extract accurate edge information but also lead to a high quality reconstructed HR image. To evaluate the effect of δ, testing experiments are conducted on six LR images with different values of δ. The parameters γ_1 and γ_2 are experimentally set to 0.02 and γ_3 is experimentally set to 0.08. In order to examine the results in terms of both visual quality and the PSNR and SSIM values, we only use the reconstruction models (15) and (17) to obtain the HR images. The results are shown in Table 1 and Fig. 4. From Table 1, we can see that the PSNR values tend to increase as δ varies from 10 to 50, reaching the highest value when δ = 50; when δ exceeds 50, the PSNR begins to decrease. The same tendency can be seen for SSIM. In Fig. 4, we can see that the edges of the reconstructed images tend to become sharper as the value of δ becomes larger, while for δ = 40, 50 and 60 there are no obvious changes in the edges. Therefore, considering both objective and subjective effects, we adopt 50 as the value of δ. In order to test the effect of the neighborhood size of the local rank, we conduct experiments with different neighborhood sizes, namely 3×3, 5×5, 7×7 and 9×9. The threshold δ is set to 50. We again only use the reconstruction models (15) and (17), with the parameters γ_1 and γ_2 set to 0.02 and γ_3 set to 0.08. The experimental results are shown in Table 2 and Fig. 5. When the neighborhood size is 7×7, our method obtains the highest PSNR and SSIM values in Table 2 and the best image with sharp edges. As a result, in the following experiments, the neighborhood size of the local rank is chosen as 7×7.
Table 3. PSNR and SSIM results of reconstructed images by different selections of γ.

Images     Measure   SCSR     NOLRT_SR   PLRT_SR   NLRT_SR   NPLRT_SR
Boats      PSNR      28.66    28.69      28.85     28.81     28.90
           SSIM      0.8351   0.8357     0.8406    0.8395    0.8419
Building   PSNR      24.09    24.10      24.24     24.22     24.30
           SSIM      0.7217   0.7226     0.7287    0.7230    0.7315
Bike       PSNR      23.68    23.67      23.75     23.71     23.81
           SSIM      0.7366   0.7360     0.7419    0.7378    0.7448
Leaf       PSNR      37.72    37.73      37.93     37.91     37.93
           SSIM      0.8902   0.8901     0.8919    0.8919    0.8920
Window     PSNR      28.08    28.10      28.19     28.17     28.27
           SSIM      0.8000   0.8003     0.8065    0.8037    0.8095
Parrot     PSNR      28.48    28.49      28.55     28.46     28.62
           SSIM      0.8733   0.8730     0.8722    0.8712    0.8728
Avg.       PSNR      28.45    28.46      28.85     28.81     28.90
           SSIM      0.8095   0.8096     0.8406    0.8395    0.8419
4.3. Effectiveness of proposed local rank edge constraint information
In order to evaluate whether the proposed LRT_δ and LRT_{-δ} edge constraint information can efficiently remove edge artifacts and sharpen edges compared with the traditional sparse representation based SISR (SCSR) [31] method, we first design test experiments with different choices of the parameters γ_1, γ_2 and γ_3. According to Section 3, the reconstruction models (15) and (17) reduce to the classical reconstruction models if the parameters γ_1, γ_2 and γ_3 are zero (named NOLRT_SR). We can study the effectiveness of the introduced LRT_δ edge constraint information by setting γ_1 and γ_3 to zero and γ_2 = 0.02 (named PLRT_SR). By setting γ_2 to zero and γ_1 = 0.02, γ_3 = 0.08 (named NLRT_SR), we can study the effectiveness of the introduced LRT_{-δ} edge constraint information. Furthermore, by setting γ_1 = 0.02, γ_2 = 0.02 and γ_3 = 0.08, the effectiveness of both kinds of information can be studied (named NPLRT_SR). Table 3 reports the PSNR and SSIM results and Fig. 6 presents the visual results. From Table 3, we can see that the proposed NPLRT_SR not only outperforms PLRT_SR and NLRT_SR but also outperforms SCSR. Moreover, even without the local rank edge constraint, NOLRT_SR still obtains better results than SCSR; the main reason is that each LR image patch is reconstructed using a different dictionary. In order to exhibit more details, we segment a dominant region in each super-resolved image in Fig. 6. We can see that the NPLRT_SR method provides fewer edge artifacts and sharper edges. This means that although NOLRT_SR, PLRT_SR and NLRT_SR can reconstruct good HR images, they are inferior to NPLRT_SR. Furthermore, NOLRT_SR, PLRT_SR, NLRT_SR and NPLRT_SR all obtain fine results compared with SCSR. So the experimental results verify that the proposed local rank edge constraint information based SISR method can effectively improve the quality of the reconstructed HR image while removing edge artifacts and sharpening edges.
4.4. Single image super resolution results on noiseless images
We first qualitatively and quantitatively compare our method on noiseless images with some well-known SISR methods, namely bicubic interpolation (BI), neighbor embedding (NE) [6], SCSR [31], Zeyde's method [32], the nonlocal autoregressive modelling based method (NARM) [8] and the non-local means and steering kernel regression based method (NLM_SKR) [33]. In our method, the parameters h_1 and h_2 of the nonlocal regularization term are both set to 100, the search window is set to 21×21, the magnification factor is s = 3 and γ_1 = γ_2 = 0.02, γ_3 = 0.08. The compared PSNR and SSIM results are listed in Table 4. From Table 4, we can observe that the BI method and the NARM method always give the lowest performance and are worse than the NE based method. Compared with the NE method, both SCSR and Zeyde's method achieve better results, mainly because they learn more image information for reconstructing the HR image. In Table 4, the NLM_SKR method obtains better results than the other compared methods (i.e., BI, NE, SCSR, Zeyde's and NARM), while our proposed method performs best among all the above mentioned methods and greatly improves the PSNR and SSIM values. These results demonstrate that, by using the proposed local rank constraint, our method can reconstruct the HR image with good objective quality. To demonstrate the visual quality of the different methods, the reconstructed images and the segmented local regions are presented in Fig. 7. We can see that the BI method misses some important details and blurs the edges. Although the NE method
Table 4. PSNR and SSIM results of reconstructed images by different methods.

Images      Measure   Bicubic   NE       SCSR     Zeyde's   NARM     NLM_SKR   Proposed
Boats       PSNR      27.87     27.89    29.25    29.24     27.53    29.26     29.45
            SSIM      0.8142    0.8124   0.8542   0.8530    0.7824   0.8575    0.8606
Starfish    PSNR      26.53     26.81    27.82    27.74     26.04    28.49     28.49
            SSIM      0.7729    0.7823   0.8237   0.8182    0.7311   0.8399    0.8402
House       PSNR      24.21     24.25    24.85    24.76     24.36    24.78     25.14
            SSIM      0.7531    0.7575   0.7881   0.7849    0.7508   0.7881    0.7950
Lena        PSNR      29.48     30.09    30.88    30.76     29.38    31.31     31.36
            SSIM      0.7857    0.7937   0.8201   0.8171    0.7615   0.8299    0.8306
Window      PSNR      27.15     27.51    28.48    28.35     26.69    28.73     28.78
            SSIM      0.7644    0.7776   0.8174   0.8120    0.7303   0.8255    0.8279
Parrot      PSNR      27.45     27.91    29.11    28.99     27.57    29.63     29.53
            SSIM      0.8528    0.8596   0.8826   0.8790    0.8337   0.8867    0.8868
Butterfly   PSNR      23.85     24.96    26.31    26.06     24.79    27.04     27.14
            SSIM      0.7985    0.8415   0.8748   0.8651    0.8379   0.8950    0.8934
Flower      PSNR      34.21     34.42    36.24    36.35     32.41    36.48     36.64
            SSIM      0.9037    0.8987   0.9181   0.9188    0.8609   0.9215    0.9221
Plants      PSNR      30.75     31.28    32.58    32.32     30.36    32.81     32.99
            SSIM      0.8463    0.8343   0.8547   0.8536    0.8039   0.8587    0.8609
Leaf        PSNR      36.88     37.07    38.04    38.18     35.31    38.25     38.40
            SSIM      0.8842    0.8793   0.8940   0.8942    0.8494   0.8970    0.8981
Girl        PSNR      33.01     33.17    33.90    33.89     30.48    33.88     34.03
            SSIM      0.7606    0.7593   0.7836   0.7814    0.5955   0.7846    0.7855
Bike        PSNR      22.63     23.16    23.92    23.78     22.74    24.06     24.28
            SSIM      0.6746    0.7082   0.7561   0.7461    0.6564   0.7672    0.7721
Raccoon     PSNR      28.37     28.42    28.91    28.92     27.59    29.02     29.16
            SSIM      0.7001    0.7033   0.7402   0.7351    0.6251   0.7388    0.7418
Leaves      PSNR      23.31     24.07    25.64    25.55     24.11    26.27     26.44
            SSIM      0.7998    0.8385   0.8818   0.8757    0.8331   0.8989    0.9018
Building    PSNR      23.53     23.72    24.36    24.33     23.66    24.79     24.82
            SSIM      0.6842    0.7053   0.7445   0.7370    0.6722   0.7587    0.7611
Peppers     PSNR      29.43     30.44    31.91    31.75     29.60    31.78     32.23
            SSIM      0.8482    0.8423   0.8616   0.8667    0.8254   0.8670    0.8691
Avg.        PSNR      28.04     28.45    29.51    29.44     27.66    29.79     29.93
            SSIM      0.7902    0.7996   0.8310   0.8274    0.7594   0.8385    0.8404
can produce some high-frequency details, it also produces ringing effects along the edge regions, and the reconstructed image does not appear natural; when the input image is not, or only weakly, relevant to the training database, watercolor-like results are produced and some regions of the image are distorted. The SCSR method recovers more details, but it produces jaggy and ringing artifacts along the edges and details of the reconstructed image. Although Zeyde's method is capable of both suppressing jagged artifacts and sharpening the edges, it generates obvious ringing effects and leads to an over-smoothed result. The NARM method reconstructs a smooth image with unsharp edges. As shown in Fig. 7, although both the proposed method and the NLM_SKR method can reconstruct fine details and preserve correct edges, the proposed method reconstructs sharper edges than the NLM_SKR method. So the experimental results demonstrate the effectiveness of the proposed method.
4.5. Single image super resolution results on noisy images
In order to test the robustness of the proposed method, we add Gaussian noise with zero mean and three noise levels, σ = 4, 6, 8, to five randomly selected test images. The parameters are γ_1 = γ_2 = 0.02, γ_3 = 0.08 and h_1 = h_2 = 100. We compare our method with BI, NE, SCSR, Zeyde's and NLM_SKR. The objective results are listed in Table 5, and the reconstructed HR images are shown in Figs. 8–10. From Table 5, we can see that the PSNR and SSIM values of all the methods tend to decrease as the noise variance becomes larger. However, compared with the other methods, our method still achieves the best results at each noise level. From Figs. 8–10, we can also see that our method is able to reconstruct correct HR images with sharp edges and fewer edge artifacts. These results demonstrate the robustness of the proposed method against noise.
Table 5. PSNR and SSIM results of reconstructed images by different methods under different noise levels.

Noise variance   Method     Measure   Boats    Butterfly   Lena     Leaf     Window   Avg.
σ = 4            Bicubic    PSNR      27.73    23.79       29.27    35.84    27.03    28.73
                            SSIM      0.7746   0.7715      0.7455   0.8229   0.7341   0.7697
                 NE         PSNR      27.66    24.87       29.73    35.55    27.31    29.02
                            SSIM      0.7654   0.8095      0.7462   0.8073   0.7416   0.7740
                 SCSR       PSNR      28.74    26.03       30.12    35.18    28.04    29.62
                            SSIM      0.7889   0.8287      0.7513   0.7926   0.7661   0.7855
                 Zeyde's    PSNR      28.87    28.54       30.30    35.79    28.03    30.31
                            SSIM      0.7994   0.8144      0.7642   0.8114   0.7710   0.7921
                 NLM_SKR    PSNR      28.81    26.86       30.68    35.67    28.35    30.07
                            SSIM      0.7966   0.8477      0.7677   0.8075   0.7769   0.7992
                 Proposed   PSNR      29.10    28.63       30.80    35.84    28.45    30.56
                            SSIM      0.8012   0.8495      0.7777   0.8141   0.7784   0.8042
σ = 6            Bicubic    PSNR      27.56    23.72       29.02    34.81    26.88    28.40
                            SSIM      0.7342   0.7440      0.7042   0.7620   0.7026   0.7294
                 NE         PSNR      27.38    24.81       29.18    33.79    26.92    28.42
                            SSIM      0.7111   0.7736      0.6915   0.7229   0.6936   0.7185
                 SCSR       PSNR      28.19    25.70       29.35    33.16    27.55    28.79
                            SSIM      0.7301   0.7866      0.6900   0.7030   0.7190   0.7257
                 Zeyde's    PSNR      28.44    28.14       29.71    33.97    27.67    29.59
                            SSIM      0.7482   0.7569      0.7115   0.7338   0.7306   0.7362
                 NLM_SKR    PSNR      28.33    26.46       29.91    33.72    27.85    29.25
                            SSIM      0.7398   0.8027      0.7077   0.7257   0.7292   0.7410
                 Proposed   PSNR      28.58    28.29       30.11    33.99    27.99    29.79
                            SSIM      0.7442   0.8106      0.7163   0.7301   0.7323   0.7467
σ = 8            Bicubic    PSNR      27.34    23.62       28.69    33.69    26.68    28.00
                            SSIM      0.6885   0.7129      0.6571   0.6948   0.6667   0.6840
                 NE         PSNR      27.00    24.49       28.73    32.75    26.68    27.93
                            SSIM      0.6653   0.7381      0.6423   0.6584   0.6605   0.6729
                 SCSR       PSNR      27.54    25.29       28.47    31.35    26.96    27.92
                            SSIM      0.6688   0.7426      0.6261   0.6149   0.6695   0.6644
                 Zeyde's    PSNR      27.91    27.64       29.00    32.28    27.21    28.81
                            SSIM      0.6926   0.6959      0.6547   0.6535   0.6867   0.6767
                 NLM_SKR    PSNR      27.74    25.95       28.99    31.99    27.26    28.38
                            SSIM      0.6816   0.7565      0.6459   0.6421   0.6815   0.6815
                 Proposed   PSNR      28.15    27.92       29.13    32.62    27.31    29.03
                            SSIM      0.7039   0.7729      0.6540   0.6815   0.6835   0.6992
4.6. Single image super resolution results on all image channels
In order to verify the performance when processing all image channels, the RGB LR image is first transformed into the YCbCr color space. Then, the three channels of the YCbCr color space are reconstructed by the proposed method and by the compared methods (i.e., BI, NE, SCSR, Zeyde's and NLM_SKR) respectively. In the experiments, the test images are contaminated by Gaussian noise with zero mean and σ = 6, and the parameters are chosen as γ_1 = γ_2 = 0.02, γ_3 = 0.08 and h_1 = h_2 = 100. The reconstructed images are converted to grayscale to compute PSNR and SSIM, instead of computing the values for every channel. The computed PSNR and SSIM values are listed in Table 6 and the visual results are shown in Fig. 11. Comparing Table 6 with Table 5, we can find that processing only the luminance channel performs better. As shown in Fig. 11, we can also find that the single-channel result has sharper edges and fewer artifacts.
5. Conclusions
In order to overcome the defect that the classical sparse representation based SISR methods produce edge artifacts and unsharp edges because they fail to consider the edge constraint information in the linear combination stage, we proposed a new SISR method that combines sparse representation and local rank edge constraint information. In our method, the local rank information of the HR image is learned by sparse representation and then used as an edge constraint to restrict the edges when reconstructing the initial HR image. Moreover, we proposed an improved nonlocal and global optimization model that makes the initial HR image satisfy the image observation model and further improves the image quality. Experiments with a relatively large number of images demonstrated the effectiveness of the proposed method in reconstructing the HR image.
Table 6. PSNR and SSIM results of reconstructed images by different methods on all image channels with σ = 6.

Methods    Measure   Boats    Butterfly   Lena     Leaf     Window   Avg.
Bicubic    PSNR      26.24    22.40       27.70    32.61    25.56    26.93
           SSIM      0.7836   0.7880      0.7577   0.8347   0.7312   0.7790
NE         PSNR      26.04    23.34       27.84    32.62    25.73    27.17
           SSIM      0.7638   0.8097      0.7457   0.8038   0.7288   0.7704
SCSR       PSNR      26.87    24.38       28.02    31.83    26.23    27.46
           SSIM      0.7618   0.8104      0.7266   0.7465   0.7358   0.7562
Zeyde's    PSNR      27.12    24.30       28.29    32.64    26.34    27.73
           SSIM      0.7866   0.8239      0.7522   0.7877   0.7512   0.7803
NLM_SKR    PSNR      27.01    24.54       28.29    32.38    26.52    27.74
           SSIM      0.7738   0.8267      0.7460   0.7745   0.7466   0.7735
Proposed   PSNR      27.13    24.43       28.44    32.79    26.57    27.87
           SSIM      0.7871   0.8285      0.7546   0.7954   0.7712   0.7874
Fig. 11. Reconstruction results on the single channel and on three channels with σ = 6 Gaussian noise. (a) LR image. (b) Single-channel result (PSNR = 33.99, SSIM = 0.7301). (c) Three-channel result (PSNR = 32.79, SSIM = 0.7954). (d) Original image.
Author contributions
This manuscript was prepared in collaboration among the authors. Weiguo Gong is the corresponding author of this research work. Lunting Hu proposed the local rank constraint and applied it to SISR. Weiguo Gong improved the reconstruction model. Jinming Li proposed the nonlocal and global optimization model. Weihong Li was involved in the writing and argumentation of the manuscript. All authors discussed and approved the final manuscript.
Acknowledgments
This work was supported by Key Projects of the National Science and Technology Program, China (Grant no. 2013GS500303), the Key Science and Technology Projects of CSTC, China (Grant nos. CSTC2012GG-YYJSB40001, CSTC2013-JCSF40009), the Application Development Program of CSTC (cstc2013yykfC60006) and the National Natural Science Foundation of China (Grant no. 61105093). The authors would like to thank the editors and reviewers for their valuable comments and suggestions.
References
[1] E. Abreu, S. Mitra, A signal-dependent rank ordered mean (SD-ROM) filter: a new approach for removal of impulses from highly corrupted images, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 1995, pp. 2371–2374.
[2] J. Banks, M. Bennamoun, Reliability analysis of the rank transform for stereo matching, IEEE Trans. Syst. Man Cybernet. Part B 31 (6) (2001) 870–880.
[3] T.F. Chan, N. Ng, A. Yau, A. Yip, Super-resolution image reconstruction using fast inpainting algorithms, Appl. Comput. Harmon. Anal. 23 (1) (2007) 3–24.
[4] T. Chan, J. Zhang, An improved super-resolution with manifold learning and histogram matching, in: Proceedings of the IAPR International Conference on Biometrics, 2005, pp. 756–762.
[5] T.M. Chan, J. Zhang, J. Pu, H. Huang, Neighbor embedding based super-resolution algorithm through edge detection and feature selection, Pattern Recognit. Lett. 30 (5) (2009) 494–502.
[6] H. Chang, D.Y. Yeung, Y. Xiong, Super-resolution through neighbor embedding, in: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004, pp. 275–282.
[7] W. Conover, R. Iman, Rank transformations as a bridge between parametric and nonparametric statistics, Am. Stat. 35 (3) (1981) 124–129.
[8] W. Dong, L. Zhang, R. Lukac, G. Shi, Sparse representation based image interpolation with nonlocal autoregressive modelling, IEEE Trans. Image Process. 22 (4) (2013) 1382–1394.
[9] W. Dong, L. Zhang, G. Shi, X. Li, Nonlocally centralized sparse representation for image restoration, IEEE Trans. Image Process. 22 (4) (2013) 1620–1630.
[10] W. Dong, L. Zhang, G. Shi, X. Wu, Image deblurring and super-resolution by adaptive sparse domain selection and adaptive regularization, IEEE Trans. Image Process. 20 (7) (2011) 1838–1857.
[11] R. Fattal, Image upsampling via imposed edge statistics, ACM Trans. Graph. 26 (3) (2007) 95–102.
[12] W.T. Freeman, T.R. Jones, E.C. Pasztor, Example-based super resolution, IEEE Comput. Graph. Appl. 22 (2) (2002) 56–65.
[13] H. Hou, H.C. Andrews, Cubic splines for image interpolation and digital filtering, IEEE Trans. Acoust. Speech Signal Process. 26 (6) (1978) 508–517.
[14] M. Hradis, A. Herout, P. Zemcik, Local rank patterns: novel features for rapid object detection, Comput. Vis. Graph. (2009) 239–248.
[15] K. Jia, X. Wang, X. Tang, Image transformation based on learning dictionaries across image spaces, IEEE Trans. Pattern Anal. Mach. Intell. 35 (2) (2013) 367–380.
[16] K. Kim, Y. Kwon, Single-image super-resolution using sparse regression and natural image prior, IEEE Trans. Pattern Anal. Mach. Intell. 32 (6) (2010) 1127–1133.
[17] J. Li, W. Gong, W. Li, Dual-sparsity regularized sparse representation for single image super-resolution, Inform. Sci. 298 (2015) 257–273.
[18] X. Li, M.T. Orchard, New edge-directed interpolation, IEEE Trans. Image Process. 10 (10) (2001) 1521–1527.
[19] W. Liu, S. Li, Sparse representation with morphologic regularizations for single image super-resolution, Signal Process. 98 (2014) 410–422.
[20] J. Lu, Y. Sun, Context-aware single image super-resolution using sparse representation and cross-scale similarity, Signal Process.-Image Commun. 32 (2015) 40–53.
[21] J. Mukherjee, Local rank transform: properties and applications, Pattern Recogn. Lett. 32 (7) (2011) 1001–1008.
[22] Z. Pan, J. Yu, H. Huang, S. Hu, A. Zhang, H. Ma, W. Sun, Super-resolution based on compressive sensing and structural self-similarity for remote sensing images, IEEE Trans. Geosci. Remote Sens. 51 (9) (2013) 4864–4876.
[23] S. Park, M. Park, M. Kang, Super-resolution image reconstruction: a technical overview, IEEE Signal Process. Mag. 20 (3) (2003) 21–36.
[24] T. Peleg, M. Elad, A statistical prediction model based on sparse representations for single image super-resolution, IEEE Trans. Image Process. 23 (6) (2014) 2569–2582.
[25] M. Protter, M. Elad, H. Takeda, P. Milanfar, Generalizing the nonlocal-means to super-resolution reconstruction, IEEE Trans. Image Process. 18 (1) (2009) 36–51.
[26] M. Sajjad, I. Mehmood, S. Baik, Image super-resolution using sparse coding over redundant dictionary based on effective image representations, J. Vis. Commun. Image Represent. 26 (2014) 50–65.
[27] S. Yang, Z. Liu, M. Wang, F. Sun, L. Jiao, Multitask dictionary learning and sparse representation based single-image super-resolution reconstruction, Neurocomputing 74 (17) (2011) 3193–3203.
[28] S. Yang, Y. Sun, Y. Chen, L. Jiao, Structural similarity regularized and sparse coding based super-resolution for medical images, Biomed. Signal Process. 7 (6) (2012) 579–590.
[29] S. Yang, M. Wang, Y. Chen, Y. Sun, Single-image super-resolution reconstruction via learned geometric dictionaries and clustered sparse coding, IEEE Trans. Image Process. 21 (9) (2012) 4016–4028.
[30] S. Yang, M. Wang, Y. Sun, F. Sun, L. Jiao, Compressive sampling based single-image super-resolution reconstruction by dual-sparsity and non-local similarity regularizer, Pattern Recogn. Lett. 33 (9) (2012) 1049–1059.
[31] J. Yang, J. Wright, T. Huang, Y. Ma, Image super-resolution via sparse representation, IEEE Trans. Image Process. 19 (11) (2010) 2861–2873.
[32] R. Zeyde, M. Elad, M. Protter, On single image scale-up using sparse-representations, in: Proceedings of the 7th International Conference on Curves and Surfaces, 2012, pp. 711–730.
[33] K. Zhang, X. Gao, D. Tao, X. Li, Single image super-resolution with non-local means and steering kernel regression, IEEE Trans. Image Process. 21 (11) (2012) 4544–4556.
[34] J. Zhang, D. Zhao, W. Gao, Group-based sparse representation for image restoration, IEEE Trans. Image Process. 23 (8) (2014) 3336–3351.
[35] F. Zhou, T. Yuan, W. Yang, Q. Liao, Single-image super-resolution based on compact KPCA coding and kernel regression, IEEE Signal Process. Lett. 22 (3) (2015) 336–340.
[36] M. Zontak, M. Irani, Internal statistics of a single natural image, in: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2011, pp. 977–984.
Weiguo Gong received his doctoral degree in computer science from the Tokyo Institute of Technology, Japan, in March 1996 on a Japanese Government scholarship. From April 1996 to March 2002, he served as a researcher or senior researcher at NEC Labs, Japan. He is now a professor at Chongqing University, China. He has published over 130 research papers in international journals and conferences and two books as an author or co-author. His current research interests are in the areas of pattern recognition and image processing.
Lunting Hu is a M.S. candidate in the College of Opto-Electronic Engineering, Chongqing University. His research interests are in image processing.
Jinming Li is a Ph.D. candidate in the College of Opto-Electronic Engineering, Chongqing University. His current research interests are in the areas of image processing.
Weihong Li received her doctoral degree from Chongqing University in 2006. Now she is an associate professor in Chongqing University. Her current research interests are in the areas of pattern recognition and image processing.