Accepted Manuscript
Combining Sparse Coding with Structured Output Regression Machine for Single Image Super-Resolution
Yongliang Tang, Weiguo Gong, Qiane Yi, Weihong Li
PII: S0020-0255(16)30735-6
DOI: 10.1016/j.ins.2017.12.001
Reference: INS 13294
To appear in: Information Sciences
Received date: 3 September 2016
Revised date: 22 October 2017
Accepted date: 4 December 2017
Please cite this article as: Yongliang Tang , Weiguo Gong , Qiane Yi , Weihong Li , Combining Sparse coding with Structured Output Regression Machine for Single Image Super-Resolution, Information Sciences (2017), doi: 10.1016/j.ins.2017.12.001
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Highlights
- The structured output regression machine (SORM) is improved by considering the correlation and independence between different output components. A model combining sparse coding with the improved SORM is applied to SISR.
- A global and nonlocal optimization with a new similarity weight is proposed to further improve the reconstructed HR images.
Combining Sparse Coding with Structured Output Regression Machine for Single Image Super-Resolution

Yongliang Tang, Weiguo Gong*, Qiane Yi, Weihong Li

Key Lab of Optoelectronic Technology & Systems of Education Ministry, Chongqing University, Chongqing 400044, China

Correspondence information: Weiguo Gong, Key Lab of Optoelectronic Technology & Systems of Education Ministry, Chongqing University, Chongqing 400044, China, [email protected], +86 023 65112779
Abstract

In this paper, considering that the dictionary pairs trained by sparse coding based super-resolution (SR) methods have difficulty capturing the complicated nonlinear relationships between the low-resolution (LR) and high-resolution (HR) feature spaces, we propose a new single image SR method that combines sparse coding with an improved structured output regression machine (SORM). In the proposed method, the dictionary pairs are first learned by joint sparse coding to characterize the structural domain of each feature space and to add more consistency between the sparse codes of the two feature spaces. Then, since the classical SORM does not give sufficient weight to the independence of different output components, we improve the SORM by considering both the correlation and the independence between different output components to establish a set of mapping functions tying the sparse codes of the two feature spaces. In this way, more precise mapping relationships between the two feature spaces are obtained through the trained dictionary pairs and mapping functions. Moreover, we propose a new global and nonlocal optimization for further enhancing the quality of the restored HR images. Extensive experiments validate that the proposed method achieves convincing improvement over other state-of-the-art methods in terms of reconstruction quality
* Corresponding author. Tel.: +86 023 65112779; E-mail address: [email protected]
and computational cost.

Keywords: Single image super-resolution, structured output regression machine, sparse coding, global and nonlocal optimization.
1 Introduction

High-resolution (HR) images often preserve more details and critical information for later image processing, analysis and interpretation. However, due to the limitations of the physical imaging system and disturbances from the external environment, it is difficult to obtain an image with the desired resolution [5]. As an effective technique that can obtain visually pleasing HR images from a low-cost imaging system under limited environmental conditions, image super-resolution (SR) has attracted much attention and produced extremely promising results for many civilian and military applications, such as criminal investigation [61], video surveillance [60], medical imaging [28], etc.
According to the number of input images, existing image SR methods can be divided into multi-frame SR [18,32,44] and single-frame or single image SR [6,11,21,35,36,39,40,45,46,49,58,59]. The multi-frame SR methods attempt to reconstruct an HR image by fusing a sequence of low-resolution (LR) images with subpixel displacements from the same scene. By contrast, single image SR (SISR) methods aim to recover the HR image from a given LR image by removing the degradations caused by the limitations of the physical imaging system and the imaging environment. In this paper, we primarily focus on the SISR problem.

SISR methods can be classified into interpolation-based, reconstruction-based, and learning-based methods. The interpolation-based methods, such as bicubic interpolation, edge-guided interpolation, and nearest-neighbor interpolation, typically adopt fixed-function kernels [21,45] or structure-adaptive kernels [11,40,59] to estimate the unknown pixels on the HR grid. Although the interpolation-based methods can reconstruct HR images in a very simple and efficient way, they are usually prone to
yield overly smooth and blurred images. Therefore, the reconstructed results are unsatisfactory for practical applications.

The reconstruction-based methods usually suppose that the observed LR image is the product of several degradation factors, such as blurring, down-sampling, and corruption with additive zero-mean white Gaussian noise [58]. Since many HR images may degrade to the same LR image, SISR is an inherently ill-posed problem. To tackle the ill-posed problem, certain priors or constraints need to be imposed in the SR process. Typical priors include edge-directed priors [6,39,46,49], gradient profile priors [34,35,36], Bayesian priors [1,24,29] and nonlocal self-similarity priors [23,27,30,57]. Although this kind of SR method is particularly effective at preserving geometric structure and suppressing ringing artifacts, it fails to add sufficient novel high-frequency details to the reconstructed HR image and is limited in reconstructing the visual complexity of the real image, especially at high magnification.
The learning-based methods presume that certain relationships exist between LR images and their corresponding HR counterparts and that these relationships can be learned from millions of co-occurring LR-HR image pairs before being used to reconstruct a new HR image. Since the learning-based methods adequately exploit the information in training or example images, they are able to recover the missing high-frequency details caused by the degradation factors and are superior to other SR methods. Depending on how the mapping relationships are established, the learning-based methods can be mainly divided into regression-based methods and coding-based methods.
The regression-based methods typically establish regression models to reveal the relationships between the LR and HR images. For example, Kim and Kwon [22] utilize kernel ridge regression (KRR) to estimate the high-frequency details of the desired HR image. He and Siu [16] predict each pixel of the HR image from its neighbors through Gaussian process regression (GPR) with a proper covariance function. Timofte et al. [42] reduce image SR to a projection problem from the input feature space to the HR feature space via anchored neighborhood regression (ANR). They further propose an improved ANR framework (A+) by combining the best qualities of ANR and simple functions [41]. Dong et al. [9] propose a deep learning method that trains a deep convolutional neural network to establish an end-to-end mapping between the LR and HR images. Kim et al. [19,20] present two highly accurate SISR methods by training a deeper convolutional network inspired by the VGG-net used for ImageNet classification [33] and a deeply-recursive convolutional network (up to 16 recursions). Recently, more baseline techniques have been proposed by Timofte et al. [43] to improve learning-based single image SR.
Another kind of learning-based method is the coding-based one, which attempts to capture the relationships between the LR and HR images by space transformation. The coding-based methods include NE-based learning [4,13,56], k-nearest neighbor (k-NN) learning [12,37], and sparse coding [10,15,25,47,50,52,53,54]. The k-NN and NE-based learning methods often need to search for similar patterns in an immense reference dataset to optimally represent the complicated structures in generic images, and therefore such image SR lacks efficiency for practical applications [58]. Recently, the sparse coding methods have shown promising performance in reconstructing a visually pleasing HR image from one LR image. In particular, Yang et al. [53] presume that the LR and HR feature spaces share the same sparse code with respect to their own dictionaries and jointly train a compact dictionary pair to capture the relationships between the two feature spaces.
However, since the jointly trained dictionary pair from the training phase cannot guarantee the co-occurrence of the sparse code in the reconstruction phase, it fails to accurately reveal the intrinsic relationships between the two feature spaces. Several methods have been proposed to alleviate this problem. Zeyde et al. [54] propose a two-step learning method, where the LR dictionary is learned by K-SVD and the HR one is generated via least squares. Although it can guarantee co-occurrence of the sparse code, the HR dictionary has larger errors in characterizing the structure of the HR feature space. Wang et al. [47] improve the SR result by training a pair of dictionaries and a linear mapping function simultaneously. However, a linear mapping function is often not enough to reveal the complicated nonlinear relationships between the two feature spaces. Yang et al. [52] use a bilevel optimization of the problem instead of solving the two optimization problems in the two feature spaces together in the training phase. Since the bilevel optimization is highly nonlinear and nonconvex, it is difficult to achieve an optimal solution. He et al. [15] utilize the beta process prior to learn over-complete dictionary pairs for a more consistent and accurate mapping between the two feature spaces. Yang et al. [50] propose a consistent coding scheme to guarantee the prediction accuracy of the HR sparse code. Nevertheless, the independently trained dictionaries for each feature space increase the difficulty of revealing the relationships via the established mapping
function in the sparse representation space. Li et al. [25] present a dual-sparsity regularized sparse representation (DSRSR) model that explores both the column and row nonlocal similarity priors among the sparse code matrices of different image patches. Dong et al. [10] present a unified image restoration framework by combining local sparsity, nonlocal self-similarity, and local autoregressive constraints, which performs well on image denoising, deblurring, and SR reconstruction.
Although the sparse coding methods have attained convincing improvement over many other state-of-the-art SR methods, several challenging issues still need to be addressed. First, most existing sparse coding methods attempt to train one or a set of dictionary pairs to reveal the complex, spatially variant and nonlinear relationships between the LR and HR feature spaces, which leads to complex SR models, large errors, and time or space sensitivity. Second, because the current SR methods are typically based on fixed-size image patches, which often ignore the inherent geometric structures in natural images, they easily cause ringing artifacts and increased reconstruction errors. Third, most sparse coding algorithms mainly focus on developing image priors to reconstruct details rather than preserving geometric features, which leads to the blurring of small-scale structures and excessively smooth image edges.

To address the aforementioned problems, we explore a new SISR method by
combining sparse coding with the improved structured output regression machine (SORM). In the proposed method, we first train a set of dictionary pairs by the joint dictionary training method to transform the model domain from the image space to the sparse coding space. Secondly, considering that the classical SORM does not give sufficient weight to the independence of different output components, we improve the SORM and make it more suitable for training the mapping functions in the sparse coding space, so as to better reveal the relationships between the sparse codes of the LR and HR feature spaces. Once the jointly trained dictionary pairs and the mapping functions are obtained, more accurate relationships between the two feature spaces can be established, and the missing details in a given LR input image can then be predicted efficiently. In addition, considering that reconstruction based on fixed-size image patches may cause ringing artifacts, a new global and nonlocal optimization is introduced into our framework to suppress these ringing artifacts and further improve the quality of the desired HR image. The major advantages of the proposed method are summarized as follows:

1) Since the establishment of the mapping functions is based on elementary structures (atoms) of the two feature spaces, the proposed method is able to better preserve geometric structures and suppress noise.

2) Compared to the consistent coding scheme [50], the jointly trained dictionary pairs can well characterize each feature space and add more consistency between the sparse codes of the two feature spaces. This additional consistency is beneficial for establishing the mapping functions.

3) Since the improved SORM considers not only the underlying correlations between the inputs and the corresponding outputs but also the correlations and independence between different output components, the trained mapping functions can more precisely reveal the complicated nonlinear relationships between the two feature spaces.

4) Because the geometric structure information is adequately considered in measuring the similarity between image patches, the proposed global and nonlocal optimization can better suppress ringing artifacts and improve the final HR image quality.
The remainder of the paper is organized as follows. Section 2 briefly reviews the related work that is important to the proposed method. Section 3 details our formulation and solution of the image SR problem based on sparse coding and the improved SORM. Various experimental results in Section 4 demonstrate the efficacy of the proposed method. The conclusion of this paper is drawn in Section 5.

2 Related Work

In this section, we describe two related dictionary learning methods, sparse coding in a single feature space and couple sparse coding in two feature spaces for the SR problem, both of which are important to the proposed method.

2.1 Sparse Coding
For an image vector $\mathbf{x} \in \mathbb{R}^{N}$, we denote the $i$th local patch of $\mathbf{x}$ by $\mathbf{x}_i = \mathbf{R}_i\mathbf{x}$, where $\mathbf{R}_i$ is a matrix that extracts the patch $\mathbf{x}_i$ from $\mathbf{x}$ at spatial location $i$. The goal of sparse coding is to represent the patch $\mathbf{x}_i \in \mathbb{R}^{n}$ approximately as a weighted linear combination of a few basis atoms, often chosen from an over-complete dictionary $\mathbf{D}_x \in \mathbb{R}^{n \times K}$ ($n \ll K$). Compared with classical signal representations, the key advantage of sparse coding is the ability to learn such a dictionary $\mathbf{D}_x$ from a large number of samples. If there are enough image patches to form a training set $\{\mathbf{x}_i\}_{i=1}^{N}$, the dictionary learning problem is solved by minimizing the following energy function, which combines squared reconstruction errors and $\ell_1$-sparsity penalties:

$$\min_{\mathbf{D}_x,\,\{\boldsymbol{\alpha}_i\}_{i=1}^{N}} \sum_{i=1}^{N}\left\{\|\mathbf{x}_i - \mathbf{D}_x\boldsymbol{\alpha}_i\|_2^2 + \lambda\|\boldsymbol{\alpha}_i\|_1\right\} \quad \text{s.t. } \|\mathbf{D}_x(:,k)\|_2 \le 1,\ \forall k \in \{1,2,\cdots,K\} \tag{1}$$

where $\boldsymbol{\alpha}_i$ is the sparse code of $\mathbf{x}_i$, $\mathbf{D}_x(:,k)$ is the $k$th column of $\mathbf{D}_x$, and $\lambda$ is a parameter controlling the trade-off between the sparsity penalty and the representation fidelity. Since the above optimization problem is convex in either $\mathbf{D}_x$ or $\{\boldsymbol{\alpha}_i\}_{i=1}^{N}$ when the other is fixed, it can be effectively solved by many $\ell_1$-minimization techniques, such as the iterative shrinkage algorithm [7] and the split Bregman algorithm [55]. Once the dictionary $\mathbf{D}_x$ is trained, the sparse code for a given patch $\mathbf{x}_i$ can be obtained by minimizing

$$\hat{\boldsymbol{\alpha}}_i = \arg\min_{\boldsymbol{\alpha}_i}\left\{\|\mathbf{x}_i - \mathbf{D}_x\boldsymbol{\alpha}_i\|_2^2 + \lambda\|\boldsymbol{\alpha}_i\|_1\right\} \tag{2}$$
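For illustration, Eqn. (2) can be solved with the iterative shrinkage scheme (ISTA) mentioned above. The following NumPy sketch codes a single patch against a toy random dictionary; the dictionary, patch, and parameter values are assumptions for illustration, not the trained dictionaries of this paper.

```python
import numpy as np

def soft_threshold(z, t):
    # Proximal operator of t * ||.||_1
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista_sparse_code(x, D, lam=0.1, n_iter=500):
    """Solve Eqn. (2), min_a ||x - D a||_2^2 + lam ||a||_1, by iterative shrinkage."""
    Lf = 2.0 * np.linalg.norm(D, 2) ** 2      # Lipschitz constant of the smooth term
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * D.T @ (D @ a - x)        # gradient of ||x - D a||_2^2
        a = soft_threshold(a - grad / Lf, lam / Lf)
    return a

# Toy usage: a random dictionary with unit-norm atoms and a 2-sparse signal.
rng = np.random.default_rng(0)
D = rng.standard_normal((8, 16))
D /= np.linalg.norm(D, axis=0)                # enforce ||D(:,k)||_2 <= 1
a_true = np.zeros(16)
a_true[[3, 11]] = [1.0, -0.8]
x = D @ a_true
a_hat = ista_sparse_code(x, D, lam=0.01)
```

In practice, accelerated variants (FISTA) or LARS-based solvers converge faster; this plain version only illustrates the shrinkage step.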
2.2 Couple Sparse Coding
Unlike general sparse coding, couple sparse coding considers the problem of two coupled feature spaces. The observed LR image can be seen as a degraded product of the HR image, which is generally formulated as [51]

$$\mathbf{y} = \mathbf{DH}\mathbf{x} + \mathbf{v} \tag{3}$$

where $\mathbf{x}$ and $\mathbf{y}$ represent the original HR image and the observed LR image respectively, $\mathbf{D}$ is the down-sampling operator, $\mathbf{H}$ is the blurring filter, and $\mathbf{v}$ represents the additive noise. The recovery of the HR image $\mathbf{x}$ from the observation $\mathbf{y}$ is thus a typical coupled feature space problem. In order to tackle this problem, Yang et al. [53] propose to train two dictionaries $\mathbf{D}_y$ and $\mathbf{D}_x$ to tie the LR image feature space $\mathbf{y}$ and its corresponding HR image feature space $\mathbf{x}$, by postulating that the sparse code of $\mathbf{y}_i \in \mathbf{y}$ in terms of $\mathbf{D}_y$ is the same as that of $\mathbf{x}_i \in \mathbf{x}$ in terms of $\mathbf{D}_x$. To train the dictionaries $\mathbf{D}_y$ and $\mathbf{D}_x$, a joint dictionary training method for couple sparse coding is proposed by generalizing the general sparse coding scheme as follows:

$$\min_{\mathbf{D}_x,\mathbf{D}_y,\,\{\boldsymbol{\alpha}_i\}_{i=1}^{N}} \sum_{i=1}^{N}\left\{\|\mathbf{x}_i - \mathbf{D}_x\boldsymbol{\alpha}_i\|_2^2 + \|\mathbf{y}_i - \mathbf{D}_y\boldsymbol{\alpha}_i\|_2^2 + \lambda\|\boldsymbol{\alpha}_i\|_1\right\} \quad \text{s.t. } \|\mathbf{D}_x(:,k)\|_2 \le 1,\ \|\mathbf{D}_y(:,k)\|_2 \le 1,\ \forall k \in \{1,2,\cdots,K\} \tag{4}$$

Grouping the two reconstruction error terms together and denoting

$$\tilde{\mathbf{x}}_i = \begin{bmatrix}\mathbf{x}_i\\ \mathbf{y}_i\end{bmatrix},\qquad \tilde{\mathbf{D}} = \begin{bmatrix}\mathbf{D}_x\\ \mathbf{D}_y\end{bmatrix} \tag{5}$$

Eqn. (4) can be converted to the standard sparse coding form

$$\min_{\tilde{\mathbf{D}},\,\{\boldsymbol{\alpha}_i\}_{i=1}^{N}} \sum_{i=1}^{N}\left\{\|\tilde{\mathbf{x}}_i - \tilde{\mathbf{D}}\boldsymbol{\alpha}_i\|_2^2 + \lambda\|\boldsymbol{\alpha}_i\|_1\right\} \quad \text{s.t. } \|\tilde{\mathbf{D}}(:,k)\|_2 \le 1,\ \forall k \in \{1,2,\cdots,K\} \tag{6}$$
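Since Eqns. (5)-(6) reduce joint training to standard sparse coding on concatenated patch pairs, the scheme can be prototyped with an off-the-shelf dictionary learner. The sketch below uses scikit-learn on random stand-in features (real HR/LR patch features would replace them); it illustrates the concatenation trick only and is not the solver used in this paper. Note that scikit-learn stores atoms as rows, the transpose of the column convention used in the text.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(0)
X_hr = rng.standard_normal((200, 25))        # stand-in HR patch features x_i
Y_lr = rng.standard_normal((200, 9))         # stand-in LR patch features y_i
Z = np.hstack([X_hr, Y_lr])                  # tilde-x_i of Eqn. (5): stacked pairs

# Standard sparse coding on the stacked vectors, as in Eqn. (6).
model = DictionaryLearning(n_components=32, alpha=0.5, max_iter=5,
                           random_state=0)
codes = model.fit_transform(Z)               # shared sparse codes alpha_i
D_tilde = model.components_                  # rows are atoms of tilde-D
D_x, D_y = D_tilde[:, :25], D_tilde[:, 25:]  # split back into the coupled pair
```

At reconstruction time only `D_y` is available for coding the LR input, which is exactly the inconsistency discussed next.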
Thus, once the dictionaries $\mathbf{D}_y$ and $\mathbf{D}_x$ are obtained by minimizing Eqn. (6), we can recover the desired HR image $\mathbf{x}$ from its corresponding observed LR image $\mathbf{y}$. However, since there are no equivalent constraints on the sparse codes of $\mathbf{x}$ and $\mathbf{y}$ as in the training phase, we can only obtain the sparse code of $\mathbf{y}$ with respect to $\mathbf{D}_y$ and use it as an approximation to the sparse code of $\mathbf{x}$ in terms of $\mathbf{D}_x$, which leads to large errors in reconstructing the HR image.

To alleviate the problems of jointly learned dictionaries, Yang et al. [52] further
propose a bilevel dictionary training method to alternately optimize the dictionaries $\mathbf{D}_x$ and $\mathbf{D}_y$. When $\mathbf{D}_x$ is fixed, the optimization over $\mathbf{D}_y$ is solved by

$$\min_{\mathbf{D}_y} \sum_{i=1}^{N}\|\mathbf{x}_i - \mathbf{D}_x\hat{\boldsymbol{\alpha}}_i\|_2^2 \quad \text{s.t. } \hat{\boldsymbol{\alpha}}_i = \arg\min_{\boldsymbol{\alpha}_i}\|\mathbf{y}_i - \mathbf{D}_y\boldsymbol{\alpha}_i\|_2^2 + \lambda\|\boldsymbol{\alpha}_i\|_1,\quad \|\mathbf{D}_y(:,k)\|_2 \le 1,\ \forall k \in \{1,2,\cdots,K\} \tag{7}$$

where $\hat{\boldsymbol{\alpha}}_i$ is the sparse code of each $\mathbf{y}_i \in \mathbf{y}$ with respect to $\mathbf{D}_y$. Minimizing Eqn. (7) over $\mathbf{D}_y$ is a complicated nonlinear and nonconvex bilevel programming problem, because the independent variable $\hat{\boldsymbol{\alpha}}_i$ of the upper-level optimization is the optimum of a lower-level $\ell_1$-minimization. For that reason, Yang et al. [52] explore a first-order projected stochastic gradient descent algorithm to search for a feasible direction that decreases the objective value of the upper-level optimization. Although, compared with the joint dictionary training method, the bilevel dictionary training method for couple sparse coding achieves some improvement in recovering the desired HR image because it establishes a more accurate mapping between the two feature spaces, it still has the following problems:

1) Owing to the high nonlinearity and nonconvexity of Eqn. (7), it is difficult to find a local optimum that noticeably decreases the objective function value.

2) During each iteration of dictionary training, an $\ell_1$-minimization problem must be solved for each $\mathbf{y}_i$ in the training set $\{\mathbf{y}_i\}_{i=1}^{N}$, which dramatically increases the computational burden.
Moreover, the couple sparse coding method is still a linear model. As a result, the dictionaries $\mathbf{D}_x$ and $\mathbf{D}_y$ learned by couple sparse coding still have difficulty exactly capturing the complex nonlinear relationship between the two feature spaces, and also increase the computational burden.

3 Proposed method

To alleviate the above-mentioned problems, we propose a new method that combines sparse coding with the improved SORM to reveal the complicated nonlinear relationships between the LR feature space and its corresponding HR feature space. Moreover, a new global and nonlocal optimization model is introduced into the
proposed method to further improve the reconstructed result. Therefore, in this section, we first describe the improved SORM and the global and nonlocal optimization model, and then discuss how to combine them into our framework to solve the SISR problem.

3.1 Improved Structured Output Regression Machine

Multi-output regression aims to predict an output vector from an input vector by learning the relationships between the input space and the output space. Since classical multi-output regression models are often constructed by assembling multiple single-output regressions, they ignore the correlations between the different output components, which reduces the regression accuracy [2,38]. In order to utilize the possible correlations to improve the regression accuracy, Brudnak [3] proposes a vector-valued support vector regression (SVR) model by extending the estimator, regularization, and loss function from scalar-valued to vector-valued forms. Xu et al. [48] propose a multiple-output least-squares SVR (MLS-SVR) model that takes full consideration of the circumstance that each sub-model may affect the other sub-models during the training phase. Since this kind of MLS-SVR model considers the correlations between different output components, we also call it the structured output regression machine (SORM) model [8].

Although the SORM improves the regression accuracy by utilizing the correlations, it does not give sufficient weight to the independence of different output components. Hence, we improve the SORM by fully considering both the correlation and the independence between different output components. In the following, we detail the improved SORM.
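To make the derivation below concrete, the following toy sketch exercises the improved SORM numerically: it assembles and solves the dual linear system of Eqn. (13) and evaluates the predictive model of Eqn. (15), both derived in this subsection. The output-major stacking of the dual variables, the single shared RBF bandwidth, and all data and parameter values are simplifying assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def rbf(U, V, sigma=1.0):
    # Gram matrix: K[i, j] = exp(-||U_i - V_j||^2 / sigma^2), cf. Eqn. (14)
    d2 = ((U[:, None, :] - V[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / sigma ** 2)

def sorm_fit(U, Y, lam, gam, sigma=1.0):
    """Assemble and solve the dual KKT system of Eqn. (13).
    U: (l, d) inputs; Y: (l, m) outputs; lam, gam: (m,) per-output parameters.
    Dual variables are stacked output-major: alpha = (alpha^(1), ..., alpha^(m))."""
    l, m = Y.shape
    K = rbf(U, U, sigma)
    P = np.kron(np.eye(m), np.ones((l, 1)))                   # blockdiag{1_l, ..., 1_l}
    Omega = np.kron(np.ones((m, m)) + np.diag(1.0 / lam), K)  # shared w0 + individual w_j
    D = np.kron(np.diag(1.0 / gam), np.eye(l))                # from alpha_i = Gamma e_i
    A = np.block([[np.zeros((m, m)), P.T], [P, Omega + D]])
    sol = np.linalg.solve(A, np.concatenate([np.zeros(m), Y.T.ravel()]))
    return sol[:m], sol[m:].reshape(m, l)                     # b, alpha

def sorm_predict(U_train, U_new, b, alpha, lam, sigma=1.0):
    """Predictive model of Eqn. (15): shared-w0 part plus per-output parts."""
    k = rbf(U_train, U_new, sigma)                 # (l, n_new)
    shared = alpha.sum(axis=0) @ k                 # contribution of w_0
    indiv = (alpha / lam[:, None]) @ k             # contribution of each w_j-hat
    return (shared + indiv + b[:, None]).T         # (n_new, m)

# Toy check: with very large gam the model nearly interpolates its training data.
rng = np.random.default_rng(1)
U = rng.standard_normal((20, 3))
Y = np.stack([np.sin(U[:, 0]), np.cos(U[:, 1])], axis=1)
lam, gam = np.full(2, 1.0), np.full(2, 1e6)
b, alpha = sorm_fit(U, Y, lam, gam)
Y_hat = sorm_predict(U, U, b, alpha, lam)
```

A dense solve is used for brevity; at the training-set sizes used for SR, iterative solvers (CG, GMRES) are preferable, as discussed below.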
Given a training set $\{(\mathbf{u}_i, \mathbf{v}_i)\}_{i=1}^{l}$, where $\mathbf{u}_i \in \mathbb{R}^{d}$ is an input vector with dimensionality $d$ and $\mathbf{v}_i \in \mathbb{R}^{m}$ is an $m$-dimensional output vector, the regression function of the SORM in the primal weight space is formulated as

$$f(\mathbf{u}) = \mathbf{W}^{T}\boldsymbol{\varphi}(\mathbf{u}) + \mathbf{b} \tag{8}$$

where $\mathbf{W} = (\mathbf{w}_1, \mathbf{w}_2, \cdots, \mathbf{w}_m) \in \mathbb{R}^{n_h \times m}$, $\mathbf{b} \in \mathbb{R}^{m}$, and $\boldsymbol{\varphi}(\cdot): \mathbb{R}^{d} \to \mathbb{R}^{n_h}$ maps the input vector to a high-dimensional feature space. According to the structural risk minimization principle, the SORM problem can be expressed as the constrained optimization problem

$$\min_{\mathbf{w}_j,\mathbf{b},\mathbf{e}_i} J(\mathbf{w}_j, \mathbf{e}_i) = \frac{1}{2}\sum_{j=1}^{m}\mathbf{w}_j^{T}\mathbf{w}_j + \frac{1}{2}\sum_{j=1}^{m}\gamma_j\sum_{i=1}^{l}\left(e_i^{(j)}\right)^2 \quad \text{s.t. } \mathbf{v}_i = \mathbf{W}^{T}\boldsymbol{\varphi}(\mathbf{u}_i) + \mathbf{b} + \mathbf{e}_i,\ \forall i \in \{1,2,\cdots,l\} \tag{9}$$

where $\mathbf{e}_i \in \mathbb{R}^{m}$ is the error variable, $e_i^{(j)}$ is the $j$th element of the vector $\mathbf{e}_i$, and $\gamma_j$ ($\forall j \in \{1,2,\cdots,m\}$, $\gamma_j > 0$) are the regularization parameters. In order to consider the correlations, Xu et al. [48] assume that all $\mathbf{w}_j \in \mathbb{R}^{n_h}$ ($\forall j \in \{1,2,\cdots,m\}$) can be written as $\mathbf{w}_j = \mathbf{w}_0 + \hat{\mathbf{w}}_j$, where the vectors $\hat{\mathbf{w}}_j$ are small when the different outputs are similar to each other, and otherwise the mean vector $\mathbf{w}_0$ is small. Accordingly, Eqn. (9) can be rewritten as follows:

$$\min_{\mathbf{w}_0,\hat{\mathbf{w}}_j,\mathbf{b},\mathbf{e}_i} J(\mathbf{w}_0, \hat{\mathbf{w}}_j, \mathbf{e}_i) = \frac{1}{2}\mathbf{w}_0^{T}\mathbf{w}_0 + \frac{1}{2}\sum_{j=1}^{m}\lambda_j\hat{\mathbf{w}}_j^{T}\hat{\mathbf{w}}_j + \frac{1}{2}\sum_{j=1}^{m}\gamma_j\sum_{i=1}^{l}\left(e_i^{(j)}\right)^2 \quad \text{s.t. } \mathbf{v}_i = \mathbf{W}^{T}\boldsymbol{\varphi}(\mathbf{u}_i) + \mathbf{b} + \mathbf{e}_i,\ \forall i \in \{1,2,\cdots,l\} \tag{10}$$

where $\lambda_j$ ($\forall j \in \{1,2,\cdots,m\}$, $\lambda_j > 0$) are the regularization parameters that control the weight of the correlation. Since the improved SORM can optimize the parameters $\lambda_j$ and $\gamma_j$ for each output component, the weight of the correlations can be tuned for better regression accuracy; in effect, it preserves the independence between the different output components. The Lagrangian function for Eqn. (10) is
$$L(\mathbf{w}_j, \mathbf{b}, \mathbf{e}_i, \boldsymbol{\alpha}_i) = J(\mathbf{w}_0, \hat{\mathbf{w}}_j, \mathbf{e}_i) - \sum_{i=1}^{l}\boldsymbol{\alpha}_i^{T}\left(\mathbf{W}^{T}\boldsymbol{\varphi}(\mathbf{u}_i) + \mathbf{b} + \mathbf{e}_i - \mathbf{v}_i\right) \tag{11}$$

where $\boldsymbol{\alpha}_i \in \mathbb{R}^{m}$ ($\forall i \in \{1,2,\cdots,l\}$) are the Lagrange multipliers. According to the KKT conditions, setting the partial derivatives of Eqn. (11) to zero gives

$$\begin{cases} \dfrac{\partial L}{\partial \mathbf{w}_0} = 0 \Rightarrow \mathbf{w}_0 = \sum_{i=1}^{l}\Big(\sum_{j=1}^{m}\alpha_i^{(j)}\Big)\boldsymbol{\varphi}(\mathbf{u}_i)\\[4pt] \dfrac{\partial L}{\partial \hat{\mathbf{w}}_j} = 0 \Rightarrow \hat{\mathbf{w}}_j = \dfrac{1}{\lambda_j}\sum_{i=1}^{l}\alpha_i^{(j)}\boldsymbol{\varphi}(\mathbf{u}_i)\\[4pt] \dfrac{\partial L}{\partial \mathbf{b}} = 0 \Rightarrow \sum_{i=1}^{l}\boldsymbol{\alpha}_i = \mathbf{0}_m\\[4pt] \dfrac{\partial L}{\partial \mathbf{e}_i} = 0 \Rightarrow \boldsymbol{\alpha}_i = \boldsymbol{\Gamma}\mathbf{e}_i\\[4pt] \dfrac{\partial L}{\partial \boldsymbol{\alpha}_i} = 0 \Rightarrow \mathbf{W}^{T}\boldsymbol{\varphi}(\mathbf{u}_i) + \mathbf{b} + \mathbf{e}_i - \mathbf{v}_i = \mathbf{0}_m \end{cases} \tag{12}$$

where $i = 1,2,\cdots,l$, $j = 1,2,\cdots,m$, $\boldsymbol{\Gamma} = \mathrm{diag}(\gamma_1,\cdots,\gamma_m) \in \mathbb{R}^{m \times m}$, and $\alpha_i^{(j)}$ is the $j$th element of the vector $\boldsymbol{\alpha}_i$. By eliminating the variables $\mathbf{W}$ and $\mathbf{e}_i$, we obtain the following KKT system of linear equations:

$$\begin{bmatrix}\mathbf{0}_{m \times m} & \mathbf{P}^{T}\\ \mathbf{P} & \boldsymbol{\Omega} + \mathbf{D}\end{bmatrix}\begin{bmatrix}\mathbf{b}\\ \boldsymbol{\alpha}\end{bmatrix} = \begin{bmatrix}\mathbf{0}_{m}\\ \mathbf{v}\end{bmatrix} \tag{13}$$

where

$\mathbf{P} = \mathrm{blockdiag}\{\mathbf{1}_l, \cdots, \mathbf{1}_l\} \in \mathbb{R}^{ml \times m}$, with $\mathbf{1}_l$ the $l$-dimensional all-ones column vector;

$\boldsymbol{\Omega} = \left(\mathbf{E}_m + \mathrm{diag}(1/\lambda_1, \cdots, 1/\lambda_m)\right) \otimes \mathbf{K} \in \mathbb{R}^{ml \times ml}$, where the $m \times m$ all-ones matrix $\mathbf{E}_m$ arises from the shared vector $\mathbf{w}_0$ and the diagonal term from the individual vectors $\hat{\mathbf{w}}_j$;

$\mathbf{K} = \left(\boldsymbol{\varphi}(\mathbf{u}_1), \cdots, \boldsymbol{\varphi}(\mathbf{u}_l)\right)^{T}\left(\boldsymbol{\varphi}(\mathbf{u}_1), \cdots, \boldsymbol{\varphi}(\mathbf{u}_l)\right) \in \mathbb{R}^{l \times l}$;

$\mathbf{D} = \boldsymbol{\Gamma}^{-1} \otimes \mathbf{I}_l \in \mathbb{R}^{ml \times ml}$, with $\mathbf{I}_l$ the $l \times l$ identity matrix;

$\boldsymbol{\alpha} = \left((\boldsymbol{\alpha}^{(1)})^{T}, \cdots, (\boldsymbol{\alpha}^{(m)})^{T}\right)^{T} \in \mathbb{R}^{ml}$ and $\mathbf{v} = \left((\mathbf{v}^{(1)})^{T}, \cdots, (\mathbf{v}^{(m)})^{T}\right)^{T} \in \mathbb{R}^{ml}$, where $\boldsymbol{\alpha}^{(j)} = (\alpha_1^{(j)}, \cdots, \alpha_l^{(j)})^{T}$ and $\mathbf{v}^{(j)} = (v_1^{(j)}, \cdots, v_l^{(j)})^{T}$ collect the $j$th components over all samples;

and $\otimes$ is the Kronecker product. Thus, the matrix $\mathbf{K}$ is composed of the kernel
function $\mathcal{K}(\mathbf{u}_p, \mathbf{u}_q) = \boldsymbol{\varphi}(\mathbf{u}_p)^{T}\boldsymbol{\varphi}(\mathbf{u}_q)$ ($\forall p, q \in \{1,2,\cdots,l\}$). There is a variety of kernel functions for different problems, including the linear kernel, the polynomial kernel, the radial basis function (RBF) kernel, and so on. Since our problem is a complicated nonlinear one, the following RBF kernel function is applied in the improved model:

$$\mathcal{K}_j(\mathbf{u}_p, \mathbf{u}_q) = \boldsymbol{\varphi}(\mathbf{u}_p)^{T}\boldsymbol{\varphi}(\mathbf{u}_q) = \exp\left(-\|\mathbf{u}_p - \mathbf{u}_q\|_2^2/\sigma_j^2\right),\quad j = 1,\cdots,m \tag{14}$$

where $\sigma_j$ is the scale parameter related to the bandwidth of the kernel; studies in statistics have shown that the bandwidth is an important parameter for the generalization behavior of a kernel method. Since the KKT system in Eqn. (13) consists of $m(l+1)$ linear equations and the RBF is a positive definite kernel, it can be reliably solved by many algorithms developed in numerical analysis, including Conjugate Gradient (CG), Successive Over-Relaxation (SOR), Generalized Minimal Residual (GMRES), etc. Once $\boldsymbol{\alpha}$ and $\mathbf{b}$ are obtained by solving Eqn. (13), the predictive model of the improved SORM can be expressed, component by component, as

$$f_j(\mathbf{u}) = \mathbf{w}_j^{T}\boldsymbol{\varphi}(\mathbf{u}) + b_j = \sum_{i=1}^{l}\Big(\sum_{j'=1}^{m}\alpha_i^{(j')}\Big)\mathcal{K}(\mathbf{u}_i, \mathbf{u}) + \frac{1}{\lambda_j}\sum_{i=1}^{l}\alpha_i^{(j)}\mathcal{K}(\mathbf{u}_i, \mathbf{u}) + b_j,\quad j = 1,\cdots,m \tag{15}$$

which follows from substituting $\mathbf{w}_j = \mathbf{w}_0 + \hat{\mathbf{w}}_j$ and the expressions for $\mathbf{w}_0$ and $\hat{\mathbf{w}}_j$ in Eqn. (12). Unlike the classical SORM, the improved SORM considers not only the correlation among different output components but also their independence, by tuning the parameters $\{\lambda_j, \gamma_j, \sigma_j\}$ for each output component. Therefore, the improved SORM is superior to the classical one. In this paper, since the sparse code has strong sparsity (only a few components are nonzero), which implies a high similarity between different components, and strong independence (each component represents a very different
elementary structure), the improved SORM is more suitable for solving our problem.

3.2 Global and Nonlocal Optimization

Because of the patch-based reconstruction and other factors, the estimated HR image $\mathbf{x}_0$ may not satisfy Eqn. (3) exactly and may contain ringing artifacts. Inspired by [26,53], the following global constraint is introduced into the proposed method as a regularization term to enforce the estimated HR image $\mathbf{x}_0$ to satisfy the image degradation model:

$$\mathbf{x}^{*} = \arg\min_{\mathbf{x}}\|\mathbf{DH}\mathbf{x} - \mathbf{y}\|_2^2 + \delta\|\mathbf{x} - \mathbf{x}_0\|_2^2 \tag{16}$$

where $\delta$ is the regularization parameter and $\mathbf{DH}$ is the linear operator that models the degradation process of blurring and down-sampling. At the same time, in order to eliminate the ringing artifacts caused by noise and other factors, we exploit the nonlocal self-similarity, which is a useful prior for various image restoration tasks. The nonlocal self-similarity prior presumes that a small patch in a natural image tends to repeat itself many times redundantly and can be approximated by a convex combination of its similar image patches [14]. Mathematically, the constraint is given by

$$\mathbf{x}_p = \sum_{q}w_p^{q}\mathbf{x}_p^{q} \tag{17}$$

where $\mathbf{x}_p$ is the image patch at the $p$th position of the image, $\mathbf{x}_p^{q}$ is the $q$th similar image patch of $\mathbf{x}_p$ within a search window, and $w_p^{q}$ denotes the similarity weight, which can be computed by

$$w_p^{q} = \frac{1}{c_p}\exp\left(-\frac{\|\mathbf{x}_p - \mathbf{x}_p^{q}\|_2^2}{h}\right) \tag{18}$$
where $c_p = \sum_{q}w_p^{q}$ is a normalization factor and $h$ is the attenuator that controls the extent of averaging. However, since this similarity weight merely takes into account the Euclidean distance between image patches, it is less sensitive to image patches with fine texture structure. In order to overcome this drawback, a new similarity weight is proposed by utilizing the spatial and intensity information of the pixels. The proposed similarity weight is calculated by the following formula:

$$w_p^{q} = \frac{1}{c_p}\exp\left(-\frac{\|\mathbf{x}_p - \mathbf{x}_p^{q}\|_2^2}{h_1} - \frac{|D_{gd}(s_c, s_r) - D_{gd}^{q}(s_c, s_r)|}{h_2}\right) \tag{19}$$

where $h_1$ and $h_2$ are the attenuators, $D_{gd}(s_c, s_r)$ is the geodesic distance between a pixel $s_r$ of the image patch $\mathbf{x}_p$ and its center $s_c$, and $D_{gd}^{q}(s_c, s_r)$ is the geodesic distance for the same pixel and path in the image patch $\mathbf{x}_p^{q}$. $D_{gd}(s_c, s_r)$ is defined over the shortest path that connects $s_r$ and $s_c$:

$$D_{gd}(s_c, s_r) = \min_{\rho \in P_{s_c, s_r}} d(\rho) \tag{20}$$

where $P_{s_c, s_r}$ is the set of all paths connecting the pixel $s_r$ and the center $s_c$. A path $\rho$ is defined as a sequence of spatially neighboring points in an 8-connected neighborhood that connects the pixel $s_r$ and the center $s_c$. Letting the sequence $\rho$ be $\rho = \{\rho_1, \rho_2, \cdots, \rho_{n_\rho}\}$, $d(\rho)$ is computed by

$$d(\rho) = \sum_{k=2}^{n_\rho}\left|I(\rho_k) - I(\rho_{k-1})\right| \tag{21}$$

where $I(\cdot)$ denotes the pixel intensity. Intuitively, if the difference of geodesic distances between the image patches $\mathbf{x}_p$ and $\mathbf{x}_p^{q}$ is low, the two patches have more similar textures or geometric structures, and a larger similarity weight should be given to the image patch $\mathbf{x}_p^{q}$. Although computing the geodesic distance for all pixels in an image patch is computationally expensive, a fast approximation algorithm was reported in [17]. Moreover, for each image patch, we only need to compute the geodesic distance of one randomly chosen pixel. In addition, because the same pixel and path are used, it is easy to compute the geodesic distance for its similar image patches. Therefore, the new similarity weight maintains a lower
CR IP T
computational cost. By incorporating the nonlocal self-similarity constraint into Eqn. (16), the following global and nonlocal optimization can be obtained
$$ X^* = \arg\min_X \|DHX - Y\|_2^2 + \delta \|X - X_0\|_2^2 + \eta \|X - WX\|_2^2 \qquad (22) $$

where $\eta$ is the regularization parameter and $W$ is the similarity weight matrix, which is computed by

$$ W(i, j) = \begin{cases} w_{ij}, & \text{if } x_j \text{ is a similar image patch to } x_i \\ 0, & \text{otherwise} \end{cases} \qquad (23) $$
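A minimal sketch of assembling the matrix $W$ of Eqn. (23) is given below. For brevity it keeps only the Euclidean term of the weight and marks a fixed number of nearest patches as "similar", both simplifications of the full weight defined in Eqn. (19); the function name is illustrative.

```python
import numpy as np

def build_weight_matrix(patches, n_similar=3, h1=3.0):
    # Row i of W holds the normalized weights of the n_similar patches
    # most similar to patch i; all other entries are zero, as in Eqn. (23).
    P = np.array([p.ravel() for p in patches], dtype=float)
    d2 = ((P[:, None, :] - P[None]) ** 2).sum(-1)   # pairwise squared distances
    W = np.zeros_like(d2)
    for i in range(len(P)):
        idx = np.argsort(d2[i])[1:1 + n_similar]    # skip the patch itself
        W[i, idx] = np.exp(-d2[i, idx] / h1)
        W[i] /= W[i].sum()                           # the 1/c_i normalization
    return W
```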
3.3 Algorithm of the Proposed Method for SISR
In Sections 3.1 and 3.2, we introduced the improved SORM and the global and nonlocal optimization model. Now we discuss how to incorporate them into the proposed method to solve the image SR problem. As shown in Fig. 1, the proposed SISR method is composed of two parts: the learning phase and the HR image reconstruction phase.
(1) Learning phase: Firstly, we extract image patches from the training LR images $\{Y\}$ and their corresponding HR images $\{X\}$ to form a coupled feature space $\{(x_i, y_i)\}_{i=1}^{N}$. Secondly, the coupled feature space $\{(x_i, y_i)\}_{i=1}^{N}$ is partitioned into $K$ sub-feature spaces $\{(x_i^1, y_i^1)\}_{i=1}^{N_1}, \{(x_i^2, y_i^2)\}_{i=1}^{N_2}, \dots, \{(x_i^K, y_i^K)\}_{i=1}^{N_K}$ by k-means with the Euclidean distance. For each sub-feature space $\{(x_i^k, y_i^k)\}_{i=1}^{N_k}$, $\forall k \in \{1,2,\dots,K\}$, a compact LR-HR sub-dictionary pair $\{D_k^L, D_k^H\}$ is trained via joint sparse coding, and then the sub-feature space $\{(x_i^k, y_i^k)\}_{i=1}^{N_k}$ is transformed into the sparse coding space to form a sparse code set $\{(\alpha_i^{l,k}, \alpha_i^{h,k})\}_{i=1}^{N_k}$ via the basic sparse coding scheme with the jointly learned sub-dictionary pair $\{D_k^L, D_k^H\}$. Finally, the improved SORM is utilized to establish the following mapping functions revealing the relationship between $\alpha_i^{l,k}$ and $\alpha_i^{h,k}$, $\forall i \in \{1,2,\dots,N_k\}$:

$$ f_k(\alpha^l) = (W_k)^T \varphi(\alpha^l) + b_k, \quad \forall k \in \{1,2,\dots,K\} \qquad (24) $$

where $W_k$ and $b_k$ are the parameters of the mapping functions and can be obtained by solving the KKT system Eqn. (13). Accordingly, the trained sub-dictionary pairs and the corresponding mapping functions make the following energy equation reach a smaller value with low time and space complexity, which means that the proposed method can establish a more precise mapping to reveal the relationship between the two feature spaces:

$$ \min_{f_k, D_k^L, D_k^H} \sum_{i=1}^{N_k} \left\| y_i^k - D_k^H \alpha_i^{h,k} \right\|_2^2 \quad \text{s.t.} \quad \alpha_i^{h,k} = (W_k)^T \varphi(\alpha_i^{l,k}) + b_k, \;\; \alpha_i^{l,k} = \arg\min_{\alpha} \left\| x_i^k - D_k^L \alpha \right\|_2^2 + \lambda \|\alpha\|_1 \qquad (25) $$
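The structure of the learning phase can be sketched on toy data as follows. The sketch replaces the kernelized SORM mapping of Eqn. (24) with a plain ridge regression per cluster and uses a single k-means-style assignment step, so it shows the shape of the pipeline under those stated simplifications rather than the paper's exact training procedure; all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy coupled feature space: N pairs of LR features x_i and HR features y_i.
N, dl, dh, K = 300, 8, 16, 4
X = rng.normal(size=(N, dl))           # LR features
Y = X @ rng.normal(size=(dl, dh))      # HR features (linearly related toy data)

# Partition step: one k-means-style assignment with data points as centers.
centers = X[rng.choice(N, K, replace=False)]
labels = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)

# Per-cluster mapping: ridge regression as a linear stand-in for the
# kernelized SORM of Eqn. (24) (the paper uses an RBF kernel instead).
mappings = {}
for k in range(K):
    Xk, Yk = X[labels == k], Y[labels == k]
    A = Xk.T @ Xk + 1e-3 * np.eye(dl)
    mappings[k] = np.linalg.solve(A, Xk.T @ Yk)

# Predict the HR feature of an LR feature with its cluster's mapping.
x = X[0]
k = int(np.argmin(((centers - x) ** 2).sum(-1)))
y_hat = x @ mappings[k]
```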
(2) HR image reconstruction phase: Firstly, the input LR image $Y$ is divided into many overlapped image patches $\{y_i\}_{i=1}^{N_t}$. Secondly, for each image patch $y_i$ ($\forall i \in \{1,2,\dots,N_t\}$), we obtain its sparse code $\alpha_i$ via the basic sparse coding scheme with a matched LR sub-dictionary chosen from the trained LR sub-dictionaries, and then the sparse code $\alpha_i$ is mapped to the sparse code $\beta_i$ of the desired HR image patch by the corresponding learned mapping function. Then, the initial HR image $X_0$ is reconstructed by

$$ X_0 = \left( \sum_{i=1}^{N_t} R_i^T R_i \right)^{-1} \sum_{i=1}^{N_t} R_i^T D_{k^*}^H \beta_i \qquad (26) $$

where $D_{k^*}^H$ ($k^* \in \{1,2,\dots,K\}$) is the HR sub-dictionary corresponding to $\beta_i$ and $R_i$ is the image patch extracting matrix. Finally, the initial HR image $X_0$ will be refined by
using the global and nonlocal optimization Eqn. (22). Thus the final HR image $X^*$ can be obtained by

$$ X_{t+1} = X_t - \tau \left[ (DH)^T (DH X_t - Y) + \delta (X_t - X_0) + \eta (E - W)^T (E - W) X_t \right] \qquad (27) $$

where $t$ is the iteration number, $\tau$ is the step size, and $E$ is the identity matrix.
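The two reconstruction steps, patch fusion in Eqn. (26) and gradient-descent refinement in Eqn. (27), can be sketched as follows. Dense matrices stand in for the patch-extraction and blur/down-sampling operators, which in practice are applied implicitly; the helper names are illustrative.

```python
import numpy as np

def fuse_patches(patches, positions, image_shape):
    # Eqn. (26): since sum_i R_i^T R_i is a diagonal matrix counting how many
    # patches cover each pixel, its inverse reduces to a per-pixel average.
    num = np.zeros(image_shape)
    cnt = np.zeros(image_shape)
    for p, (r, c) in zip(patches, positions):
        h, w = p.shape
        num[r:r + h, c:c + w] += p
        cnt[r:r + h, c:c + w] += 1.0
    return num / np.maximum(cnt, 1.0)

def refine(Y, X0, DH, W, tau=0.1, delta=0.02, eta=0.05, iters=200):
    # Eqn. (27): gradient-descent refinement of the vectorized image.
    X = X0.copy()
    E = np.eye(W.shape[0])
    L = (E - W).T @ (E - W)          # nonlocal self-similarity penalty
    for _ in range(iters):
        grad = DH.T @ (DH @ X - Y) + delta * (X - X0) + eta * (L @ X)
        X = X - tau * grad
    return X
```

With `DH` set to the identity and `W = E` (so the nonlocal term vanishes), the iteration converges to the closed-form minimizer of Eqn. (22), which is a quick way to sanity-check the update.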
The entire procedure of the proposed method is summarized in Algorithm 1.

Algorithm 1: the algorithm of the proposed method for SISR
Input: the test LR image $Y$; the dictionary pairs and mapping functions $\{D_k^L, D_k^H, f_k\}$, $\forall k \in \{1,2,\dots,K\}$; the scaling factor $s$; the size of the LR patch $n \times n$; and the initial regularization and step parameters $\lambda, \delta, \eta, \tau$.
Output: the final HR image $X^*$.
1. Upscale $Y$ to $Y'$ using Bicubic interpolation with a factor of $s$
2. Extract overlapping patches $\{y_i\}$ from the image $Y'$
3. For each LR image patch $y_i$ do
4.   Compute the sparse code $\alpha_i$ using Eqn. (2) with a matched LR sub-dictionary
5.   Map $\alpha_i$ to $\beta_i$ using Eqn. (24)
6.   Generate the HR image patch with $x_i = D_k^H \beta_i$
7. End for
8. Obtain the initial HR image $X_0$ by fusing all the HR patches using Eqn. (26)
9. For $t \leq$ MaxIter and $\|X_{t+1} - X_t\|_2^2 \geq \varepsilon$ do
10.   If mod($t$, ·) = 0, update the similarity weight matrix $W$ using Eqn. (19)
11.   Update $X_{t+1}$ using Eqn. (27)
12. End for

4 Experimental Results and Discussion
To validate the effectiveness and robustness of the proposed method, SR experiments with different scaling factors (×2, ×3 and ×4) are performed on all the images in several datasets (Set5, Set14, Set17, and B100) [14,42]. Some of the test images are shown in Fig. 2. Five representative learning-based SR methods are used as comparison baselines, including NE [4], SCSR [53], NCSR [10], A+ [41], and SRCNN [9]. To evaluate the quality of the reconstructed HR images objectively, the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) are adopted as evaluation indexes in our experiments. All of the compared results are reproduced by models retrained with the codes downloaded from the authors' websites under the same blurring and down-sampling conditions as in our experiment.

4.1 Experimental Settings
Since the human visual system is more sensitive to the luminance component of an image than to the chrominance ones, we transform images from the RGB color space to the YCbCr color space and only perform SR reconstruction on the luminance component; the other components are simply magnified to the desired size via the Bicubic interpolation algorithm. To mimic the real imaging process, all the test HR images are first blurred by a 5×5 Gaussian kernel with standard deviation 1.2 and down-sampled by a decimation factor to produce noiseless LR images for the noiseless experiments. In addition, Gaussian noise is added to the noiseless LR images for the noise experiments. For a fair comparison, we also select 91 HR images, similar to [53], as training HR images, and the corresponding training LR images are generated under the same blurring and down-sampling conditions as the test LR images. Some training images are shown in Fig. 3.
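The LR generation just described, together with the PSNR index used throughout the experiments, can be sketched in plain NumPy. The 5×5 Gaussian kernel with σ = 1.2 follows the stated setup, while the zero-padded convolution, the helper names, and the generic PSNR definition are illustrative assumptions (SSIM needs a windowed computation and is usually taken from an image-processing library instead).

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.2):
    # Separable Gaussian kernel, normalized to sum to 1
    ax = np.arange(size) - size // 2
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    k = np.outer(g, g)
    return k / k.sum()

def degrade(hr, scale=3, sigma_n=0.0, rng=None):
    # Blur with the 5x5 Gaussian (sigma = 1.2), decimate by `scale`, and
    # optionally add Gaussian noise; zero-padded 'same' convolution.
    k = gaussian_kernel()
    padded = np.pad(hr.astype(float), 2)
    blurred = np.zeros_like(hr, dtype=float)
    for i in range(5):
        for j in range(5):
            blurred += k[i, j] * padded[i:i + hr.shape[0], j:j + hr.shape[1]]
    lr = blurred[::scale, ::scale]
    if sigma_n > 0:
        rng = rng or np.random.default_rng(0)
        lr = lr + rng.normal(0.0, sigma_n, lr.shape)
    return lr

def psnr(ref, est, peak=255.0):
    # Peak signal-to-noise ratio in dB; higher is better
    mse = np.mean((np.asarray(ref, float) - np.asarray(est, float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```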
4.2 Parameters Selection and Optimization
In our experiments, the basic parameters of the proposed method are empirically set as follows. For the training phase, the number of clusters used to partition the coupled feature space is set to K = 256. According to [53], the sparsity parameter $\lambda$, the size of each sub-dictionary, and the iteration number for dictionary training are set to 0.15, 128, and 40, respectively, when training the coupled sub-dictionary pairs by joint sparse coding. Considering both computation and reconstruction quality, the number of training image patches used for learning the sub-dictionaries and mapping functions is 1,000,000. Since the RBF kernel function is adopted in the improved SORM, when we train the regression functions in the sparse coding space we must determine the parameter pairs $\{\gamma_k, \sigma_k, \lambda_k\}$, $\forall k \in \{1,2,\dots,m\}$, for each output component. Because these parameter pairs have an important effect on the mapping accuracy of the regression functions, an improved optimization algorithm is utilized to optimize the parameter pairs for each output component according to Suykens et al. [38]'s research on structural risk minimization. The complete optimization process goes as follows. First, Coupled Simulated Annealing (CSA) determines suitable starting points for each parameter pair in the range exp(−10) to exp(10). Then these starting points are given to the optimization routines to produce an evaluated grid over the parameter space, and the optimized parameter pairs are obtained by picking the minimum in the grid. In the reconstruction phase, the parameters for the global and nonlocal optimization are set based on the data in [14]. The number of similar image patches in each cluster is set to 85, and the window width for searching the similar image patches is set to 50. For the noiseless experiments, the sparsity penalty and regularization parameters $\lambda, \delta, h_1, h_2$ and $\eta$ are set to 0.10, 0.02, 3, 3 and 0.05, respectively. Similarly, $\lambda, \delta, h_1, h_2$ and $\eta$ are experimentally set to 0.15, 0.01, 8, 8 and 0.53 for the noisy image reconstruction. The iteration parameters of the optimization will be discussed in Section 4.4. Correspondingly, all the empirical parameters in our experiments are also shown in Table 1.

4.3 Effectiveness of the Improved SORM
In this section, we design two groups of experiments on the noiseless and noisy test images to evaluate the effectiveness of the improved SORM. One experiment uses the proposed method without the improved SORM, named N_SORM, in which the reconstruction model degenerates into the classical sparse representation model; the other uses the full proposed method. The reconstructed results on the noiseless and noisy Baboon image by the proposed and N_SORM methods are shown in Figs. 4-5. It is obvious that the proposed method outperforms the N_SORM method in preserving edges, reconstructing visual details, and suppressing noise. This is mainly because the improved SORM forces the sparse code of the reconstructed images to approximate the sparse code of the original HR images.
Moreover, we report the PSNR and SSIM values of the reconstructed HR images by the proposed and N_SORM methods for the noiseless and noisy experiments in Table 2 and Table 3. From the tables, the average PSNR and SSIM gains of the reconstructed noiseless results by the proposed method over the N_SORM method are 1.16 dB and 0.0244, respectively. Correspondingly, the average PSNR and SSIM gains over the N_SORM method are 1.53 dB and 0.0722, respectively, in the noisy experiment. In view of the above, combining sparse coding with the improved SORM can significantly enhance the performance of sparse coding SR methods.

4.4 Effectiveness of Global and Nonlocal Optimization
In this section, three groups of experiments are carried out on all the images in Set17 to illustrate the effectiveness of the proposed global and nonlocal optimization. The proposed method without the global and nonlocal optimization (N_GNO), with the global and nonlocal optimization using the classical similarity weight (N_GNC), and with the global and nonlocal optimization using the new similarity weight are considered in our experiments. As shown in Fig. 6 and Fig. 7, the reconstructed Parrots images by the proposed method contain fewer ringing artifacts and sharper edges. Correspondingly, we also show the average PSNR and SSIM values of all the restored HR images for the three groups of experiments in Table 4. It is clear that the proposed method outperforms the N_GNO and N_GNC methods. Concretely, the average PSNR gains of the proposed method over the N_GNO and N_GNC methods are 0.16 dB and 0.04 dB, respectively, in the noiseless experiments. The average PSNR gains in the noise experiments over the N_GNO and N_GNC methods are 0.39 dB and 0.03 dB, respectively. Simultaneously, we also report the optimization process in Fig. 8 to show the performance of the refinement and the influence of the number of iterations in Eqn. (27). From Fig. 8, we can observe that the average PSNR of all the reconstructed HR images in Set5 tends to increase with the number of iterations. In addition, because nonlocal self-similarity is also widely applied in the denoising area, the proposed optimization achieves more promising performance in recovering HR images from noisy images. The above evaluation validates that the proposed global and nonlocal optimization can further improve the quality of the reconstructed HR images.

4.5 Experiments on Noiseless Image
In order to verify the effectiveness of the proposed method, we apply the proposed and compared methods to the noiseless test images. Table 5 shows the PSNR and SSIM values of the reconstructed HR images. From Table 5, we can observe that the NE method always gives the worst performance. Since the SCSR and A+ methods exploit more prior knowledge from examples, they obtain better results than the NE method. SRCNN obtains higher PSNR and SSIM values than the SCSR and NE methods because it establishes more complex relationships between samples. The NCSR method achieves more promising performance than the above-mentioned methods owing to its exploitation of local and nonlocal self-similarity. Correspondingly, the proposed method significantly outperforms the compared methods on all the test images. The average PSNR and SSIM gains over the NCSR method are 0.59 dB and 0.0126, respectively. At the same time, we perform the unequal variances T-test [31] between the proposed method and each of the compared methods to evaluate the significance of the average PSNR and SSIM gains. The T-statistics of PSNR and SSIM in Table 5 verify that the proposed method significantly improves the quality of the reconstructed HR images, since most of the test statistics between the proposed and compared methods are clearly far from zero. To further evaluate the SR performance of the proposed method, the scaling
factor ×3 results of the Butterfly, Boats, and Barbara images for visual quality comparison are shown in Figs. 9-11, respectively. For further comparison, we also show a local magnification of the red rectangle region segmented from each reconstructed HR image. From Figs. 9-11, we can observe that the HR images reconstructed by the NE method contain a large number of jaggy artifacts and blurred edges. Although the SCSR method performs better than the NE method, its reconstructed HR images show noticeable artifacts around the edges. The SRCNN and A+ methods perform better than the NE method in suppressing ringing artifacts, but they always give low performance for blurred images. By exploiting the local and nonlocal self-similarity in natural images, the NCSR method is effective in suppressing ringing and jaggy artifacts, but it always blurs small-scale and sharp edge structures. Compared with the baseline methods, the reconstructed results of the proposed method show both sharp edges and finer details, which can be observed from the black edges and white enclosed areas of the reconstructed Butterfly image in Fig. 9(g). We can also make similar observations around the edge regions in Figs. 10 and 11. This is mainly because the proposed method can more accurately reveal the intrinsic relationship between the two feature spaces. Hence, our method is more capable of estimating the desired HR image.

4.6 Experiments on Noisy Image
Since the input LR images are always corrupted by noise in real applications, it is
necessary to verify the effects of image noise on the proposed method. For a fair comparison, the proposed and compared methods are all trained under the noiseless condition. In addition, we add additive Gaussian noise with different levels of standard deviation ($\sigma_n$ = 4, 6, 8) to the LR input images. Figs. 12-14 show the HR images reconstructed by the proposed and compared methods at the different noise levels. As shown in Figs. 12-14, the NE method generates serious jaggy artifacts caused by noise along the edges. The SCSR and A+ methods can better suppress these artifacts; however, watercolor-like artifacts appear in their reconstructed HR images. Similarly, there are many unpleasant blurred textural details and ringing effects in the HR images reconstructed by SRCNN. It is obvious that the NCSR method performs better than the other compared methods in suppressing noise and recovering the missing high-frequency information. However, its reconstructed results always suffer from blurring of small-scale image structures and over-smoothing, especially at high noise levels. Compared with the baselines, the proposed method shows a more convincing performance in suppressing noise, preserving edges, and reconstructing visual details at the different noise levels.
Moreover, we show the PSNR and SSIM values of the HR images reconstructed by the proposed and compared methods at different noise levels in Table 6. From the table, the PSNR and SSIM values of all methods tend to decrease as the noise level becomes larger. However, compared with the other methods, our method still achieves the best reconstruction quality in the experiments at the different noise levels. Similar to the noiseless experiments, we also obtain the T-statistics between the proposed method and the compared methods. These statistics at different noise levels indicate that although the quality of the HR images reconstructed by the proposed method declines markedly as the noise level increases, the average PSNR and SSIM gains remain significant compared with the other methods. These results indicate that even if the proposed
model is trained under the noiseless condition, it still performs strongly in recovering the HR image from a given noisy LR image, which means that the proposed method has better noise robustness than the compared methods.

4.7 Benchmarks
To further verify the effectiveness of the proposed method, more extensive statistical experiments with the scaling factors ×2, ×3 and ×4 are performed on all the images in Set5, Set14 and B100. Tables 7 and 8 show the PSNR and SSIM values of the results reconstructed by the proposed and compared methods in the noiseless and noisy experiments. From Tables 7 and 8, we can see that the proposed method achieves consistent performance on Set5, Set14, and B100. Since the proposed and compared methods are all learning-based and the training data are limited, the performance of all methods tends to decline as the number of test examples increases. However, the proposed method still performs better than all the compared methods in both PSNR and SSIM.
In addition, considering that most of the compared methods (NE, SCSR, A+ and SRCNN) reported their performance on LR images generated by the bicubic downsampling process in the originally published papers, we also execute the same experiment to obtain a fairer comparison. Table 9 shows the average PSNR and SSIM values on the datasets Set5, Set14, and B100 with the scale factors ×2, ×3 and ×4. We can observe that the proposed method achieves more impressive performance than the compared methods, as before. The average PSNR gains of the proposed method over SRCNN for the scale factor ×3 on Set5, Set14, and B100 are
0.28 dB, 0.31 dB, and 0.22 dB, respectively. These experimental results further suggest that the proposed method can effectively improve the quality of the reconstructed HR images.

4.8 Computational Complexity
For real SR applications, the time and space complexity of an algorithm also
needs to be considered owing to the limitation of computing resources. In this section, we discuss the computational complexity of the proposed algorithm in terms of both the training and reconstruction phases. For a fair comparison, the proposed and compared methods are all run on the MATLAB 2014a platform on a desktop PC with a 64-bit Windows 10 operating system, a 2.9 GHz Intel Core CPU, and 12 GB of memory. In the training phase, the time complexity of the proposed method is higher than that of the SCSR method because we have to train the mapping functions after learning the coupled dictionary pairs. However, since training the mapping functions amounts to solving the linear systems of equations in Eqn. (13) offline, the training time of the mapping functions stays at a low level. Nevertheless, the total training time of the proposed method is about 3.43 times that of SCSR because we train a set of joint dictionary pairs and mapping functions (K = 256). At the same time, we also compare the CPU running time and the PSNR values of the proposed method and the compared methods to evaluate the practicability of the proposed algorithm. The comparisons for 3× SR magnification on the "Barbara" image are shown in Fig. 15. Since the RBF kernel function and the improved nonlocal regularization term are adopted during the SR process, the proposed method takes about 171 seconds to recover a 255 × 255 HR image. As shown in Fig. 15, although the proposed method costs more time than the A+, SRCNN, SCSR, and NE methods and
is only slightly faster than the NCSR method, its reconstructed results are the best.

5 Conclusion
In this paper, we propose a new SISR method by combining sparse coding with the
improved SORM. In the proposed method, we train a set of joint dictionary pairs and mapping functions. Since the trained dictionary pairs well characterize the structure of the LR and HR feature spaces, and the learned mapping functions map the sparse codes of the LR feature space to the corresponding sparse codes of the HR feature space more precisely, the proposed method establishes a more accurate model to reveal the intrinsic relationship between the two feature spaces and notably improves the quality of the reconstructed HR images. Furthermore, aiming to alleviate the ringing artifacts caused by the patchwise processing and other factors such as noise, we propose a new global and nonlocal optimization to further improve the HR image quality by exploiting the spatial and intensity information of nonlocal self-similar image patches. The experimental results on the test images demonstrate that the proposed method achieves performance that is competitive with many state-of-the-art SISR methods.
Quantitatively and qualitatively, the proposed method can obtain better-quality HR images than other single image SR methods, but its reconstruction speed is much slower than that of many outstanding methods, especially those based on deep convolutional neural networks. In future work, we will investigate the potential parallelism of the proposed method to improve the efficiency of training and reconstruction by making full use of the computer's CPU and GPU.
Acknowledgements
This work was supported by Key Projects of the National Science and Technology Program, China (Grant No. 2013GS500303), the Key Science and Technology Projects of CSTC, China (Grant Nos. CSTC2012GG-YYJSB40001, CSTC2013-JCSF40009),
and the National Natural Science Foundation of China (Grant No. 61105093).
The authors would like to thank the editors and reviewers for their valuable comments and suggestions.
Author Contributions
This manuscript was prepared in collaboration between the authors. Weiguo Gong is the corresponding author of this research work. Yongliang Tang proposed the new SISR method based on sparse coding and the improved SORM. Qiane Yi proposed the global and nonlocal optimization model. Weihong Li was involved in the writing and argumentation of the manuscript. All authors discussed and approved the final manuscript.
References
[1] N. Akhtar, F. Shafait, A. Mian, Bayesian sparse representation for hyperspectral image super resolution, in: IEEE Conf. Comput. Vis. Pattern Recognit., 2015, pp. 3631-3640. [2] X. An, S. Xu, L. Zhang, S. Su, Multiple dependent variables LS-SVM regression algorithm and its application in NIR spectral quantitative analysis, Spectrosc. Spect. Anal. 29 (01) (2009)
127–130. [3] M. Brudnak, Vector-valued support vector regression, in: Int. Joint Conf. Neural Netw., 2006, pp. 1562–1569. [4] H. Chang, D.-Y. Yeung, Y. Xiong, Super-resolution through neighbor embedding, in: IEEE Conf.
Comput. Vis. Pattern Recognit., 2004, pp. 275-282. [5] X. Chen, C. Qi, Nonlinear neighbor embedding for single image super-resolution via kernel mapping, Signal Process. 94 (1) (2014) 6-22.
[6] S. Dai, M. Han, W. Xu, Y. Wu, Y. Gong, Soft edge smoothness prior for alpha channel super resolution, in: IEEE Conf. Comput. Vis. Pattern Recognit., 2007, pp. 1โ8. [7] I. Daubechies, M. Defriese, C. DeMol, An iterative thresholding algorithm for linear inverse
problems with a sparsity constraint, Communications on Pure and Applied Mathematics 57(11) (2004) 1413-1457.
[8] C. Deng, J. Xu, K. Zhang, Similarity constraints-based structured output regression
machine: an approach to image super-resolution, IEEE Transactions on Neural Networks & Learning Systems 27 (12) (2015) 2472-2485.
[9] C. Dong, C. Loy, X. Tang, Image super-resolution using deep convolutional networks, IEEE
Trans. Pattern Anal. Mach. Intell. 38 (2) (2016) 295-307. [10] W. Dong, L. Zhang, G. Shi, X. Li, Nonlocally centralized sparse representation for image restoration, IEEE Trans. Image Process. 22 (4) (2013) 1620โ1630.
[11] R. Fattal, Image up-sampling via imposed edge statistics, ACM Transactions on Graphics 26 (03) (2007) 095-103. [12] W.-T. Freeman, T.-R. Jones, E.-C. Pasztor, Example-based super-resolution, IEEE Comput.
Graph. Appl. 22 (2) (2002) 56-65. [13] X. Gao, K. Zhang, X. Li, D. Tao, Image super-resolution with sparse neighbor embedding, IEEE Trans. Image Process. 21 (7) (2012) 3194-3205. [14] W. Gong, L. Hu, J. Li, W. Li, Combining sparse representation and local rank constraint for
single image super resolution, Information Sciences 325 (2015) 1-19. [15] L. He, H. Qi, R. Zaretzki, Beta process joint dictionary learning for coupled feature spaces with application to single image super-resolution, in: IEEE Conf. Comput. Vis. Pattern
Recognit., 2013, pp. 345-352.
[16] H. He, W.-C. Siu, Single image super-resolution using Gaussian process regression, in: IEEE Conf. Comput. Vis. Pattern Recognit., 2011, pp. 449โ456.
[17] A. Hosni, M. Bleyer, M. Gelautz, C. Rhemann, Local stereo matching using geodesic support weights, in: IEEE International Conference on Image Processing, 2009, pp. 2093โ2096.
[18] M. Irani, S. Peleg, Improving resolution by image registration, CVGIP Graphical Models &
Image Processing 53 (03) (1991) 231-239. [19] K. Jiwon, J.-K. Lee, K.-M. Lee. Accurate image super-resolution using very deep
convolutional networks, in: IEEE Conf. Comput. Vis. Pattern Recognit., 2016, pp. 1646-1654.
[20] K. Jiwon, J.-K. Lee, K.-M. Lee. Deeply-recursive convolutional network for image super-resolution, in: IEEE Conf. Comput. Vis. Pattern Recognit., 2016, pp.1637-1645.
[21] R. Keys, Cubic convolution interpolation for digital image processing, IEEE Trans. Acoust., Speech, Signal Process. 29 (06) (1981) 1153-1160. [22] K. I. Kim, Y. Kwon, Single-image super-resolution using sparse regression and natural image prior, IEEE Trans. Pattern Anal. Mach. Intell. 32 (06) (2010) 1127-1132.
[23] S. Kindermann, S. Osher, P. W. Jones, Deblurring and denoising of images by nonlocal functionals, Multiscale Model. Simul. 04 (04) (2005) 1091-1115. [24] R.-O. Lane, Non-parametric Bayesian super-resolution, IET Radar, Sonar & Navigation 04 (04) (2010) 639-648.
[25] J. Li, W. Gong, W. Li, Dual-sparsity regularized sparse representation for single image super-resolution, Information Sciences 298 (2015) 257-273.
[26] W. Liu, S. Li, Sparse representation with morphologic regularizations for single image
super-resolution, Signal Process. 98 (2014) 410โ422.
[27] J. Mairal, F. Bach, J. Ponce, G. Sapiro, A. Zisserman, Non-local sparse models for image restoration, in: IEEE Int. Conf. Comput. Vis., 2009, pp. 2272โ2279.
[28] S. Peled, Y. Yeshurun, Super-resolution in MRI: application to human white matter fiber tract visualization by diffusion tensor imaging, Magnetic Resonance in Medicine Official Journal of
the Society of Magnetic Resonance in Medicine 45 (2001) 29โ35.
[29] G. Polatkan, M. Zhou, L. Carin, D. Blei, I. Daubechies, A Bayesian Nonparametric Approach to Image Super-Resolution, IEEE Trans. Pattern Anal. Mach. Intell. 37 (02) (2015) 346-358.
[30] M. Protter, M. Elad, H. Takeda, P. Milanfar, Generalizing the nonlocal-means to
super-resolution reconstruction, IEEE Trans. Image Process. 18 (01) (2009) 36-51. [31] G.-D. Ruxton, Unequal variance t -test is an underused alternative to student's t -test and the mannโwhitney u test, Behavioral Ecology. 17(4) (2006) 688-690.
[32] R.-R. Schultz, R.-L. Stevenson, Extraction of high-resolution frames from video sequences, IEEE Trans. Image Process. 05 (06) (1996) 996-1011. [33] K. Simonyan, A. Zisserman. Very deep convolutional networks for large-scale image
recognition, in: International Conference on Learning Representations, 2015. [34] J. Su, Z. Xu, H. Shum, Gradient profile prior and its applications in image super-resolution and enhancement, IEEE Trans. Image Process. 20 (06) (2011) 1529-1542. [35] J. Sun, Z. Xu, H.-Y. Shum, Image super-resolution using gradient profile prior, in: IEEE Conf.
Comput. Vis. Pattern Recognit., 2008, pp. 1โ8. [36] J. Sun, J. Zhu, M. F. Tappen, Context-constrained hallucination for image super-resolution, in: IEEE Conf. Comput. Vis. Pattern Recognit., 2010, pp.231โ238.
[37] J. Sun, N.-N. Zheng, H. Tao, H.-Y. Shum, Image hallucination with primal sketch priors, in: IEEE Conf. Comput. Vis. Pattern Recognit., 2003, pp. 729โ736.
[38] J. Suykens, J. Vandewalle, Least squares support vector machine classifiers, Neural Process.
Lett. 09 (03) (1999) 293โ300.
[39] Y.-W. Tai, S. Liu, M. S. Brown, S. Lin, Super resolution using edge prior and single image
detail synthesis, in: IEEE Conf. Comput. Vis. Pattern Recognit., 2010, pp. 2400โ2407.
[40] H. Takeda, S. Farsiu, P. Milanfar, Kernel regression for image processing and reconstruction, IEEE Trans. Image Process. 16 (02) (2007) 349-366.
[41] R. Timofte, V. De, L. Gool, A+: Adjusted anchored neighborhood regression for fast
super-resolution, in: 12th Asian Conf. Comput. Vis., 2014, pp. 111โ126. [42] R. Timofte, V. De, L. Gool, Anchored neighborhood regression for fast example-based super-resolution, in: IEEE Conf. Comput. Vis. Pattern Recognit., 2013, pp. 1920โ1927.
[43] R. Timofte, R. Rothe, L. Gool, Seven ways to improve example-based single image super resolution, in: IEEE Conf. Comput. Vis. Pattern Recognit., 2016:1865-1873. [44] R.-Y. Tsai, T.-S. Huang, Multi-frame image restoration and registration, Advances in
Computer Vision & Image Process. 01 (1984) 317-339. [45] M. Unser, A. Aldroubi, M. Eden, Fast B-spline transforms for continuous image representation and interpolation, IEEE Trans. Pattern Anal. Mach. Intell. 13 (03) (1991) 227-285. [46] L. Wang, S. Xiang, G. Meng, H. Wu, C. Pan, Edge-directed single-image super-resolution via
adaptive gradient magnitude self-interpolation, IEEE Transactions on Circuits & Systems for Video Technology 23 (08) (2013) 1289-1299.
[47] S. Wang, D. Zhang, Y. Liang, Q. Pan, Semi-coupled dictionary learning with applications to
image super-resolution and photo-sketch synthesis, in: IEEE Conf. Comput. Vis. Pattern Recognit., 2012, pp. 2216–2223.
[48] S. Xu, X. An, X. Qiao, L. Zhu, Multi-task least-squares support vector machines, Multimedia
M
Tools Appl. 71 (02) (2014) 699-715.
[49] H. Xu, G. Zhai, X. Yang, Single image super-resolution with detail enhancement based on
PT
1740-1754.
ED
local fractal analysis of gradient, IEEE Trans. Circuits Syst. Video Technol. 23 (10) (2013)
[50] W. Yang, Y. Tian, F. Zhou, Consistent coding scheme for single image super-resolution via
CE
independent dictionaries, IEEE Transactions on Multimedia 18 (03) (2016) 313-325.
AC
[51] S. Yang, M. Wang, Y. Chen, Y. Sun, Single-image super-resolution reconstruction via learned geometric dictionaries and clustered sparse coding, IEEE Trans. Image Process. 21 (09) (2012) 4016โ4028.
[52] J. Yang, Z. Wang, Z. Lin, S. Cohen, Thomas Huang, Couple dictionary training for image super-resolution, IEEE Trans. Image Process. 21 (08) (2012) 3467-3487. [53] J. Yang, J. Wright, T.-S. Huang, Y. Ma, Image super-resolution via sparse representation, IEEE
38
ACCEPTED MANUSCRIPT Y. Tang et al. /Information Sciences Trans. Image Process. 19 (11) (2010) 2861-2873. [54] R. Zeyde, M. Elad, M. Protter, on single image scale-up using sparse-representations, in: 7th Int. Conf. Curves Surf., 2010, pp. 711โ730. [55] X. Zhang, M. Burger, X. Bresson, S. Osher, Bregmanized nonlocal regularization for
CR IP T
deconvolution and sparse reconstruction, SIAM J. Imag. Sci.. 03 (03) (2010) 253โ276. [56] K. Zhang, X. Gao, X. Li, D. Tao, Partially supervised neighbor embedding for example-based image super-resolution, IEEE J. Sel. Topics Signal Process. 05 (02) (2011) 230-239.
AN US
[57] K. Zhang, X. Gao, D. Tao, and X. Li, Image super-resolution via nonlocal steering kernel regression regularization, in: 20th IEEE Conf. Image Process, 2013, pp. 943โ946. [58] K. Zhang, D. Tao, X. Gao, X. Li, Z. Xiong, Learning multiple linear mappings for efficient
M
single image super-resolution, IEEE Trans. Image Process. 24 (03) (2015) 846-861. [59] L. Zhang, X. Wu, An edge-guided image interpolation algorithm via directional filtering and
ED
data fusion, IEEE Trans. Image Process. 15 (08) (2006) 2226-2238.
PT
[60] L. Zhang, H. Zhang, H. Shen, P. Li, A super-resolution reconstruction algorithm for surveillance images, Signal Process. 90 (3) (2010) 848โ859.
CE
[61] L-W. Zou, P.-C. Yuen, Very low resolution face recognition problem, IEEE Trans. Image
AC
Process. 21 (1) (2012) 327โ340.
39
ACCEPTED MANUSCRIPT
Y. Tang et al. / Information Sciences

Vitae
Yongliang Tang received the B.E. and M.S. degrees from Sichuan University of Science & Engineering, China, in 2010 and 2013, respectively. Currently, he is a Ph.D. candidate in the College of Opto-Electronic Engineering, Chongqing University. His research interests include machine learning and image processing.

Weiguo Gong received his doctoral degree in computer science from the Tokyo Institute of Technology, Japan, in March 1996 as a scholarship recipient of the Japanese Government. From April 1996 to March 2002, he served as a researcher and then senior researcher at NEC Labs, Japan. Now he is a professor at Chongqing University, China. He has published over 120 research papers in international journals and conferences and two books as an author or co-author. His current research interests are in the areas of pattern recognition and image processing.

Qiane Yi is an M.S. candidate majoring in Control Technology and Instruments in the College of Opto-Electronic Engineering, Chongqing University. Her research interests are in information acquisition and image processing.
Weihong Li received her doctoral degree from Chongqing University in 2006. Now she is a professor at Chongqing University, China. Her current research interests are in the areas of pattern recognition and image processing.

Table captions:

Table 1 Empirical parameters for the learning and reconstruction phases of the proposed method.
Table 2 PSNR (dB) and SSIM results of reconstructed HR images by the proposed and N_SORM methods (scaling factor s = 3, noise level σ = 0).
Table 3 PSNR (dB) and SSIM results of reconstructed HR images by the proposed and N_SORM methods (scaling factor s = 3, noise level σ = 6).
Table 4 Average PSNR (dB) and SSIM results of reconstructed HR images by the proposed, N_GNO and N_GNC methods (scaling factor s = 3).
Table 5 PSNR (dB) and SSIM results of reconstructed HR images by the proposed and baseline methods (scaling factor s = 3, noise level σ = 0).
Table 6 PSNR (dB) and SSIM results of reconstructed HR images by the proposed and baseline methods (scaling factor s = 3, noise level σ = 4, 6, 8).
Table 7 Average PSNR (dB) and SSIM for scaling factors ×2, ×3 and ×4 on datasets Set5, Set14, and B100 among different methods (noise level σn = 0).
Table 8 Average PSNR (dB) and SSIM for scaling factors ×2, ×3 and ×4 on datasets Set5, Set14, and B100 among different methods (noise level σn = 6).
Table 9 Average PSNR (dB) and SSIM for scale factors ×2, ×3 and ×4 on datasets Set5, Set14, and B100 with bicubic downsampling (noise level σn = 0).

Figure captions:
Fig.1 Framework of the proposed method.
Fig.2 Some of the test images, from left to right and top to bottom: Butterfly, Parrots, Bike, Boats, Flower, Hat, Leaves, Lighthouse, Plants, and Raccoon.
Fig.3 Some of the training images.
Fig.4 Comparison of reconstructed results on the noiseless Baboon image by the N_SORM and proposed methods (scaling factor s = 3, noise level σ = 0). (a) LR image. (b) N_SORM. (c) Proposed method. (d) Original image.
Fig.5 Comparison of reconstructed results on the noisy Baboon image by the N_SORM and proposed methods (scaling factor s = 3, noise level σ = 6). (a) LR image. (b) N_SORM. (c) Proposed method. (d) Original image.
Fig.6 Comparison of reconstructed results on the noiseless Starfish image by the N_GNO, N_GNC and proposed methods (scaling factor s = 3, noise level σ = 0). (a) LR image. (b) N_GNO. (c) N_GNC. (d) Proposed method. (e) Original image.
Fig.7 Comparison of reconstructed results on the noisy Starfish image by the N_GNO, N_GNC and proposed methods (scaling factor s = 3, noise level σ = 6). (a) LR image. (b) N_GNO. (c) N_GNC. (d) Proposed method. (e) Original image.
Fig.8 Performance and influence of the number of iterations in the global and nonlocal optimization. (a) Average PSNR versus iterations for scaling factor ×3 on the noiseless images in Set5 (noise level σn = 0). (b) Average PSNR versus iterations for scaling factor ×3 on the noisy images in Set5 (noise level σn = 6).
Fig.9 Comparison of reconstructed results on the noiseless Butterfly image by the baseline and proposed methods (scaling factor s = 3, noise level σ = 0). (a) LR image. (b) NE [4]. (c) SCSR [53]. (d) NCSR [10]. (e) A+ [41]. (f) SRCNN [9]. (g) Proposed method. (h) Original image.
Fig.10 Comparison of reconstructed results on the noiseless Boats image by the baseline and proposed methods (scaling factor s = 3, noise level σ = 0). (a) LR image. (b) NE [4]. (c) SCSR [53]. (d) NCSR [10]. (e) A+ [41]. (f) SRCNN [9]. (g) Proposed method. (h) Original image.
Fig.11 Comparison of reconstructed results on the noiseless Barbara image by the baseline and proposed methods (scaling factor s = 3, noise level σ = 0). (a) LR image. (b) NE [4]. (c) SCSR [53]. (d) NCSR [10]. (e) A+ [41]. (f) SRCNN [9]. (g) Proposed method. (h) Original image.
Fig.12 Comparison of reconstructed results on the noisy Butterfly image by the baseline and proposed methods (scaling factor s = 3, noise level σ = 4). (a) LR image. (b) NE [4]. (c) SCSR [53]. (d) NCSR [10]. (e) A+ [41]. (f) SRCNN [9]. (g) Proposed method. (h) Original image.
Fig.13 Comparison of reconstructed results on the noisy Leaves image by the baseline and proposed methods (scaling factor s = 3, noise level σ = 6). (a) LR image. (b) NE [4]. (c) SCSR [53]. (d) NCSR [10]. (e) A+ [41]. (f) SRCNN [9]. (g) Proposed method. (h) Original image.
Fig.14 Comparison of reconstructed results on the noisy Hat image by the baseline and proposed methods (scaling factor s = 3, noise level σ = 8). (a) LR image. (b) NE [4]. (c) SCSR [53]. (d) NCSR [10]. (e) A+ [41]. (f) SRCNN [9]. (g) Proposed method. (h) Original image.
Fig.15 Speed versus PSNR for the proposed and compared methods. Our proposed method provides the best quality in comparison with state-of-the-art learning-based SR methods and preserves the time complexity of NCSR.
[Fig. 1: framework diagram. Learning phase: training images are converted from RGB to YCbCr, LR/HR patch pairs {u_i}, {v_i} are extracted, and the dictionaries are trained by min_D Σ_{i=1..N} { ||x_i − Dα_i||₂² + λ||α_i||₁ }, followed by SORM training. Reconstruction phase: the LR input is bicubically interpolated; for each patch, sparse coding û_i = argmin ||y_i − D_y u_i||₂² + λ||u_i||₁ is followed by the SORM mapping v_i = Wᵀφ(u_i) + b and HR patch reconstruction x_i = D_x v_i; the result is refined by (a) the global constraint and (b) nonlocal self-similarity, x* = min_x ||DHx − y||₂² + δ||x − x⁰||₂² + ||x − Ψx||₂², iterated until n > N_m or ||x^(n+1) − x^n||₂² < ε; the YCbCr output is then converted back to RGB to give the HR image.]

Fig.1 Framework of the proposed method.
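The sparse coding step summarized in Fig. 1 solves an l1-regularized least-squares problem per patch. The following is a minimal sketch using one common solver (ISTA); the solver choice and all names are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def ista_sparse_code(y, D, lam, n_iter=200):
    """Solve min_u ||y - D u||_2^2 + lam * ||u||_1 by iterative
    soft-thresholding (ISTA). D plays the role of the LR dictionary D_y."""
    # Lipschitz constant of the gradient of the smooth term is 2 * ||D||_2^2.
    L = 2.0 * np.linalg.norm(D, 2) ** 2
    u = np.zeros(D.shape[1])
    for _ in range(n_iter):
        g = u - 2.0 * D.T @ (D @ u - y) / L                     # gradient step
        u = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)   # soft threshold
    return u

# With D = I the minimizer is elementwise soft-thresholding of y by lam/2.
D = np.eye(2)
u = ista_sparse_code(np.array([3.0, -1.0]), D, lam=2.0)
print(u + 0.0)  # prints [2. 0.]
```

In the full pipeline this per-patch code û_i would then be fed to the SORM mapping rather than used for reconstruction directly.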
Fig.2 Some of the test images, from left to right and top to bottom: Butterfly, Parrots, Bike, Boats, Flower, Hat, Leaves, Lighthouse, Plants, and Raccoon.
Fig.3 Some of the training images.

Fig.4 Comparison of reconstructed results on the noiseless Baboon image by the N_SORM and proposed methods (scaling factor s = 3, noise level σ = 0). (a) LR image. (b) N_SORM. (c) Proposed method. (d) Original image.

Fig.5 Comparison of reconstructed results on the noisy Baboon image by the N_SORM and proposed methods (scaling factor s = 3, noise level σ = 6). (a) LR image. (b) N_SORM. (c) Proposed method. (d) Original image.
Fig.6 Comparison of reconstructed results on the noiseless Starfish image by the N_GNO, N_GNC and proposed methods (scaling factor s = 3, noise level σ = 0). (a) LR image. (b) N_GNO. (c) N_GNC. (d) Proposed method. (e) Original image.

Fig.7 Comparison of reconstructed results on the noisy Starfish image by the N_GNO, N_GNC and proposed methods (scaling factor s = 3, noise level σ = 6). (a) LR image. (b) N_GNO. (c) N_GNC. (d) Proposed method. (e) Original image.
[Fig. 8: two line plots of average PSNR (dB) versus number of iterations (0–180); panel (a) spans roughly 33.25–33.55 dB, panel (b) roughly 31.30–31.80 dB.]

Fig.8 Performance and influence of the number of iterations in the global and nonlocal optimization. (a) Average PSNR versus iterations for scaling factor ×3 on the noiseless images in Set5 (noise level σn = 0). (b) Average PSNR versus iterations for scaling factor ×3 on the noisy images in Set5 (noise level σn = 6).
Fig.9 Comparison of reconstructed results on the noiseless Butterfly image by the baseline and proposed methods (scaling factor s = 3, noise level σ = 0). (a) LR image. (b) NE [4]. (c) SCSR [53]. (d) NCSR [10]. (e) A+ [41]. (f) SRCNN [9]. (g) Proposed method. (h) Original image.

Fig.10 Comparison of reconstructed results on the noiseless Boats image by the baseline and proposed methods (scaling factor s = 3, noise level σ = 0). (a) LR image. (b) NE [4]. (c) SCSR [53]. (d) NCSR [10]. (e) A+ [41]. (f) SRCNN [9]. (g) Proposed method. (h) Original image.
Fig.11 Comparison of reconstructed results on the noiseless Barbara image by the baseline and proposed methods (scaling factor s = 3, noise level σ = 0). (a) LR image. (b) NE [4]. (c) SCSR [53]. (d) NCSR [10]. (e) A+ [41]. (f) SRCNN [9]. (g) Proposed method. (h) Original image.
Fig.12 Comparison of reconstructed results on the noisy Butterfly image by the baseline and proposed methods (scaling factor s = 3, noise level σ = 4). (a) LR image. (b) NE [4]. (c) SCSR [53]. (d) NCSR [10]. (e) A+ [41]. (f) SRCNN [9]. (g) Proposed method. (h) Original image.

Fig.13 Comparison of reconstructed results on the noisy Leaves image by the baseline and proposed methods (scaling factor s = 3, noise level σ = 6). (a) LR image. (b) NE [4]. (c) SCSR [53]. (d) NCSR [10]. (e) A+ [41]. (f) SRCNN [9]. (g) Proposed method. (h) Original image.
Fig.14 Comparison of reconstructed results on the noisy Hat image by the baseline and proposed methods (scaling factor s = 3, noise level σ = 8). (a) LR image. (b) NE [4]. (c) SCSR [53]. (d) NCSR [10]. (e) A+ [41]. (f) SRCNN [9]. (g) Proposed method. (h) Original image.
Fig.15 Speed versus PSNR for the proposed and compared methods. Our proposed method provides the best quality in comparison with state-of-the-art learning-based SR methods and preserves the time complexity of NCSR.
Table 1 Empirical parameters for the learning and reconstruction phases of the proposed method.

Phase                   Parameters
Training                K = 256, λ = 0.15, 40, patch size 5×5, dict. size 128, sample size 1,000,000
Testing, σn = 0         δ = 0.10, h1 = 0.02, h2 = 0.05, 3, 3, 0.12, 85, 50, 40, 160
Testing, σn = 4, 6, 8   δ = 0.20, h1 = 0.01, h2 = 0.53, 8, 8, 0.08, 85, 50, 40, 160
Table 2 PSNR (dB) and SSIM results of reconstructed HR images by the proposed and N_SORM methods (scaling factor s = 3, noise level σ = 0).

Images    N_SORM          Proposed
          PSNR    SSIM    PSNR    SSIM
Plants    32.94   0.9101  34.13   0.9254
Baboon    24.86   0.6245  25.33   0.6561
Bike      24.08   0.7806  25.31   0.8274
Leaf      39.97   0.9522  40.62   0.9564
Hat       30.73   0.8730  31.85   0.8910
Leaves    26.11   0.8954  27.90   0.9254
Avg.      29.78   0.8392  30.86   0.8636
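The PSNR values reported in Tables 2–9 follow the standard definition for 8-bit images, PSNR = 10·log10(255²/MSE). A minimal sketch, with illustrative function and variable names:

```python
import numpy as np

def psnr(ref, rec, peak=255.0):
    """Peak signal-to-noise ratio (dB) between a reference image and
    a reconstructed image of the same size."""
    mse = np.mean((np.asarray(ref, float) - np.asarray(rec, float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

# A reconstruction that is off by exactly 1 gray level everywhere has
# MSE = 1 and therefore PSNR = 20 * log10(255) ≈ 48.13 dB.
ref = np.full((8, 8), 100.0)
rec = ref + 1.0
print(round(psnr(ref, rec), 2))  # 48.13
```

SSIM, the second metric in these tables, additionally compares local luminance, contrast, and structure statistics rather than raw pixel error.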
Table 3 PSNR (dB) and SSIM results of reconstructed HR images by the proposed and N_SORM methods (scaling factor s = 3, noise level σ = 6).

Images    N_SORM          Proposed
          PSNR    SSIM    PSNR    SSIM
Plants    30.44   0.7806  31.86   0.8630
Baboon    24.36   0.5764  24.56   0.5782
Bike      23.65   0.7261  24.16   0.7619
Leaf      33.14   0.7755  36.79   0.9068
Hat       29.10   0.7309  30.39   0.8404
Leaves    25.46   0.8409  27.25   0.9124
Avg.      27.69   0.7383  29.17   0.8105
Table 4 Average PSNR (dB) and SSIM results of reconstructed HR images by the proposed, N_GNO and N_GNC methods (scaling factor s = 3).

Noise     N_GNO           N_GNC           Proposed
          PSNR    SSIM    PSNR    SSIM    PSNR    SSIM
σn = 0    30.32   0.8670  30.44   0.8685  30.48   0.8688
σn = 6    28.65   0.8038  29.01   0.8125  29.04   0.8130
Table 5 PSNR (dB) and SSIM results of reconstructed HR images by the proposed and compared methods (scaling factor s = 3, noise level σ = 0). A T-statistic closer to 0 means that the average PSNR and SSIM gains over the compared method are more significant, while a value closer to 1 indicates that the difference is less obvious. Each cell lists PSNR/SSIM.

Images      Bicubic        NE             SCSR           NCSR           A+             SRCNN          Proposed
Starfish    26.38/0.7958   26.86/0.8047   27.80/0.8464   28.87/0.8726   28.08/0.8524   28.31/0.8577   29.51/0.8864
Parrots     27.36/0.8697   27.97/0.8787   29.19/0.9024   29.94/0.9137   29.56/0.9067   29.73/0.9120   30.70/0.9194
Barbara     27.58/0.7975   27.87/0.8080   28.71/0.8399   29.30/0.8590   28.99/0.8475   29.22/0.8496   29.67/0.8686
Flower      26.98/0.7725   27.42/0.7866   28.52/0.8321   29.14/0.8541   28.81/0.8408   28.95/0.8420   29.70/0.8679
Butterfly   23.51/0.8134   24.80/0.8511   26.05/0.8892   28.04/0.9248   26.67/0.9037   26.75/0.9043   28.35/0.9286
Hat         28.94/0.8314   29.69/0.8447   30.62/0.8693   31.31/0.8833   30.89/0.8741   30.94/0.8771   31.85/0.8910
Plants      30.60/0.8586   31.21/0.8692   32.58/0.9032   33.66/0.9203   33.05/0.9122   33.07/0.9145   34.13/0.9254
Bike        22.47/0.6842   23.11/0.7238   23.95/0.7712   24.55/0.8010   24.35/0.7845   24.38/0.7854   25.31/0.8274
House       30.55/0.8510   30.86/0.8511   32.49/0.8766   33.53/0.8890   32.75/0.8804   32.91/0.8906   34.49/0.8981
Lena        29.48/0.8272   30.10/0.8368   31.08/0.8642   31.92/0.8811   31.48/0.8719   31.59/0.8779   32.32/0.8885
Raccoon     27.63/0.6992   27.70/0.7081   28.24/0.7408   28.57/0.7533   28.30/0.7395   28.47/0.7499   28.97/0.7699
Lighthouse  24.03/0.7191   24.27/0.7291   24.61/0.7538   24.89/0.7697   24.72/0.7582   24.96/0.7602   25.47/0.7874
Boats       23.76/0.7291   23.97/0.7238   25.04/0.8005   25.35/0.8224   25.30/0.8129   25.39/0.8247   26.60/0.8578
Girl        32.81/0.8130   32.95/0.8155   33.88/0.8429   34.18/0.8511   34.02/0.8472   34.30/0.8522   34.41/0.8548
Avg.        27.51/0.7850   27.76/0.8043   28.77/0.8380   29.52/0.8568   29.07/0.8450   29.21/0.8506   30.11/0.8694
T-Test      0.0226/0.0008  0.0511/0.0030  0.2649/0.1134  0.6239/0.5184  0.3848/0.2176  0.4536/0.3295  1.000/1.000
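The T-Test rows in Tables 5 and 6 report paired significance values of per-image PSNR and SSIM scores against the proposed method. The exact test configuration is not specified in this section; as a hedged sketch, the underlying paired (related-samples) t statistic on per-image differences can be computed as follows, using the first five Table 5 PSNR values of the proposed method and A+:

```python
import numpy as np

def paired_t_statistic(a, b):
    """t statistic of a paired test on per-image scores of two methods."""
    d = np.asarray(a, float) - np.asarray(b, float)
    n = d.size
    return d.mean() / (d.std(ddof=1) / np.sqrt(n))  # mean diff / std. error

proposed = [29.51, 30.70, 29.67, 29.70, 28.35]
a_plus   = [28.08, 29.56, 28.99, 28.81, 26.67]
print(round(paired_t_statistic(proposed, a_plus), 2))  # 6.47
```

Converting the t statistic to the tabulated significance value additionally requires the t-distribution CDF (e.g., via a statistics library), which is omitted here.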
Table 6 PSNR (dB) and SSIM results of reconstructed HR images by the proposed and baseline methods (scaling factor ๐ = 3, noise level ฯ= 4, 6, 8).
PSNR SSIM 26.47 0.7858 27.74 0.8402 23.85 0.7272
30.29 0.8318 23.46 Butterfly 0.8019 23.97 Lighthouse 0.7033 25.83 Avg. 0.7816 0.0910 T-Test 0.0468 26.16 Starfish 0.7743 27.09 Parrots 0.8283 23.64 Boats 0.7013
30.28 0.8114 24.61 0.8278 24.18 0.7010 26.19 0.7822 0.1208 0.0424 26.01 0.7442 27.19 0.7684 23.59 0.6665
Starfish Parrots Boats House
AC
CE
PT
ฯ=4
ฯ=6
NCSR
A+
AN US
PSNR SSIM 26.28 0.7860 27.24 0.8501 23.71 0.7163
Images
SCSR
SRCNN Proposed
PSNR SSIM 27.73 0.8177 28.70 0.8467 24.83 0.7605
PSNR SSIM 28.40 0.8487 29.77 0.8919 25.17 0.8017
PSNR SSIM 27.72 0.8270 29.08 0.8572 25.10 0.7772
PSNR SSIM 27.85 0.8298 29.19 0.8586 25.17 0.7858
PSNR SSIM 29.22 0.8631 30.57 0.8941 26.59 0.8350
31.48 0.8196 25.77 0.8552 24.43 0.7085 27.16 0.80014 0.3078 0.1142 27.01 0.7872 28.15 0.8751 24.58 0.7219
32.76 0.8675 27.56 0.9099 24.74 0.7485 28.07 0.8477 0.6220 0.7240 27.94 0.8298 29.47 0.8751 25.05 0.7874
31.79 0.8257 26.41 0.8750 24.54 0.7167 27.44 0.8131 0.3874 0.2087 27.30 0.7989 28.52 0.8063 24.86 0.7415
31.88 0.8293 26.46 0.8774 24.76 0.7217 27.55 0.8171 0.4206 0.2429 27.34 0.8037 28.59 0.8085 24.92 0.7435
34.10 0.8792 27.89 0.9103 25.41 0.7588 28.96 0.8568 1.000 1.000 28.33 0.8364 29.87 0.8768 25.96 0.8086
M
NE
ED
Noise
Bicubic
3
ACCEPTED MANUSCRIPT Y. Tang et al. /Information Sciences 29.35 0.7331 24.15 0.7766 23.70 0.6333 25.67 0.7202 0.1081 0.0057 25.81 0.7344
30.47 0.7629 25.46 0.8224 24.21 0.6636 26.65 0.7722 0.2906 0.1198 26.48 0.7510
32.06 0.8507 27.32 0.8999 24.57 0.7301 27.74 0.8288 0.7032 0.7887 27.45 0.8088
30.79 0.7759 26.10 0.8456 24.34 0.6746 26.99 0.7738 0.3913 0.0880 26.78 0.7650
30.85 0.7797 26.12 0.8466 24.53 0.6774 27.06 0.7766 0.4139 0.0999 26.61 0.7544
33.36 0.8659 27.71 0.9040 25.10 0.7389 28.39 0.8384 1.000 1.000 27.60 0.8126
Parrots
26.88 0.8003
26.88 0.7488
27.50 0.7319
29.11 27.85 0.8551 0.7490
27.87 0.7497
29.25 0.8598
Boats
23.55 0.6822
23.49 0.6624
24.26 0.6784
24.86 24.56 0.7694 0.7005
24.58 0.6989
25.45 0.7848
House
29.58 0.7814
28.93 0.7203
29.39 0.7026
31.14 29.71 0.8295 0.7180
29.73 0.7156
32.66 0.8515
Butterfly
23.30 0.7729
24.16 0.7738
25.05 0.7850
26.90 25.71 0.8857 0.8110
25.69 0.7877
27.22 0.8912
Lighthouse
23.79 0.6628
23.79 0.6268
23.92 0.6149
24.49 24.07 0.7122 0.6281
24.13 0.6286
24.86 0.7204
Avg.
AN US
M
25.52 0.7431
25.51 0.7111
26.10 0.7106
27.33 26.45 0.8101 0.7286
26.44 0.7225
27.87 0.8201
0.1612 0.0478
0.1408 0.0093
0.2577 0.0108
0.7470 0.3598 0.7865 0.0286
0.3552 0.0164
1.000 1.000
AC
CE
PT
T-Test
ED
ฯ=8
CR IP T
29.98 0.8095 23.39 Butterfly 0.7891 23.89 Lighthouse 0.6854 25.69 Avg. 0.7647 0.1247 T-Test 0.0543 25.99 Starfish 0.7590 House
4
Table 7 Average PSNR (dB) and SSIM for scaling factors ×2, ×3 and ×4 on datasets Set5, Set14, and B100 among different methods (noise level σn = 0). Each cell lists PSNR/SSIM.

Methods    Set5 ×2        ×3             ×4             Set14 ×2       ×3             ×4             B100 ×2        ×3             ×4
Bicubic    30.67/0.8790   29.19/0.8390   27.51/0.7926   27.69/0.7897   26.46/0.7350   25.42/0.6846   27.56/0.7554   26.52/0.6990   25.33/0.6480
NE         30.99/0.9002   29.87/0.8555   27.54/0.7927   28.50/0.8275   27.24/0.7604   25.52/0.6883   28.17/0.7997   26.85/0.7250   25.46/0.6556
SCSR       -              30.13/0.8733   -              -              27.10/0.7780   -              -              28.13/0.7485   -
NCSR       36.01/0.9431   32.82/0.9132   30.19/0.8664   32.13/0.8969   29.17/0.8217   27.16/0.7522   31.01/0.8768   28.22/0.7844   26.57/0.7090
A+         34.67/0.9371   32.15/0.9027   29.49/0.8454   30.93/0.8796   28.82/0.8100   26.67/0.7394   29.98/0.8549   28.05/0.7735   26.36/0.7021
SRCNN      34.78/0.9387   32.32/0.9028   29.60/0.8549   30.96/0.8821   29.10/0.8258   26.75/0.7496   30.12/0.8564   28.12/0.7845   26.39/0.6960
Proposed   36.68/0.9573   33.51/0.9267   30.89/0.8796   32.61/0.9103   29.96/0.8332   27.71/0.7623   31.79/0.8893   28.94/0.7954   27.15/0.7194
Table 8 Average PSNR (dB) and SSIM for scaling factors ×2, ×3 and ×4 on datasets Set5, Set14, and B100 among different methods (noise level σn = 6). Each cell lists PSNR/SSIM.

Methods    Set5 ×2        ×3             ×4             Set14 ×2       ×3             ×4             B100 ×2        ×3             ×4
Bicubic    30.12/0.8375   28.97/0.8062   27.42/0.7677   27.30/0.7530   26.56/0.7066   25.87/0.6626   27.18/0.7199   26.22/0.6711   25.40/0.6267
NE         30.04/0.7775   28.72/0.7808   27.06/0.7562   27.13/0.7181   26.47/0.6932   25.80/0.6571   26.94/0.6937   26.16/0.6597   25.13/0.6248
SCSR       -              28.71/0.7839   -              -              26.27/0.6980   -              -              27.05/0.6715   -
NCSR       32.94/0.8947   31.34/0.8723   29.51/0.8395   30.01/0.8256   28.32/0.7761   26.76/0.7228   28.95/0.7913   27.48/0.7324   26.18/0.6774
A+         30.72/0.8349   30.29/0.8207   27.83/0.7947   28.52/0.7569   27.56/0.7369   26.10/0.6957   27.37/0.7147   27.01/0.7016   25.81/0.6588
SRCNN      30.68/0.8264   30.06/0.8198   28.41/0.8017   27.77/0.7482   27.68/0.7219   25.95/0.6929   27.31/0.7030   26.89/0.6881   25.56/0.6509
Proposed   33.43/0.9058   31.79/0.8826   29.89/0.8492   30.41/0.8356   28.66/0.7851   27.19/0.7322   29.32/0.8007   27.81/0.7409   26.49/0.6851
Table 9 Average PSNR (dB) and SSIM for scale factors ×2, ×3 and ×4 on datasets Set5, Set14, and B100 with the bicubic downsampling process (noise level σn = 0). Each cell lists PSNR/SSIM.

Methods    Set5 ×2        ×3             ×4             Set14 ×2       ×3             ×4             B100 ×2        ×3             ×4
Bicubic    33.66/0.9299   30.39/0.8682   28.42/0.8104   30.24/0.8688   27.55/0.7742   26.00/0.7027   29.56/0.8431   27.21/0.7385   25.96/0.6675
NE         35.57/0.9490   31.84/0.8956   29.61/0.8402   31.76/0.8993   28.60/0.8076   26.81/0.7331   30.41/0.8648   27.85/0.7592   26.47/0.6951
SCSR       -              31.42/0.8821   -              -              28.31/0.7956   -              -              27.72/0.7647   -
A+         36.54/0.9544   32.58/0.9088   30.28/0.8603   32.28/0.9056   29.13/0.8215   27.32/0.7491   31.21/0.8863   28.29/0.7835   26.82/0.7087
SRCNN      36.66/0.9542   32.75/0.9090   30.48/0.8628   32.42/0.9063   29.28/0.8209   27.49/0.7503   31.36/0.8879   28.41/0.7863   26.90/0.7101
Proposed   36.93/0.9599   33.03/0.9142   30.76/0.8681   32.61/0.9087   29.59/0.8255   27.71/0.7543   31.59/0.8917   28.63/0.7896   27.07/0.7132