Neurocomputing xx (xxxx) xxxx–xxxx
Modified sparse representation based image super-resolution reconstruction method

Li Shang a,⁎, Shu-fen Liu a, Yan Zhou a, Zhan-li Sun b

a Department of Communication Technology, College of Electronic Information Engineering, Suzhou Vocational University, Suzhou 215104, Jiangsu, China
b School of Electrical Engineering and Automation, Anhui University, Hefei 230039, Anhui, China

ARTICLE INFO

Keywords: Sparse representation; K-SVD algorithm; Fast sparse coding (FSC); High resolution (HR) dictionary; Low resolution (LR) dictionary; Super-resolution reconstruction

ABSTRACT
To improve the geometric structure and texture features of reconstructed images, a novel image super-resolution reconstruction (ISR) method based on modified sparse representation, denoted MSR_ISR, is discussed in this paper. In this algorithm, edge and texture features of images are considered jointly, and the over-complete sparse dictionaries of high resolution (HR) and low resolution (LR) image patches, which exhibit clearer structural features, are learned by a feature-classification based fast sparse coding (FSC) algorithm. An LR image is first preprocessed by the contourlet transform to remove unknown noise, and four gradient feature images of the preprocessed LR image are then extracted. For HR image patches, edge features are extracted by the Canny operator; using these edge pixel values as the benchmark, each image patch whose center value equals one of the edge pixel values is marked out, so that edge and texture image patches can be distinguished. The gradient image patches are first classified by the extreme learning machine (ELM) classifier; then, following the class label sequence of the LR image patches, the HR image features are classified correspondingly. Furthermore, using the FSC algorithm based on the k-means singular value decomposition (K-SVD) model, the edge and texture feature-classification dictionaries of HR and LR image patches are trained. With the trained HR and LR dictionaries, an LR image can be reconstructed well. In tests, artificial LR images, namely degraded natural images, are used to verify the proposed ISR method. Using the signal-to-noise ratio (SNR) criterion to estimate the quality of reconstructed images and comparing with the common K-SVD, FSC and FSC based K-SVD algorithms without feature classification, simulation results show that our method yields a clear improvement in visual quality and retains image edge and texture features well.
1. Introduction

The spatial high resolution (HR) problem of images has always been a hot topic in the image processing field [1,2], and with the progress of science and technology, ever clearer restored images receive more and more attention in applications. However, owing to the limited number of detector elements and restrictions of the detector structure in the imaging system, the spatial sampling frequency usually cannot satisfy the sampling theorem [3,4]; blurred images are therefore inevitable. To overcome these physical limitations and the expensive cost of optical hardware devices, many image super-resolution algorithms have been developed in recent years [5–7]. ISR is a technology that utilizes multiple low resolution (LR) images of the same scene, carrying complementary information, to reconstruct one or several HR images [7–9]. The earliest idea of super-resolution was proposed by Toraldo di Francia in 1955, while restoration-based formulations were proposed by Harris in 1964 and Goodman in 1965 [9]. In 1982, Youla and Webb first combined the ideas of super-resolution and reconstruction and proposed the ISR algorithm based on projections onto convex sets (POCS) [10]. Over the following 20 years, ISR algorithms were explored and developed by many domestic and foreign researchers, and numerous ISR algorithms have been widely applied in the image processing field [11]. Generally, ISR algorithms are divided into two types: frequency domain based methods and spatial domain based methods [12]. Frequency domain based methods, such as Fourier transform, discrete cosine transform (DCT) and wavelet transform methods, have the advantages of simple and intuitive theory, lower
☆ A preliminary version of this manuscript was selected as one of the best papers at the International Conference on Intelligent Computing (ICIC 2015), 2015 (Paper ID: 547) and recommended to the Neurocomputing journal.
⁎ Corresponding author.
E-mail addresses: [email protected] (L. Shang), [email protected] (S.-f. Liu), [email protected] (Z.-l. Sun).
http://dx.doi.org/10.1016/j.neucom.2016.09.090
Received 25 December 2015; Received in revised form 15 March 2016; Accepted 3 September 2016
0925-2312/ © 2016 Elsevier B.V. All rights reserved.
calculation complexity, easy implementation and so on. However, these methods only consider global displacement and spatially linear invariance; moreover, they lack sufficient prior knowledge and are limited in application [10,11], so they are no longer a research focus. Compared with frequency domain based methods, spatial domain based methods have higher computational complexity, but they are more flexible and adaptable. Typical spatial methods include interpolation based methods [11,12], iterative back-projection (IBP) methods [13], POCS methods [11], maximum likelihood (ML) methods [13], maximum a posteriori (MAP) methods [14], total variation (TV) based methods [15–17], partial differential equation (PDE) methods [17], sparse representation methods, and various modified algorithms based on MAP, TV, ML, etc. Currently, this type of method is further classified into reconstruction based methods and learning based methods [18–21].

Reconstruction based methods, such as IBP, ML and MAP, are commonly used to solve the ill-posed ISR problem [22,23]. They reduce the reconstruction error iteratively and retain image edge features well, but they cannot obtain smooth contour profiles. To further improve reconstruction quality, learning based methods were developed and quickly became the research hotspot [24–27]. This type of method predicts the high frequency information lost in the imaging process by learning prior knowledge relating HR and LR images [26–28]; the basic assumption is that LR images are the low frequency components of HR images. Learning based models include two stages: dictionary training and image reconstruction. In the training stage, HR and LR image patches are used as input samples to train the HR and LR dictionaries. In the reconstruction stage, for each input LR patch, the HR patches most similar to it are found in the dictionaries and used as its high frequency components, and the HR patch is obtained by adding these high frequency components to the LR patch [28–30]. Common learning based methods include Markov random field (MRF) based ones [30,31], locally linear embedding (LLE) based ones [32,33], sparse representation based ones [34,35] and so on. Although MRF based methods can improve the edge quality of reconstructed images, they need millions of HR and LR image patches to achieve good reconstruction. LLE based methods greatly reduce the computational load, but they neglect the case in which adjoining image patches do not exist, which can cause over-fitting or under-fitting [32,33].

The concept of sparse representation on an over-complete dictionary was first proposed by Mallat in 1993 [34]. In Mallat's algorithm, the over-complete dictionary is a Gabor dictionary and the matching pursuit (MP) algorithm is used for sparse optimization. Subsequently, many sparse optimization algorithms were developed, such as orthogonal MP (OMP), regularized OMP (ROMP), sparsity adaptive MP (SAMP), order recursive matching pursuit (ORMP), stagewise OMP (StOMP), compressive sampling matching pursuit (CoSaMP) and kernel matching pursuit (KMP); currently, this class of methods is a hot topic in the ISR field [34,35]. Usually, over-complete dictionaries are obtained in two ways: by analytic mathematical tools, or by training on sample data [35,36]. Dictionaries of the first type are simple to compute and easy to realize, but they cannot represent complex image features [37]; dictionaries trained on sample data are much more accurate and can describe images adaptively [37–40]. A representative sparse representation method was proposed by Yang et al. in 2010 [18]. Yang's method avoids the defects of LLE and MRF based methods and reduces the dimension of the dictionaries, but it only considers first and second order partial derivative features while neglecting image edge and texture features; therefore, for images with complex edge and texture information, it cannot obtain good reconstruction results [32]. Moreover, it must perform the whole reconstruction process for each image patch and thus takes much time to finish image reconstruction. Zhang et al. improved Yang's method by replacing its reconstruction steps with simple linear matrix multiplication [36], greatly reducing the computational load; however, because the nonlinear computation is replaced with a linear one, Zhang's method is easily affected by noise and its precision is reduced [36]. The most widely used sparse representation method is the k-means based singular value decomposition (K-SVD) algorithm [41,42], in which the HR and LR dictionaries are commonly learned using optimization algorithms such as MP, OMP [43], ROMP [42] and SAMP [43,44]. To an extent, these methods obtain high image resolution and need only one input image, but they cannot capture many image details and still consume much computation time [44,45].

In view of the above, to reduce the convergence time and obtain better image structure and details, we propose a modified sparse representation based ISR (MSR_ISR) method, in which the sparse dictionaries are trained by a fast sparse coding (FSC) based K-SVD algorithm [46–48] that consumes less time than the K-SVD variants mentioned above. Our method exploits the classified features of LR image patches. An LR image is first preprocessed by the contourlet transform to remove unknown noise. Then four gradient feature images of the contourlet result, in the first and second order horizontal and vertical directions, are extracted by the gray level co-occurrence gradient matrix algorithm. Each gradient image is randomly divided into patches of fixed pixel size to obtain the LR patch set. For HR image patches, edge features are extracted by the Canny algorithm, and the corresponding edge pixel positions are recorded. Using the center values of image patches as the criterion for image edge pixels, the edge and texture image patches are obtained. The extreme learning machine (ELM) classifier [22] is then used to classify the LR gradient image patches, and the cluster center value of each class is calculated; according to the classification labels of the LR patch features, the HR image features are classified correspondingly. For the classified samples, the FSC based K-SVD algorithm is used to train the feature dictionaries. With the learned HR dictionary and LR feature coefficients, the LR image can be reconstructed well. Finally, using signal-to-noise ratio (SNR) values to estimate the quality of reconstructed natural images, simulation results show that the proposed method obtains better visual quality and more image details.
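The greedy pursuit algorithms listed above (MP, OMP and their variants) all follow the same pattern: repeatedly select the dictionary atom most correlated with the current residual, then re-fit the selected atoms. A minimal illustrative OMP sketch is given below; the dictionary sizes and the toy signal are our own assumptions, not taken from the paper.

```python
import numpy as np

def omp(D, x, n_nonzero):
    """Orthogonal matching pursuit: pick the atom most correlated with the
    residual, then re-fit all chosen atoms by least squares."""
    residual = x.copy()
    support = []
    s = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        # atom with the largest absolute correlation with the residual
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k not in support:
            support.append(k)
        # least-squares fit on the current support
        coeffs, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coeffs
    s[support] = coeffs
    return s

# toy check: a signal built from two atoms of a random unit-norm dictionary
rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
D /= np.linalg.norm(D, axis=0)       # unit-norm atoms, as K-SVD assumes
x = 3.0 * D[:, 7] - 2.0 * D[:, 31]
s = omp(D, x, n_nonzero=2)
print(np.nonzero(s)[0])              # support selected by OMP
```

The residual re-fit on the full support is what distinguishes OMP from plain MP and makes each atom selected at most once effective.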
2. Basic idea of image super-resolution

The task of super-resolution is cast as an inverse problem of recovering the original HR image by fusing the LR inputs, based on reasonable assumptions or prior knowledge about the observation model [3–5,18]. Consider the desired HR image of size $L_1N_1 \times L_2N_2$, written in lexicographical notation as $X = [x_1, x_2, \ldots, x_N]^T$, where $N = L_1N_1 \times L_2N_2$, and $L_1$ and $L_2$ represent the down-sampling factors of the observation model in the horizontal and vertical directions. Namely, $X$ is the ideal super-resolution image, sampled at or above the Nyquist rate from a continuous scene that is assumed to be band limited. The measured images are then connected to the ideal HR image via the following relationship:

$$\tilde{X}_k = BG_kF_kX + E_k \quad (1 \le k \le N) \tag{1}$$

where $B$ is a sub-sampling matrix of size $(N_1N_2) \times (L_1N_1L_2N_2)$, $G_k$ is a linear space-variant blur operator of size $(L_1N_1L_2N_2) \times (L_1N_1L_2N_2)$, $F_k$ is a geometric warp matrix of the same size performed on the ideal image to generate the $k$th measurement, and $E_k$ represents a lexicographically ordered noise vector, uncorrelated with the measurements and the ideal HR image. In application, all the above matrices are unknown. Considering the factors of sub-sampling, blurring and motion, and letting $W_k = BG_kF_k$ of size $(N_1N_2) \times (L_1N_1L_2N_2)$, Eq. (1) can be rewritten as follows:

$$\begin{cases} \tilde{X}_k = W_kX + E_k \\ X = W_k^{-1}\tilde{X}_k - W_k^{-1}E_k \end{cases} \tag{2}$$

where $W_k$ represents the point spread function (PSF) of the LR sensor and the contribution of the HR pixels in $X$ to the LR pixels in $\tilde{X}_k$. Based on Eq. (2), the aim of super-resolution image reconstruction is to estimate the HR image $X$ from the LR images $\tilde{X}_k$, $k = 1, 2, \ldots, P$. It should be noted that the super-resolution problem is severely ill-posed: although many ISR methods have been proposed, exact recovery is impossible. However, according to the sparse representation idea developed in recent years, patch-wise sparse representation demonstrates both effectiveness and robustness in regularizing this inverse problem [18]. In sparse representation based ISR methods, sparse prior knowledge of the local image patches is used to recover the HR image, and methods such as Yang's and Zhang's, described in [18,36], have been proved efficient to some extent in the ISR research field.

3. Sparse representation of images

3.1. Basic sparse representation idea

The basic idea of sparse representation is that natural signals can be represented in a compressed way, i.e. by a linear combination of prototype atoms [29]. Assume that a signal $x \in R^N$ can be represented as a sparse linear combination of atoms from an over-complete dictionary $D$; namely, $x = Ds$, or in approximate form $x \approx Ds$ satisfying $\|x - Ds\|_p \le \varepsilon$, where $D = \{d_1, \ldots, d_k, \ldots, d_K\} \in R^{N \times K}$ ($N \ll K$) is an over-complete dictionary whose columns $\{d_j\}_{j=1}^{K}$ are prototype signal atoms, and $s \in R^K$ is a vector with very few nonzero entries [18,24]. In approximation methods, typical norms for measuring the deviation are the $\ell_p$ norms ($p = 1, 2, \ldots, \infty$); generally the case $p = 2$ is considered. If $N < K$ and $D$ is a full-rank matrix, an infinite number of solutions exist, hence constraints on the solution must be imposed. In practice, the sparsest representation is usually obtained from the following cost function [18,24,43]:

$$\min_s \|s\|_0 \quad \text{subject to} \quad \|x - Ds\|_2^2 \le \varepsilon \tag{3}$$

where $\|\cdot\|_0$ is the $\ell_0$ norm, counting the nonzero entries of the vector $s$. Considering the regularized expression, Eq. (3) can be rewritten as

$$\min_s \|x - Ds\|_2^2 + \lambda f(s) \tag{4}$$

where $\lambda$ is the regularization parameter and $f(s)$ is the sparse constraint function. The choices of $f(\cdot)$ used by Olshausen et al. [21,22] are the forms $-e^{-x^2}$, $\log(1 + x^2)$ and $|x|$; the reason for these choices is that, among activity states with equal variance, they favor those with the fewest nonzero coefficients.

3.2. Model of image sparse representation

For an image patch of $l \times l$ pixels, the sparse representation is the same as that of a random signal. Here, let $X$ denote the data matrix of an image, $S$ the feature coefficient matrix, and $d_k$ the $k$th atom of the sparse dictionary $D$. Then the sparse representation cost function of an image is defined in this paper as follows:

$$\min_S \Big\|X - \sum_{k}^{K} d_ks_k\Big\|_2^2 + \lambda\sum_k f(s_k) \tag{5}$$

4. Modified FSC algorithm

4.1. Our model of sparse representation

To ensure the maximal sparsity of the matrix $S$ and the orthogonality of the column vectors $d_k$, so as to reduce the redundancy between the column vectors of the dictionary $D$, on the basis of Eq. (5) the following model of image sparse representation is defined by us:

$$\min_S \|S\|_0 \quad \text{s.t.} \quad \Big\|X - \sum_{k}^{K} d_ks_k\Big\|_2^2 + \lambda\sum_k \phi(s_k) + \alpha\sum_k (d_k^Td_k) \tag{6}$$

subject to the constraints $\alpha, \lambda > 0$ and $\|d_k\| = 1$ ($k = 1, 2, \ldots, K$), where $s_k$ is the $k$th row vector of the sparse coefficient matrix $S$. To favor sparse coefficients, the prior distribution of each coefficient $s_k$ is defined as $P(s_k) \propto \exp(-\lambda_1\phi(s_k))$, where $\lambda_1$ is a positive constant and $\phi(\cdot)$ is generally selected from the following forms:

$$\phi(s_j) = \begin{cases} |s_j|_1 & (L_1 \text{ penalty function}) \\ (s_j^2 + \varepsilon)^{1/2} & (\text{epsilon } L_1 \text{ penalty function}) \\ \log(1 + s_j^2) & (\log \text{ penalty function}) \end{cases} \tag{7}$$

In this paper, the sparse constraint function $\phi(s_k)$ is selected as the negative logarithm of the Laplace density function of $s_k$. Further, let $X_h$ and $X_l$ respectively denote the HR and LR image patch sets, and $D_h$ and $D_l$ respectively denote the HR and LR sparse dictionaries; then the representation of the $k$th image patch can be recovered by the following minimization formula:

$$\min_{s_k} \|s_k\|_0 \quad \text{s.t.} \quad \begin{cases} \|X_h - D_hS\|_2^2 + \lambda_h\sum_k \phi(s_k) + \alpha_h\sum_k (d_{hk}^Td_{hk}) \\ \|X_l - D_lS\|_2^2 + \lambda_l\sum_k \phi(s_k) + \alpha_l\sum_k (d_{lk}^Td_{lk}) \end{cases} \tag{8}$$

where $d_{hk}$ and $d_{lk}$ are respectively the $k$th atoms of the HR dictionary $D_h$ and the LR dictionary $D_l$, and $s_k$ is the optimal solution to Eq. (8). The HR image patches can then be reconstructed by the formula $x_{hk} = D_hs_k$.

4.2. Optimized objective function

For an LR image patch $y$, its sparse representation with respect to $D_l$ can be found by sparse optimization methods; according to these coefficients, the corresponding HR dictionary $D_h$ is combined to generate the output HR image patch $x_h$. Combining the $D_h$ and $D_l$ dictionaries, forcing the HR and LR representations to share the same codes, and considering the matrix form, the model of Eq. (6) is rewritten as follows:

$$\arg\min_{D_h, D_l, S} \left\{ \frac{1}{K_h}\|X_h - D_hS\|_2^2 + \frac{1}{K_l}\|X_l - D_lS\|_2^2 + \beta\Big(\frac{1}{K_h} + \frac{1}{K_l}\Big)\sum_k \phi(s_k) + \alpha_h\sum_k (d_{hk}^Td_{hk}) + \alpha_l\sum_k (d_{lk}^Td_{lk}) \right\} \tag{9}$$

subject to $\beta, \alpha_l, \alpha_h > 0$. The parameters $K_h$ and $K_l$ are the dimensions of the HR and LR image patches in vector form, and the parameter $\beta$ controls the tradeoff between matching the LR input and finding an HR patch that is compatible with its neighbors. The sparse optimization of Eq. (9) is solved by our modified FSC algorithm. When the optimal solution $s_k^* \in S$ is given, the $k$th HR patch can be reconstructed as $x_{hk} = D_hs_k^*$, and the patch $x_{hk}$ is put into the HR image $\tilde{X}$. Using gradient descent, the closest image to $\tilde{X}$ is found that satisfies the following reconstruction constraint [8]:

$$X^* = \arg\min_X \left\{ \frac{1}{2}\|Y - HX\|_2^2 + \gamma\|\tilde{X} - X\|_2^2 \right\} \tag{10}$$
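The reconstruction constraint of Eq. (10) is a quadratic problem, so plain gradient descent suffices. The sketch below is a toy 1-D illustration with an assumed pairwise-averaging projection matrix and assumed step size and weight, not the authors' implementation.

```python
import numpy as np

def enforce_reconstruction(X0, Y, H, gamma=0.1, step=0.5, n_iter=200):
    """Gradient descent on 0.5*||Y - H X||^2 + gamma*||X0 - X||^2 (Eq. (10)):
    pull the sparse-coding result X0 toward consistency with the observed
    LR image Y under the projection matrix H."""
    X = X0.copy()
    for _ in range(n_iter):
        grad = -H.T @ (Y - H @ X) + 2.0 * gamma * (X - X0)
        X = X - step * grad
    return X

# toy example: H averages pixel pairs (factor-2 down-sampling)
n = 8
H = np.zeros((n // 2, n))
for i in range(n // 2):
    H[i, 2 * i] = H[i, 2 * i + 1] = 0.5
X_true = np.arange(n, dtype=float)
Y = H @ X_true                        # observed LR signal
X0 = X_true + 0.3                     # pretend sparse coding gave a biased estimate
X_star = enforce_reconstruction(X0, Y, H)
print(np.linalg.norm(H @ X_star - Y))  # residual shrinks versus H @ X0
```

The two gradient terms mirror the two terms of Eq. (10): data consistency with the LR observation, and fidelity to the sparse-coding output.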
Fig. 1. Original test images and their LR versions. The first row: Elaine image. The second row: the man-made image. (a) Original image. (b) LR image 1. (c) LR image 2.
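The LR inputs in Fig. 1 are degraded versions of the HR originals; Section 6 describes the degradation as blurring, down-sampling and additive Gaussian noise. A minimal sketch of such a generator follows, with a simple mean filter standing in for the PSF/motion filters and all parameter values assumed.

```python
import numpy as np

def degrade(hr, blur_size=3, factor=2, noise_std=0.01, seed=0):
    """Toy LR-image generator: mean-filter blur, factor-2 down-sampling,
    additive Gaussian noise (assumed stand-ins for the PSF/motion filters)."""
    k = np.ones((blur_size, blur_size)) / blur_size**2
    pad = blur_size // 2
    padded = np.pad(hr, pad, mode="edge")
    blurred = np.zeros_like(hr)
    H, W = hr.shape
    for i in range(H):                     # direct 2-D convolution with the kernel
        for j in range(W):
            blurred[i, j] = np.sum(padded[i:i + blur_size, j:j + blur_size] * k)
    lr = blurred[::factor, ::factor]       # down-sample
    rng = np.random.default_rng(seed)
    return lr + rng.normal(0.0, noise_std, lr.shape)  # add Gaussian noise

hr = np.tile(np.linspace(0.0, 1.0, 16), (16, 1))      # toy gradient image
lr = degrade(hr)
print(hr.shape, "->", lr.shape)                       # (16, 16) -> (8, 8)
```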
Table 1
SNR values of denoised images using the contourlet transform with different numbers of layers and directions under different noise levels (LR images of Elaine).

Noise level | 2 layers (4 / 8 / 16 directions) | 3 layers (4 / 8 / 16 directions) | 4 layers (4 / 8 / 16 directions) | Noisy image
0.01 | 12.59 / 12.48 / 12.32 | 12.85 / 12.76 / 12.55 | 12.62 / 12.61 / 12.25 | 5.21
0.03 | 4.96 / 4.74 / 4.55 | 5.12 / 4.89 / 4.66 | 5.11 / 4.88 / 4.64 | 0.69
0.05 | 1.48 / 1.32 / 1.22 | 1.56 / 1.40 / 1.25 | 1.53 / 1.43 / 1.20 | <0

where $Y$ is the LR image, $X^*$ is the image restored by the super-resolution method, and $H$ is the projection matrix implementing the transform $Y = H\tilde{X}$.

4.3. Sparse coefficient learning

The fast sparse coding (FSC) algorithm is based on iteratively solving two convex optimization problems: the $L_1$-regularized least squares problem and the $L_2$-constrained least squares problem. $L_1$ regularization is known to produce sparse coefficients and to be robust to irrelevant features. Referring to the basic FSC algorithm and considering the matrix operation, the foregoing model can be rewritten as follows:

$$\min_{D,S} J(D, S) = \frac{1}{2}\|X - DS\|_2^2 + \delta\sum_{k=1}^{m}\sum_{j=1}^{n}\phi(S_{kj}) + \eta\sum_{i}(D_{ik}^TD_{ik}) \tag{11}$$

subject to $\sum_i (D_{ik})^2 \le c$ ($\forall i = 1, 2, 3, \ldots, n$). Assume that the $L_1$ or epsilon-$L_1$ penalty function is used as the sparse constraint of the feature coefficients. With $S$ fixed, the optimization is convex in $D$; likewise, with $D$ fixed, it is convex in $S$; but it is not convex in both simultaneously. Generic convex optimization solvers are generally too slow for this problem, and gradient descent using iterative projections often shows slow convergence. Therefore, to reduce the iteration time, the $L_1$- and $L_2$-constrained least squares subproblems are solved alternately as two convex optimization problems. For the coefficient update, the $L_1$-regularized least squares problem is solved by the feature-sign search, and the Lagrange dual is used to solve the $L_2$-constrained least squares problem when learning the feature vectors. In Eq. (11), keeping the feature bases (i.e. the atoms of the dictionary) fixed and imposing the $L_1$ penalty over the coefficients $\{s_k\}$, the updating formula of the sparse coefficient vector can be written as follows [18,26]:

$$\min_{\vec{s}^{(i)}} \Big\|X - \sum_k d_ks_k^{(i)}\Big\|_2^2 + \delta\sum_k \big|s_k^{(i)}\big| \tag{12}$$

The learning process of Eq. (12) is called the feature-sign search algorithm. If the signs of the $s_k^{(i)}$ are known at the optimal value, each of the terms $|s_k^{(i)}|$ can be replaced with $s_k^{(i)}$ (if $s_k^{(i)} > 0$), $-s_k^{(i)}$ (if $s_k^{(i)} < 0$), or 0 (if $s_k^{(i)} = 0$). Considering only the nonzero coefficients, this reduces Eq. (12) to a standard unconstrained quadratic optimization problem; the key is therefore to determine the sign of $s_k^{(i)}$. Once the sign of $s_k^{(i)}$ is selected, the sparse coefficients can be learned by the gradient descent algorithm:

$$\begin{cases} s_j^{(i)}(t+1) = s_j^{(i)}(t) + d_j^T\Big(X - \sum_{j=1}^{K} d_js_j^{(i)}\Big) \\ s_j^{(i)}(t+1) = s_j^{(i)}(t+1)\big/\big\|s^{(i)}(t+1)\big\|_2 \end{cases} \tag{13}$$

This algorithm converges to a global optimum in a finite number of steps.

4.4. Feature bases learning

With the sparse coefficient matrix $S$ fixed, the dictionary $D$ can be learned from the following problem [18,26]:

$$\min_D \|X - DS\|_2^2 + \eta\sum_i (D_{ik}^TD_{ik}) \tag{14}$$

subject to $\sum_{i=1}^{n} D_{i,j}^2 \le c$, $\forall i = 1, 2, 3, \ldots, n$, where $c$ is a constant. This is a least squares problem with quadratic constraints, which can be solved much more efficiently using the Lagrange dual.
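The $L_1$-regularized coefficient step of Eq. (12) can also be solved by plain iterative soft-thresholding (ISTA). This is not the paper's feature-sign search, but it minimizes the same objective and makes the shrinkage behavior of the $L_1$ penalty concrete; the dictionary, signal and penalty weight below are assumed toy values.

```python
import numpy as np

def ista(D, x, delta, n_iter=500):
    """Solve min_s ||x - D s||_2^2 + delta * ||s||_1 by iterative
    soft-thresholding -- a standard alternative to feature-sign search."""
    L = np.linalg.norm(D, 2) ** 2            # squared spectral norm of D
    s = np.zeros(D.shape[1])
    for _ in range(n_iter):
        g = s + (D.T @ (x - D @ s)) / L      # gradient step on the data term
        s = np.sign(g) * np.maximum(np.abs(g) - delta / (2 * L), 0.0)  # shrink
    return s

rng = np.random.default_rng(1)
D = rng.standard_normal((30, 60))
D /= np.linalg.norm(D, axis=0)               # unit-norm atoms
x = 2.0 * D[:, 5] - 1.5 * D[:, 40]           # 2-sparse toy signal
s = ista(D, x, delta=0.05)
print(int(np.sum(np.abs(s) > 1e-3)), "active atoms")
```

The soft-threshold is exactly the proximal operator of the $L_1$ penalty, so each iteration zeroes out small coefficients while slightly shrinking the large ones.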
Fig. 2. Results of contourlet transform of Elaine’s LR image and man-made image (3 layers and 4 directions in each layer). (a) to (d): Results of contourlet transform in 1, 2 and 3 layers.
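The multi-scale stage of the contourlet transform shown in Fig. 2 is a Laplacian pyramid: each level splits the image into a low-pass sub-image and a band-pass residual. A minimal one-level sketch follows, with 2×2 averaging and nearest-neighbor upsampling standing in for the real LP filters (an assumption for illustration only).

```python
import numpy as np

def lp_level(img):
    """One Laplacian-pyramid level: low-pass sub-image + band-pass residual.
    (2x2 averaging / nearest upsampling stand in for the actual LP filters.)"""
    low = 0.25 * (img[0::2, 0::2] + img[1::2, 0::2]
                  + img[0::2, 1::2] + img[1::2, 1::2])    # low-frequency sub-image
    up = np.repeat(np.repeat(low, 2, axis=0), 2, axis=1)  # back to full size
    band = img - up                                       # high-frequency residual
    return low, band

def lp_reconstruct(low, band):
    up = np.repeat(np.repeat(low, 2, axis=0), 2, axis=1)
    return up + band                                      # perfect reconstruction

img = np.arange(64, dtype=float).reshape(8, 8)
low, band = lp_level(img)
print(np.allclose(lp_reconstruct(low, band), img))        # True
```

In the real contourlet transform, the band-pass residual then enters a directional filter bank, which is the second stage described in Section 6.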
The Lagrangian form of Eq. (14) is written as follows:

$$L(D, \vec{\lambda}) = \mathrm{trace}\big[(X - DS)^T(X - DS) + \delta(D^TD)\big] + \sum_{j=1}^{n}\lambda_j\Big(\sum_{i=1}^{n} D_{i,j}^2 - c\Big) \tag{15}$$

where $\lambda_j > 0$ is a dual variable. Minimizing over $D$ analytically, the Lagrange dual is obtained:

$$D_u(\vec{\lambda}) = \min_D L(D, \vec{\lambda}) = \mathrm{trace}\big[X^TX - XS^T(SS^T + \Lambda)^{-1}(XS^T)^T + \delta D^TD - c\Lambda\big] \tag{16}$$

where $\Lambda = \mathrm{diag}(\vec{\lambda})$. Using Newton's method or conjugate gradient, the Lagrange dual of Eq. (14) can be optimized, and the optimal basis vectors are deduced as follows:

$$D = \Big[(SS^T + \mathrm{diag}(\vec{\lambda}))^{-1}(XS^T)^T + \eta D\Big]^T \tag{17}$$

The advantage of solving the dual is that it uses significantly fewer optimization variables than the primal. Note that the dual formulation is independent of the sparsity function and can be extended to other similar models. Clearly, just as with SC and ICA, the features learned by FSC exhibit distinct sparsity, locality and orientation.

Fig. 3. Restored results obtained by the contourlet transform method. (a) Restored Elaine image. (b) Restored man-made image.

5. ISR method of the FSC based K-SVD algorithm

5.1. Steps of our ISR method

The K-SVD algorithm usually contains two stages: the sparse coding stage and the dictionary updating stage [42]. This algorithm is widely used in the sparse representation of natural images; it is flexible, works in conjunction with any pursuit algorithm, and is designed to be a direct generalization of K-means. Usually, in common sparse representation algorithms, the sparse coefficients are learned by optimization algorithms such as BP, OMP, ROMP, SAMP and so on. However, these sparse representation methods require more measurements for perfect reconstruction, and their convergence is very slow. Therefore, in this paper, the FSC algorithm is used to learn the sparse coefficients in the sparse coding step, and the updating of the HR and LR dictionaries is implemented by the common K-SVD algorithm. The reconstruction method proposed here is summarized briefly as follows:

Step 1. Preprocessing HR image data. Let $X$ be an HR image, $\tilde{X}$ the estimation of $X$, $Y$ the LR version of $X$, and $\tilde{Y}$ the estimation of $Y$. Each image is sampled randomly $L$ times with image patches of $p \times p$ pixels, and the HR image patch set of $p^2 \times L$ pixels, denoted by $X_h = \{X_k\}_{k=1}^{L}$, is obtained.

Step 2. Preprocessing LR image data. For an LR image, the contourlet transform is first used to remove unknown noise. The contourlet transform algorithm has been widely used in image denoising and is described in many published documents. Its result is denoted by $\tilde{Y}$.

Step 3. Extracting edge features of the HR image patches using the Canny algorithm; the corresponding edge pixel values and their positions are recorded at the same time.

Step 4. Extracting edge and texture features of the LR image patches. The gradient images of the LR image preprocessed by the contourlet transform (i.e. $\tilde{Y}$), of the first and second order in the horizontal and
Fig. 4. Some edge feature image patches randomly selected from HR edge image patch set. The first two rows: Elaine image patch set. The last two rows: man-made image patch set.
vertical directions, namely four gradient images, are first computed. Then each gradient image is sampled randomly with $p \times p$ pixel patches, and the LR image patch set, denoted by $X_l = \{X_k\}_{k=1}^{L}$, is obtained. Using the edge pixel values of the HR image patches as the benchmark, it is determined whether each LR image patch's center value equals one of the edge pixel values; if so, the center pixel value is used as the threshold value by which the LR edge and texture image patch sets are distinguished.

Step 5. Classifying the learned features of the LR image patches with the ELM classifier; the cluster center value of each class of image patches is calculated and saved.

Step 6. Learning the sparse dictionary of the HR image features with the common K-SVD model; the learned HR feature dictionary is denoted by $D_{fh}$.

Step 7. Learning the feature dictionary of the LR image features with the FSC based K-SVD model; the learned LR feature dictionaries are denoted by $D_{fl}^{edge}$ and $D_{fl}^{texture}$ respectively.

Step 8. Calculating the sparse coefficient matrix of each feature corresponding to the learned feature dictionaries by the form $S = D^{-1}X$, then combining the HR feature dictionary to obtain the features of the HR image patches. At the same time, the patch pixels obtained in this step are added to the mean value of the HR image patches; thus the final HR image patches are obtained.

Step 9. Reconstructing the LR image based on the idea of the ISR method.
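The dictionary updating stage that Steps 6 and 7 rely on can be sketched compactly: for each atom, K-SVD forms the residual over the patches that actually use that atom and replaces the atom and its coefficient row with the best rank-1 approximation of that residual. The sketch below is a minimal illustration with assumed toy sizes, not the authors' implementation.

```python
import numpy as np

def ksvd_update(D, S, X):
    """One K-SVD dictionary-updating pass: refresh each atom d_k and its
    coefficient row from a rank-1 SVD of the residual restricted to the
    patches whose sparse codes use atom k."""
    for k in range(D.shape[1]):
        users = np.nonzero(S[k, :])[0]           # patches using atom k
        if users.size == 0:
            continue
        # residual with atom k's contribution added back in
        E = X[:, users] - D @ S[:, users] + np.outer(D[:, k], S[k, users])
        U, sig, Vt = np.linalg.svd(E, full_matrices=False)
        D[:, k] = U[:, 0]                        # best rank-1 atom (unit norm)
        S[k, users] = sig[0] * Vt[0, :]          # matching coefficients
    return D, S

rng = np.random.default_rng(2)
X = rng.standard_normal((16, 40))                # 40 vectorized image patches
D = rng.standard_normal((16, 8))
D /= np.linalg.norm(D, axis=0)
S = rng.standard_normal((8, 40)) * (rng.random((8, 40)) < 0.3)  # sparse codes
err0 = np.linalg.norm(X - D @ S)
D, S = ksvd_update(D, S, X)
print(err0, "->", np.linalg.norm(X - D @ S))     # error does not increase
```

Because the rank-1 SVD is optimal over the restricted columns and patches not using the atom are untouched, each atom update can only decrease the overall reconstruction error.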
Fig. 5. Four gradient images and the corresponding texture feature images (Elaine image). (a) to (b): Gradient images of 1-order horizontal and vertical images. (c) to (d) Gradient images of 2-order horizontal and vertical images. (e) to (h): Real texture image corresponding to (a) to (d) respectively. (i) to (l): Imaginary texture image corresponding to (a) to (d) respectively.
5.2. The ELM classifier

Originally, ELM was explored for single hidden layer feedforward networks (SLFNs) as an alternative to the classical gradient-based algorithms. ELM learning achieves a much faster learning speed with higher generalization performance, especially in pattern recognition and regression problems [16]. When ELM is used as a classifier, the principle of classification is based on the two-class problem, as in the support vector machine (SVM) classifier. Assume that samples $\{x_i, y_i\}_{i=1}^{V}$ are given, where $x_i = [x_{i1}, x_{i2}, \ldots, x_{iv}]^T$ and $y_i = [y_{i1}, y_{i2}, \ldots, y_{iu}]^T$; the decision function can be defined as follows:

$$f(x) = \mathrm{sgn}\Big[\sum_{i=1}^{J}\beta_iG(\omega_i, b_i, x)\Big] = \mathrm{sgn}[\beta H(x)] \tag{18}$$

where $H(x)$ is represented as follows:

$$H(x) = [h_1, h_2, \ldots, h_J] = \begin{bmatrix} g(\omega_1\cdot x_1 + b_1) & \cdots & g(\omega_J\cdot x_1 + b_J) \\ \vdots & \ddots & \vdots \\ g(\omega_1\cdot x_V + b_1) & \cdots & g(\omega_J\cdot x_V + b_J) \end{bmatrix}_{V\times J} \tag{19}$$

where $\omega_i = [\omega_{i1}, \omega_{i2}, \ldots, \omega_{in}]^T$ is the weight vector connecting the $i$th hidden neuron and the input neurons, $\beta = [\beta_1, \beta_2, \ldots, \beta_J]^T$ is the weight matrix connecting the hidden neurons and the output neurons, and there are $J$ hidden neurons with activation function $g(x)$, which can be chosen as the sigmoid function, the hard-limit function, the multiquadric function and so on [20]. The generalization performance of ELM is optimal when the following equation is optimized:

$$\arg\min_{\beta}\big(\|H\beta - Y\|_2, \|\beta\|\big) \tag{20}$$

It is easy to see that when the output samples are mapped to the feature space of ELM, the solution of $H(x)$ is linearly separable and unique. According to optimization theory, the minimization
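The ELM training rule implied by Eqs. (18)–(20) — random hidden weights and biases, then output weights from the generalized inverse of the hidden layer output matrix — fits in a few lines. The sketch below uses an assumed toy two-class problem and assumed sizes, purely for illustration.

```python
import numpy as np

def elm_train(X, Y, n_hidden=40, seed=0):
    """Minimal ELM sketch: random input weights/biases, sigmoid hidden layer,
    output weights from the pseudo-inverse of H (least-squares, Eq. (20))."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))      # hidden layer output matrix
    beta = np.linalg.pinv(H) @ Y                # generalized-inverse solution
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return np.sign(H @ beta)                    # two-class decision, Eq. (18)

# toy two-class problem: label is the sign of the first coordinate
rng = np.random.default_rng(3)
X = rng.standard_normal((200, 2))
Y = np.sign(X[:, 0])
W, b, beta = elm_train(X, Y)
acc = np.mean(elm_predict(X, W, b, beta) == Y)
print("training accuracy:", acc)
```

Only `beta` is learned; the hidden layer stays at its random initialization, which is exactly the point of contrast with SVM drawn in the text.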
Fig. 6. Four gradient images and the corresponding texture feature images (man-made image). (a) to (b): Gradient images of 1-order horizontal and vertical images. (c) to (d) Gradient images of 2-order horizontal and vertical images. (e) to (h): Real texture image corresponding to (a) to (d) respectively. (i) to (l): Imaginary texture image corresponding to (a) to (d) respectively.
Kuhn-Tucker condition, for ∀ i , the solution of ELM quadratic function is obtained:
problem of the ELM classification hyper-plane is defined as
min E =
1 β 2
V 2
+ γ ∑ ξi
⎧ ai = 0 ⇒ yi β⋅h (x ) ≥ 1 ⎪ ⎨ 0 ai γ = 0 ⇒ yi β⋅h (x ) = 1 ⎪a = γ ⇒ yi β⋅h (x ) ≤ 1 ⎩ i
(21)
i =1
subjects to yi β⋅h (x ) ≥ 1 − ξi , ξi ≥ 0 , i = 1, 2, …, V . We use the Lagrange dual problem to minimize Eq. (21). And the lagrange function of Eq. (21) is written as V
Eelm (β , ξ, a, u ) =
1 T β β + γ ∑ ξi − 2 i =1
V
Compared with SVM classifier, the essence of ELM is that when the input weights and the hidden layer biases are randomly assigned, the output weights can be computed by the generalized inverse of the hidden layer output matrix [20]. The above description process of ELM classification is suitable to two types of classification problems. Based on this description, for multi-classification problems, the classification task can be implemented by one to one or one to many relationships.
V
∑ ai [yi βH (xi ) − (1 − ξi )] − ∑ ui ξi i =1
i =1
(22) where ai and ui are Lagrange multipliers, and they are negative. According to the partial derivative of β and ξ , namely ∂Eelm /∂β = 0 and V ∂Eelm /∂ξ = 0 , we can derive the dual problems of β = ∑i =1 ai yi H (xi ) and γ = ai + μi , ∀ i . Thus, training ELM classifier is equal to solve the optimization of dual problems, which is defined as
1 min Ep = 2
V
V
6. Experimental results and analysis 6.1. Preprocessing of the LR image
V
∑ ∑ yi yj ai aj H (xi ) H (xj ) − ∑ ai i =1 j =1
(24)
i =1
In test, a HR image called Elaine with the same size of 512×512 pixels and its degenerated version (i.e. the LR image) are used, which are respectively shown in the first row of Fig. 1, at the same time,
(23)
subjects to 0 ≤ ai ≤ γ (i = 1, 2, …, V ). According to the Karush8
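As an aside, the core ELM training rule described above (random hidden-layer weights and biases, output weights obtained through the generalized inverse of the hidden-layer output matrix) can be sketched as follows. This is a minimal illustration under our own names and sizes, not the paper's implementation; the sigmoid is just one admissible activation.

```python
import numpy as np

# Minimal ELM sketch. Hidden weights/biases are random; the output weights beta
# are the minimum-norm least-squares solution beta = pinv(H) @ Y, i.e. the
# generalized inverse of the hidden-layer output matrix applied to the targets.

def elm_train(X, Y, n_hidden=40, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input-to-hidden weights
    b = rng.normal(size=n_hidden)                # random hidden biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))       # sigmoid activation g(.)
    beta = np.linalg.pinv(H) @ Y                 # output weights via pseudo-inverse
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta

# Toy two-class problem with one-hot targets
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(2.0, 1.0, (50, 2)), rng.normal(-2.0, 1.0, (50, 2))])
Y = np.vstack([np.tile([1.0, 0.0], (50, 1)), np.tile([0.0, 1.0], (50, 1))])
W, b, beta = elm_train(X, Y)
pred = elm_predict(X, W, b, beta).argmax(axis=1)  # class index per sample
```

Note that no iterative tuning of the hidden layer is performed; this single pseudo-inverse step is exactly what makes ELM fast compared with back-propagation or SVM training.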
Fig. 7. Some 2-order gradient feature image patches randomly selected from the LR Elaine gradient image patch set. (a)–(d): Some horizontal gradient image patches. (e)–(h): Some vertical gradient image patches.
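For reference, the 1-order and 2-order gradient feature images in the horizontal and vertical directions (X_lg1h, X_lg1v, X_lg2h, X_lg2v, cf. Figs. 5–7) can be obtained with simple finite differences. The paper does not state its exact gradient operators, so the stencils below are an assumption.

```python
import numpy as np

# First/second-order horizontal and vertical gradient images via finite
# differences (border rows/columns are zero-padded). The stencil choice is
# ours; the paper does not specify its operators.

def gradient_features(img):
    img = np.asarray(img, dtype=float)
    g1h = np.zeros_like(img); g1h[:, 1:] = img[:, 1:] - img[:, :-1]                 # 1-order horizontal
    g1v = np.zeros_like(img); g1v[1:, :] = img[1:, :] - img[:-1, :]                 # 1-order vertical
    g2h = np.zeros_like(img); g2h[:, 1:-1] = img[:, 2:] - 2 * img[:, 1:-1] + img[:, :-2]  # 2-order horizontal
    g2v = np.zeros_like(img); g2v[1:-1, :] = img[2:, :] - 2 * img[1:-1, :] + img[:-2, :]  # 2-order vertical
    return g1h, g1v, g2h, g2v

# Toy check on a vertical ramp (each row is constant, rows increase by 1)
img = np.outer(np.arange(8.0), np.ones(8))
g1h, g1v, g2h, g2v = gradient_features(img)
```

On the ramp, only the 1-order vertical gradient is non-zero (constant 1), while both 2-order gradients vanish, as expected for a linear intensity profile.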
At the same time, another LR image, i.e. the degraded version of the man-made image generated by our own code, is shown in the second row of Fig. 1. The artificial LR images are obtained by blurring the corresponding HR image with a point spread function (PSF) filter and a motion filter, down-sampling, and adding Gaussian noise with different variances. In real applications, however, LR images contain much unknown noise. To reduce the influence of this noise and obtain better restoration results, the LR images are first preprocessed with the contourlet transform, a method widely used in image processing. The contourlet transform outperforms the wavelet method and provides a flexible multi-resolution, local and directional approach to image denoising [45,46]. It consists of two stages [45]: a multi-scale analysis stage and a directional analysis stage. The first stage captures point discontinuities: a Laplacian pyramid (LP) decomposes the input image into a low-frequency sub-image and several high-frequency band-pass images. In the second stage, each band-pass image is decomposed into
Fig. 8. Sparse dictionaries with 144 atoms for the LR Elaine image; the noise level is 0.05. (a)–(b): HR and LR dictionaries obtained by our method. (c)–(d): HR and LR dictionaries obtained by K-SVD.
2^k (k = 1, 2, 3, …) wedge-shaped sub-images by the directional filter banks (DFB), and the detail sub-image is then decomposed by the LP for the next loop; this stage links point discontinuities into linear structures. The whole loop can be iterated L_p times, and the number of direction decompositions at each level can differ, which is much more flexible than the three directions of the wavelet transform. The overall result is an image expansion using basic elements such as contour segments; a detailed description of the contourlet transform can be found in [45]. To select the appropriate number of transform layers and of directions in each layer, we used only Elaine's LR images with Gaussian noise levels 0.01, 0.03 and 0.05. In the test, the number of transform layers was set to 2, 3 and 4, and the number of decomposed directions in each layer to 4, 8 and 16. For each LR image, with the layer number fixed, the contourlet transform results for the different direction numbers were obtained. To find the optimal numbers of layers and directions, the signal-to-noise ratio (SNR) criterion was used to measure the quality of the transformed images under the different noise variances; the computed SNR values are shown in Table 1. To demonstrate the preprocessing effect of the contourlet transform, the SNR values of the noisy Elaine images at the different noise levels are also listed in Table 1. From Table 1 it is clear that, at a fixed noise level, the SNR of every image preprocessed by the contourlet transform is distinctly larger than that of the corresponding noisy version, and the smaller the noise level, the larger the SNR, regardless of the numbers of layers and directions. Moreover, with the number of layers fixed and regardless of the noise level, the SNR is largest when the direction number is 4 and smallest when it is 16, although the difference between 8 and 16 directions is relatively small. Likewise, with the number of decomposed directions fixed, for each transform layer the SNR values under the different noise levels are largest with 4 directions and smallest with 16, and the values for 4 and 8 directions per layer are very close. Therefore, in view of these results and of the computational cost, the number of transform layers is set to 3 and the number of decomposed directions in each layer to 4 in the contourlet transform. To illustrate the contourlet transform directly, the low-frequency sub-bands and the high-frequency sub-bands of each layer in each orientation of the LR Elaine image and the LR man-made image are shown in Fig. 2, with 3 layers and 4 directions per layer. Clearly, the low-frequency images contain the majority of the energy of the original image, and its contour is also well retained there, while some image details reside in the high-frequency sub-images. The denoised results obtained by the inverse contourlet transform for the LR Elaine image and the man-made image are shown in Fig. 3(a) and (b), respectively.
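The SNR criterion used in Table 1 (and later in Tables 2 and 3) can be computed as below. The paper does not give its exact formula, so the common definition 10·log10(signal power / error power) is assumed here.

```python
import numpy as np

# SNR between a reference image and its estimate, assumed to be
# 10 * log10(sum(reference^2) / sum((reference - estimate)^2)) in dB.

def snr(reference, estimate):
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    noise = reference - estimate
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(noise ** 2))

# Toy check: a weakly perturbed image scores a higher SNR than a strongly
# perturbed one, mirroring the trend reported in Table 1.
rng = np.random.default_rng(0)
img = rng.random((64, 64))
weak = img + 0.01 * rng.standard_normal(img.shape)
strong = img + 0.2 * rng.standard_normal(img.shape)
```

With this definition, higher values mean the transformed (or restored) image is closer to the reference, which is how the comparisons across noise levels, layers and directions in the text should be read.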
Fig. 9. Sparse dictionaries with 144 atoms for the LR man-made image; the noise level is 0.05. (a)–(b): HR and LR dictionaries obtained by our method. (c)–(d): HR and LR dictionaries obtained by K-SVD.
Compared with the corresponding LR versions shown in Fig. 1, the results in Fig. 3 obtained by the contourlet transform method clearly retain the image contours well. In fact, in the ISR tests, these preprocessed LR images are used as the real input LR images.

6.2. Feature extraction of HR and LR image patches

For any HR and LR image, each image is randomly sampled with fixed patches of p × p pixels to obtain the corresponding HR and LR image patch sets. For HR image patches, the edge features are extracted by the Canny algorithm, and the corresponding edge pixel values and their positions are recorded and saved. Then, using these edge pixel values as the judgment standard, the edge and texture feature sets of the HR image patches can be marked out well; some edge feature image patches of the Elaine and man-made HR images are shown in Fig. 4. Note that the original LR image is first preprocessed by the contourlet transform to remove unknown noise, and four gradient feature images of the preprocessed result, of the first and second order in the horizontal and vertical directions, are then extracted. For the preprocessed LR Elaine image, these four gradient feature images, denoted by X_lg1h, X_lg1v, X_lg2h and X_lg2v, are shown in Fig. 5, together with the real and imaginary texture images of the four gradient images. Similarly, for the man-made image, the 1-order and 2-order gradient images and the corresponding real and imaginary texture images are shown in Fig. 6. Clearly, judging from the visual effect of Figs. 5 and 6, the real images retain the image details well. Each HR image and each LR gradient feature image is then randomly sampled with 8×8 patches overlapping by 3×3 pixels; let X_h denote the HR image patch set and X_l the LR image patch set of the four gradient feature images, which were first preprocessed by the contourlet transform. For the LR gradient image patches, each patch's center value is used as the threshold to retain the maximum feature information: if a patch's center value equals one of the edge pixel values of the LR image patches, the patch is regarded as an edge feature patch; otherwise, it is regarded as a texture feature patch. Thus the edge feature and texture feature patches of the LR image are well separated; some texture patches of the LR Elaine gradient images are shown in Fig. 7. The LR feature image patches are further classified with the ELM classifier, and the classified patch set is used as the input set of the K-SVD model. In this way, image patches with classification information are selected and the training data are clearly reduced; using only these classified samples to learn the sparse dictionaries also reduces the amount of calculation.

Fig. 10. Sparse dictionaries with 256 atoms for the LR Elaine image; the noise level is 0.05. (a)–(b): HR and LR dictionaries obtained by our method. (c)–(d): HR and LR dictionaries obtained by K-SVD.

6.3. Results of ISR

For the classified HR and LR image patch sets, the HR and LR dictionaries can be trained with our modified K-SVD algorithm based on the FSC method. In training, the HR and LR images are randomly sampled 50,000 times with 9×9 patches overlapping by three pixels between adjacent patches, and each patch is converted into one column vector, so the HR and LR image patch sets both have size 81×50,000. To examine the restoration effect and the time consumption, dictionaries with 64, 144 and 256 atoms are discussed. Considering the paper's length, only the HR and LR dictionaries with 144 and 256 atoms, learned by our method and by the common K-SVD model on the corresponding training samples, are given out; they are shown in Figs. 8–11 for the Elaine and man-made images respectively, with the Gaussian noise variance σ set to 0.05 for each LR image. Note that the learned LR dictionary D̂_l combines the cluster center values of the feature image patches classified by the ELM classifier. From Figs. 8–11 it is easy to see that, regardless of the number of atoms, for both the Elaine and man-made images the HR dictionaries obtained by our method exhibit clearer oriented edges and contours than those obtained by the common K-SVD method.

Then, using the learned LR dictionary D̂_l and the LR feature image patch vectors {x_lk}, the LR coefficient vectors {s_lk} are optimized with the cost function of Eq. (12), and each HR image patch is constructed as x_hk = D_h s_lk. Note that the mean values of the HR image patches and the locations of the randomly sampled patches must be taken into account when reconstructing the HR image patches. The image reconstruction effect of the different algorithms is discussed with respect to the number of dictionary atoms (namely 64, 144 and 256) and the noise level σ. For brevity, our method is denoted Method 1, FSC-based K-SVD without feature classification is denoted Method 2, and the common K-SVD model is denoted Method 3. The LR images are produced with a motion filter and zero-mean Gaussian noise by the MATLAB commands "fspecial" and "imnoise". To illustrate the image reconstruction effect of each algorithm for the different numbers of atoms, at noise levels σ of 0.005, 0.01, 0.02, 0.03, 0.05, 0.1, 0.15 and 0.2, the signal-to-noise ratio (SNR) criterion is used to evaluate the quality of the reconstructed images. The SNR values of each algorithm at the different noise levels and numbers of atoms are listed in Table 2 and Table 3; to compare the change of SNR values, those of the LR images are also shown there.
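The per-patch reconstruction step just described (sparse-code each LR patch against the LR dictionary, then form x_hk = D_h s_lk) can be sketched as follows. The FSC optimization of Eq. (12) is not reproduced here; a basic orthogonal matching pursuit stands in for the sparse-coding step, and the dictionaries are random placeholders rather than trained K-SVD atoms.

```python
import numpy as np

# Coupled-dictionary reconstruction sketch: code the LR patch against D_l,
# then map the same sparse coefficients through D_h. All contents are
# illustrative placeholders, not the paper's trained dictionaries.

def omp(D, x, n_nonzero=5):
    """Greedy sparse coding of x against dictionary D (unit-norm columns)."""
    residual, support = x.copy(), []
    for _ in range(n_nonzero):
        support.append(int(np.argmax(np.abs(D.T @ residual))))  # best new atom
        sub = D[:, support]
        coef, *_ = np.linalg.lstsq(sub, x, rcond=None)          # refit on support
        residual = x - sub @ coef
    s = np.zeros(D.shape[1])
    s[support] = coef
    return s

rng = np.random.default_rng(0)
n_pix, n_atoms = 64, 144                      # 8x8 patches, 144 atoms
D_l = rng.standard_normal((n_pix, n_atoms))   # LR dictionary (placeholder)
D_l /= np.linalg.norm(D_l, axis=0)
D_h = rng.standard_normal((n_pix, n_atoms))   # HR dictionary (placeholder)

x_l = rng.standard_normal(n_pix)              # one mean-removed LR patch
s_l = omp(D_l, x_l, n_nonzero=5)              # sparse code w.r.t. D_l
x_h = D_h @ s_l                               # HR patch estimate (add mean back)
```

In the full pipeline the patch mean is re-added and each x_h is placed back at its sampled location, with overlapping pixels averaged, which is why the text stresses tracking patch means and locations.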
Fig. 11. Sparse dictionaries with 256 atoms for the LR man-made image; the noise level is 0.05. (a)–(b): HR and LR dictionaries obtained by our method. (c)–(d): HR and LR dictionaries obtained by K-SVD.
Table 2
SNR values of restored Elaine images using different algorithms with different atoms and noise levels (Method 1: our method; Method 2: FSC-based K-SVD algorithm; Method 3: common K-SVD algorithm).

Noise level   64 atoms (M1 / M2 / M3)    144 atoms (M1 / M2 / M3)   256 atoms (M1 / M2 / M3)   Noisy image
0.005         12.61 / 11.71 / 9.32       11.43 / 11.23 / 10.47      9.75 / 9.62 / 9.42         5.21
0.01          12.41 / 11.61 / 9.17       11.31 / 11.05 / 10.38      9.49 / 9.39 / 9.24         5.20
0.02          11.78 / 11.22 / 10.97      10.65 / 10.42 / 9.75       9.19 / 8.73 / 8.46         5.08
0.03          10.91 / 10.80 / 9.86       10.17 / 9.38 / 9.07        8.75 / 8.21 / 8.17         4.91
0.05          8.92 / 7.82 / 8.53         8.47 / 8.39 / 8.12         7.80 / 7.55 / 7.32         4.33
0.1           4.59 / 4.18 / 3.62         4.44 / 4.21 / 3.86         4.07 / 3.93 / 3.82         2.38
0.15          1.69 / 1.44 / 1.38         1.54 / 1.47 / 1.31         1.27 / 1.22 / 1.05         0.34
0.2           <0 / <0 / <0               <0 / <0 / <0               <0 / <0 / <0               <0
According to the test data, for each algorithm, under the different numbers of atoms and noise levels, the larger the noise level is, the smaller the SNR value becomes. For a given noise level and number of atoms, the SNR values of Method 1 are clearly larger than those of Method 2 and Method 3. Further, for a given noise level and method, the SNR values obtained with 64 atoms are somewhat larger than those obtained with 144 atoms, and those obtained with 256 atoms are the smallest. Moreover, for each LR image, when the noise level is less than 0.05, the SNR values of the restored images are evidently larger than those of the corresponding noisy images, and for every algorithm the SNR values for 64 and 144 atoms are very close, as well as larger than those for 256 atoms. When the noise level lies between 0.1 and 0.2, the SNR values obtained by our algorithm are still clearly larger than those obtained by the other two algorithms. Furthermore, compared with Method 2 and Method 3, when the number of atoms is 64, the SNR values obtained by our algorithm (i.e. Method 1) are distinctly the largest under the same noise level.
Table 3
SNR values of restored man-made images using different algorithms with different atoms and noise levels (Method 1: our method; Method 2: FSC-based K-SVD algorithm; Method 3: common K-SVD algorithm).

Noise level   64 atoms (M1 / M2 / M3)    144 atoms (M1 / M2 / M3)   256 atoms (M1 / M2 / M3)   Noisy image
0.005         9.47 / 8.76 / 8.27         8.92 / 8.80 / 8.57         8.42 / 8.31 / 8.22         8.13
0.01          9.22 / 8.54 / 8.07         8.70 / 8.55 / 8.36         8.21 / 8.17 / 8.12         7.91
0.02          8.79 / 8.04 / 7.82         8.27 / 8.12 / 7.66         7.63 / 7.56 / 7.52         7.34
0.03          8.52 / 7.56 / 7.63         7.66 / 7.47 / 6.82         6.79 / 6.97 / 6.95         6.64
0.05          7.14 / 6.49 / 6.17         5.95 / 6.48 / 5.73         5.84 / 5.62 / 5.53         5.45
0.1           4.45 / 3.79 / 3.64         3.91 / 3.74 / 3.48         3.35 / 3.48 / 3.27         2.79
0.15          2.04 / 1.77 / 0.93         1.52 / 1.36 / 1.38         0.95 / 1.61 / 1.26         0.58
0.2           <0 / <0 / <0               <0 / <0 / <0               <0 / <0 / <0               <0
Fig. 12. Restored results by different methods using 64 atoms, where Method 1 is our method, Method 2 is the FSC-based K-SVD method without feature classification, and Method 3 is the K-SVD method. (a): Three LR Elaine images. (b)–(d): Restored results obtained by Method 1, Method 2 and Method 3, respectively.
Thus, the optimal number of atoms can be taken as 64 for the image reconstruction task. Considering the limitation of the paper's length, for each LR image only the restored images under the 0.01, 0.05 and 0.1 noise levels are given here, shown in Fig. 12 and Fig. 13 respectively. It is clear that, for a fixed algorithm, when the noise level is very small it is difficult to tell the restored results from the original versions, and when the noise level is very large our method can still restore the image edges and details well; its visual effect is also the best among the three methods.

7. Conclusions

To improve the detail and structure information of reconstructed
Fig. 13. Restored results by different methods using 64 atoms, where Method 1 is our method, Method 2 is the FSC-based K-SVD method without feature classification, and Method 3 is the K-SVD method. (a): Three LR man-made images. (b)–(d): Restored results obtained by Method 1, Method 2 and Method 3, respectively.
images, a new ISR method based on the FSC-based K-SVD model is discussed in this paper. In this method, the edge features of HR image patches and the classified texture features of LR image patches are both considered. The edge features of the HR image patches are extracted by the Canny algorithm; at the same time, the pixel positions of the corresponding edge features are marked out and used as those of the LR image patches in the image reconstruction task. A LR image is first preprocessed by the contourlet transform to remove unknown background noise, and the four gradient images of the 1-order and 2-order in the horizontal and vertical directions of the preprocessed result are then extracted and used as LR images. The texture features of the four LR gradient images are further classified by the ELM classifier. For the HR image patches and the classified feature image patches of the LR images, the HR and LR dictionaries are learned with the FSC-based K-SVD model, where FSC implements the sparse optimization of the feature coefficients faster than the MP, OMP and ROMP methods. Using the learned HR dictionary and LR sparse coefficients, together with the classification labels shared between the HR and LR image patches and the mean values of the HR image patches, the HR image patches can be restored; then, considering the sampled locations of the HR image patches, the LR image can be reconstructed well. Simulation results show that, compared with the common K-SVD and FSC methods, our method indeed yields better image structures and visual effect.

Acknowledgments
This work was supported by the National Natural Science Foundation of China (Grant Nos. 61373098 and 61370109), the Natural Science Foundation of Jiangsu Province for Young Scholars (No. BK20160361), the Natural Science Foundation of Anhui Province (No. 1308085MF85), and the Jiangsu Colleges and Universities Outstanding Scientific and Technological Innovation Team (2015).

References

[1] W.S. Dong, G.M. Shi, L. Zhang, X.L. Wu, Super-resolution with nonlocal regularized sparse representation, in: Proceedings of the 2010 SPIE Conference on Visual Communication and Image Processing, Huangshan, China, IEEE Press, Aug. 22–24, 2010, pp. 1–10.
[2] X. Lu, Y. Yuan, P. Yan, Sparse coding for image denoising using spike and slab prior, Neurocomputing 101 (3) (2013) 94–103.
[3] S. Yang, M. Wang, Y. Chen, Single-image super-resolution reconstruction via learned geometric dictionaries and clustered sparse coding, IEEE Trans. Image Process. 21 (9) (2012) 4016–4028.
[4] R. Giryes, A greedy algorithm for the analysis transform domain, Neurocomputing 173 (2013) 278–289.
[5] W. Zeng, X. Lu, S. Fei, Image super-resolution employing a spatial adaptive prior model, Neurocomputing 162 (2015) 218–233.
[6] W. Dong, D. Zhang, G. Shi, Image deblurring and super-resolution by adaptive sparse domain selection and adaptive regularization, IEEE Trans. Image Process. 20 (7) (2011) 1838–1857.
[7] R. Zeyde, M. Elad, M. Protter, On single image scale-up using sparse-representations, in: Proceedings of the 2010 International Conference on Curves and Surfaces, Springer, Heidelberg, 2010, pp. 711–730.
[8] C. Wang, P. Xue, W. Lin, Improved super-resolution reconstruction from video, IEEE Trans. Circuits Syst. Video Technol. 16 (11) (2006) 1411–1422.
[9] H. Li, Z. Yu, C. Mao, Fractional differential and variational method for image fusion and super-resolution, Neurocomputing 171 (2015) 138–148.
[10] Y.C. Eldar, H. Bolcskei, Block-sparsity: coherence and efficient recovery, IEEE Trans. Signal Process. 58 (6) (2010) 3042–3054.
[11] S.Q. Wang, J.H. Zhang, Fast image inpainting using exponential threshold POCS plus conjugate gradient, J. Imaging Sci. 62 (3) (2014) 161–170.
[12] W. Zhou, M. Fei, H. Zhou, K. Li, A sparse representation based fast detection method for surface defect detection of bottle caps, Neurocomputing 123 (2014) 406–414.
[13] M. Volpi, G. Matasci, M. Kanevski, Semi-supervised multiview embedding for hyperspectral data classification, Neurocomputing 145 (18) (2014) 427–437.
[14] N.M. Khan, M. Kyan, L. Guan, Intuitive volume exploration through spherical self-organizing map and color harmonization, Neurocomputing 147 (2015) 160–173.
[15] L. Xu, J. Jiang, Zh.Zh. Wei, Multiframe super-resolution reconstruction algorithm based on total variation regularization, Electron. Meas. Technol. 35 (1) (2012) 76–79.
[16] V. Malathi, N.S. Marimuthu, S. Baskar, Intelligent approaches using support vector machine and extreme learning machine for transmission line protection, Neurocomputing 73 (10–12) (2010) 2160–2167.
[17] Ri-sheng Liu, Zhou-chen Lin, W. Zhang, Learning PDEs for image restoration via optimal control, in: K. Daniilidis, P. Maragos, N. Paragios (Eds.), ECCV 2010, Part I, LNCS 6311, Springer-Verlag, Berlin Heidelberg, 2010, pp. 115–128.
[18] Jian-chao Yang, J. Wright, T. Huang, Image super-resolution via sparse representation, IEEE Trans.
Image Process. 19 (2010) 2861–2873.
[19] H. Liu, S. Li, Decision fusion of sparse representation and support vector machine for SAR image target recognition, Neurocomputing 113 (7) (2013) 97–104.
[20] D.S. Huang, H.S. Horace, Zheru Chi, H.S. Wong, Dilation method for finding close roots of polynomials based on constrained learning neural networks, Phys. Lett. A 309 (5–6) (2001) 443–451.
[21] D.S. Huang, The local minima free condition of feedforward neural networks for outer-supervised learning, IEEE Trans. Syst. Man Cybern. B 28 (3) (1998) 477–480.
[22] Bo He, Dongxun Xu, Rui Nian, Mark van Heeswijk, Qi Yu, Yoan Miche, Amaury Lendasse, Fast face recognition via sparse coding and extreme learning machine, Cogn. Comput. 6 (2) (2013) 264–277.
[23] S.A. Saffari, A. Ebrahimi-Moghadam, Label propagation based on local information with adaptive determination of number and degree of neighbor's similarity, Neurocomputing 153 (1) (2015) 41–53.
[24] J. Mairal, F. Bach, J. Ponce, Online learning for matrix factorization and sparse coding, J. Mach. Learn. Res. 11 (2010) 19–60.
[25] D.S. Huang, Ji-Xiang Du, A constructive hybrid structure optimization methodology for radial basis probabilistic neural networks, IEEE Trans. Neural Netw. 19 (12) (2008) 2099–2115.
[26] J. Yang, Z. Wang, Z. Lin, Coupled dictionary training for image super-resolution, IEEE Trans. Image Process. 21 (8) (2012) 3467–3478.
[27] Y. Wang, W. Yin, Sparse signal reconstruction via iterative support detection, SIAM J. Imaging Sci. 3 (3) (2010) 462–491.
[28] Li Shang, Yan Zhou, Tao Liu, Zhan-li Sun, Super-resolution restoration of MMW image using sparse representation based on couple dictionaries, Commun. Comput. Inf. Sci. (CCIS) 304 (2012) 286–291.
[29] Sheng-peng Liu, Yong Fang, A contourlet-transform based sparse ICA algorithm for blind image separation, J. Shanghai Univ. 11 (2) (2007) 464–468.
[30] D.S. Huang, H.S.
Horace, Zheru Chi, A neural root finder of polynomials based on root moments, Neural Comput. 16 (8) (2004) 1721–1762.
[31] X. Xu, Y. Zhong, L. Zhang, A sub-pixel mapping method based on an attraction model for multiple shifted remotely sensed images, Neurocomputing 134 (9) (2014) 79–91.
[32] X. Li, H. He, Z. Yin, F. Chen, J. Cheng, Single image super-resolution via subspace projection and neighbor embedding, Neurocomputing 139 (2014) 310–320.
[33] D.S. Huang, A constructive approach for finding arbitrary roots of polynomials by neural networks, IEEE Trans. Neural Netw. 15 (2) (2004) 477–491.
[34] L. Kai, E. Barth, T. Martinetz, Soft-competitive learning of sparse codes and its application to image reconstruction, Neurocomputing 74 (9) (2011) 1418–1428.
[35] Y. Song, W. Cao, Y. Shen, G. Yang, Compressed sensing image reconstruction using intra prediction, Neurocomputing 151 (2015) 1171–1179.
[36] H. Zhang, Y. Zhang, T.S. Huang, Efficient sparse representation based image super resolution via dual dictionary learning, in: Proceedings of the IEEE International Conference on Multimedia and Expo (ICME), 2011, pp. 1–6.
[37] J.C. Yang, J. Wright, Y. Ma, T. Huang, Image super-resolution as sparse representation of raw image patches, in: Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Anchorage, USA, IEEE Press, 2008, pp. 1–8.
[38] J. Sun, Z.B. Xu, H.Y. Shum, Gradient profile prior and its applications in image super-resolution and enhancement, IEEE Trans. Image Process. 20 (6) (2011) 1529–1542.
[39] H. Yoloyama, O. Watanabe, Emergence of sparse representation by unsupervised learning with kernel functions, Neurocomputing 112 (2013) 61–66.
[40] Chun-Hou Zheng, Lei Zhang, Vincent To-Yee Ng, Simon Chi-Keung Shiu, D.S. Huang, Metasample-based sparse representation for tumor classification, IEEE/ACM Trans. Comput. Biol. Bioinform. 8 (5) (2011) 1273–1282.
[41] S. Yang, Z. Liu, M. Wang, F. Sun, L. Jiao, Multitask dictionary learning and sparse representation based single-image super-resolution reconstruction, Neurocomputing 74 (17) (2011) 3193–3203.
[42] X. Song, Z. Liu, X. Yang, A parameterized fuzzy adaptive K-SVD approach for the multi-classes study of pursuit algorithms, Neurocomputing 123 (2014) 131–139.
[43] T. Tony Cai, L. Wang, Orthogonal matching pursuit for sparse signal recovery with noise, IEEE Trans. Inf. Theory 57 (7) (2011) 4680–4688.
[44] Li Shang, Denoising natural images based on a modified sparse coding algorithm, Appl. Math. Comput. 205 (2) (2008) 883–889.
[45] J. Li, C.Y. Lu, A new decision rule for sparse representation based classification for face recognition, Neurocomputing 116 (10) (2013) 265–271.
[46] M. Elad, M. Aharon, Image denoising via sparse and redundant representations over learned dictionaries, IEEE Trans. Image Process. 15 (12) (2006) 3736–3745.
[47] H. Wang, R. Zhao, Y. Cen, Rank adaptive atomic decomposition for low-rank matrix completion and its application on image recovery, Neurocomputing 145 (18) (2014) 374–380.
[48] D.S. Huang, Application of generalized radial basis function networks to recognition of radar targets, Int. J. Pattern Recognit. Artif. Intell. 13 (6) (1999) 945–962.
Li Shang received the B.Sc. and M.Sc. degrees from Xi'an Mine University in June 1996 and June 1999, respectively, and the Doctor's degree in Pattern Recognition & Intelligent Systems from the University of Science and Technology of China (USTC), Hefei, China, in June 2006. From July 1999 to July 2006, she worked at USTC, devoting herself to teaching. She now works at the College of Electronic Information Engineering, Suzhou Vocational University. Her research interests include Image Processing, Artificial Neural Networks and Intelligent Computing.
Shufen Liu received the B.Sc. degree from Xidian University in June 1990, and the M.Sc. degree from the University of Science & Technology Beijing (USTB), Beijing, China, in June 2004. Since February 2006, she has worked at the College of Electronic Information Engineering, Suzhou Vocational University. Her research interests include Image Processing, Artificial Neural Networks and Intelligent Computing.
Yan Zhou received the B.Sc. and M.Sc. degrees from the China University of Geosciences in June 2003 and June 2006, respectively. She is currently pursuing the Ph.D. degree at the School of Electronics and Information Engineering, Soochow University, in the area of signal and information processing. Since July 2006, she has worked at the Department of Electronic Information Engineering, Suzhou Vocational University. Her research interests include Speech Signal Processing, Artificial Neural Networks and Intelligent Computing.
Zhan-Li Sun received the Ph.D. degree from the University of Science and Technology of China, in 2005. Since 2006, he has worked with The Hong Kong Polytechnic University, Nanyang Technological University, and National University of Singapore. He is currently a Professor with School of Electrical Engineering and Automation, Anhui University, China. His research interests include machine learning, and image and signal processing.