Multi-weighted nuclear norm minimization for real world image denoising


Optik - International Journal for Light and Electron Optics 206 (2020) 164214


Optik journal homepage: www.elsevier.com/locate/ijleo

Original research article


Xue Guo a,b,*, Feng Liu a,b, Jie Yao b, Yiting Chen a,b, Xuetao Tian a,b

a Engineering Research Center of Network Management Technology for High Speed Railway Ministry of Education, School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China
b School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China

ARTICLE INFO

Keywords: Low-rank; Non-local self-similarity; Sparsity; Color image denoising

ABSTRACT

The noise in real world images is much more complicated than the additive white Gaussian noise that most existing denoising methods are designed for, and methods aimed at additive white Gaussian noise perform unsatisfactorily on real world images. A major feature of noise in real world images is that the noise level varies from region to region. Based on this characteristic, we propose a denoising model named multi-weighted nuclear norm minimization. The objective function consists of two parts: a data fidelity term and a regularization term. The regularization term is a weighted nuclear norm. We apply two weight matrices to the data fidelity term to balance the data between channels and between regions, respectively. Since the objective function has no analytical solution, we use the alternating direction method of multipliers to decompose it into sub-problems with analytical solutions. Experiments demonstrate the effectiveness of the proposed model.

⁎ Corresponding author at: Engineering Research Center of Network Management Technology for High Speed Railway Ministry of Education, School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China. E-mail address: [email protected] (X. Guo).

https://doi.org/10.1016/j.ijleo.2020.164214 Received 31 October 2019; Received in revised form 13 January 2020; Accepted 13 January 2020 0030-4026/ © 2020 Elsevier GmbH. All rights reserved.

1. Introduction

Due to the particle nature of light and the limitations of sensing devices, images are inevitably contaminated by noise during acquisition. On the one hand, noise can adversely affect subsequent processing of images, such as segmentation, classification, and image analysis [2]. On the other hand, images in some specific fields, such as medicine [3] and remote sensing [2], carry a lot of noise. These practical applications make image denoising necessary. Most existing denoising methods assume that the noise follows a certain probability distribution, such as additive white Gaussian noise (AWGN) [4–6,1,7–14], and design a specific method to remove it. However, the noise in real images is complex, often signal-dependent, and cannot be simulated simply by an additive noise model or a superposition of such models. Therefore, real image denoising deserves more attention.

Many denoising methods have been proposed in the past few decades. They can be roughly divided into the following categories: filter methods [4,5,15,16], sparsity based methods [6,1,7,17], statistics based methods [8], low rank based methods [9,18,10,11,19] and deep learning methods [12–14,20–28]. Most existing models are designed to remove AWGN, and their performance degrades on real images. For example, deep learning methods have set new baselines for AWGN removal in the past two years. However, because they depend on their training data, these trained networks generalize poorly to real images. Model-based denoising methods generalize better than deep learning methods. Therefore, more suitable denoising models need to be designed for real images.

Noise in a real world image cannot be described by a single homogeneous probability distribution. JPEG compression can make the noise distribution in a real image uneven and patch-dependent [29]. Based on this characteristic, we propose a multi-weighted nuclear norm minimization method to remove noise in real world images. MCWNNM [10] has achieved excellent denoising results on real world images. However, MCWNNM does not consider the difference in noise between patches. To explore the effect of regional differences in noise on denoising, we introduce a weight matrix to balance the difference in noise levels between patches. We retain the weight matrix of MCWNNM to balance the noise levels between the R, G, B channels, but its values are different in our model. The proposed model has no analytical solution. We use the alternating direction method of multipliers [30] framework to decompose the problem into several subproblems with analytical solutions.

The structure of this paper is as follows. Section 2 introduces related work, including the application of low rank matrix approximation methods to image denoising and real world image denoising. Section 3 presents the establishment and solution of the proposed multi-weighted nuclear norm minimization model. Section 4 introduces three real image datasets and compares the denoising performance of the proposed method with some state-of-the-art methods on them. Section 5 summarizes our work.

2. Related work

This section describes the work related to our research. Because they characterize data sparseness well, low-rank models are widely used in image analysis [31], signal processing [32–34], recommendation [35,36], and many other fields. In this paper, we focus on the application of low-rank models to real world image denoising.
The related work is introduced in two parts: low rank matrix approximation methods for image denoising and real world image denoising.

2.1. Low rank matrix approximation methods for image denoising

The low-rank matrix approximation denoising model, which can be seen as a two-dimensional sparsity model, exploits an important property of images: non-local self-similarity (NSS). Sparsity methods assume that the local patterns in a natural image are finite, so the image can be represented by a fixed set of elements. These elements do not encode random noise, so representing a noisy image with them suppresses the noise. The goal of a sparsity method is to represent the image with as few elements as possible. NSS means that a natural image contains many similar local patterns. Under NSS, the matrix formed by stacking similar patches of an image is approximately low rank, which gives low rank matrix approximation an intuitive interpretation in image denoising. SAIST [37] was the first to consider the relationship between sparsity and low rank matrix approximation, and explored the application of two-dimensional sparsity to image denoising. One of the most commonly used low rank matrix approximation models for denoising is the nuclear norm minimization (NNM) model [38]. NNM was first applied to video denoising. Later, exploiting prior knowledge about the singular values of the matrix composed of similar patches, Gu et al. [9] added weights to the regularization term of NNM and proposed weighted nuclear norm minimization (WNNM). In the regularization term of WNNM, the larger a singular value is, the smaller its weight. That is, WNNM retains more of the information associated with large singular values when denoising. Gu et al. proved the existence of the theoretical solution of WNNM and successfully applied it to gray image denoising. The WNNM denoising model is

\[ \min_X \frac{1}{\sigma^2} \|Y - X\|_F^2 + \|X\|_{w,*}, \tag{1} \]

where X is the matrix composed of similar patches in the denoised image we are looking for, Y is the matrix composed of similar patches in the noisy image, $\|X\|_F = \sqrt{\sum_i \sum_j |x_{ij}|^2}$ is the Frobenius norm, σ is the standard deviation of the AWGN added to the image, $\|X\|_{w,*} = \sum_i |w_i \sigma_i(X)|$ is the weighted nuclear norm of the matrix X, $w = [w_1, w_2, \ldots, w_n]^T$ ($w_i > 0$) is the weight vector, and $\sigma_i(X)$ is the i-th singular value of X. After WNNM, many low rank matrix approximation methods have been proposed and applied to image denoising [39–41,10]. Notably, MCWNNM generalizes WNNM to color image denoising by taking the difference in noise levels between the R, G, B channels into account. The MCWNNM model is formulated as

\[ \min_X \|W (Y - X)\|_F^2 + \|X\|_{w,*}, \tag{2} \]

where

\[ W = \begin{pmatrix} \sigma_r^{-1} I & 0 & 0 \\ 0 & \sigma_g^{-1} I & 0 \\ 0 & 0 & \sigma_b^{-1} I \end{pmatrix} \tag{3} \]

is the weight matrix characterizing the noise level of the different channels, I is the identity matrix, and σr, σg, σb are the standard deviations of the AWGN added to the R, G, B channels, respectively. In line with the characteristics of real noise, the denoising model proposed in this paper also considers the difference in noise levels between patches. Based on MCWNNM, we introduce another weight matrix to balance the difference in noise between different patches.

2.2. Real world image denoising

Research on AWGN denoising has existed for decades, and a variety of methods have been developed. In the past decade, real world image denoising has gradually attracted attention. Unlike AWGN, which is artificially added to the image, the noise level of a real world image is unknown. We divide real image denoising methods into two categories according to whether they require the noise level as input.

Methods that take the noise level as input rely on a separate algorithm to estimate it. The algorithms proposed by Liu et al. [42], Chen et al. [43], and Stanislav et al. [44] are commonly used for noise estimation. In this paper, we use the algorithm of Liu et al. [42] to estimate the noise level of each channel of a real world image. LSSC [45] was the first to combine sparse coding with NSS in image denoising. TWSC [46] introduced three matrices into sparse coding for real world image denoising. These three matrices characterize the statistics of realistic noise and image priors. Based on WNNM, MCWNNM introduced a weight matrix to balance the noise of the R, G, B channels.

We divide the methods that do not require noise levels as input into two categories: algorithms that unify noise estimation and denoising, and deep learning methods. Methods in the first category generally consist of two parts: noise estimation and denoising. The denoising method proposed by Liu et al. [47] is inseparable from image segmentation. They estimated the noise level from the brightness of the image and used the per-segment variance as the upper bound of the noise level. Then, they used a Gaussian conditional random field to denoise each segment. Lebrun et al. [48] proposed a two-step blind denoising method: the first step uses a Signal and Frequency Dependent model to estimate noise, and the second step uses a multiscale adaptation of the Non-local Bayes denoising method to denoise. Xu et al. [49] proposed a blind image denoising method under the Bayesian learning framework. They modeled the patch group variations as a mixture of Gaussians and inferred all of their parameters by a variational Bayesian method. Similarly, LR-MoG [50] used a mixture of Gaussians to estimate noise, but it used a novel low-rank mixture of Gaussians filter to remove noise. Nam et al. [29] trained a multi-layer perceptron network to estimate noise and combined an existing Bayesian non-local means method with it for denoising.

Some researchers apply deep learning to real world image denoising because of its powerful fitting ability. CBDNet [51] is the first network to use deep learning for real world image denoising. The training set of CBDNet is divided into two parts: a real image dataset and an artificially generated dataset. Since real image data was insufficient, the authors generated part of the dataset, using signal-related noise plus Gaussian noise to simulate real noise. It is worth mentioning that SIDD [52], a dataset with about 30,000 pairs of real images (noisy and ground truth), provides an opportunity for the development of deep learning in real image denoising. Using the same training set strategy as CBDNet, RIDNet [53] used an attention mechanism to balance the correlation of data between channels. Yu et al. [54] argue that different regions of an image differ in how difficult they are to restore. They trained a "pathfinder" with reinforcement learning, which is used to find a suitable denoising path in a multi-path CNN. The difficulty of obtaining ground truth for real images leads to a lack of paired datasets.
The contribution of some researchers to real image denoising lies in generating ground truth images in their own way. There are two main methods for generating ground truth for real images. The first is to take a picture with a low resolution as the ground truth when the other conditions are the same [55]. The second is to take multiple photos with different camera settings and lighting conditions and take their average as ground truth [29,56]. In this paper, the datasets generated by Xu et al. [56] and Nam et al. [29] are used as test sets for experiments.

3. Multi-weighted nuclear norm minimization for real world image denoising

In this section, we introduce the proposed multi-weighted nuclear norm minimization model for real image denoising in detail. This section contains three parts: the definition of the proposed model, the calculation of the weight matrices, and the solution of the optimization problem.

3.1. The multi-weighted nuclear norm minimization model

We introduce some symbols to facilitate the representation of our proposed model. Given a local patch of size p × p in the image, we select the M − 1 patches most similar to it from a fixed-size surrounding area, giving M patches in total including the original patch. We use the Euclidean distance to measure the similarity of two patches: the smaller the distance between two patches, the higher their similarity. The pixel values of each patch are arranged into a vector $v_{cm} \in \mathbb{R}^{p^2}$, where $v \in \{x, y, n\}$ stands for the clean image, the noisy image, and the noise, respectively, $c \in \{r, g, b\}$ is the index of the R, G, B channels, and $m \in \{1, 2, \ldots, M\}$ is the patch index. The M vectors of each channel are arranged into a matrix $V_c = [v_{c1}\ v_{c2}\ \cdots\ v_{cM}] \in \mathbb{R}^{p^2 \times M}$. The matrices of the R, G, B channels are combined into a $3p^2 \times M$ matrix $V = [V_r^T\ V_g^T\ V_b^T]^T$.
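The patch grouping described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `group_similar_patches`, the exhaustive search over the window, and the channel-major column layout are assumptions made for this example; the default values of `p`, `window` and `M` follow the settings reported in Section 4.3.

```python
import numpy as np

def group_similar_patches(img, top, left, p=7, window=40, M=70):
    """Stack the M patches most similar to the reference patch at (top, left).

    img is an H x W x 3 float array. The result is a (3*p*p, M) matrix whose
    columns are vectorised patches, with the R, G and B blocks stacked vertically.
    (Hypothetical helper; names and layout are assumptions for illustration.)
    """
    H, W, _ = img.shape
    ref = img[top:top + p, left:left + p, :]
    half = window // 2
    # Candidate top-left corners inside the search window, clipped to the image.
    rows = range(max(0, top - half), min(H - p, top + half) + 1)
    cols = range(max(0, left - half), min(W - p, left + half) + 1)
    candidates = []
    for r in rows:
        for c in cols:
            patch = img[r:r + p, c:c + p, :]
            dist = float(np.sum((patch - ref) ** 2))  # squared Euclidean distance
            candidates.append((dist, r, c))
    # Smallest distance first; the reference patch has distance 0.
    candidates.sort(key=lambda t: t[0])
    columns = []
    for _, r, c in candidates[:M]:
        patch = img[r:r + p, c:c + p, :]
        # Channel-major flattening: all R pixels, then all G, then all B.
        columns.append(patch.transpose(2, 0, 1).reshape(-1))
    return np.stack(columns, axis=1)
```

Applied to a noisy image, the returned matrix plays the role of Y in the model. Sorting by squared distance gives the same ranking as sorting by Euclidean distance.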
Then we use Y = X + N to represent the matrix of pixel values of the M similar patches in the noisy image, where X is the corresponding matrix for the clean image and N is the matrix of noise values of the same M patches. The noise levels in real images vary not only between channels but also between patches, due to JPEG compression [29]. Intuitively, JPEG compresses a flat patch and a textured patch to significantly different degrees, so the noise levels of different patches differ after compression. MCWNNM exploited the difference in noise levels between the R, G, B channels. To get closer to real noise, the variation of the noise level with the patch also needs to be considered. The denoising model in this paper considers the differences in noise levels both between the R, G, B channels and between patches. Using the symbols defined above, the model proposed in this paper is as follows:

\[ \min_X \|W_1 (Y - X) W_2\|_F^2 + \|X\|_{w,*}, \tag{4} \]

where $W_1$ is a weight matrix that characterizes the noise in the R, G, B channels, similar to the matrix (3) in MCWNNM but with different values, and $W_2$ is the weight matrix introduced in this paper to describe the difference in noise levels between patches. The values of $W_1$ and $W_2$ are given in Section 3.2. Following WNNM [9], we set the weights of the weighted nuclear norm $\|X\|_{w,*}$ to $w_i = c\sqrt{M}/(\sigma_i(X) + \epsilon)$, where $c > 0$ and $\epsilon = 10^{-16}$ are constants and M is the number of similar patches in X. The norm $\|X\|_{w,*}$ is defined below formula (1).

3.2. The definition of weighted matrices W1 and W2

We use maximum a posteriori (MAP) estimation to determine the values of the two weight matrices $W_1$ and $W_2$. $W_1$ reflects the difference in noise level between the R, G, B channels. $W_2$ represents the difference in noise level between patches. According to the MAP criterion, we have:

\[ \hat{X} = \arg\max_X \log P(X \mid Y) = \arg\max_X \{\log P(Y \mid X) + \log P(X)\}. \tag{5} \]

According to Leung et al. [57], we assume that the noise in each channel and in each patch is independent and identically distributed Gaussian. Then, we have $n_m \sim \mathcal{N}(0, \sigma_m)$, $m \in \{1, 2, \ldots, M\}$, and $n_c \sim \mathcal{N}(0, \sigma_c)$, $c \in \{r, g, b\}$, where $n_c$ is the noise of channel c and $n_m$ is the noise of the m-th patch. According to the log-linear model [58], we set $\sigma_{cm} = \sigma_c^{1/2} \sigma_m^{1/2}$, where $\sigma_{cm}$ is the standard deviation of the m-th patch in channel c. Thus, we have

\[ n_{cm} = y_{cm} - x_{cm} \sim \mathcal{N}(0, \sigma_{cm}), \quad c \in \{r, g, b\}, \quad m \in \{1, 2, \ldots, M\}, \tag{6} \]

where $n_{cm}$ is the noise of the m-th patch in channel c, and $y_{cm}$, $x_{cm}$ are the m-th patches of channel c in the noisy and clean image, respectively. According to (6), we get $y_{cm} \mid x_{cm} \sim \mathcal{N}(x_{cm}, \sigma_{cm})$. Therefore, the probability density function of $Y \mid X$ is as follows:

\[ P(Y \mid X) = \prod_{c \in \{r,g,b\}} \prod_{m=1}^{M} (2\pi\sigma_{cm}^2)^{-3p^2 M/2} \exp\left(-\frac{\|y_{cm} - x_{cm}\|_2^2}{2\sigma_{cm}^2}\right). \tag{7} \]

Following Xu et al. [10], the prior distribution of the clean image is determined by the weighted nuclear norm through the following relationship:

\[ P(X) \propto \exp\left(-\frac{1}{2}\|X\|_{w,*}\right). \tag{8} \]

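Putting (7) and (8) into (5), we have

\[
\begin{aligned}
\hat{X} &= \arg\min_X \sum_{c \in \{r,g,b\}} \sum_{m=1}^{M} \sigma_{cm}^{-2} \|y_{cm} - x_{cm}\|_2^2 + \|X\|_{w,*} \\
&= \arg\min_X \sum_{c \in \{r,g,b\}} \sigma_c^{-1} \|(Y_c - X_c) W_2\|_F^2 + \|X\|_{w,*} \\
&= \arg\min_X \|W_1 (Y - X) W_2\|_F^2 + \|X\|_{w,*},
\end{aligned}
\tag{9}
\]

where

\[
W_1 = \begin{pmatrix} \sigma_r^{-1/2} I & 0 & 0 \\ 0 & \sigma_g^{-1/2} I & 0 \\ 0 & 0 & \sigma_b^{-1/2} I \end{pmatrix}_{3p^2 \times 3p^2}, \quad
W_2 = \begin{pmatrix} \sigma_1^{-1/2} & & 0 \\ & \ddots & \\ 0 & & \sigma_M^{-1/2} \end{pmatrix}_{M \times M}.
\]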
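The equivalence between the per-patch sum and the matrix form of the data fidelity term in (9) can be checked numerically. The sketch below is illustrative only: the helper names are hypothetical, the rows of Y and X are assumed to be ordered as R block, G block, B block (p² rows each), and only the fidelity term is evaluated, not the weighted nuclear norm.

```python
import numpy as np

def weight_matrices(sigma_rgb, sigma_patch, p):
    """Diagonal W1 (3p^2 x 3p^2) and W2 (M x M) built from noise standard deviations."""
    sigma_rgb = np.asarray(sigma_rgb, dtype=float)
    sigma_patch = np.asarray(sigma_patch, dtype=float)
    w1 = np.repeat(sigma_rgb ** -0.5, p * p)  # sigma_c^{-1/2} for every pixel row of channel c
    w2 = sigma_patch ** -0.5                  # sigma_m^{-1/2} for every patch column m
    return np.diag(w1), np.diag(w2)

def fidelity_matrix_form(Y, X, W1, W2):
    """||W1 (Y - X) W2||_F^2, the last line of (9) without the regulariser."""
    return np.linalg.norm(W1 @ (Y - X) @ W2, "fro") ** 2

def fidelity_sum_form(Y, X, sigma_rgb, sigma_patch, p):
    """sum_c sum_m sigma_cm^{-2} ||y_cm - x_cm||^2, the first line of (9)."""
    total = 0.0
    for ci, sc in enumerate(sigma_rgb):
        rows = slice(ci * p * p, (ci + 1) * p * p)
        for m, sm in enumerate(sigma_patch):
            sigma_cm = np.sqrt(sc * sm)  # sigma_cm = sigma_c^{1/2} * sigma_m^{1/2}
            total += np.sum((Y[rows, m] - X[rows, m]) ** 2) / sigma_cm ** 2
    return total
```

Because both weight matrices are diagonal, they never need to be formed explicitly in practice; scaling the rows and columns of Y − X gives the same value.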

Up to now, all elements in (4) have been defined. It remains to solve model (4) and apply it to real world image denoising. The following subsection describes how to solve the model.

3.3. The solution of the proposed model

After the model parameters are determined, optimization problem (4) needs to be solved. Unfortunately, (4) is a non-convex problem with no analytical solution. With the alternating direction method of multipliers (ADMM) [30] framework, we can decompose (4) into several sub-problems and solve them iteratively. To apply ADMM, we transform the unconstrained optimization problem (4) into an equivalent equality-constrained problem:


\[ \min_{X, Z} \|W_1 (Y - X) W_2\|_F^2 + \|Z\|_{w,*} \quad \text{s.t.} \quad X = Z. \tag{10} \]

The augmented Lagrangian function of (10) is:

\[ \mathcal{L}(X, Z, A, \rho) = \|W_1 (Y - X) W_2\|_F^2 + \|Z\|_{w,*} + A^T (X - Z) + \frac{\rho}{2} \|X - Z\|_F^2, \tag{11} \]

where A is the augmented Lagrangian multiplier and ρ > 0 is the penalty parameter. According to ADMM, the final solution is obtained by iterating the following updates:

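\[ X_{k+1} = \arg\min_X \|W_1 (Y - X) W_2\|_F^2 + A_k^T (X - Z_k) + \frac{\rho_k}{2} \|X - Z_k\|_F^2, \tag{12} \]

\[ Z_{k+1} = \arg\min_Z \frac{\rho_k}{2} \|Z - (X_{k+1} + \rho_k^{-1} A_k)\|_F^2 + \|Z\|_{w,*}, \tag{13} \]

\[ A_{k+1} = A_k + \rho_k (X_{k+1} - Z_{k+1}), \tag{14} \]

\[ \rho_{k+1} = \mu \rho_k, \tag{15} \]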

where μ ⩾ 1 is a manually chosen constant. Eqs. (14) and (15) are simple variable updates, so only the two optimization problems (12) and (13) need to be solved. For (12), we set the derivative of the objective with respect to X to zero and get

\[ W_1^T W_1 X + \frac{\rho_k}{2} X (W_2 W_2^T)^{-1} = W_1^T W_1 Y + \left(\frac{\rho_k}{2} Z_k - \frac{1}{2} A_k\right) (W_2 W_2^T)^{-1}. \tag{16} \]
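The step from (12) to (16) can be spelled out as follows. This is a derivation sketch that assumes the linear term of (12) is read as the matrix inner product $\langle A_k, X - Z_k \rangle = \mathrm{tr}(A_k^T (X - Z_k))$, and that $W_2 W_2^T$ is invertible, which holds because every $\sigma_m$ is positive.

```latex
\begin{aligned}
\nabla_X \|W_1 (Y - X) W_2\|_F^2 &= -2\, W_1^T W_1 (Y - X) W_2 W_2^T, \\
\nabla_X \langle A_k, X - Z_k \rangle &= A_k, \qquad
\nabla_X \tfrac{\rho_k}{2} \|X - Z_k\|_F^2 = \rho_k (X - Z_k), \\
0 &= 2\, W_1^T W_1 (X - Y) W_2 W_2^T + A_k + \rho_k (X - Z_k), \\
W_1^T W_1 X W_2 W_2^T + \tfrac{\rho_k}{2} X &= W_1^T W_1 Y W_2 W_2^T + \tfrac{\rho_k}{2} Z_k - \tfrac{1}{2} A_k .
\end{aligned}
```

Right-multiplying the last line by $(W_2 W_2^T)^{-1}$ gives the equation above.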

Eq. (16) is a standard Sylvester equation, and its solution can be written in the following form:

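\[ \mathrm{vec}(X) = \left(I_M \otimes (W_1^T W_1) + \left(\frac{\rho_k}{2} (W_2 W_2^T)^{-1}\right)^T \otimes I_{3p^2}\right)^{-1} \mathrm{vec}\left(W_1^T W_1 Y + \left(\frac{\rho_k}{2} Z_k - \frac{1}{2} A_k\right) (W_2 W_2^T)^{-1}\right), \tag{17} \]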

where vec(·) is a function that vectorizes a matrix into a vector. The solution of (12) is then $X_{k+1} = \mathrm{vec}^{-1}(\mathrm{vec}(X))$. According to Xu et al. [10], (13) has the optimal solution

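\[ Z_{k+1} = U_k \Sigma_k V_k^T, \tag{18} \]

where $U_k$ and $V_k^T$ are the unitary matrices in the singular value decomposition of $X_{k+1} + \rho_k^{-1} A_k$, that is, $X_{k+1} + \rho_k^{-1} A_k = U_k \hat{\Sigma}_k V_k^T$. $\Sigma_k$ is defined as follows:

\[ \Sigma_k = \begin{pmatrix} \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_M) \\ 0 \end{pmatrix}_{3p^2 \times M}, \]

where

\[
\sigma_i =
\begin{cases}
0, & \text{if } (\hat{\sigma}_i - \epsilon)^2 - 8\sqrt{2M}/\rho_k < 0, \\
\dfrac{\hat{\sigma}_i + \sqrt{(\hat{\sigma}_i - \epsilon)^2 - 8\sqrt{2M}/\rho_k}}{2}, & \text{if } (\hat{\sigma}_i - \epsilon)^2 - 8\sqrt{2M}/\rho_k \geq 0,
\end{cases}
\]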

$\hat{\sigma}_i$ is the i-th diagonal element of $\hat{\Sigma}_k$, and ε > 0 is a small constant. We initialize $X_0$, $Z_0$, and $A_0$ to all-zero matrices. In each iteration, we first fix $A_k$ and $Z_k$ and update $X_{k+1}$ by (17). We then fix $A_k$ and $X_{k+1}$ and update $Z_{k+1}$ by (18); note that this update uses the new $X_{k+1}$. Finally, we update $A_{k+1}$ and $\rho_{k+1}$ with (14) and (15), respectively. Our algorithm takes the noise level as a known parameter. Therefore, the noise estimation model proposed by Liu and Lin [42] is used to estimate the noise level of the real image as the input of our algorithm. The flow chart of our algorithm is shown in Fig. 1.

4. Experiments

We evaluate the proposed method on three real noisy color image datasets. We compare the proposed method with state-of-the-art denoising methods, including CBM3D [59], MCWNNM [10], DnCNN [14], TWSC [46] and CBDNet [51]. All code is run in the MATLAB 2018b environment on a computer with an Intel (R) Core (TM) i5-4590 CPU (3.30 GHz) and 16 GB of RAM. We first introduce the test datasets and the assessment criteria for image denoising in Sections 4.1 and 4.2, respectively. Then, Section 4.3 presents the parameter settings of our proposed method. Finally, we show the denoised results of all compared methods in Section 4.4.

4.1. Real noisy image datasets

We use three commonly used datasets of real noisy images to evaluate the performance of our proposed method and the other methods. The basic information of these three datasets is as follows.


Fig. 1. The flow chart of our algorithm.
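The iteration summarised in the flow chart, that is, updates (12) to (15) applied to one group of similar patches, can be sketched in Python as follows. This is a minimal sketch under stated assumptions and not the authors' MATLAB implementation. The function names are hypothetical. Because W1 and W2 are diagonal, the Sylvester equation (16) is solved entry by entry instead of through the Kronecker form (17). The Z-update uses a weighted singular value soft-thresholding with weights frozen at the singular values of its input, which is a simplification and not the closed-form rule given after (18). The constant c = √2 is an assumed value.

```python
import numpy as np

def x_update(Y, Z, A, rho, a, b):
    """Solve (16) entrywise; a = diag(W1^T W1), b = diag(W2 W2^T)."""
    ab = np.outer(a, b)  # ab[i, j] = a_i * b_j
    return (ab * Y + 0.5 * rho * Z - 0.5 * A) / (ab + 0.5 * rho)

def z_update(B, rho, c=np.sqrt(2.0), eps=1e-16):
    """Weighted singular value soft-thresholding of B (weights frozen at sv(B))."""
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    w = c * np.sqrt(B.shape[1]) / (s + eps)  # larger singular value, smaller weight
    s_new = np.maximum(s - w / rho, 0.0)     # shrink each singular value by w_i / rho
    return (U * s_new) @ Vt

def admm_denoise_group(Y, sigma_rgb, sigma_patch, p, rho=6.4, mu=1.001, iters=10):
    """Run the ADMM updates on one 3p^2 x M matrix of similar patches."""
    a = np.repeat(1.0 / np.asarray(sigma_rgb, dtype=float), p * p)  # sigma_c^{-1}
    b = 1.0 / np.asarray(sigma_patch, dtype=float)                  # sigma_m^{-1}
    X = np.zeros_like(Y)
    Z = np.zeros_like(Y)
    A = np.zeros_like(Y)
    for _ in range(iters):
        X = x_update(Y, Z, A, rho, a, b)  # update (12), via (16)
        Z = z_update(X + A / rho, rho)    # update (13), simplified thresholding
        A = A + rho * (X - Z)             # update (14)
        rho = mu * rho                    # update (15)
    return Z
```

The entrywise solve avoids building the $3p^2 M \times 3p^2 M$ Kronecker system, which would be very large for the patch and group sizes used in the experiments.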

Fig. 2. The scene of Dataset 1: The 12 test images cropped from NC12 dataset.

Dataset 1: NC12 [48] collected many real noisy images in uncontrolled outdoor environments. We cropped 12 real world noisy images of size 512 × 512 from NC12. Fig. 2 shows the scenes of these 12 cropped images. Note that this dataset does not have ground truth images.

Dataset 2: In order to train their model, Nam et al. [29] established an image dataset of 11 scenes. For each scene, 500 images were shot with different camera settings, and the mean of these 500 images was taken as the ground truth. Since the resolution of each image is very high (7360 × 4921) and the scenes overlap considerably, Nam et al. cropped 15 images with a resolution of 512 × 512 from these images for testing. In this paper, we use the 15 cropped images in png format as one of the test datasets.

Dataset 3: Using the same strategy as for dataset 2, Xu et al. [56] constructed a dataset with more camera brands, more camera settings, and more captured scenes (up to 40). We use the 100 jpg images of size 512 × 512 cropped by the authors to evaluate the denoising methods.

When using datasets 2 and 3 as test sets, we grouped the images according to their original camera settings. The specific camera settings and numbers of test images are shown with the experimental results in Section 4.4.

4.2. Assessment criteria for denoising results

For datasets 2 and 3, we use PSNR and SSIM [60] to evaluate the quality of the denoised results. The definitions of PSNR and SSIM are given in Eqs. (19) and (20), respectively. The larger these two values are, the better the denoised image is. Since ground truth images are not available for dataset 1, we evaluate its denoising results only visually.


Fig. 3. The denoised images of “Dog” in dataset 1. (a) CBM3D, (b) DnCNN, (c) MCWNNM, (d) TWSC, (e) CBDNet, (f) our proposed method.

Table 1
The PSNR and SSIM results of Dataset 2. The first column in the table is the camera settings, and the numbers in parentheses represent the number of test images.

PSNR
Camera (images)    CBM3D     DnCNN     MCWNNM    TWSC      CBDNet    Ours
Canon 5D (3)       35.4022   35.1601   38.3109   37.2065   36.4588   38.4916
Nikon D600 (3)     34.2355   34.5016   37.3761   37.8601   36.9431   37.2780
Nikon D800 (9)     32.9980   33.2126   37.6422   38.1414   36.7126   37.3005
Average            34.2119   34.8309   37.7764   37.736    36.7048   37.69

SSIM
Camera (images)    CBM3D     DnCNN     MCWNNM    TWSC      CBDNet    Ours
Canon 5D (3)       0.9286    0.9186    0.9683    0.9543    0.9531    0.9699
Nikon D600 (3)     0.8857    0.8921    0.9626    0.9672    0.9531    0.9614
Nikon D800 (9)     0.8300    0.8356    0.9511    0.9581    0.9402    0.9482
Average            0.8814    0.8821    0.9607    0.9599    0.9497    0.9598

\[ \mathrm{PSNR}(x, y) = 10 \log_{10}\left(\frac{(2^8 - 1)^2}{\mathrm{MSE}(x, y)}\right), \tag{19} \]

\[ \mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)}, \tag{20} \]

where

\[ \mathrm{MSE}(x, y) = \frac{1}{MN} \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} [x(i, j) - y(i, j)]^2 \]

is the mean square error of x and y, x and y are the pixel value matrices of the noise-free and noisy image, respectively, M × N is the size of x and y, $\mu_x$, $\mu_y$ are the mean values of x and y, respectively, $\sigma_x^2$, $\sigma_y^2$ are the variances of x and y, respectively, $\sigma_{xy}$ is the covariance of x and y, and $c_1$ and $c_2$ are constants.
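The two criteria can be computed as in the sketch below. It is illustrative and makes two assumptions that are not stated in the text: the SSIM is evaluated once over the whole image exactly as formula (20) is written, whereas common SSIM implementations average the same expression over local windows, and the constants are set to the widely used values c1 = (0.01·255)² and c2 = (0.03·255)². The function names are hypothetical.

```python
import numpy as np

def psnr(x, y, bits=8):
    """PSNR in dB as in (19) for images with 2**bits grey levels."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    mse = np.mean((x - y) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    peak = 2 ** bits - 1
    return 10.0 * np.log10(peak ** 2 / mse)

def global_ssim(x, y, bits=8, k1=0.01, k2=0.03):
    """Formula (20) evaluated over the whole image (single window)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    peak = 2 ** bits - 1
    c1, c2 = (k1 * peak) ** 2, (k2 * peak) ** 2  # assumed constants
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = np.mean((x - mu_x) * (y - mu_y))
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

Values produced by this single-window SSIM are not directly comparable with the SSIM columns of the tables, which were presumably computed with a windowed implementation.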


Table 2
The PSNR and SSIM results of Dataset 3. The first column in the table is the camera settings, and the numbers in parentheses represent the number of test images.

PSNR
Camera (images)     CBM3D     DnCNN     MCWNNM    TWSC      CBDNet    Ours
Canon 5D (29)       35.7202   35.6393   37.4788   37.4940   36.3944   37.5258
Canon 80D (15)      36.8377   36.6701   39.5256   39.8243   34.4581   39.6222
Canon 600D (11)     37.6126   37.5083   40.1285   40.3572   36.8481   40.0626
Nikon D800 (33)     36.0533   35.9489   38.2958   38.4297   36.4753   38.2877
Sony A7II (12)      35.7495   35.4898   38.8369   38.7757   36.7588   39.0021
Average             36.3947   36.2513   38.8531   38.9762   36.1869   38.9001

SSIM
Camera (images)     CBM3D     DnCNN     MCWNNM    TWSC      CBDNet    Ours
Canon 5D (29)       0.9202    0.9165    0.9671    0.9663    0.9595    0.9682
Canon 80D (15)      0.9308    0.9250    0.9731    0.9758    0.9619    0.9745
Canon 600D (11)     0.9328    0.9297    0.9747    0.9767    0.9662    0.9748
Nikon D800 (33)     0.9227    0.9183    0.9653    0.9662    0.9548    0.9659
Sony A7II (12)      0.8976    0.8851    0.9582    0.9547    0.9412    0.9602
Average             0.9208    0.9149    0.9677    0.9679    0.9567    0.9687

4.3. Experimental settings

We set the parameters of our algorithm as follows: the numbers of iterations are K1 = 10 and K2 = 2, the patch size is p × p = 7 × 7, the local search window for selecting similar patches is 40 × 40, the number of similar patches is M = 70, the updating parameter in (15) is μ = 1.001, and the initial value of ρ is 6.4. All methods compared in this paper, except CBDNet [51], take the noise level of the real world image as input. We use the noise estimation model proposed by Liu and Lin [42] to estimate the noise levels of the R, G, B channels separately, denoted {σr, σg, σb}. MCWNNM [10], TWSC [46], and our method take these three noise levels directly as input parameters. Methods like CBM3D, however, require a single noise level as input. We set this single noise level as:

\[ \sigma = \sqrt{\frac{\sigma_r^2 + \sigma_g^2 + \sigma_b^2}{3}}. \tag{21} \]

As for the noise standard deviation of each patch in our algorithm, we initialize it as σm = σ and update it with the following equation:

\[ \sigma_m = \sqrt{\max(0, \sigma^2 - \|y_m - x_m\|_2^2)}. \]
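The single noise level of (21) and the per-patch update can be sketched as below. The sketch assumes that (21) is the root mean square of the three channel levels and that the patch update takes a square root of the clipped difference; both readings are inferred from the units of the quantities involved. The squared residual is used without normalisation, exactly as the update rule is written. The function names are hypothetical.

```python
import math

def combined_sigma(sigma_r, sigma_g, sigma_b):
    """Single noise level from the three channel levels, following (21)."""
    return math.sqrt((sigma_r ** 2 + sigma_g ** 2 + sigma_b ** 2) / 3.0)

def update_patch_sigma(sigma, y_m, x_m):
    """Per-patch noise level from the current residual between y_m and x_m."""
    residual = sum((yi - xi) ** 2 for yi, xi in zip(y_m, x_m))  # squared l2 norm
    return math.sqrt(max(0.0, sigma ** 2 - residual))           # clipped at zero

# Usage: start from the combined level, then refine it for one patch.
sigma0 = combined_sigma(3.0, 4.0, 5.0)
sigma_m = update_patch_sigma(sigma0, [10.0, 12.0], [9.0, 11.0])
```

The clipping at zero keeps the estimate real-valued when the residual already exceeds the assumed noise energy.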

4.4. Experimental results on real world image datasets In this subsection, we show the denoising results of many state-of-the-art methods and the method presented in this paper. Since dataset 1 does not have ground truth images, we cannot calculate PSNR and SSIM. We present the results of the denoised images in Fig. 3. For datasets 2 and 3, their denoising results are shown in Tables 1 and 2 , respectively. The best PSNR and SSIM results are shown in bold. As can be seen from Fig. 3, TWSC achieves the best denoising effect, which preserves more image details than other methods. CBM3D and DnCNN, especially DnCNN, have poor results and retain most of the noise. Our method is comparable to CBDNet and MCWNNM. Table 1 shows that MCWNNM attains the highest PSNR and SSIM on dataset 2. Table 2 shows that TWSC and our method achieve the best PSNR and SSIM respectively. As a deep neural network designed for real world images, the denoising performance of CBDNet on the two datasets is not outstanding, which reflects the drawbacks of the neural network's excessive dependence on the training dataset. For dataset 2 in png format, our method is less effective than MCWNNM. For dataset 3 in jpg format, our approach works better than MCWNNM. This experimental results verify the effectiveness of the matrix W2 that we propose for the difference in noise levels between patches produced by JPEG compression. Since CBM3D and DnCNN are denoising methods for AWGN, the performance on the data set of real noisy images is generally worse than the method specifically designed for real noisy images. CBDNet and DnCNN are both deep learning methods. However, in real world image denoising, CBDNet designed for real noise is better than CNDNNM for synthetic AWGN. From the comparison of these state-of-the-art denoising methods, the AWGN denoising methods are not effective in real image denoising, and more methods for real noisy image denoising need to be proposed. 5. 
Conclusion The noise in real world image is related to patches due to JPEG compression. According to this characteristic of noise, we proposed a multi-weighted nuclear norm minimization denoising model. Two weight matrices have been introduced into our model. One is used to balance the difference in noise between patches. The other is to balance the difference in data between R, G, B 8

Optik - International Journal for Light and Electron Optics 206 (2020) 164214

X. Guo, et al.

channels. The values of the two matrices are obtained by maximum a posteriori (MAP) estimation. The matrix that balances the differences between channels plays the same role as the weight matrix in MCWNNM, but its values are different. Our objective function is non-convex; we transformed it into an equality-constrained problem and solved it under the ADMM framework. Experiments showed that our method is comparable to state-of-the-art methods for real image denoising. In the future, we will study more statistical characteristics of noise and propose more suitable models for real-world image denoising.

Acknowledgments

This work was supported by the China Railway Corporation (originally known as the Ministry of Railways) [grant number K19D00021].
