Designing a stable feedback control system for blind image deconvolution


Neural Networks 101 (2018) 101–112


Shichao Cheng a,c, Risheng Liu b,c,d, Xin Fan b,c,*, Zhongxuan Luo a,b,c

a School of Mathematical Sciences, Dalian University of Technology, Dalian 116024, China
b DUT-RU International School of Information Science & Engineering, Dalian University of Technology, Dalian 116620, China
c Key Laboratory for Ubiquitous Network and Service Software of Liaoning Province, Dalian 116620, China
d Department of Computing, The Hong Kong Polytechnic University, Hong Kong, China

Highlights
• A new perspective based on a feedback control system to model image propagation is proposed.
• A flexible framework for blind image deconvolution based on the latent image propagation system is designed.
• The designed framework can effectively process all kinds of blurred images.

Article info

Article history: Received 3 August 2017; Received in revised form 18 December 2017; Accepted 26 January 2018; Available online 15 February 2018.

Keywords: Image restoration; Blind deconvolution; Kernel estimation; Feedback control system; Latent sharp image

Abstract

Blind image deconvolution is one of the main low-level vision problems and has wide applications. Many previous works manually design regularization to simultaneously estimate the latent sharp image and the blur kernel under a maximum a posteriori framework. However, it has been demonstrated that such joint estimation strategies may lead to an undesired trivial solution. In this paper, we present a novel perspective, using a stable feedback control system, to simulate the latent sharp image propagation. The controller of our system consists of regularization and guidance, which decide the sparsity and the sharp features of the latent image, respectively. Furthermore, the formation model of the blurred image is introduced into the feedback process to keep the image restoration from deviating from the stable point. The stability analysis of the system indicates that the latent image propagation in the blind deconvolution task can be efficiently estimated and controlled by cues and priors, so the kernel estimation used for image restoration becomes more precise. Experimental results show that our system is effective for image propagation, and performs favorably against state-of-the-art blind image deconvolution methods on different benchmark image sets and special blurred images. © 2018 Elsevier Ltd. All rights reserved.

1. Introduction

Blind image deconvolution is a classical media analysis task, widely used in many image processing and multimedia analysis communities, such as image feature extraction, classification, and retrieval. Research on image deconvolution therefore has practical value. In fact, camera shake, defocus, and environmental noise can all result in a blurred image. Generally, this process can be formulated as follows:

y = k ⊗ x + n,  (1)

* Corresponding author at: DUT-RU International School of Information Science & Engineering, Dalian University of Technology, Dalian 116620, China. E-mail addresses: [email protected] (S. Cheng), [email protected] (R. Liu), [email protected] (X. Fan), [email protected] (Z. Luo). https://doi.org/10.1016/j.neunet.2018.01.012 0893-6080/© 2018 Elsevier Ltd. All rights reserved.

where k is the point spread function (or blur kernel), x is the latent sharp image, y is the blurry observation, n is the error term and ‘‘⊗’’ denotes 2D convolution. The solution of this problem is a deconvolution process that involves estimating the latent image from an observation corrupted by an unknown kernel; it is challenging because only the blurred observation y is known. A traditional blind deconvolution method consists of kernel estimation followed by non-blind deconvolution. Most blind deblurring methods focus on obtaining a correct kernel, then use an existing algorithm to solve the non-blind deconvolution. There is a long history of designing algorithms for kernel estimation, and some recent models have demonstrated remarkable results (Gong, Tan, Zhang, van den Hengel, & Shi, 2016; Krishnan & Fergus, 2009; Pan, Hu, Su, & Yang, 2014b; Wipf & Zhang, 2014; Xu, Zheng, & Jia, 2013; Zuo, Ren, Zhang, Gu, & Zhang, 2016). Generally speaking, most works focus on hand-crafting various
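For concreteness, the formation model in Eq. (1) can be simulated as below. This is a hedged sketch, not the paper's code: it assumes circular boundary conditions so the convolution can be done with the FFT, and the helper names `psf2otf` and `blur` are ours.

```python
import numpy as np

def psf2otf(k, shape):
    # Zero-pad the blur kernel to the image size and circularly shift
    # its center to the origin before taking the 2-D FFT.
    kp = np.zeros(shape)
    kp[:k.shape[0], :k.shape[1]] = k
    kp = np.roll(kp, (-(k.shape[0] // 2), -(k.shape[1] // 2)), axis=(0, 1))
    return np.fft.fft2(kp)

def blur(x, k, noise_std=0.0, seed=0):
    # y = k ⊗ x + n (Eq. (1)) under circular boundary conditions.
    y = np.real(np.fft.ifft2(np.fft.fft2(x) * psf2otf(k, x.shape)))
    if noise_std > 0:
        y = y + np.random.default_rng(seed).normal(0.0, noise_std, x.shape)
    return y
```

With a delta kernel the observation equals the sharp image, which is a quick sanity check on the boundary handling.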



priors to improve the deconvolution results based on MAP. MAP estimation is one of the most commonly used blind deconvolution frameworks (Chan & Wong, 1998; Krishnan & Fergus, 2009); it is a common method that uses the probability of observed events to solve for the variables of a problem (Gu, Sheng, Tay, Romano, & Li, 2015; Gu, Sun, & Sheng, 2017). The basic idea of MAP is to maximize the posterior distribution as follows:

max_{x,k} p(x, k|y) ∝ max_{x,k} p(y|x, k) p(x) p(k) ∝ max_{x,k} N(y; x, k) p(x) p(k),  (2)

where p(y|x, k) is defined as a Gaussian likelihood function N(y; x, k), and p(x) and p(k) are handcrafted prior assumptions. Performing a ‘‘−log’’ transformation and ignoring constant factors leads to the following minimization problem:

min_{x,k} (1/2) ∥k ⊗ x − y∥₂² − log p(x) − log p(k),  (3)

where the two regularizers reflect the priors on x and k, respectively. Based on (3), a common characteristic of the MAP framework is the explicit use of priors for both the blur kernel and the sharp image to encourage smoothness in the solution. However, studies in Levin, Weiss, Durand, and Freeman (2009) have demonstrated that jointly estimating x and k using (3) may fit both sharp and blurry images, thus leading to the undesired trivial solution (global minimum) x = y and k = δ, where δ is the delta kernel. Actually, even with known k, the non-blind deconvolution task is still ill-posed (Krishnan & Fergus, 2009). Many efforts have therefore been made within this framework to design more complex priors, choose good initializations and introduce heuristic pre/post-processing. Unfortunately, these tricks only work well on particular blurry images and easily bring oscillation when estimating the latent image and blur kernel.

To tackle this problem and obtain a stable latent sharp image and blur kernel, we turn to control systems (Bhaya & Kaszkurewicz, 2004; SUN Guo-fa & Su-zhen, 2017; Turner, Chun, & Juang, 2015), which have a more comprehensive stability theory. Control systems have been widely applied to industrial production, such as boiler stabilization (Han, Yan, & Zhang, 2010), but are rarely used for image restoration. We therefore introduce a control system into our blind deblurring task to obtain stable propagation of the latent sharp image and blur kernel.

In this paper, we develop a stable feedback control system to estimate the latent sharp image, which in turn helps kernel estimation. Concretely, we first design a controller using priors and cues to guarantee the sparsity and sharp characteristics of the latent image, then add the fidelity of the problem model as feedback. These are the main elements of our stable feedback control system (SFCS) for image propagation. We then give the energy function for kernel estimation. The main contributions of this work are three-fold:

• We propose a new perspective based on a feedback control system to model image propagation. This propagation system has a stable point that guarantees the correctness of the descent direction (i.e., the evolution of the latent image).
• We design a flexible framework for blind image deconvolution based on the latent image propagation system. This framework uses cues and priors from training data as the constraint and regularization. Furthermore, we discuss its relationship with convolutional neural networks (CNNs). The components of the controller in our framework are similar to those of a CNN, but the number of parameters is dramatically smaller, since the priors of blurred images are introduced into our training procedure.
• We show that the designed framework can effectively process many kinds of blurred images, including facial images, text images, and low-illumination images. Numerous experiments show that the proposed framework performs favorably against state-of-the-art blind deconvolution methods.

2. Related work

In order to design a stable feedback control system for better kernel estimation, we review existing latent image estimation algorithms and feedback control systems in this section.

2.1. Blind deconvolution

Looking back over the last decades of blind deconvolution work, many efforts have been made to improve performance within the MAP framework. For MAP, estimating a latent sharp image suitable for kernel estimation is crucial, which amounts to designing the prior (regularization) term p(x) in (3). First, it has been observed that salient structure helps avoid the trivial delta-kernel solution, so a natural idea is to perform kernel estimation in the gradient domain (Gong et al., 2016; Wipf & Zhang, 2014). Second, the regularizer on x can be carefully designed to favor sharp latent structures over blurred ones. The most widely used form is the ℓq norm. For example, the ℓ1 norm of the image gradient leads to the well-known total variation (TV) model (Chan & Wong, 1998; Perrone & Favaro, 2014). The work in Shan, Jia, and Agarwala (2008) combines the ℓ1 norm with a ringing suppression term as the prior. To enforce stronger sparsity, the ℓ0 norm has also been adopted as an image prior (Pan et al., 2014b; Xu et al., 2013). Very recently, an iteration-wise ℓq (0 < q < 1) prior was designed to fit the latent image distribution (Zuo et al., 2016); it achieves better deconvolution results than a fixed ℓq norm by learning a different ℓq in each iteration. Rather than designing sparse regularization, some works explicitly extract salient image structures by using edge detection methods (Cho & Lee, 2009; Xu & Jia, 2010) or manually labeled exemplars (Pan et al., 2014b). For example, Pan et al. (2014b) achieve better face image deblurring by briefly marking the face contour, eyes and mouth. Another kind of recently popular sparse regularization is based on learning frameworks.
Regression tree fields (RTF), cascaded shrinkage fields (CSF) and trainable nonlinear reaction diffusion (TNRD) learn image priors (sparse regularization) from pairs of clear and blurred images. All of them are more flexible than a fixed sparse measure such as the ℓq norm.

2.2. Stable feedback control system

A feedback control system is a kind of dynamical system with wide application in the automation industry (SUN Guo-fa & Su-zhen, 2017; Turner et al., 2015). A typical feedback control system includes a controller and a plant. The plant represents the problem to be solved, while the controller reflects the algorithm designed to solve it. The feedback acts through the whole system and builds the relationship between the controller and the plant. In fact, a discrete dynamical system can be regarded as an iterative process (Bhaya & Kaszkurewicz, 2004), so many numerical methods can be analyzed from the perspective of dynamical systems. If we regard image propagation as an iterative process, we can simulate it with a discrete feedback control system. We therefore devote ourselves to designing an effective controller that makes the latent image sharper. It should not be ignored that stability plays an important role in a control system, as it directly decides whether the system


can work normally. Stability analysis is also an indispensable part of any practical control system; a lack of stability can result in unanticipated output. If the iteration of a discrete control system converges to a stable point, the system is feasible and stable for the specific problem. Similarly, if we design a stable control system for sharp image propagation, we can obtain a stable sharp image in the later iterations to facilitate kernel estimation. There are some works based on feedback or control systems for blind image deblurring (Yousaf & Qin, 2015). For example, the method in Yousaf and Qin (2015) designs an effective feedback control to enhance the stability of blind image deconvolution. Unfortunately, the authors only used the ℓ2,1-norm and ℓ1-norm as the priors of the latent image and kernel, respectively, without an adaptive part to make the control system more flexible and better suited to blind image deconvolution. In this paper, we not only require the final latent image to have strong edges, but also guide the propagation to converge to a stable point by computing reasonable trade-off parameters. The stability analysis of the feedback control system in Section 3.3 guarantees the effectiveness of our algorithm.
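To make the stability intuition concrete, the toy loop below iterates a discrete feedback system whose plant is a quadratic fidelity term and whose controller is a simple damping prior. It is our own minimal illustration, not the system proposed in this paper; for a small enough gain the iteration contracts to its unique stable point.

```python
import numpy as np

def run_feedback(A, b, lam=0.2, alpha=0.1, steps=300):
    # Discrete feedback loop: the plant feedback is the gradient of a
    # quadratic model-fidelity term, and the controller adds damping.
    # For a small enough gain `lam`, the state converges to the unique
    # fixed point x* = (A^T A + alpha I)^{-1} A^T b.
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        feedback = A.T @ (A @ x - b)   # plant: model-fidelity gradient
        control = alpha * x            # controller: simple damping prior
        x = x - lam * (feedback + control)
    return x
```

The step-size condition (gain small relative to the largest curvature of the plant) mirrors the interval constraints that a stability analysis of such a loop produces.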


3. The proposed framework

Different from a general feedback control system, our SFCS has an adaptive controller which can promptly keep the latent image from deviating off its trajectory. We introduce our adaptive SFCS in detail in the following. The subject of our adaptive SFCS is the evolution of the latent image, and such evolutions have been widely used to formulate natural phenomena. Before building an inference framework for Eq. (1), we first consider the image modeling issue. As discussed above, SFCS is a flexible nonlinear dynamical system whose trajectory is governed by the controller and the feedback. In this work, we first design the controller based on the popular convolution and sparsity measures, then consider the characteristics which need to be added to the controller when dealing with specific image problems.

3.1. Feedback control system for image propagation

For image restoration, we consider the latent image propagation to be the plant of our feedback control system, and regard the final latent image as the output of the whole system. We design the controller according to experience accumulated in image processing works. In a feedback control system, the controller is the key part that decides the propagation of the whole system. We propose a controller with two components to control the evolution of the latent image based on priors and data. One is the guidance control. In order to satisfy the different requirements (such as image smoothing) in various image tasks, we introduce the guidance control G(x^t) = x^t − g, where g denotes the guidance. The other is the diffusion control, inspired by anisotropic filtering and sparsity constraints in image processing: D(x^t) = f̄ ⊗ ϕ(x^t; f), where f denotes the convolutional filter, ϕ denotes the sparsity measure, and f̄ is obtained by rotating f by 180°. Here f is an anisotropic filter which can strengthen or weaken edges. Combining the two components with trade-off parameters α and β, we denote the output of the controller as

U := α D(x^t) + β G(x^t).  (4)

We now have all components of the controller; the guidance (g), the convolutional filter (f), and the sparsity measure (ϕ) are the undefined elements, which can be designed by hand or learned. For example, one may choose a first-order difference filter (as f), the ℓ1 norm (as ϕ) and the observation (as g), but such fixed, manually specified forms, which drive the evolution of the latent image, are not always beneficial to kernel estimation. Thus, designing suitable convolutional filters, sparsity measures and guidances is necessary to obtain a latent sharp image. We introduce in detail how to learn (design) these elements in the next subsection.

Feedback also plays an important role in SFCS, closing the loop of the whole system. The feedback consists of the present state (latent image x^t) and the fidelity of the problem model. The model is related to the formulation of the concrete image processing task. It can be generally written as

M(x^t) = k̄ ⊗ (k ⊗ x^t − y),  (5)

where k̄ is obtained by rotating k by 180°. For deconvolution, k is the blur kernel and ⊗ is the convolution operation. For super-resolution, k is a downsampling matrix and ⊗ is matrix multiplication. For denoising, k is the identity matrix and ⊗ is again matrix multiplication. The state observer is described by the equation

x^{t+1} = H(x^t) − U,  (6)

where H(x^t) = x^t − λM(x^t). So far, we have built the discrete nonlinear feedback control system for image propagation. The pink part in Fig. 1 shows the evolution of the latent sharp image in detail.

3.2. Designing adaptive components in the controller

If we fix the guidance g, convolution filter f and sparsity measure ϕ, our control system reduces to a traditional evolution based on a partial differential equation; the evolution is then uncontrolled and may converge to a delta kernel. In order to better fit the propagation of the latent image and blur kernel, we carefully design the guidance and regularization (convolution and sparsity measure) from cues and training data, thus introducing a learnable part into our image propagation system. In the following, we discuss how to learn these elements using both the training data and the priors.

3.2.1. Adaptive cross-layer guidance

Considering the importance of the energy function and the sharp-edge image in image deconvolution, our guidance separates into two parts. One part is the gradient of the ℓ2-norm model prior in the energy function, which is a simple linear operation without learnable parameters. The other part is how to obtain a preliminary guidance of the sharp-edge image, so we only discuss how to learn the guidance g. It can be observed that at the initial stages g should control the propagation to enhance the sharp edges, while revealing the exact image structure at the later period. Based on these observations, we propose three settlements: edge extraction; shock and bilateral filtering; and adaptive ℓp-regularization. First, we can obtain the edges of the image as the guidance g by using an existing edge extraction method or by manual marking, which is convenient and time-saving. Second, we can also define the guidance g as

g^t = Σ_{γ∈{h,v}} thresh(∂_γ shock(bilateral(x^t))),

where {h, v} denote the horizontal and vertical directions, and thresh(·), shock(·) and bilateral(·) are a thresholding operator, the shock filter and the bilateral filter, respectively. Third, we adopt a restoration model with adaptive ℓp-regularization and a cross-layer kernel cue for our system. Specifically, the guidance g can be obtained by solving

g^t = arg min_g (η_g/2) ∥k^t ⊗ g − y∥₂² + ∥∇g∥_p^p,  (7)



Fig. 1. The proposed blind image deconvolution framework. The pink part represents the image propagation feedback control system. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
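One propagation step of the loop in Fig. 1 can be sketched numerically as follows. This is an illustrative reading of Eqs. (4)–(6), not the paper's code: it assumes circular convolution, the parameter values are placeholders, and the shrinkage `phi` is passed in as a generic function.

```python
import numpy as np

def conv2_circ(x, h):
    # Circular 2-D convolution via FFT, with the kernel center at the origin.
    hp = np.zeros_like(x)
    hp[:h.shape[0], :h.shape[1]] = h
    hp = np.roll(hp, (-(h.shape[0] // 2), -(h.shape[1] // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(x) * np.fft.fft2(hp)))

def sfcs_step(x, k, y, g, f, phi, alpha=0.1, beta=0.1, lam=0.2):
    # One step of the latent-image propagation loop:
    #   G(x) = x - g                  guidance control
    #   D(x) = f_rot ⊗ phi(f ⊗ x)     diffusion control (Eq. (4))
    #   M(x) = k_rot ⊗ (k ⊗ x - y)    model-fidelity feedback (Eq. (5))
    #   x'   = x - lam*M(x) - (alpha*D(x) + beta*G(x))   (Eq. (6))
    rot = lambda h: h[::-1, ::-1]     # rotate a kernel by 180 degrees
    D = conv2_circ(phi(conv2_circ(x, f)), rot(f))
    G = x - g
    M = conv2_circ(conv2_circ(x, k) - y, rot(k))
    return x - lam * M - (alpha * D + beta * G)
```

With a delta kernel, a guidance equal to the current image and an inactive shrinkage, the step leaves the image unchanged, i.e., a sharp image is a stable point of the loop.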

where p ∈ [0, 1] and η_g is a trade-off parameter. We can approximate Eq. (7) by the following half-quadratic regularized model: an auxiliary variable w is introduced and Eq. (7) is rewritten as

min_{g,w} (η_g/2) ∥k^t ⊗ g − y∥₂² + ρ ∥w − ∇g∥₂² + ∥w∥_p^p.  (8)

Since the energy in Eq. (8) is differentiable with respect to g, g has a closed-form solution via the Fast Fourier Transform (FFT), i.e.,

g^t = F⁻¹( (η_g F̄(k^{t−1}) ⊙ F(y) + ρ F̄(∇) ⊙ F(w^t)) / (η_g F̄(k^{t−1}) ⊙ F(k^{t−1}) + ρ F̄(∇) ⊙ F(∇)) ),

where ⊙ is the element-wise product, F and F⁻¹ denote the Fourier transform and its inverse, and F̄ denotes the complex conjugate of F. As for the auxiliary variable w, it can be written as

w^t = arg min_w ρ ∥w − w̃∥₂² + ∥w∥_p^p = prox_{(1/ρ,p)}(w̃) = { sgn(w̃_i) T(w̃_i), if |w̃_i| > ϱ_{ρ,p};  0, if |w̃_i| ≤ ϱ_{ρ,p}, }  (9)

where w̃ = ∇g, w̃_i ∈ w̃, and ϱ_{ρ,p} and T_{(ρ,p)}(w̃_i) are defined by

ϱ_{ρ,p} = (2ρ(1 − p))^{1/(2−p)} + ρp (2ρ(1 − p))^{(p−1)/(2−p)},
T_{(ρ,p)}(w̃_i) = w̃_i − ρp (T_{(ρ,p)}(w̃_i))^{p−1}.

Fig. 2 shows the results obtained with the different guidances. Although the guidances are clearly distinguishable in the gradient domain, all of them lead to visually plausible blind deconvolution results.

3.2.2. Learnable convolutional filter

First-order derivative filters are the most commonly used operators in image processing. However, such simple filters may not correctly extract the salient structure of a complex image for kernel estimation (Levin et al., 2009). The purpose of the convolutional filter is therefore to transform images into a characteristic space and then extract the salient structure. In our framework, we would like to learn a single convolutional filter f from a set of collected training data {x_i, x_i^s}_{i=1}^m, where x and x^s are the filtering input and output, respectively. As for x^s, we generate it by computing x^s = Σ_γ ∂_γ x̃, where γ ∈ {h, v, hh, hv, vv} denotes the first-order and second-order derivatives in the horizontal and vertical directions, and x̃ is obtained by

x̃ = arg min_{x̃} η_x ∥x − x̃∥₂² + ∥∇x̃∥₀.

Then we write f = Σ_{j=1}^d θ_j f_j and solve the following model:

min_θ η_f ( Σ_{i=1}^m ∥ Σ_{j=1}^d θ_j f_j ⊗ x_i − x_i^s ∥₂² ) + ∥θ∥₂²,  (10)

where F = [f_1, . . . , f_d] is a given filter dictionary,¹ θ = [θ_1, . . . , θ_d] are the weight coefficients and η_f is a scalar constant. The closed-form solution to Eq. (10) can be directly obtained from its normal equation.
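The normal-equation solution of Eq. (10) amounts to ridge regression on the filter weights θ. The sketch below assumes circular convolution and a tiny illustrative two-filter dictionary; the function names are ours, not the paper's.

```python
import numpy as np

def conv2_circ(x, h):
    # Circular 2-D convolution via FFT, with the kernel center at the origin.
    hp = np.zeros_like(x)
    hp[:h.shape[0], :h.shape[1]] = h
    hp = np.roll(hp, (-(h.shape[0] // 2), -(h.shape[1] // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(x) * np.fft.fft2(hp)))

def learn_filter(xs_in, xs_out, dictionary, eta_f=1e4):
    # Solve Eq. (10) by its normal equation: the columns of A stack the
    # responses f_j ⊗ x_i over all training pairs, and theta is the
    # ridge-regression solution (eta_f A^T A + I) theta = eta_f A^T s.
    A = np.stack([np.concatenate([conv2_circ(x, fj).ravel() for x in xs_in])
                  for fj in dictionary], axis=1)
    s = np.concatenate([xo.ravel() for xo in xs_out])
    theta = np.linalg.solve(eta_f * A.T @ A + np.eye(len(dictionary)),
                            eta_f * A.T @ s)
    return theta, sum(t * fj for t, fj in zip(theta, dictionary))
```

Because convolution is linear in the filter, stacking the per-atom responses turns Eq. (10) into an ordinary least-squares problem in θ, which is why the normal equation suffices.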

3.2.3. Smooth sparsity measure

The purpose of the sparsity measure is to model the sparse image prior. A common method obtains it with a shrinkage function. Instead of setting a fixed shrinkage function as the restriction of the image prior, we attempt to design a more general function using a weighted linear combination of a group of radial basis functions (RBFs), i.e., ϕ(z) = Σ_{j=1}^n ϑ_j ρ(∥z − c_j∥), where ρ is the radial basis, {c_j}_{j=1}^n are the location parameters and ϑ = {ϑ_j}_{j=1}^n are the weight coefficients.

We first collect a set of training data {y_i, x_i^g, k_i^g}_{i=1}^m, where {x_i^g, k_i^g} are the ground-truth latent image and kernel for y_i. Our task is then to learn a series of ϑ^t to control the sparsity of our system. Specifically, at the t-th layer, we learn ϑ^t by minimizing the trade-off energy, with parameter η_ϕ, between the restoration and estimation errors, i.e.,

J(ϑ^t) = Σ_{i=1}^m ( η_ϕ ∥k_i^g ⊗ x_i^t − y_i∥₂² + ∥x_i^t − x_i^g∥₂² ).  (11)
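The RBF-parameterized sparsity measure ϕ(z) = Σ_j ϑ_j ρ(∥z − c_j∥) introduced above can be evaluated element-wise as sketched below; the Gaussian basis and the width `gamma` are our illustrative choices, not specified in the text.

```python
import numpy as np

def rbf_shrinkage(z, weights, centers, gamma=4.0):
    # phi(z) = sum_j vartheta_j * rho(|z - c_j|), applied element-wise,
    # with Gaussian radial basis rho(r) = exp(-gamma * r^2).
    z = np.asarray(z, dtype=float)
    basis = np.exp(-gamma * (z[..., None] - centers) ** 2)
    return basis @ weights
```

Since ϕ is linear in the weights ϑ, learning them against the energy in Eq. (11) is a smooth problem, which is what makes gradient-based training (e.g., via BPTT) feasible.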

It can be seen that the gradients of J w.r.t. ϑ^t can be calculated by back-propagation through time (BPTT), i.e.,

∂J/∂ϑ^t = Σ_{i=1}^m (∂J/∂x_i^t) (∂x_i^t/∂ϕ^t) (∂ϕ^t/∂ϑ^t).

So the energy in Eq. (11) can be efficiently minimized by a gradient-based algorithm, e.g., L-BFGS (Liu & Nocedal, 1989).

¹ In this paper, we adopt the DCT basis (omitting the DC component to guarantee zero-mean filters) to build this dictionary.

Fig. 2. Examples of deconvolution results with different guidances: (a) edge extraction, (b) TSB, (c) ℓp-regularization. TSB abbreviates thresh(·), shock(·) and bilateral(·). The top row shows the guidances in the gradient domain; the bottom row shows the restoration results with the different guidances.

Fig. 3. Comparisons with a deep CNN method for image deconvolution.

3.2.4. Discussion with CNN

First, we discuss the relationship between the architecture of our system and that of a CNN. The guidance, convolutional filter and sparsity measure in our system are similar to the fully connected layers, convolutions, and ReLU in a CNN, respectively. Compared with the many multi-scale convolutions in a CNN, we only use one 3 × 3 convolutional filter to extract features, which greatly decreases the number of parameters. We also use the sparsity measure, rather than a fixed ReLU function, to increase the flexibility of the sparsity constraint: it is more flexible than ReLU, which can make sparse features vanish when values are negative. We can regard the guidance in our system as a simple fully connected layer that compensates with information coming from the problem model. Moreover, the iterations of our system are similar to the stacking of layers in a CNN.

We then analyze the main difference between our system and a CNN: the incorporation of image priors. A traditional CNN has no explicit priors; it relies on deep layers to learn the characteristics (priors) purely from a great amount of training data, so the test results are easily affected by the distribution of the training data and only generalize to similar data. Our feedback control system, in contrast, can learn sharp features from a small amount of training data, since cues and priors (such as the ℓp norm in our guidance) are introduced into the training process to provide better guidance for image propagation. In fact, if we consider one guidance, one convolutional filter and one sparsity measure as a block, our system can be regarded as finding the stable point of the control system by efficient gradient descent in Eq. (6).

To verify that our system is superior to a deep CNN, we compare our method with Nah et al.'s method (Nah, Kim, & Lee, 2017), which adopts a multi-scale strategy to train an end-to-end deep CNN for image deblurring. As shown in Fig. 3, our method has better performance on a synthetic large-scale image. We can obtain

better restoration of the pillars and people in the building. We do not compare kernel estimation, since the CNN method is end-to-end and does not produce a kernel estimate.

3.3. Stability analysis of the proposed feedback control system

Stability is very important for the evolution of the latent image; it directly decides whether the propagation oscillates or tends to a bad image. For image deconvolution, only images with sharp, strong edges can guide a desirable kernel estimation, so the stability of the designed system means that the final latent image has strong edges. If our system lacked stable behavior, the latent image would pick up extra noise or remain blurry, both of which are harmful to kernel estimation. In more serious cases, the latent image becomes strange and deviates from the original image, and the wrong computation can terminate the algorithm. Our discrete SFCS is described in Eq. (6). Using the fact that the difference between the latent images of two adjacent iterations tends to zero if the system is stable, we can prove stability as follows:

Theorem 1. The feedback control system (Eq. (6)) is globally asymptotically stable if there exist α, β, λ such that the following condition



is satisfied:

α ∈ (0, 2⟨r^t, ∂_x D(x^t) r^t⟩ / ⟨∂_x D(x^t) r^t, ∂_x D(x^t) r^t⟩),
β ∈ (0, 2⟨r^t, ∂_x G(x^t) r^t⟩ / ⟨∂_x G(x^t) r^t, ∂_x G(x^t) r^t⟩),
λ ∈ (0, 2⟨r^t, ∂_x M(x^t) r^t⟩ / ⟨∂_x M(x^t) r^t, ∂_x M(x^t) r^t⟩),  (12)

where r^t is the residue function at the t-th iteration, ∂_x is the partial derivative with respect to x, and ⟨·, ·⟩ is the inner product.

Proof. Consider the nonlinear vector function Q(x) := λM(x) + U, and suppose there exists x∗ such that Q(x∗) = 0. The system in Eq. (6) can then be rewritten as x^{t+1} = x^t + Q(x^t). Let the residue function be r = −Q(x), and consider the traditional control Lyapunov function candidate V(r) := ⟨r, r⟩. By the direct Lyapunov method, V(r) is positive definite, and the change of V along the trajectories of the system in Eq. (6) is

ΔV = V(r^{t+1}) − V(r^t) = ⟨r^t − ∂_x Q(x^t) r^t, r^t − ∂_x Q(x^t) r^t⟩ − ⟨r^t, r^t⟩.

Since Q = λM(x) + U = λM(x) + αD + βG, we have ΔV < 0 if and only if α, β, λ satisfy Eq. (12). Hence ΔV is negative definite and the system in Eq. (6) is globally asymptotically stable. □

4. Blind deconvolution using SFCS

Having proposed a stable update strategy for the latent image, we now discuss how to apply it to blind image deconvolution. In this section, we first introduce the energy used for kernel estimation, then give the iterations of x and k in discrete form. Fig. 1 shows the whole blind image deconvolution process. The final image estimation uses a recent non-blind image deblurring method with our estimated kernel.

4.1. Kernel estimation

The blur kernel k plays an important role in blind image deblurring and is decided by the path of the camera shake during exposure. The kernel estimation energy should therefore be used to control the propagation of x in Eq. (6). In this paper, we consider the ℓ2-regularized fidelity energy with parameter λ, i.e.,

L(k; x) = (λ/2) ∥k ⊗ x − y∥₂² + ∥k∥₂².  (13)

Minimizing Eq. (13) w.r.t. k estimates the blur kernel. Notice that we restrict the kernel to the simplex K = {k | ∥k∥₁ = 1, k ⪰ 0}, where k ⪰ 0 means that all elements of k are greater than or equal to zero.

4.2. Blind deconvolution

For blind deconvolution, a multi-scale strategy is adopted in most state-of-the-art methods (Cho & Lee, 2009; Krishnan, Tay, & Fergus, 2011; Levin, Weiss, Durand, & Freeman, 2011; Pan et al., 2014b; Sun, Cho, Wang, & Hays, 2013; Xu & Jia, 2010; Zuo et al., 2016) to make the result more robust and to prevent bad local optima. In this work, to further enhance the efficiency of our system, we also incorporate the multi-scale scheme, with scales (a^M, . . . , a^m, . . . , a^0, 0 < a < 1) and inner iterations (0, . . . , t, . . . , T), into our kernel estimation pipeline. At each scale, we solve Eq. (13) with a discretized strategy. Obviously, L(k; x) has a closed-form solution for k, but it should not be ignored that the blur kernel k is very sparse: tiny noise in the latent sharp image will cause the kernel estimate to oscillate. We therefore adopt gradient descent optimization to avoid it. Combined with the carefully designed latent image update, we update (x, k) by the following alternating iteration scheme:

x^{t+1} = x^t − λM(x^t) − U,
k^{t+1} = P_K( k^t − s_k ∂_k L(x^{t+1}, k^t) ),  (14)

where s_k is the step size, ∂_k denotes the partial derivative operator, and P_K is the projection onto the simplex K. The iteration of x in SFCS is equal to a gradient descent step with unit step size if we regard λM + U as a descent direction; this builds the relationship between the dynamical system and a general iteration. The multi-scale blur kernel estimation process is summarized in Algorithm 1. upsample and downsample denote up-sampling and down-sampling by the proportion a^m. For example, if a = 1/2 and M = 2, the sampling scales are 1/4, 1/2 and 1 times the original size of the image and kernel, respectively; upsample(x_m, 1/a) up-samples x_m to twice its current size, and the others are similar. As each element in Eq. (14) can be efficiently estimated, one may also adopt a conjugate gradient scheme to further speed up the iterations. Fig. 4 gives an example of our algorithm compared with an adaptive method (Zuo et al., 2016).

Algorithm 1 Multi-scale Kernel Estimation via SFCS
Require: The observation y and necessary parameters.
Ensure: The estimated kernel k and latent image x.
1: Initialization: y_M = downsample(y, a^M), x_M = y_M, k_M = k_δ;
2: for m = M, . . . , 0 do
3:   Initialization: x^0 = x_m, k^0 = k_m;
4:   for t = 0, . . . , T do
5:     x^{t+1} = x^t − λM(x^t) − U;
6:     k^{t+1} = P_K( k^t − s_k ∂_k L(x^{t+1}, k^t) );
7:   end for
8:   x_m = x^T, k_m = k^T;
9:   x_{m−1} = upsample(x_m, 1/a);
10:  k_{m−1} = upsample(k_m, 1/a);
11:  y_{m−1} = downsample(y, a^{m−1});
12: end for

5. Experiments

We first compare our dynamical feedback control system with both analysis-based and learning-based non-blind image restoration approaches to verify our image propagation methodology. Then we run our method together with several state-of-the-art deblurring approaches on two benchmark datasets, three kinds of special images and natural blurry images to evaluate the efficiency of the proposed blind deconvolution pipeline.
All experiments are conducted on a PC with an Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10 GHz, 128 GB RAM and an NVIDIA GeForce GTX TITAN Black GPU. All results of the compared methods are provided by the authors or generated using their code with default parameter settings. The non-blind stage uses the method of Daniel and Yair (2011) to perform the final deconvolution for all methods unless otherwise mentioned. For quantitative comparison, we follow recent works and report the average peak signal-to-noise ratio (PSNR), structural similarity (SSIM), error ratio (ER) and kernel similarity (KS) (Hu & Yang, 2012; Levin et al., 2009) on these images. The training data (blurred images) are synthesized from 40 images (180 × 180) in BSDS (Martin, Fowlkes, Tal, & Malik, 2001) by filtering with a certain kernel (27 × 27) and adding Gaussian noise with standard deviation σ = 0.01.

5.1. Verifying the image propagation feedback control system

We adopt non-blind deconvolution as an example to verify the effectiveness of the feedback control system on image propagation. Here we conduct (non-blind) image restoration experiments

S. Cheng et al. / Neural Networks 101 (2018) 101–112


Fig. 4. Examples of blind image deconvolution results: (a) the blurry image, (b) the result of Zuo et al.'s method and (c) our corresponding deblurring result. The letters on the ground are easy to distinguish and the distant billboard becomes clear.

Table 1
Quantitative results for non-blind image restoration.

                 Method    PSNR   SSIM    TIME (s)
Analysis-based   FTVD      26.16  0.7103   2.0105
                 Hyper-L   26.40  0.7727   0.2912
                 CA        27.98  0.7942  47.0108
                 N-SFCS    26.50  0.7268   0.2003
Learning-based   CSF       27.31  0.7314   0.1984
                 MLP       28.05  0.7804   1.1596
                 SFCS      28.13  0.7917   0.1969

on a set of 68 images of size 180 × 180 (Chen, Yu, & Pock, 2015), sampled from the BSDS database (Martin et al., 2001), to investigate the methodology of our image propagation system. We evaluate the non-blind deconvolution results against comparative methods: analysis-based models, i.e., fast total variation deconvolution (FTVD) (Yang, Zhang, & Yin, 2009), hyper-Laplacian priors (Hyper-L) (Krishnan & Fergus, 2009) and coded aperture (CA) (Levin, Fergus, Durand, & Freeman, 2007), and learning-based methods, i.e., cascade of shrinkage fields (CSF) (Schmidt & Roth, 2014) and multi-layer perceptron (MLP) (Schuler, Christopher Burger, Harmeling, & Scholkopf, 2013). As for our pipeline, we first specify fixed components in Eq. (6) as a naive baseline (denoted as N-SFCS); that is, we utilize the observation, the gradient filter and the soft-shrinkage operator as g, f and ϕ in Eq. (6), respectively. In contrast, the components in SFCS are trained on the 40 blurred images generated from BSDS. Table 1 shows that the learning-based algorithms (i.e., CSF, MLP and SFCS) almost always achieve better performance than the pre-designed methods (i.e., FTVD, Hyper-L and N-SFCS) in PSNR, SSIM and running time (in seconds, ``TIME (s)'' for short). SFCS has the best performance in both PSNR and TIME, although it is slightly less effective than CA on SSIM. Among the analysis-based approaches, CA scores better on PSNR and SSIM but is time-consuming. Among the learning-based methods, SFCS is better than CSF and MLP on all indicators. These results verify that our image propagation methodology is effective.

5.2. Evaluation on benchmark image sets

5.2.1. Levin et al.'s image set

The first group of blind image deconvolution experiments is conducted on the most commonly used Levin et al.'s image set (Levin et al., 2009), which includes 32 blurred images (generated by 4 clear images of size 255 × 255 and 8 blur kernels with sizes varying from 13 × 13 to 27 × 27). We compare our proposed method with 8 state-of-the-art approaches: Cho and Lee (2009), Fergus, Singh, Hertzmann, Roweis, and Freeman (2006), Krishnan et al. (2011), Levin et al. (2011), Perrone and Favaro (2014), Sun et al. (2013), Xu and Jia (2010) and Zhang, Wipf, and Zhang (2013). As shown in Fig. 5(a), our system achieves the highest average PSNR among all the compared methods. Krishnan et al.'s method

Table 2
Quantitative comparisons on Levin et al.'s image set (lower error ratio is better).

Method            PSNR   SSIM  ER    TIME (s)
Non-blind         32.34  0.93  1.00  –
Cho and Lee       27.63  0.85  1.46  0.74
Fergus et al.     27.65  0.84  2.22  43.45
Krishnan et al.   24.87  0.74  2.05  8.26
Levin et al.      29.03  0.89  1.40  41.77
Xu et al.         26.71  0.83  1.68  1.00
Sun et al.        29.71  0.90  1.32  209.47
Perrone et al.    27.43  0.86  1.66  113.70
Zhang et al.      28.01  0.86  1.51  37.45
Ours              30.30  0.91  1.24  5.68
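For reference, the PSNR and error-ratio metrics used throughout these comparisons can be computed as in this sketch (the function names and the [0, 1] intensity range are our assumptions; ER follows the definition of Levin et al. (2009)).

```python
import numpy as np

def psnr(x, x_ref, peak=1.0):
    """Peak signal-to-noise ratio (dB) for images with intensities in [0, peak]."""
    mse = np.mean((np.asarray(x, float) - np.asarray(x_ref, float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

def error_ratio(deconv_est_k, deconv_true_k, x_ref):
    """Error ratio: SSD of the result deconvolved with the estimated kernel
    over SSD of the result deconvolved with the ground-truth kernel.
    A ratio below roughly 2 is usually considered visually plausible."""
    ssd = lambda a, b: float(np.sum((np.asarray(a, float) - np.asarray(b, float)) ** 2))
    return ssd(deconv_est_k, x_ref) / ssd(deconv_true_k, x_ref)
```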

is worst since the energy function relating clear and blurred images is controlled by the normalized sparsity prior, which cannot guide strong sharp edges. Perrone et al.'s method, with a simple ℓ2,1 norm, also struggles to exactly recover the latent image structure for kernel estimation, leading to poor deconvolution results. Sun et al.'s method, which refines edges to obtain strong gradients based on patch priors, performs better. The SSIM performance is shown in Fig. 5(b), where our proposed system again achieves the highest average value among all the compared methods, especially on ``im4''. Analyzing the SSIM of each latent image, we find that our method succeeds on images with large blur kernels. The comparison of ER, which evaluates the deconvolution results obtained with the estimated and correct kernels, in Fig. 5(c) shows that our method is superior: almost 90% of the latent images have an ER below 1.5 in our results. Table 2 shows the average results on the dataset; the conclusions from ER are consistent with PSNR and SSIM, so our quantitative results are the best, followed by Sun et al.'s. We also report running time (TIME (s)) in Table 2. The proposed method is not the fastest, but it produces better deblurring results than the faster methods (Cho & Lee, 2009; Xu & Jia, 2010), and it has an absolute advantage in running time when compared with methods that achieve comparable restoration quality.

5.2.2. Sun et al.'s image set

We then evaluate our method on the more challenging Sun et al.'s image set (Sun et al., 2013). This database contains 640 blurred images (sizes vary from 620 × 1024 to 928 × 1024),



Fig. 5. Quantitative results on Levin et al.'s image set: (a) PSNR, (b) SSIM and (c) ER. The left 4 groups of bars in (a) or (b) denote the average score (PSNR or SSIM) of each image over 8 different kernels, and the rightmost group of bars denotes the average score over the whole 32 images. In (c), ER is plotted as error ratio vs. success rate, where the success rate is the proportion of images below a certain error ratio (higher curves indicate more accurate results).

Table 3
Quantitative comparisons on Sun et al.'s image set (lower error ratio is better).

Method            PSNR   SSIM  ER    TIME (s)
Non-blind         32.46  0.88  1.00  –
Cho and Lee       26.19  0.81  3.82  8.04
Fergus et al.     24.83  0.70  3.81  360.16
Krishnan et al.   22.92  0.75  5.01  46.48
Levin et al.      24.90  0.80  3.28  602.01
Xu et al.         28.27  0.83  2.41  20.60
Sun et al.        29.48  0.85  1.73  3457.56
Perrone et al.    27.92  0.80  1.94  949.24
Zhang et al.      28.08  0.79  1.84  1008.68
Ours              29.53  0.83  1.48  13.28

which are synthesized from 80 high-quality natural images and 8 blur kernels (the same as those in Levin et al.'s image set). To simulate sensor noise, 1% Gaussian noise is also added to these blurred images. Table 3 lists the four quantitative metrics of all comparative methods (Cho & Lee, 2009; Fergus et al., 2006; Krishnan et al., 2011; Levin et al., 2011; Perrone & Favaro, 2014; Sun et al., 2013; Xu & Jia, 2010; Zhang et al., 2013); the bold entries represent the best result for each metric. The proposed method achieves better results than almost all the competing methods. More concretely, its average PSNR and ER are the best among all methods, especially ER: the average ER of our method is 1.48, below the threshold value of 2 that determines whether an image is visually plausible. For running time, although the proposed method is slightly slower than Cho and Lee (2009), it performs better on all other quantitative metrics. It is particularly worth mentioning that we use exactly the same parameters as on Levin et al.'s dataset, without any fine-tuning, and the results still outperform the other methods. Sun et al.'s method has comparable performance owing to its patch-based priors, which can better

model large-scale structures and provide a better way to extract salient edges for kernel estimation. In summary, why does our approach achieve favorable deconvolution results? Not only because our system can learn a more appropriate norm constraint as the image prior in the controller, but also because we use the ℓ2 regularization as a fidelity energy over the latent image, blur kernel and blurred image in the feedback loop to propel the evolutionary process of deblurring.

5.3. Evaluation on special types of images

Facial, text and low-illumination images are the most commonly used images in recognition and classification tasks. However, due to the weak contours and edge structure of human faces, texts and low-illumination scenes, it is challenging to apply general blind deconvolution methods to their deblurring problems. Many efforts have therefore been made to develop new priors that formulate the latent structures of these special types of images (Hu, Cho, Wang, & Yang, 2014; Pan, Hu, Su, & Yang, 2014a; Pan et al., 2014b).

5.3.1. Facial images

For facial images, we extract the most proximate edges using the similarity measure of Pan et al. (2014a) to obtain the guidance, and only re-train the convolutional filter f using 20 facial images from the CMU PIE dataset (Gross, Matthews, Cohn, Kanade, & Baker, 2010). We then test on example facial images and compare with generic deblurring methods (Cho & Lee, 2009; Sun et al., 2013; Xu & Jia, 2010) (Cho and Lee, Xu et al., and Sun et al.: the fastest and the top two in PSNR on Sun et al.'s image set) and a specialized method (Pan et al., 2014a). Fig. 6 shows a deblurring example. The generic methods cannot restore the facial features well since there are no obvious edges in blurred facial images, so that the generic


Fig. 6. Qualitative and quantitative comparisons on facial images. The estimated kernel is plotted on the top right of each image. The numbers in the brackets following the deconvolution methods represent KS scores (higher is better) of each method.
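The KS (kernel similarity) score reported in these figures follows Hu and Yang (2012): the maximum normalized cross-correlation between the estimated and ground-truth kernels over all spatial shifts, which discounts the translation ambiguity inherent in blind deconvolution. A minimal numpy-only sketch (the exact normalization details are our assumption):

```python
import numpy as np

def kernel_similarity(k_est, k_gt):
    """Maximum normalized cross-correlation between two blur kernels.

    Sliding k_gt over a zero-padded k_est realizes full 2-D cross-correlation,
    so the score is invariant to a relative shift between the kernels.
    """
    h, w = k_gt.shape
    padded = np.pad(k_est, ((h - 1, h - 1), (w - 1, w - 1)))
    best = 0.0
    for i in range(padded.shape[0] - h + 1):
        for j in range(padded.shape[1] - w + 1):
            best = max(best, float(np.sum(padded[i:i + h, j:j + w] * k_gt)))
    return best / (np.linalg.norm(k_est) * np.linalg.norm(k_gt))
```

By Cauchy–Schwarz the score lies in [0, 1], and it equals 1 exactly when the two kernels coincide up to a shift.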

methods cannot capture enough gradient information to improve kernel estimation. Although the specialized method uses exemplars to exploit facial structures, it does not work for this little boy image and cannot guide a better kernel estimation, leading to a poor facial image restoration. Thanks to our strategy in the guidance and the targeted training on facial images, our method outperforms the other methods, especially in restoring the mouth and eyes.

5.3.2. Text images

In text images, text characters and background regions usually have nearly uniform intensity values in the clean image. Based on this prior, Pan et al. (2014b) propose the ℓ0 norm as an image prior to obtain effective deblurring. We also use ℓ0 regularization as the adaptive cross-layer guidance g to test on text images. As shown in Fig. 7, both qualitative and quantitative comparisons with generic methods (Cho & Lee, 2009; Sun et al., 2013; Xu & Jia, 2010) (Cho and Lee, Xu et al., and Sun et al.) and a specialized method (Pan et al., 2014b) confirm that our method can restore text images. Similar to the results on facial images, both the specialized approach and the generic methods produce poor results: the text content is still blurred and the kernel estimation is inexact. The specialized method uses a sparse prior as a constraint, but its ℓ0 regularization is so strong that kernel estimation fails on this cluttered text image. Our method not only has the highest KS score, verifying that it is better than the other comparative methods, but also produces a readable and visible text result.
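The ℓ0-regularized guidance discussed above has a simple proximal interpretation: the proximal operator of λ∥·∥0 is hard thresholding, which keeps only entries whose magnitude exceeds √(2λ). A minimal sketch of the operator itself (this illustrates the mechanism only, not the paper's full guidance pipeline; the boundary convention at |g| = √(2λ) is our choice):

```python
import numpy as np

def hard_threshold(g, lam):
    """Proximal operator of the l0 penalty lam * ||g||_0 under a quadratic
    data term: zero every entry with |g| <= sqrt(2 * lam), keep the rest."""
    out = g.copy()
    out[np.abs(g) <= np.sqrt(2.0 * lam)] = 0.0
    return out
```

Unlike soft shrinkage, hard thresholding leaves surviving entries unshrunk, which is why an ℓ0 prior preserves the sharp intensity jumps of text strokes but, as noted above, can be too aggressive when the scene is cluttered.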

5.3.3. Low-illumination images

Low-illumination images are often blurry due to the long exposure time in night scenes. Deblurring low-illumination images is usually tricky because of the lack of salient features. We choose different ℓp regularizations in different iterations to meet the requirements of the guidance g, and use the same convolution filters f and sparse measure ϕ as in the Levin et al.'s image set experiments, which were trained on natural images. As shown in Fig. 8, we compare with four approaches (Cho & Lee, 2009; Hu et al., 2014; Sun et al., 2013; Xu & Jia, 2010). The approaches of Sun et al. (2013) and Xu and Jia (2010) have almost no effect: the license number is still blurred and difficult to distinguish. Cho and Lee (2009)'s result removes some blur but the number remains unclear. Hu et al. (2014)'s method is specialized for low-light image deconvolution and performs better than the generic methods. Our deconvolution result is the best: not only is the license number clear and readable, but the blurred illumination is also restored (as shown in the green rectangles).

5.4. Evaluation on natural blurred images

To demonstrate the flexibility of the proposed method, we test our system on natural blurred images, and compare with the fastest method (Cho & Lee, 2009) and the top three competing methods in terms of PSNR on both Levin et al.'s and Sun et al.'s image sets, i.e., Cho and Lee (2009), Levin et al. (2011), Sun et al. (2013)


Fig. 7. Qualitative and quantitative comparisons on text image. The estimated kernel is plotted on the bottom right of each image. The numbers in the brackets following the deconvolution methods represent KS scores of each method.

Fig. 8. Qualitative comparisons on a low-illumination image. The blurred photograph is a natural image, so we cannot provide quantitative values for comparison. Visual inspection shows that our result is better.

and Xu and Jia (2010). Figs. 9 and 10 show the deconvolution results on two different real blurred photographs. Our method performs better on both the near wall image and the distant Roma scene. Concretely, for the wall image (Fig. 9), our proposed method recovers the numbers and letters so that they are readable, whereas the

other methods leave the image blurred and indistinct. Fig. 10 shows the results for the Roma scene. Sun et al.'s method and our approach both obtain satisfactory deconvolution results; however, ringing artifacts appear in Sun et al.'s result on close inspection.
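These natural photographs come with no ground truth, so no quantitative scores can be reported; on the synthetic benchmarks above, by contrast, each blurred observation is generated from the convolutional formation model y = k ⊗ x + n. A minimal sketch of that synthesis (FFT convolution with a circular boundary is our simplifying assumption, and σ = 0.01 matches the 1% Gaussian noise used for the benchmarks):

```python
import numpy as np

def blur(x, k, sigma=0.01, seed=0):
    """Synthesize a blurred observation y = k (*) x + n.

    The kernel is zero-padded to the image size and rolled so its center
    sits at the origin, then convolved via the FFT (circular boundary).
    """
    kh, kw = k.shape
    kp = np.zeros_like(x)
    kp[:kh, :kw] = k
    kp = np.roll(kp, (-(kh // 2), -(kw // 2)), axis=(0, 1))  # center the kernel
    y = np.real(np.fft.ifft2(np.fft.fft2(x) * np.fft.fft2(kp)))
    return y + sigma * np.random.RandomState(seed).randn(*x.shape)
```

With a delta kernel and σ = 0 the observation reduces to the sharp image itself, which is a convenient sanity check on the padding and roll.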


Fig. 9. Deblurred results on a near-range natural blurred image.

Fig. 10. Deblurred results on a distant natural blurred image.


6. Conclusion

This paper proposed a flexible feedback control system for kernel estimation driven by image propagation. By estimating the convolutional filter, cross-layer guidance and layer-wise sparse measure from cues and priors on the training data, we can successfully control the propagation of the latent sharp image under the guidance of the stability analysis. As a nontrivial byproduct, we also verified the effectiveness of SFCS for image propagation in general, which suggests that other vision tasks could be addressed by extending our method with different data-driven components and priors.

Limitations: The feedback in our system uses an ℓp norm as the constraint of the guidance. Different kinds of images have different intensity distributions, so choosing a proper ℓp norm is not obvious. We only gave examples on facial, text and low-illumination images, and must re-train for other kinds of images that we intend to deblur. In this paper, we provided a framework and presented only one detailed scheme that uses a control system to simulate the propagation of the image; other designs and training schemes can be merged into our control system to improve the performance of kernel estimation and blind image deconvolution.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Nos. 61672125, 61733002, 61572096, 61432003 and 61632019), the Fundamental Research Funds for the Central Universities (DUT2017TB02) and the Hong Kong Scholar Program (No. XJ2015008). Dr. Liu is also a visiting researcher with Shenzhen Key Laboratory of Media Security, Shenzhen University, Shenzhen 518060, China.

References

Bhaya, A., & Kaszkurewicz, E. (2004). Iterative methods as dynamical systems with feedback control. In IEEE conference on decision and control, Vol. 3 (pp. 2374–2380).
Chan, T. F., & Wong, C.-K. (1998). Total variation blind deconvolution. IEEE TIP, 7(3), 370–375.
Chen, Y., Yu, W., & Pock, T. (2015).
On learning optimized reaction diffusion processes for effective image restoration. In CVPR (pp. 5261–5269).
Cho, S., & Lee, S. (2009). Fast motion deblurring. ACM ToG, 28(5), 145.
Daniel, Z., & Yair, W. (2011). From learning models of natural image patches to whole image restoration. In ICCV (pp. 479–486).
Fergus, R., Singh, B., Hertzmann, A., Roweis, S., & Freeman, W. (2006). Removing camera shake from a single photograph. ACM ToG, 25(3), 787–794.
Gong, D., Tan, M., Zhang, Y., van den Hengel, A., & Shi, Q. (2016). Blind image deconvolution by automatic gradient activation. In CVPR (pp. 1827–1836).
Gross, R., Matthews, I., Cohn, J., Kanade, T., & Baker, S. (2010). Multi-PIE. Image and Vision Computing, 28(5), 807–813.
Gu, B., Sheng, V. S., Tay, K. Y., Romano, W., & Li, S. (2015). Incremental support vector learning for ordinal regression. TNNLS, 26(7), 1403–1416.

Gu, B., Sun, X., & Sheng, V. S. (2017). Structural minimax probability machine. TNNLS.
Han, Z. X., Yan, C. H., & Zhang, Z. (2010). Study on robust control system of boiler steam temperature and analysis on its stability. Proceedings of the CSEE, 30(8), 101–109.
Hu, Z., Cho, S., Wang, J., & Yang, M.-H. (2014). Deblurring low-light images with light streaks. In CVPR (pp. 3382–3389).
Hu, Z., & Yang, M.-H. (2012). Good regions to deblur. In ECCV (pp. 59–72).
Krishnan, D., & Fergus, R. (2009). Fast image deconvolution using hyper-Laplacian priors. In NIPS (pp. 1033–1041).
Krishnan, D., Tay, T., & Fergus, R. (2011). Blind deconvolution using a normalized sparsity measure. In CVPR (pp. 233–240).
Levin, A., Fergus, R., Durand, F., & Freeman, W. T. (2007). Image and depth from a conventional camera with a coded aperture. ACM ToG, 26(3), 70.
Levin, A., Weiss, Y., Durand, F., & Freeman, W. T. (2009). Understanding and evaluating blind deconvolution algorithms. In CVPR (pp. 1964–1971).
Levin, A., Weiss, Y., Durand, F., & Freeman, W. T. (2011). Efficient marginal likelihood optimization in blind deconvolution. In CVPR (pp. 2657–2664).
Liu, D. C., & Nocedal, J. (1989). On the limited memory BFGS method for large scale optimization. Mathematical Programming, 45(1–3), 503–528.
Martin, D., Fowlkes, C., Tal, D., & Malik, J. (2001). A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In ICCV, Vol. 2 (pp. 416–423).
Nah, S., Kim, T. H., & Lee, K. M. (2017). Deep multi-scale convolutional neural network for dynamic scene deblurring. In CVPR.
Pan, J., Hu, Z., Su, Z., & Yang, M.-H. (2014a). Deblurring face images with exemplars. In ECCV (pp. 47–62).
Pan, J., Hu, Z., Su, Z., & Yang, M.-H. (2014b). Deblurring text images via l0-regularized intensity and gradient prior. In CVPR (pp. 2901–2908).
Perrone, D., & Favaro, P. (2014). Total variation blind deconvolution: The devil is in the details. In CVPR (pp. 2909–2916).
Schmidt, U., & Roth, S. (2014). Shrinkage fields for effective image restoration. In CVPR (pp. 2774–2781).
Schuler, C. J., Christopher Burger, H., Harmeling, S., & Scholkopf, B. (2013). A machine learning approach for non-blind image deconvolution. In CVPR (pp. 1067–1074).
Shan, Q., Jia, J., & Agarwala, A. (2008). High-quality motion deblurring from a single image. ACM ToG, 27(3), 73.
Sun, L., Cho, S., Wang, J., & Hays, J. (2013). Edge-based blur kernel estimation using patch priors. In ICCP (pp. 1–8).
SUN Guo-fa, T. Y., & Su-zhen, W. (2017). Adaptive neural output feedback control for strict feedback nonlinear system. Control Theory & Applications.
Turner, J. D., Chun, H. M., & Juang, J. N. (2015). An analytic solution for the state trajectories of a feedback control system. Journal of Guidance, Control, and Dynamics, 8(1), 147–148.
Wipf, D., & Zhang, H. (2014). Revisiting Bayesian blind deconvolution. Journal of Machine Learning Research, 15(1), 3595–3634.
Xu, L., & Jia, J. (2010). Two-phase kernel estimation for robust motion deblurring. In ECCV (pp. 157–170).
Xu, L., Zheng, S., & Jia, J. (2013). Unnatural l0 sparse representation for natural image deblurring. In CVPR (pp. 1107–1114).
Yang, J., Zhang, Y., & Yin, W. (2009). An efficient TVL1 algorithm for deblurring multichannel images corrupted by impulsive noise. SIAM Journal on Scientific Computing, 31(4), 2842–2865.
Yousaf, S., & Qin, S. (2015). Closed-loop restoration approach to blurry images based on machine learning and feedback optimization. IEEE TIP, 24(12), 5928–5941.
Zhang, H., Wipf, D., & Zhang, Y. (2013). Multi-image blind deblurring using a coupled adaptive sparse prior. In CVPR (pp. 1051–1058).
Zuo, W., Ren, D., Zhang, D., Gu, S., & Zhang, L. (2016). Learning iteration-wise generalized shrinkage–thresholding operators for blind deconvolution. IEEE TIP, 25(4), 1751–1764.