Boosting dark channel dehazing via weighted local constant assumption


Journal Pre-proof

Zhu Mingzhu, He Bingwei, Liu Jiantao, Yu Junzhi

PII: S0165-1684(19)30504-3
DOI: https://doi.org/10.1016/j.sigpro.2019.107453
Reference: SIGPRO 107453

To appear in: Signal Processing

Received date: 30 August 2019
Revised date: 30 December 2019
Accepted date: 31 December 2019

Please cite this article as: Zhu Mingzhu, He Bingwei, Liu Jiantao, Yu Junzhi, Boosting Dark Channel Dehazing via Weighted Local Constant Assumption, Signal Processing (2019), doi: https://doi.org/10.1016/j.sigpro.2019.107453

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain. © 2019 Published by Elsevier B.V.

Highlights

• Deep insight into the limitation of the dark channel prior
• Refreshingly concise algorithm containing only 5 straightforward steps
• Discussion and solution of the lower-bound constraint
• Outstanding quality and speed on the D-HAZY and Fattal datasets
• Results can be further tuned by hand-drawn scribbles


Boosting Dark Channel Dehazing via Weighted Local Constant Assumption

Zhu Mingzhu (a), He Bingwei (b,*), Liu Jiantao (b), Yu Junzhi (a)

(a) State Key Laboratory for Turbulence and Complex Systems, Department of Mechanics and Engineering Science, BIC-ESAT, College of Engineering, Peking University, China
(b) Fujian Provincial Collaborative Innovation Center for High-End Equipment Manufacturing, Fuzhou University, China

Abstract

In dark channel based methods, the local constant assumption is widely used to make the algorithms invertible. It inevitably introduces outliers, since the assumption cannot both perfectly avoid depth discontinuities and cover enough pixels. Unfortunately, because of a limitation of the dark channel prior, which only confirms the existence of dark things but does not specify their locations or likelihoods, no fidelity measurement is available during refinement. Therefore, the outliers are either under-corrected or over-corrected. In this paper, we go deeper than the dark channel theory to overcome this problem. We split the concept of the dark channel into the dark pixel and the local constant assumption, and then propose a novel weight map. With this effort, our method shows improvements in quality, robustness and speed. The theory is even simpler and more intuitive than the original, leading to a refreshingly concise algorithm. Finally, we show that the results can be further improved by scribbles, which indicate better dark pixel locations.

Keywords: dehaze, dark channel, local constant assumption, transmission, weight map

* Corresponding author. E-mail: [email protected]

Preprint submitted to Signal Processing. January 15, 2020.

1. Introduction

Hazy scenes widely exist in outdoor environments. The observed images suffer from low contrast and visibility. Since most computer vision applications assume the input image is the scene radiance, haze removal is highly desired, whether by enhancement [1, 2, 3] or restoration [4, 5, 6].

Haze removal is an under-constrained problem and thus requires additional conditions. Various methods based on multiple inputs [7, 8, 9], manually designed priors [4, 5] or learned priors [10, 11] have been proposed. Among them, the dark channel prior [4] is widely recognized and popular in various fields [12, 13].

However, haze removal methods based on this prior usually exhibit obvious micro-contrast loss or halo-effect [4, 14, 15], which comes from depth-irrelevant details or over-smoothed boundaries in their transmission estimates. In this paper, we discuss the reason for and the solution to this common problem. We argue that the reason lies in the local constant assumption on transmissions.

The assumption is employed by almost all dark channel based methods. It suits the dark channel prior well because both are defined on local regions. However, it is false at depth discontinuities, which introduces outliers into the initial transmission estimates. Various refinement algorithms have been proposed, but the outliers are not detected and corrected; instead, they are treated as inliers with noise, leading to various defects. This problem is solved in this paper with a novel weight map. It is deduced from a deeper insight into the dark channel prior, and it is efficient at detecting outliers of the local constant assumption.

Section 2 describes the problem in detail and compares dark channel based methods with the state of the art, concluding that the local constant assumption is the major cause. Section 3 reinterprets the dark channel theory and then introduces a novel weight map, with which WDC (the weighted dark channel method) is proposed. Section 4 handles the transmission lower bound constraint of WDC. Section 5 conducts experiments on two popular datasets and various samples. Section 6 shows the flexibility and robustness of WDC, whose results can be refined based on scribbles. Section 7 gives the conclusion.


2. Related Works

2.1. The imaging model and dark channel prior

The model widely used to describe a hazy image is [16]

I(x) = t(x) J(x) + [1 − t(x)] A,   (1)

where x is the pixel coordinate, I is the observed image, J is the scene radiance, and A is the global atmospheric color, which can be estimated by selecting particular pixels [4] or by statistical approaches [17, 18]. In this model, I is explained as a fusion of J and A, with the ratio controlled by the transmission map t. The recovery of the scene radiance is under-constrained. The dark channel prior [4] is widely recognized as an effective constraint for this problem. It is defined as

min_c ( min_{y∈Ω(x)} J^c(y) ) = 0,   (2)

where c is the color channel and Ω(x) is the mask centered at x.

2.2. The common problem of dark channel based methods

Although the dark channel prior describes haze-free images well, it does not provide sufficient constraints. The haze removal problem is still under-constrained, so secondary assumptions are required. Various defects

arise along with these assumptions. He et al. [4] make the secondary assumption that transmissions are constant in each patch Ω. Therefore, t(y) = t̃(x) for y ∈ Ω(x), where t̃(x) is the transmission of Ω(x). Combining this assumption with the prior, transmissions are initially estimated by dilating the white-balanced input, as

t̃(x) = 1 − min_c ( min_{y∈Ω(x)} I^c(y)/A^c ).   (3)

This introduces block effects. As shown in Fig. 1c, transmissions in the vicinity of depth discontinuities are over-estimated. Laplacian matting [19] is then employed to remedy this problem. However, this algorithm makes a local linear assumption between transmissions and hazy colors. The refined estimates mix depth discontinuities and color textures, as shown in Fig. 1d, leading to micro-contrast loss [5, 20, 21].

As an improvement, Meng et al. [14] apply image opening instead of dilation in the initial transmission estimation, as (supposing the radiance bounds are 0 and 1)

t̃(x) = 1 − max_{y∈Ω(x)} ( min_{c, z∈Ω(y)} I^c(z)/A^c ).   (4)

Eq. (4) can be understood as applying the dark channel prior and the local constant assumption on space-variant masks, which abuses the prior and leads to wrong values in some cases, as circled in Fig. 1e. The initial estimates are refined by minimizing a cost function based on a piecewise smoothness assumption, which is more reasonable than the local linear assumption [20]. However, the optimal solution is searched by an initialization-sensitive algorithm: edges not included in the initial estimates will never appear, and wrong edges can only be smoothed. Consequently, the final estimates usually contain ill-defined or over-smoothed boundaries, as shown in Fig. 1f.

As another improvement, Wang et al. [15] make the local constant assumption on super-pixels. This relies on the quality of the super-pixel segmentation: block effects appear if the super-pixels are too large to avoid depth discontinuities, while the prior is undermined if the super-pixels are too small to cover white objects. The guided filter [22] is employed for robustness, but it causes halo-effect. Furthermore, the guided filter [22] also applies the local linear assumption, introducing depth-irrelevant details.
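The two initializations above are plain grayscale morphology on the normalized minimum channel: Eq. (3) is a local minimum filter (equivalently, a dilation of the lower bound map), while Eq. (4) follows the minimum filter with a maximum filter, i.e. an opening. A minimal sketch, assuming float images in [0, 1]; the function and variable names are ours, not from the paper:

```python
import numpy as np
from scipy.ndimage import minimum_filter, maximum_filter

def initial_transmission(I, A, size=15, opening=False):
    """Initial transmission from the dark channel prior.

    I : float array (H, W, 3), hazy image in [0, 1]
    A : float array (3,), global atmospheric color
    """
    # Per-pixel minimum of I^c / A^c over the color channels.
    m = (I / A).min(axis=2)
    # Eq. (3): local minimum over the patch Omega.
    t = 1.0 - minimum_filter(m, size=size)
    if opening:
        # Eq. (4): follow the min filter with a max filter (an opening).
        t = 1.0 - maximum_filter(1.0 - t, size=size)
    return np.clip(t, 0.0, 1.0)

# Tiny smoke test on a synthetic hazy image with constant t = 0.6.
rng = np.random.default_rng(0)
J = rng.random((32, 32, 3))
A = np.array([0.9, 0.9, 0.9])
I = 0.6 * J + 0.4 * A
t3 = initial_transmission(I, A, size=7)
t4 = initial_transmission(I, A, size=7, opening=True)
```

Since the opening's max filter can only raise the patch minimum, the Eq. (4) estimate is never larger than the Eq. (3) estimate at any pixel.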

As described previously, dark channel based methods share similar problems. Outliers in the transmission estimates are inevitable, since perfect mask shapes for the local constant assumption can hardly be achieved. Many studies pin their hopes on post-refinement. However, algorithms based on the local linear assumption introduce depth-irrelevant details [23, 24, 25], and algorithms based on the piecewise smoothness assumption lead to over-smoothness when a fidelity measurement is absent [26]. Many studies develop edge-preserving filters [27, 20], but end up preserving strong edges rather than depth edges.


2.3. Methods with fidelity measurement

Instead of making insufficient constraints on patches, some recently proposed priors offer pixel-level constraints directly. For example, Fattal [5] fits a color-line in RGB space for each patch. Color-lines are supposed to cross the origin in haze-free images but are shifted in the direction of A by haze, so their transmissions can be estimated from the A-intercept. However, the model is incomplete, because there are also many color-lines that do not cross the origin in haze-free images [28]. Bahat and Irani [17] estimate the transmission map by maximizing internal patch recurrence. Although haze-free images usually have more recurrent patches, surplus textures appear when the number is simply maximized. Berman et al. [6] cluster pixels with the same radiance into haze-lines. On each haze-line, the outermost pixel is assumed to be haze-free, so the transmissions of the other pixels can be deduced. However, the haze-line model fails when the input lacks color, and the haze-free assumption fails when the input lacks haze-free regions.

Despite being flawed in their priors, these methods are almost free from the common problem of dark channel based methods. Outliers in their initial trans-

mission estimates are usually well removed. The major reason is their fidelity measurements. In Fattal [5], the weight of a patch depends on the collinearity of its pixels. In Bahat and Irani [17], the weight of a patch depends on its recurrence. In Berman et al. [6], the weight of a haze-line depends on its effective length. Thanks to these weight maps, pixels not well described by the priors have limited effect.

2.4. Summary

Dark channel based methods have not reached their full potential due to the absence of a fidelity measurement for the local constant assumption: pixels that do not follow the assumption are out of control. In this paper, we propose WDC, which controls the defects of the local constant assumption with a novel and straightforward weight map. The motivation of fully exploiting the power of the dark channel prior is inherited from our previous work [29], which works in a discrete (5-bit) space due to the definition of its label set. WDC is much faster, more concise, and works in continuous space.

3. WDC

According to Eq. (1), t(x) can be expressed as

t(x) = [1 − min_c(I^c(x)/A^c)] / [1 − min_c(J^c(x)/A^c)].   (5)

Since J^c(x)/A^c ranges from 0 to 1, the denominator of Eq. (5) ranges from 0 to 1. Therefore, t(x) must be no smaller than the numerator, named the lower bound b, as

t(x) ≥ b(x) = 1 − min_c(I^c(x)/A^c).   (6)

For pixels z satisfying min_c J^c(z) = 0 (named dark pixels), the denominator of Eq. (5) equals 1, and thus we get

t(z) = 1 − min_c(I^c(z)/A^c) = b(z),   (7)

which means that the transmissions of dark pixels equal their lower bounds.

Following most dark channel based methods, we make the local constant assumption on the local patch Ω(x), and denote its transmission as t̃(x). It means that every pixel y ∈ Ω(x) satisfies t(y) = t̃(x). Since these pixels also need to satisfy the lower bound constraint t(y) ≥ b(y), t̃(x) must be no less than any b(y), as

t̃(x) ≥ max_{y∈Ω(x)} b(y) = 1 − min_{y∈Ω(x)} ( min_c (I^c(y)/A^c) ).   (8)

Denote by Z(x) the set of dark pixels in Ω(x). The dark channel prior indicates that Z(x) is always non-empty. For each dark pixel z ∈ Z(x), we have

t(z) = b(z) = t̃(x) ≥ max_{y∈Ω(x)} b(y),   (9)

which follows from Eq. (7), the local constant assumption and Eq. (8), respectively. Based on Eq. (9), we have b(z) ≥ max_{y∈Ω(x)} b(y). Since pixel z is also in Ω(x), we have max_{y∈Ω(x)} b(y) ≥ b(z). It follows that b(z) = max_{y∈Ω(x)} b(y), and thus

t̃(x) = max_{y∈Ω(x)} b(y).   (10)

The deduction above is shown in Fig. 2 as steps A-B-C, where step B is usually ignored by traditional dark channel based methods. Here, we take step B and re-evaluate the local constant assumption. The conclusion in step B indicates that pixels z satisfying b(z) = t̃(x) are dark pixels. Therefore, we are able to locate dark pixels in the hazy image and calculate their transmissions (Eq. 7).

Given a local patch Ω(x) and a dark pixel z ∈ Z(x), the local constant assumption indicates that t̃(x) = b(z). The fidelity of this assumption could be measured by 1/(I(x) − I(z))², which is a popular choice in piecewise smoothness assumptions. However, we apply 1/(b(x) − b(z))² instead, for two reasons: 1) pixels similar in I are similar in b; 2) b(z) can be replaced by t̃(x) to skip dark pixel localization. Finally, the fidelity of t̃(x) is

W(x) = 1/(t̃(x) − b(x))².   (11)

In practice, we set a lower bound of 10⁻³ for t̃(x) − b(x).
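Steps 1-3 of the method (Eqs. 6, 10 and 11) are a few lines of array code. A sketch under the same float-image assumptions as before (the names are ours):

```python
import numpy as np
from scipy.ndimage import maximum_filter

def wdc_initialization(I, A, size=15, gap_floor=1e-3):
    """Lower bound b, initial transmission t_tilde, and weight map W."""
    # Eq. (6): per-pixel lower bound of the transmission.
    b = 1.0 - (I / A).min(axis=2)
    # Eq. (10): local constant assumption -> dilate b over the patch.
    t_tilde = maximum_filter(b, size=size)
    # Eq. (11): fidelity of t_tilde, with the gap floored at 1e-3.
    gap = np.maximum(t_tilde - b, gap_floor)
    W = 1.0 / gap**2
    return b, t_tilde, W

rng = np.random.default_rng(1)
J = rng.random((24, 24, 3))
A = np.array([0.85, 0.9, 0.95])
I = 0.5 * J + 0.5 * A
b, t_tilde, W = wdc_initialization(I, A, size=5)
```

Because the max filter includes the center pixel, t̃ ≥ b everywhere, and W peaks exactly at the likely dark pixels where the gap hits the floor.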

The initial transmission map t̃ and its weight map W form the data term of our cost function:

E(t) = Σ_x W(x) (t(x) − t̃(x))² + λ Σ_{(x,y)∈N} (t(x) − t(y))² / ||I(x) − I(y)||²,  subject to t ≥ b,   (12)

where (x, y) ∈ N means that x and y are adjacent pixels, and λ controls the degree of smoothness. This is a large-scale constrained QP problem, searching for the optimal t that balances the dark channel prior (the term t̃(x)), the weighted local constant assumption (the term W(x)) and the piecewise smoothness assumption (the smoothness term on the right). Searching for the exact optimal solution is time-consuming, so we ignore the constraint like other methods [5, 6, 20] and solve for t as

t = (W + λL)⁻¹ W t̃,   (13)

where L is the Laplacian matrix of the smoothness term and W is the diagonal matrix whose elements come from W(x); t and t̃ are in vector form.
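Eq. (13) is a single sparse linear solve: L is the graph Laplacian of the 4-connected pixel grid with edge weights 1/||I(x) − I(y)||², and W is diagonal. A small self-contained sketch (our names; the small `eps` guarding the division is our addition, and a real implementation would use a larger image and a faster solver):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def refine_transmission(I, t_tilde, W, lam=0.02, eps=1e-4):
    """Solve (W + lam * L) t = W t_tilde  (Eq. 13)."""
    h, w = t_tilde.shape
    n = h * w
    idx = np.arange(n).reshape(h, w)
    rows, cols, vals = [], [], []
    # 4-connected neighbors; edge weight 1 / (||I(x) - I(y)||^2 + eps).
    for pa, pb in [(idx[:, :-1], idx[:, 1:]), (idx[:-1, :], idx[1:, :])]:
        a, bb = pa.ravel(), pb.ravel()
        diff = I.reshape(n, -1)[a] - I.reshape(n, -1)[bb]
        wgt = 1.0 / ((diff ** 2).sum(axis=1) + eps)
        rows += [a, bb]; cols += [bb, a]; vals += [wgt, wgt]
    Adj = sp.coo_matrix((np.concatenate(vals),
                         (np.concatenate(rows), np.concatenate(cols))),
                        shape=(n, n)).tocsr()
    L = sp.diags(np.asarray(Adj.sum(axis=1)).ravel()) - Adj  # graph Laplacian
    Wd = sp.diags(W.ravel())
    t = spsolve((Wd + lam * L).tocsc(), Wd @ t_tilde.ravel())
    return t.reshape(h, w)

# With huge weights everywhere, the solution must track t_tilde closely.
rng = np.random.default_rng(2)
I = rng.random((8, 8, 3))
t_tilde = rng.random((8, 8))
W = np.full((8, 8), 1e8)
t = refine_transmission(I, t_tilde, W)
```

The system matrix is symmetric positive definite (positive diagonal W plus a scaled Laplacian), so the solve is always well posed.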

Algorithm 1 WDC
Input: I, A. Output: J, t.
1: lower bound b(x) = 1 − min_c (I^c(x)/A^c)
2: initial transmission t̃(x) = max_{y∈Ω(x)} b(y)
3: weight map W(x) = 1/(t̃(x) − b(x))²
4: calculate t by Eq. (13) and J by Eq. (14)

With A and t, the haze removal result is given by

J(x) = (I(x) − A) / [ (max(t(x), b(x)) + ε_t) / (1 + ε_t) ] + A,   (14)

where t is slightly increased by ε_t to suppress noise at far distances and to compensate for the residual errors, which come from the prior and from ignoring the lower bound constraint. WDC is summarized in Alg. 1 and demonstrated in Fig. 3. The intuition of the weight map can be better understood from the following points.

1) Large W appears only on likely dark pixels, which have t(x) ≈ t̃(x). The estimates of these high-fidelity pixels are propagated to the surrounding pixels during refinement. As shown in Fig. 3c, the transmissions of the sky region in the vicinity of the building are over-estimated. However, their fidelities are very low (small values in Fig. 3d), so their final estimates t are propagated from other parts of the sky.

2) Low W(x) does not mean that t(x) should differ from t̃(x); it means that t(x) should refer more to its neighbors.

3) As a benefit of the weight map, our final estimates are robust to mask shapes, as shown in Fig. 4.
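The recovery step (Eq. 14) is elementwise. A sketch, with the small global increment written as `eps_t` (our notation for the increment discussed above):

```python
import numpy as np

def recover_radiance(I, A, t, b, eps_t=0.05):
    """Scene radiance via Eq. (14): J = (I - A) / t' + A,
    with t' = (max(t, b) + eps_t) / (1 + eps_t)."""
    t_eff = (np.maximum(t, b) + eps_t) / (1.0 + eps_t)
    return (I - A) / t_eff[..., None] + A

# Round trip: synthesize haze with a known t, then invert it exactly.
rng = np.random.default_rng(3)
J = rng.random((16, 16, 3))
A = np.array([0.9, 0.88, 0.92])
t_true = np.full((16, 16), 0.7)
I = t_true[..., None] * J + (1 - t_true[..., None]) * A
J_hat = recover_radiance(I, A, t_true, b=np.zeros((16, 16)), eps_t=0.0)
```

With `eps_t = 0` and the true transmission, the recovery inverts Eq. (1) exactly; a positive `eps_t` trades a little residual haze for noise suppression at far distances.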

4. CWDC

WDC ignores the lower bound of Eq. (12), so the optimal solution might be smaller than b, leading to negative values in J. The residuals are compensated by the following tricks, which are also popular in other works [5, 6, 20].

1) Apply t = max(t, b) to prevent negative results.

2) Increase transmissions globally (ε_t of Eq. (14)).

The tricks work well in most cases, but fail in very-low-transmission regions, where the first trick leads to t = b and the second trick cannot find a suitable increment. As shown in Fig. 5b, t = b results in micro-contrast loss. As shown in Fig. 5e, ε_t = 0.1 is too large to remove the haze, yet too small to avoid negative results (the trees lose details).

The exact solution provided by CWDC (constrained WDC) is free from these problems. It is obtained by transforming Eq. (12) into a non-negative QP problem, which optimizes the gap between b and t instead:

E(g) = (1/2) gᵀ Q g + cᵀ g,  subject to g ≥ 0,
where g = t − b,  Q = 2W + 2λL,  c = 2(b − t̃)ᵀ W + 2λ bᵀ L.   (15)

The solution is searched with the method of Fletcher [30]. In our experiments, it takes about 20 seconds per sample, while traditional interior point or gradient projection algorithms take more than 5 minutes.
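The non-negative QP of Eq. (15) can be prototyped with any off-the-shelf bound-constrained solver (the paper uses Fletcher's method [30]; the L-BFGS-B stand-in below, on a tiny dense instance, is only for illustration):

```python
import numpy as np
from scipy.optimize import minimize

# Tiny dense instance of Eq. (15): minimize 0.5 g^T Q g + c^T g, g >= 0.
rng = np.random.default_rng(4)
n = 30
M = rng.standard_normal((n, n))
Q = M @ M.T + n * np.eye(n)          # symmetric positive definite
c = rng.standard_normal(n)

res = minimize(
    fun=lambda g: 0.5 * g @ Q @ g + c @ g,
    x0=np.zeros(n),
    jac=lambda g: Q @ g + c,
    method="L-BFGS-B",
    bounds=[(0.0, None)] * n,        # the non-negativity constraint g >= 0
    options={"ftol": 1e-15, "gtol": 1e-10, "maxiter": 2000},
)
g = res.x                            # then t = b + g satisfies t >= b
```

At the solution, the KKT conditions hold: the gradient vanishes on the free components and is non-negative on the active bound g = 0.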


5. Experiments

End-to-end comparisons are popular [5, 6, 14], where the haze removal results have significantly different appearances. This is beneficial for highlighting good results, but harmful to rigor, since the effects of different processes and defects are mixed. Therefore, we introduce the following features.


1) Limited differences. Processes independent of transmission estimation are unified, including air-light estimation and post-enhancement.

2) Purpose-driven evaluation. Qualitative comparisons are made one by one, each for a specific defect.


Several outstanding methods with available code are selected. Inputs are

scaled so that the maximum of width or height is 640 pixels. For dark channel based methods [4, 14, 29], a round mask Ω is used with radius equals 25 pixels. Air-light estimates are given (if groundtruths are known) or provided by He et al. [4], and post-enhancements are disabled. Eq. (14) is used for transmission compensation with t = 0.05 (it is a rather small value leading to dark

205

and noisy sky. A larger t will be better, but it is a kind of post-enhancement need to be excluded). In Eq. 13, λ equals 0.02. 5.1. Quantitative comparisons The datasets include D-Hazy [31] and the one in Zhu et al. [29], which is an improved version of the dataset in Fattal [5]. Both datasets provide hazy images

210

and groundtruths in high quality. MSE (mean square error), CIEDE2000 [32], SSIM [33] and MS-SSIM [34] are employed as metrics. Moreover, we use several outstanding metrics in Zhang et al. [35], including VIF [36], IW-SSIM [37], RFSIM [38] and FSIM [39]. The results are displayed in the left parts of Table 1 and Table 21 , where

215

the top 3 methods are CWDC, Zhu et al. [29] (our previous work) and WDC. It is interesting that their algorithm precisions are also ranked in this way. WDC neglects the lower bound constraint. Zhu et al. [29] takes the constraint but works in discrete space. CWDC provides the exact solution. Such consistency proves the validity of Eq. 12. Haze-line based air-light estimation [18] and contrast-stretch might be suit-

220

able for Berman et al. [6] considering its non-local feature. Therefore, we also compare WDC and CWDC with Berman et al. [6] under their configurations. The results are displayed in the right parts of Table 1 and Table 2, where the conclusions are the same (all the results are degraded because air-lights esti1

Different from our previous work [29], we use t = 0.05 here, which is closer to He et al. [4]

and Berman et al. [6]. Previous experiments are also updated. The conclusion is the same. https://jiantaoliu.github.io/WDC/

11

225

mates are not groundtruths). 5.2. Qualitative comparisons The samples are listed in Fig. 7. Note that, the exclusion of post-enhancement makes the results less visual pleasant. Furthermore, the unification of air-light estimation and post-enhancement limits the results in a narrow solution space.

230

Therefore, visual pleasuring based evaluation is less effective. An easier evaluation way, which is also more reasonable, is finding defects in transmission maps and then identifying the consequences in haze removal results. The comparisons show that WDC is able to avoid following defects: 1) Depth-irrelevant details/micro-contrast loss. Fig. 8 compares WDC

235

with He et al. [4], which produces obvious depth-irrelevant details in transmission maps as discussed in Section 2. It leads to micro-contrast loss, which is evident in the zoomed regions. As a comparison, WDC well reflects scene depth in transmission maps and reveals more details. 2) Color-haze ambiguity/color bias. The inherent ambiguity between

240

color and haze has more serious consequences to Berman et al. [6] than to WDC, because haze-line model fully discards local informations in initial transmission estimation. As shown in Fig. 9b, the model fails on small white objects, resulting in the chaotic transmission maps and biased results. As a comparison, WDC performs stably.

245

3) Over-smoothness/halo-effect. Fig. 10 compares WDC with Meng et al. [14]. The transmission maps of Meng et al. [14] are over-smoothed due to the reason discussed in Section 2. It results in halo-effect on zigzag depth discontinuities. As a comparison, WDC well reflects depth-discontinuities and avoids halo-effect.

250

4) Vague estimates/haze preservation. Fig. 11 compares WDC with Ren et al. [40], whose transmission maps have low variances and fuzzy edges. As a comparison, WDC provides sharp and clear transmission maps, leading to a more complete haze removal.

12

5.3. End-to-end comparisons Most learning-based methods are end-to-end, thus do not have transmission

255

estimation process, such as Li et al. [41], Zhang and Patel [10] Cai et al. [25], and Qu et. al. [42]. Therefore, the comparisons are made in traditional way, where all the methods are in their default configurations 2 . Quantitative results are displayed in Table 3, where our methods achieve sig260

nificantly better scores. CWDC still covers WDC, proving the validity of Eq. 12 again. Qualitative results are displayed in Fig. 12, which shows the following advantages of WDC (more examples in https://jiantaoliu.github.io/WDC/). 1) Less haze preservation. Li et al. [41] and Cai et al. [25] preserve much haze. Zhang and Patel [10], and Qu et. al. [42] preserve haze in the distance, which is most obvious in the third and last samples.

265

2) Less micro-contrast loss. In the first and third samples, Li et al. [41] erases many details while WDC performs well. 3) Less color bias. Zhang and Patel [10] produces obvious color bias in the first sample. Qu et. al. [42] causes color bias in all the samples. WDC performs well in this aspect.

270

4) Less artifact. In the last sample, Zhang and Patel [10] enhances haze in the middle, while Qu et. al. [42] produces textures in the top. Learning-based methods becomes increasingly popular in recent years, and have achieved outstanding results in some challenges [43]. However, they are 275

highly unstable in our practice. The quantitative results does not support their superiorities, and the qualitative results indicate serious over-fitting problem. We believe that the lacking of training dataset is the major reason, since existing datasets are built in the following ways. 1) Synthesis. Hazy images are built based on haze-free images and depth maps, such as D-Hazy [31] and the synthetic part of RESIDE [44]. Such

280

2

Zhang and Patel [10] only produces 512x512 images. Comparison on 640x480 images

might be unfair since its results are up-sampled in vertical. Please check our project page for 512x512 comparison. The conclusion is the same. https://jiantaoliu.github.io/WDC/

13

datasets have high quality, but are limited in indoor or close scenes due to the small range of depth sensors. Learning-based methods trained on them usually fail in natural scenes. 2) Dependent. Hazy images are real, but haze-free images are generated 285

by dehazing algorithms [45] or depth estimation algorithms [44]. The defects of the relied algorithms are usually inherited. 3) Photography. Both haze-free and hazy images are real, such as IHAZE [46]. However, since capturing haze-free and hazy image pairs is very chanllenge and time-consuming, it has few samples, leading to

290

over-fitting problem. 5.4. Runtime comparison The runtimes are summarized in Table 4. Initialization time of learningbased methods are not recorded. Without GPU, WDC is only slower than Meng et al. [14], and even comparable to learning-based method [40]. Refer to

295

Alg. 1, the only time-consuming step is Eq. (13), which is also fast on modest size (720-1080p). Considering the conciseness, WDC can be easily implemented without complex libraries or expensive devices, and does not require initialization. Therefore, it is highly practical.

6. TUNE WDC 300

In this section, we ask for extra messages to rediscover the missing dark pixels which are essential to transmission estimation but missed by WDC due to the limitation of preset masks. For example, the hollows of the shelf in Fig. 13 are smaller than masks Ω thus no dark pixel is detected. As a consequence, the transmissions of these isolated backgrounds are over-estimated. Berman et al. [6]

305

performs better in this sample since these isolated backgrounds and the major one are classified together by the haze-line model. WDC can also produce right estimates by reducing the size of Ω. However, small Ω will undermine the dark channel prior, recognizing the white part of the flower as background and

14

resulting in the same under-estimated failure likes Berman et al. [6]. Apparently, 310

there is no universally correct Ω. External knowledge is necessary for such cases. We propose a semi-supervised solution. An interface is designed to receive two kinds of scribbles. Blue ones cover pixels with wrong transmission estimates, denoted as C. Red ones cover pixels whose transmission values are preferred, denoted as T . In each round of tuning, initial transmission estimates of C are

315

modified as the median of T . Then, we recalculate t based on Eq. 13. The process above seems aggressive and requires the scribbles being precise, but it is actually robust as long as key dark pixels are located. It is unnecessary to cover all the wrong estimates or avoid good estimates. As shown in Fig 14, the blue scribbles only need to strike through those over-estimated ar-

320

eas. Note that, although the scribbles are local, the improvements are global. For example, the sword behind the shelf, and the pointed area behind the tree are refined without specific scribbles. Furthermore, in the second sample, the wrong scribble outlined in yellow does not affect the result.

7. Conclusion 325

In this paper, we discuss a detail problem of dark channel, illustrate the defects and limitations it brought to dark channel based dehazing. As mentioned in Section 2, dark channel based methods have to make secondary assumption for sufficient constraints, while outliers of the assumption are out-of-control due to the absence of fidelity measurement. We solve this problem by rein-

330

terpreting dark channel as the combination of dark pixel and local constant assumption, and then controlling the outliers based on a novel weight map. A refreshingly concise method named WDC is constructed, which is highly robust to mask shapes and even modifications from scribbles. The experiments reveal that WDC can well handle the discussed problem, and significantly boost

335

dark channel based dehazing, which regains outstanding performances in various comparisons with state-of-the-arts.

15

8. Acknowledgement This work was supported by the National Natural Science Foundations of China (Project No. 61633020, 61725305, 61473090 and 61673115), the Beijing 340

Innovation Center for Engineering Science and Advanced Technology, and Fujian Provincial Collaborative Innovation Center for High-End Equipment Manufacturing.

16

References [1] C. O. Ancuti, C. Ancuti, Single image dehazing by multi-scale fusion, IEEE 345

Transactions on Image Processing 22 (8) (2013) 3271–3282. [2] A. Galdran, J. Vazquez-Corral, D. Pardo, M. Bertalm´ıo, Fusion-based variational image dehazing, IEEE Signal Processing Letters 24 (2) (2017) 151– 155. [3] J. Guo, J. Syue, V. Radzicki, H. Lee, An efficient fusion-based defogging,

350

IEEE Transactions on Image Processing. [4] K. He, J. Sun, X. Tang, Single image haze removal using dark channel prior, IEEE Transactions of Pattern Analysis and Machine Intelligence 33 (12) (2011) 2341–2353. [5] R. Fattal, Dehazing using color-lines, ACM Transactions on Graphics 34 (1)

355

(2014) 13. [6] D. Berman, S. Avidan, et al., Non-local image dehazing, in: IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 1674–1682. [7] S. G. Narasimhan, S. K. Nayar, Chromatic framework for vision in bad weather, in: IEEE Conference on Computer Vision and Pattern Recogni-

360

tion, 2000, pp. 598–605.

[8] Y. Y. Schechner, S. G. Narasimhan, S. K. Nayar, Instant dehazing of images using polarization, in: IEEE Conference on Computer Vision and Pattern Recognition, 2001, pp. 325–332.

[9] J. Kopf, B. Neubert, B. Chen, M. Cohen, D. Cohen-Or, O. Deussen, M. Uyttendaele, D. Lischinski, Deep photo: Model-based photograph enhancement and viewing, ACM Transactions on Graphics 27 (5) (2008) 116.

[10] H. Zhang, V. M. Patel, Densely connected pyramid dehazing network, in: IEEE Conference on Computer Vision and Pattern Recognition, 2018.

[11] R. Li, J. Pan, Z. Li, J. Tang, Single image dehazing via conditional generative adversarial network, in: IEEE Conference on Computer Vision and Pattern Recognition, 2018.

[12] J. Pan, D. Sun, H. Pfister, M.-H. Yang, Deblurring images via dark channel prior, IEEE Transactions on Pattern Analysis and Machine Intelligence 40 (10) (2017) 2315–2328.

[13] C. Zhu, G. Li, W. Wang, R. Wang, An innovative salient object detection using center-dark channel prior, in: IEEE International Conference on Computer Vision, 2017, pp. 1509–1515.

[14] G. Meng, Y. Wang, J. Duan, S. Xiang, C. Pan, Efficient image dehazing with boundary constraint and contextual regularization, in: IEEE International Conference on Computer Vision, 2013, pp. 617–624.

[15] R. Wang, R. Li, H. Sun, Haze removal based on multiple scattering model with superpixel algorithm, Signal Processing 127 (2016) 24–36.

[16] W. E. K. Middleton, Vision through the atmosphere, in: Geophysik II/Geophysics II, Springer, 1957, pp. 254–287.

[17] Y. Bahat, M. Irani, Blind dehazing using internal patch recurrence, in: IEEE International Conference on Computational Photography, 2016, pp. 1–9.

[18] D. Berman, T. Treibitz, S. Avidan, Air-light estimation using haze-lines, in: IEEE International Conference on Computational Photography, 2017, pp. 1–9.

[19] A. Levin, D. Lischinski, Y. Weiss, A closed-form solution to natural image matting, IEEE Transactions on Pattern Analysis and Machine Intelligence 30 (2) (2008) 228–242.

[20] C. Chen, M. N. Do, J. Wang, Robust image and video dehazing with visual artifact suppression via gradient residual minimization, in: European Conference on Computer Vision, 2016, pp. 576–591.

[21] L. He, J. Zhao, N. Zheng, D. Bi, Haze removal using the difference-structure-preservation prior, IEEE Transactions on Image Processing 26 (3) (2017) 1063–1075.

[22] K. He, J. Sun, X. Tang, Guided image filtering, IEEE Transactions on Pattern Analysis and Machine Intelligence 35 (6) (2013) 1397–1409.

[23] Z. Li, J. Zheng, Edge-preserving decomposition-based single image haze removal, IEEE Transactions on Image Processing 24 (12) (2015) 5432–5441.

[24] Q. Zhu, J. Mai, L. Shao, A fast single image haze removal algorithm using color attenuation prior, IEEE Transactions on Image Processing 24 (11) (2015) 3522–3533.

[25] B. Cai, X. Xu, K. Jia, C. Qing, D. Tao, DehazeNet: An end-to-end system for single image haze removal, IEEE Transactions on Image Processing 25 (11) (2016) 5187–5198.

[26] J. He, C. Zhang, R. Yang, K. Zhu, Convex optimization for fast image dehazing, in: IEEE International Conference on Image Processing, 2016, pp. 2246–2250.

[27] Y. Lai, Y. Chen, C. Chiou, C. Hsu, Single-image dehazing via optimal transmission map under scene priors, IEEE Transactions on Circuits and Systems for Video Technology 25 (1) (2015) 1–14.

[28] I. Omer, M. Werman, Color lines: Image specific color representation, in: IEEE Conference on Computer Vision and Pattern Recognition, 2004, pp. 946–953.

[29] M. Zhu, B. He, Q. Wu, Single image dehazing based on dark channel prior and energy minimization, IEEE Signal Processing Letters 25 (2) (2017) 174–178.

[30] R. Fletcher, Augmented Lagrangians, box constrained QP and extensions, IMA Journal of Numerical Analysis 37 (4) (2017) 1635–1656.

[31] C. Ancuti, C. O. Ancuti, C. De Vleeschouwer, D-HAZY: A dataset to evaluate quantitatively dehazing algorithms, in: IEEE International Conference on Image Processing, 2016, pp. 2226–2230.

[32] G. Sharma, W. Wu, E. N. Dalal, The CIEDE2000 color-difference formula: Implementation notes, supplementary test data, and mathematical observations, Color Research and Application 30 (1) (2005) 21–30.

[33] Z. Wang, A. C. Bovik, H. R. Sheikh, E. P. Simoncelli, Image quality assessment: from error visibility to structural similarity, IEEE Transactions on Image Processing 13 (4) (2004) 600–612.

[34] Z. Wang, E. P. Simoncelli, A. C. Bovik, Multiscale structural similarity for image quality assessment, in: Asilomar Conference on Signals, Systems, and Computers, 2003, pp. 1398–1402.

[35] L. Zhang, L. Zhang, X. Mou, D. Zhang, A comprehensive evaluation of full reference image quality assessment algorithms, in: IEEE International Conference on Image Processing, 2012, pp. 1477–1480.

[36] H. R. Sheikh, A. C. Bovik, Image information and visual quality, IEEE Transactions on Image Processing 15 (2) (2006) 430–444.

[37] Z. Wang, Q. Li, Information content weighting for perceptual image quality assessment, IEEE Transactions on Image Processing 20 (5) (2011) 1185–1198.

[38] L. Zhang, L. Zhang, X. Mou, RFSIM: A feature based image quality assessment metric using Riesz transforms, in: IEEE International Conference on Image Processing, 2010, pp. 321–324.

[39] L. Zhang, L. Zhang, X. Mou, D. Zhang, FSIM: A feature similarity index for image quality assessment, IEEE Transactions on Image Processing 20 (8) (2011) 2378–2386.

[40] W. Ren, S. Liu, H. Zhang, J. Pan, X. Cao, M.-H. Yang, Single image dehazing via multi-scale convolutional neural networks, in: European Conference on Computer Vision, 2016, pp. 154–169.

[41] B. Li, X. Peng, Z. Wang, J. Xu, D. Feng, AOD-Net: All-in-one dehazing network, in: IEEE International Conference on Computer Vision, 2017, pp. 4770–4778.

[42] Y. Qu, Y. Chen, J. Huang, Y. Xie, Enhanced Pix2pix dehazing network, in: IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 8160–8168.

[43] C. Ancuti, C. O. Ancuti, R. Timofte, NTIRE 2018 challenge on image dehazing: Methods and results, in: IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2018, pp. 891–901.

[44] B. Li, W. Ren, D. Fu, D. Tao, D. Feng, W. Zeng, Z. Wang, Benchmarking single-image dehazing and beyond, IEEE Transactions on Image Processing 28 (1) (2019) 492–505.

[45] X. Fan, Y. Wang, X. Tang, R. Gao, Z. Luo, Two-layer Gaussian process regression with example selection for image dehazing, IEEE Transactions on Circuits and Systems for Video Technology 27 (12) (2017) 2505–2517.

[46] C. Ancuti, C. O. Ancuti, R. Timofte, C. De Vleeschouwer, I-HAZE: A dehazing benchmark with real hazy and haze-free indoor images, in: International Conference on Advanced Concepts for Intelligent Vision Systems, Springer, 2018, pp. 620–631.

[47] H.-K. Shin, J.-Y. Kim, H.-K. Lee, S.-J. Ko, Single image dehazing based on weighted dark channel, in: IEEE International Conference on Consumer Electronics, 2019, pp. 1–2.

Figure 1: Intermediates of several dark channel based methods: (a) hazy image; (b) transmission of our method; (c) initial transmission of [4]; (d) transmission of [4]; (e) initial transmission of [14]; (f) transmission of [14]. Warmer color indicates higher transmission. Note that in this case the transmission map should reflect scene depth, and only scene depth.

Figure 2: Flowcharts of dark channel based methods.


Figure 3: Intermediate results of WDC: (a) I; (b) b; (c) t˜; (d) W; (e) t; (f) J.

Figure 4: Intermediate results of WDC with different local constant assumptions, showing that WDC is robust to initial estimates: (a) r(Ω) = 15; (b) r(Ω) = 35; (c) image opening [14]; (d) super-pixel [15]. From top to bottom: initial transmission t˜, weight map W, and final transmission t.


Figure 5: Comparison of WDC and CWDC: (a) I1; (b) Jwdc1 with t = 0.1; (c) Jcwdc1 with t = 0.1; (d) I2; (e) Jwdc2 with t = 0.1; (f) Jcwdc2 with t = 0.1.
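The t = 0.1 setting in Figure 5 is the usual lower bound on transmission in the scene-radiance recovery step of dark channel dehazing, J = (I − A)/max(t, t0) + A, following He et al. [4]. A minimal sketch, assuming RGB images as floats in [0, 1]; function and variable names are illustrative, not the authors' code:

```python
import numpy as np

def recover(I, t, A, t0=0.1):
    """Recover scene radiance J = (I - A) / max(t, t0) + A per pixel.

    I : (H, W, 3) hazy image, A : (3,) air-light, t : (H, W) transmission.
    The lower bound t0 avoids amplifying noise where t is near zero.
    """
    t = np.maximum(t, t0)[..., np.newaxis]   # clamp, then broadcast over RGB
    return np.clip((I - A) / t + A, 0.0, 1.0)
```

With t clamped below by t0, distant pixels (small t) are restored less aggressively, which is exactly the trade-off Figure 5 visualizes.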

Figure 6: Comparison of results with (top row) and without (bottom row) post-enhancement (contrast stretch [6]): (a) He et al. [4]; (b) Berman et al. [6]; (c) WDC.

Figure 7: Samples used for qualitative comparisons.


Table 1: Comparison on the D-HAZY dataset [31]. Air-light given, no post-processing. Columns marked with † use the configuration of Berman [6].

| Metric | DC [4] | Berman [6] | Meng [14] | Shin [47] | Ren [40] | Zhu [29] | WDC | CWDC | Berman [6]† | WDC† | CWDC† |
|---|---|---|---|---|---|---|---|---|---|---|---|
| MSE↓ | 0.235 | 0.242 | 0.218 | 0.309 | 0.313 | 0.215 | 0.213 | 0.206 | 0.284 | 0.237 | 0.230 |
| CIEDE2000↓ | 11.190 | 11.276 | 10.224 | 14.372 | 15.064 | 10.097 | 9.994 | 9.687 | 15.618 | 13.416 | 12.977 |
| SSIM↑ | 0.836 | 0.806 | 0.837 | 0.766 | 0.802 | 0.847 | 0.846 | 0.853 | 0.775 | 0.835 | 0.844 |
| MS-SSIM↑ | 0.819 | 0.834 | 0.837 | 0.819 | 0.789 | 0.846 | 0.837 | 0.842 | 0.798 | 0.827 | 0.834 |
| VIF↑ | 0.568 | 0.638 | 0.667 | 0.574 | 0.557 | 0.652 | 0.647 | 0.653 | 0.651 | 0.663 | 0.670 |
| IW-SSIM↑ | 0.781 | 0.801 | 0.800 | 0.781 | 0.739 | 0.812 | 0.803 | 0.808 | 0.763 | 0.793 | 0.801 |
| RFSIM↑ | 0.963 | 0.966 | 0.967 | 0.966 | 0.949 | 0.968 | 0.965 | 0.967 | 0.949 | 0.962 | 0.965 |
| FSIM↑ | 0.896 | 0.886 | 0.893 | 0.855 | 0.884 | 0.902 | 0.901 | 0.906 | 0.869 | 0.896 | 0.902 |

Table 2: Comparison on the improved dataset of Fattal [5]. Air-light given, no post-processing. Columns marked with † use the configuration of Berman [6].

| Metric | DC [4] | Berman [6] | Meng [14] | Shin [47] | Ren [40] | Zhu [29] | WDC | CWDC | Berman [6]† | WDC† | CWDC† |
|---|---|---|---|---|---|---|---|---|---|---|---|
| MSE↓ | 0.110 | 0.154 | 0.114 | 0.275 | 0.161 | 0.108 | 0.105 | 0.098 | 0.160 | 0.118 | 0.113 |
| CIEDE2000↓ | 5.529 | 7.396 | 5.558 | 13.304 | 7.979 | 5.309 | 5.173 | 4.862 | 9.294 | 7.562 | 7.309 |
| SSIM↑ | 0.929 | 0.888 | 0.929 | 0.789 | 0.890 | 0.941 | 0.936 | 0.944 | 0.885 | 0.931 | 0.938 |
| MS-SSIM↑ | 0.929 | 0.919 | 0.932 | 0.859 | 0.904 | 0.941 | 0.933 | 0.939 | 0.911 | 0.929 | 0.934 |
| VIF↑ | 0.690 | 0.693 | 0.734 | 0.567 | 0.649 | 0.730 | 0.731 | 0.739 | 0.715 | 0.748 | 0.756 |
| IW-SSIM↑ | 0.918 | 0.910 | 0.922 | 0.853 | 0.889 | 0.931 | 0.922 | 0.928 | 0.900 | 0.917 | 0.923 |
| RFSIM↑ | 0.986 | 0.984 | 0.987 | 0.978 | 0.978 | 0.989 | 0.988 | 0.989 | 0.979 | 0.986 | 0.987 |
| FSIM↑ | 0.953 | 0.934 | 0.952 | 0.893 | 0.933 | 0.958 | 0.954 | 0.960 | 0.927 | 0.949 | 0.954 |

Table 3: Comparison with end-to-end methods with default configurations.

D-HAZY dataset [31]:

| Metric | AOD [41] | Zhang [10] | Cai [25] | Qu [42] | WDC | CWDC |
|---|---|---|---|---|---|---|
| MSE↓ | 0.336 | 0.358 | 0.311 | 0.292 | 0.270 | 0.270 |
| CIEDE2000↓ | 16.701 | 17.481 | 14.867 | 15.323 | 13.081 | 13.035 |
| SSIM↑ | 0.776 | 0.757 | 0.810 | 0.818 | 0.823 | 0.829 |
| MS-SSIM↑ | 0.764 | 0.774 | 0.796 | 0.806 | 0.816 | 0.823 |
| VIF↑ | 0.488 | 0.478 | 0.553 | 0.565 | 0.657 | 0.662 |
| IW-SSIM↑ | 0.709 | 0.731 | 0.747 | 0.763 | 0.781 | 0.789 |
| RFSIM↑ | 0.943 | 0.937 | 0.955 | 0.949 | 0.958 | 0.961 |
| FSIM↑ | 0.855 | 0.866 | 0.887 | 0.888 | 0.893 | 0.899 |

The improved dataset of Fattal [5]:

| Metric | AOD [41] | Zhang [10] | Cai [25] | Qu [42] | WDC | CWDC |
|---|---|---|---|---|---|---|
| MSE↓ | 0.194 | 0.263 | 0.136 | 0.151 | 0.121 | 0.118 |
| CIEDE2000↓ | 10.402 | 12.890 | 6.759 | 8.248 | 6.176 | 6.054 |
| SSIM↑ | 0.839 | 0.817 | 0.900 | 0.902 | 0.928 | 0.933 |
| MS-SSIM↑ | 0.868 | 0.857 | 0.916 | 0.920 | 0.930 | 0.935 |
| VIF↑ | 0.544 | 0.593 | 0.684 | 0.708 | 0.727 | 0.734 |
| IW-SSIM↑ | 0.848 | 0.839 | 0.901 | 0.908 | 0.919 | 0.925 |
| RFSIM↑ | 0.968 | 0.957 | 0.980 | 0.980 | 0.988 | 0.989 |
| FSIM↑ | 0.898 | 0.887 | 0.946 | 0.945 | 0.947 | 0.952 |


Table 4: Mean runtime of each method in the quantitative comparison (i5-4620T, GTX 1080 Ti).

| | DC [4] | Berman [6] | Meng [14] | Shin [47] | Ren [40] | AOD [41] | Zhang [10] | Cai [25] | Qu [42] | Zhu [29] | WDC | CWDC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Platform | Matlab | Matlab | Matlab | Matlab | Matlab | PyCaffe | Pytorch | Matlab&C | Pytorch | Matlab | Matlab | Matlab |
| With GPU | - | - | - | - | - | √ | √ | - | √ | - | - | - |
| Seconds↓ | 8.53 | 1.86 | 1.27 | 3.93 | 1.22 | 0.02 | 0.02 | 2.13 | 0.06 | 28.48 | 1.47 | 21.16 |

Figure 8: Depth-irrelevant details/micro-contrast loss, WDC versus He et al. [4]: (a) He et al. [4]; (b) transmission of [4]; (c) WDC; (d) transmission of WDC.

Figure 9: Color-haze ambiguity/color bias, WDC versus Berman et al. [6]: (a) Berman et al. [6]; (b) transmission of [6]; (c) WDC; (d) transmission of WDC.

Figure 10: Over-smoothness/halo effect, WDC versus Meng et al. [14]: (a) Meng et al. [14]; (b) transmission of [14]; (c) WDC; (d) transmission of WDC.

Figure 11: Vague estimates/haze preservation, WDC versus Ren et al. [40]: (a) Ren et al. [40]; (b) transmission of [40]; (c) WDC; (d) transmission of WDC.

Figure 12: End-to-end comparisons. From left to right: input images, Li et al. [41], Zhang and Patel [10], Cai et al. [25], Qu et al. [42], and WDC.

Figure 13: Failure cases. From left to right: input images, Berman et al. [6], and WDC.


Figure 14: Tuning WDC by scribbles: (a) inputs; (b) WDC; (c) tuned WDC.

CRediT Author Statement

Zhu Mingzhu: Conceptualization, Methodology, Software, Writing – Original Draft.
He Bingwei: Resources, Writing – Review & Editing, Project administration, Funding acquisition.
Liu Jiantao: Validation, Investigation, Data Curation, Visualization.
Yu Junzhi: Formal analysis, Supervision.

Conflict of interest

The authors declare that they have no conflict of interest regarding this work, and that they have no commercial or associative interest that represents a conflict of interest in connection with the work submitted.