Blind image blur metric based on orientation-aware local patterns

Lixiong Liu a, Jiachao Gong a, Hua Huang a,∗, Qingbing Sang b

a Beijing Laboratory of Intelligent Information Technology, School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
b Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, China

Keywords: Blind blur metric, Orientation selectivity, Orientation-aware local pattern, Toggle operator

ABSTRACT

We develop an effective blind image blur assessment model based on a novel orientation-aware local pattern operator. The proposed operator fully considers the anisotropy of the orientation selectivity mechanism and the effect of gradient orientation on visual perception. Our results indicate that the proposed descriptor is sensitive to image distortion and can effectively represent orientation information, so we use it to extract image structure information. To enhance the features' ability to represent blurred images, we extract edge information with a Toggle operator and use it as the weight of the local patterns to optimize the computed structural statistical features. Finally, a support vector regression method is used to train a predictive model on the optimized features and subjective scores. Experimental results obtained on six public databases show that our proposed model performs better than state-of-the-art image blur assessment models.

✩ No author associated with this paper has disclosed any potential or pertinent conflicts which may be perceived to have impending conflict with this work. For full disclosure statements refer to https://doi.org/10.1016/j.image.2019.115654.
∗ Corresponding author. E-mail addresses: [email protected] (L. Liu), [email protected] (J. Gong), [email protected] (H. Huang), [email protected] (Q. Sang).

Signal Processing: Image Communication 80 (2020) 115654
https://doi.org/10.1016/j.image.2019.115654
Received 17 February 2019; Received in revised form 30 May 2019; Accepted 1 October 2019; Available online 15 October 2019
0923-5965/© 2019 Elsevier B.V. All rights reserved.

1. Introduction

Over the past decade, countless intelligent digital devices have charmed their way into our lives, and it has become very easy to obtain digital images anytime and anywhere. Nevertheless, during acquisition and transmission, images are inevitably affected by distortions. Since many more images are now acquired by inexperienced users, blur, including out-of-focus blur, motion blur and so on, has become one of the most common distortions. In order to enhance and control image quality [1], it is highly valuable to identify and quantify image blur. Humans are the final perceivers of images, so subjective opinions are the most reliable way to evaluate image quality. Unfortunately, subjective testing is time-consuming and expensive, so developing objective image quality assessment (IQA) models becomes particularly important [1,2]. Depending on whether the original reference image is available, there exist three categories of objective IQA approaches: full-reference (FR), reduced-reference (RR) and no-reference (NR). Since pristine reference information is unavailable in most practical applications, it is highly desirable to design blind quality assessment models [3–6]. Here, we attempt to address the problem of no-reference image blur quality assessment.

In recent years, blind blur quality assessment has been extensively studied [7–14]. Image blur results in the loss of image texture information and attenuates an image's high-frequency components in the transform domain. Based on this phenomenon, some image blur prediction methods have been proposed that analyze the statistical characteristics of coefficients in the Fourier [15] and wavelet [7,12] domains or the decay velocity of the singular value curve [16]. However, their computational complexity is relatively high due to the transformation process. Considering that image blur causes pixel diffusion, which directly distorts image structure, and that the human visual system (HVS) can effectively extract structure information from a visual scene [2,17], some methods use image edges [18] or image gradients [11,19] to evaluate image blur. These methods have relatively low computational complexity, but rely heavily on accurately representing image structure information. Thus, structural descriptors are widely used in the field of image blur assessment [10,20–22]. Due to the effectiveness and simplicity of the local binary pattern (LBP) operator [23], many researchers have deployed it to capture statistical features from an image and then used the computed feature vector to evaluate image blur. For example, Yue et al. [10] found that an image's LBP statistical distribution varies monotonically with increasing blur, which can serve as a blur metric. Dai et al. [20] decomposed the image into low-order and high-order components, then extracted their LBP statistical features for evaluating image blur. However, the LBP operator encodes the binary relationships between the central pixel and its neighboring pixels from all orientations in a local image area equally, which means that the LBP operator is not sensitive to orientation information [24]. Experiments in neurophysiology indicate that the HVS exhibits orientation selectivity [25,26].

Specifically, simple neurons in the primary visual cortex have their own orientation preference: they show an excitatory response when perceiving their preferred orientation information [25], but a weakening excitatory response, or even an inhibitory response, when perceiving orientation information to which they are insensitive. On the other hand, orientation information describes image structure well [22] and has been successfully deployed in human detection [27] and image quality assessment [28]. Therefore, designing a novel local descriptor that integrates image orientation information contributes to enhancing image structure representation. Zhang et al. [24] proposed a local derivative pattern (LDP) operator, which represents image orientation information according to the consistency of neighboring pixels in fixed derivative orientations. Wu et al. [22] proposed an orientation selectivity based visual pattern (OSVP) operator, which represents local patterns by analyzing the orientation relationship between a central pixel and its neighboring pixels. Although these two operators focus on enhancing the capability of orientation representation, they ignore the anisotropy of orientation selectivity [29]. Further studies in neurophysiology show that the neurons in the primary visual cortex respond strongly to visual information in the horizontal and vertical orientations [29,30].

Motivated by these findings, we propose an orientation-aware local pattern operator. This operator identifies the neighboring pixels in the horizontal and vertical orientations of a local region according to their gradient orientation, and a non-uniform coding strategy is applied to each neighboring pixel. As a result, the local pattern is represented through orientation selectivity. Furthermore, a blind image blur assessment model based on the orientation-aware local pattern operator is developed. In this model, we first obtain the image's local pattern responses using the orientation-aware local pattern operator. Pixel diffusion caused by blur leads to contrast reduction and detail loss in image edge regions [13], and edge regions have an essential effect on blur perception [15]. Highlighting structure information in edge regions can improve the ability of structural statistical features to express image blur, because the HVS is sensitive to distortion in attended regions [31]. Here, we introduce a Toggle operator [32] to extract the edge information of a blurred image and utilize the computed edge information as the weight of the local pattern responses to optimize the structural statistical features. By selectively enhancing the contrast of a local image area, the Toggle operator can enhance image edge and detail information [33,34]. Meanwhile, it can filter image noise to a certain extent [33], making it more suitable for blur evaluation when the image also suffers from noise distortion. Lastly, a support vector regression (SVR) is utilized to construct a mapping between image features and human subjective scores.

The main contributions of this paper are summarized as follows: (1) Motivated by the anisotropy of orientation selectivity and the effect of gradient orientation on visual perception, we propose an orientation-aware local pattern operator and use it to extract image structure information. (2) A Toggle operator is introduced to optimize the image's statistical features, which are integrated into our proposed blind image blur metric. (3) Experimental results show that our proposed model outperforms state-of-the-art image blur assessment models.
The rest of this paper is organized as follows. The proposed orientation-aware local pattern operator is described in Section 2. In Section 3, the optimized features are integrated into a novel blind image blur index. We test the performance of our proposed approach against existing models in Section 4. Finally, we conclude the paper in Section 5.

2. Orientation-aware local pattern operator

In this section, we briefly analyze the orientation selectivity mechanism of the HVS. To address the limitation of the LBP operator in representing orientation information, we propose an orientation-aware local pattern operator that simulates the orientation selectivity mechanism with image gradient orientation. The resulting descriptor better represents image structure information, so it may also be applied to IQA or other application systems.

2.1. Orientation selectivity mechanism

Early experiments in neurophysiology [25,26] found that when cats or primates are shown lines or edges with specific orientations, some neurons in the local receptive field of the striate cortex produce excitatory responses. When the line's orientation is changed gradually, the excitatory responses of these neurons decrease rapidly, eventually turning into inhibitory responses. Subsequent experiments further proved that the HVS has stronger visual perception ability in the horizontal and vertical orientations [30]. In addition, Mansfield's orientation perception experiment [29] on primate striate cortex neurons showed that the number of neurons sensitive to horizontal and vertical visual signals is significantly larger than in other orientations. That is to say, an orientation selectivity mechanism exists in the HVS, and it is anisotropic [29]. Furthermore, in the grid illusion experiment [35,36] shown in Fig. 1, people incorrectly perceive smudges at the intersections of the Hermann grid (see Fig. 1a). This illusion is greatly relieved when people look at curved and oblique grids (see Fig. 1b and c) because of the change of orientation. Experimental analysis in [35,36] showed that cortical neurons perceiving orientation information around the intersections are activated, while those perceiving orientation information at the intersections are rarely activated; the illusion arises from this difference in activation intensity. This experiment also confirms the anisotropy of orientation selectivity.

2.2. Orientation-aware local pattern

Image structure can be well described by orientation information [22], and a change of image content or quality also changes the orientation information [27,28]. Fig. 2 shows an original image from the TID2013 database [37], its blurred version and their corresponding gradient orientation maps. It is obvious that gradient orientation captures image structure information well. The gradient orientation map $O$ is calculated as

$$O(m,n) = \arctan\!\left(\frac{I(m+1,n) - I(m-1,n)}{I(m,n+1) - I(m,n-1)}\right) \tag{1}$$

where $I$ is the input image, and $m$ and $n$ index the pixel location. This gradient orientation is integrated into our designed descriptor.

As mentioned above, the LBP operator [38], a classical structural descriptor, encodes the binary relationships between a central pixel and its neighboring pixels into a binary sequence that represents local image structure. For a local image area, the LBP response between the central pixel $p_c$ and neighboring pixels $p_i$ is modeled as

$$LBP_{P,R} = \sum_{i=0}^{P-1} s\left(p_i - p_c\right) 2^i \tag{2}$$

where $P$ is the number of neighboring pixels, $R$ is the radius, and $s(\cdot)$ is a threshold function:

$$s\left(p_i - p_c\right) = \begin{cases} 1, & p_i - p_c \geq 0 \\ 0, & p_i - p_c < 0. \end{cases} \tag{3}$$

Obviously, the LBP operator produces $2^P$ kinds of patterns, which leads to high-dimensional image features that are difficult to use in learning tasks. To address this problem, a rotation-invariant LBP operator [23] was proposed, which classifies the patterns by detecting whether the binary sequence belongs to the uniform patterns. The feature dimension is thus reduced to $P + 2$, making local binary patterns more discriminative [20] while texture classification performance is maintained. However, LBP is a non-directional local structure descriptor [24]. Considering the impact of orientation selectivity and the effect of gradient orientation on distortion perception, we propose a novel orientation-aware local pattern operator by modeling the orientation selectivity of simple neurons.
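To make Eqs. (1)–(3) concrete, here is a minimal sketch in Python/NumPy. It is an illustration rather than the authors' implementation: the boundary handling of np.gradient and the clockwise neighbor ordering starting from the top-left corner are our own assumptions.

```python
import numpy as np

def gradient_orientation(img):
    # Eq. (1): arctangent of the vertical central difference over the
    # horizontal one; np.gradient computes exactly these central
    # differences in the interior (border handling is a free choice).
    gy, gx = np.gradient(img.astype(np.float64))
    return np.arctan2(gy, gx)

def lbp_code(patch):
    # Eqs. (2)-(3) for a single 3x3 patch (R = 1, P = 8): threshold each
    # neighbor against the center and pack the bits into one integer.
    pc = patch[1, 1]
    neighbors = [(0, 0), (0, 1), (0, 2), (1, 2),
                 (2, 2), (2, 1), (2, 0), (1, 0)]  # assumed clockwise order
    return sum(int(patch[r, c] >= pc) << i
               for i, (r, c) in enumerate(neighbors))
```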


Fig. 1. Grid illusion experiment. (a) Classic Hermann grids, (b) curved grids, and (c) oblique grids.

Fig. 2. Pristine image, its blur-distorted version and their corresponding gradient orientation maps. (a) Original image, (b) its gradient orientation map, (c) its blur-distorted image, (d) gradient orientation map of (c).

Fig. 3. Encoding scheme of orientation-aware local pattern operator (R = 1 and P = 8).



Because the HVS is more sensitive to local horizontal and vertical visual information, for the LBP operator the relationships between the central pixel $p_c$ and the neighboring pixels $p_i$ in these two orientations should have a more important effect on visual perception [30,35]. These binary relationships constitute the local patterns. As analyzed before, such a process is non-directional. To alleviate this limitation and make full use of orientation information, we design a novel orientation-aware local pattern operator whose encoding scheme is shown in Fig. 3. The designed operator first calculates an original LBP sequence. A classification step then determines whether each neighboring pixel belongs to the horizontal or vertical orientation according to its gradient orientation. For neighboring pixels in these two orientations, their binary responses with respect to $p_c$ are calculated again; for the other neighboring pixels, the binary responses are set to 0 directly, because the HVS is less sensitive to that visual information [29]. The resulting sequence, as a supplement to the LBP sequence, emphasizes horizontal and vertical visual information. A complete sequence can then be obtained by concatenating the two binary sequences. However, simple concatenation increases the feature dimension. Moreover, the two sequences represent the structure information of the same local area, so they are correlated, and concatenation ignores this correlation [39]. To reduce the feature dimension, Zhao et al. [40] proposed a local binary count (LBC) method that represents local structure information by simply summing the values of a binary sequence:

$$LBC_{P,R} = \sum_{i=0}^{P-1} s\left(p_i - p_c\right). \tag{4}$$

This method shows competitive texture classification performance and has a relatively low feature dimension compared with the LBP operator. Inspired by their work, we combine the two binary response sequences mentioned above by adding them directly to obtain a final response sequence. Specifically, a local pattern is obtained by summing the values of the final response sequence (see Fig. 3). In practice, the relationships between the central pixel and the neighboring pixels in the horizontal and vertical orientations are equivalently counted twice, so the proposed orientation-aware local pattern (OLP) operator is defined as

$$OLP_{P,R} = \sum_{i=0}^{P-1} \begin{cases} 2, & p_i - p_c > 0 \ \text{and} \ p_i \in \Omega \\ 1, & p_i - p_c > 0 \ \text{and} \ p_i \notin \Omega \\ 0, & p_i - p_c \leq 0 \end{cases} \tag{5}$$

where $\Omega$ represents the neighboring pixels in the horizontal and vertical orientations (i.e., $p_i \in \Omega$ means $p_i$ is a neighboring pixel in the horizontal or vertical orientation); in this descriptor, a pixel is classified into $\Omega$ when its gradient orientation deviates from the horizontal or vertical by no more than 20 degrees. The operator encodes the structure into $2P + 1$ patterns ($OLP \in [0, 2P]$). A value of 0 represents a bright point or flat area, a value of $2P$ represents a dark point, and the other values $[1, 2P - 1]$ represent different types of local structures.

To further illustrate the proposed orientation-aware local pattern operator's structure representation, a rotation experiment is carried out here. When an image is rotated, the LBP representation can remain unchanged because the pixels surrounding the central pixel are still the same [23]. However, when the rotation angle is not a multiple of 90 degrees, the visual information perceived by the HVS should change because of the anisotropy of the orientation selectivity mechanism. For example, an initial pixel matrix with its orientation information is shown in Fig. 4a. The sequence produced by the LBP operator is ''00111000'' (the subscript of each neighboring pixel is its sequence index), and the sequence produced by the orientation-aware local pattern operator is ''00122000''. When the matrix is rotated 45 degrees clockwise (see Fig. 4b), the LBP sequence is ''00011100'', and the orientation-aware sequence is also ''00011100''. In other words, the sequence represented by the LBP operator belongs to a uniform pattern both before and after rotation, and the number of 1s in the sequence is the same, so its structure information remains unchanged. For the orientation-aware local pattern operator, however, the local pattern is 5 before rotation but 3 after rotation. Our proposed operator is therefore capable of describing such a change.
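The following sketch implements Eq. (5) for R = 1 and P = 8 under our reading of the descriptor: a neighbor belongs to Ω when its gradient orientation from Eq. (1), folded into [0, 180) degrees, lies within 20 degrees of the horizontal or vertical axis. Edge-replicated padding and the exact angle conventions are our assumptions, not details fixed by the paper.

```python
import numpy as np

def olp_map(img, tol_deg=20.0):
    """Orientation-aware local pattern map (Eq. (5)), R = 1, P = 8."""
    img = img.astype(np.float64)
    gy, gx = np.gradient(img)
    theta = np.degrees(np.arctan2(gy, gx)) % 180.0  # fold opposite directions
    # Omega: orientation within tol_deg of 0/180 (horizontal) or 90 (vertical)
    omega = (np.minimum(theta, 180.0 - theta) <= tol_deg) | \
            (np.abs(theta - 90.0) <= tol_deg)

    pad = np.pad(img, 1, mode='edge')
    omega_pad = np.pad(omega, 1, mode='edge')
    H, W = img.shape
    olp = np.zeros((H, W), dtype=np.int32)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]  # the eight R = 1 neighbors
    for dy, dx in offsets:
        nb = pad[1 + dy:1 + dy + H, 1 + dx:1 + dx + W]
        nb_omega = omega_pad[1 + dy:1 + dy + H, 1 + dx:1 + dx + W]
        # Eq. (5): add 2 when p_i > p_c and p_i is in Omega, 1 when
        # p_i > p_c only, and 0 otherwise
        olp += (nb > img) * np.where(nb_omega, 2, 1)
    return olp  # pattern values in [0, 2P] = [0, 16]
```

For the example of Fig. 4, this summed coding gives 5 before the 45-degree rotation and 3 after it, while the number of 1s in the LBP sequence stays at 3, which is exactly the sensitivity the rotation experiment above describes.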

3. Blind image blur metric

A blind image blur assessment model using the orientation-aware local pattern operator is described in this section. The model first uses the orientation-aware local pattern operator to represent the image's local patterns. We then extract the image's edge information with the Toggle operator and use it as the weight of the local patterns to obtain optimized statistical features. Finally, a support vector regression (SVR) is applied to form a mapping between image features and human subjective scores.

As described above, the orientation-aware local pattern operator can better represent image structure information by simulating the orientation selectivity mechanism. We thus capture the statistical features of images with different degrees of blur distortion using (5). Note that its parameters $R$ and $P$ are set to 1 and 8, respectively. Fig. 5 shows an example of the structural statistical features extracted from an original image and its blurred versions from the TID2013 database. Obviously, image blur degrades image structure, and the structure information of images with different degrees of blur presents different statistical regularities. As the blur degree increases, pixel diffusion smooths complex structures, so the statistical histograms of local patterns become more concentrated.

Generally, edge regions in an image are more attractive to the HVS [15], and distortion in attended regions has a more important impact on human subjective opinion than in other regions [31]. Using edge information as the weight of local patterns can improve the representation ability of the structural statistical features for blurred images [20,22]. We thus use the Toggle operator [32], a powerful contrast operator, to extract image edge information. Through morphological dilation and erosion, the Toggle operator selectively improves the contrast of a local image area, which highlights the image's edge and detail information [33,34]. Therefore, it can enhance edges affected by blur to a certain extent and extract the information that human eyes pay attention to. A locally contrast-enhanced image $I_t$ of the original image $I$ is modeled as

$$I_t(m,n) = \begin{cases} (I \oplus J)(m,n), & (I \oplus J)(m,n) - I(m,n) < I(m,n) - (I \odot J)(m,n) \\ (I \odot J)(m,n), & (I \oplus J)(m,n) - I(m,n) > I(m,n) - (I \odot J)(m,n) \\ I(m,n), & \text{otherwise} \end{cases} \tag{6}$$

where $\oplus$ and $\odot$ denote morphological dilation and erosion, respectively, and $J$ is a $3 \times 3$ rectangular structuring element that matches the size of the orientation-aware local pattern operator used here. Fig. 6 illustrates an example of locally enhanced image generation using the Toggle operator. For a target pixel value of 56, morphological dilation yields the maximum value 63 in the green template region centered at this pixel, while morphological erosion yields the minimum value 10. The Toggle operator selects whichever of the dilation and erosion results is closer to the original pixel value 56, so the value of the target pixel is replaced with 63. An edge map $I_e$ is obtained by taking the difference between the enhanced image $I_t$ and the original image $I$:

$$I_e = \left| I_t - I \right|. \tag{7}$$

Image blur results in edge diffusion and contrast reduction in an image, which may affect the representation performance of edge extraction operators. To illustrate the Toggle operator's superiority in edge representation, Fig. 7 shows a blurred image from the TID2013 database and the edge representations of one of its patches produced by four operators (the Prewitt, Sobel, Roberts and Toggle operators).
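Eqs. (6)–(7) translate directly into grey-scale morphology; below is a sketch assuming a flat 3 × 3 structuring element and SciPy's morphology routines, not the authors' released code.

```python
import numpy as np
from scipy.ndimage import grey_dilation, grey_erosion

def toggle_edge_map(img):
    """Toggle contrast operator (Eq. (6)) followed by the edge map of Eq. (7)."""
    img = img.astype(np.float64)
    dil = grey_dilation(img, size=(3, 3))  # I dilated by a flat 3x3 element
    ero = grey_erosion(img, size=(3, 3))   # I eroded by the same element
    # Eq. (6): replace each pixel by whichever of the dilation and erosion
    # results is closer to it; ties keep the original value.
    it = np.where(dil - img < img - ero, dil,
                  np.where(dil - img > img - ero, ero, img))
    return np.abs(it - img)  # Eq. (7): edge map I_e
```

On the worked example of Fig. 6 (center value 56, local maximum 63, local minimum 10), dil − img = 7 is smaller than img − ero = 46, so the pixel becomes 63 and contributes |63 − 56| = 7 to the edge map.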


Fig. 4. Two local image areas with orientation information: (a) initial matrix, (b) matrix rotated 45 degrees clockwise.

Fig. 5. Statistical features extracted from images with different degrees of blur. (a) Original image, (b) blurred image, (c) severely blurred image, and (d) their statistical features.

Fig. 6. Example of locally enhanced image generation using Toggle operator.



Fig. 7. Comparison of different operators' representations of the edge information of (a) a blurred image. (b), (c), (d) and (e) are its edge information represented by the Prewitt operator, Sobel operator, Roberts operator and Toggle operator, respectively.

It may be observed that the Toggle operator manages to highlight subtle structures and extract edge information accurately, while the other three operators are affected by edge diffusion and their filter results are blurrier than the real edge information. We then use the obtained edge information as the weight of the local patterns to optimize the image's structural statistical features, which are defined as

$$h(k) = \frac{\sum_{m=0}^{M-1}\sum_{n=0}^{N-1} I_e(m,n) \cdot f\left(OLP(m,n), k\right)}{\sum_{m=0}^{M-1}\sum_{n=0}^{N-1} I_e(m,n)} \tag{8}$$

$$f(x,y) = \begin{cases} 1, & x = y \\ 0, & \text{otherwise} \end{cases} \tag{9}$$

where $M$ and $N$ are the length and width of the image, and $k \in [0, 2P]$ is the possible local pattern. Fig. 8 shows the difference between the original statistical features and the weighted statistical features. The statistical histogram is more concentrated after optimization. This is because $I_e$ highlights the structure information in edge regions, making the change of the statistical features caused by blur more significant. In addition, the local patterns representing bright or dark spots are removed by the weighting operation, which reduces the feature dimension: for a bright or dark spot, the pixel value of the target spot is the maximum or minimum in its template region, so its Toggle response is always the pixel itself and the weight of the spot is zero according to (7). We set $P$ to 8 and extract only 15 features from a blurred image.

Natural images are multi-scale, and it has been shown that multi-scale processing brings a certain performance improvement for IQA models [3,41]. In our proposed model, we extract the image's statistical features at 5 scales, yielding a 75-dimensional feature vector per image. Finally, a support vector regression (SVR) [42] with a radial basis function kernel is used to build the mapping between image features and subjective quality scores.
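Putting the pieces together, a sketch of the per-image feature pipeline follows, reusing the olp_map and toggle_edge_map helpers from the sketches above; the plain dyadic downsampling used for the 5 scales is our assumption, since the paper does not specify the resampling filter.

```python
import numpy as np

def weighted_olp_histogram(olp, edge, P=8):
    # Eqs. (8)-(9): edge-weighted, normalized histogram of the OLP codes.
    h = np.bincount(olp.ravel(), weights=edge.ravel(), minlength=2 * P + 1)
    h /= max(edge.sum(), np.finfo(float).eps)
    # Bins 0 and 2P (bright/dark spots and flat areas) carry zero weight
    # as argued above, so they are dropped, leaving 15 features per scale.
    return h[1:2 * P]

def multiscale_features(img, scales=5):
    feats = []
    for _ in range(scales):
        feats.append(weighted_olp_histogram(olp_map(img), toggle_edge_map(img)))
        img = img[::2, ::2]  # assumed dyadic downsampling between scales
    return np.concatenate(feats)  # the 75-dimensional feature vector
```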

4. Experimental results

We tested the performance of the proposed image blur assessment model on the blur-distorted images of four public image databases: the LIVE database [43], the TID2013 database [37], the CSIQ database [44] and the VCL database [45]; their details are summarized in Table 1. Moreover, the LIVE MD database [46], which contains multiply distorted images, and the BID database [47], which is composed of real blurred images, were also used for further comparison experiments. We compared our approach against existing IQA models, including seven blind image blur assessment models (CPBD [18], S3 [15], LPC-SI [12], MLV [13], ARISM [8], SPARISH [9] and Zhan's model [11]), three general-purpose NR IQA models (BRISQUE [3], BIQA [48] and SSEQ [49]), and two FR IQA models (PSNR and SSIM [2]). The source codes of all compared algorithms are publicly available. For the blind image blur metrics, we followed the experimental details of their original papers. For the learning-based models (our model and the three general-purpose NR-IQA models), the images were randomly divided into two non-overlapping subsets: 80% of the images were used to train the prediction model and the remaining 20% were used for testing. We repeated this process 1000 times and used four metrics for performance evaluation: the Pearson linear correlation coefficient (PLCC), the Spearman rank-order correlation coefficient (SRCC), the Kendall rank-order correlation coefficient (KRCC) and the root mean squared error (RMSE). For PLCC and RMSE, a 5-parameter logistic function was used to map predicted scores to opinion scores. Higher SRCC, PLCC and KRCC values or lower RMSE values indicate better model performance.

Table 1
Details of the four databases.

Database    Blur image number    Total image number
LIVE        145                  779
TID2013     125                  3000
CSIQ        150                  866
VCL         138                  575
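A sketch of one train-test trial of this protocol is shown below, with scikit-learn's libsvm-backed SVR standing in for the LIBSVM package [42] that the paper uses, and untuned hyperparameters; the initialization of the 5-parameter logistic fit is also our own choice.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import kendalltau, pearsonr, spearmanr
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

def logistic5(s, b1, b2, b3, b4, b5):
    # the standard 5-parameter logistic mapping predictions to opinion scores
    return b1 * (0.5 - 1.0 / (1.0 + np.exp(b2 * (s - b3)))) + b4 * s + b5

def one_trial(X, y, seed):
    # one of the 1000 random 80%/20% train-test trials described above
    Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.2, random_state=seed)
    pred = SVR(kernel='rbf').fit(Xtr, ytr).predict(Xte)
    srcc = spearmanr(pred, yte).correlation
    krcc = kendalltau(pred, yte).correlation
    p0 = [np.max(yte), 1.0, np.mean(pred), 1.0, np.mean(yte)]
    beta, _ = curve_fit(logistic5, pred, yte, p0=p0, maxfev=10000)
    mapped = logistic5(pred, *beta)
    plcc = pearsonr(mapped, yte)[0]
    rmse = float(np.sqrt(np.mean((mapped - yte) ** 2)))
    return srcc, krcc, plcc, rmse
```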


4.1. Performance on four databases

We conducted a performance comparison between our proposed model and several IQA models on the LIVE, TID2013, CSIQ and VCL databases. The results are listed in Table 2. It can be observed that our proposed model achieved the best performance among all the selected models on the LIVE, TID2013 and VCL databases, and achieved competitive performance among all compared models except the FR SSIM algorithm on the CSIQ database. We also show scatter plots of our model's predicted quality scores against opinion scores on the four databases in Fig. 9. Clearly, our model correlates well with human subjective scores.


4.2. Comparison with models using different local patterns

Fig. 10 shows a comparison of the pattern responses extracted from a patch of an image from the TID2013 database by the LBP operator, the LDP operator, the OSVP operator and the orientation-aware local pattern operator. To verify the effectiveness of the orientation-aware local pattern operator in representing image information, we implemented our method with different local pattern operators on the four databases. The results are listed in Table 3. It may be seen that our proposed method performed better with the orientation-aware local pattern operator than with the other three operators.


Fig. 8. Difference between the original statistical features and the weighted statistical features. (a) Blurred image, (b) original and weighted statistical features of (a).

Fig. 9. Scatter plots of the predicted quality scores of the proposed model against opinion scores on (a) the LIVE database, (b) the TID2013 database, (c) the CSIQ database, and (d) the VCL database.

These results confirm the effectiveness of the orientation-aware local pattern operator.

4.3. Contribution of feature optimization

As mentioned before, the Toggle operator is introduced to optimize the structural statistical features because of its capability to highlight subtle structures and edge information. To verify the contribution of this optimization process, Table 4 compares the median SRCC values of our method without and with different edge information for optimization on the four databases, where OLP means the original statistical features are used in our method, and OLP-Prewitt, OLP-Sobel and OLP-Roberts mean the weighted statistical features are optimized by the Prewitt, Sobel and Roberts operators, respectively. It can be seen that edge information contributes to the performance improvement to a certain degree, and the Toggle operator achieves competitive results in feature optimization.


Fig. 10. Representation of (a) an original image by different local patterns: (b) LBP, (c) LDP, (d) OSVP, and (e) orientation-aware local pattern.

Table 2
Performance of different IQA models on the four databases.

LIVE database:
Model      SRCC     PLCC     KRCC     RMSE
PSNR       0.782    0.776    0.585    9.916
SSIM       0.925    0.924    0.754    6.020
BRISQUE    0.954    0.965    0.835    3.968
SSEQ       0.939    0.955    0.808    4.611
BIQA       0.945    0.958    0.813    4.399
CPBD       0.919    0.912    0.765    6.436
S3         0.941    0.953    0.800    4.777
LPC-SI     0.887    0.911    0.715    6.483
MLV        0.932    0.943    0.778    5.235
ARISM      0.952    0.956    0.833    4.423
SPARISH    0.960    0.960    0.833    4.423
Zhan       0.945    0.924    0.797    5.995
Proposed   0.973    0.980    0.877    3.062

TID2013 database:
Model      SRCC     PLCC     KRCC     RMSE
PSNR       0.915    0.915    0.788    0.504
SSIM       0.969    0.889    0.848    0.571
BRISQUE    0.914    0.938    0.773    0.411
SSEQ       0.915    0.937    0.765    0.427
BIQA       0.933    0.954    0.793    0.366
CPBD       0.852    0.862    0.647    0.632
S3         0.861    0.882    0.640    0.589
LPC-SI     0.919    0.916    0.748    0.502
MLV        0.879    0.883    0.681    0.586
ARISM      0.898    0.895    0.715    0.556
SPARISH    0.893    0.901    0.699    0.540
Zhan       0.961    0.938    0.827    0.433
Proposed   0.964    0.972    0.853    0.295

CSIQ database:
Model      SRCC     PLCC     KRCC     RMSE
PSNR       0.925    0.907    0.748    0.121
SSIM       0.960    0.936    0.823    0.101
BRISQUE    0.892    0.915    0.738    0.112
SSEQ       0.854    0.898    0.692    0.123
BIQA       0.903    0.932    0.747    0.102
CPBD       0.863    0.811    0.682    0.168
S3         0.886    0.770    0.702    0.183
LPC-SI     0.902    0.888    0.736    0.132
MLV        0.907    0.875    0.744    0.139
ARISM      0.900    0.853    0.727    0.150
SPARISH    0.886    0.860    0.713    0.146
Zhan       0.934    0.910    0.785    0.120
Proposed   0.936    0.955    0.793    0.086

VCL database:
Model      SRCC     PLCC     KRCC     RMSE
PSNR       0.779    0.779    0.579    15.279
SSIM       0.891    0.804    0.712    14.470
BRISQUE    0.920    0.942    0.762    7.927
SSEQ       0.911    0.933    0.746    8.613
BIQA       0.911    0.933    0.751    8.513
CPBD       0.924    0.929    0.751    9.035
S3         0.852    0.905    0.663    10.380
LPC-SI     0.914    0.750    0.747    16.115
MLV        0.879    0.890    0.709    11.108
ARISM      0.927    0.943    0.757    8.084
SPARISH    0.931    0.940    0.764    8.333
Zhan       0.918    0.930    0.753    8.959
Proposed   0.955    0.966    0.837    6.198

Table 3
Comparison of the proposed model using different local pattern operators on the four databases.

LIVE database:
Operator   SRCC     PLCC     KRCC     RMSE
LBP        0.971    0.981    0.880    3.081
LDP        0.964    0.977    0.862    3.266
OSVP       0.951    0.958    0.837    4.474
Proposed   0.973    0.980    0.877    3.062

TID2013 database:
Operator   SRCC     PLCC     KRCC     RMSE
LBP        0.959    0.968    0.785    0.512
LDP        0.952    0.966    0.831    0.571
OSVP       0.919    0.945    0.769    0.403
Proposed   0.964    0.972    0.853    0.295

CSIQ database:
Operator   SRCC     PLCC     KRCC     RMSE
LBP        0.924    0.945    0.784    0.093
LDP        0.909    0.927    0.738    0.105
OSVP       0.839    0.906    0.669    0.117
Proposed   0.936    0.955    0.793    0.086

VCL database:
Operator   SRCC     PLCC     KRCC     RMSE
LBP        0.941    0.959    0.804    6.933
LDP        0.952    0.966    0.831    6.182
OSVP       0.920    0.928    0.767    8.881
Proposed   0.955    0.966    0.837    6.198

Table 4
Median SRCC of the model with statistical features optimized by different edge information on the four databases.

Method        LIVE     TID2013   CSIQ     VCL      Average
OLP           0.968    0.952     0.922    0.939    0.945
OLP-Prewitt   0.973    0.965     0.930    0.949    0.954
OLP-Sobel     0.970    0.960     0.927    0.946    0.951
OLP-Roberts   0.968    0.948     0.926    0.944    0.946
Proposed      0.972    0.964     0.936    0.955    0.957

4.4. Performance on the LIVE MD database

We also tested the performance of our proposed method on the LIVE MD database. This multi-distortion database contains two parts: Part 1, in which all of the images suffer from both blur and JPEG compression distortions, and Part 2, in which the images suffer from both blur and noise distortions. The results are tabulated in Table 5. It may be seen that all the blind image blur assessment models had a clear performance drop when coping with blurred images affected by another distortion, except our proposed model, which still achieved competitive performance. On the other hand, the performance of our model on Part 1 is slightly inferior to that on Part 2, because the Toggle operator cannot capture blocking artifacts well.

Table 5
Median SRCC of different models on the LIVE MD database.

Model      Part 1   Part 2   ALL
PSNR       0.662    0.709    0.677
SSIM       0.849    0.876    0.860
BRISQUE    0.929    0.902    0.918
SSEQ       0.899    0.893    0.894
BIQA       0.877    0.830    0.879
CPBD       0.438    0.212    0.074
S3         0.580    0.282    0.390
LPC-SI     0.860    0.836    0.849
MLV        0.804    0.455    0.622
ARISM      0.866    0.331    0.203
SPARISH    0.889    0.041    0.372
Zhan       0.677    0.390    0.539
Proposed   0.960    0.966    0.965

4.5. Performance on real blur images

All the above experiments were conducted on databases whose distortions were generated in the laboratory, which cannot simulate the blur found in real images. Thus we also tested the performance of our model on the BID database, which contains 590 real blurred images. The results are shown in Table 6. All the compared image blur metrics had a dramatic drop in performance, except our model, which outperformed all compared models with only a modest performance drop.


Table 6
Median SRCC of different models on the BID database.

Model      SRCC
BRISQUE    0.547
SSEQ       0.531
BIQA       0.529
CPBD       0.018
S3         0.410
LPC-SI     0.205
MLV        0.316
ARISM      0.039
SPARISH    0.304
Zhan       0.058
Proposed   0.585

4.6. Statistical significance

We also conducted a statistical significance test on the four databases. A t-test was used to measure the significance of the difference in SRCC between models across the 1000 train-test trials. The null hypothesis was that the mean SRCC of one model equals that of another at the 95% confidence level; the alternative hypothesis was that the mean SRCCs differ. The results are tabulated in Table 7, where 1, 0 or -1 denotes that our model is statistically superior, equal or inferior to the model in the corresponding column. We also show box plots of the SRCC of the models on the four databases in Fig. 11. Our model is superior to all the compared algorithms on the LIVE, TID2013 and VCL databases, and inferior only to the SSIM approach on the CSIQ database. These results again confirm the performance of our model on the four databases.

Table 7
Results of the t-test between SRCC values of compared models.

          PSNR  SSIM  BRISQUE  SSEQ  BIQA  CPBD  S3  LPC-SI  MLV  ARISM  SPARISH  Zhan
LIVE      1     1     1        1     1     1     1   1       1    1      1        1
TID2013   1     1     1        1     1     1     1   1       1    1      1        1
CSIQ      1     -1    1        1     1     1     1   1       1    1      1        1
VCL       1     1     1        1     1     1     1   1       1    1      1        1

Fig. 11. Box plots of SRCC across 1000 train-test trials on (a) the LIVE database, (b) the TID2013 database, (c) the CSIQ database, and (d) the VCL database.
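The decision rule behind Table 7 is compact enough to state in code; here is a sketch assuming SciPy's two-sample t-test and the 95% confidence level described above.

```python
from scipy.stats import ttest_ind

def significance_code(srcc_ours, srcc_other, alpha=0.05):
    # 1 / 0 / -1 as in Table 7: our model is statistically superior,
    # indistinguishable, or inferior over the 1000-trial SRCC samples.
    t, p = ttest_ind(srcc_ours, srcc_other)
    if p >= alpha:
        return 0
    return 1 if t > 0 else -1
```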

4.7. Database independence

To verify the database independence of our proposed model, we trained it on the LIVE database and then tested it on the TID2013, CSIQ and VCL databases. Two learning-based models (BRISQUE and SSEQ) and four image blur assessment models (CPBD, S3, MLV and SPARISH) were selected for this cross-database experiment. The median SRCC, PLCC, KRCC and RMSE values of the compared algorithms are listed in Table 8. It may be seen that, in most cases, our proposed model achieves better database independence than the other IQA models.

Table 8
Performance of compared IQA models across databases (trained on the LIVE database).

TID2013 database:
Model     SRCC     PLCC     KRCC     RMSE
BRISQUE   0.866    0.866    0.692    0.932
SSEQ      0.836    0.839    0.626    0.681
CPBD      0.852    0.862    0.647    0.632
S3        0.861    0.882    0.640    0.589
MLV       0.879    0.883    0.681    0.586
SPARISH   0.893    0.901    0.699    0.540
Proposed  0.901    0.891    0.703    0.566

CSIQ database:
Model     SRCC     PLCC     KRCC     RMSE
BRISQUE   0.861    0.881    0.679    0.135
SSEQ      0.847    0.865    0.655    0.164
CPBD      0.863    0.811    0.682    0.168
S3        0.886    0.770    0.702    0.183
MLV       0.907    0.875    0.744    0.139
SPARISH   0.886    0.860    0.713    0.146
Proposed  0.890    0.900    0.715    0.127

VCL database:
Model     SRCC     PLCC     KRCC     RMSE
BRISQUE   0.882    0.888    0.683    11.279
SSEQ      0.886    0.882    0.698    11.106
CPBD      0.924    0.929    0.751    9.035
S3        0.852    0.905    0.663    10.380
MLV       0.879    0.890    0.710    11.108
SPARISH   0.931    0.940    0.764    8.333
Proposed  0.918    0.930    0.739    9.014
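The cross-database protocol amounts to fitting the regressor once on LIVE and scoring the remaining databases without retraining. The sketch below uses random placeholder arrays (shapes taken from Table 1) where the real 75-dimensional features and subjective scores would go.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.svm import SVR

rng = np.random.default_rng(0)
# placeholders standing in for real features and subjective scores
live_X, live_y = rng.normal(size=(145, 75)), rng.uniform(0, 100, size=145)
tid_X, tid_y = rng.normal(size=(125, 75)), rng.uniform(0, 9, size=125)

model = SVR(kernel='rbf').fit(live_X, live_y)  # train once on LIVE
print(spearmanr(model.predict(tid_X), tid_y).correlation)  # evaluate on TID2013
```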



4.8. Computational complexity

We tested the time complexity of several models on the LIVE database, using a PC with an i5 3.2 GHz CPU and 16 GB of RAM. The average run time of the feature extraction stage per image for each model is tabulated in Table 9. It may be seen that our proposed model is faster than the S3, ARISM, SPARISH and LPC-SI algorithms, but slower than the CPBD, MLV and Zhan approaches.

Table 9
Time complexity of the compared models.

Model      CPBD     S3       LPC-SI   MLV      ARISM    SPARISH   Zhan     Proposed
Time (s)   0.143    30.031   1.050    0.078    19.466   2.935     0.008    0.266

5. Conclusion

We have presented a blind image blur assessment model based on a novel orientation-aware local pattern operator. The orientation-aware local pattern operator simulates the orientation selectivity mechanism with image gradient orientation and is used to extract image structure information. A Toggle operator is also deployed to optimize the generated statistical features and further boost performance. Our image blur metric was thoroughly tested on several image databases and shown to achieve competitive performance. In the future we will further improve the proposed operator and explore its efficacy in image processing systems.

Acknowledgments

This work is supported by the National Natural Science Foundation of China under grant 61672095 and grant 61425013.

References

[1] A.C. Bovik, Automatic prediction of perceptual image and video quality, Proc. IEEE 101 (9) (2013) 2008–2024.
[2] Z. Wang, A.C. Bovik, H.R. Sheikh, E.P. Simoncelli, Image quality assessment: From error visibility to structural similarity, IEEE Trans. Image Process. 13 (4) (2004) 600–612.
[3] A. Mittal, A.K. Moorthy, A.C. Bovik, No-reference image quality assessment in the spatial domain, IEEE Trans. Image Process. 21 (12) (2012) 4695–4708.
[4] Y. Zhou, L. Li, J. Wu, K. Gu, W. Dong, G. Shi, Blind quality index for multiply distorted images using biorder structure degradation and nonlocal statistics, IEEE Trans. Multimed. 20 (11) (2018) 3019–3032.
[5] K. Gu, W. Lin, G. Zhai, X. Yang, W. Zhang, C. Chen, No-reference quality metric of contrast-distorted images based on information maximization, IEEE Trans. Cybern. 47 (12) (2017) 4559–4565.
[6] Y. Fang, J. Yan, L. Li, J. Wu, W. Lin, No reference quality assessment for screen content images with both local and global feature representation, IEEE Trans. Image Process. 27 (4) (2018) 1600–1610.
[7] G. Gvozden, S. Grgic, M. Grgic, Blind image sharpness assessment based on local contrast map statistics, J. Vis. Commun. Image Represent. 50 (2018) 145–158.
[8] K. Gu, G. Zhai, W. Lin, X. Yang, W. Zhang, No-reference image sharpness assessment in autoregressive parameter space, IEEE Trans. Image Process. 24 (10) (2015) 3218–3231.
[9] L. Li, D. Wu, J. Wu, H. Li, W. Lin, A.C. Kot, Image sharpness assessment by sparse representation, IEEE Trans. Multimed. 18 (6) (2016) 1085–1097.
[10] G. Yue, C. Hou, K. Gu, N. Ling, No reference image blurriness assessment with local binary patterns, J. Vis. Commun. Image Represent. 49 (2017) 382–391.
[11] Y. Zhan, R. Zhang, No-reference image sharpness assessment based on maximum gradient and variability of gradients, IEEE Trans. Multimed. 20 (7) (2018) 1796–1808.
[12] R. Hassen, Z. Wang, M.M.A. Salama, Image sharpness assessment based on local phase coherence, IEEE Trans. Image Process. 22 (7) (2013) 2798–2810.
[13] K. Bahrami, A.C. Kot, A fast approach for no-reference image sharpness assessment based on maximum local variation, IEEE Signal Process. Lett. 21 (6) (2014) 751–755.
[14] L. Li, W. Xia, W. Lin, Y. Fang, S. Wang, No-reference and robust image sharpness evaluation based on multiscale spatial and spectral features, IEEE Trans. Multimed. 19 (5) (2017) 1030–1040.
[15] C.T. Vu, T.D. Phan, D.M. Chandler, S3: A spectral and spatial measure of local perceived sharpness in natural images, IEEE Trans. Image Process. 21 (3) (2012) 934–945.
[16] Q. Sang, H. Qi, X. Jun, C. Li, A.C. Bovik, No-reference image blur index based on singular value curve, J. Vis. Commun. Image Represent. 25 (7) (2014) 1625–1630.
[17] L. Ding, H. Huang, Y. Zang, Image quality assessment using directional anisotropy structure measurement, IEEE Trans. Image Process. 26 (4) (2017) 1799–1809.
[18] N.D. Narvekar, L.J. Karam, A no-reference blur metric based on the cumulative probability of blur detection (CPBD), IEEE Trans. Image Process. 20 (9) (2011) 2678–2683.
[19] S. Wang, C. Deng, B. Zhao, G. Huang, B. Wang, Gradient-based no-reference image blur assessment using extreme learning machine, Neurocomputing 63 (2018) 124–138.
[20] T. Dai, K. Gu, L. Niu, Y. Zhang, W. Lu, S. Xia, Referenceless quality metric of multiply-distorted images based on structural degradation, Neurocomputing 290 (2018) 185–195.
[21] M. Oszust, Local feature descriptor and derivative filters for blind image quality assessment, IEEE Signal Process. Lett. 26 (2) (2019) 322–326.
[22] J. Wu, W. Lin, G. Shi, L. Li, Y. Fang, Orientation selectivity based visual pattern for reduced-reference image quality assessment, Inform. Sci. 351 (2016) 18–29.
[23] T. Ojala, M. Pietikäinen, T. Mäenpää, Multiresolution gray scale and rotation invariant texture classification with local binary patterns, IEEE Trans. Pattern Anal. Mach. Intell. 24 (7) (2002) 971–987.
[24] B. Zhang, Y. Gao, S. Zhao, J. Liu, Local derivative pattern versus local binary pattern: Face recognition with high-order local pattern descriptor, IEEE Trans. Image Process. 19 (2) (2010) 533–545.
[25] D.H. Hubel, T.N. Wiesel, Receptive fields, binocular interaction and functional architecture in the cat's visual cortex, J. Physiol. Lond. 160 (1) (1962) 106–154.
[26] D.H. Hubel, T.N. Wiesel, Receptive fields and functional architecture of monkey striate cortex, J. Physiol. Lond. 195 (1) (1968) 215–243.
[27] N. Dalal, B. Triggs, Histograms of oriented gradients for human detection, in: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2005, pp. 886–893.
[28] L. Liu, Y. Hua, Q. Zhao, H. Huang, A.C. Bovik, Blind image quality assessment by relative gradient statistics and adaboosting neural network, Signal Process., Image Commun. 40 (2016) 1–15.
[29] R.J. Mansfield, Neural basis of orientation perception in primate vision, Science 186 (4169) (1974) 1133–1135.
[30] L. Maffei, F.W. Campbell, Neurophysiological localization of the vertical and horizontal visual coordinates in man, Science 167 (3917) (1970) 386–387.
[31] J. Wu, Y. Liu, L. Li, G. Shi, Attended visual content degradation based reduced reference image quality assessment, IEEE Access 6 (2018) 12493–12504.
[32] P. Soille, Morphological Image Analysis: Principles and Applications, Springer, Germany, 2003.
[33] F.S. Marvasti, M.R. Mosavi, M. Nasiri, Flying small target detection in IR images based on adaptive toggle operator, IET Comput. Vis. 12 (4) (2018) 527–534.
[34] X. Bai, F. Zhou, B. Xue, Edge preserved image fusion based on multiscale toggle contrast operator, Image Vis. Comput. 29 (12) (2011) 829–839.
[35] P.H. Schiller, C.E. Carvey, The Hermann grid illusion revisited, Perception 34 (11) (2005) 1375–1397.
[36] J. Geier, L. Bernáth, M. Hudák, L. Séra, Straightness as the main factor of the Hermann grid illusion, Perception 37 (5) (2008) 651–665.
[37] N. Ponomarenko, L. Jin, O. Ieremeiev, V. Lukin, K. Egiazarian, J. Astola, B. Vozel, K. Chehdi, M. Carli, F. Battisti, C.C.J. Kuo, Image database TID2013: Peculiarities, results and perspectives, Signal Process., Image Commun. 30 (2015) 57–77.
[38] T. Ojala, M. Pietikäinen, D. Harwood, A comparative study of texture measures with classification based on feature distributions, Pattern Recognit. 29 (1) (1996) 51–59.
[39] X. Hong, G. Zhao, M. Pietikäinen, X. Chen, Combining LBP difference and feature correlation for texture description, IEEE Trans. Image Process. 23 (6) (2014) 2557–2568.
[40] Y. Zhao, D. Huang, W. Jia, Completed local binary count for rotation invariant texture classification, IEEE Trans. Image Process. 21 (10) (2012) 4492–4497.
[41] L. Liu, B. Liu, C. Su, H. Huang, A.C. Bovik, Binocular spatial activity and reverse saliency driven no-reference stereopair quality assessment, Signal Process., Image Commun. 58 (2017) 287–299.
[42] C. Chang, C. Lin, LIBSVM: A library for support vector machines, 2018. Available from: https://www.csie.ntu.edu.tw/~cjlin/libsvm/.
[43] H.R. Sheikh, Z. Wang, L. Cormack, A.C. Bovik, LIVE image quality assessment database release 2, 2006. Available from: http://live.ece.utexas.edu/research/quality.
[44] E.C. Larson, D.M. Chandler, Most apparent distortion: Full-reference image quality assessment and the role of strategy, J. Electron. Imaging 19 (1) (2010).
[45] A. Zaric, N. Tatalovic, N. Brajkovic, H. Hlevnjak, M. Loncaric, E. Dumic, S. Grgic, VCL@FER image quality assessment database, Automatika 53 (4) (2012) 344–354.
[46] D. Jayaraman, A. Mittal, A.K. Moorthy, A.C. Bovik, Objective quality assessment of multiply distorted images, in: Conference Record of the Forty-Sixth Asilomar Conference on Signals, Systems and Computers, IEEE, 2012, pp. 1693–1697.
[47] BID: Blurred image database [Online]. Available from: http://www.lps.ufrj.br/profs/eduardo/ImageDatabase.htm.
[48] W. Xue, X. Mou, L. Zhang, A.C. Bovik, X. Feng, Blind image quality prediction using joint statistics of gradient magnitude and Laplacian features, IEEE Trans. Image Process. 23 (11) (2014) 4850–4862.
[49] L. Liu, B. Liu, H. Huang, A.C. Bovik, No-reference image quality assessment based on spatial and spectral entropies, Signal Process., Image Commun. 29 (8) (2014) 856–863.