
Perceptual Objective Quality Assessment of Stereoscopic Stitched Images

Weiqing Yan (a,*), Guanghui Yue (b), Yuming Fang (c), Hua Chen (d), Chang Tang (e), Gangyi Jiang (d)

(a) School of Computer and Control Engineering, Yantai University, Yantai 264005, China.
(b) School of Biomedical Engineering, Shenzhen University, Shenzhen 518060, China.
(c) School of Information Technology, Jiangxi University of Finance and Economics, Nanchang 330032, China.
(d) Faculty of Information Science and Engineering, Ningbo University, Ningbo 315211, China.
(e) School of Computer Science, China University of Geosciences, Wuhan 430074, China.

*Corresponding authors: Weiqing Yan, Guanghui Yue. Email addresses: [email protected] (Weiqing Yan), [email protected] (Guanghui Yue).

Abstract

Large view stereoscopic images can provide users with an immersive depth experience. Image stitching techniques aim to obtain large view stitched images, and various image stitching algorithms have been proposed recently. However, there is still no effective objective quality assessment for stereoscopic stitched images. In this paper, we propose a new perceptual objective stereoscopic stitched image quality assessment (S-SIQA) method by considering the different distortion types produced by existing stitching methods, including color distortion, ghost distortion, structure distortion (shape distortion and information loss), and disparity distortion. The quality evaluation methods for these distortion types are designed using the color difference coefficient, point distance, matched line inclination degree, information loss, and

disparity difference. Then we fuse these measures into the proposed S-SIQA model by an optimally weighted linear combination. In addition, to evaluate the performance of the proposed S-SIQA, we build a subjective quality assessment database for stereoscopic stitched images. Experimental results confirm that the proposed method can effectively measure the perceptual quality of stereoscopic stitched images.

Keywords: Stereoscopic image, quality assessment, stitched image, image stitching

1. Introduction

Virtual reality (VR) technologies can provide an immersive large-view experience. They are widely applied in the fields of film and entertainment, industrial design, military medical simulation, and so on [1]. However, due to the limitations of imaging equipment, it is difficult to capture a large-view image in a single camera shot. Image stitching can be used to construct a large-view image from multiple captured viewpoint images, and it has been widely studied in the fields of computer vision and graphics [2].

Image stitching methods combine images of adjacent views that differ in color and view angle. If these differences are not eliminated, color inconsistency, ghosting, or misalignment will appear in the stitched images. Therefore, how to eliminate these differences during the stitching process has been much investigated in the related research area. Some studies eliminate color differences by using the overexposure of the camera itself or color correction algorithms [3, 4]. Eliminating ghosting or misalignment requires a transformation model. An early method

tries to use a simple transformation model (affine or projective transformation) to align images when the visual scene is planar or the views differ purely by rotation [2]. For complex scenes, simple transformation models cannot be used, as they cause ghosting. To align images and eliminate ghosting, some studies [8–11] propose local fine-tuning methods. However, methods designed around a global transformation can cause shape distortion in non-overlapping image regions. To reduce this distortion, some methods [12–17] apply a transition from a perspective transformation to a similarity transformation to preserve image shape in non-overlapping regions.

Many stitching methods have been proposed to obtain high-quality stitched images. However, it is difficult for a single stitching method to work well on every image; in other words, different stitching methods may cause different types of distortion. Hence, it is important to design effective stitched image quality assessment (SIQA) methods to evaluate stitched images and to choose a suitable stitching method that yields satisfactory results. Traditional image quality assessment methods [5–7] do not consider the characteristics of stitched images and cannot predict the visual quality of stitched images accurately. Existing SIQA work mainly focuses on subjective evaluation, while there are only a few objective SIQA methods [18–20] for 2D stitched images. These methods are good at assessing the ghost distortion of 2D stitched images obtained by conventional perspective transformation models. However, for other distortion types and for stereoscopic stitched images, current stitched image quality assessment methods are still not good enough, and it is highly desirable to design an


effective stereoscopic stitched image quality assessment method.

In this paper, we propose an objective stereoscopic stitched image quality assessment method, named stereoscopic SIQA (S-SIQA). The contributions of this paper are summarized as follows: 1) we propose a novel objective assessment model for stereoscopic stitched images based on different perceptual distortion types; to the best of our knowledge, this is the first attempt to investigate objective quality assessment of stereoscopic stitched images; 2) we investigate five types of distortion arising from existing stitching methods, including ghost distortion, color distortion, shape distortion, information loss, and disparity distortion; based on these distortions, we design measurement methods for each distortion using pruned feature points, matched line detection, and the saliency map between the original stereoscopic image and its stitched version; 3) we create a subjective quality assessment database composed of 30 samples, each of which includes two input stereoscopic images, three stitched results from three representative stitching methods for each input stereoscopic image, and anaglyph stereoscopic images (16 images in total per sample), to provide a test platform for the proposed algorithm.

The rest of this paper is organized as follows. Section 2 gives a brief overview of existing image stitching methods as well as stitched image quality assessment methods. Section 3 describes the details of the proposed S-SIQA. The experimental results are shown in Section 4. The final section concludes this paper.


2. Related Work

2.1. Image stitching methods

Stereoscopic images with a large view can bring immersive visual experiences, and they have recently attracted a large amount of attention in the research community. The stitching process can be broadly divided into two parts [2]: i) photometric correction, and ii) geometric alignment and blending. Photometric correction is performed to remove the differences in color and intensity among the images being stitched; these differences are caused by heterogeneous imaging hardware or environmental conditions among the capturing cameras. Geometric alignment may result in ghosting [2, 8–10], structure error [29, 30, 32, 33], and vertical disparity distortion [33, 34]. Structure error occurs when the stitched image is stretched severely and non-uniformly in the non-overlapping region, and includes shape distortion [17] and information loss. To overcome these problems, some advanced stitching algorithms have been proposed recently.

For geometric alignment, global parametric warps, such as the projective transform, can cause misalignment and ghosting. To improve alignment quality and eliminate ghosting, several local warping models have been proposed, such as smoothly varying affine (SVA) warping [8] and as-projective-as-possible (APAP) warping [9]. These methods adopt multiple local parametric warps to obtain more accurate alignment results. For shape distortion, several methods attempt to address the issue. Chang et al. [12] proposed a shape-preserving half-projective (SPHP) warping based on a spatial combination of a projective transformation and a similarity transformation. Lin et al. [15] designed an adaptive as-natural-as-possible

warping method to address the problem of unnatural rotation. For photometric inconsistency, Yao et al. [3] proposed a color matching approach that applies gamma correction to the luminance component and linear correction to the chrominance components of the source images. For stereoscopic inconsistency [21, 22], Zhang et al. [21] proposed a stereoscopic image stitching method that first stitches the left views of the input stereoscopic images and stitches the input disparity maps, and then warps and stitches the right views constrained by the stitched left image and the target disparity map to keep the disparity consistent. Yan et al. [22] proposed a stereoscopic image stitching method based on a hybrid warping model, which uses a projective warp to pre-warp the left and right view images, and then uses content-preserving warping to fine-tune the images in order to guarantee stereoscopic consistency.

In the image stitching methods introduced above, the quality of stitched images is measured by traditional image quality assessment methods or by subjective assessment, for example, the RMSE (root mean squared error) over a set of matched feature points between the reference image and the warped image [8–11]. However, RMSE does not consider the characteristics of the human visual system and cannot predict the visual quality of stitched images accurately. Some image stitching methods focusing on photometric inconsistency and shape distortion [12–17, 40] use subjective quality evaluation to assess the quality of stitched images. Subjective quality assessment can provide accurate results for the quality of stitched images, but it is time-consuming, laborious, and accordingly unsuitable for online applications.


2.2. Stitched image quality assessment methods

Currently, several objective quality assessment methods have been proposed for stitched images [18, 19, 23–26]. Most of these methods focus on measuring either color distortion or ghost distortion, and rarely consider the other distortions existing in stitched images. In [23] and [24], the authors proposed visual quality metrics focusing on color correction and intensity consistency, where ghosting distortion cannot be measured. Leorin et al. [25] designed a quality assessment method for panoramic video from an omnidirectional multi-camera system, where the structural similarity index (SSIM) [28] is used for quality assessment and a luminance-based metric is introduced to measure color distortion. In addition, they pay more attention to measuring blockiness and blur distortion in video sequences. In [26], the gradient of the intensity difference between the stitched and reference images is adopted to assess ghost distortion; however, the comparison experiments are conducted on only six stitched examples. Qureshi et al. [18] tried to quantify the ghost error by computing the SSIM of the high-frequency information of the difference between the stitched and input images in the overlapping region. However, since the input images used for testing are directly cropped from the reference image and have no perspective variations, the effectiveness of the method is unproven. The visual saliency induced index (VSI) [27] is an effective quality assessment metric designed by combining visual saliency, edge similarity, and chrominance consistency. Based on VSI, Yang et al. [19] designed a quality assessment metric specifically for stitched images by combining a perceptual geometric error measurement and a local structure-guided measurement. For the geometric error

metric, the local variance of the energy of the optical flow field between the stitched and original images is computed. This metric is suitable for stitched images obtained by a traditional global transformation model. For stitched images obtained by local fine-tuning stitching methods, the local variance is inherently inconsistent, so their geometric error metric takes a high value. In addition, shape distortion and information loss occur in existing stitching methods, and they are not considered in Yang's method. Moreover, all of these methods are designed for 2D images.

Compared with 2D images, stereoscopic images deliver an additional dimension of depth information. Stereoscopic image quality assessment approaches [33–35, 41–46] have been extensively studied; these methods address imaging, coding, delivery, and display. In this paper, we focus on stereoscopic stitched image quality assessment and propose an objective S-SIQA method by considering different types of perceptual visual distortion in existing stitching methods. Color distortion is measured based on transferring the luminance and chrominance of the input images to the stitched images; ghost distortion is measured by pruned SIFT feature points [36]; shape distortion is measured by the angle difference of corresponding lines between the input and stitched images; information loss in stitched images is estimated from the image saliency values and the region loss matrix of the original image. To test the effectiveness of the proposed method, we build a database of stitched images produced by different methods and conduct subjective experiments. The experimental results confirm that the proposed method can effectively measure the perceptual quality of stitched images.


3. The proposed method

The proposed method aims to systematically address the distortion problems of stitched images. Fig. 1 shows the outline of the proposed method. The reference images are the two originally captured stereoscopic images; together with the stitched stereoscopic image, they serve as the inputs. From these images, we detect SIFT feature points and prune them for each metric; matched lines between the input and stitched images are obtained from matched feature points; and the saliency map is computed with the method in [39]. With the feature points, corresponding lines, and saliency map, we design different objective quality assessment metrics, including a color distortion metric ($Q_c$), a ghost distortion metric ($Q_g$), a shape distortion metric ($Q_s$), an information loss metric ($Q_i$), and a disparity distortion metric ($Q_d$). Finally, these metrics are fused into an objective assessment model by an optimally weighted linear combination for the visual quality assessment of stereoscopic stitched images.

Fig. 1: The outline of the proposed method.
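All of the metrics below rely on matched feature points between image pairs. As a hedged illustration only (the paper does not specify the matching pipeline), the following Python sketch shows how such matched point sets could be obtained with OpenCV SIFT and a ratio test; depending on the OpenCV build, SIFT may reside in the contrib package, and the ratio threshold here is an assumption.

import cv2
import numpy as np

def matched_points(img_a, img_b, ratio=0.75):
    # Detect SIFT keypoints and return matched point coordinates between two
    # images (e.g., the pair (L1, L) used to form the set F_{L1 L}).
    sift = cv2.SIFT_create()
    kp_a, des_a = sift.detectAndCompute(img_a, None)
    kp_b, des_b = sift.detectAndCompute(img_b, None)
    matcher = cv2.BFMatcher()
    knn = matcher.knnMatch(des_a, des_b, k=2)
    good = [m for m, n in knn if m.distance < ratio * n.distance]  # Lowe's ratio test
    pts_a = np.float32([kp_a[m.queryIdx].pt for m in good])
    pts_b = np.float32([kp_b[m.trainIdx].pt for m in good])
    return pts_a, pts_b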

In this paper, the first left and right view images are denoted as $L_1$ and $R_1$, and the second left and right view images are denoted as $L_2$ and $R_2$; the stitched

left view and right view images are denoted as $L$ and $R$. In our paper, $f_i^{L_1}$ and $f_i^{L_2}$ denote feature points [36] from the images $L_1$ and $L_2$, respectively. $F_{L_1L}$, $F_{L_2L}$, $F_{L_1L_2}$, and $F_{LR}$ denote the matched feature point sets from the image pairs $(L_1, L)$, $(L_2, L)$, $(L_1, L_2)$, and $(L, R)$. They are defined as follows:
$$F_{L_1L} = \{(f_i^{L_1}, f_i^{L}) \mid f_i^{L_1} \in L_1, f_i^{L} \in L, i = 1,\ldots,n_1\},$$
$$F_{L_2L} = \{(f_i^{L_2}, f_i^{L}) \mid f_i^{L_2} \in L_2, f_i^{L} \in L, i = 1,\ldots,n_2\},$$
$$F_{L_1L_2} = \{(f_i^{L_1}, f_i^{L_2}) \mid f_i^{L_1} \in L_1, f_i^{L_2} \in L_2, i = 1,\ldots,n_3\},$$
$$F_{LR} = \{(f_i^{L}, f_i^{R}) \mid f_i^{L} \in L, f_i^{R} \in R, i = 1,\ldots,n_4\},$$
where $n_1$, $n_2$, $n_3$, and $n_4$ denote the numbers of feature points.

3.1. Image color distortion

The color distortion in a stitched image is caused by differences in both luminance and chrominance in the overlapping region of the input images. Therefore, the proposed color distortion metric is defined by the luminance and chrominance distortions. In [3], it is pointed out that the luminance of the overlap of two images can be matched by gamma correction in a logarithmic domain, while the chromaticity can be corrected linearly. The authors compute the ratio of the luminance of the two image overlaps in a logarithmic domain and the ratio of the chromaticity, and then correct the color of the image according to these two ratios; each ratio is called a difference coefficient. For a stitched image without color distortion, the difference coefficients on the overlapping area and the non-overlapping area should be consistent; for a stitched image with color distortion, they are inconsistent. Inspired by [3], the ratios of the luminance (chrominance) of the stitched image $L$ to the luminance (chrominance) of $L_1$ or $L_2$ are used to

measure the luminance (chrominance) distortion; these ratios are denoted $\gamma$ (for luminance) and $\alpha$ (for chrominance). Next, we describe the luminance and chrominance distortion measurements with reference to Fig. 2.

Fig. 2: Region division of $L_1$, $L_2$, and $L$. The image $L$ is stitched from image $L_1$ and image $L_2$. $L_1$ is divided into the overlapping region $L_{1O}$ and the non-overlapping region $L_{1NO}$. $L_2$ is divided into two regions, $L_{2O}$ and $L_{2NO}$. The stitched image is divided into three regions, $L_O$, $L'_{1NO}$, and $L'_{2NO}$.

As shown in Fig. 2, we can obtain the luminance difference coefficients as follows:
$$\gamma_1 = \frac{A_{L_O}}{A_{L_{1O}}}, \quad \gamma_2 = \frac{A_{L'_{1NO}}}{A_{L_{1NO}}}, \quad \gamma_3 = \frac{A_{L_O}}{A_{L_{2O}}}, \quad \gamma_4 = \frac{A_{L'_{2NO}}}{A_{L_{2NO}}} \qquad (1)$$
where $A$ denotes the average luminance of an image region in a logarithmic domain, and the subscript of $A$ denotes the region. For example, $A_{L_O}$ denotes the average luminance value of the image region $L_O$:
$$A_{L_O} = \log\Big(\frac{1}{n}\sum_{p \in L_O} Y_L(p)\Big) \qquad (2)$$
where $Y$ denotes the luminance component of the image and $n$ denotes the number of pixels in region $L_O$. The terms $A_{L_{1O}}$, $A_{L_{1NO}}$, $A_{L'_{1NO}}$, $A_{L_{2O}}$, $A_{L_{2NO}}$, and $A_{L'_{2NO}}$ are defined similarly. As mentioned above, if the stitched image $L$ is without luminance distortion, then $\gamma_1 = \gamma_2$ and $\gamma_3 = \gamma_4$. Otherwise, they are unequal, that is, $\|A_{L'_{1NO}} - \gamma_1 A_{L_{1NO}}\| \neq 0$ or $\|A_{L'_{2NO}} - \gamma_3 A_{L_{2NO}}\| \neq 0$. Therefore, we use the following equation to measure the luminance distortion:
$$Q_{LD} = \big\|A_{L'_{1NO}} - \gamma_1 A_{L_{1NO}}\big\| + \big\|A_{L'_{2NO}} - \gamma_3 A_{L_{2NO}}\big\| \qquad (3)$$
Similarly, to measure the chrominance distortion, we calculate the chrominance difference of the non-overlapping regions using the difference coefficients $\alpha_1$ and $\alpha_3$:
$$Q_{CD} = \big\|B_{L'_{1NO}} - \alpha_1 B_{L_{1NO}}\big\| + \big\|B_{L'_{2NO}} - \alpha_3 B_{L_{2NO}}\big\| \qquad (4)$$
where $\alpha_1 = B_{L_O}/B_{L_{1O}}$, $\alpha_3 = B_{L_O}/B_{L_{2O}}$, and $B$ denotes the average chrominance of an image region, with the subscript of $B$ denoting the region. For example, $B_{L_O}$ denotes the average chrominance value of the image region $L_O$, expressed as $B_{L_O} = \frac{1}{n}\sum_{p \in L_O} C_L(p)$, where $C$ denotes the chrominance value of the image. The terms $B_{L_{1O}}$, $B_{L_{1NO}}$, $B_{L'_{1NO}}$, $B_{L_{2O}}$, $B_{L_{2NO}}$, and $B_{L'_{2NO}}$ are defined similarly.

The color distortion measurement of the left view image is defined as the sum of the luminance and chrominance distortions:
$$Q_{cL} = Q_{LD} + Q_{CD} \qquad (5)$$
Similarly, the color distortion measurement of the right view image $Q_{cR}$ is defined by the same equation. For a stereoscopic image, in addition to the color distortion within each image, the color balance between the left view stitched image $L$ and the right view stitched image $R$ needs to be considered. We define the color difference between the two images in terms of luminance and chrominance:
$$Q_{cLR} = \|A_L - A_R\| + \|B_L - B_R\| \qquad (6)$$
where $A_L$ and $A_R$ denote the average luminance of the left view and right view images, and $B_L$ and $B_R$ denote their average chrominance. The image color distortion metric (CDM) of the stereoscopic stitched image is then denoted as:
$$Q_c = Q_{cL} + Q_{cR} + Q_{cLR} \qquad (7)$$
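To make Eqs. (1)-(3) concrete, the sketch below computes the luminance part of the color distortion with NumPy; the chrominance term of Eq. (4) follows the same pattern using the chrominance channel and the alpha coefficients. The mask-based region handling and helper names are illustrative assumptions, not the authors' implementation.

import numpy as np

def log_mean_luminance(Y, mask):
    # A_region of Eq. (2): log of the mean luminance over the masked region.
    return np.log(Y[mask].mean())

def luminance_distortion(Y1, Y2, Y_stitched, masks):
    # Q_LD of Eq. (3). `masks` holds boolean masks for the regions of Fig. 2:
    # L1O, L1NO, L2O, L2NO on the inputs and LO, L1NO_s, L2NO_s on the stitched image.
    A_LO = log_mean_luminance(Y_stitched, masks["LO"])
    A_L1O = log_mean_luminance(Y1, masks["L1O"])
    A_L1NO = log_mean_luminance(Y1, masks["L1NO"])
    A_L1NO_s = log_mean_luminance(Y_stitched, masks["L1NO_s"])
    A_L2O = log_mean_luminance(Y2, masks["L2O"])
    A_L2NO = log_mean_luminance(Y2, masks["L2NO"])
    A_L2NO_s = log_mean_luminance(Y_stitched, masks["L2NO_s"])
    gamma1 = A_LO / A_L1O                     # Eq. (1), overlap coefficient for L1
    gamma3 = A_LO / A_L2O                     # Eq. (1), overlap coefficient for L2
    return abs(A_L1NO_s - gamma1 * A_L1NO) + abs(A_L2NO_s - gamma3 * A_L2NO)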

3.2. Image ghost distortion

Image ghost distortion results from misalignment during image stitching: an object appears twice at different positions in the stitched image. Intuitively, the longer the distance between the ghosted objects, the greater the ghost distortion; in particular, when the distance is zero, there is no ghost distortion. Therefore, we use the distance between ghosted objects to measure the ghost distortion. However, it is difficult to calculate this distance directly in the stitched image. Since the ghosted object is generated from different input images, we can use the displacement, in the stitched image, between the corresponding positions of feature points taken from the overlapping region of the input images. In Fig. 3, $f_i^{L_1}$ and $f_i^{L_2}$ are a matched feature point pair from images $L_1$ and $L_2$, and their corresponding points in the stitched image $L$ are $f_{iL}^{1}$ and $f_{iL}^{2}$, respectively. The difference between $f_{iL}^{1}$ and $f_{iL}^{2}$ is used to measure the image ghost distortion.

Fig. 3: The feature points used for calculating ghost distortion.

Since the sets $F_{L_1L}$ and $F_{L_2L}$, defined at the beginning of Section 3, include feature points in the non-overlapping region (see Fig. 4(a)), we obtain the desired feature point pairs (i.e., the aforementioned points $f_{iL}^{1}$ and $f_{iL}^{2}$) by a pruning method. The pruning steps are as follows: 1) $F_{L_1L_2}$ and $F_{L_1L}$ are pruned by intersecting their common feature points $f_i^{L_1}$; the pruned set of $F_{L_1L_2}$ is denoted as $F'_{L_1L_2}$. 2) The sets $F_{L_2L}$ and $F'_{L_1L_2}$ are pruned by the common feature points $f_i^{L_2}$; the pruned sets are denoted as $F'_{L_2L}$ and $F''_{L_1L_2}$. 3) The pruned set $F'_{L_1L}$ of $F_{L_1L}$ is obtained by intersecting $F''_{L_1L_2}$ and $F_{L_1L}$. The pruned matched feature point sets $F'_{L_1L}$ and $F'_{L_2L}$ are denoted as:
$$F'_{L_1L} = \{(f_{iL_1}, f_{iL}^{1}) \mid f_{iL_1} \in L_{1O}, f_{iL}^{1} \in L, i = 1,\ldots,n'_3\}$$
$$F'_{L_2L} = \{(f_{iL_2}, f_{iL}^{2}) \mid f_{iL_2} \in L_{2O}, f_{iL}^{2} \in L, i = 1,\ldots,n'_4\} \qquad (8)$$
where $f_{iL_1}$ and $f_{iL_2}$ are the corresponding feature points in the overlapping region between the images $L_1$ and $L_2$; $f_{iL}^{1}$ is the corresponding feature point of $f_{iL_1}$ in the stitched image; and $f_{iL}^{2}$ is the corresponding feature point of $f_{iL_2}$ in the stitched image. Fig. 4 shows a result of the pruned matched feature points.

The distance between $f_{iL}^{1}$ and $f_{iL}^{2}$, i.e., the ghost distortion

metric, is described as:
$$Q_{gL} = \frac{1}{N_1}\sum_{i=1}^{N_1} \big\|f_{iL}^{1} - f_{iL}^{2}\big\|_2^2 \qquad (9)$$
Similarly, the ghost distortion of the right view image is denoted as $Q_{gR}$. The ghost distortion metric (GDM) of the stereoscopic panorama image $Q_g$ is expressed as:
$$Q_g = Q_{gL} + Q_{gR} \qquad (10)$$

Fig. 4: The results of pruned feature points: (a) original matched feature point sets; (b) pruned matched feature point sets in the overlapping region. Note: matched feature points connected by red lines are from the set $F_{L_1L}$, those connected by blue lines are from the set $F_{L_2L}$, and feature points connected by green lines are from the set $F_{L_1L_2}$.
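Assuming the pruned correspondences of Eq. (8) are available as two aligned coordinate arrays (the stitched-image positions of each feature taken from $L_1$ and from $L_2$), Eq. (9) reduces to a mean squared point-to-point distance. A minimal NumPy sketch, with hypothetical argument names:

import numpy as np

def ghost_distortion(pts_from_L1, pts_from_L2):
    # Q_gL of Eq. (9): mean squared Euclidean distance between the two
    # stitched-image positions of each pruned correspondence.
    # Both inputs are (N1, 2) arrays; row i holds f^1_iL and f^2_iL.
    d = np.linalg.norm(np.asarray(pts_from_L1) - np.asarray(pts_from_L2), axis=1)
    return float(np.mean(d ** 2))

# Q_g of Eq. (10) is then the sum of the left-view and right-view values.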

3.3. Image shape distortion

As previously mentioned, image stitching is typically solved by finding global parametric warps that bring images into alignment [15]. Popular global warping methods include affine and projective ones. However, these warping methods can lead to shape distortion; as shown in Fig. 5, an inclination distortion occurs on the building in the stitched image. Since humans are sensitive to straight lines, we use the angle difference of corresponding lines to measure image shape distortion. In this paper, we detect lines in the input images and the stitched image with the LSD algorithm [37], a line segment detector. Let $l$ denote the set of lines in images $L_1$ and $L_2$, and $l'$ the set of lines in the stitched image $L$. $l_k$ denotes the $k$th line from the set $l$, and $l'_k$ from the set $l'$ denotes its corresponding line in the stitched image. Next, we describe the process of finding the corresponding line pairs $(l_k, l'_k)$.

Corresponding lines are detected via matched feature points. The set of feature points near the $k$th line $l_k$ is denoted as:
$$F_{l_k} = \big\{p \mid d(f_{l_k}, p) < \delta\big\} \qquad (11)$$
where $p$ is a feature point in the input image; $f_{l_k}$ is the function of line $l_k$; $d(f_{l_k}, p)$ denotes the distance from the point $p$ to the $k$th line $l_k$ ($l_k \in l$); and $\delta$ is the distance threshold from $p$ to the line $l_k$, set to 3 empirically in our experiments. From the points $p$ in $F_{l_k}$, we can determine their matched points $p'$ and thus the corresponding feature point set $F'_{l_k}$ of $F_{l_k}$. Using $p'$ and $F'_{l_k}$, we calculate the distance from each $p'$ to every line $l'_j$ ($l'_j \in l'$, $j = 1,\ldots,N_l$) and choose the minimum-distance line as the corresponding line $l'_k$ in the stitched image of line $l_k$, which can be expressed as:
$$l'_k = \arg\min_{f_{l'_j}} \sum_{p'_i \in F'_{l_k}} d\big(p'_i, f_{l'_j}\big) \qquad (12)$$

Fig. 5 shows an example of the corresponding lines obtained by this matched-points-guided method.

Fig. 5: The detection of corresponding lines: (a) lines in the original image; (b) corresponding lines in the stitched image. Note: only the long straight lines are shown.


After obtaining the corresponding line $l'_k$ of $l_k$, we use the cosine of the angle between $l_k$ and $l'_k$ to measure the shape distortion of the stitched image. To match the convention of the above metrics, where smaller values mean better results, we use one minus the cosine to evaluate the shape distortion. The shape distortion metric of the left view image is denoted as:
$$Q_{sL} = \frac{1}{N_l}\sum_{k=1}^{N_l}\left[1 - \frac{\langle l_k, l'_k\rangle}{\|l_k\|\,\|l'_k\|}\right]^2 \qquad (13)$$
where $N_l$ denotes the number of lines. Similarly, the shape distortion metric of the right view image is denoted as $Q_{sR}$. The shape distortion metric (SDM) of the stereoscopic panorama image $Q_s$ is expressed as:
$$Q_s = Q_{sL} + Q_{sR} \qquad (14)$$
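A hedged sketch of Eq. (13): corresponding line segments are represented by their endpoints, and the squared one-minus-cosine of the angle between their direction vectors is averaged. Detecting and matching the lines themselves (LSD plus the matched-points-guided search of Eqs. (11)-(12)) is assumed to have been done already; the data layout below is an illustrative assumption.

import numpy as np

def shape_distortion(line_pairs):
    # line_pairs: list of ((p0, p1), (q0, q1)) endpoint pairs for corresponding
    # line segments l_k (input image) and l'_k (stitched image).
    scores = []
    for (p0, p1), (q0, q1) in line_pairs:
        v = np.asarray(p1, float) - np.asarray(p0, float)   # direction of l_k
        u = np.asarray(q1, float) - np.asarray(q0, float)   # direction of l'_k
        cos_angle = np.dot(v, u) / (np.linalg.norm(v) * np.linalg.norm(u))
        scores.append((1.0 - cos_angle) ** 2)               # summand of Eq. (13)
    return float(np.mean(scores))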

3.4. Image information loss

In addition to the above distortions, the other major structure distortion caused by non-uniformly stretching the image is information loss. For example, as shown in Fig. 5, the top region of the building in the non-overlapping region is missing compared with the input images. Therefore, when assessing the quality of a stitched image, information loss should also be taken into account. Since the human visual system pays more attention to salient regions, we propose to use the saliency region loss ratio to evaluate the information loss of a stitched image. Although it is easy to obtain the saliency maps [38, 39] of the original image and its stitched version, the information loss cannot be easily measured by comparing these two maps, because the stitched image no longer preserves the shape and size of the input image.

However, since information loss results in the loss of feature points, we can calculate the saliency region loss ratio by multiplying the image saliency values with an image region loss matrix defined by the feature points of the original image. First, we detect the feature point set $F_{L_2}$ of the input image itself, and the matched feature point set $F_{L_2L}$ between the input image $L_2$ and the stitched image $L$. Then, we identify regions of the input images containing feature points, using the points $f_i^{L_2}$ from $F_{L_2}$ and from $F_{L_2L}$. The image is divided into $m \times n$ grid cells, and a matrix $G_{m,n}$ indicates whether a grid cell contains feature points. $G_{m,n}$ consists of 0s and 1s (0 denotes that there is no feature point in a grid cell, while 1 denotes that there are feature points in it). Finally, we calculate the saliency value of each image grid cell. Fig. 6 shows an example of calculating the information loss.

Fig. 6: Image information loss: (a) original images $L_1$ and $L_2$; (b) stitched image $L$; (c) feature points from the set $F_{L_2}$ in image $L_2$; (d) feature points from the matched point set $F_{L_2L}$ in image $L_2$; (e) the matrix of feature points; (f) saliency map of image $L_2$.


The information loss metric of image $L_2$ can be expressed as:
$$Q_{iLL_2} = \frac{\sum_{i=1}^{m}\sum_{j=1}^{n} G'_{i,j}\, S_{i,j}}{\sum_{i=1}^{m}\sum_{j=1}^{n} G_{i,j}\, S_{i,j}} \qquad (15)$$
where $G'_{i,j}$ indicates whether there are feature points $f_i^{L_2}$ from the set $F_{L_2L}$ in the grid cell of the $i$th row and $j$th column; $G_{i,j}$ indicates whether there are feature points $f_i^{L_2}$ from the set $F_{L_2}$ in that cell; and $S_{i,j}$ denotes the saliency value of the grid cell in the $i$th row and $j$th column. The information loss metric of the left view image consists of the information loss metrics of image $L_1$ and image $L_2$:
$$Q_{iL} = Q_{iLL_1} + Q_{iLL_2} \qquad (16)$$
Similarly, the image information loss metric of the right view image is denoted as $Q_{iR}$. The image information loss metric (ILM) of the stereoscopic panorama image $Q_i$ is expressed as:
$$Q_i = Q_{iL} + Q_{iR} \qquad (17)$$
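For illustration only, a NumPy sketch of Eq. (15) follows. The grid size and the use of the mean saliency per cell are assumptions; the paper does not specify how the per-cell saliency value is pooled.

import numpy as np

def information_loss(points_all, points_matched, saliency, grid=(16, 16)):
    # points_all:     (x, y) feature points detected in the input image (set F_L2)
    # points_matched: the subset also matched to the stitched image (set F_L2L)
    # saliency:       saliency map of the input image (2D array)
    h, w = saliency.shape
    m, n = grid
    G = np.zeros((m, n))    # 1 if the cell contains any detected feature point
    Gp = np.zeros((m, n))   # 1 if the cell contains a matched (preserved) point
    S = np.zeros((m, n))    # saliency value of each cell (here: mean saliency)

    def cell(pt):
        x, y = pt
        return min(int(y / h * m), m - 1), min(int(x / w * n), n - 1)

    for i in range(m):
        for j in range(n):
            S[i, j] = saliency[i * h // m:(i + 1) * h // m,
                               j * w // n:(j + 1) * w // n].mean()
    for pt in points_all:
        G[cell(pt)] = 1
    for pt in points_matched:
        Gp[cell(pt)] = 1
    # Eq. (15): saliency-weighted ratio over the grid cells.
    return float((Gp * S).sum() / (G * S).sum())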

3.5. Disparity distortion

Disparity is an important factor for stereoscopic images and can affect the assessment performance. In particular, large vertical disparities can increase visual discomfort. For stereoscopic stitched images, projective image warping can produce vertical disparities between the left view and right view images. Hence, the disparity distortion metric (DDM) is denoted as:
$$Q_d = \frac{1}{n_4}\sum_{i=1}^{n_4} \big|f_{iLy} - f_{iRy}\big|^2 \qquad (18)$$
where $f_{iLy}$ and $f_{iRy}$ denote the $y$-coordinates of the $i$th pair of corresponding feature points in the left view and right view images.
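Eq. (18) is a mean squared vertical offset over the matched set $F_{LR}$; a minimal sketch, assuming the matched point coordinates are available as arrays:

import numpy as np

def disparity_distortion(pts_left, pts_right):
    # Q_d of Eq. (18): mean squared y-coordinate difference between the
    # matched feature points of the stitched left and right views (set F_LR).
    dy = np.asarray(pts_left)[:, 1] - np.asarray(pts_right)[:, 1]
    return float(np.mean(dy ** 2))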

3.6. Overall evaluation metric and parameter choice

The overall evaluation metric is expressed as:
$$Q = w_1 Q_c + w_2 Q_g + w_3 Q_s + w_4 Q_i + w_5 Q_d \qquad (19)$$

To obtain the optimal parameters $w = (w_1, w_2, w_3, w_4, w_5)$, we define a correlation coefficient function $y = F(w)$ over a 5-dimensional space $H$, where $w$ is constrained by $w_1 + w_2 + w_3 + w_4 + w_5 = 1$ and each parameter lies in the range $[0, 1]$. We then estimate the correlation coefficient function $F(w)$ by radial basis function (RBF) interpolation, and finally find the maximum of $F(w)$ to determine the optimal parameters.

To find the optimal parameter $w$, we need to reconstruct the function $F(w)$. In numerical analysis, function reconstruction can be realized by interpolation. RBF interpolation [31] is an established method in approximation theory for constructing high-order accurate interpolants; it takes the form of a linear weighted sum of radial basis functions. Since the RBF interpolating function has shown good performance for estimating a correlation coefficient function in [32], we adopt RBF interpolation to estimate $F(w)$. The RBF interpolating function is:
$$F(w) = \sum_{i=1}^{n} a_i\, \Phi_i(w), \quad \Phi_i(w) = e^{-(w - s_i)^2}, \quad w \in H \qquad (20)$$
where $s_i$ denotes the $i$th sample point of $\Phi_i(w)$ and $n$ denotes the number of sample points; the sampling interval is 0.05 in each dimension of $H$. Each point $w$ defines a deterministic metric value $Q$ in Eq. (19); using these values, we compute the Pearson correlation coefficient $F(w)$ over the images in the testing

sample set. Given the interpolating points $w_i$ and the values $F(w_i)$, the coefficients $a_i$ are obtained by solving the following linear system:
$$\sum_{i=1}^{n} a_i\, \Phi_i(w_1) = F(w_1), \quad \sum_{i=1}^{n} a_i\, \Phi_i(w_2) = F(w_2), \quad \ldots, \quad \sum_{i=1}^{n} a_i\, \Phi_i(w_m) = F(w_m) \qquad (21)$$
where $F(w_i)$ is the correlation coefficient between the objective evaluation value $Q_i$ and the image quality ground truth value. For each interpolating point $w_i$, we compute the objective overall evaluation metric value $Q_i$ from Eq. (19); combining $Q_i$ with the image quality ground truth values, we obtain the correlation coefficient value $F(w_i)$. Finally, we find the maximum of $F(w)$, which yields the corresponding optimal value of $w$. For Eq. (19), the optimal values of the weights $w_i$ are 0.30, 0.12, 0.14, 0.38, and 0.06.
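A hedged NumPy/SciPy sketch of this parameter choice: fit the RBF coefficients of Eqs. (20)-(21) from sampled weight vectors and their measured PLCC values, then search the constrained simplex for the maximum of the interpolated F(w). The function names, the use of scipy.stats.pearsonr, and the brute-force search are illustrative assumptions rather than the authors' code.

import numpy as np
from itertools import product
from scipy.stats import pearsonr

def plcc_at(w, metrics, mos):
    # F(w) at a sample point: absolute PLCC between the fused scores of Eq. (19)
    # and the ground-truth subjective scores. `metrics` is an (N_images, 5)
    # array holding [Qc, Qg, Qs, Qi, Qd] per image.
    return abs(pearsonr(metrics @ np.asarray(w), mos)[0])

def fit_rbf(samples, values):
    # Solve the linear system of Eq. (21) for the coefficients a_i of
    # F(w) = sum_i a_i * exp(-||w - s_i||^2), the Gaussian RBF of Eq. (20).
    samples = np.asarray(samples, float)
    d2 = ((samples[:, None, :] - samples[None, :, :]) ** 2).sum(-1)
    a = np.linalg.solve(np.exp(-d2), np.asarray(values, float))
    return samples, a

def eval_rbf(w, samples, a):
    d2 = ((samples - np.asarray(w, float)) ** 2).sum(-1)
    return float(np.exp(-d2) @ a)

def search_optimal_weights(samples, a, step=0.05):
    # Coarse search over the simplex w1+...+w5 = 1 for the maximum of the
    # interpolated F(w); the 0.05 step follows the paper's sampling interval.
    grid = np.arange(0.0, 1.0 + 1e-9, step)
    best_w, best_f = None, -np.inf
    for w1, w2, w3, w4 in product(grid, repeat=4):
        w5 = 1.0 - (w1 + w2 + w3 + w4)
        if w5 < -1e-9 or w5 > 1.0:
            continue
        w = (w1, w2, w3, w4, max(w5, 0.0))
        f = eval_rbf(w, samples, a)
        if f > best_f:
            best_w, best_f = w, f
    return best_w, best_f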


4. Experimental Results

4.1. Database

As mentioned previously, if the captured images differ purely by rotation or translation, a projective transformation can align the images without ghosting; otherwise, if the images are captured in other ways, a projective transformation causes ghost distortion. Therefore, to cover different distortion types, our database includes input images captured in different ways and stitched results produced by different stitching methods.

For stereoscopic image stitching, Liu et al. created a stereoscopic stitched image database [21]. However, that database only includes good results without ghost distortion or color distortion, since the input images were captured under the same lighting and with rotation-only camera motion. Yan's database [22] only includes six test samples. Currently, there is no adequate database for stereoscopic stitched image quality assessment. Here, we build a stereoscopic stitched image database (SSID) to provide a test platform for our objective algorithm. In this database, the input stereoscopic images are captured with a FUJIFILM REAL 3D camera under various complicated camera motions, not only simple rotation and planar motion. The database contains three stitched results for each sample, obtained by three representative image stitching methods: the homography method, the APAP method [9], and Yan's method [22]. The database comprises 30 samples. Each sample is arranged in a folder, which includes the input stereoscopic images, the stitched images from the three stitching methods, and the anaglyph stereoscopic images, i.e., Input-L1, Input-R1, Input-L2, Input-R2, H-L, H-R, H-S, APAP-L, APAP-R, APAP-S, YAN-L, YAN-R, YAN-S. Fig. 7 shows the 30 sample cases (left view images) in our database.

To verify the effectiveness of the proposed method, we conduct a user study. A total of 30 participants with normal stereoscopic vision were invited to take part. The participants include undergraduate and graduate students aged from 18 to 28. Before the test, each participant was shown some example images to learn about the typical types of stitching distortions.


Fig. 7: 30 sample images (left view images of each sample).

For each sample, the input images were placed on the top of the screen (unchanged) and each stitched image was placed on the bottom in a random order. The participants were first shown the three 2D stitched left images in a random order and asked to rate them from 1 (very bad) to 5 (very good). They were then shown the three 2D stitched right images in a random order and asked to rate them from 1 (very bad) to 5 (very good). Finally, the participants were shown the three anaglyph stereoscopic stitched images and asked to rate them from 1 (very uncomfortable) to 5 (very comfortable). Each participant thus gave nine ratings per sample, 270 ratings in total over all samples. In this paper, the subjective values of all images are reported as standardized subjective values.

4.2. Performance Comparison

In this section, to verify the superiority of the proposed method, we compare it from two aspects. On one hand, we compare it with four categories of state-of-the-art 2D SIQA methods. On the other hand, we compare the proposed method with the vertical disparity metric alone and evaluate the effectiveness of each metric through a combination comparison. Two objective

performance criteria are selected to measure and compare the proposed method with competing metrics: the Pearson linear correlation coefficient (PLCC) and the Kendall rank correlation coefficient (KRCC). PLCC and KRCC are used to evaluate prediction linearity and monotonicity, respectively. An optimal metric would obtain |PLCC| = 1 and |KRCC| = 1. In this paper, we report PLCC and KRCC as absolute values.

1) The proposed metrics except the vertical disparity metric can be used to evaluate 2D stitched images, so we combine these metrics into a 2D evaluation metric and compare it with state-of-the-art 2D SIQA methods in this subsection. The compared 2D SIQA methods include SSIM, the visual saliency induced index (VSI), Qureshi's method [18], and Yang's method [19]. Our evaluation metric for 2D stitched images is expressed as:
$$Q_{2D} = w_1 Q_{gL} + w_2 Q_{cL} + w_3 Q_{sL} + w_4 Q_{iL} \qquad (22)$$
where the weights $w_i$ are obtained with the same parameter-choice procedure as in Section 3.6; here, $w_1 = 0.14$, $w_2 = 0.24$, $w_3 = 0.16$, and $w_4 = 0.46$.

The comparison results with 2D SIQA methods are tabulated in Table 1, which shows the PLCC and KRCC values between the compared methods and the subjective results. The best result for each type is highlighted in boldface. From the experimental results in Table 1, we have the following observations. The proposed method is ahead of the other competing 2D SIQA metrics for stitched images produced by different methods; specifically, it outperforms the runner-up (Yang's method) by about 10% in PLCC and KRCC.
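For reference, the two evaluation criteria can be computed directly with SciPy; the small sketch below (with hypothetical variable names) mirrors how the absolute PLCC and KRCC values in Tables 1 and 2 would be obtained.

import numpy as np
from scipy.stats import pearsonr, kendalltau

def correlation_scores(objective, subjective):
    # Absolute PLCC and KRCC between predicted quality scores and the
    # standardized subjective scores.
    objective = np.asarray(objective, dtype=float)
    subjective = np.asarray(subjective, dtype=float)
    plcc, _ = pearsonr(objective, subjective)
    krcc, _ = kendalltau(objective, subjective)
    return abs(plcc), abs(krcc)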

Table 1: Comparison results with 2D SIQA methods.

Correlation        PLCC                             KRCC
Data          H-L      APAP-L   YAN-L      H-L      APAP-L   YAN-L
SSIM          0.1963   0.1430   0.2213     0.1602   0.0839   0.2267
Qureshi's     0.3050   0.2340   0.4025     0.1748   0.1821   0.2890
VSI           0.7309   0.7345   0.5877     0.5424   0.7671   0.5311
Yang's        0.7784   0.7398   0.7298     0.7468   0.7405   0.5736
Proposed      0.8231   0.8014   0.7977     0.7951   0.7654   0.7174

25

Fig. 8: Impact of different weights on the performance of the proposed metric. (a) PLCC for different $w_1$ and $w_2$, when the weights $w_3$, $w_4$, and $w_5$ are set to the optimal values in Eq. (19), i.e., 0.14, 0.38, and 0.06. (b)-(e) are similar to (a).

In this paper, we obtain the optimal parameters $w$ by estimating the function $F(w)$ with the RBF interpolating function. To further verify the performance of our metric with the optimal weights, Fig. 8 shows the impact of other combinations of weights on the performance of the proposed metric. In each panel of Fig. 8, we show the impact of two weights when the other three weights are set to our optimized values. Fig. 8(a) shows the impact of $w_1$ and $w_2$ when $w_3$, $w_4$, and $w_5$ are set to 0.14, 0.38, and 0.06; the best performance is clearly obtained when $w_1 = 0.3$ and $w_2 = 0.12$. Fig. 8(b) shows the impact of $w_1$ and $w_3$ when $w_2$, $w_4$, and $w_5$ are set to 0.12, 0.38, and 0.06. From Fig. 8(b), we can see


that the best performance is obtained when $w_1 = 0.3$ and $w_3 = 0.14$. The same analysis applies to Fig. 8(c)-(f). From the figure, we can further verify that the proposed metric delivers the best performance when the weights are 0.30, 0.12, 0.14, 0.38, and 0.06, respectively, which are the optimal values.

To evaluate the effectiveness of each component, we conduct a comparison experiment among alternative metrics. The compared metrics include DDM, GDM+DDM, CDM+DDM, SDM+DDM, ILM+DDM, and our overall metric in Eq. (19). Table 2 shows the PLCC and KRCC values obtained with the different evaluation metrics. As shown in Table 2, the PLCC and KRCC values from DDM alone are lower than those of the other metrics, which verifies that DDM alone is insufficient for evaluating the visual quality of stereoscopic stitched images. The PLCC and KRCC values of GDM+DDM, CDM+DDM, and SDM+DDM are higher than those of DDM alone, but still lower than 0.5. In contrast, the proposed method obtains higher PLCC and KRCC values than all the other alternative metrics.

Table 2: Comparison results of the proposed method with different metrics.

Correlation        PLCC                             KRCC
Data          H-S      APAP-S   YAN-S      H-S      APAP-S   YAN-S
DDM           0.1544   0.0077   0.0917     0.2308   0.1455   0.1311
GDM+DDM       0.1479   0.2083   0.0381     0.2000   0.1130   0.1747
CDM+DDM       0.1874   0.0761   0.1149     0.3477   0.1751   0.1487
SDM+DDM       0.1092   0.2245   0.3096     0.1262   0.2279   0.3120
ILM+DDM       0.7057   0.7092   0.6712     0.5385   0.6035   0.5556
Proposed      0.8253   0.8203   0.7454     0.7108   0.7262   0.6809

5. Conclusion

In this paper, we propose a novel objective quality metric to assess the visual quality of a stereoscopic stitched image. The main contribution of the proposed metric is that ghost distortion, color distortion, shape distortion, information loss, and disparity distortion are taken into account simultaneously, thereby better characterizing human perception of the visual quality of a stereoscopic stitched image compared with existing metrics. We establish a stereoscopic stitched image database and conduct a user study to verify the effectiveness of the proposed method. Our experiments show the superior performance of the proposed metric in terms of consistency with subjective evaluation results, compared with the relevant existing metrics.

There are still some limitations of the proposed method. We use the inclination of lines to measure shape distortion; for images with large areas without lines or with only a few short lines, this line-based measure of shape distortion may not work well. Usually, however, the shape distortion metric does not have much impact on the accuracy of the proposed metric, since humans are insensitive to shape distortion in images without lines.

Acknowledgement

This work was supported by the Natural Science Foundation of Shandong Province under Grant ZR2017QF006, the National Natural Science Foundation

of China under Grants 61801414 and 61701451, the Fundamental Research Funds for the Central Universities, China University of Geosciences (Wuhan) under Grant No. CUG170654, and the Key Laboratory of Intelligent Perception and Advanced Control of State Ethnic Affairs Commission under Grant MD-IPAC2019102.

References

[1] Z. Chen, Y. Li, Y. Zhang, Recent advances in omnidirectional video coding for virtual reality: Projection and evaluation, Signal Process. 146 (2018) 66-78.
[2] R. Szeliski, Image alignment and stitching: a tutorial, Found. Trends Comput. Graph. 2 (1) (2006) 1-105.
[3] W. Yao, Z. Li, Instant color matching for mobile panorama imaging, IEEE Signal Process. Lett. 22 (1) (2014) 6-10.
[4] F. Bellavia, C. Colombo, Dissecting and reassembling color correction algorithms for image stitching, IEEE Trans. Image Process. 27 (2) (2018) 735-748.
[5] Y. Yang, J. Ming, Image quality assessment based on the space similarity decomposition model, Signal Process. 120 (2016) 797-805.
[6] X. Min, K. Gu, G. Zhai, M. Hu, X. Yang, Saliency-induced reduced-reference quality index for natural scene and screen content images, Signal Process. 145 (2018) 127-136.
[7] G. Yue, C. Hou, Q. Jiang, et al., Blind stereoscopic 3D image quality assessment via analysis of naturalness, structure, and binocular asymmetry, Signal Process. 150 (2018) 204-214.
[8] W. Y. Lin, S. Liu, Y. Matsushita, T. T. Ng, L. F. Cheong, Smoothly varying affine stitching, in Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2011, pp. 345-352.
[9] J. Zaragoza, T. J. Chin, Q. H. Tran, M. Brown, D. Suter, As-projective-as-possible image stitching with moving DLT, IEEE Trans. Pattern Anal. Mach. Intell. 36 (7) (2014) 1285-1298.
[10] F. Zhang, F. Liu, Parallax-tolerant image stitching, in Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 3262-3269.
[11] Z. X. Tian, G. S. Xia, X. Bai, et al., Image stitching by line-guided local warping with global similarity constraint, Pattern Recognition 83 (3) (2018) 481-497.
[12] C. Chang, Y. Sato, Y. Chuang, Shape-preserving half-projective warps for image stitching, in Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 3254-3261.
[13] J. Gao, S. Kim, M. Brown, Constructing image panoramas using dual-homography warping, in Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2011, pp. 49-56.
[14] Q. Chai, S. Liu, Shape-optimizing hybrid warping for image stitching, in Proc. IEEE International Conference on Multimedia and Expo, 2016, pp. 1-6.
[15] C. Lin, S. Pankanti, K. Ramamurthy, A. Aravkin, Adaptive as-natural-as-possible image stitching, in Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 1155-1163.
[16] Y. Lu, Z. Hua, K. Gao, et al., Multiperspective image stitching and regularization via hybrid structure warping, Computing in Science Engineering 20 (2) (2018) 10-23.
[17] W. Yan, C. Hou, Reducing perspective distortion for stereoscopic image stitching, in Proc. IEEE International Conference on Multimedia Expo Workshops, 2016, pp. 1-6.
[18] H. Qureshi, M. Khan, R. Hafiz, Y. Cho, J. Cha, Quantitative quality assessment of stitched panoramic images, IET Image Process. 6 (9) (2012) 1348-1358.
[19] L. Y. Yang, Z. G. Tan, Z. Huang, G. Cheung, A content-aware metric for stitched panoramic image quality assessment, in Proc. IEEE International Conference on Computer Vision Workshops, 2017, pp. 2487-2494.
[20] L. Y. Yang, J. Liu, C. Q. Gao, An error-activation-guided blind metric for stitched panoramic image quality assessment, in Proc. Chinese Conference on Computer Vision, 2017, pp. 256-268.
[21] F. Zhang, F. Liu, Casual stereoscopic panorama stitching, in Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 2002-2010.
[22] W. Yan, C. Hou, J. Lei, Y. Fang, Z. Gu, N. Ling, Stereoscopic image stitching based on a hybrid warping model, IEEE Trans. Circuits Syst. Video Technol. 27 (9) (2017) 1934-1946.
[23] P. Paalanen, J. K. Kamarainen, H. Kalviainen, Image based quantitative mosaic evaluation with artificial video, in Scandinavian Conference on Image Analysis, 2009, pp. 470-479.
[24] W. Xu, J. Mulligan, Performance evaluation of color correction approaches for automatic multi-view image and video stitching, in Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2010, pp. 263-270.
[25] S. Leorin, L. Lucchese, R. G. Cutler, Quality assessment of panorama video for video conferencing applications, in Proc. IEEE Multimedia Signal Processing Workshop, 2005.
[26] M. Solh, G. AlRegib, MIQM: A novel multi-view images quality measure, in IEEE International Workshop on Quality of Multimedia Experience, 2009, pp. 186-191.
[27] L. Zhang, Y. Shen, H. Li, VSI: A visual saliency-induced index for perceptual image quality assessment, IEEE Trans. Image Process. 23 (10) (2014) 4270-4281.
[28] Z. Wang, A. C. Bovik, H. R. Sheikh, E. P. Simoncelli, Image quality assessment: From error visibility to structural similarity, IEEE Trans. Image Process. 13 (4) (2004) 600-612.
[29] A. Liu, W. Lin, H. Chen, et al., Image retargeting quality assessment based on support vector regression, Signal Processing: Image Communication 39 (2015) 444-456.
[30] C. Hsu, C. Lin, Y. Fang, W. Lin, Objective quality assessment for image retargeting based on perceptual distortion and information loss, IEEE Journal of Selected Topics in Signal Processing 8 (3) (2014) 377-389.
[31] M. J. D. Powell, Radial basis functions for multivariable interpolation: a review, Algorithms for Approximation, 1987, pp. 143-167.
[32] Y. Liang, Y. J. Liu, D. Gutierrez, Objective quality prediction of image retargeting algorithms, IEEE Trans. Vis. Comput. Graph. 23 (2) (2016) 1099-1110.
[33] J. Wang, S. Wang, K. Ma, Z. Wang, Perceptual depth quality in distorted stereoscopic images, IEEE Trans. Image Process. 26 (3) (2017) 1202-1215.
[34] F. Shao, W. Tian, W. Lin, et al., Learning sparse representation for no-reference quality assessment of multiply distorted stereoscopic images, IEEE Trans. Multimedia 19 (8) (2017) 1821-1836.
[35] G. Yue, C. Hou, T. Zhou, Subtitle region selection of S3D images in consideration of visual discomfort and viewing habit, ACM Trans. Multimedia Comput. Commun. Appl. (TOMM) 15 (3) (2019) 1-16.
[36] D. G. Lowe, Distinctive image features from scale-invariant keypoints, International Journal of Computer Vision 60 (2) (2004) 91-110.
[37] R. G. von Gioi, J. Jakubowicz, J.-M. Morel, G. Randall, LSD: A fast line segment detector with a false detection control, IEEE Trans. Pattern Anal. Mach. Intell. 32 (4) (2010) 722-732.
[38] K. Gu, G. Zhai, W. Lin, et al., Visual saliency detection with free energy theory, IEEE Signal Process. Lett. 22 (10) (2015) 1552-1555.
[39] C. Tang, P. Wang, C. Zhang, et al., Salient object detection via weighted low rank matrix recovery, IEEE Signal Process. Lett. 24 (4) (2017) 490-494.
[40] G. Yue, W. Yan, T. Zhou, Referenceless quality evaluation of tone-mapped HDR and multi-exposure fused images, IEEE Trans. Industrial Informatics, 2019.
[41] J. Wang, A. Rehman, K. Zeng, S. Wang, Z. Wang, Quality prediction of asymmetrically distorted stereoscopic 3D images, IEEE Trans. Image Process. 24 (11) (2015) 3400-3414.
[42] G. Yue, C. Hou, K. Gu, T. Zhou, G. Zhai, Combining local and global measures for DIBR synthesized image quality evaluation, IEEE Trans. Image Process. 28 (2019) 2075-2088.
[43] J. Yang, B. Jiang, Y. Wang, W. Lu, Q. Meng, Sparse representation based stereoscopic image quality assessment accounting for perceptual cognitive process, Information Sciences 430 (2018) 1-16.
[44] Y. Liu, J. Yang, Q. Meng, Z. Lv, Z. Song, Z. Gao, Stereoscopic image quality assessment method based on binocular combination saliency model, Signal Process. 125 (2016) 237-248.
[45] X. Wang, L. Ma, S. Kwong, Y. Zhou, Quaternion representation based visual saliency for stereoscopic image quality assessment, Signal Process. 145 (2018) 202-213.
[46] F. Shao, Q. Z. Yuan, W. S. Lin, et al., No-reference view synthesis quality prediction for 3-D videos based on color-depth interactions, IEEE Trans. Multimedia 20 (3) (2018) 659-674.


