Object Occlusion Guided Stereo Image Retargeting

Diptiben Patel, Shanmuganathan Raman

Accepted Manuscript — Pattern Recognition Letters
PII: S0167-8655(18)30455-0; DOI: https://doi.org/10.1016/j.patrec.2019.07.018; Reference: PATREC 7581
Received 15 August 2018; Revised 8 May 2019; Accepted 23 July 2019
Highlights

• Formulated a stereo seam carving method using a graph-cut formulation.
• Objects are preserved by allowing seams to pass through the object-object occlusion boundary.
• The object-object occlusion boundary is defined using the object map and the disparity map.
• Refinement of correspondence between stereo images to follow the occlusion boundary.
• Adaptive occlusion boundary weights to prevent occlusion of small objects.
Object Occlusion Guided Stereo Image Retargeting

Diptiben Patel∗∗, Shanmuganathan Raman
Electrical Engineering, Indian Institute of Technology Gandhinagar, Gujarat, India, 382355

ABSTRACT

Resizing stereo images while preserving the salient content and the geometric consistency between the image pair is an essential problem for 3D visualization. Existing stereo image retargeting techniques are observed to incur salient object deformation. In this paper, we formulate a seam carving method for stereo image retargeting using graph-cuts, where the number of nodes equals the number of pixels in one of the stereo images. We define an object map with the depth ordering of each object from the camera. Seams are allowed to pass along the object-object occlusion boundary at depth discontinuities in order to prevent salient object deformation. We propose adaptive occlusion boundary weights, defined as a function of the area of the object to be occluded, to preserve small objects. A seam passing through an object-object occlusion boundary in one image may not follow the exact boundary in the other image; we therefore propose a refinement of the seam so that it follows the object-object occlusion boundary in both stereo images. By comparing the proposed method with other state-of-the-art stereo image retargeting methods, we show that the proposed method preserves the salient object(s) along with the geometric consistency between the stereo image pair.

Keywords: Stereo Imaging, Object Occlusion, Geometric Consistency, Seam Carving

© 2019 Elsevier Ltd. All rights reserved.
1. Introduction
The rapid development of 3D digital display technology has drawn considerable attention to improving the viewing experience. Deployment of 3D media onto different display devices such as 3D theater screens, computer screens, televisions, and even mobile phones [1, 2] has created the need to resize images/videos for display on them. The conventional ways of resizing an image/video are cropping and scaling. Cropping an image/video about its center may lose important contextual content at its borders, while non-uniform scaling in either direction may elongate or squeeze important content. Content-aware retargeting techniques improve on these by altering the size of an image/video while preserving its salient content and overall structure. The salient or important content in an image/video is found using cues such as color [3], texture, attention [4], spatiotemporal information [5], and priors from images with similar context [6], to name a few.
∗∗Corresponding author. Tel.: +91-968-776-9626; e-mail: [email protected] (Diptiben Patel), [email protected] (Shanmuganathan Raman).
Unlike 2D image retargeting techniques, most stereo image retargeting techniques resize a pair of stereo images while preserving the geometric structure, depth disparity, and visual comfort, with minimal distortion. However, very few works address the preservation of objects in order to avoid deformation of salient objects. To address this problem, Lee et al. [7] proposed a layer-based mesh warping method that warps each discrete layer separately to resize a stereo image pair; the input stereo image is decomposed into discrete layers using depth and color features. Warping each discrete layer separately is computationally expensive and may introduce artifacts at layer discontinuities as well as deformation of the background. Using matching at an object level, Lin et al. [8] retargeted stereo images while preserving the disparity and the shape of salient objects; the shape of salient objects is preserved well, but sparse correspondences lead to deformation in disparity. In this paper, we propose a seam carving based stereo image retargeting method that exploits object occlusions to prevent object deformation. Objects at different distances from the camera make this possible: an object farther from the camera is allowed to be occluded behind an object nearer to the camera. The idea of object occlusion is inspired by 2D image retargeting with depth ordering [9]. The disparity map defines the corresponding pixels between the stereo images. We utilize the disparity map and the color features to generate an object map with depth ordering. Using the object map, we make the seam pass through the object-object occlusion boundary rather than through the object. The main contributions of the proposed stereo image retargeting technique are as follows.
1. We propose a stereo image retargeting technique that allows object occlusion to prevent object deformation using a graph-cut formulation. The use of graph-cuts over dynamic programming makes it easier to implement stereo retargeting that prevents object deformation.

2. Objects with small areas need to be preserved irrespective of their depth ordering to maintain their presence. We propose an adaptive weight assignment for the occlusion boundary pixels according to the area covered by the object being occluded at that boundary.

3. A seam passing through the occlusion boundary in the reference image does not follow the exact occlusion boundary in the other image of the stereo pair. We propose a refinement of the correspondence for the seam pixels in the other image to ensure that the seam follows the occlusion boundary in both stereo images.
The rest of the paper is organized as follows. Section 2 reviews related work on both 2D image and stereo image retargeting. Section 3 presents the proposed stereo image retargeting technique in detail. We evaluate the performance of the proposed technique in Section 4. Finally, the paper is concluded in Section 5.

2. Related Work

Image: Over the last few years, many image retargeting techniques have been developed; different techniques are studied and compared in [10, 11]. According to the study in [10], content-aware image retargeting techniques are categorized into continuous and discrete techniques. Continuous retargeting techniques map a triangular or quad mesh grid to the target mesh grid while preventing artifacts and preserving salient content [12–15]. Discrete retargeting techniques shift pixels in order to reach the size of the target display device [16–18]. Combining continuous and discrete techniques using multiple operators leads to better retargeting results, as reported in [19–21]. Cho et al. [22] proposed image retargeting using a convolutional neural network inspired by shift-map image editing. With the aid of an attention box prediction network and an aesthetic assessment network, resizing can be carried out using the best cropping window, which contains the most important information and can be aesthetically enhanced [23]. Image seam carving using a depth map as a disparity cue is carried out in [24]. A depth map without depth ordering preserves objects near the camera, but objects far from the camera are distorted. Also, multiscale graph-cut based optimization cannot be extended to the proposed stereo seam carving, as traversing back from low resolution to high resolution may lose the exact object-object occlusion boundary.

Stereo Image Pair: Like 2D image retargeting techniques, stereo image retargeting techniques can be classified into continuous and discrete techniques. Continuous stereo retargeting techniques find a warping function that maps the source grid of the input image pair to the target grid of the retargeted image pair. Using a sparse set of correspondences, Chang et al. [25] minimized a least-squares energy to retarget a stereo pair using interactive controls without depth information. Scene depth saliency and disparity consistency are preserved using matching of grid vertices and propagation of disparity in [26]. Region-based stereo image retargeting is performed to preserve the depth of the retargeted stereo image pair in [27, 28]. Islam et al. [29] preserved image aesthetics along with depth while retargeting a stereo image pair. Under the guidance of the user's quality of experience (QoE), Shao et al. performed stereo image retargeting to preserve image quality, visual comfort, and depth perception [30]. Discrete stereo retargeting techniques find corresponding pairs of pixels in both images that can be removed safely to resize the images to the target size. Shift-map image editing is extended to retarget a stereo image pair in [31, 32]. Utsugi et al. [33] extended seam carving to stereo image pairs by minimizing an energy that maintains the consistency between the two images. The geometric consistency between a stereo image pair is preserved by retargeting with Stereo Seam Carving (SSC) [34]; however, while preserving geometric consistency, the SSC method is unable to prevent deformation of salient objects. 2D visual features and depth features are detected to define 3D saliency for retargeting a stereo image pair in [35]. High-level semantics are preserved and low-level distortion is minimized using a superpixel based constraint for stereo image retargeting in [36]. Low-level saliency and edge features are used to define the visual significance of an image for retargeting in [37]. Stereoscopic image pair retargeting is guided by a visual attention model using 2D saliency, depth saliency, and the binocular just-noticeable difference [38]. Retargeting using a pixel fusion method is extended to a stereo pair by using the distortion of disparity in [39]. Stereo image retargeting using an optimization of cropping and scaling is performed to preserve the aesthetics of an image in [40]. In [41], the authors resized the stereo image pair by generating thumbnails using a consistent saliency map of the stereo image pair. The thumbnail cropping method is limited when the stereo image pair contains multiple salient objects with a large depth range; moreover, its saliency detection captures only a single salient object, failing on multiple salient objects at discrete depth levels.
3. Stereo Image Retargeting

The input to the proposed method is a rectified stereo image pair $\{I_L, I_R\}$ of size $m \times n$ and the disparity map $D$. The output of the proposed method is a retargeted stereo image pair $\{J_L, J_R\}$ of size $m \times n'$ and a retargeted disparity map $D_R$. Fig. 1 shows the proposed method with the evaluation framework. The quality of the proposed method is evaluated using the Depth Distortion Score (DDS) between $D_R$ and $\hat{D}_R$, where $\hat{D}_R$ is the disparity map calculated from the retargeted stereo image pair $\{J_L, J_R\}$. Among the various methods available [42–44], we use the method proposed in [44] to calculate the disparity map between the stereo image pair. Without loss of generality, we consider the left image $I_L$ as the reference image. The value of $D$ at each location specifies how much and in which direction a pixel is moved in the right image $I_R$ with respect to the left image $I_L$ (Eq. (1)). $D$ contains the value 0 for pixels whose corresponding pixel is occluded or not in the field of view (FOV) of the right image $I_R$:

$$D(x, y) = \begin{cases} P, & \exists\, I_L(x, y) = I_R(x, y - P) \text{ and } P \in [1, n] \\ 0, & \text{otherwise,} \end{cases} \qquad (1)$$

for $x = 1, 2, \dots, m$ and $y = 1, 2, \dots, n$.

Fig. 1: Proposed method with the evaluation framework.
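For illustration, the correspondence defined by Eq. (1) amounts to a simple row-wise lookup. The following is a minimal sketch, not part of the original method: the function name corresponding_pixel is hypothetical, and D is assumed to be an integer NumPy array of the paper's disparity map.

```python
import numpy as np

def corresponding_pixel(D, x, y):
    """Look up the right-image match of left pixel (x, y) per Eq. (1).

    Returns None when D(x, y) = 0, i.e. the match is occluded or
    falls outside the right image's field of view (FOV).
    """
    d = int(D[x, y])
    if d == 0:
        return None
    return (x, y - d)  # rectified pair: the match lies on the same row
```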
3.1. Occlusion Map

To maintain geometric consistency in the retargeted stereo image pair, we must remove or add corresponding pixels from both images. There exists a set of pixels in the reference image for which the corresponding pixels do not exist in the other image: they are either occluded by other pixels or outside the FOV of the image. The set of occluded pixels and the pixels occluding them must not be revealed. The map classifying the occluded and occluding pixels is the occlusion map $O_c$. We compute the occlusion map using a simplified Z-buffer approach, the same as described in [34]; readers are referred to [34] for more details about the occlusion map generation.
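Since [34] describes the exact procedure, the sketch below only illustrates the flavor of a simplified Z-buffer pass under stated assumptions: integer disparities, row-wise scanning, and the convention that a larger disparity means the pixel is nearer to the camera. The function name occlusion_map and the tie-handling details are our own.

```python
import numpy as np

def occlusion_map(D):
    """Simplified Z-buffer sketch of Oc: marks pixels that occlude,
    are occluded, or have no correspondence in the right image."""
    m, n = D.shape
    Oc = np.zeros((m, n), dtype=bool)
    for x in range(m):
        owner = {}                      # right column -> (disparity, y) of current winner
        for y in range(n):
            d = int(D[x, y])
            yr = y - d
            if d == 0 or yr < 0:        # no correspondence: occluded / out of FOV
                Oc[x, y] = True
                continue
            if yr in owner:             # two left pixels compete for one right pixel
                d_prev, y_prev = owner[yr]
                Oc[x, y] = Oc[x, y_prev] = True
                if d > d_prev:          # larger disparity = nearer camera = wins
                    owner[yr] = (d, y)
            else:
                owner[yr] = (d, y)
    return Oc
```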
3.2. Object Map

To protect objects as a whole, we generate an object map using the reference input left image $I_L$ and the disparity map $D$. Segmentation using only color and texture features does not give accurate and complete objects. Fig. 2 shows superpixel segmentation results for the lazy random walk (LRW) [45], density-based spatial clustering of applications with noise (DBSCAN) [46], and depth-adaptive superpixels (DAS) [47]. The LRW and DBSCAN algorithms are unable to adhere to depth discontinuities, as they use only color and local binary pattern (LBP) texture features for segmentation. Segmentation using only the disparity or depth map is likewise unable to segment objects with varying disparity (objects spanning in the direction of the camera). We therefore use the algorithm proposed in [47]: the input image $I_L$ is over-segmented into superpixels using color and depth features, and complete image segmentation is performed using a spectral normalized cut algorithm, which extracts global shape information using the superpixel affinity and the superpixel neighborhood graph.

Fig. 2: Superpixel segmentation for object map generation using: (a) LRW [45], (b) DBSCAN [46], and (c) DAS [47].

Having a segmented image and the average disparity of each object, we perform depth ordering to generate the object map. Let there be $K$ objects in a segmented image with average disparities $\{d_1, d_2, \dots, d_K\}$. The depth order value of each object is defined as Eq. (2):

$$\psi(i) = t + 1, \quad t = \sum_{j=1}^{K} \phi(d_j, d_i), \quad \phi(a, b) = \begin{cases} 1, & a > b \\ 0, & \text{otherwise,} \end{cases} \qquad i = 1, 2, \dots, K. \qquad (2)$$

The object map $O$ is defined as Eq. (3):

$$O(x, y) = \begin{cases} \psi(i), & \text{if } (x, y) \in i\text{-th object} \\ 0, & \text{otherwise.} \end{cases} \qquad (3)$$
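Eqs. (2) and (3) reduce to counting, for each object, how many objects have a strictly larger average disparity (i.e., are nearer to the camera). A small Python sketch, illustrative only: the names depth_order and object_map, and the 1-based label convention, are assumptions.

```python
import numpy as np

def depth_order(avg_disp):
    """Depth order psi for K objects from their average disparities (Eq. (2)).

    An object's order is 1 + the number of objects with strictly larger
    average disparity, so psi = 1 for the object nearest the camera.
    """
    avg_disp = np.asarray(avg_disp, dtype=float)
    return np.array([1 + int(np.sum(avg_disp > d)) for d in avg_disp])

def object_map(labels, psi):
    """Object map O (Eq. (3)): labels holds i for pixels of the i-th
    segmented object (1-based) and 0 for background."""
    O = np.zeros_like(labels)
    mask = labels > 0
    O[mask] = psi[labels[mask] - 1]
    return O

# e.g. avg_disp = [12.0, 30.5, 7.2]  ->  psi = [2, 1, 3]
```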
3.3. Graph-cut Formulation for Stereo Image Pair

A seam carving process can be formulated using dynamic programming [16] or graph-cuts [48]. We formulate a graph $G$ with $(m \times n + 2)$ nodes, the two additional nodes being the source and the target nodes. Note that the proposed work constructs a single graph combining both stereo images, in contrast to the work proposed in [36]. We segment the graph $G$ using a minimum energy cut along the vertical direction for vertical seam selection and seam removal/insertion. As per [48], the forward energy also needs to be minimized along with the energy being removed. The energy inserted due to the new edges created after the removal of seam pixels is called the forward energy. The forward energy for the three cases at pixel $(x, y)$ is shown in Fig. 3 and defined in Eq. (4). In Fig. 3(a), each arrow connects probable seam pixels to be removed, with the energy defined as the weight of that arrow (the red arrow shows the energy for the removal of pixel $(x - 1, y - 1)$ together with $(x, y)$, and so on). Equivalently, Fig. 3(b) defines these weights as edge weights in the graph-cut formulation. It is evident from Fig. 3(b) that the pixels just to the left of the cut are the seam pixels of minimum energy.

$$\begin{aligned} E_i^{LR}(x, y) &= |i(x, y-1) - i(x, y+1)|, \\ E_i^{LU}(x, y) &= |i(x, y-1) - i(x-1, y)|, \\ E_i^{UR}(x, y) &= |i(x-1, y) - i(x, y+1)|, \end{aligned} \qquad i \in \{I_L, I_R, D\}. \qquad (4)$$
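For intuition, the three energies of Eq. (4) can be computed per channel with simple column/row shifts. The sketch below is a rough vectorized version under our own naming; note that np.roll wraps around at the image borders, which a careful implementation would clamp instead.

```python
import numpy as np

def forward_energies(i):
    """Per-pixel forward energies of Eq. (4) for one channel i
    (the left image, the right image, or the disparity map)."""
    i = i.astype(float)
    left  = np.roll(i, 1, axis=1)   # i(x, y-1)
    right = np.roll(i, -1, axis=1)  # i(x, y+1)
    up    = np.roll(i, 1, axis=0)   # i(x-1, y)
    E_LR = np.abs(left - right)     # new left-right neighbors after removal
    E_LU = np.abs(left - up)        # seam bends toward the upper-left
    E_UR = np.abs(up - right)       # seam bends toward the upper-right
    return E_LR, E_LU, E_UR
```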
Fig. 3: Equivalent forward energy weights for seam carving (weights shown only for pixel (x, y)): (a) dynamic programming, (b) graph-cut formulation.

Fig. 4: Object-object occlusion boundary using supernodes: (a) objects with depth ordering, (b) objects and their right neighbors (highlighted by blue and pink colored lines), (c) supernodes and weights along the boundary between supernodes (weights along the blue colored edge (yellow arrows) are u_2, and so on).
For stereo image retargeting, we consider the forward energy for both images with the pixel correspondence between them. As with the forward energy of the stereo image pair, we minimize the forward energy in the disparity map to reduce disparity distortion along edges where the disparity changes but the intensity does not. The forward energies $E_D^{LR}$, $E_D^{LU}$, and $E_D^{UR}$ for the disparity are added in the same manner as the forward energy of the stereo image pair. Usually, objects nearer to the camera are salient and receive high saliency values from most saliency detection algorithms. Although we treat objects separately, as defined later in the paper, we assign high energy to regions near the camera that are not detected as objects by adding the disparity value $D$ with a factor $\alpha$. Assuming that both stereo images are captured under the same illumination conditions, the intensities of corresponding pixels in the left and right images are similar. This similarity is used to verify corresponding pixels through the term $G$, defined as $G(x, y) = |I_L(x, y) - I_R(x, y - D(x, y))|$; the factor for the confidence of intensity similarity is $\beta$. The energy in terms of weights for each edge corresponding to the pixel $(x, y)$ is defined as Eq. (5):

$$\begin{aligned} E^{LR}(x, y) &= \big(E_{I_L}^{LR}(x, y) + E_{I_R}^{LR}(x, y - D(x, y)) + E_D^{LR}(x, y)\big) + \alpha D(x, y) + \beta G(x, y), \\ E^{LU}(x, y) &= E_{I_L}^{LU}(x, y) + E_{I_R}^{LU}(x, y - D(x, y)) + E_D^{LU}(x, y), \\ E^{UR}(x, y) &= E_{I_L}^{UR}(x, y) + E_{I_R}^{UR}(x, y - D(x, y)) + E_D^{UR}(x, y). \end{aligned} \qquad (5)$$

The backward direction weights are assigned a value of $\infty$ to satisfy the constraint of a vertical cut with a single pixel in every row of the graph (observe the $\infty$ weights in Fig. 3 for the edges other than $E^{LR}$, $E^{LU}$, and $E^{UR}$; the practical value of $\infty$ is a very large magnitude, on the order of $10^{20}$). The weights calculated in Eq. (5) are assigned to the pixels that do not belong to any object or to the occlusion map. The modified weights for pixels belonging to the object map or the occlusion map are discussed in Section 3.4.

3.4. Object Occlusion as a Guide

To prevent salient object deformation, we need to assign very high weights (ideally $\infty$) to the pixels belonging to any of the salient objects. To maintain the geometric consistency, we assign higher weights to the pixels occluding or being occluded in one of the images, as in Eq. (6):

$$E^{LR}(x, y) = E^{LU}(x, y) = E^{UR}(x, y) = u_h, \quad \forall (x, y) \in \{(x, y) : O(x, y) > 0 \lor O_c(x, y) = 1\}. \qquad (6)$$

Here, $u_h$ is a very high value. We allow seams to pass through the object-object boundaries in order to incur object occlusions. To define the object-object boundary, we construct supernodes in a similar way as in [9]. As the object map values increase with distance from the camera, we define a supernode as an object together with its right neighbors, starting from the object nearest to the camera. Since seam pixels lie to the left of the cut, we include the right neighbors in a supernode in order to preserve the right-most pixels of an object. Fig. 4 shows the process of creating supernodes and assigning weights across the object-object occlusion boundary. Fig. 4(a) shows objects $\{F_1, F_2, F_3\}$, with $F_1$ nearest to the camera and having $\psi = 1$. Fig. 4(b) shows the right neighbors of objects $F_1$ and $F_2$ with blue and pink colored lines, respectively. Fig. 4(c) shows the supernodes $\{S_1, S_2, S_3\}$ formed with the same blue, pink, and light blue colors. The yellow colored arrows show the object-object occlusion boundary between $F_1$ and $F_2$, assigned the weight $u_2$, and so on. The boundary pixels between supernodes are assigned the weight $u_j$ if the $j$-th object is being occluded by the $t$-th object ($t < j$) at those boundary pixels, as given in Eq. (7):

$$u_j = \frac{u_c}{|\{(x, y) : O(x, y) = j\}|}. \qquad (7)$$

Here, $u_c$ is a constant set to a very high value. The occlusion boundary weight $u_j$ is a function of the area covered by the object being occluded, in order to protect small objects: the smaller the object area, the higher the weight, and the lower the chance of the object being occluded. For disconnected seams along the object boundary, we assign a small weight of 10 to the edge from $(x - 1, y + 1)$ to $(x, y)$ along the object boundary. The forward energy weight is not valid for the object-background boundary: removing these pixels can be viewed as an occlusion for single image retargeting, but the object-background boundary in one stereo image does not correspond to the object-background boundary in the other image, due to the occluded region across the boundary. Hence, we do not update the weights of the object-background boundary, allowing seams to pass.
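The weight overrides of Eqs. (6) and (7) can be summarized in a few lines. The following sketch is illustrative rather than the authors' implementation: it protects object and occlusion-map pixels with u_h and then re-opens the occlusion boundary with the area-adaptive u_j. For brevity it detects the boundary only through horizontal right-adjacency in the object map (the supernode construction of Fig. 4 determines the exact boundary pixels), and all function and variable names are assumptions.

```python
import numpy as np

def occlusion_guided_weights(E_LR, E_LU, E_UR, O, Oc, u_c=1e10, u_h=1e10):
    """Override the Eq. (5) edge weights per Eqs. (6)-(7)."""
    W_LR, W_LU, W_UR = E_LR.copy(), E_LU.copy(), E_UR.copy()

    protect = (O > 0) | Oc                        # Eq. (6): objects + occlusion map
    for W in (W_LR, W_LU, W_UR):
        W[protect] = u_h

    # Eq. (7): where object j abuts a nearer object t (t < j), the
    # boundary weight u_j = u_c / area(j), so small objects get large
    # weights and are less likely to be occluded.
    right = np.roll(O, -1, axis=1)                # right neighbor's depth-order label
    for j in np.unique(O[O > 0]):
        u_j = u_c / np.count_nonzero(O == j)
        boundary = (O == j) & (right > 0) & (right < j)
        for W in (W_LR, W_LU, W_UR):
            W[boundary] = u_j
    return W_LR, W_LU, W_UR
```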
3.5. Refinement of Seam Correspondence between Stereo Images

After solving the graph-cut minimization, the pixels just to the left of the cut are the seam pixels of minimum energy with reference to the image $I_L$. If the left seam does not pass through an object-object occlusion boundary, we remove/insert the corresponding pixels from the right image using the disparity map $D$. For left seam pixels passing through the object-object occlusion boundary, the corresponding pixels derived from the disparity map $D$ do not follow the occlusion boundary, due to the depth discontinuity. This is evident from Fig. 5. The first row shows the left image $I_L$ and the right image $I_R$ with seams shown in red; one can observe in the right image $I_R$ that the seam pixels do not follow the object-object occlusion boundary. The observation becomes more pronounced in the object maps of the left and right images, where the seam pixels are shown in white in the second row of Fig. 5. Note that the object map of the right image is constructed using the object map of the left image and the disparity map $D$. The third row of Fig. 5 shows the zoomed-in views of the left image, the object map of the left image, the right image, and the object map of the right image, respectively. We update the corresponding right seam pixels along the object-object occlusion boundary by shifting them towards the occluding object using the 0-valued region between the objects; the shifting of pixels is shown with red arrows in the bottom row of Fig. 5(d). After seam removal/insertion from both images of the stereo pair, we remove/insert the left seam locations from the disparity map $D$. We update the disparity map $D$ as in [34] to get the updated disparity map $D_R$: if a pixel in $I_L$ and its corresponding pixel in $I_R$ are both on the left or both on the right of the seam, the disparity value is retained; if they are on opposite sides of the seams, the disparity value is updated by a value of 1.

Fig. 5: Refinement of object-object boundary pixels as a seam: (a) left image with seam, (b) object map with reference to image I_L, (c) right image with corresponding seam, (d) object map with reference to image I_R, with the refinement of pixels shown with red arrows in the bottom row.

4. Results and Discussion

Fig. 6: Diana dataset: width reduced by 30%, (b) DDS = 3.67% for SSC [34], (c) DDS = 5.72% for the proposed method.

We evaluate the performance of the proposed method on the Middlebury stereo dataset [49], stereo images from Flickr, and the stereo images provided in [50]. The values of $\alpha$, $\beta$, $u_h$, and $u_c$ are empirically chosen to be 0.68, 0.5, $10^{10}$, and $10^{10}$, respectively. We evaluate the geometric distortion of the method using the depth distortion score (DDS) (shown in Fig. 1). We retarget and update the disparity map through the retargeting of the stereo image pair; the updated disparity map is $D_R$. We calculate the disparity map $\hat{D}_R$ from the retargeted stereo image pair using [44]. The DDS measures the difference between $D_R$ and $\hat{D}_R$, as defined in Eq. (8):

$$DDS = \frac{100}{m \times n'} \sum_{x=1}^{m} \sum_{y=1}^{n'} \big(|D_R(x, y) - \hat{D}_R(x, y)| > 1\big)\ \%. \qquad (8)$$
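Eq. (8) is straightforward to compute. A minimal sketch (the function name dds is assumed; the two disparity maps are NumPy arrays of the same shape):

```python
import numpy as np

def dds(DR, DR_hat):
    """Depth Distortion Score of Eq. (8): percentage of pixels whose
    retargeted disparity DR and re-estimated disparity DR_hat differ
    by more than one disparity level."""
    assert DR.shape == DR_hat.shape
    bad = np.abs(DR.astype(float) - DR_hat.astype(float)) > 1
    return 100.0 * np.count_nonzero(bad) / bad.size
```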
Figs. 6, 7, 8, 9, 10, and 12 show retargeted stereo image pairs for different stereo image retargeting methods, including the proposed method. We compare the proposed method with both the continuous Depth Preserving Warping (DPW) based retargeting method [28] and the discrete Stereo Seam Carving (SSC) based retargeting method [34]. The method in [28] preserves the shape of the salient objects using shape-preserving constraints along with depth preservation for geometric consistency; the method in [34] is an extension of seam carving to stereo image pairs. For each dataset, the first row shows the left input image and the left retargeted images, the second row shows the right input image and the right retargeted images, and the third row shows the input disparity map and the retargeted disparity maps. The caption of each dataset mentions the DDS for the stereo seam carving method [34] and for the proposed method. As the DPW method does not warp the disparity map while warping the stereo image pair, we are unable to produce its DDS for comparison.

Fig. 6 shows the results for the Diana image pair [50]. The salient object (the face) occupies a major part of the image. The proposed method preserves the shape of the face, since we define it as an object and no seams are allowed to pass through it; we observe that the salient object is preserved better than with the SSC method [34]. Fig. 7 shows the results for the Cloth2 dataset [49], which contains a highly textured region and no salient object. We retarget the Cloth2 stereo image pair for 17% and 40% width reductions with no salient object in the object map, and observe that the proposed method better preserves the texture and the geometric consistency between the stereo images. We compare the proposed method with the DPW method [28] and the SSC method [34] in Fig. 8: due to the high weights assigned to the salient object, the proposed method prevents the deformation of a man wearing homogeneous clothing. Fig. 9 shows the results for the Snowman dataset for a 17% increase in width; the proposed method incurs minimal distortion in the background region during seam insertion compared to the SSC method [34].

Fig. 7: Cloth2 dataset: width reduced by 17%, (b) DDS = 0.31% for SSC [34], (d) DDS = 0.81% for the proposed method; width reduced by 40%, (c) DDS = 6.11% for SSC [34], (e) DDS = 2.77% for the proposed method.

Fig. 8: Man dataset: width reduced by 17%, (c) DDS = 0.43% for SSC [34], (d) DDS = 0.95% for the proposed method.

Fig. 9: Snowman dataset: width increased by 17%, (b) DDS = 5.90% for SSC [34], (c) DDS = 0.71% for the proposed method.

The DDS values for other datasets are listed in Table 1, together with the percentage width change of every image pair. For the stereo seam carving method, we include the DDS values reported in [34] for the available datasets and width reductions/increments; if not available, we estimate the DDS values using Eq. (8). The lower values of DDS show that the proposed method preserves the geometric consistency between the retargeted stereo image pair well. The average execution time of the proposed algorithm is 2.5 seconds per seam on a machine with an i7 3.4 GHz CPU; 76% of the execution time is spent solving the graph-cut minimization that finds a pair of seams for the stereo image pair.

Table 1: Depth Distortion Score (DDS) (%).

Dataset            % width reduction   SSC [34]   Proposed
Cloth2 [49]        30                  4.43       1.71
Man [Flickr]       30                  1.49       1.36
People [Flickr]    17                  2.10       0.91
Diana [50]         17                  0.28       2.31
Car [Flickr]       30                  2.44       0.90
Snowman [Flickr]   17                  1.30       1.37
Snowman [Flickr]   30                  4.05       3.61
Aloe [49]          20                  2.90       2.31
Aloe [49]          30                  7.20       4.39

4.1. Effect of the Disparity Weight α

The proposed framework outperforms the other stereo image retargeting methods when the size is reduced by more than the total size of the salient objects present in the image. Comparing Fig. 10(b) with Fig. 10(c), one can observe that the person in red clothing and the leg of the person standing farther from the camera are deformed by the SSC method [34]; by allowing the seams to pass between two objects at different depths while maintaining the depth ordering, the proposed method preserves the salient objects, their shapes, and the geometric consistency between the pair of images. To analyze the effect of the disparity weight factor $\alpha$, we input the object map defined in Fig. 11(b) and generate the stereo retargeted image pair for different values of $\alpha$, as shown in Figs. 10(d), 10(e), and 10(f), respectively. For objects not captured in the object map and nearer to the camera, the disparity value $D$ with factor $\alpha$ preserves them against the highly textured region in the background.
It is evident from Figs. 10(d), 10(e), and 10(f), and from the DDS values in their caption, that increasing the weight of the disparity value $D$ prevents the deformation of the person in red clothing as well as the person behind him.

Fig. 10: People dataset: width reduced by 30%, (b) DDS = 5.42% for SSC [34] and (c) DDS = 1.49% for the proposed method. Effect of factor α: width reduced by 10% using the object map shown in Fig. 11(b), (d) DDS = 0.76%, (e) DDS = 0.57%, and (f) DDS = 0.56% for the proposed method.

Fig. 11: Object map for the People dataset: (a) object map used to generate the retargeted image pair in Fig. 10(c), (b) object map used to generate the retargeted image pairs in Figs. 10(d), (e), and (f).

4.2. Limitations and Future Directions

Fig. 12 shows the results for the Car stereo image pair. The objects in the image span along the camera direction and do not lie at discrete depth layers. Object occlusion using the object map therefore does not produce a visually pleasing retargeted image pair, although it still preserves the geometric consistency between the images. We conclude that an image needs discrete depth layers captured in the object map to benefit from the proposed method. We have proposed a stereo seam carving method allowing object occlusions along discrete depth discontinuities. We would like to extend the proposed method to transfer the information available in one of the images onto the other image while increasing the size of a stereo image pair and preserving geometric consistency. Another possible direction for improving the proposed approach is to formulate the multiple-label segmentation of objects at discrete depth layers using higher order energy measures [51, 52]. Also, the proposed graph-cut formulation for a stereo pair could be minimized using non-linear disparity and correspondence measures between the stereo image pair [53].

Fig. 12: Car dataset: width reduced by 17%, (b) DDS = 1.3% for SSC [34], (c) DDS = 0.87% for the proposed method.
5. Conclusion

We have proposed a stereo image retargeting method that prevents salient object deformation using a graph-cut formulation. The shape of the salient objects is preserved by assigning much higher energy to the objects and allowing seams to pass through the object-object boundary along the depth discontinuity. We have utilized an object map indicating objects with their depth ordering from the camera to define the object-object occlusion boundary. We have addressed the problem of seams passing through the object-object boundary in both stereo images by refining the corresponding pixels in both images. Experimental results show that the proposed method preserves the salient objects by allowing occlusions between objects while maintaining the depth ordering.

References

[1] J.-Y. Son, B. Javidi, S. Yano, K.-H. Choi, Recent developments in 3-D imaging technologies, Journal of Display Technology 6 (10) (2010) 394–403.
[2] J. Geng, Three-dimensional display technologies, Advances in Optics and Photonics 5 (4) (2013) 456–535.
[3] S. Goferman, L. Zelnik-Manor, A. Tal, Context-aware saliency detection, IEEE Trans. on Pattern Anal. and Machine Intel. 34 (10) (2012) 1915–1926.
[4] W. Wang, J. Shen, Deep visual attention prediction, IEEE Trans. on Image Proc. 27 (5) (2018) 2368–2378.
[5] W. Wang, J. Shen, L. Shao, Video salient object detection via fully convolutional networks, IEEE Trans. on Image Proc. 27 (1) (2018) 38–49.
[6] W. Wang, J. Shen, L. Shao, F. Porikli, Correspondence driven saliency transfer, IEEE Trans. on Image Proc. 25 (11) (2016) 5025–5034.
[7] K.-Y. Lee, C.-D. Chung, Y.-Y. Chuang, Scene warping: Layer-based stereoscopic image resizing, in: IEEE Conf. on CVPR, 2012, pp. 49–56.
[8] S.-S. Lin, C.-H. Lin, S.-H. Chang, T.-Y. Lee, Object-coherence warping for stereoscopic image retargeting, IEEE Trans. on Circuits and Syst. for Video Techno. 24 (5) (2014) 759–768.
[9] A. Mansfield, P. Gehler, L. Van Gool, C. Rother, Scene carving: Scene consistent image retargeting, in: Eur. Conf. on Comput. Vis., 2010, pp. 143–156.
[10] M. Rubinstein, D. Gutierrez, O. Sorkine, A. Shamir, A comparative study of image retargeting, in: ACM Trans. on Graph., Vol. 29, 2010, p. 160.
[11] D. Vaquero, M. Turk, K. Pulli, M. Tico, N. Gelfand, A survey of image retargeting techniques, in: Appl. of Digital Image Proc., Vol. 7798.
[12] F. Liu, M. Gleicher, Automatic image retargeting with fisheye-view warping, in: Proc. of the ACM Symp. on User Interface Software and Techno., 2005, pp. 153–162.
[13] Y. Guo, F. Liu, J. Shi, Z.-H. Zhou, M. Gleicher, Image retargeting using mesh parametrization, IEEE Trans. on Multimedia 11 (5) (2009) 856–867.
[14] S.-S. Lin, I.-C. Yeh, C.-H. Lin, T.-Y. Lee, Patch-based image warping for content-aware retargeting, IEEE Trans. on Multimedia 15 (2) (2013) 359–368.
[15] V. Setlur, S. Takagi, R. Raskar, M. Gleicher, B. Gooch, Automatic image retargeting, in: Proc. of the Int. Conf. on Mobile and Ubi. Multimedia, 2005, pp. 59–68.
[16] S. Avidan, A. Shamir, Seam carving for content-aware image resizing, in: ACM Trans. on Graph., Vol. 26, 2007, pp. 10:1–10:10.
[17] Y. Pritch, E. Kav-Venaki, S. Peleg, Shift-map image editing, in: IEEE Int. Conf. on Comput. Vis., 2009, pp. 151–158.
[18] S. Qi, Y.-T. J. Chi, A. M. Peter, J. Ho, Casair: Content and shape-aware image retargeting and its applications, IEEE Trans. on Image Proc. 25 (5) (2016) 2222–2232.
[19] Y. Fang, Z. Chen, W. Lin, C.-W. Lin, Saliency detection in the compressed domain for adaptive image retargeting, IEEE Trans. on Image Proc. 21 (9) (2012) 3888–3901.
[20] Y. Fang, Z. Fang, F. Yuan, Y. Yang, S. Yang, N. N. Xiong, Optimized multioperator image retargeting based on perceptual similarity measure, IEEE Trans. on Syst., Man, and Cyber.: Syst. 47 (11) (2017) 2956–2966.
[21] F. Shafieyan, N. Karimi, B. Mirmahboub, S. Samavi, S. Shirani, Image retargeting using depth assisted saliency map, Sig. Proc.: Image Comm. 50 (2017) 34–43.
[22] D. Cho, J. Park, T.-H. Oh, Y.-W. Tai, I. S. Kweon, Weakly- and self-supervised learning for content-aware deep image retargeting, in: IEEE Int. Conf. on Comput. Vis., 2017, pp. 4568–4577.
[23] W. Wang, J. Shen, H. Ling, A deep network solution for attention and aesthetics aware photo cropping, IEEE Trans. on Pattern Anal. and Machine Intel. (2018).
[24] J. Shen, D. Wang, X. Li, Depth-aware image seam carving, IEEE Trans. on Cybernetics 43 (5) (2013) 1453–1461.
[25] C.-H. Chang, C.-K. Liang, Y.-Y. Chuang, Content-aware display adaptation and interactive editing for stereoscopic images, IEEE Trans. on Multimedia 13 (4) (2011) 589–601.
[26] J. W. Yoo, S. Yea, I. K. Park, Content-driven retargeting of stereoscopic images, IEEE Sig. Proc. Let. 20 (5) (2013) 519–522.
[27] B. Li, L. Duan, C.-W. Lin, W. Gao, Region-based depth-preserving stereoscopic image retargeting, in: IEEE Int. Conf. on Image Proc., 2014, pp. 2903–2907.
[28] B. Li, L.-Y. Duan, C.-W. Lin, T. Huang, W. Gao, Depth-preserving warping for stereo image retargeting, IEEE Trans. on Image Proc. 24 (9) (2015) 2811–2826.
[29] M. B. Islam, W. Lai-Kuan, W. Chee-Onn, K.-L. Low, Stereoscopic image warping for enhancing composition aesthetics, in: IEEE Asian Conf. on Pattern Recogn., 2015, pp. 645–649.
[30] F. Shao, W. Lin, W. Lin, Q. Jiang, G. Jiang, QoE-guided warping for stereoscopic image retargeting, IEEE Trans. on Image Proc. 26 (10) (2017) 4790–4805.
[31] R. Nakashima, K. Utsugi, K. Takahashi, T. Naemura, Stereo image retargeting with shift-map, IEICE Trans. on Info. and Syst. 94 (6) (2011) 1345–1348.
[32] S. Qi, J. Ho, Shift-map based stereo image retargeting with disparity adjustment, in: Asian Conf. on Comput. Vis., 2012, pp. 457–469.
[33] K. Utsugi, T. Shibahara, T. Koike, K. Takahashi, T. Naemura, Seam carving for stereo images, in: 3DTV-Conf.: The True Vision-Capture, Trans. and Disp. of 3D Video, 2010, pp. 1–4.
[34] T. D. Basha, Y. Moses, S. Avidan, Stereo seam carving a geometrically consistent approach, IEEE Trans. on Pattern Anal. and Machine Intel. 35 (10) (2013) 2513–2525.
[35] J. Wang, Y. Fang, M. Narwaria, W. Lin, P. Le Callet, Stereoscopic image retargeting based on 3D saliency detection, in: IEEE Int. Conf. on Acoustics, Speech and Sig. Proc., 2014, pp. 669–673.
[36] K.-C. Lien, M. Turk, On preserving structure in stereo seam carving, in: IEEE Int. Conf. on 3D Vis., 2015, pp. 571–579.
[37] Y. Fang, J. Wang, Y. Yuan, J. Lei, W. Lin, P. Le Callet, Saliency-based stereoscopic image retargeting, Info. Sciences 372 (2016) 347–358.
[38] F. Shao, W. Lin, W. Lin, G. Jiang, M. Yu, R. Fu, Stereoscopic visual attention guided seam carving for stereoscopic image retargeting, J. of Disp. Techno. 12 (1) (2016) 22–30.
[39] J. Lei, M. Wu, C. Zhang, F. Wu, N. Ling, C. Hou, Depth-preserving stereo image retargeting based on pixel fusion, IEEE Trans. on Multimedia 19 (7) (2017) 1442–1453.
[40] Y. Niu, F. Liu, W.-C. Feng, H. Jin, Aesthetics-based stereoscopic photo cropping for heterogeneous displays, IEEE Trans. on Multimedia 14 (3) (2012) 783–796.
[41] W. Wang, J. Shen, Y. Yu, K.-L. Ma, Stereoscopic thumbnail creation via efficient stereo saliency detection, IEEE Trans. on Visualization and Computer Graphics 23 (8) (2017) 2014–2027.
[42] L. De-Maeztu, A. Villanueva, R. Cabeza, Stereo matching using gradient similarity and locally adaptive support-weight, Pattern Recognition Letters 32 (13) (2011) 1643–1651.
[43] Z. Gu, X. Su, Y. Liu, Q. Zhang, Local stereo matching with adaptive support-weight, rank transform and disparity calibration, Pattern Recognition Letters 29 (9) (2008) 1230–1235.
[44] T. Taniai, Y. Matsushita, Y. Sato, T. Naemura, Continuous 3D label stereo matching using local expansion moves, IEEE Trans. on Pattern Anal. and Machine Intel. (2017).
[45] J. Shen, Y. Du, W. Wang, X. Li, Lazy random walks for superpixel segmentation, IEEE Trans. on Image Proc. 23 (4) (2014) 1451–1462.
[46] J. Shen, X. Hao, Z. Liang, Y. Liu, W. Wang, L. Shao, Real-time superpixel segmentation by DBSCAN clustering algorithm, IEEE Trans. on Image Proc. 25 (12) (2016) 5933–5942.
[47] D. Weikersdorfer, D. Gossow, M. Beetz, Depth-adaptive superpixels, in: IEEE Int. Conf. on Pattern Recogn., 2012, pp. 2087–2090.
[48] M. Rubinstein, A. Shamir, S. Avidan, Improved seam carving for video retargeting, ACM Trans. on Graph. 27 (3) (2008) 16.
[49] H. Hirschmuller, D. Scharstein, Evaluation of cost functions for stereo matching, in: IEEE Conf. on CVPR, 2007, pp. 1–8.
[50] F. Huguet, F. Devernay, A variational method for scene flow estimation from stereo sequences, in: IEEE Int. Conf. on Comput. Vis., 2007, pp. 1–7.
[51] J. Peng, J. Shen, X. Li, High-order energies for stereo segmentation, IEEE Trans. on Cybernetics 46 (7) (2016) 1616–1627.
[52] W. Wang, J. Shen, Higher-order image co-segmentation, IEEE Trans. on Multimedia 18 (6) (2016) 1011–1021.
[53] J. Shen, J. Peng, X. Dong, L. Shao, F. Porikli, Higher order energies for image segmentation, IEEE Trans. on Image Proc. 26 (10) (2017) 4911–4922.
Conflict of Interest

We wish to confirm that there are no known conflicts of interest associated with this publication and there has been no significant financial support for this work that could have influenced its outcome. We confirm that the manuscript has been read and approved by all named authors and that there are no other persons who satisfied the criteria for authorship but are not listed. We further confirm that the order of authors listed in the manuscript has been approved by all of us. We confirm that we have given due consideration to the protection of intellectual property associated with this work and that there are no impediments to publication, including the timing of publication, with respect to intellectual property. In so doing we confirm that we have followed the regulations of our institutions concerning intellectual property. We understand that the Corresponding Author is the sole contact for the Editorial process (including Editorial Manager and direct communications with the office). She is responsible for communicating with the other authors about progress, submissions of revisions and final approval of proofs. We confirm that we have provided a current, correct email address which is accessible by the Corresponding Author and which has been configured to accept email from [email protected] and [email protected].