Optics and Lasers in Engineering 126 (2020) 105890
Omnidirectional depth segmentation using orthogonal fringe patterns and multi-scale enhancement

Ji Deng a, Jian Li a, Hao Feng a,∗, Yu Xiao b, Wenzhong Han b, Zhoumo Zeng a
a State Key Laboratory of Precision Measurement Technology and Instruments, School of Precision Instrument and Opto-electronic Engineering, Tianjin University, 92 Weijin Road, Tianjin 300072, China
b School of Mechanical and Electrical Engineering, North China Institute of Aerospace Engineering, 133 Aimin East Road, Langfang 065000, China

∗ Corresponding author. E-mail address: [email protected] (H. Feng).

https://doi.org/10.1016/j.optlaseng.2019.105890
Received 9 July 2019; Received in revised form 31 August 2019; Accepted 3 October 2019
0143-8166/© 2019 Published by Elsevier Ltd.
Keywords: Omnidirectional depth segmentation; Phase-shift invariant; Phase-insensitive
Abstract

Phase-shifted depth segmentation, which exploits the phase-shift invariance of the singular points in different wrapped phase maps, has the advantage of being immune to color, texture, and camera exposure. However, because the phase sensitivity is distributed unevenly, it is difficult to perform this type of depth segmentation properly in certain regions. To address this, this paper presents an omnidirectional method for depth segmentation that requires neither precalibration nor complementary cues. During data acquisition, two groups of orthogonal phase-shifted patterns are cast sequentially onto the target objects. A least-squares algorithm is then applied to these fringe patterns, and two sets of phase sequences are calculated separately. Using the proposed multi-scale enhancement technology, objects with abruptly changing surfaces can be segmented into different parts. Simulations show that even in a noisy environment (a signal-to-noise ratio of 10), the correct rate of the proposed framework still reaches 98.97%. Verification experiments show that the proposed method performs omnidirectional depth segmentation effectively, thereby offering a solution to the problem of mis-segmentation in phase-insensitive regions.
1. Introduction

The aim of depth segmentation is to segment objects that have abruptly changing depth into different regions for further analysis. As an important branch of computer vision, this technique has been applied in numerous fields, including television production, industrial monitoring, three-dimensional (3D) reconstruction, 3D data denoising, robot grasping, and object recognition [1–8]. However, the complexity of the measuring environment often limits the segmentation accuracy.

Over the past few decades, various depth-segmentation methods have been developed from several perspectives. The earliest examples of depth segmentation were based on segmentation work in two-dimensional (2D) image processing. For example, by maximizing the discrimination between different colors, thresholding-based approaches perform fast depth segmentation by means of simple thresholding [9–11]. Moreover, by finding evidence of a boundary between different regions, graph-based methods [12,13] provide improved segmentation by combining color and depth cues. Convolutional neural networks (CNNs) are a flourishing technique that uses multiple perceptive layers for intelligent analysis and that has shown great success in various fields [14–17] when abundant training data are available. For this reason, depth segmentation based on
CNNs [18,19] is both fast and accurate. Indeed, methods based on fully convolutional networks (FCNs) [20,21] improve the performance further by replacing the fully connected layers with convolutional layers. With the development of technology for 3D perception, depth-based approaches have advantages over other techniques because depth information is robust to variations in the color and texture of the object surfaces [1,17,22]. Numerous techniques of this type have been proposed over the past few years and are classified mainly as disparity-based, time-of-flight-based, and Kinect-based methods according to how the depth data are acquired. Disparity-based methods use stereo matching to acquire depth information [23–25], time-of-flight-based methods [26–28] improve segmentation performance in the presence of occlusion, and Kinect-based methods [29–31] are used widely because they are relatively cheap.

Recently, Deng et al. [32] proposed a flexible depth-segmentation method (FDSM) that uses only one set of unidirectional phase-shifted patterns for depth segmentation. With no need for precalibration or complementary cues (e.g., texture or depth data), this method performs segmentation effectively by changing the pattern sequence selectively during post-processing. This segmentation method has numerous advantages over the aforementioned frameworks: (i) it is less sensitive to variations in object color and texture; (ii) the segmentation is unaffected by
overexposure; (iii) because no training data are required, the processing is relatively cheap; and (iv) because no depth data are needed during the processing, intermediate errors (e.g., 3D reconstruction errors, calibration errors) are suppressed. However, despite the above merits, we found that two factors have a direct influence on the quality of depth segmentation. One is the quality of the wrapped phase maps. Although noise can be suppressed by using certain advanced filtering algorithms [33,34], some high-intensity random noise pollutes the segmentation results, which manifests as scattered error points. The other factor is the nonuniform distribution of the phase sensitivity with respect to depth variations [35,36], which means that regions of low phase sensitivity may suffer from mis-segmentation.

This paper presents an omnidirectional depth-segmentation method (ODSM) with the aims of segmenting the target objects in arbitrary directions and suppressing the scattered error points caused by high-intensity random noise. The ODSM involves projecting two sets of mutually orthogonal fringe patterns, after which an optimization algorithm based on multi-scale enhancement is applied to refine the depth-segmentation results.

The structure of this paper is as follows. In Section 2, we introduce the related principles and the designed computational framework. In Section 3, we investigate the segmentation accuracy and robustness of the method by means of simulations. In Section 4, we describe the experimental validations, and we end in Section 5 with conclusions and directions for future work.

2. Principles

The process of the proposed ODSM (Fig. 1) comprises four individual procedures: (i) orthogonal fringe projection for retrieving the phase sequences, (ii) phase optimization for refining the phase sequences, (iii) obtaining the split lines, and (iv) segmenting the object into regions of different depths.

Fig. 1. Flowchart of proposed framework.

2.1. Orthogonal fringe projection

To bypass the problem of phase insensitivity, two groups of orthogonal phase-shifted patterns (I_Ox, I_Oy) with evenly spaced phase steps over one phase-shift period are cast sequentially onto the target objects. Here, I_Ox and I_Oy denote fringe patterns with phase variations in the x and y directions, respectively, and are expressed as

I_Oxi(x) = A(x, y) + B(x, y) cos[φ(x) + δ_i],  i = 1, 2, …, Nx (Nx ≥ 3),  (1)

I_Oyj(y) = A(x, y) + B(x, y) cos[φ(y) + δ_j],  j = 1, 2, …, Ny (Ny ≥ 3).  (2)

In the above equations, the variables subscripted with i or x vary in the x direction, and those subscripted with j or y vary in the y direction. For a more compact description, Eqs. (1) and (2) can be abbreviated as

I_On = A(x, y) + B(x, y) cos[φ + δ_n],  n = 1, 2, …, N (N ≥ 3),  (3)

where φ is the wrapped phase, δ_n = 2πn∕N is the phase shift, and A(x, y) and B(x, y) are the average intensity and fringe modulation of the fringe patterns, respectively. The average intensity describes the profile of the illuminated objects, while the fringe modulation reveals the region of interest (ROI), R(x, y), through simple thresholding. The variables A, B, and φ for each pixel can be calculated by applying summation and a least-squares algorithm to the target patterns, as follows:

A(x, y) = [Σ_{n=1}^{N} I_On(x, y)] ∕ N,  (4)

B(x, y) = (2∕N) √{[Σ_{n=1}^{N} I_On(x, y) cos(δ_n)]² + [Σ_{n=1}^{N} I_On(x, y) sin(δ_n)]²},  (5)

φ(x, y) = −arctan{[Σ_{n=1}^{N} I_On(x, y) sin(δ_n)] ∕ [Σ_{n=1}^{N} I_On(x, y) cos(δ_n)]}.  (6)
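As a concrete illustration, Eqs. (3)–(6) could be evaluated with NumPy roughly as follows. This is a minimal sketch rather than the authors' implementation; the function name, the assumed shift values δ_n = 2πn∕N, and the ROI threshold in the usage comment are illustrative assumptions.

```python
import numpy as np

def retrieve_phase(patterns):
    """Least-squares evaluation of Eqs. (3)-(6) for one group of N phase-shifted
    fringe images; a minimal sketch, not the authors' implementation."""
    N = len(patterns)
    I = np.stack([p.astype(np.float64) for p in patterns])   # shape (N, H, W)
    delta = 2.0 * np.pi * np.arange(1, N + 1) / N             # assumed shifts delta_n = 2*pi*n/N
    s = np.tensordot(np.sin(delta), I, axes=1)                # sum_n I_On sin(delta_n)
    c = np.tensordot(np.cos(delta), I, axes=1)                # sum_n I_On cos(delta_n)
    A = I.sum(axis=0) / N                                     # Eq. (4): average intensity
    B = 2.0 * np.hypot(c, s) / N                              # Eq. (5): fringe modulation
    phi = -np.arctan2(s, c)                                   # Eq. (6), full (-pi, pi] range via arctan2
    return A, B, phi

# R(x, y): ROI by simple thresholding of the modulation (threshold value is illustrative)
# A, B, phi = retrieve_phase(images); R = B > 0.1 * B.max()
```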
Suppose that the phase obtained from the initial sequence of the phase-shifted pattern group G(I_O1, I_O2, …, I_ON) is defined as φ1(x, y). When the pattern sequence is changed to

G(I_O,Round(kN∕3), I_O,Round(kN∕3)+1, …, I_O,Round(kN∕3)−2, I_O,Round(kN∕3)−1)  (k = 2 or 3),  (7)

another two phase maps, φ2(x, y) and φ3(x, y), can be calculated. Here, the Round(·) function returns the closest integer to its argument. Consequently, two sequences of phase maps (Sx and Sy) with different changing directions can be computed separately, where

Sx = (φ_x1(x, y), φ_x2(x, y), φ_x3(x, y)),  Sy = (φ_y1(x, y), φ_y2(x, y), φ_y3(x, y)).  (8)
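The reordering of Eqs. (7) and (8) could then be sketched as below, building on the retrieve_phase() function above; the zero-based index offset and the use of the original order for φ1 are assumptions of this illustration.

```python
def phase_sequence(patterns):
    """Phase maps phi_1, phi_2, phi_3 of Eq. (8) for one fringe direction,
    obtained by cyclically reordering the pattern group as in Eq. (7).
    Builds on retrieve_phase() above."""
    N = len(patterns)
    phases = []
    for k in (1, 2, 3):
        start = 0 if k == 1 else int(round(k * N / 3)) - 1    # pattern I_O,Round(kN/3) comes first
        reordered = list(patterns[start:]) + list(patterns[:start])
        _, _, phi = retrieve_phase(reordered)                 # shifts are still assigned in list order
        phases.append(phi)
    return phases                                             # (phi_1(x, y), phi_2(x, y), phi_3(x, y))
```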
2.2. Phase optimization

Each phase map in these sequences is then optimized sequentially using a bilateral filter and a multi-scale enhancement algorithm. In the first stage, we extend the bilateral filter to the phase field to suppress random noise while preserving the details of the objects. The filtering process can be expressed as

φ_f(x) = (1∕W_o) Σ_{x_i ∈ Ω} φ(x_i) f_p(‖φ(x_i) − φ(x)‖) g_s(‖x_i − x‖),  (9)
where φ_f(x) is the filtered result, φ(x) is the phase map to be filtered, x is the position of the current pixel, Ω is a filtering window centered on x, and x_i is a neighboring pixel of x. Both f_p and g_s are Gaussian kernels, with f_p acting in the phase domain and g_s in the spatial domain. Therefore, two categories of parameter influence the processing results, namely the standard deviations (σ_p and σ_s) and the kernel radius (r). f_p is used for smoothing the phase map, and its standard deviation σ_p is kept small (0.4–0.5 rad) to preserve the details of the phase map. g_s is a spatial filter with a small r (1–2 pixels) and a large σ_s (approximately 30 pixels) to smooth the differences between coordinates and suppress random noise. W_o is the weight function, expressed as

W_o = Σ_{x_i ∈ Ω} f_p(‖φ(x_i) − φ(x)‖) g_s(‖x_i − x‖).  (10)
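A direct (unoptimized) rendering of the phase-domain bilateral filter of Eqs. (9) and (10) might look as follows; the default parameter values simply follow the ranges quoted above and are not prescriptive.

```python
import numpy as np

def bilateral_phase_filter(phi, r=2, sigma_s=30.0, sigma_p=0.45):
    """Phase-domain bilateral filter of Eqs. (9)-(10); a direct, unoptimised sketch."""
    H, W = phi.shape
    out = np.empty_like(phi)
    yy, xx = np.mgrid[-r:r + 1, -r:r + 1]
    g_s = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma_s ** 2))            # spatial kernel g_s
    padded = np.pad(phi, r, mode='edge')
    for i in range(H):
        for j in range(W):
            win = padded[i:i + 2 * r + 1, j:j + 2 * r + 1]
            f_p = np.exp(-(win - phi[i, j]) ** 2 / (2.0 * sigma_p ** 2))  # phase kernel f_p
            w = f_p * g_s                                                 # combined weights
            out[i, j] = np.sum(w * win) / np.sum(w)                       # Eq. (9), W_o from Eq. (10)
    return out
```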
Although the bilateral filter is edge-preserving, it may still dilute some details. Inspired by the difference-of-Gaussians framework [37], a procedure known as multi-scale enhancement is applied to φ_f(x) to boost the details. First, three filtered phase maps with consecutive scales are constructed by convolving different Gaussian kernels with φ_f(x), namely

C1(x) = G1(x) ∗ φ_f(x),  C2(x) = G2(x) ∗ φ_f(x),  C3(x) = G3(x) ∗ φ_f(x),  (11)

where the kernels are related by σ_G3 = 2σ_G2 = 4σ_G1 and, generally, σ_G1 = 1 pixel. Three enhancing layers E1, E2, and E3 can then be computed by differencing neighboring phase scales, as follows:

E1(x) = φ_f(x) − C1(x),  E2(x) = C1(x) − C2(x),  E3(x) = C2(x) − C3(x).  (12)

Finally, the enhanced phase map φ_E(x) can be obtained by merging these layers with the filtered phase map:

φ_E(x) = φ_f(x) + E1(x) + (1∕2)E2(x) + (1∕4)E3(x).  (13)
Note that the parameter groups (1, 2, 4) and (1, 1∕2, 1∕4) are selected to produce a continuous Gaussian scale space of the phase map. The theoretical background of this approach is discussed in more detail in [38]. By applying the above process to each of the optimized sequences (S_Ox and S_Oy), the enhanced phase sequences (S_Ox^E and S_Oy^E) can be obtained separately:

S_Ox^E = (φ_x1^E(x, y), φ_x2^E(x, y), φ_x3^E(x, y)),  S_Oy^E = (φ_y1^E(x, y), φ_y2^E(x, y), φ_y3^E(x, y)).  (14)
Note that the anisotropic filter [39], another widely used detail-preserving filter, is not recommended for this process. This is because the phase map contains phase-jump regions that change drastically; thus, that filter may induce error propagation in the subsequent steps.
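For reference, the multi-scale enhancement of Eqs. (11)–(13) reduces to a few Gaussian filtering calls. The sketch below uses scipy.ndimage.gaussian_filter and assumes σ_G1 = 1 pixel, as suggested above; it would be applied to every map in the filtered sequences.

```python
from scipy.ndimage import gaussian_filter

def enhance_phase(phi_f, sigma1=1.0):
    """Multi-scale enhancement of a filtered phase map, Eqs. (11)-(13)."""
    c1 = gaussian_filter(phi_f, sigma1)            # C1 = G1 * phi_f
    c2 = gaussian_filter(phi_f, 2.0 * sigma1)      # C2, with sigma_G2 = 2 sigma_G1
    c3 = gaussian_filter(phi_f, 4.0 * sigma1)      # C3, with sigma_G3 = 4 sigma_G1
    e1, e2, e3 = phi_f - c1, c1 - c2, c2 - c3      # enhancing layers, Eq. (12)
    return phi_f + e1 + 0.5 * e2 + 0.25 * e3       # enhanced phase map phi_E, Eq. (13)
```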
2.3. Extraction of split lines and depth segmentation

There are two types of abrupt-change points: (i) the depth-change points that are extracted in this step and (ii) the phase-jump points of the phase maps. As the depth-change points are invariant with respect to changes in the grating sequence, the abrupt-change depth points in each direction (Ex and Ey) can be calculated from the following equations:

E(x, y) = P1(x, y) ∧ P2(x, y) ∧ P3(x, y),  (15)

Pk(x, y) = Round(∇φk(x, y)),  (16)

where Pk(x, y) (k = 1, 2, 3) gives the abrupt-change points of each phase map, ∧ is the logical AND operator, and ∇ is the gradient operator. However, some tiny noise points or disconnected lines may influence the segmentation results. The tiny noise points can be deleted by judging the area of the extracted points (e.g., using the function bwareaopen in MATLAB), and the broken lines can be connected by using the morphological close operation with a small kernel. After these steps, the split lines L(x, y), which are used for further segmentation, can be extracted through

L(x, y) = Ex(x, y) ∨ Ey(x, y),  (17)

where ∨ is the logical OR operator. Finally, depth segmentation is achieved by subtracting the split lines from the ROI,

D(x, y) = R(x, y) − L(x, y),  (18)

and the regions in different depth ranges can be labeled with unique markers by region labeling.
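Putting Eqs. (15)–(18) together, one possible Python counterpart of this step is sketched below. The gradient-magnitude reading of Eq. (16), the min_area value, and the 3 × 3 closing kernel are assumptions of this sketch; remove_small_objects from scikit-image plays the role of MATLAB's bwareaopen.

```python
import numpy as np
from scipy import ndimage
from skimage.morphology import remove_small_objects

def abrupt_points(phase_maps):
    """E(x, y) of Eq. (15) for one fringe direction: pixels whose rounded phase
    gradient (one reading of Eq. (16)) is non-zero in all three phase maps."""
    masks = []
    for phi in phase_maps:
        gy, gx = np.gradient(phi)
        masks.append(np.round(np.hypot(gx, gy)) != 0)      # P_k(x, y) as a binary mask
    return masks[0] & masks[1] & masks[2]                   # logical AND over k = 1, 2, 3

def depth_segmentation(seq_x, seq_y, roi, min_area=20):
    """Split-line extraction and region labelling, Eqs. (17)-(18).
    roi is the boolean ROI mask R(x, y) from modulation thresholding."""
    ex, ey = abrupt_points(seq_x), abrupt_points(seq_y)
    lines = ex | ey                                                    # L(x, y), Eq. (17)
    lines = remove_small_objects(lines, min_size=min_area)            # delete tiny noise points
    lines = ndimage.binary_closing(lines, structure=np.ones((3, 3)))  # reconnect broken lines
    regions = roi & ~lines                                            # D(x, y) = R(x, y) - L(x, y), Eq. (18)
    labels, num = ndimage.label(regions)                              # unique marker per depth region
    return lines, labels, num
```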
3. Simulation
To quantitatively analyze the robustness of the proposed method, we compare it with the FDSM and the ground truth for various signal-to-noise ratios (SNRs). As shown in Fig. 2, the resolution of the cast
Fig. 2. (a) Three-dimensional (3D) distribution of stacked blocks and (b–f) the corresponding ground-truth results. (g) Reconstruction result in environment with signal-to-noise ratio (SNR) of 15. (h) Corresponding split lines and labeled markers of the blocks extracted by flexible depth-segmentation method (FDSM). (i) An enlarged part of (h). (j–l) Segmented results of FDSM. (m–r) Corresponding results processed by omnidirectional depth-segmentation method (ODSM).
Fig. 3. (a) Reconstruction result in SNR = 10 environment. (b) Corresponding result processed by FDSM. (c-h) Segmented results processed by ODSM.
gratings is 1000 × 1000 pixels, and two stacked blocks placed on the floor are set for segmentation. The first simulation takes place in a noisy environment (SNR = 15). As shown in Fig. 2(g), considerable undesired noise corrupts the reconstruction results of these objects. Fig. 2(h)–(l) show the segmented results of the FDSM, and Fig. 2(i) shows an enlargement of the region in Fig. 2(h) that is enclosed by a red rectangle. Compared with Fig. 2(b), the split lines shown in Fig. 2(h) coincide closely with the ground truth. However, some undesired points are distributed discretely over the whole ROI. Moreover, as the color markers and Fig. 2(i)–(l) show, the objects have been divided mistakenly because of the broken lines. By contrast, as shown in Fig. 2(m)–(r), the ODSM performs depth segmentation effectively even though it is disturbed by noise. As shown in Fig. 2(m) and (n), neither error points nor broken lines influence the segmentation result.

As shown in Fig. 3(a), the second simulation takes place in an extremely noisy environment (SNR = 10), and Fig. 3(b) shows the results calculated by the FDSM. Although the split lines approximately trace the edges of the abrupt-change regions, the FDSM cannot give satisfactory results because of the influence of noise. Strikingly, as shown in Fig. 3(c)–(h), the ODSM successfully segments the noisy objects into four individual parts. Although the profile of an extracted line, shown partially enlarged in Fig. 3(d), is not smooth, this does not influence the precision of the result because the center of the split lines is aligned perfectly with the ground truth.

More simulations with the FDSM and ODSM are then performed under a series of different noise conditions (SNR = 30, 25, 20, 15, 10), and the results are presented in Tables 1 and 2, respectively. In these tables, the correct points are defined as the extracted points that coincide with the lines extracted from the ground truth, and the correct rate is the number of correct points divided by the total number of ground-truth line pixels. The incorrect points are the extracted points that do not coincide with the ground-truth edges, and the error rate equals the number of incorrect points divided by the total number of pixels minus the number of ground-truth line pixels.
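For reference, the two rates could be computed from boolean split-line masks as in the sketch below; the function and argument names are illustrative only.

```python
import numpy as np

def segmentation_rates(extracted, ground_truth):
    """Correct rate and error rate as defined above; both inputs are assumed
    to be boolean split-line masks of identical size."""
    gt_pixels = np.count_nonzero(ground_truth)
    correct = np.count_nonzero(extracted & ground_truth)      # correct points
    incorrect = np.count_nonzero(extracted & ~ground_truth)   # incorrect points
    correct_rate = 100.0 * correct / gt_pixels
    error_rate = 100.0 * incorrect / (ground_truth.size - gt_pixels)
    return correct_rate, error_rate
```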
Table 1
Segmentation results of FDSM.

                      SNR = 30   SNR = 25   SNR = 20   SNR = 15   SNR = 10
Correct points        12,228     12,222     12,118     11,600     10,759
Correct rate (%)      100        99.9509    99.1043    94.8642    87.9865
Incorrect points      0          6          119        3415       136,911
Error rate (%)        0          0          0.012      0.3457     13.8606

Table 2
Segmentation results of ODSM.

                      SNR = 30   SNR = 25   SNR = 20   SNR = 15   SNR = 10
Correct points        12,228     12,228     12,227     12,224     12,102
Correct rate (%)      100        100        99.9918    99.9673    98.9696
Incorrect points      0          0          0          14         5974
Error rate (%)        0          0          0          0.0014     0.6049
As can be seen from these data, although the FDSM performs well under moderate conditions, when the SNR is below 20 its correct rate drops rapidly and its error rate rises drastically. These trends reveal that the split lines are progressively shielded by strong noise. By contrast, the correct rate of the ODSM drops only slightly as the SNR is varied; even in the SNR = 10 environment it remains at 98.9696%, and the corresponding error rate is only 0.6049%, which means that the processing is of high fidelity.

4. Experiments

The proposed ODSM was implemented via a projector–objects–camera system. As shown in Fig. 4(a), the projector and the camera
Fig. 4. (a) Experimental setup. (b) Target objects for experiment 1. (c) Captured grating groups for experiment 1.
Fig. 5. (a) 3D distributions of target objects. (b) Segmentation result calculated from X-direction patterns. (c) Segmentation result calculated from Y-direction patterns. (d) Segmentation result obtained from ODSM. (e–h) Depth distribution of each segmented part.
were set in a cross-optical-axis geometry such that the angle θ between them determined the field of view. A fluorescent lamp with a flashing rate of 100 Hz served as the ambient light. To assess the effectiveness of the proposed method, one sculpture and a toy with two supporting feet were used as the measurement objects, as shown in Fig. 4(b). The toy was placed behind the statue, and both were of the same color, which often poses difficulties for traditional segmentation methods. As shown in Fig. 4(c), without the influence of ambient light, two groups of mutually orthogonal, high-quality phase-shifted gratings were cast sequentially onto the objects.

Fig. 5(a) shows the 3D distribution of these objects. In this and the subsequent experiments, the 3D point clouds were reconstructed using the gray-code technique with phase shifting [40]. As the pictures show, the target objects contain four isolated parts: the main body of the statue, the panel, and the two feet of the toy. Figs. 5(b) and 5(c) show the results of segmentation by the X- and Y-direction phase-shifted patterns, respectively. Although most of these regions were segmented correctly, the nonuniformly distributed phase sensitivity meant that mis-segmentation affected both sets of results, especially around regions of low phase sensitivity, namely the R1 and R2 regions in Fig. 5(b) and the R3 region in Fig. 5(c). By contrast, as shown in Fig. 5(d), the depth segmentation performed by the ODSM separated the isolated parts effectively. Furthermore, we present each segmented part of the target objects in Fig. 5(e)–(h). Each part is in good shape, which indicates that the ODSM provides satisfactory depth-segmentation results.

In previous work, Wang and Zhang [35] set the fringe patterns at an optimal angle to increase the precision of 3D measurements. To determine
an optimal pattern scheme for omnidirectional depth segmentation, we also set the fringe patterns at the optimal fringe angle for segmentation. As presented in Fig. 6(a), two different cylinders were placed in front of a panel for depth segmentation. After fringe calibration, as illustrated in Fig. 6(b), the optimal fringe angle was determined to be about 67.92° for this system. However, the fringe patterns were only slightly distorted in the regions enclosed by the rectangular box. Consequently, this produces a false depth-segmentation result, as shown in Fig. 6(c). The lack of distortion can be explained by the fact that the abrupt-change regions cause a phase jump of exactly 2kπ; thus, no distortion occurs between two regions of different depth. We also tested the effectiveness of the ODSM in the same scenario. As shown in Fig. 6(d) and (e), the fringes were again distorted only slightly in some regions. However, because these positions were mutually complementary, satisfactory depth-segmentation results were achieved, as shown in Fig. 6(f)–(i).

To test the robustness of the ODSM, the same objects, disturbed by ambient light and illuminated by a series of gratings of decreasing brightness, were then set up for segmentation, as shown in Fig. 7. As shown in Fig. 7(b), (e), and (h), the quality of each phase map decreased drastically with the brightness of the cast gratings, especially around regions of low illumination. Moreover, as shown in Fig. 7(c), (f), and (i), the quality of the phase map directly influenced that of the 3D reconstruction, as reflected by the roughness of the point-cloud surfaces and the amount of random noise.

Fig. 8 shows the corresponding depth-segmentation results for the three cases in Fig. 7 as processed by the ODSM. Although the point clouds were affected by different amounts of environmental noise, the proposed method segmented all parts of the objects properly, indicating that the
Fig. 6. (a) 3D distributions of target objects. (b) One of the fringe patterns at the optimal angle. (c) One of the ODSM patterns, which varies in the x direction. (d) One of the ODSM patterns, which varies in the y direction. (e) Segmentation result calculated from the optimal fringe patterns. (f) Segmentation result obtained from ODSM. (g–i) Depth distribution of each segmented part of ODSM.
Fig. 7. (a) High-quality gratings influenced by ambient light. (b) A phase map of (a). (c) 3D distribution of (a). (d) Moderate-quality gratings influenced by ambient light. (e) A phase map of (d). (f) 3D distribution of (d). (g) Lowquality gratings influenced by ambient light. (h) A phase map of (g). (i) 3D distribution of (g).
Fig. 8. Depth-segmentation results retrieved from phase maps of differing quality. (a–d) Segmentation results of Fig. 7(a). (e–h) Segmentation results of Fig. 7(d). (i–l) Segmentation results of Fig. 7(g).
Fig. 9. Profile and 3D distributions of target objects captured or calculated with different viewing angles: (a,b) 𝜃 = 30◦ ; (c,d) 𝜃 = 45◦ ; (e,f) 𝜃 = 60◦ .
ODSM is robust to noise variations. Note that some small parts of the target objects were missing around the margins (see Fig. 8(k) and (l)). However, this limitation is acceptable in practice because the phase maps around those margin regions are usually of low quality and are disturbed by high-intensity random noise; indeed, the noise may exceed the noise tolerance of the proposed method. Furthermore, these points were located in a small region and merged easily with the edges of the target objects. Once these scattered points merge together, the optimized framework becomes invalid. The problem of small missing parts could be avoided either by adjusting the angle between the projector and the camera or by processing these isolated regions individually.

As the projector and camera were set in a cross-optical-axis geometry, the viewing angle θ between them determined the field of view of
the system. Therefore, occlusion or mis-segmentation problems could be eliminated by adjusting the angle between the camera and the projector. The effectiveness of the ODSM at different viewing angles was therefore studied by setting θ to a series of angles (30°, 45°, and 60°). In the third experiment, one toy and two complex sculptures were used for testing, as shown in Fig. 9. Fig. 10(a), (e), and (i) show the segmentation results provided by our method. At each viewing angle, the parts with different depth information were segmented successfully into distinct regions. For better visualization, we merged the isolated parts with their corresponding main body according to a priori knowledge; the results are shown in Fig. 10(b)–(d), (f)–(h), and (j)–(l), respectively. These correct segmentation results indicate that the proposed method is valid for performing omnidirectional depth segmentation.
Fig. 10. Depth-segmentation results for different viewing angles: (a–d) 𝜃 = 30◦ ; (e–h) 𝜃 = 45◦ ; (i–l) 𝜃 = 60◦ .
5. Conclusions
A novel depth-segmentation method has been proposed with the aim of segmenting target objects in arbitrary directions while minimizing the problem of mis-segmentation. The orthogonal grating design makes the phase-map sequences sensitive to omnidirectional variations in the depth information. Because the optimization is based on scale-space theory, the optimization process removes noise and enhances the depth features without affecting the precision of the depth segmentation. The corresponding optimized framework reduces the interference of environmental noise and increases the degree of variation in the phase maps. The simulation results show that the ODSM is robust to environmental noise, with the method reaching a correct rate of 98.97% even at an SNR of 10. In the verification experiments, the ODSM always provided satisfactory depth-segmentation results with different fields of view or when subjected to a series of disturbances. As depth segmentation can be performed omnidirectionally, the ODSM has the flexibility to solve the problems of occlusion or mis-segmentation by adjusting only the angle between the camera and the projector. Consequently, the proposed technique has a wide range of potential applications, such as television production, robot grasping, and object recognition.
Funding

This work was supported by the Major Scientific Instrument and Equipment Development Project of the National Key Research and Development Program of China [grant number 2016YFF0101802].

References
[1] Crabb R, Tracey C, Puranik A, Davis J. Real-time foreground segmentation via range and color imaging. In: IEEE computer society conference on computer vision and pattern recognition workshops; 2008. [2] Wang O, Finger J, Yang Q, Davis J, Yang R. Automatic natural video matting with depth. In: 15th Pacific conference on computer graphics and applications (PG'07); 2007. p. 469–72. doi:10.1109/PG.2007.52. [3] Li L, An Q. An in-depth study of tool wear monitoring technique based on image segmentation and texture analysis. Measurement 2016;79:44–52. doi:10.1016/j.measurement.2015.10.029. [4] Yang X, Zeng C, Luo J, Lei Y, Tao B, Chen X. Absolute phase retrieval using one coded pattern and geometric constraints of fringe projection system. Appl Sci Basel 2018;8(12). doi:10.3390/app8122673. [5] Guo X, Tang J, Li J, Shen C, Liu J. Attitude measurement based on imaging ray tracking model and orthographic projection with iteration algorithm. ISA Trans 2019. doi:10.1016/j.isatra.2019.05.009. [6] Deng H, Deng J, Ma M, Zhang J, Yu L, Wang Z. 3D Information detection with novel five composite fringe patterns. Mod Phys Lett B 2017;1740088:1–7. doi:10.1142/S0217984917400887. [7] Ala R, Kim DH, Shin SY, Kim C, Park S-K. A 3d-grasp synthesis algorithm to grasp unknown objects based on graspable boundary and convex segments. Inf Sci 2015;295:91–106. doi:10.1016/j.ins.2014.09.062. [8] Kootstra G, Popović M, Jørgensen JA, Kuklinski K, Miatliuk K, Kragic D, et al. Enabling grasping of unknown objects through a synergistic use of edge and surface information. Int J Robot Res 2012;31(10):1190–213. doi:10.1177/0278364912452621. [9] Smith AR, Blinn JF. Blue screen matting. In: SIGGRAPH; 1996. p. 259–68. doi:10.1145/237170.237263. [10] Mishima Y. Curved color separation spaces for blue screen matting. SMPTE J 2001;110(3):131–9. doi:10.5594/J16464. [11] Dupont J, Deschenes F. Toward a realistic interpretation of blue-spill for blue-screen matting. In: The 3rd Canadian conference on computer and robot vision (CRV'06); 2006. p. 33–33. doi:10.1109/CRV.2006.77.
[12] Rao D, Le QV, Phoka T, Quigley M, Sudsang A, Ng AY. Grasping novel objects with depth segmentation. In: IEEE/RSJ international conference on intelligent robots and systems; 2010. [13] Toscana G, Rosa S, Bona B. Fast graph-based object segmentation for RGB-d images. In: Bi Y, Kapoor S, Bhatia R, editors. Proceedings of SAI intelligent systems conference (IntelliSys) 2016. Cham: Springer International Publishing; 2018. p. 42–58. ISBN 978-3-319-56991-8. [14] Lyu M, Wang W, Wang H, Wang H, Li G, Chen N, et al. Deep-learning-based ghost imaging. Sci Rep 2017;7(1):17865. [15] Yan K, Yu Y, Huang C, Sui L, Qian K, Asundi A. Fringe pattern denoising based on deep learning. Opt Commun 2019;437:148–52. doi:10.1016/j.optcom.2018.12.058. [16] Lyu M, Wang H, Li G, Situ G. Exploit imaging through opaque wall via deep learning. Tech Rep 2017. arXiv:170807881. [17] Han J, Chen H, Liu N, Yan C, Li X. Cnns-based RGB-d saliency detection via cross-view transfer and multiview fusion. IEEE Trans Cybern 2018;48(11):3171–83. doi:10.1109/TCYB.2017.2761775. [18] Krizhevsky A, Sutskever I, Hinton GE. Imagenet classification with deep convolutional neural networks. In: Proceedings of the 25th international conference on neural information processing systems - Volume 1. USA: Curran Associates Inc.; 2012. p. 1097–105. [19] Asif U, Bennamoun M, Sohel FA. A multi-modal, discriminative and spatially invariant CNN for RGB-d object labeling. IEEE Trans Pattern Anal Mach Intell 2018;40(9):2051–65. doi:10.1109/TPAMI.2017.2747134. [20] Chen L, Papandreou G, Kokkinos I, Murphy K, Yuille AL. Semantic image segmentation with deep convolutional nets and fully connected CRFS. Comput Sci 2014;4:357–61. [21] Zeng D, Zhu M. Background subtraction using multiscale fully convolutional network. IEEE Access 2018;6:16010–21. doi:10.1109/ACCESS.2018.2817129. [22] Chen X, Wang Y, Wang Y, Ma M, Zeng C. Quantized phase coding and connected region labeling for absolute phase retrieval. Opt Express 2016;24:28613–24. [23] Kolmogorov V, Criminisi A, Blake A, Cross G, Rother C. Bi-layer segmentation of binocular stereo video. In: 2005 IEEE computer society conference on computer vision and pattern recognition (CVPR'05), vol. 2; 2005. p. 1186. doi:10.1109/CVPR.2005.90. [24] Alahari K, Seguin G, Sivic J, Laptev I. Pose estimation and segmentation of people in 3d movies. In: 2013 IEEE international conference on computer vision; 2013. p. 2112–19. doi:10.1109/ICCV.2013.263. [25] Yao P, Zhang H, Xue Y, Zhou M, Xu G, Gao Z, et al. Segment-tree based cost aggregation for stereo matching with enhanced segmentation advantage. In: 2017 IEEE international conference on acoustics, speech and signal processing (ICASSP); 2017. p. 2027–31. doi:10.1109/ICASSP.2017.7952512. [26] Frick A, Kellner F, Bartczak B, Koch R. Generation of 3D-TV LDV-content with time-of-flight camera. In: 2009 3DTV Conference: the true vision - capture, transmission and display of 3D video; 2009. p. 1–4. doi:10.1109/3DTV.2009.5069624.
[27] Wang L, Gong M, Zhang C, Yang R, Zhang C, Yang Y-H. Automatic real-time video matting using time-of-flight camera and multichannel poisson equations. Int J Comput Vision 2012;97(1):104–21. doi:10.1007/s11263-011-0471-x. [28] Hoegner L, Hanel A, Weinmann M, Jutzi B, Hinz S, Stilla U. Towards people detection from fused time-of-flight and thermal infrared images. In: International archives of the photogrammetry, remote sensing and spatial information sciences; 2014. p. 121–6. [29] Cinque L, Danani A, Dondi P, Lombardi L. Real-time foreground segmentation with kinect sensor. In: Murino V, Puppo E, editors. Image analysis and processing — ICIAP 2015. Cham: Springer International Publishing; 2015. p. 56–65. ISBN 978-3-319-23234-8. [30] Camplani M, Salgado L. Background foreground segmentation with RGB-d kinect data: an efficient combination of classifiers. J Vis Commun Image Represent 2014;25(1):122–36. doi:10.1016/j.jvcir.2013.03.009. [31] Abramov A, Pauwels K, Papon J, Wörgötter F, Dellen B. Depth-supported real-time video segmentation with the kinect. In: 2012 IEEE workshop on the applications of computer vision (WACV); 2012. p. 457–64. doi:10.1109/WACV.2012.6163000. [32] Deng J, Li J, Feng H, Zeng Z. Flexible depth segmentation method using phase-shifted wrapped phase sequences. Opt Lasers Eng 2019;122:284–93. doi:10.1016/j.optlaseng.2019.06.016. [33] Shen C, Yang J, Tang J, Liu J, Cao H. Note: parallel processing algorithm of temperature and noise error for micro-electro-mechanical system gyroscope based on variational mode decomposition and augmented nonlinear differentiator. Rev Sci Instrum 2018;89:076107. doi:10.1063/1.5037052. [34] Wang H, Kemao Q, Gao W, Lin F, Seah HS. Fringe pattern denoising using coherence-enhancing diffusion. Opt Lett 2009;34(8):1141–3. doi:10.1364/OL.34.001141. [35] Wang Y, Zhang S. Optimal fringe angle selection for digital fringe projection technique. Appl Opt 2013;52(29):7094–8. doi:10.1364/AO.52.007094. [36] Zhang R, Guo H, Asundi AK. Geometric analysis of influence of fringe directions on phase sensitivities in fringe projection profilometry. Appl Opt 2016;55(27):7675–87. doi:10.1364/AO.55.007675. [37] Lowe DG. Distinctive image features from scale-invariant keypoints. Int J Comput Vision 2004;60(2):91–110. doi:10.1023/B:VISI.0000029664.99615.94. [38] Sporring J. Gaussian scale-space theory. Norwell, MA, USA: Kluwer Academic Publishers; 1997. ISBN 0792345614. [39] Perona P, Malik J. Scale-space and edge detection using anisotropic diffusion. IEEE Trans Pattern Anal Mach Intell 1990;12(7):629–39. doi:10.1109/34.56205. [40] Zuo C, Feng S, Huang L, Tao T, Yin W, Chen Q. Phase shifting algorithms for fringe projection profilometry: a review. Opt Lasers Eng 2018;109:23–59. doi:10.1016/j.optlaseng.2018.04.019.