Journal of Systems Engineering and Electronics Vol. 19, No. 6, 2008, pp.1115–1120
Multi-view video color correction using dynamic programming*
Shao Feng¹, Jiang Gangyi¹,² & Yu Mei¹,²
1. Faculty of Information Science and Engineering, Ningbo Univ., Ningbo 315211, P. R. China;
2. National Key Lab of Software New Technology, Nanjing Univ., Nanjing 210093, P. R. China
(Received August 11, 2007)
Abstract: Color inconsistency between views is an important problem to be solved in multi-view video systems. A multi-view video color correction method using dynamic programming is proposed. Three-dimensional histograms are constructed with sequential conditional probabilities in the HSI color space. Then, dynamic programming is used to seek the best color mapping relation via the minimum cost path between the target image histogram and the source image histogram. Finally, a video tracking technique is applied to correct the multi-view video. Experimental results show that the proposed method can obtain better subjective and objective performance in color correction.
Keywords: multi-view video, color correction, dynamic programming, video tracking.
1. Introduction

With the rapid development and wide application of digital video, the requirements for high-quality and diverse video representation are becoming ever higher. Traditional two-dimensional (2D) video can no longer fully satisfy these requirements. Multi-view video is a new type of natural video media that expands the user's sensation far beyond what is offered by traditional media. It allows users to freely choose their own viewpoint and viewing direction and to enjoy more photorealistic three-dimensional (3D) video. With these functions, multi-view video has promising applications in three-dimensional television (3DTV)[1], free viewpoint television (FTV)[2], and so on. When multi-view video is captured from different viewpoints with multiple cameras, it is difficult to guarantee color consistency across all cameras. In practical imaging, camera parameters in a multi-camera capturing system might be inconsistent, and exposure or focus might vary between views. In addition, it is often impossible to capture an object under perfectly constant lighting conditions at different spatial positions within an imaging environment. These
variations pose a serious challenge to the realization of multi-view video systems and degrade the performance of subsequent multi-view video coding (MVC)[3] or virtual view rendering[4]. Therefore, color correction is necessary as a pre-processing step in multi-view video processing. To eliminate the color inconsistency, several color correction methods have been proposed. Color constancy methods can eliminate inconsistent illumination by recovering surface spectral reflectance[5]. Reinhard et al.[6] presented a pioneering work on color transfer based on statistical analysis in the lαβ de-correlated color space. Using their method, one image's color characteristics can be imposed on another by transferring the mean and standard deviation. Fecker et al. used histogram matching to compensate for luminance and chrominance variations in a pre-filtering step[7]. By extending Reinhard's algorithm to multi-view images, we proposed a content-adaptive color correction method and a color correction method for view rendering based on disparity vectors[8-9]. In this article, we propose a new color correction method using dynamic programming, which can reveal the changing trend of the histograms between different views.
* This project was supported by the National Natural Science Foundation of China (60672073), the Program for New Century Excellent Talents in University (NCET-06-0537), the Natural Science Foundation of Ningbo (2008A610016), and the K. C. Wong Magna Fund in Ningbo University.
Then, a video tracking technique is used to achieve color correction for the multi-view video.
2. The proposed multi-view video color correction method

Figure 1 shows a typical multi-view video system. In this system, the same object is captured with multiple cameras under various lighting conditions. Color correction is first performed to compensate for the color inconsistency between the different cameras. Then, after data compression, the multi-view video is transmitted and interpolated to create virtual views at the client side. Therefore, color correction is vital for a multi-view video system, since it determines the subsequent coding efficiency and the quality of the synthesized virtual views.

Histograms are widely accepted as simple and useful probabilistic models. We define a cross-correlation matrix $\boldsymbol{C}$ between two gray-level histograms whose elements represent the bin-wise mutual distances. Let $h_t[m]$ be the target image histogram and $h_s[n]$ be the source image histogram, with $m = 1, \ldots, M$ and $n = 1, \ldots, N$. The correlation matrix is defined as

$$\boldsymbol{C} = h_t \otimes h_s = \begin{bmatrix} c_{11} & c_{12} & \cdots & c_{1N} \\ c_{21} & c_{22} & \cdots & c_{2N} \\ \vdots & \vdots & & \vdots \\ c_{M1} & c_{M2} & \cdots & c_{MN} \end{bmatrix} \qquad (1)$$
Here, each element $c_{mn}$ is a distance between histogram bins. The absolute difference is used as the distance norm, i.e., $c_{mn} = |h_t[m] - h_s[n]|$. Usually, $M$ is equal to $N$ in our processing.
Fig. 1  A typical multi-view video system
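As a concrete illustration, the following minimal sketch (assuming NumPy and 1-D histogram arrays; the function name is ours, not from the paper) builds the bin-wise absolute-difference matrix of Eq. (1).

```python
import numpy as np

def cross_correlation_matrix(h_t, h_s):
    """Bin-wise distance matrix C of Eq. (1): C[m, n] = |h_t[m] - h_s[n]|."""
    h_t = np.asarray(h_t, dtype=np.float64)
    h_s = np.asarray(h_s, dtype=np.float64)
    return np.abs(h_t[:, None] - h_s[None, :])  # shape (M, N)
```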
Let $p: \{(m_0, n_0), \ldots, (m_i, n_i), \ldots, (m_I, n_I)\}$ represent a minimum cost path from the vertex $c_{11}$ to $c_{MN}$ in the matrix $\boldsymbol{C}$. The sum of the matrix elements on the path $p$ gives the optimal route among all possible routes. The total length $I$ of the path is limited as $\sqrt{M^2 + N^2} \leqslant I \leqslant M + N$. For example, the sum of the diagonal elements represents the absolute error distance between the two histograms. After establishing the minimum cost path, the color mapping relation from the histogram $h_s$ to $h_t$ can be defined in the form of the mapping $f(n_i) = m_i$, as shown in Fig. 2. The inverse mapping is defined as $f^{-1}(m_i) = n_i$.

Fig. 2  Description of minimum cost path

In order to seek the optimum routes, Dijkstra's dynamic programming algorithm is used[10]. It is one of the most important and useful algorithms for generating optimal solutions to a large class of shortest path problems. Let $v$ be a vertex and $e$ be the edge distance between vertices; the cost of a path $p(v_0, v_S) = \{v_0, \ldots, v_S\}$ is the sum of its connective vertex edges

$$\Omega(p(v_0, v_S)) = \sum_{s=1}^{S} e(v_s) \qquad (2)$$
Suppose vertex $v_{m,n}$ has directional edges to vertices $v_{m+1,n}$, $v_{m,n+1}$, and $v_{m+1,n+1}$. That is to say, the current vertex $v_{m,n}$ can be reached along three possible routes, from $v_{m-1,n}$, $v_{m,n-1}$, or $v_{m-1,n-1}$, as shown in Fig. 3. Thus, overlaps of the bin indices and cyclic paths are avoided. If the minimum path costs from $v_{1,1}$ to $v_{m-1,n}$, $v_{m,n-1}$, and $v_{m-1,n-1}$ are known, the minimum path cost for the current vertex $v_{m,n}$ can be expressed as

$$\Omega(v_{m,n}) = \min\big\{\Omega(v_{i,j}) + e(v_{m,n}, v_{i,j}),\ v_{i,j} \in \{v_{m-1,n},\ v_{m,n-1},\ v_{m-1,n-1}\}\big\} \qquad (3)$$

Fig. 3  Relation of vertexes
By tracking the minimum cost vertices back from $\Omega(v_{M,N})$, the mapping relations can be established. However, accumulation and dispersion of the values will occur if the path does not traverse strictly along the diagonal. To penalize such routes, we add a penalty term $\delta$ to each horizontal edge $e(v_{m-1,n}, v_{m,n})$ and vertical edge $e(v_{m,n-1}, v_{m,n})$. The value of the penalty term is set to $\delta = \alpha \cdot c_{\max}$, where $c_{\max}$ is the maximum value in the cross-correlation matrix and

$$\alpha = \max\left\{\frac{\sum_{i=1}^{M}(c_{1,i} + c_{i,M})}{\sum_{i=1}^{M} c_{i,i}},\ \frac{\sum_{i=1}^{M}(c_{i,1} + c_{M,i})}{\sum_{i=1}^{M} c_{i,i}}\right\} \qquad (4)$$
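To make the procedure concrete, here is a minimal, unoptimized sketch of the minimum cost path search with the penalty of Eq. (4). NumPy is assumed; the function and variable names are ours, and the tie-breaking rule and the way a single-valued mapping is read off the path are illustrative choices, not taken from the paper.

```python
import numpy as np

def dp_histogram_mapping(h_t, h_s, alpha=None):
    """Minimum cost path through C (Eqs. (1)-(4)) and the mapping f(n) = m.

    h_t, h_s: 1-D target and source histograms of lengths M and N.
    Returns an integer array f of length N; f[n] is the target bin matched to
    source bin n (for columns crossed by several path vertices the smallest
    matched row is kept -- an illustrative choice).
    """
    h_t = np.asarray(h_t, dtype=np.float64)
    h_s = np.asarray(h_s, dtype=np.float64)
    M, N = h_t.size, h_s.size
    C = np.abs(h_t[:, None] - h_s[None, :])            # Eq. (1)

    # Penalty delta = alpha * c_max on horizontal/vertical edges, Eq. (4).
    if alpha is None:
        diag = C.diagonal().sum() + 1e-12
        alpha = max((C[0, :].sum() + C[:, -1].sum()) / diag,
                    (C[:, 0].sum() + C[-1, :].sum()) / diag)
    delta = alpha * C.max()

    cost = np.full((M, N), np.inf)
    move = np.zeros((M, N), dtype=np.int8)             # 0: diagonal, 1: from left, 2: from above
    cost[0, 0] = C[0, 0]
    for m in range(M):
        for n in range(N):
            if m == 0 and n == 0:
                continue
            candidates = []
            if m > 0 and n > 0:
                candidates.append((cost[m - 1, n - 1] + C[m, n], 0))
            if n > 0:
                candidates.append((cost[m, n - 1] + C[m, n] + delta, 1))
            if m > 0:
                candidates.append((cost[m - 1, n] + C[m, n] + delta, 2))
            cost[m, n], move[m, n] = min(candidates)   # recurrence of Eq. (3)

    # Backtrack from (M-1, N-1) to (0, 0) and read off the mapping.
    f = np.zeros(N, dtype=int)
    m, n = M - 1, N - 1
    while True:
        f[n] = m
        if m == 0 and n == 0:
            break
        if move[m, n] == 0:
            m, n = m - 1, n - 1
        elif move[m, n] == 1:
            n -= 1
        else:
            m -= 1
    return f
```

For gray-level images, applying f to each source pixel's bin index would perform the correction; the color case below extends this channel by channel.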
In the case of color images, two methods can be used to establish the color mapping relation for the three channels. One method establishes the mapping relation for each color channel separately, using the gray-level technique; however, the inter-channel correlations are then ignored completely. The other method constructs a 3D histogram from the color distribution information and then establishes a 3D mapping relation using dynamic programming, but its computational complexity is very high. Considering these advantages and disadvantages, we propose a novel method that maintains the correlations among the three channels while retaining the convenience of 1D histogram operations. We define three channel histograms $h^1$, $h^2_x$, and $h^3_{x,y}$ as

$$h^1(a) = P_h(x = a)$$
$$h^2_a(b) = P_h(y = b \mid x = a)$$
$$h^3_{a,b}(c) = P_h(z = c \mid y = b,\ x = a) \qquad (5)$$

where $P_h$ denotes probability, defined by a probability density function; the right-hand sides of the second and third rows are conditional probabilities. Considering the connection between these conditional probabilities, we get

$$h^1(a) \cdot h^2_a(b) \cdot h^3_{a,b}(c) = P_h(x = a) \cdot P_h(y = b \mid x = a) \cdot P_h(z = c \mid y = b,\ x = a) = P_h(x = a,\ y = b,\ z = c) = h(a, b, c) \qquad (6)$$
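The conditional histograms of Eq. (5) can be built directly from a quantized HSI image. The sketch below uses our own helper names and assumes a quantized integer HSI image as input; it returns $h^1$ plus two small closures that give $h^2_a$ and $h^3_{a,b}$ on demand, so the full 3D joint histogram is never stored explicitly.

```python
import numpy as np

def conditional_hists(hsi, bins=(360, 101, 256)):
    """Channel histograms of Eq. (5) for one quantized HSI image (sketch).

    hsi: integer array of shape (height, width, 3) holding quantized H, S, I.
    Returns h1 as an array plus two closures giving h2_a and h3_{a,b} on demand.
    """
    hq, sq, iq = (hsi[..., k].ravel() for k in range(3))

    def norm_hist(values, n_bins):
        counts = np.bincount(values, minlength=n_bins).astype(np.float64)
        return counts / max(counts.sum(), 1.0)

    h1 = norm_hist(hq, bins[0])                        # P_h(x = a), hue

    def h2(a):                                         # P_h(y = b | x = a), saturation
        return norm_hist(sq[hq == a], bins[1])

    def h3(a, b):                                      # P_h(z = c | y = b, x = a), intensity
        return norm_hist(iq[(hq == a) & (sq == b)], bins[2])

    return h1, h2, h3
```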
Therefore, the color mapping $(x', y', z') = f(x, y, z)$ is defined by applying dynamic programming sequentially

$$(x, x') = \mathrm{DP}\big({}^{(S)}h^1,\ {}^{(T)}h^1\big), \quad x \in I_S^1,\ x' \in I_T^1$$
$$(y, y') = \mathrm{DP}\big({}^{(S)}h^2_x,\ {}^{(T)}h^2_{x'}\big), \quad y \in I_S^2,\ y' \in I_T^2$$
$$(z, z') = \mathrm{DP}\big({}^{(S)}h^3_{x,y},\ {}^{(T)}h^3_{x',y'}\big), \quad z \in I_S^3,\ z' \in I_T^3 \qquad (7)$$

where DP denotes the dynamic programming algorithm, and $I_S^i$ and $I_T^i$ are the source image and target image pixel values in the $i$-th color channel, respectively. This sequential definition results in an appropriate mapping; it corresponds to the sequence of conditional probabilities in Eq. (6). Because there is no directly perceived relation among the channels in RGB color space, the HSI space is used in our method. The quantization scales for hue, saturation, and intensity are 360, 101, and 256, respectively.
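Putting the pieces together, one possible (unoptimized) composition of the two sketches above for Eq. (7) could look as follows. Here dp_histogram_mapping and conditional_hists are the illustrative helpers introduced earlier, not functions from the paper, and RGB-to-HSI conversion plus quantization (hue in [0, 360), saturation in [0, 101), intensity in [0, 256), stored as e.g. int32) are assumed to have been done beforehand.

```python
import numpy as np

def correct_image(src_hsi, tgt_hsi, bins=(360, 101, 256)):
    """Sequential per-channel mapping of Eq. (7), built on the sketches above.

    src_hsi, tgt_hsi: quantized integer HSI images of shape (height, width, 3).
    Returns the corrected source image in the same quantized HSI representation.
    """
    s_h1, s_h2, s_h3 = conditional_hists(src_hsi, bins)
    t_h1, t_h2, t_h3 = conditional_hists(tgt_hsi, bins)

    out = np.empty_like(src_hsi)
    hue, sat, inten = (src_hsi[..., k] for k in range(3))

    # Hue channel: one global 1-D mapping x -> x'.
    f_hue = dp_histogram_mapping(t_h1, s_h1)
    out[..., 0] = f_hue[hue]

    # Saturation and intensity: mappings conditioned on the already mapped bins.
    # Note: this nested loop is illustrative only and far from real-time.
    for a in np.unique(hue):
        a2 = int(f_hue[a])
        mask_a = hue == a
        f_sat = dp_histogram_mapping(t_h2(a2), s_h2(a))
        out[..., 1][mask_a] = f_sat[sat[mask_a]]
        for b in np.unique(sat[mask_a]):
            b2 = int(f_sat[b])
            mask_ab = mask_a & (sat == b)
            f_int = dp_histogram_mapping(t_h3(a2, b2), s_h3(a, b))
            out[..., 2][mask_ab] = f_int[inten[mask_ab]]
    return out
```

In practice the per-bin DP calls would be cached or vectorized; the sketch only mirrors the order of operations in Eq. (7).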
Because of the high temporal correlation between two consecutive frames and the almost unchanged imaging conditions, a video tracking technique is used to correct the multi-view video. In our method, once the color mapping relation has been established at time $t_0$, the same mapping is used for subsequent frames until the similarity between consecutive frames drops below a threshold. The similarity is defined as
$$\mathrm{Sim}(h_t^h, h_{t+1}^h) = 1 - \frac{1}{N}\sum_{i=1}^{N}\frac{\left|h_t^h[i] - h_{t+1}^h[i]\right|}{\max\big(h_t^h[i],\ h_{t+1}^h[i]\big)} \qquad (8)$$

where $h_t^h$ and $h_{t+1}^h$ are the hue histograms of the two consecutive frames.
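A direct reading of Eq. (8), guarding against empty bins (which the equation leaves unspecified), might be:

```python
import numpy as np

def hue_similarity(h_prev, h_next):
    """Similarity of Eq. (8) between hue histograms of consecutive frames."""
    h_prev = np.asarray(h_prev, dtype=np.float64)
    h_next = np.asarray(h_next, dtype=np.float64)
    # Bins that are empty in both histograms contribute no difference.
    denom = np.maximum(np.maximum(h_prev, h_next), 1e-12)
    return 1.0 - float(np.mean(np.abs(h_prev - h_next) / denom))
```

The established mapping would then simply be reused while hue_similarity stays above a chosen threshold and recomputed otherwise.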
3. Experimental results

In the experiments, the multi-view videos "golf2" and "objects3" provided by KDDI Corp., and "Jungle" and "Uli" provided by the HHI Lab., are used as test sets; their image sizes are 320×240 and 1 024×768, respectively. Figures 4–7(a) and (b) show the target images and source images of the "golf2", "objects3", "Jungle", and "Uli" test sets, respectively, and Figs. 4–7(c) show the images corrected with the proposed method. From the figures, it is seen that the color appearance of the corrected images is almost consistent with that of the reference images. Figures 4–7(d) show the video tracking correction results. Figure 8 shows the mapping curve for hue. From the curve, we can see that hue change is always the main factor in the color inconsistency.
Fig. 4  Correction results of "golf2"
Fig. 5  Correction results of "objects3"
Fig. 6  Correction results of "Jungle"
Fig. 7  Correction results of "Uli"
Fig. 8  Hue mapping curves

Table 1  Color correction objective performance comparisons

Test images             golf2             objects3          Jungle            Uli
Correction operation    RMSE     Euler    RMSE     Euler    RMSE     Euler    RMSE     Euler
With correction         105.76   0.022    111.87   0.073    103.16   0.030    97.54    0.029
Without correction      64.30    0.011    58.98    0.004    71.46    0.008    49.25    0.014
The root mean squared error (RMSE) and the Euler distance between the corrected image and the target image are calculated. Table 1 shows the results of the proposed method on the "golf2", "objects3", "Jungle", and "Uli" test images, compared with the results without correction. From the table, it is noted that the proposed correction method can achieve smaller RMSE and Euler distance.
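For reference, an RMSE computation in the spirit of this comparison could look like the following sketch; the exact color space used and the definition of the Euler distance are not specified in the paper, so both the function and its interpretation are assumptions.

```python
import numpy as np

def rmse(img_a, img_b):
    """Root mean squared error between two equally sized images (assumed metric)."""
    a = np.asarray(img_a, dtype=np.float64)
    b = np.asarray(img_b, dtype=np.float64)
    return float(np.sqrt(np.mean((a - b) ** 2)))
```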
4. Conclusions

Color inconsistency is an important problem to be solved in multi-view video systems, such as free viewpoint television and 3DTV. In this article, we propose a multi-view video color correction method using dynamic
programming. Experimental results show the effectiveness of the proposed method. However, histograms are sensitive to some imaging factors, such as camera noise. In future work, we will further investigate how to improve correction accuracy and reduce the influence of noise in color correction.
References

[1] Matusik W, Pfister H. 3DTV: a scalable system for real-time acquisition, transmission, and autostereoscopic display of dynamic scenes. ACM Trans. on Graphics, 2004, 24(3): 814–824.
[2] Tanimoto M. Overview of free viewpoint television. Signal Processing: Image Communication, 2006, 21(6): 454–461.
[3] Smolic A, Mueller K, Stefanoski N, et al. Coding algorithms for 3DTV - a survey. IEEE Trans. on Circuits and Systems for Video Technology, 2007, 17(11): 1606–1621.
[4] Tanimoto M, Fujii T, Suzuki K. Improvement of depth map estimation and view synthesis. ISO/IEC JTC1/SC29/WG11, Doc. M15090, Antalya, Turkey, 2008.
[5] Shao Feng, Jiang Gangyi, Yu Mei, et al. A new image correction method for multiview video system. Proc. of IEEE International Conference on Multimedia & Expo, 2006: 205–208.
[6] Reinhard E, Ashikhmin M, Gooch B, et al. Color transfer between images. IEEE Computer Graphics and Applications, 2001, 21(4): 34–41.
[7] Fecker U, Barkowsky M, Kaup A. Time-constant histogram matching for colour compensation of multi-view video sequences. Proc. of Picture Coding Symposium, 2007.
[8] Shao Feng, Jiang Gangyi, Yu Mei, et al. A content-adaptive multi-view video color correction algorithm. Proc. of IEEE International Conference on Acoustics, Speech, and Signal Processing, 2007, 1: 969–972.
[9] Shao Feng, Jiang Gangyi, Yu Mei. New color correction method of multi-view images for view rendering in free-viewpoint television. WSEAS Trans. on Computers, 2008, 7(5): 569–578.
[10] Cormen T H, Leiserson C E, Rivest R L, et al. Introduction to algorithms. Second Edition. The Massachusetts Institute of Technology Press, 2001: 323–369.

Shao Feng was born in 1980. He received his Ph.D. degree in electronic science and technology from Zhejiang University, China, in 2007. He joined Ningbo University in 2007. His research interests include image/video coding, image processing, and image perception. E-mail: [email protected]

Jiang Gangyi was born in 1964. He received his Ph.D. degree in electronic engineering from Ajou University, Korea, in 2000. He is now a professor at Ningbo University. His research interests include 3DTV signal processing, visual communication, and image processing. E-mail: [email protected]

Yu Mei was born in 1968. She received her Ph.D. degree in computer science from Ajou University, Korea, in 2000. She is now a professor at Ningbo University. Her research interests include image and video coding, and visual perception.