Signal Processing: Image Communication 27 (2012) 172–179
A new fast motion estimation algorithm based on the loop–epipolar constraint for multiview video coding

Zhaopeng Cui*, Guang Jiang, Shuai Yang, Chengke Wu
ISN National Key Lab, School of Telecommunications Engineering, Xidian University, Xi'an, China
* Corresponding author. E-mail address: [email protected] (Z. Cui).
doi:10.1016/j.image.2011.11.001
Article history: Received 14 May 2011; Accepted 10 November 2011; Available online 18 November 2011

Abstract
Geometric constraints exist between neighboring frames in multiview video sequences, and they are valuable for reducing spatial and temporal redundancy in multiview video coding (MVC). In this paper, we propose a new fast motion estimation algorithm based on the loop–epipolar constraint, which combines the loop and epipolar constraints. A practical search technique is designed according to the characteristics of the loop–epipolar constraint. Experimental results show that the proposed algorithm is efficient for sequences under different multiview camera setups.
Keywords: Motion estimation; Loop–epipolar constraint; Multiview video coding
1. Introduction

Multiview video is a group of video sequences captured by an array of cameras from different viewpoints at the same time. It has been used in video services such as free-viewpoint television (FTV) [1] and 3DTV [2] to provide viewers with an immersive viewing experience. Multiview video coding (MVC) has developed rapidly in recent years, aiming to compress multiview video data for efficient storage and transmission. The standardization of MVC has been completed by the Joint Video Team (JVT) of ITU-T VCEG and ISO/IEC MPEG [3]. The standard employs variable block-size motion estimation (ME) and disparity estimation (DE) to reduce the temporal and spatial redundancy in multiview video. For single-view video coding, many fast block motion estimation algorithms have been developed to speed up ME, mainly in the following ways. One way is to improve search patterns based on the study of motion-vector distribution characteristics [4–13]. Another is to obtain better starting points using the spatial or temporal correlation between blocks [14–20]. A third is the
early termination method, which trades off performance against complexity [19–22]. Recently, algorithms that blend the above approaches have been proposed [23–25]. All these algorithms can be used in MVC for both motion estimation and disparity estimation. There are geometric constraints between neighboring frames in MVC sequences, and these constraints are not considered in traditional ME algorithms. Based on this observation, some new algorithms have been proposed for ME and DE in MVC. In [26], the loop constraint and the geometry of the camera arrangement are utilized to reduce the search region for ME and DE by estimating the reliability of the predicted vectors, but backward disparity vectors must be computed for motion estimation. The loop constraint is also used in [27]. Instead of computing backward disparity vectors, that algorithm utilizes the correlation of disparity angles in the neighboring view to determine a basic line for motion estimation, but it is only suitable for the parallel multiview camera setup. Fast ME algorithms that adaptively utilize the inter-view correlation, in the sense of motion homogeneity across views, are proposed in [28,29]. These algorithms are efficient only for low-complexity multiview coding. In [30], an epipolar geometry-based fast DE
algorithm is proposed, which introduces a heuristic method of utilizing the epipolar constraint for block-based disparity estimation in MVC. However, that paper does not further explore the epipolar constraint for motion estimation in MVC. In this paper, we propose a new fast ME algorithm based on the loop–epipolar constraint, which exploits both the loop constraint and the epipolar constraint. Compared with the ME algorithm in [26], our algorithm can reduce the search region without computing backward disparity vectors. Our algorithm can be applied to sequences under different multiview camera arrangements, which makes up for the shortcomings of the algorithms in [27–29]. Moreover, our algorithm extends the application of the epipolar constraint from disparity estimation [30] to motion estimation. The rest of this paper is organized as follows. In Section 2, geometric constraints between two views are discussed. The proposed fast motion estimation algorithm is described in Section 3. Section 4 gives experimental results. Conclusions are presented in Section 5.

2. Geometric constraints between two views

2.1. Epipolar constraint

The epipolar geometry is the intrinsic projective geometry between two views [31]. It gives the geometric constraint shown in Fig. 1. Let C_1 and C_2 be the optical centers of the two cameras, and let p_1 = [x_1, y_1, 1]^T and p_2 = [x_2, y_2, 1]^T be the images of a 3D point P in the two image planes respectively. According to the epipolar geometry, the relationship between p_1 and p_2 can be expressed by

p_2^T F p_1 = 0,    (1)
where F is called the fundamental matrix. Suppose the world coordinate system coincides with the first camera coordinate system; then

F = K_2^{-T} [t]_x R K_1^{-1},    (2)
where K_1 and K_2 are the intrinsic matrices of the first and second cameras respectively; R and t are the rotation matrix and the translation vector of the second camera respectively; [t]_x represents the skew-symmetric matrix of the vector t. From Eq. (1), it follows directly that

p_2^T l_2 = 0,    (3)

where l_2 is called the epipolar line of p_1 and is defined as

l_2 = F p_1.    (4)
In practice, the camera parameters of multiview video sequences are usually known, because multiview video system designers typically calibrate the cameras before capturing the sequences [32,33]. The fundamental matrix can then be computed from Eq. (2), and the epipolar line for a given point from Eq. (4).
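The computation of Eqs. (2) and (4) from calibrated parameters can be sketched in a few lines. The following is a minimal pure-Python illustration; the matrix values used in any real system would come from calibration, and the helper names are our own, not part of any coding standard.

```python
# Sketch of Eqs. (2) and (4): build the fundamental matrix from known
# camera parameters, then compute the epipolar line of an image point.

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def inv3(A):
    """Invert a 3x3 matrix via the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = A
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[x / det for x in row] for row in adj]

def skew(t):
    """Skew-symmetric matrix [t]_x of a 3-vector t."""
    return [[0, -t[2], t[1]],
            [t[2], 0, -t[0]],
            [-t[1], t[0], 0]]

def fundamental_matrix(K1, K2, R, t):
    """F = K2^{-T} [t]_x R K1^{-1}  (Eq. (2))."""
    return matmul(matmul(transpose(inv3(K2)), matmul(skew(t), R)),
                  inv3(K1))

def epipolar_line(F, p1):
    """l2 = F p1 for a homogeneous point p1 = [x, y, 1]  (Eq. (4))."""
    return [sum(F[i][j] * p1[j] for j in range(3)) for i in range(3)]
```

For a pure horizontal translation with identity intrinsics, the resulting epipolar line of a point is simply its image row, which matches the intuition behind the parallel camera setup.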
2.2. Loop constraint

The loop constraint is another natural constraint between two views [27,34]. Suppose that a point is located at P at time t and moves to P' at time t+1. If we draw its images in the two views at times t and t+1 together in one image, we obtain the relationship between these four points shown in Fig. 2. It can be observed that

vec(p_2' p_1') + vec(p_1' p_1) = vec(p_2' p_2) + vec(p_2 p_1),    (5)

where vec(ab) denotes the displacement vector from a to b.
The constraint given by Eq. (5) is called the loop constraint. However, as a point-to-point constraint, the loop constraint cannot be used directly in the block-based coding process, for the following reasons. First, for block-based coding, one vector cannot be derived directly from the three known vectors through Eq. (5); at least one backward vector must be known to derive one vector from the others [35]. For example, let view 1 and view 2 be the reference view and the current encoding view respectively, and let DV(p) and MV(p) denote the disparity and motion vectors of the block centered at p respectively. Then, as shown in Fig. 2, if MV(p_1'), DV(p_2') and DV(p_2) are known, MV(p_2') cannot be obtained directly, and a backward vector vec(p_1 p_2) must first be computed. Generally, for blocks, vec(p_1 p_2) cannot be taken as the opposite of DV(p_2), because MV(p_1') and DV(p_2) may not point to exactly the same block due to block matching. In this case, vec(p_1 p_2) has to be estimated from the existing backward vectors in the neighborhood.
Fig. 1. The epipolar geometry.
Fig. 2. The loop constraint in two views.
Second, for block-based coding, the vectors may contain errors caused by shape distortion, illumination changes and so on. Any error in one vector may affect the loop constraint and cause errors to accumulate. The loop constraint for blocks should therefore be written as

DV(p_2') + MV(p_1') ≈ MV(p_2') + DV(p_2).    (6)
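Rearranging Eq. (6) gives a simple motion-vector predictor for the current block from the three known vectors in the loop. A minimal sketch, with purely illustrative vector values:

```python
# Block-level loop constraint of Eq. (6):
#   DV(p2') + MV(p1') ≈ MV(p2') + DV(p2),
# so a predictor for the current block's motion vector is
#   MV(p2') ≈ DV(p2') + MV(p1') - DV(p2).
# The vectors below are illustrative, not taken from any sequence.

def predict_mv(dv_p2_prime, mv_p1_prime, dv_p2):
    """Predict MV(p2') from the three known vectors in the loop."""
    return (dv_p2_prime[0] + mv_p1_prime[0] - dv_p2[0],
            dv_p2_prime[1] + mv_p1_prime[1] - dv_p2[1])
```

Because of the approximation in Eq. (6), such a predictor only suggests a search region; it is not taken as the final motion vector.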
3. Fast motion estimation algorithm based on the loop–epipolar constraint

3.1. Loop–epipolar constraint
Fig. 3. The loop–epipolar constraint in two views.
In this paper, a new method is explored to utilize the epipolar and loop constraints. Combining these two constraints, we obtain the loop–epipolar constraint shown in Fig. 3. Notice that the epipolar geometry provides bidirectional constraints between the views. On the one hand, l_1' provides a good constraint for the search of DV(p_2'); on the other hand, l_2 provides a good constraint for the search of MV(p_2'). Therefore, using the loop–epipolar constraint, we can first
Fig. 4. Visual inspection. (a) view = 3, t = 11; (b) view = 2, t = 11; (c) view = 3, t = 12; (d) view = 2, t = 12; (e) an enlarged part of (b). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
locate l_2 to determine a rectangular search range when searching for MV(p_2'), instead of computing backward vectors. Fig. 4 shows a visual inspection of the loop–epipolar relationship in four neighboring frames in two views of the multiview sequence Breakdancer. The second and third views of Breakdancer are selected as the current encoding view and the reference view respectively. The red cross in Fig. 4(d) shows the position of a chosen block in the current encoding frame; the blue cross in Fig. 4(c), the black cross in Fig. 4(a) and the green cross in Fig. 4(b) show the positions of its best matching blocks in the neighboring frames; the cyan and yellow lines show the epipolar lines. We draw the four points and the two epipolar lines together in Fig. 4(b) and enlarge one part of the figure for a clearer view. From Fig. 4(e), we can see that the green cross lies on the yellow epipolar line, and the four blocks conform to the loop–epipolar constraint, just as described above.

To examine the constraint further, we studied the locations of the best matching blocks. The distance d_l2 from the best matching block obtained by the full search to l_2 is computed. Three sequences, Ballroom, Ballet and Breakdancer, are tested with a JMVC 8.2 encoder. The experimental conditions are as follows: four basis quantization parameters are tested (25, 29, 33 and 37); the GOP size is 8; the number of reference frames is 2; the search range is 64; the block size for analysis is 16×16. The results are listed in Table 1. It can be seen that most best matching blocks are close to l_2, which is consistent with our earlier analysis. At the same time, not all best matching blocks obey the constraint, due to occlusion, different levels of illumination, low texture, etc. These factors can push some best matching blocks far away from l_2 [36].

3.2. Formulation of the proposed algorithm

From the discussion above, we learn that the best ME matching block of a given block is mainly distributed in an
area along the line determined by the loop–epipolar constraint. This characteristic is valuable for motion estimation, because it helps to reduce the number of search candidates and hence the ME time. Based on this observation, a practical ME search technique is proposed in this section. It is well known that a good starting point plays an important role in a search strategy. In [30], it was shown in detail that the orthogonal projection epipolar search center (OPESC) is a better starting point than the median predicted search center (MPSC) for an epipolar geometry-based search. The OPESC is defined as the projection of the MPSC onto the epipolar line. In this paper, we adopt the small diamond search pattern (SDSP) centered at the OPESC to refine the starting point, as shown in Fig. 5. Our experiments show that refining the starting point improves the rate-distortion (R-D) performance with a negligible change in encoding time, as discussed further in Section 4. After the starting point is located, a proper search process must be designed. According to the distribution of d_l2, we first adopt a rotated unsymmetrical rood-pattern search centered at the starting point and parallel to l_2, as shown in Fig. 6. The sampling grid size is set to 2. Then the recursive large diamond search pattern (LDSP) is used. From Table 1, we notice that most best matching blocks lie in the area of d_l2 ≤ 6 pixels. The largest search step of the LDSP is 2 pixels, so three recursive passes are enough to track these blocks. In the last stage, the small diamond search pattern is used to refine the search result. As shown in Fig. 7, the proposed algorithm can be summarized as follows:

Step 1: Locate p_1' through DV(p_2');
Step 2: Locate p_1 through p_1' and MV(p_1');
Step 3: Locate l_2 corresponding to p_1;
Step 4: Project the MPSC onto l_2 to get the OPESC, and conduct the small diamond search pattern centered at the OPESC to determine the starting point;
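The projection and SDSP refinement of Step 4 can be sketched as follows. The cost function stands in for the block-matching error (an assumption on our part; the encoder's actual cost would include rate terms), and the helper names are our own:

```python
# Step 4 sketch: orthogonal projection of the median predicted search
# center (MPSC) onto the epipolar line l2 = (a, b, c) gives the OPESC;
# the small diamond search pattern (SDSP) around it is then evaluated
# with a caller-supplied matching-cost function.

SDSP = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]

def opesc(mpsc, l2):
    """Project the point mpsc = (x, y) onto the line a*x + b*y + c = 0."""
    a, b, c = l2
    x, y = mpsc
    d = (a * x + b * y + c) / (a * a + b * b)
    return (x - a * d, y - b * d)

def refine_start(mpsc, l2, cost):
    """Pick the SDSP candidate around the OPESC with the lowest cost."""
    cx, cy = opesc(mpsc, l2)
    return min(((cx + dx, cy + dy) for dx, dy in SDSP), key=cost)
```

In an encoder the candidates would additionally be clamped to the search window and rounded to the motion-vector grid.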
Table 1
Distribution of d_l2 in pixels.

Image sequence   QP    r1 (%)  r2 (%)  r3 (%)  r4 (%)  r5 (%)  r6 (%)
Ballroom         25    34.19   69.23   81.31   84.92   86.65   87.91
                 29    36.23   70.19   82.18   85.55   87.25   88.45
                 33    38.86   70.53   82.58   85.93   87.63   88.84
                 37    43.20   71.90   83.49   86.62   88.44   89.72
                 Avg.  38.12   70.46   82.39   85.76   87.49   88.73
Ballet           25    30.17   51.67   64.49   74.89   81.87   85.52
                 29    27.18   47.39   60.55   71.55   79.22   82.85
                 33    24.94   43.88   57.26   68.62   76.07   79.78
                 37    22.09   38.71   51.95   63.40   71.40   75.44
                 Avg.  26.10   45.41   58.56   69.62   77.14   80.90
Breakdancer      25    24.60   43.38   53.45   61.19   66.87   71.11
                 29    24.43   43.47   54.51   63.24   69.85   74.49
                 33    23.40   42.37   54.59   64.14   71.38   76.29
                 37    21.15   39.75   53.47   64.15   72.37   77.94
                 Avg.  23.40   42.24   54.01   63.18   70.12   74.96
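The statistic d_l2 tabulated above is the ordinary point-to-line distance from the center of the best matching block to the epipolar line, which can be sketched as:

```python
# Sketch of the d_l2 statistic of Table 1: distance from a block
# center (x, y) to the epipolar line l2 = (a, b, c), i.e.
# |a*x + b*y + c| / sqrt(a^2 + b^2).

def dist_to_epipolar_line(p, l2):
    a, b, c = l2
    x, y = p
    return abs(a * x + b * y + c) / (a * a + b * b) ** 0.5
```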
Fig. 5. An example of choosing the starting point.
Step 5: Perform the rotated unsymmetrical rood-pattern search in parallel with l_2;
Step 6: Adopt the large diamond search pattern centered at the previous point with the minimum matching error (MME) and update the MME. If the updated MME occurs at the center point or the recursive count reaches three, go to Step 7; otherwise, repeat this step;
Step 7: Use the small diamond search pattern centered at the point with the MME to refine the search result.

Fig. 6. An example of the search process.
Fig. 7. Flowchart of our proposed algorithm.

4. Experimental results

The proposed algorithm has been implemented in the MVC reference software JMVC 8.2. The detailed parameter configuration of the JMVC encoder is listed in Table 2. Multiview sequences captured under different camera arrangements and with different characteristics are employed in our experiments. Ballroom and Exit are provided by Mitsubishi Electric Research Laboratories; Race1 is provided by KDDI; Ballet and Breakdancer are provided by Microsoft. The details of these sequences are listed in Table 3. In terms of motion complexity, Ballet and Exit represent smooth sequences; Ballroom represents complex sequences; Race1 represents sequences captured by moving cameras with fixed relative positions. In terms of camera arrangement, Ballroom, Exit and Race1 are captured by the parallel multiview camera setup; Ballet and Breakdancer are captured by the convergent multiview camera setup. We select two neighboring views, of which one is previously coded and considered as the reference view, while the other is considered as the current encoding view. Throughout the test, the algorithm in [30] is used for disparity estimation. The results of the current encoding view are reported. The Bjontegaard delta bit rate (BDBR) and the Bjontegaard delta PSNR (BDPSNR) [37] are used to measure the R-D performance difference between algorithms; the average encoding time change rate (ΔTime) is used to compare encoding speed.

Table 2
Parameter configuration of the JMVC encoder.

Parameters              Value
BasisQP                 25, 29, 33, 37
GOPSize                 8
NumberReferenceFrames   2
SearchRange             64
SymbolMode              CABAC
InterPredPicsFirst      0

In order to test whether the SDSP refinement of the starting point is needed, we first compare the algorithms with and without it. The results are shown in Table 4. From Table 4, we find that the refinement of starting points improves the R-D performance for all the test sequences, especially for Race1, Ballet and Breakdancer. This is because the refinement can absorb errors of the vectors in the loop–epipolar constraint, for example, the errors of motion vectors caused by the movement of cameras in Race1 and the errors of disparity vectors caused by the convergent camera arrangement in Ballet and Breakdancer. At the same time, the change of encoding time is negligible. The encoding time even decreases for Race1 and Breakdancer, which is due to the fact that a
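The search procedure of Steps 5–7 can be sketched as follows. This is a simplified illustration: a symmetric rood along the line direction stands in for the paper's rotated unsymmetrical rood pattern, and the cost function is a stand-in for the block-matching error:

```python
# Simplified sketch of the Step 5-7 search. Offsets follow the text:
# rood arms parallel to l2 with grid size 2, the large diamond search
# pattern (LDSP) run at most three times, and a final SDSP refinement.

LDSP = [(0, 0), (2, 0), (-2, 0), (0, 2), (0, -2),
        (1, 1), (1, -1), (-1, 1), (-1, -1)]
SDSP = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]

def best(center, pattern, cost):
    """Candidate around `center` with the lowest matching cost."""
    cx, cy = center
    return min(((cx + dx, cy + dy) for dx, dy in pattern), key=cost)

def loop_epipolar_search(start, direction, cost, arm=6, grid=2):
    """Rood search along `direction` (unit vector of l2), then at most
    three LDSP passes, then SDSP refinement; returns the final point."""
    ux, uy = direction
    rood = [(round(k * ux), round(k * uy))
            for k in range(-arm, arm + 1, grid)]
    p = best(start, rood, cost)
    for _ in range(3):          # recursive LDSP, at most three passes
        q = best(p, LDSP, cost)
        if q == p:              # the MME stays at the center point
            break
        p = q
    return best(p, SDSP, cost)  # final SDSP refinement
```

The three-pass limit reflects the observation from Table 1 that most best matches lie within d_l2 ≤ 6 pixels of the epipolar line, reachable by three LDSP steps of size 2.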
Table 3
Details of the sequences.

Image sequence  Format    Frame rate (fps)  Camera arrangement  Reference view  Encoding view
Ballroom        640×480   25                Parallel            0               1
Exit            640×480   25                Parallel            0               1
Race1           640×480   30                Parallel            2               3
Ballet          1024×768  15                Convergent          1               2
Breakdancer     1024×768  15                Convergent          3               2
Table 4
Performance of algorithms with and without the refinement of starting points.

                     Without refinement                With refinement
Image sequence  QP   Bitrate    PSNR     Time (s)   Bitrate    PSNR     Time (s)   BDBR (%)  BDPSNR (dB)  ΔTime (%)
                     (kbits/s)  (dB)                (kbits/s)  (dB)
Ballroom        25   1164.95    38.111   1181.81    1163.01    38.114   1184.11    0.17      0.007        0.49
                29   651.68     36.322   1155.46    650.78     36.324   1160.17
                33   380.38     34.198   1131.84    380.06     34.199   1142.33
                37   221.04     31.753   1116.92    220.55     31.755   1121.89
Exit            25   556.65     39.256   1131.25    556.73     39.259   1135.62    0.09      0.004        0.56
                29   279.26     37.986   1107.54    278.85     37.988   1111.50
                33   165.78     36.424   1089.40    166.01     36.427   1099.85
                37   104.36     34.589   1080.74    104.16     34.592   1086.56
Race1           25   951.18     40.024   2464.85    948.26     40.037   2439.75    0.65      0.027        0.32
                29   525.94     37.868   2415.91    524.35     37.882   2424.72
                33   311.99     35.769   2377.02    310.68     35.778   2349.22
                37   195.41     33.428   2352.53    194.44     33.437   2365.42
Ballet          25   426.48     40.839   1086.39    422.33     40.845   1088.07    1.05      0.038        0.07
                29   243.75     39.598   1076.57    240.70     39.606   1078.10
                33   152.42     37.998   1071.54    151.49     38.008   1072.19
                37   101.74     36.039   1067.28    101.08     36.045   1066.64
Breakdancer     25   873.68     38.529   1140.53    870.51     38.531   1137.13    0.85      0.018        0.03
                29   410.12     37.498   1113.59    406.94     37.500   1113.33
                33   242.07     36.256   1098.04    240.13     36.260   1098.07
                37   160.23     34.750   1082.99    159.30     34.756   1085.03
better starting point may lead to fewer recursive passes of the LDSP. Based on these experiments and analysis, we can see that it is necessary to use the SDSP to refine the starting point.

To test the performance of the proposed motion estimation algorithm, we compare it with the fast ME algorithm proposed in [26] and with the TZSearch fast algorithm in JMVC. The ME algorithm in [26] utilizes the loop constraint to obtain a predicted vector, compares it with the median predicted vector to determine a new search range, and then applies the full search within that range. The search range factor FSR in [26] is set to 16 in our test. TZSearch combines several techniques such as motion vector prediction and early termination. Experimental results are given in Table 5. From Table 5, we can see that our algorithm reduces the encoding time by 76.92% to 87.64% compared with the fast ME algorithm in [26]. There are several reasons for the better performance of our algorithm. First, our algorithm does not need to compute the backward vectors, which reduces the computational complexity. Second, the epipolar line provides a good constraint that narrows the search area. Third, our algorithm combines the unsymmetrical rood-pattern
Table 5
Performance of the proposed algorithm compared with [26] and TZSearch.

                 Compared with [26]                    Compared with TZSearch
Image sequence   BDBR (%)  BDPSNR (dB)  ΔTime (%)     BDBR (%)  BDPSNR (dB)  ΔTime (%)
Ballroom         2.29      0.087        82.63         1.64      0.062        26.91
Exit             3.37      0.094        82.87         2.09      0.057        16.35
Race1            1.12      0.048        87.64         0.40      0.017        24.96
Ballet           1.23      0.043        85.72         2.69      0.091        16.72
Breakdancer      1.57      0.035        76.92         2.21      0.048        38.43
and diamond search algorithms, which are faster than the full search used in [26]. Compared with TZSearch, our algorithm decreases the encoding time by 16.35% to 38.43%; the decrease is smaller for Ballet and Exit because they are smooth sequences, for which TZSearch also performs well thanks to its early termination. Fig. 8 shows the R-D curves of TZSearch and the proposed algorithm for the test sequences. The curves of the two algorithms are very close to each other, which shows that the video quality loss of our algorithm is negligible.
Fig. 8. The R-D curves of TZSearch and the proposed algorithm for the test sequences. (a) Ballroom, (b) Exit, (c) Race1, (d) Ballet, (e) Breakdancer.
5. Conclusion

In this paper, a new fast motion estimation algorithm suitable for different multiview camera arrangements is proposed to speed up the ME process in MVC. The proposed algorithm is based on the loop–epipolar constraint, which is shown to hold in multiview sequences. Experimental results show that the proposed algorithm greatly reduces the encoding time with negligible video quality loss.
Acknowledgements

This work is supported by the NSFC Grants (No. 60775020, No. 61072105) and the Fundamental Research Funds for
the Central Universities. We would also like to thank the anonymous reviewers for their valuable comments and suggestions.

References

[1] F. Isgro, E. Trucco, P. Kauff, O. Schreer, Three-dimensional image processing in the future of immersive media, IEEE Transactions on Circuits and Systems for Video Technology 14 (3) (2004) 288–303.
[2] M. Tanimoto, Free viewpoint television – FTV, in: Proceedings of Picture Coding Symposium 2004, San Francisco, December 2004.
[3] Information Technology – Coding of Audio-Visual Objects – Part 10: Advanced Video Coding, Amendment 1: Multiview Video Coding, ISO/IEC 14496-10:2008/FDAM 1:2008(E), ISO/IEC JTC1 doc. ref. SC 29N 9783 (FDAM), March 2009.
[4] J.R. Jain, A.K. Jain, Displacement measurement and its application in interframe image coding, IEEE Transactions on Communications 29 (12) (1981) 1799–1808.
[5] T. Koga, K. Iinuma, A. Hirano, Y. Iijima, T. Ishiguro, Motion compensated interframe coding for video conferencing, in: Proceedings of the National Telecommunications Conference, New Orleans, LA, 1981.
[6] R. Li, B. Zeng, M.L. Liou, A new three-step search algorithm for block motion estimation, IEEE Transactions on Circuits and Systems for Video Technology 4 (4) (1994) 438–442.
[7] L.M. Po, W.C. Ma, A novel four-step search algorithm for fast block motion estimation, IEEE Transactions on Circuits and Systems for Video Technology 6 (3) (1996) 313–317.
[8] L.K. Liu, E. Feig, A block-based gradient descent search algorithm for block motion estimation in video coding, IEEE Transactions on Circuits and Systems for Video Technology 6 (4) (1996) 419–422.
[9] J.Y. Tham, S. Ranganath, M. Ranganath, A.A. Kassim, A novel unrestricted center-biased diamond search algorithm for block motion estimation, IEEE Transactions on Circuits and Systems for Video Technology 8 (4) (1998) 369–377.
[10] S. Zhu, K.-K. Ma, A new diamond search algorithm for fast block-matching motion estimation, IEEE Transactions on Circuits and Systems for Video Technology 9 (2) (2000) 287–290.
[11] C. Zhu, X. Lin, L.-P. Chau, Hexagon-based search pattern for fast block motion estimation, IEEE Transactions on Circuits and Systems for Video Technology 12 (5) (2002) 349–355.
[12] C.-H. Cheung, L.-M. Po, A novel cross-diamond search algorithm for fast block motion estimation, IEEE Transactions on Circuits and Systems for Video Technology 12 (12) (2002) 1168–1177.
[13] X. Jing, L.P. Chau, An efficient three-step search algorithm for block motion estimation, IEEE Transactions on Multimedia 6 (3) (2004) 435–438.
[14] C.-H. Hsieh, P.C. Lu, J.-S. Shyn, E.-H. Lu, Motion estimation algorithm using interblock correlation, Electronics Letters 26 (5) (1990) 276–277.
[15] J. Chana, P. Agathoklis, Adaptive motion estimating for efficient video compression, in: Conference Record of the 29th Asilomar Conference on Signals, Systems and Computers, vol. 1, 1996, pp. 690–693.
[16] L.-J. Luo, C. Zou, X.-Q. Gao, A new prediction search algorithm for block motion estimation in video coding, IEEE Transactions on Consumer Electronics 43 (February) (1997) 56–61.
[17] D.-W. Kim, J.-S. Choi, J.-T. Kim, Adaptive motion estimation based on spatio-temporal correlation, Signal Processing: Image Communication 13 (1998) 161–170.
[18] J.-B. Xu, L.-M. Po, C.-K. Cheng, Adaptive motion tracking block matching algorithms for video coding, IEEE Transactions on Circuits and Systems for Video Technology 9 (October) (1999) 1025–1029.
[19] Y. Nie, K.-K. Ma, Adaptive rood pattern search for fast block-matching motion estimation, IEEE Transactions on Image Processing 11 (12) (2002) 1442–1449.
[20] Y. Nie, K.-K. Ma, Adaptive irregular pattern search with matching prejudgment for fast block-matching motion estimation, IEEE Transactions on Circuits and Systems for Video Technology 15 (6) (2005) 789–794.
[21] J.-F. Yang, S.-H. Chang, C.-Y. Chen, Computation reduction for motion search in low rate video coders, IEEE Transactions on Circuits and Systems for Video Technology 12 (October) (2002) 948–951.
[22] L. Yang, K. Yu, J. Li, S. Li, An effective variable block-size early termination algorithm for H.264 video coding, IEEE Transactions on Circuits and Systems for Video Technology 15 (6) (2005) 784–788.
[23] Z. Chen, J. Xu, Y. He, J. Zheng, Fast integer-pel and fractional-pel motion estimation for H.264/AVC, Journal of Visual Communication and Image Representation 17 (2) (2006) 264–290.
[24] J.-J. Tsai, H.-M. Hang, Modeling of pattern-based block motion estimation and its application, IEEE Transactions on Circuits and Systems for Video Technology 19 (1) (2009) 108–113.
[25] K.-H. Ng, L.-M. Po, K.-M. Wong, C.-W. Ting, K.-W. Cheung, A search patterns switching algorithm for block motion estimation, IEEE Transactions on Circuits and Systems for Video Technology 19 (5) (2009) 753–759.
[26] Y. Kim, J. Kim, K. Sohn, Fast disparity and motion estimation for multi-view video coding, IEEE Transactions on Consumer Electronics 53 (2) (2007) 712–719.
[27] X. Li, D. Zhao, S. Ma, W. Gao, Fast disparity and motion estimation based on correlations for multiview video coding, IEEE Transactions on Consumer Electronics 54 (4) (2008) 2037–2044.
[28] L. Shen, Z. Liu, T. Yan, Z. Zhang, P. An, View-adaptive motion estimation and disparity estimation for low complexity multiview video coding, IEEE Transactions on Circuits and Systems for Video Technology 20 (6) (2010) 925–930.
[29] L. Shen, Z. Liu, S. Liu, Z. Zhang, P. An, Selective disparity estimation and variable size motion estimation based on motion homogeneity for multi-view coding, IEEE Transactions on Broadcasting 55 (4) (2009) 761–766.
[30] J. Lu, H. Cai, J. Lou, J. Li, An epipolar geometry-based fast disparity estimation algorithm for multiview image and video coding, IEEE Transactions on Circuits and Systems for Video Technology 17 (6) (2007) 737–750.
[31] R.I. Hartley, A. Zisserman, Multiple View Geometry in Computer Vision, 2nd ed., Cambridge University Press, 2003.
[32] J. Lou, H. Cai, J. Li, A real-time interactive multi-view video system, in: Proceedings of the ACM International Conference on Multimedia, Singapore, November 2005, pp. 161–170.
[33] C. Zitnick, S. Kang, M. Uyttendaele, S. Winder, R. Szeliski, High-quality video interpolation using a layered representation, in: Proceedings of ACM Conference on Computer Graphics, 2004, pp. 600–608.
[34] W. Yang, K. Ngan, J. Lim, K. Sohn, Joint motion and disparity fields estimation for stereoscopic video sequence, Signal Processing: Image Communication 20 (3) (2005) 265–276.
[35] Y. Kim, J. Lee, C. Park, K. Sohn, MPEG-4 compatible stereoscopic sequence codec for stereo broadcasting, IEEE Transactions on Consumer Electronics 51 (4) (2005) 1227–1236.
[36] Z. Cui, G. Jiang, D. Wang, C. Wu, A novel homography-based search algorithm for block motion estimation in video coding, in: Proceedings of the IEEE International Conference on Multimedia and Expo, Barcelona, July 2011.
[37] G. Bjontegaard, Calculation of average PSNR differences between RD-curves, Document VCEG-M33, VCEG 13th Meeting, Austin, TX, April 2001.