Signal Processing: Image Communication 17 (2002) 293–304
Object-based encoding for multi-view sequences of 3D object

JongWon Yi*, KwangYeon Rhee, SeongDae Kim

Division of Electrical Engineering, Department of Electrical Engineering and Computer Science, Korea Advanced Institute of Science and Technology, South Korea

Received 5 January 2001; received in revised form 22 May 2001; accepted 31 August 2001

*Corresponding author. E-mail address: [email protected] (J.-W. Yi).
Abstract

Image-based rendering is a powerful and promising approach to 3D object representation. This approach considers a 3D object or a scene as a collection of images, called key frames, taken from reference viewpoints, and generates arbitrary views of the object using these key frames. In this paper, we propose an object-based encoding method for multi-view sequences of a 3D object and a view synthesis algorithm for predicting new views using an image-based rendering approach. © 2002 Elsevier Science B.V. All rights reserved.
1. Introduction

Representation and storage of 3D objects or scenes have been essential parts of many applications. Recently, there have been growing demands for more sophisticated and intelligent representation of 3D objects or scenes in applications such as:

* computer graphics communication: storage or transmission of sequences generated by computer graphics,
* natural and synthetic hybrid coding/composition: composition of natural sequences with 3D computer graphics, and vice versa,
* virtual reality and tele-presence applications: visual communication such as virtual teleconferencing,
* multimedia internet applications: object databases with content-based retrieval capability, such as virtual museums and tele-shopping.

Geometry-based rendering has been the typical approach to these problems, but it has several major drawbacks. First, photo-realistic rendering requires perfect modeling of an object, which is very difficult or even impossible for a complex object. Second, rendering is computationally expensive, especially for a complex object: it requires a specialized hardware accelerator for real-time applications, and the rendering time depends on the object's complexity. To solve these problems, image-based rendering has been introduced and developed by many researchers in recent years. Image-based rendering is a powerful and promising approach to 3D object representation, and it is fundamentally different from traditional geometry-based rendering. This approach considers a 3D object or a scene as
a collection of images, while geometry-based rendering uses mathematical descriptions of an object. Such a collection of images, called key frames, is taken from predefined reference viewpoints; an image-based rendering system should then be able to generate intermediate views of the object from these key frames. In an image-based rendering system, the computation required to render an object depends only on the number of pixels in the key frames, not on the geometric complexity of the object.

But the conventional image-based rendering approaches described in the literature have some problems. First, warping-based methods [2,3,9–11] and Laveau and Faugeras's method [7] need exact pixel-by-pixel correspondences and the depth of each pixel, which are difficult to calculate or may not be available in some cases. These methods also require that the disparity map, as well as the key frames, be properly encoded and stored; this is undesirable because the disparity map is an extra burden on storage. Second, even though the lumigraph [12,13] and light field rendering [8] techniques do not require exact pixel-by-pixel correspondences and per-pixel depth, they require an enormous set of key frames, which makes the database huge. Most of all, these approaches share one problem: they consider only the generation of intermediate views, which means most research is confined to view synthesis. Other parts of a system implementation are not fully considered: how to select a key frame set, how to compress the key frames, and how to encode additional side information efficiently.

In particular, encoding of the key frames, that is, encoding of multi-view sequences, is extremely important. Since a complete description of an object needs a large number of key frames, and the transmission or storage of these key frames is impractical due to its cost and bandwidth, efficient coding techniques should be employed to reduce the data rate or storage size. Only a few algorithms have been presented for encoding the key frames. In [8], vector quantization (VQ) and entropy coding are used to compress an enormous light field array, but this approach requires that every image array be used for training and thus have its own codebook. In light field rendering, where each preacquired image exhibits a high level of data redundancy, VQ and entropy coding alone may work well; however, this is not generally true. Dally et al. introduced the delta tree [5], a data structure that represents an object using a set of reference images. Each node of a delta tree stores an image taken from a point on a sampling sphere that encloses the object. Each image is compressed by discarding pixels that can be reconstructed by warping its ancestors' images to the node's viewpoint, and the partial images are stored. Fig. 1 illustrates an example of the delta tree. But in the delta tree structure, storing the same viewpoint at different nodes (view sharing) decreases the overall coding efficiency.

We choose the object-based coding approach for the following reasons. First, in the applications mentioned earlier (virtual reality, tele-presence applications, and multimedia internet applications such as tele-shopping), not scene-based
Fig. 1. Example of the delta tree.
but object-by-object manipulation is required. So it is desirable to encode the visual information object by object, so that the information is easy to manipulate directly. Notice also that it is entirely unnecessary to encode the background region of the whole scene, because we are concerned only with the object region. Second, the subjective quality of the reconstructed key frames should be sufficiently high, since we use the reconstructed key frames when generating virtual intermediate views of the stored 3D object. Block-based coding schemes are simple and easy to implement, but the quality of the reconstructed images may be poor, with severe degradations such as blocking artifacts and edge blurring at low bit rates, which is very annoying. Object-based coding schemes can eliminate these artifacts effectively.

In this paper, we present an object-based encoding method for multi-view sequences of a 3D object. We briefly discuss the key idea of 3D object representation via image-based rendering and our proposed encoding system in Section 2. We develop a multi-view sequence encoding strategy in Section 3 and a simple view synthesis algorithm in Section 4. We then show simulation results in Section 5. Finally, in Section 6, we conclude with a discussion of our work and of further work.
2. Overview of the proposed system

2.1. 3D object representation: image-based rendering approach

Our proposed system uses a set of images as the underlying model for object representation and synthesizes new views, like other image-based rendering systems. As depicted in Fig. 2, we obtain the key frames by taking images at grid points uniformly sampled in the θ and φ directions on a sampling sphere that encloses the object (orbiting a camera around the object while keeping the view direction centered on it). If we could acquire key frames from all possible views for a perfect description of an object, view synthesis would not be necessary. But this is not
Fig. 2. Sampling sphere and key frame acquisition.
practical, because it would require an infinite number of key frames. In practice, we can cover only a subset of all viewpoints with a finite number of key frames, so when we need views other than those from which the key frames were taken, we must synthesize virtual views to represent the object at those viewpoints.

Efficient object representation typically requires hundreds of key frames. For example, with a full 360° rotation in the φ direction and 180° in the θ direction, sampled in 10° increments, the total number of key frames is 36 × 18 = 648 per object. If we store these key frames at 256 × 256 resolution in 8-bit-per-pixel gray scale with no compression, the entire key frame set occupies 648 × 256 × 256 bytes, about 42.5 MB. Efficient coding techniques should therefore be employed to reduce the data rate, since transmitting or storing these key frames raw is impractical because of the cost and bandwidth involved.

As mentioned earlier, we obtain the key frames at the grid points of the sampling sphere with a camera orbiting around the object while keeping the view direction centered on it, as shown in Fig. 2. This can be referred to as "object rotation", because it is equivalent to rotating the object. Therefore, there must be inter-viewpoint redundancy between these key frames, and if we can effectively reduce this redundancy, we can achieve high coding performance.

2.2. System block diagram

Fig. 3 shows the overview of our system implementation. We encode the key frames
Fig. 3. Block diagram of the proposed system.

Fig. 4. Detailed block diagram of the key frame encoder.

Fig. 5. Detailed block diagram of the key frame decoder and the view synthesizer.
acquired as shown in Fig. 2 at each uniformly sampled (φ, θ) angle. When a user demands that the system render a new view at (φ_user, θ_user), the system decodes the appropriate key frames; and when the user-given angle (φ_user, θ_user) does not coincide with a grid point on the sampling sphere, view synthesis must be performed at (φ_user, θ_user) with the decoded key frames.

Fig. 4 illustrates the key frame encoder in detail. The acquired key frames are not a typical sequence in temporal order but form a 2D matrix indexed by (φ, θ), so they must be rearranged before the H.263-based encoder can handle them properly. The warping parameters are estimated by the total least squares method [6,14] between every two successive key frames; these parameters are used both for removing the inter-viewpoint redundancy and for view synthesis. The object texture and shape information are then encoded in an object-based manner by the modified H.263 encoder.

Fig. 5 shows the detailed block diagram of our key frame decoder implementation and view synthesis algorithm. When a user demands that the system render a new view at (φ_user, θ_user), the decoder decides which key frames are required for display and decodes the corresponding warping parameters.
Our view synthesizer receives the decoded key frames and the warping parameters from the key frame decoder. The disparity field is then generated from the warping parameters, and the object region is partitioned using the shape information in order to linearize the disparity field, so that the inverse disparity field of each partitioned region can be calculated easily. In the reverse warping stage, we first calculate the virtual disparity field at (φ_user, θ_user) from the generated disparity field for each partitioned region; then, with this virtual field, the key frames are warped to produce the intermediate view image at (φ_user, θ_user).

3. Object-based encoding for multi-view sequences of 3D object

Fig. 6. Multi-viewpoint image coding 1.
3.1. Key frame rearrangement strategies

In a traditional image coding system, the input sequence is arranged along the temporal axis. The key frame set of the proposed system, however, has the form of a 2D matrix indexed by (φ, θ), so we need to rearrange it before applying the H.263 coding scheme. Intuitively, we can think of a simple coding strategy (strategy 1), shown in Fig. 6: with θ fixed, we make a sequence of N key frames ordered by φ. In the same manner, we compose M sequences in total and encode each sequence independently with the H.263 coding scheme. The first key frame of each sequence is coded as an INTRA frame and the following frames as INTER frames. But this approach is not well suited to random access, and if we put more INTRA frames in each sequence for fast random access, the coding efficiency drops. As a compromise between fast random access and reasonable coding efficiency, we propose a cell-based coding strategy (strategy 2), illustrated in Fig. 7. Each cell is composed of nine key frames: one INTRA frame at the center of the cell and eight INTER frames. The INTRA key frame is encoded first, and the other INTER key frames are then encoded in a spiral manner, as shown in Fig. 7 and sketched in the code below.
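To make the two rearrangements concrete, the following sketch (our own illustration, not code from the paper; the exact spiral visiting order inside a cell is an assumption read off Fig. 7) enumerates the coding order over the key frame grid:

```python
def strategy1_sequences(M, N):
    """Strategy 1: one sequence per theta row of the M x N (theta, phi)
    grid; frame (i, 0) of each row is coded INTRA, the rest INTER."""
    return [[(i, j) for j in range(N)] for i in range(M)]

def strategy2_cell(ci, cj):
    """Strategy 2: a 3 x 3 cell centered at grid point (ci, cj); the
    center frame is INTRA and its 8 neighbours follow in a spiral.
    The particular spiral below is hypothetical."""
    spiral = [(0, 1), (1, 1), (1, 0), (1, -1),
              (0, -1), (-1, -1), (-1, 0), (-1, 1)]
    return [(ci, cj)] + [(ci + di, cj + dj) for di, dj in spiral]
```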
Fig. 7. Multi-viewpoint image coding 2.

Table 1
Number of frames to be decoded

              Min.   Max.   Ave.
Strategy 1     4      20    12
Strategy 2     4      12     6.22
We now compare the two strategies with respect to random access. We assume that both strategies have one INTRA key frame per nine key frames, to balance their coding efficiency, and that the user-given angle (φ_user, θ_user) is uniformly distributed. By simple calculations, we can obtain the average number of frames that must be decoded to retrieve the required key frames; the detailed results are shown in Table 1. From Table 1, we can see that strategy 2 is about twice as fast as strategy 1 in decoding speed.

3.2. Inter-viewpoint redundancy removal

Warping parameter estimation is extremely important, because the warping parameters considerably reduce the inter-viewpoint redundancy
between key frames, as we will show in Section 5. Warping parameters are also used to synthesize intermediate views. Thus, the warping parameters must be estimated reliably. We estimate them by the total least squares (TLS) method [6,14] instead of the conventional least squares (LS) method because of the reliability and robustness of TLS.

3.2.1. Total least squares

Consider the following overdetermined problem:

$$Xh = d, \qquad (1)$$

where X is an N × M matrix, h an M × 1 vector, and d an N × 1 vector, with N > M. The classical LS method considers the presence of noise only in the observation vector d, and so it seeks to minimize the error

$$\|e\|^2 = \|Xh - d\|^2 \qquad (2)$$

to find the overdetermined solution h. Unfortunately, for a problem such as Eq. (1) there usually exists an error associated with the data matrix X as well as with the observation vector d. It is therefore more desirable to express the overdetermined problem as

$$(X - D)h = d - e, \qquad (\bar{X} - \bar{D})\begin{bmatrix} h \\ -1 \end{bmatrix} = 0, \qquad (3)$$

where e is an error vector associated with the observation vector d, D is an error matrix associated with the data matrix X, $\bar{X} \triangleq [X \mid d]$, and $\bar{D} \triangleq [D \mid e]$. The TLS method minimizes the Frobenius norm of the total error, $\|\bar{D}\|_F$, to find the solution h, where the Frobenius norm of a real N × M matrix X is defined by

$$\|X\|_F \triangleq \left( \sum_{i=1}^{N} \sum_{j=1}^{M} |x_{ij}|^2 \right)^{1/2}. \qquad (4)$$

The TLS solution involves the singular value decomposition (SVD) of the N × (M + 1) combined data matrix $\bar{X} \triangleq [X \mid d]$, which is decomposed as follows [6,14]:

$$\bar{X} = \bar{U}\bar{\Sigma}\bar{V}^{T}, \qquad (5)$$

where

$$\bar{U} = [\bar{u}_1 \;\cdots\; \bar{u}_{M+1}], \qquad \bar{\Sigma} = \begin{bmatrix} \bar{\sigma}_1 & & \\ & \ddots & \\ & & \bar{\sigma}_{M+1} \\ 0 & \cdots & 0 \end{bmatrix}, \qquad \bar{V} = [\bar{v}_1 \;\cdots\; \bar{v}_{M+1}]. \qquad (6)$$

The last right singular vector $\bar{v}_{M+1}$ can be partitioned as

$$\bar{v}_{M+1} = \begin{bmatrix} v_0 \\ v_1 \end{bmatrix}, \qquad (7)$$

where $v_0$ holds the first M components and $v_1$ is the last component. The overdetermined solution h is then obtained by

$$h = -\frac{1}{v_1} v_0. \qquad (8)$$
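As a concrete illustration of Eqs. (5)–(8), here is a minimal numpy sketch of the TLS solver (our own code, not the authors' implementation):

```python
import numpy as np

def tls_solve(X, d):
    """Total least squares solution of X h = d via SVD, Eqs. (5)-(8).

    X : (N, M) data matrix, d : (N,) observation vector, N > M.
    """
    Xbar = np.column_stack([X, d])   # combined N x (M+1) matrix [X | d]
    _, _, Vt = np.linalg.svd(Xbar)   # rows of Vt are right singular vectors
    v = Vt[-1]                       # vector of the smallest singular value
    v0, v1 = v[:-1], v[-1]           # partition as in Eq. (7)
    if np.isclose(v1, 0.0):
        raise np.linalg.LinAlgError("TLS solution does not exist (v1 = 0)")
    return -v0 / v1                  # Eq. (8)
```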
3.2.2. Warping parameter estimation and global motion compensation

To estimate the warping parameters by the TLS method, we first calculate the disparity field by the sliding block method. Under the assumption that the object is a quadratic surface, a 12-parameter model is used as the 3D motion model:

$$v_X = a_1 + a_2 x + a_3 y + a_4 x^2 + a_5 y^2 + a_6 xy,$$
$$v_Y = b_1 + b_2 x + b_3 y + b_4 x^2 + b_5 y^2 + b_6 xy. \qquad (9)$$
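To show how Eq. (9) feeds the estimator, the sketch below builds the quadratic design matrix and solves for each component with the tls_solve routine sketched in Section 3.2.1 (block matching, outlier removal, and the iteration loop are omitted; the function names are ours):

```python
import numpy as np

def quadratic_basis(x, y):
    """Basis [1, x, y, x^2, y^2, xy] of the quadratic motion model."""
    return np.column_stack([np.ones_like(x), x, y, x**2, y**2, x * y])

def estimate_warping_parameters(points, disparities):
    """Estimate the 12 parameters of Eq. (9) from sampled disparities.

    points      : (K, 2) array of (x, y) sample positions
    disparities : (K, 2) array of measured (v_X, v_Y) vectors
    """
    X = quadratic_basis(points[:, 0], points[:, 1])
    a = tls_solve(X, disparities[:, 0])   # a1..a6, horizontal component
    b = tls_solve(X, disparities[:, 1])   # b1..b6, vertical component
    return a, b
```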
The 12 parameters are estimated by TLS with outlier removal, and this process is iterated until the parameters converge. Following the scanning strategies discussed in the previous section, the previous key frame is warped to the corresponding current key frame with the 12 estimated warping parameters to remove the inter-viewpoint redundancy; only the difference picture between the warped frame and the current frame is then encoded.
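A rough sketch of this compensation step follows. It samples the previous frame with a nearest-neighbor lookup, which is our simplification (the paper does not specify the interpolation), and it assumes the model field maps current-frame pixels to their source positions in the previous frame:

```python
import numpy as np

def global_motion_compensate(prev_frame, a, b):
    """Predict the current key frame by warping the previous one with
    the 12-parameter field of Eq. (9); the encoder then codes only the
    difference between this prediction and the current frame."""
    H, W = prev_frame.shape
    y, x = np.mgrid[0:H, 0:W].astype(float)
    basis = np.stack([np.ones_like(x), x, y, x**2, y**2, x * y], axis=-1)
    xs = np.clip(np.rint(x + basis @ a), 0, W - 1).astype(int)
    ys = np.clip(np.rint(y + basis @ b), 0, H - 1).astype(int)
    return prev_frame[ys, xs]   # nearest-neighbor sampling of the source
```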
3.3. Texture and shape information encoding

In traditional block-based image coding systems, the image is divided into N × N blocks. Instead of a block-based approach, we take an object-oriented coding approach to encoding the multi-view sequences of a 3D object. This approach yields image segments of arbitrary shape, so both the texture and the shape information of an arbitrarily shaped object must be encoded. We use the extension-interpolation (EI) method [4,15] for texture encoding because of its superior performance and simplicity, and the context-based arithmetic encoding (CAE) algorithm of MPEG-4 [1] for shape encoding. CAE is known to be an effective method for lossless or lossy encoding of shape information; readers interested in CAE are referred to the MPEG-4 VM [1].
4. View synthesis

If the user-given angle (φ_user, θ_user) does not coincide with a grid point on the sampling sphere, view synthesis must be performed. In this section, we present a simple view synthesis algorithm based on parameter warping.

4.1. Disparity field regeneration

Since our view synthesis algorithm is based on image warping, pixel-by-pixel disparities are required. At the decoder, the 12 warping parameters are decoded together with the two corresponding key frames, and the disparity field between these key frames is regenerated from the warping parameters. This regenerated disparity field is used for view synthesis.
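A minimal sketch of this regeneration step (our illustration; the mask argument standing in for the decoded shape information is hypothetical):

```python
import numpy as np

def regenerate_disparity_field(a, b, mask):
    """Evaluate the decoded 12-parameter model, Eq. (9), at every pixel
    to regenerate the dense disparity field; zero outside the object."""
    H, W = mask.shape
    y, x = np.mgrid[0:H, 0:W].astype(float)
    basis = np.stack([np.ones_like(x), x, y, x**2, y**2, x * y], axis=-1)
    field = np.stack([basis @ a, basis @ b], axis=-1)   # (H, W, 2)
    return field * mask[..., None]
```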
4.2. Region partitioning and reverse warping

Since we know the 3D warping parameters, the easiest route to view synthesis is image warping. To generate an in-between view of two key frames, the disparity vectors are linearly interpolated and each pixel in one of the key frames is warped by its corresponding disparity vector. However, the major problems of this forward warping approach are pixel overlaps and holes or cracks in the view-synthesized image. These problems can be solved by reverse warping: if we know the inverse disparity relation, we can bring back each pixel of the view-synthesized image from the corresponding pixel in one of the key frames. But it is not easy to obtain the inverse transform of the 12-parameter model. To overcome this difficulty, we partition the object into 32 × 32 blocks, as shown in Fig. 8, so that the disparity field in each block can be approximated by the 6-parameter affine transform of Eq. (10), whose inverse is easy to obtain:

$$x_2 = \hat{a}_1 x_1 + \hat{a}_2 y_1 + \hat{a}_3,$$
$$y_2 = \hat{a}_4 x_1 + \hat{a}_5 y_1 + \hat{a}_6. \qquad (10)$$

Fig. 8. Region partitioning with regular grid.

Fig. 9. Reverse warping for each region with affine approximation.

Fig. 9 illustrates the reverse warping with the approximated affine transform for each 32 × 32 block. The parameter set $\hat{a} = (\hat{a}_1, \ldots, \hat{a}_6)$ is the approximated affine transform of a block, and v (0 ≤ v ≤ 1) is a view factor, the relative disparity between the two key frames. (x, y) is a pixel
coordinate in the view-synthesized image, and (x₁, y₁) and (x₂, y₂) are the corresponding coordinates in key frame 1 and key frame 2, respectively. Eq. (11) gives the solution of the inverse affine transform, and all pixels in each block are reversely warped by it to synthesize a new view:

$$\begin{bmatrix} x_1 \\ y_1 \end{bmatrix} = \begin{bmatrix} v(\hat{a}_1 - 1) + 1 & v\hat{a}_2 \\ v\hat{a}_4 & v(\hat{a}_5 - 1) + 1 \end{bmatrix}^{-1} \begin{bmatrix} x - v\hat{a}_3 \\ y - v\hat{a}_6 \end{bmatrix}. \qquad (11)$$
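The inverse transform of Eq. (11) amounts to a small linear solve per pixel; the sketch below (our code, with the per-block affine fit itself omitted) maps one synthesized-view pixel back into key frame 1:

```python
import numpy as np

def reverse_warp_block(xy, a_hat, v):
    """Invert the view-interpolated affine transform of Eq. (11).

    xy    : (2,) pixel coordinate (x, y) in the synthesized view
    a_hat : (6,) approximated affine parameters of the 32 x 32 block
    v     : view factor, 0 <= v <= 1
    """
    A = np.array([[v * (a_hat[0] - 1.0) + 1.0, v * a_hat[1]],
                  [v * a_hat[3], v * (a_hat[4] - 1.0) + 1.0]])
    rhs = np.array([xy[0] - v * a_hat[2],
                    xy[1] - v * a_hat[5]])
    return np.linalg.solve(A, rhs)   # corresponding (x1, y1) in key frame 1
```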
5. Simulation results

Fig. 10 shows the first example of a unit cell composed of nine key frames. The whole key frame set consists of 27 frames taken at 0°, 10°, and 20° in the θ direction (elevation angle) and at 0°, 20°, 40°, …, 160° in the φ direction (azimuth angle), each of size 256 × 256 pixels. We set both the INTRA and INTER QP (quantization parameter) to 10 for quantization of the DCT coefficients. The encoding results are presented in Table 2. As shown in Table 2, global motion compensation by the warping parameters reduces the inter-viewpoint redundancy, so the overall
Fig. 10. Unit cell of test object: UFO.
Table 2
Bits generated per key frame: UFO test object

                     Texture and     Shape    PSNR
                     motion bits     bits     (dB)
Strategy 1             17652         1047     28.8
Strategy 1 + GMC        3535         1047     50.9
Strategy 2             18152         1060     28.8
Strategy 2 + GMC        4952         1060     48.1
bit amounts are greatly reduced. The PSNR of the reconstructed key frames also improves dramatically, which indicates that the 12-parameter motion model is adequate for modeling object rotation
and that the matching error after key frame warping is very small.

Note that the bit amount generated by the cell-based rearrangement is somewhat larger than that of the simple raster-scan rearrangement. However, as shown in Section 3.1, the cell-based rearrangement achieves fast random access. A change of the scanning direction in the cell-based rearrangement may reduce the coding efficiency, and we leave this problem for future work.

Fig. 11 shows the second example of a unit cell composed of nine key frames.
Fig. 11. Unit cell of test object: NIKON.
Table 3
Bits generated per key frame: NIKON test object

                     Texture and     Shape    PSNR
                     motion bits     bits     (dB)
Strategy 1             34494         1649     29.1
Strategy 1 + GMC        5982         1649     54.8
Strategy 2             41624         1667     28.7
Strategy 2 + GMC       11080         1667     46.9
The whole key frame set again consists of 27 frames taken at 0°, 10°, and 20° in the θ direction (elevation angle) and at 0°, 20°, 40°, …, 160° in the φ direction (azimuth angle), each of size 256 × 256 pixels. As shown in Table 3, we obtain almost the same results as for the UFO object. Figs. 12 and 15 show the test images for view synthesis. The disparity fields regenerated by the 12 warping parameters are depicted in Figs. 13 and 16, respectively. The synthesized intermediate views between the two test images are shown in Figs. 14 and 17 for each test object.
Fig. 12. Test image for view synthesis: UFO.
Fig. 13. Disparity field regenerated by warping parameters: UFO.

Fig. 14. Synthesized intermediate view: UFO.
Fig. 15. Test image for view synthesis: Triceratops.
Fig. 16. Disparity field regenerated by warping parameter: Triceratops.
6. Conclusion and further work

In this paper, we have proposed an object-based encoding system for multi-view sequences of a 3D object. We first discussed the current demand for image-based rendering approaches and their applications, then raised several problems involved in image-based rendering and outlined what needs to be done. We have provided solutions for our system implementation: the key frame encoding system
Fig. 17. Synthesized intermediate view: Triceratops.
and the simple view synthesis algorithm. We have proposed an efficient key frame rearrangement method, warping parameter estimation by TLS, texture encoding, and shape information encoding. Computer simulations demonstrate the encoding and view synthesis results. As further work, we intend to study the key frame selection algorithm: at present the key frames are sampled uniformly on the sampling sphere, but non-uniform sampling in the θ and φ
directions would clearly be better. The shape information and warping parameter encoding schemes should also be improved.
References

[1] Ad Hoc Group on MPEG-4 Video VM Editing, MPEG-4 Video Verification Model Version 10.0, ISO/IEC JTC1/SC29/WG11, MPEG98/N1992, February 1998.
[2] S.E. Chen, L. Williams, View interpolation for image synthesis, in: Proceedings of SIGGRAPH 1993, Computer Graphics Proceedings, Annual Conference Series, ACM SIGGRAPH, 1993, pp. 279–288.
[3] S.E. Chen, QuickTime VR: an image-based approach to virtual environment navigation, in: Proceedings of SIGGRAPH 1995, Computer Graphics Proceedings, Annual Conference Series, ACM SIGGRAPH, 1995, pp. 29–38.
[4] S.-J. Cho, S.-W. Lee, J.-G. Choi, S.-D. Kim, Arbitrarily-shaped image segment coding using extension-interpolation, The Journal of the Korean Institute of Communication Science 20 (9) (September 1995) 2453–2463.
[5] W.J. Dally, L. McMillan, G. Bishop, H. Fuchs, The delta tree: an object-centered approach to image-based rendering, Technical Report, Department of Computer Science, University of North Carolina.
[6] G.H. Golub, C.F. Van Loan, Matrix Computations, Johns Hopkins University Press, Baltimore, MD, 1989.
[7] S. Laveau, O. Faugeras, 3D scene representation as a collection of images and fundamental matrices, Technical Report N2205, INRIA, February 1994.
[8] M. Levoy, P. Hanrahan, Light field rendering, in: Proceedings of SIGGRAPH 1996, Computer Graphics Proceedings, Annual Conference Series, ACM SIGGRAPH, 1996, pp. 31–42.
[9] A. Lippman, Movie-maps: an application of the optical videodisc to computer graphics, in: Proceedings of SIGGRAPH 1980, Computer Graphics Proceedings, Annual Conference Series, ACM SIGGRAPH, 1980, pp. 32–43.
[10] W.R. Mark, L. McMillan, G. Bishop, Post-rendering 3D warping, in: Proceedings of the Symposium on Interactive 3D Graphics, 1997, pp. 7–16.
[11] L. McMillan, G. Bishop, Head-tracked stereoscopic display using image warping, in: Proceedings of SPIE, Vol. 2409, San Jose, CA, February 1995, pp. 21–30.
[12] S.J. Gortler, R. Grzeszczuk, R. Szeliski, M.F. Cohen, The lumigraph, in: Proceedings of SIGGRAPH 1996, Computer Graphics Proceedings, Annual Conference Series, ACM SIGGRAPH, 1996, pp. 43–54.
[13] P.-P. Sloan, M.F. Cohen, Time critical lumigraph rendering, in: Proceedings of the Symposium on Interactive 3D Graphics, 1997, pp. 17–23.
[14] S. Van Huffel, J. Vandewalle, The Total Least Squares Problem: Computational Aspects and Analysis, Society for Industrial and Applied Mathematics, 1991.
[15] J.-W. Yi, S.-J. Cho, W.-J. Kim, S.-D. Kim, S.-J. Lee, A new coding algorithm for arbitrarily shaped image segments, Signal Processing: Image Communication 12 (June 1998) 231–242.