A three-layer scheme for M-channel multiple description image coding

Upul Samarawickrama (a), Jie Liang (a), Chao Tian (b)

(a) School of Engineering Science, Simon Fraser University, Burnaby, BC, Canada, V5A 1S6
(b) AT&T Labs-Research, Florham Park, NJ 07932, USA

Article history: Received 17 November 2010; received in revised form 21 March 2011; accepted 22 March 2011; available online 14 April 2011.

Keywords: Image coding; Image communication; Multiple description coding; Estimation

Abstract

A three-layer scheme is developed for M-channel multiple description image coding. In each description, a subset of the source is encoded in the first layer. In the second layer, the remaining subsets are encoded sequentially by predicting from the already encoded subsets. Each description can thus produce a coarse reconstruction of the source. When multiple descriptions are received, a refined reconstruction is obtained by fusing all coarse reconstructions. A third layer is further included to refine the reconstruction when only one description is lost, which is the dominating error scenario when the probability of channel error is low. We first derive the closed-form expressions of the expected distortion of the system for 1-D sources. The proposed scheme is then applied to lapped transform-based image coding, where we formulate and obtain the optimal lapped transform. Image coding results show that the proposed method outperforms other state-of-the-art schemes.

1. Introduction

Multiple description coding (MDC) addresses packet losses in a communication network by sending several descriptions of the source, such that the reconstruction quality improves with the number of received descriptions [1]. MDC with two descriptions (or channels) has been studied extensively. Some representative practical designs include the multiple description scalar quantizer (MDSQ) and the pairwise correlating transform [2,3]. In this paper, we focus on MDC with more than two descriptions, which is more useful in practice. Information-theoretic analyses of
this case can be found in, e.g., [4–6]. Among them, an achievable rate-distortion (R-D) region for the MDC of memoryless sources is given in [4], by generalizing the two-description result in [7]. An improved achievable region is obtained in [5] by encoding a source in M stages, where the k-th stage refines the previous stages and can only be decoded when more than k descriptions are received. The scheme is based on the theory of source coding with side information and distributed source coding (DSC) [8]. Recently, another improved scheme was developed in [6], which can achieve points outside the achievable region in [5]. However, these information-theoretic schemes cannot be directly applied in practice, due to their high encoding and decoding complexities, particularly the lack of structured codes to implement the binning scheme.

Various practical designs of M-channel MDC have also been developed. In [9,10] erasure correcting codes are used to provide unequal loss protection (ULP) to different layers of the output of a scalable coder. Fast ULP algorithms have been studied in, for example, [11,12]. However, many existing block transform-based coders are either not scalable or only have very limited granularity of
scalability, such as JPEG, H.264 and the latest JPEG XR image coding [13,14]. Therefore the ULP-based MDC may not be optimal in these cases, and it is necessary to develop alternative MDC schemes for these applications that can take full advantage of the underlying coders.

In [15], the MDSQ in [2] is extended to more than two channels via a combinatorial optimization approach. However, the scheme only has one degree of freedom and only assigns the index symmetrically around the main diagonal of the index matrix. The most general M-channel symmetric MDC has M − 1 degrees of freedom, which maximizes the flexibility in tuning the redundancy. Another extension of the MDSQ with M − 1 degrees of freedom is developed in [16], which shares some similarities with the method in [5], i.e., the coding consists of multiple stages such that each stage refines the preceding stages. However, both algorithms in [15,16] become quite complicated as the number of descriptions increases. A lattice vector quantization-based MDC method is presented in [17], where M descriptions are generated by uniquely assigning each point in a finer central lattice to M points in a sublattice. The method also involves an index assignment problem, which increases the complexity in design and implementation.

In another class of MDC methods, the source splitting scheme in [18,19] is generalized. In [20], the transform coefficients are split into two subsets, and each is quantized into one description. Each description also includes the coarsely quantized result of the other subset, which improves the reconstruction when the other description is lost. Recently the method in [20] was generalized to JPEG 2000 in [21] for two-description coding, denoted RD-MDC, where each JPEG 2000 code-block is coded at two rates, one in each description. In [22], the RD-MDC in [21] is extended to the M-channel case, where each JPEG 2000 code-block is still encoded at two rates. The higher-rate coded code-blocks are divided into M subsets and are assigned to the M descriptions. Each description also carries the lower-rate codings of the remaining code-blocks. One benefit of [21,22] is that they maintain good compatibility with JPEG 2000, e.g., each description can be decoded by a standard JPEG 2000 decoder. In [23] a multi-rate method, which generalizes the two-rate method in [22], is developed. The method in [23] exploits the redundancy more efficiently than [22]. However, its complexity increases rapidly with the number of descriptions.

In [24], a modified MDSQ (MMDSQ) is developed, where the quantization bins in the base layers of the two descriptions are staggered with respect to each other. This creates a refined central quantizer when both descriptions are available. The quantization error of the central quantizer is further encoded and split into the second layers of the two descriptions. In [25], a two-rate coding method is proposed for M-channel MDC, where the lower-rate coding also uses staggered quantization, similar to the MMDSQ. The method in [25] also exploits the residual correlations in the source using a specially designed predictive encoder. Another improvement of the MMDSQ is investigated in [26], where the second layer can be used to reduce the side distortion. In [27], the scheme in [26] is applied to
MD image coding, where the first layers are obtained by rotating the image by different angles before encoding. The reconstructions from the first stage are averaged, and the error is encoded and split into the second stages of all descriptions. However, this method only has one degree of freedom, and its theoretical performance for M > 2 is unknown. In [28], a prediction compensated MDC scheme (PCMDC) is developed for the two-channel case, where the source is partitioned into two subsets, and each subset is encoded as the base layer of one description. Each description also encodes the prediction residual of the other subset, using the reconstruction of the base layer as the prediction reference. This is more efficient than the two-rate coding in [20,21]. When applied to time-domain lapped transform-based image coding [29], PCMDC achieves better results than those in [24,21], and represents the state of the art in two-description image coding.

In this paper, motivated by the superior performance of the two-channel PCMDC in image coding, we generalize the prediction-compensated approach to the M-channel case and develop a three-layer MDC (TLMDC) algorithm. In the first layer of each description, a subset of the source samples is encoded. In the second layer, the remaining subsets are encoded sequentially by predicting from the already encoded subsets. As a result, the first two layers can generate a coarse reconstruction of the source. When multiple descriptions are received, a refined reconstruction is obtained by fusing all coarse reconstructions. Moreover, when only one description is lost, a further refinement can be obtained; this is also the most frequent error scenario when the loss probability of each description is very small.

The organization of the paper is as follows. In Section 2, we describe the framework of the proposed scheme and derive the closed-form expression of the expected distortion of the proposed scheme for 1-D Gaussian sources. In Section 3, we modify our scheme for lapped transform-based MD coding and formulate the optimization of the corresponding lapped transform. In Section 4, the performance of the proposed method in MD image coding is demonstrated and compared with other methods.

2. System description and performance analysis

In this section, we describe the proposed three-layer multiple description coding (TLMDC) and analyze its R-D performance for a 1-D wide sense stationary (WSS) Gaussian source. In Section 3, the scheme will be modified for block transform-based image coding.

2.1. System description

In our scheme, the source samples {x(n)} are partitioned into M polyphases x_i, i = 0, ..., M−1, with samples in the i-th polyphase given by x_i(n) = x(nM + i). As discussed in Section 1, each description in our method contains three layers. In the first layer of description i, the i-th polyphase x_i is coded at bit rate R_0 bits/sample. In the second layer, the remaining polyphases are encoded sequentially by predicting from the already encoded
polyphases. As a result, the first two layers of each description carry information about all the source samples. Given each description, a reconstruction of the entire input can be obtained. The second layer bit rate, R_1, is lower than R_0, and the predictively coded polyphases usually have lower quality than the directly coded polyphase in Layer 1.

When all descriptions are received, only the first layers of all descriptions are used to reconstruct the source. If some descriptions are lost, we first generate a coarse reconstruction of the source from the first two layers of each received description. All coarse reconstructions are then fused to obtain a refined reconstruction as follows. If a polyphase is coded in Layer 1 of any received description, its corresponding reconstruction is used directly. Otherwise, its second layer reconstructions from all the received descriptions are averaged to get the refined reconstruction for this polyphase.

When the loss probability of each description is very low, as is the case in most network conditions, the dominating error scenario is that only one description is lost. Therefore, a third layer is designed in our scheme to improve the performance in this case. As higher quality reconstructions of M − 1 polyphases are already available from the first layers of the received M − 1 descriptions, the goal of the third layer is to refine the quality of the remaining polyphase, which is usually lower than that of the other polyphases, even after averaging all the coarse reconstructions. Since there are only M possibilities of losing one description, we can afford to consider each case in the decoder. To get balanced descriptions, i.e., descriptions of approximately the same size, the refinement bits of the target polyphase in each case are evenly split into the third layers of the M − 1 received descriptions, which is a generalization of the second layer in [24] for two-description coding. As a result, the third layer of description i contains 1/(M−1) of the refinement bits for all other polyphases x_j, j ≠ i. The splitting can be skipped if the requirement of balanced descriptions is not critical. In that case, all the third layer bits for the i-th polyphase can be included in, e.g., the ((i+1) mod M)-th description.

The bitstream structures of all descriptions are illustrated in Fig. 1 for M = 4, where each row represents the bits for one description. Each row is divided into three layers, and each layer contains bits for different polyphases, as denoted by x_i in each box.

Fig. 1. The bitstream structure of the proposed three-layer MDC scheme for four-description coding.
The predictive coding in Layer 2 is represented by horizontally connected boxes, whereas the splitting in Layer 3 is represented by vertically connected boxes. For example, in Description 0, Polyphase 0 is encoded independently in Layer 1, and the resulting bits are represented in Fig. 1 by x_0 in the top-left box. The reconstructed Polyphase 0 is then used to predict Polyphase 1, and the prediction residual is coded into Description 0. The resulting bits are represented by x_1 within Layer 2 of the first row. Next, both the reconstructed Polyphase 0 and Polyphase 1 are used to predict Polyphase 2. In this way, in Layer 2, each polyphase is predicted from all the polyphases that are already encoded in the description. Further, each polyphase is also coded in Layer 3. For example, we first obtain reconstructions of Polyphase 0 from Layer 1 of Descriptions 1, 2 and 3, respectively, and then average them. The resulting reconstruction error is encoded, and the bits are split among Descriptions 1–3 in Layer 3.

Note that when there are only two descriptions, i.e., M = 2, Layer 2 and Layer 3 in our scheme can be merged, and the scheme reduces to the two-layer prediction-compensated MDC in [28].
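To make the layer organization concrete, the following sketch (ours, not the authors' code) enumerates which polyphase each layer of each description carries. The Layer 2 ordering i+1, ..., i+M−1 (mod M) is an assumption made for illustration; the text above only fixes the sequential-prediction principle.

```python
# A minimal sketch (not from the paper) of the three-layer bitstream organization of
# Fig. 1.  Layer 1 of description i carries polyphase i; Layer 2 carries the remaining
# polyphases, predictively coded in an assumed order i+1, ..., i+M-1 (mod M); Layer 3
# carries 1/(M-1) of the refinement bits of every other polyphase.
def bitstream_layout(M):
    layout = []
    for i in range(M):
        layer1 = [i]
        layer2 = [(i + j) % M for j in range(1, M)]   # sequential prediction order (assumed)
        layer3 = [j for j in range(M) if j != i]      # refinement bits split across descriptions
        layout.append({"layer1": layer1, "layer2": layer2, "layer3": layer3})
    return layout

for i, desc in enumerate(bitstream_layout(4)):
    print(f"Description {i}: {desc}")
```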

2.2. Sequential prediction in Layer 2

As shown in [28], predictive coding-based MDC achieves better R-D performance than direct two-rate coding. However, two options exist when generalizing the two-description predictive coding scheme in [28] to more than two descriptions. If each description simply uses one polyphase or subset to predictively encode all other subsets, the coding efficiency will deteriorate as M increases, due to the diminished correlation among different subsets. In this paper, this problem is resolved by using sequential prediction, where all previously encoded subsets are used to predict the next subset. The orthogonality principle is also used to reduce the complexity of the sequential prediction.

To illustrate the second layer coding, consider Description 0, where the 0-th polyphase x_0 is encoded in the first layer. In the second layer, we first predict polyphase x_1 from the decoded 0-th polyphase \hat{x}_0. Let S_x(ω) be the power spectral density (psd) of the source. Each polyphase x_i thus has the same psd, given by [30]

S_0(\omega) = \frac{1}{M}\sum_{k=0}^{M-1} S_x\!\left(\frac{\omega - 2\pi k}{M}\right).   (1)

Assuming high rate coding, the linear minimum mean squared error (LMMSE) or Wiener filter H_1(ω) for predicting x_1 from \hat{x}_0 is given by [31]

H_1(\omega) = \frac{S_{x_{10}}(\omega)}{S_0(\omega)},   (2)

where S_{x_{ij}}(ω) is the cross spectral density between x_i and x_j. The resulting prediction error e_1 is orthogonal to x_0 and has a psd of

S_1(\omega) = S_0(\omega) - \frac{|S_{x_{10}}(\omega)|^2}{S_0(\omega)}.   (3)

The prediction errors {e_1(n)} are coded at bit rate R_1 in the second layer.
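As a quick numerical illustration of (1)–(3), the following sketch (ours; the GM(1) model of Section 2.5 with ρ = 0.95 is assumed) evaluates the polyphase spectra directly from the correlation sequence ρ^{|m|}: the polyphase samples x_i(n) = x(nM+i) have cross-correlation σ_x² ρ^{|mM+i−j|}, so a truncated DTFT gives S_0, S_{x_{10}}, and hence H_1 and S_1 on a frequency grid.

```python
import numpy as np

# Illustrative parameters for the GM(1) example of Section 2.5
M, rho, sigma_x2 = 4, 0.95, 1.0
w = np.linspace(-np.pi, np.pi, 1024, endpoint=False)   # frequency grid
lags = np.arange(-200, 201)                            # truncated correlation support

def cross_psd(i, j):
    """DTFT of E[x_i(n) x_j(n-m)] = sigma_x2 * rho^|mM + i - j| over the lag grid."""
    r = sigma_x2 * rho ** np.abs(lags * M + i - j)
    return (r[None, :] * np.exp(-1j * np.outer(w, lags))).sum(axis=1)

S0   = cross_psd(0, 0).real                  # polyphase psd, Eqs. (1)/(26)
Sx10 = cross_psd(1, 0)                       # cross psd between x_1 and x_0
H1   = Sx10 / S0                             # Wiener filter, Eq. (2)
S1   = S0 - np.abs(Sx10) ** 2 / S0           # prediction-error psd, Eq. (3)
print("prediction error variance:", S1.mean())   # approximates (1/2pi) * integral of S1
```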


The next step is to use both \hat{x}_0 and \hat{x}_1 to predictively encode polyphase x_2. This is more efficient than using \hat{x}_0 alone, as the correlation between x_1 and x_2 is generally stronger than that between x_0 and x_2. Since x_1 can be decomposed into the projection on x_0 (via the Wiener filter) and the projection error e_1, which is orthogonal to x_0, the LMMSE prediction of x_2 from x_0 and x_1 is equivalent to the summation of separate LMMSE predictions from x_0 and e_1. It can be shown that the psd of the resulting prediction error e_2 is given by

S_2(\omega) = S_0(\omega) - \frac{|S_{x_{20}}(\omega)|^2}{S_0(\omega)} - \frac{|S_{x_2 e_1}(\omega)|^2}{S_1(\omega)},   (4)

where S_{x_i e_j}(ω) is the cross spectral density between x_i and e_j. The prediction of the other polyphases can be obtained in a similar manner, i.e., the psd of the prediction error of the j-th polyphase is given by

S_j(\omega) = S_0(\omega) - \frac{|S_{x_{j0}}(\omega)|^2}{S_0(\omega)} - \sum_{i=1}^{j-1}\frac{|S_{x_j e_i}(\omega)|^2}{S_i(\omega)}.   (5)

This prediction error is coded at bit rate R_j in the second layer. By the property of predictive coding, the MSE of the source equals that of the prediction residual [32, p. 114]. Since the residual of linear prediction is also Gaussian, the MSE of the residual (and the source) can be written as [32]

d_j = d\,\exp\!\left(\frac{1}{2\pi}\int_{-\pi}^{\pi}\log S_j(\omega)\,d\omega\right) 2^{-2R_j} \triangleq d\,\gamma_j^2\, 2^{-2R_j},   (6)

where d is a constant that depends on the input statistics and the quantization scheme. For Gaussian sources and an entropy-constrained scalar quantizer, d = πe/6 [32]. Eq. (6) is also applicable to Layer 1 when j = 0.

2.3. The third layer coding

When M − 1 descriptions are available, high-quality reconstructions of M − 1 polyphases can be obtained from the first layers of these descriptions. The goal of the third layer is to refine the other polyphase. Two approaches can be employed in this layer.

2.3.1. Method 1

In the first method, the target polyphase is predicted from all other M − 1 polyphases that are reconstructed from the first layer coding. The prediction residual is encoded at rate R_M and split among the M − 1 descriptions. This prediction is similar to the prediction in the second layer in Section 2.2 when j = M − 1. The difference is that the prediction in Section 2.2 involves one first-layer-coded polyphase and M − 2 second-layer-coded polyphases, whereas here the prediction uses M − 1 first-layer-coded polyphases. However, at high rate, this difference can be neglected. This simplification is commonly used in the performance analysis of predictive coding [32, p. 114]. In this case, the psd of the prediction error is given by (5) with j = M − 1. This error is coded at rate R_M, and the corresponding MSE is given by

d_M = d\,\gamma_{M-1}^2\, 2^{-2R_M}.   (7)

2.3.2. Method 2

In the second method, the third layer is used to refine the average of the reconstructions of the target polyphase from the second layer decoding of all received descriptions. In these M − 1 descriptions, the target polyphase is predictively coded at rates R_1 to R_{M-1}, respectively. Assuming the reconstruction errors from the second layer codings of different descriptions are uncorrelated, the average of the M − 1 second layer reconstructions has the following prediction residual variance:

\sigma_e^2 = \frac{1}{(M-1)^2}\sum_{j=1}^{M-1} d\,\gamma_j^2\, 2^{-2R_j}.   (8)

In the third layer, we encode this residual with a bit rate of R_M. The MSE due to this coding is

d_M = d\,\sigma_e^2\, 2^{-2R_M}.   (9)
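The high-rate model of (6) and the Method 2 refinement of (8)–(9) can be evaluated numerically as in the sketch below (ours; the prediction-error spectra S_j(ω), sampled on a uniform frequency grid, are assumed to come from the recursion in (5), e.g., from the earlier GM(1) sketch).

```python
import numpy as np

d = np.pi * np.e / 6.0          # constant for Gaussian sources and ECSQ, Eq. (6)

def layer_mse(S_j, R_j):
    """Eq. (6): d_j = d * exp((1/2pi) * int log S_j) * 2^(-2 R_j); S_j sampled on a uniform grid."""
    gamma2_j = np.exp(np.mean(np.log(S_j)))
    return d * gamma2_j * 2.0 ** (-2 * R_j)

def method2_refinement_mse(S_list, R_list, R_M):
    """Eqs. (8)-(9): average the M-1 second-layer reconstructions, then refine at rate R_M."""
    sigma_e2 = sum(layer_mse(S, R) for S, R in zip(S_list, R_list)) / len(S_list) ** 2
    return d * sigma_e2 * 2.0 ** (-2 * R_M)

# e.g. layer_mse(S1, 2.0) with S1 from the earlier GM(1) sketch gives d_1 at R_1 = 2 bits/sample
```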

2.4. R-D performance analysis

We now investigate the R-D performance of the proposed MDC scheme. Let D_k denote the expected distortion when there are k received descriptions. When k = M, i.e., all descriptions are received, all source samples are reconstructed from the first layer. Hence, the distortion is given by (6) with j = 0:

D_M = d\,\gamma_0^2\, 2^{-2R_0}.   (10)

When k < M, we first reconstruct the k polyphases coded in the first layer. The reconstructions of the other polyphases are obtained from either the second layer or the third layer, depending on the value of k. Therefore,

D_k = \frac{1}{M}\left(k D_M + (M-k) D'_k\right),   (11)

where D'_k is the expected distortion for the other M − k polyphases. When k = M − 1, D'_k is obtained from the third layer coding and is given by d_M in (7) or (9), depending on which method is used in the third layer.

When k < M − 1, D'_k is given by the second layer coding. In this case, assume, without loss of generality, that Description 0 is lost and x_0 needs to be reconstructed from the second layers of the k received descriptions. There are \binom{M-1}{k} possible combinations of the indices of the received descriptions. Let I_l be the l-th index combination. If Description (M − j) is received, x_0 will be coded at rate R_j in its second layer. We use the average of all the reconstructions of x_0 from the k received descriptions as the final reconstruction. Assuming that the reconstruction errors are uncorrelated in different descriptions, the reconstruction error after the averaging can be written as

D'_{kl} = \frac{1}{k^2}\sum_{M-j \in I_l} d\,\gamma_j^2\, 2^{-2R_j}.   (12)

The distortion D'_k is the average of all D'_{kl}. Hence,

D'_k = \frac{1}{\binom{M-1}{k}}\sum_l D'_{kl}.   (13)

Eq. (13) is symmetric for each j ∈ {1, 2, ..., M−1}. Therefore each γ_j² 2^{−2R_j} appears exactly k\binom{M-1}{k}/(M-1) times in (13). Thus

D'_k = \frac{1}{\binom{M-1}{k}}\cdot\frac{k\binom{M-1}{k}}{M-1}\cdot\frac{1}{k^2}\sum_{j=1}^{M-1} d\,\gamma_j^2\, 2^{-2R_j} = \frac{1}{k(M-1)}\sum_{j=1}^{M-1} d\,\gamma_j^2\, 2^{-2R_j}.   (14)

Given the expressions of all D_k, and assuming that the probability of losing each description is p, the overall expected distortion is

D = \sum_{k=0}^{M} p_k D_k,   (15)

where p_k = \binom{M}{k} p^{M-k}(1-p)^k is the probability of receiving k descriptions. Our objective is to minimize the overall expected distortion subject to the rate constraint of R bits/sample/description,

\frac{1}{M}\sum_{j=0}^{M} R_j = R.   (16)
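For a given per-layer rate vector, the expected distortion of (10)–(15) can be assembled as in the following sketch (ours; Method 2 is assumed in the third layer, and the rate vector is expected to satisfy the constraint (16)).

```python
import numpy as np
from math import comb

d = np.pi * np.e / 6.0   # entropy-constrained scalar quantizer constant, Eq. (6)

def expected_distortion(p, M, rates, gamma2, sigma_x2):
    """Sketch of Eqs. (10)-(15) with Method 2 in the third layer.
    rates = [R_0, ..., R_M] (should satisfy Eq. (16): sum(rates)/M == R);
    gamma2[j] is gamma_j^2 from Eq. (6)."""
    dj = [d * gamma2[j] * 2.0 ** (-2 * rates[j]) for j in range(M)]      # layer-j MSEs, Eq. (6)
    sigma_e2 = sum(dj[1:M]) / (M - 1) ** 2                                # Eq. (8)
    dM = d * sigma_e2 * 2.0 ** (-2 * rates[M])                            # Eq. (9)
    D = 0.0
    for k in range(M + 1):
        pk = comb(M, k) * p ** (M - k) * (1 - p) ** k                     # prob. of receiving k
        if k == 0:
            Dk = sigma_x2                                                 # nothing received
        elif k == M:
            Dk = dj[0]                                                    # Eq. (10)
        else:
            Dpk = dM if k == M - 1 else sum(dj[1:M]) / (k * (M - 1))      # Eq. (9) or (14)
            Dk = (k * dj[0] + (M - k) * Dpk) / M                          # Eq. (11)
        D += pk * Dk
    return D
```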

Using the standard Lagrangian multiplier method, we can find the following solutions for the two methods of the third layer coding.

2.4.1. Method 1

If Method 1 is used in the third layer, the optimal bit allocation is given by

R_j = R_0 - \frac{1}{2}\log_2\frac{c_0}{c_j}, \quad j \in \{1, 2, \ldots, M\},   (17)

where

c_0 = \frac{d\,\gamma_0^2}{M}\sum_{i=1}^{M} i\,p_i, \qquad c_j = \frac{d\,\gamma_j^2}{M}\sum_{i=1}^{M-2}\frac{p_i(M-i)}{i(M-1)}, \; j \in \{1, 2, \ldots, M-1\}, \qquad c_M = \frac{p_{M-1}}{M} d\,\gamma_{M-1}^2.   (18)

From (16) and (17), R_0 can be obtained as

R_0 = \frac{M}{M+1} R + \frac{1}{2(M+1)}\log_2\left(\prod_{j=1}^{M}\frac{c_0}{c_j}\right).   (19)

Under this bit allocation, the minimal overall expected distortion is

D^* = p_0\,\sigma_x^2 + (M+1)\left(\prod_{j=0}^{M} c_j\right)^{1/(M+1)} 2^{-2(M/(M+1))R},   (20)

where σ_x² is the source variance.

2.4.2. Method 2

When Method 2 is used in the third layer, the optimal bit allocation becomes

R_j = R_0 - \frac{1}{2}\log_2\frac{(M-2)c_0}{(M-1)c_j}, \quad j \in \{1, \ldots, M-1\},   (21)

R_M = \frac{1}{2}\log_2\frac{c_M\, d\,(M-2)}{c_{M-1}(M-1)^2}.   (22)

From (21), (22) and (16), R_0 can be found as

R_0 = R + \frac{1}{2M}\log_2\left(\frac{c_{M-1}(M-2)^{M-2}}{c_M\, d\,(M-1)^{M-3}}\prod_{j=1}^{M-1}\frac{c_0}{c_j}\right).   (23)

The corresponding minimal expected distortion is

D^* = p_0\,\sigma_x^2 + M\left(\frac{c_M\, d\,(M-1)^{M-3}}{c_{M-1}(M-2)^{M-2}}\prod_{j=0}^{M-1} c_j\right)^{1/M} 2^{-2R}.   (24)

Comparing (20) and (24), it can be seen that the distortion in Method 2 decays faster with the bit rate R than in Method 1. In other words, there is a relative rate loss in Method 1 compared to Method 2.
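As a cross-check of the closed forms (17)–(24), the same optimum can also be found numerically, as in the sketch below (ours, not the authors' procedure). It reuses expected_distortion() from the sketch following Eq. (16), and the γ_j² values in the example call are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def optimal_rates(p, M, R, gamma2, sigma_x2):
    """Numerically minimize the expected distortion of Eq. (15) over the per-layer
    rates, subject to the rate constraint of Eq. (16)."""
    x0 = np.full(M + 1, R)                                        # uniform split as a starting point
    cons = ({"type": "eq", "fun": lambda r: r.sum() / M - R},)    # Eq. (16)
    bounds = [(0.0, None)] * (M + 1)                              # non-negative rates
    res = minimize(lambda r: expected_distortion(p, M, r, gamma2, sigma_x2),
                   x0, method="SLSQP", bounds=bounds, constraints=cons)
    return res.x, res.fun

# Example call with hypothetical gamma_j^2 values for a unit-variance source
rates, D = optimal_rates(p=0.01, M=4, R=3.0, gamma2=[1.0, 0.05, 0.02, 0.01], sigma_x2=1.0)
print(rates, D)
```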



2.5. Example with GM(1) sources

In this section, we compare the performance of the proposed three-layer MDC (TLMDC) scheme with other methods for a first-order Gaussian–Markov (GM(1)) source, which can be modeled as

x(n) = \rho\, x(n-1) + v(n),   (25)

where x(n) is the n-th sample, v(n) is a Gaussian white noise sample independent of x(n−1), and ρ is the correlation coefficient. After downsampling, each polyphase itself is a GM(1) source with a correlation coefficient of ρ^M and power spectral density

S_0(\omega) = \frac{\sigma_x^2\,(1-\rho^{2M})}{1 - 2\rho^M\cos\omega + \rho^{2M}}.   (26)

From (5) it can be seen that the evaluation of S_j(ω) requires the knowledge of S_{x_j e_i}(ω) and S_i(ω) for i < j. It can be shown that S_{x_j e_i}(ω) is given by

S_{x_j e_i}(\omega) = S_{x_{ji}}(\omega) - \frac{S_{x_{i0}}(\omega)\,S_{x_{j0}}(\omega)}{S_0(\omega)} - \sum_{l=1}^{i-1}\frac{S_{x_j e_l}(\omega)\,S_{x_i e_l}(\omega)}{S_l(\omega)}.   (27)

When l > k, we can express S_{x_{lk}}(ω) as

S_{x_{lk}}(\omega) = \frac{\sigma_x^2\left(\rho^{\,l-k} - \rho^{\,2M-l+k} + \rho^{M} e^{j\omega}\left(\rho^{\,k-l} - \rho^{\,l-k}\right)\right)}{1 - 2\rho^M\cos\omega + \rho^{2M}}.   (28)

These results can be used to recursively evaluate (5) and (27). Fig. 2 shows 10 log10(1/D) at different packet loss probabilities for four-description coding of a unit-variance GM(1) source with ρ = 0.95 and R = 3 bits/sample/description. Three versions of our scheme are reported, i.e., Method 1, Method 2, and no third layer at all. For comparison purposes, the result of the RD-MDC in [22] is also included. It can be seen that even without Layer 3, our method is 7–9 dB better than the RD-MDC when p < 0.05, because predictive coding is more efficient than direct coding. With Layer 3, our method can be further improved as the loss probability decreases. For the configuration in Fig. 2, Method 1 is better when p is small, whereas Method 2 is better when p is larger. As pointed out earlier, Method 1 has a rate deficiency compared to Method 2. Fig. 3 shows a plot of 10 log10(1/D) vs bit rate at different loss probabilities for four-description coding of the unit-variance GM(1) source with ρ = 0.95, which shows that the rate deficiency of Method 1 increases with R. Note that when p = 0.1, the distortion has an asymptote around 40 dB, because the distortion at high loss probabilities and high bit rates is dominated by p_0 σ_x².

Fig. 2. Expected PSNR 10 log10(1/D) vs packet loss probability for MDC coding of a GM(1) source with M = 4, ρ = 0.95 and R = 3 bits/sample/description.

Fig. 3. Expected PSNR 10 log10(1/D) vs total bit rate/sample for MDC coding of a GM(1) source with M = 4, ρ = 0.95.

3. Optimal design for block transform and image coding

In this section we apply the proposed MDC scheme to image coding, for which transform coding needs to be used. In this paper, the time-domain lapped transform (TDLT) framework developed in [29] is adopted, which improves the performance of the DCT-based system by applying time-domain pre/postfilters. The TDLT has been selected by the forthcoming JPEG XR standard [14], which is a low-cost alternative to JPEG 2000 with competitive performance.

As in [28,25], to improve the coding efficiency, instead of partitioning the source samples directly, we partition them into M polyphases at the block level. As a result, all predictions and refinements are performed at the block level. In addition, practical FIR Wiener filters are used. Due to the three-layer structure of the proposed method, our predictions and refinements are different from those in [28,25]. For example, since M descriptions are generated, multiple Wiener filters are involved, whereas only one filter is required in [28].

A nice property of the TDLT is that the pre/postfilters can be optimized for different applications. Since our MDC framework described above differs from traditional single description coding, the optimal filters found in [29] may not be optimal in our scheme. In this section, we formulate the optimization of the pre/postfilters and the corresponding Wiener filters for our framework.

3.1. Overview of the time-domain lapped transform

Before we present the optimal design of the TDLT for the proposed MDC scheme, we first define the necessary notation. Fig. 4 shows the block diagrams of the forward and inverse TDLT. An L × L prefilter P is applied at the boundary of two blocks (L is the block size). The L-point DCT C is then applied to each block. As a result, the basis functions of the TDLT cover two blocks. At the decoder, the inverse DCT and the postfilter T = P^{-1} at block boundaries are applied. The prefilter P has the following structure to yield linear-phase filters [29]:

P = W\,\mathrm{diag}\{I, V\}\,W,   (29)

where diag{A, B} denotes a block diagonal matrix with matrices A and B on the diagonal and zeros elsewhere. The matrix V is an L/2 × L/2 invertible matrix that can be optimized for different purposes. The butterfly W is given by

W = \frac{1}{\sqrt{2}}\begin{bmatrix} I & J \\ J & -I \end{bmatrix},   (30)

where I and J are the L/2 × L/2 identity matrix and counter-identity matrix, respectively. Let P = [P_0^T \; P_1^T]^T, where P_0 and P_1 contain the first and the last L/2 rows of the prefilter P, respectively. Since P is applied at block boundaries, the L × 2L forward transform is given by F = C P_{12}, where P_{12} = diag{P_1, P_0}.

Fig. 4. Forward and inverse time-domain lapped transforms.

Similarly, to obtain the inverse transform, let T = [T_0 \; T_1], where T_0 and T_1 are the first and the last L/2 columns of T, respectively. The 2L × L inverse transform is thus G = T_{21} C^T, where T_{21} = diag{T_1, T_0}. When V is orthogonal, we have G = F^T.
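The structure of (29)–(30) and the resulting transform pair F = C P_{12}, G = T_{21} C^T can be checked numerically with the small sketch below (ours). V is set to the identity purely for illustration; in the design it is the free L/2 × L/2 matrix being optimized.

```python
import numpy as np

L = 8
half = L // 2
I, J = np.eye(half), np.fliplr(np.eye(half))
W = np.block([[I, J], [J, -I]]) / np.sqrt(2)                      # butterfly, Eq. (30)
V = np.eye(half)                                                  # placeholder for the free L/2 x L/2 matrix
Z = np.zeros((half, half))
P = W @ np.block([[I, Z], [Z, V]]) @ W                            # prefilter, Eq. (29)
T = np.linalg.inv(P)                                              # postfilter T = P^{-1}

k, n = np.arange(L)[:, None], np.arange(L)[None, :]
C = np.sqrt(2.0 / L) * np.cos(np.pi * (2 * n + 1) * k / (2 * L))  # orthonormal DCT-II matrix
C[0, :] /= np.sqrt(2)

P0, P1 = P[:half, :], P[half:, :]        # first / last L/2 rows of P
T0, T1 = T[:, :half], T[:, half:]        # first / last L/2 columns of T
F = C @ np.block([[P1, np.zeros_like(P1)], [np.zeros_like(P0), P0]])     # L x 2L forward transform
G = np.block([[T1, np.zeros_like(T1)], [np.zeros_like(T0), T0]]) @ C.T   # 2L x L inverse transform
print(np.allclose(F @ G, np.eye(L)), np.allclose(G, F.T))         # inverse pair; G = F^T for orthogonal V
```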

3.2. The FIR Wiener filter

In the following discussion, we use x(i), s(i) and y(i) to denote the i-th block of prefilter input, DCT input and DCT output, respectively. Note that x(i) is aligned with the prefilter, whereas s(n) is aligned with the DCT, as shown in Fig. 4. The reconstruction of a variable is denoted by the hat operator.

In the second layer of the proposed MDC scheme, the blocks that are not coded in the first layer are coded with sequential prediction using Wiener filters. To reduce the complexity, instead of using all previously coded data in the prediction (as in Section 2), in the image coding we only use the two nearest neighboring blocks to interpolate or predict the target block.

Consider the i-th description in M-description coding, where the DCT blocks {y(Ml + i)}, l = 0, 1, 2, ..., are coded in the first layer with bit rate R_0. In the second layer, we first predict the prefilter output blocks {s(Ml + i + 1)} using the reconstructed {\hat{s}(Ml + i)} from the first layer. Let s_{il,2} be the vector of the two nearest neighboring blocks,

s_{il,2} = \left[s^T(Ml + i)\;\; s^T(M(l+1) + i)\right]^T.   (31)

The Wiener filter for estimating s(Ml + i + 1) from s_{il,2} is

H_1 = R_{s(Ml+i+1)\,s_{il,2}}\; R_{s_{il,2}\,s_{il,2}}^{-1}.   (32)

As in [28], the quantization noise is ignored in the Wiener filter. The resulting prediction error is DCT coded at bit rate R_1. Next, we interpolate {s(Ml + i + 2)} from {\hat{s}(Ml + i + 1)} and {\hat{s}(M(l+1) + i)}, which are the nearest available blocks to the left and right sides of {s(Ml + i + 2)}. In general, the block s(Ml + i + j), j ∈ {1, 2, ..., M−1}, in description i is interpolated from \hat{s}(Ml + i + j − 1) and \hat{s}(M(l+1) + i), and the prediction residual is coded at rate R_j after the DCT.

All matrices in the FIR Wiener filters can be obtained if the input statistics and the prefilter in the TDLT are known. In this paper, we assume the input has the GM(1) model with correlation coefficient 0.95. The details can be found in [28]. As in [33,28], we also normalize the Wiener filter to have unit row sum. In addition, a special Wiener filter is used at the boundary to predict a missing block from only one neighboring block.
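A small sketch (ours) of the block Wiener predictor (31)–(32): the covariances are formed directly from the GM(1) correlation ρ^{|·|} of the transform input (the paper forms them for the prefilter output s and for the actual block spacing of description i), and the filter is normalized to unit row sum as in [33,28].

```python
import numpy as np

L, M, rho = 8, 4, 0.95        # block size, number of descriptions, GM(1) correlation

def cov(a, b):
    """Covariance of a unit-variance GM(1) source between two sample-index vectors."""
    return rho ** np.abs(a[:, None] - b[None, :])

target = np.arange(L) + L                                        # target block, one block right of the first neighbour
neigh = np.concatenate([np.arange(L), np.arange(L) + M * L])     # the two nearest coded neighbours
H1 = cov(target, neigh) @ np.linalg.inv(cov(neigh, neigh))       # Eq. (32)
H1 /= H1.sum(axis=1, keepdims=True)                              # unit row sum, as in [33,28]
print(H1.shape)                                                  # (L, 2L): one row per target sample
```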

3.3. Optimal design of the lapped transform

Given the target bit rate R and the probability p of losing each description, our objective is to find the optimal prefilter and postfilter in the TDLT that minimize the expected distortion in (15), i.e., D = \sum_{k=0}^{M} p_k D_k. In the proposed MDC scheme, when k descriptions are available, k out of M DCT blocks are reconstructed from the first layer coding, and the rest are reconstructed from second layer (if k < M − 1) or third layer coding (if k = M − 1). After postfiltering at block boundaries, each block contains the contributions from two DCT blocks. When the bit rates are sufficiently high, the quantization noises in different DCT blocks are approximately uncorrelated, and their contributions to the reconstruction error are additive. Therefore D_k can still be written as (14). The only difference from Section 2 is that D_M and D'_k here are obtained from block transform and quantization rather than direct quantization. Plugging (14) into (15), we get

D = \frac{1}{M}\sum_{k=1}^{M} p_k k D_M + \frac{1}{M}\sum_{k=1}^{M} (M-k) p_k D'_k + p_0\sigma_x^2 = \frac{\mu_k}{M} D_M + \sum_{k=1}^{M-2}\frac{M-k}{M} p_k D'_k + \frac{1}{M} p_{M-1} D'_{M-1} + p_0\sigma_x^2,   (33)

where μ_k is the expected value of k (which can be obtained when the loss probability of each description is independent and identical), and σ_x² is the variance of the source. We next find the expressions of the different terms in (33).

To find D_M, assume that y_0(n) is a first-layer-coded DCT block with quantization noise q_{y_0}(n). After the inverse TDLT, the reconstruction error becomes G q_{y_0}(n). Assuming the quantization noises of different subbands are uncorrelated, the average reconstruction error per sample is

D_M = \frac{1}{L}\sum_{i=0}^{L-1}\|g_i\|^2\,\sigma_{q_{y_0}}^2(i),   (34)

where σ²_{q_{y_0}}(i) is the variance of the i-th entry of q_{y_0}(n), and g_i is the i-th column of G. As in Section 2, at high rates, σ²_{q_{y_0}}(i) can be written as

\sigma_{q_{y_0}}^2(i) = d\,\sigma_{y_0}^2(i)\, 2^{-2R_{0i}},   (35)

where σ²_{y_0}(i) and R_{0i} are the variance and the allocated bits of the i-th entry of y_0(n), respectively, and the bit allocation satisfies (1/L)\sum_{i=0}^{L-1} R_{0i} = R_0.

The distortion D'_k in (33) with k < M − 1 is the reconstruction error of a second-layer-coded block, which is obtained after averaging the k second-layer-decoded blocks from k descriptions, each having a different rate. As in (14), when averaged over all the possible combinations, the variance of the quantization noise of the i-th DCT coefficient is given by

D'_{k,i} = \frac{1}{k(M-1)}\sum_{j=1}^{M-1} d\,\sigma_{y_j}^2(i)\, 2^{-2R_{ji}},   (36)

where σ²_{y_j}(i) is the variance of the i-th entry of a prediction residual block y_j(n), and the bit allocation for this block satisfies (1/L)\sum_{i=0}^{L-1} R_{ji} = R_j. Similar to (34), after the inverse transform, the average reconstruction error per sample is

D'_k = \frac{1}{L}\sum_{i=0}^{L-1}\|g_i\|^2\,\frac{1}{k(M-1)}\sum_{j=1}^{M-1} d\,\sigma_{y_j}^2(i)\, 2^{-2R_{ji}}.   (37)

Thus, the second term in (33) can be expressed as

\sum_{k=1}^{M-2}\frac{M-k}{M} p_k D'_k = a_{M,p}\,\frac{1}{L}\sum_{j=1}^{M-1}\sum_{i=0}^{L-1}\|g_i\|^2\, d\,\sigma_{y_j}^2(i)\, 2^{-2R_{ji}},   (38)

where

a_{M,p} = \frac{1}{M}\sum_{k=1}^{M-2} p_k\,\frac{M-k}{k(M-1)}.   (39)

Finally, the distortion D'_{M-1} in (33) is the reconstruction error of a third-layer-coded block. In Section 2.3, two methods are proposed for the third layer coding. However, it is observed from image coding experiments that the second method performs consistently better than the first method. Hence only the second method is considered from now on. In this method, a third-layer-coded block is first reconstructed by averaging all M − 1 second-layer decodings of the block. The residual is further coded in the third layer with rate R_M. When averaged over all M − 1 second-layer codings, the MSE of the quantization noise is obtained by letting k = M − 1 in (36). This error is further reduced by the third-layer coding with rate R_{Mi} such that (1/L)\sum_{i=0}^{L-1} R_{Mi} = R_M. The final MSE of the i-th DCT coefficient is thus

\sigma_{q_y}^2(i) = d\left(\frac{1}{(M-1)^2}\sum_{j=1}^{M-1} d\,\sigma_{y_j}^2(i)\, 2^{-2R_{ji}}\right) 2^{-2R_{Mi}}.   (40)

After the inverse transform, the MSE D'_{M-1} for a third-layer-coded sample is

D'_{M-1} = \frac{d}{L(M-1)^2}\sum_{i=0}^{L-1}\|g_i\|^2\sum_{j=1}^{M-1} d\,\sigma_{y_j}^2(i)\, 2^{-2(R_{ji}+R_{Mi})}.   (41)

Substituting (34), (38) and (41) in (33), we obtain

D = p_0\sigma_x^2 + u_0\sum_{i=0}^{L-1}\|g_i\|^2\, d\,\sigma_{y_0}^2(i)\, 2^{-2R_{0i}} + u_1\sum_{j=1}^{M-1}\sum_{i=0}^{L-1}\|g_i\|^2\, d\,\sigma_{y_j}^2(i)\, 2^{-2R_{ji}} + u_2\sum_{j=1}^{M-1}\sum_{i=0}^{L-1}\|g_i\|^2\, d\,\sigma_{y_j}^2(i)\, 2^{-2(R_{ji}+R_{Mi})},   (42)

where

u_0 = \frac{1}{LM}\sum_{k=1}^{M} p_k k, \qquad u_1 = \frac{a_{M,p}}{L}, \qquad u_2 = \frac{d\,p_{M-1}}{LM(M-1)^2}.   (43)

The Lagrangian method can be used to minimize the expected distortion, subject to the constraint

R = \frac{1}{LM}\sum_{j=0}^{M}\sum_{i=0}^{L-1} R_{ji}.   (44)
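The scalar weights in (39) and (43) depend only on M, p and L, and can be computed as in this sketch (ours; d = πe/6 as in Section 2, and the factor d in u_2 follows the form of (43) given above).

```python
from math import comb, pi, e

d = pi * e / 6.0                       # high-rate quantizer constant, as in Section 2

def weights(M, p, L):
    """Sketch of Eqs. (39) and (43): a_{M,p}, u_0, u_1, u_2 for loss probability p."""
    pk = lambda k: comb(M, k) * p ** (M - k) * (1 - p) ** k
    a_Mp = sum(pk(k) * (M - k) / (k * (M - 1)) for k in range(1, M - 1)) / M   # Eq. (39)
    u0 = sum(pk(k) * k for k in range(1, M + 1)) / (L * M)
    u1 = a_Mp / L
    u2 = d * pk(M - 1) / (L * M * (M - 1) ** 2)
    return a_Mp, u0, u1, u2

print(weights(M=4, p=0.01, L=8))
```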

It can be shown that at the optimal bit allocation, the expected distortion is

D^* = p_0\sigma_x^2 + d\,L M (M-1)^{(M-1)/M}\left(\frac{u_1}{M-2}\right)^{(M-2)/M}\left(u_0 u_2\right)^{1/M}\,\beta(P)\, 2^{-2R},   (45)

where

\beta(P) = \left(\prod_{j=0}^{M-1}\prod_{i=0}^{L-1}\sigma_{y_j}^2(i)\right)^{1/LM}\left(\prod_{i=0}^{L-1}\|g_i\|^2\right)^{1/L}.   (46)

We can then design the lapped transform prefilter P (more precisely, V in (29)) to minimize β(P), which is the only part in (45) that depends on the TDLT filters. Since there is no closed-form solution for this step, we use Matlab to find the optimized solution. In order to easily control the tradeoff between the optimized transforms for single description coding and multiple description coding, we use a weighted objective function

J = wD + (1-w)J_0,   (47)

where 0 ≤ w ≤ 1, and J_0 is the single description coding gain of the transform, which is defined as [30]

J_0 = 1\Big/\left(\prod_{i=0}^{L-1}\|g_i\|^2\,\sigma_y^2(i)\right)^{1/L},   (48)

where σ_y²(i) is the variance of the i-th entry of the DCT coefficient block y(n).

3.4. Application in image coding

The system designed above is for 1-D signals. To apply it to 2-D images, we follow the conventional separable approach, i.e., for each block, the transforms and filters are applied row by row and then column by column. In particular, to apply the 1-D FIR Wiener filter in (32) to 2-D images, we first estimate each row of a target block using the co-located rows from the two horizontal neighbors, and then estimate each column of the block using the co-located columns from the two vertical neighbors. The average of the horizontal and vertical estimations is used as the final prediction.
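A sketch (ours) of the separable 2-D prediction just described: each row of the missing block is estimated from the co-located rows of its two horizontal neighbours with the 1-D filter H_1 of (32), each column from the two vertical neighbours, and the two estimates are averaged. H_1 and the reconstructed neighbouring blocks are assumed to be available, e.g., from the earlier Wiener-filter sketch.

```python
import numpy as np

def predict_block(H1, left, right, top, bottom):
    """Average of horizontal and vertical Wiener estimates of an L x L block.
    H1 is the L x 2L filter of Eq. (32); left/right/top/bottom are the reconstructed
    neighbouring blocks used for the prediction."""
    L = left.shape[0]
    horiz = np.empty((L, L))
    vert = np.empty((L, L))
    for r in range(L):
        horiz[r, :] = H1 @ np.concatenate([left[r, :], right[r, :]])   # row-wise estimate
    for c in range(L):
        vert[:, c] = H1 @ np.concatenate([top[:, c], bottom[:, c]])    # column-wise estimate
    return 0.5 * (horiz + vert)
```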

Another design issue is how to partition the image blocks into different polyphases or groups. To improve the efficiency of the prediction, the blocks should be partitioned such that more neighboring blocks of a missing block are available for prediction. This implies that the minimal distance between blocks of the same polyphase should be maximized. The optimal partition pattern for any value of M is found in [34]. In particular, when M = K², the optimal partition pattern is simply a K × K square matrix. As another example, when M = 3, the polyphase index of the (i, j)-th image block is (i − j) mod 3.

Given the number of desired descriptions M and the corresponding block partition, the next step is to design the optimal TDLT transform and Wiener filters. Since a separable approach is used, and the system is designed based on the 1-D signal model, we find the horizontal and vertical distances between blocks in the same polyphase, and use them to design the corresponding filters. As a result, the filters in K²-description image coding are actually the same as those in K-description coding of 1-D signals. It can also be seen that the filters for 9-description image coding are identical to those for 3-description image coding. Another special case is M = 4, where the block partition pattern is a 2 × 2 matrix; hence the minimal distance of blocks within a polyphase is 2. The design of the transform and Wiener filters thus reduces to the two-description method in [28].
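The block-to-polyphase partitions described above can be generated as in the short sketch below (ours); for M = K² a K × K square pattern is used, and for M = 3 a diagonal pattern, taken here as (i − j) mod 3.

```python
import numpy as np

def partition(rows, cols, M):
    """Polyphase index of each image block for the partitions discussed above."""
    i, j = np.indices((rows, cols))
    K = int(round(M ** 0.5))
    if K * K == M:                      # M is a perfect square: K x K square pattern
        return (i % K) * K + (j % K)
    if M == 3:                          # diagonal pattern (assumed form of the M = 3 rule)
        return (i - j) % 3
    raise NotImplementedError("only M = 3 and M = K^2 are sketched here")

print(partition(4, 4, 4))
print(partition(4, 4, 3))
```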

4. Experimental results

In this section, the proposed three-layer MDC (TLMDC) method is applied to the TDLT-based image coding [29]. The entropy coding method in [35] is used to encode the quantized DCT transform coefficients.

Since there are very few M-description image coding software packages, we compare our method with the rate-distortion-based multiple description coding (RD-MDC) in [22] and the MDC scheme with two-rate predictive coding and staggered quantization (TRPCSQ) in [25]. We use the RD-MDC codec available at [36]. For a given bit rate, the codec has a parameter which decides the central and side distortions at the same time. RD-MDC and TRPCSQ have only one

degree of freedom. Therefore, when the central distortion is decided, all the other distortions are determined automatically. In contrast, the proposed method has two degrees of freedom, which enables us to decide up to two distortion values at the same time.

Note that the RD-MDC in [22] is based on wavelets, and one of its goals is to maintain compatibility with JPEG 2000. To get a fair comparison, we also implement the RD-MDC method in the TDLT framework. That is, the transformed blocks are partitioned into M subsets in the same way as in our method. Each description carries one subset with a higher rate and the others with a lower rate, as in [22].

Figs. 5 and 6 compare the proposed method and the RD-MDC for M = 3 and 4, respectively, using the Lena image. The side PSNRs are plotted against the central PSNR at a total rate of 1 bit per pixel. The figures contain results of both the wavelet-based RD-MDC (the curves denoted RD-MDC-W) and the TDLT-based RD-MDC (the curves denoted RD-MDC-T). To facilitate an easy comparison, for each PSNR combination generated by the wavelet-based RD-MDC codec, we tune our codec to obtain the same central PSNR D_M and the side PSNR D_1 as the wavelet-based RD-MDC. It can be seen from both figures that our method outperforms the wavelet-based RD-MDC. It can also be seen that the TDLT-based RD-MDC is in general better than the wavelet-based RD-MDC. While the TDLT-based RD-MDC has a better performance in D_1, the proposed method has a superior performance in D_{M-1}. However, D_{M-1} has a larger contribution to the end-to-end expected PSNR than D_1 in practical loss scenarios. Therefore, as can be seen in Figs. 7–9, the proposed method outperforms the TDLT-based RD-MDC in terms of end-to-end expected PSNR.

Fig. 5. Side distortion vs central distortion for three-description coding of image Lena at a total rate of 1 bit per pixel.

Fig. 6. Side distortion vs central distortion for four-description coding of image Lena at a total rate of 1 bit per pixel.

Figs. 7–9 compare the optimal expected PSNR of the proposed TLMDC, the TRPCSQ, and the TDLT-based RD-MDC at different loss probabilities p for M = 3, 4, and 9, respectively. The wavelet-based RD-MDC codec available at [36] does not provide a way to generate descriptions optimized for a particular loss probability; hence wavelet-based RD-MDC results are not included in these figures. The test images are Lena, Barbara and Boat in each case. The redundancy is tuned in each case to achieve the


maximum expected PSNR for each loss probability. It can be seen that the TLMDC consistently outperforms the RD-MDC at all values of p, and more gain is achieved as M increases. The TLMDC has similar performance to the TRPCSQ when p is high, but performs better when p is small, which shows the effectiveness of the third layer in our method.

Fig. 7. Three-description image coding results at a total bit rate of 1 bit per pixel. (a) Lena. (b) Barbara. (c) Boat.

Fig. 8. Four-description image coding results at a total bit rate of 1 bit per pixel. (a) Lena. (b) Barbara. (c) Boat.

Fig. 9. Nine-description image coding results at a total bit rate of 1.25 bits per pixel. (a) Lena. (b) Barbara. (c) Boat.

Fig. 10 shows examples of four-description coding of the proposed method and the TRPCSQ for image Boat at a total bit rate of 1 bit per pixel. The PSNRs D_4 and D_1 are tuned to be the same in the two methods. This is possible since the proposed method has two degrees of freedom. In this case, when two descriptions are received, the PSNR of the proposed method is 0.23 dB lower than that of the TRPCSQ. However, it is clear from Fig. 10 that the proposed method yields better visual quality, especially in the sky area. When three descriptions are available, the PSNR of the proposed method is 0.77 dB higher than that of the TRPCSQ.

Fig. 10. Four-description coding results for the proposed method and the TRPCSQ in [25] for image Boat at a total bit rate of 1 bit per pixel. The PSNR values D_4 and D_1 of the proposed method are tuned to be equal to those of the TRPCSQ (D_4 = 38.79 dB and D_1 = 24.65 dB). (a) Proposed, two descriptions (27.34 dB). (b) TRPCSQ, two descriptions (27.57 dB). (c) Proposed, three descriptions (31.57 dB). (d) TRPCSQ, three descriptions (30.80 dB).

5. Conclusion

This paper presents an M-channel MDC method using three-layer coding. The closed-form expressions of the

expected distortions of the system are derived for different numbers of received descriptions. The method is also applied to lapped transform-based multiple description image coding. Experimental results show that this method achieves better performance than other state-of-the-art schemes. The image coding results can be further improved. For example, more advanced 2-D filters can be used instead of 1-D filters, and the entropy coding can be fine-tuned based on the characteristics of the prediction residuals.

References

[1] V. Goyal, Multiple description coding: compression meets the network, IEEE Signal Process. Mag. 18 (September) (2001) 74–93.
[2] V. Vaishampayan, Design of multiple description scalar quantizers, IEEE Trans. Inf. Theory 39 (May) (1993) 821–834.
[3] Y. Wang, M. Orchard, V. Vaishampayan, A. Reibman, Multiple description coding using pairwise correlating transforms, IEEE Trans. Image Process. 10 (March) (2001) 351–366.
[4] R. Venkataramani, G. Kramer, V. Goyal, Multiple description coding with many channels, IEEE Trans. Inf. Theory 49 (September) (2003) 2106–2114.
[5] R. Puri, S.S. Pradhan, K. Ramchandran, n-Channel symmetric multiple descriptions—Part II: an achievable rate-distortion region, IEEE Trans. Inf. Theory 51 (April) (2005) 1377–1392.
[6] C. Tian, J. Chen, New coding schemes for the symmetric k-description problem, IEEE Trans. Inf. Theory 56 (October) (2010) 5344–5365.
[7] A.E. Gamal, T. Cover, Achievable rates for multiple descriptions, IEEE Trans. Inf. Theory IT-28 (November) (1982) 851–857.
[8] T.M. Cover, J.A. Thomas, Elements of Information Theory, Wiley, New York, 1991.
[9] A.E. Mohr, E.A. Riskin, R.E. Ladner, Unequal loss protection: graceful degradation over packet erasure channels through forward error correction, IEEE J. Sel. Areas Commun. 18 (June) (2000) 819–828.
[10] R. Puri, K. Ramchandran, Multiple description source coding using forward error correction, in: 33rd Asilomar Conference on Signals, Systems and Computers, vol. 1, Pacific Grove, CA, October 1999, pp. 342–346.
[11] V. Stankovic, R. Hamzaoui, Y. Charfi, Z. Xiong, Real-time unequal error protection algorithms for progressive image transmission, IEEE J. Sel. Areas Commun. 21 (December) (2003) 1526–1535.
[12] S. Dumitrescu, X. Wu, Z. Wang, Globally optimal uneven error-protected packetization of scalable code streams, IEEE Trans. Multimedia 2 (April) (2004) 230–239.
[13] T. Wiegand, G.J. Sullivan, G. Bjontegaard, A. Luthra, Overview of the H.264/AVC video coding standard, IEEE Trans. Circ. Syst. Video Technol. 13 (July) (2003) 560–576.
[14] S. Srinivasan, C. Tu, S.L. Regunathan, G.J. Sullivan, HD Photo: a new image coding technology for digital photography, in: Proceedings of the SPIE Conference on Applications of Digital Image Processing XXX, vol. 6696, San Diego, August 2007.
[15] T.Y. Berger-Wolf, E.M. Reingold, Index assignment for multichannel communication under failure, IEEE Trans. Inf. Theory 48 (October) (2002) 2656–2668.
[16] C. Tian, S. Hemami, Sequential design of multiple description scalar quantizers, in: Proceedings of the Data Compression Conference, March 2004, pp. 32–41.
[17] J. Ostergaard, J. Jensen, R. Heusdens, n-Channel entropy-constrained multiple-description lattice vector quantization, IEEE Trans. Inf. Theory 52 (May) (2006) 1956–1973.
[18] N. Jayant, Subsampling of a DPCM speech channel to provide two self-contained half-rate channels, Bell Syst. Tech. J. 60 (April) (1981) 501–509.
[19] N. Jayant, S. Christensen, Effects of packet losses in waveform coded speech and improvements due to an odd–even sample-interpolation procedure, IEEE Trans. Commun. 29 (February) (1981) 101–109.
[20] W. Jiang, A. Ortega, Multiple description coding via polyphase transform and selective quantization, in: Proceedings of the SPIE Conference on Visual Communications and Image Processing, vol. 3653, February 1999, pp. 998–1008.
[21] T. Tillo, M. Grangetto, G. Olmo, Multiple description image coding based on Lagrangian rate allocation, IEEE Trans. Image Process. 16 (March) (2007) 673–683.
[22] E. Baccaglini, T. Tillo, G. Olmo, A flexible R-D-based multiple description scheme for JPEG 2000, IEEE Signal Process. Lett. 14 (March) (2007) 197–200.
[23] T. Tillo, E. Baccaglini, G. Olmo, A flexible multi-rate allocation scheme for balanced multiple description coding applications, in: Proceedings of the IEEE 7th Workshop on Multimedia Signal Processing, November 2005, pp. 1–4.
[24] C. Tian, S. Hemami, A new class of multiple description scalar quantizer and its application to image coding, IEEE Signal Process. Lett. 12 (April) (2005) 329–332.
[25] U. Samarawickrama, J. Liang, C. Tian, M-channel multiple description coding with two-rate coding and staggered quantization, IEEE Trans. Circ. Syst. Video Technol. 20 (July) (2010) 933–944.
[26] M. Liu, C. Zhu, Enhancing two-stage multiple description scalar quantization, IEEE Signal Process. Lett. 16 (4) (2009) 253–256.
[27] C. Lin, Y. Zhao, C. Zhu, Two-stage diversity-based multiple description image coding, IEEE Signal Process. Lett. 15 (2008) 837–840.
[28] G. Sun, U. Samarawickrama, J. Liang, C. Tian, C. Tu, T.D. Tran, Multiple description coding with prediction compensation, IEEE Trans. Image Process. 18 (May) (2009) 1037–1047.
[29] T.D. Tran, J. Liang, C. Tu, Lapped transform via time-domain pre- and post-processing, IEEE Trans. Signal Process. 51 (June) (2003) 1557–1571.
[30] P.P. Vaidyanathan, Multirate Systems and Filter Banks, Prentice-Hall, 1993.
[31] S.M. Kay, Fundamentals of Statistical Signal Processing: Detection Theory, Prentice-Hall, Englewood Cliffs, NJ, 1998.
[32] D.S. Taubman, M.W. Marcellin, JPEG 2000: Image Compression Fundamentals, Standards, and Practice, Kluwer Academic, Boston, 2002.
[33] J. Liang, C. Tu, L. Gan, T.D. Tran, K.-K. Ma, Wiener filter-based error resilient time domain lapped transform, IEEE Trans. Image Process. 16 (February) (2007) 428–441.
[34] I.V. Bajic, J.W. Woods, Maximum minimal distance partitioning of the Z² lattice, IEEE Trans. Inf. Theory 49 (April) (2003) 981–992.
[35] C. Tu, T.D. Tran, Context based entropy coding of block transform coefficients for image compression, IEEE Trans. Image Process. 11 (November) (2002) 1271–1283.
[36] RD-MDC source code: http://www.telematica.polito.it/sas-ipl/download.php, 2007.