Block effect reduction by the 1-D gray polynomial interpolation


Digital Signal Processing 20 (2010) 1424–1438

Contents lists available at ScienceDirect

Digital Signal Processing www.elsevier.com/locate/dsp

Cheng-Hsiung Hsieh a,∗, Ren-Hsien Huang b

a Department of Computer Science and Information Engineering, Chaoyang University of Technology, Taiwan
b Mars Semiconductor Corporation, Taiwan

Article info

Article history: Available online 4 January 2010

Abstract

In this paper, the one-dimensional gray polynomial interpolation (1-D GPI), developed for image enlargement in Hsieh et al. (2007) [27], is applied to reduce the block effect in block discrete cosine transform (BDCT) image/video coders. The block effect in BDCT coders results from insufficient bits for the transform coefficients, and an interpolation approach is proposed to relieve the problem. The proposed approach consists of three stages. First, the input image/frame is down sampled by the direct down sampling (DDS) scheme to reduce the amount of data. Second, the down sampled image/frame is put into a BDCT coder, such as the JPEG or the MPEG-2. Finally, the 1-D GPI is applied to enlarge the decoded image. When the JPEG and the MPEG-2 are used as BDCT coders, the coding systems are called the JPEG-GPI and the MPEG-GPI, respectively. The proposed JPEG-GPI and MPEG-GPI are verified through examples. Simulation results indicate that both the JPEG-GPI and the MPEG-GPI reduce the block effect significantly, and therefore the subjective visual quality is improved in the given examples. Note that (i) the down sampling scheme in the first stage can be any down sampling scheme other than the DDS, and (ii) the interpolation approach in the third stage may be different from the 1-D GPI. Thus, three other down sampling schemes, the nearest neighbor down sampling (NNDS), the bilinear down sampling (BLDS), and the bicubic down sampling (BCDS), are considered. These schemes are used to replace the DDS in the JPEG-GPI and the MPEG-GPI to investigate their effect on the coding performance. The simulation results show that the DDS always gives the highest PSNR for all cases under the same conditions. Moreover, the 1-D GPI is replaced by popular 2-D interpolation approaches: the nearest neighbor interpolation (NNI), the bilinear interpolation (BLI), and the bicubic interpolation (BCI).
With the DDS scheme, the simulation results indicate that the JPEG-BLI has slightly better PSNR than the JPEG-GPI for all cases, while the visual quality of the reconstructed images is similar. When the DDS is employed, the MPEG-BCI has higher PSNR than the MPEG-GPI, and the reconstructed frames from the MPEG-GPI and the MPEG-BCI have similar visual quality. In summary, the simulation results show that the proposed coding system, as expected, is able to significantly reduce the block effect that occurs in low bit rate coding or at high compression ratios. © 2009 Elsevier Inc. All rights reserved.

Keywords: Grey 1-AGO; Polynomial interpolation; Block effect; Block discrete cosine transform; Low bit rate coding

∗ Corresponding author. E-mail address: [email protected] (C.-H. Hsieh).

1051-2004/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.dsp.2009.12.012

C.-H. Hsieh, R.-H. Huang / Digital Signal Processing 20 (2010) 1424–1438


1. Introduction

Nowadays, image and video data are common in network transmission. Generally, these data require large memory capacity and therefore large transmission bandwidth. To relieve the problem, image and video coding schemes are sought. Among them, the block discrete cosine transform (BDCT) is the most popular scheme for image and video compression because of its near-optimum energy compaction and the availability of fast algorithms and hardware. Consequently, the BDCT coder has been extensively used by many current image/video standards [1], such as JPEG, ITU-T H.261/3/L and ISO MPEG-1/2/4. However, one of the problems in the BDCT coder is the block effect in low bit rate or high compression cases. To deal with the problem, several approaches have been reported recently.

For the still image coding standard JPEG [2], several block effect reduction methods have been reported. In [3], a lowpass filter was applied to the decoded image to smooth out the blockiness; however, a blurred image resulted. In [4–9], several types of de-blocking filters and post-processing were proposed, yet the improvement did not seem significant. In [10–14], the block effect reduction was performed in the wavelet domain, which generally requires high computational cost. In [15–20], the visual quality was enhanced significantly by the projection onto convex sets; nevertheless, the computational complexity is quite high.

As for the video coding standard MPEG-2, several approaches to reduce the block effect have been reported. In [21], a lowpass filter with Gaussian-shaped impulse response was applied to the video sequence. In [22], the information obtained from the bitstream, including DCT coefficients and motion vectors, was utilized. Based on this information, each block was classified and its block effect was detected. To remove the block effect, a filter was applied to the decoded images. In [23], the slope measure was applied to macro-block (MB) boundaries to determine whether the MB was well compensated. If it was not, the MB was adaptively quantized to reduce the block effect.

Note that the approaches just described suffer either from poor performance or from high computational cost. To relieve these problems, this paper proposes a novel approach, based on the one-dimensional gray polynomial interpolation (1-D GPI), to reduce the block effect in the BDCT coder. When the JPEG and the MPEG-2 are used in the proposed approach, the coding systems are abbreviated as the JPEG-GPI and the MPEG-GPI, respectively.

This paper is organized as follows. In Section 2, the 1-D polynomial interpolation (PI) is briefly reviewed and then the 1-D GPI is described. In Section 3, the proposed JPEG-GPI and MPEG-GPI coding systems are introduced. In Section 4, simulations are given to justify the proposed coding systems, where down sampling schemes and popular 2-D interpolation approaches are considered as well. Finally, the conclusion is given in Section 5.

2. The 1-D PI and the 1-D GPI

In this section, the 1-D PI is briefly reviewed in Section 2.1. For details, one may consult [24]. Then the 1-D GPI, which will be used later in the proposed coding systems, is given in Section 2.2.

2.1. The 1-D PI

Given the data $\{x(k),\ 1 \le k \le L+1\}$, the implementation steps for the 1-D PI are described as follows.

Step 1. Assume x(k) is an L-order polynomial as

$$x(k) = c_L k^L + c_{L-1} k^{L-1} + \cdots + c_1 k + c_0 \tag{1}$$

where $c_n$, for $0 \le n \le L$, are the coefficients to be determined.

Step 2. Substitute $1 \le k \le L+1$ into (1) as

$$\mathbf{x} = \mathbf{V}\mathbf{c} \tag{2}$$

where the elements of $\mathbf{x}$ and $\mathbf{V}$ are $x(k)$ and $v(k, j) = k^{L-j+1}$ for $1 \le j \le L+1$, $1 \le k \le L+1$, respectively. Vector $\mathbf{c}$ has elements $c_n$, for $0 \le n \le L$.

Step 3. Find the interpolated data, $\hat{x}(k + 1/M)$, as

$$\hat{x}(k + 1/M) = c_L (k + 1/M)^L + \cdots + c_1 (k + 1/M) + c_0 \tag{3}$$

where $c_n$ are the coefficients found in (2).

Step 4. Obtain the final interpolated data as

$$\hat{x}(k + 1/M) = \hat{x}(k + 1/M) \times M \tag{4}$$

where M denotes an up sampling factor and is assumed to be an integer without loss of generality. For details of interpolation or up sampling, one may consult [25].
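Steps 1 to 3 above can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch, not the authors' implementation: the function name `pi_1d` is hypothetical, the interpolation positions are assumed to be k + 1/M for k = 1..L, and the scaling in Step 4 is left out so that the output is the polynomial value of (3) itself.

```python
import numpy as np

def pi_1d(x, M=2):
    """Sketch of Steps 1-3 of the 1-D PI for one (L+1)-point data set.

    Returns the coefficients [c_L, ..., c_0] and the values of Eq. (3)
    at k + 1/M for k = 1..L (this range of k is an assumption).
    """
    x = np.asarray(x, dtype=float)
    L = x.size - 1
    k = np.arange(1, L + 2, dtype=float)   # sample positions k = 1..L+1
    V = np.vander(k, L + 1)                 # v(k, j) = k^(L-j+1), as in Eq. (2)
    c = np.linalg.solve(V, x)               # solve x = V c for c
    t = k[:-1] + 1.0 / M                    # interpolation positions k + 1/M
    return c, np.polyval(c, t)              # evaluate Eq. (3)

# Samples of k^2 at k = 1, 2, 3 with L = 2 and M = 2
c, x_hat = pi_1d([1.0, 4.0, 9.0], M=2)
print(np.round(c, 6), np.round(x_hat, 6))
```

Because the three samples lie on $k^2$, the fitted second-order polynomial reproduces it, and the values at 1.5 and 2.5 come out as approximately 2.25 and 6.25.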


Fig. 1. Randomness reduction in the 1-AGO converted data.

2.2. The 1-D GPI

Randomness in the data affects the performance of the 1-D PI, so the performance can be improved by reducing that randomness. It is known that the preprocessing scheme in the gray system, the first-order accumulated generating operation (1-AGO), is able to reduce the randomness in data [26]. Consequently, the 1-AGO is incorporated into the 1-D PI to improve the interpolation performance. The new 1-D interpolation scheme is called the 1-D gray polynomial interpolation (1-D GPI), which is described as follows. Given the data $\{x(k),\ 1 \le k \le L+1\}$, the 1-AGO converted data is found as

$$x^{(1)}(k) = \sum_{i=1}^{k} x(i) \tag{5}$$

for $1 \le k \le L+1$. An example is depicted in Fig. 1, where the original data x(k) is {1, 4, 2, 5, 3} and the 1-AGO converted data $x^{(1)}(k)$ is {1, 5, 7, 12, 15}. It is easy to see that the data after the 1-AGO is less random than the original data. Thus the 1-AGO may improve the interpolation performance of the 1-D PI.

In the 1-D GPI, the 1-AGO is applied to preprocess the input data, which reduces the randomness in the data. Then the 1-AGO preprocessed data is interpolated by the 1-D PI described in Section 2.1. Next, the interpolated data are found through the inverse of the 1-AGO, the first-order inverse accumulated generating operation (1-IAGO), which is defined as

$$x(k) = x^{(1)}(k) - x^{(1)}(k-1) \tag{6}$$

for $2 \le k \le L+1$. Finally, an α filter is applied to the interpolated data to further enhance the interpolation performance.

The 1-D GPI was applied to image enlargement in [27], where no coding scheme was used for the input image before enlargement. With less computational complexity, the 1-D GPI was shown to perform better than conventional 2-D interpolation schemes, such as the bilinear interpolation (BLI) and the bicubic interpolation (BCI), in most cases. Thus it is used in the proposed coding systems to be described in Section 3.

The 1-D GPI is reviewed in the following. Assume a color image O has $YC_bC_r$ format and is down sampled by the factor M × M. The down sampled image is denoted as $O_M$. Since the $YC_bC_r$ components are processed separately in the 1-D GPI, only the Y component is considered in the following. The same steps apply equally to the $C_b$ and $C_r$ components. The implementation steps for the 1-D GPI are given as follows.

Step 1. On a row-by-row basis, the Y component in $O_M$ is segmented into (L + 1)-point subsets, denoted as $\{x(k),\ 1 \le k \le L+1\}$.

Step 2. Preprocess x(k) by the 1-AGO as in (5).

Step 3. With $x^{(1)}(k)$, the interpolated pixels $\hat{x}^{(1)}(k + 1/M)$ are found by (3).


Fig. 2. The block diagram for the 1-D GPI.

Fig. 3. The block diagram for the JPEG-GPI coding system.

Step 4. By the 1-IAGO in (6), find $\hat{x}(k + 1/M)$ as

$$\hat{x}(k + 1/M) = \left[\hat{x}^{(1)}(k + 1/M) - \hat{x}^{(1)}(k)\right] \times M \tag{7}$$

Step 5. By an α filter, the interpolated $\hat{x}(k + 1/M)$ is modified as

$$\hat{x}(k + 1/M) = \alpha\,\hat{x}(k) + (1 - \alpha)\,\hat{x}(k + 1/M) \tag{8}$$

where $0 \le \alpha \le 1$.

Step 6. Similarly, on a column-by-column basis, Steps 1 to 5 are applied to find the interpolated pixels.

Note that in Step 6 some original pixels are not available in the 1-D GPI. In that case, the interpolated pixels are used in the interpolation. The block diagram for the 1-D GPI is depicted in Fig. 2. In our experience, the second-order 1-D GPI is good enough for image enlargement in most cases.

3. The proposed coding systems

Based on the 1-D GPI, a coding scheme is proposed to deal with the block effect problem in BDCT image/video coders. Basically, the coding scheme consists of three stages: a down sampling scheme, a BDCT-based encoder/decoder, and an up sampling or interpolation approach. Though the scheme in each stage can be chosen arbitrarily, the direct down sampling (DDS) is used here for its simplicity: it keeps the pixels with odd indices in columns and rows and discards the others. Similarly, although the coder can be any BDCT-based coder, the JPEG and the MPEG-2 are employed here for their popularity. As for the interpolation approach, the 1-D GPI is employed because its computational cost is low. With the JPEG and the MPEG-2, the proposed coding systems are called the JPEG-GPI and the MPEG-GPI, which are described in Section 3.1 and Section 3.2, respectively.

3.1. The JPEG-GPI

Basically, the JPEG coding process includes four stages: (i) divide the input image into 8 × 8 non-overlapping blocks, (ii) perform the 2-D discrete cosine transform (DCT) on each image block, (iii) quantize the transform coefficients, and (iv) code the quantized transform coefficients by the variable length coding (VLC). Though the JPEG prevails in the coding community, it suffers from the block effect in low bit rate coding. Intuitively, we consider that the block effect results from insufficient bits to preserve the information in the transform coefficients. Consequently, the block effect can be reduced if more bits are assigned to each coefficient.
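The first and third stages described in this section can be sketched on a single image row. The Python/NumPy sketch below is illustrative only and rests on several assumptions that the text does not fix: `dds` and `gpi_segment` are hypothetical names, the BDCT coder between the two stages is omitted, only one (L + 1)-point segment is handled, the interpolation positions are taken as k + 1/M for k = 1..L, and $\hat{x}(k)$ in (8) is taken to be the known sample x(k).

```python
import numpy as np

def dds(row, M=2):
    """Direct down sampling of one row: keep every M-th sample.

    For M = 2 this keeps the samples with odd 1-based indices.
    """
    return np.asarray(row, dtype=float)[::M]

def gpi_segment(x, M=2, alpha=0.5):
    """Sketch of Steps 2-5 of the 1-D GPI on one (L+1)-point segment."""
    x = np.asarray(x, dtype=float)
    L = x.size - 1
    k = np.arange(1, L + 2, dtype=float)            # k = 1..L+1
    x1 = np.cumsum(x)                                # 1-AGO, Eq. (5)
    c = np.linalg.solve(np.vander(k, L + 1), x1)     # Eq. (2) on the 1-AGO data
    x1_hat = np.polyval(c, k[:-1] + 1.0 / M)         # Eq. (3) at k + 1/M
    x_hat = (x1_hat - x1[:-1]) * M                   # 1-IAGO with scaling, Eq. (7)
    # Alpha filter, Eq. (8); x_hat(k) is assumed equal to the known x(k)
    return alpha * x[:-1] + (1.0 - alpha) * x_hat

row = [10.0, 12.0, 14.0, 16.0, 18.0]   # hypothetical 5-pixel row
low = dds(row, M=2)                     # stage 1: three retained samples
est = gpi_segment(low, M=2, alpha=0.5)  # stage 3: estimates of the two dropped pixels
print(low, est)
```

With M = 2 and L = 2, as used later in the simulations, one three-sample segment yields two in-between estimates. A full enlargement would repeat this over all row segments and then over columns, as in Step 6.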
To achieve this, one possible way is to down sample the input image, say by the factor 2 × 2. The down sampled image then has a quarter of the pixels, so the average number of bits per coefficient would be four times that in the original image if the bit rate budget is the same for both images. The remaining problem is to enlarge the down sampled reconstructed image. This problem can be easily dealt with by an interpolation scheme, such as the 1-D GPI. This is the fundamental idea of the proposed JPEG-GPI coding system. The proposed coding system with the DDS, the JPEG, and the 1-D GPI is called the JPEG-GPI, which is depicted in Fig. 3. Given an N × N image O, the implementation steps for the JPEG-GPI coding system are given as follows.

Step 1. Down sample image O with the factor M × M by the DDS. The down sampled image is denoted as $O_M$, whose size is (N/M) × (N/M), where N is assumed to be a multiple of M.

Step 2. Encode the down sampled image $O_M$ by the JPEG.

Step 3. Decode and obtain the reconstructed image of $O_M$, $\hat{O}_M$.

Step 4. Enlarge image $\hat{O}_M$ with the factor M × M by the 1-D GPI and obtain the reconstructed image of O, $\hat{O}$.

3.2. The MPEG-GPI

In this section, a brief review of the MPEG-2 is given first and then the proposed scheme for video coding, called the MPEG-GPI, is introduced. In the MPEG-2 [28], the input image sequence is first put into the frame reorder unit, and a group of pictures (GOP) is formed by I-frames, P-frames and B-frames. In the coding process, the frame is divided into MBs, which are the basic unit for motion estimation and motion compensation. In the case of 4:2:0 sampling, an MB consists of four luminance blocks (Y-blocks) and two chroma blocks (a U-block and a V-block). The size of the Y-, U-, and V-blocks is 8 × 8, which is the basic unit for the coding in the MPEG-2. Two types of blocks are involved in the coding process: intra-blocks and inter-blocks. The coding process for the intra-block and the inter-block is given in Fig. 4. As shown in Fig. 4, the coding process for blocks is the same as in the JPEG, from the DCT to the VLC. Consequently, the MPEG-2 coder also suffers from the block effect in high compression cases.

Fig. 4. The coding process for intra-block and inter-block in the MPEG-2.

With a similar idea to that described for the JPEG-GPI, we incorporate the DDS and the 1-D GPI into the MPEG-2 to reduce the block effect. The proposed coding system is called the MPEG-GPI, which is depicted in Fig. 5. Given a video sequence S with frame size $N_1 \times N_2$, the implementation steps for the MPEG-GPI are given as follows.

Fig. 5. The block diagram for the MPEG-GPI coding system.

Step 1. Down sample each frame in the video sequence S by the factor M × M with the DDS. The down sampled video sequence is denoted as $S_M$ with frame size $(N_1/M) \times (N_2/M)$, where $N_1$ and $N_2$ are assumed to be multiples of M.

Step 2. Encode the down sampled video sequence $S_M$ by the MPEG-2.

Step 3. Decode and obtain the reconstructed video sequence of $S_M$, $\hat{S}_M$.

Step 4. Enlarge each frame in the video sequence $\hat{S}_M$ with the factor M × M by the 1-D GPI and obtain the reconstructed video sequence of S, $\hat{S}$.

4. Simulation results

In this section, examples are provided to verify the proposed JPEG-GPI and MPEG-GPI. Comparisons with the JPEG and the MPEG-2 are then made to demonstrate the superiority of the JPEG-GPI and the MPEG-GPI in the objective and/or subjective assessments. In Section 4.1, the proposed JPEG-GPI is verified by five images and compared with the JPEG.
Moreover, the effect of down sampling schemes on the JPEG-GPI is investigated, where the nearest neighbor down sampling (NNDS), the bilinear down sampling (BLDS), and the bicubic down sampling (BCDS) are considered. Then the JPEG-GPI is compared with the JPEG combined with popular 2-D interpolation approaches, namely the nearest neighbor interpolation (NNI), the bilinear interpolation (BLI), and the bicubic interpolation (BCI). These coding systems are denoted as the JPEG-NNI, the JPEG-BLI, and the JPEG-BCI, respectively. In Section 4.2, four video sequences are used to justify the proposed MPEG-GPI and to compare it with the traditional MPEG-2. The effect of down sampling schemes on the MPEG-GPI is also investigated, where the NNDS, the BLDS, and the BCDS are under study. Finally, the proposed MPEG-GPI is compared with the MPEG-NNI, the MPEG-BLI, and the MPEG-BCI, where the MPEG-NNI is the MPEG-2 with the NNI, and similarly for the other coding systems.

4.1. Simulation results for the JPEG-GPI

This section is divided into three parts. In the first part, the JPEG and the JPEG-GPI are compared. The second part considers the effect of down sampling schemes on the proposed JPEG-GPI. The last part compares the proposed JPEG-GPI with the JPEG-NNI, the JPEG-BLI, and the JPEG-BCI. The results for the three parts are given in the following.


Table 1 Comparison of PSNR (dB) for different JPEG coding systems with the DDS. BR is in bit/pixel.

Image    BR     JPEG    JPEG-NNI  JPEG-BLI  JPEG-BCI  JPEG-GPI
Lena     0.16   14.50   25.81     27.19     27.10     27.02
Lena     0.20   23.61   26.37     28.20     28.13     28.00
Lena     0.24   25.70   26.58     28.67     28.60     28.46
Lake     0.16   15.10   21.79     22.56     22.50     22.43
Lake     0.21   19.93   22.39     23.37     23.32     23.25
Lake     0.24   21.22   22.63     23.73     23.70     23.60
Pepper   0.16   14.22   23.91     24.58     24.57     24.47
Pepper   0.21   22.40   24.57     25.46     25.48     25.35
Pepper   0.23   23.26   24.79     25.77     25.79     25.65
Jet      0.16   17.16   24.02     25.19     25.15     25.04
Jet      0.20   22.36   24.49     25.91     25.90     25.76
Jet      0.24   24.49   24.71     26.31     26.32     26.15
Baboon   0.17   17.06   19.02     19.56     19.45     19.46
Baboon   0.21   18.48   19.12     19.83     19.70     19.72
Baboon   0.25   19.32   19.17     20.08     19.93     19.96

4.1.1. Comparison with the JPEG

To compare the JPEG and the JPEG-GPI, five test images are used as examples: Lena, Lake, Peppers, Jet, and Baboon. All test images are of size 512 × 512. The JPEG software used in the simulation was developed by the IJG [29]. The 1-D GPI is implemented in MATLAB. As described in Section 3.1 for the JPEG-GPI, all 512 × 512 test images are down sampled to 256 × 256, i.e., M = 2, by the DDS scheme. Then the JPEG is applied to encode/decode the down sampled images with the default setting. Finally, the down sampled reconstructed image is enlarged by the 1-D GPI with parameters α = 0.5 and L = 2. The objective assessment used here is the peak signal-to-noise ratio (PSNR), which is defined as

$$\mathrm{PSNR} = 10 \log_{10} \frac{255^2}{\mathrm{MSE}} \quad (\mathrm{dB}) \tag{9}$$

$$\mathrm{MSE} = \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \left[ O(i, j) - \hat{O}(i, j) \right]^2 \tag{10}$$

where $O(i, j)$ and $\hat{O}(i, j)$ are elements of the original and reconstructed images, respectively. Besides, the bit rate (bit/pixel), BR, is defined as

$$\mathrm{BR} = \frac{B_c}{N^2} \tag{11}$$

where $N^2$ is the number of pixels in the input image and $B_c$ denotes the total number of bits used in the compressed image.

For comparison, the PSNR obtained from the JPEG and the proposed JPEG-GPI at different bit rates are given in the third and the seventh columns of Table 1, respectively. The simulation results indicate that the proposed JPEG-GPI improves the PSNR significantly for all test images. To see how much the visual quality is improved, the reconstructed Lena and Baboon from the JPEG with various BR are shown in the first rows of Tables 9 and 10, respectively, whose lower left corners show the reconstructed Lena and Baboon at BR = 0.16 and BR = 0.17 by the proposed JPEG-GPI. From the reconstructed images, the proposed JPEG-GPI, as expected, is able to reduce the block effect in low bit rate coding. Even in the case of BR = 0.16, the JPEG-GPI scheme still obtains a reconstructed image with acceptable visual quality, while the JPEG fails to. The reconstructed images of the JPEG-GPI shown in Tables 9 and 10 clearly indicate that the visual quality is enhanced significantly by the proposed JPEG-GPI. Interestingly, Table 1 also shows that for all test images the reconstructed images from the JPEG-GPI at about 0.16 BR have better PSNR than those from the JPEG at about 0.24 BR. Likewise, the visual quality of the reconstructed Lena and Baboon from the JPEG-GPI at the lower BR is much better than that from the JPEG at the higher BR, as shown in Tables 9 and 10. We conclude from the given examples that the proposed JPEG-GPI is able to reduce the block effect and generally has much better PSNR and visual quality than the JPEG in the low bit rate cases.

4.1.2. Effect of down sampling schemes on the JPEG-GPI

In this part, we replace the DDS in the JPEG-GPI described in Section 3.1 with other down sampling schemes: the NNDS, the BLDS, and the BCDS.
The effect of these down sampling schemes on the JPEG-GPI is investigated. For various BR, the PSNR obtained with the different down sampling schemes are shown in the rightmost columns of Tables 1 to 4, where the corresponding PSNR for the JPEG are listed in the third columns. With the same BR, the proposed JPEG-GPI has better PSNR than the JPEG for all four down sampling schemes, except for Jet at 0.24 BR and for Baboon at 0.25 BR in Table 2. The ranking of PSNR for the four down sampling schemes, from high to low, is the DDS, the BCDS, the BLDS, and the NNDS. In other words, the proposed JPEG-GPI prefers the DDS to the other compared down sampling schemes. As for the visual quality, the reconstructed Lena and Baboon by the JPEG-GPI with the DDS, the BLDS, and the BCDS are shown in the last rows of Tables 9 and 10, respectively, where BR = 0.16 for Lena and BR = 0.17 for Baboon. In the reconstructed images, the image with the BLDS is slightly more blurred than those with the DDS and the BCDS schemes. When compared with the JPEG at the same BR, the reconstructed Lena and Baboon obviously have much better visual quality, no matter which down sampling scheme is used.

Table 2 Comparison of PSNR (dB) for different JPEG coding systems with the NNDS. BR is in bit/pixel.

Image    BR     JPEG    JPEG-NNI  JPEG-BLI  JPEG-BCI  JPEG-GPI
Lena     0.16   14.50   23.19     25.19     25.02     25.35
Lena     0.20   23.61   23.20     25.54     25.35     25.73
Lena     0.24   25.70   23.19     25.66     25.47     25.87
Lake     0.16   15.10   19.97     21.45     21.32     21.54
Lake     0.21   19.93   20.12     21.87     21.72     21.98
Lake     0.24   21.22   20.17     22.06     21.92     22.19
Pepper   0.16   14.22   21.98     23.31     23.21     23.41
Pepper   0.21   22.40   22.27     23.83     23.73     23.96
Pepper   0.23   23.26   22.32     23.95     23.85     24.09
Jet      0.16   17.16   21.45     23.50     23.36     23.60
Jet      0.20   22.36   21.43     23.73     23.61     23.86
Jet      0.24   24.49   21.40     23.85     23.72     23.99
Baboon   0.17   17.06   17.77     18.83     18.74     18.84
Baboon   0.21   18.48   17.66     18.92     18.82     18.93
Baboon   0.25   19.32   17.49     18.96     18.84     18.96

Table 3 Comparison of PSNR (dB) for different JPEG coding systems with the BLDS. BR is in bit/pixel.

Image    BR     JPEG    JPEG-NNI  JPEG-BLI  JPEG-BCI  JPEG-GPI
Lena     0.16   14.50   24.97     26.56     26.54     26.66
Lena     0.20   23.61   25.46     27.34     27.35     27.50
Lena     0.24   25.70   25.65     27.66     27.69     27.85
Lake     0.16   15.10   21.30     22.34     22.31     22.38
Lake     0.21   19.93   21.87     23.10     23.09     23.17
Lake     0.24   21.22   22.10     23.43     23.44     23.52
Pepper   0.16   14.22   23.36     24.40     24.37     24.45
Pepper   0.21   22.40   23.95     25.18     25.18     25.26
Pepper   0.23   23.26   24.16     25.48     25.49     25.58
Jet      0.16   17.16   23.54     25.11     25.10     25.18
Jet      0.20   22.36   23.96     25.72     25.75     25.85
Jet      0.24   24.49   24.11     26.06     26.10     26.20
Baboon   0.17   17.06   19.16     19.63     19.62     19.65
Baboon   0.21   18.48   19.36     19.90     19.90     19.93
Baboon   0.25   19.32   19.49     20.11     20.12     20.15

4.1.3. Comparison for the JPEG-GPI and 2-D PI with the JPEG

Here the 1-D GPI in the JPEG-GPI coding system is replaced with popular 2-D interpolation approaches: the nearest neighbor interpolation (NNI), the bilinear interpolation (BLI), and the bicubic interpolation (BCI). The corresponding coding systems are called the JPEG-NNI, the JPEG-BLI, and the JPEG-BCI, respectively. The JPEG-GPI is then compared with these three coding systems. With different BR and down sampling schemes, the PSNR for the four coding systems are shown in Tables 1 to 4, where the highest PSNR are set in bold. By Tables 2 to 4, with the NNDS, the BLDS, and the BCDS the proposed JPEG-GPI has better PSNR than the JPEG-NNI, the JPEG-BLI, and the JPEG-BCI in all cases, except the case of Baboon at BR = 0.25 in Table 2, where its PSNR equals that of the JPEG-BLI. For the DDS, the proposed JPEG-GPI has similar PSNR to those from the JPEG-BLI and the JPEG-BCI.
The difference of PSNR is no more than 0.21 dB for all cases in Table 1. As for the visual quality, the reconstructed Lena and Baboon obtained from the JPEG, the JPEG-NNI, the JPEG-BLI, the JPEG-BCI, and the JPEG-GPI are shown in Tables 9 and 10, where the DDS, the BLDS, and the BCDS down sampling schemes are applied. Again, the reconstructed images with the BLDS are slightly more blurred than those with the other down sampling schemes, and the reconstructed images from the proposed JPEG-GPI have similar visual quality to those from the JPEG-BLI and the JPEG-BCI. The results indicate that the 1-D GPI has better performance than the 2-D NNI and similar performance to the 2-D BLI and BCI approaches, which is not consistent with the results given in [27]. In [27], the 1-D GPI was generally shown to have better interpolation performance than the compared 2-D interpolation approaches. The reason may be that the image in [27] was input directly to the 1-D GPI without any coding, while in the JPEG-GPI coding system the 1-D GPI operates on the JPEG decoded image.

Table 4 Comparison of PSNR (dB) for different JPEG coding systems with the BCDS. BR is in bit/pixel.

Image    BR     JPEG    JPEG-NNI  JPEG-BLI  JPEG-BCI  JPEG-GPI
Lena     0.16   14.50   24.76     26.63     26.55     26.72
Lena     0.20   23.61   25.21     27.44     27.38     27.59
Lena     0.24   25.70   25.34     27.79     27.74     27.97
Lake     0.16   15.10   21.16     22.39     22.32     22.42
Lake     0.21   19.93   21.69     23.16     23.12     23.23
Lake     0.24   21.22   21.87     23.49     23.45     23.57
Pepper   0.16   14.22   23.27     24.44     24.39     24.48
Pepper   0.21   22.40   23.80     25.23     25.20     25.31
Pepper   0.23   23.26   23.99     25.50     25.48     25.60
Jet      0.16   17.16   23.33     25.17     25.10     25.23
Jet      0.20   22.36   23.64     25.77     25.72     25.88
Jet      0.24   24.49   23.80     26.15     26.13     26.28
Baboon   0.17   17.06   19.08     19.69     19.66     19.70
Baboon   0.21   18.48   19.25     19.98     19.96     20.00
Baboon   0.25   19.32   19.35     20.19     20.18     20.23

4.2. Simulation results for the MPEG-GPI

As in Section 4.1, this section is composed of three parts: the comparison with the MPEG-2, the effect of down sampling schemes on the MPEG-GPI, and the comparison for the MPEG-GPI and 2-D PI with the MPEG-2. The down sampling schemes to be compared are the NNDS, the BLDS, and the BCDS. The coding systems to be investigated are the MPEG-NNI, the MPEG-BLI, and the MPEG-BCI. The simulation results for the three parts are given as follows.

Table 5 Comparison of PSNR for different MPEG coding systems with the DDS. Each entry is PSNR in dB (CR).

Coding system   Coastguard       Container         Foreman          Hall_monitor
MPEG-2          22.24 (94.49)    24.07 (114.99)    22.42 (87.32)    25.00 (113.51)
MPEG-NNI        22.12 (94.67)    21.99 (115.46)    23.89 (87.43)    23.30 (113.72)
MPEG-BLI        24.44 (94.67)    24.60 (115.46)    25.51 (87.43)    25.67 (113.72)
MPEG-BCI        24.58 (94.67)    24.81 (115.46)    25.54 (87.43)    26.03 (113.72)
MPEG-GPI        20.76 (94.67)    20.57 (115.46)    23.45 (87.43)    21.73 (113.72)

4.2.1. Comparison with the MPEG-2

In this part, the proposed MPEG-GPI is verified and compared with the traditional MPEG-2. In the simulation, four test video sequences are used as examples: Coastguard, Container, Foreman, and Hall_monitor. All video sequences have a frame size of 352 × 288. The MPEG-2 software used in the simulation is the MPEG-2 Test Model 5 (TM5) [30]. The 1-D GPI is implemented in MATLAB. As described in Section 3.2 for the MPEG-GPI, all test video sequences are first down sampled by the DDS to 176 × 144, that is, M = 2. Then the MPEG-2 TM5 is applied to encode/decode the down sampled video sequences with the default setting. Finally, the down sampled reconstructed video sequences are enlarged by the 1-D GPI with parameters α = 0.5 and L = 2. The compression ratio (CR) is defined as

$$\mathrm{CR} = \frac{B_o}{B_c} \tag{12}$$

where $B_o$ and $B_c$ are the total numbers of bits used in the original and compressed video files, respectively. The average PSNR obtained from the MPEG-2 and the proposed MPEG-GPI with similar compression ratios are given in the second row and the sixth row of Table 5, respectively. The simulation results indicate that the average PSNR for the MPEG-2 is better than that for the MPEG-GPI for all video sequences except Foreman. However, it is well known that the objective assessment, PSNR, may not be consistent with the visual quality of reconstructed frames in high compression cases. Consequently, the subjective assessment should be considered. To compare the visual quality of the reconstructed video sequences, some randomly chosen reconstructed frames of Coastguard and Hall_monitor from the MPEG-2 and the MPEG-GPI are shown in the upper left corner and the lower left corner of Tables 11 and 12, respectively. The reconstructed frames clearly indicate that the visual quality is enhanced significantly by the proposed MPEG-GPI, since more details are retained than with the MPEG-2. For example, the details of the sea are reconstructed acceptably by the MPEG-GPI, while the MPEG-2 fails to render those details sufficiently and a significant block effect results. Similar results can be found in the ceiling of the Hall_monitor example. Consequently, the proposed MPEG-GPI is able to provide better visual quality, by reducing the block effect in the reconstructed video sequence, even in cases with inferior average PSNR.

Table 6 Comparison of PSNR for different MPEG coding systems with the NNDS. Each entry is PSNR in dB (CR).

Coding system   Coastguard       Container         Foreman          Hall_monitor
MPEG-2          22.24 (94.49)    24.07 (114.99)    22.42 (87.32)    25.00 (113.51)
MPEG-NNI        18.02 (94.68)    17.61 (115.45)    21.02 (87.43)    18.38 (113.73)
MPEG-BLI        20.49 (94.68)    20.11 (115.45)    23.33 (87.43)    20.68 (113.73)
MPEG-BCI        20.41 (94.68)    20.04 (115.45)    23.24 (87.43)    20.62 (113.73)
MPEG-GPI        17.61 (94.68)    17.31 (115.45)    20.50 (87.43)    18.03 (113.73)

Table 7 Comparison of PSNR for different MPEG coding systems with the BLDS. Each entry is PSNR in dB (CR).

Coding system   Coastguard       Container         Foreman          Hall_monitor
MPEG-2          22.24 (94.49)    24.07 (114.99)    22.42 (87.32)    25.00 (113.51)
MPEG-NNI        21.15 (94.68)    20.99 (115.46)    23.53 (87.41)    21.80 (113.71)
MPEG-BLI        22.95 (94.68)    22.85 (115.46)    25.37 (87.41)    23.71 (113.71)
MPEG-BCI        23.13 (94.68)    23.07 (115.46)    25.49 (87.41)    23.94 (113.71)
MPEG-GPI        19.97 (94.68)    19.74 (115.46)    22.72 (87.41)    20.45 (113.71)

Table 8 Comparison of PSNR for different MPEG coding systems with the BCDS. Each entry is PSNR in dB (CR).

Coding system   Coastguard       Container         Foreman          Hall_monitor
MPEG-2          22.24 (94.49)    24.07 (114.99)    22.42 (87.32)    25.00 (113.51)
MPEG-NNI        20.85 (94.67)    20.69 (115.46)    23.35 (87.42)    21.53 (113.72)
MPEG-BLI        23.21 (94.67)    23.17 (115.46)    25.58 (87.42)    24.00 (113.72)
MPEG-BCI        23.37 (94.67)    23.38 (115.46)    25.65 (87.42)    24.19 (113.72)
MPEG-GPI        19.61 (94.67)    19.40 (115.46)    22.55 (87.42)    20.14 (113.72)
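The objective measures used in Sections 4.1 and 4.2, namely the PSNR in (9) and (10), the bit rate in (11), and the compression ratio in (12), can be sketched in Python with NumPy as follows. The function names and the 8-bit peak value of 255 as a default argument are illustrative assumptions. This is not the code used for the reported results, which were obtained with the IJG and TM5 software and MATLAB.

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    """PSNR in dB, following Eqs. (9) and (10)."""
    o = np.asarray(original, dtype=float)
    r = np.asarray(reconstructed, dtype=float)
    mse = np.mean((o - r) ** 2)             # Eq. (10): mean squared error
    if mse == 0.0:
        return float("inf")                 # identical images
    return 10.0 * np.log10(peak ** 2 / mse) # Eq. (9)

def bit_rate(compressed_bits, n_pixels):
    """Bit rate in bit/pixel, Eq. (11): BR = B_c / N^2."""
    return compressed_bits / n_pixels

def compression_ratio(original_bits, compressed_bits):
    """Compression ratio, Eq. (12): CR = B_o / B_c."""
    return original_bits / compressed_bits

def average_psnr(original_frames, reconstructed_frames):
    """Average PSNR over the frames of a sequence (plain mean, an assumption)."""
    values = [psnr(o, r) for o, r in zip(original_frames, reconstructed_frames)]
    return float(np.mean(values))
```

For a video sequence, the text reports an average PSNR. Here it is taken as the plain mean of the per-frame PSNR values, which is one common convention and may differ from how the reported averages were formed.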


Table 9 Comparison of visual quality for different JPEG coding systems with various down sampling schemes (Lena).


Table 10 Comparison of visual quality for different JPEG coding systems with various down sampling schemes (Baboon).


Table 11 Comparison of visual quality for different MPEG coding systems with various down sampling schemes (Coastguard).


Table 12 Comparison of visual quality for different MPEG coding systems with various down sampling schemes (Hall_monitor).


4.2.2. Effect of down sampling schemes on the MPEG-GPI

Here the effect of down sampling schemes on the MPEG-GPI is investigated. The average PSNR values for the proposed MPEG-GPI with the DDS, the NNDS, the BLDS, and the BCDS schemes are given in the last rows of Tables 5 to 8, respectively. For the given examples, the ranking of the average PSNR for the four down sampling schemes, from high to low, is the DDS, the BCDS, the BLDS, and the NNDS. This ranking is the same as for the JPEG-GPI, which suggests that the proposed MPEG-GPI also favors the DDS. The reconstructed frames for the Coastguard and the Hall_monitor are shown in the last rows of Tables 11 and 12, respectively, where the DDS, the BLDS, and the BCDS are applied. As with the JPEG-GPI, the reconstructed frames with the BLDS are slightly blurrier than those with the DDS and the BCDS.

4.2.3. Comparison for the MPEG-GPI and 2-D PI with the MPEG

In this part, the 1-D GPI in the MPEG-GPI shown in Fig. 5 is replaced with popular 2-D interpolation approaches: the NNI, the BLI, and the BCI. The resulting coding systems are called the MPEG-NNI, the MPEG-BLI, and the MPEG-BCI, respectively. With the DDS, the NNDS, the BLDS, and the BCDS, the average PSNR values of the coding systems for the four given video sequences are recorded in Tables 5 to 8, where the highest PSNR values are set in bold. According to Tables 5 to 8, the MPEG-BCI has the highest PSNR except for the cases with the MPEG-BLI in Table 6. Though the proposed MPEG-GPI has the lowest PSNR in all cases, its reconstructed frames have visual quality similar to those from the compared coding systems. This can be seen in two randomly selected frames from the Coastguard and the Hall_monitor given in Tables 11 and 12, which show the reconstructed frames from the MPEG-NNI, the MPEG-BLI, the MPEG-BCI, and the MPEG-GPI with the DDS, the BLDS, and the BCDS.
It is noted that the MPEG-GPI has visual quality similar to that of the compared coding systems when the same down sampling scheme is applied. Moreover, the GPI is one-dimensional, whereas the NNI, the BLI, and the BCI are two-dimensional. The second-order 1-D GPI requires only 3 pixels in the interpolation process, while the 2-D BLI and the 2-D BCI need 4 and 16 pixels, respectively. Thus, the MPEG-GPI remains competitive with the compared coding systems when simplicity matters in real-world applications.

5. Conclusion

In this paper, we have presented a coding approach to reduce the block effect that occurs in low bit rate coding or high compression cases. The proposed coding system consists of three stages: (i) to reduce the amount of image/frame data by a down sampling scheme, (ii) to encode/decode the down sampled image/frame by a BDCT coder, and (iii) to enlarge the down sampled reconstructed image/frame by an interpolation approach. Though the scheme in each stage can be chosen arbitrarily, the direct down sampling (DDS) is used in the first stage for its simplicity and because the proposed coding systems favor it. Similarly, although the coder can be any BDCT-based coder, the JPEG and the MPEG-2 are employed in the second stage for their popularity. In the third stage, the 1-D GPI is employed because of its low computational complexity. With those schemes, the proposed coding systems are called the JPEG-GPI and the MPEG-GPI. The proposed JPEG-GPI and MPEG-GPI coding systems were verified through examples and compared with the JPEG and the MPEG-2, respectively. The simulation results indicated that the reconstructed images of the JPEG-GPI are improved significantly in both the objective and subjective assessments when compared with the JPEG. As for the MPEG-GPI, better visual quality is obtained even though the average PSNR is worse than that of the MPEG-2 in most cases.
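The three stages can be illustrated with the following minimal sketch. It is not the authors' implementation and makes several assumptions: the DDS keeps every second pixel in each direction, the enlargement factor is 2, the coder is an identity placeholder, and the third stage uses ordinary second-order Lagrange interpolation over 3 pixels as a stand-in for the 1-D GPI, applied first along rows and then along columns. All function names are hypothetical.

```python
def direct_down_sample(image, factor=2):
    """Stage (i): keep every `factor`-th pixel (assumed form of the DDS)."""
    return [row[::factor] for row in image[::factor]]


def codec_round_trip(image):
    """Stage (ii): placeholder for a BDCT encode/decode; identity here."""
    return image


def quadratic_upsample_1d(samples):
    """Double a 1-D signal using 3-pixel second-order Lagrange interpolation.

    Stand-in for the 1-D GPI, not the gray polynomial interpolator itself.
    """
    n = len(samples)
    if n < 3:
        raise ValueError("need at least 3 samples")
    out = []
    for i in range(n):
        out.append(float(samples[i]))
        # Pick a 3-sample window around the new point at position i + 0.5.
        j = min(max(i - 1, 0), n - 3)
        y0, y1, y2 = samples[j], samples[j + 1], samples[j + 2]
        t = (i + 0.5) - j  # position relative to window nodes 0, 1, 2
        # Lagrange basis for nodes 0, 1, 2; the last point is extrapolated.
        out.append(y0 * (t - 1) * (t - 2) / 2
                   - y1 * t * (t - 2)
                   + y2 * t * (t - 1) / 2)
    return out


def enlarge_2x(image):
    """Stage (iii): apply the 1-D interpolator to rows, then to columns."""
    rows = [quadratic_upsample_1d(row) for row in image]
    columns = [quadratic_upsample_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]


def encode_decode_enlarge(image):
    """Run the three stages in order on one image or frame."""
    return enlarge_2x(codec_round_trip(direct_down_sample(image)))
```

Each interpolated value in `quadratic_upsample_1d` touches exactly 3 input pixels, which mirrors the pixel count quoted above for a second-order 1-D interpolator; a real system would replace `codec_round_trip` with an actual JPEG or MPEG-2 encode/decode.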
The results suggest that the proposed JPEG-GPI and MPEG-GPI are able to reduce the block effect in low bit rate coding or high compression cases, as expected. To better understand the JPEG-GPI (MPEG-GPI), the effect of down sampling schemes on the JPEG-GPI (MPEG-GPI) was investigated, with the NNDS, the BLDS, and the BCDS under consideration. Moreover, three popular 2-D interpolation approaches, the NNI, the BLI, and the BCI, were used to replace the 1-D GPI in the JPEG-GPI (MPEG-GPI); the resulting coding systems are called the JPEG-NNI (MPEG-NNI), the JPEG-BLI (MPEG-BLI), and the JPEG-BCI (MPEG-BCI), respectively. Simulations were performed with these down sampling schemes and 2-D interpolation approaches, and the results were compared with the corresponding results obtained from the JPEG-GPI (MPEG-GPI). For the given examples, the DDS gives the highest PSNR for all cases in all coding systems, and the coding system with the 2-D BCI generally has better performance. Consequently, the JPEG-BCI (MPEG-BCI) should be used for higher PSNR and better visual quality if the computational cost is not a critical issue in the application. When simplicity and low computational complexity are sought, the proposed JPEG-GPI (MPEG-GPI) with the DDS should be employed since it provides similar visual quality and/or PSNR to the JPEG-BCI (MPEG-BCI).

Acknowledgment

This work was supported by the National Science Council of Republic of China under grants NSC 95-2221-E-324-040, NSC 96-2221-E-324-044, and NSC 97-2221-E-324-032.

References

[1] K.R. Rao, J.J. Hwang, Techniques and Standards for Image, Video, and Audio Coding, Prentice Hall, 1996.
[2] G.K. Wallace, The JPEG still picture compression standard, Commun. ACM 34 (Aug. 1991) 30–44.


[3] H.C. Reeve, J.S. Lim, Reduction of blocking effect in image coding, Opt. Eng. 23 (Jan. 1984) 34–37.
[4] B. Ramamurthi, A. Gersho, Nonlinear space-variant post-processing of block coded images, IEEE Trans. Signal Process. 34 (Oct. 1986) 1258–1267.
[5] S.D. Kim, J. Yi, H.M. Kim, J.B. Ra, A deblocking filter with two separate mode in block-based video coding, IEEE Trans. Circuits Syst. Video Technol. 9 (Feb. 1999) 156–160.
[6] J. Chou, M. Crouse, K. Ramchandran, A simple algorithm for removing blocking artifacts in block-transformed coded images, in: IEEE International Conference on Image Processing, vol. 1, Oct. 1998, pp. 377–380.
[7] H.W. Park, Y.L. Lee, A post-processing method for reducing quantization effects in low bit-rate moving picture coding, IEEE Trans. Circuits Syst. Video Technol. 9 (Feb. 1999) 161–171.
[8] T. Jarske, P. Haavisto, I. Defee, Post-filtering methods for reducing blocking effects from coded images, IEEE Trans. Consumer Electron. 40 (3) (Aug. 1994) 521–526.
[9] Y.F. Hsu, Y.C. Chen, A new adaptive separable median filter for removing blocking effects, IEEE Trans. Consumer Electron. 39 (3) (Aug. 1993) 510–513.
[10] R.A. Gopinath, H. Guo, M. Lang, J.E. Odegard, Wavelet-based post-processing of low bit rate transform coded images, in: IEEE International Conference on Image Processing, vol. 2, Nov. 1994, pp. 913–917.
[11] Z. Xiong, M.T. Orchard, Y. Zhang, A deblocking algorithm for JPEG compressed images using overcomplete wavelet representations, IEEE Trans. Circuits Syst. Video Technol. 7 (Apr. 1997) 433–437.
[12] Hyuk Choi, Taejeoung Kim, Blocking-artifact reduction in block-coded images using wavelet-based subband decomposition, IEEE Trans. Circuits Syst. Video Technol. 10 (5) (Aug. 2000) 801–805.
[13] T.C. Hsung, D.P.K. Lun, W.C. Siu, A deblocking technique for block-transform compressed image using wavelet transform modulus maxima, IEEE Trans. Image Process. 7 (Oct. 1998) 1488–1496.
[14] N.C. Kim, I.H. Jang, D.H. Kim, W.H. Hong, Reduction of blocking artifact in block-coded images using wavelet transform, IEEE Trans. Circuits Syst. Video Technol. 8 (June 1998) 253–257.
[15] S.-W. Hong, Y.-H. Chan, W.-C. Siu, The neural network modeled POCS method for removing blocking effect, in: IEEE International Conference on Neural Networks, vol. 3, Nov. 1995, pp. 1422–1425.
[16] Y. Yang, N.P. Galatsanos, A.K. Katsaggelos, Regularized reconstruction to reduce blocking artifacts of block discrete cosine transform compressed images, IEEE Trans. Circuits Syst. Video Technol. 3 (Dec. 1993) 421–432.
[17] Y. Yang, N.P. Galatsanos, Projection-based spatially adaptive reconstruction of block-transform compressed images, IEEE Trans. Image Process. 4 (July 1995) 896–908.
[18] H. Paek, R.C. Kim, S.U. Lee, On the POCS-based post-processing techniques to reduce the blocking artifacts in transform coded images, IEEE Trans. Circuits Syst. Video Technol. 8 (June 1998) 358–367.
[19] Y. Yang, N.P. Galatsanos, Removal of compression artifacts using projections onto convex sets and line process modeling, IEEE Trans. Image Process. 6 (Oct. 1997) 1345–1357.
[20] S.H. Park, D.S. Kim, Theory of projection onto the narrow quantization constraint set and its application, IEEE Trans. Image Process. 8 (Oct. 1999) 1361–1373.
[21] Yonghun Kim, Taihong Yi, Efficient post-processing for block-based compressed video, in: The 4th EURASIP Conference Focused on Video/Image Processing and Multimedia Communications, vol. 1, July 2003, pp. 101–105.
[22] Satoshi Kondo, A method for removing blocking effects in MPEG-2 video by applying a block classification technique using stream information, IEEE Trans. Consumer Electron. 46 (3) (Aug. 2000) 872–878.
[23] Cheol-Hong Mink, Sanghee Cho, Kyoung Won Lim, Heesub Lee, A new adaptive quantization method to reduce blocking effect, IEEE Trans. Consumer Electron. 44 (3) (Aug. 1998) 768–773.
[24] Christopher J. Zarowski, An Introduction to Numerical Analysis for Electrical and Computer Engineers, Wiley, 2004.
[25] Samuel D. Stearns, Digital Signal Processing with Examples in MATLAB, CRC Press, 2002.
[26] C.-H. Hsieh, Grey neural network and its application to short term load forecasting problem, IEICE Trans. Inform. Syst. E85-D (5) (2002) 897–902.
[27] C.-H. Hsieh, R.-H. Huang, T.-Y. Feng, One-dimensional grey polynomial interpolators for image enlargement, in: International Conference on Computer and Information Science, July 2007, pp. 450–456.
[28] ISO-IEC/JTC1/SC29/WG11, Generic coding of moving pictures and associated audio information: Video, ISO-IEC 13818-2, Nov. 1994.
[29] Available at ftp://ftp.simtel.net/pub/simtelnet/msdos/graphics/jpegsr6.zip.
[30] Available at http://www.mpeg.org/MPEG/video.html.

Cheng-Hsiung Hsieh received his B.S. degree in Electronic Engineering from National Taiwan Institute of Technology, Taiwan, in 1989. In 1995, he earned the M.S. degree from the Department of Electrical Engineering of Tennessee Technological University, USA. He obtained his Ph.D. degree in Electrical Engineering from the University of Texas at Arlington, USA, in 1997. Currently, he is an associate professor at the Department of Computer Science and Information Engineering in Chaoyang University of Technology, Taiwan. Since 1998, he has developed several gray models and other schemes applied to image, video, and speech signal processing. Those studies have been published in journals and conferences. His research interests are gray system, image restoration, image enhancement, image enlargement, error concealment, and image/video coding. He also serves as an initiating director of the Taiwanese Association for Consumer Electronics, established in 2008, and has been an initiating Technical Committee member on Awareness Computing in the IEEE SMC Society since 2009.

Ren-Hsien Huang received the B.A. degree from the Department of Information Management, Southern Taiwan University, Tainan, Taiwan, in 2005. He obtained the M.S. degree from the Department of Computer Science and Information Engineering at Chaoyang University of Technology in 2007. Since 2007, he has worked with Mars Semiconductor Corporation, an IC design house focused on digital image/video processing. His research interests are image/video coding, image enlargement, and frame rate up conversion.