Autocorrelation-based interlaced to progressive format conversion


Digital Signal Processing 29 (2014) 67–77. http://dx.doi.org/10.1016/j.dsp.2014.01.002


Joohyeok Kim a, Gwanggil Jeon b,*, Jechang Jeong a,**

a Department of Electronics and Computer Engineering, Hanyang University, 222 Wangsimni-ro, Seongdong-gu, Seoul 133-791, Republic of Korea
b Department of Embedded Systems Engineering, Incheon National University, 119 Academy-ro, Yeonsu-gu, Incheon 406-772, Republic of Korea

* Corresponding author. ** Principal corresponding author.
E-mail addresses: [email protected] (J. Kim), [email protected] (G. Jeon), [email protected] (J. Jeong).

Article info

Article history: Available online 22 January 2014

Keywords: Video deinterlacing; Edge preserving; Blackman–Harris windowed-sinc filter; Format conversion

Abstract

To generate a high resolution image from a low resolution one, interpolation plays a crucial role. However, conventional interpolation methods, including edge-based interpolation methods, have drawbacks such as a limited number of edge directions, imprecise edge detection, and inefficient interpolation. To overcome these shortcomings, we propose a new edge-directed interpolation method with three aims: various edge directions, reliable edge detection, and outstanding interpolation. Since the number of candidate edge directions in the proposed method is flexible, we can use the various edges included in the high resolution image. To accurately determine the edge direction, we use the autocorrelation of neighboring pixels for the candidate directions, based on the duality between a high resolution image and its corresponding low resolution image. For the interpolation step, we utilize a Blackman–Harris windowed-sinc weighted average filter, where the correlation values obtained in the edge detection step are used as weights. Experimental results show that the proposed method outperforms conventional methods in terms of both subjective and objective results. © 2014 Elsevier Inc. All rights reserved.

1. Introduction

Recently, as the resolution of video content and viewers' demand for high quality video have increased, the data volume of video has grown tremendously. The resolution of HDTV is 1280 × 720p or 1920 × 1080i, and UHDTV requires 4 times or 16 times more storage space (3840 × 2160 for 4K UHDTV and 7680 × 4320 for 8K UHDTV). Even more data is required for 3DTV, which may require data volumes double those of 2D video. To reduce data storage or transmission bandwidth, most video transmission formats, including DVD, SDTV, and HDTV, support an interlaced scanning format [1–8]. Since interlaced video consists of the two fields of a frame captured successively, it has a higher temporal resolution than progressive video, which reduces the flicker effect [9]. However, the interlaced format requires a display that is able to show individual fields. Many displays cannot display interlaced video directly; therefore, deinterlacing is required to view interlaced video on a progressive scan display.

Over the past decades, many studies of deinterlacing have been performed [10–18]. Conventional deinterlacing methods are classified into two classes depending on whether motion information is considered or not: inter-field temporal interpolation and intra-field spatial interpolation [19–32].


While inter-field methods perform better than intra-field methods, they require a huge computational effort to obtain motion information, which is often not feasible in real time. Moreover, if the motion information is not reliable, frames interpolated by inter-field methods may exhibit considerable artifacts and even risk latent error propagation. On the other hand, intra-field methods have the advantages of simplicity and ease of implementation, and they perform better when the motion of objects is complex or fast. In addition, intra-field methods are necessary because they limit error propagation; intra-field interpolated frames should be inserted periodically into a video sequence so that it can be displayed at any time.

The line average (LA) method is the simplest intra-field method. While it is easy to implement, it suffers from artifacts such as aliasing and jerkiness, since it interpolates a missing pixel with just the average of its two vertically neighboring pixels. In order to take edge information into account and improve subjective quality, several methods have been proposed. Among them, edge-based LA (ELA) and modified ELA (MELA) are popular [24,25]. ELA considers three edge directions, 45°, 90°, and 135°. It first calculates the absolute difference of the two neighboring pixels along each of the three directions and then chooses the direction with the minimum difference as the final edge direction. The average of the two pixels in the final edge direction is used to interpolate the missing pixel. MELA utilizes four or more pixels instead of two in each direction to calculate the costs for the edge directions; missing pixels are then interpolated along the direction with the minimum cost. MELA plays the role of a low-pass filter because it uses an average of two or four pixels. However, its number of edge directions is limited to three.
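To make the ELA rule above concrete, the following is a minimal sketch (not from the paper) of interpolating one missing pixel. The array name `field` and the assumption that rows m − 1 and m + 1 hold original lines are illustrative only.

```python
import numpy as np

def ela_pixel(field, m, n):
    """Sketch of ELA: pick the 45/90/135 degree pair with the smallest
    absolute difference and return its average for the missing pixel (m, n)."""
    up = field[m - 1].astype(float)
    low = field[m + 1].astype(float)
    candidates = [
        (up[n + 1], low[n - 1]),   # 45-degree direction
        (up[n],     low[n]),       # 90-degree (vertical) direction, i.e. LA
        (up[n - 1], low[n + 1]),   # 135-degree direction
    ]
    a, b = min(candidates, key=lambda p: abs(p[0] - p[1]))
    return 0.5 * (a + b)
```

MELA follows the same pattern but accumulates the difference over four or more pixels per direction before choosing the minimum-cost direction.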


Table 1. Summary of conventional methods.

Algorithm | Number of candidate edge directions | Edge detection method | Interpolation method
LA | 1 | Only vertical | 2-tap line average
ELA | 3 | SAD (2 pixels) | 2-tap line average
MELA | 3 | SAD (4 or 6 pixels) | 4-tap average
DOI | 33 | SAD (2 × 3 pixels) | 2-tap line average
ECA | 5 | Correlation weight function | Weighted average
VDD | 3–33 | Stepwise edge detection | FMELA, MVI, MDA, DOI
FDD | 9 | Modified Sobel | Weighted average
FDIF | 3 | SAD | Weighted average
CAD | N/A | N/A | Wiener filter
FSID | N/A | Sobel | Bilateral/trilateral filter
Prop. | 2x + 1 | Autocorrelation | BH windowed-sinc weighted average

In [26], the authors proposed direction-oriented interpolation (DOI) to take edge directions into account. They use a block matching error to determine the edge direction: the best matching block is found by calculating the distortions between the current block and k-shifted 2 × 3 blocks on the upper (and lower) lines. If the directions determined for the upper and the lower lines are almost on a straight line, they are set as the final edge direction; otherwise, LA is used. DOI applies directional interpolation only when the two determined edge directions are almost on a straight line, which is an excessively strict constraint considering that the proposed search range is ±16. In [27], the authors introduced fine directional deinterlacing (FDD), which utilizes ELA and a modified Sobel filter to obtain the edge direction. After determining with ELA whether the edge is slanted toward the left or the right, a Sobel filter is applied to five pixels chosen according to that direction, and the weighted sum of the two directions with the largest magnitudes is used as the final direction. However, the edge direction from the Sobel filter may not be correct, because natural images may contain noise and/or intensity variation in flat regions. Both DOI and FDD adopt an early termination step before determining the edge direction: the LA method is used instead if the absolute difference between the two vertically neighboring pixels is less than a predefined threshold. Since the threshold is relatively large, this early termination contributes to improved PSNR results but also bounds the improvement; in other words, many pixels are interpolated using LA, so the performance converges toward that of LA.

In [28], the authors proposed the edge-based correlation adaptive (ECA) method. Horizontal flat-area detection is performed first; if a pixel lies in such an area, the mean of four diagonal pixels is used as the estimate. Otherwise, a correlation weight function is utilized, where the absolute differences for five directions are used as weights. To reduce incorrect edge direction decisions, a voting-based deinterlacing method for directional error correction (VDD) was proposed [29]. VDD performs stepwise edge inspection, and four different interpolation methods are utilized at the different steps: modified MELA (FMELA), majority voting-based interpolation (MVI), majority voting-based direction average (MDA), and DOI. VDD provides relatively good image quality, but it uses six threshold parameters, which causes degradation in some images. A fixed directional interpolation filter (FDIF) was recently proposed in [30]; it determines the edge direction using the edge detection method of MELA and interpolates using the DCT-based interpolation filter adopted in High Efficiency Video Coding (HEVC). It is fast, but it has the same problem as MELA because it only has three candidate directions. In [31], the authors used geometric duality and a Wiener filter for their deinterlacing system, covariance-based adaptive deinterlacing (CAD). They obtain the filter coefficients from the corresponding low resolution image and interpolate missing pixels using an 8-tap Wiener filter.

They use a matrix inverse operation to obtain the filter coefficients, but the inverse does not necessarily exist in flat regions. Moreover, their approach is very time-consuming, and the duality may be destroyed because they use a large data matrix (a 16 × 19 LR image block). In [32], the authors presented a filter switching interpolation method (FSID) in which the edge direction is determined using a Sobel filter, and a bilateral or trilateral filter is used to interpolate missing pixels. Because pre-interpolation is required to apply the Sobel filter, and the bilateral and trilateral filters involve exponential operations, this approach is also time-consuming. Table 1 summarizes the conventional methods and the proposed method.

In order to overcome the drawbacks of conventional methods, such as limited edge directions, inefficient early termination, and inaccurate edge detection, we propose a new edge-directed interpolation method that uses the autocorrelation of the interpolation errors of the neighboring pixels for the candidate directions. The proposed method provides a variety of edge directions whose number can be extended arbitrarily, and it presents a reliable edge detection method based on the autocorrelation of neighboring pixels for the candidate directions. Since we interpolate missing pixels using a Blackman–Harris (BH) windowed-sinc weighted average filter along the determined edge direction, the interpolated image is smooth while the edges are preserved.

The remainder of the paper is organized as follows. The proposed method is explained in Section 2. The experimental results comparing the proposed method with conventional methods are described in Section 3. Finally, we present conclusions in Section 4.

2. Proposed method

The proposed method has three main contributions. The first is extendable edge directions: the proposed method is able to use a variety of edge directions, as shown in Table 2. The second contribution is reliable edge detection. To accurately determine the edge direction, we utilize the autocorrelation of neighboring pixels under the assumption of duality between a high resolution (HR) image and its corresponding low resolution (LR) image. In other words, we determine the edge direction of a missing pixel using the edge directions of the neighboring pixels in the LR image. Details of the edge direction detection are described in the next section. The third contribution is the interpolation scheme along the determined edge direction. After determining the edge direction, we interpolate missing pixels using a Blackman–Harris windowed-sinc weighted average.

2.1. Candidate edge directions

Fig. 1 shows the candidate edge directions we consider, where the circles with a red dot are to-be-interpolated pixels and the other circles, such as those on rows m − 3, m − 1, and m + 1, are known original pixels.


Table 2. Candidate edge directions and their degrees.

dir_k | dir8 | dir6 | dir4 | dir2 | dir1 | dir3 | dir5 | dir7 | dir9
d | −4 | −3 | −2 | −1 | 0 | 1 | 2 | 3 | 4
Degree | 153.43° | 146.31° | 135° | 116.57° | 90° | 63.43° | 45° | 33.69° | 26.57°

Fig. 1. The candidate edge directions: (a) HR image, (b) LR image.

The lines passing through the current pixel f(m, n) show the edge directions (d = . . . , −2, −1, 0, 1, 2, . . .), where d is an integer. For example, dir2 in Fig. 1(a) is the line that connects the current pixel and the pixel shifted 0.5 to the left on the upper line, f(m − 1, n − 0.5). As shown in Fig. 1(b), the lines pass only through original pixels when the directions are applied to a neighboring original pixel in the LR image, such as f(m − 1, n). The interpolation errors obtained in the LR image along these directions are utilized when calculating the autocorrelation.

The candidate edge directions can be represented as

$\theta = \tan^{-1}\left(\frac{2}{d}\right).$   (1)

Table 2 shows the candidate edge directions and their corresponding degrees produced by Eq. (1). Here, the relation between d and k in dir_k can be represented by

$d = (-1)^{k+1}\left\lfloor \frac{k}{2} \right\rfloor.$   (2)
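As a quick illustration (not from the paper), the following sketch maps the direction index k to the signed offset d of Eq. (2) and to the angle of Eq. (1), reproducing Table 2; the function names are ours.

```python
import math

def k_to_d(k: int) -> int:
    """Eq. (2): d = (-1)^(k+1) * floor(k / 2)."""
    return (-1) ** (k + 1) * (k // 2)

def direction_angle(d: int) -> float:
    """Eq. (1): theta = atan(2 / d) in degrees; atan2 also covers d = 0 (90 deg)."""
    return math.degrees(math.atan2(2.0, d))

for k in range(1, 10):
    d = k_to_d(k)
    print(f"dir{k}: d = {d:+d}, angle = {direction_angle(d):.2f} deg")
# dir1: 90.00, dir2: 116.57, dir3: 63.43, dir4: 135.00, dir5: 45.00, ...
```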


Fig. 2. Duality between two different resolutions: (a) high resolution image, (b) low resolution image, (c) Canny edge of (a), and (d) Canny edge of (b).

2.2. Edge detection

The proposed edge detection consists of two steps: (1) calculation of the interpolation errors of the neighboring pixels for each candidate direction in the LR image, and (2) calculation of the autocorrelation of the errors. Fig. 2 shows an example of the duality, where the low resolution image was produced by sub-sampling. The edge images in this figure were generated using a Canny filter, with the standard deviation of the Gaussian filter set to 1, the high threshold to 0.5, and the low threshold to 0.4 × high threshold [33]. The Canny filter finds edges using the gradient and decides whether a pixel is on an edge by considering the relationship between weak and strong edges; thus, it is less vulnerable to noise than other edge filters. In Fig. 2, the LR image is a good approximation of the HR image, with some loss of high-frequency components in detailed regions. Based on this observation, we developed the following edge detection method.

Using the duality, we first generate the predictive values of the original neighboring pixels along the candidate edges in the LR image. Then we calculate the errors between the predictive values and the original values. As the neighboring pixels have original intensity values, we can calculate the exact errors by





$e_{\mathrm{dir}_k, i, j} = \left| f(i, j) - \hat{f}_{\mathrm{dir}_k}(i, j) \right| = \left| f(i, j) - \frac{f(i - 2, j + d) + f(i + 2, j - d)}{2} \right|,$   (3)

where (i, j) is the location of a neighboring pixel, f(i, j) is the original intensity of the neighboring pixel, and $\hat{f}_{\mathrm{dir}_k}(i, j)$ is the value predicted along the dir_k direction. We use the average of the two closest pixels along the direction as the predictive value. After obtaining the errors for all the candidate edges, we calculate the autocorrelation for each edge:





$R_{XX}(\tau) = E\left[X(t)X(t + \tau)\right] = E\left[X^2(\tau)\right] = \mu^2 + \sigma^2.$   (4)

The first term E[X(t)X(t + τ)] in Eq. (4) reduces to E[X²(τ)] when the signal is wide-sense stationary (WSS) [34,35]. Because the data we use are the errors between the original values and their predictive values, the mean is almost zero and the variances are very similar; therefore, the WSS assumption is appropriate. The third term, μ² + σ², is obtained by setting τ to zero. From Eq. (4), we can thus obtain the autocorrelation from the mean and the variance without great effort.


Fig. 3. An example of the cost calculation for the dir5 direction.

We utilize this autocorrelation as the cost for finding the edge of the to-be-interpolated pixel:

$C_{\mathrm{dir}_k} = \mu^2_{\mathrm{dir}_k} + \sigma^2_{\mathrm{dir}_k},$   (5)

where $C_{\mathrm{dir}_k}$ is the cost of the k-th direction, and $\mu_{\mathrm{dir}_k}$ and $\sigma^2_{\mathrm{dir}_k}$ are the mean and the variance of the errors. We set the direction with the smallest $C_{\mathrm{dir}_k}$ as the edge. The meaning of Eq. (5) is as follows. $\mu_{\mathrm{dir}_k}$ is the average of the interpolation errors for the k-th direction obtained from all the neighboring pixels. If the errors for one direction are smaller than those for the other directions, interpolation along that direction is more accurate; that is, $\mu_{\mathrm{dir}_k}$ indicates the validity of interpolation along the direction. Since the average error is proportional to the cost, a smaller cost implies a more accurate edge. $\sigma^2_{\mathrm{dir}_k}$ is a measure of the spread of the errors. A small variance means that the interpolation errors for the k-th direction are similar for all neighbors and close to the mean value; hence, $\sigma^2_{\mathrm{dir}_k}$ provides the reliability of the direction from a statistical point of view. This is why we determine the direction δ with the minimum cost as the edge:

$\delta = \arg\min_{\mathrm{dir}_k} C_{\mathrm{dir}_k}.$   (6)

Fig. 3 shows an example of how to calculate the cost for dir5. The number of neighboring pixels is set to six; this number can be changed as a user parameter, but two or six pixels are recommended to provide low complexity while maintaining high performance. We first calculate the errors of the six neighboring pixels for dir5 using Eq. (7):

$e_{\mathrm{dir}_5,1} = \left| f(m-1, n-1) - \frac{f(m-3, n+1) + f(m+1, n-3)}{2} \right|,$
$e_{\mathrm{dir}_5,2} = \left| f(m-1, n) - \frac{f(m-3, n+2) + f(m+1, n-2)}{2} \right|,$
$e_{\mathrm{dir}_5,3} = \left| f(m-1, n+1) - \frac{f(m-3, n+3) + f(m+1, n-1)}{2} \right|,$
$e_{\mathrm{dir}_5,4} = \left| f(m+1, n-1) - \frac{f(m-1, n+1) + f(m+3, n-3)}{2} \right|,$
$e_{\mathrm{dir}_5,5} = \left| f(m+1, n) - \frac{f(m-1, n+2) + f(m+3, n-2)}{2} \right|,$
$e_{\mathrm{dir}_5,6} = \left| f(m+1, n+1) - \frac{f(m-1, n+3) + f(m+3, n-1)}{2} \right|.$   (7)

For edge detection, we first calculate the interpolation errors of the six neighboring pixels for each dir_k using Eq. (7). In the example of Fig. 3, the errors for dir5 are the differences between each neighboring pixel and its predictive value: e_dir5,1 = 1, e_dir5,2 = 1.5, e_dir5,3 = 0, e_dir5,4 = 2, e_dir5,5 = 0, e_dir5,6 = 1.5. The mean of the errors is 1 and the variance is 0.58; therefore, the cost of dir5 is 1.58. The costs for the other directions are obtained analogously. After calculating the costs for all the directions, we determine the direction with the smallest cost as the edge direction.
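The cost computation of Eqs. (3), (5), and (6) can be sketched as follows (our illustration, not the authors' code). `field` is assumed to be a 2-D array in which rows m − 3, m − 1, m + 1, and m + 3 hold original lines, and the candidate offsets correspond to the seven directions used in the experiments.

```python
import numpy as np

NEIGHBORS = ((-1, -1), (-1, 0), (-1, 1), (1, -1), (1, 0), (1, 1))

def direction_cost(field, m, n, d):
    """Cost of Eq. (5): mean^2 + variance of the Eq. (3) errors of the six
    neighboring original pixels, predicted along direction offset d."""
    errors = []
    for dm, dn in NEIGHBORS:
        i, j = m + dm, n + dn
        pred = 0.5 * (float(field[i - 2, j + d]) + float(field[i + 2, j - d]))  # Eq. (3)
        errors.append(abs(float(field[i, j]) - pred))
    errors = np.asarray(errors)
    return errors.mean() ** 2 + errors.var()

def best_direction(field, m, n, d_candidates=(0, -1, 1, -2, 2, -3, 3)):
    """Eq. (6): pick the candidate offset with the smallest cost."""
    return min(d_candidates, key=lambda d: direction_cost(field, m, n, d))

# Reproducing the worked example: errors 1, 1.5, 0, 2, 0, 1.5 have mean 1 and
# variance 0.58, so the cost of dir5 is about 1.58.
e = np.array([1, 1.5, 0, 2, 0, 1.5])
print(round(e.mean() ** 2 + e.var(), 2))   # 1.58
```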


Fig. 4. Randomly distributed region.

2.3. Blackman–Harris windowed-sinc weighted average filter

For the interpolation step, we use a Blackman–Harris (BH) windowed-sinc weighted average filter. Let C_min1 and C_min2 be the smallest and the second smallest costs obtained in the edge detection step, respectively. If one of them corresponds to an upper-right direction and the other to an upper-left direction, as in Fig. 4, we consider the pixel to be randomly distributed and interpolate it with the average of the two spatially closest pixels,

$\hat{f}(m, n) = \frac{f(m - 1, n) + f(m + 1, n)}{2}.$   (8)

Otherwise, we use the BH windowed-sinc weighted average filter for interpolation. The BH window is widely used because it is easy to implement and has low ripple in the stop band. The BH window is expressed as

$W(n) = a_0 + a_1 \cos\left(\frac{2\pi}{N} n\right) + a_2 \cos\left(\frac{2\pi}{N} 2n\right),$   (9)

where a_0, a_1, and a_2 are coefficients and N is the filter length [36]. When the filter length is six, it is known that a_0 = 0.42323, a_1 = 0.49755, and a_2 = 0.07922. By convolving the BH window with the sinc function in the frequency domain, we obtain the BH windowed-sinc filter. Convolving the 6-tap BH windowed-sinc filter with the pixel values sampled from −3 to 3 constructs a predictive value at the (m, n) position:

$p_{\mathrm{dir}_k} = \mathbf{f} * \mathbf{h}_{\mathrm{BH}},$   (10)

where $\mathbf{h}_{\mathrm{BH}}$ and $\mathbf{f}$ denote the BH windowed-sinc filter coefficients and the pixels on the determined edge, respectively. We use six pixels for $\mathbf{f}$, which is represented as

$\mathbf{f} = \left[ f\left(m-5, n+\Phi\left(\tfrac{5d}{2}\right)\right), f\left(m-3, n+\Phi\left(\tfrac{3d}{2}\right)\right), f\left(m-1, n+\Phi\left(\tfrac{d}{2}\right)\right), f\left(m+1, n+\Phi\left(\tfrac{-d}{2}\right)\right), f\left(m+3, n+\Phi\left(\tfrac{-3d}{2}\right)\right), f\left(m+5, n+\Phi\left(\tfrac{-5d}{2}\right)\right) \right],$   (11)

where Φ(·) is a fix function that rounds values toward zero. For fast implementation, we use approximated values for $\mathbf{h}_{\mathrm{BH}}$:

$\mathbf{h}_{\mathrm{BH}} = \frac{[\,2, -11, 73, 73, -11, 2\,]}{128}.$   (12)

We denote by $p_\delta$ and $p_v$ the values predicted by the BH windowed-sinc filter along the determined edge direction and along the vertical direction, respectively. Then, the estimated value is obtained from Eqs. (13) and (14):

$\hat{f}(m, n) = \mu \cdot p_v + (1 - \mu)\, p_\delta,$   (13)

$\mu = \frac{C_\delta}{C_\delta + C_v},$   (14)

where $C_v$ is the cost for the vertical direction. The above BH windowed-sinc weighted average filter smooths the image while preserving the edge, by using optically and spatially close pixels with adaptive weights.

Fig. 5. Flowchart of the proposed method.

Fig. 5 shows the flowchart of the proposed method, where the error calculation of Eq. (3) is pre-computed for the whole frame. After the mean and the variance of the six neighboring pixels are calculated for each candidate direction, the costs of the directions are obtained with Eq. (5), and the direction with the minimum cost is chosen as the edge direction by Eq. (6). If the directions with the smallest and the second smallest costs indicate a randomly distributed region, we interpolate the pixel with the average of the two vertically closest pixels, as expressed in Eq. (8). Otherwise, the BH windowed-sinc weighted average of Eq. (13) is used as the estimate of the pixel.
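The following sketch (our illustration) applies the approximated coefficients of Eq. (12) along a given direction offset d and blends the directional and vertical predictions as in Eqs. (13)–(14). The array layout and the small epsilon guarding against a zero denominator are our assumptions, not part of the paper.

```python
import numpy as np

H_BH = np.array([2.0, -11.0, 73.0, 73.0, -11.0, 2.0]) / 128.0   # Eq. (12)

def bh_predict(field, m, n, d):
    """Predict f(m, n) along direction offset d, Eqs. (10)-(11)."""
    taps = []
    for row_off in (-5, -3, -1, 1, 3, 5):
        col_off = int(np.fix(-row_off * d / 2.0))   # Phi(.) rounds toward zero
        taps.append(float(field[m + row_off, n + col_off]))
    return float(np.dot(H_BH, taps))

def interpolate_pixel(field, m, n, d_edge, c_edge, c_vert):
    """Blend the edge-directed and vertical predictions, Eqs. (13)-(14)."""
    p_delta = bh_predict(field, m, n, d_edge)       # along the detected edge
    p_v = bh_predict(field, m, n, 0)                # along the vertical direction
    mu = c_edge / (c_edge + c_vert + 1e-12)         # Eq. (14); epsilon is ours
    return mu * p_v + (1.0 - mu) * p_delta          # Eq. (13)
```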


Fig. 6. Test images. (a) Barbara, (b) Boat, (c) Lena, (d) Blonde, (e) Building, (f) Cameraman, (g) Pirate, (h) Strawberries, and (i) Thumb print.

3. Experimental results

To assess the performance of the proposed method, we simulated it and compared it with the nine conventional methods explained in the introduction: ELA, MELA, DOI, FDD, ECA, VDD, FDIF, CAD, and FSID. In the simulation, we set the number of candidate directions to seven; note that this is a user parameter and can be changed. We provide peak signal-to-noise ratio (PSNR) and mean structural similarity (MSSIM) results as measures of objective quality, along with deinterlaced images for subjective comparison. In addition, we compared the computational complexity. We use nine images: Strawberries (666 × 666), Building (600 × 600), Barbara, Boat, Lena, Blonde, Cameraman, Pirate (512 × 512), and Thumb print (512 × 480). Fig. 6 shows the test images used in the simulation.

Table 3 shows the PSNR results used to evaluate objective quality. The proposed method shows the best average PSNR, and the PSNR gap ranges from 0.1 dB (compared to FDIF) to 2.87 dB (compared to ELA). The proposed method is ranked 6th for the Building image, which contains roof bricks in some areas, with many edges in a very small region.

In this case, the best edge detected at each pixel is not statistically the best, because the edge direction at the current pixel may differ from those at adjacent pixels. Therefore, LA-based methods such as MELA, DOI, and FDD show higher PSNR results, although they cause blurring. For similar reasons, the proposed method is ranked 4th for the Blonde image, where the degradation comes from the suntanned face. Thumb print is an image of the kind used in biometric and forensic science; because it contains edge directions covering almost all degrees from 0° to 360°, the result on this image reflects the performance of edge detection. For this image, the proposed method shows the best result.

Table 4 compares the computational complexity. As this table shows, the proposed method is faster than DOI, FDD, VDD, CAD, and FSID. Note that MELA is faster, but this is due to its early termination, which causes most pixels to be interpolated by LA. The number of arithmetic operations required to interpolate one pixel with each method is shown in Table 5, where ET denotes early termination, O.W otherwise, and R.D the randomly distributed case. As mentioned before, VDD utilizes four different interpolation methods, so we list the number of operations required for each case.


Table 3. Comparison of performance measured by PSNR (in dB).

PSNR | ELA | MELA | DOI | FDD | ECA | VDD | FDIF | CAD | FSID | Prop. | Rank
Barbara | 25.09 | 32.21 | 28.67 | 31.28 | 28.27 | 31.46 | 33.79 | 26.21 | 33.02 | 33.74 | 2
Boat | 32.33 | 35.30 | 33.85 | 33.95 | 32.57 | 34.78 | 36.07 | 35.47 | 36.03 | 36.26 | 1
Lena | 35.85 | 37.96 | 36.52 | 36.91 | 35.84 | 37.26 | 38.06 | 38.20 | 38.32 | 38.32 | 1
Blonde | 31.59 | 33.08 | 32.60 | 32.62 | 31.92 | 32.74 | 32.54 | 33.78 | 33.00 | 32.88 | 4
Building | 32.37 | 33.15 | 33.24 | 33.13 | 32.57 | 32.96 | 33.04 | 32.73 | 33.07 | 32.97 | 6
Cameraman | 35.53 | 37.03 | 36.02 | 36.25 | 35.47 | 36.66 | 39.97 | 37.51 | 38.50 | 39.71 | 2
Pirate | 32.36 | 33.82 | 33.15 | 33.16 | 32.54 | 33.27 | 33.57 | 33.84 | 33.80 | 33.80 | 3
Strawberries | 32.62 | 34.01 | 33.70 | 33.38 | 33.23 | 33.22 | 33.83 | 33.63 | 33.90 | 33.99 | 2
Thumb print | 27.55 | 28.85 | 26.98 | 28.00 | 27.78 | 28.22 | 29.33 | 28.75 | 28.82 | 29.46 | 1
Average | 31.70 | 33.93 | 32.75 | 33.19 | 32.24 | 33.40 | 34.47 | 33.35 | 34.27 | 34.57 | 1

Table 4. Comparison of computational complexity measured by CPU time (s).

Time | ELA | MELA | DOI | FDD | ECA | VDD | FDIF | CAD | FSID | Prop. | Rank
Barbara | 0.164 | 0.188 | 1.187 | 1.994 | 0.127 | 2.243 | 0.759 | 26.561 | 1.448 | 0.791 | 5
Boat | 0.164 | 0.188 | 1.041 | 1.624 | 0.120 | 2.162 | 0.768 | 34.948 | 1.321 | 0.787 | 5
Lena | 0.167 | 0.191 | 0.602 | 0.962 | 0.114 | 2.311 | 0.768 | 29.559 | 1.166 | 0.804 | 6
Blonde | 0.174 | 0.192 | 0.864 | 1.364 | 0.121 | 2.454 | 0.782 | 28.570 | 1.204 | 0.804 | 5
Building | 0.216 | 0.250 | 0.813 | 1.319 | 0.133 | 2.937 | 1.059 | 54.511 | 1.472 | 1.114 | 6
Cameraman | 0.157 | 0.187 | 0.550 | 0.954 | 0.100 | 2.132 | 0.756 | 28.775 | 1.135 | 0.813 | 6
Pirate | 0.164 | 0.182 | 0.927 | 1.626 | 0.124 | 2.261 | 0.755 | 31.115 | 1.311 | 0.817 | 5
Strawberries | 0.268 | 0.315 | 0.888 | 1.718 | 0.165 | 3.858 | 1.285 | 46.865 | 1.830 | 1.341 | 5
Thumb print | 0.152 | 0.166 | 1.925 | 3.369 | 0.130 | 2.246 | 0.707 | 30.604 | 1.927 | 0.762 | 5
Average | 0.181 | 0.207 | 0.978 | 1.659 | 0.126 | 2.512 | 0.849 | 34.612 | 1.424 | 0.892 | 5

For fast implementation, a modified variance is used in the proposed algorithm:

$\sigma^2_{\mathrm{dir}_k} = \frac{1}{N}\sum_{i=1}^{N} \left| \bar{e} - e_i \right|.$   (15)

Table 5. Comparison of arithmetic operations.

Algorithm | Case | ADD | SHT | MUL | CMP | Remarks
ELA | | 4 | 1 | 0 | 2 |
MELA | | 11 | 3 | 1 | 5 |
DOI | 1st ET | 2 | 1 | 0 | 1 |
DOI | 2nd ET | 25 | 1 | 396 | 66 |
DOI | O.W | 27 | 3 | 396 | 66 |
FDD | 1st ET | 7 | 1 | 0 | 1 |
FDD | 2nd ET | 13 | 1 | 0 | 3 |
FDD | O.W | 70 | 11 | 23 | 15 |
ECA | ET | 8 | 1 | 0 | 1 |
ECA | O.W | 37 | 6 | 15 | 1 |
VDD | FMELA | 21 | 4 | 1 | 8 |
VDD | MVI | 17 | 3 | 1 | 15 |
VDD | MDA | 34 | 3 | 5 | 38 |
VDD | DOI | 57 | 3 | 401 | 103 |
FDIF | V direction | 19 | 3 | 7 | 6 |
FDIF | O.W | 27 | 4 | 17 | 6 |
CAD | Smooth region | 15 | 0 | 15 | 0 |
CAD | O.W | 111 | 0 | 192 | 0 | inverse of mtx
FSID | | 30 | 5 | 33 | 1 | exp: 6, sqrt: 1
FSID | | 36 | 5 | 55 | 1 | exp: 12, sqrt: 1
Prop. | R.D | 126 | 8 | 21 | 7 |
Prop. | O.W | 129 | 13 | 32 | 7 | tan−1: 2, sqrt: 5
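A sketch of this fast cost evaluation follows, assuming that the modified variance of Eq. (15) is the mean absolute deviation of the errors (our reading of the reconstructed equation), which avoids squaring each error; only μ² still needs a multiplication.

```python
import numpy as np

def fast_cost(errors):
    """Cost of Eq. (5) with the modified variance of Eq. (15)."""
    e = np.asarray(errors, dtype=float)
    mu = e.mean()
    mad = np.abs(e - mu).mean()      # mean absolute deviation instead of variance
    return mu * mu + mad

print(round(fast_cost([1, 1.5, 0, 2, 0, 1.5]), 2))   # 1.67 (vs. 1.58 with the true variance)
```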

As one can observe, DOI and CAD require many multiplication operations. FSID requires 6 exponentials and 1 square root as well as 33 or 55 multiplications. The number of operations required for the proposed algorithm varies with the number of candidate directions. Let N be the number of candidate directions; then the proposed method requires 18N + 1 or 18N + 13 additions, 3N or 3N + 11 multiplications, N + 1 or N + 6 shifts, and N comparisons.

The structural similarity (SSIM) index is a well-known image quality assessment tool for calculating the similarity between two images by estimating perceived errors [37]. It is widely used in image enhancement because it provides a good approximation of perceived image quality; in other words, we can measure the similarity between two images by using SSIM. In practice, the mean SSIM index (MSSIM), calculated by averaging the SSIM index over the whole image, is used to evaluate the quality of the entire image. Table 6 shows the MSSIM results, where a higher value indicates that the reconstructed image is more similar to the original image. The MELA and FSID methods show comparable results; however, the proposed method shows the best results in terms of the average MSSIM.

Table 6. Comparison of performance measured by MSSIM.

MSSIM | ELA | MELA | DOI | FDD | ECA | VDD | FDIF | CAD | FSID | Prop. | Rank
Barbara | 0.866 | 0.949 | 0.923 | 0.942 | 0.907 | 0.934 | 0.956 | 0.896 | 0.953 | 0.960 | 1
Boat | 0.907 | 0.938 | 0.931 | 0.932 | 0.919 | 0.933 | 0.939 | 0.938 | 0.945 | 0.944 | 2
Lena | 0.940 | 0.955 | 0.952 | 0.952 | 0.946 | 0.950 | 0.951 | 0.956 | 0.958 | 0.953 | 4
Blonde | 0.901 | 0.925 | 0.922 | 0.922 | 0.911 | 0.918 | 0.916 | 0.934 | 0.927 | 0.930 | 2
Building | 0.889 | 0.906 | 0.908 | 0.906 | 0.899 | 0.901 | 0.904 | 0.894 | 0.908 | 0.903 | 6
Cameraman | 0.964 | 0.974 | 0.973 | 0.973 | 0.965 | 0.972 | 0.985 | 0.976 | 0.976 | 0.985 | 1
Pirate | 0.919 | 0.943 | 0.938 | 0.938 | 0.925 | 0.934 | 0.940 | 0.941 | 0.943 | 0.942 | 3
Strawberries | 0.940 | 0.956 | 0.953 | 0.951 | 0.946 | 0.949 | 0.953 | 0.952 | 0.956 | 0.960 | 1
Thumb print | 0.925 | 0.954 | 0.925 | 0.940 | 0.936 | 0.942 | 0.964 | 0.951 | 0.960 | 0.965 | 1
Average | 0.917 | 0.944 | 0.936 | 0.940 | 0.928 | 0.937 | 0.945 | 0.938 | 0.947 | 0.949 | 1


Fig. 7. SSIM map. (a) Original image, (b) ELA, (c) MELA, (d) DOI, (e) FDD, (f) ECA, (g) VDD, (h) FDIF, (i) CAD, (j) FSID, and (k) Proposed.

This means that the deinterlaced images generated by the proposed method visually outperform the others. Note that the proposed method is ranked second for the Blonde image in terms of MSSIM, even though it is ranked fourth in PSNR, because it preserves details such as the hair area well.

The degradation in the Building image comes from regions with diverse edges in small areas. Fig. 7 shows the SSIM maps of the proposed method and the conventional methods. A brighter SSIM map indicates that the image is more similar to the original. The proposed method shows the brightest SSIM map, which demonstrates its subjective superiority; in particular, the black areas on the table, the left leg, and the scarf are fairly reduced.


Fig. 8. Deinterlaced images. (a) Original image, (b) partial image of (a), (c) ELA, (d) MELA, (e) DOI, (f) FDD, (g) ECA, (h) VDD, (i) FDIF, (j) FSID, and (k) Proposed.

Fig. 8 shows the deinterlaced images generated by the different methods. To obtain the deinterlaced images, we first eliminated the odd fields of a frame and then interpolated them. ELA and MELA smooth diagonal/anti-diagonal edges, and some discontinuities can be found, especially in the left character. DOI and FDD cause edge-bleeding artifacts. FDIF also shows staircase artifacts, and FSID shows edge-bleeding artifacts. The proposed method produces the clearest result and reduces the staircase artifacts the most.
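The evaluation protocol described above can be sketched as follows (our illustration): drop the odd lines of a progressive frame, re-create them with an intra-field deinterlacer, and score the result with PSNR. `interpolate_missing_lines` is a placeholder for any of the compared methods, and scoring only the re-created lines is one common convention; the paper does not state which convention it uses.

```python
import numpy as np

def psnr(reference, reconstructed, peak=255.0):
    mse = np.mean((reference.astype(float) - reconstructed.astype(float)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def evaluate(frame, interpolate_missing_lines):
    field = frame.astype(float)
    field[1::2, :] = 0.0                           # eliminate the odd lines
    rebuilt = interpolate_missing_lines(field)     # fill them back in
    return psnr(frame[1::2, :], rebuilt[1::2, :])  # score the re-created lines
```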

4. Conclusions

In this paper, we proposed a new edge detection method and a new interpolation method with three goals: consideration of various edges, reliable edge detection, and outstanding interpolation. Since the number of candidate edge directions is flexible, we can take into account the varied edges included in the HR image. For edge detection, the autocorrelation of the neighboring pixels along the candidate edge directions, based on the duality between the LR image and the HR image, is utilized. The autocorrelation values obtained in the edge detection step are reused as weights in the interpolation step, and the missing pixel is interpolated along the determined edge direction using a BH windowed-sinc weighted average filter. Simulation results demonstrated that the proposed method performs well in terms of complexity, objective results, and subjective results.

Acknowledgments

This work was supported by the MKE (The Ministry of Knowledge Economy), Korea, under the ITRC (Information Technology Research Center) support program supervised by the NIPA (National IT Industry Promotion Agency) (NIPA-2013-H0301-13-1011) and by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning (2013R1A1A1010797).

References

[1] E.B. Bellars, G.D. Haan, Deinterlacing: A Key Technology for Scan Rate Conversion, Elsevier, 2000.
[2] K. Jack, Video Demystified – A Handbook for the Digital Engineer, 4th ed., Elsevier, Jordan Hill, Oxford, 2005.
[3] Y. Fang, DAC spectrum of binary sources with equally-likely symbols, IEEE Trans. Commun. 61 (4) (Apr. 2013) 1584–1594.
[4] Y. Fang, LDPC-based lossless compression of nonstationary binary sources using sliding-window belief propagation, IEEE Trans. Commun. 60 (11) (Nov. 2012) 3161–3166.
[5] Y. Fang, Joint source–channel estimation using accumulated LDPC syndrome, IEEE Commun. Lett. 14 (11) (Nov. 2010) 1044–1046.
[6] Y. Fang, EREC-based length coding of variable-length data blocks, IEEE Trans. Circuits Syst. Video Technol. 20 (10) (Oct. 2010) 1358–1366.
[7] Y. Fang, Distribution of distributed arithmetic codewords for equiprobable binary sources, IEEE Signal Process. Lett. 16 (12) (Dec. 2009) 1079–1082.
[8] Y. Fang, Crossover probability estimation using mean-intrinsic-LLR of LDPC syndrome, IEEE Commun. Lett. 13 (9) (Sept. 2009) 679–681.
[9] G.D. Haan, E.B. Bellers, Deinterlacing – an overview, Proc. IEEE 86 (9) (Sept. 1998) 1839–1857.
[10] G. Jeon, M. Anisetti, V. Bellandi, J. Jeong, Fuzzy rule-based edge-restoration algorithm in HDTV interlaced sequences, IEEE Trans. Consum. Electron. 53 (2) (May 2007) 725–731.
[11] G. Jeon, M. Anisetti, V. Bellandi, E. Damiani, J. Jeong, Rough sets-assisted subfield optimization for alternating current plasma display panel, IEEE Trans. Consum. Electron. 53 (3) (Aug. 2007) 825–832.
[12] G. Jeon, M. Anisetti, V. Bellandi, E. Damiani, J. Jeong, Fuzzy weighted approach to improve visual quality of edge-based filtering, IEEE Trans. Consum. Electron. 53 (4) (Nov. 2007) 1661–1667.


[13] G. Jeon, M. Anisetti, D. Kim, V. Bellandi, E. Damiani, J. Jeong, Fuzzy rough sets hybrid scheme for motion and scene complexity adaptive deinterlacing, Image Vis. Comput. 27 (4) (March 2009) 425–436.
[14] G. Jeon, M. Anisetti, V. Bellandi, E. Damiani, J. Jeong, Designing of a type-2 fuzzy logic filter for improving edge-preserving restoration of interlaced-to-progressive conversion, Inform. Sci. 179 (13) (June 2009) 2194–2207.
[15] G. Jeon, M. Anisetti, S. Kang, A rank-ordered marginal filter for deinterlacing, Sensors 13 (3) (March 2013) 3056–3065.
[16] G. Jeon, M. Anisetti, J. Lee, V. Bellandi, E. Damiani, J. Jeong, Concept of linguistic variable-based fuzzy ensemble approach: application to interlaced HDTV sequences, IEEE Trans. Fuzzy Syst. 17 (6) (Dec. 2009) 1245–1258.
[17] G. Jeon, S.J. Park, Y. Fang, M. Anisetti, V. Bellandi, E. Damiani, J. Jeong, Specification of efficient block matching scheme for motion estimation in video compression, Opt. Eng. 48 (12) (Dec. 2009) 127005.
[18] G. Jeon, M.Y. Jung, M. Anisetti, V. Bellandi, E. Damiani, J. Jeong, Specification of the geometric regularity model for fuzzy if-then rule-based deinterlacing, IEEE/OSA J. Display Technol. 6 (6) (June 2010) 235–243.
[19] K. Sugiyama, H. Nakamura, A method of deinterlacing with motion compensated interpolation, IEEE Trans. Consum. Electron. 45 (3) (Aug. 1999) 611–616.
[20] Y.-Y. Jung, B.-T. Choi, Y.-J. Park, S.-J. Ko, An effective deinterlacing technique using motion compensated information, IEEE Trans. Consum. Electron. 46 (3) (Aug. 2000) 460–466.
[21] R. Li, B. Zheng, M.L. Liou, Reliable motion detection/compensation for interlaced sequences and its applications to deinterlacing, IEEE Trans. Circuits Syst. Video Technol. 10 (1) (Oct. 2000) 23–29.
[22] O. Kwon, K. Sohn, C. Lee, Deinterlacing using directional interpolation and motion compensation, IEEE Trans. Consum. Electron. 49 (1) (Feb. 2003) 198–203.
[23] D. Wang, A. Vincent, P. Blanchfield, Hybrid de-interlacing algorithm based on motion vector, IEEE Trans. Circuits Syst. Video Technol. 15 (8) (Aug. 2005) 1019–1025.
[24] T. Doyle, Interlaced to sequential conversion for EDTV applications, in: Proc. 2nd Int. Workshop Signal Processing of HDTV, Elsevier, Amsterdam, 1999, pp. 412–430.
[25] W. Kim, S. Jin, J. Jeong, Novel intra deinterlacing algorithm using content adaptive interpolation, IEEE Trans. Consum. Electron. 53 (3) (Aug. 2007) 1036–1043.
[26] H. Yoo, J. Jeong, Direction-oriented interpolation and its application to deinterlacing, IEEE Trans. Consum. Electron. 48 (4) (Nov. 2002) 954–962.
[27] S. Jin, W. Kim, J. Jeong, Fine directional de-interlacing algorithm using modified Sobel operation, IEEE Trans. Consum. Electron. 54 (2) (May 2008) 857–862.
[28] T.-H. Tsai, H.-L. Lin, Design and implementation for deinterlacing using the edge-based correlation adaptive method, J. Electron. Imaging 18 (1) (Mar. 2009) 013014, pp. 1–5.
[29] H.-J. Cho, Y.-S. Lee, S.-H. O, S.-J. O, A voting-based intra deinterlacing method for directional error correlation, IEEE Trans. Consum. Electron. 56 (3) (Aug. 2010) 1713–1721.
[30] S.-M. Hong, S.-J. Park, J. Jang, J. Jeong, Deinterlacing algorithm using fixed directional interpolation filter and adaptive distance weighting scheme, Opt. Eng. 50 (6) (June 2011) 067008.
[31] S.-J. Park, G. Jeon, J. Jeong, Computation-aware algorithm selection approach for interlaced-to-progressive conversion, Opt. Eng. 49 (5) (May 2010) 057005.
[32] X. Chen, G. Jeon, J. Jeong, Filter switching interpolation method for deinterlacing, Opt. Eng. 51 (10) (Oct. 2012) 107402.
[33] J. Canny, A computational approach to edge detection, IEEE Trans. Pattern Anal. Mach. Intell. PAMI-8 (6) (June 1986) 679–698.
[34] P.Z. Peebles, Probability, Random Variables and Random Signal Principles, McGraw-Hill, 2000.
[35] P.F. Dunn, Measurement and Data Analysis for Engineering and Science, McGraw-Hill, New York, 2005.


[36] H. Qian, R. Zhao, T. Chen, Interharmonics analysis based on interpolating windowed FFT algorithm, IEEE Trans. Power Deliv. 22 (2) (Apr. 2007) 1064–1069.
[37] Z. Wang, A.C. Bovik, H.R. Sheikh, E.P. Simoncelli, Image quality assessment: From error visibility to structural similarity, IEEE Trans. Image Process. 13 (4) (Apr. 2004) 600–612.

Joohyeok Kim received his B.S. degree in information and communication engineering from Hanyang University, Korea, in 2008. He is currently a Ph.D. candidate in Electronic and Computer Engineering at Hanyang University. His research interests include video compression, such as H.264/AVC, HEVC, and 3D video coding, and image processing, including interpolation, demosaicking, deinterlacing, and denoising.

Gwanggil Jeon received the B.S., M.S., and Ph.D. (summa cum laude) degrees from the Department of Electronics and Computer Engineering, Hanyang University, Seoul, Korea, in 2003, 2005, and 2008, respectively. He was with the Department of Electronics and Computer Engineering, Hanyang University, from 2008 to 2009. He was with the School of Information Technology and Engineering, University of Ottawa, Ottawa, ON, Canada, as a Post-Doctoral Fellow, from 2009 to 2011. He was with the Graduate School of Science and Technology, Niigata University, Niigata, Japan, as an Assistant Professor, from 2011 to 2012. He is currently an Assistant Professor with the Department of Embedded Systems Engineering, Incheon National University, Incheon, Korea. His current research interests include image processing, particularly image compression, motion estimation, demosaicking, and image enhancement, and computational intelligence, such as fuzzy and rough set theories. Dr. Jeon was a recipient of the IEEE Chester Sall Award in 2007 and the ETRI Journal Paper Award in 2008.

Jechang Jeong received a B.S. degree in electronic engineering from Seoul National University, Korea, in 1980, an M.S. degree in electrical engineering from the Korea Advanced Institute of Science and Technology in 1982, and a Ph.D. degree in electrical engineering from the University of Michigan, Ann Arbor, in 1990. From 1982 to 1986, he was with the Korean Broadcasting System, where he helped develop teletext systems. From 1990 to 1991, he worked as a postdoctoral research associate at the University of Michigan, Ann Arbor, where he helped to develop various signal-processing algorithms. From 1991 through 1995, he was with the Samsung Electronics Company, Korea, where he was involved in the development of HDTV, digital broadcasting receivers, and other multimedia systems. Since 1995, he has conducted research at Hanyang University, Seoul, Korea. His research interests include digital signal processing, digital communication, and image and audio compression for HDTV and multimedia applications. He has published numerous technical papers. Dr. Jeong received the Scientist of the Month award in 1998 from the Ministry of Science and Technology of Korea, and was the recipient of the 2007 IEEE Chester Sall Award and the 2008 ETRI Journal Paper Award. He was also honored with a government commendation in 1998 from the Ministry of Information and Communication of Korea.