Digital Signal Processing 29 (2014) 67–77
Autocorrelation-based interlaced to progressive format conversion

Joohyeok Kim a, Gwanggil Jeon b,∗, Jechang Jeong a,∗∗

a Department of Electronics and Computer Engineering, Hanyang University, 222 Wangsimni-ro, Seongdong-gu, Seoul 133-791, Republic of Korea
b Department of Embedded Systems Engineering, Incheon National University, 119 Academy-ro, Yeonsu-gu, Incheon 406-772, Republic of Korea
Article history: Available online 22 January 2014

Keywords: Video deinterlacing; Edge preserving; Blackman–Harris windowed-sinc filter; Format conversion
Abstract

To generate a high resolution image from a low resolution one, interpolation plays a crucial role. However, conventional interpolation methods, including edge-based interpolation methods, have some drawbacks such as a limited number of edge directions, imprecise edge detection, and inefficient interpolation. To overcome these shortcomings, we propose a new edge-directed interpolation method with three aims: various edge directions, reliable edge detection, and outstanding interpolation. Since the number of candidate edge directions in the proposed method is flexible, we can use the various edges included in the high resolution image. To accurately determine the edge direction, we use the autocorrelation of neighboring pixels for the candidate directions, based on the duality between a high resolution image and its corresponding low resolution image. For the interpolation step, we utilize a Blackman–Harris windowed-sinc weighted average filter, where we use the correlation values obtained in the edge detection step as weights. Experimental results show that the proposed method outperforms conventional methods in terms of both subjective and objective results.

© 2014 Elsevier Inc. All rights reserved.
1. Introduction

Recently, as the resolution of video content and viewers' demand for high quality video have increased, the data volume of video has grown tremendously. The resolution of HDTV is 1280 × 720p or 1920 × 1080i. UHDTV requires 4 or 16 times more storage space (3840 × 2160 for 4K UHDTV and 7680 × 4320 for 8K UHDTV). Even more data is required for 3DTV, which may require data volumes that are double those of 2D video. To reduce data storage or transmission bandwidth, most video transmission formats, including DVD, SDTV, and HDTV, support an interlaced scanning format [1–8]. Since interlaced video consists of the two fields of a frame captured successively, it has a higher temporal resolution than progressive video, which reduces the flicker effect [9]. However, the interlaced format requires a display that is able to show individual fields. Many displays cannot display interlaced video directly; therefore, deinterlacing is required to view interlaced video on a progressive-scan display.

Over the past decades, many studies of deinterlacing have been performed [10–18]. Conventional deinterlacing methods are classified into two classes depending on whether motion information is considered or not: inter-field temporal interpolation and intra-field spatial interpolation [19–32].
* Corresponding author. Fax: +82 2 2220 0370. ** Principal corresponding author. Fax: +82 32 835 0782.
E-mail addresses: [email protected] (J. Kim), [email protected] (G. Jeon), [email protected] (J. Jeong).
1051-2004/$ – see front matter © 2014 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.dsp.2014.01.002
While inter-field methods perform better than intra-field methods, they require substantial computational complexity to obtain motion information, which is not feasible in real time. Moreover, if the motion information is not reliable, frames interpolated by inter-field methods may contain considerable artifacts and even risk latent error propagation. On the other hand, intra-field methods have the advantages of simplicity and ease of implementation, and they perform better when the motion of objects is complex or fast. In addition, intra-field methods reduce the risk of error propagation, so intra-field interpolated frames should be inserted periodically in video sequences so that display can start at any time.

The line average (LA) method is the simplest intra-field method. While it is easy to implement, it suffers from artifacts such as aliasing and jerkiness, since it interpolates each missing pixel with just the average of its two vertical neighbors. In order to take edge information into account and improve subjective quality, several methods have been proposed. Among them, edge-based LA (ELA) and modified ELA (MELA) are popular methods [24,25]. ELA considers three edge directions: 45°, 90°, and 135°. It first calculates the absolute differences of two neighboring pixels along the three directions, and then chooses the direction with the minimum difference as the final edge direction. The average of the two pixels in the final edge direction is used to interpolate the missing pixel. MELA utilizes four or six pixels, instead of two, in each direction to calculate the costs for the edge directions; missing pixels are then interpolated along the direction with the minimum cost. MELA plays the role of a low-pass filter because it uses an average of two or four pixels. However, the number of its edge directions is limited to three.
Table 1
Summary of conventional methods.

| Algorithm | Number of candidate edge directions | Edge detection method | Interpolation method |
|---|---|---|---|
| LA | 1 | Only vertical | 2-tap line average |
| ELA | 3 | SAD (2 pixels) | 2-tap line average |
| MELA | 3 | SAD (4 or 6 pixels) | 4-tap average |
| DOI | 33 | SAD (2 × 3 pixels) | 2-tap line average |
| ECA | 5 | Correlation weight function | Weighted average |
| VDD | 3–33 | Stepwise edge detection | FMELA, MVI, MDA, DOI |
| FDD | 9 | Modified Sobel | Weighted average |
| FDIF | 3 | SAD | Weighted average |
| CAD | N/A | N/A | Wiener filter |
| FSID | N/A | Sobel | Bilateral/trilateral filter |
| Prop. | 2x + 1 | Autocorrelation | BH windowed-sinc weighted average |
In [26], the authors proposed direction-oriented interpolation (DOI) to take edge directions into account. They use the block matching error to determine the edge direction: the best matching block is found by calculating the distortions between the current block and k-shifted 2 × 3 blocks on the upper (and lower) lines. If the determined directions for the upper and the lower lines are almost straight, they set the determined directions as the final edge direction; otherwise, they simply use LA. DOI applies its directional interpolation only when the two determined edge directions form an almost straight line, which is an excessively strict constraint considering that the proposed search range is ±16.

In [27], the authors introduced fine directional deinterlacing (FDD), which utilizes ELA and a modified Sobel filter to obtain the edge direction. After determining whether the edge is slanted toward the left or the right by ELA, a Sobel filter is applied to five pixels chosen according to that direction, and the weighted sum of the two directions with the largest magnitudes is used as the final direction. However, the edge direction obtained from the Sobel filter may not be correct, because natural images may contain noise and/or intensity variation in flat regions.

Both DOI and FDD adopt an early termination step before determining the edge direction: the LA method is used instead if the absolute difference between two vertically neighboring pixels is less than a predefined threshold. Since the threshold is relatively large, this early termination contributes to improved PSNR results but also bounds further improvement; in other words, many pixels are interpolated using LA, so performance converges toward that of LA.

In [28], the authors proposed the edge-based correlation adaptive (ECA) method. Horizontal flat-area detection is performed first; if a pixel lies in such an area, the mean of four diagonal pixels is used as the estimate. Otherwise, a correlation weight function is utilized, where the absolute differences for five directions serve as weights.

To reduce wrong determination of the edge direction, a voting-based deinterlacing method for directional error correction (VDD) was proposed [29]. VDD performs stepwise edge inspection, and four different interpolation methods are utilized at different steps: modified MELA, majority voting-based interpolation (MVI), majority voting-based direction average (MDA), and DOI. VDD provides relatively better results in terms of image quality, but it uses six threshold parameters, which brings about degradation in some images.

A fixed directional interpolation filter (FDIF) was recently proposed in [30], which determines the edge direction using the edge detection method of MELA and interpolates using the DCT-based interpolation filter adopted in High Efficiency Video Coding (HEVC). It is fast but has the same problem as MELA because it considers only three candidate directions.

In [31], the authors used geometric duality and a Wiener filter for their deinterlacing system, covariance-based adaptive deinterlacing (CAD). They obtained the filter coefficients from the corresponding low resolution image and interpolated missing pixels using an 8-tap Wiener filter.
They used an inverse operation to obtain the filter coefficients, but an inverse matrix cannot always be obtained in flat regions. In addition, their approach is time-consuming, and the duality may be destroyed because they use a large data matrix (a 16 × 19 LR image patch).

In [32], the authors presented a filter switching interpolation method (FSID), in which the edge direction is determined using a Sobel filter and a bilateral or trilateral filter is used to interpolate missing pixels. Because pre-interpolation is required to apply the Sobel filter, and the bilateral and trilateral filters involve exponential operations, this approach is time-consuming.

Table 1 shows a summary of the conventional methods and the proposed method. In order to overcome the drawbacks of conventional methods, such as limited edge directions, inefficient early termination, and inaccurate edge detection, we propose a new edge-directed interpolation method using the autocorrelation of the interpolation errors of the neighboring pixels for the candidate directions. The proposed method provides a variety of edge directions, whose number can be extended arbitrarily, and presents a reliable edge detection method based on the autocorrelation of neighboring pixels for the candidate directions. Since we interpolate missing pixels using a Blackman–Harris (BH) windowed-sinc weighted average filter along the determined edge direction, the interpolated image is smooth while the edges are preserved.

The remainder of the paper is organized as follows. The proposed method is explained in Section 2. The experimental results are described in Section 3 to compare the proposed method with conventional methods. Finally, we present conclusions in Section 4.

2. Proposed method

The proposed method has three main contributions. First, we use extendable edge directions: the proposed method is able to use a variety of edge directions, as shown in Table 2. The second contribution is reliable edge detection. To accurately determine the edge direction, we utilize the autocorrelation of neighboring pixels under the assumption of duality between a high resolution (HR) image and its corresponding low resolution (LR) image; in other words, we determine the edge direction of missing pixels using the edge directions of the neighboring pixels in the LR image. Details of edge direction detection are described in the next section. The third contribution is the interpolation scheme along the determined edge direction: after determining the edge direction, we interpolate missing pixels using a Blackman–Harris windowed-sinc weighted average.

2.1. Candidate edge directions

Fig. 1 shows the candidate edge directions we consider, where circles with a red dot are to-be-interpolated pixels and the other circles, such as the ones on rows m − 3, m − 1, and m + 1, are known original pixels.
Table 2
Candidate edge directions and their degrees.

| dir_k  | dir8    | dir6    | dir4 | dir2    | dir1 | dir3   | dir5 | dir7   | dir9   |
|--------|---------|---------|------|---------|------|--------|------|--------|--------|
| d      | −4      | −3      | −2   | −1      | 0    | 1      | 2    | 3      | 4      |
| Degree | 153.43° | 146.31° | 135° | 116.57° | 90°  | 63.43° | 45°  | 33.69° | 26.57° |
Fig. 1. The candidate edge directions: (a) HR image, (b) LR image.
The lines passing through the current pixel f(m, n) show the edge directions (d = ..., −2, −1, 0, 1, 2, ...), where d is an integer. For example, dir2 in Fig. 1(a) is the line that connects the current pixel and the pixel shifted 0.5 to the left on the upper line, f(m − 1, n − 0.5). As shown in Fig. 1(b), the lines pass only through original pixels when the directions are applied to a neighboring original pixel in the LR image, such as f(m − 1, n). The interpolation errors obtained in the LR image are utilized when calculating the autocorrelation.

The candidate edge directions can be represented as

θ = tan⁻¹(2/d).   (1)

Table 2 shows the candidate edge directions and their corresponding degrees produced by Eq. (1). Here, the relation between d and k in dir_k can be represented by

d = (−1)^(k+1) ⌊k/2⌋.   (2)
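As a quick cross-check of Eqs. (1) and (2), the d values and angles of Table 2 can be regenerated with a few lines of Python (a minimal sketch; the bound of nine candidate directions matches Table 2 and is otherwise a free parameter):

```python
import math

# Candidate directions dir_1 ... dir_9, reproducing Table 2.
for k in range(1, 10):
    d = (-1) ** (k + 1) * (k // 2)           # Eq. (2): d = (-1)^(k+1) * floor(k/2)
    theta = math.degrees(math.atan2(2, d))   # Eq. (1); atan2 folds the angle into (0°, 180°)
    print(f"dir{k}: d = {d:+d}, angle = {theta:.2f} deg")
```

Running it prints, for example, "dir2: d = -1, angle = 116.57 deg", in agreement with Table 2.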
Fig. 2. Duality between two different resolutions: (a) high resolution image, (b) low resolution image, (c) Canny edge of (a), and (d) Canny edge of (b).
2.2. Edge detection

The proposed edge detection consists of two steps: (1) calculation of the interpolation errors of the neighboring pixels for each candidate direction in the LR image, and (2) calculation of the autocorrelation of these errors.

Fig. 2 shows an example of the duality, where the low resolution image was produced by sub-sampling. The edge images in this figure were generated with a Canny filter, where we set the standard deviation of the Gaussian filter to 1, the high threshold to 0.5, and the low threshold to 0.4 × the high threshold [33]. The Canny filter finds edges using the gradient and decides whether a pixel lies on an edge by considering the relationship between weak and strong edges; it is therefore less vulnerable to noise than other edge filters. In Fig. 2, the LR image is a good approximation of the HR image, apart from the loss of some high frequency components in the detailed regions. Based on this observation, we developed the following edge detection method.

Using the duality, we first generate predictive values for the original neighboring pixels along the candidate edges in the LR image, and then calculate the errors between the predictive values and the original values. As the neighboring pixels have original intensity values, we can calculate the exact errors by

e_dirk(i, j) = f(i, j) − f̂_dirk(i, j) = f(i, j) − [f(i − 2, j + d) + f(i + 2, j − d)] / 2,   (3)

where (i, j) is the location of a neighboring pixel, f(i, j) is the original intensity of that pixel, and f̂_dirk(i, j) is the value predicted along the dirk direction. We use the average of the two closest pixels along the direction as the predictive value. After obtaining the errors for all the candidate edges, we calculate the autocorrelation of each:

R_XX(τ) = E[X(t)X(t + τ)] = E[X²(τ)] = μ² + σ².   (4)

The first term E[X(t)X(t + τ)] in Eq. (4) reduces to E[X²(τ)] when the signal is wide-sense stationary (WSS) [34,35]. Because the data we use are the errors between the original values and their predictions, the mean is almost zero and the variances are very similar, so the WSS assumption is appropriate. The third term, μ² + σ², is obtained by substituting zero for τ. From Eq. (4), we can thus obtain the autocorrelation from the mean and the variance without great effort.
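As a small numerical sanity check of Eq. (4) (not part of the algorithm itself), the zero-lag autocorrelation of a set of error samples is exactly the squared mean plus the population variance:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0.2, 1.0, size=10_000)   # synthetic interpolation errors

r0 = np.mean(x * x)                      # R_XX(0) = E[X^2]
print(r0, x.mean() ** 2 + x.var())       # identical up to floating-point rounding
```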
Fig. 3. An example of the cost for the dir5 direction.
We utilize this autocorrelation as the cost for finding the edge of the to-be-interpolated pixel:

C_dirk = μ²_dirk + σ²_dirk,   (5)

where C_dirk is the cost of the k-th direction, and μ_dirk and σ²_dirk are the mean and the variance of its interpolation errors. We set the direction with the smallest C_dirk as the edge.

The meaning of Eq. (5) is as follows. μ_dirk is the average of the interpolation errors for the k-th direction obtained from all the neighboring pixels. If the errors for one direction are smaller than the errors for the other directions, interpolation along that direction is more accurate; that is, μ_dirk measures the validity of interpolating along the direction. Since the average error is proportional to the cost, a smaller cost implies a more accurate edge. σ²_dirk is a measure of the spread of the errors: a small variance means that the interpolation errors for the k-th direction are similar for all neighbors and close to the mean value, so σ²_dirk provides the reliability of the direction from a statistical point of view. This is why we determine the direction δ with the minimum cost as the edge:

δ = arg min_dirk C_dirk.   (6)
Fig. 3 shows an example of how to calculate the cost for the dir5 direction. The number of neighboring pixels is set to six; this number can be changed as a user parameter, but two or six pixels are recommended to obtain low complexity with high performance. We first calculate the errors of the six neighboring pixels for dir5 using Eq. (7):

e_dir5,1 = f(m − 1, n − 1) − [f(m − 3, n + 1) + f(m + 1, n − 3)] / 2,
e_dir5,2 = f(m − 1, n) − [f(m − 3, n + 2) + f(m + 1, n − 2)] / 2,
e_dir5,3 = f(m − 1, n + 1) − [f(m − 3, n + 3) + f(m + 1, n − 1)] / 2,
e_dir5,4 = f(m + 1, n − 1) − [f(m − 1, n + 1) + f(m + 3, n − 3)] / 2,
e_dir5,5 = f(m + 1, n) − [f(m − 1, n + 2) + f(m + 3, n − 2)] / 2,
e_dir5,6 = f(m + 1, n + 1) − [f(m − 1, n + 3) + f(m + 3, n − 1)] / 2.   (7)

In the example of Fig. 3, the errors for dir5 are the differences between each neighboring pixel and its predictive value: e_dir5,1 = 1, e_dir5,2 = 1.5, e_dir5,3 = 0, e_dir5,4 = 2, e_dir5,5 = 0, e_dir5,6 = 1.5. The mean of the errors is 1 and the variance is 0.58; therefore, the cost of dir5 is 1.58. The costs for the other directions are obtained analogously. After calculating the costs for all the directions, we determine the direction with the smallest cost as the edge direction.
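To make the cost computation concrete, the following sketch (NumPy; the helper name and the neighbor set are ours, following Eq. (3) and Fig. 3) evaluates the cost of one candidate direction and reproduces the numbers of the dir5 example:

```python
import numpy as np

NEIGHBORS = ((-1, -1), (-1, 0), (-1, 1), (1, -1), (1, 0), (1, 1))

def direction_cost(f, m, n, d):
    """Cost C_dirk at missing pixel (m, n) for the direction with horizontal step d.

    f is the field (missing lines may hold zeros; only known lines are read):
    each neighboring known pixel (i, j) is re-predicted from the known lines two
    rows above and below it along the candidate direction (Eq. (3)), and the
    cost is mean^2 + variance of the resulting errors (Eq. (5)).
    """
    errors = []
    for dm, dn in NEIGHBORS:
        i, j = m + dm, n + dn
        pred = 0.5 * (f[i - 2, j + d] + f[i + 2, j - d])   # Eq. (3)
        errors.append(f[i, j] - pred)
    e = np.asarray(errors, dtype=float)
    return e.mean() ** 2 + e.var()                         # Eq. (5)

# Worked example of Fig. 3: errors 1, 1.5, 0, 2, 0, 1.5 for dir5 (d = 2)
e = np.array([1.0, 1.5, 0.0, 2.0, 0.0, 1.5])
print(e.mean() ** 2 + e.var())   # 1.583..., the cost of 1.58 quoted above
```

The edge direction δ of Eq. (6) is then simply the candidate d (or dir_k) that minimizes this cost.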
Fig. 4. Randomly distributed region.
2.3. Blackman–Harris windowed-sinc weighted average filter

For the interpolation step, we use a Blackman–Harris (BH) windowed-sinc weighted average filter. Let C_min1 and C_min2 be the minimum cost and the second smallest cost obtained in the edge detection step, respectively. If one of the corresponding directions is an upper-right direction and the other an upper-left direction, as in Fig. 4, we consider the pixel to be randomly distributed and interpolate it as the average of the two spatially closest pixels,

f̂(m, n) = [f(m − 1, n) + f(m + 1, n)] / 2.   (8)

Otherwise, we use a BH windowed-sinc weighted average filter for interpolation. The BH window is widely used because it is easy to implement and has little ripple in the stop band. The BH window is expressed as

W(n) = a0 + a1 cos(2πn/N) + a2 cos(2π(2n)/N),   (9)

where a0, a1, and a2 are coefficients and N is the filter length [36]. When the filter length is six, a0 = 0.42323, a1 = 0.49755, and a2 = 0.07922. By convolving the BH window with the sinc function in the frequency domain, we obtain the BH windowed-sinc filter. Convolution of the 6-tap BH windowed-sinc filter with the sampled pixel values from −3 to 3 constructs a predictive value at the (m, n) position:
p_dirk = f ∗ h_BH,   (10)

where h_BH denotes the BH windowed-sinc filter coefficients and f the pixels on the determined edge. We use six pixels for f, represented as

f = [ f(m − 5, n + Φ(5d/2)), f(m − 3, n + Φ(3d/2)), f(m − 1, n + Φ(d/2)),
      f(m + 1, n + Φ(−d/2)), f(m + 3, n + Φ(−3d/2)), f(m + 5, n + Φ(−5d/2)) ],   (11)

where Φ(·) is the fix function, which rounds values toward zero. For fast implementation, we use approximated values for h_BH:

h_BH = [2, −11, 73, 73, −11, 2] / 128.   (12)

We denote by p_δ and p_v the values predicted by the BH windowed-sinc filter along the determined edge direction and along the vertical direction, respectively. The estimated value is then obtained from Eqs. (13) and (14):

f̂(m, n) = μ·p_v + (1 − μ)·p_δ,   (13)

μ = C_δ / (C_δ + C_v),   (14)

where C_v is the cost for the vertical direction. The above BH windowed-sinc weighted average filter smooths the image while preserving the edge by using optically and spatially close pixels with adaptive weights.

Fig. 5. Flowchart of the proposed method.

Fig. 5 shows the flowchart of the proposed method, where the error calculation of Eq. (3) is pre-computed for the whole frame. After the mean and variance of the six neighboring pixels are calculated for each candidate direction, the costs of the directions are obtained with Eq. (5), and the direction with the minimum cost is chosen as the edge direction by Eq. (6). If the minimum-cost direction and the direction with the second smallest cost indicate a randomly distributed region, we interpolate the pixel as the average of the two vertically closest pixels, as expressed in Eq. (8). Otherwise, the BH windowed-sinc weighted average of Eq. (13) is used as the estimate of the pixel.
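Putting Eqs. (8) and (10)–(14) together, a per-pixel sketch of the interpolation step could look as follows (our own function names; boundary handling is omitted, the taps are the fixed-point approximation of Eq. (12), and Φ is implemented with NumPy's fix, which rounds toward zero):

```python
import numpy as np

H_BH = np.array([2, -11, 73, 73, -11, 2], dtype=float) / 128.0   # Eq. (12)

def bh_predict(f, m, n, d):
    """BH windowed-sinc prediction along the direction with step d, Eqs. (10)-(11)."""
    rows = (-5, -3, -1, 1, 3, 5)
    acc = 0.0
    for r, h in zip(rows, H_BH):
        shift = int(np.fix(-r * d / 2))    # Phi(.) of Eq. (11): round toward zero
        acc += h * f[m + r, n + shift]
    return acc

def vertical_average(f, m, n):
    """Fallback for randomly distributed pixels, Eq. (8)."""
    return 0.5 * (f[m - 1, n] + f[m + 1, n])

def interpolate_pixel(f, m, n, d_edge, c_edge, c_vert):
    """Final estimate of the missing pixel (m, n), Eqs. (13)-(14)."""
    p_delta = bh_predict(f, m, n, d_edge)    # along the detected edge direction
    p_v = bh_predict(f, m, n, 0)             # along the vertical direction
    mu = c_edge / (c_edge + c_vert)          # Eq. (14)
    return mu * p_v + (1.0 - mu) * p_delta   # Eq. (13)
```

A small c_edge (a confident edge) drives μ toward zero, so the estimate leans on the directional prediction p_δ; an unreliable edge shifts the weight back toward the vertical prediction.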
Fig. 6. Test images. (a) Barbara, (b) Boat, (c) Lena, (d) Blonde, (e) Building, (f) Cameraman, (g) Pirate, (h) Strawberries, and (i) Thumb print.
3. Experimental results

To assess the performance of the proposed method, we simulated it and compared it with the nine conventional methods explained in the introduction: ELA, MELA, DOI, ECA, VDD, FDD, FDIF, CAD, and FSID. In the simulation, we set the number of candidate directions to 7; note that this number is changeable because it is a user parameter. We provide peak signal-to-noise ratio (PSNR) and mean structural similarity (MSSIM) results as measures of objective quality, and deinterlaced images for subjective comparison. In addition, we compared the computational complexity. We use nine images: Strawberries (666 × 666), Building (600 × 600), Barbara, Boat, Lena, Blonde, Cameraman, Pirate (512 × 512), and Thumb print (512 × 480). Fig. 6 shows the test images used in the simulation.

Table 3 shows the PSNR results, which evaluate objective quality. The proposed method showed the best average PSNR, and the PSNR gap ranges from 0.1 dB (compared to FDIF) to 2.87 dB (compared to ELA). The proposed method is ranked 6th for the Building image. The Building image contains roof bricks in some areas, which have many edges in a very small region; in this case, the best edges detected at each pixel are not statistically the best, because the edge at the current pixel may not be the edge at the adjacent pixels. Therefore, LA-based methods such as MELA, DOI, and FDD show higher PSNR results, although they cause blur effects. For similar reasons, the proposed method is ranked 4th for the Blonde image, where the degradation comes from the suntanned face. Thumb print is an image used in biomedical and forensic science. Because this image has a variety of edge directions covering almost all degrees from 0° to 360°, its result reflects the performance of edge detection; for this image, the proposed method showed the best result.

Table 4 shows the comparison of computational complexity. As can be seen from this table, the proposed method is faster than FDD, CAD, and FSID. Note that MELA showed a faster result, but that was due to the early termination that causes most of the pixels to be interpolated by LA.

The numbers of arithmetic operations required to interpolate one pixel with the different methods are shown in Table 5, where ET denotes early termination. As mentioned before, VDD utilizes four different interpolation methods, so we list the number of operations required for each case.
Table 3
Comparison of performance measured by PSNR (in dB).

| PSNR | ELA | MELA | DOI | FDD | ECA | VDD | FDIF | CAD | FSID | Prop. | Rank |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Barbara | 25.09 | 32.21 | 28.67 | 31.28 | 28.27 | 31.46 | 33.79 | 26.21 | 33.02 | 33.74 | 2 |
| Boat | 32.33 | 35.30 | 33.85 | 33.95 | 32.57 | 34.78 | 36.07 | 35.47 | 36.03 | 36.26 | 1 |
| Lena | 35.85 | 37.96 | 36.52 | 36.91 | 35.84 | 37.26 | 38.06 | 38.20 | 38.32 | 38.32 | 1 |
| Blonde | 31.59 | 33.08 | 32.60 | 32.62 | 31.92 | 32.74 | 32.54 | 33.78 | 33.00 | 32.88 | 4 |
| Building | 32.37 | 33.15 | 33.24 | 33.13 | 32.57 | 32.96 | 33.04 | 32.73 | 33.07 | 32.97 | 6 |
| Cameraman | 35.53 | 37.03 | 36.02 | 36.25 | 35.47 | 36.66 | 39.97 | 37.51 | 38.50 | 39.71 | 2 |
| Pirate | 32.36 | 33.82 | 33.15 | 33.16 | 32.54 | 33.27 | 33.57 | 33.84 | 33.80 | 33.80 | 3 |
| Strawberries | 32.62 | 34.01 | 33.7 | 33.38 | 33.23 | 33.22 | 33.83 | 33.63 | 33.90 | 33.99 | 2 |
| Thumb print | 27.55 | 28.85 | 26.98 | 28.00 | 27.78 | 28.22 | 29.33 | 28.75 | 28.82 | 29.46 | 1 |
| Average | 31.70 | 33.93 | 32.75 | 33.19 | 32.24 | 33.40 | 34.47 | 33.35 | 34.27 | 34.57 | 1 |
Table 4
Comparison of computational complexity measured by CPU time (s).

| Time | ELA | MELA | DOI | FDD | ECA | VDD | FDIF | CAD | FSID | Prop. | Rank |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Barbara | 0.164 | 0.188 | 1.187 | 1.994 | 0.127 | 2.243 | 0.759 | 26.561 | 1.448 | 0.791 | 5 |
| Boat | 0.164 | 0.188 | 1.041 | 1.624 | 0.120 | 2.162 | 0.768 | 34.948 | 1.321 | 0.787 | 5 |
| Lena | 0.167 | 0.191 | 0.602 | 0.962 | 0.114 | 2.311 | 0.768 | 29.559 | 1.166 | 0.804 | 6 |
| Blonde | 0.174 | 0.192 | 0.864 | 1.364 | 0.121 | 2.454 | 0.782 | 28.570 | 1.204 | 0.804 | 5 |
| Building | 0.216 | 0.250 | 0.813 | 1.319 | 0.133 | 2.937 | 1.059 | 54.511 | 1.472 | 1.114 | 6 |
| Cameraman | 0.157 | 0.187 | 0.550 | 0.954 | 0.100 | 2.132 | 0.756 | 28.775 | 1.135 | 0.813 | 6 |
| Pirate | 0.164 | 0.182 | 0.927 | 1.626 | 0.124 | 2.261 | 0.755 | 31.115 | 1.311 | 0.817 | 5 |
| Strawberries | 0.268 | 0.315 | 0.888 | 1.718 | 0.165 | 3.858 | 1.285 | 46.865 | 1.830 | 1.341 | 6 |
| Thumb print | 0.152 | 0.166 | 1.925 | 3.369 | 0.130 | 2.246 | 0.707 | 30.604 | 1.927 | 0.762 | 5 |
| Average | 0.181 | 0.207 | 0.978 | 1.659 | 0.126 | 2.512 | 0.849 | 34.612 | 1.424 | 0.892 | 5 |
Table 5
Comparison of arithmetic operations.

| Algorithm | Case | ADD | SHT | MUL | CMP | Remarks |
|---|---|---|---|---|---|---|
| ELA | | 4 | 1 | 0 | 2 | |
| MELA | | 11 | 3 | 1 | 5 | |
| DOI | 1st ET | 2 | 1 | 0 | 1 | |
| | 2nd ET | 25 | 1 | 396 | 66 | |
| | O.W | 27 | 3 | 396 | 66 | |
| FDD | 1st ET | 7 | 1 | 0 | 1 | |
| | 2nd ET | 13 | 1 | 0 | 3 | |
| | O.W | 70 | 11 | 23 | 15 | tan⁻¹: 2, sqrt: 5 |
| ECA | ET | 8 | 1 | 0 | 1 | |
| | O.W | 37 | 6 | 15 | 1 | |
| VDD | FMELA | 21 | 4 | 1 | 8 | |
| | MVI | 17 | 3 | 1 | 15 | |
| | MDA | 34 | 3 | 5 | 38 | |
| | DOI | 57 | 3 | 401 | 103 | |
| FDIF | V direction | 19 | 3 | 7 | 6 | |
| | O.W | 27 | 4 | 17 | 6 | |
| CAD | Smooth region | 15 | 0 | 15 | 0 | |
| | O.W | 111 | 0 | 192 | 0 | inverse of mtx |
| FSID | Smooth region | 30 | 5 | 33 | 1 | exp: 6, sqrt: 1 |
| | O.W | 36 | 5 | 55 | 1 | exp: 12, sqrt: 1 |
| Prop. | R.D | 126 | 8 | 21 | 7 | |
| | O.W | 129 | 13 | 32 | 7 | |

For fast implementation, a modified variance is used in the proposed algorithm:

σ²_dirk = (1/N) Σ_{i=1}^{N} |ē − e_i|,   (15)

where ē is the mean of the interpolation errors.
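A minimal sketch of this multiplication-free variance (our own function name):

```python
import numpy as np

def modified_variance(e):
    """Eq. (15): mean absolute deviation used in place of the true variance,
    so the cost of Eq. (5) needs no per-sample multiplications."""
    e = np.asarray(e, dtype=float)
    return np.mean(np.abs(e.mean() - e))
```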
As one can observe, DOI and CAD require a large number of multiplication operations. FSID requires 6 or 12 exponentials and 1 square root, as well as 33 or 55 multiplications. The number of operations required for the proposed algorithm varies with the number of candidate directions: if N is the number of candidate directions, the proposed method requires 18N + 1 or 18N + 13 additions, 3N or 3N + 11 multiplications, N + 1 or N + 6 shifts, and N comparisons.

The structural similarity (SSIM) index is a well-known image quality assessment tool for calculating the similarity between two images, which estimates perceived errors [37]. It is generally used in image enhancement because it provides a good approximation of perceived image quality; in other words, we can measure the similarity between two images by using SSIM. In practice, the mean SSIM index (MSSIM) is used to evaluate the quality of the entire image and is obtained by averaging the local SSIM indices over the image. Table 6 shows the MSSIM results, where a higher value indicates that the reconstructed image is more similar to the original. The MELA and FSID methods show comparable results; however, the proposed method shows the best result in terms of the average MSSIM.
Table 6
Comparison of performance measured by MSSIM.

| MSSIM | ELA | MELA | DOI | FDD | ECA | VDD | FDIF | CAD | FSID | Prop. | Rank |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Barbara | 0.866 | 0.949 | 0.923 | 0.942 | 0.907 | 0.934 | 0.956 | 0.896 | 0.953 | 0.960 | 1 |
| Boat | 0.907 | 0.938 | 0.931 | 0.932 | 0.919 | 0.933 | 0.939 | 0.938 | 0.945 | 0.944 | 2 |
| Lena | 0.940 | 0.955 | 0.952 | 0.952 | 0.946 | 0.950 | 0.951 | 0.956 | 0.958 | 0.953 | 4 |
| Blonde | 0.901 | 0.925 | 0.922 | 0.922 | 0.911 | 0.918 | 0.916 | 0.934 | 0.927 | 0.930 | 2 |
| Building | 0.889 | 0.906 | 0.908 | 0.906 | 0.899 | 0.901 | 0.904 | 0.894 | 0.908 | 0.903 | 6 |
| Cameraman | 0.964 | 0.974 | 0.973 | 0.973 | 0.965 | 0.972 | 0.985 | 0.976 | 0.976 | 0.985 | 1 |
| Pirate | 0.919 | 0.943 | 0.938 | 0.938 | 0.925 | 0.934 | 0.940 | 0.941 | 0.943 | 0.942 | 3 |
| Strawberries | 0.940 | 0.956 | 0.953 | 0.951 | 0.946 | 0.949 | 0.953 | 0.952 | 0.956 | 0.960 | 1 |
| Thumb print | 0.925 | 0.954 | 0.925 | 0.940 | 0.936 | 0.942 | 0.964 | 0.951 | 0.960 | 0.965 | 1 |
| Average | 0.917 | 0.944 | 0.936 | 0.940 | 0.928 | 0.937 | 0.945 | 0.938 | 0.947 | 0.949 | 1 |
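For reference, objective scores of the kind reported in Tables 3 and 6 can be reproduced along the following lines. This is only a sketch: it assumes scikit-image for the metrics, uses the simple line average as a stand-in deinterlacer, and removes and then re-estimates the odd lines of a progressive test image, as in the evaluation protocol of this section.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(original, deinterlace):
    """Drop the odd lines of an 8-bit progressive frame, re-estimate them, and score the result."""
    original = np.asarray(original, dtype=float)
    field = original.copy()
    field[1::2, :] = 0.0                       # remove the odd field
    rebuilt = deinterlace(field)               # method under test fills the odd lines
    psnr = peak_signal_noise_ratio(original, rebuilt, data_range=255)
    mssim, ssim_map = structural_similarity(original, rebuilt, data_range=255, full=True)
    return psnr, mssim, ssim_map               # ssim_map is the kind of map shown in Fig. 7

def line_average(field):
    """Simple LA baseline: each missing line is the mean of its two known neighbors."""
    out = field.copy()
    out[1:-1:2, :] = 0.5 * (field[0:-2:2, :] + field[2::2, :])
    out[-1, :] = field[-2, :]                  # last missing line (even image height assumed)
    return out

# psnr, mssim, _ = evaluate(test_image, line_average)
```

Boundary handling and rounding conventions may differ from the implementation used for the tables, so such a sketch only indicates the order of magnitude of the reported figures.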
Fig. 7. SSIM map. (a) Original image, (b) ELA, (c) MELA, (d) DOI, (e) FDD, (f) ECA, (g) VDD, (h) FDIF, (i) CAD, (j) FSID, and (k) Proposed.
This means that the deinterlaced images generated by the proposed method visually outperform the others. Note that the proposed method is ranked second for the Blonde image in terms of MSSIM, whereas it is ranked fourth in PSNR, because the proposed method preserves details such as the hair region well. The degradation in the Building image comes from regions with diverse edges in small areas.

Fig. 7 shows the SSIM maps of the proposed method and the conventional methods. A brighter SSIM map indicates that the image is more similar to the original. The proposed method shows the brightest SSIM map, which demonstrates the subjective superiority of the proposed method. In particular, the black areas on the table, the left leg, and the scarf are noticeably reduced.
Fig. 8. Deinterlaced images. (a) Original image, (b) partial image of (a), (c) ELA, (d) MELA, (e) DOI, (f) FDD, (g) ECA, (h) VDD, (i) FDIF, (j) FSID, and (k) Proposed.
Fig. 8 shows the deinterlaced images generated by the different methods. To obtain the deinterlaced images, we first eliminated the odd fields in a frame and then interpolated them. ELA and MELA smooth diagonal/anti-diagonal edges, and some discontinuities can be found, especially in the left character. DOI and FDD cause edge-bleeding artifacts. FDIF also shows staircase artifacts, and FSID shows edge-bleeding artifacts. The proposed method is the clearest and reduces the staircase artifacts the most.

4. Conclusions

In this paper, we proposed a new edge detection method and interpolation method with three goals: consideration of various edges, reliable edge detection, and outstanding interpolation. Since the number of candidate edge directions is flexible, we can take into account the varied edges included in the HR image. For edge detection, the autocorrelation of the neighboring pixels along the candidate edge directions, based on the duality between the LR image and the HR image, is utilized. The autocorrelation values obtained in the edge detection step are reused as the weights in the interpolation step. Along the determined edge direction, we interpolate the missing pixel using a BH windowed-sinc weighted average filter. Simulation results demonstrated that the proposed method performs well in terms of complexity, objective results, and subjective results.

Acknowledgments

This work was supported by the MKE (The Ministry of Knowledge Economy), Korea, under the ITRC (Information Technology Research Center) support program supervised by the NIPA (National IT Industry Promotion Agency) (NIPA-2013-H0301-13-1011) and by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning (2013R1A1A1010797).

References

[1] E.B. Bellers, G. de Haan, Deinterlacing: A Key Technology for Scan Rate Conversion, Elsevier, 2000.
[2] K. Jack, Video Demystified – A Handbook for the Digital Engineer, 4th ed., Elsevier, Jordan Hill, Oxford, 2005.
[3] Y. Fang, DAC spectrum of binary sources with equally-likely symbols, IEEE Trans. Commun. 61 (4) (Apr. 2013) 1584–1594.
[4] Y. Fang, LDPC-based lossless compression of nonstationary binary sources using sliding-window belief propagation, IEEE Trans. Commun. 60 (11) (Nov. 2012) 3161–3166.
[5] Y. Fang, Joint source–channel estimation using accumulated LDPC syndrome, IEEE Commun. Lett. 14 (11) (Nov. 2010) 1044–1046.
[6] Y. Fang, EREC-based length coding of variable-length data blocks, IEEE Trans. Circuits Syst. Video Technol. 20 (10) (Oct. 2010) 1358–1366.
[7] Y. Fang, Distribution of distributed arithmetic codewords for equiprobable binary sources, IEEE Signal Process. Lett. 16 (12) (Dec. 2009) 1079–1082.
[8] Y. Fang, Crossover probability estimation using mean-intrinsic-LLR of LDPC syndrome, IEEE Commun. Lett. 13 (9) (Sept. 2009) 679–681.
[9] G. de Haan, E.B. Bellers, Deinterlacing – an overview, Proc. IEEE 86 (9) (Sept. 1998) 1839–1857.
[10] G. Jeon, M. Anisetti, V. Bellandi, J. Jeong, Fuzzy rule-based edge-restoration algorithm in HDTV interlaced sequences, IEEE Trans. Consum. Electron. 53 (2) (May 2007) 725–731.
[11] G. Jeon, M. Anisetti, V. Bellandi, E. Damiani, J. Jeong, Rough sets-assisted subfield optimization for alternating current plasma display panel, IEEE Trans. Consum. Electron. 53 (3) (Aug. 2007) 825–832.
[12] G. Jeon, M. Anisetti, V. Bellandi, E. Damiani, J. Jeong, Fuzzy weighted approach to improve visual quality of edge-based filtering, IEEE Trans. Consum. Electron. 53 (4) (Nov. 2007) 1661–1667.
search Center) support program supervised by the NIPA (National IT Industry Promotion Agency) (NIPA-2013-H0301-13-1011) and by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning (2013R1A1A1010797). References [1] E.B. Bellars, G.D. Haan, Deinterlacing: A Key Technology for Scan Rate Conversion, Elsevier, 2000. [2] K. Jack, Video Demystified – A Handbook for the Digital Engineer, 4th ed., Elsevier, Jordan Hill, Oxford, 2005. [3] Y. Fang, DAC spectrum of binary sources with equally-likely symbols, IEEE Trans. Commun. 61 (4) (Apr. 2013) 1584–1594. [4] Y. Fang, LDPC-based lossless compression of nonstationary binary sources using sliding-window belief propagation, IEEE Trans. Commun. 60 (11) (Nov. 2012) 3161–3166. [5] Y. Fang, Joint source–channel estimation using accumulated LDPC syndrome, IEEE Commun. Lett. 14 (11) (Nov. 2010) 1044–1046. [6] Y. Fang, EREC-based length coding of variable-length data blocks, IEEE Trans. Circuits Syst. Video Technol. 20 (10) (Oct. 2010) 1358–1366. [7] Y. Fang, Distribution of distributed arithmetic codewords for equiprobable binary sources, IEEE Signal Process. Lett. 16 (12) (Dec. 2009) 1079–1082. [8] Y. Fang, Crossover probability estimation using mean-intrinsic-LLR of LDPC syndrome, IEEE Commun. Lett. 13 (9) (Sept. 2009) 679–681. [9] G.D. Haan, E.B. Bellers, Deinterlacing – an overview, Proc. IEEE 86 (9) (Sept. 1998) 1839–1857. [10] G. Jeon, M. Anisetti, V. Bellandi, J. Jeong, Fuzzy rule-based edge-restoration algorithm in HDTV interlaced sequences, IEEE Trans. Consum. Electron. 53 (2) (May 2007) 725–731. [11] G. Jeon, M. Anisetti, V. Bellandi, E. Damiani, J. Jeong, Rough sets-assisted subfield optimization for alternating current plasma display panel, IEEE Trans. Consum. Electron. 53 (3) (Aug. 2007) 825–832. [12] G. Jeon, M. Anisetti, V. Bellandi, E. Damiani, J. Jeong, Fuzzy weighted approach to improve visual quality of edge-based filtering, IEEE Trans. Consum. Electron. 53 (4) (Nov. 2007) 1661–1667.
[13] G. Jeon, M. Anisetti, D. Kim, V. Bellandi, E. Damiani, J. Jeong, Fuzzy rough sets hybrid scheme for motion and scene complexity adaptive deinterlacing, Image Vis. Comput. 27 (4) (March 2009) 425–436.
[14] G. Jeon, M. Anisetti, V. Bellandi, E. Damiani, J. Jeong, Designing of a type-2 fuzzy logic filter for improving edge-preserving restoration of interlaced-to-progressive conversion, Inform. Sci. 179 (13) (June 2009) 2194–2207.
[15] G. Jeon, M. Anisetti, S. Kang, A rank-ordered marginal filter for deinterlacing, Sensors 13 (3) (March 2013) 3056–3065.
[16] G. Jeon, M. Anisetti, J. Lee, V. Bellandi, E. Damiani, J. Jeong, Concept of linguistic variable-based fuzzy ensemble approach: application to interlaced HDTV sequences, IEEE Trans. Fuzzy Syst. 17 (6) (Dec. 2009) 1245–1258.
[17] G. Jeon, S.J. Park, Y. Fang, M. Anisetti, V. Bellandi, E. Damiani, J. Jeong, Specification of efficient block matching scheme for motion estimation in video compression, Opt. Eng. 48 (12) (Dec. 2009) 127005.
[18] G. Jeon, M.Y. Jung, M. Anisetti, V. Bellandi, E. Damiani, J. Jeong, Specification of the geometric regularity model for fuzzy if-then rule-based deinterlacing, IEEE/OSA J. Display Technol. 6 (6) (June 2010) 235–243.
[19] K. Sugiyama, H. Nakamura, A method of deinterlacing with motion compensated interpolation, IEEE Trans. Consum. Electron. 45 (3) (Aug. 1999) 611–616.
[20] Y.-Y. Jung, B.-T. Choi, Y.-J. Park, S.-J. Ko, An effective deinterlacing technique using motion compensated information, IEEE Trans. Consum. Electron. 46 (3) (Aug. 2000) 460–466.
[21] R. Li, B. Zheng, M.L. Liou, Reliable motion detection/compensation for interlaced sequences and its applications to deinterlacing, IEEE Trans. Circuits Syst. Video Technol. 10 (1) (Oct. 2000) 23–29.
[22] O. Kwon, K. Sohn, C. Lee, Deinterlacing using directional interpolation and motion compensation, IEEE Trans. Consum. Electron. 49 (1) (Feb. 2003) 198–203.
[23] D. Wang, A. Vincent, P. Blanchfield, Hybrid de-interlacing algorithm based on motion vector, IEEE Trans. Circuits Syst. Video Technol. 15 (8) (Aug. 2005) 1019–1025.
[24] T. Doyle, Interlaced to sequential conversion for EDTV applications, in: Proc. 2nd Int. Workshop Signal Processing of HDTV, Elsevier, Amsterdam, 1999, pp. 412–430.
[25] W. Kim, S. Jin, J. Jeong, Novel intra deinterlacing algorithm using content adaptive interpolation, IEEE Trans. Consum. Electron. 53 (3) (Aug. 2007) 1036–1043.
[26] H. Yoo, J. Jeong, Direction-oriented interpolation and its application to deinterlacing, IEEE Trans. Consum. Electron. 48 (4) (Nov. 2002) 954–962.
[27] S. Jin, W. Kim, J. Jeong, Fine directional de-interlacing algorithm using modified Sobel operation, IEEE Trans. Consum. Electron. 54 (2) (May 2008) 857–862.
[28] T.-H. Tsai, H.-L. Lin, Design and implementation for deinterlacing using the edge-based correlation adaptive method, J. Electron. Imaging 18 (1) (Mar. 2009) 013014, pp. 1–5.
[29] H.-J. Cho, Y.-S. Lee, S.-H. O, S.-J. O, A voting-based intra deinterlacing method for directional error correlation, IEEE Trans. Consum. Electron. 56 (3) (Aug. 2010) 1713–1721.
[30] S.-M. Hong, S.-J. Park, J. Jang, J. Jeong, Deinterlacing algorithm using fixed directional interpolation filter and adaptive distance weighting scheme, Opt. Eng. 50 (6) (June 2011) 067008.
[31] S.-J. Park, G. Jeon, J. Jeong, Computation-aware algorithm selection approach for interlaced-to-progressive conversion, Opt. Eng. 49 (5) (May 2010) 057005.
[32] X. Chen, G. Jeon, J. Jeong, Filter switching interpolation method for deinterlacing, Opt. Eng. 51 (10) (Oct. 2012) 107402.
[33] J. Canny, A computational approach to edge detection, IEEE Trans. Pattern Anal. Mach. Intell. PAMI-8 (6) (June 1986) 679–698.
[34] P.Z. Peebles, Probability, Random Variables and Random Signal Principles, McGraw-Hill, 2000.
[35] P.F. Dunn, Measurement and Data Analysis for Engineering and Science, McGraw-Hill, New York, 2005.
[36] H. Qian, R. Zhao, T. Chen, Interharmonics analysis based on interpolating windowed FFT algorithm, IEEE Trans. Power Deliv. 22 (2) (Apr. 2007) 1064–1069.
[37] Z. Wang, A.C. Bovik, H.R. Sheikh, E.P. Simoncelli, Image quality assessment: from error visibility to structural similarity, IEEE Trans. Image Process. 13 (4) (Apr. 2004) 600–612.
Joohyeok Kim received his B.S. degree in information and communication engineering from Hanyang University, Korea, in 2008. He is currently a Ph.D. candidate in Electronics and Computer Engineering at Hanyang University. His research interests include video compression, such as H.264/AVC, HEVC, and 3D video coding, and image processing, including interpolation, demosaicking, deinterlacing, and denoising.

Gwanggil Jeon received the B.S., M.S., and Ph.D. (summa cum laude) degrees from the Department of Electronics and Computer Engineering, Hanyang University, Seoul, Korea, in 2003, 2005, and 2008, respectively. He was with the Department of Electronics and Computer Engineering, Hanyang University, from 2008 to 2009. He was with the School of Information Technology and Engineering, University of Ottawa, Ottawa, ON, Canada, as a Post-Doctoral Fellow, from 2009 to 2011. He was with the Graduate School of Science and Technology, Niigata University, Niigata, Japan, as an Assistant Professor, from 2011 to 2012. He is currently an Assistant Professor with the Department of Embedded Systems Engineering, Incheon National University, Incheon, Korea. His current research interests include image processing, particularly image compression, motion estimation, demosaicking, and image enhancement, and computational intelligence, such as fuzzy and rough sets theories. Dr. Jeon was a recipient of the IEEE Chester Sall Award in 2007 and the ETRI Journal Paper Award in 2008.

Jechang Jeong received a B.S. degree in electronic engineering from Seoul National University, Korea, in 1980, an M.S. degree in electrical engineering from the Korea Advanced Institute of Science and Technology in 1982, and a Ph.D. degree in electrical engineering from the University of Michigan, Ann Arbor, in 1990. From 1982 to 1986, he was with the Korean Broadcasting System, where he helped develop teletext systems. From 1990 to 1991, he worked as a postdoctoral research associate at the University of Michigan, Ann Arbor, where he helped to develop various signal-processing algorithms. From 1991 through 1995, he was with the Samsung Electronics Company, Korea, where he was involved in the development of HDTV, digital broadcasting receivers, and other multimedia systems. Since 1995, he has conducted research at Hanyang University, Seoul, Korea. His research interests include digital signal processing, digital communication, and image and audio compression for HDTV and multimedia applications. He has published numerous technical papers. Dr. Jeong received the Scientist of the Month award in 1998 from the Ministry of Science and Technology of Korea, and was the recipient of the 2007 IEEE Chester Sall Award and the 2008 ETRI Journal Paper Award. He was also honored with a government commendation in 1998 from the Ministry of Information and Communication of Korea.