Motion-adaptive 3D nonlocal means filter based on stochastic distance for low-dose X-ray fluoroscopy

Motion-adaptive 3D nonlocal means filter based on stochastic distance for low-dose X-ray fluoroscopy

Biomedical Signal Processing and Control 38 (2017) 74–85 Contents lists available at ScienceDirect Biomedical Signal Processing and Control journal ...

12MB Sizes 0 Downloads 12 Views

Biomedical Signal Processing and Control 38 (2017) 74–85

Contents lists available at ScienceDirect

Biomedical Signal Processing and Control journal homepage: www.elsevier.com/locate/bspc

Motion-adaptive 3D nonlocal means filter based on stochastic distance for low-dose X-ray fluoroscopy Min Seok Lee a , Sang Wook Park b , Sang Yoon Lee a , Moon Gi Kang a,∗ a b

School of Electrical and Electronics Engineering, Yonsei University, 50 Yonsei-Ro, Seodaemun-Gu, Seoul, South Korea Medical Device Development Center, Daegu-Gyeongbuk Medical Innovation Foundation, 80, Cheombok-Ro, Dong-gu, Daegu 41061, South Korea

a r t i c l e

i n f o

Article history: Received 25 May 2016 Received in revised form 23 February 2017 Accepted 1 May 2017 Keywords: Low-dose X-ray fluoroscopy Nonlocal means Image denoising Stochastic distance Spatio-temporal filtering

a b s t r a c t Low-dose X-ray fluoroscopy avoids radiation risks, but X-ray fluoroscopic image sequences are contaminated by quantum noise. In this paper, a 3D nonlocal means (NLM) filter based on stochastic distance that incorporates motion information is proposed, which can be applied to the denoising of X-ray fluoroscopic image sequences. First, the stochastic distance is obtained for use as the NLM filter similarity measure. This facilitates the removal of Poisson noise between the patches to be denoised within motion-compensated 3D volumes for spatio-temporal filtering. Second, motion state detection and motion-adaptive weights are proposed, which preserve the details of the motion that can occur during medical procedures. Experimental results obtained by applying the proposed method to real X-ray fluoroscopic image sequence images are shown in visual and numerical comparison with state-of-the-art denoising methods for spatiotemporal filtering techniques. The quantitative and qualitative results confirm that the proposed novel frameworks outperform other techniques in terms of objective criteria, as well as the subjective visual perception of the real X-ray fluoroscopic image sequences. © 2017 Elsevier Ltd. All rights reserved.

1. Introduction X-ray fluoroscopy plays a crucial role in medical imaging, and this technique has been widely used in patient clinical examinations and interventional procedures (angiography). However, the radiation exposure experienced by patients and staff during clinical treatment has emerged as a major problem. Hence, the reduction of radiation exposure is an important challenge related to X-ray-based medical imaging. However, when the radiation dose is minimized, strong noise inevitably arises in the image. At a low signal-to-noise ratio (SNR), degradation of the image quality decreases the accuracy of the clinical diagnosis. Thus, an image denoising algorithm is necessary, not only to improve the image quality, but also to facilitate the use of low-dose X-ray fluoroscopy. Currently, image-sequence denoising filters are being actively studied [1–6]; however, the processing of X-ray fluoroscopic image sequences remains a challenging problem. Because of the lower number of X-ray photons involved, low-dose X-ray fluoroscopic images are strongly dominated by quantum noise, which is statistically modeled as a signal-dependent Poisson distribution. Many denoising algorithms for signal-dependent noise have been pre-

∗ Corresponding author. E-mail address: [email protected] (M.G. Kang). http://dx.doi.org/10.1016/j.bspc.2017.05.001 1746-8094/© 2017 Elsevier Ltd. All rights reserved.

sented in the literature [9–14]. In general, denoising methods can be divided into two categories based on the filtering domains, namely, spatial and temporal methods. In the spatial domain, the use of patch-based approaches has become popular in recent years. For example, nonlocal means (NLM) filtering has been proposed in Ref. [15], while blockmatching and a 3D filtering (BM3D) algorithm have been proposed in Ref. [8], all of which have been employed in denoising applications. With regard to Poisson noise removal, some patch-based methods specifically designed for signal-dependent noise have been presented [11,12]. The most common method involves use of a variance-stabilizing transform such as the Anscombe transform [9,10]. In the transformed domain, the Poisson noise behaves like Gaussian noise. Hence, denoising methods for signal-independent noise, such as the BM3D filtering developed in Ref. [8], can be adopted to remove the transformed Poisson noise. In addition, an NLM filter based on stochastic distances, which does not employ a variance-stabilizing transform, has been proposed [14]. For this NLM filter, the similarity in the Euclidean distance between the center patch and the neighboring patches is utilized for weight calculation [15]. In order to improve the weight calculation results for Poisson noise removal, however, the similarity measure is replaced by symmetric divergences, also known as stochastic distances [13]. In order to restore degraded image sequence data, denoising methods extended to the temporal domain have also been

M.S. Lee et al. / Biomedical Signal Processing and Control 38 (2017) 74–85

presented in recent years [1–6]. Temporal filtering exhibits advantages in terms of preservation of fine details, because this filtering method does not average neighboring pixels, unlike spatial filtering. On the other hand, temporal filtering is likely to cause problems in cases involving moving regions, such as motion blurring and trailing artifacts. Motion representation plays a significant role in temporal filtering. In temporal filtering methods, corresponding motion information is categorized as explicit or implicit representation. In denoising, explicit motion representation refers to the detection of the true motion trajectories of a physical point via motion estimation and compensation. Many temporal filters, such as the locally adaptive linear minimum mean squared-error spatio-temporal filter (3D-LLMMSE) [1] and the multihypothesis motion-compensated filter (MHMCF) [2,3], adopt this approach. Such explicit motion representation is conceptually simple, but suffers from a fundamental weakness; that is, because of the difficulty in establishing correct correspondence for every pixel, uncertainty in the motion representation is inevitable. Thus, instead of explicitmotion-based filtering, implicit-motion-based filtering has been proposed, involving patch-based approaches extended to the temporal domain, for example. At present, patch-based approaches are also popular in 3D denoising methods. An NLM extended to the temporal domain [4], space-time adaptive filtering [5], and video block matching and 3-D filtering (VBM3D) [6] are some representative implicit-motion-based filtering methods. However, the methods that neglect motion estimation yield motion artifacts such as blurring and trailing. For images obtained using motioncompensated temporal filtering, the motion details are better preserved and the boundaries are less blurred than those obtained using implicit motion-based filtering. A patch-based denoising method and motion estimation algorithm inspired by the image fusion method have been proposed in Ref. [7]. Thus, a method based on motion-compensated 3D spatiotemporal volumes along with motion trajectories is proposed in this study. These motion-compensated 3D volumes are used to provide 3D filtering support for spatio-temporal filtering of X-ray fluoroscopic image sequences. Previously, a motion-compensated spatio-temporal filter operating in a multi-scale space has been proposed [24]. Further, several denoising algorithms based on spatio-temporal filtering techniques have been developed and used for medical imaging [16–25]. The contribution of this work is twofold. First, we propose a framework for a 3D NLM filter based on stochastic distance in order to reduce the quantum noise, which is modeled as a Poisson distribution. The developed 3D NLM filter is based on 3D motion-compensated support composed of sequences of support volumes stacked along the motion trajectories. Second, we consider the motion states of an X-ray fluoroscopic image sequence and the spatio-temporal similarity in order to obtain a motionadaptive weight for the 3D NLM filter. This motion-adaptive weight contains a motion detection value between neighboring frames within the motion-compensated 3D filtering support. The proposed motion-adaptive 3D NLM filter based on stochastic distance is designed to suppress quantum noise in X-ray fluoroscopic image sequences while preserving image features such as edges, textures, and details, even if complex motion occurs. The remainder of this paper is organized as follows. Beginning with the spatio-temporal filter structure of the static 3D NLM, we define the proposed 3D NLM structure based on stochastic distance for X-ray fluoroscopy, and proceed to explain the novelty and motivation behind the proposed method. In the experimental results section, we demonstrate the performance of the algorithm using real X-ray fluoroscopic image sequences obtained at different lowdose levels. The evaluation is performed on a set of six fluoroscopic image sequences including 50 frames, using a chest phantom with

75

two different motion types and two different X-ray dose levels. Finally, we conclude the paper. 2. Static 3D NLM filter for Gaussian noise The NLM filter [15] is one of the most popular denoising methods, and has been studied extensively. As an averaging-based filter that directly smoothens the pixel values in the spatial domain, the NLM filter is an effective denoising method. In the NLM filter, the similarity between the Euclidean distance between the center patch and the neighboring patches is utilized for weight calculation. Further, the NLM-based spatio-temporal filter is obtained by extending the 2D spatial filter support to a 3D time–space support [4]. This static 3D NLM filter is defined as fˆ3DNLM (x, t) =

 e

1 N(x, t)



˝S(x,t)

(G|g(x+.,t)g(y+.,s)|2 )(0) h2

g(y, s)dyds,

(1)

˝T (x,t)

where g(x, t) is an observed value at a 2D pixel index x in frame t, g(y, s) is an observed value at a 2D pixel index y in frame s, h is the smoothing parameter, G is a 2D Gauss kernel, and N(x, t) is the normalizing factor, as given by





N(x, t) =

e ˝S(x,t)

(G|g(x+.,t)g(y+.,s)|2 )(0) h2

dyds.

(2)

˝T (x,t)

The filter is 2D but integral in both the time and space dimensions in a 3D search window (˝S , ˝T ), where ˝S is a 2D spatial search window around x and ˝T is a 1D temporal search window around t. This means that (˝S , ˝T ) constitutes a 3D temporalspatial block around (x, t). In addition, (G ∗ |g(x + ., t) − g(y + ., s)|2 )(0)



=

2

G(z)|g(x + z, t) − g(y + z, s)| dz,

(3)

˝C

defines a similarity function between g(x, t) and g(y, s), based on the similarity of their neighborhoods (˝C is a 2D neighborhood domain defining the comparison window, the so-called “image patch”). The similarity represents a Euclidean distance weighted by a Gaussian kernel with standard deviation. Note that the static 3D NLM filter is ineffective in a low-dose X-ray fluoroscopic image sequence. Further, low-dose X-ray fluoroscopic images are strongly dominated by quantum noise, which is modeled as Poisson noise. In order to remove the Poisson noise from the X-ray fluoroscopic image sequences, we propose the use of a 3D NLM filter based on stochastic distance. The proposed filter is described in the next section. 3. Proposed 3D NLM filter based on stochastic distance In this section, we define the model for a noisy X-ray fluoroscopic image sequence, and recall the definition of 3D NLM filters given in Section 2. We present the framework and basic facts concerning the 3D NLM filter based on stochastic distance for X-ray fluoroscopic image sequence denoising. 3.1. Framework of proposed method As noted above, the noise model considered in this work is Poisson noise, and X-ray images are known to have signal-dependent Poisson noise. At low exposure levels, the photons emerging from a

76

M.S. Lee et al. / Biomedical Signal Processing and Control 38 (2017) 74–85

Table 1 Poisson stochastic distances.

where gk are noisy observations and  is a set of pixel coordinates. In addition,

Distance Measures Closed-Form



(X − Y ) · ln

Kullback–Leibler

1 2

Hellinger

1−e

Bhattacharyya

1 (X 2

Rényi

1 ln ˇ−1

   X

√ Y

− 1 (X +Y )+ 2

+ Y ) −

 1 2

e



(X · Y )

ˆ ML (x + ., t),  ˆ ML (y + ., s))(0) d˚ (





(X · Y )

1−ˇ ˇ  −[(1−ˇ)Y +ˇX ] Y X



+e

1−ˇ ˇ  −[(1−ˇ)X +ˇY ] X Y





patient under a fluoroscopy system can be described by a stochastic Poisson point process. The number N of photons detected at position x can be modeled as a Poisson distribution, using the probability mass function given by N(x)

P(N(x); ) =

e−(x) [(x)] , N(x)!

(4)

where  » 1 is the expected number of photons in an interval of time, also called the counting rate (i.e., the reciprocal of the fluoroscope frame rate). Further, the mean and variance of the Poisson distribution are equal, as expressed by E[N(x)] = var[N(x)] = (x).

(5)

The original NLM filter that uses the Euclidean distance as the similarity measure is not robust against signal-dependent Poisson noise [15]. Therefore, in order to improve the denoising performance when the noise is modeled as Poisson type, the Euclidean distance is replaced by stochastic distances as the similarity measure between patches. This approach was recently proposed for the 2D NLM filter [14], and the concept of stochastic distances comes from the concept of divergence measures [13]. According to Ref. [14], the use of stochastic distance as a similarity measure is efficient for Poisson-noise denoising and can accommodate different probability distributions. In our approach, the stochastic distance is employed as the similarity measure in order to improve the lowdose X-ray denoising performance. Let X and Y be two Poisson random variables with counting rate parameters X and Y , respectively. The Poisson distances are given in Table 1. The proposed framework of our method extends the 2D NLM filter using stochastic distance to a 3D NLM filter, which is supported in the time–space domain. The static 3D NLM filter based on stochastic distance is then defined as fˆ3DPNLM (x, t) =

 e

1 N(x, t)



˝S(x,t)

ˆ ˆ d˚ ( ML (x+.,t), ML (y+.,s))(0) n

g(y, s)dyds,

(6)

˝T (x,t)

where the smoothing parameter h is replaced by the noise standard deviation multiplied by the constant  n , with  n being the noise level corresponding to the dose exposure. In the case of a higher dose exposure level, the correspondingly large  n value can smooth ˆ ML (x, t) is the maximum the images during the denoising process.  likelihood (ML) of each image patch to estimate  and N(x, t) is the normalizing factor, as given by





N(x, t) =

e ˝S(x,t)

ˆ ˆ d˚ ( ML (x+.,t), ML (y+.,s))(0) n

dyds.

(7)

˝T (x,t)

ˆ ML is given by The ML estimator of 

ˆ ML () = 1  gk , n k∈

(8)

ˆ ML (x + z, t),  ˆ ML (y + z, s))dz, d˚ (

=

(9)

˝C

defines the similarity function between two patches based on stochastic distance, as shown in Table 1. The ML estimator is employed in order to estimate  for each pixel of a patch. The proposed algorithm is based on stochastic distance and considers the dynamic character of a moving X-ray fluoroscopic image sequence. An X-ray fluoroscopic image sequence contains several motion sources, such as moving tissue, bones, and organs, along with moving devices including catheters and guide wires. Noise reduction in such motion scenarios is a basic requirement for X-ray fluoroscopy applications. In this work, we focus on the motion states of an Xray fluoroscopic image sequence. We show that several motion scenarios can be detected and approximately segmented using a motion state detection approach, without significant motion artifact degradation during spatio-temporal filtering. We then classify the motion into complex and medium motion categories, so as to reduce object blurring. Thus, in contrast to implicit temporal filtering using patch similarity between neighboring frames, this scheme distinguishes between motion states. A block diagram of the proposed denoising scheme is shown in Fig. 1. The first step of the proposed algorithm is to detect the motion states, which are then classified as complex (including local or global movement) or medium motion. Based on the motion state, different types of temporal filtering support are then selected to suppress the motion artifacts. Areas categorized as containing medium motion states are filtered using 3D NLM based on stochastic distance, which is performed along the motioncompensated spatio-temporal support. On the other hand, areas classified as having complex motion states are filtered using 2D NLM based on stochastic distance, and 1D motion-compensated temporal filtering is also performed. To prevent motion blurring, a motion-adaptive weight w is assigned during the spatio-temporal filtering process. Finally, the denoised X-ray fluoroscopic image sequence result is obtained. 3.2. Motion-compensated spatio-temporal support One of the major difficulties affecting motion-compensated filtering is trajectory ambiguity, i.e., the so-called aperture problem [4]. The majority of pixels in an image sequence have several possible displacement vectors, including similar gray-level values or similar blocks around them. A large number of good matches can exist, but the motion estimation algorithm must select one. On the other hand, the details of moving objects are better preserved and the boundaries are less blurred when motion compensation is applied in place of implicit-motion-based filtering. In this work, the proposed 3D NLM filter is based on motion-compensated 3D support composed of 2D supports stacked along the motion trajectory of the X-ray fluoroscopic image sequence. Almost all state-of-theart temporal filters are motion-compensated. In our approach, block-matching-based motion estimation is adopted in order to find the physical motion vector mv at every pixel. Let B(xs , s) be a 2D support or block extracted from a noisy image sequence at a 2D pixel index xs in reference frame s. A spatio-temporal support is a 3D sequence of 2D blocks constructed following a specific motion trajectory. The trajectory corresponding to a reference position (or the previous frame’s position) (xs , s) is defined as Tmc (xs , s) = (x + mvs , s),

s = t − 1, t − 2, .., t − N,

(10)

M.S. Lee et al. / Biomedical Signal Processing and Control 38 (2017) 74–85

77

Fig. 1. Block diagram of proposed denoising scheme.

where xs is the 2D pixel index within a previous frame s, mvs are the motion vectors in the s-th frame, and the corresponding s is the neighboring frame of the current frame t, where s = t − 1, t − 2, .., t − N. After obtaining the trajectories at (xs , s), we can define the corresponding motion-compensated spatio-temporal support for the proposed 3D NLM filter as Smc (xs , s) = {Bmc (xi , i) : (xi , i) ∈ T (xs , s)},

(11)

where the subscript mc means that this support is motioncompensated and extracted from the noisy image sequence. In Fig. 2, the motion-compensated spatio-temporal support concept is illustrated, where the spatio-temporal support is stacked along the motion trajectory. Medium motion involves both static and slow-motion behavior. For medium motion behavior, local movement of the organ or medical device may occur. In this case, 3D NLM filtering with spatio-temporal support (as shown in Fig. 2(a)) can remove the noise efficiently. On the other hand, complex motion involves both global motion changes and rapid-

motion behavior. Complex motion refers to free motion changes in the patient’s body or in the X-ray equipment in a real clinical situation. Thus, if complex motion is encountered in the X-ray fluoroscopic image sequence, the use of 3D NLM filtering without spatio-temporal support may cause some difficulties. In that case, 3D NLM filtering with spatio-temporal support (as shown in Fig. 2(b)) can remove the noise efficiently and without difficulty. 3.3. Motion state detection To improve the spatio-temporal filtering by eliminating motion artifacts, the motion-state factor ␣ is defined in terms of the local and global motion, such that ˛=

(l 2 + g 2 ) (l 2 + g 2 ) + n

,

(12)

where  l 2 is the variance of the absolute frame difference in the local support ˝S , and  g 2 is the variance of the absolute frame

Fig. 2. Illustrations of motion-compensated spatio-temporal support: (a) medium and (b) complex motion states.

78

M.S. Lee et al. / Biomedical Signal Processing and Control 38 (2017) 74–85

Fig. 3. (a) Single frame of X-ray fluoroscopic image sequence capturing chest phantom with moving objects. (b) Complex motion regions obtained via motion state detection.

difference in the global region (the pixel set of the entire image location) ˝G .  n is the noise level corresponding to the dose exposure. When the dose exposure level is higher and the motion becomes more rapid, the correspondingly large  n value can smooth the images during the denoising process. l2 and g2 are defined as follows: l

2

1 = n(˝S )

l =

1 n(˝S )

g 2 =

g =





1

n ˝G

(13)

|g(x + z, t) − g(x + z, t − 1)|dz,

(14)

˝S

1 n(˝G )



(|g (x + z, t) − g (x + z, t − 1) | − l )2 dz, ˝s







|g(x + z, t) − g(x + z, t − 1)| − g

2

3.4. Motion-adaptive 3D NLM filter The main goal of our approach is not only to preserve the details of moving areas, but also to improve the denoising performance for low-dose X-ray fluoroscopy. The proposed 3D NLM filter is based on stochastic distance and motion-adaptive weighting, which reflect the patch and frame similarity through motioncompensated spatio-temporal support. The proposed 3D NLM filter is defined as fˆ3DPNLM (x, t) =

1 N(x, t)

 dz,

(15)

˝G

w(x, s)e



˝Smc (x,t)

ˆ ˆ d˚ ( ML (x+.,t), ML (y+.,s))(0) n

g(y, s)dyds.

(17)

˝Tmc (xs ,t)



|g(x + z, t) − g(x + z, t − 1)|dz.

(16)

G

Note that ␣ reflects motion changes in not only the local support region, but also the global support region. If ∝ is smaller than a predefined threshold T, the region to be filtered is classified as exhibiting medium motion. Medium motion includes both static and slow motion behavior. On the other hand, if ␣ is larger than the threshold, the region is classified as containing complex motion, which includes both global and fast motion behavior. We regard ␣ as indicating global motion changes in the patient’s body or in the X-ray equipment, along with several local movements in the entire setup, including movement of the organs or medical devices. Fig. 3 shows a complex motion region acquired using ␣. Fig. 3(a) shows one frame of an X-ray fluoroscopic image sequence capturing a chest phantom, which includes moving objects. In this image, the medical device (a simulated moving agent) and some objects are in motion, so as to represent a realistic clinical situation. Fig. 3(b) shows the regions of complex motion obtained using ␣. When ␣ is larger than T, the motion-state region is regarded as being of complex motion type and is well detected, as can be seen in Fig. 3. Following the motion state detection, if the region is classified as a medium motion state, we perform 3D NLM filtering along the motion-compensated spatio-temporal support. On the other hand, if the region corresponds to a complex motion state, we perform 2D NLM filtering and 1D temporal filtering along the motion trajectory, so as to prevent motion artifacts.

The time and space dimensions of the filter correspond to a motion-compensated 3D search window (˝Smc , ˝T mc), where ˝Smc is a 2D spatial search window around motion-compensated x and ˝Tmc is a 1D temporal search window along the estimated motion direction, as shown in Fig. 2. Further, N(x, t)is the normalizing factor, as given by





N(x, t) =

w(x, s)e ˝Smc (x,t)

ˆ ˆ d˚ ( ML (x+.,t), ML (y+.,s))(0) n

dyds.(18)

˝Tmc (xs ,t)

In subsection 3.2, we classified the motion state into regions of medium or complex motion. In the medium-motion-state regions, we perform 3D NLM filtering along the spatio-temporal support using Eq. (15) above, as shown in Fig. 2(a). On the other hand, within the complex motion-state region, we perform 2D NLM filtering and 1D temporal filtering to prevent the motion artifacts, as shown in Fig. 2(b). The framework of the filter for the complex motion state is given by



fˆ3DPNLM (x, t) =

 +

1 { N(x, t)

e?

ˆ ˆ d˚ ( ML (x+.,t), ML (y+.,s))(0) n

g(y, t)dy

˝Smc (x,t)

w(x, s)g(y, s)ds}.

(19)

˝Tmc (xs ,t)

This filter combines a purely spatial NLM filter and a purely temporal filter for the complex motion state. As previously, N(x, t)is the

M.S. Lee et al. / Biomedical Signal Processing and Control 38 (2017) 74–85

normalizing factor, here given by



N(x, t) =

e

ˆ ˆ d˚ ( ML (x+.,t), ML (y+.,s))(0) n

˝Smc (x,t)

 dy +

w(x, s)ds. ˝Tmc (xs ,t)

(20)

In a complex motion region, the temporal filtering can produce errors that degrade the filtered result. Thus, if complex motion appears in the X-ray fluoroscopic image sequence, the application of 3D NLM filtering with spatio-temporal support may cause some difficulties. One problem is that blurred artifacts will occur in response to any fast or large motion. This means that x in t will be blurred, because of the large contribution from y in s. In addition, the computational burden will increase, because of the need to find all possible similar patches in the 3D support. To overcome these problems, we separate the spatial and temporal filters in complex-motion-state regions. To preserve the motion details of medical devices or organs, we consider a motion-adaptive weight for the 3D NLM filter, defined as w(x, s) =

s−2 (x, s)

N

,

−2 t−k (x, s) + n−2

1 s = n(˝C )



2 (|g(x + z, t) − g(x + z + mv, s)| − ) dz,



(22)

˝C

|g(x + z, t) − g(x + z + mv, s)| dz,

Table 2 SNR comparison of different ROIs in Fig. 5 for real X-ray fluoroscopic image sequence (346 nGy/pulse).

ROI 1 ROI 2 ROI 3 ROI 4

where w(x, s) reflects the correlation between s at time t, and the corresponding s is the neighboring frame of the current frame t, where s = t − 1, t − 2, .., t − N. This weight reflects the frame similarity through the motion trajectories, which allows greater weight to be placed on patches of a similar frame. Here, s2 (x, s) is expressed as 1 n(˝C )

¨ Fig. 4. Chest phantom (Multipurpose Chest Phantom N1 “LUNGMAN,Kyoto Kagaku).

(21)

k=1

s 2 =

79

(23)

˝C

where  s 2 is the variance of the absolute frame difference in the local support ˝C along the motion trajectories. The w(x, s) value reflects the correlation between t and s in the motion-compensated support. The larger the motion that occurs in s, the smaller the weight of s, and vice versa. That is, the proposed 3D NLM filter uses w(x, s) to find similar patches in the spatio-temporal support according to the temporal correlation. 4. Experimental results The numerical and visual performance of the proposed scheme was compared experimentally with two recently presented denoising methods for video data: 3D NLM [4] and the Anscombe transform [9] +VBM3D [6]. The image data for the evaluation were acquired using a clinical angiography prototype system and a chest phantom (Multipurpose Chest Phantom N1 “LUNG¨ MAN,Kyoto Kagaku), yielding life-like radiographs very close to actual clinical images, as shown in Fig. 4. The size of each digital image was 1024 by 768 pixels, while the image intensity had 16-bit precision. The fluoroscope system was set to different peak kilovoltage and current values so as to acquire image sequences with various low-dose levels. Thus, different image sequences were acquired under three different radiation exposures: 345, 452, and 560 nGy/pulse. The highest noise level was obtained for the 345nGy/pulse exposure. Various experimental setups were also employed in order to examine different clinical situations involving motion, such as the movement of catheters or organs. In those experiments, the phantom was placed on a movable platform that can simulate

Noisy

3D NLM

Anscombe + VBM3D

Proposed

2.5 3.2 3.5 4.9

3.4 6.3 5.7 5

5.2 8.1 6.3 10

7.9 9.3 6.3 12.2

the deformable motion encountered in a clinical situation. Note that the motion encountered during real angiography is generally deformable, and the proposed algorithm should work well in such cases. Fig. 5 shows sample real low-dose X-ray fluoroscopic image sequences. One set incorporates complex motion, which was simulated using the phantom on the moving platform [Fig. 5(a), (c), (e)]. Thus, the examined motion includes not only local movement of the medical devices, but also global movement of the patient’s body or the X-ray system. The other set shows a medium motion example. Here, the phantom was imaged on a static platform, while certain individual objects were in motion [Fig. 5(b), (d), (f)]. This example represents local movement of the catheter or guide wire on the static phantom. In the proposed algorithm, the low-dose X-ray sequences under complex motion conditions were reconstructed using motion state detection and motion-adaptive weight. Motion state detection was performed in order to find complex motion, for which we assumed the motion type. The complex motion state was determined considering simultaneous movement of the patient’s body and medical devices. In our experiments, we performed the motion state detection considering global and local motion for each experimental condition (dose exposure level). Using these detection values, the motion-compensated supports varied adaptively under the different experimental conditions. 4.1. Objective performance analysis To evaluate the effectiveness of the presented algorithm quantitatively, the SNRs of different regions of interest (ROIs) were calculated. The ROIs are shown in Fig. 5, illustrated as white square blocks. The SNR is defined as SNR =

|s − b | , s

(24)

where s is the mean value of the signal, b is the mean value of the background (a uniform region near the selected ROI), and  s is the standard deviation of the signal. Four regions were chosen, in which the SNRs at three different dose levels were calculated, and the results are presented in Tables 2–4. The support size and the patch size in the 3D NLM [4] and the proposed method were 11 × 11

80

M.S. Lee et al. / Biomedical Signal Processing and Control 38 (2017) 74–85

Fig. 5. Real X-ray fluoroscopy image sequences of chest phantom and moving objects. (a) Complex and (b) medium motion scenes with 346-nGy/pulse dose exposure; (c) complex and (d) medium motion scenes with 452-nGy/pulse dose exposure; and (e) complex and (f) medium motion scenes with 562-nGy/pulse dose exposure.

M.S. Lee et al. / Biomedical Signal Processing and Control 38 (2017) 74–85

81

Fig. 6. Denoised X-ray fluoroscopic image sequences of complex motion scene (row) and medium motion scene (column) for 346-nGy/pulse exposure. The denoised results were obtained using the: (a, b) 3D NLM [4]; (c, d) Anscombe transform [9] +VBM3D [6]; and (e, f) proposed algorithm.

82

M.S. Lee et al. / Biomedical Signal Processing and Control 38 (2017) 74–85

Fig. 7. Partially magnified results of Fig. 6 (Exposure: 346 nGy/pulse). The denoised results were obtained using the: (a) 3D NLM [4]; (b) Anscombe transform [9] +VBM3D [6]; and (c) proposed algorithm.

Table 3 SNR comparison of different ROIs in Fig. 5 for real X-ray fluoroscopic image sequence (452 nGy/pulse).

ROI 1 ROI 2 ROI 3 ROI 4

Noisy

3D NLM

Anscombe + VBM3D

Proposed

12.6 3.1 11.7 3.8

19 5.6 20.2 4.8

25.6 8.4 23.3 9.7

26 9.5 23.5 8.6

Medium motion Complex motion

Noisy

3D NLM

Anscombe + VBM3D

Proposed

3.5 0.9 2 4

5.1 1 5.4 5.7

6 1.4 6.8 6.3

10 1.1 6.7 6.3

Medium motion Complex motion

and 5 × 5 respectively, and five frames were used in the temporal domain. For the 3D NLM filter, the optimum results were achieved using h = 2.0, 1.8, and 1.5 for the 345-, 452-, and 560-nGy/pulse exposures, respectively. The VBM3D [6] parameter  was set to 50, 25, and 10 for 345-, 452-, and 560-nGy/pulse exposures, respectively. In general, our method outperformed the other approaches in terms of the SNR values. However, the SNR values in the ROIs do not have a sufficiently strong relationship with the actual edge and motion detail preservation performance. To evaluate the edge preservation, the structural similarity (SSIM) [26] index is used, which is defined as (2x y + C1 )(2xy + C2 ) (2x + 2y + C1 )(x2 + y2 + C2 )

,

Anscombe + VBM3D

Proposed

0.9038 0.8245

0.9124 0.8988

0.9495 0.9498

3D NLM

Anscombe + VBM3D

Proposed

0.9013 0.8564

0.9268 0.9001

0.9301 0.9030

Table 7 SSIM comparison for real X-ray fluoroscopic image sequence (562 nGy/pulse).

Medium motion Complex motion

SSIM =

3D NLM

Table 6 SSIM comparison for real X-ray fluoroscopic image sequence (452 nGy/pulse).

Table 4 SNR comparison of different ROIs in Fig. 5 for real X-ray fluoroscopic image sequence (562 nGy/pulse).

ROI 1 ROI 2 ROI 3 ROI 4

Table 5 SSIM comparison for real X-ray fluoroscopic image sequence (346 nGy/pulse).

(25)

where x and y represent the original and denoised signals, respectively. Further, xm and ym correspond to the image contents of the m th windows of the image. (x , y ) and (x , y ) represent the mean and standard deviation of each pixel over an 11 × 11-pixel Gaussian window, respectively. Thus, xy represents the covariance value of each pixel over an 11 × 11-pixel Gaussian window on the origi-

3D NLM

Anscombe + VBM3D

Proposed

0.9033 0.8919

0.9496 0.9004

0.9559 0.9120

nal and denoised images. C1 and C2 are variables used to stabilize division with a weak denominator. In order to evaluate the overall image quality, a mean structural similarity (MSSIM) index was employed [26]: 1 SSIM(xm , ym ), M M

MSSIM =

(26)

m=1

where M represents the number of local windows of the image. The MSSIM index yields a decimal value between −1 and 1, and 1 is only reachable in the case of two identical data sets. Complex and medium motion scenarios were chosen. For different motion scenarios, the SSIMs at three dose levels were calculated, and the results are presented in Tables 5–7. In general, our method outperformed the other approaches in terms of SSIM measurements. Further, the performance in terms of the visual aspects is more important for clinical or medical use. Thus, the efficacy of the proposed method can be better evaluated by considering the visual results.

M.S. Lee et al. / Biomedical Signal Processing and Control 38 (2017) 74–85

83

Fig. 8. Denoised X-ray fluoroscopic image sequences of complex motion scene (row) and medium motion scene (column) for 562-nGy/pulse exposure. The denoised results were obtained using the: (a, b) 3D NLM [4]; (c, d) Anscombe transform [9] +VBM3D [6]; and (e, f) proposed algorithm.

84

M.S. Lee et al. / Biomedical Signal Processing and Control 38 (2017) 74–85

Fig. 9. Partially magnified results of Fig. 8 (Exposure: 562 nGy/pulse). The denoised results were obtained using the: (a) 3D NLM [4]; (b) Anscombe transform [9] +VBM3D [6]; and (c) proposed algorithm.

4.2. Visual performance analysis

5. Conclusion

Experiments involving both complex and medium motion were again performed for the three different dose exposures. The denoised results obtained for the radiation exposure level of 345 nGy/pulse are shown in Figs. 6 and 7, while those obtained for the 560-nGy/pulse exposure are shown in Figs. 8 and 9. It is apparent from Figs. 6–9 that the visual results yielded by the proposed method are superior to those of the other methods, for both complex and medium motion. It is also apparent that the 3D NLM method [4] does not provide images of sufficient quality in either case, as the majority of the edges and details are degraded and motion artifacts remain in the areas containing movement. In contrast, the VBM3D [6] with Anscombe transform [9] yields significantly better results for both complex and medium motion, preserving details and moving objects well. However, the images are of visibly lower quality than those yielded by the proposed method, for which the majority of the fine details are well preserved and the flat regions are cleaner. Figs. 7 and 9 show partially magnified denoised results, demonstrating the superior visual quality of the images obtained using the proposed method for different exposure doses. With the proposed method, noise is suppressed to a greater extent and the motion details are preserved more effectively than in the images yielded by the other methods. Thus, the experimental analysis demonstrates the subjective and objective efficacy of the proposed framework for the denoising of a real, low-dose X-ray fluoroscopic image sequence. The processing time of the proposed algorithm when tested on 1024 × 768-resolution X-ray images was approximately 1.7s. For 500 × 500-resolution X-ray images, a processing time of approximately 0.43 s was required. The proposed algorithm was programmed in the C + + language and implemented in MS Visual Studio 2015 for MS Windows 10. The test computer employed an Intel i5-4670k (3.4 GHz) central processing unit (CPU) with 8-GB RAM. The code used to implement the proposed method was programmed to produce the optimal performance. Note that, if unnecessary and duplicate calculating portions of code are removed and the code is optimized, these processing times could be reduced.

In this paper, a motion-adaptive 3D NLM denoising filter based on stochastic distance has been proposed as a means of reducing noise in low-dose X-ray fluoroscopic images. In this approach, the stochastic distance is used for Poisson noise reduction, and can be applied to an X-ray fluoroscopic image sequence via extension to the spatio-temporal domain. The proposed denoising method performed well in application to an X-ray fluoroscopic image sequence containing not only medium motion (local motion from medical devices, such as the agent and guidewire), but also complex motion (global motion), through use of motion state detection and motionadaptive weight. The proposed method was applied in experiment, and the motion details were well preserved without the appearance of motion artifacts during spatio-temporal filtering. When compared with conventional denoising filters, the proposed scheme exhibits excellent denoising performance, with detailed preservation of the moving regions and retention of the fine structures in both complex and medium motion scenarios of real low-dose X-ray fluoroscopic image sequences. Funding This work was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning [grant number 2015R1A2A1A14000912]. The funding source had no role in the study design; in the collection, analysis and interpretation of data; in the writing of the report; or in the decision to submit the article for publication. Conflict of interest The authors declare that they have no conflict of interest. Acknowledgment None.

M.S. Lee et al. / Biomedical Signal Processing and Control 38 (2017) 74–85

References [1] H.B. Yin, X.Z. Fang, Z. Wei, X.K. Yang, An improved motion-compensated 3-D LLMMSE filter with spatio-temporal adaptive filtering support, IEEE Trans. Circuits Syst. Video Technol. 17 (12) (2007) 1714–1727. [2] L. Guo, O.C. Au, M. Ma, Z. Liang, Temporal video denoising based on multihypothesis motion compensation, IEEE Trans. Circuits Syst. Video Technol. 17 (12) (2007) 1423–1429. [3] J. Dai, O.C. Au, F. Zou, C. Pang, Generalized multihypothesis motion compensated filter for grayscale and color video denoising, Signal Process. 93 (1) (2013) 70–85. [4] A. Buades, B. Coll, J. Morel, Nonlocal image and movie denoising, Int. J. Comput. Vis. 76 (2) (2008) 123–139. [5] J. Boulanger, Space-time adaptation for patch-based image sequence restoration, IEEE Trans. Pattern Anal. Mach. Intell. 29 (6) (2007) 1096–1102. [6] K. Dabov, A. Foi, K. Egiazarian, Video denoising by sparse 3D transform-domain collaborative filtering, in: Proc. 15th European Signal Process. Conf., EUSIPCO, 2007, pp. 145–149. ´ Patch-based video denoising with [7] A. Buades, J.L. Lisani, M. Miladinovic, optical flow estimation, IEEE Trans. Image Process. 25 (6) (2016) 2573–2586. [8] K. Dabov, A. Foi, V. Katkovnik, K. Egiazarian, Image denoising by sparse 3D transform-domain collaborative filtering, IEEE Trans. Image Process. 16 (8) (2007) 2080–2095. [9] M. Mäkitalo, A. Foi, Optimal inversion of the generalized Anscombe transformation for Poisson-Gaussian noise, IEEE Trans. Image Process. 22 (1) (2013) 91–103. [10] M. Mäkitalo, A. Foi, Optimal inversion of the Anscombe transformation in low-count Poisson image denoising, IEEE Trans. Image Process. 20 (1) (2010) 99–109. [11] C.A. Deledalle, F. Tupin, L. Denis, Poisson NL means: unsupervised non local means for Poisson noise, in: IEEE 17th Int. Conf. Image Processi., Hong Kong, 2010, pp. 801–804. [12] J. Salmon, Z. Harmany, C. Deledalle, R. Willett, Poisson noise reduction with non-local PCA, J. Math. Imaging Vis. 48 (2) (2014) 279–294. [13] A.D.C. Nascimento, R.J. Cintra, A.C. Frery, Hypothesis testing in speckled data with stochastic distances, IEEE Trans. Geosci. Remote Sens. 48 (1) (2010) 373–385.

85

[14] A.A. Bindilatti, N.D.A. Mascarenhas, A nonlocal poisson denoising algorithm based on stochastic distances, IEEE Signal Process. Lett. 20 (11) (2013) 1010–1013. [15] A. Buades, B. Coll, J.M. Morel, A review of image denoising algorithms, with a new one, SIAM J, Multiscale Model. Simul. 4 (2) (2005) 490–530. [16] I. Rodrigues, J. Sanches, J. Bioucas-Dias, Denoising of medical images corrupted by Poisson noise, IEEE 15th Int Conf. Image Process. (2008) 1756–1759. [17] T. Cerciello, P. Bifulco, M. Cesarelli, A. Fratini, A comparison of denoising methods for X-ray fluoroscopic images, Biomed. Signal Process. Control 7 (6) (2012) 550–559. [18] M. Tomica, S. Loncaricb, D. Sersic, Adaptive spatio-temporal denoising of fluoroscopic X-ray sequences, Biomed. Signal Process. Control 7 (2) (2012) 173–179. [19] J. Wang, L. Zhu, L. Xing, Noise reduction in low-dose X-ray fluoroscopy for image-guided radiation therapy, Int. J. Radiat. Biol. 74 (2) (2009) 637–643. [20] C.L. Chan, A.K. Katsaggelos, A.V. Sahakian, Image sequence filtering in quantum-limited noise with applications to low-dose fluoroscopy, IEEE Trans. Med. Imaging 12 (3) (1993) 610–621. [21] R. Aufrichtig, D. Wilson, X-ray fluoroscopy spatio-temporal filtering with object detection, IEEE Trans. Med. Imaging 14 (4) (1995) 733–746. [22] M. Cesarelli, P. Bifulco, T. Cerciello, M. Romano, L. Paura, X-ray fluoroscopy noise modeling for filter design, Int. J. Comput. Assist. Radiol. Surg. 8 (2) (2013) 269–278. [23] N. Miyamoto, M. Ishikawa, K. Sutherland, R. Suzuki, T. Matsuura, C. Toramatsu, S. Takao, H. Nihongi, S. Shimizu, K. Umegaki, H. Shirato, A motion-compensated image filter for low-dose fluoroscopy in a real-time tumor-tracking radiotherapy system, J. Radiat. Res. 56 (1) (2015) 186–196. [24] C. Amiot, C. Girard, J. Chanussot, J. Pescatore, M. Desvignes, Spatio-temporal multiscale denoising of fluoroscopic sequence, IEEE Trans. Med. Imaging 35 (6) (2016) 1565–1574. [25] C. Kervrann, C.Ó.S. Sorzano, S.T. Acton, J.C. Olivo-Marin, M. Unser, A guided tour of selected image processing and analysis methods for fluorescence and electron microscopy, IEEE J. Sel. Topics Signal Process. 10 (1) (2016) 6–30. [26] Z. Wang, A.C. Bovik, H.R. Sheikh, E.P. Simoncelli, Image quality assessment: from error measurement to structural similarity, IEEE Trans. Image Process. 13 (2004) 600–612.