Classification of breast lesions using quantitative ultrasound biomarkers

Biomedical Signal Processing and Control 57 (2020) 101786 Contents lists available at ScienceDirect Biomedical Signal Processing and Control journal...

Download PDF

2MB Sizes 0 Downloads 36 Views

Report

PDF Reader
Full Text

Biomedical Signal Processing and Control 57 (2020) 101786

Contents lists available at ScienceDirect

Biomedical Signal Processing and Control journal homepage: www.elsevier.com/locate/bspc

Classiﬁcation of breast lesions using quantitative ultrasound biomarkers Navid Ibtehaj Nizam a , Sharmin R. Ara b , Md. Kamrul Hasan a,∗ a b

Department of Electrical and Electronic Engineering, Bangladesh University of Engineering and Technology, Dhaka-1205, Bangladesh Department of Electrical and Electronic Engineering, East West University, Dhaka-1212, Bangladesh

a r t i c l e

i n f o

Article history: Received 28 May 2019 Received in revised form 20 October 2019 Accepted 16 November 2019 Keywords: Quantitative ultrasound Effective scatterer diameter Mean scatterer spacing Non-invasive biomarker Tissue characterization Computer-aided diagnosis

a b s t r a c t Quantitative ultrasound (QUS) parameters are gaining attention recently as non-invasive biomarkers for soft tissue characterization. In this work, we use two QUS parameters, namely effective scatterer diameter (ESD) and mean scatterer spacing (MSS), for binary classiﬁcation of breast lesions. To produce improved classiﬁcation results, we propose a modiﬁed frequency-domain technique for ESD estimation of breast tissues from the diffuse component of backscattered radio-frequency (RF) data. Ensemble empirical mode decomposition (EEMD) is performed to separate the diffuse component from the coherent component of the backscattered RF data. A non-parametric Kolmogorov–Smirnov (K-S) test is employed for automatic selection of the intrinsic mode functions (IMFs). A novel multi-step system effect minimization process is also used. The ESD is estimated using a novel nearest neighborhood average regression line ﬁtting (NNARLF) algorithm. Furthermore, we use an ameliorated EEMD domain autoregressive (AR) spectral estimation technique for MSS estimation. On using the ESD for binary classiﬁcation of 139 lesions, we obtain high sensitivity, speciﬁcity, accuracy values of 95.45%, 95.79%, and 95.68%, respectively. The area under the receiver operating characteristics (ROC) curve is 0.95. On combining ESD with MSS we obtain even more improved sensitivity, speciﬁcity, and accuracy values of 97.73%, 95.79%, and 96.40%, respectively. The area under the ROC also increases to 0.97. This high classiﬁcation performance demonstrates the potential of the use of these QUS parameters as non-invasive biomarkers for breast cancer detection. © 2019 Elsevier Ltd. All rights reserved.

1. Introduction Quantitative ultrasound (QUS) based parameters have been employed for the characterization and classiﬁcation of breast tissues [1,2]. Such classiﬁcation provides some inherent advantages compared to diagnosis based on other modalities (such as mammography) due to the non-ionizing and non-invasive nature of ultrasound and a lower cost of operation [3]. Also, QUS allows the development of a diagnostic modality that is less subjective to operator settings and interpreter variability compared to conventional ultrasound imaging [1,2,4]. QUS based tissue characterization works on the principle that diseases alter the acoustic scattering properties of tissues [5]. Effective scatterer diameter (ESD) and mean scatterer spacing (MSS) are popular QUS parameters for quantiﬁcation of tissue microstructure [2]. These parameters are not visible from ultrasound B-mode or ultrasound elastography images but are estimable

∗ Corresponding author. E-mail address: [email protected] (Md.K. Hasan). https://doi.org/10.1016/j.bspc.2019.101786 1746-8094/© 2019 Elsevier Ltd. All rights reserved.

from histopathology slides using high-powered microscopes [1]. The estimated ESD values of several different types of tissues have been reported in the literature. These include eyes [6], liver [7], renal tissues [8], glomerular tissues [9], kidney [10], prostate [11], human aortae [12], and the uveal melanomas [13]. ESD values of female breast tissues have also been recently reported in the literature [2,14]. In these works, ESD has been combined with other QUS parameters for breast lesion classiﬁcation and characterization. In [2], ESD was combined with MSS and effective acoustic concentration (EAC) for breast tumor grading. In [15], a parametric image based approach, involving both ESD and MSS, was successful in binary (benign-malignant) classiﬁcation of breast lesions. A very recent work in [14] used ESD along with six other QUS parameters for binary classiﬁcation of breast lesions. But, both [14] and [15] employed limited datasets of 43 and 78 lesions, respectively. Our group, previously employed MSS for breast lesion classiﬁcation on a very small dataset of 24 lesions [16]. However, we used a signiﬁcantly larger dataset of 201 lesions for classiﬁcation of breast lesions in [17] using bi-modal parameters extracted from ultrasound B-mode and ultrasound elastography images.

2

N.I. Nizam, S.R. Ara and Md.K. Hasan / Biomedical Signal Processing and Control 57 (2020) 101786

Robust schemes for estimation of QUS parameters is essential for their successful application as non-invasive biomarkers for breast cancer detection. ESD estimation techniques proposed previously often employ a Gaussian form factor to model tissue scattering [14,18,19]. Moreover, it has been established that the backscattered RF data consists of a coherent component and a diffuse component [20]. In [21], it has been reported that, for the estimation of ESD, the coherent component behaves as an interference, and hence, needs to be suppressed. However, for using a signal decomposition technique, the coherent and diffuse components need to be in additive form [16]. To achieve this, system effects like the impact of the system point spread function (PSF) as well as the impacts of diffraction and other acquisition noises must be removed [16,22–24]. Furthermore, accurate estimation of QUS parameters require an accurate modeling of tissue pathology. Most of the previously proposed ESD estimation techniques fail to take care of one or more of these requirements [22]. A technique that relies on the minimization of the average squared deviation (MASD) between the theoretical and measured power spectra using a Gaussian form factor model has been proposed in [18]. However, this method does not employ any signal decomposition technique. A frequency-domain technique that employs a Gaussian form factor-based theoretical power spectrum has been proposed in [19]. This method also does not use any signal decomposition technique. A similar shortcoming is present in [14], which employs a Gaussian form factor based approach developed in [25]. Also, these techniques use large 2-D spatial signal blocks to generate a power spectrum [18,19]. This is often an unrealistic approach because the tissue pathology inside a large spatial region must be considered uniform, not taking into account the heterogeneity present in tissues. In some recent techniques, the generalized spectrum, Rayleigh envelope statistics [26], and Hanning tapers [27] have been used to separate the diffuse echoes from the backscattered data. But, these techniques also fail to consider the heterogeneity present in tissues. What is more, all these techniques rely on a single-step reference phantom based system effect minimization technique. However, no attempt has been made to investigate whether this single step process indeed adequately removes the system effects. In case of MSS estimation, traditional techniques have relied on the autoregressive (AR) spectrum [20] or the AR cepstrum [1]. Recently, MSS estimation has also moved towards signal decomposition techniques like the singular spectral analysis (SSA) [28]. We proposed an MSS estimation technique in [16] which combined EEMD with the AR spectrum. This technique has been shown to outperform most of the existing techniques for MSS estimation [16]. In this paper, we aim to develop a reliable technique for binary classiﬁcation of breast lesions using ESD and MSS. To achieve this, we propose a novel ESD estimation technique, which tries to overcome the shortcomings of the existing ESD estimation techniques. The salient features of our proposed algorithm are discussed below:

• We use ensemble empirical mode decomposition (EEMD) for signal decomposition, like the technique proposed in [16] for MSS estimation. However, in [16], the IMFs had been selected using an empirical criterion. In this paper, a completely automatic scheme has been employed, using the K-S test [29], to select the IMFs responsible for diffuse scattering. • To successfully employ the signal decomposition technique, we adopt a novel multi-step system effect minimization approach. The ﬁrst two steps involve bandpass ﬁltering and deconvolution of the backscattered RF data and the ﬁnal step is normalization using a reference tissue-mimicking phantom (TMP). • The proposed technique is based on a frequency domain Gaussian form factor model [19]. However, the NNARLF algorithm is

incorporated into the Gaussian form factor model, for a better representation of tissue structure and pathology. We also ameliorate our previously proposed EEMD domain autoreggresive (AR) spectral estimation based MSS estimation technique [16] using features of the ESD estimation algorithm. MSS is combined with ESD for binary classiﬁcation on the same dataset. The motivation behind combining ESD with MSS is that the MSS can be estimated using a scheme similar to the proposed ESD estimation technique. In fact, the use of a multi-step system effect minimization approach should lead to more reliable MSS estimates. Also, the use of multiple QUS parameters is a common approach for ultrasonic tissue characterization [2,14,17]. However, throughout this paper, a greater focus is placed on the technique for ESD estimation. This is because, as already discussed, the proposed ESD estimation technique introduces several novel features. Thus, numerical results are produced to highlight the important features of the proposed ESD estimation technique. Therefore, the contributions of this work are two-fold. Firstly, we propose an ESD and MSS based binary classiﬁcation scheme using a larger dataset than other works in the literature that employ ESD and/or MSS for classiﬁcation. Secondly, we propose a new ESD estimation technique and also, an ameliorated MSS estimation scheme for improved breast lesion classiﬁcation.

2. Materials and method 2.1. Patient data The in vivo data used in this paper have been obtained at the Bangladesh University of Engineering and Technology (BUET) Medical Center with the help of a SonixTOUCH Research (Ultrasonix Medical Corporation, Richmond BC, Canada) scanner. The scanner is integrated with an L14-5/38 linear probe. The probe was operating at 10 MHz with 65% bandwidth (−6-dB bandwidth) at a sampling rate of 40 MHz. The pulse length was approximately 0.4 mm and the beam width was approximately 2.2 mm. The pulse length has been estimated from the emitted pulse using the multiple input–output inverse theorem [30]. The study has been carried out on female subjects, with their prior written consent, and the study was approved by the Institutional Review Board (IRB) of BUET. From the female subjects, 139 pathological RF data have been recorded. The details of the 139 pathological RF data, that have been used for classiﬁcation, are summarized in Table 1. The age range of all patients was 13–75 years (mean: 35.27 years). The patients having masses underwent ﬁne-needle aspiration cytology (FNAC) and/or excision biopsy according to the suggestion of their physicians. All patients having FNAC diagnosis positive for malignancy underwent surgery. Therefore, diagnoses of malignant and some benign lesions were conﬁrmed by histopathology, and diagnoses of the remaining lesions, by cytopathology.

2.2. TMP data Three TMPs have been used in our work, both as reference phantoms and for validation of our proposed ESD estimation algorithm. The TMP is made of zerdine, which exhibits echogenic patterns similar to those obtained from soft tissues. The speed of sound in the TMP is around 1540 m/s. The TMP data was acquired using the same transducer settings as the in vivo data. We refer to the TMP datasets as TMP A, B, and C. The actual average ESD, as supplied by the manufacturer, i.e., CIRS, are used as gold standards for performance evaluation of the ESD estimators. The description of the TMP datasets is presented in Table 2.

N.I. Nizam, S.R. Ara and Md.K. Hasan / Biomedical Signal Processing and Control 57 (2020) 101786

3

Table 1 Patient data summary. Tissue type

No. of RF data

Mean age ± SD

Method of conﬁrmation

Lesion size mean (mm × mm) ± SD (mm × mm)

Malignant Fibroadenoma Inﬂammation

44 71 24

44.91 ± 9.81 27.48 ± 9.13 35.43 ± 12.71

Biopsy FNAC/Biopsy FNAC/Biopsy

21.32 × 19.14 ± 5.94 × 7.36 19.37 × 12.09 ± 7.93 × 4.67 21.53 × 13.15 ± 8.39 × 4.39

Table 2 Description of experimental TMPs. TMP dataset

Description

Average ESD of inclusion (␮m)

Average ESD of background (␮m)

Size (cm2 )

A B C

Homogeneous Homogeneous Heterogenous

– – 70 (spherical)

45 45 45

3×4 3×4 4.5 × 4 (inclusion dia = 1.4 cm)

Table 3 Division of breast lesion data into ﬁve groups for training-testing. Group no.

Total (= Malignant + Fibro. + Inﬂam.)

1 2 3 4 5 Total

29 (= 9 + 15 + 5) 28 (= 9 + 14 + 5) 28 (= 9 + 14 + 5) 28 (= 9 + 14 + 5) 26 (= 8 + 14 + 4) 139 (= 44 + 71 + 24)

2.3. Binary classiﬁcation of breast lesions The binary classiﬁcation performances of ESD, and the combination of ESD and MSS (both normalized) are evaluated on 139 breast lesions. We use support vector machine (SVM), K-nearest neighbor (KNN), linear discriminant analysis (LDA), multinomial logistic regression (MNR), and Naïve Bayes (NB) classiﬁers. The best classiﬁcation performance obtained is reported in this work. Here, we used the commonly employed “one-versus-all” or OVA based classiﬁcation technique [31]. The total dataset is randomly divided into ﬁve groups for training and testing of the characterization indices as in [31]. First, group 1 is used as the test set and the remaining four groups are used as the training set for each class. Next, group 2 is used as the test set and the remaining four groups are used as the training set. The process is repeated until the testing of each group is completed. Details on the grouping for the binary classiﬁcation are provided in Table 3. In the case of support vector machine (SVM) based classiﬁer, the two feature vectors, i.e., ESD and MSS, have been given equal weight for the multi-layer perceptron network. Similarly, for the Naïve Bayes, KNN, MNR and LDA based classiﬁers, equal observation weights are used for MSS and ESD. The classiﬁcation results include true positive (TP), true negative (TN), false positive (FP), false negative (FN), sensitivity, speciﬁcity, accuracy, positive predictive value (PPV), negative predictive value (NPV), the sum of the last ﬁve parameters, Sum5 , Matthew’s correlation coefﬁcient (MCC), and area under the receiver operating characteristics curve (AUC). From TP, TN, FP, and FN, we calculate the sensitivity, speciﬁcity, accuracy, PPV, NPV and MCC [32–34]. Higher values of sensitivity, speciﬁcity, accuracy, PPV, and NPV indicate better classiﬁcation performance. For MCC, +1 indicates a perfect prediction, 0 indicates a uniform random prediction, and −1 indicates an inverse prediction [33]. 2.4. Proposed ESD estimation technique A block diagram of our proposed ESD estimation algorithm is shown in Fig. 1. The proposed technique employs a frequencydomain Gaussian form factor model in the EEMD domain. The NNARLF algorithm is incorporated into the Gaussian form factor model. Additionally, the proposed technique uses a multi-step system effect minimization technique. Here, ﬁltering and deconvo-

lution serve as both preprocessing and system effect minimization steps. They are used to reduce the impact of diffraction and system PSF, respectively. In addition to these steps, normalization using a reference TMP is carried out for the reduction of residual system effects. Next, we discuss the different steps of the proposed algorithm in greater detail. The list of the important symbols and acronyms used in this paper are presented in Table 13, at the end of the paper, for convenience. 2.4.1. Preprocessing At ﬁrst, 2-D rectangular regions of interest (ROI) are selected from the calculated B-mode images obtained from the recorded RF data. Any lesion present had been pre-identiﬁed on the machine B-mode image by the radiologist, who acquired the data. A suitable 2-D rectangular region within the border of the lesion is taken as the ROI. Additionally, a second ROI is selected outside the lesion to compare the ESD values inside and outside the lesion. This has been illustrated schematically in the block diagram of Fig. 1. The dimension of each ROI is approximately (10–12) × (6.25–9.40) mm2 . This approximately equals to 25 pulse lengths axially and 4 beam widths laterally. The choice of such an ROI size has been previously shown to improve the accuracy of QUS parameter estimates [35]. Though ESD estimation from in vivo data uses only one ROI, for estimating the average ESD from TMP datasets A and B, 25 ROIs are selected. For TMP dataset C, 10 ROIs are selected from outside the inclusion and 5 ROIs are selected within the inclusion. In TMP dataset C, we have also selected 5 heterogeneous ROIs across the border of the inclusion such that the ROIs encompass both the inclusion and the background. Each TMP ROI is of dimension 10 × 10 mm2 . The size of the TMP ROIs, like the in vivo data ROIs, amounts to approximately 25 pulse lengths axially and 4 beam widths laterally. Next, the RF data is deconvolved using a blind multi-channel least-mean squares algorithm previously proposed in [36]. The technique in [36] requires no prior knowledge of the PSF. It employs an l1 -norm based cost function with a damped variable step-size, which helps mitigate the impacts of noise. This technique has been shown to outperform most conventional deconvolution techniques [37]. The technique is, therefore, well suited for our work, to reduce the impact of the system PSF. After deconvolution, a bandpass ﬁlter of frequency range 2–13 MHz is applied to the data. It has been established in [23] and very recently, in [24], that the diffraction effect is most prominent in the low-frequency region of the spectrum (below 2 MHz). It is to be noted that deconvolution shifts the backscattered signal from being centered at the pulse frequency (10 MHz) to lower frequencies. A upper cut-off frequency of 13 MHz is used to eliminate any high frequency acquisition noises that may be present [24]. 2.4.2. Coherent component suppression using EEMD The backscattered RF data consists of two major components, a coherent component, and a diffuse component [20]. The diffuse

4

N.I. Nizam, S.R. Ara and Md.K. Hasan / Biomedical Signal Processing and Control 57 (2020) 101786

Fig. 1. A block diagram illustrating the proposed ESD estimation technique. NNARLF refers to nearest neighborhood average regression line ﬁtting.

component represents scattering from aperiodic scatterers. Accurate estimation of ESD requires the suppression of the coherent component [26]. This is because, the coherent component has been shown to increase the bias and variance of ESD estimates [26]. Therefore, on removal of the system PSF, diffraction and other acquisition noises, only the coherent and the diffuse components remain in additive form and hence, can be separated by a suitable signal decomposition technique [16]. Empirical mode decomposition (EMD) decomposes a signal into its IMFs in additive form [38]. The main advantage of EMD over other data-driven approaches is that it requires no pre-selection of basis functions and is suitable for both stationary and nonstationary signals [38]. Hence, EMD can be used to extract the diffuse component from the selected ROI from the backscattered signal after deconvolution and ﬁltering have been carried out. However, small perturbations can adversely affect EMD and a completely different set of IMFs may be obtained each time it is performed [39]. To produce stable IMF estimates, EEMD is usually performed [40]. In EEMD, an ensemble of random Gaussian noise is added to the backscattered signal to produce an ensemble of signals. The EMD algorithm is applied to each signal in the ensemble to produce an ensemble of IMFs. This ensemble of IMFs is then averaged to produce a new set of IMFs. The IMFs are normalized by dividing with their maximum amplitude so that no undue weight is given to any single IMF. Selection of IMFs is a crucial issue in any algorithm involving EEMD [16,41]. This is because some IMFs will contain information about the coherent scatterers while the others will contain information about the diffuse scatterers [16]. To identify the IMFs exhibiting diffuse scattering, a K-S test is performed. The method used is described in [29]. The K-S classiﬁer assumes that diffuse scatterers generate Gaussian statistics and any deviation is a result of coherent scattering. Those IMFs that show deviation from Gaussian statistics (at 5% signiﬁcance level) are excluded and the remaining IMFs are added and used for further processing. 2.4.3. NNARLF for ESD estimation The gated backscattered RF signal intensity, W (v), where v = {f, Deff , z, nz } is the set of variables on which the backscattered RF signal depends, at the transducer face can be expressed in the frequency-domain as [42] W (v) = T (f ) · D(f, z) · A(f, z) · S(f, Deff , nz ).

(1)

Here, T(f) represents the combined effect of the transmit pulse and the transducer sensitivity (electro-acoustic and acousto-electric transfer functions); D(f, z) is the effect of diffraction; A(f, z) is the

cumulative attenuation in the soft tissue and S(f, Deff , nz ) represents the scattering properties of the tissue, including the ESD (Deff ) and EAC (nz ); z being the depth of the gated segment from the transducer face [42,43]. We acquire RF signals from the tissue sample and the reference TMP. The backscattered RF signal is then deconvolved and ﬁltered, as the ﬁrst two steps of the system effect minimization process. The resulting signal, in the frequency domain, is given by W (v). The cumulative attenuation, A(f, z), in soft tissues is a function of frequency f and depth z, and can be expressed as [44] A(f, z) = e−4(f )z = 10−2(f )z/10 ,

(2)

where (f) denotes the attenuation coefﬁcient (AC) in unit Nepers/cm. It is reported in [44] that (f) demonstrates a linear frequency dependence. Therefore, it can be written as (f) = ˇ · f, where ˇ denotes the AC in Nepers/cm/MHz. By compensating for the effect of frequency-dependent attenuation, using a depthdependent (dependent on z) spectral average based attenuation estimation technique developed in [45], we get the backscattered RF signal in the frequency-domain as (v) = W (v)Ac (f, z) Wcomp

= T (f ) · D (f, z) · S(f, Deff , nz ).

(3)

Here, Ac (f, z) is the frequency-dependent attenuation compensation function deﬁned as Ac (f, z) = A−1 (f, z).

(4)

T (f) and D (f, z) are residual effects of the system PSF and diffraction, respectively, remaining after deconvolution and ﬁltering. Now, let us consider the attenuation-compensated signal in (n). To perform EEMD of the discrete-time domain to be wcomp the attenuation-compensated RF data, an ensemble of NE random Gaussian noise, gp (n) (p = 1, . . ., NE ) is added to wcomp (n). That is, the ensemble is given by [46] w pcomp (n) = wcomp (n) + gp (n),

p = 1, . . ., NE .

(5)

The signal-to-noise ratio between the RF data and the Gaussian noise is kept at 30 dB. After that, the EMD algorithm [38] is applied to each of the signals in the ensemble to decompose them into a sum of their IMFs, cpj (n), j = 1, . . ., K, where K is the number of IMFs and a residue, rp (n), given by [46] w pcomp (n) =

K j=1

cpj (n) + rp (n),

p = 1, . . ., NE .

(6)

N.I. Nizam, S.R. Ara and Md.K. Hasan / Biomedical Signal Processing and Control 57 (2020) 101786

Finally, the IMFs using EEMD are obtained from the ensemble average [46]

The quantity, nz , is termed the EAC and is deﬁned in [18]. For sample and reference, the tissue scattering is then represented as

1 cpj (n), c¯j (n) = NE NE

j = 1, . . ., K.

(7)

p=1

185Lq2 SS (f, Deff,S , nz,S )

1 + 2.66

w icomp (n) =

M

d = 1, . . ., M,

2

(8)

Deff,S

,

Deff,R

fq

6 nz,R f 4

Deff,R

where w icomp (n) is the IMF-replaced signal in the discrete-time domain. We consider WS,comp (v) and WR,comp (v) as the IMF-replaced signals in the frequency-domain for the sample and the reference, respectively. In case of the same average sound speed in the reference and sample tissues, the residual diffraction terms can be considered equal. The sound speed in the reference is known from the manufacturer’s speciﬁcations to be approximately equal to that of the soft tissues. Moreover, since the sample and reference data are obtained using the same transducer settings, the residual effect of the system PSF can also be considered equal in the sample and the reference. It is to be noted that the reference data is also passed through the same preprocessing steps as the sample data. Hence, (v) by WR,comp (v), as the ﬁnal step of the multion dividing WS,comp step system effect minimization process, we get the normalized spectra, Wnorm (v), as

Wnorm (v) =

(v) WS,comp WR,comp (v)

=

SS (f, Deff,S , nz,S ) SR (f, Deff,R , nz,R )

.

(9)

Here, SR (f, Deff,R , nz,R ) and SS (f, Deff,S , nz,S ) represents the scattering properties of the sample and reference, respectively. Taking the logarithm on both sides of (9), we get

10 log Wnorm (v) = 10 log

SS (f, Deff,S , nz,S ) SR (f, Deff,R , nz,R )

.

(10)

A model for tissue scattering, S(f, De ff, nz ),in the frequencydomain was developed in [47] for clinical array systems given by

Deff

185Lq2 S(f, Deff , nz )

=

fq

−12.159f 2

×e

Deff

Deff 2

2 (11)

2

2

with L the gate length (mm), q the ratio of aperture radius to distance from the region of interest, f the frequency in MHz, and Deff the ESD in mm. The model was derived using a Gaussian form factor. It has been previously established that a Gaussian form factor is suitable for modeling the scattering from human tissues [18,25].

(13)

Deff,R 2

2

)

.

Here, Deff,S is the ESD to be estimated from the sample (tissue) and Deff,R is the ESD of the reference (TMP). Dividing (12) by (13), and considering the fact that 2.66(fq ized tissue scattering as SS (f, Deff,S , nz,S ) SR (f, Deff,R , nz,R )

=

6 nz,S Deff,S 6 Deff,R nz,R

e

Deff 2 ) 2

1 [19], we get the normal-

−3.03975(D2

eff,S

−D2

eff,R

)f 2

.

(14)

To estimate the ESD from breast tissues, TMP dataset A has been used as the reference. In the case of estimating the ESD from the TMP datasets, TMP dataset A has been used as the reference phantom for TMP datasets B and C and TMP dataset B has been used as the reference phantom for TMP dataset A. Next, taking logarithm on both sides of (14) yields 10 log

SS (f, Deff,S , nz,S ) SR (f, Deff,R , nz,R )

= 10 log

6 Deff,S nz,S 6 Deff,R nz,R

2 ×(Deff,S

− 13.20

(15)

2 − Deff,R )f 2 .

To estimate ESD, we ﬁt a regression line through the usable (i.e., −6 dB) bandwidth of the normalized log scattering power spectrum. From (10), this is equivalent to ﬁtting a line through the usable bandwidth of the normalized log power spectrum. Assuming f2 = x, the regression line can be expressed as y = mx + c.

(16)

Here, y = 10 log

nz f 4

2

1 + 2.66

6

×e

2

2

d=1 −12.159f 2 (

(12)

2

2

1 + 2.66

2

2

2

=

nz,S f 4

Deff,S

×e

SR (f, Deff,R , nz,R ) cd (n),

fq

−12.159f 2

185Lq

6

Deff,S 2

=

The IMFs responsible for diffuse scattering, cd (n), d = 1, . . ., M, where M is the number of IMFs (normalized) responsible for diffuse scattering, are then identiﬁed by the K-S test [29]. The attenuationcompensated RF data, for both the sample and the TMP, are then replaced by the summation of the IMFs responsible for diffuse scattering as

5

SS (f, Deff,S , nz,S ) SR (f, Deff,R , nz,R )

,

2 2 − Deff,R ), m = −13.20 × (Deff,S

c = 10 log

6 Deff,S nz,S 6 Deff,R nz,R

.

(17) (18)

(19)

The ESD may be estimated from (18). However, the probability that a single gated RF block includes regions of heterogeneous tissues increases with the block size. Thus, for better modelling of tissue pathology, we use an NNARLF algorithm. We assume that the ESD of the neighboring tissues of the scattering particles in the neighboring blocks is almost the same for their physical proximity.

6

N.I. Nizam, S.R. Ara and Md.K. Hasan / Biomedical Signal Processing and Control 57 (2020) 101786

We calculate an average regression line as the weighted average of the regression lines of the neighboring blocks as

is +La

i0 =is −La

Y (is , js ) =

js +Ll

j0 =js −Ll

is +La

i0 =is −La

y(i0 , j0 )w(is .js ) (i0 , j0 )

js +Ll

j0 =js −Ll

.

(20)

w(is .js ) (i0 , j0 )

Here, Y(is , js ) denotes the weighted average value of y(is , js ), and w(is .js ) (i0 , j0 ) is the exponential weight function for an interrogated point (is , js ) on the 2-D ESD map, deﬁned as w(is ,js ) (i0 , j0 ) = e−|a (i0 −is )|−|l (j0 −js )| ,

(21)

where a and l denote the weighting factors in the axial and lateral directions, respectively. Also, is − La ≤ i0 ≤ is + La and js − Ll ≤ j0 ≤ js + Ll , where, La and Ll are the nearest neighborhood (NN) factors in the axial and lateral directions, respectively. From (21) it is evident that w(is .js ) (i0 , j0 ) has the maximum value (unity) at (i0 , j0 ) = (is , js ). A 2-D weighted exponential neighborhood having La = LI = 5 is illustrated in Fig. 2. The values on the weight axis are arbitrary but show an exponential decay as we move away from the interrogated window both axially and laterally. Substituting the value of y from (16) into (20), we get

is +La js +Ll i0 =is −La

Y (is , js ) =

j =js −L

(m(i0 , j0 )x(i0 , j0 ) + c(i0 , j0 ))w(is .js ) (i0 , j0 )

0 l js +Ll is +La

i0 =is −La

.

j0 =js −Ll

(22)

w(is .js ) (i0 , j0 )

If we deﬁne the weighted average value of the slope as

is +La

i0 =is −La

M(is , js ) =

js +Ll

j0 =js −Ll

is +La

i0 =is −La

m(i0 , j0 )w(is .js ) (i0 , j0 )

js +Ll

j0 =js −Ll

,

(23)

w(is .js ) (i0 , j0 )

and the weighted average value of the intercept as

is +La

i0 =is −La

C(is , js ) =

js +Ll

is +La

j0 =js −Ll

i0 =is −La

c(i0 , j0 )w(is .js ) (i0 , j0 )

js +Ll

j0 =js −Ll

,

(24)

w(is .js ) (i0 , j0 )

[16], the AR spectrum was estimated from the IMFs produced by EEMD of the deconvolved RF data. In that work, the IMFs, responsible for coherent scattering (from which the MSS is estimated), were identiﬁed using an empirical criterion. The K-S test was only used to validate whether or not the selected IMFs are indeed a source of coherent scattering. In this work, we have ameliorated the previously proposed MSS estimation technique by incorporating the automatic IMF selection scheme, along with the bandpass ﬁltering of the deconvolved RF data. A block diagram to illustrate the MSS estimation technique is shown in Fig. 3. We present the values of the average MSS estimates for different types of breast tissues and the performance of the MSS estimator as a biomarker for breast lesion classiﬁcation in the results section. We also show the improvements in classiﬁcation results from the previously proposed technique in [16]. 3. Results In this section, we present the classiﬁcation results obtained by combining ESD with MSS. We also present the single-parameter classiﬁcation results obtained using our ESD estimates and compare that classiﬁcation performance with the classiﬁcation results obtained using ESD estimated from other techniques that employ a Gaussian form factor model [14,18,19]. In addition, we show the single-parameter classiﬁcation performance of the MSS estimator and compare its performance with some existing techniques for MSS estimation [2,16,28]. However, at ﬁrst, we focus on the ESD estimates obtained using our proposed method and validate the accuracy of our proposed algorithm by estimating the ESD from TMPs. Additionally, we present the results of ESD estimated using some of the conventional Gaussian form factor based techniques [14,18,19]. Moreover, we present numerical results to validate the different steps of the proposed ESD estimation algorithm and hence, establish the suitability of the proposed ESD estimator for breast lesion classiﬁcation.

then (22) can be written in the form of a regression line given by Y (is , js ) = M(is , js )x + C(is , js ).

(25)

Now, from the slope, M, of the regression line that ﬁts (25), we can estimate the ESD, Deff,S , at the point (is , js ) using (18) as

Deff,S =

−

M(is , js ) 2 . + Deff,R 13.20

(26)

To estimate the block power spectra of an interrogated block with higher resolution, the block is divided into 1-D segments. Consecutive window segments in a block have 50% axial overlapping. The windowed segments are gated by the Hamming window. The block power spectra are calculated using the Welch method. We select La = Ll = 5 as the NN factors to estimate the local ESD for a particular interrogated block. 2.5. MSS estimation technique Another QUS parameter which has previously been successfully used for breast tissue characterization is the MSS [1,2,16]. Specifically, an MSS estimation algorithm developed in [16] produced promising binary classiﬁcation results, albeit, on a small dataset. MSS estimation on the same dataset as ESD allows us to use multiple QUS parameters for breast lesion classiﬁcation. The use of multiple QUS parameters for tissue characterization has been previously shown to produced improved results [2,17]. What is more, because the MSS is estimated from the coherent component of the backscattered RF data and EEMD separates the diffuse and coherent components, the MSS may be estimated simultaneously with the ESD, using a branching scheme. In the technique proposed in

3.1. ESD estimation results on TMPs To check the accuracy of our proposed average ESD estimation technique, we use three CIRS experimental TMPs. The average ESD of these phantoms are also estimated using the MASD based method [18], the frequency-domain method [19], and the method used in [14]. It is to be noted that a Faran form factor is usually employed for modeling the scattering from TMPs [48]. But, since tissue scattering is more accurately modeled by a Gaussian form factor, to apply Eq. (9), a Gaussian form factor model has been applied for the TMPs as well. Moreover, Gaussian form factor models have also been previously used to model the scattering from TMPs [49]. Table 4 presents the actual average ESD provided by the manufacturer and the average ESD estimated using our proposed algorithm as well as the other techniques for the experimental phantoms. It is evident that our proposed algorithm estimates the ESD for all three TMP data with a higher degree of accuracy compared to the other methods. This is reﬂected by a lower mean absolute percentage error (MAPE) value. Moreover, our method also shows a lower SD of estimates compared to the other methods. This can be attributed to the proposed multi-step system effect minimization techniques and the NNARLF algorithm. We have also estimated the ESD from the heterogeneous ROIs of TMP C using our proposed method. A scatter plot showing how the ESD varies as we move laterally from a region just outside the inclusion across the border of the inclusion to a region within the inclusion is presented in Fig. 4. It is evident from the ﬁgure that our proposed method can reliably estimate the ESD in the homo-

N.I. Nizam, S.R. Ara and Md.K. Hasan / Biomedical Signal Processing and Control 57 (2020) 101786

7

Fig. 2. An illustration of the exponentially weighted neighborhood.

Fig. 3. A block diagram illustrating the proposed MSS estimation technique.

Table 4 Estimated ESD values (in ␮m) with SD (in bracket) from experimental TMPs. Method

TMP dataset

Average ESD of Background (␮m) (±SD)

MAPE (with respect to background) (%)

Average ESD inside Inclusion (␮m) (±SD)

MAPE (with respect to inclusion) (%)

MASD [18]

A B C

54.82 (±8.88) 52.18 (±8.09) 50.91 (±7.99)

21.82 15.96 13.13

– – 74.91 (±7.87)

– – 7.01

Frequency domain [19]

A B C

53.84 (±7.11) 50.10 (±7.07) 48.97 (±7.39)

19.64 11.33 8.82

– – 73.61(±7.09)

– 5.15

Nasief et al. [14]

A B C

53.75 (±7.09) 49.92 (±7.01) 49.54 (±7.43)

19.44 10.93 10.98

– – 73.58(±7.13)

– 5.11

Proposed

A B C

47.02 (±5.89) 47.76 (±6.01) 47.78 (±5.85)

4.49 6.13 6.18

– – 73.02 (±6.33)

– 4.31

geneous regions inside and outside the inclusion. The mean value is approximately 48 ␮m outside the inclusion (represented by a red line) and the mean value of approximately 74 ␮m within the inclusion (represented by a green line). There is a sharp change in the average ESD values across the border of the inclusion, as expected.

3.2. Effect of different steps of the proposed algorithm on the ESD estimation accuracy Now, to study how the different steps in the proposed algorithm impact the accuracy of the ESD estimates, we use an ablation

8

N.I. Nizam, S.R. Ara and Md.K. Hasan / Biomedical Signal Processing and Control 57 (2020) 101786

Fig. 4. Scatter plot showing the variation of ESD across different windows on moving laterally from the edge of the ROI in the TMP background to the edge of the ROI within the inclusion. (For interpretation of the references to color in the text, the reader is referred to the web version of this article.)

matory lesions are slightly higher than that for normal tissues. According to the best of our knowledge, the ESD values of inﬂammatory lesions have not been previously reported. It is evident that using our proposed method, the estimated ESD values for ﬁbroadenoma and malignant breast tissues fall within the range of ESD values previously reported in the literature [2]. Furthermore, it is seen that the average ESD value of malignant lesions is greater than that of benign lesions which also conforms with the previously reported results [2]. We see that, for all the other techniques [14,18,19], the SD of ESD estimates for both ﬁbroadenoma and malignant tissues are signiﬁcantly higher than that of our proposed method. The proposed ESD estimation algorithm has also been applied on some cystic lesions. The estimated ESD values of cysts are rather erratic and show a high SD and hence, are not presented. This is true for all the techniques used to obtain the ESD estimates in Table 5. Therefore, it can be concluded that ESD estimates of cystic lesions are of no diagnostic importance. This is in good concordance with the anatomy of the cysts as they are ﬂuid-ﬁlled sacs. Thus, scattering from cysts will largely be absent and some inconsistent scattering may occur due to debris (such as those present in complex cysts) [50]. Hence, these datasets are not included in binary classiﬁcation of breast lesions. However, the true gold standard for ESD estimates need to be extracted from histopathology images. But, unfortunately, we do not have excess to these images for our dataset. Thus, the beneﬁts of our proposed method can only be compared through the classiﬁcations results for which there exists deﬁnite pathological gold standard. 3.4. Effect of the weighted exponential neighborhood on the SD of ESD estimates

Fig. 5. Bar plot showing the impact on the percentage error and SD of the ESD estimates from the experimental TMPs on removing the different steps from our proposed algorithm.

approach. The ESD was estimated for several ROIs of TMP datasets A, B, and C by removing the different steps shown in the ﬂow chart of Fig. 1 one by one while retaining the others. The results are presented in Fig. 5 in the form of a bar plot. The ﬁrst column of the bar plot presents the performance of the proposed algorithm on the same datasets for comparison. It can be seen that the overall estimation accuracy is most adversely impacted by removing the EEMD step. There is also a slight rise in the SD of estimates on removing the EEMD step. Removal of any one of the system effect minimization steps, i.e., deconvolution, ﬁltering, and normalization using a reference TMP, also noticeably impacts the ESD estimation accuracy. There is a slight increase in the SD of ESD estimates in each case. Hence, it justiﬁes the use of a multi-step system effect minimization technique. The weighted neighborhood step seems to have the least impact on the overall accuracy. However, the removal of this step is seen to have a more negative impact on the SD of the estimates compared to the other steps of the proposed method. This is later shown to be an important factor in improving the binary classiﬁcation performance. 3.3. ESD estimation results on in vivo breast tissues The results of estimating the ESD of in vivo breast tissues using various techniques are presented in Table 5. All the methods give similar average ESD estimates for normal tissues. The average ESD estimates for normal tissues is in concordance with values previously reported in the literature [2]. The average ESD value for inﬂammatory tissues, using our proposed method, is close to that of normal tissues. However, for the other techniques [14,18,19], the average ESD estimates for inﬂam-

An important factor that has to be taken into consideration while estimating ESD from in vivo tissues is the impact of the kernel size of the weighted exponential neighborhood. As mentioned earlier, the algorithm uses a 5 × 5 weighted exponential neighborhood. The use of such a neighborhood allows the modeling of tissue homogeneity over a small region rather than a large spatial block where the tissue becomes more heterogeneous. To investigate the impact of the size of the neighborhood on the ESD estimates, the ESD is estimated using our proposed algorithm for normal tissues for no neighborhood, a neighborhood of size 3 × 3, and a neighborhood of size 8 × 8. The results are shown in Table 6. It is clear from the table that the choice of neighborhood size does not greatly affect the average value of the ESD estimates. However, the SD of estimates seems to decrease signiﬁcantly for a neighborhood size close to our selected one. However, using a large spatial block, without the use of an exponentially weighted neighborhood, increases the SD of estimates. This observation is consistent with that obtained for experimental TMPs. There, the removal of the weighted neighborhood step adversely impacted the SD of estimates. To further substantiate our argument for choosing an exponentially weighted neighborhood, we produce, in Fig. 6, an ESD map for a ﬁbroadenoma tissue. Two ROIs are taken within the border of the lesion and two ROIs are taken outside the lesion. In the case of ROI selection, the protocols mentioned previously have been followed. Clearly, the ESD values are consistent (having a low SD) across the ROIs (both inside and outside the lesion). Hence, the use of an exponential weighted neighborhood models the tissue pathology more closely than using a large spatial block. 3.5. MSS estimation results on in vivo breast tissues The MSS estimation results for different types of breast tissues are presented in Table 7. The MSS values estimated for ﬁbroade-

N.I. Nizam, S.R. Ara and Md.K. Hasan / Biomedical Signal Processing and Control 57 (2020) 101786

9

Table 5 Estimated values of ESD with SD (in bracket) using different methods for normal and pathological tissues. ESD (±SD) (␮m) Method

MASD [18] Frequency domain [19] Nasief et al. [14] Proposed

Malignant

Fibroadenoma

Inﬂammatory

Normal

In

Out

In

Out

In

Out

100.50 (±22.03) 108.21 (±17.14) 105.31 (±21.50) 124.38 (±8.64)

84.11 (±8.31) 80.01 (±8.19) 79.98 (±8.39) 75.20 (±4.17)

93.51 (±18.79) 102.56 (±12.76) 92.88 (±18.16) 98.79 (±9.65)

81.02 (±7.31) 79.42 (±8.43) 78.75 (±9.03) 75.14 (±4.14)

88.31 (±11.34) 81.24 (±9.81) 85.38 (±10.42) 75.72 (±4.09)

90.35 (±10.01) 82.45 (±11.01) 80.96 (±10.90) 75.77 (±4.07)

76.13 (±7.01) 75.88 (± 6.74) 75.43 (± 7.02) 75.12 (±4.01)

Table 6 Estimated values of ESD with SD (in bracket) for normal tissues with different neighborhood sizes. Size of neighborhood

ESD (±SD) (␮m)

No neighborhood 3×3 5×5 8×8

76.89 (±8.56) 74.99 (±4.09) 75.12 (±4.01) 76.46 (±7.56)

and malignant tissues from our previously proposed method in [16]. 3.6. Classiﬁcation results

Fig. 6. ESD map for four ROIs for a representative ﬁbroadenoma tissue. A deeper shade of blue indicates higher ESD values. The values on the axes are arbitrary. (For interpretation of the references to color in this ﬁgure legend, the reader is referred to the web version of this article.)

nomas, malignant lesions and normal breast tissues fall within the range of values previously reported [1,2]. As in the case of ESD, the MSS values of inﬂammatory lesions have not been reported before. The estimated MSS values for inﬂammatory lesions are slightly higher than those for normal tissues. The ﬁrst and second rows of the table present the results for MSS estimation using a modiﬁed spectral autocorrelation (SAC) based technique [2] and a singular spectrum analysis (SSA) based technique [28]. The modiﬁed SAC based technique [2] has been chosen for comparison since this method has been used for tissue characterization. The SSA based technique [28] is chosen because it a signal decomposition based technique, similar to the proposed MSS estimation technique. The fourth row presents our previously proposed MSS estimation technique [16]. It is evident that compared to the techniques in [2] and [28], the separation between the average MSS values of different groups of tissues is higher for our proposed MSS estimation method. There is a very slight increase in separation between the average MSS values of the benign

3.6.1. Classiﬁcation results based on ESD Table 8 presents the classiﬁcation results based on only ESD for the proposed method, the MASD-based method [18], the frequency-domain method [19], and the method used in [14]. For the proposed method, a classiﬁcation scheme based on ESD alone produces sensitivity, speciﬁcity, and accuracy values of 95.45%, 95.79%, and 95.68%, respectively. This is signiﬁcantly better than the other techniques used for comparison in this paper [14,18,19]. Additionally, the Sum5 and MCC values of 476.08 and 0.9019, respectively, are far superior compared to the other techniques. When the Fisher’s exact test [51] is used for testing statistical signiﬁcance, the proposed method achieved signiﬁcantly better quality metrics than the MASD-based method [18], the frequencydomain method [19], and the method in [14], with p values equal to 4.415 × 10−9 , 8.832 × 10−7 , and 0.0000057946, respectively. The AUC is calculated using a bootstrapping strategy (200 bootstraps) to obtain the mean and 95% conﬁdence intervals (CIs) of the ROC curve. This yields a mean AUC value of 0.95 with a CI of 0.89–0.98 for classiﬁcation based on our proposed ESD estimator. 3.6.2. Classiﬁcation results based on MSS Table 9 presents the classiﬁcation performance of MSS for different methods. The ﬁrst and second rows of the table present the classiﬁcation performance using the MSS values estimated from the modiﬁed SAC [2] and the SSA [28] based techniques, respectively. The third row presents the results obtained using the MSS

Table 7 Estimated values of MSS with SD (in bracket) using different methods for normal and pathological tissues. Method

Modiﬁed SAC [2] SSA [28] Nizam et al. [16] Proposed method

MSS (±SD) (mm) Malignant

Fibroadenoma

Inﬂammatory

Normal

0.75 (±0.05) 0.85 (±0.06) 0.79 (±0.05) 0.79 (±0.03)

0.77 (±0.05) 0.81 (±0.06) 0.76 (±0.04) 0.75 (±0.03)

0.78 (±0.07) 0.82 (±0.06) 0.73 (±0.05) 0.73 (±0.04)

0.80 (±0.07) 0.80 (±0.05) 0.70 (±0.05) 0.69 (±0.03)

10

N.I. Nizam, S.R. Ara and Md.K. Hasan / Biomedical Signal Processing and Control 57 (2020) 101786

Table 8 Breast lesion classiﬁcation results using ESD for different methods. Method

TP

TN

FP

FN

Sens. (%)

Spec. (%)

Acc. (%)

PPV (%)

NPV (%)

Sum5

MCC

AUC

MASD [18] Frequency domain [19] Nasief et al. [14] Proposed

17 22 24 42

85 79 84 91

10 16 11 4

27 22 20 2

38.63 50 54.54 95.45

89.47 83.16 88.42 95.79

73.38 72.66 77.70 95.68

62.96 57.89 68.57 91.30

75.89 78.22 80.77 97.85

340.34 341.93 370.00 476.08

0.3305 0.3460 0.4604 0.9019

0.69 0.69 0.75 0.95

Table 9 Breast lesion classiﬁcation results obtained using MSS for different methods. Method

TP

TN

FP

FN

Sens. (%)

Spec. (%)

Acc. (%)

PPV (%)

NPV (%)

Sum5

MCC

AUC

Modiﬁed SAC [2] SSA [28] Nizam et al. [16] Proposed

30 27 28 29

54 64 74 74

41 31 21 21

14 17 16 15

68.18 61.36 63.64 65.91

56.84 67.37 77.89 77.89

60.43 65.46 73.38 74.10

42.25 46.55 57.14 58

79.41 79.01 82.22 83.15

307.12 319.76 354.28 359.06

0.2328 0.2710 0.4043 0.4245

0.62 0.63 0.70 0.72

Table 10 Breast lesion classiﬁcation results of the proposed methods obtained using a combination of ESD and MSS. Parameter

TP

TN

FP

FN

Sens. (%)

Spec. (%)

Acc. (%)

PPV (%)

NPV (%)

Sum5

MCC

AUC

MSS ESD ESD and MSS

29 42 43

74 91 91

21 4 4

15 2 1

65.91 95.45 97.73

77.89 95.79 95.79

74.10 95.68 96.40

58 91.30 91.49

83.15 97.85 98.91

359.06 476.08 480.31

0.4245 0.9019 0.9195

0.72 0.95 0.97

Table 11 Comparative classiﬁcation results obtained using different techniques. Method

QUS parameter

Sum5

MCC

AUC

Modiﬁed SAC [2] SSA [52] MASD [18] Frequency domain [19] Nizam et al. [16] Proposed Nasief et al. [14] Proposed Proposed

MSS MSS ESD ESD MSS MSS ESD ESD ESD and MSS

307.12 319.76 340.34 341.93 354.28 359.05 370.00 476.08 480.32

0.2328 0.2710 0.3305 0.3460 0.4043 0.4245 0.4604 0.9019 0.9195

0.62 0.63 0.69 0.69 0.70 0.72 0.75 0.95 0.97

estimation technique proposed in [16]. The fourth row contains the results obtained from our proposed technique. We see that the classiﬁcation results, obtained using our proposed technique, are much better compared to the modiﬁed SAC and SSA techniques [2,28]. Also, ameliorating the technique in [16], has resulted in an improved classiﬁcation performance as exhibited by higher Sum5 and MCC values. 3.6.3. Classiﬁcation results based on a combination of ESD and MSS Next, the results of combining ESD with MSS is shown in Table 10. It is observed from the table that when ESD is fused with MSS, we obtain improved sensitivity, speciﬁcity, accuracy, and MCC values of 97.73%, 95.79%, and 96.40%, and 0.9195, respectively. The mean AUC value, in this case, is found to be 0.97 with a 95% CI of 0.93–0.99. These performance metrics are clearly better than classiﬁcation based on only a single QUS parameter (either the ESD or MSS). The ﬁrst two rows of Table 10 again present the classiﬁcation results obtained using our proposed ESD and MSS estimation techniques, for reference. As discussed before, the classiﬁcation performance of the combination of ESD and MSS has been evaluated using SVM, LDA, MNR, KNN, and Naïve Bayes classiﬁers. In Table 10, the reported Sum5 value of 480.31 is obtained for a quadratic LDA classiﬁer and represents the best Sum5 value obtained out of all the above-mentioned classiﬁers. The average Sum5 value (±SD), found by averaging the Sum5 values obtained from each of the mentioned classiﬁers, is

480.04 (±0.22). This indicates that the classiﬁcation performance obtained using the combination of ESD and MSS is fairly stable. 3.6.4. Comparative classiﬁcation performance of the different techniques In Table 11, we show a comparative performance of the different techniques discussed in this paper for breast lesion classiﬁcation. We use the quantitative indices AUC, MCC, and Sum5 since they provide a good picture of the overall classiﬁcation performance. For both MSS and ESD, our proposed algorithms produce a superior classiﬁcation performance. The best classiﬁcations results are obtained on combining ESD and MSS, estimated using our proposed techniques. This can again be attributed to the strength of our proposed algorithm. As the true gold standard in the form of histopathology slides is not available, a superior classiﬁcation performance is the best available indicator of the strength of our proposed algorithms compared to the other techniques. A higher AUC value of 0.999 was reported in [14] using ESD with two other QUS parameters. But, in that work, a limited dataset of 43 lesions had been used. 3.7. Effect of different steps of the proposed ESD estimation algorithm on the classiﬁcation performance We investigate how the different steps of the proposed algorithm impact the classiﬁcation accuracy when only ESD is used for classifying the breast lesions. This is again done by using an abla-

N.I. Nizam, S.R. Ara and Md.K. Hasan / Biomedical Signal Processing and Control 57 (2020) 101786

11

Table 12 Breast lesion classiﬁcation results obtained using only ESD by removing the different steps in the proposed technique. Condition

TP

TN

FP

FN

Sens. (%)

Spec. (%)

Acc. (%)

PPV (%)

NPV (%)

Sum5

MCC

AUC

Proposed No Deconv. No Norm. No Neigh. No Filt. No EEMD

42 37 40 31 36 32

91 92 87 92 86 77

4 3 8 3 9 18

2 7 4 13 8 12

95.45 84.09 90.91 70.45 81.81 72.72

95.79 96.84 91.58 96.84 90.52 81.05

95.68 92.81 91.37 88.49 87.77 78.41

91.30 92.50 83.33 91.18 80 64

97.85 92.93 95.61 87.62 91.49 86.51

476.08 459.17 452.79 434.58 431.60 382.71

0.9019 0.8315 0.8069 0.7282 0.7230 0.5212

0.95 0.93 0.91 0.89 0.86 0.75

Table 13 Symbols and acronyms. Symbol

Description

W (v) T(f) D(f, z) S(f, Deff , nz ) Deff nz z A(f, z) (f) Ac (f, z) (v) Wcomp (n) wcomp gp (n) NE w pcomp (n) cpj (n) cj (n) M w icomp (n) WS,comp (v) (v) WR,comp RF ESD EAC MSS AR QUS AC TMP ROI PSF EEMD IMF K-S test SD MASD MAPE NN NNARLF SAC SSA MCC AUC

Gated backscattered RF signal in the frequency domain Combined effect of transmitted pulse and transducer sensitivity in the frequency domain Diffraction component of W (v) in the frequency domain Scattering component of W (v) in the frequency domain Effective scatterer diameter Effective acoustic concentration Depth of the gated segment from the transducer face Cumulative attenuation Attenuation coefﬁcient (AC) in unit Nepers/cm Attenuation compensation function Attenuation compensated backscattered RF signal in the frequency domain Attenuation compensated backscattered signal in the discrete-time domain Random Gaussian noise in the discrete-time domain Vector length of random Gaussian noise wcomp (n) with 30dB Gaussian noise added, used for EEMD decomposition IMFs computed from w pcomp (n) using EEMD Ensemble average of IMFs, computed over the noise vector length of NE No of IMFs responsible for diffuse scattering, selected through K-S test Sum of all ensemble averaged IMF, responsible for diffuse scattering, in thediscrete-time domain Frequency domain representation of w icomp (n) for sample Frequency domain representation of w icomp (n) for reference Radio-frequency Effective scatterer diameter Effective acoustic concentration Mean scatterer spacing Autoregressive Quantitative ultrasound Attenuation coefﬁcient Tissue-mimicking phantom Region of interest Point spread function Ensemble empirical mode decomposition Intrinsic mode function Kolmogorov–Smirnov test Standard deviation Minimization of the average squared deviation Mean absolute percentage error Nearest neighborhood Nearest neighborhood average regression line ﬁtting Spectral autocorrelation Singular spectrum analysis Matthew’s correlation coefﬁcient Area under the receiver operating characteristics curve

tion approach. The results are shown in Table 12. It is clear from the table that removing the EEMD step has the most drastic impact on classiﬁcation performance. There is a sharp fall in the sensitivity, speciﬁcity, and accuracy results from the proposed method. Furthermore, removal of the exponential neighborhood step has a signiﬁcantly detrimental impact on the classiﬁcation performance. This can be correlated to the observations from experimental TMPs and in vivo tissues. There, it has been observed that the removal of the weighted neighborhood step signiﬁcantly increases the SD of ESD estimates. This, in turn, results in a poor classiﬁcation performance. It is also clear that removing any one of the system effect minimization steps, i.e., deconvolution, ﬁltering, and normalization, also impacts the classiﬁcation performance adversely. This again conﬁrms the idea that system effect minimization requires

a multi-step approach. Therefore, the different steps of the proposed ESD estimation algorithm work together in improving the classiﬁcation performance. 4. Discussion In this paper, we proposed a QUS parameter based breast lesion classiﬁcation scheme using ESD and MSS. To obtain reliable ESD estimates for breast lesion classiﬁcation, an improved ESD estimation technique for breast tissues has been proposed. The proposed ESD estimator has been shown to outperform the conventional techniques, that also employ a Gaussian form factor. The higher values of the performance metrics in Table 8 compared to the other techniques show the suitability of the proposed ESD estimator to be

12

N.I. Nizam, S.R. Ara and Md.K. Hasan / Biomedical Signal Processing and Control 57 (2020) 101786

used as a CAD tool. Table 10 also demonstrates the suitability of the proposed ESD and MSS estimators to be used as CAD tools, when they are used in conjunction. However, it is to be noted that ESD and MSS cannot be estimated for cystic lesions. Therefore, an ESD/MSS based classiﬁcation scheme will be unable to characterize cysts correctly. Hence, this is a limitation of such a classiﬁcation scheme. However, some strain imaging techniques have been developed which can successfully characterize cysts [53]. These may be used in conjunction with ESD and MSS for breast lesion classiﬁcation. What is more, it is evident from Table 5, that the difference between the average ESD values of inﬂammatory lesions and normal tissues is quite small for the proposed method. Given the SD from the mean values, it is clear that there is a signiﬁcant overlap between these two types of tissue. In this paper, we have opted for a binary classiﬁcation of breast lesions. Here, inﬂammatory lesions have been classiﬁed as benign and normal tissues have been excluded from the classiﬁcation algorithm. If a three-class classiﬁcation technique were used (with benign, malignant, and normal tissues as the three classes), the classiﬁcation results would deteriorate. But, it is to be noted that inﬂammatory lesions, such as breast tuberculosis, is often confused with malignancy [54]. From that point of view, a greater separation between the average ESD values between malignant and inﬂammatory lesions might be suitable for diagnosis. Furthermore, it is clear from Eq. (19) that the proposed technique can estimate another QUS parameter, EAC. But, the EAC values estimated using this technique have a poor classiﬁcation performance on this dataset. Additionally, the classiﬁcation performance of the QUS parameters is found to be sensitive to several hyperparameters, such as the number of folds, weights, etc. Future works involving QUS based classiﬁers should try to reduce the sensitivity of the classiﬁcation performance to these parameters. Besides, this paper employs a Gaussian form factor to model tissue scattering. This has produced reasonable ESD estimates in this work. Similar schemes using other form factors such as the Faran form factor need to be explored for QUS parameter estimation. This requires extensive work in the future. Firstly, frequency-domain scattering models need to be developed using other form factors. Next, these form factor models need to be compared on a larger dataset. Also, the true gold-standard for QUS parameters relating to tissue micro-structures like the ESD and MSS have to be extracted using microscopy analysis from histopathology slides. Because of a lack of histopathology slides for our dataset, we could not carry out such an analysis. Moreover, recent trends suggest the use of deep neural networks for ultrasonic tissue characterization [55]. The dataset used in this work is still too limited for such an approach. But, deep neural networks have reached an accuracy value of only 90% even with a dataset consisting of 7408 lesions [55]. It has been suggested that combining traditional signal processing based preprocessing steps with deep neural networks may be the best approach for further improving the ultrasonic tissue classiﬁcation performance in the future [56].

5. Conclusion This paper has presented a multi-QUS biomarker-based breast lesion classiﬁcation scheme that uses ESD and MSS as the QUS parameters. An improved frequency domain technique for ESD estimation from the diffuse component of the backscattered data using EEMD has also been proposed. The ESD estimates based on this technique has been shown to produce a better classiﬁcation performance compared to some of the existing techniques for ESD estimation. Additionally, when the ESD is fused with another prominent QUS parameter, the MSS, estimated from an ameliorated EEMD domain AR spectral estimation technique, the classiﬁcation performance further improves. Therefore, the proposed ESD

estimator, in conjunction with the proposed MSS estimator, is promising enough to be used as a CAD tool for breast lesion classiﬁcation. Also, it opens the possibility for the use of these QUS parameters as non-invasive biomarkers for breast cancer detection. Acknowledgement This work was supported by HEQEP UGC, Bangladesh, under Grant CPSF#96/BUET/Win-2/ST(EEE)/2017. The in vivo breast data were acquired at the BUET Medical Center by Dr. Farzana Alam, Assistant Professor of Radiology and Imaging, Bangabandhu Sheikh Mujib Medical University, Dhaka-1000, Bangladesh. The computer programs for the methods in [18] and [19] were derived from the graphical user interface (GUI) developed by the authors of those papers. We would like to thank the authors for providing us with the GUI. Declaration of interests: None declared. References [1] Y. Bige, Z. Hanfeng, W. Rong, Analysis of microstructural alterations of normal and pathological breast tissue in vivo using the ar cepstrum, Ultrasonics 44 (2) (2006) 211–215. [2] H. Tadayyon, A. Sadeghi-Naini, L. Wirtzfeld, F.C. Wright, G. Czarnota, Quantitative ultrasound characterization of locally advanced breast cancer by estimation of its scatterer properties, Med. Phys. 41 (1) (2014). [3] B. Alacam, B. Yazici, N. Bilgutay, F. Forsberg, C. Piccoli, Breast tissue characterization using farma modeling of ultrasonic rf echo, Ultrasound Med. Biol. 30 (10) (2004) 1397–1407. [4] K. Nam, J.A. Zagzebski, T.J. Hall, Quantitative assessment of in vivo breast masses using ultrasound attenuation and backscatter, Ultrason. Imaging 35 (2) (2013) 146–161. [5] F.L. Lizzi, M. Ostromogilsky, E.J. Feleppa, M.C. Rorke, M.M. Yaremko, Relationship of ultrasonic spectral parameters to features of tissue microstructure, IEEE Trans. Ultrason. Ferroelectr. Freq. Control 34 (3) (1987) 319–329. [6] E. Feleppa, F. Lizzi, D. Coleman, M. Yaremko, Diagnostic spectrum analysis in ophthalmology: a physical perspective, Ultrasound Med. Biol. 12 (8) (1986) 623–631. [7] W. Liu, J.A. Zagzebski, M.A. Kliewer, T. Varghese, T.J. Hall, Ultrasonic scatterer size estimation in liver tumor differntiation, Med. Phys. 34 (2597) (2007). [8] M.F. Insana, T.J. Hall, J.G. Wood, Z. Yan, Renal ultrasound using parametric imaging techniques to detect changes in microstructure and function, Invest. Radiol. 28 (8) (1993) 720–725. [9] T.J. Hall, M.F. Insana, L.A. Harrison, G.G. Cox, Ultrasonic measurement of glomerular diameters in normal adult humans, Ultrasound Med. Biol. 22 (8) (1996) 987–997. [10] M.F. Insana, Modeling acoustic backscatter from kidney microstructure using an anisotropic correlation function, J. Acoust. Soc. Am. 97 (1) (1995) 649–655. [11] E.J. Feleppa, T. Liu, A. Kalisz, M.C. Shao, N. Fleshner, V. Reuter, W.R. Fair, Ultrasonic spectral-parameter imaging of the prostate, Int. J. Imaging Syst. Technol. 8 (1) (1997) 11–25. [12] S.L. Bridal, P. Fornès, P. Bruneval, G. Berger, Parametric (integrated backscatter and attenuation) images constructed using backscattered radio frequency signals (25–56 MHz) from human aortae in vitro, Ultrasound Med. Biol. 23 (2) (1997) 215–229. [13] R.H. Silverman, R. Folberg, H.C. Boldt, H.O. Lloyd, M.J. Rondeau, M.G. Mehaffey, F.L. Lizzi, D.J. Coleman, Correlation of ultrasound parameter imaging with microcirculatory patterns in uveal melanomas, Ultrasound Med. Biol. 23 (4) (1997) 573–581. [14] H.G. Nasief, I.M. Rosado-Mendez, J.A. Zagzebski, T.J. Hall, A quantitative ultrasound-based multi-parameter classiﬁer for breast masses, Ultrasound Med. Biol. (2019). [15] A. Sadeghi-Naini, H. Suraweera, W.T. Tran, F. Hadizad, G. Bruni, R.F. Rastegar, B. Curpen, G.J. Czarnota, Breast-lesion characterization using textural features of quantitative ultrasound parametric maps, Sci. Rep. 7 (1) (2017) 13638. [16] N.I. Nizam, S.K. Alam, M.K. Hasan, EEMD domain ar spectral method for mean scatterer spacing estimation of breast tumors from ultrasound backscattered rf data, IEEE Trans. Ultrason. Ferroelect. Freq. Control 64 (10) (2017) 1487–1500. [17] S.R. Ara, S.K. Bashar, F. Alam, M.K. Hasan, EMD-DWT based transform domain feature reduction approach for quantitative multi-class classiﬁcation of breast lesions, Ultrasonics 80 (2017) 22–33. [18] M.L. Oelze, W.D. O’Brien Jr., Method of improved scatterer size estimation and application to parametric imaging using ultrasound, J. Acoust. Soc. Am. 112 (6) (2002) 3053–3063. [19] M.L. Oelze, J.F. Zachary, W.D. O’Brien Jr., Characterization of tissue microstructure using ultrasonic backscatter: Theory and technique for optimization using a gaussian form factor, J. Acoust. Soc. Am. 112 (3) (2002) 1202–1211.

N.I. Nizam, S.R. Ara and Md.K. Hasan / Biomedical Signal Processing and Control 57 (2020) 101786 [20] K.A. Wear, R.F. Wagner, M.F. Insana, T.J. Hall, Application of autoregressive spectral analysis to cepstral estimation of mean scatterer spacing, IEEE Trans. Ultrason. Ferroelectr. Freq. Control 40 (1) (1993) 50–58. [21] R. Lavarello, M. Oelze, Quantitative ultrasound estimates from populations of scatterers with continuous size distributions: effects of the size estimator algorithm, IEEE Trans. Ultrason. Ferroelectr. Freq. Control 59 (9) (2012) 2066–2076. [22] T. Varghese, K.D. Donohue, Estimating mean scatterer spacing with the frequency-smoothed spectral autocorrelation function, IEEE Trans. Ultrason. Ferroelectr. Freq. Control 42 (3) (1995) 451–463. [23] Z. Klimonda, M. Postema, A. Nowicki, J. Litniewski, Tissue attenuation estimation by mean frequency downshift and bandwidth limitation, IEEE Trans. Ultrason. Ferroelectr. Freq. Control 63 (8) (2016) 1107–1115. [24] M.H.R. Khan, M.K. Hasan, Attenuation estimation of soft tissue with reference-free minimization of system effects, Biomed. Signal Process. Control 50 (2019) 121–133. [25] M.F. Insana, R.F. Wagner, D.G. Brown, T.J. Hall, Describing small-scale structure in random media using pulse-echo ultrasound, J. Acoust. Soc. Am. 87 (1) (1990) 179–192. [26] A.C. Luchies, G. Ghoshal, W.D. O’Brien Jr., M.L. Oelze, Quantitative ultrasonic characterization of diffuse scatterers in the presence of structures that produce coherent echoes, IEEE Trans. Ultrason. Ferroelectr. Freq. Control 59 (5) (2012) 893–904. [27] A.C. Luchies, M.L. Oelze, Backscatter coefﬁcient estimation using tapers with gaps, Ultrason. Imaging 37 (2) (2015) 117–134. [28] W.C.D.A. Pereira, C.D. Maciel, Performance of ultrasound echo decomposition using singular spectrum analysis, Ultrasound Med. Biol. 27 (9) (2001) 1231–1238. [29] G. Georgiou, F.S. Cohen, Statistical characterization of diffuse scattering in ultrasound images, IEEE Trans. Ultrason. Ferroelectr. Freq. Control 45 (1) (1998) 57–64. [30] M. Miyoshi, Y. Kaneda, Inverse ﬁltering of room acoustics, IEEE Trans. Acoust. Speech Signal Process. 36 (2) (1988) 145–152. [31] W.K. Moon, C.-S. Huang, W.-C. Shen, E. Takada, R.-F. Chang, J. Joe, M. Nakajima, M. Kobayashi, Analysis of elastographic and b-mode features at sonoelastography for breast tumor classiﬁcation, Ultrasound Med. Biol. 35 (11) (2009) 1794–1802. [32] W. Zhu, N. Zeng, N. Wang, et al., Sensitivity, speciﬁcity, accuracy, associated conﬁdence interval and roc analysis with practical sas implementations, in: NESUG Proceedings: Health Care and Life Sciences, vol. 19, Baltimore, Maryland, 2010, pp. 1–9. [33] B.W. Matthews, Comparison of the predicted and observed secondary structure of t4 phage lysozyme, Biochim. Biophys. Acta (BBA)-Protein Struct. 405 (2) (1975) 442–451. [34] S. Zhou, J. Shi, J. Zhu, Y. Cai, R. Wang, Shearlet-based texture feature extraction for classiﬁcation of breast tumor in ultrasound image, Biomed. Signal Process. Control 8 (6) (2013) 688–696. [35] G. Ghoshal, M.L. Oelze, Improved scatterer property estimates from ultrasound backscatter using gate-edge correction and a pseudo-welch technique, IEEE Trans. Ultrason. Ferroelectr. Freq. Control 57 (12) (2010) 2828–2832. [36] M.K. Hasan, M.S.E. Rabbi, S.Y. Lee, Blind deconvolution of ultrasound images using l1 -norm-constrained block-based damped variable step-size multichannel LMS algorithm, IEEE Trans. Ultrason. Ferroelectr. Freq. Control 63 (8) (2016) 1116–1130.

13

[37] M.K. Hasan, M. Shifat-E-Rabbi, S.Y. Lee, Blind deconvolution of ultrasound images using l1 -norm-constrained block-based damped variable step-size multichannel lms algorithm, IEEE Trans. Ultrason. Ferroelectr. Freq. Control 63 (8) (2016) 1116–1130. [38] N.E. Huang, Z. Shen, S.R. Long, M.C. Wu, H.H. Shih, Q. Zheng, N.-C. Yen, C.C. Tung, H.H. Liu, The empirical mode decomposition and the hilbert spectrum for nonlinear and non-stationary time series analysis, Proc. R. Soc. Lond. A: Math. Phys. Eng. Sci. 454 (1971) (1998) 903–995, The Royal Society. [39] R.J. Gledhill, Methods for Investigating Conformational Change in Biomolecular Simulations, Ph.D. Dissertation, School Chem., Southampton Univ., 2004. [40] Y. Gao, G. Ge, Z. Sheng, E. Sang, Analysis and solution to the mode mixing phenomenon in EMD, in: CISP’08, vol. 5, IEEE, 2008, pp. 223–227. [41] E. Khan, F. Al Hossain, S.Z. Uddin, S.K. Alam, M.K. Hasan, A robust heart rate monitoring scheme using photoplethysmographic signals corrupted by intense motion artifacts, IEEE Trans. Biomed. Eng. 63 (3) (2016) 550–562. [42] H. Kim, T. Varghese, Attenuation estimation using spectral cross-correlation, IEEE Trans. Ultrason. Ferroelectr. Freq. Control 54 (3) (2007). [43] Y. Labyed, T.A. Bigelow, A theoretical comparison of attenuation measurement techniques from backscattered ultrasound echoes, J. Acoust. Soc. Am. 129 (4) (2011) 2316–2324. [44] R. Kuc, Bounds on estimating the acoustic attenuation of small tissue regions from reﬂected ultrasound, Proc. IEEE 73 (7) (1985) 1159–1168. [45] M.K. Hasan, M.A. Hussain, S.R. Ara, S.Y. Lee, S.K. Alam, Using nearest neighbors for accurate estimation of ultrasonic attenuation in the spectral domain, IEEE Trans. Ferroelectr. Freq. Control 60 (6) (2013) 1098–1114. [46] N.E. Huang, N.O. Attoh-Okine, The Hilbert-Huang Transform in Engineering, CRC Press, 2005. [47] F.L. Lizzi, M. Astor, T. Liu, C. Deng, D.J. Coleman, R.H. Silverman, Ultrasonic spectrum analysis for tissue assays and therapy evaluation, Int. J. Imaging Syst. Technol. 8 (1) (1997) 3–10. [48] E. Franceschini, R. Guillermin, Experimental assessment of four ultrasound scattering models for characterizing concentrated tissue-mimicking phantoms, J. Acoust. Soc. Am. 132 (6) (2012) 3735–3747. [49] J. Mamou, M.L. Oelze, Quantitative Ultrasound in Soft Tissues, Springer, 2013. [50] W.A. Berg, A.G. Sechtin, H. Marques, Z. Zhang, Cystic breast masses and the acrin 6666 experience, Radiol. Clin. 48 (5) (2010) 931–987. [51] R.A. Fisher, On the interpretation of 2 from contingency tables, and the calculation of p, J. R. Stat. Soc. 85 (1) (1922) 87–94. [52] G. Georgiou, F.S. Cohen, C.W. Piccoli, F. Forsberg, B.B. Goldberg, Tissue characterization using the continuous wavelet transform. Part II: Application on breast rf data, IEEE Trans. Ultrason. Ferroelectr. Freq. Control 48 (2) (2001) 364–373. [53] M.S.E. Rabbi, M.K. Hasan, Speckle tracking and speckle content based composite strain imaging for solid and ﬂuid ﬁlled lesions, Ultrasonics 74 (2017) 124–139. [54] S. Baharoon, Tuberculosis of the breast, Ann. Thorac. Med. 3 (3) (2008) 110. [55] S. Han, H.-K. Kang, J.-Y. Jeong, M.-H. Park, W. Kim, W.-C. Bang, Y.-K. Seong, A deep learning framework for supporting the classiﬁcation of breast lesions in ultrasound images, Phys. Med. Biol. 62 (19) (2017) 7714. [56] S. Liu, Y. Wang, X. Yang, B. Lei, L. Liu, S.X. Li, D. Ni, T. Wang, Deep learning in medical ultrasound analysis: a review, in: Engineering, 2019.

Classification of breast lesions using quantitative ultrasound biomarkers

Classification of breast lesions using quantitative ultrasound biomarkers

Recommend Documents