Biomedical Signal Processing and Control 57 (2020) 101786
Contents lists available at ScienceDirect
Biomedical Signal Processing and Control journal homepage: www.elsevier.com/locate/bspc
Classification of breast lesions using quantitative ultrasound biomarkers Navid Ibtehaj Nizam a , Sharmin R. Ara b , Md. Kamrul Hasan a,∗ a b
Department of Electrical and Electronic Engineering, Bangladesh University of Engineering and Technology, Dhaka-1205, Bangladesh Department of Electrical and Electronic Engineering, East West University, Dhaka-1212, Bangladesh
a r t i c l e
i n f o
Article history: Received 28 May 2019 Received in revised form 20 October 2019 Accepted 16 November 2019 Keywords: Quantitative ultrasound Effective scatterer diameter Mean scatterer spacing Non-invasive biomarker Tissue characterization Computer-aided diagnosis
a b s t r a c t Quantitative ultrasound (QUS) parameters are gaining attention recently as non-invasive biomarkers for soft tissue characterization. In this work, we use two QUS parameters, namely effective scatterer diameter (ESD) and mean scatterer spacing (MSS), for binary classification of breast lesions. To produce improved classification results, we propose a modified frequency-domain technique for ESD estimation of breast tissues from the diffuse component of backscattered radio-frequency (RF) data. Ensemble empirical mode decomposition (EEMD) is performed to separate the diffuse component from the coherent component of the backscattered RF data. A non-parametric Kolmogorov–Smirnov (K-S) test is employed for automatic selection of the intrinsic mode functions (IMFs). A novel multi-step system effect minimization process is also used. The ESD is estimated using a novel nearest neighborhood average regression line fitting (NNARLF) algorithm. Furthermore, we use an ameliorated EEMD domain autoregressive (AR) spectral estimation technique for MSS estimation. On using the ESD for binary classification of 139 lesions, we obtain high sensitivity, specificity, accuracy values of 95.45%, 95.79%, and 95.68%, respectively. The area under the receiver operating characteristics (ROC) curve is 0.95. On combining ESD with MSS we obtain even more improved sensitivity, specificity, and accuracy values of 97.73%, 95.79%, and 96.40%, respectively. The area under the ROC also increases to 0.97. This high classification performance demonstrates the potential of the use of these QUS parameters as non-invasive biomarkers for breast cancer detection. © 2019 Elsevier Ltd. All rights reserved.
1. Introduction Quantitative ultrasound (QUS) based parameters have been employed for the characterization and classification of breast tissues [1,2]. Such classification provides some inherent advantages compared to diagnosis based on other modalities (such as mammography) due to the non-ionizing and non-invasive nature of ultrasound and a lower cost of operation [3]. Also, QUS allows the development of a diagnostic modality that is less subjective to operator settings and interpreter variability compared to conventional ultrasound imaging [1,2,4]. QUS based tissue characterization works on the principle that diseases alter the acoustic scattering properties of tissues [5]. Effective scatterer diameter (ESD) and mean scatterer spacing (MSS) are popular QUS parameters for quantification of tissue microstructure [2]. These parameters are not visible from ultrasound B-mode or ultrasound elastography images but are estimable
∗ Corresponding author. E-mail address:
[email protected] (Md.K. Hasan). https://doi.org/10.1016/j.bspc.2019.101786 1746-8094/© 2019 Elsevier Ltd. All rights reserved.
from histopathology slides using high-powered microscopes [1]. The estimated ESD values of several different types of tissues have been reported in the literature. These include eyes [6], liver [7], renal tissues [8], glomerular tissues [9], kidney [10], prostate [11], human aortae [12], and the uveal melanomas [13]. ESD values of female breast tissues have also been recently reported in the literature [2,14]. In these works, ESD has been combined with other QUS parameters for breast lesion classification and characterization. In [2], ESD was combined with MSS and effective acoustic concentration (EAC) for breast tumor grading. In [15], a parametric image based approach, involving both ESD and MSS, was successful in binary (benign-malignant) classification of breast lesions. A very recent work in [14] used ESD along with six other QUS parameters for binary classification of breast lesions. But, both [14] and [15] employed limited datasets of 43 and 78 lesions, respectively. Our group, previously employed MSS for breast lesion classification on a very small dataset of 24 lesions [16]. However, we used a significantly larger dataset of 201 lesions for classification of breast lesions in [17] using bi-modal parameters extracted from ultrasound B-mode and ultrasound elastography images.
2
N.I. Nizam, S.R. Ara and Md.K. Hasan / Biomedical Signal Processing and Control 57 (2020) 101786
Robust schemes for estimation of QUS parameters is essential for their successful application as non-invasive biomarkers for breast cancer detection. ESD estimation techniques proposed previously often employ a Gaussian form factor to model tissue scattering [14,18,19]. Moreover, it has been established that the backscattered RF data consists of a coherent component and a diffuse component [20]. In [21], it has been reported that, for the estimation of ESD, the coherent component behaves as an interference, and hence, needs to be suppressed. However, for using a signal decomposition technique, the coherent and diffuse components need to be in additive form [16]. To achieve this, system effects like the impact of the system point spread function (PSF) as well as the impacts of diffraction and other acquisition noises must be removed [16,22–24]. Furthermore, accurate estimation of QUS parameters require an accurate modeling of tissue pathology. Most of the previously proposed ESD estimation techniques fail to take care of one or more of these requirements [22]. A technique that relies on the minimization of the average squared deviation (MASD) between the theoretical and measured power spectra using a Gaussian form factor model has been proposed in [18]. However, this method does not employ any signal decomposition technique. A frequency-domain technique that employs a Gaussian form factor-based theoretical power spectrum has been proposed in [19]. This method also does not use any signal decomposition technique. A similar shortcoming is present in [14], which employs a Gaussian form factor based approach developed in [25]. Also, these techniques use large 2-D spatial signal blocks to generate a power spectrum [18,19]. This is often an unrealistic approach because the tissue pathology inside a large spatial region must be considered uniform, not taking into account the heterogeneity present in tissues. In some recent techniques, the generalized spectrum, Rayleigh envelope statistics [26], and Hanning tapers [27] have been used to separate the diffuse echoes from the backscattered data. But, these techniques also fail to consider the heterogeneity present in tissues. What is more, all these techniques rely on a single-step reference phantom based system effect minimization technique. However, no attempt has been made to investigate whether this single step process indeed adequately removes the system effects. In case of MSS estimation, traditional techniques have relied on the autoregressive (AR) spectrum [20] or the AR cepstrum [1]. Recently, MSS estimation has also moved towards signal decomposition techniques like the singular spectral analysis (SSA) [28]. We proposed an MSS estimation technique in [16] which combined EEMD with the AR spectrum. This technique has been shown to outperform most of the existing techniques for MSS estimation [16]. In this paper, we aim to develop a reliable technique for binary classification of breast lesions using ESD and MSS. To achieve this, we propose a novel ESD estimation technique, which tries to overcome the shortcomings of the existing ESD estimation techniques. The salient features of our proposed algorithm are discussed below:
• We use ensemble empirical mode decomposition (EEMD) for signal decomposition, like the technique proposed in [16] for MSS estimation. However, in [16], the IMFs had been selected using an empirical criterion. In this paper, a completely automatic scheme has been employed, using the K-S test [29], to select the IMFs responsible for diffuse scattering. • To successfully employ the signal decomposition technique, we adopt a novel multi-step system effect minimization approach. The first two steps involve bandpass filtering and deconvolution of the backscattered RF data and the final step is normalization using a reference tissue-mimicking phantom (TMP). • The proposed technique is based on a frequency domain Gaussian form factor model [19]. However, the NNARLF algorithm is
incorporated into the Gaussian form factor model, for a better representation of tissue structure and pathology. We also ameliorate our previously proposed EEMD domain autoreggresive (AR) spectral estimation based MSS estimation technique [16] using features of the ESD estimation algorithm. MSS is combined with ESD for binary classification on the same dataset. The motivation behind combining ESD with MSS is that the MSS can be estimated using a scheme similar to the proposed ESD estimation technique. In fact, the use of a multi-step system effect minimization approach should lead to more reliable MSS estimates. Also, the use of multiple QUS parameters is a common approach for ultrasonic tissue characterization [2,14,17]. However, throughout this paper, a greater focus is placed on the technique for ESD estimation. This is because, as already discussed, the proposed ESD estimation technique introduces several novel features. Thus, numerical results are produced to highlight the important features of the proposed ESD estimation technique. Therefore, the contributions of this work are two-fold. Firstly, we propose an ESD and MSS based binary classification scheme using a larger dataset than other works in the literature that employ ESD and/or MSS for classification. Secondly, we propose a new ESD estimation technique and also, an ameliorated MSS estimation scheme for improved breast lesion classification.
2. Materials and method 2.1. Patient data The in vivo data used in this paper have been obtained at the Bangladesh University of Engineering and Technology (BUET) Medical Center with the help of a SonixTOUCH Research (Ultrasonix Medical Corporation, Richmond BC, Canada) scanner. The scanner is integrated with an L14-5/38 linear probe. The probe was operating at 10 MHz with 65% bandwidth (−6-dB bandwidth) at a sampling rate of 40 MHz. The pulse length was approximately 0.4 mm and the beam width was approximately 2.2 mm. The pulse length has been estimated from the emitted pulse using the multiple input–output inverse theorem [30]. The study has been carried out on female subjects, with their prior written consent, and the study was approved by the Institutional Review Board (IRB) of BUET. From the female subjects, 139 pathological RF data have been recorded. The details of the 139 pathological RF data, that have been used for classification, are summarized in Table 1. The age range of all patients was 13–75 years (mean: 35.27 years). The patients having masses underwent fine-needle aspiration cytology (FNAC) and/or excision biopsy according to the suggestion of their physicians. All patients having FNAC diagnosis positive for malignancy underwent surgery. Therefore, diagnoses of malignant and some benign lesions were confirmed by histopathology, and diagnoses of the remaining lesions, by cytopathology.
2.2. TMP data Three TMPs have been used in our work, both as reference phantoms and for validation of our proposed ESD estimation algorithm. The TMP is made of zerdine, which exhibits echogenic patterns similar to those obtained from soft tissues. The speed of sound in the TMP is around 1540 m/s. The TMP data was acquired using the same transducer settings as the in vivo data. We refer to the TMP datasets as TMP A, B, and C. The actual average ESD, as supplied by the manufacturer, i.e., CIRS, are used as gold standards for performance evaluation of the ESD estimators. The description of the TMP datasets is presented in Table 2.
N.I. Nizam, S.R. Ara and Md.K. Hasan / Biomedical Signal Processing and Control 57 (2020) 101786
3
Table 1 Patient data summary. Tissue type
No. of RF data
Mean age ± SD
Method of confirmation
Lesion size mean (mm × mm) ± SD (mm × mm)
Malignant Fibroadenoma Inflammation
44 71 24
44.91 ± 9.81 27.48 ± 9.13 35.43 ± 12.71
Biopsy FNAC/Biopsy FNAC/Biopsy
21.32 × 19.14 ± 5.94 × 7.36 19.37 × 12.09 ± 7.93 × 4.67 21.53 × 13.15 ± 8.39 × 4.39
Table 2 Description of experimental TMPs. TMP dataset
Description
Average ESD of inclusion (m)
Average ESD of background (m)
Size (cm2 )
A B C
Homogeneous Homogeneous Heterogenous
– – 70 (spherical)
45 45 45
3×4 3×4 4.5 × 4 (inclusion dia = 1.4 cm)
Table 3 Division of breast lesion data into five groups for training-testing. Group no.
Total (= Malignant + Fibro. + Inflam.)
1 2 3 4 5 Total
29 (= 9 + 15 + 5) 28 (= 9 + 14 + 5) 28 (= 9 + 14 + 5) 28 (= 9 + 14 + 5) 26 (= 8 + 14 + 4) 139 (= 44 + 71 + 24)
2.3. Binary classification of breast lesions The binary classification performances of ESD, and the combination of ESD and MSS (both normalized) are evaluated on 139 breast lesions. We use support vector machine (SVM), K-nearest neighbor (KNN), linear discriminant analysis (LDA), multinomial logistic regression (MNR), and Naïve Bayes (NB) classifiers. The best classification performance obtained is reported in this work. Here, we used the commonly employed “one-versus-all” or OVA based classification technique [31]. The total dataset is randomly divided into five groups for training and testing of the characterization indices as in [31]. First, group 1 is used as the test set and the remaining four groups are used as the training set for each class. Next, group 2 is used as the test set and the remaining four groups are used as the training set. The process is repeated until the testing of each group is completed. Details on the grouping for the binary classification are provided in Table 3. In the case of support vector machine (SVM) based classifier, the two feature vectors, i.e., ESD and MSS, have been given equal weight for the multi-layer perceptron network. Similarly, for the Naïve Bayes, KNN, MNR and LDA based classifiers, equal observation weights are used for MSS and ESD. The classification results include true positive (TP), true negative (TN), false positive (FP), false negative (FN), sensitivity, specificity, accuracy, positive predictive value (PPV), negative predictive value (NPV), the sum of the last five parameters, Sum5 , Matthew’s correlation coefficient (MCC), and area under the receiver operating characteristics curve (AUC). From TP, TN, FP, and FN, we calculate the sensitivity, specificity, accuracy, PPV, NPV and MCC [32–34]. Higher values of sensitivity, specificity, accuracy, PPV, and NPV indicate better classification performance. For MCC, +1 indicates a perfect prediction, 0 indicates a uniform random prediction, and −1 indicates an inverse prediction [33]. 2.4. Proposed ESD estimation technique A block diagram of our proposed ESD estimation algorithm is shown in Fig. 1. The proposed technique employs a frequencydomain Gaussian form factor model in the EEMD domain. The NNARLF algorithm is incorporated into the Gaussian form factor model. Additionally, the proposed technique uses a multi-step system effect minimization technique. Here, filtering and deconvo-
lution serve as both preprocessing and system effect minimization steps. They are used to reduce the impact of diffraction and system PSF, respectively. In addition to these steps, normalization using a reference TMP is carried out for the reduction of residual system effects. Next, we discuss the different steps of the proposed algorithm in greater detail. The list of the important symbols and acronyms used in this paper are presented in Table 13, at the end of the paper, for convenience. 2.4.1. Preprocessing At first, 2-D rectangular regions of interest (ROI) are selected from the calculated B-mode images obtained from the recorded RF data. Any lesion present had been pre-identified on the machine B-mode image by the radiologist, who acquired the data. A suitable 2-D rectangular region within the border of the lesion is taken as the ROI. Additionally, a second ROI is selected outside the lesion to compare the ESD values inside and outside the lesion. This has been illustrated schematically in the block diagram of Fig. 1. The dimension of each ROI is approximately (10–12) × (6.25–9.40) mm2 . This approximately equals to 25 pulse lengths axially and 4 beam widths laterally. The choice of such an ROI size has been previously shown to improve the accuracy of QUS parameter estimates [35]. Though ESD estimation from in vivo data uses only one ROI, for estimating the average ESD from TMP datasets A and B, 25 ROIs are selected. For TMP dataset C, 10 ROIs are selected from outside the inclusion and 5 ROIs are selected within the inclusion. In TMP dataset C, we have also selected 5 heterogeneous ROIs across the border of the inclusion such that the ROIs encompass both the inclusion and the background. Each TMP ROI is of dimension 10 × 10 mm2 . The size of the TMP ROIs, like the in vivo data ROIs, amounts to approximately 25 pulse lengths axially and 4 beam widths laterally. Next, the RF data is deconvolved using a blind multi-channel least-mean squares algorithm previously proposed in [36]. The technique in [36] requires no prior knowledge of the PSF. It employs an l1 -norm based cost function with a damped variable step-size, which helps mitigate the impacts of noise. This technique has been shown to outperform most conventional deconvolution techniques [37]. The technique is, therefore, well suited for our work, to reduce the impact of the system PSF. After deconvolution, a bandpass filter of frequency range 2–13 MHz is applied to the data. It has been established in [23] and very recently, in [24], that the diffraction effect is most prominent in the low-frequency region of the spectrum (below 2 MHz). It is to be noted that deconvolution shifts the backscattered signal from being centered at the pulse frequency (10 MHz) to lower frequencies. A upper cut-off frequency of 13 MHz is used to eliminate any high frequency acquisition noises that may be present [24]. 2.4.2. Coherent component suppression using EEMD The backscattered RF data consists of two major components, a coherent component, and a diffuse component [20]. The diffuse
4
N.I. Nizam, S.R. Ara and Md.K. Hasan / Biomedical Signal Processing and Control 57 (2020) 101786
Fig. 1. A block diagram illustrating the proposed ESD estimation technique. NNARLF refers to nearest neighborhood average regression line fitting.
component represents scattering from aperiodic scatterers. Accurate estimation of ESD requires the suppression of the coherent component [26]. This is because, the coherent component has been shown to increase the bias and variance of ESD estimates [26]. Therefore, on removal of the system PSF, diffraction and other acquisition noises, only the coherent and the diffuse components remain in additive form and hence, can be separated by a suitable signal decomposition technique [16]. Empirical mode decomposition (EMD) decomposes a signal into its IMFs in additive form [38]. The main advantage of EMD over other data-driven approaches is that it requires no pre-selection of basis functions and is suitable for both stationary and nonstationary signals [38]. Hence, EMD can be used to extract the diffuse component from the selected ROI from the backscattered signal after deconvolution and filtering have been carried out. However, small perturbations can adversely affect EMD and a completely different set of IMFs may be obtained each time it is performed [39]. To produce stable IMF estimates, EEMD is usually performed [40]. In EEMD, an ensemble of random Gaussian noise is added to the backscattered signal to produce an ensemble of signals. The EMD algorithm is applied to each signal in the ensemble to produce an ensemble of IMFs. This ensemble of IMFs is then averaged to produce a new set of IMFs. The IMFs are normalized by dividing with their maximum amplitude so that no undue weight is given to any single IMF. Selection of IMFs is a crucial issue in any algorithm involving EEMD [16,41]. This is because some IMFs will contain information about the coherent scatterers while the others will contain information about the diffuse scatterers [16]. To identify the IMFs exhibiting diffuse scattering, a K-S test is performed. The method used is described in [29]. The K-S classifier assumes that diffuse scatterers generate Gaussian statistics and any deviation is a result of coherent scattering. Those IMFs that show deviation from Gaussian statistics (at 5% significance level) are excluded and the remaining IMFs are added and used for further processing. 2.4.3. NNARLF for ESD estimation The gated backscattered RF signal intensity, W (v), where v = {f, Deff , z, nz } is the set of variables on which the backscattered RF signal depends, at the transducer face can be expressed in the frequency-domain as [42] W (v) = T (f ) · D(f, z) · A(f, z) · S(f, Deff , nz ).
(1)
Here, T(f) represents the combined effect of the transmit pulse and the transducer sensitivity (electro-acoustic and acousto-electric transfer functions); D(f, z) is the effect of diffraction; A(f, z) is the
cumulative attenuation in the soft tissue and S(f, Deff , nz ) represents the scattering properties of the tissue, including the ESD (Deff ) and EAC (nz ); z being the depth of the gated segment from the transducer face [42,43]. We acquire RF signals from the tissue sample and the reference TMP. The backscattered RF signal is then deconvolved and filtered, as the first two steps of the system effect minimization process. The resulting signal, in the frequency domain, is given by W (v). The cumulative attenuation, A(f, z), in soft tissues is a function of frequency f and depth z, and can be expressed as [44] A(f, z) = e−4(f )z = 10−2(f )z/10 ,
(2)
where (f) denotes the attenuation coefficient (AC) in unit Nepers/cm. It is reported in [44] that (f) demonstrates a linear frequency dependence. Therefore, it can be written as (f) = ˇ · f, where ˇ denotes the AC in Nepers/cm/MHz. By compensating for the effect of frequency-dependent attenuation, using a depthdependent (dependent on z) spectral average based attenuation estimation technique developed in [45], we get the backscattered RF signal in the frequency-domain as (v) = W (v)Ac (f, z) Wcomp
= T (f ) · D (f, z) · S(f, Deff , nz ).
(3)
Here, Ac (f, z) is the frequency-dependent attenuation compensation function defined as Ac (f, z) = A−1 (f, z).
(4)
T (f) and D (f, z) are residual effects of the system PSF and diffraction, respectively, remaining after deconvolution and filtering. Now, let us consider the attenuation-compensated signal in (n). To perform EEMD of the discrete-time domain to be wcomp the attenuation-compensated RF data, an ensemble of NE random Gaussian noise, gp (n) (p = 1, . . ., NE ) is added to wcomp (n). That is, the ensemble is given by [46] w pcomp (n) = wcomp (n) + gp (n),
p = 1, . . ., NE .
(5)
The signal-to-noise ratio between the RF data and the Gaussian noise is kept at 30 dB. After that, the EMD algorithm [38] is applied to each of the signals in the ensemble to decompose them into a sum of their IMFs, cpj (n), j = 1, . . ., K, where K is the number of IMFs and a residue, rp (n), given by [46] w pcomp (n) =
K j=1
cpj (n) + rp (n),
p = 1, . . ., NE .
(6)
N.I. Nizam, S.R. Ara and Md.K. Hasan / Biomedical Signal Processing and Control 57 (2020) 101786
Finally, the IMFs using EEMD are obtained from the ensemble average [46]
The quantity, nz , is termed the EAC and is defined in [18]. For sample and reference, the tissue scattering is then represented as
1 cpj (n), c¯j (n) = NE NE
j = 1, . . ., K.
(7)
p=1
185Lq2 SS (f, Deff,S , nz,S )
1 + 2.66
w icomp (n) =
M
d = 1, . . ., M,
2
(8)
Deff,S
,
Deff,R
fq
6 nz,R f 4
Deff,R
where w icomp (n) is the IMF-replaced signal in the discrete-time domain. We consider WS,comp (v) and WR,comp (v) as the IMF-replaced signals in the frequency-domain for the sample and the reference, respectively. In case of the same average sound speed in the reference and sample tissues, the residual diffraction terms can be considered equal. The sound speed in the reference is known from the manufacturer’s specifications to be approximately equal to that of the soft tissues. Moreover, since the sample and reference data are obtained using the same transducer settings, the residual effect of the system PSF can also be considered equal in the sample and the reference. It is to be noted that the reference data is also passed through the same preprocessing steps as the sample data. Hence, (v) by WR,comp (v), as the final step of the multion dividing WS,comp step system effect minimization process, we get the normalized spectra, Wnorm (v), as
Wnorm (v) =
(v) WS,comp WR,comp (v)
=
SS (f, Deff,S , nz,S ) SR (f, Deff,R , nz,R )
.
(9)
Here, SR (f, Deff,R , nz,R ) and SS (f, Deff,S , nz,S ) represents the scattering properties of the sample and reference, respectively. Taking the logarithm on both sides of (9), we get
10 log Wnorm (v) = 10 log
SS (f, Deff,S , nz,S ) SR (f, Deff,R , nz,R )
.
(10)
A model for tissue scattering, S(f, De ff, nz ),in the frequencydomain was developed in [47] for clinical array systems given by
Deff
185Lq2 S(f, Deff , nz )
=
fq
−12.159f 2
×e
Deff
Deff 2
2 (11)
2
2
with L the gate length (mm), q the ratio of aperture radius to distance from the region of interest, f the frequency in MHz, and Deff the ESD in mm. The model was derived using a Gaussian form factor. It has been previously established that a Gaussian form factor is suitable for modeling the scattering from human tissues [18,25].
(13)
Deff,R 2
2
)
.
Here, Deff,S is the ESD to be estimated from the sample (tissue) and Deff,R is the ESD of the reference (TMP). Dividing (12) by (13), and considering the fact that 2.66(fq ized tissue scattering as SS (f, Deff,S , nz,S ) SR (f, Deff,R , nz,R )
=
6 nz,S Deff,S 6 Deff,R nz,R
e
Deff 2 ) 2
1 [19], we get the normal-
−3.03975(D2
eff,S
−D2
eff,R
)f 2
.
(14)
To estimate the ESD from breast tissues, TMP dataset A has been used as the reference. In the case of estimating the ESD from the TMP datasets, TMP dataset A has been used as the reference phantom for TMP datasets B and C and TMP dataset B has been used as the reference phantom for TMP dataset A. Next, taking logarithm on both sides of (14) yields 10 log
SS (f, Deff,S , nz,S ) SR (f, Deff,R , nz,R )
= 10 log
6 Deff,S nz,S 6 Deff,R nz,R
2 ×(Deff,S
− 13.20
(15)
2 − Deff,R )f 2 .
To estimate ESD, we fit a regression line through the usable (i.e., −6 dB) bandwidth of the normalized log scattering power spectrum. From (10), this is equivalent to fitting a line through the usable bandwidth of the normalized log power spectrum. Assuming f2 = x, the regression line can be expressed as y = mx + c.
(16)
Here, y = 10 log
nz f 4
2
1 + 2.66
6
×e
2
2
d=1 −12.159f 2 (
(12)
2
2
1 + 2.66
2
2
2
=
nz,S f 4
Deff,S
×e
SR (f, Deff,R , nz,R ) cd (n),
fq
−12.159f 2
185Lq
6
Deff,S 2
=
The IMFs responsible for diffuse scattering, cd (n), d = 1, . . ., M, where M is the number of IMFs (normalized) responsible for diffuse scattering, are then identified by the K-S test [29]. The attenuationcompensated RF data, for both the sample and the TMP, are then replaced by the summation of the IMFs responsible for diffuse scattering as
5
SS (f, Deff,S , nz,S ) SR (f, Deff,R , nz,R )
,
2 2 − Deff,R ), m = −13.20 × (Deff,S
c = 10 log
6 Deff,S nz,S 6 Deff,R nz,R
.
(17) (18)
(19)
The ESD may be estimated from (18). However, the probability that a single gated RF block includes regions of heterogeneous tissues increases with the block size. Thus, for better modelling of tissue pathology, we use an NNARLF algorithm. We assume that the ESD of the neighboring tissues of the scattering particles in the neighboring blocks is almost the same for their physical proximity.
6
N.I. Nizam, S.R. Ara and Md.K. Hasan / Biomedical Signal Processing and Control 57 (2020) 101786
We calculate an average regression line as the weighted average of the regression lines of the neighboring blocks as
is +La
i0 =is −La
Y (is , js ) =
js +Ll
j0 =js −Ll
is +La
i0 =is −La
y(i0 , j0 )w(is .js ) (i0 , j0 )
js +Ll
j0 =js −Ll
.
(20)
w(is .js ) (i0 , j0 )
Here, Y(is , js ) denotes the weighted average value of y(is , js ), and w(is .js ) (i0 , j0 ) is the exponential weight function for an interrogated point (is , js ) on the 2-D ESD map, defined as w(is ,js ) (i0 , j0 ) = e−|a (i0 −is )|−|l (j0 −js )| ,
(21)
where a and l denote the weighting factors in the axial and lateral directions, respectively. Also, is − La ≤ i0 ≤ is + La and js − Ll ≤ j0 ≤ js + Ll , where, La and Ll are the nearest neighborhood (NN) factors in the axial and lateral directions, respectively. From (21) it is evident that w(is .js ) (i0 , j0 ) has the maximum value (unity) at (i0 , j0 ) = (is , js ). A 2-D weighted exponential neighborhood having La = LI = 5 is illustrated in Fig. 2. The values on the weight axis are arbitrary but show an exponential decay as we move away from the interrogated window both axially and laterally. Substituting the value of y from (16) into (20), we get
is +La js +Ll i0 =is −La
Y (is , js ) =
j =js −L
(m(i0 , j0 )x(i0 , j0 ) + c(i0 , j0 ))w(is .js ) (i0 , j0 )
0 l js +Ll is +La
i0 =is −La
.
j0 =js −Ll
(22)
w(is .js ) (i0 , j0 )
If we define the weighted average value of the slope as
is +La
i0 =is −La
M(is , js ) =
js +Ll
j0 =js −Ll
is +La
i0 =is −La
m(i0 , j0 )w(is .js ) (i0 , j0 )
js +Ll
j0 =js −Ll
,
(23)
w(is .js ) (i0 , j0 )
and the weighted average value of the intercept as
is +La
i0 =is −La
C(is , js ) =
js +Ll
is +La
j0 =js −Ll
i0 =is −La
c(i0 , j0 )w(is .js ) (i0 , j0 )
js +Ll
j0 =js −Ll
,
(24)
w(is .js ) (i0 , j0 )
[16], the AR spectrum was estimated from the IMFs produced by EEMD of the deconvolved RF data. In that work, the IMFs, responsible for coherent scattering (from which the MSS is estimated), were identified using an empirical criterion. The K-S test was only used to validate whether or not the selected IMFs are indeed a source of coherent scattering. In this work, we have ameliorated the previously proposed MSS estimation technique by incorporating the automatic IMF selection scheme, along with the bandpass filtering of the deconvolved RF data. A block diagram to illustrate the MSS estimation technique is shown in Fig. 3. We present the values of the average MSS estimates for different types of breast tissues and the performance of the MSS estimator as a biomarker for breast lesion classification in the results section. We also show the improvements in classification results from the previously proposed technique in [16]. 3. Results In this section, we present the classification results obtained by combining ESD with MSS. We also present the single-parameter classification results obtained using our ESD estimates and compare that classification performance with the classification results obtained using ESD estimated from other techniques that employ a Gaussian form factor model [14,18,19]. In addition, we show the single-parameter classification performance of the MSS estimator and compare its performance with some existing techniques for MSS estimation [2,16,28]. However, at first, we focus on the ESD estimates obtained using our proposed method and validate the accuracy of our proposed algorithm by estimating the ESD from TMPs. Additionally, we present the results of ESD estimated using some of the conventional Gaussian form factor based techniques [14,18,19]. Moreover, we present numerical results to validate the different steps of the proposed ESD estimation algorithm and hence, establish the suitability of the proposed ESD estimator for breast lesion classification.
then (22) can be written in the form of a regression line given by Y (is , js ) = M(is , js )x + C(is , js ).
(25)
Now, from the slope, M, of the regression line that fits (25), we can estimate the ESD, Deff,S , at the point (is , js ) using (18) as
Deff,S =
−
M(is , js ) 2 . + Deff,R 13.20
(26)
To estimate the block power spectra of an interrogated block with higher resolution, the block is divided into 1-D segments. Consecutive window segments in a block have 50% axial overlapping. The windowed segments are gated by the Hamming window. The block power spectra are calculated using the Welch method. We select La = Ll = 5 as the NN factors to estimate the local ESD for a particular interrogated block. 2.5. MSS estimation technique Another QUS parameter which has previously been successfully used for breast tissue characterization is the MSS [1,2,16]. Specifically, an MSS estimation algorithm developed in [16] produced promising binary classification results, albeit, on a small dataset. MSS estimation on the same dataset as ESD allows us to use multiple QUS parameters for breast lesion classification. The use of multiple QUS parameters for tissue characterization has been previously shown to produced improved results [2,17]. What is more, because the MSS is estimated from the coherent component of the backscattered RF data and EEMD separates the diffuse and coherent components, the MSS may be estimated simultaneously with the ESD, using a branching scheme. In the technique proposed in
3.1. ESD estimation results on TMPs To check the accuracy of our proposed average ESD estimation technique, we use three CIRS experimental TMPs. The average ESD of these phantoms are also estimated using the MASD based method [18], the frequency-domain method [19], and the method used in [14]. It is to be noted that a Faran form factor is usually employed for modeling the scattering from TMPs [48]. But, since tissue scattering is more accurately modeled by a Gaussian form factor, to apply Eq. (9), a Gaussian form factor model has been applied for the TMPs as well. Moreover, Gaussian form factor models have also been previously used to model the scattering from TMPs [49]. Table 4 presents the actual average ESD provided by the manufacturer and the average ESD estimated using our proposed algorithm as well as the other techniques for the experimental phantoms. It is evident that our proposed algorithm estimates the ESD for all three TMP data with a higher degree of accuracy compared to the other methods. This is reflected by a lower mean absolute percentage error (MAPE) value. Moreover, our method also shows a lower SD of estimates compared to the other methods. This can be attributed to the proposed multi-step system effect minimization techniques and the NNARLF algorithm. We have also estimated the ESD from the heterogeneous ROIs of TMP C using our proposed method. A scatter plot showing how the ESD varies as we move laterally from a region just outside the inclusion across the border of the inclusion to a region within the inclusion is presented in Fig. 4. It is evident from the figure that our proposed method can reliably estimate the ESD in the homo-
N.I. Nizam, S.R. Ara and Md.K. Hasan / Biomedical Signal Processing and Control 57 (2020) 101786
7
Fig. 2. An illustration of the exponentially weighted neighborhood.
Fig. 3. A block diagram illustrating the proposed MSS estimation technique.
Table 4 Estimated ESD values (in m) with SD (in bracket) from experimental TMPs. Method
TMP dataset
Average ESD of Background (m) (±SD)
MAPE (with respect to background) (%)
Average ESD inside Inclusion (m) (±SD)
MAPE (with respect to inclusion) (%)
MASD [18]
A B C
54.82 (±8.88) 52.18 (±8.09) 50.91 (±7.99)
21.82 15.96 13.13
– – 74.91 (±7.87)
– – 7.01
Frequency domain [19]
A B C
53.84 (±7.11) 50.10 (±7.07) 48.97 (±7.39)
19.64 11.33 8.82
– – 73.61(±7.09)
– 5.15
Nasief et al. [14]
A B C
53.75 (±7.09) 49.92 (±7.01) 49.54 (±7.43)
19.44 10.93 10.98
– – 73.58(±7.13)
– 5.11
Proposed
A B C
47.02 (±5.89) 47.76 (±6.01) 47.78 (±5.85)
4.49 6.13 6.18
– – 73.02 (±6.33)
– 4.31
geneous regions inside and outside the inclusion. The mean value is approximately 48 m outside the inclusion (represented by a red line) and the mean value of approximately 74 m within the inclusion (represented by a green line). There is a sharp change in the average ESD values across the border of the inclusion, as expected.
3.2. Effect of different steps of the proposed algorithm on the ESD estimation accuracy Now, to study how the different steps in the proposed algorithm impact the accuracy of the ESD estimates, we use an ablation
8
N.I. Nizam, S.R. Ara and Md.K. Hasan / Biomedical Signal Processing and Control 57 (2020) 101786
Fig. 4. Scatter plot showing the variation of ESD across different windows on moving laterally from the edge of the ROI in the TMP background to the edge of the ROI within the inclusion. (For interpretation of the references to color in the text, the reader is referred to the web version of this article.)
matory lesions are slightly higher than that for normal tissues. According to the best of our knowledge, the ESD values of inflammatory lesions have not been previously reported. It is evident that using our proposed method, the estimated ESD values for fibroadenoma and malignant breast tissues fall within the range of ESD values previously reported in the literature [2]. Furthermore, it is seen that the average ESD value of malignant lesions is greater than that of benign lesions which also conforms with the previously reported results [2]. We see that, for all the other techniques [14,18,19], the SD of ESD estimates for both fibroadenoma and malignant tissues are significantly higher than that of our proposed method. The proposed ESD estimation algorithm has also been applied on some cystic lesions. The estimated ESD values of cysts are rather erratic and show a high SD and hence, are not presented. This is true for all the techniques used to obtain the ESD estimates in Table 5. Therefore, it can be concluded that ESD estimates of cystic lesions are of no diagnostic importance. This is in good concordance with the anatomy of the cysts as they are fluid-filled sacs. Thus, scattering from cysts will largely be absent and some inconsistent scattering may occur due to debris (such as those present in complex cysts) [50]. Hence, these datasets are not included in binary classification of breast lesions. However, the true gold standard for ESD estimates need to be extracted from histopathology images. But, unfortunately, we do not have excess to these images for our dataset. Thus, the benefits of our proposed method can only be compared through the classifications results for which there exists definite pathological gold standard. 3.4. Effect of the weighted exponential neighborhood on the SD of ESD estimates
Fig. 5. Bar plot showing the impact on the percentage error and SD of the ESD estimates from the experimental TMPs on removing the different steps from our proposed algorithm.
approach. The ESD was estimated for several ROIs of TMP datasets A, B, and C by removing the different steps shown in the flow chart of Fig. 1 one by one while retaining the others. The results are presented in Fig. 5 in the form of a bar plot. The first column of the bar plot presents the performance of the proposed algorithm on the same datasets for comparison. It can be seen that the overall estimation accuracy is most adversely impacted by removing the EEMD step. There is also a slight rise in the SD of estimates on removing the EEMD step. Removal of any one of the system effect minimization steps, i.e., deconvolution, filtering, and normalization using a reference TMP, also noticeably impacts the ESD estimation accuracy. There is a slight increase in the SD of ESD estimates in each case. Hence, it justifies the use of a multi-step system effect minimization technique. The weighted neighborhood step seems to have the least impact on the overall accuracy. However, the removal of this step is seen to have a more negative impact on the SD of the estimates compared to the other steps of the proposed method. This is later shown to be an important factor in improving the binary classification performance. 3.3. ESD estimation results on in vivo breast tissues The results of estimating the ESD of in vivo breast tissues using various techniques are presented in Table 5. All the methods give similar average ESD estimates for normal tissues. The average ESD estimates for normal tissues is in concordance with values previously reported in the literature [2]. The average ESD value for inflammatory tissues, using our proposed method, is close to that of normal tissues. However, for the other techniques [14,18,19], the average ESD estimates for inflam-
An important factor that has to be taken into consideration while estimating ESD from in vivo tissues is the impact of the kernel size of the weighted exponential neighborhood. As mentioned earlier, the algorithm uses a 5 × 5 weighted exponential neighborhood. The use of such a neighborhood allows the modeling of tissue homogeneity over a small region rather than a large spatial block where the tissue becomes more heterogeneous. To investigate the impact of the size of the neighborhood on the ESD estimates, the ESD is estimated using our proposed algorithm for normal tissues for no neighborhood, a neighborhood of size 3 × 3, and a neighborhood of size 8 × 8. The results are shown in Table 6. It is clear from the table that the choice of neighborhood size does not greatly affect the average value of the ESD estimates. However, the SD of estimates seems to decrease significantly for a neighborhood size close to our selected one. However, using a large spatial block, without the use of an exponentially weighted neighborhood, increases the SD of estimates. This observation is consistent with that obtained for experimental TMPs. There, the removal of the weighted neighborhood step adversely impacted the SD of estimates. To further substantiate our argument for choosing an exponentially weighted neighborhood, we produce, in Fig. 6, an ESD map for a fibroadenoma tissue. Two ROIs are taken within the border of the lesion and two ROIs are taken outside the lesion. In the case of ROI selection, the protocols mentioned previously have been followed. Clearly, the ESD values are consistent (having a low SD) across the ROIs (both inside and outside the lesion). Hence, the use of an exponential weighted neighborhood models the tissue pathology more closely than using a large spatial block. 3.5. MSS estimation results on in vivo breast tissues The MSS estimation results for different types of breast tissues are presented in Table 7. The MSS values estimated for fibroade-
N.I. Nizam, S.R. Ara and Md.K. Hasan / Biomedical Signal Processing and Control 57 (2020) 101786
9
Table 5 Estimated values of ESD with SD (in bracket) using different methods for normal and pathological tissues. ESD (±SD) (m) Method
MASD [18] Frequency domain [19] Nasief et al. [14] Proposed
Malignant
Fibroadenoma
Inflammatory
Normal
In
Out
In
Out
In
Out
100.50 (±22.03) 108.21 (±17.14) 105.31 (±21.50) 124.38 (±8.64)
84.11 (±8.31) 80.01 (±8.19) 79.98 (±8.39) 75.20 (±4.17)
93.51 (±18.79) 102.56 (±12.76) 92.88 (±18.16) 98.79 (±9.65)
81.02 (±7.31) 79.42 (±8.43) 78.75 (±9.03) 75.14 (±4.14)
88.31 (±11.34) 81.24 (±9.81) 85.38 (±10.42) 75.72 (±4.09)
90.35 (±10.01) 82.45 (±11.01) 80.96 (±10.90) 75.77 (±4.07)
76.13 (±7.01) 75.88 (± 6.74) 75.43 (± 7.02) 75.12 (±4.01)
Table 6 Estimated values of ESD with SD (in bracket) for normal tissues with different neighborhood sizes. Size of neighborhood
ESD (±SD) (m)
No neighborhood 3×3 5×5 8×8
76.89 (±8.56) 74.99 (±4.09) 75.12 (±4.01) 76.46 (±7.56)
and malignant tissues from our previously proposed method in [16]. 3.6. Classification results
Fig. 6. ESD map for four ROIs for a representative fibroadenoma tissue. A deeper shade of blue indicates higher ESD values. The values on the axes are arbitrary. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
nomas, malignant lesions and normal breast tissues fall within the range of values previously reported [1,2]. As in the case of ESD, the MSS values of inflammatory lesions have not been reported before. The estimated MSS values for inflammatory lesions are slightly higher than those for normal tissues. The first and second rows of the table present the results for MSS estimation using a modified spectral autocorrelation (SAC) based technique [2] and a singular spectrum analysis (SSA) based technique [28]. The modified SAC based technique [2] has been chosen for comparison since this method has been used for tissue characterization. The SSA based technique [28] is chosen because it a signal decomposition based technique, similar to the proposed MSS estimation technique. The fourth row presents our previously proposed MSS estimation technique [16]. It is evident that compared to the techniques in [2] and [28], the separation between the average MSS values of different groups of tissues is higher for our proposed MSS estimation method. There is a very slight increase in separation between the average MSS values of the benign
3.6.1. Classification results based on ESD Table 8 presents the classification results based on only ESD for the proposed method, the MASD-based method [18], the frequency-domain method [19], and the method used in [14]. For the proposed method, a classification scheme based on ESD alone produces sensitivity, specificity, and accuracy values of 95.45%, 95.79%, and 95.68%, respectively. This is significantly better than the other techniques used for comparison in this paper [14,18,19]. Additionally, the Sum5 and MCC values of 476.08 and 0.9019, respectively, are far superior compared to the other techniques. When the Fisher’s exact test [51] is used for testing statistical significance, the proposed method achieved significantly better quality metrics than the MASD-based method [18], the frequencydomain method [19], and the method in [14], with p values equal to 4.415 × 10−9 , 8.832 × 10−7 , and 0.0000057946, respectively. The AUC is calculated using a bootstrapping strategy (200 bootstraps) to obtain the mean and 95% confidence intervals (CIs) of the ROC curve. This yields a mean AUC value of 0.95 with a CI of 0.89–0.98 for classification based on our proposed ESD estimator. 3.6.2. Classification results based on MSS Table 9 presents the classification performance of MSS for different methods. The first and second rows of the table present the classification performance using the MSS values estimated from the modified SAC [2] and the SSA [28] based techniques, respectively. The third row presents the results obtained using the MSS
Table 7 Estimated values of MSS with SD (in bracket) using different methods for normal and pathological tissues. Method
Modified SAC [2] SSA [28] Nizam et al. [16] Proposed method
MSS (±SD) (mm) Malignant
Fibroadenoma
Inflammatory
Normal
0.75 (±0.05) 0.85 (±0.06) 0.79 (±0.05) 0.79 (±0.03)
0.77 (±0.05) 0.81 (±0.06) 0.76 (±0.04) 0.75 (±0.03)
0.78 (±0.07) 0.82 (±0.06) 0.73 (±0.05) 0.73 (±0.04)
0.80 (±0.07) 0.80 (±0.05) 0.70 (±0.05) 0.69 (±0.03)
10
N.I. Nizam, S.R. Ara and Md.K. Hasan / Biomedical Signal Processing and Control 57 (2020) 101786
Table 8 Breast lesion classification results using ESD for different methods. Method
TP
TN
FP
FN
Sens. (%)
Spec. (%)
Acc. (%)
PPV (%)
NPV (%)
Sum5
MCC
AUC
MASD [18] Frequency domain [19] Nasief et al. [14] Proposed
17 22 24 42
85 79 84 91
10 16 11 4
27 22 20 2
38.63 50 54.54 95.45
89.47 83.16 88.42 95.79
73.38 72.66 77.70 95.68
62.96 57.89 68.57 91.30
75.89 78.22 80.77 97.85
340.34 341.93 370.00 476.08
0.3305 0.3460 0.4604 0.9019
0.69 0.69 0.75 0.95
Table 9 Breast lesion classification results obtained using MSS for different methods. Method
TP
TN
FP
FN
Sens. (%)
Spec. (%)
Acc. (%)
PPV (%)
NPV (%)
Sum5
MCC
AUC
Modified SAC [2] SSA [28] Nizam et al. [16] Proposed
30 27 28 29
54 64 74 74
41 31 21 21
14 17 16 15
68.18 61.36 63.64 65.91
56.84 67.37 77.89 77.89
60.43 65.46 73.38 74.10
42.25 46.55 57.14 58
79.41 79.01 82.22 83.15
307.12 319.76 354.28 359.06
0.2328 0.2710 0.4043 0.4245
0.62 0.63 0.70 0.72
Table 10 Breast lesion classification results of the proposed methods obtained using a combination of ESD and MSS. Parameter
TP
TN
FP
FN
Sens. (%)
Spec. (%)
Acc. (%)
PPV (%)
NPV (%)
Sum5
MCC
AUC
MSS ESD ESD and MSS
29 42 43
74 91 91
21 4 4
15 2 1
65.91 95.45 97.73
77.89 95.79 95.79
74.10 95.68 96.40
58 91.30 91.49
83.15 97.85 98.91
359.06 476.08 480.31
0.4245 0.9019 0.9195
0.72 0.95 0.97
Table 11 Comparative classification results obtained using different techniques. Method
QUS parameter
Sum5
MCC
AUC
Modified SAC [2] SSA [52] MASD [18] Frequency domain [19] Nizam et al. [16] Proposed Nasief et al. [14] Proposed Proposed
MSS MSS ESD ESD MSS MSS ESD ESD ESD and MSS
307.12 319.76 340.34 341.93 354.28 359.05 370.00 476.08 480.32
0.2328 0.2710 0.3305 0.3460 0.4043 0.4245 0.4604 0.9019 0.9195
0.62 0.63 0.69 0.69 0.70 0.72 0.75 0.95 0.97
estimation technique proposed in [16]. The fourth row contains the results obtained from our proposed technique. We see that the classification results, obtained using our proposed technique, are much better compared to the modified SAC and SSA techniques [2,28]. Also, ameliorating the technique in [16], has resulted in an improved classification performance as exhibited by higher Sum5 and MCC values. 3.6.3. Classification results based on a combination of ESD and MSS Next, the results of combining ESD with MSS is shown in Table 10. It is observed from the table that when ESD is fused with MSS, we obtain improved sensitivity, specificity, accuracy, and MCC values of 97.73%, 95.79%, and 96.40%, and 0.9195, respectively. The mean AUC value, in this case, is found to be 0.97 with a 95% CI of 0.93–0.99. These performance metrics are clearly better than classification based on only a single QUS parameter (either the ESD or MSS). The first two rows of Table 10 again present the classification results obtained using our proposed ESD and MSS estimation techniques, for reference. As discussed before, the classification performance of the combination of ESD and MSS has been evaluated using SVM, LDA, MNR, KNN, and Naïve Bayes classifiers. In Table 10, the reported Sum5 value of 480.31 is obtained for a quadratic LDA classifier and represents the best Sum5 value obtained out of all the above-mentioned classifiers. The average Sum5 value (±SD), found by averaging the Sum5 values obtained from each of the mentioned classifiers, is
480.04 (±0.22). This indicates that the classification performance obtained using the combination of ESD and MSS is fairly stable. 3.6.4. Comparative classification performance of the different techniques In Table 11, we show a comparative performance of the different techniques discussed in this paper for breast lesion classification. We use the quantitative indices AUC, MCC, and Sum5 since they provide a good picture of the overall classification performance. For both MSS and ESD, our proposed algorithms produce a superior classification performance. The best classifications results are obtained on combining ESD and MSS, estimated using our proposed techniques. This can again be attributed to the strength of our proposed algorithm. As the true gold standard in the form of histopathology slides is not available, a superior classification performance is the best available indicator of the strength of our proposed algorithms compared to the other techniques. A higher AUC value of 0.999 was reported in [14] using ESD with two other QUS parameters. But, in that work, a limited dataset of 43 lesions had been used. 3.7. Effect of different steps of the proposed ESD estimation algorithm on the classification performance We investigate how the different steps of the proposed algorithm impact the classification accuracy when only ESD is used for classifying the breast lesions. This is again done by using an abla-
N.I. Nizam, S.R. Ara and Md.K. Hasan / Biomedical Signal Processing and Control 57 (2020) 101786
11
Table 12 Breast lesion classification results obtained using only ESD by removing the different steps in the proposed technique. Condition
TP
TN
FP
FN
Sens. (%)
Spec. (%)
Acc. (%)
PPV (%)
NPV (%)
Sum5
MCC
AUC
Proposed No Deconv. No Norm. No Neigh. No Filt. No EEMD
42 37 40 31 36 32
91 92 87 92 86 77
4 3 8 3 9 18
2 7 4 13 8 12
95.45 84.09 90.91 70.45 81.81 72.72
95.79 96.84 91.58 96.84 90.52 81.05
95.68 92.81 91.37 88.49 87.77 78.41
91.30 92.50 83.33 91.18 80 64
97.85 92.93 95.61 87.62 91.49 86.51
476.08 459.17 452.79 434.58 431.60 382.71
0.9019 0.8315 0.8069 0.7282 0.7230 0.5212
0.95 0.93 0.91 0.89 0.86 0.75
Table 13 Symbols and acronyms. Symbol
Description
W (v) T(f) D(f, z) S(f, Deff , nz ) Deff nz z A(f, z) (f) Ac (f, z) (v) Wcomp (n) wcomp gp (n) NE w pcomp (n) cpj (n) cj (n) M w icomp (n) WS,comp (v) (v) WR,comp RF ESD EAC MSS AR QUS AC TMP ROI PSF EEMD IMF K-S test SD MASD MAPE NN NNARLF SAC SSA MCC AUC
Gated backscattered RF signal in the frequency domain Combined effect of transmitted pulse and transducer sensitivity in the frequency domain Diffraction component of W (v) in the frequency domain Scattering component of W (v) in the frequency domain Effective scatterer diameter Effective acoustic concentration Depth of the gated segment from the transducer face Cumulative attenuation Attenuation coefficient (AC) in unit Nepers/cm Attenuation compensation function Attenuation compensated backscattered RF signal in the frequency domain Attenuation compensated backscattered signal in the discrete-time domain Random Gaussian noise in the discrete-time domain Vector length of random Gaussian noise wcomp (n) with 30dB Gaussian noise added, used for EEMD decomposition IMFs computed from w pcomp (n) using EEMD Ensemble average of IMFs, computed over the noise vector length of NE No of IMFs responsible for diffuse scattering, selected through K-S test Sum of all ensemble averaged IMF, responsible for diffuse scattering, in thediscrete-time domain Frequency domain representation of w icomp (n) for sample Frequency domain representation of w icomp (n) for reference Radio-frequency Effective scatterer diameter Effective acoustic concentration Mean scatterer spacing Autoregressive Quantitative ultrasound Attenuation coefficient Tissue-mimicking phantom Region of interest Point spread function Ensemble empirical mode decomposition Intrinsic mode function Kolmogorov–Smirnov test Standard deviation Minimization of the average squared deviation Mean absolute percentage error Nearest neighborhood Nearest neighborhood average regression line fitting Spectral autocorrelation Singular spectrum analysis Matthew’s correlation coefficient Area under the receiver operating characteristics curve
tion approach. The results are shown in Table 12. It is clear from the table that removing the EEMD step has the most drastic impact on classification performance. There is a sharp fall in the sensitivity, specificity, and accuracy results from the proposed method. Furthermore, removal of the exponential neighborhood step has a significantly detrimental impact on the classification performance. This can be correlated to the observations from experimental TMPs and in vivo tissues. There, it has been observed that the removal of the weighted neighborhood step significantly increases the SD of ESD estimates. This, in turn, results in a poor classification performance. It is also clear that removing any one of the system effect minimization steps, i.e., deconvolution, filtering, and normalization, also impacts the classification performance adversely. This again confirms the idea that system effect minimization requires
a multi-step approach. Therefore, the different steps of the proposed ESD estimation algorithm work together in improving the classification performance. 4. Discussion In this paper, we proposed a QUS parameter based breast lesion classification scheme using ESD and MSS. To obtain reliable ESD estimates for breast lesion classification, an improved ESD estimation technique for breast tissues has been proposed. The proposed ESD estimator has been shown to outperform the conventional techniques, that also employ a Gaussian form factor. The higher values of the performance metrics in Table 8 compared to the other techniques show the suitability of the proposed ESD estimator to be
12
N.I. Nizam, S.R. Ara and Md.K. Hasan / Biomedical Signal Processing and Control 57 (2020) 101786
used as a CAD tool. Table 10 also demonstrates the suitability of the proposed ESD and MSS estimators to be used as CAD tools, when they are used in conjunction. However, it is to be noted that ESD and MSS cannot be estimated for cystic lesions. Therefore, an ESD/MSS based classification scheme will be unable to characterize cysts correctly. Hence, this is a limitation of such a classification scheme. However, some strain imaging techniques have been developed which can successfully characterize cysts [53]. These may be used in conjunction with ESD and MSS for breast lesion classification. What is more, it is evident from Table 5, that the difference between the average ESD values of inflammatory lesions and normal tissues is quite small for the proposed method. Given the SD from the mean values, it is clear that there is a significant overlap between these two types of tissue. In this paper, we have opted for a binary classification of breast lesions. Here, inflammatory lesions have been classified as benign and normal tissues have been excluded from the classification algorithm. If a three-class classification technique were used (with benign, malignant, and normal tissues as the three classes), the classification results would deteriorate. But, it is to be noted that inflammatory lesions, such as breast tuberculosis, is often confused with malignancy [54]. From that point of view, a greater separation between the average ESD values between malignant and inflammatory lesions might be suitable for diagnosis. Furthermore, it is clear from Eq. (19) that the proposed technique can estimate another QUS parameter, EAC. But, the EAC values estimated using this technique have a poor classification performance on this dataset. Additionally, the classification performance of the QUS parameters is found to be sensitive to several hyperparameters, such as the number of folds, weights, etc. Future works involving QUS based classifiers should try to reduce the sensitivity of the classification performance to these parameters. Besides, this paper employs a Gaussian form factor to model tissue scattering. This has produced reasonable ESD estimates in this work. Similar schemes using other form factors such as the Faran form factor need to be explored for QUS parameter estimation. This requires extensive work in the future. Firstly, frequency-domain scattering models need to be developed using other form factors. Next, these form factor models need to be compared on a larger dataset. Also, the true gold-standard for QUS parameters relating to tissue micro-structures like the ESD and MSS have to be extracted using microscopy analysis from histopathology slides. Because of a lack of histopathology slides for our dataset, we could not carry out such an analysis. Moreover, recent trends suggest the use of deep neural networks for ultrasonic tissue characterization [55]. The dataset used in this work is still too limited for such an approach. But, deep neural networks have reached an accuracy value of only 90% even with a dataset consisting of 7408 lesions [55]. It has been suggested that combining traditional signal processing based preprocessing steps with deep neural networks may be the best approach for further improving the ultrasonic tissue classification performance in the future [56].
5. Conclusion This paper has presented a multi-QUS biomarker-based breast lesion classification scheme that uses ESD and MSS as the QUS parameters. An improved frequency domain technique for ESD estimation from the diffuse component of the backscattered data using EEMD has also been proposed. The ESD estimates based on this technique has been shown to produce a better classification performance compared to some of the existing techniques for ESD estimation. Additionally, when the ESD is fused with another prominent QUS parameter, the MSS, estimated from an ameliorated EEMD domain AR spectral estimation technique, the classification performance further improves. Therefore, the proposed ESD
estimator, in conjunction with the proposed MSS estimator, is promising enough to be used as a CAD tool for breast lesion classification. Also, it opens the possibility for the use of these QUS parameters as non-invasive biomarkers for breast cancer detection. Acknowledgement This work was supported by HEQEP UGC, Bangladesh, under Grant CPSF#96/BUET/Win-2/ST(EEE)/2017. The in vivo breast data were acquired at the BUET Medical Center by Dr. Farzana Alam, Assistant Professor of Radiology and Imaging, Bangabandhu Sheikh Mujib Medical University, Dhaka-1000, Bangladesh. The computer programs for the methods in [18] and [19] were derived from the graphical user interface (GUI) developed by the authors of those papers. We would like to thank the authors for providing us with the GUI. Declaration of interests: None declared. References [1] Y. Bige, Z. Hanfeng, W. Rong, Analysis of microstructural alterations of normal and pathological breast tissue in vivo using the ar cepstrum, Ultrasonics 44 (2) (2006) 211–215. [2] H. Tadayyon, A. Sadeghi-Naini, L. Wirtzfeld, F.C. Wright, G. Czarnota, Quantitative ultrasound characterization of locally advanced breast cancer by estimation of its scatterer properties, Med. Phys. 41 (1) (2014). [3] B. Alacam, B. Yazici, N. Bilgutay, F. Forsberg, C. Piccoli, Breast tissue characterization using farma modeling of ultrasonic rf echo, Ultrasound Med. Biol. 30 (10) (2004) 1397–1407. [4] K. Nam, J.A. Zagzebski, T.J. Hall, Quantitative assessment of in vivo breast masses using ultrasound attenuation and backscatter, Ultrason. Imaging 35 (2) (2013) 146–161. [5] F.L. Lizzi, M. Ostromogilsky, E.J. Feleppa, M.C. Rorke, M.M. Yaremko, Relationship of ultrasonic spectral parameters to features of tissue microstructure, IEEE Trans. Ultrason. Ferroelectr. Freq. Control 34 (3) (1987) 319–329. [6] E. Feleppa, F. Lizzi, D. Coleman, M. Yaremko, Diagnostic spectrum analysis in ophthalmology: a physical perspective, Ultrasound Med. Biol. 12 (8) (1986) 623–631. [7] W. Liu, J.A. Zagzebski, M.A. Kliewer, T. Varghese, T.J. Hall, Ultrasonic scatterer size estimation in liver tumor differntiation, Med. Phys. 34 (2597) (2007). [8] M.F. Insana, T.J. Hall, J.G. Wood, Z. Yan, Renal ultrasound using parametric imaging techniques to detect changes in microstructure and function, Invest. Radiol. 28 (8) (1993) 720–725. [9] T.J. Hall, M.F. Insana, L.A. Harrison, G.G. Cox, Ultrasonic measurement of glomerular diameters in normal adult humans, Ultrasound Med. Biol. 22 (8) (1996) 987–997. [10] M.F. Insana, Modeling acoustic backscatter from kidney microstructure using an anisotropic correlation function, J. Acoust. Soc. Am. 97 (1) (1995) 649–655. [11] E.J. Feleppa, T. Liu, A. Kalisz, M.C. Shao, N. Fleshner, V. Reuter, W.R. Fair, Ultrasonic spectral-parameter imaging of the prostate, Int. J. Imaging Syst. Technol. 8 (1) (1997) 11–25. [12] S.L. Bridal, P. Fornès, P. Bruneval, G. Berger, Parametric (integrated backscatter and attenuation) images constructed using backscattered radio frequency signals (25–56 MHz) from human aortae in vitro, Ultrasound Med. Biol. 23 (2) (1997) 215–229. [13] R.H. Silverman, R. Folberg, H.C. Boldt, H.O. Lloyd, M.J. Rondeau, M.G. Mehaffey, F.L. Lizzi, D.J. Coleman, Correlation of ultrasound parameter imaging with microcirculatory patterns in uveal melanomas, Ultrasound Med. Biol. 23 (4) (1997) 573–581. [14] H.G. Nasief, I.M. Rosado-Mendez, J.A. Zagzebski, T.J. Hall, A quantitative ultrasound-based multi-parameter classifier for breast masses, Ultrasound Med. Biol. (2019). [15] A. Sadeghi-Naini, H. Suraweera, W.T. Tran, F. Hadizad, G. Bruni, R.F. Rastegar, B. Curpen, G.J. Czarnota, Breast-lesion characterization using textural features of quantitative ultrasound parametric maps, Sci. Rep. 7 (1) (2017) 13638. [16] N.I. Nizam, S.K. Alam, M.K. Hasan, EEMD domain ar spectral method for mean scatterer spacing estimation of breast tumors from ultrasound backscattered rf data, IEEE Trans. Ultrason. Ferroelect. Freq. Control 64 (10) (2017) 1487–1500. [17] S.R. Ara, S.K. Bashar, F. Alam, M.K. Hasan, EMD-DWT based transform domain feature reduction approach for quantitative multi-class classification of breast lesions, Ultrasonics 80 (2017) 22–33. [18] M.L. Oelze, W.D. O’Brien Jr., Method of improved scatterer size estimation and application to parametric imaging using ultrasound, J. Acoust. Soc. Am. 112 (6) (2002) 3053–3063. [19] M.L. Oelze, J.F. Zachary, W.D. O’Brien Jr., Characterization of tissue microstructure using ultrasonic backscatter: Theory and technique for optimization using a gaussian form factor, J. Acoust. Soc. Am. 112 (3) (2002) 1202–1211.
N.I. Nizam, S.R. Ara and Md.K. Hasan / Biomedical Signal Processing and Control 57 (2020) 101786 [20] K.A. Wear, R.F. Wagner, M.F. Insana, T.J. Hall, Application of autoregressive spectral analysis to cepstral estimation of mean scatterer spacing, IEEE Trans. Ultrason. Ferroelectr. Freq. Control 40 (1) (1993) 50–58. [21] R. Lavarello, M. Oelze, Quantitative ultrasound estimates from populations of scatterers with continuous size distributions: effects of the size estimator algorithm, IEEE Trans. Ultrason. Ferroelectr. Freq. Control 59 (9) (2012) 2066–2076. [22] T. Varghese, K.D. Donohue, Estimating mean scatterer spacing with the frequency-smoothed spectral autocorrelation function, IEEE Trans. Ultrason. Ferroelectr. Freq. Control 42 (3) (1995) 451–463. [23] Z. Klimonda, M. Postema, A. Nowicki, J. Litniewski, Tissue attenuation estimation by mean frequency downshift and bandwidth limitation, IEEE Trans. Ultrason. Ferroelectr. Freq. Control 63 (8) (2016) 1107–1115. [24] M.H.R. Khan, M.K. Hasan, Attenuation estimation of soft tissue with reference-free minimization of system effects, Biomed. Signal Process. Control 50 (2019) 121–133. [25] M.F. Insana, R.F. Wagner, D.G. Brown, T.J. Hall, Describing small-scale structure in random media using pulse-echo ultrasound, J. Acoust. Soc. Am. 87 (1) (1990) 179–192. [26] A.C. Luchies, G. Ghoshal, W.D. O’Brien Jr., M.L. Oelze, Quantitative ultrasonic characterization of diffuse scatterers in the presence of structures that produce coherent echoes, IEEE Trans. Ultrason. Ferroelectr. Freq. Control 59 (5) (2012) 893–904. [27] A.C. Luchies, M.L. Oelze, Backscatter coefficient estimation using tapers with gaps, Ultrason. Imaging 37 (2) (2015) 117–134. [28] W.C.D.A. Pereira, C.D. Maciel, Performance of ultrasound echo decomposition using singular spectrum analysis, Ultrasound Med. Biol. 27 (9) (2001) 1231–1238. [29] G. Georgiou, F.S. Cohen, Statistical characterization of diffuse scattering in ultrasound images, IEEE Trans. Ultrason. Ferroelectr. Freq. Control 45 (1) (1998) 57–64. [30] M. Miyoshi, Y. Kaneda, Inverse filtering of room acoustics, IEEE Trans. Acoust. Speech Signal Process. 36 (2) (1988) 145–152. [31] W.K. Moon, C.-S. Huang, W.-C. Shen, E. Takada, R.-F. Chang, J. Joe, M. Nakajima, M. Kobayashi, Analysis of elastographic and b-mode features at sonoelastography for breast tumor classification, Ultrasound Med. Biol. 35 (11) (2009) 1794–1802. [32] W. Zhu, N. Zeng, N. Wang, et al., Sensitivity, specificity, accuracy, associated confidence interval and roc analysis with practical sas implementations, in: NESUG Proceedings: Health Care and Life Sciences, vol. 19, Baltimore, Maryland, 2010, pp. 1–9. [33] B.W. Matthews, Comparison of the predicted and observed secondary structure of t4 phage lysozyme, Biochim. Biophys. Acta (BBA)-Protein Struct. 405 (2) (1975) 442–451. [34] S. Zhou, J. Shi, J. Zhu, Y. Cai, R. Wang, Shearlet-based texture feature extraction for classification of breast tumor in ultrasound image, Biomed. Signal Process. Control 8 (6) (2013) 688–696. [35] G. Ghoshal, M.L. Oelze, Improved scatterer property estimates from ultrasound backscatter using gate-edge correction and a pseudo-welch technique, IEEE Trans. Ultrason. Ferroelectr. Freq. Control 57 (12) (2010) 2828–2832. [36] M.K. Hasan, M.S.E. Rabbi, S.Y. Lee, Blind deconvolution of ultrasound images using l1 -norm-constrained block-based damped variable step-size multichannel LMS algorithm, IEEE Trans. Ultrason. Ferroelectr. Freq. Control 63 (8) (2016) 1116–1130.
13
[37] M.K. Hasan, M. Shifat-E-Rabbi, S.Y. Lee, Blind deconvolution of ultrasound images using l1 -norm-constrained block-based damped variable step-size multichannel lms algorithm, IEEE Trans. Ultrason. Ferroelectr. Freq. Control 63 (8) (2016) 1116–1130. [38] N.E. Huang, Z. Shen, S.R. Long, M.C. Wu, H.H. Shih, Q. Zheng, N.-C. Yen, C.C. Tung, H.H. Liu, The empirical mode decomposition and the hilbert spectrum for nonlinear and non-stationary time series analysis, Proc. R. Soc. Lond. A: Math. Phys. Eng. Sci. 454 (1971) (1998) 903–995, The Royal Society. [39] R.J. Gledhill, Methods for Investigating Conformational Change in Biomolecular Simulations, Ph.D. Dissertation, School Chem., Southampton Univ., 2004. [40] Y. Gao, G. Ge, Z. Sheng, E. Sang, Analysis and solution to the mode mixing phenomenon in EMD, in: CISP’08, vol. 5, IEEE, 2008, pp. 223–227. [41] E. Khan, F. Al Hossain, S.Z. Uddin, S.K. Alam, M.K. Hasan, A robust heart rate monitoring scheme using photoplethysmographic signals corrupted by intense motion artifacts, IEEE Trans. Biomed. Eng. 63 (3) (2016) 550–562. [42] H. Kim, T. Varghese, Attenuation estimation using spectral cross-correlation, IEEE Trans. Ultrason. Ferroelectr. Freq. Control 54 (3) (2007). [43] Y. Labyed, T.A. Bigelow, A theoretical comparison of attenuation measurement techniques from backscattered ultrasound echoes, J. Acoust. Soc. Am. 129 (4) (2011) 2316–2324. [44] R. Kuc, Bounds on estimating the acoustic attenuation of small tissue regions from reflected ultrasound, Proc. IEEE 73 (7) (1985) 1159–1168. [45] M.K. Hasan, M.A. Hussain, S.R. Ara, S.Y. Lee, S.K. Alam, Using nearest neighbors for accurate estimation of ultrasonic attenuation in the spectral domain, IEEE Trans. Ferroelectr. Freq. Control 60 (6) (2013) 1098–1114. [46] N.E. Huang, N.O. Attoh-Okine, The Hilbert-Huang Transform in Engineering, CRC Press, 2005. [47] F.L. Lizzi, M. Astor, T. Liu, C. Deng, D.J. Coleman, R.H. Silverman, Ultrasonic spectrum analysis for tissue assays and therapy evaluation, Int. J. Imaging Syst. Technol. 8 (1) (1997) 3–10. [48] E. Franceschini, R. Guillermin, Experimental assessment of four ultrasound scattering models for characterizing concentrated tissue-mimicking phantoms, J. Acoust. Soc. Am. 132 (6) (2012) 3735–3747. [49] J. Mamou, M.L. Oelze, Quantitative Ultrasound in Soft Tissues, Springer, 2013. [50] W.A. Berg, A.G. Sechtin, H. Marques, Z. Zhang, Cystic breast masses and the acrin 6666 experience, Radiol. Clin. 48 (5) (2010) 931–987. [51] R.A. Fisher, On the interpretation of 2 from contingency tables, and the calculation of p, J. R. Stat. Soc. 85 (1) (1922) 87–94. [52] G. Georgiou, F.S. Cohen, C.W. Piccoli, F. Forsberg, B.B. Goldberg, Tissue characterization using the continuous wavelet transform. Part II: Application on breast rf data, IEEE Trans. Ultrason. Ferroelectr. Freq. Control 48 (2) (2001) 364–373. [53] M.S.E. Rabbi, M.K. Hasan, Speckle tracking and speckle content based composite strain imaging for solid and fluid filled lesions, Ultrasonics 74 (2017) 124–139. [54] S. Baharoon, Tuberculosis of the breast, Ann. Thorac. Med. 3 (3) (2008) 110. [55] S. Han, H.-K. Kang, J.-Y. Jeong, M.-H. Park, W. Kim, W.-C. Bang, Y.-K. Seong, A deep learning framework for supporting the classification of breast lesions in ultrasound images, Phys. Med. Biol. 62 (19) (2017) 7714. [56] S. Liu, Y. Wang, X. Yang, B. Lei, L. Liu, S.X. Li, D. Ni, T. Wang, Deep learning in medical ultrasound analysis: a review, in: Engineering, 2019.