Radiomics Analysis on Ultrasound for Prediction of Biologic Behavior in Breast Invasive Ductal Carcinoma

Accepted Manuscript

Radiomics Analysis on Ultrasound for Prediction Biological Behavior in Breast Invasive Ductal Carcinoma

Yi Guo, Yuzhou Hu, Mengyun Qiao, Yuanyuan Wang, Jinhua Yu, Jiawei Li, Cai Chang

PII: S1526-8209(17)30146-5
DOI: 10.1016/j.clbc.2017.08.002
Reference: CLBC 663
To appear in: Clinical Breast Cancer
Received Date: 8 March 2017
Revised Date: 26 July 2017
Accepted Date: 7 August 2017

Please cite this article as: Guo Y, Hu Y, Qiao M, Wang Y, Yu J, Li J, Chang C, Radiomics Analysis on Ultrasound for Prediction Biological Behavior in Breast Invasive Ductal Carcinoma, Clinical Breast Cancer (2017), doi: 10.1016/j.clbc.2017.08.002. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

ACCEPTED MANUSCRIPT

Radiomics Analysis on Ultrasound for Prediction Biological Behavior in Breast Invasive Ductal Carcinoma

Yi Guo1,2, Yuzhou Hu1, Mengyun Qiao1, Yuanyuan Wang1,2*, Jinhua Yu1,2, Jiawei Li3, Cai Chang3

1 Department of Electronic Engineering, Fudan University, Shanghai 200433, China
2 Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention of Shanghai, Shanghai 200433, China
3 Department of Ultrasound, Fudan University Shanghai Cancer Center, Shanghai 200032, China

Corresponding author: Yuanyuan Wang, Department of Electronic Engineering, Fudan University, Shanghai 200433, China; telephone: +86-21-65642756; email: [email protected]

Conflicts of Interest

The authors have stated that they have no conflicts of interest.

Abstract

This study illustrates that tumor characteristics can be captured by medical images at the genetic and cellular levels. A total of 215 patients with breast invasive ductal carcinoma were analyzed. An automatic radiomics approach was proposed to assess the associations between quantitative ultrasound features and biological characteristics. The results indicated a strong correlation. This application will be helpful for an accurate prognosis at an early stage.

Introduction: In current clinical practice, invasive ductal carcinoma is routinely screened by medical imaging techniques and diagnosed by immunohistochemistry. Recent studies have illustrated that radiomics approaches provide a comprehensive characterization of entire tumors and can reveal predictive or prognostic associations between images and medical outcomes. To better reveal the underlying biology, an improved understanding of the relationship between objective image features and biological characteristics is urgently required.

Patients and Methods: A total of 215 patients with definite histological results were enrolled in our study. The tumors were automatically segmented by our phase-based active contour model. High-throughput radiomics features were designed and extracted based on the Breast Imaging Reporting and Data System and further selected according to Student's t-test, inter-feature correlation coefficients and a Lasso regression model. A support vector machine classifier with three-fold cross-validation was used to evaluate the relationship.

Results: The radiomics approach demonstrated a strong correlation between the ultrasound features and both receptor status and molecular subtype (p<0.05, AUC=0.760). Hormone receptor-positive, HER2-negative cancers have different ultrasound appearances from triple-negative cancers.

Conclusion: Our approach may assist clinicians with the accurate prediction of prognosis based on ultrasound findings, allowing early medical management and treatment.

Keywords: radiomics, ultrasound, hormone receptor, molecular subtypes, breast invasive ductal carcinoma

Introduction

Breast cancer is a major public health problem among women in China1. Invasive ductal carcinoma (IDC) is the most frequently observed breast cancer type, accounting for nearly 80% of all cases2. IDC is spatially and temporally heterogeneous, and its clinical outcomes are associated with different biologic characteristics3.

In current clinical practice, breast cancer is routinely screened by medical imaging techniques and diagnosed by immunohistochemistry (IHC). Medical imaging technologies provide overall anatomical information about a tumor in a non-invasive manner, while IHC characterizes the intracellular antigen expression in the tumor tissue4. IHC is used to evaluate biomarkers and identify molecular subtypes, and it plays a critical role in prognosis and therapeutic decisions5. IHC benefits breast cancer characterization, but it also has some limitations. First, biopsy is an invasive procedure and, because of tumor heterogeneity, is subject to sampling and analysis uncertainty6. Second, visual interpretations are generally subjective and may lead to errors near the cut-off values7.

Recent studies have illustrated that tumor characteristics at the genetic and cellular levels can be captured by medical images8,9. It has been hypothesized that radiomics approaches provide a comprehensive characterization of entire tumors and can reveal predictive or prognostic associations between images and medical outcomes10.

Ultrasound is a widely used modality for breast cancer diagnosis, especially for IDC, because it is non-invasive, requires no radiation and is inexpensive. It is the preferred choice for early screening and diagnosis. Several studies have explored the correlation between breast ultrasound results and certain biological characteristics11-19. The results show that ultrasound imaging is promising for the assessment of tumor heterogeneity. However, previous studies all relied on visual assessment by radiologists.
A small portion of these subjective descriptions had large intra- and inter-observer variability and were highly dependent on the radiologist's experience. The integration of computer models into ultrasound diagnosis can increase the objectivity of image interpretation. Several effective computer-aided diagnosis (CAD) systems have been designed to assist radiologists in locating and identifying abnormalities in breast ultrasound images20-24. Inspired by these CAD systems, we expect that computerized tools can improve our understanding of the relationship between objective image features and biological characteristics, and thus better reveal the underlying biology.

The aim of our study is to develop a radiomics approach to investigate the associations between quantitative breast IDC ultrasound features and biological characteristics in an objective manner.

Materials and Methods

Patients

We performed a retrospective analysis of 215 women with breast IDC who had been treated at the Shanghai Cancer Center of Fudan University from April 2014 to June 2016. They underwent ultrasound examination, biopsy and surgery, without neo-adjuvant therapy. Only the patients with the best prognosis (hormone receptor-positive and HER2-negative) and the worst prognosis (triple-negative)25 were taken into consideration. The study was approved by the institutional review board of the hospital, and informed consent was obtained from all patients. A summary of the patient characteristics is shown in Table 1.

Ultrasound Examination

All patients were examined using an Aixplorer® scanner (Supersonic, Aix-en-Provence, France) or a Logiq E9® scanner (GE Healthcare, Wauwatosa, WI) equipped with 6-13 MHz linear probes. Static images and video files were stored in a picture archiving and communication system (PACS). Two breast radiologists who had at least five years of experience retrospectively and independently reviewed the ultrasound images. The images were assessed according to the Breast Imaging Reporting and Data System (BI-RADS)26. A consensus interpretation was reached for ambiguous cases.

Pathological Examination

All patients had definite histological results. Formalin-fixed, paraffin-embedded tumor specimens were stained with hematoxylin-eosin (HE). The expression of estrogen receptors (ER), progesterone receptors (PR), human epidermal growth factor receptor-2 (HER2) and Ki-67 was assessed by IHC analysis with the appropriate antibodies. The ER/PR status was classified as positive if at least 1% nuclear staining was present. The HER2 status was graded as 0, 1+, 2+ or 3+. Grades 0 and 1+ were defined as negative, while grade 3+ was deemed positive. Grade 2+ was indeterminate and was resolved by fluorescence in situ hybridization (FISH). According to the 2015 St Gallen International Expert Consensus, each patient was classified into one of four molecular subtypes: hormone receptor-positive and HER2-negative, hormone receptor-positive and HER2-positive, hormone receptor-negative and HER2-positive, and triple-negative (TN)25. Histologic grading was based on the Nottingham grading system and classified as Grade I, II or III27.
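The receptor and subtype rules described above can be written down as a small decision function. The following Python sketch is illustrative only: the function names, the string labels, and the choice to treat a tumor as hormone receptor-positive when either ER or PR is positive are assumptions made for this example, not details taken from the study.

```python
from typing import Optional


def receptor_positive(percent_stained: float) -> bool:
    """ER or PR counts as positive when at least 1% nuclear staining is present."""
    return percent_stained >= 1.0


def her2_positive(ihc_grade: str, fish_amplified: Optional[bool] = None) -> bool:
    """Map the HER2 IHC grade (0, 1+, 2+, 3+) to a positive/negative status."""
    if ihc_grade in ("0", "1+"):
        return False
    if ihc_grade == "3+":
        return True
    if ihc_grade == "2+":
        # Grade 2+ is indeterminate and needs a FISH result to be resolved.
        if fish_amplified is None:
            raise ValueError("HER2 2+ is indeterminate without a FISH result")
        return fish_amplified
    raise ValueError(f"unknown HER2 grade: {ihc_grade!r}")


def molecular_subtype(er_percent: float, pr_percent: float,
                      her2_grade: str,
                      fish_amplified: Optional[bool] = None) -> str:
    """Return one of the four subtype labels used in this sketch."""
    # Assumption: hormone receptor-positive means ER-positive or PR-positive.
    hr_pos = receptor_positive(er_percent) or receptor_positive(pr_percent)
    her2_pos = her2_positive(her2_grade, fish_amplified)
    if hr_pos:
        return "HR+/HER2+" if her2_pos else "HR+/HER2-"
    return "HR-/HER2+" if her2_pos else "TN"
```

Only two of the four labels, "HR+/HER2-" and "TN", correspond to the groups analyzed in this study.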
Grades I and II were considered lower grade, and Grade III higher grade.

Image Analysis

The ultrasound images in this study were evaluated with quantitative computer-aided techniques. First, each tumor was segmented by a phase-based active contour (PBAC) model developed by our group28. The model extracts the tumor boundary automatically by minimizing an energy function augmented with phase information. For some extremely irregular tumors in low-contrast images, the automatic contour had difficulty evolving to the actual boundary, so manual refinement by an experienced radiologist was necessary in these cases. Three metrics were employed to evaluate the performance of the automatic segmentation method: the true-positive (TP) ratio, the false-positive (FP) ratio and the similarity index (SI). All tumor boundaries were delineated by a well-trained radiologist as the ground truth.

$$\mathrm{TP} = \frac{|A_m \cap A_a|}{|A_m|}, \quad \mathrm{FP} = \frac{|A_m \cup A_a - A_m|}{|A_m|}, \quad \mathrm{SI} = \frac{|A_m \cap A_a|}{|A_m \cup A_a|} \qquad (1)$$

where $A_m$ and $A_a$ are the pixel sets of the object region segmented manually and by the PBAC method, respectively.

We subsequently designed and extracted 463 radiomics features to quantify the ultrasound characteristics of the tumors based on the BI-RADS criteria. The features covered all of the BI-RADS categories, including shape, orientation, margin, boundary, echo pattern, posterior acoustic features and calcification. Tumor size was also considered. We grouped the high-throughput features into four main categories: (I) morphology, (II) intensity, (III) texture and (IV) wavelet features, which are concisely listed in the Appendices.

Data Analysis

The association between the quantitative high-throughput ultrasound features and the IHC biomarkers was analyzed by a machine-learning method. We performed three-fold cross-validation with 1000 bootstrap repetitions to minimize bias and overfitting. In each round, one fold was used as the testing set and the other two folds were combined to form the training set. Each cross-validation repetition was composed of two stages: feature selection and classification.

In the first stage, we used a three-step feature selection technique. First, Student's t-test was employed to select features that were highly related to the biomarkers, with a significance level of 0.05 (p<0.05) as the threshold. Second, an inter-feature correlation coefficient (R) between all possible pairs of features was used to eliminate redundancy among the high-dimensional features. R>0.95 was the cutoff for strong relationships, in which one of two features with a lower p value was excluded. Third, the least absolute shrinkage and selection operator (Lasso) was used to select the most important features, namely those with nonzero coefficients.

In the second stage, a support vector machine (SVM) classifier was trained immediately after feature selection, using the same cross-validation folds.
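The first two filtering steps of the feature selection stage can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' Matlab implementation: the p values are taken as precomputed inputs, the absolute value of Pearson's R is compared with the cutoff, and the sketch keeps the more significant feature of each highly correlated pair (which member of a pair to drop is treated here as an assumption). The function names are hypothetical, and the Lasso and SVM steps are omitted.

```python
import math
from typing import Dict, List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0  # a constant feature carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def filter_features(features: Dict[str, List[float]],
                    p_values: Dict[str, float],
                    alpha: float = 0.05,
                    r_cut: float = 0.95) -> List[str]:
    """Apply the significance filter, then the redundancy filter."""
    # Step 1: keep features whose t-test p value is below the threshold.
    kept = [name for name in features if p_values[name] < alpha]
    # Visit the most significant features first so that they survive step 2.
    kept.sort(key=lambda name: p_values[name])
    selected: List[str] = []
    for name in kept:
        # Step 2: skip a feature that is strongly correlated with one
        # that has already been selected.
        redundant = any(
            abs(pearson_r(features[name], features[other])) > r_cut
            for other in selected
        )
        if not redundant:
            selected.append(name)
    return selected
```

In a full pipeline these two steps would be run on the training folds only, so that the held-out fold does not influence which features are kept.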
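The classification performance was expressed in terms of the sensitivity (SEN), the specificity (SPEC) and the area under the receiver operating characteristic curve (AUC)29:

$$\mathrm{SEN} = \frac{TP}{TP + FN}, \quad \mathrm{SPEC} = \frac{TN}{TN + FP} \qquad (2)$$

where TP, FN, TN and FP here denote the numbers of true-positive, false-negative, true-negative and false-positive classifications. All values were averaged, and the 95% confidence intervals of the AUC were estimated over the 1000 bootstrap repetitions.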
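The evaluation metrics of Eq. (1) and Eq. (2) are simple ratios of counts, as the following Python sketch illustrates. Pixel sets are represented as sets of (row, column) tuples and class labels as 0/1 integers. These representations, the function names, the reading of the FP ratio as the automatically segmented area lying outside the manual region divided by the manual area, and the pairwise estimate of the AUC are assumptions chosen for brevity rather than a description of the authors' code.

```python
from typing import Sequence, Set, Tuple

Pixel = Tuple[int, int]


def overlap_metrics(manual: Set[Pixel], auto: Set[Pixel]) -> Tuple[float, float, float]:
    """Return the (TP ratio, FP ratio, SI) overlap metrics for two pixel sets."""
    inter = len(manual & auto)
    union = len(manual | auto)
    tp_ratio = inter / len(manual)
    # Pixels in the union but not in the manual region were added by the
    # automatic method alone.
    fp_ratio = (union - len(manual)) / len(manual)
    si = inter / union
    return tp_ratio, fp_ratio, si


def sen_spec(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Sensitivity and specificity from binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos_scores) * len(neg_scores))
```

The pairwise AUC is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients.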

Finally, the radiomics features most frequently selected across all bootstrap repetitions were retained as the final stable feature set. All image and data processing was implemented in Matlab R2015b (MathWorks, Inc., Natick, MA, USA).

Results

Patient Population and Clinicopathologic Characteristics

A total of 215 patients with definite IHC status and histologic grade were enrolled in this retrospective study. Of these, 164 (76.3%) were hormone receptor-positive, HER2-negative, and 51 (23.7%) were TN. Hormone receptor-positive tumors were more than three times as likely to be of lower than of higher histologic grade (71.4% Grades I-II, 20.1% Grade III). TN tumors were mostly of higher histologic grade (80.4% Grade III, 17.6% Grades I-II). Age was not significantly associated with molecular subtype or tumor grade in our study.

Automatic Segmentation Performances

The automatic segmentation results were all reviewed by the breast radiologist. Of 215 ultrasound images, 207 tumors were well segmented by our PBAC model; only 8 tumors needed further manual refinement. Fig. 1 shows the manual and automatic segmentation results for a representative breast ultrasound image. The TP, FP and SI over all images were 85.5%±7.3%, 1.64%±1.94% and 83.2%±7.18%, respectively, which demonstrated that our method yielded contours close to the manual delineations. The small standard deviations of all evaluation metrics attested to its robustness.

Optimal Ultrasound Feature Sets

We ranked the 463 high-throughput features with the three-step feature selection technique to identify the strength of their association with the biomarker status under consideration. We selected strongly correlated features with p-values less than 0.05 and eliminated the redundancies. The 36 radiomics features most frequently selected by the Lasso model across all bootstrap repetitions were then used to construct the final optimal feature set, which is shown in the Appendices. The new feature set encompassed 6 BI-RADS categories and provided a detailed and comprehensive description of the BI-RADS lexicon.

Table 2 shows the efficiency of feature selection for classification. We compared the average SVM classification performances across the steps of the three-step feature selection technique. The combination of effective features improved the AUC from 0.723 [95% CI 0.719-0.727] to 0.760 [95% CI 0.755-0.764]. Our representative features achieved a superior performance, indicating the efficiency of the optimal feature selection.

Association between Ultrasound Features and Biological Behavior

The classification performance metrics for each feature category are detailed in Table 3. As expected, the model with all features outperformed those based on only one category, notably in terms of SEN and AUC.
The combination achieved the highest AUC of 0.760 [95% CI 0.755-0.764]. The echo pattern, with an AUC of 0.738 [95% CI 0.734-0.742], proved to be the most discriminative feature set for predicting receptor status. The posterior acoustic pattern, the margin and the calcification were also correlated with the receptor status, each with an AUC greater than 0.6. Shape and boundary seemed to be weakly correlated with molecular subtype. No significant differences were observed for size and orientation (all p-values were greater than 0.05).

For each category, the feature with the lowest p-value was selected; these are listed in Table 4. Fig. 2 shows two representative breast IDC ultrasound images, one hormone receptor-positive, HER2-negative and one TN.

Shape: The solidity was used to describe the tumor shape. It was defined as the ratio of the tumor area to the area of its minimum enclosing convex hull. A higher value indicated that the tumor was more likely to be round or oval. In our study, the solidity of TN cancers was 0.899±0.045, slightly higher than that of hormone receptor-positive, HER2-negative cancers (p=0.048, AUC=0.560 [95% CI 0.556-0.564]). Cancers with positive hormone receptor status might exhibit an irregular shape, whereas TN cancers might have an oval or round shape.

Margin: Four features were selected to evaluate the tumor margin (AUC=0.648 [95% CI 0.641-0.654]). Of these, the edge roughness was the most discriminative (p=0.006). It was defined based on the entropy of the normalized radial length histogram30. A tumor with a larger roughness had a more complex boundary. The mean roughness of hormone receptor-positive, HER2-negative tumors was 0.079, whereas that of TN tumors was 0.063. The analysis revealed that a circumscribed tumor was more likely to be TN and of high grade. Malignant tumors with spiculated or angular margins were significantly associated with a positive hormone receptor status.

Boundary: The boundary of a breast tumor was characterized by an echogenic halo or an abrupt interface. In our study, 5 features represented boundary properties. Although the values differed slightly between the two groups, the overall combination implied that the boundary was only weakly correlated with the receptor status or the tumor grade (AUC=0.598 [95% CI 0.592-0.604]). A breast IDC might have either an abrupt or a blurred interface.

Echo pattern: We used 25 texture features to assess the tumor echogenicity characteristics, which are listed in Table 4. Most features based on the gray-level co-occurrence matrix (GLCM), gray-level size zone matrix (GLSZM), gray-level run-length matrix (GLRLM) and neighborhood gray-tone difference matrix (NGTDM), together with their wavelet decompositions, could describe the echo pattern homogeneity (AUC=0.738 [95% CI 0.734-0.742]). A higher value was associated with a heterogeneous texture. As shown in Table 4, the mean of LL NGTDM Busyness in ER+, PR+, HER2- tumors was 1.402, while that in ER-, PR-, HER2- tumors was 1.912 (p<0.001).
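Two of the morphological descriptors described above, solidity and edge roughness, can be illustrated with a short Python sketch. It assumes the tumor contour is available as an ordered list of (x, y) vertices. The number of histogram bins, the use of the vertex centroid, and the natural logarithm are choices made for this example, so the resulting values are not expected to match the magnitudes reported in the text.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def polygon_area(pts: Sequence[Point]) -> float:
    """Absolute area of a simple polygon (shoelace formula)."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(pts, list(pts[1:]) + [pts[0]]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def convex_hull(pts: Sequence[Point]) -> List[Point]:
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    p = sorted(set(pts))
    if len(p) <= 2:
        return p

    def cross(o: Point, a: Point, b: Point) -> float:
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower: List[Point] = []
    for q in p:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], q) <= 0:
            lower.pop()
        lower.append(q)
    upper: List[Point] = []
    for q in reversed(p):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], q) <= 0:
            upper.pop()
        upper.append(q)
    return lower[:-1] + upper[:-1]


def solidity(contour: Sequence[Point]) -> float:
    """Ratio of the contour area to the area of its convex hull."""
    return polygon_area(contour) / polygon_area(convex_hull(contour))


def radial_length_entropy(contour: Sequence[Point], bins: int = 10) -> float:
    """Entropy of the histogram of centroid-to-boundary distances,
    normalized by the largest distance (an assumed roughness measure)."""
    cx = sum(x for x, _ in contour) / len(contour)
    cy = sum(y for _, y in contour) / len(contour)
    radii = [math.hypot(x - cx, y - cy) for x, y in contour]
    r_max = max(radii)
    if r_max == 0.0:
        return 0.0  # degenerate contour: all points coincide
    counts = [0] * bins
    for r in radii:
        counts[min(int(r / r_max * bins), bins - 1)] += 1
    n = len(radii)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)
```

A convex contour has a solidity of 1, and indentations lower the value. A contour whose boundary points all lie at the same distance from the centroid has zero radial-length entropy.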
Our study implied that a hypo- or complex echo pattern was more common in the TN subtype than in the hormone receptor-positive, HER2-negative subtype.

Posterior acoustic pattern: We used the mean intensity difference between the posterior area and its adjacent area to describe the posterior acoustic pattern (p<0.001, AUC=0.612 [95% CI 0.608-0.616]). In Table 4, the mean difference for hormone receptor-positive, HER2-negative tumors was -3.541: the mean intensity of the posterior area was lower than that of the adjacent area at the same depth, indicating posterior shadowing. The value for TN tumors was 10.22. Tumors with enhancement had a significantly greater probability of having triple-negative status and a higher grade.

Calcification: Five texture-based features were employed to quantify calcification. However, this category had a low SEN (0.447) and AUC (0.632 [95% CI 0.625-0.638]), so calcification was not a strong indicator.

Discussion

We developed an automatic radiomics model to assess the correlation between high-throughput digital ultrasound features and the biological characteristics of breast IDC. We found that both tumor grade and receptor status affected the ultrasound appearance. Hormone receptor-positive, HER2-negative cancers of low grade were associated with an irregular shape, an uncircumscribed margin, a hyper- or complex echo pattern and posterior shadowing, whereas TN cancers of high grade tended to have a regular shape, a circumscribed margin, a hypo- or complex echo pattern and posterior enhancement.

The posterior acoustic pattern is an important ultrasound feature for breast cancer. Irshad et al.15 reported that low grade tumors growing with lower mitotic rates may lead to excessive sound reflection or attenuation by the tumor compared with the surrounding tissue. Aho et al.11, Celebi et al.12, Costantini et al.13, Ko et al.31 and Zhang et al.18,19 found that the presence of shadowing was strongly associated with a low-grade tumor or ER+ cancer. Our results were consistent with these findings. The mean intensity difference of the posterior area in hormone receptor-positive, HER2-negative tumors was negative, which denoted shadowing, while TN cancers were more likely to show enhancement. Hence, the posterior acoustic pattern is a good predictive indicator.

The majority of malignant breast tumors were expected to have a poorly defined and uncircumscribed margin11,12. However, recent studies implied that TN subtypes were likely to have a well-defined margin. Costantini16 explained that the circumscribed margin was likely related to the high power of cell proliferation, preventing a desmoplastic reaction. Aho et al.11 and Celebi et al.12 obtained the same results. On the other hand, Ildefonso et al.32, Irshad et al.15 and Zhang et al.18 revealed that a spiculated or angular margin might be associated with a low tumor grade and a positive hormone receptor status. Similar to their results, we found that an uncircumscribed margin tended to be associated with positive hormone receptor status and a low grade, whereas a circumscribed margin was highly likely to be associated with negative status for all receptors and a high grade.

The computerized features are well-known for their capability in assessing texture characteristics.
In our study, the echo pattern, represented by 25 digital features, proved to be the most promising for predicting molecular subtypes, with the highest AUC of all feature categories. Because of the large amount of speckle, the echo pattern is difficult for a radiologist to interpret; hence, its reported associations in previous studies have varied. Nevertheless, the radiomics features facilitated an effective evaluation of the echogenicity. Our results implied that a TN cancer might exhibit a hypo- or mixed echo appearance.

Moreover, our results showed a weak correlation between the shape and the hormone receptor status. A hormone receptor-positive, HER2-negative cancer was likely to be irregular, while a TN cancer might be oval or round. This was comparable to the results of Costantini et al.13 and Zhang et al.18.

Although each single feature category had some association with molecular status, the combination of all features improved the overall performance. This suggested that the categories were complementary to each other. Since most of our digital features were designed based on BI-RADS, this implied that BI-RADS was not only instructive for diagnosing breast cancer but also potentially beneficial for prognosis.

It is worth mentioning that, as shown in the Appendices, wavelet-decomposed features were preserved in the final optimal set. Due to the low contrast of ultrasound images, some important tumor characteristics were hidden behind the speckle. However, after a wavelet transform, the LL, LH, HL and HH sub-bands revealed such characteristics and showed discriminative ability. The results demonstrated the importance of radiomics features.

As a simple, easily operable and widely accessible tool, breast ultrasound scanning is effective for early screening and for distinguishing benign from malignant tumors. Since the invasive IHC method is not accepted by all women, it would be of significant importance to use ultrasound to predict prognosis and clinical response during a routine examination. Additionally, increased confidence in radiomics-based imaging prediction can help to resolve the uncertainty in biomarker values caused by tumor heterogeneity. Furthermore, compared with previous qualitative studies performed by experienced radiologists, our findings suggest that high-throughput quantitative ultrasound features are effective and robust for predicting hormone receptor status and histologic grade. The approach can assist pretreatment planning and prognosis and is not subject to the intra- and inter-observer variability of visual assessment. It is feasible on other ultrasound machines and in other medical centers, so it may have broad applications.

Our research had the following limitations. First, the study was based on retrospective data with a small sample size. A larger population is needed to generalize the approach and to validate the results. Second, HER2-positive patients were not included. Since HER2 over-expression defines a typical subtype, we will collect more patients and correlate their ultrasound findings with biological characteristics.

Conclusion

A novel automatic radiomics approach is proposed to investigate the association between quantitative ultrasound features and biological characteristics. Our approach could automatically segment the tumor and extract high-throughput features to evaluate the relationship effectively and objectively. The study indicated that quantitative ultrasound features were significantly associated with hormone receptor status, molecular subtype and histologic grade in breast IDC. These findings expand the scope of ultrasound for diagnosis. They can assist clinicians in predicting the prognosis based on ultrasound appearance, allowing early medical management and treatment.

Clinical Practice Points

1. Recent studies have illustrated that radiomics approaches provide a comprehensive characterization of entire tumors and can reveal predictive or prognostic associations between images and medical outcomes.
2. A total of 215 patients with breast invasive ductal carcinoma were enrolled in a retrospective study.
3. An automatic radiomics approach was developed to assess the relationship between quantitative ultrasound features and biological characteristics.
4. The results demonstrated a strong correlation. Additionally, our quantitative approach avoids the intra- and inter-observer variability of visual assessment. Our approach is feasible on other ultrasound machines and in other medical centers.
5. Our study expands the scope of ultrasound for diagnosis. It is of significant importance for ultrasound to predict prognosis and clinical response during a routine examination, allowing early medical management and treatment.

Disclosure

The authors have stated that they have no conflicts of interest.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (61401102 and 81627804).

References 1. Chen WQ, Rong SZ, Baade PD, Zhang SW, Zeng HM, Bray F, DVM AJ, Yu XQ, He J. Cancer Statistics in China, 2015. Ca Cancer J Clin 2016;66:115-132. 2. Yersal O, Barutca S. Biological subtypes of breast cancer: prognostic and therapeutic implications. World J Clin Oncol 2014;5:412–424. 3. Tang P, Skinner KA, Hicks DG. Molecular classification of breast carcinomas by immunohistochemical analysis are we ready? DiagnMolPathol2009;18:125-132. 4. Zaha DC. Significance of immunohistochemistry in breast cancer. World J Clin Oncol 2014;5: 382–392. 5. Laurinavicius A, Laurinaviciene A, Ostapenko V, Dasevicius D, Jarmalaite S, Lazutka J. Immunohistochemistry profiles of breast ductal carcinoma: factor analysis of digital image analysis data. Diagn Pathol 2012;7:1-16. 6. Yip SF, Aerts HJ.Applications and limitations of radiomics. Phys Med Biol 2016;61:155-160. 7. Yang XY, KnoppMV.Quantifying tumor vascular heterogeneity with dynamic contrast-enhanced magnetic resonance imaging: a review. J Biomed Biotechnol 2011;12:732-848. 8. Aerts HJ, Velazquez ER, Leijenaar RT, Parmar C, Grossmann P, Carvalho S, Bussink J, Monshouwer R, Haibe-Kains B, Rietveld D, Hoebers F, Rietbergen MM, Leemans CR, Dekker A, Quackenbush J, Gillies RJ, Lambin P. Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach. Nat Commun 2014;3:4006-4013. 9. Cho GY, Moy L, Kim SG, Baete SH, Moccaldi M, Babb JS, Sodickson DK, Sigmund EE. Evaluation of breast cancer using intravoxel incoherent motion (IVIM) histogram analysis: comparison with malignant status, histological subtype, and molecular prognostic factors. Eur Radiol 2016;26:2547-2558. 10. Kumar V, Gu Y, Basu S, Berglund A, Eschrich SA, Schabath MB, Forster K, Aerts HJ, Dekker A, Fenstermacher D, Goldgof DB, Hall LO, Lambin P, Balagurunathan Y, Gatenby RA, Gillies RJ. Radiomics: the process and the challenges. Magn Reson 11

ACCEPTED MANUSCRIPT

15.

16.

17.

18.

19.

20.

21.

22.

RI PT

SC

M AN U

14.

TE D

13.

EP

12.

AC C

11.

Imaging 2012;30: 1234-1248. Aho M, Irshad A, Ackerman SJ, Lewis M, Leddy R, Pope TL, Campbell AS, Cluver A, Wolf BJ, Cunningham JE. Correlation of sonographic features of invasive ductal mammary carcinoma with age, tumor grade, and hormone-receptor. J Clin Ultrasound 2013;41:10-17. Çelebi F, Pilancı KN, Ordu Ç, Ağacayak F, Alço G, İlgün S, Sarsenov D, Erdoğan Z, Özmen V. The role of ultrasonographic findings to predict molecular subtype, histologic grade, and hormone receptor status of breast cancer. Diagn Interv Radiol 2015;21:448-453. Costantini M, Belli P, Bufi E, Asunis AM, Ferra E, Bitti GT. Association between sonographic appearances of breast cancers and their histopathologic features and biomarkers. J Clin Ultrasound 2016;44:26-33. Gruber IV, Rueckert M, Kagan KO, Staebler A, Siegmann KC, Hartkopf A, Wallwiener D, Hahn M. Measurement of tumor size with mammography, sonography and magnetic resonance imaging as compared to histological tumor size in primary breast cancer. BMC Cancer 2013;13:328-335. Irshad A, Leddy R, Pisano E, Baker N, Lewis M, Ackerman S, Campbell A. Assessing the role of ultrasound in predicting the biological behavior of breast cancer. AJR Am J Roentgenol 2013;200:284-290. Kim SH, Seo BK, Lee J, Kim SJ, Cho KR, Lee KY, Je BK, Kim HY, Kim YS, Lee JH. Correlation of ultrasound findings with histology, tumor grade and biological markers in breast cancer. Acta Oncol 2008;47:1531-1538. Stein RG, Wollschläger D, Kreienberg R, Janni W, Wischnewsky M, Diessner J, Stüber T, Bartmann C, Krockenberger M, Wischhusen J, Wöckel A, Blettner M, Schwentner L. The impact of breast cancer biological subtyping on tumor size assessment by ultrasound and mammography-a retrospective multicenter cohort study of 6543 primary breast cancer patients. BMC Cancer 2016;16: 459-466. Zhang L, Liu YJ, Jiang SQ, Cui H, Li ZY, Tian JW. Ultrasound utility for predicting biological behavior of invasive ductal breast cancers. Asian Pac J Cancer Prev 2014;15:8057-8062. 
Zhang L, Li J, Xiao Y, Cui H, Du GQ, Wang Y, Li ZY, Wu T, Li X, Tian JW. Identifying ultrasound and clinical features of breast cancer molecular subtypes by ensemble decision. Sci Rep 2015;5:11085-11099. Shan J, Alam SK, Garra B, Zhang Y, Ahmed T. Computer-aided diagnosis for breast ultrasound using computerized BI-RADS features and machine learning methods. Ultrasound Med Biol 2016; 42:980–988. Kim JH, Cha JH, Kim N, Chang YJ,Ko MS,Choi YK, Kim HH. Computer-aided detection system for masses in automated whole breast ultrasonography: development and evaluation of the effectiveness. Ultrasonography 2014; 33:105-115. Wu WJ, Lin SW, Moon WK. Combining support vector machine with genetic algorithm to classify ultrasound breast tumor images. Comput Med Imaging Graph 2012; 36:627–633. 12

23. Liu B, Cheng HD, Huang J, Tian J, Tang X, Liu J. Fully automatic and segmentation-robust classification of breast tumors based on local texture analysis of ultrasound images. Pattern Recognit 2010;43:280-298.
24. Cai LY, Wang X, Wang YY, Guo Y, Yu JH, Wang Y. Robust phase-based texture descriptor for classification of breast ultrasound images. Biomed Eng Online 2015;14(26). DOI 10.1186/s12938-015-0022-8.
25. Coates AS, Winer EP, Goldhirsch A, Gelber RD, Gnant M, Piccart-Gebhart M, Thürlimann B, Senn HJ. Tailoring therapies - improving the management of early breast cancer: St Gallen International Expert Consensus on the Primary Therapy of Early Breast Cancer 2015. Ann Oncol 2015;26(8):1533-1546.
26. American College of Radiology. Breast Imaging Reporting and Data System (BI-RADS): Ultrasound. Reston, VA: American College of Radiology, 2013.
27. Azzopardi JG, Chepick OF, Hartmann WH, Jafarey NA, Llombart-Bosch A, Ozzello L, Rilke F, Sasano N, Sobin LH, Sommers SC, Stalsberg H, Sugar J, Williams AO. The World Health Organization histological typing of breast tumors, 2nd ed. The World Organization. Am J Clin Pathol 1982;78:806-816.
28. Cai LY, Wang YY. A phase-based active contour model for segmentation of breast ultrasound images. 6th International Conference on Biomedical Engineering and Informatics, Hangzhou, China, December 2013:91-95.
29. Baldi P, Brunak S, Chauvin Y, Andersen CA, Nielsen H. Assessing the accuracy of prediction algorithms for classification: an overview. Bioinformatics 2000;16:412-424.
30. Chou YH, Tiu CM, Hung GS, Wu SC, Chang TY, Chiang HK. Stepwise logistic regression analysis of tumor contour features for breast ultrasound diagnosis. Ultrasound Med Biol 2001;27:1493-1498.
31. Ko ES, Lee BH, Kim HA, Noh WC, Kim MS, Lee SA. Triple-negative breast cancer: correlation between imaging and pathological findings. Eur Radiol 2010;20:1111-1117.
32. Ildefonso C, Vazquez J, Guinea O, Perez A, Fernandez A, Corte MD, Junquera S, Gonzalez LO, Pravia P, Garcia-Moran M, Vizoso FJ. The mammographic appearance of breast carcinomas of invasive ductal type: relationship with clinicopathological parameters, biological features and prognosis. Eur J Obstet Gynecol Reprod Biol 2008;136:224-231.


Table 1 Summary of patient characteristics.

| Subtypes | Hormone receptor-positive, HER2-negative (n=164) | Triple-negative (n=51) |
|---|---|---|
| Age (mean±SD, year) | 52.53±11.61 | 50.94±11.16 |
| Histologic grade | | |
| I-II | 117 (71.4%) | 9 (17.6%) |
| III | 33 (20.1%) | 41 (80.4%) |
| Not available | 14 (8.5%) | 1 (2%) |

Table 2 Performance of different feature sets using SVM.

| Feature Set | SEN | SPEC | AUC [95%CI] |
|---|---|---|---|
| After t-test | 0.952 | 0.514 | 0.723 [0.719-0.727] |
| After redundancy elimination | 0.953 | 0.541 | 0.732 [0.728-0.736] |
| After Lasso | 0.979 | 0.601 | 0.760 [0.755-0.764] |
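The SEN, SPEC, and AUC columns reported in Tables 2 and 3 follow the usual definitions for a binary classifier. The following is a minimal sketch of those definitions only, not the authors' evaluation code: the labels, scores, 0.5 threshold, and function names are hypothetical, and which subtype is treated as the positive class is an assumption, since the tables do not state it.

```python
# Illustrative only: toy labels and scores, not data from the study.

def sensitivity_specificity(y_true, y_pred):
    """Return (SEN, SPEC) for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc_by_ranking(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical classifier outputs for eight cases (4 positive, 4 negative).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

sen, spec = sensitivity_specificity(y_true, y_pred)
auc = auc_by_ranking(y_true, scores)
print(f"SEN={sen:.3f} SPEC={spec:.3f} AUC={auc:.4f}")
```

SEN and SPEC depend on the chosen decision threshold, whereas AUC summarizes ranking quality over all thresholds, which is why a feature set can show high SEN with modest SPEC and still reach an intermediate AUC.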

Table 3 Classification performance of different feature categories.

| Feature category | SEN | SPEC | AUC [95%CI] |
|---|---|---|---|
| shape | 0.534 | 0.577 | 0.560 [0.556-0.564] |
| margin | 0.523 | 0.772 | 0.648 [0.641-0.654] |
| boundary | 0.451 | 0.765 | 0.598 [0.592-0.604] |
| echo pattern | 0.951 | 0.524 | 0.738 [0.734-0.742] |
| posterior acoustic pattern | 0.622 | 0.602 | 0.612 [0.608-0.616] |
| calcification | 0.447 | 0.816 | 0.632 [0.625-0.638] |
| all | 0.979 | 0.601 | 0.760 [0.755-0.764] |

Table 4 One representative feature in each category.

| Category | Feature | Mean in hormone receptor-positive, HER2-negative | Mean in Triple-negative | p-value |
|---|---|---|---|---|
| shape | solidity | 0.880±0.0554 | 0.899±0.045 | p=0.048 |
| margin | edge roughness | 0.079±0.037 | 0.063±0.028 | p=0.006 |
| boundary | LL Std of correlation coefficient contrast | 0.013±0.009 | 0.009±0.003 | p=0.015 |
| echo pattern | LL NGTDM.Busyness | 1.402±0.885 | 1.912±0.959 | p<0.001 |
| posterior acoustic pattern | mean intensity differences of posterior area and its adjacent area | -3.541±23.87 | 10.22±25.77 | p<0.001 |
| calcification | HL H-energy | 0.199±0.164 | 0.263±0.169 | p<0.001 |

*Low pass/Low pass (LL), Low pass/High pass (LH), High pass/Low pass (HL) and High pass/High pass (HH)

Figure Captions List

Fig. 1 The segmentation result for a representative breast IDC ultrasound image: (a) the manual segmentation result; (b) our PBAC segmentation result.


Fig. 2 Two representative breast IDC ultrasound images. (a) A 50-year-old female with invasive ductal carcinoma (Grade II). The tumor size was 2.0×1.6×1.2 cm. The ultrasound image showed an irregular shape, a spiculated or angular margin, a complex echo pattern, posterior shadowing and internal calcification. On IHC examination, the cancer was hormone receptor-positive, HER2-negative. (b) A 57-year-old female with triple-negative invasive ductal carcinoma (Grade III). The tumor size was 3.5×3×2 cm. The ultrasound image showed a regular shape, a circumscribed margin, an abrupt boundary, a complex echo pattern and posterior enhancement, without calcification.
