The diagnostic value of quantitative texture analysis of conventional MRI sequences using artificial neural networks in grading gliomas



Clinical Radiology xxx (xxxx) xxx

Contents lists available at ScienceDirect

Clinical Radiology journal homepage: www.clinicalradiologyonline.net

D. Alis a,*, O. Bagcilar b, Y.D. Senli b, C. Isler c, M. Yergin d, N. Kocer b, C. Islak b, O. Kizilkilic b

a Istanbul Mehmet Akif Ersoy Thoracic and Cardiovascular Surgery Training and Research Hospital, Department of Radiology, Halkali, Istanbul, Turkey
b Istanbul University-Cerrahpasa, Cerrahpasa Faculty of Medicine, Department of Radiology, KMPasa, Istanbul, Turkey
c Istanbul University-Cerrahpasa, Cerrahpasa Faculty of Medicine, Department of Neurosurgery, KMPasa, Istanbul, Turkey
d Bahcesehir University, Department of Software Engineering and Applied Sciences, Istanbul, Turkey

Article information

Article history: Received 19 August 2019; accepted 11 December 2019

AIM: To explore the value of quantitative texture analysis of conventional magnetic resonance imaging (MRI) sequences using artificial neural networks (ANN) for the differentiation of high-grade gliomas (HGG) and low-grade gliomas (LGG).

MATERIALS AND METHODS: A total of 181 patients, 97 with HGG (53.5%) and 84 with LGG (46.5%), with brain MRI including T2-weighted (W) fluid-attenuated inversion recovery (FLAIR) and contrast-enhanced T1W images were enrolled in the present study. Histogram parameters and high-order texture features were extracted using manually placed regions of interest (ROIs) on T2W-FLAIR and contrast-enhanced T1W images covering the whole volume of the tumours. The reproducibility of the features was assessed by interobserver reliability analyses. The cohort was divided into training (n=121) and test (n=60) partitions. The training set was used for attribute selection and model development, and the test set was used to evaluate the diagnostic performance of the pre-trained ANNs in discriminating HGG and LGG.

RESULTS: In the test cohort, the ANN models using texture data of T2W-FLAIR and contrast-enhanced T1W images achieved an area under the receiver operating characteristic curve (AUC) of 0.87 and 0.86, respectively. The combined ANN model with selected texture features achieved the highest diagnostic accuracy, 88.3%, with an AUC of 0.92.

CONCLUSIONS: Quantitative texture analysis of T2W-FLAIR and contrast-enhanced T1W images enhanced by ANN can accurately discriminate HGG from LGG and might be of clinical value in tailoring the management strategies in patients with gliomas.

© 2020 The Royal College of Radiologists. Published by Elsevier Ltd. All rights reserved.

* Guarantor and correspondent: D. Alis, Istanbul Mehmet Akif Ersoy Thoracic and Cardiovascular Surgery Training and Research Hospital, Department of Radiology, Halkali/Istanbul, Turkey. Tel.: +90 5364797429; fax: +90 212 4143167. E-mail address: [email protected] (D. Alis).
https://doi.org/10.1016/j.crad.2019.12.008
0009-9260/© 2020 The Royal College of Radiologists. Published by Elsevier Ltd. All rights reserved.

Please cite this article as: Alis D et al., The diagnostic value of quantitative texture analysis of conventional MRI sequences using artificial neural networks in grading gliomas, Clinical Radiology, https://doi.org/10.1016/j.crad.2019.12.008


D. Alis et al. / Clinical Radiology xxx (xxxx) xxx

Introduction

Gliomas are the most common type of primary malignant brain tumour, with an estimated annual incidence in the range of 6/100,000.1 The World Health Organization (WHO) categorises gliomas into four grades according to the histopathological characterisation and genetic properties of the tumours.2,3 It is also a standard diagnostic scheme to categorise grades I and II as low-grade gliomas (LGG) and grades III and IV as high-grade gliomas (HGG).2,3 The differentiation of LGG and HGG is mandatory for risk stratification and for tailoring the best management strategies for the patients.4–7 Histopathological diagnosis following biopsy or surgery is the reference method for glioma grading, yet the procedure is time-consuming and highly invasive, and there is always a risk of sampling error.1,8,9

Magnetic resonance imaging (MRI) is currently the most commonly used method for the non-invasive evaluation of gliomas given its high spatial and contrast resolution and its lack of ionising radiation.10 Advanced MRI techniques including spectroscopy, diffusion-weighted imaging, and perfusion imaging have been proposed for the grading of gliomas, yet the interpretation of these techniques depends on the experience and skills of the radiologist and is thus somewhat subjective, non-quantitative, and prone to errors.11–15

Radiomics refers to the process of extracting minable quantitative texture features from radiological images.16–18 These features could be integrated into machine learning (ML)-based models and used as biomarkers for tumour grading, differentiating benign from malignant tumours, predicting treatment response, and building individualised treatment strategies. To date, several MRI texture analysis studies for glioma grading, predicting prognosis, identifying genetic status, and assessing the proliferative activity of the tumours have been conducted with promising results.19–23 Nevertheless, radiomics is still a young discipline, which demands further studies.

The present work aimed to test the diagnostic value of quantitative texture analysis of contrast-enhanced T1-weighted (W) and T2W fluid-attenuated inversion recovery (FLAIR) images using artificial neural networks (ANN) for the differentiation of HGG and LGG.

Materials and methods

The local ethics committee approved this retrospective study conducted between January 2013 and January 2019. The committee waived the need for informed consent for the de-identified use of medical and radiological data. The following eligibility criteria were applied for patient selection: (1) a diagnosis of WHO grade I (pilocytic astrocytoma, subependymal giant cell astrocytoma), grade II (astrocytoma, oligoastrocytoma, oligodendroglioma), grade III (anaplastic astrocytoma and anaplastic oligodendroglioma), or grade IV (glioblastoma) glioma according to surgical or biopsy-derived histopathological findings,3 (2) age >18 years, and (3) availability of preoperative or preinterventional brain MRI with T2W-FLAIR and contrast-enhanced T1W images. Exclusion criteria were: (1) motion or susceptibility artefacts on MRI, (2) history of radiotherapy or chemotherapy for a prior brain tumour, (3) residual or recurrent brain tumours, (4) gliomas <1 cm in diameter, and (5) incomplete clinical data.

MRI acquisition

All MRI studies were acquired with a 1.5 T MRI system (Avanto, Siemens Medical Systems, Erlangen, Germany) using a 16-channel head coil. T2W-FLAIR images were acquired in the axial plane. The parameters for the axial T2W-FLAIR images were as follows: repetition time (TR)=8,000 ms, echo time (TE)=119 ms, inversion time=2,367 ms, field of view (FOV)=23×23 cm, section thickness=5 mm, number of excitations=2, section number=25, and matrix size=268×320. The parameters for the axial contrast-enhanced T1W sequence were as follows: TR=600 ms, TE=17 ms, FOV=23×23 cm, section thickness=5 mm, number of excitations=2, section number=25, and matrix size=268×320.
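As a rough illustration of why the images are resampled to isotropic voxels before texture analysis, the nominal voxel dimensions implied by the parameters above can be worked out directly. This is only a sketch: it assumes that the quoted matrix spans the full field of view and it ignores any interpolation or inter-section gap, neither of which is stated in the text.

```python
# Acquisition values quoted in the text above.
fov_mm = (230.0, 230.0)        # field of view, 23 x 23 cm
matrix = (268, 320)            # matrix size
section_thickness_mm = 5.0

# Nominal in-plane spacing: field of view divided by the number of samples.
# Assumption: the matrix covers the whole field of view without interpolation.
spacing_mm = tuple(f / n for f, n in zip(fov_mm, matrix))
voxel_mm = spacing_mm + (section_thickness_mm,)

# Ratio of the longest to the shortest voxel edge (1.0 would be isotropic).
anisotropy = max(voxel_mm) / min(voxel_mm)

print([round(v, 2) for v in voxel_mm])
print(round(anisotropy, 2))
```

Under these assumptions the voxels are several times longer through-plane than in-plane, which is the kind of unevenness that resampling to a common voxel size is meant to remove.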

3D texture features extraction

A neuroradiologist with >20 years of experience and a radiologist with >6 years of brain MRI interpretation experience performed all texture analyses in the present work. Before feature extraction, the images were resampled to a voxel size of 1×1×1 mm³ to enhance reproducibility and to prevent the texture features from being affected by uneven pixel size.24 QMaZda texture analysis software was used for quantitative texture feature extraction.25 It is well known that second-order texture features can be affected by image characteristics such as image contrast or brightness. Hence, a grey-level normalisation procedure, 3σ normalisation, was implemented to increase the robustness and reproducibility of the texture features.26

The two radiologists manually delineated the borders of the tumours in consensus on axial T2W-FLAIR and contrast-enhanced T1W images covering the whole volume of the tumours. The investigators were free to assess all sequences of the patients to determine the exact borders of the tumour. The regions of interest were placed on the whole tumour, including all enhancing and necrotic tissues, while avoiding the oedema surrounding the gliomas.

A total of 306 texture features were extracted for further analysis: first-order histogram (n=13), histogram of oriented gradients (n=8), gradient-map-based features (n=5), grey-level co-occurrence matrix (GLCM) features (n=176), grey-level run-length matrix (GRLM) features (n=28), autoregressive model (n=5), Haar wavelet features (n=12), Gabor transform features (n=24), and local binary patterns (LBP; n=35).27,28 GLCM and GRLM features were calculated at 5 bits per pixel, gradient-map-based features were calculated at 4 bits per pixel, and first-order histogram, autoregressive model, and Haar wavelet features were calculated at 8 bits per pixel.
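The three-sigma grey-level normalisation and the reduction to a fixed number of bits per pixel can be sketched as follows. This is a minimal illustration of the general idea (intensities are rescaled over the range mean ± 3 standard deviations and then quantised), not the qMaZda implementation; the function name and the toy intensities are hypothetical.

```python
import statistics

def normalise_3sigma(values, bits=5):
    """Rescale intensities to 2**bits grey levels over [mean - 3 SD, mean + 3 SD]."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    levels = 2 ** bits
    if sd == 0:
        return [0] * len(values)           # flat region: a single grey level
    low, high = mean - 3 * sd, mean + 3 * sd
    quantised = []
    for v in values:
        clipped = min(max(v, low), high)   # values beyond 3 SD are saturated
        level = int((clipped - low) / (high - low) * levels)
        quantised.append(min(level, levels - 1))
    return quantised

# Hypothetical ROI intensities, and the same ROI after a brightness/contrast change.
roi = [10, 12, 15, 20, 22, 30, 41]
rescaled = [2 * v + 50 for v in roi]

print(normalise_3sigma(roi))
print(normalise_3sigma(roi) == normalise_3sigma(rescaled))
```

Because the range is defined relative to the mean and standard deviation of the region itself, a linear change in brightness or contrast leaves the quantised grey levels unchanged in this toy case, which is the property that makes second-order features more comparable between examinations.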
LBP was calculated by one of three algorithm identifiers (over-complete, transition, and centre-symmetric) with the number of 4n neighbours. Fig 1 shows the 3D segmentation of the gliomas on T2W-FLAIR and contrast-enhanced T1W images.
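To make the grey-level co-occurrence matrix (GLCM) features mentioned above more concrete, the sketch below builds a GLCM for a small 2D image and derives two commonly used descriptors from it. It is an illustration under simplifying assumptions, not the software used in the study: the image is already quantised to four grey levels, a single horizontal offset of one pixel is used, and the matrix is symmetric and normalised, whereas the study worked on 3D regions at 5 bits per pixel. All function names are illustrative.

```python
import math

def glcm(image, dx=1, dy=0, levels=4):
    """Symmetric, normalised co-occurrence matrix for one pixel offset."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1          # count each pair in both directions
    total = sum(sum(row) for row in counts)
    return [[n / total for n in row] for row in counts]

def contrast(p):
    """Weighted sum of squared grey-level differences between neighbours."""
    size = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(size) for j in range(size))

def entropy(p):
    """Shannon entropy (in bits) of the co-occurrence distribution."""
    return -sum(v * math.log2(v) for row in p for v in row if v > 0)

# Toy 4 x 4 image with four grey levels (0-3).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(image)
print(round(contrast(p), 3))   # 0.583
print(round(entropy(p), 3))
```

A perfectly uniform region gives a contrast of zero, whereas regions with many abrupt grey-level transitions give higher contrast and entropy, which is the sense in which such features capture spatial relationships that histogram features ignore.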

Feature selection and dimension reduction

The Waikato Environment for Knowledge Analysis (WEKA) toolkit version 3.8.2 (The University of Waikato, Hamilton, New Zealand) was used to evaluate quantitative texture features using ML classifiers.17 The reproducibility of texture feature analysis is an issue of concern; hence, precautions were taken to enhance the reliability of the present study.29 First, two observers drew ROIs on 40 randomly selected gliomas to assess the reproducibility of the extracted texture features. The interobserver reliability was calculated for each texture feature using the intraclass correlation coefficient (ICC). Only features with a good ICC value (≥0.80) were entered into the next steps. Furthermore, the importance of the selected attributes was assessed by calculating Pearson's correlation between each attribute and the class.17 Only attributes showing a correlation with the class of >0.7 were entered into the next steps. Fig 2 summarises the workflow of the present work.
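The two filtering steps described above (a reproducibility filter followed by a relevance filter) can be sketched in a few lines. Several choices here are assumptions rather than details given in the text: the ICC form is taken to be the two-way random-effects, absolute-agreement, single-rater version; the correlation threshold is applied to the absolute value of Pearson's r; and the feature names and values are invented toy data.

```python
import math

def icc_2_1(ratings):
    """Two-way random-effects, absolute-agreement, single-rater ICC.
    `ratings` holds one row per case and one column per observer."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)

def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(sxx * syy)

def select_features(features, labels, icc_min=0.80, r_min=0.7):
    """Keep features that are reproducible (ICC) and correlated with the class."""
    selected = []
    for name, (obs1, obs2) in features.items():
        icc = icc_2_1(list(zip(obs1, obs2)))
        r = pearson(obs1, labels)          # assumption: observer 1 values are used
        if icc >= icc_min and abs(r) > r_min:
            selected.append(name)
    return selected

# Toy data: six cases, two observers per feature, class 1 = HGG, 0 = LGG.
labels = [0, 0, 0, 1, 1, 1]
features = {
    "reproducible_informative": ([1.0, 1.2, 0.9, 3.0, 3.1, 2.8], [1.1, 1.2, 1.0, 2.9, 3.2, 2.8]),
    "reproducible_uninformative": ([1.0, 2.0, 3.0, 1.0, 2.0, 3.0], [1.1, 2.0, 2.9, 1.0, 2.1, 3.0]),
    "noisy": ([1.0, 3.0, 2.0, 1.5, 2.5, 2.0], [2.5, 1.0, 3.0, 2.0, 1.0, 2.6]),
}
print(select_features(features, labels))   # ['reproducible_informative']
```

In the study the reproducibility analysis was run on a subset of 40 gliomas and the relevance step on the training partition, so the two filters would not normally share the same cases as they do in this toy example.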

ANN models

Selected texture features were explored using a multilayer perceptron (MLP).30,31 An MLP is a type of ANN consisting of input, output, and hidden layers. The input layer receives the data, the output layer serves the classification or regression function using the loss function, and the hidden layer consists of a large number of interconnected neurons with individual trainable weights.30,31 The network trains itself by back-propagating the errors and adjusting the weights in the layers, using the previously labelled dataset as the reference.

The study cohort was randomly divided into the training partition (n=121, 65 HGG and 56 LGG) and the test partition (n=60, 32 HGG and 28 LGG). The training cohort was used to select the most relevant attributes for each model and for hyperparameter tuning. During the attribute selection and hyperparameter tuning steps, 10-fold cross-validation was implemented to achieve more generalisable performance.32 The wrapper method, which evaluates attribute sets using the relevant learning scheme, was applied with 10-fold cross-validation and a linear forward-search method to identify the final subset of features for the ANN models.33 The hyperparameter selection for the ANN classifier was performed using the meta-classifier "CVParameterSelection," which performs parameter selection by k-fold cross-validation for any classifier.17 The hyperparameters of the ANN were as follows: momentum=0.2, learning rate=0.4, and number of hidden layers=10. The diagnostic performance of the pre-trained ANN model was further evaluated on the test set.

The diagnostic accuracy of each ML model was assessed using confusion matrices, which show the results as true positive (TP), true negative (TN), false positive (FP), and false negative (FN) with histopathological sampling as the reference test. The following formulas were used to calculate sensitivity, specificity, and diagnostic accuracy: sensitivity=TP/(TP+FN), specificity=TN/(TN+FP), and diagnostic accuracy=(TP+TN)/(TP+TN+FP+FN). The area under the curve (AUC) was calculated for each model.
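The back-propagation training with momentum described in this section can be illustrated with a very small multilayer perceptron. This is a sketch under stated assumptions, not the WEKA classifier used in the study: it assumes a single hidden layer of 10 sigmoid units (one possible reading of the reported setting of 10), a sigmoid output, a squared-error loss, and per-sample weight updates; the class name, the toy data, and the number of epochs are hypothetical. Only the learning rate (0.4) and momentum (0.2) are taken from the text.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class TinyMLP:
    """One hidden layer, sigmoid units, trained by back-propagation with momentum."""

    def __init__(self, n_inputs, n_hidden=10, learning_rate=0.4, momentum=0.2, seed=0):
        rng = random.Random(seed)
        # Each weight row ends with a bias weight.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
        self.v_hidden = [[0.0] * (n_inputs + 1) for _ in range(n_hidden)]
        self.v_out = [0.0] * (n_hidden + 1)
        self.lr, self.mom = learning_rate, momentum

    def forward(self, x):
        xb = list(x) + [1.0]
        hidden = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in self.w_hidden]
        hb = hidden + [1.0]
        out = sigmoid(sum(w * v for w, v in zip(self.w_out, hb)))
        return xb, hb, out

    def train_step(self, x, target):
        xb, hb, out = self.forward(x)
        # Error signals for a squared-error loss with sigmoid activations.
        delta_out = (out - target) * out * (1.0 - out)
        delta_hidden = [delta_out * self.w_out[j] * hb[j] * (1.0 - hb[j])
                        for j in range(len(self.w_hidden))]
        # Momentum update: new step = momentum * previous step - lr * gradient.
        for j, h in enumerate(hb):
            self.v_out[j] = self.mom * self.v_out[j] - self.lr * delta_out * h
            self.w_out[j] += self.v_out[j]
        for j, row in enumerate(self.w_hidden):
            for i, xi in enumerate(xb):
                self.v_hidden[j][i] = self.mom * self.v_hidden[j][i] - self.lr * delta_hidden[j] * xi
                row[i] += self.v_hidden[j][i]
        return 0.5 * (out - target) ** 2

# Toy data: two hypothetical features scaled to [0, 1]; class 1 when their sum exceeds 1.
rng = random.Random(1)
samples = []
for _ in range(40):
    x = [rng.random(), rng.random()]
    samples.append((x, 1.0 if x[0] + x[1] > 1.0 else 0.0))

model = TinyMLP(n_inputs=2)
history = []
for _ in range(200):
    history.append(sum(model.train_step(x, y) for x, y in samples) / len(samples))

print(round(history[0], 4), round(history[-1], 4))
```

The mean training loss printed for the last epoch should be lower than for the first, which is all this sketch is meant to show; it says nothing about generalisation, which in the study was assessed by cross-validation and on the held-out test partition.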

Results

A total of 181 patients, 112 men (61.8%) and 69 women (38.1%), with a mean age of 58 years (range, 27–78 years) were included in the final study cohort. Among the 181 patients, 97 had HGG (53.5%) and 84 had LGG (46.5%). The mean age of the patients with HGG was 61 years (range, 33–78 years), and the mean age of the patients with LGG was 46 years (range, 27–68 years). Among the 97 patients with HGG, 37 had grade III and 60 had grade IV tumours.

A total of 212/306 (69.2%) features showed ICC ≥0.80 for intra-reader and inter-reader assessment for T2W-FLAIR images, while 224/306 (73.2%) features showed ICC ≥0.80 for contrast-enhanced T1W images. The ANN model of T2W-FLAIR images had nine selected features: three GRLM, five GLCM, and one histogram feature. The ANN model of contrast-enhanced T1W images had nine selected features: three GRLM, four GLCM, one histogram, and one gradient feature (Electronic Supplementary Material Table S1).

In the training set, the ANN classifier using textural features of T2W-FLAIR images correctly identified 61 out of 65 HGG and 53 out of 56 LGG, equating to a sensitivity of 93.8%, a specificity of 94.6%, and an accuracy of 94.2%. The ANN classifier using textural features of contrast-enhanced T1W images correctly identified 62 out of 65 HGG and 51 out of 56 LGG, equating to a sensitivity of 95.3%, a specificity of 91.1%, and an accuracy of 93.3%.

In the test set, the ANN classifier using textural features of T2W-FLAIR images correctly identified 26 out of 32 HGG and 23 out of 28 LGG, equating to a sensitivity of 81.2%, a specificity of 82.1%, and an accuracy of 81.6%. The ANN classifier using textural features of contrast-enhanced T1W images correctly identified 27 out of 32 HGG and 22 out of 28 LGG, equating to a sensitivity of 84.3%, a specificity of 78.5%, and an accuracy of 81.6%.

The 18 selected features of the T2W-FLAIR and contrast-enhanced T1W images were pooled for the combined ANN model. These features were further entered into wrapper subset evaluation with the linear forward-search method to identify the best features for the combined model. The combined ANN model selected eight features (three GRLM and five GLCM features), four derived from T2W-FLAIR images and four derived from contrast-enhanced T1W images (Electronic Supplementary Material Table S1). The combined ANN model achieved the highest diagnostic accuracy in both the training and the test sets, with accuracies of 95.8% and 88.3%, respectively. Table 1 shows the sensitivity, specificity, diagnostic accuracy, and AUC of each model.

Figure 1 The manual delineation of the tumour borders and 3D segmentation of the tumours on contrast-enhanced T1W images (above) and T2W-FLAIR images (below) are shown. The observers manually drew regions of interest on the tumours on each slice, and care was taken not to include any areas of oedema around the tumour.

Figure 2 The scheme summarises the workflow of the present work. First, the original voxels were resampled to 1×1×1 mm³; then, 3D texture features were extracted from manually segmented T2W-FLAIR and contrast-enhanced T1W images with normalisation. A total of 306 features were assessed by reproducibility analysis, and textural features with ICC <0.80 were removed to improve the reproducibility of the ANN models. The study cohort was divided into training (n=121) and test (n=60) sets. The training partition was used for feature selection and for training the ANN classifier. Feature selection and dimension reduction were performed using Pearson's correlation analysis and wrappers with the linear forward-search method. The hyperparameter selection for the ANN classifier was performed using the meta-classifier "CVParameterSelection". The performance of the ANN classifier in discriminating HGG from LGG was assessed using 10-fold cross-validation in the training cohort. Finally, the performance of the pre-trained ANN classifier was evaluated on the test set, which did not participate in any of the model creation processes.

Discussion

The present study showed that ANN using data from 3D quantitative texture analysis of T2W-FLAIR and contrast-enhanced T1W images could successfully differentiate HGG from LGG. Notably, the ANN model using combined data from T2W-FLAIR and contrast-enhanced T1W images was more robust than the models using textural features of a single conventional MRI sequence.

Skogen et al.34 explored the utility of filtration-histogram analysis of contrast-enhanced T1W images in differentiating HGG from LGG. The authors demonstrated that the standard deviation at a fine texture scale could successfully discriminate HGG and LGG.34 In a related work, Ditmer et al.35 explored the role of histogram analysis of contrast-enhanced T1W images in discriminating LGG and HGG, which also yielded promising results. Wang et al.36 investigated histogram features of apparent diffusion coefficient maps and reported an AUC of 0.90 in differentiating HGG from LGG.

Apart from studies focusing solely on histogram features, several authors have explored the role of high-order textural features in discriminating HGG and LGG. A study by Cho et al.37 investigating quantitative texture analysis of conventional MRI sequences achieved an AUC of 0.92 in differentiating HGG from LGG. Ryu et al.38 analysed histogram features and GLCM of apparent diffusion coefficient maps, and the entropy of apparent diffusion coefficient maps achieved an AUC of 0.94 in differentiating HGG from LGG. Zhang et al.39 examined histogram and high-order texture features of conventional MRI sequences in addition to advanced brain MRI techniques using 25 ML algorithms and achieved an AUC of 0.80, which was substantially lower than the findings of the present work. Their study cohort had a relatively low number of patients with LGG compared with HGG (28 versus 98); hence, they created a new data set with the synthetic minority over-sampling technique (SMOTE), which consisted of 100 patients in each group. In this synthetically created cohort, they achieved an AUC of 0.94. Zacharaki et al.40 investigated the quantitative texture features of multiparametric MRI images of gliomas in addition to shape characteristics, such as irregularity and rectangularity, in a total of 26 patients with LGG and 51 patients with HGG. They achieved a diagnostic accuracy of 94.52% using a k-nearest neighbours classifier. Tian et al.41 examined histogram and high-order texture features of multiparametric MRI images of 42 LGG and 11 HGG using several well-known ML classifiers. Given their unbalanced data, they directly applied SMOTE without reporting the results of their original cohort. The authors achieved a diagnostic accuracy of 96.8% using high-order texture features and 91.4% using histogram-based features, and they suggested that 3D high-order textural features are more robust than histogram features of single or multiple modalities.41

The present work differs from the works mentioned above in several critical methodological aspects. First, some of the previous works only evaluated first-order (namely, histogram) features, which represent the distribution of values of individual pixels or voxels without concern for spatial relationships34–36; however, high-order texture features might provide information regarding the statistical interrelationships between the pixels or voxels of the relevant area.18 The present work evaluated the diagnostic value of both first-order and second-order features in grading gliomas. Second, the reproducibility and redundancy of texture features is always an issue of concern.29 None of the aforementioned works explored the reproducibility of the extracted texture features. Only texture features with an inter-reader reliability of ≥0.80 were included in the present work, and texture features with low reproducibility were removed to enhance the reliability of the results.29 Third, it is recognised that gliomas, particularly HGG, are very heterogeneous tumours; hence, analysing only a single section might not be adequate to represent the internal characteristics of the entire tumour2,10; however, most of the previous works regarding glioma grading by textural features have only evaluated a single section of the tumours.34–38 Although there is no consensus, exploring the whole volume of the tumours, as implemented in the present and several previous works,39–41 seems to be a much more robust approach.23 Nevertheless, exploring the whole volume of the tumour is time-consuming compared with single-section analysis; hence, more works should be conducted to justify the additional time and effort required for volumetric textural analysis. Finally, some of the previous works did not employ any ML algorithms to enhance their texture analysis.34–36,38 Additionally, the studies that implemented ML-based texture analysis evaluated their models' performance only by an internal cross-validation procedure.37,39–41 In the present work, the study cohort was divided into a training cohort, used to select the most appropriate features, create the ML model, and assess the models' performance internally using 10-fold cross-validation, and a test cohort, used to evaluate the pre-trained models' performance.

Table 1 Diagnostic performance of ANN models in discriminating low-grade and high-grade gliomas

| ANN model | SEN (%) [95% CI] | SPE (%) [95% CI] | ACC (%) [95% CI] | AUC [95% CI] | Ref. test | Predicted LGG | Predicted HGG |
|---|---|---|---|---|---|---|---|
| T2W-FLAIR (training) | 93.8 [84.9–98.3] | 94.6 [85.3–98.8] | 94.2 [88.4–97.6] | 0.96 [0.91–1] | LGG | 53 | 3 |
| | | | | | HGG | 4 | 61 |
| T2W-FLAIR (test) | 81.2 [63.5–92.7] | 82.1 [63.1–93.9] | 81.6 [69.5–90.4] | 0.87 [0.79–0.99] | LGG | 23 | 5 |
| | | | | | HGG | 6 | 26 |
| Contrast-enhanced T1W (training) | 95.3 [87.1–99.1] | 87.9 [76.7–95.1] | 91.9 [85.6–96.3] | 0.95 [0.92–1] | LGG | 51 | 7 |
| | | | | | HGG | 3 | 62 |
| Contrast-enhanced T1W (test) | 84.3 [67.2–94.7] | 78.5 [59.1–91.7] | 81.6 [69.5–90.4] | 0.86 [0.8–1] | LGG | 22 | 6 |
| | | | | | HGG | 5 | 27 |
| Combined model (training) | 95.3 [87.1–99] | 96.4 [87.6–99.5] | 95.8 [90.6–98.4] | 0.97 [0.9–1] | LGG | 54 | 2 |
| | | | | | HGG | 3 | 62 |
| Combined model (test) | 87.5 [71.1–96.4] | 89.2 [71.7–97.3] | 88.3 [73.7–95.1] | 0.92 [0.82–1] | LGG | 25 | 3 |
| | | | | | HGG | 4 | 28 |

SEN, SPE, ACC, and AUC are calculated for the models' accuracy in identifying HGG. The last three columns give the confusion matrix (reference test versus predictions). ACC, accuracy; ANN, artificial neural network; AUC, area under the curve; FLAIR, fluid-attenuated inversion recovery; HGG, high-grade glioma; LGG, low-grade glioma; SEN, sensitivity; SF, selected features; SPE, specificity; T1W, T1-weighted; T2W, T2-weighted.
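The sensitivity, specificity, and accuracy entries of Table 1 can be recomputed from the confusion-matrix counts with the formulas given in the Methods. The sketch below does this for the three test-set models; the counts are read from the confusion-matrix columns of Table 1, with HGG treated as the positive class as the table footnote states, and the function and variable names are illustrative.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy with HGG as the positive class."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }

# Test-set counts taken from the confusion matrices in Table 1:
# tp = HGG predicted as HGG, fn = HGG predicted as LGG,
# tn = LGG predicted as LGG, fp = LGG predicted as HGG.
test_sets = {
    "T2W-FLAIR": dict(tp=26, fn=6, tn=23, fp=5),
    "Contrast-enhanced T1W": dict(tp=27, fn=5, tn=22, fp=6),
    "Combined": dict(tp=28, fn=4, tn=25, fp=3),
}

results = {name: diagnostic_metrics(**counts) for name, counts in test_sets.items()}
for name, metrics in results.items():
    print(name, {key: round(100 * value, 2) for key, value in metrics.items()})
```

The recomputed values agree with the table to within the last reported digit; the small differences (for example, 49/60 is 81.67% against a reported 81.6%) are consistent with the published figures having been truncated rather than rounded, although the text does not say so.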

Limitations

Several limitations of the present work should be acknowledged. The most important limitation was the lack of an external validation set. ML models are at risk of data leakage during attribute selection and hyperparameter tuning.42 Therefore, unbiased estimation of the models' performance should preferably be carried out on a different dataset, ideally an external dataset derived from a different institution or scanner42; however, the present study cohort was derived from a single institution, and all brain MRI examinations were acquired on the same scanner. Hence, some of the patients were set aside as a test cohort that did not participate in any step of model development. The texture features of advanced neuroimaging methods were not explored; however, advanced neuroimaging methods are not as widely available as conventional MRI sequences; hence, using quantitative texture data extracted from conventional MRI seems to be a more practical and straightforward solution. Furthermore, the role of ML-based texture analysis in discriminating mutation status in gliomas, in particular IDH1 mutation in HGG and 1p19q co-deletion in LGG, was not investigated. Hence, texture analysis focusing on these mutations should be carried out in future studies. Finally, only agnostic features of gliomas were evaluated; shape-based features, as investigated in other works,40 were not.

In conclusion, quantitative texture analysis of T2W-FLAIR and contrast-enhanced T1W images enhanced by ANN has excellent diagnostic accuracy in discriminating HGG and LGG. The present results demand validation by further prospective randomised works with multicentre data.

Conflict of interest

The authors declare no conflict of interest.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.crad.2019.12.008.

References

1. Weller M, van den Bent M, Tonn JC, et al. European Association for Neuro-Oncology (EANO) guideline on the diagnosis and treatment of adult astrocytic and oligodendroglial gliomas. Lancet Oncol 2017;18:e315–29.
2. Louis DN, Ohgaki H, Wiestler OD, et al. The 2007 WHO classification of tumours of the central nervous system. Acta Neuropathol 2007;114:97–109.
3. Louis DN, Perry A, Reifenberger G, et al. The 2016 World Health Organization classification of tumors of the central nervous system: a summary. Acta Neuropathol 2016;131:803–20.
4. Johnson DR, O'Neill BP. Glioblastoma survival in the United States before and during the temozolomide era. J Neuro Oncol 2012;107:359–64.
5. Rasmussen BK, Hansen S, Laursen RJ, et al. Epidemiology of glioma: clinical characteristics, symptoms, and predictors of glioma patients grade I–IV in the Danish Neuro-Oncology Registry. J Neuro Oncol 2017;135:571–9.
6. Gerard CS, Straus D, Byrne RW. Surgical management of low-grade gliomas. Semin Oncol 2014;41:458–67.
7. Raja R, Sinha N, Saini J, et al. Assessment of tissue heterogeneity using diffusion tensor and diffusion kurtosis imaging for grading gliomas. Neuroradiology 2016;58:1217–31.
8. McGirt MJ, Woodworth GF, Coon AL, et al. Independent predictors of morbidity after image-guided stereotactic brain biopsy: a risk assessment of 270 cases. J Neurosurg 2005;102:897–901.
9. Jackson RJ, Fuller GN, Abi-Said D, et al. Limitations of stereotactic biopsy in the initial management of gliomas. Neuro Oncol 2001;3:193–200.
10. Mullen KM, Huang RY. An update on the approach to the imaging of brain tumors. Curr Neurol Neurosci Rep 2017;17:53.
11. Zonari P, Baraldi P, Crisi G. Multimodal MRI in the characterization of glial neoplasms: the combined role of single-voxel MR spectroscopy, diffusion imaging and echo-planar perfusion imaging. Neuroradiology 2007;49:795–803.
12. Jolapara M, Patro SN, Kesavadas C, et al. Can diffusion tensor metrics help in preoperative grading of diffusely infiltrating astrocytomas? A retrospective study of 36 cases. Neuroradiology 2011;53:63–8.
13. Artzi M, Bokstein F, Blumenthal DT, et al. Differentiation between vasogenic-edema versus tumor-infiltrative area in patients with glioblastoma during bevacizumab therapy: a longitudinal MRI study. Eur J Radiol 2014;83:1250–6.
14. Cha S, Tihan T, Crawford F, et al. Differentiation of low-grade oligodendrogliomas from low-grade astrocytomas by using quantitative blood-volume measurements derived from dynamic susceptibility contrast-enhanced MR imaging. Am J Neuroradiol 2005;26:266–73.
15. Falk A, Fahlström M, Rostrup E, et al. Discrimination between glioma grades II and III in suspected low-grade gliomas using dynamic contrast enhanced and dynamic susceptibility contrast perfusion MR imaging: a histogram analysis approach. Neuroradiology 2014;56:1031–8.
16. van Griethuysen JJM, Fedorov A, Parmar C, et al. Computational radiomics system to decode the radiographic phenotype. Cancer Res 2017;77:e104–7.
17. Frank E, Hall AM, Witten I. The WEKA workbench. In: Data mining: practical machine learning tools and techniques. 4th edn. Cambridge, MA: Morgan Kaufmann; 2016.
18. Gillies RJ, Kinahan PE, Hricak H. Radiomics: images are more than pictures, they are data. Radiology 2015;278:563–77.
19. Kickingereder P, Burth S, Wick A, et al. Radiomic profiling of glioblastoma: identifying an imaging predictor of patient survival with improved performance over established clinical and radiologic risk models. Radiology 2016;280(3):880–9.
20. Kong Z, Li J, Liu Z, et al. Radiomics signature based on FDG-PET predicts proliferative activity in primary glioma. Clin Radiol 2019;815:e15.
21. Hsieh KL, Chen CY, Lo CM. Radiomic model for predicting mutations in the isocitrate dehydrogenase gene in glioblastomas. Oncotarget 2017;8:45888–97.
22. Zhou M, Scott J, Chaudhury B, et al. Radiomics in brain tumor: image assessment, quantitative feature descriptors, and machine-learning approaches. AJNR Am J Neuroradiol 2018;39:208–16.
23. Soni N, Priya S, Bathla G. Texture analysis in cerebral gliomas: a review of the literature. AJNR Am J Neuroradiol 2019;40:928–34.
24. Shafiq-ul-Hassan M, Zhang GG, Latifi K, et al. Intrinsic dependencies of CT radiomic features on voxel size and number of gray levels. Med Phys 2017;44:1050–62.
25. Szczypiński PM, Klepaczko A. MaZda. A framework for biomedical image texture analysis and data exploration. In: Biomedical texture analysis. London: Elsevier; 2017. p. 315–47.
26. Szczypinski PM, Strzelecki M, Materka A. Mazda-a software for texture analysis. In: 2007 International Symposium on Information Technology Convergence (ISITC 2007). IEEE; 2007, November. p. 245–9.
27. Thibault G, Fertil B, Navarro C, et al. Texture indexes and gray level size zone matrix application to cell nuclei classification. Pattern Recognit Inform Process 2009:140–5.
28. Mao J, Jain AK. Texture classification and segmentation using multiresolution simultaneous autoregressive models. Pattern Recognit 1992;25:173–88.
29. Balagurunathan Y, Kumar V, Gu Y, et al. Test–retest reproducibility analysis of lung CT image features. J Digit Imaging 2014;27:805–23.
30. Bishop CM. Neural networks for pattern recognition. Oxford: Oxford University Press; 1995.
31. Castro W, Oblitas J, Santa-Cruz R, et al. Multilayer perceptron architecture optimization using parallel computing techniques. PloS One 2017;12:e0189369.
32. Mwangi B, Tian TS, Soares JC. A review of feature reduction techniques in neuroimaging. Neuroinformatics 2014;12:229–44.
33. Arlot S, Celisse A. A survey of cross-validation procedures for model selection. Stat Surv 2010;4:40–79.
34. Skogen K, Schulz A, Dormagen JB, et al. Diagnostic performance of texture analysis on MRI in grading cerebral gliomas. Eur J Radiol 2016;85:824–9.
35. Ditmer A, Zhang B, Shujaat T, et al. Diagnostic accuracy of MRI texture analysis for grading gliomas. J Neuro Oncol 2018;140:583–9.
36. Wang S, Meng M, Zhang X, et al. Texture analysis of diffusion weighted imaging for the evaluation of glioma heterogeneity based on different regions of interest. Oncol Lett 2018;15:7297–304.
37. Cho HH, Lee SH, Kim J, et al. Classification of the glioma grading using radiomics analysis. PeerJ 2018;6:e5982.
38. Ryu YJ, Choi SH, Park SJ, et al. Glioma: application of whole-tumor texture analysis of diffusion-weighted imaging for the evaluation of tumor heterogeneity. PloS One 2014;9:e108335.
39. Zhang X, Yan LF, Hu YC, et al. Optimizing a machine learning based glioma grading system using multi-parametric MRI histogram and texture features. Oncotarget 2017;8:47816.
40. Zacharaki EI, Kanas VG, Davatzikos C. Investigating machine learning techniques for MRI-based classification of brain neoplasms. Int J Comput Assist Radiol Surg 2011;6:821–8.
41. Tian Q, Yan LF, Zhang X, et al. Radiomics strategy for glioma grading using texture features from multi-parametric MRI. J Magn Reson Imaging 2018;48:1518–28.
42. Kocak B, Durmaz ES, Ateş E, et al. Radiomics with artificial intelligence: a practical guide for beginners. Diagn Interv Radiol 2019;25:485.
