Machine-learning-based computed tomography radiomic analysis for histologic subtype classification of thymic epithelial tumours


European Journal of Radiology 126 (2020) 108929


European Journal of Radiology journal homepage: www.elsevier.com/locate/ejrad


Jianping Hu, Yijing Zhao, Mengcheng Li, Yin Liu, Feng Wang, Qiang Weng, Ruixiong You, Dairong Cao*

Department of Radiology, The First Affiliated Hospital of Fujian Medical University, 20 ChaZhong Rd, Fuzhou, Fujian, 350005, PR China

ARTICLE INFO

Keywords: Radiomics; Machine learning; Thymic epithelial tumour; Computed tomography; WHO classification

ABSTRACT

Purpose: To evaluate the performance of machine-learning-based computed tomography (CT) radiomic analysis to differentiate high-risk thymic epithelial tumours (TETs) from low-risk TETs according to the WHO classification.

Method: This retrospective study included 155 patients with a histologic diagnosis of high-risk TET (n = 72) and low-risk TET (n = 83) who underwent unenhanced CT (UECT) and contrast-enhanced CT (CECT). The radiomic features were extracted from the UECT and CECT of each patient at the largest cross-section of the lesion. The classification performance was evaluated with a nested leave-one-out cross-validation approach combining the least absolute shrinkage and selection operator feature selection and four classifiers: generalised linear model (GLM), k-nearest neighbour (KNN), support vector machine (SVM) and random forest (RF). The receiver-operating characteristic (ROC) curve and the area under the curve (AUC) were used to evaluate the performance of the classifiers.

Results: The combination of UECT and CECT radiomic features demonstrated the best performance to differentiate high-risk TETs from low-risk TETs for all four classifiers. Among these classifiers, the RF had the highest AUC of 0.87, followed by GLM (AUC = 0.86), KNN (AUC = 0.86) and SVM (AUC = 0.84).

Conclusions: Machine learning-based CT radiomic analysis allows for the differentiation of high-risk TETs and low-risk TETs with excellent performance, representing a promising tool to assist clinical decision making in patients with TETs.

* Corresponding author. E-mail address: [email protected] (D. Cao).

https://doi.org/10.1016/j.ejrad.2020.108929
Received 26 April 2019; Received in revised form 1 February 2020; Accepted 26 February 2020
0720-048X/ © 2020 Elsevier B.V. All rights reserved.

1. Introduction

Thymic epithelial tumours (TETs) are relatively rare neoplasms, but are the most common anterior mediastinal tumours in adults [1,2]. The histological classification introduced by the World Health Organization (WHO) has been reported to be an independent prognostic factor [3,4]. In the WHO classification, TETs are classified into six subtypes (types A, AB, B1, B2, B3, and thymic carcinomas), in order of increasing malignancy [5]. A simplified risk classification, which defines types A, AB, and B1 as low-risk TETs and types B2, B3, and thymic carcinomas as high-risk TETs, has also been recommended as an alternative to further characterise the invasive behaviour and tumour recurrence probability [4]. Compared with low-risk TETs, high-risk TETs have a much poorer prognosis [4,6].

Computed tomography (CT) has primarily been used for the evaluation and staging of TETs. Several studies have reported that some imaging features of the tumours were useful for differentiating the subtypes of TETs. In these studies, a smooth contour and round shape were related to type A thymoma, whereas irregular contours, a necrotic component, and invasion of the great vessels were related to thymic carcinomas [7–9]. However, these features were of limited value for the prediction of histologic subtypes of TETs due to the significant overlap of features among the subgroups.

Tumour heterogeneity is an essential feature of malignant tumours that can be assessed using histologic or imaging data [10]. Radiomics analysis of large imaging datasets has also illustrated the associations between tumour heterogeneity and radiomic features, which may be used for tumour detection, subtype classification, and therapeutic response assessment [11–14]. In recent studies, quantitative texture analysis of 18F-FDG positron-emission tomography (PET)/CT and CT images has demonstrated the potential to differentiate the tumour grades of TETs [15,16].

Different radiomic features can be combined into a robust predictive model to obtain a reliable diagnostic tool. As a technique for recognising patterns, machine learning can be applied to determine the optimal combination of feature selection and classifier methods to achieve the best performance [17,18]. Moreover, some studies have shown that classifiers from different classifier families show different performance for different types of tumours [19–21]. However, there have been no reports of machine-learning-based CT radiomic analysis for the risk classification of TETs using a larger dataset and independent testing.

The purpose of this study was to explore the potential use of CT-based radiomic features and machine-learning techniques to differentiate high-risk TETs from low-risk TETs. We also investigated the diagnostic performance of the different classifiers and radiomic features extracted from different imaging techniques, including unenhanced CT (UECT), contrast-enhanced CT (CECT), and the combination of UECT and CECT.

2. Materials and methods

This retrospective study was approved by our institutional ethics committee, which waived the need for informed consent.

2.1. Patients

A total of 201 patients with a diagnosis of thymic epithelial tumour were retrieved by searching the pathology database at our institution from January 2009 to December 2018. Six patients were excluded because their histological diagnoses were based on biopsy; the other 195 patients had histological diagnoses based on surgical resection. Then, after searching our picture archiving and communication system (PACS), we excluded 40 patients for the following reasons: (1) patients without preoperative CT examination (n = 18); (2) patients with UECT (n = 9) or CECT (n = 8) only; (3) patients with lesions smaller than 1 cm (n = 2, to ensure enough area for drawing the region of interest (ROI) and to minimise confounding factors for radiomic feature results); (4) poor CT image quality (n = 3). Finally, 155 patients were selected for this study. The study workflow diagram for patient selection is shown in Fig. 1. TETs were classified into two subgroups according to the simplified WHO classification system: low-risk group (A, AB, and B1) and high-risk group (B2, B3, and TC) [4].

2.2. CT image acquisition

The CT images were acquired with three Toshiba Medical Systems CT scanners: 22 patients with a 16-MDCT scanner (Aquilion), 33 patients with an 80-MDCT scanner (Aquilion PRIME), and 100 patients with a 320-MDCT scanner (Aquilion ONE). The scanning parameters were as follows: 120 kV; 130–200 mAs; detector collimation, 0.625 mm; field of view, 35 cm; matrix size, 512 × 512; and reconstruction kernel, standard (FC10). The slice thicknesses for the CT scans were 5 mm (115 patients) and 7 mm (40 patients). Following an unenhanced CT, a contrast medium of 1.5 mL/kg body weight (Omnipaque 350, GE Healthcare) was administered intravenously at a rate of 3.0 mL/s followed by a 20-mL saline flush. Contrast-enhanced CT images were obtained 90 s after contrast agent administration.

2.3. Radiomic feature extraction and evaluation

Radiomic feature extraction was performed using the open-source Imaging Biomarker Explorer (IBEX) software, which is built on a commercial software package (Matlab, version 8.1.0; MathWorks, Natick, Mass) [22]. The radiomics study workflow diagram is shown in Fig. 2. The ROI of the tumour was manually segmented with the ROI editor tools in IBEX on unenhanced CT and enhanced CT images. A polygonal ROI was manually delineated to include the whole lesion on the single axial image that best represented the largest cross-sectional area (Fig. 2). This section was determined jointly by two radiologists with 18 years (reader 1) and 9 years (reader 2) of experience in chest CT interpretation. Apparent calcification and cystic portions were carefully excluded when drawing the ROI.

Before feature extraction, image preprocessing with an edge-preserving smoothing (EPS) filter (a preprocessing tool implemented in the IBEX software) was applied to the lesion volume to reduce image noise. This smoothing step has proved to be an effective way to reduce the impact of imaging noise in lung CT scans [23]. After preprocessing, the radiomic features for two-dimensional (2D) image slices were extracted from the UECT and CECT within the ROI. The extracted features included shape [excluding the three-dimensional (3D) shape features], intensity direct, intensity histogram, grey level co-occurrence matrix 25 (GLCM25), neighbour intensity difference 25 (NID25), and grey level run length matrix 25 (GLRLM25). The co-occurrence matrix features (a subcategory of texture features) were calculated in four directions (0, 45, 90, and 135 degrees) with three different offsets (1, 4, and 7), and the run length matrix features (another subcategory of texture features) were calculated in two directions (0 and 90 degrees). The average value over these directions was used as the final value to avoid directional bias [24,25]. Moreover, some radiomic features from Intensity Direct (Kurtosis, Skewness, Percentile, Quantile, Range, InterQuartileRange, MeanAbsoluteDeviation, and MedianAbsoluteDeviation) that duplicated entries in the Intensity Histogram set were removed. In the end, 172 radiomic features were extracted for each image. The detailed features and abbreviations are listed in Table 1.

Recently, several studies have demonstrated the influence of different acquisition protocols and different scanners on radiomics analysis [26,27]. After feature extraction, all radiomics features were harmonised to remove the scanner-specific effect using the ComBat harmonisation method in nonparametric mode. ComBat harmonisation, which was initially described for genomic data analysis, has been shown to be useful for correcting the variations of radiomic feature values caused by different scanners and different acquisition protocols [28,29]. The Matlab and R function codes for ComBat harmonisation are available at https://github.com/Jfortin1/ComBatHarmonization.

We randomly chose the CT images of 30 patients (20 % of all data) for ROI delineation and feature extraction. The ROIs were independently drawn by two radiologists (reader 1 and reader 2) who

were blinded to clinical and pathological results. The processes were repeated 4 weeks later. The remaining ROIs were segmented by reader 2. The intra-class correlation coefficient (ICC) was used to assess the intra- and inter-observer reproducibility of radiomic feature extraction. Features with an ICC greater than 0.75 (in both the intra- and inter-observer reproducibility studies) were considered for further analysis. All the ROIs drawn by reader 2 were used for the final feature analysis.

We created a circular ROI with a diameter of 3.0 cm in normal liver tissue (excluding the liver vessels) on the enhanced image. Radiomics features from the liver ROI were harmonised across the different scanners using the ComBat harmonisation method. Differences in the harmonised features of the liver ROIs among the scanners were tested using the Kruskal-Wallis rank sum test. Features with a P-value > 0.1 were considered reliable with respect to removal of the scanner-specific effect (Fig. 2).

Fig. 1. The patient selection workflow.

Fig. 2. The radiomics study workflow. A. The ROIs of the lesion and the normal liver tissue were created, and radiomics features were extracted. B. The reproducibility and stability of the features were evaluated by the lesion ROI and liver ROI, respectively, and the scanner-specific effect was subsequently removed. C. The classifiers were trained and evaluated with a nested leave-one-out cross-validation (LOOCV).

Table 1. List of radiomic features used in this study (features in IBEX software; N = 172).

Shape (16 features):
Compactness1, Compactness2, Convex, ConvexHullVolume, Mass, Max3DDiameter, MeanBreadth, SphericalDisproportion, NumberOfVoxel, Orientation, Roundness, Sphericity, SphericalDisproportion, SurfaceArea, SurfaceAreaDensity, Volume

Intensity Direct (25 features):
Energy, GlobalEntropy, GlobalMax, GlobalMean, GlobalMedian, GlobalMin, GlobalStd, GlobalUniformity, LocalEntropyMax, LocalRangeMax, LocalEntropyMean, LocalEntropyMedian, LocalRangeStd, LocalEntropyMin, LocalEntropyStd, LocalRangeMean, LocalRangeMedian, LocalRangeMin, LocalStdMax, LocalStdMin, LocalStdMean, LocalStdMedian, LocalStdStd, RootMeanSquare, Variance, Kurtosis★, Skewness★, Range★, Percentile★, Quantile★, InterQuartileRange★, MeanAbsoluteDeviation★, MedianAbsoluteDeviation★

Intensity Histogram (49 features):
Kurtosis, Skewness, Range, Percentile*, PercentileArea*, Quantile*, InterQuartileRange, MeanAbsoluteDeviation, MedianAbsoluteDeviation

GrayLevelCooccurenceMatrix25 (Offset = 1, 4, 7) (66 features):
AutoCorrelation, ClusterProminence, ClusterShade, ClusterTendency, Contrast, Correlation, DifferenceEntropy, Dissimilarity, Energy, Entropy, Homogeneity, Homogeneity2, InformationMeasureCorr1, InformationMeasureCorr2, InverseDiffMomentNorm, InverseDiffNorm, InverseVariance, MaxProbability, SumAverage, SumEntropy, SumVariance, Variance

GrayLevelRunLengthMatrix25 (11 features):
GrayLevelNonuniformity, HighGrayLevelRunEmpha, LongRunEmphasis, LongRunHighGrayLevelEmpha, RunPercentage, RunLengthNonuniformity, LongRunLowGrayLevelEmpha, LowGrayLevelRunEmpha, ShortRunEmphasis, ShortRunHighGrayLevelEmpha, ShortRunLowGrayLevelEmpha

NeighborIntensityDifference25 (5 features):
Busyness, Coarseness, Complexity, Contrast, TextureStrength

Notes: N, the total number of features (n = 172); ★, features in the Intensity Direct set whose values duplicated those of the Intensity Histogram set and were removed; *, Percentile and PercentileArea were calculated from the 5th to the 95th percentile at intervals of 5; Quantile was calculated at 0.025, 0.25, 0.5, 0.75, and 0.975.
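The direction averaging of co-occurrence features described in Section 2.3 can be illustrated with a minimal sketch. This is not the IBEX implementation: the function names (`glcm`, `contrast`, `direction_averaged`), the symmetric counting, and the absence of ROI masking and grey-level quantisation are all simplifying assumptions made here for illustration.

```python
from collections import Counter

# Pixel displacements (row, column) for the four directions named in the text.
DIRECTIONS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def glcm(image, angle, offset):
    """Symmetric, normalised co-occurrence probabilities for one direction.

    `image` is a list of rows of integer grey levels (hypothetical input;
    no ROI mask or grey-level binning is applied in this sketch).
    """
    dr, dc = DIRECTIONS[angle]
    dr, dc = dr * offset, dc * offset
    rows, cols = len(image), len(image[0])
    counts = Counter()
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[(a, b)] += 1
                counts[(b, a)] += 1  # count each pair in both orders
    total = sum(counts.values())
    if total == 0:  # offset larger than the image: no valid pixel pairs
        return {}
    return {pair: n / total for pair, n in counts.items()}


def contrast(p):
    """GLCM contrast: sum over (i, j) of p(i, j) * (i - j)^2."""
    return sum(prob * (i - j) ** 2 for (i, j), prob in p.items())


def direction_averaged(image, feature, offset):
    """Average one GLCM feature over the four directions at a given offset."""
    values = [feature(glcm(image, angle, offset)) for angle in DIRECTIONS]
    return sum(values) / len(values)


# Small toy image with four grey levels (illustrative data only).
toy = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
print(direction_averaged(toy, contrast, offset=1))
```

Averaging over directions yields one value per feature and offset, which is how 22 co-occurrence features at three offsets give the 66 GLCM entries counted in Table 1.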

Table 2. Demographic characteristics of the 155 enrolled patients.

Group              Number   M/F ratio   Age (years)
Low-risk group     83       41/42       53 (23–79)
  Type A           22       12/10       49 (23–78)
  Type AB          47       24/23       56 (33–71)
  Type B1          14       5/9         48 (28–79)
High-risk group    72       40/32       52 (25–78)
  Type B2          31       19/12       49 (25–69)
  Type B3          35       19/16       54 (34–78)
  TC               6        2/4         55 (42–71)

Notes: Low-risk group, low-risk thymic epithelial tumour, including WHO histological subgroups Type A, AB, and B1; High-risk group, high-risk thymic epithelial tumour, including WHO histological subgroups Type B2, Type B3, and thymic carcinoma (TC); Number, the number of patients; M/F ratio, male/female ratio; Age, median with range in brackets.


Table 3. The features selected by LASSO regression over all the LOOCV loops.

Mode / Feature                                   LASSO coeff               Number

UECT + CECT
CE_GLCM25-7Homogeneity2                          −0.56 (−0.25∼−095)        155/155
CE_GLCM25-7InformationMeasureCorr1               0.46 (0.12∼0.53)          155/155
CE_GLCM25-4AutoCorrelation                       −0.69 (−0.98∼−0.01)       155/155
UE_GLCM25-4InverseVariance                       0.38 (0.26∼0.49)          155/155
UE_IntensityDirectLocalEntropyMedian             0.23 (0.06∼0.39)          155/155
UE_IntensityHistogram10PercentileArea            −0.02 (−0.04∼−0.00)       93/155
CE_IntensityHistogram95PercentileArea            −0.19 (−0.47∼0.00)        83/155
CE_GLCM25-ShortRunHighGrayLevelEmpha             −0.03 (−0.18∼0.00)        53/155
CE_GLCM25-7Contrast                              0.10 (0.00∼0.45)          9/155
CE_IntensityHistogram10Percentile                −0.20 (−0.47∼−0.01)       6/155

CECT
GLCM25-7InformationMeasureCorr1                  0.77 (0.05∼1.43)          153/155
IntensityHistogram0.975Quantile                  0.15 (0.00∼0.39)          149/155
IntensityHistogramGaussFit1GaussAmplitude        −0.12 (−0.4∼0.00)         140/155
GLCM25-4AutoCorrelation                          −0.92 (−1.35∼−0.02)       131/155
IntensityHistogram5PercentileArea                0.09 (0.02∼0.30)          126/155
IntensityHistogramInterQuartileRange             −0.06 (−0.29∼−0.01)       125/155
IntensityHistogram95PercentileArea               −0.19 (−0.62∼−0.01)       123/155
GLCM25-7Contrast                                 0.25 (0.00∼1.15)          113/155
GLCM25-7Homogeneity2                             −0.09 (−0.23∼−0.01)       113/155
GLCM25-7ClusterProminence                        −0.21 (−0.36∼−0.03)       27/155

UECT
GLCM25-4InverseVariance                          0.56 (0.46∼0.65)          155/155
GLCM25-1InformationMeasureCorr2                  0.08 (0.00∼0.19)          154/155
IntensityHistogram10PercentileArea               −0.12 (−0.17∼−0.07)       154/155
GLCM25-7Homogeneity2                             0.05 (0.00∼0.14)          150/155
GLCM25-1SumVariance                              −0.11 (−0.21∼0.00)        146/155
GLCM25-1ClusterProminence                        −0.02 (−0.06∼0.00)        85/155
GLCM25-7AutoCorrelation                          −0.03 (−0.14∼0.00)        67/155
IntensityHistogram30Percentile                   −0.05 (−0.14∼0.00)        41/155
IntensityDirectLocalEntropyMedian                0.01 (0.00∼0.04)          23/155

Notes: UECT, unenhanced computed tomography; CECT, contrast-enhanced computed tomography; GLCM, grey level co-occurrence matrix; LASSO coeff, the least absolute shrinkage and selection operator (LASSO) coefficients, reported as median with range in brackets; LOOCV, leave-one-out cross-validation; Number, the frequency of feature selection.

2.4. Feature selection and machine-learning classification

To avoid model overfitting of the classifiers, a nested leave-one-out cross-validation (LOOCV) was employed to evaluate the performance of the different machine-learning models. The detailed workflow is presented in Fig. 2. Specifically, at each step of the LOOCV process, one sample was taken as the test sample, and all the remaining samples were used as the training set. The procedure was repeated until every sample in the dataset had been used as the test sample. In each loop of the LOOCV, the feature values were standardised to a mean of zero and a standard deviation of one, using the “scale” function in R software, to eliminate the possible influence of different dimensions. The least absolute shrinkage and selection operator (LASSO) algorithm was applied to choose the most valuable features. Five-fold cross-validation was performed to select the best λ (the tuning parameter to be determined in LASSO) using the 1-SE criterion. The LASSO analysis was performed using the “glmnet” package (version 0.84) in R software. The features with non-zero coefficients were selected from the candidate features and formed a radiomic signature for machine-learning classification analysis.

Finally, four well-known machine-learning classifiers were used: generalised linear models (GLM), k-nearest neighbour (KNN), support vector machines with a radial basis function kernel (SVM-radial), and random forest (RF). All classifiers were implemented using the CARET package (version 6.0–47) in R software, which provides a unified and user-friendly interface to many machine-learning algorithms. The following packages were used for the classification methods: ‘glm’ (GLM), ‘knn’ (KNN), ‘kernlab’ (SVM-radial), and ‘randomForest’ (RF). The classifiers were trained or tuned in the training set to determine the best parameter configuration for the classifier model by using an inner 10-fold cross-validation repeated five times. The performance of the predictive model was then evaluated in the testing set. The train/tuning parameters of the classifiers were as follows: GLM, the train family = “binomial”; KNN, the tuning parameter “k” was chosen from 1 to 11 in steps of 2; SVM-radial, the tuning parameters “C” and “sigma” were chosen from the sets {2^−2, 2^−1, …, 2^3, 2^4} and {10^−2, 10^−1, 1, 10^1, 10^2}, respectively; RF, the tuning parameter “mtry” was chosen from 2 to 10 in steps of 1. In the nested LOOCV, these classifiers were first trained or tuned in the training set and then validated on the one independent left-out testing sample, separately for each radiomic mode (UECT mode, CECT mode, and the combined mode).

2.5. Model evaluation

The importance of the predictive radiomics features in the classifiers was evaluated by the selection frequencies of features obtained in LASSO regression over all of the LOOCV loops. The area under the curve (AUC) of the receiver operating characteristic (ROC), sensitivity (Sens), specificity (Spec), positive predictive value (PPV), negative predictive value (NPV), and accuracy (ACC) of the models were computed based on the results of all independent left-out testing samples.

2.6. Statistical analysis

The statistical analysis was conducted with R software (version 3.3.2; http://www.R-project.org). All continuous variables were described as mean and 95 % confidence interval, or median and range. Wilcoxon rank-sum tests and two-sample t tests were used to compare groups regarding continuous variables, whereas the Pearson chi-square test or Fisher exact test was used to compare groups regarding categorical variables. The significance of differences between AUCs of the different classifiers and models was tested with the DeLong method using the MedCalc software package. All statistical analyses were two-sided, and p-values of < 0.05 were considered statistically significant.

Fig. 3. Heat maps of the selected features after least absolute shrinkage and selection operator (LASSO) regression for different radiomic modes. Upper row: The combined mode of unenhanced computed tomography (UECT) and contrast-enhanced computed tomography (CECT); Middle row: The CECT mode; Bottom row: The UECT mode.

3. Results

3.1. Demographic characteristics of the patients

The clinicopathological characteristics of patients in our study cohort are shown in Table 2. The numbers of patients in the low-risk group and the high-risk group were 83 and 72, respectively. There was no significant difference in age or sex either among the WHO histological subgroups or between the low-risk group and the high-risk group.

3.2. Feature selection and acquisition of radiomic signatures

A total of 344 radiomic features for each patient were extracted from UECT (172 features) and CECT images (172 features). Features with low reproducibility (intra- or inter-observer ICC < 0.75) were excluded, so that the number of features was reduced to 230 (119 features for UECT images and 111 features for CECT images). Features with low stability (Kruskal-Wallis rank sum test in the liver ROI, p < 0.1) were further removed, and the final number of features was 210 (117 features for UECT images and 109 features for CECT images). The features obtained in the process of LASSO regression were ranked by their selection frequencies over all of the LOOCV loops (Table 3). Most of these features were derived from the co-occurrence matrix and the intensity histogram. Heat maps of the selected features are presented in Fig. 3, and show the distribution and differences of normalised texture feature values for the different radiomic modes.
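The outer leave-one-out loop of Section 2.4, together with the per-fold standardisation and the selection-frequency count reported in Table 3, can be sketched as follows. This is an illustrative outline only, not the authors' R code: `select_features` is a simple mean-difference filter standing in for LASSO, `nearest_centroid` stands in for the four classifiers, the threshold of 1.0 is arbitrary, and the inner repeated 10-fold tuning loop is omitted.

```python
from collections import Counter
from statistics import mean, stdev


def standardise(train, test_row):
    """Z-score with training-fold statistics only, so the test sample never leaks in."""
    cols = list(zip(*train))
    mus = [mean(c) for c in cols]
    sds = [stdev(c) or 1.0 for c in cols]  # guard against zero variance

    def z(row):
        return [(v - m) / s for v, m, s in zip(row, mus, sds)]

    return [z(r) for r in train], z(test_row)


def select_features(X, y, threshold=1.0):
    """Stand-in for LASSO (assumption): keep features whose class means differ by more than `threshold` SD."""
    keep = []
    for j in range(len(X[0])):
        m1 = mean(x[j] for x, lab in zip(X, y) if lab == 1)
        m0 = mean(x[j] for x, lab in zip(X, y) if lab == 0)
        if abs(m1 - m0) > threshold:
            keep.append(j)
    return keep


def nearest_centroid(X, y, row, keep):
    """Stand-in classifier (assumption): predict the class with the closer centroid."""
    def dist(label):
        members = [x for x, lab in zip(X, y) if lab == label]
        return sum((row[j] - mean(m[j] for m in members)) ** 2 for j in keep)

    return 1 if dist(1) < dist(0) else 0


def loocv(X, y):
    """Leave each sample out once; return predictions and per-feature selection counts."""
    freq, preds = Counter(), []
    for i in range(len(X)):
        train_X, train_y = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        Xz, test_z = standardise(train_X, X[i])
        keep = select_features(Xz, train_y)  # selection is redone inside every fold
        freq.update(keep)
        preds.append(nearest_centroid(Xz, train_y, test_z, keep))
    return preds, freq


# Toy data (illustrative only): feature 0 separates the classes, feature 1 does not.
X = [[1.0, 5.0], [1.2, 4.0], [0.9, 6.0], [1.1, 5.0],
     [3.0, 5.0], [3.2, 6.0], [2.9, 4.0], [3.1, 5.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
preds, freq = loocv(X, y)
print(preds, dict(freq))
```

Because selection is repeated within each fold, a feature can be chosen in some folds and not others, which is why Table 3 reports frequencies such as 93/155 rather than a single fixed feature set.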


3.3. Performance of the classifiers and radiomic modes

Table 4. Classification performance of the classifiers for three radiomic modes.

Classifier   Mode          AUC (95 % CI)         Sens   Spec   PPV    NPV    ACC
GLM          UECT + CECT   0.86 (0.79 - 0.91)    0.75   0.84   0.81   0.80   0.80
GLM          CECT          0.80 (0.72 - 0.86)    0.68   0.80   0.74   0.74   0.74
GLM          UECT          0.62 (0.54 - 0.70)    0.53   0.60   0.54   0.60   0.57
KNN          UECT + CECT   0.86 (0.79 - 0.91)    0.75   0.81   0.77   0.79   0.78
KNN          CECT          0.79 (0.72 - 0.86)    0.64   0.83   0.77   0.73   0.74
KNN          UECT          0.65 (0.57 - 0.72)    0.58   0.64   0.58   0.64   0.62
SVM          UECT + CECT   0.85 (0.79 - 0.90)    0.74   0.82   0.78   0.78   0.78
SVM          CECT          0.83 (0.76 - 0.89)    0.67   0.84   0.79   0.75   0.76
SVM          UECT          0.60 (0.52 - 0.68)    0.50   0.63   0.54   0.59   0.57
RF           UECT + CECT   0.87 (0.80 - 0.92)    0.71   0.89   0.85   0.78   0.81
RF           CECT          0.81 (0.73 - 0.86)    0.70   0.82   0.77   0.76   0.76
RF           UECT          0.61 (0.53 - 0.69)    0.47   0.62   0.52   0.57   0.55

The average performance (AUC, Sens, Spec, NPV, PPV, and ACC) of the classifiers for the different radiomic modes is presented in Table 4, and the average ROC curves are shown in Fig. 4. The pairwise comparison of ROC curves indicated that the combination of UECT and CECT demonstrated the best performance for these machine-learning methods, followed by CECT and then UECT. There was no difference between CECT and the combination of UECT and CECT for the SVM classifier. There was no significant difference among the ROC curves of the four classification methods for each radiomic mode. The RF classifier in the combination of UECT and CECT exhibited the best performance (AUC 0.87, 95 % CI: 0.80–0.92).
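The threshold metrics listed in Section 2.5 follow directly from confusion-matrix counts, and the AUC can be read as the probability that a high-risk case receives a higher score than a low-risk case. The sketch below illustrates both. The paper does not report confusion matrices, so the counts used here (54, 18, 70, 13) are assumed values, chosen only because they total 72 high-risk and 83 low-risk patients and reproduce one row of rounded values; the scores passed to `auc` are likewise invented.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, PPV, NPV and accuracy, with high-risk as the positive class."""
    return {
        "Sens": tp / (tp + fn),
        "Spec": tn / (tn + fp),
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
        "ACC": (tp + tn) / (tp + fn + tn + fp),
    }


def auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    wins = sum((p > n) + 0.5 * (p == n) for p in scores_pos for n in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))


# Assumed counts (not from the paper): 72 high-risk and 83 low-risk patients.
metrics = confusion_metrics(tp=54, fn=18, tn=70, fp=13)
print({name: round(value, 2) for name, value in metrics.items()})

# Invented classifier scores for three positives and two negatives.
print(auc([0.9, 0.8, 0.4], [0.7, 0.3]))
```

In a leave-one-out design each patient contributes exactly one left-out prediction, so the counts and scores are pooled over all 155 folds before these quantities are computed.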

4. Discussion Medical images, such as CT or MRI, are traditionally “viewed” and “interpreted” by visual observation. Correct interpretation of these images depends on the observer's experience, knowledge, and the quality of the equipment. However, tumour heterogeneity, which may be associated with biologic aggressiveness, is challenging to capture and quantify with the visual assessment of images. Radiomic analysis, which can provide objective, quantitative evaluation of tumour heterogeneity from a large number of quantitative features of medical images, oﬀers the potential to overcome the limitations of a subjective visual image interpretation. In the present study, we attempt to develop and validate a machine-learning approach combined with CT radiomic analysis to diﬀerentiate the risk classiﬁcation of TETs. Among the three image modes, all the four classiﬁers provided excellent performance

Notes: AUC, area under the curve; UECT, unenhanced computed tomography; CECT, contrast enhanced computed tomography; Sens, Sensitivity; Spec, Speciﬁcity; NPV, negative predictive value; PPV, positive predictive value; ACC, accuracy; GLM, Generalised linear models; SVM, support vector machines; kNN, k-nearest neighbor; RF, Random forest. The value in the bracket is 95 % conﬁdence interval.

Fig. 4. The ROC curves of the diﬀerent radiomic modes (UECT, CECT and the combination of UECT and CECT) for the diﬀerent classiﬁers. A. GLM Generalised linear models; B. KNN k-nearest neighbour; C. SVM, support vector machines; D. RF. random forest. 6

European Journal of Radiology 126 (2020) 108929

J. Hu, et al.

machine-learning methods should be evaluated in diﬀ ;erent types of tumours concerning diﬀ ;erent radiomic cohorts [20,21]. In the present study, we investigated the performance of four classiﬁers (GLM, KNN, SVM, and RF) that achieved higher performance in previous studies [40–42]. The RF classiﬁer in the combined mode of UECT and CECT exhibited the best performance. Consistent with previous studies, the random forest classiﬁer was reported to have the best predictive performance in lung cancers and head and neck cancers when compared with other classiﬁer [20,21]. In general, our results demonstrated that all four classiﬁers showed excellent performance with relatively high AUC and ACC for the CECT mode and the combined mode of UECT and CECT. In addition, as the fact of competition between CT and MRI in the assessment of mediastinal tumours, using machine-learning methods, a recent study also shown that radiomics features from conventional MRI can also be used to establish robust prediction models for risk classiﬁcation of TETs [43]. These encouraging results indicated that machinelearning-based radiomic methods might be a robust risk stratiﬁcation approach for TETs that could aid clinical decision support. Our study has several limitations. First, a retrospective study performed at a single centre may have introduced a selection bias; a largescale, multi-centre, prospective study is needed to be performed for validation. Second, the ROIs were manually drawn at the largest crosssection of the lesion. A 3D volumetric analysis was not conducted because of the lack of thin-slice image data due to the long inclusion period of subjects and variable CT protocols. Besides, texture features may be aﬀected by calciﬁcation, haemorrhage, and cysts in TETs, which are diﬃcult to eliminate at the volumetric analysis. Third, CT image data were collected from three diﬀerent scanners with diﬀerent parameters, which can aﬀect the extracted features [26,27]. 
Although a ComBat harmonisation method was used to remove scanner-speciﬁc eﬀects from features, its eﬀectiveness needs to be proved in further studies. Finally, due to from the pathologic database, the diagnosis of the TETs was made by diﬀerent pathologists, which might result in inter-observer variability.

(AUC: 0.79-0.87) in the CECT mode and the combined mode of UECT and CECT. Although the RF classiﬁer in the combination of UECT and CECT exhibited the best performance (AUC = 0.87), there is no signiﬁcant diﬀerence among the four classiﬁers for each radiomic mode. Our results demonstrated that machine-learning analysis based on CT radiomic features has the potential to correctly diﬀerentiate high-risk TETs from low-risk TETs. The most important features after selection were the histogram and GLCM texture features, which were from either CECT or UECT. Among these features, as ﬁrst order feature, an intensity histogram evaluates the frequency distribution features of pixel intensity within a given area of interest. As second order features, according to their formulas and deﬁnitions, GLCM-InformationMeasureCorr1 quantiﬁes the degree of randomness, GLCM-7 InverseVariance indicates the similarity of voxel values along that direction, homogeneity is a measure of local grey level uniformity, and correlation reﬂects the consistency of image texture [22]. In previous studies, these features provide measures of tumour heterogeneity that have been reported to be related to histopathological features and prognosis in a variety of tumours such as oesophageal cancer, renal cancer, ovarian cancer, and non-small-cell lung cancer [30–33]. Another main ﬁnding of this study is that, when compared with the UECT radiomic features, an appropriate subset of features from the CECT or the combination of CECT and UECT can signiﬁcantly improve the risk classiﬁcation performance of TETs. Previous studies have pointed out that UECT and CECT may reﬂect diﬀerences in tumour biology. 
Radiomic features from UECT images may be associated with the spatial heterogeneity of histopathological characteristics in tumours such as cellular density, focal haemorrhage, and necrosis; whereas radiomic features from CECT images may indicate the heterogeneity of the tumour blood supply and the contrast distribution between intra-, extravascular, and extracellular space [34–36]. Therefore, radiomic features from CECT have been linked to the tumour microvascular architecture such as microvascular density and permeability. Our results appear to suggest that features from CECT may be more valuable than the ones from UECT, and a combination of features on UECT and CECT demonstrated the best performance of risk classiﬁcation for TETs. Similar to our results, a recent study also showed that radiomics features from UECT and CECT could be used as noninvasive biomarkers for the diﬀerentiation of high-risk TETs and low-risk TETs [37]. However, classiﬁcation performance was compared only between UECT and CECT, and no signiﬁcant diﬀerence was shown in their study, although the performance of CECT demonstrated slightly better than that of UECT. In recent years, the application of machine-learning-based radiomic analysis to medical imaging has drawn increased attention. The radiomic study is usually accompanied by the extraction of a large number of imaging features, among which some are redundant and unstable. Therefore, a key challenge for machine-learning-based radiomic analysis is the extraction of stable and signiﬁcant features for further assessment. In the present study, an EPS ﬁlter and a ComBat harmonisation method were ﬁrst used to remove the possible eﬀects caused by scan parameters and diﬀerent scanners. The repeatability and stability of features were further assessed to ensure the reliability of selected features. Finally, LASSO regression was employed for feature selection to eliminate the redundant features as much as possible. 
In previous studies, LASSO regression has been shown to be a powerful feature selection method that can find important features and filter out unimportant or unnecessary ones to achieve robust classification performance [38,39]. When using machine-learning algorithms, one major problem is the risk of overfitting. In our study, a nested LOOCV method was used to reduce the bias in performance estimation and to minimise the risk of overfitting. In machine-learning analysis, choosing a robust classifier is also crucial for the stability and classification performance of the radiomic model. Some studies have shown that different

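The nested LOOCV idea mentioned in the discussion above can be sketched as follows. This is a minimal standard-library illustration of the fold structure only, not the pipeline used in this study: it assumes a single made-up feature, a k-NN scorer, and a hypothetical grid `k_grid = [1, 3]`. The point it shows is that the hyperparameter is tuned in an inner leave-one-out loop on the training fold alone, so the held-out case never influences model choice. In a full pipeline, feature selection would sit inside the same outer training fold.

```python
def knn_score(train, query, k):
    """Fraction of the k nearest training samples (1-D feature) labelled 1."""
    nearest = sorted(train, key=lambda s: abs(s[0] - query))[:k]
    return sum(label for _, label in nearest) / k

def loocv_accuracy(samples, k):
    """Inner loop: leave-one-out accuracy of k-NN on the given samples."""
    hits = 0
    for i, (x, y) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        pred = 1 if knn_score(rest, x, k) >= 0.5 else 0
        hits += pred == y
    return hits / len(samples)

def nested_loocv_scores(samples, k_grid):
    """Outer loop: tune k on the training fold only, then score the held-out case."""
    scores = []
    for i, (x, _) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        best_k = max(k_grid, key=lambda k: loocv_accuracy(train, k))
        scores.append(knn_score(train, x, best_k))
    return scores

def auc(scores, labels):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up, well-separated toy data: (feature value, label), 1 = high-risk.
samples = [(1.0, 0), (1.2, 0), (1.4, 0), (1.6, 0),
           (3.0, 1), (3.2, 1), (3.4, 1), (3.6, 1)]
labels = [y for _, y in samples]
scores = nested_loocv_scores(samples, k_grid=[1, 3])
print(auc(scores, labels))
```

Pooling the held-out scores from all outer folds and computing one AUC, as done here, is one common way to summarise leave-one-out results, because each outer fold contains a single case and has no AUC of its own.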
5. Conclusion

In conclusion, our results suggest that machine-learning analysis based on CT radiomic features can be applied as a prediction method for risk classification of TETs. All classifiers showed high diagnostic performance when using a combination of UECT and CECT features. As machine-learning research in radiology is still evolving, further work with larger sample sizes is needed to validate the performance of the classifiers and to make them more reliable in clinical practice.

Funding

This work was supported by the Grant of Science and Technology Commission of Fujian Province (Grant number: 2019J01435).

Declaration of Competing Interest

We declare that we have no financial and personal relationships with other people or organizations that can inappropriately influence our work, and that there is no professional or other personal interest of any nature or kind in any product, service and/or company that could be construed as influencing the position presented in, or the review of, the manuscript entitled.

References

[1] R.F. Riedel, W.R. Burfeind, Thymoma: benign appearance, malignant potential, Oncologist 11 (8) (2006) 887–894.
[2] F.C. Detterbeck, A. Zeeshan, Thymoma: current diagnosis and treatment, Chin. Med. J. 126 (11) (2013) 2186–2191.


[3] M. Tsuyuguchi, S. Kimura, M. Sumitomo, WHO histologic classification is a prognostic indicator in thymoma, Ann. Thorac. Surg. 77 (4) (2004) 1183–1188.
[4] G. Chen, A. Marx, H.C. Wen, J. Yong, B. Puppe, P. Stroebel, H.K. Mueller Hermelink, New WHO histologic classification predicts prognosis of thymic epithelial tumors: a clinicopathologic study of 200 thymoma cases from China, Cancer 95 (2) (2002) 420–429.
[5] J. Rosai, L.H. Sobin, Histological Typing of Tumours of the Thymus, (1999).
[6] K. Beom Kyung, C. Byoung Chul, C. Hye Jin, S. Joo Hyuk, P. Moo Suk, C. Joon, K. Se Kyu, K. Dae Joon, C. Kyung Young, L.C. Geol, A single institutional experience of surgically resected thymic epithelial tumors over 10 years: clinical outcomes and clinicopathologic features, Oncol. Rep. 19 (6) (2008) 1525–1531.
[7] Y.J. Jeong, K.S. Lee, J. Kim, Y.M. Shim, J. Han, O.J. Kwon, Does CT of thymic epithelial tumors enable us to differentiate histologic subtypes and predict prognosis? AJR Am. J. Roentgenol. 183 (2) (2004) 283.
[8] N. Tomiyama, T. Johkoh, N. Mihara, O. Honda, T. Kozuka, M. Koyama, S. Hamada, M. Okumura, M. Ohta, T. Eimoto, M. Miyagawa, N.L. Muller, J. Ikezoe, H. Nakamura, Using the World Health Organization Classification of thymic epithelial neoplasms to describe CT findings, AJR Am. J. Roentgenol. 179 (4) (2002) 881–886.
[9] J. Sadohara, K. Fujimoto, N.L. Muller, S. Kato, S. Takamori, K. Ohkuma, H. Terasaki, N. Hayabuchi, Thymic epithelial tumors: comparison of CT and MR imaging findings of low-risk thymomas, high-risk thymomas, and thymic carcinomas, Eur. J. Radiol. 60 (1) (2006) 70–79.
[10] B. Ganeshan, K.A. Miles, Quantifying tumour heterogeneity with CT, Cancer Imaging 13 (2013) 140–149.
[11] J.P. O’Connor, E.O. Aboagye, J.E. Adams, H.J. Aerts, S.F. Barrington, A.J. Beer, R. Boellaard, S.E. Bohndiek, M. Brady, G. Brown, D.L. Buckley, T.L. Chenevert, L.P. Clarke, S. Collette, G.J. Cook, N.M. deSouza, J.C. Dickson, C. Dive, J.L. Evelhoch, C. Faivre-Finn, F.A. Gallagher, F.J. Gilbert, R.J. Gillies, V. Goh, J.R. Griffiths, A.M. Groves, S. Halligan, A.L. Harris, D.J. Hawkes, O.S. Hoekstra, E.P. Huang, B.F. Hutton, E.F. Jackson, G.C. Jayson, A. Jones, D.M. Koh, D. Lacombe, P. Lambin, N. Lassau, M.O. Leach, T.Y. Lee, E.L. Leen, J.S. Lewis, Y. Liu, M.F. Lythgoe, P. Manoharan, R.J. Maxwell, K.A. Miles, B. Morgan, S. Morris, T. Ng, A.R. Padhani, G.J. Parker, M. Partridge, A.P. Pathak, A.C. Peet, S. Punwani, A.R. Reynolds, S.P. Robinson, L.K. Shankar, R.A. Sharma, D. Soloviev, S. Stroobants, D.C. Sullivan, S.A. Taylor, P.S. Tofts, G.M. Tozer, M. van Herk, S. Walker-Samuel, J. Wason, K.J. Williams, P. Workman, T.E. Yankeelov, K.M. Brindle, L.M. McShane, A. Jackson, J.C. Waterton, Imaging biomarker roadmap for cancer studies, Nat. Rev. Clin. Oncol. 14 (3) (2017) 169–186.
[12] P. Lambin, E. Rios-Velazquez, R. Leijenaar, S. Carvalho, R.G. van Stiphout, P. Granton, C.M. Zegers, R. Gillies, R. Boellaard, A. Dekker, H.J. Aerts, Radiomics: extracting more information from medical images using advanced feature analysis, Eur. J. Cancer 48 (4) (2012) 441–446.
[13] E.J. Limkin, R. Sun, L. Dercle, E.I. Zacharaki, C. Robert, S. Reuze, A. Schernberg, N. Paragios, E. Deutsch, C. Ferte, Promises and challenges for the implementation of computational medical imaging (radiomics) in oncology, Ann. Oncol. 28 (6) (2017) 1191–1206.
[14] R.J. Gillies, P.E. Kinahan, H. Hricak, Radiomics: images are more than pictures, they are data, Radiology 278 (2) (2016) 563–577.
[15] K. Yasaka, H. Akai, M. Nojima, A. Shinozaki-Ushiku, M. Fukayama, J. Nakajima, K. Ohtomo, S. Kiryu, Quantitative computed tomography texture analysis for estimating histological subtypes of thymic epithelial tumors, Eur. J. Radiol. 92 (2017) 84–92.
[16] H.S. Lee, J.S. Oh, Y.S. Park, S.J. Jang, I.S. Choi, J.S. Ryu, Differentiating the grades of thymic epithelial tumor malignancy using textural features of intratumoral heterogeneity via (18)F-FDG PET/CT, Ann. Nucl. Med. 30 (4) (2016) 309–319.
[17] B. Zhang, X. He, F. Ouyang, D. Gu, Y. Dong, L. Zhang, X. Mo, W. Huang, J. Tian, S. Zhang, Radiomic machine-learning classifiers for prognostic biomarkers of advanced nasopharyngeal carcinoma, Cancer Lett. 403 (2017) 21–27.
[18] C. Parmar, P. Grossmann, J. Bussink, P. Lambin, H. Aerts, Machine learning methods for quantitative radiomic biomarkers, Sci. Rep. 5 (2015) 13087.
[19] S.E. Viswanath, P.V. Chirra, M.C. Yim, N.M. Rofsky, A.S. Purysko, M.A. Rosen, B.N. Bloch, A. Madabhushi, Comparing radiomic classifiers and classifier ensembles for detection of peripheral zone prostate tumors on T2-weighted MRI: a multi-site study, BMC Med. Imaging 19 (1) (2019) 22.
[20] C. Parmar, P. Grossmann, D. Rietveld, M.M. Rietbergen, P. Lambin, H.J.W.L. Aerts, Radiomic machine-learning classifiers for prognostic biomarkers of head and neck cancer, Front. Oncol. 5 (4) (2015).
[21] C. Parmar, P. Grossmann, J. Bussink, P. Lambin, H.J.W.L. Aerts, Machine learning methods for quantitative radiomic biomarkers, Sci. Rep. 5 (2015) 13087.
[22] L. Zhang, D.V. Fried, X.J. Fave, L.A. Hunter, J. Yang, L.E. Court, IBEX: an open infrastructure software platform to facilitate collaborative work in radiomics, Med. Phys. 42 (3) (2015) 1341–1353.
[23] J. Yang, L. Zhang, X.J. Fave, D.V. Fried, F.C. Stingo, C.S. Ng, L.E. Court, Uncertainty analysis of quantitative imaging features extracted from contrast-enhanced CT in lung tumors, Comput. Med. Imaging Graph. 48 (2016) 1–8.
[24] C.A. Owens, C.B. Peterson, C. Tang, E.J. Koay, W. Yu, D.S. Mackin, J. Li, M.R. Salehpour, D.T. Fuentes, L.E. Court, J. Yang, Lung tumor segmentation methods: impact on the uncertainty of radiomics features for non-small cell lung cancer, PLoS One 13 (10) (2018) e0205003.
[25] X. Fave, M. Cook, A. Frederick, L. Zhang, J. Yang, D. Fried, F. Stingo, L. Court, Preliminary investigation into sources of uncertainty in quantitative imaging features, Comput. Med. Imaging Graph. 44 (2015) 54–61.
[26] L. He, Y. Huang, Z. Ma, C. Liang, C. Liang, Z. Liu, Effects of contrast-enhancement, reconstruction slice thickness and convolution kernel on the diagnostic performance of radiomics signature in solitary pulmonary nodule, Sci. Rep. 6 (2016) 34921.
[27] L. Lu, R.C. Ehmke, L.H. Schwartz, B. Zhao, Assessing agreement between radiomic features computed for multiple CT imaging settings, PLoS One 11 (12) (2016) e0166550.
[28] F. Orlhac, S. Boughdad, C. Philippe, H. Stalla-Bourdillon, C. Nioche, L. Champion, M. Soussan, F. Frouin, V. Frouin, I. Buvat, A postreconstruction harmonization method for multicenter radiomic studies in PET, J. Nucl. Med. 59 (8) (2018) jnumed.117.199935.
[29] F. Orlhac, F. Frouin, C. Nioche, N. Ayache, I. Buvat, Validation of a method to compensate multicenter effects affecting CT radiomics, Radiology 291 (1) (2019) 53–59.
[30] S. Rizzo, F. Botta, S. Raimondi, D. Origgi, V. Buscarino, A. Colarieti, F. Tomao, G. Aletti, V. Zanagnolo, M. Del Grande, N. Colombo, M. Bellomi, Radiomics of high-grade serous ovarian cancer: association between quantitative CT features, residual tumour and disease progression within 12 months, Eur. Radiol. 28 (11) (2018) 4849–4859.
[31] D.V. Fried, O. Mawlawi, L. Zhang, X. Fave, S. Zhou, G. Ibbott, Z. Liao, L.E. Court, Stage III non-small cell lung cancer: prognostic value of FDG PET quantitative imaging features combined with clinical prognostic factors, Radiology 278 (1) (2016) 214–222.
[32] Z. Feng, P. Rong, P. Cao, Q. Zhou, W. Zhu, Z. Yan, Q. Liu, W. Wang, Machine learning-based quantitative texture analysis of CT images of small renal masses: differentiation of angiomyolipoma without visible fat from renal cell carcinoma, Eur. Radiol. 28 (4) (2018) 1625–1633.
[33] S. Liu, H. Zheng, X. Pan, L. Chen, M. Shi, Y. Guan, Y. Ge, J. He, Z. Zhou, Texture analysis of CT imaging for assessment of esophageal squamous cancer aggressiveness, J. Thorac. Dis. 9 (11) (2017) 4724–4732.
[34] Z. Haowei, C.M. Graham, E. Okan, M.E. Griswold, Z. Xu, M.A. Khan, P. Karen, J.J. Caudell, R.D. Hamilton, G. Balaji, Locally advanced squamous cell carcinoma of the head and neck: CT texture and histogram analysis allow independent prediction of overall survival in patients treated with induction chemotherapy, Radiology 269 (3) (2013) 801–809.
[35] F. Ng, B. Ganeshan, R. Kozarski, K.A. Miles, V. Goh, Assessment of primary colorectal cancer heterogeneity by using whole-tumor texture analysis: contrast-enhanced CT texture as a biomarker of 5-year survival, Radiology 266 (1) (2013) 177–184.
[36] H. Li, Y. Zhu, E.S. Burnside, K. Drukker, K.A. Hoadley, C. Fan, S.D. Conzen, G.J. Whitman, E.J. Sutton, J.M. Net, M. Ganott, E. Huang, E.A. Morris, C.M. Perou, Y. Ji, M.L. Giger, MR imaging radiomics signatures for predicting the risk of breast cancer recurrence as given by research versions of MammaPrint, oncotype DX, and PAM50 gene assays, Radiology 281 (2) (2016) 382–391.
[37] X. Wang, W. Sun, H. Liang, X. Mao, Z. Lu, Radiomics signatures of computed tomography imaging for predicting risk categorization and clinical stage of thymomas, Biomed Res. Int. (2019) 3616852.
[38] P. Yin, N. Mao, C. Zhao, J. Wu, C. Sun, L. Chen, N. Hong, Comparison of radiomics machine-learning classifiers and feature selection for differentiation of sacral chordoma and sacral giant cell tumour based on 3D computed tomography features, Eur. Radiol. 29 (4) (2018) 1841–1847.
[39] S. Wu, J. Zheng, Y. Li, H. Yu, S. Shi, W. Xie, H. Liu, Y. Su, J. Huang, T. Lin, A radiomics nomogram for the preoperative prediction of lymph node metastasis in bladder cancer, Clin. Cancer Res. 23 (22) (2017) 6904–6911.
[40] C. Parmar, P. Grossmann, D. Rietveld, M.M. Rietbergen, P. Lambin, H.J. Aerts, Radiomic machine-learning classifiers for prognostic biomarkers of head and neck cancer, Front. Oncol. 5 (2015) 272.
[41] W. Wu, C. Parmar, P. Grossmann, J. Quackenbush, P. Lambin, J. Bussink, R. Mak, H.J. Aerts, Exploratory study to identify radiomics classifiers for lung cancer histology, Front. Oncol. 6 (2016) 71.
[42] X. Meng, W. Xia, P. Xie, R. Zhang, W. Li, M. Wang, F. Xiong, Y. Liu, X. Fan, Y. Xie, X. Wan, K. Zhu, H. Shan, L. Wang, X. Gao, Preoperative radiomic signature based on multiparametric magnetic resonance imaging for noninvasive evaluation of biological characteristics in rectal cancer, Eur. Radiol. (2018).
[43] G. Xiao, W.-C. Rong, Y.-C. Hu, Z.-Q. Shi, Y. Yang, J.-L. Ren, G.-B. Cui, MRI radiomics analysis for predicting the pathologic classification and TNM staging of thymic epithelial tumors: a pilot study, Am. J. Roentgenol. (2019) 1–13.
