Journal Pre-proof Fully-automated deep learning-powered system for DCE-MRI analysis of brain tumors Jakub Nalepa, Pablo Ribalta Lorenzo, Michal Marcinkiewicz, Barbara Bobek-Billewicz, Pawel Wawrzyniak, Maksym Walczak, Michal Kawulok, Wojciech Dudzik, Krzysztof Kotowski, Izabela Burda, Bartosz Machura, Grzegorz Mrukwa, Pawel Ulrych, Michael P. Hayball
PII:
S0933-3657(18)30663-8
DOI:
https://doi.org/10.1016/j.artmed.2019.101769
Reference:
ARTMED 101769
To appear in:
Artificial Intelligence In Medicine
Received Date:
5 November 2018
Revised Date:
28 October 2019
Accepted Date:
20 November 2019
Please cite this article as: Jakub Nalepa, Pablo Ribalta Lorenzo, Michal Marcinkiewicz, Barbara Bobek-Billewicz, Pawel Wawrzyniak, Maksym Walczak, Michal Kawulok, Wojciech Dudzik, Krzysztof Kotowski, Izabela Burda, Bartosz Machura, Grzegorz Mrukwa, Pawel Ulrych, Michael P. Hayball, Fully-automated deep learning-powered system for DCE-MRI analysis of brain tumors, (2019), doi: https://doi.org/10.1016/j.artmed.2019.101769
This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain. © 2019 Published by Elsevier.
Fully-automated deep learning-powered system for DCE-MRI analysis of brain tumors Jakub Nalepaa,b,∗ , Pablo Ribalta Lorenzob , Michal Marcinkiewicza , Barbara Bobek-Billewiczc , Pawel Wawrzyniakc , Maksym Walczaka , Michal Kawuloka,b , Wojciech Dudzika , Krzysztof Kotowskia , Izabela Burdaa , Bartosz Machuraa , Grzegorz Mrukwaa , Pawel Ulrychc , Michael P. Hayballd
of
a Future Processing Bojkowska 37A, 44-100 Gliwice, Poland b Institute
of Informatics, Silesian University of Technology Akademicka 16, 44-100 Gliwice, Poland, Tel./Fax.: +48 32 237 21 51
Sklodowska-Curie Memorial Cancer Center and Institute of Oncology Wybrzeze Armii Krajowej 15, 44-102 Gliwice, Poland
ro
c Maria
re
-p
d Feedback Medical Ltd. Broadway, Bourn, Cambridge CB23 2TA, UK
Abstract
lP
Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) plays an important role in diagnosis and grading of brain tumors. Although manual DCE biomarker extraction algorithms boost the diagnostic yield of DCE-MRI by providing quantitative information on tumor prognosis and prediction, they
ur na
are time-consuming and prone to human errors. In this paper, we propose a fully-automated, end-to-end system for DCE-MRI analysis of brain tumors. Our deep learning-powered technique does not require any user interaction, it yields reproducible results, and it is rigorously validated against benchmark and ∗ Corresponding
Jo
author Email addresses:
[email protected] (Jakub Nalepaa,b,∗ ),
[email protected] (Pablo Ribalta Lorenzob ),
[email protected] (Michal Marcinkiewicza ),
[email protected] (Barbara Bobek-Billewiczc ),
[email protected] (Pawel Wawrzyniakc ),
[email protected] (Maksym Walczaka ),
[email protected] (Michal Kawuloka,b ),
[email protected] (Wojciech Dudzika ),
[email protected] (Krzysztof Kotowskia ),
[email protected] (Izabela Burdaa ),
[email protected] (Bartosz Machuraa ),
[email protected] (Grzegorz Mrukwaa ),
[email protected] (Pawel Ulrychc ),
[email protected] (Michael P. Hayballd )
Preprint submitted to Artificial Intelligence in Medicine
November 20, 2019
clinical data. Also, we introduce a cubic model of the vascular input function used for pharmacokinetic modeling which significantly decreases the fitting error when compared with the state of the art, alongside a real-time algorithm for determination of the vascular input region. An extensive experimental study, backed up with statistical tests, showed that our system delivers state-of-the-art results while requiring less than 3 minutes to process an entire input DCE-MRI
of
study using a single GPU. Keywords:
ro
Deep neural network, pharmacokinetic model, tumor segmentation, DCE-MRI, perfusion, brain
-p
1. Introduction
Dynamic contrast-enhanced imaging using magnetic resonance (DCE-MRI)
re
has become a widely utilized clinical tool for non-invasive assessment of the vascular support of various tumors [66]. DCE analysis is performed on a time-series
lP
of images acquired following injection of contrast material (tracer) and investigating temporal changes of attenuation in vessels and tissues. Biomarkers extracted from such imaging have been shown to be correlated with physio-
ur na
logical and molecular processes which can be observed in tumor angiogenesis, morphologically characterized by an increased number of micro-vessels which are extremely difficult to image directly [66]. Therefore, DCE biomarkers can be used to assess tumor characteristics and stage, and provide an independent indicator of prognosis, enabling risk stratification for patients with cancer [18].
Jo
1.1. Contribution
Although DCE biomarkers have been validated against many reference meth-
ods and used for assessment of a wide range of tumors, including gliomas, glioblastomas, carcinoids, rectal, renal, and lung tumors, and many others [18], the process of their extraction is fairly time-consuming and prone to user errors as it requires manual segmentation of the MR data. The manual segmentation
2
adversely impacts reproducibility, which is an important issue in clinical applications, particularly for longitudinal studies. In this work, we address these problems and propose a fully-automated approach to assess brain tumor perfusion from DCE-MRI without any user intervention which fits well into clinical practice and can help clinicians decide on an optimal treatment pathway much
not been explored in the literature so far.
Quantified tumor characteristics
ro
Pharmacokinetic modeling
Determination of vascular input region
-p
Input DCE-MRI brain study
Brain tumor segmentation
of
faster. To the best of our knowledge, such hands-free DCE analysis engines have
Figure 1: Our deep learning-powered approach for assessing brain tumor perfusion does not
re
require user intervention—all steps presented in the flowchart are automatic.
The contribution of this work is multi-fold. We propose an extensively val-
lP
idated (Section 4) system, referred to as Sens.AI DCE, which requires no user intervention for assessing brain tumor perfusion from DCE-MRI data (see its high-level flowchart in Figure 1). Although there were some attempts to au-
ur na
tomate perfusion analysis in the literature [12], our system is first-to-date approach for DCE-MRI brain imaging. To make the analysis fully automatic, we introduce:
- A deep neural network (DNN) for segmenting brain tumors from the T2 Fluid Attenuated Inversion Recovery (T2-FLAIR) sequences (Section 2.1).
Jo
- A real-time image-processing algorithm for determination of the vascular input region from T1 Volumetric Interpolated Breath-hold Examination (T1 VIBE) sequences (Section 2.2).
- A new cubic model of the vascular input function used for pharmacokinetic modeling whose aim is to minimize the contrast-concentration fitting error (Section 2.3). 3
- An end-to-end DCE processing pipeline which can be seamlessly integrated into clinical practice. The process of extraction of DCE biomarkers from brain MRI requires segmenting brain tumors, determining the vascular input region, for which clustering and intensive 3D processing of brain scans is commonly used [86]. We
of
address the most important shortcoming of such approaches, being the computational complexity [61], by exploiting very fast image-processing routines coupled
with a 3D analysis. To the best of our knowledge, there are no fully-automated
ro
approaches for the DCE analysis proposed in the literature so far, and the existing techniques suffer from lack of reproducibility and require user interaction.
-p
Hence, post-processing of DCE-MRI studies cannot be seamlessly executed just after acquiring the scans. In the next section, we summarize the state of the
re
art in brain tumor segmentation, which is a critical DCE-MRI processing step. 1.2. Related work on the brain tumor segmentation
lP
The approaches for brain tumor segmentation can be divided into four main categories [36, 82]—atlas-, image analysis- and machine learning-based techniques, and these which are hybrid and couple various algorithms belonging to other groups (Figure 2). In the following sections, we discuss such approaches
ur na
in detail, highlighting their pros and cons. Finally, brain tumor segmentation algorithms can be classified according to another taxonomy into manual, semiautomated (requiring human intervention) and fully-automated techniques. In
Sens.AI DCE, we strive to deploy a fully-automated and reproducible segmentation which can make the analysis of brain MRI sequences faster and not
Jo
dependent upon a user, hence not prone to human errors. 1.2.1. Atlas-based algorithms In the atlas-based algorithms, manually segmented images, referred to as
atlases, are exploited to segment unseen scans [63]. These atlases model the anatomical variability of the brain tissue [60]. Such atlas images are extrapolated to new frames, commonly in a two-step process, where the first step 4
Automated segmentation of brain tumors from MRI
Atlas-based
Image analysis
Machine learning
Hybrid
[5, 10, 14, 60, 63]
[65, 73, 87]
Unsupervised
Deep learning
[3, 16, 23, 35, 51, 67, 70, 78, 80]
[26, 27, 31, 38, 41, 46, 54, 88] [19, 24, 33, 49, 50, 55]
Region-based
Supervised
[32]
[11, 20, 75, 84]
[25, 39, 43, 44, 62, 73, 74, 85, 89]
of
Thresholding
ro
Figure 2: Automated delineation of brain tumors from MRI—a taxonomy [82]. We underlined Deep learning, as Sens.AI DCE exploits a deep learning-powered approach for this task.
-p
involves a global registration, to initially align image data at low computational cost, and a local registration for adapting general atlas models to specific
re
anatomy performed in the second step [14]. Within the atlas-based algorithms, we can distinguish different segmentation strategies, with the single- and multiatlas label propagation and the probabilistic atlas-based segmentation being the
lP
most popular [14]. In the former approaches, the atlas labels—from one or more atlases—propagate in the image space, hence this strategy directly relies on the registration process that estimates the anatomical differences between the at-
ur na
las and the input image volumes. The probability atlas-based segmentation integrates the voxel probabilities as part of a statistical Bayesian framework. A shortcoming of atlas-based techniques is the necessity of creating large and representative annotated sets. It is a time-consuming procedure and may lead to atlases which cannot be applied to tumors with different characteristics [5].
Jo
Also, atlas-based algorithms are dependent on the registration process [10]. 1.2.2. Image analysis-based algorithms In the image analysis-based algorithms, we distinguish the thresholding- and
region-based techniques. Thresholding approaches operate on the premise that the pixels with the intensity values within a given range should be assigned the same class label [32]. Although such thresholding-based techniques, both
5
utilizing a single global and multiple local thresholds, are easy to implement and to interpret, they do not benefit from all important information available in an input MRI scan. Additionally, selecting an appropriate threshold value can easily become cumbersome if the difference between the intensity levels of tumorous and non-tumorous regions is low, e.g., in heterogeneous tumors [82]. To further benefit from the neighborhood pixel information, the region-based
of
algorithms divide an input image into regions of the pixels satisfying some spe-
cific similarity conditions [20, 84]. Such techniques often start from the seed
ro
points, determined either manually or automatically, and grow the regions until all pixels/voxels are attached to the appropriate regions. Therefore, region-
based approaches are sensitive to the selection of seed points, and require deter-
-p
mining the similarity conditions. Also, they may not perform well in the case of noisy, blurry, or heavily textured images [82]. A notable group of approaches
re
which benefit from the computer-vision and image-analysis techniques includes active contour models, also referred to as deformable models. These methods are useful for semi-supervised segmentation, where a user incorporates addi-
lP
tional a priori knowledge. Active contours are designed to track the boundary of a segmented object by matching a deformable model to the image curve, according to the energy functional which controls the elasticity and rigidity of
ur na
the curve [11]. It is known that snakes (parametric active contour models) do not perform well in presence of concavities, sharp corners, and smooth boundaries, but there exist their variations specifically designed to deal with these factors [75]. They are, however, heavily parameterized and require a practitioner to fine-tune their pivotal hyper-parameters. In geometric models, the contour is independent from the curves parametrization, and the region-based
Jo
active contour models are more robust against noise [82]. Finally, elastic image deformations may lead to tumor shapes that are not anatomically plausible [58]. 1.2.3. Machine learning-powered algorithms The machine learning-powered (ML-powered) approaches are split into conventional ML (CML) which requires performing feature engineering, and deep
6
learning (DL). Unsupervised algorithms reveal hidden structures within the input data [23, 51], therefore they do not rely on manually-delineated training sets [16, 51, 78]. Clustering-based approaches partition an input volume into consistent groups of pixels manifesting similar characteristics [35, 67, 70, 80]. In hard clustering, each pixel belongs to a single cluster, and the clusters do not overlap. Soft clustering methods assign a probability of a pixel being in a given
of
cluster, and such clusters can overlap [82]. The clustering methods, albeit being
pixel-wise, can also benefit from additional spatial information, e.g., by incor-
ro
porating modified objective functions [3]. To this end, selecting an appropriate
objective function is a pivotal step in such techniques. Also, the effectiveness of clustering-based algorithms strongly depends on their initialization.
-p
Supervised techniques utilize manually segmented image sets to train a model, hence perform better than unsupervised methods in terms of accuracy1 . These
re
approaches exploit a great variety of well-established engines, including decision forests [25], conditional random fields [85], decision forests [89], extremely randomized forests [62], support vector machines [43], k-nearest neighbors [39, 74],
lP
various types of artificial neural networks [1], and many other techniques, also including ensemble learning [44]. The major drawback of both unsupervised and supervised CML techniques is the necessity of extracting hand-crafted fea-
ur na
tures. In the pixel-wise classification, such features, e.g., intensity-based, textural, shape, first- or second-order statistical features, and many more, potentially extracted in 3D [73], are obtained for each pixel separately. Unfortunately, this process can lead to obtaining features which do not benefit from the full information contained in an input image, as some of them may be easily omitted in the extraction process, or may be even unknown for humans. Additionally, if a
Jo
large number of features are determined, we may face the curse of dimensionality problem [69], hence feature engineering commonly includes feature selection. 1 In
supervised approaches, a practitioner which trains a model acts as a teacher by pro-
viding examples with known class labels. This is in contrast to unsupervised methods, where such labels are unknown, and there is no teacher.
7
Various deep networks have been applied to segment different kinds of medical images [45, 76], including brain scans [26, 27, 31, 41, 46, 54, 88]. DNNs have been shown extremely successful in the Multimodal Brain Tumor Segmentation Challenge (BraTS)—the winning BraTS’17 algorithm utilized an ensemble of such learners [38]. Unfortunately, the authors reported neither training nor inference times of their method which may easily explode for sufficiently deep
of
architectures, and make it inapplicable in a hospital setting. Although the last year’s edition (2018) of the BraTS competition was won by a DNN which ben-
ro
efits from the encoder-decoder architecture [8, 55], the other top-performing
methods included U-Net-based networks [33], and the U-Net-inspired ensembles [50]. Isensee et al. [33] showed that U-Nets are extremely powerful and
-p
outperform more sophisticated and deeper networks.
The U-Net-based architectures have been widely researched in this context,
re
and constituted the mainstream in the BraTS challenge for multi-class tumor segmentation from multiple MRI modalities. They included U-Nets with domain adaptation [19], cascaded U-Nets [49] for separate tumor detection and multi-
lP
class segmentation, and multiple-pathway U-Nets [24], where each modality is analyzed in a separate pathway, and they are finally fused to get the final segmentation. Such multi-modal approaches can fully benefit from all available
ur na
modalities, but they require having them co-registered which is a challenging image-analysis task, and inaccurate registration deteriorates the performance of further processing steps [28]. Although multi-modal processing is pivotal for segmenting brain lesions into subregions, i.e., edema, non-enhancing solid core, necrotic core, and enhancing core, the whole area of brain tumors, without subdividing it into subregions, can be delineated from the T2-FLAIR MRI [81].
Jo
This approach, although giving slightly lower segmentation scores, is much easier to implement, train, and deploy, and infers in shorter time, especially due to the lack of the sequence co-registration step required in multi-modal techniques.
8
1.2.4. Hybrid algorithms Hybrid approaches combine methods from other categories [73]. They are usually tailored to detect a specific lesion type. Superpixel processing was coupled with support vector machines and extremely randomized trees in [73]. Although the results appeared promising, the authors did not report the computation time nor cost of their method, and did not provide any insights into the
of
classifier parameters. Tuning such kernel algorithms is very difficult and expensive [57], and improperly selected parameters can easily deteriorate the classifier
ro
abilities [56]. DNNs were coupled with conditional random fields in [87]—the
authors showed that combining DL-powered approaches with CML is becoming
-p
the current research trend, and should be exploited in the nearest future. 1.2.5. Summary
re
Medical image analysis is a mature research field, with automated brain tumor delineation from MRI not being an exception here. The current research trend has been directed by the revolutionizing impact of DL on computer vision
lP
and image analysis which we have witnessed in the recent years. Although all of the discussed methods have their own advantages and disadvantages, as gathered in Tables 1–2, DL-powered brain tumor segmentation techniques constitute the
ur na
current mainstream. Therefore, we followed it in Sens.AI DCE to make our segmentation engine deliver the state-of-the-art brain tumor delineation quality, as thoroughly analyzed in our rigorous multi-fold cross-validation experimental setting over the well-known benchmark data. Also, we ensure reproducibility of the method and its fast operation, while making it independent from a human
Jo
operator, as Sens.AI DCE exploits fully-automated segmentation. 1.3. Structure of the paper In Section 2, we present our fully-automated deep learning-based DCE-MRI
analysis technique. The datasets used for validating the most important steps of the processing pipeline are discussed in Section 3. In Section 4, we discuss and
9
Table 1: The pros and cons of the state-of-the-art atlas- and image analysis-based brain tumor segmentation algorithms. Atlas-based algorithms Advantages
Disadvantages
• Multi-atlas approaches can minimize the impact
• Require creating a representative set in a time-
of outliers present in separate atlases
consuming and user-dependent process
• High-quality atlases may be built for a specific • Difficult to appropriately weight specific atlases in the multi-atlas settings
• Dependent on the quality of registration
• Intuitive and interpretable
• Deformations may not be anatomically plausible
of
scenario, also in an iterative way • Multi-atlas segmentation is easily parallelizable
ro
• Complex non-rigid registration makes the segmentation process time-consuming Image analysis-based algorithms
-p
Thresholding algorithms Advantages
Disadvantages
• Easy to implement, understand, and interpret
• Require determining the threshold(s) value(s) • Sensitive to noise
re
• Offer real-time operation due to simplicity
• Very often fail for heterogeneous tumors due to intensity variations
Advantages • Easy to interpret
lP
Region-based algorithms
Disadvantages
• Sensitive to initialization (seed point selection)
• Applicable to tumors of any shape
• Region growing is sensitive to noise • Often fail at tumor edges with pixel inten-
against noise
sities similar to other tissue; can lead to over-
ur na
• Region-based active contours can be robust
segmentation • Often fail for heterogeneous (“textured”) tumors • Active contours do not deal well with sharp and very smooth tumor edges • Active contours may lead to not anatomically plausible tumor shapes • More appropriate for semi-automated segmen-
Jo
tation
analyze the experimental results. Section 5 concludes the paper and presents the research directions which emerge from our work.
10
Table 2: The pros and cons of the state-of-the-art machine learning and hybrid brain tumor segmentation algorithms. Machine learning algorithms Unsupervised techniques Disadvantages
• Do not require having annotated training sets
• Require feature engineering
• Easy to implement and interpret
• Require tuning of pivotal hyper-parameters
of
Advantages
• May be sensitive to noise (e.g., fuzzy k-means) • May strongly depend on the initialization
ro
Supervised techniques Disadvantages
• Can be easy to interpret (k-nearest neighbors)
• Require having annotated training sets
• Training can be fast
• Require tuning of pivotal hyper-parameters
• Often robust against outliers (e.g., ensembles)
• Require feature engineering
-p
Advantages
• Exploit well-established statistical models
• Training and segmentation can be slow
• Can offer real-time segmentation
• Can be challenging to interpret (artificial neural
re
networks, kernel methods, non-linear techniques)
Deep learning-powered techniques Advantages
Disadvantages
• Often require very large training sets
lP
• Deliver state-of-the-art segmentation
• Perform automatic representation learning
• Require tuning of pivotal hyper-parameters
• Can reveal features not known by humans
• Large-capacity networks are prone to overfitting
• May offer real-time operation
• Deep networks are challenging to interpret
ur na
Hybrid algorithms
Advantages
Disadvantages
• Can fully benefit from hybridized approaches
• Can inherit disadvantages of hybridized approaches • Segmentation is often slow due to hybridization
Jo
2. Method
In this section, we describe the core components of Sens.AI DCE: deep
learning-powered brain tumor segmentation, determination of the vascular input region, and pharmacokinetic modeling with the proposed VIF cubic model.
11
2.1. Brain tumor segmentation To segment brain tumors from MRI, we exploit an ensemble of U-Net-based deep networks (Section 2.1.2) trained in a multi-fold setting. As mentioned in Section 1.2.3, brain tumors can be accurately delineated from T2-FLAIR without co-registering other modalities. In this work, we exploit this observation, since we quantify the perfusion characteristics within the entire tumorous tissue.
of
The T2-FLAIR image data undergoes pre-processing (Section 2.1.1) before it is
ro
fed to the deep model for segmentation. 2.1.1. Data pre-processing
In the Sens.AI DCE pre-processing step, an input axially-oriented T2-FLAIR
-p
scan is resampled to the 240 × 240 size with the preserved aspect ratio, and with the voxel size of 1 mm3 . Then, it is skull-stripped to remove all non-brain voxels
re
from the input scan. Although there exist various skull stripping methods, commonly divided into morphology-, intensity-, deformable surface- and atlasbased techniques [4, 15, 37], they often operate on non-enhanced T1-weighted
lP
images and struggle when applied over different modalities. To deal with this issue and to be able to accurately skull strip T2-FLAIR sequences, we utilize exactly the same U-Net-based DNN architecture for this task (as in the case of
ur na
the brain tumor segmentation; see Section 2.1.2). Such deep models were shown extremely successful in extracting brain tissue across different modalities [40]. An input T2-FLAIR sequence is z-score normalized: z=
I −M , σ
(1)
where I denotes the input voxel, M is the median voxel extracted for this
Jo
volume, and σ is the standard deviation. To decrease the influence of outlying voxels during the normalization process, we center the data around median instead of the mean voxel value (this approach was inspired by [30]). The same z-score normalization technique is applied in the case of both skull stripping and brain tumor segmentation steps. However, in order to prune the background and noisy voxels, often visible next to the skull, and to not let them affect the 12
z-score normalization process for skull stripping, we discard these voxels whose intensities are lower than the experimentally-tuned threshold. This threshold is set to the first percentile of the intensity values within the volume, excluding the voxels that are lower or equal to 1, as they are clearly background. Such thresholding does not have to be performed for the tumor segmentation step, as
48 48
48 48
302
302
2402
1202
-p
48 48 96 96
602
96 96 96
1202
48 48 48
48 48 48
48 48 48
2402
ro
Expanding path
Contractive path
of
the input volume is already skull-stripped and contains only the brain voxels.
602
Max-pooling (2×2)
re
Convolution (3×3), ReLU Concatenate and upsample (2×2)
Convolution (1×1), sigmoid
lP
Figure 3: Our U-Net-based DNN with its blocks and connections. For each layer, we additionally present the number of kernels.
ur na
2.1.2. Deep neural network architecture
Our DNN designed for brain tumor segmentation and skull stripping, il-
lustrated in Figure 3, consists of a series of blocks placed symmetrically as a contractive and expanding path, yielding a U-shape [66]. Each block in the contractive path contains three 3 × 3 convolution layers with zero padding and 48
kernels with stride 1 × 1, and one convolution layer with 96 kernels with stride
Jo
1 × 1 in the deepest part of this architecture. They are followed by a rectified linear unit (ReLU) activation, and a max-pooling layer with 2 × 2 kernels (2 × 2 stride) to perform downsampling. In the expanding path, the output of each block receives a skip connection from the depth-matched block from the contractive path, and concatenates it to its own output, which is upsampled and passed to a higher block. Finally, a 1 × 1 convolutional layer with sigmoid 13
activation reduces the activation depth to one. In this manner, high-level features, extracted in the contractive path, propagate through higher resolution layers of the expanding path. In Figure 3, we can observe how the size of the feature map is affected by each operation, achieving a compression of the feature space from 240 × 240 to 30 × 30. Residual connections between corresponding blocks in the contractive and expanding paths
of
are also displayed. Our DNN performs multi-scale analysis—features from the
contractive path are combined with the upsampled output. As shown in our
ro
recent work [47], such skip connections allow the high-level features extracted in the contractive path to propagate through the expanding layers, hence to
provide the “local context” to the “global information” while upsampling. It is
-p
in contrast to other U-Nets, in which features from the contractive path are concatenated in the expanding path before convolutions, thus undergo additional
re
feature extraction. The contractive and expanding paths are symmetric, and the number of trainable parameters of our network is close to 6.5 · 105 . There are three main differences between the proposed DNN and the state-
lP
of-the-art U-Nets. First, the number of kernels is constant at the majority of steps in our processing pipeline in our model, while it is doubled in each deeper block in the original U-Net. Here, we double the number of kernels only in
ur na
the deepest part of the network, in order to exploit the contextual tumor information as best as possible. Keeping a constant number of filters reduces the number of parameters of the network, effectively lowering its computational requirements and processing time. Second, we preserve the shape of each feature map, which allows us to seamlessly take advantage of the bridged connections by simply concatenating activation maps at each depth, and to make the overall
Jo
implementation much easier (cropping is employed in [66]). Finally, as already mentioned, the contractive-path features are combined with the upsampled output in order to preserve their original characteristics.
14
c)
d)
of
b)
lP
re
-p
ro
a)
Figure 4: Determination of the vascular input region: a) a slice in the axial plane from an input volume, b) intensity thresholding reveals candidate regions, from which: blue regions are rejected because they occupy an upper section of the image (above the dotted line), red
ur na
ones are rejected due to the shape irregularities, green regions are retained, c) visualization of the volume with remained voxels grouped as connected components, d) visualization of the volume after retaining only the largest 3D connected component.
2.2. Determination of vascular input region The input of our algorithm is a time series of co-registered T1 VIBE scans [22].
Jo
First, an arithmetic average µI is calculated for high-intensity voxels for each volume. Voxels are considered to be high-intensity, if their intensity is above a given threshold (we manually tuned this parameter to 0.75 of the maximum
intensity in a volume). A volume with the maximum µI is selected from the time series as Vs . Next, Vs is cropped (Figure 4a), and a binary mask with the high-intensity voxels is created for Vs , yielding the volume VT . For each slice 15
from VT , we perform blob detection. To narrow down the search space for the vascular input region, only the components in the lower section of the slice are retained (Figure 4b), which corresponds to the G, H, and I sections of the brain in Talairach coordinates. We introduce a simple shape metric: M (S) = Ac /Ab , where Ac is the area of a shape S, and Ab is the area of its bounding box. For a square shape, the metric becomes 1, whereas for a circle it is π/4 (for elongated
of
and curvilinear shapes, the value of the metric will be lower). All connected
components Si for which M (Si ) < π/4 are rejected (Figure 4c). Finally, the
ro
binary volume VT undergoes the 3D connected-components labeling, and the
component with the largest volume is considered to contain only the voxels of the vascular region of interest in Vs (Figure 4d). We propagate the binary labels
-p
from VT to all volumes (they are co-registered), and use them to measure the contrast concentration. Importantly, our algorithm is deterministic and delivers
2.3. Pharmacokinetic modeling
re
reproducible vascular input region determination.
lP
Modern MR scanners provide images suitable not only for qualitative assessment by a reader to reveal the structural information about the patient, but they are also fast enough to acquire volumetric brain images in relatively
ur na
short time intervals for the contrast concentration analysis. High spatial and temporal resolutions give a possibility to quantify the concentration of a contrast agent (CA) in tissues, and to assess its distribution in time in terms of a pharmacokinetic model.
In order to apply any pharmacokinetic model to a series of MR images, we
use a mapping between the pixel intensity in an image, and the contrast con-
Jo
centration in the corresponding volume [17]. This procedure exploits the CA’s magnetic relaxivity, which is specific for the used agent, the value of patient’s haematocrit, and the pre-contrast T10 relaxation times of the scanned tissue. The patient’s haematocrit value, if not provided, is assumed to be 0.45 [17], whereas the pre-contrast T10 relaxation times are derived from scans at different flip angles of the magnetic field—at least two sequences acquired at two 16
different angles are required. For more details, see [17]. Once the mapping is established, we can obtain quantitative information about the CA’s concentration in any MRI series within the analyzed study, including these taken at different moments in time, revealing the kinetics of the CA. We interpret the spatial and temporal information in terms of the Tofts model [79], which belongs to a group of the compartments models widely
of
used in the DCE analysis. Two compartments of the model represent blood plasma and abnormal extravascular extracellular space (EES). The model allows
ro
us to describe the CA kinetics via three tissue parameters, two of which are
independent: 1) the influx volume transfer constant Ktrans , or the permeability surface area product per unit volume of tissue between plasma and EES; 2) the
-p
volume of EES per unit volume of tissue ve (0 ≤ ve ≤ 1); and 3) the efflux rate constant kep = Ktrans /ve . Those parameters are commonly used as biomarkers
re
in quantification of a state of a tumor [2]. The model consists of a plasma volume (vp ), which is connected to a large EES, and lesion leakage space (LLS). The LLS is assumed to be small enough to not change the total CA concentration,
lP
and is connected to the plasma through a leaky membrane. The whole system is assumed to be interconnected, and the CA is well-mixed with plasma. The CA flows to EES and LLS, and it is constantly being depleted by kidneys [79].
ur na
In each moment in time, the CA concentration in LLS (Ct (t)) is in a dynamic equilibrium, and can be derived from:
Ct (t) = Ktrans · Cp (t) ∗ exp(−kep t) ,
(2)
where the ∗ symbol between Cp (t) and the exponential decay denotes a convo-
Jo
lution operation.
The CA’s concentration Ct (t) in LLS can be calculated in two ways: 1) by a
numerical convolution, which is computationally expensive; 2) analytically, by exploiting a Cp (t) model and finding an analytical solution to the convolution operation (Eq. 2). We take the latter approach, as it may smoothen out the noise of real-life data [59], and is much faster, hence can be deployed in medical
17
applications. 2.3.1. Bi-exponential model of a vascular input function (VIF) Inaccurate modeling of the VIF propagates through to the estimated tissue parameters [64], hence an accurate model is required to obtain medical-grade
Cp (t) = A · exp(−αt) + B · exp(−βt),
of
performance. Tofts and Kermode proposed a bi-exponential model of a VIF [79]: (3)
ro
which has been widely adopted due to its simplicity. However, it assumes that Cp (t = 0) = max(Cp ), which is unrealistic, but was applicable when the scanners
had slow sampling rates. Nowadays, it is not the case, as the scanners produce
-p
images with temporal resolution which is high enough to track the initial increase
2.3.2. Linear model of a VIF
re
of Cp as the contrast begins to arrive.
Orton et al. proposed a more realistic model, containing a linear term in t,
lP
which was called the “Model 2” (bi-exponential model was called “Model 1”) in [59], but will be referred to as a linear model in this work: (4)
ur na
Cp (t) = A · t exp(−αt) + B · exp(−βt) − exp(−αt) .
The linear model has a desired property of Cp (t = 0) = 0, which allows for modeling more realistic VIF functions, while maintaining low number of parameters (A, B, α, and β). Our experiments on a simulated benchmark dataset created by the Barboriak’s lab at the Duke University Medical Center in the framework of the Quantitative Imaging Biomarker Alliance (QIBA) project (Section 3.1)
Jo
revealed that the linear model outperforms the bi-exponential one, yet it still does not fit the data perfectly. Although clinically-adopted software for pharmacokinetic modeling is lim-
ited (Tissue4D by Siemens is widely used), there exist other implementations which are being validated against benchmark data. Smith et al. [72] developed
18
the DCE-MRI.jl software suite, which was shown to be overcoming the drawbacks of other DCE-MRI analysis packages, where the lack of portability, huge computational burden, and complexity were their most important shortcomings. 2.3.3. Cubic model of a VIF In this work, we propose a generalization of the linear model, with the aim
of
of minimizing the fitting error, by substituting t → tn mentioned in [59], and putting n = 3. We have also investigated a model with a t2 term, however, since
ro
f (t) = t2 is an even function and f (t < 0) > 0, it could easily converge to local
minima if t0 = 0 was determined inaccurately—the function exhibited a visible curve in the vicinity of the true t0 . Moreover, the higher-order models may
affects their fitting speed and accuracy.
-p
suffer from overfitting, as they become “over-complicated”, which additionally
Here, we call the model with the t3 term the cubic model. To the best
re
of our knowledge, such model has not been used before. It also has only four
lP
parameters (A, B, α, and β), since only two terms from the polynomial are used:
Cp (t) = A · t3 exp(−αt) + B · exp(−βt) − exp(−αt) .
(5)
The model is not as flexible as functions containing all terms of the high-order
ur na
polynomials usually are, and therefore it is less prone to overfitting. It preserves the property of Cp (t = 0) = 0, while its form allows for finding an analytical solution to Eq. 2. Ct (t) is parametrized by six parameters: A, B, α, β originating
Jo
from a VIF, and Ktrans and kep (or ve = Ktrans /kep ):
Ct = Ktrans A · ∆−4 exp(−αt) · Ct1 + B · Ct2 , Ct1 = − (t∆)3 − 3(t∆)2 − 6t∆ − 6 exp(∆) + 6, exp(−βt) − exp(−kep t) kep − β exp(−αt) − exp(−kep t) − , kep − α
Ct2 =
where ∆ = α − kep . 19
(6)
of
a)
lP
re
-p
ro
b)
ur na
Figure 5: Fits of the linear model and our cubic model of the VIF to the QIBA phantom data: contrast agent’s concentration in a) vascular input region, and in b) tissue. The data represents tissue characterized by Ktrans = 0.10 min−1 and ve = 0.20. The mean square error
(MSE) of the fit is written in bold.
The comparison of fits of the linear and cubic models to the QIBA data
Jo
is presented in Figure 5. The function of the contrast concentration in time in the vascular input region and tissue with the fitted curves is presented on panels a) and b), respectively. The cubic model (green curve) has around an order of magnitude lower mean square error (MSE) than the fit of the linear model (orange curve), for both plasma and tissue. This empirical evidence shows that our proposed model of a VIF has a potential to yield higher-quality
20
fits required to obtain tissue parameters with high precision. The evaluation of the cubic model on the QIBA dataset is presented in detail in Section 4.5.
3. Data 3.1. QIBA phantom dataset
of
The QIBA synthetic DCE-MRI phantom sets that can be used for validating DCE-MRI analysis approaches2 . Several phantoms are available to validate
tration values, and tissue parameters fitting.
ro
both the procedure of mapping MR pixel intensities to the contrast concenIn this work, we exploit the
newest release of the QIBA set (version 14) simulating images obtained for
-p
patients who have low cardiac output, hence the patients would be expected to have lower and “broader” input functions. This set consists of 661 Digital Imaging and Communications in Medicine (DICOM) files, each file simu-
re
lating data of one timestamp. The data in each DICOM is divided into two regions—tissue and vascular. The tissue part is further divided into 30 (5×6)
lP
non-overlapping patches of size 10 × 10 pixels, marking regions of tissue characterized by different values of Ktrans ∈ {0.01, 0.02, 0.05, 0.10, 0.20, 0.35} min−1 , and ve ∈ {0.01, 0.05, 0.10, 0.20, 0.50}. The vascular part is placed at the bottom
ur na
of an image, and occupies the area of 10 × 50 pixels (Figure 15a). Examples of CA concentrations as a function of time in plasma and tissue for one patch (Ktrans = 0.10 min−1 and ve = 0.20) are shown in Figure 5a and Figure 5b. The QIBA set does not include the reference values for the vascular region
curve—the authors do not recommend using any particular model for VIF, nor any method to process the data. The only way to verify whether a given VIF
Jo
model yields desired performance is to test the complete solution and compare the resulting values of the tissue parameters with those that are provided—we follow this approach in this work. 2 https://qibawiki.rsna.org/index.php/Synthetic_DCE-MRI_Data
21
3.2. Brain tumor segmentation benchmark dataset (BraTS) The performance of our brain tumor segmentation was evaluated over the famous Brain Tumor Segmentation (BraTS) dataset3 in a multi-fold crossvalidation setting [7, 52]. The BraTS set contains MRI data of patients with diagnosed high-grade glioblastomas (HGG), and low-grade gliomas (LGG). The data comes in four co-registered modalities: native pre-contrast (T1), post-
of
contrast T1-weighted (T1c), T2-weighted (T2), and T2-FLAIR, and was ac-
quired with different clinical protocols and various scanners from multiple insti-
ro
tutions (https://www.med.upenn.edu/sbia/brats2017/people.html). The authors of the BraTS dataset co-registered all volumes rigidly to T1c, as it had
the highest spatial resolution in most cases. The volumes have been resampled
-p
to 1 mm3 isotropic resolution and skull-stripped to guarantee anonymization of the patients. All the pixels have one (out of four) label assigned—healthy
re
tissue, Gd-enhancing tumor (ET), peritumoral edema (ED), the necrotic and non-enhancing tumor core (NCR/NET). The scans are pre-operative, and have been manually delineated by medical experts [7, 52]. As we are interested only
lP
in a binary classification (healthy/tumorous tissue), the classes ET, ED, and NCR/NET are merged together into the “tumorous” class. Finally, we exploit the T2-FLAIR sequences only, as they may be effectively used for detecting
ur na
brain tumors [81].
This dataset underwent an additional verification performed by a group of
three experienced readers (one radiologist and two medical physicists with 11, 7, and 5 years of experience, respectively), who investigated the quality of each study and ground-truth (GT) segmentation. This process led us to determining patients who have been annotated as “inaccurately segmented in the ground-
Jo
truth segmentations” or as “low-quality MRI data”. The final dataset was divided into five non-overlapping folds of 47, 47, 46, 55, and 33 patients (228 patients in total, with 165 HGG and 63 LGG patients, respectively). These folds 3 In
this paper, we utilize the BraTS dataset version available at http://medicaldecathlon.
com/, as the data has a permissive copyright license (CC-BY-SA 4.0).
22
are stratified—they contain scans such that: (i) a similar ratio of patients with LGG and HGG is preserved (||LGG|| / ||HGG|| = 0.38, where ||·|| denotes the size of the corresponding subset), (ii) the scans come from different institutions, with the preserved ratio of such origins in each fold, (iii) the ratio of “small”, “medium”, and “large” tumors is preserved in each fold, where the volume of the small/large tumors falls into the first/fourth quartile of the volume distribution
of
in the corresponding set.
The motivation behind selecting the BraTS dataset to train and validate our
ro
models for brain tumor segmentation in Sens.AI DCE is multifold:
• It contains data acquired using different clinical protocols and scanners
-p
(data heterogeneity).
• It contains both LGG and HGG with different size characteristics (data
re
heterogeneity).
• It is the largest manually-annotated set of brain tumor MRI (data volume).
lP
• It contains the MRI scans that have been pre-processed and skull-stripped using well-defined pre-processing routines, hence they can be readily used for training deep models in the context of the brain tumor segmentation
ur na
task. Note that for new T2-FLAIR sequences, e.g., from other scanners or datasets, we perform the pre-processing discussed in Section 2.1.1 which is consistent with the BraTS pre-processing protocol. It allows us to employ Sens.AI DCE for segmenting any T2-FLAIR data—it is experimentally verified in Section 4.3.2.
Jo
• It has become the state-of-the-art set for comparing emerging brain tumor segmentation techniques in the framework of the competition held each year at the Medical Image Computing and Computer Assisted Intervention (MICCAI) conference [8].
• It has a permissive copyright license (CC-BY-SA 4.0), allowing to exploit it in both research and commercial projects.
23
3.3. Clinical brain tumor MRI data Sens.AI DCE was validated over the T2-FLAIR clinical data (CD) of 25 LGG and 25 HGG patients, classified according to the World Health Organization (WHO) tumor classification [13, 48]. This dataset was not manually delineated, therefore there is no ground-truth segmentation available. Also, the type of the
of
exploited MRI scanner was not revealed.
ro
4. Experimental validation 4.1. Experimental setup
We validated the pivotal Sens.AI DCE components over benchmark and
-p
clinical sets to check their robustness against different data (Sections 4.3–4.5), and presented an example of an end-to-end processing in Section 4.6.
re
Sens.AI DCE was coded in C++ and compiled with Microsoft Visual C++ 2017 (for segmentation, we used Tensorflow 1.6 over CUDA 9.0 and CuDNN 5.1). The experiments ran on an Intel i7-6850K (15 MB Cache, 3.60 GHz) CPU
lP
machine with 32 GB RAM and NVIDIA GTX Titan X GPU with 2 × 16 GB VRAM. We trained our DNNs for skull stripping and brain tumor segmentation using Nadam [21], with the initial learning rate of 10−5 , and optimizer
ur na
parameters set to β1 = 0.9, β2 = 0.999. We used the batch sizes of 32 for both skull stripping and brain tumor segmentation models, with the balanced batches containing 16 frames with and without brain for skull stripping, and 16 healthy and 16 tumorous frames for the brain segmentation networks. We report the training details in Section 4.2.
Jo
4.2. Training of the models
The training of the brain tumor segmentation DNN ran until DICE over the
validation set4 , did not increase by at least 0.001 in 7 consecutive epochs with 4 The
loss function lBT is defined as lBT = 1−DICE(A, B), as the loss function is commonly
minimized during the training process.
24
the maximum of 50 epochs. The DICE score is: DICE(A, B) =
2 · |A ∩ B| , |A| + |B|
(7)
where A and B are two brain tumor segmentations, i.e., manual and automated. DICE ranges from zero to one, where one is the perfect score. We train our models in a multi-fold fashion, dividing our BraTS dataset into five folds,
of
containing 47, 47, 46, 55, and 33 stratified patients (see Section 3.2). To make the DNNs generalize better over unseen data, the four initial folds are used to
ro
train four separate models, with three folds acting as a training set, and one as
a validation set for each DNN. Therefore, each fold was a validation set exactly once. These models are later fused into an ensemble which elaborates the final
-p
average prediction. For the brain tumor segmentation, we binarize the average ensemble predictions (threshold of 0.5), unless stated otherwise. The final fold
re
of 33 patients is exploited as a test fold, and it is never used during the training. Finally, we exploited random horizontal flipping as the training-time data augmentation for brain tumor segmentation.
lP
In Figure 6, we render the training curves obtained for each model in the ensemble. We can observe that the early stopping condition was triggered for each model, and the segmentation performance over the validation sets consistently
ur na
grew during the training. Although there was a visible drop in DICE over the validation set in the last fold (see the 15th epoch in the bottom-right curve in Figure 6), all of the final models did well for both training and validation data. For skull stripping, we used the same optimizer settings with 5 epochs with-
out improving the loss for an early stop. The loss function (lSS ) became:
Jo
lSS = −0.5 · DICE(A, B) − 0.5 · ηtp ,
(8)
where A and B are two brain tissue segmentations, i.e., manual and automated, ηtp =
TP FN+TP
is recall, and TP, FP, and FN are the numbers of true positives,
false positives, and false negatives, respectively. This loss function allowed us to penalize false negatives—skull stripping is a pre-processing step for brain tumor segmentation, hence should not remove any brain tissue. 25
0.95
0.90
0.90
0.85
0.85
0.80
0.80
DICE
0.75 0.70
0.70 Validation set Training set
40
0.60
50
0.95
0.95
0.90
0.90
0.85
0.85
0.80
0.80
0.75 0.70
10
20 30 Epoch
40
50
0.75 0.70
Validation set Training set
0
20 30 Epoch
10
40
0.65 50
0.60
0
10
re
0.65 0.60
0
of
20 30 Epoch
10
DICE
DICE
0
Validation set Training set
0.65
ro
0.65 0.60
0.75
-p
DICE
0.95
Validation set Training set
20 30 Epoch
40
50
lP
Figure 6: Training curves obtained in each fold for our deep network applied to brain tumor segmentation. Note that we present DICE instead of the loss function for easier interpretation. One training epoch took approximately 190 s.
ur na
To train and validate the skull-stripping models, we used the following sets: • The neurofeedback skull-stripped (NFBS) repository5 : 125 T1-weighted anonymized (de-faced) volumes with brain masks, with the resolution of 1 mm3 .
• The Cancer Imaging Archive (TCIA) [7]: 683 pre-operative volumes
Jo
(T1, T1c, T2, and T2-FLAIR) of 108 LGG and 135 HGG patients, resulting in 296 LGG and 387 HGG volumes, respectively. To generate brain masks, we exploited the HD-BET algorithm [34] over the
5 This
set
is
available
at:
http://preprocessed-connectomes-project.org/NFB_
skullstripped/ (last access: October 28, 2019).
26
standardized and resampled (1 mm3 ) volumes. These segmentations have been approved by a neuro-radiologist (three years of experience)—three (out of 686) volumes have been rejected due to visible under-segmentation of the brain. • The Calgary-Campinas-359 dataset (CC-359)6 : 359 multi-vendor T1-
of
weighted brain volumes with brain masks generated using the STAPLE algorithm [83]. The volumes underwent standardization and resam-
ro
pling to 1 mm3 .
• Brain clinical data: 89 T2-FLAIR volumes of LGG/HGG patients. To generate brain masks, we exploited the HD-BET algorithm [34] over
-p
the standardized and resampled (1 mm3 ) volumes.
In total, we included 1256 annotated brain volumes of different modalities
re
in our large and heterogeneous repository, which were divided into five nonoverlapping and stratified (according to the data source) folds containing 289,
lP
289, 289, 289, and 100 volumes, respectively. As previously, the four initial folds are used to train four separate models, with three folds acting as a training set, and one as a validation set for each DNN, and the final fold of 100 volumes was kept aside as the test fold. The models were combined into an ensemble which
ur na
elaborates the final average prediction (threshold of 0.5). In Figure 7, we can appreciate the training curves for each model in the skull
stripping task. All of the DNNs converged fast, and delivered high-quality brain segmentation. Also, the quality of segmentation of the training and validation sets is consistent across four folds, hence the models generalize well over the
Jo
unseen data—we obtained the following values of our combined quality metric (0.5 · DICE(A, B) + 0.5 · ηtp ): 0.982, 0.983, 0.984, and 0.983 for four validation sets. Finally, we obtained the average DICE of 0.975 with standard deviation of 0.008 over the test set, with all test volumes segmented at DICE > 0.940. 6 This
set is available at: https://sites.google.com/view/calgary-campinas-dataset/
home/download (last access: October 28, 2019).
27
a)
b) 1.0
0.95
0.98
0.90
0.96
0.85
0.94
0
10
20 30 Epoch
40
50
0.90
1.0
1.0
0.95
0.98
0.90
0.96
0.85
0.94
0.80
0.75
10
20 30 Epoch
40
50
0.90
1.0
0.95
20 30 Epoch
40
50
Validation set Training set
0
10
20 30 Epoch
40
50
0.98
0.90
0.96
0.94
0.80
lP
0.85
0.92
Validation set Training set
0
10
ur na
1.0
20 30 Epoch
40
50
0.90
0.98
0.90
0.96
0.85
0.94
0.80
20 30 Epoch
40
50
Jo
10
0
10
0.92
Validation set Training set
0
Validation set Training set
20 30 Epoch
40
50
1.0
0.95
0.75
10
re
1.0
0.75
0
0.92
Validation set Training set
0
Validation set Training set
ro
0.75
0.92
Validation set Training set
-p
0.80
of
1.0
0.90
Validation set Training set
0
10
20 30 Epoch
40
50
Figure 7: Training curves obtained in a) each fold for our deep network applied to skull stripping, and their b) zoomed counterparts. Note that we present the brain tumor segmentation quality given as 0.5·DICE(A, B)+0.5·ηtp instead of the loss function for easier interpretation.
One training epoch took approximately 1100 s.
28
4.3. Validation of brain tumor segmentation 4.3.1. Validation over the BraTS dataset In this experiment, we validated our brain tumor segmentation over the test fold of the BraTS dataset7 (Section 3.2). To evaluate the segmentation performance, we use DICE (the larger this measure becomes, the better, hence we aim to maximize it: ↑, and 0 ≤ DICE ≤ 1), precision ηprec (percentage of
of
correctly classified pixels out of all the pixels classified as lesions; ↑; 0 ≤ ηprec ≤ 1), and recall ηtp (percentage of lesion pixels correctly classified as lesions; ↑;
and ηtp =
TP FN+TP ,
TP TP+FP
ro
0 ≤ ηtp ≤ 1) elaborated by the models. They are given as ηprec =
where TP, FP, and FN are the numbers of true positives,
-p
false positives, and false negatives, respectively. We also investigate the receiver operating characteristic (ROC) curves, presenting the false positive rate ηfpr vs. recall, according to a set of cut-off points [42], where ηfpr =
FP TN+FP
(ratio
re
of the number of healthy pixels wrongly classified as tumorous to the total number of actual healthy pixels; ↓; 0 ≤ ηfpr ≤ 1), and the areas under both precision-recall (P − R) and ROC curves (↑; 0 ≤ AUC(P − R) ≤ 1 and 0 ≤
lP
AUC(ROC) ≤ 1). Finally, we analyze the Hausdorff distance (HD), alongside its 95th percentile, HD (95), which is more robust against the outliers [77], where HD is the maximum distance (in mm) of all points from the segmented lesion
ur na
to the corresponding nearest point of the ground-truth segmentation [68]. HD helps understand the behavior of our segmentation techniques close to the lesion boundaries (the lower the HD is, the better segmentation becomes; ↓). In Table 3, we gather the results obtained for the BraTS test dataset. To
better understand the 3D characteristics of the tumors, we report their vol-
Jo
umes (in mm3 ), for both ground-truth and Sens.AI DCE segmentations. These results are coupled with the cutoff analysis, presented in Figure 9, where we investigate the ROC and P−R curves. Since the prediction power of our DNN ensembles differs for various parts of the tumors, as presented in Figure 8, in 7 Note
that the BraTS dataset contains skull-stripped volumes, hence our skull stripping
was not exploited for this set.
29
Table 3: The segmentation performance of our DNN ensemble over the BraTS test set.
DICE
GT Volume
Volume
Volume
AUC
AUC
[mm3 ]
[mm3 ]
error [%]
(ROC)
(P − R)
HD
HD (95)
0.609
32051
29223
0.508
0.945
0.643
6.164
1.000
Mean
0.882
112120
105891
13.339
0.994
0.896
23.884
8.807
Max.
0.968
232098
212323
58.094
1.000
0.979
71.127
41.713
Std. dev.
0.095
59859
53784
15.794
0.014
0.095
20.217
11.769
of
Min.
Our models allowed us to obtain high-quality segmentation—a DICE of 0.882 with a 13.339% average brain tumor volume error—which is comparable with the current state of the art [8]. Although the average HD is high, indicating considerable discrepancies of the tumor segmentation near its boundaries, it is sensitive to outliers. The average HD (95), which is more robust against such outliers, is more than 2.7× smaller than HD. Finally, the shapes of the ROC and P−R curves, together with the areas under these curves, show that our ensembles perform well and stably for the majority of cutoff points. The average segmentation time per test patient was 5.6 s using a single GPU.
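The cutoff analysis behind Figure 9 and the AUC values in Table 3 can be reproduced from voxel-wise tumor probabilities with standard tooling. A minimal sketch assuming scikit-learn and flattened probability/label arrays follows; the file names are placeholders, not artifacts of our system.

```python
import numpy as np
from sklearn.metrics import roc_curve, precision_recall_curve, auc

# Flattened ground-truth labels (0/1) and DNN tumor probabilities for one patient.
y_true = np.load("gt_labels.npy").ravel()              # placeholder input
y_prob = np.load("tumor_probabilities.npy").ravel()    # placeholder input

fpr, tpr, _ = roc_curve(y_true, y_prob)                       # eta_fpr vs. recall
precision, recall, _ = precision_recall_curve(y_true, y_prob)

print("AUC(ROC) =", auc(fpr, tpr))
print("AUC(P-R) =", auc(recall, precision))
```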
4.3.2. Mean opinion score experiment: qualitative validation of brain tumor segmentation over the benchmark and clinical data
Delivering high-quality ground-truth segmentation is a very user-dependent and expensive process which may lead to biased tumor delineations. Also, such gold-standard segmentations are not available for clinical image datasets. In the mean opinion score experiment, we segmented (i) the BraTS test set, and (ii) our clinical MRI data (CD) of 25 LGG and 25 HGG patients (without ground-truth segmentations). In the latter case, we exploited exactly the same DNNs trained using the BraTS dataset—we did not perform any transfer learning or additional fine-tuning of the models. However, the CD volumes underwent skull stripping with our DNNs for this task, alongside standardization and resampling to match the characteristics of the training data (Section 2.1.1). These segmentations were assessed by a group of experienced readers (Table 4).
Figure 8: Example brain tumor prediction maps obtained for two selected frames from a T2-FLAIR sequence using Sens.AI DCE for two BraTS test patients (a and b).
We designed an ordinal scale for the segmentation quality (Table 5) which was followed by all of the readers. The experiment was performed in a blinded setting, meaning that the readers received segmentations without knowing (i) the source dataset (BraTS vs. CD), and (ii) the source of the tumor delineation (Sens.AI DCE vs. ground truth for BraTS). Each reader assigned a single score to each segmentation. Examples of such segmentation reports which underwent scoring are rendered in Figure 10, with the corresponding DICE values, as these T2-FLAIR scans come from the BraTS set which is accompanied by ground-truth delineations.
Figure 9: The a) receiver operating characteristic and b) precision-recall curves obtained using Sens.AI DCE over the BraTS test set. The shaded areas represent the standard deviations.

Table 4: The readers who participated in the mean opinion score experiment, their years of experience (YOE), and the average score obtained for the BraTS ground-truth and Sens.AI DCE segmentations, and the Sens.AI DCE segmentations of our clinical data (CD).

ID                       Institution     YOE   BraTS (GT)   BraTS (Sens.AI DCE)   CD
Reader 1                 Institution A   3     3.000        2.697                 2.300
Reader 2                 Institution A   3     3.273        2.909                 2.720
Reader 3                 Institution A   3     2.152        1.909                 1.660
Reader 4                 Institution A   4     2.848        2.758                 2.740
Reader 5                 Institution A   4     3.273        2.939                 2.620
Reader 6                 Institution A   4     3.697        3.333                 2.600
Reader 7                 Institution A   5     2.242        2.455                 1.940
Reader 8                 Institution B   5     3.333        3.152                 2.300
Reader 9                 Institution B   7     2.909        2.758                 2.040
Reader 10                Institution B   9     3.182        2.788                 2.000
Reader 11                Institution B   11    2.848        2.636                 2.080
Reader 12                Institution B   11    3.424        3.273                 2.540
Average score                                  3.015        2.801                 2.330
Weighted average score                         3.050        2.842                 2.318
In Table 4, we report the average score assigned by each reader to the investigated segmentations. The results indicate that the ground-truth BraTS segmentations would more likely be used in a clinical setting than those obtained using Sens.AI DCE.
Table 5: The scale used in the mean opinion score experiment.

Score   Description                       Outcome
1       Very low quality segmentation     No, I would not use it to support diagnosis
2       Low quality segmentation          No, I would not use it to support diagnosis
3       Acceptable segmentation           Yes, I would use it to support diagnosis
4       Very high-quality segmentation    Yes, I would use it to support diagnosis
However, both the average and the weighted average scores (the latter weighted according to the reported years of experience of the corresponding participant) were close to 3 for BraTS, and larger than 2 for the CD volumes.
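As a sanity check, the weighted average is simply an experience-weighted mean of the per-reader scores. The sketch below, using the BraTS (GT) column of Table 4, reproduces the reported values of 3.015 and 3.050; the weighting by YOE follows our reading of the description above.

```python
import numpy as np

# Per-reader average scores for BraTS (GT) and years of experience, as in Table 4.
scores = np.array([3.000, 3.273, 2.152, 2.848, 3.273, 3.697,
                   2.242, 3.333, 2.909, 3.182, 2.848, 3.424])
yoe = np.array([3, 3, 3, 4, 4, 4, 5, 5, 7, 9, 11, 11])

print(scores.mean())                    # plain average: ~3.015
print(np.average(scores, weights=yoe))  # YOE-weighted average: ~3.050
```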
Table 6 presents the numbers of segmentations annotated with the corresponding score for all configurations. The results are considerably better for BraTS, and allowed us to obtain an acceptable segmentation (score ≥ 3) in 45% and 50% of the cases for the average and weighted average mean opinion score, respectively, showing the potential clinical utility of Sens.AI DCE. The results are noticeably worse for CD, showing that the generalization abilities of our ensembles, albeit reasonable, may still be improved by fine-tuning the models over the target data.
Table 6: The number and percentage of cases with the corresponding minimal average score for all investigated datasets, in both average and weighted average settings.

Score   GT (Average)   GT (Weighted avg.)   Sens.AI DCE (Average)   Sens.AI DCE (Weighted avg.)   CD (Average)   CD (Weighted avg.)
≥1      33 (100%)      33 (100%)            33 (100%)               33 (100%)                     50 (100%)      50 (100%)
≥2      33 (100%)      32 (97%)             29 (88%)                29 (88%)                      37 (74%)       34 (68%)
≥3      20 (61%)       20 (61%)             15 (45%)                18 (55%)                      12 (24%)       8 (16%)
Since the above analysis is fairly rigorous, and the voting of a small subset of readers may significantly drag down the average scores of the 12 participants of this experiment, we additionally present the number of cases classified as acceptable (score ≥ 3) by a given number of readers (Figure 11). Assuming that at least 50% of the readers (6) should label a segmentation as “acceptable” to consider it clinically useful, we would obtain 25 (76%) and 24 (73%) such delineations for BraTS in the case of the ground-truth and Sens.AI DCE segmentations, respectively, and 28 (56%) delineations for CD. Finally, if we decided
Figure 10: Example segmentations (annotated in red) of the BraTS test patients alongside the ground-truth segmentation (in red) and the original image data. The DICE scores for these patients amounted to a) 0.936, b) 0.925, c) 0.762, and d) 0.644.
on the clinical utility based on the answer of at least one reader who annotated the segmentation as useful, we would end up having 33 (100%) and 31 (94%) “acceptable” GT and Sens.AI DCE BraTS segmentations, respectively, and 42 (84%) “acceptable” CD segmentations obtained using our technique.
Figure 11: The number of cases which were scored as ≥ 3 by a given number of readers for a) the BraTS test patients, and b) our clinical data (CD).
To better understand the inter-rater agreement across the participants, we extracted the Spearman correlation coefficient, as we operated on the ordinal scale presented in Table 5. Although the readers strongly disagree for the GT BraTS segmentations in the majority of cases (Figure 12), their responses were consistent for the Sens.AI DCE segmentations obtained for both the BraTS (Figure 13) and CD (Figure 14) test sets. The same observations were manifested in the Fleiss’ kappa values which quantify the inter-rater agreement—they are gathered in Table 7. Here, we analyzed the agreement at the level of four and two classes (scores); in the latter case, we merged classes 1 and 2 as “not clinically useful”, and 3 and 4 as “clinically useful”. According to Altman [6], the agreement for the BraTS GT and Sens.AI DCE segmentations is poor on the full scale (< 0.20), whereas it is fair in all other cases (≥ 0.21 and ≤ 0.40). This indicates that the assessment of the clinical utility of automated segmentation techniques is an open issue in practice, as the disagreement across readers, even those with similar experience, is significant, e.g., due to different clinical training, experience, or even the hardware and/or software settings used to display segmentation reports.
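Both agreement measures can be computed with standard tools. Below is a sketch assuming a cases × readers matrix of ordinal scores (1–4), with a direct NumPy implementation of Fleiss’ kappa and a pairwise Spearman correlation; the function and variable names are ours, and the random matrix is placeholder data only.

```python
import numpy as np
from scipy.stats import spearmanr

def fleiss_kappa(ratings, n_categories=4):
    """Fleiss' kappa for a (n_cases, n_readers) matrix of ordinal scores 1..n_categories."""
    n_cases, n_readers = ratings.shape
    # Count how many readers assigned each category to each case.
    counts = np.stack([(ratings == c + 1).sum(axis=1) for c in range(n_categories)], axis=1)
    p_j = counts.sum(axis=0) / (n_cases * n_readers)                       # category proportions
    p_i = ((counts ** 2).sum(axis=1) - n_readers) / (n_readers * (n_readers - 1))
    p_bar, p_e = p_i.mean(), (p_j ** 2).sum()
    return (p_bar - p_e) / (1.0 - p_e)

# scores: e.g., 33 BraTS cases scored by 12 readers (placeholder data).
scores = np.random.randint(1, 5, size=(33, 12))
kappa = fleiss_kappa(scores)
rho_12, _ = spearmanr(scores[:, 0], scores[:, 1])   # agreement between readers 1 and 2
```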
Figure 12: The inter-rater agreement quantified by the Spearman correlation coefficient for the ground-truth segmentation of the BraTS test patients.
Table 7: The inter-rater agreement quantified by the Fleiss’ kappa values.

Score         BraTS (GT)   BraTS (Sens.AI DCE)   CD
All classes   0.076        0.193                 0.241
Two classes   0.114        0.340                 0.367
4.4. Validation of the vascular input region determination

Determination of the vascular input region is an important step in extracting the plasma curve. To verify our algorithm for this task, we calculate the root mean square (RMS) intensity within the VIF regions for each slice separately, and compare the values with those obtained from the VIF masks segmented by a reader. Here, we report the results obtained over our clinical data of 44 LGG patients who underwent MR imaging with a MAGNETOM Prisma 3T system (Siemens, Germany) equipped with a maximum field gradient strength of 80 mT/m and a 20-channel quadrature head coil. The MRI sequences were acquired in the axial plane with a field of view of 230 × 190 mm, matrix size 256 × 256, and 1 mm slice thickness with no slice gap.
Figure 13: The inter-rater agreement quantified by the Spearman correlation coefficient for the Sens.AI DCE segmentation of the BraTS test patients.
The 1995 slices from this dataset contained segmented vascular input regions. The average RMS difference was 2.69 · 10−5, with a standard deviation of 1.08 · 10−2, a median of −9.40 · 10−5, and a maximum of 0.0481. The analysis of an entire study takes 7.52 ± 2.24 ms on average.
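A minimal sketch of this check, assuming per-slice contrast-concentration images and binary VIF masks stored as NumPy arrays, is shown below; the function and variable names are illustrative and not taken from the system’s code.

```python
import numpy as np

def rms_intensity(image, mask):
    """Root mean square intensity within a binary VIF mask."""
    values = image[mask.astype(bool)]
    return np.sqrt(np.mean(values ** 2)) if values.size else 0.0

def per_slice_rms_difference(volume, auto_masks, reader_masks):
    """RMS difference between automatically and manually segmented VIF regions,
    computed slice by slice as described in Section 4.4."""
    diffs = []
    for img, auto_m, ref_m in zip(volume, auto_masks, reader_masks):
        diffs.append(rms_intensity(img, auto_m) - rms_intensity(img, ref_m))
    return np.asarray(diffs)
```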
4.5. Validation of pharmacokinetic modeling

To quantify the performance of our pharmacokinetic modeling, we calculated the tissue parameters for the QIBA phantom data (for each patch), and divided them by their expected values. The results are presented in Figure 15(b,c) for Ktrans and ve, respectively. Our fitting accuracy is high for most of the simulated regions (the average error, given as Ktrans_error = (Ktrans_fitted − Ktrans_target)/Ktrans_target, is 1.510%). The largest error occurred for the regions with expected high values of Ktrans and low values of ve, where the discrepancy reaches around 6–8% (Figure 15b). Analogously, the largest error (approx. 6–8%) for ve was visible in the regions of expected high values of ve and low values of Ktrans (Figure 15c).
Figure 14: The inter-rater agreement quantified by the Spearman correlation coefficient for the Sens.AI DCE segmentation of our clinical data.
The average error of the ve fitting was 2.782%. The fitting of 3000 voxels of this QIBA set (ver. 14) took approx. 5 s. The corresponding fitting errors obtained with the linear model (Eq. 4) were 6.501% and 5.261% for Ktrans and ve, respectively.

We compare the efficacy of our modeling with DCE-MRI.jl, whose authors operated on version 6 of the QIBA dataset—their fitting errors over this set were 0.419% and 0.126% for Ktrans and ve, respectively. Importantly, DCE-MRI.jl exploits numerical integration instead of an analytical model to approximate the VIF, which eliminates the approximation error. However, numerical calculations are more computationally demanding than analytical approaches. Despite the fact that the authors implemented their algorithms in Julia, which is claimed to be a high-level language designed for scientific computing [72], the reported time required to fit 3000 voxels of the QIBA set was approx. 520 s using 8 threads.
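For context, the kind of per-voxel fitting evaluated here can be sketched with the standard Tofts relation, a numerical convolution, and non-linear least squares. The code below is a simplified illustration only: it uses a toy gamma-variate-like VIF and SciPy’s curve_fit, not the cubic VIF model and fitting routine validated above, and all names and numbers are illustrative.

```python
import numpy as np
from scipy.optimize import curve_fit

# Time axis [min] and a toy synthetic vascular input function (VIF) [mmol].
t = np.linspace(0, 5, 150)
cp = 5.0 * t * np.exp(-t / 0.35)

def tofts_tissue_curve(t, ktrans, ve):
    """Standard Tofts model: Ct(t) = Ktrans * integral of Cp(tau)*exp(-Ktrans/ve*(t-tau))."""
    dt = t[1] - t[0]
    kernel = np.exp(-(ktrans / ve) * t)
    # Discrete convolution approximating the integral over [0, t].
    return ktrans * np.convolve(cp, kernel)[: t.size] * dt

# Simulate a tissue curve for known parameters and fit them back.
true_ktrans, true_ve = 0.10, 0.20                  # [1/min], [-]
ct = tofts_tissue_curve(t, true_ktrans, true_ve)
ct_noisy = ct + np.random.normal(0, 1e-4, ct.shape)

(ktrans_fit, ve_fit), _ = curve_fit(tofts_tissue_curve, t, ct_noisy,
                                    p0=(0.05, 0.1), bounds=(1e-4, [2.0, 1.0]))
print(f"Ktrans = {ktrans_fit:.3f} 1/min, ve = {ve_fit:.3f}")
```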
Figure 15: QIBA (version 14) phantom set: a) a visualization of a single DICOM file (the brown frame surrounds 10 × 10 px patches of tissue characterized by a combination of different Ktrans and ve, and the purple frame surrounds a 10 × 50 px vascular region) alongside the results of fitting the cubic model to this set divided by the expected values of b) Ktrans and c) ve.
The fitting error of the cubic (linear) model obtained on the QIBA6 dataset was 5.114% (4.684%) and 3.241% (3.466%) for Ktrans and ve, respectively. Our system, while characterized by a worse accuracy, requires less than 5 s to process the QIBA set (version 6), hence it is more than 100× faster than DCE-MRI.jl. Finally, our system is also portable and can be easily deployed on any hardware and operating system.
Figure 16: Artifacts generated at the pivotal steps of Sens.AI DCE: a) original image, b) segmented tumor (green color represents true positives, blue—false negatives, and red—false positives), the parameter maps for c) Ktrans and d) ve, the contrast agent's concentration alongside our fitting in e) plasma and f) tissue, and the histograms extracted from the g) Ktrans (skewness is 15.859, kurtosis is 258.021) and h) ve (skewness is 3.571, kurtosis is -15.747) parameter maps. Automated analysis of this patient took less than 3 min with one GPU.
4.6. Illustrative example of automated DCE-MRI processing

In this section, we present an end-to-end processing example of an LGG patient, alongside the execution times of all pivotal steps of Sens.AI DCE. In Figure 16, we render the artifacts generated at each step of the Sens.AI DCE pipeline (the analysis took less than 3 min using a single GPU, and 4 min without a GPU). All of them are finally embedded into a DICOM report which can be either sent back to the Picture Archiving and Communication System (PACS) or analyzed locally by a reader—the process can be triggered automatically from PACS and does not require changing the clinical protocol. We showed that Sens.AI DCE can extract quantifiable histogram-based features [53] from the parameter maps: kurtosis (a measure of the peakedness of the histogram) and skewness (a measure of the asymmetry of the histogram). However, to use them as biomarkers (e.g., in conjunction with Ktrans and ve), a validation process is required.
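Extracting such histogram descriptors from a parameter map is straightforward. A sketch with SciPy is given below, assuming a Ktrans map and its tumor mask as NumPy arrays; the names are illustrative only.

```python
import numpy as np
from scipy.stats import skew, kurtosis

def histogram_features(param_map, tumor_mask):
    """Skewness (asymmetry) and kurtosis (peakedness) of parameter values
    inside the segmented tumor, as reported for Figure 16(g, h)."""
    values = param_map[tumor_mask.astype(bool)]
    return {
        "skewness": skew(values),
        "kurtosis": kurtosis(values),  # Fisher's definition (normal -> 0) by default
    }
```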
Table 8: Time required to accomplish selected parts of Sens.AI DCE using a single GPU and without GPU (run on CPU). This study consists of 160 T2-FLAIR images for tumor segmentation, and 25 volumes of 30 T1 (192 × 192) images acquired at different timestamps.

Sens.AI DCE part             Time on GPU [s]   Time on CPU [s]
Brain tumor segmentation     2                 80
VIF region determination     <1                <1
Tofts model fitting          135               135
I/O, report generation       26                26
Total                        163               241
The time required to execute all parts of Sens.AI DCE is presented in Table 8. This profile, which is consistent across patients, shows that our brain tumor segmentation can be greatly accelerated (40×) using a GPU. The other operations are multi-threaded and run on the CPU, hence their time is the same for both hardware configurations. Sens.AI DCE requires less than 3 min with a GPU and less than 4 min without one, which significantly reduces the DCE-MRI analysis time and allows clinicians to extract reproducible results in real time.
5. Conclusions and future work
5.1. Conclusions

We introduced a fully-automated deep learning-powered approach (Sens.AI DCE) for the DCE-MRI analysis of brain tumor patients. The pivotal steps of Sens.AI DCE have been thoroughly validated using both benchmark and clinical data. The experiments, backed up with statistical tests, showed that Sens.AI DCE obtains state-of-the-art reproducible results, in terms of both segmentation accuracy and pharmacokinetic modeling, in a very short time which is orders of magnitude smaller when compared with other techniques. In particular, we showed that:

- Our deep network for brain tumor segmentation delivers very consistent segmentation for BraTS and clinical data, and it allows for instant processing of full T2-FLAIR scans (less than 6 s using a single GPU) thanks to its simplicity—it is orders of magnitude faster to train and deploy compared with top-performing BraTS segmentation engines known from the literature. Nowadays, the availability of GPU-based systems in clinical settings increases dramatically, and GPUs are being continuously deployed in both workstation and server-side solutions [71]. Thus, designing such deep learning-powered medical image analysis systems allowing for fast performance may significantly improve personalized patient care.

- Our vascular input region determination delivers robust segmentation: the average root mean square error between the contrast concentration extracted from the segmented and the ground-truth regions is less than 3 · 10−5, obtained in real time—processing of full T1 VIBE scans takes less than 8 ms on average.

- Our cubic model of the VIF yields very accurate contrast-concentration fitting. The mean square fitting error, obtained for the synthetic DCE-MRI (QIBA) phantom dataset, is an order of magnitude lower for both plasma and tissue when compared with commonly used models. Our implementation works 100× faster compared to a validated state-of-the-art DCE tool.

- Our DCE analysis pipeline requires less than 3 min for end-to-end processing (including data loading and generating DICOM reports) using a single GPU. On a workstation which is not equipped with a GPU, the entire processing takes only 4 min. Our deep learning-powered brain tumor segmentation is accelerated 40× using GPU processing when compared with its CPU version.

- Sens.AI DCE is very flexible and can extract new quantitative DCE-MRI features (e.g., histogram- or texture-based).
5.2. Future work

Our current work is focused on the validation of the DCE biomarkers extracted by Sens.AI DCE, and on comparing them with the measures obtained using other well-established software (Tissue4D, Siemens) which approximates volumes of interest by simpler geometrical objects, e.g., spheres, whereas Sens.AI DCE extracts biomarkers from the segmented volumes themselves without any additional approximation. Once this process is finished [9], Sens.AI DCE biomarkers (including texture- and histogram-based features) may help enhance the diagnostic efficiency of DCE-MRI imaging and bring new value into clinical practice. In the future, we plan to additionally quantify perfusion in tumor subregions, and to experiment with various regularization factors for pharmacokinetic model fitting, which can be helpful in cases where the data is noisy or irregular, as reflected in other versions of the simulated DCE-MRI data (https://sites.duke.edu/dblab/qibacontent/).

Deciding on the brain tumor segmentation acceptance criteria is not a trivial task. As presented in our experimental study, there is a significant disagreement across experienced readers when assessing the quality of segmentations. Our current work is focused on understanding the impact of the brain tumor segmentation quality, especially close to the tumor boundaries, on the extracted DCE values—this is still an open issue in the literature, and it is in line with the comparison with Tissue4D mentioned above. In our initial experiments, we simulated inscribed and circumscribed spheres (Figure 17) approximating the ground-truth segmentations of our test DCE-MRI dataset containing 44 LGG patients (Section 4.4). Interestingly, Bland-Altman analysis [29] showed high agreement between the perfusion parameters for GT and spheres (Figure 18), which may indicate a low impact of the segmentation quality on the DCE parameters. To build a full understanding in this context, we are currently proceeding with exhaustive validation over the DCE-MRI data of LGG and HGG patients.
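The Bland-Altman agreement in Figure 18 boils down to the bias and limits of agreement between paired measurements. A minimal sketch for a single DCE parameter is given below; the array names and numbers are illustrative only, not values from our study.

```python
import numpy as np

def bland_altman(a, b):
    """Bias and 95% limits of agreement between two paired measurements,
    e.g., Ktrans from accurate segmentations vs. approximating spheres."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    diff = a - b
    bias = diff.mean()
    loa = 1.96 * diff.std(ddof=1)
    return bias, bias - loa, bias + loa

# Example with toy per-patient mean Ktrans values (GT masks vs. inscribed spheres).
bias, lower, upper = bland_altman([0.08, 0.11, 0.09], [0.07, 0.12, 0.09])
```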
Figure 17: Example a) inscribed and b) circumscribed spheres approximating tumors, alongside the ratio of their volumes.
Figure 18: The results of the Bland-Altman analysis performed for the DCE parameters (Ktrans and ve), extracted for a) inscribed and b) circumscribed tumor-approximating spheres vs. accurate tumor segmentation, obtained over the 44 investigated DCE-MRI patients.
Acknowledgements

This work has been supported by the Polish National Centre for Research and Development (POIR.01.02.00-00-0030/15). JN, PRL, and MK were supported by the Silesian University of Technology funds (JN: The Rector's Habilitation Grant No. 02/020/RGH19/0185 and Grant No. 02/020/BKM19/0183; PRL: BKM-556/RAU2/2018; MK: 02/020/BK 18/0128). The authors are grateful to the anonymous Reviewers for their constructive and valuable comments that helped improve the paper. JN thanks Dana K. Mitchell for lots of inspiring discussions on (not only) brain MRI analysis.

This paper is in memory of Dr. Grzegorz Nalepa, an extraordinary scientist and pediatric hematologist/oncologist at Riley Hospital for Children, Indianapolis, USA, who helped countless patients and their families through some of the most challenging moments of their lives.
Appendix: Table of acronyms

In this section, we gather all the acronyms used in the paper (Table A1).
Table A1: The acronyms used in this paper.

Acronym     Meaning
AUC         Area under curve
BraTS       Brain Tumor Segmentation [dataset]
CA          Contrast agent
CD          Clinical data [dataset]
CML         Conventional machine learning
CNN         Convolutional neural network
CPU         Central processing unit
DCE-MRI     Dynamic contrast-enhanced magnetic resonance imaging
DICOM       Digital Imaging and Communications in Medicine
DL          Deep learning
DNN         Deep neural network
ED          Peritumoral edema
EES         Extravascular extracellular space
ET          Enhancing tumor
GPU         Graphics processing unit
GT          Ground truth [segmentation]
HD          Hausdorff distance
HGG         High-grade glioblastomas
LGG         Low-grade gliomas
LLS         Lesion leakage space
MICCAI      Medical Image Computing and Computer Assisted Intervention
ML          Machine learning
MRI         Magnetic resonance imaging
MSE         Mean square error
NCR/NET     Necrotic and non-enhancing tumor core
PACS        Picture Archiving and Communication System
P-R         Precision-recall [curve]
QIBA        Quantitative Imaging Biomarkers Alliance
ReLU        Rectified linear unit
RMS         Root mean square
ROC         Receiver operating characteristic [curve]
T1          Native pre-contrast T1 [MRI sequence]
T1c         Post-contrast T1-weighted sequence [MRI sequence]
T1 VIBE     T1 Volumetric Interpolated Breath-hold Examination [MRI sequence]
T2          T2-weighted [MRI sequence]
T2-FLAIR    T2 Fluid Attenuated Inversion Recovery [MRI sequence]
VIF         Vascular input function
WHO         World Health Organization
YOE         Years of experience
References [1] H. E. M. Abdalla, M. Y. Esmail, Brain tumor detection by using artificial neural network, in: 2018 International Conference on Computer, Control, Electrical, and Electronics Engineering (ICCCEEE), 2018, pp. 1–6. [2] T. Abe, Y. Mizobuchi, K. Nakajima, Y. Otomi, S. Irahara, Y. Obama,
M. Majigsuren, D. Khashbat, T. Kageji, S. Nagahiro, M. Harada, Diagnosis of brain tumors using dynamic contrast-enhanced perfusion imaging with
short acquisition time, Springer 4 (1) (2015) 88.
[3] M. Ahmed, S. Yamany, N. Mohamed, A. Farag, T. Moriarty, A modified fuzzy c-means algorithm for bias field estimation and segmentation of mri
data, IEEE Transactions on Medical Imaging 21 (3) (2002) 193–199.
[4] Z. Akkus, A. Galimzianova, A. Hoogi, D. L. Rubin, B. J. Erickson, Deep
learning for brain MRI segmentation: State of the art and future directions, Journal of Digital Imaging 30 (4) (2017) 449–459.
[5] P. Aljabar, R. Heckemann, A. Hammers, J. Hajnal, D. Rueckert, Multiatlas based segmentation of brain images: Atlas selection and its effect on accuracy, NeuroImage 46 (3) (2009) 726 – 738.
[6] D. G. Altman, Practical statistics for medical research, Statistics in Medicine 10 (10) (1991) 1635–1636.
[7] S. Bakas, H. Akbari, A. Sotiras, M. Bilello, M. Rozycki, J. Kirby, J. Freymann, K. Farahani, C. Davatzikos, Advancing the cancer genome atlas glioma MRI collections with expert segmentation labels and radiomic fea-
tures, Nature Scientific data 4 (2017) 1–13.
[8] S. Bakas, M. Reyes, A. Jakab, S. Bauer, M. Rempfler, A. Crimi, R. T. Shinohara, C. Berger, S. M. Ha, M. Rozycki, M. Prastawa, E. Alberts, J. Lipkov´ a, J. B. Freymann, J. S. Kirby, M. Bilello, H. M. Fathallah-Shaykh, R. Wiest, J. Kirschke, B. Wiestler, R. R. Colen, A. Kotrotsou, P. LaMontagne, D. S. Marcus, M. Milchenko, A. Nazeri, M. Weber, A. Mahajan, 47
U. Baid, D. Kwon, M. Agarwal, M. Alam, A. Albiol, A. Albiol, A. Varghese, T. A. Tuan, T. Arbel, A. Avery, P. B., S. Banerjee, T. Batchelder, K. N. Batmanghelich, E. Battistella, M. Bendszus, E. Benson, J. Bernal, G. Biros, M. Cabezas, S. Chandra, Y. Chang, et al., Identifying the best machine learning algorithms for brain tumor segmentation, progression assessment, and overall survival prediction in the BRATS challenge, CoRR
of
abs/1811.02629. URL http://arxiv.org/abs/1811.02629
ro
[9] C. Barata, M. E. Celebi, J. S. Marques, Development of a clinically oriented system for melanoma diagnosis, Pattern Recognition 69 (2017) 270 – 285.
-p
[10] S. Bauer, C. Seiler, T. Bardyn, P. Buechler, M. Reyes, Atlas-based segmentation of brain tumor images using a Markov Random Field-based tumor
re
growth model and non-rigid registration, in: Proc. IEEE EMBC, 2010, pp. 4080–4083.
lP
[11] A. Ben Rabeh, F. Benzarti, H. Amiri, Segmentation of brain mri using active contour model, International Journal of Imaging Systems and Technology 27 (1) (2017) 3–11.
ur na
[12] A. Bjørnerud, K. E. Emblem, A fully automated method for quantitative cerebral hemodynamic analysis using dsc–mri, Journal of Cerebral Blood Flow & Metabolism 30 (5) (2010) 1066–1078.
[13] B. Bobek-Billewicz, G. Stasik-Pres, A. Hebda, K. Majchrzak, W. Kaspera, M. Jurkowski, Anaplastic transformation of low-grade gliomas (WHO II)
Jo
on magnetic resonance imaging, Folia Neuropathologica 52 (2) (2014) 128– 140.
[14] M. Cabezas, A. Oliver, X. Llad´o, J. Freixenet, M. B. Cuadra, A review of atlas-based segmentation for magnetic resonance brain images, Computer Methods and Programs in Biomedicine 104 (3) (2011) e158 – e177.
48
[15] A. Chaddad, C. Tanougast, Quantitative evaluation of robust skull stripping and tumor detection applied to axial mr images, Brain Informatics 3 (1) (2016) 53–61. [16] A. Chander, A. Chatterjee, P. Siarry, A new social and momentum component adaptive PSO algorithm for image segmentation, Expert Systems
of
with Applications 38 (5) (2011) 4998 – 5004. [17] S.-L. Chao, T. Metens, M. Lemort, Tumourmetrics: a comprehensive clin-
ro
ical solution for the standardization of dce-mri analysis in research and routine use, J. QIMS 7 (5).
-p
[18] C. Cuenod, D. Balvay, Perfusion and vascular permeability: Basic concepts and measurement in DCE-CT and DCE-MRI, Diagnostic and Interven-
re
tional Imaging 94 (12) (2013) 1187 – 1204.
[19] L. Dai, T. Li, H. Shu, L. Zhong, H. Shen, H. Zhu, Automatic brain tumor segmentation with domain adaptation, in: A. Crimi, S. Bakas, H. Kuijf,
lP
F. Keyvan, M. Reyes, T. van Walsum (eds.), Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries, Springer International Publishing, Cham, 2019, pp. 380–392.
ur na
[20] W. Deng, W. Xiao, H. Deng, J. Liu, Mri brain tumor segmentation with region growing method based on the gradients and variances along and inside of the boundary curve, in: 2010 3rd International Conference on Biomedical Engineering and Informatics, vol. 1, 2010, pp. 393–396.
[21] T. Dozat, Incorporating Nesterov Momentum into Adam, in: Proc. Work-
Jo
shop track - ICLR 2016, 2015, pp. 1–6.
[22] B. M. Ellingson, the Jumpstarting Brain Tumor Drug Development Coalition Imaging Standardization Steering Committee, Consensus recommendations for a standardized Brain Tumor Imaging Protocol in clinical trials, Neuro-Oncology 17 (9) (2015) 1188–1198.
49
[23] X. Fan, J. Yang, Y. Zheng, L. Cheng, Y. Zhu, A novel unsupervised segmentation method for MR brain images based on fuzzy methods, in: Y. Liu, T. Jiang, C. Zhang (eds.), Proc. CVBIA, Springer, Berlin, 2005, pp. 160– 169. [24] L. Fang, H. He, Three pathways U-Net for brain tumor segmentation, in:
of
Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries - 4th International Workshop, BrainLes 2018, Held in Conjunction with MICCAI 2018, Granada, Spain, September 16, 2018, Pre-conference
ro
proceedings, 2018, pp. 119–126.
[25] E. Geremia, O. Clatz, B. H. Menze, E. Konukoglu, A. Criminisi, N. Ay-
-p
ache, Spatial decision forests for MS lesion segmentation in multi-channel magnetic resonance images, NeuroImage 57 (2) (2011) 378 – 390.
re
[26] M. Ghafoorian, N. Karssemeijer, T. Heskes, I. van Uden, C. I. S´anchez, G. J. S. Litjens, F. de Leeuw, B. van Ginneken, E. Marchiori, B. Platel,
lP
Location sensitive deep convolutional neural networks for segmentation of white matter hyperintensities, CoRR abs/1610.04834. URL http://arxiv.org/abs/1610.04834
ur na
[27] M. Ghafoorian, A. Mehrtash, T. Kapur, N. Karssemeijer, E. Marchiori, M. Pesteie, C. R. G. Guttmann, F. E. de Leeuw, C. M. Tempany, B. van Ginneken, A. Fedorov, P. Abolmaesumi, B. Platel, W. Wells, Transfer learning for domain adaptation in MRI: application in brain lesion segmentation, in: Proc. MICCAI, 2017, pp. 516–524.
Jo
[28] A. Gholipour, N. Kehtarnavaz, R. Briggs, M. Devous, K. Gopinath, Brain functional localization: A survey of image registration techniques, IEEE Transactions on Medical Imaging 26 (4) (2007) 427–451.
[29] D. Giavarina, Understanding Bland Altman analysis, Biochem Med (Zagreb) 25 (2) (2015) 141–151.
50
[30] M. Goetz, C. Weber, J. Bloecher, B. Stieltjes, H.-P. Meinzer, K. MaierHein, Extremely randomized trees based brain tumor segmentation, in: Brainlesion:
Glioma, Multiple Sclerosis, Stroke and Traumatic Brain
Injuries, MICCAI 2014, 2014, pp. 1–6. URL
https://www.researchgate.net/publication/267762444_
of
Extremely_randomized_trees_based_brain_tumor_segmentation [31] M. Havaei, A. Davy, D. Warde-Farley, A. Biard, A. Courville, Y. Bengio, C. Pal, P.-M. Jodoin, H. Larochelle, Brain tumor segmentation with deep
ro
neural networks, Medical Image Analysis 35 (2017) 18 – 31.
[32] U. Ilhan, A. Ilhan, Brain tumor segmentation based on a new threshold
-p
approach, Procedia Computer Science 120 (2017) 580 – 587.
[33] F. Isensee, P. Kickingereder, W. Wick, M. Bendszus, K. H. Maier-Hein, No
re
new-net, in: A. Crimi, S. Bakas, H. Kuijf, F. Keyvan, M. Reyes, T. van Walsum (eds.), Brainlesion: Glioma, Multiple Sclerosis, Stroke and Trau-
234–244.
lP
matic Brain Injuries, Springer International Publishing, Cham, 2019, pp.
[34] F. Isensee, M. Schell, I. Pflueger, G. Brugnara, D. Bonekamp, U. Neu-
ur na
berger, A. Wick, H.-P. Schlemmer, S. Heiland, W. Wick, M. Bendszus, K. H. Maier-Hein, P. Kickingereder, Automated brain extraction of multisequence mri using artificial neural networks, Human Brain Mapping 40 (17) (2019) 4952–4964.
[35] S. Ji, B. Wei, Z. Yu, G. Yang, Y. Yin, A new multistage medical segmen-
Jo
tation method based on superpixel and fuzzy clustering, Comp. and Math. Meth. in Med. 2014 (2014) 747549:1–747549:13.
[36] Jin Liu, Min Li, J. Wang, Fangxiang Wu, T. Liu, Yi Pan, A survey of MRIbased brain tumor segmentation methods, Tsinghua Science and Technology 19 (6) (2014) 578–595.
51
[37] P. Kalavathi, V. B. S. Prasath, Methods on skull stripping of mri head scan images—a review, Journal of Digital Imaging 29 (3) (2016) 365–379. [38] K. Kamnitsas, W. Bai, E. Ferrante, S. G. McDonagh, M. Sinclair, N. Pawlowski, M. Rajchl, M. Lee, B. Kainz, D. Rueckert, B. Glocker, Ensembles of multiple models and architectures for robust brain tumour
of
segmentation, in: Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Inj., Springer, 2018, pp. 450–462.
ro
[39] N. Khalid, S. Ibrahim, P. Haniff, Mri brain abnormalities segmentation using k-nearest neighbors (k-nn, Int J Comput Sci Eng 3 (2) (2011) 980–
-p
990.
[40] J. Kleesiek, G. Urban, A. Hubert, D. Schwarz, K. Maier-Hein, M. Bendszus, A. Biller, Deep mri brain extraction: A 3d convolutional neural network
re
for skull stripping, NeuroImage 129 (2016) 460 – 469.
[41] P. Korfiatis, T. L. Kline, B. J. Erickson, Automated segmentation of hyper-
lP
intense regions in FLAIR MRI using deep learning, Tomography: a journal for imaging research 2 (4) (2016) 334—340. [42] R. Kumar, A. Indrayan, Receiver operating characteristic (roc) curve for
ur na
medical researchers, Indian Pediatrics 48 (4) (2011) 277–287.
[43] A. Ladgham, G. Torkhani, A. Sakly, A. Mtibaa, Modified support vector machines for MR brain images recognition, in: Proc. CoDIT, 2013, pp. 032–035.
[44] L. Lefkovits, S. Lefkovits, L. Szil´agyi, Brain tumor segmentation with op-
Jo
timized random forest, in: A. Crimi, B. Menze, O. Maier, M. Reyes, S. Winzeck, H. Handels (eds.), Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries, Springer International Publishing, Cham, 2016, pp. 88–99.
52
[45] P. Liskowski, K. Krawiec, Segmenting retinal blood vessels with deep neural networks, IEEE Transactions on Medical Imaging 35 (11) (2016) 2369– 2380. [46] P. R. Lorenzo, J. Nalepa, Memetic evolution of deep neural networks, in: Proc. GECCO, ACM, 2018, pp. 505–512.
of
[47] P. R. Lorenzo, J. Nalepa, B. Bobek-Billewicz, P. Wawrzyniak, G. Mrukwa,
M. Kawulok, P. Ulrych, M. P. Hayball, Segmenting brain tumors from
Programs in Biomedicine 176 (2019) 135 – 148.
ro
flair mri using fully convolutional neural networks, Computer Methods and
-p
[48] D. N. Louis, A. Perry, G. Reifenberger, A. von Deimling, D. FigarellaBranger, W. K. Cavenee, H. Ohgaki, O. D. Wiestler, P. Kleihues, D. W. Ellison, The 2016 World Health Organization classification of tumors of the
re
central nervous system: a summary, Acta Neuropathologica 131 (6) (2016) 803–820.
lP
[49] M. Marcinkiewicz, J. Nalepa, P. R. Lorenzo, W. Dudzik, G. Mrukwa, Segmenting brain tumors from MRI using cascaded multi-modal U-Nets, in: Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain In-
ur na
juries - 4th International Workshop, BrainLes 2018, Held in Conjunction with MICCAI 2018, Granada, Spain, September 16, 2018, Revised Selected Papers, Part II, 2018, pp. 13–24.
[50] R. McKinley, R. Meier, R. Wiest, Ensembles of densely-connected CNNs with label-uncertainty for brain tumor segmentation, in:
A. Crimi,
Jo
S. Bakas, H. Kuijf, F. Keyvan, M. Reyes, T. van Walsum (eds.), Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries, Springer International Publishing, Cham, 2019, pp. 456–465.
[51] P. A. Mei, C. de Carvalho Carneiro, S. J. Fraser, L. L. Min, F. Reis, Analysis of neoplastic lesions in magnetic resonance imaging using self-organizing maps, Journal of the Neurological Sciences 359 (1-2) (2015) 78–83.
53
[52] B. H. Menze, A. Jakab, S. Bauer, J. Kalpathy-Cramer, K. Farahani, J. Kirby, Y. Burren, N. Porz, J. Slotboom, R. Wiest, L. Lanczi, E. Gerstner, M. A. Weber, T. Arbel, B. B. Avants, N. Ayache, P. Buendia, D. L. Collins, N. Cordier, J. J. Corso, A. Criminisi, T. Das, H. Delingette, . Demiralp, C. R. Durst, M. Dojat, S. Doyle, J. Festa, F. Forbes, E. Geremia, B. Glocker, P. Golland, X. Guo, A. Hamamci, K. M. Iftekharuddin, R. Jena,
of
N. M. John, E. Konukoglu, D. Lashkari, J. A. Mariz, R. Meier, S. Pereira,
D. Precup, S. J. Price, T. R. Raviv, S. M. S. Reza, M. Ryan, D. Sarikaya,
ro
L. Schwartz, H. C. Shin, J. Shotton, C. A. Silva, N. Sousa, N. K. Subbanna,
G. Szekely, T. J. Taylor, O. M. Thomas, N. J. Tustison, G. Unal, F. Vasseur, M. Wintermark, D. H. Ye, L. Zhao, B. Zhao, D. Zikic, M. Prastawa,
-p
M. Reyes, K. V. Leemput, The multimodal brain tumor image segmentation benchmark (BRATS), IEEE Transactions on Medical Imaging 34 (10)
re
(2015) 1993–2024.
[53] K. A. Miles, B. Ganeshan, M. P. Hayball, CT texture analysis using the
lP
filtration-histogram method: what do the measurements mean?, Cancer Imaging 13 (4) (2013) 400–406.
[54] P. Moeskops, M. A. Viergever, A. M. Mendrik, L. S. de Vries, M. J. N. L.
ur na
Benders, I. Isgum, Automatic segmentation of MR brain images with a convolutional neural network, IEEE Transactions on Medical Imaging 35 (5) (2016) 1252–1261.
[55] A. Myronenko, 3D MRI brain tumor segmentation using autoencoder regularization, in: A. Crimi, S. Bakas, H. Kuijf, F. Keyvan, M. Reyes, T. van
Jo
Walsum (eds.), Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries, Springer International Publishing, Cham, 2019, pp. 311–320.
[56] J. Nalepa, M. Kawulok, Adaptive genetic algorithm to select training data for support vector machines, in: Proc. Applications of Evolutionary Computation, LNCS, Springer, 2014, pp. 514–525.
54
[57] J. Nalepa, M. Kawulok, Adaptive memetic algorithm enhanced with data geometry analysis to select training data for SVMs, Neurocomputing 185 (2016) 113 – 132. [58] J. Nalepa, G. Mrukwa, S. Piechaczek, P. R. Lorenzo, M. Marcinkiewicz, B. Bobek-Billewicz, P. Wawrzyniak, P. Ulrych, J. Szymanek, M. Cwiek,
of
W. Dudzik, M. Kawulok, M. P. Hayball, Data augmentation via image registration, in: 2019 IEEE International Conference on Image Processing
ro
(ICIP), 2019, pp. 4250–4254.
[59] M. R. Orton, J. A. d’Arcy, S. Walker-Samuel, D. J. Hawkes, D. Atkinson, D. J. Collins, M. O. Leach, Computationally efficient vascular input func-
Medicine & Biology 53 (5) (2008) 1225.
-p
tion models for quantitative kinetic modelling using DCE-MRI, Physics in
re
[60] M. T. M. Park, J. Pipitone, L. H. Baer, J. L. Winterburn, Y. Shah, S. Chavez, M. M. Schira, N. J. Lobaugh, J. P. Lerch, A. N. Voineskos,
lP
M. M. Chakravarty, Derivation of high-resolution MRI atlases of the human cerebellum at 3T and segmentation using multiple automatically generated templates, NeuroImage 95 (2014) 217 – 231.
ur na
[61] N. Passat, C. Ronse, J. Baruthio, J.-P. Armspach, J. Foucher, Watershed and multimodal data for brain vessel segmentation: Application to the superior sagittal sinus, Image and Vision Computing 25 (4) (2007) 512 – 521.
[62] A. Pinto, S. Pereira, H. Correia, J. Oliveira, D. M. L. D. Rasteiro, C. A.
Jo
Silva, Brain tumour segmentation based on extremely rand. forest with high-level features, in: Proc. IEEE EMBC, 2015, pp. 3037–3040.
[63] J. Pipitone, M. T. M. Park, J. Winterburn, T. A. Lett, J. P. Lerch, J. C. Pruessner, M. Lepage, A. N. Voineskos, M. M. Chakravarty, Multi-atlas segmentation of the whole hippocampus and subfields using multiple automatically generated templates, NeuroImage 101 (2014) 494 – 512.
55
[64] R. Port, M. Knopp, G. Brix, Dynamic contrast-enhanced MRI using GdDTPA: Interindividual variability of the arterial input function and consequences for the assessment of kinetics in tumors, Magn. Res. in Med. 45 (6) (2001) 1030–1038. [65] A. Rajendran, R. Dhanasekaran, Fuzzy clustering and deformable model for
of
tumor segmentation on MRI brain image: A combined approach, Procedia Engineering 30 (2012) 327 – 333.
ro
[66] O. Ronneberger, P. Fischer, T. Brox, U-Net: Convolutional networks for biomedical image segmentation, CoRR abs/1505.04597.
[67] S. Saha, S. Bandyopadhyay, MRI brain image segmentation by fuzzy sym-
-p
metry based genetic clustering technique, in: Proc. IEEE CEC, 2007, pp. 4417–4424.
re
[68] N. Sauwen, M. Acou, D. M. Sima, J. Veraart, F. Maes, U. Himmelreich, E. Achten, S. V. Huffel, Semi-automated brain tumor segmentation on
lP
multi-parametric mri using regularized non-negative matrix factorization, BMC Medical Imaging 17 (1) (2017) 29. [69] R. W. Sembiring, J. M. Zain, A. Embong, Dimension reduction of health
ur na
data clustering, CoRR abs/1110.3569. URL http://arxiv.org/abs/1110.3569
[70] V. Simi, J. Joseph, Segmentation of glioblastoma multiforme from MR images - a comprehensive review, The Egyptian Journal of Radiology and Nuclear Medicine 46 (4) (2015) 1105–1110.
Jo
[71] E. Smistad, T. L. Falch, M. Bozorgi, A. C. Elster, F. Lindseth, Medical image segmentation on GPUs – a comprehensive review, Medical Image Analysis 20 (1) (2015) 1 – 18.
[72] D. S. Smith, X. Li, L. R. Arlinghaus, T. E. Yankeelov, E. B. Welch, DCEMRI.jl: a fast, validated, open source toolkit for dynamic contrast enhanced mri analysis, PeerJ 3 (2015) e909. 56
[73] M. Soltaninejad, G. Yang, T. Lambrou, N. Allinson, T. L. Jones, T. R. Barrick, F. A. Howe, X. Ye, Automated brain tumour detection and segmentation using superpixel-based extremely randomized trees in FLAIR MRI, Int. J. of Comp. Assist. Radiol. and Surgery 12 (2) (2017) 183–203. [74] M. D. Steenwijk, P. J. Pouwels, M. Daams, J. W. van Dalen, M. W. Caan,
of
E. Richard, F. Barkhof, H. Vrenken, Accurate white matter lesion segmentation by k nearest neighbor classification with tissue type priors (knn-ttps),
ro
NeuroImage: Clinical 3 (2013) 462 – 469.
[75] K. Sum, P. Y. Cheung, Boundary vector field for parametric active contours, Pattern Recognition 40 (6) (2007) 1635 – 1645.
-p
[76] C. Sun, S. Guo, H. Zhang, J. Li, M. Chen, S. Ma, L. Jin, X. Liu, X. Li, X. Qian, Automatic segmentation of liver tumors from multiphase contrast-
re
enhanced ct images based on fcns, Artificial Intelligence in Medicine 83 (2017) 58 – 66.
lP
[77] A. A. Taha, A. Hanbury, Metrics for evaluating 3d medical image segmentation: analysis, selection, and tool, BMC Medical Imaging 15 (1) (2015) 29.
ur na
[78] M. Taherdangkoo, M. H. Bagheri, M. Yazdi, K. P. Andriole, An effective method for segmentation of MR brain images using the ant colony optimization algorithm, Journal of Digital Imaging 26 (6) (2013) 1116–1123.
[79] P. S. Tofts, A. G. Kermode, Measurement of the blood-brain barrier permeability and leakage space using dynamic MR imaging, Magn. Res. in
Jo
Med. 17 (2) (1991) 357–367.
[80] N. Verma, M. C. Cowperthwaite, M. K. Markey, Superpixels in brain MR image analysis, in: Proc. IEEE EMBC, 2013, pp. 1077–1080.
[81] J. E. Villanueva-Meyer, M. C. Mabray, S. Cha, Current Clinical Brain Tumor Imaging, Neurosurgery 81 (3) (2017) 397–415. URL https://doi.org/10.1093/neuros/nyx103 57
[82] A. Wadhwa, A. Bhardwaj, V. S. Verma, A review on brain tumor segmentation of MRI images, Magnetic Resonance Imaging 61 (2019) 247 – 259. [83] S. K. Warfield, K. H. Zou, W. M. Wells, Simultaneous truth and performance level estimation (staple): an algorithm for the validation of image
of
segmentation, IEEE Transactions on Medical Imaging 23 (7) (2004) 903– 921.
ro
[84] T. Weglinski, A. Fabijanska, Brain tumor segmentation from mri data sets using region growing approach, 2011, pp. 185–188, cited By 15.
-p
[85] W. Wu, A. Y. C. Chen, L. Zhao, J. J. Corso, Brain tumor detection and segmentation in a CRF (conditional random fields) framework with pixelpairwise affinity and superpixel-level features, Int. J. of Comp. Assist. Ra-
re
diol. and Surgery 9 (2) (2014) 241–253.
[86] J. Yin, J. Yang, Q. Guo, Automatic determination of the arterial input
lP
function in dynamic susceptibility contrast MRI: comparison of different reproducible clustering algorithms, Neuroradiology 57 (5) (2015) 535–543. [87] X. Zhao, Y. Wu, G. Song, Z. Li, Y. Zhang, Y. Fan, A deep learning
ur na
model integrating FCNNs and CRFs for brain tumor segmentation, CoRR abs/1702.04528.
[88] Y. Zhuge, A. V. Krauze, H. Ning, J. Y. Cheng, B. C. Arora, K. Camphausen, R. W. Miller, Brain tumor segmentation using holistically nested neural networks in MRI images, Med. Phys. (2017) 1–10.
Jo
[89] D. Zikic, B. Glocker, E. Konukoglu, A. Criminisi, C. Demiralp, J. Shotton, O. M. Thomas, T. Das, R. Jena, S. J. Price, Decision forests for tissuespecific segmentation of high-grade gliomas in multi-channel MR, in: Proc. MICCAI, Springer, 2012, pp. 369–376.
58