Selection of Fitting Model and Arterial Input Function for Repeatability in Dynamic Contrast-Enhanced Prostate MRI


ARTICLE IN PRESS

Original Investigation

Selection of Fitting Model and Arterial Input Function for Repeatability in Dynamic Contrast-Enhanced Prostate MRI Sharon Peled, PhD, Mark Vangel, PhD, Ron Kikinis, MD, Clare M. Tempany, MD, Fiona M. Fennessy, MD, PhD, Andrey Fedorov, PhD

Abbreviations: AIF, arterial input function; BAT, bolus arrival time; DCE, dynamic contrast-enhanced; ETK, extended Tofts-Kety; ETK+B, extended Tofts-Kety + BAT; NPZ, normal-appearing peripheral zone; PZ, peripheral zone; PI-RADS, Prostate Imaging Reporting and Data System; PK, pharmacokinetic; RC, repeatability coefficient

Rationale and Objectives: Analysis of dynamic contrast-enhanced (DCE) magnetic resonance imaging is notable for the variability of calculated parameters. The purpose of this study was to evaluate the level of measurement variability and the error/variability due to modeling in DCE magnetic resonance imaging parameters.

Materials and Methods: Two prostate DCE scans were performed on 11 treatment-naïve patients with suspected or confirmed prostate peripheral zone cancer within an interval of less than two weeks. Tumor-suspicious and normal-appearing regions of interest (ROI) in the prostate peripheral zone were segmented. Different Tofts-Kety based models and different arterial input functions, with and without bolus arrival time (BAT) correction, were used to extract pharmacokinetic parameters. The percent repeatability coefficient (%RC) of fitted model parameters Ktrans, ve, and kep was calculated. Paired t-tests comparing parameters in tumor-suspicious ROIs and in normal-appearing tissue evaluated each parameter's sensitivity to pathology.

Results: Although goodness-of-fit criteria favored the four-parameter extended Tofts-Kety model with the BAT correction included, the simplest two-parameter Tofts-Kety model overall yielded the best repeatability scores. The best %RC in the tumor-suspicious ROI was 63% for kep, 28% for ve, and 83% for Ktrans. The best p values for discrimination between tissues were p < 10⁻⁵ for kep and Ktrans, and p = 0.11 for ve. Addition of the BAT correction to the models did not improve repeatability.

Conclusion: The parameter kep, using an arterial input function directly measured from blood signals, was more repeatable than Ktrans. Both Ktrans and kep values were highly discriminatory between healthy and diseased tissues in all cases. The parameter ve had high repeatability but could not distinguish the two tissue types.

Key Words: Prostate; Pharmacokinetic modeling; Magnetic resonance; Test-retest; Treatment response; Cancer imaging.
© 2018 The Association of University Radiologists. Published by Elsevier Inc. All rights reserved.

%RC, percent repeatability coefficient; ROI, region of interest; TK, Tofts-Kety

Acad Radiol 2018; &:1–11

From the Department of Radiology, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts 02115 (S.P., R.K., C.M.T., F.M.F., A.F.); Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts (M.V.). Received August 8, 2018; revised October 19, 2018; accepted October 21, 2018. Address correspondence to: S. P. e-mail: [email protected]
https://doi.org/10.1016/j.acra.2018.10.018


Academic Radiology, Vol &, No &&, && 2018

TK+B, Tofts-Kety + BAT; TPZ, tumor-suspicious peripheral zone

INTRODUCTION

Dynamic contrast-enhanced magnetic resonance imaging (DCE MRI) can potentially differentiate tumor vasculature from a normal blood vessel network by measuring vascular permeability and perfusion. There is a wide consensus that quantitative DCE MRI could have a role in clinical practice for determining disease recurrence and for evaluating response to neoadjuvant drug therapy in cancer. DCE MRI has been validated as a surrogate marker of tumor angiogenesis in prostate cancer (1,2). Existing evidence shows that pharmacokinetic (PK) DCE parameters can provide a good measure of response in androgen deprivation therapy for prostate cancer (3) and a good indicator of isolated limb perfusion treatment response in soft tissue sarcomas (4).

To determine the usefulness of a biomarker, it is necessary to understand its variability (intrasubject, intersubject, etc.). Intrasubject variability can be measured using two scans of the patient without intervening therapy. Modeling and analysis of DCE is disappointingly notable for the variability of calculated parameters, both between patients and between studies. This is in part because DCE analysis is sensitive to the variety of acquisition and analysis choices that need to be made and parameters to be selected (5,6). Some of the sources of uncertainty pertain to the particular scanner, radiofrequency coil homogeneity/shimming, and the pulse sequence used to acquire the images. Other sources of variability include the analysis software, the PK model used, the presence or absence of T1 mapping, and the presence or absence of time series image registration (5).

One critical input to quantitative DCE calculations is the arterial contrast agent time-dependent concentration curve, also called the arterial input function (AIF). Uncertainty in the AIF is arguably the greatest contributor to measurement error (7).
The choice of AIF determination, including the choice of source vessel, can lead to significant differences in the values of the estimated PK parameters (8), as also shown by McGrath et al., who used data from repeated DCE-MRI acquisitions in a preclinical tumor rat model to quantify the sensitivity of tracer kinetic modeling parameters to the form of the AIF (9). Often the AIF is calculated from the DCE signal in an artery in the image field of view. There are a number of problems associated with this approach. When contrast agent concentrations are large (e.g., within vessels), signal intensity varies nonlinearly with contrast agent concentration, making it difficult to assess concentration. The calculation of contrast concentration is also affected by the intrinsic T1, which is difficult to measure directly in blood. Lu et al. found that the longitudinal relaxation rate R1 (= 1/T1) of arterial blood at 3T depends on hematocrit (Hct) and can be approximately described by the linear relationship R1 = 0.52 Hct + 0.38, valid in the normal Hct range 0.38–0.46 (10). According to this equation, with the given range of Hct, peak blood contrast concentrations anywhere between 6.5 and 9.1 mM could be calculated from the same example of blood signal. Slight anemia or polycythemia could increase the uncertainty further. Other systemic problems with evaluating the AIF concentration from a vessel in the field of view are partial volume effects, in-flow effects, flow-profile dependency, T2* effects, and temporal undersampling causing underestimation of the peak value (8,11). Any error in the amplitude of the AIF directly affects the calculated PK parameters: Ktrans and ve have been shown to scale nearly proportionally with the AIF (12), and the choice of source vessel can change their calculated values (13).

Due to the many problems with estimating the AIF from an artery in the image field of view, sometimes a predefined functional form or population-averaged AIF is used (11). This choice of AIF cannot account for differences in cardiac output and injection rates between patients, or even between scans of the same patient. Ashton et al. evaluated the scan-rescan coefficient of variation for repeated scans of 25 patients with solid tumors in the liver and compared a data-derived AIF with a model AIF. Using the data-derived AIF reduced the scan-rescan coefficient of variation in Ktrans by 70% in liver lesions (14). In prostate cancer, Azahaf et al. found that one particular population AIF worked better than another in generating values of Ktrans that could discriminate between malignant and benign tissue, and that it provided better or equal discrimination compared to an individually measured AIF (15).
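As a worked illustration of the hematocrit dependence quoted above from Lu et al. (10), the relation can be evaluated at the two ends of the stated normal range. This is only a sketch of the arithmetic: it assumes R1 is expressed in s⁻¹, and it does not reproduce the signal-to-concentration calculation behind the 6.5 to 9.1 mM figures.

```latex
\begin{align*}
\mathrm{Hct}=0.38:&\quad R_1 = 0.52\,(0.38) + 0.38 = 0.5776\ \mathrm{s^{-1}}
  \;\Rightarrow\; T_1 = 1/R_1 \approx 1731\ \mathrm{ms}\\
\mathrm{Hct}=0.46:&\quad R_1 = 0.52\,(0.46) + 0.38 = 0.6192\ \mathrm{s^{-1}}
  \;\Rightarrow\; T_1 = 1/R_1 \approx 1615\ \mathrm{ms}
\end{align*}
```

Under that assumption, the assumed pre-contrast blood T1 alone varies by roughly 7% across the normal Hct range.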
Normalizing AIFs has also been investigated. AIF-caused variations in measured parameters were reduced when reference-tissue-adjusted AIFs were used (16). In that study a region of interest (ROI) in the adjacent obturator muscle area on the same image slice as the tumor ROI was used as a reference tissue for AIF amplitude adjustment. The AIF amplitude was adjusted until the Tofts model fitting of the muscle ROI DCE-MRI data returned a ve value of 0.1, which is within the range of literature-reported values.

If the AIF is measured in a large artery upstream from the tissue of interest, further downstream the concentration is likely to be both delayed and dispersed. Ignoring a possible shift in bolus arrival time (BAT) has been shown to degrade the fit quality in DCE modeling (17), while incorporation of a BAT shift has shown improved reproducibility of calculated model parameters (18). When using a functional form for the AIF or one derived from a population, there is likely to be an even greater need for the BAT to be part of the fitting process, due to variations in injection timing.

This work sets out to investigate the practical utility of DCE-derived quantitative biomarkers by comparing the repeatability of PK parameters calculated using different models and analysis methods. In contrast to previous work on the subject of model and AIF choices in DCE, here the focus is on the within-patient stability of parameter estimates, i.e., on repeatability under conditions of unavoidable measurement error. We vary the PK model, the AIF determination approach, and whether a BAT shift calculation is included. While determination of the ground truth AIF is difficult to achieve in practice, analysis of the AIF choices in a scan-rescan dataset provides us with a unique setting to optimize those choices such that repeatability is increased. Although repeatability is a critical characteristic of a biomarker for treatment response or disease progression studies, the sensitivity of a biomarker to tissue changes, such as may be reflected in different values for healthy and diseased tissue, is also important. For this reason the parameters calculated from the analysis methods were also tested for their discriminatory ability between areas of suspected malignancy and areas categorized as healthy.

MATERIALS AND METHODS

Image Acquisition and Annotation

The cohort analyzed in this study is a subset of patients (n = 11) from a population (n = 15) participating in a study of prostate multiparametric MRI repeatability (19). Four patients out of the original 15, on whose DCE scans a tumor-suspicious region could not be identified, were excluded. The work described was carried out in accordance with the World Medical Association Declaration of Helsinki Ethical Principles for Medical Research Involving Human Subjects. Institutional review board approval was obtained for this Health Insurance Portability and Accountability Act-compliant study. Written informed consent was obtained from the study participants.

Two clinical prostate 3D DCE scans were performed on treatment-naïve patients within an interval of less than two weeks on a General Electric 3T scanner using a receive endorectal coil. Gadopentetate dimeglumine (Magnevist, Berlex Laboratories, Wayne, NJ), 0.15 mmol/kg, was injected intravenously using a syringe pump at the rate of 3 mL/s, followed by a 20 mL saline flush at the same rate (Fig 1). The clinical pulse sequence parameters varied between patients but were in the range: TR 3.7–4.1 ms; flip angle 12° or 15°; TE 1.3–1.4 ms; time per frame 5–8.4 seconds; scan time 4.5–5.5 minutes; matrix either 256 × 256 × 16 with resolution 1 × 1 × 6 mm, or 512 × 512 × 32 with resolution 0.55 × 0.55 × 2.5 mm.

The MRI studies were deidentified and presented in a random order to an experienced abdominal radiologist (FMF) who assigned a Prostate Imaging Reporting and Data System (PI-RADS v2) score (20) to each study. Using 3D Slicer software (http://slicer.org) the same radiologist segmented three ROIs on the DCE difference image (subtraction of baseline precontrast images from postcontrast first bolus arrival phase images). As illustrated in Figure 2, these three regions were defined as follows:

1. A peripheral zone tumor-suspicious ROI (TPZ). The sector containing the lesion was noted. Mean TPZ ROI sizes are shown in Table 1, as is the percentage of the peripheral zone represented by the TPZ ROI.
2. The whole peripheral zone ROI (PZ). In all the following analyses, NPZ refers to "normal-appearing" peripheral zone, i.e., the whole peripheral zone PZ excluding the tumor-suspicious zone TPZ.
3. Left femoral artery and right femoral artery voxels.

The segmentations were performed while simultaneously viewing the T2 and diffusion-weighted images from the same patient study. Participating patient details are shown in Table 1. There was good agreement in the locations of the suspected tumor ROI areas between the baseline and repeat studies, which was assessed by a separate reader by comparing the noted lesion sector, as discussed in Ref. (19). The size of the suspected tumor ROI did, in some cases, differ between the baseline and repeat study, as shown in Table 1.

Figure 1. Typical images of prostate appearance after contrast injection. Shown from left to right are cropped frames #6, #7, #8, #9, and #19. The time between frames was 7.3 seconds.


Figure 2. Examples of regions of interest (ROIs) that were identified and segmented by an experienced abdominal radiologist on each DCE difference image: left artery voxels and right artery voxels for AIF measurement; whole peripheral zone (PZ); and peripheral zone tumor-suspicious ROI (TPZ).

Model Analysis

First, the MRI signal from the prostate and from blood was converted to longitudinal relaxation rate, R1, under the condition of a low-flip-angle gradient-echo sequence, with knowledge of θ and TR, the flip angle and repetition time, respectively. The conversion from R1 to time-dependent contrast concentration, C(t), was assumed linear and was done via the equation

$$R_1(t) = r_1 \cdot C(t) + R_{10},$$

where r1 is the relaxivity of the contrast agent used, and R10 is the longitudinal relaxation rate of the tissue without added contrast agent. Here T10 of prostate tissue was assumed constant at 1434 ms (21). The arterial T10 of blood was estimated as 1630 ms according to the equation $T_{10} = 1/(0.52 \cdot \mathrm{Hct} + 0.38)$ as suggested in (10), assuming Hct = 0.45. The AIF estimation methods (designated AIF1, AIF2, and AIF3 below) evaluated were:

1. AIF1: A plasma concentration AIF calculated from the mean of the measured blood signals in the manually segmented left and right artery voxels: Cp,i(t) for each scan i, shown in Figure 3. The plasma concentration was derived from the blood concentration, Cb(t), using $C_p(t) = C_b(t)/(1 - \mathrm{Hct})$, assuming Hct = 0.45.

2. AIF2: One AIF for all subjects' scans, which is the population average of all the measured AIFs in this study (the injection protocol was typical for our institution). The population average in this case was calculated by first fitting each AIF to a heuristic parameterization, in this case a biexponential multiplied by a sigmoid function, as follows:

$$Y(t) = Y_{max}\left[(1-f)\,e^{-(t-t_0-t_{max})/s_1} + f\,e^{-(t-t_0-t_{max})/s_2}\right]\cdot\frac{1}{1+e^{-(t-t_0-t_{max}/2)/(t_{max}/10)}} \tag{1}$$

A biexponential is widely used to describe a bolus injection (22), and a sigmoid has been used elsewhere to describe the rise from zero of the concentration (23). Figure 3 shows the calculated and fitted AIFs for all scans. The median values of the fitted parameters [t0, tmax, Ymax, s1, s2, f] were then used to construct the population AIF, Cpop(t), shown in the bottom right axes of Figure 3. The resulting population AIF looked qualitatively similar in amplitude and form to the median AIF calculated from a previous study at our institute.

3. AIF3: To counter the possible underestimation of the peak value, the above population AIF (AIF2) was scaled for each DCE acquisition scan j by the NPZ maximum concentration in that scan, max(CNPZ,j), as follows:

$$C_{pop,j}(t) = C_{pop}(t)\cdot\max\left(C_{NPZ,j}\right)\cdot\frac{1}{N}\sum_{i=1}^{N}\frac{1}{\max\left(C_{NPZ,i}\right)} \tag{2}$$

where CNPZ,j is the average of the voxel signals in the NPZ ROI of scan j.

TABLE 1. Patient-level Summary of Clinical Indications for MR Imaging, Histopathology, PSA and PI-RADS v2 Assessment of the Disease in the Evaluated Cohort, Mean Suspected Tumor ROI Size (Mean of Segmentations in Two Repeated MRI Scans ± Standard Deviation), and % Tumor Volume of Peripheral Zone (Mean of Two Repeated Scans ± Standard Deviation)

| Subject # | Indication for the MRI Exam | Max. Gleason at Biopsy | PSA (ng/mL) | PI-RADS v2 Scan I | PI-RADS v2 Scan II | Suspected Tumor ROI: Size (mm³) | Suspected Tumor ROI: % Volume of PZ |
|---|---|---|---|---|---|---|---|
| 1 | Known PCa, staging | 3+4 | 5.4 | 4 | 4 | 378 ± 336 | 4.5 ± 3.7 |
| 2 | Known PCa, assess change | 3+4 | 7.5 | 2 | 3 | 469 ± 117 | 3.8 ± 0.2 |
| 3 | Known PCa, staging | 3+3 | 8.2 | 4 | 4 | 794 ± 190 | 6.3 ± 0.8 |
| 4 | Known PCa, staging | 3+3 | 4.3 | 2 | 2 | 89 ± 50 | 1.3 ± 0.8 |
| 5 | Elevated PSA, staging | 4+5 | 6.2 | 4 | 4 | 663 ± 367 | 8.7 ± 0.5 |
| 6 | Known PCa, assess change | 3+3 | 4.8 | 4 | 4 | 499 ± 170 | 3.6 ± 1.8 |
| 7 | Elevated PSA, staging | Benign | 9.4 | 4 | 4 | 541 ± 20 | 4.4 ± 0.6 |
| 8 | Known PCa, assess change | 3+3 | 3.15 | 4 | 4 | 82 ± 19 | 1.1 ± 0.3 |
| 9 | Known PCa, assess change | 3+3 | 9.7 | 4 | 4 | 530 ± 109 | 4.2 ± 0.6 |
| 10 | Elevated PSA, staging | Benign | 5.5 | 3 | 4 | 112 ± 49 | 2.2 ± 0.8 |
| 11 | Known PCa, staging | 3+4 | 4.16 | 4 | 3 | 340 ± 44 | 2.6 ± 0.2 |

Figure 3. AIFs calculated after measurement from the mean of left and right artery ROIs in all 22 DCE scans (11 patients × 2 scans) with fits to biexponential × sigmoid functions (see text). The concentration units of the vertical axis are mM and the horizontal axis is Time (sec). The population AIF resulting from taking the median values of the parameters fitted to Eq. [1] (sigmoid × biexponential) from all 22 scans is shown in the bottom right plot.
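The three AIF constructions above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' Matlab code: the function names are hypothetical, the exponent signs follow the usual reading of a decaying biexponential and a rising sigmoid, and the per-scan fitting of Eq. (1) to measured data is not shown.

```python
import math
from statistics import median


def plasma_from_blood(c_blood, hct=0.45):
    """AIF1 step: plasma concentration from blood concentration, Cp = Cb / (1 - Hct)."""
    return c_blood / (1.0 - hct)


def _sigmoid(z):
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def heuristic_aif(t, t0, tmax, ymax, s1, s2, f):
    """Eq. (1): biexponential decay multiplied by a sigmoid rise."""
    x = t - t0 - tmax
    biexp = (1.0 - f) * math.exp(-x / s1) + f * math.exp(-x / s2)
    rise = _sigmoid((t - t0 - tmax / 2.0) / (tmax / 10.0))
    return ymax * biexp * rise


def population_parameters(fits):
    """AIF2 step: parameter-wise median over per-scan fits (equal-length tuples)."""
    return tuple(median(column) for column in zip(*fits))


def aif3_scale_factor(npz_max, j):
    """Eq. (2): max(C_NPZ,j) times the mean over all scans of 1 / max(C_NPZ,i)."""
    mean_inverse = sum(1.0 / m for m in npz_max) / len(npz_max)
    return npz_max[j] * mean_inverse
```

With this parameterization the curve is close to zero before t0 and reaches approximately Ymax at t = t0 + tmax, where the sigmoid factor equals 1/(1 + e⁻⁵).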

Although in this study regions of obturator muscle were also segmented, with the intention of using its signal for scaling, the enhancement signal in the obturator was generally too noisy to extract a scaling factor.

The four PK models to which the tissue concentration time courses were fitted were:

1. The two-parameter Tofts-Kety model (22,24) (denoted TK in Figures 4–8); this model extracts the parameters Ktrans (the transfer constant of Gd across the capillary endothelium) and ve (the fractional volume of extravascular extracellular space), and fits the calculated contrast agent concentration in tissue, Ct(t), via the following equation:

$$C_t(t) = K^{trans}\int_0^t C_p(u)\,e^{-k_{ep}(t-u)}\,du \tag{3}$$

where kep = Ktrans/ve, and Cp(t) is the contrast agent concentration in blood plasma.

2. The three-parameter extended Tofts-Kety model (25) (ETK in Figures 4–8); this model fits an additional parameter, vp, which is the fractional volume of blood plasma in tissue, as follows:

$$C_t(t) = v_p\,C_p(t) + K^{trans}\int_0^t C_p(u)\,e^{-k_{ep}(t-u)}\,du \tag{4}$$

3. The two-parameter Tofts-Kety model with BAT correction fitting, giving three fitted parameters (TK+B in Figures 4–8), via the following equation:

$$C_t(t) = K^{trans}\int_0^t C_p(u-t_0)\,e^{-k_{ep}(t-u)}\,du \tag{5}$$

where t0 is the BAT delay.

4. The three-parameter extended Tofts-Kety model with BAT correction fitting, giving four fitted parameters (ETK+B in Figures 4–8):

$$C_t(t) = v_p\,C_p(t-t_0) + K^{trans}\int_0^t C_p(u-t_0)\,e^{-k_{ep}(t-u)}\,du \tag{6}$$

Figure 4. The model fits for a single randomly picked voxel in the TPZ ROI of patient #7: a. PK models with AIF1, b. PK models with AIF2.
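The four models of Eqs. (3) to (6) differ only in whether vp and t0 are free, so a single forward function can cover all of them. The sketch below is a hypothetical pure-Python illustration, not the authors' implementation: it assumes times in minutes, evaluates the convolution integral with the trapezoidal rule, applies the BAT shift by linear interpolation of the plasma curve, and assumes zero plasma concentration before the first sample.

```python
import bisect
import math


def shifted_cp(t_query, t, cp):
    """Linearly interpolate Cp at t_query; zero before the first sample (assumption)."""
    if t_query < t[0]:
        return 0.0
    if t_query >= t[-1]:
        return cp[-1]
    k = bisect.bisect_right(t, t_query)  # t[k-1] <= t_query < t[k]
    w = (t_query - t[k - 1]) / (t[k] - t[k - 1])
    return (1.0 - w) * cp[k - 1] + w * cp[k]


def tofts_kety(t, cp, ktrans, ve, vp=0.0, t0=0.0):
    """Tissue concentration Ct at each sample time.

    vp=0, t0=0 gives TK (Eq. 3); vp>0 gives ETK (Eq. 4);
    t0>0 gives TK+B (Eq. 5); both give ETK+B (Eq. 6).
    t and t0 are in minutes, ktrans in 1/min.
    """
    kep = ktrans / ve
    cp_s = [shifted_cp(ti - t0, t, cp) for ti in t]
    ct = []
    for i, ti in enumerate(t):
        integral = 0.0
        for k in range(1, i + 1):  # trapezoidal rule on [t[k-1], t[k]]
            left = cp_s[k - 1] * math.exp(-kep * (ti - t[k - 1]))
            right = cp_s[k] * math.exp(-kep * (ti - t[k]))
            integral += 0.5 * (left + right) * (t[k] - t[k - 1])
        ct.append(vp * cp_s[i] + ktrans * integral)
    return ct
```

For a constant plasma concentration of 1, the TK output should approach ve(1 − e^(−kep·t)), which gives a quick sanity check of the numerical integration before any fitting is attempted.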

All data analysis and model fits were performed in-house using Matlab (MATLAB Release 2016b, The MathWorks, Inc.) utilizing the Levenberg-Marquardt (LM) algorithm for function fitting to equations [3,4,5,6]. The initial parameter guesses for the LM algorithm were [0.2 min⁻¹, 0.25, 0.02, 2 seconds] for Ktrans, ve, vp, and t0, respectively. All analysis methods based on the above four PK models and the AIFs were verified to work well in simulations, with and without added noise. In the cases where a BAT delay parameter was included in the models (TK+B and ETK+B), this was implemented by interpolation of the blood plasma concentration curve.

Statistical Analysis

For each of the baseline and repeated scans, each voxel in both ROIs (TPZ, NPZ) was fitted using all combinations of the four PK models and the three AIFs. The parameters calculated for each voxel were: Ktrans, ve, kep, the sum of squares of the residuals, the total sum of squares, and vp (only in the case of ETK and ETK+B). For Ktrans and ve, calculated values that were negative were set to zero, and values were also thresholded from above, such that the valid parameter ranges were 0 ≤ Ktrans ≤ 3 min⁻¹ and 0 ≤ ve ≤ 1.

Figure 5. Comparison of goodness of fit between analysis methods and regions of interest. Goodness-of-fit parameters were calculated for every voxel. Shown here is the mean (over patients and repeated scans) of the median goodness-of-fit parameter in each ROI. (a) The coefficient of determination (R²). (b) Akaike Information Criterion (AIC).

Figure 6. Ktrans repeatability and tumor discrimination for different AIFs and analysis methods. (a) %RC for AIFs (arterial, population, and scaled population) for the peripheral zone tumor-suspicious ROI (TPZ), and normal-appearing peripheral zone ROI (NPZ), for each analysis method. (b) p Values from paired two-sided t-tests comparing Ktrans in TPZ vs NPZ.

Figure 7. ve repeatability.
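The bookkeeping in the voxel-fitting paragraph above (every PK model paired with every AIF, followed by clamping of Ktrans and ve) can be sketched as follows. The names are hypothetical and the fitting step itself is omitted; only the enumeration and the thresholds stated in the text are shown.

```python
from itertools import product

MODELS = ("TK", "ETK", "TK+B", "ETK+B")
AIFS = ("AIF1", "AIF2", "AIF3")

# Every analysis method is one (model, AIF) pair: 4 x 3 = 12 combinations.
ANALYSIS_METHODS = list(product(MODELS, AIFS))


def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def clamp_parameters(ktrans, ve):
    """Negative values become 0; Ktrans is capped at 3 min^-1 and ve at 1."""
    return clamp(ktrans, 0.0, 3.0), clamp(ve, 0.0, 1.0)
```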

The median values of the parameters in each ROI were used to represent the result from each patient/scan/ROI. The median was used, as opposed to the mean, in order to reduce the effect of outliers.

To assess goodness of fit and the quality of each model relative to these data, both the coefficient of determination (R²) and the Akaike Information Criterion (AIC) (26), adjusted for small sample sizes, were calculated for each patient/scan/ROI. The AIC takes into account the simplicity of models, i.e., the number of free parameters, and thus discourages overfitting:

$$\mathrm{AIC} = n\cdot\ln\left(\frac{\mathrm{RSS}}{n}\right) + \frac{2Kn}{n-K-1},$$

where RSS is the sum of squares of the residuals, n is the number of time points in the data curve, and K is the number of parameters. The lower the AIC, the better the fit of the model to the data.

The repeatability coefficient (RC) is an estimate of the maximum difference likely to occur between two successive measurements on the same subject (27). If the differences between two measurements made on a subject are approximately normally distributed, in the long run we expect the absolute difference between two measurements on a subject to be no more than the RC on 95% of occasions. RC is defined as the estimated within-subject standard deviation (wSD) multiplied by 2.77 (= 1.96√2), where wSD is given by

$$wSD^2 = \frac{1}{N}\sum_{j=1}^{N}\mathrm{Var}(X_j),$$

and Var(Xj) is the variance for two repeated measurements, Xj1 and Xj2, on patient j:

$$\mathrm{Var}(X_j) = \tfrac{1}{2}\left(X_{j1}-X_{j2}\right)^2.$$

A related test of repeatability, which minimizes any existing relationship between wSD and the magnitude of the measurements, is the %RC (28). To this end, instead of using purely the wSD, an estimate for the within-subject coefficient of variation, wCV, is calculated as follows:

$$wCV^2 = \frac{1}{N}\sum_{j=1}^{N}\frac{\mathrm{Var}(X_j)}{\bar{X}_j^{\,2}},$$

where X̄j is the mean of the two repeated measurements on patient j. The %RC is defined as 2.77 · wCV · 100. The more repeatable a parameter is, the lower the %RC. Real change in a parameter is considered distinguishable from measurement error if the absolute percent difference between the two measurements is greater than %RC.

Discrimination between the parameter values in the TPZ and NPZ ROIs was assessed by a paired two-tailed t-test for each of the parameters Ktrans, ve, and kep. The null hypothesis in the t-tests performed here was that parameters from different ROIs were drawn from the same population of values, i.e., the lower the p value, the less likely it is that the values in TPZ and NPZ came from the same distribution. Since the objective here was to rank analysis methods and not to assess absolute discriminatory p values, the t-tests were not corrected for multiple comparisons. A caveat related to the discriminatory p values in this study is that the differences detected between tumor-suspicious regions and healthy peripheral zone tissue may be distinct from changes related to therapy. This study does not investigate the physiological relevance of models or the physiological interpretation of the fitted parameters.

Figure 8. kep repeatability (a) and tumor discrimination (b). See Figure 6 for caption details.
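The goodness-of-fit and repeatability statistics defined in the Statistical Analysis section reduce to a few short formulas. The following Python sketch uses hypothetical function names and takes each patient's test-retest result as a pair of ROI-median values; it is meant as an illustration of the definitions, not as the authors' code.

```python
import math


def aicc(rss, n, k):
    """Small-sample AIC as written in the text: n*ln(RSS/n) + 2*K*n/(n-K-1)."""
    return n * math.log(rss / n) + 2.0 * k * n / (n - k - 1)


def pair_variance(x1, x2):
    """Variance of two repeated measurements: 0.5 * (x1 - x2)^2."""
    return 0.5 * (x1 - x2) ** 2


def repeatability_coefficient(pairs):
    """RC = 2.77 * wSD, with wSD^2 the mean per-patient variance."""
    wsd2 = sum(pair_variance(a, b) for a, b in pairs) / len(pairs)
    return 2.77 * math.sqrt(wsd2)


def percent_rc(pairs):
    """%RC = 2.77 * wCV * 100, with wCV^2 the mean of variance / mean^2."""
    wcv2 = sum(
        pair_variance(a, b) / ((a + b) / 2.0) ** 2 for a, b in pairs
    ) / len(pairs)
    return 2.77 * math.sqrt(wcv2) * 100.0
```

For example, two patients with made-up test-retest values (1.0, 1.2) and (2.0, 1.8) give an RC of about 0.39 in the parameter's own units and a %RC of about 29%.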


RESULTS

The model fits for a single representative voxel in the tumor-suspicious ROI of patient #7 are shown in Figure 4. Only the fits for AIF1 and AIF2 are shown, since the curves for AIF3 fall on top of those for AIF2. Generally, the more parameters in the model, the less smooth the resulting fitted curve.

Figure 5 shows the mean R² and AIC for the function fits in the three ROIs. Adding parameters to the model improves the fit in all ROIs, as expected. As shown, adding a third parameter to the TK model (making it either ETK or TK+B) yields a larger improvement to the fit than adding a fourth parameter. Since the scaled population AIF fit values are practically indistinguishable from the unscaled population AIF values, it can be concluded that the goodness of fit is minimally affected by scaling the AIF.

Due to relatively large differences in fitted parameter values between patients, %RC was deemed the more appropriate statistical measure to use for repeatability here. The %RC for parameter Ktrans was generally lowest (indicating a more repeatable measurement) when using the scaled population AIF (AIF3), as shown in Figure 6a. The best repeatability for Ktrans was achieved with the simplest model, the two-parameter TK model, but the other models performed almost as well. Of the three-parameter models (ETK and TK+B), repeatability was better with the addition of BAT fitting. The ROI with the most repeatable Ktrans value was NPZ with a minimum of 59%, whereas the minimum %RC for TPZ was 83%. Discrimination between suspected pathological tissue in the TPZ ROI and NPZ tissue (NPZ ROI) according to the p values of the paired t-test was excellent, with p ≤ 0.001 for all analysis methods and AIFs (Figure 6b).

As illustrated in Figure 7, ve was the most repeatable parameter as compared to Ktrans and kep. %RC for ve was always best in the case of the scaled population AIF (AIF3), with the ETK model yielding the most repeatable measurements: 16% for NPZ and 28% for TPZ (see Table 2 also). Addition of BAT fitting to the TK model did not significantly affect the repeatability of ve, and BAT addition to the ETK model detracted from repeatability. Regarding differentiation between TPZ and NPZ, the paired t-test p value was always above 0.1.

TABLE 2. All Values of Parameter %RC and Discrimination p-values, Grouped According to Pharmacokinetic Model (TK, TK+BAT, ETK, ETK+BAT) and According to AIF Used (AIF1, AIF2, AIF3) Showing the Best Value of %RC for Each Combination Highlighted and in Bold

| Parameter | Model | %RC, TPZ ROI, AIF1 | %RC, TPZ ROI, AIF2 | %RC, TPZ ROI, AIF3 | %RC, NPZ ROI, AIF1 | %RC, NPZ ROI, AIF2 | %RC, NPZ ROI, AIF3 | p Value, AIF1 | p Value, AIF2 | p Value, AIF3 |
|---|---|---|---|---|---|---|---|---|---|---|
| Ktrans | TK | 95% | 93% | 83% | 78% | 89% | 59% | <10⁻⁴ | <10⁻⁴ | <10⁻⁴ |
| Ktrans | ETK | 105% | 119% | 95% | 94% | 115% | 84% | 10⁻⁴ | <10⁻⁴ | <10⁻⁴ |
| Ktrans | TK+BAT | 100% | 102% | 86% | 80% | 95% | 67% | 2×10⁻⁴ | <10⁻⁴ | <10⁻⁴ |
| Ktrans | ETK+BAT | 130% | 108% | 93% | 100% | 101% | 67% | 10⁻³ | 7×10⁻⁴ | 4×10⁻⁴ |
| ve | TK | 96% | 79% | 42% | 79% | 76% | 28% | 0.60 | 0.84 | 0.81 |
| ve | ETK | 90% | 74% | 28% | 75% | 72% | 17% | 0.11 | 0.26 | 0.26 |
| ve | TK+BAT | 98% | 80% | 45% | 80% | 77% | 26% | 0.49 | 1 | 0.93 |
| ve | ETK+BAT | 101% | 89% | 67% | 83% | 78% | 35% | 0.36 | 0.89 | 0.94 |
| kep | TK | 63% | 95% | 95% | 58% | 83% | 83% | 10⁻⁴ | <10⁻⁴ | <10⁻⁴ |
| kep | ETK | 71% | 103% | 103% | 68% | 89% | 89% | 10⁻³ | <10⁻⁴ | <10⁻⁴ |
| kep | TK+BAT | 73% | 103% | 104% | 69% | 87% | 87% | 3×10⁻³ | 2×10⁻⁴ | 2×10⁻⁴ |
| kep | ETK+BAT | 81% | 102% | 103% | 80% | 86% | 86% | 5×10⁻³ | 7×10⁻⁴ | 7×10⁻⁴ |

AIF, arterial input function; BAT, bolus arrival time; ETK, extended Tofts-Kety; NPZ, normal-appearing peripheral zone; RC, repeatability coefficient; ROI, region of interest; TK, Tofts-Kety; TPZ, tumor-suspicious peripheral zone.

In the case of kep, the simplest model, TK, yielded the best repeatability, but not by much, as shown in Figure 8a. The minimum %RC was 57% for NPZ ROI and 63% for TPZ. Using a population AIF (or a scaled population AIF), as opposed to a measured arterial AIF, was not beneficial in the case of kep. To further illustrate the repeatability of kep, Figure 9 shows the distribution of kep values in the TPZ ROI of all patients and all scans. Differentiation of TPZ and NPZ was very good, with p < 0.006 for all models and AIF choices (see Figure 8b). Table 2 shows the values of repeatability and discrimination for all three parameters, with the optimal combination of PK model and AIF estimation method in bold.

DISCUSSION

Figure 9. Boxplots of calculated kep values in the tumor-suspicious region for all 11 patients. The two repeated scan results are shown next to each other for each patient, with a horizontal line at the median value and the box extent indicating the 25th–75th percentiles. Values above 6 min⁻¹ are plotted at 6 min⁻¹.

While prostate DCE analysis is generally challenging, it is important to determine the best methods for rendering clinical data usable for treatment response studies. In this study, the relative repeatability of extracted PK parameters using different analysis methods and AIF choices was assessed as an important characteristic of an imaging biomarker, in order to help provide the groundwork to discriminate real change from measurement error. Since the relative repeatability of a parameter does not in itself justify its use as a biomarker, each parameter was also evaluated for its sensitivity to tissue pathological changes.

One of the conclusions from this study was that although a scaled population AIF (AIF3) was the best choice for Ktrans repeatability and ve repeatability, an arterial measured AIF (AIF1) was the best choice for kep repeatability. The result for kep makes sense in light of the AIF-scaling insensitivity of this parameter (suggested by its definition as Ktrans/ve); this has been noted by others also (12). Although McGrath et al., in their high temporal resolution study of DCE analysis on rat tumors using ETK, did not look at kep, they too found that the repeatability and the intraclass correlation coefficient for both Ktrans and ve were better with use of population average AIFs than with measured AIFs (9). In contrast, a DCE study of human liver found less scan-rescan variability when using a data-derived AIF than when using a model AIF (14). A model AIF should have a functional form that resembles the data AIFs (15); however, it is unclear whether this was the case in (14).

Overall, in this study both Ktrans and kep were excellent differentiators between tumor-suspicious tissue regions and normal-appearing regions in the prostate peripheral zone, as opposed to ve, which could not discriminate the two types of ROIs. Both Ktrans and kep are widely recognized to be elevated in cancers; for a review, see Ref. (2).

Some of the limitations of this study include the small number of patients, the use of ROIs annotated by a single reader, and the absence of targeted biopsy procedures. However, the main objective here was to test parameter repeatability, independent of the exact pathological diagnosis. The lack of localized pathology confirmation of the disease presence in the TPZ ROIs may have contributed to higher minimum %RC values, but this may not necessarily have affected the comparisons between models, AIFs, and parameters. We did not attempt to discriminate various sources that contribute to the variability of measurement beyond the DCE analysis parameters; these may have included imaging artifacts, gross motion of the patient, and local motion due to peristalsis. While those aspects are important, their independent analysis was deemed not relevant here given the small size of the present dataset.

The simplest PK model, two-parameter TK, performed best here in terms of parameter repeatability for parameters Ktrans and kep.
Other repeatability studies, performed in rodents, have not found differences between TK and ETK (29,30). Inclusion of a BAT term also did not improve repeatability, even with the use of unshifted population AIFs, a finding that warrants further investigation. Future plans also include applying this improved understanding of how analysis methods affect the repeatability of particular DCE parameters in a response assessment study of patients who have undergone therapy in clinical trials conducted at our institution.

ACKNOWLEDGMENTS

Grant Support: NIH U24 CA180918, NIH U01 CA151261, NIH P41 EB015898.

REFERENCES

1. Hara N, Okuizumi M, Koike H, et al. Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) is a useful modality for the precise detection and staging of early prostate cancer. Prostate 2005; 62:140–147.


2. Berman RM, Brown AM, Chang SD, et al. DCE MRI of prostate cancer. Abdom Radiol (NY) 2016; 41:844–853.
3. Barrett T, Gill AB, Kataoka MY, et al. DCE and DW MRI in monitoring response to androgen deprivation therapy in patients with prostate cancer: a feasibility study. Magn Reson Med 2012; 67:778–785.
4. Alic L, van Vliet M, van Dijke CF, et al. Heterogeneity in DCE-MRI parametric maps: a biomarker for treatment response? Phys Med Biol 2011; 56:1601–1616.
5. Kim H. Variability in quantitative DCE-MRI: sources and solutions. J Nat Sci 2018; 4:1–8.
6. Heye T, Davenport MS, Horvath JJ, et al. Reproducibility of dynamic contrast-enhanced MR imaging. Part I. Perfusion characteristics in the female pelvis by using multiple computer-aided diagnosis perfusion analysis solutions. Radiology 2013; 266. doi:10.1148/radiol.12120278/-/DC1.
7. Sung YS, Park B, Choi Y, et al. Dynamic contrast-enhanced MRI for oncology drug development. J Magn Reson Imaging 2016; 44:251–264.
8. Fedorov A, Fluckiger J, Ayers GD, et al. A comparison of two methods for estimating DCE-MRI parameters via individual and cohort based AIFs in prostate cancer: a step towards practical implementation. Magn Reson Imaging 2014; 32:321–329.
9. McGrath DM, Bradley DP, Tessier JL, et al. Comparison of model-based arterial input functions for dynamic contrast-enhanced MRI in tumor bearing rats. Magn Reson Med 2009; 61:1173–1184.
10. Lu H, Clingman C, Golay X, van Zijl PCM. Determining the longitudinal relaxation time (T1) of blood at 3.0 Tesla. Magn Reson Med 2004; 52:679–682.
11. Sourbron SP, Buckley DL. Classic models for dynamic contrast-enhanced MRI. NMR Biomed 2013; 26:1004–1027.
12. Li X, Cai Y, Moloney B, et al. Relative sensitivities of DCE-MRI pharmacokinetic parameters to arterial input function (AIF) scaling. J Magn Reson 2016; 269:104–112.
13. Keil VC, Maedler B, Gieseke J, et al. Effects of arterial input function selection on kinetic parameters in brain dynamic contrast-enhanced MRI. Magn Reson Imaging 2017; 40:83–90.
14. Ashton E, Raunig D, Ng C, et al. Scan-rescan variability in perfusion assessment of tumors in MRI using both model and data-derived arterial input functions. J Magn Reson Imaging 2008; 28:791–796.
15. Azahaf M, Haberley M, Betrouni N, et al. Impact of arterial input function selection on the accuracy of dynamic contrast-enhanced MRI quantitative analysis for the diagnosis of clinically significant prostate cancer. J Magn Reson Imaging 2016; 43:737–749.
16. Huang W, Chen Y, Fedorov A, et al. The impact of arterial input function determination variations on prostate dynamic contrast-enhanced magnetic resonance imaging pharmacokinetic modeling: a multicenter data analysis challenge. Tomography 2016; 2:56–66.
17. Mehrtash A, Gupta SN, Shanbhag D, et al. Bolus arrival time and its effect on tissue characterization with dynamic contrast-enhanced magnetic resonance imaging. J Med Imaging (Bellingham) 2016; 3:014503.
18. Chouhan MD, Bainbridge A, Atkinson D, et al. Estimation of contrast agent bolus arrival delays for improved reproducibility of liver DCE MRI. Phys Med Biol 2016; 61:6905–6918.
19. Fedorov A, Vangel MG, Tempany CM, Fennessy FM. Multiparametric magnetic resonance imaging of the prostate: repeatability of volume and apparent diffusion coefficient quantification. Invest Radiol 2017; 52:538–546.
20. Weinreb JC, Barentsz JO, Choyke PL, et al. PI-RADS prostate imaging reporting and data system: 2015, Version 2. Eur Urol 2016; 69:16–40.
21. Fennessy FM, Fedorov A, Gupta SN, et al. Practical considerations in T1 mapping of prostate for dynamic contrast enhancement pharmacokinetic analyses. Magn Reson Imaging 2012; 30:1224–1233.
22. Tofts PS, Kermode AG. Measurement of the blood-brain barrier permeability and leakage space using dynamic MR imaging. 1. Fundamental concepts. Magn Reson Med 1991; 17:357–367.
23. Parker GJM, Roberts C, Macdonald A, et al. Experimentally-derived functional form for a population-averaged high-temporal-resolution arterial input function for dynamic contrast-enhanced MRI. Magn Reson Med 2006; 56:993–1000.
24. Kety SS. The theory and applications of the exchange of inert gas at the lungs and tissues. Pharmacol Rev 1951; 3:1–41.
25. Tofts PS, Brix G, Buckley DL, et al. Estimating kinetic parameters from dynamic contrast-enhanced T(1)-weighted MRI of a diffusable tracer: standardized quantities and symbols. J Magn Reson Imaging 1999; 10:223–232.


26. Akaike H. A new look at the statistical model identification. IEEE Transactions on Automatic Control 1974; AC-19:716–723.
27. Bland JM, Altman DG. Measurement error. BMJ 1996; 312:1654.
28. Obuchowski NA. Interpreting change in quantitative imaging biomarkers. Acad Radiol 2018; 25:372–379.


29. Ng CS, Wei W, Bankson JA, et al. Dependence of DCE-MRI biomarker values on analysis algorithm. PLoS One 2015; 10(7):e0130168.
30. Barnes SL, Whisenant JG, Loveless ME, et al. Assessing the reproducibility of dynamic contrast enhanced magnetic resonance imaging in a murine model of breast cancer. Magn Reson Med 2013; 69:1721–1734.
