Physica Medica xxx (2017) xxx–xxx
Physica Medica journal homepage: http://www.physicamedica.com
Original paper
Simulation of images of CDMAM phantom and the estimation of measurement uncertainties of threshold gold thickness

Alistair Mackenzie a,⇑, Timothy D Eales b, Hannah L Dunn b, Mary Yip Braidley c, David R. Dance a,b, Kenneth C. Young a,b

a National Coordinating Centre for the Physics in Mammography (NCCPM), Level B, St Luke’s Wing, Royal Surrey County Hospital, Guildford GU2 7XX, UK
b Department of Physics, University of Surrey, Guildford GU2 7XH, UK
c Clinical Trials and Statistical Unit, Institute of Cancer Research, London SW7 3RP, UK
Article info

Article history:
Received 27 March 2017
Received in revised form 23 May 2017
Accepted 16 June 2017
Available online xxxx

Keywords: CDMAM, Simulation, Mammography, Uncertainties

Abstract

Purpose: To demonstrate a method of simulating mammography images of the CDMAM phantom and to investigate the coefficient of variation (CoV) in the threshold gold thickness (tT) measurements associated with use of the phantom.

Methods: The noise and sharpness of Hologic Dimensions and GE Essential mammography systems were characterized to provide data for the simulation. The simulation method was validated by comparing the tT results of real and simulated images of the CDMAM phantom for three different doses and the two systems. The detection matrices produced from each of 64 images using CDCOM software were randomly resampled to create 512 sets of 8, 16 and 32 images to estimate the CoV of tT. Sets of simulated images for a range of doses were used to estimate the CoVs for a range of diameters and threshold thicknesses.

Results: No significant differences were found for tT or the CoV between real and simulated CDMAM images. It was shown that resampling from 256 images was required for estimating the CoV. The CoV was around 4% using 16 images for most of the phantom but is over double that for details near the edge of the phantom.

Conclusions: We have demonstrated a method to simulate images of the CDMAM phantom for different systems at a range of doses. We provide data for calculating uncertainties in tT. Any future review of the European guidelines should take into consideration the calculated uncertainties for the 0.1 mm detail.

© 2017 Associazione Italiana di Fisica Medica. Published by Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.ejmp.2017.06.019
1120-1797/© 2017 Associazione Italiana di Fisica Medica. Published by Elsevier Ltd. All rights reserved.
Please cite this article in press as: Mackenzie A et al. Simulation of images of CDMAM phantom and the estimation of measurement uncertainties of threshold gold thickness. Phys. Med. (2017), http://dx.doi.org/10.1016/j.ejmp.2017.06.019

⇑ Corresponding author.

1. Introduction

Mammography is a technically challenging modality, as the imaging system requires high resolution and the ability to distinguish cancers with a similar density to the background tissue. In addition, the breast is known to be a radiosensitive organ. The design and set up of a mammography system therefore needs to be carefully balanced between image quality and dose, and it is important to ensure adequate image quality in any breast screening programme. When the quality of a screening programme drops for technical reasons, the consequences for cancer detection can be profound [1–3].

The European Commission has provided guidance for ensuring the quality of the whole breast screening programme [4], including guidelines on quality control of the image quality and dose of mammography systems. The main image quality control (QC) test uses the CDMAM contrast detail phantom (Artinis Medical Systems BV, Nijmegen, Netherlands). This phantom (Fig. 1) comprises gold disks with a range of diameters (0.06 to 2.0 mm) and thicknesses (0.03 to 2.0 µm) placed onto a 0.5 mm thick sheet of aluminium. Each cell in the phantom contains two disks, one in the centre and the other in one of the corners. The phantom is normally imaged with 40 mm of polymethyl methacrylate (PMMA), and the images are acquired at the factors used for a 50 mm thickness of PMMA (considered equivalent to a 60 mm thick compressed breast). The output from the measurements is the threshold gold thickness for each diameter; the lower the value, the better the image quality. The European Guidelines [4] define two standards based on threshold gold thickness: 1) acceptable: the minimum acceptable standard, and 2) achievable: a system should ideally operate better than this level. In addition, they recommend that 16 images are acquired to keep the uncertainties at a reasonable level. Importantly, the threshold gold thickness has been shown to relate to the detection of calcification clusters and weakly to the detection of non-calcification lesions [5].

Fig. 1. Photograph of CDMAM phantom.

There has been criticism of the use of a CDMAM phantom in a QC test, including that it is expensive, that it is time consuming to acquire sixteen images, and that there are differences between individual CDMAM phantoms. There is no doubt that there has been some variability between the CDMAM phantoms over the years of manufacture. Young et al. [6] compared four CDMAM phantoms and found differences in the measured threshold gold thickness of up to 10%. It is reported that there may be differences in thickness and diameter from the nominal values, and Bijkerk et al. [7] showed issues in the manufacture. It should be noted that the manufacturer claims to have improved the manufacturing process to produce phantoms where the thickness and diameter of the disks are consistently closer to the nominal values. The CDMAM phantoms are not cheap, and although other phantoms are available, none seems to be as sensitive [8–10] or to have as much supporting evidence for its use. The time-consuming element does not appear to be a major issue: in our experience the acquisition can easily be undertaken within 10 min for most digital mammography systems, although it takes longer for computed radiography systems. The use of automated reading removes the effort and time required for manual scoring of the images. It is also possible to reduce the number of images, but the uncertainties may then be unacceptably high [6].

One area that has not been adequately investigated is the random uncertainties in the determination of the threshold gold thickness (tT). Young et al. [6] and Yang and Van Metter [11] estimated the coefficient of variation (CoV) in tT by resampling data from a large number of images. Young et al. [6] used 64 images at one dose level for 3 systems, while Yang and Van Metter [11] used 36 images for 3 systems and one system at half dose. Yang and Van Metter [11] noted that the resampling method underestimates the CoV; this may be expected, as the data points are used multiple times. Neither study systematically examined the uncertainties across the phantom, in particular at the edge of the CDMAM phantom, where the uncertainties are expected to be higher because the calculation method has fewer points. However, improving the accuracy of the measurements and investigating the uncertainties further would require large numbers of images.

An alternative to manually acquiring large numbers of images is to simulate the images of the CDMAM phantom completely, so that very large numbers of images can be created. Yip et al. [12] described a method of simulating images of the phantom using a template and detailed knowledge of the noise and sharpness of the imaging systems. In this paper we have followed the same approach but used the improved image and image noise simulation techniques developed by Mackenzie et al. [13,14]. The methods were designed for adapting acquired images but can be used for adapting mathematically created images.

The purpose of this study is to demonstrate and validate a method for simulating CDMAM images, and to use it to investigate the experimental uncertainties in the threshold gold thickness measurements associated with the CDMAM phantom by making use of the ability to create large numbers of independent images. For this purpose we have simulated two systems with different imaging properties at a wide range of doses and have examined the uncertainties over a wide range of image qualities.

2. Methods
2.1. Characterisation of systems

Two commonly used mammography systems, a Selenia Dimensions (Hologic Inc, Bedford, USA) and a Senographe Essential (GE Healthcare, Buc, France), were characterised to provide data to simulate images of the CDMAM v3.4 phantom. For both systems the signal transfer properties, modulation transfer function, noise, flat field correction, glare and scatter were measured. The radiographic factors used in the simulation were those selected by each system under automatic exposure control for a 50 mm thick slab of PMMA (Table 1).

2.1.1. Signal transfer properties

The simulation methodology and characterisation require the images to be linearised to absorbed energy per unit area (E). The conversion factor CK,E, which relates the absorbed energy per unit area to the incident air kerma at the detector, was required for this image linearisation and was calculated by dividing the calculated value of E by the calculated incident air kerma using the method of Mackenzie et al. [14].

For each system, flat field images were acquired at a beam quality matched to that used to acquire the CDMAM images. An attenuator comprising a 40 mm thick PMMA slab and a 0.5 mm thick aluminium filter was placed at the exit port of the tube. The radiographic factors were 31 kV, W anode, 0.05 mm Rh filter for the Hologic system and 29 kV, Rh anode, 0.025 mm Rh filter for the GE system. The anti-scatter grid was removed for both systems and the breast support was removed for the GE system. The compression paddle was included in the beam and positioned as high as possible for each system. The radiation field was collimated to about 100 mm × 100 mm. The air kerma was measured using a Radcal Accu-Pro (Radcal Corp., Monrovia, CA) dosimeter at 60 mm from the chest wall and 100 mm above the position of the breast support. The detector air kerma (DAK) was calculated by correcting the measured air kerma to the incident air kerma at the detector entrance plane using the inverse square law.
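The inverse square law correction used to obtain the DAK can be sketched as below. This is a minimal illustration only: the function name and the two source distances in the example call are hypothetical placeholders and are not values reported in this study.

```python
def detector_air_kerma(measured_kerma_ugy, source_to_dosimeter_mm, source_to_detector_mm):
    """Scale an air kerma reading to the detector entrance plane (inverse square law)."""
    if source_to_dosimeter_mm <= 0 or source_to_detector_mm <= 0:
        raise ValueError("distances must be positive")
    # Air kerma falls with the square of the distance from the focal spot.
    return measured_kerma_ugy * (source_to_dosimeter_mm / source_to_detector_mm) ** 2


# Hypothetical geometry (assumed, not from the paper): dosimeter at 550 mm and
# detector entrance plane at 670 mm from the focal spot.
dak_ugy = detector_air_kerma(100.0, 550.0, 670.0)
```

Because the dosimeter sits closer to the focal spot than the detector, the corrected value is always smaller than the reading.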
In accordance with IEC 62220-1-2 [15], flat field images were acquired at DAKs between 20 and 500 µGy. The air kerma values were then converted to absorbed energy per unit area (E) using CK,E. The average pixel values (PV) were measured within a region of interest (ROI) of dimension 50 × 50 mm², positioned 60 mm from the chest wall edge and laterally centred within the image. The signal transfer properties (STP) were then calculated as the relationship between the pixel value and E.

Table 1
Radiographic factors used for acquisition of images of the CDMAM phantom and for simulation with 40 mm PMMA.

System                        Factors         Low dose   Standard   High dose
GE Healthcare Essential       29 kV, Rh/Rh    32 mAs     63 mAs     125 mAs
Hologic Selenia Dimensions    31 kV, W/Rh     70 mAs     140 mAs    280 mAs

2.1.2. Glare and scatter

It is necessary to account for glare and scatter in the simulation model [13]. The glare-to-primary ratio (GPR) was measured using a lead beam stop technique for each system with five lead disks of diameter between 1 and 3 mm [16]. The images were acquired with 2 mm Al at the exit port of the tube, without the compression paddle. One difficulty when measuring glare in a clinical system is that the detector cover, anti-scatter grid and breast support are close to the detector. It is not normal to remove these for measurements on clinical systems, and so the scatter produced by these objects was included in the glare measurement. For reasons of practicality, this was accepted as part of the glare [16].

The scatter-to-primary ratio (SPR) was measured for 40 mm of PMMA plus the 0.5 mm sheet of aluminium and the compression paddle with the appropriate radiographic factors, again using a lead beam stop technique. The measured SPR ($S_p^M$) includes glare, and so each measurement was corrected by removing the GPR ($G_p$) using Eq. (1) to give the corrected SPR ($S_p$) [13]:

$$S_p = \frac{1 + S_p^M}{1 + G_p} - 1 \tag{1}$$

The 3 mm PMMA sheet that is part of the CDMAM phantom was not included in this measurement and so the measured SPR will be slightly smaller than reality.
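As a numerical check of the glare correction, the sketch below assumes that the corrected SPR is obtained as (1 + measured SPR)/(1 + GPR) − 1, which is the form that reproduces the corrected values listed in Table 2 from the measured values in the same table. The function name is illustrative.

```python
def glare_corrected_spr(measured_spr, gpr):
    """Remove the glare contribution from a measured scatter-to-primary ratio."""
    return (1.0 + measured_spr) / (1.0 + gpr) - 1.0


# Measured values taken from Table 2: (measured SPR including glare, measured GPR).
table_2 = {
    "GE Healthcare Essential": (0.207, 0.082),
    "Hologic Selenia Dimensions": (0.140, 0.029),
}

corrected = {
    system: round(glare_corrected_spr(spr_m, gpr), 3)
    for system, (spr_m, gpr) in table_2.items()
}
```

With these inputs the rounded results agree with the tabulated corrected SPR values of 0.116 and 0.108.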
2.1.3. Flat field correction map

The exposure varies across the image due to the anode heel effect and variations in the amount of scatter. The size of this effect was measured for each system by estimating the flat field correction. A large PMMA block, which had been used for the flat field calibration of the detector, was imaged five times using the setup and factors that were used during calibration. The noise across the detector was calculated and expressed as a variance map [17]. Provided that the quantum noise is the dominant noise source, the variance map will be proportional to the inverse of the flat field correction applied to the image. A second order two-dimensional fit was applied to the variance map to estimate the flat field correction map.

2.1.4. Noise power spectra, noise coefficients and beam quality correction factor

Multiple images were acquired with 40 mm PMMA on the breast support and 0.5 mm Al at the tube exit port at 29 kV, Rh/Rh and 31 kV, W/Rh for the GE and Hologic systems respectively, over a range of mAs settings giving incident air kermas from approximately 20 to 500 µGy. The X-ray beam was collimated to approximately 100 mm × 100 mm. The anti-scatter grid was used. Each image was linearised using the inverse of the STP as discussed above. The noise power spectrum (NPS) was calculated over an ROI of dimension 50 × 50 mm², positioned 60 mm from the chest wall edge and laterally centred within the image. The ROI was split into sub-ROIs of 128 × 128 pixels and the NPS was calculated for each sub-ROI and averaged for each dose level [18]. Because the pixel values are linearised to absorbed energy per unit area and the NPS is not normalised, the units of the NPS are GeV² mm⁻² [19]. A model for the NPS (W) has been developed for a range of doses and beam qualities by Mackenzie et al. [14].
$$W(u,v;k) = \omega_e^{o}(u,v) + \mathrm{BQ}(u,v;k)\,\omega_q^{o}(u,v)\,\frac{E_A}{E_o} + \omega_s^{o}(u,v)\left(\frac{E_A}{E_o}\right)^{2} \tag{2}$$

where $\omega_e^{o}$, $\omega_q^{o}$ and $\omega_s^{o}$ (with units of GeV² mm⁻²) are referred to as the electronic, quantum and structure noise coefficients respectively at a reference absorbed energy per unit area $E_o$ (1 GeV mm⁻²), $E_A$
is the signal at the detector expressed as absorbed energy per unit area, BQ(u,v;k) is a dimensionless correction factor for quantum noise at beam quality k relative to the quantum noise at a reference beam quality, and k is defined as the mean photon energy incident on the detector. The three noise coefficients were calculated by fitting a quadratic relationship for each spatial frequency of the NPS against E. The BQ factor is a correction for images acquired at conditions that differ from the reference beam quality. In this case, the difference was an extra 3 mm of PMMA (the cover of the CDMAM phantom) compared with the reference beam quality. For such a small difference, the BQ factor has very little effect and is close to one.

2.1.5. Modulation transfer function

The presampled modulation transfer function (MTF) was used in this work [20]; this includes blurring associated with the aperture size, the convertor and any geometric blur. The MTF was measured using a 120 × 60 × 0.8 mm³ steel edge suspended in air 20 mm above the breast support, and the anti-scatter grid was not used. The edge was positioned at an angle of between 2° and 5° to the lateral and to the chest wall to nipple directions. A supersampled edge spread function (ESF) was calculated from each image and smoothed by fitting a monotonic curve to the ESF using the method described by Maidment and Albert [21]. The ESF was then differentiated to obtain the line spread function (LSF). The Fourier transform of the LSF was calculated and normalised to one at zero frequency to give the presampled MTF.

2.2. Simulation of images of CDMAM phantom

The simulation of images of the CDMAM phantom involved creating a template of the phantom without disks and then inserting disks at the correct locations in the phantom with the correct contrast for the gold thickness and diameter and the required beam quality. The image was then blurred, and scatter and noise were added. Fig. 2 shows a schematic representation of the simulation process.

2.2.1. Creation of template of the CDMAM phantom

A high dose image of the CDMAM phantom was acquired using a Hologic Selenia system. The noise and details were removed from the image using thresholding, leaving a template of the grid with a pixel pitch of 70 µm [12]. The template was resampled such that the grid appears as if acquired at 20 mm above the breast support with a pixel pitch of half the pixel pitch of the detector being simulated. Saunders and Samei [22] suggested that ideally finer sampling at a seventh of the pixel pitch should be used for simulations. Finer sampling was tested: no differences were found in the threshold gold thickness, but the simulation time was considerably longer.

The background signal for the simulation was calculated from the pixel values of real images of the CDMAM phantom with an estimate of the amount of scatter removed. The pixel values behind the CDMAM lead grid were then set to 83% of the background signal. The disks were placed in the simulated phantom with one detail in the centre of each cell and one at a corner, corresponding to the real phantom. The contrast of the inserted disk was calculated for the energy absorbed in the detector from primary X-ray photons. The incident X-ray spectra to the detector were calculated
using the thickness and attenuation coefficients of the gold disk, all other materials in the beam and the radiographic factors used. In addition to the phantom and PMMA, it was assumed that a 1.2 mm thick carbon fibre breast support, a beryllium window with a total thickness of 1 mm (0.5 mm of Be is included in the initial X-ray spectra data [23]) and a 2.4 mm thick polycarbonate paddle were in the beam. Partial areas were used to calculate the signal in pixels not fully covered by a gold disk.

Fig. 2. Summary of process for simulation of images of CDMAM phantom.

2.2.2. Adapting the template of the CDMAM phantom

The appropriate MTF for the image includes the aperture function associated with the pixel size, geometric blurring and blurring within the X-ray convertor layer. The blurring associated with the pixel size is included in the creation of the CDMAM template plus the down-sampling that is undertaken later. The blurring associated with the convertor and geometric blurring was calculated by dividing the measured presampled MTF by the aperture function. The aperture function is zero at twice the Nyquist frequency. Thus, to avoid dividing by zero, it was necessary to extrapolate the MTF from 1.5 times the Nyquist frequency using an exponential curve [24]. It has been shown that the measured MTF does not fully incorporate the glare, and so the MTF was adapted to include glare [13]. A 2D MTF was then created by rotationally averaging the two orthogonal MTFs. The resulting image IA was padded such that the array size was a power of two and blurred (IB) by multiplying the padded image in frequency space by the MTF associated with the convertor layer (HC).
$$I_B(x,y) = \mathcal{F}^{-1}\left\{\mathcal{F}\left\{I_A(x,y)\right\}\,H_C(u,v)\right\} \tag{3}$$

where $\mathcal{F}$ is the Fourier transform and $\mathcal{F}^{-1}$ is the inverse Fourier transform. After this process the padding was removed. To simulate different locations of the phantom, the array was rotated by a randomly selected angle between −2° and 2°. The image was then re-binned by a ratio of 2:1 to create image Ibin, with the starting point of the binning randomly varied between images.

2.2.3. Addition of noise and scatter

The signal from the scattered radiation was estimated from the product of the mean pixel value and the measured SPR and added to image Ibin. The resultant image was multiplied by the inverse of
the flat field correction map to give image IO. This effectively added in the anode heel effect to the signal. The noise coefficients (from Eq. (2)) were converted into an image of noise using methods previously described [25]. These were then adjusted to correspond to the noise for the actual pixel values in IO and added to IO to produce an image IS which includes noise (Eq. (4)). It was assumed that the noise from the detected scattered photons has the same noise coefficient as the primary X-ray photons, as the difference in average photon energy between scattered and primary photons was very small.
$$I_S(x,y) = I_O(x,y) + I_e(x,y) + I_q(x,y)\sqrt{I_O(x,y)} + I_s(x,y)\,I_O(x,y) \tag{4}$$

where $I_e$, $I_q$ and $I_s$ are the noise coefficients converted to noise images at 1 GeV mm⁻². Finally, the flat field correction matrix was reapplied to the image to remove gross variation across the image (e.g. due to the anode heel effect).
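The way Eq. (4) scales the three noise images with signal can be illustrated per pixel as below. This is a simplified sketch under stated assumptions: the noise is drawn as uncorrelated Gaussian values rather than being shaped by the measured noise coefficients, the signal is expressed in units of the reference energy (1 GeV mm⁻²), and the function names and standard deviations are illustrative only.

```python
import math
import random
import statistics


def add_noise(i_o, i_e, i_q, i_s):
    """Eq. (4) for one pixel: signal plus electronic, quantum and structure noise."""
    return i_o + i_e + i_q * math.sqrt(i_o) + i_s * i_o


def simulate_pixels(i_o, sd_e, sd_q, sd_s, n, seed=0):
    """Draw n noisy realisations of a pixel with noise-free signal i_o (white noise assumed)."""
    rng = random.Random(seed)
    return [
        add_noise(i_o, rng.gauss(0.0, sd_e), rng.gauss(0.0, sd_q), rng.gauss(0.0, sd_s))
        for _ in range(n)
    ]


def model_variance(i_o, sd_e, sd_q, sd_s):
    """Variance implied by Eq. (4): constant, linear and quadratic terms in the signal."""
    return sd_e ** 2 + sd_q ** 2 * i_o + sd_s ** 2 * i_o ** 2


samples = simulate_pixels(100.0, 2.0, 1.0, 0.05, 20000)
sample_variance = statistics.pvariance(samples)
expected_variance = model_variance(100.0, 2.0, 1.0, 0.05)
```

The sample variance follows the same constant, linear and quadratic dependence on signal as the electronic, quantum and structure terms of the noise model in Eq. (2).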
2.3. Measurement of threshold gold thickness from images of CDMAM phantom

In this study the images were read automatically using CDCOM software version 1.6 (www.euref.org) and CDMAM analysis software version 2.1.0 (NCCPM, Guildford, UK). The CDCOM programme identified the location of the grid of the CDMAM phantom in the image. The four corners of each cell were then tested to find the corner most likely to contain the disk. This was repeated with the three disk-absent corners and the centre disk for each cell. Two matrices were thus created to indicate which disks were correctly detected (detection matrices): one matrix for the central disks and the other for the corner disks. The CDMAM analysis software averaged the detection matrices for all of the images to calculate the detection fraction for each disk. Psychometric curves were fitted to the detection fraction against detail contrast, and from these the threshold gold thickness for a detection fraction of 0.625 was calculated for diameters between 0.1 and 1 mm (0.08 mm optional) using methods described by Young et al. [6]. The automatic software detects more disks correctly than human observers, and the results were therefore further corrected to estimate the results expected for human observers [26].
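The threshold definition can be illustrated with a simple curve fit. The sketch below is not the CDMAM analysis software: it assumes a logistic psychometric curve that rises from the four-alternative chance level of 0.25 to 1, expressed in log thickness rather than detail contrast, and it uses a brute-force least-squares search. The function names, grids and synthetic detection fractions are illustrative.

```python
import math


def psychometric(thickness_um, threshold_um, slope):
    """Assumed curve: 0.25 at chance, 1.0 at certainty, 0.625 at the threshold."""
    x = slope * (math.log(thickness_um) - math.log(threshold_um))
    return 0.25 + 0.75 / (1.0 + math.exp(-x))


def fit_threshold(thicknesses_um, fractions, threshold_grid, slope_grid):
    """Brute-force least-squares fit; returns (threshold_um, slope)."""
    best = None
    for threshold in threshold_grid:
        for slope in slope_grid:
            sse = sum(
                (psychometric(t, threshold, slope) - p) ** 2
                for t, p in zip(thicknesses_um, fractions)
            )
            if best is None or sse < best[0]:
                best = (sse, threshold, slope)
    return best[1], best[2]


# Synthetic detection fractions generated from a known threshold (illustrative values).
thicknesses = [0.10, 0.13, 0.16, 0.20, 0.25, 0.36, 0.50, 0.71, 1.00]
fractions = [psychometric(t, 0.30, 3.0) for t in thicknesses]

threshold_grid = [i / 100 for i in range(5, 101)]  # 0.05 to 1.00 um
slope_grid = [i / 2 for i in range(2, 13)]         # 1.0 to 6.0
fitted_threshold, fitted_slope = fit_threshold(
    thicknesses, fractions, threshold_grid, slope_grid
)
```

The value 0.625 is the midpoint between the chance level (0.25) and perfect detection (1.0), which is why the assumed curve passes through it at the threshold thickness.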
2.4. Validation of simulation

2.4.1. Acquisition and simulation of images of the CDMAM phantom

Images of the CDMAM 3.4 test phantom were acquired for comparison with the simulated images for the Hologic and GE systems. The phantom was imaged sandwiched between two 20 mm thick PMMA blocks. Sixteen images of this phantom were acquired on both systems using the factors shown in Table 1. Sixteen images of the CDMAM phantom were then simulated for each system at the three sets of radiographic factors. The real and simulated images were then read automatically using CDMAM Analysis software without including the 0.08 mm detail. A Student t-test was undertaken to test whether there were any significant differences between the threshold gold thicknesses determined for the real and simulated images for the Hologic and GE systems.

2.5. Calculation of uncertainties in threshold gold thickness

2.5.1. Comparison of calculated CoVs of real with simulated images

The CoVs for the measurement of the threshold gold thickness were determined using the method described by Young et al. [6]. Sixty-four images of the CDMAM phantom were collected on each system at the standard dose (Table 1) and 64 matching images were simulated for each. CDCOM was applied to each image to output the individual detection matrices for each CDMAM phantom image, in this case 128 matrices per system for each of the real and simulated image sets. The CDMAM Analysis software was adapted to create sixteen detection matrices (equivalent to eight images) by randomly selecting detection results from the 128 matrices. Although each cell produces two results, they have been shown to be independent [11], so it is acceptable to sample randomly across the two matrices. The threshold gold thickness for each diameter was calculated for this data set. This was repeated 512 times and the coefficient of variation of the threshold gold thickness was calculated for each diameter. This was then repeated by creating 512 sets of 16 and 32 images.
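The resampling procedure can be sketched as follows. This is an illustration under stated assumptions, not the adapted CDMAM Analysis software: the pool is a synthetic set of 128 detection matrices with 20 cells each, and `toy_estimator` is a hypothetical stand-in (the overall detection fraction) for the psychometric fit that yields the threshold gold thickness.

```python
import random
import statistics


def resample_set(pool, n_matrices, rng):
    """Build n_matrices new matrices, drawing each cell result from a random pool matrix."""
    n_cells = len(pool[0])
    return [
        [rng.choice(pool)[cell] for cell in range(n_cells)]
        for _ in range(n_matrices)
    ]


def resampled_cov(pool, n_matrices, estimator, n_sets=512, seed=0):
    """Coefficient of variation of an estimator over n_sets resampled data sets."""
    rng = random.Random(seed)
    values = [estimator(resample_set(pool, n_matrices, rng)) for _ in range(n_sets)]
    return statistics.stdev(values) / statistics.mean(values)


def toy_estimator(matrices):
    """Hypothetical stand-in for the threshold fit: overall detection fraction."""
    total = sum(sum(matrix) for matrix in matrices)
    return total / (len(matrices) * len(matrices[0]))


# Synthetic pool of 128 detection matrices (1 = disk correctly detected).
pool_rng = random.Random(1)
pool = [[1 if pool_rng.random() < 0.6 else 0 for _ in range(20)] for _ in range(128)]

cov_16_matrices = resampled_cov(pool, 16, toy_estimator)  # equivalent to 8 images
cov_64_matrices = resampled_cov(pool, 64, toy_estimator)  # equivalent to 32 images
```

Because results in the pool can be drawn more than once, the resampled sets are not independent of one another, which is the reason the resampled CoV may underestimate the true value.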
The measured CoVs of the threshold gold thickness for the real and simulated images of the CDMAM phantom were then compared.

2.5.2. Validation of technique for calculating CoVs

The calculation of uncertainties by resampling from a limited data set, such as bootstrapping, is well established [27]. This type of analysis was undertaken for this study by sampling the matrices to estimate the CoVs in the measurements of threshold gold thickness [6,11]. Using this method means that the same data from the measured detection matrices may be used multiple times in the created matrices, and so the measured CoVs could potentially be lower than reality. The advantage of the simulation method is that we can create a much larger number of images than can reasonably be acquired on real imaging systems. In this case, 4096 images of the CDMAM phantom were created to simulate acquisition by the Hologic system at the AEC dose. The threshold gold thickness was then calculated for 256 groups of 16 images, so that no data were used more than once. The CoV for the threshold gold thickness was then calculated for each diameter. The results can be considered to be the true CoV. The 4096 images were then reorganised into groups of 32. The uncertainty of the threshold gold thickness for each group of 32 was calculated using the resampling technique described previously. This was then compared to the true value. This was then repeated by grouping the images into sets of 64, 128, 256 and 512 images.

2.5.3. Variation of the CoV across the CDMAM phantom image

To examine the CoVs across a wide area of the CDMAM phantom image, 256 images were created at 17 dose levels from 8 to 960 mAs for the Hologic system and 17 dose levels from 5 to 440 mAs for the GE system. This equates to a mean glandular dose for a 60 mm thick compressed breast at exposure levels from 0.1 to 14 mGy for the Hologic system and 0.1 to 8 mGy for the GE system. This range of exposure levels covered a wide range of tT for each disk diameter, beyond the values found for clinical systems. For each dose level, the CoVs were then calculated for each diameter at the measured threshold gold thickness using the resampling technique by creating 512 independent sets of 16 images. The measured results were interpolated to estimate the CoV for each nominal thickness and diameter value in the CDMAM phantom within the range of the measurements. The CoVs at the achievable and acceptable limits for the 0.1, 0.25, 0.5 and 1.0 mm diameter disks were calculated by interpolating results from the different dose levels for 16 images, as recommended by the European Guidelines [4]. In addition, the effect of the number of images used (1, 2, 4, 8, 16 and 32 images) on the CoV was tested for the 0.1 mm diameter disks.

3. Results

3.1. Characterisation of Hologic and GE systems

3.1.1. Signal transfer properties

The calculated CK,E values were 0.139 and 0.119 GeV mm⁻² µGy⁻¹ for the Hologic and GE systems respectively. These values were used to convert the measured values of air kerma at the detector to absorbed energy in the detector per unit area. The measured STPs for the relationship between pixel value and E were both straight lines with coefficients of determination greater than 0.998.

3.1.2. Glare-to-primary ratio and scatter-to-primary ratio

The measured values of GPR and SPR are listed in Table 2.

3.1.3. Modulation transfer function

Fig. 3 shows the measured presampled MTF for both systems measured at a height of 20 mm above the breast support and the convertor MTF (presampled MTF with the aperture function removed and including geometric unsharpness). The presampled MTFs are very similar to those published by Mackenzie et al. [28], but slightly higher than those shown by Marshall et al. [29]; some of the difference may be due to differences in the calculation method [30]. Fig. 4 shows the MTF with the low frequency drop altered to match the expected value for the measured GPR. This is the MTF (HC) that was applied to the CDMAM phantom template.

3.1.4. Noise coefficients

The measured noise coefficients for the electronic, quantum and structure noise of both systems are shown in Fig. 5. These are very similar to previously published results for similar systems [14,28]. The images were linearised to absorbed energy per unit area, and so caution is required when comparing electronic and quantum noise coefficients acquired using different radiographic factors. The electronic noise for the Hologic system is higher than that for the GE system even taking radiographic factors into consideration. The shapes of the quantum noise coefficient are different due to the
higher MTF of the Hologic system, which also means that there is a large amount of aliased noise at higher spatial frequencies. The structure noise is higher for the Hologic system than for the GE system.

Table 2
Scatter-to-primary ratio and glare-to-primary ratio for 40 mm PMMA with 0.5 mm Al.

System                        Measured GPR (Gp)   Measured SPR & GPR   SPR corrected for glare (Sp)
GE Healthcare Essential       0.082 ± 0.003       0.207 ± 0.008        0.116 ± 0.007
Hologic Selenia Dimensions    0.029 ± 0.001       0.140 ± 0.006        0.108 ± 0.006

Fig. 3. Average of orthogonal presampled MTFs, MTF associated with pixel pitch (aperture function) and the convertor MTF for a) GE Essential and b) Hologic Dimensions.

Fig. 4. Average of orthogonal pre-sampled MTFs (HC) for convertor layer and adapted for glare. Data shown up to twice the Nyquist frequency for the GE and Hologic systems.

3.2. Validation of simulation of images of CDMAM phantom

Fig. 6 shows the threshold gold thicknesses for the real and simulated images of the CDMAM phantom at three dose levels. Table 3 shows the average difference between the threshold gold thicknesses of the real and simulated images. The differences are within the uncertainties of the measurement.

3.3. Comparison of measured uncertainties for real and simulated images

Fig. 7 shows the CoV of the threshold gold thickness for each diameter calculated from the simulated and real images for the AEC dose level. The CoVs are similar for the real and simulated images, which indicates that we can use this simulation method to further investigate uncertainties associated with threshold gold thickness measurements. The results are also similar to those published by Young et al. [6]. There is a clear pattern in the measured
Fig. 5. Radially averaged noise coefficients for GE and Hologic systems, images linearised to E. a) electronic noise, b) quantum noise, c) structure noise.
Fig. 6. Threshold gold thickness of real and simulated images of the CDMAM phantom over a range of doses using 16 real and simulated CDMAM images. Error bars are two standard errors [using values from Fig. 10] a) GE Essential, b) Hologic Dimensions.
Table 3 Average difference (and range) between threshold gold thicknesses for the simulated and the real images.
Dose | GE | Hologic
Low | 7.5% (4.9 to 8.7%) | 4.9% (0.5%–9.2%)
Standard | 2.6% (3.7% to 6.7%) | 5.5% (6.1%–13.6%)
High | 0.6% (2.3% to 3.2%) | 2.1% (9.7%–6.5%)
uncertainties: those for the largest and smallest diameters are greater than those for the medium-sized diameters by up to about a factor of two.
3.4. Validation of technique for calculating uncertainties
Fig. 8 shows the CoV calculated for four disk sizes using the method described by Young et al. [6] for resampling from different numbers of images. These results are compared in the figure to the average CoV in the measured threshold thickness determined for the same images split into 256 groups of 16 independent simulated images of CDMAM phantom for the Hologic system. The uncertainties in CoV for the 256 threshold gold thickness results were estimated using bootstrapping. It can be seen in Fig. 8 that the measured CoV is closer to the 256 sets of 16 images when the resampling is undertaken from more images. The CoV of the ‘resampling’ method used by Young et al. [6] for 64 images was on average 13% lower than that estimated using 256 sets of 16 images. For the following measurements it was decided to use resampling from 256 images due to the practicalities of time for creating and reading the images. This is acceptable as even at the worst case the difference from the ‘true’ value is 0.5% and was within the measurement uncertainties. The calculation of the CoV using the approach of Young et al. [6] was repeated for 16 sets of 256 simulated Hologic images. The CoV of the CoV was found to be on average 5.2%.
3.5. Variation of the CoV across the CDMAM phantom image
The CoVs were measured for 17 different dose levels for each system. As an example, Fig. 9 shows the measured CoVs for the 0.25 mm disks for both the GE and Hologic systems for each threshold thickness measured for each dose. The results shown here are typical for the other diameters, and show that there is no difference in the relationship between the CoV and tT for the two systems. It is therefore valid to combine the results of the two systems.
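The comparison in Section 3.4 between resampling from a pool of images and using independent groups of 16 images can be sketched as follows. This is a minimal illustration with synthetic data, not the procedure applied to the CDCOM detection matrices: the per-image scores are hypothetical, the mean of each set stands in for the threshold gold thickness fit, and drawing sets with replacement is an assumption. The function names are illustrative only.

```python
import numpy as np


def cov_percent(values):
    """Coefficient of variation (sample standard deviation / mean) in percent."""
    values = np.asarray(values, dtype=float)
    return 100.0 * values.std(ddof=1) / values.mean()


def cov_from_resampling(per_image_scores, set_size, n_sets, rng):
    """CoV of a per-set statistic over sets resampled from a pool of images.

    The mean score is a placeholder for the threshold thickness fit.
    Sets are drawn with replacement (an assumption of this sketch).
    """
    pool = np.asarray(per_image_scores, dtype=float)
    idx = rng.integers(0, pool.size, size=(n_sets, set_size))
    return cov_percent(pool[idx].mean(axis=1))


def bootstrap_se(values, statistic, n_boot, rng):
    """Bootstrap standard error of a statistic of the values."""
    values = np.asarray(values, dtype=float)
    idx = rng.integers(0, values.size, size=(n_boot, values.size))
    return float(np.std([statistic(values[row]) for row in idx], ddof=1))


rng = np.random.default_rng(0)
# Hypothetical per-image scores for 4096 synthetic images (256 groups of 16).
scores = rng.normal(loc=1.0, scale=0.2, size=4096)

# Reference: CoV across 256 independent groups of 16 images.
group_means = scores.reshape(256, 16).mean(axis=1)
cov_groups = cov_percent(group_means)
cov_groups_se = bootstrap_se(group_means, cov_percent, 1000, rng)

# Resampling estimate: 512 sets of 16 drawn from a pool of 256 images.
cov_resampled = cov_from_resampling(scores[:256], 16, 512, rng)

print(f"independent groups: {cov_groups:.2f}% (bootstrap SE {cov_groups_se:.2f}%)")
print(f"resampled from 256: {cov_resampled:.2f}%")
```

Replacing the placeholder mean with the full psychometric fit would be needed before any quantitative comparison with Fig. 8.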
Fig. 7. CoV of threshold gold thickness for each disk diameter for real and simulated images of CDMAM phantom at the AEC dose level by resampling 64 images to create smaller sets of images. a) GE Essential, b) Hologic Dimensions.
Fig. 8. Comparison of uncertainties using resampling of CDCOM matrices [6] from 32, 64, 128, 256, and 512 images and using 256 groups of 16 simulated images of the CDMAM phantom. The images were simulated for the Hologic system. Error bars are 2 standard errors for multiple measurements.
Fig. 9. Coefficient of variation of the threshold gold thicknesses for 0.25 mm disk diameter for images of CDMAM phantom for the GE and Hologic systems.
The CoVs of tT determined for each diameter and gold thickness using images simulated for a range of doses are shown in Fig. 10. The analysis was undertaken for the recommended range of diameters (between 0.1 mm and 1 mm) using the CDMAM analysis software. For each diameter it was not possible to measure the uncertainties for all of the gold thicknesses. For the thinnest details an increase in dose would not allow thinner details to be seen due to structure noise in the image.
Table 4 shows the results for the CoV at the acceptable and achievable limits in the European guidelines when 16 images of the CDMAM phantom are used. The 0.1 mm diameter disk is particularly crucial in the testing of mammography systems, as a system is most likely to fail the CDMAM test at this diameter. These results indicate that the uncertainty for measurements at this diameter is relatively high compared to that at the other acceptable and achievable limits. The effect of the number of images on the uncertainty of measurements of threshold gold thickness for the 0.1 mm diameter detail is shown in Fig. 11, averaged over the two detectors. The coefficient of variation ranges widely, from 28.6% for one image down to 5.2% for 32 images at the acceptable level. The graph shows an excellent fit (R² = 0.999) using the inverse of the square root of the number of images for both data sets.
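The inverse square root dependence can be illustrated with the three values quoted for the 0.1 mm detail at the acceptable level (28.6% for one image, 7.3% for 16 images and 5.2% for 32 images). The sketch below fits the single constant k in CoV = k / sqrt(N) by least squares. It uses only these three points, so it is not a reproduction of the fit in Fig. 11, and the function names are illustrative.

```python
import math

# (number of images, CoV in %) for the 0.1 mm detail at the acceptable level:
# 1 and 32 images from the text, 16 images from Table 4.
points = [(1, 28.6), (16, 7.3), (32, 5.2)]


def fit_inverse_sqrt(points):
    """Least-squares k for the model CoV = k / sqrt(N), with no intercept."""
    x = [1.0 / math.sqrt(n) for n, _ in points]
    y = [cov for _, cov in points]
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def images_needed(k, target_cov):
    """Smallest N for which k / sqrt(N) does not exceed the target CoV."""
    return math.ceil((k / target_cov) ** 2)


k = fit_inverse_sqrt(points)
for n, cov in points:
    print(f"N={n:2d}: quoted {cov:.1f}%, fitted {k / math.sqrt(n):.1f}%")
print("images needed for a 5% CoV:", images_needed(k, 5.0))
```

With these three points the fitted constant is close to the single-image CoV, and the model suggests that halving the CoV requires roughly four times as many images.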
Fig. 10. Coefficient of variation of the threshold gold thickness with disk diameter for images of CDMAM phantom. The results are an average of the results for the GE and Hologic systems.
4. Discussion
We have shown that the simulation method produced results that were very similar to those for real images of the CDMAM phantom not only in the threshold gold thicknesses but also in the measurement uncertainties for two systems. This justified using this simulation method to investigate the use of the CDMAM phantom further without the requirement of acquiring large numbers of real images. This has allowed us to investigate in detail the uncertainties associated with the measurement of threshold gold
Table 4 Measured coefficients of variation for threshold gold thicknesses at limits set in the European guidelines [4] for the average of the Hologic and GE systems for 16 images.
Diameter (mm) | Acceptable level (µm gold) | Achievable level (µm gold) | CoV at acceptable level | CoV at achievable level
1 | 0.091 | 0.056 | 5.9% | 6.9%
0.5 | 0.15 | 0.103 | 4.3% | 4.5%
0.25 | 0.352 | 0.244 | 3.8% | 4.1%
0.1 | 1.68 | 1.10 | 7.3% | 6.3%
Fig. 11. Measured CoV for threshold gold thickness at limits set for 0.1 mm diameter disks in European guidelines [4] for the average of the Hologic and GE systems for different numbers of images.
thickness, a study that would have been impractical without simulation. One issue with the validation of CDMAM phantom simulation has been the variability of manufactured CDMAM phantoms from their specification, which makes comparison between the real and simulated images more uncertain. Our model has created a CDMAM phantom with perfect gold disks which match the nominal values for diameter and thickness. Our comparison phantom is a newer version, which the manufacturer claims has better tolerances, and certainly our model matches this phantom better than older versions of the CDMAM phantom. This work may also give some support to the claim that the tolerances in the manufacture of the test phantom have improved, though at the moment there are no independent published data on the consistency of the newest phantoms.
The simulation methods shown here are not practical for undertaking quality control, for which it would be simpler to image the CDMAM phantom. The value of simulating images of the CDMAM phantom is for experiments that would require an enormous number of images; for example, we created over 12,000 images for this study.
The uncertainties provided by the CDMAM analysis software have been estimated using the results from Young et al. [6]. The methods used underestimated the uncertainties, and they also do not take account of the differences in the measurement uncertainties for details near the edge. In Fig. 9, we showed that the CoVs for different threshold gold thicknesses were in agreement between the two different types of systems studied. It may therefore be reasonable to assume that the CoVs calculated across the CDMAM phantom (Fig. 10) are applicable to any system. Although it may be expected, we have confirmed that the CoV is inversely proportional to the square root of the number of images.
The uncertainty at a given disk diameter is not independent of that for other disk diameters as the psychometric curves for each diameter are fitted as a group.
Also the uncertainties will depend on the number of curves being fitted. In the present study the results for the 0.08 mm diameter disk were always omitted. For systems with poorer image quality the inclusion of this disk may worsen the accuracy of the results for the 0.1 mm disk [31]. When the threshold gold thickness is similar to either the thickest or thinnest disk for a particular diameter there are fewer points available for fitting the steepest part of the psychometric curve and so the uncertainties will be higher.
It is clear that the uncertainties of the threshold gold thicknesses are larger at the edges of the phantom. It is further noted that the CoV for the details on the edge with the thinnest details is larger than for the details on the opposite edge. This is likely to be a result of the correction applied to account for the higher sensitivity of CDCOM compared to human observers [26]. For these disks, the 62.5% detection fraction is estimated by extrapolation to lie at a thickness of less than 0.03 µm; this result is then corrected back to within the range of gold thicknesses of the phantom.
It was particularly important to understand the uncertainties around the acceptable and achievable levels of the 0.1 mm detail. From experience, a system is more likely to fail on the 0.1 mm detail. For a system that is borderline for failing, the uncertainty of the threshold gold thickness is relatively high. Yang and Van Metter [11] also studied the uncertainties in the threshold gold thickness for groups of 8 images using a resampling method and suggested that, for a system bordering on failing at the 0.1 mm diameter, more images may be advisable. Enough images need to be acquired to ensure the results are meaningful and the recommendation in the European Guidelines to acquire 16 images appears to be pragmatic. Mackenzie et al. [5] recommended a review of the limits in use for the CDMAM phantom for quality control in the European Guidelines.
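The role of the 62.5% detection fraction can be illustrated with a generic psychometric curve. The sketch below assumes a guess rate of 25% (62.5% lies midway between 25% and 100%) and a logistic curve in log thickness. The exact parametrisation used by the analysis software may differ, and the slope value and function names are hypothetical.

```python
import math

CHANCE = 0.25  # assumed guess rate; 62.5% is midway between 25% and 100%


def detection_probability(thickness_um, threshold_um, slope):
    """Logistic psychometric curve in log thickness, rising from chance to 1."""
    z = slope * (math.log(thickness_um) - math.log(threshold_um))
    return CHANCE + (1.0 - CHANCE) / (1.0 + math.exp(-z))


def thickness_at(fraction, threshold_um, slope):
    """Invert the curve: gold thickness giving the requested detection fraction."""
    scaled = (fraction - CHANCE) / (1.0 - CHANCE)
    return threshold_um * math.exp(math.log(scaled / (1.0 - scaled)) / slope)


# Hypothetical slope; the threshold is the acceptable level for the 0.1 mm disk.
p_at_threshold = detection_probability(1.68, threshold_um=1.68, slope=3.0)
print(f"detection fraction at the threshold thickness: {p_at_threshold:.3f}")
print(f"thickness for 90% detection: {thickness_at(0.9, 1.68, 3.0):.2f} µm")
```

When the threshold lies near the thickest or thinnest disk of a diameter, few measured points fall on the steep part of such a curve, which is consistent with the larger uncertainties described above.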
The large uncertainties found for the 0.1 mm diameter detail should be taken into consideration as part of any such review. The issue of high uncertainties for the 0.1 mm diameter disk may be solved by the latest generation of CDMAM phantoms (CDMAM 4.0, Artinis), where the details for the acceptable and achievable limits are more centrally placed in the phantom [10]. In addition, the curve fitting should be more accurate due to smaller differences in the thicknesses of disks in the CDMAM 4.0 phantom compared to the CDMAM 3.4 phantom. Strudley et al. [32] showed that the reproducibility of the results was better with CDMAM 4.0. The CoV measurements, shown in Fig. 10, include threshold gold thicknesses outside the normal range of threshold gold thicknesses found clinically. The other disks will either always or never be seen, even at extreme levels of image quality. The CDMAM 4.0 phantom has a reduced range of gold thicknesses and so covers a more relevant range of gold thicknesses.
The simulation of the images is not perfect and a number of simplifying assumptions have been made. The amount of scatter in this study was assumed to correspond to a constant SPR across the whole image. It is possible to estimate the magnitude of scatter across the detector using techniques by Diaz et al. [33], but the results appeared to be sufficiently accurate without this correction. The MTF at the Nyquist frequency was 0.18 and 0.37 for the GE and Hologic detectors respectively, thus aliasing will be expected in the Hologic images and to a lesser extent in the GE images. Aliasing is included in the noise via the NPS, but any aliasing of signal is not
included, although this effect is expected to be small. The structure noise added to the images was effectively an average signal, and so may have a different appearance to the real structure noise. Overall, the model appears satisfactory and so the simplifications do not appear to adversely affect the model.
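The quoted MTF values at the Nyquist frequency can be put in context with a short calculation. The sketch below assumes pixel pitches of 0.100 mm (GE) and 0.070 mm (Hologic), which are not stated in this section, an ideal square aperture equal to the pixel pitch, and a presampled MTF equal to the product of the aperture and convertor MTFs. The function names are illustrative.

```python
import math


def aperture_mtf(frequency_per_mm, aperture_mm):
    """|sinc| MTF of an ideal square aperture of the given width."""
    x = math.pi * frequency_per_mm * aperture_mm
    return 1.0 if x == 0 else abs(math.sin(x) / x)


def convertor_mtf_at_nyquist(presampled_mtf, pitch_mm):
    """Convertor MTF at Nyquist, assuming presampled = aperture * convertor."""
    nyquist = 1.0 / (2.0 * pitch_mm)
    return presampled_mtf / aperture_mtf(nyquist, pitch_mm)


# Pixel pitches are assumptions; MTF values at Nyquist are those quoted above.
systems = {
    "GE": {"pitch_mm": 0.100, "mtf_nyquist": 0.18},
    "Hologic": {"pitch_mm": 0.070, "mtf_nyquist": 0.37},
}

for name, s in systems.items():
    nyquist = 1.0 / (2.0 * s["pitch_mm"])
    conv = convertor_mtf_at_nyquist(s["mtf_nyquist"], s["pitch_mm"])
    print(f"{name}: Nyquist {nyquist:.2f} mm^-1, convertor MTF about {conv:.2f}")
```

An ideal aperture contributes the same factor (2/π) at the Nyquist frequency for any pixel pitch, so under these assumptions the difference between the two systems at Nyquist comes from the convertor layer.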
5. Conclusions
We have demonstrated a method to simulate images of the CDMAM phantom for different detectors at a range of doses. The uncertainties in the threshold gold thickness were shown to vary with disk diameter and gold thickness. The uncertainties for threshold gold thickness for the acceptable limits in the European Guidelines are higher than previously published. For the 0.1 mm diameter disk and 16 images the CoV is 7.3%, and this can rise rapidly when fewer images are used. Any future review of the European guidelines should take into consideration the calculated uncertainties.
Acknowledgements
This work is part of the OPTIMAM2 project and is supported by Cancer Research UK (grant number: C30682/A17321). We thank the Jarvis Breast Screening Unit in Guildford for access to their mammography systems.
References
[1] Seradour B, Heid P, Esteve J. Comparison of direct digital mammography, computed radiography, and film-screen in the French national breast cancer screening program. AJR Am J Roentgenol 2014;202:229–36. http://dx.doi.org/10.2214/AJR.12.10419.
[2] Chiarelli AM, Edwards SA, Prummel MV, Muradali D, Majpruz V, Done SJ, et al. Digital compared with screen-film mammography: performance measures in concurrent cohorts within an organized breast screening program. Radiology 2013;268:684–93. http://dx.doi.org/10.1148/radiol.13122567.
[3] Mackenzie A, Warren LM, Wallis MG, Cooke J, Given-Wilson RM, Dance DR, et al. Breast cancer detection rates using four different types of mammography detectors. Eur Radiol 2016;26:874–83. http://dx.doi.org/10.1007/s00330-015-3885-y.
[4] European Commission. European guidelines for quality assurance in breast cancer screening and diagnosis (4th ed.). Brussels, Belgium: European Commission; 2006.
[5] Mackenzie A, Warren LM, Wallis MG, Given-Wilson RM, Cooke J, Dance DR, et al. The relationship between cancer detection in mammography and image quality measurements. Physica Med 2016;32:568–74. http://dx.doi.org/10.1016/j.ejmp.2016.03.004.
[6] Young KC, Alsager A, Oduko JM, Bosmans H, Verbrugge B, Geertse T, et al. Evaluation of software for reading images of the CDMAM test object to assess digital mammography systems. In: Hsieh J, Samei E, editors. Proc. SPIE Med. Imaging, vol. 6913. SPIE; 2008. http://dx.doi.org/10.1117/12.770571. pp. 69131C-69131C-11.
[7] Bijkerk KR, Thijssen MAO, Arnoldussen TJM. Modification of the CDMAM contrast-detail phantom for image quality of Full Field Digital Mammography systems. In: Yaffe MJ, editor. Proc. IWDM, Medical Physics, Madison, WI, Toronto. p. 633–40.
[8] Huda W, Sajewicz AM, Ogden KM, Scalzetti EM, Dance DR. How good is the ACR accreditation phantom for assessing image quality in digital mammography? Acad Radiol 2002;9:764–72. http://dx.doi.org/10.1016/S1076-6332(03)80345-8.
[9] Figl M, Semturs F, Kaar M, Hoffmann R, Kaldarar H, Homolka P, et al. Dose sensitivity of three phantoms used for quality assurance in digital mammography. Phys Med Biol 2013;58:N13–23. http://dx.doi.org/10.1088/0031-9155/58/2/N13.
[10] Figl M, Semturs F, Kaar M, Hoffmann R, Floor-Westerdijk M, van der Burght R, et al. On the dose sensitivity of a new CDMAM phantom. Phys Med Biol 2015;60:N177–85. http://dx.doi.org/10.1088/0031-9155/60/9/N177.
[11] Yang C-YJ, Van Metter R. The variability of software scoring of the CDMAM phantom associated with a limited number of images. In: Hsieh J, Flynn MJ, editors. Proc. SPIE Med. Imaging. SPIE; 2007. http://dx.doi.org/10.1117/12.713655. p. 65100C-65100C-23.
[12] Yip M, Alsager A, Lewis E, Wells K, Young KC. Validation of a digital mammography image simulation chain with automated scoring of CDMAM images. In: Krupinski EA, editor. Digit. Mammogr., vol. 5116. Berlin, Heidelberg: Springer; 2008. p. 409–16. http://dx.doi.org/10.1007/978-3-540-70538-3.
[13] Mackenzie A, Dance DR, Workman A, Yip M, Wells K, Young KC. Conversion of mammographic images to appear with the noise and sharpness characteristics of a different detector and x-ray system. Med Phys 2012;39:2721–34. http://dx.doi.org/10.1118/1.4704525.
[14] Mackenzie A, Dance DR, Diaz O, Young KC. Image simulation and a model of noise power spectra across a range of mammographic beam qualities. Med Phys 2014;41:121901. http://dx.doi.org/10.1118/1.4900819.
[15] IEC. Medical electrical equipment - Characteristics of digital x-ray imaging devices - Determination of the detective quantum efficiency - Detectors used in mammography. vol. 62220–1–2. IEC; 2007.
[16] Carton A-K, Acciavatti R, Kuo J, Maidment ADA. The effect of scatter and glare on image quality in contrast-enhanced breast imaging using an a-Si/CsI(Tl) full-field flat panel detector. Med Phys 2009;36:920–8. http://dx.doi.org/10.1118/1.3077922.
[17] Marshall NW. Retrospective analysis of a detector fault for a full field digital mammography system. Phys Med Biol 2006;51:5655–73. http://dx.doi.org/10.1088/0031-9155/51/21/018.
[18] Dobbins JT, Samei E, Ranger NT, Chen Y. Intercomparison of methods for image quality characterization. II. Noise power spectrum. Med Phys 2006;33:1466–75. http://dx.doi.org/10.1118/1.2188819.
[19] Mackenzie A, Doyle P, Honey ID, Marshall NW, O’Neill J, Smail M. IPEM report 32(VII) Measurement of the performance characteristics of diagnostic x-ray systems: digital imaging systems. York, UK: Institute of Physics and Engineering in Medicine; 2010.
[20] Giger ML, Doi K. Investigation of basic imaging properties in digital radiography. I. Modulation transfer function. Med Phys 1984;11:287–95.
[21] Maidment ADA, Albert M. Conditioning data for calculation of the modulation transfer function. Med Phys 2003;30:248–53. http://dx.doi.org/10.1118/1.1534111.
[22] Saunders Jr RS, Samei E. A method for modifying the image quality parameters of digital radiographic images. Med Phys 2003;30:3006–17.
[23] Boone JM, Fewell TR, Jennings RJ. Molybdenum, rhodium, and tungsten anode spectral models using interpolating polynomials with application to mammography. Med Phys 1997;24:1863–74. http://dx.doi.org/10.1118/1.598100.
[24] Yip M, Mackenzie A, Lewis E, Dance DR, Young KC, Christmas W, et al. Image resampling effects in mammographic image simulation. Phys Med Biol 2011;56:N275–86. http://dx.doi.org/10.1088/0031-9155/56/22/N02.
[25] Båth M, Håkansson M, Tingberg A, Månsson LG. Method of simulating dose reduction for digital radiographic systems. Radiat Prot Dosimetry 2005;114:253–9. http://dx.doi.org/10.1093/rpd/nch540.
[26] Young KC, Cook JJH, Oduko JM. Automated and Human Determination of Threshold Contrast for Digital Mammography Systems. Med Phys 2006;52:266–72. http://dx.doi.org/10.1007/11783237_37.
[27] Efron B. Bootstrap methods: another look at the jackknife. In: Kotz S, Johnston N, editors. New York: Springer; 1992. p. 569–93. http://dx.doi.org/10.1007/978-1-4612-4380-9_41.
[28] Mackenzie A, Marshall NW, Hadjipanteli A, Dance DR, Bosmans H, Young KC. Characterisation of noise and sharpness of images from four digital breast tomosynthesis systems for simulation of images for virtual clinical trials. Phys Med Biol 2017;62:2376–97. http://dx.doi.org/10.1088/1361-6560/aa5dd9.
[29] Marshall NW, Monnin P, Bosmans H, Bochud FO, Verdun FR. Image quality assessment in digital mammography: part I. Technical characterization of the systems. Phys Med Biol 2011;56:4201–20.
[30] Samei E, Buhr E, Granfors P, Vandenbroucke D, Wang X. Comparison of edge analysis techniques for the determination of the MTF of digital radiographic systems. Phys Med Biol 2005;50:3613–25. http://dx.doi.org/10.1088/0031-9155/50/15/009.
[31] Strudley CJ, Young KC. Manual: CDMAM analysis v2.1.0, Java software for the analysis of output for automated reading of images of CDMAM models 3.4 and 4.0; 2015.
[32] Strudley CJ, Young KC. Evaluation of a New Design of Contrast-Detail Phantom for Mammography: CDMAM Model 4.0. In: Fujita H, Hara T, Muramatsu C, editors. Proc IWDM. Gifu City: Springer International Publishing; 2014. p. 217–24. http://dx.doi.org/10.1007/978-3-319-07887-8. vol. LNCS 8539.
[33] Diaz O, Dance DR, Young KC, Elangovan P, Bakic PR, Wells K. Estimation of scattered radiation in digital breast tomosynthesis. Phys Med Biol 2014;59:4375–90. http://dx.doi.org/10.1088/0031-9155/59/15/4375.