Mammography image assessment; validity and reliability of current scheme

C. Hill a, *, L. Robinson b

a Nightingale Centre, Wythenshawe Hospital, University Hospital of South Manchester, M23 9LT, UK
b Radiography, Allerton Building, University of Salford, M5 4WT, UK

* Corresponding author. E-mail address: [email protected] (C. Hill).
Acknowledgement to Dr. Claire E. Mercer.
Article history: Received 26 March 2015; Accepted 10 July 2015; Available online xxx

Keywords: Image assessment; Mammography; PGMI; Image evaluation; PNL

Abstract

Mammographers currently score their own images according to criteria set out by Regional Quality Assurance. The criteria used are based on the 'Perfect, Good, Moderate, Inadequate' (PGMI) marking criteria established by the National Health Service Breast Screening Programme (NHSBSP) in their Quality Assurance Guidelines of 2006.1 This document discusses the validity and reliability of the current mammography image assessment scheme. Commencing with a critical review of the literature, it sets out to highlight problems with the national approach to the use of marking schemes. The findings suggest that the PGMI scheme is flawed in terms of reliability and validity and is not universally applied across the UK. There also appear to be differences between the schemes used by trainees and by qualified mammographers. Initial recommendations are to be made in collaboration with colleagues within the NHSBSP, Higher Education Centres, the College of Radiographers and the Royal College of Radiologists, in order to identify a mammography image appraisal scheme that is fit for purpose.

© 2015 The College of Radiographers. Published by Elsevier Ltd. All rights reserved.
Introduction

Qualified, autonomous practitioners within mammography are bound by guidelines when assessing their own images; the National Health Service Breast Screening Programme (NHSBSP) established these in 2006.1 These guidelines are in place to help students, staff, researchers and the clients having their breasts imaged, ensuring that everyone works to the same standard across the screening programme. This paper considers whether the guidelines are robust enough to be considered the gold standard image assessment tool and whether the tool is valid in today's digital imaging environment.

Mammography

The NHSBSP commenced the breast screening programme with the intention of reducing mortality from breast cancer in the population. Mammography is the most commonly used imaging technique to detect breast cancer, and early detection has been found to reduce mortality rates.2–5 In 2011, 49,936 women and 349 men were diagnosed with invasive breast cancer, and 78% survived for more than 10 years. Breast cancer accounts for 7% of all cancer deaths and 15% of all cancer cases; with early detection, there is a 9 in 10 chance of surviving at least 5 years post diagnosis. Indeed, breast screening in the UK has been shown to save 1300 lives per annum.6

'Mammography has been described as the science of imaging and the art of positioning'.7 In order to achieve a high quality, diagnostic mammogram that enables early detection, a number of factors need to be considered: the equipment and its performance, the expertise of the mammographer and the cooperation of the client.1
Qualification

Radiography became a graduate subject in 1995 and this has led to a change within the profession, moving radiographers forward and developing their skills in research, evaluation, reflection and evidence-based practice.8
In 2010, the Society and College of Radiographers (SCoR) highlighted that both service managers and educators carry responsibility for improving radiographers' potential to strengthen and improve patient care.9 The introduction of the radiography degree qualification led to a corresponding change in the way the mammography qualification was positioned in terms of academic level. In 2001, the NHSBSP introduced the postgraduate certificate award to qualify radiographers to practise as mammographers within the NHS breast screening programme. This higher-level qualification was to 'ensure that mammographers are technically expert and well informed in order to respond to the individual needs of the woman, and to influence service outcomes.'1 With this move towards level 7 study,10 radiographers undertaking the postgraduate mammography certificate are not only expected to have in-depth knowledge of mammography and the services associated with it, but also to undertake research into best practice. As a result, qualified mammographers are now expected to perform high quality mammography and to be enquiring, proactive practitioners who can change the service using evidence-based research.1,9 One of the ways they do this is to audit their practice and assess the quality of their own work against national standards.

Mammography assessment

The current criteria for image appraisal are known as PGMI (Perfect, Good, Moderate and Inadequate). These criteria were established by the NHSBSP in 2006.1 At that time, many breast imaging centres were still using film/screen combinations for mammography. When originally written, PGMI, although not evidence-based, was a tool which set out guidelines for when remedial or suspension activity was required and gave mammographers a clear indication of the quality of their images. In the years following these guidelines, it became a recommendation that all breast screening units operating under the NHSBSP should move to digital imaging. However, the assessment guidelines have not been adapted to reflect this change in how mammograms are produced.

As autonomous professionals, mammographers are expected to audit their clinical practice, carrying out image self-assessment against the criteria set out by the NHSBSP in publication No. 63,1 and to reflect on it as part of ongoing personal performance monitoring. This process helps to maintain and improve the quality of the service delivered. Quality Assurance objective number 3 of the NHSBSP1 guidelines is "To minimise the number of repeat examinations. The repeat or recall rate should be <3%, ideally <2% of all examinations". PGMI is used when radiographers assess their own images to decide whether to repeat or accept their work.
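As a simple illustration of the self-audit described above, the following sketch tallies a set of PGMI scores and checks the repeat rate against the NHSBSP targets quoted from Quality Assurance objective 3. This is a hypothetical Python example: the category labels and the <3%/<2% thresholds come from the text, but the audit data and the way they are recorded are assumptions made purely for illustration.

```python
from collections import Counter

# Hypothetical self-audit: one PGMI grade per examination, plus whether it was repeated.
# The data below are invented for illustration only.
audit = [
    ("P", False), ("G", False), ("M", False), ("G", False), ("I", True),
    ("G", False), ("P", False), ("M", False), ("G", False), ("G", False),
]

grades = Counter(grade for grade, _ in audit)
total = len(audit)
repeats = sum(1 for _, repeated in audit if repeated)
repeat_rate = 100.0 * repeats / total

print("PGMI tally:", dict(grades))
print(f"Repeat rate: {repeat_rate:.1f}% of {total} examinations")
print("Meets <3% standard:", repeat_rate < 3.0)   # NHSBSP QA objective 3
print("Meets <2% ideal:   ", repeat_rate < 2.0)
```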
Aim

The evidence for this paper was gathered through:

- A review of the current assessment criteria used to assess mammography image quality, how those criteria were developed and how effective they are in the digital era;
- Contacting all superintendents of breast units in England and Wales (via the SCoR) to ascertain what marking criteria are currently in use within their service;
- A critical analysis of the current criteria used in the assessment of the craniocaudal (CC) and mediolateral oblique (MLO) views for monitoring individual mammographers' technique; and
- A review of how the final summative assessment of students' mammogram images is undertaken to judge their ability to produce a high quality mammogram.

Review

Searches of Google Scholar and the SCoR library of journals yielded research papers, but only one was based in the UK and it was over 15 years old; the rest arose from researchers in Australia (Bentley and Poulos). On discovering the initial articles by Bentley, the first author contacted her directly, and she responded by providing other papers that she had written on the same topic, including the questionnaires used in a survey5 to ascertain the understanding of PGMI and the posterior nipple line (PNL) among radiographers performing mammography. The questionnaires were distributed in delegates' bags at a breast conference in Darling Harbour, Sydney, Australia, placed online on the I-MED MIA Infornet and mailed to all Breast Screen Australia centres with a freepost envelope for return. One hundred and sixty-eight responses were received.

There are 80 breast screening units across 9 regions in the UK and, via the SCoR, the superintendents of all of these sites were contacted to ascertain what marking criteria they were using. Unfortunately, only 4 responses were received; these included two different marking grids, and one centre stated that they were using PGMI. One centre commented that they had replaced PGMI with Good, Diagnostic, Undiagnostic (GDU) criteria. All the respondents mentioned that they considered there was a need for new marking criteria to be developed.

In 2005, Moreira et al.3 compared two image classification systems for the assessment of mammography quality: the PGMI system used in the UK and a system known as EAR ('excellent, acceptable and repeat'). EAR was developed by a group of tutors in New South Wales (NSW), Australia and is currently used by Breast Screen NSW.3 They rated 30 sets of mammograms using 21 radiographers and an expert panel. Interobserver reliability and criterion validity were assessed using mean weighted observed agreement and kappa statistics. The results showed that both the EAR and PGMI systems had poor reliability and validity for evaluating the quality of mammograms. They concluded that EAR is not a suitable alternative to PGMI, but that any new or modified criteria would need to be subject to rigorous assessment of reliability and validity before being used as an image assessment tool.
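To make the kind of inter-observer analysis used in such studies concrete, the sketch below computes observed agreement and Cohen's kappa for two raters grading the same images on the four-point PGMI scale. It is an illustrative Python example only: the two rating sequences are invented, and this simple unweighted kappa is not the mean weighted observed agreement statistic reported by Moreira et al.3

```python
from collections import Counter

# Two hypothetical raters scoring the same ten images on the PGMI scale.
rater_a = ["P", "G", "G", "M", "I", "G", "P", "M", "G", "G"]
rater_b = ["G", "G", "M", "M", "I", "G", "P", "G", "G", "M"]

categories = ["P", "G", "M", "I"]
n = len(rater_a)

# Observed agreement: proportion of images given the same grade by both raters.
observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

# Expected chance agreement, from each rater's marginal grade frequencies.
freq_a = Counter(rater_a)
freq_b = Counter(rater_b)
expected = sum((freq_a[c] / n) * (freq_b[c] / n) for c in categories)

# Cohen's kappa: agreement beyond chance, scaled to the maximum possible.
kappa = (observed - expected) / (1 - expected)

print(f"Observed agreement: {observed:.2f}")
print(f"Expected agreement: {expected:.2f}")
print(f"Cohen's kappa:      {kappa:.2f}")
```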
Within the PGMI criteria, the word 'appropriate' is used throughout: when assessing the exposure, the compression and the angle of the pectoral muscle in the mediolateral oblique view. This word, as highlighted by Bentley et al.,4 is subject to individual interpretation. The American College of Radiology (ACR) and the Royal Australian and New Zealand College of Radiologists (RANZCR) also use similarly subjective descriptors in their guidelines, such as a 'generous amount' of pectoral muscle, 'desirable', 'sufficiently' and 'preferable'. It is recognised that the use of such phrases leads to inherent subjectivity, variability in interpretation and therefore uncertainty in any decisions.4

Another area of uncertainty in relation to ascertaining whether the MLO view is diagnostic is the PNL (Fig. 1). This causes problems for those who need to assess images: mammographers assessing their own images, assessors marking students' work, the students themselves and researchers. The NHSBSP consider the whole breast to be imaged when the "Pectoral muscle shadow is to nipple level. The lower edge of the pectoral muscle shadow should reach nipple level whenever possible, to ensure that the posterior aspect of the breast is satisfactorily included on the image".1
Within this statement, again, there are several descriptors that are open to subjective interpretation and thus can have the effect of creating non-standard practice. According to the NHSBSP, the 'Nipple level' (NL) (Fig. 1) is determined by drawing a horizontal straight line through the nipple, parallel to the lower border of the film.10 However, the ACR and RANZCR recommend the more achievable PNL, a line drawn tangentially from the nipple towards the pectoral muscle.12 A national survey was undertaken by Naylor and York11 to ascertain whether centres were using the NL or the PNL; the results demonstrated that the tangent method (PNL) produced more achievable results.11 Though this audit was completed 15 years ago, there has been no change in the guidelines from the NHSBSP. Anecdotally, in the first author's role as training and education lead she has visited at least ten screening centres to update staff. On these visits it has become apparent that most centres are using the PNL instead of the NL because it is more achievable. However, even those using the PNL demonstrate confusion as to what the PNL is and where it is drawn.

Figure 1. From personal collection.

Naylor and York11 questioned whether the position of the nipple should be used in assessing the adequacy of the length of the pectoral muscle on the mediolateral oblique view. This question was also raised by a research group in Australia, who completed a study to ascertain the presentation of the pectoral muscle on the mediolateral oblique view and its relationship with current image evaluation systems.4 They recorded measurements of the length, width, contour type and inferior angle of the pectoral muscle, and of the relationship of its inferior aspect to the posterior nipple line and the nipple level. In their conclusion they recommended that 'aspects of the current image evaluation criteria relating to the presentation of the pectoral muscle should be modified to include quantified ranges to reduce inherent subjectivity and variability in interpretation.'4

It would appear that the CC view is less problematic for assessment. The requirements are less subjective in their wording, but not without ambiguity: the medial border is imaged according to local protocols; some axillary tail of the breast is shown; pectoral muscle shadow may be shown; and the nipple is in profile.1 Burke and Mercer13 discussed the importance of the amount of breast tissue demonstrated on the CC view. This can be assessed by drawing a line from the nipple to the back of the breast on the CC view and comparing it with the PNL measurement (Fig. 1); the CC measurement should be within 1 cm of the PNL measurement to ensure that sufficient tissue is demonstrated. Crucially, it appears that this well-established criterion has never been validated through research. Furthermore, it is not in use within the current trainees' marking standards for mammogram images. Again, no robust research has been carried out within this field to provide evidence-based criteria.
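As an illustration of how the comparison described by Burke and Mercer13 could be applied during image appraisal, the sketch below checks whether the nipple-to-posterior-edge measurement on the CC view falls within 1 cm of the PNL measurement on the MLO view. This is a hypothetical Python example: the function name, the millimetre measurements and the printed advice are our own assumptions, and, as noted above, the 1 cm criterion itself has not been validated through research.

```python
def sufficient_tissue_on_cc(cc_depth_mm: float, pnl_mlo_mm: float,
                            tolerance_mm: float = 10.0) -> bool:
    """Return True if the CC nipple-to-posterior-edge measurement is within
    the stated tolerance (1 cm by default) of the MLO posterior nipple line.
    Illustrative only; based on the 1 cm comparison described by Burke and Mercer.13
    """
    return abs(cc_depth_mm - pnl_mlo_mm) <= tolerance_mm


# Hypothetical measurements taken from one pair of images.
cc_depth_mm = 92.0   # nipple to posterior edge of the breast on the CC view
pnl_mlo_mm = 100.0   # posterior nipple line length on the MLO view

if sufficient_tissue_on_cc(cc_depth_mm, pnl_mlo_mm):
    print("CC view demonstrates sufficient posterior tissue (within 1 cm of the PNL).")
else:
    print("CC view may be missing posterior tissue; consider review or repeat.")
```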
Discussion

For an assessment tool to be fair it needs to have both reliability and validity. For it to be classed as reliable, it must demonstrate that the assessment of the images is reproducible between those viewing them; for it to be valid, it must measure what it sets out to measure and not something else, and the measurement must also be reliable.14 This review has shown the current PGMI assessment tool to be neither reliable nor valid, because many of the descriptors are subjective and therefore prone to both inter- and intra-operator variability. The validity also has to be questioned because there is a lack of agreement between a number of national standards about what features, e.g. the nipple line, constitute a gold standard image.

Preliminary work and personal experience suggest that the image appraisal marking grids used in the assessment of students may also demonstrate a lack of reproducibility and validity, because they contain the same inherent flaws detailed above. Furthermore, there appears to be no agreement between education centres about which image appraisal tool should be used, as image marking grids vary from institute to institute. The PGMI scheme is still used by some centres, and anecdotally they believe that it gives students something to aim for and work towards: achieving the ever-elusive 'P'. Yet this system is open to both intra- and inter-operator variability and as such cannot be deemed reliable. One training centre within the UK has combined the 'P' and the 'G' into one level, producing a 'GMI' scheme. Though they recognise that this is subjective, they claim that it establishes a clear standard for clinical practice.

Moreover, as well as differences existing between educational centres, the student image assessment grid is frequently different from the one used by qualified mammographers. This can be extremely confusing for newly qualified mammographers, who have to learn a new scheme in order to carry out the annual audit of their own practice. It is clear that there is scope for a national standard, which should be evidence-based and implemented by all training centres and qualified mammographers.

Initial recommendations

The following changes could result in more valid and reliable appraisal criteria:
1. Research into the PGMI scale and how mammographers interact with it. This could be carried out by distributing the questionnaire used by Spuur et al.5 to all breast screening centres.
2. Discussion between the Higher Education Centres which deliver mammographic training, the College of Radiographers, the NHSBSP Regional QA teams and The Royal College of Radiologists, to establish a national standard for how images are assessed.
3. Research into the validity and reliability of any proposed marking criteria.
4. Changing the wording used in the current PGMI scheme according to evidence-based findings.
Conflict of interest statement

None declared.

References

1. NHS Cancer Screening Programmes. Quality assurance guidelines for mammography. NHSBSP publication No 63. Sheffield: NHS Cancer Screening Programmes; 2006.
2. Spuur K, Tak Hung W, Poulos A, Rickard M. Mammography image quality: model for predicting compliance with posterior nipple line criterion. Eur J Radiol 2011;80:713–8.
3. Moreira C, Svoboda K, Poulos A, Taylor R, Page A, Rickard M. Comparison of the validity and reliability of two image classification systems for the assessment of mammogram quality. J Med Screen 2005;12:38–42.
4. Bentley K, Poulos A, Rickard M. Mammography image quality: analysis of evaluation criteria using pectoral muscle presentation. Radiography 2008;14:189–94.
5. Spuur K, Poulos A. Evaluation of the pectoral muscle in mammography images: the Australian experience. Eur J Radiogr 2009;1:12–21.
6. http://www.cancerresearchuk.org/cancer-info/cancerstats/types/breast/ [accessed December 2014].
7. Eklund GW, Cardenosa G. The art of mammographic positioning. Radiol Clin North Am 1992;1:21–53.
8. Lee L, Stickland V, Wilson R, Evans A. Fundamentals of mammography. 2nd ed. Churchill Livingstone; 1995.
9. Society of Radiographers. Education and professional development strategy: new directions. 17 March 2010.
10. The framework for higher education qualifications in England, Wales and Northern Ireland. August 2008.
11. Naylor S, York J. An evaluation of the use of pectoral muscle to nipple level as a component to assess the quality of the medio-lateral oblique mammogram. Radiography 1999;5:107–10.
12. http://radiopaedia.org/articles/posterior-nipple-line [accessed March 2015].
13. Burke K, Mercer C. The 1 cm rule: comparison of breast tissue evident on cranio-caudal versus medio-lateral oblique mammography. Imaging Ther Pract May 2011:18–21.
14. Jolly B, Grant J. The good assessment guide. Joint Centre for Education in Medicine; 1997.