Whole Slide Imaging Diagnostic Concordance with Light Microscopy for Breast Needle Biopsies
W. Scott Campbell PhD, MBA, Steven H. Hinrichs MD, Subodh M. Lele MD, John J. Baker MD, Audrey J. Lazenby MD, Geoffrey A. Talmon MD, Lynette M. Smith MS, William W. West MD
PII: S0046-8177(14)00160-9
DOI: 10.1016/j.humpath.2014.04.007
Reference: YHUPA 3293
To appear in: Human Pathology
Received date: 30 January 2014
Revised date: 3 April 2014
Accepted date: 9 April 2014
Please cite this article as: Campbell W. Scott, Hinrichs Steven H., Lele Subodh M., Baker John J., Lazenby Audrey J., Talmon Geoffrey A., Smith Lynette M., West William W., Whole Slide Imaging Diagnostic Concordance with Light Microscopy for Breast Needle Biopsies, Human Pathology (2014), doi: 10.1016/j.humpath.2014.04.007
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
ACCEPTED MANUSCRIPT Title Page
Whole Slide Imaging Diagnostic Concordance with Light Microscopy for Breast Needle Biopsies
W. Scott Campbell, PhD, MBA1; Steven H. Hinrichs, MD1; Subodh M. Lele, MD1; John J. Baker, MD1; Audrey J. Lazenby, MD1; Geoffrey A. Talmon, MD1; Lynette M. Smith, MS2; William W. West, MD1
1 University of Nebraska Medical Center, Department of Pathology and Microbiology
2 University of Nebraska Medical Center, College of Public Health, Center for Collaboration on Research, Design and Analysis
Keywords: Whole Slide Imaging, Virtual Microscopy, Telemedicine, Needle Breast Biopsy
Running head: Whole Slide Imaging for Breast Needle Biopsies
Corresponding Author:
W. Scott Campbell, PhD, MBA
Department of Pathology and Microbiology
University of Nebraska Medical Center
985900 Nebraska Medical Center
Omaha, NE 68198-5900
402-559-9593 (o)
402-559-5900 (f)
[email protected]
Conflict of Interest Statement: At the time the work described in this manuscript was performed, Ventana was a research collaborator with the Department of Pathology and Microbiology. No monetary assistance was provided by Ventana to the Department or any project participant.
Abstract
This study investigated the diagnostic accuracy of whole slide imaging (WSI) in breast needle biopsy diagnosis in comparison with standard light microscopy (LM). The study examined the effects of image capture magnification and computer monitor quality on diagnostic concordance of WSI and LM.
Four pathologists rendered diagnoses using WSI to examine 85 breast biopsies (92 parts; 786 slides) consisting of benign and malignant cases. Each WSI case was evaluated using images captured at either 20x or 40x magnification and viewed using a DICOM grade, color-calibrated monitor or a standard, desktop LCD monitor. For each combination, the WSI result was compared to the original LM diagnosis. The overall concordance rate observed between WSI and LM was 97.1% (95% CI: 94.3%-98.5%). After a washout period, all cases were reviewed a second time by each pathologist using LM, and the second LM diagnosis was compared to the WSI diagnosis rendered by the same pathologist. Intraobserver concordance between WSI and LM was 95.4% (95% CI: 92.2%-97.4%). The second LM diagnoses were also compared to the original LM diagnoses, and the observed interobserver LM concordance rate was 97.3% (95% CI: 93.1%-99.0%). The study data demonstrated that breast needle biopsy diagnoses rendered by WSI were equivalent to diagnoses rendered by LM. No diagnostic differences were detected between the underlying viewing system parameters of monitor quality and image capture resolution. The results of this study demonstrated that WSI can be effectively utilized in subspecialty diagnostic cases where a minimum amount of tissue is available.
1.0 Introduction
The U.S. Food and Drug Administration (FDA) has determined that whole slide imaging (WSI) instruments will be regulated as Class III medical devices [1,2]. The validation parameters used by commercial entities to obtain clearance are under consideration. This study investigated the diagnostic concordance rates of breast needle biopsy diagnosis when WSI was used as the primary diagnostic device in comparison with light microscopy (LM). In addition, preliminary data were collected to determine whether the image capture resolution and viewing monitor characteristics utilized in this study affected diagnostic concordance rates between WSI and LM.
While the diagnostic accuracy of traditional telepathology systems is well established [3-5], WSI represents the evolution of telepathology for remote histopathology diagnostics [6]. Evidence supporting the equivalence of WSI to LM as a primary diagnostic tool is mounting, and multiple studies supporting WSI for general surgical pathology usage have been published [7-9]. WSI has been evaluated for specialty surgical pathology applications including prostate [10], gastrointestinal [11, 12] and dermatopathology specimens [13-15]. The studies, in general, followed the CAP recommendations for validation of specific WSI uses [16]. WSI has been demonstrated to assist rapid diagnostic consultation in women’s health clinics [17], but there are limited data supporting the use of WSI for primary diagnosis of breast needle biopsies. This study investigated the use of WSI for primary diagnostic interpretation of breast tissue obtained by needle biopsy. Two factors have been identified that could affect the overall accuracy of WSI: the scanning image magnification and the use of monitors of variable quality. A previous WSI validation study by Campbell, et al. noted that when slides were imaged at a magnification of 20x, interpretation of intranuclear detail was difficult and microorganism detection was challenging [7].
There is a paucity of research investigating the optimal image capture resolution or monitor viewing characteristics used for pathologist interpretation of WSI. While researchers agree that glass slide quality [18], spatial resolution of images [19], camera optical lens quality [20] and color standardization [21] are important factors affecting WSI, no definitive specifications have been established to determine guidelines and parameters for WSI use for primary diagnosis. For WSI to be adopted and used for primary diagnoses, the issues of image quality and optimal system performance in terms of their impact on diagnostic accuracy must be addressed.
The American College of Radiology (ACR) has published documentation and research supporting their guidelines for image management and viewing of images for diagnostic purposes [22]. The consideration and pathway established by the ACR for validation of digital imaging technologies was used to design a multi-step study to determine diagnostic accuracy of WSI and investigate the effect of image resolution and monitor characteristics on diagnostic concordance and accuracy of diagnoses rendered by WSI.
2.0 Materials and Methods
A computer review of the anatomic pathology laboratory information system was used to identify 85 breast needle biopsy cases (i.e., stereotactic, needle core, and vacuum-assisted needle core) originally interpreted and reported by a single, board-certified pathologist who serves as the senior breast subspecialty expert for the department. Seventy-eight cases consisted of one part and 7 cases consisted of two parts, comprising a total of 786 Hematoxylin & Eosin (H&E) slides. Based on the original report, 56 benign diagnoses and 36 malignant diagnoses were rendered on the 92 parts. Each case and part was given a study number, and all H&E slides for each case were scanned at 20x and at 40x. All images were created using a Ventana iCoreo scanner and viewed using Ventana Virtuoso Express viewing software (Ventana Medical Systems, Tucson, AZ) (Figure 1). The study was approved by the University of Nebraska Medical Center Institutional Review Board.
Four board-certified pathologists, including three serving on the breast pathology subspecialty service, reviewed all 85 cases exclusively by WSI. Each reviewer had received training in the use of the WSI viewer prior to the study, and three of the four pathologists had extensive experience with WSI case examination spanning several years. The study was structured as a 2 x 2 design comparing results using images scanned at either 20x or 40x magnification and viewed on either a standard, non-calibrated desktop monitor (17”, 1.3 megapixel flat screen Dell E177FP) or a DICOM, diagnostic grade, color-calibrated monitor (30”, 4 megapixel flat screen NDS Dome E4c calibrated per manufacturer specifications) (see Table 1). Using this approach, each pathologist reviewed 25% of the cases in each of the four image capture/image display categories; thereby each pathologist reviewed all the cases one time by WSI. No pathologist reviewed the same case twice by WSI.
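The rotation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the manuscript does not describe how cases were actually allocated, so the modular rotation, the use of the 92 parts as the unit of assignment, and the names `CONDITIONS` and `assign_conditions` are all hypothetical.

```python
from collections import Counter
from itertools import product

# The four image capture / image display pairings of the 2 x 2 design.
MAGNIFICATIONS = ("20x", "40x")
MONITORS = ("standard desktop", "DICOM calibrated")
CONDITIONS = list(product(MAGNIFICATIONS, MONITORS))


def assign_conditions(n_parts, n_pathologists=4):
    """Rotate pathologists through the four pairings (hypothetical scheme).

    Every part is read once by every pathologist, and the four readings
    of a part fall in four different pairings.
    """
    schedule = {}
    for part in range(n_parts):
        for pathologist in range(n_pathologists):
            index = (part + pathologist) % len(CONDITIONS)
            schedule[(pathologist, part)] = CONDITIONS[index]
    return schedule


schedule = assign_conditions(92)
per_condition = Counter(schedule.values())
print(len(schedule), dict(per_condition))
```

Under this assumed rotation, each pathologist reads one quarter of the parts in each pairing and no pathologist reads a part more than once, which matches the constraints stated in the text.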
Pathologists were presented cases in batches of 20-25 cases. Pathologists reviewed cases at their own pace. A total of 368 WSI diagnoses on 92 parts were rendered over a period of six months. Each WSI diagnosis was compared to the original LM diagnosis to determine the overall WSI to LM concordance rate. The 92 WSI diagnoses rendered for each monitor quality/image resolution pairing were compared to the original LM diagnoses to determine a WSI concordance rate by monitor quality and image resolution.
Cases were presented to each study physician with pre-examination information consistent with the usual operating procedures of the Department. Information included patient age, gender, gross description of the tissue specimen(s), pre-operative diagnosis and pre-operative comments by the surgeon when provided. Study pathologists rendered a diagnosis for each case in a manner consistent with departmental case sign-out. In the event that a study pathologist requested a special diagnostic stain to reach a conclusive diagnosis and the stain was available from the original LM examination, it was retrieved, scanned at the magnification indicated in the study schedule for the requesting pathologist and presented to the study pathologist. If the study pathologist requested special stains that did not exist at the time of the original LM examination, no additional slides were presented. Routine diagnostic special stains were provided (e.g., AE1/AE3, p63, E-cadherin), but no tumor biomarker stains were provided (e.g., PR-Q, ER-Q, HER2/NEU).
Scoring was conducted on a three-point scale: 1) concordant; 2) concordant with clinically insignificant differences; and 3) discordant with clinical significance. The scoring system was designed to differentiate diagnoses that were discordant with clinical significance, such as malignant vs. benign or invasive carcinoma vs. ductal carcinoma in situ (DCIS) diagnoses. By contrast, differing diagnoses using equivalent variations in descriptive terminology were assigned to the concordant with clinically insignificant differences category, as were differences in DCIS growth patterns and variations of no more than one histologic grade in malignant diagnoses [19-21].
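The three-point scale can be illustrated with a small Python sketch. It is not the authors' scoring procedure, which relied on pathologist judgment. The `Diagnosis` record, its coarse `category` labels and the `score` function are hypothetical, and treating a grade difference of more than one step as discordant is an assumption that the text implies but does not state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Diagnosis:
    category: str                # hypothetical coarse class, e.g. "benign", "atypia", "DCIS", "invasive"
    wording: str                 # descriptive terminology as reported
    grade: Optional[int] = None  # histologic grade, when one was reported


def score(reference, candidate):
    """Return 1 (concordant), 2 (concordant with clinically insignificant
    differences) or 3 (discordant with clinical significance)."""
    # A change of coarse class (e.g. DCIS vs. invasive carcinoma) is discordant.
    if reference.category != candidate.category:
        return 3
    # Assumption: a grade difference of more than one step is discordant.
    if reference.grade is not None and candidate.grade is not None:
        if abs(reference.grade - candidate.grade) > 1:
            return 3
    # Identical class, wording and grade: fully concordant.
    if reference == candidate:
        return 1
    # Same class with different terminology, growth pattern or adjacent grade.
    return 2


original = Diagnosis("DCIS", "DCIS, cribriform", 2)
print(score(original, Diagnosis("DCIS", "DCIS, solid", 3)))
```

In practice the class boundaries are themselves a matter of interpretation, which is why the study scored differences by clinical consequence and not by string comparison.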
Following CAP recommendations for WSI validation and intraobserver variation, each study pathologist reviewed all study cases a second time using LM. To minimize recall bias, a washout period of no less than six months and no greater than 10 months was used. Case presentation details and procedures were identical to those used during WSI reviews. Each study pathologist’s subsequent LM diagnoses were compared to their individual WSI diagnoses and to the original LM sign-out diagnoses. Reproducibility of each pathologist’s interpretation relative to the diagnostic modality was evaluated in multiple ways as illustrated in Figure 2. Diagnostic comparisons included: A) Interoperator diagnostic comparison of WSI diagnoses and the original LM diagnoses; B) Intraoperator comparison of each study pathologist’s WSI diagnoses and their LM diagnoses of the same case; and C) Interoperator diagnostic comparison of LM diagnoses of each pathologist to the original LM diagnoses. Generalized estimating equations (GEE) were used to estimate the concordance rates, as well as 95% confidence intervals (CI). SAS software Version 9.3 (SAS Institute Inc., Cary NC) was used for the data analysis.
3.0 Results
Complete concordance between WSI and the original LM diagnoses was reached in 113 (31%) diagnoses. Discordant opinions were reported for 12 (3%) diagnoses involving nine cases. The remaining 243 (66%) diagnoses were scored as concordant with clinically insignificant differences. When the two concordant categories were grouped together, the overall concordance rate was 97.1% (95% CI: 94.3% - 98.5%) as shown in Table 2. No significant difference between concordance rates by pathologist was noted.
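As a rough arithmetic check on the counts above, the raw proportions can be computed directly. The sketch below is not the GEE analysis used in the study: it treats all 368 diagnoses as independent and uses a Wilson score interval, which is a substituted technique chosen here for illustration. The raw pooled rate therefore differs slightly from the model-based estimate of 97.1%, and the interval ignores clustering by case and pathologist.

```python
import math

# Counts reported in the text for the 368 WSI diagnoses.
counts = {"concordant": 113, "minor_difference": 243, "discordant": 12}
total = sum(counts.values())


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (assumes independence)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


# Group the two concordant categories, as the study does.
agreeing = counts["concordant"] + counts["minor_difference"]
rate = agreeing / total
low, high = wilson_interval(agreeing, total)
print(f"raw concordance {rate:.1%}, Wilson 95% CI {low:.1%} to {high:.1%}")
```

The raw figure is useful only as a sanity check on the reported counts; the GEE estimate remains the appropriate summary because each part contributes four correlated diagnoses.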
Discordant diagnostic opinions are detailed in Table 3. One diagnosis was reported as invasive ductal adenocarcinoma by WSI which had been diagnosed originally as DCIS by LM (Case 17). However, the LM diagnosis did report regions of tissue suspicious, but not diagnostic, for microinvasion. The LM diagnosis for Case 32 was invasive ductal adenocarcinoma in contrast to the diagnosis of DCIS rendered using WSI. Four cases (Cases 31, 48, 53 and 56) diagnosed as benign and non-atypical by LM were diagnosed using WSI as atypical ductal hyperplasia (ADH) and/or containing atypical cellular alterations and were categorized as discordant. In one case (Case 71), the WSI opinion of one pathologist was fibroadenomatoid changes whereas the original LM diagnosis was invasive carcinoma with lobular features. One LM diagnosis of DCIS (Case 40) was considered benign by two pathologists using WSI. One case diagnosed as benign fibrous breast tissue using LM was diagnosed as invasive carcinoma with lobular features by a single pathologist using WSI.
In 36 instances (39%), all study pathologist diagnoses were concordant with clinically insignificant differences to the original sign-out diagnosis. Four such cases were malignant diagnoses, and 31 cases were benign diagnoses. Examples of concordant diagnoses with clinically insignificant differences are reported in Table 4. Complete concordance between all study pathologists and the original sign-out diagnosis was achieved in 7 cases (3%). Six of the 7 cases were benign, and one case was malignant in nature. In no instance were all study pathologist diagnoses discordant with the original sign-out diagnosis.
A comparison of study pathologist diagnoses rendered by LM to their WSI diagnoses (Figure 2 – Comparison B), categorized using the same scoring criteria described previously, resulted in a concordance rate of 95.4% (95% CI: 92.2%-97.4%) for intraoperator diagnoses (Table 5). When the study pathologists’ LM diagnoses were compared to the original LM sign-out diagnoses (Figure 2 – Comparison C), an LM interoperator concordance rate of 97.3% (95% CI: 93.1% - 99.0%) was observed (Table 5).
Ten diagnoses rendered by WSI and classified as discordant to the originally reported LM diagnoses were reported as concordant when the same case was reviewed by LM by the same study pathologist. However, eight diagnoses rendered by WSI that were considered concordant to the original LM sign-out diagnoses were changed to discordant diagnoses by the study pathologist when LM was used as their diagnostic modality. The diagnostic changes consisted equally of upgraded diagnoses (e.g., DCIS to invasive carcinoma or benign conditions to atypia) and downgraded diagnoses (e.g., invasive carcinoma to DCIS or atypia to benign morphologies). See Table 6.
No statistically significant differences in concordance were observed among the four monitor quality/image resolution combinations. Overall diagnostic concordance rates by monitor/resolution category are reported in Table 7. No significant difference in discordant rates between WSI diagnoses and the original LM diagnoses or the study pathologist LM diagnoses was noted when the WSI case was reviewed using a standard desktop monitor vs. a higher resolution, color-calibrated monitor. No significant difference in intraobserver concordance rates was observed between images scanned at 20x and 40x. A significant difference in interoperator discordant rates was observed between diagnoses rendered using images scanned at 20x and 40x (1.5% vs. 4.1%, p = 0.023). At 20x, two cases were downgraded by WSI and one case was upgraded (i.e., benign changed to malignant). Six 40x WSI diagnoses were upgraded and three were downgraded.
4.0 Discussion
The application of digital imaging to the practice of radiology required a complete evaluation of the technology including hardware, software and operator capability. Many similarities exist for digital pathology, including the impact of computer monitor quality and image production on the accuracy of diagnoses rendered by WSI. To understand the effects of image resolution and viewing monitor quality on diagnostic accuracy of WSI, LM diagnoses were compared to WSI diagnoses rendered using different image capture resolutions and different monitor quality levels. An interoperator WSI concordance rate of 97.1% and an interoperator LM concordance rate of 97.3% between study pathologists and the original sign-out pathologist were found to be equivalent to those published in the literature [26-28]. The intraobserver concordance rate of 95.4% comparing LM diagnoses and WSI diagnoses by the same pathologist was also within published LM intraoperator concordance rates. Together, the data indicate that diagnoses rendered by WSI are equivalent to those rendered by LM, independent of the examining pathologist, the original image capture resolution (20x or 40x) and the quality of the monitor used to review the WSI.
The definition of a “truth” diagnosis has proven somewhat elusive in surgical pathology due to interpretive, cognitive and linguistic variability within the pathology community [29-32]. One solution to this problem has been to establish the “truth” diagnosis by expert panel consensus. In this study, the “truth” diagnosis was the original case diagnosis, and relying on one individual’s diagnostic opinion as the standard to measure diagnostic concordance would be problematic. This issue was mitigated by comparing the original LM diagnosis with LM diagnoses rendered by each pathologist in the study. This exercise established diagnostic consistency between all pathologists who participated in the study, including the original diagnosing pathologist using LM.
Not surprisingly, diagnostic discordance in the study arose from diagnostic disagreement between pathologists over terminology and interpretive criteria. Diagnostic discordance between usual hyperplasia and atypical hyperplasia is well documented [27] but was scored as discordant due to differences in subsequent clinical intervention. Pathologists involved in this concordance study were not provided the opportunity to acquire a second opinion to achieve consensus or to review other pertinent clinical information (e.g., radiographs), and therefore did not have the opportunity to review their interpretation with a colleague. This may have affected the discordance rate in cases where discordance was noted between atypia and malignant diagnostic interpretations and in cases where microinvasion was equivocal.
The finding that intraobserver concordance rates of pathologists with WSI and LM competence were equivalent to published results from LM interoperator and intraoperator concordance studies [26-28] suggested that intraobserver validation studies may not be necessary for purposes of WSI validation. This possibility should be further explored because intraobserver studies increase the length of time to conduct validation studies. Following CAP recommendations [16], a washout period between case reviews to reduce recall bias could extend a study up to 12 months. While early telepathology diagnostic accuracy studies used a variety of washout periods for intraobserver comparison studies [33-35], CAP and the Laboratory Quality Center were unable to identify published studies documenting the relationship of washout period length and recall bias [16], a finding which may impact the design and interpretation of intraobserver concordance studies.
Data from this study did not support the hypothesis that diagnostic concordance rates would increase from the use of a high-quality, color-calibrated, DICOM monitor. While Yagi hypothesized that color calibration is important for WSI accuracy [21], data from this study do not indicate a significant relationship between color calibration and diagnostic concordance. This finding was consistent with Krupinski, et al. [36], who did not detect a relationship between diagnostic concordance and color calibration for breast biopsy tissue examination. Whereas Krupinski, et al. focused on specific regions of interest within a breast biopsy, the current study expands upon the findings by investigating the effect of color calibration on entire breast biopsy cases. Based on the high levels of concordance in this study, additional studies with larger sample sets are needed to detect differences in diagnostic concordance based on monitor quality.
Contrary to the initial hypothesis that WSI diagnosis concordance rates with LM would be higher when images were captured at higher resolutions, WSI diagnoses rendered using images captured at 40x were more likely to be discordant with LM diagnoses than WSI diagnoses rendered using images captured at 20x. The basis of this finding may be operational and not based on image clarity, because images captured at 40x often took nearly one minute per slide to resolve, a substantial time difference compared to the time to place a slide under a LM or to change objective lenses. The study pathologists may simply not have reviewed all the 40x images involved in each case. High throughput computer processors optimized for image viewing may eliminate this operational concern [37]. However, some 40x WSI diagnoses were discordant because cases that were benign by LM were called malignant or atypical hyperplasia, suggesting that locator issues were not the sole cause of discordance. The total number of discordant diagnoses was small (n = 9), and additional study is needed to fully investigate the effect of image magnification on diagnostic accuracy.
The high rate of concordant but clinically insignificant diagnoses (65%) highlighted the variation in diagnostic language used by pathologists. This variation reduced the rate of complete diagnostic concordance. The listing of fibrocystic changes and particular morphologies within the fibrocystic change continuum (e.g., adenosis, fibrosis, sclerosing adenosis, or proliferative fibrocystic change) was rarely identical to the LM diagnosis or between study pathologists. For cases with malignant diagnoses, reporting of growth pattern, grade, and presence of microcalcifications and necrosis varied. However, analysis of the findings demonstrated that a cancer diagnosis was less likely to result in a discordant result or a result with clinically insignificant differences (p = 0.009) than a benign diagnosis such as fibrocystic change. This finding may be due to the fact that cancer diagnoses follow a structured reporting process in which the presence or absence of defining characteristics is required to be enumerated within the diagnostic report. A similar reporting structure does not exist for the wide array of benign changes in breast pathology.
The data from this study support the use of WSI technology for breast needle biopsy diagnosis. The tissue presented to pathologists in this study varied in size but generally measured 2-3 mm in width and 10-20 mm in length (Figure 3). None of the pathologists expressed concern over the small size of tissue available for diagnosis, and no reports of tissue insufficient for diagnosis were made. Diagnostic discordance was characterized by known diagnostic challenges, variations in terminology and interpretive error. This study extends previous work demonstrating the equivalence of WSI to LM in general surgical pathology to challenging, specialty pathology diagnoses. Additional research is required to document minimum image capture resolution and computer monitor viewing characteristics for optimal WSI performance. Such data are important for the FDA and CAP to develop WSI guidelines and standards for use as the ACR has done for radiology.
References
1. Faison TA. FDA regulation of whole slide imaging (WSI) devices: Current thoughts. U.S. Food and Drug Administration, Washington D.C., February 14, 2012. ftp://ftp.cdc.gov/pub/CLIAC_meeting_presentations/pdf/Addenda/cliac0212/Tab_15_Faison_CLIAC_2012Feb14_Whole_Slide_Imaging.pdf. Accessed 10 Jan 2014.
2. Titus K. Regulators scanning the digital scanners. CAP Today. 2012;26:1-56-62.
3. Dunn BE, Choi H, Recla DL, Kerr SE, Wagenman BL. Robotic surgical telepathology between the iron mountain and milwaukee department of veterans affairs medical centers: A 12-year experience. Hum Pathol. 2009;40(8):1092-1099.
4. Dunn BE, Choi H, Almagro U, Recla DL, Krupinski EA, Weinstein RS. Routine surgical telepathology in the department of veterans affairs: Experience-related improvements in pathologist performance in 2200 cases. Telemedicine Journal. 1999;5(4):323.
5. Weinstein RS, Descour MR, Liang C, et al. Telepathology overview: From concept to implementation. Hum Pathol. 2001;32(12):1283-1299.
6. Weinstein RS, Graham AR, Richter LC, et al. Overview of telepathology, virtual microscopy, and whole slide imaging: Prospects for the future. Hum Pathol. 2009;40(8):1057-1069.
7. Campbell WS, Lele SM, West WW, Lazenby AJ, Smith LM, Hinrichs SH. Concordance between whole-slide imaging and light microscopy for routine surgical pathology. Hum Pathol. 2012;43:1739-1744.
8. Bauer TW, Schoenfield L, Slaw RJ, Yerian L, Sun Z, Henricks WH. Validation of whole slide imaging for primary diagnosis in surgical pathology. Arch Pathol Lab Med. 2013;137:518-524.
9. Jukic DM, Drogowski LM, Martina J, Parwani AV. Clinical examination and validation of primary diagnosis in anatomic pathology using whole slide digital images. Arch Pathol Lab Med. 2011;135:372-378.
10. Rodriguez-Urrego PA, Cronin AM, Al-Ahmadie HA, et al. Interobserver and intraobserver reproducibility in digital and routine microscopic assessment of prostate needle biopsies. Hum Pathol. 2011;42:68-74.
11. Molnar B, Berczi L, Diczhazy C, et al. Digital slide and virtual microscopy based routine and telepathology evaluation of routine gastrointestinal biopsy specimens. J Clin Pathol. 2003;56:433-438.
12. Al-Janabi S, Huisman A, Vink A, et al. Whole slide images for primary diagnostics of gastrointestinal tract pathology: A feasibility study. Hum Pathol. 2012;43:702-7.
13. Al Habeeb A, Evans A, Ghazarian D. Virtual microscopy using whole-slide imaging as an enabler for teledermatopathology: A paired consultant validation study. J Pathol Inform. 2012;3:2.
14. Koch LH, Lampros JN, Delong LK, Chen SC, Woosley JT, Hood AF. Randomized comparison of virtual microscopy and traditional glass microscopy in diagnostic accuracy among dermatology and pathology residents. Hum Pathol. 2009;40:662-667.
15. Velez N, Jukic D, Ho J. Evaluation of 2 whole-slide imaging applications in dermatopathology. Hum Pathol. 2008;39:1341-1349.
16. Pantanowitz L, Sinard JH, Henricks WH, et al. Validating whole slide imaging for diagnostic purposes in pathology: Guideline from the college of american pathologists pathology and laboratory quality center. Arch Pathol Lab Med. 2013;137:1798-810.
17. Lopez AM, Graham AR, Barker GP, et al. Virtual slide telepathology enables an innovative telehealth rapid breast care clinic. Hum Pathol. 2009;40:1082-1091.
18. Yagi Y, Gilbertson JR. A relationship between slide quality and image quality in whole slide imaging (WSI). Diagn Pathol. 2008;3 Suppl 1:S12.
19. Clarke GM, Zubovits JT, Katic M, Peressotti C, Yaffe MJ. Spatial resolution requirements for acquisition of the virtual screening slide for digital whole-specimen breast histopathology. Hum Pathol. 2007;38:1764-1771.
20. Yagi Y, Gilbertson JR. The importance of optical optimization in whole slide imaging (WSI) and digital pathology imaging. Diagn Pathol. 2008;3 Suppl 1:S1.
21. Yagi Y. Color standardization and optimization in whole slide imaging. Diagn Pathol. 2011;6 Suppl 1:S15.
22. ACR practice guidelines and technical standards. http://www.acr.org/~/media/ACR/Documents/PGTS/toc.pdf. Accessed 10 Jan 2014.
23. Monticciolo DL. Histologic grading at breast core needle biopsy: Comparison with results from the excised breast specimen. Breast J. 2005;11:9-14.
24. Schuh F, Biazus JV, Resetkova E, Benfica CZ, Edelweiss MI. Reproducibility of three classification systems of ductal carcinoma in situ of the breast using a web-based survey. Pathol Res Pract. 2010;206:705-711.
25. Wells WA, Carney PA, Eliassen MS, Grove MR, Tosteson AN. Pathologists' agreement with experts and reproducibility of breast ductal carcinoma-in-situ classification schemes. Am J Surg Pathol. 2000;24:651-659.
26. Raab SS, Nakhleh RE, Ruby SG. Patient safety in anatomic pathology: Measuring discrepancy frequencies and causes. Arch Pathol Lab Med. 2005;129:459-466.
27. Jain RK, Mehta R, Dimitrov R, et al. Atypical ductal hyperplasia: Interobserver and intraobserver variability. Mod Pathol. 2011;24:917-923.
28. Tsuda H, Akiyama F, Kurosumi M, Sakamoto G, Watanabe T. Monitoring of interobserver agreement in nuclear atypia scoring of node-negative breast carcinomas judged at individual collaborating hospitals in the national surgical adjuvant study of breast cancer (NSAS-BC) protocol. Jpn J Clin Oncol. 1999;29:413-420.
29. Cramer SF. Interobserver variability in surgical pathology. In: Weinstein RS, Graham AR, eds. Advances in pathology and laboratory medicine. Volume 9 ed. St. Louis, MO: Mosby; 1996:3.
30. Frable WJ. Surgical pathology--second reviews, institutional reviews, audits, and correlations: What's out there? error or diagnostic variation? Arch Pathol Lab Med. 2006;130(5):620-625.
31. Raab SS, Grzybicki DM, Janosky JE, et al. Clinical impact and frequency of anatomic pathology errors in cancer diagnoses. Cancer. 2005;104(10):2205-2213.
32. Raab SS, Nakhleh RE, Ruby SG. Patient safety in anatomic pathology: Measuring discrepancy frequencies and causes. Arch Pathol Lab Med. 2005;129(4):459-466.
33. Weinberg DS, Allaert FA, Dusserre P, et al. Telepathology diagnosis by means of digital still images: An international validation study. Hum Pathol. 1996;27(2):111-118.
34. Piccolo D, Soyer HP, Burgdorf W, et al. Concordance between telepathologic diagnosis and conventional histopathologic diagnosis: A multiobserver store-and-forward study on 20 skin specimens. Arch Dermatol. 2002;138(1):53-58.
PT
35. Chorneyko K, Giesler R, Sabatino D, et al. Telepathology for routine light microscopic and frozen section diagnosis. Am J Clin Pathol. 2002;117(5):783-790.
CE
36. Krupinski EA, Silverstein LD, Hashmi SF, Graham AR, Weinstein RS, Roehrig H. Observer performance using virtual pathology slides: Impact of LCD color reproduction accuracy. J Digit Imaging. 2012;25(6):738-743.
AC
37. Yagi Y, Yoshioka S, Kyusojin H, et al. Ultra high speed whole slide image viewing system. Anal Cell Pathol (Amst). 2011;34:265-275. 38. Zhang, Stenback, Wardrop. Interval estimation of the process capability index. Communications in Statistics: Theory and Methods. 1990;19:4455-4470.
Legends

Figure 1 – Viewing station configuration
Figure 2 – Diagnostic concordance comparisons between WSI, LM and pathologist. Comparison A: Interoperator comparison of original LM diagnoses and WSI diagnoses. Comparison B: Intraoperator comparison of pathologists' WSI diagnoses and their LM diagnoses. Comparison C: Interoperator comparison of original LM diagnoses to study pathologists' LM diagnoses.
Figure 3A – Example of tissue width (3 mm)
Figure 3B – Example of tissue length (9 mm)
Table 1 – 2 x 2 study design describing each image capture and viewing monitor characteristics.
Table 2 – Interobserver diagnostic concordance rates between WSI and original LM by pathologist and in total.
Table 3 – Discordant diagnoses listed by case number. Diagnoses listed in the second column represent the original LM diagnoses, and diagnoses listed in the third column represent the discordant WSI diagnoses. As shown, two distinct, discordant diagnoses were rendered in two cases. Each diagnosis listed is attributed to a single pathologist.
Table 4 – Examples of diagnoses scored as concordant with clinically insignificant differences
Table 5 – Intraobserver diagnostic concordance rates between WSI and LM by the same pathologist and interobserver diagnostic concordance rates between LM and original LM by pathologist and in total.
Table 6 – Discordant diagnoses listed by case number. Diagnoses listed in the second column represent the original LM diagnoses, and diagnoses listed in the third column represent the discordant LM diagnoses rendered by study pathologists.
Table 7 – Probability of discordant WSI diagnoses to LM diagnoses by image resolution and monitor type groupings. (Generalized estimating equations (GEE) calculations [38])
Pathologist
1.3 MP monitor
4 MP monitor
Pathologist 1
Parts 1 - 23
Parts 70 - 92
Pathologist 2
Parts 24 - 46
Pathologist 3
Parts 47 - 69
Pathologist 4
Parts 70 - 92
Parts 1 - 23
Parts 24 - 46
40x image capture
20x image capture
Parts 47 - 69
Parts 47 - 69
Parts 24 - 46
Pathologist 2
Parts 70 - 92
Parts 47 - 69
Pathologist 1
Pathologist 3
Parts 70 - 92
Parts 24 - 46
Parts 1 - 23
Pathologist 4
Parts 1 - 23
Table 1 – 2 x 2 study design describing each image capture and viewing monitor characteristics.
| Interobserver WSI concordance score to original LM diagnosis | Pathologist 1 | Pathologist 2 | Pathologist 3 | Pathologist 4 | Aggregate |
|---|---|---|---|---|---|
| Concordant | 28 (30.4%) | 16 (17.4%) | 32 (34.8%) | 37 (40.2%) | 113 (30.7%) |
| Clinically insignificant difference | 59 (64.1%) | 75 (81.5%) | 56 (60.9%) | 53 (57.6%) | 243 (66.0%) |
| Discordant | 5 (5.4%) | 1 (1.1%) | 4 (4.3%) | 2 (2.2%) | 12 (3.3%) |

Table 2 – Interobserver diagnostic concordance rates between WSI and original LM by pathologist and in total.
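The percentages in Table 2 follow directly from the counts. The short Python sketch below recomputes them, assuming each pathologist scored the same 92 parts (so the aggregate denominator is 4 x 92 = 368) and assuming the column-to-pathologist assignment follows the order in which the counts appear in the table. The variable and function names are illustrative only.

```python
# Counts transcribed from Table 2 (WSI diagnosis vs. original LM diagnosis).
# Order within each tuple: concordant, clinically insignificant difference,
# discordant. The pathologist labels are an assumption about column order.
CATEGORIES = ("concordant", "insignificant_difference", "discordant")

table2_counts = {
    "Pathologist 1": (28, 59, 5),
    "Pathologist 2": (16, 75, 1),
    "Pathologist 3": (32, 56, 4),
    "Pathologist 4": (37, 53, 2),
}


def percentages(counts):
    """Return each count as a percentage of the row total, to one decimal."""
    total = sum(counts)
    return tuple(round(100 * c / total, 1) for c in counts)


# Column-wise sums give the aggregate counts over all four pathologists.
aggregate = tuple(sum(column) for column in zip(*table2_counts.values()))

for name, counts in table2_counts.items():
    print(name, dict(zip(CATEGORIES, percentages(counts))))
print("Aggregate", aggregate, dict(zip(CATEGORIES, percentages(aggregate))))
```

The same helper can be applied to the counts in Table 5 to check the intraobserver and LM interobserver rates.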
| Case Number | Original Light Microscope Diagnosis | Discordant Whole Slide Image Diagnoses |
|---|---|---|
| 6 | Left breast biopsies: benign fibrous breast tissue with foci of fibroadenomatoid change; microcalcifications identified | Invasive mammary carcinoma with lobular features. |
| 17 | Ductal carcinoma in situ with extensive periductal sclerosis and inflammation. Focus suspicious but not diagnostic for microinvasion. Solid growth pattern with high nuclear grade. Microcalcifications within DCIS and necrosis present. | Invasive ductal adenocarcinoma, grade 3 with DCIS and without angiolymphatic invasion. |
| 31 | Fragmented portions of focally calcified sclerosing papilloma and associated benign breast tissue. No atypia or malignancy identified | Single microscopic focus of atypical ductal hyperplasia (ADH)/DCIS. Fibrocystic change with sclerosing adenosis |
| 32 | Invasive ductal adenocarcinoma. Nottingham grade 2/3, no DCIS identified | 1. Fibrocystic changes with myoepithelial hyperplasia. Focal rupture of duct with chronic inflammation. 2. Mucocele-like lesion focally with an area with crush artifact. Differential includes atypical ductal hyperplasia, sclerosed papilloma and in situ carcinoma |
| 40 | DCIS with cribriform to micropapillary growth pattern. Low nuclear grade without necrosis. Microcalcification in DCIS. | 1. Fibrocystic change with usual intraductal hyperplasia, multifocal microcalcification, prominent sclerosing adenosis. 2. Fibrocystic mastopathy with sclerosing adenosis and ductal hyperplasia of the usual type with associated microcalcifications within benign ductal lumina. (Possible ADH but favor usual hyperplasia) |
| 48 | Benign breast tissue with proliferative fibrocystic changes including microcyst formation, apocrine metaplasia, and non-atypical hyperplasia | Fibrocystic mastopathy with ductal hyperplasia and areas of columnar cell and micropapillary change with atypia; associated microcalcifications noted within ductal lumina. |
| 53 | Benign breast tissue with proliferative fibrocystic changes including non-atypical ductal hyperplasia, fibrosis and duct ectasia | 1. Atypical micropapillary and papillary ductal epithelial hyperplasia 2. Atypical ductal hyperplasia |
| 56 | Benign breast parenchyma with proliferative fibrocystic changes including sclerosing adenosis with associated microcalcifications | Proliferative fibrocystic change with atypical ductal hyperplasia focally bordering on DCIS; flat epithelial atypia, sclerosing adenosis, microcalcifications |
| 71 | Invasive carcinoma with features suggestive of pleomorphic lobular type. Nottingham grade 2/3. | Benign breast tissue with focal microcalcification and focal fibroadenomatoid change. |

Table 3 – Discordant diagnoses listed by case number. Diagnoses listed in the second column represent the original LM diagnoses, and diagnoses listed in the third column represent the discordant WSI diagnoses. As shown, two distinct, discordant diagnoses were rendered in two cases. Each diagnosis listed is attributed to a single pathologist.
| Case Number | Original Diagnosis | Concordant diagnosis with clinically insignificant differences | Reason for scoring as concordant with clinically insignificant differences |
|---|---|---|---|
| 1 | Benign breast tissue with focal adenosis, non-atypical hyperplasia and associated microcalcifications | Proliferative fibrocystic changes | No mention of adenosis, microcalcifications, or ductal hyperplasia |
| 6 | Benign breast tissue with foci of fibroadenomatoid change and microcalcifications | Fragments of benign breast tissue with fibrocystic change and focal microcalcification | No mention of fibroadenoma |
| 9 | DCIS with papillary, micropapillary growth pattern. Low nuclear grade. No necrosis. Microcalcification in DCIS | Intraductal papilloma with DCIS. Microcalcifications identified | No mention of grade, growth pattern or necrosis. Report of papilloma not mentioned in original diagnosis |
| 15 | DCIS with papillary, micropapillary and cribriform growth pattern. Intermediate nuclear grade. Necrosis focally present and microcalcification present in DCIS | DCIS with papillary and solid growth patterns, high nuclear grade with focal necrosis and microcalcifications | Nuclear grade different. Growth patterns different. |
| 18 | Invasive ductal adenocarcinoma. Nottingham grade 2/3. Associated DCIS with solid growth pattern, intermediate nuclear grade, non-necrotic | Invasive ductal carcinoma, at least grade 2/3. | No mention of DCIS and characteristics of DCIS |

Table 4 – Examples of diagnoses scored as concordant with clinically insignificant differences
| Concordance score | Pathologist 1 | Pathologist 2 | Pathologist 3 | Pathologist 4 | Aggregate |
|---|---|---|---|---|---|
| Intraobserver WSI concordance score to LM diagnosis | | | | | |
| Concordant | 52 (56.5%) | 39 (42.4%) | 44 (47.8%) | 29 (31.5%) | 164 (44.6%) |
| Clinically insignificant difference | 36 (39.1%) | 46 (50.0%) | 43 (46.7%) | 61 (66.3%) | 186 (50.5%) |
| Discordant | 4 (4.3%) | 7 (7.6%) | 5 (5.4%) | 2 (2.2%) | 18 (4.9%) |
| Interobserver LM concordance to original LM diagnosis | | | | | |
| Concordant | 34 (37.0%) | 17 (18.5%) | 32 (34.8%) | 33 (35.9%) | 116 (31.5%) |
| Clinically insignificant difference | 56 (60.9%) | 70 (76.1%) | 57 (62.0%) | 59 (64.1%) | 242 (65.8%) |
| Discordant | 2 (2.2%) | 5 (5.4%) | 3 (3.3%) | 0 (0%) | 10 (2.7%) |

Table 5 – Intraobserver diagnostic concordance rates between WSI and LM by the same pathologist and interobserver diagnostic concordance rates between LM and original LM by pathologist and in total.
| Case Number | Original Light Microscope Diagnosis | Study pathologist diagnostic resolutions/discrepancies when cases reviewed by light microscope |
|---|---|---|
| 6 | Left breast biopsies: benign fibrous breast tissue with foci of fibroadenomatoid change; microcalcifications identified | All WSI discrepancies resolved |
| 9 | DCIS, low grade with papillary and cribriform growth pattern, microcalcifications present | 1. Benign fibrocystic change with usual ductal hyperplasia, intraductal papilloma, and microcalcification. 2. Intraductal papilloma with focal atypical ductal hyperplasia with calcifications 3. Atypical intraductal papilloma (with at least atypical ductal hyperplasia, bordering on low-grade DCIS) with associated microcalcifications. Recommend excision of lesion |
| 17 | Ductal carcinoma in situ with extensive periductal sclerosis and inflammation. Focus suspicious but not diagnostic for microinvasion. Solid growth pattern with high nuclear grade. Microcalcifications within DCIS and necrosis present. | Invasive ductal carcinoma, high grade, ductal carcinoma in situ, high grade with comedo necrosis |
| 20 | Benign breast tissue with foci of adenosis with associated microcalcifications | Flat epithelial atypia with calcifications |
| 31 | Fragmented portions of focally calcified sclerosing papilloma and associated benign breast tissue. No atypia or malignancy identified | All WSI discrepancies resolved |
| 32 | Invasive ductal adenocarcinoma. Nottingham grade 2/3, no DCIS identified | All WSI discrepancies resolved |
| 40 | DCIS with cribriform to micropapillary growth pattern. Low nuclear grade without necrosis. Microcalcification in DCIS. | 1. Fibrocystic change with usual intraductal hyperplasia, multifocal microcalcification, prominent sclerosing adenosis. (Dr. 1 remains discordant without changes to WSI diagnosis) 2. Fibrocystic changes with sclerosing adenosis, radial scar, and focal atypical ductal hyperplasia. Adenoma with lactational change 3. Fibrocystic mastopathy with sclerosing adenosis and ductal hyperplasia of the usual type with associated microcalcifications within benign ductal lumina. (Possible ADH but favor usual hyperplasia) |
| 48 | Benign breast tissue with proliferative fibrocystic changes including microcyst formation, apocrine metaplasia, and non-atypical hyperplasia | All WSI discrepancies resolved |
| 50 | Benign breast tissue with proliferative fibrocystic changes and fibroadenoma | Phyllodes tumor (two diagnoses) |
| 53 | Benign breast tissue with proliferative fibrocystic changes including non-atypical ductal hyperplasia, fibrosis and duct ectasia | All WSI discrepancies resolved |
| 56 | Benign breast parenchyma with proliferative fibrocystic changes including sclerosing adenosis with associated microcalcifications | All WSI discrepancies resolved |
| 71 | Invasive carcinoma with features suggestive of pleomorphic lobular type. Nottingham grade 2/3. | All WSI discrepancies resolved |

Table 6 – Discordant diagnoses listed by case number. Diagnoses listed in the second column represent the original LM diagnoses, and diagnoses listed in the third column represent the discordant LM diagnoses rendered by study pathologists.
| Resolution | Monitor | Mean probability discordant | Standard Error | 95% Confidence Limit, Lower | 95% Confidence Limit, Upper |
|---|---|---|---|---|---|
| 1-20x | 1-Standard | 0.007713 | 0.01052 | 0.000526 | 0.103 |
| 1-20x | 2-Calibrated | 0.03087 | 0.01513 | 0.01168 | 0.07906 |
| 2-40x | 1-Standard | 0.05041 | 0.02133 | 0.02169 | 0.1128 |
| 2-40x | 2-Calibrated | 0.03318 | 0.01886 | 0.01073 | 0.09798 |

Table 7 – Probability of discordant WSI diagnoses to LM diagnoses by image resolution and monitor type groupings. (Generalized estimating equations (GEE) calculations [38])
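The manuscript does not state how the 95% confidence limits in Table 7 were derived from the GEE estimates. One common construction, offered here only as an assumption, is a Wald interval computed on the logit scale and back-transformed to a probability, with the logit-scale standard error obtained from the tabulated standard error by the delta method. The Python sketch below applies that construction to the tabulated means and standard errors; the function and variable names are hypothetical. Under this assumption the computed limits agree with the tabulated ones to within rounding of the inputs.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def expit(x):
    """Inverse logit: map a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def logit_wald_interval(p, se_p, z=Z_95):
    """Back-transformed Wald interval for a probability.

    Assumes se_p is the standard error on the probability scale, so the
    delta method gives SE(logit p) = se_p / (p * (1 - p)).
    """
    centre = math.log(p / (1.0 - p))
    half_width = z * se_p / (p * (1.0 - p))
    return expit(centre - half_width), expit(centre + half_width)


# (resolution, monitor) -> (mean probability discordant, standard error),
# transcribed from Table 7.
table7_rows = {
    ("20x", "Standard"): (0.007713, 0.01052),
    ("20x", "Calibrated"): (0.03087, 0.01513),
    ("40x", "Standard"): (0.05041, 0.02133),
    ("40x", "Calibrated"): (0.03318, 0.01886),
}

for (resolution, monitor), (p, se) in table7_rows.items():
    lower, upper = logit_wald_interval(p, se)
    print(f"{resolution} / {monitor}: {lower:.4g} to {upper:.4g}")
```

Working on the logit scale keeps both limits inside the interval (0, 1), which a symmetric interval on the probability scale would not do for the 20x / Standard row, where the standard error exceeds the mean.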
Fig. 1
Figure 2 – Diagnostic concordance comparisons between WSI, LM and pathologist. Comparison A: Interoperator comparison of original LM diagnoses and WSI diagnoses. Comparison B: Intraoperator comparison of pathologists' WSI diagnoses and their LM diagnoses. Comparison C: Interoperator comparison of original LM diagnoses to study pathologists' LM diagnoses.
Fig. 3A
Fig. 3B