Reliable Clinical MLH1 Promoter Hypermethylation Assessment Using a High-Throughput Genome-Wide Methylation Array Platform

Journal Pre-proof

PII: S1525-1578(19)30455-6
DOI: https://doi.org/10.1016/j.jmoldx.2019.11.005
Reference: JMDI 865
To appear in: The Journal of Molecular Diagnostics
Received Date: 25 July 2019
Revised Date: 29 October 2019
Accepted Date: 18 November 2019

Please cite this article as: Benhamida JK, Hechtman JF, Nafa K, Villafania L, Sadowska J, Wang J, Wong D, Zehir A, Zhang L, Bale T, Arcila ME, Ladanyi M, Reliable Clinical MLH1 Promoter Hypermethylation Assessment using a High-Throughput Genome-Wide Methylation Array Platform, The Journal of Molecular Diagnostics (2020), doi: https://doi.org/10.1016/j.jmoldx.2019.11.005. This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain. Copyright © 2019 Published by Elsevier Inc. on behalf of the American Society for Investigative Pathology and the Association for Molecular Pathology.

Jamal K. Benhamida, Jaclyn F. Hechtman, Khedoudja Nafa, Liliana Villafania, Justyna Sadowska, Jiajing Wang, Donna Wong, Ahmet Zehir, Liying Zhang, Tejus Bale, Maria E. Arcila, and Marc Ladanyi

From the Department of Pathology, Memorial Sloan Kettering Cancer Center, 1250 First Avenue, Room S-801, New York, NY 10065.

Corresponding Author: Jamal Benhamida Email: [email protected] Address: 1250 First Avenue, Room S-801, New York, NY 10065

Disclosures: None declared.

Funding: Supported by NCI Cancer Center Support Grant (P30 CA008748).

Abstract

Clinical testing for MLH1 promoter hypermethylation status is important in the workup of patients with MLH1-deficient colorectal and uterine carcinomas when evaluating patients for Lynch syndrome. Current assays utilize single gene–based methods to assess promoter hypermethylation. Herein, we describe the development and report the performance of a clinical assay for MLH1 promoter hypermethylation using the Infinium MethylationEPIC (850k) bead-array platform. Using four CpG sites within the MLH1 gene promoter, a qualitative MLH1 promoter hypermethylation assay was developed and validated using 63 gastrointestinal and uterine carcinoma samples of known hypermethylation status based on a pyrosequencing reference test. The array-based method achieves clinically robust and reproducible results at an analytical sensitivity level of 8%. Importantly, the 850k array contains probes targeting over 850,000 additional CpG sites across the genome, covering sites in most known genes as well as important enhancer regions provided by the ENCODE and FANTOM5 projects. Thus, the testing modality presented may also be applied to determine the methylation status of other clinically relevant genes or regulatory regions, potentially providing a single laboratory testing workflow for all clinical methylation assays. Furthermore, the concomitant acquisition of genome-wide methylation information provides a workflow that seamlessly enables wider translational epigenetic research.

Introduction

The DNA mismatch repair (MMR) system is composed of four key genes: mutL homologue 1 (MLH1), postmeiotic segregation increased 2 (PMS2), mutS homologue 2 (MSH2), and mutS homologue 6 (MSH6) [1]. These genes encode proteins that function in heterodimers (MLH1/PMS2 and MSH2/MSH6) to repair and prevent DNA mutations. Defects in the system result in a dramatic increase in single-nucleotide variants and small insertion and deletion mutations, a phenotype termed mismatch repair deficiency (MMR-D) or microsatellite instability–high (MSI-H). MMR-D results from a loss of gene function due to either epigenetic silencing or loss-of-function mutations. MLH1 promoter methylation with consequential gene silencing is the most common cause of the MMR-D/MSI-H phenotype in colorectal and uterine carcinomas, occurring in approximately 10% to 20% of cases [2, 3, 4, 5]. Germline mutations in any of the four MMR genes result in Lynch syndrome (also called hereditary nonpolyposis colorectal cancer), a hereditary cancer predisposition syndrome that carries an elevated risk of colorectal, uterine, and genitourinary cancers [6, 7]. Universal routine clinical testing for MSI status/MMR protein expression is recommended for all colorectal adenocarcinomas and endometrial carcinomas [8, 9, 10]. MLH1 promoter hypermethylation testing is frequently performed for tumors that demonstrate loss of MLH1 expression because somatic MLH1 promoter hypermethylation supports sporadic MSI-H/MMR-D rather than Lynch syndrome [11, 12]. Rarely, MLH1 promoter methylation may occur as a constitutional event or in the setting of Lynch syndrome [13].

DNA methylation is an important epigenetic regulatory process with key functions in development, tissue differentiation, and disease [14]. Methylation most commonly occurs on the fifth carbon of cytosines within cytosine-guanine dinucleotides (CpG sites) [15]. A variety of analytic methodologies have been developed to assay MLH1 promoter hypermethylation status, including real-time PCR, methylation-specific PCR, and pyrosequencing [16-23]. The single gene–based methods are primarily based on the evaluation of a 78-base pair region in the MLH1 promoter containing eight CpG sites that are associated with loss of MLH1 protein expression [24, 25]. Recently, a variety of sequencing- and array-based high-throughput platforms have been developed to assess methylation status in a genome-wide fashion. The Illumina (San Diego, CA) Infinium bead-array–based methylation platform is a widely used and cost-effective platform. The current iteration of the array technology (850k) profiles the methylation status of approximately 850,000 CpG sites across all 23 chromosomes. CpG sites in over 95% of known genes, as well as important enhancer and intergenic regions, are covered on the platform [26]. A number of methods have been developed on the platform for clinical testing, such as a logistic regression–based model for MGMT promoter hypermethylation testing [27, 28]. Herein, we establish a clinically robust test for MLH1 promoter hypermethylation using four CpG sites targeted by the 850k array.

Materials and Methods

Samples were procured under IRB #12-245, and the study was conducted as a clinical laboratory assay validation for submission to the New York State Department of Health.

DNA samples

Sixty-three DNA samples (38 positive/hypermethylated, 25 negative/not hypermethylated) extracted from formalin-fixed, paraffin-embedded (FFPE) tissue were obtained from our archive of clinical samples at Memorial Sloan Kettering Cancer Center. The samples were selected in two phases. First, a group of 24 uterine and gastrointestinal samples was obtained, followed by a group of 38 colorectal samples (see validation process below). The samples comprised 45 colorectal carcinomas, 15 uterine carcinomas, two gastric carcinomas, and one appendiceal carcinoma that previously underwent clinical testing for MLH1 promoter hypermethylation using a pyrosequencing-based method. The samples were selected to span the range of methylation levels (positive samples 14% to 71%; negative samples 2% to 8%) routinely present in clinical specimens, particularly near 10%, which is the level of sensitivity of the reference method. A summary of the samples is available in Supplemental Table S1. The Human Methylated & Non-methylated DNA Set (catalog no. D5014; Zymo Research, Irvine, CA), consisting of fully methylated and fully non-methylated DNA derived from human cell lines, was used to assess the performance characteristics of the platform.

850k protocol

Genomic DNA was extracted from FFPE tissue using the Chemagic DNA Tissue kit (PerkinElmer chemagen Technologie, GmbH, Germany) after manual macro-dissection to ensure at least 10% tumor content. For each sample, 250 ng of input was used for bisulfite conversion (EZ DNA Methylation Kit, Zymo Research, catalog no. D5002) followed by an FFPE restoration step using the Infinium HD FFPE DNA Restore Kit (Illumina, catalog no. WG-321-1002). All samples were processed on the Infinium 850k array and scanned using the Illumina iScan according to the manufacturer’s recommended protocol. Each CpG site interrogated by the Infinium array is identified by a unique cg id in the format of “cg#” where # is a number (for example, cg23658326 is the cg id of an MLH1 promoter CpG site).

Reference method and 850k CpG sites

A previously validated pyrosequencing-based assay used in the CLIA-accredited Diagnostic Molecular Genetic laboratory at Memorial Sloan Kettering Cancer Center was used as the reference method for all samples [20-23]. This method, utilizing the PyroMark (QIAGEN, Germany) system, quantitatively measures the methylation level of five CpG sites within the MLH1 promoter and defines a sample as MLH1 promoter hypermethylated if all five CpG sites demonstrate greater than 10% methylation. The five CpG sites are highly correlated; they are either all methylated or all unmethylated. The 850k array targets four (cg ids cg23658326, cg11600697, cg21490561, and cg00893636) of the eight CpG sites associated with MLH1 silencing. One of these CpG sites (cg23658326) overlaps with a site interrogated by the reference method and, of note, the methylation status of one CpG site (cg00893636) was previously associated with decreased MLH1 gene expression [29]. Table 1 lists the genomic positions of the eight CpG sites associated with MLH1 gene silencing, with respect to both Genome Reference Consortium Human Build 37 (GRCh37) and the numbering of Deng et al [24], and indicates which CpG sites are utilized by the reference method and by the 850k array method for clinical testing. Of note, these CpG sites are also contained on Illumina’s targeted methylation sequencing platform (using the TruSeq Methyl Capture EPIC Library Prep Kit). The MLH1 probes chosen do not meet criteria for poor probe performance (eg, cross-reactivity or the presence of SNPs near the CpG site) as defined by prior studies [30, 31].

Bioinformatic and statistical analyses

Bioinformatic and statistical analyses were performed with R version 3.5.2 (https://www.R-project.org/; date of last access: 11/27/2019) using the minfi 1.22.1 package [32]. Green and red channel intensities were normalized using the preprocessIllumina function with background correction. The methylation level for each CpG site was quantified using beta values, calculated as the ratio of the methylated signal to the sum of the total signal and an offset (100 is the Illumina-recommended offset). Beta values are continuous values between zero and one. Values closer to zero indicate low to absent methylation and values closer to one indicate high levels of methylation; values between the two extremes indicate partial methylation (eg, a mixture of cell types with methylation in a subset of cells or hemi-methylation of a CpG site) [34].
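The beta-value definition above can be illustrated with a minimal Python sketch. This is not the minfi/R pipeline used in the study; the function name and the example intensities are hypothetical and serve only to show the arithmetic, beta = M / (M + U + offset), with the offset of 100 mentioned in the text.

```python
def beta_value(methylated, unmethylated, offset=100.0):
    """Beta value: methylated signal divided by (total signal + offset)."""
    if methylated < 0 or unmethylated < 0:
        raise ValueError("signal intensities must be non-negative")
    return methylated / (methylated + unmethylated + offset)


# Hypothetical probe intensities, for illustration only.
print(beta_value(4500.0, 500.0))  # predominantly methylated signal
print(beta_value(300.0, 5700.0))  # predominantly unmethylated signal
```

Because of the offset, the value stays strictly below one and is defined even when both signals are zero.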

Sensitivity and specificity were estimated with 95% binomial proportion confidence intervals based on the Jeffreys prior, using the binom R package version 1.1-1 [33].
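The sensitivity and specificity reported in the Results (98% and 97%, alongside 100% concordance) appear consistent with the posterior mean under a Jeffreys Beta(0.5, 0.5) prior, (x + 0.5)/(n + 1), where x is the number of concordant calls out of n. This is an inference from the reported numbers, not something the text states. The sketch below is plain Python rather than the binom R package, the function name is hypothetical, and the counts are the 23 positive and 15 negative validation samples described in the validation process.

```python
def jeffreys_posterior_mean(successes, trials):
    """Posterior mean of a proportion under a Jeffreys Beta(0.5, 0.5) prior."""
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("require 0 <= successes <= trials and trials > 0")
    return (successes + 0.5) / (trials + 1.0)


# Assumption: all validation samples were called concordantly.
sensitivity = jeffreys_posterior_mean(23, 23)
specificity = jeffreys_posterior_mean(15, 15)
print(round(sensitivity, 2), round(specificity, 2))
```

The interval bounds would additionally require quantiles of the Beta(x + 0.5, n - x + 0.5) distribution, which are omitted here.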

Quality control

Each array contains several internal control probes to assess sample-independent and sample-dependent steps. Among these are a set of internal bisulfite conversion controls that target unmethylated cytosines (non-polymorphic non-CpG–associated cytosines). These controls consist of unconverted probes and converted probes that target unconverted and converted DNA, respectively. As the probes intentionally target unmethylated cytosines, signal from the converted probes should be greater than the signal from the unconverted probes in a successfully converted sample. The control performance can be assessed quantitatively by taking the ratio of converted to unconverted signals. Values greater than 1 signify successful conversion. According to the manufacturer’s recommendations, non-FFPE controls (eg, mixtures of fully methylated and unmethylated cell lines) are useful in evaluating the conversion because i) FFPE samples often demonstrate depressed ratios and ii) when bisulfite conversion fails, it typically occurs across all samples in a batch. Thus, a non-FFPE sample is useful to assess the experiment-wide bisulfite conversion efficiency. All of the controls were evaluated using Illumina’s BeadArray Controls Reporter software v1.1 and visualized with Genome Studio 2.0 according to the manufacturer’s recommendations.
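The bisulfite conversion check described above reduces to a ratio test, sketched here in Python. This is not Illumina's BeadArray Controls Reporter; the function names, sample labels, and signal values are hypothetical, and only the "ratio greater than 1" criterion comes from the text.

```python
def conversion_ratio(converted_signal, unconverted_signal):
    """Ratio of converted-probe signal to unconverted-probe signal."""
    if unconverted_signal <= 0:
        raise ValueError("unconverted signal must be positive")
    return converted_signal / unconverted_signal


def conversion_ok(converted_signal, unconverted_signal):
    """A ratio greater than 1 signifies successful bisulfite conversion."""
    return conversion_ratio(converted_signal, unconverted_signal) > 1.0


# Hypothetical control signals (converted, unconverted) per sample.
run = {
    "non_ffpe_control": (12000.0, 800.0),
    "ffpe_sample_1": (3000.0, 1500.0),
    "ffpe_sample_2": (900.0, 1100.0),
}
depressed = [name for name, (c, u) in run.items() if not conversion_ok(c, u)]
print(depressed)
```

Listing the samples with depressed ratios per run makes an experiment-wide conversion failure easier to spot than inspecting samples one at a time.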

Individual MLH1 probe failures were assessed using the detectionP minfi function. The function calculates P-values for each CpG site based on the CpG site’s total intensity under a background null hypothesis distribution defined by a set of internal negative control probes and assuming a normal distribution. Probes with P-values > 0.001 (approximately three standard deviations from the mean) were considered failures.
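The idea behind the detection P-value can be sketched as a one-sided normal tail probability. The Python below is a simplified illustration under stated assumptions, not minfi's detectionP implementation: the background mean and standard deviation are taken as given inputs (hypothetical values here), whereas minfi derives them from the negative control probes.

```python
from statistics import NormalDist


def detection_p_value(total_intensity, background_mean, background_sd):
    """Upper-tail probability of the total intensity under a normal background."""
    background = NormalDist(mu=background_mean, sigma=background_sd)
    return 1.0 - background.cdf(total_intensity)


def probe_failed(p_value, threshold=0.001):
    """Probes with P-values above the threshold are considered failures."""
    return p_value > threshold


# Hypothetical background parameters and probe intensities.
for intensity in (400.0, 1000.0):
    p = detection_p_value(intensity, background_mean=100.0, background_sd=100.0)
    print(intensity, p, probe_failed(p))
```

An intensity exactly three standard deviations above the background mean gives a one-sided P-value of about 0.00135, which is why the 0.001 threshold corresponds only approximately to three standard deviations.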

Validation process

Quantitative and qualitative performance characteristics of the array were first assessed by looking at the distribution of beta values across pre-defined methylation levels (100%, 50%, 25%, 12.5%, 6.3%, 3.1%, and 0%) using samples generated from serial dilutions of fully methylated DNA into fully non-methylated DNA and run in triplicate. The method for clinical determination of MLH1 promoter hypermethylation status was developed and validated in a two-step process. First, 24 DNA samples (14 positive, 10 negative) extracted from FFPE tissue were analyzed as a training study to establish interpretive criteria for calling hypermethylation status based on the performance characteristics of the samples in the study. The accuracy, inter- and intra-assay reproducibility, analytical sensitivity, and limit of detection were then evaluated on the remaining samples (n=38; 23 positive, 15 negative) using the interpretive criteria as defined by the training study. Particular attention was paid to samples at low levels of methylation (approximately 10%) to ensure the platform could accurately assay samples with low tumor content that are routinely seen in clinical settings.

Inter- and intra-assay variability was assessed by running three positive (methylation levels 15%, 24%, and 26%) and three negative FFPE samples (methylation levels 3%, 4%, and 5%) in triplicate on one run and across three different runs. Analytical sensitivity was assessed by diluting a positive FFPE sample (average methylation level of 66% by pyrosequencing) into a negative FFPE sample to achieve 100%, 50%, 25%, 12.5%, 6.3%, 3.2%, and 0% levels of the positive sample (corresponding to methylation levels of 66%, 33%, 17%, 8%, 4%, 2%, and 0%). A limit of detection experiment was performed by running a positive sample (methylation level 40%) at DNA input amounts of 250, 200, 150, 100, 50, and 25 ng prior to bisulfite conversion.
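The methylation levels quoted for the analytical sensitivity dilution series follow from multiplying the 66% stock level by each dilution fraction. The Python sketch below reproduces that arithmetic; it assumes the negative diluent contributes negligible methylation (as the listed levels imply) and rounds halves upward, and the function name is hypothetical.

```python
import math


def expected_methylation(stock_percent, positive_fractions_percent):
    """Expected methylation (%) after diluting a positive sample into a negative one."""
    levels = []
    for fraction in positive_fractions_percent:
        level = stock_percent * fraction / 100.0
        levels.append(math.floor(level + 0.5))  # round half up to a whole percent
    return levels


levels = expected_methylation(66.0, [100, 50, 25, 12.5, 6.3, 3.2, 0])
print(levels)  # [66, 33, 17, 8, 4, 2, 0]
```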

Results

The dilution study using samples generated from fully methylated and fully unmethylated human cell line DNA (Figure 1) demonstrated three important performance characteristics of the MLH1 CpG sites assayed by the array: i) each CpG site generates a unique beta-value distribution for a given methylation level, ii) beta values measured on the array are not a reliable quantitative measure of methylation, and iii) the array can distinguish unmethylated CpG sites from CpG sites with a methylation level greater than 6% in high-quality DNA samples. Results of the training study are shown in Figure 2A. Informed by these data, a qualitative cutoff was defined for each CpG site such that a CpG site was designated as methylated if its beta value was greater than or equal to the cutoff and unmethylated if it was less than the cutoff. The cutoffs were defined as the mean plus three standard deviations of the negative samples’ beta values (Table 2 and Figure 2). The methylation state of the four CpG sites was highly correlated; the CpG sites were either all methylated or all unmethylated. Based on these data, interpretive criteria for determining promoter hypermethylation status using the four MLH1 beta values were defined such that all four CpG sites are required to be qualitatively methylated (ie, at or above their respective cutoffs) to call a sample positive for MLH1 promoter hypermethylation (Table 3).
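The cutoff definition and the interpretive criteria can be expressed compactly. The following Python sketch is illustrative only: the beta values are hypothetical (the actual cutoffs are those in Table 2), the function names are invented for this example, and the use of the sample standard deviation is an assumption, since the text does not say whether the sample or population form was used.

```python
from statistics import mean, stdev

MLH1_CPG_IDS = ("cg23658326", "cg11600697", "cg21490561", "cg00893636")


def site_cutoffs(negative_betas):
    """Per-site cutoff: mean + 3 SD of the negative training samples' beta values."""
    return {cg: mean(values) + 3 * stdev(values)
            for cg, values in negative_betas.items()}


def is_hypermethylated(sample_betas, cutoffs):
    """Positive only if all four CpG sites are at or above their cutoffs."""
    return all(sample_betas[cg] >= cutoffs[cg] for cg in MLH1_CPG_IDS)


# Hypothetical negative training data (same toy values for every site).
training_negatives = {cg: [0.1, 0.2, 0.3] for cg in MLH1_CPG_IDS}
cutoffs = site_cutoffs(training_negatives)

positive_like = {cg: 0.6 for cg in MLH1_CPG_IDS}
one_site_low = dict(positive_like, cg21490561=0.4)
print(is_hypermethylated(positive_like, cutoffs))
print(is_hypermethylated(one_site_low, cutoffs))
```

Requiring all four sites guards against a single site drifting above its cutoff in an otherwise negative sample.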

These criteria were validated on the second group (validation group) of 38 FFPE samples with known MLH1 promoter hypermethylation status. The array demonstrated 100% concordance with the reference method (Figure 2B) for the samples tested. The estimated sensitivity was 98% (95% CI: 92% to 100%) and the estimated specificity was 97% (95% CI: 88% to 100%). All beta values for the four CpG sites were above their qualitative cutoffs for all the positive samples. Negative samples showed larger variation of beta values in the validation group than in the training group (Supplemental Tables S2 and S3); however, this did not affect the final designation of the samples based on the a priori defined criteria. For example, five negative samples had a single CpG site above the respective cutoff (two samples with an elevated cg23658326 site and three samples with an elevated cg21490561 site). One negative sample had two CpG sites above their respective cutoffs (cg11600697 and cg21490561). The four CpG sites for the remaining eight negative samples were all below their respective qualitative cutoffs. A comparison of the methylation level of the one CpG site on the pyrosequencing assay that overlaps with the array site was performed and supports the qualitative discriminative ability of the platform (Supplemental Figure S1).

Consistent with the cell line experiment, the analytical sensitivity study using FFPE samples demonstrated the ability to detect promoter hypermethylation when the level is greater than 8% (Figure 2C). Inter- and intra-assay variability studies demonstrated reproducibility of the assay within and across multiple runs (Figure 3A-B), particularly at the low methylation levels routinely seen in clinical settings (approximately 10%). The limit of detection study demonstrated good performance of the assay down to 25 ng input DNA prior to bisulfite conversion (Figure 3C). The beta values of the four MLH1 CpG sites across all experiments are provided in Supplemental Tables S4, S5, S6, S7, S8, S9, and S10.

Sixteen samples (from the training group) were repeated due to depressed bisulfite conversion control ratios (ratio of 1 or less) in 12 of the 16 samples tested on a single run, consistent with experiment-wide sub-optimal conversion. The run was deemed a failure and repeated with successful bisulfite conversion. All remaining samples’ internal controls showed acceptable performance, particularly the bisulfite conversion controls (ie, the ratios of converted probe signals to unconverted probe signals were greater than 1). No individual probes failed (ie, none had a P-value greater than 0.001) in any of the samples tested.

Discussion

This study demonstrates the ability of the Infinium methylation array to achieve robust and reproducible results for clinical MLH1 promoter hypermethylation testing on FFPE tissue at an analytical sensitivity of 8%. Consistent with prior studies, the beta values were heteroscedastic across methylation levels and demonstrated low variance at the extremes of methylation (0% and 100%) [34]. Although this technical behavior precludes a quantitative assay, the qualitative cutoff–based approach developed here shows excellent performance characteristics using beta values. Interpretive criteria that require all four MLH1 CpG sites to be above their respective cutoffs provide robustness against experimental variation inherent to the array, particularly for negative samples in which one to two CpG sites showed mild elevation above their cutoffs (Figure 2B). Used in isolation, a single CpG site may produce false-positive results. By utilizing four separate CpG sites in the interpretive criteria, no false-positive results occurred. No probe failures were observed in the experiments; however, it is recommended to repeat samples in which a probe failure occurs and the failure affects the clinical interpretation.

A limitation of this study is the use of a single scanner for all experiments. Experimental variation across different instruments was, therefore, not evaluated. We surmise that the beta value cutoffs established in this study may be influenced by scanner-dependent technical variables and, therefore, recommend validating the cutoffs independently on individual scanners.

This same approach can be expanded to assay any site covered on the array. For example, a similar methodology could be applied to CpG sites within the BRCA1 gene promoter for development of a clinical assay that assesses epigenetic BRCA1 silencing [35]. Importantly, the use of methylation arrays enables a single clinical laboratory testing workflow for all methylation assays, with potential advantages in turnaround time and more efficient utilization of technologist time due to the ability to batch samples across different assays. For example, MLH1, MGMT, BRCA1, and methylation-based CNS tumor classification tests can be performed simultaneously and analyzed in a single batch instead of multiple runs [36]. Finally, the use of genome-wide methylation arrays provides a large amount of additional data that can separately be used for a variety of epigenetic research purposes (upon appropriate IRB approval). The use of genome-wide methylation profiles as an ancillary diagnostic tool and as potential prognostic and/or predictive biomarkers is a rich area of current investigation [36, 37, 38, 39]. Thus, this platform could potentially provide a powerful parallel research platform for brain, gastrointestinal, endometrial, and other tumors in which the hypermethylation status of genes is clinically relevant.

References

1. Richman S. Deficient mismatch repair: Read all about it. Int. J. Oncol. 2015;47:1189–1202.
2. Herman JG, Umar A, Polyak K, Graff JR, Ahuja N, Issa JP, Markowitz S, Willson JK, Hamilton SR, Kinzler KW, Kane MF, Kolodner RD, Vogelstein B, Kunkel TA, Baylin SB. Incidence and functional consequences of hMLH1 promoter hypermethylation in colorectal carcinoma. Proc Natl Acad Sci U S A. 1998 Jun 9;95(12):6870-5.
3. Nagle CM, O'Mara TA, Tan Y, Buchanan DD, Obermair A, Blomfield P, Quinn MA, Webb PM, Spurdle AB; Australian Endometrial Cancer Study Group. Endometrial cancer risk and survival by tumor MMR status. J Gynecol Oncol. 2018 May;29(3):e39.
4. Black D, Soslow RA, Levine DA, Tornos C, Chen SC, Hummer AJ, Bogomolniy F, Olvera N, Barakat RR, Boyd J. Clinicopathologic significance of defective DNA mismatch repair in endometrial carcinoma. J Clin Oncol. 2006;24(11):1745–53. Epub 2006/03/22.
5. Moreira L, Balaguer F, Lindor N, de la Chapelle A, Hampel H, Aaltonen LA, Hopper JL, Le Marchand L, Gallinger S, Newcomb PA, Haile R, Thibodeau SN, Gunawardena S, Jenkins MA, Buchanan DD, Potter JD, Baron JA, Ahnen DJ, Moreno V, Andreu M, Ponz de Leon M, Rustgi AK, Castells A; EPICOLON Consortium. Identification of Lynch syndrome among patients with colorectal cancer. JAMA. 2012 Oct 17;308(15):1555-65.
6. Bonadona V, Bonaïti B, Olschwang S, Grandjouan S, Huiart L, Longy M, Guimbaud R, Buecher B, Bignon YJ, Caron O, Colas C, Noguès C, Lejeune-Dumoulin S, Olivier-Faivre L, Polycarpe-Osaer F, Nguyen TD, Desseigne F, Saurin JC, Berthet P, Leroux D, Duffour J, Manouvrier S, Frébourg T, Sobol H, Lasset C, Bonaïti-Pellié C; French Cancer Genetics Network. Cancer Risks Associated With Germline Mutations in MLH1, MSH2, and MSH6 Genes in Lynch Syndrome. JAMA. 2011;305(22):2304–2310.
7. Dowty JG, Win AK, Buchanan DD, Lindor NM, Macrae FA, Clendenning M, Antill YC, Thibodeau SN, Casey G, Gallinger S, Marchand LL, Newcomb PA, Haile RW, Young GP, James PA, Giles GG, Gunawardena SR, Leggett BA, Gattas M, Boussioutas A, Ahnen DJ, Baron JA, Parry S, Goldblatt J, Young JP, Hopper JL, Jenkins MA. Cancer risks for MLH1 and MSH2 mutation carriers. Hum Mutat. 2013 Mar;34(3):490-7.
8. National Comprehensive Cancer Network. Clinical Practice Guidelines in Oncology. Uterine Neoplasms (Version 4.2019).
9. National Comprehensive Cancer Network. Clinical Practice Guidelines in Oncology. Colon Cancer (Version 3.2019).
10. Sepulveda AR, Hamilton SR, Allegra CJ, Grody W, Cushman-Vokoun AM, Funkhouser WK, Kopetz SE, Lieu C, Lindor NM, Minsky BD, Monzon FA, Sargent DJ, Singh VM, Willis J, Clark J, Colasacco C, Rumble RB, Temple-Smolkin R, Ventura CB, Nowak J. Molecular Biomarkers for the Evaluation of Colorectal Cancer: Guideline from the American Society for Clinical Pathology, College of American Pathologists, Association for Molecular Pathology, and American Society of Clinical Oncology. J Mol Diagn. 2017;19(2):187–225.
11. Parsons MT, Buchanan DD, Thompson B, Young JP, Spurdle AB. Correlation of tumour BRAF mutations and MLH1 methylation with germline mismatch repair (MMR) gene mutation status: a literature review assessing utility of tumour features for MMR variant classification. J Med Genet. 2012;49:151–7.
12. Newton K, Jorgensen NM, Wallace AJ, Buchanan DD, Lalloo F, McMahon RF, Hill J, Evans DG. Tumour MLH1 promoter region methylation testing is an effective prescreen for Lynch Syndrome (HNPCC). J Med Genet. 2014 Dec;51(12):789-96.
13. Gazzoli I, Loda M, Garber J, Syngal S, Kolodner RD. A hereditary nonpolyposis colorectal carcinoma case associated with hypermethylation of the MLH1 gene in normal tissue and loss of heterozygosity of the unmethylated allele in the resulting microsatellite instability-high tumor. Cancer Res. 2002;62(14):3925–8.
14. Robertson KD. DNA methylation and human disease. Nat Rev Genet. Aug 2005;6(8):597-610.
15. Illingworth RS, Bird AP. CpG islands--'a rough guide'. FEBS letters. 2009;583:1713–1720.
16. Bettstetter M, Dechant S, Ruemmele P, Grabowski M, Keller G, Holinski-Feder E, Hartmann A, Hofstaedter F, Dietmaier W. Distinction of hereditary nonpolyposis colorectal cancer and sporadic microsatellite-unstable colorectal cancer through quantification of MLH1 methylation by real-time PCR. Clin Cancer Res. 2007;13:3221–3228.
17. Grady WM, Rajput A, Lutterbaugh JD, Markowitz SD. Detection of aberrantly methylated hMLH1 promoter DNA in the serum of patients with microsatellite unstable colon cancer. Cancer Res. 2001;61:900–902.
18. Herman JG, Graff JR, Myohanen S, Nelkin BD, Baylin SB. Methylation-specific PCR: a novel PCR assay for methylation status of CpG islands. Proc Natl Acad Sci U S A. Sep 3 1996;93(18):9821-9826.
19. Ogino S, Kawasaki T, Brahmandam M, Cantor M, Kirkner GJ, Spiegleman D, Makrigiorgos GM, Weisenberger DJ, Laird PW, Loda M, Fuchs CS. Precision and performance characteristics of bisulfite conversion and real-time PCR (MethyLight) for quantitative DNA methylation analysis. J Mol Diagn. May 2006;8(2):209-217.
20. Colella S, Shen L, Baggerly KA, Issa JP, Krahe R. Sensitive and quantitative universal Pyrosequencing methylation analysis of CpG sites. Biotechniques. Jul 2003;35(1):146-150.
21. Tost J, Dunker J, Gut IG. Analysis and quantification of multiple methylation variable positions in CpG islands by Pyrosequencing. Biotechniques. Jul 2003;35(1):152-156.
22. Ronaghi M, Uhlen M, Nyren P. A sequencing method based on real-time pyrophosphate. Science. Jul 17 1998;281(5375):363, 365.
23. Dejeux E, Audard V, Cavard C, Gut IG, Terris B, Tost J. Rapid identification of promoter hypermethylation in hepatocellular carcinoma by pyrosequencing of etiologically homogeneous sample pools. J Mol Diagn. Sep 2007;9(4):510-520.
24. Deng G, Chen A, Hong J, Chae HS, Kim YS. Methylation of CpG in a small region of the hMLH1 promoter invariably correlates with the absence of gene expression. Cancer Res. 1999;59:2029-2033.
25. Deng G, Peng E, Gum J, Terdiman J, Sleisenger M, Kim YS. Methylation of hMLH1 promoter correlates with the gene silencing with a region-specific manner in colorectal cancer. Br J Cancer. 2002;86:574-579.
26. Moran S, Arribas C, Esteller M. Validation of a DNA methylation microarray for 850,000 CpG sites of the human genome enriched in enhancer sequences. Epigenomics. 2016;8:389–399.
27. Bady P, Sciuscio D, Diserens AC, Bloch J, van den Bent MJ, Marosi C, Dietrich PY, Weller M, Mariani L, Heppner FL, McDonald DR, Lacombe D, Stupp R, Delorenzi M, Hegi ME. MGMT methylation analysis of glioblastoma on the Infinium methylation BeadChip identifies two distinct CpG regions associated with gene silencing and outcome, yielding a prediction model for comparisons across datasets, tumor grades, and CIMP-status. Acta Neuropathol. 2012 Oct;124(4):547–560.
28. Bady P, Delorenzi M, Hegi ME. Sensitivity analysis of the MGMT-STP27 model and impact of genetic and epigenetic context to predict the MGMT methylation status in gliomas and other tumors. J Mol Diagn. 2016;18:350-361.
29. Cancer Genome Atlas Network. Comprehensive molecular characterization of human colon and rectal cancer. Nature. 2012 Jul 18;487(7407):330-7.
30. Zhou W, Laird PW, Shen H. Comprehensive characterization, annotation and innovative use of Infinium DNA methylation BeadChip probes. Nucleic Acids Res. 2017;45(4):e22.
31. Chen YA, Lemire M, Choufani S, Butcher D, Grafodatskaya D, Zanke B, Gallinger S, Hudson T, Weksberg R. Discovery of cross-reactive probes and polymorphic CpGs in the Illumina Infinium HumanMethylation450 microarray. Epigenetics. 2013;8(2):203–209.
32. Fortin JP, Triche TJ Jr, Hansen KD. Preprocessing, normalization and integration of the Illumina HumanMethylationEPIC array with minfi. Bioinformatics. 2017 Feb 15;33(4):558-560.
33. Brown LD, Cai TT, DasGupta A. Interval estimation for a binomial proportion. Stat. Sci. 2001;16:101-133.
34. Du P, Zhang X, Huang CC, Jafari N, Kibbe WA, Hou L, Lin SM. Comparison of Beta-value and M-value methods for quantifying methylation levels by microarray analysis. BMC Bioinformatics. 2010 Nov 30;11:587.
35. Banerjee S, Kaye SB, Ashworth A. Making the best of PARP inhibitors in ovarian cancer. Nat Rev Clin Oncol. 2010;7:508-19.
36. Capper D, Jones DTW, Sill M, Hovestadt V, Schrimpf D, Sturm D, et al. DNA methylation-based classification of central nervous system tumours. Nature. 2018;555(7697):469–474.
37. Moran S, Martínez-Cardús A, Sayols S, Musulén E, Balañá C, Estival-Gonzalez A, Moutinho C, Heyn H, Diaz-Lagares A, de Moura MC, Stella GM, Comoglio PM, Ruiz-Miró M, Matias-Guiu X, Pazo-Cid R, Antón A, Lopez-Lopez R, Soler G, Longo F, Guerra I, Fernandez S, Assenov Y, Plass C, Morales R, Carles J, Bowtell D, Mileshkin L, Sia D, Tothill R, Tabernero J, Llovet JM, Esteller M. Epigenetic profiling to classify cancer of unknown primary: a multicentre, retrospective analysis. Lancet Oncol. 2016;17:1386–95. doi: 10.1016/S1470-2045(16)30297-2.
38. Dogan S, Vasudevaraja V, Xu B, Serrano J, Ptashkin R, Jae Jung H, Chiang S, Jungbluth A, Cohen M, Ganly I, Berger M, Boroujeni AM, Ghossein R, Ladanyi M, Chute D, Snuderl M. DNA methylation-based classification of sinonasal undifferentiated carcinoma. Modern Pathology (2019). 11 June.
39. Boussios S, Ozturk MA, Moschetta M, Karathanasi A, Zakynthinakis-Kyriakou N, Katsanos KH, Christodoulou DK, Pavlidis N. The Developing Story of Predictive Biomarkers in Colorectal Cancer. J Pers Med. 2019 Feb 7;9(1):12.

Figure Legends

Figure 1. Performance characteristics study using cell lines at varying methylation levels run in triplicate. The four CpG sites from the triplicate samples are represented with different colors. Each CpG site demonstrates a unique beta distribution. Variance of the beta distributions is largest for intermediate methylation levels and smallest for 0% and 100% levels. A small amount of jitter is added to methylation level for visualization purposes.

Figure 2. Results of training (A), validation (B), and analytical sensitivity (C) studies. In each figure, the four MLH1 CpG sites are represented on the x-axis (with a small amount of jitter for ease of visualization) and their respective beta values are represented on the y-axis. Line segments connect beta values from the same sample. The training study (A) was used to establish qualitative cut-offs (blue line segments in A and B; black line segments in C) and interpretive criteria. The interpretive criteria were tested in the validation study (B). Box and whisker plots are shown for each CpG site, for the negative samples only, in the training and validation studies. The analytical sensitivity study (C), in which a sample with 66% methylation by quantitative pyrosequencing was diluted into a negative sample, shows consistent hypermethylation detection when the methylation level is 8% or greater.

Figure 3. Inter-assay reproducibility (A), intra-assay reproducibility (B), and limit of detection (C) studies. In each figure, the four MLH1 CpG sites are represented on the x-axis and their respective beta values are represented on the y-axis. Line segments connect beta values from the same sample. Black line segments designate the CpG-specific qualitative cut-off as defined by the training study. Samples with methylation levels near the limit of sensitivity were run across three runs (A) and in triplicate on the same run (B). The results were accurate and reproducible across and within experimental runs. The limit of detection study (C) demonstrated the ability to detect hypermethylation in a positive sample using as little as 25 ng of DNA input prior to bisulfite conversion.

Table 1. Description of CpG sites associated with MLH1 gene repression.

| CpG site | Illumina cg id | GRCh37 coordinate on chromosome 3 | Position as designated in Deng et al [21, 22] |
| --- | --- | --- | --- |
| 1* | N/A | 37034770 | -248 |
| 2* | N/A | 37034777 | -241 |
| 3* | cg23658326 | 37034787 | -231 |
| 4* | N/A | 37034789 | -229 |
| 5* | N/A | 37034795 | -223 |
| 6 | cg11600697 | 37034814 | -204 |
| 7 | cg21490561 | 37034825 | -193 |
| 8 | cg00893636 | 37034840 | -178 |

*Sites interrogated by the pyrosequencing reference method.
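The two position columns of Table 1 can be cross-checked against each other. The following is a minimal sketch, not part of the authors' pipeline: it assumes that the Deng et al positions are simple base-pair offsets from a single reference point on the same strand, and the helper name `implied_reference` is hypothetical. If that assumption holds, every row should imply the same reference coordinate.

```python
# Rows from Table 1: (CpG site number, GRCh37 coordinate on chr3, position per Deng et al).
TABLE_1 = [
    (1, 37034770, -248),
    (2, 37034777, -241),
    (3, 37034787, -231),
    (4, 37034789, -229),
    (5, 37034795, -223),
    (6, 37034814, -204),
    (7, 37034825, -193),
    (8, 37034840, -178),
]


def implied_reference(coordinate: int, position: int) -> int:
    """Return the genomic coordinate of position 0 implied by one table row.

    Assumption (not stated in the table): positions are plain base-pair
    offsets from a single reference point.
    """
    return coordinate - position


# Collect the distinct reference coordinates implied by all rows.
references = {implied_reference(coord, pos) for _, coord, pos in TABLE_1}

# A single element means the coordinate and position columns agree row by row.
print(references)  # {37035018}
```

All eight rows imply the same reference coordinate, so the two columns are internally consistent under the stated assumption.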

Table 2. Empirically determined qualitative beta value cut-offs for the four MLH1 CpG sites.

| CpG site | Qualitative beta value cut-off |
| --- | --- |
| cg23658326 | 0.18 |
| cg11600697 | 0.27 |
| cg21490561 | 0.11 |
| cg00893636 | 0.10 |

Table 3. Interpretive criteria for reporting MLH1 promoter hypermethylation status.

| Analytical result | Clinical interpretation |
| --- | --- |
| Beta values for all four MLH1 CpG sites greater than or equal to their respective cut-offs. | MLH1 promoter hypermethylation positive |
| Beta values for fewer than four of the MLH1 CpG sites greater than or equal to their respective cut-offs. | MLH1 promoter hypermethylation negative |
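The interpretive rule in Table 3, combined with the cut-offs in Table 2, can be expressed as a short function. The following is a minimal sketch for illustration only, not the authors' implementation: the function name `classify_mlh1`, the dictionary layout, and the example beta values are hypothetical, and upstream steps such as array preprocessing and quality control are omitted.

```python
# Qualitative beta value cut-offs from Table 2.
CUTOFFS = {
    "cg23658326": 0.18,
    "cg11600697": 0.27,
    "cg21490561": 0.11,
    "cg00893636": 0.10,
}


def classify_mlh1(betas: dict) -> str:
    """Apply the Table 3 criteria to one sample.

    `betas` maps each of the four Illumina cg ids to its beta value.
    Returns "positive" only if every site is at or above its cut-off,
    otherwise "negative".
    """
    missing = sorted(set(CUTOFFS) - set(betas))
    if missing:
        # Refuse to interpret a sample that lacks any of the four sites.
        raise ValueError(f"missing beta values for: {', '.join(missing)}")
    all_at_or_above = all(betas[cg] >= cutoff for cg, cutoff in CUTOFFS.items())
    return "positive" if all_at_or_above else "negative"


# Hypothetical samples (illustrative beta values, not data from the study).
sample_a = {"cg23658326": 0.40, "cg11600697": 0.35, "cg21490561": 0.30, "cg00893636": 0.25}
sample_b = {"cg23658326": 0.40, "cg11600697": 0.20, "cg21490561": 0.30, "cg00893636": 0.25}

print(classify_mlh1(sample_a))  # positive
print(classify_mlh1(sample_b))  # negative (cg11600697 is below its 0.27 cut-off)
```

Requiring all four sites to reach their cut-offs means a single low site makes the call negative, which matches the second row of Table 3.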

[Figure 1 image: beta value (y-axis, 0.00 to 1.00) versus methylation level in % (x-axis: 0, 3, 6, 13, 25, 50, 100) for CpG sites cg00893636, cg11600697, cg21490561, and cg23658326.]

[Figure 2 image: panels A (Training study), B (Validation study), and C (Analytical sensitivity study). Each panel plots beta value (y-axis, 0.00 to 1.00) versus CpG site (x-axis: cg23658326, cg11600697, cg21490561, cg00893636). Legend entries: Positive, Negative; Greater than 8%, 8% triplicate, 4% triplicate, Less than 4%.]

[Figure 3 image: panels A (Inter-assay reproducibility study), B (Intra-assay reproducibility study), and C (Limit of detection study). Each panel plots beta value (y-axis, 0.00 to 1.00) versus CpG site (x-axis: cg23658326, cg11600697, cg21490561, cg00893636). Legend entries: Positive-1 (26%), Positive-2 (24%), Positive-3 (15%), Negative-1 (3%), Negative-2 (4%), Negative-3 (5%); DNA input 25 ng, 50 ng, 100 ng, 150 ng, 200 ng, 250 ng.]