Immunoreactive trypsinogen (IRT) as a biomarker for cystic fibrosis: Challenges in newborn dried blood spot screening


Molecular Genetics and Metabolism 106 (2012) 1–6


Molecular Genetics and Metabolism journal homepage: www.elsevier.com/locate/ymgme

Conference Proceedings

Bradford L. Therrell Jr. a,b, W. Harry Hannon c,⁎, Gary Hoffman d, Jelili Ojodu e, Philip M. Farrell f

a National Newborn Screening and Genetics Resource Center, 1912 West Anderson Lane, Suite 210, Austin, TX, USA
b Department of Pediatrics, University of Texas Health Science Center at San Antonio, TX, USA
c National Newborn Screening and Genetics Resource Center, 4929 Duncans Lake Point, Buford, GA, USA
d Wisconsin State Laboratory of Hygiene, 465 Henry Mall, Madison, WI, USA
e Association of Public Health Laboratories, 8515 Georgia Avenue, Suite 700, Silver Spring, MD, USA
f University of Wisconsin School of Medicine and Public Health, 610 Walnut Street, Madison, WI, USA

Article info

Available online 28 February 2012

Keywords: Newborn dried blood spot screening; Cystic fibrosis; Immunoreactive trypsinogen; False negatives

Abstract

On May 23–24, 2011, a workshop entitled “Immunoreactive Trypsinogen (IRT) as a Biomarker for Cystic Fibrosis: Technical Issues and Challenges” was held in Annapolis, Maryland. The two-day workshop was co-hosted by the National Newborn Screening and Genetics Resource Center, Austin, Texas, and the Association of Public Health Laboratories, Silver Spring, Maryland, in collaboration with the Health Resources and Services Administration and the Centers for Disease Control and Prevention. Participants included nearly 40 representatives from U.S. state public health and commercial laboratories performing newborn dried blood spot screening tests for cystic fibrosis (CF), the federal government, academic research institutions, and commercial vendors of products used in newborn screening. Representatives from selected European CF newborn screening programs were also present. The workshop focused on identifying key IRT testing issues and mechanisms for achieving their resolution and laboratory harmonization in order to reduce, or eliminate completely, the late identified CF cases following a negative newborn screen. Informative findings are reported, their impacts on improving IRT screening are described, and their implications are discussed.

Abbreviations: ACMG, American College of Medical Genetics; APHL, Association of Public Health Laboratories; CDC, Centers for Disease Control and Prevention; CF, cystic fibrosis; CFTR, cystic fibrosis transmembrane conductance regulatory gene; DBS, dried blood spot; EGA, expanded genetic analysis; HRSA, Health Resources and Services Administration; IRT, immunoreactive trypsinogen; MoM, multiple of the median; NDBS, newborn dried blood spot screening; NNSGRC, National Newborn Screening and Genetics Resource Center; NICU, neonatal intensive care units; ROC, receiver operating characteristic.

Author disclosure: Dr. Hannon, Emeritus Chief (retired) from the Newborn Screening and Molecular Biology Branch of the Centers for Disease Control and Prevention (CDC) serves on the NBS Scientific Advisory Council for Advanced Liquid Logic, Inc., Research Triangle Park, North Carolina, and the Georgia Governor's Public Health Advisory Council. Dr. Hannon also provides consulting services upon request to the National Newborn Screening and Genetic Resource Center in Austin, Texas and PerkinElmer, Inc. in Waltham, Massachusetts.

⁎ Corresponding author at: 4929 Duncans Lake Point, Buford, GA 30519, USA. Fax: +1 512 454 6509.

doi:10.1016/j.ymgme.2012.02.013

1. Introduction

Cystic fibrosis (CF) occurs in about 1 of 4000 U.S. births and has a similar prevalence in Western European newborns. This autosomal recessive disorder is characterized clinically by lung disease (chronic infection, inflammation, and airways obstruction), gastrointestinal abnormalities (pancreatic insufficiency, malabsorption, and malnutrition), salt loss (high sweat electrolytes), and other clinical manifestations (intestinal obstruction, cirrhosis, diabetes, etc.). Population-based newborn dried blood spot screening (NDBS) for CF became possible in 1979 after Crossley [1,2] demonstrated that immunoreactive trypsinogen (IRT) was elevated in the blood of newborns eventually diagnosed with CF. CF was included among the original 29 core disorders in the recommended uniform screening panel of the American College of Medical Genetics (ACMG) [3], which was subsequently accepted and recommended by the US Secretary of Health and Human Services [4]. All states and the District of Columbia now require CF NDBS [5].

Although there have been three decades of testing experience, numerous studies demonstrating that NDBS for CF is effective, and several published international recommendations containing guidelines, a universally accepted NDBS analytical protocol does not exist. Many issues still remain concerning the biomarker IRT, which is used as the first marker for detecting CF by essentially all NDBS programs. Ironically, in 1983, the U.S. Cystic Fibrosis Foundation's Task Force noted “some inherent variability” in the IRT assay following initiation of CF NDBS by Colorado (CO). The Task Force recommended research on seven IRT screening issues and the need for procedural standardization. Indeed, the validity of IRT as a NDBS tool has been subject to
question since numerous late diagnosed CF cases (false negatives) have occurred when the NDBS IRT value was lower than the IRT cutoff used by the program performing the initial screen. Prematurity and neonatal stress were identified as possible confounding factors [6]. A 1989 study indicated that CF NDBS using IRT was complicated by age-related declines in IRT levels and false-negative results [7]. More recently, seasonal and kit-related factors have also been reported to affect NDBS IRT levels. Adding to result interpretation complexity is the fact that some U.S. state NDBS programs require two screens on all newborns, which allows for the possibility of a negative screen on the first and a positive screen on the second. These factors contribute to controversies surrounding the definition of reliable IRT cutoff values. As a result, some NDBS programs use “floating” cutoffs (choosing a certain percentage of specimens as out-of-range—thus the numeric cutoff value varies or “floats”), while others use “fixed” numeric IRT concentrations. Two primary algorithms have emerged for CF NDBS. Both begin with an IRT blood test and end with a physiologic sweat chloride test for diagnostic confirmation. One algorithm, IRT/IRT, relies on elevated IRT levels in two time-separated blood specimens before referral for sweat testing [8]. It is used predominately in states where a second screening specimen is universally required at about 2 weeks of age. CF carrier detection is minimized by this algorithm. The other algorithm, IRT/DNA, uses a combination of biochemical and genetic markers. An elevated IRT is measured against a lowered initial cutoff to increase sensitivity, and is followed by DNA testing for one or more mutant cystic fibrosis transmembrane conductance regulatory gene (CFTR) alleles [9]. In either algorithm, a reliable assay for IRT is essential in order to avoid missing cases. 
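The two algorithms described above can be sketched as simple decision functions. This is a minimal illustration only: the record type `ScreenResult`, the function names, the returned labels, and every cutoff passed in are hypothetical, and the rule that a single detected CFTR mutation triggers referral is an assumption made for the sketch rather than a statement of any program's protocol.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScreenResult:
    """Hypothetical record for one newborn; field names are illustrative."""
    irt_first: float                       # IRT in the first specimen, ng/mL
    irt_second: Optional[float] = None     # IRT in a second, time-separated specimen
    cftr_mutations: Optional[int] = None   # CFTR mutations found, if DNA was tested


def irt_irt(result, first_cutoff, second_cutoff):
    """IRT/IRT: refer only when IRT is elevated in both specimens."""
    if result.irt_first < first_cutoff:
        return "screen negative"
    if result.irt_second is None:
        return "collect second specimen"
    if result.irt_second >= second_cutoff:
        return "refer for sweat test"
    return "screen negative"


def irt_dna(result, lowered_cutoff, min_mutations=1):
    """IRT/DNA: a lowered IRT cutoff, then DNA testing of elevated specimens.

    min_mutations=1 is an assumption for this sketch.
    """
    if result.irt_first < lowered_cutoff:
        return "screen negative"
    if result.cftr_mutations is None:
        return "run DNA panel"
    if result.cftr_mutations >= min_mutations:
        return "refer for sweat test"
    return "screen negative"
```

In both functions the first branch is where a false negative can arise: a newborn with CF whose initial IRT falls below the cutoff never reaches the second tier, which is why the reliability of the IRT assay matters for either algorithm.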
Present data indicate that 40 of the 50 states plus the District of Columbia (covering approximately 90% of U.S. newborns) use the IRT/DNA algorithm [10]. Despite the importance of IRT as a biomarker in current NDBS algorithms, and the millions of such tests performed, there has been neither an international nor a national discussion focused on improving IRT NDBS test validity.

In view of the unresolved issues with NDBS IRT, on May 23–24, 2011, a workshop entitled “Immunoreactive Trypsinogen (IRT) as a Biomarker for Cystic Fibrosis: Technical Issues and Challenges” was held in Annapolis, Maryland, USA. The two-day workshop was co-hosted by the National Newborn Screening and Genetics Resource Center (NNSGRC), Austin, Texas, and the Association of Public Health Laboratories (APHL), Silver Spring, Maryland, in collaboration with the Health Resources and Services Administration (HRSA) and the Centers for Disease Control and Prevention (CDC). Participants included more than 40 representatives from U.S. state public health and commercial laboratories, selected European countries (England, France and Italy), the U.S. government, academic research institutions, and commercial vendors of products used in NDBS. The workshop focused on identifying key IRT testing issues and factors affecting IRT results, and explored mechanisms for achieving problem resolution and laboratory harmonization. The ultimate goal was to reduce, or eliminate completely, late identified CF cases following an in-range (negative) initial newborn screen.

2. Results

2.1. Factors reported to affect IRT results

2.1.1. IRT1 and IRT2

There are two types of IRT: the cationic form, immunoreactive trypsinogen 1 (IRT-1), and the anionic form, immunoreactive trypsinogen 2 (IRT-2). Different genes code for the two types. IRT-1 predominates in normal adult sera and is largely extra-pancreatic. IRT-2 is not restricted to the pancreas and is increased in pancreatitis.
Related literature is inconsistent, but it appears that both IRT-1 and IRT-2 are increased in CF in early infancy, decreasing over time. Evaluating both IRT-1 and IRT-2 concentrations seems to provide slightly better screening sensitivity than IRT-1 alone. A recent study reported by researchers at the CDC found that currently available commercial kits do not reliably detect IRT-2. This study agrees with a recent report using a Luminex® multiplex immunoassay for IRT-1 and IRT-2 [11].

2.1.2. Baby's age and weight at time of specimen collection

Early hospital discharges for newborns in the U.S. are common and the resultant early NDBS collection can add complexity to interpreting results. Variations in IRT concentrations related to a baby's age and weight have been observed. Both the Wisconsin (WI) and Massachusetts (MA) NDBS programs, which are among the most experienced in CF NDBS, reported decreasing IRT levels during the first 24 hours after birth followed by a period of stable concentration (~20 ng/mL) before another decrease occurred at about 10 days of age. The MA program observed that 95th percentile IRT concentrations were often erratic in newborns, increasing to about 58 ng/mL in the first 3 days of life, decreasing on the 3rd day, rising slightly on the 4th day, declining on the 5th day, and then rising slightly again on the 8th day. These variations make it difficult to reliably determine a meaningful cutoff for older specimens. This may jeopardize the validity of the IRT/IRT protocol as a result of time-related influences on the 2nd IRT usually obtained at ~2 weeks. By contrast, NDBS programs such as that in the United Kingdom (UK), which collect screening specimens on the 5th day of life, may experience less result variation.

High median IRT concentrations (95th percentile) were reported by the MA laboratory in newborns who received blood transfusions within the first 48 hours after birth. However, transfusions can also result in lower IRT concentrations, as evidenced by a late diagnosed case reported by the Texas (TX) program following an early transfusion.
Newborns in neonatal intensive care units (NICUs) also pose IRT result interpretation challenges because they often exhibit elevated IRT levels that require additional, perhaps unnecessary, follow-up to resolve their IRT screening results. High IRT levels were reported by the MA laboratory in newborns weighing less than 1,500 grams. In the TX program, approximately half of the babies with excessively elevated IRT levels and no apparent CF mutations were found to have low birth weights. IRT levels were reported to be lower and more stable once weights reached 1,500 grams. Interestingly, NICU babies with elevated IRT levels screened by the Washington (WA) State NDBS program were found to have increased frequencies of maternal corticosteroid therapy and gestational diabetes. These babies were also more likely to be on antibiotics (~75%) or have hyperbilirubinemia (~60%). Those with high IRT concentrations were less likely to have mothers of advanced maternal age or mothers whose labor was induced.

2.1.3. Environment and specimen transport

Environmental factors have been observed to cause variations in IRT NDBS values. The stability of IRT at different concentrations and under different environments is shown in Fig. 1. In one example from Fig. 1, the IRT concentration was reported to decrease 40% in 1 week in a dried blood spot (DBS) specimen stored at 27 °C with 80% humidity. IRT concentrations were also reported to decrease 2% in specimens held for 24 hours at ambient temperature in the WI program [12]. As a result, specimens with results above the 96th percentile are now no longer retested to confirm their initial results in WI. Preliminary MA laboratory data comparing 95th percentile IRT rankings when transit times varied suggested a direct relationship (independent of season), which may also cause variability (see Section 2.1.4).
Studies at the CDC have shown that IRT-1 (added to whole blood, spotted onto filter paper, and dried) was stable for 1 year when stored with desiccant at −20 °C or at 4 °C [13].

Fig. 1. Immunoreactive trypsinogen (IRT) sample stability.* Unspiked (sample 1) and IRT-spiked (sample 2) blood spots were dried for 24 hours at ambient temperature. Thereafter the dried blood spots were stored under different storage conditions. The IRT concentrations were measured with the GSP® IRT kit from PerkinElmer. RH equals relative humidity. Note: Specimens stored at ambient temperature should be analyzed for IRT within 2 weeks of collection. For longer storage, specimens should be stored frozen. Humidity and moisture are detrimental to the dried blood spot specimens. * Source: Graphs and data were presented at the IRT Workshop and used here with the permission of PerkinElmer, Inc., Waltham, MA.

2.1.4. Seasonal variation

The WI program previously reported that mean and 95th percentile IRT levels vary by both the season of the year (lower in summer) and the reagent lot number, thus affecting IRT assay sensitivity. This variation can result in fewer referrals for follow-up in warmer months. To cope with seasonal and reagent lot variations and to ensure equity for screened infants, they argued that a cutoff based on percentiles rather than absolute IRT values (so-called floating cutoff) was superior to using a fixed value cutoff with IRT assays [12]. Seasonal variations have been confirmed by the MA laboratory when evaluating large numbers of samples (>850,000). On the other hand, both the Georgia (GA) and Florida (FL) programs, two states with relatively fewer ambient temperature extremes, reported no measurable seasonal variations. The CO and WA programs reported minimal variations. Data from France, on IRT distributions in 44 late diagnosed newborns from 2002 through 2007, suggested that neither geographical nor seasonal variations had an effect on the sensitivity of their IRT assays and only a minimal effect on assay specificity.

2.1.5. Kit variability and reagent lot changes

Manufacturers of IRT NDBS assay kits face at least two challenges. First, there is no international reference material readily available for IRT. Second, reference specimens are negatively affected by trypsin autolysis (destruction of a cell through the action of its own enzymes), which contributes to blood degradation during the manufacturing process (before being dried onto filter paper) and reduces homogeneity within a manufacturing batch. There are currently two manufacturers of IRT NDBS reagent kits approved for routine use by the U.S. Food and Drug Administration (FDA), and two other vendors are developing kits which have not yet been submitted for FDA approval. Both approved kits have been reported to exhibit lot-to-lot reagent variability, raising result reliability questions among kit consumers. As an example, variations from 51 ng/mL to 68 ng/mL at the 95th percentile were reported by the WI program using the PerkinElmer AutoDELFIA® Neonatal IRT kit B005-112. This experience was confirmed by the GA, TX, and WA programs.
Variability and lowered mean IRT values were also reported by the FL program using PerkinElmer's GSP® Neonatal IRT kit 3306-001U. Recent data from TX, however, appear to show improved assay stability for the AutoDELFIA® B005-112 kit. Interestingly, reporting programs are handling the variability issues in different ways, including floating cutoffs, increased 2nd tier DNA testing, and fixed cutoff adjustments. Quantile–quantile plots have also been considered as a possible alternative for monitoring NDBS assay performance, negating the necessity for analyzing external calibrators [14].

2.1.6. Other confounding factors

Prematurity and race/ethnicity also appear to affect NDBS IRT results. In particular, the WI program has reported higher IRT levels in preterm babies and higher mean IRT values for African–Americans versus Caucasians [12]. These data agree with data from the New York (NY) and Michigan (MI) programs showing higher IRT values in African–Americans when compared to other racial groups [15].

The Minnesota (MN) program reported that patients with certain CF mutations may exhibit lower IRT levels. Surprisingly, among their late diagnosed CF cases, two patients homozygous for the ΔF508 mutation exhibited low screening IRT levels (38.1 ng/mL and 49.6 ng/mL). Discussions emphasized the importance of proper mutation panel selection based on the genetics of the screened population. This is important to prevent late diagnosed cases, which can result from inadequate mutation analyses [16].

Babies with meconium ileus (obstruction of the distal small intestine due to the thickening and congestion of the meconium) also appear to exhibit lowered IRT levels. The CO program has previously reported that IRT values are lower in newborns with meconium ileus and they tend to remain fairly stable across the first 4 days of life [17]. On the other hand, data from the WI program suggest that newborns with CF who present with meconium ileus do not have significantly lower IRT levels than newborns without meconium ileus.

2.2. Program responses to IRT intrinsic and extrinsic variations

2.2.1. Adjusting fixed IRT cutoff values

NDBS programs vary in their method for determining ranges of expected results and the cutoff above which a newborn is considered at increased risk for CF. The WI laboratory was one of the earliest U.S. NDBS programs for CF and relied on data generated in CO, the first CF screening region in the U.S., to define its initial screening cutoff [7,8]. Several late diagnosed cases (false negatives) in CO were eventually identified through surveillance, which led the WI program to change to a more conservative fixed IRT cutoff of 56 ng/mL (~94th percentile for 5,000 specimens) [9]. This cutoff change reduced carrier detections and resulted in 8% follow-up in the winter months versus 3% in the summer; however, three late diagnosed cases were identified in WI during 1999–2004, which led to a cutoff algorithm change (see Section 2.2.2).
More recently, the WA program (which routinely obtains 2nd screens on most newborns) established cutoffs based on 20 years of data from the CO program. Out-of-range IRT results for newborns less than 6 days old are ≥100 ng/mL (99.7th percentile) and are ≥70 ng/mL for newborns 6 days or older. They reported little overlap in IRT levels between CF cases and controls at the 99.7th percentile level.

2.2.2. Using a floating IRT cutoff

A floating cutoff uses percentiles rather than set numeric values to determine “at risk” newborns, thus translating percentiles to numbers that typically result in different (or floating) numeric values from assay to assay. With floating cutoffs, sufficiently high numbers of specimens are necessary for statistical validity. As an example, the MA laboratory uses a single cutoff for all specimens analyzed in a day. For days with fewer than 300 specimens, results are combined with the following day. Various percentiles and protocols for floating cutoff determination exist (e.g. mechanisms for including low birth weight infants, seasonal variation, etc.) including differing statistical manipulations. Statistical treatments are necessary since significant numbers of patients with higher or lower result tendencies (i.e. a number of newborns from the same NICU) in an assay may bias a floating cutoff and increase the chance of misidentifying a true case.

As noted previously, the WI program historically has used both fixed and floating cutoffs for NDBS IRT assays. Their floating cutoff was based on analysis of 96th–99th percentile data. A cutoff based on the 96th percentile of a single day's batch of specimens was found to provide a sensitivity of 96.2% (up from 90.6%) and gave a specificity of 94.7%. Over time in WI, three newborns were detected using the floating cutoff that would have been missed using the traditional fixed cutoff value. This ~2 percent gain in sensitivity, if applied nationally, could mean avoidance of ~20 late diagnosed CF cases annually in the U.S. Workshop participants noted that late diagnosed CF cases have occurred in programs using either fixed or floating cutoffs, so laboratories must pay particular attention to their methodology in order to minimize late diagnoses.

In the United Kingdom (UK), where specimens are collected on day 5 of life (or occasionally later), the cutoff for the first IRT value is relatively high (99.5th percentile). The UK program started in 2007, and analysis of data through 2010 suggests that a statistically appropriate number of cases have been recognized with a relatively low rate of sweat testing. Infants with one mutation have a second IRT measured between 21 and 28 days, and if that is low, they do not have a sweat test. Instead, the families are provided with advice on possible carrier status.
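As an illustration of the floating cutoff described in this section, the sketch below pools each day's IRT results, carries a small day forward into the next, and takes a percentile of the pooled batch as that batch's cutoff. It is a minimal sketch, not any program's actual procedure: the function names are hypothetical, the nearest-rank percentile definition is an assumption (programs may use other definitions or additional statistical treatments), and the handling of a short final batch is a choice made only so that the example terminates cleanly.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile (an assumed definition, chosen for simplicity)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def floating_cutoffs(daily_batches, pct=96.0, min_batch=300):
    """Yield (pooled_values, cutoff) pairs, one per pooled batch.

    A day with fewer than min_batch specimens is combined with the
    following day before its cutoff is computed.
    """
    pooled = []
    for batch in daily_batches:
        pooled.extend(batch)
        if len(pooled) >= min_batch:
            yield pooled, percentile(pooled, pct)
            pooled = []
    if pooled:
        # Assumption: a short final batch still gets a cutoff of its own.
        yield pooled, percentile(pooled, pct)


def out_of_range(values, cutoff):
    """Specimens at or above the batch cutoff are sent to the next tier."""
    return [v for v in values if v >= cutoff]
```

Because the cutoff is recomputed for every pooled batch, a batch dominated by specimens with unusually high or low values (for example, many newborns from the same NICU) shifts the numeric cutoff, which is the bias the text warns about.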
The results of the UK protocol raise the possibility that collecting an initial blood spot for IRT measurement near the end of the first week of life may lead to better discrimination between CF and non-CF values than in traditional NDBS programs that collect specimens earlier, although further analysis is required to confirm this observation.

3. Discussion

Of the approximately 4.5 million annual births receiving CF newborn screening in the U.S., about 11,000 are recalled for further follow-up testing with some 1,000 CF cases diagnosed. However, there continue to be late diagnosed cases each year. It is important that all CF NDBS stakeholders understand not only the importance of screening, but also the risks of delayed diagnosis currently associated with IRT as an initial NDBS test.

There is presently no consensus screening algorithm recognized for CF NDBS (see Table 1). Four NDBS protocols are currently in use: (1) IRT/IRT (used in 14 states, including 11 states that universally require two blood spot specimens), (2) IRT/DNA (used in most states; most programs use a multi-mutation panel, although the number and choice of mutations vary), (3) IRT/IRT/DNA (requires two time-spaced NDBS specimens with “persistent” hypertrypsinogenemia for DNA testing [18]) and (4) IRT/DNA/EGA (expanded genetic analysis; expanded CFTR analysis to identify two mutations, as in California (CA) [19]).

Strengths and weaknesses exist for each NDBS protocol. The most disconcerting weakness of the IRT/IRT protocol vs. IRT/DNA is the apparent higher number of delayed diagnosed (false negatives or “missed”) cases. Screening with IRT/DNA provides better sensitivity since a lowered IRT cutoff (typically the upper 3–5%) is used. However, because sequentially elevated IRT concentrations are more specific than a single elevation, a second IRT screen reduces sweat test referrals by several fold compared with IRT/DNA screening.
Because a second specimen is required with IRT/IRT, the time to diagnosis may be slightly increased, although data from the WA program indicated a respectable median age at diagnosis of 21 days. Screening with IRT/IRT also reduces detection of CF carriers, since carrier second screens do not usually show an elevated IRT. The value of carrier detection remains controversial since carriers are unaffected, but carrier information may be useful for family planning.
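The national figures quoted at the start of this Discussion (about 4.5 million births screened, about 11,000 recalled, some 1,000 CF cases diagnosed) can be turned into a few summary ratios. The sketch below is arithmetic only; the function and key names are hypothetical, and it assumes that the diagnosed cases all come from the recalled group, which the text does not state explicitly.

```python
def screening_yield(births_screened, recalled, cf_diagnosed):
    """Summary ratios from rounded national counts (illustrative only)."""
    return {
        # Fraction of screened newborns recalled for follow-up testing.
        "recall_rate": recalled / births_screened,
        # Diagnosed cases per recalled newborn (assumes cases arise from recalls).
        "cases_per_recall": cf_diagnosed / recalled,
        # Screened births per diagnosed case.
        "births_per_case": births_screened / cf_diagnosed,
    }


us_figures = screening_yield(4_500_000, 11_000, 1_000)
for name, value in us_figures.items():
    print(f"{name}: {value:.4g}")
```

With these rounded inputs, roughly 0.24% of screened newborns are recalled, about 1 in 11 recalls ends in a CF diagnosis, and there is about one diagnosed case per 4,500 births screened, which is in line with the prevalence of about 1 in 4000 cited in the Introduction.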

Based on workshop participants' reports, late diagnosed CF cases (Table 1) appear to be impacted predominantly by: 1) screening IRT cutoffs, 2) seasonal (temperature) variation, and 3) analytical kit-related variables. Much of the information surrounding the contribution of meconium ileus to the number of late diagnoses appears to be anecdotal; however, newborns with meconium ileus are the first and most obvious patients to be diagnosed late following a normal NDBS screen (although they may represent only ~10–20% of the total late diagnosed cases). Further studies are needed to resolve the exact impact of meconium ileus on late diagnosed cases.

Available data from the UK suggest that the false-negative rate in IRT NDBS for cystic fibrosis is low; however, the NDBS process there relies on collection of a specimen at a later time (approximately 5 days) than in the U.S., and this, combined with physical observations of patients by health visitors at the time of specimen collection, likely contributes to this result. In Italy, late diagnoses are generally associated with low IRT levels (including some patients with meconium ileus). Many of the babies missed have IRT values that are just below the cutoff. On the other hand, the CA program reported that of the 10 cases diagnosed late as a result of low NDBS IRT values, some had IRT concentrations significantly lower than the cutoff. Similarly, the TX program reported two late diagnosed cases with NDBS IRT values well below their 60 ng/mL cutoff. These different experiences further emphasize the dilemma in setting IRT screening cutoffs for CF detection in NDBS. When the U.S. screening population is considered, it has been estimated that as many as 70 cases annually may be diagnosed late, assuming 20% in IRT/IRT states and 5% in IRT/DNA states (estimates based on 100% impact in IRT/IRT states and ~67% in IRT/DNA states) [20].
Several issues appear to contribute to statistical challenges in analyzing NDBS IRT results: day-to-day IRT concentration variability (possibly related to physiology), variability of NDBS IRT reagent kit lots, seasonal variation in measured IRT levels, birth weight, race/ethnicity, and other patient-specific factors. Any statistical procedure used to address these NDBS IRT assay challenges must be realistically capable of being implemented by NDBS laboratories. Despite these challenges, high sensitivity and high specificity with NDBS IRT assays appear to be simultaneously achievable. The use of a semiparametric ROC curve to relate test specificity (on the x-axis) and sensitivity (on the y-axis) to cutoff changes may be useful [21]. The CA program reported using ROC curves to maximize assay performance, resulting in an assay cutoff at the 98.2 percentile (using the AutoDELFIA® Neonatal IRT assay). Lowering the cutoff to the 96th or 97th percentile resulted in a small gain in sensitivity, but a large loss in specificity.

Likewise, multiple of the median (MoM) calculations may be useful for improved NDBS IRT assay reliability. The MoM measures deviation of test results from the median value and is particularly useful when individual test results are highly variable [21]. Using MoM calculations in the WI program, however, was found to sacrifice specificity in favor of sensitivity. In France, centralized collection of population-wide percentile data has been used routinely to monitor laboratory and NDBS IRT assay kit performance. In the UK, improved IRT assays and kit performance monitoring have likewise been emphasized. External quality assurance is mandatory for NDBS laboratories, but of limited practical value; hence, laboratory and assay performance are monitored using centralized collection of population-wide percentile data.
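The multiple of the median (MoM) calculation mentioned above can be sketched as follows: each IRT result is divided by the median of its batch, so a uniform shift in assay response (for example, a reagent lot that reads proportionally lower) leaves the MoM values unchanged. The function names, the example IRT values, and the MoM cutoff of 2.0 are hypothetical and are not taken from any program's protocol.

```python
from statistics import median


def multiples_of_median(irt_values):
    """Express each IRT result as a multiple of its batch median."""
    batch_median = median(irt_values)
    if batch_median <= 0:
        raise ValueError("batch median must be positive")
    return [value / batch_median for value in irt_values]


def flag_by_mom(irt_values, mom_cutoff):
    """Flag specimens whose MoM is at or above a (hypothetical) cutoff."""
    return [mom >= mom_cutoff for mom in multiples_of_median(irt_values)]


# Hypothetical batch measured with two reagent lots; lot B reads half of lot A.
lot_a = [20.0, 25.0, 30.0, 60.0, 22.0]
lot_b = [value * 0.5 for value in lot_a]

print(multiples_of_median(lot_a))
print(flag_by_mom(lot_b, mom_cutoff=2.0))
```

Both lots flag the same specimen because the scaling cancels in the ratio. A fixed numeric cutoff applied to the same two lots would not behave this way, although, as noted above, the WI experience was that MoM traded specificity for sensitivity.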
Population data can be used to detect assay variations (thus minimizing the need for additional assays or external calibrators, although adjustments for ethnic mix and seasonality may be necessary), but for statistical reliability, large numbers of samples are required.

4. Conclusions and recommendations

Late CF diagnosis is a critical problem that decreases the value of early screening. Despite the general success of 30 years of IRT-driven cystic fibrosis NDBS, questions remain about IRT (or perhaps more


Table 1. Summary of estimated number of false-negative screens (late diagnosis/missed) for cystic fibrosis (CF) cases and associated information.a

State          Testing   Numbers     CF cases     False-negative      Present        Present cutoff type
               period    screened    detected     screens (late       algorithm      (2010–2011) c
               (years)               by screen b  diagnosis cases) b  (2010–2011)
California     3         1,582,126   236          16                  IRT/DNA/EGA d  Fixed (62 ng/mL)
Colorado       28        1,738,771   475          48                  IRT/IRT/DNA    Fixed (60 ng/mL)
Florida        3.5       665,360     117          2                   IRT/DNA        Floating (96th percentile)
Georgia        4         588,724     116          3                   IRT/DNA        Fixed (55 ng/mL)
Kansas         2.75      115,014     31           1                   IRT/DNA        Fixed (60 ng/mL)
Massachusetts  12        913,884     256          6                   IRT/DNA        Floating (95th percentile)
Michigan       3.25      377,674     113          1                   IRT/DNA        Floating (96th percentile)
Minnesota      5         344,352     76           3                   IRT/DNA        Floating (96th percentile)
New York       8         2,067,765   374          8                   IRT/DNA        Floating (95th percentile)
Texas          1.3       546,118     74           2                   IRT/IRT/DNA    Floating (95th percentile)
Washington     5         462,996     83           5                   IRT/IRT        Fixed (100 ng/mL)
Wisconsin      17        1,117,334   261          14                  IRT/DNA        Floating (96th percentile)
Total          –         10,520,118  2,212        109

a The numbers of cases with late diagnoses listed should be considered the minimum estimates because 15 years or longer will be needed to identify some CF patients with false-negative newborn screening results [M.J. Rock, H. Levy, C. Zaleski, P.M. Farrell, Factors accounting for a missed diagnosis of cystic fibrosis after newborn screening, Pediatr. Pulmonol. 46 (2011) 1166–1174.] and some may even die undiagnosed [I.J.M. Doull, H.C. Ryley, P. Weller, M.C. Goodchild, Cystic fibrosis related deaths and the effect of newborn screening, Pediatr. Pulmonol. 31 (2001) 363–366].
b CF cases listed may or may not include a few patients with meconium ileus (MI). False-negative screens (late diagnosis cases) listed may include a few patients with MI that had IRT values below the cutoff value (or had IRT values above the cutoff).
c For fixed cutoff types, this is the first tier screen for newborns ≤ 6 days of age and different cutoffs may be used for second tier and special situations with older newborns.
d EGA (extended gene analysis).
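The counts in Table 1 allow a rough per-program sensitivity estimate, taken here as cases detected by screening divided by the sum of detected and late diagnosed cases. The sketch below is illustrative: the variable and function names are hypothetical, and because footnote a describes the late diagnosis counts as minimum estimates, the resulting figures should be read as upper bounds on sensitivity rather than measured values.

```python
# (CF cases detected by screen, false-negative screens) from Table 1.
TABLE_1 = {
    "California": (236, 16),
    "Colorado": (475, 48),
    "Florida": (117, 2),
    "Georgia": (116, 3),
    "Kansas": (31, 1),
    "Massachusetts": (256, 6),
    "Michigan": (113, 1),
    "Minnesota": (76, 3),
    "New York": (374, 8),
    "Texas": (74, 2),
    "Washington": (83, 5),
    "Wisconsin": (261, 14),
}


def sensitivity(detected, missed):
    """Apparent sensitivity: detected / (detected + late diagnosed)."""
    return detected / (detected + missed)


total_detected = sum(detected for detected, _ in TABLE_1.values())
total_missed = sum(missed for _, missed in TABLE_1.values())
overall = sensitivity(total_detected, total_missed)

for state, (detected, missed) in TABLE_1.items():
    print(f"{state:<14} {sensitivity(detected, missed):.1%}")
print(f"{'All programs':<14} {overall:.1%}")
```

Pooled across the twelve programs this gives 2,212 / (2,212 + 109), or about 95%. Testing periods differ widely between programs (1.3 to 28 years), so programs with short follow-up have had less time for late diagnoses to surface and their apparent sensitivities are not directly comparable with those of long-running programs.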

correctly, the IRT assay) as an initial NDBS biomarker. This is particularly evident given the large numbers of late diagnosed cases and their initial NDBS IRT results (Table 1). Late diagnosed cases, at 5–15% of CF cases (85–95% sensitivity), are more common for CF than for other genetic conditions commonly included in NDBS panels. While physiologic IRT levels may contribute to the problem, the focus has been (and is) on laboratory issues, but more data are needed. For example, the effect of seasonal variations is not known with any certainty despite reported differences from programs that appear to show increased variability, which may relate to cooler climate states. Whether or not better mathematical/statistical modeling and algorithms can improve the quality of CF NDBS is also not known. Now that a significantly higher number of jurisdictions are screening newborns for CF, pooled data should provide answers more quickly and efforts should be made to centrally compile the appropriate information. It may be useful if the Cystic Fibrosis Foundation monitors initial and subsequent IRT levels in confirmed CF patients as part of their case registry. Genotyping information may also be helpful. Monitoring age-related IRT declines may help to establish more reliable IRT cutoffs.

Basic science research into the natural history of CF will continue to be informative and is encouraged. Individuals with CF exhibit absent or defective production and function of CFTR, and new research into genes, proteins, and genetic modifiers could reveal new information.

The NDBS programs for CF in Europe differ from those in the United States. Programs in European countries tend to use a much higher fixed IRT cutoff (≥99th percentile) in first-tier NDBS for cystic fibrosis than state programs in the United States (many of which use NDBS IRT cutoffs at or around the 96th percentile).
Moreover, both the UK and France perform IRT testing on blood specimens collected several days after birth, which could be a useful consideration for improving CF NDBS outcomes, although issues exist pertaining to preventable deaths during the first few postnatal days and the logistics of home or clinic blood sampling.

Workshop participants agreed that the following recommendations should be helpful in sorting out solutions to the issues in IRT NDBS:
• NDBS IRT data should be aggregated and distributions of NDBS IRT values across programs should be systematically compared nationwide.
• The Cystic Fibrosis Foundation, in collaboration with the Cystic Fibrosis Gene Modifier Consortium, should be encouraged to develop a registry of late diagnosed CF cases with laboratory and clinical information as part of its Patient Registry.
• The risk of late diagnosed CF cases occurring in NDBS programs should be communicated to the medical community and parents, with emphasis on the importance of performing sweat testing if symptoms appear.
• Commercial manufacturers of NDBS IRT assay kits should work diligently to reduce lot-to-lot variation and other issues related to the consistent performance of these products.
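The percentile cutoffs and sensitivity figures discussed above, and the recommendation to compare IRT distributions across programs, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the workshop: the function names are hypothetical, the percentile is computed with the nearest-rank method, a specimen is treated as screen-positive when its value is strictly above the cutoff, and the IRT-like values in the demonstration are simulated rather than real screening data.

```python
def percentile_cutoff(values, pct):
    """Nearest-rank percentile: the smallest observed value with at least
    pct percent of observations at or below it (pct is an integer 1..100)."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling of pct*n/100
    return ordered[rank - 1]


def fraction_flagged(values, cutoff):
    """Fraction of specimens strictly above the cutoff (assumed positive rule)."""
    return sum(v > cutoff for v in values) / len(values)


def sensitivity(detected, missed):
    """Screening sensitivity = cases detected / all true cases."""
    return detected / (detected + missed)


def compare_programs(irt_by_program, pcts=(50, 96, 99)):
    """Summarize each program's IRT distribution at the same percentiles."""
    return {name: {p: percentile_cutoff(vals, p) for p in pcts}
            for name, vals in irt_by_program.items()}


if __name__ == "__main__":
    import random

    random.seed(0)
    # Simulated, hypothetical IRT-like values; NOT real screening data.
    programs = {
        "program_A": [random.lognormvariate(3.0, 0.5) for _ in range(5000)],
        "program_B": [random.lognormvariate(3.1, 0.6) for _ in range(5000)],
    }
    for name, summary in compare_programs(programs).items():
        print(name, {p: round(v, 1) for p, v in summary.items()})
    # 10 late diagnoses per 100 true cases corresponds to 90% sensitivity.
    print(sensitivity(detected=90, missed=10))
```

Under these assumptions, a 99th-percentile cutoff flags about 1% of specimens and a 96th-percentile cutoff about 4%, which is the trade-off between recall burden and missed cases that the comparison of European and U.S. programs points to. Summarizing every program at the same percentiles is one simple way to put pooled distributions side by side.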

Acknowledgments

In addition to the authors listed, the following individuals served as participants in the workshop and contributed to the scientific data and information included in this paper: Frank Accurso, Michele Caggana, Bobby Chavli, Sara Copeland, Arturo Cowes, Carla Cuthbert, Marie Earley, Roger Eaton, Art Hagar, Jim Hendershot, Annika Hiekkanen, Amy Hietala, Patrick Hopkins, Elizabeth Jones, Marty Kharrazi, Petri Kerokoski, Michael Kosorok, Fizza Majid, Joanne Mei, Colleen Peterson, Rodney Pollitt, Joseph Quashnock, Teresa Repetto, Michael Rock, Kathy Sabadosa, Michael Schechter, Dennis Smith, Marci Sontag, Kevin Southern, Jan St. Germain, John Thompson, Jasmin Torres, Georges Travert, Tiina Urv, Art Willis, Dan Wright, and William Young. We thank them for their participation and for their important contributions to this paper and to the success of the workshop. We express our appreciation to the primary funders of the workshop: the Health Resources and Services Administration (HRSA), the Centers for Disease Control and Prevention (CDC), and PerkinElmer, Inc. This project was supported in part by HRSA Grant #U32MC00148 to the National Newborn Screening and Genetics Resource Center.

References

[1] J.R. Crossley, R.B. Elliott, P.A. Smith, Dried blood spot screening for cystic fibrosis in the newborn, Lancet 1 (1979) 472–474.
[2] J.R. Crossley, P.A. Smith, B.W. Edgar, P.D. Gluckman, R.B. Elliott, Neonatal screening for cystic fibrosis using immunoreactive trypsin assay in dried blood spots, Clin. Chim. Acta 113 (1981) 111–121.
[3] M.S. Watson, M.Y. Mann, M. Lloyd-Puryear, P. Rinaldo, Newborn screening: toward a uniform screening panel and system, Genet. Med. Suppl. 1 (2006) 1s–252s.


[4] K. Sebelius, Response by the HHS Secretary to the Secretary's Advisory Committee on Heritable Disorders in Newborns and Children, May 21, 2010. http://www.hrsa.gov/advisorycommittees/mchbadvisory/heritabledisorders/recommendations/correspondence/uniformpanelsecre052110.pdf (Accessed February 14, 2012).
[5] B.L. Therrell Jr., National Newborn Screening Status Report (2009), National Newborn Screening and Genetics Resource Center, Austin, TX. Available at: http://genes-r-us.uthscsa.edu/nbsdisorders.htm (Accessed February 14, 2012).
[6] Cystic Fibrosis Foundation, Ad Hoc Committee, Task Force on Neonatal Screening, Neonatal screening for cystic fibrosis: position paper, Pediatrics 72 (5) (1983) 741–745.
[7] M.J. Rock, E.H. Mischler, P.M. Farrell, L.J. Wei, W.T. Bruns, D.J. Hassemer, R.H. Laessig, Newborn screening for cystic fibrosis is complicated by age-related decline in immunoreactive trypsinogen levels, Pediatrics 85 (6) (1990) 1001–1007.
[8] K.B. Hammond, S.H. Abman, R.J. Sokol, F.J. Accurso, Efficacy of statewide neonatal screening for cystic fibrosis by assay of trypsinogen concentrations, N. Engl. J. Med. 325 (1991) 769–774.
[9] R.G. Gregg, B.S. Wilfond, P.M. Farrell, A. Laxova, D. Hassemer, E.H. Mischler, Application of DNA analysis in population-screening program for neonatal diagnosis of cystic fibrosis (CF): comparison of screening protocols, Am. J. Hum. Genet. 52 (1993) 616–626.
[10] Newborn Screening and Genetics Resource Center's National Newborn Screening Information System (database), Austin, TX. Available at: http://nnsis.uthscsa.edu/xreports.aspx?XREPORTID=5 (Accessed February 20, 2012).
[11] B.A. Lindau-Shepard, K.A. Pass, Newborn screening for cystic fibrosis by use of a multiplex immunoassay, Clin. Chem. 56 (2010) 445–450.
[12] M. Kloosterboer, G. Hoffman, M. Rock, W. Gershan, A. Laxova, J. Zhanhai, P.M. Farrell, Clarification of laboratory and clinical variables that influence cystic fibrosis newborn screening with initial analysis of immunoreactive trypsinogen, Pediatrics 123 (2009) 338–346.
[13] L.L. Zhou, C.J. Bell, M.C. Earley, W.H. Hannon, Development and characterization of dried blood spot materials for measurement of immunoreactive trypsinogen, J. Med. Screen. 13 (2006) 79–84.
[14] R.J. Pollitt, A.J. Matthews, Population quantile–quantile plots for monitoring assay performance in newborn screening, J. Inherit. Metab. Dis. 30 (2007) 607.
[15] S.J. Korzeniewski, W.I. Young, H.C. Hawkins, K. Cavanagh, S.Z. Nasr, C. Langbo, K.R. Teneyck, S.D. Grosse, M. Kleyn, V. Grigorescu, Variation in immunoreactive trypsinogen concentrations among Michigan newborns and implications for cystic fibrosis newborn screening, Pediatr. Pulmonol. 46 (2011) 125–130.
[16] O.M. Alper, L.J. Wong, S. Young, M. Pearl, S. Graham, J. Sherwin, E. Nussbaum, D. Nielson, A. Platzker, Z. Davies, A. Lieberthal, T. Chin, G. Shay, K. Hardy, M. Kharrazi, Identification of novel and rare mutations in California Hispanic and African American cystic fibrosis patients, Hum. Mutat. 24 (2004) 353.
[17] M.K. Sontag, M. Corey, J.E. Hokanson, J.A. Marshall, S.S. Sommer, G.O. Zerbe, et al., Genetic and physiologic correlates of longitudinal immunoreactive trypsinogen decline in infants with cystic fibrosis identified through newborn screening, J. Pediatr. 149 (2006) 650–657.
[18] M.K. Sontag, D. Wright, J. Beebe, F.J. Accurso, S.D. Sagel, A new cystic fibrosis newborn screening algorithm: IRT/IRT/DNA, J. Pediatr. 155 (2009) 618–622.
[19] A. Kammesheidt, M. Kharrazi, S. Graham, S. Young, M. Pearl, C. Dunlop, S. Keiles, Comprehensive genetic analysis of the cystic fibrosis transmembrane conductance regulator from dried blood specimens: implications for newborn screening, Genet. Med. 8 (9) (2006) 557–562.
[20] A. Fritz, P. Farrell, Estimating the annual number of false negative cystic fibrosis newborn screening tests, Pediatr. Pulmonol. 47 (2012) 207–208.
[21] T. Cai, M.S. Pepe, Semi-parametric ROC analysis to evaluate biomarker for disease, J. Am. Stat. Assoc. 97 (2002) 1099–1107.