Journal of Immunological Methods 344 (2009) 35–44
Research paper
Accurate determination of copy number variations (CNVs): Application to the α- and β-defensin CNVs
Hilde Nuytten a, Iwona Wlodarska a, Kristiaan Nackaerts b, Séverine Vermeire b, Joris Vermeesch a, Jean-Jacques Cassiman a, Harry Cuppens a,⁎
a Department of Human Genetics, K.U.Leuven, Leuven, Belgium
b University Hospital of Leuven, K.U.Leuven, Leuven, Belgium
Article info
Article history: Received 19 September 2008; received in revised form 26 February 2009; accepted 5 March 2009; available online 17 March 2009
Keywords: Copy number variation; Quantification; Real-time PCR; Defensins
Abstract
The human genome is rich in genomic regions that are repeated several times. These regions are known as CNVs (copy number variations) and can be polymorphic. Moreover, the number of copies may play a role in the predisposition to particular diseases. It is therefore important to accurately determine the copy number of those CNVs in individuals. We have developed a strategy, using concatemeric constructs containing different numbers of repeats as internal standards, to accurately determine the number of repeats in the α-defensin and β-defensin region on chromosome 8p23 by real-time PCR. The test was validated by FISH in DNA of 13 individuals. Comparison with previously published methods shows that this approach provides more accurate results for the determination of the exact number of repeats when they exceed 3 copies. This strategy can be easily transferred to any CNV. With this method we structurally analyzed the α- and β-defensin region in 334 Belgian individuals. © 2009 Elsevier B.V. All rights reserved.
Abbreviations: CNV, copy number variation; FISH, fluorescent in situ hybridization; DEFB, β-defensin; DEFA, α-defensin; Ct, cycle threshold.
⁎ Corresponding author. Center for Human Genetics, Katholieke Universiteit Leuven, Gasthuisberg O&N1 (602), Herestraat 49, B-3000, Leuven, Belgium. Tel.: +32 16 347240; fax: +32 16 345997. E-mail address: [email protected] (H. Cuppens).
0022-1759/$ – see front matter © 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.jim.2009.03.002

1. Introduction
CNVs have only been identified in recent years. The repeats of some CNVs contain one or more complete genes (Iafrate et al., 2004; Wong et al., 2007). Because the copy number of such CNVs may vary between individuals, the gene dosage will also vary between individuals, and may even result in, or predispose to, disease. This has been shown for Crohn's disease, psoriasis and HIV (Gonzalez et al., 2005; Fellermann et al., 2006; Hollox et al., 2008b). Hence, there is a need for genetic assays which accurately determine the copy numbers at CNV loci. The main technique used for the detection of CNVs is array comparative genomic hybridization (CGH). For large-scale quantification of a single or a few CNVs this test is not appropriate because of the considerable cost of the technique and the large amount of DNA needed (Hollox et al., 2008a). Moreover, when more than 3 repeat units are present in the diploid genome, a test optimized for this higher copy number variation is needed. Most copy number quantification assays are not appropriate for the detection of these smaller differences in absolute amounts (Armour et al., 2007; Hollox et al., 2008a). For example, in a given quantitative assay, the signal of 6 repeats at a given CNV is only 20% higher than the signal of 5 repeats. Quantitative differences become even smaller when a higher number of repeats is present (Fig. 1a). Alternative techniques, involving only a limited number of manipulations, are therefore needed to accurately determine such subtle quantitative differences. Several recent studies have demonstrated that some genes or groups of genes can show extensive copy number variation, and that this variation can have important functional consequences (Sharp et al., 2005; Tuzun et al., 2005; Kidd et al., 2008). For example, the genes CCL3L1, CCL4L1 and TBC1D3 are present on a segmental duplication that can vary between 0 and 10 copies per person (Townson et al., 2002);
Fig. 1. Concatemeric reference constructs (a): Absolute and relative signal differences between subsequent repeats that need to be detected in a quantitative assay for copy number determination of a CNV. (b): Reference constructs were generated by a digestion ligation protocol. (c): Concatemeric reference constructs that contain one copy of a reference sequence and a varying number of the repeated sequence. For each construct the ratio of repeated sequence versus non-repeated sequence is constant, irrespective of dilution, pipetting or concentration measurement errors. (d): Digestion of the 4 concatemeric reference constructs with EcoRI and SphI, gel-electrophoresis on a 1% agarose gel.
this variation appears to be a determinant of individual susceptibility to, and progression of, infection with HIV-1 (Gonzalez et al., 2005). The AMY1 gene also shows extensive copy number variation (Perry et al., 2006). Higher AMY1 copy numbers and protein levels probably improve the digestion of starchy foods and may buffer against the fitness-reducing effects of intestinal disease (Perry et al., 2006). To accurately quantify such copy number variations, a quantitative assay that includes a standard curve based upon appropriate controls is preferable. Reference samples for the generation of standard curves are generally obtained by diluting and/or mixing references containing the repeated sequence under investigation and a non-repeated reference sequence. Pipetting and diluting errors, as well as inaccurate DNA concentration measurements, may affect the accuracy of such assays. The error in the measured signal may even become larger than the actual signal difference that one intends to determine. In order to overcome this accuracy problem, we have developed concatemeric reference constructs, wherein each reference construct is formed by an isolated non-repeated nucleic acid sequence comprising one copy of a reference sequence and by a known but different number of copies of a repeated sequence (Fig. 1c). The non-repeated sequence is preferably located close to the repeated nucleic acid sequence, so that it is more likely to have a similar local DNA structure. In this case, the ratio of the repeated versus the non-repeated DNA is always the same, irrespective of pipetting or DNA
measurement errors. Another advantage of this method is that the standard curve in each experiment will be based on exactly the same reference material, allowing the comparison of results across different experiments. Here, we have developed an assay to determine copy number variation in the α- and β-defensin CNV. Defensins are cationic antimicrobial peptides; they form part of the innate immune system and have antimicrobial activity against a wide variety of gram-positive and gram-negative bacteria as well as fungi and enveloped viruses (Ganz, 2003). Besides their antimicrobial activity, they also have a function in activating adaptive immunity (Yang et al., 2001). In humans, there are two families of defensins, α-defensins and β-defensins. α-Defensins are mainly expressed in neutrophils and in the Paneth cells of the intestine, whereas β-defensins are expressed by epithelial tissue. Recently it has been shown that both the β-defensin region and the DEFA1A3 gene show copy number variation (Hollox et al., 2003; Aldred et al., 2005). The copy number polymorphism will probably be of more relevance than the SNPs found in the β-defensin genes (Vankeerberghen et al., 2005). The β-defensin CNV is probably one of the most clinically relevant CNVs because of its involvement in barrier diseases such as psoriasis and Crohn's disease (Fellermann et al., 2006; Hollox et al., 2008b). A map of the chromosomal region with the localization of the α- and β-defensin genes is shown in Fig. 2. To accurately quantify the β-defensin CNV, the DEFB4 or the DEFB104 gene was chosen as the repeated sequence. Since both genes are located in the same CNV, diploid copy number
Fig. 2. Schematic overview of the defensin region on chromosome 8. The DEFB1 gene is located close to the α-defensin genes. Only the DEFA1A3 gene shows copy number variation. The β-defensin region is located more downstream, in the same genomic region. The whole β-defensin region shows copy number variation.
analyses should render the same copy numbers in both the assays irrespective of the position of the DNA sequence in the CNV. For the α-defensin CNV, the DEFA1A3 gene was chosen as repeated sequence. The α-defensin repeat contains the DEFA3 and DEFA1 genes. Since DEFA3 differs from DEFA1 by only one nucleotide, the combined number of DEFA1 and DEFA3 copies is determined in our assay. In fact, it has been proposed to consider DEFA3 as a polymorphic allele of DEFA1, rather than a paralogous gene of DEFA1. Approximately 10% of the individuals do not carry a DEFA3 allele (Mars et al., 1995). The DEFB1-gene which is in close proximity to the α- and β-defensin CNV is the non-repeated reference sequence both for the α-defensin as well as for the β-defensin diploid copy number determination. The test we have developed is a quantitative duplex real-time PCR assay. 2. Materials and methods
2.1. Preparation of the control constructs
The detailed procedure is described in the Supplementary data. Briefly, the DEFB1 and DEFB4 cDNAs used for the generation of the reference constructs did not contain any polymorphism. A DEFB4 PCR fragment flanked by XhoI and SalI sites was generated and cloned in pCR2.1. This fragment was then ligated into the SalI site of the pUC18-DEFB1 vector with T4 DNA ligase (Roche). XhoI and SalI produce compatible cohesive ends, and a junction formed between an XhoI end and a SalI end is recognized by neither enzyme. After ligation of a DEFB4 fragment into pUC18, only the last SalI recognition site therefore remains intact, so that repeats can be added one by one until reference constructs with the desired number of repeats are obtained. A set of 6 DEFB1-DEFB4 reference constructs was generated. The exact copy numbers of the fragments ligated into the vectors were subsequently verified by digesting the different vectors with the restriction enzymes EcoRI and SphI, which flank the DEFB1-DEFB4 inserts, after which the length of the obtained fragments was analyzed by gel electrophoresis on a 1% agarose gel (Fig. 1d). Analogously, sets of 4 DEFB1-DEFB104 and 4 DEFB1-DEFA1A3 reference constructs were generated.

2.2. Real-time PCR
The diploid copy number of DEFB4, DEFB104 or DEFA1A3 was determined by duplex real-time PCR with the qPCR Core kit (Eurogentec) on an ABI 7500 system (Applied Biosystems). For each sample, a PCR was performed with the TaqMan primers and probes listed in Table 1.
The PCR reaction was performed in 96-well clear optical reaction plates (Applied Biosystems). For each sample, 3 independent PCR reactions were performed, and each of these 3 assays was performed in duplicate. For each real-time PCR reaction, 25 µl solutions containing 1× qPCR Master Mix (Eurogentec), 200 nM of each sense and antisense primer, and 250 nM TaqMan probe were used. Either 100 ng genomic DNA, which equals approximately 3 × 10^5 molecules/µl, or an equal molarity of plasmid DNA (reference constructs) was used. The reaction conditions were set at 95 °C for 10 min, followed by 40 cycles of 95 °C for 15 s and 60 °C for 1 min. Each 96-well plate contained the set of four reference constructs. The delta Ct value of the reference constructs (Ct DEFB1 − Ct repeated gene) was correlated to the diploid copy number of the repeated gene in each plasmid. In this way a standard curve could be constructed for each 96-well plate, which was then used for diploid copy number determination of the different samples under investigation in that plate.
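The per-plate standard curve described in this section can be sketched as follows. This is a minimal illustration, not the authors' analysis software: all Ct values are invented for the example, the function names are hypothetical, and an ordinary least-squares line is assumed for the fit. ΔCt is taken as Ct(DEFB1) − Ct(repeated gene), and the four reference constructs are assigned diploid copy numbers 2, 4, 6 and 8.

```python
from statistics import mean

def delta_ct(ct_reference, ct_repeat):
    """Mean Ct of the non-repeated reference minus mean Ct of the repeated gene."""
    return mean(ct_reference) - mean(ct_repeat)

def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept (assumed fit method)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def copy_number(sample_dct, slope, intercept):
    """Read a diploid copy number off the plate's standard curve."""
    return (sample_dct - intercept) / slope

# Hypothetical replicate Ct values for the four constructs on one plate:
# diploid copy number -> (Ct values of DEFB1, Ct values of the repeated gene)
plate_standards = {
    2: ([24.0, 24.1], [24.0, 24.1]),
    4: ([24.0, 24.2], [23.3, 23.5]),
    6: ([24.1, 24.1], [22.7, 22.7]),
    8: ([24.0, 24.0], [22.0, 22.0]),
}
copies = sorted(plate_standards)
dcts = [delta_ct(*plate_standards[c]) for c in copies]
slope, intercept = fit_line(copies, dcts)

# Hypothetical unknown sample measured on the same plate
sample_dct = delta_ct([24.3, 24.3], [23.2, 23.2])
estimate = copy_number(sample_dct, slope, intercept)
print(f"estimated diploid copy number: {estimate:.1f}")
```

Because the curve is refitted from the constructs on every plate, a sample is only ever compared with standards that went through the same run.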
Table 1
Primers and probes.

Generation of reference constructs

| Gene | Primer | Sequence |
| --- | --- | --- |
| DEFB1 | Forward | 5′-TCCAAAGGAGCCAGCCTCTC-3′ |
| DEFB1 | Reverse | 5′-AAAAAGTTCATTTCACTTCTGCGTC-3′ |
| DEFB4 | Forward | 5′-CCAGCCATCAGCCATGAGGGT-3′ |
| DEFB4 | Reverse | 5′-TGGTTTACATGTCGCACGTC-3′ |
| DEFB4 | SalI | 5′-AGGTCGACCAGCCATCAGCCATGAGGGT-3′ |
| DEFB4 | XhoI | 5′-ACTCGAGTGGTTACATGTCGCACGTC-3′ |
| DEFB104 | Forward | 5′-CCCCAGCATTATGCAGAGAC-3′ |
| DEFB104 | Reverse | 5′-CGACTCTAGGGACCAGCACT-3′ |
| DEFB104 | SalI | 5′-AGGTCGACTTGTGCTGCTATTAGCCGT-3′ |
| DEFB104 | XhoI | 5′-ACTCGAGCGTTTCAGGGTTTTGTACGATT-3′ |

Real-time PCR primers and probes

| Gene | Primer/probe | Sequence |
| --- | --- | --- |
| DEFB1 | Forward | 5′-TTGCGTCAGCAGTGGAGG-3′ |
| DEFB1 | Reverse | 5′-AACAGGTGCCTTGAATTTTGGT-3′ |
| DEFB1 | Probe | 5′-VIC-CAATGTCTCTATTCTGCCTGCCCGATCTT-TAMRA-3′ |
| DEFB4 | Forward | 5′-ACAAATTGGCACCTGTGGTCT-3′ |
| DEFB4 | Reverse | 5′-GCAGCTTCTTGGCCTCCTC-3′ |
| DEFB4 | Probe | 5′-FAM-CCTGGAACAAAATGCTGCAAAAAGCC-TAMRA-3′ |
| DEFB104 | Forward | 5′-TGGTTATGGGACTGCCCG-3′ |
| DEFB104 | Reverse | 5′-TGGGACATCTTCCAATTCTGTATTC-3′ |
| DEFB104 | Probe | 5′-FAM-GGCTGCGACATTTCTTCCGGCA-TAMRA-3′ |
| DEFA1A3 | Forward | 5′-CTCCAGGCAAGAGCTGATGAG-3′ |
| DEFA1A3 | Reverse | 5′-TGGGATGTCCGCTGCAAT-3′ |
| DEFA1A3 | Probe | 5′-FAM-TTGCTGCAGCCCCGGAGCA-TAMRA-3′ |
| DEFA3 | Forward | 5′-GCTCAAGGAAAAACATGCCA-3′ |
| DEFA3 | Reverse | 5′-CAGCAGAATGCCCAGAGTCTT-3′ |
| DEFA3 | Probe | 5′-FAM-AACGTCGCTATGGAACCTGCATCTACCA-TAMRA-3′ |
To investigate the DEFA3 polymorphism, we developed a TaqMan assay with a forward primer that can only amplify the DEFA3 gene and not the DEFA1 gene. All primers and probes used are listed in Table 1.

2.3. Fluorescent in situ hybridization
Cytogenetic analysis was performed on peripheral blood cells from 13 individuals. DNA was isolated from whole blood cells using standard molecular biology procedures. The peripheral blood cells were air-dried on slides and pretreated with pepsin, followed by fixation with a 1% free formaldehyde solution and subsequent dehydration with ethanol. The fosmid probes directed to DEFB4 (G248P89438G6, BACPAC Resources Center) and DEFB104 (G248P97323B1, BACPAC Resources Center) were directly labeled with Spectrumgreen™ and Spectrumorange™ dUTP (Vysis) using nick translation. The fosmids were purified using QIAquick PCR purification spin columns (Qiagen). 2 µl of each purified fosmid and 6 µl Cot-1 DNA were dried and diluted in hybridization buffer. After overnight hybridization at 37 °C, the slides were washed for 1 min in 0.4× SSC/0.3% NP40 solution at 72 °C, 1 min in 2× SSC/0.1% NP40 solution at room temperature, and 1 min in 2× SSC. The cells were
Fig. 3. Validation of the reference constructs (a–d) Evaluation of the amplification efficiency of the two amplicons in a duplex real-time PCR, using the 4 reference constructs as template. The cycle threshold value (Ct value) is plotted against the logarithmic value of the different template concentrations that were tested for each of the constructs (log quantity). A linear correlation is seen, indicating that the reaction is concentration independent. For all 4 constructs, the coefficient of the linear trendline shows that the reaction has the same efficiency for both fragments in the duplex PCR reactions. The ΔCt value (CtDEFB4 − CtDEFB1) is also independent of the DNA concentration used as template in the PCR reaction. The duplex PCR reactions can thus be used in quantitative assays. The distance between the two lines is the ΔCt value; the larger the difference of the Ct values, the higher the number of repeats present. (e) Linear standard curve generated from the 4 reference constructs quantitated in the duplex real-time PCR assay. The mean distances (i.e. mean CtDEFB4 − CtDEFB1) between the curves from a to d are calculated for each reference construct and are plotted against the diploid copy number of the DEFB4 gene in the construct. The standard error for each mean CtDEFB4 − CtDEFB1 derived from each construct is shown.
counterstained with DAPI. FISH experiments were evaluated using an Axioplan 2 fluorescence microscope equipped with a charge-coupled device Axiophot 2 camera (Carl Zeiss Microscopy, Jena, Germany) and a MetaSystems Isis imaging system (MetaSystems, Altlussheim, Germany). For each sample, 10 interphases were evaluated.

2.4. Studied populations
DNA was isolated from whole blood samples using standard molecular biology procedures. The study was approved by the ethical committee of our university. All subjects involved in the study gave their informed consent. We included 344 healthy Belgian individuals, aged 0–77 years (mean age 42 years, 60% male).

2.5. Statistical analyses
Statistical analyses were performed using the SYSTAT package, release 7.0 (SPSS Inc., Chicago, IL, USA). Statistical tests were considered significant when their type I error was less than 0.05.

3. Results
3.1. Generation and validation of the reference constructs
In the first step, concatemeric reference constructs were generated. The procedure used to obtain these reference constructs is shown in Fig. 1b and is described in more detail in the Supplementary data. With this method, 4 concatemeric constructs harboring 1 copy of DEFB1 and 1 to 4 copies of DEFB4 were generated (Fig. 1c). The insertion of the correct copy number of DEFB4 into the different reference constructs was then analyzed. First the constructs were sequenced. The insertion of up to three copies of DEFB4 could be validated by sequencing. When more copies were inserted, a different method was needed to count the exact number of DEFB4 copies. Therefore, restriction digestion analysis was performed on the different reference constructs, using restriction enzyme recognition sites flanking the polycloning site. The length of the different DEFB4-containing digestion products was then determined by gel electrophoresis on a 1% agarose gel. Fragment length increased with the number of inserted copies (Fig. 1d). These reference constructs were then used in a quantitative real-time PCR assay. The repeated and the non-repeated sequences were amplified together in a single duplex reaction, again neutralizing pipetting errors. The probe for DEFB1 was VIC-labeled and the probe for DEFB4 was FAM-labeled, allowing the analysis of the fluorescence signal for both genes in a single PCR reaction. In the first validation step, PCR efficiency was tested for the duplex PCR reaction. To be able to use the constructs for quantification, the PCR efficiency of DEFB1 and DEFB4 should be equal. To test this, 100-fold dilutions were made of the different plasmids. For each dilution, a real-time PCR experiment was performed. The cycle at which the fluorescence exceeds a given threshold is the cycle threshold value (Ct value). In Fig. 3a–d, the Ct value is plotted for each vector
dilution and for both the DEFB1 and the DEFB4 fragments for all the concatemeric reference constructs. The slope of both curves is the same. This means that the PCR has the same efficiency for both amplicons in the duplex PCR reaction. For each concatemeric reference construct, the distance between the DEFB1 and DEFB4 curves is different; this distance is the delta Ct value (ΔCt, CtDEFB4 − CtDEFB1). This ΔCt value is the same for each dilution tested and correlates with the number of DEFB4 copies in the concatemeric reference construct. The four concatemeric reference constructs (DEFB4–DEFB1, (DEFB4)2–DEFB1, (DEFB4)3–DEFB1 and (DEFB4)4–DEFB1) were then used for the generation of a standard curve, by correlating the ΔCt value with the known diploid copy number of the different vectors (Fig. 3e). Because a diploid genome carries two copies of the non-repeated DEFB1 reference, a vector with one copy of DEFB1 and one copy of DEFB4 corresponds to a diploid copy number of 2 when the vectors are used as references to quantitate diploid copy numbers at the genomic DNA level. A vector that contains two copies of DEFB4 corresponds to a diploid copy number of 4, a vector that contains 3 copies of DEFB4 to a diploid copy number of 6, and a vector with 4 copies of DEFB4 to a diploid copy number of 8. A linear relationship between the DEFB4 copy number and the ΔCt values was obtained (R2 = 0.98, Fig. 3e). The copy number of unknown samples with a ΔCt value within the range of the standard curve (from 2 to 8 diploid β-defensin copies) can thus be accurately determined by a single duplex real-time PCR reaction.
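The efficiency comparison described above is commonly quantified from the slope of Ct against log10 of the template quantity, using the standard relation E = 10^(−1/slope) − 1, where E = 1 corresponds to perfect doubling per cycle. The sketch below assumes that relation. The dilution series and Ct values are hypothetical, constructed to represent an ideal reaction, and are not the data of Fig. 3; the function names are illustrative.

```python
import math

def slope_of(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx

def amplification_efficiency(log10_quantities, cts):
    """Efficiency from the dilution-series slope: E = 10 ** (-1 / slope) - 1."""
    return 10 ** (-1.0 / slope_of(log10_quantities, cts)) - 1.0

# Hypothetical 100-fold dilution series. With perfect doubling, each 100-fold
# dilution delays the Ct by log2(100) cycles.
log_quantity = [6.0, 4.0, 2.0]
step = math.log2(100)
ct_defb1 = [18.0, 18.0 + step, 18.0 + 2 * step]
ct_defb4 = [17.0, 17.0 + step, 17.0 + 2 * step]  # one cycle earlier throughout

eff_defb1 = amplification_efficiency(log_quantity, ct_defb1)
eff_defb4 = amplification_efficiency(log_quantity, ct_defb4)

# With equal efficiencies the two lines are parallel, so the Ct difference
# between the amplicons does not depend on the dilution.
ct_differences = [b1 - b4 for b1, b4 in zip(ct_defb1, ct_defb4)]
print(eff_defb1, eff_defb4, ct_differences)
```

If the two efficiencies differed, the lines would diverge with dilution and the Ct difference would no longer reflect the copy ratio alone.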
3.2. Validation with interphase FISH
In order to validate our developed real-time PCR based assay for diploid β-defensin copy number quantification, an independent technique was developed to analyze the copy number variation of the β-defensin CNV. A two-colored interphase fluorescence in situ hybridization (FISH) experiment was performed to quantitate the β-defensin copy number. Two fosmid probes directed to DEFB4 (G248P89438G6) and DEFB104 (G248P97323B1) of the β-defensin repeat were respectively labeled with Spectrumgreen™ and Spectrumorange™. These probes allowed us to
Table 2
Comparison between real-time PCR and FISH copy number quantification.

| Sample | Diploid β-defensin copy number determined with real-time PCR | Copy number according to FISH, allele 1 | Copy number according to FISH, allele 2 |
| --- | --- | --- | --- |
| Patient 1 (⁎) | 1.8 | 1 | 1 |
| Patient 2 | 4.1 | 2 | 2 |
| Patient 3 | 5.1 | 3 | 2 |
| Patient 4 (⁎) | 4.2 | 2 | 2 |
| Patient 5 | 4.2 | 2 | 2 |
| Patient 6 | 3.6 | 2 | 2 |
| Patient 7 | 4.1 | 2 | 2 |
| Patient 8 | 3.3 | 1 | 2 |
| Patient 9 | 5.9 | 4 | 2 |
| Patient 10 | 1.6 | 1 | 1 |
| Patient 11 | 5.0 | 3 | 2 |
| Patient 12 (⁎) | 3.2 | 2 | 1 |
| Patient 13 (⁎) | 5.0 | 3 | 2 |

Patients marked with (⁎) are shown in Fig. 4.
Fig. 4. (1–4) Two-colored interphase FISH with fosmids that recognize DEFB4 (green) or DEFB104 (red). The different repeat units are marked with arrows. The repeat number of the samples determined by the real-time PCR assay, using the 4 concatemeric control constructs, respectively found 2, 3, 4 and 5 repeats.
visualize individual copies of the DEFB4 and DEFB104 genes and count them digitally on interphase DNA samples of 13 individuals. Furthermore, DNA was extracted from the same 13 samples and typed by the developed real-time PCR assay. For all samples, the real-time PCR assay and the FISH assay yielded the same diploid β-defensin copy number (Table 2, Fig. 4). These results further validated the quantitative β-defensin CNV assay.
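The agreement reported above can be checked directly against the values in Table 2. The comparison rule used here is an assumption, since the text does not state one: the real-time PCR estimate is rounded to the nearest integer and compared with the sum of the two FISH allele counts.

```python
# (sample, real-time PCR estimate, FISH allele 1, FISH allele 2), values from Table 2
table2 = [
    ("Patient 1", 1.8, 1, 1),
    ("Patient 2", 4.1, 2, 2),
    ("Patient 3", 5.1, 3, 2),
    ("Patient 4", 4.2, 2, 2),
    ("Patient 5", 4.2, 2, 2),
    ("Patient 6", 3.6, 2, 2),
    ("Patient 7", 4.1, 2, 2),
    ("Patient 8", 3.3, 1, 2),
    ("Patient 9", 5.9, 4, 2),
    ("Patient 10", 1.6, 1, 1),
    ("Patient 11", 5.0, 3, 2),
    ("Patient 12", 3.2, 2, 1),
    ("Patient 13", 5.0, 3, 2),
]

def agrees(pcr_estimate, allele_1, allele_2):
    """Assumed rule: rounded PCR estimate equals the FISH diploid total."""
    return round(pcr_estimate) == allele_1 + allele_2

concordant = [name for name, pcr, a1, a2 in table2 if agrees(pcr, a1, a2)]
largest_gap = max(abs(pcr - (a1 + a2)) for _, pcr, a1, a2 in table2)
print(f"{len(concordant)} of {len(table2)} samples concordant")
```

Under this rule the largest distance between a PCR estimate and the FISH total stays below half a copy, which is what makes rounding unambiguous for these samples.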
3.3. Diploid DEFB4 copy number versus diploid DEFB104 copy number It was also tested if our copy number analysis assay is sequence-context-dependent. Indeed, the assay is directed against a local sequence, i.e. DEFB4, of the total 250 kb CNV. Besides DEFB4, other genes such as DEFB104 are located in the β-defensin repeat. DEFB4 and DEFB104 are separated by 50 kb. We therefore constructed a series of four reference constructs in which the same non-repeated sequence against DEFB1 was used, but in which the repeated sequence was directed to DEFB104. Also here, the efficiency of the DEFB1 and DEFB104 amplifications was similar in a duplex PCR reaction, allowing the generation of a linear standard curve (R2 = 0.98) (data not shown). For 85 samples tested, the same diploid copy number of DEFB104 and DEFB4 was obtained for each sample (R2 value = 0.95; slope = 0.97; Fig. 5, Supplementary Table 1). With these assays we conclude the same diploid copy number of DEFB4 and DEFB104 in all the samples tested, as can be expected since they are located in the same CNV.
3.4. Analysis with two reference constructs versus four reference constructs
During the course of our experiments, Chen et al. (2006) published a similar approach. However, they used only two reference constructs, with 1 or 2 copies of the repeated genes and 1 copy of a non-repeated reference gene. We re-analyzed our 150 real-time PCR data, using this two-reference-construct strategy for the generation of a standard curve. The two-construct combination of the DEFB4–DEFB1 and (DEFB4)2–DEFB1 reference constructs, as well as the two-construct combination of the DEFB4–DEFB1 and (DEFB4)4–DEFB1 constructs, were evaluated (Fig. 6a,b). The diploid β-defensin copy numbers obtained by the different methods are summarized in Table 3.
When only the DEFB4–DEFB1 and (DEFB4)2–DEFB1 constructs were used for generation of the standard curve, the diploid copy number assigned to a sample could differ from the one determined with 4 reference constructs. This was especially the case when higher diploid β-defensin copy numbers (4 or more) were found (R2 = 0.69). The slope of the comparison curve is also lower than 1 (slope = 0.77), indicating that the two approaches assign different copy numbers. As shown in Table 3, this approach underestimated the diploid β-defensin copy number. For 17% of the samples, a copy number difference of at least 0.8 was found, with a maximum deviation of 2 copies.
On the other hand, when the combination of the DEFB4–DEFB1 and (DEFB4)4–DEFB1 constructs was used, a much better correlation was found with the diploid β-defensin copy number determined with the 4 control constructs (R2 = 0.87, slope = 0.99). Here the correlation was worse for the samples carrying a low diploid β-defensin copy number. For 7% of the samples, a diploid copy number difference of 0.8 or more was found, with a maximum diploid copy number difference of 1.1. As shown in Table 3, for some samples, the diploid copy number is overestimated.
Fig. 5. DEFB4 versus DEFB104 diploid copy number determination; comparison of the copy numbers of the β-defensin region determined with a standard curve obtained through the use of two different sets of 4 reference constructs. The first set contained sequences directed to DEFB4, the second set sequences directed to DEFB104.
Fig. 6. (a–b) Comparison of diploid copy numbers determined when 2 (y-axis) or 4 (x-axis) reference constructs are used for the generation of a standard curve on the same data set; (a) uses the DEFB4–DEFB1 and (DEFB4)2–DEFB1 constructs, (b) uses the DEFB4–DEFB1 and (DEFB4)4–DEFB1 constructs. (c–d) Comparison of the number of β-defensin repeats determined with a standard curve obtained by only two DEFB4 or two DEFB104 reference constructs; the combination of one and two repeat reference constructs (c), as well as the combination of one and four repeat reference constructs (d) was evaluated. A linear correlation is observed between the DEFB4 and DEFB104 analysis, however a larger deviation from linearity is observed compared to the diploid copy number determination that uses four reference constructs in the analysis.
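Why the choice of two constructs matters can be illustrated with a deliberately idealized model. The sketch below is not a re-analysis of the data in Fig. 6 or Table 3. It assumes perfect doubling per cycle, so that the magnitude of ΔCt equals log2 of the number of repeat copies per reference copy, and it assumes a straight-line standard curve fitted by least squares. Real ΔCt values need not follow this model, and all names are hypothetical.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def ideal_delta_ct(diploid_copies):
    """Idealized delta Ct (assumption): log2 of repeat copies per reference copy."""
    return math.log2(diploid_copies / 2)

def calibrate(standard_copies):
    """Return an estimator built from standards with the given diploid copy numbers."""
    slope, intercept = fit_line(
        standard_copies, [ideal_delta_ct(c) for c in standard_copies]
    )
    return lambda dct: (dct - intercept) / slope

four_point = calibrate([2, 4, 6, 8])  # constructs with 1 to 4 repeats
low_pair = calibrate([2, 4])          # constructs with 1 and 2 repeats only
wide_pair = calibrate([2, 8])         # constructs with 1 and 4 repeats only

# Columns: true copies, then the estimate from each standard curve
for true_copies in range(2, 9):
    dct = ideal_delta_ct(true_copies)
    print(true_copies,
          round(four_point(dct), 2),
          round(low_pair(dct), 2),
          round(wide_pair(dct), 2))
```

In this idealized setting the pair with 1 and 2 repeats reads high copy numbers too low, while the pair with 1 and 4 repeats reads intermediate copy numbers too high. Both biases point in the same direction as the deviations described in the text for the corresponding two-construct analyses.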
Next, the DEFB104 diploid copy number was also reanalyzed with a standard curve containing only two data points. With this approach, the correlation between the DEFB4 and DEFB104 assays was not as good as the correlation seen with the 4 reference construct assay. This was especially the case when the DEFB4–DEFB1 + (DEFB4)2–DEFB1 and DEFB4–DEFB1 + (DEFB104)2–DEFB1 analyses were compared (R2 = 0.78, slope = 0.9, Fig. 6c). With such 2-reference constructs one can indeed falsely conclude that the DEFB4 and DEFB104 may be different in some samples, as was concluded in the study by Chen et al. (2006). 3.5. Use of 6 reference constructs with 1–6 copies of DEFB4 for copy number analysis We thus have shown that a sufficient number of control constructs should be used for the generation of a standard
curve, in order to obtain reliable diploid copy number values. Ideally, the constructs should cover the complete CNV range. For the β-defensin CNV, individuals who carry a diploid copy number of 8 or more have been reported. Therefore, we also constructed (DEFB4)5–(DEFB1) and (DEFB4)6–(DEFB1) reference constructs. With these additional constructs we were able to generate a standard curve with an R2-value of 0.98. This is the same R2-value as obtained with only 4 reference constructs. The additional data points thus had no effect, or only a very limited effect, on the equation of the standard curve. As a result, the diploid β-defensin copy numbers obtained with 4 and with 6 reference constructs correlated well (R2-value = 0.99, slope = 1). The addition of extra reference constructs thus did not improve the quantification. A disadvantage of these additional constructs was that the constructs with 5 and 6 copies of DEFB4 turned out to be
Table 3
Comparison of the diploid β-defensin copy numbers obtained on the same real-time PCR data set in 149 individuals when different approaches are followed for the generation of a standard curve.
Number and type of reference constructs used for the generation of a standard curve to calculate the diploid β-defensin copy numbers:

| β-defensin copy number | 4 reference constructs | Reference constructs DEFB4–DEFB1 and (DEFB4)2–DEFB1 | Reference constructs DEFB4–DEFB1 and (DEFB4)4–DEFB1 |
| --- | --- | --- | --- |
| 2 | 3 | 2 | 2 |
| 3 | 23 | 34 | 17 |
| 4 | 64 | 74 | 58 |
| 5 | 46 | 29 | 54 |
| 6 | 8 | 9 | 11 |
| 7 | 4 | 0 | 4 |
| 8 | 1 | 0 | 3 |
| 9 | 0 | 1 | 0 |
unstable at 4 °C. Therefore, the use of reference constructs containing 1–4 copies was preferred for further use in the assay. 3.6. Diploid DEFA1A3 copy number We also generated an assay, and therefore reference constructs, for the α-defensin CNV. More precisely, this CNV is a copy number variation of the DEFA1A3 gene, which is located distal from the β-defensin CNV. The non-repeated sequence was again directed to DEFB1. This assay determines the combined number of DEFA1 and DEFA3 loci. Since DEFA1 and DEFA3 only differ in 1 SNP, DEFA3 is considered as a polymorphic allele of DEFA1. Four concatemeric constructs were generated which contained 1–4 copies of the DEFA1A3 gene. Again, the constructs were evaluated by sequencing and restriction digestion. Moreover, PCR efficiency was tested for the duplex TaqMan assay on the reference constructs. With the 4 DEFA1A3 reference constructs, again an excellent standard curve was obtained (R2-value = 0.98). The presence or the absence of the DEFA3 allele was also investigated. We developed a TaqMan assay with a probe that only recognizes DEFA3 and not DEFA1. This assay was validated on constructs which contained DEFA1 or DEFA3 cDNA. The assay specifically identified the DEFA3 allele, but not the DEFA1 allele. 3.7. Copy number analysis in a Belgian control population With these assays, which were thoroughly tested and validated, the distribution of the α- and β-defensin CNV in a Belgian control population of 344 individuals was then determined. The β-defensin copy number varied from 2 to 8 with an average copy number of 4. The DEFA1A3 copy number varied from 2 to 10 with an average copy number of 7 (Fig. 7a). Approximately 15% of the individuals did not carry a DEFA3 allele. As shown in Fig. 7b–f, for each DEFA1A3 copy number, a similar distribution of β-defensin copy number was found to be associated. 
No linkage was found between the β-defensin copy number and the DEFA1A3 copy number (ANOVA, P = 0.8, Fig. 7g). This is in agreement with
the findings of Linzmeier and Ganz (2005). The DEFA3 allele was also neither linked with the diploid copy number of the DEFA1A3 gene (P = 0.6) nor with the diploid β-defensin copy number (P = 0.6). 4. Discussion The extensive CNV in the β-defensin region was originally determined by a multiplex amplifiable probe hybridization (MAPH) assay (Hollox et al., 2003). It is a rather complex assay involving hybridization and elution of DNA, followed by amplification. Each of these 3 major steps in the MAPH assay may influence the accuracy of the final quantitative test results. When the findings of the MAPH results were compared to a semi-quantitative fluorescence in situ hybridization (SQ-FISH) assay, using BAC or YAC clones, the observed signal intensity ratios by SQ-FISH were only consistent with the MAPH findings in 1 out of 3 tested families (Hollox et al., 2003). Although the actual β-defensin copy number determined by the MAPH assay have never been validated by microscopical means, MAPH typed samples have been used as reference controls in most studies (Chen et al., 2006; Fellermann et al., 2006; Hollox et al., 2008b). We have developed concatemeric reference constructs, wherein each reference construct is formed by an isolated non-repeated nucleic acid sequence comprising one copy of a reference sequence and by a known but different number of copies of a repeated sequence. We have used real-time PCR for quantification of the α- and β-defensin CNV, because this is a very accurate analysis method. This system is also amenable for high throughput. Furthermore, a real-time PCR assay for diploid copy number quantification is a 1 step assay, a single real-time PCR reaction on genomic DNA is sufficient to obtain reliable diploid copy numbers. The assay that we developed was indeed very accurate and reliable since the copy numbers assigned by the real-time PCR assay did correlate with the copy numbers observed in the FISH experiments. 
The assay was also sequence-context independent, as the same results were obtained when assays were developed against different regions of the same CNV. We conclude that a set of reference constructs covering the whole range of diploid copy numbers should be used to obtain correct diploid copy numbers or, at the least, that sufficient reference constructs should be included in the assay to obtain a reliable standard curve with sufficient data points.

Fig. 7. (a) The distribution of the DEFB4 and DEFA1A3 diploid copy numbers in the Belgian control population. (b–f) Distribution of the diploid DEFB4 copy number for the different DEFA1A3 CNV alleles. (g) The average diploid β-defensin copy number and standard error for each DEFA1A3 CNV allele.

An analogous real-time PCR assay has been put forward by Chen et al. (2006). That assay used only one calibrator vector (containing one copy of a reference sequence and one copy of the repeated sequence) and one reference vector (containing one copy of the reference sequence and two copies of the repeated sequence). When we re-analyzed our data using only the DEFB4–DEFB1 and (DEFB4)₂–DEFB1 constructs, which is the strategy of Chen et al., we found that the inferred repeat number could differ from the repeat number determined with four control constructs, especially when the number of repeats was four or higher (R² = 0.69). In addition to the DEFB4 assay, we developed an analogous DEFB104 assay, which yielded the same diploid β-defensin copy numbers in our samples when four reference constructs were used. Re-analysis of our samples using only two reference constructs in the DEFB4 and DEFB104 assays could wrongly yield different β-defensin copy numbers for the two assays in some samples. We also found that fewer errors were made when the second of the two reference constructs contained the highest number of repeats available. The extraordinary finding that the numbers of DEFB4 and DEFB104 repeats may differ in some samples by Chen
et al. is thus most likely false. The discrepancy is likely the result of technical error inherent in using only two control constructs to generate the standard curve. Moreover, the two reference controls used by Chen et al. correspond to the lowest part of the CNV repeat range, which results in the highest error rate. Indeed, it is well recognized that DEFB4 and DEFB104 lie 50 kb apart but within the same repeat unit (Nusbaum et al., 2006); the DEFB4 and DEFB104 diploid copy numbers should therefore be identical within an individual. We conclude that CNVs can only be determined accurately if sufficient concatemeric reference constructs are used to obtain a reliable standard curve.

Conflict of interest

We declare a conflict of interest: Hilde Nuytten, Harry Cuppens and Jean-Jacques Cassiman filed a patent application covering the technology described in this manuscript.

Acknowledgements

This study was supported by grants from the ‘Alphonse and Jean Forton Fund - Koning Boudewijn Stichting’ (2008R10150-002); ‘Het Fonds voor Wetenschappelijk Onderzoek Vlaanderen’ (G.0521.06 and 1.5.111.07); and the Interuniversity Attraction Poles (IAP P6/05). J-J Cassiman is holder of the Arthur Bax and Anna Vanuffelen Chair of Human Genetics, KU Leuven, Belgium.

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.jim.2009.03.002.

References

Aldred, P.M., Hollox, E.J., Armour, J.A., 2005. Copy number polymorphism and expression level variation of the human alpha-defensin genes DEFA1 and DEFA3. Hum. Mol. Genet. 14, 2045.
Armour, J.A., Palla, R., Zeeuwen, P.L., den Heijer, M., Schalkwijk, J., Hollox, E.J., 2007. Accurate, high-throughput typing of copy number variation using paralogue ratios from dispersed repeats. Nucleic Acids Res. 35, e19.
Chen, Q., Book, M., Fang, X., Hoeft, A., Stuber, F., 2006. Screening of copy number polymorphisms in human beta-defensin genes using modified real-time quantitative PCR. J. Immunol. Methods 308, 231.
Fellermann, K., Stange, D.E., Schaeffeler, E., Schmalzl, H., Wehkamp, J., Bevins, C.L., Reinisch, W., Teml, A., Schwab, M., Lichter, P., Radlwimmer, B., Stange, E.F., 2006. A chromosome 8 gene-cluster polymorphism with low human beta-defensin 2 gene copy number predisposes to Crohn disease of the colon. Am. J. Hum. Genet. 79, 439.
Ganz, T., 2003. Defensins: antimicrobial peptides of innate immunity. Nat. Rev. Immunol. 3, 710.
Gonzalez, E., Kulkarni, H., Bolivar, H., Mangano, A., Sanchez, R., Catano, G., Nibbs, R.J., Freedman, B.I., Quinones, M.P., Bamshad, M.J., Murthy, K.K., Rovin, B.H., Bradley, W., Clark, R.A., Anderson, S.A., O'Connell, R.J., Agan, B.K., Ahuja, S.S., Bologna, R., Sen, L., Dolan, M.J., Ahuja, S.K., 2005. The influence of CCL3L1 gene-containing segmental duplications on HIV-1/AIDS susceptibility. Science 307, 1434.
Hollox, E.J., Armour, J.A., Barber, J.C., 2003. Extensive normal copy number variation of a beta-defensin antimicrobial-gene cluster. Am. J. Hum. Genet. 73, 591.
Hollox, E.J., Barber, J.C., Brookes, A.J., Armour, J.A., 2008a. Defensins and the dynamic genome: what we can learn from structural variation at human chromosome band 8p23.1. Genome Res. 18, 1686.
Hollox, E.J., Huffmeier, U., Zeeuwen, P.L., Palla, R., Lascorz, J., Rodijk-Olthuis, D., van de Kerkhof, P.C., Traupe, H., de Jongh, G., den Heijer, M., Reis, A., Armour, J.A., Schalkwijk, J., 2008b. Psoriasis is associated with increased beta-defensin genomic copy number. Nat. Genet. 40, 23.
Iafrate, A.J., Feuk, L., Rivera, M.N., Listewnik, M.L., Donahoe, P.K., Qi, Y., Scherer, S.W., Lee, C., 2004. Detection of large-scale variation in the human genome. Nat. Genet. 36, 949.
Kidd, J.M., Cooper, G.M., Donahue, W.F., Hayden, H.S., Sampas, N., Graves, T., Hansen, N., Teague, B., Alkan, C., Antonacci, F., Haugen, E., Zerr, T., Yamada, N.A., Tsang, P., Newman, T.L., Tuzun, E., Cheng, Z., Ebling, H.M., Tusneem, N., David, R., Gillett, W., Phelps, K.A., Weaver, M., Saranga, D., Brand, A., Tao, W., Gustafson, E., McKernan, K., Chen, L., Malig, M., Smith, J.D., Korn, J.M., McCarroll, S.A., Altshuler, D.A., Peiffer, D.A., Dorschner, M., Stamatoyannopoulos, J., Schwartz, D., Nickerson, D.A., Mullikin, J.C., Wilson, R.K., Bruhn, L., Olson, M.V., Kaul, R., Smith, D.R., Eichler, E.E., 2008. Mapping and sequencing of structural variation from eight human genomes. Nature 453, 56.
Linzmeier, R.M., Ganz, T., 2005. Human defensin gene copy number polymorphisms: comprehensive analysis of independent variation in alpha- and beta-defensin regions at 8p22-p23. Genomics 86, 423.
Mars, W.M., Patmasiriwat, P., Maity, T., Huff, V., Weil, M.M., Saunders, G.F., 1995. Inheritance of unequal numbers of the genes encoding the human neutrophil defensins HP-1 and HP-3. J. Biol. Chem. 270, 30371.
Nusbaum, C., Mikkelsen, T.S., Zody, M.C., Asakawa, S., Taudien, S., Garber, M., Kodira, C.D., Schueler, M.G., Shimizu, A., Whittaker, C.A., Chang, J.L., Cuomo, C.A., Dewar, K., FitzGerald, M.G., Yang, X., Allen, N.R., Anderson, S., Asakawa, T., Blechschmidt, K., Bloom, T., Borowsky, M.L., Butler, J., Cook, A., Corum, B., DeArellano, K., DeCaprio, D., Dooley, K.T., Dorris III, L., Engels, R., Glockner, G., Hafez, N., Hagopian, D.S., Hall, J.L., Ishikawa, S.K., Jaffe, D.B., Kamat, A., Kudoh, J., Lehmann, R., Lokitsang, T., Macdonald, P., Major, J.E., Matthews, C.D., Mauceli, E., Menzel, U., Mihalev, A.H., Minoshima, S., Murayama, Y., Naylor, J.W., Nicol, R., Nguyen, C., O'Leary, S.B., O'Neill, K., Parker, S.C., Polley, A., Raymond, C.K., Reichwald, K., Rodriguez, J., Sasaki, T., Schilhabel, M., Siddiqui, R., Smith, C.L., Sneddon, T.P., Talamas, J.A., Tenzin, P., Topham, K., Venkataraman, V., Wen, G., Yamazaki, S., Young, S.K., Zeng, Q., Zimmer, A.R., Rosenthal, A., Birren, B.W., Platzer, M., Shimizu, N., Lander, E.S., 2006. DNA sequence and analysis of human chromosome 8. Nature 439, 331.
Perry, G.H., Tchinda, J., McGrath, S.D., Zhang, J., Picker, S.R., Caceres, A.M., Iafrate, A.J., Tyler-Smith, C., Scherer, S.W., Eichler, E.E., Stone, A.C., Lee, C., 2006. Hotspots for copy number variation in chimpanzees and humans. Proc. Natl. Acad. Sci. U. S. A. 103, 8006.
Sharp, A.J., Locke, D.P., McGrath, S.D., Cheng, Z., Bailey, J.A., Vallente, R.U., Pertz, L.M., Clark, R.A., Schwartz, S., Segraves, R., Oseroff, V.V., Albertson, D.G., Pinkel, D., Eichler, E.E., 2005. Segmental duplications and copy-number variation in the human genome. Am. J. Hum. Genet. 77, 78.
Townson, J.R., Barcellos, L.F., Nibbs, R.J., 2002. Gene copy number regulates the production of the human chemokine CCL3-L1. Eur. J. Immunol. 32, 3016.
Tuzun, E., Sharp, A.J., Bailey, J.A., Kaul, R., Morrison, V.A., Pertz, L.M., Haugen, E., Hayden, H., Albertson, D., Pinkel, D., Olson, M.V., Eichler, E.E., 2005. Fine-scale structural variation of the human genome. Nat. Genet. 37, 727.
Vankeerberghen, A., Scudiero, O., De Boeck, K., Macek Jr., M., Pignatti, P.F., Van Hul, N., Nuytten, H., Salvatore, F., Castaldo, G., Zemkova, D., Vavrova, V., Cassiman, J.J., Cuppens, H., 2005. Distribution of human beta-defensin polymorphisms in various control and cystic fibrosis populations. Genomics 85, 574.
Wong, K.K., deLeeuw, R.J., Dosanjh, N.S., Kimm, L.R., Cheng, Z., Horsman, D.E., MacAulay, C., Ng, R.T., Brown, C.J., Eichler, E.E., Lam, W.L., 2007. A comprehensive analysis of common copy-number variations in the human genome. Am. J. Hum. Genet. 80, 91.
Yang, D., Chertov, O., Oppenheim, J.J., 2001. The role of mammalian antimicrobial peptides and proteins in awakening of innate host defenses and adaptive immunity. Cell. Mol. Life Sci. 58, 978.