Methods 58 (2012) 231–242
Chromosome conformation capture on chip in single Drosophila melanogaster tissues Bas Tolhuis ⇑, Marleen Blom, Maarten van Lohuizen ⇑ Division of Molecular Genetics and the Centre for Biomedical Genetics, Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, The Netherlands
Article info
Article history: Available online 16 April 2012
Keywords: Nuclear organization; Chromosome conformation capture; Drosophila melanogaster; Genome-wide; Domainograms
Abstract
Chromosomes are protein–DNA complexes that encode life. In a cell nucleus, chromosomes are folded in a highly specific manner, which connects strongly to some of their paramount functions, such as DNA replication and gene transcription. Chromosome conformation capture methodologies allow researchers to detect chromosome folding by quantitatively measuring which genomic sequences are in close proximity in nuclear space. Here, we describe a modified chromosome conformation capture on chip (4C) protocol, which is specifically designed for detection of chromosome folding in a single Drosophila melanogaster tissue. Our protocol enables 4C analyses on a limited number of cells, which is crucial for fly tissues, because these contain relatively low numbers of cells. We used this protocol to demonstrate that target genes of Polycomb group proteins interact with each other in nuclear space of third instar larval brain cells. Major benefits of using D. melanogaster in 4C studies are: (1) powerful and tractable genetic approaches can be incorporated; (2) the short generation time allows use of complex genotypes; and (3) the genome is compact and well annotated. We anticipate that our sensitized 4C method will be generally applicable to detect chromosome folding in other fly tissues.
© 2012 Elsevier Inc. All rights reserved.
1. Introduction
“The next frontier of genomics is space,” as recently stated in Nature [1]. That is, the way chromosomes fold in the three-dimensional space of an interphase cell nucleus is far from random and correlates strongly with genome function [2–4]. During interphase of the cell cycle essential genomic functions, including replication and transcription, are carried out. Thus, using genomic approaches to understand how chromosomes are folded in a specific spatial arrangement in the interphase cell nucleus may shed light on basic processes that determine life. In recent years, a burst of high-throughput assays has enabled researchers to study nuclear organization on a genome-wide scale. These data clearly demonstrate that gene density, transcription and replication are correlated with chromosome folding and nuclear position. Briefly, chromosomes are folded in a highly specific manner, in which gene-dense, actively transcribed, and early replicating regions are in close spatial proximity. Conversely, gene-poor, inactive, and late replicating regions segregate away and interact frequently with the nuclear periphery [5–9]. Some of the aforementioned studies were based on the so-called “chromosome conformation capture” (3C) method [10], which has proven to be a powerful method to study genome organization in nuclear space.
⇑ Corresponding authors. Address: NKI-AVL, Division of Molecular Genetics, Plesmanlaan 121, 1066 CX Amsterdam, The Netherlands. Fax: +31 20 512 20 11. E-mail addresses: [email protected] (B. Tolhuis), [email protected] (M.v. Lohuizen).
1046-2023/$ - see front matter © 2012 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.ymeth.2012.04.003
The basic 3C method has five experimental steps (Fig. 1). First, chromatin segments that interact by chromosome folding are fixed together, using a cross-linking agent that creates covalent bonds between segments. Second, non-cross-linked and cross-linked chromatin segments are separated by restriction enzyme mediated genome fragmentation. Third, a ligation procedure joins restriction ends of cross-linked segments. As such, ligation junctions are a measure of spatial proximity: chromatin segments that are spatially close will form more ligation junctions than spatially distant segments. Fourth, cross-links are reversed to allow downstream analyses. The resulting pool of linear DNA fragments is referred to as the “3C library.” Finally, the library can be queried to identify interactions between chromatin segments, using various assays (Fig. 1). The nuclear interactions of a single gene against the rest of the genome can be determined using “chromosome conformation capture on chip” (4C) [9]. Originally, 4C required large numbers of cells [9]. Unfortunately, most fly tissues contain relatively low numbers of cells, making 4C under these conditions very demanding. We successfully adapted the 4C method to systematically map chromosome folding with limited material from a single Drosophila melanogaster tissue [11]. D. melanogaster has several advantages over other species for studying nuclear organization. The functional significance of chromosome folding can be explored by making use of the huge collection of genetic tools and mutant strains that are available for Drosophila. For example, we showed in our studies that a chromosomal rearrangement dramatically alters interactions among Polycomb group
(PcG) chromatin domains [11]. Similarly, it was shown that a deletion of the Fab-7 regulatory DNA element resulted in subtly altered interactions between homeotic genes and other PcG target genes [12]. In Drosophila, phenotypic changes can be used to directly assess the functional relevance of distorted chromosomal interactions [11,12]. Furthermore, complex genotypes can quickly be made due to the short generation time of flies. Finally, the compact and well annotated fly genome makes high-throughput assays relatively cost-effective. Here we provide detailed descriptions and considerations regarding experimental design, protocols, and data analyses to make the 4C method generally applicable in D. melanogaster.
Fig. 1. Schematic outline that describes chromosome conformation capture methodologies. Left panel shows the five basic steps of chromosome conformation capture (3C), including chromatin cross-linking, restriction enzyme digestion of cross-linked fragments, intramolecular ligation, reversal of cross-linked chromatin, and detection of interactions or further processing of the “3C library.” Right panel shows our sensitized chromosome conformation capture on chip (4C) procedure that enables quantitative 4C in Drosophila melanogaster tissues, which contain low numbers of cells. Our strategy is largely based on the method designed by Simonis et al. [9], which trims the “3C library” using digestion with a frequently cutting restriction enzyme. The shortened library is circularized through ligation and a specific sequence (i.e., bait – red fragment) and all its interacting sequences (blue fragment) are amplified by inverse PCR. We included a second, strictly linear amplification step using T7 RNA polymerase [20] to obtain enough material for the final microarray detection.
2. Experimental design
We discuss considerations to effectively employ our sensitized 4C method for single Drosophila tissues. Using this method, we have successfully demonstrated that PcG chromatin domains show a strong preference to interact with each other in larval brain cells [11].
2.1. Choice of Drosophila tissue
Two critical features of the 4C method determine the choice of tissue, namely the detection limit and the fact that it only detects the most common interactions in a population. Because original 4C detects long-range interactions with relatively low efficiency, its detection limit requires that a large population of cells is analyzed [13]. Routinely, 10 million cells or more are processed to capture interactions with the 4C method [9,14]. Since Drosophila tissues contain relatively few cells, it is demanding to retrieve high cell numbers. For example, Bantignies and colleagues [12] dissected and pooled brain and anterior imaginal disk tissues out of 200 third instar larvae to reach an appropriate number of cells. In our sensitized 4C protocol, we required brain tissue of only 40 third instar larvae [11]. Using this number of brains, we examined approximately 1 million cells (estimated from the total number of neuroblasts and the progeny cells that neuroblasts produce [15]) in our sensitized 4C analysis. In addition to larval brain, our protocol provides opportunities to examine other Drosophila tissues with limited cell numbers. Interpretation of 4C data is clearest in a homogeneous cell population, because the method only detects the most common interactions in a population. Assuming that chromosome folding and function are linked, it seems reasonable that in a homogeneous cell population the most common interactions are linked to a specific function. For homogeneity one can focus on several parameters. First, identical expression status of the gene-of-interest/bait sequence among the large majority of cells has been used as a parameter [9,11]. As such, 4C allows analysis of the nuclear organization of either an active or an inactive chromatin segment, linking chromosome folding to gene expression. Second, it is possible to use bait sequences that reside in specific chromatin domains. For example, we used the Ptx1 and trachealess genes as bait sequences [11].
These genes are located inside Polycomb domains, which are bound by Polycomb and have increased methylation of lysine residue 27 on histone H3 [11,16]. As such, the epigenetic status of genes can be coupled to chromosome folding. Finally, Drosophila mutant strains can be used to obtain a homogeneous cell population. For instance, mutations in the brain tumor (brat) gene perturb asymmetric cell divisions in neuroblasts, causing uncontrolled overproliferation of cells that arise from a small group of type-II neuroblasts [17–19]. Consequently, brain tissue consists almost completely of type-II neuroblasts in brat mutant larvae. In this way, cell type specific chromosome folding can be studied.
2.2. Selection of bait sequences, restriction enzymes and primer design
The choice of bait sequence depends on the gene or genomic region that is being studied. Importantly, the choice of bait sequence also depends on primer design for inverse PCR, which we explain further below. The exact sequence that will be amplified further depends on the restriction enzymes used in the 4C assay. The first restriction enzyme is used to prepare the “3C library” (Fig. 1). We routinely use EcoRI, which has a six base pair recognition sequence and on average cuts every 4 kb in the fly genome. After the intramolecular ligation step, the library consists of high molecular weight DNA that cannot easily be amplified by inverse PCR. For that reason, the library is further processed by a second digestion and ligation step to produce circular low molecular weight 4C templates (Fig. 1). For the second digestion we use restriction enzymes that have a four base pair recognition site and cut the genome much more frequently than EcoRI, yielding mostly fragments that range in size from 0.2 to 1 kb. The bait sequence lies between the recognition sites of the two restriction enzymes (Fig. 2A).
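The difference in cutting frequency between six and four base pair cutters can be illustrated with a back-of-the-envelope calculation. The sketch below assumes a random sequence with equal base frequencies, which is an idealisation and not a property of the fly genome; the function name is ours.

```python
# Expected spacing between recognition sites, assuming a random sequence
# with equal frequencies of A, C, G and T (an idealisation).

def expected_fragment_bp(site_length):
    """Average distance in bp between occurrences of an n-bp site: 4**n."""
    return 4 ** site_length

six_cutter = expected_fragment_bp(6)   # e.g. EcoRI (GAATTC)
four_cutter = expected_fragment_bp(4)  # e.g. DpnII (GATC) or NlaIII (CATG)

print(six_cutter, four_cutter)    # 4096 256
print(six_cutter // four_cutter)  # 16
```

Under this assumption a six base pair cutter yields fragments of about 4 kb, in line with the average stated above for EcoRI, and a four base pair cutter cuts roughly 16 times more often.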
Besides EcoRI, other restriction enzymes have been used successfully to prepare the “3C library.” However, each enzyme may have a different digestion efficiency in cross-linked chromatin. Therefore, it is important to test the digestion efficiency of each enzyme prior to 4C assays. Thermophilic restriction enzymes are not suitable for library preparation, because they need high incubation temperatures (50–75 °C) and prolonged incubation at high temperatures disrupts cross-linked DNA–protein complexes. The inverse PCR depends on primers that anneal specifically to the bait sequence. The primers should be complementary to unique genomic sequences to ensure bait specific amplification. As a consequence, it will be challenging to amplify bait sequences in highly repetitive genomic regions. Furthermore, efficient primer design requires that the bait sequence is of sufficient length. We have successfully used bait sequences between 200 and 600 base pairs. To design primers we split the bait sequence into two approximately equal halves in silico and place the 5′ half downstream of the 3′ half (Fig. 2B). The two restriction sites are now juxtaposed and we use software to design primers that flank the restriction sites. We successfully used PerlPrimer (http://perlprimer.sourceforge.net) and Primer3 (http://frodo.wi.mit.edu/primer3) software to design 4C primers. On the genome sequence 4C primers are divergent (Fig. 2C) and should not yield products on genomic DNA templates. The 4C template preparation circularizes the bait sequence and its interacting DNA sequences. This template will be amplified by the 4C primers, because they are convergent on the circular DNA fragments (Fig. 1). Our protocol makes use of an optional T7 amplification step (see Section 3.3.2.3), which requires that one of the primers is supplemented with a 3′ overhang containing the T7 promoter sequence (Fig. 2D). In addition to the T7 promoter, we add 5′ and 3′ spacer sequences that improve efficiency of the in vitro transcription reaction [20]. In all our 4C experiments, we use a T7 spacer sequence that was designed by Ambion, but additional designs have been used successfully in amplifying messenger RNA prior to microarray expression analysis [20,21]. We add the Ambion sequence to the 3′ ends of all our reverse primers. In this way, inverse PCR products are further amplified in vitro using T7 RNA polymerase (Fig. 1).
2.3. Detection of long-range chromatin interactions
The 4C assay is a quantitative methodology. Currently, two strategies are deployed to quantify long-range chromatin interactions using the 4C method. Initially, DNA microarray technology was used to quantify the relative abundance of specific interactions [9]. This can be very cost-effective, because dedicated microarray platforms can be designed that only probe sequences close to specific restriction sites. In this manner, we probed almost every EcoRI site in the Drosophila genome using a microarray platform with only 44,000 spots [11]. Notably, once designed for a particular restriction enzyme, a platform cannot be used for another restriction enzyme. This is a major drawback of this microarray technology. Alternatively, high-throughput sequencing has been used [14]. The output is again a quantitative measure of interaction frequencies. Unlike microarrays, sequencing is independent of restriction enzyme choice, making sequencing the more flexible method. The costs of high-throughput sequencing currently exceed those of DNA microarray technology. However, sequencing capacity keeps increasing, so the cost per read continues to fall.
Fig. 2. How to design 4C oligomer primers. (A) The Antennapedia (Antp) bait sequence is flanked by EcoRI (red) and DpnII (blue) restriction sites. (B) The bait sequence is split into two halves and the 5′ half (red) is placed downstream of the 3′ half (blue) in silico. Forward (blue arrow) and reverse (red arrow) oligomer primers flank the two juxtaposed restriction sites. (C) These primers are divergent on the original genomic DNA sequence and should not yield PCR products. Circularized 3C libraries ensure that the primers are convergent and amplification will be possible. (D) The Antp reverse oligomer primer (red sequence) is supplemented with the T7 promoter (green sequence) at its 3′ end. Core promoter as well as 5′ and 3′ spacer sequences are indicated.
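The in silico rearrangement used for primer design in Section 2.2 (Fig. 2B) can be sketched in a few lines. This is a minimal illustration and not the software we used: the function name and the 40 bp toy bait are hypothetical (real baits are 200–600 bp), and the toy bait is simply built to start with an EcoRI site (GAATTC) and end with a DpnII site (GATC).

```python
def rearrange_bait(bait):
    """Place the 5' half of the bait downstream of the 3' half.

    Returns the rearranged template and the index of the junction at
    which the two restriction sites become juxtaposed.
    """
    mid = len(bait) // 2
    five_prime_half, three_prime_half = bait[:mid], bait[mid:]
    return three_prime_half + five_prime_half, len(three_prime_half)

# Hypothetical toy bait: EcoRI site, 30 bp of made-up sequence, DpnII site.
toy_bait = "GAATTC" + "ACGTTGCAACCTTAGGCATTACGGTTCCTA" + "GATC"

template, junction = rearrange_bait(toy_bait)
print(template[junction - 4:junction])  # GATC   (DpnII site, end of 3' half)
print(template[junction:junction + 6])  # GAATTC (EcoRI site, start of 5' half)
```

The rearranged template, rather than the genomic sequence, would then be passed to primer design software with the junction as the position that the primer pair has to flank.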
2.4. Bioinformatics infrastructure
Great care should be taken when analyzing 4C data. One important notion is that chromatin fibers are highly dynamic. As a result, DNA fragments on the same fiber are engaged in random collisions with a frequency inversely proportional to the distance between them. Such collisions are not reproducibly found in replicate experiments (Fig. 3), but may result in skewing of the averaged data. Random collisions are thought to occur as isolated single probes along the chromosome. Genomic regions with clusters of multiple adjacent probes with increased interaction frequencies are presumably less likely to originate from random collisions [9], but rather represent true long-range chromatin interactions. We have employed a series of powerful computational tools to extract such regions out of 4C data [11]. These tools will be discussed in more detail below (see Section 4).
3. Step-by-step protocol
This protocol describes a single 4C experiment using “3C libraries” prepared from brain tissues from wandering third instar larvae.
3.1. Dissecting tissues, chromatin cross-linking and isolation
We routinely dissect and pool brain tissue from 40 wandering third instar larvae. The dissection, including cross-linking and isolation of chromatin, takes almost 1 day (Fig. 4). Afterwards, the material can be stored at −80 °C prior to further processing.
3.1.1. Materials
Two pairs of tweezers (#5 Tweezers Stainless Steel, #0102-5PO, Dumont, Switzerland); Microscope slide (76 × 26 mm, ground edges 45°, #AG00000102E, Thermo Scientific/Menzel-Gläser, Germany); Stereo microscope (#MZ10F, Leica, Germany); 0.5 ml microfuge tubes (Safe-lock 0.5 ml, #0030 121.023, Eppendorf, Germany); 1.5 ml microfuge tubes (Safe-lock 1.5 ml, #0030 120.086, Eppendorf, Germany); 50 ml polypropylene conical tube (#352070, BD Falcon, USA); 1× PBS (see Section 3.1.2); Freshly prepared fixation solution (see Section 3.1.2); 1 M glycine (see Section 3.1.2); 50× protease inhibitors cocktail (see Section 3.1.2); Freshly prepared cell lysis buffer (see Section 3.1.2); Ice and dry ice; Cooled microcentrifuge (#5417R, Eppendorf, Germany).
3.1.2. Recipes
1× PBS: dissolve 1 PBS tablet (#18912-014, Invitrogen/Gibco, USA) in 500 ml double distilled water. The 1× PBS solution can be stored at room temperature for several months. Freshly prepared fixation solution: 40 µl formaldehyde solution min. 37% (#1039991000, Merck, Germany) in 460 µl 1× PBS. Final concentration is 3% formaldehyde. Keep fixation solution at room temperature.
Fig. 3. Random collisions can be detected as non-reproducible high intensity signals. Bivariate scatter plots (Antp in left panel and Ptx1 in right panel) show correspondence of replicate experiments. The overall correlation (Pearson’s correlation coefficient – r) is high for both data sets. Yet, each replicate experiment has a subset of high intensity signals that is not present in the other replicate (red ovals). Axes are log2-transformed ratios of 4C signal over reference. Full data sets have been described previously [11].
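The replicate comparison in Fig. 3 suggests a simple first check for random collisions: compute the overall correlation and list probes that are high in only one replicate. The sketch below is an illustration only and not one of the tools described in Section 4; the function names, thresholds and toy values are all assumptions made for the example.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def non_reproducible(rep1, rep2, high=1.0, low=0.5):
    """Indices of probes that are high in exactly one of two replicates.

    `high` and `low` are hypothetical log2-ratio cut-offs.
    """
    return [i for i, (a, b) in enumerate(zip(rep1, rep2))
            if (a >= high and b <= low) or (b >= high and a <= low)]

# Made-up log2(4C signal / reference) values, one entry per probe.
rep1 = [0.1, 1.5, 2.0, -0.3, 1.8, 0.2]
rep2 = [0.2, 1.4, 2.1, -0.2, -0.1, 1.9]

print(round(pearson_r(rep1, rep2), 2))
print(non_reproducible(rep1, rep2))  # [4, 5]
```

Probes flagged in this way correspond to the signals marked by the red ovals in Fig. 3; suitable cut-offs would have to be chosen per data set.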
Fig. 4. Schematic outline shows dissection of brain tissue, chromatin cross-linking and cell lysis. Numbers correspond to paragraphs in main text. Crude dissection prepares brain tissue for the chromatin fixation step. A wandering third instar larva (top panel) is grabbed by two pairs of tweezers at its anterior and posterior ends (arrow heads). The anterior tweezers are placed just posterior to the larva’s mouth piece (M). The larva is gently torn apart by pulling the pairs of tweezers (middle panel). The central nervous system (CNS, red oval) is present in the anterior half of the larva. The posterior larval parts are removed and the anterior parts are subjected to the fixation procedure (bottom panel). Note that the anterior part still contains the mouth piece (M), several imaginal disks (d), brain lobes (B), ventral ganglion (V) and cuticle. The fixation step consists of formaldehyde cross-linking (top panel), quenching of the cross-linking reaction (middle panel) and washing (bottom panel). Next, the tissue is further dissected to remove any non-brain tissue (Fine dissection). The mouth parts and cuticle are removed (top panel), all imaginal disks are removed (middle panel), and finally the ventral ganglion is removed (bottom panel). The remaining brain lobes are mechanically disrupted in cell lysis buffer to isolate nuclei (cell lysis, top panel). Nuclei are collected by centrifugation (middle panel) and can either be stored at −80 °C or processed further to prepare the “3C library” (bottom panel).
1 M glycine: dissolve 3.75 g glycine powder (#50052, Sigma–Aldrich/Fluka Analytical, Switzerland) in 50 ml 1× PBS. The 1 M glycine solution can be stored at room temperature for several months. 50× protease inhibitors cocktail: dissolve one protease inhibitor cocktail tablet (cOmplete, EDTA-free, #11 873 580 001, Roche Diagnostics, USA) in 1 ml double distilled water. The protease inhibitor cocktail can be stored at −20 °C for several months. Freshly prepared cell lysis buffer: 10 mM Tris–HCl (pH 8), 10 mM NaCl, 0.2% (v/v) IGEPAL CA-630, 1× protease inhibitors cocktail. For 1 ml of cell lysis buffer mix 10 µl 1 M Tris–HCl (pH 8), 2 µl 5 M NaCl, 20 µl 10% (v/v) IGEPAL CA-630 (#I8896, Sigma–Aldrich, Switzerland), 20 µl 50× protease inhibitors cocktail, and 948 µl double distilled water. Keep cell lysis buffer on ice.
3.1.3. Preparations
Add 0.5 ml 1× PBS to a 0.5 ml microfuge tube and place on ice. Prepare the fixation solution and keep it at room temperature. Prepare the cell lysis buffer and place it on ice. Fill a 50 ml polypropylene conical tube with 1× PBS and place it on ice. Pre-cool the microcentrifuge to 4 °C.
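The pipetting volumes for the cell lysis buffer follow from the relation C1 × V1 = C2 × V2. The sketch below recomputes them for 1 ml of buffer and may be convenient when a different batch volume is needed; the helper name and dictionary layout are ours and are not part of the protocol.

```python
def stock_volume_ul(stock_conc, final_conc, final_volume_ul):
    """Volume of stock needed, from C1*V1 = C2*V2 (same unit for C1 and C2)."""
    return final_conc * final_volume_ul / stock_conc

final_volume_ul = 1000  # 1 ml of cell lysis buffer

components_ul = {
    "1 M Tris-HCl (pH 8)": stock_volume_ul(1000, 10, final_volume_ul),       # mM
    "5 M NaCl": stock_volume_ul(5000, 10, final_volume_ul),                  # mM
    "10% IGEPAL CA-630": stock_volume_ul(10, 0.2, final_volume_ul),          # % (v/v)
    "50x protease inhibitors": stock_volume_ul(50, 1, final_volume_ul),      # fold
}
water_ul = final_volume_ul - sum(components_ul.values())

for name, volume in components_ul.items():
    print(f"{name}: {volume:.0f} µl")
print(f"double distilled water: {water_ul:.0f} µl")
```

For a 1 ml batch this gives 10, 2, 20 and 20 µl of the four stocks and 948 µl water, matching the recipe above.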
3.1.4. Protocol
When isolating material for 4C it is important to work ‘cold’. This prevents degradation of the protein–DNA complexes that enable higher-order chromatin structures.
3.1.4.1. Crude dissections. We first dissect each larva in a crude manner (Fig. 4), which has two major advantages. First, it significantly reduces the time to dissect, ensuring a relatively fast fixation protocol. Since we do not know how the dissection procedure will affect higher-order chromatin folding in brain cells, we aim to keep the time to fixation as short as possible. In our current protocol and with our expertise, we need less than 1 h to dissect brains out of 40 larvae in a crude manner. We successfully mapped spatial interactions of several bait sequences using these conditions [11]. Second, our crude dissection procedure leaves the mouth piece of the larvae attached to the brains. This reduces the risk of material loss during the fixation and washing steps, because the heavy mouth piece rapidly sinks to the bottom of the tube, whereas brain tissue alone floats, which makes it harder to aspirate liquid without losing material.
Apply 50 µl 1× PBS as a droplet to a microscope slide. Pick a larva, put it into the droplet, and place the microscope slide under the stereo microscope. Use a low magnification to visualize the whole larva. Place one pair of tweezers immediately posterior to the larva’s mouth piece and the second pair at the posterior end of the larva. Gently pull the tweezers apart to tear the larva into two halves. Brain tissue can be found in the anterior half. Use the tweezers to remove excess tissue, such as gut, esophagus, fat body, and salivary glands. Place the anterior half, including brain tissue and mouth piece, into the 0.5 ml ice cold 1× PBS. Repeat this procedure for every larva that needs to be dissected. Continue with the Fixation step (below).
3.1.4.2. Fixation step. We fix at room temperature, because fixations at low temperatures are inefficient. Formaldehyde is highly toxic. Please take appropriate safety precautions as indicated by the manufacturer when working with formaldehyde.
Gently remove 0.4 ml supernatant from the tube containing brain/mouth piece material. Material should be at the bottom of the tube. Add 300 µl fixation solution. The final concentration is now 2.2% formaldehyde. Incubate 15 min at room temperature, while rotating. Add 50 µl 1 M glycine to quench the fixation reaction. Incubate 5 min at room temperature, while rotating. Let the brain/mouth piece material settle to the bottom of the tube for a few seconds. Use a pipette to aspirate as much supernatant as possible without removing any brain/mouth piece material. Wash with 400 µl ice cold 1× PBS. Let the brain/mouth piece material settle to the bottom of the tube for a few seconds and use a pipette to aspirate as much supernatant as possible without removing any brain/mouth piece material. Repeat this washing step one more time. After the second washing step, add 400 µl ice cold 1× PBS and place material on ice. Continue with the fine dissections (below).
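The stated 2.2% final formaldehyde concentration can be traced with a small mixing calculation. The sketch assumes that exactly 0.1 ml PBS remains in the tube after removing 0.4 ml, and it treats the “min. 37%” stock as exactly 37%; both are simplifications, and the helper name is ours.

```python
def mix(conc_a, vol_a, conc_b, vol_b):
    """Concentration after combining two solutions (volumes in the same unit)."""
    return (conc_a * vol_a + conc_b * vol_b) / (vol_a + vol_b)

# Fixation solution: 40 µl of 37% formaldehyde in 460 µl 1x PBS.
fixation_solution = mix(37.0, 40, 0.0, 460)   # about 2.96%, i.e. roughly 3%

# 300 µl fixation solution added to an assumed 100 µl of residual PBS.
in_tube = mix(fixation_solution, 300, 0.0, 100)

# 50 µl of 1 M glycine added to the 400 µl fixation reaction.
glycine_molar = mix(1.0, 50, 0.0, 400)

print(f"{fixation_solution:.2f}% -> {in_tube:.2f}% formaldehyde")
print(f"{glycine_molar:.3f} M glycine during quenching")
```

With these assumptions the tube contains about 2.2% formaldehyde during fixation and about 0.11 M glycine during quenching.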
3.1.4.3. Fine dissections. The fixation procedure makes the material more brittle. As a result, the fine dissection (Fig. 4) is easier and faster than with unfixed tissue. Yet, this fine dissection requires expertise and may be challenging for a novice. We require 2–3 h to prepare a total of 40 brains. Apply 25 µl ice cold 1× PBS as a droplet to a microscope slide. Pick a brain/mouth piece, put it into the droplet, and place the microscope slide under the stereo microscope. Use a high magnification to visualize the material in detail. Use the tweezers to remove any excess tissue, such as larval cuticle, mouth piece, imaginal disks, and central nerve cord (see Fig. 4 for an example dissection). Place the brain hemispheres into 0.1 ml ice cold cell lysis buffer (1.5 ml microfuge tube). Keep the brain hemispheres in cell lysis buffer on ice. Repeat this procedure for every larva that needs to be dissected. Continue with cell lysis (below).
3.1.4.4. Cell lysis. Lysis should take place at low temperatures and in the presence of protease inhibitors. Under these conditions protease activities are low and cannot disrupt higher-order chromatin structures. Make sure that the microcentrifuge is cooled to 4 °C. Use a pipette to mechanically disrupt brain tissue by gently pipetting the material up and down several times. Add 0.8 ml ice cold cell lysis buffer and gently mix. Incubate 10 min on ice. Spin at 2700g for 5 min in the cooled microcentrifuge. Carefully remove the supernatant without disrupting the pellet. Immediately continue with the preparation of the 3C library (Section 3.2) or store the material. For storage, place the tube for several minutes on dry ice to quickly freeze the pellet. Alternatively, the material can be frozen using liquid nitrogen. The frozen material can be stored at −80 °C for several months.
3.2. Preparation of “3C library”
Due to several overnight incubation steps, this procedure requires 3 days. It yields the template that can be used for 4C, as well as other 3C methodologies [10,22].
3.2.1. Materials
Thermal shaker (Thermomixer Comfort, Eppendorf, Germany); digestion buffer (see Section 3.2.2); Sodium Dodecyl Sulfate (20% SDS Aqueous solution, #198123, Biosolve, The Netherlands); 20% (v/v) Triton X-100 (#108643, Merck, Germany); EcoRI restriction endonuclease (high concentration = 40 Units/µl, #10 200 310 001, Roche Diagnostics, Germany); 2.0 ml microfuge tubes (Safe-lock 2.0 ml, #0030 120.094, Eppendorf, Germany); 10× Ligase buffer (see Section 3.2.2); T4 DNA ligase (high concentration = 20 Units/µl, #M1794, Promega, USA); Proteinase K solution (20 mg/ml); heatable water bath; RNase A solution (10 mg/ml); Phase Lock Gel Light 1.5 ml tubes (#2302800, 5prime, Germany); Phenol–Chloroform-Isoamyl alcohol (25:24:1, #77617, Sigma–Aldrich, Switzerland); 1.5 ml microfuge tubes (Safe-lock 1.5 ml, #0030 120.086, Eppendorf, Germany); Ethanol (Absolute, #32213, Merck, Germany); 3 M NaAc (pH 5.2); 70% (v/v) Ethanol; 10 mM Tris–HCl (pH 8); Cooled microcentrifuge (#5417R, Eppendorf, Germany); Ice.
3.2.2. Recipes
Digestion buffer: mix 30 µl 10× SuRE/Cut Buffer H (#11 417 991 001, Roche Diagnostics, Germany) and 220 µl double distilled water. Of note, this digestion buffer is used in combination with EcoRI enzyme. Other restriction enzymes may require other buffers. Hence, 10× SuRE/Cut Buffer H can be replaced by any other appropriate buffer. 10× Ligase buffer: 300 mM Tris–HCl (pH 8), 100 mM MgCl2, 100 mM Dithiothreitol, 10 mM ATP (pH 7.6). For 250 µl 10× Ligase buffer mix 75 µl 1 M Tris–HCl (pH 8), 25 µl 1 M MgCl2, 25 µl 1 M Dithiothreitol, 25 µl 0.1 M ATP (pH 7.6), and 100 µl double distilled water. Keep on ice.
3.2.3. Preparations
Prepare the digestion buffer and keep on ice. Heat the thermal shaker to 37 °C.
3.2.4. Protocol
This protocol is largely based on original 3C protocols [10,23]. However, we have specifically adjusted volumes such that they are appropriate for working with low cell numbers.
3.2.4.1. Restriction endonuclease treatment (day 1).
We use EcoRI as the restriction enzyme of choice, but many other restriction enzymes have been successfully used in 3C library preparations. The protocol below, however, is optimized for EcoRI. Dissolve the pellet (see Section 3.1) in 200 µl digestion buffer. Add 2 µl 20% SDS to obtain a final concentration of 0.2%. Incubate for 1 h at 37 °C, while shaking in the thermal shaker (1100 rpm). Add 20 µl 20% (v/v) Triton X-100 (final concentration = 2%) to quench the SDS. Incubate for 1 h at 37 °C, while shaking in the thermal shaker (1100 rpm). Add 4 µl EcoRI (i.e., 160 Units). Incubate overnight at 37 °C, while shaking in the thermal shaker (1100 rpm). See Section 6 and Fig. 6 for additional information on the efficiency of restriction endonuclease treatment.
3.2.4.2. Intramolecular ligation (day 2). Volumes are such that DNA concentrations are low. This enhances intramolecular ligation events. Add 8 µl 20% SDS (final concentration 0.7%).
Increase the temperature of the thermal shaker to 65 °C and leave the material at this temperature for 20 min, while shaking (1100 rpm). Together the 0.7% SDS and heat will inactivate the restriction enzyme. These conditions are optimal for EcoRI, but may be different for other restriction enzymes. Prepare the 10× Ligase buffer and keep on ice. During the inactivation of EcoRI, prepare in a 2 ml tube: 2% (v/v) Triton X-100 in 1× Ligase buffer. Mix together 0.2 ml 10× Ligase buffer, 0.1 ml 20% (v/v) Triton X-100, and 1.55 ml double distilled water. Keep the mixture on ice. After inactivation of EcoRI, transfer all of the material (250 µl) to the 2 ml tube containing 2% (v/v) Triton X-100 in 1× Ligase buffer. Incubate material for 1 h at 37 °C in a water bath. Cool material for several minutes on ice. While on ice, add 5 µl T4 DNA ligase (i.e., 100 Units). Incubate material for 6 h at 16 °C in a water bath. See Section 6 and Fig. 6 for additional information on the efficiency of the intramolecular ligation procedure. Add 6 µl Proteinase K (20 mg/ml) to reach a final concentration of 60 µg/ml. Incubate material overnight at 65 °C in a water bath to de-crosslink the chromatin.
3.2.4.3. Library purification (day 3). We purify the material using phenol/chloroform extraction and ethanol precipitation. Phenol/chloroform extraction is done using Phase Lock Gel Light 1.5 ml tubes, which significantly reduces material loss. Moreover, it eliminates phenol contamination and reduces contact with hazardous organic solvents. Add 20 µl RNase A (10 mg/ml) to reach a final concentration of 50 µg/ml. Incubate material for 30 min at 37 °C in a water bath. Spin 4 Phase Lock Gel Light 1.5 ml tubes for 1 min at maximum speed in a microcentrifuge at room temperature. This collects the gel at the bottom of the tube. Transfer four 0.5 ml aliquots of the material to the Phase Lock Gel Light 1.5 ml tubes and add 0.5 ml phenol–chloroform-isoamyl alcohol to each. Mix material and phenol–chloroform-isoamyl alcohol of all aliquots by shaking. The gel should not be disturbed. Spin aliquots for 10 min at maximum speed in a microcentrifuge at room temperature. After spinning there should be three layers: (1) the bottom layer contains the organic phase, (2) the middle layer is the gel, and (3) the top layer is the water phase, including the DNA. Transfer the top layer of each aliquot to a fresh 1.5 ml tube. Add to each aliquot 50 µl 3 M NaAc (pH 5.2) and 850 µl ice cold ethanol. Incubate for 1 h at −80 °C. Make sure the microcentrifuge is cooled to 4 °C. Spin aliquots for 30 min at maximum speed in a microcentrifuge at 4 °C. Discard the supernatant and wash the pellet with 500 µl 70% ethanol. Spin aliquots for 5 min at maximum speed in a microcentrifuge at room temperature. Dissolve and pool pellets from all 4 aliquots in 50 µl 10 mM Tris–HCl (pH 8). The 3C library can now either be further processed (see below) or stored at −80 °C.
3.3. Amplification of bait sequence and interacting sequences

Although the 3C library can be used for several different experiments, including conventional 3C [10] and 5C [22], this protocol strictly focuses on 4C [9]. This 4C method relies on amplification of a fragment-of-interest (bait sequence) and all its interacting fragments using inverse PCR.

3.3.1. Materials

DpnII (10 Units/µl, #R054, New England Biolabs, USA); NlaIII (10 Units/µl, #R0125, New England Biolabs, USA); 1.5 ml microfuge tubes (Safe-lock 1.5 ml, #0030 120.086, Eppendorf, Germany); 2.0 ml microfuge tubes (Safe-lock 2.0 ml, #0030 120.094, Eppendorf, Germany); 10× Ligase buffer (see Section 3.2.2); T4 DNA ligase (high concentration = 20 Units/µl, #M1794, Promega, USA); Ethanol (Absolute, #32213, Merck, Germany); 3 M NaAc (pH 5.2); 70% (v/v) Ethanol; 10 mM Tris–HCl (pH 8); Cooled microcentrifuge (#5417R, Eppendorf, Germany); Ice; custom oligomer primers for inverse PCR (see supplemental); 10× PCR Buffer, Minus Mg++ (#10342, Invitrogen, USA); 50 mM MgCl2 (#10342, Invitrogen, USA); Taq DNA Polymerase (5 Units/µl, #10342, Invitrogen, USA); 10 mM dNTPs; Qiaquick PCR purification kit (#28106, Qiagen, Germany); Thermal cycler (C1000, Bio-Rad, United Kingdom); Thin wall PCR tubes (); (Optional) MEGAscript T7 Kit (#AM1334, Invitrogen, USA).

3.3.2. Protocol

This section of the procedure requires at least two days and can take up to three days if T7 amplification is also used.

3.3.2.1. Prepare inverse PCR template. The 4C method relies on amplification by inverse PCR. The 3C library contains high molecular weight DNA that is not efficiently amplified. Therefore, the library is prepared for efficient amplification using the same strategy as the original 4C protocol [9]. This strategy makes use of restriction enzymes that recognize 4 base pair consensus sites, which are present at high frequency in the genome. As a result, the size range of the 3C library is significantly reduced.
We routinely use two enzymes, namely DpnII and NlaIII, but other enzymes can also be used. Finally, the trimmed 3C library is circularized to allow inverse PCR.
The 3C library is in 50 µl 10 mM Tris–HCl (pH 8) (see Section 3.2.4.3). Add 125 µl double distilled water and mix.
Add 20 µl 10× restriction enzyme buffer and mix. Notably, DpnII enzyme has a unique buffer that is supplied with the enzyme. NlaIII requires 10× NEB4 (New England Biolabs).
Add 5 µl restriction enzyme (i.e., 50 Units).
Incubate 2 h at 37 °C. See Section 6 for additional information on efficiency of trimming 3C-library by frequently cutting enzymes (e.g., DpnII or NlaIII).
Incubate 20 min at 70 °C. Notably, library circularization requires a ligation step. Therefore, the restriction enzyme needs to be inactivated. For DpnII and NlaIII heat inactivation is sufficient. Of note, heat inactivation may not work for every restriction enzyme.
Transfer digestion mixture to a 2 ml microcentrifuge tube and let it cool on ice for several minutes.
Add 0.2 ml 10× Ligase buffer, 1.6 ml double distilled water, and 5 µl T4 DNA ligase (i.e., 100 Units). The final DNA concentration is very low (0.5–1 ng/µl). This low concentration favors self-ligation of single fragments, resulting in circularized DNA fragments.
Incubate overnight at 16 °C. See Section 6 for additional information on efficiency of ligation step to circularize trimmed 3C-library.
After the ligation, the material is precipitated using ethanol. To this end, split the material into four aliquots (500 µl each) and add to each aliquot 50 µl 3 M NaAc (pH 5.2) and 850 µl ethanol.
Incubate at least 1 h at −80 °C.
Make sure the microcentrifuge is cooled to 4 °C.
Spin aliquots for 30 min at maximum speed in a microcentrifuge at 4 °C.
Discard the supernatant and wash the pellet with 500 µl 70% ethanol.
Spin aliquots for 5 min at maximum speed in a microcentrifuge at room temperature.
Dissolve and pool pellets from all 4 aliquots in 30 µl 10 mM Tris–HCl (pH 8).
The inverse PCR template can now either be further processed (see below) or stored at −80 °C.
3.3.2.2. Inverse PCR. Amplification by inverse PCR requires custom designed oligomer primers (see Section 2.3) that specifically anneal to the bait sequence. We typically amplify the inverse PCR template in triplicate. In this manner, each reaction contains 300 to 600 ng template. Furthermore, it reduces the effects of stochastic amplification products. A single PCR further contains: 1× PCR Buffer, Minus Mg++, 2 mM MgCl2, 300 µM forward primer, 300 µM reverse primer, 200 µM dNTPs, and 5 Units Taq DNA polymerase. We normally use a 100 µl reaction in thin wall PCR tubes.
Amplification program in thermal cycler: denature for 3 min at 94 °C; 30 cycles of 15 s at 94 °C, 1 min at 60 °C, and 1 min at 72 °C; final elongation takes 5 min at 72 °C.
After amplification, free nucleotides are removed using the Qiaquick PCR purification kit (Qiagen). We use the standard manufacturer's protocol and we refer to their manual for a detailed description. However, we make one adaptation to the protocol. We elute the inverse PCR products from the column using RNase-free water rather than the supplied elution buffer. This ensures that the products are devoid of RNases, which is essential for the subsequent T7 amplification step (see below).
Yields of the inverse PCR depend strongly on primer sequences and amplification conditions. Our initial protocol did not yield enough product for subsequent microarray analyses and we included a second, strictly linear amplification step to overcome this difficulty [11]. However, other primer sequences or amplification conditions may yield enough material for microarray detection. In addition, high-throughput sequencing requires less input material. Under such conditions it will not be necessary to use the T7 amplification step. Yet, we think our T7 amplification procedure does have some advantages that should be taken into consideration (see below). We listed some extra considerations regarding the inverse PCR amplification in Section 6.
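As a worked example, the pipetting volumes for a single 100 µl inverse PCR can be derived from the stock concentrations listed in Section 3.3.1 using C1·V1 = C2·V2. The Python sketch below is an illustration only: the function name `stock_volume_ul` and the dictionary layout are hypothetical and not part of the published protocol, and primers and template are left out because their stock concentrations are not specified here.

```python
def stock_volume_ul(final_conc, stock_conc, reaction_volume_ul):
    """Return the stock volume (microliters) that gives final_conc in the reaction.

    Both concentrations must be expressed in the same unit (C1 * V1 = C2 * V2).
    """
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and not more dilute than the final mix")
    return final_conc * reaction_volume_ul / stock_conc


REACTION_UL = 100.0  # reaction volume stated in the protocol

# name: (final concentration, stock concentration); same unit within each pair.
# Values are taken from the reaction composition and the Materials list.
COMPONENTS = {
    "PCR buffer (x-fold)": (1.0, 10.0),
    "MgCl2 (mM)": (2.0, 50.0),
    "dNTPs (mM)": (0.2, 10.0),                     # 200 micromolar final, 10 mM stock
    "Taq (Units per ul)": (5.0 / REACTION_UL, 5.0),  # 5 Units per reaction, 5 Units/ul stock
}

volumes = {
    name: stock_volume_ul(final, stock, REACTION_UL)
    for name, (final, stock) in COMPONENTS.items()
}

for name, volume in volumes.items():
    print(f"{name}: {volume:.1f} ul")
```

The remainder of the 100 µl is made up by primers, template and water.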
3.3.2.3. Optional: T7 amplification. When inverse PCR yields are too low for direct use on microarrays, the material can be further amplified using T7 RNA polymerase amplification. This T7 amplification method is preferred over a second “nested” PCR strategy, because T7 RNA polymerase amplifies in a strictly linear manner [20]. Importantly, this amplification requires that the custom inverse PCR primers are supplemented with a T7 promoter overhang (see Section 2.3). The amplified material is an anti-sense RNA copy of the inverse PCR products, which hybridizes to the sense DNA oligomers that probe the genome. In our case, an additional advantage of this strategy is that the RNA::DNA hybrids that anneal to our custom designed microarray (see Section 3.4.1) can efficiently be stripped after scanning. This allows us to re-use the same microarray, which reduces the costs of our 4C experiments. In contrast, we are unable to strip DNA::DNA hybrids from our microarray platform. Therefore, we always include the T7 amplification step in our 4C experiments, even when we have enough material for a direct DNA::DNA hybridization.
We use the MEGAscript T7 kit from Invitrogen to amplify our inverse PCR products. We amplify 0.5–1.0 µg of product, using the standard procedure described by Invitrogen, and we refer to their manual for a detailed description of the protocol. Importantly, RNA is extremely sensitive to degradation by RNases. Therefore, great care should be taken to work under RNase-free conditions when using the T7 amplification procedure. We listed some extra considerations regarding the T7 amplification in Section 6.

3.4. Microarray hybridization

Amplified 4C templates can be analyzed using (high-throughput) DNA sequencing [14] and DNA microarrays [9,11,12]. This protocol focuses on 4C in combination with DNA microarray technology.

3.4.1. Materials

ULS aRNA labeling kit (#EA-006, Kreatech, The Netherlands); 2× F Buffer (see Section 3.4.2); Custom DNA microarray (NKIAVL Drosophila 4C v. 2.0, #AMADID016016, Agilent Technologies, USA); DNA microarray hybridization chamber (SureHyb, #G2534, Agilent Technologies, USA); DNA microarray hybridization oven, including rotator (Agilent Technologies, USA); Series of wash buffers (see Section 3.4.2); glass beakers; water bath; (Optional) nitrogen gas; High-resolution microarray scanner (Agilent Technologies, USA); Microarray quantification software (Feature Extraction Software version GE2-v4.9.1, Agilent Technologies, USA).

3.4.2. Recipes

2× F Buffer: 50% (v/v) Formamide, 10× SSC, 0.2% (v/v) SDS. For 1 ml 2× F Buffer mix 0.5 ml Formamide (#F7503, Sigma–Aldrich, Switzerland), 0.5 ml 20× SSC (#BE51233, AccuGene/Lonza, Switzerland), and 10 µl 20% (v/v) SDS (#198123, Biosolve, The Netherlands).
To completely dissolve the mixture, heat to 42 °C for at least 5 min. Do not put on ice, because this will result in precipitation.
Series of wash buffers:
(1) 5× SSC, 0.1% (v/v) SDS. Mix 250 ml 20× SSC (), 5 ml 20% (v/v) SDS (), and 745 ml double distilled water to reach a total volume of 1 liter. Heat to 42 °C in a water bath.
(2) 2× SSC, 0.1% (v/v) SDS. Mix 100 ml 20× SSC (#BE51233, AccuGene/Lonza, Switzerland), 5 ml 20% (v/v) SDS (), and 895 ml double distilled water. Heat to 42 °C in a water bath.
(3) 1× SSC. Mix 50 ml 20× SSC (#BE51233, AccuGene/Lonza, Switzerland) and 950 ml double distilled water. Heat to 42 °C in a water bath.
(4) 0.2× SSC. Mix 10 ml 20× SSC (#BE51233, AccuGene/Lonza, Switzerland) and 990 ml double distilled water. Keep at room temperature.

3.4.3. Protocol

In this protocol, we label and hybridize amplified RNA (aRNA), which requires RNase-free reagents and an RNase-free working environment.

3.4.3.1. Reference template. We use dual channel microarray technology and, therefore, we need a differentially labeled reference sample. We prepare this reference from genomic DNA that we isolated from whole larvae using standard laboratory protocols. We
digested this genomic DNA using a frequently cutting restriction enzyme (e.g., DpnII). To each restriction end, we added double stranded adaptors containing the T7 promoter sequence by means of ligation. We amplified 1 µg of this DNA using the MEGAscript T7 Kit (#AM1334, Invitrogen, USA).
3.4.3.2. Labeling procedure. The aRNA derived from the 4C and reference templates is differentially labeled with the ULS aRNA labeling kit (Kreatech), according to the protocol delivered with the kit. This labeling method is efficient, quick and straightforward. However, other labeling methods can be used as well.
3.4.3.3. Hybridization and post-hybridization washes. We use custom 4 × 44 K microarray slides (Agilent Technologies). This allows us to hybridize four samples on a single slide.
Make sure the hybridization oven is heated to 42 °C.
Mix 1 µg labeled aRNA derived from the 4C template, 1 µg differentially labeled reference aRNA, and 60 µl 2× F Buffer. Adjust the volume to 120 µl with RNase-free water.
Heat the mixture to 95 °C for 3 min.
Briefly spin the content to the bottom of the tube.
Apply 110 µl of the mixture to the backing slide that comes with the 4 × 44 K DNA microarray slide.
Mount the backing slide and DNA microarray slide in the microarray hybridization chamber according to the manufacturer’s directions.
Hybridize for more than 16 h at 42 °C in the DNA microarray hybridization oven, while the slides are rotating.
Heat a water bath to 42 °C.
Fill 4 glass beakers with wash buffer 1, 2, 3, and 4, respectively. Place the beakers filled with buffers 1, 2, and 3 in the heated water bath. Leave the beaker with buffer 4 at room temperature.
Unmount the backing slide and DNA microarray slide from the hybridization chamber.
Wash the DNA microarray slide in wash buffer 1 for 30 s.
Wash the DNA microarray slide in wash buffer 2 for 30 s.
Wash the DNA microarray slide in wash buffer 3 for 5 min.
Wash the DNA microarray slide in wash buffer 4 for 2 min.
Optional: quickly dry the DNA microarray slide using a gentle flow of nitrogen gas.
Fig. 5. Computational tools to analyze 4C data sets. (A) Background filter (green lines) applied to Abdominal-B (Abd-B, left panel) and Ptx1 (right panel) data sets [11]. The filter yields a monotonically declining curve that represents hypothetical random collisions on a linear chromosome. Such random collisions occur with a frequency that is inversely proportional to their distance from the bait sequence (dashed blue lines). Unfiltered data are plotted as gray bars, while red bars represent data after correction with the background filter. Note that both unfiltered and filtered data were smoothed using a running median with a window of 50 probes. Y axis shows log2-transformed ratios of 4C signal over reference and x axis shows position on chromosome. (B) Schematic outline of the domainogram algorithm shows how signal intensities along the chromosome are converted into p values using dynamic window sizes. The first window size contains individual probes that are evaluated for statistical significance while the algorithm slides along the chromosome. The next window size includes and statistically evaluates data of two probes, while sliding along the chromosome. For every subsequent window, the size is increased by a single probe until the maximum window size of 200 probes is reached. (C) Example of a domainogram obtained from 4C data using Abd-B as bait sequence [11]. P values are plotted in color coding ranging from insignificant (black) through intermediate (purple) to highly significant (red). Y axis indicates the window size at which the statistical evaluation took place and x axis is position on the chromosome. Discrete Interaction Domains (DIDs) are extracted (black boxes) to mark the exact genomic boundaries of interacting sequences. Red arrow head shows position of the bait sequence.
3.4.3.4. Scanning and quantification. We scan our DNA microarray slides using a high-resolution scanner (Agilent Technologies). The scanned images are quantified using software designed specifically for 4 × 44 K DNA microarray slides (Agilent Technologies).

4. Data analyses

Interpretation of 4C data requires extensive computational analyses. Therefore, we provide a brief overview of the “dry lab” approaches that we use.

4.1. Data normalization

It is necessary to normalize fluorescence signals to compensate for systematic variations. Moreover, data normalization is essential when the outcome of multiple experiments needs to be compared. Many normalization procedures are available. We routinely use a non-linear loess function to normalize the quantified data. In the supplementary information, we provide R scripts to normalize 4C data.

4.2. Background filter

We designed a “background filter” to correct for chromatin interactions that are a consequence of short nucleotide distance to the bait sequence, rather than being caused by folding of the chromosome fiber (Fig. 5A) [11]. We apply this filter to data that come from the bait-containing chromosome arm, because we consider other arms to be independent entities. First, we rank data based on linear distance of the probed fragment to the bait fragment. Next, we apply a loess function to these ranked data, giving a smoothed representation of the data as a loess curve. The curve shows high values for fragments located at short linear distance from the bait and gradually decreases as linear distance increases. As such, the loess curve represents a hypothetical linear conformation of the chromosome arm. We use the loess curve to correct for linear distance by subtracting the curve from the normalized 4C signals. Hence, the correction for fragments at short nucleotide distance from the bait sequence is larger than the correction at large nucleotide distance. Any long-range interaction is now detectable as a region with 4C signals that exceed the loess curve at that same position. In the supplementary information, we provide R scripts to apply the “background filter” to normalized 4C data.

4.3. Domainograms

Domainograms were originally designed as multi-scale sliding window filters to identify chromatin domains in binding profiles of a compendium of chromatin proteins [24]. We found that they are also highly useful for visualizing 4C data in an intuitive and powerful manner. In a typical 4C experiment, reliable interactions can only be detected using a sliding window filter [9,13]. Disadvantages of standard sliding window filters are twofold. First, the window size is arbitrary, limiting the identification of interacting regions to a certain size range. Second, it is difficult to address the statistical significance of these interactions. Domainograms overcome both drawbacks. In brief, domainograms visualize statistically significant enrichments at multiple scaled window sizes (Fig. 5B-C). The smallest
Table 1
Trouble shooting guide.

Paragraph: 3.2
Title: Preparation of “3C-library”

1. Efficiency of restriction endonuclease treatment
- QC: test an aliquot by gel electrophoresis immediately after restriction endonuclease treatment (Fig. 6A). Make sure that the aliquot is decrosslinked, RNase A treated and purified before gel electrophoresis (see Section 3.2.4.3 for details)
- QC: more quantitative measures of efficiency are elegantly described elsewhere [25,23]
- Low restriction endonuclease efficiencies can be improved by adjusting the fixation procedure, concentration of the SDS treatment and concentration of Triton X-100 quenching reaction

2. Efficiency of intramolecular ligation
- QC: test an aliquot by gel electrophoresis immediately after the ligation reaction (Fig. 6B). Make sure that the aliquot is decrosslinked, RNase A treated and purified before gel electrophoresis (see Section 3.2.4.3 for details)
- Always use freshly prepared 10× ligase buffer. This ensures that the ATP is not compromised and the Dithiothreitol is not precipitated. Both of these conditions are undesirable, because they will reduce ligation efficiencies
- We recommend using high concentration T4 DNA ligase (10–20 U/µl), because it gives superior ligation efficiencies
- To prevent ligase activity reduction, we avoid multiple freeze–thaw cycles and exposure to frequent temperature changes
- Low ligation efficiencies can be improved by adjusting incubation time and temperature

Paragraph: 3.3
Title: Amplification of bait sequence and interacting sequences

3. Efficiency of trimming 3C-library by frequently cutting enzymes (e.g., DpnII or NlaIII)
- Low digestion efficiencies can be improved by adjusting the incubation time and increasing the total Units of enzyme. Digestion efficiencies can be assessed by gel electrophoresis. Frequently cutting enzymes should reduce the MW of the 3C-library yielding bulk fragments between 200 and 1000 bp
- We further recommend reading the guidelines provided by the manufacturer to ensure proper enzyme usage

4. Efficiency of ligation step to circularize trimmed 3C-library
- We recommend using the same precautions as with point 2 (Efficiency of intramolecular ligation) to ensure optimal ligation conditions
- Low ligation efficiency may also be a consequence of insufficient inactivation or removal of the restriction enzyme. Make sure that the restriction enzyme of choice can be heat inactivated (as in the protocol). If not, then purify the digestion reaction using, for example, phenol/chloroform extraction

5. Inverse PCR amplification
- QC: test aliquots of independent replicate samples by gel electrophoresis (Fig. 6C) to get an indication whether samples deviate
- We recommend designing multiple inverse PCR primers for two neighboring bait sequences, because this will increase the likelihood that good and reproducible yields are obtained
- Of note, when working with fly strains you can encounter polymorphisms in DNA sequences. Therefore, it is important to be sure that the restriction sites flanking your bait sequences are not mutated in the experimental fly strain. Single Nucleotide Polymorphism maps are available for Drosophila, but they may not be appropriate for your experimental strain. The integrity of a restriction site can be tested by amplifying a PCR product using primers that flank the restriction site. Subsequent incubation of the product with the restriction enzyme should decrease the size of the product

6. T7 amplification
- Make sure that the inverse PCR products are in RNase-free water to avoid RNA degradation. Integrity of the amplified RNA can be assessed by gel electrophoresis or microfluidics based platforms (e.g., Bioanalyzer, Agilent technologies). The amplified RNA products should have a similar pattern as the inverse PCR products. A global reduction in MW indicates RNA degradation
- For further information we refer to the MEGAscript T7 kit manual that is provided by the manufacturer
Fig. 6. Quality control measures. (A) Digestion efficiency of EcoRI restriction enzyme in crosslinked chromatin visualized by gel electrophoresis. Two samples (3C) are compared to genomic DNA (gDNA) that is digested with EcoRI (+) or not (-). Digestion in crosslinked chromatin is always partial (red arrows), while digestion of genomic DNA is complete. (B) Efficient intramolecular ligation events yield high molecular weight (MW) DNA (3C#1 and 3C#2), which is comparable to genomic DNA (gDNA) that is not digested with EcoRI. Inefficient intramolecular ligation events give a smear of lower MW DNA (3C#3). (C) Inverse PCR products of four different primer pairs that amplify four different bait sequences. The first panel gives reproducible amplification products from two independent 4C templates (1 and 2), while there is no amplification in genomic DNA (G) or a no template control (N). Inverse PCR products that meet these quality measures give reproducible microarray results as plotted in Fig. 3. The other panels show inverse PCR products that are not suitable for microarray analyses, because they have low yield, background in genomic DNA (nonspecific PCR products), or deviating replicates (low reproducibility).
window is a single probed fragment on a chromosome arm. When sliding at this scale, the filter determines the statistical significance of each individual fragment along the chromosome arm. At the second scale, data from two neighboring probed fragments are combined and statistically evaluated. Again, this window of two fragments slides along the chromosome arm to identify all enriched regions. At every subsequent scale, the window size of the sliding filter is increased by a single probed fragment and the significance of enrichment of the combined data is assessed. Domainograms plot significantly enriched interactions in red, while non-interacting regions are depicted in black. The full statistical procedure is thoroughly described by de Wit et al. [24]. In the supplementary information, we provide PERL scripts to convert 4C data into domainograms.
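To make the multi-scale sliding window concrete, the following Python sketch computes one row of p values per window size for a single chromosome arm. It is a minimal illustration under stated assumptions, not the published implementation: the significance test used here (a one-sided z-test of the window mean against the arm-wide mean) is a stand-in, whereas the actual statistical procedure is that of de Wit et al. [24]. The function name `domainogram` and the toy profile are hypothetical.

```python
import math
from statistics import mean, pstdev


def domainogram(signal, max_window=200):
    """Return {window_size: [p value per window start]} for one chromosome arm.

    ASSUMPTION: significance is a one-sided z-test of the window mean against
    the arm-wide mean. This is a stand-in, not the statistic of reference [24].
    """
    mu, sd = mean(signal), pstdev(signal)
    if sd == 0:
        raise ValueError("signal has no variance")
    # Prefix sums turn every window sum into a constant-time lookup.
    prefix = [0.0]
    for value in signal:
        prefix.append(prefix[-1] + value)

    rows = {}
    # Window size grows by one probe per scale, up to max_window.
    for w in range(1, min(max_window, len(signal)) + 1):
        row = []
        for start in range(len(signal) - w + 1):
            window_mean = (prefix[start + w] - prefix[start]) / w
            z = (window_mean - mu) / (sd / math.sqrt(w))
            row.append(0.5 * math.erfc(z / math.sqrt(2)))  # upper-tail p value
        rows[w] = row
    return rows


# Toy profile: flat background with an enriched block of five probes.
profile = [0.0] * 20 + [3.0] * 5 + [0.0] * 20
dom = domainogram(profile, max_window=10)
# Start index of the most significant 5-probe window.
best_start = min(range(len(dom[5])), key=dom[5].__getitem__)
print(best_start)
```

Plotting the rows as a heat map (window size on the y axis, position on the x axis) gives the layout of Fig. 5C.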
4.4. DID algorithm

We identify exact boundaries of interacting regions using an adapted version of a dynamic programming algorithm [24], which identifies the statistically most probable discrete interaction domains (DIDs). The original DID algorithm discretizes the domainograms into a set of multi-layered and partially overlapping discrete domains of enrichment [24]. We only extract DIDs with the smallest window size of each set of partially overlapping discrete domains, because this represents the center of gravity of a long-range chromatin interaction (Fig. 5C). Stringency of the DID algorithm is provided by the user-defined bias factor c. We empirically determine c such that no DIDs are identified in randomly permuted data. In the supplementary information, we provide PERL and R scripts to apply the DID algorithm to 4C data.
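The background filter described in Section 4.2 (rank probes by linear distance to the bait, fit a smooth curve, subtract it) can be sketched in Python as follows. This is an illustration under assumptions rather than the supplementary R code: a tricube-weighted local linear regression stands in for R's loess function, the `span` value is arbitrary, and all function names and the toy data are hypothetical.

```python
import math


def local_linear_fit(x, y, span=0.3):
    """Loess-like smoother: tricube-weighted local linear fit at each x value.

    ASSUMPTION: simplified stand-in for R's loess; quadratic cost, fine for a sketch.
    """
    n = len(x)
    if n < 3:
        raise ValueError("need at least three points")
    k = min(n, max(3, math.ceil(span * n)))  # neighbours used per local fit
    fitted = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1.0  # bandwidth: distance to the k-th nearest point
        w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def background_filter(positions, signal, bait_position, span=0.3):
    """Subtract a distance-dependent background curve from normalized 4C signal."""
    # Rank probes by linear distance to the bait (rank 0 is closest).
    order = sorted(range(len(positions)),
                   key=lambda i: abs(positions[i] - bait_position))
    ranks = list(range(len(order)))
    curve = local_linear_fit(ranks, [signal[i] for i in order], span)
    corrected = [0.0] * len(signal)
    for rank, i in enumerate(order):
        corrected[i] = signal[i] - curve[rank]
    return corrected


# Toy data: signal decays with distance from a bait at position 0,
# with one long-range interaction added at probe 20.
positions = [i * 1000 for i in range(30)]
signal = [5.0 - 0.1 * i for i in range(30)]
signal[20] += 3.0
corrected = background_filter(positions, signal, bait_position=0)
peak = max(range(len(corrected)), key=corrected.__getitem__)
print(peak)
```

After subtraction, probes that follow the distance-dependent decay sit near zero, while the long-range interaction remains as a positive residual.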
5. Concluding remarks

Our sensitized 4C method overcomes the limitations of using single Drosophila tissues that contain a limited number of cells. We successfully applied our method to third instar larval brain tissue [11]. We expect that our method will also be applicable to other fly tissues. Importantly, our method enables 4C in the genetically tractable model organism D. melanogaster. As a consequence, uniquely powerful genetic tools can now be combined with chromosome conformation studies. We expect that this will shed light on key questions regarding the functional role of chromosome structure and organization in gene expression and DNA replication. Moreover, it will help in identifying molecular components that establish, maintain and modify these structures and organizations in nuclear space. In addition, we incorporated a set of computational tools to enhance 4C data analyses. These tools are not restricted to Drosophila and can be used to analyze any 4C data set.
6. Trouble shooting/Quality control

In Table 1 and Fig. 6 we describe several quality controls that can be included in our 4C protocol, together with trouble shooting advice that may help in applying the 4C technique successfully.
Acknowledgements

We thank Joep Vissers and Inka Pawlitzky for critical reading of the manuscript.

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.ymeth.2012.04.003.

References

[1] M. Baker, Nature 470 (2011) 289–294.
[2] T. Cheutin, F. Bantignies, B. Leblanc, G. Cavalli, Cold Spring Harbor Symposia on Quantitative Biology 75 (2010) 461–473.
[3] P.K. Geyer, M.W. Vitalini, L.L. Wallrath, Current Opinion in Cell Biology 23 (2011) 354–359.
[4] S. Henikoff, Cold Spring Harbor Symposia on Quantitative Biology 75 (2010) 607–615.
[5] D.M. Gilbert, S.I. Takebayashi, T. Ryba, J. Lu, B.D. Pope, et al., Cold Spring Harbor Symposia on Quantitative Biology 75 (2010) 143–153.
[6] L. Guelen, L. Pagie, E. Brasset, W. Meuleman, M.B. Faza, et al., Nature 453 (2008) 948–951.
[7] E. Lieberman-Aiden, N.L. van Berkum, L. Williams, M. Imakaev, T. Ragoczy, et al., Science 326 (2009) 289–293.
[8] H. Pickersgill, B. Kalverda, E. de Wit, W. Talhout, M. Fornerod, et al., Nature Genetics 38 (2006) 1005–1014.
[9] M. Simonis, P. Klous, E. Splinter, Y. Moshkin, R. Willemsen, et al., Nature Genetics 38 (2006) 1348–1354.
[10] J. Dekker, K. Rippe, M. Dekker, N. Kleckner, Science 295 (2002) 1306–1311.
[11] B. Tolhuis, M. Blom, R.M. Kerkhoven, L. Pagie, H. Teunissen, et al., PLoS Genetics 7 (2011) e1001343.
[12] F. Bantignies, V. Roure, I. Comet, B. Leblanc, B. Schuettengruber, et al., Cell 144 (2011) 214–226.
[13] M. Simonis, J. Kooren, W. de Laat, Nature Methods 4 (2007) 895–901.
[14] N. Gheldof, M. Leleu, D. Noordermeer, J. Rougemont, A. Reymond, Methods in Molecular Biology 786 (2012) 211–225.
[15] H. Reichert, Results and Problems in Cell Differentiation 53 (2011) 529–546.
[16] B. Tolhuis, E. de Wit, I. Muijrers, H. Teunissen, W. Talhout, et al., Nature Genetics 38 (2006) 694–699.
[17] J. Betschinger, K. Mechtler, J.A. Knoblich, Cell 124 (2006) 1241–1253.
[18] B. Bello, H. Reichert, F. Hirth, Development 133 (2006) 2639–2648.
[19] S.K. Bowman, V. Rolland, J. Betschinger, K.A. Kinsey, G. Emery, et al., Developmental Cell 14 (2008) 535–546.
[20] R.N. Van Gelder, M.E. von Zastrow, A. Yool, W.C. Dement, J.D. Barchas, et al., Proc. Natl. Acad. Sci. USA 87 (1990) 1663–1667.
[21] R.M. Kerkhoven, D. Sie, M. Nieuwland, M. Heimerikx, J. De Ronde, et al., PLoS ONE 3 (2008) e1980.
[22] J. Dostie, T.A. Richmond, R.A. Arnaout, R.R. Selzer, W.L. Lee, et al., Genome Research 16 (2006) 1299–1309.
[23] B. Tolhuis, R.J. Palstra, E. Splinter, F. Grosveld, W. de Laat, Molecular Cell 10 (2002) 1453–1465.
[24] E. de Wit, U. Braunschweig, F. Greil, H.J. Bussemaker, B. van Steensel, PLoS Genetics 4 (2008) e1000045.
[25] H. Hagege, P. Klous, C. Braem, E. Splinter, J. Dekker, et al., Nature Protocols 2 (2007) 1722–1733.