Quantitative modeling of self-oligomerization of proteins in the nuclear envelope by fluorescence fluctuation analysis


Analytical Biochemistry 582 (2019) 113359


Analytical Biochemistry journal homepage: www.elsevier.com/locate/yabio


Jared Hennen¹, Kwang-Ho Hur¹, Joachim D. Mueller*

School of Physics and Astronomy, University of Minnesota, MN, 55455, United States

ABSTRACT

Analysis of fluorescence fluctuation data through the time-shifted mean-segmented Q (tsMSQ) analysis method has recently been shown to successfully identify protein oligomerization and mobility in the nuclear envelope by properly accounting for local volume fluctuations of the nuclear envelope within living cells. However, by its nature, tsMSQ produces correlated data which poses unique challenges for applying goodness of fit tests and obtaining parameter uncertainties from individual measurements. In this paper, we overcome these challenges by introducing bootstrap tsMSQ which involves randomly resampling the fluorescence intensity data to eliminate the correlations in the tsMSQ data. This analysis technique was verified in both the cytoplasm and the lumen of the nuclear envelope with well-characterized proteins that served as model systems. Uncertainties and goodness of fit tests of individual measurements were compared to estimates obtained from sampling multiple experiments. We further applied bootstrapping to fluorescence fluctuation data of the luminal domain of the SUN domain-containing protein 2 in order to characterize its self-oligomerization within the nuclear envelope. Analysis of the concentration-dependent brightness suggests a monomer-trimer transition of the protein.

1. Introduction

Cellular function is predicated on the capacity of proteins to interact with other proteins to form complexes. Conversely, alterations in protein oligomerization have been implicated in human diseases including Alzheimer's disease, Parkinson's disease, and autoimmunity [1–3]. Fluorescence fluctuation spectroscopy (FFS) provides a powerful tool to directly monitor the oligomeric state of proteins in living cells [4]. By measuring a fluorescently-labeled protein's brightness, or average counts per second emitted per molecule, FFS provides a direct indicator of protein stoichiometry [5]. While most cellular FFS applications have focused on the nucleoplasm, cytoplasm, and plasma membrane [5–7], we recently were able to extend the use of FFS to identify protein interactions in the nuclear envelope (NE) of living cells [8,9].

The NE not only separates the nucleus from the cytoplasm, but also serves as a critical signaling hub with proteins involved in force transduction, gene expression, and cell cycle regulation [10,11]. Despite the importance of the NE for cell function, proteins in the NE remain poorly characterized due in part to the structure of the NE, which consists of two nuclear membranes separated by a ~40 nm wide lumen [12]. While diffusion in the NE has been examined by FFS [13], the presence of slow nuclear membrane undulations prevents the use of standard FFS data analysis algorithms to recover protein stoichiometry [8]. We recently overcame this obstacle by using the mean-segmented Q (MSQ) analysis method, which properly accounts for the slow intensity fluctuations caused by the undulating nuclear membranes [8,14]. We further improved upon the MSQ method with the introduction of time-shifted MSQ (tsMSQ), which is more robust than MSQ and simplifies the data analysis [15]. However, both methods share a significant shortcoming in that they produce correlated data [15]. These correlations prevent us from applying goodness of fit tests or obtaining uncertainties in fit parameters. As a result, rigorous examination of binding models as well as evaluating the significance of perturbations on binding equilibria is currently not feasible.

This study addresses the current lack of error analysis and statistical hypothesis testing by introducing bootstrap tsMSQ analysis. The new analysis method involves resampling the FFS data of a single measurement multiple times, thereby eliminating the limitations caused by the correlated data. This improvement significantly decreases measurement time while simultaneously simplifying data analysis and data interpretation. After developing the method we apply it to measurements of cytoplasmic EGFP as a control to validate the technique. Bootstrap tsMSQ is further applied to EGFP residing in the lumen of the NE as well as to the EGFP-tagged luminal domain of the Sad1/UNC-84 (SUN) protein SUN2, which is a component of the linker of nucleoskeleton and cytoskeleton (LINC) complexes [16]. While both proteins had been examined previously [8,9,15], this study provides the first rigorous statistical analysis of the fit parameters and binding curve of NE proteins using bootstrap tsMSQ. We applied a monomer-dimer-trimer binding model to describe the concentration-dependent brightness of luminal SUN2 and compared it to a monomer-trimer transition. We found that the dimer population lacked statistical significance, which led us to reject the monomer-dimer-trimer model. Our findings support a direct monomer-trimer equilibrium of SUN2 in the NE of living cells, which agrees with previous observations obtained by in vitro studies [17,18].

* Corresponding author. PAN 418 115 Union St SE, Minneapolis, MN, 55455, United States. E-mail address: [email protected] (J.D. Mueller).
¹ Signifies equal contribution.
https://doi.org/10.1016/j.ab.2019.113359
Received 19 May 2019; Received in revised form 2 July 2019; Accepted 3 July 2019; Available online 04 July 2019
0003-2697/ © 2019 Elsevier Inc. All rights reserved.

2. Material and methods

2.1. Experimental setup

FFS measurements were performed on a custom-built two-photon microscope as previously described [19]. A 63× C-Apochromat water immersion objective with numerical aperture (NA) = 1.2 (Zeiss, Oberkochen, Germany) was used to focus a laser beam with an excitation wavelength of 1000 nm and average power of 0.3–0.4 mW after the objective. Photon counts were detected using avalanche photodiodes (SPCM-AQ-141 APD; Perkin-Elmer, Dumberry, Quebec, Canada), recorded with a sampling rate of 20 kHz using a Flex04-12D card (correlator.com, Bridgewater, NJ), and analyzed using programs written in IDL 8.7 (Research Systems, Boulder, CO). Z-scans were performed on a PZ2000 piezo stage (ASI, Eugene, OR) which was moved axially with an arbitrary waveform generator (33522A; Agilent Technologies, Santa Clara, CA) supplying a linear ramp function with peak-to-peak amplitude of 1.6 V, corresponding to 24.1 μm of axial travel, and period of 10 s for a speed of 4.82 μm/s.

2.2. Sample preparation

Experiments were conducted using transiently transfected U2OS cells (ATCC, Manassas, VA) maintained in DMEM with 10% fetal bovine serum (HyClone Laboratories, Logan, UT). Cells were subcultured into 24-well glass bottom plates (In Vitro Scientific, Sunnyvale, CA) before transfection. Transfection was performed 12–24 h prior to measurement using GenJet (SignaGen Laboratories, Rockville, MD) according to the manufacturer's instructions. The growth medium was replaced with Dulbecco's phosphate-buffered saline containing calcium and magnesium (BioWhittaker, Walkerville, MD) immediately before measuring. The DNA constructs used in this manuscript have been previously described [9].

2.3. Measurement protocol

Cells were selected using epifluorescence followed by a two-photon z-scan to ensure proper sub-cellular localization of the labeled protein [8,20]. The two-photon beam was then focused at the center of the cytoplasm for cytoplasmic measurements or on the dorsal NE for the measurement of nuclear envelope proteins, followed by the collection of intensity fluctuations for an acquisition time of 1 min. In the case of NE proteins a second FFS measurement was performed at the ventral NE [8]. Calibration measurements were performed on EGFP as previously described to obtain the brightness λEGFP from Mandel's Q-parameter Q0 [5,6]. Further details on the measurement procedure can be found in Hennen et al. [19].

2.4. Data analysis

The FFS experiment records the photon counts ki received during the sampling time TS = 50 μs as a function of time t = iTS. For standard tsMSQ analysis the recorded set of photon counts ki with i ranging from 1 to N0 is divided into segments of length M, which corresponds to a time period of T = MTS. Since the data acquisition time is Tdaq = N0TS, the data set is divided into R = N0 \ M segments, where \ denotes integer division. The value of tsMSQ(MTS) is directly calculated from the segments for different values of M as described elsewhere [15]. The experimental standard deviation s_tsMSQ(T) of tsMSQ is defined in the Supplementary Material. The bootstrap tsMSQ analysis method used in this paper differs from previously published methods in that for the longest value of T (Tmax) the starting points of each segment are chosen at random as described in detail in the Results section. These randomly chosen segments are then subdivided into segments of shorter length and the corresponding tsMSQ value is calculated as before [15].

The experimental tsMSQ(T) curves were fit using previously introduced models [15]. Briefly, a single diffusing species is described by

$$\mathrm{tsMSQ}_{D}(T; Q_0, \tau_D) = Q_0\left[\frac{\mathrm{tsB}_{2,D}(T_S;\tau_D)}{T_S^{2}} - \frac{\mathrm{tsC}_{2,D}(T;\tau_D)}{(T-T_S)^{2}}\right] \tag{1}$$

where tsB2,D(TS; τD) and tsC2,D(T; τD) are auxiliary functions [15] that depend on the diffusion time τD of the fluorescent species. The amplitude Q0 specifies Mandel's Q-parameter, which is related to the brightness λ by Q0 = γ2λTS, where γ2 is the geometry factor [6,21,22]. Multiple diffusing species are described by a superposition of the tsMSQ functions of all species [15],

$$\mathrm{tsMSQ}_{MD}(T) = \sum_{i=1}^{S} f_i\,\mathrm{tsMSQ}_{D}(T; Q_{0,i}, \tau_{D,i}) \tag{2}$$

where S is the total number of diffusing species and fi is the intensity fraction of the ith diffusing species with diffusion time τD,i and amplitude Q0,i. The total amplitude Q0 is given by

$$Q_0 = \sum_{i=1}^{S} f_i\,Q_{0,i} \tag{3}$$

which identifies the apparent brightness of the mixture [8,15]. Finally, modeling of tsMSQ for proteins residing in the lumen of the NE requires the inclusion of an exponential correlation process to describe the volume fluctuations of the NE [15],

$$\mathrm{tsMSQ}_{DE}(T) = \mathrm{tsMSQ}_{D}(T) + A_0\left[\frac{\mathrm{tsB}_{2,E}(T_S;\tau_0)}{T_S^{2}} - \frac{\mathrm{tsC}_{2,E}(T;\tau_0)}{(T-T_S)^{2}}\right] \tag{4}$$

where A0 is the amplitude and τ0 is the characteristic time of the volume fluctuation term, while tsB2,E and tsC2,E are auxiliary functions describing the exponential correlation process [15].

To allow for error analysis of the tsMSQ curves, we derived (Supplementary Material) an expression for the theoretical variance due to shot noise,

$$\sigma^{2}_{\mathrm{tsMSQ}}(T) = \frac{T_S}{T_{daq}} = \frac{1}{N_0} \tag{5}$$

The experimental variance from the bootstrap analysis procedure was defined as

$$s_B^{2}(T) = \frac{1}{N-1}\sum_{i=1}^{N}\left(\mathrm{tsMSQ}_i(T) - \langle \mathrm{tsMSQ}(T)\rangle\right)^{2} \tag{6}$$

where N is the total number of bootstrap iterations performed and tsMSQi(T) is the output of the ith bootstrap iteration. The angle brackets specify the mean value over all bootstrap samples. Each bootstrap analysis in this manuscript was performed with 100 iterations unless otherwise stated. The uncertainty was calculated using Eq. (6) and each bootstrap curve was fit to one of the tsMSQ models described above, returning a vector Pi containing the fit parameters, such as Q0, τD, A0, and τ0. From these fits the mean ⟨P⟩ and standard deviation sP of the fit parameters are determined. The overall bootstrap fit is defined by the tsMSQ curve described by the averaged fit parameters ⟨P⟩, f(T) = tsMSQ(T; ⟨P⟩). The corresponding squared standardized residual for T = Tj is given by

$$SR^{2}(T_j) = \frac{1}{N}\sum_{i=1}^{N}\frac{\left(\mathrm{tsMSQ}_i(T_j) - f(T_j)\right)^{2}}{s_B^{2}(T_j)} \tag{7}$$

which serve to determine the chi-squared value of the bootstrap analysis,

$$\chi^{2} = \sum_{j=1}^{M} SR^{2}(T_j) \tag{8}$$

The degrees of freedom (dof) equal M − mP, where M here denotes the number of points Tj on the tsMSQ curve and mP is the number of fit parameters. By fitting to the models described above and using the relation Q0 = γ2λTS we obtain the brightness λ, which is converted into the normalized brightness, b = λ/λEGFP, where λEGFP is the calibration brightness of EGFP. The value of b is equivalent to the average oligomeric state of the EGFP-tagged protein such that a monomer results in b = 1, while a dimer results in b = 2, whereas a mixture of monomers and dimers would have 1 < b < 2 [23]. The average number of EGFP-labeled proteins in the experimental observation volume is proportional to the protein concentration and was calculated by

$$N = \frac{\langle F\rangle}{\lambda_{EGFP}} \tag{9}$$

where ⟨F⟩ is the average of the measured fluorescence intensity in photon counts per second [5].

3. Results

We begin with a brief illustration of the previously described standard tsMSQ algorithm [15] and its shortcomings. The algorithm (Fig. 1A) operates on a single data set containing the recorded photon counts from an FFS experiment sampled with a time resolution TS [15]. This data set is divided into segments of length T and the corresponding value of tsMSQ(T) is calculated from the segments as described previously [15]. This process is repeated on the same data set for different values of T to construct a complete tsMSQ(T) curve (Fig. 1A). The shortcoming of this approach is illustrated in Fig. 1B, which displays an experimental tsMSQ curve of cytoplasmic EGFP with uncertainties. A fit to a single diffusing species model (Eq. (1)) results in abnormally small standardized residuals (Fig. 1B) because the tsMSQ data points are statistically correlated [15]. The tsMSQ curves obtained from three consecutive measurements of the same cell (Fig. 1C) illustrate the presence of correlations. While the three measurements produce scatter in the tsMSQ amplitude, the individual curves themselves lack this scatter because of the correlations within a single data set [15], which results in reduced chi-squared values χν² close to zero. As a consequence, we are unable to perform a meaningful goodness of fit test, which is a prerequisite to distinguish models and identify the accuracy of best-fit parameters [15].

A straightforward way to eliminate the correlations is to determine each point of the tsMSQ curve from an independently measured data set of the same sample, which we previously referred to as decorrelated tsMSQ [15]. An example curve constructed from repeated measurements of cytoplasmic EGFP along with its experimentally determined uncertainty is shown in Fig. 1D and exhibits scatter along the curve, which is absent in the standard tsMSQ data (Fig. 1B). The experimentally determined uncertainty from the repeat measurements is in good agreement with the theoretically predicted uncertainty of Eq. (5) (Fig. 1E), which demonstrates that the main contribution to the experimental uncertainty in this sample is shot noise. A fit of the decorrelated tsMSQ to a single diffusing species leads to a χν² value of 1.1, indicating agreement between data and fit (Fig. 1D).

While the decorrelated tsMSQ algorithm removes the correlation in the tsMSQ data and restores quantitative interpretation of goodness-of-fit, it is an inefficient procedure as it requires multiple, independent measurements of the same sample to construct a single tsMSQ curve. The experimental time per sample increases at least five-fold [15], which is a significant drawback, especially for live cell experiments. These additional measurements are used to construct a single decorrelated tsMSQ curve. Individual decorrelated tsMSQ data points are chosen at random from the five data sets, which destroys the correlations, but does not increase the amount of data used in the analysis. Importantly, the increased measurement time is not efficient because data must be collected from a population of cells to obtain the brightness titration plots typically reported by FFS experiments.

This prompted us to look for an alternative that solves the self-correlation of tsMSQ without requiring additional measurement time. Inspired by bootstrapping, we implemented the algorithm depicted in Fig. 2A as a potential solution. As with the standard approach, a single data set is divided into segments, but the start point of each segment is now chosen at random as indicated in the figure. Note that the initial

Fig. 1. tsMSQ algorithm. A) Illustration of the standard tsMSQ algorithm. Each point on the tsMSQ curve is calculated from different segmentations of the same data set. Segment sizes are chosen as integer multiples of the sampling time TS. B) Experimental tsMSQ curve (circles) of cytoplasmic EGFP determined by the standard algorithm with a fit (red line) to Eq. (1) and corresponding standardized residuals sr. C) tsMSQ curves from three consecutive measurements of the same cell. D) Decorrelated tsMSQ of cytoplasmic EGFP with fit (red line). E) Experimental (black circles), bootstrap (red squares), and theoretical (black line) uncertainty of tsMSQ vs. T for the data shown in panel B. (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)
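The three uncertainty measures compared in Fig. 1E and the goodness-of-fit statistic can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the authors' IDL implementation: the names `curves` (a list of bootstrapped tsMSQ curves, one list per iteration) and `fit` (the model curve evaluated at the same T values) are hypothetical, and the formulas follow Eqs. (5) to (8) as read from the Methods section.

```python
import math
from statistics import mean, variance


def shot_noise_sigma(t_daq, t_s):
    """Eq. (5): sigma = sqrt(T_S / T_daq) = 1 / sqrt(N0)."""
    n0 = round(t_daq / t_s)  # number of recorded photon-count samples
    return 1.0 / math.sqrt(n0)


def bootstrap_variance(curves):
    """Eq. (6): sample variance (N - 1 in the denominator) at each T_j.

    curves[i][j] holds tsMSQ_i(T_j) for bootstrap iteration i.
    """
    return [variance(column) for column in zip(*curves)]


def chi_squared(curves, fit):
    """Eqs. (7)-(8): sum over T_j of the mean squared standardized residual."""
    s_b2 = bootstrap_variance(curves)
    chi2 = 0.0
    for j, column in enumerate(zip(*curves)):
        sr2 = mean((y - fit[j]) ** 2 for y in column) / s_b2[j]
        chi2 += sr2
    return chi2


def reduced_chi_squared(curves, fit, n_params):
    """Chi-squared divided by dof = (number of T points) - (fit parameters)."""
    dof = len(fit) - n_params
    return chi_squared(curves, fit) / dof


# Acquisition settings quoted in the Methods: 1 min of data sampled at 50 us.
sigma = shot_noise_sigma(60.0, 50e-6)
```

For the acquisition settings in the Methods, N0 = 1.2 × 10^6 and the shot-noise limit evaluates to roughly 9 × 10^-4, the same order as the bootstrapped uncertainty of Q0 reported in the Results.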


Fig. 2. The bootstrap tsMSQ algorithm. A) The bootstrap tsMSQ algorithm operates on a single data set, which is divided into segments of length Tmax with starting points chosen at random. The value of tsMSQ(Tmax) is calculated from these segments as in the standard algorithm. Next, the randomly chosen segments are repeatedly subdivided into shorter segments to construct a bootstrapped tsMSQ curve, tsMSQ1 (upper panel, left). These steps are reiterated N times to obtain a total of N bootstrapped tsMSQ curves. Each iteration starts by choosing the initial segment position at random as illustrated in the lower panel. B) Result of applying a single bootstrap iteration to a data set from a cell expressing EGFP (circles) with a fit to Eq. (1) (red line). C) Result of the 5th bootstrap iteration (black circles) fit to Eq. (1) (red line). The bootstrapped tsMSQ curves from the four previous iterations (faded circles and lines) are shown for reference. (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)
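The resampling scheme of Fig. 2A can be sketched in Python as follows. This is an illustrative sketch only. Several pieces are assumptions that are not specified in this section: the per-segment estimator `ts_q` is a simple placeholder (lag-one covariance of the counts divided by their mean) standing in for the tsMSQ estimator defined in ref. [15], the number of random segments per iteration is taken as N0 \ Mmax, and all function and parameter names are hypothetical.

```python
import random


def ts_q(segment):
    """Placeholder estimator (assumption, not the definition from ref. [15]):
    lag-one covariance of the photon counts divided by the mean count."""
    m = len(segment)
    mu = sum(segment) / m
    if mu == 0:
        return 0.0
    cov = sum((segment[i] - mu) * (segment[i + 1] - mu)
              for i in range(m - 1)) / (m - 1)
    return cov / mu


def bootstrap_curve(counts, m_max, m_values, n_segments, rng, estimator=ts_q):
    """One bootstrap iteration.

    Segments of length m_max start at random positions (drawn with
    replacement); each is then subdivided into pieces of length m for every
    m in m_values (each m must be >= 2 for the placeholder estimator).
    """
    starts = [rng.randrange(0, len(counts) - m_max + 1)
              for _ in range(n_segments)]
    segments = [counts[s:s + m_max] for s in starts]
    curve = []
    for m in m_values:
        pieces = [seg[i:i + m]
                  for seg in segments
                  for i in range(0, m_max - m + 1, m)]
        curve.append(sum(estimator(p) for p in pieces) / len(pieces))
    return curve


def bootstrap_tsmsq(counts, m_max, m_values, n_iter=100, seed=0):
    """Return n_iter bootstrapped curves, one value per entry of m_values."""
    rng = random.Random(seed)
    n_segments = len(counts) // m_max  # assumed segment count per iteration
    return [bootstrap_curve(counts, m_max, m_values, n_segments, rng)
            for _ in range(n_iter)]
```

Each returned curve would then be fit separately, and the spread of the fit parameters across iterations gives their uncertainty. Passing a fixed `seed` makes the resampling reproducible, which is convenient when comparing fit models on the same set of bootstrapped curves.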

segmentation is performed using the longest segment period Tmax of interest. After selecting the segments, the tsMSQ value for Tmax is calculated from the segmented data in exactly the same procedure as previously described [15]. Next, the segments are further subdivided into segments of length T < Tmax to construct the complete tsMSQ curve (Fig. 2A). This tsMSQ curve carries the subscript 1 to indicate that is the first of an ensemble of tsMSQ curves. By repeating this process N times we generate a total of N tsMSQ curves sampled from a single data set (Fig. 2A), each representing a different realization based on random selection of initial data segments. Since iterative and random resampling of a dataset with replacement is the defining characteristic of the bootstrap method [24], we refer to this approach as the bootstrap tsMSQ algorithm. Next, the uncertainty is calculated by Eq. (6) and each of the bootstrap tsMSQ curves, tsMSQ1 (T ) to tsMSQ N (T ) , is fit to the same model leading to a set of N best fit parameters. The distribution of each fit parameter serves to identify their mean and variance. To illustrate the analysis procedure we applied the bootstrapping algorithm to the FFS data of EGFP previously analyzed by standard tsMSQ (Fig. 1A). A total of N = 1000 bootstrap samples were generated from the data set. The first and 5th bootstrap tsMSQ are shown together with fits to a single diffusing species as a representative sample (Fig. 2B and C). The uncertainty sB (T ) of the tsMSQ function was directly calculated from the bootstrapped samples (Eq. (6)) and agrees with the theoretically predicted standard deviation due to shot noise (Fig. 1E). Fitting all of the bootstrapped curves to a model of a single diffusing species generates N = 1000 estimates of the model parameters Q0 and D . We determined the mean ( Q0 = 0.018 and D = 0.61 ms) and

standard deviation (sQ0 = 0.001 and s D = 0.09 ms) from the bootstrapped parameter distributions (Fig. 3A and B). The shape of these distributions closely follows Gaussians with a center and width given by the mean and standard deviation (Fig. 3A and B). The tsMSQ curve with mean parameters Q0 and D represents the fit curve of the bootstrap analysis (Fig. 3C). Using the residuals (Eq. (7)) of the bootstrap analysis we recovered a reduced chi squared (Eq. (8)) of 2 = 1.03, indicating good agreement between experimental data and fit model. To demonstrate that bootstrap tsMSQ analysis accurately recovers the mean and standard uncertainty we generated a statistical sample for comparison by generating ten data sets from repeated measurements of a cell expressing EGFP. The data sets were analyzed by the standard tsMSQ algorithm to generate ten values of Q0 and D with sample mean and standard deviation of Q0 = 0.0138 ± 0.0012 and D = 0.74 ± 0.15 ms. These values served as a benchmark for comparison to the parameter values and uncertainties obtained by bootstrap analysis of each individual data set. The bootstrap analysis reproduced the sample mean and standard deviation from each individual measurement (Fig. 4). The average of the ten bootstrap results returned Q0 = 0.0138 and D = 0.78 (Fig. 4A–B) as well as uncertainties of sQ0 = 0.0011 and s D = 0.15 ms (Fig. 4C–D), which closely tracked the benchmark values. The theoretical uncertainty in Q0 due to shot noise (Eq. (5)) is lower than the experimental sample and bootstrap-determined uncertainty (Fig. 4C), which indicates that the data are not shot noise limited. We found that this excess noise varies from sample to sample. Thus, the analysis of tsMSQ data should not rely on the theoretically predicted error, but use the bootstrapped error estimate. We analyzed FFS data collected in cells with a wide range of EGFP

Fig. 3. Bootstrap tsMSQ analysis of a single FFS data set of cytoplasmic EGFP. A) Histogram of bootstrapped Q0 values and Gaussian (solid line) with mean and standard deviation of the histogram. B) Histogram of D values and Gaussian (solid line) with mean and standard deviation of the histogram. C) The distribution of bootstrapped tsMSQ(T ) values are visualized by the light and dark shaded regions corresponding to low and high density of the histogramed values, respectively. The tsMSQ fit (red line) and its standard deviation envelope (red dashed lines) are shown. (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.) 4

Analytical Biochemistry 582 (2019) 113359

J. Hennen, et al.

tsMSQ analysis lacked the means to accept or reject fit models. Bootstrap tsMSQ is able to judge the quality of fit models using chisquared statistics. We measured SS-EGFP expressing cells and performed bootstrap analysis using Eq. (4) as the fit model. We further used a single diffusing species (Eq. (1)) as a second fit model for comparison. Both models describe the bootstrapped tsMSQ with a 2 close to one at low expression levels (Fig. 6A). However, at higher expression levels (which is proportional to intensity, as given by Eq. (9)) only the model containing the exponential correlation process agrees with the data (Fig. 6B). The single species diffusion model shows a significant increase in 2 with increasing intensity, starting near 1 reaching values up to 2 ~4 at the high end of the intensity range measured (Fig. 6C), leading to a rejection of the model. In contrast, the single diffusion model with exponential correlation term resulted in 2 ~1 for all experiments (Fig. 6C). The probability to exceed based on the 2 values was greater than 0.05 for all but one measurement, which led us to accept the fit model. The corresponding fit parameters for SS-EGFP are shown in Fig. 7. The brightness of SS-EGFP is concentration independent with a mean and standard deviation of 0.99 ± 0.11 (Fig. 7A), which identifies a non-interacting monomer. The diffusion time D is concentration independent as confirmed by a fit to a straight line, which yields a slope (3.2 ± 1.7) x 10−3 ms/kHz. (Fig. 7B). The concentration independence of b and D of the diffusing species agrees with our expectations for noninteracting monomers of SS-EGFP. In contrast, the fluctuation amplitude A0 increases linearly with intensity with a fitted slope of (8.3 ± 0.4) x 10−5 kHz−1 (Fig. 7C) as expected for fluctuations in the sample volume [8]. Finally, the timescale of the volume fluctuations due to membrane undulations should not depend on concentration as confirmed by the data (Fig. 7D). 
We previously investigated the oligomerization of the luminal domain of SUN2 by MSQ using a two-species diffusion model [8,9], but lacked the tools to justify this model. To address this issue we took FFS data on cells expressing SS-EGFP-SUN2261−731 in the NE and applied bootstrap tsMSQ. A single bootstrap tsMSQ curve from one of the data sets is shown for illustration together with fits to a single- and twospecies diffusion model (Fig. 8A). We evaluated the bootstrapped curves of all FFS experiments by the 2 value for both models versus fluorescence intensity (Fig. 8B). The bootstrap analysis clearly rejects the single species model, as virtually all 2 values were significantly larger than one. On the other hand, the two species model described the data within statistical error as evidenced by 2 values which are close to one. We further calculated the probability to exceed from the chi squared values, the majority of which (79 out of 84 cells) were above 0.05. Thus, the two-diffusing species hypothesis is consistent with the data. The two species fits recovered a fast and slow diffusing component with mean and standard deviations of 8 ± 7 ms and 170 ± 80 ms, respectively (Fig. 8C). The presence of a fast and slow component is in agreement with previous results and probably reflects a freely diffusing and membrane associated component as previously discussed [8,15]. The total Q0 amplitude obtained by adding the amplitudes of the two diffusion species (Eq. (3)) was converted into a normalized brightness b which specifies the average oligomeric state of the sample

Fig. 4. The ensemble mean and standard deviation obtained by the standard tsMSQ method from ten repeated measurements is compared to individual fit results from bootstrap tsMSQ. Ten repeated measurements of a single cell expressing EGFP were individually analyzed by bootstrap tsMSQ The individual bootstrap fit parameters and uncertainties (grey) of the ten datasets as well as their mean (dashed line) are shown together with the ensemble results based on standard (S, blue) tsMSQ of all ten data sets. A) Q0 . B) D . C) Standard deviation of Q0 together with theoretically (T, red) predicted error based on shot noise. D) Standard deviation of D . (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)

expression levels to test the performance of bootstrap tsMSQ as a function of fluorescence intensity (Fig. 5). Fits to a single diffusing species recovered 2 close to one (Fig. 5A), indicating agreement between data and fit. The fitted Q0 value was converted into a brightness , which was subsequently divided by the calibration value EGFP determined in an independent control experiment to calculate the normalized brightness b = / EGFP . The normalized brightness is independent of intensity and consistent with a monomer (b = 1) within the experimental uncertainty with a mean and standard deviation of 1.03 ± 0.06 (Fig. 5B). Similarly, the diffusion time is independent of the expression level with a mean and standard deviation of 0.63 ± 0.10 ms (Fig. 5C). Cytoplasmic EGFP served as a model system for the initial evaluation of bootstrap tsMSQ. The remaining experiments focus on the characterization of proteins residing in the NE by the bootstrap approach. Previous work showed that analysis of NE proteins by conventional FFS analysis methods suffers from artifacts, which are avoided by tsMSQ [15]. Our first experimental system is SS-EGFP [8], a construct which targets EGFP expression into the lumen of the endoplasmic reticulum (ER). The signal sequence SS is cleaved during protein expression, leading to EGFP occupying the luminal space. Because the NE and ER lumen are contiguous, expression of SS-EGFP leads to the presence of EGFP in the NE lumen. We previously showed that FFS data of SS-EGFP in the NE are described by a single diffusing species plus an exponential correlation process (Eq. (4)), which accounts for volume fluctuations of the NE lumen [8]. However, the standard

Fig. 5. Bootstrap tsMSQ analysis of cytoplasmic EGFP. Cells expressing EGFP (n = 17) were analyzed using bootstrap tsMSQ. A) intensity with the mean indicated by the dashed line. C) D vs. intensity with the mean indicated by the dashed line. 5

2

values vs. intensity. B) b vs.

Analytical Biochemistry 582 (2019) 113359

J. Hennen, et al.

Fig. 6. Bootstrap tsMSQ of SS-EGFP. A) Results from a low intensity (F = 16 kcps) measurement of SS-EGFP. Distribution of bootstrap tsMSQ (T ) values (light and dark shading corresponds to low and high density, respectively) for SS-EGFP with fits to single species (Eq. (1), red dashed line) and single species plus exponential correlation (Eq. (4), solid green line). Both fit lines overlap reflecting the very small exponential correlation amplitude at low fluorescence intensities. B) Results from a high intensity (F = 189 kcps) measurement of SS-EGFP following the same plotting convention as panel A. C) v2 values vs. intensity from bootstrap tsMSQ analysis of cells expressing SS-EGFP (n = 25) for fit to Eq. (1) (red squares) and Eq. (4) (green circles).

Fig. 7. Fit parameters of SS-EGFP by bootstrap tsMSQ. FFS data taken in the NE of cells expressing SS-EGFP (n = 25) were fit to Eq. (4) by bootstrap tsMSQ. Dashed line indicates the mean. A) b values vs. intensity with mean and standard deviation of 0.99 ± 0.11. B) D vs. intensity with mean and standard deviation of 1.8 ± 0.4 ms. C) A0 vs. intensity with fit to a linear slope (red line). D) 0 vs. intensity with mean and standard deviation of 0.3 ± 0.2 s. (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)

account for both sources of uncertainty, the brightness data were binned into equal concentration regions with a width of N = 20, except for the highest bin, which covered 160 < N ≤ 200 in order to ensure a minimum of 4 data points per bin. The weighted means and standard errors for each bin were calculated and plotted vs. the weighted N to generate a brightness titration graph (Fig. 8D). The brightness increases with concentration to a value approaching 3, indicating the expected trimerization of the protein [9,25]. The brightness titration data were fit to a monomer-dimer-trimer equilibrium (MDT model) and separately to an all-or-nothing transition between monomeric and trimeric SS-EGFP-SUN2261−731 (MT model) for comparison (Fig. 8D). The mathematical equations modeling the brightness of these transitions have been described elsewhere [9]. However, previous work was unable to compare the two models because parameter uncertainties were unavailable. Because of the known uncertainties provided by the new analysis approach, we were able to determine chi-squared values for both models. The MDT and MT models resulted in χ² values of 3 and 7, respectively, which correspond to reduced chi-squared values of 0.5 for the MDT model and 0.9 for the MT model. While both models result in a reduced χ² below 1, these values fall within the 68% probability interval of their respective reduced χ² distributions due to their low degrees of freedom (7 and 8 for the MDT and MT models, respectively). Furthermore, the lower chi-squared value of the MDT model is of no significance since both reduced χ² values are less than one. Thus, both models describe the experimental binding curve within statistical error. The MDT model depends on two association coefficients (K1 and K2), while the MT model only requires one coefficient (K2), as described in the Supplementary Material. The fits identified the association coefficients and their 1σ uncertainty intervals, resulting in $K_1 = 3^{+4}_{-2} \times 10^{-2}$ and $K_2 = 1.9^{+2.0}_{-0.8} \times 10^{-3}$ for the MDT model and $K_2 = 8^{+2}_{-2} \times 10^{-4}$ for the MT model. The results for the MDT model indicate assembly via an intermediate and weakly populated dimer population. In fact, the data are statistically consistent with a vanishing dimer population, as shown by a fit to the MT model. These two descriptions of assembly are consistent, because K1 = 0 is within the 95% uncertainty interval of the MDT model, which reduces to the MT model

Fig. 8. Applying bootstrapped tsMSQ to SUN2. A) Distribution of bootstrap tsMSQ(T) values (light and dark shading corresponds to low and high density, respectively) for data taken in the NE of a cell expressing SS-EGFP-SUN2261−731. The data were fit to a single species (Eq. (1), red curve) and two species (Eq. (2), blue curve) diffusion model, resulting in reduced χ² values of 5.4 and 1.1, respectively. B-D) Results from bootstrapped tsMSQ analysis for data taken in the NE of cells expressing SS-EGFP-SUN2261−731 (n = 84). B) Reduced χ² values vs. intensity for fits to Eq. (2) (blue squares) and to Eq. (1) (red triangles). C) τD for fast (black circles) and slow (red squares) diffusing components vs. intensity from fits to Eq. (2). D) The brightness titration data together with fits to a monomer-trimer (solid green line) and monomer-dimer-trimer (dashed black line) binding model. The monomer-trimer model recovered a dissociation coefficient of KMT = 36 ± 4.

[5,8]. While bootstrap analysis provides uncertainties, these values account for the uncertainty of a fit parameter from a single measurement and do not reflect cell-to-cell variability (Fig. S1). In order to


In contrast to our previous work on the SUN2 luminal domain, the introduction of bootstrapped tsMSQ allowed for the statistical evaluation of different binding models. We also combined for the first time uncertainties from each measurement with those from cell-to-cell variation by binning brightness data within regions of similar concentration. These steps provide a new approach for quantifying binding curves in living cells with improved statistical rigor. Based upon this analysis we concluded that the inclusion of a significant population of dimers into the binding model was not statistically warranted, which supported the all-or-nothing assembly model between a monomer and trimer of SS-EGFP-SUN2261−731. This in vivo result is in agreement with in vitro studies suggesting the presence of trimers with no significant dimer population [17,18]. X-ray crystallography of the luminal domain of SUN2 further demonstrated that one of the two coiled-coil domains of SUN2 forms a trimeric bundle [25]. We expect that the coiled-coil domain is responsible for the all-or-nothing transition, noting that the existence of a direct monomer-trimer transition has previously been proposed for coiled-coils and specifically for a similar SUN2 construct [17,26].

While bootstrap tsMSQ was developed specifically to properly account for the slow membrane undulations of the NE, we predict it will prove useful for the study of other systems with slow fluctuating processes. For example, the slow diffusing component (τD ~ 100–200 ms) of SS-EGFP-SUN2261−731 is responsible for the increase of the tsMSQ amplitude for segment times beyond 6 s (Fig. 8A). Fluorescence fluctuation data are typically processed using a fixed segment time Tseg on the order of 1–10 s to achieve robust autocorrelation curves in live cell experiments [27]. However, the tsMSQ amplitude still increases for T > Tseg (Fig. 8A), which leads to biased results in autocorrelation analysis, as previously demonstrated [8]. Because tsMSQ explicitly takes the effect of segment time into account, it is free of this bias. This robustness of tsMSQ should be advantageous in certain live-cell applications, such as the study of plasma membrane associated proteins with low diffusivity. Thus, bootstrap tsMSQ provides a full-fledged and simple-to-use analysis tool for fluorescence fluctuation data taken in the NE of living cells and likely will prove useful for applications outside the NE as well.
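The idea of obtaining a curve and its point-wise uncertainties from a single intensity trace by resampling can be illustrated with a minimal sketch. This is not the authors' implementation. The estimator `ts_q` (lag-1 covariance of the photon counts divided by their mean, evaluated per segment), the choice to resample per-segment values with replacement, the independent resample drawn for every segment length, and all function names are assumptions made for illustration; `poisson_trace` is a hypothetical test-signal generator.

```python
import math
import random
import statistics


def poisson_trace(n, mean, rng):
    """Hypothetical test signal: n independent Poisson counts (Knuth's method)."""
    limit = math.exp(-mean)
    trace = []
    for _ in range(n):
        k, p = 0, rng.random()
        while p > limit:
            k += 1
            p *= rng.random()
        trace.append(k)
    return trace


def ts_q(segment):
    """Assumed estimator: lag-1 covariance of the counts divided by their mean.

    The segment must contain at least two samples.
    """
    mean = sum(segment) / len(segment)
    if mean == 0:
        return 0.0
    cov = sum((a - mean) * (b - mean) for a, b in zip(segment, segment[1:]))
    return cov / (len(segment) - 1) / mean


def bootstrap_ts_msq(counts, seg_lengths, n_boot=100, seed=0):
    """Return per-segment-length means and standard deviations over bootstrap curves."""
    rng = random.Random(seed)
    # Per-segment estimates for every segment length (non-overlapping segments).
    per_length = {
        length: [ts_q(counts[i:i + length])
                 for i in range(0, len(counts) - length + 1, length)]
        for length in seg_lengths
    }
    curves = []
    for _ in range(n_boot):
        # Each point of a bootstrap curve is built from its own resample.
        curves.append([
            statistics.fmean(
                rng.choices(per_length[length], k=len(per_length[length]))
            )
            for length in seg_lengths
        ])
    means = [statistics.fmean(column) for column in zip(*curves)]
    errors = [statistics.stdev(column) for column in zip(*curves)]
    return means, errors


counts = poisson_trace(4096, 5.0, random.Random(1))
means, errors = bootstrap_ts_msq(counts, seg_lengths=[16, 64, 256])
```

The spread of the bootstrap curves at each segment length serves as the uncertainty of that point, which is what a weighted fit and a goodness of fit test require. For uncorrelated Poisson counts the estimator should stay close to zero, apart from a small negative offset caused by subtracting the segment mean.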

Fig. 9. Illustrative model of SUN2261−731 interaction. SUN2261−731 (blue) is found as a freely diffusing monomer in the lumen or a trimer with significantly slower diffusion, likely due to interactions with endogenous partners (purple) at the nuclear membranes (grey). (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)

as explained in the Supplementary Material. This result indicates that the assembly proceeds through a very short-lived dimer species. Because the MDT and MT models are statistically indistinguishable for our experiments, we adopt the MT model over the MDT model to describe the data as succinctly as possible. This result provides the statistical underpinning for choosing the monomer-trimer transition model and supports the assembly model depicted in Fig. 9, which consists of monomers in the lumen that associate into trimers in an all-or-nothing transition. SUN2 is known to interact with Klarsicht, ANC-1, Syne homology (KASH) proteins, thereby forming the LINC complex, which has been implicated in force transduction across the NE [10]. Experimental results suggest that SUN2 can only interact with KASH after trimerization occurs [17,25]. Thus, we interpreted the presence of a fast and slow diffusion time (Fig. 8C) as representing two species [8,15], a fast and freely diffusing monomer in the lumen and a slowly diffusing trimer interacting with the KASH peptide of nesprins (Fig. 9).
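The binning of the brightness data and the chi-squared comparison of binding models can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the paper: inverse-variance weights, a standard error taken from the weighted scatter within each bin, lower-inclusive bin edges of fixed width (the wider top bin is not handled), and all function names are hypothetical choices.

```python
import math


def weighted_bin(points):
    """Weighted mean N, weighted mean b, and standard error of b for one bin.

    points: list of (N, b, sigma_b) tuples, one per measurement.
    """
    weights = [1.0 / sigma ** 2 for _, _, sigma in points]
    total = sum(weights)
    n_mean = sum(w * n for w, (n, _, _) in zip(weights, points)) / total
    b_mean = sum(w * b for w, (_, b, _) in zip(weights, points)) / total
    if len(points) < 2:
        # A single point carries only its own measurement uncertainty.
        return n_mean, b_mean, math.sqrt(1.0 / total)
    # Weighted scatter around the bin mean (assumed form of the standard error).
    scatter = sum(w * (b - b_mean) ** 2 for w, (_, b, _) in zip(weights, points))
    return n_mean, b_mean, math.sqrt(scatter / ((len(points) - 1) * total))


def bin_titration(points, width=20.0):
    """Group (N, b, sigma_b) points into bins of equal width in N."""
    bins = {}
    for point in points:
        bins.setdefault(int(point[0] // width), []).append(point)
    return [weighted_bin(bins[key]) for key in sorted(bins)]


def chi_squared(observed, sigmas, predicted, n_params):
    """Return chi-squared, degrees of freedom, and reduced chi-squared."""
    chi2 = sum(((o - p) / s) ** 2 for o, s, p in zip(observed, sigmas, predicted))
    dof = len(observed) - n_params
    return chi2, dof, chi2 / dof


# Hypothetical data: three measurements falling into two bins.
bins = bin_titration([(5.0, 1.0, 1.0), (15.0, 3.0, 1.0), (25.0, 2.0, 0.5)])
```

The degrees of freedom are the number of bins minus the number of fitted coefficients. Nine bins with two coefficients (MDT) or one coefficient (MT) would be consistent with the 7 and 8 degrees of freedom quoted in the text.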

Conflicts of interest

The authors declare no conflicts of interest.

4. Discussion

While the use of MSQ and tsMSQ was critical for the first successful studies of NE protein interactions by fluorescence fluctuation methods, the lack of error analysis was a distinct shortcoming limiting its use. Because of the critical need to evaluate models and uncertainties in fit parameters, the only option available was the decorrelated tsMSQ algorithm [15]. This algorithm results in an inefficient experimental procedure as it requires multiple independent measurements of the same sample, with each measurement lasting on the order of 60 s. A single decorrelated tsMSQ curve is constructed by calculating each tsMSQ data point from a randomly chosen data set. While this process destroys the correlations, no additional data is gained for the final analysis. In contrast, the bootstrap tsMSQ allows for model evaluation and calculation of uncertainties from a single measurement. Therefore, approximately five separate cells may be measured via bootstrap tsMSQ in the same time required for a single cell measurement via decorrelated tsMSQ. This is significant as measurements must be performed on a population of cells with a range of expression levels. Although bootstrap tsMSQ may appear complex, the algorithm is straightforward to implement and efficient, taking only a few seconds to construct and fit a typical set of 100 bootstrap iterations while requiring no more user input than the standard tsMSQ method.

Acknowledgement

This research was supported by National Institutes of Health (NIH) grant GM064589.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.ab.2019.113359.

Abbreviations

FFS      Fluorescence fluctuation spectroscopy
LINC     Linker of nucleoskeleton and cytoskeleton
KASH     Klarsicht, ANC-1, Syne homology
MDT      Monomer-dimer-trimer
MSQ      Mean-segmented Q
MT       Monomer-trimer
NE       Nuclear envelope
SUN      Sad1/UNC-84
tsMSQ    Time-shifted mean-segmented Q


References

[1] B. Mroczko, M. Groblewska, A. Litman-Zawadzka, J. Kornhuber, P. Lewczuk, Amyloid β oligomers (AβOs) in Alzheimer's disease, J. Neural Transm. 125 (2018) 177–191, https://doi.org/10.1007/s00702-017-1820-x.
[2] U. Dettmer, A.J. Newman, F. Soldner, E.S. Luth, N.C. Kim, V.E. von Saucken, J.B. Sanderson, R. Jaenisch, T. Bartels, D. Selkoe, Parkinson-causing α-synuclein missense mutations shift native tetramers to monomers as a mechanism for disease initiation, Nat. Commun. 6 (2015) 7314, https://doi.org/10.1038/ncomms8314.
[3] F. Larsen, H.O. Madsen, R.B. Sim, C. Koch, P. Garred, Disease-associated mutations in human mannose-binding lectin compromise oligomerization and activity of the final protein, J. Biol. Chem. 279 (2004) 21302–21311, https://doi.org/10.1074/jbc.M400520200.
[4] B.D. Slaughter, R. Li, Toward quantitative “in vivo biochemistry” with fluorescence fluctuation spectroscopy, Mol. Biol. Cell 21 (2010) 4306–4311, https://doi.org/10.1091/mbc.E10-05-0451.
[5] Y. Chen, L.-N. Wei, J.D. Müller, Probing protein oligomerization in living cells with fluorescence fluctuation spectroscopy, Proc. Natl. Acad. Sci. U.S.A. 100 (2003) 15492–15497, https://doi.org/10.1073/pnas.2533045100.
[6] P.J. Macdonald, Y. Chen, X. Wang, Y. Chen, J.D. Mueller, Brightness analysis by Z-scan fluorescence fluctuation spectroscopy for the study of protein interactions within living cells, Biophys. J. 99 (2010) 979–988, https://doi.org/10.1016/j.bpj.2010.05.017.
[7] E.M. Smith, P.J. Macdonald, Y. Chen, J.D. Mueller, Quantifying protein-protein interactions of peripheral membrane proteins by fluorescence brightness analysis, Biophys. J. 107 (2014) 66–75, https://doi.org/10.1016/j.bpj.2014.04.055.
[8] J. Hennen, K.-H. Hur, C.A. Saunders, G.W.G. Luxton, J.D. Mueller, Quantitative brightness analysis of protein oligomerization in the nuclear envelope, Biophys. J. 113 (2017) 138–147, https://doi.org/10.1016/j.bpj.2017.05.044.
[9] J. Hennen, C.A. Saunders, J.D. Mueller, G.W.G. Luxton, Fluorescence fluctuation spectroscopy reveals differential SUN protein oligomerization in living cells, Mol. Biol. Cell (2018), https://doi.org/10.1091/mbc.E17-04-0233.
[10] S. Alam, D.B. Lovett, R.B. Dickinson, K.J. Roux, T.P. Lele, Nuclear forces and cell mechanosensing, Prog Mol Biol Transl Sci 126 (2014) 205–215, https://doi.org/10.1016/B978-0-12-394624-9.00008-7.
[11] K.L. Wilson, J.M. Berk, The nuclear envelope at a glance, J. Cell Sci. 123 (2010) 1973–1978, https://doi.org/10.1242/jcs.019042.
[12] M.L. Watson, THE NUCLEAR ENVELOPE, J. Biophys. Biochem. Cytol. 1 (1955) 257–270.
[13] C.J. Smoyer, S.S. Katta, J.M. Gardner, L. Stoltz, S. McCroskey, W.D. Bradford, M. McClain, S.E. Smith, B.D. Slaughter, J.R. Unruh, S.L. Jaspersen, Analysis of membrane proteins localizing to the inner nuclear envelope in living cells, J. Cell Biol. 215 (2016) 575–590, https://doi.org/10.1083/jcb.201607043.
[14] K.-H. Hur, J.D. Mueller, Quantitative brightness analysis of fluorescence intensity fluctuations in E. Coli, PLoS One 10 (2015), https://doi.org/10.1371/journal.pone.0130063.
[15] J. Hennen, K.-H. Hur, S.R. Karuka, G.W.G. Luxton, J.D. Mueller, Protein oligomerization and mobility within the nuclear envelope evaluated by the time-shifted mean-segmented Q factor, Methods 157 (2019) 28–41, https://doi.org/10.1016/j.ymeth.2018.09.008.
[16] M. Crisp, Q. Liu, K. Roux, J.B. Rattner, C. Shanahan, B. Burke, P.D. Stahl, D. Hodzic, Coupling of the nucleus and cytoplasm: role of the LINC complex, J. Cell Biol. 172 (2006) 41–53, https://doi.org/10.1083/jcb.200509124.
[17] S. Nie, H. Ke, F. Gao, J. Ren, M. Wang, L. Huo, W. Gong, W. Feng, Coiled-coil domains of SUN proteins as intrinsic dynamic regulators, Structure 24 (2016) 80–91, https://doi.org/10.1016/j.str.2015.10.024.
[18] Z. Zhou, X. Du, Z. Cai, X. Song, H. Zhang, T. Mizuno, E. Suzuki, M.R. Yee, A. Berezov, R. Murali, S.-L. Wu, B.L. Karger, M.I. Greene, Q. Wang, Structure of Sad1-UNC84 homology (SUN) domain defines features of molecular bridge in nuclear envelope, J. Biol. Chem. 287 (2012) 5317–5326, https://doi.org/10.1074/jbc.M111.304543.
[19] J. Hennen, I. Angert, K.-H. Hur, G.W. Gant Luxton, J.D. Mueller, Investigating LINC complex protein homo-oligomerization in the nuclear envelopes of living cells using fluorescence fluctuation spectroscopy, in: G.G. Gundersen, H.J. Worman (Eds.), The LINC Complex: Methods and Protocols, Springer New York, New York, NY, 2018, pp. 121–135, https://doi.org/10.1007/978-1-4939-8691-0_11.
[20] E.M. Smith, J. Hennen, Y. Chen, J.D. Mueller, Z-scan fluorescence profile deconvolution of cytosolic and membrane-associated protein populations, Anal. Biochem. 480 (2015) 11–20, https://doi.org/10.1016/j.ab.2015.03.030.
[21] L. Mandel, Sub-Poissonian photon statistics in resonance fluorescence, Opt. Lett. 4 (1979) 205–207.
[22] A. Sanchez-Andres, Y. Chen, J.D. Müller, Molecular brightness determined from a generalized form of Mandel's Q-parameter, Biophys. J. 89 (2005) 3531–3547, https://doi.org/10.1529/biophysj.105.067082.
[23] Y. Chen, J.D. Müller, Determining the stoichiometry of protein heterocomplexes in living cells with fluorescence fluctuation spectroscopy, Proc. Natl. Acad. Sci. U.S.A. 104 (2007) 3147–3152, https://doi.org/10.1073/pnas.0606557104.
[24] B. Efron, Bootstrap methods: another look at the jackknife, Ann. Stat. 7 (1979) 1–26, https://doi.org/10.1214/aos/1176344552.
[25] B.A. Sosa, A. Rothballer, U. Kutay, T.U. Schwartz, LINC complexes form by binding of three KASH peptides to domain interfaces of trimeric SUN proteins, Cell 149 (2012) 1035–1047, https://doi.org/10.1016/j.cell.2012.03.046.
[26] J.A. Boice, G.R. Dieckmann, W.F. DeGrado, R. Fairman, Thermodynamic analysis of a designed three-stranded coiled coil, Biochemistry 35 (1996) 14480–14485, https://doi.org/10.1021/bi961831d.
[27] Y. Chen, J.D. Müller, Q. Ruan, E. Gratton, Molecular brightness characterization of EGFP in vivo by fluorescence fluctuation spectroscopy, Biophys. J. 82 (2002) 133–144, https://doi.org/10.1016/S0006-3495(02)75380-0.