C H A P T E R
T E N
Probing the Input–Output Behavior of Biochemical and Genetic Systems: System Identification Methods from Control Theory Jordan Ang,*,† Brian Ingalls,‡ and David McMillen*,† Contents 280 282 283 285 291 291 293 305 311 316
1. Introduction 2. System Identification Applied to a G-Protein Pathway 2.1. Construction of the frequency response 2.2. Interpretation of the frequency response 3. System Identification 3.1. Transfer function models 3.2. Applying the procedure to the G-protein pathway 3.3. Examples of experimental implementation 4. Conclusion References
Abstract A key aspect of the behavior of any system is the timescale on which it operates: when inputs change, do responses take milliseconds, seconds, minutes, hours, days, months? Does the system respond preferentially to inputs at certain timescales? These questions are well addressed by the methods of frequency response analysis. In this review, we introduce these methods and outline a procedure for applying this analysis directly to experimental data. This procedure, known as system identification, is a well-established tool in engineering systems and control theory and allows the construction of a predictive dynamic model of a biological system in the absence of any mechanistic details. When studying biochemical and genetic systems, the required experiments are not standard laboratory practice, but with advances in both our ability to measure system outputs (e.g., using fluorescent reporters) and our ability to generate * Department of Chemical and Physical Sciences, University of Toronto Mississauga, Mississauga, Ontario, Canada Institute for Optical Sciences, University of Toronto Mississauga, Mississauga, Ontario, Canada { Department of Applied Mathematics, University of Waterloo, Waterloo, Ontario, Canada {
Methods in Enzymology, Volume 487 ISSN 0076-6879, DOI: 10.1016/S0076-6879(11)87010-2
#
2011 Elsevier Inc. All rights reserved.
279
280
Jordan Ang et al.
precise inputs (with microfluidic chambers capable of changing cells’ environments rapidly and under fine control), these frequency response methods are now experimentally practical for a wide range of biological systems, as evidenced by a number of successful recent applications of these techniques. We use a yeast G-protein signaling cascade as a running example, illustrating both theoretical concepts and practical considerations while keeping mathematical details to a minimum. The review aims to provide the reader with the tools required to design frequency response experiments for their own biological system and the background required to analyze and interpret the resulting data.
1. Introduction Recent advances in experimental techniques have allowed ever-increasing scrutiny of the dynamic behavior of cellular mechanisms. In many cases, such as development and response to environmental changes, the dynamic nature of a biological process is the key to its function. Standard tools of biological data analysis are well equipped to address steady-state behavior (e.g., dose–response curves), but to unravel the nature of dynamic processes (e.g., transient and oscillatory responses), new analytic methods are required. A fundamental aspect of a dynamic process is the timescale on which it acts. Just as biological systems span a wide range of spatial scales (from nanometer-wide proteins to kilometer-wide ecosystems), biological processes span a wide range of temporal scales, from ligand association (seconds) to protein expression (minutes to hours) to organismal development (months to years) to ecosystem alteration (years to millennia). Investigation of any given process often reveals a network of simultaneous events occurring over a range of timescales. The resulting dynamic network behavior is often not readily apparent from the nature of the individual components. The engineering community has long been dealing with (and exploit˚ stro¨m and ing) the dynamic behavior of mechanical and electrical systems (A Murray, 2008; Haykin and Van Veen, 2005). One of the basic notions the engineers have derived for addressing time-varying processes is the frequency response, which provides a concise characterization of the manner in which a system responds to perturbations at various timescales. The frequency response relies on frequency domain analysis, which applies directly only to linear systems. A system is linear if it acts additively: the response to two simultaneous perturbations is equivalent to the sum of the responses to the individual perturbations. Linearity is a helpful conceit; no real systems are perfectly linear. (In particular, biological systems typically involve saturation, rather than additivity, of response.) Nevertheless, systems often exhibit approximately linear behavior, and in fact all systems behave approximately linearly when exposed to perturbations that are
Frequency Response Methods in Biology
281
sufficiently small. In practice, many systems exhibit behavior that is close to linear. This is particularly true for self-regulating systems that tend to operate around a specific nominal condition; this applies both to engineered automatic feedback systems and to homeostatic biological mechanisms. The frequency response of a system can be assessed by observing the response of the system to specific inputs. This analysis is simplest when those inputs are sinusoidal oscillations at various frequencies; the corresponding responses indicate the behavior of the system over a range of timescales. In some contexts (e.g., electrical circuits), the generation of oscillatory signals is straightforward. In contrast, such signals can be difficult to produce in a biological setting, especially in the case of chemical signals (though Block et al., 1983 is an early example of work applying oscillatory signals to analyze a biochemical system). Recent advances in microfluidic technologies have placed the production of such inputs, and of other oscillatory inputs such as square waves, within broader reach (Beebe et al., 2002; Bennett and Hasty, 2009; Bennett et al., 2008; Hersen et al., 2008; Mettetal et al., 2008; Shimizu et al., 2010). In engineering applications, the frequency response can be efficiently assessed in a single experiment using an input that “excites” the system at multiple frequencies simultaneously (Ljung, 1999). (Standard examples are white noise, steps, or approximations of the “impulse function”—an infinitely tall, infinitely short pulse). The signal-to-noise ratios (SNRs) inherent in molecular biology make these experiments less useful in this setting (although see Block et al., 1982 for a successful implementation of a chemical impulse). The frequency response provides valuable insight into a system’s dynamic behavior. This includes a characterization of the bandwidth of the system: the fastest timescale on which the system can act. Moreover, the frequency response can be used to generate a transfer function model of the system’s input–response behavior. A transfer function model allows prediction of the response of the system to arbitrary inputs (provided linearity of behavior is adequately maintained). A transfer function model does not address the specific mechanisms (e.g., biochemical or genetic) underlying the input–response behavior; it is a “black box” model. The term system identification refers to the process of constructing such a model from observation of dynamic responses. In this review, we will introduce and illustrate the system identification process and highlight cases in which the method has been successfully applied experimentally. In Section 2, we present a running application—a yeast G-protein signaling cascade—and use a mathematical model of this system to illustrate how the frequency response can be assessed from idealized experimental observation of the system’s response to sinusoidal inputs, as well as how the frequency response provides insight into system behavior. In Section 3, we describe how a frequency response could be generated from real experimental observations, and then extend that analysis to the construction of a transfer function model. We close Section 3 with a brief
282
Jordan Ang et al.
discussion of examples of experimental applications of frequency response methods to biochemical systems. Finally, in Section 4, we review successful biological applications of systems identification to experimental data, and conclude with a discussion of the role that the methods we describe here may play in future efforts to understand biochemical and genetic systems.
2. System Identification Applied to a G-Protein Pathway Heterotrimeric G-protein signaling systems are a common component of eukaryotic signal transduction pathways, and are of acute clinical interest as common drug targets (McCudden et al., 2005; Oldham and Hamm, 2008; Yi et al., 2003). The G-protein component of the pheromone response pathway in the budding yeast Saccharomyces cerevisiae is a well-characterized example of this family of pathways; both the kinetic details and the dynamic behavior of the pathway have been studied (Yi et al., 2003). We will use this pathway, shown schematically in Fig. 10.1, to illustrate system identification techniques. In this section, we will examine the behavior of the mathematical model constructed by Yi et al. (2003) as an idealization of an experimental analysis. This will allow a clean presentation of the concepts underlying the frequency response.
production
L+R
de
gr
ad
Ga–Gbg
Ga–GDP
on
ion
at
rad
deg
ati
RL
Ga–GTP
Gbg
Figure 10.1 G-protein signaling pathway in yeast (Yi et al., 2003). The pathway input is the level of extracellular ligand, L. The ligand binds to receptor R to form the ligand– receptor complex RL. This complex catalyzes the association of GTP with the Ga subunit of the heterotrimeric G-protein complex, and the concomitant dissociation of the Ga and Gbg subunits. Ga-GTP activates downstream activity until the associated GTP is dephosphorylated to GDP, after which the G-protein complex reforms.
283
Frequency Response Methods in Biology
2.1. Construction of the frequency response As described in Section 1, the frequency response can be assessed from timeseries observation of the system’s response to oscillatory input signals. In the case of the pheromone response G-protein pathway, the input (extracellular ligand) is the level of alpha factor pheromone (assuming the target cells are mating type a). Figure 10.2 shows a model-generated dose–response curve indicating the abundance of the pathway output, active Ga-GTP as a function of extracellular alpha factor concentration (L in Figure 10.1). This curve, which matches the experimental results of Yi et al. (2003), indicates that over its active range, the signal transduction pathway’s active range is characterized by a near-linear response centered at an alpha factor level of about 1 nM. Consequently, we choose 1 nM as a nominal input level, and address the behavior of the system as the ligand input varies around this nominal value. From the curve, an alpha factor concentration of 1 nM corresponds to a Ga-GTP abundance of 510 molecules per cell. Linear systems and sinusoidal inputs share a special relationship that underlies all frequency domain analysis: the steady-state response of a linear system to a sinusoidal input is a sinusoidal output of the same frequency. This does not hold for nonlinear systems, or for other classes of inputs.
Abundance of active Ga-GTP (molecules per cell)
900 800 700 600 500 400 300 200 100 0 10−3
10−2
10−1
100 101 Alpha factor (nM)
102
103
Figure 10.2 Model-generated G-protein pathway dose–response. The steady-state abundance (in molecules per cell) of Ga-GTP (which we take as the system’s output) is plotted against the concentration of the extracellular alpha factor ligand (the system’s input).
284
Jordan Ang et al.
1000
alpha factor (input) Ga–GTP (output)
900 800
Abundance
700 600 500 400 300 200 100 0
0
50
100
150
200
250
Time (s)
Figure 10.3 Simulated G-protein pathway transient response. The system input (concentration of the extracellular alpha factor ligand) is varied sinusoidally. The system’s output passes through a brief transient period, after which it settles into a sinusoidal pattern of its own. The frequency of the output matches the input frequency, but the phase (the location of the peaks and troughs) is shifted and the amplitude is different.
Figure 10.3 shows a model simulation representing the time-series measurements of an experiment in which cells are exposed to a sinusoidal oscillation in ligand level. The system response follows a short transient before settling to a steady oscillatory behavior. The frequency of oscillation of the response matches that of the input, but the output oscillations do not have the same amplitude as the input, and the two signals are out-of-phase (i.e., the peaks and troughs are not aligned). The ratio of the amplitude of response to the amplitude of input is called the system gain; the difference in the phase is called the system phase shift (or just system phase). These measures of system behavior are dependent on the frequency of the input, but (as a consequence of linearity) do not depend on its amplitude. By probing the response of the system to sinusoidal inputs at various frequencies, we reveal how these two measures (gain and phase) depend on the timescale of the input. In the subsequent dynamic analysis, we will address the response of the system in terms of the deviation of the Ga-GTP level from 510 molecules per cell, to an input described by the deviation of the alpha factor level from 1 nM. Since we are simulating model behavior, we will address the response to pure sinusoids. In many cases, sinusoidal inputs cannot be experimentally achieved, and so other oscillatory inputs (e.g., square waves) must be used.
Frequency Response Methods in Biology
285
We will defer the treatment of these more general inputs (which requires a discussion of Fourier analysis) to Section 3. Figure 10.4 shows three simulated steady-state input–response pairs. In each case, the gain and phase can be assessed by comparing the two signals. Comparing these three responses (note the change in scale on the time axis), we see that the amplitude of the response (the deviation from nominal GaGTP levels) changes with input frequency, although the amplitude of the input remains the same. We see too that the phase behavior changes: at low frequency the response lies (nearly) in phase with the input; at higher frequencies the response falls significantly out-of-phase. When this experiment is repeated over a range of frequencies, an overall picture of the system’s frequency - dependence emerges. This information is typically displayed in a pair of plots: graphs of the system gain and system phase as functions of frequency. Figure 10.5 shows the corresponding curves for this system, with the data from Fig. 10.4 labeled. Following standard practice, frequency is plotted on a logarithmic scale, as is gain, while phase shift is plotted in degrees. These are known as gain and phase Bode plots, named after the American engineer Hendrik Bode (1905–1982) who was the first to apply this analysis. (Engineers typically scale the gain in a Bode plot by a prefactor of 20, and report the result in decibels; also they generally report frequency in radians per second rather than Hertz.) In this case, the Bode plot was generated directly from the mathematical model. In application, these curves would be fit to experimental observations. We illustrate this fitting process in Section 3.
2.2. Interpretation of the frequency response The Bode plots provide direct insight into the dynamic behavior of the system. The phase plot contains information about the way in which the system acts when connected to other systems (in cascade or in feedback). We will not address such issues here, but in section 3 we will find the phase plot useful as an additional constraint on the fitting of a transfer function to frequency response data. The gain plot indicates the response of the system to perturbations at various frequencies (i.e., timescales). In particular, it reveals the (frequency) filtering properties of the system. The notion of frequency filtering follows from the fact that any signal can be expressed as a combination of pure sinusoids, via Fourier decomposition.1 As illustrated above, the gain plot shows how a system amplifies or 1
A periodic signal can be written as a sum of sinusoids over a discrete range of frequencies, via a Fourier series. A nonperiodic signal can be expressed as an integral of sinusoids over a continuum of frequencies, via the Fourier transform.
286
Jordan Ang et al.
A 1.2
1.1
550
1.05 1
500
0.95 0.9
450
0.85 0.8 0.75
400 0
2
4
6
8
10
12
14
16
Time (s)
Input: deviation from nominal (nM)
B
Output: deviation from nominal (molecule per cell)
600
1.15
18 × 104
1.25 1.2
600
1.15 1.1
550
1.05 1
500
0.95 0.9
450
0.85 0.8 0.75
Output: deviation from nominal (molecule per cell)
Input: deviation from nominal (nM)
1.25
400 0
500 1000 1500 2000 2500 3000 3500 4000 4500 Time (s)
1.25 1.2
600
1.15 1.1
550
1.05 1
500
0.95 0.9
450
0.85 0.8 0.75
Output: deviation from nominal (molecule per cell)
Input: deviation from nominal (nM)
C
400 0
20
40
60
80
100
120
140
160
180
Time (s)
alpha factor (input)
Ga–GTP (output)
Figure 10.4 Simulated G-protein pathway steady-state response to sinusoidal inputs. Note the different scales on the time axes for the different plots. The input functions (light curves) have frequencies of (A) 2.5 10 5 Hz; (B) 10 3 Hz; and (C) 2.5 10 2 Hz; the resulting outputs are shown as dark curves. The change in amplitude and phase is summarized as a function of frequency in Fig. 10.5. (Note that since the input and output are measured in different units, the gain does not correspond to the ratio of the amplitudes of the traces in the graph.)
287
Frequency Response Methods in Biology
103 2
Gain
10
B Low-frequency gain: 200
20 4B
Corner frequency: 4C 0.005 Hz
101 100 10-1
4A
0
4A Phase shift (deg)
A
-20 4B
-40 -60 -80 -100 -120 -140
Roll-off (slope): -2
4C
-160 -2
10
10-5
10-4
10-3
10-2
10-1
100
-180
10-5
Frequency (Hz)
10-4
10-3
10-2
10-1
100
Frequency (Hz)
Figure 10.5 Frequency response of the G-protein pathway model. These Bode plots show (A) the system gain and (B) the system phase shift, as a function of the frequency of the input signal. The gains and phases of the signals with frequencies shown in Fig. 10.4 are marked here as points 4A, B, and C, corresponding to Fig. 10.4’s three panels. 103
102
Gain
101
100
10−1
10−2
10−1
100 Frequency (Hz)
101
Figure 10.6 Gain Bode plot for a resonant system. The system responds strongly at a frequency of 0.5 Hz, with the gain spiking upward by orders of magnitude at that point and dropping off sharply at lower and higher frequencies.
attenuates a sinusoidal input at a given frequency. When the linear system is exposed to a general input, each of its frequency components is passed to the output according to the corresponding gain, and thus this graph gives a powerful map of the system’s input–output behavior. As a first example, consider Fig. 10.6, which shows the gain plot for a system with a resonant frequency of 0.5 Hz. This plot suggests that this system will respond strongly to sinusoids at this resonant frequency, and will
288
A
Jordan Ang et al.
B
4 3
1.5 1 0.5
1
Output
Input
2 0 −1 −2 −3 −4
0 −0.5 −1
0
5
10
15 20 Time (s)
25
30
−1.5
0
5
10
15 20 Time (s)
25
30
Figure 10.7 Input–output pair for the resonant system whose Bode gain plot was shown in Fig. 10.6. (A) The input signal: Gaussian white noise, a random signal containing a wide range of frequencies. (B) The corresponding output signal: the resonant frequency at 0.5 Hz has been picked out of the noisy input and preferentially amplified, producing a near-sinusoidal output at the resonant frequency.
attenuate oscillatory signals at all other frequencies. Referring to Fig. 10.7, we see an example of that behavior. The input is a Gaussian white noise signal, which is composed of oscillations over a wide range of frequencies. The output, in contrast, shows only the influence of the input at the resonant frequency: the remaining components of the signal have been so attenuated that they are no longer visible on the plot. This sort of resonant system is useful in technology2 and may appear in some biological oscillators as well (see, e.g., Westermark et al., 2009), but is not representative of the typical behavior of biochemical and genetic networks; we include it here as a direct illustration of filtering. Figure 10.8 illustrates some of the filtering behaviors more commonly exhibited by cellular processes. A single input function—a low-frequency (0.16 Hz) sinusoid corrupted by higher frequency noise (Fig. 10.8A)—has been passed through three separate systems. In each case, the nature of the resulting output signal can be predicted from the system’s frequency response, as follows. In the first case (Fig. 10.8B and C), the system acts as a low-pass filter. The gain plot is approximately piece-wise linear: a constant value at low frequencies, then a corner, followed by a linear descent with a slope of 1 on the log–log plot. The frequency at which the corner occurs is known as the system’s corner (or cut-off or break) frequency, and indicates the system’s bandwidth. Below this frequency, the system allows all components of the 2
For instance, Fig. 10.7 illustrates the phenomenon that allows a distinct radio signal to be captured from the mixture of signals that we hear as radio “static.” The tuning dial allows the listener to change the resonant frequency, thus selecting which signal will be “lifted” out of the noise.
289
Frequency Response Methods in Biology
A Input:
5 4 3 2 1 0 −1 −2 −3 −4 0
1
10
0
10
−1
10
−2
Gain
Gain
10 10
−3
10
−4
10
−5
10 10
−6
10
10
2.5 2 1.5 1 0.5 0 −0.5 −1 −1.5 −2
10 10
10
C
15 20 Time (s)
25
30
−4
−2
0
2
10 10 10 Frequency (Hz)
10
4
10
E
F
1
10
0
10
−1
10
−2
Gain
10 10
Outputs:
10
D
B Filters:
5
−3 −4
10 −4
10
−2
0
2
10 10 10 Frequency (Hz)
10
G
2.5 1.5 1 0.5 0
−0.5 5
10
15 20 Time (s)
25
30
−1 0
5
10
15 20 Time (s)
25
0
−1 −2 −3 −4 −5 −6
10
4
2
0
10 10
−5 −6
10
1
30
4 3 2 1 0 −1 −2 −3 −4
−4
−2
10
0
0
2
10 10 10 Frequency (Hz)
5
10
15 20 Time (s)
4
10
25
30
Figure 10.8 Filtering behavior. (A) The input signal is a sinusoid with frequency 0.16 Hz, corrupted with high-frequency noise. This input has been passed through each of the systems in the second row, resulting in the output signals in the third row. (B and C) The gain plot for B shows that this system is a low-pass filter with a corner frequency of 1 Hz and a roll-off of 1. The output C shows that the high-frequency noise has been reduced, leaving the underlying sinusoidal signal more visible. (D and E) The gain Bode plot D is again a low-pass filter, this time with a lower corner frequency (10 2 Hz). The output signal has its high-frequency components attenuated more than in panel C, resulting in a smoother sinusoid. (F and G) This system’s gain Bode plot F shows that it is a band-pass filter, with a peak at 102 Hz. The frequency of the sinusoidal oscillation in the input signal (panel A) lies well below this peak, so the filter obscures the sinusoidal nature of the original signal, leaving only the higher frequency noise components. The resulting output G is relatively flat, and somewhat less noisy than the original because the highest frequency components of the noise have also been attenuated by the band-pass filter.
input to “pass” through to the output signal. High-frequency components of the input, in contrast, are significantly attenuated. Comparing the input and output signals in this case, we see that the low-frequency oscillations are retained in the output, as is the noise around that oscillation, but the noise has been smoothed: the high-frequency components have been stripped away, leaving a signal with less “wild” changes in direction.
290
Jordan Ang et al.
More specifically, the gain plot for this low-pass filter is characterized by three parameters: the gain at low frequency, the frequency at which the “corner” occurs, and the slope after the corner (the so-called “roll-off ”); all three are informative about the system’s behavior:
Low-frequency (DC) gain: The gain at low frequency is the degree to which constant and slowly varying inputs are amplified or attenuated. (Electrical engineers refer to this as the “DC gain” since direct current (DC) provides a constant input.) This low-frequency gain describes, for instance, the steady-state response of the system to a step input, and is often referred to as the (local, steady-state) sensitivity of the system to the input (Ingalls, 2004). Corner frequency: As mentioned, the corner frequency indicates the system’s bandwidth: the frequency above which inputs are attenuated. For cellular processes, the bandwidth is closely related to the timescales on which component processes act (e.g., association/dissociation rates, halflives of proteins, metabolic consumption rates). When exposed to disturbances that act on timescales considerably faster than their inherent timescales, these processes are unable to “keep up,” and instead react only to the temporal average of the perturbation. In the case of sinusoidal inputs centered around a nominal level, the signal averages to the nominal input, and so the system exhibits no dynamic response to such highfrequency signals. Roll-off: The drop beyond the corner frequency eventually reaches a straight line with a downward slope of N decades of gain per decade of frequency on this log–log plot. The slope N is called the relative degree of the system, and is a measure of how quickly the output responds to changes in the input (the larger the relative degree, the more sluggish the response). The filters considered here have a slope of one decade per decade, and so have relative degree 1. The effect of a change in the bandwidth is illustrated by the second system shown in Fig. 10.8D and E. This system is also a low-pass filter, but has a shorter bandwidth, and so allows fewer high-frequency components of an input signal to pass to the output. The response in the Fig. 10.8E illustrates this behavior: the low-frequency sinusoid is clearly seen in the output, but this filter has removed even more of the high-frequency noise. The gain plot of our third example (Fig. 10.8F and G) differs significantly from the low-pass filters: it shows attenuation at both low and high frequencies. This is referred to as a band-pass filter, and is characterized by the range (or band) of frequencies that it allows to pass from input to output. (The resonant system in Fig. 10.6 is also a band-pass filter, with a very tight band.) Referring to the corresponding output signal (Fig. 10.8G), we see a behavior that is complementary to the low-pass filters: the low-frequency sinusoid has been filtered out since its frequency does not match the
Frequency Response Methods in Biology
291
frequencies “passed” by the band-pass filter; only the higher frequency noise has been retained (and has been somewhat smoothed by blocking of its highest frequencies). In technology, band-pass filters are implemented to allow downstream processes to access specific frequency ranges within input signals. Biological processes that exhibit band-pass behavior may have been selected to perform similar roles. In any case, the range of frequencies over which a biological band-pass filter acts is likely indicative of the kinds of signals to which it is attuned, and so speaks directly to its function and environment. A third category, the high-pass filter, has essentially the opposite form and effect as a low-pass filter: the frequency response is small for low frequencies, and rises to some maximum for high frequencies. Such filters can, much like low-pass filters, be characterized by the slope on their logscale Bode plot and the corner frequency (here representing the frequency below which the response is attenuated). Returning to our G-protein pathway model (Fig. 10.5), we see that it displays primarily low-pass behavior, with a mild band-pass around 10 3 Hz. The low-frequency (DC) gain is 200 (a ratio of molecules per cell to nM), the bandwidth is 5 10 3 Hz, and the roll-off indicates a relative degree of 2. These values reveal the types of signals to which the pathway responds most strongly, and may provide insight into its evolution or biological function. As we have seen, the gain plot provides a useful nonparametric model for the system, and could be fit using standard nonlinear regression techniques (Seber and Wild, 2003). However, the phase shift data sheds light on the construction of a more useful representation of system behavior: an input– output map known as the system transfer function, that will allow prediction of the system response to arbitrary signals
3. System Identification In this section, we present the method of system identification from data, using simulated data from a stochastic simulation of the G-protein pathway model to illustrate the method. Before discussing the details of implementation, we introduce the notion of the transfer function.
3.1. Transfer function models Most dynamic mathematical models in systems biology are based on knowledge of the kinetics of underlying chemical and genetic mechanisms (Hasty et al., 2001; Ideker et al., 2001; Kitano, 2002; Tyson et al., 2001). Here we take a complementary approach: using only our collection of
292
Jordan Ang et al.
input–response data (and assuming no knowledge of mechanism), we fit a “black box” model that accurately describes the observed input–output behavior. Such a model does not directly serve to provide insight into the mechanisms of the system, but rather provides a concise, predictive representation of its input–output behavior. The model we seek to fit is called a transfer function and takes the form of a ˚ stro¨m and rational function: a ratio of polynomials of a complex variable (A Murray, 2008). The transfer function formalism allows for a simple representation of dynamic input–output relationships. Such dynamic relationships are often captured by differential equations. For example, suppose that for a given system, the response y(t) is related to the input u(t) according to the differential equation d2 d d yðt Þ þ a1 yðtÞ þ a0 yðtÞ ¼ b1 uðtÞ þ b0 uðt Þ: 2 dt dt dt
ð10:1Þ
To avoid working with dynamic equations of this sort, we make use of the Laplace Transform L, which is an operator that maps functions of time to functions of a complex variable s, so that L[y(t)] ¼ Y(s) (Boyce and DiPrima, 2008). The Laplace transform has two key properties that we will use. The first is that it converts derivatives to products: L[(dn /dtn)y(t)] ¼ snY(s) (provided the signal, y, is zeroed and resting at its nominal value at t ¼ 0). This means that the differential equation relating y(t) to u(t) can be converted to a simpler, nondynamic, equation relating their Laplace transforms. With L[u(t)] ¼ U(s), we have s2 Y ðsÞ þ a1 sY ðsÞ þ a0 Y ðsÞ ¼ b1 sU ðsÞ þ b0 U ðsÞ;
ð10:2Þ
which can be simplified to Y ðsÞ ¼
s2
b1 s þ b0 U ðsÞ: þ a1 s þ a0
ð10:3Þ
Letting G(s) ¼ (b1s þ b0) / (s2 þ a1s þ a0), we call G(s) the transfer function for the system. The transfer function provides a simple relationship between the Laplace transforms, but will not be useful unless we can recover information about the original signals. Here the second key property of Laplace transforms can be employed: when evaluated at s ¼ io (where i is the square root of negative one),3 the Laplace transform yields the Fourier transform, which, as
3
Note: most control engineers follow the electrical engineering notation j ¼
pffiffiffiffiffiffiffi 1:
Frequency Response Methods in Biology
293
we have discussed above, is a measure of the frequency content of a signal at frequency o. Thus the equation Y ðioÞ ¼ GðioÞU ðioÞ
ð10:4Þ
relates the content of the input, u, and the output, y, at frequency o. This is precisely the role of the frequency response, and in fact G(io), the value of the transfer function at s ¼ io, is precisely the frequency response of the system (at frequency o). To compare with a Bode plot, we recognize that G(io) is a complex-valued function of o: the gain at o corresponds to the modulus of this complex number, while the phase shift corresponds to its argument.4
3.2. Applying the procedure to the G-protein pathway For this illustrative example, we have generated simulated “data” from a stochastic version of the G-protein pathway model from Yi et al. (2003) (details in Appendix). Our system identification procedure seeks to recover aspects of the model behavior directly from the simulated data. In the following sections, we will present our implementation of the method shown schematically in Fig. 10.9. 3.2.1. Preliminary investigations As a first step, it is often worth performing preliminary investigations using input profiles that are standard in the lab, and therefore relatively easy to produce. These investigations represent a simple but informative "first pass" allowing us to focus on the appropriate amplitude and frequency ranges of the input signal before beginning frequency response analysis, where the necessary inputs are more challenging to produce experimentally. We begin our analysis of the system by determining the dose–response behavior for our chosen input and output. Figure 10.10 shows the results of exposing the system to a range of input values and determining the corresponding steady-state output value. As in Section 2, the input is the concentration of the alpha factor ligand and the output is the cellular abundance of the species Ga-GTP. From this graph, we can identify a roughly linear range of behavior between saturation at low and high input values. We chose to center our linear analysis around the input level 1 nM, as indicated in the figure. The corresponding nominal output value is a concentration of Ga-GTP of about 450 molecules per cell. Outside of the surrounding linear range, nonlinear effects become appreciable; these effects are also exposed in the frequency response analysis detailed below. We will 4
pffiffiffiffiffiffiffiffiffiffiffiffiffiffi For a complex number x ¼ a þ bi, the modulus is jxj ¼ a2 þ b2 while the argument is arg(x) ¼ arctan (b / a). The argument of a complex number is sometimes referred to as its phase.
294
Jordan Ang et al.
Preliminary investigations
Probe system with oscillatory input + collect response data
Preprocess response data
Fourier filter response data at input frequency to determine system gain and phase shift
Repeat for various input frequencies
Experimental design
Create experimental bode plot
Choose linear transfer function model structure
Fit transfer function to bode plot No
Good fit? Yes
Determine the form of a nonlinear rectifier (if necessary)
Model validation No
Model ok? Yes
Figure 10.9 Schematic flow chart of the sequence of steps for system identification via frequency response. The dashed box indicates steps that should be repeated over a range of frequencies in order to capture the principle features of the Bode plot. The feedback loops on the left side illustrate the iterative nature of this method. If the estimated model is not satisfactory, one should first look to different (higher order) transfer model structures. If a satisfactory model still cannot be found, then the overall experimental design may need to be modified in order to capture more detailed or more accurate information.
295
Abundance of active Ga-GTP (molecules per cell)
Frequency Response Methods in Biology
900 800 700 600 500 400 300
1 nM nominal input
200 100 0 10−3
10−2
10−1
100 101 Alpha factor (nM)
102
103
Figure 10.10 Simulated G-protein pathway dose–response curve. To avoid regimes of saturation, we use 1 nM as the nominal dosage level about which to estimate a linear transfer function model.
discuss the linearity of the outputs after we have examined the system’s varying-frequency behavior. We then begin an exploration of the dynamic behavior of the system by exposing it to an input rectangular “pulse” (i.e., a step up followed by a step down). Using the knowledge gained from the dose–response analysis, we center our input values about the chosen nominal input. We observe an exponential rise when the input increases, and then exponential relaxation when the input is removed (Fig. 10.11). This exponential behavior is typical of biochemical and genetic networks. Some networks might display significant delay in the output response (indicating a high relative degree, see Section 2.2), or even damped oscillatory “ringing” rather than a monotonic rise and fall. (Such ringing is indicative of a strongly resonant frequency, as in Fig. 10.6, and would suggest that careful attention be paid to the frequency response near the frequency of the ringing.) In the case of exponential rise and fall, we are able to extract information about the timescale on which the system acts from the rise time (how long the system takes to reach its steady-state value after the step input is applied) and the decay time (how long the system takes to return to its initial level once the step input jumps back down to zero). In Fig. 10.11, the rise time shows that the pathway can respond to changes in the input signal on timescales on the order of tens of seconds, while the decay time is on the order of hundreds of seconds. Thus, to probe the system effectively, we must use oscillatory signals with periods that span this range, and ideally
296
Jordan Ang et al.
Alpha factor
(nM)
1.5
1
0.5
0
0
500
1000
1500
2000
2500
3000
3500
4000
3000
3500
4000
Active Ga-GTP
900 Molecules per cell
800 700 600 500 400 300
0
500
1000
1500
2000 Time (s)
2500
Figure 10.11 A rectangular pulse input (top plot) and the stochastically simulated output response (bottom plot, sampled points) for the G-protein pathway. The rise and fall times in the response provide a general idea of the system’s timescale.
extend to significantly shorter and longer timescales as well. In the analysis to follow, we will use input signals with periods between 15 and 32,000 s (frequencies of 3.125 10 5 to 0.066 Hz). Of course in a real implementation, experimental limitations will likely provide constraints on the range of timescales that can be effectively probed. 3.2.2. Experimental design, data preprocessing, and the experimental oscillatory response We exposed the G-protein pathway model to sinusoidal oscillatory input signals over a range of frequencies (3.125 10 5 to 0.066 Hz), and observed the output response function at each frequency. In experimental implementations, square waves produced by rapid switching of computercontrolled values are often used as inputs (Hersen et al., 2008; McClean
Frequency Response Methods in Biology
297
et al., 2009; Mettetal et al., 2008), though sinusoids have also been implemented (Bennett et al., 2008; Block et al., 1983; Shimizu et al., 2010). Using sinusoidal inputs allows us to mirror the analysis in Section 2. We discuss the case of general oscillatory inputs in Section 3.2.8. The choice of input amplitude can depend on a number of factors. The signal to noise ratio (SNR) of observations improves with increased amplitude, but increased amplitudes will also more strongly excite nonlinear effects; the Gprotein model system will exhibit linear behavior only over the range of input values indicated in Fig. 10.10. The experimental implementation may also constrain on the range of input amplitudes that can be achieved. Although the frequency response technique discussed here is used to estimate a linear model, it may not be possible to avoid exciting nonlinearities at experimentally realizable input amplitudes; such nonlinearities may be addressed by incorporating a nonlinear rectifier into the model, as presented in Section 3.2.6. We chose to probe our model system with sinusoids of amplitude half the nominal input value (i.e., alpha factor concentrations oscillating between 0.5 to 1.5 nM) When generating a time-series from the output response, note that the sampling frequency should be larger than twice the probing frequency, thus satisfying the Nyquist limit (Haykin and Van Veen, 2005) in order to avoid aliasing effects.5 Noise and nonlinearities will introduce content in the response at frequencies other than the driving frequency, but as long as the SNR is adequate, any aliasing resulting from higher frequency components should be negligible. We sampled the output response of our model system with 100 time points over the period of the input signal for the lowest frequency inputs (periods of 500 to 32,000 s), 50 time points over the input period for midrange frequencies (62.5 to 250 s), and 10 time points over the input period for high frequencies (15.625 to 31.25 s). A few more comments on data preprocessing are called for. As discussed in Section 2, for frequency response analysis, it is important to consider only the steady-state portion of the response, and not its initial transient. Once steady-state behavior is attained, it is beneficial to sample from multiple periods, to improve the SNR (Lipan and Wong, 2005); in essence, each period acts as a replicate. In our example, we waited for the system to reach the desired steady-state behavior (where the nature of the output oscillations is consistent and well established) and then sampled time-series whose length ranged from a single complete period for the lowest frequency responses (reflecting the difficulties of running long time-series experiments) to 10 complete periods for the highest frequencies. For convenience, the start time of our sampled steady-state data corresponds to the beginning of a sinusoidal cycle in the input signal. 5
Aliasing occurs when there are not enough data points during a period of a signal to faithfully represent that signal’s shape. For example, consider sin(t) sampled at t ¼ 0, p, 2p, . . ., leading to the mistaken observation of a constant signal.
298
Jordan Ang et al.
Period = 16,000 s
Period = 32,000 s
Molecules/cell
nM
1.5
1.5
1
1
1
0.5
0.5
0.5
1000
1000
1000
500
500
500
0
0
0
1
2
3 ×10
4
0 0
Period = 4000 s
nM Molecules/cell
0
1
1
0.5
0.5
1000
1000
1000
500
500
500
0
0
4000
6000
8000
0
1000
2000
3000
4000
0
1
1
0.5
0.5
1000
1000
1000
500
500
500
0
0
500
1000
0
500
1000
0
1
1
1
0.5
0.5
1000
1000
1000
500
500
500
200
250
10 0
20 0
30 0
40 0
500
0
0 150
Time (s)
2000
Period = 15.625 s
0.5
100
1500
1.5
1.5
50
0
Period = 31.25 s
Period = 62.5 s 1.5
1000 Period = 125 s
1
0
500
1.5
0.5
0
0
Period = 250 s 1.5
0
15,000
10,000
Period = 1000 s
1
2000
5000
1.5
Period = 500 s
nM
3 4 ×10
0.5
1.5
Molecules/cell
2
1.5
0
nM
1
Period = 2000 s
1.5
Molecules/cell
Period = 8000 s
1.5
0
100
200 Time (s)
300
0
50
100
150
Time (s)
Figure 10.12 Sinusoidal inputs of different frequencies (top panel plots, smooth curves) and their corresponding stochastically simulated output responses (bottom panel plots, sampled points) for the G-protein pathway model.
Finally, if necessary, one should also remove any obvious trends from the output data (e.g., linear trends caused by experimental drift), as well as obvious outliers in the data that are likely to arise from momentary instrument noise rather than the actual dynamics of the system. Figure 10.12 shows the results of sampling the output from our model system. 3.2.3. Fourier filtering and the experimental frequency response Unlike the idealized model in Section 2, the observed output responses here, like real experimental data, are not pure sinusoids, since they have been corrupted by nonlinearities and noise. We can isolate the contribution
299
Frequency Response Methods in Biology
that corresponds solely to the input (driving) frequency by determining the corresponding coefficient in the Fourier series expansion of the ^ of the observed output response time-series. The Fourier coefficient , R, observed output response, R(t), at the angular input frequency o, is described by ^ðoÞ ¼ 2 R nT
ð nT
eiot RðtÞdt;
ð10:5Þ
0
where T ¼ 2p / o is the period of the input signal. Here, t ¼ 0 represents the beginning of our steady-state response data (recall that we had synced this to the beginning of a sinusoidal cycle in the input signal), and n denotes the number of full steady-state periods sampled. ^ðoÞ, whose The result of this integration is a complex number, R ^ modulus jRðoÞj is the output amplitude at frequency o, and whose argu^ðoÞ , is 90 degrees below the output phase at frequency o. ment, arg R Since Ro(t) is represented as a discrete time-series, Eq. (10.5) is evaluated by means of numerical integration. Numerical computing packages (such as MATLAB, Mathematica, and Maple) are equipped with the algorithms for handling such a problem, and a simple strategy such as the midpoint method can be easily implemented in, for example, Python. For our G-protein example, we have used the MATLAB function trapz, which implements a trapezoidal numerical integration algorithm. The system gain and phase shift is then calculated as in Section 2, as the ratio of the output to input amplitude and the difference between the output and input phase, respectively. The individual data points in Fig. 10.13 show the result of applying this analysis to the G-protein pathway model. Note that this is in close agreement with the idealized results in Fig. 10.5. 3.2.4. Fitting the transfer function model We now seek to go beyond the graphical representation provided by the Bode plots to construct a predictive model of the system’s input–output behavior. As discussed in Section 3.1, given the form of a rational function as in Eq. (10.2), we search for values for the coefficients ai and bi so that the modulus of G(io) and the argument of G(io) match the gain and phase shift observations at each frequency o. Before any fitting can be attempted, we must specify the form of the transfer function, that is, the degree of the numerator and denominator polynomials in Eq. (10.2). A transfer function model of the linear behavior of an autonomous biochemical or genetic network will take the form of a rational function in which the order of the numerator polynomial is less than the order of the
300
Jordan Ang et al.
Gain (log scale)
3 2 1 0 −1 10−6
10−5
10−4
10−3
10−2
10−1
100
10−5
10−4
10−3 Frequency (Hz)
10−2
10−1
100
45
Phase (deg)
0 −45 −90 −135 −180 10−6
Figure 10.13 Bode plot for the G-protein pathway determined by Fourier filtering the stochastically simulated sinusoidal response data (discrete data points) plotted alongside the fitted curve generated by the estimated linear model (solid curve).
denominator polynomial. The difference in the orders of these two polynomials is called the relative degree of the model, and corresponds, as noted earlier, to the slope of the roll-off on the gain plot. The order of the denominator polynomial is referred to as the order of the system; it represents, in some sense, a measure of the complexity of the system, and corresponds roughly to the number of corners or “kinks” in the Bode plots. (Referring back to the filters in Fig. 10.8, the two low-pass filters are first order systems—one corner—while the band-pass filter is a second order system—two corners). It is often the case that a higher order system is well approximated by a lower order model, so while the number of visible corners may not correspond directly to the order of the system, it gives a good indication of the minimal order required to provide a decent fit to the frequency response. The gain Bode plot for our G-protein example (Fig. 10.13) has a roll-off slope of approximately 2, indicating that we should choose a transfer function of relative degree 2. A simple transfer function of this sort has numerator of degree 1 and a denominator of degree 3:
301
Frequency Response Methods in Biology
GðsÞ ¼
s3
b2 s þ b0 : þ a2 s2 þ a1 s þ a0
ð10:6Þ
The solid curve in Fig. 10.13 shows a fit of this model to the frequency response data, with parameter values of: a2 ¼ 2.0870 10 1, a1 ¼ 2.7659 10 3, a0 ¼ 3.0898 10 6, b1 ¼ 1.0914, b0 ¼ 7.6371 10 4. (We used the MATLAB function fminsearch to perform the fit, with a least squares error. Other nonlinear optimization routines, surveyed, e.g., by Moles et al. (2003) could also be used). We also fitted a fourth order model, but found little improvement in the fit. (Note that the model used to generate our “data” is in fact fourth order. Our analysis shows that a third order model provides a good approximation to the system.) 3.2.5. Simulating model trajectories The system response (beginning at the nominal steady state) can be simulated by evaluating an integral involving the system’s unit impulse response and the input function. The unit impulse response, g(t), is the system’s transient output in response to a unit impulse input at t ¼ 0, and can be found by applying the inverse Fourier transform to the frequency response, G(io). Specifically, the response y(t) can be calculated as the convolution of g(t) with an input function u(t), defined by yðtÞ ¼
ðt
uðtÞgðt tÞ dt:
ð10:7Þ
0
In practice, g(t) can be evaluated numerically from G(io) via the Inverse Fast Fourier Transform (IFFT). The IFFT routine, along with the Fast Fourier Transform (FFT) are standard components of numerical packages like MATLAB (as are routines evaluating convolution integrals). Alternatively, one can apply an inverse Laplace transform to the transfer function (by replacing the algebraic s variable with the time-derivative operator d/dt (compare Eqs. (10.2) and (10.3) to Eq. (10.1)). Output trajectories can then be determined by numerically solving the resulting differential equation. The resulting trajectories correspond to the linear model and so describe displacements from the nominal steady state. The nominal input and output values must be accounted for when comparing output model trajectories to measured data. Figure 10.14 shows the sampled oscillatory outputs alongside the oscillatory behavior predicted by our estimated model. The plots are in agreement, implying that our linear transfer function adequately represents the original oscillatory input–output data set.
302
Jordan Ang et al.
Molecules/cell
Period = 32,000 s
Period = 16,000 s 1000
1000
500
500
500
0
0 0
1
2
3
4 4 ×10
0 0
Molecules/cell
Period = 4000 s
1
2
4 4 ×10
500
500
500
10,000
0
0
1000
2000
3000
4000
0
1000
500
500
500
500
1000
0
500
1000
500
500
500
Time (s)
300
1000
1500
2000
200
400
600
0
0 200
4
Period = 15.625 s
1000
100
500
0
1000
Period = 31.25 s
Period = 62.5 s
0
2
0
0
1000
0
1. 5
Period = 125 s
1000
0
0
Period = 250 s
1000
0
1
Period = 1000 s 1000
5000
0. 5
×10
1000
0
0
Period = 2000 s
Period = 500 s Molecules/cell
3
1000
0
Molecules/cell
Period = 8000 s
1000
0
100
200 Time (s)
300
400
0
50
100
150
200
Time (s)
Figure 10.14 Sinusoidal response outputs from the stochastically simulated G-protein pathway for different input frequencies (sampled points) plotted alongside the output response generated by the estimated linear model (solid curves).
3.2.6. Applying a nonlinear rectifier In many circumstances, stronger nonlinear effects will cause discrepancies between the linearly modeled and measured data sets. If such is the case, it may be possible to address such effects after the fact, by appending a nonlinear element to the linear transfer function model. The method we present here attempts to correct for nonlinear effects by passing the linear model output through a static nonlinear rectifier. This linear–nonlinear cascade structure is referred to as a Wiener model, and is shown in block diagram form in Fig. 10.15, where the rectifier is represented mathematically by the static function fNL.6 In order to determine a form for fNL, we construct a scatter diagram that plots the linear model outputs against observations. A best fit trendline will define the function fNL. If the scatter plot indicates perfect agreement between model prediction and data, then there is no need for a nonlinear element. 6
An alternative structure that passes the original input signal through a static nonlinear element before it reaches the linear block is referred to as a Hammerstein model.
303
Frequency Response Methods in Biology
Wiener Input u(t)
Dynamic linear block
ylin(t)
Static non-linear rectifier
Output y(t)
y(t) = fNL(ylin(t))
Figure 10.15 Block diagram representing the structure of the Wiener model which places a static nonlinear block in series following the dynamic linear block (the transfer function model in our case).
The reader may be concerned that the addition of this nonlinear element will alter the model’s frequency response, painstakingly determined from the data. This is not the case. While this nonlinear element will cause lower frequencies to mix into higher frequencies, the driving (lowest) frequency component of an oscillatory input signal will propagate through unchanged. Therefore, measurements at the driving frequency will not be affected (Westwick and Kearney, 2003). The scatter plot for our G-protein example is shown in Fig. 10.16. This plot is populated with data from the oscillatory response simulations. The linear trendline—with a slope nearly equal 1 and a very small offset value of 18 molecules per cell—indicates that, over the range of inputs that were used, the estimated linear model sufficiently reproduces the true oscillatory output. Therefore, no nonlinear correction appears to be necessary in this case. In general, the nonlinearity fNL may be more complex than a linear, or even higher order polynomial correction. For example, in the work of Mettetal et al. (2008), a piece-wise defined function was chosen to compensate for both an offset and the fact that the linear model was providing predictions of negative responses within the input range of interest. 3.2.7. Model validation The final step in the system identification procedure is to compare outputs predicted by the estimated model to experimentally measured outputs for an independent set of data (that is, data that was not used for the purposes of fitting the model). Separating these data will ensure that the actual system has been modeled, rather than the specific output represented in the estimation data set. Comparisons can be done in both the time domain and the frequency domain. In Fig. 10.17, we compare the model and stochastically simulated responses in the time domain of the G-protein pathway to a rectangular pulse input. That they are in good agreement supports the validation that our model accurately captures the general dynamics of the G-protein system. Note that it is also a common engineering practice to perform a variety of statistical tests (particularly residual analysis) on the prediction errors of the estimated model as part of the model validation procedure (Ljung, 1999).
304
Jordan Ang et al.
Active Ga-GTP
1100
y = 1.0169x + − 18.3626 1000
Simulated experimental output
900
800 700
600 500 400
300 200 100 300
350
400
450
500 550 600 Linear model output
650
700
750
Figure 10.16 Scatter plot for the linear model output versus the stochastically simulated output for the G-protein pathway. The plot is populated with data that was sampled during the sinusoidal response simulations. The trendline represents the nonlinear rectifier: in this case a linear fit with a slope 1 and a very small offset of 18.36 molecules per cell indicates no obvious nonlinearities in the oscillatory response data.
3.2.8. Nonsinusoidal inputs In some cases, the generation of purely sinusoidal inputs will not be feasible, and so another oscillatory waveform must be used to excite the system. The system identification procedure can still be followed in this case, with some minor modifications. To illustrate how nonsinusoidal inputs may be used, we have repeated the frequency response analysis of the G-protein pathway model using square wave inputs. The driving frequency of the square wave represents the input frequency. Sampled points from the resulting output are shown in Fig. 10.18. When using nonsinusoidal oscillatory inputs, the amplitude of the input contribution at the driving frequency is not equivalent to the amplitude of
305
Frequency Response Methods in Biology
Alpha factor
(nM)
1.5
1
0.5
0
0
500
1000
2000
2500
3000
3500
4000
3000
3500
4000
Active Ga-GTP
800 Molecules per cell
1500
700 600 500 400
0
500
1000
1500
2000 2500 Time (s)
Figure 10.17 Model validation. A rectangular pulse input (top panel) and the corresponding output response (bottom panel) for the G-protein pathway generated by the estimated linear model (solid curve) and from stochastic simulation (sampled points). The linear model faithfully reproduces the output response.
the waveform itself. Since the input corresponds to a combination of sinusoids at different frequencies, Eq. (10.5) can be used to determine the amplitude of the contribution at the driving frequency. In the case of square waves, this calculation indicates that the amplitude of the overall wave must be scaled by a factor of 4/p in order to recover the contribution of the component at the driving frequency. Using the stochastically simulated square wave data, we performed a new fit to a transfer function model. The result is essentially the same as that obtained from the sinusoidal inputs; Fig. 10.18 shows the behavior predicted by the estimated transfer function model.
3.3. Examples of experimental implementation There is a long tradition of studying biological systems using the techniques of control theory (Bayliss, 1966; Iglesias and Ingalls, 2009; Ingalls et al., 2006; Khoo, 2000; Wiener, 1965). Techniques from system identification,
306
Jordan Ang et al.
Period = 16,000 s
Molecules/cell
nM
1.5 1 0.5 1000 500 0
0
0.5
1
1.5
2
2.5
Period = 2000 s
3 ×104
nM
1.5 1
Molecules/cell
0.5 1000 500 0
0
1000
2000
3000
4000
Period = 250 s nM
1.5 1
Molecules/cell
0.5 1000 500 0
0
200
400
600
800
1000
Period = 31.25 s nM
1.5 1
molecules/cell
0.5 1000 500 0
0
50
100
150 200 Time (s)
250
300
Figure 10.18 Square wave inputs of different driving frequencies (top plots for each period, smooth curves) and their corresponding stochastically simulated output responses (bottom plots for each period, sampled points) as well as the predicted responses from the linear model (bottom plots for each period, smooth curves) for the G-protein pathway.
including frequency response methods, have long been experimentally applied in physiology (Westwick and Kearney, 2003) and in bacterial chemotaxis (Block et al., 1982, 1983; Shimizu et al., 2010), and have
Frequency Response Methods in Biology
307
more recently been applied to signaling cascades in yeast (Bennett et al., 2008; Hersen et al., 2008; Mettetal et al., 2008). 3.3.1. Bacterial chemotaxis Bacteria such as Escherichia coli respond to gradients of attractant or repellent in their environment by altering the proportion of time they spend randomly tumbling versus swimming in a straight line; this shift is implemented by changing the direction of rotation of their flagella, and has the effect of implementing a random walk that is biased in the direction of increasing concentrations of nutrients. Elegant work from Howard Berg’s group has examined the frequency response characteristics of bacterial chemotaxis for decades, initially at the whole-cell level (Block et al., 1982, 1983; Segall et al., 1986) and later targeting specific elements of the intracellular response system (Shimizu et al., 2010; Sourjik et al., 2007); we will touch here on only a small portion of the large body of work on this subject. In whole-cell investigations (Block et al., 1982, 1983; Segall et al., 1986), the experimental protocol consisted of affixing cells to a cover slip by a single flagellum, using a procedure described by Berg and Tedesco (1975). When pinned in this manner, the cells’ flagellar motors caused the entire cell to spin either clockwise (CW; corresponding to the random “tumble” mode of motion in a free-moving cell) or counterclockwise (CCW; corresponding to the linear “run” motion seen in free-moving cells) in response to the ambient chemical environment. This motion could be observed and classified (with particular attention to the timing of transitions between CW and CCW rotation) through phase-contrast microscopy. The intracellular measurement experiments (Shimizu et al., 2010) used Forster resonance energy transfer, (FRET; also known as fluorescence resonance energy transfer) to generate a fluorescence signal proportional to the physical proximity of the labeled proteins; by attaching fluorescent proteins to key players in the chemotaxis signaling system, FRET studies provided an output measurement corresponding to the level of activity of this system (Sourjik et al., 2007). Impulse inputs. In the work of Block et al. (1982), the environment was changed by introducing very brief changes in the concentration of either an attractant or a repellent, through a micropipette: the pulse of chemical stimulation passed across the population of tethered cells as a diffusive wave, providing each cell with a very brief burst of stimulation. This input approximates a mathematical “impulse”: a sharp “spike” of input (the mathematical ideal is infinitely tall but infinitely short; real impulses, such as the chemical pulses used in the work described here, serve as an approximation to this mathematical abstraction). Characterizing the response to an impulse input illustrates an alternative approach to the frequency response method we have discussed above. In ˚ stro¨m and Murray, 2008; Cluett engineering, impulse and step responses (A and Wang, 1991) are used as an alternative to explicitly sweeping the
308
Jordan Ang et al.
frequency of a series of input signals. The basis for this approach is that these signals, by virtue of their sharpness, excite a wide range of frequencies simultaneously since a sharp signal, when broken down into a sum of signals of different frequencies, requires many frequencies in order to represent it accurately. This approach succeeded in (Block et al., 1982) yielding frequency response plots without the experimental complications of generating inputs of varying frequencies; the plots showed a band-pass filtering behavior in the bacterial response to the chemical attractants and repellents, with a peak response near 0.25 Hz. More generally, it is not clear that impulse or step responses will always be sufficient to characterize other biochemical systems, especially in terms of gene expression and signal transduction. Using specific input frequencies improves the experimental SNR, allowing one to distinguish responses from noise, and to use the persistent, periodic input to average the response over multiple periods, gathering more data and thus improving SNR. Lipan and Wong (2005) argued that periodic inputs offer significant advantages over step inputs in particular: in addition to noting the SNR improvements, they carried out a theoretical analysis indicating that to obtain a given Fourier component with a desired level of confidence, significantly more experimental replicates of step inputs would be required than of periodic inputs. Further, the theoretical analysis presented by Saeki and Saito (2002) suggested that step inputs are limited in their ability to generate frequency response information, even in systems less complex than biological networks. The successful application of impulse responses by Block et al. (1982) to derive a full frequency response plot is impressive, and demonstrates that this approach is at least sometimes feasible in a biochemical response context. The “digital” nature of the output signal employed by Block et al. (1982) (only one of two outcomes is observed for any given cell, CW or CCW rotation, rather than a continuum of possible values) may have helped reduce potential problems with the SNR, since noise inside the system would be seen in the output only if it was large enough to flip the response from one rotational state to another. Exponentiated sine inputs. In the work of Block et al. (1983), computercontrolled pumps were used to generate varying-frequency input signals, similar to the sinusoidal inputs discussed in Sections 2 and 3.2, but in this case implemented as exponential functions of the form (esin ot), superimposing a periodic input onto an exponential ramp. The same work (Block et al., 1983) showed that an exponential ramp input caused the switching rate of the flagellar motor to reach and remain at a steady state over time; this motivated the choice of an exponentiated sine wave as the frequency-varying input for this system. Using such signals, they obtained a Bode plot spanning frequencies from 10 3 to 0.05 Hz, showing high-pass filtering behavior with a slope on the log–log plot of þ2.
Frequency Response Methods in Biology
309
Another study (Shimizu et al., 2010) again employed input signals of the exponentiated sine form, and used FRET (Sourjik et al., 2007) to monitor the intracellular interactions of proteins CheY-YFP and CheZ-CFP. Phosphorylated CheY binds to the flagellar motor to affect its rotational behavior, while CheZ dephosphorylates CheY; the FRET signal, by reporting how many CheY–CheZ pairs were close to one another, thus provided a measurement of the output of the intracellular signaling system underlying the bacterial chemotactic response. The Bode plot constructed by sweeping the frequencies of the exponentiated sine inputs showed high-pass filtering behavior, qualitatively similar to that seen by Block et al. (1983) and well predicted by a mathematical model (Shimizu et al., 2010; Tu et al., 2008). A Bode plot relating the time derivative of the input to the output showed low-pass filtering behavior; the bacterial chemotaxis system appears to function as a high-pass filter of the signal itself, and a low-pass filter of its derivative, as previously predicted theoretically (Tu et al., 2008). 3.3.2. Signaling cascades in yeast Three papers using frequency response methods in yeast appeared in 2008. Two effectively simultaneous publications dealt with the high-osmolarity glycerol (HOG) osmo-adaptation pathway (Hersen et al., 2008; Mettetal et al., 2008), while the third (Bennett et al., 2008) addressed the galactose utilization response. Experimental setup. All three papers made use of microfluidic chambers to contain the cells and expose them to varying inputs: changing concentrations of extracellular sorbitol (Hersen et al., 2008), sodium chloride (Mettetal et al., 2008), or glucose (Bennett et al., 2008). In the studies of Hersen et al. (2008) and Mettetal et al. (2008), the flow was driven by pumps, and varying concentrations were obtained by computer-controlled switching of valves connecting the microfluidic chamber to different fluid reservoirs; the input signals generated were approximately square waves. In the work of Bennett et al. (2008), the flow was driven by gravity-induced pressure differences and the input concentration was regulated by a computer-controlled actuators that varied the height of two fluid reservoirs (Bennett and Hasty, 2009; Bennett et al., 2008); the input signals used were sinusoidal in shape. In all three cases, the output signals were observed through fluorescence microscopy: fluorescence of Hog1-YFP, using colocalization with the strictly nuclear protein Nrd1-RFP to isolate and measure only the Hog1-YFP present in the nucleus (Mettetal et al., 2008); fluorescence of Hog1-GFP, using colocalization with Htb2-mCherry to isolate the nuclear Hog1-GFP (Hersen et al., 2008); and whole-cell fluorescence of Gal1-yECFP fusion proteins (Bennett et al., 2008). Time-lapse fluorescence images, with image intensities quantified through various image analysis routines, generated time-series for the output signals in each case; combining these with the known input signal time-series yielded a
310
Jordan Ang et al.
complete input–output response profile for each frequency, and this procedure was repeated across a range of input frequencies.
A (w)
Wild-type 10–1
Low Pbs2 –2
10
10–2
10–1
100
w (rad/min)
101
100
w = 0.0046 Hz
B A
Nuclear hog 1-GFP fluctuations (a.u.)
3.3.2.1. Osmo-adaptation Yeast sense and respond to osmotic pressure through a multistage signaling cascade that terminates in glycerol production, activated when the protein Hog1 is phosphorylated and enters the nucleus. Mettetal et al. (2008) and Hersen et al. (2008) studied the frequency response of this system. The Bode gain plots for the HOG pathway’s response to changing osmotic pressure in the two papers are shown in Fig. 10.19A and B, respectively. At first sight, the plots appear to present different conclusions about the same pathway, showing band-pass filtering behavior in one case, and low-pass filtering in the other. Note, however, the different frequency ranges considered in the studies: Mettetal et al. (2008) considered periods ranging from 2 to 128 min, corresponding to frequencies of 1.3 10 4–8.3 10 3 Hz (shown in radians per minute in Fig. 10.19A: 0.05–3.14 rad/min), while Hersen et al. (2008) considered higher frequencies, ranging from 10 3 to 1 Hz; the two ranges overlap only slightly. The band-pass response seen in the work of Mettetal et al. (2008) (panel A) does not appear in the Hersen et al. (2008) data (panel B) because
80 60 40 20 0 0.001
0.01
0.1
1
Frequency (Hz)
Figure 10.19 The gain portions of two experimental Bode plots for the HOG osmoadaptation system in yeast. (A) Experimental results from Mettetal et al. (2008). The input signals were square waves of extracellular sodium chloride concentrations at varying frequencies, and the output was the fluorescence intensity of Hog-YFP in the nucleus. “Low Pbs2” refers to a yeast strain that expresses lower (than wild type) levels of the Pbs2 protein (an element in the signaling cascade). [From Mettetal et al. (2008). Reprinted with permission from AAAS.] (B) Experimental results from Hersen et al. (2008). The input signals were square waves of extracellular sorbitol at varying frequencies, and the output was the fluorescence intensity of Hog1-GFP in the nucleus. [Copyright 2008, National Academy of Sciences (USA), reprinted with permission.] Note that the results in panel (B) are shown on a linear vertical scale, rather than the log scale used in panel (A) and in the Bode gain plots elsewhere in this review (Figs. 10.5, 10.8, and 10.13). The asymptotic approach to zero is a curve on the linear scale, rather than the straight line seen on a log scale.
Frequency Response Methods in Biology
311
the latter considers frequencies higher than those at which the initial increase in the amplitude response were observed; at the same time, the significantly higher frequency regime examined by Hersen et al. (2008) provides valuable information, showing very clearly that the amplitude response does continue to decline with frequency in a classic low-pass filter pattern. Combined, these two studies provide a complete and consistent picture of the frequency behavior of the HOG pathway: an initial rise in the amplitude response, to a frequency of approximately 10 3 Hz, followed by a steady decline for higher frequencies. The slope of the decline of the amplitude in the log–log plot shown in Fig. 10.19A suggests a relative degree of 1. The authors in Mettetal et al. (2008) chose a transfer function model with this relative degree. The parameters of this model were fit using experimental data, then the same parameters were used to predict the system’s response to a step input, a form of input not used in the fitting. The result was a good match between the model predictions and the experimentally observed response (Mettetal et al., 2008), validating the model and providing a nice illustration of the predictive power of transfer function models. 3.3.2.2. Carbon source utilization When its preferred carbon source, glucose, is not available, yeast can activate genes to utilize galactose instead. Bennett et al. (2008) examined the frequency response of this system was examined. The resulting frequency response showed low-pass behavior. The authors went on to formulate a detailed biochemical kinetic model of the system, and attempted to use it to reproduce the experimentally observed frequency response. They found, however, that their initial model did not provide a good match to the experimental frequency response behavior. Modifying the model to allow the degradation rates of two key components (the mRNA of genes GAL1 and GAL3) to be a function of glucose concentration yielded a much more accurate model, and subsequent experiments confirmed that these mRNA half-lives were in fact glucose-dependent, revealing a level of posttranscriptional regulation that had not previously been known. The ability to compare with a full set of frequency response experiments, rather than, for example, steady-state experiments at a series of fixed glucose concentrations, provided the important ability to isolate which portion of the model needed to be adjusted, and in this case yielded an important new insight into a well-studied biological system.
4. Conclusion The engineering concepts described in this review can serve the systems and synthetic biology communities in a number of ways.
312
Jordan Ang et al.
Some experimental projects may have a frequency response input– output characterization as their end goal. Using the transfer function models described above, such projects can result in the ability to predict a system’s response to arbitrary inputs. That capability is often valuable in itself, and is of particular interest in the engineering-oriented field of synthetic biology (Andrianantoandro et al., 2006; Hasty et al., 2002; Kærn and Weiss, 2006; Kærn et al., 2003; Khalil and Collins, 2010; Purnick and Weiss, 2009), where the design and construction of novel cellular regulatory systems can be guided and informed by the use of frequency response methods. For systems biology purposes, obtaining a frequency response profile using the methods described here will typically be part of a larger effort to understand biochemical and genetic systems at a level beyond their input– output profile. These methods have several things to offer this endeavor. Most basically, the frequency response profile provides insight into the timescales that are most relevant to the system being studied: high or low gains at particular frequency ranges will indicate which timescales the system responds to most strongly, and whether such responses are tuned to any particular frequencies. This information places constraints on the biochemical or genetic mechanisms that underlie the observed response, and thus can guide experimental investigations. When comparing computational or theoretical models to experimental results, a full frequency response profile provides a richer basis for comparison than does steady-state or step-response data. We noted above, for example, that Bennett et al. (2008) used their frequency response data to pin down the source of an experimental mismatch in their modeling work. This is not simply a question of there being more data available to compare: the data is organized in a qualitatively different manner when experimental frequency response information has been obtained, and it allows access to a set of analytical tools that would not apply in the absence of such information. In the long term, researchers may aim to use “black box” models (where inputs elicit outputs by an unknown internal process) provided by frequency response methods as a step toward “white box” models (where the internal mechanism of the system are known; biochemical kinetic models (Conrad and Tyson, 2006; Hasty et al., 2001) are an example), possibly through intermediate “gray box” models (where the black boxes are used to partially populate a model with limited internal details) (Lipan, 2008). Such an approach was adopted by Mettetal et al. (2008), using the structure of the frequency response to guide the construction of a partially filled-in kinetic model of their system. Experimentally, the generation of stable, repeatable, rapidly varying oscillatory inputs remains a challenge, but the barrier for entry is dropping quickly as microfluidic technologies become less expensive and more widely available. To date, identification of frequency response from
Frequency Response Methods in Biology
313
experimental data in biochemical and genetic systems has been carried out using extracellular signals that are conveyed to the cells by changing the concentration of some chemical in the surrounding medium (Bennett et al., 2008; Hersen et al., 2008; Mettetal et al., 2008; Shimizu et al., 2010). Inducing oscillations in internal states that are not coupled to an extracellular signal will be a further challenge, though as Lipan (2008) notes, one path might be to use intracellular regulatory elements that can be induced by the application of photons (Shimizu-Sato et al., 2002), perhaps coupled to other synthetic tunable systems (Grilly et al., 2007). Our ability to apply oscillatory inputs to cells will only grow more sophisticated over time, increasing the number of systems to which frequency response methods can be profitably applied. Finally, we address again an apparently severe limitation of this approach: its applicability only to linear systems (ones for which the response to a sum of two inputs is the sum of the responses to each individual input). Every real physical system is nonlinear, displaying at least saturation behavior, and biological systems are particularly subject to nonlinearities. Engineers must often deal with systems that are highly nonlinear, but this has not prevented them from designing much of the modern world using techniques founded on linear approximations. They make use of the fact that systems are always approximately linear in the vicinity of their steady states (where self-regulating systems spend the bulk of their time); they use nonlinear correction blocks of the type described in Section 3.2.5; and they use the information derived from linear approaches as a starting point for understanding fully nonlinear behavior. In fact, linear approximations are at the core of the study of nonlinear systems (Glendinning, 1994): the first thing generally done when analyzing a nonlinear system is to consider its locally linear behavior. Biologists are faced with the study of the most complex nonlinear dynamic systems ever examined, and they too can benefit from the insight and foundation for future study provided by linear analysis. There can be no doubt that the methods of dynamic analysis we review here will become a valuable part of the biological toolbox.
Appendix We use a model of the G-protein pathway as developed in Yi et al. (2003) as a running example throughout the above discussion. The model has four independent states: free receptor, R; ligand-bound receptor, RL; inactive heterotrimeric G-protein, G; and active Ga-GTP, Ga. In addition, there are three dependent states: free ligand, L; free Gbg, Gbg; and inactive
314
Jordan Ang et al.
Ga-GDP, Gd. The species concentrations are measured in molecules per cell, while ligand concentrations are measured in nanomoles per liter (nM).
A.1. Deterministic model The dynamics are d ½Rðt Þ ¼ kRL ½L ðtÞ½RðtÞ þ kRLm ½RL ðtÞ kRd0 ½RðtÞ þ kRs dt d ½RL ðtÞ ¼ kRL ½L ðtÞ½RðtÞ kRLm ½RL ðtÞ kRd1 ½RL ðt Þ dt d ½GðtÞ ¼ kGa ½RL ðt Þ½GðtÞ þ kG1 ½Gd ðtÞ½Gbgðt Þ dt d ½Gaðt Þ ¼ kGa ½RL ðtÞ½GðtÞ kGd ½Gaðt Þ dt where the G-protein components are constrained by ½GbgðtÞ ¼ ½Gtotal ½GðtÞ ½GdðtÞ ¼ ½Gtotal ½GðtÞ ½Gaðt Þ: The free ligand concentration L(t) is the input to the system. It is assumed that ligand uptake by the cells is fast, and has a negligible effect on the overall concentration.The parameter values used are kRL ¼ 2 106 M1 s1 ; kRLm ¼ 1 102 s1 ; kRd0 ¼ 4 104 s1 kRs ¼ 4ðmolecules per cellÞs1 ; kRd1 ¼ 4 103 s1 kGa ¼ 1 105 ðmolecules per cellÞ1 s1 ; kG1 ¼ 1ðmolecules per cellÞ1 s1 kGd ¼ 4 103 s1 L ðt Þ : variable input; nominal value 1 nM Gtotal ¼ 10; 000ðmolecules per cellÞ
A.2. Stochastic model Since all processes in the deterministic model are individual chemical events, it can be easily converted to a stochastic model. The states are the same, with abundance in molecules (and the system size being a single cell). Again, it is assumed that ligand uptake by the cells is fast, and has a negligible effect on the overall concentration. Thus we retain a molar measure of ligand concentration, and treat its effect as a variation of the effectively first order parameter kRL[L(t)], which has units of s 1.
315
Frequency Response Methods in Biology
The set of reactions representing the model from Yi et al. (2003) is kRL
L þ R Ð RL kRLm
kRs
ðÞ Ð R kRd0
kGa
RL þ G ! Ga þ Gbg þ RL kRd1
RL ! ðÞ kGd
Ga ! Gd kG1
Gd þ Gbg ! G Numbering the reactions r1 through r8 (including reverse reactions), we have corresponding propensities a1 ¼ kRL L ðt ÞRðt Þ; a2 ¼ kRLm RL ðt Þ; a5 ¼ kGa RL ðtÞGðtÞ; a6 ¼ kRd1 RL ðtÞ; a8 ¼ kG1 Gd ðtÞGbgðtÞ Numbering the state (s1, s2, s3, s4) matrix is given as 2 1 1 1 6 1 1 0 S¼6 4 0 0 0 0 0 0
a3 ¼ kRs ; a4 ¼ kRd0 Rðt Þ a7 ¼ kGd Gaðt Þ;
¼ (R, RL, G, Ga), the stoichiometry 1 0 0 0
0 1 0 0
0 0 0 0 1 1 1 0
3 0 07 7 05 1
The species Gbg and Gd are updated according to the conservations GbgðtÞ ¼ Gtotal GðtÞ Gd ðtÞ ¼ Gtotal Gðt Þ GaðtÞ: The model was simulated using Gillespie’s direct method (Gibson and Bruck, 2000; Gillespie, 1977). The resulting numerical output was sampled at uniform time intervals. To mimic nonsystematic measurement noise, we added to each value a white noise term with a coefficient of variation (CV, the standard deviation divided by the mean) of 0.2.
316
Jordan Ang et al.
REFERENCES Andrianantoandro, E., Basu, S., Karig, D. K., and Weiss, R. (2006). Synthetic biology: New engineering rules for an emerging discipline. Mol. Syst. Biol. 2, 2006–2028. ˚ stro¨m, K. J., and Murray, R. M. (2008). Feedback Systems: An Introduction for Scientists A and Engineers. Princeton University Press, Princeton, NJ. Bayliss, L. E. (1966). Living Control Systems. W. H. Freeman, San Francisco, CA. Beebe, D. J., Mensing, G. A., and Walker, G. M. (2002). Physics and applications of microfluidics in biology. Annu. Rev. Biomed. Eng. 4, 261–286. Bennett, M. R., and Hasty, J. (2009). Microfluidic devices for measuring gene network dynamics in single cells. Nat. Rev. Genet. 10, 628–638. Bennett, M. R., Pang, W. L., Ostroff, N. A., Baumgartner, B. L., Nayak, S., Tsimring, L. S., and Hasty, J. (2008). Metabolic gene regulation in a dynamically changing environment. Nature 454, 1119–1122. Berg, H. C., and Tedesco, P. M. (1975). Transient response to chemotactic stimuli in Escherichia coli. Proc. Natl. Acad. Sci. USA 72, 3235–3239. Block, S. M., Segall, J. E., and Berg, H. C. (1982). Impulse responses in bacterial chemotaxis. Cell 31, 215–226. Block, S. M., Segall, J. E., and Berg, H. C. (1983). Adaptation kinetics in bacterial chemotaxis. J. Bacteriol. 154, 312–323. Boyce, W. E., and DiPrima, R. C. (2008). Elementary Differential Equations. Wiley, New York, NY. Cluett, W. R., and Wang, L. (1991). Modelling and robust controller design using step response data. Chem. Eng. Sci. 46, 2065–2077. Conrad, E. D., and Tyson, J. J. (2006). Modeling molecular interaction networks with nonlinear ordinary differential equations. In “System Modeling in Cellular Biology,” (Z. Szallasi, V. Periwal, and S. Jorg, eds.), pp. 97–123. MIT Press, Cambridge, MA. Gibson, M. A., and Bruck, J. (2000). Efficient exact stochastic simulation of chemical systems with many species and many channels. J. Phys. Chem. A 104, 1876–1889. Gillespie, D. (1977). Exact stochastic simulation of coupled chemical reactions. J. Phys. Chem. 81, 2340–2361. Glendinning, P. (1994). Stability, Instability, and Chaos: An Introduction to the Theory of Nonlinear Differential Equations. Cambridge University Press, Cambridge, UK. Grilly, C., Strickler, J., Pang, W. L., Bennett, M. R., and Hasty, J. (2007). A synthetic gene network for tuning protein degradation in Saccharomyces cerevisiae. Mol. Syst. Biol. 3, 127. Hasty, J., McMillen, D. R., Isaacs, F., and Collins, J. J. (2001). Computational studies of gene regulatory networks: In numero molecular biology. Nat. Rev. Genet. 2, 268–279. Hasty, J., McMillen, D. R., and Collins, J. J. (2002). Engineered gene circuits. Nature 420, 224–230. Haykin, S., and Van Veen, B. (2005). Signals and Systems. Wiley, New York, NY. Hersen, P., McClean, M. N., Mahadevan, L., and Ramanathan, S. (2008). Signal processing by the HOG MAP kinase pathway. Proc. Natl. Acad. Sci. USA 105, 7165–7170. Ideker, T., Galitski, T., and Hood, L. (2001). A new approach to decoding life: Systems biology. Annu. Rev. Genomics Hum. Genet. 2, 343–372. Iglesias, P. A., and Ingalls, B. P. (eds.), (2009). Control Theory and Systems Biology, MIT Press, Cambridge, MA. Ingalls, B. P. (2004). A frequency domain approach to sensitivity analysis of biochemical networks. J. Phys. Chem. B 108, 1143–1152. Ingalls, B. P., Yi, T.-M., and Iglesias, P. A. (2006). Using control theory to study biology. In “System Modeling in Cellular Biology,” (Z. Szallasi, V. Periwal, and S. Jorg, eds.), pp. 243–267. MIT Press, Cambridge, MA.
Frequency Response Methods in Biology
317
Kærn, M., and Weiss, R. (2006). Synthetic gene regulatory systems. In “System Modeling in Cellular Biology,” (Z. Szallasi, J. Stelling, and V. Periwal, eds.), pp. 269–295. MIT Press, Cambridge. Kærn, M., Blake, W., and Collins, J. J. (2003). The engineering of gene regulatory networks. Annu. Rev. Biomed. Eng. 5, 179–206. Khalil, A. S., and Collins, J. J. (2010). Synthetic biology: Applications come of age. Nat. Rev. Genet. 11, 367–379. Khoo, M. C. K. (2000). Physiological Control Systems: Analysis, Simulation, and Estimation. IEEE Press, New York, NY. Kitano, H. (2002). Computational systems biology. Nature 420, 206–210. Lipan, O. (2008). Enlightening rhythms. Science 319, 417–418. Lipan, O., and Wong, W. H. (2005). The use of oscillatory signals in the study of genetic networks. Proc. Natl. Acad. Sci. USA 102, 7063–7068. Ljung, L. (1999). System Identification: Theory for the User. Prentice Hall, Saddle River, NJ. McClean, M. N., Hersen, P., and Ramanathan, S. (2009). In vivo measurement of signaling cascade dynamics. Cell Cycle 8, 373–376. McCudden, C. R., Hains, M. D., Kimple, R. J., Siderovski, D. P., and Willard, F. S. (2005). G-protein signaling: Back to the future. Cell. Mol. Life Sci. 62, 551–577. Mettetal, J. T., Muzzey, D., Go´mez-Uribe, C., and van Oudenaarden, A. (2008). The frequency dependence of osmo-adaptation in Saccharomyces cerevisiae. Science 319, 482–484. Moles, C. G., Mendes, P., and Banga, J. R. (2003). Parameter estimation in biochemical pathways: A comparison of global optimization methods. Genome Res. 13, 2467–2474. Oldham, W. M., and Hamm, H. E. (2008). Heterotrimeric G protein activation by Gprotein-coupled receptors. Nat. Rev. Mol. Cell Biol. 9, 60–71. Purnick, P. E. M., and Weiss, R. (2009). The second wave of synthetic biology: From modules to systems. Nat. Rev. Mol. Cell Biol. 10, 410–422. Saeki, M., and Saito, K. (2002). Estimation of frequency response set from interval data of step response. 41st Conference of the Society of Instrument and Control Engineers (SICE), Vol. 2, Osaka, Japan, pp. 1233–1236. Seber, G. A. F., and Wild, C. J. (2003). Nonlinear Regression. Wiley, New York, NY. Segall, J. E., Block, S. M., and Berg, H. C. (1986). Temporal comparisons in bacterial chemotaxis. Proc. Natl. Acad. Sci. USA 83, 8987–8991. Shimizu, T. S., Tu, Y., and Berg, H. C. (2010). A modular gradient-sensing network for chemotaxis in Escherichia coli revealed by responses to time-varying stimuli. Mol. Syst. Biol. 6, 382. Shimizu-Sato, S., Huq, E., Tepperman, J. M., and Quail, P. H. (2002). A light-switchable gene promoter system. Nat. Biotechnol. 20, 1041–1044. Sourjik, V., Vaknin, A., Shimizu, T. S., and Berg, H. C. (2007). In vivo measurement by FRET of pathway activity in bacterial chemotaxis. Methods Enzymol. 423, 365–391. Tu, Y., Shimizu, T. S., and Berg, H. C. (2008). Modeling the chemotactic response of Escherichia coli to time-varying stimuli. Proc. Natl. Acad. Sci. USA 105, 14855–14860. Tyson, J. J., Chen, K., and Novak, B. (2001). Network dynamics and cell physiology. Nat. Rev. Mol. Cell Biol. 2, 908–916. Westermark, P. O., Welsh, D. K., Okamura, H., and Herzel, H. (2009). Quantification of circadian rhythms in single cells. PLoS Comput. Biol. 5, e1000580. Westwick, D. T., and Kearney, R. E. (2003). Identification of Nonlinear Physiological Systems. IEEE Press, Wiley, Piscataway, NJ. Wiener, N. (1965). Cybernetics: Or the Control and Communication in the Animal and the Machine. The MIT Press, Cambridge, MA. Yi, T.-M., Kitano, H., and Simon, M. I. (2003). A quantitative characterization of the yeast heterotrimeric G protein cycle. Proc. Natl. Acad. Sci. USA 100, 10764–10769.