Molecular Cell
Review Emerging Principles of Gene Expression Programs and Their Regulation Scott D. Pope1 and Ruslan Medzhitov1,* 1Howard Hughes Medical Institute, Department of Immunobiology, Yale University School of Medicine, New Haven, CT 06510, USA *Correspondence:
[email protected] https://doi.org/10.1016/j.molcel.2018.07.017
Many mechanisms contribute to regulation of gene expression to ensure coordinated cellular behaviors and fate decisions. Transcriptional responses to external signals can consist of many hundreds of genes that can be parsed into different categories based on kinetics of induction, cell-type and signal specificity, and duration of the response. Here we discuss the structure of transcription programs and suggest a basic framework to categorize gene expression programs based on characteristics related to their control mechanisms. We also discuss possible evolutionary implications of this framework.
Introduction
Control of gene expression plays a key role in a wide variety of core biological processes, ranging from organismal development and cell differentiation to cellular stress responses, tissue homeostasis, and immunity. Much progress has been made in the past 20 years in characterizing the molecular mechanisms of transcription, the role of chromatin and its modifications, and the function of enhancers and non-coding RNAs (Andersson et al., 2015; Heinz et al., 2015; Pombo and Dillon, 2015; Soshnev et al., 2016; Weake and Workman, 2010). Important new insights have also been gained through the quantitative and computational analyses of gene expression, the role of transcriptional noise, and oscillation (Kim et al., 2009; Levchenko and Nemenman, 2014; Raser and O’Shea, 2005; Yosef and Regev, 2011). New technologies, including ATAC-seq, single-cell RNA-seq, and Hi-C (Buenrostro et al., 2013; Fullwood et al., 2009; Lieberman-Aiden et al., 2009; Macosko et al., 2015; Mumbach et al., 2016; Picelli et al., 2014), now provide an increasingly detailed understanding of the workings of the genome and the complexity of gene expression patterns across multiple cell types in different tissues.
These developments also raise new questions regarding both the mechanistic aspects of gene regulation and genome function, as well as the overall logic of transcriptional programs and general patterns of regulation of gene expression. The cells of the immune system are especially well suited to address these questions. Macrophages, in particular, provide an excellent model system to study developmental and inducible gene expression (Smale and Natoli, 2014). Indeed, the macrophage is one of the most functionally versatile cell types, with an extraordinarily broad functional repertoire: they play essential roles in host defense from infections, apoptotic cell removal, tissue homeostasis and repair, development, and metabolism (Davies et al., 2013; Wynn et al., 2013). This functional diversity is in large part enabled by transcriptional programs that can be engaged in macrophages in response to specific developmental, homeostatic, and environmental cues. Several themes have emerged from the studies of gene expression in macrophages, which are broadly applicable to most cell types: thus, the inducible transcriptional responses can be highly complex, involving, in some cases, many hundreds of activated and repressed
genes. These genes typically belong to multiple transcription programs. A transcription program can be defined as a set of functionally related and co-regulated genes, analogous to bacterial operons. The individual transcription programs differ in the kinetics of their induction and can be regulated independently from each other by a dedicated combination of transcription factors (TFs). Transcription programs inducible in one cell type can be constitutively expressed in another cell type (or another tissue). Similarly, a given transcription program can be either inducible or constitutive in the same cell type, depending on the tissue where the cells reside. This can complicate the distinctions between “cell activation” versus “differentiation,” and between “cell types” versus “cell states,” especially when some of the gene products of these transcription programs are used as phenotypic markers to define different cell types and tissue-specific sub-types. Additionally, macrophages, like many other cell types, can undergo sustained functional alterations, referred to as functional “polarization” to distinguish it from the more transient “activation.” Given this apparent complexity of gene regulation programs, there is an increasing need for a simplified perspective that can be used as a framework to characterize different patterns of gene regulation. Here, we will discuss some emerging principles of gene regulation, focusing on the examples provided by studies of macrophages. However, in the hope of making the discussion more broadly applicable, we will focus on the general themes at the expense of many details. Several excellent comprehensive reviews on different aspects of gene regulation in macrophages are available for interested readers (Glass and Natoli, 2016; Link et al., 2015; Smale et al., 2013, 2014).
Transcription Factors Can Be Grouped into Four Functional Classes
DNA-binding transcription factors (TFs) are key regulators of gene expression.
Although their structural and functional heterogeneity makes it difficult to group them into distinct categories, for the purpose of this review we will use some simplified generalizations to define four classes of TFs. Class-A TFs control expression of housekeeping genes. By definition, these TFs are expressed in most or all cell types.
Molecular Cell 71, August 2, 2018 © 2018 Elsevier Inc. 389
Prototypic examples of this class are the SP1, YY1, NRF-1 (Nuclear respiratory factor-1), and ELF families of TFs (Curina et al., 2017; Gordon et al., 2006; Kaczynski et al., 2003; Scarpulla, 2002). These TFs tend to operate on “open promoters” (i.e., promoters with TF binding sites not occluded by nucleosomes) (Cairns, 2009), which are accessible in most cell types. In vertebrates, these promoters usually have high CG content and are referred to as “CpG island” promoters (Deaton and Bird, 2011). Class-B TFs are signal-dependent TFs that are also broadly expressed, but unlike Class-A TFs, they are present in a latent (inactive) form in unstimulated cells. Activation of Class-B TFs is regulated by specific signals. Upon activation, these TFs quickly activate or repress their target genes. Examples of Class-B TFs include SRF, CREB, HSF-1, NRF-2, HIF-1α, NF-κB, STATs, SMADs, p53, and many nuclear receptors. These TFs are activated by intracellular sensors (e.g., ROS sensor KEAP1 controls NRF-2, and oxygen sensor PHD2 controls HIF-1α), by plasma membrane receptors (e.g., Toll-like receptors [TLRs] control NF-κB activation), or by specific ligands (e.g., lipids and steroid hormones for nuclear receptors). The genes induced by Class-B TFs are so-called “primary response genes” (PRGs): their transcriptional induction does not require new protein synthesis because Class-B TFs are pre-made (Fowler et al., 2011). In contrast, Class-C TFs are not pre-made. Their expression must be induced by Class-B TFs, and hence, Class-C TFs are PRGs (Fowler et al., 2011). The target genes of Class-C TFs are so-called “secondary response genes” (SRGs) because the transcriptional induction of these target genes depends on new protein synthesis (which is required to make Class-C TFs). Class-C TFs can be further divided into three subclasses.
Class-C1 TFs are transcriptionally induced in most cell types by a broad range of signals and they tend to regulate a very broad variety of target genes. Examples of Class-C1 TFs include FosB, c-Fos, JunB, c-Myc, and EGR1-3. The scope of the genes regulated by these TFs depends both on their affinity to specific DNA sequences in the promoters/enhancers of target genes and on their expression level (Fowler et al., 2011). One way to think of Class-C1 TFs is that they function as amplifiers of transcription programs, as was suggested for the action of c-Myc (Lin et al., 2012; Nie et al., 2012). Class-C2 TFs are transcriptionally induced by specific signals and regulate expression of a smaller group of genes specialized for a particular function. Examples of Class-C2 TFs are the E2F family TFs, which are induced by mitogenic signals and regulate the cell-cycle genes, and CHOP, which regulates the ER stress response. Class-C3 TFs are transcriptionally induced only in specific cell types, where they regulate inducible expression of specialized gene programs unique to these cell types. Examples of Class-C3 TFs in macrophages include C/EBPδ (Litvak et al., 2009), Irf4 (Satoh et al., 2010), and Klf4 (Liao et al., 2011). Thus, whereas Class-C1 TFs are more concerned with the amplitude of the transcriptional response, Class-C2 and C3 TFs are more concerned with the specificity. As such, these three subclasses of Class-C TFs can work in combinations to control both the magnitude and the specificity of the transcriptional response. In addition, Class-C TFs often cooperate with Class-B TFs in controlling cell-type-specific inducible gene expression (Litvak et al., 2009). In these cases, promoters or enhancers of the target genes contain both Class-B and Class-C TF binding sites. Finally, Class-D TFs are lineage-restricted TFs that control cell differentiation and expression of cell-type-specific genes. Examples of this class of TFs include MyoD (myocyte differentiation) (Arnold and Braun, 1996), Pax5 (B cell differentiation) (Urbánek et al., 1994), and Pu.1 (macrophage differentiation) (Holmberg and Perlmann, 2012). Class-D TFs usually work in antagonistic pairs, whereby two Class-D TFs expressed in a progenitor cell suppress expression of each other, thereby both promoting one cell fate and suppressing the alternative cell fate (Graf and Enver, 2009). Class-D TFs act on specific promoters and enhancers, making their associated target genes constitutively active in specific cell types (Ghisletti et al., 2010). They can also make cell-type-specific enhancers accessible to Class-B and Class-C TFs, making them inducible in a cell-type-specific manner (Ghisletti et al., 2010). This classification of DNA-binding TFs can be extended to transcriptional co-activators or co-repressors, which can be: ubiquitous, TF-A-like (e.g., CBP/P300); signal-dependent, TF-B-like (e.g., β-catenin, Yap, and Taz); inducible and cell-type specific, TF-C-like (e.g., IκBζ and Bcl3); or constitutive, cell-type specific, TF-D-like (e.g., OCA-B, FOG, and CRTC1). Combinations of co-activators/co-repressors and DNA-binding TFs that belong to different classes afford further flexibility to transcriptional control mechanisms (Rosenfeld et al., 2006; Spiegelman and Heinrich, 2004). Finally, it should be noted that while we have grouped the TFs here into discrete categories for clarity, a more realistic picture is that of a continuum of expression or activation patterns. Thus, it is more accurate to think of “broad” versus “restricted” expression patterns.
Likewise, there are usually different degrees of expression and induction for specific members of each class of TFs. Nevertheless, the simplified discrete picture helps to conceptualize the different functional classes of TFs and their co-activators/co-repressors. The same qualification applies to different patterns of gene expression, which we discuss next.
Patterns of Gene Expression
In multicellular organisms, basic patterns of gene expression can be defined based on the following two criteria: first, gene expression can be constitutive or inducible; second, gene expression can be ubiquitous or cell-type specific. This results in four categories of genes, defined by their expression pattern: ubiquitous constitutive genes (UCG), ubiquitous inducible genes (UIG), cell-type-specific constitutive genes (SCG), and cell-type-specific inducible genes (SIG) (Figure 1). These are idealized discrete categories, and in reality, many of these characteristics are continuous rather than discrete. This simplified view can be informative, however, as several functional characteristics go along with this classification:
(1) UCGs are mostly housekeeping genes that control core cellular functions operating in most cells. The constitutive expression of these genes is regulated by Class-A TFs.
(2) UIGs are genes that are induced quickly on demand in most cell types. Typically, UIGs are the primary response genes. The expression of UIGs is induced by Class-B TFs.
Figure 1. Basic Gene Classification Scheme
(A) Genes can be grouped based on four expression characteristics: constitutive versus inducible and ubiquitous versus cell-type specific. (B) Four classes of transcription factors involved in regulation of the gene categories defined in (A).
UIGs include stress- and inflammation-induced genes, as well as genes involved in metabolic adaptation to the microenvironment. A subset of UIGs are Class-C TFs, which, in turn, control the expression of the secondary response genes.
(3) SCGs are expressed constitutively but only in specific cell types (e.g., neuron- or muscle-specific genes). Their expression pattern is established and maintained by Class-D TFs during cell differentiation into specific lineages. These transcription factors activate cell-type-specific enhancers of SCGs (Blum et al., 2012; Natoli et al., 2011).
(4) SIGs are genes that are inducible but only in specific cell types. Their inducible expression is regulated by a combined effect of Class-D TFs (to control cell-type specificity) and Class-B and Class-C TFs (to control inducibility). Depending on whether the inducible TF is Class-B or Class-C, SIGs can be either primary or secondary response genes. An important characteristic of SIGs is that their transcriptional induction requires chromatin remodeling by Swi/Snf complexes (Ramirez-Carrozzi et al., 2006).
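The two-criterion scheme in items (1) to (4) can be written down as a small lookup table. The following is a minimal sketch, assuming the simplified discrete picture described above; the names `GeneCategory` and `classify` are hypothetical and introduced only for illustration, and the TF classes listed are those that the numbered items associate with each category.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneCategory:
    name: str
    inducible: bool            # False means constitutive
    cell_type_specific: bool   # False means ubiquitous
    tf_classes: tuple          # TF classes named in items (1)-(4) of the text


# Keys are (inducible, cell_type_specific); this is the idealized 2 x 2 grid.
CATEGORIES = {
    (False, False): GeneCategory("UCG", False, False, ("A",)),
    (True, False): GeneCategory("UIG", True, False, ("B",)),
    (False, True): GeneCategory("SCG", False, True, ("D",)),
    (True, True): GeneCategory("SIG", True, True, ("D", "B", "C")),
}


def classify(inducible: bool, cell_type_specific: bool) -> GeneCategory:
    """Return the idealized category for a gene with the given two properties."""
    return CATEGORIES[(bool(inducible), bool(cell_type_specific))]


if __name__ == "__main__":
    for key in CATEGORIES:
        cat = classify(*key)
        print(cat.name, "regulated by Class-" + "/".join(cat.tf_classes), "TFs")
```

Because the real characteristics are continuous rather than discrete, the boolean fields here are a deliberate simplification of the kind the text itself acknowledges.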
These four gene categories are idealized versions of a more nuanced reality, because the characteristics used to define them are relative, rather than absolute: first, a gene that is constitutively expressed can still be expressed at different levels in different cell types and conditions. Second, some genes may be expressed broadly (i.e., in most cell types), but not ubiquitously (e.g., some specialized cell types may not express them). Third, there is a hierarchy of cell-type specificity: for example, gene expression could be restricted to all lymphocytes, or only to T lymphocytes, or only to various subsets of T lymphocytes. Still, the “constitutive versus inducible” and “ubiquitous versus specific” dimensions arguably define the most basic patterns of gene expression with the corresponding functional classes of TFs. Many variations on this simple framework may actually be accounted for as combinations of the four basic control strategies. In addition, this simple classification provides a natural perspective on the evolution of gene regulation and evolution of cell types, as we discuss next.
Evolutionary Pathways of Gene Expression Patterns
What are the possible evolutionary relationships between the four patterns of gene expression outlined above? Given that the four modes of gene expression differ from each other by one or two characteristics (inducibility, cell-type specificity, or both), and assuming that evolutionary transitions occurred one characteristic at a time, we can envision the following evolutionary relations between gene categories (Figure 2):
(1) UIGs can be derived from UCGs by acquiring dependence on TF-Bs (signal-dependent TFs). Importantly, UIGs retain many characteristics of UCGs, including accessible promoter configuration with high CpG content and control by TF-As.
Figure 2. Evolutionary Transitions Scenarios between Gene Categories
UCGs can be transformed into UIGs by addition of DNA binding sites for TF-B and loss of constitutive expression due to recruitment of co-repressors. SIGs can evolve from UIGs by addition of binding sites for TF-D (to confer cell-type specificity) and TF-C (to confer inducibility). SIGs also tend to lose the sites for TF-A. SCGs can evolve from SIGs through the conversion of TF-C3 into TF-D (when TF-C3 expression becomes constitutive). SCGs can also evolve from UCGs by addition of binding sites for TF-D. Finally, SCGs can give rise to SIGs by addition of TF-C.
Diagram nodes: UCG, (TF-A); UIG, (TF-A)+(TF-B); SCG, (TF-A)+(TF-D); SIG, (TF-D)+(C)+(B). Arrow labels: + (TF-B); + (TF-D); + (TF-D) + (TF-C) - (TF-A); + (TF-C); (TF-C) -> (TF-D).
They retain constitutive initiation of transcription, which is mediated by TF-As, but their transcriptional elongation (and therefore, expression) becomes TF-B dependent and therefore signal dependent (Hargreaves et al., 2009). Promoters of UIGs (in contrast to UCGs) have constitutively recruited co-repressors
(such as NCoR and CoREST, associated with HDAC1 and HDAC3), which prevent the recruitment of the elongation factor P-TEFb (Adelman and Lis, 2012; Mottis et al., 2013). These co-repressor complexes are dismissed upon signal-dependent activation of TF-Bs (Hargreaves et al., 2009). It is very likely that there are additional evolutionary changes in the promoter structure that account for loss of constitutive expression of UIGs and that remain to be defined.
(2) SIGs can be derived from UIGs through loss of TF-A and acquisition of cell-type-specific enhancers, which are controlled by lineage-restricted TF-Ds (to afford cell-type specificity), as well as by the acquisition of dependence on TF-C3. Promoters of SIGs have lower CpG content (compared to UCGs and UIGs) and their binding sites for TF-B and TF-C3 are occluded by nucleosomes (Ramirez-Carrozzi et al., 2009). Consequently, they lack constitutive TF-A binding and their induction is dependent on nucleosome remodeling by Swi/Snf chromatin remodeling complexes (Ramirez-Carrozzi et al., 2006). SIGs could also be derived from SCGs through the acquisition of TF-C dependence and loss of constitutive activity, perhaps by mechanisms analogous to the UCG to UIG conversion.
(3) Finally, SCGs can be derived from SIGs by acquiring a constitutive, signal-independent expression pattern. A simple mechanism that can account for this is the conversion of TF-C3s into TF-Ds. Note that both TF-C3s and TF-Ds are lineage-specific TFs. The difference between them is that while TF-Ds are expressed constitutively, the expression of TF-C3s is inducible. A conversion of a TF-C3 into a TF-D would simply require that the TF-C3, once transcriptionally induced, maintain its own expression (as TF-Ds are known to do). This would make genes regulated by TF-C3 (i.e., SIGs) expressed constitutively and in lineage-specific fashion; in other words, these genes would convert from SIGs into SCGs.
One consequence of this transition is the evolution of new cell types (Arendt et al., 2016; Okabe and Medzhitov, 2016). For example, consider a cell that performs some cell-type-specific function (that requires new gene expression) in an inducible manner. This function
would thus be mediated by SIGs (since it is cell-type specific and inducible). Recall that the specificity of SIG induction is controlled by TF-C3s. For that function to be performed constitutively, these SIGs would have to be expressed constitutively (and still in a cell-type-specific manner). This can be achieved by fixing expression of TF-C3s, i.e., making their expression constitutive while remaining cell-type specific (in other words, converting TF-C3s into TF-Ds). These newly minted TF-Ds would then define a new subtype of cells that are specialized for performing the function that used to be inducible in their ancestral cell type. An example illustrating this principle is the evolution of splenic macrophages. When exposed to heme, all macrophages induce expression of Spi-C, which controls heme metabolism. However, splenic macrophages phagocytose senescent red blood cells and thus are continuously exposed to heme. Consequently, Spi-C controls differentiation of splenic macrophages, where it functions as a TF-D (Kohyama et al., 2009). Note that this scenario of cell-type evolution also naturally accounts for the hierarchy of cell-type specificities: Pu.1 is the master regulator of the macrophage lineage and as such it is expressed in all macrophages. GATA6 and Spi-C are lineage regulators of peritoneal and splenic macrophages, respectively. However, these and other macrophage subsets still express Pu.1, which is a higher-level TF-D compared to GATA6 or Spi-C (Okabe and Medzhitov, 2014). More generally, TF-Ds form a hierarchy, with each lower-level TF-D defining progressively more specific sub-lineages of a given lineage. SCGs can also evolve from UCGs by the acquisition of dependence on TF-Ds. Given that UCGs are housekeeping genes, how could they ever evolve to become cell-type specific? One way this can happen is through gene duplication of UCGs, with one copy retaining a housekeeping function while the other copy acquires a cell-type-specific function.
An example of this is the smooth muscle actin, which is specifically expressed in myofibroblasts and is a relative of the housekeeping beta actin gene. Gene duplication likely preceded other cases of evolutionary transitions discussed above.
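The transition scenarios discussed in this section (and summarized in the Figure 2 legend) can be checked against the stated assumption that evolutionary transitions change one characteristic at a time. The following is a minimal sketch under that assumption; `FEATURES`, `TRANSITIONS`, and `changed_characteristics` are hypothetical names used only for illustration.

```python
# Each category is encoded as (inducible, cell_type_specific).
FEATURES = {
    "UCG": (False, False),
    "UIG": (True, False),
    "SCG": (False, True),
    "SIG": (True, True),
}

# Transitions named in the text and in the Figure 2 legend (source -> derived).
TRANSITIONS = [
    ("UCG", "UIG"),  # gain of TF-B dependence
    ("UIG", "SIG"),  # gain of TF-D and TF-C sites, loss of TF-A
    ("SIG", "SCG"),  # conversion of TF-C3 into TF-D
    ("UCG", "SCG"),  # gain of TF-D sites (e.g., after gene duplication)
    ("SCG", "SIG"),  # gain of TF-C dependence
]

CHARACTERISTICS = ("inducibility", "cell-type specificity")


def changed_characteristics(src: str, dst: str) -> list:
    """List which of the two characteristics differ between two categories."""
    return [
        name
        for name, a, b in zip(CHARACTERISTICS, FEATURES[src], FEATURES[dst])
        if a != b
    ]


if __name__ == "__main__":
    for src, dst in TRANSITIONS:
        print(src, "->", dst, "changes", changed_characteristics(src, dst))
```

In this encoding every listed transition alters exactly one characteristic, whereas a direct UCG to SIG step would alter both, which is consistent with the text routing that change through UIG or SCG intermediates.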
Figure 3. Kinetic Patterns of TLR-Induced Gene Expression in Macrophages
Microarray intensity data were downloaded from EMBL-EBI ArrayExpress (E-TABM-310) (Ramsey et al., 2008) and processed with the Bioconductor package Limma (Smyth, 2004), using only time points with at least 3 replicates (0, 20, 40, 60, 80, 120, 240, 480, and 1,440 minutes). Genes induced or repressed at least 2-fold with a false discovery rate less than 0.05 were included in the analysis. K-means clustering was performed with Cluster 3.0 to group the genes into 20 clusters, which were visualized using Java TreeView. Clusters with similar profiles were combined, averaged, and visualized in Microsoft Excel.
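The filtering and clustering steps named in the Figure 3 legend can be illustrated with a short sketch. This is not the authors' pipeline (which used Limma, Cluster 3.0, and Java TreeView): it is a pure-Python stand-in with a plain Lloyd's k-means, a deterministic farthest-point initialization chosen here for reproducibility, three clusters instead of 20, and entirely invented log2 fold-change values and gene names.

```python
import math


def passes_filter(log2fc, fdr, min_fold=2.0, max_fdr=0.05):
    """Keep a gene that changes at least min_fold (up or down) at some time
    point and whose false discovery rate is below max_fdr."""
    return max(abs(v) for v in log2fc) >= math.log2(min_fold) and fdr < max_fdr


def kmeans(profiles, k, n_iter=20):
    """Plain Lloyd's k-means; centers start from a farthest-point heuristic."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    centers = [list(profiles[0])]
    while len(centers) < k:
        # Add the profile that is farthest from all centers chosen so far.
        centers.append(list(max(profiles,
                                key=lambda p: min(dist2(p, c) for c in centers))))
    labels = []
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda j: dist2(p, centers[j]))
                  for p in profiles]
        for j in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical genes: (log2 fold change at four time points, FDR).
GENES = {
    "early_1": ([3.0, 2.0, 0.5, 0.0], 0.01),
    "early_2": ([2.8, 2.2, 0.4, 0.1], 0.02),
    "late_1": ([0.0, 0.5, 2.5, 3.0], 0.01),
    "late_2": ([0.1, 0.4, 2.7, 2.9], 0.03),
    "down_1": ([-0.2, -1.5, -2.0, -2.2], 0.01),
    "down_2": ([0.0, -1.4, -2.1, -2.0], 0.02),
    "flat": ([0.1, -0.2, 0.3, 0.0], 0.60),
    "noisy": ([1.5, 0.2, 0.1, 0.0], 0.40),
}

kept = {g: fc for g, (fc, fdr) in GENES.items() if passes_filter(fc, fdr)}
labels = dict(zip(kept, kmeans(list(kept.values()), k=3)))

if __name__ == "__main__":
    for gene, label in labels.items():
        print(gene, "cluster", label)
```

With these toy values the unchanged and non-significant genes are dropped, and the early, late, and repressed profiles fall into separate clusters, which mirrors the idea of parsing a response into kinetic groups.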
Structure of the Transcription Programs
One remarkable feature of the macrophage transcriptional response is the large number of genes (on the order of several hundred) that are transcriptionally induced or repressed upon stimulation with TLR ligands. These genes fall into multiple functional groups (e.g., inflammatory and anti-microbial genes, genes involved in antigen presentation, ECM-modifying genes, metabolic genes, etc.). In addition, the transcriptional response can be parsed into different kinetic modalities (Figure 3). This reflects the fact that the transcriptional response includes the primary response genes, which are induced rapidly (e.g., A20, IκBα, JunB, TNF), and secondary response genes, which tend to be cell-type specific and have delayed kinetics of induction (e.g., genes encoding IL-6 and IL-12b). Each of these is regulated by a combination of TF-Bs (such as NF-κB and AP1) and TF-Cs, such as JunB (TF-C1) and C/EBPδ (TF-C3). Indeed, a detailed analysis of transcription factor binding site enrichment of TLR-induced genes demonstrated that groups of genes induced with different kinetics have distinct TF binding site enrichment patterns (Ramsey et al., 2008). The distinct kinetics of groups of lipopolysaccharide (LPS)-induced transcripts likely reflects additional features of the regulated genes, such as mRNA stability, requirements for chromatin remodeling and histone modifications, regulation by lncRNAs and microRNAs, involvement of transcriptional repressors, etc. An additional source of complexity of gene induction in macrophages is that several cytokines produced by macrophages (most notably IFN-α/β and IL-10) can act in an autocrine manner to induce their own distinct gene expression programs, as well as to positively or negatively affect the genes induced directly by LPS (Shalek et al., 2014).
Thus, the group of genes “C-up” in Figure 3 is composed largely of IFN-induced genes and has a distinct kinetics of delayed and persistent induction. A basic picture of the transcriptional response is that it is comprised of multiple transcription programs, which can be defined as sets of functionally related and coordinately regulated genes. Their coordinate regulation is typically achieved due to their control by the same TF or TF combinations. For example, NF-κB is a key regulator of inflammatory genes, CIITA is a master regulator of genes involved in MHC class-II antigen presentation, and STAT1/STAT2 control the induction of IFN-regulated genes, many of which have anti-viral activities. Generally speaking, a given signal can induce a single transcriptional program (e.g., IFN-induced genes), or a combination of multiple transcription programs (as is the case in macrophages stimulated with LPS). Whether the transcriptional response is comprised of a single or multiple distinct transcription programs has an implication for the regulatory strategies used to control these responses. In the case of single transcriptional programs, regulation can be achieved at the level of signaling (for example, through the induction of inhibitory proteins, such as SOCS1, SOCS3, or A20). On the other hand, when multiple programs are induced by the same signal, individual programs may need to be differentially regulated. Thus, genes encoding inflammatory mediators that have a potential for tissue damage have to be turned off soon after induction, whereas genes encoding anti-microbial effectors may need to stay induced until the infection is cleared (Foster et al., 2007). This necessitates differential regulation of transcriptional programs induced by the same signal. How this is achieved in general is not yet fully understood, but several gene-specific mechanisms have been described, including differential control of chromatin modification at target genes (Foster et al., 2007; Joseph et al., 2003; Netea et al., 2016; Ricote et al., 1998). It should be noted that transcriptional programs need not be inducible (although the inducible programs are more obvious in gene expression studies). The cell-type-specific and constitutively expressed genes also make up distinct transcription programs, which again can be defined as co-regulated (by a given TF) and functionally related genes.
Constitutive transcriptional programs are harder to define because transcription profiles of differentiated cells represent a mixture of all the transcription programs. However, the mapping of specific TFs to transcription programs can be achieved by TF mutations and computational deconvolution of transcription factor networks (Amit et al., 2011; Thompson et al., 2015; Yosef and Regev, 2016). Recent analyses of tissue-resident macrophages and lymphocytes revealed an interesting feature of these constitutive and cell-type-specific transcription programs: they can be shared between different cell types within a given tissue. Thus, tissue-resident cells acquire some of the transcription
programs (or parts thereof) of the corresponding main cell type of the tissue. For example, like adipocytes, macrophages and regulatory T cells residing in the adipose tissue express PPARγ and PPARγ-regulated genes (Cipolletta et al., 2012; Fan and Rudensky, 2016; Yosef and Regev, 2016). These genes can be referred to as “tissue specific” rather than “cell-type specific,” because they are expressed in multiple cell types in a given tissue. It would be interesting to investigate the functional significance of such shared transcriptional programs in tissue-resident cells in future studies. The notion of a transcription program as a set of co-regulated and functionally related genes is intuitively obvious, particularly for highly specialized functional programs, such as transcriptional control of the cell cycle or mitochondrial biogenesis. In reality, however, most transcription programs appear to contain many genes that are not obviously related to the core function of a given transcription program. This fact has not been properly explained, and it confounds the understanding of the logic of gene expression programs. One way to address this apparent paradox is to distinguish, within a given transcription program, between “effector” genes and “accessory” genes. Effector genes can be defined as genes whose products are directly involved in a given function, for example, cell-cycle regulators, anti-microbial proteins, or cytokines. It is among the effector genes that functional connections are most obvious. Accessory genes, on the other hand, are not directly involved in executing a given function. Rather, their role is to support the functions performed by the effector genes. For example, genes encoding cyclins and DNA replication machinery are the effector genes of the “cell-cycle” transcription program. Genes encoding metabolic transporters are also induced by mitogens, but they are not directly involved in cell division.
Their function is to provide metabolic supplies necessary to support cell proliferation. Importantly, the accessory genes are co-regulated with effector genes by the same TFs, and in response to the same signals (when the programs are inducible). In addition, a given set of accessory genes can be used to support multiple effector gene programs. Thus, the same metabolic transporters may be required to support cell proliferation and some other responses that are dependent on supply of specific metabolites. The distinction between effector and accessory genes also applies to constitutive transcription programs (constitutively expressed cell-type-specific genes). Here too, cell-type specificity may be more stringently followed by the effector genes compared to accessory genes when the latter can be used to support more than one function, in more than one cell type. Finally, accessory genes may have either cell-autonomous or non-cell-autonomous functions: thus, the function of metabolic transporters is cell autonomous, while the function of ECM-modifying genes is not. One could predict that the non-cell-autonomous functions can be delegated to supporting cell types (such as tissue-resident macrophages and stromal cells), while cell-autonomous functions usually cannot. This may explain, at least in part, why some transcription programs are partitioned between the main functional cell types in the tissue and various tissue-resident myeloid, lymphoid, and stromal cells.
Transcription Programs Consist of Induced and Repressed Genes
So far, our discussion of transcription programs focused on gene induction. However, transcriptional responses consist of both induced and repressed genes. Moreover, temporal parsing of induced and repressed genes in macrophages following LPS stimulation reveals that for each wave of transcriptional induction, there is a corresponding wave of transcriptional repression (Figure 3), suggesting that perhaps the repressed genes, at least in some cases, are turned off by the signals (and TFs) that turn on the induced genes. This indeed appears to be the case in LPS-stimulated macrophages, where gene repression is not observed upon deletion of key signaling pathways and TFs involved in gene activation (Tong et al., 2016). Similar waves of induced and repressed transcripts have been observed in T cells (Yosef et al., 2013) and are likely a common feature of inducible changes in gene expression. This raises the question: why do some genes need to be turned off when others are turned on? As noted above, transcription programs are defined as sets of co-regulated and functionally related genes. However, functional relations could be either positive or negative: in a positive relation, a given function requires both genes X AND Y (for example, genes X and Y encode enzymes of the same metabolic pathway). In a negative relation, a given function requires gene X AND NOT gene Y (for example, the effect of gene Y is opposite to the effect of gene X). In this case, a signal that induces gene X should also suppress gene Y so that the functional program induced by that signal can proceed unimpeded. More generally, induction of a function F1 should be accompanied by suppression of any functions incompatible with F1. This induction and suppression can occur at multiple levels, from transcription to protein activity.
In the former case, this is coordinated by positive and negative effects of TFs on induced and repressed genes; in the latter case, it is achieved by activating and inhibitory effects of signaling pathways (e.g., AMPK-mediated phosphorylation activates energy-producing and inhibits energy-consuming pathways). As an example from LPS-induced transcription responses (Figure 3), the genes in the group “C-up” are IFN-induced genes, whereas genes in the group “C-down” include cell-cycle genes. This makes sense because cell proliferation is incompatible with anti-viral defense. The notion of incompatible gene expression programs can be further generalized by considering a signal-induced cellular response (such as macrophage activation by LPS) as a signal-dependent transition of a cell from one state to another (e.g., from quiescent to activated, or more generally from state 0 to state 1) (Figure 4A). A signal S1 that induces state 1 would also have to inhibit state 0, since the two states are incompatible. At the level of gene expression, this would typically be achieved by signal S1 simultaneously inducing genes that maintain state 1 and suppressing genes that maintain state 0. This notion is broadly generalizable, because state 1 could be any state alternative to 0. For example, if state 0 is quiescence, then state 1 could be activation, proliferation, differentiation, or apoptosis. In all cases, inducing any of these transitions requires suppression of the alternative programs (quiescence versus proliferation, self-renewal versus differentiation, survival versus apoptosis, etc.).
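The positive (X AND Y) and negative (X AND NOT Y) functional relations discussed in this section can be written as simple Boolean conditions. The following is only a minimal illustrative sketch, not a model taken from the cited studies; the function names and the gene labels "X" and "Y" are hypothetical and chosen for illustration.

```python
def positive_relation(x_on: bool, y_on: bool) -> bool:
    """A function that requires gene X AND gene Y."""
    return x_on and y_on


def negative_relation(x_on: bool, y_on: bool) -> bool:
    """A function that requires gene X AND NOT gene Y."""
    return x_on and not y_on


def apply_signal(expression: dict, induced: set, repressed: set) -> dict:
    """Return a new expression state after a signal switches genes on or off.

    `expression` maps hypothetical gene names to True (expressed) or False.
    """
    new_state = dict(expression)
    for gene in induced:
        new_state[gene] = True
    for gene in repressed:
        new_state[gene] = False
    return new_state


# Hypothetical resting state: the antagonistic gene Y is expressed, X is not.
resting = {"X": False, "Y": True}

# A signal that only induces X leaves the antagonist Y in place.
induced_only = apply_signal(resting, induced={"X"}, repressed=set())

# A signal that induces X and represses Y lets the function proceed.
induced_and_repressed = apply_signal(resting, induced={"X"}, repressed={"Y"})

print(negative_relation(induced_only["X"], induced_only["Y"]))  # False
print(negative_relation(induced_and_repressed["X"], induced_and_repressed["Y"]))  # True
```

In this toy setting, induction alone is not sufficient when the function depends on X AND NOT Y, which is the reasoning behind coupling each wave of induction to a wave of repression.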
Figure 4. Cellular Responses as State Transitions (A) Alternative cell states (here 0 and 1) are regulated by signals that maintain state 0 (S0) and inhibit transition to state 1, and by signals (S1) that inhibit state 0 and promote transition to state 1. States 0 and 1 are maintained by different sets of genes and TFs. (B) Cell activation, polarization, differentiation, and death can be viewed as signal-induced or spontaneous transitions between alternative states, here denoted as state 0 and state 1. Depending on the stability of state 1, the transitions can be described as activation, polarization, or differentiation.
Suppressing state 0 is analogous to releasing the brakes to allow for transition to state 1. Conversely, we can think of signals (S0) that promote state 0 as being inhibitory to signals that promote state 1, because they would have the opposite effect on cell transitions. Viewing cellular responses (and more generally, fate choices) as state transitions leads us to consider the properties of the states (their stability) and transitions (their reversibility) (Figure 4B). From this perspective, cell activation is a transient and reversible transition, with state 0 being stable and state 1 being unstable. Indeed, following stimulation, cell activation generally reverses to the original quiescent state. Transition during cell differentiation is irreversible, because the differentiated state (state 1) is generally stable (although there can be interesting exceptions where differentiated states need to be actively maintained by signal S1). Finally, in some cell types there is a third type of transition, which in the macrophage field is referred to as “cell polarization.” This transition is more sustained than activation, but it is still reversible, which distinguishes it from differentiation. In the case of polarization, cells can be in state 1 for as long as needed, as dictated by the demand on the function performed in state 1. Thus, cell polarization is intermediate between activation and differentiation.
Because state 0 is usually stable (cells do not spontaneously become activated or differentiated), the genes that maintain state 0 need to be actively suppressed by the activation, polarization, or differentiation signals to allow transitions to state 1. How this works mechanistically is yet to be explored, but presumably the TFs that maintain state 0 need to be suppressed to allow the transition to state 1. These same TFs likely suppress the transitions to the different alternative states (the various state 1s). In other words, the most likely scenario is that TFs controlling state 0 and TFs controlling state 1 are engaged in antagonistic cross-repression, as has been well established for alternative cell fate choices during differentiation (Graf and Enver, 2009). While there are multiple transcriptional “master regulators” for each state 1, this perspective suggests that there may also be multiple repressors, one for each of the transitions to a state 1 (e.g., one repressor to suppress the transition induced by IFN, another to suppress the transition induced by IL-4, etc.), and these repressors may collectively maintain state 0. Alternatively, in some cases there might be “master regulators” of state 0. In either scenario, state 0 can be maintained by dedicated extracellular signals that control the expression of TFs that prevent transitions to state 1s. This paradigm has been demonstrated in the embryonic stem (ES) cell field, where signals such as LIF and BMP2/4 control the uncommitted self-renewal state of ES cells through activation of the TFs STAT3 and SMADs (Ying et al., 2003). This reciprocal control of cell fate decisions also applies to metabolic programs, where signals S0 and S1 control catabolic (state 0) and anabolic (state 1) programs, respectively (Nish et al., 2017). Exploring similar paradigms in broader contexts of cell activation, polarization, differentiation, and apoptosis should provide important mechanistic insights into cell fate decisions.
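The idea that TFs controlling state 0 and state 1 cross-repress each other, and that the stability of state 1 determines whether a transition is reversible, can be illustrated with a generic two-component mutual-repression model. The sketch below is an assumption-laden toy, not a model from this review or the cited work: the variables `a` (a state 0 TF) and `b` (a state 1 TF), the Hill-type repression term, the parameters `alpha_a` and `alpha_b`, and all numerical values are hypothetical choices made for illustration.

```python
def hill_repression(x: float, n: int = 2) -> float:
    """Fraction of maximal expression remaining when a repressor is at level x."""
    return 1.0 / (1.0 + x ** n)


def simulate(alpha_a: float, alpha_b: float, s1: float = 3.0,
             signal_duration: float = 10.0, total_time: float = 40.0,
             dt: float = 0.01) -> tuple:
    """Euler integration of two cross-repressing TFs.

    a: hypothetical TF maintaining state 0; b: hypothetical TF maintaining state 1.
    Signal S1 adds to the production of b while t < signal_duration.
    Returns the final (a, b) levels.
    """
    a, b = 3.7, 0.3  # start in state 0: a high, b low
    steps = int(round(total_time / dt))
    for i in range(steps):
        t = i * dt
        signal = s1 if t < signal_duration else 0.0
        da = alpha_a * hill_repression(b) - a            # b represses a
        db = alpha_b * hill_repression(a) + signal - b   # a represses b; S1 induces b
        a += dt * da
        b += dt * db
    return a, b


# Symmetric cross-repression (assumed values): state 1 is self-sustaining,
# so a transient signal produces a lasting switch (differentiation-like).
diff_a, diff_b = simulate(alpha_a=4.0, alpha_b=4.0)

# Weak state 1 TF (assumed values): state 1 cannot sustain itself,
# so the cell returns to state 0 after the signal ends (activation-like).
act_a, act_b = simulate(alpha_a=4.0, alpha_b=1.0)

# No signal: the cell stays in state 0.
rest_a, rest_b = simulate(alpha_a=4.0, alpha_b=4.0, s1=0.0)

print("differentiation-like:", "state 1" if diff_b > diff_a else "state 0")
print("activation-like:", "state 1" if act_b > act_a else "state 0")
print("no signal:", "state 1" if rest_b > rest_a else "state 0")
```

Under these assumed parameters, the only difference between the reversible and irreversible outcomes is how strongly the state 1 TF can be expressed once the signal is gone; a polarization-like response would correspond to a signal that persists for as long as state 1 is needed.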
Conclusions and Perspectives
Recent technological and computational advances continue to generate an ever-increasing amount of information about gene expression and its regulation in a variety of biological contexts. To help organize and interpret these data, we need to develop new conceptual frameworks. One obstacle here is the lack of a functional classification of TFs. Indeed, although TFs belong to different structural families, these usually do not relate to their biological functions. However, TFs can be divided into four general categories, depending on their mode of regulation (constitutive, signal-dependent, inducible, and cell-type specific). These categories then relate to the four basic patterns of gene expression: ubiquitous and constitutive, ubiquitous and inducible, cell-type specific and constitutive, and cell-type specific and inducible. Although these discrete categories are extremes of a continuous spectrum of gene expression patterns, the intermediate patterns can generally be derived from them by tuning one of the parameters of gene expression. The genes induced in response to extracellular signals fall into distinct transcription programs, each consisting of co-regulated and functionally related genes. Signal-dependent induction of one transcription program is usually accompanied by suppression of other, functionally incompatible programs. Each of the transcription programs has an activating TF, but each may also have a repressive TF that prevents the program from being executed without a signal. At least in some cases, the activating and repressive functions can be mediated by the same TF. The analogous scenario plays out during cell differentiation, where TFs that activate transcription programs for one cell fate can also suppress transcription programs of the alternative cell fates. The inducible transcription programs can have very different temporal characteristics: some are induced very transiently, some are sustained over prolonged periods of time, and some can be permanent. These patterns correspond to cell activation, polarization (the term used in macrophage biology but applicable to other cell types), and differentiation. An extreme version of polarization may result in long-term changes of macrophage behavior, a phenomenon that could be referred to as cellular memory. In the case of macrophages and epithelial cells, this can enable a modified (faster and stronger) response to secondary exposure to pathogens or injury (Foster et al., 2007; Naik et al., 2017; Netea et al., 2016). Finally, an exciting area of active research is focused on the impact of the tissue micro-environment on cell specialization and gene expression (Okabe and Medzhitov, 2016; Yosef and Regev, 2016). Better understanding of the transcriptional mechanisms involved in these processes should provide further insights into cell fate decisions and cell-type evolution.
ACKNOWLEDGMENTS
We thank members of the Medzhitov lab for discussions. Work in the Medzhitov lab was supported by the Blavatnik Family Foundation, the Else Kroner-Fresenius Foundation, the Scleroderma Research Foundation, and the Howard Hughes Medical Institute.
REFERENCES
Davies, L.C., Jenkins, S.J., Allen, J.E., and Taylor, P.R. (2013). Tissue-resident macrophages. Nat. Immunol. 14, 986–995.
Deaton, A.M., and Bird, A. (2011). CpG islands and the regulation of transcription. Genes Dev. 25, 1010–1022.
Fan, X., and Rudensky, A.Y. (2016). Hallmarks of Tissue-Resident Lymphocytes. Cell 164, 1198–1211.
Foster, S.L., Hargreaves, D.C., and Medzhitov, R. (2007). Gene-specific control of inflammation by TLR-induced chromatin modifications. Nature 447, 972–978.
Fowler, T., Sen, R., and Roy, A.L. (2011). Regulation of primary response genes. Mol. Cell 44, 348–360.
Fullwood, M.J., Liu, M.H., Pan, Y.F., Liu, J., Xu, H., Mohamed, Y.B., Orlov, Y.L., Velkov, S., Ho, A., Mei, P.H., et al. (2009). An oestrogen-receptor-alpha-bound human chromatin interactome. Nature 462, 58–64.
Ghisletti, S., Barozzi, I., Mietton, F., Polletti, S., De Santa, F., Venturini, E., Gregory, L., Lonie, L., Chew, A., Wei, C.L., et al. (2010). Identification and characterization of enhancers controlling the inflammatory gene expression program in macrophages. Immunity 32, 317–328.
Glass, C.K., and Natoli, G. (2016). Molecular control of activation and priming in macrophages. Nat. Immunol. 17, 26–33.
Gordon, S., Akopyan, G., Garban, H., and Bonavida, B. (2006). Transcription factor YY1: structure, function, and therapeutic implications in cancer biology. Oncogene 25, 1125–1142.
Graf, T., and Enver, T. (2009). Forcing cells to change lineages. Nature 462, 587–594.
Hargreaves, D.C., Horng, T., and Medzhitov, R. (2009). Control of inducible gene expression by signal-dependent transcriptional elongation. Cell 138, 129–145.
Heinz, S., Romanoski, C.E., Benner, C., and Glass, C.K. (2015). The selection and function of cell type-specific enhancers. Nat. Rev. Mol. Cell Biol. 16, 144–154.
Adelman, K., and Lis, J.T. (2012). Promoter-proximal pausing of RNA polymerase II: emerging roles in metazoans. Nat. Rev. Genet. 13, 720–731.
Holmberg, J., and Perlmann, T. (2012). Maintaining differentiated cellular identity. Nat. Rev. Genet. 13, 429–439.
Amit, I., Regev, A., and Hacohen, N. (2011). Strategies to discover regulatory circuits of the mammalian immune system. Nat. Rev. Immunol. 11, 873–880.
Joseph, S.B., Castrillo, A., Laffitte, B.A., Mangelsdorf, D.J., and Tontonoz, P. (2003). Reciprocal regulation of inflammation and lipid metabolism by liver X receptors. Nat. Med. 9, 213–219.
Andersson, R., Sandelin, A., and Danko, C.G. (2015). A unified architecture of transcriptional regulatory elements. Trends Genet. 31, 426–433.
Kaczynski, J., Cook, T., and Urrutia, R. (2003). Sp1- and Krüppel-like transcription factors. Genome Biol. 4, 206.
Arendt, D., Musser, J.M., Baker, C.V.H., Bergman, A., Cepko, C., Erwin, D.H., Pavlicev, M., Schlosser, G., Widder, S., Laubichler, M.D., and Wagner, G.P. (2016). The origin and evolution of cell types. Nat. Rev. Genet. 17, 744–757.
Kim, H.D., Shay, T., O’Shea, E.K., and Regev, A. (2009). Transcriptional regulatory circuits: predicting numbers from alphabets. Science 325, 429–432.
Arnold, H.H., and Braun, T. (1996). Targeted inactivation of myogenic factor genes reveals their role during mouse myogenesis: a review. Int. J. Dev. Biol. 40, 345–353.
Kohyama, M., Ise, W., Edelson, B.T., Wilker, P.R., Hildner, K., Mejia, C., Frazier, W.A., Murphy, T.L., and Murphy, K.M. (2009). Role for Spi-C in the development of red pulp macrophages and splenic iron homeostasis. Nature 457, 318–321.
Blum, R., Vethantham, V., Bowman, C., Rudnicki, M., and Dynlacht, B.D. (2012). Genome-wide identification of enhancers in skeletal muscle: the role of MyoD1. Genes Dev. 26, 2763–2779.
Levchenko, A., and Nemenman, I. (2014). Cellular noise and information transmission. Curr. Opin. Biotechnol. 28, 156–164.
Buenrostro, J.D., Giresi, P.G., Zaba, L.C., Chang, H.Y., and Greenleaf, W.J. (2013). Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat. Methods 10, 1213–1218.
Cairns, B.R. (2009). The logic of chromatin architecture and remodelling at promoters. Nature 461, 193–198.
Cipolletta, D., Feuerer, M., Li, A., Kamei, N., Lee, J., Shoelson, S.E., Benoist, C., and Mathis, D. (2012). PPAR-γ is a major driver of the accumulation and phenotype of adipose tissue Treg cells. Nature 486, 549–553.
Curina, A., Termanini, A., Barozzi, I., Prosperini, E., Simonatto, M., Polletti, S., Silvola, A., Soldi, M., Austenaa, L., Bonaldi, T., et al. (2017). High constitutive activity of a broad panel of housekeeping and tissue-specific cis-regulatory elements depends on a subset of ETS proteins. Genes Dev. 31, 399–412.
Liao, X., Sharma, N., Kapadia, F., Zhou, G., Lu, Y., Hong, H., Paruchuri, K., Mahabeleshwar, G.H., Dalmas, E., Venteclef, N., et al. (2011). Krüppel-like factor 4 regulates macrophage polarization. J. Clin. Invest. 121, 2736–2749.
Lieberman-Aiden, E., van Berkum, N.L., Williams, L., Imakaev, M., Ragoczy, T., Telling, A., Amit, I., Lajoie, B.R., Sabo, P.J., Dorschner, M.O., et al. (2009). Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289–293.
Lin, C.Y., Lovén, J., Rahl, P.B., Paranal, R.M., Burge, C.B., Bradner, J.E., Lee, T.I., and Young, R.A. (2012). Transcriptional amplification in tumor cells with elevated c-Myc. Cell 151, 56–67.
Link, V.M., Gosselin, D., and Glass, C.K. (2015). Mechanisms Underlying the Selection and Function of Macrophage-Specific Enhancers. Cold Spring Harb. Symp. Quant. Biol. 80, 213–221.
Litvak, V., Ramsey, S.A., Rust, A.G., Zak, D.E., Kennedy, K.A., Lampano, A.E., Nykter, M., Shmulevich, I., and Aderem, A. (2009). Function of C/EBPdelta in a regulatory circuit that discriminates between transient and persistent TLR4-induced signals. Nat. Immunol. 10, 437–443.
Macosko, E.Z., Basu, A., Satija, R., Nemesh, J., Shekhar, K., Goldman, M., Tirosh, I., Bialas, A.R., Kamitaki, N., Martersteck, E.M., et al. (2015). Highly Parallel Genome-wide Expression Profiling of Individual Cells Using Nanoliter Droplets. Cell 161, 1202–1214.
Mottis, A., Mouchiroud, L., and Auwerx, J. (2013). Emerging roles of the corepressors NCoR1 and SMRT in homeostasis. Genes Dev. 27, 819–835.
Mumbach, M.R., Rubin, A.J., Flynn, R.A., Dai, C., Khavari, P.A., Greenleaf, W.J., and Chang, H.Y. (2016). HiChIP: efficient and sensitive analysis of protein-directed genome architecture. Nat. Methods 13, 919–922.
Naik, S., Larsen, S.B., Gomez, N.C., Alaverdyan, K., Sendoel, A., Yuan, S., Polak, L., Kulukian, A., Chai, S., and Fuchs, E. (2017). Inflammatory memory sensitizes skin epithelial stem cells to tissue damage. Nature 550, 475–480.
Natoli, G., Ghisletti, S., and Barozzi, I. (2011). The genomic landscapes of inflammation. Genes Dev. 25, 101–106.
Rosenfeld, M.G., Lunyak, V.V., and Glass, C.K. (2006). Sensors and signals: a coactivator/corepressor/epigenetic code for integrating signal-dependent programs of transcriptional response. Genes Dev. 20, 1405–1428.
Satoh, T., Takeuchi, O., Vandenbon, A., Yasuda, K., Tanaka, Y., Kumagai, Y., Miyake, T., Matsushita, K., Okazaki, T., Saitoh, T., et al. (2010). The Jmjd3-Irf4 axis regulates M2 macrophage polarization and host responses against helminth infection. Nat. Immunol. 11, 936–944.
Scarpulla, R.C. (2002). Transcriptional activators and coactivators in the nuclear control of mitochondrial function in mammalian cells. Gene 286, 81–89.
Shalek, A.K., Satija, R., Shuga, J., Trombetta, J.J., Gennert, D., Lu, D., Chen, P., Gertner, R.S., Gaublomme, J.T., Yosef, N., et al. (2014). Single-cell RNA-seq reveals dynamic paracrine control of cellular variation. Nature 510, 363–369.
Smale, S.T., and Natoli, G. (2014). Transcriptional control of inflammatory responses. Cold Spring Harb. Perspect. Biol. 6, a016261.
Smale, S.T., Plevy, S.E., Weinmann, A.S., Zhou, L., Ramirez-Carrozzi, V.R., Pope, S.D., Bhatt, D.M., and Tong, A.J. (2013). Toward an understanding of the gene-specific and global logic of inducible gene transcription. Cold Spring Harb. Symp. Quant. Biol. 78, 61–68.
Netea, M.G., Joosten, L.A., Latz, E., Mills, K.H., Natoli, G., Stunnenberg, H.G., O’Neill, L.A., and Xavier, R.J. (2016). Trained immunity: A program of innate immune memory in health and disease. Science 352, aaf1098.
Smale, S.T., Tarakhovsky, A., and Natoli, G. (2014). Chromatin contributions to the regulation of innate immunity. Annu. Rev. Immunol. 32, 489–511.
Nie, Z., Hu, G., Wei, G., Cui, K., Yamane, A., Resch, W., Wang, R., Green, D.R., Tessarollo, L., Casellas, R., et al. (2012). c-Myc is a universal amplifier of expressed genes in lymphocytes and embryonic stem cells. Cell 151, 68–79.
Smyth, G.K. (2004). Linear models and empirical bayes methods for assessing differential expression in microarray experiments. Stat. Appl. Genet. Mol. Biol. 3, e3.
Nish, S.A., Lin, W.W., and Reiner, S.L. (2017). Lymphocyte Fate and Metabolism: A Clonal Balancing Act. Trends Cell Biol. 27, 946–954.
Soshnev, A.A., Josefowicz, S.Z., and Allis, C.D. (2016). Greater Than the Sum of Parts: Complexity of the Dynamic Epigenome. Mol. Cell 62, 681–694.
Okabe, Y., and Medzhitov, R. (2014). Tissue-specific signals control reversible program of localization and functional polarization of macrophages. Cell 157, 832–844.
Spiegelman, B.M., and Heinrich, R. (2004). Biological control through regulated transcriptional coactivators. Cell 119, 157–167.
Okabe, Y., and Medzhitov, R. (2016). Tissue biology perspective on macrophages. Nat. Immunol. 17, 9–17.
Thompson, D., Regev, A., and Roy, S. (2015). Comparative analysis of gene regulatory networks: from network reconstruction to evolution. Annu. Rev. Cell Dev. Biol. 31, 399–428.
Picelli, S., Faridani, O.R., Björklund, A.K., Winberg, G., Sagasser, S., and Sandberg, R. (2014). Full-length RNA-seq from single cells using Smart-seq2. Nat. Protoc. 9, 171–181.
Pombo, A., and Dillon, N. (2015). Three-dimensional genome architecture: players and mechanisms. Nat. Rev. Mol. Cell Biol. 16, 245–257.
Ramirez-Carrozzi, V.R., Nazarian, A.A., Li, C.C., Gore, S.L., Sridharan, R., Imbalzano, A.N., and Smale, S.T. (2006). Selective and antagonistic functions of SWI/SNF and Mi-2beta nucleosome remodeling complexes during an inflammatory response. Genes Dev. 20, 282–296.
Ramirez-Carrozzi, V.R., Braas, D., Bhatt, D.M., Cheng, C.S., Hong, C., Doty, K.R., Black, J.C., Hoffmann, A., Carey, M., and Smale, S.T. (2009). A unifying model for the selective regulation of inducible transcription by CpG islands and nucleosome remodeling. Cell 138, 114–128.
Ramsey, S.A., Klemm, S.L., Zak, D.E., Kennedy, K.A., Thorsson, V., Li, B., Gilchrist, M., Gold, E.S., Johnson, C.D., Litvak, V., et al. (2008). Uncovering a macrophage transcriptional program by integrating evidence from motif scanning and expression dynamics. PLoS Comput. Biol. 4, e1000021.
Tong, A.J., Liu, X., Thomas, B.J., Lissner, M.M., Baker, M.R., Senagolage, M.D., Allred, A.L., Barish, G.D., and Smale, S.T. (2016). A Stringent Systems Approach Uncovers Gene-Specific Mechanisms Regulating Inflammation. Cell 165, 165–179.
Urbánek, P., Wang, Z.Q., Fetka, I., Wagner, E.F., and Busslinger, M. (1994). Complete block of early B cell differentiation and altered patterning of the posterior midbrain in mice lacking Pax5/BSAP. Cell 79, 901–912.
Weake, V.M., and Workman, J.L. (2010). Inducible gene expression: diverse regulatory mechanisms. Nat. Rev. Genet. 11, 426–437.
Wynn, T.A., Chawla, A., and Pollard, J.W. (2013). Macrophage biology in development, homeostasis and disease. Nature 496, 445–455.
Ying, Q.L., Nichols, J., Chambers, I., and Smith, A. (2003). BMP induction of Id proteins suppresses differentiation and sustains embryonic stem cell self-renewal in collaboration with STAT3. Cell 115, 281–292.
Yosef, N., and Regev, A. (2011). Impulse control: temporal dynamics in gene transcription. Cell 144, 886–896.
Raser, J.M., and O’Shea, E.K. (2005). Noise in gene expression: origins, consequences, and control. Science 309, 2010–2013.
Yosef, N., and Regev, A. (2016). Writ large: Genomic dissection of the effect of cellular environment on immune response. Science 354, 64–68.
Ricote, M., Li, A.C., Willson, T.M., Kelly, C.J., and Glass, C.K. (1998). The peroxisome proliferator-activated receptor-gamma is a negative regulator of macrophage activation. Nature 391, 79–82.
Yosef, N., Shalek, A.K., Gaublomme, J.T., Jin, H., Lee, Y., Awasthi, A., Wu, C., Karwacz, K., Xiao, S., Jorgolli, M., et al. (2013). Dynamic regulatory network controlling TH17 cell differentiation. Nature 496, 461–468.