Accepted Manuscript
Integration of metabolic, regulatory and signaling networks towards analysis of perturbation and dynamic responses
Anush Chiappino-Pepe, Vikash Pandey, Meriç Ataman, Vassily Hatzimanikatis
PII: S2452-3100(17)30032-X
DOI: 10.1016/j.coisb.2017.01.007
Reference: COISB 26
To appear in: Current Opinion in Systems Biology
Received Date: 5 December 2016
Accepted Date: 23 January 2017
Please cite this article as: Chiappino-Pepe A, Pandey V, Ataman M, Hatzimanikatis V, Integration of metabolic, regulatory and signaling networks towards analysis of perturbation and dynamic responses, Current Opinion in Systems Biology (2017), doi: 10.1016/j.coisb.2017.01.007. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Integration of Metabolic, Regulatory and Signaling Networks Towards Analysis of Perturbation and Dynamic Responses
Anush Chiappino-Pepe, Vikash Pandey, Meriç Ataman, and Vassily Hatzimanikatis
Address: Laboratory of Computational Systems Biotechnology, École Polytechnique Fédérale de Lausanne, EPFL, Lausanne 1015, Switzerland
Corresponding author: Vassily Hatzimanikatis ([email protected])
ABSTRACT
The expanding generation of dynamic biological data requires approaches that integrate and analyze information from different types of cellular processes (metabolism, regulation, and signaling) and ultimately increase our insight into cell behavior upon perturbation. In the analysis of cellular processes, metabolism appears as the best scaffold to link the topology and crosstalk between regulation and signaling. Multiple methods for the integration of omics data into metabolic networks have been developed, but the dynamic and integrative analysis of cellular processes remains a challenge. Herein, we review the latest approaches to design, integrate and analyze metabolic, regulatory and signaling networks in static and dynamic fashions. We focus on the current challenges in applying these methods, and we highlight kinetic modeling as a promising approach for understanding the interactions and behavior of biological systems.
Short title: Integrating Cellular Networks Towards Dynamic Analysis
Introduction
Understanding the dynamic function of biological processes (at the cellular level: metabolism, regulation, and signaling) and their interactions is crucial for making progress in biology, medicine, and biotechnology. The dynamic analysis of cellular processes has been hampered by limited technologies and the scarcity of time-series data. However, the latest technological advances allow the generation of high-quality, high-frequency measurements of different species (e.g., metabolites, proteins, mRNAs) in biological systems, and they require the parallel development of computational tools to integrate the generated knowledge [1]. The availability of dynamic data raises two main challenges for systems biology: to provide meaningful analyses of the data sets produced by the new technologies, and to further contribute to technology development by providing novel predictions [2,3].
These biological processes function in coordination and allow the survival and growth of the cell under varying environmental conditions. Metabolism encompasses the transformation of small molecules (sugars, amino acids, nucleotides, and lipids) into complex cellular building blocks and self-assembled structures. Enzymes catalyze the reactions at a controlled rate, and they ultimately determine the metabolic functions, capabilities, and requirements of the organisms. The metabolic capabilities are constrained and rewired by regulatory mechanisms implemented in fundamental cellular processes: transcription, translation, and signaling. Understanding the crosstalk between metabolism, regulation, and signaling is a primary goal in systems and synthetic biology [4], metabolic engineering [5], and cellular physiology, and it requires an integrative analysis of these cellular processes.
The integrative analysis of cellular processes, rather than their separate study, provides quantitative predictions of the cell behavior under various conditions. Due to its central role in the function of the cell, metabolism emerges as the best scaffold to analyze regulatory and signaling events. Metabolism is represented with metabolic networks where the information about metabolic functions and capabilities of the organisms is stored. The topology of metabolic networks gives rise to a high number of degrees of freedom, which represent the complexity of metabolism, and its flexibility to adapt to different environments. Mathematical methods, like chemical reaction networks and network thermodynamics [6,7], can cope with complex topologies and are used to analyze metabolic networks. The integrative analysis of cellular processes requires approaches that can reconcile different dynamics (timescales) and the available data in a so-called multi-scale framework [8]. In this review, we revisit some of the latest developments in performing integrative analysis of metabolic, regulatory and signaling networks in static (steady-state) and dynamic (discrete- or continuous-time) fashions, and we further discuss future challenges. Combined with continuous efforts to generate dynamic and biochemical data, we believe that currently available approaches will allow the comprehensive analysis of the interactions and behavior of biological processes using kinetic models and advanced computational and systems engineering methods.
Inference of metabolic, regulatory and signaling networks: Knowledge and Formalism
Cellular processes are represented with networks, whose structures involve both the species that participate in the processes and the interactions or connectivity between these species. The first and most critical step in understanding a biological process is to infer the interactions and to reconstruct its network (Figure 1). Chemical interactions (reactions and signals) are identified using qualitative or static data that describe a steady-state condition. The crosstalk between biological networks (regulatory interactions) is inferred from quantitative or dynamic data that describe the evolution of the species (metabolites, proteins, and genes) from one steady state to another upon genetic or environmental perturbation. Glucose homeostasis is based on such crosstalk between different biological processes (Figure 2). A high blood glucose level is lowered when insulin signaling interacts with glucose metabolism through transcriptional regulation of the forkhead protein (FOXO) and the glycogen synthase kinase 3 (GSK3) [9].
Two general approaches are used to reconstruct biological networks, and their application depends on the available data in the literature. Top-down approaches use available experimental data (static or dynamic), and bottom-up approaches use previously characterized data and reconstructed networks from related organisms as a scaffold for assembling new biological networks [10].
Metabolism is to date the best characterized biological process. Stoichiometric models have been extensively used to study metabolism for over three decades, and the early examples were built with top-down approaches [10]. Genome-scale models (GEMs) emerged as platforms that include all the metabolic capabilities of organisms while keeping the gene-to-protein-to-reaction (GPR) information obtained from the genome sequence with a bottom-up approach [11]. In the last twenty years, GEMs became the standard platforms to analyze metabolic networks and phenotypes of the cells [12]. The emergence of whole genome sequences for many organisms has triggered the recent development of semi-automated [13-15] and automated [16,17] approaches to reconstruct GEMs. Although these methods have considerably accelerated the assembly of metabolic networks, the generated GEMs still require an extensive manual curation process to ensure high quality [11]. GEMs are currently available for hundreds of organisms, including well-known prokaryotes such as Escherichia coli [18], and eukaryotes such as Saccharomyces cerevisiae [19], human cells [20], and pathogens such as Toxoplasma gondii [21].
In contrast with metabolism, regulation and signaling are less well characterized, which further challenges their study. The reconstruction of regulatory and signaling networks is traditionally based on top-down approaches. Owing to recent technological advances [22], we have increasing access to information about regulatory interactions [23-25] and signaling pathways [26,27] in various organisms, which facilitates their assembly through bottom-up approaches. Two main types of regulation that trigger metabolic responses are transcriptional regulation, which determines the enzyme levels through regulation of gene expression, and metabolic regulation, which involves regulation of enzyme catalytic activities through changes in metabolite levels.
Transcriptional regulation has been studied in more detail due to the availability of gene expression data from microarrays, RNAseq or ChIPseq, and their further analysis using bioinformatics tools [28,29] and mathematical modeling of metabolic networks [30]. Some of these approaches use transcriptomics data and GEMs to identify condition-dependent changes in metabolism across environmental conditions, as shown in yeast [30,31]. The GEM-based methods offer an additional advantage because they can be used to reconcile inconsistencies among disparate data types [30]. Recently, a comprehensive experimental/computational framework was developed to infer metabolic and transcriptional regulation [1]. Such studies can reveal causal relationships within intertwined regulatory networks that depend on the availability of nutrient sources in different organisms. Other approaches based on metabolic modeling and optimization principles, like the cybernetic approach [32] or Pareto optimality [33], have been applied to understand complex regulatory mechanisms that describe cellular phenotypes. In the past, mixed-integer formulations were also suggested to infer regulatory interactions [34,35]. Although these approaches have not been further investigated, we believe that they remain valid for the study of metabolic and transcriptional regulation. The lack of metabolomics data and the limited knowledge of kinetic parameters are two significant limitations in the development and application of such methods.
Protein–protein interaction (PPI) networks are static scaffolds of signaling–regulatory events in cells. The reconstruction of PPI networks started more than a decade ago based on proteomics data, as shown in the assembly of the signaling network in the Salmonella-infected human cell [36]. Several signaling network reconstructions, such as those of the Sucrose Non-Fermenting kinase (Snf1) and the epidermal growth factor receptor, have been developed from extensive literature reviews [37,38]. These signaling networks are then converted to stoichiometric models using constraint-based methods [39,40]. To reconstruct the interactions in a given signaling pathway, one can use the PATHLINKER computational method [41]. Using a protein interaction network as background, PATHLINKER computes multiple short paths from the receptors to the transcriptional regulators (TRs) in a pathway. These paths are used to formulate hypotheses about the coupling of signaling with metabolic enzymes. Although the paths provide information only about the (quasi) steady-state operation of signaling and metabolic networks, this information can be used as a scaffold for the development of the corresponding kinetic models.
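The idea of ranking short receptor-to-regulator paths can be illustrated with a small sketch. It is not the PATHLINKER implementation: the protein names, the edge confidences, and the `k_shortest_paths` helper are hypothetical, and the path cost (a sum of negative log confidences) is an assumption chosen so that high-confidence paths rank first. The search expands partial paths in order of increasing cost and keeps them loop-free, so complete paths are returned from cheapest to most expensive.

```python
import heapq
import math


def k_shortest_paths(edges, sources, targets, k):
    """Return up to k lowest-cost loop-free paths from any source to any target.

    edges: dict mapping a node to a list of (neighbor, confidence in (0, 1]).
    The cost of a path is the sum of -log(confidence) over its edges, so a
    path made of confident interactions is cheap.
    """
    # Start one partial path at every receptor (source).
    heap = [(0.0, [s]) for s in sorted(sources)]
    heapq.heapify(heap)
    found = []
    while heap and len(found) < k:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node in targets:
            # Edge costs are non-negative, so complete paths pop in cost order.
            found.append((cost, path))
            continue
        for neighbor, confidence in edges.get(node, []):
            if neighbor not in path:  # keep the path loop-free
                new_cost = cost - math.log(confidence)
                heapq.heappush(heap, (new_cost, path + [neighbor]))
    return found


# Hypothetical toy interactome: two receptors, two intermediates, two TRs.
ppi_edges = {
    "R1": [("A", 0.9), ("B", 0.5)],
    "R2": [("B", 0.7)],
    "A": [("TF1", 0.9), ("B", 0.8)],
    "B": [("TF1", 0.6), ("TF2", 0.9)],
}
receptors = {"R1", "R2"}
regulators = {"TF1", "TF2"}

for cost, path in k_shortest_paths(ppi_edges, receptors, regulators, k=3):
    print(f"{' -> '.join(path)}  (cost {cost:.3f})")
```

Each returned path is a candidate signaling route; in the spirit of the text, the proteins on these routes would be the ones to inspect for links to metabolic enzymes.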
Qualitative and quantitative understanding of metabolic, regulatory and signaling networks: Integration and Analysis
Functional models of biological networks can be analyzed with two modeling approaches: qualitative and quantitative. Qualitative approaches provide a framework for studying biological networks at a steady state using their topology. These methods do not explicitly integrate kinetic descriptions and cannot analyze the cell behavior in the transition to a perturbed steady state. Because they require very few parameters, qualitative approaches scale to large networks. Quantitative approaches perform dynamic analyses of the cell behavior upon perturbation or stimulus, and they involve a set of equations, most commonly ordinary differential equations (ODEs), that are solved to determine the concentration of species over time. ODE models usually comprise different types of kinetic laws, such as mass action, together with their parameters. As the size of the model increases, the analyses become increasingly challenging because the number of unknown parameters rises and the reaction, regulatory and signaling mechanisms are poorly characterized. The kinetic parameters are determined with biochemical experiments or are estimated through calibration of the model against experimental measurements. Parameter quantification is critical to understand the strength of the interactions in the biological processes. The predictive ability of the calibrated model is then evaluated against experimental tests.
Steady-state approaches: flux balance analysis as an approach to integrate regulatory constraints
Metabolic networks have commonly been analyzed using qualitative or constraint-based approaches such as Flux Balance Analysis (FBA) [42]. FBA accounts for all of the gene products that are part of the metabolic network, and it defines mass balances around metabolites assuming a quasi-steady-state condition [42]. This assumption is applicable to metabolic networks because changes in metabolite concentration levels occur on the order of milliseconds. This time-scale is similar to that of signaling processes but is not representative of regulation, which can take up to minutes (Figure 2). The different dynamics (time-scales) of the processes, their mechanistic details, and the large size of the networks concerned necessitate the development of a multi-scale framework [8] (Figure 1).
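In compact form, the quasi-steady-state mass balance and the resulting optimization problem are commonly written as below. The notation is an assumption introduced here for illustration and does not appear in the original text: x is the vector of metabolite concentrations, S the stoichiometric matrix, v the vector of reaction fluxes, c a vector of objective weights (for example, selecting a biomass reaction), and v^min and v^max the flux bounds.

```latex
\begin{aligned}
\frac{d\mathbf{x}}{dt} &= S\,\mathbf{v} \approx \mathbf{0}
  && \text{(quasi-steady-state mass balance)} \\
\max_{\mathbf{v}} \quad & \mathbf{c}^{\top}\mathbf{v}
  && \text{(cellular objective)} \\
\text{subject to} \quad & S\,\mathbf{v} = \mathbf{0},
  \qquad \mathbf{v}^{\min} \le \mathbf{v} \le \mathbf{v}^{\max}
\end{aligned}
```

Because the problem is linear in v, it requires no kinetic parameters, which is why this class of methods scales to genome-size networks. Regulatory information typically enters through the bounds, for example by forcing the flux of a reaction to zero when its gene is not expressed.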
Metabolic and transcriptional regulatory networks (TRNs) were first integrated and analyzed within the FBA framework in 2001 using regulatory FBA (rFBA), which defines Boolean rules to account for transcriptional regulation [43]. Other methods have followed a similar approach: steady-state regulatory FBA (SR-FBA) [44], Probabilistic Regulation Of Metabolism (PROM) [45], and transcriptional controlled FBA (tFBA) [46]. Details about their applications and limitations have been extensively discussed before [47-49]. Macromolecular Expression models (ME-models) [50] are a stimulating novel framework that allows modeling of metabolism and expression at genome scale and at steady state. ME-models have been extended to perform multi-level integrative analysis of genomic, transcriptomic, ribosomal profiling, proteomic, and fluxomic data to enhance the predictions of cellular physiology [51]. These models have considerably extended the size of GEMs, and although they present computational challenges, specific computational methods have been developed to cope with such complexity [52].
Hybrid approaches: dynamic analysis through iterative FBA
Hybrid methods of FBA and ODEs were developed to integrate signaling data and to deal with the different time-scales of cellular processes: integrated FBA (iFBA) [53], dynamic FBA (dFBA) [54], integrated dynamic FBA (idFBA) [55], and the recently developed FlexFlux [56]. These hybrid methods couple FBA with differential equations and allow semi-quantitative prediction of the cell behavior upon perturbation, as shown with the dFBA analysis of E. coli cultures upon changing aerobic conditions [57]. Two software suites have recently been developed to perform multi-formalism simulation of biological networks: HepatoDyn [58] and MUFINS [59]. These methods can handle large metabolic networks, and they were applied to study the dynamic responses upon perturbation in human cells through the iterative simulation of steady-state conditions [58,59]. These and all the aforementioned static approaches assume that many of the system parameters or components remain time invariant and that, even if changes occur, they take place on a slower time-scale than the observation time of the system. Recent studies have shown that such an assumption does not significantly affect the results of the analyses. For example, in the control of blood glucose by insulin (Figure 2), the steady state of the system is maintained despite variations in the biochemical parameters, a robustness called dynamical compensation [60]. However, more complex dynamic behavior, such as oscillations, might require a more mechanistic description to discriminate the details that give rise to such dynamics and to prevent inconsistencies between data types.
Kinetic approaches: dynamic and integrative analysis with nonlinear models
One of the recent attempts towards a large-scale dynamic and integrative analysis of cellular processes is the whole-cell model of Mycoplasma genitalium, which was developed using 28 submodels for different cellular processes [61]. This model predicts the dynamic behavior of unobserved intracellular variables, such as the in vivo rates of protein-DNA association. The whole-cell modeling framework accounts independently for each cell process to differentiate the dynamics and time-scales. The availability of more dynamic data for the inference of the coupling between the cellular processes will be crucial for the improved performance of such large-scale, multi-process models [62]. The ultimate objective is the dynamic modeling of all three cellular processes using mechanistic rate laws and the same mathematical formalism. The biggest obstacle in the efforts towards dynamic modeling is the missing or partial information about the physicochemical parameters that underlie biochemical processes. Some approaches have been suggested to overcome the main burdens of kinetic models [63,64]. The uncertainty in kinetic parameters can be handled in frameworks such as the Optimization and Risk Analysis of Complex Living Entities (ORACLE) and ensemble modeling [64-66]. Such methods account for the uncertainty in the parameters by generating a population of models that are consistent with physiology [66,67]. The kinetic models developed with such approaches can be used to study the perturbations around a steady state and to draw statistical conclusions based on the properties of the generated model population [68].
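As a rough illustration of how a population of kinetic models can share one steady state, the sketch below samples the enzyme saturation of a single Michaelis-Menten step and back-calculates the kinetic parameters. This is not the ORACLE or ensemble modeling implementation: the one-reaction system, the uniform sampling range, the reference flux and concentration, and all function names are assumptions made for illustration only.

```python
import random


def sample_models(n, flux=1.0, conc=1.0, seed=0):
    """Sample Michaelis-Menten parameter sets that reproduce one steady state.

    Each model draws a saturation sigma = X / (Km + X) and back-calculates
    Km and Vmax so that Vmax * X / (Km + X) equals the reference flux at the
    reference concentration X = conc.
    """
    rng = random.Random(seed)
    models = []
    for _ in range(n):
        sigma = rng.uniform(0.05, 0.95)  # assumed sampling range
        km = conc * (1.0 - sigma) / sigma
        vmax = flux / sigma
        models.append({"sigma": sigma, "km": km, "vmax": vmax})
    return models


def rate(model, conc):
    """Michaelis-Menten rate of the consuming reaction."""
    return model["vmax"] * conc / (model["km"] + conc)


def new_steady_state(model, inflow):
    """Concentration at which consumption balances a new constant inflow.

    Returns None when the enzyme cannot carry the new flux (inflow >= Vmax).
    """
    if inflow >= model["vmax"]:
        return None
    return inflow * model["km"] / (model["vmax"] - inflow)


# Every model agrees on the reference state but not on the perturbed one.
population = sample_models(1000)
responses = [new_steady_state(m, inflow=1.2) for m in population]
feasible = sorted(x for x in responses if x is not None)
print(f"models able to carry a 20% higher flux: {len(feasible)} of {len(population)}")
print(f"median new concentration: {feasible[len(feasible) // 2]:.2f}")
```

All sampled models are indistinguishable at the reference steady state, yet they predict different responses to the same perturbation. Conclusions are therefore read from the distribution over the population and not from any single parameter set.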
Applications and outlook of integrative approaches to analyze cellular processes
Reconstruction of metabolic, regulatory and signaling networks has made substantial progress in the last decades, with the most significant advances in the area of metabolic networks. Current efforts on the reconstruction of regulatory and signaling networks [39,40] focus on producing data and tools that will facilitate the reconstruction of these networks at the genome-scale. Such networks can then be linked to genome-scale metabolic models, and they can be utilized to quantitatively predict the cell dynamic behavior from extracellular perturbations to metabolic responses, whole-cell and organ physiology [58,59]. It is envisioned that such integrative analyses can be employed as a tool to provide predictive, preventive, personalized and participatory healthcare information (P4 medicine), and to allow the design of more effective medical treatments [69,70].
Towards the integrative analysis of biological processes, we face the challenge of reconciling their different dynamics (time-scales) and the available data. In the latest review on whole-cell models [62], the authors discuss the potential of such models to predict pleiotropic effects upon perturbation, and they identify the lack of organism-specific dynamic data as one of the main barriers to applying whole-cell modeling [62]. We further believe that despite the technological advances and the increasing types and amounts of available omics data, significant biochemical knowledge gaps remain. This issue necessitates methods that enable analyzing the increasing sets of data and identifying the knowledge gaps in a systematic way. For example, although metabolic networks are better characterized than regulatory and signaling networks, the enzymatic functions and the catalytic and kinetic properties of most enzymes remain to be characterized. The recently developed ATLAS of biochemistry [71] suggests that there might be more than 130,000 possible enzymatic reactions between known biological compounds. We need further tools of this kind that will enable us to identify all possible metabolic capabilities and regulatory interactions. Such predictions will provide an upper bound on the knowledge gaps and will guide the experimental studies to fill these gaps.
Biotechnology can also benefit from the predictions of dynamic and integrative analyses. Toxicity caused by intermediates of metabolic pathways could be alleviated by rebalancing the expression of the downstream pathway enzymes [72]. In a recent study, the authors suggest that regulatory mechanisms could be integrated to regulate the balance of substrates and products [72]. To engineer a cell that can itself rebalance the flux through a pathway, we need to design novel regulatory mechanisms and calibrate the strength of the regulatory interactions. Available computational tools propose regulatory mechanisms that satisfy a phenotype [33-35]. In addition, tools that allow targeted perturbations with high efficacy, like the latest CRISPR/Cas9 technologies [73,74], could be used to test the predictions by performing genetic knockouts and modifications of gene expression.
Conclusions
Understanding cellular behavior under different conditions requires the development of computational approaches that integrate and analyze all the available knowledge in static and dynamic fashions. Experimental data are also needed to test the predictions made using computational methods and to characterize the biological networks to their full extent.
The approaches for the analysis of metabolic, regulatory and signaling networks are well developed and have been further integrated into multi-scale and whole-cell models. However, their coupling remains obscure, partially due to the lack of data, such as unknown kinetic parameters and regulatory mechanisms. Dynamic modeling is a promising approach for connecting cellular processes in a single modeling framework. Although kinetic modeling still faces challenges, such as uncertainty in the kinetic parameters and limited applicability to large networks, recent developments demonstrate the advances towards fully integrative analysis of metabolism under uncertainty. Future challenges will be the mathematical and computational issues associated with parameter identification and the computational analysis of large-scale nonlinear models. While large nonlinear models have been analyzed before for other physical systems, such as weather and climate, biological systems introduce new types of network interactions that will require further developments from the computational and engineering sciences.
Acknowledgements
We would like to apologize to the authors of other relevant articles whose work was not cited in this review due to limited space. The authors gratefully acknowledge funding from SystemsX.ch, the Swiss Initiative for Systems Biology evaluated by the Swiss National Science Foundation (http://www.systemsx.ch/), grants MalarX and MicroscapesX, and the École Polytechnique Fédérale de Lausanne (EPFL). The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The authors gratefully acknowledge Dr. Hadadi for critical feedback on this manuscript.
References and recommended reading
Papers of particular interest, published within the period of review, have been highlighted as:
• of special interest
•• of outstanding interest
1. Oliveira AP, Dimopoulos S, Busetto AG, Christen S, Dechant R, Falter L, Chehreghani MH, Jozefczuk S, Ludwig C, Rudroff F, et al.: Inferring causal metabolic signals that regulate the dynamic TORC1-dependent transcriptome. Mol Syst Biol 2015, 11.
2. Kitano H: Systems biology: a brief overview. Science 2002, 295:1662-1664.
3. Hatzimanikatis V: Bioinformatics and functional genomics: Challenges and opportunities. AIChE Journal 2000, 46:2339-2343.
4. Hatzimanikatis V, Saez-Rodriguez J: Integrative approaches for signalling and metabolic networks. Integr Biol (Camb) 2015, 7:844-845.
5. Lechner A, Brunk E, Keasling JD: The Need for Integrated Approaches in Metabolic Engineering. Cold Spring Harb Perspect Biol 2016, 8.
6. Perelson AS: Network thermodynamics. An overview. Biophys J 1975, 15:667-685.
7. Soh KC, Hatzimanikatis V: Network thermodynamics in the post-genomic era. Curr Opin Microbiol 2010, 13:350-357.
8. Yu JS, Bagheri N: Multi-class and multi-scale models of complex biological phenomena. Curr Opin Biotechnol 2016, 39:167-173.
9. Boucher J, Kleinridders A, Kahn CR: Insulin receptor signaling in normal and insulin-resistant states. Cold Spring Harb Perspect Biol 2014, 6.
10. Cakir T, Khatibipour MJ: Metabolic network discovery by top-down and bottom-up approaches and paths for reconciliation. Front Bioeng Biotechnol 2014, 2:62.
11. Thiele I, Palsson BO: A protocol for generating a high-quality genome-scale metabolic reconstruction. Nat Protoc 2010, 5:93-121.
12. O'Brien EJ, Monk JM, Palsson BO: Using Genome-scale Models to Predict Biological Capabilities. Cell 2015, 161:971-987.
13. Agren R, Liu LM, Shoaie S, Vongsangnak W, Nookaew I, Nielsen J: The RAVEN Toolbox and Its Use for Generating a Genome-scale Metabolic Model for Penicillium chrysogenum. PLoS Comput Biol 2013, 9.
14. Dias O, Rocha M, Ferreira EC, Rocha I: Reconstructing genome-scale metabolic models with merlin. Nucleic Acids Res 2015, 43:3899-3910.
15. Loira N, Zhukova A, Sherman DJ: Pantograph: A template-based method for genome-scale metabolic model reconstruction. J Bioinform Comput Biol 2015, 13.
16. Wang YL, Eddy JA, Price ND: Reconstruction of genome-scale metabolic models for 126 human tissues using mCADRE. BMC Syst Biol 2012, 6.
17. Devoid S, Overbeek R, DeJongh M, Vonstein V, Best AA, Henry C: Automated genome annotation and metabolic model reconstruction in the SEED and Model SEED. Methods Mol Biol 2013, 985:17-45.
18. Orth JD, Conrad TM, Na J, Lerman JA, Nam H, Feist AM, Palsson BO: A comprehensive genome-scale reconstruction of Escherichia coli metabolism-2011. Mol Syst Biol 2011, 7.
19. Osterlund T, Nookaew I, Bordel S, Nielsen J: Mapping condition-dependent regulation of metabolism in yeast through genome-scale modeling. BMC Syst Biol 2013, 7.
20. Thiele I, Swainston N, Fleming RMT, Hoppe A, Sahoo S, Aurich MK, Haraldsdottir H, Mo ML, Rolfsson O, Stobbe MD, et al.: A community-driven global reconstruction of human metabolism. Nat Biotechnol 2013, 31:419-+.
21. Tymoshenko S, Oppenheim RD, Agren R, Nielsen J, Soldati-Favre D, Hatzimanikatis V: Metabolic Needs and Capabilities of Toxoplasma gondii through Combined Computational and Experimental Analysis. PLoS Comput Biol 2015, 11.
22. Simicevic J, Deplancke B: Transcription factor proteomics - tools, applications, and challenges. Proteomics 2016.
23. Tripathi S, Vercruysse S, Chawla K, Christie KR, Blake JA, Huntley RP, Orchard S, Hermjakob H, Thommesen L, Lægreid A, et al.: Gene regulation knowledge commons: community action takes care of DNA binding transcription factors. Database (Oxford) 2016.
24. Liu ZP, Wu CL, Miao HY, Wu HL: RegNetwork: an integrated database of transcriptional and post-transcriptional regulatory networks in human and mouse. Database (Oxford) 2015.
25. Yang TH, Wang CC, Wang YC, Wu WS: YTRP: a repository for yeast transcriptional regulatory pathways. Database (Oxford) 2014, 2014:bau014.
26. Mi H, Poudel S, Muruganujan A, Casagrande JT, Thomas PD: PANTHER version 10: expanded protein families and functions, and analysis tools. Nucleic Acids Res 2016, 44:D336-342.
27. Chowdhury S, Sarkar RR: Comparison of human cell signaling pathway databases - evolution, drawbacks and challenges. Database (Oxford) 2015.
28. Kurt Z, Aydin N, Altay G: Comprehensive review of association estimators for the inference of gene networks. Turkish Journal of Electrical Engineering and Computer Sciences 2016, 24:695-U1401.
29. Ud-Dean SMM, Gunawan R: Optimal design of gene knockout experiments for gene regulatory network inference. Bioinformatics 2016, 32:875-883.
30. Chandrasekaran S, Price ND: Metabolic constraint-based refinement of transcriptional regulatory networks. PLoS Comput Biol 2013, 9:e1003370.
31. Osterlund T, Nookaew I, Bordel S, Nielsen J: Mapping condition-dependent regulation of metabolism in yeast through genome-scale modeling. BMC Syst Biol 2013, 7:36.
32. Ramkrishna D, Song HS: Dynamic models of metabolism: Review of the cybernetic approach. AIChE Journal 2012, 58:986-997.
33. Otero-Muras I, Banga JR: Exploring Design Principles of Gene Regulatory Networks via Pareto Optimality. IFAC-PapersOnLine 2016, 49:809-814.
34. Hatzimanikatis V, Floudas CA, Bailey JE: Optimization of regulatory architectures in metabolic reaction networks. Biotechnol Bioeng 1996, 52:485-500.
35. Thomas R, Paredes CJ, Mehrotra S, Hatzimanikatis V, Papoutsakis ET: A model-based optimization framework for the inference of regulatory interactions using time-course DNA microarray expression data. BMC Bioinformatics 2007, 8.
36. Budak G, Eren Ozsoy O, Aydin Son Y, Can T, Tuncbag N: Reconstruction of the temporal signaling network in Salmonella-infected human cells. Front Microbiol 2015, 6:730.
37. Lubitz T, Welkenhuysen N, Shashkova S, Bendrioua L, Hohmann S, Klipp E, Krantz M: Network reconstruction and validation of the Snf1/AMPK pathway in baker’s yeast based on a comprehensive literature review. npj Systems Biology and Applications 2015, 1:15007.
38. Oda K, Matsuoka Y, Funahashi A, Kitano H: A comprehensive pathway map of epidermal growth factor receptor signaling. Mol Syst Biol 2005, 1:2005.0010.
39. Li F, Thiele I, Jamshidi N, Palsson BØ: Identification of Potential Pathway Mediation Targets in Toll-like Receptor Signaling. PLoS Comput Biol 2009, 5:e1000292.
40. Choudhary KS, Rohatgi N, Halldorsson S, Briem E, Gudjonsson T, Gudmundsson S, Rolfsson O: EGFR Signal-Network Reconstruction Demonstrates Metabolic Crosstalk in EMT. PLoS Comput Biol 2016, 12:e1004924.
41. Ritz A, Poirel CL, Tegge AN, Sharp N, Simmons K, Powell A, Kale SD, Murali TM: Pathways on demand: automated reconstruction of human signaling networks. npj Systems Biology and Applications 2016, 2:16002.
42. Orth JD, Thiele I, Palsson BO: What is flux balance analysis? Nat Biotechnol 2010, 28:245-248.
43. Covert MW, Schilling CH, Palsson B: Regulation of gene expression in flux balance models of metabolism. J Theor Biol 2001, 213:73-88.
44. Shlomi T, Cabili MN, Herrgard MJ, Palsson BO, Ruppin E: Network-based prediction of human tissue-specific metabolism. Nat Biotechnol 2008, 26:1003-1010.
45. Chandrasekaran S, Price ND: Probabilistic integrative modeling of genome-scale metabolic and regulatory networks in Escherichia coli and Mycobacterium tuberculosis. Proc Natl Acad Sci U S A 2010, 107:17845-17850.
46. van Berlo RJ, de Ridder D, Daran JM, Daran-Lapujade PA, Teusink B, Reinders MJ: Predicting metabolic fluxes using gene expression differences as constraints. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2011, 8:206-216.
47. Vivek-Ananth RP, Samal A: Advances in the integration of transcriptional regulatory information into genome-scale metabolic models. Biosystems 2016, 147:1-10.
48. Imam S, Schauble S, Brooks AN, Baliga NS, Price ND: Data-driven integration of genome-scale regulatory and metabolic network. Front Microbiol 2015, 6.
49. Faria JP, Overbeek R, Xia F, Rocha M, Rocha I, Henry CS: Genome-scale bacterial transcriptional regulatory networks: reconstruction and integrated analysis with metabolic models. Brief Bioinform 2014, 15:592-611.
50. Lerman JA, Hyduke DR, Latif H, Portnoy VA, Lewis NE, Orth JD, Schrimpe-Rutledge AC, Smith RD, Adkins JN, Zengler K, et al.: In silico method for modelling metabolism and gene product expression at genome scale. Nat Commun 2012, 3.
51. Ebrahim A, Brunk E, Tan J, O'Brien EJ, Kim D, Szubin R, Lerman JA, Lechner A, Sastry A, Bordbar A, et al.: Multi-omic data integration enables discovery of hidden biological regularities. Nat Commun 2016, 7:13091.
52. Yang L, Ma D, Ebrahim A, Lloyd CJ, Saunders MA, Palsson BO: solveME: fast and reliable solution of nonlinear ME models. BMC Bioinformatics 2016, 17:391.
53. Covert M, Xiao N, Chen TJ, Karr JR: Integrating metabolic, transcriptional regulatory and signal transduction models in Escherichia coli. Bioinformatics 2008, 24:2044-2050.
54. Richard G, Chang H, Cizelj I, Belta C, Julius AA, Amar S: Integration of large-scale metabolic, signaling, and gene regulatory networks with application to infection responses. 2011 50th IEEE Conference on Decision and Control and European Control Conference (CDC-ECC) 2011:2227-2232.
55. Lee JM, Gianchandani EP, Eddy JA, Papin JA: Dynamic analysis of integrated signaling, metabolic, and regulatory networks. PLoS Comput Biol 2008, 4.
56. Marmiesse L, Peyraud R, Cottret L: FlexFlux: combining metabolic flux and regulatory network analyses. BMC Syst Biol 2015, 9:93.
57. von Wulffen J, RecogNice T, Sawodny O, Feuer R: Transition of an Anaerobic Escherichia coli Culture to Aerobiosis: Balancing mRNA and Protein Levels in a Demand-Directed Dynamic Flux Balance Analysis. PLoS One 2016, 11:e0158711.
58. Foguet C, Marin S, Selivanov VA, Fanchon E, Lee WN, Guinovart JJ, de Atauri P, Cascante M: HepatoDyn: A Dynamic Model of Hepatocyte Metabolism That Integrates 13C Isotopomer Data. PLoS Comput Biol 2016, 12:e1004899.
59. Wu HW, von Kamp A, Leoncikas V, Mori W, Sahin N, Gevorgyan A, Linley C, Grabowski M, Mannan AA, Stoy N, et al.: MUFINS: multi-formalism interaction network simulator. npj Systems Biology and Applications 2016, 2.
60. Karin O, Swisa A, Glaser B, Dor Y, Alon U: Dynamical compensation in physiological circuits. Mol Syst Biol 2016, 12:886.
61. Karr JR, Sanghvi JC, Macklin DN, Gutschow MV, Jacobs JM, Bolival B, Assad-Garcia N, Glass JI, Covert MW: A Whole-Cell Computational Model Predicts Phenotype from Genotype. Cell 2012, 150:389-401.
62. Macklin DN, Ruggero NA, Covert MW: The future of whole-cell modeling. Curr Opin Biotechnol 2014, 28:111-115.
63. Miskovic L, Tokic M, Fengos G, Hatzimanikatis V: Rites of passage: requirements and standards for building kinetic models of metabolic phenotypes. Curr Opin Biotechnol 2015, 36:146-153.
64. Zomorrodi AR, Lafontaine Rivera JG, Liao JC, Maranas CD: Optimization-driven identification of genetic perturbations accelerates the convergence of model parameters in ensemble modeling of metabolic networks. Biotechnol J 2013, 8:1090-1104.
65. Miskovic L, Hatzimanikatis V: Production of biofuels and biochemicals: in need of an ORACLE. Trends Biotechnol 2010, 28:391-397.
66. Chakrabarti A, Miskovic L, Soh KC, Hatzimanikatis V: Towards kinetic modeling of genome-scale metabolic networks without sacrificing stoichiometric, thermodynamic and physiological constraints. Biotechnol J 2013, 8:1043-U1105.
67. Andreozzi S, Chakrabarti A, Soh KC, Burgard A, Yang TH, Van Dien S, Miskovic L, Hatzimanikatis V: Identification of metabolic engineering targets for the enhancement of 1,4-butanediol production in recombinant E. coli using large-scale kinetic models. Metab Eng 2016, 35:148-159.
68. Andreozzi S, Miskovic L, Hatzimanikatis V: iSCHRUNK--In Silico Approach to Characterization and Reduction of Uncertainty in the Kinetic Models of Genome-scale Metabolic Networks. Metab Eng 2016, 33:158-168.
69. Flores M, Glusman G, Brogaard K, Price ND, Hood L: P4 medicine: how systems medicine will transform the healthcare sector and society. Per Med 2013, 10:565-576.
70. Kolch W, Halasz M, Granovskaya M, Kholodenko BN: The dynamic control of signal transduction networks in cancer cells. Nat Rev Cancer 2015, 15:515-527.
71. Hadadi N, Hafner J, Shajkofci A, Zisaki A, Hatzimanikatis V: ATLAS of Biochemistry: A Repository of All Possible Biochemical Reactions for Synthetic Biology and Metabolic Engineering Studies. ACS Synth Biol 2016, 5:1155-1166.
72. Chubukov V, Mukhopadhyay A, Petzold CJ, Keasling JD, Garcia Martin H: Synthetic and systems biology for microbial production of commodity chemicals. npj Systems Biology and Applications 2016, 2.
73. Gilbert LA, Horlbeck MA, Adamson B, Villalta JE, Chen Y, Whitehead EH, Guimaraes C, Panning B, Ploegh HL, Bassik MC, et al.: Genome-Scale CRISPR-Mediated Control of Gene Repression and Activation. Cell 2014, 159:647-661.
74. Wang T, Wei JJ, Sabatini DM, Lander ES: Genetic Screens in Human Cells Using the CRISPR-Cas9 System. Science 2014, 343:80-84.
Summary of recommended reading
Papers of particular interest, published within the period of review, have been highlighted as:
• of special interest
•• of outstanding interest
1. Oliveira AP, Dimopoulos S, Busetto AG, Christen S, Dechant R, Falter L, Chehreghani MH, Jozefczuk S, Ludwig C, Rudroff F, et al.: Inferring causal metabolic signals that regulate the dynamic TORC1-dependent transcriptome. Mol Syst Biol 2015, 11.
•• Oliveira et al. co-designed dynamic experiments and a probabilistic, model-based method to infer causal relationships between metabolism, signaling, and gene regulation. They performed dynamic multi-level omics measurements of yeast cells perturbed by modulating the quality of the N-source and by chemically inhibiting TORC1. The authors identified the dynamics of the underlying regulation and signaling mechanisms up- and downstream of TORC1.
33. Otero-Muras I, Banga JR: Exploring Design Principles of Gene Regulatory Networks via Pareto Optimality. IFAC-PapersOnLine 2016, 49:809-814.
•• Otero-Muras et al. developed a model-based exploration approach that is based on Pareto optimality principles to identify motifs capable of performing a specific biological task. They identified design patterns in three gene regulatory networks that confer on a tissue of isogenic cells the capability to form a stripe of gene expression in response to a morphogen gradient.
51. Ebrahim A, Brunk E, Tan J, O'Brien EJ, Kim D, Szubin R, Lerman JA, Lechner A, Sastry A, Bordbar A, et al.: Multi-omic data integration enables discovery of hidden biological regularities. Nat Commun 2016, 7:13091.
• Ebrahim et al. developed methods to integrate multi-level omics data into GEMs and to analyze them in that context. They identified regularities between biological processes at different scales, such as the link between translation rate and protein secondary structure. Their approach also allows the identification of data inconsistencies and their quantitative reconciliation.
59. Wu HW, von Kamp A, Leoncikas V, Mori W, Sahin N, Gevorgyan A, Linley C, Grabowski M, Mannan AA, Stoy N, et al.: MUFINS: multi-formalism interaction network simulator. npj Systems Biology and Applications 2016, 2.
•• Wu et al. developed the MUFINS software to allow multi-formalism simulation of interaction networks. The authors simultaneously model networks describing gene regulation, signaling and whole-cell metabolism at steady state. They used Recon2 to study changes in human metabolism among different conditions, such as the effect of cortisol infusion on the metabolic network and on the blood concentration of other chemicals.
60. Karin O, Swisa A, Glaser B, Dor Y, Alon U: Dynamical compensation in physiological circuits. Mol Syst Biol 2016, 12:886.
•• Karin et al. studied the effect of varying physiological parameters on the dynamic response of systems, and identified a design principle that provides the desired robustness, which they call dynamical compensation. They applied this principle to study the responses of endocrine and neural tissues, such as the control of blood glucose by insulin.
67. Andreozzi S, Chakrabarti A, Soh KC, Burgard A, Yang TH, Van Dien S, Miskovic L, Hatzimanikatis V: Identification of metabolic engineering targets for the enhancement of 1,4-butanediol production in recombinant E. coli using large-scale kinetic models. Metab Eng 2016, 35:148-159.
• Andreozzi et al. used ORACLE to identify potential strategies for improved production of 1,4-butanediol in E. coli. The authors constructed a population of large-scale kinetic models that describe the observed physiology and identified groups of enzymes that control, and could potentially increase, the biosynthesis of 1,4-butanediol.
68. Andreozzi S, Miskovic L, Hatzimanikatis V: iSCHRUNK--In Silico Approach to Characterization and Reduction of Uncertainty in the Kinetic Models of Genome-scale Metabolic Networks. Metab Eng 2016, 33:158-168.
•• Andreozzi et al. developed iSCHRUNK using the ORACLE framework and machine learning principles to determine and quantify the kinetic parameters that correspond to a certain physiology. The authors applied iSCHRUNK to a 1,4-butanediol-producing E. coli strain and identified a narrow set of enzymes and kinetic parameters associated with the observed physiology.
70. Kolch W, Halasz M, Granovskaya M, Kholodenko BN: The dynamic control of signal transduction networks in cancer cells. Nat Rev Cancer 2015, 15:515-527.
• Kolch et al. comprehensively review the latest discoveries that show how non-genetic adaptations, such as signal transduction networks, can generate cancer cell heterogeneity and improve survival.
71. Hadadi N, Hafner J, Shajkofci A, Zisaki A, Hatzimanikatis V: ATLAS of Biochemistry: A Repository of All Possible Biochemical Reactions for Synthetic Biology and Metabolic Engineering Studies. ACS Synth Biol 2016, 5:1155-1166.
•• Hadadi et al. developed the ATLAS of Biochemistry using BNICE.ch. The authors explored the existing and possible novel biochemical space, and identified up to 130,000 possible biochemical reactions between the known biological compounds based on the biochemistry reported in the KEGG database.
Figures
[Figure 1 image not reproduced. Labels recoverable from the figure: column titles WORKFLOW, KEY ELEMENTS and CHALLENGES; numbered steps 1 to 4; workflow labels Knowledge, Biological network reconstruction, Integration, Analysis, Understanding, and Applications in biology, medicine and biotechnology; further labels Mathematical principles, Chemical interactions and their crosstalk, Biophysics and chemistry, Formalism, Mechanistics, Dynamics, Inherent uncertainty, Observability, Reconciliation, Data inconsistency, and Size. Insets show a glucose metabolism network and time courses on scales from seconds to minutes.]
Figure 1. Workflow and challenges in the integrative analysis of biological networks
[Figure 2 image not reproduced. Panels: Signaling, Metabolism, Regulation. Compartments: Extracellular, Intracellular, Nucleus. Network nodes: Insulin receptor, IRS, PI3K, Akt, GSK3, FOXO, GS, GLK, PFK, G6PC, FBP, PCK1, GLC, G6P, F6P, PEP, PYR, OXA, Glycogen. Inset plots show concentration of species against time, with time axes ranging from milliseconds to minutes.]
Figure 2. Crosstalk between biological processes and differences in their dynamics. We illustrate how different cellular processes, namely signaling, regulation, and metabolism, are involved in maintaining the blood glucose level. In response to a high blood glucose level, insulin in the blood serum activates a signaling cascade. It starts with the phosphorylation of the IRS protein, which then binds PI3K and subsequently activates the Akt cascade. The activated Akt inhibits glycogen synthesis (GS) through inactivation of GSK3, and inhibits the expression of gluconeogenesis enzymes (G6PC, FBP, and PCK1) through the FOXO transcription factor. Inhibition of glycogen synthesis and gluconeogenesis ensures that cells may utilize glucose for other metabolic tasks, such as energy production, in response to a high glucose level. With caricatures, we show how different dynamics govern the three biological processes. Legend: IRS, insulin receptor substrate; PI3K, phosphoinositide 3-kinase; GSK3, glycogen synthase kinase 3; G6PC, glucose-6-phosphatase; FBP, fructose-1,6-bisphosphatase; PCK1, phosphoenolpyruvate carboxykinase 1; FOXO, the forkhead protein family; GLC, glucose; G6P, glucose 6-phosphate; F6P, fructose 6-phosphate; PEP, phosphoenolpyruvate; OXA, oxaloacetate; PYR, pyruvate.
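The difference in dynamics between the three processes in Figure 2 can be illustrated with a minimal sketch, assuming each process relaxes toward a new steady state as a first-order system. The time constants below (1 s, 60 s, 3600 s) are hypothetical values chosen only to echo the millisecond-to-minute axes of the figure; they are not taken from the paper, and the function names are illustrative.

```python
def first_order_response(tau, t_end, dt):
    """Euler-integrate dx/dt = (1 - x) / tau from x(0) = 0.

    Returns a list of (time, x) pairs. x is a normalized response
    (0 = initial state, 1 = new steady state).
    """
    steps = int(round(t_end / dt))
    x = 0.0
    trajectory = [(0.0, x)]
    for i in range(1, steps + 1):
        x += dt * (1.0 - x) / tau
        trajectory.append((i * dt, x))
    return trajectory


def time_to_fraction(trajectory, fraction=0.9):
    """Return the first time at which the response reaches `fraction`."""
    for t, x in trajectory:
        if x >= fraction:
            return t
    return None


# Hypothetical time constants in seconds (assumptions, not from the paper).
TAU = {"signaling": 1.0, "metabolism": 60.0, "regulation": 3600.0}


def response_times(taus=TAU, fraction=0.9):
    """Time for each process to reach `fraction` of its new steady state."""
    result = {}
    for name, tau in taus.items():
        # Simulate five time constants with a step of tau / 1000.
        trajectory = first_order_response(tau, t_end=5 * tau, dt=tau / 1000)
        result[name] = time_to_fraction(trajectory, fraction)
    return result


if __name__ == "__main__":
    for name, t90 in response_times().items():
        print(f"{name}: 90% response after about {t90:.3g} s")
```

Under these assumptions the 90% response time scales with the time constant (analytically, tau times ln 10), so the three processes settle over widely separated time spans. This separation is one reason an integrated dynamic model tends to be numerically stiff.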