Colin Hill discusses in silico modeling of human cells


UPDATE

BIOSILICO Vol. 1, No. 5 November 2003

INTERVIEW

often represent a small subset of the problem and are unlikely to contribute much to the understanding of common forms of complex polygenic illnesses. However, experiments such as these often contribute important single points of insight into these illnesses. Insights gleaned from large numbers of these experiments can be pieced together to help elucidate the underlying pathways associated with the disease. For example, in the case of Alzheimer’s disease, mutations in three genes, β-amyloid precursor protein (β-APP), presenilin 1 (PS1) and presenilin 2 (PS2), account for 30–40% of early onset familial forms of the disease. These changes are suggestive of a disease mechanism: alteration in the processing of the β-APP protein such that an increased amount of the 42 amino acid variant of the amyloid β-peptide is produced. This peptide is fibrillogenic and plaques in the brains of Alzheimer’s disease patients are rich in the variant peptide. Combined with other evidence, these studies have implicated amyloid fibril-induced nerve cell death as a primary cause of Alzheimer’s disease [3].

Disease reclassification: an emerging theme in personalized medicine

Efforts to unravel the molecular-level complexities of diseases like schizophrenia and bipolar disorder will continue to depend on the availability of information from a variety of sources, such as the human genome project, cell-specific gene expression profiles, sensitive protein detection experiments and clinical records. Unfortunately, there is no simple solution. Over the next few years microarrays are likely to evolve from a basic research tool into a diagnostic instrument that can be used to accurately quantify well-characterized sets of messages. These data must ultimately be combined with clinical observations and precise measurements of specific protein levels that are known to correlate with the onset of certain diseases. New views are likely to emerge that contain disease classifications based on genotype and gene expression profile. These new classifications are likely to reveal surprising relationships between previously distinct illnesses as well as new patient

Colin Hill discusses in silico modeling of human cells

Interview by Christopher Watson

Colin Hill, Gene Network Sciences

Colin Hill is the founder of Gene Network Sciences (GNS; http://www.gnsbiotech.com) and serves as President and Chief Executive Officer. He has extensive scientific experience in the areas of gene network modeling, pioneering the application of methods based in statistical physics and non-linear dynamics to the stochastic dynamics of gene expression. He is the co-founder of a multidisciplinary research effort at Cornell University dedicated to combining computational and experimental approaches to the study of signal transduction pathways. Hill is the co-creator of the Digital Cell™ software environment for the modeling of complex gene networks and biochemical pathways. He earned his BS degree in Physics from Virginia Polytechnic and State University and his MS degrees in Physics from McGill University and Cornell University.

1478-5282/03/$ – see front matter ©2003 Elsevier Science Ltd. All rights reserved. PII: S1478-5282(03)02379-7

subpopulations that, despite similar phenotypes, are markedly different. Recent discoveries about schizophrenia and manic depression are suggestive of this trend. The fact that a wide variety of genes are involved in these illnesses helps explain the variability of responses to treatment (in many cases a lack of response). Conversely, the genetic-level overlap that seems to manifest itself as a failure of the myelination process suggests that some patients afflicted with bipolar disorder have the same illness as some patients with schizophrenia. Such revelations are likely to emerge in almost all disease categories.

References
1 Tkachev, D. et al. (2003) Oligodendrocyte dysfunction in schizophrenia and bipolar disorder. Lancet 362, 798–805
2 Mirnics, K. et al. (2001) Analysis of complex brain disorders with gene expression microarrays: schizophrenia as a disease of the synapse. Trends Neurosci. 24, 479–486
3 Bradbury, J. (2003) Bipolar disorder gene identified. Drug Discov. Today 8, 724–726
4 Whittaker, P. (2001) From symptomatic treatments to causative therapy? Curr. Opin. Chem. Biol. 2, 352–359

What is GNS’s approach to cellular modeling and simulations and how does this approach differ from others that are available?

Our approach is to systematically integrate heterogeneous data into comprehensive data-driven models of human cancer cells and to use these models to validate and prioritize targets for drug discovery and development, and to assess the effects of compounds against these targets. We have developed our own technologies, such as the Diagrammatic Cell Language™ (DCL™) and the Network Inference Engine (patent pending), which allow us to create models of much greater complexity and that can ‘suck’ in data from many different sources, allowing us to build models of a size, and therefore accuracy, that no other group can do.

What are the advantages of DCL™?

There are tens of thousands of components, states and interactions possible within the cell. For example, there are proteins in signal transduction pathways that have many different states, arising from modification of different phosphorylation sites and so on. So there is this combinatorial complexity of interactions, and this complexity can only be dealt with using an approach such as DCL™, otherwise you end up with an explosion of interactions. DCL™ allows you to represent pathways in a much more compact way, enabling one to construct pathway models on the genome scale. There is absolutely no way, using graphical representations familiar to biologists, of representing and constructing simulations of the dynamics of gene expression networks and protein signaling pathways, it just does not work.

Although your computer simulation is large, it still only represents a small percentage of the circuitry of a human cell. What about the rest? Does the absence of a large proportion of cellular components compromise the model?

The goal is to construct a model at the genome-wide scale and this necessitates computational hypothesis testing. This is where our second new technology comes into play, the Network Inference Engine. This is really a marriage of bioinformatics, simulation output and quantitative data. GNS has proprietary algorithms for inferring pathways from many different data types and which utilize forward predictive simulations to make predictions of unknown biology. This is where bioinformatics comes into play – trying to characterize genes that have no literature information about them and inferring how these genes are connected to our core model. It is important that our model is of this size as it allows us to more easily place these unknown genes into context – to ascertain how these pieces fit into the bigger puzzle.

There has been some debate over the best strategy for biological simulation, whether it should be ‘bottom-up’, ‘top-down’ or some combination of the two. What are your thoughts on this?
From the beginning GNS focused on the more rigorous ‘bottom-up’ approach. This is the only way to model the complex diseases such as cancer that the pharmaceutical industry is most interested in at a level of resolution that can predict the correct ‘hows’ and ‘whys’ of drug action. This cannot be done without a complete and comprehensive understanding at the molecular level. When Iya Khalil and I started the company some years ago, the focus was to build the most accurate and detailed simulation of a human cancer cell, and it had to be from the molecules up. If you are doing it at a resolution above that the models are going to be phenomenological. With top-down models, one cannot distinguish a good data point from a bad data point and these models cannot be driven by all the proteomic and genomic data that are now becoming plentiful. For example, although the concentration of a certain protein under a particular condition would be an output of our model, it would probably be an input into a top-down model, and therefore a bad experimental data point would more easily ‘corrupt’ a top-down model. However, it does depend on the disease one is looking at, I should make that clear. If it is asthma or obesity then it might have to be more of a hybrid approach, but if it is cancer or many other complex diseases, then bottom-up is the way to do it.

Part of your approach involves data and text mining, combing the literature for biological information about protein interactions and so on. Does the culling of data from different laboratories and under disparate conditions represent a problem or limitation for the model?

I think that one has to be systematic about what is and is not included. Molecular-level models are able to accept data of different qualities. There might be some interactions that are well-characterized, others that are not. For example, it is known from the literature that the two key cancer proteins Ras and Raf interact. However, one does not always have that kind of certainty. What then happens with that other information? Is it thrown out? Or is there some way of including it? One of the key things about our model is that it can validate or invalidate various experimental data and evaluate literature data based on some set of facts that one is close to certain of. However, although important, I do not believe that text mining is the bottleneck that we are facing.

What in your opinion is the bottleneck?

I believe that the bottleneck is incorporating large quantities of data into the models, generating output and validating these models. I believe that GNS has overcome much of this bottleneck in the past three years that we have been operating. We would certainly like to have access to better experimental data. I think one of the tricks is to be able to take in data of very different types, whether it is knockout data, microarray data or clinical data and to make use of that.

At the moment, the company focuses on colon cancer. Why did you choose this particular system and do you plan to work on others?

We chose colon cancer because it is probably the best understood cancer at the molecular level and, from a market perspective, it is one of the top four cancers. GNS will soon broaden the application of its technology to breast cancer, other cancers and other diseases that arise from aberrant behavior of cell growth, replication and apoptosis. We also have a small ongoing effort in the modeling of Escherichia coli for use in bioprocess engineering, with possible applications in infectious disease drug discovery and diagnostics, and bioremediation.
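Hill's earlier remark about DCL™, that proteins with several phosphorylation sites produce a combinatorial explosion of states, can be made concrete with a short calculation. The sketch below is an editorial illustration in Python, not DCL™ (whose notation the interview does not describe). The site names and the assumption that every site is modified independently are hypothetical.

```python
from itertools import product


def enumerate_states(sites):
    """List every on/off phosphorylation pattern for the given sites."""
    return [dict(zip(sites, pattern))
            for pattern in product((False, True), repeat=len(sites))]


def count_complex_states(site_counts):
    """Count joint states of proteins whose sites are modified independently."""
    total = 1
    for n_sites in site_counts:
        total *= 2 ** n_sites
    return total


# Hypothetical protein with three phosphorylation sites (names are illustrative).
states = enumerate_states(["S1", "S2", "S3"])
print(len(states))                      # explicit states: 2**3
print(count_complex_states([10, 10]))   # two 10-site proteins in one complex
```

Under these assumptions an explicit diagram needs one node per state, and the count doubles with every added site, whereas a per-site description needs only one entry per site. That gap is the motivation Hill gives for a more compact pathway representation.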

‘This approach allows drug targets and drugs to be validated at a much deeper level than would otherwise be possible.’

How do you envisage this approach driving the discovery of new pharmaceutic agents?

This approach allows drug targets and drugs to be validated at a much deeper level than would otherwise be possible. One can conduct millions of virtual experiments around a target or lead compound to better understand its role and context, and to decide what anticancer leads to pursue. In addition, this approach could be used to better select or screen patient populations and to determine which patients are better responders to a given drug. It is ultimately about making better, earlier decisions and this is where GNS will have an impact.

What are the limitations of in silico modeling at present? Are there systems or diseases that approaches such as yours will never accurately model?

I believe that one does need an understanding of the molecular biology of the system and the disease mechanism. Diseases where this molecular or clinical level of understanding is lacking or where there is no ready access to data would not be good places to show the efficacy of systems biology. Certain neurological diseases would be very difficult to go after for that reason.

Do you foresee in silico modeling becoming routine in the pharmaceutical industry, as it is in the electronics and aerospace industries?

Yes, that has to be the case but the question is when and with what approach. The timing is going to depend on when the industry makes the commitment to adopt these technologies and this is dependent on their validation. Microarrays have been around for a number of years now and the FDA is now looking at using these in clinical trials. The same will happen for in silico modeling. The pharmaceutical industry will eventually depend on these models. Even if the use of these models results in only a 10% increase in the ability of a company to predict clinical trial outcomes, that is a huge advantage, and as soon as one pharma company really decides to go with this the rest will follow suit.

Do you think pharmaceutical companies will choose to outsource in silico biology or attempt to do it themselves, as Eli Lilly has to a certain extent by establishing a systems biology center in Singapore?

I believe, at least for the next few years, that it will be outsourced. There are some nascent efforts at a number of pharma companies, but these are small efforts (except for Eli Lilly), and there is not enough effort to build a platform. So these early endeavours will be outsourced and coupled with small internal efforts. This is definitely not something to be dabbled in. One needs to deal with this at a very serious level. These are not trivial questions and they cannot be answered with off-the-shelf software. This is real science – it is not just IT, software engineering or bioinformatics. This is what is very different about this technology compared with other computational approaches that have entered the fray in the pharmaceutical arena. It is finally taking genomics and other ‘omics data back into discovery, back into the mechanistic discovery of disease mechanisms.

Hiroaki Kitano has said that it is possible that the FDA might one day mandate simulation-based screening of therapeutic agents. Do you think in silico methods will be included in the FDA approval process?
I believe that they will. Again, citing the microarray example, these have been around for a number of years, the pharma industry has bought into it for at least five years and it is only now that the FDA is looking at using this in clinical trials. It is an obvious next step, but it will probably take a few more years before it actually happens. In silico modeling will probably follow a similar path for obvious reasons. The Federal Aviation Administration (FAA) mandates the use of simulations for testing the performance of aeroplanes. I would never fly in an aeroplane that has not been tested in simulations so why would I put a drug into my body that has not been tested in a computer? It is not clear when this will happen, a large bureaucracy has to be negotiated so this will take a long time, but I believe that in ten years this will be a standard thing. The new FDA commissioner, Mark McClellan, seems like a forward-thinker, so you never know, perhaps he will mandate this kind of testing in the next few years and place the drug testing process into the 21st century.

‘I would never fly in an aeroplane that has not been tested in simulations so why would I put a drug into my body that has not been tested in a computer?’

Is biology too dynamic and perhaps chaotic to be transformed into precision engineering?

That is a good question. I do not know if precision engineering is what we are aiming for, even though there are a lot of engineers and control theorists that do have loud voices in the field. We obviously believe that the answer is no but there are challenges and complexities that have not been encountered in physics or engineering disciplines. It is funny that you use the word chaotic as that is partly my background and chaos by definition is the unpredictability of a dynamic system in response to perturbation. I actually believe that these systems are not chaotic and that physiological conditions are not consistent with those kinds of dynamics in terms of levels of genes and proteins and so on. That is one of the reasons why I believe that these systems are understandable, because that implies an inherent robustness, so we will be able to get the answers without all the details. Biology is about living systems built from non-living components and if we are able to understand and predict the properties of complex physical systems, from the microscopic level to things as large as galaxies, and living systems are in between, there is no reason why one cannot apply mathematics and computer simulations to the understanding of complex biological systems. That is the fundamental belief that I have held from the beginning.
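The distinction Hill draws here, between chaotic dynamics and robust ones, can be illustrated with a toy system. The sketch below is an editorial illustration that uses the logistic map, a standard textbook example, and is not a model of any gene network or of GNS's software. The parameter values, starting point and perturbation size are assumptions chosen for the demonstration.

```python
def logistic_step(x, r):
    """One iteration of the logistic map x -> r * x * (1 - x)."""
    return r * x * (1.0 - x)


def trajectory_gaps(r, x0=0.2, delta=1e-9, steps=60):
    """Track |x - y| for two trajectories started delta apart."""
    x, y = x0, x0 + delta
    gaps = []
    for _ in range(steps):
        x, y = logistic_step(x, r), logistic_step(y, r)
        gaps.append(abs(x - y))
    return gaps


chaotic = trajectory_gaps(r=4.0)   # textbook chaotic regime
robust = trajectory_gaps(r=2.5)    # settles onto a stable fixed point
print(f"largest gap, r=4.0: {max(chaotic):.3g}")
print(f"final gap,   r=2.5: {robust[-1]:.3g}")
```

In the chaotic regime a perturbation of one part in a billion grows until the two trajectories are unrelated, while in the stable regime the same perturbation dies away. Hill's argument is that physiological systems behave like the second case, which is why he expects them to be predictable without every detail.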

What are GNS’s plans for the next few years?

Our plans are really to further develop the cancer model, to increase the applications in other areas and to show stronger commercial validation of the approaches. We will be entering into a number of alliances with major pharmaceutical companies and our goal is to really demonstrate that these models, using the kind of technology that we have invented, can make a great impact on the drug discovery and development process. There has been an expectation created by the genomics boom that has yet to be realized and this is the kind of technology that will accomplish this.

‘…there is no reason why one cannot apply mathematics and computer simulations to the understanding of complex biological systems.’

What has been the biggest achievement of your career to date?

Getting GNS launched with Iya three years ago, managing to grow the company and to make such progress on the technology side in a very dry funding environment, particularly as a number of our brethren companies have fallen by the wayside, even the companies that were funded by top venture capital firms. I think that the company, with the approach we are using, has a great opportunity to transform the industry. This is a paradigm shift – recreating life on a computer, gene by gene, will impact on the lives of our children and grandchildren for years to come.

Colin Hill
CEO and Founder
Gene Network Sciences
31 Dutch Mill Road
Ithaca, NY 14850, USA
[email protected]
