Synergy
Synthesis in the Chemical Space Age
Accessible chemical space is shaped by the reactions used in its exploration, and this affects which molecules are tested. Automation has a role in the future of reaction selection in the search for drugs and other functional molecules.
Chemical synthesis and chemical space exploration in search of life-saving medicines are closely allied. In 1869, chloral hydrate became the first synthetic chemical linked to medicinal properties, and since then, synthesis and medicine have grown hand in hand. Today, the ever-expanding synthetic toolbox enables the construction of nearly any molecule. However, recent analyses have suggested that chemical space exploration in drug discovery is dominated by a narrow list of preferred reactions and that the list has changed little in the past 20 years.1 Drug hunters are slowly incorporating new and diverse chemical transformations into their arsenal,2 but clearly a small subset of transformations are used with the most frequency and have been for some time.
If dozens of novel chemical transformations are reported on any given day yet medicinal chemists favor a set of fewer than ten reactions that haven’t changed much in two decades, where does this disconnect come from? In the fast-paced world of drug discovery, it’s easy to gravitate toward experimentally simple, robust, and reliable reactions. A drug-discovery program is fraught with many difficult questions: will inhibiting this enzyme do what we want it to, and if so, should we inhibit it in the brain or in the liver? Will this receptor accept a molecule with a hydroxyl group at the C5 position, and will that get rid of the off-target ion-channel activity we are observing? Do we have any clue how the enzyme we are targeting links to the biochemical pathway we are trying to block?
Given the onus of navigating this landscape of unknowns, it becomes quite easy to select a familiar toolset of chemical transformations that have a high probability of delivering the planned product. Herein lies an opportunity, given that chemical space is shaped by the chemical transformations used. One must wonder what diversity might be lost by favoring only a small subset of the transformations available and what is needed to harness the power of new synthetic tools while still focusing on the complex tasks of understanding how a molecule links to biological pathways and disease.
All signs point to the possibility that one day, a computer could perform the reductive algorithm known as retrosynthetic analysis and nominate a broader variety of transformations to access new pockets in chemical space.3 An easy criticism of this approach is that reactions suggested by a computer are unlikely to come with conditions that work for every substrate: there’s just too much diversity in chemical structure and reagent compatibility for us to understand how it all fits together. To address this, high-throughput4 and reaction fingerprinting strategies5 have recently been developed to study synthetic transformations in a systems context and suggest how a new reaction might perform in the pocket of chemical space that a drug hunter happens to be working in. It is hoped that these approaches, which catalog reaction performance across a panel of diversely functionalized substrates, will instill chemists’ trust in new transformations more rapidly.
Dynamic drug-discovery research often provides little time for tinkering with reaction conditions, and when reactions fail, the targeted compound might never make it into an assay to test whether its design was any good. However, there is theoretically a lot of room for tinkering given that the overlap of reaction space and chemical space is frustratingly vast. For instance, a transition-metal-catalyzed transformation that makes a single molecule is estimated to have >10^7 permutations of possible reaction conditions.6 Chemists, luckily, can turn to conditions from the literature as a sensible starting point for reagents, solvents, and stoichiometries; however, when reactions don’t perform as expected, the search space for locating productive conditions can be immense. Meanwhile, if strict filters of drug-likeness are applied to chemical space, there are still >10^60 molecules out there.7 At an intersection somewhere in this infinity a life-saving drug exists, and modern navigation strategies are needed to find it quickly.
The drug hunter’s mission is to find a synthetically accessible molecule with a desirable function and overall profile from an endless sea of possible molecules. To achieve this, a design-make-test learning cycle is used to optimize a compound toward a desired multiparameter profile. A typical oral drug has a profile that carefully balances potency, selectivity, permeability, metabolic stability, and solubility. Inventing the molecule that satisfies these criteria typically requires many iterations of the learning cycle (Figure 1).
Figure 1. Linking Chemical Space to a Desirable Drug Profile with Synthesis
(A) Chemical space is visualized by data-reduction methods where each dot represents a molecule. There are >10^60 drug-like molecules, and one of them might be a life-saving medicine.
(B) The design-make-test learning cycle is used for nominating molecules from chemical space, executing their synthesis and testing, and then analyzing the outcome and nominating the next round of molecules. Only molecules that can be synthesized will complete the learning cycle.
(C) Multiple iterations of the learning cycle are repeated until an optimized molecule with the desired biological and physicochemical profile is found.
Chem 1, 6–9, July 7, 2016
In the drug-design process, a molecule is plucked from a virtual chemical space of potential molecules: molecules that could exist but have never been made and tested before. Synthesis is the tool by which the molecule selected from chemical space comes into existence and completes the design-make-test learning cycle so it can be determined whether progress toward the desired profile is being made. Synthetic accessibility is a key component of a drug’s profile, and therefore successful synthetic-route planning, selection of reagents and conditions, reaction execution, and purification are essential. Optimizing a candidate molecule by using multiple iterations of the learning cycle takes a lot of work. It can be very rewarding work; often it is just tedious work.
As such, there has been considerable effort to develop automation and machine-learning approaches for many aspects of synthesis and medicinal chemistry. The idea of automated synthesis is five decades old; it has ebbed and flowed in popularity, and in recent years, the field has seen a renaissance inspired by advances in robotic hardware and predictive software. The most sophisticated synthesis machines today spend a disproportionate amount of time making amide bonds and performing Suzuki couplings,8 but it’s easy to imagine machines that can make molecules by using a broad diversity of transformations. The synthetic route to each molecule might be algorithmically planned; appropriate catalysts or reagents might be located by automated high-throughput experimentation or statistical models that have been fed data from robustness screens and informer libraries.4 Robots have a long way to go before they can match the experimental finesse and technique of a chemist skilled in the art, but machines can already troubleshoot a reaction with modern online monitoring capabilities.9
Dependence on physicochemical scoring metrics, machine learning, and automation in the drug-discovery process carries many risks. You’ll miss countless opportunities in serendipity, machines can’t master the nuance of experimental technique, and creativity will be lost. That said, picking a molecule in chemical space that balances predicted properties, planning its synthesis, and choosing the reagents to ensure synthetic success is a laborious process. There are so many parameters involved (chemical transformations, molecular connectivities, physicochemical scoring metrics, and reagent compatibilities) that simultaneous mental processing of chemical space and reaction space becomes exceedingly complex. This onus contributes to the pragmatic favoring of a small subset of popular chemical transformations that have a long track record and little risk of failure.
What if only a handful of transformations are needed for inventing a drug? For all the amides that have been synthesized in pharmaceutical explorations, we haven’t even come close to preparing a billionth of a percentage of all the amides that theoretically exist. In the infinity of chemical space, there will be many attractive virtual molecules, so why not select one that can be easily synthesized?
Or does using just a few transformations lead to local minima in chemical space where molecular properties are overly influenced by ease of synthesis? This Synergy piece aims to highlight the disconnect between the number of powerful new synthetic methods reported every day and their actual incorporation in the drug-discovery process, as well as illuminate the role that hardware and software automation might play in closing the gap. Of course, the number of times a reaction gets used in drug discovery is not necessarily linked to its impact on drug invention. In addition, it must be understood that the toolbox of transformations used in modern drug discovery is indeed broad—it is just heavily biased toward fewer than ten transformations.1,2 The current situation, whereby diverse transformations are used occasionally while a handful of powerhouse transformations are used to blast through large, albeit closely related, swaths of chemical space, could already be ideal. A partnership between man and machine could inspire incorporation of the full menu of synthetic transformations in the search for medicines and other functional molecules. It could also drive down the currently unsustainable costs of drug discovery and development. In many industries, automation and machine learning are currently developing at a breakneck pace, and chemists must be aware that many aspects of our art form will be run by machines at some point in the future. Automated searching of chemical space and reaction space already has a role in modern drug discovery, and the field is advancing rapidly. Astronomers exploring cosmic space recently connected machine-learning algorithms to a telescope through the All-Sky Automated Survey for Supernovae. In 3 years, they’ve discovered 300 supernovae, including the most luminous supernova ever observed.10 One day, perhaps decades from now, a robo-chemist might chug out small molecules mapped to an omics-linked target personalized just for you. 
The human will never be replaced in the drug-discovery process, but allowing machines to help with what can be automated will free up more time for answering the more difficult questions of how a molecule cures a disease.
Tim Cernak
Merck Research Laboratories, Discovery Chemistry, 33 Avenue Louis Pasteur, BMB2-116B, Boston, MA 02115, USA
Correspondence: [email protected]
http://dx.doi.org/10.1016/j.chempr.2016.06.002
1. Brown, D.G., and Boström, J. (2016). J. Med. Chem. 59, 4443–4458.
2. Schneider, N., Lowe, D.M., Sayle, R.A., Tarselli, M.A., and Landrum, G.A. (2016). J. Med. Chem. 59, 4385–4402.
3. Szymkuc, S., Gajewska, E.P., Klucznik, T., Molga, K., Dittwald, P., Startek, M., Bajczyk, M., and Grzybowski, B.A. (2016). Angew. Chem. Int. Ed. Engl. 55, 5904–5937.
4. Gensch, T., and Glorius, F. (2016). Science 352, 294–295.
5. Buitrago Santanilla, A., Regalado, E.L., Pereira, T., Shevlin, M., Bateman, K., Campeau, L.C., Schneeweis, J., Berritt, S., Shi, Z.C., Nantermet, P., et al. (2015). Science 347, 49–53.
6. Murray, P.M., Tyler, S.N.G., and Moseley, J.D. (2013). Org. Process Res. Dev. 17, 40–46.
7. Reymond, J.L. (2015). Acc. Chem. Res. 48, 722–730.
8. Godfrey, A.G., Masquelin, T., and Hemmerle, H. (2013). Drug Discov. Today 18, 795–802.
9. Reizman, B.J., and Jensen, K.F. (2015). Chem. Commun. (Camb.) 51, 13290–13293.
10. Dong, S., Shappee, B.J., Prieto, J.L., Jha, S.W., Stanek, K.Z., Holoien, T.W., Kochanek, C.S., Thompson, T.A., Morrell, N., Thompson, I.B., et al. (2016). Science 351, 257–260.