The 1,014th Meeting of the Northeastern Section of the
American Chemical Society
Data-Driven Synthesis Planning and Molecular Design
By Connor W. Coley, Assistant Professor in the Department of Chemical Engineering and the Department of Electrical Engineering and Computer Science at the Massachusetts Institute of Technology
Thursday, September 15, 2022
Register for the virtual September meeting via Zoom link at:
- Virtual Networking
- Opening by Carol Mulrooney, NESACS Chair
- Featured Presentation by Connor W. Coley, Department of Chemical Engineering and Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology
Advances in laboratory automation promise to decrease the effort required to synthesize small-molecule compounds, but determining how to synthesize a molecule is still a manual process that requires significant time investment from expert chemists. Computer-aided synthesis planning (CASP) focuses on accelerating this process by recommending synthetic pathways. The ability to formalize knowledge of reactivity in predictive models also influences the overall process of molecular design by constraining the chemical space we are able to access, or access easily.
Machine learning and artificial intelligence have enabled new data-driven approaches to CASP where statistical models are trained directly on published experimental data. The two primary aspects of CASP—proposing retrosynthetic disconnections to connect the target to purchasable materials and validating proposed reactions in silico—are highly amenable to supervised learning approaches. We have developed several of these tools in a software suite, ASKCOS, that is capable of proposing retrosynthetic routes to new molecules, proposing reaction conditions for each step, and assessing the likelihood of experimental success. I will talk about the many learning tasks associated with the goal of synthesis planning, the progress that we and others in the field have made, and ongoing challenges in improving the fidelity of these models.
Connor W. Coley is an Assistant Professor at MIT in the Department of Chemical Engineering and the Department of Electrical Engineering and Computer Science. He received his B.S. and Ph.D. in Chemical Engineering from Caltech and MIT, respectively, and did his postdoctoral training at the Broad Institute. His research group at MIT develops new methods at the intersection of data science, chemistry, and laboratory automation to streamline discovery in the chemical sciences with an emphasis on therapeutic discovery. Key research areas in the group include the design of new neural models for representation learning on molecules, data-driven synthesis planning, in silico strategies for predicting the outcomes of organic reactions, model-guided Bayesian optimization, and de novo molecular generation. Connor is a recipient of C&EN’s “Talented Twelve” award, Forbes Magazine’s “30 Under 30” for Healthcare, the NSF CAREER award, and the Bayer Early Excellence in Science Award.