This article was co-written by Teodoro Laino, Alain Vaucher and Matteo Manica.
Chemistry is all around us. It can be found in a wide variety of products and technologies, ranging from essential ingredients in consumer products such as Aspirin to raw materials in products like Nylon. Yet, most of us are probably unaware that it takes at least 10 years to discover a new material and bring it to market, with production costs ranging from 10 to 100 million dollars. To appreciate the complexity of chemistry-enabled discovery, consider this: in 1972, it took 91 postdocs and a dozen PhD students 12 years to complete the synthesis of vitamin B12. Nearly five decades later, some syntheses remain nearly as complex as the one for vitamin B12.
The good news is that scientists have recently begun building on modern technologies to make these challenges easier to solve. Admittedly, when it comes to digitization and the adoption of new technologies, synthetic chemistry, the art of creating materials, is still one of the most traditional disciplines. Chemists keep relying on many of the same protocols, and little progress has been made in modernizing the age-old practice of trial and error in order to usher in a new era of accelerated discovery. However, the last five years have seen a significant shift: modern tools such as artificial intelligence (AI), cloud technology, and robotics are transforming chemistry from a traditional to a high-tech industry.
Innovating on the design and functionality of the chemical laboratory is a never-ending challenge. It requires a thorough understanding of the role of chemistry in scientific discovery, as well as an in-depth examination of the gaps in how work is done in chemical laboratories. Several examples from around the world show the potential of combining AI and automation in chemical laboratories to automate the process of synthesis: (1) the modular desktop-sized robotic synthesizer that converts text-based recipes into instructions to drive laboratory automation hardware built at the University of Glasgow [1]; (2) the robotic platform developed at the University of Liverpool, programmed and tested to work in the university’s Materials Innovation Factory [2]; (3) the ETHZ “console”-like computer with preloaded chemical applications to drive several classes of organic reactions commonly used in drug discovery [3]; (4) the MIT robotic platform that automatically assembles the hardware and performs the reactions suggested for synthesizing a molecule with a combination of artificial intelligence and human expertise [4]; (5) the automated synthesis of small molecules designed by the MPI Potsdam, based on a series of continuous flow modules, radially placed around a central switching station [5].
More recently, we presented an end-to-end, integrated chemical research system demonstrating how artificial intelligence, robotics, and the cloud will change the future of synthetic chemistry. We call it IBM RoboRXN, and it is a laboratory extension of the digital chemistry platform IBM RXN for Chemistry, online since 2018 and serving more than 30,000 people around the world.
RXN for Chemistry is powered by a neural machine translation architecture, the Molecular Transformer [6], which treats reaction prediction as a natural language processing task and predicts the most likely outcome of a chemical reaction with over 90% top-1 accuracy, the best reported to date. Using SMILES notation to translate from the language of reactants to that of products, the method can also be applied to design optimal retrosynthetic routes [7].
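To make the "translation" framing concrete, here is a minimal Python sketch of how a reaction written in SMILES can be split into tokens, the "words" a translation model would read and write. The regular expression is a simplified assumption chosen for illustration, not the tokenizer used by RXN for Chemistry, and the example reaction (acetic acid plus ethanol giving ethyl acetate) is ours.

```python
import re
from typing import List

# Simplified atom-level SMILES pattern (an illustrative assumption,
# not the tokenizer used by RXN for Chemistry).
SMILES_TOKEN = re.compile(
    r"(\[[^\]]+\]|Br|Cl|[BCNOSPFI]|[bcnosp]|%\d{2}|\d|[()=#\-+\\/:~@?>*$.])"
)


def tokenize(smiles: str) -> List[str]:
    """Split a SMILES string into tokens; fail if any character is unrecognized."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"could not tokenize {smiles!r}")
    return tokens


# Reaction SMILES use '>>' to separate the starting materials from the product.
reaction = "CC(=O)O.OCC>>CC(=O)OCC"
source, target = reaction.split(">>")

print("source:", " ".join(tokenize(source)))  # the "sentence" to translate
print("target:", " ".join(tokenize(target)))  # the expected "translation"
```

Seen this way, forward prediction maps the source sentence to the target sentence, and retrosynthesis is the same task run in the opposite direction.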
To take the designed routes to the lab and synthesize the desired compounds autonomously, the RoboRXN technology relies on an additional AI model that recommends the detailed sequence of operations to make a specific target molecule, including the order in which ingredients should be mixed [8]. The construction of this model required a purely data-driven scheme to extract organic chemistry synthesis information and convert it into a structured and automation-friendly format [9] usable for training tasks.
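As a rough illustration of what a "structured and automation-friendly format" could look like, the Python sketch below represents a procedure as an ordered list of actions and serializes it to JSON. The `Action` class, the action names, the parameter keys, and all quantities are hypothetical placeholders invented for this example; they are not the schema described in [9].

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class Action:
    """One laboratory operation (hypothetical schema, for illustration only)."""
    name: str                                   # e.g. "add" or "stir"
    params: Dict[str, str] = field(default_factory=dict)


def to_instructions(actions: List[Action]) -> str:
    """Serialize an ordered action list to JSON, numbering steps from 1."""
    steps = [{"step": i, **asdict(a)} for i, a in enumerate(actions, start=1)]
    return json.dumps(steps, indent=2)


# Made-up procedure: the list order encodes the order in which
# ingredients are mixed.
procedure = [
    Action("add", {"material": "ethanol", "amount": "10 mL"}),
    Action("add", {"material": "acetic acid", "amount": "5 mL"}),
    Action("stir", {"duration": "2 h", "temperature": "60 C"}),
]

print(to_instructions(procedure))
```

The point of such a representation is that every step is explicit and machine-readable, so the same data can serve both as training material for a model and as input for automation hardware.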
RoboRXN combines the aforementioned AI models to automate the tedious task of programming commercial automation hardware. We deployed the RoboRXN service suite on IBM Cloud, making it available anywhere via an internet connection. This was accomplished by completely reinventing the chemistry process. With the RoboRXN technology, scientists provide as input the skeletal structure of the molecule they want to make. The combination of AI, automation, and cloud computing enables RoboRXN to predict the starting materials and the optimal sequence of chemical reaction steps to synthesize the desired compound. It then sends the instructions to a robot in a remote lab, which executes them with minimal human intervention. Once the experiment is completed, the platform sends the scientists a report detailing the findings, and the synthesized compound is available in a vial for further analysis.
Since then, the RXN team has been exploring further ways for AI and humans to work together. A recent addition is a reaction prediction model trained with enzymatic data [10], to favor the use of more sustainable and green processes in daily R&D operations. A few weeks ago, the team brought down another wall: they introduced a new functionality that allows everybody to retrain their own chemical models in the cloud without any prior knowledge [11]. The team achieved this by leveraging automation workflows to orchestrate fine-tuning pipelines on a cluster provisioned with eight accelerators, allowing for eight concurrent retraining jobs and unlimited possibilities to customize private RXN models.
Outfitted with the power of advanced technologies like AI and cloud, we are on the cusp of an entirely new era of chemistry that will produce historic discoveries. It’s truly an exciting time, and it shows that when humans and machines work together, the possibilities are endless.