Future designs for synthetic chemistry - University of Birmingham

Could artificial intelligence (AI) hold the key for the future of a core area of chemistry?

Chemistry can be divided into various sub-disciplines, but in the lab, on a day-to-day basis, it essentially boils down to three types of activity: making, modelling and measuring. The synthetic chemists are the makers, synthesising compounds that reflect their interests and expertise, for example a new drug, molecular probe or superconducting material. Their challenge is to make their chosen target in high purity and in sufficient quantities.

Efficient synthesis is a particular issue for the organic chemist (organic chemistry is the study of compounds containing carbon atoms), as the route can involve many linear steps, with each intermediate often a new compound in its own right with unpredictable reactivity. A dead end or several low-yielding steps in the synthesis could spell disaster for the whole route and force a return to the drawing board, using up valuable time and resources. This perennial challenge has led to many celebrated breakthroughs in synthetic methodology over the years, often involving Nobel Prize-winning work, providing new short-cuts and more efficient reactions that other chemists can then use in their own research.
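To see why a few low-yielding steps matter so much, it helps to recall that the overall yield of a linear route is the product of the yields of its individual steps. The short Python sketch below illustrates this arithmetic. The function name and the step yields are illustrative assumptions, not figures from any real synthesis.

```python
from math import prod


def overall_yield(step_yields):
    """Overall fractional yield of a linear route: the product of the step yields."""
    for y in step_yields:
        if not 0.0 <= y <= 1.0:
            raise ValueError(f"yield must be between 0 and 1, got {y}")
    return prod(step_yields)


# Hypothetical six-step routes (made-up yields, for illustration only).
steady_route = [0.9] * 6                       # every step gives 90%
uneven_route = [0.9, 0.9, 0.3, 0.9, 0.3, 0.9]  # two steps give only 30%

print(f"steady route: {overall_yield(steady_route):.1%}")
print(f"uneven route: {overall_yield(uneven_route):.1%}")
```

Under these assumed numbers, the steady route delivers roughly half of the theoretical amount of product, while the route with two poor steps delivers only about 6%.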

A synthetic organic chemist wanting to make a chosen target molecule would typically draw its structure out on a piece of paper and then work backwards, using as few steps as possible until they arrive at simpler compounds for which a synthesis is already known. For each step, they assess the feasibility of the forward process using their training and knowledge of the literature. This process is called retrosynthetic analysis, a technique that works well but nevertheless does not make designing a route to a complicated molecule a simple task. But could help soon be at hand to identify viable synthetic routes to such targets?
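The backward reasoning of retrosynthetic analysis can be pictured as a search problem. The Python sketch below is a toy illustration only: the compound names, the table of known disconnections and the list of available starting materials are all hypothetical placeholders, and the simple recursive search stands in for the chemist's judgement rather than for any published algorithm.

```python
# Hypothetical "known chemistry": each product maps to alternative sets of precursors.
DISCONNECTIONS = {
    "target": [("intermediate_1", "reagent_x"), ("intermediate_2",)],
    "intermediate_1": [("starting_a", "starting_b")],
    "intermediate_2": [("intermediate_3",)],
    "intermediate_3": [("starting_a",)],
}

# Hypothetical compounds that are simple enough to buy or already known how to make.
AVAILABLE = {"starting_a", "starting_b", "reagent_x"}


def plan_route(compound, max_depth=5):
    """Work backwards from `compound` and return the shortest route found.

    The result is a list of (product, precursors) steps in forward order,
    an empty list if the compound is already available, or None if no
    route exists within `max_depth` backward steps.
    """
    if compound in AVAILABLE:
        return []
    if max_depth == 0:
        return None
    best = None
    for precursors in DISCONNECTIONS.get(compound, []):
        steps = []
        for precursor in precursors:
            sub_route = plan_route(precursor, max_depth - 1)
            if sub_route is None:
                break  # this disconnection leads to a dead end
            steps.extend(sub_route)
        else:
            candidate = steps + [(compound, precursors)]
            if best is None or len(candidate) < len(best):
                best = candidate
    return best


for product, precursors in plan_route("target"):
    print(" + ".join(precursors), "->", product)
```

With this toy data the search prefers the two-step route through `intermediate_1` over the three-step route through `intermediate_2`. A real planning tool would also have to judge whether each forward step is chemically feasible, which is the part that draws on training and knowledge of the literature.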

At present, chemists routinely use online computational tools to help them plan reactions, but most are in fact not much more than a ready source of encyclopedic knowledge for existing compounds. But imagine a situation where a chemist enters the structure of their chosen target into a program, presses ‘Enter’ and within a few seconds, a synthetic route is produced from simple starting materials, with all the steps displayed on the screen.

A recent paper published in the journal Nature suggests that this scenario could one day be a reality, with computer programs using knowledge from the vast banks of stored chemical information to work out viable multi-step routes for themselves. In the article, an algorithm incorporating AI was used to find routes to several organic targets using what the authors called computer-aided retrosynthesis. The algorithm used information stored in an online chemistry database containing over 10 million reactions as its source. Two results were particularly impressive. The first was the speed with which the algorithm performed its tasks: for example, a six-step route to one compound was identified in just over five seconds. The second was that, in a double-blind trial, the routes proposed by the algorithm were judged to be as viable as those proposed by humans.

It is tempting to speculate on what the implications of this machine-learning approach would be for synthetic chemistry and, in particular, for the human input that is currently so crucial for the successful synthesis of a compound. First of all, it is worth noting that the target compounds reported in this article were relatively simple, a fact acknowledged by the authors. Establishing routes to more complicated compounds with highly specific 3D structures, those that would get an experienced academic really scratching their head, would require much more work. But even if this AI eventually gives us viable routes to previously unmade and complex molecules, the Luddites among us should not lose too much sleep.

Much of the progress in synthetic methodology comes from the creativity born of lateral thinking and from acting on unexpected observations in the lab, to which a computational approach would be oblivious. Furthermore, chemists still need to identify what to make in the first place, including those compounds never made before that are only realised through wonderful feats of imagination. The 2016 Nobel Prize in Chemistry, awarded for the design and synthesis of molecular machines, some of which involved work undertaken here in Birmingham by Sir J. Fraser Stoddart, is a good case in point.

In fact, a computer program that is successful in determining how to make such a compound based on all that is currently known might give us more time to think about what is currently unknown. This should mean more new target molecules with interesting properties as well as more new reactions, delivering synthetic breakthroughs that a machine would never think of.