Machine‑learning framework could transform prediction of drug side effects

Interpretable AI model could offer new insights into why medicines cause certain side effects, helping to improve future drug safety predictions.

Researchers have created a powerful new machine‑learning framework capable of predicting drug side effects by analysing how different medicines interact with biological targets in the body, offering a new path towards earlier detection of adverse drug reactions (ADRs) and safer drug development.

Published in PLOS ONE, the study – jointly led by Joseph Roberts-Nuttall and Dr Alan M. Jones, University of Birmingham – uses a new, interpretable AI framework that draws on millions of real‑world safety reports and state‑of‑the‑art pharmacological databases, helping researchers to link drug side effects to the specific proteins and genes with which medicines interact.

In future, this type of framework could help to identify safety risks earlier in the drug lifecycle, inform the design of new medicines and reduce side effects experienced by patients.

Using high‑confidence drug‑target interaction data from the STITCH database and nearly three million ADR reports from the UK’s Yellow Card Scheme, researchers from the University of Birmingham’s School of Pharmacy and Department of Mechanical Engineering trained a series of machine‑learning models to predict which drug‑ADR combinations were statistically significant.

ADRs are a persistent challenge for global healthcare systems and are estimated to account for around 16% of hospital admissions in the UK, costing the NHS more than £2 billion every year. While post‑marketing surveillance systems, such as the Yellow Card Scheme, remain essential for detecting safety issues, the underlying biological causes of many ADRs are still unclear.

Dr Alan M. Jones, joint lead author of the study, said: “One of the biggest challenges in the pharmaceutical industry is not just spotting when a side effect happens, but also understanding the reason why it occurs. By connecting adverse reactions back to the biological targets that each drug interacts with, this new framework brings us a step closer to safer drug development and improved patient experiences.”

Decoding the future of drug safety

The framework used interpretable Random Forest models: an ensemble machine‑learning method that combines the predictions of many individual decision trees to produce more stable, accurate results. By aggregating multiple 'tree‑based' decisions, these models reduce noise, limit overfitting, and highlight which biological features (such as specific proteins or genes) are most important in making a prediction.
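To illustrate the idea, the following is a toy sketch in plain Python, not the authors' implementation. It builds a bagged ensemble of one‑split decision trees ("stumps") in the spirit of a random forest, then totals how much each feature reduced Gini impurity as a rough importance score. The drugs, the three targets, the labels and all function names are hypothetical and chosen only for illustration.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 for an empty list)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def leaf_value(side, fallback):
    """Fraction of positive labels on one side of a split."""
    return sum(side) / len(side) if side else fallback

def fit_stump(rows, labels, features):
    """Pick the candidate feature whose 0/1 split most reduces Gini impurity."""
    parent = gini(labels)
    overall = sum(labels) / len(labels)
    best = None
    for f in features:
        left = [y for x, y in zip(rows, labels) if x[f] == 0]
        right = [y for x, y in zip(rows, labels) if x[f] == 1]
        child = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        gain = parent - child
        if best is None or gain > best[0]:
            best = (gain, f, leaf_value(left, overall), leaf_value(right, overall))
    return best

def fit_forest(rows, labels, n_trees=50, n_candidates=2, seed=0):
    """Train stumps on bootstrap samples, each seeing a random feature subset."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    stumps = []
    importance = [0.0] * n_features
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]      # bootstrap sample
        feats = rng.sample(range(n_features), n_candidates)  # random feature subset
        gain, f, p0, p1 = fit_stump([rows[i] for i in idx],
                                    [labels[i] for i in idx], feats)
        stumps.append((f, p0, p1))
        importance[f] += gain
    total = sum(importance)
    if total:
        importance = [v / total for v in importance]         # normalise to sum to 1
    return stumps, importance

def predict_proba(stumps, row):
    """Average the per-stump probabilities, i.e. aggregate the trees' votes."""
    return sum(p1 if row[f] == 1 else p0 for f, p0, p1 in stumps) / len(stumps)

# Hypothetical data: each row is a drug, each column says whether the drug
# binds target 0, 1 or 2; the label says whether a given ADR was flagged.
rows = [(1, 1, 1), (1, 0, 1), (1, 1, 0), (1, 0, 0), (1, 1, 1), (1, 0, 1),
        (0, 1, 0), (0, 0, 0), (0, 1, 1), (0, 0, 1), (0, 1, 0), (0, 0, 0)]
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

stumps, importance = fit_forest(rows, labels)
print("importance per target:", [round(v, 3) for v in importance])
print("P(ADR | binds target 0 only):", round(predict_proba(stumps, (1, 0, 0)), 3))
```

In this made‑up data, target 0 tracks the label exactly, so it should dominate the importance scores. A production model would more likely use an established library and much deeper trees.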

The approach achieved strong predictive performance, with receiver operating characteristic area under the curve (ROC AUC) scores of up to 0.94 across multiple organ‑system categories, and its outputs aligned with realistic, biologically plausible associations.

For example, when analysing psychiatric disorders, which was the best‑performing category within the study, nine of the top ten predictive targets had established links to psychiatric conditions when cross‑checked against the DisGeNET disease‑gene database.

By uncovering patterns that show how specific drug actions in the body might be linked to specific side effects, researchers were able to explain why certain adverse reactions occur rather than simply detecting them.
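For readers unfamiliar with the ROC AUC metric quoted above, it can be read as the probability that a randomly chosen true drug‑ADR association is scored higher than a randomly chosen non‑association, so 0.5 is chance and 1.0 is a perfect ranking. The sketch below computes it directly from that definition. The labels and scores are invented for illustration and are not taken from the study.

```python
def roc_auc(y_true, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))

# Hypothetical model scores for six drug-ADR pairs (1 = real signal).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

# 8 of the 9 positive/negative pairs are ordered correctly, giving 8/9.
print(round(roc_auc(y_true, scores), 3))
```

The pairwise loop is quadratic in the number of examples. It is fine for a demonstration, while real evaluations typically use a sorted, rank‑based calculation.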

Joseph Roberts-Nuttall, joint lead author of the study, said: “Clinical trials are essential, but they can't fully capture the diversity of real‑world patients. By integrating trial data with spontaneous reports and AI-driven target biology, we can create a more accurate picture of drug safety.

“In future, this could help to improve the detection and understanding of drug-related problems throughout a medicine's lifecycle, as well as regulatory decision-making.”

Examining real-world data

The researchers also compared ADR patterns detected in real‑world data with those identified in clinical trials, using the SIDER side‑effects database. Just under 18% of significant ADR signals overlapped between the two sources.

This highlights the limitations of clinical trial populations and how substantially they differ from the diverse, real‑world medication use captured by spontaneous reporting systems such as the Yellow Card Scheme.
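An overlap figure of this kind can be computed by treating each source as a set of drug‑ADR pairs. This release does not state which denominator the study used, so the sketch below assumes the union of both sets (the Jaccard index). The drug names and reactions are placeholders, not data from the Yellow Card Scheme or SIDER.

```python
def overlap_fraction(signals_a, signals_b):
    """Share of all distinct drug-ADR pairs that appear in both sources.

    Assumption: the denominator is the union of the two sets (Jaccard index).
    """
    a, b = set(signals_a), set(signals_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical significant signals from each source, as (drug, reaction) pairs.
spontaneous_reports = {
    ("drug_a", "nausea"), ("drug_a", "headache"),
    ("drug_b", "rash"), ("drug_c", "insomnia"),
}
clinical_trials = {
    ("drug_a", "nausea"), ("drug_b", "dizziness"),
    ("drug_c", "insomnia"), ("drug_d", "rash"),
}

# Two pairs are shared out of six distinct pairs overall, giving 2/6.
print(round(overlap_fraction(spontaneous_reports, clinical_trials), 3))
```

Choosing a different denominator, such as the size of one source alone, would change the percentage, so the definition matters when comparing figures across studies.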

To strengthen the framework’s performance and capabilities further, future studies would also need to incorporate additional global datasets and explore more advanced generative methods to enhance prediction accuracy.

Notes for editors

For media enquiries and more information please contact Holly Young, Press Office, University of Birmingham, tel: +44 (0)7815 607 157.

An Interpretable Machine Learning Framework for Adverse Drug Reaction Prediction from Drug-Target Interactions – Joseph Roberts-Nuttall, Alan M. Jones, Marco Castellani, and Duc Pham is published in PLOS ONE.

About the University of Birmingham

The University of Birmingham is ranked amongst the world’s top 100 institutions. Its work brings people from across the world to Birmingham, including researchers, educators and more than 40,000 students from over 150 countries.

England’s first civic university, the University of Birmingham is proud to be rooted in one of the most dynamic and diverse cities in the country. A member of the Russell Group and a founding member of the Universitas 21 global network of research universities, the University of Birmingham has been changing the way the world works for more than a century.