Statistical Methods for Risk Prediction and Prognostic Models (Online) Non-credit
- Delivery format: Online
- Start date: 6-9 July 2026
- Duration: 4 days
- Award: Non-credit bearing
- Entry requirements: A good understanding of key statistical principles.
- Fees (UK/Ireland): CPD course fees vary. Please see fee details for more information.
Course overview
This online course provides a thorough foundation in statistical methods for developing and validating risk prediction and prognostic models in healthcare research.
Delivered over four days, it covers key principles for model development, internal validation, and external validation. The emphasis is on multivariable models for individualised prediction of future outcomes (prognosis), although many concepts also apply to models for predicting existing disease (diagnosis). Binary and time-to-event outcomes are the main focus, with continuous outcomes covered in special topics.
Computer practicals are included on all four days. Participants can choose to follow them in either R or Stata, and to focus on logistic regression examples (for binary outcomes) or Cox/flexible parametric survival examples (for time-to-event outcomes). All code is provided ready-written, allowing participants to concentrate on understanding the methods and interpreting the results.
Further information can be found on the Prognosis Research website.
Course content
Day 1:
- The day begins with an overview of the rationale and phases of prediction model research.
- It then outlines model specification, focusing on logistic regression for binary outcomes and Cox regression or flexible parametric survival models for time-to-event outcomes.
- Model development topics are then covered, including identifying candidate predictors, handling of missing data, modelling continuous predictors using fractional polynomials or restricted cubic splines for non-linear functions, and variable selection procedures.
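To make the restricted cubic spline idea concrete, the basis functions can be computed directly. Below is a minimal Python sketch (the course practicals themselves use R or Stata) following Harrell's truncated-power parameterisation with the usual (t_k - t_1)^2 scaling; treat it as an illustration rather than the course's reference implementation:

```python
def rcs_basis(x, knots):
    """Restricted cubic spline basis for a single value x, given sorted knots.

    Returns [x, f_1(x), ..., f_{k-2}(x)]: the linear term plus k-2 nonlinear
    terms, each scaled by (t_k - t_1)^2 so they stay on a scale comparable
    to x itself (Harrell's parameterisation).
    """
    t = knots
    scale = (t[-1] - t[0]) ** 2

    def pos3(u):
        # truncated cubic: (u)_+^3
        return max(u, 0.0) ** 3

    basis = [x]
    for j in range(len(t) - 2):
        term = (pos3(x - t[j])
                - pos3(x - t[-2]) * (t[-1] - t[j]) / (t[-1] - t[-2])
                + pos3(x - t[-1]) * (t[-2] - t[j]) / (t[-1] - t[-2]))
        basis.append(term / scale)
    return basis
```

The "restricted" property is visible in the numbers: every nonlinear term is exactly zero below the first knot, and the fitted function is linear (zero second difference) beyond the last knot.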
Day 2:
- The day focuses on how models become overfitted to the data in which they were developed, and thus often fail to generalise to other datasets.
- Internal validation strategies are outlined to identify and adjust for overfitting. In particular, cross-validation and bootstrapping are covered to estimate the optimism and shrink the model coefficients accordingly.
- Statistical measures of model performance are introduced for discrimination (such as the C-statistic and D-statistic) and calibration (calibration-in-the-large, calibration slope, and calibration plots/curves).
- A discussion of statistical vs. machine learning methods for prediction will be given.
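For intuition, the C-statistic covered on Day 2 is simply the proportion of event/non-event pairs in which the event patient received the higher predicted risk. A small pure-Python illustration (the course practicals compute this in R or Stata):

```python
def c_statistic(y, p):
    """C-statistic (concordance) for binary outcomes.

    y: list of observed outcomes (1 = event, 0 = non-event)
    p: list of predicted risks
    Counts, over all event/non-event pairs, how often the event patient
    has the higher predicted risk; tied risks count one half.
    """
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    pairs = 0
    concordant = 0.0
    for pe in events:
        for pn in nonevents:
            pairs += 1
            if pe > pn:
                concordant += 1.0
            elif pe == pn:
                concordant += 0.5
    return concordant / pairs
```

A value of 0.5 corresponds to no discrimination (e.g. identical risks for everyone) and 1.0 to perfect separation of events from non-events.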
Day 3:
- With all this knowledge, we then discuss sample size considerations for model development and validation, and new software to implement sample size calculations.
- The day introduces penalised regression methods, including lasso and elastic net, and the instability of these approaches.
- We will then focus on the need for model performance to be evaluated in new data to assess its generalisability, namely external validation. External validation approaches will first be introduced with application to logistic regression.
- A framework for the different types of external validation of a logistic regression model is provided, and the potential importance of model updating strategies (such as recalibration techniques) is considered.
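Day 3's sample size calculations can also be done by hand. The sketch below implements one of the criteria from Riley et al. for binary outcomes: choosing n so that the expected uniform shrinkage factor is at least S (0.9 is the conventional target used by the pmsampsize packages). The formula is our reading of that published work, so verify against the pmsampsize software before relying on it:

```python
import math

def n_for_shrinkage(p, r2_cs, S=0.9):
    """Minimum development sample size targeting expected shrinkage >= S.

    p:      number of candidate predictor parameters
    r2_cs:  anticipated Cox-Snell R-squared of the developed model
    S:      target expected uniform shrinkage factor (default 0.9)

    Implements (our reading of) the criterion n = p / ((S - 1) * ln(1 - r2_cs / S))
    from Riley et al., rounded up to a whole number of participants.
    """
    return math.ceil(p / ((S - 1) * math.log(1 - r2_cs / S)))
```

For example, with 10 candidate parameters and an anticipated Cox-Snell R-squared of 0.2, the criterion asks for roughly 400 participants; note the requirement scales linearly with the number of candidate parameters.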
Day 4:
- Day 4 continues our focus on external validation by introducing methods for the external validation of survival models.
Novel topics are then considered, including:
- The use of pseudo-values to allow calibration curves in a survival model setting.
- The development and validation of models using large datasets (e.g. from e-health records) or multiple studies.
- The use of meta-analysis methods for summarising the performance of models across multiple studies or clusters.
- The role of net benefit and decision curve analysis to understand the potential role of a model for clinical decision making.
- Practical guidance about different ways in which prediction and prognostic models can be presented.
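The net benefit quantity underpinning decision curve analysis is straightforward to compute. A minimal Python sketch of Vickers and Elkin's formulation (true positives minus threshold-odds-weighted false positives, per patient); the course practicals use R or Stata:

```python
def net_benefit(y, p, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold.

    y: observed binary outcomes (1 = event, 0 = non-event)
    p: predicted risks
    Net benefit = TP/n - (FP/n) * threshold/(1 - threshold), so false
    positives are weighted by the odds of the risk threshold.
    """
    n = len(y)
    treated = [(yi, pi) for yi, pi in zip(y, p) if pi >= threshold]
    tp = sum(1 for yi, _ in treated if yi == 1)
    fp = sum(1 for yi, _ in treated if yi == 0)
    weight = threshold / (1 - threshold)
    return tp / n - (fp / n) * weight
```

Plotting net benefit across a range of clinically sensible thresholds, and comparing against the treat-all and treat-none strategies, gives the decision curve discussed on Day 4.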
Course delivery
Teaching is via a combination of recorded lectures, live computer practicals, and live question and answer sessions following each lecture/session.
The computer practicals provide dedicated time for participants to work through the question sheets independently, with faculty available to support and answer queries. There will be opportunities to meet with faculty to ask specific questions about personal research queries.
Learning outcomes:
By the end of the course, participants will:
- Understand phases of prediction model research.
- Know the core statistical methods for developing a prediction model, and be able to apply them in R or Stata.
- Understand the differences between models for binary and time-to-event outcomes.
- Understand the use of logistic regression, Cox regression, and flexible parametric survival models in the context of prediction modelling.
- Understand how to model non-linear relationships for continuous variables using splines or fractional polynomials.
- Know how to derive predictions for new individuals after developing a prediction model.
- Understand the issue of overfitting and how to limit and examine this.
- Know the role of penalisation and shrinkage methods, including uniform shrinkage, the lasso and elastic net.
- Know how to internally validate a prediction model after model development, using bootstrapping or cross-validation in R or Stata.
- Understand how to produce optimism-adjusted estimates of model performance.
- Know the importance and role of discrimination, calibration and clinical utility measures, and how to derive them in R or Stata.
- Understand how to undertake an external validation study.
- Understand how to calculate the sample size required for model development and model validation.
- Appreciate different approaches to variable selection, including lasso and elastic net, and the instability of these approaches.
- Recognise the importance of the TRIPOD reporting guideline and different formats for presentation of a model.
- Appreciate opportunities for prediction modelling with big data and IPD meta-analysis datasets.
- Appreciate methods for handling missing data, competing risks, pseudo-observations and continuous outcomes.
Programme team
The course is delivered by an internationally recognised team with extensive experience in prediction model research.
- Dr Kym Snell, Associate Professor in Biostatistics (University of Birmingham)
- Dr Lucinda Archer, Assistant Professor in Biostatistics (University of Birmingham)
- Dr Joie Ensor, Associate Professor in Biostatistics (University of Birmingham)
- Professor Richard Riley, Professor of Biostatistics (University of Birmingham)
- Professor Gary S. Collins, 125th Anniversary Chair and Professor of Medical Statistics (University of Birmingham)
- Dr Laura Bonnett (University of Liverpool)
- Dr Rebecca Whittle, Research Fellow in Biostatistics (University of Birmingham)
- Dr Paula Dhiman (University of Oxford)
Course Dates
6-9 July 2026
The last date to book is 21 June 2026.
Registration will close earlier if all places are filled, so early booking is advisable.
Time commitment
Ideally, participants should undertake the course live (9am to 5pm UK time). However, all course materials (e.g., lecture videos and computer practicals) will be made available two weeks in advance and remain accessible for two months afterwards, to provide plenty of time and flexibility for participants to work through the content in their own time. Early access also allows participants to engage with the material in advance and get even more value from the live Q&A sessions.
Accreditation
The course is not accredited.
Course results
Certificate of completion confirming participation.
Entry requirements
The course is aimed at individuals who want to learn how to develop and validate risk prediction and prognostic models, specifically for binary or time-to-event clinical outcomes (though continuous outcomes are also covered).
An understanding of key statistical principles and measures (such as effect estimates, confidence intervals and p-values) and the ability to apply and interpret regression models is essential.
Previous experience of using R or Stata for data analysis is also highly recommended, though computer code is already written in the practicals. Faculty will be available to support participants during the computer practicals; however, the course is not designed to provide introductory training in R or Stata. Participants are expected to be able to open the software, run code, and work with data files.
Fees and scholarships
- Students - £550
- Academics - £700
- Industry - £1,000
- A discount category is also available for University of Birmingham staff
Application process
Registration is open; you can register for the course using a debit/credit card on the University's online shop. The course has a minimum required attendance level, and the University reserves the right to cancel or postpone the course if the minimum number of delegates has not been reached.
For enquiries, please complete our enquiry form.
See how the University of Birmingham uses your data, view the Event attendee privacy notice.