Data Analytics and Statistical Machine Learning

Module overview

The aim of the module is to provide you with an in-depth understanding of the state of the art in data integration, mining and analysis with applications in biology and biomedicine. 

The module covers the following data-related topics:

  • Data types,
  • Data modelling,
  • Data management,
  • Semantic representation,
  • Integration,
  • Analysis.
It will include various statistical techniques:
  • Frequentist and Bayesian approaches,
  • Univariate and multivariate analysis,
  • Definition of specific statistics.
Furthermore, it will present modelling and optimisation approaches for dealing with large, structured yet heterogeneous datasets, and will include several techniques:
  • Hidden Markov Models,
  • Self-Organizing Maps,
  • Bootstrapping and resampling procedures,
  • Agent-based modelling,
  • Statistical Machine Learning.

It will also provide methods to analyze, visualize and integrate the various types of data.

It also includes training on several widely used web-based resources (e.g. OMIM, TCGA, DAVID, REACTOME).

By the end of the module you should be able to:

  • Demonstrate a good understanding of the complexity of omics and clinical data and their management, including their semantic representation
  • Demonstrate an in-depth understanding of, and ability to perform, data integration, mining and analysis
  • Demonstrate a conceptual understanding of computing, algorithms and programming that enables you to evaluate methodologies, develop critiques of them and, where appropriate, propose new methods
  • Deal with the complexity of the available information to enable the integration of diverse data types
  • Demonstrate self-direction and originality in tackling and solving problems to perform the appropriate modelling and optimisation


20 credits


Essay - 60%
Presentation - 40%

Academics involved in the teaching of this module

Module lead - Professor Georgios Gkoutos, whose interests lie in the general areas of clinical and biomedical informatics, computational biology, and integrative and translational research aimed at discovering the molecular origins of human disease and developing novel diagnostic and intervention strategies.

Please note that this module is only available as part of the MSc Bioinformatics and the International Doctoral Training Programme.