Electronic, Electrical and Systems Engineering PhD (Multimodal Interaction Technologies specialism)

The goal of our research is to develop the novel theories, algorithms and computer architectures needed to create the technological components of effective multimodal interactive systems. The group comprises nine academic staff, plus postdoctoral research staff and full- and part-time PhD students. Our research students come from diverse academic backgrounds, ranging from electronic and electrical engineering, mechanical engineering, physics, chemistry, mathematics and computer science, to speech science and linguistics.

Current activities

Current activities include:

  • Fundamental and applied research into computer vision and speech
  • Image and signal processing
  • Reconfigurable computing
  • Technologies for multimodal human–machine interaction (including speech, eye-movement, gesture and tangible interfaces)
  • Acoustics
  • Music technology

Broader control and decision-support interests are in devising new techniques for handling uncertainty and complexity, and employing intelligence, learning and adaptation. These methods are being applied to target tracking, control of industrial processes, online decision support and control for water supply and wastewater systems.

Current and recent projects

Examples of current or recent projects include:

  • Collaboration with the Medical School in medical image processing
  • Basic research in computer vision, including colour image interpretation and symmetry analysis
  • Real-time video processing for object tracking and event analysis with application to intelligent video surveillance and autonomous vehicle navigation
  • Applications of novel computer architectures, particularly reconfigurable computing, to real, computationally complex problems
  • Research into vehicle hazard warning, as part of the EU 'RADARNet' project
  • Collaboration with BT Exact, Ensigma Technologies and others in the DTI/EPSRC 'PUMA' project on personalised spoken language interfaces and 'conversational biometrics'
  • Collaboration with the School of Psychology on modelling eye- and body-movement for multimodal human–machine interaction
  • Research into novel 'tangible acoustic interfaces' for human–machine interaction, as part of the EU FP6 'TAICHI' project
  • Spoken language processing for children in the FP5 'PF_STAR' collaboration with universities and research laboratories in Italy, Germany and Sweden
  • Development of EPSRC-supported research into novel 'unified' models of human speech that can support recognition, synthesis and coding technologies
  • Research into novel techniques for noise-robust speech processing
  • Modelling and characterisation of regional and non-native British English accents for speech technology
  • Collection of the ABI (Accents of the British Isles) speech database, the largest collection of British English accents and an invaluable resource for research into spoken language processing
  • Recurrent neural networks for non-linear adaptive control of uncertain systems
  • Fuzzy-logic supervisory control
  • Multi-target tracking
  • Constraint logic programming for integrated management, maintenance and operational control
  • Intelligent predictive control of hybrid dynamical systems
  • Hierarchical systems for online decision support, and control of integrated quantity and quality in water networks and wastewater systems
  • Intelligent control of engine emissions
  • Online risk assessment and monitoring


We have a diverse multidisciplinary portfolio of activities involving collaborators in university, commercial and government laboratories in the UK and overseas, as well as other researchers at Birmingham. These projects are funded by a range of sources, including the EPSRC, EU, industry, and UK and overseas governments.

Professional development

We are known internationally for our post-experience courses on Underwater Acoustics and Sonar Signal Processing, which are attended by sonar system designers from around the world. In recent years, we have also delivered a CPD course in Spoken Language Technologies and Computer Vision.