Speech Recognition by Synthesis (SRbS)


Recent progress in reducing error rates for automatic speech recognition (ASR) has come mainly from machine learning algorithms whose models have ever more parameters.  Robust training of these models requires vast amounts of training data and is computationally expensive.  The structures learned by these models, such as deep neural networks and hidden Markov models with high-dimensional Gaussian mixture model output distributions, are often not understood in any detail, and are difficult to relate in any meaningful way to the underlying mechanisms of speech production.

In this project we are developing new parsimonious models for robust speech recognition, inspired by linguistically and physically plausible models of speech production.  Ideally, these models should be robust to speech in different environments and contexts and should require only a minimal set of meaningful parameters.  Such models may sit at the interface between speech articulation, synthesis and recognition.

Academic Staff

Research Staff

Doctoral Researchers

Publications

C. Champion and S.M. Houghton.  Application of Continuous State Hidden Markov Models to a Classical Problem in Speech Recognition, preprint submitted to Computer Speech and Language, 2014 (PDF).

P. Weber, S.M. Houghton, C.J. Champion, M.J. Russell and P. Jančovič.  Trajectory Analysis of Speech Using Continuous State Hidden Markov Models, accepted, ICASSP, 2014 (PDF Copyright IEEE).

C. Champion.  Recognition by rule 3: Application of Continuous State Hidden Markov Models to a Classical Problem in Speech Recognition, unpublished internal paper, 2011 (PDF).

Seminars

We hold regular seminars given by internal and external speakers.  The seminar schedule for autumn 2014 is to be confirmed.

[Figure: Example Spectrogram with VTR Tracks]