Speech Recognition by Synthesis (SRbS)

Recent progress in reducing error rates for automatic speech recognition (ASR) has mainly been achieved through machine learning algorithms, involving models with ever more parameters.  Robust training of these models requires vast amounts of training data and is computationally expensive.  The structures learned by these models, which include deep neural networks and hidden Markov models with high-dimensional Gaussian mixture model output distributions, are often not understood in any detail and are difficult to relate in any meaningful way to the underlying mechanisms of speech production.

In this project we are developing new parsimonious models for robust speech recognition, inspired by linguistically and physically plausible models of speech production.  These models should be robust to speech in different environments and contexts, and should require only a minimal set of meaningful parameters.  Such models may sit at the interface between speech articulation, synthesis and recognition.

Academic Staff

Research Staff

Doctoral Researchers

Recent Activity

July 2-3 2015: Several members of the group attended UK Speech 2015 at the University of East Anglia, Norwich, UK, and presented several posters and a talk (photos below).

June 1 2015: The group will present several papers and posters at Interspeech 2015, to be held in Dresden, Germany, in September.

Publications

Journal Papers

C. Champion and S.M. Houghton.  Application of Continuous State Hidden Markov Models to a Classical Problem in Speech Recognition, Computer Speech and Language, in press, 2015 (DOI).

Refereed Conference Papers

Linxue Bai, P. Jančovič, M.J. Russell and P. Weber.  Analysis of a Low-Dimensional Bottleneck Neural Network Representation of Speech for Modelling Speech Dynamics.  Accepted to Interspeech 2015, Dresden, Germany, 2015 (Preprint PDF).

S. M. Houghton, C. J. Champion and P. Weber.  Recognition of Voiced Sounds with a Continuous State HMM.  Accepted to Interspeech 2015, Dresden, Germany, 2015 (Preprint PDF).

P. Weber, C. Champion, S. Houghton, P. Jančovič and M.J. Russell.  Consonant Recognition with Continuous-State Hidden Markov Models and Perceptually-Motivated Features.  Accepted to Interspeech 2015, Dresden, Germany, 2015 (Preprint PDF).

P. Weber, S.M. Houghton, C.J. Champion, M.J. Russell and P. Jančovič.  Trajectory Analysis of Speech Using Continuous State Hidden Markov Models.  In Proc. ICASSP, pp. 3042-3046, 2014 (DOI).

Other Papers

Linxue Bai, P. Jančovič, M.J. Russell and P. Weber.  Analysis of a Low-Dimensional Bottleneck Neural Network Representation of Speech for Modelling Speech Dynamics.  Poster presented at UK Speech, University of East Anglia, Norwich, UK, 2015 (PDF).

C. A. Seivwright, M. J. Russell and S. M. Houghton.  Comparative Analysis of a Continuous and Discontinuous Piecewise Linear Decoder.  Poster presented at UK Speech, University of East Anglia, Norwich, UK, 2015 (PDF).

C. Champion.  Recognition by rule 3: Application of Continuous State Hidden Markov Models to a Classical Problem in Speech Recognition.  Unpublished internal paper, 2011 (PDF).

Seminars

We hold regular seminars given by internal and external speakers.  The new seminar schedule will begin in autumn 2015.

Photos

Some photos of the group from UK Speech 2015.

[Photos: Peter, Phil, Eva, Chloe and Mengie at UK Speech 2015]