  String models for speech


   School of Computer Science

This project is no longer listed on FindAPhD.com and may not be available.

Prof Thomas Hain  Applications accepted all year round  Self-Funded PhD Students Only

About the Project

In the past two decades speech technology has relied heavily on concatenative structures, in particular hidden Markov models (HMMs), to represent the acoustics of speech. Although these models have significant and well-established shortcomings, they remain dominant because they are simple to use in practice. Technology has moved on dramatically, and to make these models work in the complex conditions required for speech or speaker recognition (or any other speech classification task), many algorithms have been developed that aim to alleviate the known deficiencies. This has led to a situation where it takes many years to develop practical recognition systems, and even then many applications still perform very poorly. One of the main shortcomings is the temporal modelling in HMMs.

In this project we will work on a new model that, instead of treating speech as a concatenation of elements, represents it as dynamic sequences that overlap in time. One can show that such modelling can be substantially more powerful than existing concepts, allowing new models to account for different aspects of speech such as intra-speaker variation. The project will be conducted in the Speech and Hearing Group at the Department of Computer Science at Sheffield University. The group is at the forefront of international research into speech recognition, and the project can naturally draw on the extensive infrastructure and knowledge available in the group.
