Multiple PhD project topics available around Trustworthy AI-enabled multimodal assistive hearing and conversational technologies


   School of Computing, Engineering & the Built Environment

  Supervisors: Prof Amir Hussain, Mr Mandar Gogate | Applications accepted all year round | Self-funded PhD students only

About the Project

Flexible PhD project topics are available as part of the prestigious UK EPSRC-funded Programme Grant COG-MHEAR, led by Programme Director Professor Amir Hussain.

As a PhD researcher, you will take a lead role in one of the COG-MHEAR research programme challenge areas under the direction of Prof. Hussain. You will have the opportunity to work alongside renowned academics and other doctoral and postdoctoral researchers in our world-class, interdisciplinary research centres in AI and Robotics, and in Cybersecurity, IoT and Cyber-Physical Systems.

There will also be opportunities to carry out collaborative research on complementary topics as part of the EPSRC-funded project on Natural Language Generation for Low-Resource Domains (NATGEN), which is co-led by Prof Hussain.

COG-MHEAR is a world-leading cross-disciplinary research programme funded under the EPSRC Transformative Healthcare Technologies 2050 Call. The programme aims to develop truly personalised multimodal assistive hearing and communication technology. It includes academic partners from six other UK universities and a strong User Group comprising industrial and clinical collaborators and end-user engagement organisations.

For more details, visit our website: https://cogmhear.org/.

We are looking for highly motivated applicants who will:

  • Carry out highly impactful supervised research in developing and evaluating trustworthy machine learning models for multimodal hearing-aid speech enhancement and conversational dialog systems.
  • Write up high-quality peer-reviewed publications for leading journals and conferences.
  • Exploit opportunities to collaborate with other PhD and postdoctoral researchers, COG-MHEAR partner companies, clinicians and end-users in the User Group.
  • Contribute to research and innovation proposals to secure future funding (e.g. for research/enterprise fellowships).

Any PhD project topic will be considered around developing and evaluating trustworthy machine learning models for multimodal hearing-aid speech enhancement and conversational dialog systems.

Example research areas of interest include:

  • Trustworthy machine learning for multimodal speech enhancement, separation and recognition (see the illustrative sketch after this list)
  • Real-time augmented data-driven approaches to address related hearing-aid signal processing and integration challenges
  • Emotion-sensitive natural language processing/generation and evaluation of multimodal assistive hearing and conversational/dialog systems in low-resource domains
  • Clinical and industrial applications (e.g. human-robot interaction, assistive technologies, wearable sensing, hardware/flexible electronics implementations, 5G-IoT and AR/VR use cases)
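
To give prospective applicants a concrete flavour of the first research area above, the sketch below outlines a mask-based audio-visual speech enhancement model of the kind reported in references [4]-[7]. It is a minimal, hypothetical PyTorch example: the class name, layer sizes and the assumption of pre-extracted lip-region embeddings are illustrative choices, not the COG-MHEAR or CochleaNet design.

# Minimal, hypothetical sketch of an audio-visual speech enhancement network in PyTorch.
# This is NOT the CochleaNet or COG-MHEAR architecture; all names and sizes are
# illustrative assumptions. The model estimates a time-frequency mask for a noisy
# magnitude spectrogram, conditioned on a sequence of lip-region video embeddings.

import torch
import torch.nn as nn


class AudioVisualEnhancer(nn.Module):
    def __init__(self, n_freq_bins: int = 257, visual_dim: int = 512, hidden: int = 256):
        super().__init__()
        # Audio branch: frame-wise encoding of the noisy magnitude spectrogram.
        self.audio_enc = nn.Sequential(
            nn.Linear(n_freq_bins, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # Visual branch: assumes pre-extracted per-frame lip embeddings
        # (e.g. from a pretrained lip-reading front end), upsampled to the
        # audio frame rate before being passed in.
        self.visual_enc = nn.Sequential(nn.Linear(visual_dim, hidden), nn.ReLU())
        # Fusion + temporal modelling over concatenated audio-visual features.
        self.fusion = nn.LSTM(input_size=2 * hidden, hidden_size=hidden,
                              num_layers=2, batch_first=True, bidirectional=True)
        # Mask head: one sigmoid-bounded gain per frequency bin and time frame.
        self.mask_head = nn.Sequential(nn.Linear(2 * hidden, n_freq_bins), nn.Sigmoid())

    def forward(self, noisy_mag: torch.Tensor, lip_feats: torch.Tensor) -> torch.Tensor:
        # noisy_mag: (batch, time, n_freq_bins) noisy magnitude spectrogram
        # lip_feats: (batch, time, visual_dim) time-aligned visual embeddings
        a = self.audio_enc(noisy_mag)
        v = self.visual_enc(lip_feats)
        fused, _ = self.fusion(torch.cat([a, v], dim=-1))
        mask = self.mask_head(fused)
        # Enhanced magnitude = estimated mask applied to the noisy input.
        return mask * noisy_mag


if __name__ == "__main__":
    model = AudioVisualEnhancer()
    noisy = torch.rand(2, 100, 257)  # 2 utterances, 100 frames, 257 bins
    lips = torch.rand(2, 100, 512)   # matching visual embeddings
    print(model(noisy, lips).shape)  # torch.Size([2, 100, 257])

In practice, COG-MHEAR research targets low-latency, privacy-preserving and hardware-efficient variants of such models; see the code links in references [6] and [7] for working implementations.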

Academic qualifications

A first-class honours degree, or a distinction at master's level, or equivalent achievements in AI, Computer Science, Informatics, Statistics, Mathematics, Electronic, Electrical, Computer, or Systems Engineering.

English language requirement

If your first language is not English, you must meet the University's English language requirements for research degree programmes.

Application process

Prospective applicants are encouraged to contact the supervisor, Prof Amir Hussain () to discuss the content of the project and the fit with their qualifications and skills before preparing an application. 

The application must include: 

A research project outline of 2 pages (excluding the list of references). The outline may provide details about:

  • Background and motivation, explaining the importance of the project and supported by relevant literature. You may also discuss the expected applications of the project results.
  • Research questions or hypotheses
  • Methodology: types of data to be used, approach to data collection, and data analysis methods.
  • List of references

The outline must be created solely by the applicant. Supervisors can only offer general discussions about the project idea without providing any additional support.

  • A statement of no more than 1 page describing your motivation and fit with the project.
  • A recent and complete curriculum vitae. The CV must include a declaration regarding the candidate's English language qualifications.
  • Supporting documents will have to be submitted by successful candidates.
  • Two academic references (if you have been out of education for more than three years, you may submit one academic and one professional reference), using the form that can be downloaded here.

Applications can be submitted here.

Download a copy of the project details here.


References

[1] COG-MHEAR: http://cogmhear.org
[2] 2nd COG-MHEAR International AVSEC Challenge, organised as part of the IEEE ASRU 2023 Workshop (applicants are encouraged to register and participate in the Challenge and propose ideas for new low-latency audiovisual (AV) speech enhancement models for potential submission): https://challenge.cogmhear.org/#/
[3] Special Issue CFP on "Conversational AI" in the IEEE Transactions on Artificial Intelligence (ideas are welcome for papers that could potentially link both COG-MHEAR and NATGEN related topics): https://cogmhear.org/assets/IEEE-TAI-Special-Issue.pdf
[4] Mandar Gogate, Kia Dashtipour, Ahsan Adeel, Amir Hussain, CochleaNet: A robust language-independent audio-visual model for real-time speech enhancement, Information Fusion, Volume 63, 2020, Pages 273-285, ISSN 1566-2535, https://doi.org/10.1016/j.inffus.2020.04.001.
[5] Hussain, Tassadaq, Mandar Gogate, Kia Dashtipour, and Amir Hussain. "Towards Intelligibility-Oriented Audio-Visual Speech Enhancement." https://claritychallenge.github.io/clarity2021-workshop/papers/Clarity_2021_CEC1_paper_final_hussain.pdf
[6] [Code available at: https://github.com/cogmhear/Intelligibility-Oriented-AudioVisual-Speech-Enhancement ]
[7] Gao, Ruohan, and Kristen Grauman. "Visualvoice: Audio-visual speech separation with cross-modal consistency." In 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 15490-15500. IEEE, 2021 [Code available at: https://github.com/facebookresearch/VisualVoice ]
[8] https://www.sciencedirect.com/science/article/pii/S1566253520302475
[9] https://www.sciencedirect.com/science/article/pii/S1566253518306018
[10] http://spandh.dcs.shef.ac.uk/chat2017/papers/CHAT_2017_hussain.pdf
[11] https://link.springer.com/content/pdf/10.1007/s12559-019-09653-z.pdf
[12] NATGEN paper: https://hbuschme.github.io/nlg-hri-workshop2020/assets/papers/NLG4HRI_paper_12.pdf / https://gtr.ukri.org/projects?ref=EP%2FT024917%2F1 (the NATGEN project has close, complementary links to COG-MHEAR, e.g. a shared interest in the development and evaluation of low-latency and generalisable multimodal neural network models; another topic of shared interest is multimodal speech and emotion-analysis models for cognitive load/listening effort detection and intelligibility prediction in conversational dialog systems).