
  PhD Studentship Opportunity: Audio-visual object-based dynamic scene representation from monocular video


   Faculty of Engineering and Physical Sciences

This project is no longer listed on FindAPhD.com and may not be available.

  Dr Armin Mustafa, Prof A Hilton  No more applications being accepted  Funded PhD Project (UK Students Only)

About the Project

This research will investigate the transformation of monocular audio and visual video into a spatially localised object-based audio-visual representation.

Self-supervised and weakly supervised deep learning will be investigated for the transformation of general scenes into semantically labelled and localised objects. This will build on recent advances in deep-learning based monocular reconstruction of general dynamic scenes and objects with known semantic labels, such as people. Multi-modal information sources including audio and text subtitles will be employed to support weakly supervised learning for semantic labelling and object-based reconstruction.

The goal of this research is to generalise to unconstrained video sequences of complex real-world scenes with multiple interacting people. Research will investigate approaches for the transfer of multi-modal or additional information to support the object-based scene reconstruction and evaluate the relative importance of different information sources.

The approach should be able to achieve plausible reconstruction of unknown or unmodelled object classes, together with complete reconstruction for modelled object classes. Learning on in-the-wild and BBC archive datasets will be investigated to support the generalisation to complex scenes. Specific use-cases such as sports and programme recommendation will also be investigated for evaluation in constrained contexts. The approach will be evaluated on both live and legacy content. 

Supervisors: Dr Armin Mustafa, Professor Adrian Hilton and Professor Wenwu Wang

This is a 4-year project starting in October 2021.

AI4ME website

About the Centre for Vision, Speech and Signal Processing (CVSSP)

Entry requirements

All applicants should have (or expect to obtain) a first-class degree in a numerate discipline (mathematics, science or engineering) or an MSc with Distinction (or a 70% average), and a strong interest in pursuing research in this field. Additional experience relevant to the area of research is also advantageous, especially a demonstrated capability or interest in convergence research that spans the physical, engineering and biological sciences.

English language requirements: IELTS 6.5 or above (or equivalent) with no sub-test of less than 6.0.

How to apply

Applications should be submitted via the Vision, Speech and Signal Processing PhD programme page on the "Apply" tab.

Please state clearly the studentship project that you would like to apply for.

For enquiries contact Nan Bennett ([Email Address Removed]) indicating your areas of interest and including your CV with qualification details (copies of transcripts and certificates). Shortlisted applicants will be contacted directly to arrange a suitable time for an interview. For further information about our research portfolio and how to apply visit: www.surrey.ac.uk/cvssp.

About CVSSP

CVSSP is a leading UK research centre in audio-visual signal processing, computer vision and machine learning, ranked 1st in the UK and 3rd in Europe for Computer Vision. Our Centre is one of the largest in Europe, with over 170 researchers and a grant portfolio in excess of £27 million, bringing together a unique combination of cutting-edge sound and vision expertise. Our aim is to advance the state of the art in multimedia signal processing and computer vision, with a focus on image, video and audio applications. Our Centre has a robust track record of innovative research leading to technology transfer and exploitation in biometrics, creative industries (film, TV, games, VR), communication, healthcare, robotics and consumer electronics.

CVSSP is a destination of choice for postgraduate talent and is part of the Department of Electrical and Electronic Engineering, which is ranked second in the Guardian newspaper league table 2020. The University of Surrey has recently been ranked 7th in the UK in the 2020 Advance HE Postgraduate Research Experience Survey (PRES).

We acknowledge, understand and embrace diversity.



Funding Notes

Full UK/EU tuition fee covered. Stipend at £18,609 p.a. increasing annually (enhanced stipend of £3k/annum). Personal Computer (provided by the department). Conference attendance budget £2k/annum. Equipment/consumables budget £1k/annum. Funding duration is 4 years.
https://www.ukri.org/our-work/developing-people-and-skills/find-studentships-and-doctoral-training/get-a-studentship-to-fund-your-doctorate/