
An explainable robotic perception framework for indoor scene understanding


   Centre for Accountable, Responsible and Transparent AI


Bath, United Kingdom. Artificial Intelligence, Computer Vision, Machine Learning, Robotics, Software Engineering

About the Project

The goal of this project is to develop a real-time robotic perception framework that captures visual information from real-world environments and reconstructs it into editable geometric models with semantic context.

A typical robotic perception pipeline for scene understanding involves information capture and analysis at both the geometric and the semantic level. The former focuses on extracting geometric entities/primitives from a scene, together with the interactions between them. The latter learns dense semantic labels for each geometric primitive extracted from the scene. Properly understanding a scene is an important prerequisite for richer industrial applications, including autonomous systems, navigation, mapping and localisation. The ability to understand a scene depicted in a set of static images, together with other multi-sensory information, has long been an essential computer vision problem in practice. However, this level of understanding is rather inadequate, since real-world scenes are often dynamic and noisy: unknown objects may move independently, and visual properties such as illumination and texture may change over time.
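As a purely illustrative sketch of how the two levels can be combined for a single RGB-D frame, the fragment below back-projects a depth map into a point cloud (geometric level) and attaches a per-pixel class label to every point (semantic level). The pinhole intrinsics and the segment function are assumptions standing in for a calibrated sensor and any off-the-shelf segmentation network; they are not part of the project specification.

    # Minimal sketch: fuse geometric and semantic information for one RGB-D frame.
    # Assumes a pinhole camera model and a hypothetical `segment` callable that
    # returns a per-pixel class-id map for an RGB image.
    import numpy as np

    def backproject(depth, fx, fy, cx, cy):
        """Lift a depth map (H, W) to a point cloud (H*W, 3) in camera coordinates."""
        h, w = depth.shape
        u, v = np.meshgrid(np.arange(w), np.arange(h))
        z = depth
        x = (u - cx) * z / fx
        y = (v - cy) * z / fy
        return np.stack([x, y, z], axis=-1).reshape(-1, 3)

    def labelled_point_cloud(rgb, depth, intrinsics, segment):
        """Geometric level: back-project depth; semantic level: attach a label per point."""
        points = backproject(depth, *intrinsics)
        labels = segment(rgb).reshape(-1)        # (H*W,) integer class ids
        valid = depth.reshape(-1) > 0            # drop pixels with no depth reading
        return points[valid], labels[valid]

    if __name__ == "__main__":
        # Synthetic data and a dummy segmenter, for illustration only.
        rgb = np.zeros((480, 640, 3), dtype=np.uint8)
        depth = np.full((480, 640), 2.0)         # metres
        dummy_segment = lambda img: np.ones(img.shape[:2], dtype=np.int64)
        pts, lbls = labelled_point_cloud(rgb, depth, (525.0, 525.0, 319.5, 239.5), dummy_segment)
        print(pts.shape, lbls.shape)

In the project itself, such per-frame estimates would need to be fused over time and made robust to the dynamic-scene challenges described above.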

The aim of this project is to develop a real-time robotic perception framework using explainable deep learning techniques that:

·        enable real-time estimation of both geometric and semantic information from a real-world scene;

·        create editable 3D content using the geometric and semantic information obtained from a scene;

·        provide human-understandable explanations and visualisations of the learning process.

In particular, the outcome of the project should achieve a meaningful level of transparency for a deep-learning-based perception system and help close the gap between such systems and human understanding. This comprises real-time 3D reconstruction of a real-world scene under a range of practical challenges, e.g. dynamic objects, illumination changes and large textureless regions. In addition, a set of high-quality labels should be produced for the raw 3D model, e.g. semantic labels and information about the shape and pose of objects and the layout of the scene, so that the raw model can be turned into a representation that an average user can edit in a visual, interactive environment. The successful candidate is expected to work closely with experts from Electronic Engineering, as well as external collaborators from Facebook Oculus, Lambda Labs, Kujiale.com and Imperial College London.

This project is associated with the UKRI Centre for Doctoral Training (CDT) in Accountable, Responsible and Transparent AI (ART-AI).

Applicants should hold, or expect to receive, a first or upper-second class honours degree in computer science, electrical engineering, mechanical engineering or a closely related discipline. A master's-level qualification or a publication history would be advantageous.

Informal enquiries about the research should be directed to Dr Li.

Formal applications should be accompanied by a research proposal and made via the University of Bath’s online application form. Enquiries about the application process should be sent to .

Start date: 3 October 2022.


Funding Notes

ART-AI CDT studentships are available on a competition basis and applicants are advised to apply early as offers are made from January onwards. Funding will cover tuition fees and maintenance at the UKRI doctoral stipend rate (£15,609 per annum in 2021/22, increased annually in line with the GDP deflator) for up to 4 years.
We also welcome applications from candidates who can source their own funding.

