Driven by the increasing availability of large annotated datasets and faster computational platforms, deep learning has been progressively employed across a broader spectrum of computer vision applications. In some scenarios, a machine can even outperform a human. On the other hand, a human, though sometimes less proficient at certain tasks, can perform a far broader range of tasks than any existing computer vision algorithm. Moreover, a machine has to learn from adequate data and labels to perform visual tasks, while a human interprets visual scenes effortlessly. To equip machine perception with abilities that humans have, this PhD studies visual learning and understanding with multi-task and limited supervision.
Multi-task learning is common in deep learning, where there is clear evidence that jointly learning correlated tasks can improve their individual performances [1-2]. Nevertheless, many tasks are still processed independently. The reasons are manifold: 1) many tasks are not strongly correlated, so joint learning may benefit only one of the tasks, or none [3]; 2) the scalability of learning multiple tasks is limited as the number of tasks grows [4]. A scalable and robust multi-task learning strategy has substantial potential in many real applications, e.g., autonomous vehicles and robotic surgery.
Learning with limited supervision concerns both the lack of visual data and the lack of labels. The former addresses issues such as rare classes or unseen classes, characterised as the long-tail problem [5] and the few-shot problem [6], respectively. The latter is often referred to as weakly-supervised or semi-supervised learning, where labels come in a weak form [7] or part of the data comes without labels [8]. Existing studies mainly focus on image classification, while studies on more complex visual tasks are just unfolding.
This project will first push forward the current state of the art in the above two regimes; next, a combinatorial view of studying them together will be proposed. The project will mainly focus on high-level visual understanding tasks, e.g., object detection, semantic segmentation, and image captioning. More tasks may be involved when applying the developed paradigms to real applications, e.g., autonomous driving.
The candidate should have a degree in Computer Science, Applied Mathematics or Electrical Engineering; a solid mathematical background and strong programming skills; and, preferably, prior experience in computer vision, machine learning and deep learning.
The studentship is funded for 3.5 years and includes tuition fees, a stipend at the [UK Research Council Rate](https://www.ukri.org/our-work/developing-people-and-skills/find-studentships-and-doctoral-training/get-a-studentship-to-fund-your-doctorate/) plus London weighting, and an allowance for research consumables and travel.
This studentship is open to home students and to international students who have pre-settled status in the UK. The target starting date is June or October 2022. The PhD will be supervised by Dr Miaojing Shi and Dr Michael Spratling. Work will be carried out within the Department of Informatics, King's College London.
Application Instructions: Candidates are requested to send an initial expression of interest to Miaojing Shi (miaojing.shi@kcl.ac.uk), preferably with an updated CV and a research proposal.