With the development of deep learning approaches, and convolutional neural networks (CNNs) in particular, recognising objects in an image has become tied to the ability to train a network on a large number of labelled images for each class of interest [He2016]. The creation of large labelled datasets such as ImageNet [Ru2015] has enabled the development of increasingly capable architectures. To extend the number of recognisable classes without requiring a large number of labelled images of the new classes, few-shot learning (FSL) and even one-shot learning were proposed: in such frameworks, the initial training set is complemented by a support set containing only a few images to define each additional object class [Sung2018]. Eventually, reliance on training images for new classes was removed entirely; instead, textual attributes proved sufficient. Once a mapping between object images and textual attributes has been learned from a large training set, an object from a new class defined only by its textual attributes can be recognised. While zero-shot learning (ZSL) initially focused on identifying images belonging to the unseen classes of interest, generalized ZSL offers predictions for both seen and unseen classes [Ji2021]. Although textual attributes can be retrieved automatically from the known labels using resources such as word2vec [Ch17] and Wikipedia, ZSL is essentially a retrieval problem, as the labels of all classes of interest must be known. Unfortunately, such a constraint can only be met in a limited number of scenarios, such as those associated with existing specialised datasets dedicated to specific applications, e.g., ‘bird watching’ [Wa11]. Here, it is proposed to replace this constraint with a less limiting one, i.e., access to an internet connection (or at least an electronic copy of an encyclopaedia).
In such a scenario, a picture of any object, not only unseen but also unidentified, can in principle be processed and labelled. For example, if a system were able to characterise an unidentified object in an image as ‘a beaver with a duck beak’, a simple query using one’s favourite search engine would be sufficient to label the object as a ‘platypus’.
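The lookup step in the platypus example can be illustrated with a minimal sketch. It assumes that a visual model has already produced a score for each textual attribute, and it uses a small in-memory dictionary as a stand-in for the encyclopaedia or search engine. The attribute vocabulary, the class signatures, the scores and the function names are all hypothetical and chosen only for illustration; cosine similarity is one possible matching rule, not necessarily the one the project would adopt.

```python
import math

# Hypothetical attribute vocabulary (illustrative only).
ATTRIBUTES = ["fur", "beak", "webbed_feet", "flat_tail", "feathers"]

# Hypothetical stand-in for an encyclopaedia: one binary signature per entry,
# in the same order as ATTRIBUTES.
CLASS_SIGNATURES = {
    "beaver":   [1, 0, 1, 1, 0],
    "duck":     [0, 1, 1, 0, 1],
    "platypus": [1, 1, 1, 1, 0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_labels(predicted, signatures):
    """Return entry names sorted by decreasing similarity to the predicted attributes."""
    return sorted(
        signatures,
        key=lambda name: cosine(predicted, signatures[name]),
        reverse=True,
    )


# Assumed attribute scores for an image of 'a beaver with a duck beak'.
predicted = [0.9, 0.8, 0.7, 0.9, 0.1]
ranking = rank_labels(predicted, CLASS_SIGNATURES)
print(ranking)  # ['platypus', 'beaver', 'duck']
```

In the setting described above, the dictionary would be replaced by a query against an external resource, so the set of candidate labels would not need to be fixed in advance.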
The aim of the project is to develop an efficient deep learning-based pipeline that annotates any object present in an image without the system having prior knowledge of its existence. Visual features extracted from the image are used to create an Unidentified Featured Object (UFO); using an existing mapping between visual and textual features, natural language processing techniques can then be applied to generate a list of putative annotations. Finally, these annotations can be analysed using an FSL-based framework to identify the most likely label.
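The final step, selecting the most likely label among the putative annotations, could for instance rely on a nearest-prototype rule. This is a common few-shot technique used here purely as an illustration; the project description does not specify which FSL framework will be used. The sketch below assumes that a few reference images per candidate label have already been converted into feature vectors. All vectors, labels and function names are hypothetical.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def most_likely_label(query, support):
    """Pick the label whose prototype (mean support vector) is closest to the query."""
    prototypes = {label: mean_vector(vecs) for label, vecs in support.items()}
    return min(prototypes, key=lambda label: euclidean(query, prototypes[label]))


# Hypothetical support set: two feature vectors per putative annotation.
support = {
    "platypus": [[0.9, 0.8, 0.1], [1.0, 0.7, 0.2]],
    "beaver":   [[0.9, 0.1, 0.1], [0.8, 0.0, 0.2]],
    "duck":     [[0.1, 0.9, 0.9], [0.2, 0.8, 1.0]],
}

# Hypothetical features of the unidentified object (the UFO).
ufo_features = [0.85, 0.75, 0.2]
label = most_likely_label(ufo_features, support)
print(label)  # platypus
```

Only a handful of examples per candidate are needed because each label is summarised by a single prototype, which is what makes this kind of rule attractive when the candidates are generated on the fly.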
Successful completion of the project requires addressing the following scientific objectives:
1. Design a CNN-based architecture able to convert the photograph of any object into a set of relevant features to create a UFO
2. Design a suitable mapping function from visual to textual features
3. Apply natural language processing and FSL techniques to label the UFO
4. Significantly reduce the size of the deep learning solution, while maintaining its performance, by using differential equations as neural network layers. The use of efficient solvers should result in an overall decrease in computational complexity
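To illustrate the idea behind the last objective, the sketch below treats a layer as the solution of an ordinary differential equation dh/dt = tanh(W h + b), integrated from t = 0 to t = 1. This is one possible reading of "differential equations as neural network layers", not a description of the project's actual design. A fixed-step forward Euler scheme is used only for simplicity; an efficient solver, as mentioned in the objective, would normally replace it. All names and values are hypothetical.

```python
import math


def dynamics(h, weights, bias):
    """Right-hand side f(h) = tanh(W h + b) of the ODE dh/dt = f(h)."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, h)) + b)
        for row, b in zip(weights, bias)
    ]


def ode_layer(h0, weights, bias, t_end=1.0, steps=10):
    """Integrate dh/dt = f(h) from 0 to t_end with fixed-step forward Euler."""
    h = list(h0)
    dt = t_end / steps
    for _ in range(steps):
        dh = dynamics(h, weights, bias)
        h = [x + dt * d for x, d in zip(h, dh)]
    return h


def parameter_count(weights, bias):
    """Number of trainable values; it does not depend on the number of solver steps."""
    return sum(len(row) for row in weights) + len(bias)


# Hypothetical 2-dimensional example.
weights = [[0.5, -0.3], [0.2, 0.1]]
bias = [0.0, 0.1]
h0 = [1.0, -1.0]

coarse = ode_layer(h0, weights, bias, steps=4)
fine = ode_layer(h0, weights, bias, steps=100)
print(coarse, fine, parameter_count(weights, bias))
```

The same weights and bias are reused at every solver step, so refining the integration (the analogue of adding depth) leaves the parameter count unchanged. That is the sense in which such layers may reduce model size.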
Applicants should have, at least, an Honours Degree at 2.1 or above (or equivalent) in Computer Science or related disciplines. In addition, they should have a good mathematical background, excellent programming skills in Python, and an interest in machine learning.
There is no funding for this project