
  Image-based Recognition of Unidentified Featured Objects (UFOs)


   Faculty of Engineering, Computing and the Environment

   Applications accepted all year round | Self-Funded PhD Students Only

About the Project

With the development of deep learning approaches, and convolutional neural networks (CNNs) in particular, the task of recognising objects in an image has become associated with the ability to train a network on a large number of labelled images for each class of interest [He16]. The creation of large labelled datasets such as ImageNet [Ru15] has permitted the development of increasingly high-performing architectures. To extend the number of recognisable classes without requiring access to a large number of labelled images of the new classes, few-shot learning (FSL) and even one-shot learning were proposed: in such frameworks, the initial training set is complemented by a support set containing only a few images to define each additional object class [Su18]. Eventually, reliance on training images for new classes was removed altogether; instead, textual attributes proved sufficient. By learning a mapping between object images and textual attributes using a large training set, an object from a new class defined only by its textual attributes can be recognised. While zero-shot learning (ZSL) initially focused on identifying images belonging to the unseen classes of interest, generalised ZSL offers predictions for both seen and unseen classes [Ji21]. Although textual attributes can be retrieved automatically from the known labels using resources such as word2vec [Ch17] and Wikipedia, ZSL is essentially a retrieval problem, as the labels of all classes of interest must be known. Unfortunately, such a constraint can only be met in a limited number of scenarios, such as those associated with existing specialised datasets dedicated to specific applications, e.g., ‘bird watching’ [Wa11]. Here, it is proposed to replace this constraint with a less limiting one, i.e., access to an internet connection (or at least an electronic copy of an encyclopaedia).
In such a scenario, a picture of any object, not only unseen but also unidentified, can in principle be processed and labelled. For example, if a system were able to characterise an unidentified object in an image as ‘a beaver with a duck beak’, a simple query using one’s favourite search engine would be sufficient to label the object as a ‘platypus’.
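To make the attribute-based recognition idea concrete, the following is a minimal sketch of ZSL by attribute matching: a visual feature is projected into attribute space with a learned mapping, and the nearest class attribute vector gives the prediction. Everything here is an illustrative assumption (the toy attribute vocabulary, the identity projection standing in for a trained mapping, and the feature values), not part of any existing system.

```python
import numpy as np

def l2_normalise(x):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def zero_shot_predict(image_feature, W, class_attributes, class_names):
    """Map a visual feature into attribute space with a (learned) matrix W,
    then return the class whose attribute vector is most similar."""
    projected = l2_normalise(image_feature @ W)    # shape: (d_attr,)
    attrs = l2_normalise(class_attributes)         # shape: (n_classes, d_attr)
    scores = attrs @ projected                     # cosine similarity per class
    return class_names[int(np.argmax(scores))]

# Toy example with two attribute dimensions: 'has beak' and 'has fur'.
class_names = ["duck", "beaver", "platypus"]
class_attributes = np.array([[1.0, 0.0],   # duck: beak, no fur
                             [0.0, 1.0],   # beaver: fur, no beak
                             [1.0, 1.0]])  # platypus: beak and fur
W = np.eye(2)                     # identity stands in for a trained projection
image_feature = np.array([0.9, 0.8])  # visual evidence for both beak and fur
print(zero_shot_predict(image_feature, W, class_attributes, class_names))
```

The key point of the sketch is that ‘platypus’ is never seen at training time; it is recognised purely because its attribute vector best matches the projected visual evidence.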

The aim of the project is to develop an efficient deep learning-based pipeline allowing the annotation of any object present in an image, without the system having prior knowledge of the object's existence. Taking advantage of extracted visual features to create an Unidentified Featured Object (UFO), and of an existing mapping between visual and textual features, natural language processing techniques can then be applied to generate a list of putative annotations. Finally, these can be analysed within an FSL-based framework to identify the most likely label.
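The middle stage of this pipeline, turning a UFO's attribute scores into text that could seed a search-engine or encyclopaedia query, can be sketched as follows. The attribute vocabulary, the threshold, and the phrasing rule are all hypothetical placeholders for what a trained visual-to-textual mapping would produce.

```python
import numpy as np

# Hypothetical attribute vocabulary a visual-to-textual mapping might emit.
ATTRIBUTE_NAMES = ["duck beak", "beaver body", "fur", "webbed feet"]

def describe_ufo(attribute_scores, names=ATTRIBUTE_NAMES, threshold=0.5):
    """Turn per-attribute confidence scores for an unidentified object into
    a textual description usable as a query for annotation candidates."""
    present = [n for n, s in zip(names, attribute_scores) if s > threshold]
    if not present:
        return "an unidentified object"
    return "an object with " + " and ".join(present)

scores = np.array([0.9, 0.8, 0.7, 0.2])  # toy scores from a feature extractor
print(describe_ufo(scores))
```

In the full pipeline, the resulting phrase would be matched against encyclopaedia entries, and an FSL-based stage would rank the returned candidate labels.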

Successful completion of the project requires addressing the following scientific objectives:

1.   Design a CNN-based architecture able to convert the photograph of any object into a set of relevant features to create a UFO

2.   Design a suitable visual-to-textual feature mapping function

3.   Apply natural language processing and FSL techniques to label the UFO

4.   Significantly reduce the size of the deep learning solution, while maintaining its performance, by using differential equations as neural network layers; the use of efficient solvers should result in a global decrease in computational complexity
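Objective 4 builds on the view of a residual block as one step of a discretised ordinary differential equation: h ← h + f(h) is an explicit Euler step of dh/dt = f(h). The sketch below illustrates this with simple linear dynamics standing in for a learned layer; the dynamics matrix and step counts are assumptions chosen so the exact solution (a rotation) is known.

```python
import numpy as np

def f(h, t, A):
    # Hypothetical dynamics dh/dt = A @ h, standing in for a learned layer.
    return A @ h

def ode_layer(h0, A, t0=0.0, t1=1.0, n_steps=100):
    """Treat a 'layer' as integrating dh/dt = f(h, t) from t0 to t1 with
    the explicit Euler method. n_steps=1 recovers the ResNet update
    h + f(h); more steps give a finer, more accurate solve."""
    h, t = h0.astype(float), t0
    dt = (t1 - t0) / n_steps
    for _ in range(n_steps):
        h = h + dt * f(h, t, A)
        t += dt
    return h

# Rotation dynamics: the exact solution after time 1 is a rotation by 1 radian.
A = np.array([[0.0, -1.0],
              [1.0,  0.0]])
h0 = np.array([1.0, 0.0])
h1 = ode_layer(h0, A, n_steps=1000)
print(h1, np.array([np.cos(1.0), np.sin(1.0)]))  # Euler result vs exact rotation
```

In practice one would replace the fixed-step Euler loop with an adaptive solver, which is where the hoped-for reduction in computational cost comes from: a single parameterised dynamics function replaces a deep stack of layers.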

Applicants should have, at least, an Honours Degree at 2.1 or above (or equivalent) in Computer Science or related disciplines. In addition, they should have a good mathematical background, excellent programming skills in Python, and an interest in machine learning.



References

[Ch17] K. Church, Word2Vec, Natural Language Engineering, 23(1), 155-162, 2017
[He16] K. He et al., Deep Residual Learning for Image Recognition, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016
[Ji21] Y. Jin et al., Zero-Shot Video Event Detection with High-Order Semantic Concept Discovery and Matching, IEEE Transactions on Multimedia, 2021
[Ru15] O. Russakovsky et al., ImageNet Large Scale Visual Recognition Challenge, International Journal of Computer Vision (IJCV), 2015
[Su18] F. Sung et al., Learning to Compare: Relation Network for Few-Shot Learning, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018
[Wa11] C. Wah et al., The Caltech-UCSD Birds-200-2011 Dataset, California Institute of Technology, Tech. Rep. CNS-TR-2011-001, 2011


