The University of Glasgow (UofG) is home to world-leading research in the fields of information retrieval (search) and machine learning. This PhD project is funded by a 2019 Bloomberg Data Science Award and will be carried out in collaboration with Edgar Meij of Bloomberg Technology in London. To facilitate the project, Bloomberg will assist with travel, internship opportunities, hardware, and access to data resources.
We are searching for a highly motivated student to work at the interface between search, machine learning, and natural language processing.
The aim of this PhD is to research the design of algorithms for entity-centric information extraction and retrieval. In particular, the focus is on multi-task deep learning models for topic-specific extraction and ranking over heterogeneous text collections, with the models trained using existing knowledge resources and weak supervision.
Our motivating example is an information analyst working in a specialized domain, such as energy regulation. An entity-centric task for this analyst would be to search for, or receive alerts on, key topics of interest, with a particular focus on important entities (people, companies, projects, technologies, etc.) and events (mergers, patents, lawsuits, etc.). A key property of the task is that the target information comes from diverse sources. We propose developing new models for this task.
Recent advances in multi-task learning demonstrate that it is possible to use unsupervised pre-training (BERT and Transformer models) for representation learning, followed by supervised fine-tuning on the target task, to achieve state-of-the-art results with limited data. This PhD will tackle fundamental challenges in combining information extraction (entity detection, entity disambiguation, relation extraction) and content retrieval (passages, entities, and facts) with deep learning models. The result will be new models that deliver significant improvements and new capabilities for complex entity-centric retrieval tasks.
The models developed will form a foundation for increasingly complex models combining language understanding with higher-level retrieval and summarisation tasks. This work will be a significant step towards assistive AI algorithms and information agents that operate over diverse, heterogeneous data sources and provide analysts and algorithms with the up-to-date, relevant information needed to make decisions of significant economic and social importance.
The ideal candidate will have:
● A strong first degree in Computer Science or a related discipline
● An interest in conversational artificial intelligence, including deep learning methods and reinforcement learning, information retrieval, and natural language understanding
● An understanding of research principles and methods, gained through an undergraduate or postgraduate dissertation project
Applications will be considered on a rolling basis. For enquiries specific to the project, please contact: [email protected]
Start Date: October 2019
How to Apply: Please see the following website for application details: https://www.gla.ac.uk/study/applyonline/?CAREER=PGR&PLAN_CODES=G500A-7201