Using deep transfer models from natural language processing to predict drug resistance in communicable diseases in humans

   Faculty of Engineering, Computing and the Environment

  Dr Farzana Rahman
  Applications accepted all year round
  Self-Funded PhD Students Only

About the Project

We invite applications from research students interested in learning about, understanding, and designing novel deep transfer and general-purpose predictive modelling approaches.

The frequent emergence of drug-resistant diseases and recent surges of fast-spreading communicable diseases pose a challenge to medical professionals treating patients. Determining the drug resistance of a mutant disease or organism can be lengthy, and in some cases the delay can be lethal.

In this project, the PhD candidate will explore, learn, and design new deep-learning techniques and natural language processing (NLP) methods to predict the causes of drug resistance at the molecular level. Deep transfer models are limited by their data-hungry nature, poor interpretability and inherent non-interoperability; this project will seek to address these limitations using novel approaches, to enable wider usage of such models.

The core hypothesis behind this project is that deep learning-based NLP models, which have the unique property of capturing spatial relationships among terms or words in natural text, can be exploited for this task. Building on them, the PhD candidate will develop novel methods to capture the subtle long-distance inter-relationships in the genetic code of known drug-resistant variants. These models should be able to discover new patterns that are out of reach of current state-of-the-art methods, e.g., Drug-Resistant Mutation (DRM) profiles.
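
As a rough illustration of how a genetic sequence might be presented to an NLP-style model, the sketch below splits a nucleotide sequence into overlapping k-mers ("words") and maps them to integer ids. This is only one possible tokenisation and is an assumption, not the project's prescribed method; the function names (`kmer_tokens`, `build_vocab`, `encode`) and the toy sequences are hypothetical and made up for illustration.

```python
from collections import Counter


def kmer_tokens(sequence, k=3, stride=1):
    """Split a nucleotide sequence into overlapping k-mers ("words")."""
    seq = sequence.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def build_vocab(token_lists):
    """Map each distinct k-mer to an integer id, most frequent first."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    ordered = sorted(counts, key=lambda t: (-counts[t], t))
    return {tok: idx for idx, tok in enumerate(ordered)}


def encode(tokens, vocab):
    """Turn a list of k-mers into the list of ids a model would consume."""
    return [vocab[t] for t in tokens]


# Toy, made-up sequences that differ by a single nucleotide (position 5).
toy_sequences = ["ATGGCCATTG", "ATGGCGATTG"]
tokenised = [kmer_tokens(s) for s in toy_sequences]
vocab = build_vocab(tokenised)
encoded = [encode(t, vocab) for t in tokenised]
```

With this scheme, a single point mutation in the interior of a sequence changes every k-mer that overlaps it, so the two toy sequences differ in three consecutive tokens. Sequence models built on such tokens could then, in principle, relate mutations that lie far apart in the sequence.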

To validate putative discoveries, the candidate will mine the literature to attempt to explain the detected inter-relationships. This will provide an opportunity to create a novel method to "interpret a large body of text at pace". This will require investigating keyword-directed search to perform topic modelling using named entity recognition and extended classification mechanisms.
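
A minimal sketch of what keyword-directed search over a document collection could look like is given below. It simply counts whole-word keyword matches and ranks documents by that count; it does not perform topic modelling or named entity recognition, which would be layered on top. The function names, document ids and example texts are hypothetical and invented purely for illustration.

```python
import re


def keyword_score(text, keywords):
    """Count whole-word, case-insensitive occurrences of any keyword."""
    words = re.findall(r"[a-z0-9\-]+", text.lower())
    wanted = {k.lower() for k in keywords}
    return sum(1 for w in words if w in wanted)


def rank_documents(docs, keywords):
    """Return ids of documents with at least one hit, best match first."""
    scored = [(keyword_score(text, keywords), doc_id)
              for doc_id, text in docs.items()]
    # Sort by descending score, then by id so that ties are deterministic.
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in ranked if score > 0]


# Made-up example texts, not real abstracts.
docs = {
    "doc_a": "Mutation in the target gene confers resistance to the drug.",
    "doc_b": "A survey of hospital staffing levels.",
    "doc_c": "Resistance emerged after a second mutation; resistance was confirmed in vitro.",
}
keywords = ["mutation", "resistance"]
ranking = rank_documents(docs, keywords)
```

Here the keywords would come from the detected inter-relationships (for example, gene or mutation names), so that only the most relevant texts are passed on to the more expensive interpretation steps.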

Key contributions to knowledge from this PhD work will include pre-trained, transferable models for a single disease, built by analysing relevant factors. In addition, a stretch goal for the candidate will be to apply the method or model, through model tuning, to a new or potential drug-resistant disease or organism.

Applicants should have at least an Honours Degree at 2.1 or above (or equivalent) in a STEM discipline, with an interest in learning Artificial Intelligence and Data Science. In addition, applicants must have previous programming experience in Python in an academic or industry setting.

Funding Notes

There is no funding for this project.


