Applying machine learning to ’omics data for accelerated marine antibiotic discovery

Project Description

This project aims to generate comprehensive datasets (bioactivity, genomics, and metabolomics) that will enable us to develop machine learning tools to accelerate the discovery of novel antibiotics.

Over 70% of our antibiotics are produced naturally or derived from natural products, the vast majority of them by bacteria. In the last few decades, the discovery of new antibiotics has stalled because screening efforts repeatedly re-discover known chemistry, contributing to the current global antimicrobial resistance crisis. Antimicrobial resistance is responsible for 700,000 deaths per year, a figure that is estimated to rise to 10 million by 2050 if new solutions are not developed.1

In the last 15 years, genome sequencing has uncovered the incredible potential of bacteria to produce medically relevant chemistry and revealed that even well-studied strains retain the genetic potential to produce many more metabolites than have been discovered thus far.2 This potentially rich chemical diversity has been revealed by in silico prediction of genome regions that are likely to encode natural product biosynthesis (known as biosynthetic gene clusters; BGCs). Researchers can thus mine genome sequences for putative BGCs whilst simultaneously growing the strains and measuring their chemical products and their ability to kill bacterial pathogens (i.e. "superbugs").

Although computational tools exist for individual datasets (e.g. mass spectrometry3,4), linking these rich and complex datasets is still largely a manual process.5 This fundamental bottleneck in the discovery of new antibiotics can be addressed using machine learning. Machine learning is the name given to a family of techniques that extract patterns from existing data (training data) and use these learnt patterns to make predictions on previously unseen data. These techniques are particularly useful in domains where the data are too complex for direct analysis by humans. Developing and using machine learning tools requires large, high-quality training datasets. In this project, datasets consisting of genomes from many strains (and their predicted BGCs), the chemical products of these same strains, and their bioactivity profiles will be generated and linked to accelerate discovery.
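To make the idea of learning from training data and predicting on unseen data concrete, here is a minimal sketch in Python. It is not the method this project will use. The strain names, the four presence/absence features (two BGC families and two metabolite signals), the bioactivity labels, and the choice of a nearest-centroid classifier are all hypothetical and chosen only for illustration.

```python
from math import dist

# Hypothetical toy data: each strain is a presence/absence vector over
# four made-up features (two BGC families, two metabolite signals),
# paired with a made-up bioactivity label.
TRAINING_STRAINS = {
    "strain_A": ([1, 1, 1, 0], "active"),
    "strain_B": ([1, 0, 1, 0], "active"),
    "strain_C": ([0, 0, 0, 1], "inactive"),
    "strain_D": ([0, 1, 0, 1], "inactive"),
}


def train(strains):
    """Learn one centroid (mean feature vector) per bioactivity label."""
    grouped = {}
    for features, label in strains.values():
        grouped.setdefault(label, []).append(features)
    return {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in grouped.items()
    }


def predict(centroids, features):
    """Assign the label whose centroid is closest to the new strain."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


centroids = train(TRAINING_STRAINS)
unseen_strain = [1, 1, 1, 1]  # not part of the training data
prediction = predict(centroids, unseen_strain)
print(prediction)
```

Real genomic, metabolomic, and bioactivity datasets are far larger and noisier than this toy example, which is why large, high-quality training data matter.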

In addition to emailing your CV for consideration, please write a maximum 1-page statement about your motivations for applying for this PhD studentship and why the research questions you will be addressing here are of interest to you.

Funding Notes

University of Strathclyde Research Excellence Award


1 O’Neill. The Review on Antimicrobial Resistance.
2 Baltz (2017) J. Ind. Microbiol. Biotechnol. 44(4-5): 573-588
3 Wang et al. (2016) Nat. Biotechnol. 34: 828-837
4 van der Hooft et al. (2017) Anal. Chem. 89(14): 7569-7577
5 Duncan et al. (2015) Chem. Biol. 22(4): 460-471

FindAPhD. Copyright 2005-2019
All rights reserved.