Predictive models built with artificial intelligence (AI) methods are powerful tools for discovering molecules with the potential to become drugs to treat a given disease. These models can leverage training datasets to identify such drug leads by computational (virtual) screening of massive libraries of molecules. In particular, AI models can be trained on atomic-resolution structures of macromolecular targets and the activities of their cognate molecules to predict the activities of other molecules across targets. Despite important successes, major challenges still limit the potential of such AI models. Some are specific to this problem (e.g. how to augment training datasets in a way that improves the performance of these models). Others are shared with other supervised learning problems (e.g. anticipating how well the models perform outside their applicability domain). This PhD project is funded by the Royal Society (https://royalsociety.org/about-us/history/). It aims to make progress towards overcoming these challenges using both synthetic and real datasets. The successful candidate will join the group of Dr Pedro Ballester at Imperial College and the PhD will be carried out under his direct supervision. These are some relevant papers from the group:
Selection criteria - Essential
· University degree/s awarded in an area directly relevant to the project.
· Courses in the application of machine learning algorithms to scientific problems.
· Excellent grades in first and/or master's degrees, especially in their research projects, with a major focus on computational analysis of data.
· Skilled in the implementation of Python or R scripts for scientific data analysis.
· English language (https://www.imperial.ac.uk/study/pg/apply/requirements/english/).
Selection criteria - Desirable
· Research projects in the application of supervised learning to solve real-world problems in the context of biomedical research, especially virtual screening.
· Exposure to open-source chemical informatics toolkits (e.g. RDKit, OpenBabel), machine learning platforms (e.g. DeepChem, TorchDrug, Scikit-Learn, Caret), structural biology databases (e.g. PDBe, AlphaFold, SWISS-MODEL) and/or medicinal chemistry databases (e.g. ChEMBL, SureChEMBL, PubChem, ZINC).
· Exposure to the application of machine learning algorithms to drug design, e.g. QSAR.
· Exposure to computational chemistry software (e.g. Vina, DOCK).
What we offer
The studentship covers living expenses at an enhanced rate (tax-free £17,609 per year) plus PhD registration fees (£26,600 per year) for three years, with the possibility of extending it to a 4th year.
This is an excellent opportunity for a bright and motivated scientist to work on a timely and exciting data science problem of great therapeutic importance. The student will join the Ballester group at Imperial’s Department of Bioengineering, which provides an international and stimulating research environment. In terms of personal experience, London has been named the best city in the world to be a university student (https://www.topuniversities.com/city-rankings/2022).
How to apply
Candidates must send an email with their CV, grades for each university degree held and a covering letter (maximum two pages) to [Email Address Removed] with the subject line “PhD in AI for SBVS”. The letter must explain how they meet the essential selection criteria, which desirable selection criteria are also met and how this position would fit in their future career plans. The email must also state the names and email addresses of two scientists involved in assessing their academic performance who are willing to provide a reference. Please also mention in the letter where you saw this position.