CSC studentship: Deep Learning approaches for the study of natural selection and the genotype-phenotype map

   School of Biology

This project is no longer listed and may not be available.

Prof OE Gaggiotti
No more applications being accepted
Competition Funded PhD Project (Students Worldwide)

About the Project

The field of population genomics addresses one of the main challenges of modern biology, namely dissecting and understanding the molecular basis of phenotypic variation. Such understanding is fundamentally important for biodiversity conservation, improvement of agricultural crops and breeds, control of invasive (pest) species, and identification of disease genes in humans and economically important species. To address this challenge, the field rapidly adopted technical advances in next-generation sequencing, which generates the massive and heterogeneous data sets (big data) required to characterise complex biological systems and uncover their underlying mechanisms. It has recently become clear that exploiting these very large and complex data sets efficiently requires Machine Learning approaches, including Deep Learning (Korfmann, Gaggiotti & Fumagalli 2023). This PhD project aims to pursue this important new avenue of research.

Several properties of Deep Learning (DL) algorithms make them ideally suited for population genomics applications. They can identify the associations between features (e.g., genetic variants) that underlie their predictive power and can therefore uncover variants with small effects through the collective effect of all associated features. They can also detect non-linear effects of genetic variants on the predicted variables. Although DL models have been regarded as black boxes, rapid advances in interpretable DL are overcoming this limitation.
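
The point about non-linear effects can be illustrated with a toy example. The sketch below is not part of the project's methods: the two loci, the haploid 0/1 coding and the purely epistatic (XOR) trait are all hypothetical. It shows a trait that an additive linear model cannot explain at all, while a model with an interaction term, the kind of feature a hidden layer with a non-linear activation can learn, explains it fully.

```python
import numpy as np

# Hypothetical haploid genotypes at two loci: all four combinations, equally frequent.
snp_a = np.array([0, 0, 1, 1], dtype=float)
snp_b = np.array([0, 1, 0, 1], dtype=float)

# Purely epistatic toy trait (XOR): expressed only when exactly one derived allele is carried.
trait = np.logical_xor(snp_a, snp_b).astype(float)


def r_squared(features, target):
    """Fraction of variance explained by a least-squares fit with an intercept."""
    design = np.column_stack([np.ones(len(target))] + list(features))
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    residual = target - design @ coef
    return 1.0 - residual.var() / target.var()


# Additive model: each SNP contributes independently.
additive_r2 = r_squared([snp_a, snp_b], trait)

# Same model plus the SNP-by-SNP interaction term.
interaction_r2 = r_squared([snp_a, snp_b, snp_a * snp_b], trait)

print(f"additive R^2:    {additive_r2:.3f}")
print(f"interaction R^2: {interaction_r2:.3f}")
```

Real traits are rarely this extreme, but the same logic applies whenever part of the genetic signal lies in combinations of variants rather than in their separate effects.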

The specific objectives of this PhD project are to develop interpretable DL approaches for predicting the phenotype or some other attribute of individuals (e.g., geographic location; cf. Qin, Chiang & Gaggiotti 2022) in order to make inferences about the genetic architecture of phenotypic traits or to detect spatially varying selection. The methods will be applied to the study of an important pest species, Drosophila melanogaster, using an open-access database (Lack et al. 2016).

The focus will be on refining and extending the multilayer perceptron (MLP) approach of Qin, Chiang & Gaggiotti (2022) in two ways:

i)   Promoting higher modularity in the MLP using ‘informed machine learning’ approaches (cf. von Rueden et al. 2023). More precisely, we will include a knowledge-based loss term in the objective function. The knowledge will consist of the physical locations of SNPs, which will allow us to incorporate genetic linkage information.

ii)  Developing a knowledge-based artificial neural network (KBANN; cf. von Rueden et al. 2023) that incorporates biological knowledge into the architecture of the MLP by defining meaningful connections between layers. More precisely, we will adopt the GenNet framework (van Hilten et al. 2021), which uses gene annotations to connect millions of SNPs to the genes with which they are physically associated.
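
The two ideas above can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the project's implementation and not the GenNet package API: the SNP positions, the gene assignments, the exponential decay of linkage with distance (`scale_bp`), the penalty weight (`strength`) and all function names are assumptions made for the example. Part (i) adds a penalty that encourages physically close SNPs to have similar first-layer weights. Part (ii) masks a SNP-to-gene layer so that each SNP connects only to its annotated gene.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical annotation for six SNPs: physical position (bp) and index of the annotated gene.
snp_position = np.array([100, 250, 400, 10_000, 10_300, 50_000], dtype=float)
snp_gene = np.array([0, 0, 0, 1, 1, 2])
n_snps, n_genes, n_hidden = len(snp_gene), 3, 4


def linkage_penalty(weights, positions, scale_bp=1_000.0):
    """(i) Knowledge-based loss term: penalise weight differences between nearby SNPs."""
    distance = np.abs(positions[:, None] - positions[None, :])
    closeness = np.exp(-distance / scale_bp)  # assumed decay of linkage with distance
    sq_diff = ((weights[:, None, :] - weights[None, :, :]) ** 2).sum(axis=-1)
    return 0.5 * float((closeness * sq_diff).sum()) / len(positions)


def informed_loss(prediction, target, first_layer_weights, positions, strength=0.1):
    """Data-fit term (mean squared error) plus the weighted knowledge-based term."""
    data_term = float(np.mean((prediction - target) ** 2))
    return data_term + strength * linkage_penalty(first_layer_weights, positions)


# (ii) Annotation mask: entry [s, g] is 1 only where SNP s is annotated to gene g.
gene_mask = np.zeros((n_snps, n_genes))
gene_mask[np.arange(n_snps), snp_gene] = 1.0


def gene_layer(genotypes, weights, mask):
    """SNP-to-gene layer in which unannotated connections are forced to zero."""
    return np.tanh(genotypes @ (weights * mask))


genotypes = rng.integers(0, 3, size=(5, n_snps)).astype(float)  # 0/1/2 allele counts
dense_weights = rng.normal(size=(n_snps, n_hidden))  # first layer of a dense MLP
gene_weights = rng.normal(size=(n_snps, n_genes))    # weights of the masked layer

gene_activity = gene_layer(genotypes, gene_weights, gene_mask)
penalty = linkage_penalty(dense_weights, snp_position)
```

In a real model the mask would be stored as a sparse matrix, since with millions of SNPs almost all SNP-to-gene entries are zero, and the penalty would be restricted to SNP pairs within some window rather than computed over all pairs.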

This PhD project is part of a collaboration with Dr Juan Ye (School of Computer Science, University of St Andrews). Thus, although the student will be based at the School of Biology, they will be co-supervised by Dr Juan Ye and will interact with her group.

The student will receive extensive training in the development and application of Deep Learning approaches as well as in bioinformatics and population genomics. Thus, upon completion, the student will have key transferable skills that are highly sought after by both academia and industry. 


Submit an application to the University of St Andrews through the online application portal: Research programmes - Study at St Andrews - University of St Andrews

Your online application must include the following documents:

  • 2 references
  • Academic qualifications
  • English language qualification (if applicable)
  • CV
  • Personal statement

Once you have submitted your application to the online portal, please submit a scholarship application through the link provided. More information can be found here: China Scholarship Council - Global partnerships and study abroad - University of St Andrews

Candidates should have a strong background in one or more of the following areas: population genomics, computer science, statistics, bioinformatics. In all cases, however, they will have to demonstrate very strong quantitative skills. Interested candidates should contact Prof. Oscar Gaggiotti ([Email Address Removed]).


Funding Notes

We encourage applications from Chinese nationals through the St Andrews China Scholarship Council Scheme.


References

Korfmann K, Gaggiotti OE, Fumagalli M. 2023. Deep Learning in Population Genetics. Genome Biology and Evolution 15.
Lack JB, Lange JD, Tang AD, Corbett-Detig RB, Pool JE. 2016. A Thousand Fly Genomes: An Expanded Drosophila Genome Nexus. Molecular Biology and Evolution 33:3308–3313.
Qin XH, Chiang CWK, Gaggiotti OE. 2022. Deciphering signatures of natural selection via deep learning. Briefings in Bioinformatics 23.
van Hilten A, Kushner SA, Kayser M, Ikram MA, Adams HHH, Klaver CCW, Niessen WJ, Roshchupkin GV. 2021. GenNet framework: interpretable deep learning for predicting phenotypes from genetic data. Communications Biology 4.
von Rueden L, Mayer S, Beckh K, Georgiev B, Giesselbach S, Heese R, Kirsch B, Pfrommer J, Pick A, Ramamurthy R, et al. 2023. Informed Machine Learning - A Taxonomy and Survey of Integrating Prior Knowledge into Learning Systems. IEEE Transactions on Knowledge and Data Engineering 35:614–633.
