Scalable probabilistic modelling for high-resolution biological data

   Faculty of Biology, Medicine and Health

This project is no longer listed and may not be available.

Supervisors: Prof M Rattray, Dr Mudassar Iqbal
Applications accepted all year round
Self-Funded PhD Students Only

About the Project

Probabilistic modelling provides a principled framework for model-based inference from data. Gaussian process inference in particular has been used to gain insights into single-cell and spatial omics data [1,2,3]. These methods should ideally provide well-calibrated uncertainty estimates and use appropriate generative models of the data, e.g. counts data should be modelled using an appropriate likelihood function. It was shown recently that a negative binomial likelihood can greatly improve the quality of inference results [3]. However, scaling up inference in Gaussian process models with non-Gaussian likelihoods is computationally very challenging. At the same time, omics technologies are becoming increasingly data-rich, with modern single-cell methods able to profile millions of cells, and spatial omics methods able to profile tens of thousands of tissue locations. Both single-cell and spatial omics technologies can be applied genome-wide, with coverage over many thousands of genes. Therefore, there is a pressing need to scale up methods such as GPcounts [3] to deal with much larger datasets than is currently feasible. With spatially resolved transcriptomics declared “Method of the Year” 2020 by Nature Methods, high-quality computational methods for these data are likely to have a major role in biological discovery.
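To make the point about likelihood choice concrete, the following is a minimal Python sketch, not the GPcounts implementation. It assumes a log link between a latent function value f (as a Gaussian process would supply) and the mean count, and a negative binomial parameterised by mean mu and dispersion alpha, so that the variance is mu + alpha * mu^2. The function names, the example counts and the parameter values are all hypothetical and chosen only for illustration.

```python
import math


def nb_log_pmf(y, mu, alpha):
    """Log-probability of count y under a negative binomial with mean mu
    and dispersion alpha (variance = mu + alpha * mu**2)."""
    r = 1.0 / alpha  # "size" parameter; as r grows the model approaches Poisson
    return (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )


def poisson_log_pmf(y, mu):
    """Log-probability of count y under a Poisson with mean mu."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)


def log_likelihood(counts, latent_f, log_pmf, **kwargs):
    """Sum of per-observation log-probabilities with a log link: mean = exp(f)."""
    return sum(
        log_pmf(y, math.exp(f), **kwargs) for y, f in zip(counts, latent_f)
    )


# Hypothetical overdispersed counts for one gene across six cells or spots.
counts = [0, 3, 25, 1, 0, 12]
# Hypothetical latent function values; a constant here, a GP sample in practice.
latent_f = [math.log(7.0)] * len(counts)

ll_nb = log_likelihood(counts, latent_f, nb_log_pmf, alpha=1.5)
ll_poisson = log_likelihood(counts, latent_f, poisson_log_pmf)
print("negative binomial:", ll_nb)
print("Poisson:", ll_poisson)
```

For these illustrative counts, whose variance far exceeds their mean, the negative binomial log-likelihood is higher than the Poisson one, which is the kind of mismatch an appropriate likelihood is meant to avoid. The sketch says nothing about the scaling problem itself, which arises when the latent values are inferred jointly under a Gaussian process prior.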

In the proposed project, the student will adapt recent approaches from spatial statistics and probabilistic machine learning to develop scalable methods for the analysis of spatial and single-cell omics data. Methods will be developed with appropriate likelihood models, leading to well-calibrated uncertainty estimates. The student will develop models that simultaneously capture the distribution and neighbourhood relationships of cell types, while also modelling spatial variation in cellular activity and gene expression. Models will be multivariate across both genes and spatial locations, greatly improving their interpretability and power over those in ref. [3]. Methods will be applied to a range of biological and medical applications of spatial and single-cell transcriptomics, with data from both public and in-house (10x Visium) technologies. Methods will be published as well-documented and user-friendly open-source tools to encourage widespread adoption by the community.

1. Candidates are expected to hold (or be about to obtain) a minimum upper second-class honours degree (or equivalent) in a discipline with substantial computational and/or mathematical content. Candidates with experience in machine learning and with an interest in biology or medicine are encouraged to apply.

2. For information on how to apply for this project, please visit the Faculty of Biology, Medicine and Health Doctoral Academy website. Informal enquiries may be made directly to the primary supervisor. On the online application form, select the PhD title.

3. For international students, we also offer a unique 4-year PhD programme that gives you the opportunity to undertake an accredited Teaching Certificate whilst carrying out an independent research project across a range of biological, medical and health sciences. For more information please visit


Funding Notes

Applications are invited from self-funded students. This project has a Band 1 fee. Details of our different fee bands can be found on our website.
Equality, diversity and inclusion is fundamental to the success of The University of Manchester, and is at the heart of all of our activities. The full Equality, diversity and inclusion statement can be found on the website.


References

[1] Boukouvalas, A., Hensman, J., & Rattray, M. (2018). BGP: identifying gene-specific branching dynamics from single-cell data with a branching Gaussian process. Genome Biology, 19(1), 1-15.
[2] Ahmed, S., Rattray, M., & Boukouvalas, A. (2019). GrandPrix: scaling up the Bayesian GPLVM for single-cell data. Bioinformatics, 35(1), 47-54.
[3] BinTayyash, N., Georgaka, S., John, S. T., Ahmed, S., Boukouvalas, A., Hensman, J., & Rattray, M. (2021). Non-parametric modelling of temporal and spatial counts data from RNA-seq experiments. bioRxiv, 2020-07.