Integrative Computational Biology and Machine Learning: Combining computational biology, computational chemistry, and machine learning techniques with biological big data to unravel the higher genomic code of life

This project is no longer listed and may not be available.

  • Full or part time
  • Supervisors
    Dr A Sahakyan
    Prof P McHugh
  • Application Deadline
    No more applications being accepted
  • Funding
    Self-Funded PhD Students Only

Project Description

In the Sahakyan Group, we strive to make computational biology maximally independent from empirical experimental data by basing our models and predictions on genomic sequences and core biological mechanisms. We achieve this by devising specific machine learning methodologies (supervised and unsupervised) that are physics- and structure-“aware”. We apply our ab initio modelling approaches to better understand gene regulation and mutations, and to spot driver DNA alterations in multigenic diseases (such as cancer, cardiomyopathies and autism).

Besides the direct benefits, our approach also lets us understand the part of biology that we cannot predict, i.e. the remainder: cell-specific factors not tightly linked to our genomic blueprint. This may help us better characterise the genome-invariant factors involved in cell differentiation.

We seek enthusiastic individuals to join us and pursue a DPhil degree. We welcome applicants with interests in any or all of the genome, transcriptome, and proteome layers of information processing in life. We are particularly keen to decipher the higher genomic code of differential DNA damage susceptibility and repair efficiency, cross-mapping our conclusions against the known mutation sites involved in cancer and other multigenic diseases. The work will proceed in close collaboration with the group of Prof. Peter McHugh.

The post will particularly suit computationally inclined individuals with enthusiasm and passion for life sciences and computers, coming from diverse backgrounds (computer science, chemistry, physics, engineering, biology). For inquiries and additional details, please contact [Email Address Removed].

Students will benefit from close supervision and a vibrant, multidisciplinary working environment. They will gain valuable knowledge and hands-on experience in machine learning, biological sequence analyses, advanced computational biology techniques, evolutionary data analyses and computer programming. They will be exposed to modern genomics technologies, combining in-house and public experimental datasets for advanced model development. Students’ work will be finalised in first-author publications, with further opportunities to present at local and international scientific conferences.

In addition to the specific training detailed above, students will have access to high-quality training in scientific and generic skills, as well as to a wide range of seminars and training opportunities through the many research institutes and centres based in Oxford.

All MRC WIMM graduate students are encouraged to participate in the successful mentoring scheme of the Radcliffe Department of Medicine, which is the host department of the MRC WIMM. This mentoring scheme provides an additional possible channel for personal and professional development outside the regular supervisory framework.

Funding Notes

Our main deadline for applications for funded places has now passed. Supervisors may still be able to consider applications from students who have alternative means of funding (for example, charitable funding, clinical fellows or applicants with funding from a foreign government or equivalent). Prospective applicants are strongly advised to contact their prospective supervisor in advance of making an application.

Please note that any applications received after the main funding deadline will not be assessed until all applications that were received by the deadline have been processed. This may affect supervisor availability.

References

Sahakyan et al., “Machine learning model for sequence-driven DNA G-quadruplex formation”, Sci. Rep., 7:14535, 2017.

Sahakyan et al., “G-quadruplex structures within the 3’ UTR of LINE-1 elements stimulate retrotransposition”, Nature Str. Mol. Biol., 24:243-247, 2017.

Sahakyan and Balasubramanian, “Single genome retrieval of context-dependent variability in mutation rates for human germline”, BMC Genomics, 18:81, 2017.

Kwok et al., “rG4-seq reveals widespread formation of G-quadruplex structures in the human transcriptome”, Nature Meth., 13:841-844, 2016.

Sahakyan and Balasubramanian, “Long genes and genes with multiple splice variants are enriched in pathways linked to cancer and other multigenic diseases”, BMC Genomics, 17:225, 2016.

Sahakyan et al., “Structure-based prediction of methyl chemical shifts in proteins”, J. Biomol. NMR, 50:331-346, 2011.

FindAPhD. Copyright 2005-2019
All rights reserved.