
Pattern Recognition for Protein Crystallisation Strategies

   Department of Mathematics

Prof J Wilson, Dr Yinhai Wang
No more applications being accepted
Funded PhD Project (European/UK Students Only)

About the Project

Protein crystallography is of fundamental importance to structural biology and drug design, but producing crystals is a major bottleneck in the pipeline. Crystallisation is a trial-and-error process, and scientists do not currently have the tools to uncover the hidden relationships between the known or measurable properties of a protein and the experimental conditions under which it will crystallise. In collaboration with AstraZeneca, this project aims to use machine learning and statistical pattern recognition to reveal such relationships, thereby improving future experiments and accelerating the drug discovery process.
This requires the results of many experiments, both successful and unsuccessful. However, crystallographers rarely record information on failed experiments and, once conditions producing crystals have been identified, alternative successful conditions are often not considered. The use of robotics to perform crystallisation trials not only increases throughput and reduces costs, but also allows the results to be recorded by imaging systems. The MARCO (MAchine Recognition of Crystallization Outcomes) project showed that automated image analysis can be used to classify experimental results, achieving a 94% correct classification rate [1]. Work on a custom image classifier using convolutional neural networks is already underway at AstraZeneca. In this project, the student will incorporate Active Learning [2] into the training process to improve performance, and will use additional information from UV images to achieve higher accuracy. As the main aim of the project, the student will then use the annotated crystallisation images to connect compounds or proteins with experimental conditions using novel graph-based networks [3, 4].
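As a rough illustration of how Active Learning might fit into classifier training, the sketch below uses uncertainty sampling, one common strategy and not necessarily the method of [2] or the one the project will adopt. The function names, the four-class setup, and the probability values are all hypothetical. The idea is that images whose predicted class distribution has the highest entropy are sent to a human annotator first, so that labelling effort is spent where the classifier is least sure.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_annotation(predictions, batch_size):
    """Return the indices of the `batch_size` most uncertain images.

    `predictions` is a list of per-image class probability lists, as a
    classifier's softmax output might provide (hypothetical input format).
    """
    ranked = sorted(
        range(len(predictions)),
        key=lambda i: entropy(predictions[i]),
        reverse=True,  # highest entropy (most uncertain) first
    )
    return ranked[:batch_size]


# Made-up classifier outputs for four images over four outcome classes.
predictions = [
    [0.97, 0.01, 0.01, 0.01],  # confident prediction
    [0.40, 0.35, 0.15, 0.10],  # fairly uncertain
    [0.25, 0.25, 0.25, 0.25],  # maximally uncertain
    [0.70, 0.20, 0.05, 0.05],  # moderately confident
]

chosen = select_for_annotation(predictions, batch_size=2)
print(chosen)  # indices of the images to label next
```

In a full training loop, the selected images would be labelled, added to the training set, and the classifier retrained before the next round of selection.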

Funding Notes

This is a 4-year iCASE studentship based in the Department of Mathematics at the University of York, fully funded by EPSRC and AstraZeneca. The studentship covers: (i) a tax-free annual stipend at the standard Research Council rate (£15,285 for 2020-2021), (ii) research costs, and (iii) tuition fees at the UK/EU rate. The successful applicant will spend between 3 and 12 months at the industrial partner’s research institute during the project, with travel and accommodation costs covered by AstraZeneca.


References

1. Bruno, Andrew E., et al. "Classification of crystallization outcomes using deep convolutional neural networks." PLOS ONE 13.6 (2018): e0198883.
2. Zhou, Shusen, Qingcai Chen, and Xiaolong Wang. "Active deep learning method for semi-supervised sentiment classification." Neurocomputing 120 (2013): 536-546.
3. Kipf, Thomas N., and Max Welling. "Semi-supervised classification with graph convolutional networks." arXiv:1609.02907, ICLR 2017.
4. Yang, Zhilin, William W. Cohen, and Ruslan Salakhutdinov. "Revisiting semi-supervised learning with graph embeddings." Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA, 2016.
