About the Project
Despite the wealth of knowledge generated by high-throughput sequencing and proteomics experiments, the rules underpinning codon usage are mostly unknown.
From an industrial biotechnology perspective, this knowledge gap limits our ability to express heterologous proteins efficiently and to optimise properties such as solubility for end-user applications [Pellizza et al., 2018].
AIMS AND OBJECTIVES. In collaboration with Fujifilm Diosynth Biotechnologies UK (FDBK), we propose to learn codon usage rules by rephrasing protein synthesis as a language modelling problem. We will then use deep learning to capture complex epistatic and evolutionary patterns associated with highly expressed genes and with optimal solubility. Ultimately, these models will be validated in silico and in vivo.
WORKPLAN. The project is structured in 3 work packages.
- WP1 – the student will collect transcriptomic data for E. coli from public repositories and generate a dataset of curated transcripts and associated protein sequences.
- WP2 – the student will develop a neural language model that converts amino acid sequences into DNA sequences, taking into account evolutionary information and protein function.
- WP3 – experimental validation of the models’ effectiveness, by synthesising, building and expressing codon-optimised genes in E. coli and comparing them downstream against wild-type variants and genes optimised with existing methods.
TRAINING PROGRAM. The student will receive training in machine learning, statistical learning and deep learning, and will build a competitive profile in biological sequence modelling and design. The student will also be introduced to the emerging field of synthetic biology and will learn modern DNA cloning and assembly techniques and the use of protein expression systems at scale. We also place a strong emphasis on reproducible research; the student will receive training in advanced research software engineering and in reproducible workflows for data analysis.
Fujifilm Diosynth Biotechnologies UK supervisor - Christopher Lennon
The School of Biological Sciences is committed to Equality & Diversity: https://www.ed.ac.uk/biology/equality-and-diversity
How to Apply:
The “Institution Website” button will take you to our Online Application checklist. Complete each step and download the checklist which will provide a list of funding options and guide you through the application process.
Cannarozzi, Gina, et al. "A role for codon order in translation dynamics." Cell 141.2 (2010): 355-367.
Pellizza, Leonardo, et al. "Codon usage clusters correlation: towards protein solubility prediction in heterologous expression systems in E. coli." Scientific Reports 8.1 (2018): 1-12.