
Using machine learning to identify aggregation resistant biopharmaceuticals


Project Description

Background
The UK is a major stakeholder in biopharmaceutical development and production, a sector that had sales of $228 billion in 2016. Aggregation is a major hurdle to the manufacture of biopharmaceuticals, resulting in the failure of promising candidate biologics even at very late stages in the development pipeline. The ability to identify sequences likely to aggregate during production, transport or storage is of crucial importance to the biologics industry. This is currently beyond our capability, both for monoclonal antibodies (mAbs) and for the arsenal of advanced therapies (antibody-drug conjugates, etc.) that have the potential to revolutionise medicine in the future.
Together with Astra Zeneca, we have developed an in vivo selection method in E. coli that quantifies the aggregation propensity of biotherapeutics, including mAbs, by linking aggregation to antibiotic resistance. We have shown that the assay can be used to screen for aggregation-resistant proteins of therapeutic importance with different protein scaffolds (reference 1; a previous BBSRC CASE student with Avacta/AZ) and, most recently, have combined it with directed evolution to generate new proteins with enhanced bioprocessing capability (under review).
Excitingly, in addition to isolating inherently developable therapeutics, this combined approach allows the isolation of thousands of protein sequences with known aggregation properties, opening the door to using machine learning (ML) to identify the key drivers of aggregation (whether during ageing and neurodegeneration or during advanced therapy manufacture) from such highly complex datasets.

Objectives. Together with our industrial collaborators at Astra Zeneca, we will:
1. Generate a large dataset of protein sequences with improved (positive selection) and worsened (negative selection) aggregation propensity. This will be achieved by performing directed evolution on five single-chain Fv (scFv) sequences with low sequence identity but poor biophysical behaviour, identified from the literature and by our industrial partner.
2. Use these data as training sets for the development of ML algorithms to identify aggregation-resistant sequences.
3. Validate the machine learning outputs by quantifying the aggregation properties of a test set of sequences ranked by the optimized ML algorithm.

Novelty and timeliness
The ability to identify aggregation-resistant protein therapeutics early in development, without the need for large-scale purification, is both novel and timely, especially as more complex protein therapeutics are currently in development. Additionally, our novel evolution platform will be used as a high-throughput screen, enabling the generation of large datasets which will be used in a ‘big data’ approach to understand the complex multi-factorial mechanisms underlying selection. This will ultimately lead to novel predictors of aggregation and an understanding of the fundamental mechanisms.

Experimental Approach
Molecular biology (error-prone PCR and Golden Gate cloning) will be used to generate libraries of mutated scFv. High-throughput sequencing and high-throughput aggregation assays will be used to construct a large dataset of sequences with known aggregation behaviour.
These data will be used to carry out ML, initially within Python using scikit-learn, with the aim of generating new predictive methods for protein aggregation.
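As a rough illustration of what such a scikit-learn workflow might look like, the sketch below trains a simple classifier on labelled sequences. Everything specific here is an assumption made for illustration only: the toy sequences and labels are invented, the function name train_classifier is hypothetical, and the choice of character n-gram counts with logistic regression is just one plausible starting point, not the method the project will necessarily use.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented toy data, NOT real assay results.
# Label 1 = treated as aggregation-resistant, 0 = treated as aggregation-prone.
# Real inputs would come from sequencing and the in vivo aggregation assay.
TOY_SEQUENCES = [
    "DKSEDGKTEQ", "EKDSGTKEDQ", "KDEQSGDKTE", "SEKDGQEKDT",
    "VILFVIWLFV", "LFIVWVLIFA", "IVFLLVIWFV", "FLVIAVWILF",
]
TOY_LABELS = [1, 1, 1, 1, 0, 0, 0, 0]


def train_classifier(sequences, labels):
    """Fit a pipeline: residue and residue-pair counts -> logistic regression."""
    model = make_pipeline(
        # Count single residues and adjacent residue pairs in each sequence.
        CountVectorizer(analyzer="char", ngram_range=(1, 2), lowercase=False),
        LogisticRegression(max_iter=1000),
    )
    model.fit(sequences, labels)
    return model


if __name__ == "__main__":
    clf = train_classifier(TOY_SEQUENCES, TOY_LABELS)
    # Column 1 of predict_proba is the predicted probability of label 1.
    scores = clf.predict_proba(["DKSEDGKTEQ", "VILFVIWLFV"])[:, 1]
    for seq, score in zip(["DKSEDGKTEQ", "VILFVIWLFV"], scores):
        print(seq, round(float(score), 3))
```

In practice, a held-out test set and richer sequence features would be needed before any such model could be trusted; this sketch only shows the shape of the fit-and-score loop.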
The predictive power of the optimised classifier will be verified by expressing a range of optimised sequences in the full IgG scaffold (the student will do these experiments at AZ) and assessing their properties using methods employed in industry (e.g. accelerated stability assays, SEC and AC-SINS).
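One simple way to compare a classifier's ranking with experimental measurements is a rank correlation. The sketch below computes Spearman's rank correlation in plain Python. The use of Spearman's coefficient is an assumption for illustration (the project does not specify a validation metric), and the predicted scores and measured values are invented placeholders, with the measured values standing in for a hypothetical readout such as monomer fraction by SEC.

```python
def ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


if __name__ == "__main__":
    # Invented placeholder values, NOT experimental data.
    predicted_scores = [0.91, 0.75, 0.40, 0.22, 0.10]
    measured_values = [0.95, 0.80, 0.85, 0.30, 0.20]
    print(spearman(predicted_scores, measured_values))
```

A coefficient near 1 would indicate that the predicted ordering agrees with the measured ordering; a value near 0 would indicate no monotonic relationship.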

Work during placement
It is envisaged that several short visits to Astra Zeneca’s Cambridge site in years 1 and 2 will precede a longer visit in year 3. The aims of the visits in years 1 and 2 will be to construct, express, purify and characterize the “wild-type” IgG sequences that will be subjected to directed evolution at Leeds. In year 3, similar work will be undertaken on a larger number of constructs to quantify the prediction accuracy of the developed algorithm. Proteins will be characterized using the panoply of methods used in industry, e.g. SEC, AC-SINS, DSC, IEF and MS. The project will form part of a true collaboration with AZ, and further visits to AZ will be organized as the science dictates.

Funding Notes

BBSRC White Rose Mechanistic Biology DTP CASE 4 year studentship.
The studentship covers UK/EU fees and a stipend (c. £15,009) for 4 years, starting in Oct 2020. Applicants should have, or be expecting, at least a 2.1 Hons. degree in a relevant subject. EU candidates require 3 years of UK residency in order to receive the full studentship. English language requirements may apply.
Apply online. The course is PhD in Biological Sciences, and we require a CV and transcripts.

References

1. An in vivo platform for identifying inhibitors of protein aggregation. Saunders, J., Young, L., Mahood, R., Jackson, M., Revill, C., Foster, R., Smith, A., Ashcroft, A., Brockwell, D. and Radford, S. (2016) Nat Chem Biol. 12:94-101.

FindAPhD. Copyright 2005-2019
All rights reserved.