Identifying Unknown Chemical Structures through the Application of Machine Learning

   Department of Chemistry

This project is no longer listed and may not be available.

Supervisors: Prof Caroline Dessent, Dr Brett Sallach
No more applications being accepted
Competition Funded PhD Project (UK Students Only)

About the Project


Machine learning is increasingly being applied across analytical chemistry to improve our ability to identify compounds. Globally, growing efforts are being made to identify pollutants and their breakdown products in rivers and surface waters, in line with the UN sustainability goals. Identifying unknown chemicals in complex samples such as river water, however, currently takes considerable time. In this project, we will acquire novel analytical chemistry data and use it to train a newly developed machine learning tool, with the aim of providing a step change in our ability to identify pharmaceuticals and their breakdown products in surface water samples. The state-of-the-art model we will apply is a powerful convolutional neural network recently developed by our collaborator, Dr Feng Gao (Yale). We will use the new methodology to analyse a broad range of surface water samples (from an available databank of samples acquired from international locations) to reassess the composition of pharmaceuticals in these samples. This will have immediate impact on our understanding of the global distribution of pollutants and their breakdown products.


The key objective of this project is to apply machine learning to develop new, more efficient analytical techniques for identifying unknown chemicals in complex samples. Many problems in environmental science (e.g. identification of pollutants and their breakdown products) and chemical manufacture (e.g. identification of side-products in pharmaceutical synthesis) arise from the difficulty of identifying unknowns, and the methodology to be developed in this project has the potential for significant impact in such applied fields.

 Experimental Approach

The experimental work will employ novel UV laser photodissociation mass spectrometry (developed in the PI’s group) to provide enhanced secondary structure information. This information is needed to build a rich data set of UV photofragments that are mapped to known molecular structures. We will then apply our machine learning tools so that subsequent unknown molecules can be identified by subjecting them to UV photofragmentation. The machine learning model we will use is a powerful convolutional neural network that the co-supervisor and our collaborator recently developed (Journal of Hazardous Materials, 2022). The co-supervisor is an expert in identifying emerging pollutants in the environment, and will lead on applying the new methods to the identification of unknowns in complex mixtures.
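The workflow described above (a reference set of photofragment spectra mapped to known structures, then identification of an unknown from its own photofragments) can be illustrated with a much simpler stand-in. The sketch below is not the convolutional neural network used in the project; it swaps in a plain cosine-similarity library search purely to show the shape of the data. The compound names, peak values, bin width and function names are all hypothetical.

```python
import math

def bin_spectrum(peaks, bin_width=1.0, max_mz=500.0):
    """Turn (m/z, intensity) peaks into a fixed-length, unit-norm vector."""
    n_bins = int(max_mz / bin_width)
    vec = [0.0] * n_bins
    for mz, intensity in peaks:
        idx = int(mz / bin_width)
        if 0 <= idx < n_bins:
            vec[idx] += intensity
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec

def cosine(a, b):
    """Cosine similarity of two vectors that are already unit-normalised."""
    return sum(x * y for x, y in zip(a, b))

def identify(unknown_peaks, library):
    """Return the best-matching library entry and its similarity score."""
    query = bin_spectrum(unknown_peaks)
    scores = {name: cosine(query, bin_spectrum(peaks))
              for name, peaks in library.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Hypothetical reference library: photofragment peaks for known structures.
library = {
    "compound_A": [(91.1, 100.0), (119.0, 40.0), (165.2, 15.0)],
    "compound_B": [(77.0, 60.0), (105.1, 100.0), (182.3, 25.0)],
}

# Hypothetical photofragment spectrum of an unidentified molecule.
unknown = [(91.0, 95.0), (119.2, 45.0), (165.4, 10.0)]

name, score = identify(unknown, library)
print(name, round(score, 3))
```

A learned model such as the one described in the project would replace the fixed similarity measure with features trained on the reference data, but the inputs (binned fragment spectra) and the output (a ranked structural assignment) would have a similar form.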


This project will employ a novel UV laser-interfaced mass spectrometer to obtain a unique data set (following molecular photodissociation) that can be used to train the machine learning algorithms, yielding a new methodology for identifying unknowns in complex mixtures. Both the instrument and Gao's machine learning approach are novel, giving the PhD student the opportunity to work on an internationally distinctive project with excellent prospects of producing high-impact results.


The project will provide excellent training in mass spectrometry, a highly valued analytical technique for chemical, biochemical and biomedical science. Training in the use of lasers and electronic spectroscopy will also be provided (supported by an experimental officer and a post-doctoral researcher). The project will also provide training in applying machine learning to chemical problems, and depending on the interests of the student, there will be the opportunity to develop further machine learning algorithms. Students in our research groups are encouraged to attend conferences and take part in international collaborations during their PhDs, and will publish their research in high-profile journals. All Chemistry research students have access to our innovative Doctoral Training in Chemistry (iDTC): cohort-based training to support the development of scientific, transferable and employability skills:

The Department of Chemistry holds an Athena SWAN Gold Award and is committed to supporting equality and diversity for all staff and students. The Department strives to provide a working environment which allows all staff and students to contribute fully, to flourish, and to excel:

For more information about the project, click on the supervisor's name above to email the supervisor. For more information about the application process or funding, please email the institution.

This PhD will formally start on 1 October 2022. Induction activities may start a few days earlier.

To apply for this project, submit an online PhD in Chemistry application:

You should hold or expect to achieve the equivalent of at least a UK upper second class degree in Chemistry or a related subject.  

Funding Notes

The studentship is fully funded for 3 years by the Department of Chemistry and covers: (i) a tax-free annual stipend at the standard Research Council rate (£15,609 for 2021-22), (ii) tuition fees at the Home rate, (iii) funding for consumables. See guidance for further details:
Studentships are available to any student who is eligible to pay tuition fees at the home rate:
Not all projects will be funded; candidates will be appointed via a competitive process.


Candidate selection process:
• You should hold or expect to receive at least an upper second class degree in chemistry or a chemical sciences related subject
• Applicants should submit a PhD application to the University of York by 28 February 2022
• Supervisors may contact candidates either by email, telephone or web-chat
• Supervisors can nominate up to 2 candidates to be interviewed for the project
• The interview panel will shortlist candidates for interview from all those nominated
• Shortlisted candidates will be invited to a panel interview on 30th or 31st March or 1st April
• The awarding committee will award studentships following the panel interviews
• Candidates will be notified of the outcome of the panel’s decision by email
