
  (A*STAR Split-site) Generative Models for Deep Reinforcement Learning with applications to material discovery

   Department of Computer Science

This project is no longer listed and may not be available.

  Dr Mingfei Sun, Dr Hangwei Qian
  No more applications being accepted
  Competition Funded PhD Project (Students Worldwide)

About the Project

The project investigates Generative Models, including Large Language Models (LLMs) and diffusion models, together with Deep Reinforcement Learning (RL), to support multi-modal, multi-task and multi-embodiment decision-making and human-in-the-loop learning. Deep RL optimizes policies for sequential decision-making problems formulated as Markov Decision Processes (MDPs). The majority of existing policy gradient methods in this setting focus on a single-modal policy (i.e., the observation and action spaces each have a single input/output modality), a single-task configuration (i.e., the objective to optimize is defined with respect to a single fixed scalar reward) and a single-embodiment environment (i.e., the learned policy is specific to one embodiment, characterized by the environment's transition dynamics). By contrast, many real-world decision-making problems are multi-modal, multi-task and multi-embodiment, and unavoidably involve human users, i.e., they are human-in-the-loop. These discrepancies between existing RL methods and the real-world problems they are meant to solve call for a rethinking of Deep RL and a revamp of the MDP idea.
Recent advances in Generative Models (GMs), including LLMs such as ChatGPT/GPT-4 and diffusion models such as DALL-E 2, allow AI to generate images and text, write code, generate synthetic data and interact naturally with human users. Importantly, these GMs handle multi-modal inputs and outputs in a unified manner, solve various tasks as generalist agents, and can be easily adapted to new contexts through fine-tuning or prompt engineering. Whilst the highest-profile applications of GMs have been in text and images, they can also be applied to improve on existing policy gradient methods for Deep RL, offering a unique opportunity to develop methods that are inherently multi-modal, multi-task and multi-embodiment, and readily usable for many real-world problems.
This project therefore aims to reduce these discrepancies in Deep RL by investigating the effective combination of Generative Models and Reinforcement Learning for multi-modal, multi-task and multi-embodiment decision-making problems. Moreover, the project considers the human-in-the-loop learning setting and leverages Generative Models to address the value misalignment issue that arises when human preferences are considered. The ideas developed in this project will be applied to solve complex problems in material design. The project considers the following topics: reward models, reinforcement learning from human feedback, human intention alignment, generative models for planning, and language models for task planning.
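
As a rough illustration of the single-modal, single-task, single-embodiment setting described above, the sketch below runs a basic policy gradient method (REINFORCE, chosen here only as a familiar example) on a toy MDP. It is not taken from the project: the four-state corridor environment, the function names, and the hyperparameters are all assumptions made for illustration. The policy sees one kind of observation (a state index), optimizes one fixed scalar reward, and is tied to one set of transition dynamics.

```python
import math
import random

N_STATES = 4        # states 0..3 in a toy corridor; state 3 is the goal (assumed)
ACTIONS = (-1, +1)  # action 0 moves left, action 1 moves right
GAMMA = 0.9         # discount factor (assumed)
MAX_STEPS = 20      # episode length cap (assumed)


def step(state, action):
    """Fixed transition dynamics (one 'embodiment') and one scalar reward."""
    next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def run_episode(theta, rng):
    """Sample one trajectory of (state, action, reward) from the current policy."""
    state, trajectory = 0, []
    for _ in range(MAX_STEPS):
        probs = softmax(theta[state])
        action = 0 if rng.random() < probs[0] else 1
        next_state, reward, done = step(state, action)
        trajectory.append((state, action, reward))
        state = next_state
        if done:
            break
    return trajectory


def train(episodes=3000, lr=0.05, seed=0):
    """REINFORCE: ascend return-weighted gradients of log pi(action | state)."""
    rng = random.Random(seed)
    theta = [[0.0, 0.0] for _ in range(N_STATES)]  # tabular policy logits
    for _ in range(episodes):
        trajectory = run_episode(theta, rng)
        grad = [[0.0, 0.0] for _ in range(N_STATES)]
        ret = 0.0
        for state, action, reward in reversed(trajectory):
            ret = reward + GAMMA * ret  # discounted return-to-go
            probs = softmax(theta[state])
            for a in range(len(ACTIONS)):
                indicator = 1.0 if a == action else 0.0
                grad[state][a] += ret * (indicator - probs[a])
        # Apply the accumulated gradient once per episode.
        for s in range(N_STATES):
            for a in range(len(ACTIONS)):
                theta[s][a] += lr * grad[s][a]
    return theta


if __name__ == "__main__":
    theta = train()
    for s in range(N_STATES - 1):
        print(s, round(softmax(theta[s])[1], 3))  # P(move right | state s)
```

Running the script prints the learned probability of moving right in each non-goal state. A multi-task or multi-embodiment variant would need the policy to be conditioned on some task or embodiment descriptor, which this sketch deliberately leaves out.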


Applicants should have, or expect to achieve, at least a 2.1 honours degree or a master’s in a relevant science or engineering related discipline.


Scholarships are available for suitable candidates to commence this 4-year programme in October 2024, including:

·        Tuition fees

·        Annual stipend at the minimum UKRI rate (2023/24 rate £18,622) while the student is in Manchester (for a maximum of two years), and a stipend equivalent to S$2,700/month while at A*STAR (two years).

·        Flight allowance for students travelling to Singapore (£1,000) paid by University of Manchester.

·        One return airfare to/from Singapore (S$1,500) paid by A*STAR

·        Medical insurance and settling-in allowance (S$1,000) paid by A*STAR

·        An annual Research Training Support Grant (RTSG) towards project running costs/consumables (up to £5,000 pa) provided to all students when in Manchester.

·        Supervisor travel allowance up to £6,000 for two airfare/accommodation visits to Singapore. 

Before you apply

We strongly recommend that you contact the supervisor(s) for this project before you apply.

How to apply

To be considered for this project you’ll need to complete a formal application through our online application portal.

 When applying, you’ll need to specify the full name of this project, the name of your supervisor, how you’re planning on funding your research, details of your previous study, and names and contact details of two referees.

Your application will not be processed without all of the required documents submitted at the time of application, and we cannot accept responsibility for late or missed deadlines. Incomplete applications will not be considered.  

If you have any questions about making an application, please contact our admissions team by emailing [Email Address Removed].

Equality, diversity and inclusion 

Equality, diversity and inclusion are fundamental to the success of The University of Manchester, and are at the heart of all of our activities. We know that diversity strengthens our research community, leading to enhanced research creativity, productivity and quality, and societal and economic impact.

We actively encourage applicants from diverse career paths and backgrounds and from all sections of the community, regardless of age, disability, ethnicity, gender, gender expression, sexual orientation and transgender status.

We also support applications from those returning from a career break or other roles. We consider offering flexible study arrangements (including part-time: 50%, 60% or 80%, depending on the project/funder).

