
  Collective intelligence in multi-agent reinforcement learning for deliberative processes


   School of Computer Science


  Dr Leonardo Stella
  No more applications being accepted
  Funded PhD Project (UK Students Only)

About the Project

Reinforcement Learning (RL) has achieved exceptional success in recent years, especially for sequential decision-making and tasks that require continuous control. Examples include the game of Go, video games – especially strategy games such as StarCraft – and also robotics.

Recently, a prominent area of research has been the extension of RL to multi-agent reinforcement learning (MARL). In this project, we focus on decentralised MARL, i.e., settings where agents do not intrinsically know the state of the other agents but can interact with one another. The main advantages of this approach include faster learning (e.g., through parallel computation), robustness to individual failures, transfer learning from more experienced agents, and a growing number of applications in both cooperative and adversarial settings.
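As a rough illustration of the decentralised setting described above, the following minimal sketch trains two independent Q-learners on a stateless two-action coordination game. It is not the project's method: the game, the reward (1 when both actions match, 0 otherwise), the hyperparameters and all function names are assumptions chosen for brevity. Each agent keeps a private Q-table and sees only its own action and reward, never the other agent's table.

```python
import random


def greedy(q_row):
    """Return the action with the larger Q-value (ties go to action 0)."""
    return 0 if q_row[0] >= q_row[1] else 1


def train_independent_learners(steps=2000, alpha=0.1, epsilon=0.1, seed=0):
    """Train two independent epsilon-greedy Q-learners on a coordination game.

    Hypothetical setup: both agents receive reward 1 if they pick the same
    action and 0 otherwise. Returns the Q-tables and the fraction of the
    last 500 steps in which the agents coordinated.
    """
    rng = random.Random(seed)
    # q[i][a] is agent i's private estimate of the reward for action a.
    q = [[0.0, 0.0], [0.0, 0.0]]
    recent_matches = []
    for step in range(steps):
        actions = []
        for i in range(2):
            if rng.random() < epsilon:
                actions.append(rng.randrange(2))  # explore
            else:
                actions.append(greedy(q[i]))      # exploit own estimates
        reward = 1.0 if actions[0] == actions[1] else 0.0
        for i in range(2):
            a = actions[i]
            # Each agent updates only the entry for the action it took.
            q[i][a] += alpha * (reward - q[i][a])
        if step >= steps - 500:
            recent_matches.append(reward)
    return q, sum(recent_matches) / len(recent_matches)


q, rate = train_independent_learners()
print("greedy actions:", [greedy(row) for row in q])
print("coordination rate over the last 500 steps:", rate)
```

In this toy game the agents settle on a common action without any explicit communication. Harder tasks generally do not allow this, which is where coordination mechanisms become relevant.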

However, the extension of RL to multi-agent settings brings several challenges. One of the main challenges is the coordination of these agents. Another fundamental issue is scalability and, therefore, the communication between large numbers of agents. Recent literature has tackled the scalability issue by formulating optimality guarantees in the limit of an infinite number of agents through a mean-field approach [J01]. The advantage of this approach is that it quantifies upper bounds on the performance of a large number of agents performing collective decision-making [J01].

However, when these results are translated back to the original problem with a finite population, the interactions among the agents play a crucial role, and such approaches do not scale down well. Working directly with the finite multi-agent system is therefore necessary to ensure optimality.

The aim of this project is to study multi-agent reinforcement learning by embedding elements of game theory to tackle collective decision-making. Specifically, three objectives are considered:

1.    To design and develop scalable and robust MARL models for collective decision-making in complex tasks. This requires the investigation of diversified approaches to learning, including formalism from bio-inspired game theoretic approaches [J02, J03].

2.    To investigate a range of communication policies to foster coordination between the agents. This involves the study of the impact of heterogeneity in the agents’ interactions for scalability and robustness. These interactions can be captured by regular networks or scale-free networks and can drastically change the behaviour of the system and its properties [J02, J04].

3.    To analyse the developed framework through a mean-field game approach [J01]. This leads to a much more tractable analysis for robustness in cooperative and competitive contexts [J01].
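To make the contrast in objective 2 between regular and scale-free interaction networks concrete, here is a minimal sketch that builds one of each and compares their degree distributions. The choice of a ring lattice as the regular network, Barabási-Albert preferential attachment as the scale-free model, and the sizes used are illustrative assumptions, not taken from the project description. All function names are hypothetical.

```python
import random
from statistics import pvariance


def ring_lattice(n, k):
    """Regular network: each node links to its k nearest neighbours per side."""
    edges = set()
    for i in range(n):
        for d in range(1, k + 1):
            j = (i + d) % n
            edges.add((min(i, j), max(i, j)))
    return edges


def preferential_attachment(n, m, seed=0):
    """Scale-free-style network grown by preferential attachment.

    Starts from a complete graph on m + 1 nodes; every new node attaches to
    m distinct existing nodes chosen with probability proportional to degree.
    """
    rng = random.Random(seed)
    edges = {(i, j) for i in range(m + 1) for j in range(i + 1, m + 1)}
    # Each node appears in `stubs` once per incident edge, so a uniform draw
    # from `stubs` is a degree-proportional draw over nodes.
    stubs = [v for e in sorted(edges) for v in e]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in sorted(targets):
            edges.add((t, new))
            stubs.extend((t, new))
    return edges


def degrees(edges, n):
    """Return the degree of every node 0..n-1."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


n = 200
ring_deg = degrees(ring_lattice(n, k=2), n)
ba_edges = preferential_attachment(n, m=2)
ba_deg = degrees(ba_edges, n)
print("ring lattice degree variance:", pvariance(ring_deg))
print("preferential attachment degree variance:", pvariance(ba_deg))
print("preferential attachment max degree:", max(ba_deg))
```

Every node in the ring lattice has the same degree, whereas preferential attachment yields a few highly connected hubs alongside many low-degree nodes. This is one simple way to quantify heterogeneity in the agents' interactions.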

The candidate will be based at the School of Computer Science at the University of Birmingham, a school that is internationally leading in its research. The PhD candidate will work in a stimulating environment and have many opportunities to interact with leading scientists within the school.

Eligibility:

-      Strong mathematical background.

-      Strong background in machine learning and multi-agent systems.

-      Good understanding of game theoretic concepts.

-      Strong programming skills in Python (and MATLAB), experience using ML toolboxes (PyTorch/Google Colab/PettingZoo).

-      Excellent communication skills.

Eligibility: First or Upper Second Class Honours undergraduate degree and/or postgraduate degree with Distinction (or an international equivalent). We also consider applicants from diverse backgrounds that have provided them with equally rich relevant experience and knowledge. Full-time and part-time study modes are available.

We want our PhD student cohorts to reflect our diverse society. UoB is therefore committed to widening the diversity of our PhD student cohorts. UoB studentships are open to all and we particularly welcome applications from under-represented groups, including, but not limited to BAME, disabled and neuro-diverse candidates. We also welcome applications for part-time study.

If your first language is not English and you have not studied in an English-speaking country, you will have to provide an English language qualification.


Funding Notes

The position offered is for three and a half years of full-time study. The award comprises a stipend of £17,668 pa (subject to review) and tuition fees of £4,620 pa. Awards are usually incremented on 1 October each following year.

References

[J01] L. Stella, D. Bauso and P. Colaneri, “Mean-Field Game via Switched Systems for Collective Decision-Making”, IEEE Transactions on Automatic Control, vol. 67, no. 8, pp. 3863-3878, 2022.
[J02] L. Stella and D. Bauso, “Bio-inspired Evolutionary Dynamics on Complex Networks under Uncertain Cross-inhibitory Signals”, Automatica, vol. 100, pp. 61-66, 2019.
[J03] L. Stella and D. Bauso, “The Impact of Irrational Behaviours in the Optional Prisoner’s Dilemma with Game-Environment Feedback”, International Journal of Robust and Nonlinear Control (IJRNC), 2021.
[J04] L. Stella, W. Baar and D. Bauso, “Lower Network Degrees Promote Cooperation in the Prisoner’s Dilemma with Environmental Feedback”, IEEE Control Systems Letters (L-CSS), vol. 6, pp. 2725-2730, 2022.
