The interaction of performance and learning in action selection

  • Full or part time
  • Application Deadline
    Applications accepted all year round
  • Self-Funded PhD Students Only

Project Description

Reinforcement learning models of behaviour separate the learning and performance of actions. In these models, appropriate actions are learnt through prediction-error feedback from their consequences. Actions are chosen according to their learnt values, modulated by the current balance between exploiting existing knowledge and exploring new options. But by controlling which actions are chosen, this exploration-exploitation trade-off must also alter the course of learning. This project will explore how this interaction between performance and learning works when the explore-exploit trade-off is a function of the rate of learning.

We have good reason to believe these are coupled in the brain. A longstanding theory holds that phasic dopamine signals a prediction error. New evidence and models suggest that tonic dopamine controls the exploration-exploitation trade-off. Because tonic dopamine is, to a first approximation, just the time integral of phasic dopamine, the two are coupled.
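As a rough illustration of this coupling, the sketch below runs a simple multi-armed bandit in which the prediction error (standing in for phasic dopamine) is integrated into a slow signal (standing in for tonic dopamine), and that slow signal sets the softmax inverse temperature used for action selection. This is a minimal sketch under stated assumptions, not the project's model: the function names, the leaky integrator (used in place of a pure time integral to keep the signal bounded), the linear mapping from the tonic signal to the inverse temperature, and all parameter values are hypothetical choices made for illustration.

```python
import math
import random


def softmax_choice(values, beta, rng):
    """Pick an action index with probability proportional to exp(beta * value)."""
    m = max(values)  # subtract the maximum for numerical stability
    weights = [math.exp(beta * (v - m)) for v in values]
    return rng.choices(range(len(values)), weights=weights)[0]


def run_bandit(reward_probs, n_trials=500, alpha=0.1, tau=0.05,
               beta_base=1.0, beta_gain=5.0, seed=0):
    """Bandit task where exploration depends on integrated prediction error.

    Assumptions (illustrative only):
      - 'tonic' is a leaky integral of the prediction error 'delta'
      - the inverse temperature is beta_base + beta_gain * tonic, clipped at 0
    """
    rng = random.Random(seed)
    values = [0.0] * len(reward_probs)  # learnt action values
    tonic = 0.0                         # slow, integrated signal
    history = []
    for _ in range(n_trials):
        # Performance: the tonic signal sets the explore-exploit balance.
        beta = max(0.0, beta_base + beta_gain * tonic)
        action = softmax_choice(values, beta, rng)
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        # Learning: prediction error updates the chosen action's value.
        delta = reward - values[action]
        values[action] += alpha * delta
        # Coupling: the same prediction error feeds the tonic signal.
        tonic += tau * (delta - tonic)
        history.append((action, reward, delta, beta))
    return values, tonic, history
```

Calling, for example, `run_bandit([0.2, 0.8])` returns the learnt values, the final tonic signal, and a per-trial record of the chosen action, reward, prediction error, and inverse temperature. Because the inverse temperature depends on the history of prediction errors, the choices made early in learning shape which actions are sampled later, which is the kind of performance-learning interaction the project sets out to study.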

We will use both algorithmic and neural models to study this interaction and the role of dopamine in it. One goal will be to determine whether the classic habit vs goal-directed distinction in instrumental behaviour is actually a performance effect rather than a distinction between learning systems. Another goal will be to seek ideas for forms of directed exploration that could advance the cutting edge of machine learning.

Funding Notes

This project has a Band 1 fee. Details of our different fee bands can be found on our website. For information on how to apply for this project, please visit the Faculty of Biology, Medicine and Health Doctoral Academy website. Informal enquiries may be made directly to the primary supervisor.


References

Humphries, M. D., Khamassi, M. & Gurney, K. (2012) Dopaminergic control of the exploration-exploitation trade-off via the basal ganglia. Frontiers in Neuroscience, 6, 9.

Khamassi, M. & Humphries, M. D. (2012) Integrating cortico-limbic-basal ganglia architectures for learning model-based and model-free navigation strategies. Frontiers in Behavioral Neuroscience, 6, 79.

Wunderlich, K., Smittenaar, P. & Dolan, R. J. (2012) Dopamine Enhances Model-Based over Model-Free Choice Behavior. Neuron, 75, 418-424.

FindAPhD. Copyright 2005-2019
All rights reserved.