The interaction of performance and learning in action selection at The University of Manchester

This project is no longer listed on FindAPhD.com and may not be available.

Dr M Humphries
Applications accepted all year round
Self-Funded PhD Students Only

About the Project

Reinforcement learning models of behaviour separate the learning of actions from their performance. In these models, appropriate actions are learnt through prediction-error feedback from their consequences. Actions are chosen according to their learnt values, modulated by the current balance between exploiting existing knowledge and exploring new options. But by controlling which actions are chosen, this exploration-exploitation trade-off must alter the course of learning. This project will explore how this interaction between performance and learning works when the explore-exploit trade-off is a function of the rate of learning.
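As a rough illustration of the separation described above, the following minimal sketch pairs a delta-rule value update (learning) with softmax action selection (performance) on a simple bandit task. The task, the function names and the parameters `alpha` (learning rate) and `beta` (inverse temperature, setting the explore-exploit balance) are assumptions chosen for illustration and are not taken from the project itself.

```python
import math
import random


def softmax_probs(values, beta):
    """Choice probabilities from learnt values; beta is the inverse temperature."""
    top = max(values)  # subtract the max for numerical stability
    weights = [math.exp(beta * (v - top)) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def run_bandit(reward_probs, alpha=0.1, beta=3.0, n_trials=500, seed=0):
    """Learn action values by prediction error; choose actions by softmax.

    reward_probs is a hypothetical task: the chance that each action pays 1.
    """
    rng = random.Random(seed)
    values = [0.0] * len(reward_probs)
    counts = [0] * len(reward_probs)
    for _ in range(n_trials):
        # Performance: low beta explores, high beta exploits learnt values.
        probs = softmax_probs(values, beta)
        action = rng.choices(range(len(values)), weights=probs)[0]
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        # Learning: only the chosen action is updated, so the choice rule
        # shapes which values are learnt at all.
        delta = reward - values[action]  # prediction error
        values[action] += alpha * delta
        counts[action] += 1
    return values, counts
```

Calling `run_bandit([0.2, 0.8])` with different values of `beta` gives a feel for how the choice rule changes which actions are sampled, and therefore which values are learnt.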

We have good reason to believe these are coupled in the brain. A longstanding theory holds that phasic dopamine signals a prediction error. New evidence and models suggest that tonic dopamine controls the exploration-exploitation trade-off. Because tonic dopamine is, to a first approximation, just the time integral of phasic dopamine, the two are coupled.
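One way to picture this coupling is the toy sketch below, in which a "tonic" level is a running integral of the "phasic" prediction errors and in turn sets the softmax inverse temperature. Everything specific here is an assumption made for illustration: the leak term `decay` (setting it to 0 gives a pure time integral), the linear mapping from tonic level to `beta`, its sign, and the parameter values. It is not the model proposed in the project or in the cited papers.

```python
import math
import random


def coupled_bandit(reward_probs, alpha=0.1, decay=0.05, beta0=1.0, gain=5.0,
                   beta_max=10.0, n_trials=500, seed=0):
    """Bandit learner whose explore-exploit balance tracks past prediction errors."""
    rng = random.Random(seed)
    n = len(reward_probs)
    values = [0.0] * n
    tonic = 0.0  # running (leaky) integral of prediction errors
    betas = []
    for _ in range(n_trials):
        # Hypothetical mapping: tonic level sets the inverse temperature,
        # clipped to the range [0, beta_max].
        beta = min(beta_max, max(0.0, beta0 + gain * tonic))
        top = max(values)
        weights = [math.exp(beta * (v - top)) for v in values]
        action = rng.choices(range(n), weights=weights)[0]
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        delta = reward - values[action]  # "phasic" prediction error
        values[action] += alpha * delta  # learning
        tonic += delta - decay * tonic   # integrate the phasic signal
        betas.append(beta)
    return values, betas
```

In this sketch the rate of learning feeds back onto action selection: large prediction errors early in learning shift `beta`, which changes which actions are sampled next.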

We will use both algorithmic and neural models to study this interaction, and the role of dopamine. One goal will be to determine if the classic habit vs goal-directed distinction of instrumental behaviour is actually a performance effect and not a distinction between learning systems. Another goal will be to seek ideas for forms of directed exploration to advance the cutting edge of machine learning.

Funding Notes

This project has a Band 1 fee. Details of our different fee bands can be found on our website. For information on how to apply for this project, please visit the Faculty of Biology, Medicine and Health Doctoral Academy website. Informal enquiries may be made directly to the primary supervisor.

References

Humphries, M. D., Khamassi, M. & Gurney, K. (2012) Dopaminergic control of the exploration-exploitation trade-off via the basal ganglia. Frontiers in Neuroscience, 6, 9.

Khamassi, M. & Humphries, M. D. (2012) Integrating cortico-limbic-basal ganglia architectures for learning model-based and model-free navigation strategies. Frontiers in Behavioral Neuroscience, 6, 79.

Wunderlich, K., Smittenaar, P. & Dolan, R. J. (2012) Dopamine Enhances Model-Based over Model-Free Choice Behavior. Neuron, 75, 418-424.

