The aim of this PhD is to develop methods for implementing state-of-the-art Bayesian techniques in ways that fully exploit the computational power of modern and next-generation many-core architectures and systems (such as multicore CPUs, GPUs, Xeon Phis and supercomputing clusters). The work will link closely to a large current research project (“Big Hypotheses”, see here: http://gow.epsrc.ac.uk/NGBOViewGrant.aspx?GrantRef=EP/R018537/1) and draw on previous work related to high-performance computing, Big Data and Bayesian statistics. The goal is to solve difficult problems in applications relevant to the IBM Research lab at Daresbury.
Markov chain Monte Carlo (MCMC) is a numerical Bayesian method that allows high-fidelity physical models to be combined with data to make inferences in the presence of pronounced uncertainty. Improvements to MCMC have historically focused on algorithmic advances, such as the use of local gradient information and the gradual migration from an easy reference problem to the problem of interest. With these improvements, MCMC is an effective solution to the vast number of problems that can be posed as statistical inferences from data. In the context of any one problem, bespoke optimisation can be used to exploit the available (parallel) computational resources. However, because MCMC fundamentally uses the evolution of a single Markov chain to convey uncertainty, such optimisation is necessarily problem-specific. There is therefore little scope to develop a generic MCMC implementation that fully exploits parallel processing architectures. As a result, the ability of MCMC to provide solutions to next-generation problems is limited.
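As an illustration only (the function names and parameters below are not from the project), a minimal random-walk Metropolis sampler in Python makes the single-chain serial dependence concrete: each state depends on the previous one, which is why a generic MCMC implementation is hard to parallelise.

```python
import math
import random

def metropolis_hastings(log_target, x0, n_samples, step=0.5, seed=0):
    """Random-walk Metropolis sampler for a 1-D target density.

    log_target returns the log of the (unnormalised) target density.
    The chain evolves strictly sequentially: each proposal is built
    from the current state, so iterations cannot run in parallel.
    """
    rng = random.Random(seed)
    x = x0
    log_p = log_target(x)
    samples = []
    for _ in range(n_samples):
        # Propose a local move; gradient-informed proposals (e.g. MALA)
        # refine this step but keep the same accept/reject structure.
        x_new = x + rng.gauss(0.0, step)
        log_p_new = log_target(x_new)
        # Accept with probability min(1, p(x_new) / p(x)).
        if rng.random() < math.exp(min(0.0, log_p_new - log_p)):
            x, log_p = x_new, log_p_new
        samples.append(x)
    return samples

# Toy example: sample from a standard normal (log density up to a constant).
samples = metropolis_hastings(lambda x: -0.5 * x * x, x0=0.0, n_samples=20000)
mean = sum(samples) / len(samples)
```

Note that within each problem the expensive part, evaluating `log_target`, can be parallelised, but that optimisation is specific to the model, mirroring the point made above.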
Sequential Monte Carlo (SMC) samplers can solve the same problems as MCMC. In contrast to MCMC, SMC samplers use the diversity of a population of samples to convey uncertainty. For the majority of an SMC sampler's operation, each sample is processed independently, making most of the algorithm trivial to parallelise. However, at a specific point in the SMC sampler, it becomes necessary to perform a “resampling” step. A textbook implementation of this resampling step is impossible to parallelise in a scalable fashion. However, previous research has rearticulated the resampling operation as a divide-and-conquer algorithm, making it possible to parallelise the resampling step. More recent work has identified that, by carefully considering data locality and pipelining and by making appropriate use of middleware (e.g., MPI and OpenMP), it is possible to implement the resampling algorithm in such a way that using more cores results in faster operation.
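To make the parallelisation issue concrete, the sketch below (illustrative only; the names and the toy weighting are assumptions, not the project's code) implements the textbook systematic resampling step in Python. The per-particle weighting is embarrassingly parallel, while the cumulative-sum loop in the resampler is the inherently sequential part that the divide-and-conquer reformulation mentioned above recasts for parallel execution.

```python
import math
import random

def systematic_resample(weights, rng):
    """Textbook systematic resampling: map normalised weights to ancestor
    indices. The running cumulative sum makes this loop sequential; a
    divide-and-conquer (prefix-sum style) reformulation removes that
    bottleneck on parallel hardware.
    """
    n = len(weights)
    u = rng.random() / n
    indices, cumulative, j = [], weights[0], 0
    for i in range(n):
        pos = u + i / n
        while pos > cumulative and j < n - 1:
            j += 1
            cumulative += weights[j]
        indices.append(j)
    return indices

rng = random.Random(1)
# Toy population: particles with importance weights favouring larger values.
# Each weight depends only on its own particle, so this step parallelises
# trivially across cores.
particles = [rng.gauss(0.0, 1.0) for _ in range(1000)]
raw = [math.exp(x) for x in particles]          # unnormalised weights
total = sum(raw)
weights = [w / total for w in raw]

# Effective sample size: resampling is triggered when weight degeneracy
# sets in (ESS falls well below the population size).
ess = 1.0 / sum(w * w for w in weights)
ancestors = systematic_resample(weights, rng)
resampled = [particles[a] for a in ancestors]
```

The normalisation and the ESS computation are global reductions, so even outside the resampler itself, the step requires communication across the whole population; this is why data locality and middleware choices matter for scalable implementations.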
This research now needs to be developed into implementations of an SMC sampler that fully exploit multicore CPUs, GPUs, Xeon Phis and supercomputing clusters. The aim is to produce implementations that dramatically outperform MCMC and to use them to solve pertinent problems.
The PhD will therefore comprise three main strands of research: understanding the problem and solution space (i.e., applying off-the-shelf MCMC algorithms to problems relevant to IBM Research); enhancing existing implementations of SMC samplers so that they fully exploit some exemplar many-core architectures (as being developed by “Big Hypotheses”); and using these implementations (embodied in frameworks such as Stan) to solve some exemplar problems that are directly relevant to IBM Research.
The PhD includes components that draw on Computer Science, Statistics and Engineering, and sits at the intersection of these three academic disciplines. The successful applicant will have experience in one of these domains and will gain experience in the others. It is also anticipated that the successful applicant will gain experience with a number of HPC technologies (e.g., MPI, CUDA and OpenMP) and that the project will enable the student to enhance valuable programming skills (e.g., in Python, Java and C++).