# Multiscale statistical emulation

This project is no longer listed in the FindAPhD
database and may not be available.

• Dr J P Gosling
• Dr S Barber
• Applications accepted all year round
• Competition Funded PhD Project (European/UK Students Only)

## Project Description

Complex computer simulators are frequently used to make predictions about real-world systems in many fields of science and technology. Studying the simulated data helps us to understand how to analyse, model, and predict those systems. A key statistical challenge is to link the simulator output to reality, which is done by trying to get the simulator to agree with real-world observations whilst accounting for all of the uncertainties in the modelling. This is difficult even when we have just one model output to consider, particularly if the simulator is computationally expensive, often taking days to run even on high-performance computers. In these cases, we often use an emulator: a statistical approximation to the simulator output across a range of conditions, built from simulator runs at just a few carefully chosen representative conditions.
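To make the idea of an emulator concrete, the following is a minimal sketch of a Gaussian process emulator, assuming a one-dimensional input, a zero prior mean, and a squared-exponential covariance with fixed hyperparameters. The function `toy_simulator` is a hypothetical cheap stand-in for an expensive simulator, and the design size, length scale, and jitter are illustrative choices rather than recommendations.

```python
import numpy as np

def toy_simulator(x):
    """Hypothetical stand-in for an expensive simulator (illustration only)."""
    return np.sin(2.0 * np.pi * x) + 0.5 * x

def sq_exp_kernel(a, b, length_scale=0.2, variance=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def fit_emulator(x_design, y_design, jitter=1e-8):
    """Condition a zero-mean Gaussian process on the simulator runs."""
    K = sq_exp_kernel(x_design, x_design) + jitter * np.eye(len(x_design))
    chol = np.linalg.cholesky(K)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_design))
    return chol, alpha

def predict(x_new, x_design, chol, alpha):
    """Posterior mean and pointwise variance at untried inputs."""
    K_s = sq_exp_kernel(x_new, x_design)
    mean = K_s @ alpha
    v = np.linalg.solve(chol, K_s.T)
    var = sq_exp_kernel(x_new, x_new).diagonal() - np.sum(v ** 2, axis=0)
    return mean, np.maximum(var, 0.0)

# A handful of "carefully chosen" runs: here simply an even grid (an assumption).
x_design = np.linspace(0.0, 1.0, 8)
y_design = toy_simulator(x_design)
chol, alpha = fit_emulator(x_design, y_design)

# Emulate the simulator at many inputs without running it again.
x_new = np.linspace(0.0, 1.0, 101)
mean, var = predict(x_new, x_design, chol, alpha)
max_error = np.max(np.abs(mean - toy_simulator(x_new)))
```

The posterior variance is what distinguishes an emulator from a plain interpolator: it quantifies how uncertain the approximation is away from the design points. In practice the hyperparameters would be estimated rather than fixed.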

The success of these emulators depends on small changes in the conditions making only small differences to the outputs. This holds in many but not all cases. For example, it precludes the study of systems with "tipping points", where just a small change in conditions results in a step change in outputs; a topical example is climate change, where a small increase in temperature may lead to massive ice-cap melting with disastrous consequences. This is analogous to a well-known problem in nonparametric smoothing: fitting a curve through data where the curve should have spikes or sudden changes, which can be accomplished using wavelet methods. We shall use multiscale methods based on wavelets and generalisations of wavelets to develop emulators that can cope with sudden large changes in outputs when there are small changes in inputs.
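As an illustration of why wavelets suit curves with sudden changes, the sketch below applies standard Haar wavelet hard thresholding to a noisy step function. It is not the project's method: the step location, noise level, and the universal threshold with a known noise standard deviation `sigma` are all assumptions made for the example.

```python
import numpy as np

def haar_forward(signal):
    """Orthonormal Haar transform of a signal whose length is a power of two.

    Returns the coarsest scaling coefficient and the detail coefficients,
    ordered from coarsest to finest level.
    """
    approx = np.asarray(signal, dtype=float)
    details = []
    while approx.size > 1:
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))
        approx = (even + odd) / np.sqrt(2.0)
    return approx, details[::-1]

def haar_inverse(approx, details):
    """Invert haar_forward, rebuilding the signal from coarse to fine."""
    approx = np.asarray(approx, dtype=float)
    for d in details:
        out = np.empty(2 * approx.size)
        out[0::2] = (approx + d) / np.sqrt(2.0)
        out[1::2] = (approx - d) / np.sqrt(2.0)
        approx = out
    return approx

def denoise(y, sigma):
    """Hard-threshold the detail coefficients at the universal threshold."""
    approx, details = haar_forward(y)
    threshold = sigma * np.sqrt(2.0 * np.log(y.size))
    kept = [np.where(np.abs(d) > threshold, d, 0.0) for d in details]
    return haar_inverse(approx, kept)

rng = np.random.default_rng(0)
n = 256
x = np.arange(n) / n
truth = np.where(x < 0.375, 0.0, 2.0)   # a "tipping point" at x = 0.375
sigma = 0.1                              # noise level, assumed known here
y = truth + sigma * rng.normal(size=n)

estimate = denoise(y, sigma)
mse_noisy = np.mean((y - truth) ** 2)
mse_estimate = np.mean((estimate - truth) ** 2)
jump = estimate[96] - estimate[95]       # estimated size of the step
```

The step is carried by a few large coefficients while the noise is spread thinly across all of them, so thresholding removes most of the noise without blurring the jump. A smoother that assumes small input changes give small output changes would instead round off the step.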

The challenge is amplified in the dynamic case, when the model outputs are time series and the observed data may not correspond to the same time intervals as were modelled. In this project, we hope to address these challenges by building models of the reality-to-model discrepancy that take into account the time-varying structure and employ recent advances in simulator emulation methodology.

No prior knowledge of emulation or of wavelets / lifting will be assumed as full training will be given.

## Funding Notes

School of Mathematics Doctoral Training Grant (DTG) awards (a variable number, open to European/UK students only).

## References

Background references include:

Barber, S., Nason, G. P. and Silverman, B. W. (2002). Posterior probability intervals for wavelet thresholding. Journal of the Royal Statistical Society: Series B, 64, 189-205.

Conti, S., Gosling, J. P., Oakley, J. E. and O'Hagan, A. (2009). Gaussian process emulation of dynamic computer codes. Biometrika, 96, 663-676.

Heaton, T. J. and Silverman, B. W. (2008). A wavelet- or lifting-scheme-based imputation method. Journal of the Royal Statistical Society: Series B, 70, 567-587.

O'Hagan, A. (2006). Bayesian analysis of computer code outputs: a tutorial. Reliability Engineering and System Safety, 91, 1290-1300.

## How good is research at University of Leeds in Mathematical Sciences?

FTE Category A staff submitted: 53.00

Research output data provided by the Research Excellence Framework (REF)