Clinical prediction models (CPMs) take what we know about a person and predict the probability of subsequent outcomes using a regression model or algorithm. An example is QRISK, which uses a Cox model to predict the risk of future cardiovascular events given a patient’s characteristics (e.g. weight, blood pressure and cholesterol levels). Historically, CPMs were developed from high-quality data, such as a prospective cohort study, on a set of known risk factors measured at a fixed point in time.
Health systems now typically collect large quantities of data, but these data are often of low quality with regard to their use for research. They may derive from electronic health records, wearable technology, or a combination of linked sources. While this is an opportunity, it is also a great challenge, since the traditional methods for developing CPMs often cannot be applied. Some examples of the issues that need to be addressed are:
1) There are large amounts of missing data, and the presence or absence of data is typically highly informative (i.e. missing not at random).
2) Information is collected not at a single point in time, but longitudinally. This raises numerous questions. For example, can we exploit repeated measures through time of a single risk factor? At what point in time should we make a prediction?
3) CPMs can become ‘victims of their own success’. Once a CPM is deployed in practice, the healthcare system (hopefully) changes by making different decisions. The associations between risk factors and outcomes therefore change, which reduces the quality of the CPM's predictions in future. How do we prevent this from happening?
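Issue 3 above can be illustrated with a small simulation. The sketch below is not taken from any particular study: it assumes a single standardised risk factor, a logistic outcome model, and a hypothetical intervention that halves the odds of the outcome for anyone whose predicted risk exceeds 0.3. All function names, coefficients and thresholds are illustrative assumptions.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(x, y, n_iter=25):
    """Fit an intercept and slope by Newton-Raphson (plain logistic regression)."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        grad = X.T @ (y - p)
        hess = X.T @ (X * (p * (1 - p))[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def simulate(n=50_000, seed=0, treat_threshold=0.3, odds_ratio_treated=0.5):
    """Compare predicted and observed risk before and after deployment.

    All parameter values are hypothetical choices for illustration.
    """
    rng = np.random.default_rng(seed)

    # Before deployment: nobody is treated on the basis of the model.
    x_pre = rng.normal(size=n)
    y_pre = rng.binomial(1, sigmoid(-1.0 + 1.0 * x_pre))
    beta = fit_logistic(x_pre, y_pre)
    pred_pre = sigmoid(beta[0] + beta[1] * x_pre)

    # After deployment: high predicted risk triggers an effective intervention,
    # so the true outcome risk for those patients falls.
    x_post = rng.normal(size=n)
    pred_post = sigmoid(beta[0] + beta[1] * x_post)
    treated = pred_post > treat_threshold
    true_logit = -1.0 + 1.0 * x_post + np.log(odds_ratio_treated) * treated
    y_post = rng.binomial(1, sigmoid(true_logit))

    return {
        "pre_predicted": pred_pre.mean(),
        "pre_observed": y_pre.mean(),
        "post_predicted": pred_post.mean(),
        "post_observed": y_post.mean(),
        "post_predicted_treated": pred_post[treated].mean(),
        "post_observed_treated": y_post[treated].mean(),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the model is well calibrated on the data it was developed with, but over-predicts risk after deployment, most strongly among the patients it flagged for intervention. A naive re-validation on post-deployment data would therefore suggest the model has deteriorated, even though it is working as intended.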
For this PhD we are seeking a student with a background in mathematics, statistics, epidemiology or computer science who can address one or more of the emerging methodological challenges with CPMs in the new era of ‘big data’. The PhD would begin with the student undertaking a literature review of these emerging challenges and opportunities in developing, validating and deploying CPMs, before choosing an area of focus. Depending on the area of focus, the project could involve theoretical derivation, simulation studies, applied analyses (i.e. of real data), or a combination of these research methods.
Training/techniques to be provided:
MS and GM provide expertise in clinical prediction modelling methodology.
We also anticipate that the successful student would attend the course "Statistical methods for risk prediction & prognostic models" at Keele and, if relevant to the chosen focus of the PhD, "Causal inference with observational data: the challenges and pitfalls" at Leeds. The student will primarily sit within the Centre for Health Informatics, giving exposure to a range of informatics, statistical, epidemiological and clinical expertise.
Candidates are expected to hold (or be about to obtain) a minimum upper second class honours degree (or equivalent) in a related area / subject – i.e. mathematics, statistics, epidemiology or computer science.
For international students we also offer a unique four-year PhD programme that gives you the opportunity to undertake an accredited Teaching Certificate whilst carrying out an independent research project across a range of biological, medical and health sciences. For more information please visit http://www.internationalphd.manchester.ac.uk
Steyerberg, E. W. (2009). Clinical prediction models: a practical approach to development, validation, and updating. New York, NY: Springer.
Hippisley-Cox, J., Coupland, C., & Brindle, P. (2017). Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: prospective cohort study. BMJ, 357.
Lenert, M. C., Matheny, M. E., & Walsh, C. G. Prognostic models will be victims of their own success, unless…. Journal of the American Medical Informatics Association, ocz145. https://doi.org/10.1093/jamia/ocz145