Harnessing the full potential of primary care data to better predict patient outcomes: a case study in Type 2 Diabetes

Project Description

Prognostic models can be used in health care to predict the probability of an event happening to an individual. For example, knowing that a patient is at high risk of developing type 2 diabetes might change the way they are treated, in order to try to prevent or delay its onset.

To date, prognostic models have used risk factor data from a single point in time to predict the risk of the outcome of interest [1]. Primary care holds a wealth of data on individual patients, with many repeated measurements over time. It may be that this wealth of data can be used to predict future events better than a single ‘snap-shot’ of data can. In addition, if measures have high within-patient variability or are subject to measurement error, relying on a single measurement may lead to poor predictions. In recent years there has been increased interest in dynamic risk prediction, and advancing the statistical methodology in this area is a current hot topic.
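One way to see why a single noisy measurement can predict poorly is the classical measurement error model, which is an assumption made here for illustration and not something the project specifies. Under that model the observed value equals the true value plus independent error, and the association estimated from the observed value is attenuated by the ratio of between-patient variance to total variance. Averaging several independent repeated measurements shrinks the error variance and so reduces the attenuation. The function name and the variances below are hypothetical.

```python
def attenuation_factor(var_between: float, var_within: float, n_measurements: int = 1) -> float:
    """Attenuation (reliability) factor under the classical measurement error model.

    Assumption (not from the project description): observed = true + error,
    with errors independent of the true value and of each other.
    The mean of n repeated measurements has error variance var_within / n.
    """
    if n_measurements < 1:
        raise ValueError("n_measurements must be at least 1")
    if var_between <= 0 or var_within < 0:
        raise ValueError("variances must be positive (var_within may be zero)")
    return var_between / (var_between + var_within / n_measurements)


# Hypothetical variances: equal between-patient and within-patient variance.
single = attenuation_factor(1.0, 1.0, n_measurements=1)
averaged = attenuation_factor(1.0, 1.0, n_measurements=4)
print(single, averaged)
```

A factor of 1 means no attenuation. With these hypothetical variances, a single measurement halves the estimated association, while the mean of four measurements recovers most of it.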

The student will first review how previous studies have used dynamic risk prediction in order to gain an understanding of the available methods. They will then evaluate methods for dynamic risk prediction, such as landmarking [2], joint modelling [3] and machine learning approaches [4], plus any other methods identified during the review, and compare them with the more standard cross-sectional approaches. The prediction of type 2 diabetes will serve as a case study, using data from primary care records (CPRD). They will begin with a single repeated risk factor and then extend this to assess multiple repeated risk factors. Approaches will be compared both in terms of how well they perform (i.e. discrimination and calibration) and how they could be used in practice. If required, the student could look to further develop existing methods.
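As a rough illustration of the landmarking idea and of the two performance measures named above, the sketch below builds a landmark dataset from repeated measurements and computes a C-statistic (discrimination) and calibration-in-the-large for a binary outcome. It is a minimal sketch under stated assumptions and not the project's method: the record layout, function names and values are hypothetical, the last value observed before the landmark is carried forward, and patients censored before the end of the prediction window are simply dropped. A real analysis would normally fit a survival model at the landmark to handle censoring.

```python
from typing import Dict, List, Tuple

# Hypothetical record layout (not the CPRD format): times in years from a
# common origin; event_time is None if no diagnosis was observed.
Patient = Dict[str, object]


def landmark_dataset(patients: List[Patient], landmark: float, horizon: float) -> List[Tuple[float, int]]:
    """Return (last value at or before landmark, event within horizon) per eligible patient."""
    rows = []
    for p in patients:
        event_time = p["event_time"]
        # Only patients still event-free and under follow-up at the landmark are at risk.
        if event_time is not None and event_time <= landmark:
            continue
        if p["followup_end"] <= landmark:
            continue
        past = [(t, v) for t, v in p["measurements"] if t <= landmark]
        if not past:
            continue  # no risk factor value available at the landmark
        had_event = event_time is not None and event_time <= landmark + horizon
        # Simplification: drop patients censored before the window ends.
        if not had_event and p["followup_end"] < landmark + horizon:
            continue
        last_value = max(past, key=lambda tv: tv[0])[1]
        rows.append((last_value, int(had_event)))
    return rows


def c_statistic(risks: List[float], outcomes: List[int]) -> float:
    """Proportion of (event, non-event) pairs ranked correctly; ties count as half."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    controls = [r for r, y in zip(risks, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for rc in cases:
        for rn in controls:
            if rc > rn:
                score += 1.0
            elif rc == rn:
                score += 0.5
    return score / (len(cases) * len(controls))


def calibration_in_the_large(risks: List[float], outcomes: List[int]) -> float:
    """Mean predicted risk minus observed event proportion (0 is ideal)."""
    return sum(risks) / len(risks) - sum(outcomes) / len(outcomes)


# Hypothetical patients with a single repeated risk factor.
patients = [
    {"measurements": [(0.0, 40.0), (1.5, 46.0)], "event_time": 3.0, "followup_end": 3.0},
    {"measurements": [(0.5, 38.0)], "event_time": None, "followup_end": 8.0},
    {"measurements": [(0.2, 44.0)], "event_time": 1.0, "followup_end": 1.0},   # event before landmark
    {"measurements": [(1.0, 41.0)], "event_time": None, "followup_end": 4.0},  # censored in window
    {"measurements": [(3.0, 50.0)], "event_time": 6.0, "followup_end": 6.0},   # no value by landmark
]
rows = landmark_dataset(patients, landmark=2.0, horizon=5.0)
print(rows)
```

The rows returned by `landmark_dataset` would then be passed to whichever prediction model is being evaluated, and the resulting predicted risks compared with the observed outcomes using `c_statistic` and `calibration_in_the_large`.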

In the final part of the project the student will develop, with input from patients, a prototype online risk calculator which general practitioners could use to communicate diabetes risk to patients. A crucial benefit of such a tool is that it can use advanced, complex but appropriate methods ‘under the hood’ and yet communicate risk to patients in an accessible way. Given the dynamic nature of the prediction tool, when new data are added the clinician and patient will be able to see directly how the changes affect the patient's risk.

Entry requirements:

Applicants are required to hold, or expect to obtain, a UK Bachelor Degree at 2:1 or better in a data science related subject (e.g. Computer Science, Bioinformatics, Biostatistics), and preferably also a similar MSc qualification. The University of Leicester English language requirements apply where applicable.

How to apply:

You should submit your application using our online application system:

Apply for a PhD in Health Sciences Research

In the funding section of the application please indicate that you wish to be considered for a CLS HDRUK Studentship.

In the proposal section please provide the name of the supervisor and project you want to be considered for – please list both your first and second choices.

Funding Notes

The College of Life Sciences (CLS) HDRUK Studentship will provide a tax-free stipend at RCUK rates (£15,009 for 2019/20) and UK/EU fees for 3 years.


[1] Barber SR, Dhalwani NN, Davies MJ, Khunti K, Gray LJ. External national validation of the Leicester Self-Assessment score for Type 2 diabetes using data from the English Longitudinal Study of Ageing. Diabet Med. 2017 Nov;34(11):1575-1583. doi: 10.1111/dme.13433. Epub 2017 Aug 20.

[2] Paige E, Barrett J, Stevens D, Keogh RH, Sweeting MJ, Nazareth I, Petersen I, Wood AM. Landmark Models for Optimizing the Use of Repeated Measurements of Risk Factors in Electronic Health Records to Predict Future Disease Risk. Am J Epidemiol. 2018 Jul 1;187(7):1530-1538. doi: 10.1093/aje/kwy018.

[3] Crowther MJ, Abrams KR, Lambert PC. Flexible parametric joint modelling of longitudinal and survival data. Stat Med. 2012 Dec 30;31(30):4456-71. doi: 10.1002/sim.5644. Epub 2012 Oct 4.

[4] Christodoulou E, Ma J, Collins GS, Steyerberg EW, Verbakel JY, van Calster B. A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. J Clin Epidemiol. 2019 Feb 11. pii: S0895-4356(18)31081-3. doi: 10.1016/j.jclinepi.2019.02.004. [Epub ahead of print] Review.

Project / Funding Enquiries: Prof Laura Gray
